How AI Processes Information — What Happens After Words Become Numbers
In the last lesson, you saw something important:
Words are not processed as words. They are converted into numbers called embeddings.
So now we have a new question:
Once everything becomes numbers… what does the AI actually do with them?
Because turning words into numbers is only the beginning.
The real work happens after that.
This is where neural networks and layers come in.
If embeddings are the input, then layers are the processing system.
By the end of this lesson, you should understand:
what a neural network layer is
how data moves through layers
why multiple layers are needed
what activation functions actually do (in simple terms)
What Is a Neural Network?
Let’s keep this simple.
A neural network is a system made up of multiple steps that transform data.
Each step is called a layer.
So instead of doing everything at once, the AI processes information gradually.
Think of it like this:
Input → Transformation → Transformation → Transformation → Output
Each transformation is a layer.
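To make the arrow diagram concrete, here is a minimal sketch in Python. The three "layers" (layer_1, layer_2, layer_3) and the run_network helper are made-up names, and the math inside them is arbitrary. Real layers do something more useful, but the flow of data is the same idea: each output becomes the next input.

```python
# Three made-up "layers": each takes a list of numbers and returns a new list.
# The operations are placeholders chosen only to show the flow of data.
def layer_1(values):
    return [v * 2 for v in values]


def layer_2(values):
    return [v + 1 for v in values]


def layer_3(values):
    return [v * v for v in values]


def run_network(values, layers):
    # Pass the data through each layer in order.
    # The output of one layer becomes the input of the next.
    for layer in layers:
        values = layer(values)
    return values


result = run_network([1, 2, 3], [layer_1, layer_2, layer_3])
print(result)  # [9, 25, 49]
```

Notice that run_network does not care what each layer does. It only cares that every layer accepts numbers and returns numbers.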
A Simple Analogy: An Assembly Line
Imagine a factory.
At the start, you have raw materials.
At each stage, something is added or changed.
By the end, you have a finished product.
Neural networks work the same way.
You start with raw input (numbers from embeddings)
Each layer transforms the data slightly
The final layer produces an output
This “assembly line” idea is a good first approximation of how layers behave.
What Is a Layer?
A layer is simply:
A step that takes input, changes it, and passes it forward.
Nothing more complicated than that.
Each layer receives numbers, performs calculations, and sends new numbers to the next layer.
How Data Flows Through the Network
Let’s walk through the full journey.
Step 1: Input Layer
This is where your data enters.
In a language model, these are your embeddings.
So your sentence:
"I love small dogs"
becomes a set of vectors (numbers).
Step 2: Hidden Layers
This is where most of the work happens.
Each hidden layer:
looks at the input
detects patterns
transforms the data
Early layers tend to pick up simpler patterns. Later layers combine them into more complex ones.
Step 3: Output Layer
This is the final step.
The network produces an answer, such as:
the next word in a sentence
a classification (spam / not spam)
a prediction
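Here is a small sketch of what "producing an answer" can look like once the final numbers come out. Everything in it is invented for illustration: the candidate words, their scores, the spam score, and the 0.5 cutoff. Picking the highest score is only the simplest possible rule, and real systems may choose differently.

```python
# Hypothetical scores from the final layer for four candidate next words.
# Higher means the network considers that word a better fit.
word_scores = {"dogs": 2.1, "cats": 1.4, "cars": -0.3, "the": 0.2}

# Simplest possible rule: pick the word with the highest score.
next_word = max(word_scores, key=word_scores.get)
print(next_word)  # dogs

# Hypothetical single output for a spam classifier, between 0 and 1.
spam_score = 0.83

# Assumed cutoff of 0.5: at or above it we call the message spam.
label = "spam" if spam_score >= 0.5 else "not spam"
print(label)  # spam
```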
Why Multiple Layers Matter
This is one of the most important ideas.
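Different layers tend to learn different levels of meaning.
Let’s break it down using language. Keep in mind this is a simplified picture: in real models the boundaries between these levels are blurry.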
Early Layers
These focus on simple features:
word shapes
basic grammar
common patterns
Middle Layers
Now things get more interesting:
phrases
relationships between words
sentence structure
Deeper Layers
Now the system starts capturing:
tone
intent
context
subtle meaning
So instead of trying to understand everything at once, the AI builds understanding step by step.
Early layers handle simple patterns, and later layers combine them into more complex meaning.
What Actually Happens Inside a Layer?
Let’s slow this down.
Inside each layer, something very specific happens:
The layer receives numbers
It multiplies each number by a weight (an importance value)
It adds the results together
It passes that sum through a function (the activation function, covered below)
We’ll go deeper into weights in the next lesson.
For now, focus on this:
👉 A layer is doing calculations to reshape the data.
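The four steps above can be sketched in a few lines of Python. This is a simplified illustration, not how any particular library implements it. The name simple_layer, the input values, and the weights are all made up, and I have used ReLU (explained later in this lesson) as the function in the last step. Real layers usually also add a small extra number called a bias, which is left out here to match the four steps.

```python
def simple_layer(inputs, weights):
    """Apply one layer: a weighted sum per neuron, then an activation function.

    `weights` holds one list of weights per neuron in the layer.
    """
    outputs = []
    for neuron_weights in weights:
        # Steps 2 and 3: multiply each input by its weight, then add the results.
        total = sum(x * w for x, w in zip(inputs, neuron_weights))
        # Step 4: pass the sum through a function (here ReLU: negatives become 0).
        outputs.append(max(0.0, total))
    return outputs


# Step 1: the layer receives numbers (made-up values).
inputs = [1.0, 2.0, 3.0]

# Made-up weights for a layer with two neurons, three weights each.
weights = [
    [0.5, 0.25, 0.5],    # neuron 1
    [1.0, -1.0, -0.5],   # neuron 2
]

outputs = simple_layer(inputs, weights)
print(outputs)  # [2.5, 0.0]
```

Three numbers went in and two came out. The layer reshaped the data, and those two numbers would become the input to the next layer.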
Activation Functions (The Gatekeepers)
Now we introduce something important, but we’ll keep it simple.
After a layer does its calculations, it uses something called an activation function.
This decides:
👉 What information should continue
👉 What should be filtered out
Simple Analogy
Think of a security checkpoint.
Not everything passes through.
Some signals are allowed forward. Some are reduced. Some are blocked.
Example: ReLU (Rectified Linear Unit)
ReLU is one of the most common activation functions.
It works like this:
positive numbers → passed through unchanged
negative numbers → turned into zero
So it blocks negative signals and lets positive ones continue at full strength.
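ReLU is small enough to write in one line. Here is a minimal sketch, with made-up input values, so you can see which numbers survive:

```python
def relu(x):
    # Keep positive numbers as they are; turn negative numbers into zero.
    return max(0.0, x)


signals = [-2.0, -0.5, 0.0, 0.3, 4.0]
activated = [relu(s) for s in signals]
print(activated)  # [0.0, 0.0, 0.0, 0.3, 4.0]
```

Notice that 0.3 passes through untouched even though it is small. ReLU only cares about the sign of the number, not its size.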
Example: Sigmoid
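Sigmoid takes any number and squashes it into a value between 0 and 1. Large negative numbers land close to 0, large positive numbers land close to 1, and 0 lands exactly at 0.5.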
This is useful when the AI needs to decide something like:
yes or no
spam or not spam
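Here is a minimal sketch of sigmoid using Python's built-in math module. The input values are made up, and treating 0.5 as the cutoff between "not spam" and "spam" is an assumption for illustration only.

```python
import math


def sigmoid(x):
    # Squash any number into the range between 0 and 1.
    return 1.0 / (1.0 + math.exp(-x))


for raw in [-4.0, 0.0, 4.0]:
    value = sigmoid(raw)
    # Assumed cutoff: 0.5 or higher counts as "spam".
    decision = "spam" if value >= 0.5 else "not spam"
    print(raw, round(value, 3), decision)
# -4.0 0.018 not spam
# 0.0 0.5 spam
# 4.0 0.982 spam
```

Because the result always sits between 0 and 1, you can read it a bit like a confidence level for the "yes" answer.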
Why Activation Functions Matter
Without activation functions, stacking layers would not add real value.
However many layers you stacked, the whole network would collapse into the equivalent of a single layer doing one simple calculation.
Activation functions introduce non-linearity.
That means:
👉 The AI can learn complex patterns
👉 Not just simple straight-line relationships
This is what allows AI to handle language, images, and real-world complexity.
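You can check the "collapse" yourself with a tiny example. In this sketch each layer is a single straight-line calculation with arbitrary made-up numbers (layer_a and layer_b are hypothetical names). Stacking them with no activation gives exactly the same answers as one combined straight-line formula. Putting ReLU between them breaks that shortcut.

```python
def relu(x):
    return max(0.0, x)


def layer_a(x):
    return 2.0 * x + 1.0


def layer_b(x):
    return 3.0 * x - 4.0


def stacked_without_activation(x):
    # Two layers back to back, nothing in between.
    return layer_b(layer_a(x))


def single_layer(x):
    # 3 * (2x + 1) - 4 simplifies to 6x - 1: one straight-line calculation.
    return 6.0 * x - 1.0


def stacked_with_relu(x):
    # Same two layers, but with ReLU in between.
    return layer_b(relu(layer_a(x)))


for x in [-2.0, 0.0, 1.5]:
    print(x, stacked_without_activation(x), single_layer(x), stacked_with_relu(x))
# -2.0 -13.0 -13.0 -4.0
# 0.0 -1.0 -1.0 -1.0
# 1.5 8.0 8.0 8.0
```

The first two result columns always match, so the second layer added nothing new. The ReLU version gives a different answer at -2.0, which means no single straight line can reproduce it. That bend is the non-linearity.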
What You Should Notice When You Experiment
When you use tools like TensorFlow Playground, you’ll see this directly.
If you:
add more layers
change activation functions
You’ll notice:
👉 The model behaves differently
Sometimes better. Sometimes worse.
That’s because you are changing how information is processed.
Common Beginner Mistakes
Mistake 1: Thinking more layers always means better
More layers can help, but they can also make things harder to train.
Balance matters.
Mistake 2: Thinking each layer “understands”
Layers don’t understand.
They transform numbers.
What looks like understanding comes from many layers of these transformations working together.
Mistake 3: Ignoring activation functions
Activation functions are not optional details.
They are essential to how the network works.
Mental Model
Here’s the best way to think about it:
A neural network is a multi-step transformation system.
Input: raw numbers
Layers: refine and reshape the data
Output: final result
Each layer adds a little more structure.
Like building meaning one step at a time.
Practice Thinking
Think through these:
Why might one layer not be enough to understand language?
What could go wrong if all layers did the exact same thing?
Why would removing activation functions make the network weaker?
If early layers detect simple patterns, what might deeper layers detect?
Try to explain it in your own words.
That’s where real understanding starts.
Key Takeaways
Neural networks process data through layers
Each layer transforms the data slightly
Early layers detect simple patterns
Deeper layers detect complex meaning
Activation functions control what information passes through
Multiple layers allow the AI to build understanding step by step
What’s Next
Now you understand:
how words become numbers
how those numbers move through layers
But there’s one more critical piece:
👉 Why does the AI choose one output over another?
That comes down to:
weights (the values the model learns during training)
and sampling settings like temperature and top-p (chosen when the model generates text)
In the next lesson, we’ll break that down clearly so you understand what is really happening when AI generates a response.