How AI Processes Information — What Happens After Words Become Numbers

In the last lesson, you saw something important:

Words are not processed as words. They are converted into numbers called embeddings.

So now we have a new question:

Once everything becomes numbers… what does the AI actually do with them?

Because turning words into numbers is only the beginning.

The real work happens after that.

This is where neural networks and layers come in.

If embeddings are the input, then layers are the processing system.

By the end of this lesson, you should understand:

  • what a neural network layer is

  • how data moves through layers

  • why multiple layers are needed

  • what activation functions actually do (in simple terms)


What Is a Neural Network?

Let’s keep this simple.

A neural network is a system made up of multiple steps that transform data.

Each step is called a layer.

So instead of doing everything at once, the AI processes information gradually.

Think of it like this:

Input → Transformation → Transformation → Transformation → Output

Each transformation is a layer.
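
As a rough illustration of that chain, here is a minimal Python sketch. The three functions (`scale`, `shift`, `square`) are made-up stand-ins for layers, not what a real network computes; the point is only that data passes through one step after another.

```python
# Hypothetical stand-ins for layers: each takes a list of numbers
# and returns a new list of numbers.
def scale(values):
    return [v * 2 for v in values]

def shift(values):
    return [v + 1 for v in values]

def square(values):
    return [v * v for v in values]

def run_network(values, layers):
    """Pass the data through each layer in order."""
    for layer in layers:
        values = layer(values)
    return values

# Input -> scale -> shift -> square -> Output
output = run_network([1, 2, 3], [scale, shift, square])
print(output)  # [9, 25, 49]
```

Changing the order of the list, or adding another function to it, changes the output. That is the same idea as changing the layers in a network.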


A Simple Analogy: An Assembly Line

Imagine a factory.

At the start, you have raw materials.

At each stage, something is added or changed.

By the end, you have a finished product.

Neural networks work the same way.

  • You start with raw input (numbers from embeddings)

  • Each layer transforms the data slightly

  • The final layer produces an output

This “assembly line” idea is exactly how layers behave.


What Is a Layer?

A layer is simply:

A step that takes input, changes it, and passes it forward.

Nothing more complicated than that.

Each layer receives numbers, performs calculations, and sends new numbers to the next layer.


How Data Flows Through the Network

Let’s walk through the full journey.

Step 1: Input Layer

This is where your data enters.

In a language model, the input is your embeddings.

So your sentence:

"I love small dogs"

becomes a set of vectors (numbers).
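
Here is a small Python sketch of that step. The numbers in `toy_embeddings` are invented for illustration, and splitting on spaces is a simplification: real models learn much longer vectors and usually split text into tokens rather than whole words.

```python
# Made-up 3-number embeddings (an assumption for illustration only).
toy_embeddings = {
    "i": [0.1, 0.3, -0.2],
    "love": [0.9, -0.1, 0.4],
    "small": [-0.3, 0.2, 0.1],
    "dogs": [0.5, 0.7, -0.6],
}

def embed(sentence):
    """Turn a sentence into a list of vectors, one per word."""
    return [toy_embeddings[word] for word in sentence.lower().split()]

vectors = embed("I love small dogs")
print(len(vectors))  # 4 (one vector per word)
print(vectors[1])    # [0.9, -0.1, 0.4] (the vector for "love")
```

This list of vectors is what the first layer receives.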


Step 2: Hidden Layers

This is where most of the work happens.

Each hidden layer:

  • looks at the input

  • detects patterns

  • transforms the data

Early layers tend to pick up simple patterns. Later layers tend to pick up more complex ones.


Step 3: Output Layer

This is the final step.

The network produces an answer, such as:

  • the next word in a sentence

  • a classification (spam / not spam)

  • a prediction


Why Multiple Layers Matter

This is one of the most important ideas.

Different layers tend to learn different levels of meaning. The boundaries are not sharp, but the rough picture is useful.

Let’s break it down using language.

Early Layers

These focus on simple features:

  • word shapes

  • basic grammar

  • common patterns


Middle Layers

Now things get more interesting:

  • phrases

  • relationships between words

  • sentence structure


Deeper Layers

Now the system starts capturing:

  • tone

  • intent

  • context

  • subtle meaning


So instead of trying to understand everything at once, the AI builds understanding step by step.

Early layers handle simple patterns, and later layers combine them into complex meaning.


What Actually Happens Inside a Layer?

Let’s slow this down.

Inside each layer, something very specific happens:

  1. The layer receives numbers

  2. It multiplies each number by a weight (an importance value)

  3. It adds the results together

  4. It passes the result through a function

We’ll go deeper into weights in the next lesson.

For now, focus on this:

👉 A layer is doing calculations to reshape the data.
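
The four steps can be sketched in plain Python. This is a minimal illustration, not how a real library implements it: the weights and inputs are invented, the `biases` (a constant added to each sum) are a common extra that the steps above do not mention, and the function used in step 4 is ReLU, which is described later in this lesson.

```python
def relu(x):
    """Step 4: keep positive values, turn negative values into zero."""
    return max(0.0, x)

def layer(inputs, weights, biases, activation):
    """One layer: each row of weights produces one output number."""
    outputs = []
    for neuron_weights, bias in zip(weights, biases):
        # Steps 2 and 3: multiply each input by its weight, then add everything up.
        total = sum(w * x for w, x in zip(neuron_weights, inputs)) + bias
        # Step 4: pass the result through a function.
        outputs.append(activation(total))
    return outputs

inputs = [1.0, 2.0, 3.0]          # Step 1: the numbers the layer receives
weights = [
    [0.5, -1.0, 0.25],            # weights for output 1 (made up)
    [1.0, 1.0, -1.0],             # weights for output 2 (made up)
]
biases = [0.0, 0.5]               # assumed extra constants, one per output

print(layer(inputs, weights, biases, relu))  # [0.0, 0.5]
```

Three numbers went in and two came out. The output of this layer would become the input of the next one.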


Activation Functions (The Gatekeepers)

Now we introduce something important, but we’ll keep it simple.

After a layer does its calculations, it uses something called an activation function.

This decides:

👉 What information should continue

👉 What should be filtered out


Simple Analogy

Think of a security checkpoint.

Not everything passes through.

Some signals are allowed forward. Some are reduced. Some are blocked.


Example: ReLU (Rectified Linear Unit)

ReLU is one of the most common activation functions.

It works like this:

  • positive numbers → allowed

  • negative numbers → turned into zero

So negative signals are silenced, while positive ones pass through unchanged.
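
In code, ReLU is a one-liner. The `signals` list below is made-up sample data.

```python
def relu(x):
    """Return x if it is positive, otherwise 0."""
    return max(0.0, x)

signals = [2.5, -1.0, 0.0, -0.3, 4.0]
print([relu(s) for s in signals])  # [2.5, 0.0, 0.0, 0.0, 4.0]
```

Notice that positive values are not changed at all. Only the negative ones are replaced with zero.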


Example: Sigmoid

Sigmoid takes any number and converts it into a value between 0 and 1.

This is useful when the AI needs to decide something like:

  • yes or no

  • spam or not spam
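
Here is a minimal Python sketch of sigmoid. The input scores and the 0.5 cut-off used for the spam decision are assumptions chosen for illustration.

```python
import math

def sigmoid(x):
    """Squash any number into a value between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-x))

for score in [-4.0, 0.0, 4.0]:
    print(score, round(sigmoid(score), 3))
# -4.0 0.018
# 0.0 0.5
# 4.0 0.982

def is_spam(score, threshold=0.5):
    """Hypothetical decision rule: treat outputs at or above the threshold as spam."""
    return sigmoid(score) >= threshold

print(is_spam(4.0))   # True
print(is_spam(-4.0))  # False
```

Large positive scores land close to 1, large negative scores land close to 0, and a score of 0 lands exactly in the middle.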


Why Activation Functions Matter

Without activation functions, stacking layers would add nothing.

A weighted sum of weighted sums is still just one weighted sum, so everything would collapse into one simple calculation.

Activation functions introduce non-linearity.

That means:

👉 The AI can learn complex patterns

👉 Not just simple straight-line relationships

This is what allows AI to handle language, images, and real-world complexity.
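
You can check the collapse yourself with a tiny Python sketch. The two "layers" here are made-up one-number formulas (2x + 1 and 3x - 4), which is a big simplification of a real layer, but the effect is the same.

```python
def layer1(x):
    return 2 * x + 1

def layer2(x):
    return 3 * x - 4

def relu(x):
    return max(0, x)

def stacked_linear(x):
    """Two layers with no activation function in between."""
    return layer2(layer1(x))

def single_linear(x):
    """One layer that does the same job: 3 * (2x + 1) - 4 = 6x - 1."""
    return 6 * x - 1

def stacked_with_relu(x):
    """Two layers with ReLU in between."""
    return layer2(relu(layer1(x)))

for x in [-2, 0, 2]:
    print(x, stacked_linear(x), single_linear(x), stacked_with_relu(x))
# -2 -13 -13 -4
# 0 -1 -1 -1
# 2 11 11 11
```

Without an activation function, the two layers always give the same answer as the single layer, so the second layer added nothing. With ReLU in between, the result at x = -2 no longer fits on that straight line. That bend is the non-linearity.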


What You Should Notice When You Experiment

When you use tools like TensorFlow Playground, you’ll see this directly.

If you:

  • add more layers

  • change activation functions

You’ll notice:

👉 The model behaves differently

Sometimes better. Sometimes worse.

That’s because you are changing how information is processed.


Common Beginner Mistakes

Mistake 1: Thinking more layers always means better

More layers can help, but they can also make things harder to train.

Balance matters.


Mistake 2: Thinking each layer “understands”

Layers don’t understand.

They transform numbers.

What looks like understanding is the combined effect of many layers working together.


Mistake 3: Ignoring activation functions

Activation functions are not optional details.

They are essential to how the network works.


Mental Model

Here’s the best way to think about it:

A neural network is a multi-step transformation system.

  • Input: raw numbers

  • Layers: refine and reshape the data

  • Output: final result

Each layer adds a little more structure.

Like building meaning one step at a time.


Practice Thinking

Think through these:

  1. Why might one layer not be enough to understand language?

  2. What could go wrong if all layers did the exact same thing?

  3. Why would removing activation functions make the network weaker?

  4. If early layers detect simple patterns, what might deeper layers detect?

Try to explain it in your own words.

That’s where real understanding starts.


Key Takeaways

  • Neural networks process data through layers

  • Each layer transforms the data slightly

  • Early layers detect simple patterns

  • Deeper layers detect complex meaning

  • Activation functions control what information passes through

  • Multiple layers allow the AI to build understanding step by step


What’s Next

Now you understand:

  • how words become numbers

  • how those numbers move through layers

But there’s one more critical piece:

👉 Why does the AI choose one output over another?

That comes down to:

  • weights

  • and sampling settings like temperature and top-p

In the next lesson, we’ll break that down clearly so you understand what is really happening when AI generates a response.