
Module 2

Embeddings

Imagine a secret code that turns every word into a list of numbers that captures its meaning. That's what embeddings do: words with similar meanings end up close together in a special mathematical space.

What is an embedding vector?

Think of an embedding vector as a unique digital fingerprint for a word. It's a list of numbers, like coordinates on a map, that tells the AI what a word "means" based on how it's used alongside other words.

Every word gets its own list of numbers. Those numbers don't describe what the word looks like; they describe its meaning and its relationships to everything else in the language.
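As a sketch, here is a tiny hand-made embedding table in Python. The three-number vectors below are invented for illustration; a real model learns its vectors (with far more numbers) from data:

```python
# A toy "embedding table": each word maps to a short list of numbers.
# These 3-number vectors are made up for illustration; real models learn
# vectors with hundreds or thousands of entries.
toy_embeddings = {
    "happy":  [0.90, 0.80, 0.10],
    "joyful": [0.88, 0.82, 0.12],
    "rock":   [0.10, 0.05, 0.95],
}

def embed(word):
    """Look up a word's vector, much like a model's embedding layer does."""
    return toy_embeddings[word]

print(embed("happy"))   # [0.9, 0.8, 0.1]
```

Notice that "happy" and "joyful" already have very similar number lists, while "rock" does not; that similarity is what the rest of this module builds on.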

Why are similar words "close"?

In the AI's numerical meaning space, words aren't just random points. Words that appear in similar situations, or have similar meanings, will naturally have number lists that look very alike.

So "happy" and "joyful" sit close together, while "happy" and "rock" are far apart, just like cities on a real map.
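One common way to measure this closeness is cosine similarity, which compares the directions two vectors point in. A minimal Python sketch, using invented three-number vectors rather than ones from a real model:

```python
import math

# Made-up vectors for illustration only.
vectors = {
    "happy":  [0.90, 0.80, 0.10],
    "joyful": [0.88, 0.82, 0.12],
    "rock":   [0.10, 0.05, 0.95],
}

def cosine_similarity(a, b):
    """Angle-based closeness: 1.0 = same direction, near 0 = unrelated."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

print(cosine_similarity(vectors["happy"], vectors["joyful"]))  # very close to 1
print(cosine_similarity(vectors["happy"], vectors["rock"]))    # much lower
```

The exact numbers don't matter; what matters is the ordering: similar words score near 1, unrelated words score much lower.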

Explore word similarity

[Interactive demo: pick two words and see how similar their embeddings are, on a scale from "far apart" to "identical". Comparing 'cat' (word A) with 'kitten' (word B) gives a similarity of 95%.]

💡 'Kitten' is simply a young 'cat': their meanings are almost identical, so their embedding vectors sit extremely close together.

[Figure: simplified 2D "meaning space" with 'cat' and 'kitten' plotted close together. Closer = more similar.]

Analogy by maths

Because words are numbers, the AI can actually do arithmetic with meaning. This lets it understand analogies (relationships between ideas) without ever being explicitly taught them.

The famous example: take "king", subtract "man", add "woman", and the result is closest to "queen".

Analogy by maths: king − man + woman = ?

👑 Step 1: start with "king". The AI has a number list representing "king": royalty, power, authority.
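The whole analogy can be sketched in a few lines of Python. The four-number vectors below are hand-picked so the arithmetic works out; a trained model learns this structure from text on its own:

```python
import math

# Hand-picked vectors for illustration; real embeddings learn this
# structure from data.
vectors = {
    "king":  [0.9, 0.9, 0.1, 0.8],
    "queen": [0.9, 0.1, 0.9, 0.8],
    "man":   [0.2, 0.9, 0.1, 0.1],
    "woman": [0.2, 0.1, 0.9, 0.1],
    "apple": [0.1, 0.2, 0.2, 0.0],
}

def add(a, b): return [x + y for x, y in zip(a, b)]
def sub(a, b): return [x - y for x, y in zip(a, b)]

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) *
                  math.sqrt(sum(y * y for y in b)))

# king - man + woman
target = add(sub(vectors["king"], vectors["man"]), vectors["woman"])

# Find the nearest word, excluding the three input words themselves.
candidates = {w: v for w, v in vectors.items()
              if w not in {"king", "man", "woman"}}
best = max(candidates, key=lambda w: cosine(target, candidates[w]))
print(best)  # queen
```

Subtracting "man" removes the maleness from "king", and adding "woman" puts femaleness in its place; the nearest remaining vector is "queen".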

🌌 Did you know? Real AI embeddings are far more complex than a 2D map. Actual embedding vectors can have hundreds or thousands of numbers, and each dimension captures a different subtle aspect of meaning. More dimensions mean more ways to represent the fine-grained differences between words. We visualise them in 2D to make sense of them, but the real space is almost unimaginably large.

What you've learned