Module 2
Embeddings
Imagine a secret code that turns every word into a list of numbers that captures its meaning. That's what embeddings do, and words with similar meanings end up close together in a special mathematical space.
What is an embedding vector?
Think of an embedding vector as a unique digital fingerprint for a word. It's a list of numbers, like coordinates on a map, that tells the AI what a word "means" based on how it's used alongside other words.
Every word gets its own list of numbers. Those numbers don't describe what the word looks like β they describe its meaning and relationships to everything else in the language.
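The idea of "a list of numbers per word" can be sketched as a simple lookup table. The values below are hand-picked purely for illustration, not taken from a real model:

```python
# A toy embedding table: each word maps to a short list of numbers.
# Real models *learn* these values from huge amounts of text;
# these are invented just to show the shape of the data.
embeddings = {
    "happy":  [0.9, 0.8, 0.1],
    "joyful": [0.85, 0.75, 0.15],
    "rock":   [0.1, 0.2, 0.9],
}

# Looking up a word returns its vector: its coordinates in "meaning space".
print(embeddings["happy"])
```

Notice the numbers say nothing about spelling or sound; they only position the word relative to other words.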
Why are similar words "close"?
In the AI's numerical meaning space, words aren't just random points. Words that appear in similar situations, or have similar meanings, will naturally have number lists that look very alike.
So "happy" and "joyful" sit close together, while "happy" and "rock" are far apart, just like cities on a real map.
Explore word similarity
[Interactive widget: a simplified 2D "meaning space"; closer = more similar]
Analogy by maths
Because words are numbers, the AI can actually do arithmetic with meaning. This lets it understand analogies, relationships between ideas, without ever being explicitly taught them.
The famous example: take "king", subtract "man", add "woman", and the result is closest to "queen".
Analogy by maths: king − man + woman = ?
Start with "king"
The AI has a number list representing "king": royalty, power, authority.
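The arithmetic above can be sketched end to end with toy 2-D vectors. The dimensions and values here are invented for illustration (one axis for "royalty", one for "maleness"); real models learn hundreds of dimensions from data:

```python
import math

# Toy 2-D vectors: dimension 0 ~ "royalty", dimension 1 ~ "maleness".
# Hand-picked for illustration only.
vectors = {
    "king":  [1.0,  1.0],
    "queen": [1.0, -1.0],
    "man":   [0.0,  1.0],
    "woman": [0.0, -1.0],
    "apple": [-0.5, 0.0],
}

# Vector arithmetic with meaning: king - man + woman
result = [k - m + w for k, m, w in
          zip(vectors["king"], vectors["man"], vectors["woman"])]

# Find the word closest to the result, excluding the inputs themselves.
candidates = {w: v for w, v in vectors.items()
              if w not in ("king", "man", "woman")}
closest = min(candidates, key=lambda w: math.dist(result, candidates[w]))
print(closest)  # queen
```

Subtracting "man" removes the maleness component, adding "woman" puts femaleness in its place, and royalty is untouched, so the nearest remaining word is "queen".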
Did you know? Real AI embeddings are far more complex than a 2D map. Actual embedding vectors can have hundreds or thousands of numbers, and each dimension captures a different subtle aspect of meaning. More dimensions means more ways to represent the fine-grained differences between words. We visualise them in 2D to make sense of them, but the real space is almost unimaginably large.
What you've learned
- Embeddings transform words into numerical representations called vectors.
- Similar words are located closer together in the AI's embedding space.
- This numerical representation allows AI to understand relationships and analogies through arithmetic.
- Embeddings are fundamental to how Large Language Models process meaning.