Module 2
Embeddings
Imagine a secret code that turns every word into a list of numbers that captures its meaning. That's what embeddings do, and words with similar meanings end up close together in a special mathematical space.
What is an embedding vector?
Think of an embedding vector as a unique digital fingerprint for a word. It's a list of numbers, like coordinates on a map, that tells the AI what a word "means" based on how it's used alongside other words.
Every word gets its own list of numbers. Those numbers don't describe what the word looks like β they describe its meaning and relationships to everything else in the language.
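The idea of "a list of numbers per word" can be sketched as a simple lookup table. The values below are hand-picked purely for illustration, not taken from a real model:

```python
# A toy embedding table: each word maps to a short list of numbers.
# Real models *learn* these values from huge amounts of text;
# these are invented just to show the shape of the data.
embeddings = {
    "happy":  [0.9, 0.8, 0.1],
    "joyful": [0.85, 0.75, 0.15],
    "rock":   [0.1, 0.2, 0.9],
}

# Looking up a word returns its vector: its coordinates in "meaning space".
print(embeddings["happy"])
```

Notice the numbers say nothing about spelling or sound; they only position the word relative to other words.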
Why are similar words "close"?
In the AI's numerical meaning space, words aren't just random points. Words that appear in similar situations, or have similar meanings, will naturally have number lists that look very alike.
So "happy" and "joyful" sit close together, while "happy" and "rock" are far apart, just like cities on a real map.
Explore word similarity
[Interactive widget: a simplified 2D "meaning space"; closer = more similar]
Analogy by maths
Because words are numbers, the AI can actually do arithmetic with meaning. This lets it understand analogies, relationships between ideas, without ever being explicitly taught them.
The famous example: take "king", subtract "man", add "woman", and the result is closest to "queen".
Analogy by maths: king − man + woman = ?
Start with "king"
The AI has a number list representing "king": royalty, power, authority.
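The arithmetic above can be sketched end to end with toy 2-D vectors. The dimensions and values here are invented for illustration (one axis for "royalty", one for "maleness"); real models learn hundreds of dimensions from data:

```python
import math

# Toy 2-D vectors: dimension 0 ~ "royalty", dimension 1 ~ "maleness".
# Hand-picked for illustration only.
vectors = {
    "king":  [1.0,  1.0],
    "queen": [1.0, -1.0],
    "man":   [0.0,  1.0],
    "woman": [0.0, -1.0],
    "apple": [-0.5, 0.0],
}

# Vector arithmetic with meaning: king - man + woman
result = [k - m + w for k, m, w in
          zip(vectors["king"], vectors["man"], vectors["woman"])]

# Find the word closest to the result, excluding the inputs themselves.
candidates = {w: v for w, v in vectors.items()
              if w not in ("king", "man", "woman")}
closest = min(candidates, key=lambda w: math.dist(result, candidates[w]))
print(closest)  # queen
```

Subtracting "man" removes the maleness component, adding "woman" puts femaleness in its place, and royalty is untouched, so the nearest remaining word is "queen".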
Did you know? Real AI embeddings are far more complex than a 2D map. Actual embedding vectors can have hundreds or thousands of numbers, and each dimension captures a different subtle aspect of meaning. More dimensions means more ways to represent the fine-grained differences between words. We visualise them in 2D to make sense of them, but the real space is almost unimaginably large.
What you've learned
- Embeddings transform words into numerical representations called vectors.
- Similar words are located closer together in the AI's embedding space.
- This numerical representation allows AI to understand relationships and analogies through arithmetic.
- Embeddings are fundamental to how Large Language Models process meaning.