Lesson 1 · 9 min

What embeddings are and why they work

Embeddings map tokens, sentences, or documents into dense vectors in a semantic space where geometric distance encodes meaning. The training objective explains both their power and their limits.

The core idea: meaning as geometry

An embedding model maps text to a point in a high-dimensional vector space. The training objective forces semantically related text to land nearby:

  • "dog" and "puppy" → close neighbors
  • "bank" (financial) and "river bank" → different neighborhoods (if the model is context-aware)
  • "The contract is void" and "This agreement is null" → very close, despite different words

This encoding of meaning as distance is what makes embeddings useful for search, classification, clustering, and retrieval. Cosine similarity between two vectors measures how semantically related the underlying texts are.
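
The snippet below is a minimal sketch of this idea. It assumes the sentence-transformers library and the all-MiniLM-L6-v2 model purely as an example; any embedding model with a similar encode interface would behave the same way. It embeds three sentences and compares them with cosine similarity.

```python
import numpy as np
from sentence_transformers import SentenceTransformer

# Example model choice (an assumption, not prescribed by this lesson).
model = SentenceTransformer("all-MiniLM-L6-v2")

sentences = [
    "The contract is void",
    "This agreement is null",
    "My dog loves the park",
]

# Each sentence becomes a dense vector (384 dimensions for this model).
vectors = model.encode(sentences)

def cosine_similarity(a, b):
    # Dot product divided by the product of the vector lengths.
    # Values near 1.0 mean the texts are semantically close.
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

# The two legal sentences should score much higher with each other
# than either does with the unrelated sentence about the dog.
print(cosine_similarity(vectors[0], vectors[1]))  # typically high
print(cosine_similarity(vectors[0], vectors[2]))  # typically much lower
```

The exact scores depend on the model, but the ordering is the point: paraphrases land close together even when they share few words, which is exactly the property search and retrieval systems exploit.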