Part 4: Embeddings and Vector Search
The Concept That Changed How I Build Everything
What Embeddings Actually Represent
from sentence_transformers import SentenceTransformer
model = SentenceTransformer("all-MiniLM-L6-v2")
texts = [
"How to deploy a Docker container",
"Pushing images to a container registry",
"Making pasta from scratch",
]
embeddings = model.encode(texts)
print(f"Shape: {embeddings.shape}") # (3, 384)
print(f"First 5 dims: {embeddings[0][:5]}") # [-0.034, 0.089, ...]

Generating Embeddings
Local Embeddings with sentence-transformers
API-Based Embeddings
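As a minimal sketch of the API-based route, here is how embeddings can be fetched from OpenAI's embeddings endpoint (model name `text-embedding-3-small` and batch size 100 are illustrative choices, not requirements). The batching helper keeps individual requests reasonably sized:

```python
from typing import Iterator


def batched(items: list[str], size: int = 100) -> Iterator[list[str]]:
    """Yield fixed-size batches so a single API request stays reasonably sized."""
    for i in range(0, len(items), size):
        yield items[i : i + size]


def embed_with_api(texts: list[str], model: str = "text-embedding-3-small") -> list[list[float]]:
    """Embed texts batch by batch via the OpenAI embeddings endpoint.

    Assumes the `openai` SDK is installed and OPENAI_API_KEY is set.
    """
    from openai import OpenAI  # imported lazily: optional dependency

    client = OpenAI()
    vectors: list[list[float]] = []
    for batch in batched(texts):
        resp = client.embeddings.create(model=model, input=batch)
        vectors.extend(item.embedding for item in resp.data)
    return vectors
```

The per-request cost is dominated by network latency, so batching many texts per call matters far more than it does with a local model.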
Choosing Between Local and API Embeddings
Vector Similarity – How Search Actually Works
Cosine Similarity
Euclidean (L2) Distance
Inner Product (Dot Product)
Which One to Use?
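One useful fact when choosing a metric: for unit-normalized vectors, cosine similarity equals the dot product, and Euclidean distance produces the same ranking, because ||a − b||² = 2 − 2(a·b) when ||a|| = ||b|| = 1. A quick NumPy check (random vectors, purely illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
docs = rng.normal(size=(5, 384))
docs /= np.linalg.norm(docs, axis=1, keepdims=True)  # unit-normalize each row
query = rng.normal(size=384)
query /= np.linalg.norm(query)

cosine = docs @ query                      # cosine == dot product for unit vectors
l2 = np.linalg.norm(docs - query, axis=1)  # Euclidean (L2) distance

# Higher cosine similarity corresponds to lower L2 distance: identical ranking.
assert np.array_equal(np.argsort(-cosine), np.argsort(l2))
```

This is why many stacks simply normalize embeddings once at ingestion time and then use whichever metric their index supports fastest.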
Storing Vectors with pgvector
Setting Up pgvector
Database Schema
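A minimal setup-and-schema sketch, kept as SQL strings (the `documents` table and column names are illustrative; the `VECTOR(384)` dimension matches the 384-dimensional all-MiniLM-L6-v2 embeddings shown earlier):

```python
# Enable the pgvector extension (run once per database).
CREATE_EXTENSION = "CREATE EXTENSION IF NOT EXISTS vector;"

# Hypothetical schema; adjust names and columns to your application.
CREATE_TABLE = """
CREATE TABLE IF NOT EXISTS documents (
    id        BIGSERIAL PRIMARY KEY,
    source    TEXT NOT NULL,
    content   TEXT NOT NULL,
    embedding VECTOR(384)  -- dimension must match the embedding model
);
"""
```

The vector dimension is fixed at table-creation time, so switching embedding models later generally means re-embedding and migrating the column.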
Async Database Engine
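One way to wire this up, assuming SQLAlchemy's asyncio extension with the asyncpg driver (the connection URL and pool sizes below are placeholders):

```python
# Illustrative connection URL; substitute your own user/password/host/db.
DATABASE_URL = "postgresql+asyncpg://app:secret@localhost:5432/ragdb"


def make_engine(url: str = DATABASE_URL):
    """Build an async engine (assumes `sqlalchemy[asyncio]` and `asyncpg` installed)."""
    from sqlalchemy.ext.asyncio import create_async_engine  # lazy: optional dependency

    return create_async_engine(url, pool_size=5, max_overflow=10)
```

An async engine lets the ingestion pipeline overlap embedding calls with database writes instead of blocking on each insert.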
The Ingestion Pipeline
Step 1: Load Documents
Step 2: Chunk Text
Strategy | Recall@5 | Notes
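One common baseline strategy is fixed-size chunks with overlap, so context straddling a boundary appears in both neighbors. A minimal sketch (the character-based sizes are illustrative defaults, not recommendations):

```python
def chunk_text(text: str, size: int = 800, overlap: int = 200) -> list[str]:
    """Split text into fixed-size chunks; consecutive chunks share `overlap` chars."""
    if overlap >= size:
        raise ValueError("overlap must be smaller than chunk size")
    step = size - overlap
    chunks = []
    for start in range(0, max(len(text), 1), step):
        chunk = text[start : start + size]
        if chunk:
            chunks.append(chunk)
    return chunks
```

Fixed-size chunking ignores sentence and section boundaries, which is exactly the weakness that structure-aware strategies trade extra complexity to fix.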
Step 3: Embed and Store
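pgvector accepts vectors as text literals of the form `[0.1,0.2,...]`, so storing an embedding comes down to rendering it as a string and binding it in a parameterized insert. The table and column names below are illustrative:

```python
def to_pgvector(vec) -> str:
    """Render a float sequence as a pgvector text literal, e.g. '[0.1,0.2]'."""
    return "[" + ",".join(f"{x:.6g}" for x in vec) + "]"


# Parameterized insert (asyncpg-style $n placeholders); names are illustrative.
INSERT_SQL = """
INSERT INTO documents (source, content, embedding)
VALUES ($1, $2, $3::vector)
"""
```

With asyncpg this would be executed roughly as `await conn.execute(INSERT_SQL, source, content, to_pgvector(embedding))`; batching many rows per transaction keeps ingestion fast.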
Semantic Search
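The core of semantic search with pgvector is an `ORDER BY` over a distance operator; `<=>` is pgvector's cosine-distance operator, so `1 - distance` recovers cosine similarity. This sketch assumes a `documents` table with an `embedding` column (names are illustrative):

```python
# $1 is the query embedding as a pgvector literal, $2 the result limit.
SEARCH_SQL = """
SELECT source,
       content,
       1 - (embedding <=> $1::vector) AS similarity
FROM documents
ORDER BY embedding <=> $1::vector
LIMIT $2
"""
```

The query vector must come from the same embedding model as the stored vectors; mixing models silently produces near-random rankings.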
Testing the Search
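A simple way to test retrieval quality is recall@k: of the documents known to be relevant for a query, what fraction show up in the top-k results? A minimal helper (identifiers are illustrative):

```python
def recall_at_k(retrieved: list[str], relevant: set[str], k: int) -> float:
    """Fraction of relevant doc ids that appear in the top-k retrieved results."""
    if not relevant:
        return 0.0
    hits = sum(1 for doc_id in retrieved[:k] if doc_id in relevant)
    return hits / len(relevant)
```

Run this over a small hand-labeled set of queries before and after any change to chunking or indexing; it turns "the search feels worse" into a number.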
Indexing for Performance
IVFFlat Index
HNSW Index
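Both index types are created with standard pgvector DDL; the tuning parameters below (`lists`, `m`, `ef_construction`) are common starting points, not recommendations, and the table name matches the illustrative schema used in this chapter:

```python
# IVFFlat clusters vectors into `lists` partitions; build it AFTER loading data
# so the clustering reflects the actual distribution.
IVFFLAT_INDEX = """
CREATE INDEX ON documents
USING ivfflat (embedding vector_cosine_ops)
WITH (lists = 100);
"""

# HNSW builds a graph; slower to build and more memory, but better recall/latency.
HNSW_INDEX = """
CREATE INDEX ON documents
USING hnsw (embedding vector_cosine_ops)
WITH (m = 16, ef_construction = 64);
"""
```

The operator class must match the distance operator used at query time: `vector_cosine_ops` pairs with `<=>`.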
Index | Build Time | Query Time | Recall@10 | Memory
Practical Lessons