Part 2: Vector Embeddings Fundamentals
The Product Recommendation Disaster
// My terrible first attempt
function getSimilarProducts(product: Product) {
  return products.filter(p =>
    p.category === product.category &&
    p.price >= product.price * 0.8 &&
    p.price <= product.price * 1.2
  );
}

// Using semantic similarity
const similar = await prisma.$queryRaw`
  SELECT * FROM products
  WHERE category = ${product.category}
  ORDER BY embedding <=> ${product.embedding}::vector
  LIMIT 6
`;

What Are Vector Embeddings?
Text as Numbers
Why Embeddings Work
How I Use Embeddings in Production
1. Generate Embeddings with OpenAI API
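A minimal sketch of a single embedding request against the OpenAI REST endpoint. The model name `text-embedding-3-small`, the helper name `generateEmbedding`, and the injectable `fetchImpl` parameter are assumptions made for illustration; Node 18+ with a global `fetch` is also assumed.

```typescript
// Shape of the fetch function this sketch needs (a subset of the real fetch API).
type FetchLike = (
  url: string,
  init: { method: string; headers: Record<string, string>; body: string }
) => Promise<{ ok: boolean; status: number; json(): Promise<unknown> }>;

// Only the fields of the response body that this sketch reads.
interface EmbeddingResponse {
  data: { index: number; embedding: number[] }[];
}

// Hypothetical helper: returns the embedding vector for one piece of text.
async function generateEmbedding(
  text: string,
  apiKey: string,
  fetchImpl: FetchLike = (globalThis as unknown as { fetch: FetchLike }).fetch
): Promise<number[]> {
  const res = await fetchImpl("https://api.openai.com/v1/embeddings", {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${apiKey}`,
    },
    // Assumed model; swap in whichever embedding model you actually use.
    body: JSON.stringify({ model: "text-embedding-3-small", input: text }),
  });
  if (!res.ok) {
    throw new Error(`Embedding request failed with status ${res.status}`);
  }
  const json = (await res.json()) as EmbeddingResponse;
  return json.data[0].embedding;
}
```

In an application the key would typically come from an environment variable such as `OPENAI_API_KEY`. Passing `fetchImpl` explicitly makes the function easy to exercise with a fake in tests.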
2. Batch Processing for Efficiency
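One way to batch, sketched under assumptions: `toBatches` and `embedInBatches` are hypothetical helpers, and `embedBatch` stands in for whatever call sends several texts in one request and returns their vectors in the same order.

```typescript
// Split an array into consecutive slices of at most `size` items.
function toBatches<T>(items: T[], size: number): T[][] {
  if (!Number.isInteger(size) || size < 1) {
    throw new Error("size must be a positive integer");
  }
  const batches: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}

// Embed many texts with one request per batch, preserving input order.
// `embedBatch` is supplied by the caller (hypothetical signature).
async function embedInBatches(
  texts: string[],
  batchSize: number,
  embedBatch: (batch: string[]) => Promise<number[][]>
): Promise<number[][]> {
  const vectors: number[][] = [];
  for (const batch of toBatches(texts, batchSize)) {
    const result = await embedBatch(batch);
    if (result.length !== batch.length) {
      throw new Error("embedBatch returned a different number of vectors than texts");
    }
    vectors.push(...result);
  }
  return vectors;
}
```

Batches run sequentially here, which keeps request rates predictable; running them concurrently is a separate design choice.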
3. Embeddings for Different Data Types
Product Descriptions
Documentation Search
User Queries
Understanding Similarity Metrics
Cosine Similarity (Recommended for Text)
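For reference, cosine similarity is the dot product of two vectors divided by the product of their lengths. The sketch below is a plain in-memory version; the function name and the choice to return 0 for a zero-length vector are assumptions.

```typescript
// Cosine similarity: 1 = same direction, 0 = orthogonal, -1 = opposite.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error("vectors must have the same number of dimensions");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  // Assumed convention: a zero vector has no direction, so report 0.
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}
```

Note that pgvector's `<=>` operator returns cosine distance, which is 1 minus this value, so ordering by `<=>` ascending puts the most similar rows first.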
Euclidean Distance (L2)
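Euclidean distance is the straight-line distance between two points: the square root of the sum of squared per-dimension differences. A minimal sketch, with an assumed function name:

```typescript
// Euclidean (L2) distance: 0 means identical vectors, larger means further apart.
function euclideanDistance(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error("vectors must have the same number of dimensions");
  }
  let sumOfSquares = 0;
  for (let i = 0; i < a.length; i++) {
    const diff = a[i] - b[i];
    sumOfSquares += diff * diff;
  }
  return Math.sqrt(sumOfSquares);
}
```

In pgvector the corresponding operator is `<->`. Unlike cosine similarity, this metric is sensitive to vector length as well as direction.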
Dot Product
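The dot product multiplies matching dimensions and sums the results. When both vectors are normalized to length 1, it equals cosine similarity. The sketch below illustrates both pieces; the helper names are assumptions.

```typescript
// Dot (inner) product of two vectors of equal length.
function dotProduct(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error("vectors must have the same number of dimensions");
  }
  let sum = 0;
  for (let i = 0; i < a.length; i++) {
    sum += a[i] * b[i];
  }
  return sum;
}

// Scale a vector to unit length; a zero vector is returned unchanged.
function normalize(v: number[]): number[] {
  const length = Math.sqrt(dotProduct(v, v));
  return length === 0 ? v.slice() : v.map((x) => x / length);
}
```

pgvector's `<#>` operator returns the negative inner product, so ascending order still ranks the best matches first.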
Which One to Use?
Similarity Metric | Best For | pgvector Operator
Embedding Models Comparison
OpenAI Embeddings (What I Use)
Local Embedding Models (Free, Private)
My Recommendation
Visualizing Embeddings (2D Projection)
Practical Embedding Tips from Production
1. Text Preprocessing
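A small cleanup pass before embedding might look like the sketch below. The function name is hypothetical, and the tag-stripping regex is deliberately naive: it is a stand-in for a real HTML parser, not a replacement for one.

```typescript
// Strip markup and collapse whitespace so the model sees only the content.
function preprocessText(raw: string): string {
  return raw
    .replace(/<[^>]*>/g, " ") // naive removal of HTML tags
    .replace(/\s+/g, " ")     // collapse newlines, tabs, and repeated spaces
    .trim();
}
```

Casing and punctuation are left alone here, on the assumption that the embedding model can use them as signal.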
2. Handle Empty or Short Text
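One possible guard, sketched with assumed names and an assumed threshold: use the text if it is long enough, fall back to another field (for example a product name) if not, and return `null` so the caller can skip embedding entirely.

```typescript
// Hypothetical threshold: anything shorter is treated as too little to embed.
const MIN_EMBED_CHARS = 3;

// Returns text that is safe to embed, or null if nothing usable is available.
function toEmbeddableText(
  text: string | null | undefined,
  fallback?: string
): string | null {
  const cleaned = (text ?? "").trim();
  if (cleaned.length >= MIN_EMBED_CHARS) return cleaned;
  const cleanedFallback = (fallback ?? "").trim();
  return cleanedFallback.length >= MIN_EMBED_CHARS ? cleanedFallback : null;
}
```

Returning `null` rather than an empty string keeps the "nothing to embed" case explicit at the call site.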
3. Cache Embeddings (They Don't Change)
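A minimal in-memory sketch of the idea, assuming the same model and text always yield the same vector. `withCache` and `EmbedFn` are hypothetical names; a production cache would more likely live in a database or key-value store.

```typescript
type EmbedFn = (text: string) => Promise<number[]>;

// Wrap an embedding function so each (model, text) pair is computed only once.
function withCache(model: string, embed: EmbedFn): EmbedFn {
  const cache = new Map<string, Promise<number[]>>();
  return (text: string) => {
    // The model name is part of the key because different models
    // produce different vectors for the same text.
    const key = `${model}\u0000${text}`;
    let hit = cache.get(key);
    if (!hit) {
      hit = embed(text);
      // Forget failed requests so a later call can try again.
      hit.catch(() => cache.delete(key));
      cache.set(key, hit);
    }
    return hit;
  };
}
```

Caching the promise rather than the resolved vector means concurrent requests for the same text share one in-flight call.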
4. Chunk Long Documents
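A simple word-based chunker with overlap, as a sketch. The function name and default sizes are assumptions; word counts are only a rough proxy for model tokens, so real limits should be checked against the model's tokenizer.

```typescript
// Split text into chunks of at most `maxWords` words, where each chunk
// repeats the last `overlap` words of the previous one for context.
function chunkText(text: string, maxWords = 200, overlap = 20): string[] {
  if (maxWords < 1 || overlap < 0 || overlap >= maxWords) {
    throw new Error("require maxWords >= 1 and 0 <= overlap < maxWords");
  }
  const words = text.split(/\s+/).filter((w) => w.length > 0);
  const chunks: string[] = [];
  const step = maxWords - overlap;
  for (let start = 0; start < words.length; start += step) {
    chunks.push(words.slice(start, start + maxWords).join(" "));
    // Stop once this chunk has reached the end of the document.
    if (start + maxWords >= words.length) break;
  }
  return chunks;
}
```

Each chunk is then embedded and stored separately, usually alongside a reference to the source document.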
5. Retry Logic for API Calls
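A generic retry wrapper with exponential backoff, sketched with assumed names and defaults. It retries on any error; a real integration would probably retry only on rate limits and transient server errors.

```typescript
// Run `fn`, retrying on failure with delays of baseDelayMs, 2x, 4x, ...
async function withRetry<T>(
  fn: () => Promise<T>,
  maxAttempts = 3,
  baseDelayMs = 500
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return await fn();
    } catch (err) {
      lastError = err;
      if (attempt === maxAttempts) break;
      // Double the wait after each failed attempt.
      const delayMs = baseDelayMs * Math.pow(2, attempt - 1);
      await new Promise<void>((resolve) => setTimeout(resolve, delayMs));
    }
  }
  // All attempts failed: surface the most recent error.
  throw lastError;
}
```

Usage is a one-line wrap around any async call, for example `withRetry(() => embed(text))` where `embed` is your own embedding function.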
Complete Example: Product Search with Embeddings
What's Next