Part 2: Vector Embeddings Fundamentals

← Part 1: Introduction | Part 3: pgvector Setup →

The Product Recommendation Disaster

I was tasked with building a "Similar Products" feature for an e-commerce site. My first attempt was embarrassingly naive:

// My terrible first attempt
function getSimilarProducts(product: Product) {
  return products.filter(p =>
    p.id !== product.id &&           // at least don't recommend the product itself
    p.category === product.category &&
    p.price >= product.price * 0.8 &&
    p.price <= product.price * 1.2
  );
}

The problem? This recommended a $1,200 MacBook Pro alongside a $1,000 Windows laptop because they were "similar price." Customers complained. The recommendations made no sense.

I needed a way to capture "similarity" beyond category and price. That's when I discovered embeddings.

After implementing vector embeddings:

// Using semantic similarity
const similar = await prisma.$queryRaw`
  SELECT * FROM products
  WHERE category = ${product.category}
    AND id != ${product.id}
  ORDER BY embedding <=> ${product.embedding}::vector
  LIMIT 6
`;

Click-through rate: 1.8% → 9.2%. Recommendations finally made sense.

This article explains what embeddings are, how they work, and how to use them effectively.

What Are Vector Embeddings?

Simple definition: Embeddings are numerical representations (arrays of numbers) that capture the semantic meaning of data.

Text as Numbers

Words and sentences are converted into vectors (arrays of numbers) where similar meanings have similar numbers.

The magic: Words with similar meanings have similar vectors → we can use math to find similar text.

Why Embeddings Work

Embeddings place words in multi-dimensional space where:

  • Distance = similarity

  • Close vectors = similar meaning

  • Far vectors = different meaning

In reality, embeddings use 384, 768, 1536, or more dimensions. More dimensions = more nuanced understanding.

How I Use Embeddings in Production

1. Generate Embeddings with OpenAI API
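A minimal sketch using the official openai Node SDK (it reads OPENAI_API_KEY from the environment):

// Minimal embedding call with the openai SDK
import OpenAI from "openai";

const openai = new OpenAI(); // picks up OPENAI_API_KEY from the environment

async function generateEmbedding(text: string): Promise<number[]> {
  const response = await openai.embeddings.create({
    model: "text-embedding-3-small",
    input: text,
  });
  return response.data[0].embedding; // 1536-dimensional vector
}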

2. Batch Processing for Efficiency

The OpenAI embeddings API accepts up to 2048 inputs per request. Always batch for production:
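A sketch of batched generation, reusing the client from the previous snippet (the 2048 figure is the per-request input limit mentioned above):

// Generate embeddings in batches instead of one request per text
async function embedInBatches(texts: string[], batchSize = 2048): Promise<number[][]> {
  const embeddings: number[][] = [];
  for (let i = 0; i < texts.length; i += batchSize) {
    const batch = texts.slice(i, i + batchSize);
    const response = await openai.embeddings.create({
      model: "text-embedding-3-small",
      input: batch, // the API accepts an array of inputs
    });
    // One embedding per input, returned in the same order
    embeddings.push(...response.data.map((d) => d.embedding));
  }
  return embeddings;
}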

Cost calculation:

  • 10,000 products × 50 tokens average = 500,000 tokens

  • At $0.02 per 1M tokens: 500,000 / 1,000,000 × $0.02 = $0.01 total

Embeddings are incredibly cheap.

3. Embeddings for Different Data Types

Product Descriptions
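One reasonable approach is to concatenate the fields that actually carry semantic meaning into a single string before embedding, reusing the generateEmbedding sketch from above. The field names here are illustrative; adapt them to your own model:

// Build the text to embed from the fields that carry meaning
// (name/category/description are illustrative field names)
function productToEmbeddingText(product: {
  name: string;
  category: string;
  description: string;
}): string {
  return `${product.name}\nCategory: ${product.category}\n${product.description}`;
}

const productEmbedding = await generateEmbedding(productToEmbeddingText(product));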

User Queries
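Queries are embedded as-is, with the same model used for the product embeddings. Mixing models puts the vectors in different spaces and makes the distances meaningless:

// Embed the raw user query with the SAME model used for the products
const queryEmbedding = await generateEmbedding("lightweight laptop for travel");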

Understanding Similarity Metrics

How do we measure if two vectors are "similar"? Math.

Cosine Similarity

Measures the angle between vectors, ignoring magnitude.

Range: -1 (opposite) to 1 (identical)
PostgreSQL pgvector operator: <=> (this returns cosine distance, 1 - cosine similarity, so smaller means more similar)

Euclidean Distance (L2)

Measures straight-line distance in vector space.

Range: 0 (identical) to ∞ (very different)
PostgreSQL pgvector operator: <->

Dot Product

Combines magnitude and angle. Use when vectors are normalized.

PostgreSQL pgvector operator: <#> (negative dot product)
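To make the definitions concrete, here are plain TypeScript versions of the three metrics. pgvector computes these in the database; this is only for illustration:

// Reference implementations of the three metrics
function dotProduct(a: number[], b: number[]): number {
  return a.reduce((sum, ai, i) => sum + ai * b[i], 0);
}

function cosineSimilarity(a: number[], b: number[]): number {
  const magnitudeA = Math.sqrt(dotProduct(a, a));
  const magnitudeB = Math.sqrt(dotProduct(b, b));
  return dotProduct(a, b) / (magnitudeA * magnitudeB); // -1 .. 1
}

function euclideanDistance(a: number[], b: number[]): number {
  return Math.sqrt(a.reduce((sum, ai, i) => sum + (ai - b[i]) ** 2, 0)); // 0 .. ∞
}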

Which One to Use?

Similarity Metric   | Best For                                         | pgvector Operator
Cosine Similarity   | Text embeddings (OpenAI, Sentence Transformers)  | <=>
Euclidean Distance  | Spatial data, coordinates, image embeddings      | <->
Dot Product         | Normalized vectors, some ML models               | <#>

For text with OpenAI embeddings, use cosine similarity (<=>).

Embedding Models Comparison

OpenAI Embeddings (What I Use)

Pros:

  • State-of-the-art quality

  • Easy API, no infrastructure

  • Multilingual support

  • Consistent updates

Cons:

  • Costs money (though cheap)

  • API dependency

  • Data leaves your infrastructure

Local Embedding Models (Free, Private)

Pros:

  • Free

  • No API calls / rate limits

  • Data stays private

  • Works offline

Cons:

  • Lower quality than OpenAI

  • Requires more setup

  • Slower (CPU inference)

  • Model management overhead
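If you do go local, a setup with Transformers.js and a Sentence Transformers model looks roughly like this; treat the package and model names below as assumptions to verify rather than a recommendation from this article:

// Local embeddings sketch (assumes the @xenova/transformers package and the
// Xenova/all-MiniLM-L6-v2 model -- verify both before relying on this)
import { pipeline } from "@xenova/transformers";

const extractor = await pipeline("feature-extraction", "Xenova/all-MiniLM-L6-v2");

async function localEmbedding(text: string): Promise<number[]> {
  // Mean pooling + normalization yields one 384-dimensional sentence vector
  const output = await extractor(text, { pooling: "mean", normalize: true });
  return Array.from(output.data as Float32Array);
}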

My Recommendation

Start with OpenAI text-embedding-3-small:

  • Production-ready immediately

  • Excellent quality

  • Very cheap ($0.02/1M tokens)

  • No infrastructure needed

Switch to local models if:

  • High volume (>100M embeddings)

  • Privacy requirements (healthcare, finance)

  • No internet access

  • Want to avoid API dependencies

Visualizing Embeddings (2D Projection)

These OpenAI embeddings are 1536-dimensional, which is impossible to visualize directly. We can use dimensionality reduction to project them to 2D:
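One rough way to do it is PCA. The ml-pca package below is an assumption (t-SNE or UMAP in Python tooling are more common choices), but the idea is the same:

// Project 1536-dimensional embeddings down to 2 dimensions for plotting
// (assumes the ml-pca npm package)
import { PCA } from "ml-pca";

function projectTo2D(embeddings: number[][]): number[][] {
  const pca = new PCA(embeddings);
  // Keep only the first two principal components
  return pca.predict(embeddings, { nComponents: 2 }).to2DArray();
}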

In the resulting 2D plot, similar items cluster together in vector space: related products land near each other and far from unrelated ones.

Practical Embedding Tips from Production

1. Text Preprocessing
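One reasonable light-touch version: strip markup, collapse whitespace, and truncate so the text stays under the model's input limit:

// Light preprocessing before embedding
function preprocessForEmbedding(text: string, maxChars = 8000): string {
  return text
    .replace(/<[^>]+>/g, " ") // drop HTML tags
    .replace(/\s+/g, " ")     // collapse whitespace
    .trim()
    .slice(0, maxChars);      // rough guard against the token limit
}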

2. Handle Empty or Short Text
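Embedding an empty string wastes a request and produces a meaningless vector, so filter those rows out up front. For example:

// Skip rows that have nothing meaningful to embed
// (field names are illustrative)
function prepareText(product: { name: string; description?: string | null }): string | null {
  const text = `${product.name} ${product.description ?? ""}`.trim();
  return text.length < 3 ? null : text;
}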

3. Cache Embeddings (They Don't Change)
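The same text always maps to the same vector, so an in-memory cache is the simplest win:

// Never pay to embed the same text twice within a process
const embeddingCache = new Map<string, number[]>();

async function getEmbeddingCached(text: string): Promise<number[]> {
  const cached = embeddingCache.get(text);
  if (cached) return cached;
  const embedding = await generateEmbedding(text);
  embeddingCache.set(text, embedding);
  return embedding;
}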

For persistent caching, store in database:
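A sketch of that, assuming an "embedding vector(1536)" pgvector column on the products table (Prisma has no native vector type, so raw SQL does the write):

// Persist the embedding on the row it describes
// (assumes an "embedding vector(1536)" column on products)
async function storeProductEmbedding(productId: string, embedding: number[]) {
  await prisma.$executeRaw`
    UPDATE products
    SET embedding = ${JSON.stringify(embedding)}::vector
    WHERE id = ${productId}
  `;
}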

4. Chunk Long Documents
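A simple character-based sketch with overlapping windows (token-aware splitters are better, but this shows the idea):

// Split a long document into overlapping chunks before embedding
function chunkText(text: string, chunkSize = 1000, overlap = 200): string[] {
  const chunks: string[] = [];
  let start = 0;
  while (start < text.length) {
    chunks.push(text.slice(start, start + chunkSize));
    start += chunkSize - overlap; // overlap preserves context across boundaries
  }
  return chunks;
}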

Why chunk?

  • OpenAI has ~8K token limit

  • Smaller chunks = more precise search

  • Overlap preserves context between chunks

5. Retry Logic for API Calls
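Embedding calls occasionally fail with rate limits or timeouts, so wrap them in retries with exponential backoff. A minimal sketch:

// Retry with exponential backoff for transient API failures
async function withRetry<T>(fn: () => Promise<T>, maxRetries = 3): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt <= maxRetries; attempt++) {
    try {
      return await fn();
    } catch (error) {
      lastError = error;
      if (attempt === maxRetries) break;
      const delayMs = 1000 * 2 ** attempt; // 1s, 2s, 4s, ...
      await new Promise((resolve) => setTimeout(resolve, delayMs));
    }
  }
  throw lastError;
}

// Usage
const safeEmbedding = await withRetry(() => generateEmbedding("wireless headphones"));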

Complete Example: Product Search with Embeddings
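Putting the pieces together: a minimal end-to-end sketch under the same assumptions as above (a products table with a pgvector "embedding vector(1536)" column, Prisma, and the openai SDK). Adapt the names to your schema:

// End-to-end: embed the query, then rank products by cosine distance with pgvector
import OpenAI from "openai";
import { PrismaClient } from "@prisma/client";

const openai = new OpenAI();
const prisma = new PrismaClient();

type ProductHit = {
  id: string;
  name: string;
  description: string;
  distance: number; // cosine distance: smaller = more similar
};

async function searchProducts(query: string, limit = 10): Promise<ProductHit[]> {
  // 1. Embed the query with the same model used for the stored product embeddings
  const response = await openai.embeddings.create({
    model: "text-embedding-3-small",
    input: query,
  });
  const queryEmbedding = response.data[0].embedding;

  // 2. Nearest-neighbour search with the <=> operator, smallest distance first
  return prisma.$queryRaw<ProductHit[]>`
    SELECT id, name, description,
           embedding <=> ${JSON.stringify(queryEmbedding)}::vector AS distance
    FROM products
    ORDER BY distance
    LIMIT ${limit}
  `;
}

// Usage
const hits = await searchProducts("lightweight laptop for travel", 5);
console.log(hits.map((h) => `${h.name} (distance ${h.distance.toFixed(3)})`));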

What's Next

In this article, you learned:

  • ✅ What embeddings are and why they work

  • ✅ How to generate embeddings with OpenAI API

  • ✅ Similarity metrics (cosine, euclidean, dot product)

  • ✅ Embedding models comparison

  • ✅ Production tips (batching, caching, chunking, retries)

  • ✅ Complete product search example

Next: We'll set up PostgreSQL with pgvector extension, create vector columns, and implement indexes for fast similarity search.


← Part 1: Introduction | Part 3: pgvector Setup →
