RAG 101

A practical series on building a production-grade RAG system using Python 3.12, FastAPI, and PostgreSQL with pgvector.

I've been maintaining this git-book for a while: hundreds of markdown articles across Kubernetes, architecture, DevOps, AI, and more. At some point it became difficult to find things I'd already written. A vector search over my own knowledge base was the obvious solution. This series documents exactly how I built it: from understanding what RAG actually is, to a running FastAPI service that answers questions against my personal documentation.

No fake product scenarios. No contrived "imagine you have 10 million documents" examples. This is a real system I built for a real personal need.


The Project

Goal: A self-hosted RAG service that:

  • Ingests markdown files from this git-book (or any text corpus)

  • Embeds them with sentence-transformers or the GitHub Models API and stores the vectors in pgvector

  • Answers natural-language questions by retrieving relevant chunks and calling an LLM

  • Exposes a REST API via FastAPI
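The goals above map onto a small pipeline. What follows is a minimal, dependency-free sketch of the ingest, embed, retrieve, and prompt steps; it is not the code from this series. `toy_embed` is a hypothetical bag-of-words stand-in for a real embedding model such as sentence-transformers, an in-memory list stands in for pgvector, and every function name here is an illustrative assumption.

```python
import math
import re
from collections import Counter


def chunk_markdown(text: str, max_chars: int = 200) -> list[str]:
    """Split on blank lines, then pack paragraphs into chunks of roughly max_chars.

    A single paragraph longer than max_chars is kept whole rather than cut.
    """
    paragraphs = [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]
    chunks: list[str] = []
    current = ""
    for para in paragraphs:
        if current and len(current) + len(para) + 1 > max_chars:
            chunks.append(current)
            current = para
        else:
            current = f"{current}\n{para}" if current else para
    if current:
        chunks.append(current)
    return chunks


def toy_embed(text: str) -> Counter:
    """Hypothetical stand-in for an embedding model: lowercase word counts."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(question: str, chunks: list[str], k: int = 2) -> list[str]:
    """Return the k chunks most similar to the question."""
    q = toy_embed(question)
    ranked = sorted(
        chunks, key=lambda c: cosine_similarity(q, toy_embed(c)), reverse=True
    )
    return ranked[:k]


def build_prompt(question: str, context: list[str]) -> str:
    """Assemble the text that would be sent to the LLM."""
    joined = "\n---\n".join(context)
    return f"Answer using only this context:\n{joined}\n\nQuestion: {question}"
```

In a real service the embedding call and the similarity ranking would be handled by the embedding model and the database respectively; the shape of the flow (chunk, embed, rank, prompt) is the part this sketch is meant to convey.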

Stack

| Layer | Technology |
| --- | --- |
| Language | Python 3.12 |
| API Framework | FastAPI (async) |
| Vector Store | PostgreSQL 16 + pgvector |
| ORM / Migrations | SQLAlchemy 2 async + Alembic |
| Embedding Models | sentence-transformers (all-MiniLM-L6-v2) / GitHub Models API |
| LLM | GitHub Models API (GPT-4o) |
| Infra | Docker Compose (single VM) |
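To illustrate how the embedding model and the vector store fit together, here is a small sketch of a top-k similarity query against pgvector. The table name `chunks` and its columns are hypothetical, not taken from the project. The `<=>` operator is pgvector's cosine distance, and all-MiniLM-L6-v2 produces 384-dimensional vectors. The helper only builds the SQL text and its parameters (using `:name` placeholders, as SQLAlchemy's `text()` accepts), so it runs without a database.

```python
EMBEDDING_DIM = 384  # output size of all-MiniLM-L6-v2

# Hypothetical schema: chunks(id, source_path, content, embedding vector(384))
TOP_K_SQL = (
    "SELECT id, source_path, content, "
    "1 - (embedding <=> CAST(:query_vec AS vector)) AS cosine_similarity "
    "FROM chunks "
    "ORDER BY embedding <=> CAST(:query_vec AS vector) "
    "LIMIT :k"
)


def to_pgvector_literal(vec: list[float]) -> str:
    """Format floats in pgvector's text input form, e.g. '[0.1,0.2,0.3]'."""
    if len(vec) != EMBEDDING_DIM:
        raise ValueError(f"expected {EMBEDDING_DIM} dimensions, got {len(vec)}")
    return "[" + ",".join(repr(float(x)) for x in vec) + "]"


def build_top_k_query(query_vec: list[float], k: int = 5) -> tuple[str, dict]:
    """Return the SQL string and bound parameters for a top-k cosine search."""
    return TOP_K_SQL, {"query_vec": to_pgvector_literal(query_vec), "k": k}
```

Ordering by `<=>` ascending returns the nearest chunks first; `1 - distance` converts the cosine distance back into a similarity score for display.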


Series Structure

Phase 1: Foundations

  • What is RAG and Why I Built One

  • pgvector on PostgreSQL: Setup, Vector Types, and Indexes

Phase 2: Ingestion Pipeline

  • Document Loading and Chunking Strategies

  • Generating and Storing Embeddings in pgvector

Phase 3: Retrieval and Generation

  • Semantic Search, Cosine Similarity, and Hybrid Retrieval

  • Prompt Construction and the Generation Layer

Phase 4: Production Service

  • Wrapping Everything in a FastAPI Service


Project File Tree


How to Follow Along

Each article is self-contained. You can read in order or jump to any topic. Code snippets reference actual file paths from the project tree above.

Dependencies covered across the series:
