AI Agent Development 101

Build a single, capable AI agent from scratch in Python 3 — then power it with OpenAI and Claude.

Why I Wrote This Series

I spent a long time treating AI agents as black boxes. I'd drop a task into AutoGPT or a LangChain agent, watch it run, and have no idea why it made the decisions it did. When it failed — and it often did — I had nothing to debug.

The turning point was reading the original ReAct paper (Yao et al., 2022). The idea is simple: alternate between reasoning (write a thought) and acting (call a tool). That's the core of every capable AI agent. Once I understood that, I stopped cargo-culting framework code and started building agents I could reason about.

This series documents what I learned building AI agents for my own projects. It goes deeper on the single-agent case than my Multi Agent Orchestration 101 series — covering the ReAct reasoning loop, memory systems, prompt engineering, and how to evaluate whether your agent is actually working.


How This Differs from Multi Agent Orchestration 101

| This Series | Multi Agent Orchestration 101 |
| --- | --- |
| One agent, built deeply | Multiple agents coordinating |
| ReAct reasoning loop | Supervisor/worker delegation |
| Memory architecture | Message bus and shared context |
| Prompt engineering per model | API differences OpenAI vs Claude |
| Evaluation and testing | Production observability |

Read this series first if you are new to agents. Come back to the orchestration series when you need agents to collaborate.


What You Will Learn

Part 1: Agent Foundations and the ReAct Loop

  • The ReAct pattern: Thought → Action → Observation → repeat

  • Building a ReAct loop in pure Python 3 with no LLM

  • Adding a proper reasoning trace for debugging

  • Why chain-of-thought prompting matters for tool selection
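The loop Part 1 builds can be sketched in a few lines. This is a minimal, illustrative version, assuming a scripted policy in place of an LLM — all names here (`Step`, `run_agent`, `scripted_policy`) are mine, not the series' code:

```python
# Minimal ReAct loop with a scripted "reasoner" standing in for an LLM.
from dataclasses import dataclass

@dataclass
class Step:
    thought: str
    action: str          # a tool name, or "finish"
    action_input: str
    observation: str = ""

def calculator(expr: str) -> str:
    # Toy tool: evaluate an arithmetic expression.
    return str(eval(expr, {"__builtins__": {}}))

TOOLS = {"calculator": calculator}

def run_agent(task: str, policy, max_steps: int = 5) -> tuple[str, list[Step]]:
    """Alternate Thought -> Action -> Observation until the policy finishes."""
    trace: list[Step] = []
    for _ in range(max_steps):
        thought, action, action_input = policy(task, trace)
        step = Step(thought, action, action_input)
        trace.append(step)
        if action == "finish":
            return action_input, trace   # final answer plus reasoning trace
        step.observation = TOOLS[action](action_input)
    return "gave up", trace

# Scripted policy: use the calculator once, then finish with its result.
def scripted_policy(task, trace):
    if not trace:
        return ("I should compute this.", "calculator", "2 + 3 * 4")
    return ("I have the result.", "finish", trace[0].observation)

answer, trace = run_agent("What is 2 + 3 * 4?", scripted_policy)
print(answer)  # 14
```

Swapping `scripted_policy` for a function that asks a model what to do next is, in essence, all Parts 3 and 4 add. The returned `trace` is the reasoning trace used for debugging.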

Part 2: Agent Memory and State

  • The four types of agent memory (sensory, short-term, long-term, episodic)

  • Implementing a sliding context window

  • Semantic memory with a local vector store (no external service)

  • Persisting and restoring agent state across sessions
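As a taste of the memory material, here is one way a sliding context window can work: keep the newest messages that fit a token budget. This sketch approximates tokens by whitespace-separated words — a real agent would use the model's tokenizer — and the function name is illustrative:

```python
# Sliding context window: retain the newest messages within a token budget.
def trim_context(messages: list[dict], budget: int) -> list[dict]:
    """Drop the oldest messages until the estimated token count fits."""
    def est_tokens(msg: dict) -> int:
        # Crude estimate; substitute a real tokenizer in practice.
        return len(msg["content"].split())

    kept: list[dict] = []
    used = 0
    for msg in reversed(messages):      # walk newest-first
        cost = est_tokens(msg)
        if used + cost > budget:
            break
        kept.append(msg)
        used += cost
    return list(reversed(kept))         # restore chronological order

history = [
    {"role": "user", "content": "first question about agents"},
    {"role": "assistant", "content": "a long answer " * 10},
    {"role": "user", "content": "follow up"},
]
window = trim_context(history, budget=25)  # only the newest message fits
```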

Part 3: Building an Agent with OpenAI

  • System prompt engineering for reliable tool selection

  • OpenAI function calling in the ReAct loop

  • Structured output for deterministic agent responses

  • Streaming responses with tool calls interleaved
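To preview the shape of Part 3: a tool is declared as a JSON schema and the agent dispatches the tool calls the model returns. The schema below follows the Chat Completions `tools` parameter format; the network call itself is omitted, and `get_weather` is a hypothetical tool of mine:

```python
# Declaring a tool for OpenAI function calling and dispatching a tool call
# locally. Arguments arrive as a JSON string and must be decoded.
import json

def get_weather(city: str) -> str:
    # Hypothetical tool implementation.
    return f"Sunny in {city}"

TOOL_SCHEMA = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}

REGISTRY = {"get_weather": get_weather}

def dispatch(tool_call: dict) -> str:
    """Run the function named in a tool call with its JSON arguments."""
    fn = REGISTRY[tool_call["name"]]
    args = json.loads(tool_call["arguments"])
    return fn(**args)

# A tool call in the shape the API returns: name + JSON-encoded arguments.
result = dispatch({"name": "get_weather", "arguments": '{"city": "Oslo"}'})
```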

Part 4: Building an Agent with Claude

  • Why system prompt structure matters more with Claude

  • Anthropic's tool use API in the ReAct loop

  • Extended thinking as an explicit reasoning step

  • Prompt caching for long-running agents
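For contrast with Part 3, the same tool in Anthropic's Messages API shape: the schema lives under `input_schema`, and `tool_use` content blocks arrive with `input` already parsed as a dict, so there is no JSON-decoding step. The tool body and IDs below are illustrative, and the network call is again omitted:

```python
# An Anthropic-style tool declaration and tool_use handling.
CLAUDE_TOOL = {
    "name": "get_weather",
    "description": "Get current weather for a city.",
    "input_schema": {
        "type": "object",
        "properties": {"city": {"type": "string"}},
        "required": ["city"],
    },
}

def handle_tool_use(block: dict) -> dict:
    """Turn a tool_use content block into the tool_result block sent back."""
    assert block["type"] == "tool_use"
    output = f"Sunny in {block['input']['city']}"   # hypothetical tool body
    return {
        "type": "tool_result",
        "tool_use_id": block["id"],   # ties the result to the request
        "content": output,
    }

result = handle_tool_use(
    {"type": "tool_use", "id": "toolu_01", "name": "get_weather",
     "input": {"city": "Oslo"}}
)
```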

Part 5: Evaluating and Testing Your Agent

  • What "correct" means for an agent (it's not accuracy)

  • Deterministic tests for tool dispatch

  • Trajectory evaluation: did the agent take a sensible path?

  • Regression testing when you upgrade the underlying model


Prerequisites

  • Python 3.11+

  • Familiarity with async/await

  • An OpenAI or Anthropic API key

If you've read LLM API Development 101, you're already prepared.


Series Parts

| Part | Title | Focus |
| --- | --- | --- |
| 1 | Agent Foundations and the ReAct Loop | Pure Python ReAct — no LLM yet |
| 2 | Agent Memory and State | Context window, vector memory, persistence |
| 3 | Building an Agent with OpenAI | Function calling, structured output, streaming |
| 4 | Building an Agent with Claude | Tool use, extended thinking, prompt caching |
| 5 | Evaluating and Testing Your Agent | Deterministic tests, trajectory eval, regression |
