AI Agent Development 101

Build a single, capable AI agent from scratch in Python 3 — then power it with OpenAI and Claude.

Why I Wrote This Series

I spent a long time treating AI agents as black boxes. I'd drop a task into AutoGPT or a LangChain agent, watch it run, and have no idea why it made the decisions it did. When it failed — and it often did — I had nothing to debug.

The turning point was reading the original ReAct paper (Yao et al., 2022). The idea is simple: alternate between reasoning (write a thought) and acting (call a tool). That's the core of every capable AI agent. Once I understood that, I stopped cargo-culting framework code and started building agents I could reason about.

This series documents what I learned building AI agents for my own projects. It goes deeper on the single-agent case than my Multi Agent Orchestration 101 series — covering the ReAct reasoning loop, memory systems, prompt engineering, and how to evaluate whether your agent is actually working.


How This Differs from Multi Agent Orchestration 101

| This Series | Multi Agent Orchestration 101 |
| --- | --- |
| One agent, built deeply | Multiple agents coordinating |
| ReAct reasoning loop | Supervisor/worker delegation |
| Memory architecture | Message bus and shared context |
| Prompt engineering per model | API differences OpenAI vs Claude |
| Evaluation and testing | Production observability |

Read this series first if you are new to agents. Come back to the orchestration series when you need agents to collaborate.


What You Will Learn

Part 1: Agent Foundations and the ReAct Loop

  • The ReAct pattern: Thought → Action → Observation → repeat

  • Building a ReAct loop in pure Python 3 with no LLM

  • Adding a proper reasoning trace for debugging

  • Why chain-of-thought prompting matters for tool selection
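The loop Part 1 builds can be sketched in a few lines. This is a minimal, illustrative version, assuming a scripted policy in place of an LLM — all names here (`Step`, `run_agent`, `scripted_policy`) are mine, not the series' code:

```python
# Minimal ReAct loop with a scripted "reasoner" standing in for an LLM.
from dataclasses import dataclass

@dataclass
class Step:
    thought: str
    action: str          # a tool name, or "finish"
    action_input: str
    observation: str = ""

def calculator(expr: str) -> str:
    # Toy tool: evaluate an arithmetic expression.
    return str(eval(expr, {"__builtins__": {}}))

TOOLS = {"calculator": calculator}

def run_agent(task: str, policy, max_steps: int = 5) -> tuple[str, list[Step]]:
    """Alternate Thought -> Action -> Observation until the policy finishes."""
    trace: list[Step] = []
    for _ in range(max_steps):
        thought, action, action_input = policy(task, trace)
        step = Step(thought, action, action_input)
        trace.append(step)
        if action == "finish":
            return action_input, trace   # final answer plus reasoning trace
        step.observation = TOOLS[action](action_input)
    return "gave up", trace

# Scripted policy: use the calculator once, then finish with its result.
def scripted_policy(task, trace):
    if not trace:
        return ("I should compute this.", "calculator", "2 + 3 * 4")
    return ("I have the result.", "finish", trace[0].observation)

answer, trace = run_agent("What is 2 + 3 * 4?", scripted_policy)
print(answer)  # 14
```

Swapping `scripted_policy` for a function that asks a model what to do next is, in essence, all Parts 3 and 4 add. The returned `trace` is the reasoning trace used for debugging.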

Part 2: Agent Memory and State

  • The four types of agent memory (sensory, short-term, long-term, episodic)

  • Implementing a sliding context window

  • Semantic memory with a local vector store (no external service)

  • Persisting and restoring agent state across sessions
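As a taste of the memory material, here is one way a sliding context window can work: keep the newest messages that fit a token budget. This sketch approximates tokens by whitespace-separated words — a real agent would use the model's tokenizer — and the function name is illustrative:

```python
# Sliding context window: retain the newest messages within a token budget.
def trim_context(messages: list[dict], budget: int) -> list[dict]:
    """Drop the oldest messages until the estimated token count fits."""
    def est_tokens(msg: dict) -> int:
        # Crude estimate; substitute a real tokenizer in practice.
        return len(msg["content"].split())

    kept: list[dict] = []
    used = 0
    for msg in reversed(messages):      # walk newest-first
        cost = est_tokens(msg)
        if used + cost > budget:
            break
        kept.append(msg)
        used += cost
    return list(reversed(kept))         # restore chronological order

history = [
    {"role": "user", "content": "first question about agents"},
    {"role": "assistant", "content": "a long answer " * 10},
    {"role": "user", "content": "follow up"},
]
window = trim_context(history, budget=25)  # only the newest message fits
```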

Part 3: Building an Agent with OpenAI

  • System prompt engineering for reliable tool selection

  • OpenAI function calling in the ReAct loop

  • Structured output for deterministic agent responses

  • Streaming responses with tool calls interleaved
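To preview the shape of Part 3: a tool is declared as a JSON schema and the agent dispatches the tool calls the model returns. The schema below follows the Chat Completions `tools` parameter format; the network call itself is omitted, and `get_weather` is a hypothetical tool of mine:

```python
# Declaring a tool for OpenAI function calling and dispatching a tool call
# locally. Arguments arrive as a JSON string and must be decoded.
import json

def get_weather(city: str) -> str:
    # Hypothetical tool implementation.
    return f"Sunny in {city}"

TOOL_SCHEMA = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}

REGISTRY = {"get_weather": get_weather}

def dispatch(tool_call: dict) -> str:
    """Run the function named in a tool call with its JSON arguments."""
    fn = REGISTRY[tool_call["name"]]
    args = json.loads(tool_call["arguments"])
    return fn(**args)

# A tool call in the shape the API returns: name + JSON-encoded arguments.
result = dispatch({"name": "get_weather", "arguments": '{"city": "Oslo"}'})
```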

Part 4: Building an Agent with Claude

  • Why system prompt structure matters more with Claude

  • Anthropic's tool use API in the ReAct loop

  • Extended thinking as an explicit reasoning step

  • Prompt caching for long-running agents
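For contrast with Part 3, the same tool in Anthropic's Messages API shape: the schema lives under `input_schema`, and `tool_use` content blocks arrive with `input` already parsed as a dict, so there is no JSON-decoding step. The tool body and IDs below are illustrative, and the network call is again omitted:

```python
# An Anthropic-style tool declaration and tool_use handling.
CLAUDE_TOOL = {
    "name": "get_weather",
    "description": "Get current weather for a city.",
    "input_schema": {
        "type": "object",
        "properties": {"city": {"type": "string"}},
        "required": ["city"],
    },
}

def handle_tool_use(block: dict) -> dict:
    """Turn a tool_use content block into the tool_result block sent back."""
    assert block["type"] == "tool_use"
    output = f"Sunny in {block['input']['city']}"   # hypothetical tool body
    return {
        "type": "tool_result",
        "tool_use_id": block["id"],   # ties the result to the request
        "content": output,
    }

result = handle_tool_use(
    {"type": "tool_use", "id": "toolu_01", "name": "get_weather",
     "input": {"city": "Oslo"}}
)
```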

Part 5: Evaluating and Testing Your Agent

  • What "correct" means for an agent (it's not accuracy)

  • Deterministic tests for tool dispatch

  • Trajectory evaluation: did the agent take a sensible path?

  • Regression testing when you upgrade the underlying model


Prerequisites

  • Python 3.11+

  • Familiarity with async/await

  • An OpenAI or Anthropic API key

If you've read LLM API Development 101, you're already prepared.


Series Parts

| Part | Title | Focus |
| --- | --- | --- |
| 1 | Agent Foundations and the ReAct Loop | Pure Python ReAct — no LLM yet |
| 2 | Agent Memory and State | Context window, vector memory, persistence |
| 3 | Building an Agent with OpenAI | Function calling, structured output, streaming |
| 4 | Building an Agent with Claude | Tool use, extended thinking, prompt caching |
| 5 | Evaluating and Testing Your Agent | Deterministic tests, trajectory eval, regression |
