# Part 1: What is an AI Engineer?

## The Role That Didn't Exist Five Years Ago

When I started my career, the path was clear: you were either a software engineer or a data scientist. Software engineers built APIs, deployed services, and wrote infrastructure code. Data scientists trained models, ran experiments, and wrote notebooks. The two worlds occasionally overlapped, but they had different tools, different workflows, and different definitions of "done."

Then large language models happened.

Suddenly, you could build intelligent systems without training a single model. The bottleneck shifted from "can we build a model that does this?" to "can we build a reliable system around a model that already does this?" That shift created the AI engineer role — someone who bridges software engineering discipline with enough AI understanding to build production systems that use language models, embeddings, and retrieval effectively.

I came to this from the software engineering side. I was building backend services, DevOps automation, and Kubernetes infrastructure. When I started integrating LLMs into my projects — a RAG service over my personal knowledge base, an LLM-powered monitoring agent, automation tools that use natural language — I realized I needed a different skill set than what I had.

Not entirely different. Maybe 60% of what makes a good AI engineer is just being a good software engineer. But that other 40% — understanding tokens, embeddings, prompt design, evaluation — makes the difference between a demo that works and a system you can actually rely on.

***

## AI Engineer vs ML Engineer vs Data Scientist

These roles overlap, but the core focus is different. Here's how I think about it after working across all three areas:

| Dimension                   | Data Scientist                | ML Engineer                        | AI Engineer                                |
| --------------------------- | ----------------------------- | ---------------------------------- | ------------------------------------------ |
| **Primary output**          | Insights, models, experiments | Trained models, training pipelines | AI-powered applications and services       |
| **Core skill**              | Statistics, experimentation   | Model training, MLOps              | Software engineering + AI integration      |
| **Typical tools**           | Jupyter, pandas, scikit-learn | PyTorch, Kubeflow, MLflow          | FastAPI, LLM APIs, vector databases        |
| **Trains models?**          | Yes — often from scratch      | Yes — at scale                     | Rarely — uses pre-trained models and APIs  |
| **Writes production code?** | Sometimes                     | Yes                                | Always                                     |
| **Cares about latency?**    | Not usually                   | For inference, yes                 | For every request, yes                     |
| **Evaluation approach**     | Accuracy, F1, loss curves     | Model performance metrics          | End-to-end system quality, user experience |

The boundaries are blurry. I've done work that falls into all three columns. But when I'm building an AI-powered API that takes a user question, retrieves relevant context from a vector store, constructs a prompt, calls an LLM, and returns a structured response — that's AI engineering.

***

## The AI Engineer Skills Map

Through building my own projects, I've found the skills fall into four categories:

### 1. Software Engineering Fundamentals (the 60%)

This is the foundation. Without solid software engineering, your AI system will be a pile of notebooks that only works on your laptop.

```python
# This is what AI engineering code looks like — it's software engineering
from fastapi import FastAPI, HTTPException
from pydantic import BaseModel, Field
import httpx

app = FastAPI()

class QuestionRequest(BaseModel):
    question: str = Field(..., min_length=1, max_length=1000)
    max_tokens: int = Field(default=512, ge=1, le=4096)

class AnswerResponse(BaseModel):
    answer: str
    sources: list[str]
    tokens_used: int

@app.post("/ask", response_model=AnswerResponse)
async def ask_question(request: QuestionRequest) -> AnswerResponse:
    # retrieve → prompt → generate → respond
    # This is a software engineering pattern with AI components
    ...
```

What matters here:

* **API design**: REST endpoints, request/response schemas, error handling
* **Async programming**: LLM calls are I/O-bound; you need concurrency
* **Type safety**: Pydantic models for every input and output
* **Testing**: pytest for deterministic parts, evaluation for non-deterministic parts
* **Deployment**: Docker, CI/CD, observability
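The async point deserves a concrete illustration. Here's a minimal sketch of why concurrency matters for I/O-bound LLM calls — `fake_llm_call` is a stand-in I've made up to simulate network latency, not a real client:

```python
import asyncio

async def fake_llm_call(prompt: str) -> str:
    # Stand-in for a real LLM API call; simulates network latency
    await asyncio.sleep(0.05)
    return f"answer to: {prompt}"

async def answer_all(prompts: list[str]) -> list[str]:
    # Awaiting each call sequentially would take ~0.05s * len(prompts).
    # gather() runs them concurrently, so total time stays ~0.05s.
    return await asyncio.gather(*(fake_llm_call(p) for p in prompts))

results = asyncio.run(answer_all(["q1", "q2", "q3"]))
```

The same pattern applies whether you're fanning out over document chunks, calling multiple models, or handling concurrent user requests — the LLM call itself is just slow I/O.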

### 2. LLM Understanding (the core AI knowledge)

You don't need to train models from scratch. But you need to understand how they work well enough to debug problems.

What I've found essential:

* **Tokenization**: knowing that `"ChatGPT"` becomes multiple tokens, and why that matters for context windows
* **Context windows**: understanding that a 128k context window doesn't mean you should stuff 128k tokens into every request
* **Temperature and sampling**: knowing when to use `temperature=0` (structured extraction) vs `temperature=0.7` (creative generation)
* **Model capabilities**: understanding what GPT-4o is good at vs what Claude Sonnet handles better
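To make the token-budget point concrete, here's a toy sketch. The ~4-characters-per-token figure is a rough heuristic for English text, not an exact rule — in a real system you'd count with the model's actual tokenizer (e.g. tiktoken for OpenAI models). Both function names are illustrative:

```python
def estimate_tokens(text: str) -> int:
    # Rough heuristic: ~4 characters per token for English text.
    # For exact counts, use the model's own tokenizer.
    return max(1, len(text) // 4)

def fits_context(prompt: str, max_context: int, reserved_for_output: int) -> bool:
    # A prompt must leave room for the model's response;
    # a 128k window is a limit, not a target.
    return estimate_tokens(prompt) + reserved_for_output <= max_context
```

Even this crude check catches the most common failure mode: stuffing so much retrieved context into the prompt that the model has no room left to answer.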

### 3. Retrieval and Embeddings

Most useful AI systems retrieve information before generating. Understanding this pipeline changed how I build everything:

```
User question
     ↓
Generate embedding (sentence-transformers or API)
     ↓
Vector similarity search (pgvector, cosine distance)
     ↓
Retrieve top-k relevant chunks
     ↓
Construct prompt with retrieved context
     ↓
Call LLM → generate response
```

This is the RAG pattern, and it's the backbone of most AI engineering work I've done.
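The pipeline above can be sketched in plain Python. This is a toy version — hand-written vectors instead of a real embedding model, a list instead of pgvector, and the names `retrieve_top_k` and `build_prompt` are my own illustrations — but the shape is the same:

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def retrieve_top_k(query_vec: list[float],
                   chunks: list[tuple[str, list[float]]],
                   k: int = 2) -> list[str]:
    # chunks: (text, embedding) pairs; rank by similarity to the query
    ranked = sorted(chunks,
                    key=lambda c: cosine_similarity(query_vec, c[1]),
                    reverse=True)
    return [text for text, _ in ranked[:k]]

def build_prompt(question: str, context_chunks: list[str]) -> str:
    context = "\n\n".join(context_chunks)
    return (f"Answer using only the context below.\n\n"
            f"Context:\n{context}\n\nQuestion: {question}")

chunks = [("deploy docs", [1.0, 0.0]),
          ("billing faq", [0.0, 1.0]),
          ("deploy runbook", [0.9, 0.1])]
top = retrieve_top_k([1.0, 0.0], chunks, k=2)
prompt = build_prompt("How do I deploy?", top)
```

In production, the embeddings come from a model, the search runs inside the database, and the prompt goes to an LLM — but every RAG system I've built is this loop with better parts.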

### 4. Evaluation and Quality

This is the hardest part. When your system returns a natural-language answer, how do you know if it's good?

I've learned to think about evaluation at three levels:

* **Component-level**: Does the retrieval return relevant documents? (measurable with precision/recall)
* **Output-level**: Is the generated answer accurate and helpful? (requires human judgment or LLM-as-judge)
* **System-level**: Does the user get value? (requires logging, feedback, and iteration)
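The component level is the easiest to automate. A minimal sketch of retrieval precision and recall — assuming you have a small labeled set of (query, relevant documents) pairs, which you'd have to build yourself:

```python
def precision_recall(retrieved: set[str], relevant: set[str]) -> tuple[float, float]:
    # precision: what fraction of retrieved docs were relevant?
    # recall: what fraction of relevant docs did we retrieve?
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall

p, r = precision_recall(retrieved={"doc1", "doc2", "doc3"},
                        relevant={"doc1", "doc4"})
```

Output-level and system-level evaluation need fuzzier tools (LLM-as-judge, user feedback), but a deterministic retrieval metric like this is where I start — if retrieval is bad, nothing downstream can save the answer.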

***

## Where AI Engineers Fit in a Team

In my experience, AI engineers sit at the intersection of three concerns:

```
        Product / UX
            │
            ▼
    ┌───────────────┐
    │  AI Engineer  │
    └───────────────┘
       ▲           ▲
       │           │
  Backend /     ML / Data
  Platform      Science
```

* **From product**: "We need the system to answer customer questions about their orders"
* **From ML/data science**: "Here's a fine-tuned model" or "Use this embedding model"
* **The AI engineer's job**: Build a reliable service that takes user questions, retrieves relevant order data, calls the model through a well-designed prompt, evaluates the output quality, and serves it through an API with proper error handling, caching, and observability

***

## My Path to AI Engineering

I didn't plan to become an AI engineer. I was building backend services and DevOps tooling. Here's roughly how the transition happened:

**Phase 1 — Curiosity**: I started calling the OpenAI API from a Python script to summarize log files. No architecture, no error handling, just `httpx.post()` and `print()`.

**Phase 2 — First real project**: I built a RAG service over my own git-book — the knowledge base you're reading right now. This forced me to learn about embeddings, vector search, chunking strategies, and prompt construction. Suddenly I needed pgvector, sentence-transformers, and proper async code.

**Phase 3 — Production thinking**: I built an LLM-powered DevOps monitoring agent. This required observability (tracking token usage and latency), guardrails (preventing the LLM from executing dangerous commands), and evaluation (making sure the agent's suggestions were actually correct).

**Phase 4 — System design**: Now I think about AI systems the same way I think about any distributed system — with contracts, failure modes, testing strategies, and deployment pipelines. The AI part is a component, not the whole system.

***

## What This Series Covers

Each part builds on the previous one. By the end, you'll have built a working AI-powered system from scratch:

```python
# What you'll be able to build by Part 8
#
# A complete AI-powered question-answering API:
# - Python 3.12, FastAPI, async throughout
# - Embedding generation with sentence-transformers
# - Vector search with pgvector
# - Prompt engineering with structured output
# - LLM integration with streaming
# - Evaluation pipeline with LLM-as-judge
# - Production observability and cost tracking
```

The code is real. The patterns come from my own projects. The problems I discuss are problems I actually hit.

***

## Prerequisites

Before starting this series, you should be comfortable with:

* **Python 3.11+**: functions, classes, type hints, async/await
* **REST APIs**: HTTP methods, request/response patterns, JSON
* **Command line**: pip/uv, virtual environments, running scripts
* **Git**: version control basics

If you're not there yet, start with [Python 101](https://blog.htunnthuthu.com/getting-started/programming/python-101) and [REST API 101](https://blog.htunnthuthu.com/getting-started/programming/rest-api-101) in this git-book.

***

## What You'll Need

```bash
# Python 3.12
python3 --version  # 3.12.x

# uv for package management (faster than pip)
curl -LsSf https://astral.sh/uv/install.sh | sh

# A GitHub Models API key (free tier works)
# or OpenAI / Anthropic API key

# Docker (for PostgreSQL + pgvector)
docker --version
```

We'll set up the full development environment in Part 2.

***

**Next:** [**Part 2 — Python Tooling for AI Engineers**](https://blog.htunnthuthu.com/ai-and-machine-learning/artificial-intelligence/ai-engineer-101/part-2-python-tooling)
