# Part 6: AI Agents and Communication Protocols

*Part of the* [*AI Fundamentals 101 Series*](https://blog.htunnthuthu.com/ai-and-machine-learning/artificial-intelligence/ai-fundamentals-101)

## Beyond Text Generation: AI That Takes Action

Everything we've covered so far — ML models, NLP, LLMs, RAG — has been about systems that **process input and return output**. You ask a question, you get an answer. You provide data, you get a prediction.

AI agents are different. An agent doesn't just generate text — it **decides what to do next**, **takes actions**, and **loops until the task is complete**.

The moment this clicked for me was when I was building my home lab monitoring system. The original version was RAG-based: I'd paste alert data, the LLM would analyze it and tell me what to do. But I still had to *do it* — run kubectl commands, check logs, restart pods. The agent version does all of that itself. It reads the alert, decides which tools to use, executes them, evaluates the results, and takes the next step.

That shift from "AI as advisor" to "AI as actor" is the fundamental leap of agentic AI.

***

## What Makes an Agent Different from an LLM Call?

```python
# A plain LLM call: input → output (one shot)
def llm_call(prompt: str) -> str:
    """One-shot: send prompt, get response, done."""
    response = call_model(prompt)
    return response

# An agent: observe → think → act → repeat
class SimpleAgent:
    """Multi-step: reasons, takes actions, checks results, continues."""

    def __init__(self, tools: dict):
        self.tools = tools
        self.history: list[dict] = []

    def run(self, task: str) -> str:
        """Execute a task through an observe-think-act loop."""
        self.history.append({"role": "user", "content": task})

        for step in range(10):  # Max 10 steps (safety limit)
            # THINK: LLM decides what to do next
            response = call_model(
                system="You are a DevOps agent. Use tools to complete tasks. "
                       "When done, respond with DONE: <summary>",
                messages=self.history
            )
            # Record the agent's own turn so later steps can see what it decided
            self.history.append({"role": "assistant", "content": response})

            # CHECK: Is the task complete?
            if response.startswith("DONE:"):
                return response

            # ACT: Parse and execute the tool call
            tool_name, args = parse_tool_call(response)
            if tool_name in self.tools:
                result = self.tools[tool_name](**args)
                self.history.append({
                    "role": "tool",
                    "content": f"Tool '{tool_name}' returned: {result}"
                })
            else:
                # Feed the error back so the model can self-correct next step
                self.history.append({
                    "role": "tool",
                    "content": f"Error: unknown tool '{tool_name}'"
                })

        return "Reached maximum steps without completing task."
```
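The `call_model` and `parse_tool_call` helpers above are assumed. As one minimal sketch of the parsing side, suppose we prompt the model to emit tool calls in a `TOOL: <name> <json-args>` convention (a convention invented here for illustration; production frameworks use the provider's structured tool-calling API instead):

```python
import json
import re

def parse_tool_call(response: str) -> tuple[str, dict]:
    """Extract a tool name and JSON arguments from a model response.

    Assumes the model was prompted to emit: TOOL: <name> <json-args>
    Returns ("", {}) when no tool call is found.
    """
    match = re.search(r"TOOL:\s*(\w+)\s*(\{.*\})?", response, re.DOTALL)
    if not match:
        return "", {}
    name = match.group(1)
    args = json.loads(match.group(2)) if match.group(2) else {}
    return name, args

# Example:
name, args = parse_tool_call('TOOL: get_pod_logs {"pod_name": "payment-svc-7d8b"}')
# name == "get_pod_logs", args == {"pod_name": "payment-svc-7d8b"}
```

The fragile part is exactly why structured tool-calling APIs exist: free-text parsing breaks the moment the model phrases its call slightly differently.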

The key differences:

| Feature             | LLM Call                         | AI Agent                                        |
| ------------------- | -------------------------------- | ----------------------------------------------- |
| **Steps**           | Single request/response          | Multi-step loop                                 |
| **Actions**         | Generates text only              | Executes tools and commands                     |
| **Memory**          | Stateless (or limited context)   | Maintains history across steps                  |
| **Decision-making** | None — returns one response      | Decides which tool to use, when to stop         |
| **Autonomy**        | Zero — does exactly what's asked | Partial — figures out *how* to accomplish goals |

***

## Agent Architectures

### The ReAct Pattern (Reasoning + Acting)

The most common agent architecture. The LLM alternates between **reasoning** about what to do and **acting** on that reasoning.

```python
class ReActAgent:
    """
    ReAct loop:
    1. THOUGHT: The LLM reasons about the current situation
    2. ACTION: The LLM chooses a tool and arguments
    3. OBSERVATION: The tool returns results
    4. Repeat until the task is complete
    """

    def __init__(self):
        self.tools = {
            "get_pod_status": self.get_pod_status,
            "get_pod_logs": self.get_pod_logs,
            "describe_pod": self.describe_pod,
            "restart_pod": self.restart_pod,
        }

    def get_pod_status(self, namespace: str = "default") -> str:
        """Simulated: get pod status in a namespace."""
        return (
            "payment-svc-7d8b   Running   2 restarts   CPU:92%  Mem:95%\n"
            "api-gateway-abc1   Running   0 restarts   CPU:45%  Mem:60%\n"
            "db-proxy-xyz9      Running   0 restarts   CPU:30%  Mem:40%"
        )

    def get_pod_logs(self, pod_name: str, lines: int = 20) -> str:
        """Simulated: get recent pod logs."""
        return (
            "java.lang.OutOfMemoryError: Java heap space\n"
            "  at com.payment.processor.TransactionBatch.process(TransactionBatch.java:142)\n"
            "Container killed due to OOM. Memory limit: 512Mi, Usage: 510Mi"
        )

    def describe_pod(self, pod_name: str) -> str:
        """Simulated: describe pod details."""
        return (
            f"Pod: {pod_name}\n"
            "  Memory Limit: 512Mi\n"
            "  Memory Request: 256Mi\n"
            "  JVM Args: -Xmx256m -Xms128m\n"
            "  Restarts: 2 (last restart 5 min ago)"
        )

    def restart_pod(self, pod_name: str) -> str:
        """Simulated: restart a pod (with safety check)."""
        return f"Pod {pod_name} deleted. New pod scheduling..."

    def demonstrate_react_trace(self):
        """Show what a ReAct trace looks like."""
        trace = [
            {
                "step": 1,
                "thought": "User reports pod issues. Let me check pod status first.",
                "action": "get_pod_status(namespace='production')",
                "observation": "payment-svc-7d8b has 2 restarts and 95% memory usage."
            },
            {
                "step": 2,
                "thought": "payment-svc-7d8b has high memory and restarts. Let me check logs.",
                "action": "get_pod_logs(pod_name='payment-svc-7d8b')",
                "observation": "OutOfMemoryError: Java heap space. Container OOMKilled."
            },
            {
                "step": 3,
                "thought": "OOMKilled with Java heap error. Let me check memory limits vs JVM config.",
                "action": "describe_pod(pod_name='payment-svc-7d8b')",
                "observation": "Memory limit 512Mi, JVM heap 256m. Total JVM likely exceeds 512Mi."
            },
            {
                "step": 4,
                "thought": "Root cause identified: JVM heap (256m) + metaspace + threads + GC ≈ 500-600Mi, "
                          "exceeding the 512Mi container limit. Fix: increase memory limit to 768Mi "
                          "or reduce JVM heap.",
                "action": "DONE",
                "observation": None
            }
        ]

        for step in trace:
            print(f"\n--- Step {step['step']} ---")
            print(f"💭 Thought: {step['thought']}")
            print(f"🔧 Action:  {step['action']}")
            if step['observation']:
                print(f"👁️ Observe:  {step['observation']}")

agent = ReActAgent()
agent.demonstrate_react_trace()
```

### Tool Use: How Agents Interact with the World

Tools are functions that the agent can call. The LLM decides *which* tool to use and *what arguments* to pass.

```python
# Defining tools for an agent
tool_definitions = [
    {
        "name": "kubectl_get",
        "description": "Run kubectl get commands to list resources",
        "parameters": {
            "resource": "The resource type (pods, services, deployments)",
            "namespace": "The Kubernetes namespace (default: 'default')",
            "output": "Output format (wide, json, yaml)"
        }
    },
    {
        "name": "kubectl_logs",
        "description": "Get logs from a pod",
        "parameters": {
            "pod_name": "Name of the pod",
            "tail": "Number of lines from the end (default: 50)",
            "container": "Specific container name (optional)"
        }
    },
    {
        "name": "prometheus_query",
        "description": "Query Prometheus metrics",
        "parameters": {
            "query": "PromQL query string",
            "duration": "Time range (e.g., '5m', '1h')"
        }
    },
    {
        "name": "create_jira_ticket",
        "description": "Create a Jira ticket for tracking",
        "parameters": {
            "title": "Ticket title",
            "description": "Ticket description",
            "priority": "P1/P2/P3/P4",
            "assignee": "Team or person to assign"
        }
    }
]

# The LLM sees these definitions and decides which to call.
# For example, given "investigate why payment-service is slow":
# 1. Calls prometheus_query → gets latency metrics
# 2. Calls kubectl_get → finds which pods are affected
# 3. Calls kubectl_logs → checks for errors
# 4. Calls create_jira_ticket → documents the findings
```

### Planning: Agents That Think Ahead

Some agents don't just react — they plan a sequence of actions before executing:

```python
class PlanningAgent:
    """Agent that creates a plan before executing."""

    def create_plan(self, task: str) -> list[str]:
        """LLM generates a step-by-step plan."""
        # In practice, this would be an LLM call
        plan = [
            "1. Check current pod status across production namespace",
            "2. Identify pods with high restart counts or resource pressure",
            "3. For each problematic pod, check logs for error patterns",
            "4. Cross-reference with recent deployments (last 24h)",
            "5. Check Prometheus metrics for correlated anomalies",
            "6. Compile findings into a root cause analysis",
            "7. Generate recommended remediation steps",
        ]
        return plan

    def execute_plan(self, plan: list[str]) -> list[dict]:
        """Execute each step and collect results."""
        results = []
        for step in plan:
            print(f"Executing: {step}")
            # Each step maps to one or more tool calls
            # The agent adapts if a step fails or reveals new information
            result = self.execute_step(step)
            results.append({"step": step, "result": result})

            # Agent can revise the plan based on findings
            if "critical" in str(result).lower():
                print("  ⚠️ Critical finding — adding escalation step")
                plan.append("8. Escalate to on-call engineer via PagerDuty")

        return results

    def execute_step(self, step: str) -> str:
        """Simulated step execution."""
        return f"Completed: {step[:50]}..."

agent = PlanningAgent()
plan = agent.create_plan("Diagnose production latency spike")
print("Plan:")
for step in plan:
    print(f"  {step}")
```

***

## Communication Protocols: How Agents Connect to Tools and Data

As AI agents become more capable, a critical question arises: **how does an agent talk to external tools, data sources, and other agents?**

Three protocols dominate this space, and understanding when to use each is essential.

### MCP (Model Context Protocol)

**MCP**, an open protocol introduced by Anthropic in late 2024, standardizes how LLMs and agents connect to external tools and data sources. Think of it as a "USB-C for AI" — one standard interface that any tool can implement and any compatible client can consume.

```python
# MCP: A standard way for agents to discover and use tools

# Without MCP: every tool integration is custom
# With MCP: tools expose a standard interface that any MCP-compatible agent can use

# Conceptual MCP server definition
mcp_server = {
    "name": "kubernetes-mcp",
    "version": "1.0",
    "tools": [
        {
            "name": "get_pods",
            "description": "List pods in a namespace",
            "input_schema": {
                "type": "object",
                "properties": {
                    "namespace": {"type": "string", "default": "default"},
                    "label_selector": {"type": "string"}
                }
            }
        },
        {
            "name": "get_logs",
            "description": "Get logs from a pod",
            "input_schema": {
                "type": "object",
                "properties": {
                    "pod_name": {"type": "string"},
                    "lines": {"type": "integer", "default": 50}
                },
                "required": ["pod_name"]
            }
        }
    ],
    "resources": [
        {
            "uri": "k8s://production/deployments",
            "name": "Production deployments",
            "description": "Current state of all production deployments"
        }
    ]
}

# Any MCP-compatible agent (VS Code Copilot, Claude Desktop, custom agents)
# can discover and use this tool automatically.
# No custom integration code needed on the agent side.
```

**Key benefit:** Tool authors build once, every MCP client uses it. I've integrated MCP servers with VS Code Copilot in my own workflow — one protocol handles database queries, Kubernetes commands, and documentation search.
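To make the "build once, use everywhere" point concrete, here's a sketch of the client side, assuming a server definition in the same shape as the `mcp_server` dict above: the agent discovers tools from the advertised schemas and dispatches calls generically, with no per-tool integration code. (Real MCP clients speak JSON-RPC over stdio or HTTP; this only mimics the shape.)

```python
def discover_tools(server: dict) -> dict[str, dict]:
    """Build a name → schema index from a server's advertised tools."""
    return {tool["name"]: tool for tool in server.get("tools", [])}

def dispatch(server_impl: dict, tools: dict, name: str, args: dict) -> str:
    """Validate a call against the advertised schema, then invoke it."""
    schema = tools[name]["input_schema"]
    for required in schema.get("required", []):
        if required not in args:
            raise ValueError(f"Missing required argument: {required}")
    # Fill in advertised defaults for omitted arguments
    for prop, spec in schema["properties"].items():
        if prop not in args and "default" in spec:
            args[prop] = spec["default"]
    return server_impl[name](**args)

# A toy server definition plus its implementation (illustrative names)
server_def = {
    "tools": [{
        "name": "get_logs",
        "description": "Get logs from a pod",
        "input_schema": {
            "type": "object",
            "properties": {
                "pod_name": {"type": "string"},
                "lines": {"type": "integer", "default": 50},
            },
            "required": ["pod_name"],
        },
    }]
}
server_impl = {"get_logs": lambda pod_name, lines: f"last {lines} lines of {pod_name}"}

tools = discover_tools(server_def)
print(dispatch(server_impl, tools, "get_logs", {"pod_name": "payment-svc-7d8b"}))
# → last 50 lines of payment-svc-7d8b
```

Notice the client never hard-codes `get_logs`: swap in a different server definition and the same `discover_tools`/`dispatch` pair keeps working.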

### API (REST/GraphQL)

Traditional REST APIs remain the most common way to connect to external services.

```python
# REST API: the traditional way agents call external services

import json

# Agent decides to call an API
def agent_api_call(endpoint: str, method: str, payload: dict | None = None) -> dict:
    """Simulated API call from an agent."""
    # In practice: httpx.get/post/etc. (for GET, the payload becomes query params)
    print(f"  {method} {endpoint}")
    if payload:
        print(f"  Payload: {json.dumps(payload, indent=2)}")
    return {"status": "success", "data": "..."}

# Agent workflow using REST APIs
print("Agent: Investigating production incident\n")

# Step 1: Query monitoring API
agent_api_call(
    "https://prometheus.internal/api/v1/query",
    "GET",
    {"query": "rate(http_requests_total{status='500'}[5m])"}
)

# Step 2: Check deployment history
agent_api_call(
    "https://argocd.internal/api/v1/applications/payment-svc/history",
    "GET"
)

# Step 3: Create incident ticket
agent_api_call(
    "https://jira.internal/rest/api/2/issue",
    "POST",
    {"fields": {"summary": "Payment service 500 errors", "priority": {"name": "P2"}}}
)
```

### MCP vs API: When to Use Which

```python
mcp_vs_api = {
    "Use MCP when": [
        "Building tools meant for AI agent consumption",
        "Want automatic tool discovery (agents find your tools themselves)",
        "Need a standard interface across many different tools",
        "Integrating with MCP-compatible clients (Copilot, Claude Desktop)",
    ],
    "Use REST API when": [
        "Building services for human-operated applications",
        "Need fine-grained authentication and authorization",
        "Integrating with existing infrastructure (most services have REST APIs)",
        "Performance is critical (REST is well-optimized)",
    ],
    "Key difference": (
        "APIs are designed for developers to call from code. "
        "MCP is designed for AI agents to discover and call dynamically."
    )
}

for key, value in mcp_vs_api.items():
    print(f"\n{key}:")
    if isinstance(value, list):
        for item in value:
            print(f"  - {item}")
    else:
        print(f"  {value}")
```

### gRPC: High-Performance Agent Communication

**gRPC** uses Protocol Buffers and HTTP/2 for high-performance, strongly-typed communication. It's relevant when agents need to communicate at high throughput.

```python
# gRPC: defined via .proto files (strongly typed)

# Example: a monitoring service that agents query
proto_definition = """
service MonitoringService {
    rpc GetMetrics(MetricsRequest) returns (MetricsResponse);
    rpc StreamAlerts(AlertFilter) returns (stream Alert);
}

message MetricsRequest {
    string namespace = 1;
    string metric_name = 2;
    int32 duration_minutes = 3;
}

message MetricsResponse {
    repeated DataPoint data_points = 1;
    float average = 2;
    float p99 = 3;
}
"""

# MCP vs gRPC comparison:
comparison = {
    "MCP": {
        "transport": "JSON over stdio/HTTP",
        "typing": "JSON Schema",
        "discovery": "Built-in (agents discover tools automatically)",
        "best_for": "AI agent ↔ tool communication",
        "speed": "Moderate"
    },
    "gRPC": {
        "transport": "Protocol Buffers over HTTP/2",
        "typing": "Strongly typed (.proto files)",
        "discovery": "Requires service registry (Consul, etcd)",
        "best_for": "Microservice ↔ microservice communication",
        "speed": "Very fast (binary protocol, streaming)"
    }
}

for protocol, details in comparison.items():
    print(f"\n{protocol}:")
    for k, v in details.items():
        print(f"  {k}: {v}")
```

### A2A (Agent-to-Agent): Multi-Agent Communication

**A2A** (Agent2Agent), a protocol introduced by Google, lets agents communicate with each other — not with tools, but with *other agents*.

```python
# A2A: How agents communicate with each other
# MCP = Agent ↔ Tool (agent uses tools)
# A2A = Agent ↔ Agent (agents collaborate)

# Example: A monitoring agent detects an issue and delegates to a remediation agent

a2a_workflow = {
    "Monitor Agent": {
        "role": "Watches metrics and detects anomalies",
        "sends_to": "Triage Agent",
        "message": {
            "type": "alert",
            "severity": "critical",
            "details": "payment-svc memory at 98%, 3 OOMKills in 10 min",
            "context": {"namespace": "production", "pod_pattern": "payment-svc-*"}
        }
    },
    "Triage Agent": {
        "role": "Assesses severity and routes to specialists",
        "sends_to": "Remediation Agent",
        "message": {
            "type": "remediation_request",
            "diagnosis": "OOMKill due to JVM heap misconfiguration",
            "approved_actions": ["restart_pod", "scale_hpa"],
            "requires_human_approval": ["modify_deployment"]
        }
    },
    "Remediation Agent": {
        "role": "Executes approved fixes",
        "sends_to": "Monitor Agent",
        "message": {
            "type": "action_result",
            "actions_taken": ["Restarted payment-svc-7d8b", "Scaled HPA to min=5"],
            "status": "stabilizing"
        }
    }
}

print("A2A Multi-Agent Workflow:")
for agent, details in a2a_workflow.items():
    print(f"\n  {agent} ({details['role']})")
    print(f"  → Sends to: {details['sends_to']}")
    print(f"  → Message type: {details['message']['type']}")
```
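The hand-offs above can be sketched as a tiny in-process message bus, where each agent is addressable by name and reacts to typed messages. This is a toy illustration of the pattern, not the actual A2A wire protocol:

```python
class AgentBus:
    """Minimal in-process message router between named agents."""

    def __init__(self):
        self.agents = {}   # name → handler function
        self.log = []      # (sender, recipient, message type) audit trail

    def register(self, name: str, handler):
        self.agents[name] = handler

    def send(self, sender: str, recipient: str, message: dict):
        self.log.append((sender, recipient, message["type"]))
        self.agents[recipient](message)

bus = AgentBus()

def monitor(message: dict):
    if message["type"] == "action_result":
        print(f"Monitor: remediation reported '{message['status']}'")

def triage(message: dict):
    if message["type"] == "alert" and message["severity"] == "critical":
        bus.send("triage", "remediation",
                 {"type": "remediation_request", "approved_actions": ["restart_pod"]})

def remediation(message: dict):
    if message["type"] == "remediation_request":
        bus.send("remediation", "monitor",
                 {"type": "action_result", "status": "stabilizing"})

bus.register("monitor", monitor)
bus.register("triage", triage)
bus.register("remediation", remediation)

# Monitor agent detects an anomaly and kicks off the chain
bus.send("monitor", "triage",
         {"type": "alert", "severity": "critical",
          "details": "payment-svc memory at 98%"})
```

One send triggers the full monitor → triage → remediation → monitor loop, and the bus's audit log records every hop: exactly the traceability you want before letting agents act on production.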

### Protocol Summary

```python
protocol_summary = {
    "Protocol": ["MCP", "REST API", "gRPC", "A2A"],
    "Communication": [
        "Agent ↔ Tool",
        "App ↔ Service",
        "Service ↔ Service",
        "Agent ↔ Agent"
    ],
    "When to Use": [
        "Agent needs to use external tools/data",
        "Traditional service integration",
        "High-throughput microservice communication",
        "Multiple agents collaborating on a task"
    ],
    "Data Format": [
        "JSON (flexible)",
        "JSON/XML (flexible)",
        "Protobuf (binary, typed)",
        "JSON (standardized agent messages)"
    ]
}

print(f"{'Protocol':<12} {'Communication':<22} {'When to Use':<45} {'Format':<20}")
print("-" * 102)
for i in range(len(protocol_summary["Protocol"])):
    print(f"{protocol_summary['Protocol'][i]:<12} "
          f"{protocol_summary['Communication'][i]:<22} "
          f"{protocol_summary['When to Use'][i]:<45} "
          f"{protocol_summary['Data Format'][i]:<20}")
```

***

## Human-in-the-Loop: When AI Should Ask Before Acting

Not every agent action should be autonomous. **Human-in-the-loop (HITL)** is the pattern where the agent pauses to get human approval before executing high-risk actions.

```python
class HumanInTheLoopAgent:
    """Agent that requests human approval for risky actions."""

    # Actions classified by risk level
    SAFE_ACTIONS = {"get_pods", "get_logs", "get_metrics", "describe_resource"}
    RISKY_ACTIONS = {"restart_pod", "scale_deployment"}
    DANGEROUS_ACTIONS = {"delete_pod", "delete_deployment", "modify_hpa",
                         "apply_manifest", "drain_node"}

    def execute_action(self, action: str, args: dict) -> str:
        """Execute an action, requesting approval if needed."""
        if action in self.SAFE_ACTIONS:
            # Read-only: execute immediately
            print(f"  ✅ Auto-approved (read-only): {action}")
            return self.run_tool(action, args)

        elif action in self.RISKY_ACTIONS:
            # Risky: execute with notification
            print(f"  ⚠️ Executing with notification: {action}")
            self.notify_human(action, args)
            return self.run_tool(action, args)

        elif action in self.DANGEROUS_ACTIONS:
            # Dangerous: require explicit approval
            print(f"  🛑 Requires approval: {action}")
            approved = self.request_approval(action, args)
            if approved:
                return self.run_tool(action, args)
            return f"Action {action} rejected by human operator."

        # Unclassified actions: default-deny rather than silently returning None
        print(f"  🚫 Unknown action rejected: {action}")
        return f"Action {action} is not in any approved category."

    def request_approval(self, action: str, args: dict) -> bool:
        """Request human approval (via Slack, web UI, etc.)."""
        print(f"     Requesting approval for: {action}({args})")
        print("     Waiting for human response...")
        # In practice: post to Slack, wait for reaction
        # Here: simulate approval
        return True

    def notify_human(self, action: str, args: dict):
        """Notify human about an action (non-blocking)."""
        print(f"     📨 Notified: executing {action}({args})")

    def run_tool(self, action: str, args: dict) -> str:
        """Simulated tool execution."""
        return f"Executed: {action}"


# Example workflow
agent = HumanInTheLoopAgent()
print("Agent investigating production issue:\n")
agent.execute_action("get_pods", {"namespace": "production"})
agent.execute_action("get_logs", {"pod": "payment-svc-7d8b"})
agent.execute_action("restart_pod", {"pod": "payment-svc-7d8b"})
agent.execute_action("delete_deployment", {"name": "payment-svc", "namespace": "production"})
```

**My rule:** Read operations are always auto-approved. Write operations that are reversible (restart) get executed with notification. Write operations that are destructive (delete, drain) require explicit approval. This matches how I handle production operations as a human — the agent should have the same safety bars.

***

## When to Use Agents vs Plain LLM Calls

```python
use_agent_when = [
    "The task requires multiple steps that depend on each other",
    "The system needs to interact with external tools (APIs, databases, CLIs)",
    "The task requires real-time information gathering",
    "Different outcomes require different follow-up actions",
    "The task involves monitoring and responding to changing conditions",
]

use_plain_llm_when = [
    "Single question → single answer (no tools needed)",
    "Text generation (summaries, translations, code)",
    "Classification or analysis of provided data",
    "The task is stateless — no need to check results and continue",
    "Determinism matters more than adaptability",
]

print("Use an AGENT when:")
for item in use_agent_when:
    print(f"  ✅ {item}")
print("\nUse a plain LLM CALL when:")
for item in use_plain_llm_when:
    print(f"  📝 {item}")
```

***

## What's Next

In the final article, we'll zoom out and look at the complete picture: **The AI Stack and Building Real AI Systems** — from hardware to models to applications, why most AI projects fail, and how to approach AI development responsibly.

***

*Next:* [*Part 7 — The AI Stack and Building Real AI Systems*](https://blog.htunnthuthu.com/ai-and-machine-learning/artificial-intelligence/ai-fundamentals-101/part-7-ai-stack-and-building-systems)

***

[← Part 5: RAG, Fine-Tuning, and Prompt Engineering](https://blog.htunnthuthu.com/ai-and-machine-learning/artificial-intelligence/ai-fundamentals-101/part-5-rag-finetuning-prompt-engineering) · [Series Overview](https://blog.htunnthuthu.com/ai-and-machine-learning/artificial-intelligence/ai-fundamentals-101) · [Next →](https://blog.htunnthuthu.com/ai-and-machine-learning/artificial-intelligence/ai-fundamentals-101/part-7-ai-stack-and-building-systems)
