Part 3: Orchestrating Agents with OpenAI

Part of the Multi Agent Orchestration 101 Series

Why OpenAI Function Calling Is Perfect for Multi-Agent Work

I switched from prompt-engineering my tool calls (parsing TOOL: name args from raw text) to OpenAI function calling when I noticed my agents were hallucinating tool names. The model would invent tools that didn't exist, or pass arguments in the wrong format. Function calling enforces a contract: the model must emit a structured JSON object that matches a schema you define, or it returns a plain text response. No more parsing.

For multi-agent systems, this constraint is valuable. Each agent has a precise interface. Coordination becomes reliable.
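To make the contract concrete, here is what a function-calling schema looks like. It is plain JSON Schema; the tool name and fields below are illustrative, not from the series code:

```python
# A hypothetical tool schema in the shape OpenAI function calling expects.
# The model can only emit calls whose arguments validate against "parameters".
run_shell_schema = {
    "name": "run_shell",
    "description": "Run a shell command and return its stdout.",
    "parameters": {
        "type": "object",
        "properties": {
            "command": {
                "type": "string",
                "description": "The command to execute",
            },
        },
        "required": ["command"],
    },
}
```

The model either emits arguments matching this schema or answers in plain text; there is no third option to hallucinate.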


Prerequisites

pip install "openai>=1.14.0" python-dotenv aiosqlite
export OPENAI_API_KEY="sk-..."

Use python-dotenv and a .env file in practice — never hardcode keys.


Replacing _think() with an OpenAI Call

Taking the AgentV2 base from Part 2, here is a concrete OpenAI agent:

# openai_agent.py
from __future__ import annotations
import json
import os
from openai import AsyncOpenAI
from agent_v2 import AgentV2, Message
from dispatcher import ToolDispatcher

client = AsyncOpenAI(api_key=os.environ["OPENAI_API_KEY"])


class OpenAIAgent(AgentV2):
    model: str = "gpt-4o-mini"

    def __init__(
        self,
        name: str,
        system_prompt: str,
        dispatcher: ToolDispatcher,
        model: str = "gpt-4o-mini",
        persist: bool = False,
    ) -> None:
        super().__init__(
            name=name,
            system_prompt=system_prompt,
            dispatcher=dispatcher,
            persist=persist,
        )
        self.model = model

    async def _think(self) -> str:
        """Call OpenAI with current memory and available tools."""
        messages = [{"role": "system", "content": self.system_prompt}]
        for msg in self.memory:
            if msg.role == "tool":
                messages.append({
                    "role": "tool",
                    "tool_call_id": msg.metadata.get("tool_call_id", "unknown"),
                    "content": msg.content,
                })
            else:
                messages.append({"role": msg.role, "content": msg.content})

        response = await client.chat.completions.create(
            model=self.model,
            messages=messages,
            tools=[
                {"type": "function", "function": schema}
                for schema in self.dispatcher.schemas()
            ],
            tool_choice="auto",
        )

        choice = response.choices[0]

        # Model wants to call a tool
        if choice.finish_reason == "tool_calls":
            tool_call = choice.message.tool_calls[0]
            name = tool_call.function.name
            args = tool_call.function.arguments  # already a JSON string

            # Store the assistant message with tool_calls so history is valid
            self.memory.append(
                Message(
                    role="assistant",
                    content="",
                    metadata={"tool_calls": [tool_call.model_dump()]},
                )
            )
            # Return in the format AgentV2.run() expects
            return f"TOOL: {name} {args}"

        # Plain response
        content = choice.message.content or ""
        return f"DONE: {content}"

One important detail: when the model emits a tool_calls finish, you must append that assistant message (with tool_calls in metadata) to history before the tool result. OpenAI's API validates message ordering strictly.
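Concretely, the history the API will accept interleaves like this (the IDs and contents below are illustrative):

```python
# Valid ordering: the assistant message carrying tool_calls must precede
# the tool message that answers it, and the tool_call_id values must match.
history = [
    {"role": "user", "content": "Which Python version is installed?"},
    {
        "role": "assistant",
        "content": None,
        "tool_calls": [{
            "id": "call_abc123",  # illustrative id
            "type": "function",
            "function": {
                "name": "run_shell",
                "arguments": '{"command": "python3 --version"}',
            },
        }],
    },
    {"role": "tool", "tool_call_id": "call_abc123", "content": "Python 3.12.1"},
]
```

Drop the assistant message (or mismatch the IDs) and the API rejects the whole request.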


The Supervisor Pattern

The most useful multi-agent pattern with OpenAI is the supervisor: one agent that breaks down a goal and delegates subtasks to specialised worker agents.

The supervisor's tools are the worker agents themselves. From the supervisor's perspective, calling a worker is just another tool call.

Now build the full system:
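The full listing depends on the OpenAIAgent and ToolDispatcher classes from earlier in the series, so here is just the core wiring as a sketch: a helper that exposes any worker with an async run(task) method as a (schema, handler) pair the supervisor's dispatcher could register. The helper name make_worker_tool is mine, not part of the series code:

```python
import asyncio


def make_worker_tool(name: str, description: str, worker):
    """Wrap a worker agent (anything with `async run(task) -> str`)
    as a (schema, handler) pair for a supervisor's tool dispatcher."""
    schema = {
        "name": name,
        "description": description,
        "parameters": {
            "type": "object",
            "properties": {
                "task": {
                    "type": "string",
                    "description": "The subtask to delegate to this worker",
                },
            },
            "required": ["task"],
        },
    }

    async def handler(task: str) -> str:
        # From the supervisor's perspective this is just a tool call;
        # the fact that another agent handles it is invisible.
        return await worker.run(task)

    return schema, handler
```

A supervisor on a strong planning model would register shell_worker and memory_worker this way, while the workers themselves can run on a cheaper model like gpt-4o-mini.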

When I run this on my machine, the supervisor calls shell_worker with python3 --version, gets back the version string, then calls memory_worker to store it. The final response confirms both steps. The supervisor never ran a shell command directly — it delegated.


Parallel Agent Execution

Sequential delegation is simple but slow. When subtasks are independent, run workers in parallel with asyncio.gather:
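A minimal sketch, assuming each worker exposes an async run(task) coroutine as in the agents above; the helper name delegate_parallel is my own:

```python
import asyncio


async def delegate_parallel(workers, tasks):
    """Fan independent subtasks out to workers concurrently,
    returning results in the same order as the tasks."""
    return await asyncio.gather(
        *(worker.run(task) for worker, task in zip(workers, tasks)),
        return_exceptions=True,  # one failed worker shouldn't sink the batch
    )
```

With return_exceptions=True, a failing worker yields its exception in the results list instead of cancelling the siblings, so the supervisor can decide how to handle partial results.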

I use this pattern extensively when a supervisor has to gather information from multiple sources before synthesising an answer.


Handling Errors and Retries

Agents fail in production: models return malformed JSON, tools time out, and API rate limits kick in. A simple retry wrapper:
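A minimal sketch with exponential back-off; the helper name with_retries is mine:

```python
import asyncio


async def with_retries(coro_factory, attempts: int = 3, base_delay: float = 1.0):
    """Call coro_factory() up to `attempts` times, doubling the wait
    between tries. Re-raises the last error if every attempt fails."""
    for attempt in range(attempts):
        try:
            return await coro_factory()
        except Exception:
            if attempt == attempts - 1:
                raise
            await asyncio.sleep(base_delay * 2 ** attempt)
```

Note it takes a factory (a zero-argument callable returning a fresh coroutine), not a coroutine object — a coroutine can only be awaited once, so each retry needs a new one.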

For rate limiting specifically, catch openai.RateLimitError and sleep for longer. I use tenacity in production for more sophisticated retry policies, but the pattern above is enough to understand the concept.


Key Takeaways

  • OpenAI function calling enforces a tool interface contract — no more parsing free text

  • The supervisor pattern: strong model for planning, cheaper models for execution

  • Worker agents are just tools from the supervisor's perspective

  • asyncio.gather makes parallel delegation straightforward

  • Retry with exponential back-off covers most transient failures


Up Next

Part 4: Claude Multi-Agent Workflow — the same patterns using Anthropic's tool use API, plus Claude's extended thinking for supervisor-level planning.
