How to Build Multi-Agent Workflows with OpenAI Agents SDK Python 2025

Building production-grade AI systems requires more than a single LLM prompt. When you need agents to delegate tasks, validate outputs, maintain conversation history, and work across long-running sessions, the OpenAI Agents SDK provides a lightweight framework designed exactly for this use case.

Unlike generic LLM libraries, the OpenAI Agents SDK (Python 3.10+) lets you orchestrate multiple agents with built-in support for tool calling, guardrails, human-in-the-loop approval, and tracing. This guide walks you through setting up your first multi-agent system.

Prerequisites and Installation

Before you start, ensure you have:

  • Python 3.10 or newer
  • An OpenAI API key
  • Familiarity with async Python (helpful but not required)

Install via venv (Standard)

python -m venv .venv
source .venv/bin/activate  # On Windows: .venv\Scripts\activate
pip install openai-agents

Install via uv (Faster Alternative)

If you use uv for package management:

uv init
uv add openai-agents

Optional Dependencies

  • Voice/Realtime agents: pip install 'openai-agents[voice]'
  • Redis session support: pip install 'openai-agents[redis]'

Core Concepts: Agents, Tools, and Handoffs

The OpenAI Agents SDK revolves around four key components:

| Component | Purpose | Example |
|-----------|---------|---------|
| Agents | LLM instances with instructions, tools, and guardrails | A research agent that summarizes papers |
| Tools | Functions or APIs agents can call | Web search, file operations, API calls |
| Handoffs | Agent-to-agent delegation | Research agent → Writing agent → Editor agent |
| Guardrails | Input/output validation and safety checks | Block harmful prompts, validate JSON responses |

When an agent receives a task, it can:

  1. Call a tool directly
  2. Hand off to another agent for specialized work
  3. Request human feedback before proceeding
  4. Store context in a session for multi-turn conversations

Building Your First Agent

Here's a minimal example of a data analysis agent that calls tools and hands off to a reporting agent:

from agents import Agent, Runner, function_tool

# Define tools available to agents
@function_tool
def fetch_sales_data(month: str) -> str:
    """Fetch monthly sales data."""
    return f"Sales data for {month}: $150,000 (retrieved from database)"

@function_tool
def generate_chart(data: str) -> str:
    """Generate a visualization."""
    return f"Chart created: {data}"

# Create the report writer agent first so the analyst can hand off to it
writer_agent = Agent(
    name="Report Writer",
    instructions="You write clear, executive-friendly reports based on analysis provided.",
)

# Create the main analyst agent with its tools and handoff target
analyst_agent = Agent(
    name="Data Analyst",
    instructions="You analyze sales data and prepare summaries. When finished, hand off to the Report Writer.",
    tools=[fetch_sales_data, generate_chart],
    handoffs=[writer_agent],
)

# Run the workflow
result = Runner.run_sync(
    analyst_agent,
    "Analyze Q3 sales data and write an executive summary",
)

print(result.final_output)

Using Sandbox Agents for Long-Running Tasks

For workflows that require filesystem access, command execution, or persistent state across multiple steps, Sandbox Agents are ideal. They execute within a controlled container environment.

from agents.sandbox import Manifest, SandboxAgent, SandboxRunConfig
from agents.sandbox.entries import GitRepo
from agents.sandbox.sandboxes import UnixLocalSandboxClient
from agents import Runner

# Define what the sandbox can access
manifest = Manifest(
    entries={
        "repo": GitRepo(repo="your-username/your-repo"),
    }
)

# Create a sandbox agent
agent = SandboxAgent(
    name="Code Reviewer",
    instructions="Inspect the code repository, identify issues, and suggest improvements.",
    default_manifest=manifest,
)

# Run with sandbox config
config = SandboxRunConfig(
    sandbox_client=UnixLocalSandboxClient(),
)

result = Runner.run_sync(
    agent,
    "Review the codebase for security vulnerabilities",
    run_config=config,
)

Sandbox agents automatically manage workspace state, so they can inspect files, run git commands, apply patches, and maintain context across long task sequences.

Managing Sessions and Conversation History

The SDK can manage conversation history for you: pass a session object to each run and the agent picks up where it left off, with no manual history tracking:

from agents import Agent, Runner, SQLiteSession

agent = Agent(
    name="Customer Support",
    instructions="Help customers troubleshoot issues. Remember previous context.",
)

# Create a persistent session keyed by a conversation ID
session = SQLiteSession("customer_123")

# Multiple calls with the same session maintain context
result1 = Runner.run_sync(
    agent,
    "I can't log into my account",
    session=session,
)

result2 = Runner.run_sync(
    agent,
    "I tried resetting my password but it didn't work",
    session=session,  # The agent remembers the login issue
)

For distributed setups, enable Redis session support with pip install 'openai-agents[redis]' to share state across multiple processes.
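Conceptually, a session simply persists conversation items between runs and replays them to the model on the next call. A toy in-memory sketch of that idea, mirroring the shape of the SDK's session interface (get_items/add_items) but not its actual implementation:

```python
class InMemorySession:
    """Toy session: stores conversation items per conversation ID."""

    def __init__(self, session_id: str):
        self.session_id = session_id
        self._items: list[dict] = []

    def get_items(self) -> list[dict]:
        # History handed to the model at the start of each run
        return list(self._items)

    def add_items(self, items: list[dict]) -> None:
        # New user/assistant turns appended after each run
        self._items.extend(items)

session = InMemorySession("customer_123")
session.add_items([{"role": "user", "content": "I can't log into my account"}])
session.add_items([{"role": "assistant", "content": "Try resetting your password."}])
print(len(session.get_items()))  # 2
```

A real session backend (SQLite, Redis) does the same bookkeeping with durable storage.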

Adding Guardrails for Safety

Guardrails validate inputs and outputs as the agent runs; when a guardrail's tripwire fires, the run stops with an exception your application can catch:

from agents import Agent, GuardrailFunctionOutput, output_guardrail

@output_guardrail
async def no_credentials_in_response(ctx, agent, output) -> GuardrailFunctionOutput:
    """Ensure the response contains no credential-like content."""
    forbidden_keywords = ["password", "api_key", "secret"]
    leaked = any(word in str(output).lower() for word in forbidden_keywords)
    return GuardrailFunctionOutput(
        output_info={"leaked": leaked},
        tripwire_triggered=leaked,  # True stops the run with a tripwire exception
    )

agent = Agent(
    name="API Helper",
    instructions="Help users integrate APIs securely.",
    output_guardrails=[no_credentials_in_response],
)
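Because the validation logic inside a guardrail is ordinary Python, it is easy to unit-test in isolation. For example, the keyword check above extracted into a plain helper (the function name here is illustrative):

```python
def leaks_credentials(output: str) -> bool:
    """Return True if the text contains credential-like keywords."""
    forbidden_keywords = ["password", "api_key", "secret"]
    return any(word in output.lower() for word in forbidden_keywords)

print(leaks_credentials("Set API_KEY=sk-123 in your config"))     # True
print(leaks_credentials("Call the /orders endpoint with a GET"))  # False
```

Testing this function directly lets you tune the keyword list without burning API calls.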

Tracing and Debugging Workflows

Tracing is enabled by default: every run records model calls, tool calls, and handoffs, viewable in the OpenAI Traces dashboard. Use RunConfig to name and group workflows:

from agents import Agent, Runner
from agents.run import RunConfig

config = RunConfig(
    workflow_name="support_triage",  # Groups runs under this name in the Traces dashboard
)

agent = Agent(name="Assistant", instructions="Answer questions concisely.")

result = Runner.run_sync(agent, "Your task here", run_config=config)

# Inspect what happened during the run
for item in result.new_items:
    print(type(item).__name__)  # Messages, tool calls, handoffs, and other run items

Use the tracing UI to visualize agent workflows, identify bottlenecks, and debug unexpected behavior.

Key Differences from Generic LLM Frameworks

OpenAI Agents SDK vs. LangChain vs. LlamaIndex:

  • Purpose-built for agentic workflows: Native multi-agent orchestration, not just chains
  • Provider-agnostic: Supports the OpenAI API and, through the LiteLLM integration, 100+ other models
  • Built-in observability: Tracing UI for debugging without third-party tools
  • Realtime support: First-class voice agent support built on OpenAI's realtime models
  • Lightweight: Minimal dependencies compared to LangChain

Common Patterns

Pattern 1: Research → Write → Edit Pipeline

Three agents in sequence, each handing off results:

editor = Agent(name="Editor", ...)
writer = Agent(name="Writer", ..., handoffs=[editor])
researcher = Agent(name="Researcher", ..., handoffs=[writer])

result = Runner.run_sync(researcher, "Write an article about...")

Pattern 2: Parallel Tool Execution

One agent registers several tools, and the model can call them within a single turn:

agent = Agent(
    name="Data Fetcher",
    tools=[fetch_weather, fetch_stock_prices, fetch_news],  # each defined with @function_tool
)
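The concurrency underneath this pattern is standard asyncio: independent calls are awaited together instead of one after another. A standalone sketch with stubbed fetchers (the function bodies here are placeholders, not SDK APIs):

```python
import asyncio

async def fetch_weather() -> str:
    await asyncio.sleep(0.1)  # Simulate network latency
    return "sunny"

async def fetch_stock_prices() -> str:
    await asyncio.sleep(0.1)
    return "AAPL: $230"

async def fetch_news() -> str:
    await asyncio.sleep(0.1)
    return "Markets rally"

async def main() -> list[str]:
    # All three fetches run concurrently; total wait is ~0.1s, not ~0.3s
    return list(await asyncio.gather(fetch_weather(), fetch_stock_prices(), fetch_news()))

results = asyncio.run(main())
print(results)  # ['sunny', 'AAPL: $230', 'Markets rally']
```

asyncio.gather preserves argument order, so results line up with the fetchers regardless of which finishes first.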

Pattern 3: Human-in-the-Loop Approval

Agents can pause and wait for human approval before executing sensitive tools. Recent SDK versions support this through tool approvals (check the human-in-the-loop docs for your SDK version):

from agents import Agent, function_tool

@function_tool(needs_approval=True)  # the run pauses until a human approves each call
def issue_refund(order_id: str, amount: float) -> str:
    """Refund an order."""
    return f"Refunded ${amount:.2f} for order {order_id}"

agent = Agent(
    name="Order Processor",
    instructions="Process orders. Refunds over $500 require approval.",
    tools=[issue_refund],
)

Troubleshooting

Issue: "No module named 'agents'"

  • Ensure you've installed the package: pip install openai-agents
  • Verify you're using Python 3.10+: python --version

Issue: Agent not calling tools

  • Verify tools are registered with the agent
  • Check that tool descriptions are clear (agents use them to decide when to call)
  • Review traces to see if agent decided a tool wasn't relevant
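Tool docstrings matter because the SDK turns them into the function-schema description the model reads when deciding which tool to call. Compare a vague docstring with a descriptive one (illustrative functions only):

```python
# Vague: the model has little to go on when deciding whether to call this
def get_data(x: str) -> str:
    """Get data."""
    ...

# Descriptive: names the domain, the argument format, and the return value
def fetch_sales_data(month: str) -> str:
    """Fetch total sales revenue for a given month.

    Args:
        month: A month name such as 'July', or a quarter label such as 'Q3'.
    """
    ...

print(fetch_sales_data.__doc__.splitlines()[0])  # the line the model sees as the description
```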

Issue: Handoff not working

  • Ensure the target agent is listed in the source agent's handoffs: Agent(..., handoffs=[other_agent])
  • State in the instructions when the agent should delegate, so the model knows to hand off

Next Steps

  1. Explore examples: Check the GitHub examples directory
  2. Read full docs: Visit openai.github.io/openai-agents-python
  3. Build a prototype: Start with a simple two-agent system and expand
  4. Monitor production: Use tracing and guardrails for reliability

The OpenAI Agents SDK is actively maintained with regular updates supporting new features like realtime voice agents. Start simple, scale gradually, and use tracing to optimize your workflows.
