Hands-On Tutorial by Lydia Manikonda using OpenAI Agents SDK

Building AI Agents From Scratch using OpenAI Agents SDK

A beginner-friendly guide to creating, running, and chaining intelligent agents with the OpenAI Agents SDK.

I built this primarily for my students in MGMT 6560/4190 -- Introduction to Machine Learning Applications, Lally School of Management, Rensselaer Polytechnic Institute (RPI).

An AI agent is a program that uses a large language model (LLM) as its reasoning engine to decide what to do next — then acts on that decision, either by producing a response, calling a tool, or handing off to another agent.

This tutorial uses OpenAI's Agents SDK (openai-agents), a lightweight Python library built on top of the OpenAI API that provides everything you need to build, chain, and run agents without boilerplate.

No prior experience with AI frameworks is required — just a working Python environment via Jupyter notebooks and an OpenAI API key.

Prerequisites

Installation & Setup

Install both packages. The openai package is the base client; openai-agents adds the agent orchestration layer on top.

pip install openai
pip install openai-agents

I prefer Jupyter notebooks -- so run the following statements in a cell before you run the examples below:

import os
os.environ['OPENAI_API_KEY'] = "your-api-key"  # paste your actual key here

If you don't prefer the previous way, you can set your OpenAI API key as an environment variable before running any of the examples:

export OPENAI_API_KEY=your-api-key        (macOS/Linux)

or add it to your .env file.

Every example in this tutorial follows the same three-step pattern:

1. Define: Create one or more Agent objects with a name, instructions, and optional tools or handoffs.
2. Run: Call await Runner.run(agent, input="...") to start execution.
3. Read: Inspect result.final_output for the agent's answer.

Agent Type 01

Triage (Handoff) Agent

What is it?

A triage agent acts as a router. It reads the incoming request and decides which specialist sub-agent is best qualified to handle it, then hands off the conversation seamlessly. This is the "receptionist" pattern — one entry point, many experts behind the scenes.

In the example below we build three agents: a Spanish-only agent, an English-only agent, and a triage agent that inspects the language of each message and routes accordingly.

triage_agent.ipynb
from agents import Agent, Runner

# Two specialist agents — each speaks only one language
spanish_agent = Agent(
    name="Spanish agent",
    instructions="You only speak Spanish.",
)

english_agent = Agent(
    name="English agent",
    instructions="You only speak English.",
)

# The triage agent routes to the correct specialist
triage_agent = Agent(
    name="Triage agent",
    instructions="Handoff to the appropriate agent based on the language of the request.",
    handoffs=[spanish_agent, english_agent],
)

async def main():
    # Spanish input → goes to spanish_agent
    result = await Runner.run(triage_agent, input="Hola, ¿cómo estás?")
    print(result.final_output)

    # English input → goes to english_agent
    result = await Runner.run(triage_agent, input="Hello, how are you?")
    print(result.final_output)

    # Directly calling spanish_agent with English input
    result = await Runner.run(spanish_agent, input="Hello, how are you?")
    print(result.final_output)

await main()
Expected Output
¡Hola! Estoy bien, gracias. ¿Y tú?
Hello! I'm doing well, thank you. How are you?
¡Hola! Estoy bien, gracias. ¿Y tú?
Key concept: The handoffs parameter is a list of agents that the current agent is allowed to delegate to. The LLM chooses which one to pick based on the instructions and the input.

Agent Type 02

Tool-Use Agent

What is it?

A tool-use agent can call external functions — like fetching live data, querying a database, or doing math — and weave the results into its response. The LLM decides when to call a tool and what arguments to pass. You just define the function and decorate it.

The SDK makes adding tools effortless: decorate any Python function with @function_tool and the agent automatically discovers it, understands its purpose from the docstring, and calls it when needed.

1. Define a tool: Write a regular Python function and add @function_tool.
2. Attach it: Pass the function to the agent via tools=[...].
3. Let it run: The agent decides when to call the tool and incorporates the result.
tool_use_agent.ipynb
from agents import Agent, Runner, function_tool
import datetime

# ── Define tools ──────────────────────────────────────────

@function_tool
def get_current_time(city: str) -> str:
    """Return the current UTC time. City is accepted but we use UTC for simplicity."""
    now = datetime.datetime.now(datetime.timezone.utc).strftime("%H:%M:%S UTC")
    return f"The current time (UTC) is {now}."

@function_tool
def calculate(expression: str) -> str:
    """Evaluate a simple arithmetic expression and return the result."""
    try:
        result = eval(expression, {"__builtins__": {}})
        return f"{expression} = {result}"
    except Exception as e:
        return f"Error: {e}"

# ── Create the agent with tools ───────────────────────────
assistant = Agent(
    name="Assistant",
    instructions="You are a helpful assistant. Use your tools whenever a question requires live data or calculation.",
    tools=[get_current_time, calculate],
)

async def main():
    result = await Runner.run(assistant, input="What time is it in London?")
    print(result.final_output)

    result = await Runner.run(assistant, input="What is 1492 multiplied by 33?")
    print(result.final_output)

await main()
Expected Output
The current time in London is 14:32:07 UTC.
1492 × 33 = 49,236.
Key concept: The agent reads the function's docstring to understand what a tool does. Write clear, descriptive docstrings — they are the tool's "instructions" to the model.
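The get_current_time tool above deliberately ignores the city and always returns UTC. If you want the tool to report genuine local time, here is a minimal sketch using Python's standard-library zoneinfo module (3.9+). The CITY_TZ lookup table is a hypothetical stand-in -- a real tool might resolve cities via a geocoding service instead.

```python
from datetime import datetime
from zoneinfo import ZoneInfo

# Hypothetical city-to-IANA-timezone lookup; extend as needed.
CITY_TZ = {
    "london": "Europe/London",
    "new york": "America/New_York",
    "tokyo": "Asia/Tokyo",
}

def local_time_for(city: str) -> str:
    """Return the current local time for a known city, falling back to UTC."""
    tz_name = CITY_TZ.get(city.strip().lower())
    if tz_name is None:
        # Unknown city: fall back to UTC rather than guessing.
        return datetime.now(ZoneInfo("UTC")).strftime("%H:%M:%S ") + "UTC"
    return datetime.now(ZoneInfo(tz_name)).strftime("%H:%M:%S ") + tz_name
```

To expose it to an agent, you would decorate it with @function_tool and pass it in tools=[...], exactly as in the example above.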
Using eval() in production code is a security risk. In a real application, replace it with a proper math parser like simpleeval or asteval.
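If you'd rather not add a dependency, here is a minimal sketch of a safer drop-in for the calculate tool built on the standard-library ast module. It walks the parsed expression and only permits a small whitelist of arithmetic operations, raising on anything else:

```python
import ast
import operator

# Whitelist of arithmetic operations the evaluator will accept.
_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
    ast.Pow: operator.pow,
    ast.USub: operator.neg,
}

def safe_eval(expression: str) -> float:
    """Evaluate a basic arithmetic expression without calling eval()."""
    def _eval(node):
        if isinstance(node, ast.Expression):
            return _eval(node.body)
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](_eval(node.left), _eval(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](_eval(node.operand))
        # Anything else (function calls, names, attributes) is rejected.
        raise ValueError(f"Unsupported expression: {expression!r}")
    return _eval(ast.parse(expression, mode="eval"))
```

Because only numeric constants and whitelisted operators survive the walk, inputs like "__import__('os')" fail with a ValueError instead of executing.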

Agent Type 03

Conversational Memory Agent

What is it?

By default, each call to Runner.run() is stateless — the agent has no memory of previous turns. A memory agent solves this by manually maintaining a conversation history (a list of message dicts) and passing it into every call. The agent can then refer back to what was said earlier in the session.

This pattern is essential for chatbots, tutors, or any use case where context accumulates over multiple exchanges.

1. Keep a list: Maintain a history list of {"role": ..., "content": ...} dicts.
2. Append each turn: Add the user message, then the agent's reply after each run.
3. Pass history in: Feed the full history as the input on the next call.
memory_agent.ipynb
from agents import Agent, Runner

tutor = Agent(
    name="Tutor",
    instructions=(
        "You are a patient Python tutor. Remember what the student has already learned "
        "in this session and build on it. Keep explanations short and encouraging."
    ),
)

async def chat(history: list, user_message: str) -> str:
    # Add the new user message to history
    history.append({"role": "user", "content": user_message})

    # Run the agent with the full conversation history as input
    result = await Runner.run(tutor, input=history)
    reply = result.final_output

    # Append the agent's reply so the next call has full context
    history.append({"role": "assistant", "content": reply})
    return reply

async def main():
    history = []   # Start with an empty conversation

    reply1 = await chat(history, "Hi! I'm new to Python. What's a variable?")
    print(f"Tutor: {reply1}\n")

    reply2 = await chat(history, "Got it. Now, what's a list and how is it different?")
    print(f"Tutor: {reply2}\n")

    # The tutor remembers we started with variables
    reply3 = await chat(history, "Can I store a list inside a variable?")
    print(f"Tutor: {reply3}\n")

await main()
Expected Output (abbreviated)
Tutor: Hello! Great question. A **variable** in Python is like a container that...

Tutor: Excellent question! A **list** in Python is a variable that can store multiple values in one place,...

Tutor: Absolutely! In fact, that's exactly what you're doing when you assign a list to a variable. For example: python...
Note that these are stochastic models -- so you may not see exactly the same output that I got when I ran this code on my end.
Context window limits: History grows with each turn. For long sessions, implement a trimming strategy (e.g., keep only the last N messages) so you don't exceed the model's context window.
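A keep-the-last-N strategy can be sketched in a few lines of plain Python. This helper (a hypothetical addition, not part of the SDK) also drops a leading assistant message so the trimmed window never opens with a dangling reply; a more sophisticated app might summarize older turns instead of discarding them:

```python
def trim_history(history: list, max_messages: int = 20) -> list:
    """Return the most recent messages, bounded to max_messages.

    A simple sliding window over the conversation. The window is nudged
    so it starts on a user turn, avoiding an orphaned assistant reply.
    """
    if len(history) <= max_messages:
        return history
    window = history[-max_messages:]
    # Drop leading assistant messages so the window starts with a user turn.
    while window and window[0]["role"] == "assistant":
        window = window[1:]
    return window
```

You would call history = trim_history(history) at the top of chat() before running the agent.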

Agent Type 04

Sequential Pipeline Agent

What is it?

A pipeline agent chains multiple specialist agents together in a fixed sequence — the output of one becomes the input of the next. Each agent does one focused job, and together they complete a complex multi-step task. Think of it as an assembly line for text processing.

The example below builds a three-stage writing pipeline: a drafting agent writes a rough blog post, a critic agent identifies weaknesses, and an editor agent produces a polished final version — all automatically, from a single topic prompt.

1. Stage 1 · Draft: The drafting agent turns a topic into a rough blog post.
2. Stage 2 · Critique: The critic agent reviews the draft and lists improvements.
3. Stage 3 · Edit: The editor agent applies the feedback to produce a final piece.
pipeline_agent.ipynb
from agents import Agent, Runner

# ── Stage 1: Drafter ──────────────────────────────────────
drafter = Agent(
    name="Drafter",
    instructions=(
        "You are a blog writer. Given a topic, write a short 3-paragraph blog post. "
        "Be informative but conversational. Do not polish — this is a rough draft."
    ),
)

# ── Stage 2: Critic ───────────────────────────────────────
critic = Agent(
    name="Critic",
    instructions=(
        "You are a sharp editor. Review the given draft and list exactly 3 specific, "
        "actionable improvements. Be concise — use bullet points."
    ),
)

# ── Stage 3: Editor ───────────────────────────────────────
editor = Agent(
    name="Editor",
    instructions=(
        "You are a professional editor. You will receive a draft and a list of critique points. "
        "Rewrite the draft addressing all critique points. Output only the final polished post."
    ),
)

async def run_pipeline(topic: str) -> str:
    # Stage 1 — Generate draft
    draft_result = await Runner.run(drafter, input=topic)
    draft = draft_result.final_output
    print("Draft complete\n")

    # Stage 2 — Critique the draft
    critique_result = await Runner.run(critic, input=draft)
    critique = critique_result.final_output
    print("Critique complete\n")

    # Stage 3 — Edit using the draft + critique together
    combined = f"DRAFT:\n{draft}\n\nCRITIQUE:\n{critique}"
    final_result = await Runner.run(editor, input=combined)
    print("Final edit complete\n")

    return final_result.final_output

async def main():
    final_post = await run_pipeline("Why learning to code is still valuable in the age of AI")
    print("── FINAL POST ──────────────────────────────────────\n")
    print(final_post)

await main()
Expected Output (abbreviated)
Draft complete
Critique complete
Final edit complete

── FINAL POST ──────────────────────────────────────

Every week brings a fresh headline about artificial intelligence getting smarter. With tools like ChatGPT, Copilot, and AI-powered design platforms becoming more capable, it’s natural to wonder: is learning to code still worth the effort?...
As I mentioned in the previous example's note, these are stochastic models, so the output may vary from run to run. Don't panic if you don't see exactly the same output when you run this code.
Key concept: Each agent in a pipeline should have a single, well-defined job. Trying to make one agent do everything usually produces worse results than splitting responsibilities across focused stages.
Going further: You can combine all four patterns -- e.g., a triage agent that routes to a pipeline that uses tools and maintains memory. I hope this tutorial helps, and please reach out to me if you have any questions. Again, please note that I implemented these examples in Jupyter notebooks; if you execute them some other way, they may need slight modifications.
Also, the SDK is updated frequently -- so certain commands may change slightly. Good luck!

Summary

Choosing the Right Agent Type

Use this quick reference to decide which pattern fits your use case:

quick reference
Pattern           When to use
────────────────────────────────────────────────────────────
Triage Agent      Route different request types to specialists
Tool-Use Agent    Need live data, APIs, math, or external actions
Memory Agent      Multi-turn conversations that need context
Pipeline Agent    Complex tasks that benefit from staged processing