
Multi-Agent Architecture, Part 2: Orchestration — Routing Conversations to the Right Agent

March 21, 2026 · 10 min read

This is Part 2 of the Multi-Agent Architecture series. In Part 1 we covered user interaction — how agents receive and respond to messages. Today we tackle the next question: once a message arrives, how do you decide which agent should handle it?


The Routing Problem

A single-agent system is simple. Every message goes to the same agent, every response comes back through the same path. But the moment you add a second agent, you face a decision that compounds with every agent after it: who handles this message?

RoomKit solves this with orchestration strategies — declarative patterns that wire routing, handoff tools, and conversation state automatically. You pick a strategy, give it your agents, and the framework handles the rest.

Four Strategies, Zero Boilerplate

RoomKit ships four built-in orchestration strategies, each representing a common multi-agent pattern:

  1. Pipeline: linear agent chains with ordered handoffs
  2. Swarm: free-form routing where any agent can hand off to any other
  3. Supervisor: a manager agent that delegates to background workers
  4. Loop: producer-reviewer cycles that iterate until approval

Let's start with the most common one.

Pipeline: Linear Agent Chains

A support pipeline where conversations flow from triage to handler to resolver — three agents, each passing the conversation forward when their part is done:

from roomkit import Agent, Pipeline, RoomKit, WebSocketChannel, InboundMessage, TextContent
from roomkit.memory.sliding_window import SlidingWindowMemory
from roomkit.orchestration.handoff import HandoffMemoryProvider
from roomkit.orchestration.state import get_conversation_state
from roomkit.providers.anthropic import AnthropicAIProvider, AnthropicConfig

# Define agents with role and description (used in handoff context)
triage = Agent(
    "agent-triage",
    provider=AnthropicAIProvider(config=AnthropicConfig(model="claude-sonnet-4-20250514")),
    role="Triage receptionist",
    description="Routes incoming requests to the right specialist",
    system_prompt="You triage incoming requests. Hand off to the handler when you understand the issue.",
    memory=HandoffMemoryProvider(SlidingWindowMemory(max_events=50)),
)
handler = Agent(
    "agent-handler",
    provider=AnthropicAIProvider(config=AnthropicConfig(model="claude-sonnet-4-20250514")),
    role="Request handler",
    description="Handles and resolves customer requests",
    system_prompt="You handle customer requests. Hand off to the resolver when done.",
    memory=HandoffMemoryProvider(SlidingWindowMemory(max_events=50)),
)
resolver = Agent(
    "agent-resolver",
    provider=AnthropicAIProvider(config=AnthropicConfig(model="claude-sonnet-4-20250514")),
    role="Resolution specialist",
    description="Confirms resolution and closes requests",
    system_prompt="You confirm resolution and close requests.",
    memory=HandoffMemoryProvider(SlidingWindowMemory(max_events=50)),
)

# Pipeline strategy: triage -> handler -> resolver
kit = RoomKit(
    orchestration=Pipeline(agents=[triage, handler, resolver]),
)

# Create room — agents, routing, handoff tools, and state are wired automatically
await kit.create_room(room_id="support-room")

That last line — create_room() — does all the heavy lifting. Under the hood, the Pipeline strategy:

  1. Registers all three agents on the RoomKit instance
  2. Attaches them to the room as intelligence channels
  3. Installs a BEFORE_BROADCAST hook that routes events to the active agent
  4. Injects a handoff_conversation tool into each agent (constrained to only reachable targets)
  5. Initializes ConversationState with the first agent active

No manual router setup. No handoff handler configuration. No tool injection. The strategy composes all the low-level primitives for you.
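The target constraint in step 4 is easy to picture. As a sketch of the idea (illustrative only, not RoomKit's actual implementation): in a pipeline, each agent's handoff tool can only reach the next agent in the chain.

```python
def pipeline_targets(agent_ids: list[str], current_id: str) -> list[str]:
    """Reachable handoff targets in a Pipeline: only the next agent.

    Illustrative sketch of the constraint, not RoomKit internals.
    """
    i = agent_ids.index(current_id)
    return agent_ids[i + 1 : i + 2]  # next agent, or [] for the last one

chain = ["agent-triage", "agent-handler", "agent-resolver"]
print(pipeline_targets(chain, "agent-triage"))    # ['agent-handler']
print(pipeline_targets(chain, "agent-resolver"))  # []
```

Constraining the tool's schema this way means the LLM physically cannot skip a stage, no matter what its prompt says.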

Handoffs: The Agent Decides

When the triage agent determines that a user has a billing issue, it calls the handoff_conversation tool — a tool that the framework injected automatically:

# The AI agent calls this tool during its response.
# In production, the LLM decides when to hand off based on its system prompt.
# For demonstration, we invoke it directly:
result = await triage.tool_handler(
    "handoff_conversation",
    {
        "target": "agent-handler",
        "reason": "Billing issue needs specialist",
        "summary": "User has a billing question about their account.",
    },
)

# State is updated automatically
room = await kit.get_room("support-room")
state = get_conversation_state(room)
print(state.phase)            # "agent-handler"
print(state.active_agent_id)  # "agent-handler"
print(state.handoff_count)    # 1

# Next user message automatically routes to the handler
await kit.process_inbound(InboundMessage(
    channel_id="ws-user",
    sender_id="user",
    content=TextContent(body="My last invoice looks wrong."),
))
# handler responds — triage is silent

The handoff updates ConversationState, transitions the phase, and stamps the event metadata so the router sends subsequent messages to the new agent. The summary field is injected into the receiving agent's context via HandoffMemoryProvider, so it knows what happened before it took over. The user never has to repeat themselves.
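The state transition itself is small enough to model directly. Here is a minimal sketch of what a handoff does to conversation state (field names taken from the example above; the real ConversationState carries more, such as agent IDs and timestamps on each transition):

```python
from dataclasses import dataclass, field

@dataclass
class PhaseTransition:
    from_phase: str
    to_phase: str
    reason: str

@dataclass
class MiniState:
    phase: str
    active_agent_id: str
    handoff_count: int = 0
    phase_history: list = field(default_factory=list)

def apply_handoff(state: MiniState, target: str, reason: str) -> None:
    """Record the transition, then move the phase and active agent to the target."""
    state.phase_history.append(PhaseTransition(state.phase, target, reason))
    state.phase = target
    state.active_agent_id = target
    state.handoff_count += 1

state = MiniState(phase="agent-triage", active_agent_id="agent-triage")
apply_handoff(state, "agent-handler", "Billing issue needs specialist")
print(state.phase, state.handoff_count)  # agent-handler 1
```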

Swarm: Free-Form Agent Routing

Pipelines are great when the flow is predictable. But sometimes any agent should be able to reach any other agent — a sales agent discovers a support issue, a support agent finds a billing error, a billing agent needs to loop back to sales. The Swarm strategy enables this:

from roomkit import Agent, Swarm, RoomKit

kit = RoomKit(
    orchestration=Swarm(
        agents=[sales, support, billing],
        entry="agent-sales",  # first agent to handle messages
    ),
)
await kit.create_room(room_id="conversation-001")

In a swarm, every agent gets a handoff_conversation tool that can target any other agent in the group. There are no phase constraints — routing uses sticky affinity (the active agent keeps handling until it explicitly hands off). This is the pattern you want when conversations are unpredictable and agents need full flexibility.
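Contrast the target set with the pipeline case. A sketch of the constraint (again illustrative, not RoomKit internals): in a swarm, every agent can hand off to every other agent.

```python
def swarm_targets(agent_ids: list[str], current_id: str) -> list[str]:
    """Reachable handoff targets in a Swarm: everyone except yourself."""
    return [a for a in agent_ids if a != current_id]

group = ["agent-sales", "agent-support", "agent-billing"]
print(swarm_targets(group, "agent-sales"))  # ['agent-support', 'agent-billing']
```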

Supervisor: Delegation to Background Workers

Sometimes you don't want agents taking turns in the same conversation. You want a manager that talks to the user and delegates work to background agents running in isolated rooms. The Supervisor strategy handles this:

from roomkit import Agent, Supervisor, RoomKit

kit = RoomKit(
    orchestration=Supervisor(
        supervisor=manager,
        workers=[researcher, coder],
    ),
)
await kit.create_room(room_id="project-room")

# The manager gets delegate_to_researcher and delegate_to_coder tools.
# Workers run in child rooms — isolated from the user conversation.
# Results flow back to the manager via the delegation lifecycle.

Only the supervisor is attached to the user's room. Workers execute in child rooms via kit.delegate(), so their work doesn't pollute the main conversation timeline. The supervisor gets auto-generated delegate_to_<worker> tools — one per worker — instead of handoff_conversation.
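The tool naming is mechanical. A sketch of how one delegation tool per worker might be derived (the exact mapping is an assumption; the article only shows the resulting names delegate_to_researcher and delegate_to_coder):

```python
def delegation_tool_names(worker_names: list[str]) -> dict[str, str]:
    """Map an auto-generated tool name to each worker. Hypothetical sketch."""
    return {f"delegate_to_{name}": name for name in worker_names}

print(delegation_tool_names(["researcher", "coder"]))
# {'delegate_to_researcher': 'researcher', 'delegate_to_coder': 'coder'}
```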

Loop: Producer-Reviewer Cycles

Some workflows need iteration. A writer drafts content, a reviewer evaluates it, the writer revises, the reviewer approves. The Loop strategy models this:

from roomkit import Agent, Loop, RoomKit

kit = RoomKit(
    orchestration=Loop(
        agent=writer,
        reviewer=reviewer,
        max_iterations=3,  # safety limit
    ),
)
await kit.create_room(room_id="draft-review")

# writer produces -> reviewer evaluates
# reviewer can hand off back to writer (with feedback) or call approve_output
# Loop continues until approved or max_iterations reached

The reviewer gets both a handoff_conversation tool (to send it back for revision) and an approve_output tool (to end the cycle). Loop state is tracked in ConversationState.context — iteration count, approval status — so you always know where the cycle stands.
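The control flow reduces to a bounded iterate-until-approved loop. A self-contained sketch of that shape, with toy callables standing in for the writer and reviewer agents:

```python
def run_loop(produce, review, max_iterations: int = 3):
    """Iterate producer/reviewer until approval or the safety limit."""
    feedback = None
    draft = None
    for iteration in range(1, max_iterations + 1):
        draft = produce(feedback)           # writer drafts (or revises with feedback)
        approved, feedback = review(draft)  # reviewer approves or sends it back
        if approved:
            return draft, iteration
    return draft, max_iterations            # hit the safety limit unapproved

# Toy agents: the reviewer approves the second draft.
drafts = []
produce = lambda fb: drafts.append(fb) or f"draft-{len(drafts)}"
review = lambda d: (d == "draft-2", "tighten the intro")
result = run_loop(produce, review)
print(result)  # ('draft-2', 2)
```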

Conversation State and Phase History

Every strategy uses the same state model under the hood. ConversationState tracks the current phase, the active agent, the handoff count, and a full audit trail of every transition:

from roomkit.orchestration.state import get_conversation_state

room = await kit.get_room("support-room")
state = get_conversation_state(room)

print(state.phase)            # current phase (e.g., "agent-handler")
print(state.active_agent_id)  # which agent owns the conversation
print(state.handoff_count)    # total handoffs so far

# Full transition audit trail
for t in state.phase_history:
    print(f"  {t.from_phase} -> {t.to_phase} ({t.reason})")

State is stored in Room.metadata["_conversation_state"] and persists across server restarts. Every transition is recorded as a PhaseTransition with from_phase, to_phase, from_agent, to_agent, reason, and a timestamp. This is your audit trail when debugging why a customer ended up talking to the wrong agent at 2 AM.
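Because the state lives in room metadata, persistence is just serialization. A sketch of the round-trip (illustrative; the actual storage format is RoomKit's concern):

```python
import json

state = {
    "phase": "agent-handler",
    "active_agent_id": "agent-handler",
    "handoff_count": 1,
    "phase_history": [
        {"from_phase": "agent-triage", "to_phase": "agent-handler",
         "reason": "Billing issue needs specialist"}
    ],
}

# Persist into room metadata, survive a restart, read it back.
metadata = {"_conversation_state": json.dumps(state)}
restored = json.loads(metadata["_conversation_state"])
print(restored["phase_history"][0]["to_phase"])  # agent-handler
```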

How It Works Under the Hood

All four strategies compose the same low-level primitives:

  1. A BEFORE_BROADCAST routing hook that dispatches events to the active agent
  2. Injected coordination tools (handoff_conversation, delegate_to_<worker>, approve_output)
  3. ConversationState for phase tracking and the transition audit trail
  4. HandoffMemoryProvider for carrying handoff summaries into the receiving agent's context
  5. Child-room delegation via kit.delegate()

You can use these primitives directly if the built-in strategies don't fit your use case. But for most multi-agent scenarios, a strategy gives you production-ready orchestration in three lines of code.

Non-Active Agent Filtering

In a room with three agents, only one is active at a time. Non-active, non-supervisor agents are filtered out at the event router level — events are simply not dispatched to them, so they do not consume LLM tokens or processing time. The supervisor (if configured) always receives events for oversight, but its responses are suppressed unless it explicitly intervenes.

This is a deliberate design choice. In a multi-agent system, silence is preferable to chaos. A user waiting a few seconds for a supervisor to pick up is a better experience than three agents responding simultaneously with conflicting answers.

StatusBus: Shared Awareness Between Agents

Routing and handoffs solve who handles what. But in a multi-agent system, agents also need to know what the others are doing. A supervisor delegating research to two workers needs to know when each finishes. A triage agent handing off to a specialist wants confirmation that the handoff landed. A background agent running a long task should broadcast its progress.

RoomKit's StatusBus is a shared event log for this kind of coordination. Every agent can post status updates, and every agent (or hook, or external system) can subscribe to be notified in real time.

from roomkit.orchestration.status_bus import StatusBus, StatusLevel

# The bus is available on every RoomKit instance
bus = kit.status_bus

# An agent posts a status update after completing a tool call
bus.post(
    agent_id="agent-researcher",
    action="search_google",
    status=StatusLevel.OK,
    detail="Found 7 results for 'roomkit multi-agent'",
)

# Another agent posts a failure
bus.post(
    agent_id="agent-coder",
    action="run_tests",
    status=StatusLevel.FAILED,
    detail="3 tests failed in test_billing.py",
)

# Subscribe to status updates in real time
async def on_status(entry):
    print(f"[{entry.agent_id}] {entry.action} → {entry.status}")

await bus.subscribe(on_status)

# Query recent activity
recent = await bus.recent(10, agent_id="agent-researcher")

# Get a text summary suitable for injecting into an agent's context
summary = await bus.recent_text(5)
# "[14:32:07] agent-researcher: search_google → ok | Found 7 results..."

The post() method is synchronous — safe to call from tool handlers, hooks, or any context. Subscriber notification is scheduled asynchronously. Each StatusEntry carries a timestamp, agent ID, action name, status level (OK, FAILED, PENDING, INFO, COMPLETED), and optional detail and metadata.

The bus uses a pluggable backend. The default InMemoryStatusBackend works for single-process setups and optionally persists to a JSONL file. For distributed deployments, you can implement a Redis or NATS backend by subclassing StatusBackend.

This matters for orchestration because it gives agents shared awareness without coupling them. The supervisor doesn't poll each worker — it subscribes to the bus and reacts when a worker posts COMPLETED or FAILED. A dashboard hook can subscribe and stream status updates to a UI. The recent_text() method produces a formatted summary you can inject into an agent's context window, so it knows what happened while it was idle.
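The bus's contract is small enough to sketch in plain Python. This is a toy stand-in, not RoomKit's StatusBus: it notifies subscribers synchronously, where the real bus schedules notification asynchronously.

```python
import time

class MiniStatusBus:
    """Toy status bus: synchronous post, callback subscribers, bounded recall."""

    def __init__(self):
        self._entries = []
        self._subscribers = []

    def post(self, agent_id, action, status, detail=""):
        entry = {"ts": time.time(), "agent_id": agent_id,
                 "action": action, "status": status, "detail": detail}
        self._entries.append(entry)
        for callback in self._subscribers:  # the real bus schedules these async
            callback(entry)

    def subscribe(self, callback):
        self._subscribers.append(callback)

    def recent(self, n, agent_id=None):
        entries = [e for e in self._entries
                   if agent_id is None or e["agent_id"] == agent_id]
        return entries[-n:]

bus = MiniStatusBus()
seen = []
bus.subscribe(seen.append)
bus.post("agent-researcher", "search_google", "ok", "Found 7 results")
bus.post("agent-coder", "run_tests", "failed", "3 tests failed")
print(len(seen), bus.recent(10, agent_id="agent-coder")[0]["status"])  # 2 failed
```

Even at this size, the decoupling shows: the poster never knows who is listening, and a late subscriber can still catch up via recent().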

Choosing a Strategy

The four strategies cover the most common multi-agent patterns:

  1. Pipeline when the flow is predictable and linear (triage -> handler -> resolver)
  2. Swarm when conversations are unpredictable and any agent may need to take over
  3. Supervisor when work should run in background child rooms, away from the user conversation
  4. Loop when output must iterate until a reviewer approves it

Each strategy is a few lines of configuration. The framework handles agent registration, channel attachment, routing hooks, handoff tool injection, and state management. You focus on your agents' prompts and tools. The orchestration layer takes care of the rest.


This article is part of a 9-part series on production-ready multi-agent architecture. Next up: Part 3: Knowledge.

Series: Introduction · Part 1: User Interaction · Part 2: Orchestration · Part 3: Knowledge · Part 4: Storage · Part 5: Agents · Part 6: Integration · Part 7: External Tools · Part 8: Observability · Part 9: Evaluation