Multi-Agent Architecture, Part 4: Storage — Persistent Memory for Multi-Agent Conversations

March 21, 2026 · 8 min read

This is Part 4 of a 9-part series on multi-agent architecture. Start with the series introduction, or read the previous article on Knowledge.


Agents are stateless by default. An LLM processes a prompt, returns a completion, and forgets everything. If you want a conversation that lasts longer than a single request — and you do — you need to build the memory yourself.

This sounds trivial until you start counting what needs to be remembered. The conversation history, obviously. But also: which agent is currently active, how many handoffs have occurred, what phase the conversation is in, who the user is across different channels, what tasks agents have created, and what observations they have recorded along the way. Drop any of these and your multi-agent system degrades from "production-ready" to "impressive demo."

In Part 3, we covered how agents acquire knowledge. This article covers where everything gets stored and how it survives between sessions, restarts, and deployments.

The Problem: What Needs to Persist

A multi-agent conversation generates several categories of data, each needing a different storage strategy: the ordered event timeline, conversation state such as the active agent and handoff count, identities that link one person across channels, channel bindings and participant membership, and agent work products like tasks and observations.

Most frameworks punt on this. They give you an in-memory list of messages and call it "memory." That works until your process restarts, your server scales horizontally, or a customer comes back the next day expecting continuity.

ConversationStore: The Abstract Interface

RoomKit defines storage through a single abstract base class: ConversationStore. Every storage operation in the framework goes through this interface, which means you can swap implementations without touching any application code.

There are two built-in implementations: InMemoryStore, which keeps everything in process memory and is suited to development and testing, and PostgresStore, which persists everything to PostgreSQL for production.

The ABC covers everything: room lifecycle, event storage, channel bindings, participant management, identity resolution, tasks, and observations. If you want to back your storage with Redis, DynamoDB, or Google Cloud Memorystore, you implement the same interface and the rest of the framework does not care.

PostgresStore: The Production Backend

For production, PostgresStore manages 10 tables that cover the full lifecycle of multi-agent conversations, including rooms, events, channel bindings, participants, identities, identity addresses, tasks, and observations.

Setting it up with connection pooling takes a handful of lines:

from roomkit import RoomKit
from roomkit.store.postgres import PostgresStore

async def create_app():
    store = PostgresStore(
        dsn="postgresql://user:pass@localhost:5432/roomkit",
    )
    # init() creates tables and opens the connection pool
    await store.init(min_size=5, max_size=20)

    kit = RoomKit(store=store)

    # Everything from here uses PostgreSQL transparently
    await kit.create_room(room_id="support-42")
    return kit

The min_size and max_size parameters passed to init() control asyncpg's connection pool. Under load, the pool scales up to max_size connections and reclaims them when traffic drops. This matters in multi-agent systems where multiple agents in the same room might issue concurrent storage operations — you do not want them blocking each other on a single connection.
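To see why the upper bound matters, here is a toy pool, not RoomKit or asyncpg code, that caps concurrent checkouts the same way a real pool does:

```python
import asyncio


class ToyPool:
    """Toy connection pool: at most max_size operations in flight at once."""

    def __init__(self, max_size: int) -> None:
        self._sem = asyncio.Semaphore(max_size)
        self._in_use = 0
        self.peak = 0  # highest number of simultaneous checkouts observed

    async def run(self, op):
        async with self._sem:  # blocks once max_size operations are in flight
            self._in_use += 1
            self.peak = max(self.peak, self._in_use)
            try:
                return await op()
            finally:
                self._in_use -= 1


async def demo():
    pool = ToyPool(max_size=3)

    async def fake_query():
        await asyncio.sleep(0.01)  # stand-in for a storage operation
        return "ok"

    # Ten "agents" issue queries concurrently; only three run at a time.
    results = await asyncio.gather(*[pool.run(fake_query) for _ in range(10)])
    return pool.peak, results
```

A real pool also reuses warm connections instead of reopening them, which is where most of the latency win comes from.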

Event Storage: The Conversation Timeline

The event table is the heart of the storage layer. Every message, every agent response, every system action becomes an event with a monotonically increasing per-room index. This is not just an append log — it is an ordered, queryable timeline that agents use to reconstruct conversation context.

Events are stored as JSONB with GIN indexes on metadata fields. This means you can query events by type, by channel, by sender, or by arbitrary metadata without scanning the entire table.

# Store events through the normal message flow
await kit.process_inbound(InboundMessage(
    channel_id="customer-ws",
    sender_id="customer-1",
    content=TextContent(body="I need to change my shipping address"),
))

# Query events back — ordered by room index
events = await kit.store.list_events("support-42")
for event in events:
    print(f"[{event.index}] {event.source.channel_id}: {event.content.body}")

# Query with visibility filter
ai_events = await kit.store.list_events(
    "support-42",
    visibility_filter="intelligence",
)

# Paginate through history — useful for long conversations
recent = await kit.store.list_events(
    "support-42",
    limit=50,
    before_index=200,
)

Each event gets a room-scoped index that increases monotonically. This is critical for two reasons. First, it gives agents a stable ordering even when events arrive concurrently from multiple channels. Second, it enables idempotency — if a message gets processed twice (network retry, duplicate webhook), the store can detect and deduplicate it based on the idempotency key.
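Both properties are easy to picture with a toy append log. This is illustrative only, not RoomKit's implementation:

```python
class ToyEventLog:
    """Per-room append log with monotonic indices and idempotency-key dedup."""

    def __init__(self) -> None:
        self._events: dict[str, list[dict]] = {}          # room_id -> ordered events
        self._seen: dict[tuple[str, str], int] = {}       # (room_id, key) -> index

    def append(self, room_id: str, body: str, idempotency_key: str) -> int:
        # A retry with the same key returns the original index; no new event.
        seen_key = (room_id, idempotency_key)
        if seen_key in self._seen:
            return self._seen[seen_key]
        log = self._events.setdefault(room_id, [])
        index = len(log)                                  # monotonic, room-scoped
        log.append({"index": index, "body": body})
        self._seen[seen_key] = index
        return index
```

A duplicate webhook delivery becomes a no-op that returns the index of the event it already wrote.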

Conversation State: Where Agent Coordination Lives

In a multi-agent system, the conversation itself has state beyond just the message history. Which agent is currently active? How many handoffs have occurred? What phase is the conversation in? RoomKit tracks this as ConversationState, stored in the room's metadata.

This state lives in the room, not in any individual agent. That is a deliberate design decision. When Agent A hands off to Agent B, Agent B does not need to ask Agent A what happened — it reads the room metadata and picks up exactly where things left off. This decoupling is what makes agent handoffs reliable instead of fragile.

Identity Resolution: One Person, Many Channels

Here is a scenario that breaks most frameworks: a customer starts a conversation over SMS, follows up on WhatsApp, and then calls in by phone. Are these three separate customers or one person using three channels?

RoomKit's identity resolution links multiple addresses to a single identity. An identity has a canonical ID and a set of addresses, each associated with a channel type. When a message arrives, the store resolves the sender's address to an identity, creating a new one if no match exists.

# Identity resolution links addresses across channels
identity = await kit.store.resolve_identity(
    channel_type="sms",
    address="+1-555-0123",
)

# Link additional addresses to the same identity
await kit.store.link_address(
    identity_id=identity.id,
    channel_type="whatsapp",
    address="+1-555-0123",
)
await kit.store.link_address(
    identity_id=identity.id,
    channel_type="email",
    address="jane.doe@example.com",
)

# Now any channel resolves to the same person
same_person = await kit.store.resolve_identity(
    channel_type="email",
    address="jane.doe@example.com",
)
assert same_person.id == identity.id  # True — same identity

This is stored in two tables: identities for the canonical record and identity_addresses for the address-to-identity mappings. The separation means you can add or remove addresses without touching the identity itself, and you can query all channels a person has used.
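The two-table split can be sketched with a pair of maps, one per table. This is illustrative only; the real resolve_identity returns an identity object rather than a bare id:

```python
import uuid


class ToyIdentityStore:
    """Two 'tables': identities, and address -> identity mappings."""

    def __init__(self) -> None:
        self._identities: dict[str, dict] = {}            # identities table
        self._addresses: dict[tuple[str, str], str] = {}  # identity_addresses table

    def resolve(self, channel_type: str, address: str) -> str:
        """Return the identity id for an address, creating one if unknown."""
        key = (channel_type, address)
        if key not in self._addresses:
            identity_id = str(uuid.uuid4())
            self._identities[identity_id] = {"id": identity_id}
            self._addresses[key] = identity_id
        return self._addresses[key]

    def link(self, identity_id: str, channel_type: str, address: str) -> None:
        """Attach another address to an existing identity."""
        self._addresses[(channel_type, address)] = identity_id

    def channels_for(self, identity_id: str) -> list[str]:
        """All channel types a person has used."""
        return sorted({ct for (ct, _), iid in self._addresses.items() if iid == identity_id})
```

Adding or removing an address only touches the mapping table, which is exactly the property the two-table design buys you.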

For multi-agent systems, identity resolution is what makes continuity possible. When the same customer reaches your system through a different channel, the orchestrator can pull their full conversation history across all rooms and channels, giving the active agent the complete picture instead of starting from zero.

Channel Bindings, Participants, and Work Products

Three more storage concerns round out the picture:

Channel bindings track which channels are attached to which rooms. The store supports the full lifecycle: add, get, update, remove, and list. Each binding can carry its own configuration, which means the same channel type can behave differently in different rooms.

Participant membership records who is in each room, when they joined, and their role. This is essential for multi-party conversations where human agents join and leave, and for auditing who had access to what information.

Tasks and observations give agents structured storage for their work products. A triage agent can create a task ("escalate to billing"), a research agent can record observations ("customer has been a subscriber for 3 years"), and downstream agents can query these artifacts to inform their decisions. This is more structured than stuffing everything into the message timeline — tasks have status, priority, and ownership; observations have types and metadata.
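As a sketch of why that structure matters, here are minimal task and observation records and the kind of query a downstream agent might run. The field names are illustrative assumptions, not RoomKit's schema:

```python
from dataclasses import dataclass, field


@dataclass
class Task:
    room_id: str
    title: str
    owner: str              # which agent owns the task
    status: str = "open"    # e.g. open / in_progress / done
    priority: int = 0


@dataclass
class Observation:
    room_id: str
    type: str               # e.g. "customer_fact"
    body: str
    metadata: dict = field(default_factory=dict)


def open_tasks(tasks: list[Task], room_id: str) -> list[Task]:
    """Query structured artifacts instead of re-reading the message timeline."""
    return sorted(
        (t for t in tasks if t.room_id == room_id and t.status == "open"),
        key=lambda t: -t.priority,
    )
```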

Why Not Redis? Why Not DynamoDB?

PostgreSQL is the default production backend because conversations are inherently relational. Events belong to rooms. Rooms have participants. Participants have identities. Identities have addresses. Tasks reference rooms and agents. Trying to model this in a key-value store means reimplementing joins in application code, and doing it badly.

That said, the ConversationStore ABC exists precisely so you can bring your own backend. If your infrastructure is built on AWS and DynamoDB is your team's comfort zone, implement the interface and use it. If you need Redis for hot-path caching with PostgreSQL for durable storage, compose them. The framework does not impose a storage topology — it imposes a storage contract.
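Composing backends against the same contract can be as simple as a thin wrapper. This is a sketch, not a RoomKit class; both backends are assumed to expose the same minimal get/put surface:

```python
class DictStore:
    """Stand-in backend for the sketch."""

    def __init__(self) -> None:
        self._d: dict = {}

    def get(self, key: str):
        return self._d.get(key)

    def put(self, key: str, value) -> None:
        self._d[key] = value


class CachedStore:
    """Read-through cache (e.g. Redis) in front of a durable store (e.g. Postgres)."""

    def __init__(self, cache, durable) -> None:
        self._cache = cache
        self._durable = durable

    def get(self, key: str):
        value = self._cache.get(key)
        if value is None:                   # cache miss: fall through to durable storage
            value = self._durable.get(key)
            if value is not None:
                self._cache.put(key, value)
        return value

    def put(self, key: str, value) -> None:
        self._durable.put(key, value)       # durable write first
        self._cache.put(key, value)
```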

JSONB with GIN indexes gives you the best of both worlds for event data: the flexibility of document storage with the query power of a relational database. You can add arbitrary metadata to events without schema migrations, and the GIN index ensures those ad-hoc queries stay fast.

Storage as Architecture

The storage layer is not a detail you bolt on at the end. It is an architectural decision that shapes everything above it. How you store events determines how fast agents can reconstruct context. How you resolve identities determines whether your system recognizes returning customers. How you track conversation state determines whether agent handoffs are seamless or lossy.

RoomKit makes this explicit by putting storage at the foundation: every RoomKit instance takes a store, every operation flows through it, and the abstract interface means you can start with InMemoryStore in development and switch to PostgresStore in production without changing a line of application code.

In the next article, we will look at the agents themselves — the execution units that consume all this stored context to actually do useful work.


This article is part of a 9-part series on production-ready multi-agent architecture. Next up: Part 5: Agents.

Series: Introduction · Part 1: User Interaction · Part 2: Orchestration · Part 3: Knowledge · Part 4: Storage · Part 5: Agents · Part 6: Integration · Part 7: External Tools · Part 8: Observability · Part 9: Evaluation