Pure Python · Async-First · Type-Safe · Voice AI · MCP Ready

Multi-Channel Conversations, Simplified.

The missing layer between your channels and your logic. Primitives for multi-channel conversations — not a platform, a foundation.

pip install roomkit

example.py

from roomkit import RoomKit, WebSocketChannel, ChannelCategory
from roomkit.channels.ai import AIChannel

kit = RoomKit()

# Register channels
kit.register_channel(WebSocketChannel("web"))
kit.register_channel(AIChannel("ai", provider=my_ai))

# Create a room and attach channels
room = await kit.create_room()
await kit.attach_channel(room.id, "web")
await kit.attach_channel(
    room.id, "ai", category=ChannelCategory.INTELLIGENCE)

# Process inbound messages
result = await kit.process_inbound(message)

# Messages flow through hooks, get stored,
# and broadcast to all attached channels

Why RoomKit?

Building conversation systems today is harder than it should be.

The Multi-Channel Nightmare

✗ Fragmented codebases

Separate integrations for SMS, email, WhatsApp, and chat widgets, each with its own SDK, webhooks, and quirks.

✗ Lost context

A customer starts on SMS, continues on email, and finishes on chat. Your system treats them as three strangers.

✗ No unified history

"What did they say last week?" requires querying five different APIs.

✗ Identity hell

+1-555-1234 on SMS is john@example.com on email and "John D." on chat. Connecting these is your problem.

✗ Vendor lock-in

Switching from Twilio to Telnyx means rewriting everything.

How RoomKit Fixes This

✓ One conversation, any channel

Messages flow into rooms, not silos. Switch channels mid-conversation without losing context.

✓ Pluggable adapters

Swap providers without changing application logic. Twilio today, Telnyx tomorrow.

✓ Built-in identity resolution

Resolve unknown senders, handle ambiguity with hooks, merge identities across channels.

✓ Powerful hook system

Intercept, route, moderate, or transform messages at any point. One place for all your logic.

✓ Unified history

Query conversations, not channels. Full context regardless of how customers reached you.

How RoomKit Compares

There are many tools in this space. Here's where RoomKit fits.

Message Brokers

Kombu, Celery, RabbitMQ

Move bytes between services. No concept of conversations or participants.

Use RoomKit when: You need to manage actual human conversations.

Chatbot Frameworks

Rasa, Dialogflow, ChatterBot

Focus on NLP, intent detection, and response generation.

Use RoomKit when: You need to route messages to AI (or humans) across channels.

Full Platforms

Chatwoot, Rocket.Chat, Intercom

Complete applications with UI, dashboards, and hosted infrastructure.

Use RoomKit when: You're building your own product and need the primitives.

Voice AI Frameworks

Pipecat, LiveKit Agents, TEN Framework

Pipeline or session-based voice AI. Focus on audio processing, not conversations.

Use RoomKit when: You need voice + text + multi-agent in the same room.

Feature	RoomKit	Chatwoot	Twilio	Kombu	Rasa
Open source	✓	✓	✗	✓	✓
Self-hosted	✓	✓	✗	✓	✓
Python library	✓	✗	✓	✓	✓
Multi-channel	✓	✓	✓	✗	✗
Room-based	✓	✗	✗	✗	✗
Identity resolution	✓	~	✗	✗	✗
Hook system	✓	✗	✗	✗	✗
Async-first	✓	N/A	✓	✓	✗
No per-message fees	✓	✓	✗	✓	✓
Real-time voice	✓	✗	✓	✗	✗
AI-ready (llms.txt)	✓	✗	✗	✗	✗
Multi-agent orchestration	✓	✗	✗	✗	✗

Everything You Need

A complete framework for building conversation systems at any scale.

Room-Based Architecture

Organize conversations into rooms with participants, events, and channel bindings. Each room is a self-contained conversation context.
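
As a rough mental model only, a room can be pictured as a small record that holds participants, an ordered event log, and channel bindings. The names below (Room, RoomEvent, bind_channel, append) are illustrative assumptions and not RoomKit's actual models.

```python
from dataclasses import dataclass, field


@dataclass
class RoomEvent:
    """One message inside a room (hypothetical shape)."""
    sender_id: str
    channel_id: str
    body: str


@dataclass
class Room:
    """A self-contained conversation context (illustrative, not RoomKit's model)."""
    room_id: str
    participants: set[str] = field(default_factory=set)
    channel_ids: set[str] = field(default_factory=set)
    events: list[RoomEvent] = field(default_factory=list)

    def bind_channel(self, channel_id: str) -> None:
        self.channel_ids.add(channel_id)

    def append(self, event: RoomEvent) -> list[str]:
        """Store the event and return the other bound channels that should receive it."""
        self.participants.add(event.sender_id)
        self.events.append(event)
        return sorted(self.channel_ids - {event.channel_id})


room = Room("room-1")
room.bind_channel("sms")
room.bind_channel("web")
targets = room.append(RoomEvent("user-1", "sms", "Hello"))
```

Here an SMS message lands in the room's event log and is queued for the web channel only, since the originating channel is excluded from the broadcast.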

Multi-Channel Support

SMS, Email, WhatsApp, Teams, Messenger, Voice, WebSocket, AI, and more. Messages flow seamlessly between channels with automatic transcoding.

Async-First Design

Built on Python's asyncio from the ground up. Handle thousands of concurrent conversations without blocking.
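
The benefit of an asyncio design can be seen in a small standalone sketch: many handlers wait on I/O at the same time instead of one after another. The handle_conversation function is a hypothetical stand-in, and the sleep simulates a network or database call.

```python
import asyncio


async def handle_conversation(conversation_id: int) -> str:
    """Hypothetical handler; the sleep stands in for network or database I/O."""
    await asyncio.sleep(0.01)
    return f"conversation-{conversation_id}: done"


async def main(count: int) -> list[str]:
    # All handlers wait concurrently, so total time stays close to one sleep,
    # not `count` sleeps back to back.
    return await asyncio.gather(*(handle_conversation(i) for i in range(count)))


results = asyncio.run(main(1000))
```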

Powerful Hook System

35+ hook triggers to intercept, modify, or block events at any point. Build content moderation, analytics, AI routing, and more with sync and async hooks.
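
One way to picture a chain that accepts both sync and async hooks is the minimal runner below. It is a sketch under simplifying assumptions, not RoomKit's dispatcher: hooks here return new text or None to block, whereas RoomKit's own hooks return HookResult objects, as in the examples on this page.

```python
import asyncio
import inspect
from typing import Any, Callable, Optional


async def run_hooks(hooks: list[Callable[[str], Any]], body: str) -> Optional[str]:
    """Run hooks in order; a hook returns new text, or None to block the event."""
    for hook in hooks:
        outcome = hook(body)
        if inspect.isawaitable(outcome):  # async hooks return a coroutine
            outcome = await outcome
        if outcome is None:
            return None  # blocked: later hooks never run
        body = outcome
    return body


def strip_whitespace(body: str) -> Optional[str]:  # sync hook
    return body.strip()


async def block_spam(body: str) -> Optional[str]:  # async hook
    return None if "spam" in body.lower() else body


hooks = [strip_whitespace, block_spam]
allowed = asyncio.run(run_hooks(hooks, "  hello  "))
blocked = asyncio.run(run_hooks(hooks, "buy SPAM now"))
```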

Identity Resolution

Resolve unknown senders to known identities. Handle ambiguous cases with hooks for challenges, verification, or manual resolution.
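
The three outcomes (resolved, ambiguous, unknown) can be sketched with a plain lookup table. The Resolution class, the resolve function, and the directory contents are hypothetical and only illustrate the classification step, not RoomKit's resolver interface.

```python
from dataclasses import dataclass, field


@dataclass
class Resolution:
    status: str  # "resolved", "ambiguous", or "unknown"
    candidates: list[str] = field(default_factory=list)


def resolve(address: str, directory: dict[str, list[str]]) -> Resolution:
    """Look up a channel address and classify the result by number of matches."""
    matches = directory.get(address, [])
    if len(matches) == 1:
        return Resolution("resolved", matches)
    if matches:
        return Resolution("ambiguous", matches)
    return Resolution("unknown")


# Hypothetical directory: address -> identity ids that claim it
directory = {
    "+14185551234": ["john"],
    "shared@example.com": ["john", "jane"],
}
```

An ambiguous result is the case where a hook would step in, for example to challenge the sender or hand the decision to a human.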

Pluggable Backends

In-memory defaults for development, plug in Redis, PostgreSQL, or custom implementations for production. Storage, locks, and realtime all pluggable.
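
The pluggable pattern can be sketched with a structural interface and an in-memory default. EventStore, InMemoryStore, and record_greeting are hypothetical names chosen for illustration; RoomKit's real storage interface may look different.

```python
from typing import Protocol


class EventStore(Protocol):
    """Hypothetical storage interface; any backend with these methods fits."""

    def append(self, room_id: str, body: str) -> None: ...
    def history(self, room_id: str) -> list[str]: ...


class InMemoryStore:
    """Development default: keeps everything in a dict."""

    def __init__(self) -> None:
        self._events: dict[str, list[str]] = {}

    def append(self, room_id: str, body: str) -> None:
        self._events.setdefault(room_id, []).append(body)

    def history(self, room_id: str) -> list[str]:
        return list(self._events.get(room_id, []))


def record_greeting(store: EventStore, room_id: str) -> list[str]:
    # Application code depends only on the interface, so a Redis- or
    # PostgreSQL-backed class could be passed in without changes here.
    store.append(room_id, "hello")
    return store.history(room_id)


history = record_greeting(InMemoryStore(), "room-1")
```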

Real-Time Voice

Four voice backends (FastRTC, SIP, RTP, Local Audio). Full audio pipeline with echo cancellation (AEC), automatic gain control (AGC), denoising, voice activity detection (VAD), DTMF, and diarization. Four interruption strategies. STT/TTS or speech-to-speech modes.

Event-Driven Sources

Connect persistent message sources like WebSocket, NATS, or SSE. Auto-restart with exponential backoff, health monitoring, and backpressure control built-in.
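
Exponential backoff for restarts can be sketched as a delay schedule that doubles on each attempt up to a ceiling. The function name and the base and cap defaults are assumptions for illustration, not RoomKit's actual settings.

```python
def backoff_delays(attempts: int, base: float = 1.0, cap: float = 60.0) -> list[float]:
    """Delay before each restart attempt: base * 2**n, limited to `cap` seconds."""
    return [min(cap, base * 2 ** n) for n in range(attempts)]


delays = backoff_delays(8)
# delays == [1.0, 2.0, 4.0, 8.0, 16.0, 32.0, 60.0, 60.0]
```

Capping the delay keeps a long outage from pushing the next retry arbitrarily far into the future.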

Production Resilience

Circuit breakers isolate failing providers. Rate limiting with token buckets. Retry with exponential backoff. Chain depth limits prevent infinite loops.
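
For readers unfamiliar with token buckets, here is a minimal standalone sketch. It is not RoomKit's rate limiter: the class and its parameters are hypothetical, and time is passed in explicitly so the behavior is easy to follow.

```python
class TokenBucket:
    """Allow bursts up to `capacity`, refilling at `rate` tokens per second."""

    def __init__(self, capacity: float, rate: float, now: float = 0.0) -> None:
        self.capacity = capacity
        self.rate = rate
        self.tokens = capacity
        self.updated = now

    def allow(self, now: float) -> bool:
        # Refill for the time elapsed since the last call, never above capacity.
        elapsed = now - self.updated
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


bucket = TokenBucket(capacity=2, rate=1.0)
burst = [bucket.allow(now=0.0) for _ in range(3)]  # third call exceeds the burst
after_wait = bucket.allow(now=1.0)                 # one token refilled after 1 s
```

In real code the `now` argument would typically come from time.monotonic().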

Multi-Agent Orchestration

Define agents, wire them into pipelines, and hand off conversations — including on live voice calls.

Inbound → ConversationRouter → Active Agent → Handoff → Next Agent

Agent

Extends AIChannel with role, voice, greeting, language, and memory. Each agent is a self-contained persona.

ConversationPipeline

Staged workflows that move conversations through phases — triage, handling, resolution — automatically.

Handoff Protocol

Context-preserving handoffs with audit trail. Voice handoffs without disconnect — the caller never knows.

ConversationState

Phase tracking, transition history, and custom context. Know exactly where every conversation stands.
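
Phase tracking with a transition history can be pictured with a few lines of plain Python. PhaseTracker is a hypothetical stand-in used only to illustrate the idea; it is not RoomKit's ConversationState class.

```python
from dataclasses import dataclass, field


@dataclass
class PhaseTracker:
    """Hypothetical stand-in for per-conversation state; not RoomKit's class."""
    phase: str
    history: list[tuple[str, str]] = field(default_factory=list)
    context: dict[str, str] = field(default_factory=dict)

    def transition(self, next_phase: str) -> None:
        # Record (from, to) before moving, so the audit trail stays ordered.
        self.history.append((self.phase, next_phase))
        self.phase = next_phase


state = PhaseTracker(phase="triage")
state.context["intent"] = "billing"
state.transition("handling")
state.transition("resolution")
```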

pipeline.py

from roomkit.orchestration import ConversationPipeline, PipelineStage

pipeline = ConversationPipeline(stages=[
    PipelineStage(phase="triage", agent_id="agent-triage", next="handling"),
    PipelineStage(phase="handling", agent_id="agent-handler", next=None),
])

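# Use a separate name for the returned handoff handler so it does not shadow the `handler` agent
router, on_handoff = pipeline.install(kit, [triage, handler])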

Expressive API

Clean, intuitive APIs that make complex operations simple.

from roomkit import RoomKit, HookTrigger, HookResult

kit = RoomKit()

# Content moderation hook
@kit.hook(HookTrigger.BEFORE_BROADCAST)
async def moderate_content(event, ctx):
    if contains_profanity(event.content.body):
        return HookResult.block("Content policy violation")
    return HookResult.allow()

# AI routing hook
@kit.hook(HookTrigger.BEFORE_BROADCAST)
async def route_to_ai(event, ctx):
    if needs_ai_response(event, ctx):
        return HookResult.inject_to(["ai-channel"])
    return HookResult.allow()

from roomkit import RoomKit, HookTrigger, IdentityHookResult

kit = RoomKit(identity_resolver=my_resolver)

# Handle ambiguous identity (multiple matches)
@kit.identity_hook(HookTrigger.ON_IDENTITY_AMBIGUOUS)
async def resolve_ambiguous(event, ctx, id_result):
    # Access sender info directly
    sender = id_result.address  # e.g., "+14185551234"

    if sender in known_senders:
        identity = get_identity(known_senders[sender])
        return IdentityHookResult.resolved(identity)

    # Ask user to identify themselves
    return IdentityHookResult.pending(
        display_name=f"Unknown ({sender})",
        candidates=id_result.candidates
    )

from roomkit import RoomKit, EphemeralEventType

kit = RoomKit()

# Subscribe to typing indicators and presence
async def on_realtime(event):
    if event.type == EphemeralEventType.TYPING_START:
        print(f"{event.user_id} is typing...")
    elif event.type == EphemeralEventType.PRESENCE_ONLINE:
        print(f"{event.user_id} came online")

sub_id = await kit.subscribe_room("room-123", on_realtime)

# Publish typing indicator
await kit.publish_typing("room-123", "user-456")

# Publish read receipt
await kit.publish_read_receipt("room-123", "user-456", "event-789")

from roomkit import RoomKit, VoiceChannel
from roomkit.voice.stt.deepgram import DeepgramSTTProvider
from roomkit.voice.tts.elevenlabs import ElevenLabsTTSProvider
from roomkit.voice.backends.fastrtc import FastRTCBackend

# Configure voice channel with STT, TTS, and backend
kit = RoomKit()
kit.register_channel(VoiceChannel(
    "voice",
    stt=DeepgramSTTProvider(api_key="..."),
    tts=ElevenLabsTTSProvider(api_key="...", voice_id="..."),
    backend=FastRTCBackend(),
))

# Attach to an existing room (e.g. one returned by kit.create_room());
# voice joins the same conversation
await kit.attach_channel(room.id, "voice")

# Transcriptions flow through hooks like any message

from roomkit import RoomKit, RealtimeVoiceChannel
from roomkit.providers.gemini.realtime import GeminiLiveProvider
from roomkit.voice.realtime.ws_transport import WebSocketRealtimeTransport

# Speech-to-speech AI — no STT/TTS pipeline needed
provider = GeminiLiveProvider(api_key="...")
transport = WebSocketRealtimeTransport()

kit = RoomKit()
channel = RealtimeVoiceChannel(
    "realtime-voice",
    provider=provider,
    transport=transport,
    system_prompt="You are a helpful voice assistant.",
)
kit.register_channel(channel)

# Connect a participant; audio flows directly to/from Gemini.
# `websocket` is a placeholder for the participant's open WebSocket connection.
session = await channel.start_session("room-1", "user-1", websocket)

# Transcriptions appear as RoomEvents in the room
# Text from other channels is injected into the AI session

from roomkit import RoomKit
from roomkit.channels.ai import AIChannel
from roomkit.orchestration import (
    Agent, ConversationPipeline, PipelineStage
)

kit = RoomKit()

# Define agents with distinct roles
triage = Agent(
    "triage", provider=my_ai,
    role="Classify the customer's intent",
    greeting="Hi! How can I help you today?",
)
handler = Agent(
    "handler", provider=my_ai,
    role="Resolve the customer's issue",
)

# Wire into a pipeline
pipeline = ConversationPipeline(stages=[
    PipelineStage(phase="triage", agent_id="triage", next="handling"),
    PipelineStage(phase="handling", agent_id="handler", next=None),
])

# Install — router and handler are wired automatically
router, on_handoff = pipeline.install(kit, [triage, handler])

from roomkit import RoomKit, BaseSourceProvider, SourceStatus

class NATSSource(BaseSourceProvider):
    def __init__(self, subject: str):
        super().__init__()
        self.subject = subject

    @property
    def name(self) -> str:
        return f"nats:{self.subject}"

    async def start(self, emit):
        self._set_status(SourceStatus.CONNECTED)
        # self.subscribe() and parse_message() are placeholders for your NATS client code
        async for msg in self.subscribe():
            await emit(parse_message(msg))
            self._record_message()

# Attach with resilience options
await kit.attach_source(
    "nats-events", NATSSource("chat.>"),
    max_restart_attempts=10,   # Give up after 10 failures
    max_concurrent_emits=20,   # Backpressure control
)

Model Context Protocol

Native MCP Integration

RoomKit is designed for seamless integration with the Model Context Protocol (MCP). Build AI assistants that can manage conversations, send messages, and handle multi-channel communication through a standardized protocol.

Room management tools (create, list, get)
Message sending and history retrieval
Channel attachment and management
Participant and identity management

Claude Desktop

You

Create a new support room and send a welcome message

Claude

I'll create a support room and send a welcome message.

roomkit_create_room {"metadata": {"type": "support"}}

roomkit_send_message {"room_id": "...", "body": "Welcome!"}

Done! Room created with ID rm_abc123 and welcome message sent.
