AI Thinking / Reasoning¶
RoomKit provides first-class support for AI thinking (chain-of-thought reasoning). Models such as Claude with extended thinking, DeepSeek-R1, and QwQ produce internal reasoning before their answer. RoomKit captures this reasoning, preserves it across tool-loop rounds, and exposes it through hooks and ephemeral events.
Quick start¶
from roomkit import AIChannel, AnthropicAIProvider, AnthropicConfig
provider = AnthropicAIProvider(AnthropicConfig(api_key="sk-..."))
ai = AIChannel(
    "ai-thinker",
    provider=provider,
    system_prompt="Think step by step before answering.",
    thinking_budget=8192,  # Token budget for reasoning
)
That's it. When the provider supports thinking, the reasoning is automatically captured and preserved in conversation history.
How it works¶
User message arrives
        │
        ▼
AIChannel builds AIContext (with thinking_budget)
        │
        ▼
Provider generates response
        ├── Thinking: "Let me reason step by step..."  → THINKING_START ephemeral event
        │                                               → ON_AI_THINKING hook
        │                                               → THINKING_END ephemeral event
        └── Answer: "The answer is 42."  → Broadcast as RoomEvent
        │
        ▼
AIThinkingPart preserved in conversation history
        │
        ▼
Next generation sees prior reasoning (required by Anthropic, useful for all)
Configuration¶
| Parameter | Location | Description |
|---|---|---|
| `thinking_budget` | `AIChannel()` constructor | Default token budget for reasoning |
| `thinking_budget` | Binding metadata | Per-room override |
Default thinking budget¶
Set the default budget when creating the channel:
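A minimal sketch, reusing the provider from the quick start above:

ai = AIChannel(
    "ai-thinker",
    provider=provider,
    thinking_budget=8192,  # Applies to every room this channel serves unless overridden
)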
Per-room override¶
Override the budget for specific rooms via binding metadata:
await kit.attach_channel(
    "math-room",
    "ai-thinker",
    category=ChannelCategory.INTELLIGENCE,
    metadata={
        "system_prompt": "You are a math tutor. Show your work.",
        "thinking_budget": 16384,  # More budget for complex reasoning
    },
)
Provider support¶
Anthropic (native extended thinking)¶
AnthropicAIProvider uses the native extended thinking API. When thinking_budget is set:
- The API receives `thinking: {type: "enabled", budget_tokens: N}`
- Temperature is automatically set to 1 (required by the API)
- Thinking blocks include a `signature` for round-trip fidelity
- `AIThinkingPart` is preserved verbatim in conversation history (Anthropic requires this)
from roomkit import AnthropicAIProvider, AnthropicConfig
provider = AnthropicAIProvider(AnthropicConfig(
    api_key="sk-...",
    model="claude-sonnet-4-20250514",
))
ai = AIChannel("ai", provider=provider, thinking_budget=8192)
Ollama / vLLM (<think> tags)¶
Models served via Ollama or vLLM (DeepSeek-R1, QwQ, etc.) emit reasoning inside <think>...</think> tags. The OpenAIAIProvider parses these automatically:
- Streaming: `_ThinkTagParser` handles tags split across chunk boundaries
- Non-streaming: regex extraction from the complete response
- History: `AIThinkingPart` is re-wrapped in `<think>` tags when sent back to the model
from roomkit import create_vllm_provider
provider = create_vllm_provider(
    base_url="http://localhost:11434/v1",
    api_key="ollama",
    model="deepseek-r1:8b",
)
ai = AIChannel("ai", provider=provider, thinking_budget=8192)
Gemini¶
Gemini does not currently emit thinking content. The thinking_budget parameter is accepted but has no effect.
Streaming¶
During streaming generation, thinking content arrives as StreamThinkingDelta events before StreamTextDelta events:
from roomkit.providers.ai.base import (
    StreamThinkingDelta,
    StreamTextDelta,
    StreamToolCall,
    StreamDone,
)
async for event in provider.generate_structured_stream(context):
    if isinstance(event, StreamThinkingDelta):
        print(f"Thinking: {event.thinking}")
    elif isinstance(event, StreamTextDelta):
        print(f"Text: {event.text}")
The AIChannel handles this automatically — thinking deltas trigger ephemeral events, and text deltas are delivered to downstream channels.
Hooks and ephemeral events¶
ON_AI_THINKING hook¶
Fires when the AI produces thinking content. Use it for logging, observability, or cost tracking:
from roomkit import HookTrigger
@kit.hook(HookTrigger.ON_AI_THINKING)
async def log_thinking(event, ctx):
    thinking = ctx.get("thinking", "")
    print(f"AI reasoning ({len(thinking)} chars): {thinking[:100]}...")
Ephemeral events¶
Two ephemeral events bracket the thinking phase:
| Event | When |
|---|---|
| `THINKING_START` | AI begins reasoning |
| `THINKING_END` | AI finishes reasoning (thinking text in payload) |
These are published via the RealtimeBackend and do not persist in the conversation store. Use them for real-time UI indicators (e.g., "AI is thinking...").
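On the client side this typically drives a transient indicator. The sketch below assumes a hypothetical subscription callback and payload shape; consult your RealtimeBackend implementation for the actual API.

# Hypothetical client-side handler; event names mirror the table above,
# but the subscription mechanism and payload fields are assumptions.
async def on_ephemeral(event: dict) -> None:
    if event.get("type") == "THINKING_START":
        print("AI is thinking...")
    elif event.get("type") == "THINKING_END":
        print(f"Done thinking ({len(event.get('thinking', ''))} chars)")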
Tool loop integration¶
Thinking is preserved across tool-loop rounds. When the AI calls a tool and then continues generating, the thinking from each round is kept in the conversation history:
Round 1: AI thinks → calls tool
    ├── AIThinkingPart(thinking="I need to look up...")
    └── AIToolCallPart(name="search", ...)
Tool executes → result appended
Round 2: AI thinks → generates answer
    ├── AIThinkingPart(thinking="Based on the results...")
    └── AITextPart(text="Here's what I found...")
This ensures the model has full context of its prior reasoning when generating follow-up responses.
Data model¶
AIThinkingPart¶
Represents a thinking block in conversation history:
from roomkit import AIThinkingPart
part = AIThinkingPart(
    thinking="Let me reason step by step...",
    signature="abc123",  # Optional, used by Anthropic for round-trip
)
StreamThinkingDelta¶
A streaming event for thinking content:
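A minimal sketch of constructing and reading one; the `thinking` field matches the streaming example above, and the keyword constructor is an assumption:

from roomkit.providers.ai.base import StreamThinkingDelta

delta = StreamThinkingDelta(thinking="Let me reason step by step...")
print(delta.thinking)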
AIResponse fields¶
| Field | Type | Description |
|---|---|---|
| `thinking` | `str \| None` | Accumulated thinking text |
| `thinking_signature` | `str \| None` | Provider-specific signature (Anthropic) |
AIContext field¶
| Field | Type | Description |
|---|---|---|
| `thinking_budget` | `int \| None` | Token budget for reasoning |
Testing¶
Use MockAIProvider with AIResponse that includes thinking content:
from roomkit import AIChannel, MockAIProvider
from roomkit.providers.ai.base import AIResponse
provider = MockAIProvider(
    ai_responses=[
        AIResponse(
            content="The answer is 42.",
            thinking="Let me reason about this...",
            finish_reason="stop",
            usage={"prompt_tokens": 20, "completion_tokens": 15},
        ),
    ],
    streaming=True,
)
ai = AIChannel("ai", provider=provider, thinking_budget=8192)
When streaming=True, MockAIProvider.generate_structured_stream() yields StreamThinkingDelta before StreamTextDelta, matching the real provider behavior.
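This ordering can be asserted directly in a test. A sketch of such a helper, assuming the test already has an AIContext to pass in:

from roomkit.providers.ai.base import StreamTextDelta, StreamThinkingDelta

async def assert_thinking_precedes_text(provider, context) -> None:
    """Walk the structured stream and fail if any text delta arrives before thinking."""
    saw_text = False
    async for event in provider.generate_structured_stream(context):
        if isinstance(event, StreamTextDelta):
            saw_text = True
        if isinstance(event, StreamThinkingDelta):
            assert not saw_text, "thinking deltas should precede text deltas"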
Example¶
See examples/ai_thinking.py for a runnable demo showing thinking with AIChannel and per-room configuration.