A native desktop app for real-time voice conversations with AI. Switch between Google Gemini and OpenAI with one click. Built on RoomKit.
uv run python -m room_ui
A complete voice assistant experience with professional-grade audio processing.
Full-duplex voice conversations with sub-second latency. Speak naturally with interruption support.
iMessage-style chat bubbles with streaming partial transcriptions. See every word as it's spoken.
Ambient glow visualization showing mic and speaker audio levels with smooth animated waveforms.
Built-in acoustic echo cancellation (WebRTC and Speex AEC) for hands-free conversations without feedback loops.
Choose your microphone and speaker from settings. Supports all system audio devices.
Optional RNNoise denoiser removes background noise for crystal-clear voice input.
Press a global hotkey to dictate anywhere. Transcription is pasted into the focused app automatically.
Connect external tools via Model Context Protocol. Stdio, SSE, and HTTP transports supported.
Dictation in 14+ languages, including English, French, Spanish, German, Japanese, and Chinese.
Apple-inspired dark and light modes. Switch instantly from settings, with fully theme-aware components.
AI responses render with full Markdown: code blocks, tables, links, and inline formatting.
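To make the audio level visualization mentioned above more concrete, here is a minimal sketch of how a level meter could derive a value from raw audio. It is not taken from RoomKit UI's source: the function names `rms_level` and `smooth`, the 16-bit mono PCM input format, and the smoothing factor are all assumptions chosen for illustration.

```python
import math
import struct


def rms_level(pcm16: bytes) -> float:
    """Return the RMS level of little-endian 16-bit mono PCM, scaled to 0.0-1.0."""
    count = len(pcm16) // 2
    if count == 0:
        return 0.0
    samples = struct.unpack(f"<{count}h", pcm16[: count * 2])
    mean_square = sum(s * s for s in samples) / count
    return min(1.0, math.sqrt(mean_square) / 32768.0)


def smooth(previous: float, current: float, factor: float = 0.3) -> float:
    """Exponential smoothing so a meter or glow animates without jitter.

    The factor 0.3 is an arbitrary illustrative value, not a RoomKit UI setting.
    """
    return previous + factor * (current - previous)


# Example frames: pure silence and a full-scale square wave.
silence = struct.pack("<4h", 0, 0, 0, 0)
loud = struct.pack("<4h", 32767, -32767, 32767, -32767)
```

Feeding each captured frame through `rms_level` and then `smooth` gives a value that rises quickly with speech and settles gradually, which is the general behavior an animated level display needs.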
Switch between providers instantly. Your API keys are saved independently.
Native audio with Gemini 2.5 Flash. Low latency, multilingual, with built-in thinking.
GPT-4o Realtime API with server-side voice activity detection (VAD). Natural, expressive voices in real time.
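One way to picture "saved independently" is a store that keeps one entry per provider, so saving a key for one provider never touches the other. The sketch below is purely illustrative: `ProviderKeyStore`, the JSON file layout, and the provider names used as dictionary keys are hypothetical, and RoomKit UI may store keys differently (a real app would typically prefer the operating system's keychain over a plain file).

```python
import json
import tempfile
from pathlib import Path
from typing import Dict, Optional


class ProviderKeyStore:
    """Hypothetical store that keeps one API key per provider name."""

    def __init__(self, path: Path) -> None:
        self.path = path
        # Load previously saved keys if the file already exists.
        self.keys: Dict[str, str] = (
            json.loads(path.read_text(encoding="utf-8")) if path.exists() else {}
        )

    def save_key(self, provider: str, api_key: str) -> None:
        # Only this provider's entry changes; other entries are left as they were.
        self.keys[provider] = api_key
        self.path.write_text(json.dumps(self.keys), encoding="utf-8")

    def key_for(self, provider: str) -> Optional[str]:
        return self.keys.get(provider)


# Usage with placeholder values (not real keys).
store_path = Path(tempfile.mkdtemp()) / "keys.json"
store = ProviderKeyStore(store_path)
store.save_key("gemini", "gemini-placeholder-key")
store.save_key("openai", "openai-placeholder-key")
```

Because each provider has its own entry, switching the active provider only changes which key is looked up, not what is stored.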
Pre-built binaries for every major operating system, or run from source with a few commands.
git clone https://github.com/roomkit-live/roomkit-ui.git
cd roomkit-ui && uv sync
uv run python -m room_ui
Open Settings, choose your provider (Gemini or OpenAI), paste your API key, and start talking.
Download RoomKit UI or build from source. It's open-source and free.