Engineering insights, tutorials, and deep-dives into multi-channel conversation systems.
I built RoomKit UI, a desktop voice assistant for macOS, Linux, and Windows. It supports Google Gemini and OpenAI Realtime, connects to external tools via MCP, includes system-wide dictation, and ships as a standalone app — all built on top of RoomKit.
RoomKit now supports SIP natively. Incoming calls from any PBX are answered and bridged to conversational AI in real time: no WebRTC, no browser, just a phone call in under 50 lines of Python.
How I integrated Gradium's audio language models into RoomKit for multi-channel voice AI, with semantic VAD for natural turn-taking, streaming STT/TTS via WebSocket, and sub-300ms time-to-first-token.
A fair comparison of four open-source conversational AI frameworks: their philosophies, code examples, strengths, and ideal use cases. Pipelines vs. graphs vs. rooms vs. sessions: choose the abstraction that matches your problem.
Build a fully local, open-source voice assistant in Python: no API keys, no subscriptions, no data leaving your machine. The whole voice pipeline runs on a single NVIDIA 4070 and responds in under 300ms.
If you've ever integrated SMS, email, voice, and chat into the same app, you know the pain. Each channel has its own SDK, its own webhooks, its own quirks. After the third time rebuilding the same plumbing, I extracted the pattern into a library.