SIP Voice Backend
The SIP backend handles the full SIP call lifecycle: incoming INVITE → SDP negotiation → RTP media → BYE teardown. It is intended for PBX and SIP trunk integration.
Install with: pip install roomkit[sip]
Quick start
from roomkit.voice.backends.sip import SIPVoiceBackend
backend = SIPVoiceBackend(
    local_sip_addr=("0.0.0.0", 5060),
    local_rtp_ip="10.0.0.5",
    rtp_port_start=10000,
)
backend.on_call(handle_incoming_call)
await backend.start()
See the full SIP example for a complete runnable script.
API Reference
SIPVoiceBackend
SIPVoiceBackend(*, local_sip_addr=('0.0.0.0', 5060), local_rtp_ip='0.0.0.0', advertised_ip=None, rtp_port_start=10000, rtp_port_end=20000, supported_codecs=None, dtmf_payload_type=101, user_agent=None, server_name='-', jitter_capacity=32, jitter_prefetch=0, skip_audio_gaps=True, rtp_inactivity_timeout=30.0, auth_users=None, auth_realm='roomkit')
Bases: SIPAuthMixin, SIPCallingMixin, SIPAudioMixin, VoiceBackend
VoiceBackend that handles incoming SIP calls with full lifecycle.
Listens for SIP INVITE requests, negotiates codecs via SDP, creates
RTP sessions for audio streaming, and handles BYE/CANCEL for call
teardown. Incoming calls are auto-accepted; an on_call callback
lets the application route the session to a room.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| local_sip_addr | tuple[str, int] | | ('0.0.0.0', 5060) |
| local_rtp_ip | str | IP address for RTP media binding. | '0.0.0.0' |
| advertised_ip | str \| None | Public IP to advertise in SDP | None |
| rtp_port_start | int | First RTP port to allocate. | 10000 |
| rtp_port_end | int | Last RTP port in the allocation range. | 20000 |
| supported_codecs | list[int] \| None | List of payload type numbers to accept (default | None |
| dtmf_payload_type | int | RTP payload type for RFC 4733 DTMF events. | 101 |
| user_agent | str \| None | Value for the SIP | None |
| server_name | str | SDP session name ( | '-' |
| jitter_capacity | int | Maximum number of packets the RTP jitter buffer can hold. Default 32 (~640 ms at 20 ms/packet). | 32 |
| jitter_prefetch | int | Number of packets to accumulate before starting playout. Default 0 (start immediately, optimised for low latency). | 0 |
| skip_audio_gaps | bool | When | True |
| rtp_inactivity_timeout | float | Seconds of RTP silence before forcing session disconnect (safety net for missed BYE). Set to 0 to disable. Default 30. | 30.0 |
| auth_users | dict[str, str] \| None | Optional mapping of | None |
| auth_realm | str | Realm string used in the | 'roomkit' |
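To make the jitter-buffer defaults concrete, the sketch below converts a packet count into buffered audio time. The helper `buffered_ms` is hypothetical and not part of roomkit; it assumes the 20 ms packet interval used in the jitter_capacity description.

```python
def buffered_ms(packets: int, ptime_ms: int = 20) -> int:
    """Audio duration, in milliseconds, held by `packets` RTP packets.

    Hypothetical helper for illustration; ptime_ms=20 is an assumption
    taken from the jitter_capacity description.
    """
    if packets < 0 or ptime_ms <= 0:
        raise ValueError("packets must be >= 0 and ptime_ms must be > 0")
    return packets * ptime_ms


# Upper bound on buffered audio with the default jitter_capacity of 32.
max_buffered = buffered_ms(32)

# Start-up delay implied by the default jitter_prefetch of 0.
startup_delay = buffered_ms(0)
```

Raising jitter_prefetch trades start-up latency for smoother playout: under the same assumption, a prefetch of 3 packets would delay the first audio by about 60 ms.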
on_dtmf_received
Register a callback for inbound DTMF digits (RFC 4733).
on_call
Register a callback for incoming SIP calls.
Fired after the INVITE has been accepted and the RTP session is active. Accepts both sync and async callbacks. Can be used as a decorator:
@backend.on_call
async def handle_call(session):
    await kit.process_inbound(
        parse_voice_session(session, channel_id="voice")
    )
on_call_disconnected
Register a callback for remote BYE (call hangup).
In SIP, call disconnect and client disconnect are the same event
(a BYE terminates the dialog). Callbacks registered here and via
on_client_disconnected share the same list and are all
fired on any disconnect. Do not register the same function via
both methods.
on_client_disconnected
Register a callback for client disconnection (base-class API).
In SIP, this is equivalent to on_call_disconnected: both
register into the same callback list. Fired on remote BYE or
RTP inactivity timeout.
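The warning about registering the same function through both methods follows from the shared list. The toy class below is a stand-in written for illustration, not roomkit's implementation; the class name, the `fire` method, and the single-argument callback signature are all assumptions.

```python
class SharedDisconnectCallbacks:
    """Toy stand-in: two registration methods backed by one list."""

    def __init__(self):
        self._callbacks = []

    def on_call_disconnected(self, callback):
        self._callbacks.append(callback)
        return callback

    def on_client_disconnected(self, callback):
        # Same list as on_call_disconnected, so duplicates are possible.
        self._callbacks.append(callback)
        return callback

    def fire(self, session_id):
        # Hypothetical trigger: invoke every registered callback once.
        for callback in self._callbacks:
            callback(session_id)


hangups = []


def log_hangup(session_id):
    hangups.append(session_id)


toy = SharedDisconnectCallbacks()
toy.on_call_disconnected(log_hangup)
toy.on_client_disconnected(log_hangup)  # duplicate registration
toy.fire("session-1")
```

After `fire`, `hangups` holds two entries for one disconnect, which is the double invocation the note above warns about.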
Codecs
The SIP backend negotiates codecs via SDP. Supported codecs:
| Codec | Payload Type | Audio Rate | Quality |
|---|---|---|---|
| G.722 | 9 | 16 kHz (wideband) | Best — recommended for voice AI |
| G.711 µ-law (PCMU) | 0 | 8 kHz (narrowband) | Standard |
| G.711 A-law (PCMA) | 8 | 8 kHz (narrowband) | Standard |
By default, the backend accepts all three codecs with G.722 preferred. You can restrict codecs via supported_codecs:
from roomkit.voice.backends.sip import SIPVoiceBackend, PT_G722, PT_PCMU, PT_PCMA

# G.722 only (wideband)
backend = SIPVoiceBackend(
    local_sip_addr=("0.0.0.0", 5060),
    local_rtp_ip="10.0.0.5",
    rtp_port_start=10000,
    supported_codecs=[PT_G722],
)

# G.711 only (narrowband)
backend = SIPVoiceBackend(
    local_sip_addr=("0.0.0.0", 5060),
    local_rtp_ip="10.0.0.5",
    rtp_port_start=10000,
    supported_codecs=[PT_PCMU, PT_PCMA],
)
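As a rough illustration of what restricting supported_codecs means during negotiation, the sketch below picks a payload type from a peer's offer. It is not roomkit's negotiation code: `pick_codec` is a hypothetical helper, the constants are local stand-ins carrying the payload type numbers from the table above, and it assumes the local preference order decides the outcome.

```python
# Local stand-ins for the payload type numbers in the codec table.
PT_PCMU, PT_PCMA, PT_G722 = 0, 8, 9


def pick_codec(offered, supported=(PT_G722, PT_PCMU, PT_PCMA)):
    """Return the first locally preferred payload type the peer offered.

    Returns None when the offer and the supported list do not overlap.
    Assumption: local preference order wins over the offer's order.
    """
    offered_set = set(offered)
    for payload_type in supported:
        if payload_type in offered_set:
            return payload_type
    return None


# Peer offers PCMU, PCMA, G.722 and a DTMF payload type (101).
default_choice = pick_codec([0, 8, 9, 101])

# With a G.722-only list, a narrowband-only offer has no match.
no_match = pick_codec([0, 8], supported=[PT_G722])
```

A G.722-only configuration therefore cannot agree on a codec with peers that offer only G.711, which is worth checking against the PBX or trunk before restricting the list.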
Capabilities
The SIP backend declares DTMF_SIGNALING (RFC 4733 out-of-band DTMF) and INTERRUPTION (cancel outbound audio mid-stream).
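For readers unfamiliar with RFC 4733, the sketch below decodes the 4-byte telephone-event payload that carries an out-of-band DTMF digit. It illustrates the wire format only; `parse_dtmf_event` is a hypothetical helper, and how roomkit parses events or what it passes to on_dtmf_received callbacks is not shown in this page.

```python
import struct

# RFC 4733 DTMF event codes 0-15: digits, star, hash, then A-D.
DTMF_EVENTS = "0123456789*#ABCD"


def parse_dtmf_event(payload: bytes) -> dict:
    """Decode a 4-byte RFC 4733 telephone-event payload."""
    if len(payload) < 4:
        raise ValueError("telephone-event payload needs at least 4 bytes")
    # Layout: event (8 bits), E/R/volume (8 bits), duration (16 bits).
    event, flags, duration = struct.unpack("!BBH", payload[:4])
    if event >= len(DTMF_EVENTS):
        raise ValueError(f"not a DTMF event code: {event}")
    return {
        "digit": DTMF_EVENTS[event],
        "end": bool(flags & 0x80),  # E bit: marks the end of the event
        "volume": flags & 0x3F,     # tone power level, sign dropped
        "duration": duration,       # in RTP timestamp units
    }


# Digit 5, end bit set, volume 10, duration 800 timestamp units.
example = parse_dtmf_event(bytes([5, 0x8A, 0x03, 0x20]))
```

At an 8 kHz RTP clock, a duration of 800 timestamp units corresponds to 100 ms of tone.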
X-header routing
Room and session IDs are extracted from X-Room-ID and X-Session-ID SIP headers. All X-headers are available in session.metadata["x_headers"].
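The extraction described above can be pictured with a small header parser. This is a sketch, not roomkit's parser: `extract_x_headers` is a hypothetical helper, the sample INVITE headers are made up, and it assumes header names keep their original casing and that no header is folded across lines.

```python
def extract_x_headers(raw_headers: str) -> dict:
    """Collect every X- header from a block of SIP header lines."""
    x_headers = {}
    for line in raw_headers.splitlines():
        name, sep, value = line.partition(":")
        # SIP header names are case-insensitive, so compare in lowercase.
        if sep and name.strip().lower().startswith("x-"):
            x_headers[name.strip()] = value.strip()
    return x_headers


# Made-up header block for illustration.
invite_headers = (
    "From: <sip:alice@example.com>;tag=1928301774\r\n"
    "X-Room-ID: support-42\r\n"
    "X-Session-ID: abc123\r\n"
)

x_headers = extract_x_headers(invite_headers)
room_id = x_headers.get("X-Room-ID")
session_id = x_headers.get("X-Session-ID")
```

On the PBX side, this means routing is controlled by adding X-Room-ID and X-Session-ID to the outgoing INVITE before it reaches the backend.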
Callbacks
In addition to the standard VoiceBackend callbacks, the SIP backend provides:
- on_call(callback): fired when an incoming INVITE is accepted
- on_call_disconnected(callback): fired when the remote party sends BYE