SIP Voice Backend

The SIP backend handles the full SIP call lifecycle: incoming INVITE → SDP negotiation → RTP media → BYE teardown. It is intended for PBX and SIP trunk integration.

Install with: pip install roomkit[sip]

Quick start

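from roomkit.voice.backends.sip import SIPVoiceBackend

backend = SIPVoiceBackend(
    local_sip_addr=("0.0.0.0", 5060),  # (host, port) for the SIP listener
    local_rtp_ip="10.0.0.5",           # IP address for RTP media binding
    rtp_port_start=10000,              # first RTP port to allocate
)
backend.on_call(handle_incoming_call)  # your own callback for incoming calls
await backend.start()                  # call from inside an async function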

See the full SIP example for a complete runnable script.

API Reference

SIPVoiceBackend

SIPVoiceBackend(*, local_sip_addr=('0.0.0.0', 5060), local_rtp_ip='0.0.0.0', advertised_ip=None, rtp_port_start=10000, rtp_port_end=20000, supported_codecs=None, dtmf_payload_type=101, user_agent=None, server_name='-', jitter_capacity=32, jitter_prefetch=0, skip_audio_gaps=True, rtp_inactivity_timeout=30.0, auth_users=None, auth_realm='roomkit')

Bases: SIPAuthMixin, SIPCallingMixin, SIPAudioMixin, VoiceBackend

VoiceBackend that handles incoming SIP calls with full lifecycle.

Listens for SIP INVITE requests, negotiates codecs via SDP, creates RTP sessions for audio streaming, and handles BYE/CANCEL for call teardown. Incoming calls are auto-accepted; an on_call callback lets the application route the session to a room.

Parameters:

  • local_sip_addr (tuple[str, int], default ('0.0.0.0', 5060)): (host, port) to bind the SIP listener.
  • local_rtp_ip (str, default '0.0.0.0'): IP address for RTP media binding.
  • advertised_ip (str | None, default None): Public IP to advertise in SDP c=/o= lines and SIP Contact/Via headers when behind NAT. RTP sockets still bind to local_rtp_ip. When None, the resolved local IP is used for everything.
  • rtp_port_start (int, default 10000): First RTP port to allocate.
  • rtp_port_end (int, default 20000): Last RTP port in the allocation range.
  • supported_codecs (list[int] | None, default None): List of payload type numbers to accept. When None, [PT_G722, PT_PCMU, PT_PCMA] is used.
  • dtmf_payload_type (int, default 101): RTP payload type for RFC 4733 DTMF events.
  • user_agent (str | None, default None): Value for the SIP User-Agent header in responses.
  • server_name (str, default '-'): SDP session name (s= line) in answers.
  • jitter_capacity (int, default 32): Maximum number of packets the RTP jitter buffer can hold. The default is about 640 ms at 20 ms/packet.
  • jitter_prefetch (int, default 0): Number of packets to accumulate before starting playout. The default starts playout immediately, optimised for low latency.
  • skip_audio_gaps (bool, default True): When True, gaps in the RTP stream are skipped rather than filled with silence.
  • rtp_inactivity_timeout (float, default 30.0): Seconds of RTP silence before forcing session disconnect (safety net for a missed BYE). Set to 0 to disable.
  • auth_users (dict[str, str] | None, default None): Optional mapping of username → password for inbound digest authentication. When set, incoming INVITEs without valid credentials are challenged with 401.
  • auth_realm (str, default 'roomkit'): Realm string used in the WWW-Authenticate challenge header.
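
As background for auth_users and auth_realm: in SIP digest authentication the caller proves knowledge of the password by hashing it together with the realm and a server-issued nonce. The sketch below shows the standard MD5 computation without qop, as a rough illustration only. It is not taken from roomkit's source; the helper names digest_response and verify, the nonce, and the request URI are all assumptions made for this example.

```python
import hashlib
import hmac


def _md5_hex(text: str) -> str:
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def digest_response(username, password, realm, nonce, method, uri):
    """Compute the digest response (no-qop form) that a client would send."""
    ha1 = _md5_hex(f"{username}:{realm}:{password}")
    ha2 = _md5_hex(f"{method}:{uri}")
    return _md5_hex(f"{ha1}:{nonce}:{ha2}")


def verify(auth_users, realm, nonce, method, uri, username, response):
    """Hypothetical server-side check against a username -> password mapping."""
    password = auth_users.get(username)
    if password is None:
        return False  # unknown user
    expected = digest_response(username, password, realm, nonce, method, uri)
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(expected, response)
```

A mapping such as {"alice": "s3cret"} passed as auth_users would play the role of the auth_users argument here, with auth_realm supplying the realm string.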

start async

start()

Start the SIP listener and prepare for incoming calls.

close async

close()

Disconnect all sessions, unregister, and stop UAS/transport.

on_dtmf_received

on_dtmf_received(callback)

Register a callback for inbound DTMF digits (RFC 4733).

on_call

on_call(callback)

Register a callback for incoming SIP calls.

Fired after the INVITE has been accepted and the RTP session is active. Accepts both sync and async callbacks. Can be used as a decorator:

@backend.on_call
async def handle_call(session):
    await kit.process_inbound(
        parse_voice_session(session, channel_id="voice")
    )
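
To illustrate what "accepts both sync and async callbacks" can look like in practice, here is a minimal sketch of a registry that calls each callback and awaits the result only when it is awaitable. This is a toy model, not roomkit's implementation; CallbackRegistry, register, and fire are hypothetical names.

```python
import asyncio
import inspect


class CallbackRegistry:
    """Toy callback registry; names are illustrative, not roomkit's API."""

    def __init__(self):
        self._callbacks = []

    def register(self, callback):
        self._callbacks.append(callback)
        return callback  # returning the function lets register() work as a decorator

    async def fire(self, session):
        for callback in self._callbacks:
            result = callback(session)
            if inspect.isawaitable(result):
                await result  # async callbacks return a coroutine to await


async def demo():
    registry = CallbackRegistry()
    seen = []

    @registry.register
    def sync_handler(session):
        seen.append(("sync", session))

    @registry.register
    async def async_handler(session):
        await asyncio.sleep(0)
        seen.append(("async", session))

    await registry.fire("call-1")
    return seen
```

Running asyncio.run(demo()) invokes both handlers in registration order with the same session value.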

on_call_disconnected

on_call_disconnected(callback)

Register a callback for remote BYE (call hangup).

In SIP, call disconnect and client disconnect are the same event (a BYE terminates the dialog). Callbacks registered here and via on_client_disconnected share the same list and are all fired on any disconnect. Do not register the same function via both methods.

on_client_disconnected

on_client_disconnected(callback)

Register a callback for client disconnection (base-class API).

In SIP, this is equivalent to on_call_disconnected: both register into the same callback list. Fired on remote BYE or RTP inactivity timeout.
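
The warning about registering the same function twice follows from the shared list. The toy model below sketches the effect, assuming the list is not deduplicated (which the warning implies but this page does not state). DisconnectHooks and fire_disconnect are hypothetical names, not part of roomkit.

```python
class DisconnectHooks:
    """Toy model: two registration methods backed by one callback list."""

    def __init__(self):
        self._disconnect_callbacks = []

    def on_call_disconnected(self, callback):
        self._disconnect_callbacks.append(callback)

    def on_client_disconnected(self, callback):
        self._disconnect_callbacks.append(callback)

    def fire_disconnect(self, session):
        for callback in self._disconnect_callbacks:
            callback(session)


calls = []


def cleanup(session):
    calls.append(session)


hooks = DisconnectHooks()
hooks.on_call_disconnected(cleanup)
hooks.on_client_disconnected(cleanup)  # same function, second registration
hooks.fire_disconnect("call-1")        # cleanup now runs twice for one hangup
```

Under that assumption, a cleanup routine registered through both methods would run twice per hangup, so pick one registration method per function.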

Codecs

The SIP backend negotiates codecs via SDP. Supported codecs:

Codec               Payload Type  Audio Rate          Quality
G.722               9             16 kHz (wideband)   Best; recommended for voice AI
G.711 µ-law (PCMU)  0             8 kHz (narrowband)  Standard
G.711 A-law (PCMA)  8             8 kHz (narrowband)  Standard

By default, the backend accepts all three codecs with G.722 preferred. You can restrict codecs via supported_codecs:

from roomkit.voice.backends.sip import SIPVoiceBackend, PT_G722, PT_PCMU, PT_PCMA

# G.722 only (wideband)
backend = SIPVoiceBackend(
    local_sip_addr=("0.0.0.0", 5060),
    local_rtp_ip="10.0.0.5",
    rtp_port_start=10000,
    supported_codecs=[PT_G722],
)

# G.711 only (narrowband)
backend = SIPVoiceBackend(
    local_sip_addr=("0.0.0.0", 5060),
    local_rtp_ip="10.0.0.5",
    rtp_port_start=10000,
    supported_codecs=[PT_PCMU, PT_PCMA],
)
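
Conceptually, restricting supported_codecs narrows the set of payload types that can match a caller's SDP offer. The sketch below shows one plausible selection rule: walk the local list in preference order and take the first type the peer also offers. Whether roomkit applies local or remote preference order is an assumption here, and pick_codec is a hypothetical helper. The constants are redefined locally so the snippet runs without roomkit; their values match the codec table in this section.

```python
# Static RTP payload type numbers, as listed in the codec table.
PT_PCMU = 0
PT_PCMA = 8
PT_G722 = 9


def pick_codec(offered, supported=(PT_G722, PT_PCMU, PT_PCMA)):
    """Return the first locally preferred payload type the peer also offers."""
    offered_set = set(offered)
    for payload_type in supported:
        if payload_type in offered_set:
            return payload_type
    return None  # no codec in common


print(pick_codec([0, 8, 9, 101]))                      # wideband wins when offered
print(pick_codec([8, 0], supported=(PT_PCMU, PT_PCMA)))  # local order decides
```

An offer with no overlap returns None, which a real endpoint would typically answer by rejecting the call.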

Capabilities

The SIP backend declares DTMF_SIGNALING (RFC 4733 out-of-band DTMF) and INTERRUPTION (cancel outbound audio mid-stream).
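
For readers unfamiliar with RFC 4733, out-of-band DTMF travels as small RTP payloads rather than audio tones. The sketch below decodes the 4-byte telephone-event payload defined by that RFC. It is independent of roomkit, and parse_telephone_event is a hypothetical helper name.

```python
import struct

DTMF_EVENTS = "0123456789*#ABCD"  # RFC 4733 event codes 0-15


def parse_telephone_event(payload: bytes):
    """Decode a telephone-event payload into (digit, end, volume, duration)."""
    if len(payload) < 4:
        raise ValueError("telephone-event payload must be at least 4 bytes")
    # Layout: event (8 bits), E/R/volume (8 bits), duration (16 bits, big-endian).
    event, flags, duration = struct.unpack("!BBH", payload[:4])
    if event >= len(DTMF_EVENTS):
        raise ValueError(f"not a DTMF event code: {event}")
    end = bool(flags & 0x80)  # E bit: set on the packets that end the event
    volume = flags & 0x3F     # tone power level, expressed in -dBm0
    return DTMF_EVENTS[event], end, volume, duration
```

The duration is counted in RTP timestamp units, so its length in milliseconds depends on the clock rate negotiated for the telephone-event payload type.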

X-header routing

Room and session IDs are extracted from X-Room-ID and X-Session-ID SIP headers. All X-headers are available in session.metadata["x_headers"].
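
As a sketch of how an application might use the captured headers for its own routing, the helper below reads a room ID from a metadata dictionary shaped like session.metadata and falls back to a default. The function name resolve_room_id, the "lobby" default, and the exact casing of keys stored under "x_headers" are assumptions for illustration.

```python
def resolve_room_id(metadata, default="lobby"):
    """Pick a room ID from captured X-headers, falling back to a default."""
    x_headers = metadata.get("x_headers") or {}
    # SIP header names are case-insensitive, so compare in lower case.
    lowered = {name.lower(): value for name, value in x_headers.items()}
    return lowered.get("x-room-id") or default


print(resolve_room_id({"x_headers": {"X-Room-ID": "support"}}))
print(resolve_room_id({}))
```

Inside an on_call handler this would be called as resolve_room_id(session.metadata).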

Callbacks

In addition to the standard VoiceBackend callbacks, the SIP backend provides:

  • on_call(callback): fired after an incoming INVITE has been accepted and the RTP session is active
  • on_call_disconnected(callback): fired when the call ends, on remote BYE or RTP inactivity timeout