SIP Voice Backend

The SIP backend handles the full SIP call lifecycle: incoming INVITE → SDP negotiation → RTP media → BYE teardown. It is intended for PBX and SIP trunk integration.
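The lifecycle above can be pictured as a small state machine. The sketch below is illustrative only: the state names, event names, and `advance` helper are hypothetical and are not part of roomkit.

```python
from enum import Enum, auto


class CallState(Enum):
    IDLE = auto()         # no dialog yet
    NEGOTIATING = auto()  # INVITE received, SDP offer/answer in progress
    ACTIVE = auto()       # RTP media flowing
    ENDED = auto()        # BYE or CANCEL processed


# Allowed transitions, keyed by (current state, event name).
TRANSITIONS = {
    (CallState.IDLE, "INVITE"): CallState.NEGOTIATING,
    (CallState.NEGOTIATING, "SDP_ANSWERED"): CallState.ACTIVE,
    (CallState.NEGOTIATING, "CANCEL"): CallState.ENDED,
    (CallState.ACTIVE, "BYE"): CallState.ENDED,
}


def advance(state, event):
    """Return the next state, or raise ValueError for an invalid event."""
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(f"{event!r} is not valid in state {state.name}") from None
```

Walking a call through `INVITE`, `SDP_ANSWERED`, and `BYE` ends in `CallState.ENDED`; a `BYE` before any `INVITE` raises `ValueError`.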

Install with: pip install roomkit[sip]

Quick start

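from roomkit.voice.backends.sip import SIPVoiceBackend

async def handle_incoming_call(session):
    ...  # route the accepted call to a room

# The calls below must run inside a coroutine, e.g. one passed to asyncio.run().
backend = SIPVoiceBackend(
    local_sip_addr=("0.0.0.0", 5060),
    local_rtp_ip="10.0.0.5",
    rtp_port_start=10000,
)
backend.on_call(handle_incoming_call)
await backend.start()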

See the full SIP example for a complete runnable script.

API Reference

SIPVoiceBackend

SIPVoiceBackend(*, local_sip_addr=('0.0.0.0', 5060), local_rtp_ip='0.0.0.0', rtp_port_start=10000, rtp_port_end=20000, supported_codecs=None, dtmf_payload_type=101, user_agent=None, server_name='-', rtp_inactivity_timeout=30.0)

Bases: VoiceBackend

VoiceBackend that handles incoming SIP calls with full lifecycle.

Listens for SIP INVITE requests, negotiates codecs via SDP, creates RTP sessions for audio streaming, and handles BYE/CANCEL for call teardown. Incoming calls are auto-accepted; an on_call callback lets the application route the session to a room.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| local_sip_addr | tuple[str, int] | (host, port) to bind the SIP listener. | ('0.0.0.0', 5060) |
| local_rtp_ip | str | IP address for RTP media binding. | '0.0.0.0' |
| rtp_port_start | int | First RTP port to allocate. | 10000 |
| rtp_port_end | int | Last RTP port in the allocation range. | 20000 |
| supported_codecs | list[int] \| None | List of payload type numbers to accept (default [PT_G722, PT_PCMU, PT_PCMA]). | None |
| dtmf_payload_type | int | RTP payload type for RFC 4733 DTMF events. | 101 |
| user_agent | str \| None | Value for the SIP User-Agent header in responses. | None |
| server_name | str | SDP session name (s= line) in answers. | '-' |
| rtp_inactivity_timeout | float | Seconds of RTP silence before forcing session disconnect (safety net for missed BYE). Set to 0 to disable. Default 30. | 30.0 |
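To make the rtp_inactivity_timeout semantics concrete, here is a minimal sketch of an inactivity check in which a timeout of 0 disables the safety net. The `InactivityWatchdog` class is hypothetical and only illustrates the idea; it is not how roomkit necessarily implements it.

```python
import time


class InactivityWatchdog:
    """Tracks the time since the last RTP packet for one session."""

    def __init__(self, timeout, clock=time.monotonic):
        self.timeout = timeout
        self._clock = clock          # injectable so tests can fake time
        self._last_packet = clock()  # treat creation as the first activity

    def packet_received(self):
        """Record activity; call this for every inbound RTP packet."""
        self._last_packet = self._clock()

    def expired(self):
        """Return True once the silence has lasted `timeout` seconds."""
        if self.timeout <= 0:  # 0 disables the check entirely
            return False
        return self._clock() - self._last_packet >= self.timeout
```

A monotonic clock is used so that wall-clock adjustments cannot trigger or suppress a disconnect.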

start async

start()

Start the SIP listener and prepare for incoming calls.

dial async

dial(to_uri, from_uri, proxy_addr, *, room_id=None, channel_id='voice', codec=PT_PCMU, auth=None, extra_headers=None, timeout=30.0)

Initiate an outbound SIP call.

Builds an SDP offer, sends INVITE via the UAC, waits for the remote party to answer (200 OK), then sets up an RTP session and returns a VoiceSession.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| to_uri | str | SIP URI of the callee (e.g. "sip:alice@example.com"). | required |
| from_uri | str | SIP URI of the caller (e.g. "sip:bot@example.com"). | required |
| proxy_addr | tuple[str, int] | (host, port) of the outbound SIP proxy. | required |
| room_id | str \| None | Room ID for the session (defaults to the call ID). | None |
| channel_id | str | Channel ID for the session. | 'voice' |
| codec | int | RTP payload type number (default PT_PCMU). | PT_PCMU |
| auth | Any \| None | Optional aiosipua.SipDigestAuth for 401/407 retry. | None |
| extra_headers | dict[str, str] \| None | Extra SIP headers to include in the INVITE. | None |
| timeout | float | Seconds to wait for the call to be answered. | 30.0 |

Returns:

| Type | Description |
| --- | --- |
| VoiceSession | An active VoiceSession for the established call. |

Raises:

| Type | Description |
| --- | --- |
| RuntimeError | If the call is rejected or the backend is not started. |
| TimeoutError | If the call is not answered within timeout seconds. |
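As an illustration of handling the two documented exceptions, the sketch below wraps dial() in a helper that returns None when the call is not established. The `place_call` name and the logging choices are hypothetical; only the dial() parameters and exception types come from the reference above.

```python
import logging

logger = logging.getLogger(__name__)


async def place_call(backend, to_uri, from_uri, proxy_addr, *, timeout=30.0):
    """Dial and return the session, or None if the call was not established."""
    try:
        return await backend.dial(to_uri, from_uri, proxy_addr, timeout=timeout)
    except TimeoutError:
        # Not answered within `timeout` seconds.
        logger.warning("no answer from %s after %.0f s", to_uri, timeout)
        return None
    except RuntimeError as exc:
        # Rejected by the remote side, or the backend was not started.
        logger.warning("call to %s failed: %s", to_uri, exc)
        return None
```

The helper takes the backend as an argument, so it could be called as `await place_call(backend, "sip:alice@example.com", "sip:bot@example.com", ("10.0.0.1", 5060))` once the backend has been started; the proxy address here is a placeholder.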

connect async

connect(room_id, participant_id, channel_id, *, metadata=None)

Return a pre-created session by metadata lookup.

For the SIP backend, sessions are created during INVITE handling. connect() is called after the INVITE handler has already set up the session. Pass session_id in metadata to look up the pre-created session.
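A minimal sketch of the lookup described above, assuming the caller already holds the session ID as a string. The `attach_session` helper is hypothetical; the connect() signature and the session_id metadata key are taken from this page.

```python
async def attach_session(backend, session_id, room_id, participant_id,
                         channel_id="voice"):
    """Fetch the session created during INVITE handling via connect()."""
    return await backend.connect(
        room_id,
        participant_id,
        channel_id,
        metadata={"session_id": session_id},  # key documented for the lookup
    )
```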

disconnect async

disconnect(session)

Disconnect a SIP session, sending BYE if the call is still active.

close async

close()

Disconnect all sessions, stop UAS and transport.
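Because start() and close() bracket the backend's lifetime, it can be convenient to pair them in an async context manager so that close() runs even if the application raises. This is a sketch; the `running` helper is hypothetical and not provided by roomkit.

```python
from contextlib import asynccontextmanager


@asynccontextmanager
async def running(backend):
    """Start the backend and guarantee close() on exit, even on error."""
    await backend.start()
    try:
        yield backend
    finally:
        # Runs on normal exit and when the body raises.
        await backend.close()
```

Usage would look like `async with running(backend): ...`, with the application's serving logic in the body.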

end_of_response

end_of_response(session)

Signal end of an AI response to the session pacer.

send_transcription async

send_transcription(session, text, role='user')

Log transcription text (no UI channel in SIP mode).

on_dtmf_received

on_dtmf_received(callback)

Register a callback for inbound DTMF digits (RFC 4733).

Accepts both sync and async callbacks. Can be used as a decorator:

@backend.on_dtmf_received
async def handle_dtmf(session, event):
    ...

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| callback | DTMFReceivedCallback | Function called with (session, dtmf_event). | required |
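A typical use of DTMF callbacks is collecting digits until a terminator such as "#". The sketch below keeps one buffer per session key. `DigitCollector` is hypothetical, and the way a digit is read from the event object is not specified on this page, so `event.digit` in the wiring comment is an assumption.

```python
from typing import Optional


class DigitCollector:
    """Accumulates DTMF digits per session until a terminator arrives."""

    def __init__(self, terminator="#"):
        self.terminator = terminator
        self._buffers = {}  # session key -> list of digits received so far

    def feed(self, session_key, digit) -> Optional[str]:
        """Add a digit; return the collected string once terminated."""
        if digit == self.terminator:
            return "".join(self._buffers.pop(session_key, []))
        self._buffers.setdefault(session_key, []).append(digit)
        return None


# Hypothetical wiring (assumes the event exposes the digit as `event.digit`):
#
# collector = DigitCollector()
#
# @backend.on_dtmf_received
# async def handle_dtmf(session, event):
#     pin = collector.feed(id(session), event.digit)
#     if pin is not None:
#         ...  # act on the completed input
```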

on_call

on_call(callback)

Register a callback for incoming SIP calls.

Fired after the INVITE has been accepted and the RTP session is active. Accepts both sync and async callbacks. Can be used as a decorator:

@backend.on_call
async def handle_call(session):
    await kit.process_inbound(
        parse_voice_session(session, channel_id="voice")
    )

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| callback | CallCallback | Function called with (session). | required |
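For readers wondering how one registration method can accept both sync and async callbacks, a common generic pattern is to call the function and await the result only if it is awaitable. This is a sketch of that pattern, not roomkit's actual dispatch code; `invoke` is a hypothetical name.

```python
import inspect


async def invoke(callback, *args):
    """Call `callback`, awaiting the result if it is awaitable."""
    result = callback(*args)
    if inspect.isawaitable(result):
        # Async callbacks return a coroutine, which must be awaited to run.
        result = await result
    return result
```

With this approach a plain `def handle_call(session)` and an `async def handle_call(session)` are both invoked correctly by the same dispatcher.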

on_call_disconnected

on_call_disconnected(callback)

Register a callback for remote BYE (call hangup).

Fired when the remote party sends BYE. Accepts both sync and async callbacks. Can be used as a decorator:

@backend.on_call_disconnected
async def handle_disconnect(session):
    ...

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| callback | CallCallback | Function called with (session). | required |
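Pairing on_call with on_call_disconnected makes it possible to keep a live view of calls in progress. The sketch below assumes session objects are hashable and that the application has no other on_call handler to preserve; `track_sessions` is a hypothetical helper.

```python
def track_sessions(backend):
    """Register callbacks that maintain a set of currently active sessions."""
    active = set()

    def on_call(session):
        active.add(session)

    def on_hangup(session):
        active.discard(session)  # discard: no error if already removed

    backend.on_call(on_call)
    backend.on_call_disconnected(on_hangup)
    return active
```

Since on_call_disconnected is described as firing on a remote BYE, sessions ended locally through disconnect() would presumably need to be removed from the set by the application.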

on_client_disconnected

on_client_disconnected(callback)

Register callback for client disconnection (base-class API).

Called by VoiceChannel and RealtimeVoiceChannel to receive automatic cleanup notifications when the SIP session ends.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| callback | TransportDisconnectCallback | Called with (session) when the remote party disconnects. | required |

Codecs

The SIP backend negotiates codecs via SDP. Supported codecs:

| Codec | Payload Type | Audio Rate | Quality |
| --- | --- | --- | --- |
| G.722 | 9 | 16 kHz (wideband) | Best; recommended for voice AI |
| G.711 µ-law (PCMU) | 0 | 8 kHz (narrowband) | Standard |
| G.711 A-law (PCMA) | 8 | 8 kHz (narrowband) | Standard |

By default, the backend accepts all three codecs with G.722 preferred. You can restrict codecs via supported_codecs:

from roomkit.voice.backends.sip import SIPVoiceBackend, PT_G722, PT_PCMU, PT_PCMA

# G.722 only (wideband)
backend = SIPVoiceBackend(
    local_sip_addr=("0.0.0.0", 5060),
    local_rtp_ip="10.0.0.5",
    rtp_port_start=10000,
    supported_codecs=[PT_G722],
)

# G.711 only (narrowband)
backend = SIPVoiceBackend(
    local_sip_addr=("0.0.0.0", 5060),
    local_rtp_ip="10.0.0.5",
    rtp_port_start=10000,
    supported_codecs=[PT_PCMU, PT_PCMA],
)
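To show what codec selection over SDP amounts to, the sketch below parses the payload types from an offer's m=audio line and picks the first locally preferred one. It is a simplified illustration: the helper names are hypothetical, the constants are redefined locally so the block stands alone, and whether roomkit follows local or offerer preference order is not stated on this page.

```python
PT_PCMU, PT_PCMA, PT_G722 = 0, 8, 9  # static RTP payload types (RFC 3551)


def offered_payload_types(sdp):
    """Extract payload types from the first m=audio line of an SDP body."""
    for line in sdp.splitlines():
        if line.startswith("m=audio "):
            # Format: m=audio <port> <proto> <fmt> <fmt> ...
            return [int(fmt) for fmt in line.split()[3:]]
    return []


def choose_codec(sdp, supported=(PT_G722, PT_PCMU, PT_PCMA)):
    """Return the first locally preferred payload type present in the offer."""
    offered = set(offered_payload_types(sdp))
    for pt in supported:
        if pt in offered:
            return pt
    return None  # no codec in common
```

For an offer containing `m=audio 40000 RTP/AVP 0 8 9 101`, the default preference order selects 9 (G.722), while `supported=(PT_PCMU, PT_PCMA)` selects 0 (PCMU).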

Capabilities

The SIP backend declares DTMF_SIGNALING (RFC 4733 out-of-band DTMF) and INTERRUPTION (cancel outbound audio mid-stream).

X-header routing

Room and session IDs are extracted from X-Room-ID and X-Session-ID SIP headers. All X-headers are available in session.metadata["x_headers"].
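Other X-headers can be read from the same metadata entry. The sketch below assumes session.metadata["x_headers"] is a mapping of header name to string value, which this page does not spell out; the `x_header` helper is hypothetical. Lookup is case-insensitive because SIP header names are.

```python
def x_header(metadata, name, default=None):
    """Look up an X-header by name, ignoring case; return default if absent."""
    headers = metadata.get("x_headers") or {}
    lowered = {key.lower(): value for key, value in headers.items()}
    return lowered.get(name.lower(), default)
```

Inside an on_call handler this could be used as `x_header(session.metadata, "X-Tenant", "default")`, where X-Tenant is a made-up header used purely for illustration.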

Callbacks

In addition to the standard VoiceBackend callbacks, the SIP backend provides:

  • on_call(callback) — fired when an incoming INVITE is accepted
  • on_call_disconnected(callback) — fired when the remote party sends BYE