Resampler¶

A pluggable audio resampler that converts between transport and internal pipeline formats. The pipeline uses it in two directions:

Inbound: transport format (e.g. 48 kHz stereo from WebRTC) to internal format (e.g. 16 kHz mono for VAD/STT)
Outbound: internal format back to transport format
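To give a feel for what the two directions mean in bytes, the sketch below computes frame sizes for the example formats above. The 20 ms frame duration and 16-bit samples are assumptions chosen for illustration, and `frame_bytes` is a hypothetical helper, not part of RoomKit.

```python
def frame_bytes(sample_rate: int, channels: int, sample_width: int, duration_ms: int) -> int:
    """Size in bytes of one PCM frame of the given duration."""
    samples_per_channel = sample_rate * duration_ms // 1000
    return samples_per_channel * channels * sample_width


# Assumed: 20 ms frames, 16-bit (2-byte) samples
transport_size = frame_bytes(48000, 2, 2, 20)  # 48 kHz stereo
internal_size = frame_bytes(16000, 1, 2, 20)   # 16 kHz mono
```

Under these assumptions an inbound frame shrinks from 3840 to 640 bytes, and the outbound direction grows it back by the same factor of six.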

Quick start¶

from roomkit.voice.pipeline import AudioPipelineConfig, AudioPipelineContract, AudioFormat

# Declare formats — the contract is the single source of truth
contract = AudioPipelineContract(
    transport_inbound_format=AudioFormat(sample_rate=48000, channels=2),
    transport_outbound_format=AudioFormat(sample_rate=48000, channels=2),
    internal_format=AudioFormat(sample_rate=16000, channels=1, sample_width=2),
)

config = AudioPipelineConfig(
    contract=contract,
    # ... other providers (vad, denoiser, etc.)
)

When contract is set and no explicit resampler is provided, the pipeline creates a LinearResamplerProvider automatically, so the common case needs no extra configuration.

Built-in providers¶

LinearResamplerProvider¶

Pure-Python resampler using linear interpolation. Handles channel conversion (mono/stereo), sample rate conversion, and sample width conversion. Zero external dependencies.

from roomkit.voice.pipeline import AudioPipelineConfig, LinearResamplerProvider

config = AudioPipelineConfig(
    resampler=LinearResamplerProvider(),
    contract=contract,
)

This is the same algorithm that was previously hardcoded in the pipeline engine. It's suitable for development and testing. For production use with high sample-rate ratios, consider writing a custom provider backed by a higher-quality library.
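To make "linear interpolation" concrete, here is a minimal sketch for 16-bit little-endian PCM. It illustrates the general technique only and is not RoomKit's code; the function names `resample_linear_mono16` and `stereo_to_mono16` are hypothetical.

```python
import struct


def resample_linear_mono16(pcm: bytes, src_rate: int, dst_rate: int) -> bytes:
    """Resample 16-bit little-endian mono PCM by linear interpolation."""
    if src_rate == dst_rate or len(pcm) < 2:
        return pcm
    n_src = len(pcm) // 2
    samples = struct.unpack(f"<{n_src}h", pcm[: n_src * 2])
    n_dst = max(1, n_src * dst_rate // src_rate)
    step = src_rate / dst_rate  # source samples advanced per output sample
    out = []
    for i in range(n_dst):
        pos = i * step
        left = min(int(pos), n_src - 1)
        right = min(left + 1, n_src - 1)  # clamp at the last source sample
        frac = pos - left
        # Weighted average of the two neighbouring source samples
        out.append(round(samples[left] * (1.0 - frac) + samples[right] * frac))
    return struct.pack(f"<{n_dst}h", *out)


def stereo_to_mono16(pcm: bytes) -> bytes:
    """Downmix interleaved 16-bit stereo to mono by averaging each L/R pair."""
    n = len(pcm) // 4
    samples = struct.unpack(f"<{n * 2}h", pcm[: n * 4])
    mono = [(samples[2 * i] + samples[2 * i + 1]) // 2 for i in range(n)]
    return struct.pack(f"<{n}h", *mono)
```

Note that this sketch applies no low-pass filter before downsampling, which is one reason a filtered resampler from a dedicated library tends to sound better at large rate ratios.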

MockResamplerProvider¶

Passes frames through unchanged and records all calls. Useful for testing.

from roomkit.voice.pipeline.resampler import MockResamplerProvider

resampler = MockResamplerProvider()
# ... run pipeline ...
assert len(resampler.calls) == 1
assert resampler.calls[0].target_rate == 16000

Custom providers¶

Implement ResamplerProvider to plug in a higher-quality algorithm:

from roomkit.voice.pipeline.resampler import ResamplerProvider
from roomkit.voice.audio_frame import AudioFrame


class SoxrResamplerProvider(ResamplerProvider):
    """High-quality resampler using libsoxr."""

    @property
    def name(self) -> str:
        return "soxr"

    def resample(
        self,
        frame: AudioFrame,
        target_rate: int,
        target_channels: int,
        target_width: int,
    ) -> AudioFrame:
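        # Fast path: the frame is already in the target format.
        if (
            frame.sample_rate == target_rate
            and frame.channels == target_channels
            and frame.sample_width == target_width
        ):
            return frame
        # ... your soxr/libsamplerate/scipy implementation ...
        # Placeholder so the skeleton never silently returns None.
        raise NotImplementedError("conversion backend not implemented")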

    def reset(self) -> None:
        """Reset any internal state between sessions."""

    def close(self) -> None:
        """Release native resources."""

resample() takes the target format as call-time parameters rather than constructor arguments, because the pipeline calls the same provider with different targets for inbound and outbound audio.
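The sketch below shows why per-call targets matter: one provider instance serves both directions. `Fmt`, `RecordingResampler`, and `run_round_trip` are hypothetical stand-ins written for this example, not RoomKit classes, and the stand-in provider records its targets instead of converting audio.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fmt:
    """Stand-in for an audio format (hypothetical, not RoomKit's AudioFormat)."""
    sample_rate: int
    channels: int
    sample_width: int = 2


class RecordingResampler:
    """Stand-in provider that records the target format of every call."""

    def __init__(self) -> None:
        self.targets = []

    def resample(self, frame, target_rate, target_channels, target_width):
        self.targets.append(Fmt(target_rate, target_channels, target_width))
        return frame  # a real provider would convert the audio here


def run_round_trip(resampler, transport: Fmt, internal: Fmt, frame):
    # Inbound: the target is the internal format.
    inbound = resampler.resample(
        frame, internal.sample_rate, internal.channels, internal.sample_width
    )
    # ... processing stages would run here ...
    # Outbound: same instance, but the target is now the transport format.
    return resampler.resample(
        inbound, transport.sample_rate, transport.channels, transport.sample_width
    )


transport = Fmt(48000, 2)
internal = Fmt(16000, 1)
resampler = RecordingResampler()
run_round_trip(resampler, transport, internal, frame=b"\x00\x00")
```

If the targets were fixed at construction, the same round trip would need two provider instances, one per direction.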

Auto-default behavior¶

| `resampler` | `contract` | Behavior |
|---|---|---|
| not set | not set | No resampling. Frames pass through unchanged. |
| not set | set | Auto-creates `LinearResamplerProvider`. |
| set | set | Uses the explicit provider. |
| set | not set | The resampler is stored but never invoked: without a contract there is no target format, so both inbound and outbound resampling are skipped. |
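The four combinations can be read as a small decision function. This is a sketch of the rules as stated in the table, not the pipeline's actual code; `resolve_resampler` and its string labels are hypothetical.

```python
from typing import Optional, Tuple


def resolve_resampler(
    resampler: Optional[object], contract: Optional[object]
) -> Tuple[str, bool]:
    """Return (which provider is held, whether resampling actually runs)."""
    if contract is None:
        # No contract means no target format, so nothing is resampled,
        # even when a provider was passed in.
        return ("explicit" if resampler is not None else "none", False)
    if resampler is None:
        # Contract without a provider: fall back to the linear default.
        return ("linear-default", True)
    return ("explicit", True)
```

The key point is that the contract, not the resampler, decides whether any conversion happens.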

Pipeline position¶

Inbound: first stage (before recorder tap, AEC, denoiser, VAD)
Outbound: last stage (after post-processors, recorder tap, AEC reference feed)

Lifecycle¶

The pipeline calls reset() on session start and close() on pipeline shutdown, following the same pattern as all other providers.
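A minimal sketch of that call order, assuming a provider that keeps per-session state. `TracingResampler` and `run_sessions` are hypothetical and only mimic the sequence described above; they are not RoomKit APIs.

```python
class TracingResampler:
    """Hypothetical provider that logs lifecycle calls and keeps per-session state."""

    def __init__(self) -> None:
        self.events = []
        self.frames_seen = 0  # state that must not leak across sessions

    def resample(self, frame, target_rate, target_channels, target_width):
        self.frames_seen += 1
        return frame

    def reset(self) -> None:
        # Session start: drop state left over from the previous session.
        self.frames_seen = 0
        self.events.append("reset")

    def close(self) -> None:
        # Pipeline shutdown: release native resources here.
        self.events.append("close")


def run_sessions(provider, sessions) -> None:
    """Mimic the lifecycle: reset() per session, close() once at shutdown."""
    for frames in sessions:
        provider.reset()
        for frame in frames:
            provider.resample(frame, 16000, 1, 2)
    provider.close()


provider = TracingResampler()
run_sessions(provider, sessions=[[b"a", b"b"], [b"c"]])
```

Because reset() runs per session while close() runs once, anything expensive to create (such as a native resampler handle) is best allocated once and only cleared in reset().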