Resampler¶
A pluggable audio resampler that converts between transport and internal pipeline formats. The pipeline uses it in two directions:
- Inbound: transport format (e.g. 48kHz stereo from WebRTC) to internal format (e.g. 16kHz mono for VAD/STT)
- Outbound: internal format back to transport format
Quick start¶
from roomkit.voice.pipeline import AudioPipelineConfig, AudioPipelineContract, AudioFormat
# Declare formats — the contract is the single source of truth
contract = AudioPipelineContract(
transport_inbound_format=AudioFormat(sample_rate=48000, channels=2),
transport_outbound_format=AudioFormat(sample_rate=48000, channels=2),
internal_format=AudioFormat(sample_rate=16000, channels=1, sample_width=2),
)
config = AudioPipelineConfig(
contract=contract,
# ... other providers (vad, denoiser, etc.)
)
When contract is set but no explicit resampler is provided, the pipeline auto-creates a LinearResamplerProvider. No extra configuration needed for the common case.
Built-in providers¶
LinearResamplerProvider¶
Pure-Python resampler using linear interpolation. Handles channel conversion (mono/stereo), sample rate conversion, and sample width conversion. Zero external dependencies.
from roomkit.voice.pipeline import AudioPipelineConfig, LinearResamplerProvider
config = AudioPipelineConfig(
resampler=LinearResamplerProvider(),
contract=contract,
)
This is the same algorithm that was previously hardcoded in the pipeline engine. It's suitable for development and testing. For production use with high sample-rate ratios, consider writing a custom provider backed by a higher-quality library.
MockResamplerProvider¶
Passes frames through unchanged and records all calls. Useful for testing.
from roomkit.voice.pipeline.resampler import MockResamplerProvider
resampler = MockResamplerProvider()
# ... run pipeline ...
assert len(resampler.calls) == 1
assert resampler.calls[0].target_rate == 16000
Custom providers¶
Implement ResamplerProvider to plug in a higher-quality algorithm:
from roomkit.voice.pipeline.resampler import ResamplerProvider
from roomkit.voice.audio_frame import AudioFrame
class SoxrResamplerProvider(ResamplerProvider):
"""High-quality resampler using libsoxr."""
@property
def name(self) -> str:
return "soxr"
def resample(
self,
frame: AudioFrame,
target_rate: int,
target_channels: int,
target_width: int,
) -> AudioFrame:
if (
frame.sample_rate == target_rate
and frame.channels == target_channels
and frame.sample_width == target_width
):
return frame
# ... your soxr/libsamplerate/scipy implementation ...
def reset(self) -> None:
"""Reset any internal state between sessions."""
def close(self) -> None:
"""Release native resources."""
The resample() method receives target format parameters (not fixed at construction) because the pipeline calls it with different targets for inbound vs outbound.
Auto-default behavior¶
resampler |
contract |
Behavior |
|---|---|---|
| not set | not set | No resampling. Frames pass through unchanged. |
| not set | set | Auto-creates LinearResamplerProvider. |
| set | set | Uses the explicit provider. |
| set | not set | Resampler is stored but inbound resampling is skipped (no target format). Outbound resampling is also skipped. |
Pipeline position¶
- Inbound: first stage (before recorder tap, AEC, denoiser, VAD)
- Outbound: last stage (after post-processors, recorder tap, AEC reference feed)
Lifecycle¶
The pipeline calls reset() on session start and close() on pipeline shutdown, following the same pattern as all other providers.