core.mixer¶
Mix audio signals at a target Signal-to-Noise Ratio (SNR) using symmetric amplitude scaling and RMS normalization.
The mixer follows a robust 5-step procedure:
- Pre-mixing normalization: Normalize speech and noise independently to the same RMS level
- Symmetric SNR splitting: Boost signal by SNR/2 dB and attenuate noise by SNR/2 dB
- Weighted sum: Mix as mixed = speech × mult_signal + noise × mult_noise
- Post-mixing RMS normalization: Re-normalize result to target RMS to correct energy accumulation
- SNR preservation: SNR is defined by the ratio of multipliers, independent of absolute signal level
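The procedure above can be sketched in a few lines of NumPy. This is an illustrative re-implementation, not the library's code: the names `rms`, `normalize_rms`, and `mix_at_snr_sketch` are hypothetical, and the sketch assumes both inputs are 1D arrays of equal length.

```python
import numpy as np


def rms(x: np.ndarray) -> float:
    """Root-mean-square level of a 1D signal."""
    return float(np.sqrt(np.mean(np.square(x))))


def normalize_rms(x: np.ndarray, target_rms: float, eps: float = 1e-12) -> np.ndarray:
    """Scale x so its RMS equals target_rms (eps guards against silent input)."""
    return x * (target_rms / max(rms(x), eps))


def mix_at_snr_sketch(speech, noise, snr_db, target_rms=0.1):
    """Hypothetical sketch of the five steps; assumes equal-length inputs."""
    # Step 1: bring speech and noise to the same RMS level.
    speech = normalize_rms(np.asarray(speech, dtype=np.float64), target_rms)
    noise = normalize_rms(np.asarray(noise, dtype=np.float64), target_rms)
    # Step 2: split the SNR symmetrically between the two components.
    mult_signal = 10.0 ** (snr_db / 2.0 / 20.0)
    mult_noise = 10.0 ** (-snr_db / 2.0 / 20.0)
    # Step 3: weighted sum.
    mixed = speech * mult_signal + noise * mult_noise
    # Step 4: re-normalize; a common gain does not change the SNR (step 5).
    return normalize_rms(mixed, target_rms)
```

Because the final normalization multiplies speech and noise by the same factor, the ratio set in step 2 survives unchanged in the output.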
mix_at_snr¶
Mix speech and noise at a target SNR using symmetric RMS scaling.
Signature
mix_at_snr(speech: np.ndarray, noise: np.ndarray, snr_db: float,
target_rms: float = 0.1) -> np.ndarray
Parameters
- speech: Speech signal (1D array)
- noise: Noise signal (1D array)
- snr_db: Target SNR in dB (symmetric split: signal boosted by SNR/2 dB, noise attenuated by SNR/2 dB)
- target_rms: Target RMS level for output normalization (default: 0.1, range 0.01–0.5 recommended)
Returns
Mixed audio normalized to target_rms.
Details
The multiplier applied to each component is computed as:
- mult_signal = 10^(snr_db / 2 / 20) (boost by SNR/2 dB)
- mult_noise = 10^(-snr_db / 2 / 20) (attenuate by SNR/2 dB)
Both signal and noise are normalized independently before mixing, ensuring consistent energy levels regardless of input loudness. After mixing, the result is re-normalized to target_rms, which controls final loudness without affecting SNR.
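As a quick arithmetic check of the multiplier formulas, the snippet below evaluates them for snr_db = 10 and confirms that the ratio of the two multipliers recovers the requested SNR, even after a common gain is applied. The gain value `g` is an arbitrary illustrative number, not something taken from the library.

```python
import math

snr_db = 10.0
mult_signal = 10 ** (snr_db / 2 / 20)   # boost by SNR/2 dB, about 1.778
mult_noise = 10 ** (-snr_db / 2 / 20)   # attenuate by SNR/2 dB, about 0.562

# With equal-RMS inputs, the mixed SNR is the ratio of the multipliers in dB.
snr_check = 20 * math.log10(mult_signal / mult_noise)

# Rescaling both components by a common gain g leaves the ratio unchanged,
# which is why the post-mix normalization does not affect the SNR.
g = 0.37  # arbitrary illustrative gain
snr_after_gain = 20 * math.log10((g * mult_signal) / (g * mult_noise))

print(f"mult_signal={mult_signal:.4f}, mult_noise={mult_noise:.4f}")
print(f"SNR={snr_check:.2f} dB, after common gain={snr_after_gain:.2f} dB")
```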
Example
Mix a 16 kHz speech signal with noise at +10 dB SNR:
from triton.core.mixer import mix_at_snr
import numpy as np
speech = np.random.randn(16000) # 1 second at 16 kHz
noise = np.random.randn(16000)
mixed = mix_at_snr(speech, noise, snr_db=10)
mix_at_snr_segmented¶
Mix multiple segments of audio at varying SNR levels with optional boundary smoothing.
Signature
mix_at_snr_segmented(speech_segments: list[np.ndarray],
noise_segments: list[np.ndarray],
snr_levels: list[float],
target_rms: float = 0.1,
smooth_transitions: bool = False,
transition_samples: int = 100) -> list[np.ndarray]
Parameters
- speech_segments: List of speech signal segments (1D arrays)
- noise_segments: List of noise signal segments (1D arrays, must match speech_segments length)
- snr_levels: List of SNR values in dB (must match segment count)
- target_rms: Target RMS level for each segment's output
- smooth_transitions: If True, smoothly interpolate amplitude multipliers across segment boundaries to avoid clicks or abrupt changes
- transition_samples: Number of samples to smooth across boundaries (default: 100)
Returns
List of mixed segments, each normalized to target_rms.
Details
This function is useful for processing long audio divided into chunks with varying degradation levels (e.g., different SNR per sentence in a speech corpus). When smooth_transitions=True, multiplier vectors are linearly interpolated across boundaries, creating a continuous amplitude envelope without audible discontinuities.
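One way to picture the boundary smoothing is as a per-sample gain envelope that holds each segment's multiplier and ramps linearly near every boundary. The sketch below is an assumption about how such an envelope could be built, not the library's implementation: `smoothed_gain_envelope` is a hypothetical helper, it treats `transition_samples` as the total ramp width centered on each boundary, and it assumes every segment is longer than the ramp.

```python
import numpy as np


def smoothed_gain_envelope(segment_lengths, gains, transition_samples=100):
    """Per-sample gain that holds each segment's value and ramps linearly
    across boundaries. Hypothetical helper, not part of the documented API."""
    total = int(sum(segment_lengths))
    # Start from a piecewise-constant envelope (one gain per segment).
    envelope = np.repeat(np.asarray(gains, dtype=np.float64), segment_lengths)
    boundaries = np.cumsum(segment_lengths)[:-1]
    half = transition_samples // 2
    for i, b in enumerate(boundaries):
        start = max(int(b) - half, 0)
        stop = min(int(b) + half, total)
        # Linear ramp from the previous segment's gain to the next one's.
        envelope[start:stop] = np.linspace(gains[i], gains[i + 1], stop - start)
    return envelope


# Speech multipliers for three segments at 5, 10, and 15 dB SNR.
snr_levels = [5, 10, 15]
speech_gains = [10 ** (s / 2 / 20) for s in snr_levels]
env = smoothed_gain_envelope([1000, 1000, 1000], speech_gains)
```

Multiplying the concatenated speech by `env` (and the noise by the corresponding noise envelope) replaces the step change at each boundary with a short ramp.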
Example
Mix three sentences at different SNR levels with smooth transitions:
import numpy as np
from triton.core.mixer import mix_at_snr_segmented
# Placeholder data: three 1-second segments at 16 kHz (replace with real audio)
speech_segments = [np.random.randn(16000) for _ in range(3)]
noise_segments = [np.random.randn(16000) for _ in range(3)]
snr_levels = [5, 10, 15]  # Progressively cleaner
mixed = mix_at_snr_segmented(
    speech_segments,
    noise_segments,
    snr_levels,
    smooth_transitions=True,
)
mix_babble¶
Mix multiple talker waveforms into a babble signal.
Signature
mix_babble(talkers: list[np.ndarray], target_rms: float | None = None, peak_normalize: bool = True, normalize_talkers: bool = True) -> np.ndarray
Args
- talkers: List of talker arrays
- target_rms: Optional RMS target for each talker before mixing
- peak_normalize: Peak-normalize final mix
- normalize_talkers: Whether to RMS-normalize talkers before mixing
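The arguments above suggest a normalize-then-sum flow, sketched below. This is a hypothetical illustration rather than the library's code. Several behaviors are assumptions not stated in this reference: talkers are truncated to the shortest length, a fallback level of 0.1 is used when `target_rms` is None, and peak normalization scales the mix to a peak of 1.0.

```python
import numpy as np


def mix_babble_sketch(talkers, target_rms=None, peak_normalize=True,
                      normalize_talkers=True):
    """Illustrative babble mixer; hypothetical, not the library implementation."""
    # Assumption: truncate all talkers to the shortest one so they can be summed.
    n = min(len(t) for t in talkers)
    level = 0.1 if target_rms is None else target_rms  # assumed fallback level
    mix = np.zeros(n, dtype=np.float64)
    for t in talkers:
        t = np.asarray(t[:n], dtype=np.float64)
        if normalize_talkers:
            t_rms = np.sqrt(np.mean(t ** 2))
            if t_rms > 0:
                t = t * (level / t_rms)  # equalize loudness across talkers
        mix += t
    if peak_normalize:
        peak = np.max(np.abs(mix))
        if peak > 0:
            mix = mix / peak  # assumed peak target of 1.0
    return mix
```

Normalizing each talker before summing keeps a single loud recording from dominating the babble.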
mix_babble_from_segments¶
Normalize talker segments, concatenate per talker, then mix talkers.
Signature
mix_babble_from_segments(talker_segments: list[list[np.ndarray]], target_rms: float | None = None, peak_normalize: bool = True) -> np.ndarray
Args
- talker_segments: Nested list where each inner list is one talker's segments
- target_rms: Optional RMS target applied per segment before concatenation
- peak_normalize: Peak-normalize final mix
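The normalize, concatenate, mix order can be illustrated with synthetic data. The sketch below is a hypothetical stand-in for the library function. `concat_normalized_segments` is an invented helper, and the RMS level of 0.1, the truncation to the shortest track, and the peak target of 1.0 are all assumptions.

```python
import numpy as np


def concat_normalized_segments(segments, target_rms=0.1):
    """RMS-normalize each segment, then join them into one talker track."""
    out = []
    for seg in segments:
        seg = np.asarray(seg, dtype=np.float64)
        seg_rms = np.sqrt(np.mean(seg ** 2))
        out.append(seg * (target_rms / seg_rms) if seg_rms > 0 else seg)
    return np.concatenate(out)


# Two talkers with segments recorded at very different levels (synthetic data).
rng = np.random.default_rng(0)
talker_segments = [
    [rng.standard_normal(400) * 0.01, rng.standard_normal(600) * 5.0],
    [rng.standard_normal(500) * 2.0, rng.standard_normal(500) * 0.3],
]
tracks = [concat_normalized_segments(segs) for segs in talker_segments]
n = min(len(t) for t in tracks)  # assumption: truncate to the shortest track
mix = np.sum([t[:n] for t in tracks], axis=0)
mix = mix / np.max(np.abs(mix))  # peak-normalize the final mix (assumed peak of 1.0)
```

Normalizing per segment, rather than per concatenated track, avoids level jumps inside a single talker's track when its segments come from recordings of different loudness.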