tonic.audio_augmentations
Module Contents
Classes
RandomTimeStretch – Time-stretch an audio sample by a fixed rate.
RandomPitchShift – Shift the pitch of a waveform by n_steps steps.
RandomAmplitudeScale – Scales the maximum amplitude of the incoming signal to a random amplitude chosen from a range.
AddWhiteNoise – Add white noise to the data sample with a known ratio.
RIR – Convolves a RIR (room impulse response) with the data sample.
- class tonic.audio_augmentations.RandomTimeStretch
Time-stretch an audio sample by a fixed rate.
- Parameters:
samplerate (float) – sample rate of the sample
sample_length (int) – sample length in seconds
factors (float) – range of desired factors for time stretch
aug_index (int) – index of the chosen factor for time stretch. It is chosen randomly from the desired range if not passed at initialization
caching (bool) – if we are caching, the DiskCached dataset will dynamically pass the copy index of the data item to the transform (to set aug_index). Otherwise, aug_index will be chosen randomly on every call of the transform
fix_length (bool) – if True, the time-stretched signal is returned at a fixed length (samplerate * sample_length)
audio (np.ndarray) – data sample
- Returns:
time-stretched data sample
- Return type:
np.ndarray
- samplerate: float
- sample_length: int
- factors: list
- aug_index: int = 0
- caching: bool = False
- fix_length: bool = True
- __call__(audio: numpy.ndarray)
- Parameters:
audio (numpy.ndarray) – data sample
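The effect of fix_length can be sketched with NumPy alone. The helper below is hypothetical and not part of tonic. It assumes that a stretched signal longer than samplerate * sample_length is truncated and a shorter one is zero-padded at the end; the library's own padding strategy may differ.

```python
import numpy as np


def fix_signal_length(audio: np.ndarray, samplerate: float, sample_length: int) -> np.ndarray:
    """Hypothetical helper: force the last axis to samplerate * sample_length samples."""
    target = int(samplerate * sample_length)
    current = audio.shape[-1]
    if current >= target:
        # Too long: keep only the first `target` samples of the time axis.
        return audio[..., :target]
    # Too short: zero-pad at the end of the time axis only (assumed strategy).
    pad_width = [(0, 0)] * (audio.ndim - 1) + [(0, target - current)]
    return np.pad(audio, pad_width)


# A one-channel signal that a stretch has shortened to 12000 samples.
stretched = np.ones((1, 12000), dtype=np.float32)
fixed = fix_signal_length(stretched, samplerate=16000, sample_length=1)
```

With fix_length disabled, the output length would instead vary with the chosen stretch factor, which matters when samples are later stacked into batches.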
- class tonic.audio_augmentations.RandomPitchShift
Shift the pitch of a waveform by n_steps steps.
- Parameters:
samplerate (float) – sample rate of the sample
factors (float) – range of desired factors for pitch shift
aug_index (int) – index of the chosen factor for pitch shift. It is chosen randomly from the desired range if not passed at initialization
caching (bool) – if we are caching, the DiskCached dataset will dynamically pass the copy index of the data item to the transform (to set aug_index). Otherwise, aug_index will be chosen randomly on every call of the transform
audio (np.ndarray) – data sample
- Returns:
pitch-shifted data sample
- Return type:
np.ndarray
- samplerate: float
- factors: list
- aug_index: int = 0
- caching: bool = False
- __call__(audio: numpy.ndarray)
- Parameters:
audio (numpy.ndarray) – data sample
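The interplay of factors, aug_index and caching described above can be sketched in plain Python. The function choose_factor is hypothetical and only mirrors the documented behaviour: with caching, the index is supplied from outside (for example, the copy index passed by the cached dataset); without caching, a new index is drawn on every call. The factor values are illustrative, and the real transform may select factors differently.

```python
import random


def choose_factor(factors, aug_index=0, caching=False, rng=random):
    """Hypothetical helper: pick one augmentation factor from `factors`."""
    if not caching:
        # No caching: draw a fresh index on every call.
        aug_index = rng.randrange(len(factors))
    # Caching: use the index that was supplied from outside.
    return factors[aug_index]


factors = [-2, -1, 1, 2]  # illustrative values only
cached_choice = choose_factor(factors, aug_index=3, caching=True)
random_choice = choose_factor(factors, rng=random.Random(0))
```

Fixing the index under caching keeps each cached copy of a sample tied to one specific augmentation, so repeated reads of the same copy are reproducible.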
- class tonic.audio_augmentations.RandomAmplitudeScale
Scales the maximum amplitude of the incoming signal to a random amplitude chosen from a range.
- Parameters:
samplerate (float) – sample rate of the sample
min_amp (float) – minimum of the amplitude range in volts
max_amp (float) – maximum of the amplitude range in volts
factors (float) – range of desired factors for amplitude scaling
aug_index (int) – index of the chosen factor for amplitude scaling. It is chosen randomly from the desired range if not passed at initialization
caching (bool) – if we are caching, the DiskCached dataset will dynamically pass the copy index of the data item to the transform (to set aug_index). Otherwise, aug_index will be chosen randomly on every call of the transform
data (np.ndarray) – input (single- or multi-channel) signal
- Returns:
scaled version of the signal.
- Return type:
np.ndarray
- samplerate: float
- min_amp: float = 0.058
- max_amp: float = 0.159
- factors: list
- aug_index: int = 0
- caching: bool = False
- __post_init__()
- __call__(audio: numpy.ndarray)
- Parameters:
audio (numpy.ndarray) – input (single- or multi-channel) signal
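As a rough illustration of peak scaling, the sketch below rescales a signal so that its maximum absolute value equals a target drawn from [min_amp, max_amp]. The function scale_to_random_peak is hypothetical, and the uniform draw is an assumption; the class itself selects from factors via aug_index, and how it builds that list in __post_init__ is not shown in this page.

```python
import numpy as np


def scale_to_random_peak(audio, min_amp=0.058, max_amp=0.159, rng=None):
    """Hypothetical helper: rescale so that max(|audio|) equals a random target."""
    rng = np.random.default_rng() if rng is None else rng
    # Assumption: the target peak is drawn uniformly from the amplitude range.
    target_peak = rng.uniform(min_amp, max_amp)
    peak = np.max(np.abs(audio))
    if peak == 0:
        # A silent signal has no amplitude to scale; return it unchanged.
        return audio.copy(), target_peak
    return audio * (target_peak / peak), target_peak


# One second of a 440 Hz tone at 16 kHz with a peak of about 0.8.
t = np.linspace(0.0, 1.0, 16000, endpoint=False)
tone = 0.8 * np.sin(2 * np.pi * 440.0 * t)
scaled, target = scale_to_random_peak(tone, rng=np.random.default_rng(0))
```

Because the whole array is multiplied by a single gain, the relative levels between channels of a multi-channel input are preserved.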
- class tonic.audio_augmentations.AddWhiteNoise
Add white noise to the data sample with a known ratio.
- Parameters:
samplerate (float) – sample rate of the sample
factors (float) – range of desired ratios for added noise
aug_index (int) – index of the chosen factor for noise. It is chosen randomly from the desired range if not passed at initialization
caching (bool) – if we are caching, the DiskCached dataset will dynamically pass the copy index of the data item to the transform (to set aug_index). Otherwise, aug_index will be chosen randomly on every call of the transform
audio (np.ndarray) – data sample
- Returns:
data sample with added noise
- Return type:
np.ndarray
- samplerate: float
- factors: list
- aug_index: int = 0
- caching: bool = False
- __call__(audio: numpy.ndarray)
- Parameters:
audio (numpy.ndarray) – data sample
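The page does not define how the "known ratio" is measured. The sketch below assumes one plausible reading: the ratio of noise RMS to signal RMS. The function add_white_noise is hypothetical and is not tonic's implementation, which may define the ratio differently (for example, relative to peak amplitude or as a signal-to-noise ratio).

```python
import numpy as np


def add_white_noise(audio, ratio, rng=None):
    """Hypothetical helper: add Gaussian white noise with rms(noise) = ratio * rms(audio)."""
    rng = np.random.default_rng() if rng is None else rng
    noise = rng.standard_normal(audio.shape)
    signal_rms = np.sqrt(np.mean(audio ** 2))
    noise_rms = np.sqrt(np.mean(noise ** 2))
    # Rescale the noise so the RMS ratio matches `ratio` exactly (assumed definition).
    noise = noise * (ratio * signal_rms / noise_rms)
    return audio + noise, noise


# One second of a 440 Hz tone at 16 kHz, with noise at 10% of the signal RMS.
t = np.linspace(0.0, 1.0, 16000, endpoint=False)
clean = np.sin(2 * np.pi * 440.0 * t)
noisy, noise = add_white_noise(clean, ratio=0.1, rng=np.random.default_rng(0))
```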
- class tonic.audio_augmentations.RIR
Convolves a RIR (room impulse response) with the data sample.
- Parameters:
samplerate (float) – sample rate of the sample
rir_audio (str) – path to a sample room impulse response in .wav format
caching (bool) – if we are caching, the DiskCached dataset will dynamically pass the copy index of the data item to the transform (to set aug_index). Otherwise, aug_index will be chosen randomly on every call of the transform
audio (np.ndarray) – data sample
- Returns:
data sample convolved with RIR
- Return type:
np.ndarray
- samplerate: float
- rir_audio: str
- caching: bool = False
- __call__(audio)
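Convolving a signal with a room impulse response can be illustrated with NumPy on toy arrays. The function apply_rir is hypothetical: it takes the impulse response as an array rather than loading it from the rir_audio path, and it assumes the output is trimmed back to the input length. The class may load, resample, normalise or trim differently.

```python
import numpy as np


def apply_rir(audio, rir):
    """Hypothetical helper: convolve a 1-D signal with a 1-D impulse response."""
    wet = np.convolve(audio, rir, mode="full")
    # Keep only as many samples as the input had, so the length is unchanged (assumed).
    return wet[: audio.shape[0]]


dry = np.array([1.0, 0.5, 0.25, 0.0, 0.0])
# Toy impulse response: direct sound plus one echo two samples later at half gain.
rir = np.array([1.0, 0.0, 0.5])
wet = apply_rir(dry, rir)
```

Each output sample is the input sample plus half of the sample two steps earlier, which is the echo encoded in the toy response. A response of [1.0] alone would leave the signal unchanged.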