tonic.audio_augmentations

`tonic.audio_augmentations`#

Module Contents#

Classes#

`RandomTimeStretch`	Time-stretch an audio sample by a fixed rate.
`RandomPitchShift`	Shift the pitch of a waveform by n_steps steps.
`RandomAmplitudeScale`	Scales the maximum amplitude of the incoming signal to a random amplitude chosen from a range.
`AddWhiteNoise`	Add white noise to the data sample with a known ratio.
`RIR`	Convolves the data sample with a RIR (room impulse response).

class tonic.audio_augmentations.RandomTimeStretch#

Time-stretch an audio sample by a fixed rate.

Parameters:

samplerate (float) – sample rate of the sample
sample_length (int) – sample length in seconds
factors (float) – range of desired factors for time stretch
aug_index (int) – index of the chosen factor for time stretch. It is chosen randomly from the desired range if not passed during initialization
caching (bool) – if caching is enabled, the DiskCached dataset dynamically passes the copy index of the data item to the transform (to set aug_index). Otherwise aug_index is chosen randomly on every call of the transform
fix_length (bool) – if True, the time-stretched signal is returned with a fixed length (samplerate * sample_length)
audio (np.ndarray) – data sample

Returns:

time-stretched data sample

Return type:

np.ndarray

samplerate: float#

sample_length: int#

factors: list#

aug_index: int = 0#

caching: bool = False#

fix_length: bool = True#

__call__(audio: numpy.ndarray)#

Parameters: audio (numpy.ndarray) – data sample
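The interplay of `factors`, `aug_index`, `caching` and `fix_length` described above can be sketched in plain NumPy. This is a minimal illustration, not tonic's implementation: the helper names `choose_factor` and `fix_length` are hypothetical, and zero-padding at the end of the signal is an assumption about how a fixed length might be enforced.

```python
import numpy as np


def choose_factor(factors, aug_index=0, caching=False, rng=None):
    """Pick one factor from the list (hypothetical helper).

    With caching, the index is supplied from outside (e.g. by the dataset);
    otherwise a fresh random index is drawn on every call.
    """
    if caching:
        return factors[aug_index]
    if rng is None:
        rng = np.random.default_rng()
    return factors[rng.integers(len(factors))]


def fix_length(audio, samplerate, sample_length):
    """Truncate or zero-pad the last axis to samplerate * sample_length samples.

    Zero-padding at the end is an assumption made for this sketch.
    """
    target = int(samplerate * sample_length)
    current = audio.shape[-1]
    if current >= target:
        return audio[..., :target]
    pad = [(0, 0)] * (audio.ndim - 1) + [(0, target - current)]
    return np.pad(audio, pad)


factors = [0.8, 1.0, 1.2]
chosen = choose_factor(factors, aug_index=2, caching=True)  # index set externally
stretched = np.ones((1, 10))  # stand-in for a time-stretched signal
fixed = fix_length(stretched, samplerate=4, sample_length=3)
```

Selecting by index rather than drawing a new random value is what lets a cached copy of a sample be tied to one specific augmentation.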

class tonic.audio_augmentations.RandomPitchShift#

Shift the pitch of a waveform by n_steps steps.

Parameters:

samplerate (float) – sample rate of the sample
factors (float) – range of desired factors for pitch shift
aug_index (int) – index of the chosen factor for pitch shift. It is chosen randomly from the desired range if not passed during initialization
caching (bool) – if caching is enabled, the DiskCached dataset dynamically passes the copy index of the data item to the transform (to set aug_index). Otherwise aug_index is chosen randomly on every call of the transform
audio (np.ndarray) – data sample

Returns:

pitch shifted data sample

Return type:

np.ndarray

samplerate: float#

factors: list#

aug_index: int = 0#

caching: bool = False#

__call__(audio: numpy.ndarray)#

Parameters: audio (numpy.ndarray) – data sample
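To give a feel for what a shift of n_steps means, the sketch below converts a step count into a frequency ratio. It assumes that one step is a semitone in 12-tone equal temperament, which is the common convention in audio libraries but is not stated on this page; the function name `semitones_to_ratio` and the example `factors` list are hypothetical.

```python
def semitones_to_ratio(n_steps, bins_per_octave=12):
    """Frequency ratio for a shift of n_steps steps.

    Assumes equal temperament with bins_per_octave steps per octave
    (an assumption of this sketch, not taken from the class docs).
    """
    return 2.0 ** (n_steps / bins_per_octave)


# Hypothetical list of pitch-shift factors, expressed in steps.
factors = [-2, -1, 0, 1, 2]
ratios = [semitones_to_ratio(n) for n in factors]
```

Under this assumption, a shift of 12 steps doubles every frequency in the signal and a shift of 0 leaves it unchanged.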

class tonic.audio_augmentations.RandomAmplitudeScale#

Scales the maximum amplitude of the incoming signal to a random amplitude chosen from a range.

Parameters:

samplerate (float) – sample rate of the sample
min_amp (float) – minimum of the amplitude range in volts
max_amp (float) – maximum of the amplitude range in volts
factors (float) – range of desired factors for amplitude scaling
aug_index (int) – index of the chosen factor for amplitude scaling. It is chosen randomly from the desired range if not passed during initialization
caching (bool) – if caching is enabled, the DiskCached dataset dynamically passes the copy index of the data item to the transform (to set aug_index). Otherwise aug_index is chosen randomly on every call of the transform
data (np.ndarray) – input (single- or multi-channel) signal

Returns:

scaled version of the signal.

Return type:

np.ndarray

samplerate: float#

min_amp: float = 0.058#

max_amp: float = 0.159#

factors: list#

aug_index: int = 0#

caching: bool = False#

__post_init__()#

__call__(audio: numpy.ndarray)#

Parameters: audio (numpy.ndarray) – input (single- or multi-channel) signal
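The scaling described above can be illustrated with a short NumPy sketch. It is not tonic's code: the function name `scale_to_random_amplitude` is hypothetical, and drawing the target uniformly from [min_amp, max_amp] is an assumption (the class may instead pick from a precomputed `factors` list). Only the default values 0.058 and 0.159 are taken from the attributes listed above.

```python
import numpy as np


def scale_to_random_amplitude(audio, min_amp=0.058, max_amp=0.159, rng=None):
    """Rescale audio so its peak absolute value equals a random target.

    Returns the scaled signal and the target amplitude that was drawn.
    Uniform sampling of the target is an assumption of this sketch.
    """
    if rng is None:
        rng = np.random.default_rng()
    target = rng.uniform(min_amp, max_amp)
    peak = np.max(np.abs(audio))
    if peak == 0:
        # A silent signal cannot be rescaled; return it unchanged.
        return audio.copy(), target
    return audio * (target / peak), target


rng = np.random.default_rng(0)
audio = np.sin(np.linspace(0, 20 * np.pi, 1000))[None, :]  # one channel
scaled, target = scale_to_random_amplitude(audio, rng=rng)
```

Because the peak is taken over the whole array, all channels of a multi-channel signal are scaled by the same gain in this sketch.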

class tonic.audio_augmentations.AddWhiteNoise#

Add white noise to the data sample with a known ratio.

Parameters:

samplerate (float) – sample rate of the sample
factors (float) – range of desired ratios for added noise
aug_index (int) – index of the chosen factor for noise. It is chosen randomly from the desired range if not passed during initialization
caching (bool) – if caching is enabled, the DiskCached dataset dynamically passes the copy index of the data item to the transform (to set aug_index). Otherwise aug_index is chosen randomly on every call of the transform
audio (np.ndarray) – data sample

Returns:

data sample with added noise

Return type:

np.ndarray

samplerate: float#

factors: list#

aug_index: int = 0#

caching: bool = False#

__call__(audio: numpy.ndarray)#

Parameters: audio (numpy.ndarray) – data sample
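The page does not say how the "known ratio" is defined. The sketch below assumes it is a signal-to-noise ratio in decibels, which is one common reading; both that interpretation and the function name `add_white_noise` are assumptions made for illustration, not tonic's API.

```python
import numpy as np


def add_white_noise(audio, snr_db, rng=None):
    """Add Gaussian white noise at a given signal-to-noise ratio in dB.

    Treating the ratio as an SNR in dB is an assumption of this sketch.
    """
    if rng is None:
        rng = np.random.default_rng()
    signal_power = np.mean(audio ** 2)
    # SNR(dB) = 10 * log10(signal_power / noise_power)
    noise_power = signal_power / (10.0 ** (snr_db / 10.0))
    noise = rng.normal(0.0, np.sqrt(noise_power), size=audio.shape)
    return audio + noise


rng = np.random.default_rng(0)
clean = np.sin(2 * np.pi * 440 * np.arange(16000) / 16000.0)
noisy = add_white_noise(clean, snr_db=10.0, rng=rng)
```

A lower `snr_db` gives a noisier result; with the choice above, each entry of a `factors` list would correspond to one noise level.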

class tonic.audio_augmentations.RIR#

Convolves the data sample with a RIR (room impulse response).

Parameters:

samplerate (float) – sample rate of the sample
rir_audio (str) – path to a sample room impulse response in .wav format
caching (bool) – if caching is enabled, the DiskCached dataset dynamically passes the copy index of the data item to the transform (to set aug_index). Otherwise aug_index is chosen randomly on every call of the transform
audio (np.ndarray) – data sample

Returns:

data sample convolved with RIR

Return type:

np.ndarray

samplerate: float#

rir_audio: str#

caching: bool = False#

__call__(audio)#
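Convolving a signal with a room impulse response can be sketched as follows. This is an illustration under stated assumptions rather than the class's actual behaviour: the impulse response is passed in as an array instead of being loaded from the `rir_audio` path, and both the energy normalization and the truncation to the input length are choices made for this sketch. The function name `apply_rir` is hypothetical.

```python
import numpy as np


def apply_rir(audio, rir):
    """Convolve a 1-D signal with a room impulse response.

    The RIR is normalized to unit energy and the output is cut to the
    length of the input; both steps are assumptions of this sketch.
    """
    rir = np.asarray(rir, dtype=float)
    rir = rir / np.sqrt(np.sum(rir ** 2))
    wet = np.convolve(audio, rir, mode="full")
    return wet[: audio.shape[-1]]


audio = np.arange(1.0, 9.0)           # toy signal: 1, 2, ..., 8
delayed = apply_rir(audio, [0.0, 0.0, 1.0])  # a pure 2-sample delay
```

A single delayed impulse simply shifts the signal, which makes it a convenient sanity check; a measured RIR has a long decaying tail and adds reverberation.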
