calpy.dsp package

Submodules

calpy.dsp.audio_features module

calpy.dsp.audio_features.dB_profile(signal, sampling_rate, time_step=0.01, frame_window=0.025)

Compute the signal amplitude in decibels (dB) over an entire conversation.

Args:
    signal (numpy.array(float)): Padded audio signal.
    sampling_rate (float): Sampling frequency in Hz.
    time_step (float, optional): The time interval (in seconds) between two dB values. Defaults to 0.01.
    frame_window (float, optional): The length of speech (in seconds) used to estimate dB. Defaults to 0.025.
Returns:
    numpy.array(float): The decibel values.
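
To make the roles of time_step and frame_window concrete, here is a minimal sketch of a frame-wise dB computation. It is not calpy's implementation: the function name db_profile_sketch, the RMS-based level, and the reference amplitude of 1.0 are all assumptions made for illustration.

```python
import numpy as np

def db_profile_sketch(signal, sampling_rate, time_step=0.01, frame_window=0.025):
    """Frame-wise RMS level in dB relative to amplitude 1.0 (assumed reference)."""
    signal = np.asarray(signal, dtype=float)
    hop = int(round(time_step * sampling_rate))        # samples between frame starts
    width = int(round(frame_window * sampling_rate))   # samples per frame
    n_frames = 1 + max(0, (len(signal) - width) // hop)
    levels = np.empty(n_frames)
    for i in range(n_frames):
        frame = signal[i * hop : i * hop + width]
        rms = np.sqrt(np.mean(frame ** 2))
        levels[i] = 20.0 * np.log10(max(rms, 1e-10))   # floor avoids log10(0)
    return levels

# One second of a 200 Hz tone with peak amplitude 0.5.
sampling_rate = 16000
t = np.arange(sampling_rate) / sampling_rate
tone = 0.5 * np.sin(2 * np.pi * 200 * t)
levels = db_profile_sketch(tone, sampling_rate)
```

With the defaults, each value summarises 25 ms of audio and consecutive values are 10 ms apart, so neighbouring frames overlap.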

calpy.dsp.audio_features.get_pause_length(pauses)

Compute the lengths of consecutive pauses.

Args:
    pauses (numpy array, bool): True indicates occurrence of pause.
Returns:
    res (numpy array): The lengths of consecutive pauses.
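
One way to obtain such lengths is to measure each run of consecutive True values. The sketch below assumes lengths are counted in frames; pause_run_lengths is a hypothetical helper, not part of calpy.

```python
import numpy as np

def pause_run_lengths(pauses):
    """Length (in frames) of each run of consecutive True values."""
    pauses = np.asarray(pauses, dtype=bool)
    # Pad with False on both sides so every run has a start and an end edge.
    padded = np.concatenate(([False], pauses, [False])).astype(int)
    edges = np.diff(padded)
    starts = np.flatnonzero(edges == 1)    # False -> True transitions
    ends = np.flatnonzero(edges == -1)     # True -> False transitions
    return ends - starts

pauses = np.array([False, True, True, False, True, False, True, True, True])
lengths = pause_run_lengths(pauses)   # three runs, of 2, 1 and 3 frames
```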

calpy.dsp.audio_features.mfcc_profile(signal, sampling_rate, time_step=0.01, frame_window=0.025, NFFT=512, nfilt=40, ceps=12)

Compute MFCCs for a long sound signal (usually an entire conversation).

Reference: http://haythamfayek.com/2016/04/21/speech-processing-for-machine-learning.html

Args:
    signal (numpy.array(float)): Padded audio signal.
    sampling_rate (float): Sampling frequency in Hz.
    time_step (float, optional): The time interval (in seconds) between two MFCC frames. Defaults to 0.01.
    frame_window (float, optional): The length of speech (in seconds) used to estimate MFCC. Defaults to 0.025.
    NFFT (int, optional): Number of points in the FFT. Defaults to 512.
    nfilt (int, optional): Number of frequency bands in Mel-scaling. Defaults to 40.
    ceps (int, optional): Number of mel-frequency cepstral coefficients to be retained. Defaults to 12.
Returns:
    numpy.array(): Calculated Mel-Frequency Cepstral Coefficients matrix.
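
As a rough illustration of how NFFT, nfilt and ceps interact, the sketch below computes MFCCs for a single frame, loosely following the steps of the referenced article (window, power spectrum, mel filterbank, log, DCT). All function names are hypothetical, and details such as the Hamming window and dropping the zeroth coefficient are assumptions rather than confirmed calpy behaviour.

```python
import numpy as np

def hz_to_mel(hz):
    return 2595.0 * np.log10(1.0 + hz / 700.0)

def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)

def mel_filterbank(sampling_rate, NFFT=512, nfilt=40):
    """Triangular filters evenly spaced on the mel scale, shape (nfilt, NFFT // 2 + 1)."""
    mel_points = np.linspace(0.0, hz_to_mel(sampling_rate / 2.0), nfilt + 2)
    bins = np.floor((NFFT + 1) * mel_to_hz(mel_points) / sampling_rate).astype(int)
    fbank = np.zeros((nfilt, NFFT // 2 + 1))
    for m in range(1, nfilt + 1):
        left, centre, right = bins[m - 1], bins[m], bins[m + 1]
        for k in range(left, centre):          # rising slope
            fbank[m - 1, k] = (k - left) / (centre - left)
        for k in range(centre, right):         # falling slope
            fbank[m - 1, k] = (right - k) / (right - centre)
    return fbank

def dct_ii_ortho(x):
    """Orthonormal DCT-II of a 1D array, written out to avoid extra dependencies."""
    n = len(x)
    k = np.arange(n)[:, None]
    i = np.arange(n)[None, :]
    basis = np.cos(np.pi * k * (2 * i + 1) / (2 * n))
    scale = np.full(n, np.sqrt(2.0 / n))
    scale[0] = np.sqrt(1.0 / n)
    return scale * (basis @ x)

def mfcc_frame(frame, sampling_rate, NFFT=512, nfilt=40, ceps=12):
    """MFCCs of one frame; coefficient 0 is dropped, the next `ceps` are kept."""
    windowed = frame * np.hamming(len(frame))
    power = (np.abs(np.fft.rfft(windowed, NFFT)) ** 2) / NFFT
    energies = mel_filterbank(sampling_rate, NFFT, nfilt) @ power
    log_energies = 20.0 * np.log10(np.maximum(energies, np.finfo(float).eps))
    return dct_ii_ortho(log_energies)[1 : ceps + 1]

# A 25 ms frame of synthetic noise at 16 kHz (400 samples).
rng = np.random.default_rng(0)
frame = rng.standard_normal(400)
mfcc = mfcc_frame(frame, 16000)
```

Repeating this for frames spaced time_step apart and stacking the results would give a matrix with one row per frame and ceps columns.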

calpy.dsp.audio_features.pause_length_histogram(pauses, min_silence_duration=0.01, bins=30)

Compute the histogram of pause lengths.

Args:
    pauses (numpy array, bool): True indicates occurrence of pause.
    min_silence_duration (float, optional): The minimum duration in seconds to be considered a pause. Defaults to 0.01.
    bins (int, optional): The number of equal-width bins in the given range. Defaults to 30.
Returns:
    hist (numpy array): The values of the histogram.
    bin_edges (numpy array, float): The bin edges (length(hist)+1) in seconds.
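
The relationship between hist and bin_edges matches numpy.histogram. A minimal sketch, assuming pause runs are measured in frames of 0.01 s each (the run lengths and the 0.015 s cutoff below are made-up illustration values):

```python
import numpy as np

time_step = 0.01                               # assumed seconds per frame
run_lengths = np.array([2, 1, 3, 40, 12])      # hypothetical pause runs, in frames
durations = run_lengths * time_step            # convert frames to seconds

min_silence_duration = 0.015
kept = durations[durations >= min_silence_duration]   # drop runs that are too short

hist, bin_edges = np.histogram(kept, bins=30)
```

Here bin_edges has one more entry than hist, and the counts in hist add up to the number of pauses that passed the duration filter.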

calpy.dsp.audio_features.pause_profile(signal, sampling_rate, min_silence_duration=0.01, time_step=0.01, frame_window=0.025)

Find pauses in audio.

Args:
    signal (numpy.array(float)): Audio signal.
    sampling_rate (float): Sampling frequency in Hz.
    min_silence_duration (float, optional): The minimum duration in seconds to be considered a pause. Defaults to 0.01.
    time_step (float, optional): The time interval (in seconds) between two consecutive pause estimates. Defaults to 0.01.
    frame_window (float, optional): The length of speech (in seconds) used to estimate pause. Defaults to 0.025.
Returns:
    numpy.array: 1D integer array of 0s and 1s, with 1s marking sounding frames.

calpy.dsp.audio_features.pitch_profile(signal, sampling_rate, time_step=0.01, frame_window=0.025, lower_threshold=75, upper_threshold=255)

Compute pitch for a long sound signal (usually an entire conversation).

Args:
    signal (numpy.array(float)): Padded audio signal.
    sampling_rate (float): Sampling frequency in Hz.
    time_step (float, optional): The time interval (in seconds) between two pitch estimates. Defaults to 0.01.
    frame_window (float, optional): The length of speech (in seconds) used to estimate pitch. Defaults to 0.025.
    lower_threshold (int, optional): Defaults to 75.
    upper_threshold (int, optional): Defaults to 255.
Returns:
    numpy.array(float): Estimated pitch in Hz.

calpy.dsp.audio_features.remove_long_pauses(inputfilename, outputfilename, long_pause=0.5, min_silence_duration=0.01)

Remove long pauses/silence in a wav file.

Args:
    inputfilename (string): File name of the input wav.
    outputfilename (string): File name of the output wav.
    long_pause (float, optional): Minimum duration of silence (in seconds) to be considered a long pause. Defaults to 0.5.
    min_silence_duration (float, optional): The minimum duration in seconds to be considered a pause. Defaults to 0.01.
Returns:
    None: Writes a wav file to disk.

calpy.dsp.yin module

calpy.dsp.yin.absolute_threshold(signal, threshold)

Absolute threshold. Step 4 in YIN.

Args:
    signal (numpy.array(float)): A short segment of normalised self-correlated audio d'(t, tau), as produced by normalisation(). 1D array-like.
    threshold (float): Threshold value.
Returns:
    float: The index tau.
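
A minimal sketch of Step 4, assuming the usual YIN rule: take the first lag whose normalised difference falls below the threshold, then follow that dip down to its local minimum. The fallback to the global minimum when no lag qualifies follows the YIN paper and may differ from what calpy does; absolute_threshold_sketch is a hypothetical name.

```python
import numpy as np

def absolute_threshold_sketch(cmnd, threshold=0.1):
    """Return the lag of the first dip of d'(tau) that falls below threshold."""
    tau = 1                                   # lag 0 is skipped: d'(0) = 1 by definition
    while tau < len(cmnd):
        if cmnd[tau] < threshold:
            # Walk down to the bottom of this dip.
            while tau + 1 < len(cmnd) and cmnd[tau + 1] < cmnd[tau]:
                tau += 1
            return tau
        tau += 1
    # Nothing below the threshold: fall back to the global minimum (excluding lag 0).
    return int(np.argmin(cmnd[1:])) + 1

cmnd = np.array([1.0, 0.9, 0.5, 0.08, 0.05, 0.3, 0.02, 0.6])
tau = absolute_threshold_sketch(cmnd, threshold=0.1)
```

In this example the first dip bottoms out at lag 4, so that lag is returned even though lag 6 holds the smallest value overall. Preferring the earliest qualifying dip is what guards against picking a multiple of the true period.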

calpy.dsp.yin.difference_function(signal)

Calculate the difference function of the signal. Steps 1 and 2 of YIN.

Args:
    signal (numpy.array(float)): A short audio signal. 1D array.
Returns:
    numpy.array(float): Equation (6) of YIN. The difference function d(t, tau). 1D array.
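
Equation (6) of YIN sums the squared differences between the signal and a copy of itself shifted by tau. The sketch below assumes an integration window of half the signal length, which is a common choice but not confirmed for calpy; difference_function_sketch is a hypothetical name.

```python
import numpy as np

def difference_function_sketch(signal):
    """d(tau) = sum over the window of (x[j] - x[j + tau])**2."""
    signal = np.asarray(signal, dtype=float)
    window = len(signal) // 2                 # assumed integration window W
    diff = np.zeros(window)                   # d(0) = 0 by construction
    for tau in range(1, window):
        delta = signal[:window] - signal[tau : tau + window]
        diff[tau] = np.sum(delta ** 2)
    return diff

# A 100 Hz sine sampled at 8 kHz has a period of 80 samples.
sampling_rate = 8000
t = np.arange(400) / sampling_rate
signal = np.sin(2 * np.pi * 100 * t)
d = difference_function_sketch(signal)
```

For this periodic input, d is close to zero at lags that are multiples of the period (80, 160) and largest near half a period (40).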

calpy.dsp.yin.instantaneous_pitch(signal, sampling_frequency, threshold=0.1)

Compute the fundamental frequency (based on YIN) as the pitch of a given (usually very short) time interval.

The code is an adaptation of https://github.com/ashokfernandez/Yin-Pitch-Tracking.

Args:
    signal (numpy.array(float)): Audio signal. 1D array.
    sampling_frequency (int): Sampling frequency in Hz.
    threshold (float, optional): Absolute threshold value as defined in Step 4 of YIN. Defaults to 0.1.
Returns:
    float: f0, the fundamental frequency in Hz (estimated speech pitch).
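
The sketch below chains the YIN steps documented in this module into a single function, to show how they fit together. It is an illustrative reimplementation under stated assumptions (half-length integration window, 0.0 returned when no dip is found), not calpy's code; yin_pitch_sketch is a hypothetical name.

```python
import numpy as np

def yin_pitch_sketch(signal, sampling_frequency, threshold=0.1):
    signal = np.asarray(signal, dtype=float)
    window = len(signal) // 2                 # assumed integration window

    # Steps 1-2: difference function d(tau).
    d = np.zeros(window)
    for tau in range(1, window):
        delta = signal[:window] - signal[tau : tau + window]
        d[tau] = np.sum(delta ** 2)

    # Step 3: cumulative mean normalised difference d'(tau), with d'(0) = 1.
    cmnd = np.ones(window)
    running_sum = np.cumsum(d[1:])
    lags = np.arange(1, window)
    cmnd[1:] = d[1:] * lags / np.maximum(running_sum, np.finfo(float).eps)

    # Step 4: first dip below the absolute threshold.
    tau = None
    for candidate in range(1, window):
        if cmnd[candidate] < threshold:
            while candidate + 1 < window and cmnd[candidate + 1] < cmnd[candidate]:
                candidate += 1
            tau = candidate
            break
    if tau is None:
        return 0.0                            # no dip found: treated as unvoiced here

    # Step 5: parabolic interpolation around the chosen lag.
    if 0 < tau < window - 1:
        a, b, c = cmnd[tau - 1], cmnd[tau], cmnd[tau + 1]
        denominator = a - 2.0 * b + c
        if denominator != 0:
            tau = tau + 0.5 * (a - c) / denominator

    return sampling_frequency / tau

# 50 ms of a 100 Hz sine at 8 kHz; the estimate should be close to 100 Hz.
sampling_frequency = 8000
t = np.arange(400) / sampling_frequency
f0 = yin_pitch_sketch(np.sin(2 * np.pi * 100 * t), sampling_frequency)
```

The lowest pitch such a function can detect depends on the frame length, because the lag must fit inside the integration window.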

calpy.dsp.yin.normalisation(signal)

Normalise the difference function by the cumulative mean. Step 3 of YIN.

Args:
    signal (numpy.array(float)): A short segment of self-correlated audio signal d(t, tau), as produced by difference_function(). 1D array.
Returns:
    numpy.array(float): Equation (8) of YIN. Normalised difference function d'(t, tau). 1D array.
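
Equation (8) of YIN divides d(tau) by the mean of d over lags 1 to tau, and defines the value at lag 0 as 1. A minimal sketch (normalisation_sketch is a hypothetical name, and the zero-division guard is an added assumption):

```python
import numpy as np

def normalisation_sketch(diff):
    """Cumulative mean normalised difference: d'(tau) = d(tau) * tau / sum(d[1..tau])."""
    diff = np.asarray(diff, dtype=float)
    cmnd = np.ones(len(diff))                  # d'(0) = 1 by definition
    running_sum = np.cumsum(diff[1:])          # sum of d(1), ..., d(tau)
    lags = np.arange(1, len(diff))
    # Guard against division by zero when the input starts with zeros.
    safe_sum = np.where(running_sum > 0, running_sum, 1.0)
    cmnd[1:] = diff[1:] * lags / safe_sum
    return cmnd

diff = np.array([0.0, 4.0, 2.0, 6.0])
cmnd = normalisation_sketch(diff)
```

For this input the lag-1 value is always 1, lag 2 gives 2 * 2 / 6, and lag 3 gives 6 * 3 / 12. Values below 1 mark lags where the difference is smaller than its running average.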

calpy.dsp.yin.parabolic_interpolation(signal, tau)

Parabolic interpolation on tau. Step 5 in YIN.

Args:
    signal (numpy.array(float)): A short segment of normalised self-correlated audio d'(t, tau), as produced by normalisation(). 1D array.
    tau (int): Estimated threshold (the lag index tau from Step 4).
Returns:
    float: A better estimate of tau.
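
The refinement fits a parabola through the value at tau and its two neighbours and returns the position of the vertex. A minimal sketch, with the edge handling chosen for illustration (parabolic_interpolation_sketch is a hypothetical name):

```python
def parabolic_interpolation_sketch(values, tau):
    """Refine an integer lag using the vertex of a parabola through three points."""
    if tau <= 0 or tau >= len(values) - 1:
        return float(tau)                      # no neighbour on one side: keep tau
    a, b, c = values[tau - 1], values[tau], values[tau + 1]
    denominator = a - 2.0 * b + c
    if denominator == 0:
        return float(tau)                      # the three points are collinear
    return tau + 0.5 * (a - c) / denominator

# Samples of (x - 3.3)**2 at integer x; the true minimum lies at x = 3.3.
values = [(x - 3.3) ** 2 for x in range(7)]
refined = parabolic_interpolation_sketch(values, 3)
```

Because the sampled curve here is an exact parabola, the refined lag recovers the true minimum up to floating-point rounding. On real data the result is only an approximation.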
