UPDATE TO COME!
Voice based gender recognition using:
The The Free ST American English Corpus dataset (SLR45) can be found on SLR45. It is a free American English corpus by Surfingtech, containing utterances from 10 speakers (5 females and 5 males). Each speaker has about 350 utterances.
The Mel-Frequency Cepstrum Coefficients (MFCC) are used here, since they deliver the best results in speaker verification. MFCCs are commonly derived as follows:
According to D. Reynolds in Gaussian_Mixture_Models: A Gaussian Mixture Model (GMM) is a parametric probability density function represented as a weighted sum of Gaussian component densities. GMMs are commonly used as a parametric model of the probability distribution of continuous measurements or features in a biometric system, such as vocal-tract related spectral features in a speaker recognition system. GMM parameters are estimated from training data using the iterative Expectation-Maximization (EM) algorithm or Maximum A Posteriori(MAP) estimation from a well-trained prior model.
This script require the follwing modules/libraries:
Libs can be installed as follows:
pip install -r requirements.txt