A Python library for audio feature extraction, classification, segmentation and applications

This doc contains general info. Click here for the complete wiki

News

[2020-06-05] Published medium article Basic Audio Handling: How to handle and process audio files in command-line and through basic Python programming. Please refer to this as introductory material for handling audio data.
[2020-03-20] pip package has been updated version 0.3.0
pyAudioAnalysis master [2019-11-19] contains major refactoring changes mainly in feature extraction. Please report possible issues that have not been fixed, or inconsistencies in the documentation.
Check out paura a python script for realtime recording and analysis of audio data
pyAudioAnalysis [2018-08-12] now ported to Python 3

General

pyAudioAnalysis is a Python library covering a wide range of audio analysis tasks. Through pyAudioAnalysis you can:

Extract audio features and representations (e.g. mfccs, spectrogram, chromagram)
Classify unknown sounds
Train, parameter tune and evaluate classifiers of audio segments
Detect audio events and exclude silence periods from long recordings
Perform supervised segmentation (joint segmentation - classification)
Perform unsupervised segmentation (e.g. speaker diarization)
Extract audio thumbnails
Train and use audio regression models (example application: emotion recognition)
Apply dimensionality reduction to visualize audio data and content similarities

Installation

Clone the source of this library:

git clone https://github.com/tyiannak/pyAudioAnalysis.git

Install dependencies:
```
pip install -r ./requirements.txt
```
Install using pip:
```
pip install -e .
```
(also works with pip3 now)

An audio classification example

More examples and detailed tutorials can be found at the wiki

pyAudioAnalysis provides easy-to-call wrappers to execute audio analysis tasks. Eg, this code first trains an audio segment classifier, given a set of WAV files stored in folders (each folder representing a different class) and then the trained classifier is used to classify an unknown audio WAV file

from pyAudioAnalysis import audioTrainTest as aT
aT.extract_features_and_train(["classifierData/music","classifierData/speech"], 1.0, 1.0, aT.shortTermWindow, aT.shortTermStep, "svm", "svmSMtemp", False)
aT.file_classification("data/doremi.wav", "svmSMtemp","svm")
Result:
(0.0, array([ 0.90156761,  0.09843239]), ['music', 'speech'])

In addition, command-line support is provided for all functionalities. E.g. the following command extracts the spectrogram of an audio signal stored in a WAV file: python audioAnalysis.py fileSpectrogram -i data/doremi.wav

Author