Learn Python – Python Audio Modules: Basic and Advanced

Python is a mainstream programming language today because of its user-friendly features. It also has many interesting modules and libraries that let users accomplish a great deal. One of the most fascinating aspects of Python is its audio modules. In this article, we will discuss several kinds of audio modules along with their distinctive features and advantages.

This article covers 10 audio modules and libraries in Python:

PYO
pyAudio
Dejavu
Mingus
hYPerSonic
Pydub
Simpleaudio
winsound
python-sounddevice
playsound

Let's go through these audio modules one by one.

1. PYO

PYO is a Python module written in C for creating digital signal processing scripts. It contains classes for processing a wide variety of audio signal types. With it, users can include signal processing chains directly in Python scripts or projects and manipulate the audio signals in real time through the interpreter.

The PYO module provides primitives such as mathematical operations on audio signals and basic signal processing (filters, delays, synthesis generators), and much more. It also includes more complex algorithms for sound granulation and other creative audio manipulations.

For example:

# To play a sound file:
from pyo import *
sound = Server().boot()
sound.start()
sound_file = SfPlayer("path/to/users/sound.aif", speed=1, loop=True).out()

# To granulate an audio buffer:
sound = Server().boot()
sound.start()
sound_nd = SndTable("path/to/users/sound.aif")
ev = HannTable()
ps = Phasor(freq=sound_nd.getRate() * .25, mul=sound_nd.getSize())
dr = Noise(mul=.001, add=.1)
granulate = Granulator(sound_nd, ev, [1, 1.001], ps, dr, 32, mul=.1).out()

# To generate melodies:
sound = Server().boot()
sound.start()
wv = SquareTable()
# Envelope points must lie inside the table (default size is 8192 samples)
ev = CosTable([(0, 0), (100, 1), (500, 0.3), (8191, 0)])
mt = Metro(0.135, 12).play()
ap = TrigEnv(mt, table=ev, dur=1, mul=.1)
pt = TrigXnoiseMidi(mt, dist='loopseg', x1=20, scale=1, mrange=(47, 74))
out = Osc(table=wv, freq=pt, mul=ap).out()

2. pyAudio

PyAudio is an open-source, cross-platform Python library for audio input and output. It should not be confused with pyAudioAnalysis, a separate library that covers a wide range of audio analysis tasks, mainly feature extraction, classification, segmentation and visualization.

With pyAudioAnalysis, users can classify unknown sounds, perform supervised and unsupervised segmentation, extract audio features and representations, detect audio events and exclude silence periods from long recordings, apply dimensionality reduction to visualize audio data and content similarities, and much more.

PyAudio provides bindings for PortAudio. Users can use it to play and record audio on different platforms, such as Windows, Mac and Linux. To play audio with PyAudio, the user has to write the audio data to a .Stream object.

For example:

import pyaudio
import wave
filename = 'example.wav'
# Set a chunk size of 1024 frames per read
chunksize = 1024
# Open the sound file as wavefile
wavefile = wave.open(filename, 'rb')
# Create an interface to PortAudio
portaudio = pyaudio.PyAudio()
# Open a .Stream object to write the WAV data to
# 'output=True' means that the audio will be played rather than recorded
streamobject = portaudio.open(format=portaudio.get_format_from_width(wavefile.getsampwidth()),
                channels=wavefile.getnchannels(),
                rate=wavefile.getframerate(),
                output=True)
# Read the first chunk of data
data_audio = wavefile.readframes(chunksize)
# Play the audio by writing the audio data to the streamobject
# readframes() returns an empty bytes object at the end of the file
while data_audio:
    streamobject.write(data_audio)
    data_audio = wavefile.readframes(chunksize)
# Close the file and the streamobject, then terminate PortAudio
wavefile.close()
streamobject.close()
portaudio.terminate()

As this shows, playing audio with PyAudio is somewhat more complex than with the other audio playback libraries. For that reason, it may not be the first choice for simply playing audio in a project or application.

However, PyAudio offers more low-level control, which lets users get and set parameters for their input and output devices. It also lets users check their CPU load and input/output activity.

PyAudio also lets users play and record audio in callback mode, where a specified callback function is called whenever new data is needed for playback or is available for recording. These features set PyAudio apart from other audio libraries and modules, and make it a good fit when the user needs more than simple playback.
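Callback mode can be sketched roughly as follows. This is a minimal illustration, not a definitive implementation: the helper names make_wave_callback and play_in_callback_mode are hypothetical, and PyAudio is imported lazily inside the playback function so that the callback helper can be tried without an audio device.

```python
import time
import wave


def make_wave_callback(wavefile, continue_flag):
    """Return a PyAudio-style callback that feeds frames from an open wave file.

    continue_flag is the value to return alongside the data
    (pyaudio.paContinue during normal playback).
    """
    def callback(in_data, frame_count, time_info, status):
        # PyAudio asks for frame_count frames each time it needs more data
        data = wavefile.readframes(frame_count)
        return (data, continue_flag)
    return callback


def play_in_callback_mode(filename):
    """Play a WAV file without a manual write loop (hypothetical helper)."""
    import pyaudio  # imported here so the helper above works without PyAudio

    with wave.open(filename, "rb") as wavefile:
        portaudio = pyaudio.PyAudio()
        stream = portaudio.open(
            format=portaudio.get_format_from_width(wavefile.getsampwidth()),
            channels=wavefile.getnchannels(),
            rate=wavefile.getframerate(),
            output=True,
            stream_callback=make_wave_callback(wavefile, pyaudio.paContinue),
        )
        # The stream runs in the background; poll until it has finished
        while stream.is_active():
            time.sleep(0.1)
        stream.close()
        portaudio.terminate()
```

Calling play_in_callback_mode('example.wav') would then play the file while the main thread stays free to do other work inside the polling loop.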

3. Dejavu

Dejavu is an open-source audio fingerprinting module for Python. It can memorize recorded audio by listening to it once and storing its fingerprint in a database. Later, when a song is played, either through microphone input or from a file on disk, Dejavu tries to match the audio against the fingerprints held in the database and returns the song or recording that was played earlier.

Dejavu excels at recognizing exact signals with a reasonable amount of noise. There are two ways in which a user can have Dejavu recognize audio:

By reading and processing audio files on disk.

By using the computer's microphone.

For example:

# The user should first create a MySQL database where Dejavu can store audio fingerprints.
# On the user's local setup:
$ mysql -u root -p
Enter password: *************

Now users can start fingerprinting their audio collection:

from dejavu import Dejavu
config = {
    "database": {
        "host": "",
        "user": "root",
        "password": <password entered in local setup>,
        "database": <name of the database the user created in local setup>,
    }
}
dejv = Dejavu(config)

4. Mingus

Mingus is a Python package used by programmers, musicians, researchers and composers for making and analyzing music and songs. It is a cross-platform, advanced music theory and notation package for Python with MIDI (Musical Instrument Digital Interface) file and playback support.

Mingus can be used to experiment with music theory, to build educational tools and song editors, and in many other applications that need to process and play music. Its music theory component covers topics such as scales, progressions, chords and intervals. These components are tested, and can be used to generate and recognize musical elements with the help of convenient shorthand.

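For example:

import mingus.core.notes as notes_m
# For valid notes (a plain note name such as "C"):
notes_m.is_valid_note("C")  # returns True

# For invalid notes (the space makes this name invalid):
notes_m.is_valid_note("D #")  # returns False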

5. hYPerSonic

hYPerSonic is a framework written in Python and C for building and running sound processing pipelines intended for real-time control. It is a low-level framework in which every byte counts, and it includes objects for oscillators, filters, file I/O, soundcard access and memory operations. It runs on Linux and OSX.

6. Pydub

Pydub is a Python library for manipulating audio and adding effects to it. It offers a simple, convenient, high-level interface that is based on FFmpeg and influenced by jQuery. It can be used for adding ID3 tags to audio, slicing it, and concatenating audio tracks. Pydub supports Python 2.6, 2.7, 3.2 and 3.3.

Users can open and save WAV files with pydub without any dependencies. However, an audio playback package has to be installed in order to play audio.

For example, the following code plays a WAV file with pydub:

from pydub import AudioSegment
from pydub.playback import play
sound_audio = AudioSegment.from_wav('example.wav')
play(sound_audio)

If the user wishes to play other audio formats, such as MP3 files, they need to install libav or FFmpeg.

After installing FFmpeg, only a small change to the code is needed to play an MP3 file.

For example:

from pydub import AudioSegment
from pydub.playback import play
sound_audio = AudioSegment.from_mp3('example.mp3')
play(sound_audio)

With AudioSegment.from_file(file_name, file_type), users can load, and then play, an audio file in any format that FFmpeg supports.

For example:

# Users can load a WMA file for playback:
sound = AudioSegment.from_file('example.wma', 'wma')

Pydub also lets users save audio in different file formats, calculate the length of audio files, and apply cross-fades.

7. Simpleaudio

Simpleaudio is a cross-platform Python library. It is also used for playing back WAV files without any dependencies. With simpleaudio, the script can wait for the WAV file to finish playing before it terminates.

For example:

import simpleaudio as simple_audio
filename = 'example.wav'
wave_object = simple_audio.WaveObject.from_wave_file(filename)
play_object = wave_object.play()
# Wait until audio has finished playing
play_object.wait_done()

A WAV file stores a sequence of bits representing the raw audio data, together with headers containing metadata in RIFF (Resource Interchange File Format) format.

For CD recordings, the industry standard is to store each audio sample (an individual data point related to air pressure) as a 16-bit value at 44100 samples per second.

To reduce file size, it can be sufficient to store some recordings, such as human speech, at a lower sampling rate, like 8000 samples per second. However, higher sound frequencies can then no longer be represented as accurately.
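The relationship between sampling rate, header metadata and file size can be checked with Python's standard wave module. The following is only a sketch under stated assumptions: the helper write_silence and the file names are made up for illustration, and silence is used in place of real audio.

```python
import os
import tempfile
import wave


def write_silence(path, sample_rate, seconds=1, channels=1, sample_width=2):
    """Write `seconds` of 16-bit silence and return the file size in bytes."""
    n_frames = sample_rate * seconds
    with wave.open(path, "wb") as wav_file:
        wav_file.setnchannels(channels)
        wav_file.setsampwidth(sample_width)  # 2 bytes = 16 bits per sample
        wav_file.setframerate(sample_rate)
        wav_file.writeframes(b"\x00" * (n_frames * channels * sample_width))
    return os.path.getsize(path)


with tempfile.TemporaryDirectory() as tmp_dir:
    cd_path = os.path.join(tmp_dir, "cd_quality.wav")
    speech_path = os.path.join(tmp_dir, "speech_quality.wav")
    cd_size = write_silence(cd_path, 44100)
    speech_size = write_silence(speech_path, 8000)

    # Read the header metadata back from the CD-quality file
    with wave.open(cd_path, "rb") as wav_file:
        header = wav_file.getparams()

print("header:", header.framerate, "Hz,", header.sampwidth, "bytes per sample")
print("bytes saved by using 8000 Hz:", cd_size - speech_size)
```

Each second of mono 16-bit audio costs two bytes per sample, so dropping from 44100 to 8000 samples per second saves (44100 - 8000) * 2 bytes for every second recorded.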

Some of the libraries and modules discussed in this article play and record bytes objects, while others use NumPy arrays to hold raw audio data. Both correspond to a sequence of data points that can be played back at a specified sample rate to produce audio.

In a NumPy array, each element can hold a 16-bit value corresponding to an individual sample, whereas in a bytes object each sample is stored as a pair of 8-bit values. The important difference between the two data types is that NumPy arrays are mutable and bytes objects are immutable, which makes NumPy arrays more suitable for generating audio and for more complex signal processing.
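The difference between the two representations can be seen in a small experiment. This is an illustrative sketch with made-up sample values, assuming NumPy is installed:

```python
import numpy as np

# Three 16-bit samples held in a NumPy array
samples = np.array([0, 1000, -1000], dtype=np.int16)

# The same samples as a bytes object: two 8-bit values per sample
raw = samples.tobytes()
print(len(samples), "samples ->", len(raw), "bytes")

# NumPy arrays are mutable, so a sample can be edited in place
samples[0] = 500

# bytes objects are immutable, so item assignment raises TypeError
try:
    raw[0] = 1
except TypeError:
    bytes_are_immutable = True
else:
    bytes_are_immutable = False

# The bytes can still be reinterpreted as 16-bit samples when needed
restored = np.frombuffer(raw, dtype=np.int16)
print(restored.tolist(), bytes_are_immutable)
```

Because tobytes() makes a copy, the later edit to the array does not affect the bytes object, which still holds the original three samples.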

Users can play NumPy arrays and bytes objects in simpleaudio with the simpleaudio.play_buffer() function. Before doing so, they should make sure that both NumPy and simpleaudio are installed.

For example, to generate a NumPy array corresponding to a 410 Hz tone:

import numpy as numpy
import simpleaudio as simple_audio
frequency = 410  # the played note will be 410 Hz
fsample = 44100  # 44100 samples per second will be played
second = 5  # note duration of 5 seconds
# Generate an array with second * fsample steps, ranging between 0 and second
tp = numpy.linspace(0, second, second * fsample, False)
# Generate a 410 Hz sine wave
note = numpy.sin(frequency * tp * 2 * numpy.pi)
# Ensure that the highest value is in the 16-bit range
audio = note * (2**15 - 1) / numpy.max(numpy.abs(note))
# Convert to 16-bit data
ado = audio.astype(numpy.int16)
# Start the playback (1 channel, 2 bytes per sample)
play_object = simple_audio.play_buffer(ado, 1, 2, fsample)
# Wait for playback to finish before exiting
play_object.wait_done()

8. winsound

winsound is a Python module that provides access to the basic sound-playing machinery of the Windows operating system.

With the winsound module, a WAV file can be played in just a few lines of code.

For example:

import winsound
filename = 'example.wav'
winsound.PlaySound(filename, winsound.SND_FILENAME)

The winsound module does not support any file format other than WAV. It also lets users beep their speakers with winsound.Beep(frequency, duration).

For example:

# Users can beep a 1010 Hz tone for 110 milliseconds:
import winsound
winsound.Beep(1010, 110)  # Beep at 1010 Hz for 110 milliseconds
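Because winsound exists only on Windows, a script meant to run elsewhere may want to guard the call. The wrapper below is a hypothetical sketch (the function name beep is not part of winsound); it also checks the 37 to 32767 Hz range that winsound.Beep accepts before calling it.

```python
import sys


def beep(frequency_hz, duration_ms):
    """Beep through winsound on Windows; return False on other platforms."""
    # winsound.Beep only accepts frequencies from 37 to 32767 Hz
    if not 37 <= frequency_hz <= 32767:
        raise ValueError("frequency must be between 37 and 32767 Hz")
    if sys.platform != "win32":
        return False  # winsound is not available on this platform
    import winsound  # Windows-only standard-library module
    winsound.Beep(frequency_hz, duration_ms)
    return True
```

On Windows, beep(1010, 110) plays the same tone as the example above; on other platforms it returns False without raising an ImportError.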

9. python-sounddevice

python-sounddevice is a Python module for cross-platform audio playback. It provides bindings for the PortAudio library and has convenient functions for playing and recording NumPy arrays that contain audio signals.

To play a WAV file, the user needs to install NumPy and soundfile, which opens WAV audio files as NumPy arrays.

For example:

import sounddevice as sound_device
import soundfile as sound_file
filename = 'example.wav'
# Extract the data and sampling rate from the file
data_set, fsample = sound_file.read(filename, dtype='float32')
sound_device.play(data_set, fsample)
# Wait until the file is done playing
status = sound_device.wait()

sound_file.read() extracts the raw audio data as well as the sampling rate of the file, which is stored in its RIFF (Resource Interchange File Format) header. sound_device.wait() ensures that the script only terminates after the audio has finished playing.

10. playsound

playsound is a Python module that lets users play a sound in a single line of code. It is a cross-platform module consisting of a single function, with no dependencies, for playing sounds.

For example:

from playsound import playsound
playsound('example.wav')

The playsound module is intended for WAV and MP3 files, and it may also work with other file formats.


In this article, we have discussed a number of Python libraries and modules that are used for playing and recording different kinds of audio files and sounds. We have explained the distinctive features and importance of each library and module for working with sound when developing and enhancing applications and software.