Learn Python – How to Convert Text to Speech in Python: Basic and Advanced

In this tutorial, we will learn how to convert human-language text into human-like speech.

Sometimes we prefer listening to content instead of reading it. We can multitask while listening to important file data. Python provides many APIs to convert text to speech. The Google Text-to-Speech API is popular and commonly known as the gTTS API.

The tool is very easy to use and offers many built-in functions, which can be used to save the text as an mp3 file.

We do not need to build a neural network and train a model to convert the file into speech, which is hard to achieve. Instead, we will use these APIs to complete the task.

The gTTS API can convert text files into different languages such as English, Hindi, German, Tamil, French, and many more. We can also play the audio speech in fast or slow mode.

However, as of its latest update, we cannot change the speech file; it is generated by the system and cannot be modified.

To convert text files into speech offline, we will use another library called pyttsx3.

Installation of gTTS API

Type the following command in the terminal to install the gTTS API.

pip install gTTS  

Then, install the additional module to work with gTTS.

pip install playsound  

Finally, install pyttsx3.

pip install pyttsx3  
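After running the three commands above, it can be handy to confirm that the modules are importable. The following is a minimal sketch using only the standard library; the helper name missing_modules is a hypothetical one chosen for this illustration, and the import names gtts, playsound, and pyttsx3 are the ones used later in this tutorial.

```python
from importlib.util import find_spec


def missing_modules(names):
    """Return the module names from `names` that cannot be found."""
    return [name for name in names if find_spec(name) is None]


# Import names of the three packages installed above
required = ["gtts", "playsound", "pyttsx3"]
missing = missing_modules(required)
if missing:
    print("Not installed yet:", ", ".join(missing))
else:
    print("All text-to-speech modules are installed.")
```

find_spec() only looks the module up, so this check does not import the libraries or touch the audio system.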

Let’s understand the working of gTTS API

import gtts  
from playsound import playsound  

As we can see, it is very easy to use; we need to import it and create a gTTS object, which is an interface to the Google Translate text-to-speech API.

# make a request to google to get synthesis  
t1 = gtts.gTTS("Welcome to javaTpoint")  

In the above line, we have passed the text to gTTS to obtain the actual audio speech. Now, save this audio as welcome.mp3.

# save the audio file  
t1.save("welcome.mp3")   

It will save the file in the current directory; we can listen to this file as follows:

# play the audio file  
playsound("welcome.mp3")  
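If the save step failed (for example, because there was no internet connection), there may be no file to play. The sketch below checks that the mp3 file exists and is not empty before playing it. The function name play_if_present and its player parameter are hypothetical helpers introduced for this illustration; by default it falls back to the playsound() function used above.

```python
from pathlib import Path


def play_if_present(mp3_path, player=None):
    """Play mp3_path if it exists and is non-empty; return True if it was played."""
    path = Path(mp3_path)
    if not path.is_file() or path.stat().st_size == 0:
        return False
    if player is None:
        # Imported lazily so the file check works even without playsound installed
        from playsound import playsound as player
    player(str(path))
    return True

# Example call:
# play_if_present("welcome.mp3")
```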

Output:

Please turn on the system volume and listen to the text we saved earlier.

Now, we will define the complete Python program for converting text into speech.

Python Program

# Import the gTTS module for text  
# to speech conversion  
from gtts import gTTS  
  
# This module is imported so that we can  
# play the converted audio  
  
from playsound import playsound  
  
# It is a text value that we want to convert to audio  
text_val = 'All the best for your exam.'  
  
# Here we are converting to the English language  
language = 'en'  
  
# Passing the text and language to the engine.  
# Here we have assigned slow=False, which tells  
# the module that the converted audio should  
# play at normal speed  
obj = gTTS(text=text_val, lang=language, slow=False)  
  
# Here we are saving the converted audio in an mp3 file named  
# exam.mp3  
obj.save("exam.mp3")  
  
# Play the exam.mp3 file  
playsound("exam.mp3")  

Output:

Explanation:

In the above code, we have imported the API and used gTTS. In this example, gTTS() takes three arguments –

The first argument is the text value that we want to convert into speech.

The second argument is the language code. gTTS supports many languages in which the text can be converted into an audio file.

The third argument represents the speed of the speech. We have passed slow as False; it means the speech will be at normal speed.

We saved the audio as exam.mp3, which can be accessed anytime, and then we used the playsound() function to listen to the audio file at runtime.
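The three arguments described above can be wrapped in a small reusable function. This is only a sketch: the name text_to_mp3 and the empty-text check are assumptions added for illustration, while gTTS(text=..., lang=..., slow=...) and save() are the same calls used in the program above.

```python
def text_to_mp3(text, out_path, lang="en", slow=False):
    """Convert `text` to speech with gTTS and save it as an mp3 at `out_path`."""
    if not text or not text.strip():
        raise ValueError("text must not be empty")
    # Imported here so the check above runs even if gTTS is not installed
    from gtts import gTTS

    gTTS(text=text, lang=lang, slow=slow).save(out_path)
    return out_path

# Example call (needs an internet connection):
# text_to_mp3("All the best for your exam.", "exam.mp3", lang="en", slow=False)
```

Passing slow=True instead would produce the slower reading mentioned earlier.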

The list of available languages

To get the available languages, use the following function –
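A minimal sketch of this lookup, assuming the function meant here is gtts.lang.tts_langs(), which returns a dictionary that maps language codes to language names. The two wrapper functions, available_languages and find_language_code, are hypothetical helpers added for illustration.

```python
def available_languages():
    """Return gTTS's mapping of language codes to language names."""
    # Imported lazily so the helper below can be used on any such dictionary
    from gtts.lang import tts_langs
    return tts_langs()


def find_language_code(name, langs):
    """Return the code whose language name matches `name` (case-insensitive), or None."""
    wanted = name.strip().lower()
    for code, language in langs.items():
        if language.lower() == wanted:
            return code
    return None

# Example calls:
# print(available_languages())
# find_language_code("Hindi", available_languages())
```

Printing the result of available_languages() gives a dictionary of language codes and names.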

Output:

{'af': 'Afrikaans', 'sq': 'Albanian', 'ar': 'Arabic', 'hy': 'Armenian', 'bn': 'Bengali', 'bs': 'Bosnian', 'ca': 'Catalan', 'hr': 'Croatian', 'cs': 'Czech', 'da': 'Danish', 'nl': 'Dutch', 'en': 'English', 'et': 'Estonian', 'tl': 'Filipino', 'fi': 'Finnish', 'fr': 'French', 'de': 'German', 'el': 'Greek', 'en-us': 'English (US)', 'gu': 'Gujarati', 'hi': 'Hindi', 'hu': 'Hungarian', 'is': 'Icelandic', 'id': 'Indonesian', 'it': 'Italian', 'ja': 'Japanese', 'en-ca': 'English (Canada)', 'jw': 'Javanese', 'kn': 'Kannada', 'km': 'Khmer', 'ko': 'Korean', 'la': 'Latin', 'lv': 'Latvian', 'mk': 'Macedonian', 'ml': 'Malayalam', 'mr': 'Marathi', 'en-in': 'English (India)'}

We have listed a few important languages and their codes. You can find almost every language in this library.

Offline API

We have used the Google API, but what if we want to convert text to speech offline? Python provides the pyttsx3 library, which looks for TTS engines pre-installed on our platform.

Let’s understand how to use the pyttsx3 library:

Example –

import pyttsx3  
# initialize Text-to-speech engine  
engine = pyttsx3.init()  
# convert this text to speech  
text = "Python is a great programming language"  
engine.say(text)  
# play the speech  
engine.runAndWait()  

In the above code, we have used the say() method and passed the text as an argument. It adds a phrase to the speaking queue, while the runAndWait() method runs the event loop until all queued commands have been processed.
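Because say() only queues a phrase, several phrases can be queued first and spoken with a single runAndWait() call. The sketch below illustrates this; speak_all is a hypothetical helper name, and the engine argument is assumed to be an object returned by pyttsx3.init().

```python
def speak_all(engine, phrases):
    """Queue every non-empty phrase with say(), then process the queue once."""
    queued = 0
    for phrase in phrases:
        if phrase:                # skip empty strings
            engine.say(phrase)    # only adds the phrase to the queue
            queued += 1
    if queued:
        engine.runAndWait()       # blocks until the queued phrases are spoken
    return queued

# Example call with a real engine:
# import pyttsx3
# speak_all(pyttsx3.init(), ["Python is a great programming language", "Goodbye"])
```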

It also provides some additional properties that we can use according to our needs. Let’s get the details of the speaking rate:

# get details of speaking rate  
rate = engine.getProperty("rate")  
print(rate)  

Output:

200

We can change the speaking rate as we want:

# setting new voice rate (faster)  
engine.setProperty("rate", 300)  
engine.say(text)  
engine.runAndWait()  

If we pass 100, the speech will be slower.

engine.setProperty("rate", 100)  
engine.say(text)  
engine.runAndWait()  

Now, let’s look at the voices that are available for speaking the text.

# get details of all voices available  
voices = engine.getProperty("voices")  
print(voices)  

Output:

[<pyttsx3.voice.Voice object at 0x000002D617F00A20>, <pyttsx3.voice.Voice object at 0x000002D617D7F898>, <pyttsx3.voice.Voice object at 0x000002D6182F8D30>]
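The printed objects are not very readable on their own. Each voice object carries attributes such as id and name, and a voice can be selected by passing its id to setProperty("voice", ...). The following sketch shows one way to do this; list_voices and use_voice are hypothetical helper names, and the voices that are actually available depend on the TTS engines installed on the platform.

```python
def list_voices(engine):
    """Return (id, name) pairs for every voice the engine reports."""
    return [(voice.id, voice.name) for voice in engine.getProperty("voices")]


def use_voice(engine, name_part):
    """Select the first voice whose name contains `name_part`; return its id or None."""
    for voice_id, name in list_voices(engine):
        if name and name_part.lower() in name.lower():
            engine.setProperty("voice", voice_id)
            return voice_id
    return None

# Example calls with a real engine:
# for voice_id, name in list_voices(engine):
#     print(voice_id, name)
# use_voice(engine, "english")
```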

In this tutorial, we have discussed converting a text file into speech using a third-party library. We also discussed the offline library. Using these, we can build our own virtual assistant.