This guide walks through setting up a Python Text-to-Speech (TTS) script on a fresh Debian 12 virtual machine using gTTS, from installing the required packages to generating an MP3 file. gTTS sends the text to Google's servers, so the virtual machine needs internet access.
Step 1: Install Required Package
- Open the terminal in your Debian 12 virtual machine.
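- Refresh the package index, then install ffmpeg, which is useful for playing or converting the generated audio (gTTS itself only needs network access to produce the MP3):
sudo apt update
sudo apt install ffmpeg -y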
Step 2: Set Up a Python Virtual Environment
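- Create and activate a virtual environment to isolate your project. On a fresh Debian 12 system, python3 -m venv typically fails until the python3-venv package is installed, so install it first:
sudo apt install python3-venv -y
python3 -m venv tts-env
source tts-env/bin/activate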
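Before installing anything with pip, it can be worth confirming that the virtual environment is really active. The snippet below is a minimal sketch of one way to check; the function name in_virtualenv is a hypothetical helper, not part of any library. It relies on the fact that inside a venv sys.prefix differs from sys.base_prefix.

```python
import sys


def in_virtualenv() -> bool:
    """Return True when the interpreter is running inside a venv."""
    # Inside a venv, sys.prefix points at the environment directory,
    # while sys.base_prefix still points at the system installation.
    return sys.prefix != sys.base_prefix


if __name__ == "__main__":
    print("virtual environment active:", in_virtualenv())
```

If this prints False after running the activate command, the shell is probably still using the system interpreter.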
Step 3: Install gTTS
- Within the virtual environment, install the gTTS library:
pip install gTTS
Step 4: Create the TTS Script
- Create a file named tts.py in your working directory.
- Add the following code to the file:
from gtts import gTTS


def text_to_speech_google(text, language="en", output_file="output.mp3"):
    # Generate speech using Google TTS
    tts = gTTS(text=text, lang=language, slow=False)
    # Save the generated speech to an audio file
    tts.save(output_file)
    print(f"Speech saved to {output_file}")


# Example usage
if __name__ == "__main__":
    text = "Hello, this is a test sentence!"
    output_file = "output.mp3"
    text_to_speech_google(text, output_file=output_file)
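The script above hard-codes its text and output path. As an optional variation, the sketch below reads them from the command line instead. It is an illustration rather than part of the original script: the names parse_args and main, and the -l and -o options, are assumptions chosen for this example. The gTTS call uses the same arguments as the script in this step.

```python
import argparse


def parse_args(argv=None):
    """Parse the text, language code, and output path from the command line."""
    parser = argparse.ArgumentParser(description="Convert text to speech with gTTS.")
    parser.add_argument("text", help="text to convert to speech")
    parser.add_argument("-l", "--language", default="en", help="language code, e.g. en")
    parser.add_argument("-o", "--output", default="output.mp3", help="output MP3 path")
    return parser.parse_args(argv)


def main(argv=None):
    args = parse_args(argv)
    # Imported here so that argument parsing (and --help) still works
    # when gTTS is not installed in the current environment.
    from gtts import gTTS

    gTTS(text=args.text, lang=args.language, slow=False).save(args.output)
    print(f"Speech saved to {args.output}")


# When saved as a script, finish with:
#     if __name__ == "__main__":
#         main()
```

Saved under a hypothetical name such as tts_cli.py, with the guard shown in the final comment added, it could be run as: python3 tts_cli.py "Good morning" -o morning.mp3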
Step 5: Run the Script
- Execute the script in the terminal:
python3 tts.py
- The script will generate an MP3 file (output.mp3) in the current directory, containing the spoken version of your text.
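To confirm that the run produced a usable file, a rough sanity check like the one below may help. It is a sketch only: looks_like_mp3 is a hypothetical helper, and it inspects just the first bytes of the file (an ID3 tag or an MPEG frame sync pattern), so it does not prove the audio is playable.

```python
from pathlib import Path


def looks_like_mp3(path) -> bool:
    """Rough check: file exists, is non-empty, and starts like MP3 data."""
    p = Path(path)
    if not p.is_file() or p.stat().st_size == 0:
        return False
    with p.open("rb") as f:
        header = f.read(3)
    # MP3 data starts either with an ID3 tag or with a frame sync:
    # 0xFF followed by a byte whose top three bits are all set.
    if header == b"ID3":
        return True
    return len(header) >= 2 and header[0] == 0xFF and (header[1] & 0xE0) == 0xE0


if __name__ == "__main__":
    print("output.mp3 looks valid:", looks_like_mp3("output.mp3"))
```

A False result usually means the request failed before any audio was written, for example because the machine has no network access.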
While gTTS provides a simple and free way to convert text into speech, it’s important to understand its limitations. The library relies on the text-to-speech endpoint behind Google Translate, which is not an officially documented API and is rate-limited to prevent overuse. The 5,000-character limit often quoted applies to the Google Translate interface; gTTS itself splits longer text into short chunks and sends one request per chunk, so a long input turns into many requests. There’s no official documentation specifying an overall daily or monthly usage cap, but frequent or excessive requests may trigger temporary blocks or errors.
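One common way to cope with temporary blocks or errors is to retry with increasing pauses between attempts. The sketch below is a generic helper, not a gTTS feature: retry_with_backoff and its parameters are hypothetical names, and it catches every exception for simplicity. In real use you would likely narrow that to the library's own error type (gTTS defines gTTSError in gtts.tts).

```python
import time


def retry_with_backoff(func, attempts=4, base_delay=2.0, sleep=time.sleep):
    """Call func(); on failure wait base_delay, 2*base_delay, ... and retry.

    Re-raises the last exception once all attempts are used up.
    """
    for attempt in range(attempts):
        try:
            return func()
        except Exception:
            if attempt == attempts - 1:
                raise
            # Exponential backoff: 2 s, 4 s, 8 s with the default settings.
            sleep(base_delay * (2 ** attempt))
```

With the function from tts.py in scope, a call could look like retry_with_backoff(lambda: text_to_speech_google("Hello")). The sleep parameter exists only so the pause can be replaced in tests.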
This makes gTTS suitable for small-scale or personal projects but not ideal for high-volume or enterprise applications. For larger needs, consider upgrading to Google’s Cloud Text-to-Speech API, which offers robust features, higher character limits, and commercial usage options.