forked from letsRobot/letsrobot
Google Cloud Text To Speech API
bmorrison4 edited this page Jul 20, 2019 · 5 revisions
The Google Cloud TTS API offers high-fidelity, ultra-realistic speech synthesis. A demo is available on its website, https://cloud.google.com/text-to-speech.
This is not a free service. Prices may vary, but at the time of writing, a free one-year trial is available with $300 of credit towards all Google Cloud Platform services.
- Multilingual
  - Supports 30+ voices in 13 languages and variants
- WaveNet Voices
  - Exclusive multilingual access to DeepMind WaveNet voices that provide the most natural sounding speech
- Text and SSML support
  - Customize your speech with SSML tags that allow you to add pauses, numbers, date & time formatting, and other pronunciation instructions
- Speaking Rate Tuning
  - Customize your speaking rate to be 4x faster or slower than normal rate.
- Pitch Tuning
  - Customize the pitch of your selected voice, up to 20 semitones more or less than the default output.
- Volume Gain Control
  - Increase the volume of the output by up to 16dB or decrease the volume up to -96dB.
- Audio Format Flexibility
  - Choose from a number of audio formats including mp3, Linear16, Ogg Opus.
- Audio Profiles (BETA)
  - Optimize for the type of speaker from which your speech is intended to play, such as headphones or phone lines.
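As an illustration of the SSML support listed above, here is a minimal sketch that builds an SSML string with a pause between sentences. The helper name `build_ssml` and the 500 ms default pause are illustrative assumptions, not part of the robot controller; `<speak>` and `<break>` are standard SSML elements.

```python
from xml.sax.saxutils import escape


def build_ssml(sentences, pause_ms=500):
    """Join sentences into one SSML document with a pause between them.

    Hypothetical helper for illustration; not part of the robot controller.
    """
    # Escape &, < and > so the text itself cannot inject extra SSML tags.
    escaped = [escape(sentence) for sentence in sentences]
    pause = '<break time="{}ms"/>'.format(pause_ms)
    return "<speak>" + pause.join(escaped) + "</speak>"


if __name__ == "__main__":
    print(build_ssml(["Hello there.", "I am a robot & proud."]))
```

Escaping the input first means that only the tags the helper adds itself end up in the markup.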
- Standard (non-WaveNet) voices: up to 4 million characters free, then USD $4.00 per 1 million characters.
- WaveNet voices: up to 1 million characters free, then USD $16.00 per 1 million characters.
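To get a feel for what these prices mean for a busy robot, the arithmetic can be sketched as below. The function name `estimate_cost` is hypothetical, and the sketch assumes the free allowance is applied once per billing period, which this page does not state. The prices are the ones quoted above and may have changed since.

```python
# Free allowances and prices (USD) as quoted on this page; check current pricing.
PRICING = {
    "standard": {"free_chars": 4000000, "usd_per_million": 4.00},
    "wavenet": {"free_chars": 1000000, "usd_per_million": 16.00},
}


def estimate_cost(chars, voice_type="standard"):
    """Estimate the charge in USD for `chars` synthesized characters.

    Assumes the free allowance is subtracted once per billing period.
    """
    tier = PRICING[voice_type]
    billable = max(0, chars - tier["free_chars"])
    return billable / 1000000.0 * tier["usd_per_million"]


if __name__ == "__main__":
    print(estimate_cost(5000000, "standard"))  # 1 million billable characters
    print(estimate_cost(1500000, "wavenet"))   # 0.5 million billable characters
```

Under these assumptions, 5 million standard characters cost $4.00 and 1.5 million WaveNet characters cost $8.00.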
- Select or create a GCP Project.
- Make sure that billing is enabled for your project.
- Enable the Cloud Text-to-Speech API.
- Set up authentication:
  - In the GCP console, go to the Create service account key page.
  - From the Service Account drop-down list, select New service account.
  - Don't select a role from the Role drop-down list. No role is required to access this service.
  - Click Create. A note appears, warning that this service account has no role.
  - Click Create without role. A JSON file that contains your key downloads to your computer.
Copy your JSON file to your robot. All of the following steps need to be run on your robot.
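After copying, it can be worth confirming that the file arrived intact. The following is a minimal sketch using only the standard library. The helper name `check_key_file` is hypothetical, and the field list covers only a few of the fields a service account key file normally contains.

```python
import json

# A few of the fields found in a Google service account key file.
REQUIRED_FIELDS = ("type", "project_id", "private_key", "client_email")


def check_key_file(path):
    """Return a list of problems with the key file (empty if it looks fine)."""
    try:
        with open(path) as handle:
            key = json.load(handle)
    except (IOError, OSError) as err:
        return ["cannot read {}: {}".format(path, err)]
    except ValueError as err:
        return ["{} is not valid JSON: {}".format(path, err)]
    if not isinstance(key, dict):
        return ["{} does not contain a JSON object".format(path)]
    problems = ["missing field: " + name
                for name in REQUIRED_FIELDS if name not in key]
    if key.get("type", "service_account") != "service_account":
        problems.append("type is not 'service_account'")
    return problems


if __name__ == "__main__":
    # Example path from this page; adjust it to wherever you copied the key.
    print(check_key_file("/home/pi/googlecloudkey.json") or "key file looks fine")
```

This only checks the file's shape; it does not contact Google or prove that the key is still valid.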
python -m pip install --upgrade google-cloud-texttospeech
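Once the package is installed, a quick standalone test can confirm that the key and the library work together before you touch the robot configuration. This is a sketch, not the controller's own code. It uses the class names from version 2.x and later of `google-cloud-texttospeech` (older 1.x releases expose them under `texttospeech.types` and `texttospeech.enums` instead). The function names and the `test.wav` output path are illustrative assumptions.

```python
def language_code_for(voice_name):
    """Return the language code at the front of a voice name, e.g. 'en-US'."""
    return "-".join(voice_name.split("-")[:2])


def synthesize_to_wav(text, key_file, voice_name="en-US-Wavenet-A",
                      out_path="test.wav"):
    """Synthesize `text` and write the result to `out_path` (hypothetical helper)."""
    # Imported here so language_code_for() works without the library installed.
    from google.cloud import texttospeech

    client = texttospeech.TextToSpeechClient.from_service_account_file(key_file)
    response = client.synthesize_speech(
        input=texttospeech.SynthesisInput(text=text),
        voice=texttospeech.VoiceSelectionParams(
            language_code=language_code_for(voice_name), name=voice_name),
        audio_config=texttospeech.AudioConfig(
            audio_encoding=texttospeech.AudioEncoding.LINEAR16),
    )
    # LINEAR16 responses include a WAV header, so the bytes are written as-is.
    with open(out_path, "wb") as out:
        out.write(response.audio_content)
    return out_path


if __name__ == "__main__":
    # Makes a billable API request; the key path is the example used on this page.
    print(synthesize_to_wav("Hello from the robot.", "/home/pi/googlecloudkey.json"))
```

If the call succeeds, playing `test.wav` on the robot confirms both the credentials and the audio path.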
- In `controller.conf`, set your tts `type` to `google_cloud`.
- Choose a voice from this list. Make the following changes in the `[google_cloud]` section of `controller.conf`.
- Set `key_file` to the full path of your key file (e.g. `/home/pi/googlecloudkey.json`).
- Set `voice` to the name of the voice you want to use. It needs to have a matching language code on the front (e.g. `en-US-Wavenet-A`).
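Put together, the settings from the steps above might look like the fragment embedded in the sketch below, which also reads them back with Python's `configparser`. The `[tts]` section name for the `type` setting is an assumption (this page only names the `[google_cloud]` section), and `read_tts_settings` is a hypothetical helper, not the controller's own loader. `read_string` requires Python 3.

```python
import configparser

# Example fragment; the [tts] section name is an assumption, see the note above.
EXAMPLE_CONF = """
[tts]
type=google_cloud

[google_cloud]
key_file=/home/pi/googlecloudkey.json
voice=en-US-Wavenet-A
"""


def read_tts_settings(text):
    """Parse controller.conf-style text and return the TTS-related settings."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    return {
        "type": parser.get("tts", "type"),
        "key_file": parser.get("google_cloud", "key_file"),
        "voice": parser.get("google_cloud", "voice"),
    }


if __name__ == "__main__":
    for name, value in sorted(read_tts_settings(EXAMPLE_CONF).items()):
        print("{} = {}".format(name, value))
```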
Here's a list of all the options:

| Variable | Default Value | Description |
|---|---|---|
| `ssml_enabled` | `False` | SSML (Speech Synthesis Markup Language) gives you more control over how the text is said, and lets you bleep words or inject audio into the speech. It is disabled by default because it can be readily abused. |
| `key_file` | | The JSON key needed to authorize the robot to use the API. |
| `voice` | `en-US-Wavenet-A` | The voice you want to use. A script that lists the supported voices is available in the optional directory. |
| `voice_pitch` | `0.0` | Changes the pitch of the voice. It can be between -20.0 and 20.0. |
| `voice_speaking_rate` | `1.0` | Changes the speed at which the voice talks. It can be between 0.25 and 4.0. |
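The defaults and ranges above can be checked before a value reaches the API. The following is a minimal sketch of such a check; `resolve_options` is a hypothetical helper and is not how the controller itself loads its configuration.

```python
# Defaults and allowed ranges as documented in the options table.
DEFAULTS = {
    "ssml_enabled": False,
    "voice": "en-US-Wavenet-A",
    "voice_pitch": 0.0,
    "voice_speaking_rate": 1.0,
}
LIMITS = {
    "voice_pitch": (-20.0, 20.0),
    "voice_speaking_rate": (0.25, 4.0),
}


def resolve_options(overrides):
    """Merge overrides into the defaults and reject out-of-range numbers."""
    options = dict(DEFAULTS)
    options.update(overrides)
    for name, (low, high) in LIMITS.items():
        value = float(options[name])
        if not low <= value <= high:
            raise ValueError("{} must be between {} and {}, got {}".format(
                name, low, high, value))
        options[name] = value
    return options


if __name__ == "__main__":
    print(resolve_options({"voice_pitch": "-4.5", "voice_speaking_rate": 1.25}))
```

Raising an error instead of silently clamping makes a mistyped value visible right away; clamping would be an equally reasonable choice.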