Manim Voiceover can plug into various speech synthesizers to generate voiceover audio. Below is a comparison of the available services, their pros and cons, and how to set them up.
.. py:currentmodule:: manim_voiceover.services
Manim Voiceover defines the :py:class:`~base.SpeechService` class for adding new speech synthesizers. The classes introduced below are all derived from :py:class:`~base.SpeechService`.
Speech service | Quality | Can run offline? | Paid / requires an account? | Notes |
---|---|---|---|---|
:py:class:`~recorder.RecorderService` | N/A | N/A | N/A | This is a utility class to record your own voiceovers with a microphone. |
:py:class:`~azure.AzureService` | Very good, human-like | No | Yes | Azure gives a free TTS quota of 500 minutes/month. However, registration still requires a credit or debit card. See the Azure free account FAQ for more details. |
:py:class:`~elevenlabs.ElevenLabsService` | Very good, human-like | No | Yes | Requires an ElevenLabs account, which you can create on the ElevenLabs website. |
:py:class:`~coqui.CoquiService` | Good, human-like | Yes | No | Requires PyTorch to run. May be difficult to set up on certain platforms. |
:py:class:`~gtts.GTTSService` | Good | No | No | It's a free API subsidized by Google, so it may stop working in the future. |
:py:class:`~openai.OpenAIService` | Very good, human-like | No | Yes | Requires an OpenAI developer account. Sign up on the OpenAI platform, and see the pricing page for more details. |
:py:class:`~pyttsx3.PyTTSX3Service` | Bad | Yes | No | Requires espeak. Does not work reliably on Mac. |
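The trade-offs in the table can also be written down as data, which may help when narrowing the choice by constraint. The sketch below is purely illustrative: ``ServiceInfo``, ``SERVICES`` and ``pick_services`` are hypothetical names that are not part of Manim Voiceover, and the values are simply transcribed from the table above.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class ServiceInfo:
    """One row of the comparison table (hypothetical helper type)."""

    name: str
    offline: Optional[bool]  # None stands for "N/A" in the table
    paid: Optional[bool]     # None stands for "N/A" in the table


# Values transcribed from the comparison table.
SERVICES = [
    ServiceInfo("RecorderService", None, None),
    ServiceInfo("AzureService", False, True),
    ServiceInfo("ElevenLabsService", False, True),
    ServiceInfo("CoquiService", True, False),
    ServiceInfo("GTTSService", False, False),
    ServiceInfo("OpenAIService", False, True),
    ServiceInfo("PyTTSX3Service", True, False),
]


def pick_services(offline: bool, paid: bool) -> List[str]:
    """Return the names of services that match both constraints."""
    return [s.name for s in SERVICES if s.offline is offline and s.paid is paid]


if __name__ == "__main__":
    # List the services that run offline and need no paid account.
    print(pick_services(offline=True, paid=False))
```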
It is on our roadmap to provide a high-quality TTS engine that runs locally for free. If you have any suggestions, please let us know in the Discord server.
:py:class:`~recorder.RecorderService` is not a speech synthesizer but a utility class for recording your own voiceovers with a microphone. It provides a command-line interface for recording voiceovers during rendering.
Install Manim Voiceover with the ``recorder`` extra in order to use :py:class:`~recorder.RecorderService`:
pip install "manim-voiceover[recorder]"
Refer to the example usage to get started.
As of now, the highest quality text-to-speech service available in Manim Voiceover is Microsoft Azure Speech Service. To use it, you will need to create an Azure account.
Tip: Azure currently offers 500 minutes/month of free TTS, which is more than enough for most projects.
Install Manim Voiceover with the ``azure`` extra in order to use :py:class:`~azure.AzureService`:
pip install "manim-voiceover[azure]"
Then, you need to find out your subscription key and service region:
- Sign in to the Azure portal and create a new Speech Service resource.
- Go to the Azure Cognitive Services page.
- Click on the resource you created and go to the ``Keys and Endpoint`` tab. Copy the ``Key 1`` and ``Location`` values.
Create a file called ``.env`` that contains your authentication information in the same directory where you call Manim.
AZURE_SUBSCRIPTION_KEY="..." # insert Key 1 here
AZURE_SERVICE_REGION="..." # insert Location here
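Before rendering, it can be useful to confirm that the ``.env`` file really defines both variables. The following is a minimal sketch using only the standard library. ``read_env_file`` and ``missing_keys`` are hypothetical helpers written for illustration, and the parser only understands simple ``KEY="value"`` lines with optional trailing ``#`` comments, like the two lines shown above. It is a sanity check, not a description of how Manim Voiceover itself loads the file.

```python
from pathlib import Path
from typing import Dict, Iterable, List


def read_env_file(path: str) -> Dict[str, str]:
    """Parse simple KEY="value" lines, skipping blanks and comment lines."""
    values: Dict[str, str] = {}
    for raw in Path(path).read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, rest = line.partition("=")
        rest = rest.strip()
        if rest[:1] in ('"', "'"):
            # Quoted value: take everything up to the closing quote.
            quote = rest[0]
            end = rest.find(quote, 1)
            value = rest[1:end] if end != -1 else rest[1:]
        else:
            # Unquoted value: drop a trailing "# comment" if present.
            value = rest.split("#", 1)[0].strip()
        values[key.strip()] = value
    return values


def missing_keys(env: Dict[str, str], required: Iterable[str]) -> List[str]:
    """Return the required keys that are absent or empty."""
    return [k for k in required if not env.get(k)]


if __name__ == "__main__":
    required = ["AZURE_SUBSCRIPTION_KEY", "AZURE_SERVICE_REGION"]
    env_path = Path(".env")
    if env_path.exists():
        print("Missing:", missing_keys(read_env_file(str(env_path)), required))
    else:
        print("No .env file found in", Path.cwd())
```

Run the script from the directory where you call Manim, since that is where the ``.env`` file is expected to live.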
Check out Azure docs for more details.
Refer to the example usage to get started.
Coqui TTS is an open source neural text-to-speech engine. It is a fork of Mozilla TTS, which is an implementation of Tacotron 2. It is a very good TTS engine that produces human-like speech. However, it requires PyTorch to run, which may be difficult to set up on certain platforms.
Install Manim Voiceover with the ``coqui`` extra in order to use :py:class:`~coqui.CoquiService`:
pip install "manim-voiceover[coqui]"
If you run into issues with PyTorch or NumPy, try changing your Python version to 3.9.
Refer to the example usage to get started.
gTTS is a text-to-speech library that wraps Google Translate's text-to-speech API. It needs an internet connection to work.
Install Manim Voiceover with the ``gtts`` extra in order to use :py:class:`~gtts.GTTSService`:
pip install "manim-voiceover[gtts]"
Refer to the example usage to get started.
OpenAI provides a text-to-speech service through an API, so it requires both an internet connection and an API key. You can register for a key on the OpenAI platform.
Install Manim Voiceover with the ``openai`` extra in order to use :py:class:`~openai.OpenAIService`:
pip install "manim-voiceover[openai]"
Then, you need to find out your API key:
- Sign in to the OpenAI platform and open the API keys page from the left panel.
- Create a new secret key and copy it.
Create a file called ``.env`` that contains your authentication information in the same directory where you call Manim.
OPENAI_API_KEY="..." # insert the secret key here. It should start with "sk-"
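A quick format check can catch a mis-pasted key before a render fails partway through. The sketch below is an illustration only: ``looks_like_openai_key`` is a hypothetical helper, and the only rule it applies is the ``sk-`` prefix mentioned in the comment above, so it cannot tell whether a key is actually valid. It reads the process environment, which assumes the variable has already been exported or loaded from ``.env``.

```python
import os


def looks_like_openai_key(value: str) -> bool:
    """Heuristic: starts with "sk-" and has something after the prefix."""
    value = value.strip()
    return value.startswith("sk-") and len(value) > len("sk-")


if __name__ == "__main__":
    key = os.environ.get("OPENAI_API_KEY", "")
    if looks_like_openai_key(key):
        # Never print the full secret; show only the prefix.
        print("OPENAI_API_KEY looks plausible:", key.strip()[:3] + "...")
    else:
        print("OPENAI_API_KEY is missing or does not start with 'sk-'")
```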
Check out OpenAI docs for more details.
Refer to the example usage to get started.
pyttsx3 is a text-to-speech library that wraps espeak, a formant synthesis speech synthesizer.
Install Manim Voiceover with the ``pyttsx3`` extra in order to use :py:class:`~pyttsx3.PyTTSX3Service`:
pip install "manim-voiceover[pyttsx3]"
Refer to the example usage to get started.
ElevenLabs offers one of the most natural-sounding speech service APIs. It has a range of realistic and emotive voices, and also allows you to clone your own voice by uploading a few minutes of your speech. To use it, you will need to create an ElevenLabs account.
Tip: ElevenLabs currently offers 10,000 characters/month of free TTS and up to 3 custom voices.
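To get a rough sense of how far the free tier goes, you can count the characters in your narration. The sketch below assumes that usage is counted per character of input text and takes the 10,000 characters/month figure from the tip above. ``QUOTA_CHARS`` and ``quota_report`` are hypothetical names, and the actual accounting done by ElevenLabs may differ.

```python
from typing import Iterable, Tuple

QUOTA_CHARS = 10_000  # free-tier figure quoted above; may change


def quota_report(texts: Iterable[str], quota: int = QUOTA_CHARS) -> Tuple[int, int]:
    """Return (characters used, characters remaining).

    The second value is negative when the narration exceeds the quota.
    """
    used = sum(len(t) for t in texts)
    return used, quota - used


if __name__ == "__main__":
    # Placeholder narration lines; replace with your own voiceover texts.
    narration = [
        "This circle is drawn as I speak.",
        "Now let us move it to the left.",
    ]
    used, remaining = quota_report(narration)
    print(f"{used} characters used, {remaining} remaining")
```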
Install Manim Voiceover with the ``elevenlabs`` extra in order to use :py:class:`~elevenlabs.ElevenLabsService`:
pip install "manim-voiceover[elevenlabs]"
Then, you need to find out your API key:
- Sign in to the ElevenLabs portal and go to your profile to obtain the key.
- Set the environment variable ``ELEVEN_API_KEY`` to your key.
Create a file called ``.env`` that contains your authentication information in the same directory where you call Manim.
ELEVEN_API_KEY="..." # insert your API key here
Check out ElevenLabs docs for more details.
Refer to the example usage to get started.