Manim Voiceover is a Manim plugin for all things voiceover:
- Add voiceovers to Manim videos directly in Python, without having to use a video editor (see the minimal sketch after this list).
- Record voiceovers with your microphone during rendering with a simple command line interface.
- Develop animations with auto-generated AI voices from various free and proprietary services.
- Per-word timing of animations: trigger animations at specific words in the voiceover, even in your own recordings. This works thanks to OpenAI Whisper.
- NEW: Supports both local and cloud-based Whisper, so speech-to-text alignment also works on ARM64 architectures (such as Apple Silicon) where the local model may not run.
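To illustrate the API, here is a minimal sketch of a scene that uses the free gTTS backend and a bookmark for word-level timing (the class name and narration text are illustrative):

```python
from manim import *
from manim_voiceover import VoiceoverScene
from manim_voiceover.services.gtts import GTTSService


class VoiceoverSketch(VoiceoverScene):
    def construct(self):
        # Any supported speech service works here; gTTS is free and needs no API key.
        self.set_speech_service(GTTSService())

        circle = Circle()
        with self.voiceover(
            text='This circle is drawn as I <bookmark mark="A"/>speak.'
        ) as tracker:
            # Wait until the bookmarked word is spoken, then draw the circle
            # for the remainder of the voiceover clip.
            self.wait_until_bookmark("A")
            self.play(Create(circle), run_time=tracker.get_remaining_duration())
```

Rendering the file as usual (e.g. `manim -pql file.py VoiceoverSketch`) synthesizes the narration and times the animation against it.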
Here is a demo:
VoiceoverDemo.mp4
Currently supported TTS services (aside from the CLI that allows you to record your own voice; a configuration sketch follows the list):
- Azure Text to Speech (Recommended for AI voices)
- Coqui TTS
- gTTS
- pyttsx3
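Switching backends only changes the service you pass to `set_speech_service()`. For example, a sketch of the Azure backend (the voice and style values are illustrative; Azure credentials must be configured in a `.env` file as described in the docs):

```python
from manim import *
from manim_voiceover import VoiceoverScene
from manim_voiceover.services.azure import AzureService


class AzureSketch(VoiceoverScene):
    def construct(self):
        # Requires Azure Speech credentials in a .env file (see the docs).
        self.set_speech_service(
            AzureService(
                voice="en-US-AriaNeural",
                style="newscast-casual",  # optional speaking style
            )
        )
        with self.voiceover(text="Hello from the Azure backend.") as tracker:
            self.wait(tracker.duration)
```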
Check out the documentation for more details.
Installation instructions are in the Manim Voiceover docs.
Check out the docs to get started with Manim Voiceover.
Check out the example gallery to get inspired.
For ARM64 architectures (like Apple Silicon Macs) or systems where installing the local Whisper model is problematic, you can now use OpenAI's cloud-based Whisper API for speech-to-text alignment:
```sh
# Run with the provided script
python manim_cloud_whisper.py -pql examples/cloud_whisper_demo.py CloudWhisperDemo
```
Or enable it programmatically:
```python
from manim_voiceover.services.openai import OpenAIService

service = OpenAIService(
    voice="alloy",
    model="tts-1",
    transcription_model="base",
    use_cloud_whisper=True,  # This enables cloud-based Whisper
)
# Pass the service to self.set_speech_service() inside a VoiceoverScene.
```
You can also set an environment variable to enable cloud-based Whisper:
```sh
# Set the environment variable
export MANIM_VOICEOVER_USE_CLOUD_WHISPER=1

# Run Manim normally
manim -pql examples/cloud_whisper_demo.py CloudWhisperDemo
```
Learn more about cloud-based Whisper in the documentation.
Manim Voiceover can use machine translation services like DeepL to translate voiceovers into other languages. Check out the docs for more details.