A Gradio frontend for generating transcribed or translated subtitles for videos, using OpenAI Whisper locally.
This has been tested with Python 3.12. Create a virtual environment:
python -m venv .venv
Activate the venv on Windows with
.\.venv\Scripts\activate
On Linux and Mac, run
source .venv/bin/activate
Now, install the pip dependencies:
# if this doesn't work, pip install the following manually: openai-whisper ffmpeg torch gradio
pip install -r requirements.txt
Then launch the server with
python server.py
To share the interface publicly, add --remote=True.
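A minimal sketch of how a flag like this typically maps onto Gradio's share option, assuming an argparse-based entry point; the argument wiring and app object below are illustrative, not necessarily how server.py is written:

```python
# Illustrative only: wiring a --remote flag to Gradio's built-in sharing.
import argparse
import gradio as gr

def build_demo() -> gr.Blocks:
    # Placeholder UI; the real app plugs the Whisper workflow in here.
    with gr.Blocks() as demo:
        gr.Markdown("Subtitle generator")
    return demo

if __name__ == "__main__":
    parser = argparse.ArgumentParser()
    # --remote=True creates a temporary public gradio.live URL via share=True
    parser.add_argument("--remote", type=lambda s: s.lower() == "true", default=False)
    args = parser.parse_args()
    build_demo().launch(share=args.remote)
```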
For embedding subtitles into a video, you need to have ffmpeg installed on your system.
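As a rough idea of what that step involves, here is one common ffmpeg invocation for muxing an .srt file into an MP4 as a soft subtitle track; the file names are placeholders and this is not necessarily the exact command the app runs:

```python
# Illustrative only: add generated subtitles to a video without re-encoding.
import subprocess

subprocess.run(
    [
        "ffmpeg", "-y",
        "-i", "input.mp4",      # source video (placeholder name)
        "-i", "subtitles.srt",  # generated subtitles (placeholder name)
        "-c", "copy",           # copy the audio/video streams untouched
        "-c:s", "mov_text",     # convert the SRT to the MP4 subtitle codec
        "output.mp4",
    ],
    check=True,
)
```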
- Input a video or any other media file
- Input a YouTube URL
- Transcribe
- Translate to English
- Select different Whisper model sizes to suit your hardware
- CUDA support
- Output an .srt file or a video with embedded subtitles
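Under the hood, transcription and translation come down to openai-whisper calls along these lines; the model size, device selection, and file path below are placeholders rather than the app's exact code:

```python
# Minimal sketch of the openai-whisper calls this kind of app builds on.
import torch
import whisper

device = "cuda" if torch.cuda.is_available() else "cpu"
model = whisper.load_model("small", device=device)  # tiny / base / small / medium / large

# task="transcribe" keeps the source language; task="translate" produces English
result = model.transcribe("input.mp4", task="translate")

# Each segment has start/end timestamps and text, which is all an .srt needs
for segment in result["segments"]:
    print(f'{segment["start"]:7.2f} --> {segment["end"]:7.2f}  {segment["text"].strip()}')
```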
If the output says gpu available: False, you may need to pip install a different PyTorch build for your specific hardware (see the PyTorch installation instructions for the right command).
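A quick way to check what PyTorch currently sees, run inside the activated venv:

```python
# Quick check of GPU visibility from PyTorch.
import torch

print("CUDA available:", torch.cuda.is_available())
if torch.cuda.is_available():
    print("Device:", torch.cuda.get_device_name(0))
```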
I just wanted a nice frontend where you can drop a video or URL and it will spit out subs. Whisper is amazing, but I haven't found many implementations, especially ones that can be run locally.
By default, Gradio sends some usage data to its servers. Gradio provides settings to disable this, and they have been applied here.
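For reference, these are the standard Gradio opt-outs; which of them server.py uses exactly is an implementation detail:

```python
# The usual Gradio analytics opt-outs; both disable the phone-home behaviour.
import os

os.environ["GRADIO_ANALYTICS_ENABLED"] = "False"  # environment opt-out, set before importing gradio

import gradio as gr

with gr.Blocks(analytics_enabled=False) as demo:  # per-app opt-out
    gr.Markdown("Subtitle generator")
```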