AI MultiMedia Transcriber is a powerful subtitle generator that uses AI voice recognition models.

Gabrieliam42/AI-MultiMedia-Transcriber

AI MultiMedia Transcriber

AI MultiMedia Transcriber helps you generate srt subtitles from multimedia files like mp4, mkv, wav or mp3, or even from a YouTube video URL.

Note:

  • Python 3.10 must be installed to run this script. You can get python-3.10.11-amd64.exe from this link.
  • The AI_MultiMedia_Transcriber.py script uses CUDA GPU processing when it is available, which makes it run faster; otherwise it uses the CPU.
  • FFmpeg must be installed on the operating system and reachable from the command line (on Windows, add it to PATH in Environment Variables).
  • GPU processing with CUDA requires the latest supported NVIDIA GPU Computing Toolkit, installed together with cuDNN version 8.9.7.29.
  • Use AI_MultiMedia_Transcriber-Forced-EN.py or AI_MultiMedia_Transcriber-Forced-RO.py if the first script fails to recognize the source language correctly.
  • The required dependencies are in requirements.txt and are listed below.

Requirements:

  • faster-whisper
  • ffmpeg-python
  • pytubefix
  • tk
  • torch==2.4.1+cu124

AI MultiMedia Transcriber performs the following actions:

  1. It starts by prompting the user to select a source multimedia file.
  2. It extracts a WAV audio file from the source file or URL.
  3. It transcribes the audio and generates a text file (.txt), named after the input file, in the current working directory.
  4. You can then run Convert_Subtitle_TimeFrame.exe or the Convert_Subtitle_TimeFrame.py script to convert the timeframes in that text file into an SRT subtitle file, also named after the input file, in the current working directory.
  5. It is meant to run on a GPU with CUDA, but if one is not available, it falls back to the CPU.
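
Steps 2 and 3 above can be sketched roughly as follows. This is an illustration under stated assumptions, not the repository's actual code: the function names, the "small" model size, the 16 kHz mono WAV settings, and the "[start -> end] text" line layout of the .txt file are all hypothetical choices. The faster-whisper import is deferred so that the command builder can be used on its own.

```python
from __future__ import annotations

import subprocess
from pathlib import Path


def build_extract_command(source: str, wav_path: str) -> list[str]:
    """Return an ffmpeg command that writes a mono 16 kHz WAV from `source`."""
    return [
        "ffmpeg", "-y",          # overwrite the output file if it exists
        "-i", source,            # input media file
        "-vn",                   # ignore any video stream
        "-acodec", "pcm_s16le",  # 16-bit PCM audio
        "-ar", "16000",          # 16 kHz sample rate (assumed setting)
        "-ac", "1",              # single (mono) channel
        wav_path,
    ]


def transcribe_to_txt(source: str, language: str | None = None) -> Path:
    """Extract audio, transcribe it, and write <input name>.txt in the CWD."""
    # Imported here so the rest of the module works without faster-whisper.
    from faster_whisper import WhisperModel

    stem = Path(source).stem
    wav_path = f"{stem}.wav"
    subprocess.run(build_extract_command(source, wav_path), check=True)

    # device="auto" lets faster-whisper use CUDA when present, else the CPU.
    model = WhisperModel("small", device="auto", compute_type="default")
    # language=None asks the model to detect the language itself.
    segments, _info = model.transcribe(wav_path, language=language)

    txt_path = Path.cwd() / f"{stem}.txt"
    with open(txt_path, "w", encoding="utf-8") as out:
        for seg in segments:
            # Assumed line layout: "[start -> end] text", times in seconds.
            out.write(f"[{seg.start:.2f} -> {seg.end:.2f}] {seg.text.strip()}\n")
    return txt_path
```

In this sketch, calling transcribe_to_txt("clip.mp4", language="ro") would pin the language instead of relying on detection, which is presumably similar in spirit to what the Forced-EN and Forced-RO script variants do.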

Full Description:

AI MultiMedia Transcriber is a Python script that helps you create subtitles for your video and audio files. It uses Whisper, OpenAI's speech recognition model, to convert the audio track into text, which can then be converted into the SRT subtitle format.

  • It supports multimedia formats such as MP4, MKV, WAV, and MP3, as well as a YouTube video URL provided by the user.
  • Audio extraction is automatic: the script handles it for you, so there is no need to extract the audio separately.
  • The text-to-subtitle conversion script generates a subtitle file (.srt) that most video players can load.
  • GPU acceleration is optional: the script uses an NVIDIA GPU for faster processing and switches to the CPU if a compatible GPU isn't available.
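
To illustrate the text-to-subtitle conversion, here is a minimal sketch of turning timestamped lines into SRT blocks. The layout of the intermediate .txt file is not documented here, so the "[start -> end] text" line format (times in seconds) is an assumption, and the function names are hypothetical; Convert_Subtitle_TimeFrame.py may work differently.

```python
import re

# Assumed input line layout: "[12.34 -> 15.60] Some spoken text"
LINE_RE = re.compile(r"^\[(\d+(?:\.\d+)?) -> (\d+(?:\.\d+)?)\]\s*(.*)$")


def srt_timestamp(seconds: float) -> str:
    """Format seconds as an SRT timestamp, HH:MM:SS,mmm."""
    total_ms = int(round(seconds * 1000))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def txt_to_srt(text: str) -> str:
    """Convert lines in the assumed layout into numbered SRT blocks."""
    blocks = []
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # skip lines that do not follow the assumed layout
        start, end, caption = match.groups()
        index = len(blocks) + 1  # SRT cue numbers start at 1
        blocks.append(
            f"{index}\n"
            f"{srt_timestamp(float(start))} --> {srt_timestamp(float(end))}\n"
            f"{caption}\n"
        )
    # A blank line separates consecutive SRT blocks.
    return "\n".join(blocks)


if __name__ == "__main__":
    sample = "[0.00 -> 2.50] Hello\n[2.50 -> 4.00] World"
    print(txt_to_srt(sample))
```

Each SRT block consists of a sequence number, a start and end time separated by "-->", and the caption text; the comma before the milliseconds is part of the format.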





Script Developer: Gabriel Mihai Sandu
GitHub Profile: https://github.com/Gabrieliam42
