A collection of video and audio processing tools

Description

This repo performs various operations on video and audio files, including:

Extracting short video clips from longer ones.
Enhancing audio by adjusting pitch and volume, e.g. to give the speaker a deeper voice.
Compressing and converting video files to WebM format.
Extracting audio from a video and saving it as an MP3 file.
Amplifying audio if necessary.
Transcribing audio using Whisper.
Correcting raw audio transcripts using ChatGPT.
Embedding subtitles into the WebM video files.
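
Several of the operations above map onto single FFmpeg invocations. The following is a minimal sketch of how such commands could be assembled in Python. It is not taken from tools.py: the function names, the CRF value, and the choice of the libvpx-vp9, libopus and libmp3lame encoders are assumptions made for illustration.

```python
from typing import List


def clip_command(src: str, dst: str, start: str, duration: str) -> List[str]:
    """Build an FFmpeg command that cuts a clip by stream copy (no re-encoding).

    With -ss before -i and -c copy, the cut snaps to the nearest keyframe,
    so the clip may start slightly earlier than requested.
    """
    return ["ffmpeg", "-y", "-ss", start, "-i", src,
            "-t", duration, "-c", "copy", dst]


def webm_command(src: str, dst: str, crf: int = 32) -> List[str]:
    """Build an FFmpeg command that re-encodes to WebM (VP9 video, Opus audio).

    The CRF default of 32 is an assumed value; lower means higher quality.
    """
    return ["ffmpeg", "-y", "-i", src,
            "-c:v", "libvpx-vp9", "-crf", str(crf), "-b:v", "0",
            "-c:a", "libopus", dst]


def mp3_command(src: str, dst: str, gain: float = 1.0) -> List[str]:
    """Build an FFmpeg command that drops the video stream and writes an MP3.

    gain is a linear volume multiplier (1.0 leaves the volume unchanged).
    """
    return ["ffmpeg", "-y", "-i", src, "-vn",
            "-filter:a", f"volume={gain}",
            "-c:a", "libmp3lame", "-q:a", "2", dst]
```

Each function only returns an argument list, so a command can be inspected before it is run, for example with subprocess.run(clip_command("in.mp4", "clip.mp4", "00:01:00", "30"), check=True).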

Main Functions

Extract video clips.
Enhance audio in a video file.
Convert video to WebM format for web optimization.
Convert audio to MP3 and amplify it.
Transcribe audio using Whisper.
Correct transcripts using AI (ChatGPT).
Add subtitles to videos.

The main file of this repo is runtools.py. In this file, comment or uncomment the functions you want to execute.
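
The comment/uncomment workflow can be pictured roughly as follows. This is a hypothetical sketch only: the step names below are placeholders and are not the actual functions in runtools.py, and each step merely records its name instead of processing a file.

```python
from typing import List


# Placeholder steps (hypothetical names); the real functions would call FFmpeg,
# Whisper or ChatGPT instead of appending to a log.
def extract_clip(log: List[str]) -> None:
    log.append("extract_clip")


def enhance_audio(log: List[str]) -> None:
    log.append("enhance_audio")


def convert_to_webm(log: List[str]) -> None:
    log.append("convert_to_webm")


def transcribe(log: List[str]) -> None:
    log.append("transcribe")


def main() -> List[str]:
    """Run only the steps that are not commented out, in order."""
    log: List[str] = []
    extract_clip(log)
    # enhance_audio(log)      # commented out: this step is skipped
    convert_to_webm(log)
    # transcribe(log)         # commented out: this step is skipped
    return log


if __name__ == "__main__":
    print(main())
```

Commenting a line out skips that step without touching the others, which keeps a long pipeline easy to rerun partially.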

Requirements

FFmpeg for video/audio processing. It must be installed on your machine and added to the PATH variable.
OpenAI API (Whisper and ChatGPT models) for transcription and transcript correction.
Set the OpenAI API key for ChatGPT in the .env file. Whisper can be run without an API key.
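
A quick pre-flight check for these requirements might look like the sketch below. It assumes the key is exposed as an environment variable named OPENAI_API_KEY (the name this repo uses in .env is not stated here) and that the .env file has already been loaded into the environment, for instance with a package such as python-dotenv. The function name is hypothetical.

```python
import os
import shutil
from typing import Callable, List, Mapping, Optional


def missing_requirements(
    env: Optional[Mapping[str, str]] = None,
    which: Callable[[str], Optional[str]] = shutil.which,
) -> List[str]:
    """Return a list of human-readable problems; an empty list means all good."""
    env = os.environ if env is None else env
    missing: List[str] = []
    # shutil.which returns None when the executable is not found on PATH.
    if which("ffmpeg") is None:
        missing.append("ffmpeg (not found on PATH)")
    # Assumed variable name; only needed for the ChatGPT correction step.
    if not env.get("OPENAI_API_KEY"):
        missing.append("OPENAI_API_KEY (needed for ChatGPT correction)")
    return missing


if __name__ == "__main__":
    for problem in missing_requirements():
        print("Missing:", problem)
```

The env and which parameters exist only so the check can be exercised without changing the real environment.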

Demo

Using this toolkit, an MP4 video has been converted into the following products:

A WebM video. In this video, the sound volume has been amplified and the speaker's voice has been made lower/deeper. The WebM file is also about 10 times smaller than the original MP4.
A full-text audio transcript (.txt), embedded in the video description. It was generated using Whisper with ChatGPT post-corrections.
Closed captions / subtitles in English, also generated using Whisper with ChatGPT post-corrections.
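
To illustrate how timed transcript segments become a subtitle file, here is a minimal sketch that converts segments into SubRip (SRT) text. It assumes each segment is a dict with "start" and "end" in seconds and a "text" string, which is the shape of the segments returned by the open-source Whisper package. Whether this repo writes SRT or another subtitle format is not stated here, and the function names are hypothetical.

```python
from typing import Dict, Iterable, Union

Segment = Dict[str, Union[float, str]]


def srt_timestamp(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp, HH:MM:SS,mmm."""
    millis = int(round(seconds * 1000))
    hours, millis = divmod(millis, 3_600_000)
    minutes, millis = divmod(millis, 60_000)
    secs, millis = divmod(millis, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def segments_to_srt(segments: Iterable[Segment]) -> str:
    """Render segments as numbered SRT cues separated by blank lines."""
    blocks = []
    for index, seg in enumerate(segments, start=1):
        start = srt_timestamp(float(seg["start"]))
        end = srt_timestamp(float(seg["end"]))
        text = str(seg["text"]).strip()
        blocks.append(f"{index}\n{start} --> {end}\n{text}\n")
    return "\n".join(blocks)


if __name__ == "__main__":
    demo = [
        {"start": 0.0, "end": 2.5, "text": " Hello and welcome."},
        {"start": 2.5, "end": 5.0, "text": " This is a demo."},
    ]
    print(segments_to_srt(demo))
```

Running the ChatGPT correction on the text of each segment, rather than on one merged transcript, would keep the timings aligned with the corrected wording.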

Articles

How to create high-quality offline video transcriptions and subtitles using Whisper and Python, and the same article on Zenodo, 6 November 2024

Info

Latest update: 6 November 2024
Author: Olaf Janssen (ookgezellig) - Supported by ChatGPT
License: Creative Commons CC0 - http://creativecommons.org/publicdomain/zero/1.0
