devilismyfriend/ozen-toolkit

OZEN toolkit, an AI-powered audio dataset helper.

OZEN is a small tool that helps you process audio files into the LJ format.

Given a folder of files or a single audio file, it extracts the speech, transcribes it using Whisper, and saves the result in the LJ format (wav files in a wavs folder, plus train and valid txt files).
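
To make the output layout concrete, here is a minimal sketch of how transcribed clips could be split and written as train and valid txt files next to a wavs folder. This is not OZEN's actual code: the function name write_lj_split, the valid_ratio and seed parameters, the file names train.txt and valid.txt, and the pipe-delimited "wavs/<file>|<transcript>" line layout are all assumptions made for illustration.

```python
import random
from pathlib import Path


def write_lj_split(entries, out_dir, valid_ratio=0.1, seed=0):
    """Write train.txt and valid.txt for (wav_filename, transcript) pairs.

    Hypothetical helper: the "wavs/<file>|<transcript>" line layout is an
    assumed LJ-style convention, not confirmed OZEN behaviour.
    Returns the (train, valid) lists of pairs.
    """
    out_dir = Path(out_dir)
    # The wav clips themselves would live in this folder.
    (out_dir / "wavs").mkdir(parents=True, exist_ok=True)

    # Shuffle a copy with a fixed seed so the split is reproducible.
    shuffled = list(entries)
    random.Random(seed).shuffle(shuffled)

    # The first n_valid shuffled items go to validation, the rest to training.
    n_valid = int(len(shuffled) * valid_ratio)
    valid, train = shuffled[:n_valid], shuffled[n_valid:]

    for name, subset in (("train.txt", train), ("valid.txt", valid)):
        lines = [f"wavs/{wav}|{text}" for wav, text in subset]
        body = "\n".join(lines) + ("\n" if lines else "")
        (out_dir / name).write_text(body, encoding="utf-8")

    return train, valid
```

Calling write_lj_split(pairs, "output", valid_ratio=0.2) with ten pairs would put two lines in valid.txt and eight in train.txt.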

INSTALLATION

Accept the license terms at https://huggingface.co/pyannote/segmentation
Install Anaconda, or set up your own environment and install the requirements
Clone the repository: git clone https://github.com/devilismyfriend/ozen-toolkit
Run Set Up Ozen.bat

USAGE

Drag a folder or a file onto Drag_Here.bat to process it.

On the first run you will be prompted for a Hugging Face token. Once you provide it, a config file is created in which you can specify the models to use, the desired training/validation split, and more.

Alternatively, you can run ozen.py from the command line.
