speech-to-text-voxforge

Download the speech corpus

To download the speech corpus, run:

python downloader.py "voxforge-corpus"

You can also set the number of speaker directories to download with -n and the number of download threads with -w. The example below additionally passes -url to set the corpus source URL:

python downloader.py "voxforge-corpus" -n 20000 -w 15 -url http://www.repository.voxforge1.org/downloads/SpeechCorpus/Trunk/Audio/Main/8kHz_16bit/
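
The -n and -w options suggest a capped, multi-threaded download. The following is a minimal sketch of that general pattern and not the actual code of downloader.py, whose implementation is not shown here. The names `download_all` and `fetch` are hypothetical, and `fetch` is a stub that performs no network access.

```python
from concurrent.futures import ThreadPoolExecutor


def fetch(name):
    # Hypothetical stand-in for downloading one speaker archive;
    # a real implementation would request the file and save it to disk.
    return f"downloaded {name}"


def download_all(names, limit=None, workers=4, fetch_one=fetch):
    """Fetch at most `limit` items from `names` using `workers` threads."""
    # Cap the number of items, analogous to the -n option.
    selected = names if limit is None else names[:limit]
    # Size the thread pool, analogous to the -w option.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() yields results in the same order as the input list.
        return list(pool.map(fetch_one, selected))
```

For example, `download_all(["a.tgz", "b.tgz", "c.tgz"], limit=2, workers=2)` processes only the first two names. Because the work is I/O-bound, a thread pool is a reasonable fit; the appropriate number of workers depends on the server and the connection.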

Generate training data

To generate a training data file for the speech recognition tool, run generator.py with two arguments: the path to the directory where the VoxForge corpus was downloaded, and the path of the new file in which the training data should be stored. The data is stored as JSON.

python generator.py "voxforge-corpus" "training_data.json"
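
Since the output is JSON, it can be inspected with Python's standard json module. The sketch below makes no assumption about the structure that generator.py writes; it only reports the top-level type and the number of entries. The helper name `summarize` is hypothetical.

```python
import json


def summarize(path):
    """Load a JSON file and describe its top-level container."""
    with open(path, encoding="utf-8") as f:
        data = json.load(f)
    # Report only the type name and, for lists and objects, the entry count.
    size = len(data) if isinstance(data, (list, dict)) else None
    return type(data).__name__, size
```

Calling `summarize("training_data.json")` after running the generator gives a quick sanity check that the file is valid JSON and is not empty.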
