Free and offline foreign (non-English) speech recognition with Python: Google API, vosk, and SpeechRecognition / Pocketsphinx.
For speech recognition with timestamps, see the timestamps folder.
Read the instructions in this Medium article to learn which libraries you need to set up and how to do it.
- Online speech recognition with Google API:
pip install SpeechRecognition
- Offline speech recognition with vosk:
pip install vosk
- download a vosk model, unzip it, and specify the path to the model in the program
- Offline speech recognition with SpeechRecognition and Pocketsphinx:
pip install SpeechRecognition
python -m pip install --upgrade pip setuptools wheel
pip install --upgrade pocketsphinx
- download foreign-language models for pocketsphinx, unzip them, and set them up
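As a rough illustration of how the three approaches above are typically called, here is a minimal sketch. It is not the code from the scripts in this repository: the function names, the `language` and `model_path` parameters, and the 4000-frame chunk size are assumptions chosen for the example. The WAV check reflects the input format that the vosk example code expects (mono, 16-bit PCM). The third-party libraries are imported inside the functions, so the file can be loaded even if only one of them is installed.

```python
import json
import wave


def is_vosk_compatible(wav_path):
    """Return True if the file is a mono, 16-bit, uncompressed PCM WAV."""
    with wave.open(wav_path, "rb") as wf:
        return (
            wf.getnchannels() == 1
            and wf.getsampwidth() == 2
            and wf.getcomptype() == "NONE"
        )


def recognize_with_speech_recognition(wav_path, language, offline=False):
    """Transcribe with the Google API (online) or Pocketsphinx (offline)."""
    import speech_recognition as sr  # pip install SpeechRecognition

    recognizer = sr.Recognizer()
    with sr.AudioFile(wav_path) as source:
        audio = recognizer.record(source)  # read the whole file into memory
    if offline:
        # Assumes pocketsphinx and a model for `language` are already set up.
        return recognizer.recognize_sphinx(audio, language=language)
    return recognizer.recognize_google(audio, language=language)


def recognize_with_vosk(wav_path, model_path):
    """Transcribe offline with vosk; model_path is the unzipped model folder."""
    from vosk import Model, KaldiRecognizer  # pip install vosk

    if not is_vosk_compatible(wav_path):
        raise ValueError("expected a mono, 16-bit PCM WAV file")
    model = Model(model_path)
    pieces = []
    with wave.open(wav_path, "rb") as wf:
        rec = KaldiRecognizer(model, wf.getframerate())
        while True:
            data = wf.readframes(4000)  # chunk size is an arbitrary choice
            if not data:
                break
            # AcceptWaveform returns True when an utterance is complete.
            if rec.AcceptWaveform(data):
                pieces.append(json.loads(rec.Result())["text"])
        pieces.append(json.loads(rec.FinalResult())["text"])
    return " ".join(p for p in pieces if p)
```

For example, `recognize_with_speech_recognition("audio.wav", language="ru-RU")` would send the audio to the Google API, while passing `offline=True` would use Pocketsphinx instead. The language code here is only a placeholder; use the one that matches your audio and installed model.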
See the overview Jupyter notebook, which contains examples of all methods.
Open it with Jupyter or view it directly in a browser.
Like any Python script, each of these three scripts (.py files) can be run with the following command:
python script_name.py parameter1 parameter2 ...
Every script has two parameters:
- first (required) - name of the .wav file to recognize
- second (optional) - name of the text file to write recognized text. If not specified, uses first_parameter.txt
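The two-parameter convention described above could be handled roughly as follows. This is a sketch, not the parsing code used by the scripts themselves: `parse_args` and `default_output_path` are hypothetical names, and the sketch assumes the default output name is formed by replacing the .wav extension with .txt.

```python
import argparse
from pathlib import Path


def default_output_path(wav_path):
    """Derive the default text file name, e.g. audio.wav -> audio.txt."""
    return str(Path(wav_path).with_suffix(".txt"))


def parse_args(argv):
    """Return (wav_file, text_file) from a list of command-line arguments."""
    parser = argparse.ArgumentParser(description="Recognize speech in a .wav file")
    parser.add_argument("wav_file", help="name of the .wav file to recognize")
    # nargs="?" makes the second positional argument optional.
    parser.add_argument("text_file", nargs="?", default=None,
                        help="name of the text file for the recognized text")
    args = parser.parse_args(argv)
    text_file = args.text_file or default_output_path(args.wav_file)
    return args.wav_file, text_file
```

In a script, this would typically be called as `parse_args(sys.argv[1:])`. Taking the argument list as a parameter keeps the function easy to try out without touching the real command line.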
Examples:
- python script_online_sr.py audio.wav (writes text in audio.txt)
- python script_online_sr.py audio.wav audio_output.txt
- python script_vosk.py 'sounds\filename.wav' (recognizes current_folder\sounds\filename.wav)
- python script_offline_sr.py 'D:\sounds\filename.wav'