How to just evaluate a pre-trained network on an audio file? #87

Open
devinbostIL opened this issue May 18, 2017 · 6 comments

@devinbostIL

Hi,

I was able to get my environment set up, and I just want to evaluate an existing model (such as the LibriSpeech network) to attempt speech-to-text on an audio file. I only need to perform the transcription.
How do I go about this with your library? From the documentation, I am not sure which steps are necessary or how much extra development work (if any) I will need to do to perform the transcription task.

@SeanNaren
Owner

Hey, my bad! I should update the docs sometime :) To do this, use the predict script like below:

th Predict.lua -modelPath /path/to/model.t7 -audioPath /path/to/audio.wav

There are further parameters if you need them; use the -help argument to see them!

@devinbostIL
Author

devinbostIL commented May 25, 2017

Thanks for the information!

I attempted to run the model, and it blew up with this message:

$ th Predict.lua -modelPath libri_deepspeech.t7 -audioPath '/home/devinbost/Downloads/speech_audio_files_sample/nameOfAudioFile.wav'
/home/devinbost/torch/install/bin/luajit: ...e/devinbost/torch/install/share/lua/5.1/nn/Container.lua:67: 
In 2 module of nn.Sequential:
In 3 module of nn.Sequential:
In 1 module of cudnn.BatchBRNNReLU:
/home/devinbost/torch/install/share/lua/5.1/nn/View.lua:47: input view (5107x1x1x1760) and desired view (5107x-1) do not match
stack traceback:
	[C]: in function 'error'
	/home/devinbost/torch/install/share/lua/5.1/nn/View.lua:47: in function 'batchsize'
	/home/devinbost/torch/install/share/lua/5.1/nn/View.lua:79: in function </home/devinbost/torch/install/share/lua/5.1/nn/View.lua:77>
	[C]: in function 'xpcall'
	...e/devinbost/torch/install/share/lua/5.1/nn/Container.lua:63: in function 'rethrowErrors'
	.../devinbost/torch/install/share/lua/5.1/nn/Sequential.lua:44: in function <.../devinbost/torch/install/share/lua/5.1/nn/Sequential.lua:41>
	[C]: in function 'xpcall'
	...e/devinbost/torch/install/share/lua/5.1/nn/Container.lua:63: in function 'rethrowErrors'
	.../devinbost/torch/install/share/lua/5.1/nn/Sequential.lua:44: in function <.../devinbost/torch/install/share/lua/5.1/nn/Sequential.lua:41>
	[C]: in function 'xpcall'
	...e/devinbost/torch/install/share/lua/5.1/nn/Container.lua:63: in function 'rethrowErrors'
	.../devinbost/torch/install/share/lua/5.1/nn/Sequential.lua:44: in function 'forward'
	Predict.lua:42: in main chunk
	[C]: in function 'dofile'
	...bost/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:150: in main chunk
	[C]: at 0x00405d50

WARNING: If you see a stack trace below, it doesn't point to the place where this error occurred. Please use only the one above.
stack traceback:
	[C]: in function 'error'
	...e/devinbost/torch/install/share/lua/5.1/nn/Container.lua:67: in function 'rethrowErrors'
	.../devinbost/torch/install/share/lua/5.1/nn/Sequential.lua:44: in function 'forward'
	Predict.lua:42: in main chunk
	[C]: in function 'dofile'
	...bost/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:150: in main chunk
	[C]: at 0x00405d50

Any ideas?

@devinbostIL
Author

Is it expecting me to pass it a table or a directory with a collection of audio files?

@devinbostIL
Author

I tried changing the file and then also the sampling rate, and these were the error messages that I got:

~/src/deepspeech.torch$ th Predict.lua -modelPath libri_deepspeech.t7 -audioPath '/home/devinbost/Downloads/speech_audio_files_sample/4402691.wav'
/home/devinbost/torch/install/bin/luajit: ...e/devinbost/torch/install/share/lua/5.1/nn/Container.lua:67:
In 2 module of nn.Sequential:
In 3 module of nn.Sequential:
In 1 module of cudnn.BatchBRNNReLU:
/home/devinbost/torch/install/share/lua/5.1/nn/View.lua:47: input view (3951x1x1x1760) and desired view (3951x-1) do not match
stack traceback:
[C]: in function 'error'
/home/devinbost/torch/install/share/lua/5.1/nn/View.lua:47: in function 'batchsize'
/home/devinbost/torch/install/share/lua/5.1/nn/View.lua:79: in function </home/devinbost/torch/install/share/lua/5.1/nn/View.lua:77>
[C]: in function 'xpcall'
...e/devinbost/torch/install/share/lua/5.1/nn/Container.lua:63: in function 'rethrowErrors'
.../devinbost/torch/install/share/lua/5.1/nn/Sequential.lua:44: in function <.../devinbost/torch/install/share/lua/5.1/nn/Sequential.lua:41>
[C]: in function 'xpcall'
...e/devinbost/torch/install/share/lua/5.1/nn/Container.lua:63: in function 'rethrowErrors'
.../devinbost/torch/install/share/lua/5.1/nn/Sequential.lua:44: in function <.../devinbost/torch/install/share/lua/5.1/nn/Sequential.lua:41>
[C]: in function 'xpcall'
...e/devinbost/torch/install/share/lua/5.1/nn/Container.lua:63: in function 'rethrowErrors'
.../devinbost/torch/install/share/lua/5.1/nn/Sequential.lua:44: in function 'forward'
Predict.lua:42: in main chunk
[C]: in function 'dofile'
...bost/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:150: in main chunk
[C]: at 0x00405d50

WARNING: If you see a stack trace below, it doesn't point to the place where this error occurred. Please use only the one above.
stack traceback:
[C]: in function 'error'
...e/devinbost/torch/install/share/lua/5.1/nn/Container.lua:67: in function 'rethrowErrors'
.../devinbost/torch/install/share/lua/5.1/nn/Sequential.lua:44: in function 'forward'
Predict.lua:42: in main chunk
[C]: in function 'dofile'
...bost/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:150: in main chunk
[C]: at 0x00405d50

~/src/deepspeech.torch$ th Predict.lua -modelPath libri_deepspeech.t7 -audioPath '/home/devinbost/Downloads/speech_audio_files_sample/4402691.wav' -sampleRate 13000
/home/devinbost/torch/install/bin/luajit: ...e/devinbost/torch/install/share/lua/5.1/nn/Container.lua:67:
In 1 module of nn.Sequential:
In 7 module of nn.Sequential:
/home/devinbost/torch/install/share/lua/5.1/nn/View.lua:47: input view (1x32x26x4864) and desired view (1312x-1) do not match
stack traceback:
[C]: in function 'error'
/home/devinbost/torch/install/share/lua/5.1/nn/View.lua:47: in function 'batchsize'
/home/devinbost/torch/install/share/lua/5.1/nn/View.lua:79: in function </home/devinbost/torch/install/share/lua/5.1/nn/View.lua:77>
[C]: in function 'xpcall'
...e/devinbost/torch/install/share/lua/5.1/nn/Container.lua:63: in function 'rethrowErrors'
.../devinbost/torch/install/share/lua/5.1/nn/Sequential.lua:44: in function <.../devinbost/torch/install/share/lua/5.1/nn/Sequential.lua:41>
[C]: in function 'xpcall'
...e/devinbost/torch/install/share/lua/5.1/nn/Container.lua:63: in function 'rethrowErrors'
.../devinbost/torch/install/share/lua/5.1/nn/Sequential.lua:44: in function 'forward'
Predict.lua:42: in main chunk
[C]: in function 'dofile'
...bost/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:150: in main chunk
[C]: at 0x00405d50

WARNING: If you see a stack trace below, it doesn't point to the place where this error occurred. Please use only the one above.
stack traceback:
[C]: in function 'error'
...e/devinbost/torch/install/share/lua/5.1/nn/Container.lua:67: in function 'rethrowErrors'
.../devinbost/torch/install/share/lua/5.1/nn/Sequential.lua:44: in function 'forward'
Predict.lua:42: in main chunk
[C]: in function 'dofile'
...bost/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:150: in main chunk
[C]: at 0x00405d50

@SeanNaren
Owner

SeanNaren commented Jun 8, 2017

Make sure the file is a 16 kHz WAV file. Is this the case?

I've also added documentation here.
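
One way to confirm the sample rate before running Predict.lua is to read the WAV header directly. The following is a minimal sketch in Python using only the standard-library `wave` module. It is not part of deepspeech.torch; the names `describe_wav` and `write_silence` are made up for illustration, and the sketch only handles uncompressed PCM WAV files.

```python
import os
import tempfile
import wave


def describe_wav(path):
    """Return basic header fields of an uncompressed PCM WAV file."""
    with wave.open(path, "rb") as wav:
        frames = wav.getnframes()
        rate = wav.getframerate()
        return {
            "sample_rate_hz": rate,
            "channels": wav.getnchannels(),
            "bits_per_sample": wav.getsampwidth() * 8,
            "duration_s": frames / float(rate),
        }


def write_silence(path, rate=16000, seconds=1, channels=1):
    """Write a short silent 16-bit WAV file (hypothetical demo input only)."""
    with wave.open(path, "wb") as wav:
        wav.setnchannels(channels)
        wav.setsampwidth(2)  # 2 bytes per sample = 16-bit audio
        wav.setframerate(rate)
        wav.writeframes(b"\x00\x00" * rate * seconds * channels)


if __name__ == "__main__":
    # Demo on a generated file; replace demo_path with a real recording.
    demo_path = os.path.join(tempfile.gettempdir(), "demo_16k.wav")
    write_silence(demo_path)
    print(describe_wav(demo_path))
```

If `sample_rate_hz` is anything other than 16000, the file does not match what the maintainer asks for above and would need resampling first.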

@sirmick

sirmick commented Jun 8, 2017

I'm having the same problem. I downloaded the LibriSpeech pre-trained model and am launching it with th Predict.lua -modelPath libri_deepspeech.t7 -audioPath amy.out.wav -dictionaryPath ./dictionary -nGPU 1

I'm trying to run this against a WAV file that I downsampled to 16 kHz mono with sox amy.wav amy.out.wav rate 16k channels 1. It is a 16-bit file, if that counts for anything.

I'm getting a very similar error when I try to run the predict script: View.lua:47: input view (241x1x1x1760) and desired view (241x-1) do not match

If I figure out what I'm doing wrong, I'd be happy to contribute some better documentation or strengthen the input file checking in Predict.lua so it throws actionable errors.
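
As a rough illustration of what "actionable" input checking could look like, here is a standalone pre-check written in Python rather than as a patch to Predict.lua (which is Lua). It is a sketch under stated assumptions: the only requirement taken from this thread is the 16 kHz sample rate, the function name `check_wav` is hypothetical, and the suggested fix reuses the sox invocation form quoted above. It would not catch the failure reported for a file that is already 16 kHz, so it covers only the sample-rate mismatch case.

```python
import sys
import wave

EXPECTED_RATE_HZ = 16000  # the one requirement stated by the maintainer here


def check_wav(path, expected_rate=EXPECTED_RATE_HZ):
    """Raise ValueError with a concrete hint if the file looks unsuitable."""
    try:
        with wave.open(path, "rb") as wav:
            rate = wav.getframerate()
            frames = wav.getnframes()
    except (wave.Error, EOFError) as exc:
        # Not a RIFF/WAVE file, a compressed WAV, or a truncated file.
        raise ValueError("%s is not a readable PCM WAV file: %s" % (path, exc))

    if frames == 0:
        raise ValueError("%s contains no audio frames" % path)

    if rate != expected_rate:
        raise ValueError(
            "%s is sampled at %d Hz but %d Hz is expected; resample it first, "
            "for example: sox %s resampled.wav rate %d"
            % (path, rate, expected_rate, path, expected_rate)
        )


if __name__ == "__main__":
    # Usage: python check_wav.py /path/to/audio.wav
    try:
        check_wav(sys.argv[1])
    except ValueError as err:
        sys.exit("Input check failed: %s" % err)
    print("Sample rate looks fine")
```

Running a check like this before the forward pass would turn a sample-rate mismatch into a one-line message instead of a View.lua stack trace.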
