How to just evaluate a pre-trained network on an audio file? #87

devinbostIL · 2017-05-18T21:59:25Z

Hi,

I got my environment set up, and I would like to evaluate an existing model (such as the LibriSpeech network) to attempt speech-to-text on an audio file. I just want to perform the transcription.
How do I go about this with your library? I can't tell from the documentation which steps are necessary, or how much extra development work (if any) I will need to do to perform the transcription task with your library.

SeanNaren · 2017-05-19T10:01:34Z

Hey my bad! Should update the docs sometime :) To do this use the predict script like below:

th Predict.lua -modelPath /path/to/model.t7 -audioPath /path/to/audio.wav

There are further parameters if you need them; use the -help argument to see them!

devinbostIL · 2017-05-25T20:31:01Z

Thanks for the information!

I attempted to run the model, and it blew up with this message:

$ th Predict.lua -modelPath libri_deepspeech.t7 -audioPath '/home/devinbost/Downloads/speech_audio_files_sample/nameOfAudioFile.wav'
/home/devinbost/torch/install/bin/luajit: ...e/devinbost/torch/install/share/lua/5.1/nn/Container.lua:67: 
In 2 module of nn.Sequential:
In 3 module of nn.Sequential:
In 1 module of cudnn.BatchBRNNReLU:
/home/devinbost/torch/install/share/lua/5.1/nn/View.lua:47: input view (5107x1x1x1760) and desired view (5107x-1) do not match
stack traceback:
	[C]: in function 'error'
	/home/devinbost/torch/install/share/lua/5.1/nn/View.lua:47: in function 'batchsize'
	/home/devinbost/torch/install/share/lua/5.1/nn/View.lua:79: in function </home/devinbost/torch/install/share/lua/5.1/nn/View.lua:77>
	[C]: in function 'xpcall'
	...e/devinbost/torch/install/share/lua/5.1/nn/Container.lua:63: in function 'rethrowErrors'
	.../devinbost/torch/install/share/lua/5.1/nn/Sequential.lua:44: in function <.../devinbost/torch/install/share/lua/5.1/nn/Sequential.lua:41>
	[C]: in function 'xpcall'
	...e/devinbost/torch/install/share/lua/5.1/nn/Container.lua:63: in function 'rethrowErrors'
	.../devinbost/torch/install/share/lua/5.1/nn/Sequential.lua:44: in function <.../devinbost/torch/install/share/lua/5.1/nn/Sequential.lua:41>
	[C]: in function 'xpcall'
	...e/devinbost/torch/install/share/lua/5.1/nn/Container.lua:63: in function 'rethrowErrors'
	.../devinbost/torch/install/share/lua/5.1/nn/Sequential.lua:44: in function 'forward'
	Predict.lua:42: in main chunk
	[C]: in function 'dofile'
	...bost/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:150: in main chunk
	[C]: at 0x00405d50

WARNING: If you see a stack trace below, it doesn't point to the place where this error occurred. Please use only the one above.
stack traceback:
	[C]: in function 'error'
	...e/devinbost/torch/install/share/lua/5.1/nn/Container.lua:67: in function 'rethrowErrors'
	.../devinbost/torch/install/share/lua/5.1/nn/Sequential.lua:44: in function 'forward'
	Predict.lua:42: in main chunk
	[C]: in function 'dofile'
	...bost/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:150: in main chunk
	[C]: at 0x00405d50

Any ideas?

devinbostIL · 2017-06-07T23:22:44Z

Is it expecting me to pass it a table or a directory with a collection of audio files?

devinbostIL · 2017-06-07T23:56:14Z

I tried changing the file and then also the sampling rate, and these were the error messages that I got:

~/src/deepspeech.torch$ th Predict.lua -modelPath libri_deepspeech.t7 -audioPath '/home/devinbost/Downloads/speech_audio_files_sample/4402691.wav'
/home/devinbost/torch/install/bin/luajit: ...e/devinbost/torch/install/share/lua/5.1/nn/Container.lua:67:
In 2 module of nn.Sequential:
In 3 module of nn.Sequential:
In 1 module of cudnn.BatchBRNNReLU:
/home/devinbost/torch/install/share/lua/5.1/nn/View.lua:47: input view (3951x1x1x1760) and desired view (3951x-1) do not match
stack traceback:
[C]: in function 'error'
/home/devinbost/torch/install/share/lua/5.1/nn/View.lua:47: in function 'batchsize'
/home/devinbost/torch/install/share/lua/5.1/nn/View.lua:79: in function </home/devinbost/torch/install/share/lua/5.1/nn/View.lua:77>
[C]: in function 'xpcall'
...e/devinbost/torch/install/share/lua/5.1/nn/Container.lua:63: in function 'rethrowErrors'
.../devinbost/torch/install/share/lua/5.1/nn/Sequential.lua:44: in function <.../devinbost/torch/install/share/lua/5.1/nn/Sequential.lua:41>
[C]: in function 'xpcall'
...e/devinbost/torch/install/share/lua/5.1/nn/Container.lua:63: in function 'rethrowErrors'
.../devinbost/torch/install/share/lua/5.1/nn/Sequential.lua:44: in function <.../devinbost/torch/install/share/lua/5.1/nn/Sequential.lua:41>
[C]: in function 'xpcall'
...e/devinbost/torch/install/share/lua/5.1/nn/Container.lua:63: in function 'rethrowErrors'
.../devinbost/torch/install/share/lua/5.1/nn/Sequential.lua:44: in function 'forward'
Predict.lua:42: in main chunk
[C]: in function 'dofile'
...bost/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:150: in main chunk
[C]: at 0x00405d50

WARNING: If you see a stack trace below, it doesn't point to the place where this error occurred. Please use only the one above.
stack traceback:
[C]: in function 'error'
...e/devinbost/torch/install/share/lua/5.1/nn/Container.lua:67: in function 'rethrowErrors'
.../devinbost/torch/install/share/lua/5.1/nn/Sequential.lua:44: in function 'forward'
Predict.lua:42: in main chunk
[C]: in function 'dofile'
...bost/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:150: in main chunk
[C]: at 0x00405d50

~/src/deepspeech.torch$ th Predict.lua -modelPath libri_deepspeech.t7 -audioPath '/home/devinbost/Downloads/speech_audio_files_sample/4402691.wav' -sampleRate 13000
/home/devinbost/torch/install/bin/luajit: ...e/devinbost/torch/install/share/lua/5.1/nn/Container.lua:67:
In 1 module of nn.Sequential:
In 7 module of nn.Sequential:
/home/devinbost/torch/install/share/lua/5.1/nn/View.lua:47: input view (1x32x26x4864) and desired view (1312x-1) do not match
stack traceback:
[C]: in function 'error'
/home/devinbost/torch/install/share/lua/5.1/nn/View.lua:47: in function 'batchsize'
/home/devinbost/torch/install/share/lua/5.1/nn/View.lua:79: in function </home/devinbost/torch/install/share/lua/5.1/nn/View.lua:77>
[C]: in function 'xpcall'
...e/devinbost/torch/install/share/lua/5.1/nn/Container.lua:63: in function 'rethrowErrors'
.../devinbost/torch/install/share/lua/5.1/nn/Sequential.lua:44: in function <.../devinbost/torch/install/share/lua/5.1/nn/Sequential.lua:41>
[C]: in function 'xpcall'
...e/devinbost/torch/install/share/lua/5.1/nn/Container.lua:63: in function 'rethrowErrors'
.../devinbost/torch/install/share/lua/5.1/nn/Sequential.lua:44: in function 'forward'
Predict.lua:42: in main chunk
[C]: in function 'dofile'
...bost/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:150: in main chunk
[C]: at 0x00405d50

WARNING: If you see a stack trace below, it doesn't point to the place where this error occurred. Please use only the one above.
stack traceback:
[C]: in function 'error'
...e/devinbost/torch/install/share/lua/5.1/nn/Container.lua:67: in function 'rethrowErrors'
.../devinbost/torch/install/share/lua/5.1/nn/Sequential.lua:44: in function 'forward'
Predict.lua:42: in main chunk
[C]: in function 'dofile'
...bost/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:150: in main chunk
[C]: at 0x00405d50

SeanNaren · 2017-06-08T07:56:35Z

Make sure the file is a 16 kHz WAV file. Is this the case?

I've also added documentation here.

sirmick · 2017-06-08T21:16:04Z

I'm having the same problem. I downloaded the pre-trained LibriSpeech model and am launching it with:

th Predict.lua -modelPath libri_deepspeech.t7 -audioPath amy.out.wav -dictionaryPath ./dictionary -nGPU 1

I'm running this against a WAV file that I downsampled to 16 kHz mono with:

sox amy.wav amy.out.wav rate 16k channels 1

It is a 16-bit file, if that counts for anything.

I'm getting a very similar error when I run the predict script:

View.lua:47: input view (241x1x1x1760) and desired view (241x-1) do not match

If I figure out what I'm doing wrong, I'd be happy to contribute some better documentation or strengthen the input file checking in Predict.lua so it throws actionable errors.
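
As a rough illustration of the kind of input check suggested above, here is a minimal sketch in Python, run separately from Predict.lua before transcription. It uses only the standard-library wave module. The 16 kHz expectation comes from this thread; the function name wav_problems and the wording of the messages are hypothetical and not part of deepspeech.torch. A file that passes can evidently still hit the View.lua error reported above, so this only rules out an unreadable file or a sample-rate mismatch.

```python
import wave

# Sample rate requested in this thread ("a 16 kHz WAV file").
EXPECTED_RATE_HZ = 16000


def wav_problems(path, expected_rate=EXPECTED_RATE_HZ):
    """Return a list of actionable messages; an empty list means the header looks fine.

    Hypothetical helper for illustration only, not part of deepspeech.torch.
    """
    try:
        with wave.open(path, "rb") as wav:
            rate = wav.getframerate()
            frames = wav.getnframes()
    except (wave.Error, EOFError) as exc:
        # wave only understands uncompressed PCM WAV containers.
        return [f"{path}: not a readable PCM WAV file ({exc})"]

    problems = []
    if rate != expected_rate:
        problems.append(
            f"{path}: sample rate is {rate} Hz, expected {expected_rate} Hz; "
            "resample the file before running the predict script"
        )
    if frames == 0:
        problems.append(f"{path}: file contains no audio frames")
    return problems
```

Calling wav_problems("amy.out.wav") on a 16 kHz file with audio in it returns an empty list; any other result names the file and the mismatch instead of surfacing later as a tensor shape error.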
