
High values of WER on Libri Dataset (test-clean and dev-clean) #80

Open
ismorphism opened this issue Mar 2, 2017 · 5 comments
ismorphism commented Mar 2, 2017

Hi everyone! I have the following problem: when I first try to train the plain DeepSpeech net with the default parameters (7 layers, 1760 neurons, batch size 20), I always get
cuda runtime error (2) : out of memory at /tmp/luarocks_cutorch-scm-1-3543/cutorch/lib/THC/generic/THCStorage.cu:66. (My dataset is the default ~600 MB Libri dataset described in the "Data preparation and running" wiki chapter of this repository. My GPU is an Nvidia GTX 1080.)
I assumed that was expected and that I would have to decrease the batch size and the number of layers/neurons. So far the only configuration that runs reliably is 6 layers, 1200 neurons, batch size 12; other architectures hit the out-of-memory error every time around epoch 5 or 6. At the same time, my best WER is about 76.5 and does not seem to drop. I tried changing the batch size a little, but 12 is my maximum working value right now. I also tried an LSTM architecture with 600 neurons and 6 layers, and it gave worse results. I tried changing the learning rate, rate annealing, max-norm, and momentum, and adding permuted batches, but I always end up around 76.5. Does anyone know what else I could try?
Maybe the answer is a bigger batch size and a deeper architecture, but that would require more computational power, which is not a good option for me.
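A rough back-of-envelope estimate makes the out-of-memory behaviour plausible: backpropagation through an RNN has to keep every layer's activations for every timestep of every sample in the batch. The sketch below (Python, purely illustrative; the timestep count and float size are assumptions, not measurements of this repo) shows how quickly that grows with batch size and utterance length:

```python
# Illustrative activation-memory estimate for an RNN acoustic model.
# Numbers here are assumptions for the sake of the arithmetic.

BYTES_PER_FLOAT = 4  # fp32

def activation_memory_gb(batch_size, timesteps, hidden, layers):
    """Memory to keep one hidden vector per layer, per timestep, per sample
    (what backprop needs for the forward activations alone)."""
    floats = batch_size * timesteps * hidden * layers
    return floats * BYTES_PER_FLOAT / 1024**3

# e.g. batch 20, ~1500 timesteps for a long utterance, 1760 units, 7 layers
print(round(activation_memory_gb(20, 1500, 1760, 7), 2), "GB")  # → 1.38 GB
```

That is only the forward activations; gradients, parameters, and workspace buffers come on top, so a batch of long utterances can easily exhaust an 8 GB card.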


mtanana commented Mar 2, 2017

Does the 1080 have 6GB? I'm not sure if that will be able to fit the full model.

If you look back at my responses to the thread on running out of memory, I found some tweaks to the code that drastically reduced the memory. (But, alas, I haven't had time to commit them...)

In the end, if you don't have much memory, you can't run large batches. (On a 6 GB testing card, I don't think I could run more than 3 or 4 in a batch.)

As the batch size changes, the ideal learning rate usually changes as well (in my experience). You may have to play with that (this is the real hard work of deep learning).
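A common starting heuristic for re-tuning after a batch-size change is to scale the learning rate proportionally to the batch size (the "linear scaling rule"). This is only a first guess, not something the thread or the repo prescribes; a minimal sketch:

```python
def scaled_lr(base_lr, base_batch, new_batch):
    """Linear scaling heuristic: scale the learning rate with batch size.
    A starting point for tuning, not a guarantee of stability."""
    return base_lr * new_batch / base_batch

# If some base_lr (3e-4 here is a made-up example) worked at batch 20,
# a first guess for batch 12:
print(scaled_lr(3e-4, 20, 12))  # → 1.8e-4
```

Smaller batches also give noisier gradient estimates, so the best rate after shrinking the batch is often somewhat lower than the original.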

@suhaspillai

Hi Boris,
You're right: you need a bigger batch size and more GPUs. I think the reason you get the out-of-memory error is that with permuted batches, after 5 or 6 epochs you suddenly draw a batch in which the speech files have many timesteps, and backpropagation has to store the intermediate layer values for every timestep. The only way to make that fit is a smaller batch size. The other way around is to accumulate gradients locally on your machine: if you want 30 samples per batch but can only run 10 at a time, sum the gradients over three batches of 10 samples and apply the update once after the third. This will require changes in the code.


mtanana commented Mar 2, 2017

#71
There's a comment from me near the bottom that helps with memory.


SeanNaren commented Mar 13, 2017

Sorry for the late response. The GTX 1080 is a great card but, as said above, it only has 8 GB of VRAM. Reduce the minibatch size if you want to train on this GPU!

Which dataset are you training on specifically?

SeanNaren reopened this Mar 13, 2017

mtanana commented Mar 24, 2017

Don't forget to downsize the minibatch for testing too.
