High values of WER on Libri Dataset (test-clean and dev-clean) #80
Comments
Does the 1080 have 6 GB? I'm not sure that will be able to fit the full model. If you look back at my responses in the thread on running out of memory, I found some tweaks to the code that drastically reduced memory usage. (But, alas, I haven't had time to commit them...) In the end, if you don't have much memory, you can't run large batches (on a 6 GB test card, I don't think I could run more than 3 or 4 in a batch). As the batch size changes, the ideal learning rate usually does as well (in my experience). You may have to play with that (this is the real hard work of deep learning).
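One common rule of thumb (an assumption on my part, not something this repository prescribes) is to scale the learning rate linearly with the batch size as a starting point for that tuning. A minimal Lua sketch, where the base learning rate of 3e-4 is purely illustrative:

```lua
-- Linear-scaling rule of thumb: if the batch shrinks by some factor,
-- shrink the learning rate by the same factor as a first guess,
-- then tune from there.
local function scaledLearningRate(baseLR, baseBatchSize, newBatchSize)
  return baseLR * (newBatchSize / baseBatchSize)
end

-- e.g. a rate tuned for batch size 20, adapted to a batch of 4 that fits on a smaller card
print(scaledLearningRate(3e-4, 20, 4))  -- prints 6e-05
```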
Hi Boris,
#71
Sorry for the late response. The GTX 1080 is a great card but, as said above, only has 8 GB of VRAM. Reduce the minibatch size if you want to train on this GPU! Which dataset are you training on specifically?
Don't forget to downsize the minibatch for testing too.
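The reason the test batch matters as well: even without gradients, a forward pass still allocates activations proportional to the batch size. A minimal sketch, assuming a working Torch install with cunn; the tiny fully-connected model below is only a stand-in for the real network:

```lua
require 'cunn'

-- Toy model in place of the real DeepSpeech network.
local model = nn.Sequential()
  :add(nn.Linear(1760, 1760))
  :add(nn.ReLU())
  :cuda()
model:evaluate()  -- switches off training-only behaviour, but not activation memory

local testBatch = torch.CudaTensor(12, 1760):uniform()  -- first dimension = batch size
local output = model:forward(testBatch)                 -- memory use grows with that dimension
print(output:size())
```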
Hi everyone! I have the following problem: when I first try to train the pure DeepSpeech net with default parameters (7 layers, 1760 neurons, batch size 20), I always get
cuda runtime error (2) : out of memory at /tmp/luarocks_cutorch-scm-1-3543/cutorch/lib/THC/generic/THCStorage.cu:66
(My dataset is the default 600 MB Libri dataset described in the "Data preparation and running" wiki chapter of this repository; my GPU is an Nvidia GTX 1080.) I figured that was to be expected and that I had to decrease the batch size and the number of layers/neurons. But the only configuration that works for me so far is 6 layers, 1200 neurons, and batch size 12; other configurations hit the
out of memory error
every time around epoch 5 or 6.
At the same time, my best WER is about 76.5 and doesn't seem to go any lower. I tried changing the batch size a little, but my maximum working value is 12 at the moment. I also tried an LSTM architecture with 600 neurons and 6 layers, and it gave worse results. I tried changing the learning rate, rate annealing, max-norm, and momentum, and adding permuteBatch, but I always end up around 76.5. Does anyone know what else I could do? Maybe the answer is a bigger batch size and a deeper architecture, but then I would need more computational power, which isn't a good option for me.
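Since the working batch size here was found by trial and error, it may help to probe it directly. A minimal sketch, assuming a Torch/CUDA install with cutorch and cunn; the toy model and the feature dimension of 1760 are illustrative stand-ins for the real network, not the repository's own code:

```lua
require 'cunn'
require 'cutorch'

local featDim = 1760
local model = nn.Sequential()
  :add(nn.Linear(featDim, featDim))
  :add(nn.ReLU())
  :cuda()

-- Try increasing batch sizes; pcall catches the CUDA out-of-memory error
-- instead of aborting, and getMemoryUsage reports what is left on the card.
for _, batchSize in ipairs({4, 8, 12, 16, 20}) do
  collectgarbage()
  local ok = pcall(function()
    local input = torch.CudaTensor(batchSize, featDim):uniform()
    model:zeroGradParameters()
    local output = model:forward(input)
    model:backward(input, output:clone())  -- backward adds to the activation memory
  end)
  local free, total = cutorch.getMemoryUsage(cutorch.getDevice())
  print(string.format('batch %2d  fits=%s  free=%.0f MB / %.0f MB',
                      batchSize, tostring(ok), free / 2^20, total / 2^20))
end
```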