Cost and Weights are NaN #6
Hello,

During training the cost goes to NaN, probably because one of the weights becomes too large and its value overflows the range of float32; once that happens, all of the other weights become NaN as well. I think the classic way to deal with this is to add Batch Normalization layers, which keep updates to the weights from growing too large, but my limited understanding of Theano and of your script has prevented me from testing this out. Also, the cost seems quite high: have you seen similar values in your training? Let me know your thoughts on this.
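For what it's worth, one way to pin down where the NaN first appears is Theano's NanGuardMode, which makes a compiled function raise an error the moment any op produces a NaN, an Inf, or a suspiciously large value, instead of letting it propagate into the weights. A minimal sketch; the toy logistic-regression graph below is illustrative, not the repo's actual model:

```python
import numpy as np
import theano
import theano.tensor as T
from theano.compile.nanguardmode import NanGuardMode

# Toy graph standing in for the real model.
x = T.matrix('x')
y = T.vector('y')
w = theano.shared(np.zeros(3, dtype=theano.config.floatX), name='w')

p = T.nnet.sigmoid(T.dot(x, w))
cost = T.nnet.binary_crossentropy(p, y).mean()
grad = T.grad(cost, w)

# NanGuardMode aborts with a report naming the offending op as soon as
# any intermediate value becomes NaN/Inf (or just very large), which is
# much easier to debug than a NaN that has already spread everywhere.
train = theano.function(
    [x, y], cost,
    updates=[(w, w - 0.1 * grad)],
    mode=NanGuardMode(nan_is_error=True, inf_is_error=True, big_is_error=True),
)
```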
Comments

Hi, I've seen similar cost values to yours in my experiments, although I don't recall running into NaNs.

That sounds fair to me; I appreciate the help and understand that you have other things going on. For all practical purposes, my dataset is similar to the one from the paper: a list of ICD, CPT, and NDC codes from each patient's visits to the doctor, with the transformations from the provided README (lists of indexed ints, with different patients separated by [-1]). I have tried implementing gradient clipping by adding grad_clip on total_cost in the build_model method, but even with thresholds of -0.5 and 0.5 I still eventually get NaN (probably because that is not the right way to do it; see the sketch below).

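If I read theano.gradient.grad_clip right, wrapping the scalar total_cost only clips the gradient entering the cost node, which is the constant 1.0, so it amounts to rescaling every gradient by the same factor rather than bounding any of them. The usual pattern is to clip each parameter's gradient just before the update. A minimal sketch, assuming plain SGD; clipped_sgd_updates and threshold are illustrative names, not part of the repo:

```python
import theano.tensor as T

def clipped_sgd_updates(cost, params, lr=0.01, threshold=1.0):
    """SGD updates with element-wise gradient clipping: each gradient
    component is squashed into [-threshold, threshold] before the step,
    so a single exploding gradient cannot overflow float32."""
    updates = []
    for p in params:
        g = T.grad(cost, p)
        g = T.clip(g, -threshold, threshold)  # clip the gradient, not the cost
        updates.append((p, p - lr * g))
    return updates
```

A norm-based variant (rescaling all gradients together when their joint L2 norm exceeds the threshold) preserves the gradient direction and is what most RNN papers mean by gradient clipping.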