Hi,
we're from the University of Wuerzburg and are trying to replicate your project for German report data.
For now, we simply tried to get your code to run and train on MIMIC, both with the default settings provided and with the settings reported in your paper. We made sure to use the same package versions as in the project.
However, the loss becomes NaN after only a few iterations. As a first step, we tried training on subsamples of the dataset: with a very small subset (~300 images) training does converge, but even with 1000 images the loss does not decrease. We also tried several different learning rates and other hyperparameters, but nothing has helped so far.
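For reference, here is a rough sketch of the kind of check one could add to localize where the NaN first shows up, assuming a standard PyTorch training loop (`loader`, `model`, `optimizer`, and `compute_loss` are placeholders, not names from your repo):

```python
import torch

# Make autograd raise an error at the backward op that first produces NaN/Inf.
torch.autograd.set_detect_anomaly(True)

for step, batch in enumerate(loader):  # `loader`, `model`, `optimizer`, `compute_loss` are placeholders
    optimizer.zero_grad()
    loss = compute_loss(model, batch)
    if not torch.isfinite(loss):
        print(f"Non-finite loss at step {step}: {loss.item()}")
        break
    loss.backward()
    # Optional: clip gradients to rule out exploding gradients as the cause.
    torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)
    optimizer.step()
```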
I was hoping you might be familiar with these problems and could give us some advice.
Thanks in advance!
We actually did solve it; it was a big mistake on our side!
Our GitHub repo was set up to use LFS and we had the reports inside the repo, which led to the report files simply being Git LFS pointer stubs instead of the actual text.
When we actually trained on the proper reports, we had no problems running the code.
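In case it helps anyone running into the same thing, here is a quick sketch for spotting LFS pointer stubs among the report files (the `reports/` path and the `*.txt` glob are just placeholders for your layout):

```python
from pathlib import Path

# A Git LFS pointer stub starts with this line instead of the real file content.
LFS_POINTER_PREFIX = b"version https://git-lfs.github.com/spec/v1"

def is_lfs_pointer(path: Path) -> bool:
    with path.open("rb") as f:
        return f.read(len(LFS_POINTER_PREFIX)) == LFS_POINTER_PREFIX

# `reports/` and `*.txt` are placeholders for wherever the report files live.
stubs = [p for p in Path("reports").rglob("*.txt") if p.is_file() and is_lfs_pointer(p)]
print(f"{len(stubs)} report files are still LFS pointer stubs")
```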
Maybe you have a similar issue on your end, fingers crossed!
Unfortunately it is not the same problem, as we keep all our cases on a secondary hard drive in the cluster, and the reports do appear to be passed correctly to the TextEncoder.
If you don't mind a few more questions: what batch size did you use? Did you evaluate other parameters besides the defaults?