Hey,
Not sure if this repo is still being maintained, but I'm having some trouble with the benchmarking of the randomly generated levels, e.g., append-still. For some reason the episodes end after 1, 2, or 3 steps with a score of 0. I think something is triggering the `done` flag in the benchmark environments too early.
To reproduce this I installed locally and simply ran
`./start-training.py testAgent`
This occurs with and without wandb.
Training seems to work fine, so this is likely related to the time limit for benchmark levels or the way the benchmark levels are run.
Any help is massively appreciated!
Hmm, I'm not able to reproduce this by running `./start-training.py testAgent`. It might be worthwhile to drop the video interval all the way down to 1 so that you can get a clearer view of what's actually happening, and after that you can always set a breakpoint and step through the code to see what's going on. Remember that you can print the board state to the terminal, which can help in debugging. Sorry that that's not more helpful!
Could you tell me the details of your setup? Everything compiled ok, presumably? Let me know if you're not able to figure this out tomorrow and I'll try to look into it in more detail.
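If it helps, something along these lines is roughly how I'd watch for the early termination. This is only a sketch against the usual gym-style interface; `make_benchmark_env` is a placeholder for however the benchmark environments actually get constructed in your run, and the random policy is just there to exercise the env:

```python
# Rough sketch for spotting an early `done`, assuming a gym-style env.
# `make_benchmark_env` is a placeholder; swap in whatever builds the
# benchmark environment in your setup.
def probe_episode(env, max_steps=50):
    """Step the env with random actions and print when the episode ends."""
    obs = env.reset()
    for t in range(1, max_steps + 1):
        action = env.action_space.sample()
        obs, reward, done, info = env.step(action)
        print(f"step={t} reward={reward} done={done} info={info}")
        if done:
            print(f"episode ended after {t} step(s)")
            break

# env = make_benchmark_env("append-still")  # placeholder construction
# probe_episode(env)
```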
Thanks for the help!
I've managed to track down the issue to the fixed benchmark levels in the npz file.
They load OK and seem to look fine, but the environments return `done` straight away. If I replace the benchmark levels with randomly generated ones using the validation seed, then it works as expected.
This suits me better as I was hoping to evaluate on randomly generated levels.
I imagine this might be the result of different numpy versions or something like that, but at the same time the levels look normal.
For reference, I'm running on Ubuntu 20.04 with Python 3.9.
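In case it's useful, this is roughly how I've been poking at the file. Treat it as a sketch: the path is a guess at where the fixed benchmark levels live in my checkout, and loading with `allow_pickle=True` is just what happened to work for me, not necessarily the repo's own loading code:

```python
# Quick sanity check of the fixed benchmark levels and the numpy version.
# The path below is a guess; point it at the actual .npz shipped with the repo.
import numpy as np

print("numpy version:", np.__version__)

data = np.load("benchmarks/append-still.npz", allow_pickle=True)
for key in data.files:
    arr = data[key]
    print(key, getattr(arr, "shape", None), getattr(arr, "dtype", None))
```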