Does the walker run reproduce correctly? #4
I know it is too late to comment on this issue, but could you tell me the hyperparameters you used for this experiment?
@yusukeurakami Hi, I also find that the results for `walker run` don't reproduce. The hyperparameters I use are:

[Hyperparameters: collapsed details block not preserved in this export]
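For reference, since the collapsed block above is not preserved: below is a sketch of the commonly cited Dreamer (v1) defaults for dm_control tasks, written as a plain Python dict. These values are my reading of the paper, not this repository's config, so verify them against the paper's appendix before relying on them:

```python
# Commonly cited Dreamer (v1) defaults for dm_control tasks.
# Assumptions drawn from the paper, NOT this repository's config.
dreamer_defaults = {
    "action_repeat": 2,         # DMC control-suite default
    "batch_size": 50,
    "sequence_length": 50,
    "imagination_horizon": 15,
    "discount": 0.99,           # gamma
    "lambda_": 0.95,            # lambda-return mixing
    "model_lr": 6e-4,
    "actor_lr": 8e-5,
    "value_lr": 8e-5,
    "free_nats": 3.0,
    "deterministic_size": 200,  # RSSM deterministic state h_t
    "stochastic_size": 30,      # RSSM stochastic state s_t
}
```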
The same question here. Hoping for your reply. Thanks.
Hi, thanks for the explanation. However, I'm a little confused about the fix.
It is correct that I try to predict r_t with (s_t, h_t), and I didn't use (s_{t+1}, a_t) to predict r_t. Actually, the original implementation of dreamer-torch (i.e., …
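For readers following along: here is a minimal sketch of what reward prediction from the current latent (s_t, h_t) looks like in a Dreamer-style PyTorch model. The class name and layer sizes are illustrative assumptions, not this repository's actual code:

```python
import torch
import torch.nn as nn

class RewardModel(nn.Module):
    """Predict r_t from the current latent (s_t, h_t).

    Hypothetical sketch; sizes follow common Dreamer configs,
    not necessarily this repository's.
    """
    def __init__(self, stoch_size=30, deter_size=200, hidden_size=200):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(stoch_size + deter_size, hidden_size),
            nn.ELU(),
            nn.Linear(hidden_size, hidden_size),
            nn.ELU(),
            nn.Linear(hidden_size, 1),
        )

    def forward(self, stoch, deter):
        # Conditioned on (s_t, h_t), not on (s_{t+1}, a_t).
        return self.net(torch.cat([stoch, deter], dim=-1)).squeeze(-1)
```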
Thanks @sumwailiu for your PR.
@coderlemon17 @yusukeurakami There are some mistakes in the first version of the fix (#12), where the reward computation is actually correct. Please refer to the second version (#13).
Hi, thank you for the cool repository!

I tried several tasks, `walker walk` and `cheetah run`, and they seem to work fine. But when I run `walker run`, the episode_reward cannot reach around 700. Is there any problem...? 🤔

The original paper seems to say `walker run` will achieve around 700 with 1M steps (see page 7, Figure 7: https://arxiv.org/pdf/1912.01603.pdf).

Thank you.
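If someone wants to rule out an environment-side issue while chasing this, here is a minimal sanity check of the `walker run` task through `dm_control` with a random policy. This only exercises the environment (a random policy scores far below 700); the package and task names here are my assumptions, not part of this repository:

```python
import numpy as np
from dm_control import suite

# Load the walker-run task from the dm_control suite.
env = suite.load(domain_name="walker", task_name="run")
spec = env.action_spec()

time_step = env.reset()
total_reward = 0.0
while not time_step.last():
    # Uniform random action within the spec bounds (baseline only).
    action = np.random.uniform(spec.minimum, spec.maximum, size=spec.shape)
    time_step = env.step(action)
    total_reward += time_step.reward or 0.0  # reward is None on reset

print(f"random-policy episode_reward: {total_reward:.1f}")
```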