The model does not converge for breakout #211
Comments
Same issue here, but for all envs.
Sent by email in reply to yungangwu's original post in werner-duvaud/muzero-general, Thursday, 20/10/2022, 03:58.
|
Have you tried any other hyperparameter settings? For example, with batch_size set to 1024, does the model converge under any hyperparameter settings? @JohnPPP |
Tried a bunch of hyperparameters on a bunch of games. Just wasted my time.
Perhaps others can show me how this can work...
Sent by email in reply to yungangwu's comment of Thursday, 20/10/2022, 07:47.
|
gg. I ran into the same problem and did a lot of experiments, but nothing worked. I don't know whether there is a mistake in the code. @JohnPPP |
Yeah, probably is.
Sent by email in reply to yungangwu's comment of Thursday, 20/10/2022, 09:31.
|
Did the reward stay zero the entire time, or did it occasionally get some reward? I have it working on cartpole, but not on Atari. That said, it still gets a reward of 2 or 3 occasionally in breakout, indicating that it is behaving randomly. |
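One way to check the "behaving randomly" reading is to compare the agent's episode returns against those of a uniformly random policy in the same environment. Below is a minimal sketch, assuming you have already collected both lists of returns yourself. The helper name `looks_random`, the `tolerance` parameter, and the numbers in the usage example are illustrative placeholders, not measurements from this issue and not part of muzero-general.

```python
from statistics import mean, stdev


def looks_random(agent_returns, random_returns, tolerance=2.0):
    """Return True if the agent's mean return is within `tolerance`
    standard errors of the random-policy mean return.

    Hypothetical helper for illustration; a crude check, not a formal test.
    """
    if len(agent_returns) < 2 or len(random_returns) < 2:
        raise ValueError("need at least two episodes per policy")
    diff = mean(agent_returns) - mean(random_returns)
    # Standard error of the difference between two independent means.
    se = (stdev(agent_returns) ** 2 / len(agent_returns)
          + stdev(random_returns) ** 2 / len(random_returns)) ** 0.5
    if se == 0:
        # No variation at all (e.g. every episode returned 0).
        return diff == 0
    return abs(diff) <= tolerance * se


# Placeholder numbers, made up for the example only.
agent = [0, 2, 0, 3, 1, 0, 2, 0]
baseline = [1, 0, 2, 0, 3, 0, 1, 2]
print(looks_random(agent, baseline))
```

If the check comes back true over a reasonable number of episodes, the occasional reward of 2 or 3 is more likely chance than learning.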
I also encountered the same problem. I tuned the hyperparameters for a long time, but I could not get good results in my environment. |
Yes, I have this problem too. I also experimented with another codebase, muzero-pytorch, on Gomoku, but after tuning for a long time I still didn't get good results.
Sent by email in reply to the previous comment, Sat, Dec 31, 2022, 23:25.
|
Is it possible that the failure comes from having to learn several networks at once? |
Yes, that's my guess too: it probably has three networks in series that must be optimized together, so it needs very careful training to converge. As for contact information, I use the WeChat app. Do you know this app?
|
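For context on the "three networks" point: MuZero learns a representation function, a dynamics function, and a prediction function, and trains them end to end through one loss summed over the unrolled steps, so an error in any one of them affects the others. The toy sketch below illustrates that coupling with scalar stand-ins. Every function, parameter name, and number here is a made-up assumption for illustration. It is not the muzero-general implementation, and the policy term of the real loss is omitted.

```python
def representation(obs, w_h):
    """h: observation -> hidden state (toy scalar version)."""
    return w_h * obs


def dynamics(state, action, w_g):
    """g: (hidden state, action) -> (next hidden state, predicted reward)."""
    next_state = w_g * state + action
    return next_state, 0.1 * next_state


def prediction(state, w_f):
    """f: hidden state -> predicted value (policy head omitted)."""
    return w_f * state


def joint_loss(obs, actions, target_rewards, target_values, w_h, w_g, w_f):
    """Squared-error loss summed over the unrolled steps.

    target_values has one more entry than actions (one per state visited).
    """
    state = representation(obs, w_h)
    loss = (prediction(state, w_f) - target_values[0]) ** 2
    for k, action in enumerate(actions):
        state, reward = dynamics(state, action, w_g)
        loss += (reward - target_rewards[k]) ** 2
        loss += (prediction(state, w_f) - target_values[k + 1]) ** 2
    return loss


# Illustrative inputs only.
data = (1.0, [1, 0], [0.0, 1.0], [0.5, 0.5, 1.0])
base = joint_loss(*data, 1.0, 1.0, 1.0)
# Perturbing any one of the three parameters changes the single shared loss.
for name, params in [("w_h", (1.1, 1.0, 1.0)),
                     ("w_g", (1.0, 1.1, 1.0)),
                     ("w_f", (1.0, 1.0, 1.1))]:
    print(name, joint_loss(*data, *params) != base)
```

Because the hidden state produced by the representation function feeds every later reward and value prediction, a poorly trained component can degrade the learning signal for the other two, which is consistent with the guess above that training needs care.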
You can add me on WeChat. |
Search before asking
- I have searched the MuZero issues <https://github.com/werner-duvaud/muzero-general/issues> and found no similar feature requests.
Description
I trained MuZero on breakout with the hyperparameters given in the code, but after 450,000 steps its reward was still 0 and it showed no sign of convergence. So I would like to ask: have the hyperparameters in the code been validated? Thank you!
Additional context
No response