Why do we use test() to see the reward and testing_sample_step?
Can I use train() to see how the reward changes during training?
It seems that the last perf value is the reward.
Because we want to compare with the openai,
huangjiancong1 changed the title from "Question about to see the change of reward when trainning" to "Question about visualize the change of reward when training" on Jun 23, 2019
The test() function disables exploration so that the performance of the deterministic policy can be evaluated.
0.0.monitor.csv contains the training performance (with exploration).
0.1.monitor.csv contains the testing performance (without exploration).
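
If it helps, here is a minimal sketch for reading the two monitor files and smoothing the episode rewards so the training and testing curves can be compared. It assumes the files follow the OpenAI Baselines Monitor layout (an optional `#`-prefixed metadata line, then a header row with an `r` column holding the episode reward); I have not checked that against this repository. The helper names `load_episode_rewards`, `load_monitor_file` and `moving_average` are made up for illustration and are not part of the project.

```python
import csv


def load_episode_rewards(lines):
    """Return per-episode rewards from the lines of a monitor CSV file.

    Assumes the Baselines Monitor layout: an optional '#' metadata line,
    then a CSV header that includes an 'r' (episode reward) column.
    """
    # Drop the metadata line(s) so that the header row comes first.
    data_lines = [ln for ln in lines if not ln.startswith("#")]
    reader = csv.DictReader(data_lines)
    return [float(row["r"]) for row in reader]


def load_monitor_file(path):
    """Read a monitor file from disk and return its episode rewards."""
    with open(path, newline="") as f:
        return load_episode_rewards(f.readlines())


def moving_average(values, window):
    """Smooth a reward curve with a trailing moving average.

    The first few entries average over fewer than `window` episodes.
    """
    smoothed = []
    for i in range(len(values)):
        start = max(0, i - window + 1)
        chunk = values[start:i + 1]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed
```

For example, `moving_average(load_monitor_file("0.0.monitor.csv"), 100)` would give a smoothed training curve, and the same call on `0.1.monitor.csv` a smoothed testing curve. Either list can then be passed to any plotting tool.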