Question about visualize the change of reward when training #7

Open
huangjiancong1 opened this issue Jun 23, 2019 · 1 comment

huangjiancong1 commented Jun 23, 2019

Why do we use test() to see the reward and testing_sample_step?

Can I use train() to see how the reward changes during training?
It seems that the last perf is the reward.

I ask because we want to compare the results with OpenAI's.

@huangjiancong1 huangjiancong1 changed the title Question about to see the change of reward when trainning Question about visualize the change of reward when training Jun 23, 2019
@matthieu637
Owner

The test() function disables exploration in order to test the performance of the deterministic policy.
0.0.monitor.csv contains the training performance (with exploration).
0.1.monitor.csv contains the testing performance (without exploration).
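
For anyone who wants to turn those two files into reward curves, here is a minimal sketch. It assumes the files follow the OpenAI Baselines Monitor layout (an optional comment line starting with `#`, then a CSV header with an `r` column holding the episode reward); that layout is an assumption and is not confirmed in this thread. The helper names `load_episode_rewards` and `moving_average` are hypothetical, and the sample numbers are made up.

```python
import csv
import io


def load_episode_rewards(lines):
    """Return per-episode rewards from an iterable of monitor-file lines.

    Assumption: Baselines-style Monitor layout, i.e. optional comment
    lines starting with '#', then a CSV header with an 'r' column.
    """
    data_lines = [ln for ln in lines if ln.strip() and not ln.startswith("#")]
    reader = csv.DictReader(data_lines)
    return [float(row["r"]) for row in reader]


def moving_average(values, window):
    """Smooth a reward curve with a trailing window of at most `window` episodes."""
    smoothed = []
    running = 0.0
    for i, v in enumerate(values):
        running += v
        if i >= window:
            # Drop the episode that just fell out of the window.
            running -= values[i - window]
        smoothed.append(running / min(i + 1, window))
    return smoothed


# Tiny inline sample standing in for 0.0.monitor.csv (made-up numbers).
sample = io.StringIO(
    '#{"t_start": 0.0, "env_id": "Example-v0"}\n'
    "r,l,t\n"
    "1.0,10,0.5\n"
    "3.0,12,1.1\n"
    "5.0,9,1.6\n"
)
rewards = load_episode_rewards(sample)
smoothed = moving_average(rewards, window=2)
```

With real data, open each file and pass the handle in, e.g. `with open("0.0.monitor.csv") as f: train = load_episode_rewards(f)`, do the same for `0.1.monitor.csv`, and plot the two smoothed lists against the episode index to compare training and testing performance.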
