This is an implementation of the asynchronous reinforcement learning algorithms described in https://arxiv.org/pdf/1602.01783.pdf. The implementation targets OpenAI Gym's Doom and Atari environments.
There are 4 algorithms (a sketch of the n-step return shared by the multi-step methods follows the list):
- asynchronous one-step Q-learning
- asynchronous n-step Q-learning
- asynchronous advantage actor-critic (A3C)
- asynchronous advantage actor-critic with an LSTM (A3C LSTM)
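Both n-step Q-learning and A3C accumulate a discounted n-step return before each update. Below is a minimal sketch of that computation, not the repo's actual code; the function name `n_step_returns` and the example values are illustrative only:

```python
def n_step_returns(rewards, bootstrap_value, gamma=0.99):
    """Compute discounted returns R_t = r_t + gamma * R_{t+1},
    bootstrapping from the value estimate of the last state reached."""
    R = bootstrap_value
    returns = []
    for r in reversed(rewards):
        R = r + gamma * R
        returns.append(R)
    return list(reversed(returns))

# Example: 3 rewards collected by one actor-learner thread,
# bootstrapped with a value estimate of 0.5 for the final state.
print(n_step_returns([1.0, 0.0, 1.0], bootstrap_value=0.5))
```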
### Training

To execute one of the above algorithms and train an agent in one of the available environments, run:
```
python dqn.py --game "ppaquette/DoomDefendCenter-v0"
```
This runs the algorithm with its default parameters, using 8 actor-learner threads.
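The actor-learner pattern boils down to several worker threads, each with its own environment, updating a shared network. A hedged sketch of that structure is below; `actor_learner` is a hypothetical stand-in for the per-thread train loop in `dqn.py`:

```python
import threading

NUM_THREADS = 8  # default number of actor-learner threads

def actor_learner(thread_id):
    # Each thread would create its own environment instance and
    # asynchronously apply gradient updates to the shared network.
    pass  # placeholder; the real loop lives in the repo's scripts

threads = [threading.Thread(target=actor_learner, args=(i,))
           for i in range(NUM_THREADS)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```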
If you want to train on an Atari game, run:
```
python a3c_lstm.py --game "Breakout-v0" --game_type "Atari"
```
### Testing

To test an already trained agent, run:
```
python dqn.py --game "ppaquette/DoomDefendCenter-v0" --testing True --checkpoint_path "path/to/parameters/"
```
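For reference, an evaluation run amounts to rolling out the learned policy greedily for whole episodes. The sketch below uses the classic Gym API of this era; `load_policy` and `select_action` are hypothetical stand-ins for the repo's checkpoint loading and action selection:

```python
import gym

env = gym.make("Breakout-v0")
policy = None  # hypothetical: policy = load_policy("path/to/parameters/")

obs, done, total_reward = env.reset(), False, 0.0
while not done:
    # Random actions here; a real test run would use select_action(policy, obs).
    action = env.action_space.sample()
    obs, reward, done, info = env.step(action)
    total_reward += reward
print("episode reward:", total_reward)
```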
Below are some evaluations on OpenAI Gym:
- Using one-step DQN:
https://gym.openai.com/evaluations/eval_MGqu9wbTxS0fVFlz2puow
https://gym.openai.com/evaluations/eval_YB4PBRMQRWmWDW9eCXmV6g
- Using n-step DQN:
https://gym.openai.com/evaluations/eval_f8hCpqhQnqJEJCn3uiOWg
- Using A3C:
https://gym.openai.com/evaluations/eval_bxAN82ZRQe07kgfJTV5jA (this evaluation covers only 10M steps; results should improve considerably with 80M steps of training)
### References

- https://github.com/coreylynch/async-rl
- https://github.com/muupan/async-rl/wiki
- https://github.com/tatsuyaokubo/async-rl
- https://arxiv.org/pdf/1602.01783.pdf
- https://webdocs.cs.ualberta.ca/~sutton/book/the-book-2nd.html