Addition of Rainbow and Soft Actor-Critic to RL codebase

Hi! In keeping with mlpack's tradition, this page collects the weekly updates on my contributions to mlpack during Google Summer of Code (2020).

Here's the output of the agents that I implemented this summer, solving some of the classical reinforcement learning environments.

Overview:

I spent this summer adding Rainbow (Hessel et al., 2017) and Soft Actor-Critic (Haarnoja et al., 2019) to mlpack's existing reinforcement learning codebase.

I feel that these are among the most in-demand recent algorithms, and that having them implemented in mlpack was important.

Here's a summary of what I accomplished by the end of the summer:

- Improved the current QLearning implementation.
- Implemented Rainbow as an improvement on DQN. This includes adding the following as extensions:
  - Dueling DQN
  - Noisy DQN
  - Categorical DQN
  - N-step DQN
- Wrote test cases for each of the implementations, after tuning hyperparameters and testing each over several runs.
- Implemented Soft Actor-Critic (SAC) for continuous action spaces, along with its tests.
- Created detailed documentation for all the above implementations.
- Created documented Jupyter notebooks with worked examples of agents solving classic reinforcement learning problems, using a TCP API to communicate with an OpenAI Gym instance.
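
For readers unfamiliar with the extensions listed above, here is a minimal Python sketch of two of them: the N-step return target and the dueling aggregation of value and advantage estimates. It is illustrative only and is not mlpack's C++ implementation; the function names `n_step_return` and `dueling_q_values` are hypothetical and chosen just for this example.

```python
from typing import List, Sequence


def n_step_return(rewards: Sequence[float],
                  bootstrap_value: float,
                  gamma: float) -> float:
    """N-step target: r_0 + gamma*r_1 + ... + gamma^n * bootstrap_value.

    `rewards` holds the n rewards collected after the current state and
    `bootstrap_value` is the value estimate of the state reached after n steps.
    """
    ret = bootstrap_value
    # Fold from the last reward backwards so each step applies one discount.
    for r in reversed(rewards):
        ret = r + gamma * ret
    return ret


def dueling_q_values(state_value: float,
                     advantages: Sequence[float]) -> List[float]:
    """Dueling aggregation: Q(s, a) = V(s) + A(s, a) - mean_a A(s, a)."""
    mean_adv = sum(advantages) / len(advantages)
    return [state_value + a - mean_adv for a in advantages]


if __name__ == "__main__":
    print(n_step_return([1.0, 1.0, 1.0], bootstrap_value=10.0, gamma=0.5))
    print(dueling_q_values(2.0, [1.0, 3.0]))
```

Subtracting the mean advantage keeps the value and advantage streams identifiable: without it, a constant could be shifted between the two without changing the resulting Q-values.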

The original project proposal can be found on the GSoC website here.

Weekly Progress:

Week 1 - Layout for Dueling and Noisy DQNs
Week 2 and 3 - Finishing Dueling and Noisy DQNs
Week 4 and 5 - Completed Multi-step DQN, C51 almost ready
Week 6 and 7 - Training on gym_tcp_api, layout for Soft Actor-Critic
Week 8 and 9 - Soft Actor-Critic basic implementation complete, making solved example notebooks
Week 10 and 11 - C51 merged, Soft Actor-Critic almost complete, three new solved notebooks added, bug fixes 🐛🐛
Week 12 - Wrapping up

The Code:

Links to open and merged pull requests can be found here.

Acknowledgements:

There has been a lot of coding, experimentation, thousands of builds and never-ending debugging, so much that it hardly feels real that it is coming to a close. Special thanks to Marcus Edel, one of my mentors this year, who had answers to almost all the problems I faced. With his and Rahul's support, I had a very smooth and enjoyable experience, and I hardly ever got stuck anywhere.

I would also like to acknowledge the help of other members of the community, who were always available when needed, just a quick chat away. I fully intend to keep contributing to mlpack in every way I can, because it's just too much fun. :)

I would also like to thank @shivanshs9, @adityauser, @YashJipkate and @lok-i for their proposal reviews.

Also, thanks to Google for this amazing opportunity and the generous funding.

This has been a summer worth remembering!
