# Addition of Rainbow and Soft Actor-Critic to RL codebase


Hi! Continuing mlpack's tradition, this page contains weekly updates on my contributions to mlpack during Google Summer of Code (2020).

Here's the output of the agents that I implemented this summer, solving some of the classical reinforcement learning environments.

*(demo GIFs of the trained agents)*

## Overview

I spent my summer this year working on adding Rainbow (Hessel et al., 2017) and Soft Actor-Critic (Haarnoja et al., 2019) to mlpack's existing reinforcement learning codebase.

I feel these are among the most recent and in-demand algorithms, so having them implemented in mlpack was important.

Here's a summary of what I accomplished by the end of the summer:

  1. Improved the current QLearning implementation.
  2. Implemented Rainbow as an improvement over DQN (a rough configuration sketch follows this list). This includes adding the following as extensions:
    • Dueling DQN
    • Noisy DQN
    • Categorical DQN
    • N-step DQN
  3. Wrote test cases for each of the implementations, after tuning hyperparameters and testing each for several runs.
  4. Implemented Soft Actor-Critic (SAC) for continuous action spaces, along with its tests (see the second sketch after this list).
  5. Created detailed documentation for all the above implementations.
  6. Created documented Jupyter notebooks with worked examples of agents solving classical reinforcement learning problems, using a TCP API to communicate with an OpenAI Gym instance.
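
For a sense of how these pieces fit together, here is a minimal sketch of assembling a DQN-style `QLearning` agent on `CartPole`, loosely based on mlpack's reinforcement learning tutorial from this period. The headers, constructor arguments, and configuration flags below are assumptions from memory and may differ between mlpack versions; the Rainbow extensions listed above are switched on through additional `TrainingConfig` flags and network wrappers whose exact names are documented in the codebase.

```cpp
// Minimal sketch of a DQN-style agent in mlpack (3.x-era API).
// Identifiers and constructor arguments are assumptions drawn from the
// RL tutorial and tests; check them against your mlpack version.
#include <mlpack/methods/reinforcement_learning/q_learning.hpp>
#include <mlpack/methods/reinforcement_learning/environment/cart_pole.hpp>
#include <mlpack/methods/reinforcement_learning/policy/greedy_policy.hpp>
#include <mlpack/methods/reinforcement_learning/replay/random_replay.hpp>
#include <mlpack/methods/reinforcement_learning/training_config.hpp>
#include <mlpack/methods/ann/ffn.hpp>
#include <mlpack/methods/ann/init_rules/gaussian_init.hpp>
#include <mlpack/methods/ann/layer/layer.hpp>
#include <mlpack/methods/ann/loss_functions/mean_squared_error.hpp>
#include <ensmallen.hpp>

using namespace mlpack::ann;
using namespace mlpack::rl;

int main()
{
  // Q-network: CartPole has a 4-dimensional state and 2 discrete actions.
  FFN<MeanSquaredError<>, GaussianInitialization> network(
      MeanSquaredError<>(), GaussianInitialization(0, 0.001));
  network.Add<Linear<>>(4, 128);
  network.Add<ReLULayer<>>();
  network.Add<Linear<>>(128, 2);

  // Epsilon-greedy exploration, annealed over 1000 steps, and a uniform
  // replay buffer (batch size 32, capacity 10000).
  GreedyPolicy<CartPole> policy(1.0, 1000, 0.1, 0.99);
  RandomReplay<CartPole> replayMethod(32, 10000);

  TrainingConfig config;
  config.StepSize() = 0.01;
  config.Discount() = 0.99;
  config.TargetNetworkSyncInterval() = 100;
  config.ExplorationSteps() = 100;
  config.DoubleQLearning() = true;  // One of the Rainbow-style extensions.
  config.StepLimit() = 200;
  // The dueling, noisy, categorical, and n-step extensions are enabled via
  // further config flags / network wrappers added in this project.

  QLearning<CartPole, decltype(network), ens::AdamUpdate, decltype(policy)>
      agent(config, network, policy, replayMethod);

  // Train; Episode() runs one episode and returns its total reward.
  for (size_t episode = 0; episode < 100; ++episode)
    agent.Episode();

  return 0;
}
```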
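
Along the same lines, here is a rough sketch of what training the new SAC agent on the continuous `Pendulum` environment might look like. Again, the template parameters, loss type, and `TrainingConfig` fields are assumptions modelled on the SAC tests and should be checked against the actual code and documentation.

```cpp
// Rough sketch of a Soft Actor-Critic agent on the continuous Pendulum
// environment. Identifiers are assumptions and may not match the current
// API exactly.
#include <mlpack/methods/reinforcement_learning/sac.hpp>
#include <mlpack/methods/reinforcement_learning/environment/pendulum.hpp>
#include <mlpack/methods/reinforcement_learning/replay/random_replay.hpp>
#include <mlpack/methods/reinforcement_learning/training_config.hpp>
#include <mlpack/methods/ann/ffn.hpp>
#include <mlpack/methods/ann/init_rules/gaussian_init.hpp>
#include <mlpack/methods/ann/layer/layer.hpp>
#include <mlpack/methods/ann/loss_functions/empty_loss.hpp>
#include <ensmallen.hpp>

using namespace mlpack::ann;
using namespace mlpack::rl;

int main()
{
  // Critic: Q(s, a) takes the 3-dimensional Pendulum state concatenated
  // with the 1-dimensional action and outputs a single value.
  FFN<EmptyLoss<>, GaussianInitialization> qNetwork(
      EmptyLoss<>(), GaussianInitialization(0, 0.1));
  qNetwork.Add<Linear<>>(3 + 1, 128);
  qNetwork.Add<ReLULayer<>>();
  qNetwork.Add<Linear<>>(128, 1);

  // Actor: maps the state to a tanh-squashed action in [-1, 1].
  FFN<EmptyLoss<>, GaussianInitialization> policyNetwork(
      EmptyLoss<>(), GaussianInitialization(0, 0.1));
  policyNetwork.Add<Linear<>>(3, 128);
  policyNetwork.Add<ReLULayer<>>();
  policyNetwork.Add<Linear<>>(128, 1);
  policyNetwork.Add<TanHLayer<>>();

  RandomReplay<Pendulum> replayMethod(32, 10000);

  TrainingConfig config;
  config.StepSize() = 0.001;
  config.Discount() = 0.99;
  config.TargetNetworkSyncInterval() = 1;  // Update target networks every step.
  config.UpdateInterval() = 1;

  SAC<Pendulum, decltype(qNetwork), decltype(policyNetwork), ens::AdamUpdate>
      agent(config, qNetwork, policyNetwork, replayMethod);

  for (size_t episode = 0; episode < 100; ++episode)
    agent.Episode();  // Runs one episode and returns its total reward.

  return 0;
}
```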

The original project proposal can be found on the GSoC website here.

## Weekly Progress

## The Code

Links to open and merged pull requests can be found here.

## Acknowledgements

There has been a lot of coding, experimentation, thousands of builds, and never-ending debugging; so much so that it doesn't feel real that it's all wrapping up. Special thanks to Marcus Edel, one of my mentors this year, who had answers to almost every problem I faced. With his and Rahul's support, I had a very smooth and enjoyable experience and hardly ever got stuck anywhere.

I would also like to acknowledge the help from other members of the community, who were always available whenever needed, just a small chat away. I strongly intend to continue contributing to mlpack in all the ways I can, because it's just too much fun. :)

I would also like to thank @shivanshs9, @adityauser, @YashJipkate and @lok-i for their proposal reviews.

Also, thanks to Google for this amazing opportunity and the generous funding.

This has been a summer worth remembering!