What's in the package?
The package consists of three parts: the agent base module, the utility module, and the standard agents. A specific agent, say DDPG, inherits from the agent base class, which provides universal functions usable by any child-agent class. The utility module provides components commonly seen in modern DRL algorithms, such as exploration strategies, networks, and replay buffers. All of these are dedicated to facilitating fast implementations. This page explains most of them. Note that they are still evolving.
The class can be accessed as `from drl_implementation.agent.agent_base import Agent`. It expects several arguments to initialize; you can find the docstrings in the source script. The `__init__` function does the following things:
- set a CUDA device and seed PyTorch
- create a universal numpy random number generator
- create file storage directories if necessary
- set up a replay buffer
- set up a normalizer for state normalization if desired
- store some universal algorithmic parameters, such as learning rates
- create a dictionary to store PyTorch networks
- create a dictionary to store training statistics
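The steps above can be sketched as a minimal base-class `__init__`. The class and attribute names below (`AgentBase`, `network_dict`, `statistic_dict`) are illustrative assumptions, not the package's actual definitions:

```python
import numpy as np
import torch


class AgentBase:
    """Minimal sketch of an agent base class __init__ (illustrative, not the package's code)."""

    def __init__(self, seed=0, learning_rate=1e-3, use_cuda=True):
        # set a cuda device and seed pytorch
        self.device = torch.device("cuda" if use_cuda and torch.cuda.is_available() else "cpu")
        torch.manual_seed(seed)
        # create a universal numpy random number generator
        self.rng = np.random.default_rng(seed)
        # store some universal algorithmic parameters, such as learning rates
        self.learning_rate = learning_rate
        # dictionaries to store pytorch networks and training statistics
        self.network_dict = {}
        self.statistic_dict = {"episode_return": [], "critic_loss": []}
```

A child class would call this `__init__` via `super().__init__(...)` and then register its own networks in the dictionaries.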
It provides functions that can be used by any child-agent class, including:
- `_remember(...)`, for storing data into the replay buffer
- `_soft_update(...)`, for updating target networks with Polyak averaging
- `_save_network(...)` and `_load_network(...)`, for saving and loading PyTorch network checkpoints
- `_save_statistics(...)`, for saving the statistics dictionary using the json module
- `_plot_statistics(...)`, for plotting training statistics using the matplotlib module
The above-mentioned functions are generally applicable to any child-agent class. However, there are four functions that a child-agent class must implement:
- `run(...)`, for running a whole experiment: calling functions to train and test, and to manage, print, plot, and save data
- `_interact(...)`, for interacting with the environment, collecting data, and, most of the time, calling the `_learn(...)` function
- `_select_action(...)`, for producing actions, either exploratory or greedy ones
- `_learn(...)`, for updating the agent's networks
For specifics on what to implement in these four functions, open an agent class script, say DDPG, and dive in.
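To make the division of labor between the four required methods concrete, here is a toy child-agent skeleton. Everything here is hypothetical: the class uses a random policy instead of real networks, and the environment API (`reset()`/`step()` returning `(obs, reward, done)`) is an assumption for illustration only:

```python
import numpy as np


class RandomAgent:
    """Hypothetical child agent showing the four required methods (random policy)."""

    def __init__(self, action_dim, seed=0):
        self.rng = np.random.default_rng(seed)
        self.action_dim = action_dim
        self.update_count = 0  # stands in for real network updates

    def run(self, env, num_episodes=10):
        # run a whole experiment; printing/plotting/saving would also go here
        for _ in range(num_episodes):
            self._interact(env)

    def _interact(self, env):
        # interact with the environment, collect data, and call _learn
        obs, done = env.reset(), False
        while not done:
            action = self._select_action(obs)
            obs, reward, done = env.step(action)
            self._learn()

    def _select_action(self, obs, test=False):
        # exploratory action; a greedy branch would go here when test=True
        return self.rng.uniform(-1.0, 1.0, size=self.action_dim)

    def _learn(self):
        # update the agent's networks; here we only count the updates
        self.update_count += 1
```

A real agent such as DDPG replaces the random policy with actor/critic networks and stores transitions in the replay buffer inside `_interact(...)`.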