ODE-RL

Experiment code for ICML 2021 paper Continuous-time Model-based Reinforcement Learning. Implemented in Python 3.7.7 and torch 1.6.0 (later versions should be OK). Also requires torchdiffeq, TorchDiffEqPack and gym.

Quick introduction

runner.py should run off-the-shelf. The file can be used to reproduce our results and it also demonstrates how to
- create a continuous-time RL environment
- initiate our model (with different variational formulations) as well as baselines (PETS & deep PILCO)
- visualize the dynamics fits
- execute the main learning loop (Algorithm-1 in the paper)
ctrl folder has our model implementation as well as helper functions for training.
- ctrl/ctrl: creates our model and serves as an interface between the model and training/visualization functions.
- ctrl/dataset: contains state-action-reward trajectories and interpolation (for continuous-time action) classes.
- ctrl/dynamics: implements the dynamics model and is responsible for forward simulating all models.
- ctrl/policy: deterministic policy implementation
envs contains our continuous-time implementation of RL environments.
utils includes the function approximators.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

README.md

README.md

ODE-RL

Quick introduction

Files

README.md

Latest commit

History

README.md

File metadata and controls

ODE-RL

Quick introduction