Official implementation for:
Consistency Models as a Rich and Efficient Policy Class for Reinforcement Learning
Zihan Ding, Chi Jin
https://arxiv.org/abs/2309.16984
PyTorch, MuJoCo, and D4RL need to be installed. See requirements.txt for environment setup details:

```bash
pip install -r requirements.txt
```
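To confirm the setup, you can import the core dependencies and build a D4RL task. This is only a hedged sanity check using d4rl's standard gym registration, not part of the repo:

```python
# Sanity check (optional): confirm gym, d4rl, and MuJoCo are importable
# and a D4RL environment can be constructed with the pinned versions
# from requirements.txt.
import gym
import d4rl  # importing d4rl registers the offline environments with gym

env = gym.make("hopper-medium-v2")
print(env.observation_space, env.action_space)
```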
Both diffusion-model and consistency-model policies are supported; select one with the --model flag.
First, download the D4RL datasets with:

```bash
python download_data.py
```

The data will be saved in ./dataset/.
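For reference, a download step like this typically wraps d4rl's get_dataset() call. Below is a hedged sketch of what such a script might look like; the repo's actual download_data.py and its file layout in ./dataset/ may differ:

```python
# Hypothetical sketch of a D4RL download step; the repo's download_data.py
# and its file layout in ./dataset/ may differ.
import os
import numpy as np
import gym
import d4rl  # registers D4RL environments and provides get_dataset()

os.makedirs("dataset", exist_ok=True)
env = gym.make("hopper-medium-v2")
data = env.get_dataset()  # downloads and caches the raw offline dataset
np.savez(
    "dataset/hopper-medium-v2.npz",
    observations=data["observations"],
    actions=data["actions"],
    rewards=data["rewards"],
    terminals=data["terminals"],
)
```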
Offline RL training:

```bash
# train offline RL Consistency-AC for the hopper-medium-v2 task
python offline.py --env_name hopper-medium-v2 --model consistency --ms offline --exp RUN_NAME --save_best_model --lr_decay

# train offline RL Diffusion-QL for the walker2d-medium-expert-v2 task
python offline.py --env_name walker2d-medium-expert-v2 --model diffusion --ms offline --exp RUN_NAME --save_best_model --lr_decay
```
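For context, the consistency policy generates an action from noise in a single network evaluation, which is what makes it cheaper than a full reverse diffusion chain. Below is a hedged sketch of one-step action sampling; f_theta, sigma_max, and the clamping are illustrative assumptions, not the repo's exact code:

```python
# Hypothetical sketch of one-step action sampling from a consistency
# policy; f_theta and sigma_max are illustrative, not the repo's exact code.
import torch

@torch.no_grad()
def sample_action(f_theta, state, act_dim, sigma_max=80.0):
    # Start from a fully noised action at the largest noise level.
    x = sigma_max * torch.randn(state.shape[0], act_dim, device=state.device)
    sigma = torch.full((state.shape[0],), sigma_max, device=state.device)
    # A single consistency-function evaluation maps the noise to an action,
    # instead of iterating over a long reverse diffusion chain.
    action = f_theta(x, sigma, state)
    return action.clamp(-1.0, 1.0)  # assumes actions normalized to [-1, 1]
```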
Online RL training from scratch:

```bash
# train online RL Consistency-AC for the hopper-medium-v2 task
python online.py --env_name hopper-medium-v2 --num_envs 3 --model consistency --exp RUN_NAME

# train online RL Diffusion-QL for the walker2d-medium-expert-v2 task
python online.py --env_name walker2d-medium-expert-v2 --num_envs 3 --model diffusion --exp RUN_NAME
```
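For intuition, the actor-critic update in this setting samples actions from the generative policy and pushes them toward higher critic values. The sketch below is a hedged illustration of that Q-guided actor loss; the function names and the differentiable sampling path are assumptions, not the repo's implementation:

```python
# Hypothetical sketch of a Q-guided actor loss for a consistency policy;
# names and the sampling path are illustrative assumptions.
import torch

def actor_loss(f_theta, critic, states, act_dim, sigma_max=80.0):
    # Differentiable one-step sampling from the consistency policy.
    x = sigma_max * torch.randn(states.shape[0], act_dim, device=states.device)
    sigma = torch.full((states.shape[0],), sigma_max, device=states.device)
    actions = f_theta(x, sigma, states).clamp(-1.0, 1.0)
    # Improve the policy by maximizing Q(s, a), i.e. minimizing -Q.
    return -critic(states, actions).mean()
```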
Online RL initialized with offline pre-trained models (offline-to-online):

```bash
python online.py --env_name kitchen-mixed-v0 --num_envs 3 --model consistency --exp online_test --load_model 'results/**PATH**' --load_id 'online'
```

For example, with a model saved at results/**PATH**/actor_online.pth, the command above loads it to initialize online training.
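In code terms, initializing from a checkpoint is a plain state-dict load. A minimal sketch, assuming a standard torch.nn.Module actor (the function name and path are placeholders; the repo drives this via --load_model and --load_id):

```python
# Hypothetical sketch of loading an offline-pretrained actor before online
# fine-tuning; in the repo this is driven by --load_model and --load_id.
import torch

def load_pretrained_actor(actor: torch.nn.Module, ckpt_path: str) -> torch.nn.Module:
    # e.g. ckpt_path = "results/**PATH**/actor_online.pth" (**PATH** is a placeholder)
    state_dict = torch.load(ckpt_path, map_location="cpu")
    actor.load_state_dict(state_dict)
    return actor
```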
Use bash scripts:

```bash
bash scripts/offline.sh
bash scripts/online.sh
```
Use Slurm scripts:

```bash
sbatch scripts/offline.slurm
sbatch scripts/online.slurm
sbatch scripts/offline2online.slurm
```
If you find this open-source release useful, please cite it in your paper:
```bibtex
@article{ding2023consistency,
  title={Consistency Models as a Rich and Efficient Policy Class for Reinforcement Learning},
  author={Ding, Zihan and Jin, Chi},
  journal={arXiv preprint arXiv:2309.16984},
  year={2023}
}
```
We acknowledge the official repository of Diffusion-QL and the corresponding paper: https://arxiv.org/abs/2208.06193.