Code for "On the Robustness of Safe Reinforcement Learning under Observational Perturbations" (ICLR 2023)

liuzuxin/safe-rl-robustness

On the Robustness of Safe Reinforcement Learning under Observational Perturbations

This project provides the open-source implementation of the robust safe RL methods introduced in the ICLR 2023 paper "On the Robustness of Safe Reinforcement Learning under Observational Perturbations" (Liu et al., 2023).

Safe RL trains a policy to maximize reward while satisfying constraints. While prior work focuses on performance optimality, we find that the optimal solutions of many safe RL problems are neither robust nor safe under carefully designed observational perturbations. We propose two adversarial attacks: one maximizes the cost and the other maximizes the reward. One interesting and counter-intuitive finding is that the maximum reward attack is strong, as it can induce unsafe behaviors while remaining stealthy by maintaining the reward. We further propose a defense method based on adversarial training, which can keep the agent safe under attacks. Video demos are available at the project webpage.
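To make the two attack objectives concrete, here is a minimal sketch under stated assumptions. It perturbs the observation inside an l-infinity ball of radius epsilon so that the action chosen from the perturbed observation maximizes a critic evaluated at the true observation. The toy policy, the toy critics, the function name random_search_attack, and the use of plain random search are all illustrative assumptions; this is not the attacker implementation shipped in rsrl.

```python
import random
from typing import Callable, List, Sequence


def random_search_attack(
    obs: Sequence[float],
    policy: Callable[[Sequence[float]], float],
    critic: Callable[[Sequence[float], float], float],
    epsilon: float,
    num_samples: int = 256,
    seed: int = 0,
) -> List[float]:
    """Search the l-infinity ball of radius epsilon around obs for the
    perturbed observation whose induced action maximizes critic(obs, action).

    Random search is a stand-in optimizer chosen for simplicity (assumption).
    """
    rng = random.Random(seed)
    best_obs = list(obs)  # start from the clean observation
    best_value = critic(obs, policy(best_obs))
    for _ in range(num_samples):
        candidate = [x + rng.uniform(-epsilon, epsilon) for x in obs]
        value = critic(obs, policy(candidate))
        if value > best_value:
            best_obs, best_value = candidate, value
    return best_obs


# Hypothetical toy problem: scalar action, hand-written policy and critics.
def toy_policy(o: Sequence[float]) -> float:
    return 0.5 * o[0] - 0.2 * o[1]


def toy_reward_critic(o: Sequence[float], a: float) -> float:
    return a  # larger actions earn more reward


def toy_cost_critic(o: Sequence[float], a: float) -> float:
    return max(0.0, a - 0.3)  # cost appears once the action exceeds 0.3


clean_obs = [1.0, 1.0]
# Maximum reward (MR) style objective: maximize the reward critic.
mr_obs = random_search_attack(clean_obs, toy_policy, toy_reward_critic, epsilon=0.1)
# Maximum cost (MC) style objective: maximize the cost critic.
mc_obs = random_search_attack(clean_obs, toy_policy, toy_cost_critic, epsilon=0.1)
```

In this toy setup the reward-maximizing perturbation also pushes the action past the cost threshold, which mirrors the observation above that a reward-maximizing attack can induce unsafe behavior without lowering the reward.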

If you find this code useful, please consider citing:

@article{liu2022robustness,
  title={On the robustness of safe reinforcement learning under observational perturbations},
  author={Liu, Zuxin and Guo, Zijian and Cen, Zhepeng and Zhang, Huan and Tan, Jie and Li, Bo and Zhao, Ding},
  journal={arXiv preprint arXiv:2205.14691},
  year={2022}
}

The structure of this repo is as follows:

Robust safe RL libraries
├── rsrl  # package folder
│   ├── policy # core algorithm implementation
│   │   ├── model # stores the actor-critic model architecture
│   │   ├── policy_name # algorithm implementations
│   ├── util # logger and pytorch utils
│   ├── runner.py # training logic of the algorithms
│   ├── evaluator.py # evaluation logic of trained agents
├── script  # stores the training scripts
│   ├── config # stores some configs of the env and policy
│   ├── run.py # launch a single experiment
│   ├── experiment.py # launch multiple experiments in parallel with ray
│   ├── eval.py # evaluation script for trained agents
├── data # stores experiment results

Environment setup

System requirements

  • The repo is tested on Ubuntu 20.04 and should also work on Ubuntu 18.04.
  • We recommend using Anaconda3 for Python environment management.

Installation

  1. Activate a Python 3.7+ Anaconda virtual environment, then install the bullet_safety_gym simulation environment:
cd envs/Bullet-Safety-Gym
pip install -e .
cd ../..
  2. Back in the repo root folder, install the dependencies listed in requirement.txt and the rsrl library:
pip install -r requirement.txt
pip install -e .
  3. Install PyTorch according to your system configuration; see the official PyTorch installation instructions. For example, to install a CPU-only version of PyTorch via Anaconda3:
conda install pytorch cpuonly -c pytorch
  4. The MAD attacker requires the pysgmcmc library for optimization. Install it with:
pip install git+https://github.com/MFreidank/pysgmcmc@pytorch
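After the installation steps, a quick sanity check can confirm that the packages are visible to the active Python environment. The following is a minimal sketch, not part of the repo. The import names bullet_safety_gym, rsrl, torch, and pysgmcmc are assumed from the steps above, and check_modules is a hypothetical helper.

```python
import importlib.util
from typing import Dict, Iterable

# Import names assumed from the installation steps; adjust if they differ.
REQUIRED_MODULES = ("bullet_safety_gym", "rsrl", "torch", "pysgmcmc")


def check_modules(names: Iterable[str] = REQUIRED_MODULES) -> Dict[str, bool]:
    """Return a mapping from module name to whether it can be located."""
    status = {}
    for name in names:
        try:
            status[name] = importlib.util.find_spec(name) is not None
        except (ImportError, ValueError):
            # find_spec can raise for broken or half-initialized modules.
            status[name] = False
    return status


if __name__ == "__main__":
    for name, found in check_modules().items():
        print(f"{name:20s} {'found' if found else 'MISSING'}")
```

The check only locates the modules without importing them, so it stays fast and does not trigger any simulator or GPU initialization.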

How to run experiments

To run a single experiment:

python script/run.py --rs_mode vanilla --policy robust_ppo

To run multiple experiments in parallel:

python script/experiment.py -e experiment_name 

To evaluate a trained model, run:

python script/eval.py -d path_to_model

To evaluate multiple trained models in parallel:

python script/evaluation.py -d path_to_model -e env_name

The complete hyper-parameters can be found in script/config/config_robust_ppo.yaml.

In particular, PPO-Lagrangian has different robust training modes, which are specified by the rs_mode parameter. We detail the modes in the following table.

Algorithm   PPOL      ADV-PPOL(MC)   ADV-PPOL(MR)   PPOL-random   SA-PPOL   SA-PPOL(MC)   SA-PPOL(MR)
Mode        vanilla   max_cost       max_reward     uniform       kl        klmc          klmr
  • The proposed adversarial training methods correspond to the max_cost and max_reward modes.
  • For the SA-PPOL series, the modes are kl, klmc, and klmr. SA-PPOL with the original MAD attacker is the kl mode; SA-PPOL with the MC and MR attackers corresponds to klmc and klmr, respectively.
  • Note that FOCOPS also supports the vanilla, uniform, max_cost, and max_reward modes.
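The mapping between algorithm names and rs_mode values can be sketched as a small lookup that builds the corresponding run.py command line. This is an illustrative helper, not part of the repo: RS_MODES and build_command are hypothetical names, and only the --rs_mode and --policy flags shown in the single-experiment command above are used.

```python
from typing import Dict, List

# Algorithm name -> rs_mode value, copied from the table above.
RS_MODES: Dict[str, str] = {
    "PPOL": "vanilla",
    "ADV-PPOL(MC)": "max_cost",
    "ADV-PPOL(MR)": "max_reward",
    "PPOL-random": "uniform",
    "SA-PPOL": "kl",
    "SA-PPOL(MC)": "klmc",
    "SA-PPOL(MR)": "klmr",
}


def build_command(algorithm: str, policy: str = "robust_ppo") -> List[str]:
    """Build the argument list for launching one training run."""
    mode = RS_MODES[algorithm]  # raises KeyError for unknown algorithm names
    return ["python", "script/run.py", "--rs_mode", mode, "--policy", policy]


# Print one command per PPO-Lagrangian variant.
for name in RS_MODES:
    print(" ".join(build_command(name)))
```

Each returned list can be passed to subprocess.run from the repo root to launch the run; other hyper-parameters are left at their configured defaults.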

Pretrained weights

The pretrained weights are available here.

Acknowledgments

Part of the code is based on several public repos:
