Repository for implementing adversarial attack for reinfrocement learning
Deep reinforcement learning (DRL) has been shown to have numerous potential applications in the real world. However, DRL algorithms are still extremely sensitive to noise and adversarial perturbations, hence inhibiting the deployment of RL in many real-life applications. Analyzing the robustness of DRL algorithms to adversarial attacks is an important prerequisite to enabling the widespread adoption of DRL algorithms. Common perturbations on DRL frameworks during test time include perturbations to the observation and the action channel. Compared with observation channel attacks, action channel attacks are less studied; hence, few comparisons exist that compare the effectiveness of these attacks in DRL literature. In this work, we examined the effectiveness of these two paradigms of attacks on common DRL algorithms and studied the natural robustness of DRL algorithms towards various adversarial attacks in hopes of gaining insights into the individual response of each type of algorithm under different attack conditions.
The figure above shows the various black-box strategies implemented. The attacks can be mounted on either one of the two channels, with the constraint on the attack following one of the three magnitudes and the specific instantiation following one of the four directions.
We implemented the three white-box attack strategies tested on the common benchmark RL algorithms. The white-box attacks we implemented are the Fast Gradient Sign Method (FGSM), Projected Gradient Descent (PGD) algorithm and the Myopic Action Space (MAS) attack algorithm.
The figures above shows the results of black box attack. The x-axis represents
The figures above shows the results of white box attack. The plots show the relationship between the value of
The figure above summarizes the algorithms' correlation between sensitivity and robustness.
clone repo and install MuJoCo and PyTorch in a Python>=3.6.13 environment.
git clone https://github.com/super864/Natural-Robustness-RL.git # clone
install MuJoCo and install PyTorch 1.10.2 refer https://github.com/openai/mujoco-py and https://pytorch.org/
Example run for running black box attack series:
under ddpg directory
python black_box_attack.py --agent ddpg
Each Algorithm directory has its own black_box_attack.py file to run
Example run for running white box attack series:
under ddpg directory
python Whitebox_DDPG_runner.py --agent ddpg
Each Algorithm directory has its own whitebox runner file to run
under ppo directory
Example run for black box attack
python train_ppo.py --load /mnt/raid10/Natural-Robustness-RL/ppo/withoutattackHalfCheetah/attack_HalfCheetah_seed_0/20210709T093820.456842/HalfCheetah-v2_0/agent --env HalfCheetah-v2 --gpu -1 --seed 0 --objective action --attack fix --space caction --direction same --percentage 5 --rollout BlackBox
Example run for white box attack
python train_ppo.py --load /mnt/raid10/Natural-Robustness-RL/ppo/withoutattackHalfCheetah/attack_HalfCheetah_seed_0/20210709T093820.456842/HalfCheetah-v2_0/agent --env HalfCheetah-v2 --gpu -1 --seed 0 --rollout PGD --epsilon 0.1 --clip True --lr 3 --attack_steps 25