This folder contains the code implementation for the plots presented in our paper.
- Launch a Docker container with the following commands:
# assume the current directory is the root of this repository
docker run --rm -it --gpus all --ipc=host -v ${PWD}:/app nvcr.io/nvidia/pytorch:20.12-py3
# inside the docker container, run:
cd /app
- Install conda, create a conda environment named `meow`, and activate it:
conda create --name meow python=3.8 -y
source activate
conda activate meow
- Install the dependencies for plotting:
pip install tbparse
pip install seaborn
Most of the evaluation results presented in the paper are provided in the following table:
# | Experiment | Position in the Paper | Environment | File |
---|---|---|---|---|
1 | Return Comparison | Fig. 3 | MuJoCo | results.zip |
2 | Return Comparison | Fig. 4 | Omniverse Isaac Gym | results.zip |
3 | Ablation Analysis (LRS & SCDQ) | Fig. 6 | MuJoCo | results.zip |
4 | Ablation Analysis (Deterministic) | Fig. 7 | MuJoCo | results.zip |
5 | Ablation Analysis (Affine) | Fig. A1 | MuJoCo | results.zip |
6 | Ablation Analysis (Parameterization) | Fig. A2 | MuJoCo | results.zip |
7 | Ablation Analysis (SAC+LRS) | Fig. A4 | MuJoCo | results.zip |
- Download `results.zip` and unzip it. You will obtain a directory called `smoothed` that contains the (smoothed) results evaluated in various environments. This directory has a nested structure organized by `${name_of_env}` and `${name_of_algorithm}`. For example, the results of experiment #1 in the table above are arranged as follows:
smoothed/
├── Ant-v4/
|   ├── meow/
|   |   ├── 1/
|   |   |   └── ${tfevents_log}
|   |   ├── 2/
|   |   ├── 3/
|   |   ├── 4/
|   |   └── 5/
|   ├── ddpg/
|   ├── ppo/
|   ├── td3/
|   ├── sql/
|   └── sac/
├── HalfCheetah-v4/
├── Hopper-v4/
├── Humanoid-v4/
└── Walker2d-v4/
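Before plotting, it can help to confirm that the unzipped directory matches the layout above. The following is a minimal sketch, not part of this repository: the helper names `list_runs` and `load_scalars` are hypothetical, and it assumes that each seed directory holds a file whose name contains `tfevents`. The scalar tag names stored in the logs are not documented here, so the sketch does not filter by tag.

```python
from pathlib import Path


def list_runs(root="smoothed"):
    """Return sorted (env, algorithm, seed, path) tuples found under `root`.

    Assumes the layout root/<env>/<algorithm>/<seed>/<tfevents file>.
    """
    runs = []
    for seed_dir in Path(root).glob("*/*/*"):
        # Keep only seed directories that actually contain an event file.
        if seed_dir.is_dir() and any(seed_dir.glob("*tfevents*")):
            env, algo, seed = seed_dir.parts[-3:]
            runs.append((env, algo, seed, seed_dir))
    return sorted(runs, key=lambda run: run[:3])


def load_scalars(run_dir):
    """Read the scalar logs of one run into a pandas DataFrame via tbparse."""
    # Imported lazily so that list_runs works without tbparse installed.
    from tbparse import SummaryReader

    return SummaryReader(str(run_dir)).scalars


if __name__ == "__main__":
    for env, algo, seed, path in list_runs():
        print(f"{env:<16} {algo:<6} seed={seed}  {path}")
```

Calling `load_scalars(path)` on one of the listed paths should return a long-format table with `step`, `tag`, and `value` columns, which is a convenient shape for seaborn.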
- Place the `smoothed` directory in the current folder and run the plotting scripts (i.e., `plot_fig_${num}.py`). For example, each subfigure of Fig. 3 in the paper can be reproduced using the following command:
python plot_fig_3.py
- You should then find the generated figures in a directory named `fig_3`.
NOTE: The smoothing method for these curves is the same as that implemented in TensorBoard. Please open an issue if you need the code implementation for the smoothing method.
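For orientation, TensorBoard's scalar smoothing is generally described as an exponential moving average with a correction for the bias toward the zero starting value. The sketch below illustrates that technique; it is not the code used to produce `results.zip`. The function name `ema_smooth` is hypothetical, and the default weight of 0.6 is an assumption taken from the default position of TensorBoard's smoothing slider, not a value stated for the paper.

```python
def ema_smooth(values, weight=0.6):
    """Debiased exponential moving average over a sequence of floats.

    `weight` lies in [0, 1): 0 leaves the curve unchanged, and values
    close to 1 smooth more heavily.
    """
    smoothed = []
    last = 0.0
    for i, value in enumerate(values, start=1):
        last = last * weight + (1.0 - weight) * value
        # Divide by (1 - weight**i) to undo the bias from starting at zero.
        smoothed.append(last / (1.0 - weight ** i))
    return smoothed
```

With this correction, a constant curve stays constant after smoothing, and the first smoothed point equals the first raw point.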
If you find this repository useful, please consider citing our paper:
@inproceedings{chao2024maximum,
title={Maximum Entropy Reinforcement Learning via Energy-Based Normalizing Flow},
author={Chao, Chen-Hao and Feng, Chien and Sun, Wei-Fang and Lee, Cheng-Kuang and See, Simon and Lee, Chun-Yi},
booktitle={Proceedings of the International Conference on Neural Information Processing Systems (NeurIPS)},
year={2024}
}