Official Implementation of Deep Variance Weighting (DVW) [Experiments in Section 7.2.2]

This repository is the official implementation of Deep Variance Weighting for the MinAtar experiments in Regularization and Variance-Weighted Regression Achieves Minimax Optimality in Linear MDPs: Theory and Practice.

We modified the CleanRL repository (commit d67ae0c) and the MinAtar repository (commit 548b136).
The implementation of M-DQN with DVW is in cleanrl/dqn_minatar.py (and in cleanrl/dqn.py for classic control). All other files are unchanged from the original CleanRL.

Requirements

Step 1: Install dependencies

# make sure you are in Variance-Weighted-MDVI/Deep-Variance-Weighting-MinAtar
poetry install

# Install MinAtar from the submodule
poetry shell
git submodule update --init && cd MinAtar
pip install -e .
cd ..  # return to the repository root before running the commands below
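
To confirm that the editable install worked, a quick import check can be run inside the Poetry shell. This is a minimal sketch, not part of the repository: `check_import` is a hypothetical helper, and it assumes the MinAtar package is importable under the module name `minatar`.

```shell
# Hypothetical helper: succeed if the given Python module can be imported
# by the interpreter named in $PYTHON (defaults to "python").
check_import() {
  "${PYTHON:-python}" -c "import $1" >/dev/null 2>&1
}

if check_import minatar; then
  echo "minatar imports correctly"
else
  echo "import check failed: re-run the install steps inside 'poetry shell'"
fi
```

If the check fails, the most likely causes are running it outside the Poetry environment or skipping the submodule initialization.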

Step 2: Log in to wandb (for ease of visualization and plotting)

wandb login # only required for the first time

You can test whether everything works by running:

# If you have trouble with the GPU, replace "--device cuda" with "--device cpu"

# Weighted M-DQN
poetry run python cleanrl/dqn_minatar.py --total-timesteps 50000 --env-id breakout --track --wandb-project-name minatar-test --exp-name Weight-Net-M-DQN --weight-type variance-net --device cuda
# M-DQN
poetry run python cleanrl/dqn_minatar.py --total-timesteps 50000 --env-id breakout --track --wandb-project-name minatar-test --exp-name M-DQN --weight-type none --device cuda

# Weighted DQN
poetry run python cleanrl/dqn_minatar.py --total-timesteps 50000 --env-id breakout --track --wandb-project-name minatar-test --exp-name Weight-Net-DQN --weight-type variance-net --kl-coef 0.0 --ent-coef 0.0 --device cuda
# DQN
poetry run python cleanrl/dqn_minatar.py --total-timesteps 50000 --env-id breakout --track --wandb-project-name minatar-test --exp-name DQN --weight-type none --kl-coef 0.0 --ent-coef 0.0 --device cuda
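
The four test commands form a 2x2 grid: weighting on or off (`--weight-type variance-net` or `none`), and M-DQN or DQN (the latter obtained by setting `--kl-coef` and `--ent-coef` to 0.0). The sketch below prints the same kind of command from that grid without executing anything. `build_cmd`, the `DEVICE` variable, and the experiment labels are illustrative and not part of the repository; only the flags come from the commands above.

```shell
# Hypothetical helper: print (do not run) one smoke-test command.
# $1 = experiment name, $2 = weight type, $3 = optional extra flags
build_cmd() {
  printf 'poetry run python cleanrl/dqn_minatar.py --total-timesteps 50000 --env-id breakout'
  printf ' --track --wandb-project-name minatar-test'
  printf ' --exp-name %s --weight-type %s' "$1" "$2"
  # Append the extra flags only when they were supplied.
  if [ -n "${3:-}" ]; then printf ' %s' "$3"; fi
  # Fall back to "cuda" when DEVICE is not set.
  printf ' --device %s\n' "${DEVICE:-cuda}"
}

# Zeroing both coefficients turns M-DQN into plain DQN.
DQN_FLAGS="--kl-coef 0.0 --ent-coef 0.0"

build_cmd Weight-Net-M-DQN variance-net ""
build_cmd M-DQN none ""
build_cmd Weight-Net-DQN variance-net "$DQN_FLAGS"
build_cmd DQN none "$DQN_FLAGS"
```

Setting `DEVICE=cpu` before calling the helper switches every printed command to the CPU.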

Run MinAtar Experiments

Run bash run_minatar.bash
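
The contents of run_minatar.bash are not reproduced here. As a rough sketch of how a sweep over games and seeds could be laid out by hand, the dry run below prints one command per (game, seed) pair. The game list is MinAtar's five games; the seed values are illustrative and are not the ones used in the paper; `--seed` is CleanRL's usual seed flag and is assumed to be unchanged in this fork; `sweep_cmds` is a hypothetical helper.

```shell
# Dry run: print one command per (game, seed) pair instead of executing it.
GAMES="asterix breakout freeway seaquest space_invaders"
SEEDS="1 2 3"  # illustrative values, not the seeds used in the paper

sweep_cmds() {
  for game in $GAMES; do
    for seed in $SEEDS; do
      echo "poetry run python cleanrl/dqn_minatar.py --env-id ${game} --seed ${seed} --weight-type variance-net --device cuda"
    done
  done
}

sweep_cmds
```

For the actual experiments, prefer the provided script, which encodes the settings used by the authors.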

Plot results

Run all the cells in minatar-results/result-plotter.ipynb. The figures will be saved in the minatar-results directory.

(Optional) Classic Control

If you are interested in other environments, try the following commands for classic control tasks:

# If you have trouble with the GPU, replace "--device cuda" with "--device cpu"

# Weighted M-DQN
poetry run python cleanrl/dqn.py --total-timesteps 50000 --env-id CartPole-v1 --track --wandb-project-name classic-control-test --exp-name Weight-Net-M-DQN --weight-type variance-net --device cuda
# M-DQN
poetry run python cleanrl/dqn.py --total-timesteps 50000 --env-id CartPole-v1 --track --wandb-project-name classic-control-test --exp-name M-DQN --weight-type none --device cuda

# Weighted DQN
poetry run python cleanrl/dqn.py --total-timesteps 50000 --env-id CartPole-v1 --track --wandb-project-name classic-control-test --exp-name Weight-Net-DQN --weight-type variance-net --kl-coef 0.0 --ent-coef 0.0 --device cuda
# DQN
poetry run python cleanrl/dqn.py --total-timesteps 50000 --env-id CartPole-v1 --track --wandb-project-name classic-control-test --exp-name DQN --weight-type none --kl-coef 0.0 --ent-coef 0.0 --device cuda
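
Instead of editing `--device` by hand, the device can be chosen automatically. The sketch below assumes the scripts run on PyTorch (which the `--device cuda` flag suggests) and uses `torch.cuda.is_available()` to decide; `pick_device` is a hypothetical helper, not part of the repository, and the final line only prints the command.

```shell
# Hypothetical helper: print "cuda" if PyTorch reports a usable GPU, else "cpu".
# Any failure (no torch, no driver, no interpreter) falls back to "cpu".
pick_device() {
  if "${PYTHON:-python}" -c "import sys, torch; sys.exit(0 if torch.cuda.is_available() else 1)" >/dev/null 2>&1; then
    echo cuda
  else
    echo cpu
  fi
}

DEVICE="$(pick_device)"
# Dry run: show the command that would be launched with the detected device.
echo "poetry run python cleanrl/dqn.py --total-timesteps 50000 --env-id CartPole-v1 --weight-type variance-net --device ${DEVICE}"
```

Run this inside the Poetry shell so that `python` resolves to the environment where the dependencies are installed.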
