Rename to RL-Zoo3 and better packaging (#291)
* Rename and better packaging

* Move plot scripts inside package
araffin committed Oct 3, 2022
1 parent 8cbe79d commit b372e9a
Showing 50 changed files with 926 additions and 876 deletions.
2 changes: 1 addition & 1 deletion .coveragerc
@@ -2,7 +2,7 @@
branch = False
omit =
tests/*
rl_zoo/utils/plot.py
rl_zoo3/utils/plot.py

[report]
exclude_lines =
2 changes: 1 addition & 1 deletion CHANGELOG.md
@@ -5,7 +5,7 @@
- low pass filter was removed

### New Features
- RL Zoo cli: `rl_zoo train` and `rl_zoo enjoy`
- RL Zoo cli: `rl_zoo3 train` and `rl_zoo3 enjoy`

### Bug fixes

6 changes: 4 additions & 2 deletions Makefile
@@ -1,4 +1,4 @@
LINT_PATHS = *.py tests/ scripts/ rl_zoo/
LINT_PATHS = *.py tests/ scripts/ rl_zoo3/

# Run pytest and coverage report
pytest:
@@ -10,7 +10,7 @@ check-trained-agents:

# Type check
type:
pytype -j auto rl_zoo/ tests/ scripts/ -d import-error
pytype -j auto rl_zoo3/ tests/ scripts/ -d import-error

lint:
# stop the build if there are Python syntax errors or undefined names
@@ -44,12 +44,14 @@ docker-gpu:

# PyPi package release
release:
# rm -r build/* dist/*
python setup.py sdist
python setup.py bdist_wheel
twine upload dist/*

# Test PyPi package release
test-release:
# rm -r build/* dist/*
python setup.py sdist
python setup.py bdist_wheel
twine upload --repository-url https://test.pypi.org/legacy/ dist/*
20 changes: 10 additions & 10 deletions README.md
@@ -154,13 +154,13 @@ python enjoy.py --algo algo_name --env env_id -f logs/ --exp-id 1 --load-last-ch

Upload model to hub (same syntax as for `enjoy.py`):
```
python -m rl_zoo.push_to_hub --algo ppo --env CartPole-v1 -f logs/ -orga sb3 -m "Initial commit"
python -m rl_zoo3.push_to_hub --algo ppo --env CartPole-v1 -f logs/ -orga sb3 -m "Initial commit"
```
you can choose custom `repo-name` (default: `{algo}-{env_id}`) by passing a `--repo-name` argument.

Download model from hub:
```
python -m rl_zoo.load_from_hub --algo ppo --env CartPole-v1 -f logs/ -orga sb3
python -m rl_zoo3.load_from_hub --algo ppo --env CartPole-v1 -f logs/ -orga sb3
```

## Hyperparameter yaml syntax
@@ -255,7 +255,7 @@ for multiple, specify a list:
```yaml
env_wrapper:
- rl_zoo.wrappers.DoneOnSuccessWrapper:
- rl_zoo3.wrappers.DoneOnSuccessWrapper:
reward_offset: 1.0
- sb3_contrib.common.wrappers.TimeFeatureWrapper
```
@@ -279,7 +279,7 @@ Following the same syntax as env wrappers, you can also add custom callbacks to

```yaml
callback:
- rl_zoo.callbacks.ParallelTrainCallback:
- rl_zoo3.callbacks.ParallelTrainCallback:
gradient_steps: 256
```
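The `env_wrapper` and `callback` entries above are dotted import paths, either bare or mapped to keyword arguments. A minimal sketch of how such a list could be turned into callables is shown here; `resolve_entries` is a hypothetical helper written for illustration and is not taken from the RL Zoo source, and stdlib classes stand in for the real wrapper classes so the example runs without extra dependencies.

```python
import importlib
from typing import Any, Callable, Dict, List, Tuple, Union

# One list item: either "pkg.module.Class" or {"pkg.module.Class": {kwargs}}
Entry = Union[str, Dict[str, Dict[str, Any]]]


def resolve_entries(entries: List[Entry]) -> List[Tuple[Callable[..., Any], Dict[str, Any]]]:
    """Hypothetical helper: map each entry to (class, kwargs)."""
    resolved = []
    for entry in entries:
        if isinstance(entry, str):
            dotted_path, kwargs = entry, {}
        else:
            if len(entry) != 1:
                raise ValueError(f"Expected a single-key mapping, got {entry!r}")
            dotted_path, kwargs = next(iter(entry.items()))
        # Split "pkg.module.Class" into the module path and the attribute name
        module_name, _, attr_name = dotted_path.rpartition(".")
        module = importlib.import_module(module_name)
        resolved.append((getattr(module, attr_name), dict(kwargs or {})))
    return resolved


# Stand-in for what a YAML loader would return for a two-item list
# (assumption: stdlib classes replace the real wrappers for this demo).
spec = [
    {"fractions.Fraction": {"numerator": 1, "denominator": 2}},
    "collections.OrderedDict",
]
for factory, kwargs in resolve_entries(spec):
    print(factory.__name__, kwargs)
```

Because the module path is part of each entry, renaming the package from `rl_zoo` to `rl_zoo3` requires updating every such path in the hyperparameter files, which is what the YAML changes in this commit do.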

@@ -306,19 +306,19 @@ Note: if you want to pass a string, you need to escape it like that: `my_string:
Record 1000 steps with the latest saved model:

```
python -m rl_zoo.record_video --algo ppo --env BipedalWalkerHardcore-v3 -n 1000
python -m rl_zoo3.record_video --algo ppo --env BipedalWalkerHardcore-v3 -n 1000
```

Use the best saved model instead:

```
python -m rl_zoo.record_video --algo ppo --env BipedalWalkerHardcore-v3 -n 1000 --load-best
python -m rl_zoo3.record_video --algo ppo --env BipedalWalkerHardcore-v3 -n 1000 --load-best
```

Record a video of a checkpoint saved during training (here the checkpoint name is `rl_model_10000_steps.zip`):

```
python -m rl_zoo.record_video --algo ppo --env BipedalWalkerHardcore-v3 -n 1000 --load-checkpoint 10000
python -m rl_zoo3.record_video --algo ppo --env BipedalWalkerHardcore-v3 -n 1000 --load-checkpoint 10000
```

## Record a Video of a Training Experiment
@@ -328,18 +328,18 @@ Apart from recording videos of specific saved models, it is also possible to rec
Record 1000 steps for each checkpoint, latest and best saved models:

```
python -m rl_zoo.record_training --algo ppo --env CartPole-v1 -n 1000 -f logs --deterministic
python -m rl_zoo3.record_training --algo ppo --env CartPole-v1 -n 1000 -f logs --deterministic
```

The previous command will create a `mp4` file. To convert this file to `gif` format as well:

```
python -m rl_zoo.record_training --algo ppo --env CartPole-v1 -n 1000 -f logs --deterministic --gif
python -m rl_zoo3.record_training --algo ppo --env CartPole-v1 -n 1000 -f logs --deterministic --gif
```

## Current Collection: 195+ Trained Agents!

Final performance of the trained agents can be found in [`benchmark.md`](./benchmark.md). To compute them, simply run `python -m rl_zoo.benchmark`.
Final performance of the trained agents can be found in [`benchmark.md`](./benchmark.md). To compute them, simply run `python -m rl_zoo3.benchmark`.

List and videos of trained agents can be found on our Huggingface page: https://huggingface.co/sb3

2 changes: 1 addition & 1 deletion docker/Dockerfile
@@ -18,7 +18,7 @@ COPY requirements.txt /tmp/


RUN \
mkdir -p ${CODE_DIR}/rl_zoo && \
mkdir -p ${CODE_DIR}/rl_zoo3 && \
pip uninstall -y stable-baselines3 && \
pip install -r /tmp/requirements.txt && \
pip install pip install highway-env==1.5.0 && \
2 changes: 1 addition & 1 deletion enjoy.py
@@ -1,4 +1,4 @@
from rl_zoo.enjoy import enjoy
from rl_zoo3.enjoy import enjoy

if __name__ == "__main__":
    enjoy()
6 changes: 3 additions & 3 deletions hyperparams/her.yml
@@ -59,7 +59,7 @@ FetchSlide-v1:
FetchPickAndPlace-v1:
env_wrapper:
- sb3_contrib.common.wrappers.TimeFeatureWrapper
# - rl_zoo.wrappers.DoneOnSuccessWrapper:
# - rl_zoo3.wrappers.DoneOnSuccessWrapper:
# reward_offset: 0
# n_successes: 4
# - stable_baselines3.common.monitor.Monitor
@@ -96,7 +96,7 @@ FetchReach-v1:
NeckGoalEnvRelativeSparse-v2:
model_class: 'sac'
# env_wrapper:
# - rl_zoo.wrappers.HistoryWrapper:
# - rl_zoo3.wrappers.HistoryWrapper:
# horizon: 2
# - sb3_contrib.common.wrappers.TimeFeatureWrapper
n_timesteps: !!float 1e6
@@ -122,7 +122,7 @@ NeckGoalEnvRelativeSparse-v2:
NeckGoalEnvRelativeDense-v2:
model_class: 'sac'
env_wrapper:
- rl_zoo.wrappers.HistoryWrapperObsDict:
- rl_zoo3.wrappers.HistoryWrapperObsDict:
horizon: 2
# - sb3_contrib.common.wrappers.TimeFeatureWrapper
n_timesteps: !!float 1e6
2 changes: 1 addition & 1 deletion hyperparams/ppo.yml
@@ -319,7 +319,7 @@ MiniGrid-FourRooms-v0:

CarRacing-v0:
env_wrapper:
- rl_zoo.wrappers.FrameSkip:
- rl_zoo3.wrappers.FrameSkip:
skip: 2
- gym.wrappers.resize_observation.ResizeObservation:
shape: 64
4 changes: 2 additions & 2 deletions hyperparams/ppo_lstm.yml
@@ -132,7 +132,7 @@ BipedalWalker-v3:
# TO BE TUNED
BipedalWalkerHardcore-v3:
# env_wrapper:
# - rl_zoo.wrappers.FrameSkip:
# - rl_zoo3.wrappers.FrameSkip:
# skip: 2
normalize: true
n_envs: 32
@@ -285,7 +285,7 @@ InvertedPendulumSwingupBulletEnv-v0:

CarRacing-v0:
env_wrapper:
# - rl_zoo.wrappers.FrameSkip:
# - rl_zoo3.wrappers.FrameSkip:
# skip: 2
- gym.wrappers.resize_observation.ResizeObservation:
shape: 64
16 changes: 8 additions & 8 deletions hyperparams/sac.yml
@@ -16,7 +16,7 @@ MountainCarContinuous-v0:

Pendulum-v1:
# callback:
# - rl_zoo.callbacks.ParallelTrainCallback
# - rl_zoo3.callbacks.ParallelTrainCallback
n_timesteps: 20000
policy: 'MlpPolicy'
learning_rate: !!float 1e-3
@@ -74,9 +74,9 @@ BipedalWalkerHardcore-v3:
HalfCheetahBulletEnv-v0: &pybullet-defaults
# env_wrapper:
# - sb3_contrib.common.wrappers.TimeFeatureWrapper
# - rl_zoo.wrappers.DelayedRewardWrapper:
# - rl_zoo3.wrappers.DelayedRewardWrapper:
# delay: 10
# - rl_zoo.wrappers.HistoryWrapper:
# - rl_zoo3.wrappers.HistoryWrapper:
# horizon: 10
n_timesteps: !!float 1e6
policy: 'MlpPolicy'
@@ -163,12 +163,12 @@ MinitaurBulletDuckEnv-v0:
# To be tuned
CarRacing-v0:
env_wrapper:
- rl_zoo.wrappers.FrameSkip:
- rl_zoo3.wrappers.FrameSkip:
skip: 2
# wrapper from https://github.com/araffin/aae-train-donkeycar
- ae.wrapper.AutoencoderWrapper:
ae_path: "logs/car_racing_rgb_160.pkl"
- rl_zoo.wrappers.HistoryWrapper:
- rl_zoo3.wrappers.HistoryWrapper:
horizon: 2
# frame_stack: 4
normalize: True
@@ -238,7 +238,7 @@ donkey-generated-track-v0:
env_wrapper:
- gym.wrappers.time_limit.TimeLimit:
max_episode_steps: 500
- rl_zoo.wrappers.HistoryWrapper:
- rl_zoo3.wrappers.HistoryWrapper:
horizon: 5
n_timesteps: !!float 1e6
policy: 'MlpPolicy'
@@ -262,9 +262,9 @@ donkey-generated-track-v0:
NeckEnvRelative-v2:
<<: *pybullet-defaults
env_wrapper:
- rl_zoo.wrappers.HistoryWrapper:
- rl_zoo3.wrappers.HistoryWrapper:
horizon: 2
# - rl_zoo.wrappers.LowPassFilterWrapper:
# - rl_zoo3.wrappers.LowPassFilterWrapper:
# freq: 2.0
# df: 25.0
n_timesteps: !!float 1e6
8 changes: 4 additions & 4 deletions hyperparams/tqc.yml
@@ -258,12 +258,12 @@ parking-v0:
# Tuned
CarRacing-v0:
env_wrapper:
- rl_zoo.wrappers.FrameSkip:
- rl_zoo3.wrappers.FrameSkip:
skip: 2
# wrapper from https://github.com/araffin/aae-train-donkeycar
- ae.wrapper.AutoencoderWrapper:
ae_path: "logs/car_racing_rgb_160.pkl"
- rl_zoo.wrappers.HistoryWrapper:
- rl_zoo3.wrappers.HistoryWrapper:
horizon: 2
# frame_stack: 4
normalize: True
@@ -280,7 +280,7 @@ RocketLander-v0:
n_timesteps: !!float 3e6
policy: 'MlpPolicy'
env_wrapper:
- rl_zoo.wrappers.FrameSkip:
- rl_zoo3.wrappers.FrameSkip:
skip: 4
- rl_zoo.wrappers.HistoryWrapper:
- rl_zoo3.wrappers.HistoryWrapper:
horizon: 2
1 change: 0 additions & 1 deletion requirements.txt
@@ -7,7 +7,6 @@ gym-minigrid
scikit-optimize
optuna
pytablewriter~=0.64
seaborn
pyyaml>=5.1
cloudpickle>=1.5.0
plotly
15 changes: 0 additions & 15 deletions rl_zoo/cli.py

This file was deleted.

1 change: 0 additions & 1 deletion rl_zoo/version.txt

This file was deleted.

2 changes: 1 addition & 1 deletion rl_zoo/__init__.py → rl_zoo3/__init__.py
@@ -1,6 +1,6 @@
import os

from rl_zoo.utils import (
from rl_zoo3.utils import (
ALGOS,
create_test_env,
get_latest_run_id,
2 changes: 1 addition & 1 deletion rl_zoo/benchmark.py → rl_zoo3/benchmark.py
@@ -9,7 +9,7 @@
import pytablewriter
from stable_baselines3.common.results_plotter import load_results, ts2xy

from rl_zoo.utils import get_hf_trained_models, get_latest_run_id, get_saved_hyperparams, get_trained_models
from rl_zoo3.utils import get_hf_trained_models, get_latest_run_id, get_saved_hyperparams, get_trained_models

parser = argparse.ArgumentParser()
parser.add_argument("--log-dir", help="Root log folder", default="rl-trained-agents/", type=str)
File renamed without changes.
22 changes: 22 additions & 0 deletions rl_zoo3/cli.py
@@ -0,0 +1,22 @@
import sys

from rl_zoo3.enjoy import enjoy
from rl_zoo3.plots import all_plots, plot_from_file, plot_train
from rl_zoo3.train import train


def main():
    script_name = sys.argv[1]
    # Remove script name
    del sys.argv[1]
    # Execute known script
    known_scripts = {
        "train": train,
        "enjoy": enjoy,
        "plot_train": plot_train,
        "plot_from_file": plot_from_file,
        "all_plots": all_plots,
    }
    if script_name not in known_scripts.keys():
        raise ValueError(f"The script {script_name} is unknown, please use one of {known_scripts.keys()}")
    known_scripts[script_name]()
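
The new `rl_zoo3/cli.py` dispatches on the first command-line argument and removes it from `sys.argv` so that the selected script's own argument parser only sees its flags. The sketch below isolates that pattern in a standalone function; `dispatch` is a hypothetical name, and the explicit check for a missing sub-command is an addition for illustration, not part of the commit.

```python
from typing import Callable, Dict, List


def dispatch(argv: List[str], known_scripts: Dict[str, Callable[[], None]]) -> None:
    """Run the sub-command named by argv[1] after removing it from argv."""
    if len(argv) < 2:
        # Assumption: fail with a readable message instead of an IndexError
        raise ValueError(f"Missing script name, please use one of {sorted(known_scripts)}")
    script_name = argv[1]
    if script_name not in known_scripts:
        raise ValueError(f"The script {script_name} is unknown, please use one of {sorted(known_scripts)}")
    # Drop the sub-command so the target sees e.g. ["rl_zoo3", "--algo", "ppo"]
    del argv[1]
    known_scripts[script_name]()


# Demo with a stand-in for `train` that records the arguments it would parse
seen = []
argv = ["rl_zoo3", "train", "--algo", "ppo"]
dispatch(argv, {"train": lambda: seen.append(list(argv))})
print(seen)
```

Passing `argv` explicitly, rather than reading `sys.argv` inside the function, keeps the sketch easy to test; a real entry point would call it with `sys.argv` itself so that the deletion is visible to the sub-command's parser.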
12 changes: 6 additions & 6 deletions rl_zoo/enjoy.py → rl_zoo3/enjoy.py
@@ -9,12 +9,12 @@
from huggingface_sb3 import EnvironmentName
from stable_baselines3.common.utils import set_random_seed

import rl_zoo.import_envs # noqa: F401 pylint: disable=unused-import
from rl_zoo import ALGOS, create_test_env, get_saved_hyperparams
from rl_zoo.callbacks import tqdm
from rl_zoo.exp_manager import ExperimentManager
from rl_zoo.load_from_hub import download_from_hub
from rl_zoo.utils import StoreDict, get_model_path
import rl_zoo3.import_envs # noqa: F401 pylint: disable=unused-import
from rl_zoo3 import ALGOS, create_test_env, get_saved_hyperparams
from rl_zoo3.callbacks import tqdm
from rl_zoo3.exp_manager import ExperimentManager
from rl_zoo3.load_from_hub import download_from_hub
from rl_zoo3.utils import StoreDict, get_model_path


def enjoy(): # noqa: C901