Big refactor - SB3 upgrade - Last before v1.0
Pre-release
Breaking Changes
- Removed LinearNormalActionNoise
- Evaluation is now deterministic by default, except for Atari games
- sb3_contrib is now required
- TimeFeatureWrapper was moved to the contrib repo
- Replaced old plot_train.py script with updated plot_training_success.py
- Renamed n_episodes_rollout to train_freq tuple to match latest version of SB3
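
For existing hyperparameter files, the n_episodes_rollout rename can be handled with a small conversion step. The sketch below is only an illustration: migrate_train_freq is a hypothetical helper, not part of the zoo or SB3, and it assumes the old convention where a positive n_episodes_rollout meant "collect that many episodes" and -1 meant "disabled". The (frequency, unit) tuple with unit "step" or "episode" is the form SB3 accepts for train_freq.

```python
from typing import Any, Dict


def migrate_train_freq(hyperparams: Dict[str, Any]) -> Dict[str, Any]:
    """Hypothetical helper: fold n_episodes_rollout into a train_freq tuple."""
    migrated = dict(hyperparams)  # copy so the caller's dict is not mutated
    # Assumption: a missing or negative value means "not used"
    n_episodes = migrated.pop("n_episodes_rollout", -1)

    if n_episodes > 0:
        # Old style: collect a number of full episodes between updates
        migrated["train_freq"] = (n_episodes, "episode")
    elif isinstance(migrated.get("train_freq"), int):
        # Old style: collect a number of environment steps between updates
        migrated["train_freq"] = (migrated["train_freq"], "step")
    # A train_freq that is already a tuple (or absent) is left untouched
    return migrated


old_config = {"n_episodes_rollout": 1, "train_freq": -1, "batch_size": 64}
new_config = migrate_train_freq(old_config)
```

Unrelated keys such as batch_size pass through unchanged, so the helper can be applied to a whole hyperparameter dictionary at once.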
New Features
- Added option to choose which VecEnv class to use for multiprocessing
- Added hyperparameter optimization support for TQC
- Added support for QR-DQN from SB3 contrib
Bug fixes
- Improved detection of Atari games
- Fixed potential bug in plotting script when there are not enough timesteps
- Fixed a bug when using HER + DQN/TQC for hyperparam optimization
Documentation
- Improved documentation (@cboettig)
Other
- Refactored train script, now uses an ExperimentManager class
- Replaced make_env with SB3 built-in make_vec_env
- Added more type hints (utils/utils.py done)
- Use f-strings when possible
- Changed PPO Atari hyperparameters (removed vf clipping)
- Changed A2C Atari hyperparameters (eps value of the optimizer)
- Updated benchmark script
- Updated hyperparameter optim search space (commented gSDE for A2C/PPO)
- Updated DQN hyperparameters for CartPole
- Do not wrap channel-first image env (now natively supported by SB3)
- Removed hack to log success rate
- Simplified plot script