Releases
v1.1.0
Breaking Changes
Upgrade to Stable-Baselines3 (SB3) >= 1.1.0
Upgrade to sb3-contrib >= 1.1.0
Add timeout handling (cf. the SB3 documentation)
HER is now a replay buffer class and no longer an algorithm
Removed PlotNoiseRatioCallback
Removed PlotActionWrapper
Changed 'lr' key in Optuna param dict to 'learning_rate' so the dict can be directly passed to SB3 methods (@justinkterry)
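The 'lr' to 'learning_rate' rename can be illustrated with a minimal sketch. It assumes a parameter dict saved before this release still uses the old key. `rename_lr_key` and `DummyAlgo` are hypothetical names introduced here for illustration only; `DummyAlgo` stands in for an SB3 algorithm class whose constructor accepts `learning_rate`.

```python
from typing import Any, Dict


def rename_lr_key(params: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of params with the legacy 'lr' key renamed to 'learning_rate'."""
    migrated = dict(params)  # copy so the caller's dict is left untouched
    if "lr" in migrated:
        if "learning_rate" in migrated:
            raise ValueError("both 'lr' and 'learning_rate' are set")
        migrated["learning_rate"] = migrated.pop("lr")
    return migrated


class DummyAlgo:
    """Hypothetical stand-in for an SB3 algorithm class (not part of SB3)."""

    def __init__(self, learning_rate: float = 3e-4, batch_size: int = 64) -> None:
        self.learning_rate = learning_rate
        self.batch_size = batch_size


# A dict in the old format, e.g. saved from an earlier optimization run.
old_params = {"lr": 1e-3, "batch_size": 128}
new_params = rename_lr_key(old_params)

# With the new key name, the dict can be unpacked directly into the constructor.
model = DummyAlgo(**new_params)
```

Returning a copy rather than mutating in place keeps the original dict available for comparison or logging.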
New Features
Add support for recording videos of best models and checkpoints (@mcres)
Add support for recording videos of training experiments (@mcres)
Add support for dictionary observations
Added experimental parallel training (with utils.callbacks.ParallelTrainCallback)
Added support for using multiple envs for evaluation
Added --load-last-checkpoint option for the enjoy script
Save Optuna study object at the end of hyperparameter optimization and plot the results (plotly package required)
Allow passing multiple folders to scripts/plot_train.py
Flag to save logs and optimal policies from each training run (@justinkterry)
Bug fixes
Fixed video rendering for PyBullet envs on Linux
Fixed get_latest_run_id() so it works on Windows too (@NicolasHaeffner)
Fixed video recording when using the HER replay buffer
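As an illustration of the Windows fix above, here is a minimal sketch of a platform-independent run-id lookup. It is not the project's actual implementation: `latest_run_id` is a hypothetical name, and the `<env_id>_<N>` folder layout is an assumption made for this example. The key point is that folder names are extracted with `os.path.basename` rather than by splitting on "/", so the same code works with Windows path separators.

```python
import glob
import os
import tempfile


def latest_run_id(log_path: str, env_id: str) -> int:
    """Return the highest N among folders named '<env_id>_<N>' in log_path, or 0."""
    max_run_id = 0
    pattern = os.path.join(glob.escape(log_path), glob.escape(env_id) + "_[0-9]*")
    for path in glob.glob(pattern):
        # basename handles both "/" and "\\" separators, unlike path.split("/")
        name = os.path.basename(path)
        prefix, _, suffix = name.rpartition("_")
        if prefix == env_id and suffix.isdigit():
            max_run_id = max(max_run_id, int(suffix))
    return max_run_id


# Usage with a throwaway directory tree (folder names are made up).
with tempfile.TemporaryDirectory() as tmp:
    for folder in ("CartPole-v1_1", "CartPole-v1_3", "CartPole-v1_old"):
        os.makedirs(os.path.join(tmp, folder))
    latest = latest_run_id(tmp, "CartPole-v1")  # highest numeric suffix
    empty = latest_run_id(tmp, "Pendulum-v0")   # no matching folder
```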
Documentation
Updated README (dict obs are now supported)
Other
Added is_bullet() to ExperimentManager
Simplify close() for the enjoy script
Updated docker image to include latest black version
Updated TD3 Walker2D model (thanks @modanesh)
Fixed typo in plot title (@scottemmons)
Minimum cloudpickle version added to requirements.txt (@amy12xx)
Fixed atari-py version (ROM missing in newest release)
Updated SAC and TD3 search spaces
Cleanup eval_freq documentation and variable name changes (@justinkterry)
Add clarifying print statement when printing saved hyperparameters during optimization (@justinkterry)
Clarify n_evaluations help text (@justinkterry)
Simplified hyperparameter files by making use of defaults
Added new TQC+HER agents
Add panda-gym environments (@qgallouedec)