0.4.9

Released by @Trinkle23897 on 04 Jul 17:10 · 491 commits to master since this release · commit 6505484

Bug Fix

  1. Fix save_checkpoint_fn to return checkpoint_path (#659, @Trinkle23897)
  2. Fix an off-by-one bug in trainer iterator (#659, @Trinkle23897)
  3. Fix a bug in Discrete SAC evaluation; default to deterministic mode (#657, @nuance1979)
  4. Fix a bug in the trainer where the test reward was not logged because self.env_step is not set in the offline setting (#660, @nuance1979)
  5. Fix an exception when watching pistonball environments (#663, @ycheng517)
  6. Use env.np_random.integers instead of env.np_random.randint in Atari examples (#613, @ycheng517)

API Change

  1. Upgrade gym to >=0.23.1; support the seed and return_info arguments of reset (#613, @ycheng517)
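
The new reset contract can be illustrated with a minimal stub environment (a sketch only, not Tianshou or gym code; the DummyEnv class name is hypothetical):

```python
import random


class DummyEnv:
    """Minimal stand-in illustrating the gym>=0.23 reset signature."""

    def __init__(self):
        self.np_random = random.Random()

    def reset(self, *, seed=None, return_info=False):
        # gym>=0.23.1: reset accepts a seed, and optionally returns an
        # (obs, info) tuple instead of just obs when return_info=True.
        if seed is not None:
            self.np_random.seed(seed)
        obs = self.np_random.random()  # placeholder observation
        if return_info:
            return obs, {}
        return obs


env = DummyEnv()
obs1 = env.reset(seed=42)
obs2, info = env.reset(seed=42, return_info=True)
assert obs1 == obs2  # same seed -> same initial observation
```

Reseeding through reset (rather than a separate env.seed() call) is the behavior this upgrade standardizes on.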

New Features

  1. Add BranchingDQN for large discrete action spaces (#618, @BFAnas)
  2. Add show_progress option for trainer (#641, @michalgregor)
  3. Add support for clipping to DQNPolicy (#642, @michalgregor)
  4. Implement TD3+BC for offline RL (#660, @nuance1979)
  5. Add a MultiDiscrete-to-Discrete gym action space wrapper (#664, @BFAnas)
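
The idea behind a MultiDiscrete-to-Discrete wrapper can be sketched as a mixed-radix encoding (an illustration of the concept under that assumption, not Tianshou's actual implementation; the function names are hypothetical):

```python
def multidiscrete_to_discrete(action, nvec):
    """Encode a MultiDiscrete action (one index per dimension, with nvec
    giving each dimension's size) into a single Discrete index via
    mixed-radix positional encoding."""
    index = 0
    for a, n in zip(action, nvec):
        assert 0 <= a < n, "action component out of range"
        index = index * n + a
    return index


def discrete_to_multidiscrete(index, nvec):
    """Decode a Discrete index back into the original MultiDiscrete action."""
    action = []
    for n in reversed(nvec):
        action.append(index % n)
        index //= n
    return list(reversed(action))


nvec = [3, 4, 2]  # a MultiDiscrete space with 3 * 4 * 2 = 24 joint actions
action = [2, 1, 0]
idx = multidiscrete_to_discrete(action, nvec)
assert discrete_to_multidiscrete(idx, nvec) == action  # lossless round trip
```

The resulting flat Discrete space has prod(nvec) actions, which lets single-action-space algorithms such as DQN operate on MultiDiscrete environments.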

Enhancement

  1. Use EnvPool in the ViZDoom example (#634, @Trinkle23897)
  2. Add Atari (discrete) SAC examples (#657, @nuance1979)