0.4.9

Released by @Trinkle23897 on 04 Jul 17:10 · 491 commits to master since this release · commit 6505484

Bug Fix

  1. Fix save_checkpoint_fn to return checkpoint_path (#659, @Trinkle23897)
  2. Fix an off-by-one bug in trainer iterator (#659, @Trinkle23897)
  3. Fix a bug in Discrete SAC evaluation; default to deterministic mode (#657, @nuance1979)
  4. Fix a bug in the trainer where the test reward was not logged because self.env_step is not set in the offline setting (#660, @nuance1979)
  5. Fix an exception when watching pistonball environments (#663, @ycheng517)
  6. Use env.np_random.integers instead of env.np_random.randint in Atari examples (#613, @ycheng517)

API Change

  1. Upgrade gym to >=0.23.1; support the seed and return_info arguments of reset (#613, @ycheng517)
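
The new reset contract can be illustrated with a minimal stub environment (a sketch only, not Tianshou or gym code; the DummyEnv class name is hypothetical):

```python
import random


class DummyEnv:
    """Minimal stand-in illustrating the gym>=0.23 reset signature."""

    def __init__(self):
        self.np_random = random.Random()

    def reset(self, *, seed=None, return_info=False):
        # gym>=0.23.1: reset accepts a seed, and optionally returns an
        # (obs, info) tuple instead of just obs when return_info=True.
        if seed is not None:
            self.np_random.seed(seed)
        obs = self.np_random.random()  # placeholder observation
        if return_info:
            return obs, {}
        return obs


env = DummyEnv()
obs1 = env.reset(seed=42)
obs2, info = env.reset(seed=42, return_info=True)
assert obs1 == obs2  # same seed -> same initial observation
```

Reseeding through reset (rather than a separate env.seed() call) is the behavior this upgrade standardizes on.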

New Features

  1. Add BranchingDQN for large discrete action spaces (#618, @BFAnas)
  2. Add show_progress option for trainer (#641, @michalgregor)
  3. Add support for clipping to DQNPolicy (#642, @michalgregor)
  4. Implement TD3+BC for offline RL (#660, @nuance1979)
  5. Add a MultiDiscrete-to-Discrete gym action space wrapper (#664, @BFAnas)
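
The idea behind a MultiDiscrete-to-Discrete wrapper can be sketched as a mixed-radix encoding (an illustration of the concept under that assumption, not Tianshou's actual implementation; the function names are hypothetical):

```python
def multidiscrete_to_discrete(action, nvec):
    """Encode a MultiDiscrete action (one index per dimension, with nvec
    giving each dimension's size) into a single Discrete index via
    mixed-radix positional encoding."""
    index = 0
    for a, n in zip(action, nvec):
        assert 0 <= a < n, "action component out of range"
        index = index * n + a
    return index


def discrete_to_multidiscrete(index, nvec):
    """Decode a Discrete index back into the original MultiDiscrete action."""
    action = []
    for n in reversed(nvec):
        action.append(index % n)
        index //= n
    return list(reversed(action))


nvec = [3, 4, 2]  # a MultiDiscrete space with 3 * 4 * 2 = 24 joint actions
action = [2, 1, 0]
idx = multidiscrete_to_discrete(action, nvec)
assert discrete_to_multidiscrete(idx, nvec) == action  # lossless round trip
```

The resulting flat Discrete space has prod(nvec) actions, which lets single-action-space algorithms such as DQN operate on MultiDiscrete environments.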

Enhancement

  1. Use EnvPool in the ViZDoom example (#634, @Trinkle23897)
  2. Add Atari (discrete) SAC examples (#657, @nuance1979)