0.2.7

@Trinkle23897 released this 08 Sep 13:38
· 646 commits to master since this release
64af7ea

API Change

  1. collect exactly the requested number of episodes when n_episode is given as a list of per-environment limits, and save fake data in cache_buffer when self.buffer is None (#184)
  2. add save_only_last_obs for replay buffer in order to save memory (#184)
  3. remove default value in batch.split() and add merge_last argument (#185)
  4. fix tensorboard logging: the horizontal axis now stands for env step instead of gradient step; add test results into tensorboard (#189)
  5. add max_batchsize in onpolicy algorithms (#189)
  6. keep only sumtree in segment tree implementation (#193)
  7. add __contains__ and pop in batch: key in batch, batch.pop(key, deft) (#189)
  8. remove dict return support for collector preprocess_fn (#189)
  9. remove **kwargs in ReplayBuffer (#189)
  10. add no_grad argument in collector.collect (#204)
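
The merge_last argument from item 3 can be illustrated with a small standalone sketch. It assumes merge_last means "fold a final chunk shorter than size into the chunk before it". The helper split_indices is hypothetical and is not Tianshou's implementation; it works on plain indices rather than on a Batch.

```python
from typing import Iterator, List


def split_indices(length: int, size: int, merge_last: bool = False) -> Iterator[List[int]]:
    """Yield consecutive index chunks of at most `size` elements.

    Hypothetical helper for illustration only. With merge_last=True,
    a short tail is appended to the preceding chunk instead of being
    yielded on its own.
    """
    if size <= 0:
        raise ValueError("size must be positive")
    start = 0
    while start < length:
        end = start + size
        # If a non-empty tail shorter than `size` would remain, absorb it now.
        if merge_last and 0 < length - end < size:
            end = length
        yield list(range(start, min(end, length)))
        start = end


# 10 items in chunks of 4: the tail [8, 9] is either separate or merged.
plain = list(split_indices(10, 4))
merged = list(split_indices(10, 4, merge_last=True))
```

Note that size has no default in this sketch either, mirroring the removal of the default value described in item 3.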

Enhancement

  1. add DQN Atari examples (#187)
  2. change the type-checking order in batch.py and converter.py so that the most common case is checked first (#189)
  3. Numba acceleration for GAE, nstep, and segment tree (#193)
  4. add policy.eval() in all test scripts' "watch performance" (#189)
  5. add test_returns (both GAE and nstep) (#189)
  6. improve code coverage (from 90% to 95%) and remove dead code (#189)
  7. polish examples/box2d/bipedal_hardcore_sac.py (#207)
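
For readers unfamiliar with the GAE computation mentioned in items 3 and 5, the following is a minimal pure-Python sketch of the standard backward recursion. It is not the Numba-accelerated code from this release; the function name gae_advantages and its parameters are assumptions made for illustration.

```python
from typing import List, Sequence


def gae_advantages(
    rewards: Sequence[float],
    values: Sequence[float],
    next_values: Sequence[float],
    dones: Sequence[bool],
    gamma: float = 0.99,
    gae_lambda: float = 0.95,
) -> List[float]:
    """Generalized Advantage Estimation over one trajectory (illustrative).

    delta_t = r_t + gamma * V(s_{t+1}) * (1 - done_t) - V(s_t)
    A_t     = delta_t + gamma * lambda * (1 - done_t) * A_{t+1}
    """
    advantages = [0.0] * len(rewards)
    running = 0.0
    # Walk backwards so each step can reuse the advantage of the next step.
    for t in reversed(range(len(rewards))):
        not_done = 0.0 if dones[t] else 1.0
        delta = rewards[t] + gamma * next_values[t] * not_done - values[t]
        running = delta + gamma * gae_lambda * not_done * running
        advantages[t] = running
    return advantages


# Two-step episode with zero value estimates and no discounting.
adv = gae_advantages([1.0, 1.0], [0.0, 0.0], [0.0, 0.0], [False, True],
                     gamma=1.0, gae_lambda=1.0)
```

With zero value estimates and gamma = gae_lambda = 1, the advantages reduce to the undiscounted reward-to-go, which makes a convenient sanity check for this kind of routine.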

Bug fix

  1. fix a bug in MAPolicy: buffer.rew = Batch() doesn't change buffer.rew (thanks mypy) (#207)
  2. set policy.eval() before collector.collect; omitting it was a bug (#204)
  3. fix shape inconsistency for torch.Tensor in replay buffer (#189)
  4. potential bugfix for subproc.wait (#189)
  5. fix RecurrentActorProb (#189)
  6. fix some incorrect type annotations (#189)
  7. fix a bug in tictactoe set_eps (#193)
  8. dirty fix for asyncVenv check_id test