0.2.5

Trinkle23897 released this 22 Jul 06:59
bd9c3c7

New feature

Multi-agent Reinforcement Learning: https://tianshou.readthedocs.io/en/latest/tutorials/tictactoe.html (#122)

Documentation

Add a tutorial on the Batch class to standardize the behavior of Batch: https://tianshou.readthedocs.io/en/latest/tutorials/batch.html (#142)

Bugfix

  • Fix inconsistent shape in A2CPolicy and PPOPolicy. Please be careful when dealing with log_prob (#155)
  • Fix handling of a list of tensors inside Batch, e.g., Batch(a=[np.zeros(3), torch.zeros(3)]) (#147)
  • Fix buffer update when stack_num > 0 (#154)
  • Remove useless kwargs
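
The log_prob caution above is easiest to see with a small shape experiment. The sketch below is not the code changed in #155; it uses plain NumPy and made-up values (`log_prob`, `advantage`, and the batch size are all hypothetical) to show one common way a shape mismatch can silently change a policy-gradient style loss: a column vector of shape (N, 1) multiplied by a flat vector of shape (N,) broadcasts to an (N, N) matrix instead of an elementwise product.

```python
import numpy as np

# Hypothetical per-sample values, chosen only for illustration.
log_prob = np.log(np.array([[0.1], [0.2], [0.3], [0.4]]))  # shape (4, 1)
advantage = np.array([0.0, 1.0, 2.0, 3.0])                 # shape (4,)

# Broadcasting (4, 1) against (4,) produces a (4, 4) matrix:
# every log-probability is paired with every advantage.
wrong = log_prob * advantage

# Flattening log_prob first gives the intended elementwise product.
right = log_prob.reshape(-1) * advantage

print(wrong.shape)  # (4, 4)
print(right.shape)  # (4,)

# The two reductions generally disagree, so the error would not raise
# an exception; it would only show up as a different loss value.
print(wrong.mean(), right.mean())
```

Checking the shapes of log_prob and the quantity it is multiplied with (or asserting they are equal before the product) is a cheap guard against this kind of mistake.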