Releases · UoA-CARES/cares_reinforcement_learning
v2.0.0
What's Changed
A lot...
Numerous new algorithms:
- SACD
- PER variations: TD3, SAC
- PALTD3
- LAP variations: TD3, SAC
- MAPER variations: TD3, SAC
- TQC
- REDQ
- CTD4
- RD variations: TD3, SAC
- NaSATD3
- AE variations: TD3, SAC
Autoencoder support:
- Vanilla (a minimal sketch of the general pattern is shown below)
- Burgess
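For context, a vanilla autoencoder in this setting simply compresses an image observation to a latent vector and reconstructs it; the AE variants of TD3/SAC learn from that latent representation. The sketch below is illustrative only: the class name, layer sizes, and the assumed 3x84x84 observation shape are not the package's actual implementation, and the Burgess variant (presumably the β-VAE-style architecture of Burgess et al.) is not shown.

```python
import torch
from torch import nn


class VanillaAutoencoder(nn.Module):
    """Minimal convolutional autoencoder for 3x84x84 observations (illustrative only)."""

    def __init__(self, latent_dim: int = 50):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=3, stride=2), nn.ReLU(),   # 84 -> 41
            nn.Conv2d(32, 32, kernel_size=3, stride=2), nn.ReLU(),  # 41 -> 20
            nn.Flatten(),
            nn.Linear(32 * 20 * 20, latent_dim),
        )
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim, 32 * 20 * 20), nn.ReLU(),
            nn.Unflatten(1, (32, 20, 20)),
            nn.ConvTranspose2d(32, 32, kernel_size=3, stride=2), nn.ReLU(),        # 20 -> 41
            nn.ConvTranspose2d(32, 3, kernel_size=3, stride=2, output_padding=1),  # 41 -> 84
        )

    def forward(self, observation: torch.Tensor):
        latent = self.encoder(observation)
        reconstruction = self.decoder(latent)
        return latent, reconstruction


# The reconstruction loss shapes the latent representation consumed by the AE variants.
model = VanillaAutoencoder()
batch = torch.rand(8, 3, 84, 84)
latent, reconstruction = model(batch)
loss = nn.functional.mse_loss(reconstruction, batch)
```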
Memory buffer:
- Prioritisation buffer functionality
- Saving and loading functionality using pickle (note: terribly storage-space inefficient for image-based learning; see the sketch below)
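For orientation, the sketch below shows the general shape of a prioritised replay buffer with pickle-based save/load. The class and method names are hypothetical, not the package's memory buffer API; it is only meant to illustrate why pickling raw transitions is convenient but produces very large files once observations are images.

```python
import pickle

import numpy as np


class SimplePrioritisedBuffer:
    """Toy proportional-prioritisation replay buffer (illustrative only)."""

    def __init__(self, capacity: int = 100_000, alpha: float = 0.6):
        self.capacity = capacity
        self.alpha = alpha  # how strongly priorities skew sampling
        self.transitions = []  # list of (state, action, reward, next_state, done)
        self.priorities = []   # one priority per stored transition

    def add(self, state, action, reward, next_state, done):
        # New samples get the current maximum priority so they are replayed at least once.
        max_priority = max(self.priorities, default=1.0)
        if len(self.transitions) >= self.capacity:
            self.transitions.pop(0)
            self.priorities.pop(0)
        self.transitions.append((state, action, reward, next_state, done))
        self.priorities.append(max_priority)

    def sample(self, batch_size: int):
        # Sample indices with probability proportional to priority^alpha.
        scaled = np.asarray(self.priorities) ** self.alpha
        probabilities = scaled / scaled.sum()
        indices = np.random.choice(len(self.transitions), size=batch_size, p=probabilities)
        batch = [self.transitions[i] for i in indices]
        return batch, indices

    def update_priorities(self, indices, td_errors):
        # Larger TD error means a higher chance of being sampled again.
        for index, error in zip(indices, td_errors):
            self.priorities[index] = abs(float(error)) + 1e-6

    def save(self, path: str):
        # Pickling raw transitions is simple, but every observation is stored
        # uncompressed, hence the very large files for image-based learning.
        with open(path, "wb") as file:
            pickle.dump({"transitions": self.transitions, "priorities": self.priorities}, file)

    def load(self, path: str):
        with open(path, "rb") as file:
            data = pickle.load(file)
        self.transitions = data["transitions"]
        self.priorities = data["priorities"]
```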
Record:
- Train/eval format standardised
Plotter:
- Plotting of any training/evaluation metrics (see the sketch below)
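As a rough illustration of the kind of plotting now supported, the snippet below draws an episode-reward curve from training and evaluation logs. The file names and column names (train.csv, eval.csv, total_steps, episode_reward) are assumptions for the example, not the package's actual Record output format.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical file and column names; adjust them to match the logs your
# training run actually produces.
train = pd.read_csv("train.csv")
evaluation = pd.read_csv("eval.csv")

figure, axis = plt.subplots()
axis.plot(train["total_steps"], train["episode_reward"], label="train", alpha=0.4)
axis.plot(evaluation["total_steps"], evaluation["episode_reward"], label="eval")
axis.set_xlabel("Environment steps")
axis.set_ylabel("Episode reward")
axis.legend()
figure.savefig("training_curve.png", dpi=150)
```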
Standardised code format:
- Type hints
- Formalised network formats: Base, Default, Custom (see the sketch below)
- Consistent naming of methods/functions/variables across algorithms
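The Base/Default/Custom split can be read as: an abstract base class fixes the interface, a default network covers the common case, and a custom network lets users supply their own architecture while remaining drop-in compatible. The sketch below illustrates that pattern with hypothetical class names and layer sizes; it is not the package's actual network code.

```python
from abc import ABC, abstractmethod

import torch
from torch import nn


class BaseActor(nn.Module, ABC):
    """Interface an actor network is expected to satisfy (hypothetical names)."""

    def __init__(self, observation_size: int, num_actions: int):
        super().__init__()
        self.observation_size = observation_size
        self.num_actions = num_actions

    @abstractmethod
    def forward(self, state: torch.Tensor) -> torch.Tensor:
        """Map a batch of states to a batch of actions in [-1, 1]."""


class DefaultActor(BaseActor):
    """Out-of-the-box MLP used when no custom network is supplied."""

    def __init__(self, observation_size: int, num_actions: int, hidden_size: int = 256):
        super().__init__(observation_size, num_actions)
        self.network = nn.Sequential(
            nn.Linear(observation_size, hidden_size),
            nn.ReLU(),
            nn.Linear(hidden_size, hidden_size),
            nn.ReLU(),
            nn.Linear(hidden_size, num_actions),
            nn.Tanh(),
        )

    def forward(self, state: torch.Tensor) -> torch.Tensor:
        return self.network(state)


class CustomActor(BaseActor):
    """User-defined architecture that still honours the base interface."""

    def __init__(self, observation_size: int, num_actions: int):
        super().__init__(observation_size, num_actions)
        self.network = nn.Sequential(
            nn.Linear(observation_size, 1024),
            nn.LayerNorm(1024),
            nn.SiLU(),
            nn.Linear(1024, num_actions),
            nn.Tanh(),
        )

    def forward(self, state: torch.Tensor) -> torch.Tensor:
        return self.network(state)
```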
Tests:
- Various tests covering algorithms, network configurations, and the memory buffer
Fixes:
- Saving and loading functions for all algorithms
- Various default configuration parameters
New Contributors
- @qiaoting159753 made their first contribution in #112
- @ManfredStoiber made their first contribution in #115
- @kvan910 made their first contribution in #156
- @h-yamani made their first contribution in #153
- @PKWadsy made their first contribution in #184
- @Doge-God made their first contribution in #186
Full Changelog: v1.0.0...v2.0.0
v1.0.0
What's Changed
- Dev/record by @beardyFace in #81
- Dev/dmcs by @beardyFace in #82
- Dev/training loops by @beardyFace in #83
- Dev/training loops by @beardyFace in #84
- Fixing make dirs all by @beardyFace in #85
- Dev/image observation by @beardyFace in #86
- Dev/external dev by @beardyFace in #87
- set_seed in helpers by @beardyFace in #88
- fix TD3 network to sequence by @dvalenciar in #89
- LR move out by @dvalenciar in #92
- Dev/pydantic by @beardyFace in #93
- Dev/seeds by @beardyFace in #95
- feat: make configs subscriptable by @retinfai in #98
- Fixed configurations not being in training config for PPO and value b… by @beardyFace in #100
- Dev/configurations generic by @beardyFace in #101
- Added noise decay into policy loop by @beardyFace in #103
- Dev/nasa td3 by @beardyFace in #104
- fix td3 to original by @dvalenciar in #105
- chore: remove training loops by @retinfai in #106
- feat: make network factory dynamic by @retinfai in #108
Full Changelog: v0.0.0-beta...v1.0.0
v0.0.0-beta
What's Changed
- Progress by @retinfai in #1
- chore: add requirements.txt by @retinfai in #2
- Chore by @retinfai in #3
- Ddpg by @retinfai in #4
- feat: add DQN wrapper by @retinfai in #5
- Plot util by @retinfai in #6
- feat: add DDPG, DoubleDQN, DuelingDQN complete wrappers by @retinfai in #7
- Chore by @retinfai in #8
- Td3 by @retinfai in #9
- refactor: abstract utils methods to Agent super class by @retinfai in #10
- chore: forgot to add Agent superclass to last commit by @retinfai in #11
- Chore by @retinfai in #12
- Ppo by @retinfai in #14
- refactor: move action formatting inside forward by @retinfai in #15
- Sac by @retinfai in #16
- Refactor by @dvalenciar in #17
- evaluation function by @dvalenciar in #19
- Refactor dvr by @dvalenciar in #20
- Modified Examples to be clearer and run with argsparse - moved RolloutBuffer to seperate instance, aligned with MemoryBuffer by @beardyFace in #21
- Corrected ppo example to denormalise instead of normalise the action … by @beardyFace in #22
- add eval() and train() lines by @dvalenciar in #23
- refactor: move memory outside of PPO by @retinfai in #25
- feat: add support for DuelingDQN by @retinfai in #31
- Update TD3 save/load model by @rainingx683 in #33
- Refactor all save/load models by @rainingx683 in #36
- Dev/refactor memory buffers by @Jack17432 in #38
- added future to be backwards compatible for python3.8 users on 20… by @beardyFace in #40
- Dev/refactor algorithm by @Jack17432 in #42
- Added timed loops and shifted setting of memory type outside of the i… by @beardyFace in #45
- Dev/impl augments by @Jack17432 in #47
- Workflow testing by @beardyFace in #43
- oops forgot to change the workflow by @Jack17432 in #48
- added augments to the memory buffer and removed the PrioritizedMemory… by @Jack17432 in #49
- 52 algorithm classes return value by @retinfai in #54
- 55 separate info from algorithm by @retinfai in #56
- feat: add initial Logger iteration by @retinfai in #57
- Dev/record #59 by @long715 in #60
- refactor: log once by @Jack17432 in #64
- NaSA-TD3 by @dvalenciar in #69
- chore: remove record example by @retinfai in #70
- Patched each algorithm to handle batch normalisation by @beardyFace in #72
- feat: add option for saving checkpoint models by @retinfai in #74
- 75 potential memory leak in record by @retinfai in #76
- 75 potential memory leak in record by @retinfai in #77
- bugFix: lower exploration rate for dqn by @retinfai in #79
- Load actor and critic models when create TD3 by @Bac0nEater in #80
New Contributors
- @retinfai made their first contribution in #1
- @rainingx683 made their first contribution in #33
- @Jack17432 made their first contribution in #38
- @long715 made their first contribution in #60
- @Bac0nEater made their first contribution in #80
Full Changelog: https://github.com/UoA-CARES/cares_reinforcement_learning/commits/v0.0.0-beta