Releases: rlberry-py/rlberry
v0.7.3
v0.7.2
Relax dependencies
v0.7.1
rlberry-v0.7.0
Release of version 0.7.0 of rlberry.
This is the first rlberry release since we did a major restructuring of rlberry into three repositories (PR #379):
- rlberry (this repo): everything for RL that is not an agent or an environment, e.g. experiment management, parallelization, statistical tools, plotting...
- rlberry-scool: repository for teaching materials, e.g. simplified algorithms for teaching, notebooks and tutorials for learning RL...
- rlberry-research: repository of the agents and environments used inside the Inria Scool team.
Changes since the last version:
PR #397
- Automatic save after fit() in ExperimentManager
PR #396
- Improve coverage and fix the version workflow
- Switch the documentation from ReadTheDocs to GitHub Pages
PR #382
- Switch to Poetry
PR #376
- New plot_writer_data function that does not depend on seaborn and that can plot a smoothed curve with a confidence band if scikit-fda is installed.
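A minimal sketch of how the new function might be called, assuming an ExperimentManager instance named xp_manager has already been fit and its writer logged a tag named "episode_rewards"; the keyword names shown here are illustrative, not a guaranteed API.

```python
# Hedged sketch: plot learning curves without seaborn. `xp_manager` is assumed
# to be an already-fitted ExperimentManager; smoothing and confidence bands are
# only produced when scikit-fda is installed.
from rlberry.manager import plot_writer_data

plot_writer_data(xp_manager, tag="episode_rewards")
```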
rlberry-v0.6.0
Release of version 0.6.0 of rlberry.
This is the last rlberry release before we do a major restructuring of rlberry into three repositories:
- rlberry: everything for RL that is not an agent or an environment, e.g. experiment management, parallelization, statistical tools, plotting...
- rlberry-scool: repository for teaching materials, e.g. simplified algorithms for teaching, notebooks and tutorials for learning RL...
- rlberry-research: repository of the agents and environments used inside the Inria Scool team.
Changes since the last version:
PR #276
- Non-adaptive multiple tests for agent comparison.
PR #365
- Fix Sphinx version to <7.
PR #350
- Rename AgentManager to ExperimentManager.
PR #326
- Moved SAC from experimental to torch agents. Tested and benchmarked.
PR #335
- Upgrade from Python 3.9 to Python 3.10
rlberry-v0.5.0
Release of version 0.5.0 of rlberry.
With this release, rlberry switches to gymnasium!
New in version 0.5.0:
- Merge the gymnasium branch into main, making gymnasium the default library for environments in rlberry.
Remark: for now, Stable-Baselines3 has no stable release with gymnasium support. To use Stable-Baselines3 with gymnasium, install the main branch from GitHub:
pip install git+https://github.com/DLR-RM/stable-baselines3
rlberry-v0.4.1
Release of version 0.4.1 of rlberry.
Before installing rlberry, please install the fork of gym 0.21: "gym[accept-rom-license] @ git+https://github.com/rlberry-py/gym_fix_021"
New in 0.4.1
PR #307
- Create a fork of gym 0.21 to work around non-backward-compatible setuptools changes.
PR #306
- Add Q-learning agent rlberry.agents.QLAgent and SARSA agent rlberry.agents.SARSAAgent.
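A short sketch of how these tabular agents might be used; the GridWorld construction arguments and the budget/eval semantics below are assumptions, so check the released docstrings before relying on them.

```python
# Hedged sketch: train the new tabular Q-learning agent on a small grid world.
from rlberry.agents import QLAgent
from rlberry.envs import GridWorld

env = GridWorld(nrows=5, ncols=5)
agent = QLAgent(env)
agent.fit(budget=10_000)   # interaction budget (assumed to be a step count)
print(agent.eval())        # estimated value of the learned policy (assumed)
```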
PR #298
- Move old scripts (jax agents, attention networks, old examples...) that we won't maintain from the main branch to an archive branch.
PR #277
- Add and update code to use Atari games environments.
rlberry-v0.4.0
Release of version 0.4.0 of rlberry.
New in 0.4.0
PR #273
- Change the default behavior of plot_writer_data so that if seaborn version >= 0.12.0 is installed, a 90% percentile interval is used instead of the standard deviation.
PR #269
- Add rlberry.envs.PipelineEnv, a simple way to define a pipeline of wrappers.
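A hedged sketch of what a wrapper pipeline could look like; the (wrapper_class, wrapper_kwargs) tuple format and the RescaleRewardWrapper used here are assumptions for illustration, only the general idea (a chain of wrappers applied to a base environment) comes from the note above.

```python
# Hedged sketch: build an environment plus a chain of wrappers in one object.
from rlberry.envs import PipelineEnv, gym_make
from rlberry.wrappers import RescaleRewardWrapper

env = PipelineEnv(
    gym_make,
    {"id": "CartPole-v1"},
    wrappers=[(RescaleRewardWrapper, {"reward_range": (0.0, 1.0)})],
)
```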
PR #262
- PPO can now handle continuous actions.
- Implementation of Munchausen DQN in rlberry.agents.torch.MDQNAgent (see the sketch after this list).
- Comparison of MDQN with the DQN agent in the long tests.
- Compress the pickles used to save the trained agents.
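A hedged sketch of training the new Munchausen DQN agent through an AgentManager; the constructor defaults and the fit budget below are illustrative, not benchmark settings.

```python
# Hedged sketch: fit the Munchausen DQN agent on CartPole via an AgentManager.
from rlberry.agents.torch import MDQNAgent
from rlberry.envs import gym_make
from rlberry.manager import AgentManager

manager = AgentManager(
    MDQNAgent,
    (gym_make, {"id": "CartPole-v1"}),
    fit_budget=10_000,   # illustrative budget, not a benchmark setting
    n_fit=1,
)
manager.fit()
```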
PR #235
- Implementation of rlberry.envs.SpringCartPole environment, an RL environment featuring two cartpoles linked by a spring.
- Improve logging: the logging level can now be changed with rlberry.utils.logging.set_level() (see the sketch after this list).
- Introduce smoothing in the curves produced by plot_writer_data when only one seed is used.
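A one-line sketch of the logging change mentioned above; the level names are assumed to follow the standard logging module.

```python
# Hedged sketch: raise or lower rlberry's logging verbosity globally.
from rlberry.utils.logging import set_level

set_level("INFO")   # e.g. "DEBUG", "INFO", "WARNING" (standard logging levels assumed)
```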
PR #223
- Moved PPO from experimental to torch agents. Tested and benchmarked.
rlberry-v0.3.0
Release of version 0.3.0 of rlberry.
New in 0.3.0
PR #206
- Creation of a Deep RL tutorial in the user guide.
PR #132
- New tracker class rlberry.agents.bandit.tools.BanditTracker to track statistics to be used in bandit algorithms.
PR #191
- Possibility to generate a profile with rlberry.agents.manager.AgentManager.
- Misc improvements on A2C.
- New StableBaselines3 wrapper rlberry.agents.stable_baselines.StableBaselinesAgent to import StableBaselines3 agents.
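A hedged sketch of running a Stable-Baselines3 algorithm through rlberry's manager using the new wrapper; the init_kwargs keys ("algo_cls", "policy") are assumptions, so check the wrapper's docstring for the exact interface.

```python
# Hedged sketch: wrap SB3's PPO so it can be managed like any rlberry agent.
from stable_baselines3 import PPO

from rlberry.agents.stable_baselines import StableBaselinesAgent
from rlberry.envs import gym_make
from rlberry.manager import AgentManager

manager = AgentManager(
    StableBaselinesAgent,
    (gym_make, {"id": "CartPole-v1"}),
    init_kwargs={"algo_cls": PPO, "policy": "MlpPolicy"},  # assumed keyword names
    fit_budget=10_000,
    n_fit=2,
)
manager.fit()
```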
PR #119
- Improving documentation for agents.torch.utils.
- New replay buffer rlberry.agents.utils.replay.ReplayBuffer, aiming to replace the code in utils/memories.py (see the sketch after this list).
- New DQN implementation, aiming to fix reproducibility and compatibility issues.
- Implements Q(lambda) in the DQN agent.
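A hedged sketch of the new replay buffer's intended workflow (declare entries, append transitions, sample chunks); the method names and signatures are assumptions based on this note and the documentation of that period.

```python
# Hedged sketch: declare what the buffer stores, append transitions while
# acting, then sample minibatches for DQN-style updates.
import numpy as np
from rlberry.agents.utils.replay import ReplayBuffer

rng = np.random.default_rng(0)
buffer = ReplayBuffer(max_replay_size=10_000, rng=rng)
buffer.setup_entry("observations", np.float32)
buffer.setup_entry("actions", np.int64)
buffer.setup_entry("rewards", np.float32)

# ... inside the interaction loop:
buffer.append(
    {"observations": np.zeros(4, dtype=np.float32), "actions": 0, "rewards": 1.0}
)
buffer.end_episode()

# Once enough transitions are stored, sample a batch of short trajectories.
batch = buffer.sample(batch_size=32, chunk_size=8)
```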
Feb 22, 2022 (PR #126)
- Setup rlberry.__version__ (currently 0.3.0dev0).
- Record the rlberry version in an AgentManager attribute.
- Equality of AgentManagers: override the __eq__ method of the AgentManager class.
Feb 14-15, 2022 (PR #97, #118)
- (feat) Add basic bandit environments and agents. See rlberry.agents.bandits.IndexAgent and rlberry.envs.bandits.Bandit.
- Thompson Sampling bandit algorithm with Gaussian or Beta prior.
- Base class for bandit algorithms with custom save & load functions (called rlberry.agents.bandits.BanditWithSimplePolicy).
- (fix) Fixed bug in FiniteMDP.sample(): the terminal state was being checked with self.state instead of the given state.
- (feat) Option to use 'fork' or 'spawn' in rlberry.manager.AgentManager.
- (feat) AgentManager output_dir now has a timestamp and a short ID by default.
- (feat) Gridworld can be constructed from a string layout.
- (feat) max_workers argument for rlberry.manager.AgentManager to control the maximum number of processes/threads created by the fit method (see the sketch after this list).
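A hedged sketch combining two items from this list (max_workers and the fork/spawn option); the mp_context keyword name and the ValueIterationAgent/GridWorld imports are assumptions for illustration.

```python
# Hedged sketch: fit several agent instances in parallel, capping the number
# of workers and choosing how processes are started.
from rlberry.agents.dynprog import ValueIterationAgent
from rlberry.envs import GridWorld
from rlberry.manager import AgentManager

manager = AgentManager(
    ValueIterationAgent,
    (GridWorld, {"nrows": 5, "ncols": 5}),
    fit_budget=100,
    n_fit=4,
    max_workers=2,        # at most 2 processes/threads created by fit()
    mp_context="spawn",   # assumed keyword for the 'fork' vs 'spawn' choice
)
manager.fit()
```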
Feb 04, 2022
- Add rlberry.manager.read_writer_data to load an agent's writer data from pickle files and make it simpler to customize rlberry.manager.plot_writer_data (see the sketch after this list).
- Fix bug: DQN should take a tuple as environment.
- Add a quickstart tutorial (quick_start) in the docs.
- Add the RLSVI algorithm (tabular): rlberry.agents.RLSVIAgent.
- Add the Posterior Sampling for Reinforcement Learning (PSRL) agent for tabular MDPs: rlberry.agents.PSRLAgent.
- Add a page to help contributors (contributing) in the docs.
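A hedged sketch of the read_writer_data helper mentioned in the first item of this list, assuming a fitted AgentManager named manager with a logged tag "episode_rewards"; the returned object is assumed to be a pandas DataFrame.

```python
# Hedged sketch: load logged writer data for custom analysis or plotting.
# `manager` is assumed to be an AgentManager that has already been fit.
from rlberry.manager import read_writer_data

df = read_writer_data(manager, tag="episode_rewards")  # assumed to return a DataFrame
print(df.head())
```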
rlberry-v0.2.1
New in v0.2
Improving interface and tools for parallel execution (#50)
- AgentStats renamed to AgentManager.
- AgentManager can handle agents that cannot be pickled.
- The Agent interface requires an eval() method instead of policy() to handle more general agents (e.g. reward-free, POMDPs, etc.).
- Multi-processing and multi-threading are now done with ProcessPoolExecutor and ThreadPoolExecutor (allowing nested processes, for example). Processes are created with spawn (jax does not work with fork, see #51).
New experimental features (see #51, #62)
- JAX implementation of DQN and replay buffer using reverb.
- rlberry.network: server and client interfaces to exchange messages via sockets.
- RemoteAgentManager to train agents on a remote server and gather the results locally (using rlberry.network).
Logging and rendering:
- Data logging with a new DefaultWriter, and improved evaluation and plot methods in rlberry.manager.evaluation.
- Fix rendering bug with OpenGL (bf606b4).
Bug fixes.
New in v0.2.1 (#65)
Features:
- Agent and AgentManager both have a unique_id attribute (useful for creating unique output files/directories).
- DefaultWriter is now initialized in the base class Agent and (optionally) wraps a tensorboard SummaryWriter.
- AgentManager has an option enable_tensorboard that activates tensorboard logging in each of its Agents (with their writer attribute). The tensorboard log_dirs are automatically assigned by AgentManager.
- RemoteAgentManager receives the tensorboard data created on the server when the method get_writer_data() is called. This is done by a zip file transfer with rlberry.network.
- BaseWrapper and gym_make now have an option wrap_spaces. If set to True, this option converts gym.spaces to rlberry.spaces, which provides classes with better seeding (using numpy's default_rng instead of RandomState). See the sketch after this list.
- AgentManager: new method get_agent_instances() that returns trained instances.
- plot_writer_data: possibility to set xtag (the tag used for the x-axis).
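A hedged sketch of two of the features above; it assumes manager is an already-fitted AgentManager, and the gym_make call shape is illustrative.

```python
# Hedged sketch: convert gym spaces to rlberry spaces at construction time and
# retrieve the trained agent objects from a fitted AgentManager.
from rlberry.envs import gym_make

env = gym_make("CartPole-v1", wrap_spaces=True)   # gym.spaces -> rlberry.spaces
trained_agents = manager.get_agent_instances()    # list of fitted Agent instances
```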
Bug fixes:
- Fixed agent initialization bug in AgentHandler (eval_env missing in kwargs for agent_class).