Hyperparameters

    "run_id": 0,
    "env": 0,
    "action_space_limits": ([-10.], [10.]),
    "buffer_size": 10000,
    "batch_size": 64,
    "is_quanser_env": True,
    "gamma": .99,
    "tau": 1e-2,
    "steps": 100000,
    "warmup_samples": 1000,
    "noise_decay": 0.9,
    "transform": lambda x:x,
    "actor_lr": 1e-3,
    "critic_lr": 1e-3,
    "lr_decay": 1.0,
    "lr_min": 1e-7,
    "trial_horizon": 5000,
    "actor_hidden_layers": [10, 10, 10],
    "critic_hidden_layers": [10, 10, 10],
    "device": "cpu"

These hyperparameters are set for every run, whether it is a trial or a training session.

run_id an identifier for the run; every algorithm should specify one so that runs are kept in a structured order

env the environment used in this run (e.g. "Pendulum-v0")

action_space_limits lower and upper bounds that restrict the selectable actions, given as a tuple (lower, upper)
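One plausible reading of `action_space_limits` is element-wise clamping of each action dimension. The sketch below assumes that reading; `clip_action` is a hypothetical helper, not a function from this repository, which may rescale the actor output instead.

```python
def clip_action(action, limits):
    """Clamp each action dimension into its [low, high] interval."""
    low, high = limits
    return [min(max(a, lo), hi) for a, lo, hi in zip(action, low, high)]


limits = ([-10.0], [10.0])  # same shape as the default listed above
print(clip_action([12.5], limits))  # [10.0]
print(clip_action([-3.0], limits))  # [-3.0]
```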

buffer_size the size of the replay buffer

batch_size size of the batches sampled from the replay buffer to train the actor and critic networks
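How `buffer_size` and `batch_size` interact can be illustrated with a small fixed-capacity buffer. This is a minimal sketch, not the repository's implementation: the class name `ReplayBuffer`, its methods, and the policy of dropping the oldest transition once the buffer is full are all assumptions.

```python
import random
from collections import deque


class ReplayBuffer:
    """Hypothetical fixed-capacity buffer; not the repository's class."""

    def __init__(self, buffer_size):
        # Assumption: once full, the oldest transition is dropped first.
        self.storage = deque(maxlen=buffer_size)

    def add(self, transition):
        self.storage.append(transition)

    def sample(self, batch_size):
        # Uniform sampling without replacement.
        return random.sample(list(self.storage), batch_size)

    def __len__(self):
        return len(self.storage)


buffer = ReplayBuffer(buffer_size=10000)
for step in range(12000):
    buffer.add((step, 0.0))  # placeholder (step, reward) pairs
batch = buffer.sample(batch_size=64)
print(len(buffer), len(batch))  # 10000 64
```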

is_quanser_env whether the environment in use is a quanser_robots environment (True) or not (False)

gamma discount factor, used to weight the critic's prediction
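The README does not spell out the update rule. Assuming a standard one-step temporal-difference target, `gamma` would enter roughly as follows; `td_target` and its arguments are hypothetical names chosen for illustration.

```python
def td_target(reward, next_q, done, gamma=0.99):
    """One-step target: reward plus the discounted critic prediction."""
    # Assumption: the bootstrap term is dropped on terminal transitions.
    bootstrap = 0.0 if done else next_q
    return reward + gamma * bootstrap


print(td_target(reward=1.0, next_q=2.0, done=False))
print(td_target(reward=1.0, next_q=2.0, done=True))
```

A value of `gamma` close to 1 makes the target depend strongly on predicted future return; a value of 0 reduces it to the immediate reward.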

tau soft-update coefficient for blending the source network's weights into the target network
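A soft update is usually written as `target = tau * source + (1 - tau) * target`. The sketch below applies that rule to plain lists of floats standing in for network weights; `soft_update` is a hypothetical helper, and the repository's own update may be organised differently.

```python
def soft_update(source, target, tau=1e-2):
    """Return target weights moved a fraction tau towards the source."""
    return [tau * s + (1.0 - tau) * t for s, t in zip(source, target)]


source_weights = [1.0, -2.0]
target_weights = [0.0, 0.0]
target_weights = soft_update(source_weights, target_weights)
print(target_weights)
```

With the default `tau` of 1e-2 the target network follows the source slowly, which is the usual motivation for a soft update.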

steps number of steps taken in this training session

warmup_samples number of samples generated before training starts

noise_decay the Gaussian noise is scaled down by this factor in every episode
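Read as a multiplicative factor, `noise_decay` gives a geometric schedule for the noise scale. The following sketch assumes that reading; `initial_sigma` is a hypothetical starting value that is not among the listed hyperparameters.

```python
def noise_scale(initial_sigma, noise_decay, episode):
    """Noise standard deviation after `episode` multiplicative decays."""
    return initial_sigma * noise_decay ** episode


for episode in range(4):
    print(episode, noise_scale(1.0, 0.9, episode))
```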

transform user-defined transformation function (for example, for dimensionality reduction)

actor_lr learning rate of the Adam optimizer for the actor networks

critic_lr learning rate of the Adam optimizer for the critic networks

lr_decay decay applied to the learning rates of the Adam optimizers

lr_min lower bound for the optimizer learning rates; they cannot drop below this value
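The interplay of `lr_decay` and `lr_min` can be sketched as a decay step with a floor. This assumes the decay is multiplicative, which the README does not state; `decayed_lr` is a hypothetical helper.

```python
def decayed_lr(lr, lr_decay, lr_min):
    """Apply one multiplicative decay step, never going below lr_min."""
    return max(lr * lr_decay, lr_min)


lr = 1e-3
for _ in range(3):
    lr = decayed_lr(lr, lr_decay=0.5, lr_min=1e-7)
print(lr)
```

Under this reading, the default `lr_decay` of 1.0 leaves the learning rates unchanged.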

trial_horizon number of trials run between training episodes to test the new policy

actor_hidden_layers a list of integers giving the sizes of the hidden layers between the input and output layers of the actor networks

critic_hidden_layers a list of integers giving the sizes of the hidden layers between the input and output layers of the critic networks
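To show how such a hidden-layer list translates into a network shape, the sketch below expands it into (inputs, outputs) pairs, one per fully connected layer. `layer_shapes` is a hypothetical helper, and the input and output dimensions of 3 and 1 are picked only for illustration.

```python
def layer_shapes(input_dim, hidden_layers, output_dim):
    """Return (in_features, out_features) for each consecutive layer."""
    sizes = [input_dim, *hidden_layers, output_dim]
    return list(zip(sizes[:-1], sizes[1:]))


print(layer_shapes(3, [10, 10, 10], 1))
# [(3, 10), (10, 10), (10, 10), (10, 1)]
```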

device either "cpu" or "cuda" (faster learning); tensors are computed on the specified device