diff --git a/README.md b/README.md
index 7e34319..c4f0f6e 100644
--- a/README.md
+++ b/README.md
@@ -32,7 +32,7 @@ Vectorized environments allow batching beam search planning and select actions i
 if you need to evaluate agent on number of episodes (or seeds) during training.
 
-# Training
+## Training
 
 I trained it on D4RL medium datasets to validate that everything is OK. Scores seem to be very close to the original.
 Pretrained models are [available](pretrained).
 
@@ -44,7 +44,7 @@ Also, all datasets for [D4RL](https://sites.google.com/view/d4rl/home) gym tasks
 python scripts/train.py --config="configs/medium/halfcheetah_medium" --device="cuda" --seed="42"
 ```
 
-# Evaluation
+## Evaluation
 
 Available evaluation parameters can be seen in validation [config](configs/eval_vase.yaml).
 Here parameters are set to match evaluation configs from original implementation by [@jannerm](https://github.com/jannerm).
@@ -77,7 +77,7 @@ python scripts/eval.py \
     beam_width=128
 ```
 
-# References
+## References
 ```
 @inproceedings{janner2021sequence,
   title = {Offline Reinforcement Learning as One Big Sequence Modeling Problem},
diff --git a/scripts/eval.py b/scripts/eval.py
index cf65178..057ef9e 100644
--- a/scripts/eval.py
+++ b/scripts/eval.py
@@ -25,8 +25,6 @@ def create_argparser():
 def run_experiment(config, seed, device):
     set_seed(seed=seed)
 
-    print(config)
-
     run_config = OmegaConf.load(os.path.join(config.checkpoints_path, "config.yaml"))
     discretizer = torch.load(os.path.join(config.checkpoints_path, "discretizer.pt"), map_location=device)