Density in Mujoco Environments doesn't seem to change #7

Open
jsalfity-hplabs opened this issue Aug 2, 2019 · 1 comment

Comments

@jsalfity-hplabs

Great work! We are trying to replicate your experiments.

Our setup: Ubuntu 16.04, with rl-generalization and Docker installed following the install instructions in the README.

We came across behavior that looks like a bug. We wanted to see the performance of HalfCheetah when varying only the density, so we ran python -m examples.run_experiments examples/test_density.yml /tmp/output with the following yml file:

models:
  # PPO2 Baselines.
  - name: PPO2
    train:
      command: |
        python3 -m examples.ppo2_baselines.train
          --env {environment}
          --output {output}
          --total-episodes {episodes}
          --lr {lr}
          --nsteps {nsteps}
          --nminibatches {nminibatches}
          --policy {policy}
      output: 'checkpoints/*'
      parameters: 'env-parameters-*.json'
    evaluate:
      command: |
        python3 -m examples.ppo2_baselines.evaluate
          --env {environment}
          --outdir {output}
          --eval-n-trials 1000
          --eval-n-parallel 1
          {model}
      output: 'evaluation.json'
    hyperparameters:
      episodes: 1500000
      policy: 'mlp'
      lr: [0.0003] 
      nsteps: [256]
      nminibatches: 1

 #############################################################################

environments:
  - train: SunblazeHalfCheetah-v0
    test:
      - SunblazeHalfCheetah-v0
      - SunblazeHalfCheetahRandomExtreme-v0 #edited so only density is changing

We also edited SunblazeHalfCheetahRandomExtreme in mujoco.py so that it only sets the density to 1000000, as below:

class RandomExtremeHalfCheetah(RoboschoolXMLModifierMixin, ModifiableRoboschoolHalfCheetah): #edited to only change density 

    def randomize_env(self):
        self.density = 1000000 #manually changed density value

        with self.modify_xml('half_cheetah.xml') as tree:
            for elem in tree.iterfind('worldbody/body/geom'):
                elem.set('density', str(self.density))

    def _reset(self, new=True):
        if new:
            self.randomize_env()
        return super(RandomExtremeHalfCheetah, self)._reset(new)

    @property
    def parameters(self):
        parameters = super(RandomExtremeHalfCheetah, self).parameters
        parameters.update({'density': self.density})
        return parameters

In the JSON output of run_experiments, the model trained on SunblazeHalfCheetah gets nearly the same test rewards on both SunblazeHalfCheetah and SunblazeHalfCheetahRandomExtreme (with density manually set to 1000000). The tail of the results for each test environment is below:

"environment": {"id": "SunblazeHalfCheetah-v0"}, "reward": [26.933929443359375]}, {"success": false, "model": "examples/test_density_output/PPO2/SunblazeHalfCheetah-v0/checkpoints/182420", "environment": {"id": "SunblazeHalfCheetah-v0"}, "reward": [30.036670684814453]}, {"success": false, "model": "examples/test_density_output/PPO2/SunblazeHalfCheetah-v0/checkpoints/182420", "environment": {"id": "SunblazeHalfCheetah-v0"}, "reward": [25.795215606689453]}]}
 "environment": {"id": "SunblazeHalfCheetahRandomExtreme-v0", "density": 1000000000}, "model": "examples/test_density_output/PPO2/SunblazeHalfCheetah-v0/checkpoints/182420"}, {"success": false, "reward": [40.01738739013672], "environment": {"id": "SunblazeHalfCheetahRandomExtreme-v0", "density": 1000000000}, "model": "examples/test_density_output/PPO2/SunblazeHalfCheetah-v0/checkpoints/182420"}, {"success": false, "reward": [26.907756805419922], "environment": {"id": "SunblazeHalfCheetahRandomExtreme-v0", "density": 1000000000}, "model": "examples/test_density_output/PPO2/SunblazeHalfCheetah-v0/checkpoints/182420"}]}
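
To compare the two test environments without eyeballing the raw output, the rewards can be averaged per environment id. This is only a sketch: the `records` list is a hypothetical flat list of entries shaped like the ones above (the actual top-level layout of evaluation.json is not shown here), filled with the reward values visible in the excerpt.

```python
from collections import defaultdict

# Hypothetical flat list of evaluation entries; the reward values are copied
# from the excerpt above, the list layout itself is an assumption.
records = [
    {"environment": {"id": "SunblazeHalfCheetah-v0"}, "reward": [26.933929443359375]},
    {"environment": {"id": "SunblazeHalfCheetah-v0"}, "reward": [30.036670684814453]},
    {"environment": {"id": "SunblazeHalfCheetah-v0"}, "reward": [25.795215606689453]},
    {"environment": {"id": "SunblazeHalfCheetahRandomExtreme-v0"}, "reward": [40.01738739013672]},
    {"environment": {"id": "SunblazeHalfCheetahRandomExtreme-v0"}, "reward": [26.907756805419922]},
]

def mean_reward_by_env(entries):
    """Average all reward values, grouped by environment id."""
    rewards = defaultdict(list)
    for entry in entries:
        rewards[entry["environment"]["id"]].extend(entry["reward"])
    return {env_id: sum(vals) / len(vals) for env_id, vals in rewards.items()}

means = mean_reward_by_env(records)
for env_id, mean in means.items():
    print(f"{env_id}: {mean:.2f}")
```

With only a handful of episodes the means are noisy, so the full 1000 evaluation trials would be needed before drawing conclusions.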

How can we confirm the density is changing? It doesn't seem plausible that the Mujoco HalfCheetah simulation could move at all with a density of 1000000, let alone achieve test rewards similar to those in the nominal environment.
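
One check that does not need the simulator is to inspect the XML after the modification and list the density attribute of every geom. The sketch below uses a hypothetical, simplified stand-in for half_cheetah.xml (not the real file), and the helper names `set_density` and `geom_densities` are ours. It also shows that the ElementTree path `worldbody/body/geom` only matches geoms that sit directly under a top-level body, so geoms in nested bodies would keep their old density if the real file is structured this way.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified stand-in for half_cheetah.xml (NOT the real file).
SAMPLE_XML = """
<mujoco>
  <worldbody>
    <body name="torso">
      <geom name="torso_geom" type="capsule"/>
      <body name="bthigh">
        <geom name="bthigh_geom" type="capsule"/>
      </body>
    </body>
  </worldbody>
</mujoco>
"""

def set_density(root, density, path="worldbody/body/geom"):
    """Set the density attribute on every element matched by `path`."""
    for elem in root.iterfind(path):
        elem.set("density", str(density))

def geom_densities(root):
    """Map each geom name to its density attribute (None if unset)."""
    return {g.get("name"): g.get("density") for g in root.iter("geom")}

root = ET.fromstring(SAMPLE_XML)
set_density(root, 1000000)
densities = geom_densities(root)
print(densities)  # the nested geom is not matched by the path
```

If the nested geoms should change too, passing `path=".//geom"` to `set_density` matches geoms at any depth. Whether the simulator then actually reloads the edited XML is a separate question.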

@pokaxpoka

I think the environment will change if you add RoboschoolForwardWalkerMujocoXML.__init__(self, self.model_xml, 'torso', action_dim=6, obs_dim=26, power=0.9) to randomize_env(self). I could be wrong because I'm using a different version of roboschool.
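
A possible reading of this suggestion, offered as an assumption rather than a confirmed diagnosis: if the environment reads its model XML once at construction, later edits to the file are invisible until the initializer runs again. The sketch below illustrates that pattern with a hypothetical stand-in class, `CachedModelEnv`, which does not use roboschool at all.

```python
import os
import tempfile
import xml.etree.ElementTree as ET

class CachedModelEnv:
    """Hypothetical stand-in: reads geom densities once, at construction."""

    def __init__(self, xml_path):
        self.xml_path = xml_path
        root = ET.parse(xml_path).getroot()
        self.loaded_density = [g.get("density") for g in root.iter("geom")]

    def randomize_env(self, density, reinit):
        # Rewrite the density of every geom in the XML file on disk.
        tree = ET.parse(self.xml_path)
        for geom in tree.getroot().iter("geom"):
            geom.set("density", str(density))
        tree.write(self.xml_path)
        if reinit:
            # Re-run the initializer so the edited file is loaded again.
            CachedModelEnv.__init__(self, self.xml_path)

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "model.xml")
    with open(path, "w") as f:
        f.write('<mujoco><worldbody><body><geom density="1000"/></body></worldbody></mujoco>')
    env = CachedModelEnv(path)
    env.randomize_env(1000000, reinit=False)
    stale = list(env.loaded_density)   # still the value loaded at construction
    env.randomize_env(1000000, reinit=True)
    fresh = list(env.loaded_density)   # reloaded after re-initialization

print(stale, fresh)
```

Whether roboschool behaves like this stand-in depends on the installed version, so the change would still need to be verified against the real environment.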
