
MPC GPS #81

Open

wants to merge 52 commits into base: master

Commits (52)
b23475e
Hinge L2 loss for obstacle cost. Experiment on Box2D point_mass.
thobotics Dec 30, 2016
536f7fa
Compute mu, sigma of the marginal distribution p(x) and test its gradient
thobotics Jan 7, 2017
f3cdf6a
test_box2d custom sample
thobotics Jan 7, 2017
79d2def
Merge branch 'mpc_gps' of https://github.com/thobotics/gps into mpc_gps
thobotics Jan 7, 2017
448aad6
Surrogate cost, regularized for PD Quu, no line search
thobotics Jan 8, 2017
50d4b99
Using MPC trajectory distribution with surrogate cost.
thobotics Jan 11, 2017
7319ee5
Real MPC (online optimization) following the offline trajectory.
thobotics Jan 12, 2017
c69e0bf
Using MPC when sampling to train the trajectory distribution. Tested on Box…
thobotics Jan 13, 2017
6e89046
Small fix: Added check for using mpc in algorithm.py
thobotics Jan 13, 2017
cf87743
Train/update MPC using the previous sample to evaluate cost.
thobotics Jan 14, 2017
710b3f6
Fix pointmass returning NaN for the closest obstacle point, which made NaN …
thobotics Jan 27, 2017
00f7d73
It works for both Arm and Pointmass by solving the unconstrained MPC problem.
thobotics Jan 28, 2017
1beecb8
Add DESIGN_DOC for mpc_gps trajectory optimization
thobotics Jan 31, 2017
2e9b22f
Update document for fitting dynamics from sample
thobotics Feb 9, 2017
41d440a
Merge branch 'master' of https://github.com/cbfinn/gps
thobotics Feb 11, 2017
70b768b
Update README.md
thobotics Feb 12, 2017
4282056
Update formula of Sigma in gmm
thobotics Feb 12, 2017
f8644ef
Merge branch 'master' of https://github.com/thobotics/gps
thobotics Feb 12, 2017
93e2645
Update DESIGN_DOC
thobotics Feb 12, 2017
f6df92f
Separate Pointmass non-obstacle and obstacle worlds
thobotics Feb 16, 2017
2e9b925
Create a point mass obstacle world and test with BADMM.
thobotics Feb 18, 2017
1eebe77
Check use_mpc key exists in hyperparams
thobotics Feb 23, 2017
5c6e3b8
First commit for training mobile robot using GPS.
thobotics Mar 10, 2017
043ca6d
Small fix for cost_translation_table_ caused Seg fault
thobotics Mar 11, 2017
5ef44a8
Training works when using an omnidirectional mobile robot.
thobotics Mar 18, 2017
f7cbde1
** Completed BADMM MPC_GPS surrogate cost. Tested by box2d_pointmass_b…
thobotics Mar 23, 2017
bba5400
Experiment on turtlebot_badmm one obstacle world.
thobotics Mar 23, 2017
0c05978
Use 1d raw laser range sensor as observation source.
thobotics Mar 25, 2017
e8bbd30
** Successful training (2 initial states) and testing (check note git status…
thobotics Mar 25, 2017
556c48c
Update video for testing phase
thobotics Mar 28, 2017
c68f6be
Update video for testing phase
thobotics Mar 28, 2017
7270a21
** DIFF mobile robot moves in hallway
thobotics Apr 1, 2017
d2cb11d
Merge branch 'mpc_gps' of https://github.com/thobotics/gps into mpc_gps
thobotics Apr 1, 2017
1064913
Update link for turtlebot (differential drive) move in hallway.
thobotics Apr 1, 2017
feac983
Clean up
thobotics Apr 4, 2017
dfc921f
Merge branch 'mpc_gps' of https://github.com/thobotics/gps into mpc_gps
thobotics Apr 4, 2017
2fdb3be
Remove junk file
thobotics Apr 4, 2017
4abf703
Merge branch 'mpc_gps' and edit README for pull request.
thobotics Apr 4, 2017
06ba91e
Reset cost_state
thobotics Apr 4, 2017
e910769
Merge branch 'master' of https://github.com/cbfinn/gps
thobotics Apr 4, 2017
d92d2f9
Update cost obstacle config
thobotics Apr 5, 2017
ea71a69
Add reference to MPC GPS
thobotics Apr 5, 2017
3de3115
Remove junk worlds and maps of turtlebot
thobotics Apr 5, 2017
4f8722d
Restore pr2_gazebo launch file
thobotics Apr 5, 2017
0a95108
Update camera sensor
thobotics Apr 6, 2017
df218b2
Change weight of cost_obstacle and initial state.
thobotics Apr 7, 2017
67b4716
Show iLQG and MPC Plan on RVIZ
thobotics May 31, 2017
c1f28b8
Merge branch 'master' of https://github.com/cbfinn/gps
thobotics Jan 31, 2018
aec8993
Publish nearest obstacle points
thobotics Feb 27, 2018
d7f2707
Agent turtlebot mpepc.
thobotics Feb 27, 2018
ff9e82f
Update README.md
thobotics Feb 9, 2021
dcb7ef5
Update README.md
thobotics Feb 9, 2021
16 changes: 16 additions & 0 deletions README.md
@@ -1,6 +1,22 @@
GPS
======

Please check the mpc_gps branch for the reimplementation of the MPC-GPS paper!

## Demo
1. Hallway
```
https://youtu.be/4h49wDTnrxw?list=PLeH0h5k_xzT5Rxuqkb2ZHuNJTIqAc76qN
```

2. 3 obstacle test
```
https://youtu.be/xaNUwspDp2w
```
---

======

This code is a reimplementation of the guided policy search algorithm and LQG-based trajectory optimization, meant to help others understand, reuse, and build upon existing work.

For full documentation, see [rll.berkeley.edu/gps](http://rll.berkeley.edu/gps).
19 changes: 18 additions & 1 deletion experiments/box2d_arm_example/hyperparams.py
@@ -37,6 +37,7 @@
'data_files_dir': EXP_DIR + 'data_files/',
'log_filename': EXP_DIR + 'log.txt',
'conditions': 1,
'use_mpc': True,
}

if not os.path.exists(common['data_files_dir']):
@@ -46,7 +47,7 @@
'type': AgentBox2D,
'target_state' : np.array([0, 0]),
'world' : ArmWorld,
- 'render' : True,
+ 'render' : False,
'x0': np.array([0.75*np.pi, 0.5*np.pi, 0, 0, 0, 0, 0]),
'rk': 0,
'dt': 0.05,
@@ -55,14 +56,20 @@
'pos_body_idx': np.array([]),
'pos_body_offset': np.array([]),
'T': 100,
'use_mpc': common['use_mpc'],
'M': 5,
'sensor_dims': SENSOR_DIMS,
'state_include': [JOINT_ANGLES, JOINT_VELOCITIES, END_EFFECTOR_POINTS],
'obs_include': [],
}

#if common['use_mpc']:
# agent['smooth_noise_var'] = 0.3

algorithm = {
'type': AlgorithmTrajOpt,
'conditions': common['conditions'],
'use_mpc': common['use_mpc'],
}

algorithm['init_traj_distr'] = {
@@ -75,6 +82,16 @@
'T': agent['T'],
}

algorithm['init_mpc'] = {
'type': init_lqr,
'init_gains': np.zeros(SENSOR_DIMS[ACTION]),
'init_acc': np.zeros(SENSOR_DIMS[ACTION]),
'init_var': 0.1,
'stiffness': 0.01,
'dt': agent['dt'],
'T': agent['M'],
}

action_cost = {
'type': CostAction,
'wu': np.array([1, 1])
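The hyperparameters above introduce an MPC horizon `M` alongside the episode length `T`, plus an `init_mpc` distribution over that short horizon. As a rough illustration only (all function names below are hypothetical, not from this PR's code), a receding-horizon controller re-plans an M-step sequence at every step of the T-step episode but executes just the first action:

```python
import numpy as np

def mpc_rollout(dynamics_step, plan_short_horizon, x0, T=100, M=5):
    """Receding-horizon sketch: at every step, re-plan over the next
    M steps but execute only the first planned action."""
    x, states, actions = x0, [x0], []
    for t in range(T):
        u_seq = plan_short_horizon(x, M)   # length-M plan from current state
        u = u_seq[0]                       # execute only the first action
        x = dynamics_step(x, u)            # advance the true system
        states.append(x)
        actions.append(u)
    return np.array(states), np.array(actions)

# Toy double-integrator stand-in with a naive PD-style "planner".
def dyn(x, u):
    pos, vel = x
    return np.array([pos + 0.05 * vel, vel + 0.05 * u])

def plan(x, M):
    return [-1.0 * x[0] - 0.5 * x[1]] * M  # constant plan over the horizon

states, actions = mpc_rollout(dyn, plan, np.array([1.0, 0.0]))
```

The point of the sketch is only the loop structure: `T` bounds the outer rollout while `M` bounds each inner plan, which is why `init_mpc` above uses `'T': agent['M']`.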
163 changes: 163 additions & 0 deletions experiments/box2d_pointmass_badmm_example/hyperparams.py
@@ -0,0 +1,163 @@
""" Hyperparameters for Box2d Point Mass."""
from __future__ import division

import os.path
from datetime import datetime
import numpy as np

from gps import __file__ as gps_filepath
from gps.agent.box2d.agent_box2d import AgentBox2D
from gps.agent.box2d.point_mass_world import PointMassWorld
from gps.algorithm.algorithm_badmm import AlgorithmBADMM
from gps.algorithm.cost.cost_state import CostState
from gps.algorithm.cost.cost_action import CostAction
from gps.algorithm.cost.cost_sum import CostSum
from gps.algorithm.dynamics.dynamics_lr_prior import DynamicsLRPrior
from gps.algorithm.dynamics.dynamics_prior_gmm import DynamicsPriorGMM
from gps.algorithm.policy.policy_prior_gmm import PolicyPriorGMM
from gps.algorithm.traj_opt.traj_opt_lqr_python import TrajOptLQRPython
from gps.algorithm.policy_opt.policy_opt_caffe import PolicyOptCaffe
from gps.algorithm.policy.lin_gauss_init import init_pd
from gps.gui.config import generate_experiment_info
from gps.proto.gps_pb2 import END_EFFECTOR_POINTS, END_EFFECTOR_POINT_VELOCITIES, ACTION, POSITION_NEAREST_OBSTACLE

SENSOR_DIMS = {
END_EFFECTOR_POINTS: 3,
END_EFFECTOR_POINT_VELOCITIES: 3,
ACTION: 2
}

BASE_DIR = '/'.join(str.split(gps_filepath, '/')[:-2])
EXP_DIR = BASE_DIR + '/../experiments/box2d_pointmass_badmm_example/'


common = {
'experiment_name': 'box2d_pointmass_badmm_example' + '_' + \
datetime.strftime(datetime.now(), '%m-%d-%y_%H-%M'),
'experiment_dir': EXP_DIR,
'data_files_dir': EXP_DIR + 'data_files/',
'log_filename': EXP_DIR + 'log.txt',
'conditions': 4,
'use_mpc': True,
}

if not os.path.exists(common['data_files_dir']):
os.makedirs(common['data_files_dir'])

agent = {
'type': AgentBox2D,
'target_state' : np.array([5, 20, 0]),
"world" : PointMassWorld,
'render' : False,
'x0': [np.array([0, 5, 0, 0, 0, 0]),
np.array([0, 30, 0, 0, 0, 0]),
np.array([10, 5, 0, 0, 0, 0]),
np.array([10, 30, 0, 0, 0, 0]),
],
'rk': 0,
'dt': 0.05,
'substeps': 1,
'conditions': common['conditions'],
'pos_body_idx': np.array([]),
'pos_body_offset': np.array([]),
'T': 100,
'use_mpc': common['use_mpc'],
'M': 5,
'sensor_dims': SENSOR_DIMS,
'state_include': [END_EFFECTOR_POINTS, END_EFFECTOR_POINT_VELOCITIES],
'obs_include': [END_EFFECTOR_POINTS, END_EFFECTOR_POINT_VELOCITIES],
}

algorithm = {
'type': AlgorithmBADMM,
'conditions': common['conditions'],
'iterations': 10,
'lg_step_schedule': np.array([1e-4, 1e-3, 1e-2, 1e-2]),
'policy_dual_rate': 0.2,
'ent_reg_schedule': np.array([1e-3, 1e-3, 1e-2, 1e-1]),
'fixed_lg_step': 3,
'kl_step': 5.0,
'min_step_mult': 0.01,
'max_step_mult': 1.0,
'sample_decrease_var': 0.05,
'sample_increase_var': 0.1,
}

algorithm['init_traj_distr'] = {
'type': init_pd,
'init_var': 5.0,
'pos_gains': 0.0,
'dQ': SENSOR_DIMS[ACTION],
'dt': agent['dt'],
'T': agent['T'],
}

algorithm['init_mpc'] = {
'type': init_pd,
'init_var': 5.0,
'pos_gains': 0.0,
'dQ': SENSOR_DIMS[ACTION],
'dt': agent['dt'],
'T': agent['M'],
}

action_cost = {
'type': CostAction,
'wu': np.array([5e-5, 5e-5])
}

state_cost = {
'type': CostState,
'data_types' : {
END_EFFECTOR_POINTS: {
'wp': np.ones(SENSOR_DIMS[END_EFFECTOR_POINTS]),
'target_state': agent["target_state"],
},
},
}

algorithm['cost'] = {
'type': CostSum,
'costs': [action_cost, state_cost],
'weights': [1.0, 1.0],
}

algorithm['dynamics'] = {
'type': DynamicsLRPrior,
'regularization': 1e-6,
'prior': {
'type': DynamicsPriorGMM,
'max_clusters': 20,
'min_samples_per_cluster': 40,
'max_samples': 20,
},
}

algorithm['traj_opt'] = {
'type': TrajOptLQRPython,
}

algorithm['policy_opt'] = {
'type': PolicyOptCaffe,
'weights_file_prefix': EXP_DIR + 'policy',
}

algorithm['policy_prior'] = {
'type': PolicyPriorGMM,
'max_clusters': 20,
'min_samples_per_cluster': 40,
'max_samples': 20,
}

config = {
'iterations': 10,
'num_samples': 5,
'verbose_trials': 5,
'verbose_policy_trials': 0,
'common': common,
'agent': agent,
'gui_on': True,
'algorithm': algorithm,
}

common['info'] = generate_experiment_info(config)
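The `CostSum` block above combines a quadratic action cost (`wu`) and a state-tracking cost (`wp`, `target_state`) with per-cost weights. A minimal numpy sketch of that weighted combination, with illustrative helper names rather than the gps library's API:

```python
import numpy as np

def action_cost(u, wu):
    # 0.5 * sum_i wu_i * u_i^2 at each timestep
    return 0.5 * np.sum(wu * u**2, axis=-1)

def state_cost(x, wp, target):
    # 0.5 * sum_i wp_i * (x_i - target_i)^2 at each timestep
    return 0.5 * np.sum(wp * (x - target)**2, axis=-1)

def cost_sum(costs, weights):
    # Weighted sum of per-timestep cost terms, as in the CostSum config.
    return sum(w * c for w, c in zip(weights, costs))

T = 100
u = np.zeros((T, 2))                              # no control effort
x = np.tile(np.array([5.0, 20.0, 0.0]), (T, 1))   # already at target
total = cost_sum(
    [action_cost(u, np.array([5e-5, 5e-5])),
     state_cost(x, np.ones(3), np.array([5.0, 20.0, 0.0]))],
    [1.0, 1.0],
)
```

With the trajectory sitting at the target and zero controls, both terms vanish, so the total per-timestep cost is zero; the tiny `wu` in the config makes control effort nearly free relative to state error.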
21 changes: 19 additions & 2 deletions experiments/box2d_pointmass_example/hyperparams.py
@@ -9,14 +9,15 @@
from gps.agent.box2d.agent_box2d import AgentBox2D
from gps.agent.box2d.point_mass_world import PointMassWorld
from gps.algorithm.algorithm_traj_opt import AlgorithmTrajOpt
from gps.algorithm.cost.cost_obstacles import CostObstacle
from gps.algorithm.cost.cost_state import CostState
from gps.algorithm.cost.cost_action import CostAction
from gps.algorithm.cost.cost_sum import CostSum
from gps.algorithm.dynamics.dynamics_lr_prior import DynamicsLRPrior
from gps.algorithm.dynamics.dynamics_prior_gmm import DynamicsPriorGMM
from gps.algorithm.traj_opt.traj_opt_lqr_python import TrajOptLQRPython
from gps.algorithm.policy.lin_gauss_init import init_pd
- from gps.proto.gps_pb2 import END_EFFECTOR_POINTS, END_EFFECTOR_POINT_VELOCITIES, ACTION
+ from gps.proto.gps_pb2 import END_EFFECTOR_POINTS, END_EFFECTOR_POINT_VELOCITIES, ACTION, POSITION_NEAREST_OBSTACLE
from gps.gui.config import generate_experiment_info

SENSOR_DIMS = {
@@ -36,6 +37,7 @@
'target_filename': EXP_DIR + 'target.npz',
'log_filename': EXP_DIR + 'log.txt',
'conditions': 1,
'use_mpc': True,
}

if not os.path.exists(common['data_files_dir']):
@@ -46,22 +48,28 @@
'target_state' : np.array([5, 20, 0]),
"world" : PointMassWorld,
'render' : False,
- 'x0': np.array([0, 5, 0, 0, 0, 0]),
+ 'x0': [np.array([0, 5, 0, 0, 0, 0])],
'rk': 0,
'dt': 0.05,
'substeps': 1,
'conditions': common['conditions'],
'pos_body_idx': np.array([]),
'pos_body_offset': np.array([]),
'T': 100,
'use_mpc': common['use_mpc'],
'M': 5,
'sensor_dims': SENSOR_DIMS,
'state_include': [END_EFFECTOR_POINTS, END_EFFECTOR_POINT_VELOCITIES],
'obs_include': [],
}

#if common['use_mpc']:
# agent['smooth_noise_var'] = 1.0

algorithm = {
'type': AlgorithmTrajOpt,
'conditions': common['conditions'],
'use_mpc': common['use_mpc'],
}

algorithm['init_traj_distr'] = {
@@ -73,6 +81,15 @@
'T': agent['T'],
}

algorithm['init_mpc'] = {
'type': init_pd,
'init_var': 5.0,
'pos_gains': 0.0,
'dQ': SENSOR_DIMS[ACTION],
'dt': agent['dt'],
'T': agent['M'],
}

action_cost = {
'type': CostAction,
'wu': np.array([5e-5, 5e-5])
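Several commits in this PR (e.g. "Check use_mpc key exists in hyperparams") guard against a missing or inconsistent `use_mpc` flag across the `common`, `agent`, and `algorithm` dicts. A small validation sketch under that assumption (the helper is hypothetical, not the PR's actual code):

```python
def check_mpc_config(common, agent, algorithm):
    """Validate MPC-related hyperparameters, defaulting use_mpc to
    False when the key is absent (mirroring the guard in this PR)."""
    use_mpc = common.get('use_mpc', False)
    if not use_mpc:
        return False
    # The flag is duplicated into agent and algorithm; they must agree.
    for name, cfg in (('agent', agent), ('algorithm', algorithm)):
        if cfg.get('use_mpc', False) != use_mpc:
            raise ValueError("%s['use_mpc'] disagrees with common" % name)
    # The MPC horizon M must be a valid sub-horizon of the episode.
    if 'M' not in agent or not (0 < agent['M'] <= agent['T']):
        raise ValueError("agent['M'] must be a horizon in (0, T]")
    # init_mpc plans over M steps, so its 'T' should equal agent['M'].
    if 'init_mpc' in algorithm and algorithm['init_mpc'].get('T') != agent['M']:
        raise ValueError("init_mpc horizon must equal agent['M']")
    return True

ok = check_mpc_config(
    {'use_mpc': True},
    {'use_mpc': True, 'M': 5, 'T': 100},
    {'use_mpc': True, 'init_mpc': {'T': 5}},
)
```

Using `dict.get` with a default keeps older experiment configs, which predate the `use_mpc` key, working unchanged.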