irl-imitation

Differences from forked repo

Fixed issue yrlu#1
Implemented parallel value iteration and state visitation frequency calculation (numpy, CPU)
Implemented value iteration and state visitation frequency calculation in TensorFlow (GPU, greatly enhances speed!)
Implemented expected value difference evaluation metric (https://github.com/MatthewJA/Inverse-Reinforcement-Learning/blob/master/irl/maxent.py#L208)
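
As a rough illustration of the expected value difference metric, the sketch below evaluates two policies under the true reward and compares their start-state values. It is plain Python rather than the numpy/TensorFlow code in this repo. The function names, the `P[s][a]` transition layout and the state-only reward convention are assumptions made for this example.

```python
def evaluate_policy(P, rewards, policy, gamma, tol=1e-8):
    """Iterative policy evaluation for a deterministic policy.

    P[s][a] is a list of (next_state, probability) pairs (assumed layout),
    rewards[s] is the reward collected in state s.
    """
    values = [0.0] * len(P)
    while True:
        new_values = [
            sum(p * (rewards[s] + gamma * values[s1]) for s1, p in P[s][policy[s]])
            for s in range(len(P))
        ]
        delta = max(abs(n - o) for n, o in zip(new_values, values))
        values = new_values
        if delta < tol:
            return values


def expected_value_difference(P, true_rewards, optimal_policy, learned_policy,
                              gamma, p_start):
    """Value lost, under the true reward, by following the learned policy."""
    v_opt = evaluate_policy(P, true_rewards, optimal_policy, gamma)
    v_learned = evaluate_policy(P, true_rewards, learned_policy, gamma)
    return sum(p * (vo - vl) for p, vo, vl in zip(p_start, v_opt, v_learned))


# Hypothetical 3-state chain: action 0 moves left, action 1 moves right.
P = [
    [[(0, 1.0)], [(1, 1.0)]],
    [[(0, 1.0)], [(2, 1.0)]],
    [[(1, 1.0)], [(2, 1.0)]],
]
true_rewards = [0.0, 0.0, 1.0]
evd = expected_value_difference(P, true_rewards, [1, 1, 1], [0, 0, 0],
                                gamma=0.5, p_start=[1.0, 0.0, 0.0])
print(evd)  # approximately 0.5
```

A value of 0 means the policy derived from the recovered reward does as well as the optimal one under the true reward. Larger values mean a worse recovered reward.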

Implementations of some model-based Inverse Reinforcement Learning (IRL) algorithms in Python/TensorFlow, mainly for educational purposes. (WIP)

python demo.py

Algorithms implemented

Linear inverse reinforcement learning (Ng & Russell 2000)
Maximum entropy inverse reinforcement learning (Ziebart et al. 2008)
Maximum entropy deep inverse reinforcement learning (Wulfmeier et al. 2015)

MDP & solver implemented

gridworld 2D
gridworld 1D
value iteration
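
For readers new to the solver, here is a minimal value iteration sketch on a deterministic 1D gridworld. It uses plain Python loops for readability and is not the vectorized code from this repo. `build_chain`, the `P[s][a]` layout and the state-only reward convention are assumptions made for this example.

```python
def build_chain(n_states):
    """Deterministic 1D gridworld: action 0 moves left, action 1 moves right.

    Moves off either end leave the agent where it is.
    """
    P = []
    for s in range(n_states):
        left = max(s - 1, 0)
        right = min(s + 1, n_states - 1)
        P.append([[(left, 1.0)], [(right, 1.0)]])
    return P


def q_values(P, rewards, values, gamma, s):
    """One-step lookahead value of every action in state s."""
    return [
        sum(p * (rewards[s] + gamma * values[s1]) for s1, p in outcomes)
        for outcomes in P[s]
    ]


def value_iteration(P, rewards, gamma, tol=1e-6):
    """Return (values, greedy_policy) once the Bellman backup has converged."""
    values = [0.0] * len(P)
    while True:
        new_values = [max(q_values(P, rewards, values, gamma, s))
                      for s in range(len(P))]
        delta = max(abs(n - o) for n, o in zip(new_values, values))
        values = new_values
        if delta < tol:
            break
    policy = []
    for s in range(len(P)):
        q = q_values(P, rewards, values, gamma, s)
        policy.append(q.index(max(q)))  # greedy action
    return values, policy


P = build_chain(5)
values, policy = value_iteration(P, [0.0, 0.0, 0.0, 0.0, 1.0], gamma=0.9)
print(policy)                         # [1, 1, 1, 1, 1]
print([round(v, 3) for v in values])  # [6.561, 7.29, 8.1, 9.0, 10.0]
```

With the only reward in the rightmost cell, the greedy policy moves right everywhere and the values decay by a factor of gamma per step away from the goal.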

Dependencies

python 2.7
cvxopt
Tensorflow 0.12.1
matplotlib

Linear Inverse Reinforcement Learning

Follows Algorithm 1 of Ng & Russell (2000), Algorithms for Inverse Reinforcement Learning.

$ python linear_irl_gridworld.py --act_random=0.3 --gamma=0.5 --l1=10 --r_max=10
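
For reference, the linear program from Ng & Russell (2000) for a finite state space can be written as below. Here $a_1$ is the expert's action in every state (after relabeling), $P_a$ is the transition matrix of action $a$, and $P_a(i)$ is its $i$-th row. Reading `--gamma`, `--l1` and `--r_max` as $\gamma$, $\lambda$ and $R_{\max}$ is an assumption based on the flag names, not something checked against the code.

```latex
\begin{aligned}
\max_{R}\quad & \sum_{i=1}^{N} \min_{a \in A \setminus \{a_1\}}
  \left\{ \left(P_{a_1}(i) - P_a(i)\right)\left(I - \gamma P_{a_1}\right)^{-1} R \right\}
  - \lambda \lVert R \rVert_1 \\
\text{s.t.}\quad & \left(P_{a_1} - P_a\right)\left(I - \gamma P_{a_1}\right)^{-1} R \succeq 0
  \quad \forall a \in A \setminus \{a_1\}, \\
& \lvert R_i \rvert \le R_{\max}, \quad i = 1, \dots, N.
\end{aligned}
```

The constraint keeps $a_1$ optimal in every state, the objective maximizes the margin to the next-best action, and the $\ell_1$ penalty favors sparse rewards.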

Maximum Entropy Inverse Reinforcement Learning

(This implementation is largely influenced by Matthew Alger's maxent implementation)

Follows Ziebart et al. (2008), Maximum Entropy Inverse Reinforcement Learning.
Run $ python maxent_irl_gridworld.py --help for a description of the options.

$ python maxent_irl_gridworld.py --height=10 --width=10 --gamma=0.8 --n_trajs=100 --l_traj=50 --no-rand_start --learning_rate=0.01 --n_iters=20

$ python maxent_irl_gridworld.py --gamma=0.8 --n_trajs=400 --l_traj=50 --rand_start --learning_rate=0.01 --n_iters=20
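
The core of the MaxEnt update is the gradient: the demonstrated feature counts minus the feature counts expected under the current policy. The sketch below shows that step in plain Python. It assumes the stochastic policy for the current reward is already available, and all names and data layouts are hypothetical rather than taken from `maxent_irl.py`.

```python
def state_visitation_frequency(P, policy, p_start, horizon):
    """Expected visits to each state over `horizon` steps.

    P[s][a] is a list of (next_state, probability) pairs,
    policy[s][a] is the probability of taking action a in state s.
    """
    n_states = len(P)
    d_t = list(p_start)   # state distribution at the current time step
    svf = list(p_start)   # running sum over time steps
    for _ in range(horizon - 1):
        d_next = [0.0] * n_states
        for s in range(n_states):
            for a, outcomes in enumerate(P[s]):
                for s1, p in outcomes:
                    d_next[s1] += d_t[s] * policy[s][a] * p
        d_t = d_next
        svf = [total + d for total, d in zip(svf, d_t)]
    return svf


def maxent_gradient(features, trajectories, svf):
    """Demonstrated feature counts minus expected feature counts."""
    n_feat = len(features[0])
    f_demo = [0.0] * n_feat
    for traj in trajectories:
        for s in traj:
            for k in range(n_feat):
                f_demo[k] += features[s][k] / float(len(trajectories))
    f_model = [sum(svf[s] * features[s][k] for s in range(len(features)))
               for k in range(n_feat)]
    return [d - m for d, m in zip(f_demo, f_model)]


# Hypothetical 3-state chain with one-hot features and an "always right" policy.
P = [
    [[(0, 1.0)], [(1, 1.0)]],
    [[(0, 1.0)], [(2, 1.0)]],
    [[(1, 1.0)], [(2, 1.0)]],
]
policy = [[0.0, 1.0]] * 3
features = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
svf = state_visitation_frequency(P, policy, [1.0, 0.0, 0.0], horizon=3)
print(svf)                                           # [1.0, 1.0, 1.0]
print(maxent_gradient(features, [[2, 2, 2]], svf))   # [-1.0, -1.0, 2.0]
```

A gradient-ascent step along this vector raises the reward weight of states the demonstrations visit more often than the current policy does, and lowers it for the others.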

Maximum Entropy Deep Inverse Reinforcement Learning

Follows Wulfmeier et al. (2015), Maximum Entropy Deep Inverse Reinforcement Learning. Only the fully connected (FC) version is implemented. The implementation does not exactly follow the model proposed in the paper: some tweaks are applied, including ELU activations, gradient clipping and L2 regularization.
Run $ python deep_maxent_irl_gridworld.py --help for a description of the options.

$ python deep_maxent_irl_gridworld.py --learning_rate=0.02 --n_trajs=200 --n_iters=20
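
The three tweaks mentioned above are standard building blocks. The plain-Python sketch below shows one common form of each. Whether this repo clips by norm or by value, and how it scales the L2 term, is not stated here, so those choices are assumptions, as are the function names.

```python
import math


def elu(x, alpha=1.0):
    """Exponential linear unit: identity for x > 0, smooth saturation below."""
    return x if x > 0 else alpha * (math.exp(x) - 1.0)


def clip_by_norm(grad, max_norm):
    """Rescale the gradient so that its L2 norm is at most max_norm."""
    norm = math.sqrt(sum(g * g for g in grad))
    if norm <= max_norm:
        return list(grad)
    return [g * max_norm / norm for g in grad]


def add_l2_gradient(grad, weights, l2):
    """Gradient of loss + (l2 / 2) * ||w||^2, given the gradient of the loss."""
    return [g + l2 * w for g, w in zip(grad, weights)]


print(elu(1.0), elu(0.0))                                # 1.0 0.0
print(add_l2_gradient([1.0, 1.0], [2.0, -2.0], l2=0.5))  # [2.0, 0.0]
clipped = clip_by_norm([3.0, 4.0], max_norm=1.0)         # norm 5 scaled down to norm 1
```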

MIT License
