Proximal Inverse Reward Optimization
Note: If you are running the code for the first time (i.e., no expert data is present in the folder expert_data), please uncomment Line 84 and Line 85 in main.py. These lines train and save the expert policy model, then sample and save the demonstrated trajectories.
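In other words, the first run generates the expert data and later runs reuse it. A typical workflow might look like the sketch below (the task-name placeholder is illustrative; Lines 84 and 85 refer to main.py as noted above):

    # First run: Lines 84-85 uncommented, so the expert policy is trained
    # and demonstrations are saved under expert_data/
    python main.py --env_name <task_name>

    # Subsequent runs: re-comment Lines 84-85 to reuse the saved expert data
    python main.py --env_name <task_name>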
PRIO:
python main.py --env_name <task_name>
f-IRL:
python firl.py --env_name <task_name>
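For example, to run both methods on a single task (HalfCheetah-v2 is a hypothetical task name; check argument.py for the names this repo actually accepts):

    python main.py --env_name HalfCheetah-v2
    python firl.py --env_name HalfCheetah-v2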
See argument.py for more adjustable parameters.