40 finish first version of gym interface #50

Merged
merged 44 commits into from
Jan 21, 2024
670fc79
move sim code to robot.py
Armandpl Jan 14, 2024
6a7b3b2
use the new robot interface and simplify reset
Armandpl Jan 14, 2024
5728949
swap rendered for the o.g cartpole renderer
Armandpl Jan 14, 2024
ad05b35
move dyn code out; upgrade to gymnasium
Armandpl Jan 14, 2024
6edfde3
upgrade to gymnasium
Armandpl Jan 14, 2024
8f65b66
add missing dependencies
Armandpl Jan 14, 2024
91c70c7
reorg package structure
Armandpl Jan 14, 2024
0b502f8
setup training on sim to check everything ok
Armandpl Jan 15, 2024
0d21791
add missing configs
Armandpl Jan 15, 2024
efec34a
fix eval freq, fix reset vec env
Armandpl Jan 15, 2024
ccdd56b
fix path in lint ignores
Armandpl Jan 17, 2024
be977c5
match github action python version to min project python
Armandpl Jan 18, 2024
4057ab7
make reset not dependant on the velocity filter
Armandpl Jan 18, 2024
9e5872f
fix control freq wrapper
Armandpl Jan 18, 2024
7692f0e
handle random init state later
Armandpl Jan 18, 2024
69c3876
separate state_max for safety from obs max for learning
Armandpl Jan 18, 2024
ffbdafe
use control freq already set in the env
Armandpl Jan 18, 2024
865aff2
setup configs for training on real robot
Armandpl Jan 18, 2024
47b3d75
remove useless code
Armandpl Jan 18, 2024
2d73ad0
properly seed and wrap position between -pi pi when rendering
Armandpl Jan 18, 2024
3b018eb
properly seed
Armandpl Jan 18, 2024
9f95c33
add env.reset to seed everything to comply with gymnasium
Armandpl Jan 18, 2024
235c61f
clean up wrapper instantiation
Armandpl Jan 18, 2024
57c85c7
update config
Armandpl Jan 18, 2024
bb25273
make sure we can extend the wrappers config
Armandpl Jan 18, 2024
c3ea9ce
seed is keyword arg not positional
Armandpl Jan 18, 2024
920f3ab
lower max obs based on simulated values
Armandpl Jan 18, 2024
61c4e0b
only usb subproc vecenv when n_envs > 1
Armandpl Jan 18, 2024
fa53fdc
fix #25
Armandpl Jan 18, 2024
a5d02a5
reset velocity filter at each reset
Armandpl Jan 18, 2024
2293209
Revert "only usb subproc vecenv when n_envs > 1"
Armandpl Jan 18, 2024
b2e113a
add simple env test, add it to ci
Armandpl Jan 18, 2024
3781e04
tmp fix to pass CI
Armandpl Jan 18, 2024
2cbc048
fix E721
Armandpl Jan 18, 2024
f65e3dd
update configs
Armandpl Jan 18, 2024
cc6bbb3
change the way we config wrappers
Armandpl Jan 18, 2024
ffe6e9a
setup sweep on safety limits
Armandpl Jan 18, 2024
013468d
remove unecessary try catch
Armandpl Jan 21, 2024
39fec9c
remove unecessary line
Armandpl Jan 21, 2024
b30f1f8
describe obs, set the bounds slightly higer
Armandpl Jan 21, 2024
4d02dc2
add link to where render was copy pasted from
Armandpl Jan 21, 2024
360b6a5
pygame should work headless without a virtual display
Armandpl Jan 21, 2024
525be80
CPRs are actually floats
Armandpl Jan 21, 2024
d8a07de
add back comments describing the sim params
Armandpl Jan 21, 2024
90 changes: 90 additions & 0 deletions .github/workflows/ci.yml
@@ -0,0 +1,90 @@
name: ci

on:
  push:
    branches:
      - master
  pull_request:
    branches:
      - master

# https://jacobian.org/til/github-actions-poetry/
jobs:
  flake8-lint:
    runs-on: ubuntu-latest
    name: Lint
    steps:
      - name: Check out source repository
        uses: actions/checkout@v2
      - name: Set up Python environment
        uses: actions/setup-python@v1
        with:
          python-version: "3.9"
      - name: flake8 Lint
        uses: py-actions/flake8@v1
        with:
          ignore: "E203,E402,E501,F401,F841"
          exclude: "logs/*,data/*,furuta/logging/protobuf/*"

  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2

      # If you wanted to use multiple Python versions, you'd have to specify a matrix in the job and
      # reference the matrix python version here.
      - uses: actions/setup-python@v2
        with:
          python-version: 3.9

      # Cache the installation of Poetry itself, e.g. the next step. This prevents the workflow
      # from installing Poetry every time, which can be slow. Note the use of the Poetry version
      # number in the cache key, and the "-0" suffix: this allows you to invalidate the cache
      # manually if/when you want to upgrade Poetry, or if something goes wrong. This could be
      # mildly cleaner by using an environment variable, but I don't really care.
      - name: cache poetry install
        uses: actions/cache@v2
        with:
          path: ~/.local
          key: poetry-1.4.0-0

      # Install Poetry. You could do this manually, or there are several actions that do this.
      # `snok/install-poetry` seems to be minimal yet complete, and really just calls out to
      # Poetry's default install script, which feels correct. I pin the Poetry version here
      # because Poetry does occasionally change APIs between versions and I don't want my
      # actions to break if it does.
      #
      # The key configuration value here is `virtualenvs-in-project: true`: this creates the
      # venv as a `.venv` in your testing directory, which allows the next step to easily
      # cache it.
      - uses: snok/install-poetry@v1
        with:
          version: 1.4.0
          virtualenvs-create: true
          virtualenvs-in-project: true

      # Cache your dependencies (i.e. all the stuff in your `pyproject.toml`). Note the cache
      # key: if you're using multiple Python versions, or multiple OSes, you'd need to include
      # them in the cache key. I'm not, so it can be simple and just depend on the poetry.lock.
      - name: cache deps
        id: cache-deps
        uses: actions/cache@v2
        with:
          path: .venv
          key: pydeps-${{ hashFiles('**/poetry.lock') }}

      # Install dependencies. `--no-root` means "install all dependencies but not the project
      # itself", which is what you want to avoid caching _your_ code. The `if` statement
      # ensures this only runs on a cache miss.
      - run: poetry install --no-interaction --no-root
        if: steps.cache-deps.outputs.cache-hit != 'true'

      # Now install _your_ project. This isn't necessary for many types of projects -- particularly
      # things like Django apps don't need this. But it's a good idea since it fully-exercises the
      # pyproject.toml and makes sure that if you add things like console-scripts at some point that
      # they'll be installed and working.
      - run: poetry install --no-interaction

      # And finally run tests. I'm using pytest and all my pytest config is in my `pyproject.toml`
      # so this line is super-simple. But it could be as complex as you need.
      - run: poetry run pytest
26 changes: 0 additions & 26 deletions .github/workflows/lint.yml

This file was deleted.

3 changes: 2 additions & 1 deletion .gitignore
@@ -1,4 +1,5 @@
**/videos/*
**/outputs/*
**/wandb/*
**/runs/*
**/models/*
@@ -124,7 +125,7 @@ celerybeat.pid
 # Environments
 .env
 .venv
-env/
+# env/ bc we use env for gym env config but have no env/ dir for actual venvs
 venv/
 ENV/
 env.bak/
4 changes: 2 additions & 2 deletions furuta/__init__.py
@@ -2,9 +2,9 @@

 register(
     id="FurutaReal-v0",
-    entry_point="furuta_gym.envs:FurutaReal",
+    entry_point="furuta.rl.envs.furuta_real:FurutaReal",
 )
 register(
     id="FurutaSim-v0",
-    entry_point="furuta_gym.envs:FurutaSim",
+    entry_point="furuta.rl.envs.furuta_sim:FurutaSim",
 )
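The `entry_point` strings in this hunk follow a `package.module:ClassName` convention, and the named module is only imported when the environment is created. As a rough, stdlib-only sketch of how such a string can be resolved (the helper name `load_entry_point` is hypothetical and is not gymnasium's API; it only assumes the `module:attribute` format seen in the diff, and uses a standard-library class as a stand-in target):

```python
import importlib


def load_entry_point(entry_point: str):
    """Resolve a "package.module:Attribute" string to the object it names.

    Hypothetical helper for illustration; not part of gymnasium.
    """
    module_name, sep, attr_name = entry_point.partition(":")
    if not sep or not module_name or not attr_name:
        raise ValueError(f"Expected 'module:attribute', got {entry_point!r}")
    module = importlib.import_module(module_name)  # raises ImportError on a bad path
    return getattr(module, attr_name)


# Stand-in target from the standard library, so the sketch runs anywhere.
cls = load_entry_point("collections:OrderedDict")
print(cls.__name__)
```

Under this reading, a stale path such as `furuta_gym.envs:FurutaReal` would presumably fail at the import step once the package is reorganized, which would explain why both registrations change together.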
14 changes: 9 additions & 5 deletions furuta/controls/controllers.py
@@ -11,11 +11,15 @@ def compute_command(self, position: float):
     @staticmethod
     def build_controller(parameters: dict):
         controller_type = parameters["controller_type"]
-        match controller_type:
-            case "PIDController":
-                return PIDController(parameters)
-            case _:
-                raise ValueError(f"Invalid controller type: {controller_type}")
+        # match controller_type:
+        #     case "PIDController":
+        #         return PIDController(parameters)
+        #     case _:
+        #         raise ValueError(f"Invalid controller type: {controller_type}")
+        if controller_type == "PIDController":
+            return PIDController(parameters)
+        else:
+            raise ValueError(f"Invalid controller type: {controller_type}")


 class PIDController(Controller):
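The `match` statement requires Python 3.10 or later, so the `if`/`else` form keeps `build_controller` importable on Python 3.9, the version the CI job installs. If more controller types are added later, a lookup table is one possible alternative. This is a minimal sketch, not the project's code: `CONTROLLER_REGISTRY`, the stub `PIDController` body, and the `Kp` parameter are assumptions made for illustration.

```python
class PIDController:
    """Stub standing in for the real controller; it only stores its parameters."""

    def __init__(self, parameters: dict):
        self.parameters = parameters


# Hypothetical registry mapping the "controller_type" string to a class.
CONTROLLER_REGISTRY = {
    "PIDController": PIDController,
}


def build_controller(parameters: dict):
    controller_type = parameters["controller_type"]
    try:
        controller_cls = CONTROLLER_REGISTRY[controller_type]
    except KeyError:
        # Same error message as the if/else version in the diff.
        raise ValueError(f"Invalid controller type: {controller_type}") from None
    return controller_cls(parameters)


# "Kp" is a made-up parameter name used only for this example.
controller = build_controller({"controller_type": "PIDController", "Kp": 1.0})
print(type(controller).__name__)
```

Adding a controller then means adding one dictionary entry, and the sketch still runs on Python 3.9.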
4 changes: 1 addition & 3 deletions furuta/logging/protobuf/pendulum_state.proto
@@ -6,7 +6,5 @@ message PendulumState {
   float motor_angle_velocity = 3;
   float pendulum_angle_velocity = 4;
   float reward = 5;
-  bool done = 6;
-  float action = 7;
-  float corrected_action = 8;
+  float action = 6;
 }
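Renumbering `action` from 7 to 6 reuses the number that `done` used to hold. On the wire, each protobuf field is preceded by a key computed as `(field_number << 3) | wire_type`, so messages logged with the old schema will not decode the same way under the new one. A small stdlib-only sketch of that arithmetic (the helper `field_key` is hypothetical; the wire-type values are those given in the protobuf encoding rules):

```python
WIRE_VARINT = 0   # used by bool, int32, ...
WIRE_FIXED32 = 5  # used by float


def field_key(field_number: int, wire_type: int) -> int:
    """Return the key that precedes a field's value on the wire."""
    return (field_number << 3) | wire_type


old_done = field_key(6, WIRE_VARINT)     # bool done = 6
old_action = field_key(7, WIRE_FIXED32)  # float action = 7
new_action = field_key(6, WIRE_FIXED32)  # float action = 6

print(hex(old_done), hex(old_action), hex(new_action))  # 0x30 0x3d 0x35
```

A reader generated from the new schema looks for key `0x35`, so `action` values stored under `0x3d` in older logs would most likely surface as unknown fields rather than as `action`. If old logs need to stay readable, the usual protobuf advice is to mark removed field numbers as `reserved` rather than reuse them.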
21 changes: 11 additions & 10 deletions furuta/logging/protobuf/pendulum_state_pb2.py

Some generated files are not rendered by default.

3 changes: 0 additions & 3 deletions furuta/rl/__init__.py
@@ -1,3 +0,0 @@
-# from furuta_gym.envs.furuta_base import FurutaBase # noqa F420
-# from furuta_gym.envs.furuta_real import FurutaReal # noqa F420
-# from furuta_gym.envs.furuta_sim import FurutaSim # noqa F420
4 changes: 2 additions & 2 deletions furuta/rl/algos.py
@@ -6,11 +6,11 @@
 # check if they all have the train freq param
 # check if they have other tuple args
 # check if it would be cleaner for sb3 to accept list instead of tuple?
 # TODO also does having the sb3 import means i need sb3 when importing from everywhere else in the package?
 class SAC(stable_baselines3.SAC):
     def __init__(self, **kwargs):
         # sb3 expects tuple, omegaconf returns list
         # so we need to convert kwarg train_freq from list to tuple
-        kwargs.update({"train_freq": tuple(kwargs["train_freq"])})
+        if "train_freq" in kwargs and isinstance(kwargs["train_freq"], list):
+            kwargs.update({"train_freq": tuple(kwargs["train_freq"])})

         super().__init__(**kwargs)
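The new guard only converts a plain `list`. Whether the value that reaches `__init__` is a `list` or some other sequence-like config container depends on how the config is instantiated, which this diff does not show. A slightly more permissive sketch, under the assumption that any non-string sequence should become a tuple (the helper `normalize_train_freq` is hypothetical and deliberately avoids importing sb3 or omegaconf):

```python
from collections.abc import Sequence


def normalize_train_freq(kwargs: dict) -> dict:
    """Return a copy of kwargs with a sequence-valued train_freq turned into a tuple."""
    out = dict(kwargs)
    value = out.get("train_freq")
    # Strings are sequences too, so exclude them explicitly; tuples need no work.
    if isinstance(value, Sequence) and not isinstance(value, (str, bytes, tuple)):
        out["train_freq"] = tuple(value)
    return out


print(normalize_train_freq({"train_freq": [1, "episode"]}))
print(normalize_train_freq({"train_freq": 4}))
print(normalize_train_freq({"learning_rate": 3e-4}))
```

Integer values and missing keys pass through untouched, which matches the intent of the `"train_freq" in kwargs` check in the diff. Returning a copy also leaves the caller's dictionary unmodified.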