

[Proposal] RewardManager cannot handle multiple critics #220

Closed
1 task done
lorenwel opened this issue Jan 29, 2024 · 3 comments
Labels
enhancement New feature or request

Comments

@lorenwel
Contributor

Proposal

Currently, the reward manager assumes there is a single reward, made up of multiple reward terms. However, multi-critic scenarios, e.g. constrained MDPs (CMDPs), require multiple critics, as the name implies.
I suggest generalizing the reward manager to support multiple reward groups, akin to how the observation manager handles observation groups.

Motivation

Allows for multi-critic setups.

Pitch

Allow multiple groups of rewards with different reward terms.
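As a minimal sketch of the idea (hypothetical names, not the actual Orbit manager API), reward terms could be grouped so that each group yields its own scalar signal, e.g. a "reward" group for the standard critic and a "cost" group for a CMDP critic:

```python
# Hypothetical sketch: a reward manager that evaluates named groups of
# weighted reward terms, mirroring how the observation manager groups
# observation terms. All class and field names are illustrative.
from dataclasses import dataclass, field
from typing import Callable, Dict


@dataclass
class RewardTermCfg:
    func: Callable[[dict], float]  # term function evaluated on the env state
    weight: float = 1.0


@dataclass
class RewardGroupCfg:
    terms: Dict[str, RewardTermCfg] = field(default_factory=dict)


class MultiGroupRewardManager:
    """Computes one scalar per group, e.g. 'reward' and 'cost' for a CMDP."""

    def __init__(self, groups: Dict[str, RewardGroupCfg]):
        self.groups = groups

    def compute(self, state: dict) -> Dict[str, float]:
        # Each group is the weighted sum of its own terms.
        return {
            name: sum(cfg.weight * cfg.func(state) for cfg in group.terms.values())
            for name, group in self.groups.items()
        }


# Example: one group feeds the reward critic, the other a CMDP cost critic.
groups = {
    "reward": RewardGroupCfg(terms={
        "forward_velocity": RewardTermCfg(func=lambda s: s["vel_x"], weight=1.0),
    }),
    "cost": RewardGroupCfg(terms={
        "joint_torque": RewardTermCfg(func=lambda s: s["torque"] ** 2, weight=0.1),
    }),
}
manager = MultiGroupRewardManager(groups)
signals = manager.compute({"vel_x": 2.0, "torque": 3.0})
```

A single-group configuration reduces to the current behavior, which is what makes this generalization backward-compatible in spirit.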

Alternatives

Create a multi-reward manager, or a dedicated manager per critic (e.g. a "cost manager" for CMDPs).

Additional context

I'll open a PR.

Checklist

  • I have checked that there is no similar issue in the repo (required)
@lorenwel
Contributor Author

PR is open: #221

@Mayankm96
Contributor

Mayankm96 commented Jan 29, 2024

Thank you for bringing this up and the MR with it.

Does it also make sense to do the same for other MDP signals? Once that is done, we should also be able to incorporate multi-agent learning setups.

@Dhoeller19 This is akin to the discussions we had on generalizing the managers further

@Mayankm96 Mayankm96 added the enhancement New feature or request label Jan 29, 2024
ElonMusk9577 pushed a commit to ElonMusk9577/Orbit that referenced this issue Jan 30, 2024
…saac-sim#220)

# Description

This MR does the following:

* Changes the way environment and agent configs are handled. These are
now moved to their respective `env` folder.
* Replaces YAML file with configclasses for rsl_rl library
* Adds examples of other robots for locomotion-velocity task: anymal-b,
unitree-a1, and anymal-d (placeholder)

The new environment folder structure is as follows:

```
my_task (folder)
   |- mdp  (folder where all user's task-specific terms are stored)
   |- config
       |- robot_name_1  (folder)
            |- robot_specific_env_cfg.py
            |- agents (folder)
                |- rsl_rl_cfg.py
                |- sb3_cfg.py
                |- ....
            |- __init__.py (register robot-specific tasks)
       |- robot_name_2 (folder)  
   |- my_task_env_cfg.py (base env config)
   |- my_task_env.py (base env -- optional -- should normally use RLEnv)
```
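The role of each robot folder's `__init__.py` is to register that robot's task variants under distinct task IDs. The sketch below is illustrative only: a plain dict stands in for the gym registry that Orbit actually uses, and all task IDs and paths are hypothetical.

```python
# Hypothetical sketch of what config/robot_name_1/__init__.py does:
# map robot-specific task IDs to their environment configs.
# The dict is a stand-in for the real gym/gymnasium registry.
TASK_REGISTRY: dict = {}


def register(task_id: str, env_cfg_path: str) -> None:
    """Map a task ID to the dotted path of its environment config."""
    TASK_REGISTRY[task_id] = env_cfg_path


# One entry per robot-specific environment config:
register("My-Task-Robot1-v0", "my_task.config.robot_name_1.robot_specific_env_cfg")
register("My-Task-Robot2-v0", "my_task.config.robot_name_2.robot_specific_env_cfg")
```

This keeps the base task definition robot-agnostic while each robot folder contributes its own registered variants.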

Fixes isaac-sim#142 

## Type of change

- New feature (non-breaking change which adds functionality)
- Breaking change (fix or feature that would cause existing
functionality to not work as expected)

## Checklist

- [x] I have run the [`pre-commit` checks](https://pre-commit.com/) with
`./orbit.sh --format`
- [ ] I have made corresponding changes to the documentation
- [x] My changes generate no new warnings
- [ ] I have added tests that prove my fix is effective or that my
feature works
- [x] I have updated the changelog and the corresponding version in the
extension's `config/extension.toml` file

---------

Co-authored-by: Nikita Rudin <[email protected]>
@lorenwel
Contributor Author

lorenwel commented Jan 30, 2024

I guess it might make sense to do this for the actions and commands, for the multi-agent setups you mention.
Observations are already generalized, and for terminations and randomization it does not seem necessary to me.
Depending on the types of curricula you want, it might also be useful there in some fringe cases.

Since these are efforts parallel to the multi-critic work, though, I'd appreciate it if we did not bulk up this issue with multi-agent topics on top of the multi-critic changes.

EDIT: Or I guess what I'm mostly trying to say is that I would prefer not to delay #221 due to multi-agent stuff.

fatimaanes pushed a commit to fatimaanes/omniperf that referenced this issue Aug 8, 2024
…saac-sim#220)

(commit message identical to the one quoted above)
@isaac-sim isaac-sim locked and limited conversation to collaborators Oct 2, 2024
@Dhoeller19 Dhoeller19 converted this issue into discussion #1102 Oct 2, 2024

This issue was moved to a discussion.

You can continue the conversation there. Go to discussion →
