Replies: 3 comments
- PR is open: #221
- Thank you for bringing this up, and for the MR with it. Does it also make sense to do the same for the other MDP signals? Once that is done, we should also be able to incorporate multi-agent learning setups. @Dhoeller19 this is akin to the discussions we had on generalizing the managers further.
- For the actions and commands it might make sense to do this, given the multi-agent setups you mention. Since those are efforts parallel to the multi-critic work, though, I'd appreciate it if we did not bulk up this issue with multi-agent concerns on top of the multi-critic changes. EDIT: What I'm mostly trying to say is that I would prefer not to delay #221 over multi-agent work.
Proposal
Currently, the reward manager assumes there is a single reward, made up of multiple reward terms. However, multi-critic scenarios (e.g. constrained MDPs, CMDPs) require, as the name implies, multiple critics, each with its own reward signal.
I suggest generalizing the reward manager to support multiple groups of reward terms, akin to how the observation manager handles observation groups.
Motivation
Allows for multi-critic setups.
Pitch
Allow multiple groups of rewards, each with its own reward terms.
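To make the pitch concrete, here is a minimal sketch of what a grouped reward manager could look like. This is a hypothetical illustration, not Isaac Lab's actual API: the names `RewardTermCfg`, `RewardGroupCfg`, and `GroupedRewardManager` are assumptions chosen to mirror the observation-group pattern, and the term functions are stand-ins.

```python
# Hypothetical sketch: a reward manager that computes one scalar reward per
# named group, mirroring how the observation manager handles observation
# groups. Not Isaac Lab's real API; all names here are illustrative.
from dataclasses import dataclass, field
from typing import Callable, Dict


@dataclass
class RewardTermCfg:
    func: Callable[..., float]  # term function, evaluated each step
    weight: float = 1.0


@dataclass
class RewardGroupCfg:
    terms: Dict[str, RewardTermCfg] = field(default_factory=dict)


class GroupedRewardManager:
    """Computes a separate weighted-sum reward for each group (e.g. one per critic)."""

    def __init__(self, groups: Dict[str, RewardGroupCfg]):
        self.groups = groups

    def compute(self, env) -> Dict[str, float]:
        # One scalar per group: the weighted sum of that group's terms.
        return {
            name: sum(cfg.weight * cfg.func(env) for cfg in group.terms.values())
            for name, group in self.groups.items()
        }


# Usage: a "reward" group for the main critic and a "cost" group for a
# CMDP-style cost critic. The term functions here are dummies.
groups = {
    "reward": RewardGroupCfg(terms={
        "forward_velocity": RewardTermCfg(func=lambda env: 1.0, weight=2.0),
    }),
    "cost": RewardGroupCfg(terms={
        "joint_torque_limit": RewardTermCfg(func=lambda env: 1.0, weight=0.1),
    }),
}
manager = GroupedRewardManager(groups)
print(manager.compute(env=None))  # {'reward': 2.0, 'cost': 0.1}
```

With this shape, the single-reward case is just a manager with one group, which keeps the change backward compatible in spirit.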
Alternatives
Create a dedicated multi-reward manager, or a separate manager per critic type (e.g. a "cost manager" for CMDPs).
Additional context
I'll open a PR.