Purpose of CommonRLInterface #166

Closed
Sid-Bhatia-0 opened this issue Dec 16, 2020 · 12 comments

Comments

@Sid-Bhatia-0
Member

I was wondering what is the purpose and use case of the CommonRLInterface.

Let us say a user has their own RL interface called MyRLInterface.

This means that in order to use RLCore, RLZoo etc., one would have to do the following conversion (if using CommonRLInterface):
MyRLInterface -> CommonRLInterface -> RLBase interface

Since RLBase interface is more expressive than CommonRLInterface, the conversion CommonRLInterface -> RLBase interface would be lossless.

If MyRLInterface is more expressive than CommonRLInterface, then the conversion MyRLInterface -> CommonRLInterface may be lossy.

If MyRLInterface is less expressive than CommonRLInterface, then the conversion MyRLInterface -> RLBase interface can be done directly without the additional step to convert to CommonRLInterface in between.
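To make the lossy-versus-lossless argument concrete, here is a minimal sketch in Python rather than Julia. Every name in it (`MyEnv`, `NarrowEnv`, `RichFromNarrow`, `RichFromMy`, `legal_actions`) is hypothetical and chosen only for illustration; none of it is the actual CommonRLInterface or RLBase API. It assumes the custom interface exposes one capability (legal actions) that the narrow interface has no slot for.

```python
class MyEnv:
    """Hypothetical environment with its own, fairly rich interface."""

    def __init__(self):
        self.position = 0

    def my_reset(self):
        self.position = 0

    def my_step(self, action):
        self.position += action
        return 1.0 if self.position == 3 else 0.0  # reward on reaching the goal

    def my_done(self):
        return self.position >= 3

    def my_legal_actions(self):
        # Extra capability that the narrow interface below cannot express.
        return [1] if self.position < 3 else []


class NarrowEnv:
    """Stand-in for a small shared interface (assumed: reset, actions, act, terminated)."""

    def __init__(self, env):
        self._env = env

    def reset(self):
        self._env.my_reset()

    def actions(self):
        return [-1, 1]

    def act(self, action):
        return self._env.my_step(action)

    def terminated(self):
        return self._env.my_done()


class RichFromNarrow:
    """Stand-in for the more expressive interface, built on top of NarrowEnv."""

    def __init__(self, env):
        self._env = env

    def actions(self):
        return self._env.actions()

    def legal_actions(self):
        # The narrow layer never exposed legal actions, so fall back to all actions.
        return self._env.actions()


class RichFromMy:
    """The same expressive interface, built directly on MyEnv."""

    def __init__(self, env):
        self._env = env

    def actions(self):
        return [-1, 1]

    def legal_actions(self):
        return self._env.my_legal_actions()


direct = RichFromMy(MyEnv())
via_narrow = RichFromNarrow(NarrowEnv(MyEnv()))
print(direct.legal_actions())      # information preserved
print(via_narrow.legal_actions())  # information lost in the middle step
```

In this sketch the narrow-to-rich step loses nothing by itself; the loss happens one step earlier, when the custom interface is squeezed into the narrow one.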

So I am wondering when one would need to use the CommonRLInterface?

@findmyway
Member

In short, CommonRLInterface is simple enough to be shared across different packages, like POMDPs, AlphaZero, and ReinforcementLearning.jl.

@Sid-Bhatia-0
Member Author

Ok... So if I understand correctly, do you mean that other packages (like POMDPs, AlphaZero) would often directly use CommonRLInterface instead of creating their own MyRLInterface? Essentially, CommonRLInterface would prevent the need of creating a MyRLInterface entirely, at least in those simple settings which don't require an API as expressive as the RLBase interface?

@findmyway
Member

> Essentially, CommonRLInterface would prevent the need of creating a MyRLInterface entirely

I'd say partially instead of entirely. So that we can easily test an environment across different packages.

@Sid-Bhatia-0
Member Author

> I'd say partially instead of entirely. So that we can easily test an environment across different packages.

Are you referring to those environments that have already implemented their own MyRLInterface? But can't such environments convert MyRLInterface -> RLBase interface for testing purposes instead of the CommonRLInterface?

@findmyway
Member

So the Common here means that we all obey the same set of interfaces and don't need to know others' implementation details (MyRLInterface you mentioned above).

@Sid-Bhatia-0
Member Author

> So the Common here means that we all obey the same set of interfaces and don't need to know others' implementation details (MyRLInterface you mentioned above).

Yes. I understand. But in order for the CommonRLInterface to be able to incorporate a variety of environment implementations (for whatever purpose, like testing or using algorithms off the shelf), it has got to be a pretty general one. And RLBase interface seems to be the more general one out of the two. Or am I missing something?

@findmyway
Member

> And RLBase interface seems to be the more general one out of the two. Or am I missing something?

Many things in RLBase are still experimental, as you might have noticed in recent commits. The instability hurts its usage among a broader field. And CommonRLInterface is a good compromise here, less expressive but simple yet stable enough.

@Sid-Bhatia-0
Member Author

> The instability hurts its usage among a broader field.
> And CommonRLInterface is a good compromise here, less expressive but simple yet stable enough.

Ok. Now I get it. Thanks!

@zsunberg
Member

@Sid-Bhatia-0 A goal of CommonRLInterface (at least in my mind) is to get to a place where it can both be very expressive and very stable. We added the `@provide`/`provided` mechanism so that the interface can be more flexible with a consistent way to deal with optional functions and allow for incremental changes without breaking too much.
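The idea of optional interface functions that callers can query for may be easier to see in a small sketch. The following is a Python analogy only, not the Julia implementation: the `provide` decorator, the `provided` helper, the registry, and the `valid_actions` function are all hypothetical names invented for this illustration. It assumes an environment declares which optional functions it supports, and a caller checks that before relying on one.

```python
# Registry of (environment class, optional function name) pairs.
_PROVIDED = set()


def provide(env_type, name):
    """Decorator: register `name` as an optional function implemented for env_type."""
    def register(func):
        _PROVIDED.add((env_type, name))
        setattr(env_type, name, func)  # attach the function as a method
        return func
    return register


def provided(name, env):
    """Return True if the optional function `name` was registered for env's class or a parent."""
    return any((cls, name) in _PROVIDED for cls in type(env).__mro__)


class GridEnv:
    """Hypothetical environment that opts in to one optional function."""

    def __init__(self):
        self.pos = 0

    def actions(self):
        return [-1, 1]


@provide(GridEnv, "valid_actions")
def _grid_valid_actions(env):
    # Only moves that keep the position inside [0, 2] are valid.
    return [a for a in env.actions() if 0 <= env.pos + a <= 2]


class BareEnv:
    """Hypothetical environment that implements only the required part."""

    def actions(self):
        return [0, 1]


def candidate_actions(env):
    # Solver-side code: use the optional function when available, else fall back.
    if provided("valid_actions", env):
        return env.valid_actions()
    return env.actions()


print(candidate_actions(GridEnv()))
print(candidate_actions(BareEnv()))
```

Under this scheme, adding a new optional function to the interface does not break environments that never implement it, which is the kind of incremental change described above.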

The historical perspective is this: When CommonRLInterface was started, RLBase also had an observe function that returned an object that had to have a specific interface, which was a bit unconventional, and took a bit of extra understanding. A big reason that I advocated for creating CommonRLInterface was so that it could be a more conventional alternative. Now RLBase has changed to be more conventional in that respect.

In any case, now that we have it, I think it is nice to have a separate environment package that we can discuss apart from any particular RL framework. Of course, there is always this danger: https://xkcd.com/927/ 🤣

@Sid-Bhatia-0
Member Author

Thank you for pointing out these things @zsunberg

> We added the `@provide`/`provided` mechanism so that the interface can be more flexible with a consistent way to deal with optional functions and allow for incremental changes without breaking too much.

Nice idea! I never thought about something like this before.

> A goal of CommonRLInterface (at least in my mind) is to get to a place where it can both be very expressive and very stable.
> A big reason that I advocated for creating CommonRLInterface was so that it could be a more conventional alternative. Now RLBase has changed to be more conventional in that respect.

I see. So are we hoping/aiming that the RLBase and CommonRLInterface will both eventually become equally expressive? This would imply that either one can be converted to the other in a lossless manner. Also, at such a stage, if it comes, some crucial packages like RLCore and RLZoo would still be sticking to the RLBase interface, right?

> In any case, now that we have it, I think it is nice to have a separate environment package that we can discuss apart from any particular RL framework.

Also, could you throw some light on what you mean by this environment package? Methods like get_state etc. are already there in RLBase (with analogues in CommonRLInterface), right?

@zsunberg
Member

zsunberg commented Jan 5, 2021

> So are we hoping/aiming that the RLBase and CommonRLInterface will both eventually become equally expressive?

Yes, I think this should be a goal for CommonRLInterface at least.

> Also, at such a stage, if it comes, some crucial packages like RLCore and RLZoo would still be sticking to the RLBase interface, right?

That's up to @findmyway or other people who are working on those packages. There is a lot to learn about interface design. I will get a lot of experience with other people using CommonRLInterface in my class this Spring, and we can see what other people writing RL packages prefer. If things like `provided` and the Wrappers and JuliaReinforcementLearning/CommonRLInterface.jl#44 work well and JuliaReinforcementLearning/CommonRLInterface.jl#36 can be solved well, it might be worth switching completely. In any case, since the environment interfaces in RLBase and CommonRLInterface are conceptually similar, it would not be too difficult to switch.

> Also, could you throw some light on what you mean by this environment package?

I just mean it has an interface for an environment and not agents or policies or experience buffers, etc. A complete RL framework like ReinforcementLearning.jl would have all of these.

@findmyway findmyway mentioned this issue Jan 8, 2021
@Sid-Bhatia-0
Member Author

Got it. Thanks!

findmyway pushed a commit that referenced this issue May 3, 2021