
Q: Why are Classic observation spaces always so sparse (e.g. one-hot, unary, etc.)? #442

Closed
rallen10 opened this issue Aug 12, 2021 · 4 comments

Comments

@rallen10
Contributor

I'm developing a PettingZoo-based environment that is in the same family as the classic environments. I am trying to decide on the most effective way to encode the observation and action spaces, and I noticed that many (all?) of the classic environments use very sparse observation spaces. For example, Chess uses a separate 8x8 channel for each color-piece combination, and Hanabi uses unary encoding for cards, deck size, etc.
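
For concreteness, here is a rough sketch of that channel-per-piece idea (my own illustration; the function and piece lists are made up for the example, not PettingZoo's actual code):

```python
import numpy as np

# Hypothetical illustration: encode an 8x8 board with one binary
# channel per (color, piece-type) combination.
PIECE_TYPES = ["pawn", "knight", "bishop", "rook", "queen", "king"]
COLORS = ["white", "black"]

def encode_board(piece_at):
    """piece_at: dict mapping (row, col) -> (color, piece_type)."""
    obs = np.zeros((len(COLORS) * len(PIECE_TYPES), 8, 8), dtype=np.float32)
    for (row, col), (color, piece) in piece_at.items():
        channel = COLORS.index(color) * len(PIECE_TYPES) + PIECE_TYPES.index(piece)
        obs[channel, row, col] = 1.0  # every entry is 0 or 1, hence "sparse"
    return obs

# A board with just two pieces yields a 12x8x8 tensor that is almost all zeros.
obs = encode_board({(0, 4): ("white", "king"), (7, 4): ("black", "king")})
print(obs.shape, obs.sum())  # (12, 8, 8) 2.0
```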

I think of one-hot encoding as a tool for categorical data; this paradigm therefore seems to treat most (all?) observations in classic environments as categorical (which, to be fair, many of them are).

Why is it done this way? Is it simply modeled on AlphaZero, which also seemed to use sparse observation spaces, or is there literature that supports the widespread use of sparse encoding? Is it just to be on the safe side by treating all observations as categorical? Doesn't this come at the cost of obscuring relational data in observations (e.g. card order in Hanabi, Texas Hold'em, etc.) and expanding the observation space dimensions, which can slow learning?

This is part of a larger question on why sparse encoding seems to be so common in mature RL environments, not just PettingZoo. For example, even AlphaStar (for StarCraft II) seemed to use one-hot encoding for almost everything, even clearly non-categorical data like unit health.

@rallen10
Contributor Author

rallen10 commented Aug 12, 2021

Upon further consideration, I'm thinking this is just a normalization scheme to make all observation data in the range [0,1].
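
If that's the motivation, the contrast looks something like this (a toy sketch of my own, with made-up numbers; Hanabi's real encoder is more involved):

```python
import numpy as np

deck_size, max_deck_size = 30, 50

# Scalar normalization: one float in [0, 1], but it bakes in an
# ordinal/linear relationship between deck sizes.
scalar_obs = np.array([deck_size / max_deck_size], dtype=np.float32)

# Unary ("thermometer") encoding in the style Hanabi uses: also entirely
# in [0, 1], but each bit is an independent feature the network can
# weight separately.
unary_obs = np.zeros(max_deck_size, dtype=np.float32)
unary_obs[:deck_size] = 1.0
```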

@jkterry1
Member

I don't believe that any formal literature on this problem exists. The intuition, after talking to people (at DeepMind etc.), is to unambiguously specify every single state without potentially implying unnecessary relationships, in a sane data structure where illegal actions can be masked and everything sits in a [0, 1] scheme, and to "let the neural network figure it out" if additional relationships exist. There is actually evidence that alternate schemes may be viable: recently RL achieved superhuman performance at Dou Dizhu (a super famous three-player game in China) for the first time using a wildly different observation space, which broke a ton of stuff for us. The discussion is scattered all over, but I believe you can see the gist here and in the linked papers: datamllab/rlcard#228. However, no systematic study of these schemes exists (and it's not clear to me how that could even be done).
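
For readers landing here later: the "data structure where illegal actions can be masked" part looks roughly like the observation/action-mask dict the classic environments expose. A hedged sketch (shapes and numbers are made up for illustration):

```python
import numpy as np

# A flat observation with every feature in [0, 1], paired with a mask
# marking which actions are legal in the current state.
obs = {
    "observation": np.zeros(100, dtype=np.float32),  # all features in [0, 1]
    "action_mask": np.array([1, 0, 1, 1, 0], dtype=np.int8),  # 1 = legal
}

def masked_argmax(logits, action_mask):
    """Pick the highest-scoring legal action by pushing illegal logits to -inf."""
    masked = np.where(action_mask.astype(bool), logits, -np.inf)
    return int(np.argmax(masked))

logits = np.array([0.2, 9.0, 0.5, 0.1, 3.0])
print(masked_argmax(logits, obs["action_mask"]))  # 2: action 1 scores highest but is illegal
```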

@benblack769 feel free to chime in here

@jkterry1
Member

I'll also add one more point: why are sparse observations inherently so bad? I don't think they are, unless you're worried about embedded systems or the like.

@rallen10
Contributor Author

rallen10 commented Aug 12, 2021

Thanks for the insights. I don't know that they are inherently "bad", but it's also not immediately obvious why they would be inherently "good". Given their widespread use in mature RL environments, though, there does seem to be some kind of inherent benefit.

As for drawbacks of sparse observations, they can drastically increase the dimensionality of the observation space. My initial reaction is that a higher-dimensional observation space is slower to learn from than an equivalent lower-dimensional one for non-categorical data (e.g. Discrete(5) with dim=1 would be more sample efficient than OneHot(5) with dim=5), but I don't have any empirical evidence to back up this intuition, nor do I know whether it would be generally applicable.
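
To make the comparison concrete, here is a toy sketch of the two encodings (my own illustration; OneHot is not a real Gym space, just shorthand for the vector below):

```python
import numpy as np

value = 3  # an observation taking one of 5 possible values

# Discrete(5)-style: a single scalar. Compact, but implies an ordering
# (3 is "closer" to 4 than to 0), which may or may not be meaningful.
dense_obs = np.array([value / 4.0], dtype=np.float32)  # dim = 1, in [0, 1]

# One-hot: five independent inputs. No implied ordering, but 5x the width.
onehot_obs = np.eye(5, dtype=np.float32)[value]  # dim = 5
```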

I'll leave this open a bit longer in case anyone else wants to chime in, then I'll close the issue.
