On the ambiguity of `.shape` behavior #891
@ricardoV94 thanks for the questions and idea. I'd like to try to get some clarity on the actual problem first. Here is a bit of context and some thoughts:
I assume you meant …

In general, lazy implementations have more limitations than eager ones; for example, you cannot use functions from the stdlib most of the time. That's not specific to the standard though. The standard is carefully designed to not require eager behavior unless it absolutely cannot be avoided, and those few parts have warnings about value-dependent behavior. The most annoying one is …
I hope the above makes clear that this is not a case that happens in the real world, since static shapes are never unknown. Are you running into an actual problem using or implementing …?
The point was that this complicates writing code that operates on …

Now this is fine within a library, because I'm allowed to define `x.shape` as I want. But then what about meta-libraries that want to implement their own version of `reshape`? They would need to know whether the library is going to do the first sort of shape or the second, so they cannot be backend agnostic. Am I misunderstanding the scope of the project?
Or, put another way: why would anyone implement `x.shape` as a tuple with `None` in their library? Is anyone doing it / interested in that format?
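To make the meta-library concern above concrete, here is a minimal sketch (my own illustration, not code from any of the libraries mentioned) of a backend-agnostic helper that works whenever `x.shape` yields multipliable entries, but breaks on a `tuple[int | None, ...]`:

```python
import math

import numpy as np


def flatten_trailing(xp, x):
    # Collapse everything after the first axis. This only needs the entries of
    # x.shape to support multiplication (ints, or symbolic dims in a lazy
    # library); a None entry breaks it.
    return xp.reshape(x, (x.shape[0], math.prod(x.shape[1:])))


print(flatten_trailing(np, np.ones((2, 3, 4))).shape)  # (2, 12)

# If a library instead reported the static shape (2, None, 4), the same
# helper would fail:
try:
    math.prod((None, 4))
except TypeError as exc:
    print("static shape with None breaks prod:", exc)
```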
This is just not true? It always works eagerly because static shapes are known, and it always works lazily because …
Yes, that's a misunderstanding, unless I'm misunderstanding what you are saying - one of the key goals of this whole effort is to allow libraries to write code that's agnostic to the library and execution model that's backing the input arrays.
I don't quite understand this question, so I'll answer the one below.
It's only …
I'm not that familiar with PyTensor, so there's a chance there is something I'm missing behind your questions. There's also a lot of history here. We are rapidly gaining more experience with lazy libraries and their strengths and limitations when used through the standard, e.g. adding support for JAX and Dask in SciPy and scikit-learn. I'm happy to set up a call and talk it through if you prefer?
I guess my question is: how do I decide which format to offer? Well, it's easy to answer that, because if I want …

But importantly for me: will another library ever look for …?

For a concrete example, when adding the PyTensor backend to einops, we implemented … I had to tell the library how to do that specifically for the PyTensor backend (there's something similar for non-eager TF above). I guess for dask the equivalent would be to call … No idea how the JAX case can be used from the outside.

Maybe the point is that without the standard, a meta-library like einops has to figure out which backend it is if it wants to make eager decisions on lazy graphs? That's why I feel this may be connected to #839, although this one is about shapes and not values, which is a simpler case?
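For illustration, here is a rough sketch of the kind of per-backend special-casing described above (my own sketch, not einops' actual implementation; the PyTensor attribute access follows the `variable.type.shape` distinction discussed in this issue):

```python
import math


def known_trailing_size(x):
    # Hypothetical meta-library helper: pick a static shape per backend and
    # make an eager decision from it, or give up if it isn't fully known.
    mod = type(x).__module__
    if mod.startswith("pytensor"):
        static = x.type.shape       # tuple[int | None, ...] in PyTensor
    else:
        static = tuple(x.shape)     # eager libraries: plain ints
    if any(dim is None for dim in static[1:]):
        raise ValueError("trailing dimensions are not statically known")
    return math.prod(static[1:])
```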
I'm sure we're both missing something (me more) :) Feel free to reach out to me.
I'd say put in the actual values if you have them, and …

From what I've seen, this is only done when an algorithm has inherently value-dependent behavior, so there is no way to keep things lazy. E.g.:

```python
if unique(x).shape[0] < 5:
    small_size_algo(...)
else:
    regular_algo(...)
```

Scikit-learn has a fair amount of code like that, for example, often using … These cases are very hard to support for lazy arrays, and that's more the problem than whether you hit the "must compute or raise" point in …
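As a toy illustration (not any real library's API) of where such branches force evaluation for a lazy backend: the `if` statement calls `__bool__`, which is exactly the "must compute or raise" point.

```python
class LazyScalar:
    """Stand-in for a deferred scalar such as unique(x).shape[0]."""

    def __init__(self, thunk):
        self._thunk = thunk                      # deferred computation

    def __lt__(self, other):
        return LazyScalar(lambda: self._thunk() < other)

    def __bool__(self):
        # The "must compute or raise" point: `if` needs a concrete answer.
        return bool(self._thunk())


n_unique = LazyScalar(lambda: 7)
if n_unique < 5:                                 # __bool__ forces evaluation here
    print("small_size_algo")
else:
    print("regular_algo")
```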
done!
According to #97 a library can decide to either return a `tuple[int | None, ...]` or a tuple-like object that … This seems like a recipe for disaster? The second option allows operating on shape graphs, whereas the first would fail when you try to act on `None`, say to find the size of some dimensions by doing `prod(x.shape[1:])` (forced example so that `.size` wouldn't be applicable).

In PyTensor we have the distinction between `variable.shape` and `variable.type.shape`, which correspond to those two kinds of output. They are flipped though, and it seems odd to make `variable.shape` return a tuple with `None`. It doesn't make sense to build a computation on top of the static shape, because those `None` entries are not linked to anything.

Besides that, we sometimes also allow users to replace variables with different static shapes, although that's arguably a bit of undefined behavior. It seems to contradict the specification that the shape must be immutable, so I'm happy to say it's out of scope.
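To illustrate the distinction above, here is roughly how it looks in PyTensor as I understand its API (a sketch; exact constructors may differ between versions):

```python
import numpy as np
import pytensor.tensor as pt

# A variable whose second dimension is not statically known.
x = pt.TensorType("float64", shape=(2, None, 4))("x")

print(x.type.shape)   # (2, None, 4): static shape, a plain tuple with None
print(x.shape)        # a symbolic shape variable, usable in further graphs

# Building on the symbolic shape works even for the unknown dimension.
size = pt.prod(x.shape[1:])
print(size.eval({x: np.ones((2, 3, 4))}))  # 12
```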
Proposal

Would it make sense to separate the two kinds of shape clearly? Perhaps as `variable.shape` and `variable.static_shape`. The first should be valid for building computations on top of variable shapes, statically known or not, while the second would allow libraries to reason as much as possible about what is known (and choose to fail if the provided information is insufficient), without having to probe which kind of shape output is returned by a specific library.

This is somewhat related to #839, where a library may need as much information as possible to make a decision. Perhaps a `static_value` would also make sense, for a library to return the entries that can be known ahead of time. Anyway, that should be discussed there.

If both options make sense, I would argue that `.shape` should behave the way PyTensor's does.

The standard should also specify whether `library.shape(x)` should match `x.shape` or `x.static_shape`. Again, I think it should match the first.
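A small consumer-side sketch of how the proposed split could be used (the `static_shape` name comes from the proposal above; everything else here is illustrative, not an existing API):

```python
import math

import numpy as np


def trailing_size(x):
    # Prefer the statically known information when it is complete...
    static = getattr(x, "static_shape", x.shape)
    if all(dim is not None for dim in static[1:]):
        return math.prod(static[1:])
    # ...otherwise fall back to the always-computable (possibly lazy) shape.
    return math.prod(x.shape[1:])


# Eager arrays (which have no static_shape attribute here) just work:
print(trailing_size(np.ones((2, 3, 4))))  # 12
```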