-
-
Notifications
You must be signed in to change notification settings - Fork 5.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Change init
's role in reduce-like functions: remove "neutral element" restriction and guarantee its use
#53871
Conversation
How does this play together with vectorization & associativity? I vaguely remember that this was a contentious point, since the result can be very different depending on how often that "initial value" (seems to be a misnomer) is used in reductions. I believe this is the reason the actual number of uses of |
That's exactly my motivation here — I want to push forward @stevengj's #52397. If Looks like #44819 was trying to add similar guarantees to
|
I'm torn on leaving the reduction position of While it can be deduced from the documentation, we could consider making an explicit note warning against using |
Co-authored-by: mikmoore <[email protected]>
We should definitely specify the ordering of In principle, we still don't need to specify the associativity, though in most cases a a = rand(Float16, 1000)
mapreduce(identity, +, a, init=0.0) do? Is it allowed to compute |
So this is trying to capture the current thought-processes in #49042 — in particular Jeff's comment. The case brought up in #44819 (comment) is helpful in thinking about an associative-but-not-commutative operator:
I'm explicitly saying that's a-ok with this PR — and in fact it's exactly what you sign up for with |
init
's role in reduce and mapreduceinit
's role in reduce-like functions: remove "neutral element" restriction and guarantee its use
Okay, but I think we should still document that (But I'm still worried that re-associating |
Yes, breaking this would be annoying for those that used it successfully. However, it was also explicitly not a valid use:
By the old documentation, it was explicitly undefined whether |
OK, I've updated the documentation changes here with the feedback and incorporated the requirement that At a minimum CUDA does not follow this specification: #49042 (comment) (it didn't match my initial shot, either, though). It's unclear to me where to go from here. |
(I don't see why parallel reduction ala CUDA requires |
It surely doesn't require it, but it is a current implementation that deviates from this PR's proposed documentation. I don't know how many other implementations are at variance. |
Fair enough — it seems like the documentation that |
I still think this is a useful exercise here because this is describing what I personally would want
Point (1) is tricky, because some common use-cases (included some documented ones) already disregard it. Points (2) and (3) would be perfectly compatible with the prior documentation if we remove the "guarantee". So where to go? We could either decide that this is a "minor change" worth doing given existing usages and implementations (not many packages seem to use a non-neutral init, but I could imagine that this kind of "abuse" would be more common in scripts and non-package code), or we could find a new kwarg name with these tighter semantics, or we could just give up on #49042 for now and clarify the status quo as best we can. |
Co-authored-by: Steven G. Johnson <[email protected]>
I would also like to remove the counter-intuitive restriction that |
I think I missed something along the way here. Can someone explain in what sense (Disregard the rest of my comments if there is a sense is which this is breaking to the use of Any existing implementations of I don't see this as profoundly different than any other feature expansion (e.g., most anything with a compat notice). Pre-existing implementations may not support the expanded interface.
|
No, you're exactly right — the "breaking-ness" here is simply that we're changing the rules of the game out from underneath existing implementations. From the calling side, all of these changes "fit" within the previous documentation. But it means that previously technically-correct implementations are now immediately buggy as soon as the new docs lands. That said, I do think it's warranted to call this a minor change and still do it, but that doesn't mean it's not without impact. It'd be good to get some voices from the JuliaFolds folks here (@MasonProtter, @jw3126, @DilumAluthge) — what do y'all think? |
I like it. I think some paralell implementations in the ecosystem will still want the neutral element assumption. But IMO it is then reasonable to make it their job to document that. |
I'm against this direction. This change would make it so that the I'd much rather that we make the opposite guarantee: that the
which was split to
then the |
I suppose the less breaking thing would be to retain If we really wanted, we could introduce a new keyword element to apply at the left (and right?) with the proposed semantics. However, the proposed changes don't do anything that couldn't be implemented simply and more clearly as |
I did a search for all uses of |
I'm with @MasonProtter . there are easy enough ways to simply add something to the left of a collection. reducing with an |
I haven't really read into the whole problem here, so just responding to the GPU bits.
Just dividing into non-empty chunks is not trivial, because many operations (e.g. tree-style reduction in shared memory, or subgroup intrinsics) are executed cooperatively, i.e., by a number of threads that may not cleanly divide the number of elements. Adding branches to prevent that is generally not a good idea on GPUs. So right now in GPUArrays.jl we use |
Thank you all! I'm still not totally thrilled about the "neutral element" meaning, but I can appreciate the arguments given here, especially since it is the status quo. |
I support the position of @MasonProtter. If a user requires a certain associativity, or that Maybe the non-associative example could be extended to explicitly allow all of the following to be returned by
Footnotes
|
Fwiw,
|
OK, I would still like to improve our semantics here. I've opened my shot at take-2 in #53945. |
Here's a shot at revamped documentation in light of #49042. The goal here is
threefoldfivefold:init
provides the return value if the collections are empty (status quo)init
must be a "neutral element" (change)init
is used exactly once (change from being completely unspecified)init
is used as the left-most argument in the reduction (change from being unspecified)And in the process, this cleans up the language around associativity a bit.
This is a draft as this language needs to be propagated through to all the "special" reduction functions like
sum
et al. But it gives us something concrete to consider.