Preserve eltype where possible for moments #688

palday · 2021-04-23T17:03:40Z

Fixes #680.

palday · 2021-04-26T11:38:12Z

The failure on nightly is an upstream bug (JuliaLang/julia#40609).

ararslan · 2021-04-26T19:46:15Z

This looks good but the prior uses of Float64 make me worried that the code may have issues around under/overflow when using other types.

palday · 2021-04-26T19:52:29Z

Ugh, I'm conflicted on that front: changing this would be breaking, but technically a user hitting overflow on their custom narrow types is expected behavior. @nalimilan do we have enough for another breaking pre-1.0 release?

nalimilan · 2021-04-26T20:15:42Z

We can break things if we want to tag 1.0. Though ideally I'd rather move these to Statistics, but that work has stalled (JuliaLang/julia#31395).

nalimilan · 2021-04-26T20:15:47Z

src/moments.jl

+    T = promote_type(eltype(v), eltype(wv), typeof(m))
+    s = zero(T)


To ensure type stability, how about following exactly what the loop does and using T = typeof((zero(eltype(v)) - zero(m))^2 * zero(eltype(w)))? That's what I did in my first attempt at moving these functions to Statistics: https://github.com/JuliaLang/julia/pull/31395/files#diff-6b82a0c4e60cbe68048e35375987a86688afb4aea60dda8f2d2d4938dc845776R375
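
To illustrate why the two ways of choosing the accumulator type can differ, here is a minimal sketch. The helper names `acc_type_promote` and `acc_type_arith` are hypothetical and not part of StatsBase; they only mirror the two expressions discussed in this thread. For most numeric element types the results agree, but for `Bool` the arithmetic itself widens to `Int`, so an accumulator initialised with `promote_type` would change type inside the loop.

```julia
# Hypothetical helpers, not part of StatsBase: two ways to pick the accumulator type.
acc_type_promote(v, w, m) = promote_type(eltype(v), eltype(w), typeof(m))
acc_type_arith(v, w, m) = typeof((zero(eltype(v)) - zero(m))^2 * zero(eltype(w)))

# Float32 everywhere: both approaches agree.
v = Float32[1, 2, 3]
w = Float32[1, 1, 1]
m = 2f0
t_promote_f32 = acc_type_promote(v, w, m)
t_arith_f32 = acc_type_arith(v, w, m)

# Bool everywhere: promote_type keeps Bool, but `false - false` is already an Int,
# so the value actually accumulated in the loop is an Int.
b = [true, false, true]
t_promote_bool = acc_type_promote(b, b, true)
t_arith_bool = acc_type_arith(b, b, true)
```

With `s = zero(Bool)` the first `s += ...` would rebind `s` to an `Int`, which is the kind of type instability the arithmetic-based expression avoids.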

palday · 2021-04-26T21:15:55Z

Okay, so on the breaking / non breaking front:

Did the API ever specify the return type for mean and the various moments? If not, this is not breaking in terms of type behavior.
In terms of numeric behavior, Julia generally keeps things at narrow types even at the risk of over-/underflow. As such, keeping the narrowest type that would be possible for the arithmetic operations in these computations seems fine. In other words, the new return type and potential numeric implications are expected in the Julia ecosystem.
The potential numeric issues with all of the moments here are possible regardless of type width -- we accumulate and then divide without sorting first, so
- we lose accuracy when values differ in scale: the smaller values would accumulate enough to matter on the larger scale, but each one is consistently rounded away when the initial values in the array are large (sorting solves this but is computationally expensive)
- we can have overflow in the accumulation step (batching the accumulate+divide on subsets solves this but potentially causes performance and numeric issues from the extra divisions)
The most numerically accurate approach would be to sort, do the math in a higher precision and cast down. But this also seems unacceptable -- if a user chooses to keep everything in single or half precision, then we should assume there is a reason for that. If they really need that level of numerical stability, then they should probably just implement their own moments.
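
The two accumulation problems listed above can be reproduced with a small sketch. `naive_sum` is a hypothetical helper written for this illustration, not StatsBase code; it accumulates left to right in a chosen type, as the loops in the moments do.

```julia
# Hypothetical helper: naive left-to-right accumulation in type T.
function naive_sum(::Type{T}, xs) where {T}
    s = zero(T)
    for x in xs
        s += T(x)
    end
    return s
end

# One large value followed by many small ones. In Float32 the spacing between
# neighbouring values at 2^24 is 2, so every `+ 1` is rounded away.
xs = vcat(Float32(2^24), ones(Float32, 1000))
unsorted32 = naive_sum(Float32, xs)        # the 1000 ones are lost
sorted32 = naive_sum(Float32, sort(xs))    # small values first: they add up before the large one
wide64 = naive_sum(Float64, xs)            # wider accumulator keeps them in any order

# Overflow during accumulation: floatmax(Float16) is 65504.
ys = fill(Float16(1000), 100)
overflow16 = naive_sum(Float16, ys)        # running sum exceeds the Float16 range
fine32 = naive_sum(Float32, ys)
```

Sorting recovers the lost contributions here only because the small values are summed before the large one arrives; it does nothing for the overflow case, where a wider accumulator is needed.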

In other words: given that we're not going to change the algorithm used (which I think is reasonable), the potential numeric issues from the eltype preservation should not be considered breaking. Perhaps the best solution is to add to each of the docstrings:

!!! note
    The computation is done in the narrowest type for which the underlying 
    arithmetic operations are defined based on the element types of the 
    vectors and the mean. For higher numerical accuracy and to reduce the risk
    of overflow when dealing with values near the limits of the type (or many 
    values whose sum is near the limits of the type), we recommend casting to 
    a broader type / higher precision first.
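
As a sketch of what the proposed note asks users to do, the following stand-in mirrors the weighted second-moment loop with an eltype-preserving accumulator. `moment2_sketch` is a hypothetical name and an assumption about the shape of the loop, not the StatsBase implementation; it normalises by the plain sum of weights and ignores bias correction.

```julia
# Hypothetical stand-in for a weighted second moment with an eltype-preserving
# accumulator; not the StatsBase implementation.
function moment2_sketch(v::AbstractVector{<:Real}, w::AbstractVector{<:Real}, m::Real)
    T = typeof((zero(eltype(v)) - zero(m))^2 * zero(eltype(w)))
    s = zero(T)
    @inbounds for i in eachindex(v, w)
        z = v[i] - m
        s += (z * z) * w[i]
    end
    return s / sum(w)
end

v = Float16[0, 600]
w = Float16[1, 1]
m = Float16(300)

# Half precision throughout: the squared deviation 300^2 = 90000 exceeds
# floatmax(Float16) = 65504, so the result overflows.
narrow = moment2_sketch(v, w, m)

# Casting to a broader type first, as the note recommends.
wide = moment2_sketch(Float64.(v), Float64.(w), Float64(m))
```

The narrow call keeps the result in `Float16`, as the PR intends, and the overflow is the documented trade-off; the cast is the opt-in for callers who need the extra range.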

nalimilan · 2021-04-27T07:30:05Z

Yeah it's not super likely to break user code. Though we could collect the few changes which are breaking (some of them are marked with the breaking label) and take that occasion to tag 1.0.

Co-authored-by: Milan Bouchet-Valat <[email protected]>

palday added 2 commits April 23, 2021 19:01

Preserve eltype where possible for moments

b54b9fa

rm extra newline

f85a002

palday requested a review from ararslan April 23, 2021 17:10

oxinabox approved these changes Apr 23, 2021

ararslan reviewed Apr 23, 2021

ararslan reviewed Apr 23, 2021

ararslan reviewed Apr 23, 2021

simplify

94d2656

palday requested a review from ararslan April 26, 2021 10:56

ararslan reviewed Apr 26, 2021

ararslan reviewed Apr 26, 2021

get serious about promotion

56e0fff

ararslan reviewed Apr 26, 2021

flatten

40e7290

nalimilan reviewed Apr 26, 2021

even more type stable

338eb05

nalimilan reviewed Apr 27, 2021

palday and others added 2 commits April 28, 2021 08:57

Apply suggestions from code review

f05e14f

Co-authored-by: Milan Bouchet-Valat <[email protected]>

similar type stability

eb0e7c3

nalimilan reviewed May 2, 2021

skewness and kurtosis

aacef2a
