Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Accuracy issue with summation of @view #52457

Open
sjkelly opened this issue Dec 8, 2023 · 5 comments
Open

Accuracy issue with summation of @view #52457

sjkelly opened this issue Dec 8, 2023 · 5 comments
Labels
arrays [a, r, r, a, y, s] fold sum, maximum, reduce, foldl, etc. maths Mathematical functions

Comments

@sjkelly
Copy link
Contributor

sjkelly commented Dec 8, 2023

This was originally noticed in:
#28848 (comment)

I don't know if this is somewhat related to this issue, but I also found an accuracy issue which I think is more troublesome because it also affects DataFrames.jl

here is the MVP

using Distributions
using Random

# It doesn't happen always, here is a seed where this happens.
rng = MersenneTwister(630);
v = rand(rng, Normal(zero(Float32), one(Float32)), 1000)
sa = @view v[collect(1:end)]

# View (as SubArray) vs Vector
sum(sa) ≈ sum(v) # false
# They are different! and worse part is that the view version is less accurate! (according to Kahan compensated summation)

IndexStyle(v) isa IndexLinear # true
IndexStyle(sa) isa IndexCartesian #true

# They are dispatched to different implementations in base/reduce.jl > _mapreduce 

Another MWE is:

julia> using Statistics, Random

julia> Random.seed!(1);

julia> x = rand(Float32, 10^6);

julia> mean(x)
0.49974534f0

julia> mean(@view x[:])
0.49974534f0

julia> mean(@view x[collect(1:end)])
0.49974778f0

@mbauman points out that the latter in this MWE hits the naive iterator: foldl(+, x)/length(x) which leads to the poor floating point accuracy.

cc @el-oso

@mikmoore
Copy link
Contributor

mikmoore commented Dec 8, 2023

Would probably be resolved by #52397 (or easily resolved after it).

@stevengj
Copy link
Member

stevengj commented Dec 8, 2023

Yes, this line should resolve it: https://github.com/JuliaLang/julia/pull/52397/files#diff-97ad8b900c30b39398c32d78754c404f4a6df9a153d45284b2c11ab54837deb9R524 (changing the fallback for CartesianIndex arrays)

@stevengj stevengj added arrays [a, r, r, a, y, s] maths Mathematical functions fold sum, maximum, reduce, foldl, etc. labels Dec 8, 2023
@el-oso
Copy link

el-oso commented Dec 11, 2023

Thanks for the fast response!

@el-oso
Copy link

el-oso commented Dec 22, 2023

I just tested master and got this

 julia> sum(sa) ≈ sum(v)
true

Great!

@stevengj
Copy link
Member

Note (a) that the new pairwise sum is not merged yet and (b) the operator only checks whether half of the significant digits match.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
arrays [a, r, r, a, y, s] fold sum, maximum, reduce, foldl, etc. maths Mathematical functions
Projects
None yet
Development

No branches or pull requests

4 participants