
Make array isapprox more generic by avoiding zip/explicit iteration #44893

Merged: 2 commits into JuliaLang:master on Sep 30, 2022

Conversation

@danielwe (Contributor) commented Apr 7, 2022

Hi all, I was going to raise an issue but figured I might as well frame it in terms of a proposed solution.

I stumbled upon a case where I couldn't isapprox two GPU arrays because they hit this zip, which tries to pull elements out one by one. Using mapreduce instead should fix the problem for these and for any other array type that doesn't support element-by-element iteration.
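For concreteness, here is a minimal sketch of the two fallback variants (hypothetical helper names and simplified keyword handling, not the exact diff; the isapprox_new timed below is essentially the existing array method with the second fallback swapped in):

# Current fallback: explicit pairwise iteration via zip, which pulls elements out
# one by one and fails for array types that forbid scalar indexing (e.g. GPU arrays).
isapprox_fallback_old(x, y; rtol=√eps(), atol=0, nans=false) =
    all(t -> isapprox(t[1], t[2]; rtol=rtol, atol=atol, nans=nans), zip(x, y))

# Proposed fallback: mapreduce lets each array type dispatch to its own (possibly
# parallel) reduction machinery instead of being iterated element by element.
isapprox_fallback_new(x, y; rtol=√eps(), atol=0, nans=false) =
    mapreduce((a, b) -> isapprox(a, b; rtol=rtol, atol=atol, nans=nans), &, x, y)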

The drawback is that this hurts performance for regular CPU arrays (#38558). The slowdown is worse in cases that admit an early short-circuit:

julia> A = randn(100000); B = copy(A); A[end] = Inf;  # No short-circuit

julia> @btime isapprox(A, B);
  172.186 μs (2 allocations: 781.30 KiB)

julia> @btime isapprox_new(A, B)
  292.771 μs (12 allocations: 879.28 KiB)

julia> A[1] = 0;  # Immediate short-circuit possible

julia> @btime isapprox(A, B);
  79.641 μs (2 allocations: 781.30 KiB)

julia> @btime isapprox_new(A, B);
  292.528 μs (12 allocations: 879.28 KiB)
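The asymmetry makes sense: the zip-based fallback goes through all, which bails out at the first false, so a mismatch at index 1 ends the scan after a single comparison, whereas a plain mapreduce has no early exit and always traverses both arrays. Roughly (illustration only, not code from the PR):

all(t -> isapprox(t[1], t[2]), zip(A, B))       # stops at the first failing pair
mapreduce((a, b) -> isapprox(a, b), &, A, B)    # always visits every element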

However, note that:

  • This branch is a fallback for an exceptional case
  • Much worse performance can occur for unrelated reasons, for example:
julia> A[1] = Inf;  # Why does this make norm(A - B) an order of magnitude slower??

julia> @btime isapprox(A, B);
  726.410 μs (2 allocations: 781.30 KiB)

julia> @btime isapprox_new(A, B);
  966.347 μs (12 allocations: 879.28 KiB)

So maybe a temporary slowdown (until mapreduce is improved) in this exceptional branch is an acceptable price to pay for more generic code? That way GPUArrays.jl won't have to duplicate the entire method just to change one line.

Bonus: some mapreduce calls in generic_norm are really just straightforward applications of minimum/maximum/sum, so I rewrote them as such for improved clarity. The behavior should be unchanged. Since this change also concerns reductions in generic linear algebra routines, I thought a single PR might suffice, even though this is really a separate issue.
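To illustrate the kind of simplification meant (a sketch with hypothetical helper names; the actual lines in generic.jl differ in detail):

using LinearAlgebra

# Reductions written as explicit mapreduce calls ...
inf_norm(x) = mapreduce(norm, max, x)
one_norm(x) = mapreduce(norm, +, x)

# ... say the same thing more directly as maximum/sum:
inf_norm(x) = maximum(norm, x)
one_norm(x) = sum(norm, x)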

About tests

  • isapprox isn't explicitly tested on arrays. On the other hand, it's the foundation for almost every other test, so in that sense, it's very thoroughly tested.
  • The generic_norm rewrites shouldn't change behavior, so existing tests should suffice.

EDIT: rolled back the mapreduce simplifications in norm since they trigger #44906 and are unrelated to the main issue here.

@stevengj (Member) commented Apr 9, 2022

isapprox is probably almost never a performance-critical operation, so a slight slowdown (in a rare corner case) in exchange for greater generality is probably a worthwhile tradeoff.

(There's been some discussion of a short-circuiting mapreduce in the past, but it doesn't seem to have gone anywhere.)

@stevengj added the maths and arrays labels on Apr 9, 2022
@danielwe (Contributor, Author) commented Apr 9, 2022

Thanks, that's what I figured as well.

I rolled back the unrelated norm simplifications so that this PR can be considered independently of #44906.

@stevengj (Member)

> Why does this make norm(A - B) an order of magnitude slower??

Floating-point exceptions are slow.

@stevengj (Member)

Can you double-check that we have test coverage for this case?

@danielwe (Contributor, Author)

> > Why does this make norm(A - B) an order of magnitude slower??
>
> Floating-point exceptions are slow.

What confuses me is that A already contained an infinity in the example that was 10 times faster. Performance dropped like a rock when going from 1 to 2 Infs.

I'll take a closer look at the tests.

@danielwe (Contributor, Author)

If I understand the tools correctly, this branch was hit by 150 tests in the last coveralls build: https://coveralls.io/builds/45390050/source?filename=stdlib%2FLinearAlgebra%2Fsrc%2Fgeneric.jl#L1680.

It's specifically targeted by the following test: https://github.com/JuliaLang/julia/blob/master/stdlib/LinearAlgebra/test/generic.jl#L407-L409

(I now realize that this line has been up for debate before: #17658)

@danielwe (Contributor, Author)

Bump to maybe get this in before the 1.9 feature freeze? I don't know what's causing the test failure; I can't seem to read the log. Rebasing to re-trigger CI and see what happens.

danielwe and others added 2 commits on September 28, 2022 at 14:09
  • This is more generic and works for, e.g., GPU arrays
@danielwe (Contributor, Author) commented Sep 30, 2022

The buildbot log says "make: *** No rule to make target 'win-extras'. Stop.", which sounds like a bug in the CI setup and is definitely not related to this PR.

@KristofferC (Member)

Since @stevengj approved and tests pass, I'll merge this.

@KristofferC merged commit ec5fe69 into JuliaLang:master on Sep 30, 2022