Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

minimum() (foldl(min, ary)) seems to be much slower than it can be #56353

Closed
Moelf opened this issue Oct 27, 2024 · 5 comments
Closed

minimum() (foldl(min, ary)) seems to be much slower than it can be #56353

Moelf opened this issue Oct 27, 2024 · 5 comments
Labels
fold sum, maximum, reduce, foldl, etc. performance Must go faster

Comments

@Moelf
Copy link
Contributor

Moelf commented Oct 27, 2024

julia> function normal_findmin(dij)
           vmin = Inf
           for x in dij
               vmin = ifelse(x < vmin, x, vmin)
           end
           return vmin
       end
julia> a = rand(120)

julia> @be minimum($a)
Benchmark: 2911 samples with 71 evaluations
 min    376.620 ns
 median 412.183 ns
 mean   439.258 ns
 max    1.319 μs

julia> @be normal_findmin($a)
Benchmark: 2923 samples with 761 evaluations
 min    37.783 ns
 median 38.403 ns
 mean   40.277 ns
 max    130.361 ns

(Handling isempty() and init isn't enough to account for the difference)

@giordano giordano added performance Must go faster fold sum, maximum, reduce, foldl, etc. labels Oct 27, 2024
@giordano
Copy link
Contributor

Duplicate of #38558

@giordano giordano marked this as a duplicate of #38558 Oct 27, 2024
@giordano giordano closed this as not planned Won't fix, can't repro, duplicate, stale Oct 27, 2024
@Moelf
Copy link
Contributor Author

Moelf commented Oct 27, 2024

I think this is not a 100% duplicate. Yes, solving #38558 will solve this, but we don't need to solve #38558 to solve this -- we can imagine specialize minimum/maximum/maxima to at least make them faster?

@Moelf
Copy link
Contributor Author

Moelf commented Oct 27, 2024

at least maybe don't implement them as mapreduce? for identity, and Colon(), we just need foldl(), but foldl() is still slower, because fold() is also implemented via mapfoldreduce

@giordano giordano reopened this Oct 27, 2024
@jakobnissen
Copy link
Contributor

jakobnissen commented Oct 28, 2024

Your normal_findmin does not take NaN into account, and therefore compiles to more efficient machine code. Handling NaN (by using min instead of the hand-rolled ifelse(x < vmin, x, vmin)) completely closes the speed gap on my computer.

@Moelf
Copy link
Contributor Author

Moelf commented Oct 28, 2024

I will note that right now on aarch64 using min is actually faster, not slower, so maybe still something to be done for x86

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
fold sum, maximum, reduce, foldl, etc. performance Must go faster
Projects
None yet
Development

No branches or pull requests

3 participants