Replies: 11 comments
-
Can you say a bit more about this?
-
Damn, Vaibhav is watching the repo like a hawk.
Currently, the Hessian patterns are encoded as sets of tuples, because it fits well into the theoretical part of our paper. But we have other storage ideas, including some that would sacrifice some accuracy for huge speedups. The main one is storing a single Hessian pattern for all scalar quantities involved in the algorithm. Resulting memory savings would be absolutely off-the-charts, and I wouldn't be surprised if this got us within reach of JuMP.
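To make the two storage ideas concrete, here is a rough sketch (hypothetical code, not SCT's actual internals) of a Hessian pattern as a set of index tuples, per scalar and then shared globally:

```julia
# Hypothetical sketch (not SCT's actual internals): a Hessian sparsity
# pattern stored as a set of index tuples, one set per scalar quantity.
hessian_pattern(patterns...) = union(patterns...)

# Pattern of x1*x2 + x3^2 on 3 variables: nonzero second derivatives
# at (1,2), (2,1), and (3,3).
p1 = Set([(1, 2), (2, 1)])   # from x1 * x2
p2 = Set([(3, 3)])           # from x3 ^ 2
p = hessian_pattern(p1, p2)

# The "shared pattern" idea: instead of one set per scalar, every scalar
# pushes into a single global set, trading accuracy for memory.
shared = Set{Tuple{Int,Int}}()
for s in (p1, p2)
    union!(shared, s)        # in-place union into the one shared pattern
end
```

With a single shared set, intermediate scalars no longer each allocate their own pattern, which is where the memory savings would come from.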
-
😂 yeah you guys are pretty much helping solve problems that bugged me for a year so I try to keep up (I know my integration attempt might not make it seem like that but I have good reasons for the delay sorry 😭)
This would be tremendously useful in some cases where the pattern changes and has to be recomputed (so it doesn't need to be very accurate). This makes sense, thanks for the insight!
-
That's precisely the approach we're taking in our paper with Adrian. Our gamble is to make sparsity pattern detection so fast that it's even worth it when you do it every time due to pattern changes.
-
Agreed! As Guillaume already mentioned, we have so far focused on correctness over performance. In our current implementation, each scalar value (each tracer) carries an index set that indicates non-zero derivatives. To obtain accurate results over branching compute graphs, these sets are never mutated in place.

Even within this accurate approach, we have very low-hanging fruit to pick, like smarter unions over (partially) empty index sets (#80). A big performance gain could come from adding methods on the array level, e.g. to LinearAlgebra (#115): currently, operations like dense matrix-vector products of tracers dispatch down to scalar operations, whereas we could be a lot smarter in the way we compute unions of index sets.

Finally, at the cost of accuracy, an index set that is "shared globally" over the entire computation would result in large memory savings and huge performance increases. Our move away from simple
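The tracer mechanism described above can be sketched roughly like this (assumed names, not SCT's real API): each scalar carries the set of input indices it depends on, and operations build new sets rather than mutating, so branching compute graphs stay correct:

```julia
# Minimal sketch of a gradient tracer (hypothetical, not SCT's real API).
struct Tracer
    inds::Set{Int}   # indices of inputs with a nonzero derivative
end

# Binary ops union the index sets; `union` allocates a fresh set,
# so the operands' sets are never mutated in place.
Base.:+(a::Tracer, b::Tracer) = Tracer(union(a.inds, b.inds))
Base.:*(a::Tracer, b::Tracer) = Tracer(union(a.inds, b.inds))

x = [Tracer(Set([i])) for i in 1:3]   # one tracer per input variable
y = x[1] * x[2] + x[3]                # y depends on inputs 1, 2, and 3
```

Because every operation returns a fresh set, `x[1].inds` is still just `{1}` after computing `y`, which is exactly the correctness property that in-place mutation would break.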
-
Here is the same benchmark but where I added Symbolics. I wasn't able to run every problem in a reasonable amount of time, missing the last few. If the Pluto build of https://github.com/gdalle/SparsityDetectionComparison ever finishes, we'll have them all.
-
Skimming the benchmarking notebook, it also uses the default set type. If other available set types are more performant, we should change this default.
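As an illustration of why the set type matters (a generic Julia comparison, not the benchmark itself), unions over `BitSet` are bitwise operations while unions over `Set{Int}` go through hashing:

```julia
# Illustrative only: same union computed with two different set types.
a_bits, b_bits = BitSet(1:2:1000), BitSet(2:2:1000)   # odds, evens
a_hash, b_hash = Set(1:2:1000), Set(2:2:1000)

u_bits = union(a_bits, b_bits)   # bitwise OR over packed words
u_hash = union(a_hash, b_hash)   # element-by-element hash insertion
```

For dense index ranges like these, the `BitSet` union touches far less memory, which is the kind of difference a change of default could exploit.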
-
FYI @gdalle: Decent speed-ups (up to a factor of 2) on some set types in #119 (comment).
-
Nearly complete benchmark at https://gdalle.github.io/SparsityDetectionComparison/. I only removed the tetra_stuff instances because at least one of them caused CI to still be running after 4 hours (due to Symbolics).
-
Excellent work Guillaume! 🚀
-
I can't help but notice (after including the number of variables and constraints in the plot) that most of the optimization problems in the suite are rather small, even tiny.
-
Here are the results of a benchmark between SCT and JuMP sparsity detection, on the suite of OptimizationProblems. The benchmark code is at https://github.com/gdalle/SparsityDetectionComparison
Provided that @amontoison validates the benchmark as fair, we are:
Note that these benchmarks are done without any optimization, for instance on the set types in the detector. And our Hessian code is still pretty slow, but that's expected and probably fixable. Stay tuned!