
OptimizationBase v2.0 plan #62

Closed
Vaibhavdixit02 opened this issue Jun 15, 2024 · 2 comments

@Vaibhavdixit02 (Member)

The current AD implementation here (at least) lacks the following features or suffers from the following issues:

  1. MOAR!! oracles: lag_h support for the FiniteDiff, MTK, and Enzyme AD implementations (#10), jvp and vjp for constraints, stochastic oracles (#61), and proximal oracles (#50)
  2. Redundant function evaluations: evaluate the function and gradient simultaneously (#22)
  3. Consistent support for exotic array types: static arrays with autodiff (#14) and real sparsity support in the Enzyme backend (#7)
  4. Some more backends: PolyesterForwardDiff (#21), HyperHessians.jl (#32), and FastDifferentiation
  5. Defaults
  6. Mixing AD backends when it makes sense

These have remained outstanding because of lower priority and the tedium of implementing each of them for multiple backends. The emergence of DI is therefore a timely solution for several of them.

My aim with #54 has been to start fleshing out how these would look. We obviously can't expect DI to be the solution for everything, but several of these are right up its alley, and it gives an excuse to spend time rethinking the implementation.

In #54 I plan to address #10, jvp and vjp for constraints, points 2 and 3, #21 and FastDifferentiation, and points 5 and 6. I hope to get this in by the end of summer at the absolute latest.

@Vaibhavdixit02 (Member, Author)

It might also make sense to create a v2.0 branch and make incremental PRs for these, building off #54 and targeting that branch.

@gdalle (Contributor) commented Jun 15, 2024

From my perspective, it would make sense to integrate DifferentiationInterface as early as possible, even if it doesn't do everything yet. The reason is that a large stress test like this would be great to spot bugs and inefficiencies.

Warning

Not everything is going to be perfect in DI from the start. But at least if we start using it, we can optimize every implementation in just one place instead of several.

MOAR!! oracles

The Lagrangian Hessian is a recurrent concern. A possible solution is gdalle/DifferentiationInterface.jl#206, but I think it's a bit overkill. gdalle/DifferentiationInterface.jl#311 is more reasonable: it would allow passing the multipliers as constant parameters.
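As a rough sketch of what that could look like, assuming the constant-argument feature from that PR lands (the `Constant` wrapper usage and the toy Lagrangian below are illustrative, not a confirmed final API):

```julia
using DifferentiationInterface
import ForwardDiff  # loads the AutoForwardDiff backend extension

# Toy Lagrangian: objective plus multiplier-weighted equality constraint.
lagrangian(x, μ) = sum(abs2, x) + μ[1] * (sum(x) - 1.0)

x = rand(3)
μ = [2.0]

# Hypothetical usage: differentiate with respect to x only, passing the
# multipliers μ as a constant (non-differentiated) argument.
H = hessian(lagrangian, AutoForwardDiff(), x, Constant(μ))
```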

jvp and vjp for constraints

DI has pushforward and pullback operators.
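For illustration, a minimal sketch of JVPs and VJPs of a constraint function through these operators (the toy `cons` function is made up, and the exact signature, e.g. tangents wrapped in a tuple, may differ between DI versions):

```julia
using DifferentiationInterface
import ForwardDiff, Zygote

# Toy constraint function mapping R^3 -> R^2.
cons(x) = [x[1]^2 + x[2]^2, x[2] * x[3]]

x = rand(3)
dx = rand(3)  # input tangent for the JVP
dy = rand(2)  # output cotangent for the VJP

# JVP: J(x) * dx, naturally a forward-mode operation.
(jvp,) = pushforward(cons, AutoForwardDiff(), x, (dx,))

# VJP: J(x)' * dy, naturally a reverse-mode operation.
(vjp,) = pullback(cons, AutoZygote(), x, (dy,))
```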

Redundant function evaluations

DI has value_and_gradient and now value_gradient_and_hessian.
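For example, a minimal sketch (the Rosenbrock objective is just a placeholder):

```julia
using DifferentiationInterface
import ForwardDiff

rosenbrock(x) = (1 - x[1])^2 + 100 * (x[2] - x[1]^2)^2

x = [0.5, 0.5]
backend = AutoForwardDiff()

# One call returns the objective value and its gradient together,
# avoiding a separate primal evaluation.
y, g = value_and_gradient(rosenbrock, backend, x)

# Second-order variant: value, gradient, and Hessian in one call.
y, g, H = value_gradient_and_hessian(rosenbrock, backend, x)
```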

Consistent support for exotic array types

DifferentiationInterfaceTest (DITest) has a battery of test scenarios involving static arrays. Non-mutating operators are written so that even SArray will work (but more testing won't hurt).

As for sparsity, what you really need is SparseConnectivityTracer + DifferentiationInterface. Our sparsity detection is 10-100x faster than Symbolics, and our coloring seems on par with the C++ library ColPack. The last thing I need to do is benchmark against SparseDiffTools.
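The usual pattern is to wrap a dense backend in AutoSparse with a tracer-based sparsity detector and a coloring algorithm; a minimal sketch (keyword names as in current releases, the function `f` is a toy example):

```julia
using DifferentiationInterface
using ADTypes: AutoSparse
using SparseConnectivityTracer: TracerSparsityDetector
using SparseMatrixColorings: GreedyColoringAlgorithm
import ForwardDiff

# Wrap a dense backend with automatic sparsity detection and coloring.
sparse_backend = AutoSparse(
    AutoForwardDiff();
    sparsity_detector=TracerSparsityDetector(),
    coloring_algorithm=GreedyColoringAlgorithm(),
)

f(x) = diff(x .^ 2)  # Jacobian is sparse (bidiagonal structure)
x = rand(10)

J = jacobian(f, sparse_backend, x)  # returned as a sparse matrix
```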

Some more backends

DI does all the work for you there.
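Concretely, switching backends reduces to swapping the ADTypes object passed to the same operator call; a sketch (assuming the corresponding backend packages are loaded so DI's extensions activate):

```julia
using DifferentiationInterface
import ForwardDiff, PolyesterForwardDiff, FastDifferentiation

f(x) = sum(abs2, x)
x = rand(100)

# Same operator, different backends: supporting a new backend in
# OptimizationBase then amounts to accepting the corresponding ADTypes struct.
g1 = gradient(f, AutoForwardDiff(), x)
g2 = gradient(f, AutoPolyesterForwardDiff(), x)
g3 = gradient(f, AutoFastDifferentiation(), x)
```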

Mixing AD backends when it makes sense

DI has SecondOrder.
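A minimal sketch of a forward-over-reverse Hessian with it (function and backend pairing chosen just for illustration):

```julia
using DifferentiationInterface
import ForwardDiff, Zygote

f(x) = sum(abs2, x) + prod(x)
x = rand(4)

# Outer backend differentiates the inner one: ForwardDiff over Zygote,
# a common choice for Hessians of scalar-valued functions.
backend = SecondOrder(AutoForwardDiff(), AutoZygote())
H = hessian(f, backend, x)
```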
