
[WIP] Fresh attempt at DI integration #54

Merged: 14 commits merged into v2 from Diattempt2 on Jul 20, 2024

Conversation

Vaibhavdixit02 (Member)

Checklist

  • Appropriate tests were added
  • Any code changes were done in a way that does not break the public API
  • All documentation related to the code changes was updated
  • The new code follows the contributor guidelines, in particular the SciML Style Guide and COLPRAC
  • Any new documentation only uses the public API


@gdalle (Contributor) commented May 31, 2024

You're running DI v0.4.2 here, something is blocking it from updating. Can you add a v0.5.2 compat bound and see what it says?
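
A minimal sketch of how to bump that bound locally, assuming the repo's Project.toml is the active environment (the version string comes from the comment above):

julia> using Pkg

julia> Pkg.compat("DifferentiationInterface", "0.5.2")  # add/update the [compat] entry in Project.toml

julia> Pkg.update("DifferentiationInterface")           # the resolver will then report whatever is holding it back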

@gdalle (Contributor) commented May 31, 2024

The error sounds like a type instability?
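
If it is a type instability, one quick way to confirm is to check inference on the suspect call; a sketch, where cons_oop and x0 are just stand-ins for whatever the stack trace points at:

julia> using Test

julia> @inferred cons_oop(x0)        # throws if the return type cannot be inferred

julia> @code_warntype cons_oop(x0)   # or inspect the red (Any) entries interactively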

@gdalle (Contributor) commented Jun 3, 2024

@Vaibhavdixit02 when I replace the function inside hessian! with sum it works, so I would assume the current behavior is due to the way you construct cons_oop with closures.
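
For readers following along, the closure pattern under discussion looks roughly like this; a hypothetical reconstruction with made-up stand-ins (cons!, num_cons), not the exact code in the PR:

julia> cons!(res, x) = (res .= [x[1]^2 + x[2]^2, x[2] * sin(x[1]) - x[1]]; res);  # user's in-place constraints

julia> num_cons = 2;

julia> cons_oop = x -> begin                     # out-of-place wrapper built as a closure
           res = zeros(eltype(x), num_cons)      # eltype-dependent allocation, num_cons captured
           cons!(res, x)
           res
       end;

julia> cons_1 = x -> cons_oop(x)[1];             # one more closure per constraint for the Hessians

julia> cons_1(rand(2))                           # works, but AD now differentiates through two closures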

@Vaibhavdixit02 (Member, Author) commented Jun 3, 2024

Yup, I figured that. I spent some time reading up on the vector -> vector Hessian discussion (and its background) that you and Tim Holy were having, hoping to do this more rigorously, but I don't see a natural way: asking users to pass a vector of functions is not ideal and feels unnatural to me. This will show up in multiobjective problems as well, even though those typically use mostly derivative-free algorithms. I have some WIP locally; it's moving slowly because of other things right now, but I will get back to it soon.
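
To make the trade-off concrete, the two API shapes in question look roughly like this (purely illustrative, not a proposal in this PR):

julia> cons(x) = [x[1]^2 + x[2]^2, x[2] * sin(x[1]) - x[1]];   # (a) today: one vector-valued constraint function

julia> cons_split = [x -> cons(x)[i] for i in 1:2];            # per-constraint functions recovered via index closures

julia> cons_vec = [x -> x[1]^2 + x[2]^2,                       # (b) asking users for a vector of scalar functions,
                   x -> x[2] * sin(x[1]) - x[1]];              #     which maps directly onto per-constraint Hessians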

@gdalle (Contributor) commented Jun 4, 2024

So do you think you need the vector Hessian?

@Vaibhavdixit02 (Member, Author)

No, not right now. I can work around it (as I have been doing in the implementations that already exist here), but it would be nice to have in the future.

@wsmoses commented Jun 6, 2024

@Vaibhavdixit02 @ChrisRackauckas please do not drop the Enzyme extension in favor of DI here.

This will lead both to more code erroring and to suboptimal performance, as DI will create unnecessary closures, among other issues. I would prefer that user code not be made to break because of an unnecessary intermediate closure introduced by an interface.

See some comments about this tpapp/LogDensityProblemsAD.jl#29 (review) / tpapp/LogDensityProblemsAD.jl#29 (comment)

Unfortunately, when I commented "Closures were one of the reasons imo that AD.jl didn't take off, so long term I think DI.jl would be well served by both not introducing them itself (my understanding is that it does this successfully), but also not forcing users to create the same closures user-side (which would create similar issues).", the response was "DI is unlikely to support either multiple arguments or activity specification anytime soon".

x/ref gdalle/DifferentiationInterface.jl#307

@gdalle (Contributor) commented Jun 6, 2024

I am aware of this limitation of DifferentiationInterface, but I'm not sure it's obvious to others how much work would be required for multiple arguments or activity analysis. The reason we're able to detect bugs and suboptimalities in Enzyme before anyone else (see the 100x gradient slowdown from a month ago) is that we have a top-tier test suite, which would need to grow by a factor of at least 2 to accommodate this new feature. It's 1k to 2k LOC at least and a profound rethink, so even if I want to do it, it can't happen overnight.

And since it's only essential for Enzyme, another option would be to provide the operators I need for DI on the Enzyme side. That would make it so I don't have to code every operator suboptimally with my imperfect understanding of your docs.

@gdalle (Contributor) commented Jun 6, 2024

Cross referencing for technical details

@gdalle (Contributor) commented Jun 6, 2024

And I also want to push back on the "unnecessary interface intermediate". DI fills a longstanding need in the ecosystem, especially for packages like Enzyme where the API is very hard to grasp. Not everyone is gonna be able or willing to use Enzyme directly, so I do believe there is value in even a suboptimal DI.

Since I aim to support every backend and not just Enzyme, there's always gonna be a compromise on how far I support each one. I'm gonna look into activities, but I just need you to realize that it's easy to get it out quick and dirty, and much harder to do it right.

@wsmoses commented Jun 6, 2024

Ah, I apologize @gdalle, I did not mean to refer to DI as unnecessary (I think it is an amazing package, and the work that you and Adrian are doing to make things easier to use is sorely needed).

I meant only to refer to the intermediate closure that DI would generate for the Hessian as unnecessary -- strictly in the technical sense (e.g. wrapping things into a closure isn't needed when you could call a multi-argument version without the closure).

I realize in retrospect the poor wording and I sincerely apologize.
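
For readers following along, the distinction is roughly the following sketch, using Enzyme's documented activity annotations (the toy loss and variable names are made up):

julia> using Enzyme

julia> loss(u, p) = sum(abs2, u .- p);   # differentiate w.r.t. u, treat p as data

julia> u, p = rand(3), rand(3); du = zero(u);

julia> f_closed = x -> loss(x, p);       # closure form: p is captured, so AD must reason about the closure's fields

julia> Enzyme.autodiff(Reverse, f_closed, Active, Duplicated(u, du));

julia> du .= 0;                          # the shadow accumulates, so reset before the next call

julia> Enzyme.autodiff(Reverse, loss, Active, Duplicated(u, du), Const(p));  # multi-argument form: p marked inactive, no closure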

@wsmoses commented Jun 6, 2024

And I fully agree with you that it is difficult to get multiple arguments and activities correct.

However, replacing existing code that does not introduce these closures will create breakage, so I don't think it wise for DI to replace that existing code until DI provides such support as well.

For the backends without this need -- or for packages that don't generalize over backends -- DI is already a strict improvement, and there's no question that its use is immediately useful (and thus it would be good to replace the other backends here, for example).

@gdalle (Contributor) commented Jun 6, 2024

Thanks for clarifying, and sorry for overreacting on my end 😊 Maybe this PR can replace the other backends and leave Enzyme alone for the time being?

@ChrisRackauckas (Member)

We have no intention of changing to DI until it matches the performance of what we already have. It's on our radar, but we will not sacrifice features or performance. Instead, we will keep on top of it until DI can do exactly this, and then it's ready.

@gdalle (Contributor) commented Jun 7, 2024

Which benchmarks am I expected to pass for this bar to be met?
And what happens if I'm faster in some aspects but slower in others?

@ChrisRackauckas (Member)

If it's anywhere close in performance then we'll take it, as we would greatly enjoy the decreased maintenance burden. But it should be able to do the standard constrained and unconstrained tutorials with Zygote, Enzyme, and ForwardDiff and basically match the performance all of the way through. If I'm not mistaken, the closures are currently a bit of a blocker to this.

@gdalle (Contributor) commented Jun 7, 2024

If you're talking about the tutorials in the docs, some of them are on toy problems (see e.g. https://docs.sciml.ai/Optimization/stable/tutorials/constraints/). Is there an actual realistic benchmark suite that can be run to track and compare performance?

@ChrisRackauckas (Member)

The toy problems will be the hardest because that will measure pure overhead.
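
For concreteness, an overhead-dominated toy benchmark along those lines could look like this minimal sketch, based on the Rosenbrock tutorial from the Optimization.jl docs (the backend and optimizer choices here are illustrative):

julia> using Optimization, OptimizationOptimJL, ForwardDiff, BenchmarkTools

julia> rosenbrock(x, p) = (p[1] - x[1])^2 + p[2] * (x[2] - x[1]^2)^2;

julia> optf = OptimizationFunction(rosenbrock, Optimization.AutoForwardDiff());

julia> prob = OptimizationProblem(optf, zeros(2), [1.0, 100.0]);

julia> @btime solve($prob, BFGS());   # on a toy problem the timing mostly measures per-call overhead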

@gdalle (Contributor) commented Jun 7, 2024

> But it should be able to do the standard constrained and unconstrained tutorials with Zygote, Enzyme, and ForwardDiff and basically match the performance all of the way through.

What I'm suggesting above, following the discussion with Billy, is that DI doesn't necessarily have to overtake all backends at once. It seems reasonable to leave Enzyme aside for now, because that's the one where I would struggle most to milk the last drops of performance.

@gdalle (Contributor) commented Jun 7, 2024

> The toy problems will be the hardest because that will measure pure overhead.

Okay then. Ping me when the PR is ready for use, @Vaibhavdixit02, and I'll review and then help benchmark.

@gdalle (Contributor) commented Jul 16, 2024

Beware of the local sparsity detector: you shouldn't have to use it in general.

julia> using ADTypes: jacobian_sparsity

julia> using SparseConnectivityTracer

julia> function con2_c(res, x)
           res .= [x[1]^2 + x[2]^2, x[2] * sin(x[1]) - x[1]]
       end
con2_c (generic function with 1 method)

julia> jacobian_sparsity(con2_c, zeros(2), zeros(2), TracerSparsityDetector()) # input-independent
2×2 SparseArrays.SparseMatrixCSC{Bool, Int64} with 4 stored entries:
 1  1
 1  1

julia> jacobian_sparsity(con2_c, zeros(2), zeros(2), TracerLocalSparsityDetector()) # input-dependent
2×2 SparseArrays.SparseMatrixCSC{Bool, Int64} with 3 stored entries:
 1  1
 1  
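
For reference, the global (input-independent) detector is the one you would normally wire into a sparse backend through ADTypes; a rough sketch, with the dense backend choice purely as an example:

julia> using ADTypes: AutoSparse, AutoForwardDiff

julia> using SparseMatrixColorings: GreedyColoringAlgorithm

julia> sparse_backend = AutoSparse(
           AutoForwardDiff();                              # dense backend that computes the actual derivatives
           sparsity_detector = TracerSparsityDetector(),   # global pattern, valid for every input
           coloring_algorithm = GreedyColoringAlgorithm(), # compresses structurally independent columns
       );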

@Vaibhavdixit02 Vaibhavdixit02 changed the base branch from main to v2 July 20, 2024 13:30
@Vaibhavdixit02 Vaibhavdixit02 merged commit 1325478 into v2 Jul 20, 2024
2 of 5 checks passed
@Vaibhavdixit02 Vaibhavdixit02 deleted the Diattempt2 branch July 20, 2024 13:31