Change handling of sparse backends #38
Comments
Definitely agree that the current way of AutoSparse* is pretty tedious and not modular. I'd name it something other than AutoSparse though, since it's a wrapper rather than an AD backend in its own right.
I thought about that, but AutoSparse <: AbstractADType, so it seems coherent.
I see, that seems reasonable.
@avik-pal @ChrisRackauckas any thoughts? It's mostly SciML folks who are using this feature so far.
One more option: AutoSparsifyBackend for the constructor? 😅
The killer argument is that we will export that name, and we cannot decently export something named AutoSparsifyBackend.
What do you mean?
AutoSparse <: AbstractADType is a very good middle ground IMO. And there's an interconnected piece here: it would take quite a bit of work to actually make SparseDiffTools work with all AD backends in its current form. But DI could take over what is currently the front end of SparseDiffTools and redesign the color Jacobian functions to not be ForwardDiff-exclusive, but instead call the DI high-level chunked modes; then all backends that support those would be compatible with color differentiation. Ultimately that's the best solution here, and no one has had the time to pull it off, but @gdalle it seems like you have the time to finally put the polishing steps on this and see it through, and indeed that's the right thing to do here.
My vision is that there are 3 phases to sparse AD, and they can (should) be separated:
1. detection of the sparsity pattern
2. coloring of the pattern to group structurally independent columns (or rows)
3. the differentiation itself, evaluating compressed derivatives and decompressing them
The first step is this ADTypes revamp.
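In code, the separation could look something like the sketch below (purely hypothetical: the names jacobian_sparsity, column_coloring, and sparse_jacobian are illustrative, not an actual API):

```julia
# Hypothetical three-phase sparse Jacobian pipeline (illustrative names):
pattern = jacobian_sparsity(f, x)                    # phase 1: detect the sparsity pattern
colors = column_coloring(pattern)                    # phase 2: color structurally independent columns
J = sparse_jacobian(backend, f, x, pattern, colors)  # phase 3: compressed differentiation + decompression
```

Each phase only consumes the output of the previous one, so detection, coloring, and differentiation can live in separate packages.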
Yes, that vision is correct. We've just been strapped for hands, so... welcome aboard!
Nice! Yes, this has been a bit of a pain for us. This should probably be in JuliaDiff?
Yup and that's ultimately all it should be.
Agreed.
Ultimately everything should be in JuliaDiff, but @adrhill and I are iterating very fast, so it's easier to keep it local. I fully intend to transfer DI to JuliaDiff eventually.
This is based on input propagation, and it won't work (it will give an incorrect pattern) for control flow or branching that affects sparsity patterns, right? I am not familiar with how Symbolics works, but from Shashi's paper, I thought that was one of the challenges it addresses. Regardless, this seems like a more principled way to at least replace the approximate sparsity methods in SparseDiffTools.
SparsityDetection.jl had extra machinery for union sparsity, but it ran into the maintenance troubles of Cassette, and those features got dropped. The simplified version was then to trace with Symbolics, which became what we have today. A simpler tracer follows that evolution in a fine way. Yes, it is limited, but it's at least correct, and it errors when tracing is not possible. Not ideal, but it tends to work in most use cases on model code, which is good enough.
Yes indeed, this assumes the same control flow throughout subsequent function executions. In that way it is coherent with DifferentiationInterface semantics (otherwise we couldn't e.g. build a ReverseDiff tape during the preparation step).
It was kind of our original idea: redo the "Sparsity Programming" paper with a much more lightweight approach. Maybe not as powerful, but a hell of a lot faster.
We could in theory provide a second, dual-number tracer that uses primal values for control flow / branches. Currently, SparseConnectivityTracer returns input-output connectivities, which are a conservative estimate of the Jacobian sparsity. We could be a bit smarter in our operator overloading for functions like…
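For intuition, here is a minimal sketch of how an index-set tracer propagates connectivities through operator overloading (illustrative types only, not the actual SparseConnectivityTracer internals):

```julia
# Each tracer carries the set of input indices the value depends on.
struct Tracer
    inputs::Set{Int}
end

# Arithmetic operators propagate the union of the dependency sets.
Base.:+(a::Tracer, b::Tracer) = Tracer(union(a.inputs, b.inputs))
Base.:*(a::Tracer, b::Tracer) = Tracer(union(a.inputs, b.inputs))

x = [Tracer(Set([i])) for i in 1:3]
y = x[1] + x[2] * x[3]
# y.inputs == Set([1, 2, 3]): even a multiplication by zero would keep the
# dependency, so the pattern is a conservative superset of the true sparsity.
```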
I don't think we can handle control flow; there is no way to do this with operator overloading alone.
That's also why StochasticAD, for instance, has to rely on workarounds to support if/else: https://gaurav-arya.github.io/StochasticAD.jl/dev/limitations.html
Yes, it requires global rather than local information, so it's simply not possible to do with operator overloading.
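A hypothetical example of the problem: the sparsity pattern of f below depends on a runtime value, which a value-free tracer cannot evaluate.

```julia
# The Jacobian sparsity of f depends on which branch is taken, which in turn
# depends on the value of x[1]. A tracer carrying only index sets cannot
# evaluate the condition, and committing to either branch yields a pattern
# that is only locally valid.
function f(x)
    if x[1] > 0
        return [x[1] + x[2]]   # output depends on inputs 1 and 2
    else
        return [x[3]]          # output depends on input 3 only
    end
end
```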
Original issue description:
Problems:
- AutoSparseBackend is a duplicate of AutoBackend, sometimes with misses (see "Give AutoSparseFiniteDiff the same type parameters as AutoFiniteDiff" #33)
Suggested solutions (instead of #37):
- Replace each AutoSparseBackend with a wrapper struct AutoSparse(AutoBackend)
To preserve backwards-compatibility we can try to define shortcuts like the following, to be removed in v0.3.0:
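A minimal sketch of such a shortcut, assuming AutoSparse simply wraps the dense backend (the field name dense_ad and the exact definitions are illustrative, not the original snippet):

```julia
using ADTypes: AbstractADType, AutoFiniteDiff

# Illustrative wrapper: a single struct replacing every AutoSparseBackend.
struct AutoSparse{D<:AbstractADType} <: AbstractADType
    dense_ad::D
end

# Backwards-compatible shortcut, to be removed in v0.3.0: the alias keeps
# dispatch on ::AutoSparseFiniteDiff working, and the constructor forwards
# keyword arguments to the dense backend.
const AutoSparseFiniteDiff = AutoSparse{<:AutoFiniteDiff}
AutoSparseFiniteDiff(; kwargs...) = AutoSparse(AutoFiniteDiff(; kwargs...))
```

Code that dispatches on ::AutoSparseFiniteDiff keeps working through the alias, but code relying on the old concrete type or its type parameters might not.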
However, I'm not 100% sure we can keep this from being breaking.