Possible way out of dependency hell? #509

oschulz · 2024-09-28T09:39:22Z

I thought about a way to get out of the current "dependency hell" (#378, #506, ...) - my apologies if this approach has been considered (and found wanting) already:

We move the DifferentiationInterface API definitions into a package DifferentiationInterfaceAPI (or similar name).
We create one package per AD backend, e.g. DIForwardDiff and so on. These packages are basically empty: they only depend on DifferentiationInterfaceAPI, and they implement the DifferentiationInterface API for "their" backend via a single extension (in each backend-implementation package). So DIForwardDiff has a weak dependency on ForwardDiff and an extension DIForwardDiffForwardDiffExt, but DIForwardDiff itself is basically empty. These backend-implementation packages can quickly adapt to breaking changes in the AD package they interface to, while only making a patch increment of their own version number.
DifferentiationInterface depends on DifferentiationInterfaceAPI and all of the backend implementation packages - but they will only extremely rarely have breaking version changes. (Breaking changes in DifferentiationInterfaceAPI, however, will obviously have a ripple effect). DifferentiationInterface would be mostly empty then.

This way, users can load AD-backends, and combinations of AD-backends, with much wider version combinations, because the backend-implementations are decoupled, version-wise. So ideally, as long as a set of AD backends can be loaded together at all, in some version combination, it should also be accessible via DifferentiationInterface.

I realize that this approach would introduce a not insignificant management overhead ... but there should basically be no adverse effect on load and compile times. Also, tests would run decoupled for each backend.

gdalle · 2024-09-28T11:26:52Z

Thanks for this suggestion! To be honest I hadn't thought that far ahead.

I understand the need for DifferentiationInterfaceAPI (or DifferentiationInterfaceInterface ^^), because that would allow people to write code using our API without bringing in the compatibility conflicts.

I'm struggling a bit more with the second part of the proposal. Essentially, version compatibility would be improved because instead of a single DI version number that has all the backends in lockstep, one would have a set of version numbers (one for DIForwardDiff, one for DIZygote, etc) which can be combined in different ways?
Also, if we do this big split, do we really need the DifferentiationInterfaceAPI?

Monorepo Packages via Submodules/Subpackages: Support in Pkg and Julia Compiler JuliaLang/julia#55516

oschulz · 2024-09-28T13:01:19Z

> I'm struggling a bit more with the second part of the proposal. Essentially, version compatibility would be improved because instead of a single DI version number that has all the backends in lockstep, one would have a set of version numbers (one for DIForwardDiff, one for DIZygote, etc) which can be combined in different ways?

Yes, that's the idea. By having DifferentiationInterface hard-depend on the separate (almost) empty backend-implementation packages, which each soft-depend on one backend, we can decouple it from the AD-backend versions. Since those backend-implementation packages will do everything in their extension, they will have zero load time as long as the AD-backend is not loaded, so DifferentiationInterface can depend on all of them without load-time penalties.

> Also, if we do this big split, do we really need the DifferentiationInterfaceAPI?

I think so, yes - because all the backend-implementation packages need to get the DifferentiationInterface API from somewhere. DifferentiationInterfaceAPI wouldn't be a user-facing package, just a "helper" package. No one except for packages like DIForwardDiff, DIZygote, and so on (maybe not the best names) should use it directly. DifferentiationInterface would re-export the API, of course.

So the dependency order would be (with ForwardDiff as an example backend):

DifferentiationInterface depends on DIForwardDiff, DIZygote, ... which all depend on DifferentiationInterfaceAPI.

And DIForwardDiff has no main code, but a weak dependency on ForwardDiff, and it implements the DifferentiationInterface API in its DIForwardDiffForwardDiffExt extension. But to do that, it needs to depend on a package that defines that API, so we need DifferentiationInterfaceAPI as a separate package.
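
The dependency order described above can be modelled with a small sketch. It is written in Python purely for illustration and is not how Pkg works internally: it only mimics the rule that an extension activates once its parent package and its weak dependency are both loaded. The `Package` class, the `load` helper and the extension name `DIZygoteZygoteExt` are assumptions made up for this example.

```python
from dataclasses import dataclass, field


@dataclass
class Package:
    name: str
    deps: tuple = ()  # hard dependencies, always loaded with the package
    extensions: dict = field(default_factory=dict)  # extension name -> weak dep that triggers it


# Hypothetical registry mirroring the proposed layout (names are illustrative).
REGISTRY = {
    p.name: p
    for p in [
        Package("DifferentiationInterfaceAPI"),
        Package("DIForwardDiff", ("DifferentiationInterfaceAPI",),
                {"DIForwardDiffForwardDiffExt": "ForwardDiff"}),
        Package("DIZygote", ("DifferentiationInterfaceAPI",),
                {"DIZygoteZygoteExt": "Zygote"}),
        Package("DifferentiationInterface",
                ("DifferentiationInterfaceAPI", "DIForwardDiff", "DIZygote")),
        Package("ForwardDiff"),
        Package("Zygote"),
    ]
}


def load(roots):
    """Return (loaded packages, active extensions) after importing `roots`."""
    loaded = set()
    stack = list(roots)
    while stack:
        name = stack.pop()
        if name not in loaded:
            loaded.add(name)
            stack.extend(REGISTRY[name].deps)
    # An extension becomes active only if its trigger package is loaded too.
    active = {
        ext
        for name in loaded
        for ext, trigger in REGISTRY[name].extensions.items()
        if trigger in loaded
    }
    return loaded, active


only_di = load(["DifferentiationInterface"])
di_and_forwarddiff = load(["DifferentiationInterface", "ForwardDiff"])
```

In this model, loading DifferentiationInterface alone pulls in the (near-empty) backend-implementation packages but activates no extension; the ForwardDiff extension only appears once ForwardDiff itself is loaded.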

Longer term we can hope to convince the AD-backend maintainers to support DifferentiationInterface via an extension, instead of the other way round. That would result in exactly the same version-decoupling across backends. But it will take time, and will never happen for all of them, I fear. So for now, the scheme above would give us that decoupling - and when backends start to provide native support for DifferentiationInterface, we can retire the "implementation packages" like DIForwardDiff one-by-one.

gdalle · 2024-09-28T14:11:33Z

Can you explain in concrete terms how this would have prevented e.g. #506?
For lower bound testing (#378), our conclusion was that the Julia tooling itself is not up to the task, so that's gonna have to wait.

oschulz · 2024-09-28T18:54:20Z

> Can you explain in concrete terms how this would have prevented e.g. #506?

Not at all, it turns out: I should have used SciMLSensitivity instead of DiffEqSensitivity (which is now itself incompatible with recent Zygote versions). :-)

But having fairly tight compatibility version ranges for the backends in DifferentiationInterface will likely result in other cases where packages require version combinations of AD-backends that DifferentiationInterface doesn't offer, right?

> For lower bound testing (#378), our conclusion was that the Julia tooling itself is not up to the task, so that's gonna have to wait.

Well, with separate packages for each backend, that should be easy (since there's only one dependency to worry about)?

gdalle · 2024-09-30T07:29:58Z

> But having fairly tight compatibility version ranges for the backends in DifferentiationInterface will likely result in other cases where packages require version combinations of AD-backends that DifferentiationInterface doesn't offer, right?

On second thought I'm not sure this interpretation is correct. The compat bounds in DifferentiationInterface correspond to the minimal versions of AD backends where (1) the features I need are present and (2) incorrectness bugs have been fixed.
Therefore, if you try to instantiate an environment with an AD backend version that DI "doesn't offer", you run the risk of getting errors or incorrect derivatives when you try to query them.

To make things concrete, imagine you have two backends, A and B, each with versions 1 and 2.
Further imagine that DI has evolved through the following versions with their respective compats, which more or less reflects what actually goes on in the package:

DI v1: A=v1
DI v2: A=v1, B=v2
DI v3: A=v2, B=v2

Can you give me a concrete combination where you don't get what you want?

| A | B | DI |
|---|---|----|
| 1 | 1 | 1 |
| 1 | 2 | 2 |
| 2 | 1 | error |
| 2 | 2 | 3 |
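
The table can be reproduced with a toy resolver. This is a minimal Python sketch, not Pkg's actual resolver: it assumes each compat entry pins exactly one backend version, that a backend without a compat entry is unconstrained, and that the highest compatible DI version wins. The names `DI_COMPAT` and `resolve` are invented for the example.

```python
from itertools import product

# DI version -> compat entries (backend name -> the backend version it accepts).
# A backend missing from the entry is treated as unconstrained (assumption).
DI_COMPAT = {
    1: {"A": 1},
    2: {"A": 1, "B": 2},
    3: {"A": 2, "B": 2},
}


def resolve(env):
    """Return the highest DI version compatible with `env`, or None if there is none."""
    compatible = [
        di_version
        for di_version, compat in DI_COMPAT.items()
        if all(env[backend] == wanted for backend, wanted in compat.items())
    ]
    return max(compatible) if compatible else None


# One entry per (A, B) combination; None corresponds to "error" in the table.
table = {(a, b): resolve({"A": a, "B": b}) for a, b in product((1, 2), repeat=2)}
```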

oschulz · 2024-09-30T08:16:44Z

> Can you give me a concrete combination where you don't get what you want?

That would happen when you also (possibly indirectly) depend on a package that requires A=2 and on another package that still requires B=1. Or even when depending on a single other package that requires A=2 but has not been updated to B=2 yet. So one would either not be able to install these/this package(s) at all together with DifferentiationInterface, or at best get old versions of them/it.

In a perfect world, algorithm packages would of course not depend on AD-backends directly or indirectly at all, but currently quite a few of them do (SciMLSensitivity being one of the more extreme examples, though it can't cause trouble with DifferentiationInterface anymore because it now indirectly depends on it).

gdalle · 2024-09-30T08:21:01Z

> So one would either not be able to install these/this package(s) at all together with DifferentiationInterface, or at best get old versions of them/it.

But this situation wouldn't be any different in the scenario you're proposing, right? Except that what the user gets are old versions of the DI backend subpackages

oschulz · 2024-09-30T08:26:49Z

> But this situation wouldn't be any different in the scenario you're proposing, right? Except that what the user gets are old versions of the DI backend subpackages

Yes, but that would be fine - the user would get the current version of the algorithmic packages they need, with their latest features. Getting the old versions of the DI backend subpackages would just mean compatibility with the older version of the AD-backend in question, but not that the older DI backend subpackage is "worse" (assuming its implementation was already decent).

gdalle · 2024-09-30T08:30:16Z

> the user would get the current version of the algorithmic packages they need, with their latest features.

Sorry to press on but I'm really trying to understand. So here the underlying assumption is that "algorithmic packages" (like SciML) need the latest DI to function well, and that keeping DI in lockstep with AD backend implementations somehow hinders updates?
The thing is, DI is mostly AD backend implementations. Between breaking releases of the interface like the one I just tagged, the main thing that evolves is the backend extensions. So the "latest DI" doesn't mean much beyond that.

oschulz · 2024-09-30T09:23:40Z

> So here the underlying assumption is that "algorithmic packages" (like SciML) need the latest DI to function well

Not quite - I rather have the following situation in mind:

It's hard for DifferentiationInterface to support a wide version range of all of its AD-backends without lots of @static if isdefined and similar (it is also hard to test such code properly). So let's assume DI was updated to B=2 first, and then to A=2, meaning different versions of DifferentiationInterface will support: A=1&B=1, A=1&B=2 and A=2&B=2.
There's a package Dep1 that requires A=2 in its recent versions, which the user needs (directly or indirectly) because of features, or because older versions of it would hold lots of other packages back.
Another package Dep2, that the user needs directly or indirectly, requires B=1, and is not available for B=2 yet.

Now the user won't be able to install a recent version of Dep1 (needs A=2) and Dep2 (needs B=1) together with DifferentiationInterface at all: no version of DI would support A=2&B=1. As a result, a lot of packages might be held back, or there may be no suitable combination of packages at all.

But if DifferentiationInterface uses separate backend-implementation packages, then the DI backend package for A will offer A=1 and A=2 (in its different versions) and the backend package for B will support B=1 and B=2 (in its different versions). And DifferentiationInterface can combine arbitrary versions of the backend-implementation packages, as long as there have been no breaking changes in the DifferentiationInterface API between their versions.

So now the user could combine a recent version of Dep1 with Dep2 and with DifferentiationInterface (latest version, though that might not even be so important) - but of course B=1 and the DI backend version for B that supports B=1 would be used (but that would be fine).
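
A minimal sketch of this Dep1/Dep2 scenario, again in Python for illustration only. It assumes every release pins exactly one backend version and that the DifferentiationInterface API had no breaking change in between; `DIA` and `DIB` are hypothetical names for the backend-implementation packages for A and B.

```python
# Monolithic DI: each release supports exactly one (A, B) pair, in release order.
MONOLITHIC_RELEASES = [
    {"A": 1, "B": 1},
    {"A": 1, "B": 2},
    {"A": 2, "B": 2},
]

# Split layout: each backend-implementation package is versioned on its own.
DIA_SUPPORTED = {1, 2}  # versions of A covered by some release of DIA
DIB_SUPPORTED = {1, 2}  # versions of B covered by some release of DIB


def monolithic_resolves(required):
    """True if a single DI release matches the required backend versions."""
    return any(release == required for release in MONOLITHIC_RELEASES)


def split_resolves(required):
    """True if each backend version is covered by some release of its own package."""
    return required["A"] in DIA_SUPPORTED and required["B"] in DIB_SUPPORTED


# Dep1 needs A=2, Dep2 still needs B=1.
required = {"A": 2, "B": 1}
monolithic_result = monolithic_resolves(required)
split_result = split_resolves(required)
```

Under these assumptions the monolithic layout has no release for A=2&B=1, while the split layout resolves by pairing a recent DIA with an older DIB.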

I think DifferentiationInterface is in a bit of a unique situation here, because there are so many AD backends, some of which are newer and can still go through many breaking changes (and additional such "young" ones will likely appear over time). And AD-backends are complex, and supporting multiple backend versions across their breaking changes well will often be difficult (and hard to maintain) when using @static if isdefined and similar methods. Plus, such "multi-backend-version" code is harder to test in practice. I don't think many packages that have a lot of extensions would benefit from a complicated package scheme like I proposed - but DifferentiationInterface very well may.

gdalle · 2024-09-30T09:58:28Z

I don't think you're entirely right about B=1 being possible in your new configuration. Even if backend functionality is in DI subpackages, these extensions will need to specify compatibility bounds. So DIB will have a lower bound B=2 on package B, and it will automatically kick in whenever the environment in question contains B. Since I'm not gonna go back in time and implement support for B=1, even with your proposal I don't see how B=1 could ever be possible in an environment that contains any version of DI

oschulz · 2024-09-30T10:10:50Z

> Since I'm not gonna go back in time and implement support for B=1, even with your proposal I don't see how B=1 could ever be possible in an environment that contains any version of DI

Ah, no, I was assuming that DifferentiationInterface (in its current scheme) did support B=1 once, but was updated to B=2 (dropping the B=1 compatibility because supporting both would have made the code a hard-to-test mess), and then later was updated to A=2.

So with backend-implementation packages, there would be an older one for B that supports B=1 and that can still be used just fine (assuming no breaking changes in the DifferentiationInterface API). The new DIB with B=2 would then not kick in; the package version solver would see that it can pick the older DIB to satisfy other direct or indirect package requirements.

gdalle · 2024-09-30T10:19:43Z

Alright, that is a bit different from the scenario I initially outlined in my table.
But if we go with your scenario, where B=1 was supported and then dropped, then with the current configuration the user would just get an older version of (all of) DI, the last one before B=1 was dropped. How is that a problem?

oschulz · 2024-09-30T10:22:01Z

> with the current configuration the user would just get an older version of (all of) DI, the last one before B=1 was dropped. How is that a problem?

Because that version would not support A=2, which the user might really need because of another package (which would either not support A=1, or only in an old version that would hold a lot of other packages back, meaning some other requirement would not be fulfilled).

gdalle · 2024-09-30T10:41:36Z

Okay, I think I'm starting to get it. Consider the following release sequence for DI and its backends (I've labeled DI versions with lowercase letters to simplify):

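| DI | A | B | event |
|----|---|---|-------|
| a | 1 | - | add A |
| b | 2 | - | bump A |
| c | 2 | 2 | add B |
| d | 2 | 3 | bump B |
| e | 3 | 3 | bump A |
| f | 4 | 3 | bump A |
| g | 4 | 4 | bump B |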

Then, depending on the versions of A and B, this is which version of DI would be installed (or none if the environment cannot be resolved):

| A \ B | 1 | 2 | 3 | 4 |
|-------|---|---|---|---|
| 1 | a | - | - | - |
| 2 | b | c | d | - |
| 3 | - | - | e | - |
| 4 | - | - | f | g |

By keeping all the backends in a single package, we essentially force this snake-like evolution through the table of version pairs. And with your suggestion, we could fill all the remaining squares?
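
To make "the remaining squares" concrete, here is a small Python sketch that takes the filled cells of the table above as given data and lists the empty ones. It assumes, as a simplification, that with per-backend packages and an unchanged DI API every listed version of A could be paired with every listed version of B; the variable names are made up for the example.

```python
from itertools import product

# Filled cells of the table above: (A version, B version) -> DI release.
lockstep = {
    (1, 1): "a",
    (2, 1): "b",
    (2, 2): "c",
    (2, 3): "d",
    (3, 3): "e",
    (4, 3): "f",
    (4, 4): "g",
}

a_versions = sorted({a for a, _ in lockstep})
b_versions = sorted({b for _, b in lockstep})

# Assumption: decoupled backend packages allow any A to be paired with any B.
decoupled = set(product(a_versions, b_versions))

# Version pairs that the single-package "snake" never passes through.
remaining = sorted(decoupled - set(lockstep))
```

Here `remaining` holds the 9 of 16 version pairs that cannot be resolved with a single lockstep package in this example.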

oschulz · 2024-09-30T12:47:42Z

> And with your suggestion, we could fill all the remaining squares?

Yes, I think so, as long as the DifferentiationInterface API has no breaking changes in between. But I assume that won't happen often once DifferentiationInterface has matured.

wsmoses mentioned this issue Sep 30, 2024

Migrate to DifferentiationInterface TuringLang/AdvancedVI.jl#98

Merged
