Possible way out of dependency hell? #509
Comments
Thanks for this suggestion! To be honest I hadn't thought that far ahead. I understand the need for DifferentiationInterfaceAPI (or DifferentiationInterfaceInterface ^^), because that would allow people to write code using our API without bringing in the compatibility conflicts. I'm struggling a bit more with the second part of the proposal. Essentially, version compatibility would be improved because instead of a single DI version number that has all the backends in lockstep, one would have a set of version numbers (one for DIForwardDiff, one for DIZygote, etc) which can be combined in different ways? Related: |
Yes, that's the idea. By having DifferentiationInterface hard-depend on the separate (almost) empty backend-implementation packages, which all soft-depend on one backend each, we can decouple it from the AD-backend versions. Since those backend-implementation packages will do everything in their extension, they will have zero load time as long as the AD-backend is not loaded, so DifferentiationInterface can depend on all of them without load-time penalties.
I think so, yes - because all the backend-implementation packages need to get the DifferentiationInterface API from somewhere. DifferentiationInterfaceAPI wouldn't be a user-facing package, just a "helper" package. No one except for packages like DIForwardDiff, DIZygote, and so on (maybe not the best names) should use it directly. DifferentiationInterface would re-export the API, of course. So the dependency order would be (with ForwardDiff as an example backend): DifferentiationInterface depends on DIForwardDiff, DIZygote, ... which all depend on DifferentiationInterfaceAPI. And DIForwardDiff has no main code, but a weak dependency on ForwardDiff, and it implements the DifferentiationInterface API in its DIForwardDiffForwardDiffExt extension. But to do that, it needs to depend on a package that defines that API, so we need DifferentiationInterfaceAPI as a separate package. Longer term we can hope to convince the AD-backend maintainers to support DifferentiationInterface via an extension, instead of the other way round. That would result in exactly the same version-decoupling across backends. But it will take time, and will never happen for all of them, I fear. So for now, the scheme above would give us that decoupling - and when backends start to provide native support for DifferentiationInterface, we can retire the "implementation packages" like DIForwardDiff one-by-one.
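To make the dependency order above concrete, here is a rough Python model of it (not Julia code, and not how the package manager is implemented). The package names follow the thread; `DIZygoteZygoteExt` is a hypothetical extension name formed by analogy with `DIForwardDiffForwardDiffExt`, and the loading rule is a simplification of how weak dependencies trigger extensions.

```python
# Hypothetical model of the proposed layout; none of these packages is
# claimed to exist in this form.
HARD_DEPS = {
    "DifferentiationInterface": [
        "DifferentiationInterfaceAPI",
        "DIForwardDiff",
        "DIZygote",
    ],
    "DIForwardDiff": ["DifferentiationInterfaceAPI"],
    "DIZygote": ["DifferentiationInterfaceAPI"],
    "DifferentiationInterfaceAPI": [],
}

# Each backend-implementation package has exactly one extension,
# triggered by exactly one weak dependency (the AD backend).
EXTENSIONS = {
    "DIForwardDiff": ("DIForwardDiffForwardDiffExt", "ForwardDiff"),
    "DIZygote": ("DIZygoteZygoteExt", "Zygote"),  # assumed name
}


def loaded_modules(roots):
    """Return the packages and extensions loaded when `roots` are imported."""
    loaded = set()
    stack = list(roots)
    while stack:
        pkg = stack.pop()
        if pkg in loaded:
            continue
        loaded.add(pkg)
        # Hard dependencies are always pulled in; packages without an
        # entry (the AD backends here) are treated as leaves.
        stack.extend(HARD_DEPS.get(pkg, []))
    # An extension is loaded only if its trigger package is loaded too.
    for pkg, (ext, trigger) in EXTENSIONS.items():
        if pkg in loaded and trigger in loaded:
            loaded.add(ext)
    return loaded


print(sorted(loaded_modules(["DifferentiationInterface"])))
print(sorted(loaded_modules(["DifferentiationInterface", "ForwardDiff"])))
```

In this model, loading DifferentiationInterface alone pulls in only the (almost) empty packages, and the ForwardDiff-specific code appears only once ForwardDiff itself is loaded.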
Not at all, it turns out, I should have used SciMLSensitivity instead of DiffEqSensitivity (which is now incompatible with recent Zygote versions in itself). :-) But having fairly tight compatibility version ranges for the backends in DifferentiationInterface will likely result in other cases where packages require version combinations of AD-backends that DifferentiationInterface doesn't offer, right?
Well, with separate packages for each backend, that should be easy (since there's only one dependency to worry about)?
On second thought I'm not sure this interpretation is correct. The compat bounds in DifferentiationInterface correspond to the minimal versions of AD backends where (1) the features I need are present and (2) incorrectness bugs have been fixed. To make things concrete, imagine you have two backends, A and B, each with versions 1 and 2.
Can you give me a concrete combination where you don't get what you want?
That would happen when you also (possibly indirectly) depend on a package that requires A=2 and on another package that still requires B=1. Or even when depending on a single other package that requires A=2 but has not been updated to B=2 yet. So one would either not be able to install these/this package(s) at all together with DifferentiationInterface, or at best get old versions of them/it. In a perfect world, algorithm packages would of course not depend on AD-backends directly or indirectly at all, but currently quite a few of them do (SciMLSensitivity being one of the more extreme examples, though it can't cause trouble with DifferentiationInterface anymore because it now indirectly depends on it).
But this situation wouldn't be any different in the scenario you're proposing, right? Except that what the user gets are old versions of the DI backend subpackages.
Yes, but that would be fine - the user would get the current version of the algorithmic packages they need, with their latest features. Getting the old versions of the DI backend subpackages would just mean compatibility with the older version of the AD-backend in question, but not that the older DI backend subpackage is "worse" (assuming its implementation was already decent).
Sorry to press on but I'm really trying to understand. So here the underlying assumption is that "algorithmic packages" (like SciML) need the latest DI to function well, and that keeping DI in lockstep with AD backend implementations somehow hinders updates?
Not quite - I rather have the following situation in mind:
Now the user won't be able to install a recent version of Dep1 (needs A=2) and Dep2 (needs B=1) together with DifferentiationInterface at all: no version of DI would support A=2&B=1. As a result, a lot of packages might be held back, or there may be no suitable combination of packages at all. But if DifferentiationInterface uses separate backend-implementation packages, then the DI backend package for A will offer A=1 and A=2 (in its different versions) and the backend package for B will support B=1 and B=2 (in its different versions). And DifferentiationInterface can combine arbitrary versions of the backend-implementation packages, as long as there have been no breaking changes in the DifferentiationInterface API between their versions. So now the user could combine a recent version of Dep1 with Dep2 and with DifferentiationInterface (latest version, though that might not even be so important) - but of course B=1 and the DI backend version for B that supports B=1 would be used (but that would be fine). I think DifferentiationInterface is in a bit of a unique situation here, because there are so many AD backends, some of which are newer and can still go through many breaking changes (and additional such "young" ones will likely appear over time). And AD-backends are complex, and supporting multiple backend versions across their breaking changes well will often be difficult (and hard to maintain) when using
I don't think you're entirely right about B=1 being possible in your new configuration. Even if backend functionality is in DI subpackages, these extensions will need to specify compatibility bounds. So DIB will have a lower bound B=2 on package B, and it will automatically kick in whenever the environment in question contains B. Since I'm not gonna go back in time and implement support for B=1, even with your proposal I don't see how B=1 could ever be possible in an environment that contains any version of DI |
Ah, no, I was assuming that DifferentiationInterface (in its current scheme) did support B=1 once, but was updated to B=2 (dropping the B=1 compatibility because supporting both would have made the code a hard-to-test mess), and then later was updated to A=2. So with backend-implementation packages, there would be an older one for B that supports B=1 and that can still be used just fine (assuming no breaking changes in the DifferentiationInterface API). The new DIB with B=2 would then not kick in: the package version solver would see that it can pick the older DIB to satisfy other direct or indirect package requirements.
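The Dep1/Dep2 situation can be sketched with a tiny brute-force resolver. This is a toy model in Python, not the actual package version solver: the release labels `a`, `b`, `c`, the names `DIA`/`DIB`, and all version numbers are hypothetical, and each release is assumed to support exactly one version of each backend it covers.

```python
from itertools import product

# Lockstep: every DI release pins one (A, B) pair.
# Assumed history: B=1 supported first, then B=2, then A=2.
LOCKSTEP_DI = {
    "a": {"A": 1, "B": 1},
    "b": {"A": 1, "B": 2},
    "c": {"A": 2, "B": 2},
}

# Decoupled: each backend-implementation release pins only its own backend.
DIA = {"1.0.0": 1, "1.0.1": 2}  # hypothetical DIA version -> supported A
DIB = {"1.0.0": 1, "1.0.1": 2}  # hypothetical DIB version -> supported B


def resolve_lockstep(required):
    """Return DI releases whose pinned backends match every requirement."""
    return [
        release
        for release, pins in LOCKSTEP_DI.items()
        if all(pins[name] == version for name, version in required.items())
    ]


def resolve_decoupled(required):
    """Return (DIA, DIB) release pairs that match every requirement."""
    return [
        (dia, dib)
        for (dia, a), (dib, b) in product(DIA.items(), DIB.items())
        if required.get("A", a) == a and required.get("B", b) == b
    ]


# Dep1 needs A=2, Dep2 still needs B=1.
required = {"A": 2, "B": 1}
print(resolve_lockstep(required))   # no lockstep release fits
print(resolve_decoupled(required))  # new DIA combined with old DIB
```

Under these assumptions the lockstep list is empty, while the decoupled resolver returns the newer `DIA` together with the older `DIB`. This only holds as long as both releases implement the same, unbroken DifferentiationInterface API.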
Alright, that is a bit different from the scenario I initially outlined in my table.
Because that version would not support A=2, which the user might really need because of another package (which would either not support A=1, or only in an old version that would hold a lot of other packages back, meaning some other requirement would not be fulfilled).
Okay, I think I'm starting to get it. Consider the following release sequence for DI and its backends (I've labeled DI versions with lowercase letters to simplify):
Then, depending on the versions of A and B, this is which version of DI would be installed (or none if the environment cannot be resolved):
By keeping all the backends in a single package, we essentially force this snake-like evolution through the table of version pairs. And with your suggestion, we could fill all the remaining squares?
Yes, I think so, as long as the DifferentiationInterface API has no breaking changes in between. But I assume that won't happen often once DifferentiationInterface has matured.
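The "snake" versus "filled squares" picture can be counted explicitly. The sketch below is an illustration in Python under two assumptions not stated in the thread: every lockstep DI release bumps exactly one backend by one version, and each release supports only the backend versions it was written for. The bump order `B` then `A` mirrors the history described earlier in the discussion.

```python
from itertools import product


def lockstep_cells(bumps):
    """Walk the release sequence; each bump raises one backend by one version."""
    a, b = 1, 1
    cells = [(a, b)]
    for backend in bumps:
        if backend == "A":
            a += 1
        else:
            b += 1
        cells.append((a, b))
    return cells


def decoupled_cells(n_a, n_b):
    """All (A, B) pairs reachable when backend packages are versioned separately."""
    return list(product(range(1, n_a + 1), range(1, n_b + 1)))


snake = lockstep_cells(["B", "A"])  # B bumped first, then A
missing = sorted(set(decoupled_cells(2, 2)) - set(snake))
print(snake)    # cells visited by the lockstep releases
print(missing)  # cells only the decoupled scheme can reach
```

With two versions per backend the snake visits three of the four squares and misses (A=2, B=1). More generally, under these assumptions a lockstep history with n versions of each backend visits 2n-1 squares out of n², so the gap grows as backends accumulate breaking releases.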
I thought about a way to get out of the current "dependency hell" (#378, #506, ...) - my apologies if this approach has been considered (and found wanting) already:
We move the DifferentiationInterface API definitions into a package DifferentiationInterfaceAPI (or similar name).
We create one package per AD backend, e.g. DIForwardDiff and so on. These packages are basically empty: they only depend on DifferentiationInterfaceAPI, and they implement the DifferentiationInterface API for "their" backend via a single extension (in each backend-implementation package). So DIForwardDiff has a weak dependency on ForwardDiff and an extension DIForwardDiffForwardDiffExt, but DIForwardDiff itself is basically empty. These backend-implementation packages can quickly adapt to breaking changes in the AD package they interface to, while only making a patch increment of their own version number.
DifferentiationInterface depends on DifferentiationInterfaceAPI and all of the backend implementation packages - but they will only extremely rarely have breaking version changes. (Breaking changes in DifferentiationInterfaceAPI, however, will obviously have a ripple effect). DifferentiationInterface would be mostly empty then.
This way, users can load AD-backends, and combinations of AD-backends, with much wider version combinations, because the backend-implementations are decoupled, version-wise. So ideally, as long as a set of AD backends can be loaded together at all, in some version combination, it should also be accessible via DifferentiationInterface.
I realize that this approach would introduce a not insignificant management overhead ... but there should basically be no adverse effect on load and compile times. Also, tests would run decoupled for each backend.