Commit
* Improve documentation
* Rename source files
* More tables in README
* Split docs into user and dev docs
* Add Fallback call structure diagrams
* Improve Mermaid diagrams
* Fix typos
* Fix API ref
* No duplicates
* Reorder stuff

---------

Co-authored-by: Guillaume Dalle <[email protected]>
Showing 12 changed files with 174 additions and 86 deletions.
This file was deleted.
@@ -0,0 +1,92 @@
# For AD developers

## Backend requirements

Every [operator](@ref operators) can be implemented from either of these two primitives:

- the pushforward (in forward mode), computing a Jacobian-vector product
- the pullback (in reverse mode), computing a vector-Jacobian product

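As a reminder of the standard definitions (the symbols here are illustrative notation, not taken from the package): for a function ``f`` with Jacobian matrix ``J_f(x)`` at the input ``x``, a tangent ``\mathrm{d}x`` in input space and a cotangent ``\mathrm{d}y`` in output space, the two primitives compute

```math
\text{pushforward: } (x, \mathrm{d}x) \mapsto J_f(x) \, \mathrm{d}x
\qquad
\text{pullback: } (x, \mathrm{d}y) \mapsto J_f(x)^\top \, \mathrm{d}y
```
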
The only requirement for a backend is therefore to implement either [`value_and_pushforward!`](@ref) or [`value_and_pullback!`](@ref), from which the rest of the operators can be deduced.
We provide a standard series of fallbacks, but each backend is free to redefine as many of the utilities as necessary to achieve optimal performance.

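To make the "deduced from a primitive" idea concrete, here is a minimal sketch in plain Julia that does not use DifferentiationInterface.jl at all. The names `toy_value_and_pushforward`, `toy_value_and_pullback`, `gradient_from_pushforward` and `gradient_from_pullback` are hypothetical, and the primitives are written by hand for one specific function instead of coming from an AD backend.

```julia
using LinearAlgebra: dot

# Toy function with hand-written primitives (hypothetical stand-ins for a backend).
f(x) = sum(abs2, x)

# Pushforward: value and Jacobian-vector product J_f(x) * dx, here 2 * dot(x, dx).
toy_value_and_pushforward(x, dx) = (f(x), 2 * dot(x, dx))

# Pullback: value and vector-Jacobian product J_f(x)' * dy, here 2 * dy * x.
toy_value_and_pullback(x, dy) = (f(x), 2 .* dy .* x)

# Gradient deduced from the pushforward: one call per input basis vector.
function gradient_from_pushforward(x)
    grad = similar(x)
    for i in eachindex(x)
        e_i = zero(x)
        e_i[i] = 1
        grad[i] = last(toy_value_and_pushforward(x, e_i))
    end
    return grad
end

# Gradient deduced from the pullback: a single call with the seed dy = 1.
gradient_from_pullback(x) = last(toy_value_and_pullback(x, one(eltype(x))))

x = [1.0, 2.0, 3.0]
g_forward = gradient_from_pushforward(x)
g_reverse = gradient_from_pullback(x)
```

Both routes give the same gradient, but the forward-mode route needs as many primitive calls as there are inputs. This is one reason a backend may prefer to override a fallback with its own implementation.
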
Every backend we support corresponds to a package extension of DifferentiationInterface.jl (located in the `ext` subfolder).
Advanced users are welcome to code more backends and submit pull requests!

## Fallback call structure

### Forward mode

```mermaid
flowchart LR
    subgraph Gradient
        gradient --> value_and_gradient
        value_and_gradient --> value_and_gradient!
        gradient! --> value_and_gradient!
    end
    subgraph Jacobian
        jacobian --> value_and_jacobian
        value_and_jacobian --> value_and_jacobian!
        jacobian! --> value_and_jacobian!
    end
    subgraph Multiderivative
        multiderivative --> value_and_multiderivative
        value_and_multiderivative --> value_and_multiderivative!
        multiderivative! --> value_and_multiderivative!
    end
    subgraph Derivative
        derivative --> value_and_derivative
    end
    subgraph Pushforward
        pushforward --> value_and_pushforward
        value_and_pushforward --> value_and_pushforward!
        pushforward! --> value_and_pushforward!
    end
    value_and_jacobian! --> value_and_pushforward!
    value_and_gradient! --> value_and_pushforward!
    value_and_multiderivative! --> value_and_pushforward!
    value_and_derivative --> value_and_pushforward
```
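
The Jacobian branch of the forward-mode diagram can be sketched in plain Julia as follows. This is an illustration under stated assumptions, not the package's implementation: every `toy_*` function is hypothetical, the argument order is chosen for the example only, and the primitive uses central finite differences in place of a real forward-mode backend.

```julia
# Hypothetical primitive, standing in for a backend's `value_and_pushforward!`.
# It overwrites `dy` with a central finite-difference estimate of J_f(x) * dx.
function toy_value_and_pushforward!(dy, f, x, dx; h=1e-6)
    dy .= (f(x .+ h .* dx) .- f(x .- h .* dx)) ./ (2h)
    return f(x), dy
end

# Mutating Jacobian with primal: one pushforward per input basis vector
# fills one column of `jac`.
function toy_value_and_jacobian!(jac, f, x)
    y = f(x)
    for j in eachindex(x)
        dx = zero(x)
        dx[j] = 1
        toy_value_and_pushforward!(view(jac, :, j), f, x, dx)
    end
    return y, jac
end

# The remaining variants only allocate storage and delegate, as in the diagram.
function toy_value_and_jacobian(f, x)
    y = f(x)
    jac = zeros(eltype(y), length(y), length(x))
    return toy_value_and_jacobian!(jac, f, x)
end
toy_jacobian!(jac, f, x) = last(toy_value_and_jacobian!(jac, f, x))
toy_jacobian(f, x) = last(toy_value_and_jacobian(f, x))

f(x) = [x[1] * x[2], x[1] + x[2], x[2]^2]
J = toy_jacobian(f, [2.0, 3.0])
```

Only `toy_value_and_pushforward!` knows how to differentiate. Replacing it with a different primitive changes every operator built on top of it.
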

### Reverse mode

```mermaid
flowchart LR
    subgraph Gradient
        gradient --> value_and_gradient
        value_and_gradient --> value_and_gradient!
        gradient! --> value_and_gradient!
    end
    subgraph Jacobian
        jacobian --> value_and_jacobian
        value_and_jacobian --> value_and_jacobian!
        jacobian! --> value_and_jacobian!
    end
    subgraph Multiderivative
        multiderivative --> value_and_multiderivative
        value_and_multiderivative --> value_and_multiderivative!
        multiderivative! --> value_and_multiderivative!
    end
    subgraph Derivative
        derivative --> value_and_derivative
    end
    subgraph Pullback
        pullback --> value_and_pullback
        value_and_pullback --> value_and_pullback!
        pullback! --> value_and_pullback!
    end
    value_and_jacobian! --> value_and_pullback!
    value_and_gradient! --> value_and_pullback!
    value_and_multiderivative! --> value_and_pullback!
    value_and_derivative --> value_and_pullback
```
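
A matching sketch for the Jacobian branch of the reverse-mode diagram, again in plain Julia with hypothetical `toy_*` names and an argument order chosen for the example only. To keep the pullback exact without an AD backend, the differentiated function is assumed to be the linear map `x -> A * x`, whose vector-Jacobian product is `A' * dy`.

```julia
# Hypothetical primitive, standing in for a backend's `value_and_pullback!`.
# For the linear map x -> A * x, the vector-Jacobian product is exactly A' * dy.
function toy_value_and_pullback!(dx, A, x, dy)
    dx .= A' * dy
    return A * x, dx
end

# Mutating Jacobian with primal: one pullback per output basis vector
# fills one row of `jac`.
function toy_value_and_jacobian!(jac, A, x)
    y = A * x
    for i in eachindex(y)
        dy = zero(y)
        dy[i] = 1
        toy_value_and_pullback!(view(jac, i, :), A, x, dy)
    end
    return y, jac
end

# The non-mutating variants allocate storage and delegate.
toy_value_and_jacobian(A, x) = toy_value_and_jacobian!(zeros(eltype(A), size(A)), A, x)
toy_jacobian(A, x) = last(toy_value_and_jacobian(A, x))

A = [1.0 2.0; 3.0 4.0; 5.0 6.0]
x = [1.0, -1.0]
y, J = toy_value_and_jacobian(A, x)
```

In this sketch the Jacobian is assembled row by row, so the number of primitive calls equals the number of outputs, whereas a pushforward-based construction needs one call per input.
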
@@ -0,0 +1,51 @@
# Getting started

## [Operators](@id operators)

Depending on the type of input and output, differentiation operators can have various names.
We choose the following terminology for the ones we provide:

|                  | **scalar output** | **array output**  |
| ---------------- | ----------------- | ----------------- |
| **scalar input** | `derivative`      | `multiderivative` |
| **array input**  | `gradient`        | `jacobian`        |

Most backends have custom implementations for all of these, which we reuse whenever possible.

### Variants

Whenever it makes sense, four variants of the same operator are defined:

| **Operator**      | **non-mutating**          | **mutating**               | **non-mutating with primal**        | **mutating with primal**             |
| :---------------- | :------------------------ | :------------------------- | :---------------------------------- | :----------------------------------- |
| Derivative        | [`derivative`](@ref)      | N/A                        | [`value_and_derivative`](@ref)      | N/A                                  |
| Multiderivative   | [`multiderivative`](@ref) | [`multiderivative!`](@ref) | [`value_and_multiderivative`](@ref) | [`value_and_multiderivative!`](@ref) |
| Gradient          | [`gradient`](@ref)        | [`gradient!`](@ref)        | [`value_and_gradient`](@ref)        | [`value_and_gradient!`](@ref)        |
| Jacobian          | [`jacobian`](@ref)        | [`jacobian!`](@ref)        | [`value_and_jacobian`](@ref)        | [`value_and_jacobian!`](@ref)        |
| Pushforward (JVP) | [`pushforward`](@ref)     | [`pushforward!`](@ref)     | [`value_and_pushforward`](@ref)     | [`value_and_pushforward!`](@ref)     |
| Pullback (VJP)    | [`pullback`](@ref)        | [`pullback!`](@ref)        | [`value_and_pullback`](@ref)        | [`value_and_pullback!`](@ref)        |

Note that scalar outputs can't be mutated, which is why `derivative` doesn't have mutating variants.

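The relationship between the four variants can be illustrated with a hand-written gradient in plain Julia. The `toy_*` functions and their signatures are hypothetical: they omit the backend and function arguments of the real operators and hard-code the gradient of `f(x) = sum(abs2, x)`, which is `2x`.

```julia
f(x) = sum(abs2, x)

# Mutating with primal: overwrites `grad` and returns the value of f alongside it.
function toy_value_and_gradient!(grad, x)
    grad .= 2 .* x
    return f(x), grad
end

# Non-mutating with primal: allocates the storage, then delegates.
toy_value_and_gradient(x) = toy_value_and_gradient!(similar(x), x)

# Mutating and non-mutating without primal: same work, value dropped.
toy_gradient!(grad, x) = last(toy_value_and_gradient!(grad, x))
toy_gradient(x) = last(toy_value_and_gradient(x))

x = [1.0, 2.0, 3.0]
buffer = zeros(3)
y, grad = toy_value_and_gradient!(buffer, x)
```

Here the mutating variants return the very array that was passed in, so `grad` and `buffer` refer to the same memory and no new array is allocated for the result.
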
## Preparation

In many cases, automatic differentiation can be accelerated if the function has been run at least once (e.g. to record a tape) and if some cache objects are provided.
This is a backend-specific procedure, but we expose a common syntax to achieve it.

| **Operator**      | **preparation function**          |
| :---------------- | :-------------------------------- |
| Derivative        | [`prepare_derivative`](@ref)      |
| Multiderivative   | [`prepare_multiderivative`](@ref) |
| Gradient          | [`prepare_gradient`](@ref)        |
| Jacobian          | [`prepare_jacobian`](@ref)        |
| Pushforward (JVP) | [`prepare_pushforward`](@ref)     |
| Pullback (VJP)    | [`prepare_pullback`](@ref)        |

If you run `prepare_operator(backend, f, x)`, it will create an object called `extras` containing the necessary information to speed up `operator` and its variants.
This information is specific to `backend` and `f`, as well as the _type and size_ of the input `x`, but it should work with different _values_ of `x`.

You can then call `operator(backend, f, similar_x, extras)`, which should be faster than `operator(backend, f, similar_x)`.
This is especially worthwhile if you plan to call `operator` several times in similar settings: you can think of it as a warm-up.

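The prepare-then-reuse pattern can be sketched in plain Julia as follows. Everything in this block is a hypothetical stand-in: `ToyBackend`, `ToyGradientExtras`, `toy_prepare_gradient` and `toy_gradient` are not part of the package, the "backend" is a central finite-difference scheme, and the cache is just a preallocated perturbation buffer. Only the call shapes mirror `prepare_operator(backend, f, x)` and `operator(backend, f, similar_x, extras)`.

```julia
struct ToyBackend end  # hypothetical backend type

# Hypothetical `extras` object: a buffer whose type and size match the input.
struct ToyGradientExtras{V}
    dx::V
end

# Preparation depends on the type and size of x, not on its values.
toy_prepare_gradient(::ToyBackend, f, x) = ToyGradientExtras(zero(x))

# Prepared call: reuses the perturbation buffer stored in `extras`.
function toy_gradient(::ToyBackend, f, x, extras::ToyGradientExtras; h=1e-6)
    dx = extras.dx
    grad = similar(x)
    for i in eachindex(x)
        fill!(dx, 0)
        dx[i] = h
        grad[i] = (f(x .+ dx) - f(x .- dx)) / (2h)
    end
    return grad
end

# Unprepared call: pays the preparation cost every time.
function toy_gradient(backend::ToyBackend, f, x)
    return toy_gradient(backend, f, x, toy_prepare_gradient(backend, f, x))
end

backend = ToyBackend()
f(x) = sum(abs2, x)
x = [1.0, 2.0, 3.0]
extras = toy_prepare_gradient(backend, f, x)

g1 = toy_gradient(backend, f, x, extras)
similar_x = [4.0, 5.0, 6.0]  # same type and size, different values
g2 = toy_gradient(backend, f, similar_x, extras)
```

The same `extras` object serves both calls because `x` and `similar_x` share a type and a size. An input of a different length would need its own preparation.
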
By default, all the preparation functions return `nothing`.
We do not make any guarantees on their implementation for each backend, or on the performance gains that can be expected.
File renamed without changes.
File renamed without changes.
File renamed without changes.