FMI 3.0 provides optional access to partial derivatives for variables of an FMU. Partial derivatives can be used:
-
in Newton algorithms to solve algebraic loops,
-
in implicit integration algorithms in Model Exchange,
-
in iterative Co-Simulation algorithms, and
-
in optimization algorithms.
To avoid expensive numeric approximations of these derivatives, FMI offers dedicated functions to retrieve partial derivatives for variables of an FMU:
-
fmi3GetDirectionalDerivative
to compute the directional derivatives \(\mathbf{v}_{\mathit{sensitivity}} = \mathbf{J} \cdot \mathbf{v}_{\mathit{seed}}\), and -
fmi3GetAdjointDerivative
to calculate the adjoint derivatives \(\mathbf{\bar{v}}_{\mathit{sensitivity}}^T = \mathbf{\bar{v}}_{\mathit{seed}}^T \cdot \mathbf{J}\)
with the Jacobian
where \(\mathbf{v}_{\mathit{known}}\) are the \(n\) knowns, and \(\mathbf{g}\) are the \(m\) functions to calculate the \(m\) unknown variables \(\mathbf{v}_{\mathit{unknwon}}\) from the knowns.
The functions can be used to compute Jacobian-vector products or to construct the partial derivative matrices column-wise or row-wise by choosing the seed vector \(\mathbf{v}_{\mathit{seed}} \in \mathbb{R}^n\) or \(\mathbf{\bar{v}}_{\mathit{seed}} \in \mathbb{R}^m\), respectively, accordingly.
For information on the call signature see the FMI specification.
The function fmi3GetDirectionalDerivative
computes the directional derivative
One can either retrieve the \(\mathit{i}\)-th column of the Jacobian by specifying the \(\mathit{i}\)-th unit vector \(\mathbf{e}_{\mathit{i}}\) as the seed vector \(\mathbf{v}_{\mathit{seed}}\), or compute a Jacobian-vector product \(\mathbf{Jv}\) by using \(\mathbf{v}\) as the seed vector \(\mathbf{v}_{\mathit{seed}}\). Therefore, the function can be utilized for the following purposes, among others:
-
Solving algebraic loops with a nonlinear solver requires matrix \({\frac{\partial \mathbf{g}}{\partial \mathbf{u}}}\).
-
Numerical integrators of stiff methods need matrix \({\frac{\partial \mathbf{f}}{\partial \mathbf{x}}}\).
-
If the FMU is connected with other FMUs, the partial derivatives of the state derivatives and outputs with respect to the continuous states and the inputs are needed in order to compute the Jacobian for the system of the connected FMUs.
-
If the FMU shall be linearized, the same derivatives as in the previous item are needed.
-
If the FMU is used as the model for an extended Kalman filter, \({\frac{\partial \mathbf{f}}{\partial \mathbf{x}}}\) and \({\frac{\partial \mathbf{g}}{\partial \mathbf{x}}}\) are needed.
-
If matrix-free linear solvers shall be used, Jacobian-vector products \({\mathbf{Jv}}\) are needed (e.g. as a user-supplied routine in CVODE [CVODE570]).
Example 1:
Assume an FMU has the output equations
and this FMU is connected, so that \({y_1, u_1, u_3}\) appear in an algebraic loop. Then the nonlinear solver needs a Jacobian and this Jacobian can be computed (without numerical differentiation) provided the partial derivative of \({y_1}\) with respect to \({u_1}\) and \({u_3}\) is available. Depending on the environment where the FMUs are connected, these derivatives can be provided:
(a) with one wrapper function around function fmi3GetDirectionalDerivative
to compute the directional derivatives with respect to these two variables (in other words, \({v_{\mathit{unknown}} = y_1}\), \({v_{\mathit{known}} = \left \{ u_1, u_3 \right \}}\)), and then the environment calls this wrapper function with \({v_{\mathit{seed}} = \left \{ 1, 0 \right \}}\) to compute the partial derivative with respect to \({u_1}\) and \({v_{\mathit{seed}} = \left \{ 0, 1 \right \}}\) to compute the partial derivative with respect to \({u_3}\), or
(b) with two direct function calls of fmi3GetDirectionalDerivative
(in other words, \({v_{\mathit{unknown}} = y_1, v_{\mathit{known}} = u_1, v_{\mathit{seed}} = 1}\); and \({v_{\mathit{unknown}} = y_1, v_{\mathit{known}} = u_3, v_{\mathit{seed}} = 1}\)).
Note that a direct implementation of this function with analytic derivatives:
(a) Provides the directional derivative for all input variables; so in the above example: \({\Delta y_1 = \frac{\partial g_1}{\partial x} \cdot \Delta x + \frac{\partial g_1}{\partial u_1} \cdot \Delta u_1 + \frac{\partial g_1}{\partial u_3} \cdot \Delta u_3 + \frac{\partial g_1}{\partial u_4} \cdot \Delta u_4}\)
(b) Initializes all seed-values to zero; so in the above example: \({\Delta x = \Delta u_1 = \Delta u_3 = \Delta u_4 = 0}\)
(c) Computes the directional derivative with the seed-values provided in the function arguments; so in the above example: \({v_{\mathit{sensitivity}} = \Delta y_1 (\Delta x = 0, \Delta u_1 = 1, \Delta u_3 = 0, \Delta u_4 = 0)}\)] and \({v_{\mathit{sensitivity}} = \Delta y_1 (\Delta x = 0, \Delta u_1 = 0, \Delta u_3 = 1, \Delta u_4 = 0)}\)]
Example 2:
If a dense matrix shall be computed, the columns of the matrix can be easily constructed by successive calls of fmi3GetDirectionalDerivative
.
For example, constructing the system Jacobian \({\mathbf{A} = \frac{\partial \mathbf{f}}{\partial \mathbf{x}}}\) as dense matrix can be performed in the following way:
// c[] column vector
// set time, states and inputs
fmi3SetTime(S, time);
fmi3SetContinuousStates(S, x, nx);
// fmi3Set{VariableType}(S, ...);
// if required at this step, compute the Jacobian as a dense matrix
for (i = 0; i < nx; i++) {
// construct the Jacobian matrix column wise
fmi3GetDirectionalDerivative(S, vr_dx, nx, &vr_x[i], 1, &dk, 1, c, nx);
for (j = 0; j < nx; j++) {
J[j][i] = c[j];
}
}
If the sparsity of a matrix shall be taken into account, then the matrix can be constructed in the following way:
-
The incidence information of the matrix (whether an element is zero or not zero) is extracted from the XML file from element
ModelStructure
. -
A so called graph coloring algorithm is employed to determine the columns of the matrix that can be computed by one call of
fmi3GetDirectionalDerivative
. -
For the columns determined in (2), one call to
fmi3GetDirectionalDerivative
is made. After each such call, the elements of the resulting directional derivative vector are copied into their correct locations of the partial derivative matrix.
More details and implementational notes are available from [ABL12].
Example 3:
Directional derivatives for higher dimension variables are almost treated in the same way.
Consider, for example, an FMU which calculates its output \({Y}\) by multiplying its 2x2 input \({U}\) with a 3x2 constant gain \({K}\), with
The output \({Y=K U}\) is a matrix of size 3x2. The directional derivative of an output element \({Y(i,j)}\) with respect to the input \({U}\) and the seed \({\Delta U}\) is:
To get the directional derivative of \({Y}\) with respect to \({U(2,1)}\) the command fmi3GetDirectionalDerivative(m, vr_Y, 1, vr_U, 1, {0.0, 0.0, 1.0, 0.0}, 4, dd, 6)
can be used where vr_Y
and vr_U
are references of the variable \({Y}\) and \({U}\), respectively.
Note that in order to get the directional derivative of \({Y}\) with respect to \({U(2,1)}\), the seed value {0, 0, 1.0, 0}
has been used.
The retrieved directional derivative dd
is stored in a matrix of size 3x2, so nSensitivity
is 6.
The function fmi3GetAdjointDerivative
computes the adjoint derivative
One can either retrieve the \(\mathit{i}\)-th row of the Jacobian by specifying the \(\mathit{i}\)-th unit vector \(\mathbf{e}_{\mathit{i}}\) as the seed vector \(\mathbf{\bar{v}}_{\mathit{seed}}\), or compute a vector-Jacobian product \(\mathbf{v}^T\mathbf{J}\) by using \(\mathbf{v}\) as the seed vector \(\mathbf{\bar{v}}_{\mathit{seed}}\).
Adjoint derivatives are beneficial in several contexts:
-
in artificial intelligence (AI) frameworks the adjoint derivatives are called "vector gradient products" (VJPs). There adjoint derivatives are used in the backpropagation process to perform gradient-based optimization of parameters using reverse mode automatic differentiation (AD), see, e.g., [BPRS15].
-
in parameter estimation (see [BKF17])
Typically, reverse mode automatic differentiation (AD) is more efficient for these use cases than forward mode AD because the number of knowns is much higher than the number of unknowns (\(\mathit{n} \gg \mathit{m}\)), as explained in the cited references. If the full Jacobian is needed and the number of knowns and unknowns are somewhat equal (\(\mathit{m} \approx \mathit{n}\)) or small, the column-wise construct using fmi3GetDirectionalDerivative
is generally more efficient.
Example:
Assume an FMU has the output equations
and \(\left( w_1, w_2 \right)^T \cdot \mathbf{ \frac{\partial h}{\partial u} }\) for some vector \(\left( w_1, w_2 \right)^T\) is needed.
Then one can get this with one function call of fmi3GetAdjointDerivative
(with arguments \(\mathbf{v}_{\mathit{unknown}} = \text{valueReferences of} \left \{ y_1, y_2 \right \}\), \(\mathbf{v}_{\mathit{known}} = \text{valueReferences of} \left \{ u_1, u_2 \right \}\), \(\mathbf{\bar{v}}_{\mathit{seed}} = \left( w_1, w_2 \right)^T\)), while with fmi3GetDirectionalDerivative
at least two calls would be necessary to first construct the Jacobian column-wise and then multiplying from the right with \(\left( w_1, w_2 \right)^T\).
If a dense matrix shall be computed, the rows of the matrix can be easily constructed by successive calls of fmi3GetAdjointDerivative
.
For example, constructing the system Jacobian \({\mathbf{A} = \frac{\partial \mathbf{f}}{\partial \mathbf{x}}}\) as a dense matrix can be performed in the following way:
for (i = 0; i < nx; i++) {
// construct the Jacobian matrix column wise
fmi3GetAdjointDerivative(S, &vr_dx[i], 1, vr_x, nx, &dk, 1, &J[i][0], nx);
}