diff --git a/dev/.documenter-siteinfo.json b/dev/.documenter-siteinfo.json
@@ -1 +1 @@
{"documenter":{"julia_version":"1.10.3","generation_timestamp":"2024-05-06T16:11:50","documenter_version":"1.4.1"}}

diff --git a/dev/acceleration/index.html b/dev/acceleration/index.html

Acceleration

By default, COSMO's ADMM algorithm is wrapped in a safeguarded acceleration method to achieve faster convergence to higher precision. COSMO uses accelerators from the COSMOAccelerators.jl package.

By default, the solver uses the accelerator type AndersonAccelerator{T, Type2{QRDecomp}, RestartedMemory, NoRegularizer}. This is the classic type-II Anderson acceleration method where the least-squares subproblem is solved using an updated QR decomposition. Moreover, the method is restarted, i.e. the history of iterates is deleted after mem steps, and no regularisation is used for the least-squares method.

In addition, the method is safeguarded (safeguard = true), i.e. the residual norm of the accelerated point cannot deviate too much from that of the current point. Otherwise, the accelerated point is discarded and the ADMM algorithm performs a normal step instead.

The acceleration method can be altered as usual via the solver settings and the accelerator keyword. To deactivate acceleration pass an EmptyAccelerator:

settings = COSMO.Settings(accelerator = EmptyAccelerator)

To use the default accelerator but with a different memory size (number of stored iterates) use:

settings = COSMO.Settings(accelerator = with_options(AndersonAccelerator, mem = 15))

To turn the safeguarding off use:

settings = COSMO.Settings(safeguard = false)
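
Alternatively, the safeguarding tolerance can be loosened or tightened while keeping the safeguard active. A minimal sketch, assuming the setting keeps the name safeguard_tol (its default 2.0 shows up as tol: 2.0 in the solver output of the examples):

# Sketch: keep safeguarding enabled but adjust the rejection tolerance
# (the setting name safeguard_tol is an assumption, see above).
settings = COSMO.Settings(safeguard = true, safeguard_tol = 2.0)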

To use an Anderson Accelerator of Type-I with a rolling-memory (oldest iterate replaced by newest) approach, use:

settings = COSMO.Settings(accelerator = AndersonAccelerator{Float64, Type1, RollingMemory, NoRegularizer})

For more fine-grained control look at the implementation of the accelerator here.

When JuMP is used, the accelerator settings can be passed in the usual way:

model = JuMP.Model(optimizer_with_attributes(COSMO.Optimizer, "accelerator" => with_options(AndersonAccelerator, mem = 15)))

Or using the set_optimizer_attribute() method:

model = JuMP.Model(COSMO.Optimizer);
set_optimizer_attribute(model, "accelerator", with_options(AndersonAccelerator, mem = 15))

diff --git a/dev/api/index.html b/dev/api/index.html

API Reference

Model

COSMO.ModelType
Model{T <: AbstractFloat}()

Initializes an empty COSMO model that can be filled with problem data using assemble!(model, P, q, constraints; [settings, x0, s0, y0]).

source
COSMO.assemble!Function
assemble!(model, P, q, constraint(s); [settings, x0, y0, s0])

Assembles a COSMO.Model with a cost function defined by P and q, and a number of constraints.

The positive semidefinite matrix P and vector q are used to specify the cost function of the optimization problem:

min   1/2 x'Px + q'x
 s.t.  Ax + b ∈ C

constraints is a COSMO.Constraint or an array of COSMO.Constraint objects that are used to describe the constraints on x.


The optional keyword argument settings can be used to pass custom solver settings:

custom_settings = COSMO.Settings(verbose = true);
 assemble!(model, P, q, constraints, settings = custom_settings)

The optional keyword arguments x0 and y0 can be used to provide the solver with warm starting values for the primal variable x and the dual variable y.

x_0 = [1.0; 5.0; 3.0]
COSMO.assemble!(model, P, q, constraints, x0 = x_0)
source
COSMO.set!Function
set!(model, P, q, A, b, convex_sets, [settings])

Sets model data directly based on provided fields.

source
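
As an illustrative sketch (toy data, our own example, not from the docstring), problem data can be passed to set! directly; the constraint below reads $Ax + b \in$ Nonnegatives:

# Minimal sketch with toy data: min 1/2 x'Px + q'x  s.t.  Ax + b ∈ Nonnegatives(2)
using COSMO, SparseArrays, LinearAlgebra

P = sparse(1.0I, 2, 2); q = ones(2);
A = sparse(1.0I, 2, 2); b = zeros(2);
convex_sets = [COSMO.Nonnegatives{Float64}(2)]

model = COSMO.Model()
COSMO.set!(model, P, q, A, b, convex_sets)
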
COSMO.empty_model!Function
empty_model!(model)

Resets all the fields of model to those of a model created with COSMO.Model() (apart from the settings).

source
COSMO.warm_start_primal!Function
warm_start_primal!(model, x0, [ind])

Provides the COSMO.Model with warm starting values for the primal variable x. ind can be used to warm start certain components of x.

source
COSMO.warm_start_slack!Function
warm_start_slack!(model, s0, [ind])

Provides the COSMO.Model with warm starting values for the primal slack variable s. ind can be used to warm start certain components of s.

source
COSMO.warm_start_dual!Function
warm_start_dual!(model, y0, [ind])

Provides the COSMO.Model with warm starting values for the dual variable y. ind can be used to warm start certain components of y.

source
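
Putting the pieces above together, a minimal end-to-end sketch of the native interface could look as follows (toy data, our own example):

# Assemble a small QP, warm start the primal variable and solve.
using COSMO, SparseArrays

P = sparse([4.0 1.0; 1.0 2.0]); q = [1.0; 1.0];
# one inequality x1 + x2 - 1 >= 0, written as Ax + b ∈ Nonnegatives
constraint = COSMO.Constraint([1.0 1.0], [-1.0], COSMO.Nonnegatives)

model = COSMO.Model()
assemble!(model, P, q, constraint, settings = COSMO.Settings(verbose = false))
COSMO.warm_start_primal!(model, [0.5; 0.5])   # optional warm start
res = COSMO.optimize!(model)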

Constraints

COSMO.ConstraintType
Constraint{T <: AbstractFloat}(A, b, convex_set_type, dim = 0, indices = 0:0)

Creates a COSMO constraint: Ax + b ∈ convex_set.

By default the following convex set types are supported: ZeroSet, Nonnegatives, SecondOrderCone, PsdCone, PsdConeTriangle.

Examples

julia> COSMO.Constraint([1 0;0 1], zeros(2), COSMO.Nonnegatives)
 Constraint
 Size of A: (2, 2)
 ConvexSet: COSMO.Nonnegatives{Float64}

For convex sets that require their own data, it is possible to pass the instantiated object directly rather than the type name.

Examples

julia> COSMO.Constraint([1 0;0 1], zeros(2), COSMO.Box([-1.;-1.],[1.;1.]))
@@ -18,4 +18,4 @@
 ConvexSet: COSMO.ZeroSet{Float64}

Notice that extra columns of A have been added automatically.

julia> Matrix(c.A)
 2×4 Array{Float64,2}:
 0.0  1.0  0.0  0.0
 0.0  0.0  1.0  0.0
source
COSMO.ZeroSetType
ZeroSet(dim)

Creates the zero set $\{ 0 \}^{dim}$ of dimension dim. If $x \in$ ZeroSet then all entries of x are zero.

source
COSMO.NonnegativesType
Nonnegatives(dim)

Creates the nonnegative orthant $\{ x \in \mathbb{R}^{dim} : x \ge 0 \}$ of dimension dim.

source
COSMO.BoxType
Box(l, u)

Creates a box or interval with lower boundary vector $l \in \mathbb{R}^m \cup \{-\infty\}^m$ and upper boundary vector $u \in \mathbb{R}^m \cup \{+\infty\}^m$.

source
COSMO.SecondOrderConeType
SecondOrderCone(dim)

Creates the second-order cone (or Lorentz cone) $\{ (t,x) \in \mathbb{R}^{dim} : \| x \|_2 \leq t \}$.

source
COSMO.PsdConeType
PsdCone(dim)

Creates the cone of symmetric positive semidefinite matrices $\mathcal{S}_+^{dim}$. The entries of the matrix X are stored column-by-column in the vector x of dimension dim. Accordingly $X \in \mathbb{S}_+ \Rightarrow x \in \mathcal{S}_+^{dim}$, where $X = \text{mat}(x)$.

source
COSMO.PsdConeTriangleType
PsdConeTriangle(dim)

Creates the cone of symmetric positive semidefinite matrices. The entries of the upper-triangular part of matrix X are stored in the vector x of dimension dim. An $r \times r$ matrix has $r(r+1)/2$ upper triangular elements and results in a vector of $\mathrm{dim} = r(r+1)/2$.

Examples

The matrix

\[\begin{bmatrix} x_1 & x_2 & x_4\\ x_2 & x_3 & x_5\\ x_4 & x_5 & x_6 \end{bmatrix}\]

is transformed to the vector $[x_1, x_2, x_3, x_4, x_5, x_6]^\top$ with corresponding constraint PsdConeTriangle(6).

source
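
As a small illustration (our own toy example), a $2 \times 2$ PSD constraint on $X = \begin{bmatrix} x_1 & x_2 \\ x_2 & x_3 \end{bmatrix}$ acts on the upper-triangular vector $x = [x_1, x_2, x_3]$ and therefore uses PsdConeTriangle(3):

# Toy sketch: constrain mat(x) = [x1 x2; x2 x3] to be PSD.
using COSMO, LinearAlgebra

A = Matrix(1.0I, 3, 3)   # identity: the cone acts on x directly
b = zeros(3)
c = COSMO.Constraint(A, b, COSMO.PsdConeTriangle)
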
COSMO.ExponentialConeType
ExponentialCone(MAX_ITERS = 100, EXP_TOL = 1e-8)

Creates the exponential cone $\mathcal{K}_{exp} = \{(x, y, z) \mid y \geq 0, ye^{x/y} \leq z\} \cup \{ (x,y,z) \mid x \leq 0, y = 0, z \geq 0 \}$

source
COSMO.DualExponentialConeType
DualExponentialCone(MAX_ITERS::Int = 100, EXP_TOL = 1e-8)

Creates the dual exponential cone $\mathcal{K}^*_{exp} = \{(x, y, z) \mid x < 0, -xe^{y/x} \leq e^1 z \} \cup \{ (0,y,z) \mid y \geq 0, z \geq 0 \}$

source
COSMO.PowerConeType
PowerCone(alpha::Float64, MAX_ITERS::Int = 20, POW_TOL = 1e-8)

Creates the 3-d power cone $\mathcal{K}_{pow} = \{(x, y, z) \mid x^\alpha y^{(1-\alpha)} \geq \|z\|, x \geq 0, y \geq 0 \}$ with $0 < \alpha < 1$

source
COSMO.DualPowerConeType
DualPowerCone(alpha::Float64, MAX_ITERS::Int = 20, POW_TOL = 1e-8)

Creates the 3-d dual power cone $\mathcal{K}^*_{pow} = \{(u, v, w) \mid \left( \frac{u}{\alpha}\right)^\alpha \left( \frac{v}{1-\alpha}\right)^{(1-\alpha)} \geq \|w\|, u \geq 0, v \geq 0 \}$ with $0 < \alpha < 1$

source
diff --git a/dev/citing/index.html b/dev/citing/index.html
@@ -32,4 +32,4 @@
pages={435--440},
year={2022},
organization={IEEE}
}

A preprint can be downloaded here.

diff --git a/dev/contributing/index.html b/dev/contributing/index.html
Imperative style for the commit message: "Fix bug" and not "Fixed bug" or "Fixes bug."
The issue id can be omitted if the commit does not relate to a specific open issue.

diff --git a/dev/decomposition/index.html b/dev/decomposition/index.html
@@ -69,7 +69,7 @@
Status: Solved
Iterations: 40
Optimal objective: -1.4134
Runtime: 0.002s (1.93ms)

Under sets we can indeed see that COSMO solved a problem with five PSD constraints corresponding to the cliques $\mathcal{C}_1,\ldots, \mathcal{C}_5$ discovered in the sparsity pattern. (Note that the dimension printed is the number of entries in the upper triangle of the matrix block.)

Clique merging

After we have found the cliques of the sparsity pattern, we are allowed to merge some of them back together. For the graph of the sparsity pattern this just means adding more edges or treating some structural zeros as numerical zeros. The main reason to merge two cliques is that they might overlap a lot and therefore it is not advantageous to treat them as two different blocks. Consider the two extreme cases below:

In the left figure we have the ideal case that all the blocks overlap in just one entry. A full decomposition would leave us with a large number of small blocks. The sparsity pattern in the right figure has two large blocks overlapping almost entirely. In this case it would be disadvantageous to decompose the blocks. Instead, we would do the initial decomposition, realize the large overlap, and then merge the two blocks back together. For sparsity patterns that arise from real applications the case is not always as clear and we have to use more sophisticated strategies to decide which blocks to merge.

COSMO currently provides three different strategies that can be selected by the user:

COSMO.NoMergeType
NoMerge <: AbstractMergeStrategy

A strategy that does not merge cliques.

source
COSMO.ParentChildMergeType
ParentChildMerge(t_fill = 8, t_size = 8) <: AbstractTreeBasedMerge

The merge strategy suggested in Sun and Andersen - Decomposition in conic optimization with partially separable structure (2014). The initial clique tree is traversed in topological order and a clique $\mathcal{C}_\ell$ is greedily merged to its parent clique $\mathcal{C}_{par(\ell)}$ if at least one of the two conditions is met

  • $(| \mathcal{C}_{par(\ell)}| -| \eta_\ell|) (|\mathcal{C}_\ell| - |\eta_\ell|) \leq t_{\text{fill}}$ (fill-in condition)
  • $\max \left\{ |\nu_{\ell}|, |\nu_{par(\ell)}| \right\} \leq t_{\text{size}}$ (supernode size condition)
source
COSMO.CliqueGraphMergeType
CliqueGraphMerge(edge_weight::AbstractEdgeWeight = ComplexityWeight()) <: AbstractGraphBasedMerge

The (default) merge strategy based on the reduced clique graph $\mathcal{G}(\mathcal{B}, \xi)$, for a set of cliques $\mathcal{B} = \{ \mathcal{C}_1, \dots, \mathcal{C}_p\}$ where the edge set $\xi$ is obtained by taking the edges of the union of clique trees.

Moreover, given an edge weighting function $e(\mathcal{C}_i,\mathcal{C}_j) = w_{ij}$, we compute a weight for each edge that quantifies the computational savings of merging the two cliques. After the initial weights are computed, we merge cliques in a loop:

while clique graph contains positive weights:

  • select two permissible cliques with the highest weight $w_{ij}$
  • merge cliques $\rightarrow$ update clique graph
  • recompute weights for updated clique graph

Custom edge weighting functions can be used by defining your own CustomEdgeWeight <: AbstractEdgeWeight and a corresponding edge_metric method. By default, the ComplexityWeight <: AbstractEdgeWeight is used which computes the weight based on the cardinalities of the cliques: $e(\mathcal{C}_i,\mathcal{C}_j) = |\mathcal{C}_i|^3 + |\mathcal{C}_j|^3 - |\mathcal{C}_i \cup \mathcal{C}_j|^3$.

See also: Garstka, Cannon, Goulart - A clique graph based merging strategy for decomposable SDPs (2019)

source
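
A strategy is selected via the solver settings. A brief sketch using the native interface (ParentChildMerge chosen purely for illustration):

# Sketch: enable chordal decomposition and pick a merge strategy.
settings = COSMO.Settings(decompose = true, merge_strategy = COSMO.ParentChildMerge)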

In our example problem we have two cliques $\mathcal{C}_3 = \{ 3,6,7,8\}$ and $\mathcal{C}_5 = \{6,7,8,9 \}$ that overlap in three entries. Let's solve the problem again and choose the default clique merging strategy merge_strategy = COSMO.CliqueGraphMerge:

model = JuMP.Model(with_optimizer(COSMO.Optimizer, decompose = true, merge_strategy = COSMO.CliqueGraphMerge));
 @variable(model, x[1:2]);
 @objective(model, Min, c' * x )
 @constraint(model, Symmetric(B - A1  .* x[1] - A2 .* x[2] )  in JuMP.PSDCone());
@@ -109,4 +109,4 @@
 Status: Solved
 Iterations: 40
 Optimal objective: -1.4134
Runtime: 0.003s (2.68ms)

Unsurprisingly, we can see in the output that COSMO solved a problem with four PSD constraints. One of them is of dimension 15, i.e. a $5\times 5$ block, which corresponds to the merged clique $\mathcal{C}_3 \cup \mathcal{C}_5 = \{3,6,7,8,9 \}$.

Completing the dual variable

After a decomposed problem is solved, we can recover the solution to the original problem by assembling the matrix variable $S$ from its subblocks $S_\ell$:

\[S = \displaystyle \sum_{\ell = 1}^p T_\ell^\top S_\ell T_\ell,\]

Following Agler's Theorem, $S$ will be a positive semidefinite matrix. However, this is not true for the corresponding dual variable matrix $Y$. The dual variable returned after solving the decomposed problem will be in the space of PSD completable matrices $Y \in \mathbb{S}_+^n(E,?)$. This means that the entries in $Y$ corresponding to the blocks $S_\ell$ (black dots) have been chosen correctly. The numerical values for all the other entries (corresponding to the zeros in $S$ and denoted with a red dot) have to be chosen in the right way to make $Y$ positive semidefinite.

For more information about PSD matrix completion and the completion algorithm used in COSMO take a look at Vandenberghe and Andersen - Chordal Graphs and Semidefinite Optimization (Ch.10). To configure COSMO to complete the dual variable after solving the problem you have to set the complete_dual option:

model = JuMP.Model(with_optimizer(COSMO.Optimizer, complete_dual = true));

Example Code

The code used for this example can be found in /examples/chordal_decomposition.jl.

diff --git a/dev/examples/closest_correlation_matrix/index.html b/dev/examples/closest_correlation_matrix/index.html
@@ -43,7 +43,7 @@
Acc:      Anderson Type2{QRDecomp},
          Memory size = 15, RestartedMemory,
          Safeguarded: true, tol: 2.0
Setup Time: 981.92ms

Iter:	Objective:	Primal Res:	Dual Res:	Rho:
1	9.8585e+00	5.9778e-01	7.6572e-01	1.0000e-01
@@ -54,7 +54,7 @@
Status: Solved
Iterations: 25
Optimal objective: 2.21
Runtime: 5.217s (5217.47ms)

Double check result against known solution:

known_opt_val = 12.5406
 known_solution =  [
   1.0         0.732562   -0.319491   -0.359985   -0.287543   -0.15578     0.0264044  -0.271438;
   0.732562    1.0         0.0913246  -0.0386357   0.299199   -0.122733    0.126612   -0.187489;
@@ -64,4 +64,4 @@
  -0.15578    -0.122733    0.461783    0.250601   -0.0875199   1.0        -0.731556    0.0841783;
   0.0264044   0.126612   -0.248641    0.141151    0.137518   -0.731556    1.0        -0.436274;
  -0.271438   -0.187489   -0.395299    0.286088    0.0262425   0.0841783  -0.436274    1.0  ];
@test isapprox(obj_val, known_opt_val , atol=1e-3)
Test Passed
@test norm(X_sol - known_solution, Inf) < 1e-3
Test Passed

This page was generated using Literate.jl.

diff --git a/dev/examples/index.html b/dev/examples/index.html
@@ -68,4 +68,4 @@
# solve and get results
status = JuMP.optimize!(m)
obj_val = JuMP.objective_value(m)
X_sol = JuMP.value.(X)

Logistic Regression

Logistic regression problems can be solved using exponential cone constraints. An example on how to use COSMO to solve a logistic regression problem is presented in /examples/logistic_regression_regularization.ipynb.
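
To see why exponential cones appear here, note that the negative log-likelihood consists of softplus terms $\log(1 + e^z)$; each epigraph $t \geq \log(1 + e^z)$ splits into two exponential-cone memberships. A hedged JuMP sketch of this standard reformulation (variable names are ours, not from the notebook):

# t >= log(1 + exp(z))  <=>  u + v <= 1 with
# (z - t, 1, u) ∈ K_exp and (-t, 1, v) ∈ K_exp,
# since (x, y, w) ∈ K_exp means y * exp(x/y) <= w.
using JuMP, COSMO

model = JuMP.Model(COSMO.Optimizer)
@variable(model, z); @variable(model, t)
@variable(model, u); @variable(model, v)
@constraint(model, u + v <= 1)
@constraint(model, [z - t, 1, u] in MOI.ExponentialCone())
@constraint(model, [-t, 1, v] in MOI.ExponentialCone())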

[Two regenerated plot SVGs: logistic_regression/10279931.svg renamed to a5da2105.svg; logistic_regression/f5fc5625.svg renamed to bae01cb9.svg]

diff --git a/dev/examples/logistic_regression/index.html b/dev/examples/logistic_regression/index.html
@@ -25,7 +25,7 @@
# visualize data
plot(x1[1:n_half-1], x2[1:n_half-1], color = :blue, st=:scatter, markershape = :cross, aspect_ratio=:equal, label = "Accepted", xlabel = "x1 - Microchip Test Score 1", ylabel = "x2 - Microchip Test Score 2")
plot!(x1[n_half:end], x2[n_half:end], color = :red, st=:scatter, markershape = :circle, label = "Rejected")
Example block output

The plot shows two test scores of $n$ microchip samples from a fabrication plant and whether the chip passed the quality check. Based on this data we would like to build a logistic model that takes into account the test scores and helps us predict the likelihood of a chip being accepted.

Defining the logistic model

The logistic regression hypothesis is given by

\[h_\theta(x) = g(\theta^\top x)\]

where $g$ is the sigmoid function:

\[g(\theta^\top x) = \frac{1}{1+\exp(-\theta^\top x)}.\]

The vector $x$ represents the independent variables and $\theta$ represents the model parameters. For our samples we set the dependent variable $y =1$ if the chip was accepted and $y = 0$ otherwise.

The function $h_\theta(x)$ can be interpreted as the probability of the outcome being true rather than false. We want to find the parameters $\theta$ that maximize the log-likelihood over all (independently Bernoulli distributed) observations

\[J(\theta) = \sum_{i, y_i = 1} \log h_\theta(x_i) + \sum_{i, y_i = 0} \log (1-h_\theta(x_i)).\]

Consequently, we want to solve the following optimization problem:

\[\text{minimize} \quad -J(\theta) + \mu \|\theta \|_2,\]

where we added a regularization term with parameter $\mu$ to prevent overfitting.
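
As a quick sanity check, the objective can be evaluated directly. A minimal sketch (helper names are ours, not from the example):

# Regularized negative log-likelihood for parameters θ, feature matrix X
# (one sample per row) and labels y ∈ {0, 1}.
using LinearAlgebra

sigmoid(z) = 1 / (1 + exp(-z))

function objective(θ, X, y, μ)
    h = sigmoid.(X * θ)                                  # h_θ(x_i) per sample
    J = sum(y .* log.(h) .+ (1 .- y) .* log.(1 .- h))    # log-likelihood J(θ)
    return -J + μ * norm(θ)                              # minimize -J(θ) + μ‖θ‖
end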

Feature mapping

As our dataset only has two independent variables (the test scores) our model $y = \theta_0 + \theta_1 x_1 + \theta_2 x_2$ will have the form of a straight line. Looking at the plot one can see that a line will not perform well in separating the samples. Therefore, we will create more features based on each data point by mapping the original features ($x_1$, $x_2$) into all polynomial terms of $x_1$ and $x_2$ up to the 6th power:

\[\text{map\_feature}(x_1,x_2) = [1, x_1, x_2, x_1^2, x_1x_2, x_2^2, x_1^3, \dots, x_1x_2^5, x_2^6 ]\]

This will create 28 features for each sample.

function map_feature(x1, x2)
   deg = 6
   x_new = ones(length(x1))
   for i = 1:deg, j = 0:i
@@ -99,7 +99,7 @@
 Acc:      Anderson Type2{QRDecomp},
           Memory size = 15, RestartedMemory,
           Safeguarded: true, tol: 2.0
Setup Time: 23.92ms
 
 Iter:	Objective:	Primal Res:	Dual Res:	Rho:
 1	-1.2883e+03	2.5314e+01	1.7951e+00	1.0000e-01
@@ -112,7 +112,7 @@
 Status: Solved
 Iterations: 75
 Optimal objective: 51.93
Runtime: 0.889s (889.17ms)
theta = value.(θ)
28-element Vector{Float64}:
   2.6744881022317313
   1.7749123100576332
   2.9342821763027476
@@ -140,7 +140,7 @@
     z[i, j] = dot(map_feature(u[i], v[j]), theta);
 end

To add the decision boundary we have to plot the line indicating $50\%$ probability of acceptance, i.e. $g(\theta^\top x) = g(z) = 0.5$ which we get at $z=0$.

plot(x1[1:n_half-1], x2[1:n_half-1], color = :blue, st = :scatter, markershape = :cross, aspect_ratio=:equal, label = "Accepted", xlabel = "x1 - Microchip Test Score 1", ylabel = "x2 - Microchip Test Score 2")
 plot!(x1[n_half:end], x2[n_half:end], color = :red, st = :scatter, markershape = :circle, label = "Rejected")
contour!(u, v, z', levels = [0.], c = :black, linewidth = 2)
Example block output

Solving the optimisation problem directly with COSMO

We can solve the problem directly in COSMO by using its modeling interface. The problem will have $nn = 5 n + n_\theta + 1$ variables. Let us define the cost function $\frac{1}{2}x^\top P x + q^\top x$:

nn = 5 * n + n_theta +  1
 P = spzeros(nn, nn)
 q = zeros(nn)
 q[1] = μ
@@ -219,7 +219,7 @@
 Acc:      Anderson Type2{QRDecomp},
           Memory size = 15, RestartedMemory,
           Safeguarded: true, tol: 2.0
Setup Time: 1.08ms
 
 Iter:	Objective:	Primal Res:	Dual Res:	Rho:
 1	-1.2883e+03	2.5314e+01	1.7951e+00	1.0000e-01
@@ -232,6 +232,6 @@
 Status: Solved
 Iterations: 75
 Optimal objective: 51.93
Runtime: 0.19s (190.34ms)

Let us double check that we get the same $\theta$ as in the previous section:

using Test
 theta_cosmo = res.x[2:2+n_theta-1]
@test norm(theta_cosmo - theta) < 1e-4
Test Passed

This page was generated using Literate.jl.

[Regenerated plot SVG: lovasz_petersen/abe5edc1.svg renamed to 31ff0c37.svg]

diff --git a/dev/examples/lovasz_petersen/index.html b/dev/examples/lovasz_petersen/index.html
@@ -31,7 +31,7 @@
for θ = 90:-72:-198
    push!(coordinates, [ri * cosd(θ), ri * sind(θ)])
end
graphplot(E, names = 1:n, x = getindex.(coordinates, 1), y = getindex.(coordinates, 2), fontsize = 10, nodesize = 1, nodeshape =:circle, curvature = 0.)
Example block output

Let's solve the SDP with COSMO and JuMP:

model = JuMP.Model(COSMO.Optimizer);
 
 @variable(model, X[1:n, 1:n], PSD)
 x = vec(X)
@@ -67,7 +67,7 @@
 Acc:      Anderson Type2{QRDecomp},
           Memory size = 15, RestartedMemory,
           Safeguarded: true, tol: 2.0
Setup Time: 0.15ms
 
 Iter:	Objective:	Primal Res:	Dual Res:	Rho:
 1	-1.2711e+03	2.1055e+01	1.0052e+00	1.0000e-01
@@ -78,4 +78,4 @@
 Status: Solved
 Iterations: 25
 Optimal objective: -4
Runtime: 0.479s (479.43ms)

The optimal objective is given by:

JuMP.objective_value(model)
4.000000007423991

This is the correct known value for the Petersen graph.

References

[1] Lovász - On the Shannon Capacity of a Graph, IEEE Transactions on Information Theory (1979)


This page was generated using Literate.jl.

diff --git a/dev/examples/lp/index.html b/dev/examples/lp/index.html
@@ -24,4 +24,4 @@
model = COSMO.Model();
assemble!(model, P, q, [c1; c2; c3; c4], settings = settings);
res = COSMO.optimize!(model)
>>> COSMO - Results
Status:

Compare the result to the known solution:

@test isapprox(res.x[1:4], [3; 5; 1; 1], atol=1e-2, norm = (x -> norm(x, Inf)))
Test Passed
@test isapprox(res.obj_val, 20.0, atol=1e-2)
Test Passed

This page was generated using Literate.jl.

[Regenerated plot SVG: maxcut/b98ec54e.svg renamed to 6767f22c.svg]

diff --git a/dev/examples/maxcut/index.html b/dev/examples/maxcut/index.html
@@ -11,7 +11,7 @@
W[2, 3] = 2; W[2, 4] = 10; W[3, 4] = 6;
W = Symmetric(W)
graphplot(W, names = 1:n, edgelabel = W, x = [0; 1; 1; 0], y = [1; 1; 0; 0], fontsize = 12, nodeshape =:circle)
Example block output

The maximum cut problem tries to find a cut or a partition of the graph's vertices into two complementary sets $S$ and $\bar{S}$ such that the total weight of the edges between the two sets is maximized. For this small graph the problem is trivial. The optimal solution is $S = \{1,2,3 \}$ and $\bar{S}=\{4\}$. Formally, this problem can be written as a mixed-integer optimisation problem:

\[\begin{array}{ll} \text{maximize} & \frac{1}{2} \sum_{i < j} w_{ij}(1 - y_i y_j)\\ \text{subject to} & y_i \in \, \{-1, 1 \}, \quad \forall i \in V, \end{array}\]

where $y_i = 1$ indicates that $v_i \in S$ and $y_i = -1$ indicates that $v_i \in \bar{S}$. This problem is of interest in the field of integrated circuit design, where one tries to minimize the number of cross-layer connections in a circuit under layout constraints.

For more complicated graphs this problem quickly becomes hard to solve to optimality and is in fact known to be NP-hard. For this example we are interested in the randomized approximation algorithm devised in Goemans and Williamson (1995) [1], which relaxes the problem to an SDP.

The approach can be divided into three steps:

  1. Solve a relaxed SDP to obtain $Y^*$
  2. Recover an approximate solution $V$ via a Cholesky factorisation $Y^* = V^\top V$
  3. Round the approximate solution using a random vector $r$ from the unit sphere.

The authors showed that this approach guarantees a solution of at least $0.87856$ times the optimal solution.

Solving the primal SDP

Before we formulate the SDP, let's compute the Laplacian matrix $L$:

using LinearAlgebra, Random, SparseArrays, COSMO, JuMP
 
@@ -64,7 +64,7 @@
 Status: Solved
 Iterations: 176 (incl. 26 safeguarding iter)
 Optimal objective: -24
Runtime: 0.326s (326.4ms)
Yopt = JuMP.value.(Y);
 obj_val = JuMP.objective_value(model)
23.9999999938528

Solving the dual SDP

Notice that the decision matrix $Y$ is generally dense (as correctly classified in the solver output above). Therefore, we won't be able to utilize COSMO's chordal decomposition features. However, assuming strong duality, it turns out that we can also solve the dual problem, which is given by:

\[\begin{array}{ll} \text{minimize} & \sum_i \gamma_i \\ \text{subject to} & \text{diag}(\gamma) - \frac{1}{4} L = S \\ & S \in \mathbf{S}_{+}^n. \end{array}\]

@@ -92,4 +92,4 @@
 1.0
 1.0
 1.0
 -1.0

For larger graphs this rounding step could be repeated several times to improve the rounding. As expected $S = \{1, 2, 3\}$ and $\bar{S}=\{ 4\}$.
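
For reference, a self-contained sketch of the rounding procedure (steps 2 and 3; our own illustration of the approach, cf. [2], assuming Yopt holds the optimal $Y^*$):

# Recover V with Y* ≈ V'V via Cholesky, then round with a random
# unit-sphere vector r.
using LinearAlgebra, Random

function gw_rounding(Yopt::AbstractMatrix)
    n = size(Yopt, 1)
    # small diagonal shift since Yopt is typically only numerically PSD
    V = cholesky(Symmetric(Yopt) + 1e-6 * I).U
    r = normalize(randn(n))          # random point on the unit sphere
    return sign.(V' * r)             # +1 -> vertex in S, -1 -> vertex in S̄
end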

References

[1] Goemans and Williamson - Improved Approximation Algorithms for Maximum Cut and Satisfiability Problems Using Semidefinite Programs (1995)

[2] Post-processing code steps 2 and 3 from this JuMP example


This page was generated using Literate.jl.

[Regenerated plot SVG: portfolio_optimisation/11509310.svg renamed to d954136a.svg]

diff --git a/dev/examples/portfolio_optimisation/index.html b/dev/examples/portfolio_optimisation/index.html
@@ -51,7 +51,7 @@
Acc:      Anderson Type2{QRDecomp},
          Memory size = 15, RestartedMemory,
          Safeguarded: true, tol: 2.0
Setup Time: 0.15ms

Iter:	Objective:	Primal Res:	Dual Res:	Rho:
1	-1.0930e-01	6.0222e-01	6.5188e-02	1.0000e-01
@@ -62,7 +62,7 @@
Status: Solved
Iterations: 25
Optimal objective: -0.07952
Runtime: 0.001s (0.78ms)

After solving the problem, we can calculate the expected return and risk $\sigma= \sqrt{x^{* \top} \Sigma x^*}$:

x_opt = JuMP.value.(x);
 y_opt = JuMP.value.(y);
 expected_return_basic = dot(μ, x_opt)
0.09960959937666913
expected_risk_basic = sqrt(dot(y_opt, y_opt))
0.013048755030479095

Using standard deviation in the model

It is pointed out in [1] that the above problem formulation can lead to numerical problems, e.g. if $\Sigma$ is not strictly positive semidefinite. Another option is to formulate the risk constraint in terms of the standard deviation $\|M^\top x \|$ where $M M^\top = D + F F^\top$ and bound it using a second-order cone constraint:

\[\begin{array}{ll} \text{minimize} & - \mu^\top x \\ \text{subject to} & \|M^\top x\| \leq \gamma \\ & \ldots \end{array}\]

@@ -99,7 +99,7 @@
Acc:      Anderson Type2{QRDecomp},
          Memory size = 15, RestartedMemory,
          Safeguarded: true, tol: 2.0
Setup Time: 0.16ms

Iter:	Objective:	Primal Res:	Dual Res:	Rho:
1	-2.9234e-01	6.0111e-01	6.3843e-02	1.0000e-01
@@ -122,7 +122,7 @@
Status: Solved
Iterations: 326 (incl. 1 safeguarding iter)
Optimal objective: -0.1189
Runtime: 0.006s (6.28ms)

Note that the result is different from the example above because $\gamma$ scales the problem in a different way. Here it can be seen as an upper bound on the standard deviation of the portfolio.

x_opt = JuMP.value.(x);
 expected_return = dot(μ, x_opt)
0.11886638687579054

Let us verify that the bound holds:

@test norm(Mt * x_opt) <= γ
Test Passed

Pareto-optimal front

The above portfolio optimisation approach yields the optimal expected return for a given level of risk. The result is obviously impacted by the risk aversion $\gamma$ parameter. To visualise the trade-off and present the investor with an efficient Pareto optimal portfolio for their risk appetite we can compute the optimal portfolio for many choices of $\gamma$ and plot the corresponding risk-return trade-off curve.

gammas = [ 0.001, 0.01, 0.1,  0.5,  1., 3., 10, 100, 1000]
 risks = zeros(length(gammas))
 returns = zeros(length(gammas))
@@ -144,7 +144,7 @@
     returns[k] = dot(μ, x_opt)
     risks[k] = sqrt(dot(y_opt, y_opt))
 end

We can now plot the risk-return trade-off curve:

using Plots
Plots.plot(risks, returns, xlabel = "Standard deviation (risk)", ylabel = "Expected return", title = "Risk-return trade-off for efficient portfolios", legend = false)
Example block output
Note

When the model is updated in JuMP as above, the JuMP model is copied in full to COSMO. We are working to improve the interface with respect to model updates in the future. Until then, you can use Model Updates in COSMO's native interface.

Transaction costs

In the model above we assume that trading the assets is free and does not impact the market. However, this is clearly not the case in reality. To make the example more realistic consider the following cost $c_j$ associated with the trade $δ_j = x_j - x_j^0$:

\[c_j(\delta_j) = a_j |\delta_j| + b_j |\delta_j|^{3/2},\]

where the first term models the bid-ask spread and broker fees for asset $j$. The second term models the impact that our trade has on the market, which is obviously only a factor if the volume of our trade is significant. The constant $b_j$ is a function of the total volume traded in the considered time period and of the price volatility of the asset, and has to be estimated by the trader. To keep this example simple, we consider the same coefficients $a$ and $b$ for every asset. The $|\delta_j|^{3/2}$ term can be easily modeled using a power cone constraint $\mathcal{K}_{pow} = \{(x, y, z) \mid x^\alpha y^{(1-\alpha)} \geq |z|, x \geq 0, y \geq 0, 0 \leq \alpha \leq 1 \}$. In fact, this can be used to model any market impact function with exponent greater than 1. We can write the total transaction cost as $a^\top s + b^\top t$, where $s_j$ bounds the absolute value of $\delta_j$ and $t_{j}$ bounds the term $|x_j - x_j^0|^{3/2} \leq t_{j}$ using a power cone formulation: $(t_{j}, 1, x_j - x_j^0) \in \mathcal{K}_{pow}(2/3)$.

a = 1e-3
 b = 1e-1
 γ = 1.0;
 model = JuMP.Model(optimizer_with_attributes(COSMO.Optimizer, "eps_abs" => 1e-5, "eps_rel" => 1e-5));
@@ -188,7 +188,7 @@
 Acc:      Anderson Type2{QRDecomp},
           Memory size = 15, RestartedMemory,
           Safeguarded: true, tol: 2.0
Setup Time: 3.64ms
 
 Iter:	Objective:	Primal Res:	Dual Res:	Rho:
 1	-1.5637e-01	1.1398e+00	6.3250e-02	1.0000e-01
@@ -201,8 +201,8 @@
 Status: Solved
 Iterations: 80 (incl. 5 safeguarding iter)
 Optimal objective: -0.07786
Runtime: 0.087s (87.41ms)

Let's look at the expected return and the total transaction cost:

x_opt = JuMP.value.(x);
 y_opt = JuMP.value.(y);
 s_opt = JuMP.value.(s);
 t_opt = JuMP.value.(t);
expected_return = dot(μ, x_opt)
0.09665905111704713
expected_risk = dot(y_opt, y_opt)
0.00016794910539706204
transaction_cost = a * sum(s_opt) + b * sum( t_opt)
0.02732439918367407

References

[1] Mosek Case Studies


This page was generated using Literate.jl.

diff --git a/dev/examples/qp/index.html b/dev/examples/qp/index.html
@@ -38,7 +38,7 @@
Acc:      Anderson Type2{QRDecomp},
          Memory size = 8, RestartedMemory,
          Safeguarded: true, tol: 2.0
Setup Time: 0.07ms

Iter:	Objective:	Primal Res:	Dual Res:	Rho:
1	-2.0835e-01	1.2796e+00	2.3510e-01	1.0000e-01
@@ -49,7 +49,7 @@
Status: Solved
Iterations: 30 (incl. 5 safeguarding iter)
Optimal objective: 1.88
Runtime: 0.001s (0.52ms)

Alternatively, we can also use two-sided constraints with COSMO.Box:

constraint1 = COSMO.Constraint(A, zeros(3), COSMO.Box(l, u));
 
 model = COSMO.Model();
 assemble!(model, P, q, constraint1, settings = settings);
@@ -93,4 +93,4 @@
   @test abs(res_box.obj_val - 1.88) < 1e-3
 end
 nothing
Test Summary: | Pass  Total  Time
QP Problem    |    4      4  0.1s
This page was generated using Literate.jl.

diff --git a/dev/examples/sum_abs_k_eigenvalues/index.html b/dev/examples/sum_abs_k_eigenvalues/index.html index c60d8bb2..3da501e5 100644 --- a/dev/examples/sum_abs_k_eigenvalues/index.html +++ b/dev/examples/sum_abs_k_eigenvalues/index.html @@ -71,7 +71,7 @@ Status: Solved Iterations: 89 (incl. 14 safeguarding iter) Optimal objective: -79.28 -Runtime: 0.011s (10.69ms)
+Runtime: 0.011s (11.14ms)
opt_objective = JuMP.objective_value(model)
79.28110570563726

Now, we can check the solution by computing the sum of the absolute values of the 3 largest eigenvalues:

k_λ_abs = sum(sort(abs.(eigen(A).values), rev = true)[1:k])
79.28110570563712

Solve the dual

Alternatively, we can solve the dual problem:

model = JuMP.Model(optimizer_with_attributes(COSMO.Optimizer, "verbose" => true));
 @variable(model, V[1:n, 1:n], PSD);
 @variable(model, U[1:n, 1:n], PSD);
 @variable(model, z);
@@ -115,9 +115,9 @@
 Status: Solved
 Iterations: 25
 Optimal objective: 79.28
-Runtime: 0.003s (2.81ms)
+Runtime: 0.003s (2.88ms)

opt_objective = JuMP.objective_value(model)
79.28110569379851

This gives the same result.

Problem with A as variable

The problems above are mostly of illustrative value: it is obviously easier to compute the sum of the k largest eigenvalues directly from the eigenvalues of $A$. However, the results above become useful when finding $A$ itself is part of the problem. For example, assume we want to find a valid matrix $A$ under the constraint $C\, \text{vec}(A) = b$ with the minimum sum of absolute values of the k-largest eigenvalues. We can then solve the equivalent problem:

\[\begin{array}{ll} \text{minimize} & kz + Tr(U) + Tr(V) \\
\text{subject to}  & C \text{vec}(A) = b \\
                   & zI + V - A \succeq 0 \\
                   & zI + U + A \succeq 0 \\
                   & U, V \succeq 0.
\end{array}\]
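
A hedged JuMP sketch of this formulation (n, k and the data C, b are assumed to be defined, with C mapping vec(A) to a vector of the same length as b; all names are illustrative):

using JuMP, COSMO, LinearAlgebra

model = JuMP.Model(COSMO.Optimizer)
@variable(model, A[1:n, 1:n], Symmetric)
@variable(model, V[1:n, 1:n], PSD)
@variable(model, U[1:n, 1:n], PSD)
@variable(model, z)
@constraint(model, C * vec(Matrix(A)) .== b)
@constraint(model, Symmetric(z * Matrix(1.0I, n, n) + V - A) in PSDCone())
@constraint(model, Symmetric(z * Matrix(1.0I, n, n) + U + A) in PSDCone())
@objective(model, Min, k * z + tr(U) + tr(V))
JuMP.optimize!(model)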

References

[1] Alizadeh - Interior point methods in semidefinite programming with applications to combinatorial optimization (1995)


This page was generated using Literate.jl.

diff --git a/dev/examples/svm_primal/2f3f2fa2.svg b/dev/examples/svm_primal/299b1770.svg
similarity index 75%
rename from dev/examples/svm_primal/2f3f2fa2.svg
rename to dev/examples/svm_primal/299b1770.svg
index 34e928fc..775efdee 100644
[regenerated SVG plot; markup diff omitted]
diff --git a/dev/examples/svm_primal/ba6f4b3f.svg b/dev/examples/svm_primal/67499641.svg
similarity index 79%
rename from dev/examples/svm_primal/ba6f4b3f.svg
rename to dev/examples/svm_primal/67499641.svg
index ea51b001..5f796e0b 100644
[regenerated SVG plot; markup diff omitted]
diff --git a/dev/examples/svm_primal/index.html b/dev/examples/svm_primal/index.html
index e523d870..ef7cfafe 100644
--- a/dev/examples/svm_primal/index.html
+++ b/dev/examples/svm_primal/index.html
@@ -12,7 +12,7 @@
 ypos = ones(div(num_samples, 2));
 yneg = -ones(div(num_samples, 2));
# Plot dataset
 plot(Xpos[:, 1], Xpos[:, 2], color = :red, st=:scatter, markershape = :rect, label = "positive", xlabel = "x1", ylabel = "x2")
-plot!(Xneg[:, 1], Xneg[:, 2], color = :blue, st=:scatter, markershape = :circle, label = "negative")
+plot!(Xneg[:, 1], Xneg[:, 2], color = :blue, st=:scatter, markershape = :circle, label = "negative")
Example block output

with samples $(x_1, x_2, \ldots, x_m) \in \mathbb{R}^2$ and labels $y_i \in \{-1,1\}$.

Solving SVM as a QP

We want to compute the weights $w$ and bias term $b$ of the (soft-margin) SVM classifier:

\[\begin{array}{ll} \text{minimize} & \|w\|^2 + \lambda \sum_{i=1}^m \text{max}(0, 1 - y_i(w^\top x_i - b)), \end{array}\]

where $\lambda$ is a hyperparameter. This problem can be solved as a quadratic program. We can rewrite the above problem as an optimisation problem in primal form by introducing the auxiliary slack variables $t_i$:

\[t_i = \text{max}(0, 1 - y_i(w^T x_i - b)), \quad t_i \geq 0.\]

This allows us to write the problem in standard QP format:

\[\begin{array}{ll} \text{minimize} & \|w\|^2 + \lambda \sum_{i=1}^m t_i\\
\text{subject to} & y_i(w^\top x_i - b) \geq 1 - t_i, \quad i = 1,\ldots, m\\
 & t_i \geq 0, \quad i = 1,\ldots, m.
\end{array}\]
@@ -50,7 +50,7 @@
 Acc:      Anderson Type2{QRDecomp},
           Memory size = 15, RestartedMemory,
           Safeguarded: true, tol: 2.0
-Setup Time: 0.17ms
+Setup Time: 0.18ms

 Iter:	Objective:	Primal Res:	Dual Res:	Rho:
 1	-1.2711e+03	1.8883e+01	9.5020e+00	1.0000e-01
@@ -73,12 +73,12 @@
 Status: Solved
 Iterations: 335 (incl. 10 safeguarding iter)
 Optimal objective: 15.55
-Runtime: 0.003s (3.44ms)

+Runtime: 0.003s (3.49ms)

The optimal weights $w = [w_0, w_1, w_2]^\top$ (where $w_0 = b$) are:

w_opt = JuMP.value.(w)
3-element Vector{Float64}:
  0.1531242891284758
  1.0341323254762473
  0.520268546984156

Plotting the hyperplane

The separating hyperplane is defined by $w^\top x - b = 0$. To plot the hyperplane, we calculate $x_2$ over a range of $x_1$ values:

\[x_2 = (-w_1 x_1 - w_0) / w_2, \text{ where } w_0 = b.\]

x1 = -4:0.1:4;
 x2 = (-w_opt[2] * x1  .- w_opt[1]) / w_opt[3]
-plot!(x1, x2, label = "SVM separator", legend = :topleft)
+plot!(x1, x2, label = "SVM separator", legend = :topleft)
Example block output

Modelling with COSMO

The problem can also be solved by transforming it directly into COSMO's problem format. Define COSMO's $x$-variable to be $x=[w, t]^\top$ and choose $P$ and $q$ accordingly:

P = blockdiag(spdiagm(0 => ones(n)), spzeros(m, m));
 q = [zeros(n); 0.5 * λ * ones(m)];

Next we transform the first constraint $y_i (w^\top x_i - b) \geq 1 - t_i, \quad \text{for } i = 1,\ldots, m$ into COSMO's constraint format: $Ax + b \in \mathcal{K}$.

A1 = [(spdiagm(0 => y) * X) spdiagm(0 => ones(m))];
 b1 = -ones(m);
 cs1 = COSMO.Constraint(A1, b1, COSMO.Nonnegatives);

It remains to specify the constraint $t_i \geq 0, \quad \text{for } i = 1,\ldots, m$:

A2 = spdiagm(0 => ones(m));
@@ -87,4 +87,4 @@
 assemble!(model2, P, q, [cs1; cs2]);
 result2 = COSMO.optimize!(model2);
 w_opt2 = result2.x[1:3];
-@test norm(w_opt2 - w_opt, Inf) < 1e-3
+@test norm(w_opt2 - w_opt, Inf) < 1e-3
Test Passed

This page was generated using Literate.jl.

diff --git a/dev/examples/two_way_partitioning/index.html b/dev/examples/two_way_partitioning/index.html index 6f8997e9..4e3252ec 100644 --- a/dev/examples/two_way_partitioning/index.html +++ b/dev/examples/two_way_partitioning/index.html @@ -103,7 +103,7 @@ Status: Solved Iterations: 677 (incl. 77 safeguarding iter) Optimal objective: 25.62 -Runtime: 0.117s (117.5ms)

+Runtime: 0.119s (118.79ms)

Looking at the solver output you can see how the PSD constraint was decomposed into 19 PSD constraints. Let's look at the lower bound:

JuMP.objective_value(model)
-25.623709241260592

As $n$ is small, we can verify our result by finding the optimal solution through trying out all possible combinations:

function brute_force_optimisation(W, n)
    opt_obj = Inf
    opt_x = Inf * ones(n)
 
@@ -117,4 +117,4 @@
    end
    return opt_obj, opt_x
 end
-opt_obj, opt_x = brute_force_optimisation(W, n)
+opt_obj, opt_x = brute_force_optimisation(W, n)
(-25.623709242765823, [-1.0, 1.0, -1.0, -1.0, -1.0, 1.0, -1.0, 1.0, 1.0, 1.0, -1.0, -1.0, -1.0, -1.0, -1.0, -1.0, 1.0, -1.0, 1.0, 1.0])
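
The diff only shows the frame of brute_force_optimisation; a minimal sketch of the elided enumeration (assuming we minimize $x^\top W x$ over all sign vectors $x \in \{-1, 1\}^n$, consistent with the lower bound above) could look like this:

function brute_force_optimisation_sketch(W, n)
    opt_obj = Inf
    opt_x = Inf * ones(n)
    for k = 0:2^n - 1
        # use the bits of k to enumerate all 2^n sign patterns
        x = [(k >> j) & 1 == 1 ? 1.0 : -1.0 for j = 0:n-1]
        obj = x' * W * x
        if obj < opt_obj
            opt_obj = obj
            opt_x = x
        end
    end
    return opt_obj, opt_x
end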

References

[1] Boyd and Vandenberghe - Convex Optimization, Cambridge University Press (2004)


This page was generated using Literate.jl.

diff --git a/dev/getting_started/index.html b/dev/getting_started/index.html index 9a5a684e..358c86d3 100644 --- a/dev/getting_started/index.html +++ b/dev/getting_started/index.html @@ -7,7 +7,7 @@ constraint2 = COSMO.Constraint([0.0 0.0 1.0 0.0 0.0], -3.0, COSMO.Nonnegatives) constraints = [constraint1; constraint2]

The second option is to include both in one constraint:

constraint1 = COSMO.Constraint([0.0 1.0 0.0 0.0 0.0; 0.0 0.0 1.0 0.0 0.0], [5.0; -3.0], COSMO.Nonnegatives)

Another way to construct the constraint is to use the optional arguments dim, the dimension of x, and indices, the elements of x that appear in the constraint. When specifying these arguments, A and b only refer to the elements of x in indices:

constraint1 = COSMO.Constraint([1.0 0.0; 0.0 1.0], [5.0; -3.0], COSMO.Nonnegatives, 5, 2:3)

Consider as a second example the positive semidefinite constraint on a matrix $X \in \mathbb{S}_+^{3}$. Our decision variable is the vector $x$ obtained by stacking the columns of $X$. We can specify the constraint on $x$ in the following way:

\[I_9 x + \{0\}_9 \in \mathcal{S}_+^9,\]

or in Julia:

constraint1 = COSMO.Constraint(Matrix(1.0I, 9, 9), zeros(9), COSMO.PsdCone)

Several constraints can be combined in an array:

constraints = [constraint_1, constraint_2, ..., constraint_N]

It is usually enough to pass the convex_set as a type. However, some convex sets like Box, PowerCone and DualPowerCone require more information to be created. In that case you have to pass an object to the constructor, e.g.

l = -ones(2)
 u = ones(2)
-constraint = COSMO.Constraint(Matrix(1.0I, 2, 2), zeros(2), COSMO.Box(l, u))

+constraint = COSMO.Constraint(Matrix(1.0I, 2, 2), zeros(2), COSMO.Box(l, u))

or, in the case of a PowerCone, you specify the exponent alpha:

constraint = COSMO.Constraint(Matrix(1.0I, 3, 3), zeros(3), COSMO.PowerCone(0.6))

Settings

The solver settings are stored in a Settings object and can be adjusted by the user. To create a Settings object just call the constructor:

COSMO.SettingsType
COSMO.Settings{T}(; kwargs) where {T <: AbstractFloat}

Creates a COSMO settings object that is used to pass user settings to the solver.

Argument | Description | Values (default)
rho | ADMM rho step | 0.1
sigma | ADMM sigma step | 1e-6
alpha | Relaxation parameter | 1.6
eps_abs | Absolute residual tolerance | 1e-5
eps_rel | Relative residual tolerance | 1e-5
nearly_ratio | Residual tolerance ratio between MOI.NEARLY_FEASIBLE_POINT and MOI.FEASIBLE_POINT | 100
eps_prim_inf | Primal infeasibility tolerance | 1e-5
eps_dual_inf | Dual infeasibility tolerance | 1e-5
max_iter | Maximum number of iterations | 5000
verbose | Verbose printing | false
verbose_timing | Verbose timing | false
kkt_solver | Linear System solver | QdldlKKTSolver
check_termination | Check termination interval | 25
check_infeasibility | Check infeasibility interval | 40
scaling | Number of scaling iterations | 10
adaptive_rho | Automatic adaptation of step size parameter | true
adaptive_rho_max_adaptions | Max number of rho adaptions | typemax(Int64) (deactivated)
decompose | Activate to decompose chordal psd constraints | true
complete_dual | Activate to complete the dual variable after decomposition | false
merge_strategy | Choose a strategy for clique merging | CliqueGraphMerge
compact_transformation | Choose how a decomposed problem is transformed | true
time_limit | Set solver time limit in s | 0 (deactivated)
accelerator | Acceleration scheme | AndersonAccelerator{T, Type2{QRDecomp}, RestartedMemory, NoRegularizer}
accelerator_activation | Accelerator activation | ImmediateActivation
safeguard | Accelerator safeguarding | true
safeguard_tol | Safeguarding tolerance | 2.0
source

To adjust those values, either pass your preferred option and parameter as a key-value pair to the constructor or edit the corresponding field afterwards. For example if you want to enable verbose printing and increase the solver accuracy, you can type

settings = COSMO.Settings(verbose = true, eps_abs = 1e-5, eps_rel = 1e-5)
 # the following is equivalent
 settings = COSMO.Settings()
 settings.verbose = true
@@ -16,4 +16,4 @@
 y_0 = [1.0; 2.0]
 COSMO.assemble!(model, P, q, constraints, x0 = x_0, y0 = y_0)

Another option is to use

COSMO.assemble!(model, P, q, constraints)
 warm_start_primal!(model, x_0)
-warm_start_dual!(model, y_0)

+warm_start_dual!(model, y_0)

Solving

After the model has been assembled, we can solve the problem by typing

results = COSMO.optimize!(model)

Once the solver algorithm terminates, it will return a Results object that gives information about the status of the solver. If successful, it contains the optimal objective value and optimal primal and dual variables. For more information see the following section.

Results

After attempting to solve the problem, COSMO will return a result object with the following fields:

COSMO.ResultType
Result{T <: AbstractFloat}

Object returned by the COSMO solver after calling optimize!(model). It has the following fields:

Fieldname | Type | Description
x | Vector{T} | Primal variable
y | Vector{T} | Dual variable
s | Vector{T} | (Primal) set variable
obj_val | T | Objective value
iter | Int | Total number of ADMM iterations (incl. safeguarding_iter)
safeguarding_iter | Int | Number of iterations due to safeguarding of accelerator
status | Symbol | Solution status
info | COSMO.ResultInfo | Struct with more information
times | COSMO.ResultTimes | Struct with several measured times
source
COSMO.ResultInfoType
ResultInfo{T <: AbstractFloat}

Object that contains further information about the primal residual, the dual residuals and the rho updates.

source
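
For instance, the fields documented above can be read directly after a solve (a small sketch):

result = COSMO.optimize!(model)
if result.status == :Solved
    println("Optimal objective: ", result.obj_val)
    x_opt = result.x   # primal solution
    y_opt = result.y   # dual solution
end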

Status Codes

COSMO will return one of the following statuses:

Status Code | Description
:Solved | An optimal solution was found
:Unsolved | Default value
:Max_iter_reached | Solver reached iteration limit (set with Settings.max_iter)
:Time_limit_reached | Solver reached time limit (set with Settings.time_limit)
:Primal_infeasible | Problem is primal infeasible
:Dual_infeasible | Problem is dual infeasible

Timings

If settings.verbose_timing is set to true, COSMO will report the following times in result.times:

COSMO.ResultTimesType
ResultTimes

Part of the Result object returned by the solver. ResultTimes contains timing results for certain parts of the algorithm:

Time Name | Description
solver_time | Total time used to solve the problem
setup_time | Setup time = graph_time + init_factor_time + scaling_time
scaling_time | Time to scale the problem data
graph_time | Time used to perform chordal decomposition
init_factor_time | Time used for initial factorisation of the system of linear equations
factor_update_time | Sum of times used to refactor the system of linear equations due to rho
iter_time | Time spent in iteration loop
proj_time | Time spent in projection functions
post_time | Time used for post processing
update_time | Time spent in the update! function of the accelerator
accelerate_time | Time spent in the accelerate! function of the accelerator

By default COSMO only measures solver_time, setup_time and proj_time. To measure the other times set verbose_timing = true.

source

It holds: solver_time = setup_time + iter_time + factor_update_time + post_time,

setup_time = graph_time + init_factor_time + scaling_time,

proj_time is a subset of iter_time.
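
As a short sketch, the detailed timings can be enabled and inspected like this (assuming P, q and constraints are already defined):

settings = COSMO.Settings(verbose_timing = true)
model = COSMO.Model()
COSMO.assemble!(model, P, q, constraints, settings = settings)
result = COSMO.optimize!(model)

# inspect the breakdown described above
result.times.setup_time
result.times.iter_time
result.times.proj_time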

Updating the model

In some cases we want to solve a large number of similar models. COSMO allows you to update the model problem data vectors q and b after the first call of optimize!(). After changing the problem data, COSMO can reuse the factorisation step of the KKT matrix from the previous problem which can save a lot of time in the case of LPs and QPs.

COSMO.update!Function
update!(model, q = nothing, b = nothing)

Updates the model data for b or q. This can be done without refactoring the KKT matrix. The vectors will be appropriately scaled.

source
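
A minimal usage sketch (q_new and b_new are illustrative new data vectors of the same dimensions as q and b):

result1 = COSMO.optimize!(model)

# change the linear cost and constraint data; the KKT factorisation is reused
COSMO.update!(model, q = q_new, b = b_new)
result2 = COSMO.optimize!(model)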
diff --git a/dev/index.html b/dev/index.html index 55086bd0..2d7c2e49 100644 --- a/dev/index.html +++ b/dev/index.html @@ -46,4 +46,4 @@ status = JuMP.termination_status(m) X_sol = JuMP.value.(X) -obj_value = JuMP.objective_value(m)

+obj_value = JuMP.objective_value(m)

Credits

The following people are involved in the development of COSMO:

*all contributors are affiliated with the University of Oxford.

If this project is useful for your work please consider

Licence

COSMO.jl is licensed under the Apache License 2.0. For more details click here.

diff --git a/dev/jump/index.html b/dev/jump/index.html index f20319bf..92f3474b 100644 --- a/dev/jump/index.html +++ b/dev/jump/index.html @@ -14,4 +14,4 @@ # This means that `1 <= r_prim / (eps_abs * max_norm_prim * eps_rel) < nearly_ratio`

The values of r_prim, max_norm_prim, r_dual and max_norm_dual can also be accessed as the fields of the res_info struct that can be obtained as follows

res_info = MOI.get(m, COSMO.RawResult()).info

Then, the feasibility can either be checked manually as in

res_info.r_prim < eps_abs + res_info.max_norm_prim * eps_rel

or using one of the following:

COSMO.is_primal_feasible(res_info, eps_abs, eps_rel)
 COSMO.is_primal_nearly_feasible(res_info, eps_abs, eps_rel)
 COSMO.is_dual_feasible(res_info, eps_abs, eps_rel)
-COSMO.is_dual_nearly_feasible(res_info, eps_abs, eps_rel)

+COSMO.is_dual_nearly_feasible(res_info, eps_abs, eps_rel)

For more information on how to use JuMP check the JuMP documentation.

diff --git a/dev/lin_solver/index.html b/dev/lin_solver/index.html index 7298b77a..619b19d1 100644 --- a/dev/lin_solver/index.html +++ b/dev/lin_solver/index.html @@ -6,4 +6,4 @@

Linear System Solver

One major step of COSMO's ADMM algorithm is solving a linear system of equations at each iteration. Fortunately, the left-hand matrix is only dependent on the problem data and therefore only needs to be factored once. Depending on the problem class this factorisation can be the computationally most expensive step of the algorithm (LPs, QPs). See the Method section for a more detailed description of the linear system.

COSMO allows you to specify the linear system solver that performs the factorisation and back-substitution. We also support indirect system solvers, which are useful for very large problems where a factorisation becomes inefficient. The table below shows the currently supported linear system solvers:

Type | Solver | Description
CholmodKKTSolver | Cholmod | Julia's default linear system solver (from SuiteSparse)
QdldlKKTSolver | QDLDL | For more information QDLDL.jl
PardisoDirectKKTSolver | Pardiso (direct) | Pardiso 6.0 direct solver
PardisoIndirectKKTSolver | Pardiso (indirect) | Pardiso 6.0 indirect solver
MKLPardisoKKTSolver | Intel MKL Pardiso | Pardiso optimised for Intel platforms
CGIndirectKKTSolver | IterativeSolvers.jl | Conjugate Gradients on the reduced KKT linear system.
MINRESIndirectKKTSolver | IterativeSolvers.jl | MINRES on the (full) KKT linear system.
Note

To use the Pardiso and Intel MKL Pardiso solver, you have to install the respective libraries and the corresponding Julia wrapper. For more information about installing these, visit the Pardiso.jl repository page. Likewise in order to use Indirect(Reduced)KKTSolver you have to install IterativeSolvers.jl (v0.9+) and LinearMaps.jl. We are using the Requires package for lazy loading of code related to Pardiso and IterativeSolvers. This means in order to use Pardiso / IterativeSolvers, you'll have to load these packages alongside COSMO, i.e. using Pardiso and using IterativeSolvers, LinearMaps.

COSMO uses the QdldlKKTSolver linear system solver by default (consistent with the kkt_solver default in the Settings table). You can specify a different solver in the settings by using the kkt_solver keyword and the respective type:

settings = COSMO.Settings(kkt_solver = CholmodKKTSolver)
 

COSMO also allows you to pass in solver-specific options with the with_options(solver_type, args...; kwargs...) syntax. For example, if you want to use Pardiso with verbose printing use the following syntax:

settings = COSMO.Settings(kkt_solver = with_options(PardisoDirectKKTSolver, msg_level_on = true))

Likewise, CGIndirectKKTSolver and MINRESIndirectKKTSolver are also parameterizable with with_options(solver_type, args...; kwargs...) and accept the following arguments:

Argument | Description | Values (default)
tol_constant::T and tol_exponent::T | Parameter that defines the solution tolerance for the iterative solvers across iterations. In particular, the solution tolerance at every iteration is defined as $\text{tol\_constant} / \text{iteration}^{\text{tol\_exponent}}$ | 1.0, 1.5
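
For example, a hedged sketch that configures the CG-based solver with these tolerances (the values shown simply restate the defaults explicitly):

using COSMO, IterativeSolvers, LinearMaps

settings = COSMO.Settings(kkt_solver = with_options(CGIndirectKKTSolver, tol_constant = 1.0, tol_exponent = 1.5))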

This also works if you want to use this configuration with JuMP:

model = JuMP.Model(optimizer_with_attributes(COSMO.Optimizer, "kkt_solver" => with_options(PardisoDirectKKTSolver, msg_level_on = true)));
 

Or alternatively:

model = JuMP.Model(COSMO.Optimizer);
-set_optimizer_attribute(model, "kkt_solver", with_options(PardisoDirectKKTSolver, msg_level_on = true));
+set_optimizer_attribute(model, "kkt_solver", with_options(PardisoDirectKKTSolver, msg_level_on = true)); diff --git a/dev/literate/build/arbitrary_precision/index.html b/dev/literate/build/arbitrary_precision/index.html index e9255614..a22a2f3f 100644 --- a/dev/literate/build/arbitrary_precision/index.html +++ b/dev/literate/build/arbitrary_precision/index.html @@ -39,7 +39,7 @@ Acc: Anderson Type2{QRDecomp}, Memory size = 5, RestartedMemory, Safeguarded: true, tol: 2.0 -Setup Time: 1118.09ms +Setup Time: 1114.85ms Iter: Objective: Primal Res: Dual Res: Rho: 1 -7.8808e-03 1.0079e+00 2.0033e+02 1.0000e-01 @@ -50,5 +50,5 @@ Status: Solved Iterations: 25 Optimal objective: 1.88 -Runtime: 3.248s (3247.58ms)

+Runtime: 3.279s (3279.47ms)

Moreover, notice that when no type parameter is specified, all objects default to Float64:

model = COSMO.Model()
A COSMO Model with Float precision: Float64

Two limitations to arbitrary precision:

  • Since we call the LAPACK function syevr for eigenvalue decompositions, we currently only support solving problems with PSD constraints in Float32 and Float64.
  • We suggest using the pure Julia QDLDL linear system solver (kkt_solver = QdldlKKTSolver) when working with arbitrary-precision types, as some of the other available solvers don't support all precisions.
Note

If you want to use COSMO directly with MathOptInterface, you can use: COSMO.Optimizer{<: AbstractFloat} as your optimizer. Again, the problem data precision of your MathOptInterface-model has to agree with the optimizer's precision.
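
A hedged sketch of a model assembled entirely in BigFloat (the 2-variable problem data is illustrative):

using COSMO, LinearAlgebra

model = COSMO.Model{BigFloat}()
P = BigFloat[4 1; 1 2]
q = BigFloat[1; 1]
constraint = COSMO.Constraint(Matrix{BigFloat}(I, 2, 2), zeros(BigFloat, 2), COSMO.Nonnegatives)
settings = COSMO.Settings{BigFloat}(kkt_solver = QdldlKKTSolver)   # QDLDL supports generic precisions
COSMO.assemble!(model, P, q, [constraint], settings = settings)
result = COSMO.optimize!(model)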


This page was generated using Literate.jl.

diff --git a/dev/literate/build/custom_cone/index.html b/dev/literate/build/custom_cone/index.html index c0c90dd2..eadf3f18 100644 --- a/dev/literate/build/custom_cone/index.html +++ b/dev/literate/build/custom_cone/index.html @@ -93,7 +93,7 @@ Acc: Anderson Type2{QRDecomp}, Memory size = 6, RestartedMemory, Safeguarded: true, tol: 2.0 -Setup Time: 0.08ms +Setup Time: 0.09ms Iter: Objective: Primal Res: Dual Res: Rho: 1 -2.7216e+01 1.7200e+01 6.0000e-01 1.0000e-01 @@ -104,7 +104,7 @@ Status: Solved Iterations: 26 (incl. 1 safeguarding iter) Optimal objective: -7 -Runtime: 0.015s (15.22ms)
+Runtime: 0.016s (15.68ms)
x_opt = JuMP.value.(x)
3-element Vector{Float64}:
  2.999990000199988
  2.000000000116415
- 2.0000099998433143

+ 2.0000099998433143

You can see in the solver output that indeed a set Nonpositives of dim: 2 was used.

The discussed cone of Nonpositives is of course trivial, but the ability to define new cones and constraints for COSMO can be very powerful for modelling complex problems.
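
As a hedged illustration of the interface described in the Performance Tips section (a struct subtyping COSMO.AbstractConvexSet plus a COSMO.project! method), the example's nonpositive cone might be defined along these lines:

using COSMO

# sketch: a custom cone for {x | x ≤ 0} with its Euclidean projection
struct Nonpositives{T} <: COSMO.AbstractConvexSet{T}
    dim::Int
end

function COSMO.project!(x::AbstractVector{T}, C::Nonpositives{T}) where {T}
    @. x = min(x, zero(T))   # clip positive entries to zero
    return nothing
end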

References

[1] Garstka et al. - COSMO: A conic operator splitting method for convex conic problems (2020)


This page was generated using Literate.jl.

diff --git a/dev/literate/build/portfolio_model_updates/ab8679a2.svg b/dev/literate/build/portfolio_model_updates/e65363eb.svg
similarity index 86%
rename from dev/literate/build/portfolio_model_updates/ab8679a2.svg
rename to dev/literate/build/portfolio_model_updates/e65363eb.svg
index 1c5e216b..524baed2 100644
[regenerated SVG plot; markup diff omitted]
diff --git a/dev/literate/build/portfolio_model_updates/index.html b/dev/literate/build/portfolio_model_updates/index.html
index 70bf1e62..7a78a5af 100644
--- a/dev/literate/build/portfolio_model_updates/index.html
+++ b/dev/literate/build/portfolio_model_updates/index.html
@@ -66,4 +66,4 @@
    end
 end
 solve_repeatedly!(returns, risks, gammas, model, k, n, μ);

We can now plot the risk-return trade-off curve:

using Plots
-Plots.plot(risks, returns, xlabel = "Standard deviation (risk)", ylabel = "Expected return", title = "Risk-return trade-off for efficient portfolios", legend = false)
+Plots.plot(risks, returns, xlabel = "Standard deviation (risk)", ylabel = "Expected return", title = "Risk-return trade-off for efficient portfolios", legend = false)
Example block output

This page was generated using Literate.jl.

diff --git a/dev/logistic_regression/index.html b/dev/logistic_regression/index.html
index 869f7c7d..027821f6 100644
--- a/dev/logistic_regression/index.html
+++ b/dev/logistic_regression/index.html
@@ -3,4 +3,4 @@
diff --git a/dev/method/index.html b/dev/method/index.html
index dc31a428..3cc909a5 100644
--- a/dev/method/index.html
+++ b/dev/method/index.html
@@ -33,4 +33,4 @@
 \end{aligned}\]

The absolute and relative tolerances $\epsilon_{\mathrm{abs}}$ and $\epsilon_{\mathrm{rel}}$ can be set by the user by specifying eps_abs and eps_rel. Furthermore, the user can adjust the number of iterations after which the convergence criteria are checked (check_termination).

Infeasibility detection

COSMO uses conditions based on separating hyperplanes to detect infeasible problems. The conditions for COSMO's problem format have been developed in [1]. Define the convex set $\mathcal{C} = -\mathcal{K} + \{b\}$; then we can use the following infeasibility conditions:

\[\begin{aligned}
\mathcal{P} &= \left\{x \in \mathbb{R}^n \mid Px = 0, \, Ax \in \mathcal{C}^{\infty}, \, \langle q,x \rangle < 0 \right\} \\
\mathcal{D} &= \left\{y \in \mathbb{R}^m \mid A^\top y = 0, \, \sigma_\mathcal{C}(y) < 0 \right\}.
\end{aligned}\]

The existence of some $y \in \mathcal{D}$ is a certificate that the problem is primal infeasible, while the existence of some $x \in \mathcal{P}$ is a certificate for dual infeasibility. COSMO regularly checks above conditions to detect infeasible problems. If the detection is successful, the solver terminates and returns the status codes :Primal_infeasible or :Dual_infeasible. COSMO checks the conditions every check_infeasibility iterations, which can be adjusted by the user.

References

[1] Banjac, G. et al. Infeasibility detection in the alternating direction method of multipliers for convex optimization. Preprint, 2017.

diff --git a/dev/performance/index.html b/dev/performance/index.html
index cb8579b0..feec1860 100644
--- a/dev/performance/index.html
+++ b/dev/performance/index.html
@@ -3,4 +3,4 @@

Performance Tips

There are a number of ways to improve the performance of the solver on a particular problem. If you are not satisfied with the performance, you first have to determine why the solver is slow. Is it because

  • it is the first time you ran it in the current Julia session,
  • the solver needs a lot of iterations (convergence), or
  • each iteration or the initial factorisation is slow (computational performance)?

Let's see how each point can be addressed.

First run

Whenever a new Julia session is started, the first run will trigger a compilation of all functions based on their arguments used in your script. This will slow the first execution of COSMO down. After that Julia will call the fast compiled functions. To get around this, you can either keep your current Julia session open and discard the first run. Alternatively, if your problem is very large, you could solve a small version of your problem first to trigger the compilation. Another option is to use PackageCompiler to save compiled functions into a sysimage that can be loaded at startup.

Solver Timings

It is often instructive to look at the detailed solver timing results for your problem. This can reveal where most of the time is spent. To achieve this, run COSMO with the setting verbose_timing = true. After solving the problem with result = COSMO.optimize!(model) you can look at result.times for a breakdown of the times spent in different parts of the algorithm, see Timings for more details. Especially take a look at the ratio of factorisation time and iteration time. If you use JuMP to solve the problem, you can take a look at the timings with backend(model).optimizer.model.optimizer.results.times.

Convergence

It is possible that COSMO converges slowly, i.e. needs a large number of iterations, for your problem given its default parameters.

Parameters

You could try changing any of the following parameters (a combined sketch follows this list):

  • rho: The initial algorithm step parameter has a large influence on the convergence. Try different values between 1e-5 and 10.
  • adaptive_rho = false: You can try to disable the automatic rho adaption and use different rho values.
  • adaptive_rho_interval: This specifies after how many iterations COSMO tries to adapt the rho parameter. You can also set adaptive_rho_interval = 0 which adapts the rho parameter after the time spent iterating passes 40% of the factorisation time. This is currently the default in OSQP and works well with QPs.
  • alpha = 1.0: This disables the over-relaxation that is used in the algorithm. We recommend values between 1.0 and 1.6.
  • scaling = 0: This disables the problem scaling.
  • eps_abs and eps_rel: Check the impact of modifying the stopping accuracies.
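
A short sketch combining several of these parameters (the values are illustrative starting points, not recommendations):

settings = COSMO.Settings(rho = 1e-2, adaptive_rho = false, alpha = 1.6, scaling = 0, eps_abs = 1e-4, eps_rel = 1e-4)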

Use warm starting

The number of iterations can be dramatically decreased by providing a good initial guess for x, s and y. Examples where warm starting is commonly used are model predictive control and portfolio backtests, see Warm starting.
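
A minimal warm-starting sketch (x_prev and y_prev are assumed to be solutions of a closely related, previously solved problem):

COSMO.assemble!(model, P, q, constraints)
warm_start_primal!(model, x_prev)   # initial guess for x
warm_start_dual!(model, y_prev)     # initial guess for y
result = COSMO.optimize!(model)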

Computational performance

If the convergence of the algorithm is not an issue, there are still a number of steps you can take to make COSMO faster.

Intel MKL BLAS/LAPACK

We experienced significant performance improvements on Intel CPUs if Julia is compiled with MKL BLAS. This is because Julia's linear algebra functions will use Intel MKL BLAS and LAPACK routines that are optimised for Intel hardware. The effect is especially significant for SDPs because most of the time is spent in the LAPACK function syevr. If you are running Julia on Intel hardware, an easy way to compile Julia with MKL is to add and build the MKL package, see MKL.jl. To verify your current BLAS vendor you can use julia> LinearAlgebra.BLAS.vendor().

Linear system solver

COSMO uses QDLDL.jl as the default linear system solver. In our experience this seems to be a competitive choice until about 1e5 - 1e6 nonzeros in the constraint matrix. After that it is worth trying one of the indirect system solvers, such as CG or MINRES. Furthermore, we also recommend trying Pardiso (or MKLPardiso) for problems of that dimension. More details can be found here: Linear System Solver.

Custom cones

In some cases the computations can be sped up if certain constraints in the problem allow the implementation of a fast projection function. We allow the user to define their own custom convex cone with a corresponding projection function. The custom cone has to be defined as struct CustomCone{T} <: COSMO.AbstractConvexSet{T}. Furthermore, the user has to define a function that projects an input vector x onto the custom cone, i.e. function COSMO.project!(x::AbstractVector{T}, C::CustomCone{T}) where {T} ... end.

Multithreading

COSMO allows the execution of the projection step for multiple constraints in parallel using Julia's multithreading features. This is currently not enabled in the tagged release because of stability issues in earlier Julia versions. To use multithreading checkout the branch with_multi_threading, which we keep in sync with the latest tagged release. This can be installed via Julia's package manager with pkg> add COSMO#with_multi_threading. Afterwards, before starting Julia, set export JULIA_NUM_THREADS=[NUMBER_PHYSICAL_CORES_HERE]. In Julia you can verify the number of threads with julia> Threads.nthreads().

Notice that the extra overhead for multithreading can slow the solver down if the problem is small. However, we noticed significant performance improvements if the problem contained multiple positive semidefinite constraints or when one large constraint was decomposed. In that case it also helps to restrict the number of BLAS threads per Julia thread with julia> BLAS.set_num_threads(1) to prevent oversubscription of the available cores.

Multithreading can also be used in the factorisation step if the Pardiso or MKLPardiso solver are selected. This is only advisable for constraint matrices with more than 1e5 nonzeros.

Chordal decomposition and Clique merging

When solving large structured and sparse SDPs significant performance improvements are achievable if the problem is passed to COSMO in the right way. This means the solver has to be able to infer the structure of the positive semidefinite variable from the constraint. See the section on Chordal Decomposition for more details. In some cases the primal SDP doesn't allow decomposition but the dual SDP does, consider the Maximum Cut Problem and the Relaxed Two-Way Partitioning Problem for examples.

If the problem is decomposable it is also worth experimenting with different clique merging strategies to see how they impact the performance. More details can be found here: Clique merging.