-
Notifications
You must be signed in to change notification settings - Fork 13
2. QP problem
Along the next week I will implement a solver to the following trust-region QP problem (which is a substep of the main solver):
min 1/2 x.T H x + g.T x
subject to: A x - b = 0
||x|| <= delta
lb <= x <= ub
Note that the above problem may not have a solution, since the solutions of A x - b = 0
may not satisfy
the constraints ||x|| <= delta
and lb <= x <= ub
. And this need to be taken into account.
The approach that the algorithm will take when this happens is to solve the following problem instead:
min 1/2 x.T H x + g.T x
subject to: A x - b = r
||x|| <= delta
lb <= x <= ub
for which r
is chosen such that the problem is feasible. This approach (due to Byrd and Omojokun) solve the problem in two steps: 1. First it choose a suitable r
(normal step). 2. And then it solve the above problem for the given r
(tangential problem).
A variation of the dogleg procedure will be used to find the normal step and the previously implemented projected CG will be used after a few modifications (in order to take into account the constraints
lb <= x <= ub
) in order to solve tangential problem.
The implementation will be based in [1] and [2].
- Theoretical revision
- Read sections 19.1-19.6 from Nocedal and Wrigth book [1].
- Read paper [2];
- Implementation
- Normal step: Implement dogleg variation described on [2] (this variation is significantly different from the dogleg implemented in Scipy).
- Tangential problem: Modify the conjugate gradient in order to take into account the constraints
lb <= x <= ub
. - Implement Trust-region QP solver using (normal step) + (tangential step).
- Testing
- Test normal step.
- Test tangential step.
- Test QP solver.
[1] Nocedal, Jorge, and Stephen J. Wright. "Numerical optimization" Second Edition (2006).
Today I solved the normal step problem. The normal step can be reformulated as the problem:
min || A x + b ||^2
subject to: ||x|| < Delta
lb <= x <= ub
For which A
is a m x n
matrix for which n > m
. I implemented a variation of the dogleg described in [2] in order to solve this problem. I implemented auxiliary functions that find the intersection between line/sphere and line/rectangular-box that I think it might be useful for the tangential problem as well
-
Question for Nikolay: I believe you may have implemented similar subproblem solvers for you nonlinear least-squares solver. Do you think yours implementation can be used here? (Remember that
A
doesn't have a full column rank) Maybe we can compare the performance of the proposed dogleg with yours implementation later.-
I have to admit I'm still behind on the theory (I even decided to clear up some fundamentals). Am I correct that your main source is [2], what other references from there do you use?
- For this part I am using reference [2] together with Nocedal book [1]. Chapters 18 and 19 from Nocedal/Wright covers most of the theory behind the implementation and the details are provided in [2].
-
As for your question: as far as I understand you are solving a linear least squares problem approximately, i.e. its accurate solution will require several iterations of the proposed dogleg steps (and it's not particular clear whether it will converge at all). What I did was 2 things 1) dogleg iterations in the non-linear solver, but a trust-region was chosen rectangular as well to make the geometry simple 2) accurate solver for linear problems, using different algorithms (one of them is not really suitable for large scale). So I don't see that the two approaches are directly comparable.
- It was just an idea. If you had implemented something similar as a substep of your solver it would be nice to compare it with the one I just implemented. Nevertheless, there is no harm in not doing it and I will move on to the tangential problem.
-
I have performed tests on the modified dogleg method, corrected a few bugs and improve its documentation.
I also started to implement the tangential problem. In order to do that I am modifying the projected CG method in order to take into account the box constraints lb <= x <= ub
.
Today I:
- Finish the modification of the projected CG method in order to take into account the box constraints
lb <= x <= ub
. - Design some test and do experiments to the box constrained
projected CG
.
The inclusion of box constraints is not very intuitive since we can enter and leave the feasible region several times during the iterations. This doesn't happens when considering spherical constraints, for which there are theorems showing that if an iterations is infeasible all the subsequent ones will also be.
After several trials I came up with a clean implementation that I think is working well. It is not very easy to design test to this problem, but I did my best and tested most of the behaviours in some small problems. I will see if I manage to run it on large problems latter.
From Nikolay
Inclusion of box constraints is not very intuitive indeed. I see that in [2] they apply the same ad-hoc truncation approach, so it is not a mistake for sure to follow it. Overall I have an impression that their consideration of "box" constraints is an afterthought, i. e. it is applied one sided to only slack variables and is not supposed to happen very often.
Overall I feel like that the proposed approach is not ideal algorithmically and a bit messy: there are two quite similar problems --- normal and tangential steps, which are solved by different methods, which are both not particularly elegant (mixture of box and spherical constraints). I understand the reasons though. But it looks like that possibly there is some space for improvements.
One sort of reasonable suggestion is to use rectangular trust-regions. Would it make sense to consider this (they mention this option in the paper as well)? As far as I understand the Steihaug approach won't have this property then, that we can stop iterations after the current iterate leaves the trust-region. So it looks like it won't solve all the issue you mentioned ("we can enter and leave the feasible region several times during the iterations").
Overall I guess the best bet is to follow through with the current algorithm, and try to make it work if any issues arise. I believe there is no new "silver bullet" efficient and elegant algorithm to tackle our subproblems.
- Antonio: The combination of box and spherical constraints is a bit messy indeed. I think the authors of the papers find it as well. In p. 885 the state:
We have experimented with other forms of trust region, in particular with box-shaped trust
regions defined by a ``l_infinity`` but so far the current method (box+spherical) appears
to be the most appropriate for our algorithm.
Beside this downside, I like the algorithm basic idea, particularly the trust-region approach.
I combined the normal and tangential steps under a common interface that receives the parameters of the following QP problem:
min 1/2 x.T H x + g.T x
subject to: A x - b = 0
||x|| <= delta
lb <= x <= ub
and return an (approximate) solution to it. Regardless of the problem being feasible or not.
- Split the code into several files;
- Remove too many blank lines;
- Punctuated comments;
- Improved docstrigs and comments;
- Additional tests and more complete doctoring to
qp_subproblem
.