Loading [MathJax]/extensions/TeX/AMSmath.js

Monday, November 8, 2021

Some animals are more equal than others: least-squares vs Euclidean norm

A user recently reached out to MOSEK support because the solver returned an UNKNOWN problem/solution status on their (constrained) Least-squares problem, that was implemented in CVXPY . The solver log looked as shown in the first MOSEK log shown below.

The solution we offered was to minimize the Euclidean norm instead of the sum-of-squares. Theoretically, both models are equivalent; their optimal points are the same. But as a great man once said: in theory, theory and practice are the same. In practice, well... Take a look at the MOSEK log for the problem that minimizes the Euclidean norm (second MOSEK log)

Problem
Name : Sum-of-squares
Objective sense : max
Type : CONIC (conic optimization problem)
Constraints : 5626
Cones : 1
Scalar variables : 6527
Matrix variables : 0
Integer variables : 0
Optimizer - threads : 8
Optimizer - solved problem : the dual
Optimizer - Constraints : 200
Optimizer - Cones : 1
Optimizer - Scalar variables : 465 conic : 107
Optimizer - Semi-definite variables: 0 scalarized : 0
Factor - setup time : 0.00 dense det. time : 0.00
Factor - ML order time : 0.00 GP order time : 0.00
Factor - nonzeros before factor : 6689 after factor : 6805
Factor - dense dim. : 0 flops : 4.41e+05
ITE PFEAS DFEAS GFEAS PRSTATUS POBJ DOBJ MU TIME
0 0.0e+00 3.6e+02 2.0e+00 0.00e+00 -1.000000000e+00 0.000000000e+00 1.0e+00 0.01
1 5.0e-08 1.1e+00 1.1e-01 -1.00e+00 4.960026469e+02 1.660020636e+02 3.0e-03 0.01
2 1.4e-08 2.9e-01 5.7e-02 -9.99e-01 2.387440435e+03 1.174714928e+03 8.2e-04 0.01
3 4.6e-09 1.0e-01 3.3e-02 -9.98e-01 1.124214900e+04 7.687603705e+03 2.8e-04 0.01
4 2.5e-09 3.5e-02 2.0e-02 -9.95e-01 4.667631344e+04 3.657669361e+04 9.8e-05 0.01
5 4.5e-09 1.9e-02 1.4e-02 -9.82e-01 8.903325500e+04 7.010582668e+04 5.2e-05 0.01
6 4.0e-09 1.2e-02 1.1e-02 -9.71e-01 4.832354936e+05 4.559769337e+05 3.5e-05 0.01
7 5.5e-08 1.7e-03 4.2e-03 -9.68e-01 3.072374301e+06 2.890515517e+06 4.9e-06 0.01
8 1.1e-06 9.5e-04 2.9e-03 -8.64e-01 8.289313029e+06 7.991216139e+06 2.7e-06 0.01
9 4.3e-07 4.4e-04 2.0e-03 -9.25e-01 8.626878505e+06 7.970892218e+06 1.2e-06 0.01
10 7.5e-07 2.9e-04 1.4e-03 -7.95e-01 3.305192666e+07 3.225377427e+07 8.0e-07 0.02
11 1.3e-06 6.7e-05 4.6e-04 -6.37e-01 1.301100655e+08 1.286101798e+08 1.9e-07 0.02
12 1.6e-05 1.9e-05 8.9e-05 2.57e-03 2.774563813e+08 2.767697848e+08 5.4e-08 0.02
13 3.2e-06 3.9e-06 5.0e-06 1.04e+00 3.569880554e+08 3.569349076e+08 1.1e-08 0.02
14 1.8e-04 8.7e-07 7.4e-07 1.47e+00 3.727958420e+08 3.727903994e+08 2.2e-09 0.02
15 1.1e-04 4.4e-07 8.2e-08 9.96e-01 3.738918735e+08 3.738891138e+08 1.1e-09 0.02
16 1.1e-04 4.3e-07 5.3e-08 9.99e-01 3.739307883e+08 3.739281148e+08 1.1e-09 0.02
17 1.1e-04 4.3e-07 5.3e-08 1.00e+00 3.739307883e+08 3.739281148e+08 1.1e-09 0.02
18 1.1e-04 4.3e-07 5.3e-08 1.00e+00 3.739307883e+08 3.739281148e+08 1.1e-09 0.02
19 5.6e-05 2.2e-07 2.1e-08 1.00e+00 3.745231585e+08 3.745218142e+08 5.2e-10 0.02
20 5.6e-05 2.2e-07 2.1e-08 1.00e+00 3.745231585e+08 3.745218142e+08 5.2e-10 0.02
21 5.6e-05 2.2e-07 2.1e-08 1.00e+00 3.745231585e+08 3.745218142e+08 5.2e-10 0.02
Optimizer terminated. Time: 0.02
Interior-point solution summary
Problem status : UNKNOWN
Solution status : UNKNOWN
Primal. obj: 3.7452315852e+08 nrm: 2e+08 Viol. con: 3e-10 var: 0e+00 cones: 0e+00
Dual. obj: 3.7452181420e+08 nrm: 4e+08 Viol. con: 0e+00 var: 6e+03 cones: 0e+00
view raw leastsq_log.txt hosted with ❤ by GitHub
Problem
Name : Euclidean-norm
Objective sense : max
Type : CONIC (conic optimization problem)
Constraints : 5626
Cones : 1
Scalar variables : 6526
Matrix variables : 0
Integer variables : 0
Optimizer - threads : 8
Optimizer - solved problem : the dual
Optimizer - Constraints : 199
Optimizer - Cones : 1
Optimizer - Scalar variables : 464 conic : 106
Optimizer - Semi-definite variables: 0 scalarized : 0
Factor - setup time : 0.00 dense det. time : 0.00
Factor - ML order time : 0.00 GP order time : 0.00
Factor - nonzeros before factor : 6584 after factor : 6700
Factor - dense dim. : 0 flops : 4.30e+05
ITE PFEAS DFEAS GFEAS PRSTATUS POBJ DOBJ MU TIME
0 0.0e+00 3.6e+02 2.0e+00 0.00e+00 0.000000000e+00 1.000000000e+00 1.0e+00 0.01
1 6.8e-09 4.2e+01 6.9e-01 -1.00e+00 1.501036940e+01 8.574572891e+00 1.2e-01 0.01
2 3.8e-09 2.4e+01 5.1e-01 -9.99e-01 6.279790020e+01 4.982675952e+01 6.7e-02 0.01
3 2.8e-09 1.7e+01 4.3e-01 -9.73e-01 3.024148977e+02 2.841987192e+02 4.8e-02 0.01
4 1.2e-09 7.2e+00 2.7e-01 -9.47e-01 1.403012164e+03 1.361535218e+03 2.0e-02 0.01
5 5.1e-10 3.2e+00 1.5e-01 -7.72e-01 4.375311681e+03 4.307120804e+03 8.9e-03 0.01
6 1.3e-10 8.4e-01 3.4e-02 -3.16e-01 1.285288466e+04 1.280108793e+04 2.3e-03 0.01
7 2.2e-11 1.3e-01 1.3e-03 8.77e-01 1.841713195e+04 1.841402952e+04 3.8e-04 0.01
8 3.3e-12 2.1e-02 5.4e-05 1.48e+00 1.926071277e+04 1.926050868e+04 5.8e-05 0.01
9 4.7e-13 2.9e-03 2.1e-06 1.38e+00 1.935907086e+04 1.935905559e+04 8.2e-06 0.01
10 5.0e-12 3.0e-04 5.8e-08 1.40e+00 1.937066316e+04 1.937066210e+04 8.4e-07 0.01
11 1.1e-12 5.9e-06 1.3e-10 1.16e+00 1.937169004e+04 1.937169003e+04 1.6e-08 0.01
12 1.5e-11 7.5e-10 6.6e-15 9.99e-01 1.937170752e+04 1.937170752e+04 1.0e-10 0.01
Optimizer terminated. Time: 0.02
Interior-point solution summary
Problem status : PRIMAL_AND_DUAL_FEASIBLE
Solution status : OPTIMAL
Primal. obj: 1.9371707520e+04 nrm: 3e+03 Viol. con: 7e-17 var: 1e-11 cones: 0e+00
Dual. obj: 1.9371707520e+04 nrm: 2e+04 Viol. con: 0e+00 var: 3e-04 cones: 3e-12
view raw norm_log.txt hosted with ❤ by GitHub

Note the following differences between the two MOSEK logs:

  • The problem status in the norm problem is primal and dual feasible and the solution status is optimal, in contrast to the least-squares problem.
  • The infinity-norm of the solutions in the norm problem are smaller by nearly 5 orders of magnitude compared to the solution of the least-squares problem. This is usually a desirable trait, as explained in the  debugging section of MOSEK docs 
  • The number of iterations taken by the norm problem is far fewer.
  • The objective value of the norm problem is essentially the square root of the least-squares problem (the least-squares problem was not solved to optimality, hence the discrepancy) 
Observe that the solution is only guaranteed to be unique if the objective is strictly convex. The objective function is strictly convex if and only if the A matrix (following  this notation) is of full column-rank. If this is not the case, then the two problems might output different optimal solution vectors (as is the case above), but as Orwell would put it, one of those solutions will be more equal than the other

So, at least for CVXPY-MOSEK users (and MOSEK Fusion users!), we recommend the Euclidean norm in place of the sum-of-squares, wherever possible.