Tuesday, June 17, 2014

What if the solver stalls?

In this post we will discuss why the solver get sometimes stalls and how does this relates with the solution quality. What does we mean for stall?

The solver get stalled whenever it cannot make any sensible progress towards the optimal solution.

An iteration of a continuous optimizer in MOSEK consists of
\[ x^+ = x + \alpha d_x\]
where
  • $x$:  Is the current guess for the optimal solution.
  • $d_x$: Is a direction computed in some way by the optimizer.
  • $\alpha$: Is nonnegative step size.
  • $x^+$: Is the updated guess for the optimal solution.
The interpretation is that the optimizer computes a search direction $d_x$ and then takes a step in that direction. The search direction is computed such that when moving in that direction the current solution will be moved closer towards an optimal solution. Normally the step size $\alpha$ is chosen such that the new trial solution $x^+$ is as close as possible to the optimal solution with respect to some merit function. 

If the search direction is badly chosen then the optimal step size might be very close to $0$, in which case the algorithm is stalled. So why does this happens in practice? Typically the search direction $d_x$ is given as the solution of a system of linear equations, say


\[H d_x = h,\]


where it can be proven that if the linear system is solved accurately then good progress is obtained. Unfortunately in practice due to fact that computations are done in finite precision then the search direction may be of poor quality leading to a stall.

To summarize due to numerical issues MOSEK may not be able to compute a good search direction and if that leads to an extremely short step then MOSEK stops and says STALLED. Thus it does not make sense to continue because the subsequent iterations will also leads to small steps and no progress at all.

Observe that when a stall occurs then usually the current solution is very close to an optimal solution and many cases the reported solution is good enough for most practical purposes. Recall that usually the optimization problem solved is approximation of the true problem and even in the best case MOSEK will only report an approximated solution to approximated problem. 

Let's visualize the algorithm execution using a simplified scheme: the feasible region (primal and dual) is the yellowish one and the solution is the green dot on its corner. Algorithm execution generates a sequence of iterates, the green dots, that hopefully will end up in 10.

Interior-point algorithm execution sketch.





























For each iterates, a normalized primal and dual feasibility violation is computed, as well as a measure of optimality violation. Let's call them $r_p,r_d$ and $r_g$ respectively. Ideally, $||(r_p,r_d,r_g)||$ should be driven to zero, but in practice the solver stops when
\[ ||(r_p,r_d,r_g)|| \leq \epsilon_f.\]

Thus $\epsilon_f$ defines a region in which a solution is optimal, i.e. the purple region in the figure. When the solver terminates it provides a solution with a certain solution status. Hopefully, it will be optimal. Otherwise, the solver may still be able to provide a useful solution: for this reason, and to improve flexibility, MOSEK denotes some solutions as near-optimal: the solution would be optimal if we scale up $\epsilon_f$ by a factor $\alpha>1$, i.e.

\[ ||(r_p,r_d,r_g)|| \leq \epsilon_f\alpha.\]

This corresponds to the reddish region. As the algorithm proceeds, we reach the near-optimal region at step (9) and then the optimal one at step (10). If all goes well, we stop in the latter. If we get stalled at (9) we will get a near-optimal solution. But what if the solver get stalled at step (8)? The solution status is unknown in this case.

In general, if the solver is stalled, the solution status is still meaningful since the solver is able to check the solution feasibility and the residual values. In particular a solution could be near-optimal. In our example, step (8) is not even near optimal and could be not very useful, but it is still worth to check it.
In case of stall, user should
  • look at solution summary: provides the primal and dual violation under the column PFEAS and DFEAS, and the MU column measure the optimality violation;
  • retrieve the solution status from the solver.
What can be done to avoid stalls? It is hard to say something about in general. A few good rules to follow are:
  • Avoid nearly feasible or nearly infeasible problems.
  • Make sure the problem is scaled reasonably.
  • Make sure the primal and dual solutions are reasonably bounded i.e. not too large in absolute size.
Users should consider revising their model is stalling happens regularly. If modeling does not help, you are encouraged to contact our support.