Friday, March 4, 2016

Reformulating a non convex MINLP as MISOCP


Few weeks ago we read a nice and interesting post here.

The author took inspiration from a previous post on an open forum to show how the intuitive formulation of a simple problem can lead to a non convex MINLP when a MIQP can be used instead.

In this post we will
  • show how to reformulate the proposed MIQP problem as an MISOCP,
  • implement the MISOCP model using MOSEK Fusion API,
  • derive a more compact MISOCP formulation and
  • implement the second model using MOSEK Fusion API as well.

The problem is the following: we are given $N$ power unit (in the original post ovens) that must be used to provide a required amount $T$ of energy. Each plant contributes an amount $x_i$ of energy.

The inefficiency of each plant is given by

\[
g_i(x_i) = \left\lbrace \begin{array}{ll} 100 \sum_ i \frac{(a_i - x_i)^2}{a_i} & x_i>0,\\
0& x_i=0.  \end{array}\right.
\]

where $a_i$ is the optimal operational level of the plant. Moving away from $a_i$ leads to inefficient use of the plant. Here there is an example for $a_i = 10 $.





The overall inefficiency is then

\[
f(x) = \sum_i g_i(x_i).
\]

We want to minimize the inefficiency still providing the amount $T$ of energy.

A simple approach is to use an indicator variable $y_i$ to represent the $i-$th plant activation. Assuming we know an upper bound $M_i$ to the power each plant can produce, the model takes the form

\begin{equation}
\begin{array}{lll}
\min & 100\sum_i y_i \frac{(a_i - x_i)^2}{a_i} \\
&\\
s.t. & \sum_i x_i = T\\
& x_i \leq y_i M_i & i=1, \ldots, N\\
& x_i \geq 0 & i=1, \ldots, N,
& y_i \in \{0,1 \} & i=1, \ldots, N.\\ \end{array}
\end{equation}


As noted in the original post, this is a non-convex NLP. A simple reformulation allows to remove the bilinear term $x_i y_i$:



\begin{equation}
\begin{array}{lll}
\min & \sum_i \frac{d_i^2}{a_i} \\
&\\
s.t. & \sum_i x_i = T\\
& x_i \leq y_i M_i & i=1, \ldots, N\\
& -M_i(1- y_i) \leq s_i \leq M_i(1-y_i)  & i=1, \ldots, N\\
& d_i = a_i - x_i +s_i  & i=1, \ldots, N\\
& x_i \geq 0 & i=1, \ldots, N,\\
& y_i \in \{0,1 \} & i=1, \ldots, N.\\ \end{array}
\end{equation}


The trick is simple: an active plant $y_i=1$ implies $s_i=0$ and therefore $d_i = a_i - x_i$. But for $y_i =0$ the slack variable $s_i$ is free, while $x_i=0$. This implies that $d_i$ is free to assume any value, and since we are minimizing its second power, it will be driven to zero as well.

In conic form the problem reads



\begin{equation}
\begin{array}{lll}
\min & t\\
&\\
s.t. & \sum_i x_i = T\\
& x_i \leq y_i M_i & i=1, \ldots, N\\
& -M_i(1- y_i) \leq s_i \leq M_i(1-y_i)  & i=1, \ldots, N\\
& d_i = a_i - x_i +s_i  & i=1, \ldots, N\\
& (1/2, t, d_1/\sqrt{a_1}, \ldots,  d_n/\sqrt{a_N}) \in \mathcal{Q_r}\\
& x_i \geq 0 & i=1, \ldots, N,\\
& y_i \in \{0,1 \} & i=1, \ldots, N.\\ \end{array}
\end{equation}

Those not used to conic formulation may refer to our modeling cookbook.

The Fusion implementation is the following 





The code is very compact and readable, no need for many comments. The only operators one may wonder about are

  • vstack - it returns an array stacking its argument vertically,  
  • mulElm - it performs element wise multiplications.


Now we ask: is there a more compact formulation? 

The answer is yes: the purpose of the slack variables is to remove the contribution to the inefficiency that a deactivated oven will provide anyway. That contribution is simply equal to $a_i$. Therefore we only need to subtract $a_i$ if $y_i=1$.



\begin{equation}
\begin{array}{lll}
\min & t - \sum_i a_i ( 1- y_i)\\
&\\
s.t. & \sum_i x_i = T\\
& x_i \leq y_i M_i & i=1, \ldots, N\\
& (1/2, t, (a_1 -x_1)/\sqrt{a_1}, \ldots, (a_N - x_N)/\sqrt{a_N}) \in \mathcal{Q_r}\\
& x_i \geq 0 & i=1, \ldots, N,\\
& y_i \in \{0,1 \} & i=1, \ldots, N.\\ \end{array}
\end{equation}



which translates to the following code:





As one can see, the code is more compact, clean and readable. Fusion helps implementing conic optimization models in short time with a remarkable simplicity.


Last observation: the proposed problem is indeed simple to formulate, but the intuition leads to a non-convex MINLP. That is typical, but luckily in this case with a bit of work we can actually move to a MIQP formulation first, and a MISOCP then. And the final MISOCP comes with no additional variables at all.

Investing time on the model pays!