Few weeks ago we read a nice and interesting post here.
The author took inspiration from a previous post on an open forum to show how the intuitive formulation of a simple problem can lead to a non convex MINLP when a MIQP can be used instead.
In this post we will
- show how to reformulate the proposed MIQP problem as an MISOCP,
- implement the MISOCP model using MOSEK Fusion API,
- derive a more compact MISOCP formulation and
- implement the second model using MOSEK Fusion API as well.
The problem is the following: we are given $N$ power unit (in the original post ovens) that must be used to provide a required amount $T$ of energy. Each plant contributes an amount $x_i$ of energy.
The inefficiency of each plant is given by
\[
g_i(x_i) = \left\lbrace \begin{array}{ll} 100 \sum_ i \frac{(a_i - x_i)^2}{a_i} & x_i>0,\\
0& x_i=0. \end{array}\right.
\]
where $a_i$ is the optimal operational level of the plant. Moving away from $a_i$ leads to inefficient use of the plant. Here there is an example for $a_i = 10 $.
The overall inefficiency is then
\[
f(x) = \sum_i g_i(x_i).
\]
We want to minimize the inefficiency still providing the amount $T$ of energy.
A simple approach is to use an indicator variable $y_i$ to represent the $i-$th plant activation. Assuming we know an upper bound $M_i$ to the power each plant can produce, the model takes the form
\begin{equation}
\begin{array}{lll}
\min & 100\sum_i y_i \frac{(a_i - x_i)^2}{a_i} \\
&\\
s.t. & \sum_i x_i = T\\
& x_i \leq y_i M_i & i=1, \ldots, N\\
& x_i \geq 0 & i=1, \ldots, N,
& y_i \in \{0,1 \} & i=1, \ldots, N.\\ \end{array}
\end{equation}
As noted in the original post, this is a non-convex NLP. A simple reformulation allows to remove the bilinear term $x_i y_i$:
\begin{equation}
\begin{array}{lll}
\min & \sum_i \frac{d_i^2}{a_i} \\
&\\
s.t. & \sum_i x_i = T\\
& x_i \leq y_i M_i & i=1, \ldots, N\\
& -M_i(1- y_i) \leq s_i \leq M_i(1-y_i) & i=1, \ldots, N\\
& d_i = a_i - x_i +s_i & i=1, \ldots, N\\
& x_i \geq 0 & i=1, \ldots, N,\\
& y_i \in \{0,1 \} & i=1, \ldots, N.\\ \end{array}
\end{equation}
The trick is simple: an active plant $y_i=1$ implies $s_i=0$ and therefore $d_i = a_i - x_i$. But for $y_i =0$ the slack variable $s_i$ is free, while $x_i=0$. This implies that $d_i$ is free to assume any value, and since we are minimizing its second power, it will be driven to zero as well.
In conic form the problem reads
\begin{equation}
\begin{array}{lll}
\min & t\\
&\\
s.t. & \sum_i x_i = T\\
& x_i \leq y_i M_i & i=1, \ldots, N\\
& -M_i(1- y_i) \leq s_i \leq M_i(1-y_i) & i=1, \ldots, N\\
& d_i = a_i - x_i +s_i & i=1, \ldots, N\\
& (1/2, t, d_1/\sqrt{a_1}, \ldots, d_n/\sqrt{a_N}) \in \mathcal{Q_r}\\
& x_i \geq 0 & i=1, \ldots, N,\\
& y_i \in \{0,1 \} & i=1, \ldots, N.\\ \end{array}
\end{equation}
The Fusion implementation is the following
The code is very compact and readable, no need for many comments. The only operators one may wonder about are
Now we ask: is there a more compact formulation?
- vstack - it returns an array stacking its argument vertically,
- mulElm - it performs element wise multiplications.
Now we ask: is there a more compact formulation?
The answer is yes: the purpose of the slack variables is to remove the contribution to the inefficiency that a deactivated oven will provide anyway. That contribution is simply equal to $a_i$. Therefore we only need to subtract $a_i$ if $y_i=1$.
\begin{equation}
\begin{array}{lll}
\min & t - \sum_i a_i ( 1- y_i)\\
&\\
s.t. & \sum_i x_i = T\\
& x_i \leq y_i M_i & i=1, \ldots, N\\
& (1/2, t, (a_1 -x_1)/\sqrt{a_1}, \ldots, (a_N - x_N)/\sqrt{a_N}) \in \mathcal{Q_r}\\
& x_i \geq 0 & i=1, \ldots, N,\\
& y_i \in \{0,1 \} & i=1, \ldots, N.\\ \end{array}
\end{equation}
which translates to the following code:
As one can see, the code is more compact, clean and readable. Fusion helps implementing conic optimization models in short time with a remarkable simplicity.
Last observation: the proposed problem is indeed simple to formulate, but the intuition leads to a non-convex MINLP. That is typical, but luckily in this case with a bit of work we can actually move to a MIQP formulation first, and a MISOCP then. And the final MISOCP comes with no additional variables at all.
Investing time on the model pays!