Few weeks ago we read a nice and interesting post here.

The author took inspiration from a previous post on an open forum to show how the intuitive formulation of a simple problem can lead to a non convex MINLP when a MIQP can be used instead.

In this post we will

- show how to reformulate the proposed MIQP problem as an MISOCP,
- implement the MISOCP model using MOSEK Fusion API,
- derive a more compact MISOCP formulation and
- implement the second model using MOSEK Fusion API as well.

The problem is the following: we are given $N$ power unit (in the original post ovens) that must be used to provide a required amount $T$ of energy. Each plant contributes an amount $x_i$ of energy.

The inefficiency of each plant is given by

\[

g_i(x_i) = \left\lbrace \begin{array}{ll} 100 \sum_ i \frac{(a_i - x_i)^2}{a_i} & x_i>0,\\

0& x_i=0. \end{array}\right.

\]

where $a_i$ is the optimal operational level of the plant. Moving away from $a_i$ leads to inefficient use of the plant. Here there is an example for $a_i = 10 $.

The overall inefficiency is then

\[

f(x) = \sum_i g_i(x_i).

\]

We want to minimize the inefficiency still providing the amount $T$ of energy.

A simple approach is to use an indicator variable $y_i$ to represent the $i-$th plant activation. Assuming we know an upper bound $M_i$ to the power each plant can produce, the model takes the form

\begin{equation}

\begin{array}{lll}

\min & 100\sum_i y_i \frac{(a_i - x_i)^2}{a_i} \\

&\\

s.t. & \sum_i x_i = T\\

& x_i \leq y_i M_i & i=1, \ldots, N\\

& x_i \geq 0 & i=1, \ldots, N,

& y_i \in \{0,1 \} & i=1, \ldots, N.\\ \end{array}

\end{equation}

As noted in the original post, this is a non-convex NLP. A simple reformulation allows to remove the bilinear term $x_i y_i$:

\begin{equation}

\begin{array}{lll}

\min & \sum_i \frac{d_i^2}{a_i} \\

&\\

s.t. & \sum_i x_i = T\\

& x_i \leq y_i M_i & i=1, \ldots, N\\

& -M_i(1- y_i) \leq s_i \leq M_i(1-y_i) & i=1, \ldots, N\\

& d_i = a_i - x_i +s_i & i=1, \ldots, N\\

& x_i \geq 0 & i=1, \ldots, N,\\

& y_i \in \{0,1 \} & i=1, \ldots, N.\\ \end{array}

\end{equation}

The trick is simple: an active plant $y_i=1$ implies $s_i=0$ and therefore $d_i = a_i - x_i$. But for $y_i =0$ the slack variable $s_i$ is free, while $x_i=0$. This implies that $d_i$ is free to assume any value, and since we are minimizing its second power, it will be driven to zero as well.

In conic form the problem reads

\begin{equation}

\begin{array}{lll}

\min & t\\

&\\

s.t. & \sum_i x_i = T\\

& x_i \leq y_i M_i & i=1, \ldots, N\\

& -M_i(1- y_i) \leq s_i \leq M_i(1-y_i) & i=1, \ldots, N\\

& d_i = a_i - x_i +s_i & i=1, \ldots, N\\

& (1/2, t, d_1/\sqrt{a_1}, \ldots, d_n/\sqrt{a_N}) \in \mathcal{Q_r}\\

& x_i \geq 0 & i=1, \ldots, N,\\

& y_i \in \{0,1 \} & i=1, \ldots, N.\\ \end{array}

\end{equation}

The Fusion implementation is the following

The code is very compact and readable, no need for many comments. The only operators one may wonder about are

Now we ask: is there a more compact formulation?

- vstack - it returns an array stacking its argument vertically,
- mulElm - it performs element wise multiplications.

Now we ask: is there a more compact formulation?

The answer is yes: the purpose of the slack variables is to remove the contribution to the inefficiency that a deactivated oven will provide anyway. That contribution is simply equal to $a_i$. Therefore we only need to subtract $a_i$ if $y_i=1$.

\begin{equation}

\begin{array}{lll}

\min & t - \sum_i a_i ( 1- y_i)\\

&\\

s.t. & \sum_i x_i = T\\

& x_i \leq y_i M_i & i=1, \ldots, N\\

& (1/2, t, (a_1 -x_1)/\sqrt{a_1}, \ldots, (a_N - x_N)/\sqrt{a_N}) \in \mathcal{Q_r}\\

& x_i \geq 0 & i=1, \ldots, N,\\

& y_i \in \{0,1 \} & i=1, \ldots, N.\\ \end{array}

\end{equation}

which translates to the following code:

As one can see, the code is more compact, clean and readable. Fusion helps implementing conic optimization models in short time with a remarkable simplicity.

Last observation: the proposed problem is indeed simple to formulate, but the intuition leads to a non-convex MINLP. That is typical, but luckily in this case with a bit of work we can actually move to a MIQP formulation first, and a MISOCP then. And the final MISOCP comes with no additional variables at all.

Investing time on the model pays!