Few weeks ago we read a nice and interesting post here.
The author took inspiration from a previous post on an open forum to show how the intuitive formulation of a simple problem can lead to a non convex MINLP when a MIQP can be used instead.
In this post we will
- show how to reformulate the proposed MIQP problem as an MISOCP,
- implement the MISOCP model using MOSEK Fusion API,
- derive a more compact MISOCP formulation and
- implement the second model using MOSEK Fusion API as well.
The problem is the following: we are given N power unit (in the original post ovens) that must be used to provide a required amount T of energy. Each plant contributes an amount x_i of energy.
The inefficiency of each plant is given by
g_i(x_i) = \left\lbrace \begin{array}{ll} 100 \sum_ i \frac{(a_i - x_i)^2}{a_i} & x_i>0,\\ 0& x_i=0. \end{array}\right.
where a_i is the optimal operational level of the plant. Moving away from a_i leads to inefficient use of the plant. Here there is an example for a_i = 10 .
The overall inefficiency is then
f(x) = \sum_i g_i(x_i).
We want to minimize the inefficiency still providing the amount T of energy.
A simple approach is to use an indicator variable y_i to represent the i-th plant activation. Assuming we know an upper bound M_i to the power each plant can produce, the model takes the form
\begin{equation} \begin{array}{lll} \min & 100\sum_i y_i \frac{(a_i - x_i)^2}{a_i} \\ &\\ s.t. & \sum_i x_i = T\\ & x_i \leq y_i M_i & i=1, \ldots, N\\ & x_i \geq 0 & i=1, \ldots, N, & y_i \in \{0,1 \} & i=1, \ldots, N.\\ \end{array} \end{equation}
As noted in the original post, this is a non-convex NLP. A simple reformulation allows to remove the bilinear term x_i y_i:
\begin{equation} \begin{array}{lll} \min & \sum_i \frac{d_i^2}{a_i} \\ &\\ s.t. & \sum_i x_i = T\\ & x_i \leq y_i M_i & i=1, \ldots, N\\ & -M_i(1- y_i) \leq s_i \leq M_i(1-y_i) & i=1, \ldots, N\\ & d_i = a_i - x_i +s_i & i=1, \ldots, N\\ & x_i \geq 0 & i=1, \ldots, N,\\ & y_i \in \{0,1 \} & i=1, \ldots, N.\\ \end{array} \end{equation}
The trick is simple: an active plant y_i=1 implies s_i=0 and therefore d_i = a_i - x_i. But for y_i =0 the slack variable s_i is free, while x_i=0. This implies that d_i is free to assume any value, and since we are minimizing its second power, it will be driven to zero as well.
In conic form the problem reads
\begin{equation} \begin{array}{lll} \min & t\\ &\\ s.t. & \sum_i x_i = T\\ & x_i \leq y_i M_i & i=1, \ldots, N\\ & -M_i(1- y_i) \leq s_i \leq M_i(1-y_i) & i=1, \ldots, N\\ & d_i = a_i - x_i +s_i & i=1, \ldots, N\\ & (1/2, t, d_1/\sqrt{a_1}, \ldots, d_n/\sqrt{a_N}) \in \mathcal{Q_r}\\ & x_i \geq 0 & i=1, \ldots, N,\\ & y_i \in \{0,1 \} & i=1, \ldots, N.\\ \end{array} \end{equation}
The Fusion implementation is the following
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
def oven1(n, M, A, T): | |
AS = [ 1./math.sqrt(a) for a in A ] | |
with Model() as m: | |
x = m.variable('x', n, Domain.greaterThan(0.) ) | |
y = m.variable('y', n, Domain.inRange(0.,1.0), Domain.isInteger() ) | |
s = m.variable('s', n, Domain.unbounded() ) | |
d = m.variable('d', n, Domain.unbounded() ) | |
t = m.variable('t', 1, Domain.unbounded() ) | |
m.constraint( Expr.sum(x) , Domain.equalsTo(T)) | |
m.constraint( Expr.sub( x, Expr.mulElm( M, y) ), Domain.lessThan(0.) ) | |
m.constraint( Expr.add( s, Expr.mulElm( M, y) ), Domain.lessThan(M) ) | |
m.constraint( Expr.sub( Expr.mulElm( M, y), s ), Domain.lessThan(M) ) | |
m.constraint( Expr.sub( Expr.add(d,x), s), Domain.equalsTo( A) ) | |
m.constraint( Expr.vstack( 0.5, t, Expr.mulElm( d, AS) ), Domain.inRotatedQCone() ) | |
m.objective( ObjectiveSense.Minimize, t) | |
m.setLogHandler(sys.stdout) | |
m.solve() | |
return x.level(), y.level() |
The code is very compact and readable, no need for many comments. The only operators one may wonder about are
Now we ask: is there a more compact formulation?
- vstack - it returns an array stacking its argument vertically,
- mulElm - it performs element wise multiplications.
Now we ask: is there a more compact formulation?
The answer is yes: the purpose of the slack variables is to remove the contribution to the inefficiency that a deactivated oven will provide anyway. That contribution is simply equal to a_i. Therefore we only need to subtract a_i if y_i=1.
\begin{equation} \begin{array}{lll} \min & t - \sum_i a_i ( 1- y_i)\\ &\\ s.t. & \sum_i x_i = T\\ & x_i \leq y_i M_i & i=1, \ldots, N\\ & (1/2, t, (a_1 -x_1)/\sqrt{a_1}, \ldots, (a_N - x_N)/\sqrt{a_N}) \in \mathcal{Q_r}\\ & x_i \geq 0 & i=1, \ldots, N,\\ & y_i \in \{0,1 \} & i=1, \ldots, N.\\ \end{array} \end{equation}
which translates to the following code:
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
def oven2(N, M, A, T): | |
AS = [ 1./math.sqrt(a) for a in A ] | |
with Model() as m: | |
x = m.variable('x', N, Domain.greaterThan(0.) ) | |
y = m.variable('y', N, Domain.inRange(0.,1.0), Domain.isInteger() ) | |
t = m.variable('t', 1, Domain.unbounded() ) | |
m.constraint( Expr.sum(x) , Domain.equalsTo(T) ) | |
m.constraint( Expr.sub( x, Expr.mulElm( M, y) ), Domain.lessThan(0.) ) | |
m.constraint( Expr.vstack( 0.5, t, Expr.mulElm(AS, Expr.sub(A,x)) ), Domain.inRotatedQCone() ) | |
m.objective( ObjectiveSense.Minimize, Expr.sub( t, Expr.dot( A, Expr.sub(1., y ) ) ) ) | |
m.setLogHandler(sys.stdout) | |
m.solve() | |
return x.level(), y.level() |
As one can see, the code is more compact, clean and readable. Fusion helps implementing conic optimization models in short time with a remarkable simplicity.
Last observation: the proposed problem is indeed simple to formulate, but the intuition leads to a non-convex MINLP. That is typical, but luckily in this case with a bit of work we can actually move to a MIQP formulation first, and a MISOCP then. And the final MISOCP comes with no additional variables at all.
Investing time on the model pays!