## Friday, July 24, 2015

### MOSEK on GitHub

We are glad to announce that MOSEK has opened its own page on GitHub.

The main purpose is to host repositories where we will publish additional material such as

• utilities,
• tutorials,
• simple toolboxes,

that we develop as part of our daily activities. GitHub is the natural place to make such small tools available to the general public: it is free, many people already have an account, and it works well!

We think it is another great way to share our work with users and help them get the most out of MOSEK.

You are encouraged to submit issues, propose enhancements, and give feedback! Nothing is more valuable than user experience.

The code available on GitHub comes as-is, and it does not fall under the terms of the MOSEK EULA. Suitable licenses are included: typically the MIT license for code and the Creative Commons BY license for tutorials and other training material.

## Monday, May 4, 2015

### Price list change from June 1st, 2015

With effect from June 1st, 2015, our price list will change:

• PTS floating license: \$1950 (annual maintenance fee \$487.50)
• PTS server license: \$7800 (annual maintenance fee \$1950)

All other prices remain unchanged.

The change reflects the new functionality and interfaces provided in the PTS package.

## Friday, April 17, 2015

### MOSEK and platform support

At MOSEK we are occasionally asked whether we can support new platforms such as:
• dSPACE MicroAutoBox
• HP-UX (previously supported)
• various mainframe operating systems
• QNX (we did a port in 2014)
• Solaris (previously supported)
• VxWorks
Although we are quite sure we can adapt MOSEK to run on almost any platform, we are reluctant to accommodate such requests. We will now explain why.

Currently MOSEK is available for the following platforms:
• Linux, 32 and 64 bit, on Intel x86.
• Mac OS X, 64 bit, on Intel x86.
• Windows, 32 and 64 bit, on Intel x86.
These are well-supported mainstream platforms with plenty of tools, such as a good C tool chain and Python support.

In general, to port MOSEK to a new platform, the platform:
• should preferably run a Linux-like operating system.
• must support double-precision floating-point computations.
• should have a mature C compiler tool chain with all the usual libraries.
• must provide dynamic memory allocation.
• should provide a good, optimized, sequential BLAS/LAPACK library if performance is important. This is particularly important if support for semidefinite optimization is required.
In practice, if it is not possible to cross-compile MOSEK for the platform under Linux or Windows, then we also require that:
• Python 2.7 is available, so we can run our SCons-based build system.
• Perforce, our source control system, is supported on the platform.
• some other tools that we have currently forgotten about are available.
Note that most of our customers of course expect us to test and support our software for a long time. In that case we should be able to hook the platform into our Jenkins continuous integration system, so we can build and test automatically and regularly. If that is not possible, the cost of supporting the platform increases.

MOSEK is a complex piece of software, i.e. it is not just a few thousand lines of code that can be compiled with a make script. This implies that porting it to a new platform usually requires at least one month of work from a senior developer, and in many cases more. In particular, if the platform is far from the standard 64-bit x86 Linux platform, a port is cumbersome and time-consuming.

To summarize: we can without doubt adapt MOSEK to run on almost any platform. However, porting MOSEK to a new platform is a costly undertaking. Therefore, if you want us to port MOSEK to a new platform, you need to present a decent business case showing that we will recover our costs with reasonable probability.

As a postscript, let us tell you about our experience with QNX, to which we recently ported MOSEK. QNX is a real-time OS that looks pretty much like Linux. You can get QNX to run as a virtual machine under Windows (it seems under Windows only), so in theory a port should be easy. However, the keyboard and mouse support is less than optimal, not to say lousy, and it is actually very hard to get the network connection working, so moving files to and from the virtual machine is a pain. After quite a lot of hassle we got a QNX build cross-compiled on Linux, verified that it worked under QNX, and gave it to a potential customer. Unfortunately, the potential customer said he has not yet had time to test it.

## Monday, March 30, 2015

### Easter Holidays

The MOSEK team will be on vacation during the Easter period!

Support and sales will be closed from Thursday the 2nd of April 2015 to Monday the 6th of April 2015, both included.

Normal service will resume on Tuesday the 7th of April 2015.

We wish a happy Easter to all of you!

## Monday, March 16, 2015

### Machine Learning with MOSEK Fusion: linear soft-margin Support Vector Machine

Machine learning (ML) has become a widespread tool in many applications that affect our everyday life. In many cases, at the very core of these techniques lies an optimization problem. In this post we show how one of the most common ML techniques, the support vector machine (SVM), is defined as the solution of a conic optimization problem.

The aim of this post is two-fold:
1. To show that simple SVM models can be efficiently solved with a commercial general-purpose solver such as MOSEK.
2. To show that the MOSEK Fusion API yields very compact and expressive code.
We are not claiming that a general-purpose solver is the tool of choice for solving SVM problems. There is a huge literature on specialized algorithms and software packages (see for instance LibSVM) that exploit their particular structure.

In words, the basic SVM model can be stated as:
"We are given a set of $m$ points in an $n$-dimensional space, partitioned into two groups. Find, if it exists, the separating hyperplane of the two subsets with the largest margin, i.e. as far as possible from the points."

In mathematical form, this means determining a hyperplane $w^T x = b$ that separates the two sets of points, leaving the largest possible margin. It can be proved that the margin is given by $2/\|w\|$.

Therefore, we need to maximize $2/\|w\|$ with respect to $w,b$, under the constraint that points of the same class must lie on the same side of the hyperplane. Denoting by $x_i\in\mathbb{R}^n$ the $i$-th observation and assuming that each point is given a label $y_i\in \{-1,+1\}$, it is easy to see that separation is equivalent to:

$y_i (w^T x_i -b)\geq 1.$
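As a quick numerical sanity check of this condition, here is a plain-Python sketch on a made-up toy data set (the points, labels, and helper function are our illustration, not from the original post):

```python
# Toy sanity check (hypothetical data) of the separation condition
#   y_i * (w^T x_i - b) >= 1.

def separates(w, b, points, labels):
    """True if every labeled point satisfies y * (w.x - b) >= 1."""
    dot = lambda u, v: sum(ui * vi for ui, vi in zip(u, v))
    return all(y * (dot(w, x) - b) >= 1 for x, y in zip(points, labels))

points = [(2.0, 2.0), (3.0, 1.5), (-2.0, -2.0), (-1.5, -3.0)]
labels = [+1, +1, -1, -1]

# The hyperplane x1 + x2 = 0, written as w = (1/2, 1/2), b = 0,
# separates the two clusters with margin 2/||w|| = 2*sqrt(2).
print(separates((0.5, 0.5), 0.0, points, labels))  # True
```

Note that the same hyperplane scaled down, e.g. $w=(0.1,0.1)$, no longer satisfies the condition: the normalization $y_i(w^T x_i - b)\geq 1$ fixes the scale of $w$, which is what makes $2/\|w\|$ a meaningful margin.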

A simple example in two dimensions illustrates this idea.

It is easy to show (see [1]) that the separating hyperplane is the solution of the following optimization problem:

$\begin{array} {lll} \min & \frac{1}{2} \| w \|^2 &\\ & y_i (w^T x_i -b ) \geq 1& i=1,\ldots,m \end{array}$

If a solution exists, $w,b$ define the separating hyperplane and the sign of $w^Tx -b$ can be used to decide the class in which a point $x$ falls.

To allow more flexibility, the soft-margin SVM classifier is often used instead. It allows violations of the classification: to this end, a non-negative slack variable is added to each linear constraint and penalized in the objective function.

$\begin{array} {lll} \min_{b,w} & \frac{1}{2} \| w \|^2 + C \sum_{i=1}^{m} \xi_i&\\ & y_i (w^T x_i -b ) \geq 1 - \xi_ i& i=1,\ldots,m\\ & \xi_i \geq 0 & i=1,\ldots,m \end{array}$

In matrix form we have

$\begin{array} {lll} \min_{b, w, \xi} & \frac{1}{2} \| w \|^2 + C \mathbf{e}^T \xi\\ & y \star ( X w - b \mathbf{e}) + \xi \geq \mathbf{e}\\ & \xi \geq 0 \end{array}$

where $\star$ denotes the componentwise product and $\mathbf{e}$ is a vector with all components equal to one. The constant $C>0$ acts both as a scaling factor and as a weight. Varying $C$ yields different trade-offs between accuracy and robustness.
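The matrix form is just the row-wise constraints stacked; the following plain-Python check (made-up numbers, for illustration only) confirms the two forms coincide:

```python
# Toy check that the vectorized constraint
#   y * (X w - b e) + xi >= e   (componentwise)
# is the same as the row-wise form  y_i (w^T x_i - b) >= 1 - xi_i.

X = [(1.0, 2.0), (-1.0, 0.5), (0.0, -2.0)]   # rows are observations
y = [+1, -1, -1]
w, b = (0.5, -0.25), 0.1
xi = [0.0, 0.2, 0.0]

dot = lambda u, v: sum(a * c for a, c in zip(u, v))

rowwise = [y[i] * (dot(w, X[i]) - b) >= 1 - xi[i] for i in range(len(X))]
vector  = [y[i] * (dot(w, X[i]) - b) + xi[i] >= 1 for i in range(len(X))]

print(rowwise == vector)  # True: the same inequalities, rearranged
```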

Implementing the matrix formulation of the soft-margin SVM in Fusion is very easy. We only need to cast the problem in conic form, which in this case only involves converting the quadratic term of the objective function into a conic constraint:

$\begin{array} {ll} \min_{b, w, \xi,t} & t + C \mathbf{e}^T \xi& \\ & \xi + y \star ( X w - b \mathbf{e}) \geq \mathbf{e}\\ & (1,t,w) \in \mathcal{Q_r}^{n+2}\\ & \xi \geq 0 \end{array}$

where $\mathcal{Q}_r^{n+2}$ denotes a rotated quadratic cone of dimension $n+2$.
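A rotated quadratic cone contains the triples $(x_1, x_2, z)$ with $2 x_1 x_2 \geq \|z\|^2$ and $x_1, x_2 \geq 0$, so fixing the first coordinate to $1$ makes $(1,t,w) \in \mathcal{Q}_r^{n+2}$ equivalent to $t \geq \frac{1}{2}\|w\|^2$, exactly the quadratic term we moved out of the objective. A quick plain-Python check of this equivalence (toy numbers, our illustration):

```python
def in_rotated_cone(x1, x2, z):
    """Membership test for the rotated quadratic cone: 2*x1*x2 >= ||z||^2."""
    return x1 >= 0 and x2 >= 0 and 2 * x1 * x2 >= sum(v * v for v in z)

w = (0.6, -0.8)                       # ||w||^2 = 1.0
t_min = 0.5 * sum(v * v for v in w)   # smallest feasible t is 0.5*||w||^2

print(in_rotated_cone(1.0, t_min, w))         # True: t = 0.5*||w||^2 attains it
print(in_rotated_cone(1.0, t_min - 0.01, w))  # False: any smaller t is infeasible
```

Since the objective minimizes $t$, at the optimum $t$ takes exactly the value $\frac{1}{2}\|w\|^2$, so the conic model is equivalent to the original quadratic one.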

In our Fusion implementation we allow the user to specify a set of values for $C$, and we solve one problem for each. We show Python code for the sake of simplicity; much better performance can be achieved with the Java or C# APIs.

Let us now assume we are given an array y of labels, a matrix X in which the input data are stored row-wise, and a set of values of C we want to test. The Fusion implementation of our conic model is the following:
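The original listing has not survived in this archive; as a hedged sketch (assuming the MOSEK Fusion API for Python, `mosek.fusion`, with a working MOSEK installation; the function name and argument layout are our assumptions, not the original code), the model above might be written roughly as follows:

```python
# Hedged sketch of the soft-margin SVM in MOSEK Fusion (Python).
# Assumed inputs: y, a length-m array of +-1 labels (as floats);
# X, an m-by-n data matrix (rows = observations); Cs, penalty values to try.
from mosek.fusion import Model, Domain, Expr, ObjectiveSense, Variable

def softmargin_svm(y, X, Cs):
    m, n = len(X), len(X[0])
    results = []
    with Model("soft-margin SVM") as M:
        w  = M.variable("w",  n, Domain.unbounded())
        b  = M.variable("b",  1, Domain.unbounded())
        t  = M.variable("t",  1, Domain.unbounded())
        xi = M.variable("xi", m, Domain.greaterThan(0.0))

        # (1, t, w) in a rotated quadratic cone encodes t >= 0.5*||w||^2
        M.constraint(Expr.vstack(1.0, t, w), Domain.inRotatedQCone())

        # xi + y * (X w - b e) >= e, componentwise
        ex = Expr.sub(Expr.mul(X, w), Variable.repeat(b, m))
        M.constraint(Expr.add(xi, Expr.mulElm(y, ex)), Domain.greaterThan(1.0))

        for C in Cs:
            # only the objective changes between solves
            M.objective(ObjectiveSense.Minimize,
                        Expr.add(t, Expr.mul(C, Expr.sum(xi))))
            M.solve()
            results.append((b.level()[0], list(w.level())))
    return results
```

This is a sketch of the technique rather than a verbatim reconstruction; exact class and method names may differ between Fusion versions.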

The code is so expressive that it deserves very few comments:

1. A sequence of problems can be solved by changing only the objective function inside the loop over the values of C: most of the model is built only once.
2. A linear expression can be stored and reused.
3. The construct Variable.repeat(b,m) logically repeats the variable $b$, i.e. forms an array of the form $(b,b,b,\ldots,b)^T \in \mathbb{R}^m$.
4. The solution is recovered by the level method: if an optimal solution has not been found, an exception is thrown.

Let's make a few tests. We generate two sets of random two-dimensional points, each from a Gaussian distribution: the first set centered at $(1.0,1.0)$, the second at $(-1.0,-1.0)$. We experiment with standard deviations $\sigma$ of $1/2$ and $1$.

With $\sigma=1/2$ we obtain separable sets of points, and for $C=1$ the classifier separates the two clusters cleanly.

For $\sigma=1$ separability is usually not possible. Larger values of $C$ emphasize good classification, but reduce the margin and hence robustness. For instance, we may obtain the following:

The black and red lines correspond to $C=1$ and $C=2^{10}$.

MOSEK solves these problems in almost no time.

Further improvements can be obtained by exploiting sparsity: in many cases the input data are sparse, i.e. most entries of the matrix X are zero. If that is the case, then Fusion, and therefore MOSEK, can exploit this property. You only need to feed the function a Fusion sparse matrix:
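The snippet itself did not survive in this archive; assuming Fusion's triplet-based sparse-matrix factory (`mosek.fusion.Matrix`), it was presumably along the lines of:

```python
X = Matrix.sparse(m, n, Xi, Xj, Xv)
```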

where Xi,Xj,Xv are the arrays that store the sparse representation of X in triplet form. This will save memory and reduce the computational burden.

So, to summarize: you can easily build your own linear SVM classifier with MOSEK Fusion! Give it a try and play with it. Fusion demonstrates once again its simplicity and expressive power.

[1] Cortes, Corinna, and Vladimir Vapnik. "Support-vector networks." Machine Learning 20.3 (1995): 273-297.

## Monday, January 12, 2015

We are happy to announce that MOSEK will be among the sponsors of the forthcoming:

2015 Mixed Integer Programming Workshop,
June 1st – 4th, 2015
at
The Gleacher Center, Chicago, IL

The Mixed Integer Programming (MIP) workshop series is designed to bring together the integer programming research community in an annual meeting. MIP 2015, hosted by Chicago Booth at The Gleacher Center, is the 12th workshop in the series.