Wednesday, March 6, 2024

Ferrari, factorizing, and portfolios in conic form

1. Suppose we want to solve the equation $$\begin{equation*}(x^2-3x+2)(x-3)(x-4)=0.\end{equation*}$$ The straightforward approach is to multiply everything out: $$\begin{equation*}x^4-10x^3+35x^2-50x+24=0\end{equation*}$$ and use the algorithm for solving a general 4-th degree equation discovered by Lodovico Ferrari, an Italian mathematician, around 1540. The procedure is pretty involved, requires solving auxiliary 3-rd degree equations and other operations. Luckily it can be summarized in explicit formulae for the quartic roots, which we don't quote here due to restricted space. Having gone through this ordeal we get, for example, that one of the four roots is $$\begin{equation*}x_1=\frac{-1}{6} \sqrt{6} \sqrt{\frac{\left(\frac{1}{2}\right)^{\frac{2}{3}} \left(\left(\frac{1}{2}\right)^{\frac{2}{3}} \left(36\sqrt{-3}+70\right)^{\frac{2}{3}}+5\left(\frac{1}{2}\right)^{\frac{1}{3}} \left(36\sqrt{-3}+70\right)^{\frac{1}{3}}+13\right)}{\left(36\sqrt{-3}+70\right)^{\frac{1}{3}}}}-\frac{1}{2} \sqrt{\frac{-1}{3} \left(\frac{1}{2}\right)^{\frac{1}{3}} \left(36\sqrt{-3}+70\right)^{\frac{1}{3}}-\frac{\frac{26}{3} \left(\frac{1}{2}\right)^{\frac{2}{3}}}{\left(36\sqrt{-3}+70\right)^{\frac{1}{3}}}+\frac{10}{3}}+\frac{5}{2}\end{equation*}$$ which numerically evaluates to $x_1\approx 1.00 - 1.38\cdot 10^{-17}i$ and with some good faith we conclude $x_1=1$. We obtain the other three roots similarly.

Of course you wouldn't do it that way. You would exploit the fact that your equation already comes almost completely factored, the small quadratic part $x^2-3x+2$ is easy to factor as $(x-1)(x-2)$ with standard school math, so we have $$\begin{equation*}(x-1)(x-2)(x-3)(x-4)=0\end{equation*}$$ and we get all the solutions for free, exactly. The only reason we used the previous method is because we thought we first have to reduce the problem to some standard form and then use it as a black-box. The black-box, however, is unnecessarily complicated for what we need here, introduces noise, and ignores the additional structure we started with so a lot of wasted work is done to go back and forth.

You can read more about the twisted story of Ferrari and his mathematical duels over polynomial equations online, for instance in this article.


2. Suppose we want to optimize a portfolio with a factor covariance model of the form $\mathrm{diag}(D)+V^TFV$, with assets $x\in \mathbb{R}^n,\ (n\approx 3000)$, where $D\in \mathbb{R}^{n}$  is specific risk, $F\in \mathbb{R}^{k\times k},\ (k\approx 40)$ is factor covariance and $V\in \mathbb{R}^{k\times n}$ is the matrix of factor exposures. The straightforward approach is to multiply everything out, in pseudocode:

Q = diag(D) + V.T @ F @ V

and use the algorithm for optimizing a general quadratic problem:

risk = sqrt(quad_form(x, Q))

The procedure is pretty involved, requires checking that the matrix $Q\in \mathbb{R}^{n\times n}$ is positive-semidefinite, which at this size is prone to accumulation of numerical errors and can be time consuming (even though we know very well $Q$ is PSD by construction). Simultaneously the modeling tool and/or conic solver will need to compute some decomposition $Q=U^TU$. Having gone through this ordeal, the solver will finally have the risk in conic form

risk = norm(U@x)

albeit working with a dense matrix $U$ with $n\times n\approx 10^7$ nonzeros will not be optimal for efficiency and numerical reasons. We finally obtain the solution.

Of course you wouldn't do it that way. You would exploit the fact that your risk model already comes almost completely factored, the small factor covariance matrix $F\in \mathbb{R}^{k\times k}$ is easy to factor using standard Cholesky decomposition as $F=H^TH$, $H\in\mathbb{R}^{k\times k}$, and we get a conic decomposition of risk for free: $$\begin{equation*}x^T(\mathrm{diag}(D)+V^TFV)x=x^T(\mathrm{diag}(D)+V^TH^THV)x=\sum_i D_ix_i^2 + \|HVx\|_2^2,\end{equation*}$$ or in pseudocode:

risk = norm([diag(D^0.5); H @ V] @ x)

The matrix $HV\in \mathbb{R}^{k\times n}$ has $kn\approx  10^5$ nonzeros (a factor of $n/k\approx 100$ fewer than before) and everything else is very sparse so we get the solution very quickly and with high accuracy. The only reason we used the previous method is because we thought we first have to formulate a general quadratic problem and then solve it as a black-box. The black-box, however, is unnecessarily complicated for what we need here, introduces noise, and ignores the additional structure we started with so a lot of wasted work is done to go back and forth.

You can read more about the efficient conic implementation of the factor model in our Portfolio Optimization Cookbook, the documentation, a CVXPY tutorial and other resources.