1 Introduction

One of the central problems in finance is determining an optimal financial portfolio, that is, an optimal distribution of capital among n financial instruments. The celebrated work of Markowitz [9] suggests selecting a portfolio with the minimum possible standard deviation of the rate of return subject to a constraint on the expected return. However, Markowitz himself acknowledged that standard deviation is not an ideal way to estimate portfolio risk, because it is symmetric and therefore penalizes loss and profit equally. As a possible alternative, Markowitz considers lower semi-deviation, which penalises only returns below the expected value. However, while the optimization problem with standard deviation has a closed-form solution, the one with lower semi-deviation reduces to convex (in fact, quadratic) programming, which is still reasonably efficient but may take some time to solve for large-scale portfolios.

In 1991, Konno and Yamazaki [8] studied portfolio optimization with mean absolute deviation (MAD) instead of standard deviation, and showed that it reduces to linear programming and can therefore be solved very efficiently. However, mean absolute deviation is also symmetric, so it has limited advantages in comparison with standard deviation. In 2000, Rockafellar and Uryasev [14, 15] suggested using conditional value-at-risk (CVaR) as a risk measure, and demonstrated that, at the same time, (i) it is non-symmetric and focused on penalising losses, and (ii) the portfolio optimization problem with it reduces to linear programming.

In 2002, Rockafellar, Uryasev and Zabarankin [12, 16] introduced general deviation measures, a broad class of functionals that contains standard deviation, standard semi-deviation, MAD and CVaR-deviation as special cases, and studied the portfolio optimization problem in this general setting [13]. This gives investors flexibility in choosing the deviation measure that best models their individual risk preferences. Moreover, the resulting risk preferences are consistent with the theory of rational choice [3], and can be easily restored based on the actions an investor performed in the past [6]. However, the portfolio optimization problem with a general deviation measure reduces to convex programming, which can be more difficult to solve than linear programming. For some individual deviation measures, like MAD or CVaR-deviation, or related risk measures [15], norms [10], or performance measures [2], the problem can be reduced to linear programming using specific properties of these functionals. However, the general question “for which deviation measures can the portfolio optimization problem be reduced to linear programming” has not been investigated.

In 2013, Grechuk and Zabarankin [4, 7] suggested the idea of cooperative investment, which allows investors with different risk preferences to share the risk inherent in a portfolio with mutual benefit. It is shown in [4] that the optimal portfolio of the group can be found from a mean-deviation optimization problem with a certain deviation measure \({{{\mathcal {D}}}}^*\) that can be explicitly constructed from the deviation measures \({{{\mathcal {D}}}}_1,\dots ,{{{\mathcal {D}}}}_m\) representing the individual risk preferences of m investors. However, even if \({{{\mathcal {D}}}}_1,\dots ,{{{\mathcal {D}}}}_m\) are “commonly used” deviation measures like MAD or CVaR-deviation, so that the individual portfolio optimization problems can be reduced to linear programming by well-known methods specific to these deviation measures, it is not clear how to write a linear program to minimize the deviation measure \({{{\mathcal {D}}}}^*\).

This paper develops a general method for reducing to linear programming the portfolio optimization problem with general deviation measures that have polyhedral dual sets. This is a broad class of deviation measures that contains MAD and CVaR-deviation as special cases. Importantly, if the deviation measures \({{{\mathcal {D}}}}_1,\dots ,{{{\mathcal {D}}}}_m\) representing the risk preferences of m investors belong to this class, then so does the corresponding deviation measure \({{{\mathcal {D}}}}^*\), which allows one to reduce the cooperative investment problem to linear programming.

A key technical difficulty is that, for many polytopes, the number of vertices V is exponentially larger than the number of faces F. If the dual set of some deviation measure \({{{\mathcal {D}}}}\) is such a polytope, it is trivial to reduce the portfolio optimization problem with \({{{\mathcal {D}}}}\) to linear programming with the number of constraints proportional to V. This, however, is highly impractical. The main feature of our method is that the numbers of variables and constraints in our linear program depend polynomially on F. Using the minimax theorem (see equation (15) below), it is relatively easy to write down a small linear program to compute the optimal value of the optimization problem, but, with this method, it is unclear how to compute the optimal solution, that is, the weights of the optimal portfolio. Our main result (Theorem 1) shows how to write down a small linear program that allows one to compute both the optimal value and the optimal portfolio for the optimization problem with an arbitrary deviation measure with polyhedral dual set.

Our main result is formulated as a minimization of an arbitrary positively homogeneous convex functional, whose dual set is given by linear inequalities, and its applicability goes far beyond portfolio analysis with deviation measures. For example, Chekhlov, Uryasev and Zabarankin [1] studied portfolio optimization minimizing drawdown, and proved that this problem reduces to linear programming. We show that this result can alternatively be derived as another direct corollary from our main theorem.

This work is organised in five sections. Section 2 defines deviation measures and introduces portfolio optimization problem. Section 3 reduces this problem to linear programming, which is the main result of this work. Section 4 presents some examples and applications of the main result to the individual and cooperative portfolio optimization. Section 5 concludes the work.

2 Optimization problem with deviation measures

Let \((\Omega , {{{\mathcal {F}}}}, {{\mathbb {P}}})\) be a probability space, where \(\Omega \) is an arbitrary non-empty set, \({{{\mathcal {F}}}}\) is a \(\sigma \)-algebra of subsets of \(\Omega \), and \({{\mathbb {P}}}\) is a probability measure on \((\Omega ,{{{\mathcal {F}}}})\). Sets belonging to \({{{\mathcal {F}}}}\) will be called events. A random variable (r.v.) is a function \(X:\Omega \rightarrow {{\mathbb {R}}}\) such that for every \(x \in {{\mathbb {R}}}\) the set \(\{\omega \in \Omega \,|\,X(\omega ) \le x\}\) is an event. Let \({{{\mathcal {L}}}}^2(\Omega )\) be the vector space of r.v.s X for which \({{\mathbb {E}}}[X^2]<\infty \), where \({{\mathbb {E}}}[X]=\int _\Omega X(\omega )\,d{{\mathbb {P}}}\) is the expectation. Let \(F_X(x)={\mathbb {P}}[X\le x]\) and \(q_X(\alpha )=\inf \{x\,|\,F_X(x)>\alpha \}\) define the cumulative distribution function (CDF) and quantile function of an r.v. \(X\in {{{\mathcal {L}}}}^2(\Omega )\), respectively.

We will study a one-period model of a financial market with one risk-free asset with constant rate of return \(r_0\), and n risky assets with random rates of return \({{{\bar{r}}}}_i \in {{{\mathcal {L}}}}^2(\Omega )\), \(i=1,\dots , n\). Denote by \(r_i={{{\bar{r}}}}_i-r_0, i=1,\dots , n\) the excess rate of return of asset i over the risk-free rate. A financial portfolio is formed by investing the proportion \(y_0\) of the unit capital in the risk-free asset and the proportion \(y_i\) in each risky asset \(i=1,\dots ,n\). Then the budget constraint is \(\sum _{i=0}^n y_i = 1\), and the excess rate of return of the portfolio is

$$\begin{aligned} X=y_0 r_0 + \sum _{i=1}^n y_i{{{\bar{r}}}}_i - r_0 = \sum _{i=1}^n y_i({{{\bar{r}}}}_i - r_0) = \sum _{i=1}^n y_ir_i. \end{aligned}$$

Note that X does not depend on \(y_0\), and the portfolio optimization problem can be formulated over \(y=(y_1,\dots ,y_n)^{\top } \in {{\mathbb {R}}}^n\), where \({}^{\top }\) is the transpose operation. The weight \(y_0\) can then be found from the budget constraint as \(y_0=1-\sum _{i=1}^n y_i\). If we assume the possibility of short sales \((y_i<0)\) and borrowing at the risk-free rate \((y_0<0)\), then \(y\in {{\mathbb {R}}}^n\) is unconstrained.
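The independence of X from \(y_0\) can be checked numerically. The following is a minimal sketch for a single scenario; the function names and the sample numbers are hypothetical and serve only as an illustration.

```python
def excess_return_with_budget(y, r_bar, r0):
    """X = y0*r0 + sum_i y_i*r_bar_i - r0, with y0 taken from the budget constraint."""
    y0 = 1.0 - sum(y)  # may be negative: borrowing at the risk-free rate
    total = y0 * r0 + sum(yi * ri for yi, ri in zip(y, r_bar))
    return total - r0


def excess_return_from_excess_rates(y, r_bar, r0):
    """X = sum_i y_i * r_i with r_i = r_bar_i - r0; y0 does not appear."""
    return sum(yi * (ri - r0) for yi, ri in zip(y, r_bar))


# Hypothetical data: two risky assets, realised rates of return in one scenario.
y = [0.5, 0.8]          # sum exceeds 1, so y0 = -0.3
r_bar = [0.07, -0.02]
r0 = 0.01

x_budget = excess_return_with_budget(y, r_bar, r0)
x_excess = excess_return_from_excess_rates(y, r_bar, r0)
```

Both functions return the same value up to rounding, which is the identity displayed above.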

We assume that the investor would like to form a portfolio with \({{\mathbb {E}}}[X]\) at least \(\Delta \) that minimizes the deviation from this mean.

Definition 1

A deviation measure is any functional \({{{\mathcal {D}}}}:{{{\mathcal {L}}}}^2(\Omega )\rightarrow [0,\infty ]\) satisfying

  1. (D1)

    \({{{\mathcal {D}}}}(X)=0\) for constant X, but \({{{\mathcal {D}}}}(X) > 0\) otherwise (nonnegativity),

  2. (D2)

    \({{{\mathcal {D}}}}(\lambda X) = \lambda {{{\mathcal {D}}}}(X)\) for all X and all \(\lambda > 0\) (positive homogeneity),

  3. (D3)

    \({{{\mathcal {D}}}}(X + Y)\le {{{\mathcal {D}}}}(X) + {{{\mathcal {D}}}}(Y)\) for all X and Y (subadditivity),

  4. (D4)

    set \(\{X\in {{{\mathcal {L}}}}^2(\Omega )\big |{{{\mathcal {D}}}}(X)\le c\}\) is closed for all \(c<\infty \) (lower-semicontinuity).

In this paper, we only consider finite deviation measures, that is, such that \({{{\mathcal {D}}}}(X)<\infty \) for all \(X \in {{{\mathcal {L}}}}^2 (\Omega )\). Examples of deviation measures include the standard deviation \(\sigma (X):=\sqrt{{{\mathbb {E}}}[|X-{{\mathbb {E}}}[X]|^2]}\), mean absolute deviation \(\mathrm{MAD}(X):={{\mathbb {E}}}[|X-{{\mathbb {E}}}[X]|]\), and conditional value-at-risk (CVaR) deviation at level \(\alpha \in (0,1)\), defined as

$$\begin{aligned} \mathrm{CVaR}_\alpha ^\Delta (X) := {{\mathbb {E}}}[X]-\frac{1}{\alpha }\int _0^\alpha q_X(\beta ) d\beta . \end{aligned}$$
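For a sample of T equally likely outcomes, the three deviation measures above can be evaluated directly from their definitions. The sketch below assumes such a uniform finite sample; the function names are illustrative. The quantile function is then a step function equal to the k-th smallest outcome on \([(k-1)/T, k/T)\), so the integral in the CVaR-deviation is a finite sum.

```python
import math


def mean(x):
    return sum(x) / len(x)


def std_dev(x):
    m = mean(x)
    return math.sqrt(sum((v - m) ** 2 for v in x) / len(x))


def mad(x):
    m = mean(x)
    return sum(abs(v - m) for v in x) / len(x)


def cvar_deviation(x, alpha):
    """E[X] - (1/alpha) * integral of q_X over [0, alpha], for T equally likely outcomes."""
    T = len(x)
    xs = sorted(x)
    integral = 0.0
    for k, v in enumerate(xs):
        # q_X equals xs[k] on [k/T, (k+1)/T); intersect this interval with [0, alpha].
        lo, hi = k / T, min((k + 1) / T, alpha)
        if hi <= lo:
            break
        integral += v * (hi - lo)
    return mean(x) - integral / alpha


sample = [-0.04, 0.01, 0.02, 0.05]  # hypothetical excess returns
```

Unlike `std_dev` and `mad`, the function `cvar_deviation` reacts only to the lower tail of the sample.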

Deviation measures can be characterized in terms of risk envelopes \({{{\mathcal {E}}}}\subset {{{\mathcal {L}}}}^2 (\Omega )\), which are sets of random variables (r.v.’s), satisfying

  1. (R1)

    \({{{\mathcal {E}}}}\) is a convex and closed bounded set containing 1,

  2. (R2)

    \({{\mathbb {E}}}[S]=1\) for every \(S\in {{{\mathcal {E}}}}\),

  3. (R3)

    for every non-constant \(X \in {{{\mathcal {L}}}}^2 (\Omega )\) there is a \(S\in {{{\mathcal {E}}}}\) such that \({{\mathbb {E}}}[XS] < {{\mathbb {E}}}[X]\).

As shown in [12] and [16], there is a one-to-one correspondence between deviation measures and risk envelopes given by the formulas

$$\begin{aligned} {{{\mathcal {D}}}}(X)=\sup \limits _{{1-S}\in {{{\mathcal {E}}}}} {{\mathbb {E}}}[XS] \end{aligned}$$
(1)

and

$$\begin{aligned} {{{\mathcal {E}}}} = \{S \in {{{\mathcal {L}}}}^2 (\Omega )\, | \, {{\mathbb {E}}}[X(1-S)] \le {{{\mathcal {D}}}}(X),\, \forall X\}. \end{aligned}$$
(2)

The mean deviation portfolio optimization problem is

$$\begin{aligned} \inf _{y \in {{\mathbb {R}}}^n} {{{\mathcal {D}}}}\left( \sum _{i=1}^n y_ir_i\right) \quad \text {subject to} \quad {\mathbb {E}}\left[ \sum _{i=1}^n y_ir_i\right] \ge \Delta \end{aligned}$$
(3)

for some \(\Delta >0\).

With new notation

$$\begin{aligned} \rho (y) := {{{\mathcal {D}}}}\left( \sum _{i=1}^n y_ir_i\right) \end{aligned}$$
(4)

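and \(\mu _i=\frac{{\mathbb {E}}[r_i]}{\Delta }, i=1,\dots , n\), the problem (3) reduces to

$$\begin{aligned} \min _{y \in {{\mathbb {R}}}^n} \rho (y) \quad \text {subject to} \quad \mu ^{\top } y \ge 1, \end{aligned}$$
(5)

where \(\mu =(\mu _1,\dots ,\mu _n)^{\top }\).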

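The properties of deviation measures imply that \(\rho :{{\mathbb {R}}}^n\rightarrow {\mathbb {R}}\) is a non-negative positively homogeneous lower-semicontinuous convex function. Then the set

$$\begin{aligned} {{{\mathcal {Q}}}} = \{q \in {{\mathbb {R}}}^n \, | \, q^{\top } y \le \rho (y), \,\, \forall y \in {{\mathbb {R}}}^n\} \end{aligned}$$
(6)

is a convex closed bounded set containing 0, and

$$\begin{aligned} \rho (y) = \max _{q \in {{{\mathcal {Q}}}}} q^{\top } y. \end{aligned}$$
(7)

For every \(y\in {{\mathbb {R}}}^n\) denote by \({{{\mathcal {Q}}}}_\rho (y)=\{q \in {{{\mathcal {Q}}}} \, | \, q^{\top } y = \rho (y)\}\) the set of vectors for which the maximum in (7) is attained.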

Note that for every \(S \in {{{\mathcal {E}}}}\) (2) implies that

$$\begin{aligned} {{\mathbb {E}}}[X(1-S)] \le {{{\mathcal {D}}}}(X), \end{aligned}$$

which for \(X=\sum _{i=1}^n y_ir_i\) simplifies to

$$\begin{aligned} \sum _{i=1}^n y_i {\mathbb {E}}[r_i(1-S)] \le {{{\mathcal {D}}}}(X) = \rho (y), \end{aligned}$$

hence

$$\begin{aligned} {{{\mathcal {Q}}}} = \{q=(q_1,\dots ,q_n) | q_i={\mathbb {E}}[r_i(1-S)], i=1,\dots ,n, \,\, S\in {{{\mathcal {E}}}}\}. \end{aligned}$$

In practice, instrument’s rates of returns are often estimated from historical data. The resulting distributions may then be discrete, and it is convenient to model them as r.v.s on a finite probability space. Let us assume that \(\Omega \) is finite with \(T = |\Omega |\) and \({\mathbb {P}}(\omega ) > 0\) for any \(\omega \in \Omega \). Then all random variables can be identified with vectors in \({{\mathbb {R}}}^T\).

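A finite probability space \(\Omega \) will be called uniform if \({\mathbb {P}}(\omega _1)=\dots ={\mathbb {P}}(\omega _T)=\frac{1}{T}\). If \(X=(x_1,\dots ,x_T)^{\top }\) and \(S=(s_1,\dots ,s_T)^{\top }\) are two r.v.s on the uniform probability space, then

$$\begin{aligned} {{\mathbb {E}}}[XS] = \frac{1}{T}\sum _{t=1}^T x_t s_t = \frac{X^{\top } S}{T}. \end{aligned}$$

Let R be the \(n \times T\) matrix with entries \(r_{it}=r_i(\omega _t), i=1,\dots , n, t=1,\dots , T\). Then the excess rate of return X of a portfolio with weights \(y\in {{\mathbb {R}}}^n\) is \(X=R^{\top } y\). Then, by (1) and (4), we have

$$\begin{aligned} \rho (y) = {{{\mathcal {D}}}}(R^{\top } y) = \sup _{S \in 1-{{{\mathcal {E}}}}} {{\mathbb {E}}}[(R^{\top } y) S] = \sup _{S \in 1-{{{\mathcal {E}}}}} \frac{(R^{\top } y)^{\top } S}{T} = \sup _{S \in 1-{{{\mathcal {E}}}}} y^{\top } \frac{RS}{T}. \end{aligned}$$

Hence,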

$$\begin{aligned} {{{\mathcal {Q}}}} = \left\{ q=\frac{RS}{T} \,\bigg |\, S \in 1-{{{\mathcal {E}}}}\right\} . \end{aligned}$$
(8)

Following [5], we call deviation measure \({{{\mathcal {D}}}}\) finitely generated if the corresponding risk envelope \({{{\mathcal {E}}}}\) is a convex hull of a finite number of points. For example, standard deviation is not finitely generated, while the mean absolute deviation and CVaR deviation are.

Indeed, the risk envelopes of \(\mathrm{MAD}\) and CVaR-deviation are given by

$$\begin{aligned} {{{\mathcal {E}}}}_M=\big \{\,S \;\big |\; {\mathbb {E}}[S]=1, \;\; \sup S-\inf S \le 2\big \} \end{aligned}$$

and

$$\begin{aligned} {{{\mathcal {E}}}}_C=\big \{\,S \;\big |\; {\mathbb {E}}[S]=1, \;\; 0\le S \le \alpha ^{-1}\big \}, \end{aligned}$$

respectively, see [12,  Example 6] and [12,  Example 3]. Both \({{{\mathcal {E}}}}_M\) and \({{{\mathcal {E}}}}_C\) are convex polytopes in \({{\mathbb {R}}}^T\) with a finite number of vertices. Set \({{{\mathcal {E}}}}_M\) can be conveniently represented as

$$\begin{aligned} {{{\mathcal {E}}}}_M=\left\{ S \in {{\mathbb {R}}}^T \,\,\bigg |\,\, \frac{1}{T}\sum _{t=1}^T s_t=1,\quad \exists c:\, -1\le c-s_t\le 1 \quad \forall t\right\} . \end{aligned}$$
(9)

Here, c is a new auxiliary variable. Geometrically, the constraints in (9) represent a polytope \({{{\mathcal {E}}}}'_M\) in space \({{\mathbb {R}}}^{T+1}\) with coordinates \((s_1, \dots , s_T,c)\), and \({{{\mathcal {E}}}}_M\) is the projection of \({{{\mathcal {E}}}}'_M\) to \({{\mathbb {R}}}^T\). Such representations are of crucial importance, because polytopes with exponentially many vertices and faces can often be represented as projections of polytopes given by polynomially many constraints [17].
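As an illustration of (8) for \(\mathrm{MAD}\) on a uniform space, one can write down a vector S with \(1-S \in {{{\mathcal {E}}}}_M\) for which \(q=\frac{RS}{T}\) satisfies \(q^{\top } y = \mathrm{MAD}(R^{\top } y)\). The sketch below uses the centred sign of \(X-{{\mathbb {E}}}[X]\); the helper names and the data are hypothetical.

```python
def mad(x):
    m = sum(x) / len(x)
    return sum(abs(v - m) for v in x) / len(x)


def mad_dual_vector(x):
    """Return S with zero mean and max(S) - min(S) <= 2, so that 1 - S lies in E_M."""
    T = len(x)
    m = sum(x) / T
    sign = [(v > m) - (v < m) for v in x]  # sign of x_t - E[X], values in {-1, 0, 1}
    shift = sum(sign) / T                  # centring enforces sum_t s_t = 0
    return [s - shift for s in sign]


# Hypothetical data: n = 2 assets, T = 4 equally likely scenarios.
R = [[-0.04, 0.01, 0.02, 0.05],
     [0.02, -0.01, 0.00, 0.03]]
y = [1.0, 0.5]
n, T = len(R), len(R[0])

X = [sum(y[i] * R[i][t] for i in range(n)) for t in range(T)]      # X = R^T y
S = mad_dual_vector(X)
q = [sum(R[i][t] * S[t] for t in range(T)) / T for i in range(n)]  # q = R S / T
q_dot_y = sum(qi * yi for qi, yi in zip(q, y))
```

Here `q_dot_y` coincides with `mad(X)` up to rounding, because \({{\mathbb {E}}}[XS]={{\mathbb {E}}}[(X-{{\mathbb {E}}}[X])S]\) whenever S has zero mean.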

Let \({{{\mathcal {D}}}}\) be a finitely generated deviation measure whose risk envelope \({{{\mathcal {E}}}}\) is defined by a finite number of linear constraints, possibly involving some auxiliary variables. Then the set \({{{\mathcal {Q}}}}\) in (8) is a bounded convex polytope, which can be given explicitly in the following form. Let \(u\in {{\mathbb {R}}}^m\) be a vector of auxiliary variables, and let \( \begin{bmatrix} q \\ u \end{bmatrix} \in {{\mathbb {R}}}^{n+m} \) be the vector whose first n coordinates are those of q and whose last m coordinates are those of u. Then let

$$\begin{aligned} {{{\mathcal {Q}}}} = \left\{ q \in {{\mathbb {R}}}^n \,\bigg |\, \exists u\in {{\mathbb {R}}}^m : A\begin{bmatrix} q \\ u \end{bmatrix} \le b \right\} \end{aligned}$$
(10)

where A is some \(k \times (n+m)\) matrix and \(b \in {{\mathbb {R}}}^k\) is a vector.

In the Examples below, we provide the details how exactly set \({{{\mathcal {Q}}}}\) for \(\mathrm{MAD}(X)\) and \(\mathrm{CVaR}_\alpha ^\Delta (X)\) can be written in the form (10). We will use the following notations. Let \({\textbf {0}}_{i,j}\) be an \(i \times j\) matrix whose entries are all zeros, \(E_{i,j}\) be an \(i \times j\) matrix whose entries are all ones and \(I_{i,i}\) be an \(i \times i\) identity matrix.

Example 1

When \({{{\mathcal {D}}}}(X)=\mathrm{MAD}(X)\), then set \({{{\mathcal {Q}}}}\) in (8) can be represented in the form (10) with \(m=T+1\), \(k=2(1+T+n)\), and the matrix A and vector b are given by

$$\begin{aligned} A=\begin{bmatrix} {\textbf {0}}_{1,n} &{}E_{1,T} &{}0\\ {\textbf {0}}_{1,n} &{}-E_{1,T} &{}0\\ {\textbf {0}}_{T,n} &{}-I_{T,T} &{}E_{T,1}\\ {\textbf {0}}_{T,n} &{}I_{T,T} &{}-E_{T,1}\\ I_{n,n} &{}\frac{-R}{T} &{}{\textbf {0}}_{n,1}\\ -I_{n,n} &{}\frac{R}{T} &{}{\textbf {0}}_{n,1} \end{bmatrix} \quad \text {and}\quad b=\begin{bmatrix} 0 \\ 0 \\ E_{2T,1} \\ {\textbf {0}}_{2n,1} \end{bmatrix}. \end{aligned}$$
(11)

Detail

With \({{{\mathcal {D}}}}(X)=\mathrm{MAD}(X)\), set \({{{\mathcal {Q}}}}\) in (8) becomes

$$\begin{aligned} {{{\mathcal {Q}}}}= & {} \left\{ q=\frac{RS}{T} \,\bigg |\, S \in 1-{{{\mathcal {E}}}}_M\right\} = \left\{ q \,\bigg |\, \, \exists S \in {{\mathbb {R}}}^T, \, c \in {{\mathbb {R}}}: q=\frac{RS}{T}, \right. \nonumber \\&\left. \sum _{t=1}^T s_t=0, \,\, -1\le c-s_t\le 1 \quad \forall t\right\} \end{aligned}$$
(12)

Let us define \(u=\begin{bmatrix} S \\ c \end{bmatrix} \) be the vector of auxiliary variables in (10). Then, with A and b defined in (11), \( A \begin{bmatrix} q \\ u \end{bmatrix} \le b \) reduces to \(k=2(1+T+n)\) inequalities. The first two inequalities are \( \begin{bmatrix} {\textbf {0}}_{1,n}&E_{1,T}&0 \end{bmatrix} \begin{bmatrix} q \\ u \end{bmatrix} \le 0\) and \( \begin{bmatrix} {\textbf {0}}_{1,n}&-E_{1,T}&0 \end{bmatrix} \begin{bmatrix} q \\ u \end{bmatrix} \le 0\), which in component form are \(\sum _{t=1}^T s_t \le 0\) and \(-\sum _{t=1}^T s_t \le 0\), respectively. The next 2T inequalities \( \begin{bmatrix} {\textbf {0}}_{T,n}&-I_{T,T}&E_{T,1} \end{bmatrix} \begin{bmatrix} q \\ u \end{bmatrix} \le E_{T,1}\) and \( \begin{bmatrix} {\textbf {0}}_{T,n}&I_{T,T}&-E_{T,1} \end{bmatrix} \begin{bmatrix} q \\ u \end{bmatrix} \le E_{T,1}\) reduce respectively to \(c-s_t \le 1\) and \(s_t-c \le 1\), \(1 \le t \le T\). Finally, the last 2n inequalities are \( \begin{bmatrix} I_{n,n}&\frac{-R}{T}&{\textbf {0}}_{n,1} \end{bmatrix} \begin{bmatrix} q \\ u \end{bmatrix} \le {\textbf {0}}_{n,1} \) and \( \begin{bmatrix} -I_{n,n}&\frac{R}{T}&{\textbf {0}}_{n,1} \end{bmatrix} \begin{bmatrix} q \\ u \end{bmatrix} \le {\textbf {0}}_{n,1}, \) which imply that \(q-\frac{RS}{T} \le 0\) and \(-q+\frac{RS}{T} \le 0\), respectively. This is exactly the description of the set \({{{\mathcal {Q}}}}\) in (12). \(\square \)
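The block structure of (11) can be assembled mechanically. The following sketch builds A and b as nested lists, with columns ordered as q (n entries), S (T entries) and c (one entry), and includes a small helper that tests \(A x \le b\); the function names are illustrative.

```python
def mad_constraints(R):
    """Return (A, b) of (11); R is an n x T nested list of excess returns."""
    n, T = len(R), len(R[0])
    A, b = [], []
    # sum_t s_t <= 0 and -sum_t s_t <= 0
    A.append([0.0] * n + [1.0] * T + [0.0]); b.append(0.0)
    A.append([0.0] * n + [-1.0] * T + [0.0]); b.append(0.0)
    # c - s_t <= 1 for every t
    for t in range(T):
        row = [0.0] * (n + T) + [1.0]
        row[n + t] = -1.0
        A.append(row); b.append(1.0)
    # s_t - c <= 1 for every t
    for t in range(T):
        row = [0.0] * (n + T) + [-1.0]
        row[n + t] = 1.0
        A.append(row); b.append(1.0)
    # q - RS/T <= 0 and -q + RS/T <= 0, which together give q = RS/T
    for sign in (1.0, -1.0):
        for i in range(n):
            row = [0.0] * n + [-sign * R[i][t] / T for t in range(T)] + [0.0]
            row[i] = sign
            A.append(row); b.append(0.0)
    return A, b


def is_feasible(A, b, x, tol=1e-9):
    """Check A x <= b componentwise, up to a small tolerance."""
    return all(sum(a * v for a, v in zip(row, x)) <= bi + tol
               for row, bi in zip(A, b))
```

For an \(n \times T\) matrix R the result has \(k=2(1+T+n)\) rows and \(n+T+1\) columns, in agreement with Example 1.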

Example 2

When \({{{\mathcal {D}}}}(X)=\mathrm{CVaR}_\alpha ^\Delta (X)\), then set \({{{\mathcal {Q}}}}\) in (8) can be represented in the form (10) with \(m=T\), \(k=2(T+1+n)\), and the matrix A and vector b are given by

$$\begin{aligned} A=\begin{bmatrix} {\textbf {0}}_{T,n} &{}I_{T,T}\\ {\textbf {0}}_{T,n} &{}-I_{T,T}\\ {\textbf {0}}_{1,n} &{}E_{1,T}\\ {\textbf {0}}_{1,n} &{}-E_{1,T}\\ I_{n,n} &{}-\frac{R}{T}\\ -I_{n,n} &{}\frac{R}{T} \end{bmatrix} \quad \text {and}\quad b=\begin{bmatrix} E_{T,1} \\ \left( \frac{1-\alpha }{\alpha }\right) E_{T,1} \\ {\textbf {0}}_{2(1+n),1} \end{bmatrix}. \end{aligned}$$
(13)

Detail

With \({{{\mathcal {D}}}}(X)=\mathrm{CVaR}_\alpha ^\Delta (X)\), set \({{{\mathcal {Q}}}}\) in (8) becomes

$$\begin{aligned} {{{\mathcal {Q}}}}= & {} \left\{ q=\frac{RS}{T} \,\bigg |\, S \in 1-{{{\mathcal {E}}}}_C\right\} = \left\{ q \,\bigg |\, \, \exists S \in {{\mathbb {R}}}^T: q=\frac{RS}{T},\right. \nonumber \\&\, \, \, \left. \sum _{t=1}^T s_t=0, \,\, \frac{\alpha -1}{\alpha }\le s_t\le 1 \quad \forall t\right\} \end{aligned}$$
(14)

Let us define \(u=S\) be the vector of auxiliary variables in (10). Then, with A and b defined in (13), \( A \begin{bmatrix} q \\ u \end{bmatrix} \le b \) reduces to \(k=2(T+1+n)\) inequalities. The first T inequalities are \( \begin{bmatrix} {\textbf {0}}_{T,n}&I_{T,T} \end{bmatrix} \begin{bmatrix} q \\ S \end{bmatrix} \le E_{T,1}\), which in component form reduce to \(s_t \le 1\), \(1 \le t \le T\). The next T inequalities \( \begin{bmatrix} {\textbf {0}}_{T,n}&-I_{T,T} \end{bmatrix} \begin{bmatrix} q \\ S \end{bmatrix} \le \left( \frac{1-\alpha }{\alpha }\right) E_{T,1}\) reduce to \(-s_t \le \frac{1-\alpha }{\alpha }\), \(1 \le t \le T\). The next two inequalities are \( \begin{bmatrix} {\textbf {0}}_{1,n}&E_{1,T} \end{bmatrix} \begin{bmatrix} q \\ S \end{bmatrix} \le 0 \) and \( \begin{bmatrix} {\textbf {0}}_{1,n}&-E_{1,T} \end{bmatrix} \begin{bmatrix} q \\ S \end{bmatrix} \le 0, \) which in component form are \(\sum _{t=1}^T s_t \le 0\) and \(-\sum _{t=1}^T s_t \le 0\), respectively. Finally, the last 2n inequalities imply that \(q-\frac{RS}{T} \le 0\) and \(-q+\frac{RS}{T} \le 0\), respectively. This is exactly the description of the set \({{{\mathcal {Q}}}}\) in (14). \(\square \)
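The bounds in (14) can be illustrated with a vector S that puts the lower bound \(\frac{\alpha -1}{\alpha }\) on the worst outcomes and the upper bound 1 elsewhere. The sketch below assumes a uniform space and, as a simplifying assumption not made in the text, that \(\alpha T\) is an integer; the function name is hypothetical.

```python
def cvar_extremal_vector(x, alpha):
    """Return S satisfying the constraints in (14): s_t = (alpha-1)/alpha on the
    alpha*T worst outcomes of x and s_t = 1 on the remaining ones."""
    T = len(x)
    k = round(alpha * T)
    if abs(k - alpha * T) > 1e-9 or not 0 < k < T:
        raise ValueError("this sketch assumes that alpha*T is an integer in (0, T)")
    worst = sorted(range(T), key=lambda t: x[t])[:k]  # indices of the k smallest outcomes
    S = [1.0] * T
    for t in worst:
        S[t] = 1.0 - 1.0 / alpha
    return S


def expectation_of_product(x, S):
    """E[XS] on a uniform probability space."""
    return sum(v * s for v, s in zip(x, S)) / len(x)


sample = [-0.04, 0.01, 0.02, 0.05]  # hypothetical excess returns
S_half = cvar_extremal_vector(sample, 0.5)
value_half = expectation_of_product(sample, S_half)
```

For this sample `value_half` equals the mean of the sample minus the average of its two worst outcomes, which is \(\mathrm{CVaR}_\alpha ^\Delta \) at \(\alpha =0.5\).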

3 Reduction to linear programming

In this section, we study optimization problem (5) with \(\rho \) given by (7), where \({{{\mathcal {Q}}}}\) is defined in (10). The goal is to show that this problem can be reduced to linear programming.

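First, we note that, for any fixed \(y \in {{\mathbb {R}}}^n\), \(\rho (y)\) can be calculated using the linear program

$$\begin{aligned} \rho (y) = \max _{q \in {{\mathbb {R}}}^n, \, u \in {{\mathbb {R}}}^m} q^{\top } y \quad \text {subject to} \quad A\begin{bmatrix} q \\ u \end{bmatrix} \le b. \end{aligned}$$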

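Next, let \(v^*\) be the optimal value of (5), and let \(K_\mu =\{y \in {{\mathbb {R}}}^n \, | \, \mu ^{\top } y \ge 1\}\) be the feasible set of (5). Then

$$\begin{aligned} v^* = \min _{y \in K_\mu } \rho (y) = \min _{y \in K_\mu } \max _{q \in {{{\mathcal {Q}}}}} q^{\top } y = \max _{q \in {{{\mathcal {Q}}}}} \inf _{y \in K_\mu } q^{\top } y, \end{aligned}$$
(15)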

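where the last equality follows from Sion’s minimax theorem [18]. We next compute \(\inf _{y \in K_\mu } q^{\top } y\). If \(q=v \mu \) for some constant \(v\ge 0\), then

$$\begin{aligned} \inf _{y \in K_\mu } q^{\top } y = \inf _{y \in K_\mu } (v\mu )^{\top } y = v \inf _{y \in K_\mu } \mu ^{\top } y = v, \end{aligned}$$

where the third equality follows from the definition of \(K_\mu \). If, conversely, \(q \ne v \mu \) for any \(v\ge 0\), then one can find a hyperplane going through the origin separating vectors q and \(\mu \). This implies the existence of a vector \(y^*\) such that \(q^{\top } y^*< 0 < \mu ^{\top } y^*\). Then, for any constant \(C \ge 1/(\mu ^{\top } y^*)\), one has \(\mu ^{\top } (Cy^*) \ge 1\), hence \((Cy^*)\in K_\mu \). Then

$$\begin{aligned} \inf _{y \in K_\mu } q^{\top } y \le \inf _{C \ge 1/(\mu ^{\top } y^*)} q^{\top } (Cy^*) = \inf _{C \ge 1/(\mu ^{\top } y^*)} C \, (q^{\top } y^*) = -\infty , \end{aligned}$$

where the last equality follows from \(q^{\top } y^* < 0\) and the fact that C can be arbitrarily large.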

Let \({{{\mathcal {Q}}}}_\mu = \{q \in {{{\mathcal {Q}}}}: q = v \mu \text { for some } v\ge 0\}\). Then

$$\begin{aligned} v^* = \max _{q \in {{{\mathcal {Q}}}}_\mu } \inf _{y \in K_\mu } q^{\top } y, \end{aligned}$$

or equivalently

$$\begin{aligned} v^* = \max \limits _{v \ge 0} v, \quad \text {subject to} \quad v \mu \in {{{\mathcal {Q}}}}. \end{aligned}$$

If \({{{\mathcal {Q}}}}\) is given by (10), this is a linear program. Hence, one can easily find the optimal value \(v^*\). However, this method does not return optimal \(y^*\) in (5), which corresponds to the optimal portfolio weights. We next prove that \(y^*\) can also be computed from a linear program.
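In the special case where \({{{\mathcal {Q}}}}\) is described without auxiliary variables, \({{{\mathcal {Q}}}}=\{q \,|\, Aq \le b\}\) with \(b \ge 0\), the program for \(v^*\) has a single variable and can be solved by a ratio test. The sketch below covers only this special case (an assumption made for illustration; the general case with auxiliary variables requires a linear programming solver), and the function name is hypothetical.

```python
def optimal_value_no_aux(A, b, mu):
    """Largest v >= 0 with A (v*mu) <= b, assuming b >= 0 and no auxiliary variables."""
    ratios = []
    for row, bi in zip(A, b):
        slope = sum(a * m for a, m in zip(row, mu))  # i-th component of A mu
        if slope > 0:
            ratios.append(bi / slope)                # constraint reads v <= b_i / slope
        # If slope <= 0, the constraint holds for every v >= 0 because b_i >= 0.
    if not ratios:
        raise ValueError("v*mu stays in Q for all v >= 0, so Q is unbounded along mu")
    return min(ratios)


# Hypothetical example: Q is the box |q_1| <= 1, |q_2| <= 1, so rho(y) = |y_1| + |y_2|.
A_box = [[1.0, 0.0], [-1.0, 0.0], [0.0, 1.0], [0.0, -1.0]]
b_box = [1.0, 1.0, 1.0, 1.0]
mu = [1.0, 0.5]
v_star = optimal_value_no_aux(A_box, b_box, mu)
```

In this example the binding constraint is \(v\mu _1 \le 1\), and the value agrees with minimizing \(|y_1|+|y_2|\) subject to \(y_1+0.5\,y_2 \ge 1\), which is attained at \(y=(1,0)\).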

Proposition 1

Let \(y^*\) be an optimal solution of (5), so that \(v^*=\rho (y^*)\). Then

  1. (i)

    either \(v^*=0\) or \(\mu ^{\top } y^*=1\);

  2. (ii)

    \(v^*\mu \in {{{\mathcal {Q}}}}_\rho (y^*)\).

Proof

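Vector \(y^*/(\mu ^{\top } y^*)\) satisfies the constraint in (5). Hence, by optimality of \(y^*\) in (5), \(\rho (y^*) \le \rho \left( y^*/(\mu ^{\top } y^*)\right) = \rho (y^*)/(\mu ^{\top } y^*)\), or \(v^*(\mu ^{\top } y^* - 1) \le 0\). Because \(v^*\ge 0\) and \(\mu ^{\top } y^* \ge 1\), (i) follows.

For every \(y \in {{\mathbb {R}}^n}\) with \(\mu ^{\top } y > 0\), vector \(y/(\mu ^{\top } y)\) satisfies the constraint in (5). Hence, by optimality of \(y^*\) in (5), \(v^* \le \rho \left( y/(\mu ^{\top } y)\right) = \rho (y)/(\mu ^{\top } y)\), or \((v^*\mu )^{\top } y \le \rho (y)\). The same inequality holds trivially when \(\mu ^{\top } y \le 0\), because \(v^* \ge 0\) and \(\rho (y) \ge 0\). By (6), this implies that \(v^*\mu \in {{{\mathcal {Q}}}}\). Because \((v^*\mu )^{\top } y^* = v^* = \rho (y^*)\) by (i), \(v^*\mu \in {{{\mathcal {Q}}}}_\rho (y^*)\). \(\square \)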

Theorem 1

Let \(\rho \) be given by (7) with \({{{\mathcal {Q}}}}\) defined in (10). Then optimal solution \(y^*\) to (5) can be found from the following linear program

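$$\begin{aligned} \begin{aligned}&\max \limits _{v\in {{\mathbb {R}}}, y\in {{\mathbb {R}}}^n, u\in {{\mathbb {R}}}^m, \pi \in {{\mathbb {R}}}^k} v \\&\text {subject to} \quad A\begin{bmatrix} v\mu \\ u \end{bmatrix} \le b, \quad \mu ^{\top } y \ge 1, \quad A^{\top } \pi = \begin{bmatrix} y \\ {\textbf {0}}_{m,1} \end{bmatrix}, \quad b^{\top } \pi \le v, \quad \pi \ge 0, \end{aligned} \end{aligned}$$
(16)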

which can alternatively be written in a component form

$$\begin{aligned} \begin{aligned}&\max \limits _{v\in {{\mathbb {R}}}, y\in {{\mathbb {R}}}^n, u\in {{\mathbb {R}}}^m, \pi \in {{\mathbb {R}}}^k} v \\&\text {subject to} \\&\sum _{j=1}^n\left( \mu _j a_{ij}\right) v+\sum _{j=n+1}^{n+m} u_{j-n} a_{ij} \le b_i,\,i=1,\dots ,k,\\&\sum _{j=1}^n \mu _j y_j \ge 1,\\&\sum _{i=1}^k \pi _i a_{ij} = y_j, \,\, j=1, \dots , n, \\&\sum _{i=1}^k \pi _i a_{ij} = 0, \,\, j=n+1, \dots , n+m, \\&\sum _{i=1}^k \pi _i b_i \le v,\\&\pi _i \ge 0, \, i=1,\dots ,k, \end{aligned} \end{aligned}$$
(17)

where \(a_{ij}\), \(i=1,\dots ,k\), \(j=1,\dots ,n+m\) and \(b_i\), \(i=1,\dots ,k\) are the entries of A and the components of b, respectively, and \(\pi _i\), \(i=1,\dots ,k\) are the components of \(\pi \).
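The constraints of (17) are easy to verify for a given candidate \((v, y, u, \pi )\). The following sketch is a plain feasibility checker, not a solver; the function name and the test data are hypothetical, and in practice the candidate would be produced by a linear programming solver.

```python
def satisfies_lp17(A, b, mu, v, y, u, pi, tol=1e-9):
    """Check all constraints of (17) for a candidate (v, y, u, pi)."""
    n, k = len(mu), len(A)
    m = len(A[0]) - n
    x = [v * mj for mj in mu] + list(u)  # the vector [v*mu; u]
    in_dual_set = all(sum(A[i][j] * x[j] for j in range(n + m)) <= b[i] + tol
                      for i in range(k))
    return_ok = sum(mj * yj for mj, yj in zip(mu, y)) >= 1 - tol
    cols = [sum(pi[i] * A[i][j] for i in range(k)) for j in range(n + m)]  # A^T pi
    link_y = all(abs(cols[j] - y[j]) <= tol for j in range(n))
    link_u = all(abs(cols[j]) <= tol for j in range(n, n + m))
    bound_ok = sum(p * bi for p, bi in zip(pi, b)) <= v + tol
    pi_ok = all(p >= -tol for p in pi)
    return in_dual_set and return_ok and link_y and link_u and bound_ok and pi_ok


# Hypothetical example without auxiliary variables (m = 0):
# Q is the box |q_1| <= 1, |q_2| <= 1, so rho(y) = |y_1| + |y_2|.
A_box = [[1.0, 0.0], [-1.0, 0.0], [0.0, 1.0], [0.0, -1.0]]
b_box = [1.0, 1.0, 1.0, 1.0]
mu = [1.0, 0.5]

ok = satisfies_lp17(A_box, b_box, mu, v=1.0, y=[1.0, 0.0], u=[], pi=[1.0, 0.0, 0.0, 0.0])
```

In this example the candidate with \(v=1\) and \(y=(1,0)\) passes the check. The multipliers \(\pi \) act as a certificate that \(\rho (y) \le v\), so a portfolio with larger deviation, such as \(y=(0,2)\), cannot be paired with \(v=1\).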

Proof

Let \(y^*\) be an optimal solution of (5), so that \(\mu ^{\top } y^* \ge 1\) and \(v^*=\rho (y^*)\). By Proposition 1 (ii), \(v^*\mu \in {{{\mathcal {Q}}}}\), hence by (10) there exists some \(u^* \in {{\mathbb {R}}}^m\) such that \(A\begin{bmatrix} v^*\mu \\ u^* \end{bmatrix} \le b\).

Let \(a^i\), \(i=1,\dots ,k\) and \(b_i\), \(i=1,\dots ,k\) be the rows of A and the components of b, respectively. If for some \(x=\begin{bmatrix} q \\ u \end{bmatrix} \in {{\mathbb {R}}}^{n+m}\) we have \(a^ix \le b_i\), \(i=1,\dots ,k\), then \(q \in {{{\mathcal {Q}}}}\) and therefore \(q^{\top } y^* \le \rho (y^*) = v^*\). Hence, by [11,  Theorem 22.3], there exists a vector \(\pi ^*\in {{\mathbb {R}}}^k\) with non-negative components \(\pi ^*_i\) such that \(\sum \limits _{i=1}^k\pi ^*_i a^i = \begin{bmatrix} y^* \\ {\textbf {0}}_{m,1} \end{bmatrix}^{\top }\) and \(\sum \limits _{i=1}^k\pi ^*_i b_i \le v^*\), or, equivalently, \(A^{\top } \pi ^* = \begin{bmatrix} y^* \\ {\textbf {0}}_{m,1} \end{bmatrix}\) and \(b^{\top } \pi ^* \le v^*\). Hence, \(v^*, y^*, u^*, \pi ^*\) is a feasible solution to (16).

Hence, if \(v', y', u', \pi '\) is an optimal solution to (16), then \(v' \ge v^*\).

On the other hand, constraint \(A\begin{bmatrix} v'\mu \\ u' \end{bmatrix} \le b\) implies that \(q=v'\mu \in {{{\mathcal {Q}}}}\). By (6), for every \(y \in {{\mathbb {R}}}^n\), \((v'\mu )^{\top } y \le \rho (y)\). If \(\mu ^{\top } y \ge 1\), this implies that \(\rho (y) \ge v'\), because \(v' \ge v^* \ge 0\). Hence, \(v^* \ge v'\), so in fact \(v^*=v'\). This implies that \(v^*, y^*, u^*, \pi ^*\) is in fact an optimal solution to (16).

It is left to prove that \(y'\) is an optimal solution to (5). Let \(q \in {{{\mathcal {Q}}}}\) be arbitrary. Then there exists \(u \in {{\mathbb {R}}}^m\) such that \(a^i\begin{bmatrix} q \\ u \end{bmatrix} \le b_i\), \(i=1,\dots ,k\). Multiplying the i-th inequality by \(\pi '_i\) and adding up, we get \(\sum \limits _{i=1}^k \pi '_i a^i \begin{bmatrix} q \\ u \end{bmatrix} \le \sum \limits _{i=1}^k \pi '_i b_i \le v'\), or \(q^{\top } y' \le v' = v^*\). Because \(q \in {{{\mathcal {Q}}}}\) was arbitrary, this implies that \(\rho (y')\le v^*\). Because \(y'\) satisfies the constraint in (5), and \(v^*\) is the optimal value in (5), we have \(\rho (y')=v^*\), and \(y'\) is an optimal solution to (5). \(\square \)

4 Applications and examples

4.1 Individual portfolio optimization

In this section we assume that probability space is finite uniform with T equally-likely scenarios. Recall that R denotes the \(n \times T\) matrix whose entries \(r_{it}\) represent the excess rate of return of asset i under scenario t.

Example 3

Portfolio optimization problem (3) with \({{{\mathcal {D}}}}(X)=\mathrm{MAD}(X)\) reduces to (5) with \(\mu _i=\frac{{\mathbb {E}}[r_i]}{\Delta }\), \(i=1,\dots ,n\) and

$$\begin{aligned} {{{\mathcal {Q}}}}= & {} \left\{ q=\frac{RS}{T} \,\bigg |\, S \in 1-{{{\mathcal {E}}}}_M\right\} \nonumber \\= & {} \left\{ q \,\bigg |\, \, \exists S \in {{\mathbb {R}}}^T, \, c \in {{\mathbb {R}}}: q=\frac{RS}{T}, \right. \nonumber \\&\left. \sum _{t=1}^T s_t=0, \,\, -1\le c-s_t\le 1 \quad \forall t\right\} \end{aligned}$$

This representation is a special case of (10), and Theorem 1 is applicable.

Example 4

Portfolio optimization problem (3) with \({{{\mathcal {D}}}}(X)=\mathrm{CVaR}_\alpha ^\Delta (X)\) reduces to (5) with \(\mu _i=\frac{{\mathbb {E}}[r_i]}{\Delta }\), \(i=1,\dots ,n\) and

$$\begin{aligned} {{{\mathcal {Q}}}}= & {} \left\{ q=\frac{RS}{T} \,\bigg |\, S \in 1-{{{\mathcal {E}}}}_C\right\} = \left\{ q \,\bigg |\, \, \exists S \in {{\mathbb {R}}}^T: q=\frac{RS}{T},\right. \\&\,\,\, \left. \sum _{t=1}^T s_t=0, \,\, \frac{\alpha -1}{\alpha }\le s_t\le 1 \quad \forall t\right\} \end{aligned}$$

This representation is a special case of (10), and Theorem 1 is applicable.

Example 5

Assume that the deviation measure in (3) is a linear combination of M CVaR-Deviations (Mixed CVaR-Deviation Measure)

$$\begin{aligned} {{{\mathcal {D}}}}(X)=\sum _{i=1}^M \lambda _i \mathrm{CVaR}_{\alpha _i}^\Delta (X) \end{aligned}$$
(18)

where \(0<\alpha _1<\dots<\alpha _M<1\), \(\sum _{i=1}^M \lambda _i=1\) and \(\lambda _i \ge 0, \forall i\). The corresponding functional \(\rho \) defined in (4) can be represented in the form of (7) with

$$\begin{aligned} {{{\mathcal {Q}}}} = \sum _{i=1}^M \lambda _i {{{\mathcal {Q}}}}_i \quad \text {and}\quad \lambda _i {{{\mathcal {Q}}}}_i = \left\{ \frac{RS^i}{T} \bigg | S^i \in \lambda _i {{{\mathcal {E}}}}_i\right\} \end{aligned}$$

where

$$\begin{aligned} \lambda _i {{{\mathcal {E}}}}_i=\left\{ S \in {{\mathbb {R}}}^T \,\bigg |\, \sum _{t=1}^T s_t=0,\,\, \lambda _i-\frac{\lambda _i}{\alpha _i}\le s_t \le \lambda _i \quad \forall t\right\} . \end{aligned}$$
(19)

This representation is a special case of (10), and Theorem 1 is applicable.

Our next example is portfolio optimization with drawdown. Let vector \(z=(z_1,...,z_T)\) represent the time series (historical or forecasted) of prices of a financial instrument or portfolio. Then the drawdown of z at time t is the difference

$$\begin{aligned} D_t(z) = \max _{1\le i \le t} z_i - z_t \end{aligned}$$

between the maximal price before t and the price at time t. One may consider portfolio minimizing the maximal drawdown

$$\begin{aligned} \text {MaxDD}(z) = \max _{1\le t \le T} D_t(z), \end{aligned}$$

or the average drawdown

$$\begin{aligned} \text {AvDD}(z) = \frac{1}{T} \sum _{t=1}^T D_t(z), \end{aligned}$$

or, more generally, the conditional drawdown-at-risk (CDaR) measure \(\mathrm{CDaR}_{\alpha }(z)\) for any \(\alpha \in (0,1)\), which, intuitively, is the average of the \((1-\alpha )\) fraction of the worst drawdowns.
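The drawdown series, the maximal drawdown and the average drawdown follow directly from the definitions above. The sketch below uses illustrative function names and a hypothetical price series; CDaR is left out because it additionally depends on the tail convention.

```python
def drawdowns(z):
    """D_t(z) = max_{1 <= i <= t} z_i - z_t for every t."""
    running_max = float("-inf")
    result = []
    for price in z:
        running_max = max(running_max, price)  # highest price up to and including t
        result.append(running_max - price)
    return result


def max_drawdown(z):
    return max(drawdowns(z))


def average_drawdown(z):
    d = drawdowns(z)
    return sum(d) / len(d)


prices = [100, 104, 101, 98, 103, 107, 105]  # hypothetical price series
dd = drawdowns(prices)
```

A single pass with a running maximum suffices, so the cost is linear in T.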

For a portfolio with weights y, \(\mathrm{CDaR}_{\alpha }(y)\) can be equivalently defined [1] as

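$$\begin{aligned} \mathrm{CDaR}_{\alpha }(y) = \max _{q \in {{{\mathcal {Q}}}}} q^{\top } y \end{aligned}$$
(20)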

with

$$\begin{aligned} {{{\mathcal {Q}}}}=\{RS\,| S\in {{{\mathcal {E}}}}\}, \end{aligned}$$

where

$$\begin{aligned} {{{\mathcal {E}}}}=\left\{ S\bigg | \sum _{t=1}^T s_t = 0, \,\, \sum _{t=1}^l s_t \ge 0 \;\forall l, \; \sum _{t=1}^T s^+_t \le 1, \; s_t \ge -\frac{1}{(1-\alpha ) T} \; \forall t \right\} , \end{aligned}$$
(21)

where \(s^+_t=\max \{0,s_t\}\) and R is an \(n \times T\) matrix in which the entry \(r_{it}\) for \(i=1, \dots , n\) and \(t=1, \dots , T\) denotes the adjusted rate of return of the ith asset over the time period [0, t].

Example 6

Let the risk measure in (5) be a linear combination of M CDaRs

$$\begin{aligned} \rho (y)=\sum _{i=1}^M \lambda _i \mathrm{CDaR}_{\alpha _i}(y) \end{aligned}$$
(22)

where \(0<\alpha _1<\dots<\alpha _M<1\), \(\sum _{i=1}^M \lambda _i=1\) and \(\lambda _i \ge 0, \forall i\). This risk measure can be represented in the form of (7) with

$$\begin{aligned} {{{\mathcal {Q}}}} = \sum _{i=1}^M \lambda _i {{{\mathcal {Q}}}}_i \quad \text {and}\quad \lambda _i {{{\mathcal {Q}}}}_i = \{RS^i | S^i \in \lambda _i {{{\mathcal {E}}}}_i\} \end{aligned}$$

where

$$\begin{aligned} \lambda _i {{{\mathcal {E}}}}_i=\left\{ S\bigg | \sum _{t=1}^T s_t = 0, \,\, \sum _{t=1}^l s_t \ge 0 \;\forall l, \; \sum _{t=1}^T s^+_t \le \lambda _i, \; s_t \ge -\frac{\lambda _i}{(1-\alpha _i) T} \; \forall t \right\} . \end{aligned}$$
(23)

This representation is a special case of (10), and Theorem 1 is applicable.

4.2 Risk sharing and cooperative investment

Assume that m investors jointly hold a financial instrument or portfolio, whose profit after a unit of time is modelled as a random variable X (negative values of X correspond to losses). Investors are allowed to distribute X among themselves, so that investor i gets part \(Y_i\) with \(\sum _{i=1}^m Y_i = X\). Each investor evaluates his/her part using a mean-deviation utility function

$$\begin{aligned} U_i(Y_i) = {\mathbb {E}}[Y_i]-{{{\mathcal {D}}}}_i(Y_i), \end{aligned}$$

where \({{{\mathcal {D}}}}_i\) is a deviation measure used by investor i. We call vector \(Y=(Y_1,...,Y_m)\) an allocation. We say that allocation \(Z=(Z_1,...,Z_m)\) dominates \(Y=(Y_1,...,Y_m)\) if \(U_i(Z_i)\ge U_i(Y_i)\), \(i=1,\dots ,m\) with at least one inequality being strict. An allocation Y is called feasible if \(\sum _{i=1}^m Y_i = X\), and Pareto optimal if there is no feasible allocation that dominates it. Equivalently, a feasible allocation is Pareto optimal if and only if it is a maximizer in the optimization problem

$$\begin{aligned} \max \limits _{Y_i: \sum Y_i=X} \sum _{i=1}^m U_i(Y_i). \end{aligned}$$
(24)

Indeed, if there is a feasible Z that dominates Y, then \(\sum _{i=1}^m U_i(Z_i)>\sum _{i=1}^m U_i(Y_i)\), hence Y is not an optimizer in (24). Conversely, if there is a feasible allocation Z such that \(\delta =\sum _{i=1}^m U_i(Z_i)-\sum _{i=1}^m U_i(Y_i)>0\), then the allocation \(Z'=(Z'_1, \dots , Z'_m)\) given by \(Z'_i=Z_i-U_i(Z_i)+U_i(Y_i)+\delta /m\), \(i=1,\dots , m\) is feasible and dominates Y, hence Y is not Pareto optimal.

Note that \(\sum _{i=1}^m U_i(Y_i)={\mathbb {E}}[X]-\sum _{i=1}^m {{{\mathcal {D}}}}_i(Y_i)\), hence maximizing \(\sum _{i=1}^m U_i(Y_i)\) is equivalent to minimizing \(\sum _{i=1}^m {{{\mathcal {D}}}}_i(Y_i)\). Let us denote

$$\begin{aligned} {{{\mathcal {D}}}}^*(X) = \min \limits _{Y_i: \sum Y_i=X} \sum _{i=1}^m {{{\mathcal {D}}}}_i(Y_i). \end{aligned}$$
(25)
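For completeness, the identity behind the passage from (24) to (25) can be written out as follows. It uses only the definition of \(U_i\), linearity of expectation, and the feasibility constraint \(\sum _{i=1}^m Y_i = X\):

$$\begin{aligned} \sum _{i=1}^m U_i(Y_i) = \sum _{i=1}^m {\mathbb {E}}[Y_i]-\sum _{i=1}^m {{{\mathcal {D}}}}_i(Y_i) = {\mathbb {E}}\left[ \sum _{i=1}^m Y_i\right] -\sum _{i=1}^m {{{\mathcal {D}}}}_i(Y_i) = {\mathbb {E}}[X]-\sum _{i=1}^m {{{\mathcal {D}}}}_i(Y_i). \end{aligned}$$

Since \({\mathbb {E}}[X]\) does not depend on the allocation, the optimal value of (24) equals \({\mathbb {E}}[X]-{{{\mathcal {D}}}}^*(X)\).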

So far, we have discussed sharing of a fixed instrument with profit X. Now assume that there are two instruments available, with profits \(X_1\) and \(X_2\), respectively, and the investors are allowed to choose which one to share. If, for example, \({\mathbb {E}}[X_1]={\mathbb {E}}[X_2]\) but \({{{\mathcal {D}}}}^*(X_1)>{{{\mathcal {D}}}}^*(X_2)\), then, for any allocation Y with \(\sum Y_i = X_1\), one may find an allocation Z with \(\sum Z_i = X_2\) that dominates Y, and therefore it does not make sense to choose \(X_1\). From this discussion, it is clear that in general the group of investors should look for a portfolio with as small \({{{\mathcal {D}}}}^*\) as possible, or, in other words, should solve optimization problem (3) with \({{{\mathcal {D}}}}^*\) in place of \({{{\mathcal {D}}}}\). If \({{{\mathcal {D}}}}_1, \dots , {{{\mathcal {D}}}}_m\) are deviation measures with risk envelopes \({{{\mathcal {E}}}}_1, \dots , {{{\mathcal {E}}}}_m\), respectively, then \({{{\mathcal {D}}}}^*\) defined in (25) is also a deviation measure, with risk envelope \({{{\mathcal {E}}}}={{{\mathcal {E}}}}_1 \cap \dots \cap {{{\mathcal {E}}}}_m\), see [5]. Equivalently, if \(\rho _i(y) = {{{\mathcal {D}}}}_i\left( \sum _{j=1}^n y_jr_j\right) \), defined as in (4), has dual set \({{{\mathcal {Q}}}}_i\), \(i=1,\dots ,m\), then the group of investors should solve problem (5) where \(\rho \) is defined in (7) with

$$\begin{aligned} {{{\mathcal {Q}}}}={{{\mathcal {Q}}}}_1 \cap {{{\mathcal {Q}}}}_2 \cap \cdots \cap {{{\mathcal {Q}}}}_m. \end{aligned}$$
(26)

If all \({{{\mathcal {Q}}}}_i\) can be represented in the form (10), then their intersection \({{{\mathcal {Q}}}}\) is also in the form (10), and Theorem 1 can be used to reduce (5) to a linear program.

Now, if \(y^*\) is an optimal solution to (5), then \(X^*=\sum _{i=1}^n y_i^* r_i\) is an optimal solution to (3) with \({{{\mathcal {D}}}}={{{\mathcal {D}}}}^*\). Next, we may solve optimization problem (25) with \(X=X^*\) to find a Pareto optimal allocation \(Y=(Y_1,\dots ,Y_m)\). Note that if \(Y=(Y_1,\dots ,Y_m)\) is Pareto optimal, then for any constants \(C_1,\dots ,C_m\) with \(\sum C_i =0\), the allocation \((Y_1+C_1,\dots ,Y_m+C_m)\) is also Pareto optimal. This allows us to choose a Pareto optimal allocation that is “fair” in various senses, for example, one with \({\mathbb {E}}[Y_1+C_1]=\dots ={\mathbb {E}}[Y_m+C_m]\).
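The constants that equalise expected shares can be computed directly: taking \(C_i={\mathbb {E}}[X]/m-{\mathbb {E}}[Y_i]\) gives \(\sum C_i=0\) and \({\mathbb {E}}[Y_i+C_i]={\mathbb {E}}[X]/m\) for every i. The sketch below illustrates this under the assumption of equally likely scenarios; the function name `fair_shift` and the sample allocation are hypothetical, and the allocation is not claimed to be Pareto optimal.

```python
from statistics import fmean

def fair_shift(allocation):
    """Return constants C_i, summing to zero, that equalise the expected shares."""
    m = len(allocation)
    means = [fmean(y) for y in allocation]
    target = sum(means) / m  # equals E[X] / m for a feasible allocation
    return [target - mu for mu in means]

# Hypothetical allocation of X = [4, -2, 1, 3] over four equally likely scenarios.
Y = [[3.0, -2.0, 1.0, 2.0], [1.0, 0.0, 0.0, 1.0]]
C = fair_shift(Y)
Y_fair = [[v + c for v in y] for y, c in zip(Y, C)]

print("C =", C)
print("expected shares:", [fmean(y) for y in Y_fair])
```

Because deviation measures are insensitive to constant shifts, the shifted shares keep the same deviations as the original ones; only the expected values are redistributed.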

Fig. 1 The optimal solution of cooperative investment problems

Example 7

(Cooperative investment in the S&P 100 index with MAD and CVaR-Deviation) Assume that there are two investors in a financial market, the first one with the MAD risk measure and the second one with a single CVaR-Deviation risk measure at risk level \(\alpha \), and that both investors want to form an optimal joint portfolio to be shared between them. We select the same \(n=96\) instruments from the \( S \& P 100\) index and consider weekly rates of return for \(T=150\) weeks from 04/Jul/2016 to 20/May/2019. The weekly rates of return \(r_{it}\), \(i=1,\dots ,n\), \(t=1,\dots ,T\), are calculated according to

$$\begin{aligned} r_{it}=\frac{P(i,t+1)-P(i,t)}{P(i,t)} \end{aligned}$$
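As an illustration of this formula, the following minimal sketch computes rates of return from a price table. The function name `rates_of_return`, the zero-based indexing, and the price data are assumptions made for the example; they are not the data used in the paper.

```python
def rates_of_return(prices):
    """Apply r_it = (P(i,t+1) - P(i,t)) / P(i,t) to each instrument's price series.

    prices[i] is the list of consecutive weekly prices of instrument i;
    a series of T + 1 prices yields T rates of return.
    """
    return [
        [(p[t + 1] - p[t]) / p[t] for t in range(len(p) - 1)]
        for p in prices
    ]

# Hypothetical prices for two instruments over four weeks (three returns each).
P = [
    [100.0, 110.0, 99.0, 99.0],
    [50.0, 50.0, 55.0, 44.0],
]
r = rates_of_return(P)
for i, row in enumerate(r):
    print(f"instrument {i}:", [round(v, 4) for v in row])
```

Note that T rates of return require T + 1 consecutive prices per instrument.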

We select \(\Delta =0.8\) in (3) and \(\alpha =0.3\). If the investors invest individually, then they solve (3) with \({{{\mathcal {D}}}}(X)=\mathrm{MAD}(X)\) and \({{{\mathcal {D}}}}(X)=\mathrm{CVaR}_\alpha ^\Delta (X)\), respectively, and the optimal values of the objective functions are 0.364 and 0.457, respectively.

In contrast, let the investors solve the cooperative investment problem (3) with \({{{\mathcal {D}}}}={{{\mathcal {D}}}}^*\) to identify the optimal cooperative investment weights \(y^*\). Figure 1 shows the weights \(y^*\) of the optimal cooperative portfolio, and also the weights of the individual optimal portfolios with MAD and CVaR-Deviation, respectively. Further, if \(Y^*_1\) and \(Y^*_2\) are the shares of \(X^*=\sum _{i=1}^n y_i^* r_i\) that the investors receive in the cooperative investment problem, then \(\mathrm{MAD}(Y^*_1)=0.327 < 0.364\), and \(\mathrm{CVaR}_\alpha ^\Delta (Y^*_2)=0.420 < 0.457\).

In Example 7, both investors received shares with the same expected return but lower risk than their optimal individual investments. This is possible because \(Y^*_1\) and \(Y^*_2\) are not representable as \(\sum _{i=1}^n y_i r_i\) and are therefore not available on the market individually. Only their sum \(X^*\) is available on the market.

5 Conclusions

Linear programming is one of the most efficient methods for solving applied problems, including portfolio optimization problems in finance. Various authors have considered specific portfolio optimization problems, e.g. with mean absolute deviation, conditional value-at-risk, or drawdown, and derived linear programming formulations of these problems in each specific case. In this paper, we establish a general framework that reduces all these and many other optimization problems to linear programming in a unified way. In addition to reproving many existing results in a unified fashion, we present a new application to the cooperative portfolio problem, for which a linear programming formulation was not known before this work.