MAD risk parity portfolios

Ararat, Çağın; Cesarone, Francesco; Pınar, Mustafa Çelebi; Ricci, Jacopo Maria

doi:10.1007/s10479-023-05797-2

MAD risk parity portfolios

Original Research
Open access
Published: 16 January 2024

Volume 336, pages 899–924, (2024)
Cite this article

Download PDF

You have full access to this open access article

Annals of Operations Research Aims and scope Submit manuscript

MAD risk parity portfolios

Download PDF

1112 Accesses
1 Citation
1 Altmetric
Explore all metrics

Abstract

In this paper, we investigate the features and the performance of the risk parity (RP) portfolios using the mean absolute deviation (MAD) as a risk measure. The RP model is a recent strategy for asset allocation that aims at equally sharing the global portfolio risk among all the assets of an investment universe. We discuss here some existing and new results about the properties of MAD that are useful for the RP approach. We propose several formulations for finding MAD-RP portfolios computationally, and compare them in terms of accuracy and efficiency. Furthermore, we provide extensive empirical analysis based on three real-world datasets, showing that the performances of the RP approaches generally tend to place both in terms of risk and profitability between those obtained from the minimum risk and the Equally Weighted strategies.

Modified Risk Parity Portfolios to Limit Concentration on Low Risk Assets in Multi-Asset Portfolios

A robust framework for risk parity portfolios

Article 03 August 2020

Minimum risk versus capital and risk diversification strategies for portfolio construction

Article 05 April 2017

Use our pre-submission checklist

Avoid common mistakes on your manuscript.

1 Introduction

The 2008 subprime financial crisis led many scholars and practitioners to strong criticism of classical risk-gain models and to the consequent development of new portfolio selection strategies based on the concept of risk allocation. Indeed, even though risk-gain models often have nice features in terms of formulation and of tractability, they often show several drawbacks, such as their high sensitivity to estimation errors of the input parameters (in particular, of expected returns, see, e.g., Best and Grauer 1991a, Best and Grauer 1991b, Chopra and Ziemba 1993, DeMiguel et al. 2009, and Michaud and Michaud 1998), or their lack of risk diversification. A straightforward method to deal with this issue could be the choice of the Equally Weighted (EW) portfolio, where the invested capital is equally distributed among the assets that belong to the investment universe. However, if the investment universe consists of assets with very different intrinsic risks, then the resulting portfolio has limited total risk diversification.

A more refined risk-focused method is the risk parity (RP), also called Equal Risk Contribution. It consists of selecting a portfolio where each asset equally contributes to the total portfolio risk, regardless of their estimated expected returns (Maillard & Teiletche, 2010). The RP strategy has its roots in the practitioner world (see, e.g., Fabozzi et al. 2021 and Liu et al. 2020), indeed it is often considered as a heuristic method. However, several recent theoretical and empirical findings justify the growing popularity of the RP strategy in the practice of asset allocation.

From an empirical viewpoint, risk parity portfolios show smaller sensitivity to estimation errors of the input parameters compared to portfolios based on risk minimization and on other optimization strategies (Cesarone et al., 2020a). Furthermore, risk parity portfolios seem to show promising out-of-sample performance (Fisher et al., 2015; Jacobsen & Lee, 2020).

From a theoretical viewpoint, since an RP portfolio can be found by solving a minimization problem with logarithmic constraints on the portfolio weights, the RP strategy can be interpreted as a minimum risk approach with a constraint on the minimum level of diversification, which can be seen as a sort of regularization. Furthermore, an example of the theoretical justification of the Risk Parity approach can be found in Oderda (2015), where the author shows that the analytic form of the optimal portfolio solution obtained by maximizing portfolio relative logarithmic wealth at a fixed tracking risk level with respect to the market-capitalization-weighted index, consists of a linear combination of this index, the global minimum variance portfolio, the EW portfolio, the RP portfolio, and the high cash flow rate of return portfolio.

The risk measure commonly used in the RP approach is volatility (Maillard & Teiletche, 2010; Roncalli, 2013). Other risk measures have been considered in the literature such as the Conditional Value-at-Risk (Boudt et al., 2013; Cesarone & Colucci, 2018; Mausser & Romanko, 2018) and the Expectiles (Bellini et al., 2021), both belonging to the class of coherent risk measures. However, the long-only risk parity model has a unique solution when the risk is positive, convex and positively homogeneous (Cesarone et al., 2020b), and for Conditional Value-at-Risk and Expectiles, positivity is not always guaranteed (Bellini et al., 2021; Cesarone & Colucci, 2018; Cesarone & Tardella, 2017). An alternative risk measure to volatility, which is, by definition, positive for nonconstant market returns, is the Mean Absolute Deviation (MAD) belonging to the class of deviation risk measures (Rockafellar et al., 2006). In risk-gain portfolio optimization analysis, it was introduced by Konno and Yamazaki (1991) as an alternative to the Markowitz model.

In this paper, the main goal is to investigate the theoretical properties and the performance of the MAD risk parity portfolios, providing, to the best of our knowledge, multiple contributions to the literature. First, we revisit existing theoretical results on MAD, discuss its differentiability, and provide a characterization of multivariate random market returns for which MAD is additive. We show the conditions that determine this characterization, which can be seen as strong positive dependence among the asset returns. Furthermore, under these conditions, we provide a closed-form solution for the long-only MAD-RP portfolio. Second, we establish the existence and uniqueness of the MAD-RP portfolio, thus extending the theoretical results of Cesarone et al. (2020b). Third, we propose several formulations to find the MAD-RP portfolios practically, thus comparing their performances both in terms of accuracy and efficiency. Finally, we provide an extensive empirical analysis on three real-world datasets by comparing the out-of-sample performance obtained from the global minimum volatility and MAD portfolios, the volatility and MAD risk parity portfolios, and the Equally Weighted portfolio.

The rest of the paper is organized as follows. Section 2 introduces the Mean Absolute Deviation risk measure, providing a discussion of MAD properties relevant to our context. In Sect. 3, we show how to formulate the RP approach with MAD mathematically and how to find the MAD-RP portfolios in practice. Section 4 provides an extensive empirical analysis based on three real-world datasets. More precisely, in Sect. 4.2.1, we test and compare all the MAD-RP formulations in terms of accuracy and efficiency, while in Sects. 4.2.2 and 4.2.3, we report a graphical comparison of some portfolio selection approaches and out-of-sample results, respectively. Finally, in Sect. 5, we draw some overall conclusions.

2 Measuring the portfolio risk by MAD

Let $n\in \mathbb {N}$. We denote by $\mathbb {R}^n$ the n-dimensional Euclidean space and define the cones $\mathbb {R}^n_+:=\{\varvec{x}\in \mathbb {R}^n\mid x_i\ge 0 \text{ for } \text{ each } i\in \{1,\ldots ,n\}\}$, $\mathbb {R}^n_{++}:=\{\varvec{x}\in \mathbb {R}^n\mid x_i>0 \text{ for } \text{ each } i\in \{1,\ldots ,n\}\}$. We also denote by $\Delta ^{n-1}$ the set of all $\varvec{x}\in \mathbb {R}^n_+$ with $\sum _{i=1}^n x_i=1$.

To introduce the probabilistic setup, let us fix a complete probability space $(\Omega , \mathcal {F}, \mathbb {P})$. For an event $A\in \mathcal {F}$, we define its indicator function by $\mathbb {1}_A(\omega )=1$ for $\omega \in A$, and by $\mathbb {1}_A(\omega )=0$ for $\omega \in \Omega {\setminus } A$. We denote by $L^0:=L^0(\Omega ,\mathcal {F},\mathbb {P})$ the space of all $\mathcal {F}$-measurable and real-valued random variables, where two elements are distinguished up to $\mathbb {P}$-almost sure (a.s.) equality. We denote by $L^1:=L^1(\Omega ,\mathcal {F},\mathbb {P})$ the set of all $X\in L^0$ with $\mathbb {E}[\left|X\right|]<+\infty $.

The mean absolute deviation (MAD) of a random variable is a measure of its dispersion and is defined as the expected value of the absolute deviations from a reference value that is generally represented by the mean of the random variable:

$$\begin{aligned} {{\,\textrm{MAD}\,}}(X):=\mathbb {E}[\left|X-\mathbb {E}[X]\right|] \, ,\quad X\in L^1. \end{aligned}$$

(1)

MAD was first used in the field of portfolio optimization by Konno and Yamazaki (1991) as a symmetric risk measure alternative to variance. An upside version of MAD is the mean semi-absolute deviation defined by

$$\begin{aligned} {{\,\textrm{MSAD}\,}}(X):=\mathbb {E}[(X-\mathbb {E}[X])^+] \, ,\quad X\in L^1, \end{aligned}$$

(2)

where $a^+:=\max \{a,0\}$ denotes the positive part of $a\in \mathbb {R}$. As it is well-known, we have ${{\,\textrm{MSAD}\,}}(X)=\frac{1}{2} {{\,\textrm{MAD}\,}}(X)$.

2.1 MAD as a deviation measure

The next proposition summarizes the properties of MAD as a deviation measure. We omit its straightforward proof for brevity and refer the reader to Rockafellar et al. (2006) for a general treatment of deviation measures.

Proposition 1

The functional ${{\,\textrm{MAD}\,}}:L^1 \rightarrow [0, \infty )$ is a deviation measure, that is, it satisfies the following properties for every $X,Y\in L^1$:

(i)
Translation invariance: ${{\,\textrm{MAD}\,}}(X+\alpha )={{\,\textrm{MAD}\,}}(X)$ for every $\alpha \in \mathbb {R}$.
(ii)
Positive homogeneity: ${{\,\textrm{MAD}\,}}(\lambda X)=\lambda {{\,\textrm{MAD}\,}}(X)$ for every $\lambda \ge 0$.
(iii)
Subadditivity: ${{\,\textrm{MAD}\,}}(X+Y)\le {{\,\textrm{MAD}\,}}(X)+{{\,\textrm{MAD}\,}}(Y)$.
(iv)
Normalization: ${{\,\textrm{MAD}\,}}(0)=0$.
(v)
Strict positivity: It holds ${{\,\textrm{MAD}\,}}(X)>0$ if and only if X is not equal to a constant $\mathbb {P}$-a.s.

2.2 Additivity of MAD

Next, we provide a property that allows for finding a closed form solution to MAD-RP portfolios, namely, a characterization of random variables for which MAD is additive. To the best of our knowledge, this basic observation is new to the current paper.

Lemma 2

(Additivity of MAD) Let $X,Y \in L^1$. Then,

$$\begin{aligned} {{\,\textrm{MAD}\,}}(X+Y)={{\,\textrm{MAD}\,}}(X)+{{\,\textrm{MAD}\,}}(Y) \end{aligned}$$

(3)

if and only if

$$\begin{aligned} (X-\mathbb {E}[X])(Y-\mathbb {E}[Y])\ge 0 \quad \mathbb {P}\text{-a.s }. \end{aligned}$$

(4)

Proof

See Appendix A. $\square $

Just as the portfolio volatility is additive for asset returns that are perfectly correlated and the portfolio Conditional Value-at-Risk (CVaR) is additive for asset returns that are comonotone (Tasche, 2002), in view of Lemma 2, MAD is additive for random returns X, Y if and only if the deviations of X and Y from their respective means have the same sign almost surely. Hence, Condition (4) can be interpreted as a kind of positive dependence. Furthermore, we point out that in the case of additivity of homogeneous and subadditive risk measures, the sum of the risks of X and Y is an upper bound for the risk of $X + Y$. Such an upper bound represents the worst portfolio risk, and it occurs for volatility when the linear correlation between X and Y is equal to 1, for CVaR when X and Y are comonotone, and for MAD when X and Y satisfy Condition (4).

Remark 3

Clearly, if X, Y are square-integrable random variables that satisfy (4), then they are positively correlated (i.e., nonnegative covariance). Moreover, if X, Y are square-integrable random variables with perfect positive correlation, then (4) holds. Indeed, having perfect positive correlation implies that $Y=aX+b$ for some $a>0$ and $b\in \mathbb {R}$ so that $(X-\mathbb {E}[X])(Y-\mathbb {E}[Y])=a(X-\mathbb {E}[X])^2\ge 0$ $\mathbb {P}$-a.s. In the next remark and example, we prove that the converse is not true in general.

Remark 4

Let $\alpha \in (0,1)$. For a square-integrable random variable X, its $\alpha $-expectile is defined as

$$\begin{aligned} e_\alpha (X):=\underset{r\in \mathbb {R}}{{{\,\mathrm{argmin\ }\,}}}\mathbb {E}[\alpha ((X-r)^+)^2+(1-\alpha )((X-r)^-)^2]. \end{aligned}$$

In Bellini et al. (2021, Theorem 3), it is shown that the additivity of the $\alpha $-expectiles at two square-integrable random variables X, Y, i.e., $e_\alpha (X+Y)=e_\alpha (X)+e_\alpha (Y)$, is equivalent to the condition

$$\begin{aligned} (X-e_\alpha (X))(Y-e_\alpha (Y))\ge 0\quad \mathbb {P}\text{-a.s }. \end{aligned}$$

(5)

whenever $\alpha \in (\frac{1}{2},1)$. In Bellini et al. (2021 Example 2), it is assumed that $\mathbb {P}\{X=0,Y=0\}=\mathbb {P}\{X=0,Y=1\}=\mathbb {P}\{X=1,Y=0\}=\mathbb {P}\{X=1,Y=1\}=\frac{1}{6}$ and $\mathbb {P}\{X=2,Y=2\}=\frac{1}{3}$. In this case, they show that (5) holds for $\alpha =\frac{3}{4}$ and X, Y are not comonotonic. In the same example, note that we have $\mathbb {E}[X]=\mathbb {E}[Y]=1$, $\mathbb {E}[XY]=\frac{3}{2}$, ${{\,\textrm{MAD}\,}}(X)={{\,\textrm{MAD}\,}}(Y)=\frac{2}{3}$, and ${{\,\textrm{MAD}\,}}(X+Y)=\frac{4}{3}$ so that (4) also holds. In particular, this example shows that (4) does not imply comonotonicity. Since ${{\,\textrm{Var}\,}}(X)={{\,\textrm{Var}\,}}(Y)=\frac{2}{3}$ and ${{\,\textrm{Cov}\,}}(X,Y)=\frac{1}{2}$, X and Y are not perfectly positively correlated. Hence, (4) does not imply perfect positive correlation either. As another example, assuming that $\mathbb {P}\{X=0,Y=0\}=\mathbb {P}\{X=1,Y=1\}=\mathbb {P}\{X=2,Y=1\}=\frac{1}{3}$, they show that (5) does not hold for $\alpha =\frac{3}{4}$ but X, Y are comonotonic. Note that $\mathbb {E}[X]=1$, $\mathbb {E}[Y]=\frac{2}{3}$, ${{\,\textrm{MAD}\,}}(X)=\frac{2}{3}$, ${{\,\textrm{MAD}\,}}(Y)=\frac{4}{9}$, and ${{\,\textrm{MAD}\,}}(X+Y)=\frac{11}{9}$ so that (4) does not hold either. In particular, comonotonicity does not imply (4).

As shown in Remarks 3, 4, two square-integrable random variables that satisfy (4) are positively correlated but not necessarily perfectly correlated. Indeed, the correlation coefficient of such random variables can be any value in [0, 1] as the next example shows.

Example 5

Let $\mu _X,\mu _Y\in \mathbb {R}$ and $\rho \in [0,1]$. Suppose that there exist two square-integrable positive random variables $R_X,R_Y$ such that

$$\begin{aligned} \frac{\mathbb {E}[R_XR_Y]}{\sqrt{\mathbb {E}[R_X^2]\mathbb {E}[R_Y^2]}}=\rho . \end{aligned}$$

Suppose also that there exists an event $A\in \mathcal {F}$ such that

$$\begin{aligned} \mathbb {E}[R_X \mathbb {1}_A]=\frac{\mathbb {E}[R_X]}{2},\quad \mathbb {E}[R_Y \mathbb {1}_A]=\frac{\mathbb {E}[R_Y]}{2}. \end{aligned}$$

(6)

(For instance, any event $A\in \mathcal {F}$ that is independent of $R_X, R_Y$ and has $\mathbb {P}(A)=\frac{1}{2}$ satisfies this condition.) Let $S:=\mathbb {1}_A-\mathbb {1}_{A^c}$ and define

$$\begin{aligned} X:=\mu _X+SR_X,\quad Y:=\mu _Y+SR_Y. \end{aligned}$$

Thanks to (6), we have $\mathbb {E}[SR_X]=\mathbb {E}[SR_Y]=0$ so that $\mathbb {E}[X]=\mu _X$, $\mathbb {E}[Y]=\mu _Y$. Hence,

$$\begin{aligned} {{\,\textrm{Cov}\,}}(X,Y)=\mathbb {E}[(X-\mathbb {E}[X])(Y-\mathbb {E}[Y])]=\mathbb {E}[S^2R_XR_Y]=\mathbb {E}[R_XR_Y]=\rho \sqrt{\mathbb {E}[R_X^2]\mathbb {E}[R_Y^2]}, \end{aligned}$$

which implies that the correlation coefficient of X, Y is $\rho $.

The next theorem is a generalization of Lemma 2 for the multivariate setting. It provides a list of equivalent statements that can be seen as the definition of additivity for MAD when multiple random variables are considered.

Theorem 6

Let $Y_1, \ldots , Y_n\in L^1$. Then, the following are equivalent.

(i)
${{\,\textrm{MAD}\,}}(\sum _{i=1}^n x_iY_i)=\sum _{i=1}^n x_i{{\,\textrm{MAD}\,}}(Y_i)$ for every $\varvec{x}\in \mathbb {R}^n_+$.
(ii)
${{\,\textrm{MAD}\,}}(\sum _{i=1}^n x_iY_i)=\sum _{i=1}^n x_i{{\,\textrm{MAD}\,}}(Y_i)$ for every $\varvec{x}\in \Delta ^{n-1}$.
(iii)
${{\,\textrm{MAD}\,}}(\sum _{i\in I}Y_i)=\sum _{i\in I}{{\,\textrm{MAD}\,}}(Y_i)$ for every nonempty subset $I\subseteq \{1,\ldots ,n\}$.
(iv)
$(Y_i-\mathbb {E}[Y_i])(Y_j-\mathbb {E}[Y_j])\ge 0$ $\mathbb {P}$-a.s. for every $i,j\in \{1,\ldots ,n\}$ with $i\ne j$.

Proof

See Appendix A. $\square $

2.3 Subdifferential of MAD

In order to introduce the marginal risk contributions of portfolios, we provide a theoretical treatment of the subdifferential of MAD as a function of the portfolio weights in this section. Suppose that there are n assets in the market. We write $\varvec{r}=(r_1,\ldots ,r_n)^{\textsf{T}}$ for the random vector of asset returns where we assume $r_i\in L^1$ for each $i\in \{1,\ldots ,n\}$. We also write $\varvec{\mu }=(\mu _1,\ldots ,\mu _n)^{\textsf{T}}$ for the corresponding mean vector of $\varvec{r}$, i.e., $\mathbb {E}[r_i]=\mu _i$ for each $i\in \{1,\ldots ,n\}$. Furthermore, we indicate by $\varvec{x}= (x_{1}, \ldots , x_{n})^{\textsf{T}}\in \mathbb {R}^n$ a vector of portfolio weights, which are the decision variables of the problems tackled in this work. For such $\varvec{x}$, we have $\sum _{i=1}^n x_i =1$ and we denote by $R(\varvec{x})$ the random portfolio return. Note that we have

$$\begin{aligned} R(\varvec{x})=\varvec{r}^{\textsf{T}}\varvec{x}=\sum _{i=1}^{n}r_{i}x_{i} \, . \end{aligned}$$

Then, with some abuse of notation, the risk of the portfolio measured by MAD is given by

$$\begin{aligned} {{\,\textrm{MAD}\,}}(\varvec{x}):={{\,\textrm{MAD}\,}}(R(\varvec{x}))=\mathbb {E} \left[ \left|(\varvec{r}-\varvec{\mu })^{\textsf{T}}\varvec{x}\right| \right] = \mathbb {E} \left[ \left|\sum _{i=1}^{n}(r_{i}-\mu _{i})x_{i}\right| \right] . \end{aligned}$$

(7)

In (7), we observe that the function $\varvec{x}\mapsto {{\,\textrm{MAD}\,}}(\varvec{x})$ is generally not differentiable at some points in its domain due to the absolute value function. However, it is a finite, convex, hence also a continuous function on $\mathbb {R}^n$; thus, it has a nonempty subdifferential $\partial {{\,\textrm{MAD}\,}}(\varvec{x})$ at every given $\varvec{x}\in \mathbb {R}^n$ as calculated by the next proposition.

Proposition 7

Let $\varvec{x}\in \mathbb {R}^n$. Then, the subdifferential of ${{\,\textrm{MAD}\,}}$ at $\varvec{x}$ is given by

$$\begin{aligned} \partial {{\,\textrm{MAD}\,}}(\varvec{x})= \left\{ \mathbb {E} \left[ \left( {{\,\textrm{sgn}\,}}\left( ( \varvec{r}-\varvec{\mu })^{\textsf{T}}\varvec{x} \right) +\eta \mathbb {1}_{ \{(\varvec{r}-\varvec{\mu })^{\textsf{T}}\varvec{x}=0\}} \right) (\varvec{r}-\varvec{\mu }) \right] \mid \eta \in L^0,\ \eta \in [-1,1]\ \mathbb {P}\text{-a.s. } \right\} . \end{aligned}$$

Proof

By viewing ${{\,\textrm{MAD}\,}}$ as an integral functional with respect to the probability measure $\mathbb {P}$, we may calculate its subdifferential using Rockafellar (1973, Theorem 23), which works under the assumption that the probability space is complete. (See also Rockafellar and Wets 1982, Corollary, p. 179 for a similar result with slightly different assumptions.) Note that we may write

$$\begin{aligned} {{\,\textrm{MAD}\,}}(\varvec{x}) = \int _{\Omega }h(\omega ,\varvec{x})\mathbb {P}(d\omega ), \quad \varvec{x}\in \mathbb {R}^n, \end{aligned}$$

where $h:\Omega \times \mathbb {R}^n\rightarrow \mathbb {R}$ is defined by

$$\begin{aligned} h(\omega ,\varvec{x}):=\left|(\varvec{r}(\omega )-\varvec{\mu })^{\textsf{T}}\varvec{x}\right|,\quad \omega \in \Omega ,\varvec{x}\in \mathbb {R}^n. \end{aligned}$$

For every $\varvec{x}\in \mathbb {R}^n$, the function $\omega \mapsto h(\omega ,\varvec{x})$ on $\Omega $ is measurable since $\varvec{r}$ is a random vector; for every $\omega \in \Omega $, the function $\varvec{x}\mapsto h(\omega ,\varvec{x})$ is convex and continuous on $\mathbb {R}^n$. In particular, h is measurable on $\Omega \times \mathbb {R}^n$. Let $\varvec{\xi }:\Omega \rightarrow \mathbb {R}^n$ be a bounded random variable, we can find $M>0$ such that $\left|\xi _i\right|\le M$ for every $i\in \{1,\ldots ,n\}$ $\mathbb {P}$-a.s. Then,

$$\begin{aligned} \int _{\Omega }h(\omega ,\varvec{\xi }(\omega ))\mathbb {P}(d\omega )=\int _{\Omega }\left|(\varvec{r}(\omega )-\varvec{\mu })^{\textsf{T}}\varvec{\xi }(\omega )\right|\mathbb {P}(d\omega )\le M\sum _{i=1}^n \int _{\Omega }\left|r_i(\omega )-\mu _i\right|\mathbb {P}(d\omega )<+\infty \end{aligned}$$

since $r_1,\ldots ,r_n\in L^1$. Hence, all assumptions in Rockafellar (1973, Theorem 23) are satisfied and we obtain that

$$\begin{aligned} \partial {{\,\textrm{MAD}\,}}(\varvec{x})=\{\mathbb {E}[\varvec{\xi }]\mid \varvec{\xi }\in (L^1)^n,\ \mathbb {P}\{\omega \in \Omega \mid \varvec{\xi }(\omega )\in \partial _{\varvec{x}} h(\omega ,\varvec{x})\}=1\}, \end{aligned}$$

(8)

where, for each $\omega \in \Omega $, $\partial _{\varvec{x}}h(\omega ,\varvec{x})$ denotes the subdifferential of the function $\varvec{x}\mapsto h(\omega ,\varvec{x})$ at $\varvec{x}$. By standard rules of finite-dimensional subdifferential calculus (see, e.g., Rockafellar 1970, Theorem 23.9), we can calculate this set as

$$\begin{aligned} \partial _{\varvec{x}}h(\omega ,\varvec{x})=(\varvec{r}(\omega )-\varvec{\mu })\partial \left|\cdot \right|((\varvec{r}(\omega )-\varvec{\mu })^{\textsf{T}}\varvec{x}), \end{aligned}$$

(9)

where $\partial \left|\cdot \right|$ denotes the (set-valued) subdifferential mapping of the absolute value function and is given by $\partial \left|\cdot \right|(s)=\{{{\,\textrm{sgn}\,}}(s)\}$ for $s\ne 0$, and by $\partial \left|\cdot \right|(s)=[-1,+1]$ for $s=0$, which can be written compactly as

$$\begin{aligned} \partial \left|\cdot \right|(s)=\{{{\,\textrm{sgn}\,}}(s)+t\mathbb {1}_{\{0\}}(s)\mid t\in [-1,+1]\},\quad s\in \mathbb {R}. \end{aligned}$$

(10)

Combining (9), (10), we get

$$\begin{aligned} \partial _{\varvec{x}}h(\omega ,\varvec{x})=\{({{\,\textrm{sgn}\,}}((\varvec{r}(\omega )-\varvec{\mu })^{\textsf{T}}\varvec{x})+t\mathbb {1}_{\{0\}}((\varvec{r}(\omega )-\varvec{\mu })^{\textsf{T}}\varvec{x}))(\varvec{r}(\omega )-\varvec{\mu })\mid t\in [-1,+1]\}. \end{aligned}$$

Combining this with (8) yields that $\partial {{\,\textrm{MAD}\,}}(\varvec{x})$ is the set of all expectations of the form $\mathbb {E}[\varvec{\xi }]$, where $\varvec{\xi }\in (L^1)^n$ is such that

$$\begin{aligned} \mathbb {P}\{\omega \in \Omega \mid \exists t\in [-1,1]:\varvec{\xi }(\omega )=({{\,\textrm{sgn}\,}}((\varvec{r}(\omega )-\varvec{\mu })^{\textsf{T}}\varvec{x})+t\mathbb {1}_{\{(\varvec{r}-\varvec{\mu })^{\textsf{T}}\varvec{x}=0\}}(\omega ))(\varvec{r}(\omega )-\varvec{\mu })\}=1. \end{aligned}$$

(11)

Here, t depends on $\omega $ but its measurability is not guaranteed. To resolve this measurability issue, we show that condition (11) holds if and only if there exists $\eta \in L^0$ such that $-1\le \eta \le 1$ $\mathbb {P}$-a.s. and

$$\begin{aligned} \mathbb {P}\{\omega \in \Omega \mid \varvec{\xi }(\omega )=({{\,\textrm{sgn}\,}}((\varvec{r}(\omega )-\varvec{\mu })^{\textsf{T}}\varvec{x})+\eta (\omega )\mathbb {1}_{\{(\varvec{r}-\varvec{\mu })^{\textsf{T}}\varvec{x}=0\}}(\omega ))(\varvec{r}(\omega )-\varvec{\mu })\}=1. \end{aligned}$$

(12)

Indeed, the “if" part of the claim is trivial. For the “only if" part, suppose that (11) holds. Let $A\in \mathcal {F}$ be the event in the condition. Hence, the set

$$\begin{aligned} T(\omega ):=\{t\in [-1,1]\mid \varvec{\xi }(\omega )=({{\,\textrm{sgn}\,}}((\varvec{r}(\omega )-\varvec{\mu })^{\textsf{T}}\varvec{x})+t\mathbb {1}_{\{(\varvec{r}-\varvec{\mu })^{\textsf{T}}\varvec{x}=0\}}(\omega ))(\varvec{r}(\omega )-\varvec{\mu })\} \end{aligned}$$

is nonempty for every $\omega \in A$. Moreover,

$$\begin{aligned} (t,\omega )\mapsto ({{\,\textrm{sgn}\,}}((\varvec{r}(\omega )-\varvec{\mu })^{\textsf{T}}\varvec{x})+t\mathbb {1}_{\{(\varvec{r}-\varvec{\mu })^{\textsf{T}}\varvec{x}=0\}}(\omega ))(\varvec{r}(\omega )-\varvec{\mu }) \end{aligned}$$

is a Carathéodory function, i.e., it is continuous in $t\in [-1,1]$ and measurable in $\omega \in \Omega $. For definiteness, let us set $t(\omega )=0$ for $\omega \in \Omega {\setminus } A$. The measurability of $\varvec{\xi }$ together with Rockafellar and Wets (1997, Corollary 14.6, Example 14.15) imply that there exists a random variable $\eta $ such that $\eta (\omega )\in T(\omega )$ for almost every $\omega \in \Omega $. Therefore, (12) holds for $\eta $, which yields the desired subdifferential formula for ${{\,\textrm{MAD}\,}}$. $\square $

3 Risk parity approach with MAD

The risk parity (RP) approach aims to choose the portfolio weights such that the risk contributions of all assets are equal. In this section, we discuss the existence, uniqueness, and calculation of such portfolio weights.

3.1 MAD-RP portfolios: existence and uniqueness

The standard method for decomposing a risk measure (in the broad sense) into additive components is based on Euler’s homogenous function theorem. Since $\varvec{x}\mapsto {{\,\textrm{MAD}\,}}(\varvec{x})$ is a homogeneous function of degree 1, if it were continuously differentiable, then we would have

$$\begin{aligned} {{\,\textrm{MAD}\,}}(\varvec{x})= \nabla {{\,\textrm{MAD}\,}}(\varvec{x})^{\textsf{T}}\varvec{x}=\sum _{i=1}^n x_i\frac{\partial {{\,\textrm{MAD}\,}}(\varvec{x})}{\partial x_i}. \end{aligned}$$

(13)

Similar to the case of volatility treated as a risk measure (Maillard & Teiletche, 2010), we could interpret the quantity $x_{i} \frac{\partial {{\,\textrm{MAD}\,}}(\varvec{x})}{\partial x_{i}}$ as the risk contribution of the asset i to the global MAD of the portfolio.

We need a suitable replacement of (13) since MAD is not differentiable at some points of its domain in general. Nevertheless, by the generalized version of Euler’s homogeneous function theorem for the subdifferentials of convex functions Yang and Wei (2008, Theorem 3.1), we still have

$$\begin{aligned} {{\,\textrm{MAD}\,}}(\varvec{x})=\varvec{s}^{\textsf{T}}\varvec{x},\quad \varvec{s}\in \partial {{\,\textrm{MAD}\,}}(\varvec{x}). \end{aligned}$$

(14)

Given two vectors $\varvec{a},\varvec{b}\in \mathbb {R}^n$, we write $\varvec{a}\cdot \varvec{b}=(a_1b_1,\ldots ,a_nb_n)^{\textsf{T}}$ for their Hadamard product. Let us introduce the set

$$\begin{aligned} \mathcal{R}\mathcal{C}(\varvec{x}):=\varvec{x}\cdot \partial {{\,\textrm{MAD}\,}}(\varvec{x}):=\{\varvec{x}\cdot \varvec{s}(\varvec{x})\mid \varvec{s}(\varvec{x})\in \partial {{\,\textrm{MAD}\,}}(\varvec{x})\}. \end{aligned}$$

(15)

Suppose that $\varvec{x}\in \Delta ^{n-1}$ is a portfolio vector. For each ${{\,\textrm{RC}\,}}(\varvec{x})=({{\,\textrm{RC}\,}}_i(\varvec{x}),\ldots ,{{\,\textrm{RC}\,}}_n(\varvec{x}))^{\textsf{T}}\in \mathcal{R}\mathcal{C}(\varvec{x})$, (14) yields that

$$\begin{aligned} {{\,\textrm{MAD}\,}}(\varvec{x})=\sum _{i=1}^{n} {{\,\textrm{RC}\,}}_{i}(\varvec{x}). \end{aligned}$$

For this reason, we call ${{\,\textrm{RC}\,}}(\varvec{x})$ a feasible risk contribution vector for portfolio $\varvec{x}$. For instance, it is easy to verify that $\varvec{x}\cdot \mathbb {E}[{{\,\textrm{sgn}\,}}((\varvec{r}-\varvec{\mu })^{\textsf{T}}\varvec{x})(\varvec{r}-\varvec{\mu })]$ is a feasible risk contribution vector for $\varvec{x}$; see Proposition 7.

A MAD-RP portfolio is characterized by the requirement of having the same risk contribution for each asset. More precisely, a long-only portfolio $\varvec{x}\in \Delta ^{n-1}$ is called a MAD-RP portfolio if there exists ${{\,\textrm{RC}\,}}(\varvec{x})\in \mathcal{R}\mathcal{C}(\varvec{x})$ such that

$$\begin{aligned} {{\,\textrm{RC}\,}}_{i}(\varvec{x})= {{\,\textrm{RC}\,}}_{j}(\varvec{x}) \end{aligned}$$

(16)

for each $i,j\in \{1,\ldots ,n\}$ with $i\ne j$. Then, a long-only MAD-RP portfolio can be found by solving the system

$$\begin{aligned} \left\{ \begin{array}{ll} {{\,\textrm{RC}\,}}_{i}(\varvec{x})=\lambda , &{} \quad i\in \{1,\ldots ,n\}, \\ \varvec{x}\in \Delta ^{n-1},\\ {{\,\textrm{RC}\,}}(\varvec{x})\in \mathcal{R}\mathcal{C}(\varvec{x}),\\ \lambda \in \mathbb {R}. \end{array} \right. \end{aligned}$$

(17)

For the results of this section, we work under the following nondegeneracy assumption.

Assumption 8

The only vector $\varvec{x}\in \mathbb {R}^n_+$ such that $(\varvec{r}-\varvec{\mu })^{\textsf{T}}\varvec{x}=0$ $\mathbb {P}$-a.s. is $\varvec{x}=0$.

We discuss the mildness of Assumption 8 in the next two examples.

Example 9

Suppose that $\varvec{r}-\varvec{\mu }$ has distinct values $\varvec{\nu }_1,\ldots ,\varvec{\nu }_m\in \mathbb {R}^n$ with respective probabilities $p_1,\ldots ,p_m>0$, where $m\ge n$ and $\sum _{j=1}^m p_j=1$. In this case, the condition $(\varvec{r}-\varvec{\mu })^{\textsf{T}}\varvec{x}=0$ $\mathbb {P}$-a.s. is equivalent to the linear system $\varvec{A}\varvec{x}=\varvec{0}_m$, where $\varvec{A}\in \mathbb {R}^{m\times n}$ is the matrix whose respective rows are $\varvec{\nu }_1^{\textsf{T}}, \ldots , \varvec{\nu }_m^{\textsf{T}}$ and $\varvec{0}_m$ is the m-dimensional zero vector. Then, the following are equivalent:

(i)
Assumption 8 holds.
(ii)
The only solution of the system $\varvec{A}\varvec{x}=\varvec{0}_m, \varvec{x}\in \mathbb {R}^n_+$ is $\varvec{x}=0$.
(iii)
There exists $\varvec{\lambda }\in \mathbb {R}^m$ such that $\varvec{A}^{\textsf{T}}\varvec{\lambda }\in \mathbb {R}^n_{++}$.
(iv)
There exist $\lambda _1,\ldots ,\lambda _m\in \mathbb {R}$ such that $\sum _{j=1}^m \lambda _j\varvec{\nu }_j\in \mathbb {R}^n_{++}$.

Here, the equivalence between (ii) and (iii) is by Gordan’s Alternative Theorem; see Gordan (1873) and Güler (2010, Theorem 3.14). In particular, Assumption 8 holds if the rank of $\varvec{A}$ is n, i.e., there are n linearly independent vectors among $\varvec{\nu }_1,\ldots ,\varvec{\nu }_m$.

Example 10

Suppose that $\varvec{r}-\varvec{\mu }$ has the multivariate centered Gaussian distribution with covariance matrix $\varvec{\Sigma }\in \mathbb {R}^{n\times n}$. Then, for every $\varvec{x}\in \mathbb {R}^n$, $(\varvec{r}-\varvec{\mu })^{\textsf{T}}\varvec{x}$ has the univariate centered Gaussian distribution with variance $\varvec{x}^{\textsf{T}}\varvec{\Sigma }\varvec{x}$. In this case, the condition $(\varvec{r}-\varvec{\mu })^{\textsf{T}}\varvec{x}=0$ $\mathbb {P}$-a.s. is equivalent to $\varvec{x}^{\textsf{T}}\varvec{\Sigma }\varvec{x}=0$. Hence, Assumption 8 holds if and only if $\varvec{\Sigma }$ is strictly copositive. In particular, Assumption 8 holds if $\varvec{\Sigma }$ is positive definite, i.e., $\varvec{\Sigma }$ has full rank or, equivalently, $\varvec{\Sigma }$ is nonsingular.

The next proposition establishes the existence and uniqueness of a MAD-RP portfolio.

Proposition 11

Suppose that Assumption 8 holds.

(i)
Let $\lambda >0$ and consider the problem
$$\begin{aligned} \begin{array}{rl} \displaystyle \min _{\varvec{x}\in \mathbb {R}^n_{++}} &{} {{\,\textrm{MAD}\,}}(\varvec{x})-\lambda \displaystyle \sum _{i=1}^{n}\ln x_{i}.\\ \end{array} \end{aligned}$$
(18)
Then, there exists a unique optimal solution of this problem.
(ii)
There exists a unique MAD-RP portfolio, that is, the system (17) has a unique solution.

Proof

(i) The supposition ensures that ${{\,\textrm{MAD}\,}}(\varvec{x})=\mathbb {E}[|(\varvec{r}-\varvec{\mu })^{\textsf{T}}\varvec{x}|]>0$ for every $\varvec{x}\in \mathbb {R}^n_+\setminus \{0\}$, that is, ${{\,\textrm{MAD}\,}}$ is a positive function as defined in Cesarone et al. (2020b, Section 2). Moreover, ${{\,\textrm{MAD}\,}}$ is also convex and positively homogeneous. Although ${{\,\textrm{MAD}\,}}$ is not continuously differentiable at some points of its domain, the same arguments as in the proof of Cesarone et al. (2020b, Section 2) are applicable, and it follows that the given problem has a unique optimal solution.

(ii) Let us fix some $\lambda >0$. By (i), there exists a unique optimal solution $\bar{\varvec{x}}$ of the problem in (18). By Cesarone et al. (2020b, Proposition 2), $\varvec{x}^*:=\frac{\bar{\varvec{x}}}{\sum _{i=1}^n \bar{x}_i}$ is the unique optimal solution of the problem in (18) but with $\lambda $ replaced with $\lambda ^*:=\frac{\lambda }{\sum _{i=1}^n \bar{x}_i}$. By the first order condition for optimality, we have

$$\begin{aligned} 0\in \partial {{\,\textrm{MAD}\,}}(\varvec{x}^*)-\lambda ^* \left[ \frac{1}{x_i^*} \right] _{i=1}^n, \end{aligned}$$

(19)

that is,

$$\begin{aligned} (\lambda ^*,\ldots ,\lambda ^*)^{\textsf{T}} \in \varvec{x}^*\cdot \partial {{\,\textrm{MAD}\,}}(\varvec{x}^*). \end{aligned}$$

Hence, there exists $\varvec{s}(\varvec{x}^*)\in \partial {{\,\textrm{MAD}\,}}(\varvec{x}^*)$ such that $x^{*}_i s_i(\varvec{x}^*)=\lambda ^*$ for every $i\in \{1,\ldots ,n\}$. In particular, ${{\,\textrm{RC}\,}}(\varvec{x}):=\varvec{x}^*\cdot \varvec{s}(\varvec{x}^*)\in \mathcal{R}\mathcal{C}(\varvec{x}^*)$ and ${{\,\textrm{RC}\,}}_i(\varvec{x})=\lambda ^*$ for every $i\in \{1,\ldots ,n\}$. Therefore, $\varvec{x}^*$ is a MAD-RP portfolio. Its uniqueness follows from the strict convexity of the logarithmic objective function in (18). $\square $

Remark 12

In Cesarone et al. (2020b, Theorem 2), the existence and uniqueness of an RP portfolio are shown for a continuously differentiable risk measure. Their proof has two main components: (1) arguments based on coercivity and Weierstrass theorem to prove that the associated logarithmic problem (similar to (18)) has a unique optimal solution, where the positivity, convexity, and positive homogeneity of the risk measure are used but its differentiability is not used, (2) their Proposition 1, where it is shown that an optimal solution of the logarithmic problem gives rise to an RP portfolio and the proof uses first order conditions for the minimization of differentiable functions. In the present paper, (1) still works when the risk measure is replaced with MAD, as stated in the proof of Proposition 11(i). However, in the proof of Proposition 11(ii), we extend (2) by using the more general first order condition based on subdifferentials.

Assuming that the portfolio MAD is additive (see Theorem 6(iv)), i.e., the asset returns satisfy the condition

$$\begin{aligned} (r_i-\mu _i)(r_j-\mu _j)\ge 0 \quad \mathbb {P}\text{-a.s. } \end{aligned}$$

(20)

for every $i,j\in \{1,\ldots ,n\}$ with $i\ne j$, we show in the next proposition that the weights of the assets in the MAD-RP portfolio are proportional to the reciprocals of the MADs of the individual asset returns.

We point out that taking a portfolio with weights proportional to the inverse of the risks of the individual asset returns is a naïve approach frequently used in practice. This approach is often called naïve risk parity due to the strong implicit assumption on the dependence of the asset returns (see, e.g., Qian 2011, Qian 2017, Clarke et al. 2013, and Haesen et al. 2017).

Proposition 13

(Closed-form solution for the long-only MAD-RP portfolio) Suppose that Assumption 8 holds. Furthermore, suppose that the additivity condition (20) holds. Then, the unique MAD-RP portfolio is given by

$$\begin{aligned} x_{i}=\displaystyle \frac{\mathbb {E}[\left|r_{i}-\mu _{i}\right|]^{-1}}{\sum _{j=1}^{n}\mathbb {E}[ \left|r_{j}-\mu _{j}\right|]^{-1}},\quad i\in \{1,\ldots ,n\}. \end{aligned}$$

(21)

Proof

The first supposition ensures that there is a unique MAD-RP portfolio by Proposition 11. Moreover, as checked in the proof of Proposition 11, we have ${{\,\textrm{MAD}\,}}(r_i)=\mathbb {E}[|r_i-\mu _i|]>0$ for each $i\in \{1,\ldots ,n\}$. Hence, the expression in (21) is well-defined.

Under (20), MAD is additive for long-only portfolios by Theorem 6. Hence, we have

$$\begin{aligned} {{\,\textrm{MAD}\,}}(\varvec{x})= \sum _{i=1}^{n} x_{i} {{\,\textrm{MAD}\,}}(r_{i}) \end{aligned}$$

(22)

for every $\varvec{x}\in \mathbb {R}^n_+$, i.e., ${{\,\textrm{MAD}\,}}$ is linear on $\mathbb {R}^n_+$. It follows that ${{\,\textrm{MAD}\,}}$ is differentiable at every $\varvec{x}\in \mathbb {R}^n_{++}$ and the gradient is given by $\nabla {{\,\textrm{MAD}\,}}(\varvec{x})=({{\,\textrm{MAD}\,}}(r_1),\ldots ,{{\,\textrm{MAD}\,}}(r_n))^{\textsf{T}}$. Hence, $\mathcal{R}\mathcal{C}(\varvec{x})$ is a singleton and the only feasible risk contribution vector is given by

$$\begin{aligned} {{\,\textrm{RC}\,}}(\varvec{x})=(x_1{{\,\textrm{MAD}\,}}(r_1),\ldots ,x_n{{\,\textrm{MAD}\,}}(r_n))^{\textsf{T}} \end{aligned}$$

for each $\varvec{x}\in \mathbb {R}^n_{++}$. Imposing condition (16), we can write

$$\begin{aligned} x_{i} \mathbb {E}[ \left|r_{i}-\mu _{i}\right|] = x_{j} \mathbb {E}[ \left|r_{j}-\mu _{j}\right|], \quad i,j\in \{1,\ldots ,n\}. \end{aligned}$$

(23)

Thus, for a fixed $i\in \{1,\ldots ,n\}$, considering the full investment constraint $\sum _{j=1}^{n}x_{j}=1$, we obtain

$$\begin{aligned} \sum _{j=1}^n x_j =\sum _{j=1}^n \frac{\mathbb {E}[\left|r_i-\mu _i\right|]}{\mathbb {E}[\left|r_j-\mu _j\right|]}x_i=\mathbb {E}[\left|r_i-\mu _i\right|]x_i\sum _{j=1}^n \mathbb {E}[\left|r_j-\mu _j\right|]^{-1}=1. \end{aligned}$$

From this, (21) follows immediately. Hence, the portfolio given in (21) is a MAD-RP portfolio, and this is the only such portfolio as established above. $\square $

3.2 MAD-RP portfolios: calculations

In the following sections, we provide three methods for finding MAD-RP portfolios in practice. For these methods, as usual in portfolio optimization, we use a lookback approach where the possible realizations of the discrete random returns arise from historical data. This will ensure that our optimization problems are finite-dimensional. Let $T\in \mathbb {N}$ be the length of the time series of the prices of the assets. For each $i\in \{1,\ldots ,n\}$ and $t\in \{0,\ldots ,T\}$, we denote by $p_{it}$ the realized price of asset i at time t, and for $t\ge 1$, we denote by $r_{it}=\frac{p_{it}-p_{i(t-1)}}{p_{i(t-1)}}$ the realized (linear) return of asset i for the period ending at t. We assume that there are no ties of the outcomes and, therefore, that each historical scenario is equally likely with probability $\frac{1}{T}$ (see, e.g., Carleo et al. 2017, Cesarone 2020, and references therein). In particular, for each $i\in \{1,\ldots ,n\}$, the random return $r_i$ of asset i is a discrete random variable with the uniform distribution over the set $\{r_{i1},\ldots ,r_{iT}\}$ and we have $\mu _i=\frac{1}{T}\sum _{t=1}^T r_{it}$. Furthermore, we assume that the only vector $\varvec{x}\in \mathbb {R}^n_+$ such that $(\varvec{r}-\varvec{\mu })^{\textsf{T}}\varvec{x}=0$ $\mathbb {P}$-a.s. is $\varvec{x}=0$, which guarantees the existence and uniqueness of an MAD-RP portfolio by Proposition 11.

3.2.1 Logarithmic formulations

The first formulation that we present consists of minimizing MAD with a logarithmic barrier term in the objective function (Bai et al., 2016) as already given in (18):

$$\begin{aligned} \left\{ \begin{array}{rl} \displaystyle \min _{\varvec{x}\in \mathbb {R}^n_{++}} &{} {{\,\textrm{MAD}\,}}(\varvec{x})-\lambda \displaystyle \sum _{i=1}^{n}\ln x_{i},\\ \end{array} \right. \end{aligned}$$

(24)

where $\lambda >0$ is a constant. We name this strictly convex optimization problem log_obj. Suppose that log_obj has an optimal solution $\bar{\varvec{x}}$ (due to the strict convexity of the objective function, it must be the unique optimal solution). Then, following the arguments in the proof of Proposition 11, the normalized portfolio $\varvec{x}^*=\frac{\bar{\varvec{x}}}{\sum _{i=1}^{n}\bar{x}_{i}}$ is the MAD-RP portfolio and it is the unique optimal solution of log_obj with $\lambda $ replaced with $\lambda ^*:=\frac{\lambda }{\sum _{i=1}^n \bar{x}_i}$.

The second formulation has a logarithmic constraint and we call it log_constr (see Spinu 2013, Cesarone and Colucci 2018, Bellini et al. 2021). It consists of solving the following convex problem:

$$\begin{aligned} \left\{ \begin{array}{rl} \displaystyle \min _{\varvec{x}\in \mathbb {R}^n_{++}} &{} {{\,\textrm{MAD}\,}}(\varvec{x})\\ &{} \displaystyle \sum _{i=1}^{n}\ln x_{i}\ge c, \end{array} \right. \end{aligned}$$

(25)

where $c\in \mathbb {R}$ is a constant. Moreover, if $\bar{\varvec{x}}$ is an optimal solution of log_constr, then $\frac{\bar{\varvec{x}}}{\sum _{i=1}^{n}\bar{x}_{i}}$ is the MAD-RP as checked in the next remark.

Remark 14

Since the function $\varvec{x}\mapsto {{\,\textrm{MAD}\,}}(\varvec{x})$ is not differentiable everywhere, we can apply the general Karush-Kuhn-Tucker conditions with subdifferentials as formulated in Rockafellar (1970, Theorem 28.3). Note that the Lagrangian of log_constr is given by

$$\begin{aligned} L(\varvec{x}, \lambda )={{\,\textrm{MAD}\,}}(\varvec{x})+\lambda \biggl (c-\sum _{i=1}^{n}\ln (x_{i}) \biggr ) \end{aligned}$$

(26)

for each $\varvec{x}\in \mathbb {R}^n_{++}$, $\lambda \ge 0$. Then, for a fixed $\lambda \ge 0$, the subdifferential of $\varvec{x}\mapsto L(\varvec{x},\lambda )$ at $\varvec{x}\in \mathbb {R}^n_{++}$ is given by

$$\begin{aligned} \partial _{\varvec{x}}L(\varvec{x},\lambda )=\partial {{\,\textrm{MAD}\,}}(\varvec{x})-\lambda \left[ \frac{1}{x_i} \right] _{i=1}^n. \end{aligned}$$

In particular, the problem $\inf _{\varvec{x}\in \mathbb {R}^n_{++}} L(\varvec{x},\lambda )$ of calculating the dual objective function at a given $\lambda > 0$ is equivalent to log_obj. Let $\bar{\varvec{x}}$ be an optimal solution of log_constr with $\bar{\lambda }\ge 0$ denoting the Lagrange multiplier of the logarithmic constraint at optimality. Similar to (19), we have $0\in \partial _{\varvec{x}}L(\bar{\varvec{x}},\bar{\lambda })$. Arguing as in the proof of Proposition 11, we obtain the existence of $\varvec{s}(\bar{\varvec{x}})\in \partial {{\,\textrm{MAD}\,}}(\bar{\varvec{x}})$ such that $\bar{x}_is_i(\bar{\varvec{x}})=\bar{\lambda }$ for every $i\in \{1,\ldots ,n\}$. In particular, $\bar{\varvec{x}}\cdot \varvec{s}(\bar{\varvec{x}})\in {{\,\textrm{RC}\,}}(\bar{\varvec{x}})$ and ${{\,\textrm{MAD}\,}}(\bar{\varvec{x}})=n\bar{\lambda }$. Since ${{\,\textrm{MAD}\,}}(\bar{\varvec{x}})>0$ by the assumption stated at the beginning of this section, we have $\bar{\lambda }>0$. On the other hand, by complementary slackness, we have $\bar{\lambda }(c-\sum _{i=1}^n\ln (\bar{x}_i))=0$. Hence, $c=\sum _{i=1}^n \ln (\bar{x}_i)$. It follows that the normalized version $\varvec{x}^*:=\frac{\bar{\varvec{x}}}{\sum _{i=1}^n \bar{x}_i}$ is the MAD-RP portfolio and it is an optimal solution of log_constr with c replaced with $c^*:=\sum _{i=1}^n \ln (x^*_i)=\sum _{i=1}^n \ln (\bar{x}_i)-n\ln (\sum _{i=1}^n \bar{x}_i)=c-n\ln (\sum _{i=1}^n \bar{x}_i)$.

Formulation (25) allows us to obtain some features of the risk parity portfolio $\varvec{x}^{MAD-RP}$ related to a global minimizer $\varvec{x}^{MinMAD}$ of MAD (i.e., the MinMAD portfolio) and the EW portfolio $\varvec{x}^{EW}$, as shown in Remarks 15 and 16, respectively.

Remark 15

(MinMAD vs. MAD-RP) From the formulation log_constr in (25), it is clear that

$$\begin{aligned} {{\,\textrm{MAD}\,}}(\varvec{x}^{MinMAD})\le {{\,\textrm{MAD}\,}}(\varvec{x}^{MAD-RP}). \end{aligned}$$

Remark 16

(EW vs. MAD-RP) Let us consider a slightly modified version of Problem (25), where we add the budget constraint $\displaystyle \sum _{i=1}^{n} x_{i}=1$. Due to Jensen inequality, we have

$$\begin{aligned} \displaystyle \frac{1}{n}\sum _{i=1}^{n}\ln (x_{i})\le \ln \biggl (\frac{1}{n}\sum _{i=1}^{n}x_{i}\biggr ) = - \ln (n) \end{aligned}$$

(27)

for every feasible solution $\varvec{x}$ of the new problem. Now, fixing $c=-n\ln (n)$ in Problem (25), the unique feasible solution is given by $x_{i}=\frac{1}{n}$, i.e., $\varvec{x}=\varvec{x}^{EW}$. As a consequence, we have

$$\begin{aligned} {{\,\textrm{MAD}\,}}(\varvec{x}^{MAD-RP})\le {{\,\textrm{MAD}\,}}(\varvec{x}^{EW}). \end{aligned}$$

3.2.2 System-of-equation formulations

An alternative method, named soe_1, consists of solving the system (17) directly. Following the expression of the set $\mathcal{R}\mathcal{C}(\varvec{x})$ in (15) and that of the subdifferential $\partial {{\,\textrm{MAD}\,}}(\varvec{x})$ in Proposition 7, we can rewrite this system as

$$\begin{aligned} \negthinspace \negthinspace \negthinspace \negthinspace \negthinspace \left\{ \negthinspace \begin{array}{ll} \frac{x_{i}}{T}\sum _{t=1}^T s_t(r_{it}-\mu _{i}) = \displaystyle \lambda , &{} i\in \{1,\ldots ,n\},\\ \sum \limits _{i=1}^{n}x_{i}=1, &{} \\ x_{i}\ge 0, &{} i\in \{1,\ldots ,n\}, \\ \sum _{i=1}^n (r_{it}-\mu _{i})x_i\ne 0 \Rightarrow s_t={{\,\textrm{sgn}\,}}(\sum _{i=1}^n (r_{it}-\mu _{i})x_i), &{} t\in \{1,\ldots ,T\},\\ -1\le s_t\le +1,&{} t\in \{1,\ldots ,T\}, \\ \lambda \in \mathbb {R}, \; x_i\in \mathbb {R},\; s_t\in \mathbb {R},\; &{} i\in \{1,\ldots ,n\}, t\in \{1,\ldots ,T\}.\\ \end{array} \right. \end{aligned}$$

(28)

Note that $s_t$ is an auxiliary decision variable in this formulation and it stands for a subgradient of the absolute value function at the point $\sum _{i=1}^n (r_{it}-\mu _{i})x_i$. The logical implication constraint in the above formulation can be converted into linear inequalities by introducing additional variables. We rewrite $s_t$ as a convex combination of $+1$ and $-1$, i.e., $s_t=\alpha _t-(1-\alpha _t)=2\alpha _t-1$ for some variable $\alpha _t$ with values in [0, 1], where $\alpha _t$ is enforced to be 1 when $\sum _{i=1}^n (r_{it}-\mu _{i})x_i>0$, and it is enforced to be 0 when $\sum _{i=1}^n (r_{it}-\mu _{i})x_i<0$. This is achieved by the following formulation with the help of two additional binary decision variables:

$$\begin{aligned} \left\{ \begin{array}{ll} \frac{x_{i}}{T} \sum _{t=1}^T (2\alpha _t-1)(r_{it}-\mu _{i}) = \displaystyle \lambda , &{} \quad i\in \{1,\ldots ,n\},\\ \sum \limits _{i=1}^{n}x_{i}=1, &{} \\ x_{i}\ge 0, &{}\quad i\in \{1,\ldots ,n\}, \\ -Mu_t\le \sum _{i=1}^n (r_{it}-\mu _{i})x_i \le Mv_t,&{}\quad t\in \{1,\ldots ,T\},\\ 1-M(1-v_t)\le \alpha _t\le M(1-u_t),&{}\quad t\in \{1,\ldots ,T\},\\ 0\le \alpha _t\le 1,&{}\quad t\in \{1,\ldots ,T\}, \\ u_t,v_t\in \{0,1\},&{}\quad t\in \{1,\ldots ,T\},\\ \lambda \in \mathbb {R}; \; x_i\in \mathbb {R};\; \alpha _t,u_t,v_t\in \mathbb {R},&{}\quad i\in \{1,\ldots ,n\},\ t\in \{1,\ldots ,T\}, \end{array} \right. \end{aligned}$$

(29)

where $M>0$ is a suitably large constant. This reformulation is processed numerically using the global optimization software BARON (Tawarmalani and Sahinidis 2005).

Remark 17

As a simplification of (29), one may consider the following system:

$$\begin{aligned} \left\{ \begin{array}{ll} \frac{x_{i}}{T} \sum _{t=1}^T (v_t-u_t)(r_{it}-\mu _{i}) = \displaystyle \lambda , &{} \quad i\in \{1,\ldots ,n\},\\ \sum \limits _{i=1}^{n}x_{i}=1, &{} \\ x_{i}\ge 0, &{} \quad i\in \{1,\ldots ,n\},\\ -Mu_t\le \sum _{i=1}^n (r_{it}-\mu _{i})x_i \le M v_t,&{}\quad t\in \{1,\ldots ,T\},\\ v_t + u_t \le 1,&{}\quad t\in \{1,\ldots ,T\},\\ u_t,v_t\in \{0,1\},&{}\quad t\in \{1,\ldots ,T\},\\ \lambda \in \mathbb {R};\; x_i\in \mathbb {R};\; u_t,v_t\in \mathbb {R},&{}\quad i\in \{1,\ldots ,n\},\ t\in \{1,\ldots ,T\}. \end{array} \right. \end{aligned}$$

(30)

In this system, $v_t-u_t$ stands for a subgradient of the absolute value function at a point where we have $\sum _{i=1}^n (r_{it}-\mu _{i})x_i $; however, it is restricted to the set $\{-1,0,+1\}$ when $\sum _{i=1}^n (r_{it}-\mu _{i})x_i =0$. Hence, for some corner cases, the system (30) may be infeasible even though the MAD-RP portfolio exists. On the other hand, if $\varvec{x}^*$ is part of a solution of (30), then it is the MAD-RP portfolio. For this reason, we will use (30) in the empirical analysis of Sect. 4 when implementing soe_1.

Finally, we also consider the following formulation, which we call soe_2 and consists of a further simplified version of the system (17)

$$\begin{aligned} \left\{ \begin{array}{ll} q_{i}^{2} \mathbb {E}\biggl [ {{\,\textrm{sgn}\,}}\biggl [ \sum _{j=1}^{n}(r_{j}-\mu _{j})q_{j}^{2}\biggr ](r_{i}-\mu _{i})\biggr ] = \displaystyle \frac{z}{n}, &{} \quad i\in \{1,\ldots ,n\},\\ \sum \limits _{i=1}^{n}q_{i}^{2}=1, &{} \end{array} \right. \end{aligned}$$

(31)

where we remove the sign restrictions for $\varvec{x}$.

3.2.3 Least-squares formulations

The last method, called ls_rel, exploits the fact that we have $\frac{{{\,\textrm{RC}\,}}_{i}(\varvec{x})}{{{\,\textrm{MAD}\,}}(\varvec{x})}=\frac{1}{n}$ for each $i\in \{1,\ldots ,n\}$ whenever $\varvec{x}$ is the MAD-RP portfolio (see Cesarone and Colucci 2018). More precisely, it consists of the following least-squares formulation:

$$\begin{aligned} \left\{ \begin{array}{rll} \displaystyle \min &{} \displaystyle \sum _{i=1}^{n}\biggl (\frac{{{\,\textrm{RC}\,}}_{i}}{{{\,\textrm{MAD}\,}}(\varvec{x})} -\frac{1}{n}\biggr )^{2} &{} \\ &{} \varvec{{{\,\textrm{RC}\,}}}\in \mathcal{R}\mathcal{C}(\varvec{x}), &{}\\ &{} \sum \limits _{i=1}^{n}x_{i}=1, &{} \\ &{} x_{i}\ge 0, &{} \quad i\in \{1,\ldots ,n\}, \\ &{}\varvec{x}\in \mathbb {R}^n,\; \varvec{{{\,\textrm{RC}\,}}}\in \mathbb {R}^n. \end{array} \right. \end{aligned}$$

(32)

A variant of this least-squares method is given by

$$\begin{aligned} \left\{ \begin{array}{rll} \displaystyle \min &{} \displaystyle \sum _{i=1}^{n}\sum _{j=1}^n\bigl ({{\,\textrm{RC}\,}}_{i}-{{\,\textrm{RC}\,}}_{j}\bigr )^{2} &{} \\ &{} \varvec{{{\,\textrm{RC}\,}}}\in \mathcal{R}\mathcal{C}(\varvec{x}), &{}\\ &{} \sum \limits _{i=1}^{n}x_{i}=1, &{} \\ &{} x_{i}\ge 0, &{} \quad i\in \{ 1,\ldots ,n\},\\ &{} \varvec{x}\in \mathbb {R}^n,\; \varvec{{{\,\textrm{RC}\,}}}\in \mathbb {R}^n, \end{array} \right. \end{aligned}$$

(33)

named ls_abs (see, e.g., Maillard and Teiletche 2010). The functional constraint $\varvec{{{\,\textrm{RC}\,}}}\in \mathcal{R}\mathcal{C}(\varvec{x})$ can be reformulated in terms of linear inequalities with additional binary variables as in Sect. 3.2.2. If an optimal solution for ls_rel or ls_abs is found with zero optimal value, then this solution is the MAD-RP portfolio. Nevertheless, numerical methods can only guarantee local optimality due to the nonconvex nature of these problems. Hence, using the least-squares formulations should be considered as a heuristic approach.

4 Empirical analysis

In this section, we provide an extensive empirical analysis based on three investment universes which are described in Sect. 4.1. More precisely, in Sect. 4.2.1, we test and compare all the MAD-RP formulations in terms of accuracy and efficiency. Section 4.2.2 shows some graphical examples of the portfolio selection strategies analyzed in terms of portfolio weights and relative risk distribution among assets, using both volatility and MAD as risk measures. In Sect. 4.2.3, we report and discuss their out-of-sample performance results based on a Rolling Time Window scheme of evaluation.

4.1 Datasets

We give here a brief description of the three real-world datasets used in the empirical analysis. These datasets were obtained from Refinitiv and are publicly available on https://www.francescocesarone.com/.

ETF Emerging countries (ETF-EC): it consists of the total return index,^{Footnote 1} expressed in US dollars, of the ETFs of 24 emerging countries.
Euro bonds (EuroBonds): it consists of the total return index in euros of the government bonds of 11 countries belonging to the Eurozone, with maturities ranging from 1 year to 30 years.
Commodities and Italian bonds mix (CIB-mix): it consists of the total return index, expressed in euros, of a mixture of Italian government bonds, whose maturity ranges from 2 to 30 years, and commodities (agriculture, gold, energy and industrial metals).

In Table 1, we provide additional information about these datasets.

Table 1 List of the datasets analyzed

Full size table

4.2 Computational results

4.2.1 Accuracy and efficiency of the MAD-RP formulations

In this section, we test and compare all the MAD-RP formulations, described in Sect. 3.2, in terms of accuracy and efficiency using the following quantities:

the square root of the value of the objective function of Problem (32), named $\sqrt{F(\varvec{x})}$;
the portfolio ${{\,\textrm{MAD}\,}}$ obtained from each method, named ${{\,\textrm{MAD}\,}}(\varvec{x})$;
the empirical mean of the absolute deviations of relative risk contributions from $\frac{1}{n}$,
$$\begin{aligned} \text{ MeanAbsDev } = \frac{1}{n}\sum _{i=1}^{n}\left| \frac{{{\,\textrm{RC}\,}}_{i}(\varvec{x})}{{{\,\textrm{MAD}\,}}(\varvec{x})}-\frac{1}{n}\right| \,; \end{aligned}$$
the maximum of the absolute deviations of relative risk contributions from $\frac{1}{n}$,
$$\begin{aligned} \text{ MaxAbsDev } = \max _{1\le i\le n}\left| \frac{{{\,\textrm{RC}\,}}_{i}(\varvec{x})}{{{\,\textrm{MAD}\,}}(\varvec{x})}-\frac{1}{n}\right| \,; \end{aligned}$$
the inverse of the number of assets selected, namely $\frac{1}{n}$;
the computational times (in seconds) required to solve each MAD-RP formulation considered.

All the experiments have been implemented on a workstation with Intel Core CPU (i7-6700, 3.4 GHz, 16 Gb RAM) under MS Windows 10, using MATLAB 9.1. More precisely, we solve models log_obj (24), log_constr (25), ls_rel (32), ls_abs (33) by means of the built-in function fmincon; model soe_1 (30) by using the global optimization software BARON (version 15.9.22), which is also called from MATLAB by means of the MATLAB/BARON toolbox (Tawarmalani & Sahinidis, 2005); finally, we solve soe_2 (31) by means of the built-in function fsolve.

Table 2 Experimental results for the MAD-RP methods using ETF-EC, where $1/n = 0.0417$

Full size table

Since in Sect. 4.2.3 we perform an extensive out-of-sample analysis using in-sample windows of 2 years, for the following accuracy and efficiency analysis of the MAD-RP formulations we set the number of observations $T=500$ days (i.e., the first two years of each dataset). For Problems (24) and (25), we also perform a sensitivity analysis by varying the constants $\lambda >0$ and $c \in \mathbb {R}$, respectively. Such a sensitivity analysis leads us to set $\lambda = 0.001$ and $c = -40$, which guarantee a reasonable level of accuracy for all experiments.

Table 3 Experimental results for the MAD-RP methods using EuroBonds, where $1/n = 0.0161$

Full size table

Tables 2, 3, and 4 report the computational results obtained for the three datasets analyzed. Both soe_1 and soe_2 seem to be the least accurate and efficient formulations. Note that, with this experimental setup, the most computationally demanding formulation soe_1 fails to find a solution in a reasonable time (1 day).

Methods based on least-squares formulations, i.e., ls_rel and ls_abs, achieve intermediate performances for all the datasets.

Table 4 Experimental results for the MAD-RP methods using CIB-mix, where $1/n = 0.0909$

Full size table

Finally, on one hand, the two logarithmic formulations essentially lead to the same level of accuracy. On the other hand, log_obj seems to be the most efficient method for the three datasets considered.

4.2.2 Comparison of some portfolio selection approaches

To better illustrate the concept of risk allocation, we report here a graphical comparison of the five portfolio selection strategies listed below.

1.
MAD-RP: the portfolio obtained by solving the log_obj formulation.
2.
MinMAD: the global minimum MAD portfolio, obtained by implementing problem (3.6) of Konno and Yamazaki (1991) without constraining the portfolio expected return.
3.
MinV: the global minimum variance portfolio, as in Markowitz (1952).
4.
Vol-RP: the portfolio obtained by means of the risk parity approach using volatility as a risk measure (Problem (7) of Maillard and Teiletche 2010).
5.
EW: the Equally Weighted portfolio, where the capital is uniformly distributed among all the assets in the investment universe, i.e., $x_{i}=\frac{1}{n}$ for each $i\in \{1,\ldots ,n\}$.

More precisely, we consider an investment universe of 5 assets and we evaluate weights and Relative Risk Contributions (RRCs), obtained from all the portfolio strategies analyzed. Note that for each portfolio approach, RRCs are computed by considering both volatility and MAD as risk measures. In Fig. 1, we report the results obtained for 5 assets belonging to CIB-mix.^{Footnote 2}

We can observe that the Vol-RP and MAD-RP portfolios generally show a slightly different portfolio composition but both appear to be well-diversified similar to the EW portfolio. A converse situation can be pointed out for the minimum risk portfolios. Indeed, as shown on the top of Fig. 1, both the MinV and MinMAD approaches concentrate 71% and 94% of the capital in only one asset, respectively. In the middle (bottom) of Fig. 1, for each portfolio strategy, we can see how much each asset contributes (in percentage) to the whole portfolio volatility (MAD). Clearly, for Vol-RP (MAD-RP), each asset equally contributes to the portfolio volatility (MAD). Furthermore, by construction, for a fixed risk measure, the asset weights of the minimum risk portfolio coincide with their RRCs. It is also interesting to observe that, for the EW portfolio, RRCs are uneven regardless of the risk measure used.

4.2.3 Out-of-sample analysis

In this section, we discuss the main results of the out-of-sample performance analysis for all the portfolio selection strategies listed in Sect. 4.2.2. For this analysis, we adopt a Rolling Time Window scheme of evaluation: we allow for the possibility of rebalancing the portfolio composition during the holding period, at fixed intervals. To calibrate the portfolio selection models, we use 2 years (i.e., 500 days) for the in-sample window, while we choose 20 days for the out-of-sample window with rebalancing allowed every 20 days. To evaluate the out-of-sample performances of the five portfolios analyzed, we use the following quantities typically adopted in the literature (see, e.g., Bacon 2008; Rachev et al. 2008; Bruni et al. 2017; Cesarone et al. 2015, 2016, 2019, 2022). Let $R^{out}$ denote the out-of-sample portfolio return, and $W_{t}=W_{t-1}(1+R^{out})$ the portfolio wealth at time t.

Mean is the annualized average portfolio return, i.e., $\mu ^{out}=\mathbb {E}[R^{out}]$. Obviously, higher values are associated to higher performances.
Volatility (Vol) is the annualized standard deviation, computed as $\sigma ^{out}=\sqrt{\mathbb {E}[(R^{out}-\mu ^{out})^{2}]}$. Since it measures the portfolio risk, lower values are preferable.
Mean Absolute Deviation (MAD) of $R^{out}$, namely ${{\,\textrm{MAD}\,}}(R^{out}):=\mathbb {E}[\left|R^{out}-\mu ^{out}\right|]$.
Maximum Drawdown (MaxDD): denoting the drawdowns by
$$\begin{aligned} dd_{t}=\frac{W_{t}-\displaystyle \max _{1\le \tau \le t}W_{\tau }}{\displaystyle \max _{1\le \tau \le t}W_{\tau }}\text{, }\quad t\in \{1,\ldots ,T\}, \end{aligned}$$
MaxDD is defined as $ MaxDD=\displaystyle \min _{1\le t\le T}dd_{t}\text{. } $ It measures the loss from the observed peak in the returns; hence, it will always have negative sign (or, in the best case scenario, it will be equal to 0), meaning that the higher the better.
Ulcer index, i.e.,
$$\begin{aligned} UI=\sqrt{\frac{\sum \limits _{t=1}^{T}dd_{t}^{2}}{T}}\text{. } \end{aligned}$$
It evaluates the depth and the duration of drawdowns in prices over the out-of-sample period. Lower Ulcer values are associated to better portfolio performances.
Sharpe ratio measures the gain per unit risk and it is defined as
$$\begin{aligned} SR^{out}=\frac{\mu ^{out}-r_{f}}{\sigma ^{out}}, \end{aligned}$$
where we choose $r_{f}=0$. The larger this value, the better the performance.
Sortino ratio is similar to the Sharpe ratio but uses another risk measure, i.e., the Target Downside Deviation, $TDD=\sqrt{\mathbb {E}[((R^{out}-r_{f})^-)^{2}]}$. Therefore, the Sortino Ratio is defined as
$$\begin{aligned} SoR=\frac{\mu ^{out}-r_{f}}{TDD}, \end{aligned}$$
where $r_{f}=0$. Similar to the Sharpe ratio, the higher it is, the better the portfolio performance.
Turnover, i.e.,
$$\begin{aligned} Turn=\frac{1}{Q}\sum \limits _{q=1}^{Q}\sum \limits _{k=1}^{n}|x_{q,k}-x_{q-1,k}|, \end{aligned}$$
where Q represents the number of rebalances, $x_{q,k}$ is the portfolio weight of asset k after rebalancing, and $x_{q-1,k}$ is the portfolio weight before rebalancing at time q. Lower turnover values imply better portfolio performance. We point out that this definition of portfolio turnover is a proxy of the effective one, since it evaluates only the amount of trading generated by the models at each rebalance, without considering the trades due to changes in asset prices between one rebalance and the next. Thus, by definition, the turnover of the EW portfolio is zero.
Rachev-5% ratio measures the upside potential, comparing right and left tail. Mathematically, it is computed as
$$\begin{aligned} \frac{CVaR_{\alpha }(r_{f}-R^{out})}{CVaR_{\beta }(R^{out}-r_{f})}, \end{aligned}$$
where we choose $\alpha =\beta = 5\%$ and $r_{f}=0$.

In Tables 5, 6 and 7, we report the experimental results obtained for the three dataset, ETF-EC, EuroBonds and CIB-mix, respectively. For each performance measure, we show with different colors the rank of the results of the five portfolio selection strategies analyzed. More precisely, the colors range from deep-green to deep-red, where deep-green represents the best performance while deep-red the worst one.

Table 5 Out-of-sample results for ETF-EC. (Color figure online)

Full size table

As shown in Table 5, the MinMAD portfolio seems to have better performance with respect to the other portfolios with two exceptions: Turnover and Mean. Indeed, Turnover of the Vol-RP and MAD-RP portfolios are much smaller than that of MinMAD. The EW portfolio guarantees the highest Mean, followed by the Vol-RP and MAD-RP portfolios. We can observe that these two RP strategies provide very similar values for all the performance measures considered, and their out-of-sample performances, both in terms of risk and of gain, are located in the middle between those of the minimum risk portfolios and those of the EW portfolio, as theoretically expected in the in-sample case (see Remarks 15 and 16). The MinV and MinMAD portfolios appear to be the least risky, both in terms of Vol, MAD, MaxDD, and Ulcer. Furthermore, even though MAD-RP tends to be riskier than the minimum risk strategies, it achieves the second-largest values of Sharpe and Sortino. Note that we do purposely leave the EW portfolio out of the comparison concerning Turnover, since its value is 0 by construction.

Table 6 Out-of-sample results for EuroBonds. (Color figure online)

Full size table

In Table 6, we report the computational results for EuroBonds. Again as expected, the minimum risk portfolios show the best risk performances but with the worst values of out-of-sample expected return. Conversely, the EW portfolio has the best values in terms of Gain-Risk ratios and the worst in terms of risk. Also in this case, the performances of the RP strategies generally lie between those of EW and those of the minimum risk portfolios. Furthermore, the Vol-RP portfolio shows the lowest value of Turnover, followed by MAD-RP.

Table 7 Out-of-sample results for CIB-mix. (Color figure online)

Full size table

In Table 7, we report the computational results for CIB-mix. We observe here that the MAD-RP portfolio presents very good performances: it has the highest Mean, Sharpe and Sortino values (followed by the Vol-RP portfolio), and the second-highest Rachev-5%, below that of EW. Again, the minimum risk portfolios, particularly MinV, have the best Vol, MAD, MaxDD, and Ulcer. Furthermore, the Vol-RP portfolio provides the lowest Turnover, followed by MinMAD.

In Fig. 2, for ETF-EC, EuroBonds and CIB-mix, we report the time evolution of the portfolio wealth, investing one unit of the currency at the beginning of the investment horizon. We can note that the EW portfolio often provides the highest values of wealth and that the risk parity portfolios are systematically better than the minimum risk ones. The Vol-RP and MAD-RP portfolios tend to perform similarly, with the only exception of the CIB-mix dataset, where the MAD-RP portfolio performs significantly better and is often the best among all.

Table 8 Analysis of the differences in the portfolio weights of the Risk Parity and minimum risk strategies

Full size table

Since the out-of-sample performance measures for the risk parity portfolios using volatility and MAD generally tend to appear very close, we compute a simple metric to better highlight the differences between the portfolio weights selected by these two approaches. More precisely, for each dataset listed in Table 1, we use the following metric named Overall Weight Differences (OWD):

$$\begin{aligned} OWD =\frac{1}{Q} \sum \limits _{q=1}^{Q} \left( \frac{1}{n} \sum \limits _{k=1}^{n} |x_{q,k}^{Vol-RP}-x_{q,k}^{MAD-RP}| \right) , \end{aligned}$$

(34)

where, again, Q represents the number of rebalances, $x_{q,k}^{Vol-RP}$ and $x_{q,k}^{MAD-RP}$ are the weights of asset k obtained, at time q, by the RP strategy using volatility and MAD, respectively. Table 8 shows the results obtained for all datasets considered, in which the analysis of the differences in the portfolio weights of the MinV and MinMAD approaches is also included. For each dataset, we can observe that the OWD values computed for the risk parity and minimum risk strategies appear to be of the same order of magnitude. Furthermore, as highlighted for the out-of-sample performance results, it seems that the greatest portfolio allocation differences can be observed for the CIB-mix dataset.

5 Conclusions

In this paper, we have proposed an extension of the now popular Risk Parity approach for volatility to the Mean Absolute Deviation (MAD) as a portfolio risk measure.

From a theoretical viewpoint, we have discussed the subdifferentiability and additivity features of MAD, highlighting their usefulness in the risk parity strategy, and we have established the existence and uniqueness results for the MAD-RP portfolio. Furthermore, we have presented several mathematical formulations for finding the MAD-RP portfolios practically and we have tested them on three real-world datasets both in terms of efficiency and accuracy. We have observed that the system-of-equation formulations seem to be the least accurate and efficient. Methods based on least-squares formulations show intermediate performances for all the datasets, while those based on logarithmic formulations appear to be the best.

From an experimental viewpoint, we have presented an extensive empirical analysis comparing the out-of-sample performance obtained from the global minimum volatility and MAD portfolios, the volatility and MAD risk parity portfolios, and the Equally Weighted portfolio. In terms of out-of-sample risk, these results confirm the theoretical (in-sample) properties of the RP portfolio, namely its risk lies between that of the minimum risk portfolio and the risk of the Equally Weighted portfolio. The RP portfolios always show positive annual expected returns for all datasets, even though they are smaller than those of the Equally Weighted portfolio, which is however much riskier. Furthermore, in terms of Gain-Risk ratios, the risk parity portfolios tend to provide medium-high performances, often between those of EW and those of minimum risk portfolios; however, in few cases, they are better. We point out that the two risk parity portfolios show similar values for all performance measures. However, when directly comparing Vol-RP with MAD-RP, the latter usually provides slightly higher profitability at the expense of slightly higher risk and turnover. Furthermore, for each dataset, we have observed that the portfolio allocation differences computed through the risk parity and minimum risk strategies using volatility and MAD seem to be of the same order of magnitude.

Each model tested tends to respond to different requirements related to different risk attitudes of the investors. On one hand, as expected, the minimum risk models are advisable for risk-averse investors, avoiding as much as possible any shock represented by deep drawdowns. On the other hand, the RP strategies seem to be appropriate for investors mildly adverse to the total portfolio risk. These investors could be willing to waive a bit of safety (w.r.t. that of the minimum risk portfolios) and a bit of gain (w.r.t. that of the EW portfolio) to achieve a more balanced risk allocation and a more diversified portfolio. Furthermore, although the Equally Weighted approach embodies the concept of high diversification and achieves good out-of-sample expected returns, it generates portfolios with very high out-of-sample risk both in terms of volatility, MAD, Maximum Drawdown, and Ulcer index. Therefore, according to our findings, the EW portfolio seems to be advisable for sufficiently risk-seeking investors who try to maximize gain without worrying about periods of deep drawdowns.

Notes

It consists in the price of the asset including dividends, assuming these dividends are reinvested in the company.
See, e.g., https://www.msci.com/eqb/methodology/meth_docs/MSCI_May12_IndexCalcMethodology.pdf.
IT Benchmark 10 years DS Govt. Index, IT Benchmark 5 years DS Govt. Index, IT Benchmark 15 years DS Govt. Index, S &P GSCI Agriculture Total Return, and S &P GSCI Industrial Metals Total Return for the period 1/2013-12/2018.

References

Bacon, C. A. (2008). Practical portfolio performance measurement and attribution (2nd ed.). Wiley.
Google Scholar
Bai, X., Scheinberg, K., & Tutuncu, R. (2016). Least-squares approach to risk parity in portfolio selection. Quantitative Finance, 16(3), 357–376.
Google Scholar
Bellini, F., Cesarone, F., Colombo, C., & Tardella, F. (2021). Risk parity with expectiles. European Journal of Operational Research, 291(3), 1149–1163.
Google Scholar
Best, M., & Grauer, R. (1991). On the sensitivity of mean-variance-efficient portfolios to changes in asset means: some analytical and computational results. Review of Financial Studies, 4(2), 315–342.
Google Scholar
Best, M., & Grauer, R. (1991). Sensitivity analysis for mean-variance portfolio problems. Management Science, 37(8), 980–989.
Google Scholar
Boudt, K., Carl, P., & Peterson, B. (2013). Asset allocation with conditional value-at-risk budgets. Journal of Risk, 15, 39–68.
Google Scholar
Bruni, R., Cesarone, F., Scozzari, A., & Tardella, F. (2017). On exact and approximate stochastic dominance strategies for portfolio selection. European Journal of Operational Research, 259(1), 322–329.
Google Scholar
Carleo, A., Cesarone, F., Gheno, A., & Ricci, J. M. (2017). Approximating exact expected utility via portfolio efficient frontiers. Decisions in Economics and Finance, 40(1–2), 115–143.
Google Scholar
Cesarone, F. (2020). Computational finance: MATLAB® oriented modeling. Routledge.
Google Scholar
Cesarone, F., & Colucci, S. (2018). Minimum risk versus capital and risk diversification strategies for portfolio construction. Journal of the Operational Research Society, 69(2), 183–200.
Google Scholar
Cesarone, F., & Tardella, F. (2017). Equal risk bounding is better than risk parity for portfolio selection. Journal of Global Optimization, 68(2), 439–461.
Google Scholar
Cesarone, F., Scozzari, A., & Tardella, F. (2015). Linear vs. quadratic portfolio selection models with hard real-world constraints. Computational Management Science, 12(3), 345–370.
Google Scholar
Cesarone, F., Moretti, J., Tardella, F., et al. (2016). Optimally chosen small portfolios are better than large ones. Economics Bulletin, 36(4), 1876–1891.
Google Scholar
Cesarone, F., Lampariello, L., & Sagratella, S. (2019). A risk-gain dominance maximization approach to enhanced index tracking. Finance Research Letters, 29, 231–238.
Google Scholar
Cesarone, F., Mango, F., Mottura, C. D., Ricci, J. M., & Tardella, F. (2020). On the stability of portfolio selection models. Journal of Empirical Finance, 59, 210–234.
Google Scholar
Cesarone, F., Scozzari, A., & Tardella, F. (2020). An optimization-diversification approach to portfolio selection. Journal of Global Optimization, 76(2), 245–265.
Google Scholar
Cesarone, F., Martino, M. L., & Carleo, A. (2022). Does ESG impact really enhance portfolio profitability? Sustainability, 14(4), 2050.
Google Scholar
Chopra, V., & Ziemba, W. (1993). The effect of errors in means, variances, and covariances on optimal portfolio choice. The Journal of Portfolio Management, 19(2), 6–11.
Google Scholar
Clarke, R., De Silva, H., & Thorley, S. (2013). Risk parity, maximum diversification, and minimum variance: An analytic perspective. The Journal of Portfolio Management, 39(3), 39–53.
Google Scholar
DeMiguel, V., Garlappi, L., & Uppal, R. (2009). Optimal versus naive diversification: How inefficient is the 1/N portfolio strategy? Review of Financial Studies, 22(5), 1915–1953.
Google Scholar
Fabozzi, F. A., Simonian, J., & Fabozzi, F. J. (2021). Risk parity: The democratization of risk in asset allocation. The Journal of Portfolio Management, 47, 41–50.
Google Scholar
Fisher, G. S., Maymin, P. Z., & Maymin, Z. G. (2015). Risk parity optimality. The Journal of Portfolio Management, 41(2), 42–56.
Google Scholar
Gordan, P. (1873). Über die auflösung linearer gleichungen mit reelen coefficienten (on the solution of linear inequalities with real coefficients). Mathematische Annalen, 6(1), 23–28.
Google Scholar
Güler, O. (2010). Fundamentals of optimization. Springer.
Google Scholar
Haesen, D., Hallerbach, W. G., Markwat, T., & Molenaar, R. (2017). Enhancing risk parity by including views. The Journal of Investing, 26(4), 53–68.
Google Scholar
Jacobsen, B., & Lee, W. (2020). Risk-parity optimality even with negative Sharpe ratio assets. The Journal of Portfolio Management, 46(6), 110–119.
Google Scholar
Konno, H., & Yamazaki, H. (1991). Mean-absolute deviation portfolio optimization model and its applications to Tokyo Stock Market. Management Science, 37(5), 519–531.
Google Scholar
Liu, B., Brzenk, P,, Cheng, T. (2020). Indexing risk parity strategies. S &P Global, S &P Dow Jones Indices, October 2020 Available at https://www.spglobal.com/spdji/en/documents/research/research-indexing-risk-parity-strategies.pdf?force_download=true (October 2020).
Maillard, S. T., & Teiletche, J. (2010). The properties of equally weighted risk contribution portfolios. The Journal of Portfolio Management, 36(4), 60–70.
Google Scholar
Markowitz, H. (1952). Portfolio selection. The Journal of Finance, 7(1), 77–91.
Google Scholar
Mausser, H., & Romanko, O. (2018). Long-only equal risk contribution portfolios for CVaR under discrete distributions. Quantitative Finance, 18(11), 1927–1945.
Google Scholar
Michaud, R., & Michaud, R. (1998). Efficient asset management. Harvard Business School Press.
Google Scholar
Oderda, G. (2015). Stochastic portfolio theory optimization and the origin of rule-based investing. Quantitative Finance, 15(8), 1259–1266.
Google Scholar
Qian, E. (2011). Risk parity and diversification. The Journal of Investing, 20(1), 119–127.
Google Scholar
Qian, E. (2017). How naïve is naïve risk parity? Panagora Asset Management.
Rachev, S., Stoyanov, S., & Fabozzi, F. (2008). Advanced stochastic models, risk assessment, and portfolio optimization: The ideal risk, uncertainty, and performance measures. Wiley.
Google Scholar
Rockafellar, R., & Wets, R. J. B. (1982). On the interchange of subdifferentiation and conditional expectation for convex functionals. Stochastics, 7, 173–182.
Google Scholar
Rockafellar, R., & Wets, R. J. B. (1997). Variational analysis. Springer.
Google Scholar
Rockafellar, R., Uryasev, S., & Zabarankin, M. (2006). Generalized deviations in risk analysis. Finance and Stochastics, 10(1), 51–74.
Google Scholar
Rockafellar, R. T. (1970). Convex analysis. Princeton University Press.
Google Scholar
Rockafellar, R. T. (1973). Conjugate duality and optimization. SIAM.
Roncalli, T. (2013). Introducing expected returns into risk parity portfolios: A new framework for tactical and strategic asset allocation. Available at SSRN: http://ssrncom/abstract=2321309
Spinu, F. (2013). An algorithm for computing risk parity weights. Available at SSRN: http://ssrncom/abstract=2297383
Tasche, D. (2002). Expected shortfall and beyond. Journal of Banking & Finance, 26(7), 1519–1533.
Google Scholar
Tawarmalani, M., & Sahinidis, N. V. (2005). A polyhedral branch-and-cut approach to global optimization. Mathematical Programming, 103(2), 225–249.
Google Scholar
Yang, F., & Wei, Z. (2008). Generalized Euler identity for subdifferentials of homogeneous functions and applications. Journal of Mathematical Analysis and Applications, 337, 516–523.
Google Scholar

Download references

Acknowledgements

This work was partially supported by the PRIN 2017 Project (no. 20177WC4KE), funded by the Italian Ministry of Education, University, and Research. We thank two anonymous referees for their useful comments. In particular, Example 9 is suggested by one of the referees.

Funding

Open access funding provided by the Scientific and Technological Research Council of Türkiye (TüBİTAK).

Author information

Authors and Affiliations

Bilkent University - Department of Industrial Engineering, Ankara, Turkey
Çağın Ararat & Mustafa Çelebi Pınar
Roma Tre University - Department of Business Studies, Rome, Italy
Francesco Cesarone & Jacopo Maria Ricci
Bergamo University - Department of Economics, Bergamo, Italy
Jacopo Maria Ricci

Authors

Çağın Ararat
View author publications
You can also search for this author in PubMed Google Scholar
Francesco Cesarone
View author publications
You can also search for this author in PubMed Google Scholar
Mustafa Çelebi Pınar
View author publications
You can also search for this author in PubMed Google Scholar
Jacopo Maria Ricci
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to Çağın Ararat.

Ethics declarations

Conflict of interest

We declare that we do not have any conflicting interests related to this article.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix A: Proofs of the results in Sect. 2.2

Proof of Lemma 2

By the triangle inequality on $\mathbb {R}$, we have that

$$\begin{aligned} \left|X-\mathbb {E}[X]\right| + \left|Y-\mathbb {E}[Y]\right| - \left|(X-\mathbb {E}[X]) + (Y-\mathbb {E}[Y])\right| \ge 0 \quad \mathbb {P}\text{-a.s. } \end{aligned}$$

(35)

Note that the additivity condition (3) is equivalent to

$$\begin{aligned} \mathbb {E} \left[ \left| (X-\mathbb {E}[X])+(Y-\mathbb {E}[Y])\right| - \left|X-\mathbb {E}[X]\right| - \left|Y-\mathbb {E}[Y]\right| \right] = 0 \end{aligned}$$

(36)

By (35), the random variable $\left|X-\mathbb {E}[X]\right| + \left|Y-\mathbb {E}[Y]\right| - \left|(X-\mathbb {E}[X]) + (Y-\mathbb {E}[Y])\right| $ is $\mathbb {P}$-a.s. nonnegative. Hence, (36) is equivalent to

$$\begin{aligned} \left|(X-\mathbb {E}[X]) + (Y-\mathbb {E}[Y])\right| - \left|X-\mathbb {E}[X]\right| - \left|Y-\mathbb {E}[Y]\right| = 0 \quad \mathbb {P}\text{-a.s. } \end{aligned}$$

(37)

Now, for each $a,b\in \mathbb {R}$, we have $|a+b |= |a|+|b|$ if and only if $ab\ge 0$. It follows that (37) is equivalent to (4), which completes the proof.

Proof of Theorem 6

It is clear that (i) implies (ii) and (iii). Next, we show that (ii) implies (i). Let $\varvec{x}\in \mathbb {R}^n_+$. If $\varvec{x}=0$, then the equality in (i) becomes trivial. Suppose that $\varvec{x}\ne 0$. In particular, $a:=\sum _{i=1}^n x_i>0$. Then, by (ii) and the positive homogeneity of MAD, we have

$$\begin{aligned} {{\,\textrm{MAD}\,}}\left( \sum _{i=1}^nx_i Y_i \right) =a{{\,\textrm{MAD}\,}}\left( \sum _{i=1}^n \frac{x_i}{a}Y_i \right) =a\sum _{i=1}^n \frac{x_i}{a}{{\,\textrm{MAD}\,}}(Y_i)=\sum _{i=1}^n x_i{{\,\textrm{MAD}\,}}(Y_i). \end{aligned}$$

Hence, (i) follows.

To prove that (iii) implies (iv), let us fix $i,j\in \{1,\ldots ,n\}$ with $i\ne j$ and set $I:=\{i,j\}$. Then, (iii) yields ${{\,\textrm{MAD}\,}}(Y_i+Y_j)={{\,\textrm{MAD}\,}}(Y_i)+{{\,\textrm{MAD}\,}}(Y_j)$. Hence, by Theorem 2, we have $(Y_i-\mathbb {E}[Y_i])(Y_j-\mathbb {E}[Y_j])\ge 0$ $\mathbb {P}$-a.s., that is, (iv) holds.

To complete the proof, we show that (iv) implies (i). We prove the equality in (i) by induction on $k(\varvec{x}):=\min \{i\ge 0\mid x_{i+1}=\ldots =x_n=0\}$. (We assume that $k(\varvec{x})=n$ if $x_n\ne 0$.) Note that $0\le k(\varvec{x})\le n$. The base case $k(\varvec{x})=0$ is trivial since $x_1=\ldots =x_n=0$ in this case. Let $k\le n-1$ and suppose that the equality in (i) holds for every $\bar{\varvec{x}}\in \mathbb {R}^n_+$ with $k(\bar{\varvec{x}})=k$. Let $\varvec{x}\in \mathbb {R}^n_+$ with $k(\varvec{x})=k+1$. Then,

$$\begin{aligned} {{\,\textrm{MAD}\,}}\left( \sum _{i=1}^n x_iY_i \right) ={{\,\textrm{MAD}\,}}\left( \sum _{i=1}^{k+1} x_iY_i \right) ={{\,\textrm{MAD}\,}}\left( \sum _{i=1}^{k} x_iY_i+x_{k+1}Y_{k+1} \right) . \end{aligned}$$

(38)

Note that we have

$$\begin{aligned}&\left( \sum _{i=1}^{k} x_iY_i-\mathbb {E} \left[ \sum _{i=1}^{k} x_iY_i \right] \right) \left( x_{k+1}Y_{k+1}-x_{k+1}\mathbb {E}[Y_{k+1}] \right) \\&\quad =x_{k+1}\sum _{i=1}^{k}x_i(Y_i-\mathbb {E}[Y_i])(Y_{k+1}-\mathbb {E}[Y_{k+1}])\ge 0 \end{aligned}$$

thanks to (iv) and the fact that $\varvec{x}\in \mathbb {R}^n_+$. Hence, by Theorem 2,

$$\begin{aligned} {{\,\textrm{MAD}\,}}\left( \sum _{i=1}^{k} x_iY_i+x_{k+1}Y_{k+1} \right) ={{\,\textrm{MAD}\,}}\left( \sum _{i=1}^{k} x_iY_i \right) +{{\,\textrm{MAD}\,}}(x_{k+1}Y_{k+1}). \end{aligned}$$

(39)

On the other hand, we may write $\sum _{i=1}^{k} x_iY_i=\sum _{i=1}^{n} \bar{x}_iY_i$, where $\bar{\varvec{x}}:=(x_1,\ldots ,x_k,0,\ldots ,0)$. Since $k(\bar{ \varvec{x}})=k$, we may use the induction hypothesis and obtain

$$\begin{aligned} {{\,\textrm{MAD}\,}}\left( \sum _{i=1}^{k} x_iY_i \right) ={{\,\textrm{MAD}\,}}\left( \sum _{i=1}^{n} \bar{x}_iY_i \right) =\sum _{i=1}^n \bar{x}_i{{\,\textrm{MAD}\,}}(Y_i)=\sum _{i=1}^k x_i{{\,\textrm{MAD}\,}}(Y_i). \end{aligned}$$

(40)

Combining (38), (39), (40) and using the positive homogeneity of MAD give that

$$\begin{aligned} {{\,\textrm{MAD}\,}}\left( \sum _{i=1}^n x_iY_i \right) =\sum _{i=1}^k x_i{{\,\textrm{MAD}\,}}(Y_i)+x_{k+1}{{\,\textrm{MAD}\,}}(Y_{k+1})=\sum _{i=1}^n x_i{{\,\textrm{MAD}\,}}(Y_i), \end{aligned}$$

which concludes the induction argument.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

Reprints and permissions

About this article

Cite this article

Ararat, Ç., Cesarone, F., Pınar, M.Ç. et al. MAD risk parity portfolios. Ann Oper Res 336, 899–924 (2024). https://doi.org/10.1007/s10479-023-05797-2

Download citation

Received: 22 October 2021
Accepted: 15 December 2023
Published: 16 January 2024
Issue Date: May 2024
DOI: https://doi.org/10.1007/s10479-023-05797-2

Keywords

Use our pre-submission checklist

Avoid common mistakes on your manuscript.

MAD risk parity portfolios

Abstract

Similar content being viewed by others

Modified Risk Parity Portfolios to Limit Concentration on Low Risk Assets in Multi-Asset Portfolios

A robust framework for risk parity portfolios

Minimum risk versus capital and risk diversification strategies for portfolio construction

1 Introduction

2 Measuring the portfolio risk by MAD

2.1 MAD as a deviation measure

Proposition 1

2.2 Additivity of MAD

Lemma 2

Proof

Remark 3

Remark 4

Example 5

Theorem 6

Proof

2.3 Subdifferential of MAD

Proposition 7

Proof

3 Risk parity approach with MAD

3.1 MAD-RP portfolios: existence and uniqueness

Assumption 8

Example 9

Example 10

Proposition 11

Proof

Remark 12

Proposition 13

Proof

3.2 MAD-RP portfolios: calculations

3.2.1 Logarithmic formulations

Remark 14

Remark 15

Remark 16

3.2.2 System-of-equation formulations

Remark 17

3.2.3 Least-squares formulations

4 Empirical analysis

4.1 Datasets

4.2 Computational results

4.2.1 Accuracy and efficiency of the MAD-RP formulations

4.2.2 Comparison of some portfolio selection approaches

4.2.3 Out-of-sample analysis

5 Conclusions

Notes

References

Acknowledgements

Funding

Author information

Authors and Affiliations

Corresponding author

Ethics declarations

Conflict of interest

Additional information

Publisher's Note

Appendix A: Proofs of the results in Sect. 2.2

Appendix A: Proofs of the results in Sect. 2.2

Proof of Lemma 2

Proof of Theorem 6

Rights and permissions

About this article

Cite this article

Share this article

Keywords

Search

Navigation