1 Introduction

The milestone of modern finance theory for portfolio selection is undoubtedly the seminal work of Markowitz (1952, 1959) on the gain-risk analysis. Indeed, his famous Mean-Variance (MV) model is still widely used by both academics and practitioners to support portfolio selection decisions. The success of this bi-objective optimization problem has inevitably drawn many criticisms and proposals of alternative or more refined models (Kolm et al. 2014). In fact, many refinements of the Markowitz model have been provided in the literature, such as the definition of various new risk measures for selecting a portfolio. This is justified by the empirical evidence that asset returns can show asymmetric and/or fat-tailed distributions (see, e.g., Mandelbrot 1972; Pagan 1996; Cont 2001). Therefore, measuring the risk of a portfolio with the variance of returns suffers from several practical limitations, since it equally weights positive and negative deviations from the mean. As a consequence, several studies have focused on more suitable downside risk measures (Biglova et al. 2004). In this regard, value-at-risk (VaR) is traditionally recognized as the industry standard downside risk measure adopted by financial institutions, and has become very popular among researchers and practitioners over the years (see, e.g., Duffie and Pan 1997; Philippe 2001; Consigli 2002). The VaR of a portfolio represents the maximum potential portfolio loss at a given confidence level (typically 99%, 95%, or 90%) over a predefined time horizon. Despite its simple definition and wide use in the financial industry, VaR has some drawbacks: losses exceeding VaR are not taken into account, and VaR is non-convex with respect to the portfolio weights and fails to satisfy the subadditivity property, which is a desirable feature of a coherent risk measure (see, e.g., Artzner et al. 1999; Mausser 1998).
To address these shortcomings, Rockafellar and Uryasev (2000) have proposed a portfolio selection approach that uses conditional value-at-risk (CVaR) as a risk measure and provided an LP formulation for the corresponding portfolio optimization problem. However, although CVaR has the advantage of being a coherent risk measure (Acerbi and Tasche 2002), Cont et al. (2010) have observed that using CVaR instead of VaR could lead to a less robust risk management model and that CVaR is much more sensitive to outliers than VaR (see also de Vries et al. 2005; Dhaene et al. 2008; Kou et al. 2013). Lim et al. (2011) have also highlighted the high sensitivity of CVaR to estimation errors in data-driven Mean-CVaR portfolio selection models. Furthermore, Gneiting (2011) has shown that VaR is elicitable, unlike CVaR and the general class of spectral risk measures (see also Bellini and Bignozzi 2015). For all these reasons, and also on the basis of the regulatory requirement under the Basel II Accord (Basel Committee on Banking Supervision 2012), VaR is still a widely used risk measure in the financial industry.

The use of VaR in portfolio optimization can be found in Wang (2000), where the author first presented a two-stage optimization approach based on the Mean-Variance and Mean-VaR analysis. He then proposed a Mean-Variance-VaR model, but without providing an explicit formulation of the problem or any empirical analysis on real markets. Basak and Shapiro (2001) have proposed a dynamic portfolio optimization problem that maximizes the expected utility of the portfolio wealth including a VaR constraint for regulatory requirements. Alexander and Baptista (2002) have described a Mean-VaR model assuming that the assets’ returns follow multivariate normal or Student-t distributions. Consigli (2002) has empirically investigated the behavior of different VaR measurement techniques and portfolio optimization strategies during market instability periods, when deviating from the assumption of normality of returns distribution. Gaivoronski and Pflug (2005) have tackled the problem of VaR minimization by considering an approximation, the smoothed value-at-risk (SVaR), which requires less computational effort. Another significant theoretical and practical contribution to this topic has been provided by Benati and Rizzi (2007), who have proved NP-hardness of the Mean-VaR problem and have proposed a mixed-integer linear programming (MILP) formulation for it. Pınar (2013) has presented a closed-form solution for the Mean-VaR model applied to a market with a risk-free asset and n normally distributed risky assets, where short sales are allowed. However, a critical issue of portfolio selection models based on VaR is its estimation procedure (see, e.g., Pritsker 1997; Manganelli and Engle 2001), which can yield different results depending on the approach adopted. Indeed, several methods to estimate the portfolio VaR have been provided in the literature. These can be arranged into two main categories: parametric (see, e.g., Cui et al. 2013; Lotfi et al. 2017) and nonparametric (see, e.g., Cui et al. 2018; Lwin et al. 2017) methods.

We recall that a large number of portfolio selection models have been formulated as optimization problems with more than two objectives. For interested readers, we mention the work of Steuer and Na (2003) who have presented an extensive survey on multiple criteria decision making applied to several important topics in finance. For instance, to better control the shape and characteristics of the portfolio return distribution, many scholars have attempted to extend the MV model to higher-order moments, such as the portfolio skewness and kurtosis (see, e.g., Konno et al. 1993; Konno and Ki 1995; Aracioğlu et al. 2011). Other studies have investigated the inclusion of three or more objectives for selecting a portfolio (see, e.g., Chow 1995; Lo et al. 2006; Anagnostopoulos and Mamanis 2010; Utz et al. 2014; Cesarone et al. 2022a; Bera and Park 2008). For regulatory and reporting purposes, Roman et al. (2007) have emphasized the important role of risk measures focused on high portfolio losses (or equivalently, on low portfolio returns). They have then proposed a tri-objective optimization problem, where the portfolio expected return is maximized, while variance and CVaR are minimized. In the case of discrete random variables, the authors have reformulated this model as a single-objective quadratic optimization problem.

In this paper, we propose to add the VaR criterion to the classical Mean-Variance approach in order to better address the typical regulatory constraints of the financial industry. We thus obtain a nonparametric portfolio selection model characterized by three criteria: expected return, variance, and VaR at a specified confidence level. The resulting optimization problem consists in minimizing variance with parametric constraints on the levels of expected return and VaR. This model can be formulated as a mixed-integer quadratic programming (MIQP) problem. An extensive empirical analysis on real-world datasets demonstrates the practical applicability of the proposed approach. Furthermore, the out-of-sample performance of the optimal Mean-Variance-VaR portfolios seems to be generally better than that of the Equally Weighted and of the Mean-Variance-CVaR portfolios.

The paper is organized as follows. In Sect. 2, we introduce the Mean-Variance-VaR model and show how to formulate it as an MIQP problem. In Sect. 2.1, we discuss how to practically obtain the Mean-Variance-VaR efficient surface, by minimizing variance with parametric constraints on the levels of the portfolio expected return and of VaR. Section 2.2 describes the Mean-Variance-CVaR model that will be compared with the Mean-Variance-VaR approach. In Sect. 3, we provide an extensive empirical analysis on six real-world datasets. Finally, Sect. 4 summarizes the main contributions of our work and describes some directions for further developments.

2 The Mean-Variance-VaR model

We consider an investment universe of n assets, whose linear returns are represented by the random variables \(R_1, \ldots , R_n\). In the case of full investment, a long-only portfolio is identified with a vector \(x \in \Delta =\biggl \{x \in {\mathbb {R}}^n: \sum _{k=1}^n\,x_k\,=\,1, \, x_k \ge 0, k= 1, \dots , n \biggr \}\), where \(x_k\) is the fraction of capital invested in asset k. Thus, the portfolio linear return is given by \(R_{P}(x)=\sum _{k=1}^n x_k R_k\). We assume that the random variables \(R_1, \ldots , R_n\) are defined on a discrete probability space \(\{\Omega ,{\mathcal {F}},P\}\), with \(\Omega \,=\,\{\omega _1,\dots ,\omega _T\}\), \({\mathcal {F}}\) a \(\sigma\)-field and \(P(\omega _t)=p_t\). In this work, we use a look-back approach where the possible realizations of the discrete random returns are obtained from historical data. As is customary in portfolio optimization, the investment decision is made using T equally likely historical scenarios (see, e.g., Carleo et al. 2017, and references therein). Then, under scenario \(t \in \{1,\dots ,T\}\), we denote by \(r_{kt}\) the return of asset \(k \in \{1,\dots ,n\}\) and by \(R_{Pt}(x)=\sum _{k=1}^{n} x_k r_{kt}\) the portfolio return.

The classical Mean-Variance (MV) portfolio optimization model (Markowitz 1952, 1959) aims at determining the vector of portfolio weights \(x= \left( x_1, x_2, \cdots , x_n \right)\) that minimizes the portfolio variance \(\sigma _{P}^{2}(x) = \sum _{k=1}^n \sum _{j=1}^n x_k\,x_j\,\sigma _{kj}\), while restricting the portfolio expected return \(\mu _{P}(x)= \sum _{k=1}^n\,\mu _k\,x_k\) to attain a specified target level \(\eta\). Here, we denote by \(\mu _k=\frac{1}{T}\sum _{t=1}^{T} r_{kt}\) the expected return of asset k, and by \(\sigma _{kj}= \frac{1}{T}\sum _{t=1}^{T} (r_{kt} - \mu _k) (r_{jt} - \mu _j)\) the covariance between assets k and j. Thus, the MV model can be formulated as the following convex QP problem

$$\begin{aligned} \left\{ { \begin{array}{lll} \mathop{\min}\limits_{x} &{} \displaystyle\sum _{k=1}^n \sum _{j=1}^n x_k\,x_j\,\sigma _{kj} &{} \\ \text{ s.t. } &{} &{} \\ &{} \displaystyle\sum _{k=1}^n\,\mu _k\,x_k\,\ge \, \eta &{} \\ &{} \displaystyle\sum _{k=1}^n x_k = 1 &{} \\ &{} x_k \ge 0 &{} k=1,\ldots ,n \end{array} }\right. \end{aligned}$$
(1)

where the last two constraints represent the budget and the no-short-selling constraints, respectively.
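
As a small illustration of the inputs of Problem (1), the following Python sketch computes the scenario means \(\mu _k\), the covariances \(\sigma _{kj}\) (with the \(1/T\) normalization used above), and the mean and variance of a given long-only portfolio. The return matrix, the weights, and the function names are made up for illustration; no optimization is performed here.

```python
def sample_moments(r):
    """r[t][k] is the return of asset k under scenario t (T equally likely scenarios)."""
    T, n = len(r), len(r[0])
    mu = [sum(r[t][k] for t in range(T)) / T for k in range(n)]
    sigma = [[sum((r[t][k] - mu[k]) * (r[t][j] - mu[j]) for t in range(T)) / T
              for j in range(n)] for k in range(n)]
    return mu, sigma

def portfolio_mean(x, mu):
    # mu_P(x) = sum_k mu_k x_k
    return sum(xk * mk for xk, mk in zip(x, mu))

def portfolio_variance(x, sigma):
    # sigma_P^2(x) = sum_k sum_j x_k x_j sigma_kj
    n = len(x)
    return sum(x[k] * x[j] * sigma[k][j] for k in range(n) for j in range(n))

# Hypothetical data: T = 4 scenarios, n = 2 assets
returns = [[0.02, -0.01], [-0.01, 0.03], [0.03, 0.00], [0.00, 0.02]]
weights = [0.5, 0.5]  # satisfies the budget and no-short-selling constraints

mu, sigma = sample_moments(returns)
mean_p = portfolio_mean(weights, mu)
var_p = portfolio_variance(weights, sigma)
print(mean_p, var_p)
```

A target level \(\eta\) is attainable by this portfolio whenever `mean_p >= eta`.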

Our aim is to include value-at-risk (VaR) as a risk measure in the portfolio selection process in addition to the portfolio expected return and variance. VaR is one of the most popular risk management tools in the financial industry and it is commonly used to control risk (Longerstaey and Spencer 1996). As usual, \(VaR_\varepsilon\) represents the maximum loss at a given confidence level \((1-\varepsilon )\) related to a predefined time horizon \(\{1,\dots ,T\}\), where typically \(\varepsilon =0.01, 0.05, 0.10\). Therefore, for a given portfolio x, \(VaR_\varepsilon (x)\) is the value such that the portfolio loss \(L_P(x)\,=\,-R_P(x)=\,-\sum _{k=1}^{n}\, x_k R_k\) exceeds \(VaR_\varepsilon (x)\) with a probability of \(\varepsilon \cdot 100\%\). More formally, \(VaR_\varepsilon (x)\) of the random portfolio return \(R_P(x)\) is the \(\varepsilon\)-quantile \(Q_\varepsilon (R_P(x))\) of its distribution with negative sign

$$\begin{aligned} \begin{aligned}&VaR_\varepsilon (x)=-Q_\varepsilon (R_P(x)). \end{aligned} \end{aligned}$$
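
For T equally likely scenarios, this definition can be illustrated with a short Python sketch. The quantile convention used here is an assumption made for the example: at most \(\lfloor \varepsilon T \rfloor\) scenario returns are allowed to lie strictly below the quantile. The data and the function name are hypothetical.

```python
import math

def historical_var(x, r, eps):
    """Scenario-based VaR_eps(x) = -Q_eps(R_P(x)); r[t][k] is the return of asset k in scenario t."""
    port = sorted(sum(xk * rk for xk, rk in zip(x, row)) for row in r)
    T = len(port)
    # Assumed convention: at most floor(eps * T) scenarios fall strictly below the quantile.
    # The small tolerance guards against floating-point noise in eps * T.
    m = math.floor(eps * T + 1e-9)
    return -port[m]

# Hypothetical data: one asset, T = 10 scenarios
returns = [[-0.05], [-0.02], [0.01], [0.03], [-0.01],
           [0.02], [0.00], [0.04], [-0.03], [0.01]]
x = [1.0]

var_10 = historical_var(x, returns, 0.10)  # one scenario may exceed the VaR
var_05 = historical_var(x, returns, 0.05)  # no scenario may exceed the VaR
print(var_10, var_05)
```

With these numbers, the 10% VaR is the second-largest loss and the 5% VaR is the largest loss.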

For the Mean-Variance-VaR approach, a random portfolio return \(R_P(x)\) is preferred to \(R_P(y)\) if and only if \(\mu _{P}(x) \ge \mu _{P}(y)\), \(\sigma _{P}^{2}(x) \le \sigma _{P}^{2}(y)\) and \(VaR_\varepsilon (x) \le VaR_\varepsilon (y)\), where at least one inequality is strict. Therefore, the efficient surface of the Mean-Variance-VaR model can be obtained by finding the non-dominated portfolios, which are the Pareto-optimal solutions of the following tri-objective optimization problem

$$\begin{aligned} \left\{ { \begin{array}{lll} \mathop{\min}\limits_{x} &{} \left( -\mu _{P}(x), \sigma _{P}^{2}(x), VaR_\varepsilon (x) \right) &{} \\ \text{ s.t. } &{} &{} \\ &{} x \in \displaystyle\Delta =\biggl \{x \in {\mathbb {R}}^n: \sum _{k=1}^n\,x_k\,=\,1, \, x_k \ge 0, k= 1, \dots , n \biggr \} \,. &{} \end{array} }\right. \end{aligned}$$
(2)

To practically solve Problem (2), we transform it into a single-objective optimization problem, by applying the classical \(\epsilon\)-constraint method (see, e.g., Cesarone 2020) as follows

$$\begin{aligned} \left\{ { \begin{array}{lll} \mathop{\min}\limits_{x} &{} \sigma _{P}^{2}(x) &{} \\ \text{ s.t. } &{} &{} \\ &{} \mu _{P}(x)\,\ge \, \eta &{} \\ &{} VaR_\varepsilon (x)\,\le \, z &{} \\ &{} \displaystyle\sum _{k=1}^n x_k = 1 &{} \\ &{} x_k \ge 0 &{} k=1,\ldots ,n \end{array} }\right. \end{aligned}$$
(3)

where \(\eta\) and z are the required target levels of the portfolio expected return and VaR, respectively. Similar to Benati and Rizzi (2007), we can substitute \(VaR_\varepsilon (x)=-r_\varepsilon\) by adding the constraints \(r_\varepsilon \le \sum _{k=1}^n r_{k t}\,x_k + M\,(1-y_t)\), \(\forall t=1,\dots ,T\) and \(\sum _{t=1}^T y_t\ge (1-\varepsilon )\,T\), where \(r_\varepsilon\) is a real variable, M is a sufficiently large positive number, and \(y_t\), with \(t=1,\dots ,T\), are Boolean variables. Thus, Problem (3) can be formulated as the following mixed-integer quadratic programming (MIQP) problem

$$\begin{aligned} \left\{ { \begin{array}{llll} \mathop{\min}\limits_{(x, r_\varepsilon , y)} &{} \displaystyle\sum _{k=1}^n \sum _{j=1}^n x_k\,x_j\,\sigma _{kj} &{} &{} \\ \text{ s.t. } &{} &{} &{} \\ &{} \displaystyle\sum _{k=1}^n\,\mu _k\,x_k\,\ge \, \eta &{} &{} \\ &{} -r_\varepsilon \,\le \, z &{} &{} \text{(4a)} \\ &{} r_\varepsilon \le \displaystyle\sum _{k=1}^n r_{k t}\,x_k + M\,(1-y_t) &{} t=1,\dots ,T &{} \text{(4b)} \\ &{} \displaystyle\sum _{t=1}^T y_t \ge (1-\varepsilon )\,T &{} &{} \text{(4c)} \\ &{} \displaystyle\sum _{k=1}^n x_k = 1 &{} &{} \\ &{} x_k \ge 0 &{} k=1,\ldots ,n &{} \\ &{} y_t\in \{0,1\} &{} t=1,\dots ,T &{} \end{array} }\right. \end{aligned}$$
(4)

Note that when the portfolio loss \(-\sum _{k=1}^n r_{k t}\,x_k\) is above \(-r_\varepsilon\) at time t, then, for sufficiently large \(M>0\), we must have \(1-y_t=1\) in constraint (4b) so that \(y_t\) must be equal to 0. On the other hand, constraint (4c) imposes that the number of the portfolio loss scenarios that exceed \(-r_\varepsilon\) is not greater than \(\varepsilon T\). Thus, \(-r_\varepsilon\) actually represents the VaR of the portfolio x and is bounded above by z (i.e., by the required target level of the portfolio VaR) through constraint (4a). In the empirical analysis we set \(\varepsilon = 0.01, 0.05\).
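
The role of the binary variables can be checked numerically. The Python sketch below takes a fixed portfolio x, builds the largest \(r_\varepsilon\) and the corresponding \(y_t\) compatible with constraints (4b) and (4c), and then verifies (4a)-(4c). It is only a feasibility check under made-up data, not a solver for Model (4); the function name and the numerical tolerance are assumptions of this example.

```python
import math

def check_var_constraints(x, r, eps, z, M):
    """Return (r_eps, y, feasible) for a fixed portfolio x and scenario returns r[t][k]."""
    port = [sum(xk * rk for xk, rk in zip(x, row)) for row in r]
    T = len(port)
    m = math.floor(eps * T + 1e-9)          # number of scenarios that may be excluded
    r_eps = sorted(port)[m]                 # largest r_eps allowed by (4b)-(4c)
    y = [1 if p >= r_eps else 0 for p in port]
    ok_4a = -r_eps <= z
    ok_4b = all(r_eps <= p + M * (1 - yt) for p, yt in zip(port, y))
    ok_4c = sum(y) >= (1 - eps) * T - 1e-9
    return r_eps, y, ok_4a and ok_4b and ok_4c

# Hypothetical data: T = 5 scenarios, n = 2 assets
returns = [[-0.06, -0.02], [0.02, 0.00], [-0.01, -0.03], [0.03, 0.01], [0.00, 0.02]]
x = [0.5, 0.5]

r_eps, y, feasible = check_var_constraints(x, returns, eps=0.2, z=0.025, M=1.0)
_, _, feasible_small_M = check_var_constraints(x, returns, eps=0.2, z=0.025, M=0.01)
print(r_eps, y, feasible, feasible_small_M)
```

In this toy case the first scenario is the only one with \(y_t=0\), and the second call shows that a value of M that is too small makes (4b) cut off an otherwise valid solution.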

Remark 1

(Computational improvement of the Mean-Variance-VaR model) Model (4) is an MIQP problem that contains a big-M coefficient in constraint (4b). It is well known that for this type of formulation the computational efficiency of the solvers typically improves when the value of M is reduced. Clearly, the parameter M should be chosen in a way that guarantees the validity of the Mean-Variance-VaR formulation. To this aim, the lowest possible value of M is given by \(M_{ideal}=\max \nolimits _{1\le t \le T}\{-z-\sum \nolimits _{k=1}^{n}r_{kt}x_{k}^{\star }\}\), where \(x_{k}^{\star }\) is the optimal solution of Model (4). However, since the optimal solution of Model (4) is not available a priori, we need to use a computable upper bound for this value. Here, we propose to use the value \({\widetilde{M}}_{ideal}=-z-\min \nolimits _{k,t} r_{kt}\), which clearly satisfies \({\widetilde{M}}_{ideal}\ge M_{ideal}\) because the optimal weights are non-negative and sum to one.

In what follows, we report some computational experiments showing an example of the improvement in computational time depending on the choice of the parameter M. More precisely, for the computationally demanding case where \(T=330\), \(n=28\), and \(\varepsilon =5\%\), we obtain \(M_{ideal}=0.04\) and \({\widetilde{M}}_{ideal}=0.30\). We observe that the times required to solve Model (4) with \({\widetilde{M}}_{ideal}\) and \(M_{ideal}\) are essentially the same, namely 3 s, while the time required to solve the same model with the trivial bound \(M=1\) is 850 s, more than two orders of magnitude larger.
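
The computable bound of Remark 1 only needs the scenario returns and the VaR target z. The following Python sketch, with made-up data and hypothetical function names, computes it and compares it with the value of M actually needed by one particular long-only portfolio.

```python
def big_m_upper_bound(r, z):
    """Computable bound of Remark 1: -z minus the smallest single-asset scenario return."""
    return -z - min(min(row) for row in r)

def big_m_needed(x, r, z):
    """Value max_t { -z - sum_k r_kt x_k } for a given portfolio x."""
    return max(-z - sum(xk * rk for xk, rk in zip(x, row)) for row in r)

# Hypothetical data: T = 5 scenarios, n = 2 assets, VaR target z = 0.025
returns = [[-0.06, -0.02], [0.02, 0.00], [-0.01, -0.03], [0.03, 0.01], [0.00, 0.02]]
z = 0.025

m_tilde = big_m_upper_bound(returns, z)
m_x = big_m_needed([0.5, 0.5], returns, z)
print(m_tilde, m_x)
```

Since every long-only, fully invested portfolio return is a convex combination of the asset returns in each scenario, `m_x` cannot exceed `m_tilde` for any feasible x.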

As described at the beginning of this section, in Problem (4) we use the sample covariance matrix for evaluating the portfolio variance. However, since the estimation of the covariance matrix represents a sensitive issue (see, e.g., Kondor et al. 2007; DeMiguel et al. 2009; Cesarone et al. 2020a), in the following remark we provide some clarifications on this choice.

Remark 2

(About the covariance matrix) The sensitivity to estimation errors of the minimum risk portfolios strongly depends on the ratio \(\frac{n}{T}\), namely on the size n of the investment universe and on the number of observations T. As shown by Kondor et al. (2007), when considering a long-short minimum variance portfolio the instability of the optimal solution blows up for \(n\simeq T\), namely when the covariance matrix becomes singular. We handle this aspect by appropriately setting n and T, namely ensuring that \(n<T\) (see Sect. 3). Furthermore, the no-short-selling constraints in Problem (4) tend to improve the stability of the optimal solutions, as highlighted by Jagannathan and Ma (2003). Indeed, the authors show that considering long-only portfolios when minimizing the portfolio variance is equivalent to finding long-short minimum risk portfolios with shrunk covariance matrices that help to reduce estimation errors.

The Mean-Variance-VaR Pareto-optimal portfolios can be obtained as solutions of Problem (4) by appropriately varying the target level of the portfolio expected return \(\eta\) and the target level of the portfolio VaR z, as shown in the next section.

2.1 Finding the Mean-Variance-VaR efficient surface

In this section, we discuss how to practically obtain the Mean-Variance-VaR efficient surface by solving Problem (4), similarly to Roman et al. (2007). Basically, we minimize the portfolio variance by appropriately varying the target level of the portfolio expected return \(\eta\) and the target level of the portfolio VaR z.

To obtain all the Pareto-optimal portfolios, we first determine an appropriate interval for \(\eta\). This is the interval \([\eta _{\min },\,\eta _{\max }]\), where \(\eta _{\min }=\max \{\eta _{minV},\,\eta _{minVaR}\}\) and \(\eta _{\max }=\mu _{P}(x_{\max })\) with \(x_{\max } = \mathop {\mathrm {arg\,max}}\limits _{x \in \Delta } \mu _{P}(x)\). Here, \(\eta _{minV}=\mu _{P}(x_{minV})\) with \(x_{minV} = \mathop {\mathrm {arg\,min}}\limits _{x \in \Delta } \sigma _{P}^{2}(x)\) and \(\eta _{minVaR}=\mu _{P}(x_{minVaR})\) with \(x_{minVaR} = \mathop {\mathrm {arg\,min}}\limits _{x \in \Delta } VaR_\varepsilon (x)\).

Then, for a fixed level \(\eta \in [\eta _{\min },\,\eta _{\max }]\), we define the appropriate interval of z, \([z_{\min }(\eta ), z_{\max }(\eta )]\), whose values guarantee that the optimal portfolios of Problem (4) are non-dominated. More precisely, \(z_{\min }(\eta )=VaR_\varepsilon (x_{minVaR}(\eta ))\), where \(x_{minVaR}(\eta )\) is the portfolio with minimum VaR whose expected return is bounded below by \(\eta\), namely it is the optimal solution of the following problem (5)

$$\begin{aligned} \left\{ { \begin{array}{lll} \mathop{\min}\limits_{x} &{} VaR_\varepsilon (x) &{} \\ \text{ s.t. } &{} &{} \\ &{} \mu _{P}(x) \ge \eta \\ &{} x \in \Delta \end{array} }\right. \end{aligned}$$
(5)

On the other hand, \(z_{\max }(\eta )=VaR_\varepsilon (x_{minV}(\eta ))\), where \(x_{minV}(\eta )\) is the portfolio with minimum variance whose expected return is bounded below by \(\eta\), namely it is the optimal solution of the following problem (6)

$$\begin{aligned} \left\{ { \begin{array}{lll} \mathop{\min}\limits_{x} &{} \sigma _{P}^{2}(x) &{} \\ \text{ s.t. } &{} &{} \\ &{} \mu _{P}(x) \ge \eta \\ &{} x \in \Delta \end{array} }\right. \end{aligned}$$
(6)

In Fig. 1, we report an example of the Pareto-optimal portfolios obtained from Model (4) in the Variance-VaR plane for several fixed levels of the portfolio expected return \(\eta\). Note that by solving Problem (4) for different levels of the portfolio expected return \(\eta \in [\eta _{\min },\,\eta _{\max }]\) with \(z=z_{\min }(\eta )\), we obtain the Mean-VaR efficient frontier (see the bold black dashed line). On the other hand, when we solve Problem (4) for different values of \(\eta \in [\eta _{\min },\,\eta _{\max }]\) with \(z=z_{\max }(\eta )\), we achieve the Mean-Variance efficient frontier (see the red dashed line).

Fig. 1 Example of the Mean-Variance-VaR Pareto-optimal portfolios (with \(\varepsilon =1\%\)) for several levels of the portfolio expected return \(\eta\) in the Variance-VaR plane

For a fixed level of \(\eta \in [\eta _{\min },\,\eta _{\max }]\), if we require stronger conditions on the portfolio VaR, namely lower levels of \(z(\eta )\), we clearly obtain efficient portfolios with higher variance, because the feasible region in (4) becomes smaller. However, the in-sample increase in variance is relatively small for lower levels of required expected return \(\eta\). This behavior is confirmed in our out-of-sample analysis in Sect. 3.2. Thus, it seems reasonable to complement the classical Mean-Variance model with a restriction on the VaR level, particularly for low levels of \(\eta\).

We also observe that the number of selected assets in the optimal solutions decreases when the required VaR becomes smaller (see also Fig. 3). Furthermore, as the required target portfolio return \(\eta\) increases, both the portfolio variance and its VaR typically increase as shown in Fig. 1. We also note that for \(\eta =\eta _{\min }\) and \(z=z_{\max }(\eta _{\min })\), we obtain the Global Minimum Variance (GMinV) portfolio (see the bold x in Fig. 1). On the other hand, when \(\eta =\eta _{\max }\), the efficient frontier degenerates into a single point: the portfolio consisting of the single asset with the highest expected return.

2.2 The Mean-Variance-CVaR model

For comparison purposes, we also test the model proposed by Roman et al. (2007), which we report here for convenience:

$$\begin{aligned} \left\{ { \begin{array}{lll} \mathop{\min}\limits_{x} &{} \sigma _{P}^{2}(x) &{} \\ \text{ s.t. } &{} &{} \\ &{} \mu _{P}(x)\,\ge \, \eta &{} \\ &{} CVaR_\varepsilon (x)\,\le \, \lambda &{} \\ &{} \displaystyle\sum _{k=1}^n x_k = 1 &{} \\ &{} x_k \ge 0 &{} k=1,\ldots ,n \end{array} }\right. \end{aligned}$$
(7)

Exploiting the LP formulation of Rockafellar and Uryasev (2000) for CVaR, Model (7) can be reformulated as a QP, as shown at the end of Sect. 4 of Roman et al. (2007). Again, to determine all the Pareto-optimal portfolios, we detect a proper interval for \(\eta \in [\eta _{\min },\,\eta _{\max }]\), where \(\eta _{\min }=\max \{\eta _{minV},\,\eta _{minCVaR}\}\), \(\eta _{\max }=\mu _{P}(x_{\max })\) with \(x_{\max } = \mathop {\mathrm {arg\,max}}\limits _{x \in \Delta } \mu _{P}(x)\), and \(\eta _{minCVaR}=\mu _{P}(x_{minCVaR})\) with \(x_{minCVaR} = \mathop {\mathrm {arg\,min}}\limits _{x \in \Delta } CVaR_\varepsilon (x)\). Thus, for a fixed level \(\eta \in [\eta _{\min },\,\eta _{\max }]\), we identify the appropriate interval of \(\lambda \in [\lambda _{\min }(\eta ), \lambda _{\max }(\eta )]\), with \(\lambda _{\min }(\eta )=CVaR_\varepsilon (x_{minCVaR}(\eta ))\) and \(\lambda _{\max }(\eta )=CVaR_\varepsilon (x_{minV}(\eta ))\), where \(x_{minCVaR}(\eta )\) and \(x_{minV}(\eta )\) are the portfolios with minimum CVaR and with minimum variance, respectively, whose expected returns are bounded below by \(\eta\).
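
For equally likely scenarios, the Rockafellar and Uryasev (2000) representation \(CVaR_\varepsilon (x)=\min _{\zeta }\bigl \{\zeta +\frac{1}{\varepsilon T}\sum _{t=1}^{T}\max (-R_{Pt}(x)-\zeta ,0)\bigr \}\) can be evaluated directly for a fixed portfolio. The Python sketch below does so by scanning the scenario losses as candidate values of \(\zeta\), which is sufficient because the objective is convex and piecewise linear with breakpoints at those losses. The data and the function name are hypothetical, and the sketch does not reproduce the QP reformulation of Model (7).

```python
def scenario_cvar(x, r, eps):
    """CVaR_eps of the portfolio loss for T equally likely scenarios r[t][k]."""
    losses = [-sum(xk * rk for xk, rk in zip(x, row)) for row in r]
    T = len(losses)

    def ru_objective(zeta):
        # zeta + (1 / (eps * T)) * sum_t max(loss_t - zeta, 0)
        return zeta + sum(max(l - zeta, 0.0) for l in losses) / (eps * T)

    # The minimum over zeta is attained at one of the scenario losses
    return min(ru_objective(zeta) for zeta in losses)

# Hypothetical data: one asset, T = 10 scenarios
returns = [[-0.05], [-0.02], [0.01], [0.03], [-0.01],
           [0.02], [0.00], [0.04], [-0.03], [0.01]]
x = [1.0]

cvar_20 = scenario_cvar(x, returns, 0.20)  # average of the two largest losses
cvar_10 = scenario_cvar(x, returns, 0.10)  # the largest loss
print(cvar_20, cvar_10)
```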

3 Empirical analysis

In this section, we conduct an extensive analysis on several real-world datasets to examine both the practical applicability of the Mean-Variance-VaR model and its out-of-sample performance. In Tables 1 and 2, we list the datasets considered in this study, containing weekly and daily linear returns, respectively, for some major market indexes. The datasets in Table 1 are also available in Bruni et al. (2016), and have been widely used in the literature (see Bruni et al. 2017; Cesarone et al. 2019, 2020b; Bellini et al. 2021; Corsaro et al. 2021); the datasets in Table 2 are publicly available on the website https://host.uniroma3.it/docenti/cesarone/DataSets.htm, and have been used in other empirical analyses on portfolio selection (Carleo et al. 2017; Cesarone et al. 2020a; Benati and Conde 2022).

Table 1 List of the weekly datasets analyzed
Table 2 List of the daily datasets analyzed

3.1 Empirical setup and performance measures

We first perform some numerical tests to determine the computational times required by Gurobi (Gurobi Optimization 2021), one of the best currently available MIQP solvers, to solve our model when varying the number n of the assets in the investment universe, the number T of the historical scenarios, and the confidence level \(\varepsilon\). For this purpose, we use the SP500 dataset listed in Table 1.

Table 3 Computational times (in seconds) for solving the Mean-Variance-VaR model with \(\varepsilon\)=1\(\%\)
Table 4 Computational times (in seconds) for solving the Mean-Variance-VaR model with \(\varepsilon\)=5\(\%\)

In Tables 3 and 4, we report the computational times (in seconds) for finding a Mean-Variance-VaR portfolio when \(\varepsilon\) is fixed to 1% and 5%, respectively. The experiments are performed by varying n from 20 to 442 and T from 52 to 595, for each pair (n, T) with \(n<T\), as required to avoid a singular covariance matrix (see the discussion in Remark 2). As expected, the computational times tend to increase with n and T, most notably when \(\varepsilon =5\%\). Indeed, in this case, the Gurobi MIQP solver typically takes more than eight hours to find the optimal solution for high values of n and T. On the other hand, for \(\varepsilon =10\%\), the computational times tend to exceed one day, thus becoming impractical, as also pointed out by Benati and Rizzi (2007). All the procedures have been implemented in MATLAB R2019b using the GUROBI 9.1 optimization solver, and have been executed on a laptop with an Intel(R) Core(TM) i7-8565U CPU @ 1.80 GHz processor and 8 GB of RAM.

For the out-of-sample performance analysis, we adopt a rolling time window (RTW) scheme of evaluation, namely we allow for the possibility of rebalancing the portfolio composition during the holding period at fixed intervals. Here, we choose one financial month both as a rebalancing interval and as a holding period. Furthermore, on the basis of the computational times illustrated in Tables 3 and 4, we choose \(\varepsilon =1\%\) and \(5\%\), in-sample windows of 2 years (namely 104 observations) for weekly datasets (see Table 1), and in-sample windows of 10 months (namely 200 observations) for daily datasets (see Table 2).

The out-of-sample analysis is based on 16 Pareto-optimal portfolios obtained from Problem (4) by appropriately varying the target levels of the portfolio expected return \(\eta\) and of the portfolio VaR z (see Sect. 2.1).

Fig. 2 Example of the 16 Mean-Variance-VaR efficient portfolios selected for the out-of-sample performance analysis

More precisely, we consider 4 different levels of target return \(\eta _{\alpha } = \eta _{\min } + \alpha \, (\eta _{\max } - \eta _{\min })\) with \(\alpha \,=\,0, \,1/4, \,1/2, \,3/4\). Furthermore, for a fixed level \(\eta _{\alpha }\), we choose 4 levels of the portfolio VaR \(z_{\beta }\) in the interval \([z_{\min }(\eta _{\alpha }), z_{\max }(\eta _{\alpha })]\): \(z_{\beta }(\eta _{\alpha }) = z_{\min }(\eta _{\alpha }) + \beta \, (z_{\max }(\eta _{\alpha }) - z_{\min }(\eta _{\alpha }))\) with \(\beta \,=\,0, \,1/3, \,2/3, \,1\). Note that for \(\beta =0\) we obtain the Mean-VaR optimal portfolios (see the bold black dashed line in Fig. 1), while for \(\beta =1\) we select the Mean-Variance optimal portfolios (see the red dashed line in Fig. 1). In Fig. 2, we report an example of the 16 Mean-Variance-VaR efficient portfolios in the Variance-VaR plane for the DowJones daily dataset (see Table 2). Furthermore, to better understand the composition and diversification of these 16 Pareto-optimal portfolios, Fig. 3 displays a boxplot showing the number of assets selected by each Mean-Variance-VaR efficient portfolio, using the RTW scheme of evaluation. We observe that, for a fixed target level of the portfolio expected return \(\eta\), the number of selected assets tends to decrease when lower levels of VaR are required for the portfolio. On the other hand, for a fixed level of the portfolio VaR, the number of selected assets tends to decrease when increasing the required level \(\eta\) of the portfolio expected return.
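
The construction of the 16 target pairs \((\eta _{\alpha }, z_{\beta }(\eta _{\alpha }))\) can be sketched in Python as follows. The callable `z_bounds` is a placeholder: in practice \(z_{\min }(\eta )\) and \(z_{\max }(\eta )\) come from solving Problems (5) and (6), whereas here a purely hypothetical linear rule and made-up values of \(\eta _{\min }\) and \(\eta _{\max }\) are used so that the example runs on its own.

```python
def target_grid(eta_min, eta_max, z_bounds):
    """Return the 16 (eta, z) target pairs; z_bounds(eta) must return (z_min, z_max)."""
    alphas = [0.0, 1 / 4, 1 / 2, 3 / 4]
    betas = [0.0, 1 / 3, 2 / 3, 1.0]
    grid = []
    for a in alphas:
        eta = eta_min + a * (eta_max - eta_min)
        z_lo, z_hi = z_bounds(eta)
        for b in betas:
            grid.append((eta, z_lo + b * (z_hi - z_lo)))
    return grid

# Hypothetical stand-in for the solutions of Problems (5) and (6)
def toy_z_bounds(eta):
    return 0.01 + eta, 0.03 + 2.0 * eta

grid = target_grid(eta_min=0.001, eta_max=0.005, z_bounds=toy_z_bounds)
for eta, z in grid:
    print(round(eta, 4), round(z, 4))
```

Each pair would then be passed as \((\eta , z)\) to Problem (4); the pairs with \(\beta =0\) and \(\beta =1\) correspond to the Mean-VaR and Mean-Variance portfolios, respectively.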

Fig. 3 Number of assets selected by each Mean-Variance-VaR efficient portfolio

The out-of-sample performance of the 16 Mean-Variance-VaR efficient portfolios is compared with that of the Equally Weighted (EW) portfolio and with that of 16 Mean-Variance-CVaR efficient portfolios. Similarly to the VaR case, these portfolios are obtained by considering 4 levels of target return \(\eta _{\alpha } = \eta _{\min } + \alpha \, (\eta _{\max } - \eta _{\min })\) with \(\alpha \,=\,0, \,1/4, \,1/2, \,3/4\), and, for a fixed level \(\eta _{\alpha }\), 4 levels of the portfolio CVaR \(\lambda _{\beta }(\eta _{\alpha }) = \lambda _{\min }(\eta _{\alpha }) + \beta \, (\lambda _{\max }(\eta _{\alpha }) - \lambda _{\min }(\eta _{\alpha }))\) with \(\beta \,=\,0, \,1/3, \,2/3, \,1\).

For each portfolio strategy, we evaluate the out-of-sample results by using the following performance measures commonly employed in the literature (see, e.g., Cesarone and Colucci 2018; Bruni et al. 2017; Cesarone et al. 2022b), and described below.

  • The Sharpe ratio (Sharpe 1966, 1994) measures the gain per unit risk and is defined as

    $$\begin{aligned} \textrm{Sharpe}=\frac{\mu ^\textrm{out}-r_{f}}{\sigma ^\textrm{out}}, \end{aligned}$$

    where \(r_{f}=0\), \(\mu ^\textrm{out}\) is the sample mean of the out-of-sample portfolio return \(R^\textrm{out}\), and \(\sigma ^\textrm{out}\) is its standard deviation. A higher Sharpe ratio indicates a better portfolio performance.

  • The Maximum DrawDown (MaxDD, Chekhlov et al. 2005) measures the maximum potential out-of-sample loss from the observed peak, and is defined as

    $$\begin{aligned} \textrm{MaxDD}=\min \limits _{T^\textrm{in}+1\le t\le T} \textrm{DD}_{t}, \end{aligned}$$

    where \(T^\textrm{in}\) is the length of the in-sample window, and the DrawDown is computed as

    $$\begin{aligned} \textrm{DD}_{t}=\frac{W_{t}-\max \limits _{T^\textrm{in}+1\le \tau \le t}W_{\tau }}{\max \limits _{T^\textrm{in}+1\le \tau \le t}W_{\tau }}, \qquad t \in \{T^\textrm{in}+1,\dots T\}. \end{aligned}$$

    Here, \(W_{t}=W_{t-1}(1+R^\textrm{out}_{t})\) denotes the portfolio wealth at time t, with \(W_{0}=1\). The MaxDD is always non-positive, hence higher values are preferable.

  • The Ulcer index (Martin and McCann 1989) evaluates the depth and the duration of drawdowns in prices over the out-of-sample period, and is defined as

    $$\begin{aligned} \textrm{Ulcer} = \sqrt{\frac{\sum \limits _{t=T^\textrm{in}+1}^{T}DD_{t}^2}{T-T^\textrm{in}}}. \end{aligned}$$

    A lower Ulcer value indicates a better portfolio performance.

  • The Turnover (DeMiguel et al. 2009) evaluates the amount of trading required to perform in practice the portfolio strategy, and is defined as

    $$\begin{aligned} \textrm{Turnover} = \frac{1}{Q}\sum _{q=1}^{Q}\sum _{k=1}^{n}\mid x_{q,k}-x_{q-1,k}\mid , \end{aligned}$$

    where Q represents the number of rebalances, \(x_{q,k}\) is the portfolio weight of asset k after rebalancing, and \(x_{q-1,k}\) is the portfolio weight before rebalancing at time q. Lower turnover values indicate better portfolio performance. We point out that this definition of portfolio turnover is a proxy of the effective one, since it evaluates only the amount of trading generated by the models at each rebalance, without considering the trades due to changes in asset prices between one rebalance and the next. Thus, by definition, the turnover of the EW portfolio is zero.

  • The Sortino ratio (Sortino et al. 2001) is defined as

    $$\begin{aligned} \textrm{Sortino} = \frac{\mu ^\textrm{out}-r_{f}}{\textrm{TDD}}, \end{aligned}$$

    where \(\textrm{TDD}=\sqrt{{\mathbb {E}}[((R^\textrm{out}-r_{f})_{-})^{2}]}\) is the Target Downside Deviation, and \(r_{f}=0\). Higher values of the Sortino ratio indicate a better portfolio performance.

  • The Rachev ratio (Biglova et al. 2004) measures the relative gap between the mean of the best \(\alpha \%\) values of \(R^\textrm{out}-r_{f}\) and that of the worst \(\beta \%\) ones, and it is computed as

    $$\begin{aligned} \textrm{Rachev} = \frac{\textrm{CVaR}_{\alpha }(r_{f}-R^\textrm{out})}{\textrm{CVaR}_{\beta }(R^\textrm{out}-r_{f})}, \end{aligned}$$

    where \(\alpha =\beta\) is set to either \(5\%\) or \(10\%\), and \(r_{f} = 0\). Higher Rachev ratio values are preferred.
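The measures listed above can be sketched in a few lines of NumPy. This is only a minimal illustration under stated assumptions, not the implementation used for the experiments: the function names and the layout of the `weights` array are hypothetical, the Sharpe ratio uses the sample standard deviation with `ddof=1`, and the two CVaR terms of the Rachev ratio are estimated by plain empirical tail means.

```python
import numpy as np

def sharpe(r, rf=0.0):
    """Sharpe ratio; ddof=1 is an assumption (the text only says 'standard deviation')."""
    r = np.asarray(r, dtype=float)
    return (r.mean() - rf) / r.std(ddof=1)

def drawdowns(r):
    """DD_t = (W_t - running peak) / running peak over the out-of-sample period."""
    r = np.asarray(r, dtype=float)
    wealth = np.cumprod(1.0 + r)           # W_t = W_{t-1} (1 + R_t), starting from 1
    peak = np.maximum.accumulate(wealth)   # max of W_tau for tau <= t
    return (wealth - peak) / peak

def max_drawdown(r):
    """MaxDD is the minimum (most negative) drawdown, hence non-positive."""
    return drawdowns(r).min()

def ulcer(r):
    """Ulcer index: root mean square of the drawdowns."""
    return np.sqrt(np.mean(drawdowns(r) ** 2))

def turnover(weights):
    """Average L1 distance between consecutive weight vectors.

    Assumed layout: `weights` has shape (Q + 1, n), where row q holds the
    portfolio selected at rebalance q and row 0 is the initial portfolio.
    """
    w = np.asarray(weights, dtype=float)
    return np.abs(np.diff(w, axis=0)).sum(axis=1).mean()

def sortino(r, rf=0.0):
    """Sortino ratio with the Target Downside Deviation in the denominator."""
    r = np.asarray(r, dtype=float)
    downside = np.minimum(r - rf, 0.0)     # negative part of the excess return
    tdd = np.sqrt(np.mean(downside ** 2))
    return (r.mean() - rf) / tdd

def rachev(r, alpha=0.05, beta=0.05, rf=0.0):
    """Mean of the best alpha-fraction over |mean of the worst beta-fraction|.

    The tail sizes are rounded up to at least one observation (an assumption).
    """
    x = np.sort(np.asarray(r, dtype=float) - rf)
    k_up = max(1, int(np.ceil(alpha * x.size)))
    k_dn = max(1, int(np.ceil(beta * x.size)))
    best = x[-k_up:].mean()    # estimate of CVaR_alpha(r_f - R)
    worst = x[:k_dn].mean()    # minus the estimate of CVaR_beta(R - r_f)
    return best / -worst
```

Each function takes the vector of out-of-sample returns \(R^\textrm{out}\) (or, for the turnover, the sequence of selected weights) and returns a scalar, so that one column of a table such as Table 5 corresponds to evaluating these functions on the return series of one strategy.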

3.2 Out-of-sample performance results

Table 5 Out-of-sample performance results of the Mean-Variance-VaR efficient portfolios on the EuroStoxx 50 daily dataset with \(\varepsilon =1\%\) (color figure online)

In Table 5, we provide the computational results obtained by the 16 analyzed portfolio strategies and by the benchmark EW portfolio on the EuroStoxx 50 daily dataset (see Table 2), with \(\varepsilon =1\%\). The rank of the performance results is shown in different colors. More precisely, for each row the colors range from deep-green to deep-red, where deep-green represents the best performance, while deep-red represents the worst one. We observe that the EW portfolio shows the lowest Sharpe and Sortino ratios, and one of the highest volatilities. As mentioned above, by definition, the turnover of the EW portfolio is 0. Furthermore, we notice that, when stronger conditions are imposed on the portfolio VaR, the Mean-Variance-VaR portfolios tend to improve their out-of-sample performance. This is more evident for high levels of the portfolio expected return \(\eta\), for which the Pareto-optimal portfolios with low levels of z achieve the best performance in terms of mean return and of Sharpe, Sortino, and Rachev ratios. This behavior is also confirmed by the trend of the cumulative out-of-sample portfolio returns, reported in Fig. 4, where the EW portfolio tends to be dominated by the Mean-Variance-VaR portfolios for all target levels \(\eta\). Moreover, the portfolios with lower levels of z, namely \(z_{0}\) and \(z_{1/3}\), tend to perform better than the Mean-Variance portfolios (namely, the Mean-Variance-VaR portfolios with \(z_1\)). This is most pronounced for the highest level of the portfolio expected return, \(\eta _{3/4}\).

Fig. 4 Cumulative out-of-sample portfolio returns of the Mean-Variance-VaR efficient portfolios using different levels of \(\eta\) and \(\varepsilon =1\%\) on the EuroStoxx 50 daily dataset

In Table 6, for each portfolio strategy we show the out-of-sample performance results obtained on the DowJones daily dataset (see Table 2), with \(\varepsilon \,=\,5\%\).

Table 6 Out-of-sample performance results of the Mean-Variance-VaR efficient portfolios on the DowJones daily dataset with \(\varepsilon =5\%\) (color figure online)

In this case, for higher levels of the portfolio expected return \(\eta\), namely \(\eta _{1/2}\) and \(\eta _{3/4}\), the portfolios with lower levels of VaR, namely \(z_0\) and \(z_{1/3}\), show the highest mean return and the highest Sharpe, Sortino, and Rachev ratios. We highlight that, when comparing the efficient Mean-Variance-VaR portfolios with the Mean-Variance ones, we also observe a general improvement of the out-of-sample performance, particularly for low levels of z. This behavior is also confirmed by the trend of the cumulative out-of-sample portfolio returns, reported in Fig. 5. Here, the EW portfolio seems to be preferable to the Mean-Variance-VaR portfolios with \(\eta =\eta _{\min }\) and \(\eta =\eta _{1/4}\), while for higher levels of the required portfolio return, the Mean-Variance-VaR portfolios tend to dominate the EW portfolio in terms of cumulative returns. As in the previous case, the portfolios with lower levels of z, namely \(z_{0}\) and \(z_{1/3}\), tend to perform better than the Mean-Variance portfolios (namely, the Mean-Variance-VaR portfolios with \(z_1\)). This is most noticeable for the highest level of the portfolio expected return, \(\eta _{3/4}\).

Fig. 5 Cumulative out-of-sample portfolio returns of the Mean-Variance-VaR efficient portfolios using different levels of \(\eta\) and \(\varepsilon =5\%\) on the DowJones daily dataset

For comparison purposes, in Tables 7 and 8 we also provide the out-of-sample performance results obtained by the Mean-Variance-CVaR portfolios described in Sect. 2.2, using the same experimental setup applied to the Mean-Variance-VaR case. Tables 5, 6, 7 and 8 also report the out-of-sample VaR (mean/median) for all the 16 optimal portfolios selected by the Mean-Variance-VaR and the Mean-Variance-CVaR approaches.

Table 7 Out-of-sample performance results of the Mean-Variance-CVaR efficient portfolios on the EuroStoxx 50 daily dataset with \(\varepsilon =1\%\) (color figure online)
Table 8 Out-of-sample performance results of the Mean-Variance-CVaR efficient portfolios on the DowJones daily dataset with \(\varepsilon =5\%\) (color figure online)

We observe that as the required target levels of the in-sample VaR and the in-sample CVaR increase, the out-of-sample VaR typically increases as well.

3.3 Comprehensive analysis of the results

For a better assessment of the performance of Mean-Variance-VaR efficient portfolios, we now report a summary of our computational results on all the datasets listed in Tables 1 and 2. The detailed out-of-sample results can be found in the supplementary materials. In Tables 9 and 10, we summarize the number of datasets where the Mean-Variance-VaR efficient portfolios achieve equal or better performance than that of the EW portfolio when \(\varepsilon =1\%\) and \(\varepsilon =5\%\), respectively. Similarly, in Tables 11 and 12 we report the number of datasets where the Mean-Variance-VaR efficient portfolios achieve equal or better performance than that of the Mean-Variance-CVaR portfolios when \(\varepsilon =1\%\) and \(\varepsilon =5\%\), respectively. In each table we use the green marker when the best performances are obtained in at least 50% of the total cases. We point out that, under stronger conditions on the portfolio VaR, the performance of the optimal portfolios of our model seems to be typically better than that of the EW portfolio in terms of mean, Sharpe ratio, Sortino ratio, and Maximum Drawdown, particularly when \(\varepsilon =5\%\) (see Table 10). Likewise, the performance of the efficient portfolios of our model seems to be generally better than that of the Mean-Variance-CVaR portfolios, particularly for \(\varepsilon =1\%\) (see Table 11). As shown in Table 12, when \(\varepsilon =5\%\) the advantage of the Mean-Variance-VaR portfolios appears to be less evident. This might be due to the higher robustness of CVaR with \(\varepsilon =5\%\), which seems to estimate the tail risk better than CVaR with \(\varepsilon =1\%\).

Table 9 Number of datasets out of six where the Mean-Variance-VaR efficient portfolios achieve equal or better performance than that of the EW portfolio, when \(\varepsilon =1\%\) (color figure online)
Table 10 Number of datasets out of six where the Mean-Variance-VaR efficient portfolios achieve equal or better performance than that of the EW portfolio, when \(\varepsilon =5\%\) (color figure online)
Table 11 Number of datasets out of six where the Mean-Variance-VaR efficient portfolios achieve equal or better performance than that of the Mean-Variance-CVaR portfolios, when \(\varepsilon =1\%\) (color figure online)
Table 12 Number of datasets out of six where the Mean-Variance-VaR efficient portfolios achieve equal or better performance than that of the Mean-Variance-CVaR portfolios, when \(\varepsilon =5\%\) (color figure online)
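The entries of Tables 9, 10, 11 and 12 are counts over the six datasets, and the way such a count can be obtained is sketched below. This is a hedged illustration only: the function names are hypothetical, the direction of preference of each measure must be supplied by the caller (higher is better for the Sharpe, Sortino, and Rachev ratios and for MaxDD, lower is better for Ulcer and Turnover), and the green-marker rule is read here as "count at least half of the datasets", which is our interpretation of the 50% threshold.

```python
import numpy as np

def count_equal_or_better(strategy, benchmark, higher_is_better=True):
    """Number of datasets where `strategy` is at least as good as `benchmark`.

    Both inputs hold one value of the same performance measure per dataset.
    """
    s = np.asarray(strategy, dtype=float)
    b = np.asarray(benchmark, dtype=float)
    wins = (s >= b) if higher_is_better else (s <= b)
    return int(wins.sum())

def is_green(count, n_datasets=6):
    """Assumed reading of the marker rule: at least 50% of the datasets."""
    return count >= 0.5 * n_datasets
```

For instance, calling `count_equal_or_better` with `higher_is_better=False` on the six Ulcer values of a given strategy and of the benchmark gives the corresponding table entry, and `is_green` then decides whether that entry is highlighted.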

4 Conclusions and future research

In this paper, we have proposed a tri-objective portfolio selection model which adds conditions on the portfolio VaR to the classical Mean-Variance approach. The Mean-Variance-VaR model is formulated as a MIQP problem and is solved using Gurobi. We have described appropriate combinations of the parameters \(\varepsilon\), n, and T, for which we can obtain an optimal solution in a reasonable time. Our extensive empirical analysis based on several real-world datasets shows promising results in terms of various performance measures. Indeed, it seems that the Mean-Variance-VaR optimal portfolios can generally achieve equal or better out-of-sample results than those obtained by the EW and by the Mean-Variance-CVaR portfolios. Future research might be directed to extending this approach to other Mean-Risk models and to investigating the stability of the Pareto-optimal solutions obtained by our model (see, e.g., Cesarone et al. 2020a). Furthermore, a broad comparison between parametric and nonparametric estimation approaches for VaR-based portfolio selection problems is left for future studies.