A risk perspective of estimating portfolio weights of the global minimum-variance portfolio
Abstract
The problem of how to determine portfolio weights so that the variance of portfolio returns is minimized has been given considerable attention in the literature, and several methods have been proposed. Some properties of these estimators, however, remain unknown, and many of their relative strengths and weaknesses are therefore difficult to assess for users. This paper contributes to the field by comparing and contrasting the risk functions used to derive efficient portfolio weight estimators. It is argued that risk functions commonly used to derive and evaluate estimators may be inadequate and that alternative quality criteria should be considered instead. The theoretical discussions are supported by a Monte Carlo simulation and two empirical applications where particular focus is set on cases where the number of assets (p) is close to the number of observations (n).
Keywords
Global minimum-variance portfolio, Portfolio theory, High dimensional, Risk functions
JEL Classification
C13, C18, C44, G11
1 Introduction
Markowitz (1952, 1959) developed the theoretical foundation for modern portfolio theory, providing investors with a tool to solve a key issue of how to distribute their wealth among a set of available assets. The problem was postulated as a choice of a portfolio mean return and variance of portfolio returns. This led to two principles where an investor for a given level of portfolio variance maximizes the portfolio return, or likewise for a given portfolio return minimizes the portfolio variance. Hence, according to Markowitz, an investor under such constraints needs only to be concerned about two moments of the assets' multivariate distribution: the mean vector and the covariance matrix. In practical situations, these two quantities are unknown and must be estimated in order to perform the optimization. Many authors have shown that this optimization procedure often fails in practice (Bai et al. 2009; Best and Grauer 1991; Kempf et al. 2002; Merton 1980). Some authors go so far as to argue that the estimation error (sampling variance) dominates the procedure to the extent that the equally weighted non-random portfolio performs better than those optimized from data (Frankfurter et al. 1971; DeMiguel et al. 2009; Michaud 1989).
The estimation problem becomes particularly serious when the number of assets (say p) is close to the number of observations (n). This is mainly because the sample covariance matrix becomes stochastically unstable and may not even be invertible. It is then natural that the standard “plug-in” estimators, defined by simply replacing the unknown mean vector and covariance matrix by the standard textbook estimators, should be replaced by some improved covariance estimator. A significant number of improvements over the plug-in estimator have been developed over the last decades (Frost and Savarino 1986; Ledoit and Wolf 2003, 2004). The vast majority of these fall into two categories, or families, of estimators. The first category is based on the fact that the sample covariance matrix is a poor approximation of the true covariance matrix and, therefore, the estimation problem is concerned with developing improved estimators of the covariance. These improved estimators are then simply substituted for the unknown parameter within Markowitz’s optimal weight function. The other approach, which appears to be the more common one in recent years, relies on principles developed by Stein (1956) and James and Stein (1961). With such estimators, the standard estimator is weighted (“shrunken”) toward a non-random target quantity. This type of estimator seems to have enjoyed a renaissance in portfolio estimation theory, for which it appears to be particularly well suited. A sample of some recent developments includes Bodnar et al. (2018), Frahm and Memmel (2010), Golosnoy and Okhrin (2007), Kempf and Memmel (2006), and Okhrin and Schmid (2007). Each of these methods naturally has its own merits and uses.
Investors want to know the basic properties and the relative risk of favoring one estimator over another before applying any specific method. There is, however, no consensus about how the concept of “risk” should be defined. From a statistical point of view, risk refers to some measure of the difference between a quantity of interest (which could be either random or fixed) and our inference target. Risk is usually expressed through moments of differences, such as the mean squared error (MSE), but could also involve forecast bias, or angles between true and estimated vectors. It is obvious that the optimality properties of any estimator depend on the specific risk measure, or quality criterion, being used to describe it. Indeed, it is well known that the ranking of estimators’ performance, such as estimators of the inverse covariance matrix, may change or even reverse when evaluated with alternative loss functions (Haff 1979; Muirhead 1982).
While most recent papers in portfolio optimization theory have been concerned with the extremely important problem of developing efficient estimators of portfolio weights and related quantities, this paper focuses on the risks associated with these estimators. Specifically, the purpose of this paper is to compare and contrast a number of risk measures for estimators of the global minimum-variance portfolio (GMVP) to give investors and developers of statistical methods a fair understanding of their differences and similarities and, hence, a foundation for determining the weight estimator that is best suited for a given problem.
The paper proceeds as follows. In Sect. 2, the problem of minimum-variance portfolio estimation is stated, Sect. 3 introduces the concept of risk function, and Sect. 4 classifies different GMVP estimators. Section 5 describes the Monte Carlo study design and provides a discussion of the derived results. Section 6 outlines two empirical applications, and Sect. 7 summarizes the findings and concludes.
2 Preliminaries
3 The risk of portfolio estimators
Generally speaking, there are several ways to view the weight vector \(\mathbf {w}\) and the formulation of the inference problem. When deriving statistical estimators and inference procedures for portfolio weights, there is no consensus regarding which quantity to optimize. For example, since \({\mathbf {w}}\in {\mathbb {R}^{p}}\), it is natural to think of estimators in terms of estimating a parameter vector. Alternatively, upon noting that \({\mathbf {w}}=\left( {{\mathbf {1}}^{\prime }{{\varvec{\Sigma }}^{-1}}{\mathbf {1}}}\right) ^{-1}{\varvec{\Sigma }}^{-1}{\mathbf {1}}\) depends only on the unknown parameter \(\varvec{\Sigma }^{-1}\in \mathbb {R}^{p\times p}\), the inference problem may be thought of as one concerned with estimating a matrix, a problem rather different from that of estimating a vector. Yet another view of the inference problem is the out-of-sample prediction variance (Frahm and Memmel 2010). It is obvious that the properties of any estimator \({\hat{\mathbf {w}}}\) will depend on the quality criterion, or risk function, being used to judge it and that no single estimator can optimize all relevant properties simultaneously. In fact, the performance ranking of estimators of \({\mathbf {w}}\) may be changed or even reversed when evaluated on alternative loss functions (Haff 1979; Muirhead 1982). An investor searching the literature for “the best” estimator of the GMVP is likely to end up with a battery of proposed estimators, each being “optimal” in some sense. In this section, we will present and discuss similarities and differences between a number of risk functions for the GMVP problem, some of which are commonly used while others appear to be new in the GMVP context.
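As an illustration of the weight formula \({\mathbf {w}}=\left( {\mathbf {1}}^{\prime }{\varvec{\Sigma }}^{-1}{\mathbf {1}}\right) ^{-1}{\varvec{\Sigma }}^{-1}{\mathbf {1}}\), the following minimal sketch computes the GMVP weights from a known covariance matrix. The function name `gmvp_weights` and the three-asset covariance matrix are hypothetical and chosen only for illustration.

```python
import numpy as np

def gmvp_weights(sigma):
    """GMVP weights w = Sigma^{-1} 1 / (1' Sigma^{-1} 1)."""
    ones = np.ones(sigma.shape[0])
    # Solve Sigma x = 1 instead of forming the inverse explicitly.
    x = np.linalg.solve(sigma, ones)
    return x / (ones @ x)

# Hypothetical covariance matrix of three uncorrelated assets.
sigma = np.diag([0.04, 0.01, 0.02])
w = gmvp_weights(sigma)

# With a diagonal Sigma the weights are proportional to the inverse variances.
print(w)
print(w.sum())
```

In practice \({\varvec{\Sigma }}\) is unknown, and replacing it by an estimate in this function gives the plug-in type estimators discussed in the text.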
Remark (i) (Directional risks) Any estimator of the GMVP may be decomposed into components orthogonal and parallel to \({\mathbf {w}}\): Let \({\mathbf{A}}^{+}\) denote the Moore–Penrose pseudoinverse of some matrix \(\mathbf {A}\). Then, the component of \({\hat{\mathbf {w}}}\) parallel to \(\mathbf {w}\) is given by \({\mathbf{v}}=\left( {{{\mathbf{w}}^{+}} {\hat{\mathbf{w}}}}\right) {\mathbf{w}}=\left( \frac{\mathbf{w'}\hat{\mathbf{w}}}{{\mathbf{w'w}}}\right) {\mathbf{w}}\) and the component orthogonal to \(\mathbf {w}\) is determined by \({\mathbf{u}}={{\varvec{\Omega }}_{\bot }}{\hat{\mathbf{w}}}\) where \({\varvec{\Omega }}_{\bot }=\left( {{\mathbf {I}}-\left( {\frac{{\mathbf {ww^{\prime }}}}{{\mathbf {w^{\prime }w}}}}\right) }\right) \) is a projection matrix (Rao 2008, pp. 46–47). We can thus decompose \({\hat{\mathbf {w}}}\) according to \({\hat{\mathbf {w}}}={\mathbf {v}}+{\mathbf {u}}\), where \({\mathbf {v}}\) is parallel to \({\mathbf {w}}\) and \({\mathbf {u}}\) is orthogonal to \({\mathbf {w}}\). Some special cases of the above-defined risk functions in the direction orthogonal to the GMVP, which is our direction of main interest, are then given by \({\mathfrak {R}_{1}}\left( {{{\hat{{\varvec{\Sigma }}}}^{-1}},{{\varvec{\Omega }}_{\bot }}}\right) \), \({\mathfrak {R}_{2}}\left( {{{\hat{{\varvec{\Sigma }}}}^{-1}},{{\varvec{\Omega }}_{\bot }}}\right) \), \({\mathfrak {R}_{3}}\left( {\mathbf{u}}\right) \) and \({\mathfrak {R}_{4}}\left( {{\mathbf{u}},{\mathbf {r}_{m}}}\right) \). Although the risk in a certain direction to \({\mathbf {w}}\) alone is of limited interest, it does provide some insight into the relative performance of one estimator to another. 
For example, it may be shown that \(E\left[ \left( {\hat{\mathbf {w}}}_{I}-{\mathbf {w}}\right) ^{\prime }{\varvec{\Omega }}_{\bot }\left( {\hat{\mathbf {w}}}_{I}-{\mathbf {w}}\right) \right] =\frac{1}{\left( {n-p+1}\right) }\frac{{1}}{{{\mathbf {1^{\prime }}}{{\varvec{\Sigma }}}^{-1}{\mathbf {1}}}}\left( {\mathrm{tr}\left\{ {\varvec{\Sigma }}^{-1}\right\} -\left( \frac{{\mathbf {1^{\prime }}}{\varvec{\Sigma }}^{-3}{\mathbf {1}}}{{\mathbf {1^{\prime }}}{{\varvec{\Sigma }}}^{-2}{\mathbf {1}}}\right) }\right) ,\) whereas \(\left( {\mathbf {w}}_{0}-{\mathbf {w}}\right) ^{\prime }{\varvec{\Omega }_{\bot }}\left( {\mathbf {w}}_{0}-{\mathbf {w}}\right) ={p^{-1}}\left( {1-\frac{{{p^{-1}}{\left( {\mathbf {1^{\prime }}}{\varvec{\Sigma }}^{-1}{\mathbf {1}}\right) }^{2}}}{{\mathbf {1^{\prime }}}{{\varvec{\Sigma }}^{-2}}{\mathbf {1}}}}\right) \) (see Appendix A).
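The decomposition in Remark (i) can be checked numerically. The sketch below is only an illustration under assumed inputs: the true covariance matrix is taken to be a small Toeplitz matrix, and `w_hat` is an artificial perturbation of \({\mathbf {w}}\) standing in for an estimate. It verifies that \({\hat{\mathbf {w}}}={\mathbf {v}}+{\mathbf {u}}\) with \({\mathbf {u}}\) orthogonal to \({\mathbf {w}}\), and that the stated expression for the orthogonal risk of the equally weighted portfolio \({\mathbf {w}}_{0}=p^{-1}{\mathbf {1}}\) agrees with a direct evaluation.

```python
import numpy as np

rng = np.random.default_rng(0)
p = 5

# Assumed true covariance for illustration: Toeplitz with entries 0.5**|i-j|.
idx = np.arange(p)
sigma = 0.5 ** np.abs(idx[:, None] - idx[None, :])
ones = np.ones(p)

x = np.linalg.solve(sigma, ones)       # Sigma^{-1} 1
w = x / (ones @ x)                     # true GMVP weights

# Artificial "estimate": perturb w while keeping the weights summing to one.
noise = 0.1 * rng.standard_normal(p)
noise -= noise.mean()
w_hat = w + noise

# Parallel component v = (w'w_hat / w'w) w and projector Omega_perp = I - ww'/w'w.
v = (w @ w_hat) / (w @ w) * w
omega_perp = np.eye(p) - np.outer(w, w) / (w @ w)
u = omega_perp @ w_hat

# Orthogonal risk of the equally weighted portfolio, evaluated two ways.
w0 = ones / p
lhs = (w0 - w) @ omega_perp @ (w0 - w)
a = ones @ x                           # 1' Sigma^{-1} 1
b = x @ x                              # 1' Sigma^{-2} 1 (Sigma is symmetric)
rhs = (1 - a**2 / (p * b)) / p

print(np.allclose(v + u, w_hat), abs(u @ w) < 1e-12, np.isclose(lhs, rhs))
```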
Remark (ii) (Implicit Covariance Matrix) There always exists a p.d. diagonal matrix \({\varvec{\Lambda }}\) such that \({\mathbf{P1}}= {{{\varvec{\Lambda }} \mathbf{P}}}{{\mathbf{w}}_{0}}\) or, equivalently, \({{\mathbf{w}}_{0}}={\mathbf{P}}{{\varvec{\Lambda }}^{-1}}{\mathbf{P^{\prime }1}}\), where \({{\mathbf{w}}_{0}}\) is any reference portfolio (Frahm and Memmel 2010, Theorem 8). The Stein-type estimator defined by \({{\hat{\mathbf{w}}}_{S}}=\left( {1-\alpha }\right) {{\hat{\mathbf{w}}}_{I}}+\alpha {{\mathbf{w}}_{0}}\) is therefore associated with an “implicit” covariance matrix estimator in the sense that there exists a matrix \({\hat{{\varvec{\Sigma }}}}_{S}^{-1}\) such that \({{\hat{\mathbf{w}}}_{S}}= \frac{\hat{{\varvec{\Sigma }}}_{S}^{-1}{} \mathbf{1}}{\mathbf{1}^{\mathbf {\prime }}{\hat{\varvec{\Sigma }}}_{S}^{-1}{} \mathbf{1}}\), where \({\hat{{\varvec{\Sigma }}}}_{S}^{-1}=\left( {1-\alpha }\right) {{\mathbf{S}}^{-1}}+ \alpha {\varvec{\Sigma }}_{0}^{-1}\), \(0\le \alpha \le 1\), and \({\hat{{\varvec{\Sigma }}}}_{S}^{-1}\) may, or may not, be positive definite. 
If we define our implicit covariance matrix by \({\hat{\varvec{\Sigma }}}_{S}^{-1}=\left( {1-\alpha }\right) {\mathbf{S}}^{-1}+\alpha {\varvec{\Sigma }}_{0}^{-1}=\left( {1-\alpha }\right) {\mathbf{S}}^{-1}+\alpha {p^{-1}}{\mathbf{I}}\left( {\mathbf{1}}^{\prime }{\mathbf{S}}^{-1}{\mathbf{1}}\right) \), we obtain the identity \({\hat{\mathbf{w}}}_{S}=\frac{{\hat{\varvec{\Sigma }}}_{S}^{-1}{\mathbf{1}}}{{\mathbf{1}}^{\prime }{\hat{\varvec{\Sigma }}}_{S}^{-1}{\mathbf{1}}}=\frac{\left\{ \left( {1-\alpha }\right) {\mathbf{S}}^{-1}+\alpha {p^{-1}}{\mathbf{I}}\left( {\mathbf{1}}^{\prime }{\mathbf{S}}^{-1}{\mathbf{1}}\right) \right\} {\mathbf{1}}}{{\mathbf{1}}^{\prime }\left\{ \left( {1-\alpha }\right) {\mathbf{S}}^{-1}+\alpha {p^{-1}}{\mathbf{I}}\left( {\mathbf{1}}^{\prime }{\mathbf{S}}^{-1}{\mathbf{1}}\right) \right\} {\mathbf{1}}}=\left( {1-\alpha }\right) \frac{{\mathbf{S}}^{-1}{\mathbf{1}}}{{\mathbf{1}}^{\prime }{\mathbf{S}}^{-1}{\mathbf{1}}}+\alpha {p^{-1}}{\mathbf{1}}=\left( {1-\alpha }\right) {\hat{\mathbf{w}}}_{I}+\alpha {\mathbf{w}}_{0}\). This identity allows us to investigate the risk of \({{\hat{\mathbf{w}}}_{S}}\) with respect to any risk function designed for estimators of the (inverse) covariance matrix. For example, although \({\mathfrak {R}_{0}}\left( {{\hat{{\varvec{\Sigma }}}}^{-1}}\right) \) is a function of \({{\hat{{\varvec{\Sigma }}}}^{-1}}\) and does not involve an explicit weight estimator, it is nevertheless possible to evaluate \({{\hat{\mathbf{w}}}_{S}}\) via \({\mathfrak {R}_{0}}\left( {{\hat{{\varvec{\Sigma }}}}_{S}^{-1}}\right) \).
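The identity in Remark (ii) is easy to confirm numerically. The following sketch uses simulated standard normal returns and hypothetical values of n, p and \(\alpha \) (none of them taken from the paper) and compares the Stein-type estimator with the weights implied by the implicit precision matrix for the equally weighted reference portfolio \({\mathbf{w}}_{0}=p^{-1}{\mathbf{1}}\).

```python
import numpy as np

rng = np.random.default_rng(1)
n, p, alpha = 60, 10, 0.3              # hypothetical sizes and shrinkage intensity

R = rng.standard_normal((n, p))        # simulated returns, for illustration only
S = np.cov(R, rowvar=False)            # sample covariance matrix
S_inv = np.linalg.inv(S)
ones = np.ones(p)

w_I = S_inv @ ones / (ones @ S_inv @ ones)   # plug-in estimator
w_0 = ones / p                               # equally weighted reference portfolio
w_S = (1 - alpha) * w_I + alpha * w_0        # Stein-type estimator

# Implicit precision matrix: (1 - alpha) S^{-1} + alpha p^{-1} (1'S^{-1}1) I.
prec_S = (1 - alpha) * S_inv + alpha * (ones @ S_inv @ ones) / p * np.eye(p)
w_implicit = prec_S @ ones / (ones @ prec_S @ ones)

print(np.allclose(w_S, w_implicit))
```

The check works because \({\mathbf{1}}^{\prime }{\hat{\varvec{\Sigma }}}_{S}^{-1}{\mathbf{1}}={\mathbf{1}}^{\prime }{\mathbf{S}}^{-1}{\mathbf{1}}\) for this choice of \({\varvec{\Sigma }}_{0}^{-1}\), so the normalizing constant is unchanged by the shrinkage.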
4 Families of GMVP weight estimators
Although the estimators \({{\hat{\mathbf{w}}_{\mathrm {II}}}}\) and \({{\hat{\mathbf{w}}_{\mathrm {III}}}}\) have shown great potential in improving the standard estimator \(\hat{\mathbf {w}}_{\mathrm {I}}\), improved estimators can be developed from a variety of different points of view. In particular, resolvent-type estimators, defined by \({{\hat{{\varvec{\Sigma }}}}}_{k}^{-1}={\left( {{\mathbf {S}}+k{\mathbf {I}}}\right) ^{-1}}\), \(k\in \mathbb {R}_{+}\), have shown great potential in estimating the precision matrix, particularly in high-dimensional settings (Holgersson and Karlsson 2012; Serdobolskii 1985).
These estimators add a small constant to the eigenvalues before inversion, thereby creating a more stable estimator. They play an important role in spectral analysis (Serdobolskii 1985) but have also proved to be efficient in more applied problems (Holgersson and Karlsson 2012). The “regularizing” coefficient k imposes a (small) bias on estimators of the precision matrix and hence offers a form of variance-bias trade-off rather different from the Stein-type estimators. Since the poor performance of the standard plug-in estimator is largely due to high sample variance in the precision matrix, the resolvent estimators are interesting candidates for improved estimation of the GMVP.
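A minimal sketch of the resolvent-type approach is given below. It assumes simulated standard normal returns with p/n close to 0.98 and an arbitrary regularizing coefficient k = 0.1; how k is chosen in the paper is not shown in this section, so the value is purely illustrative, as is the function name `resolvent_gmvp_weights`. The example shows how shifting the eigenvalues by k improves the conditioning of the matrix that is inverted.

```python
import numpy as np

def resolvent_gmvp_weights(S, k):
    """GMVP weights based on the resolvent-type precision estimator (S + k I)^{-1}."""
    p = S.shape[0]
    ones = np.ones(p)
    x = np.linalg.solve(S + k * np.eye(p), ones)
    return x / (ones @ x)

rng = np.random.default_rng(2)
n, p = 102, 100                         # p/n is approximately 0.98
R = rng.standard_normal((n, p))         # simulated returns, illustration only
S = np.cov(R, rowvar=False)

k = 0.1                                 # hypothetical regularizing coefficient
eig_S = np.linalg.eigvalsh(S)           # eigenvalues in ascending order
eig_reg = eig_S + k                     # eigenvalues of S + k I

cond_S = eig_S[-1] / eig_S[0]           # condition number of S
cond_reg = eig_reg[-1] / eig_reg[0]     # condition number of S + k I
w_k = resolvent_gmvp_weights(S, k)

print(cond_S, cond_reg, w_k.sum())
```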
5 Monte Carlo study
Table 1 Specification of the distribution of the stochastic terms in DGP I, II and III
Parameter | Specification of DGP I |
---|---|
\({\varvec{\Sigma }}\) (i) | \({\varvec{\Sigma }}_{p\times p}={n^{-1}}\sum \limits _{t=1}^{n}{\left( {{\mathbf {R}}_{t}}-{\bar{\mathbf {R}}}\right) }{\left( {{\mathbf {R}}_{t}}-{\bar{\mathbf {R}}}\right) ^{\prime }},\,{\bar{\mathbf {R}}}={n^{-1}}\sum \limits _{t=1}^{n}{{\mathbf {R}}_{t}}\), where \({\mathbf {R}}_{t}\) is a vector of observed returns on p stocks |
\({\varvec{\Sigma }}\) (ii) | \({\varvec{\Sigma }}_{p\times p}=\mathrm{Toeplitz}\left( {\phi ^{0}},{\phi ^{1}},\ldots ,{\phi ^{p-1}}\right) \), \(\phi =0.5\) |
\({\varvec{\mu }}\) | \(\mathbf {0}_{p\times 1}\) |
Parameter | Specification of DGP II |
\({\varvec{\Omega }}\) | \({\varvec{\Omega }}_{p\times p}=\mathrm{Toeplitz}\left( {\phi ^{0}},{\phi ^{1}},\ldots ,{\phi ^{p-1}}\right) \), \(\phi =0.5\) |
\(\nu \) | 5 |
\({\varvec{\gamma }}\) | \({\varvec{\gamma }}=\beta {\mathbf{1}}_{p},\;\beta =1\) |
\({\varvec{\lambda }}\) | \({\varvec{\lambda }}=\beta \frac{\nu }{\nu -2}{\mathbf{1}}_{p}\) |
\({\varvec{\mu }}\) | \(\mathbf {0}_{p\times 1}\) |
\({\varvec{\Sigma }}\) | \({\varvec{\Sigma }}=\frac{\nu }{\nu -2}{\varvec{\Omega }}+\frac{2{\nu ^{2}}{\beta ^{2}}}{{\left( \nu -2\right) }^{2}\left( \nu -4\right) }{\mathbf{1}}_{p}{\mathbf{1}}_{p}^{\prime }\) |
Parameter | Specification of DGP III |
df | 1 |
\({\varvec{\Sigma }}\) | \({\varvec{\Sigma }}_{p\times p}=\mathrm{Toeplitz}\left( {6},{2},\ldots ,{2}\right) \) |
\({\varvec{\mu }}\) | \(3\cdot \mathbf {1}_{p\times 1}\) |
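The Toeplitz structures in the specification above can be generated as in the following sketch. The helper names `toeplitz_cov` and `dgp2_cov` are hypothetical, and the second function simply evaluates the tabulated expression for the covariance matrix of DGP II with the degrees of freedom set to 5 and \(\beta =1\); it is not taken from the authors' code.

```python
import numpy as np

def toeplitz_cov(p, phi=0.5):
    """p x p matrix with entries phi**|i-j|, i.e. Toeplitz(phi^0, ..., phi^(p-1))."""
    idx = np.arange(p)
    return phi ** np.abs(idx[:, None] - idx[None, :])

def dgp2_cov(omega, nu=5.0, beta=1.0):
    """Tabulated DGP II covariance:
    nu/(nu-2) * Omega + 2 nu^2 beta^2 / ((nu-2)^2 (nu-4)) * 1 1'."""
    p = omega.shape[0]
    scale = nu / (nu - 2)
    shift = 2 * nu**2 * beta**2 / ((nu - 2) ** 2 * (nu - 4))
    return scale * omega + shift * np.ones((p, p))

sigma_dgp1 = toeplitz_cov(100)              # DGP I, covariance structure (ii)
sigma_dgp2 = dgp2_cov(toeplitz_cov(100))    # DGP II

print(sigma_dgp1[0, :4])                    # first entries of the first row
print(sigma_dgp2[0, 0])                     # diagonal entry, 5/3 + 50/9
```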
Table 2 Design of the Monte Carlo experiments used for DGP I, II and III
Factor | Symbol | Design |
---|---|---|
p / n ratio (dimension in relation to sample size) | c | 0.1, 0.5, 0.98 |
Dimension of portfolio (number of assets) | p | 100 |
Number of time observations | n | Follows from p and c above |
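To make the design concrete, the sketch below derives the sample sizes implied by p = 100 and the three values of c, and draws one data set for c = 0.98. Two assumptions are made that this section does not state: n is taken as p/c rounded to the nearest integer, and the returns of DGP I are drawn from a multivariate normal distribution with the Toeplitz covariance (ii).

```python
import numpy as np

p = 100
ratios = (0.1, 0.5, 0.98)

# Assumption: n is p / c rounded to the nearest integer.
sample_sizes = {c: round(p / c) for c in ratios}
print(sample_sizes)  # {0.1: 1000, 0.5: 200, 0.98: 102}

# One simulated data set for c = 0.98 (assumed normal, zero mean, Toeplitz covariance).
rng = np.random.default_rng(4)
idx = np.arange(p)
sigma = 0.5 ** np.abs(idx[:, None] - idx[None, :])
n = sample_sizes[0.98]
R = rng.multivariate_normal(np.zeros(p), sigma, size=n)
print(R.shape)  # (102, 100)
```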
Table 3 Specifications of the conditioned observation \(\mathbf {R}_{t}\) used in \(\mathfrak {R}_{4}\)
Risk | Conditioned on observation \(\mathbf {R}_{t}\) |
---|---|
\({\mathfrak {R}_{4}\left( \hat{w}_{j}|\mathbf {R}_{t}\right) }\) | A single random draw from the specified DGP; \(\mathbf {R}_{t}\) is kept fixed over all replicates |
\({\mathfrak {R}_{4}}\left( \hat{w}_{j}|\mathbf {R}_{t}+l\right) \) | The vector l has ones in its first p/2 rows and zeros elsewhere |
\({\mathfrak {R}_{4}}\left( \hat{w}_{j}|2\mathbf {R}_{t}\right) \) | \(\mathbf {R}_{t}\), generated as specified above, multiplied by 2 |
Table 4 MC simulation results for DGP I with \({\varvec{\Sigma }}\) according to Table 1 (i), \((p=100)\)
Estimator | (1) \(\mathfrak {R}_{0}\) | (2) \(\mathfrak {R}_{1}\) | (3) \(\mathfrak {R}_{2}\) | (4) \(\mathfrak {R}_{3}\) | (5) \(\mathfrak {R}_{4}\) | (6) \(\mathfrak {R}_{4}\left( \mathbf {R}_{t}+1\right) \) | (7) \(\mathfrak {R}_{4}\left( 2\mathbf {R}_{t}\right) \) |
---|---|---|---|---|---|---|---|
\(c=0.98\) | |||||||
\(\hat{\mathbf {w}}_{\text {II}}\) | 0.9975 | 0.9996 | 0.9007 | 0.9022 | 0.9020 | 0.9049 | 0.9043 |
\(\hat{\mathbf {w}}_{\text {III}}\) | 0.8660 | 0.9714 | 0.1441 | 0.1971 | 0.1514 | 0.1411 | 0.1459 |
\(\hat{\mathbf {w}}_{\text {IV}}\) | 7.43E−09 | 2.38E−09 | 0.0132 | 0.0270 | 0.0291 | 0.0112 | 0.0115 |
\(c=0.5\) | |||||||
\(\hat{\mathbf {w}}_{\text {II}}\) | 0.8245 | 0.8744 | 0.8975 | 0.9599 | 0.8980 | 0.8934 | 0.8953 |
\(\hat{\mathbf {w}}_{\text {III}}\) | 0.6699 | 0.7599 | 0.8108 | 0.9484 | 0.8096 | 0.7949 | 0.8023 |
\(\hat{\mathbf {w}}_{\text {IV}}\) | 0.6208 | 0.2027 | 1.1517 | 0.9054 | 1.5729 | 0.6837 | 0.4599 |
\(c=0.1\) | |||||||
\(\hat{\mathbf {w}}_{\text {II}}\) | 0.8866 | 0.9563 | 0.9786 | 0.9988 | 0.9788 | 0.9769 | 0.9774 |
\(\hat{\mathbf {w}}_{\text {III}}\) | 0.8710 | 0.9501 | 0.9757 | 0.9988 | 0.9759 | 0.9735 | 0.9742 |
\(\hat{\mathbf {w}}_{\text {IV}}\) | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
5.1 Results from Monte Carlo simulations
Based on the results from the Monte Carlo experiments displayed in Tables 4, 5, 6 and 7, the estimator \({{\hat{\mathbf {w}}}_{\text {IV}}}\) performs well if c is larger than 0.1 for both DGP I with covariance structure from real data and for DGP II. But for DGP I with a Toeplitz covariance structure, \({{\hat{\mathbf {w}}}_{\text {IV}}}\) performs well only for c close to one. On the other hand, estimator \({{\hat{\mathbf {w}}}_{\text {III}}}\) performs best among the four estimators. This holds for all investigated values of c. Furthermore, \({{\hat{\mathbf {w}}}_{\text {II}}}\) is a good estimator if c is not close to one, and its performance is close to \({{\hat{\mathbf {w}}}_{\text {III}}}\). However, as c gets close to one, \({{\hat{\mathbf {w}}}_{\text {II}}}\) is outperformed both by \({{\hat{\mathbf {w}}}_{\text {III}}}\) and \({{\hat{\mathbf {w}}}_{\text {IV}}}\).
Table 5 MC simulation results for DGP I with \({\varvec{\Sigma }}\) according to Table 1 (ii), \((p=100)\)
Estimator | (1) \(\mathfrak {R}_{0}\) | (2) \(\mathfrak {R}_{1}\) | (3) \(\mathfrak {R}_{2}\) | (4) \(\mathfrak {R}_{3}\) | (5) \(\mathfrak {R}_{4}\) | (6) \(\mathfrak {R}_{4}\left( \mathbf {R}_{t}+1\right) \) | (7) \(\mathfrak {R}_{4}\left( 2\mathbf {R}_{t}\right) \) |
---|---|---|---|---|---|---|---|
\(c=0.98\) | |||||||
\(\hat{\mathbf {w}}_{\text {II}}\) | 0.9735 | 0.9987 | 0.4616 | 0.4666 | 0.4771 | 0.4354 | 0.4630 |
\(\hat{\mathbf {w}}_{\text {III}}\) | 0.7139 | 0.9338 | 0.0147 | 0.0245 | 0.0156 | 0.0140 | 0.0131 |
\(\hat{\mathbf {w}}_{\text {IV}}\) | 9.05E−07 | 9.57E−07 | 0.0365 | 0.0485 | 0.0427 | 0.0401 | 0.0391 |
\(c=0.5\) | |||||||
\(\hat{\mathbf {w}}_{\text {II}}\) | 0.1745 | 0.0599 | 0.0276 | 0.5150 | 0.0243 | 0.0247 | 0.0316 |
\(\hat{\mathbf {w}}_{\text {III}}\) | 0.2419 | 0.0538 | 0.0137 | 0.5083 | 0.0100 | 0.0101 | 0.0176 |
\(\hat{\mathbf {w}}_{\text {IV}}\) | 0.1520 | 0.2821 | 0.4407 | 0.7514 | 0.5315 | 0.5182 | 0.4667 |
\(c=0.1\) | |||||||
\(\hat{\mathbf {w}}_{\text {II}}\) | 6.7586 | 1.1462 | 0.0490 | 0.9075 | 0.0213 | 0.0230 | 0.0756 |
\(\hat{\mathbf {w}}_{\text {III}}\) | 7.9439 | 1.3396 | 0.0508 | 0.9079 | 0.0198 | 0.0209 | 0.0820 |
\(\hat{\mathbf {w}}_{\text {IV}}\) | 0.7016 | 0.8513 | 0.8792 | 0.9905 | 0.9160 | 0.9100 | 0.8903 |
Table 6 MC simulation results for DGP II with \({\varvec{\Sigma }}\) according to Table 1, \((p=100)\)
Estimator | (1) \(\mathfrak {R}_{0}\) | (2) \(\mathfrak {R}_{1}\) | (3) \(\mathfrak {R}_{2}\) | (4) \(\mathfrak {R}_{3}\) | (5) \(\mathfrak {R}_{4}\) | (6) \(\mathfrak {R}_{4}\left( \mathbf {R}_{t}+1\right) \) | (7) \(\mathfrak {R}_{4}\left( 2\mathbf {R}_{t}\right) \) |
---|---|---|---|---|---|---|---|
\(c=0.98\) | |||||||
\(\hat{\mathbf {w}}_{\text {II}}\) | 0.9987 | 0.9996 | 0.7997 | 0.7998 | 0.7962 | 0.7964 | 0.7963 |
\(\hat{\mathbf {w}}_{\text {III}}\) | 0.9358 | 0.9658 | 0.0782 | 0.0790 | 0.0801 | 0.0802 | 0.0801 |
\(\hat{\mathbf {w}}_{\text {IV}}\) | 1.21E−07 | 5.90E−07 | 0.0425 | 0.0459 | 0.0488 | 0.0501 | 0.0488 |
\(c=0.5\) | |||||||
\(\hat{\mathbf {w}}_{\text {II}}\) | 0.5633 | 0.8330 | 0.6544 | 0.6612 | 0.6572 | 0.6575 | 0.6572 |
\(\hat{\mathbf {w}}_{\text {III}}\) | 0.3755 | 0.7484 | 0.4679 | 0.4784 | 0.4726 | 0.4730 | 0.4726 |
\(\hat{\mathbf {w}}_{\text {IV}}\) | 0.1681 | 0.7059 | 0.5581 | 0.6349 | 0.6559 | 0.6685 | 0.6559 |
\(c=0.1\) | |||||||
\(\hat{\mathbf {w}}_{\text {II}}\) | 1.3380 | 0.9903 | 0.7809 | 0.7893 | 0.7800 | 0.7819 | 0.7801 |
\(\hat{\mathbf {w}}_{\text {III}}\) | 1.3953 | 0.9891 | 0.7554 | 0.7648 | 0.7544 | 0.7565 | 0.7544 |
\(\hat{\mathbf {w}}_{\text {IV}}\) | 1.0721 | 0.9980 | 0.9255 | 0.9498 | 0.9566 | 0.9606 | 0.9566 |
Table 7 MC simulation results for DGP III with \({\varvec{\Sigma }}\) according to Table 1, \((p=100)\)
Estimator | (1) \(\mathfrak {R}_{0}\) | (2) \(\mathfrak {R}_{1}\) | (3) \(\mathfrak {R}_{2}\) | (4) \(\mathfrak {R}_{3}\) | (5) \(\mathfrak {R}_{4}\) | (6) \(\mathfrak {R}_{4}\left( \mathbf {R}_{t}+1\right) \) | (7) \(\mathfrak {R}_{4}\left( 2\mathbf {R}_{t}\right) \) |
---|---|---|---|---|---|---|---|
\(c=0.98\) | |||||||
\(\hat{\mathbf {w}}_{\text {II}}\) | 0.9992 | 0.9999 | 0.4334 | 0.4390 | 0.4179 | 0.4168 | 0.4179 |
\(\hat{\mathbf {w}}_{\text {III}}\) | 0.9672 | 0.9946 | 0.019 | 0.0288 | 0.0168 | 0.0121 | 0.0168 |
\(\hat{\mathbf {w}}_{\text {IV}}\) | 2.66E−08 | 1.97E−08 | 0.00358 | 0.0454 | 0.036 | 0.0358 | 0.0360 |
\(c=0.5\) | |||||||
\(\hat{\mathbf {w}}_{\text {II}}\) | 0.1263 | 0.0371 | 0.0242 | 0.5042 | 0.0242 | 0.024 | 0.0242 |
\(\hat{\mathbf {w}}_{\text {III}}\) | 0.1745 | 0.0163 | 0.0106 | 0.4972 | 0.0105 | 0.0104 | 0.0105 |
\(\hat{\mathbf {w}}_{\text {IV}}\) | 0.3083 | 0.4337 | 0.5783 | 0.7857 | 0.5763 | 0.5766 | 0.5763 |
\(c=0.1\) | |||||||
\(\hat{\mathbf {w}}_{\text {II}}\) | 4.9783 | 0.1493 | 0.0124 | 0.9014 | 0.0122 | 0.012 | 0.0122 |
\(\hat{\mathbf {w}}_{\text {III}}\) | 6.1863 | 0.1803 | 0.0191 | 0.9902 | 0.0192 | 0.019 | 0.0192 |
\(\hat{\mathbf {w}}_{\text {IV}}\) | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
6 Empirical study
The empirical evaluation of the investigated estimators of the weights in the GMVP is carried out through a moving window approach on two different data sets, to which we apply two different sampling methods. In the first method (fixed sampling method), we simply apply the estimators to all available assets, and in the second method (random sampling method), we repeatedly pick a given number of assets at random and then evaluate the estimators' performance by the one-period out-of-sample returns. The reason for applying a moving window is that the mean-variance portfolio theory was developed as a one-period model.
6.1 Fixed sample
Example 1
Stocks listed on the Nasdaq stock exchange
For this empirical application, 89 stocks with complete past values are selected from SP100 over the period 1997-04 to 2010-07 (159 monthly returns). Note that gross returns are used, i.e., the risk-free return is not subtracted. A moving window approach is employed with a length of 149 months, giving 10 out-of-sample returns.
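The moving window evaluation described here can be sketched as follows. The code uses synthetic returns rather than the stock data of the paper, uses the plug-in estimator as the example weight function, and treats the reported performance ratio as the mean out-of-sample return divided by its standard deviation (gross returns, no risk-free rate subtracted); that last point is an assumption, not a definition given in this section. The function names are hypothetical.

```python
import numpy as np

def plug_in_weights(window):
    """Plug-in GMVP weights from the sample covariance matrix of the window."""
    S = np.cov(window, rowvar=False)
    ones = np.ones(S.shape[0])
    x = np.linalg.solve(S, ones)
    return x / (ones @ x)

def rolling_out_of_sample(returns, window_len, weight_fn):
    """One-period out-of-sample portfolio returns from a moving estimation window."""
    T = returns.shape[0]
    out = []
    for t in range(window_len, T):
        w = weight_fn(returns[t - window_len:t])   # estimate on the preceding window
        out.append(w @ returns[t])                 # realized return in the next period
    return np.array(out)

rng = np.random.default_rng(3)
T, p, window_len = 159, 89, 149                    # 159 months, 89 assets, window of 149
returns = 0.01 + 0.05 * rng.standard_normal((T, p))  # synthetic data, not the paper's

oos = rolling_out_of_sample(returns, window_len, plug_in_weights)
sigma_hat = oos.std(ddof=1)                        # out-of-sample standard deviation
mean_ret = oos.mean()                              # mean out-of-sample return
ratio = mean_ret / sigma_hat                       # assumed performance ratio

print(len(oos))  # 10
```

Any other weight estimator can be evaluated in the same way by passing a different `weight_fn`.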
Example 2
Stocks listed on the Stockholm stock exchange
Performance of GMVP estimators applied to portfolios from Nasdaq and Stockholm OMX stock exchange, fixed sampling method
Measure | \(\hat{\mathbf {w}}_{\text {I}}\) | \(\hat{\mathbf {w}}_{\text {II}}\) | \(\hat{\mathbf {w}}_{\text {III}}\) | \(\hat{\mathbf {w}}_{\text {IV}}\) | \(\hat{\mathbf {w}}_{\text {V}}\) |
---|---|---|---|---|---|
Nasdaq (\(c=0.597\)) | |||||
\(\hat{\sigma }_{j}\) | 0.0589 | 0.0560 | 0.0540 | 0.0443 | 0.0540 |
\({\bar{R}_{j}}\) | 0.0006 | 0.0021 | 0.0035 | 1.3E-05 | 0.0064 |
\(\widehat{SR}_{j}\) | 0.0098 | 0.0374 | 0.0640 | 0.0003 | 0.1189 |
Stockholm OMX (\(c=0.742\)) | |||||
\(\hat{\sigma }_{j}\) | 0.0774 | 0.0636 | 0.0447 | 0.0479 | 0.0456 |
\({\bar{R}_{j}}\) | 0.0079 | 0.0112 | 0.0165 | 0.0085 | 0.0251 |
\(\widehat{SR}_{j}\) | 0.1019 | 0.1757 | 0.3695 | 0.1773 | 0.5503 |
6.2 Random samples
Example 3
Stocks listed on the Nasdaq stock exchange (random sampling method)
For this empirical application, \(p=20,50,80\) stocks are randomly selected out of the 89 stocks with complete past values from SP100 over the period 1997-04 to 2010-07 (159 monthly returns). A moving window approach is employed with \(c=0.537\) \((n=159,103,47)\), giving 10 out-of-sample returns. This procedure is repeated 100 times.
Performance of GMVP estimators applied to portfolios from Nasdaq stock exchange, random sampling method
Measure | \(\hat{\mathbf {w}}_{\text {I}}\) | \(\hat{\mathbf {w}}_{\text {II}}\) | \(\hat{\mathbf {w}}_{\text {III}}\) | \(\hat{\mathbf {w}}_{\text {IV}}\) | \(\hat{\mathbf {w}}_{\text {V}}\) |
---|---|---|---|---|---|
Nasdaq (\(p=80\), \(c=0.537\)) | |||||
\({\mathrm{Mean}}({\hat{\sigma }}_{{\mathrm{portfolio}},j})\) | 0.0535 | 0.0519 | 0.0511 | 0.0432 | 0.0540 |
\({\mathrm{Mean}}({\bar{R}}_{j})\) | 0.0007 | 0.0021 | 0.0032 | 0.0010 | 0.0064 |
\({\mathrm{Mean}}(\widehat{SR}_{{\mathrm{portfolio}},j})\) | 0.0138 | 0.0411 | 0.0629 | 0.0229 | 0.1192 |
Nasdaq (\(p=50\), \(c=0.537\)) | |||||
\({\mathrm{Mean}}({\hat{\sigma }}_{{\mathrm{portfolio}},j})\) | 0.0535 | 0.0517 | 0.0508 | 0.0434 | 0.0540 |
\({\mathrm{Mean}}({\bar{R}}_{j})\) | \(-0.0012\) | 0.0001 | 0.0014 | 0.0023 | 0.0066 |
\({\mathrm{Mean}}(\widehat{SR}_{{\mathrm{portfolio}},j})\) | \(-0.0219\) | 0.0028 | 0.0278 | 0.0540 | 0.1213 |
Nasdaq (\(p=20\), \(c=0.537\)) | |||||
\({\mathrm{Mean}}({\hat{\sigma }}_{{\mathrm{portfolio}},j})\) | 0.0496 | 0.0469 | 0.0462 | 0.0409 | 0.0557 |
\({\mathrm{Mean}}({\bar{R}}_{j})\) | 0.0035 | 0.0039 | 0.0045 | 0.0041 | 0.0068 |
\({\mathrm{Mean}}(\widehat{SR}_{{\mathrm{portfolio}},j})\) | 0.0701 | 0.0837 | 0.0980 | 0.0996 | 0.1225 |
Example 4
Stocks listed on the Stockholm stock exchange (random sampling method)
Performance of GMVP estimators applied to portfolios from Stockholm stock exchange, random sampling method
Measure | \(\hat{\mathbf {w}}_{\text {I}}\) | \(\hat{\mathbf {w}}_{\text {II}}\) | \(\hat{\mathbf {w}}_{\text {III}}\) | \(\hat{\mathbf {w}}_{\text {IV}}\) | \(\hat{\mathbf {w}}_{\text {V}}\) |
---|---|---|---|---|---|
Stockholm OMX (\(p=80\), \(c=0.53\)) | |||||
\({\mathrm{Mean}}({\hat{\sigma }}_{{\mathrm{portfolio}},j})\) | 0.0502 | 0.0469 | 0.0444 | 0.0445 | 0.0460 |
\({\mathrm{Mean}}({\bar{R}}_{j})\) | 0.0113 | 0.0132 | 0.0149 | 0.0125 | 0.0248 |
\({\mathrm{Mean}}(\widehat{SR}_{{\mathrm{portfolio}},j})\) | 0.2243 | 0.2808 | 0.3363 | 0.2798 | 0.5389 |
Stockholm OMX (\(p=50\), \(c=0.53\)) | |||||
\({\mathrm{Mean}}({\hat{\sigma }}_{{\mathrm{portfolio}},j})\) | 0.0757 | 0.0685 | 0.0623 | 0.0684 | 0.0472 |
\({\mathrm{Mean}}({\bar{R}}_{j})\) | 0.0225 | 0.0229 | 0.0232 | 0.0245 | 0.0251 |
\({\mathrm{Mean}}(\widehat{SR}_{{\mathrm{portfolio}},j})\) | 0.2977 | 0.3336 | 0.3722 | 0.3573 | 0.5325 |
Stockholm OMX (\(p=20\), \(c=0.53\)) | |||||
\({\mathrm{Mean}}({\hat{\sigma }}_{{\mathrm{portfolio}},j})\) | 0.0711 | 0.0614 | 0.0533 | 0.0571 | 0.0500 |
\({\mathrm{Mean}}({\bar{R}}_{j})\) | 0.0149 | 0.0168 | 0.0189 | 0.0172 | 0.0246 |
\({\mathrm{Mean}}(\widehat{SR}_{{\mathrm{portfolio}},j})\) | 0.2099 | 0.2743 | 0.3552 | 0.3021 | 0.4912 |
7 Summary
The global minimum-variance portfolio (GMVP) solution developed by Markowitz is considered to be a fundamental concept in portfolio theory. The early researchers investigating this matter usually applied a simple plug-in estimator for estimating the weights and paid very little attention to the distributional properties of the estimator. More recently, the full distribution of the standard estimator has been derived (Okhrin and Schmid 2006), and it is now recognized that the standard estimator offers a poor approximation of the true GMVP. Within a relatively short period of time, a variety of improvements to the standard estimator have been developed. Naturally, each of these improvements has its pros and cons, but there does not seem to be a consensus about how to evaluate the performance, or efficiency, of GMVP estimators. Perhaps this is because there are, in fact, several possible measures one can use for assessing the properties of a portfolio estimator. In this paper, we discuss a number of different risk functions for the weight estimator. These include risk functions of covariance matrix estimators, forecast mean squared errors, directional risks and conditional risks. The risk functions are labeled with an index determined by the degree to which they are specialized for portfolio estimation: \(\mathfrak {R}_{2}\) is generally preferred over \({\mathfrak {R}_{1}}\), which is preferred over \(\mathfrak {R}_{0}\), and so on. However, this ordering does not mean that \({\mathfrak {R}_{2}}\) is uniformly better than \({\mathfrak {R}_{1}}\) and \({\mathfrak {R}_{0}}\). For example, \({\mathfrak {R}_{4}}\) does not exist in closed form for the regularized portfolio estimator used in this paper. Hence, \(\hat{\mathbf {w}}_{\text {IV}}\) has to be optimized through \(\mathfrak {R}_{0}\) rather than \({\mathfrak {R}_{4}}\).
In other words, one would typically use \({\mathfrak {R}_{4}}\) or \({\mathfrak {R}_{3}}\) as a tool for deriving an estimator of \({\mathbf {w}_\mathrm{GMVP}}\), but there are settings where risk functions of lower rank-order must be used because of their simpler functional form. A selection of recent GMVP estimators is used in a Monte Carlo simulation for the purposes of (i) comparing different risk measures for a given estimator and (ii) comparing different estimators for a given risk. Moreover, a new estimator, based on a resolvent estimator, is proposed. The analysis focuses on asset data where the number of observations (n) is comparable to the number of assets (p). This case is important because investors might be reluctant to use long data sets, as the economy is not expected to be stable over long time periods, and hence investors encounter a high-dimensional setting. The simulations are complemented by an analysis of two real data sets: one data set is drawn from the Nasdaq stock exchange, and the other employs Stockholm OMX data. The general finding of the paper is that no estimator dominates uniformly over all risk functions. We can, however, establish that there are dominating tendencies, in the sense that some estimators tend to perform better with respect to most risk aspects. A Stein-type estimator developed by Frahm and Memmel (2010) is found to perform well in cases when \(n\gg p\), whereas another Stein-type estimator proposed by Bodnar et al. (2018) dominates when n is proportional to p. A resolvent-type estimator is found to perform surprisingly well over a large number of settings. While this paper is restricted to properties of point estimators, future research could involve more general inferential aspects, such as hypothesis testing.
References
- Anderson, T.W.: An Introduction to Multivariate Statistical Analysis, 3rd edn. Wiley, New York (2003)
- Bai, Z., Liu, H., Wong, W.K.: Enhancement of the applicability of Markowitz’s portfolio optimization by utilizing random matrix theory. Math. Finance 19(4), 639–667 (2009)
- Best, M.J., Grauer, R.R.: On the sensitivity of mean–variance-efficient portfolios to changes in asset means: some analytical and computational results. Rev. Financ. Stud. 4(2), 315–342 (1991)
- Bodnar, T., Zabolotskyy, T.: How risky is the optimal portfolio which maximizes the Sharpe ratio? AStA Adv. Stat. Anal. 101(1), 1–28 (2017)
- Bodnar, T., Parolya, N., Schmid, W.: Estimation of the global minimum variance portfolio in high dimensions. Eur. J. Oper. Res. 266(1), 371–390 (2018)
- Demarta, S., McNeil, A.J.: The t copula and related copulas. Int. Stat. Rev. 73(1), 111–129 (2005)
- DeMiguel, V., Garlappi, L., Uppal, R.: Optimal versus naive diversification: how inefficient is the 1/n portfolio strategy? Rev. Financ. Stud. 22(5), 1915–1953 (2009)
- Efron, B., Morris, C.: Multivariate empirical Bayes and estimation of covariance matrices. Ann. Stat. 4(1), 22–32 (1976)
- Frahm, G., Memmel, C.: Dominating estimators for minimum-variance portfolios. J. Econom. 159(2), 289–302 (2010)
- Frankfurter, G.M., Phillips, H.E., Seagle, J.P.: Portfolio selection: the effects of uncertain means, variances, and covariances. J. Financ. Quant. Anal. 6(5), 1251–1262 (1971)
- Frost, P.A., Savarino, J.E.: Portfolio size and estimation risk. J. Portf. Manag. 12(4), 60–64 (1986)
- Golosnoy, V., Okhrin, Y.: Multivariate shrinkage for optimal portfolio weights. Eur. J. Finance 13(5), 441–458 (2007)
- Haff, L.R.: Estimation of the inverse covariance matrix: random mixtures of the inverse Wishart matrix and the identity. Ann. Stat. 7(6), 1264–1276 (1979)
- Holgersson, T., Karlsson, P.S.: Three estimators of the Mahalanobis distance in high-dimensional data. J. Appl. Stat. 39(12), 2713–2720 (2012)
- Holgersson, T., Mansoor, R.: Assessing normality of high-dimensional data. Commun. Stat. Simul. Comput. 42(2), 360–369 (2013)
- James, W., Stein, C.: Estimation with quadratic loss. In: Proceedings of the Fourth Berkeley Symposium on Mathematical Statistics and Probability, Volume 1: Contributions to the Theory of Statistics, pp. 361–379. University of California Press, Berkeley, CA (1961)
- Kempf, A., Memmel, C.: Estimating the global minimum variance portfolio. Schmalenbach Bus. Rev. 58(4), 332–348 (2006)
- Kempf, A., Kreuzberg, K., Memmel, C.: How to incorporate estimation risk into Markowitz optimization. In: Operations Research Proceedings 2001, pp. 175–182. Springer (2002)
- Kotz, S., Nadarajah, S.: Multivariate T-Distributions and Their Applications. Cambridge University Press, Cambridge (2004)
- Ledoit, O., Wolf, M.: Improved estimation of the covariance matrix of stock returns with an application to portfolio selection. J. Empir. Finance 10(5), 603–621 (2003)
- Ledoit, O., Wolf, M.: Honey, I shrunk the sample covariance matrix. J. Portf. Manag. 30(4), 110–119 (2004)
- Markowitz, H.M.: Portfolio selection. J. Finance 7(1), 77–91 (1952)
- Markowitz, H.M.: Portfolio Selection: Efficient Diversification of Investments. Yale University Press, New Haven (1959)
- Merton, R.C.: On estimating the expected return on the market: an exploratory investigation. J. Financ. Econ. 8(4), 323–361 (1980)
- Michaud, R.O.: The Markowitz optimization enigma: is optimized optimal? ICFA Contin. Edu. Ser. 1989(4), 43–54 (1989)
- Muirhead, R.J.: Aspects of Multivariate Statistical Theory. Wiley, New York (1982)
- Okhrin, Y., Schmid, W.: Distributional properties of portfolio weights. J. Econom. 134(1), 235–256 (2006)
- Okhrin, Y., Schmid, W.: Comparison of different estimation techniques for portfolio selection. AStA Adv. Stat. Anal. 91(2), 109–127 (2007)
- Rao, C.R.: Linear Statistical Inference and Its Applications, 2nd edn. Wiley, New York (2008)
- Roy, A.D.: Safety first and the holding of assets. Econometrica 20(3), 431–449 (1952)
- Serdobolskii, V.I.: The resolvent and the spectral functions of sample covariance matrices of increasing dimension. Russ. Math. Surv. 40(2), 232–233 (1985)
- Serdobolskii, V.I.: Estimation of high-dimensional inverse covariance matrices. In: Multivariate Statistical Analysis, pp. 87–101. Springer (2000)
- Srivastava, M.S.: Methods of Multivariate Statistics. Wiley, New York (2002)
- Stein, C.: Inadmissibility of the usual estimator for the mean of a multivariate normal distribution. In: Proceedings of the Third Berkeley Symposium on Mathematical Statistics and Probability, vol. 1, pp. 197–206. University of California Press, Berkeley, CA (1956)
Copyright information
Open Access. This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.