As our results are based solely on an empirical analysis, a suitable empirical setup is crucial to ensure the validity and reproducibility of our findings. As an investor who uses a minimum-variance portfolio seeks, by definition, the portfolio exhibiting the lowest variance, our empirical study must reflect this objective. All the investigated portfolios include so-called tuning parameters, that is, parameters that are important for the optimization procedure but whose values are not determined by theoretical analysis. Instead, one of these parameters is identified by computationally intensive cross-validation, whereas the other is set to a constant value.
In our case, we have two tuning parameters: \(\delta\), resulting from the LASSO constraint, and k, emerging from the turnover constraint. Owing to computational restrictions, we use \(\delta\) as the tuning parameter to achieve the lowest variance and therefore optimize its value with cross-validation. The parameter k of the turnover constraint is kept constant throughout the dataset, as we only use it to reduce turnover compared with the unconstrained benchmark.
Hence, we do not set the value of \(\delta\) so that it meets specific well-known constraints such as the short-sale constraint (see DeMiguel et al. (2009a)). In contrast to other authors such as Zhao et al. (2019), we do not aim to derive a practitioner’s rule for investment and therefore do not keep \(\delta\) constant, independent of the present market situation. Instead, we allow the \(\delta\) value to change in every period, as we always want to achieve the optimization goal, that is, finding the portfolio with the lowest variance. This approach, which is in our opinion more realistic, makes changes in the chosen assets more likely and leads to higher turnover. This in turn provides another reason for imposing an additional turnover constraint, as in model (7).
Data
For our empirical study, we use S&P 500 stock price data from the Thomson Reuters EIKON database. The data are daily and run from January 1998 to the end of December 2018, with \(T=5282\) observations overall. Our analysis is based on discrete returns, calculated as \(r_t=\frac{P_t-P_{t-1}}{P_{t-1}}\). To avoid transforming the data and therefore potentially distorting valuable information, we focus only on those stocks present throughout the data period (319 stocks). To check whether dimensionality influences our results, we analyze the surviving 319 companies as well as a randomly selected subset of 100 of these 319 stocks. All our models are estimated from the returns of approximately the past two years of trading (i.e., \(\tau =2 \times 252=504\) observations). This results in \(T-\tau =4778\) trading days of out-of-sample returns from our different model approaches.
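The return definition and the window arithmetic above can be checked with a minimal sketch. The price series is made up for illustration, and the helper names `discrete_returns` and `rolling_windows` are hypothetical rather than part of the original study.

```python
def discrete_returns(prices):
    """Simple returns r_t = (P_t - P_{t-1}) / P_{t-1} for one price series."""
    return [(p1 - p0) / p0 for p0, p1 in zip(prices, prices[1:])]


def rolling_windows(n_obs, window):
    """Yield (start, end, oos) index triples for a fixed-length rolling window.

    The estimation window covers observations start..end-1 (0-based) and
    `oos` is the index of the next observation, used out of sample.
    """
    for start in range(n_obs - window):
        yield start, start + window, start + window


# Made-up prices, for illustration only.
prices = [100.0, 102.0, 99.96, 104.958]
returns = discrete_returns(prices)

# Values reported in the text: T observations and a window of two trading years.
T, tau = 5282, 2 * 252
n_oos = sum(1 for _ in rolling_windows(T, tau))
print(returns, n_oos)
```

With the figures from the text, the number of windows equals \(T-\tau\), which matches the reported count of out-of-sample days.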
To illustrate the structure of our two datasets, Fig. 1 presents their correlation plots. This figure shows the sample correlation of each possible pair of stock returns. As all the correlations are close to zero or positive for both datasets, we order the stocks according to their first principal component (see footnote 3). Friendly (2002) and Wei et al. (2017) provide examples of an R implementation. Both plots show that a large number of stocks exhibit strong correlations with each other. However, from the top left to the bottom right of the plots, the overall correlation diminishes to a slight positive correlation, in some cases even close to zero. Further, the random selection of 100 out of the 319 stocks does not visually break the underlying correlation structure of the data, as both plots have a similar appearance.
Variance estimators
To thoroughly analyze whether the introduced LASSO and turnover constraints decrease the number of assets as well as turnover, while maintaining a low-variance profile, we use some recent and efficient variance estimators. Starting with one of the most commonly used estimators among practitioners and researchers, we calculate the sample covariance estimator, defined as
$$\begin{aligned} {\widehat{\Sigma }}_ S =\frac{1}{\tau -1} \left( R - {\widehat{\mu }} 1' \right) \left( R - {\widehat{\mu }} 1' \right) ', \end{aligned}$$
where \(R \in {\mathbb {R}}^{n \times \tau }\) is the matrix of past returns and \({\widehat{\mu }} \in {\mathbb {R}}^{n}\) the vector of expected returns (here, estimated as average returns). At high concentration ratios, \(q=n/\tau \rightarrow 1\), the sample covariance matrix, although unbiased, exhibits high estimation variance and, therefore, a high out-of-sample estimation error. To reduce this estimation error, a linear shrinkage procedure can be applied to the unbiased sample estimator by combining it with a target covariance matrix. Following Ledoit and Wolf (2003), the variance estimator becomes
$$\begin{aligned} {\widehat{\Sigma }}_{LW_{L}} = s{\widehat{\Sigma }}_{T} + (1-s){\widehat{\Sigma }}_{S} , \end{aligned}$$
(12)
where \({\widehat{\Sigma }}_T\) is the estimate of a specific target covariance matrix and s is a shrinkage constant with \(s \in [0,1]\). Assuming identical pairwise correlations between all n assets, we use the constant-correlation covariance matrix as the target, as in Ledoit and Wolf (2004).
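As an illustration of the two estimators above, the following sketch computes the sample covariance matrix of a small return matrix and shrinks it toward a constant-correlation target. It is a simplified sketch under stated assumptions: the return data are made up, the function names are hypothetical, and the shrinkage constant `s` is fixed by hand, whereas Ledoit and Wolf derive a data-driven value that is not reproduced here.

```python
import math


def sample_cov(R):
    """Sample covariance of an n x tau return matrix R (rows are assets)."""
    n, tau = len(R), len(R[0])
    mu = [sum(row) / tau for row in R]
    dev = [[x - m for x in row] for row, m in zip(R, mu)]
    return [[sum(a * b for a, b in zip(dev[i], dev[j])) / (tau - 1)
             for j in range(n)] for i in range(n)]


def constant_correlation_target(S):
    """Target that keeps the sample variances and uses the average correlation."""
    n = len(S)
    sd = [math.sqrt(S[i][i]) for i in range(n)]
    corrs = [S[i][j] / (sd[i] * sd[j])
             for i in range(n) for j in range(i + 1, n)]
    rho = sum(corrs) / len(corrs)
    return [[S[i][i] if i == j else rho * sd[i] * sd[j]
             for j in range(n)] for i in range(n)]


def linear_shrinkage(S, target, s):
    """Convex combination s * target + (1 - s) * S."""
    n = len(S)
    return [[s * target[i][j] + (1.0 - s) * S[i][j]
             for j in range(n)] for i in range(n)]


# Made-up returns: 3 assets (rows), 4 observations (columns).
R = [[0.01, -0.02, 0.03, 0.00],
     [0.02, -0.01, 0.01, 0.00],
     [-0.01, 0.02, 0.00, 0.01]]
S = sample_cov(R)
F = constant_correlation_target(S)
Sigma = linear_shrinkage(S, F, s=0.3)  # s = 0.3 is an arbitrary illustrative value
```

Because the target keeps the sample variances, shrinkage leaves the diagonal unchanged and only pulls the off-diagonal entries toward a common correlation.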
A more sophisticated shrinkage method is non-linear shrinkage, as suggested by Ledoit and Wolf (2017). As this estimator shrinks the eigenvalues individually, small, potentially underestimated eigenvalues are pushed up, while large, potentially overestimated eigenvalues are pulled down. Without going into further detail, we write the non-linear shrinkage estimator as
$$\begin{aligned} {\widehat{\Sigma }}_{LW_{NL}} = V{\widehat{E}}_{LW_{NL}} V', \end{aligned}$$
(13)
where V is the matrix of the orthogonal eigenvectors and \({\widehat{E}}_{LW_{NL}}\) is the diagonal matrix of the shrunk eigenvalues, as shown by Ledoit and Wolf (2012, 2015). Because \({\widehat{\Sigma }}_{LW_{NL}}\) is proven to be asymptotically optimal within the class of rotationally equivariant estimators, we might expect it to perform better than any of the aforementioned estimators, especially in cases of large concentration ratios.
In addition, we extend our analysis to factor-based covariance estimation methods, which assume a specific structure in the covariances of asset returns. One promising example of that family of variance estimators is the principal orthogonal complement thresholding (POET) estimator provided by Fan et al. (2013). Here, the principal components of the sample covariance matrix \({\widehat{\Sigma }}_{S}\) are used as factors. Moreover, subsequent adaptive thresholding with a threshold parameter \(\theta\) is applied to the covariance of the residuals of the estimated factor model (see, e.g., Cai and Liu 2011; see also footnote 4). Therefore, the POET estimator has the form:
$$\begin{aligned} {\widehat{\Sigma }}_{POET} =\sum ^K_{i=1}{\widehat{\xi }}_i v_i v_i' + {\widehat{\Sigma }}^{\theta }_{u,K}, \end{aligned}$$
(14)
where \({\widehat{\xi }}_i\) is the i-th largest eigenvalue of the sample covariance matrix, \(v_i\) is the corresponding eigenvector, K is the number of principal components used as factors, and \({\widehat{\Sigma }}^{\theta }_{u,K}\) is the idiosyncratic covariance matrix after the applied thresholding procedure with threshold level \(\theta\).
In particular, the estimators (13) and (14) estimate the GMV portfolios well, and thus, they can be considered to be the state-of-the-art among homoscedastic variance estimators for return data.
Performance measures
To evaluate the out-of-sample performance of each portfolio, we report various performance measures, starting with the out-of-sample portfolio standard deviation \({\widehat{\sigma }}_p\), obtained as the square root of the out-of-sample variance, and the Sharpe ratio \(\widehat{\text {SR}}_p\), defined as
$$\begin{aligned} {\widehat{\sigma }}^2_p= & {} \frac{1}{T-\tau }\sum ^{T-1}_{t=\tau }\left( w_t' r_{t+1} - {\widehat{\mu }}_p\right) ^2, \end{aligned}$$
(15)
$$\begin{aligned} \widehat{\text {SR}}_p= & {} \frac{{\widehat{\mu }}_p - r_f}{{\widehat{\sigma }}_p}, \end{aligned}$$
(16)
where \(w_t\) are the portfolio weights chosen at time t, \(w_t' r_{t+1}\) is the out-of-sample portfolio return, and \({\widehat{\mu }}_p= \frac{1}{T-\tau }\sum ^{T-1}_{t=\tau }w_t' r_{t+1}\) is the out-of-sample portfolio expected return. For the computation of the Sharpe ratio, we assume a risk-free interest rate \(r_f=0\).
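The two measures can be sketched in a few lines. The weights and returns below are made up, the function name `oos_performance` is hypothetical, and the standard deviation is taken as the square root of the average squared deviation so that the Sharpe ratio is expressed per unit of standard deviation.

```python
import math


def oos_performance(weights, next_returns, r_f=0.0):
    """Out-of-sample mean, standard deviation, and Sharpe ratio.

    weights[i] holds the weights chosen at one rebalancing date and
    next_returns[i] the asset returns realized over the following day.
    """
    port = [sum(w * r for w, r in zip(w_t, r_next))
            for w_t, r_next in zip(weights, next_returns)]
    m = len(port)
    mu = sum(port) / m
    variance = sum((x - mu) ** 2 for x in port) / m  # divisor T - tau, as in the text
    sigma = math.sqrt(variance)
    return mu, sigma, (mu - r_f) / sigma


# Made-up weights and next-day returns for two assets over three days.
weights = [[0.5, 0.5], [0.5, 0.5], [1.0, 0.0]]
next_returns = [[0.02, 0.00], [-0.01, 0.01], [0.03, -0.02]]
mu_p, sigma_p, sharpe_p = oos_performance(weights, next_returns)
```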
Since we consider a variance minimization problem, the daily out-of-sample portfolio variance is of utmost importance. Hence, we check whether the out-of-sample variances of the LASSO-based method in (3) and of the LASSO and turnover-based method in (7) differ significantly from that of their standard counterpart in (1). To this end, we perform the two-sided HAC test with the Parzen kernel for the differences in variances, as described by Ledoit and Wolf (2008), and report the corresponding p-values.
Furthermore, to approximate the arising transaction costs, we follow the literature on portfolio optimization and estimation risk reduction (e.g., DeMiguel et al. 2009a; Dai and Wen 2018) and use the average daily turnover
$$\begin{aligned} \text {Turnover}= & {} \frac{1}{T-\tau -1}\sum ^{T-1}_{t=\tau +1}\sum ^{n}_{j=1}\left( \left| w_{j, t+1}-w_{j,t^+}\right| \right), \end{aligned}$$
(17)
where \(w_{j,t^+}\) denotes the portfolio weight in asset j before rebalancing at \(t+1\) but scaled back to sum to 1 and \(w_{j, t+1}\) is the portfolio weight in asset j after rebalancing at \(t+1\).
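A minimal sketch of the turnover computation follows. It assumes that the pre-rebalancing weights \(w_{j,t^+}\) are obtained by letting the previous weights drift with the realized asset returns and rescaling them to sum to 1. This drift formula is a common convention and an assumption here, as are the hypothetical function names and the made-up numbers.

```python
def drifted_weights(w, r):
    """Weights before rebalancing: grow each position by its return, then rescale."""
    grown = [wj * (1.0 + rj) for wj, rj in zip(w, r)]
    total = sum(grown)
    return [g / total for g in grown]


def average_turnover(weights, returns):
    """Average of sum_j |w_{j,t+1} - w_{j,t^+}| over consecutive rebalancing dates.

    weights[i] are the weights set at rebalancing date i and returns[i] the
    asset returns realized between date i and date i + 1.
    """
    trades = []
    for i in range(len(weights) - 1):
        before = drifted_weights(weights[i], returns[i])
        trades.append(sum(abs(a - b) for a, b in zip(weights[i + 1], before)))
    return sum(trades) / len(trades)


# Made-up example with two assets and three rebalancing dates.
weights = [[0.5, 0.5], [0.5, 0.5], [0.6, 0.4]]
returns = [[0.10, 0.00], [0.00, 0.00]]
turnover = average_turnover(weights, returns)
```

In the first step the target weights are unchanged, yet turnover is positive because the 10% return on the first asset moved the portfolio away from equal weights.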
We next evaluate the portfolio composition with respect to the number of non-zero investments and short sales as well as the development of the short-sale budget over time, defined as
$$\begin{aligned} \text {Average assets}= & {} \frac{1}{T-\tau }\sum ^{T}_{t=\tau +1}\sum ^{n}_{j=1}\mathbbm {1}_{\{w_{j,t}\ne 0\}}, \end{aligned}$$
(18)
$$\begin{aligned} \text {Average short sales}= & {} \frac{1}{T-\tau }\sum ^{T}_{t=\tau +1}\sum ^{n}_{j=1}\mathbbm {1}_{\{w_{j,t}<0\}}. \end{aligned}$$
(19)
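Both counts are straightforward to compute from a sequence of weight vectors, as in the sketch below. The tolerance `tol` is an assumption added for illustration, since a numerical solver may return tiny non-zero values instead of exact zeros. The function name and the weights are hypothetical.

```python
def average_counts(weights, tol=0.0):
    """Average number of non-zero positions and of short positions per period."""
    n_periods = len(weights)
    active = sum(sum(1 for w in w_t if abs(w) > tol) for w_t in weights)
    short = sum(sum(1 for w in w_t if w < -tol) for w_t in weights)
    return active / n_periods, short / n_periods


# Made-up weights for four assets over two periods.
weights = [[0.5, 0.7, -0.2, 0.0],
           [1.0, 0.0, 0.0, 0.0]]
avg_assets, avg_shorts = average_counts(weights)
```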
To shed more light on the introduced models from the perspective of risk exposure, we include two different portfolio concentration measures. First, the concentration ratio describes the distribution of asset exposures within a portfolio and is defined as the aggregate share of the \(n_b\) largest absolute weights within a portfolio, where the weights are sorted in descending order of absolute value:
$$\begin{aligned} \text {Concentration ratio}= & {} \frac{1}{T-\tau }\sum _{t=\tau +1}^{T}\sum _{j=1}^{n_b}|w_{j,t}|, \end{aligned}$$
(20)
where we set \(n_b=5\) throughout our empirical study. Naturally, a lower concentration ratio implies better portfolio exposure and diversification.
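The concentration ratio can be sketched as follows, assuming that the \(n_b\) largest weights are selected by absolute value in each period. The function name and the weights are made up for illustration.

```python
def concentration_ratio(weights, n_b=5):
    """Average over periods of the sum of the n_b largest absolute weights."""
    per_period = [sum(sorted((abs(w) for w in w_t), reverse=True)[:n_b])
                  for w_t in weights]
    return sum(per_period) / len(per_period)


# Made-up weights for six assets: one concentrated and one equally weighted period.
weights = [[0.4, 0.3, 0.2, 0.1, 0.05, -0.05],
           [1 / 6] * 6]
cr = concentration_ratio(weights)
```

The equally weighted period contributes 5/6, the smallest value attainable by a long-only portfolio of six assets, whereas the concentrated period with a short position contributes more than 1.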
Second, following Choueifaty and Coignard (2008), we compute the diversification ratio as the weighted average of asset volatilities divided by the portfolio volatility:
$$\begin{aligned} \text {Diversification ratio}= & {} \frac{1}{T-\tau }\sum _{t=\tau +1}^{T}\frac{w_t'\sigma _t}{\sqrt{w_t'\Sigma _t w_t}}, \end{aligned}$$
(21)
where \(\Sigma _t\) is calculated as in Eq. (13) and \(\sigma _t=\sqrt{\text {diag}(\Sigma _t)}\). By definition, the diversification ratio takes values \(\ge 1\) and is higher when the portfolio exhibits higher (better) diversification levels.
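For a single period, the diversification ratio can be sketched as below. The covariance matrices are made up rather than estimated with non-linear shrinkage, and the function name is hypothetical.

```python
import math


def diversification_ratio(w, Sigma):
    """Weighted average asset volatility divided by portfolio volatility."""
    n = len(w)
    vols = [math.sqrt(Sigma[i][i]) for i in range(n)]
    weighted_vol = sum(wi * vi for wi, vi in zip(w, vols))
    port_var = sum(w[i] * Sigma[i][j] * w[j] for i in range(n) for j in range(n))
    return weighted_vol / math.sqrt(port_var)


w = [0.5, 0.5]
uncorrelated = [[0.04, 0.00], [0.00, 0.04]]          # made-up covariance matrix
perfectly_correlated = [[0.04, 0.04], [0.04, 0.04]]  # made-up covariance matrix
dr_high = diversification_ratio(w, uncorrelated)
dr_low = diversification_ratio(w, perfectly_correlated)
```

With two uncorrelated assets of equal volatility, the equally weighted portfolio reaches a ratio of \(\sqrt{2}\), whereas perfect correlation leaves no diversification benefit and gives a ratio of 1.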
Finally, to gain more insight into the model structure, we analyze the final \(\delta\) values of model types (3) and (7). All the values are reported on a daily basis.
Course of action
For our empirical work, we use a rolling window study with a fixed (non-expanding) window length that incorporates cross-validation for our tuning parameter. As mentioned earlier, we evaluate the tuning parameter \(\delta\) for the LASSO constraint anew each day, so that it may change from day to day. The parameter k is held constant over time, set to 0.0005 for the dataset of 319 S&P stocks and 0.001 for the dataset of 100 S&P stocks. These values were found by testing different values of k for each dataset in a small subsample. Changing k by a reasonably large amount did not result in vastly different outcomes. In general, choosing too large a value for k leads to a portfolio that still has high turnover, whereas choosing too small a value worsens its risk/return profile. The more assets are considered, the lower k should be.
To analyze the impact of both the sparsity (LASSO) and the stability (LASSO \(+\) TO) constraints, we implement a simple one-fold cross-validation for the tuning parameter \(\delta\). However, because the model representation in (6) allows us to simplify the absolute value constraint, we apply our cross-validation to the Lagrange parameter \(\lambda\) instead of \(\delta\) and recover all \(\delta\) values in a second step by simply calculating \(\delta =||w||_1\).
We start with \(t=1\), January 2, 1998, and use the daily returns up to \(t=504\) to create an in-sample dataset covering approximately two years of daily returns. From that data sample, we take a smaller subsample for our cross-validation consisting of the first \(504-20=484\) observations. We then calculate models (3) and (7) using 20 different \(\lambda\) values taken from a linearly spaced sequence running from \(\lambda _{t+1}=\lambda _{t}+0.00001\) down to 0, where \(\lambda _t\) denotes the value selected in the preceding window and \(\lambda _1\) is initialized with 0.00001. The resulting weights of these 20 models are then applied to the first subsequent daily return of the cross-validation subsample (here, the 485th observation) to create an individual daily portfolio return for both models. The subsample is then shifted by one observation and the procedure is carried out again. This is repeated until all 20 previously omitted observations have been used. We then compare, for each model individually, the standard deviations of the 20 out-of-sample cross-validation returns across the \(\lambda\) values. The optimal \(\lambda\), chosen as the one corresponding to the lowest standard deviation, becomes our final \(\lambda _t\) for each model individually. Next, we calculate the weights of models (3) and (7) using the full in-sample window of daily returns. The true out-of-sample returns are then constructed by multiplying the calculated weights by the returns of the following period (here, the 505th observation). As model (1) needs no cross-validation, we calculate it only at this point to obtain its out-of-sample portfolio return as well. We then shift the in-sample data by one period (i.e., a day) and repeat the procedure until the last out-of-sample daily return covers December 31, 2018, which yields 4778 out-of-sample returns.
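The cross-validation step for a single estimation window can be sketched as follows. This is an illustrative sketch under stated assumptions, not the original implementation: `fit` stands for a solver of model (3) or (7) that is not shown here, `toy_fit` is a made-up placeholder used only to make the example runnable, and all function names and data are hypothetical.

```python
import statistics


def lambda_grid(prev_lambda, step=0.00001, m=20):
    """m linearly spaced values from prev_lambda + step down to 0."""
    upper = prev_lambda + step
    return [upper * (m - 1 - i) / (m - 1) for i in range(m)]


def select_lambda(window, fit, lambdas, n_val=20):
    """Pick the lambda with the lowest standard deviation of validation returns.

    window: in-sample observations, each a list of asset returns.
    fit(sub_window, lam) -> weights; a placeholder for the actual model solver.
    """
    train_len = len(window) - n_val
    best_lam, best_sd = None, float("inf")
    for lam in lambdas:
        val_returns = []
        for k in range(n_val):  # shift the training subsample by one each time
            w = fit(window[k:k + train_len], lam)
            nxt = window[k + train_len]  # first return after the subsample
            val_returns.append(sum(wj * rj for wj, rj in zip(w, nxt)))
        sd = statistics.pstdev(val_returns)
        if sd < best_sd:
            best_lam, best_sd = lam, sd
    return best_lam


def toy_fit(sub_window, lam):
    """Made-up stand-in for models (3)/(7): larger lam shifts weight to asset 2."""
    w_first = max(0.0, 1.0 - lam / 0.0001)
    return [w_first, 1.0 - w_first]


# Made-up data: asset 1 alternates in sign, asset 2 has a constant return.
window = [[0.01 if i % 2 == 0 else -0.01, 0.001] for i in range(30)]
grid = lambda_grid(prev_lambda=0.00009)
best = select_lambda(window, toy_fit, grid, n_val=5)
delta = sum(abs(wj) for wj in toy_fit(window, best))  # delta recovered as ||w||_1
```

With a window of 504 observations and `n_val=20`, the first training subsample covers observations 1 to 484 and is validated on the 485th, and the last one is validated on the 504th, as described above. In the full study this selection would be repeated for every rolling window, passing the previously selected value as `prev_lambda`.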