1 Introduction

The capital asset pricing model (CAPM), in particular its one-factor version, has been for decades the most commonly used framework to analyse the relationship between risk and return. For instance, Fama and MacBeth (1973) estimated this model for NYSE stocks and found a positive relationship between average return and market volatility in the period 1926–1968. The standard approach to calculating the cost of equity is also based on the CAPM (Fernandez 2015): in a survey of the Association for Financial Professionals (AFP) 90% of the respondents said that they use the CAPM for estimating the cost of capital and making investment decisions (Jacobs and Shivdasani 2012).

This paper focuses on a key component of the CAPM, namely the market risk premium (MRP), which is defined as the difference between the expected return on a market portfolio and the risk-free rate, and is also the slope of the security market line (SML), a graphical representation of the CAPM. The aim of the analysis is to provide evidence about some of its statistical properties as well as those of its volatility, in particular their degree of persistence, by applying fractional integration techniques to a set of data from the US, Germany and Japan, namely the biggest economies in America, Europe and Asia respectively in terms of nominal GDP over most of the time period considered in our study. The tests are carried out at different frequencies (weekly and monthly) and over different time horizons (2, 5, and 10 years) in order to check the robustness of the findings. The possible existence of breaks and changes in persistence is then investigated in the case of the US.

The layout of the paper is the following: Section 2 reviews the relevant literature; Section 3 describes the data and the econometric framework; Section 4 discusses the empirical findings; Section 5 offers some concluding remarks.

2 Literature review

The market risk premium is a key parameter in the context of the CAPM. According to the previously mentioned AFP survey, about 49% of companies base their investment strategy on an estimated value between 5% and 6%, 23% of them on a value between 3% and 4%, and 17% of them on a value of 7% or greater. As pointed out by Fernandez et al. (2019) and Damodaran (2018), the term equity risk premium can be used for different concepts, specifically:

  1. The Historical Equity Premium (HEP), defined as the historical return differential between stocks and government bonds. Dimson et al. (2006) analysed it for 17 countries over 106 years and estimated an average value of approximately 4.5–5%, which is lower than the typical estimates reported in textbooks (e.g., 7.1% in the US according to Ross et al. (2010)), but still too high to be consistent with reasonable risk aversion levels. This is known as the equity premium puzzle (Mehra and Prescott 1985); standard asset-pricing models have been expanded in the more recent literature in an attempt to explain it (e.g., Siegel 2017).

  2. The Expected Equity Premium (EEP), namely the expected return differential between stocks and government bonds. Expectations are usually calculated using surveys, such as those carried out by: Fernandez et al. (2019) for the period 2008–2018 among over 500 companies and professionals, estimating an EEP of 5.4%–5.7% in the US; Graham and Harvey (2018) for the period 2000–2018 among S&P500 Chief Financial Officers (CFOs), obtaining an average EEP of 4.42% in the US; Welch (2008) with more than 400 answers from finance and economics professors, estimating an average EEP of 6% in the US.

  3. The Implied Equity Premium (IEP), namely the required equity premium according to an asset pricing model. In particular, it can be calculated in the context of a traditional dividend valuation model (Gordon 1962) by discounting future dividends and expected growth, or within a general discounted cash flow (DCF) model by discounting the expected shareholder cash flows. For instance, Fama and French (2002) estimated the IEP for the period 1951–2000 to be in the range 2.55%–4.32% (much lower than the estimated HEP of 7.43%). Damodaran (2018) calculated a value of 5.08% for the US (4.16% on average over the period 1960–2017) using the expected 5-year dividends including buybacks for the S&P500 index. He also proposed an alternative method based on country default spreads and equity market volatility, which yielded an estimated IEP of 5.3%. KPMG calculated an IEP of 5.75% in 2019 for the US by using general DCF models combining historical equity returns and the current IEP (Groenendijk et al. 2019).

Damodaran (2018) concluded that IEP outperforms HEP in terms of predictive power for the risk premium. Fama and French (2002) advocated using the dividend growth model. Jacquier et al. (2005) argued that a historical geometric average, lower than the arithmetic average, is a more suitable measure of the equity premium. Levi and Welch (2017) concluded that over 20-year time horizons the equity premium is in the 5%–7% range. It is noteworthy that using consensus premia obtained by averaging across different approaches might not be good practice as these might vary considerably in terms of their accuracy (Damodaran 2018).

As for forecasting the equity premium, Hsiao and Wan (2011) compared different methods and concluded that rolling windows produce more accurate forecasts than constant parameter models. Bollerslev et al. (2013) analysed intraday data for the S&P500 and the VIX volatility index and found that the realized volatilities are best described by long-memory fractionally integrated processes, and that there is return predictability in a fractionally cointegrated vector autoregression model (CFVAR) including returns and two variance measures.

Finally, Chang-Jin et al. (2004) found a positive relationship between stock market volatility and the equity premium, which is supported by a negative and significant volatility feedback effect; Chan and Feng (2008) estimated significant jumps in risk premia in the DAX, DJIA, FTSE, Nikkei, and S&P500 indices in response to news, and showed that ignoring the long-memory feature in volatility dynamics might lead to invalid rejections of time-varying risk premia.

Concerning the persistence of the MRP, Dimson et al. (2006) found evidence that US HEPs were lower than frequently quoted historical averages, and concluded that their performance cannot contribute to forward-looking equity premia because of its lack of persistence. Other authors tried to estimate the MRP with persistent proxies; for instance, Azeredo (2014) searched for the mechanisms determining the behaviour of the equity premium under persistent consumption growth, and Chang-Jin et al. (2004) investigated the relationship between stock market volatility and the equity premium. Finally, Andersen et al. (2006) studied the persistence and predictability of the realized betas rather than the MRP.

3 Data and methodology

We calculate the Historical Equity Premium (HEP) as the difference between the stock market return over a 2, 5 or 10-year horizon and the yield on government bonds of the corresponding maturity, at both the weekly and the monthly frequency. Specifically, we use the following series: for the US, Treasury bond yields and S&P500 returns; for Germany, Bund yields and DAX returns; for Japan, Japanese Government Bond (JGB) yields and NIKKEI returns. The data sources are, respectively, the St. Louis Fed, the ECB Statistical Data Warehouse, and the Bloomberg and Reuters-Eikon databases. The exact sample period and some descriptive statistics for each of the computed series are reported in Table 1; their plots over the time horizons and for the frequencies considered are displayed in Fig. 1. It can be seen that the risk premium tends to be less volatile when computed over a longer time span, regardless of the data frequency.
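
The calculation can be illustrated with a minimal sketch. It rests on assumptions that are not stated in the text: the stock return is taken to be the realised annualised return over the horizon ending at t, and the bond yield is the one observed at the start of that horizon. The function name `historical_premium` and its arguments are hypothetical.

```python
import numpy as np

def historical_premium(prices, yields_pct, horizon_years, periods_per_year):
    """Historical equity premium: annualised index return minus bond yield.

    Assumptions (not from the paper): the return is the realised annualised
    return over the horizon ending at t, and the yield (in percent) is the
    one observed at the start of that horizon.
    """
    prices = np.asarray(prices, dtype=float)
    yields = np.asarray(yields_pct, dtype=float) / 100.0
    k = horizon_years * periods_per_year  # horizon length in observations
    if prices.size != yields.size or prices.size <= k:
        raise ValueError("series must have equal length and exceed the horizon")
    # Annualised return between t - k and t
    annual_return = (prices[k:] / prices[:-k]) ** (1.0 / horizon_years) - 1.0
    # Subtract the yield observed k periods earlier
    return annual_return - yields[:-k]

# Illustrative data only: an index growing 10% a year and a flat 4% yield
months = np.arange(60)
prices = 100.0 * 1.10 ** (months / 12.0)
yields_pct = np.full(60, 4.0)
premium_2y = historical_premium(prices, yields_pct, horizon_years=2, periods_per_year=12)
```

With weekly data the same function would be called with `periods_per_year=52`; other conventions (for example log returns or end-of-horizon yields) would change the numbers but not the structure.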

Table 1 Descriptive statistics for the market risk premium
Fig. 1 HRP over different time spans and at different frequencies. (a) 10y with monthly frequency. (b) 10y with weekly frequency. (c) 5y with monthly frequency. (d) 5y with weekly frequency. (e) 2y with monthly frequency. (f) 2y with weekly frequency

The persistence of these series is estimated using fractional integration methods allowing the order of integration (denoted by d) to take fractional as well as integer values. This is a more general and flexible approach than the standard one based on the I(0) versus I(1) dichotomy, and it encompasses a variety of cases, namely: short-memory series (d = 0); long-memory stationary series (0 < d < 0.5); mean-reverting nonstationary series (0.5 ≤ d < 1); unit roots (d = 1); and explosive patterns (d > 1). The estimation of the differencing parameter is based on an approximation to the likelihood function (Whittle function) formulated in the frequency domain, and uses a simple version of the tests of Robinson (1994) to determine the confidence bands for the values of d.
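
To give a flavour of frequency-domain Whittle estimation, the sketch below runs a grid search for d in the simplest case of a fractionally integrated series with white noise errors and no deterministic terms. It is an illustration under those assumptions and not the Robinson (1994) testing procedure used in the paper; the function name `whittle_d` and the grid are hypothetical choices.

```python
import numpy as np

def whittle_d(x, grid=None):
    """Grid-search Whittle estimate of d for a pure fractional noise.

    Assumes white noise errors, so the spectral density is proportional to
    (2 sin(lambda / 2)) ** (-2 d). The grid is restricted to the stationary
    and invertible range; this is a sketch, not Robinson's (1994) test.
    """
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    n = x.size
    m = (n - 1) // 2
    lam = 2.0 * np.pi * np.arange(1, m + 1) / n  # Fourier frequencies
    periodogram = np.abs(np.fft.fft(x)[1:m + 1]) ** 2 / (2.0 * np.pi * n)
    if grid is None:
        grid = np.arange(-0.49, 0.491, 0.01)
    log_base = np.log(2.0 * np.sin(lam / 2.0))
    best_d, best_q = None, np.inf
    for d in grid:
        log_g = -2.0 * d * log_base  # log of the spectral shape for this d
        # Whittle objective with the error variance concentrated out
        q = np.log(np.mean(periodogram * np.exp(-log_g))) + np.mean(log_g)
        if q < best_q:
            best_d, best_q = float(d), q
    return best_d

# Simulated data only: white noise (d = 0) and a random walk (d = 1)
rng = np.random.default_rng(0)
noise = rng.standard_normal(2000)
d_noise = whittle_d(noise)
# For a series that may be nonstationary, estimate on first differences and add 1
d_walk = 1.0 + whittle_d(np.diff(np.cumsum(noise)))
```

Differencing before estimation is one common way to handle values of d near 1; it is used here only to keep the grid inside the range where this simple objective is well behaved.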

Specifically, the following model is estimated:

$$ {y}_t=\alpha +\beta t+{x}_t,\kern0.5em t=1,2,\dots, $$
(1)

where α and β are unknown coefficients (a constant and a time trend coefficient), and x_t is assumed to be integrated of order d, i.e.,

$$ {\left(1-L\right)}^d{x}_t={u}_t,\kern0.5em t=0,\pm 1,\kern0.5em \dots, $$
(2)

where d can be any real value, L is the lag operator (L x_t = x_{t-1}) and u_t is an I(0) series, defined for our purposes as a covariance (or second-order) stationary process with a spectral density function that is positive and finite at the zero frequency. The fractional differencing parameter d measures the persistence of the series and is estimated for both the market risk premium and its volatility (proxied by the squared first differences) over the different time horizons and for the different frequencies considered.
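
The fractional difference operator in Eq. (2) can be made concrete through its binomial expansion, whose coefficients follow a simple recursion. The sketch below applies the truncated filter to a finite sample and also computes the volatility proxy; the function names are hypothetical and the truncation at the start of the sample is an assumption of the sketch.

```python
import numpy as np

def frac_diff_weights(d, n):
    """First n coefficients of (1 - L)^d: w_0 = 1, w_k = w_{k-1} (k - 1 - d) / k."""
    w = np.empty(n)
    w[0] = 1.0
    for k in range(1, n):
        w[k] = w[k - 1] * (k - 1 - d) / k
    return w

def frac_diff(x, d):
    """Apply (1 - L)^d to x, treating pre-sample values as zero (an assumption)."""
    x = np.asarray(x, dtype=float)
    w = frac_diff_weights(d, x.size)
    return np.convolve(x, w)[: x.size]

def volatility_proxy(x):
    """Squared first differences, the volatility proxy described in the text."""
    return np.diff(np.asarray(x, dtype=float)) ** 2

x = np.array([1.0, 3.0, 6.0, 10.0])
first_diff = frac_diff(x, 1.0)   # d = 1 reduces to ordinary first differences
unchanged = frac_diff(x, 0.0)    # d = 0 leaves the series as it is
half_weights = frac_diff_weights(0.5, 3)
vol = volatility_proxy(x)
```

For fractional d the weights decay slowly rather than cutting off after one lag, which is what allows shocks to have long-lasting but eventually vanishing effects when d < 1.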

4 Empirical results

Figures 2a, b and c compare the observed long-term (10y) historical equity premia for the US, Germany and Japan at the monthly frequency. No single pattern is common to the three countries; however, a positive premium appears to be more likely in the more recent period (from 1990 onwards) than over the longer samples starting in the 1960s (1972 for Japan). This pattern does not appear to be directly related to whether sovereign interest rates are high or low.

Fig. 2 Relationship between long-term HRP and risk-free yield for different countries and periods. (a) US for periods 1960–2009 and 1990–2009. (b) Germany for periods 1960–2009 and 1990–2009. (c) Japan for periods 1972–2009 and 1990–2009

In addition, Fig. 3a, b and c show that the medium-term (5y) historical equity premium behaves similarly to the long-term (10y) one. Thus, the length of a project does not appear to be a key determinant of the observed equity premium.

Fig. 3 Relationship between mid-term HRP and risk-free yield for different countries and periods. (a) US observed mid-term HRP premiums. (b) Germany observed mid-term HRP premiums. (c) Japan observed mid-term HRP premiums

Tables 2, 3, 4 and 5 display the estimated values of d from the model given by Eqs. (1) and (2), for both the risk premium series and their volatility (proxied by their squared first differences), together with the 95% confidence intervals of the non-rejection values of d using Robinson’s (1994) tests. We consider three different specifications for Eq. (1): i) α = β = 0 (i.e., no deterministic components); ii) β = 0 (i.e., an intercept only); iii) α and β freely estimated from the data (i.e., including both an intercept and a linear time trend). We also make two alternative assumptions about the residuals, namely that they follow in turn a white noise or an autocorrelated process; in the latter case we use the non-parametric model of Bloomfield (1973) for weakly autocorrelated errors. In each case the values of d in bold are those from our preferred specification, our model selection criterion being the statistical significance of the other estimated parameters according to their t-values.
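
For reference, the Bloomfield (1973) model does not specify an explicit time-domain equation for the errors; it is defined only through its spectral density, which in its standard formulation takes the form below. The symbols m (the number of short-run parameters), τ_r and σ² are notation introduced here for illustration and do not appear elsewhere in the text.

$$ f\left(\lambda; \tau\right)=\frac{\sigma^2}{2\pi}\exp\left(2\sum_{r=1}^m{\tau}_r\cos \left(\lambda r\right)\right) $$

Since the logarithm of this function is a finite cosine series, a small number of parameters can approximate the short-run dynamics of ARMA-type processes, which is why the model is convenient within frequency-domain procedures.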

Table 2 Estimates of d under the assumption of white noise residuals
Table 3 Estimates of d under the assumption of autocorrelated (Bloomfield) residuals
Table 4 Estimates of d for the squared differences with white noise residuals
Table 5 Estimates of d for the squared differences with autocorrelated (Bloomfield) residuals

Tables 2 and 3 show the results obtained for the persistence of the risk premium under the assumption of white noise and autocorrelated disturbances respectively. In both cases the selected specification includes an intercept only. It can be seen that with white noise errors (Table 2) the null hypothesis of I(1) or a unit root cannot be rejected in the majority of cases; it is only rejected (in favour of orders of integration which are above 1) in the case of the US for the 10, 5 and 2 year time horizons with monthly data, and also for Germany and Japan for the 5 year time span with monthly data. In all other cases d is not statistically different from 1, which supports the random walk hypothesis; there is no evidence of mean reversion (d < 1) in any single case. When assuming autocorrelation in the disturbances (Table 3) the estimated values of d are slightly lower but the unit root null hypothesis can still not be rejected in any case. This I(1) behaviour is consistent with market efficiency.
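
The way the confidence bands are read in this section can be summarised in a small helper that maps a band of non-rejected values of d to the regimes listed in Section 3. This is only an interpretive aid: the function name is hypothetical and the intervals in the usage lines are made-up numbers, not values from Tables 2 and 3.

```python
def describe_interval(lower, upper):
    """Map a band of non-rejected d values to the persistence regimes in the text."""
    if lower > upper:
        raise ValueError("lower bound must not exceed upper bound")
    contains_zero = lower <= 0.0 <= upper
    contains_one = lower <= 1.0 <= upper
    if contains_zero and contains_one:
        return "band spans more than one regime"
    if contains_zero:
        return "I(0) not rejected: short memory"
    if contains_one:
        return "I(1) not rejected: unit root"
    if upper < 0.0:
        return "d < 0: case not discussed in the text"
    if lower > 1.0:
        return "d > 1: unit root rejected in favour of a higher order"
    if upper < 0.5:
        return "0 < d < 0.5: stationary long memory"
    if lower >= 0.5:
        return "0.5 <= d < 1: nonstationary but mean reverting"
    return "band spans more than one regime"

# Hypothetical intervals, for illustration only
unit_root_case = describe_interval(0.93, 1.08)
above_one_case = describe_interval(1.04, 1.21)
long_memory_case = describe_interval(0.12, 0.38)
```

In these terms, mean reversion would require the whole band to lie below 1, which is the outcome the results above do not find in any case.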

Next we analyse persistence in the volatility of the risk premium (measured by its squared first differences). Tables 4 and 5 report the estimated values of d with their confidence bands, again for the two cases of white noise and autocorrelated errors respectively. The two sets of results are very similar, most of the values of d lying in the interval (0, 0.5) and implying stationary long-memory behaviour. There are only two cases when the I(0) hypothesis of short memory cannot be rejected, namely for Japan and the US with a 5-year span and monthly data. We also find a significant time trend in the case of the US with a 10-year span and monthly data.

Finally, we consider the possibility of structural breaks. Given the similarity between the monthly and the weekly results for d for all series and the fact that the US is the largest economy with the longest time span, we focus on the US case at the monthly frequency with 10-year, 5-year and 2-year spans. Specifically, we carry out the Bai and Perron (2003) and Gil-Alana (2008) tests for multiple breaks. Both suggest the presence of two, three and four breaks for the monthly data over a 10, 5 and 2-year span respectively. The specific break dates are displayed in Table 6 and are the following: 1974 m09 and 1997 m11 for the 10-year span; 1981 m11, 1997 m05 and 2007 m1 for the 5-year span; and 1982 m11, 1989 m04, 1997 m06 and 2008 m03 for the 2-year span. They broadly coincide with the 1973–74 oil crisis, the early 1980s US recession resulting from the Fed’s contractionary monetary policy, the 1997 Asian financial crisis, and the 2007 global financial crisis. One of the detected breaks in the 2-year sample (1989 m04) corresponds to the Savings and Loan crisis of the late 1980s. As for the volatility series, a single break is detected, in 1974 for the 10-year sample and in 2003 for the other two.
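
The idea behind least-squares break dating can be sketched for the simplest case of a single shift in the mean. This is a deliberate simplification: the Bai and Perron (2003) procedure handles multiple breaks and formal tests, and the Gil-Alana (2008) approach works within the fractional integration framework, neither of which is reproduced here. The function name and the trimming fraction are hypothetical.

```python
import numpy as np

def single_mean_break(y, trim=0.15):
    """Locate one mean shift by minimising the sum of squared residuals.

    Simplified illustration only: one break, mean shift, no formal test.
    Returns (k, ssr), where the second regime starts at index k.
    """
    y = np.asarray(y, dtype=float)
    n = y.size
    start = max(1, int(trim * n))  # keep a minimum segment length
    stop = n - start
    if stop <= start:
        raise ValueError("sample too short for the chosen trimming")
    best_k, best_ssr = None, np.inf
    for k in range(start, stop + 1):
        left, right = y[:k], y[k:]
        ssr = ((left - left.mean()) ** 2).sum() + ((right - right.mean()) ** 2).sum()
        if ssr < best_ssr:
            best_k, best_ssr = k, ssr
    return best_k, best_ssr

# Artificial series with a level shift half-way through
y = np.r_[np.zeros(50), np.full(50, 5.0)]
break_index, ssr = single_mean_break(y)
```

Once break dates are available, the sample can be split at those points and the persistence parameter re-estimated within each segment.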

Table 6 Break dates using the Bai and Perron (2003) and Gil-Alana (2008) tests

Having detected some breaks in the series of interest, we re-estimate the differencing parameter d for each of the subsamples to see if it has changed over time. Its estimated values for both the risk premium and its volatility under the alternative assumptions of white noise and autocorrelated residuals are reported in Tables 7 and 8 respectively. In the former case (see Table 7) there is no evidence of mean reversion in the risk premium, and a slight increase in persistence in the second and third subsamples. As for volatility, there is a sizeable increase in the case of the 10-year sample, a slight one in the case of the 5-year sample, and a decrease in the case of the 2-year one.

Table 7 Estimates of d for each subsample: White noise errors
Table 8 Estimates of d for each subsample: Autocorrelated (Bloomfield) errors

When allowing for autocorrelation in the residuals (see Table 8), the values of d are generally smaller though the confidence intervals are much wider, such that the I(1) hypothesis cannot be rejected in any single case. Thus, once more, there is no evidence of mean reversion in the risk premium. As for volatility, its persistence increases in the case of the 10-year sample and decreases in the other two.

5 Conclusions

The CAPM is still the most popular model for analysing the relationship between risk and return. This paper provides evidence on the degree of persistence of one of its key components, namely the market risk premium, as well as its volatility. The analysis applies fractional integration methods to data for the US, Germany and Japan, and for robustness purposes considers different time horizons (2, 5 and 10 years) and frequencies (monthly and weekly). The empirical findings in most cases imply that the market risk premium is a highly persistent variable which can be characterized as a random walk process, whilst its volatility is less persistent and exhibits stationary long-memory behaviour. There is also evidence that in the case of the US the degree of persistence has changed as a result of various events such as the 1973–74 oil crisis, the early 1980s recession resulting from the Fed’s contractionary monetary policy, the 1997 Asian financial crisis, and the 2007 global financial crisis; this is confirmed by both endogenous break tests and the associated subsample estimates. Market participants should take this evidence into account when designing their investment strategies.