1 Introduction

Earnings forecasts are a critical input in many academic studies in finance and accounting as well as in practical applications. They are central to firm valuation, are widely used in asset allocation decisions, and are the basis for the accounting-based cost of capital calculations, such as the Implied Cost of Capital (ICC). It is, therefore, crucial to have precise and unbiased estimates.

The most popular source of earnings forecasts is financial analysts. These forecasts are aggregated by data providers, such as the Institutional Brokers’ Estimate System (I/B/E/S), and subsequently made available to academics and practitioners. Although analysts’ forecasts are fairly accurate (O’Brien 1988; Hou et al. 2012), researchers have found a significant optimism bias (Francis and Philbrick 1993; McNichols and O’Brien 1997; Easton and Sommers 2007).

The alternative to analysts’ earnings forecasts is a regression-based model, which can either be based solely on past realizations of earnings (time-series models) or on a combination of past earnings and other financial variables. The literature first developed time-series models. These models use past realizations of earnings in a linear or an exponential smoothing framework (Ball and Brown 1968; Brown et al. 1987). The results are underwhelming; these forecasts are neither accurate nor unbiased.Footnote 1

Recently, cross-sectional models to forecast earnings have proliferated. Hou et al. (2012) develop a cross-sectional model (henceforth HVZ model) based on assets, earnings, and dividends, which outperforms analysts’ forecasts in terms of coverage, Earnings Response Coefficients (ERC),Footnote 2 and forecast biasFootnote 3 but still trails analysts’ forecasts with respect to forecast accuracy.Footnote 4 Gerakos and Gramacy (2013) find that a simple Random Walk (RW) model, in which the previous period’s value is used as a forecast, performs as well as other, more sophisticated, earnings forecast models. Finally, Li and Mohanram (2014) implement an Earnings Persistence (EP) and a Residual Income (RI) model to forecast earnings. They show that these models are superior to the HVZ and RW models in terms of bias, accuracy, and ERC.

More recently, Ball and Ghysels (2017) develop a model based on mixed data sampling regression methods (MIDAS), which combines various high-frequency time-series data to forecast earnings. Their model outperforms raw analysts’ forecasts in some cases and can also be combined with analysts’ forecasts to improve forecast accuracy. The findings from Ball and Ghysels (2017) tie in with ours as they show that regression-based models can be used to improve analysts’ earnings forecasts. The model from Ball and Ghysels (2017) is, however, limited to short-term forecast horizons (next quarter), and is thus not suited to estimate the ICC, which also requires medium- and long-term forecasts (up to five years in the future). Our study, therefore, extends the work by Ball and Ghysels (2017) by providing empirical evidence on the advantages of combining analysts’ forecasts with a regression-based model for longer-term earnings forecasts.

In summary, existing studies show that analysts’ earnings forecasts are more accurate than cross-sectional earnings forecasts, but they do less well in terms of bias and ERC. Moreover, analysts’ forecasts have two important shortcomings: sluggishnessFootnote 5 and poor long-term estimates.Footnote 6

This study proposes a parsimonious cross-sectional regression model consisting of analysts’ earnings forecasts, gross profits, and past stock performance. The inclusion of analysts’ forecasts aims to improve forecast accuracy, in particular of short-term forecasts, as analysts have a timing and information advantage over forecasts based solely on accounting data (Ball and Ghysels 2017). Including gross profits is motivated by findings from Novy-Marx (2013), which suggest that gross profits predict future earnings.Footnote 7 It is intuitive that stock returns also contain information regarding future earnings. Indeed, Richardson et al. (2010) and Ashton and Wang (2012) find that changes in stock prices predict future earnings, and Abarbanell (1991) shows that stock returns are related to future earnings forecast revisions. Including past stock returns in our combined model has a further advantage as this variable mitigates the effect of sluggish analysts’ forecasts (Guay et al. 2011). We term our method the combined model (CM), as it combines analysts’ forecasts with a cross-sectional method.

We compare our CM to the most popular methods in the literature, namely raw analysts’ forecasts and the RW, EP, RI, and HVZ models. To isolate the value of analysts’ forecasts within the CM, we also estimate a cross-sectional analysts’ forecasts (CSAF) model.Footnote 8 We show that the CM delivers earnings forecasts that are slightly more accurate than raw analysts’ forecasts and markedly more accurate than the cross-sectional models while beating all other tested methods in terms of bias and ERC. The CSAF model underperforms not only the CM but also the raw analysts’ forecasts in terms of bias and accuracy. This suggests that using analysts’ forecasts in a cross-sectional model is sufficient neither to improve forecast accuracy nor to decrease bias. However, the fact that the CM outperforms all of the analyzed models, including the CSAF, shows that the variables gross profits and past stock performance substantially improve earnings forecasts. We perform further robustness tests that show that adding analysts’ forecasts to any cross-sectional model does not improve earnings forecasts to the same extent that our CM does. Finally, we implement the model from Mohanram and Gode (2013), which removes predictable forecast errors from analysts’ forecasts, and find that it too underperforms the CM.

Combining analysts’ forecasts with a cross-sectional model has one important disadvantage compared to regression-based models: the coverage is limited to firms with an analyst following. This reduces the sample of firms compared to models that only use financial data. However, Li and Mohanram (2014) document that cross-sectional models perform far worse in the sample without I/B/E/S coverage than in the sample with I/B/E/S coverage. This is intuitive as firms without analyst coverage tend to be smaller firms with a poorer information environment (Hou et al. 2012), which makes it more difficult to forecast earnings based solely on data. Therefore, the higher coverage of regression-based models may not be a large advantage as the resulting forecasts seem to be of poor quality. Furthermore, in most applications, the restriction to firms with analyst coverage should not be a major limitation as this sample still “represents 90 percent or more of the total market capitalization”Footnote 9 of all firms on the NYSE, AMEX, and NASDAQ.

One important application of earnings forecasts is to estimate a firm’s cost of capital, in particular, the ICC. To further evaluate the earnings forecasts from our tested models, we use them as inputs in computing the ICC. We find that many of our benchmark models produce ICC estimates that have a negative and significant relationship with gross profits. This evidence conflicts with Novy-Marx (2013) and Fama and French (2015) who derive theoretically and show empirically that firms with high gross profitability should have higher expected returns. In contrast, the ICC based on our CM shows a positive and significant relationship with gross profitability, in line with the theoretical derivation. In addition, the ICC based on the CM displays a stronger association with ex-post realized returns for both dimensions (cross-sectional and time-series) than the ICC based on the other benchmark models. A long-short strategy of buying the highest ICC decile and short-selling the lowest ICC decile based on the ICC estimated with the CM yields significant average annual returns of up to 6.65%.

This study contributes to the finance and accounting literature in several ways. First, we document that combining analysts’ earnings forecasts with a regression-based model leads to more accurate and less biased estimates compared to each of the components alone. It takes advantage of each method’s favorable characteristics while mitigating their shortcomings. The CM also outperforms the most popular models from the literature in all three dimensions analyzed: bias, accuracy, and ERC.

Second, we apply our earnings forecasts to the estimation of the ICC and show that using earnings forecasts from the CM leads to an increase in the cross-sectional relation between ICC and future returns. Thus, one major criticism of the ICC, namely the weak correlation between ICC estimates and realized returns,Footnote 10 is attenuated by using more accurate earnings forecasts. The improvement in earnings forecast quality is also economically meaningful as long/short portfolios constructed using the ICC based on earnings forecasts from the CM have significant excess portfolio returns.

Third, we provide evidence that using analysts’ forecasts in a cross-sectional forecast model is sufficient neither to remove the optimism bias nor to improve the accuracy of forecasts. The cross-sectional models use in-sample coefficients to predict earnings out-of-sample, and this approach seems to introduce a large amount of noise in the out-of-sample estimates, in particular for long-term estimates. The results show that the CSAF model’s performance relative to raw analysts’ forecasts deteriorates as the forecast horizon lengthens.

The paper is organized as follows. In Sect. 2, we describe our sample selection and the cross-sectional models and provide details on the ICC estimation. In Sect. 3, we compare the performance of earnings forecast proxies in terms of bias, accuracy, and ERC. In Sect. 4, we evaluate the correlation between returns and ICC estimated with different methods to forecast earnings. Section 5 shows the correlation between ICC and firm characteristics. Finally, we perform robustness checks in Sect. 6 and conclude in Sect. 7.

2 Data and methodology

2.1 Sample selection

We select firms at the intersection of the Center for Research in Security Prices (CRSP), CompustatFootnote 11 fundamentals annual, and I/B/E/S summary files. We filter for firms listed on NYSE, AMEX, and NASDAQ with share codes 10 and 11. Our sample starts in June 1977, as this is the first year for which I/B/E/S provides analysts’ forecasts, and ends in June 2015. We require at least five years of data for the 10-year pooled regressions of the cross-sectional forecast models. To evaluate the earnings forecasts, we use data from the year after the forecast was made. Therefore, our forecasts cover the period from 1982 to 2014. We require non-missing one-year-ahead earnings forecasts, price, and shares outstanding from I/B/E/S to include a firm-year in the sample. Our proxy for the risk-free rate is the yield on the U.S. 10-year government bond, which we obtain from Thomson Reuters Datastream.

2.2 Earnings forecasts

We develop a model that combines analysts’ earnings forecasts with a cross-sectional regression model to forecast earnings. We benchmark this approach to popular methods from the literature, namely using only analysts’ forecasts, the RW model,Footnote 12 and four cross-sectional models: the CSAF, Hou et al. (2012) (HVZ),Footnote 13 EP, and RI models.Footnote 14

We obtain analysts’ forecasts and share prices from I/B/E/S as of June for each year in the sample period. To compare analysts’ forecasts to the above-mentioned models, we transform analysts’ estimates from a per-share level to a dollar level by multiplying the per-share figures by the number of shares outstanding provided by I/B/E/S. For the RW model, following Gerakos and Gramacy (2013), we use income before extraordinary items from year (t) as earnings forecasts for year (\(t+\tau \) with \(\tau = 1\) to 3).
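The two transformations described above can be illustrated with a short Python sketch. It is only a minimal illustration under stated assumptions: the function names and the toy numbers are hypothetical, and the inputs are taken to be already matched to the June observation date.

```python
def to_dollar_level(eps_forecast, shares_outstanding):
    """Convert a per-share forecast to the dollar level (per-share figure times shares)."""
    return eps_forecast * shares_outstanding


def random_walk_forecasts(income_before_xi_t, horizons=(1, 2, 3)):
    """RW model: year-t income before extraordinary items is the forecast for t+tau."""
    return {tau: income_before_xi_t for tau in horizons}


# Toy numbers (hypothetical): a forecast of 2.5 per share and 40.0 shares outstanding.
dollar_forecast = to_dollar_level(2.5, 40.0)
rw = random_walk_forecasts(120.0)
```

Under the RW model the forecast is identical for all three horizons, which is why this benchmark cannot capture earnings growth.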

We follow the approach of Hou et al. (2012) when estimating the cross-sectional regressions. First, we run a rolling window pooled regression (in-sample) using the previous ten years of data (see Eq. 1). We regress the dependent variable earnings (\(E_{(i,t)}\)) for firm (i) in year (t) on the independent variables (\(x1, x2, \ldots , xn\)) for firm (i) in the relevant year (\(t-\tau \) with \(\tau \) = 1 to 3). (\(\epsilon _{(i,t)}\)) is the error term for firm (i) in year (t). We perform the regression at the dollar level with unscaled data.Footnote 15

$$\begin{aligned} E_{(i,t)}=\alpha _0+\alpha _1x1_{(i,t-\tau )}+\alpha _2x2_{(i,t-\tau )}+\cdots +\alpha _nxn_{(i,t-\tau )}+\epsilon _{(i,t)}. \end{aligned}$$
(1)

Second, we forecast earnings (\({E}_{(i,t+\tau )}\)) (out-of-sample) for firm (i) in year (\(t+\tau \)) (see Eq. 2). We obtain the forecast by multiplying the independent variables for each firm (i) of year (t) with the coefficients (\(\alpha _0,\alpha _1,\alpha _2,\ldots ,\alpha _n\)) from the pooled regression from Eq. 1. The advantage of this approach is that there are no strict survivorship requirements as we require firms only to have sufficient accounting data for year (t) to forecast earnings.

$$\begin{aligned} \tilde{E}_{(i,t+\tau )}=\alpha _0+\alpha _1x1_{(i,t)}+\alpha _2x2_{(i,t)}+\cdots +\alpha _nxn_{(i,t)} . \end{aligned}$$
(2)

Consider the following example. Assume that 2010 is year (t) and we want to forecast earnings for 2011 (\(t+\tau \) with \(\tau =1\)). First, we run a pooled regression of the dependent variable for the period 2001–2010 (from year \(t-9\) to year t) on the independent variables for the period 2000–2009 (from year (\(t-9-\tau \)) to year (\(t-\tau \)) with \(\tau =1\)) and store the regression coefficients. Then, we multiply these coefficients (\(\alpha _0,\alpha _1,\alpha _2,\ldots ,\alpha _n\)) by the independent variables (x1, x2, ..., xn) from year 2010 (year t) to estimate the earnings for 2011 (year \(t+\tau \) with \(\tau =1\)).
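The two-step procedure of Eqs. 1 and 2 can be sketched in Python as follows. This is a minimal illustration rather than the code used in the study: the names `fit_pooled` and `forecast`, the dictionary layout of the panel, and the toy data are assumptions, and the winsorization and data-availability filters are omitted. The toy panel is constructed so that earnings equal 2 + 0.5 times the one-year-lagged regressor, which makes the recovered coefficients easy to check.

```python
import numpy as np


def fit_pooled(panel, t, tau, window=10):
    """Step 1 (Eq. 1): pooled OLS of E_{i,s} on x_{i,s-tau} for s = t-window+1, ..., t.

    panel maps (firm, year) -> (earnings, [x1, ..., xn]); this layout is hypothetical.
    """
    rows, targets = [], []
    for (firm, year), (earnings, _) in panel.items():
        in_window = t - window + 1 <= year <= t
        if in_window and (firm, year - tau) in panel:
            lagged_x = panel[(firm, year - tau)][1]
            rows.append([1.0] + list(lagged_x))  # leading 1.0 is the intercept alpha_0
            targets.append(earnings)
    coef, *_ = np.linalg.lstsq(np.array(rows), np.array(targets), rcond=None)
    return coef


def forecast(coef, x_t):
    """Step 2 (Eq. 2): apply the stored coefficients to the year-t regressors."""
    return float(coef[0] + np.dot(coef[1:], x_t))


# Toy panel (hypothetical): two firms, 2000-2010, one regressor.
panel = {}
for firm, scale in (("A", 1.0), ("B", 3.0)):
    for year in range(2000, 2011):
        x = scale * (year - 1995)
        x_prev = scale * (year - 1 - 1995)
        panel[(firm, year)] = (2.0 + 0.5 * x_prev, [x])

coef = fit_pooled(panel, t=2010, tau=1)   # earnings 2001-2010 on regressors 2000-2009
e_2011_firm_a = forecast(coef, panel[("A", 2010)][1])
```

Because the forecast step only needs year-t regressors, a firm does not have to appear in the estimation window to receive a forecast, which mirrors the weak survivorship requirement noted above.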

We forecast earnings in June of each year (t). We take care to avoid the use of data that was not publicly available at the estimation dates. To this end, we collect accounting data only for companies with a fiscal year end between April of year (\(t-1\)) and March of year (t). To mitigate the influence of outliers, we winsorize earnings and other level variables each year at the first and last percentile, as in Hou et al. (2012) and Li and Mohanram (2014).
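The fiscal-year-end window and the yearly winsorization can be sketched as below. The helper names are hypothetical, and the use of NumPy's default (linearly interpolated) percentiles is an assumption, since the text does not state a percentile convention.

```python
import numpy as np


def in_information_window(fye_year, fye_month, t):
    """True if the fiscal year ends between April of year t-1 and March of year t."""
    return (fye_year == t - 1 and fye_month >= 4) or (fye_year == t and fye_month <= 3)


def winsorize_cross_section(values, lower_pct=1.0, upper_pct=99.0):
    """Clip one year's cross-section at its first and last percentile."""
    arr = np.asarray(values, dtype=float)
    lo, hi = np.percentile(arr, [lower_pct, upper_pct])
    return np.clip(arr, lo, hi)


# Toy cross-section (hypothetical): the integers 0, 1, ..., 100.
clipped = winsorize_cross_section(range(101))
```

In the toy cross-section only the two extreme observations are pulled inward, and all other values are left unchanged.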

When evaluating forecast bias, accuracy, and ERC, the researcher has to ensure that the definitions of earnings forecasts and realized earnings are consistent. More specifically, analysts typically forecast street earnings, which differ from earnings according to the Generally Accepted Accounting Principles (GAAP) in significant respects (Bradshaw and Sloan 2002). To account for this difference, we compare analysts’ forecasts and the CM forecasts to realized street earnings. For the other models (HVZ, RI, EP, RW), we make the comparison based on realized income before extraordinary items, which is based on GAAP.Footnote 16 This distinction is also made in other papers (e.g., Hou et al. 2012).

Finally, note that when comparing models, we restrict the sample to firm-year observations for which analysts’ forecasts are available. This ensures that the basis for the forecast evaluation is fair across the tested models. For example, the coverage for models that do not use analysts’ forecasts is wider, but these additional firms tend to be firms with a poor information environment (Hou et al. 2012), for which earnings are more difficult to forecast. Including these firms in the forecast evaluation of cross-sectional models would bias the analysis in favor of models that use analysts’ forecasts. "Appendix 1" provides the details on the estimation of the HVZ, EP, and RI models.

2.2.1 Combined model

The CM aims to take advantage of the high accuracy of analysts’ forecasts, while incorporating the low bias of the cross-sectional models. To include analysts’ forecasts, we use the last available forecast from I/B/E/S. Our cross-sectional model is a parsimonious approach that includes gross profits and two variables related to past stock returns. Our use of gross profits is motivated by findings from Novy-Marx (2013), who shows that this variable explains most earnings-related anomalies and a wide range of seemingly unrelated profitable trading strategies. We include two variables related to past stock returns because Ashton and Wang (2012) and Richardson et al. (2010) show that price changes drive earnings. The model is presented in Eq. 3:

$$\begin{aligned} E_{(i,t)}=\alpha _0+\alpha _1eIBES1_{(i,t-\tau )}+\alpha _2GP_{(i,t-\tau )}+\alpha _3r10_{(i,t-\tau )}+\alpha _4r122_{(i,t-\tau )}+\epsilon _{(i,t)}, \end{aligned}$$
(3)

where (\(E_{(i,t)}\)) represents the street earnings of firm (i) in year (t), (\(eIBES1_{(i,t-\tau )}\) with \(\tau =1\) to 3) is the I/B/E/S one-year-ahead earnings forecast, (\(GP_{(i,t-\tau )}\)) is gross profits, (\(r10_{(i,t-\tau )}\))Footnote 17 is the change in market capitalization over the preceding month, and (\(r122_{(i,t-\tau )}\))Footnote 18 is the change in market capitalization from \(t-12\) to \(t-2\) months.Footnote 19 As the regression is carried out at the dollar level, the I/B/E/S one-year-ahead earnings per share forecast, as well as the realized street earnings per share, are multiplied by the number of shares provided by I/B/E/S.

2.2.2 Cross-sectional analysts’ forecasts

To show that the CM benefits from the combination of analysts’ forecasts with a cross-sectional model (and that neither of its components solely drives the strong forecast performance), we include a model that uses analysts’ forecasts in a cross-sectional regression. We estimate the cross-sectional analysts’ forecasts (CSAF) model with Eq. 4:

$$\begin{aligned} E_{(i,t)}=\alpha _0+\alpha _1eIBES1_{(i,t-\tau )}+\epsilon _{(i,t)}, \end{aligned}$$
(4)

where (\(E_{(i,t)}\)) represents the street earnings of firm (i) in year (t) and (\(eIBES1_{(i,t-\tau )}\) with \(\tau =1\) to 3) are the I/B/E/S one-year-ahead earnings forecasts. This regression is carried out at the dollar level.Footnote 20

2.3 Estimating the ICC

The ICC is defined as the discount rate that equates a stock’s current price to the present value of its expected future free cash flows to equity. The cash flows are estimated using earnings forecasts and expected growth in earnings. There are many different approaches to estimate the ICC in the literature, so for the purpose of our tests, we choose four common methods. We implement two methods that are based on a residual income model, namely Gebhardt et al. (2001) (GLS) and Claus and Thomas (2001) (CT).Footnote 21 In addition, we employ two methods that are based on an abnormal earnings growth model, namely Ohlson and Juettner-Nauroth (2005) (OJ) and Easton (2004) (modified price-earnings growth or MPEG). Last, we estimate a composite ICC, which is the average of the four above-mentioned approaches. To maximize the coverage of the composite ICC, we only require a firm to have at least one non-missing individual ICC estimate (as in Hou et al. (2012)).
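The composite ICC described above can be sketched as a mean over whichever individual estimates are available. This reflects our reading of the coverage rule (at least one non-missing estimate); the function name and the dictionary keys are hypothetical.

```python
import math


def composite_icc(estimates):
    """Average the non-missing individual ICC estimates; return None if all are missing.

    estimates maps a method label (e.g. "GLS", "CT", "OJ", "MPEG") to a float or None.
    """
    available = [
        value for value in estimates.values()
        if value is not None and not math.isnan(value)
    ]
    if not available:
        return None
    return sum(available) / len(available)


# Toy example (hypothetical values): the OJ estimate is missing.
icc = composite_icc({"GLS": 0.08, "CT": 0.10, "OJ": None, "MPEG": 0.12})
```

Averaging over the available estimates keeps a firm in the sample even when one of the four methods has no solution for it.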

For the ICC calculation, we require each firm to have a one-year-ahead, a two-year-ahead, and a three-year-ahead earnings forecast. If the three-year-ahead forecast is not available, we estimate it by multiplying the two-year-ahead mean earnings forecast by one plus the consensus long-term growth rate. If neither the three-year-ahead earnings forecast nor the long-term growth rate is available, we compute the growth rate between the one-year and two-year-ahead earnings forecasts and use it to estimate the three-year-ahead earnings forecast. Following Hou et al. (2012), we assume that the annual report becomes publicly available at the latest 90 days after the fiscal year-end. Like Gebhardt et al. (2001), we create a synthetic book value when this information is not yet public. Specifically, we estimate the synthetic book value using book value data for year (\(t-1\)) plus earnings minus dividends (\(B_t=B_{t-1}+EPS_t-D_t\)). Regarding the payout ratio, we use the current payout ratio for firms with positive earnings. Like Gebhardt et al. (2001), for firms with negative earnings, we compute the payout ratio as the ratio between dividends and 6% of total assets. For the residual income models, we estimate the book valueFootnote 22 in year \(t+\tau \) using the clean surplus relation \(B_{(t+\tau )}=B_{(t+\tau -1)}+EPS_{(t+\tau )}*(1- PayoutR)\). We set the Payout Ratio to zero when the \(EPS_{(t+\tau )}\) is negative to avoid economically questionable negative dividends. Furthermore, we exclude all observations with negative book value per share. Following Pástor et al. (2008), we winsorize growth rates below 2% and above 100%. See "Appendix 2" for a detailed description of the ICC methodologies.
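Two of the input rules above, the fallback for a missing three-year-ahead forecast and the clean surplus projection of book value, can be sketched as follows. The function names and toy values are hypothetical. The sketch assumes a positive one-year-ahead forecast when the implied growth rate is used, and it does not apply the winsorization of growth rates.

```python
def three_year_forecast(e1, e2, e3=None, ltg=None):
    """Return the three-year-ahead forecast, falling back as described in the text."""
    if e3 is not None:
        return e3                      # forecast is available directly
    if ltg is not None:
        return e2 * (1.0 + ltg)        # two-year-ahead forecast times (1 + long-term growth)
    implied_growth = e2 / e1 - 1.0     # growth between the one- and two-year-ahead forecasts
    return e2 * (1.0 + implied_growth)


def project_book_values(b0, eps_path, payout_ratio):
    """Clean surplus relation: B_{t+tau} = B_{t+tau-1} + EPS_{t+tau} * (1 - PayoutR)."""
    books, book = [], b0
    for eps in eps_path:
        payout = payout_ratio if eps >= 0 else 0.0  # no payout when EPS is negative
        book = book + eps * (1.0 - payout)
        books.append(book)
    return books


# Toy example (hypothetical values).
books = project_book_values(10.0, [2.0, -1.0, 3.0], payout_ratio=0.5)
e3_from_ltg = three_year_forecast(1.0, 1.2, ltg=0.10)
e3_from_growth = three_year_forecast(1.0, 1.2)
```

Setting the payout ratio to zero in loss years means the full loss reduces book value, so no negative dividend is implied.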

3 Empirical results of earnings forecasts methods

3.1 Descriptive statistics and coefficient estimates of cross-sectional regressions

In this section, we present the descriptive statistics (see Table 1) and the first step of our procedure to forecast earnings, i.e., the pooled (in-sample) regression using the previous ten years of data. We report the average coefficients, the respective t-statistics with Newey and West (1987) adjustment, and the adjusted R-squared. The earnings are estimated yearly from 1983 to 2015 for one-year-ahead forecasts, from 1985 to 2015 for two-year-ahead forecasts, and from 1987 to 2015 for three-year-ahead forecasts. We regress earnings at time (t) on lagged independent variables. (\(\tau =1\)), (\(\tau =2\)), and (\(\tau =3\)) indicate that the independent variables are lagged by one, two, and three years, respectively.

Panel A of Table 2 reports the results for the CM. First, we can see that lagged analysts’ earnings forecasts (\(eIBES1_{i,t-\tau }\) with \(\tau =1\) to 3) are highly significant in explaining earnings even when controlling for other variables from the earnings forecasts literature. Various studies have documented the accuracy of analysts’ earnings forecasts (e.g., Fried and Givoly (1982); O’Brien (1988); Hou et al. (2012)) and this finding corroborates our choice of including analysts’ forecasts in our CM. In terms of magnitude, the average coefficient for analysts’ earnings forecasts is less than 1 (0.957 for one-year-lagged regression, 0.872 in two-year-lagged regression, and 0.774 in the three-year-lagged regression), which confirms the result from the literature that analysts’ forecasts tend to be too optimistic.

Table 1 Descriptive statistics of the variables in the pooled (in-sample) regressions

Although the coefficient of the one-year-lagged gross profits variable (\(GP_{i,t-1}\)) is negative and weakly significant in explaining earnings, the two- and three-year-lagged coefficients of gross profits are positive and significant, with t-statistics of 7.32 and 12.53 and coefficients of 0.026 and 0.057, respectively. The low significance and the negative coefficient in the one-year-lagged (\(\tau =1\)) regression are likely due to the large explanatory power of analysts’ one-year-ahead earnings forecasts, leaving one-year-lagged gross profits redundant. The positive and significant coefficients of gross profits in the two- and three-year-lagged regressions confirm the results of Novy-Marx (2013) that this variable is a good proxy for future earnings.

The coefficients of the one-month past stock return (\(r10_{(t-\tau )}\)) are all positive (0.057, 0.086, and 0.054, for \(\tau = 1\) to 3, respectively) and significant at the 1% level in all analyzed periods. Finally, the past stock return from \(-12\) to \(-2\) months (\(r122_{(t-\tau )}\)) is significant at the 5% level for the two-year-lagged period (t-statistic of 2.25), with a positive coefficient (0.014 for \(\tau =2\)). These results confirm the findings from Ashton and Wang (2012) and Richardson et al. (2010) that stock price changes have a positive correlation with forward earnings, and they tie in with the evidence from Abarbanell (1991) that analysts’ forecasts do not fully reflect the information in prior stock price changes. Our results are also in line with Guay et al. (2011), who find that analysts tend to react slowly to the information contained in recent stock price changes.

Table 2 Coefficient estimates from the pooled (in-sample) regressions

In Panel B of Table 2, we see the results regarding the CSAF model. In particular, the coefficients of analysts’ earnings forecasts in the one-year-lagged regressions are quite close to those of the CM (coefficient of 0.953 with a t-statistic of 35.87 for the CSAF compared to the coefficient of 0.957 and a t-statistic of 33.31 for the CM). However, the two- and three-year-lagged regressions show a different picture. While the coefficients of analysts’ earnings forecasts in the CSAF regression are closer to one (0.971 in the two-year-lagged regression and 0.992 in the three-year-lagged regression), the coefficients of the CM are lower (0.872 in the two-year-lagged regression and 0.774 in the three-year-lagged regression). This indicates that the additional variables gross profits and lagged returns in the CM become more important in the two- and three-year-ahead earnings forecasts compared to the one-year-ahead ones. These results are in line with Bradshaw et al. (2012), who show that analysts’ forecasts are accurate for one-year-ahead horizons, but the two- and three-year-ahead forecasts can underperform even a random walk model.

In Panel C of Table 2, we see the results regarding the HVZ model. The model proposed by Hou et al. (2012) shows a positive and significant relation between earnings (\(E_{(t)}\)) and one-, two-, and three-year-lagged (\(\tau =1\) to 3) earnings (\(E_{(t-\tau )}\)), lagged dividends (\(D_{(t-\tau )}\)), lagged assets (\(A_{(t-\tau )}\)) and the dummy of lagged dividends (\(DD_{(t-\tau )}\)). The coefficient of the dummy variable indicating lagged negative earnings (\(Neg E_{(t-\tau )}\)) is positive and statistically significant in the three-year-lagged regression and the accruals (\(Ac_{(t-\tau )}\)) variable is significant in none of the regressions. The magnitude and the sign of the coefficients are similar to Hou et al. (2012) and Li and Mohanram (2014), even though the sample period is different.Footnote 23

For the EP model (see Panel D of Table 2), the lagged dummy variable of negative earnings (\(Neg E_{(t-\tau )}\)) is negative and significant, lagged earnings (\(E_{(t-\tau )}\)) is positive and significant, and the interaction term (Neg E * \(E_{(t-\tau )}\)) is negative and significant in all analyzed regressions (\(\tau =1\) to 3).

For the RI model (see Panel E of Table 2), the lagged dummy of negative earnings (\(Neg E_{(t-\tau )}\)) is negative and significant, lagged earnings (\(E_{(t-\tau )}\)) is positive and significant, the interaction term (Neg E * \(E_{(t-\tau )}\)) is negative, and lagged book value (\(B_{(t-\tau )}\)) is positive and significant. All these results are similar to Li and Mohanram (2014) with the only difference being that (\(TACC_{(t-\tau )}\)) is negative but not significant in our regression. This difference is probably due to the different estimation period and a possibly different calculation method of standard errors for the t-statistics.

When we compare the adjusted R-squared figures of the tested models, we see that the CM and the CSAF present the highest values for all analyzed periods. For the one-year-lagged regression, the adjusted R-squared of the CM is 0.94, compared to 0.94 (CSAF), 0.77 (HVZ model), 0.77 (EP model), and 0.78 (RI model). For the two-year-lagged regression, the CM has an adjusted R-squared of 0.86, which is higher than that of the CSAF (0.85), HVZ (0.69), EP (0.68), and RI (0.70) models. For the three-year-lagged regression, the adjusted R-squared values are 0.81 (CM), 0.78 (CSAF), 0.66 (HVZ model), 0.63 (EP model), and 0.66 (RI model). Our adjusted R-squared values for the EP and RI models are higher than in Li and Mohanram (2014) because we estimate these models at the dollar level, and the heteroskedasticity of the dollar-level data inflates the adjusted R-squared. Although a high in-sample R-squared value is not a sufficient condition for high out-of-sample performance, it is a necessary one (Welch and Goyal 2008). These in-sample results bode well for the CM. We will analyze the forecast bias in the next section.

3.2 Bias comparison

There is ample evidence that analysts’ forecasts tend to be too optimistic (e.g., Lin and McNichols 1998; Hong and Kubik 2003b; Merkley et al. 2017; Mest and Plummer 2003) with one of the reasons being that they face a conflict of interest. In a survey of 365 analysts, Brown et al. (2015) find that 44% of respondents say their success in generating underwriting business or trading commissions is very important for their compensation. There is also empirical evidence for the conflict of interest hypothesis. Hong and Kubik (2003b) find that controlling for accuracy, analysts who are optimistic compared to the consensus are more likely to have favorable job separations. In particular, for analysts who cover stocks underwritten by their houses, optimism becomes more relevant than accuracy for favorable job separations. This optimism bias carries over into many applications that use these forecasts as an input. Easton and Sommers (2007) estimate that overly-optimistic analysts’ earnings forecasts lead to an upward bias in the ICC of 2.84%. Given the importance of bias, we now compare the mean and median biases of all tested earnings forecast models. We define bias as the difference between actual earnings and earnings forecasts, scaled by the firm’s end-of-June market equity. We estimate bias out-of-sample for one-, two-, and three-year-ahead forecasts (\(\tau =1\) to 3).

$$\begin{aligned} Bias_{(i,t+\tau )}= \frac{(Actual\, earnings_{(i,t+\tau )} - Earnings\, forecast_{(i,t+\tau )})}{Market\, equity_{(i,t)}} \end{aligned}$$
(5)

As seen in Eq. 5, a negative (positive) bias means overly-optimistic (pessimistic) earnings forecasts. A bias of zero indicates unbiased forecasts. We estimate bias at the end of June of each yearFootnote 24 for each firm. Then, we estimate the yearly mean and median forecast biases. In Panel A of Table 3, we report the average of the yearly mean and median biases and the respective t-statistics with the Newey-West adjustment for all tested models. For all methods, the bias evaluation is based on a common sample that is restricted to firm-years for which I/B/E/S analysts’ forecasts are available.
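The aggregation around Eq. 5 can be sketched in Python as below: bias is computed per firm-year, summarized by the yearly cross-sectional mean and median, and then averaged across years. The function names and toy observations are hypothetical, and the Newey-West adjustment of the t-statistics is deliberately left out of this sketch.

```python
from collections import defaultdict
from statistics import mean, median


def forecast_bias(actual, forecast, market_equity):
    """Eq. 5: (actual earnings - earnings forecast) / end-of-June market equity."""
    return (actual - forecast) / market_equity


def average_yearly_bias(observations):
    """Average of the yearly mean and of the yearly median bias.

    observations is an iterable of (year, actual, forecast, market_equity) tuples.
    """
    by_year = defaultdict(list)
    for year, actual, fcst, market_equity in observations:
        by_year[year].append(forecast_bias(actual, fcst, market_equity))
    yearly_means = [mean(biases) for biases in by_year.values()]
    yearly_medians = [median(biases) for biases in by_year.values()]
    return mean(yearly_means), mean(yearly_medians)


# Toy observations (hypothetical): forecasts of 100 and market equity of 1000.
obs = [
    (2010, 90.0, 100.0, 1000.0),
    (2010, 100.0, 100.0, 1000.0),
    (2010, 70.0, 100.0, 1000.0),
    (2011, 110.0, 100.0, 1000.0),
    (2011, 100.0, 100.0, 1000.0),
]
avg_mean_bias, avg_median_bias = average_yearly_bias(obs)
```

In the toy data both averages are negative, which under the sign convention of Eq. 5 corresponds to overly optimistic forecasts on average.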

In Panel A of Table 3, we see that the CM is the only model that has no statistically significant bias at the 0.05 significance level. We emphasize that this result holds for both the mean and the median biases and for one-, two-, and three-year-ahead forecasts. Our results confirm the optimism bias of analysts’ forecasts, as the mean and median biases are negative and statistically significant for one-, two-, and three-year-ahead forecasts. The one-year-ahead median bias is small in magnitude (\(-0.002\)), i.e., analysts overestimate earnings by an amount of 0.2% of market equity. However, the median bias increases in magnitude for two- and three-year-ahead forecasts to \(-0.009\) and \(-0.013\), respectively. Our results are different from those in Abarbanell and Lehavy (2003), who show that the median bias is zero for analysts’ forecasts. This is possibly due to the different sample period (Abarbanell and Lehavy (2003) analyze the period from 1985 to 1998) and the different forecast periodicity (the authors use quarterly forecasts while we use yearly forecasts).

Table 3 Earnings forecasts bias

Moving to the benchmark models, the HVZ and RI models show an optimistic mean bias in the one-, two-, and three-year-ahead forecasts. The EP model displays an optimism bias for the mean one-year-ahead forecasts as well as for the median two- and three-year-ahead regressions. The forecasts based on the RW model show a positive bias, which means that they are overly pessimistic. This is intuitive as this model does not take growth in earnings into account. Finally, the CSAF model performs well in that it only has a significant bias for the two-year-ahead forecast horizon. However, it does show greater bias in terms of magnitude for the three-year-ahead forecasts compared to the raw analysts’ forecasts. This indicates that simply incorporating analysts’ forecasts into a cross-sectional regression does not remove the optimism bias.

Panel B of Table 3 shows whether the bias of the CM is statistically different in comparison to other models. The first row presents the difference between the CM and analysts’ forecasts, and we see that in all periods, for the mean and the median, the biases are statistically different. Thus, we document that the CM is not as overly-optimistic as raw analysts’ forecasts. In the second row, we compare the CM to the CSAF, and we see that the mean and median bias is statistically different for two- and three-year-ahead forecasts. These results show that the additional variables of the CM (compared to the CSAF) are important for achieving unbiased forecasts, in particular for long-term earnings. When we compare the CM to the RI model, we see differences only for the three-year-ahead forecast. Furthermore, the CM is statistically less optimistic than the HVZ for one- and three-year-ahead forecasts and less pessimistic than the RW for two- and three-year-ahead forecasts. Last, we show that the CM is not as overly-optimistic as the EP model at a statistically significant margin for all analyzed periods. In short, the CM displays the lowest bias of all tested models for all forecast horizons.

Fig. 1
These figures show the time series of the mean and median bias for the US market. Bias is defined as the difference between actual earnings and earnings forecasts, scaled by the firm’s end-of-June market equity. Results refer to the CM, analysts’ forecasts (AF), and the benchmark cross-sectional model with the bias closest to zero. Results are shown for one-, two-, and three-year-ahead earnings forecasts. We estimate the one-, two-, and three-year-ahead forecast bias for the periods 1985–2015, 1987–2015, and 1989–2015, respectively.

To analyze forecast bias over time, Fig. 1 shows the mean and median forecast bias for one-, two-, and three-year-ahead earnings forecasts. For the sake of clarity, we only include the raw analysts’ forecasts, the CM, and the benchmark model with forecast bias closest to zero in the figure. The optimism bias of the raw analysts’ forecasts is immediately apparent. The corresponding graph is almost always below zero for different forecast horizons and aggregation methods (mean and median). We also see spikes in the bias for the RW model that correspond to economic shocks. For example, the burst of the Internet bubble in 2001 results in an overly-optimistic estimate as the previous (high) level of earnings is used as a forecast. On the other hand, the CM displays periods with positive and negative bias, indicating that it is on average unbiased.

3.3 Accuracy comparison

There is substantial evidence that analysts’ forecasts are more accurate than regression-based models (e.g., Fried and Givoly (1982); O’Brien (1988); Hou et al. (2012)). Researchers argue that the higher accuracy of analysts’ forecasts is due to their “innate ability and task-specific experience”Footnote 25 (e.g., Clement et al. (2007)), industry-related experience obtained before becoming an analyst (e.g., Bradley et al. (2017)), and the number of analysts covering each industry (e.g., Merkley et al. (2017)).

In this section, we compare the forecast accuracy of all tested models. For all methods the accuracy evaluation is based on a common sample that is restricted to firm-years for which I/B/E/S analysts’ forecasts are available. We use absolute error as a proxy for accuracy. Following Bradley et al. (2017), we estimate the absolute error as the absolute difference between actual earnings and earnings forecasts, scaled by the firm’s end-of-June market equity. The lower the value of the absolute error, the more accurate the forecast.

$$\begin{aligned} Absolute \, error_{(i,t+\tau )}= abs\left[ \frac{(Forecast\, earnings_{(i,t+\tau )} - Actual\, earnings_{(i,t+\tau )})}{Market\, equity_{(i,t)}}\right] \end{aligned}$$
(6)

We estimate the out-of-sample absolute error based on Eq. 6 at the end of June of each year,Footnote 26 for one-, two-, and three-year-ahead forecast horizons (\(\tau =1\) to 3) for each firm. In Panel A of Table 4, we report the yearly average of the mean and median absolute errors (accuracy) and the respective t-statistics with the Newey-West adjustment for all tested models.
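As a minimal sketch of Eq. 6 and of the common-sample restriction, consider the following example. The firm-year data, the dictionary layout, and the helper names `absolute_error` and `yearly_stats` are hypothetical and serve only to make the steps concrete.

```python
from statistics import mean, median

def absolute_error(forecast, actual, market_equity):
    """Eq. 6: absolute forecast error scaled by end-of-June market equity."""
    return abs((forecast - actual) / market_equity)

# Hypothetical data keyed by (firm, year): (actual earnings, market equity)
actuals = {
    ("A", 2001): (10.0, 100.0), ("B", 2001): (5.0, 50.0),
    ("A", 2002): (12.0, 120.0), ("B", 2002): (4.0, 40.0),
}
# Hypothetical forecasts per model; analysts (AF) cover firm A only
forecasts = {
    "AF": {("A", 2001): 11.0, ("A", 2002): 12.6},
    "CM": {("A", 2001): 10.5, ("B", 2001): 6.0, ("A", 2002): 11.4, ("B", 2002): 4.0},
}

# Common sample: only firm-years for which analysts' forecasts are available
common = set(forecasts["AF"])

def yearly_stats(model):
    """Return the time-series averages of the yearly mean and median absolute errors."""
    by_year = {}
    for key in common:
        if key in forecasts[model]:
            actual, me = actuals[key]
            err = absolute_error(forecasts[model][key], actual, me)
            by_year.setdefault(key[1], []).append(err)
    years = sorted(by_year)
    avg_of_means = mean(mean(by_year[y]) for y in years)
    avg_of_medians = mean(median(by_year[y]) for y in years)
    return avg_of_means, avg_of_medians

af_mean, af_median = yearly_stats("AF")
cm_mean, cm_median = yearly_stats("CM")
```

Firm B drops out of the comparison for both models because it has no analyst coverage, which keeps the two error measures comparable.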

Table 4 Earnings forecasts accuracy

As we see in Panel A of Table 4, the CM is slightly superior to the raw analysts’ forecasts and the CSAF model and markedly superior to the benchmark models in terms of mean accuracy. When comparing the three most accurate models, the CM has the lowest one-year-ahead forecast error (0.046), followed by the CSAF (0.050) and the AF (0.057). For the one-year-ahead forecast, the mean absolute error of the benchmark models is roughly twice as high (i.e., less accurate) as that of the CSAF model, the raw analysts’ forecasts, or the CM. For the two- and three-year-ahead mean absolute error, the CM is again more accurate than the other models, but we note that the difference to analysts’ forecasts is smaller: the CM has a mean absolute error of 0.063 and 0.070 for two- and three-year-ahead forecasts, whereas the mean absolute error of analysts’ forecasts is 0.070 and 0.076. Regarding the CSAF model, the accuracy gap to the CM widens for long-term forecasts since the absolute error for the CSAF model is 0.076 for two-year-ahead and 0.099 for three-year-ahead forecasts. The CSAF model outperforms the raw analysts’ forecasts for the one-year-ahead horizon (mean error), but it is less accurate for two- and three-year-ahead forecasts. Finally, the mean absolute error of the other benchmark models is on average five percentage points higher than that of the CM.

With regard to median absolute error, the results of analysts’ forecasts are slightly superior to the CM for one- and two-year-ahead horizons (0.011 and 0.024 for raw analysts’ forecasts and 0.015 and 0.026 for the CM for one-year and two-year forecasts, respectively). For three-year-ahead forecasts, the median absolute error is 0.033 for both models. The third best model in terms of median accuracy is the CSAF, with absolute errors of 0.016, 0.030, and 0.042 for one-, two-, and three-year-ahead forecasts. Concerning the other benchmark models, the median absolute error is substantially higher (more inaccurate) compared to the raw analysts’ forecasts, the CSAF, and the CM. We also highlight that the analysts’ forecasts are more accurate than the ones estimated with the CSAF model in one-, two-, and three-year-ahead forecasts. This is evidence that only including the analysts’ forecasts in a cross-sectional model is not sufficient to improve the forecasts.

In Panel B of Table 4, we test whether the differences are statistically significant. The CM shows superior accuracy compared to all cross-sectional models and the RW model. Like Gerakos and Gramacy (2013), we find that the RW model is as accurate as the cross-sectional models. Comparing the CM to analysts’ forecasts, the CM outperforms the analysts in the medium and long-term (two- and three-year-ahead) forecasts. However, the results for one-year-ahead forecasts are mixed since the analysts’ forecasts have better median accuracy, while the mean accuracy is not statistically different between both models.

Fig. 2
These figures show the time series of the mean and median accuracy for the US market. We define accuracy as the absolute difference between actual earnings and earnings forecasts, scaled by the firm’s end-of-June market equity. Results refer to the combined model (CM), analysts’ forecasts (AF), and the benchmark cross-sectional model with the highest accuracy (lowest absolute forecast error). Results are shown for one-, two-, and three-year-ahead earnings forecasts. We estimate the one-, two-, and three-year-ahead forecast accuracy for the periods 1985–2015, 1987–2015, and 1989–2015, respectively.

In Fig. 2, we plot the forecast accuracy over time for the tested methods. The raw analysts’ forecasts are superior to the CM in terms of one-year-ahead median accuracy, in particular for the first years of the sample period. When we split the analyzed period into two equal-length sub-periods, we see that the difference in median accuracy during the period 1985–2000 is 0.0073, while in the period 2001–2015 it decreases to 0.0022. We observe the same pattern for two-year-ahead median accuracy; here the difference falls from 0.0032 (earlier period) to 0.0000 (later period), which indicates that the CM has improved the accuracy compared to the raw analysts’ forecasts over the years. Last, note that the raw analysts’ forecasts and the CM outperform the benchmark models in all periods.

3.4 Earnings response coefficient

The ERC is the coefficient that measures the response of stock prices to surprises (new information) in accounting earnings announcements (Easton and Zmijewski (1989)). A higher ERC suggests that the market reacts more strongly to the unexpected earnings from a model that represents a better approximation of market expectations (Li and Mohanram (2014)). According to Brown (1993), assuming an informationally efficient market, the accuracy and market association could be considered “two sides of the same coin.”Footnote 27 However, it is important to clarify that while bias and accuracy are ex-post assessments of forecasts, the ERC examines the extent to which earnings forecasts provide the best ex-ante estimates of market expectations. This analysis also helps to rule out the possibility that our results are primarily driven by different definitions of earnings (street versus GAAP).

We estimate the ERC by regressing the sum of the quarterly earnings announcement returns (market-adjusted, from day \(- 1\) to day \(+ 1\)) on one-, two-, and three-year-ahead firm-specific unexpected earnings (i.e., the forecast bias) measured over the same horizon. The unexpected earnings, as well as the returns, are standardized to make the ERC comparable among all models.
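The ERC estimation for a single year can be sketched as follows. This is an illustrative sketch under stated assumptions: the five-firm cross-section is hypothetical, standardization is assumed to be cross-sectional (zero mean, unit standard deviation within the year), and the helper names `standardize` and `ols_slope` are introduced here.

```python
from statistics import mean, pstdev

def standardize(xs):
    """Scale a cross-section to zero mean and unit (population) standard deviation."""
    m, s = mean(xs), pstdev(xs)
    return [(x - m) / s for x in xs]

def ols_slope(x, y):
    """Slope of a univariate OLS regression of y on x (with intercept)."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

# Hypothetical data: four quarterly three-day market-adjusted announcement returns per firm
announcement_returns = [
    [0.01, 0.02, -0.01, 0.01],
    [-0.02, -0.01, 0.00, -0.01],
    [0.00, 0.01, 0.01, 0.00],
    [-0.01, 0.00, -0.01, 0.01],
    [0.03, 0.01, 0.02, 0.00],
]
# Hypothetical unexpected earnings (forecast bias) for the same firms
unexpected_earnings = [0.004, -0.010, 0.001, -0.002, 0.012]

# Sum the quarterly announcement returns for each firm
summed_returns = [sum(quarters) for quarters in announcement_returns]

# ERC: slope from regressing standardized returns on standardized unexpected earnings
erc = ols_slope(standardize(unexpected_earnings), standardize(summed_returns))
```

Because both variables are standardized, the slope in this univariate setting equals their correlation and therefore lies between minus one and one, which is what makes the coefficient comparable across forecast models.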

Panel A of Table 5 shows the time-series average of the ERCs, the respective t-statistics, and the time-series average of adjusted R-squared for all tested models. We see that the CM reports the highest ERC for all time horizons analyzed. The one-, two-, and three-year-ahead ERCs are 0.132, 0.130, and 0.098, respectively. Regarding R-squared, for the one-year-ahead horizon, the highest value is achieved by the RI model (0.017), followed by the CM (0.016) and the CSAF (0.016). For two- and three-year-ahead horizons, the highest R-squared is achieved by the CM, with values of 0.017 and 0.009, respectively. The results suggest that the earnings forecasts from the CM are closer to market expectations. We test for statistical significance in Panel B.

Table 5 Earnings response coefficient

As we see in Panel B of Table 5, for one-year-ahead forecasts, the CM outperforms raw analysts’ forecasts regarding ERC and adjusted R-squared. The difference in the ERC is also highly statistically significant (t-statistic of 3.76). For the same forecast horizon, the CM does not significantly outperform the other benchmark models. When analyzing two-year-ahead forecasts, we note that the CM shows a higher ERC than the CSAF, HVZ, EP, and RI models and a higher adjusted R-squared than the CSAF, HVZ, and RI models at a statistically significant margin. Finally, for three-year-ahead forecasts, the results are statistically different when comparing the CM to the RW and the CSAF models. In summary, we find that the CM represents market expectations most consistently among all tested models.

4 Implied cost of capital

The ICC is a popular proxy for expected returns (see e.g., Pástor et al. (2008); Frank and Shen (2016); Bielstein and Hanauer (2019)) as its estimates contain less noise than estimates based on realized returns (e.g., Lee et al. (2009)). Better earnings estimates should improve the correlation between the ICC and subsequent realized returns leading to more useful ICC estimates. In this section, we analyze the performance of ICC estimates using proxies for earnings forecasts based on the CM, analysts’ forecasts, and the benchmark models. First, we compute the ICC on an aggregate level and evaluate its ability to predict realized returns over time. Then, we analyze the cross-sectional correlation between ICC and ex-post forward returns.

4.1 Relation between ICC and returns on an aggregate level

There is evidence that the ICC at an aggregate level is a good proxy for time-varying expected returns (e.g., Pástor et al. (2008); Li et al. (2013)). Because earnings forecasts are one of the main inputs for the ICC estimation, we believe that they can strongly influence the ICC’s performance as a proxy for expected returns. In this section, we test whether the slopes from a regression of realized market returns on the ICC, computed using different methods to forecast earnings, are greater than zero.Footnote 28 We regress ex-post, one-year-forward value-weighted (VW) excess market returns on VW excess ICC. For each earnings forecast method, we estimate five different ICC models (GLS, CT, OJ, MPEG, and a Composite, which is the mean of the four previous models). We employ the following proxies for earnings forecasts: the CM, analysts’ forecasts, the HVZ model, the EP model, and the RI model.Footnote 29 To compute the excess ICC and excess market returns, we use the yield on the U.S. 10-year government bond. Panel A of Table 6 presents the results.
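The predictive regression can be sketched as follows. This is only an illustration: the yearly series are hypothetical, the helper names `composite_icc` and `ols` are introduced here, and the t-statistic shown is the conventional OLS one, which is a simplification because overlapping or persistent series usually call for a robust standard error.

```python
import math
from statistics import mean

def composite_icc(gls, ct, oj, mpeg):
    """Composite ICC: simple mean of the four individual ICC estimates."""
    return mean([gls, ct, oj, mpeg])

def ols(x, y):
    """Univariate OLS of y on x; returns slope, intercept, and conventional slope t-statistic."""
    n = len(x)
    mx, my = mean(x), mean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sxx
    intercept = my - slope * mx
    resid = [b - intercept - slope * a for a, b in zip(x, y)]
    sigma2 = sum(e * e for e in resid) / (n - 2)  # residual variance
    return slope, intercept, slope / math.sqrt(sigma2 / sxx)

# Hypothetical yearly series: VW ICC, 10-year bond yield, one-year-forward VW market return
vw_icc = [0.09, 0.10, 0.08, 0.11, 0.095, 0.085]
bond_yield = [0.05, 0.055, 0.045, 0.06, 0.05, 0.04]
fwd_market_return = [0.12, 0.15, 0.02, 0.20, 0.08, 0.10]

# Excess ICC and excess returns are both measured relative to the bond yield
excess_icc = [i - b for i, b in zip(vw_icc, bond_yield)]
excess_return = [r - b for r, b in zip(fwd_market_return, bond_yield)]

slope, intercept, t_stat = ols(excess_icc, excess_return)
```

A slope greater than zero indicates that a higher aggregate excess ICC is followed by higher excess market returns, which is the hypothesis tested in this section.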

For the one-year-forward return predictive regressions, we document that the ICC estimated with earnings from the CM offers the largest number of significant regression slopes. For three ICC methods (CT, OJ, and MPEG), the coefficients are significant at the 0.05 level. In contrast, the HVZ and CSAF models, as well as raw analysts’ forecasts, only produce two significant coefficients. Moreover, the ICC estimated with the CM has the highest t-statistic in three out of the five ICC approaches.

Table 6 Regressions of ICC and ex-post realized returns

4.2 Relation between ICC and returns cross-sectionally

In the previous section, we compared the predictive power of the ICC over time. Now, we analyze whether the ICC has a positive correlation with the cross-section of stock returns. To this end, we perform univariate Fama and MacBeth (1973) (FM) cross-sectional regressions of the ex-post forward return premium on four individual ICC premium estimates (we use the GLS, CT, OJ, and MPEG approaches) and on the Composite ICC premium at the firm level. To estimate earnings forecasts for the ICC computation, we use the following proxies: the CM, analysts’ forecasts, the CSAF model, the HVZ model, the EP model, and the RI model. The results are reported in Panel B of Table 6.
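The two-step FM procedure can be sketched as follows. The panel is hypothetical, the function names are introduced here, and the t-statistic is the plain time-series one without any autocorrelation adjustment, which is an assumption of this sketch.

```python
import math
from statistics import mean, stdev

def ols_slope(x, y):
    """Slope of a univariate OLS regression of y on x (with intercept)."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

def fama_macbeth(panel):
    """panel maps each period to a list of (icc_premium, forward_return_premium) pairs.

    Step 1: run one cross-sectional regression per period.
    Step 2: average the slopes over time and compute a simple t-statistic.
    """
    slopes = []
    for period in sorted(panel):
        rows = panel[period]
        slopes.append(ols_slope([icc for icc, _ in rows], [ret for _, ret in rows]))
    avg = mean(slopes)
    t_stat = avg / (stdev(slopes) / math.sqrt(len(slopes)))
    return avg, t_stat, slopes

# Hypothetical panel with three periods and three firms each
panel = {
    1: [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0)],
    2: [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0)],
    3: [(0.0, 0.0), (1.0, 3.0), (2.0, 6.0)],
}
avg_slope, t_stat, slopes = fama_macbeth(panel)
```

Each cross-sectional regression weights all firms equally, so small stocks influence the slope as much as large ones.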

When we regress cross-sectional monthly returns on the ICC, we can see that the ICC estimated with the CM has the strongest correlation with the cross-section of returns since the coefficients are statistically significant in four (GLS, CT, OJ, and Composite) out of five ICC approaches. The ICC estimated with the CM has the highest t-statistics in all analyzed ICC approaches. Interestingly, the ICC estimated with the CSAF model has the second-highest number of significant coefficients. This result shows that although the CSAF model is less accurate and more biased than the raw analysts’ forecasts, the resulting ICC estimates have a higher correlation with the cross-section of expected returns.

4.3 Portfolio strategies

As shown in Table 6, the ICC exhibits weak explanatory power in FM regressions. However, this finding might be driven by small and micro-cap stocks as the FM regressions weight the observations equally (Novy-Marx 2013). An additional shortcoming of FM regressions is that they are sensitive to outliers. To address these potential issues, we analyze the performance of value-weighted portfolios sorted by their ICC.

Table 7 presents annual excess returns (in excess of the risk-free return). The stocks are sorted into quintiles and deciles based on their respective ICC at the end of June each year from 1986 to 2012. We report the performance of the long-short strategies 5 − 1 (fifth quintile minus the first quintile) and 10 − 1 (tenth decile minus first decile). We estimate ICCs based on earnings from the following models: the CM, AF, CSAF, HVZ, RI, EP, and RW. We sort portfolios based on the following ICC approaches: CT, GLS, OJ, and MPEG. In addition, we include a Composite ICC, which is the average of the above-mentioned approaches. To compute excess returns, we use the one-month Treasury bill rate.
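The portfolio sort can be sketched as follows. The ten-stock universe is hypothetical, the function names are introduced here, and the way stocks are split into groups (equal counts after ranking on the ICC) is an assumption, since breakpoint conventions are not stated in the text.

```python
def value_weighted_return(stocks):
    """Market-equity-weighted average return of a list of (icc, market_equity, return)."""
    total_me = sum(me for _, me, _ in stocks)
    return sum(me * ret for _, me, ret in stocks) / total_me

def long_short(stocks, n_groups):
    """Return of the highest-ICC group minus the lowest-ICC group (e.g., 5 - 1 or 10 - 1)."""
    ranked = sorted(stocks, key=lambda s: s[0])  # ascending ICC
    n = len(ranked)
    groups = [ranked[i * n // n_groups:(i + 1) * n // n_groups] for i in range(n_groups)]
    return value_weighted_return(groups[-1]) - value_weighted_return(groups[0])

# Hypothetical stocks at the end of June: (ICC, market equity, subsequent annual excess return)
stocks = [
    (0.01, 100.0, 0.02), (0.02, 300.0, 0.04), (0.03, 150.0, 0.05),
    (0.04, 250.0, 0.03), (0.05, 120.0, 0.06), (0.06, 180.0, 0.07),
    (0.07, 220.0, 0.05), (0.08, 140.0, 0.09), (0.09, 200.0, 0.10),
    (0.10, 200.0, 0.14),
]

quintile_spread = long_short(stocks, 5)   # 5 - 1 strategy
decile_spread = long_short(stocks, 10)    # 10 - 1 strategy
```

Value weighting limits the influence of small and micro-cap stocks, which is the motivation for complementing the FM regressions with portfolio sorts.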

Table 7 Returns of portfolios formed on ICC

The results of the long-short strategies show that only the ICC estimated with the CM and the CSAF model report significant excess returns. The ICC estimated with the CM has significant excess returns with the GLS approach for the 5 − 1 (4.45%) and 10 − 1 (4.98%) long-short strategies and with the CT approach for the 10 − 1 strategy, with annualized excess returns of 6.65%. The ICC estimated with the CSAF model has significant excess returns for the strategy 10 − 1 with the CT and Composite ICC. Some of our results here may differ from the corresponding results in the original papers for the HVZ, EP, and RI models. This may be due to a different return frequency used to compute t-statistics, different sample periods, and different stock universes.

In summary, the ICC estimated with the CM reports a stronger correlation with returns compared to the other models. The results hold for both dimensions, over time and cross-sectionally. The ICC estimated with the CSAF model has similar predictive power compared to the raw analysts’ forecasts but a stronger correlation with returns cross-sectionally.

5 Firm characteristics and expected returns

We evaluate whether a set of firm characteristics that have been used to explain the cross-sectional variation of expected returns, proxied by average realized returns, have the same relation when the ICC is used as a proxy for expected returns. We perform Fama and MacBeth (1973) (FM) cross-sectional regressions with ex-post excess realized returns from July (year t) to June (year \(t+1\)) and excess ICC estimated with different proxies for earnings forecasts as dependent variables. The independent variables are firm characteristics available prior to the end of June of year (t). We estimate the ICCFootnote 30 based on different proxies of earnings forecasts at the end of June of each year.

We use the following firm characteristics. We estimate market \(\beta \) at the end of June for each stock and for each year using the stock’s previous 60 monthly excess returns (we require a minimum of 24 months, and excess returns are in excess of the one-month Treasury bill rate taken from Kenneth French’s data library). Idiosyncratic volatility is the standard deviation of the residuals from regressing the stock’s returns in excess of the one-month Treasury bill rate on the three Fama and French (1993) factorsFootnote 31 estimated yearly at the end of June using the previous 60 monthly returns (we require a minimum of 24 months) (e.g., Ang et al. (2006); Hou et al. (2015)). Asset growth is the change in total assets from the fiscal year ending in year (\(t-1\)) to the fiscal year ending in (t), divided by (\(t-1\)) total assets (e.g., Fama and French (2015)). Size is the natural logarithm of market equity at the end of June in year (t). Gross profitability is the ratio of gross profits to total assets (e.g., Novy-Marx (2013)). Leverage is book value of debt divided by book equity. CapEx is capital expenditures divided by total assets from year (\(t-1\)). ln(beme) is the natural logarithm of the ratio of book equity to market equity at the previous fiscal year-end. In Table 8, we provide the average of the FM regression coefficients estimated yearly for the period from June 1986 to June 2012 and the respective t-statistics with Newey-West adjustment.Footnote 32
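Some of these characteristics can be sketched as follows. The input series are hypothetical, the function names are introduced here, and market \(\beta \) is assumed to be the slope of a univariate regression of the stock's excess returns on the market's excess returns; the two series are assumed to be aligned month by month.

```python
import math
from statistics import mean

def market_beta(stock_excess, market_excess, window=60, min_obs=24):
    """Slope of stock on market excess returns over the last `window` months.

    Returns None when fewer than `min_obs` monthly observations are available.
    """
    y = stock_excess[-window:]
    x = market_excess[-window:]
    if len(y) < min_obs:
        return None
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

def asset_growth(assets_t, assets_t_minus_1):
    """Change in total assets from year t-1 to year t, divided by year t-1 total assets."""
    return (assets_t - assets_t_minus_1) / assets_t_minus_1

def size(market_equity):
    """Natural logarithm of market equity at the end of June."""
    return math.log(market_equity)

def ln_beme(book_equity, market_equity):
    """Natural logarithm of the book-to-market equity ratio."""
    return math.log(book_equity / market_equity)

# Hypothetical monthly excess returns (30 months)
market = [0.01 * ((i % 5) - 2) for i in range(30)]
stock = [1.5 * m + 0.001 for m in market]

beta = market_beta(stock, market)
growth = asset_growth(110.0, 100.0)
```

The minimum of 24 observations means that recently listed stocks receive no \(\beta \) estimate and drop out of the regression for that year.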

For market \(\beta \) the results are mixed. While we see negative and significant coefficients for the ICC with earnings forecasts from the CM, as well as from the cross-sectional (CSAF, HVZ, EP, and RI) models, the ICC using analysts’ earnings forecasts has a positive relation with market \(\beta \). The relation between market \(\beta \) and forward returns is not statistically significant. These results are similar to Hou et al. (2012), as their ICC model has a negative and significant relation to market \(\beta \) while the relation to realized returns is not statistically significant. The ICC based on the CM, analysts’ forecasts, EP, and HVZ earnings forecasts has a positive and significant relation with leverage, but forward returns and ICC with CSAF and RI earnings forecasts have no significant coefficients for leverage.

Table 8 Implied cost of capital and risk factors

All proxies of expected returns have positive coefficients for idiosyncratic volatility. However, the coefficients are statistically significant only for the ICC with earnings forecasts derived from the CM (t-statistic of 2.514), analysts’ forecasts (t-statistic of 4.446), the CSAF model (t-statistic of 2.518), and the EP model (t-statistic of 3.218). The results for asset growth are interesting since we are able to confirm the negative cross-sectional relation of asset growth and returns, also shown in Aharoni et al. (2013). Although the ICC estimated with most proxies of earnings forecasts shows a negative and significant relation with asset growth (the coefficients vary between \(-0.417\) and \(-1.637\) with t-statistics between 3.521 and 5.386), the ICC with analysts’ forecasts has a positive and significant relation with a coefficient of 0.181 and a t-statistic of 3.076. These findings caution against using the ICC based on analysts’ earnings forecasts as a proxy for expected returns.

The size effect is stronger when we use the ICC as a proxy for expected returns than when realized returns are used. The ICCs based on any of the tested earnings forecasts methods show significant coefficients at the 0.01 level. When we analyze the relation of size and forward realized returns, the coefficient is not statistically significant. Concerning the value effect, the coefficients of ln(beme) are positive and statistically significant for all proxies of expected returns, but the t-statistics are higher when the ICC is used as a proxy for expected returns than when the ex-post realized returns are used. This is not surprising as the ICC is a more sophisticated value measure and is therefore highly correlated with the value factor (e.g., Li et al. (2013)).

According to Novy-Marx (2013), gross profitability has a positive and significant relation to returns. In our study, we confirm these results when using realized returns as the dependent variable, as the corresponding coefficient is 5.627 with a t-statistic of 3.181. The results for the ICC based on the CM (a positive coefficient of 3.013 and a t-statistic of 8.492) are also similar to those from the regression using realized returns. However, when we analyze the ICCs with earnings forecasts from the HVZ model and the RI model, the results show a negative and significant relation, with t-statistics of 6.657 and 3.280, respectively. Finally, CapEx has a negative and significant relation with the ICC based on the CM, HVZ, EP, and RI models and an insignificant relation with the other proxies of expected returns analyzed in this study.

6 Extensions to cross-sectional models and removing predictable forecast errors

In this section, we address two potential points of criticism regarding our CM. First, we check whether the good performance of the CM could have been achieved by including analysts’ forecasts in any of the other cross-sectional earnings forecast models. To this end, we include analysts’ earnings forecasts in the HVZ, EP, and RI models. Second, the literature has developed methods to improve analysts’ forecasts by removing their predictable forecast errors (e.g., Mohanram and Gode (2013); Han et al. (2001)). To test whether these improvements deliver results similar to those of our CM, we implement the procedure from Mohanram and Gode (2013) (henceforth MG model).

In our first test, we replace income before extraordinary items from Compustat with one-year-ahead earnings forecasts from I/B/E/S in the HVZ, EP, and RI earnings forecast models. Then we compute forecast bias, accuracy, and ERC based on street earnings.Footnote 33 As Table 9 shows, the tested models based on analysts’ forecasts have lower accuracy and weaker earnings response coefficients compared to the CM, and all of them show a significant bias in at least one of the three forecast horizons. However, compared to the original models, the models with analysts’ forecasts display improvements in all three measures (accuracy, bias, and ERC).

Table 9 Cross-sectional models with analysts’ forecasts

Recent studies show that part of the analysts’ earnings forecast errors (bias) is predictable (see Mohanram and Gode (2013); Larocque (2013); Guay et al. (2011)). Mohanram and Gode (2013) propose a model to remove the predictable errors, which can be used to estimate ICC. The authors show that the ICC estimated with the MG model has a stronger cross-sectional correlation with returns. We closely follow Mohanram and Gode (2013)Footnote 34 when implementing their model and subsequently compare the results to our analyzed models.Footnote 35

In unreported results (available on request), we find similar coefficients in the in-sample regression and a consistent correlation between SURP1 and SURP2 and the respective analysts’ forecasts errors. Regarding bias and accuracy, we do not find any statistically significant one-year or two-year ahead mean bias, but we find a positive and statistically significant median bias for two-year-ahead earnings forecasts (0.011 with t-statistic of 3.92). In terms of accuracy, the MG model underperforms the raw analysts’ forecasts in all tests at the per-share level. These results are in line with the findings from Hou et al. (2012) who also replicate the MG model.

Table 10 The cross-sectional correlation between returns and ICC estimated with the MG model

Like Mohanram and Gode (2013), we estimate ICC at the share level. Moreover, we use the long-term growth (LTG) forecasts to estimate the three-year-ahead earnings forecasts and set the short-term growth (STG) forecast equal to LTG in the estimation of the OJ and MPEG methods when the two-year-ahead forecast is below the one-year-ahead one. Table 10 displays the results from cross-sectional tests of realized returns and ICC estimates based on the MG model.
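The two growth rules can be sketched as follows. The functional forms are assumptions made for illustration and are not stated in the text: the three-year-ahead forecast is assumed to compound the two-year-ahead forecast at LTG, and STG is assumed to be the growth rate from the one-year-ahead to the two-year-ahead forecast, with a positive one-year-ahead forecast. The function names are hypothetical.

```python
def three_year_forecast(eps2, ltg):
    """Assumed form: grow the two-year-ahead EPS forecast at the long-term growth rate."""
    return eps2 * (1.0 + ltg)

def short_term_growth(eps1, eps2, ltg):
    """Assumed form of STG with the fallback rule described in the text.

    If the two-year-ahead forecast is below the one-year-ahead forecast,
    STG is set equal to LTG; otherwise it is the implied growth rate.
    Assumes eps1 > 0.
    """
    if eps2 < eps1:
        return ltg
    return (eps2 - eps1) / eps1

# Hypothetical per-share forecasts
stg_growing = short_term_growth(eps1=2.0, eps2=2.5, ltg=0.10)    # implied growth is used
stg_declining = short_term_growth(eps1=2.0, eps2=1.8, ltg=0.10)  # falls back to LTG
eps3 = three_year_forecast(eps2=2.5, ltg=0.10)
```

The fallback avoids a negative short-term growth rate, for which the OJ and MPEG methods are generally not well defined.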

In Panel A of Table 10, we show the Fama-MacBeth regression. Of the five ICC approaches tested, two report a statistically significant relation with the cross-section of returns. Although the MG model reports a smaller number of significant coefficients compared to the CM (see Table 6), the model clearly shows a higher correlation with realized returns compared to the raw analysts’ forecasts. The results of the portfolio sorts based on the ICC (see Panel B of Table 10) estimated with the MG model also provide similar evidence. Only one of the long-short strategies based on the ICC from the MG model is statistically different from zero, while three strategies based on the CM display positive and significant returns. However, by comparing the MG model with the raw analysts’ forecasts, we can see that the MG model indeed leads to an improvement of the correlation between ICC and realized returns.

To sum up, although the tested adjustments improve forecast performance compared to their unadjusted implementations, we confirm that the CM still scores better in terms of accuracy, bias, ERC, and when looking at the correlation between ICC and realized returns.

7 Conclusion

In this study, we develop a new method to forecast corporate earnings. We build upon analysts’ earnings forecasts, which are known to be accurate, yet upwardly biased. To improve these analysts’ forecasts, we combine them with variables that have proven to be good predictors of earnings. First, we include gross profits, as Novy-Marx (2013) finds a strong association with earnings. Second, we follow Ashton and Wang (2012), who show that stock price changes drive earnings, by including recent stock market performance. This also mitigates the fact that analysts, on average, need longer to incorporate new information into their earnings forecasts than it takes the stock market to incorporate new information into the share price (Guay et al. 2011).

We compare our new approach, the CM, to several methods from the literature, namely raw analyst forecasts, the model by Hou et al. (2012), the earnings persistence model (Li and Mohanram 2014), and the residual income model (Li and Mohanram 2014). In addition, we add an alternative benchmark, the CSAF model, which is based on a cross-sectional regression including only analysts’ earnings forecasts as an input. We find that our CM has the lowest bias and highest accuracy among all the tested models. Regarding market expectations, we show that the CM also performs better than the other benchmark models. Furthermore, we compute the ICC based on the different earnings forecast models and find that the CM leads to ICC estimates that have the strongest association with subsequent realized stock returns. We also rule out the possibility that any cross-sectional model can perform as well as our CM by including analysts’ forecasts in the HVZ, EP, and RI models. We confirm that this improves their forecast performance, but they still trail behind our CM. Finally, we also compare our CM to the method from Mohanram and Gode (2013), which removes predictable forecast errors from analysts’ earnings forecasts, and find that, overall, the CM outperforms the forecasts based on the MG method.

This new method makes a strong case for combining two different approaches to forecast earnings: human forecasts made by financial analysts and cross-sectional forecasts based purely on financial data. These two approaches have distinct advantages and disadvantages. Analysts’ forecasts are known to be accurate, yet upwardly biased. On the other hand, cross-sectional forecasts are unbiased, but not as accurate. Combining them into one model mitigates both disadvantages while conserving the advantages.

Our findings are relevant for practitioners working with earnings forecasts, as well as academics employing earnings forecasts as inputs in valuation models, such as the ICC. We recommend the use of our CM to improve the accuracy and unbiasedness of earnings forecasts, which benefits methods that build on these forecasts and applications thereof.