
Economic persistence, earnings informativeness, and stock return regularities

Review of Accounting Studies


We propose a simple framework for understanding accounting-based stock return regularities. A firm’s accounting reports provide noisy information about hidden economic states that evolve according to a Markov process. In response to the accounting reports, a representative Bayesian investor forms beliefs about the underlying state and hence the value of the firm. For a population of such firms, the model provides predictions consistent with two sets of well-documented regularities: (i) the market reaction to an earnings announcement that ends a string of consecutive earnings increases and (ii) the return predictabilities based on accruals and book-tax differences. The model also yields novel cross-sectional predictions about the distinct roles of economic persistence and earnings informativeness. We confirm these predictions through empirical tests.



  1. Even though the law of large numbers stipulates that estimates based on large samples of certain transactions (such as warranty expenses) are predictable with certain precision, abundant other transactions are nondiversifiable in nature.

  2. In general, an information system is perfect if it is a permutation matrix, obtained by permuting the rows of an identity matrix, and contains exactly one entry of 1 in each row and each column and 0 elsewhere. Given the ordinal values of states and signals, we adopt a more restrictive definition.

  3. In a filtering problem, one tries to estimate the hidden states, based on observed data.

  4. Standard texts on economic dynamics discuss the linear filtering problem and the nonlinear filtering problem (e.g., the binary hidden Markov model examined in our study) at greater length. See Ljungqvist and Sargent (2004, pp. 1269–1273) or Miao (2014, pp. 251–260).

  5. To gain some insight into the belief updating, consider two extreme cases: uninformative earnings signals and perfectly informative earnings signals. With an uninformative reporting system (c + d = 1), there is still updating in \(\mu_{t}\), due to the nature of the Markov process,

    $$ \mu_{t}=(a+b-1)\mu_{t-1}+1-b. $$

    As long as 0 < a < 1 and 0 < b < 1, the two-state Markov process dictates that, even though the investor rationally ignores the signal, there is still updating of investor beliefs, which will converge to the stationary distribution in the long run,

    $$ \lim_{t\rightarrow\infty}\mu_{t}=\frac{1-b}{2-a-b}, $$

    and the stock price is equal to the unconditional expectation. With a perfect reporting system (c = d = 1), \(\mu_{t}\) will oscillate between 1 and 0, depending on the current signal.

  6. The notion that Bayesian learning implies decreasing weights on prior beliefs as the evidence accumulates is also illustrated by Chen et al. (2005). In a model of investor learning about analyst predictive ability, Chen et al. predict that, as the length of the analyst’s forecast record increases, investors put decreasing weights on prior beliefs and increasing weights on the record of the analyst’s forecast accuracy.

  7. The distinction between the Bayesian and frequentist perspectives has been discussed by Berger (1985) at greater length. In asset pricing studies, Lewellen and Shanken (2002) use this distinction to explain why parameter uncertainty may lead to return predictability, even when the agent within the model cannot perceive or exploit the predictability. An important implication of the distinction is that the hedge strategies perceived to be profitable by an empiricist cannot be exploited by the investor in the model, thereby perpetuating the existence of arbitrage opportunities.

  8. The number of periods is not intended to describe real-world firm ages. Rather, we simulate long histories to ensure the existence of a large number of earnings strings of various lengths, because the consistency of the simulation results relies on averaging a large number of firms with the same type of earnings patterns.

  9. The numbers of firm-quarters are 32,992, 15,057, 7,537, 9,832, 5,476, 3,457, 2,320, and 1,762, respectively, for a string of length 1, ..., 8.

  10. This measure, named “earnings fidelity” by Du et al. (2020), is publicly available at

  11. According to Christensen and Demski (2003), earnings and cash flows are “simply two different ways of doing the accounting, and both, in principle, are sources of information” (p. 128), and “the typical financial report contains accrual and cash basis renderings, two different partitions so to speak” (p. 115).

  12. Similarly, Barberis et al. (1998) study trading strategies from a frequentist perspective based on a one-firm model. Our analysis, like that of Barberis et al., assumes an exogenous pricing kernel, in the sense that there is a profitable strategy but the price does not adjust for the fact that investors should follow the strategy. We thank an anonymous reviewer for this observation.

  13. The top statutory corporate federal tax rate was 46 percent in 1980–1986, 40 percent in 1987, 34 percent in 1988–1992, and 35 percent in 1993–2015.

  14. Similarly, Li (2011) argues that the correlation between earnings and investment/employee growth has implications for earnings persistence.

  15. For simplicity, we do not consider the signal in the period before the start of the string. However, even if we do, the proof of \(F^{lh}(\mu_{t-1})\leq F^{lhh}(\mu_{t-1})\) is analogous.
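As a numerical illustration of note 5, the following minimal sketch iterates the uninformative-signal recursion and compares the result with the stationary probability. The parameter values and function names are hypothetical and chosen only for illustration.

```python
def predict(mu, a, b):
    """One-step belief update when the signal is uninformative (c + d = 1)."""
    return (a + b - 1.0) * mu + 1.0 - b


def stationary(a, b):
    """Long-run probability of the L state implied by the recursion."""
    return (1.0 - b) / (2.0 - a - b)


a, b = 0.8, 0.7  # hypothetical persistence parameters with 0 < a, b < 1
mu = 0.05        # arbitrary starting belief
for _ in range(200):
    mu = predict(mu, a, b)

print(mu, stationary(a, b))
```

Because |a + b - 1| < 1 under the stated parameter restrictions, the gap to the stationary value shrinks geometrically, whatever the starting belief.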


  • Adda, J., & Cooper, R. (2003). Dynamic economics: Quantitative methods and applications. Cambridge: The MIT Press.

  • Antle, R., Demski, J., & Ryan, S. (1994). Multiple sources of information, valuation, and accounting earnings. Journal of Accounting, Auditing and Finance, 9, 675–696.

  • Ayers, B. (1998). Deferred tax accounting under SFAS No. 109: An empirical investigation of its incremental value-relevance relative to APB No. 11. The Accounting Review, 73, 195–212.

  • Barberis, N., Shleifer, A., & Vishny, R. (1998). A model of investor sentiment. Journal of Financial Economics, 49, 307–345.

  • Barth, M., Elliott, J., & Finn, M. (1999). Market rewards associated with patterns of increasing earnings. Journal of Accounting Research, 37, 387–413.

  • Berger, J. (1985). Statistical decision theory and Bayesian analysis. New York: Springer.

  • Blackburne, T., & Blouin, J. (2017). Understanding the informativeness of book-tax differences. Working paper, University of Washington and University of Pennsylvania.

  • Blackwell, D. (1951). Comparison of experiments. In J. Neyman (Ed.), Proceedings of the second Berkeley symposium on mathematical statistics and probability. Berkeley: University of California Press.

  • Chen, Q., Francis, J., & Jiang, W. (2005). Investor learning about analyst predictive ability. Journal of Accounting and Economics, 39, 3–24.

  • Chi, S., Pincus, M., & Teoh, S. (2014). Mispricing of book-tax differences and the trading behavior of short sellers and insiders. The Accounting Review, 89, 511–543.

  • Christensen, J., & Demski, J. (2003). Accounting theory: An information content perspective. New York: McGraw-Hill/Irwin.

  • DeAngelo, H., DeAngelo, L., & Skinner, D. (1996). Reversal of fortune: Dividend signaling and the disappearance of sustained earnings growth. Journal of Financial Economics, 40, 341–371.

  • Dhaliwal, D., Lee, H., Pincus, M., & Steele, L. (2017). Taxable income and firm risk. Journal of the American Taxation Association, 39, 1–24.

  • Dechow, P., & Ge, W. (2006). The persistence of earnings and cash flows and the role of special items: Implications for the accrual anomaly. Review of Accounting Studies, 11, 253–296.

  • Dechow, P., Ge, W., & Schrand, C. (2010). Understanding earnings quality: A review of the proxies, their determinants and their consequences. Journal of Accounting and Economics, 50, 344–401.

  • Dichev, I., & Tang, V. (2008). Matching and the changing properties of accounting earnings over the last 40 years. The Accounting Review, 83, 1425–1460.

  • Du, K. (2019). Investor expectations, earnings management, and asset prices. Journal of Economic Dynamics and Control, 105, 134–157.

  • Du, K., Huddart, S., Xue, L., & Zhang, Y. (2020). Using a hidden Markov model to measure earnings quality. Journal of Accounting and Economics, in press.

  • Francis, J., LaFond, R., Olsson, P., & Schipper, K. (2004). Costs of equity and earnings attributes. The Accounting Review, 79, 967–1010.

  • Graham, J., Raedy, J., & Shackelford, D. (2012). Research in accounting for income taxes. Journal of Accounting and Economics, 53, 412–434.

  • Green, J., Hand, J., & Soliman, M. (2011). Going, going, gone? The apparent demise of the accruals anomaly. Management Science, 57, 797–816.

  • Hanlon, M., Laplante, S., & Shevlin, T. (2005). Evidence for the possible information loss of conforming book income and taxable income. Journal of Law and Economics, 48, 407–442.

  • Hemmer, T., & Labro, E. (2019). Management by the numbers: A formal approach to deriving informational and distributional properties of “un-managed” earnings. Journal of Accounting Research, 57, 5–51.

  • Hiemann, M. (2017). Earnings and firm value in the presence of real options. Working paper, Columbia Business School.

  • Jiang, X. (2016). Biases in accounting and nonaccounting information: Substitutes or complements? Journal of Accounting Research, 54, 1297–1330.

  • Kaserer, C., & Klingler, C. (2008). The accrual anomaly under different accounting standards: Lessons learned from the German experiment. Journal of Business Finance and Accounting, 35, 837–859.

  • Ke, B., Huddart, S., & Petroni, K. (2003). What insiders know about future earnings and how they use it: Evidence from insider trades. Journal of Accounting and Economics, 35, 315–346.

  • Kothari, S. (2001). Capital market research in accounting. Journal of Accounting and Economics, 31, 105–231.

  • Lewellen, J., & Shanken, J. (2002). Learning, asset-pricing tests, and market efficiency. Journal of Finance, 57, 1113–1145.

  • Li, F. (2011). Earnings informativeness based on corporate investment decisions. Journal of Accounting Research, 49, 721–752.

  • Ljungqvist, L., & Sargent, T. (2004). Recursive macroeconomic theory, 2nd edn. Cambridge: The MIT Press.

  • Marschak, J. (1971). Economics of information systems. Journal of the American Statistical Association, 66, 192–219.

  • Marschak, J., & Miyasawa, K. (1968). Economic comparability of information systems. International Economic Review, 9, 137–174.

  • Mashruwala, C., Rajgopal, S., & Shevlin, T. (2006). Why is the accrual anomaly not arbitraged away? The role of idiosyncratic risk and transaction costs. Journal of Accounting and Economics, 42, 3–33.

  • Miao, J. (2014). Economic dynamics in discrete time. Cambridge: The MIT Press.

  • Richardson, S., Sloan, R., Soliman, M., & Tuna, I. (2005). Accrual reliability, earnings persistence and stock prices. Journal of Accounting and Economics, 39, 437–485.

  • Richardson, S., Tuna, I., & Wysocki, P. (2010). Accounting anomalies and fundamental analysis: A review of recent research advances. Journal of Accounting and Economics, 50, 410–454.

  • Sloan, R. (1996). Do stock prices fully reflect information in accruals and cash flows about future earnings? The Accounting Review, 71, 289–315.

  • Spiceland, J., Sepe, J., & Nelson, M. (2013). Intermediate accounting, 7th edn. New York: McGraw-Hill Irwin.

  • Tauchen, G. (1986). Finite state Markov-chain approximations to univariate and vector autoregressions. Economics Letters, 20, 177–181.

  • Wu, J., Zhang, L., & Zhang, X. (2010). The q-theory approach to understanding the accrual anomaly. Journal of Accounting Research, 48, 177–223.

  • Zhang, X. (2007). Accruals, investment, and the accrual anomaly. The Accounting Review, 82, 1333–1363.

  • Zhou, F. (2017). Disclosure dynamics and investor learning. Working paper, University of Pennsylvania.


We are grateful to Stefan Reichelstein (the editor) and two anonymous reviewers for numerous helpful comments. We thank Jianhong Chen, Lei Dong, Pingyang Gao, Robert Göx, Jeremiah Green, Tina Huynh, Lawrence Jin, Rick Laux, Jonathan Lewellen, Stefan Lewellen, Jia Li, Pierre Liang, John Liechty, Lyndon Orton, Hong Qu, Jalal Sani, Jack Stecher, Mark Tippett (discussant), Biqin Xie, and workshop participants at Pennsylvania State University, University of Southern Denmark, Michigan State University, Carnegie Mellon University, University of Waterloo, University of Zurich, the 10th MEAFA Research Meeting at University of Sydney, and Munich School of Management for helpful comments. We especially acknowledge the input of Lingzhou Xue on an earlier project related to the paper. All errors are our own. Financial support from Penn State’s Smeal College of Business is gratefully acknowledged. An earlier version of this paper was titled “Reporting systems, investor learning, and stock return regularities”.

Author information


Corresponding author

Correspondence to Kai Du.


Appendix A: Proofs

Proof of Lemma 1.

The risk-neutral cum-dividend price is equal to the discounted expected future cash flows. Note that \(\frac {1}{1+\delta }<1\) ensures the convergence of \(I+\frac {1}{1+\delta }Q +\frac {1}{(1+\delta )^{2}}Q^{2}+...=(I-\frac {1}{1+\delta }Q)^{-1}\). Therefore we have

$$ p_{t}=(\mu_{t}, 1-\mu_{t})\left( I-\frac{1}{1+\delta}Q\right)^{-1}\bar{X} $$

where \(\bar {X}=(H, L)'\). This can be easily rewritten as

$$ p_{t}=\frac{1+\delta}{\delta(2-a-b+\delta)}\left( H(1-a+\delta)+L(1-b)-\delta(H-L)\mu_{t}\right). $$
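The closed form can be cross-checked against a direct solution of the state-by-state valuation equations. The sketch below is only illustrative: it assumes that states are ordered (L, H), that a and b are the probabilities of remaining in L and H respectively (as implied by the updating formula in Lemma 2), and that payoffs are listed in the same state order. All function names and parameter values are hypothetical.

```python
def price_closed_form(mu, a, b, H, L, delta):
    """Closed-form price from Lemma 1, with mu = Pr(state is L)."""
    scale = (1.0 + delta) / (delta * (2.0 - a - b + delta))
    return scale * (H * (1.0 - a + delta) + L * (1.0 - b) - delta * (H - L) * mu)


def price_by_state(a, b, H, L, delta):
    """Solve (I - beta*Q) p = X by Cramer's rule; returns (p_L, p_H).

    Assumed ordering: Q = [[a, 1-a], [1-b, b]] with states (L, H), X = (L, H).
    """
    beta = 1.0 / (1.0 + delta)
    m11, m12 = 1.0 - beta * a, -beta * (1.0 - a)
    m21, m22 = -beta * (1.0 - b), 1.0 - beta * b
    det = m11 * m22 - m12 * m21
    p_L = (m22 * L - m12 * H) / det
    p_H = (m11 * H - m21 * L) / det
    return p_L, p_H


a, b, H, L, delta = 0.8, 0.7, 2.0, 1.0, 0.1  # hypothetical values
mu = 0.3
p_L, p_H = price_by_state(a, b, H, L, delta)
print(mu * p_L + (1.0 - mu) * p_H, price_closed_form(mu, a, b, H, L, delta))
```

Under these assumptions the belief-weighted average of the two state prices coincides with the closed-form expression, and the price is linear and decreasing in the belief that the state is L.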

Proof of Lemma 2

By Bayes’s Rule,

$$ \begin{array}{@{}rcl@{}} \mu_{t}|_{y_{t}=l}&=&\text{Pr}(x_{t}=L|y_{t}=l,\boldsymbol{y}^{t-1})\\ &=&\frac{\text{Pr}(y_{t}=l|x_{t}=L)\text{Pr}(x_{t}=L|\boldsymbol{y}^{t-1})}{\text{Pr}(y_{t}=l|\boldsymbol{y}^{t-1})}\\ &=&\frac{{\sum}_{x_{t-1}=L,H}\text{Pr}(y_{t}=l|x_{t}=L)\text{Pr}(x_{t}=L|x_{t-1})\text{Pr}(x_{t-1}|\boldsymbol{y}^{t-1})}{{\sum}_{x_{t}=L,H}{\sum}_{x_{t-1}=L,H}\text{Pr}(y_{t}=l|x_{t})\text{Pr}(x_{t}|x_{t-1})\text{Pr}(x_{t-1}|\boldsymbol{y}^{t-1})}\\ &=&\frac{ca\mu_{t-1}+c(1-b)(1-\mu_{t-1})}{\left( ca\mu_{t-1}+c(1 - b)(1 - \mu_{t-1})\right) + \left( (1 - d)(1 - a)\mu_{t-1} + (1 - d)b(1-\mu_{t-1})\right)}\\ &=&\frac{c((a+b-1)\mu_{t-1}+1-b)}{(c+d-1)(a+b-1)\mu_{t-1}+(1-d)b+c(1-b)}. \end{array} $$

The expression for \(\mu _{t}|_{y_{t}=h}\) can be derived analogously. □
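The two updating functions can be checked against a direct application of Bayes's rule. This is a minimal sketch with hypothetical function names and parameter values; it takes c = Pr(y = l | x = L) and d = Pr(y = h | x = H), as the derivation above implies.

```python
def g_l(mu, a, b, c, d):
    """Posterior Pr(x_t = L) after a low signal, given prior belief mu."""
    k = a + b - 1.0
    return c * (k * mu + 1.0 - b) / ((c + d - 1.0) * k * mu + (1.0 - d) * b + c * (1.0 - b))


def g_h(mu, a, b, c, d):
    """Posterior Pr(x_t = L) after a high signal, given prior belief mu."""
    k = a + b - 1.0
    return (1.0 - c) * (k * mu + 1.0 - b) / ((1.0 - c - d) * k * mu + d * b + (1.0 - c) * (1.0 - b))


def bayes_direct(mu, a, b, c, d, signal):
    """Predict with the Markov chain, then update with the signal likelihoods."""
    prior_L = a * mu + (1.0 - b) * (1.0 - mu)
    prior_H = 1.0 - prior_L
    like_L = c if signal == "l" else 1.0 - c
    like_H = 1.0 - d if signal == "l" else d
    return like_L * prior_L / (like_L * prior_L + like_H * prior_H)


params = (0.8, 0.7, 0.9, 0.85)  # hypothetical (a, b, c, d)
for mu in (0.0, 0.25, 0.5, 0.75, 1.0):
    print(mu, g_l(mu, *params), g_h(mu, *params))
```

With c + d = 1 the signal carries no information, and both functions collapse to the pure Markov prediction (a + b - 1)μ + 1 - b.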

Proof of Lemma 3.

For notational brevity, let \(g_{l}(\mu)\) and \(g_{h}(\mu)\) denote the belief updating functions, given that the belief from the last period is \(\mu_{t}=\mu\):

$$ g_{l}(\mu)=\mu_{t+1}|_{y_{t+1}=l, \mu_{t}=\mu}, $$
$$ g_{h}(\mu)=\mu_{t+1}|_{y_{t+1}=h, \mu_{t}=\mu}. $$

Given \(a+b\geq 1\) and \(c+d\geq 1\), it is straightforward to show that both \(g_{l}(\mu)\) and \(g_{h}(\mu)\) are increasing functions of \(\mu\) on [0, 1]:

$$ g^{\prime}_{l}=\frac{c (1-d) (a+b-1)}{(c ((a - 1) \mu+1)+(a-1) (d - 1) \mu+b (\mu - 1) (c+d - 1))^{2}}\geq 0, $$
$$ g^{\prime}_{h}=\frac{(1-c) d (a+b-1)}{(a c \mu+a d \mu-a \mu+b (\mu - 1) (c+d - 1)-c \mu+c-d \mu+\mu - 1)^{2}}\geq 0. $$

Additionally, we can show that \(g_{l}(\mu)\) is a concave function on [0, 1], i.e.,

$$ g^{\prime\prime}_{l}=\frac{2 c (d-1) (a+b-1)^{2} (c+d-1)}{(c ((a-1) \mu+1)+(a-1) (d-1) \mu+b (\mu-1) (c+d-1))^{3}}\leq 0; $$

and that \(g_{h}(\mu)\) is a convex function on [0, 1], i.e.,

$$ g^{\prime\prime}_{h}(\mu)=\frac{2 (c-1) d (a+b-1)^{2} (c+d-1)}{(a c \mu+a d \mu - a \mu+b (\mu - 1) (c+d - 1) - c \mu+c - d \mu+\mu-1)^{3}}\geq 0. $$

It is also easy to show that \(g_{l}(\mu)\geq\mu\) is equivalent to

$$ 0\leq\mu\leq\bar{\mu} $$

where \(\bar{\mu}=\frac{1}{2}\left(\sqrt{\frac{c\left(a^{2}c+4a(d-1)-4d+4\right)-2(a-2)bc(d-1)+b^{2}(d-1)^{2}}{(a+b-1)^{2}(c+d-1)^{2}}}+\frac{(a-2)c+b(2c+d-1)}{(a+b-1)(c+d-1)}\right)\). Similarly, \(g_{h}(\mu)\leq\mu\) is equivalent to

$$ \underline{\mu}\leq\mu\leq 1 $$

where \(\underline{\mu}=\frac{1}{2}\left(\frac{a(c-1)+b(2c+d-2)-2c+2}{(a+b-1)(c+d-1)}-\sqrt{\frac{a^{2}(c-1)^{2}-2a(b-2)(c-1)d+d\left(b^{2}d+4b(c-1)-4c+4\right)}{(a+b-1)^{2}(c+d-1)^{2}}}\right)\). It is also easy to verify that \(\underline{\mu}\leq\bar{\mu}\) holds. Therefore, for any \(\mu\in[\underline{\mu},\bar{\mu}]\), we have \(\mu\leq g_{l}(\mu)\leq\bar{\mu}\) and \(\underline{\mu}\leq g_{h}(\mu)\leq\mu\). In other words, if we start from \(\mu_{t}\in[\underline{\mu},\bar{\mu}]\), \(\mu_{t+1}\) will always stay within the range \([\underline{\mu},\bar{\mu}]\). For any \(\mu\in[0,\underline{\mu})\), we have \(g_{l}(\mu)>\mu\) and \(g_{h}(\mu)>\mu\). Thus, if we start from \(\mu_{t}\in[0,\underline{\mu})\), \(\mu_{t+1}\) will eventually rise to the range \([\underline{\mu},\bar{\mu}]\). For any \(\mu\in(\bar{\mu},1]\), we have \(g_{l}(\mu)<\mu\) and \(g_{h}(\mu)<\mu\). In other words, if we start from \(\mu_{t}\in(\bar{\mu},1]\), \(\mu_{t+1}\) will eventually fall to the interval \([\underline{\mu},\bar{\mu}]\).

By checking the signs of the derivatives with respect to parameters of the reporting system, it is easy to verify the following properties of the bounds: (i) \(\frac {\partial \underline {\mu }}{\partial a}\geq 0\), \(\frac {\partial \underline {\mu }}{\partial b}\leq 0\), \(\frac {\partial \underline {\mu }}{\partial c}\leq 0\), and \(\frac {\partial \underline {\mu }}{\partial d}\leq 0\); (ii) \(\frac {\partial \bar {\mu }}{\partial a}\geq 0\), \(\frac {\partial \bar {\mu }}{\partial b}\leq 0\), \(\frac {\partial \bar {\mu }}{\partial c}\geq 0\), and \(\frac {\partial \bar {\mu }}{\partial d}\geq 0\).

Ceteris paribus, if the L state is more persistent, an investor’s probabilistic assessment of the state being indeed L is higher (signified by higher lower and upper bounds); if the H state is more persistent, the opposite is true. A more informative reporting system will enlarge the range of possible investor beliefs, because more informative signals induce more substantive revisions. □
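The bounds can be illustrated numerically. The sketch below evaluates the closed-form expressions for the lower and upper bounds and compares them with the limits obtained by repeatedly applying the high-signal and low-signal updates. Parameter values and function names are hypothetical, and the closed forms as coded assume (a + b - 1)(c + d - 1) > 0.

```python
import math


def g_l(mu, a, b, c, d):
    k = a + b - 1.0
    return c * (k * mu + 1.0 - b) / ((c + d - 1.0) * k * mu + (1.0 - d) * b + c * (1.0 - b))


def g_h(mu, a, b, c, d):
    k = a + b - 1.0
    return (1.0 - c) * (k * mu + 1.0 - b) / ((1.0 - c - d) * k * mu + d * b + (1.0 - c) * (1.0 - b))


def mu_upper(a, b, c, d):
    """Upper bound on beliefs: the fixed point of g_l in [0, 1]."""
    ks = (a + b - 1.0) * (c + d - 1.0)
    disc = (c * (a * a * c + 4 * a * (d - 1) - 4 * d + 4)
            - 2 * (a - 2) * b * c * (d - 1) + b * b * (d - 1) ** 2)
    return 0.5 * (math.sqrt(disc) / ks + ((a - 2) * c + b * (2 * c + d - 1)) / ks)


def mu_lower(a, b, c, d):
    """Lower bound on beliefs: the fixed point of g_h in [0, 1]."""
    ks = (a + b - 1.0) * (c + d - 1.0)
    disc = (a * a * (c - 1) ** 2 - 2 * a * (b - 2) * (c - 1) * d
            + d * (b * b * d + 4 * b * (c - 1) - 4 * c + 4))
    return 0.5 * ((a * (c - 1) + b * (2 * c + d - 2) - 2 * c + 2) / ks - math.sqrt(disc) / ks)


params = (0.8, 0.7, 0.9, 0.85)  # hypothetical (a, b, c, d)
hi = lo = 0.5
for _ in range(500):
    hi = g_l(hi, *params)  # an unbroken run of low signals pushes beliefs up
    lo = g_h(lo, *params)  # an unbroken run of high signals pushes beliefs down

print(lo, mu_lower(*params))
print(hi, mu_upper(*params))
```

The iterated limits agree with the closed-form bounds for these values, which is one way to sanity-check a transcription of the formulas before using them elsewhere.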

Proof of Proposition 1.

Suppose the earnings string is of length \(s=1\), i.e., \(y_{t}=l\), \(y_{t-1}=h\). The conditional expectation of a price change in a large sample of firms observed by an empiricist is

$$ \begin{array}{@{}rcl@{}} {E_{t}^{f}}[p_{t} - p_{t-1}|y_{t} = l, y_{t-1} = h]&=&-\frac{1+\delta}{2-a-b+\delta}(H-L){E_{t}^{f}}[\mu_{t}-\mu_{t-1}|y_{t}=l, y_{t-1}=h]\\ &=&-\frac{1+\delta}{2 - a - b + \delta}(H - L){\int}_{\mu_{t-1}}\!\left( g_{l}(\mu_{t-1})-\mu_{t-1}\right)dF(\mu_{t-1}|y_{t-1}=h)\\ &\leq& 0, \end{array} $$

because \(g_{l}(\mu_{t-1})\geq\mu_{t-1}\) for \(\mu_{t-1}\in[\underline{\mu},\bar{\mu}]\). Suppose the earnings string is of length \(s=2\), i.e., \(y_{t}=l\), \(y_{t-1}=y_{t-2}=h\). The conditional expectation of a price change observed by an empiricist is

$$ \begin{array}{@{}rcl@{}} &&{E_{t}^{f}}[p_{t}-p_{t-1}|y_{t}=l, y_{t-1}=y_{t-2}=h]\\&=&-\frac{1+\delta}{2-a-b+\delta}(H-L){E_{t}^{f}}[\mu_{t}-\mu_{t-1}|y_{t}=l, y_{t-1}=y_{t-2}=h]\\ &=&-\frac{1+\delta}{2-a-b+\delta}(H-L){\int}_{\mu_{t-1}}\left( g_{l}(\mu_{t-1})-\mu_{t-1}\right)dF(\mu_{t-1}|y_{t-1}=y_{t-2}=h)\\ &\leq& 0. \end{array} $$

Now we would like to find conditions for

$$ {E_{t}^{f}}[p_{t}-p_{t-1}|y_{t}=l, y_{t-1}=h]\geq {E_{t}^{f}}[p_{t}-p_{t-1}|y_{t}=l, y_{t-1}=y_{t-2}=h]. $$

This is equivalent to

$$ \begin{array}{@{}rcl@{}} &&{\int}_{\mu_{t-1}}\left( g_{l}(\mu_{t-1})-\mu_{t-1}\right)dF(\mu_{t-1}|y_{t-1}=h)\\&\leq& {\int}_{\mu_{t-1}}\left( g_{l}(\mu_{t-1})-\mu_{t-1}\right)dF(\mu_{t-1}|y_{t-1}=y_{t-2}=h). \end{array} $$

Define \(F^{h}(\mu_{t-1})=F(\mu_{t-1}|y_{t-1}=h)\) and \(F^{hh}(\mu_{t-1})=F(\mu_{t-1}|y_{t-1}=y_{t-2}=h)\). By integration by parts, the left-hand side of Eq. A.16 is

$$ \begin{array}{@{}rcl@{}} &&{\int}_{\mu_{t-1}}\left( g_{l}(\mu_{t-1})-\mu_{t-1}\right)dF(\mu_{t-1}|y_{t-1}=h)\\ &=&\underbrace{\left( g_{l}(\bar{\mu}) - \bar{\mu}\right)}_{=0}\underbrace{F^{h}(\bar{\mu})}_{=1}- \left( g_{l}(\underline{\mu}) - \underline{\mu}\right)\underbrace{F^{h}(\underline{\mu})}_{=0}-{\int}_{\underline{\mu}}^{\bar{\mu}}F^{h}(\mu_{t-1})\left( g^{\prime}_{l}(\mu_{t-1})-1\right)d\mu_{t-1}\\ &=&-{\int}_{\underline{\mu}}^{\bar{\mu}}\left( g^{\prime}_{l}(\mu_{t-1})-1\right)F^{h}(\mu_{t-1})d\mu_{t-1}. \end{array} $$

Similarly, the right-hand side is

$$ \begin{array}{@{}rcl@{}} &&{\int}_{\mu_{t-1}}\left( g_{l}(\mu_{t-1})-\mu_{t-1}\right)dF(\mu_{t-1}|y_{t-1}=y_{t-2}=h)\\ &=&-{\int}_{\underline{\mu}}^{\bar{\mu}}\left( g^{\prime}_{l}(\mu_{t-1})-1\right)F^{hh}(\mu_{t-1})d\mu_{t-1}. \end{array} $$

Therefore Eq. A.16 becomes

$$ {\int}_{\underline{\mu}}^{\bar{\mu}}\left( g^{\prime}_{l}(\mu_{t-1})-1\right)F^{h}(\mu_{t-1})d\mu_{t-1}\geq {\int}_{\underline{\mu}}^{\bar{\mu}}\left( g^{\prime}_{l}(\mu_{t-1})-1\right)F^{hh}(\mu_{t-1})d\mu_{t-1}. $$

A sufficient condition for Eq. A.19 is

$$ \left( g^{\prime}_{l}(\mu_{t-1})-1\right)F^{h}(\mu_{t-1})\geq \left( g^{\prime}_{l}(\mu_{t-1})-1\right)F^{hh}(\mu_{t-1}) $$

\(\forall \mu_{t-1}\). In the following, we first prove (see footnote 15)

$$ F^{h}(\mu_{t-1})\leq F^{hh}(\mu_{t-1}). $$

By definition, for a given \(\mu_{t-3}\),

$$ \begin{array}{@{}rcl@{}} F^{h}(\mu_{t-1})&=&F(\mu_{t-1}|y_{t-1}=h)=\text{Pr}(g_{h}(\mu_{t-2})\leq \mu_{t-1})=\text{Pr}(\mu_{t-2}\leq g_{h}^{-1}(\mu_{t-1}))\\ &=&\lambda\text{Pr}(g_{l}(\mu_{t-3})\leq g_{h}^{-1}(\mu_{t-1}))+(1-\lambda)\text{Pr}(g_{h}(\mu_{t-3})\leq g_{h}^{-1}(\mu_{t-1}))\\ \end{array} $$

where \(\lambda=\text{Pr}(y_{t-2}=l)\);

$$ \begin{array}{@{}rcl@{}} F^{hh}(\mu_{t-1})=&F(\mu_{t-1}|y_{t-1}=y_{t-2}=h)=\text{Pr}(g_{h}(g_{h}(\mu_{t-3}))\leq \mu_{t-1}) \\ =&\text{Pr}(g_{h}(\mu_{t-3}) \leq g_{h}^{-1}(\mu_{t-1})). \end{array} $$

Because \(g_{l}(\mu_{t-3})\geq g_{h}(\mu_{t-3})\), \(\forall\mu_{t-3}\), it follows that

$$ \text{Pr}(g_{l}(\mu_{t-3})\leq g_{h}^{-1}(\mu_{t-1}))\leq \text{Pr}(g_{h}(\mu_{t-3})\leq g_{h}^{-1}(\mu_{t-1})), $$

and therefore

$$ F^{h}(\mu_{t-1})\leq F^{hh}(\mu_{t-1}) $$

\(\forall \mu_{t-1}\in[\underline{\mu},\bar{\mu}]\). In other words, \(F^{h}\) first-order stochastically dominates (f.o.s.d.) \(F^{hh}\). Therefore a sufficient condition for Eq. A.20 to hold is

$$ g^{\prime}_{l}(\mu_{t-1})-1\leq 0. $$

\(\forall \mu_{t-1}\). It is easy to show that Eq. A.26 holds as long as any of the following three conditions (in addition to Conditions 1 and 2) is satisfied: (1) \(0\leq b\leq b^{*}\), where \(b^{*}=\frac{c\left((2c+d-1)-\sqrt{(1-d)(4c+3d-3)}\right)}{2(c+d-1)^{2}}\); (2) \(b^{*}\leq b\leq 1\) and \(0\leq a\leq a^{*}\), where \(a^{*}=\frac{b^{2}(c+d-1)^{2}-bc(2c+d-1)+c(c-d+1)}{c(1-d)}\); (3) \(b^{*}\leq b\leq 1\), \(a^{*}<a\leq 1\), and \(\mu_{t-1}\geq\mu^{\dag}\), where \(\mu^{\dag}=\frac{b(c+d-1)-c}{(a+b-1)(c+d-1)}+\sqrt{\frac{c(1-d)}{(a+b-1)(c+d-1)^{2}}}\).

Now let us consider longer earnings strings. Note that, based on the proof of Eq. A.25, analogous arguments for longer sequences of h signals can be proved by induction. For example, \(F^{hh}\) first-order stochastically dominates \(F^{hhh}\), and so forth. As the earnings string becomes infinitely long, we have

$$ \lim_{s\rightarrow \infty}\mu_{t-1}=\underline{\mu} $$

and

$$ \lim_{s\rightarrow \infty}F^{hh...h}(\mu_{t-1})=1. $$

As a result,

$$ \begin{array}{@{}rcl@{}} &&\lim_{s\rightarrow \infty}{E_{t}^{f}}[p_{t} - p_{t-1}|y_{t} = l, y_{t-1} = ... = y_{t-s} = h] = {E_{t}^{f}}[p_{t} - p_{t-1}|y_{t}\!=l, \mu_{t-1} = \underline{\mu}]\\ &&=-\frac{1+\delta}{2 - a - b + \delta}(H - L)\left( \!\frac{c(a+b-1)\underline{\mu}+c(1-b)}{(c + d - 1)(a\!+b - 1)\underline{\mu}+(1 - d)b+c(1 - b)}-\underline{\mu}\right).\\ \end{array} $$

Analogously, if an earnings string is defined as consecutive low (l) signals, the average market reaction to breaks in earnings strings is positive, and, under certain conditions, the magnitude of the average market reaction becomes greater as the string of l signals becomes longer. In the limit, we have

$$ \begin{array}{@{}rcl@{}} &\underset{s\rightarrow \infty}{\lim}{E_{t}^{f}}[p_{t}-p_{t-1}|y_{t}=h, y_{t-1}=...=y_{t-s}=l]\\ =&-\frac{1+\delta}{2-a-b+\delta}(H-L)\left( \frac{(1-c)(a+b-1)\bar{\mu}+(1-c)(1-b)}{(1-c-d)(a+b-1)\bar{\mu}+db+(1-c)(1-b)}-\bar{\mu}\right). \end{array} $$
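The sign prediction of Proposition 1 can be illustrated with a small Monte Carlo experiment. The following is a sketch under stated assumptions rather than a reproduction of the simulation reported in the paper: it simulates a single long history, updates beliefs with the Lemma 2 rule, and averages the price change at each break of a string of h signals by string length. Function names, the seed, and all parameter values are hypothetical.

```python
import random
from collections import defaultdict


def update(mu, signal, a, b, c, d):
    """Bayesian update of Pr(state is L) after observing signal 'l' or 'h'."""
    prior_L = a * mu + (1.0 - b) * (1.0 - mu)
    like_L = c if signal == "l" else 1.0 - c
    like_H = 1.0 - d if signal == "l" else d
    return like_L * prior_L / (like_L * prior_L + like_H * (1.0 - prior_L))


def simulate_breaks(a, b, c, d, H, L, delta, periods=200_000, seed=0):
    """Average price change at the break of an h-string, keyed by string length."""
    rng = random.Random(seed)
    scale = (1.0 + delta) / (2.0 - a - b + delta) * (H - L)  # price slope from Lemma 1
    state_is_L = False
    mu = (1.0 - b) / (2.0 - a - b)  # start beliefs at the stationary probability
    run = 0                          # current number of consecutive h signals
    changes = defaultdict(list)
    for _ in range(periods):
        stay = a if state_is_L else b
        if rng.random() > stay:
            state_is_L = not state_is_L
        prob_low = c if state_is_L else 1.0 - d
        signal = "l" if rng.random() < prob_low else "h"
        new_mu = update(mu, signal, a, b, c, d)
        if signal == "l" and run > 0:
            changes[run].append(-scale * (new_mu - mu))
        run = run + 1 if signal == "h" else 0
        mu = new_mu
    return {s: sum(v) / len(v) for s, v in sorted(changes.items())}


averages = simulate_breaks(0.8, 0.7, 0.9, 0.85, H=2.0, L=1.0, delta=0.1,
                           periods=50_000, seed=1)
for length, avg in averages.items():
    print(length, avg)
```

Every recorded reaction is non-positive because a low signal can only raise the belief that the state is L once beliefs lie inside the bounds of Lemma 3. Whether the average reaction grows in magnitude with string length depends on the parameter conditions stated above, so the sketch makes no claim on that point.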

Proof of Lemma 4

We first prove the belief-updating functions. Given the investor's belief in the previous period (\(\mu_{t-1}\)) and the signals (\(y_{t}\) and \(z_{t}\)), the belief at the end of period t is given by \(\mu_{t}|_{y_{t},z_{t}}\). Consider the case of \(\mu_{t}|_{y_{t}=l,z_{t}=l}\). By Bayes’s Rule,

$$ \begin{array}{@{}rcl@{}} &&\mu_{t}|_{y_{t}=l,z_{t}=l}\\ &=&\text{Pr}(x_{t}=L|y_{t}=l,z_{t}=l,\boldsymbol{y}^{t-1},\boldsymbol{z}^{t-1})\\ &=&\frac{\text{Pr}(y_{t}=l,z_{t}=l|x_{t}=L)\text{Pr}(x_{t}=L|\boldsymbol{y}^{t-1},\boldsymbol{z}^{t-1})}{\text{Pr}(y_{t}=l,z_{t}=l|\boldsymbol{y}^{t-1},\boldsymbol{z}^{t-1})}\\ &=&\frac{{\sum}_{x_{t-1}=L,H}\text{Pr}(y_{t}=l,z_{t}=l|x_{t}=L)\text{Pr}(x_{t}=L|x_{t-1})\text{Pr}(x_{t-1}|\boldsymbol{y}^{t-1},\boldsymbol{z}^{t-1})}{{\sum}_{x_{t}=L,H}{\sum}_{x_{t-1}=L,H}\text{Pr}(y_{t}=l,z_{t}=l|x_{t})\text{Pr}(x_{t}|x_{t-1})\text{Pr}(x_{t-1}|\boldsymbol{y}^{t-1},\boldsymbol{z}^{t-1})}\\ &=&\frac{c_{1}c_{2}a\mu_{t-1}+c_{1}c_{2}(1-b)(1-\mu_{t-1})}{\left( c_{1}c_{2}a\mu_{t-1} + c_{1}c_{2}(1 - b)(1 - \mu_{t-1})\right) + \left( (1 - d_{1})(1 - d_{2})(1 - a)\mu_{t-1} + (1 - d_{1})(1 - d_{2})b(1 - \mu_{t-1})\right)}\\ &=&\frac{c_{1}c_{2}((a+b-1)\mu_{t-1}+1-b)}{c_{1}c_{2}((a+b-1)\mu_{t-1}+1-b)+(1-d_{1})(1-d_{2})((1-a-b)\mu_{t-1}+b)}. \end{array} $$

The other three scenarios, \(\mu _{t}|_{y_{t}=l,z_{t}=h}\), \(\mu _{t}|_{y_{t}=h,z_{t}=l}\), and \(\mu _{t}|_{y_{t}=h,z_{t}=h}\), can be derived analogously. The four functions are summarized below:

$$ \begin{array}{@{}rcl@{}} \mu_{t}|_{y_{t}=l, z_{t}=l}&=\frac{c_{1}c_{2}((a+b-1)\mu_{t-1}+1-b)}{c_{1}c_{2}((a+b-1)\mu_{t-1}+1-b)+(1-d_{1})(1-d_{2})((1-a-b)\mu_{t-1}+b)}, \end{array} $$
$$ \begin{array}{@{}rcl@{}} \mu_{t}|_{y_{t}=l, z_{t}=h}&=\frac{c_{1}(1-c_{2})((a+b-1)\mu_{t-1}+1-b)}{c_{1}(1-c_{2})((a+b-1)\mu_{t-1}+1-b)+(1-d_{1})d_{2}((1-a-b)\mu_{t-1}+b)}, \end{array} $$
$$ \begin{array}{@{}rcl@{}} \mu_{t}|_{y_{t}=h, z_{t}=l}&=\frac{(1-c_{1})c_{2}((a+b-1)\mu_{t-1}+1-b)}{(1-c_{1})c_{2}((a+b-1)\mu_{t-1}+1-b)+d_{1}(1-d_{2})((1-a-b)\mu_{t-1}+b)}, \end{array} $$
$$ \begin{array}{@{}rcl@{}} \mu_{t}|_{y_{t}=h, z_{t}=h}&=\frac{(1-c_{1})(1-c_{2})((a+b-1)\mu_{t-1}+1-b)}{(1-c_{1})(1-c_{2})((a+b-1)\mu_{t-1}+1-b)+d_{1}d_{2}((1-a-b)\mu_{t-1}+b)}. \end{array} $$

Define \(g_{ij}(\mu)\equiv\mu_{t+1}|_{y_{t+1}=i, z_{t+1}=j, \mu_{t}=\mu}\), where \(i,j\in\{l,h\}\). It is easy to show that, for \((a,b,c_{1},d_{1},c_{2},d_{2})\) such that \(a,b,c_{1},d_{1},c_{2},d_{2}\in(0,1)\), \(a+b\geq 1\), \(c_{1}+d_{1}\geq 1\), and \(c_{2}+d_{2}\geq 1\), the following two inequalities always hold:

$$ g_{hh}(\mu)\leq g_{lh}(\mu)\leq g_{ll}(\mu), $$
$$ g_{hh}(\mu)\leq g_{hl}(\mu)\leq g_{ll}(\mu). $$

Therefore, to bound the value of \(\mu_{t}\), we only need to consider the bounds of \(g_{ll}(\mu)\) and \(g_{hh}(\mu)\). We can show that \(g_{ll}(\mu)\geq\mu\) is equivalent to

$$ 0\leq \mu \leq \mu^{**} $$

where \(\mu^{**}\) is given by

$$ \begin{array}{@{}rcl@{}} \mu^{**} &=& \frac{1}{2} \left( \sqrt{\frac{a^{2} {c}_{1}^{2} {c}_{2}^{2}+2((a-2)b+2(1-a))c_{1}c_{2}(d_{1}-1) (d_{2}-1)+b^2 (d_{1}-1)^{2} (d_{2}-1)^2}{(a+b-1)^2 (c_{1}c_{2}-d_{1} d_{2} + d_{1}+d_{2}-1)^2}}\right.\\ &&\left. +\frac{(a-2) c_{1}c_{2}+b (2 c_{1}c_{2}-d_{1} d_{2}+d_{1}+d_{2}-1)}{(a+b-1) (c_{1} c_{2}-d_{1} d_{2}+d_{1}+d_{2}-1)}\right), \end{array} $$

and that \(g_{hh}(\mu)\leq\mu\) is equivalent to

$$ \mu^{*}\leq \mu \leq 1 $$

where \(\mu^{*}\) is given by

$$ \begin{array}{@{}rcl@{}} \mu^{*} &=& \frac{1}{2} \left( \frac{(a-2) (c_{1}-1) (c_{2}-1)+b (2 c_{1} (c_{2}-1)-2 c_{2}-d_{1} d_{2}+2)}{(a+b-1) (c_{1}c_{2}-c_{1}-c_{2}-d_{1} d_{2}+1)}\right.\\ &&\left. -\sqrt{\frac{a^2(c_{1}-1)^2 (c_{2}-1)^2+d_{1} d_{2} \left( b^2 d_{1} d_{2}+(2 a (b-2)+4(1-b)) (c_{1}-1) (c_{2}-1)\right)}{(a+b-1)^2 (c_{1}c_{2}-c_{1}-c_{2}-d_{1} d_{2}+1)^2}}\right).\\ \end{array} $$

The conditions \(g_{ll}(\mu)\geq\mu\) and \(g_{hh}(\mu)\leq\mu\) ensure that, if \(y_{t+1}=l\) and \(z_{t+1}=l\), then \(\mu_{t+1}\geq\mu_{t}\); if \(y_{t+1}=h\) and \(z_{t+1}=h\), then \(\mu_{t+1}\leq\mu_{t}\). It is also easy to show that the second-order conditions for concavity/convexity are met. For any \(a,b,c_{1},d_{1},c_{2},d_{2}\) such that \(a+b\geq 1\), \(c_{1}+d_{1}\geq 1\), and \(c_{2}+d_{2}\geq 1\), we have

$$ \begin{array}{@{}rcl@{}} {g}_{ll}^{\prime\prime}(\mu) \!\!&=&\!\! -\frac{2 c_{1} c_{2} (d_{1} - 1) (d_{2} - 1) (a + b - 1)^{2} (c_{1} c_{2} - d_{1} d_{2} + d_{1} + d_{2} - 1)}{\left( c_{1} c_{2} ((a - 1) \mu + 1) - (a - 1) (d_{1} - 1) (d_{2} - 1) \mu + b (\mu - 1) (c_{1} c_{2} - d_{1} d_{2} + d_{1} + d_{2} - 1)\right)^3}\!\leq\! 0,\qquad\qquad \end{array} $$
$$ \begin{array}{@{}rcl@{}} {g}_{hh}^{\prime\prime} (\mu) \!\!&=&\!\! \frac{2 (c_{1} - 1) (c_{2} - 1) d_{1} d_{2} (a + b - 1)^2 (c_{1} + c_{2} - c_{1} c_{2} + d_{1} d_{2} - 1)}{\left( c_{1} (c_{2} - 1) ((a - 1) \mu + 1) + (1 - a)(c_{2}\mu + d_{1} d_{2}\mu) + b (\mu - 1) (c_{1}c_{2} - c_{1} - c_{2} - d_{1} d_{2} + 1) - c_{2} + (a - 1) \mu + 1\right)^3} \\ &\geq&\!\! 0. \end{array} $$

The rest of the proof is analogous to that of Lemma 3.

By checking the signs of the derivatives, it is easy to verify the following properties of the bounds: (i) \(\frac {\partial \mu ^{*}}{\partial a}\geq 0\), \(\frac {\partial \mu ^{*}}{\partial b}\leq 0\), \(\frac {\partial \mu ^{*}}{\partial c_{j}}\leq 0\), and \(\frac {\partial \mu ^{*}}{\partial d_{j}}\leq 0\) for j = 1, 2; (ii) \(\frac {\partial \mu ^{**}}{\partial a}\geq 0\), \(\frac {\partial \mu ^{**}}{\partial b}\leq 0\), \(\frac {\partial \mu ^{**}}{\partial c_{j}}\geq 0\), and \(\frac {\partial \mu ^{**}}{\partial d_{j}}\geq 0\) for j = 1, 2. □
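The four two-signal updating functions share one structure, which the following sketch makes explicit and uses to spot-check the ordering \(g_{hh}\leq g_{lh}\leq g_{ll}\) and \(g_{hh}\leq g_{hl}\leq g_{ll}\) on a grid of beliefs. The function name `g2` and the parameter values are hypothetical.

```python
def g2(mu, y, z, a, b, c1, d1, c2, d2):
    """Posterior Pr(x = L) after signals y and z (each 'l' or 'h'), given prior belief mu."""
    pred_L = (a + b - 1.0) * mu + 1.0 - b  # predicted Pr(x = L) before the signals
    pred_H = (1.0 - a - b) * mu + b        # predicted Pr(x = H) before the signals
    like_L = (c1 if y == "l" else 1.0 - c1) * (c2 if z == "l" else 1.0 - c2)
    like_H = (1.0 - d1 if y == "l" else d1) * (1.0 - d2 if z == "l" else d2)
    return like_L * pred_L / (like_L * pred_L + like_H * pred_H)


params = (0.8, 0.7, 0.9, 0.85, 0.7, 0.75)  # hypothetical (a, b, c1, d1, c2, d2)
for i in range(11):
    mu = i / 10.0
    ll, lh = g2(mu, "l", "l", *params), g2(mu, "l", "h", *params)
    hl, hh = g2(mu, "h", "l", *params), g2(mu, "h", "h", *params)
    print(mu, hh, hl, lh, ll)
```

The relative position of the two mixed cases (l, h) and (h, l) is not pinned down by these inequalities; it depends on which signal is more informative.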

Proof of Proposition 2

The hedge return obtained by trading on the two signals is

$$ {E_{t}^{f}}(p_{t+1}-p_{t}|y_{t}=h, z_{t}=l)-{E_{t}^{f}}(p_{t+1}-p_{t}|y_{t}=l, z_{t}=h). $$

Note that

$$ {E_{t}^{f}}(p_{t+1}-p_{t}|\mu_{t}=\mu)=\frac{1+\delta}{2-a-b+\delta}(H-L)\left( (2-a-b)\mu -1+b\right)\equiv \phi(\mu), $$

where \(\phi(\mu)\) is a linear increasing function of \(\mu\),

$$ \phi^{\prime}(\mu)=\frac{(1+\delta)(2-a-b)}{2-a-b+\delta}(H-L)\geq 0. $$


Therefore,

$$ \begin{array}{@{}rcl@{}} && {E_{t}^{f}}(p_{t+1}-p_{t}|y_{t}=h, z_{t}=l)-{E_{t}^{f}}(p_{t+1}-p_{t}|y_{t}=l, z_{t}=h)\\ &=& \frac{(1+\delta)(2-a-b)}{2-a-b+\delta}(H-L)\left( {\int}_{\mu}\mu dF(\mu|y_{t}=h, z_{t}=l)-{\int}_{\mu}\mu dF(\mu|y_{t}=l, z_{t}=h)\right). \end{array} $$

Recall that

$$ \mu_{t}|_{y_{t}=h,z_{t}=l}=\frac{(1-c_{1})c_{2}((a+b-1)\mu_{t-1}+1-b)}{(1 - c_{1})c_{2}((a + b - 1)\mu_{t-1} + 1 - b) + d_{1}(1 - d_{2})((1 - a - b)\mu_{t-1} + b)}, $$
$$ \mu_{t}|_{y_{t}=l, z_{t}=h}=\frac{c_{1}(1-c_{2})((a+b-1)\mu_{t-1}+1-b)}{c_{1}(1 - c_{2})((a + b - 1)\mu_{t-1} + 1 - b) + (1 - d_{1})d_{2}((1 - a - b)\mu_{t-1} + b)}. $$

The key to the return predictability is the fact that the distributions of \(\mu_{t}\), conditional on different combinations of signals, differ. Note that, with probability 1, \(\mu_{t}\in[\mu^{*},\mu^{**}]\). Therefore evaluating the integral over the full support [0, 1] is equivalent to evaluating it over \([\mu^{*},\mu^{**}]\). Denote \(F^{hl}(\mu_{t})=F(\mu_{t}|y_{t}=h,z_{t}=l)\) and \(F^{lh}(\mu_{t})=F(\mu_{t}|y_{t}=l,z_{t}=h)\). Then \(F^{hl}(\mu^{*})=F^{lh}(\mu^{*})=0\) and \(F^{hl}(\mu^{**})=F^{lh}(\mu^{**})=1\). By integration by parts, we have

$$ \begin{array}{@{}rcl@{}} {\int}_{\mu_{t}}\mu_{t}dF^{hl}(\mu_{t})&=&{\int}_{\mu^{*}}^{\mu^{**}}\mu_{t}dF^{hl}(\mu_{t})\\ &=&\mu^{**}\cdot F^{hl}(\mu^{**})-\mu^{*}\cdot F^{hl}(\mu^{*})-{\int}_{\mu^{*}}^{\mu^{**}}F^{hl}(\mu_{t})d\mu_{t}\\ &=&\mu^{**}-{\int}_{\mu^{*}}^{\mu^{**}}F^{hl}(\mu_{t})d\mu_{t}. \end{array} $$


Similarly,

$$ \begin{array}{@{}rcl@{}} {\int}_{\mu_{t}}\mu_{t}dF^{lh}(\mu_{t})=\mu^{**}-{\int}_{\mu^{*}}^{\mu^{**}}F^{lh}(\mu_{t})d\mu_{t}. \end{array} $$

Therefore the inequality \({\int \limits }_{\mu _{t}}\mu _{t}dF^{hl}(\mu _{t})\geq {\int \limits }_{\mu _{t}}\mu _{t}dF^{lh}(\mu _{t})\) is equivalent to \({\int \limits }_{\mu ^{*}}^{\mu ^{**}}F^{hl}(\mu _{t})d\mu _{t}\leq {\int \limits }_{\mu ^{*}}^{\mu ^{**}}F^{lh}(\mu _{t})d\mu _{t}\). In the following, we prove a sufficient condition for the above inequality, namely, \(F^{hl}(\mu_{t})\leq F^{lh}(\mu_{t})\) for all \(\mu_{t}\). By definition, for a given \(\mu_{t-1}\),

$$ F^{hl}(\mu)=\text{Pr}(\mu^{hl}(\mu_{t-1})\leq \mu). $$

Assume \(a,b,c_{1},d_{1},c_{2},d_{2}\in[0,1]\), \(a+b\geq 1\), \(c_{1}+d_{1}\geq 1\), and \(c_{2}+d_{2}\geq 1\). It is easy to show that

$$ \mu_{t}|_{y_{t}=h, z_{t}=l, \mu_{t-1}=\mu}\geq \mu_{t}|_{y_{t}=l, z_{t}=h, \mu_{t-1}=\mu} $$

if

$$ 1-d_{1}\leq c_{1}\leq \frac{c_{2} d_{2}(1-d_{1})}{d_{1}(1-d_{2})+c_{2}(d_{2}-d_{1})}. $$
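As a numerical sanity check of this ordering, the sketch below implements the two posterior updates recalled earlier and compares them on a grid of prior beliefs. The function names and the parameter values are illustrative assumptions, not taken from the paper, and the check presumes the denominator of the upper bound is positive.

```python
def prior_high(mu_prev, a, b):
    """One-step-ahead probability of the high state: (a + b - 1) * mu + 1 - b."""
    return (a + b - 1.0) * mu_prev + 1.0 - b


def mu_hl(mu_prev, a, b, c1, d1, c2, d2):
    """Posterior belief after (y_t = h, z_t = l), following the formula in the text."""
    p = prior_high(mu_prev, a, b)
    num = (1.0 - c1) * c2 * p
    # (1 - a - b) * mu + b equals 1 - p
    return num / (num + d1 * (1.0 - d2) * (1.0 - p))


def mu_lh(mu_prev, a, b, c1, d1, c2, d2):
    """Posterior belief after (y_t = l, z_t = h), following the formula in the text."""
    p = prior_high(mu_prev, a, b)
    num = c1 * (1.0 - c2) * p
    return num / (num + (1.0 - d1) * d2 * (1.0 - p))


def condition_holds(c1, d1, c2, d2):
    """Check 1 - d1 <= c1 <= c2 d2 (1 - d1) / (d1 (1 - d2) + c2 (d2 - d1))."""
    upper = c2 * d2 * (1.0 - d1) / (d1 * (1.0 - d2) + c2 * (d2 - d1))
    return 1.0 - d1 <= c1 <= upper


# Hypothetical parameter values chosen only for illustration.
params = dict(a=0.8, b=0.7, c1=0.3, d1=0.8, c2=0.6, d2=0.7)
grid = [i / 100.0 for i in range(101)]
ordered = all(mu_hl(m, **params) >= mu_lh(m, **params) for m in grid)
print(condition_holds(0.3, 0.8, 0.6, 0.7), ordered)
```

With these values the condition is satisfied and the belief after (h, l) is at least as high as the belief after (l, h) at every grid point; raising c1 above the upper bound reverses the ordering.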

But we know that, for all \(\mu_{t-1}\), \(\mu^{hl}(\mu_{t-1})\geq \mu^{lh}(\mu_{t-1})\) given the assumptions. Therefore

$$\text{Pr}(\mu^{hl}(\mu_{t-1})\leq \mu)\leq \text{Pr}(\mu^{lh}(\mu_{t-1})\leq \mu)$$

Given that ϕ(μ) is a linear increasing function of μ, we have

$$ {E_{t}^{f}}(p_{t+1}-p_{t}|y_{t}=h, z_{t}=l)-{E_{t}^{f}}(p_{t+1}-p_{t}|y_{t}=l, z_{t}=h)\geq 0 $$

if

$$ 1-d_{1}\leq c_{1}\leq \frac{c_{2} d_{2}(1-d_{1})}{d_{1}(1-d_{2})+c_{2}(d_{2}-d_{1})}, $$

where it can be shown that \(1-d_{1}\leq \frac {c_{2} d_{2}(1-d_{1})}{d_{1}(1-d_{2})+c_{2}(d_{2}-d_{1})}\) always holds.

Now we examine the comparative statics with respect to a and b. Let w = a + b, then

$$ \begin{array}{@{}rcl@{}}& &{E_{t}^{f}}(p_{t+1}-p_{t}|y_{t}=h, z_{t}=l)-{E_{t}^{f}}(p_{t+1}-p_{t}|y_{t}=l, z_{t}=h) \end{array} $$
$$ \begin{array}{@{}rcl@{}} &&= \frac{(1 + \delta)(2 - w)}{2-w+\delta}(H - L)\!\left( {\int}_{\mu}\mu dF(\mu|y_{t} = h, z_{t} = l) - {\int}_{\mu}\mu dF(\mu|y_{t} = l, z_{t} = h)\right).\\ \end{array} $$

Note that w may affect the hedge return through the functional form of \(\frac {(1+\delta )(2-w)}{2-w+\delta }(H-L)\) and its impact on the conditional distributions. We focus on the impact through \(\frac {(1+\delta )(2-w)}{2-w+\delta }(H-L)\), assuming that the impact on the conditional distributions plays a secondary role.

$$ \begin{array}{@{}rcl@{}} && \frac{\partial \left( {E_{t}^{f}}(p_{t+1}-p_{t}|y_{t}=h, z_{t}=l)-{E_{t}^{f}}(p_{t+1}-p_{t}|y_{t}=l, z_{t}=h)\right)}{\partial w} \end{array} $$
$$ \begin{array}{@{}rcl@{}} &&\approx - \frac{\delta(1 + \delta)(H - L)}{(2-w+\delta)^{2}}\!\left( {\int}_{\mu}\mu dF(\mu|y_{t} = h, z_{t} = l) - {\int}_{\mu}\mu dF(\mu|y_{t} = l, z_{t} = h)\!\right)\!<\!0\\ \end{array} $$

i.e., the profitability of the hedge strategy decreases with w, the economic persistence. □
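For completeness, the sign in the last step of the proof of Proposition 2 can be traced to the scalar factor alone. Holding the conditional distributions fixed, as assumed in the proof, and assuming in addition that δ > 0 and H > L, the quotient rule gives

$$ \frac{\partial}{\partial w}\left[\frac{(1+\delta)(2-w)}{2-w+\delta}\right] = (1+\delta)\,\frac{-(2-w+\delta)+(2-w)}{(2-w+\delta)^{2}} = -\frac{\delta(1+\delta)}{(2-w+\delta)^{2}}. $$

Multiplying by (H − L) and by the difference in conditional means, which the first part of the proof shows to be nonnegative under the stated condition, yields the approximate derivative reported in the proof; it is strictly negative when that difference is strictly positive.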

Appendix B: Variable definitions

B.1. Variable definitions



Acc

Total accruals scaled by average total assets (Compustat item at). Total accruals are calculated as ΔCA − ΔCL − ΔCash + ΔSTD − Dep, where ΔCA is the change in current assets (act), ΔCL is the change in current liabilities (lct), ΔCash is the change in cash (che), ΔSTD is the change in short-term debt (dlc), and Dep is depreciation and amortization expense (dp).

Beta

The slope coefficient from a regression of daily stock returns on equal-weighted market returns over the fiscal year.


Book-tax difference, calculated as taxable income minus net income then scaled by the market value of equity at three months after the beginning of fiscal year t. The taxable income for year t is CTEt × (1 − τ)/τ, where CTEt is current tax expense, measured as the sum of current federal (txfed) and foreign income taxes (txfo), or, when either of these accounts is missing, as total income tax expense (txt) less deferred tax expense (txdi). The U.S. statutory corporate tax rate τ is assumed to be cross-sectionally constant and is measured as the top statutory corporate federal tax rate. The top statutory corporate federal tax rate was 46 percent in 1980–1986, 40 percent in 1987, 34 percent in 1988–1992, and 35 percent in 1993–2015.


Cumulative abnormal return from two trading days before to one trading day after the earnings announcement date. Returns are adjusted by the value-weighted market.

E/P

Earnings-price ratio, calculated as operating income after depreciation (oiadp) scaled by market value of equity at fiscal year-end.


EI

A firm-year measure of \((\hat {c}+\hat {d})/2\) estimated using a hierarchical Bayesian method as proposed by Du et al. (2020). See Appendix B.2 for details.


Market value of equity divided by book value of equity at year-end.

Ret

The stock return over the 12 months beginning nine months prior to the end of fiscal year.

Size

Market value of equity at the end of the fiscal year.


The length of an earnings string. Earnings strings are defined as consecutive earnings increases, where an earnings increase is defined as EPS before extraordinary items in the observation quarter being higher than EPS for the same quarter of the previous year.


The scaled decile rank for the variable of interest assigned for each firm every year. We rank the values of the variable for every firm into deciles for each fiscal year. The ranks are then transformed to the range between 0 and 1, by subtracting one and dividing by nine.

Volatility

The standard deviation of daily stock returns over the fiscal year.

Relevance

The value relevance of earnings, measured as the explained variability (\(R^{2}\)) from the following regression of returns on the level and change in earnings: \(Ret_{t}=\gamma_{0}+\gamma_{1}Earn_{t}+\gamma_{2}\Delta Earn_{t}+\varepsilon_{t}\), where \(Ret_{t}\) is the stock return over the 12 months beginning nine months prior to the end of fiscal year t, \(Earn_{t}\) is income before extraordinary items in year t scaled by market value of equity at three months after the beginning of fiscal year t, and \(\Delta Earn_{t}\) is the change in income before extraordinary items in year t scaled by market value of equity at three months after the beginning of fiscal year t. For each firm-year, the regression model is estimated over the five-year rolling window ending with year t (requiring at least three years of data).
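Two of the definitions above are procedural, so a small sketch may help: the length of an earnings string and the scaled decile rank. The function names are hypothetical, and the handling of ties and of uneven decile sizes is an assumption, since the definitions do not specify it.

```python
def string_lengths(eps):
    """Length of the current run of earnings increases at each quarter.

    eps: quarterly EPS before extraordinary items, oldest first.
    An increase means EPS exceeds EPS of the same quarter one year earlier
    (four quarters back). The first four quarters have no comparison and get 0.
    """
    lengths = []
    run = 0
    for t, value in enumerate(eps):
        if t >= 4 and value > eps[t - 4]:
            run += 1
        else:
            run = 0
        lengths.append(run)
    return lengths


def scaled_decile_ranks(values):
    """Map the values of one fiscal year to {0, 1/9, ..., 1}.

    Assumption: deciles are formed by sorted position (ties broken by input
    order), so the ten groups are as equal in size as possible.
    """
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0.0] * n
    for position, i in enumerate(order):
        decile = position * 10 // n + 1   # decile rank in 1..10
        ranks[i] = (decile - 1) / 9.0     # subtract one, divide by nine
    return ranks


print(string_lengths([1, 1, 1, 1, 2, 2, 0, 2, 3, 1, 1, 3]))
print(scaled_decile_ranks(list(range(20)))[:4])
```

In an actual sample, the ranking would be applied separately within each fiscal year.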

B.2. Earnings informativeness

This appendix briefly outlines the procedure used to estimate earnings informativeness (EI). For more details about the procedure, and discussions of the advantages, internal validity, and external validity of the resulting measure, see Du et al. (2020).

Suppose there are n idiosyncratic firms, each of which is a time-varying counterpart of the model in Section 2. For firm i, the law of motion of the underlying state (\(\boldsymbol{Q}_{it}\)) and the mapping from the state to the earnings signal (\(\boldsymbol{\eta}_{it}\)) are given by the following matrices:

$$ \boldsymbol{Q}_{it}=\left[ \begin{array}{cc} a_{it} & 1-a_{it}\\ 1-b_{it} & b_{it} \end{array} \right],\quad \boldsymbol{\eta}_{it}= \left[ \begin{array}{cc} c_{it} & 1-c_{it}\\ 1-d_{it} & d_{it} \end{array} \right], $$

where \(a_{it}\), \(b_{it}\), \(c_{it}\), and \(d_{it}\) are influenced by time-varying, firm-specific characteristics.

Let 𝜃 be the vector of unknown model parameters, π0 the initial state probabilities, y the observed history of signals, L(𝜃|y) the joint likelihood of the observed signals, and π(𝜃) the prior distribution of the unknown parameters. The posterior distribution of 𝜃 given the observed data is given by

$$ \pi(\boldsymbol{\theta}|\boldsymbol{y})\propto L(\boldsymbol{\theta}|\boldsymbol{y})\,\pi(\boldsymbol{\theta})\propto \left[{\prod}_{i=1}^{n}\boldsymbol{\pi}_{0} \Tilde{\boldsymbol{\eta}}_{i1}\boldsymbol{Q}_{i2}\Tilde{\boldsymbol{\eta}}_{i2} \cdots \boldsymbol{Q}_{iT_{i}}\Tilde{\boldsymbol{\eta}}_{iT_{i}}\boldsymbol{1}^{\prime}\right] \cdot \pi(\boldsymbol{\theta}), $$

where \(\Tilde{\boldsymbol{\eta}}_{it}\), \(t=1,2,\ldots,T_{i}\), is the 2 × 2 diagonal matrix

$$ \Tilde{\boldsymbol{\eta}}_{it}=\left[ \begin{array}{cc} c_{it}^{1-y_{it}}(1-c_{it})^{y_{it}} & 0\\ 0 & d_{it}^{y_{it}}(1-d_{it})^{1-y_{it}} \end{array} \right]. $$
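The bracketed product in the likelihood can be evaluated for a single firm with a forward recursion. The sketch below is a simplified illustration under stated assumptions: the parameters a, b, c, and d are held constant over time (the paper lets them vary by firm and period), signals are coded 1 for h and 0 for l to match the exponents in the diagonal matrix, and the function name and default initial probabilities are hypothetical.

```python
from itertools import product


def firm_likelihood(y, a, b, c, d, pi0=(0.5, 0.5)):
    """Evaluate pi0 * eta~_1 * Q * eta~_2 * ... * Q * eta~_T * 1' for one firm.

    y: sequence of signals coded 1 (h) or 0 (l).
    Q = [[a, 1 - a], [1 - b, b]] is the state transition matrix.
    """
    def emission(y_t):
        # Diagonal entries of eta~ for the observed signal y_t.
        return (c ** (1 - y_t) * (1 - c) ** y_t,
                d ** y_t * (1 - d) ** (1 - y_t))

    e = emission(y[0])
    v = [pi0[0] * e[0], pi0[1] * e[1]]
    for y_t in y[1:]:
        # Row vector times Q, then times the diagonal emission matrix.
        pred = [v[0] * a + v[1] * (1 - b), v[0] * (1 - a) + v[1] * b]
        e = emission(y_t)
        v = [pred[0] * e[0], pred[1] * e[1]]
    return v[0] + v[1]   # right-multiplication by the vector of ones


# The likelihoods of all signal histories of a given length should sum to one.
total = sum(firm_likelihood(list(seq), 0.8, 0.7, 0.75, 0.65)
            for seq in product([0, 1], repeat=4))
print(total)
```

The joint likelihood over firms is the product of these firm-level terms; for long histories one would typically rescale the vector at each step or work in logs to avoid underflow.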

The main idea of the MCMC method is to construct a Markov chain that has the desired posterior distribution π(𝜃|y) as its equilibrium distribution. In each iteration of the MCMC method, we generate samples of the parameters based on their values from the previous iteration and their posterior distributions. After 90,000 iterations, the generated samples from the MCMC method converge to the equilibrium distribution. We use the subsequent 10,000 iterations to approximate the posterior distributions and estimate the parameters.
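To illustrate the burn-in and retention logic described above, the following toy example runs a random-walk Metropolis sampler on a single Bernoulli probability with a uniform prior. This is not the hierarchical sampler of Du et al. (2020); the target, the proposal step size, the iteration counts, and all names are assumptions chosen so that the result can be compared with a known posterior mean.

```python
import math
import random


def log_posterior(c, successes, trials):
    """Log posterior (up to a constant) of a Bernoulli probability, uniform prior."""
    if not 0.0 < c < 1.0:
        return -math.inf
    return successes * math.log(c) + (trials - successes) * math.log(1.0 - c)


def metropolis(successes, trials, n_burn, n_keep, step=0.1, seed=0):
    """Random-walk Metropolis: discard n_burn draws, then keep n_keep draws."""
    rng = random.Random(seed)
    current = 0.5
    current_lp = log_posterior(current, successes, trials)
    kept = []
    for it in range(n_burn + n_keep):
        proposal = current + rng.gauss(0.0, step)
        proposal_lp = log_posterior(proposal, successes, trials)
        # Accept with probability min(1, posterior ratio).
        accept_prob = math.exp(min(0.0, proposal_lp - current_lp))
        if rng.random() < accept_prob:
            current, current_lp = proposal, proposal_lp
        if it >= n_burn:
            kept.append(current)
    return kept


draws = metropolis(successes=35, trials=50, n_burn=5000, n_keep=10000)
posterior_mean = sum(draws) / len(draws)
print(posterior_mean)   # analytic posterior mean is (35 + 1) / (50 + 2)
```

The retained draws approximate the posterior in the same way the final 10,000 iterations are used in the text, only for a far simpler target.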

The estimation is applied to U.S. firms covered by Compustat with a minimum of 20 consecutive quarters of valid data for the period of 1980–2015. We operationalize yit as an indicator that takes the value h if firm i’s earnings surprise based on a seasonal random walk model in quarter t is positive and l otherwise. Four covariates (firm size, financial leverage, market-to-book ratio, and cash flow volatility) are used to help identify the model. We operationalize earnings informativeness as:

$$ \mathit{EI}_{it}=\frac{\hat{c}_{it}+\hat{d}_{it}}{2}. $$

We then take the annual average of the respective quarterly measure, obtaining a measure for 111,346 firm-year observations representing 8,713 unique firms.

Appendix C: An alternative setup with AR(1) states and Gaussian signals

This appendix presents the implications of the linear-Gaussian framework with AR(1) states and Gaussian signals. Essentially, the one-state/one-signal and one-state/two-signal cases are both special cases of the Kalman filter (Ljungqvist and Sargent 2004; Miao 2014).

C.1. The one-signal case

Suppose the underlying state xt and the accounting signal yt are given by

$$ \begin{array}{@{}rcl@{}} x_{t}&=&\rho x_{t-1}+u_{t} \end{array} $$
$$ \begin{array}{@{}rcl@{}} y_{t}&=&\beta x_{t}+v_{t} \end{array} $$

where β > 0, and \(u_{t}\sim N(0,{\sigma _{u}^{2}})\) and \(v_{t}\sim N(0,{\sigma _{v}^{2}})\) are independent of each other.

Following the conventions of filtering problems, define the following notation:

$$ \begin{array}{@{}rcl@{}} y_{t|t-1}&\equiv& E_{t-1}[y_{t}] \end{array} $$
$$ \begin{array}{@{}rcl@{}} V_{t|t-1}&\equiv& E[(y_{t}-y_{t|t-1})^{2}]. \end{array} $$

Then the standard Kalman filter results (e.g., Ljungqvist and Sargent 2004) imply that

$$ \begin{array}{@{}rcl@{}} x_{t|t-1}&=&\rho x_{t-1|t-1} \end{array} $$
$$ \begin{array}{@{}rcl@{}} w_{t|t-1}&=&\rho^{2} w_{t-1|t-1}+{\sigma_{u}^{2}} \end{array} $$
$$ \begin{array}{@{}rcl@{}} y_{t|t-1}&=&\beta x_{t|t-1} \end{array} $$
$$ \begin{array}{@{}rcl@{}} V_{t|t-1}&=&w_{t|t-1}\beta^{2}+{\sigma_{v}^{2}} \end{array} $$
$$ \begin{array}{@{}rcl@{}} x_{t|t}&=&x_{t|t-1}+w_{t|t-1}\beta V_{t|t-1}^{-1}(y_{t}-y_{t|t-1}) \end{array} $$
$$ \begin{array}{@{}rcl@{}} w_{t|t}&=&w_{t|t-1}-(\beta w_{t|t-1})^{2}/V_{t|t-1}. \end{array} $$
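The six recursions translate directly into a one-period update. The sketch below is a minimal illustration; the function name and the starting values in the usage lines are assumptions, while ρ = 0.5, β = 0.9, and the two variances of 0.10 are among the parameter values used in the simulation discussed in this appendix.

```python
def kalman_step(x_post, w_post, y, rho, beta, var_u, var_v):
    """One prediction-and-update step of the scalar filter above.

    x_post, w_post: previous posterior mean and variance (x_{t-1|t-1}, w_{t-1|t-1}).
    Returns the new posterior mean and variance (x_{t|t}, w_{t|t}).
    """
    x_pred = rho * x_post                      # x_{t|t-1}
    w_pred = rho ** 2 * w_post + var_u         # w_{t|t-1}
    y_pred = beta * x_pred                     # y_{t|t-1}
    v_pred = w_pred * beta ** 2 + var_v        # V_{t|t-1}
    x_new = x_pred + w_pred * beta / v_pred * (y - y_pred)
    w_new = w_pred - (beta * w_pred) ** 2 / v_pred
    return x_new, w_new


# Illustrative starting belief (mean 0, variance 1) and a single signal y = 1.
x1, w1 = kalman_step(0.0, 1.0, 1.0, rho=0.5, beta=0.9, var_u=0.10, var_v=0.10)
print(x1, w1)
```

Iterating the function over a simulated signal path gives the belief sequence; the posterior variance does not depend on the realized signals and settles at a fixed point after a few periods.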

Belief updating can be easily simulated. For some parametric specifications, we can generate patterns seen in the binary setup. For example, assume ρ = 0.50, β = 0.90, and \(({\sigma _{u}^{2}}, {\sigma _{v}^{2}})\in \{(0.10, 0.10), (0.15, 0.10)\}\). We simulate the model over T = 1,000 periods for N = 10,000 idiosyncratic histories (“firms”) and define a high signal h in period t as an earnings increase, i.e., \(y_{t}>y_{t-1}\). We then calculate the market reaction to breaks in earnings strings as the average return in all firm-periods with an l signal following exactly s consecutive h signals, where s = 1, 2, ..., 8 is the length of earnings strings. We find the average market reaction monotonically decreases as s increases, consistent with the prediction of the binary setup.

Note that, to simulate discrete events (e.g., earnings increase or decrease), we need to transform the continuous values of \(y_{t}\) into binary values, which may involve an arbitrary selection of the threshold for “clipping” the signal. Therefore the applicability of the linear-Gaussian model to the empirical phenomenon of earnings strings is limited and could be sensitive to further assumptions. As such, we view our binary setup as more suitable for studying earnings strings.

C.2. The two-signal case

The two-signal case can also be derived under the linear-Gaussian framework. Suppose the signal yt is a 2 × 1 vector:

$$ \begin{array}{@{}rcl@{}} x_{t}&=&\rho x_{t-1}+u_{t} \end{array} $$
$$ \begin{array}{@{}rcl@{}} y_{t}&=&B x_{t}+v_{t} \end{array} $$

where \(y_{t}=[{y_{t}^{1}}, {y_{t}^{2}}]'\) (representing two noisy signals), \(B=[\beta_{1},\beta_{2}]'\), \(u_{t}\sim N(0,{\sigma _{u}^{2}})\) and \(v_{t}\sim N(0,{\Sigma }_{v})\); \(u_{t}\) and \(v_{t}\) are independent of each other.

Then the Kalman filter implies that

$$ \begin{array}{@{}rcl@{}} x_{t|t-1}&=&\rho x_{t-1|t-1} \end{array} $$
$$ \begin{array}{@{}rcl@{}} w_{t|t-1}&=&\rho^{2} w_{t-1|t-1}+{\sigma_{u}^{2}} \end{array} $$
$$ \begin{array}{@{}rcl@{}} y_{t|t-1}&=&B x_{t|t-1} \end{array} $$
$$ \begin{array}{@{}rcl@{}} V_{t|t-1}&=&w_{t|t-1}BB^{\prime}+{\Sigma}_{v} \end{array} $$
$$ \begin{array}{@{}rcl@{}} x_{t|t}&=&x_{t|t-1}+w_{t|t-1}B^{\prime}V_{t|t-1}^{-1}(y_{t}-y_{t|t-1}) \end{array} $$
$$ \begin{array}{@{}rcl@{}} w_{t|t}&=&w_{t|t-1}-w_{t|t-1}^{2}B^{\prime}V_{t|t-1}^{-1}B. \end{array} $$
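A sketch of the two-signal update follows, treating B as the 2 × 1 column of loadings so that BB′ is 2 × 2 and inverting the 2 × 2 matrix V explicitly. The function name, the loadings, and the noise covariance in the usage lines are illustrative assumptions.

```python
def kalman_step_two_signals(x_post, w_post, y, rho, B, var_u, sigma_v):
    """One step of the two-signal filter above.

    y: pair of signals (y1, y2); B: pair of loadings (beta1, beta2);
    sigma_v: 2 x 2 noise covariance given as nested tuples.
    """
    b1, b2 = B
    x_pred = rho * x_post
    w_pred = rho ** 2 * w_post + var_u
    # V = w * B B' + Sigma_v
    v11 = w_pred * b1 * b1 + sigma_v[0][0]
    v12 = w_pred * b1 * b2 + sigma_v[0][1]
    v21 = w_pred * b2 * b1 + sigma_v[1][0]
    v22 = w_pred * b2 * b2 + sigma_v[1][1]
    det = v11 * v22 - v12 * v21
    # Row vector B' V^{-1}, using the closed-form 2 x 2 inverse.
    k1 = (b1 * v22 - b2 * v21) / det
    k2 = (b2 * v11 - b1 * v12) / det
    e1 = y[0] - b1 * x_pred        # forecast errors y_t - y_{t|t-1}
    e2 = y[1] - b2 * x_pred
    x_new = x_pred + w_pred * (k1 * e1 + k2 * e2)
    w_new = w_pred - w_pred ** 2 * (k1 * b1 + k2 * b2)
    return x_new, w_new


# Illustrative values: independent signal noises with variances 0.1 and 0.2.
x2, w2 = kalman_step_two_signals(
    0.0, 1.0, (1.0, 1.0), rho=0.5, B=(0.9, 0.8),
    var_u=0.10, sigma_v=((0.1, 0.0), (0.0, 0.2)))
print(x2, w2)
```

With independent noises, the posterior variance agrees with the precision-weighted form 1 / (1/w + β1²/σ1² + β2²/σ2²), which is a convenient check on the matrix algebra.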

Belief updating and the associated stock return dynamics can be simulated, analogous to the one-signal case.


Du, K., Huddart, S. Economic persistence, earnings informativeness, and stock return regularities. Rev Account Stud 25, 1263–1300 (2020).



JEL Classification