Appendix A: Variable definitions
| Variable | Description |
|---|---|
| \(R_{it}\) | The buy-and-hold annual return ending three months after fiscal year-end, computed using compounded monthly CRSP equity returns. |
| \(UR_{it}\) | The unexpected buy-and-hold annual return ending three months after fiscal year-end, computed using compounded monthly CRSP equity returns. Unexpected returns are constructed using annual 5×5 reference portfolios by sorting on size and book-to-market following Fama and French (1992) and Fama and French (1993). We then obtain unexpected returns by subtracting the expected return from \(R_{it}\). |
| \(DR_{it}\) | An indicator variable equal to one when \(R_{it}\) or \(UR_{it}\) is less than zero and equal to zero otherwise. |
| \(I_{it}\) | Earnings (per share) excluding extraordinary items scaled by lagged share price. [EPSPX/LPRCC_F1] |
| \(UI_{it}\) | Unexpected earnings, defined as the difference in earnings excluding extraordinary items (i.e., \(I_{it}-I_{it-1}\)). |
| \(OIBDP_{it}\) | Operating income before depreciation, defined as income before interest, taxes, depreciation, and non-operating items (i.e., extraordinary items, special items, and discontinued items), scaled by beginning of the year market value of equity. [OIBDP / (PRCC_F*CSHO)t−1] |
| \(UOIBDP_{it}\) | Unexpected operating income before depreciation, defined as the difference in operating income before depreciation (i.e., \(OIBDP_{it}-OIBDP_{it-1}\)). |
| \(OtherIncome_{it}\) | Other reported annual earnings, defined as \(I_{it}\) less \(OIBDP_{it}\). |
| \(UOtherIncome_{it}\) | Unexpected other income, defined as the difference in other income (i.e., \(OtherIncome_{it}-OtherIncome_{it-1}\)). |
| \(BM_{it-1}\) | The book-to-market ratio at the beginning of the fiscal year. [(CEQ+LT) / ((PRCC_F*CSHO)+LT)t−1] |
| \(Leverage_{it-1}\) | Firm leverage at the beginning of the fiscal year. [(DLTT+DLC) / (AT)t−1] |
| \(Size_{it-1}\) | The natural logarithm of market value of equity at the beginning of the fiscal year. [ln(PRCC_F*CSHO)t−1] |
| \(CCA_{it}\) | “Funds from Operations - Other” scaled by beginning of the year market value of equity. [-FOPO / (PRCC_F*CSHO)t−1] |
| \(UCCA_{it}\) | (Scaled) unexpected special items (i.e., \(CCA_{it}-CCA_{it-1}\)). |
| \(UI\_bef\_UCCA_{it}\) | \(UI_{it}-UCCA_{it}\) |
Appendix B: Econometrics of the earnings on returns regression coefficient
Research on accounting conservatism is inherently interested in testing the link between Iit and yit—that is, the relationship between earnings and transactions that are subject to asymmetric accounting recognition. Because yit is unobservable to researchers, Rit is used instead. In this appendix, we formally derive the econometric properties of the earnings on returns regression coefficient.
Because yit is unobservable to researchers, a “plug-in” solution that substitutes a proxy variable for the unobservable explanatory variable is used. The assumptions that must be met to obtain unbiased estimates are discussed in Wooldridge (2012, pp. 308–313).Footnote 29 In the generalized BKN framework, yit is expressed using Eq. (3) and then substituted into Eq. (4). Ball et al. (2013a) performs an analogous substitution to form an estimable equation with the observable variables Iit and Rit, although this step is not shown in Ball et al. (2013a).
To demonstrate this, we start with Eq. (3):
$$ R_{it}=x_{it}+y_{it}+g_{it} $$
and solve for yit (the unobservable construct of interest to researchers) as:
$$ y_{it}=R_{it}-x_{it}-g_{it}. $$
(13)
This representation is of returns measuring yit with error of the following form:
$$ e_{it}=R_{it}-y_{it}=x_{it}+g_{it}. $$
Recall that researchers of asymmetric timeliness are interested in the estimation of the structural relationship between Iit and yit:
$$ I_{it}=\left[p(y)_{it}\right]y_{it}+u_{it}, $$
(14)
where
$$ u_{it}=x_{it}+\left[1-p(y)_{it-1}\right]y_{it-1}+g_{it-1}+\varepsilon_{it}-\varepsilon_{it-1}. $$
(15)
Because yit is unobservable, Eq. (14) cannot be estimated directly. Consistent with prior conservatism research, returns is used as a measure of good or bad news, resulting in the substitution of yit from Eq. (13) into Eq. (14) to obtain:
$$ \begin{array}{@{}rcl@{}} I_{it} & = & \left[p(y)_{it}\right](R_{it}-x_{it}-g_{it})+u_{it}. \end{array} $$
(16)
This can be simplified into the following structural equation:
$$ I_{it}=\left[p(y)_{it}\right]R_{it}+\upsilon_{it}, $$
(17)
where
$$ \begin{array}{@{}rcl@{}} \upsilon_{it} =\left[p(y)_{it}\right](-x_{it}-g_{it})+x_{it}+\left[1-p(y)_{it-1}\right]y_{it-1}+g_{it-1}+\varepsilon_{it}-\varepsilon_{it-1}. \end{array} $$
(18)
Equations (17) and (18) are the key to understanding why using returns as a proxy for yit leads to biased estimates. First, following Wooldridge (2010), the classical errors-in-variables assumption requires that the measurement error of the proxy variable not be correlated with the unobservable variable. Clearly, this assumption is violated, as eit(= xit + git) and yit are correlated based on the Ball et al. (2013a) assumptions that \(cov\left (x_{it},y_{it}\right )=\sigma _{x,y}>0\) and \(cov\left (y_{it},g_{it}\right )=\sigma _{y,g}>0\).Footnote 30 Second, even if ωit = 0 when news is sufficiently good and accordingly the errors-in-variables problem doesn’t arise (i.e., \(\left [p(y)_{it}\right ](-x_{it}-g_{it})=0\)), the structural error term, υit, will still be correlated with Rit, as both are determined by xit. (See Eqs. (18) and (3) above.)
Because the structural error term υit is not independent of Rit, the estimated slope coefficient \(\hat {p(y)_{it}}\) from an OLS regression of:Footnote 31
$$ I_{it}=\left[\hat{p(y)_{it}}\right]R_{it}+\hat{\xi}_{it} $$
(19)
will have the following expected value:Footnote 32
$$ \begin{array}{@{}rcl@{}} E\left[\hat{p(y)_{it}}\right] & = & E\left[\frac{\hat{\sigma}_{R,I}}{\hat{\sigma}_{R}^{2}}\right]=\left[\frac{\sigma_{R,({{\left[p(y)_{it}\right]}}R+\upsilon)}}{{\sigma_{R}^{2}}}\right] \\ & =&\left[\frac{\left[p(y)_{it}\right]\sigma_{R}^{2}}{{\sigma_{R}^{2}}}+\frac{\sigma_{R,\upsilon}}{{\sigma_{R}^{2}}}\right]=\left[\left[p(y)_{it}\right]+\frac{\sigma_{R,\upsilon}}{{\sigma_{R}^{2}}}\right]. \end{array} $$
(20)
Following conventional practice, our derivation substitutes the structural equation \(I=\left [p(y)_{it}\right ]R+\upsilon \) for I in σR,I to obtain an expression for the structural parameter p(y)it.
As can be seen in Eq. (20), \(\hat {p(y)_{it}}\) does not provide an unbiased estimate of p(y)it—the structural parameter linking Iit and yit—because σR,υ≠ 0.Footnote 33 Accordingly, Rit and υit are not independent, and R is endogenous, which leads to p(y)it (or ωt in the BKN framework) not being identified in a regression of earnings on returns.Footnote 34 An expansion of υit in Eq. (20) leads to the more general representation:Footnote 35
$$ \begin{array}{@{}rcl@{}} E\left[\hat{p(y)_{it}}\right]&= & E\left[\frac{\hat{\sigma}_{R,I}}{\hat{\sigma}_{R}^{2}}\right]=\left[\frac{\sigma_{R,(\left[p(y)_{it}\right]R+\upsilon)}}{{\sigma_{R}^{2}}}\right]\\ & =&\left[\frac{\left[p(y)_{it}\right]\sigma_{R}^{2}}{{\sigma_{R}^{2}}}+\frac{\sigma_{R,\left\{ \left( 1-\left[p(y)_{it}\right]\right)x_{it}-\left[p(y)_{it}\right]g_{it}+\left( 1-\left[p(y)_{it-1}\right]\right)y_{it-1}+g_{it-1}+\varepsilon_{it}-\varepsilon_{it-1}\right\} }}{{\sigma_{R}^{2}}}\right] \\ & =&\left[\left[p(y)_{it}\right]+\frac{\left( 1-\left[p(y)_{it}\right]\right)\sigma_{R,x}-\left[p(y)_{it}\right]\sigma_{R,g}}{{\sigma_{R}^{2}}}\right]. \end{array} $$
(21)
As Eq. (21) shows, empirical estimates of p(y)it will vary with the importance of \(\left (1-\left [p(y)_{it}\right ]\right )\sigma _{R,x}\) and \(\left [p(y)_{it}\right ]\sigma _{R,g}\) relative to \({\sigma _{R}^{2}}\). For the empirical estimate to be unbiased, \(\left (1-\left [p(y)_{it}\right ]\right )\sigma _{R,x}=0\) and \(\left [p(y)_{it}\right ]\sigma _{R,g}=0\), or alternatively, \(\left (1-\left [p(y)_{it}\right ]\right )\sigma _{R,x}=\left [p(y)_{it}\right ]\sigma _{R,g}\).
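The bias expression in Eq. (21) can be checked numerically. The following is a minimal simulation sketch, not part of the original analysis. It assumes, purely for illustration, that x, y, and g are independent normal draws (the BKN framework instead allows positive covariances), that p(y)it is a constant p, and that the lagged and noise terms in υit collapse into a single independent draw u. The function name `simulate_slope` and all parameter values are hypothetical.

```python
import random

def cov(a, b):
    """Sample covariance of two equal-length sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    return sum((u - ma) * (v - mb) for u, v in zip(a, b)) / (n - 1)

def simulate_slope(p, sd_x, sd_y, sd_g, sd_u, n=50_000, seed=0):
    """Return (OLS slope of I on R, right-hand side of Eq. (21))."""
    rng = random.Random(seed)
    x = [rng.gauss(0, sd_x) for _ in range(n)]
    y = [rng.gauss(0, sd_y) for _ in range(n)]
    g = [rng.gauss(0, sd_g) for _ in range(n)]
    # Assumption: u stands in for the lagged and noise terms of the error.
    u = [rng.gauss(0, sd_u) for _ in range(n)]
    R = [a + b + c for a, b, c in zip(x, y, g)]      # Eq. (3)
    I = [a + p * b + d for a, b, d in zip(x, y, u)]  # earnings with constant p
    var_R = cov(R, R)
    slope = cov(R, I) / var_R
    eq21 = p + ((1 - p) * cov(R, x) - p * cov(R, g)) / var_R
    return slope, eq21

slope, eq21 = simulate_slope(p=0.3, sd_x=1.0, sd_y=1.0, sd_g=0.5, sd_u=0.5)
print(f"OLS slope: {slope:.3f}  Eq. (21): {eq21:.3f}  structural p: 0.300")
```

Under these assumed variances the population slope is (1 + 0.3)/2.25, or roughly 0.58, well above the structural value of 0.3, because the x component of returns also enters earnings directly.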
Appendix C: Clarification of prior claims in Ball et al. (2013a)
Revisiting the Ball et al. (2013a) claims of addressing return endogeneity and sample truncation bias
Ball et al. (2013a) makes specific claims regarding return endogeneity and sample truncation bias in one-regime ATMs. Specifically, the study states:
A regression of earnings on returns fulfills its research objective of representing timeliness (p. 1079).
When the research objective is to estimate the functional shape of the conditional expectation \(E\left (I\mid R\right )\), return is the correct independent variable, and conditioning on it does not induce bias (p. 1091).
We also address the Dietrich, Muller, and Riedl [2007] claim that return endogeneity and sample truncation lead to biased Basu regression estimates (p. 1074).
An important clarification is that the analysis in Ball et al. (2013a) confuses asymmetric timeliness and earnings timeliness. This is indicated in the first two statements that suggest that researchers are interested in the earnings timeliness of all components of returns, not just the timeliness of y—the focus of conservatism research. This alternative focus leads the analysis in Ball et al. (2013a) to provide only a statistical representation of the relationship between I and R, as no structural relationship exists between I and R in the BKN framework. This can clearly be seen in the path analysis depiction of the BKN framework in Fig. 1.
The research approach in Ball et al. (2013a) deviates from suggested research practices in accounting. For instance, Gow et al. (2016) advises:
An important point worth emphasizing is that the model-based causal reasoning is distinct from statistical reasoning. Suppose we observe data on x and y and make the strong assumption that we know causality is one-way. How do we distinguish between whether X causes Y or Y causes X? Statistics can help us determine whether X and Y are correlated, but correlations do not establish causality. Only with assumptions about causal relations between X, Y, and other variables (i.e., a theory) can we infer causality. (p. 482)
Because of the focus on “earnings timeliness”, rather than the structural relationship between I and y, the Ball et al. (2013a) representation (shown below in Eq. (23)) is not useful for investigations that focus on ATMs. Regarding the last Ball et al. (2013a) statement, below we address claims about the validity of earnings on returns regressions and the effects on ATM estimates of conditioning on returns. Our focus is on the asymmetric timeliness of y in earnings, the (only) structural equation of interest; we also demonstrate why the statistical representations of earnings on returns regressions, including piece-wise regressions presented in Ball et al. (2013a), do not support the assertions above. We continue to use the two-regime framework in this section for notational convenience and for its generality.
The Ball et al. (2013a) claim of addressing return endogeneity
Regarding the claim in Ball et al. (2013a) of addressing return endogeneity, Eq. (21) shows that the estimated coefficient from a regression of earnings on returns does not provide an unbiased estimate of the extent to which earnings incorporates transactions that are subject to asymmetric accounting recognition. Recall that Eq. (21) takes the following form.
$$ \begin{array}{@{}rcl@{}} E\left[\hat{p(y)_{it}}\right]= \left[\left[p(y)_{it}\right]+\frac{\left( 1-\left[p(y)_{it}\right]\right)\sigma_{R,x}-\left[p(y)_{it}\right]\sigma_{R,g}}{{\sigma_{R}^{2}}}\right]. \end{array} $$
Unless both \(\left (1-\left [p(y)_{it}\right ]\right )\sigma _{R,x}=0\) and \(\left [p(y)_{it}\right ]\sigma _{R,g}=0\) (or alternatively, \(\left (1-\left [p(y)_{it}\right ]\right )\sigma _{R,x}=\left [p(y)_{it}\right ]\sigma _{R,g}\)), the coefficient estimate will be biased. Absent this, the bias cannot be signed, given the separate influence of σR,x and σR,g and the unknown value of p(y)it.
Ball et al. (2013a) does not consider this general case. Instead, a restriction that ωit = 0 (corresponding to \(\left [p(y)_{it}\right ]=0\) in our notation) is invoked.Footnote 36 Ball et al. (2013a) refers to this as the “no-conservatism” condition when evaluating the properties of the earnings on returns regression. Even under this condition, the estimated coefficient will be positive rather than 0 as illustrated below:
$$ \begin{array}{@{}rcl@{}} E\left[\hat{p(y)_{it}}\mid\omega_{it}=0\right] =E\left[\frac{\hat{\sigma}_{R,I}}{\hat{\sigma}_{R}^{2}}\right]=\left[\frac{\sigma_{R,x}}{{\sigma_{R}^{2}}}\right]. \end{array} $$
(22)
Consequently, when there are no transactions that face asymmetric accounting recognition, an OLS regression slope coefficient will be zero as expected only if xit = 0 (for all i,t). Stated differently, the estimated coefficient will be unbiased only if earnings and returns are orthogonal to each other—a rather uninteresting special case. This conclusion arises under the condition of the symmetric non-recognition of yit; that is, there is no recognition of transactions subject to accounting conservatism.
In contrast to our derivations, the analysis in Ball et al. (2013a):Footnote 37
- begins with \(E\left [\frac {\hat {\sigma }_{R,I}}{\hat {\sigma }_{R}^{2}}\right ]\),
- substitutes \(I_{it}=x_{it}+y_{it-1}+g_{it-1}+\varepsilon _{it}-\varepsilon _{it-1}\) for Iit, setting ωityit = 0 by assumption, and
- substitutes \(R_{it}=x_{it}+y_{it}+g_{it}\) for Rit.
These steps lead to the following characterization:Footnote 38
$$ \begin{array}{@{}rcl@{}} E\left[\hat{p(y)_{it}}\mid\omega_{it}=0\right] & = & E\left[\frac{\hat{\sigma}_{R,I}}{\hat{\sigma}_{R}^{2}}\right] \\ & =&E\left[\frac{\hat{\sigma}_{(x_{it}+y_{it}+g_{it}),(x_{it}+y_{it-1}+g_{it-1}+\varepsilon_{it}-\varepsilon_{it-1})}}{\hat{\sigma}_{\left( x_{t}+y_{t}+g_{t}\right)}^{2}}\right] \\ & =&\left[\frac{{\sigma_{x}^{2}}+\sigma_{x,y}+\sigma_{g,x}}{{\sigma_{x}^{2}}+2\sigma_{(x_{it},y_{it}+g_{it})}+\sigma_{(y_{it}+g_{it})}^{2}}\right]. \end{array} $$
(23)
Ball et al. (2013a) indicates that the properties of \(E\left [\hat {p(y)_{it}}\mid \omega _{it}=0\right ]\) in Eq. (23) are desirable, observing that, as the ratio of information incorporated into earnings this period increases (i.e., \({\sigma _{x}^{2}}\uparrow \)) relative to all other information (i.e., \(\sigma _{\left (y_{it}+g_{it}\right )}^{2}\)), \(E\left [\hat {p(y)_{it}}\mid \omega _{it}=0\right ]\) becomes larger. Consider, however, that in the extreme, if all news in returns is incorporated into earnings in the current period, then yit = 0 (for all i,t) and git = 0 (for all i,t), and therefore \(\sigma _{(x_{it},y_{it}+g_{it})}=0\) and \(\sigma _{(y_{it}+g_{it})}^{2}=0\). Eq. (23) reduces to:
$$ \begin{array}{@{}rcl@{}} E\left[\hat{p(y)_{it}}\mid\omega_{it}=0\right] & = & \left[\frac{{\sigma_{x}^{2}}}{{\sigma_{x}^{2}}}\right]=1. \end{array} $$
(24)
This derivation speaks to the timeliness of earnings, not the timeliness of yit.
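The limiting behavior of Eq. (23) can be traced with a few lines of arithmetic. The sketch below simply evaluates the final expression in Eq. (23) for hypothetical inputs. The function name `eq23_slope` and the numerical values are illustrative assumptions and do not come from Ball et al. (2013a).

```python
def eq23_slope(var_x, cov_xy, cov_xg, var_y_plus_g):
    """Population slope in Eq. (23), i.e. sigma_{R,x} / sigma_R^2 when omega = 0."""
    cov_x_yg = cov_xy + cov_xg                         # sigma_{x, y+g}
    numerator = var_x + cov_x_yg                       # sigma_{R,x}
    denominator = var_x + 2 * cov_x_yg + var_y_plus_g  # sigma_R^2
    return numerator / denominator

# Hypothetical inputs: shrink the variance of (y + g) toward zero while
# holding var(x) = 1 and setting the covariances to zero for simplicity.
for v in (4.0, 1.0, 0.25, 0.0):
    print(v, round(eq23_slope(1.0, 0.0, 0.0, v), 3))
```

With these inputs the slope rises from 0.2 to 1.0 as the variance of y + g falls to zero, even though ωit = 0 throughout, which is the sense in which the coefficient tracks the timeliness of earnings rather than the timeliness of yit.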
Because the derivation in Eq. (23) focuses on the statistical properties of the relationship between earnings and returns, the analysis cannot speak to whether return endogeneity is problematic. This is because the bias in an estimator can only be evaluated relative to a structural parameter. As noted above, the BKN framework does not specify a structural equation between earnings and returns. More importantly, researchers investigating asymmetric timeliness are interested in the (asymmetric) timeliness of yit in earnings, which has the structural parameter p(y)it. With this objective in mind, reconsider Eq. (23): ωit = 0 by assumption, yet \(E\left [\hat {p(y)_{it}}\right ]{{\neq }}0\), which is an inconsistency. This inconsistency occurs because the underlying structural relationship, \(I=\left [p(y)_{it}\right ]R+\upsilon \), which contains the structural coefficient, p(y)it, regarding how yit affects Iit, and the structural error term, υit, is not part of the derivation. In addition, the substitution of Rit = xit + yit + git for Rit obscures that the explanatory variable, Rit, in the regression of earnings on returns is endogenous, as it becomes omitted from the derivation.
The different focus in Ball et al. (2013a) has important implications. To see this, closely examine Eq. (22) versus Eq. (23) when ωit = 0. The equations are equivalent, as the numerators of both equations equal σR,x (i.e., in slightly different notation for the numerator for Eq. (23), \(\sigma _{x_{it},(x_{it}+y_{it}+g_{it})}\) equals σR,x, as Rit = xit + yit + git) and denominators of both equal \({\sigma _{R}^{2}}\) (i.e., in slightly different notation for the denominator for Eq. (23), \(\sigma _{(x_{t}+y_{t}+g_{t}),(x_{t}+y_{t}+g_{t})}\) equals \({\sigma _{R}^{2}}\)). Ball et al. (2013a) fails to point out that, if ωit = 0 (for all i,t), then a consistent estimator is zero because no relationship exists between Iit and yit (proxied by Rit); however, empirical estimates to the contrary will arise because σR,x > 0. That is, when yit does not determine Iit (i.e., the Ball et al. (2013a) “no-conservatism” condition), Ball et al. (2013a)’s Eq. (8) demonstrates the properties of the endogeneity bias that arise because the unobservable variable xit is in the error term and is positively correlated with Rit.Footnote 39
In summary, the Ball et al. (2013a) derivations demonstrate only the statistical properties of the slope coefficient from a regression of earnings onto returns and not of a structural relationship (i.e., Fig. 1 shows there is not a direct link between Iit and Rit). Researchers testing for asymmetric timeliness are fundamentally interested in the structural coefficient between Iit and yit. Because it focuses on the statistical relationship between earnings and returns, the Ball et al. (2013a) analysis cannot speak to whether the earnings on returns regression leads to biased estimates of the structural coefficient between Iit and yit. For these reasons, the derivations in Ball et al. (2013a) do not support the assertion that causal relationships between earnings and transactions subject to asymmetric accounting recognition are being tested in empirical tests of earnings on returns.
The Ball et al. (2013a) claim of addressing sample truncation bias
Regarding the claim in Ball et al. (2013a) of addressing sample truncation, as we demonstrate in Eq. (12), the ATM has the following properties:
$$ \begin{array}{@{}rcl@{}} ATM Bias_{it}&= & \left\{ \left[p(R)_{it}+\frac{\left( 1-p(R)_{it}\right)\sigma_{R,x}-p(R)_{it}\sigma_{R,g}}{{\sigma_{R}^{2}}}\mid R_{it}<0\right]-\left[p(y)_{it}\mid y_{it}<c_{it}\right]\right\} \\ & & -\left\{ \left[p(R)_{it}+\frac{\left( 1-p(R)_{it}\right)\sigma_{R,x}-p(R)_{it}\sigma_{R,g}}{{\sigma_{R}^{2}}}\mid R_{it}\geq0\right]-\left[p(y)_{it}\mid y_{it}\geq c_{it}\right]\right\} . \end{array} $$
Subsection 3.4 describes the two conditions under which the ATM will not suffer from differential endogeneity bias that is a function of truncated distributions. Neither condition is trivial.
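To illustrate how partitioning on the sign of returns can produce differential bias, the following sketch simulates the no-conservatism case (ωit = 0) under two hypothetical distributions for y. It is only an illustration under stated assumptions: g is set to zero to keep the example short, x is standard normal, the lagged and noise terms of earnings are a single independent draw, and the right-skewed alternative for y is a centered exponential. The helper names `cov` and `basu_atm` are not from the original study.

```python
import random

def cov(a, b):
    """Sample covariance of two equal-length sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    return sum((u - ma) * (v - mb) for u, v in zip(a, b)) / (n - 1)

def basu_atm(I, R):
    """Bad-news slope minus good-news slope, splitting on the sign of R."""
    def slope(pairs):
        dep = [i for i, _ in pairs]
        reg = [r for _, r in pairs]
        return cov(reg, dep) / cov(reg, reg)
    bad = [(i, r) for i, r in zip(I, R) if r < 0]
    good = [(i, r) for i, r in zip(I, R) if r >= 0]
    return slope(bad) - slope(good)

rng = random.Random(1)
n = 50_000
x = [rng.gauss(0, 1) for _ in range(n)]
u = [rng.gauss(0, 0.5) for _ in range(n)]
I = [a + d for a, d in zip(x, u)]  # omega = 0: y never enters earnings

# Case 1 (assumed): y is normal, so E[x | R] is linear in R.
y_normal = [rng.gauss(0, 1) for _ in range(n)]
# Case 2 (assumed): y is right-skewed with mean 0 and variance 1.
y_skewed = [rng.expovariate(1.0) - 1.0 for _ in range(n)]

atm_normal = basu_atm(I, [a + b for a, b in zip(x, y_normal)])
atm_skewed = basu_atm(I, [a + b for a, b in zip(x, y_skewed)])
print(f"ATM, normal y: {atm_normal:.3f}   ATM, skewed y: {atm_skewed:.3f}")
```

In the normal case the two subsample slopes coincide up to sampling error, so the measured ATM is close to zero. In the skewed case the ratio of σR,x to the variance of returns differs across the two subsamples, and the regression reports a positive ATM although no conservatism is present by construction.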
How is it that Ball et al. (2013a) reaches the alternative conclusion that conditioning on returns does not induce bias? The analysis in Ball et al. (2013a) follows these steps:
- begin with \(E\left [\frac {\hat {\sigma }_{R,I}}{\hat {\sigma }_{R}^{2}}\right ]\),
- substitute \(I_{it}=x_{it}+\left [\omega _{it}\right ]y_{it}+\left [1-\omega _{it-1}\right ]y_{it-1}+g_{it-1}+\varepsilon _{it}-\varepsilon _{it-1}\) for Iit,
- substitute \(R_{it}=x_{it}+y_{it}+g_{it}\) for Rit, and
- assume that (i) ωit = 0 for all Rit, or alternatively, (ii) ωit = 1 when Rit < 0 and ωit = 0 when Rit ≥ 0.
This leads to the following characterizations of the bad and good news coefficients (see Ball et al. (2013a) Eqs. (A1) and (A2)):
$$ \begin{array}{@{}rcl@{}} E\left[\hat{\delta}_{1}^{bad}\right] & = & E\left[\frac{\hat{\sigma}_{R,I}}{\hat{\sigma}_{R}^{2}}\mid R_{it}<0\right] \\ & =&E\left[\frac{\hat{\sigma}_{(x_{it}+y_{it}+g_{it}),(x_{it}+\left[\omega_{it}\right]y_{it}+\left[1-\omega_{it-1}\right]y_{it-1}+g_{it-1}+\varepsilon_{it}-\varepsilon_{it-1})}}{\hat{\sigma}_{R}^{2}}\mid R_{it}<0\right] \\ & =&\left[\frac{\sigma_{x_{it},(x_{it}+y_{it}+g_{it})}+\sigma_{\left[\omega_{it}\right]y_{it},(x_{it}+y_{it}+g_{it})}}{{{\sigma}}_{R}^{2}}\mid R_{it}<0\right] \end{array} $$
(25)
$$ \begin{array}{@{}rcl@{}} E\left[\hat{\delta}_{1}^{good}\right] & = & E\left[\frac{\hat{\sigma}_{R,I}}{\hat{\sigma}_{R}^{2}}\mid R_{it}\geq0\right] \\ & =&E\left[\frac{\hat{\sigma}_{(x_{it}+y_{it}+g_{it}),(x_{it}+\left[\omega_{it}\right]y_{it}+\left[1-\omega_{it-1}\right]y_{it-1}+g_{it-1}+\varepsilon_{it}-\varepsilon_{it-1})}}{\hat{\sigma}_{R}^{2}}\mid R_{it}\geq0\right] \\ & =&\left[\frac{\sigma_{x_{it},(x_{it}+y_{it}+g_{it})}+\sigma_{\left[\omega_{it}\right]y_{it},(x_{it}+y_{it}+g_{it})}}{{{\sigma}}_{R}^{2}}\mid R_{it}\geq0\right]. \end{array} $$
(26)
Ball et al. (2013a) then invokes a “linearity assumption,” which requires returns and all of its components to share the same variance-covariance matrix for subsamples partitioned based on the sign of returns. This leads to the ATM coefficient being represented as (see Ball et al. (2013a) Eq. (A3), and notice that the Ball et al. (2013a) linearity assumption equates \({\sigma _{R}^{2}}\mid R_{it}<0\) and \({\sigma _{R}^{2}}\mid R_{it}\geq 0\)):
$$ \begin{array}{@{}rcl@{}} & E\left[\hat{\delta}_{1}^{bad}\right]-E\left[\hat{\delta}_{1}^{good}\right]= & \frac{\left[\sigma_{\left[\omega_{it}\right]y,(x_{it}+y_{it}+g_{it})}\mid R_{it}<0\right]-\left[\sigma_{\left[\omega_{it}\right]y,(x_{it}+y_{it}+g_{it})}\mid R_{it}\geq0\right]}{{\sigma_{R}^{2}}\mid R_{it}\geq0}.\\ \end{array} $$
(27)
As Ball et al. (2013a) indicates, the Basu ATM will equal zero, if ωit = 0 for all Rit, and will be positive, if ωit = 1 when Rit < 0 and ωit = 0 when Rit ≥ 0. The first condition of no conservatism can be seen in Eq. (27) as ωit = 0. The second condition leads to Eq. (27) collapsing to:
$$ \begin{array}{@{}rcl@{}} & E\left[\hat{\delta}_{1}^{bad}\right]-E\left[\hat{\delta}_{1}^{good}\right]= & \frac{\left[\sigma_{y_{it},(x_{it}+y_{it}+g_{it})}\mid R_{it}<0\right]}{{\sigma_{R}^{2}}\mid R_{it}\geq0}. \end{array} $$
(28)
Note that the right-hand side of Eq. (28) portrays the statistical properties of the ATM rather than demonstrating whether it is unbiased, relative to a structural coefficient of interest. Specifically, as before with the Ball et al. (2013a) representation of the properties of the earnings on returns coefficient, the underlying structural relationship (using our notation) \(I=\left [p(y)_{it}\right ]R+\upsilon \), which contains the structural coefficient, p(y)it, regarding how yit affects Iit, and the structural error term, υit, is not part of the derivation. In addition, the substitution of Rit = xit + yit + git for Rit again obscures that the explanatory variable, Rit, in the regression of earnings on returns is endogenous, as it is omitted from the derivation.
Consider closely the properties of the Ball et al. (2013a) representation of the ATM. First, under that representation, there is no sample truncation bias, because it is ruled out by the linearity assumption.Footnote 40 Specifically, the \(\sigma _{x_{it},(x_{it}+y_{it}+g_{it})}\) terms in Eqs. (25) and (26) demonstrate that differential sample truncation bias can exist across the bad and good news coefficients. For example, because \(\sigma _{x_{it},(x_{it}+y_{it}+g_{it})}\) equals σR,x, as Rit = xit + yit + git, differences in \(\left [\frac {\sigma _{R,x}}{{\sigma _{R}^{2}}}\mid R_{it}<0\right ]\) and \(\left [\frac {\sigma _{R,x}}{{\sigma _{R}^{2}}}\mid R_{it}\geq 0\right ]\) can exist. Ball et al. (2013a) rules out this possibility through the linearity assumption. We demonstrate this point in greater detail below.
Second, notice that differences in the application of conservative accounting methods through the use of different cutoffs arising from differences across mandatory accounting practices or by managerial discretion are not captured by the Ball et al. (2013a) representation. That is, by assumption ωit = 1 when Rit < 0 and ωit = 0 when Rit ≥ 0. This assumption guarantees perfect classification when using the sign of stock returns. As we demonstrate above, the assumption is restrictive and, under most conditions, is required for the differential biases for the bad and good news samples to perfectly offset. As this assumption guarantees that p(y)it must always equal either one or zero for the bad and good news subsamples respectively, we are unclear how recent empirical studies claim to be relying on the analysis in Ball et al. (2013a) but then move to empirically test for differences in the mandatory or voluntary application of conservative accounting practices.
Third, why will the ATM vary under the linearity and perfect classification assumptions in Ball et al. (2013a)? Closer examination of Eq. (28) indicates that the ATM should equal one by the assumptions that ωit = 1 when Rit < 0 and ωit = 0 when Rit ≥ 0. However, unless yit = Rit, the ATM will be downward biased. To separate ωit from the factors driving the bias, substitute yit = Rit − xit − git and Rit = xit + yit + git in Eq. (28) to obtain:
$$ \begin{array}{@{}rcl@{}} & E\left[\hat{\delta}_{1}^{bad}\right]-E\left[\hat{\delta}_{1}^{good}\right]= & \left[\frac{\sigma_{\left[\omega_{it}\right]R_{it}-x_{it}-g_{it},R_{it}}}{{{\sigma}}_{R}^{2}}\mid R_{it}<0\right] \\ & & =\left[\omega_{it}-\frac{\sigma_{R,x}+\sigma_{R,g}}{{{\sigma}}_{R}^{2}}\mid R_{it}<0\right] \\ & & =\left[1-\frac{\sigma_{R,x}+\sigma_{R,g}}{{{\sigma}}_{R}^{2}}\mid R_{it}<0\right]. \end{array} $$
(29)
As can be seen, while the ATM will be positive as claimed in Ball et al. (2013a), there exists a downward bias that grows with the importance of x and g relative to y. That is, even though the ATM should equal one by the assumptions regarding ωit, the coefficient estimate for the ATM will vary, due to the relative importance of the three components of returns. Ball et al. (2013a) discusses how the ATM is affected by the importance of x and g relative to y (Section 4) but does not characterize their importance as confounding the interpretation of the timeliness of y, as the ATM should always equal one under the Ball et al. (2013a) formulation.Footnote 41
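The attenuation described by Eq. (29) can be reproduced in a small simulation. The sketch below is illustrative only. It assumes independent normal components (so that the linearity assumption holds and σR,x and σR,g reduce to the variances of x and g), imposes perfect classification on the sign of returns, and collapses the lagged and noise terms of earnings into one independent draw. The function name `simulate_atm` and the parameter values are hypothetical.

```python
import random

def cov(a, b):
    """Sample covariance of two equal-length sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    return sum((u - ma) * (v - mb) for u, v in zip(a, b)) / (n - 1)

def ols_slope(dep, reg):
    """Slope from a univariate OLS regression of dep on reg."""
    return cov(reg, dep) / cov(reg, reg)

def simulate_atm(sd_x, sd_y, sd_g, n=60_000, seed=2):
    """Return (simulated ATM, value implied by Eq. (29) under independence)."""
    rng = random.Random(seed)
    bad_I, bad_R, good_I, good_R = [], [], [], []
    for _ in range(n):
        x, y, g = rng.gauss(0, sd_x), rng.gauss(0, sd_y), rng.gauss(0, sd_g)
        u = rng.gauss(0, 0.5)          # assumed stand-in for lagged terms
        R = x + y + g
        omega = 1.0 if R < 0 else 0.0  # perfect classification on sign of R
        I = x + omega * y + u
        if R < 0:
            bad_I.append(I)
            bad_R.append(R)
        else:
            good_I.append(I)
            good_R.append(R)
    atm = ols_slope(bad_I, bad_R) - ols_slope(good_I, good_R)
    var_R = sd_x ** 2 + sd_y ** 2 + sd_g ** 2
    eq29 = 1 - (sd_x ** 2 + sd_g ** 2) / var_R
    return atm, eq29

atm, eq29 = simulate_atm(1.0, 1.0, 1.0)
print(f"simulated ATM: {atm:.3f}   Eq. (29): {eq29:.3f}")
```

With equal variances for the three components, Eq. (29) implies an ATM of one third, although ωit equals one for every bad-news observation. Raising the assumed variances of x or g relative to y pushes the measured ATM further below one.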
The bias, represented by the last term in Eq. (29), arises solely because of the endogeneity of returns. To see this, consider the more general ATM representation based on our Eq. (12):
$$ \begin{array}{@{}rcl@{}} E\left[\hat{\delta}_{1}^{bad}\right]-E\left[\hat{\delta}_{1}^{good}\right]&= & \left[\left\{ p(R)_{it}+\frac{\left( 1-p(R)_{it}\right)\sigma_{R,x}-p(R)_{it}\sigma_{R,g}}{{\sigma_{R}^{2}}}\mid R_{it}<0\right\} \right]\\ & & -\left[\left\{ p(R)_{it}+\frac{\left( 1-p(R)_{it}\right)\sigma_{R,x}-p(R)_{it}\sigma_{R,g}}{{\sigma_{R}^{2}}}\mid R_{it}\geq0\right\} \right]. \end{array} $$
Now, following the analysis in Ball et al. (2013a), invoke these two assumptions: (i) the linearity assumption and (ii) perfect classification assumption based on the sign of stock returns. This reduces the above equation to:
$$ \begin{array}{@{}rcl@{}} E\left[\hat{p(y)_{it}}\mid R_{it}<0\right]-E\left[\hat{p(y)_{it}}\mid R_{it}\geq0\right]&= & \left[1\right]+\left[\frac{-\sigma_{R,g}}{{\sigma_{R}^{2}}}\mid R_{it}<0\right]\\ & & -\left[0\right]-\left[\frac{\sigma_{R,x}}{{\sigma_{R}^{2}}}\mid R_{it}\geq0\right]\\ & =&\left[1-\frac{\sigma_{R,x}+\sigma_{R,g}}{{{\sigma}}_{R}^{2}}\mid R_{it}<0\right]. \end{array} $$
Here, the ATM is still a function of the remaining endogeneity bias, despite invoking the two assumptions in Ball et al. (2013a). In addition, the absence of either assumption leads to a much more complicated depiction of the validity concerns faced by the empirical researcher, as given in Eq. (12). Further, note that Ball et al. (2013a)’s linearity assumption is not a new insight regarding the required conditions for the ATM to be unbiased. Rather, the assumption essentially serves the same purpose as Dietrich et al. (2007)’s conditions (i)–(iii) to rule out possible bias in ATMs. Accordingly, Ball et al. (2013a) cannot justify the assertion that “conditioning on [returns] does not induce bias” while simultaneously invoking the linearity assumption to mitigate the effects of the biases. Thus the conclusion in Ball et al. (2013a) that the analysis in the study “contradicts the claims of Dietrich, Muller, and Riedl [2007]” (p. 1083) is unfounded. Instead, notwithstanding the restrictive assumptions that are invoked in Ball et al. (2013a) to obtain its derivations and inferences, the asymmetric timeliness test is still affected by the remaining endogeneity bias that is a function of truncated distributions. Finally, and most importantly, notice that invoking the two assumptions in Ball et al. (2013a) leads to an ATM measure that cannot be used to justify empirical tests intended to measure variation in the asymmetric recognition of yit.Footnote 42