Review of Accounting Studies, Volume 19, Issue 2, pp 736–768

Measuring discretionary accruals: are ROA-matched models better than the original Jones-type models?

Authors

  • Edmund Keung
    • National University of Singapore
    • University of Windsor

DOI: 10.1007/s11142-013-9262-7

Cite this article as:
Keung, E. & Shih, M.S.H. Rev Account Stud (2014) 19: 736. doi:10.1007/s11142-013-9262-7

Abstract

Discretionary accruals estimated from Jones-type models are elevated or depressed for firms with extreme performance. Kothari et al. (J Acc Econ 39:163–197, 2005) propose performance matching to address the issue, that is, to difference discretionary accruals estimated from Jones-type models for treatment and control firms matched on current ROA. This study shows (1) performance matching will systematically cause discretionary accruals of either sign to be underestimated, and (2) the measurement error will be negatively correlated with the true discretionary accruals. As a result, using discretionary accruals estimated with performance matching to test whether certain events induce earnings management will increase the frequency of Type II errors, and using them as the dependent or an independent variable in regression analysis will bias the regression coefficient toward zero. The results of our empirical tests are consistent with these predictions.

Keywords

Earnings management; Performance matching; Jones-type models

JEL Classification

M41

1 Introduction

Dechow et al. (1995) show discretionary accruals estimated from the Jones and modified Jones models are higher (lower) than expected for firms with high (low) reported earnings. The result suggests Jones-type models may be misspecified for samples skewed toward firms with extreme performance. Kothari et al. (2005) propose performance matching to address the issue, i.e., to difference discretionary accruals estimated from Jones-type models for treatment firms and control firms matched on current year ROA. Performance matching on ROA has now become the “standard procedure.” Researchers almost invariably adopt the procedure when the sample firms’ average ROA is elevated or depressed, and many adopt it without first examining whether the sample is skewed (e.g., Blouin et al. 2007; Lawrence et al. 2011). This raises the question of whether performance matching is overused and applied in situations for which it is unsuitable.

Questions can also be raised about whether performance matching is clearly desirable for skewed samples. If the sample skewness is caused by earnings management by the firms in response to the hypothesized stimulus, it would be unnecessary to correct for the skewness (see Holthausen et al. 1995). The issue of necessity notwithstanding, it would be acceptable to adopt performance matching as the standard practice if its use did not affect the power of test significantly. This study finds that is not the case, however. We find that the control firm used for performance matching is likely to have more (less) abnormal accruals unrelated to the event of interest than the treatment firm when there is upward (downward) earnings management. With performance matching, therefore, upward (downward) earnings management will be measured with a negative (positive) error. Moreover, the measurement error will be negatively correlated with the true discretionary accruals. The result has two important implications. First, for studies on event-triggered earnings management, the frequencies of Type II errors will be higher with performance matching. Second, for studies that use discretionary accruals estimated with performance matching as a dependent or independent variable in regression analysis (e.g., Ayers et al. 2006; Ali et al. 2007; Jo and Kim 2007; Louis et al. 2008; Cohen and Zarowin 2010), the coefficient estimate will be biased toward zero. These problems do not arise when performance matching is not used.

The results of our empirical tests are consistent with the predictions. Similar to Kothari et al. (2005), we seed earnings management into archival data from Compustat to evaluate each accruals model’s power of test. But we use a different procedure to match the treatment and control firms on ROA to better reflect how researchers actually choose control firms in empirical research on earnings management issues. Our results show that performance matching (1) reduces the amount of discretionary accruals detected by 20–40 % and (2) reduces the power of test (the number of detected cases of earnings management) by 30–50 % for plausible magnitudes of earnings management. The decline in power is explained by both the measurement error we discuss in this paper and an increase in noise in the discretionary accruals estimate induced by performance matching.1 We also re-estimate two regression models employed in Ayers et al. (2006) and show the coefficient on discretionary accruals drops substantially in both cases with performance matching.

Our paper contributes to the literature in the following ways. First, while some authors speculate that performance matching reduces the power of test (e.g., Ayers et al. 2006; Dechow et al. 2012), we show why this happens and present comprehensive empirical results consistent with our theory. Second, we show that researchers should interpret their test results with caution. In particular, our results show it is often premature to conclude there is no earnings management when (1) mean discretionary accruals estimated with no performance matching are significant, and (2) mean discretionary accruals estimated with performance matching are insignificant. Lastly, our study suggests more research is needed to develop alternatives to ROA-matched accruals models. Recent developments (e.g., Ayers et al. 2006; Dechow et al. 2012) may be important steps in that direction.

The rest of the paper is organized as follows. Section 2 shows why ROA-matched models systematically underestimate discretionary accruals. Section 3 presents the empirical results. Section 4 contains the closing remarks.

2 Analysis

Performance matching has become the standard practice in empirical research on earnings management. In five accounting journals (Review of Accounting Studies, Contemporary Accounting Research, Journal of Accounting and Economics, Journal of Accounting Research, and The Accounting Review), there are at least 45 published empirical studies in the period 2005–2011 that use performance-matched discretionary accruals as the dependent variable in regression analysis and 19 as an independent variable. Most researchers do performance matching as a robustness check, and many describe the results in rather general terms, such as “the inferences are similar,” rather than reveal the changes in test results in detail. We find a number of studies that tabulate both sets of results. These are summarized in Table 1, together with a number of studies that use only ROA-matched estimates of discretionary accruals in their analyses.
Table 1 A sample of studies employing ROA-matched estimates of discretionary accruals

| Study | Research question | ROA for treatment and control firms | Performance matching used in main tests? | Performance matching used in robustness tests? | Comments |
| --- | --- | --- | --- | --- | --- |
| Cahan and Zhang (2006) | Examine whether, after Andersen’s demise, successor auditors require more income-reducing accruals for ex-Andersen clients | Not discussed in the paper. Table 3 shows mean and median ROA of ex-Andersen clients and other Big-4 clients, but no comparison is made | No | Yes | The primary test result is weaker when the ROA-matched measure of discretionary accruals is used (Table 6) |
| Menon and Williams (2004) | Examine whether firms that employ former audit partners as officers or directors have larger signed and unsigned abnormal accruals | Discussed. Table 2 shows that firms with former audit partners as employees have higher ROA on average | No | Yes | The primary test result is weaker when the ROA-matched measure of discretionary accruals is used (Tables 4 and 6) |
| Ayers et al. (2006) | Investigate whether estimated discretionary accruals differ between firms that meet or just beat annual earnings benchmarks and those that miss, and between firms residing in other pairs of adjacent bins of earnings, earnings changes, and earnings surprises | No data are reported | No | Yes | Results from using ROA-matched estimates of discretionary accruals from the modified Jones models are not tabulated. But the authors state they are not significant |
| Botsari and Meeks (2008) | Examine whether acquiring firms in share-for-share bids manage accruals prior to the bids | Not discussed. The authors only show that estimated discretionary accruals and ROA are correlated for the sample firms (Table 14). Moreover, no comparison in ROA is made between share-for-share bids and all-cash bids | No | Yes | The primary test result is weaker when the ROA-matched measure of discretionary accruals is used (Table 15) |
| Larcker et al. (2007) | Examine the association between corporate governance indices and abnormal accruals | Not discussed. But mean ROA is higher for the sample firms than for the average firm on Compustat (Table 1) | No | Yes | Results from using the ROA-matched measure of discretionary accruals are not tabulated. But it is reported that adjusted R² drops from 1.6 to 0.2 % |
| Jo and Kim (2007) | Examine the relation of disclosure frequency with earnings management for a sample of seasoned equity offerings (SEOs) | Whether the sample of SEO firms is skewed in terms of ROA is not discussed | Yes | No | Table 3 shows clear indications of earnings management by SEO firms in the quarter of SEO when one examines estimated discretionary accruals in the quarter. The signs vanish when the ROA-matched measure is used |
| Lawrence et al. (2011) | Examine whether audit quality differs between Big 4 and non-Big 4 audit firms | Table 1 shows no differences in ROA between Big 4 clients and non-Big 4 clients | Yes | N/A | The coefficient on BIG4 (test variable) is insignificant for the refined sample (Table 2) |
| Blouin et al. (2007) | Examine why certain former Arthur Andersen clients follow their AA audit teams to new audit firms and whether audit quality (as measured by discretionary accruals) changes after the switch | Not discussed in the paper. Table 2 reveals no difference in ROA between the followers and non-followers | Yes | N/A | Little difference in audit quality is found between followers and nonfollowers |
| Prawitt et al. (2009) | Investigate the relation of earnings management with the quality of the internal audit function (IAF) | Not mentioned in the paper | Yes | N/A | The coefficient on IAQuality (test variable) is insignificant for firms with positive accruals (Table 3). It is interesting that the authors seem to hold the view that performance matching reduces noise in estimated discretionary accruals (p. 1262) |
| Ali et al. (2007) | Investigate whether family firms have better earnings quality | Show family firms have higher ROA on average than nonfamily firms. Not only are discretionary accruals estimated with performance matching, but also ROA is included as an independent variable in the regression | Yes | N/A | The results depend on the regression specification. Some are not supportive of the primary prediction (Table 3, Panel C) |

One issue that Table 1 reveals is that many researchers seem somewhat careless about the use of performance matching. While the procedure was proposed as a solution to sample skewness, authors of many studies fail to show their samples are skewed in terms of ROA or even discuss the skewness issue. This raises the question of whether performance matching is overused and used in situations for which it is inappropriate. As stated at the outset, moreover, questions can be raised about whether performance matching is clearly desirable even in the situations for which it was designed, that is, when the sample is skewed. Since earnings management increases or decreases profit, there are reasonably good odds that the elevated or depressed mean ROA of sample firms in a study is caused by earnings management by the firms in response to the hypothesized stimulus. Stripping away the effect of earnings management, the sample would not be skewed. To the extent that is the case, Holthausen et al. (1995) argue, it is unnecessary to correct for the skewness by means such as performance matching.2

The decision on whether to adopt a research method should be based on a comparison of the costs and benefits. While it is unclear performance matching is necessary for every study with a skewed sample, it would be acceptable to adopt it as the standard practice if its use does not cause a significant loss of power. But the studies listed in Table 1 suggest that may not be the case. Those studies that report both sets of results invariably show the results from using ROA-matched estimates of discretionary accruals are weaker. Jo and Kim (2007), for example, show any signs of earnings management by SEO firms in the quarter of interest vanish when one estimates discretionary accruals with performance matching (see their Table 3). This suggests performance matching could reduce the power of test significantly. Moreover, many studies that use only ROA-matched estimates of discretionary accruals in their analyses generate insignificant test results. While the results in these studies may reflect a genuine lack of association between the test variables, performance matching may be the cause or at least a contributing factor. The problem is likely to be more severe than Table 1 suggests, given that studies that fail to produce significant test results are likely to be rejected for publication or even abandoned before submission to a journal.

If performance matching reduces the power of test, then the question to ask is why? Ayers et al. (2006) and Dechow et al. (2012) argue that performance matching increases noise in the discretionary accruals estimate. We argue there is another factor at play. We use Kothari et al.’s (2005) analytical framework to make our point. Kothari et al. argue that discretionary accruals estimated from Jones-type models for firms suspected of managing earnings in response to an event (e.g., a seasoned equity offer) include abnormal accruals of three different kinds: (1) discretionary accruals induced by the event of interest, (2) discretionary accruals induced by other events or firm-specific circumstances, and (3) performance-induced normal accruals.3 Kothari et al. (2005) argue abnormal accruals of the second and third kinds increase with ROA, and firms with high ROA will have more of these kinds of abnormal accruals than others. We therefore refer to the two kinds of abnormal accruals collectively as performance-related abnormal accruals (PRAA).

Kothari et al. assume that firms with the same ROA will have the same amount of PRAA. The assumption underlies their belief that differencing the estimates of discretionary accruals of the treatment and control firms matched on ROA will extract the discretionary accruals induced by the event of interest. Say a special event K induces certain firms, including Firm A, to manage earnings higher. Let Firm B have the same ROA as Firm A and serve as the control of Firm A. Kothari et al. argue that since Firm A and Firm B have the same ROA, they should have the same amount of PRAA. Since Event K induces Firm A to take additional discretionary accruals, they argue, “when the estimated discretionary accruals of the treatment and control firms are differenced, only discretionary accruals related to the event of interest remains” (Kothari et al. 2005, Sect. 2.3). Thus, “firms classified as having abnormally high or low levels of earnings management are those that manage more than expected given their level of performance.” Kothari et al. therefore conclude after examining potential problems with their models that “the key point here is that the power of test using performance-based discretionary accrual measure is not sacrificed so long as the researcher seeks to estimate the earnings management impact of the treatment event itself” (italics added) (Kothari et al. 2005, Sect. 2.3, p. 171).

However, for Kothari et al.’s differencing procedure to measure event-induced discretionary accruals accurately, it is critical that the treatment and control firms have the same amount of PRAA, as shown in Fig. 1. How often will that critical condition be met? Note that if the treatment firm were not induced by the special event to take extra discretionary accruals on top of its PRAA, the treatment firm would have lower ROA than the control firm. Thus the treatment firm is likely to have lower PRAA than the control firm. Differencing the estimated discretionary accruals of the treatment and control firms therefore will generate an estimate of discretionary accruals triggered by the special event lower than the true amount. An example will help clarify this point. Assume one group of firms each report an ROA of 20 % and another group each report an ROA of 25 %. Following Kothari et al.’s argument that a higher ROA leads to higher PRAA, we assume each firm with a 20 % (25 %) ROA has PRAA equal to 6 % (8.5 %) of lagged assets.4 Assume Firm A would report earnings of $20 and lagged total assets of $100, if it were not affected by Event K. Firm A’s ROA thus would be 20 %, and its PRAA equal 6 % of lagged assets.5 Event K induces Firm A to take $5 in additional accruals, or 5 % of lagged total assets. With the extra accruals induced by Event K, Firm A now reports earnings of $25, translating into a higher ROA of 25 %. To performance match, the researcher would choose a firm with an ROA of 25 % as the control for Firm A. To be consistent, we also call the control Firm B. Given its ROA of 25 %, Firm B’s PRAA would equal 8.5 % of lagged total assets. Since a Jones-type model would classify any unusual accruals of Firms A and B as discretionary, Kothari et al.’s differencing procedure would yield the outcome, as shown in Fig. 2, that the extra discretionary accruals that Event K induced Firm A to take equal 2.5 % (5 + 6 − 8.5 %), only one half of the true amount of 5 %.
Fig. 1 How the ROA-matched models ideally should work

Fig. 2 How the ROA-matched models actually work
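The arithmetic of the Firm A and Firm B example can be retraced in a few lines. This is only an illustrative sketch: the function and variable names are hypothetical, and the PRAA schedule (6 % and 8.5 % of lagged assets) is the assumed schedule from the example, not an estimate.

```python
# Worked example from the text, in percent of lagged total assets.
# The PRAA schedule is the assumption made in the example, not data.
PRAA_BY_ROA = {20.0: 6.0, 25.0: 8.5}


def jones_residual(event_da, praa):
    """Abnormal accruals that a Jones-type model would label discretionary."""
    return event_da + praa


def performance_matched_da(treatment_residual, control_residual):
    """Differencing step: treatment firm's estimate minus control firm's."""
    return treatment_residual - control_residual


true_eda = 5.0  # accruals induced by Event K

# Firm A's PRAA follows its ROA before the event-induced accruals (20 %).
firm_a = jones_residual(true_eda, PRAA_BY_ROA[20.0])

# Firm B is matched on Firm A's reported ROA (25 %) and has no event accruals.
firm_b = jones_residual(0.0, PRAA_BY_ROA[25.0])

measured = performance_matched_da(firm_a, firm_b)
print(measured, measured / true_eda)
```

Under these assumed numbers, the matched estimate recovers half of the seeded amount, as in Fig. 2.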

In cases where firms manipulate accruals upward in response to an event, the general formula for discretionary accruals as measured with performance matching (DA–PM) is:
$$ \text{DA--PM} = \text{EDA} + \text{PRAA}_{t} - \text{PRAA}_{c} + \eta $$
where EDA is event-induced discretionary accruals; PRAA_t is the treatment firm’s PRAA, which can be positive or negative; PRAA_c is the control firm’s PRAA; and η is a random variable with mean zero.6 Let u = PRAA_t − PRAA_c; then
$$ \text{DA--PM} = \text{EDA} + u + \eta. $$

Since PRAA_t is less than PRAA_c, u < 0.

Thus, when discretionary accruals are estimated with performance matching, the estimate contains two errors: one (η) is random and has a zero mean, and the other (u) is negative and therefore has a negative mean. More importantly, while η is uncorrelated with EDA, u is negatively correlated with EDA: the greater EDA is, the greater the gap between PRAA_t and PRAA_c and the more negative u is. Since the mean of η is zero and the mean of u is negative, the mean of DA–PM will be less than the mean of EDA. Thus, when accruals are manipulated upward in response to an event, the extent of earnings management tends to be underestimated with performance matching, leading to a high frequency of Type II errors.

What if the firms manipulate accruals downward? In this case, the general formula for event-induced discretionary accruals as measured with performance matching is also:
$$ \text{DA--PM} = \text{EDA} + \text{PRAA}_{t} - \text{PRAA}_{c} + \eta = \text{EDA} + u + \eta. $$

In this case, however, PRAA_t is greater than PRAA_c (since without the event-induced downward accruals manipulation, the treatment firm would have a higher ROA than the control firm), and therefore u (PRAA_t − PRAA_c) is positive. So the mean of DA–PM will be greater than the mean of EDA, which is negative. Thus the extent of earnings management will also be underestimated with performance matching in this case.

The underestimation problem is less likely to arise without performance matching. The general formula for discretionary accruals induced by an event as measured by an original Jones-type model with no performance matching would simply be:
$$ \text{DA--NP} = \text{EDA} + \text{PRAA}_{t} + \epsilon $$
where ε is a random variable with a zero mean.7 Sample firms in many empirical studies have an elevated or depressed average ROA. As stated earlier, the elevated or depressed average ROA may be the result of earnings management, rather than the sample firms’ good or poor operating performance. In that case, PRAA_t is as likely to be positive as negative for firms in the sample and therefore the sample mean of PRAA_t is likely to be zero or near zero.8 Thus, we can rewrite DA–NP as the sum of EDA and one single zero-mean variable e (PRAA_t + ε):
$$ \text{DA--NP} = \text{EDA} + e $$
where e has a zero mean and zero correlation with EDA. Thus the mean of DA–NP is likely to be close to that of EDA, and the frequency of Type II errors is likely to be lower when there is no performance matching. In the example shown in Fig. 2, if all the sample firms manage earnings up by 5 % in response to Event K, the estimated extent of earnings management would average 5 % without performance matching, instead of 2.5 % with performance matching.
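One special case makes the two error terms concrete. Suppose, purely as an illustrative assumption that the text does not make, that PRAA is linear in ROA with slope k (0 < k < 1), that ROA is scaled by lagged assets, and that the control is matched exactly on reported ROA. The treatment firm’s premanaged ROA then falls short of the control’s ROA by exactly EDA, so that

$$ u = \text{PRAA}_{t} - \text{PRAA}_{c} = -k\,\text{EDA}, \qquad \text{DA--PM} = (1 - k)\,\text{EDA} + \eta, \qquad \operatorname{Cov}(u, \text{EDA}) = -k\,\operatorname{Var}(\text{EDA}) < 0. $$

In the numerical example above, k = (8.5 − 6)/(25 − 20) = 0.5, which gives DA–PM = 0.5 × 5 % = 2.5 %. Under the same assumption, the shrinkage toward zero applies to EDA of either sign.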

We therefore propose two testable hypotheses:

Hypothesis 1

To the extent that a sample’s skewness is explained by earnings management by the sample firms, performance matching will increase measurement error in estimated discretionary accruals.

Hypothesis 2

Measurement error induced by performance matching is negatively correlated with the true discretionary accruals.

2.1 For regression analyses

Besides being used to detect event-induced earnings management, discretionary accruals estimated with performance matching are often used as an independent variable (e.g., Jo and Kim 2007; Louis et al. 2008; Cohen and Zarowin 2010) or the dependent variable (e.g., Menon and Williams 2004; Ali et al. 2007; Jo and Kim 2007) in regression analysis.

Performance matching is likely to reduce the power of test in such studies as well. That is because whether estimated discretionary accruals are placed on the left- or right-hand side of the regression equation, the goal is to investigate empirically the relation of discretionary accruals with one or multiple other variables. Our earlier analysis shows that performance matching induces measurement error in estimated discretionary accruals and the measurement error is negatively correlated with the true amount of discretionary accruals. Thus any relation between estimated discretionary accruals and another variable is likely to be weakened by performance matching. Therefore the magnitude of the estimated coefficient will be smaller with performance matching, again increasing the risk of a Type II error. We present the mathematical proofs of the results in the “Appendix”. We therefore make the following prediction:

Hypothesis 3

Performance matching will weaken the relationship of estimated discretionary accruals with other variables in regression analysis.
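A small simulation can illustrate the dependent-variable case of Hypothesis 3. It is a sketch under stated assumptions rather than the Appendix proof: the error u is taken to be −k·EDA with k = 0.5 (the ratio implied by the numerical example in Sect. 2), true discretionary accruals are driven by a single hypothetical regressor x with coefficient beta, and all names and parameter values are illustrative.

```python
import random


def ols_slope(x, y):
    """Slope from a simple OLS regression of y on x (with an intercept)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


def simulate(n=5000, beta=1.0, k=0.5, noise_sd=0.1, seed=0):
    """Return the slopes obtained without and with performance matching."""
    rng = random.Random(seed)
    x = [rng.random() for _ in range(n)]
    eda = [beta * xi for xi in x]  # true discretionary accruals
    # No matching: EDA plus zero-mean noise.
    da_np = [e + rng.gauss(0.0, noise_sd) for e in eda]
    # Matching: EDA plus u = -k * EDA (assumed form) plus zero-mean noise.
    da_pm = [e - k * e + rng.gauss(0.0, noise_sd) for e in eda]
    return ols_slope(x, da_np), ols_slope(x, da_pm)


slope_np, slope_pm = simulate()
print(round(slope_np, 2), round(slope_pm, 2))
```

With these assumed parameters, the slope on x should be close to beta without matching and close to (1 − k)·beta with matching. The sketch does not cover the case in which discretionary accruals appear as an independent variable.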

3 Empirical tests

We next empirically examine the bias that is likely to be induced by performance matching. The question is important as the answer reveals the cost to researchers from employing the procedure. Note that Kothari et al. (2005) did compare the frequency of Type II errors across models. But their research design causes the frequencies of Type II errors with performance matching to be understated. They select firms with ROA closest to the treatment firms’ original ROA as the controls (as opposed to firms with ROA closest to the ROA the treatment firms would report after manipulation). Say a treatment firm has an “honest” ROA of 10 % and manages earnings upward to 15 % by taking discretionary accruals equal to 5 %. The control that Kothari et al. would choose for this treatment firm is the firm with ROA closest to 10 %, not that with ROA closest to 15 %.9 This matching procedure can only be implemented in simulations but not in real empirical research. To implement this matching procedure in real empirical research, researchers would need to know the treatment firms’ discretionary accruals (5 % in the example above), which are unknown. Given this problem, all researchers to the best of our knowledge choose firms with ROA closest to the treatment firms’ reported ROA (15 % in the above example) as the control. Thus frequencies of Type II errors in empirical studies that performance match are likely to be higher than numbers reported by Kothari et al. (2005), given that firms with higher ROA tend to have higher PRAA. We use a more realistic procedure to select control firms for performance matching. Our analysis covers both quarterly and annual data, moreover, as many authors also use Jones-type models to estimate quarterly discretionary accruals (e.g., Abarbanell and Lehavy 2003; Defond and Park 2001; Matsumoto 2002).
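The difference between the two control-selection rules can be sketched as follows. The candidate firms and their ROA values are hypothetical; only the 10 % premanaged ROA and the 5 % discretionary accruals come from the example in the text.

```python
def pick_control(target_roa, candidates):
    """Return the (firm_id, roa) pair whose ROA is closest to target_roa.

    candidates is assumed to hold firms from the same industry and period.
    """
    return min(candidates, key=lambda c: abs(c[1] - target_roa))


# Hypothetical pool of potential control firms.
candidates = [("C1", 0.09), ("C2", 0.12), ("C3", 0.16)]

premanaged_roa = 0.10                            # "honest" ROA
discretionary_accruals = 0.05                    # scaled by lagged assets
reported_roa = premanaged_roa + discretionary_accruals

# Matching on premanaged ROA is feasible only when the seed is known.
control_in_simulation = pick_control(premanaged_roa, candidates)

# Matching on reported ROA is what an empirical researcher can do.
control_in_practice = pick_control(reported_roa, candidates)

print(control_in_simulation[0], control_in_practice[0])
```

In this made-up pool the two rules select different firms, and the firm selected on reported ROA has the higher ROA.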

Following prior research (e.g., Dechow et al. 1995; Kothari et al. 2005; Ecker et al. 2011), we seed earnings management into financial data of firms randomly drawn from Compustat to compare power across models. With randomly drawn samples, we are able to see the effect of performance matching for samples skewed in terms of ROA as a result of earnings management but unskewed in terms of genuine operating performance. We report the results for annual and quarterly data separately.

3.1 Annual models

We estimate annual discretionary accruals from the modified Jones model:
$$ TTAC_{it} /AST_{it - 1} = \beta_{0} (1/AST_{it - 1} ) \, + \beta_{1} [(\varDelta REV_{it} - \varDelta REC_{it})/AST_{it - 1} ] + \beta_{2} (PPE_{it} /AST_{it - 1} ) + \varepsilon_{it} $$
where TTAC_{it} is total accruals taken by firm i in year t; AST_{it−1} is total assets of firm i at the end of year t − 1; ΔREV_{it} is the change in net revenues for firm i in year t from year t − 1; ΔREC_{it} is the change in receivables in year t from year t − 1; PPE_{it} is the gross book value of property, plant and equipment of firm i at the end of year t; and ε_{it} is the error term.
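The estimation step can be sketched in plain Python as an OLS fit with the three scaled regressors and no separate intercept, with the residuals serving as the discretionary-accruals estimates. The function names and the dictionary keys (`ttac`, `ast_lag`, `d_rev`, `d_rec`, `ppe`) are hypothetical, and the input is assumed to hold the firms of a single estimation group.

```python
def solve_linear(a, b):
    """Solve a small square system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        # Partial pivoting: bring the largest remaining entry onto the diagonal.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [vr - factor * vc for vr, vc in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def modified_jones_residuals(firms):
    """Fit the modified Jones regression; return (betas, residuals)."""
    rows = [[1.0 / f["ast_lag"],
             (f["d_rev"] - f["d_rec"]) / f["ast_lag"],
             f["ppe"] / f["ast_lag"]] for f in firms]
    y = [f["ttac"] / f["ast_lag"] for f in firms]
    k = 3
    # Normal equations (X'X) beta = X'y.
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    betas = solve_linear(xtx, xty)
    residuals = [yi - sum(b * v for b, v in zip(betas, r))
                 for r, yi in zip(rows, y)]
    return betas, residuals
```

Solving the normal equations directly is adequate for a three-regressor illustration; a production analysis would more likely rely on a statistics library.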

We estimate accruals using both the balance sheet approach and the income statement approach. Under the income statement approach, total accruals are income before extraordinary items (Compustat Data18) minus net cash flow from operating activities (Data308). Under the balance sheet approach, total accruals are the change in noncash current assets minus the change in current liabilities excluding the current portion of long-term debt, minus depreciation and amortization, i.e., (ΔData4 − ΔData1 − ΔData5 + ΔData34 − Data14).
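The two definitions translate directly into code. In this sketch the function names and the dictionary layout are hypothetical, and the item meanings in the comments are read off the verbal definition and the formula in the text.

```python
def total_accruals_income_statement(income_before_extra, operating_cash_flow):
    """Data18 minus Data308."""
    return income_before_extra - operating_cash_flow


def total_accruals_balance_sheet(current, previous, dep_amort):
    """(dData4 - dData1 - dData5 + dData34 - Data14).

    current and previous map Compustat item names to values at the end of
    year t and year t - 1; dep_amort is Data14 for year t.
    """
    def change(item):
        return current[item] - previous[item]

    return (change("data4")      # current assets
            - change("data1")    # cash
            - change("data5")    # current liabilities
            + change("data34")   # current portion of long-term debt
            - dep_amort)         # depreciation and amortization


year_t = {"data4": 150.0, "data1": 30.0, "data5": 80.0, "data34": 12.0}
year_t_minus_1 = {"data4": 100.0, "data1": 20.0, "data5": 60.0, "data34": 10.0}
bs_accruals = total_accruals_balance_sheet(year_t, year_t_minus_1, 15.0)
is_accruals = total_accruals_income_statement(12.0, 20.0)
print(bs_accruals, is_accruals)
```

The input figures are invented solely to exercise the formulas.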

The analysis for accruals estimated from balance sheet (income statement) accounts is based on data for firm-years in 1985–2006 (1989–2006).10 The preliminary sample for the balance sheet (income statement) approach includes 484,620 (407,394) firm-years. We exclude firms in financial and utilities industries (SIC 6000–7000 and SIC 4900–5000). Firm-years with missing data are also deleted, as are firm-years with total accruals in the top or bottom percentile, to remove potential outliers. Finally, we drop industry-years in which there are fewer than 20 observations.11 Table 2 shows the final sample size for each year and each method used to calculate total accruals. Descriptive statistics for the annual samples are reported in Table 3 (Panel A).
Table 2 Sample sizes

| Year | Annual models, balance sheet approach | Annual models, income statement approach | Quarterly models, balance sheet approach | Quarterly models, income statement approach |
| --- | --- | --- | --- | --- |
| 1985 | | | | |
| 1986 | 5,292 | | 15,152 | |
| 1987 | 5,478 | | 16,427 | |
| 1988 | 5,380 | | 17,146 | |
| 1989 | 5,250 | | 17,194 | 7,302 |
| 1990 | 5,252 | 5,224 | 16,849 | 13,781 |
| 1991 | 5,321 | 5,335 | 17,027 | 13,924 |
| 1992 | 5,539 | 5,562 | 17,820 | 14,224 |
| 1993 | 5,975 | 5,998 | 19,124 | 14,848 |
| 1994 | 6,220 | 6,235 | 20,182 | 15,279 |
| 1995 | 6,495 | 6,538 | 20,759 | 16,171 |
| 1996 | 7,311 | 7,396 | 22,584 | 20,156 |
| 1997 | 7,394 | 7,483 | 23,526 | 22,631 |
| 1998 | 7,105 | 7,192 | 23,295 | 22,526 |
| 1999 | 7,379 | 7,534 | 24,021 | 22,846 |
| 2000 | 7,256 | 7,400 | 25,058 | 23,296 |
| 2001 | 6,922 | 7,057 | 24,020 | 14,873 |
| 2002 | 6,535 | 6,638 | 22,991 | 13,798 |
| 2003 | 6,268 | 6,348 | 21,805 | 12,982 |
| 2004 | 6,086 | 6,146 | 21,462 | 12,928 |
| 2005 | 5,824 | 5,872 | 20,974 | 12,777 |
| 2006 | 4,343 | 4,362 | 18,820 | 11,373 |
| Total | 128,625 | 108,320 | 436,866 | 285,715 |

Table 3 Descriptive statistics

| Variable | Mean | SD | 10th percentile | Median | 90th percentile |
| --- | --- | --- | --- | --- | --- |
| Panel A: Annual data | | | | | |
| TTAC_{it}/AST_{it−1} (BS) | −0.046 | 0.163 | −0.188 | −0.046 | 0.098 |
| TTAC_{it}/AST_{it−1} (IS) | −0.107 | 0.285 | −0.278 | −0.063 | 0.066 |
| 1/AST_{it−1} | 0.107 | 0.351 | 0.001 | 0.013 | 0.236 |
| (ΔREV_{it} − ΔREC_{it})/AST_{it−1} | 0.132 | 0.378 | −0.171 | 0.068 | 0.503 |
| PPE_{it}/AST_{it−1} | 0.333 | 0.289 | 0.050 | 0.250 | 0.750 |
| ROA_{it} | −0.113 | 0.500 | −0.453 | 0.020 | 0.141 |
| Panel B: Quarterly data | | | | | |
| TTAC_{it}/AST_{it−1} (BS) | −0.014 | 0.073 | −0.081 | −0.012 | 0.053 |
| TTAC_{it}/AST_{it−1} (IS) | −0.027 | 0.088 | −0.098 | −0.016 | 0.042 |
| 1/AST_{it−1} | 0.104 | 0.340 | 0.001 | 0.013 | 0.234 |
| (ΔREV_{it} − ΔREC_{it})/AST_{it−1} | 0.003 | 0.073 | −0.061 | 0.002 | 0.069 |
| PPE_{it}/AST_{it−1} | 0.304 | 0.244 | 0.048 | 0.235 | 0.698 |
| ROA_{it} | −0.027 | 0.114 | −0.118 | 0.005 | 0.035 |

TTAC_{it} is total accruals of firm i in year or quarter t, calculated with the balance sheet (BS) or income statement (IS) approach; AST_{it−1} is total assets at the end of year or quarter t − 1; ΔREV_{it} is the change in net revenues for firm i in year or quarter t from year or quarter t − 1; ΔREC_{it} is the change in receivables in year or quarter t from year or quarter t − 1; PPE_{it} is the gross book value of property, plant, and equipment of firm i at the end of year or quarter t; ROA_{it} is return on assets in year or quarter t

To calculate the frequency of Type II errors, we perform 250 iterations for each set of parameters, each iteration consisting of the following steps. First, a sample of 100 firm-years is randomly drawn without replacement from the original full multi-year sample. An amount of extra accruals is added to the actual total accruals of each of the 100 firm-years. Similar to Kothari et al., the seeded discretionary accruals range from −10 to +10 % of lagged total assets. The sample of 100 firm-years with seeded discretionary accruals is then returned to the full sample. Next, we run an OLS regression based on the modified Jones model for firms in each year-industry (two-digit SIC) combination, with regression residuals for the 100 treatment firm-years serving as estimates of their discretionary accruals. An additional step is taken if the analysis calls for performance matching: the estimated discretionary accruals of each treatment firm-year are reduced by those of the firm in the same year-industry chosen as the control.
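The sampling and seeding step of one iteration might look like the sketch below. The function name, the list-based data layout, and the use of Python’s `random` module are assumptions made for illustration; the subsequent re-estimation and differencing steps are not shown.

```python
import random


def seed_sample(scaled_accruals, seed_pct, sample_size=100, rng=None):
    """Draw treatment firm-years without replacement and seed their accruals.

    scaled_accruals: total accruals over lagged assets, one entry per
    firm-year in the full sample. seed_pct: seeded discretionary accruals
    in percent of lagged total assets (e.g. 4.0 or -4.0).
    Returns (treated_indices, seeded_copy); the input list is left unchanged.
    """
    rng = rng or random.Random()
    # random.sample draws distinct positions, i.e. without replacement.
    treated = rng.sample(range(len(scaled_accruals)), sample_size)
    seeded = list(scaled_accruals)
    for i in treated:
        seeded[i] += seed_pct / 100.0
    return treated, seeded


full_sample = [0.0] * 1000  # placeholder accruals for 1,000 firm-years
treated, seeded = seed_sample(full_sample, 4.0, rng=random.Random(1))
print(len(treated), sum(1 for v in seeded if v != 0.0))
```

Passing an explicit `random.Random` instance keeps each of the 250 iterations reproducible.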

Following Kothari et al. (2005), we assume earnings management is partly revenue based. Therefore, if x% is added to the reported accruals of the treatment firms, one half of x% is added to both the change in revenues and that of accounts receivable.
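One literal reading of this seeding rule is sketched below. The function name and the dictionary keys are hypothetical, and amounts are kept in currency units so that x% of lagged total assets is explicit.

```python
def seed_revenue_based(firm, x_pct):
    """Return a copy of firm with x_pct of lagged assets seeded into accruals.

    firm is assumed to hold ttac, d_rev, d_rec and ast_lag in currency units.
    Half of the seeded amount is added to the change in revenues and half to
    the change in receivables.
    """
    amount = x_pct / 100.0 * firm["ast_lag"]
    seeded = dict(firm)
    seeded["ttac"] = firm["ttac"] + amount
    seeded["d_rev"] = firm["d_rev"] + amount / 2.0
    seeded["d_rec"] = firm["d_rec"] + amount / 2.0
    return seeded


firm = {"ttac": -5.0, "d_rev": 10.0, "d_rec": 2.0, "ast_lag": 100.0}
seeded_firm = seed_revenue_based(firm, 4.0)
print(seeded_firm["ttac"], seeded_firm["d_rev"], seeded_firm["d_rec"])
```

Under this reading the regressor ΔREV − ΔREC is unchanged by the seeding, because the same amount is added to both terms.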

We report our test results in Table 4. Results for total accruals calculated from balance sheet (income statement) accounts are reported in Panel A (B). Within each panel, we first report results with no performance matching, followed by results with performance matching where the control firm is chosen based on the treatment firm’s original ROA plus the seeded discretionary accruals (reported ROA). To show Kothari et al.’s (2005) procedure of choosing control firms leads to an underestimation of the frequency of Type II errors (i.e., overstates power), we also report results when that procedure is used (performance matching − on premanaged ROA). Each set of results includes the percentage of the 250 samples with significant t values of average estimated discretionary accruals (power of test).12 Since the t statistic used in the power of test calculation is affected by both the mean and standard deviation of estimated discretionary accruals, also reported are mean discretionary accruals estimated (estimated DA), mean standard deviation of the estimate (SD), and the mean ratio of mean to standard deviation (power ratio).
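The reported quantities can be computed along the following lines. This is a sketch with hypothetical function names. The critical value of 1.66 (roughly a one-tailed 5 % test with 99 degrees of freedom) is an assumption, since the significance level is not stated in this passage. The two relative-change formulas are inferred from the tabulated numbers, with which they appear consistent, rather than taken from a stated definition.

```python
import math


def mean_and_sd(values):
    """Sample mean and standard deviation (n - 1 in the denominator)."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / (n - 1)
    return mean, math.sqrt(var)


def t_statistic(values):
    """One-sample t statistic for the null hypothesis of a zero mean."""
    mean, sd = mean_and_sd(values)
    return mean / (sd / math.sqrt(len(values)))


def power_of_test(samples, critical_t=1.66, positive=True):
    """Percentage of samples whose one-sided t test rejects a zero mean."""
    hits = 0
    for sample in samples:
        t = t_statistic(sample)
        if (positive and t > critical_t) or (not positive and t < -critical_t):
            hits += 1
    return 100.0 * hits / len(samples)


def power_ratio(values):
    """Mean estimated discretionary accruals over their standard deviation."""
    mean, sd = mean_and_sd(values)
    return mean / sd


def power_decline(power_no_matching, power_matched):
    """Relative loss of power from matching, in percent."""
    return 100.0 * (power_no_matching - power_matched) / power_no_matching


def power_overstatement(power_premanaged, power_reported):
    """Relative overstatement from matching on premanaged ROA, in percent."""
    return 100.0 * (power_premanaged - power_reported) / power_reported


# Figures for a 1 % seed under the balance sheet approach.
print(round(power_decline(19.20, 10.40), 2))
print(round(power_overstatement(13.20, 10.40), 2))
```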
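Table 4

Power of test comparisons for annual data

| Seed (%) | No performance matching: Power of test (%) | No performance matching: Estimated DA (%) | No performance matching: SD | No performance matching: Power ratio (%) | Performance matching (on reported ROA): Power of test (%) | Performance matching (on reported ROA): Power decline (%) | Performance matching (on reported ROA): Estimated DA (%) | Performance matching (on reported ROA): SD | Performance matching (on reported ROA): Power ratio (%) | Performance matching (on premanaged ROA): Power of test (%) | Performance matching (on premanaged ROA): Overstated by (%) | Performance matching (on premanaged ROA): Estimated DA (%) | Performance matching (on premanaged ROA): SD | Performance matching (on premanaged ROA): Power ratio (%) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Panel A: Balance sheet approach | | | | | | | | | | | | | | |
| Testing the hypothesis that discretionary accruals are positive | | | | | | | | | | | | | | |
| 0 | 6.40 | −0.03 | 0.147 | −0.11 | 2.80 | | −0.29 | 0.206 | −1.51 | 2.80 | | −0.29 | 0.206 | −1.51 |
| 1 | 19.20 | 0.91 | 0.148 | 6.47 | 10.40 | 45.83 | 0.75 | 0.203 | 3.85 | 13.20 | 26.92 | 0.93 | 0.204 | 4.78 |
| 2 | 45.60 | 2.06 | 0.146 | 14.62 | 23.60 | 48.25 | 1.62 | 0.203 | 8.34 | 32.80 | 38.98 | 2.12 | 0.202 | 10.94 |
| 4 | 84.80 | 3.89 | 0.146 | 27.50 | 47.20 | 44.34 | 3.23 | 0.203 | 16.26 | 62.00 | 31.36 | 3.79 | 0.201 | 19.31 |
| 6 | 98.00 | 5.66 | 0.146 | 40.26 | 77.60 | 20.82 | 4.87 | 0.206 | 24.13 | 88.80 | 14.43 | 5.58 | 0.201 | 28.38 |
| 10 | 100.00 | 9.77 | 0.146 | 69.02 | 98.80 | 1.20 | 8.50 | 0.207 | 41.81 | 100.00 | 1.21 | 9.84 | 0.201 | 49.84 |
| Testing the hypothesis that discretionary accruals are negative | | | | | | | | | | | | | | |
| −0 | 4.00 | 0.19 | 0.147 | 1.31 | 5.20 | | 0.38 | 0.202 | 1.92 | 5.20 | | 0.38 | 0.202 | 1.92 |
| −1 | 19.20 | −0.97 | 0.147 | −6.60 | 10.40 | 45.83 | −0.83 | 0.204 | −4.11 | 13.20 | 26.92 | −0.84 | 0.201 | −4.17 |
| −2 | 35.60 | −1.81 | 0.145 | −12.89 | 20.00 | 43.82 | −1.63 | 0.202 | −8.30 | 23.60 | 18.00 | −1.79 | 0.203 | −9.00 |
| −4 | 82.40 | −3.82 | 0.144 | −26.98 | 49.60 | 39.81 | −3.17 | 0.200 | −16.29 | 62.80 | 26.61 | −3.78 | 0.197 | −19.71 |
| −6 | 98.00 | −5.73 | 0.145 | −40.54 | 73.20 | 25.31 | −4.82 | 0.201 | −24.49 | 85.20 | 16.39 | −5.75 | 0.200 | −29.44 |
| −10 | 100.00 | −9.70 | 0.147 | −67.90 | 99.20 | 0.80 | −8.46 | 0.211 | −40.77 | 100.00 | 0.81 | −10.08 | 0.202 | −50.88 |
| Panel B: Income statement approach | | | | | | | | | | | | | | |
| Testing the hypothesis that discretionary accruals are positive | | | | | | | | | | | | | | |
| −0 | 7.60 | 0.10 | 0.230 | 0.23 | 4.80 | | −0.07 | 0.268 | −0.27 | 4.80 | | −0.07 | 0.268 | −0.27 |
| 1 | 18.40 | 0.80 | 0.235 | 5.51 | 6.40 | 65.22 | 0.46 | 0.278 | 2.01 | 6.80 | 6.25 | 0.70 | 0.276 | 2.96 |
| 2 | 30.80 | 1.81 | 0.243 | 10.57 | 13.60 | 55.84 | 1.32 | 0.284 | 5.12 | 17.60 | 29.41 | 1.78 | 0.285 | 6.77 |
| 4 | 57.60 | 3.77 | 0.229 | 20.18 | 32.00 | 44.44 | 2.87 | 0.269 | 11.45 | 43.20 | 35.00 | 3.63 | 0.270 | 14.35 |
| 6 | 76.40 | 5.78 | 0.233 | 29.15 | 48.40 | 36.65 | 4.13 | 0.273 | 15.89 | 67.20 | 38.84 | 5.64 | 0.274 | 21.89 |
| 10 | 94.33 | 9.54 | 0.230 | 47.29 | 80.00 | 15.19 | 7.07 | 0.274 | 27.51 | 94.80 | 18.50 | 9.45 | 0.269 | 37.58 |
| Testing the hypothesis that discretionary accruals are negative | | | | | | | | | | | | | | |
| −0 | 5.20 | 0.10 | 0.230 | 0.23 | 5.20 | | −0.07 | 0.268 | −0.27 | 5.20 | | −0.07 | 0.268 | −0.27 |
| −1 | 9.20 | −1.18 | 0.243 | −3.37 | 6.80 | 26.09 | −0.84 | 0.280 | −2.93 | 12.40 | 6.90 | −1.20 | 0.285 | −4.44 |
| −2 | 18.80 | −2.10 | 0.242 | −7.70 | 13.20 | 29.79 | −1.60 | 0.281 | −5.94 | 16.40 | 24.24 | −1.80 | 0.282 | −6.99 |
| −4 | 50.40 | −3.83 | 0.235 | −16.00 | 32.80 | 34.92 | −3.29 | 0.283 | −12.34 | 53.20 | 62.20 | −4.48 | 0.280 | −17.16 |
| −6 | 82.80 | −5.55 | 0.228 | −24.68 | 50.80 | 38.65 | −4.11 | 0.274 | −16.15 | 68.00 | 33.86 | −5.62 | 0.273 | −22.43 |
| −10 | 99.60 | −9.54 | 0.233 | −42.96 | 82.00 | 17.67 | −7.01 | 0.284 | −26.09 | 93.20 | 13.66 | −9.94 | 0.277 | −38.04 |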

We test the power of test of three model specifications: (1) the original modified Jones model with no performance matching; (2) the modified Jones model with performance matching, where the control firm is one in the same industry with ROA closest to the treatment firm’s reported ROA (original ROA plus the seeded discretionary accruals); and (3) modified Jones model with performance matching where the control firm is chosen using Kothari et al.’s procedure (on premanaged ROA)

Variable definitions: Power of test = the percentage of random samples with t-values of mean estimated discretionary accruals above the threshold; Estimated DA = mean estimated discretionary accruals; SD = mean standard deviation of estimated discretionary accruals; Power ratio = estimated DA/SD; Power decline = 1 − power of test with performance matching/power of test with no performance matching; Overstated by = power of test with performance matching (on premanaged ROA)/power of test with performance matching (on reported ROA) − 1

Overall, the results in Table 4 suggest that, consistent with our prediction, researchers incur a significant cost from performance matching. The results presented in the upper part of Panel B (income statement approach) serve as a good example. With no performance matching, 18.4, 30.8, 57.6, 76.4, and 94.3 % of the samples are flagged as affected by upward earnings management at the 1, 2, 4, 6, and 10 % seed levels respectively. With performance matching (on reported ROA), the percentages drop to 6.4, 13.6, 32, 48.4, and 80 % respectively. The power declines are non-trivial. The decline from 30.8 to 13.6 % at the 2 % seed level is 55.8 % in relative terms (1 − 13.6/30.8). One may consider 1, 2, and 4 % as more plausible magnitudes of earnings management. At these magnitudes, the power declines fall mostly in the range of 30–50 %. With any reasonable estimate of the cost of Type II errors, these degrees of power decline are likely to be economically significant.

Turning to the results reported for “performance matching − on premanaged ROA”, one sees that, consistent with our prediction, Kothari et al.’s (2005) way of choosing control firms invariably causes the power of the ROA-matched model to be overstated. At the 6 % seed level in the upper part of Panel B, this way of choosing control firms would inflate the power of test to 67.2 %, exceeding the true power of 48.4 % by 38.8 % (67.2/48.4 − 1).
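The two relative measures used in this discussion follow the definitions in the notes to Table 4 and can be reproduced with a few lines of arithmetic. The function names are ours; the inputs are the Panel B figures quoted in the text.

```python
def power_decline(power_no_matching, power_matched):
    """1 - (power with performance matching / power with no performance matching)."""
    return 1 - power_matched / power_no_matching


def overstated_by(power_on_premanaged, power_on_reported):
    """(power when matching on premanaged ROA / power when matching on reported ROA) - 1."""
    return power_on_premanaged / power_on_reported - 1


# Income statement approach, positive seeds (Table 4, Panel B).
decline_2pct = 100 * power_decline(30.8, 13.6)
overstatement_6pct = 100 * overstated_by(67.2, 48.4)
print(round(decline_2pct, 1))        # 55.8
print(round(overstatement_6pct, 1))  # 38.8
```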

It is easy to understand why performance matching reduces the power of test when one compares mean estimated discretionary accruals across models. Discretionary accruals estimated with performance matching are lower in magnitude. At the 10 % seed level, discretionary accruals estimated with performance matching (income statement approach) are only 7.07 %, almost 30 % below the target. Estimated discretionary accruals with no performance matching are 9.54 %, much closer to the target of 10 %. The drop from 9.54 to 7.07 % caused by performance matching is about 26 % (1 − 7.07/9.54). The result is consistent with Hypothesis 1.

We show in Fig. 3a the average measurement error of each model, which equals the average discretionary accruals estimated minus the seed, at each seed level (for total accruals estimated from income statement accounts). The measurement error when we performance match (black bars) is always negative for positive simulated discretionary accruals and positive for negative simulated discretionary accruals. Moreover, the magnitude of measurement error increases with the magnitude of seeded discretionary accruals. The measurement error at the −10 % seed level (2.99 %) is more than 18 times as high as that at the −1 % seed level (0.16 %). The result is consistent with our prediction that error in discretionary accruals estimated with performance matching is negatively correlated with the true discretionary accruals (Hypothesis 2). Measurement error in discretionary accruals estimated with no performance matching (gray bars) is much lower in absolute value at nearly every seed level. At the −6 % seed level, the measurement error with performance matching (1.89 %) is more than four times as high as that with no performance matching (0.45 %). The result shows that it is incorrect to argue that performance matching increases the power of test, as Francis and Yu (2009, p. 1527) do. Notice also that the magnitude of the measurement error with no performance matching does not always increase with the magnitude of seeded discretionary accruals.13
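The measurement errors discussed here can be recomputed from the mean estimates in Table 4, Panel B. The sketch below does so and also computes a plain Pearson correlation between the seed and the error across the ten non-zero seed levels. The correlation across table averages is our own illustrative summary, not a statistic reported in the paper.

```python
# Mean estimated DA (% of lagged total assets), Table 4, Panel B (income statement approach).
SEEDS = [-10, -6, -4, -2, -1, 1, 2, 4, 6, 10]
EST_NO_MATCHING = [-9.54, -5.55, -3.83, -2.10, -1.18, 0.80, 1.81, 3.77, 5.78, 9.54]
EST_MATCHED = [-7.01, -4.11, -3.29, -1.60, -0.84, 0.46, 1.32, 2.87, 4.13, 7.07]


def measurement_errors(estimates, seeds):
    """Average estimated discretionary accruals minus the seed, per seed level."""
    return [est - seed for est, seed in zip(estimates, seeds)]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


err_matched = measurement_errors(EST_MATCHED, SEEDS)
err_no_matching = measurement_errors(EST_NO_MATCHING, SEEDS)

print([round(e, 2) for e in err_matched])
print([round(e, 2) for e in err_no_matching])
print(round(pearson(SEEDS, err_matched), 3))
```

With performance matching, every error has the opposite sign to its seed, which is what drives the strongly negative correlation.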
https://static-content.springer.com/image/art%3A10.1007%2Fs11142-013-9262-7/MediaObjects/11142_2013_9262_Fig3_HTML.gif
Fig. 3

Performance of the annual ROA-matched Modified Jones model versus the original Modified Jones model (income statement approach). a Measurement error at various levels of discretionary accruals. b Average standard deviation of estimates of discretionary accruals

For comparison purposes, we also show measurement error induced by performance matching with control firms chosen using Kothari et al.’s procedure (white bars). This matching procedure leads to much lower measurement error than the feasible matching procedure. This result explains why Kothari et al.’s matching procedure causes the power of their model to be overstated.

Performance matching based on ROA does not just cause earnings management to be underestimated. The procedure also increases the variability of the discretionary accruals estimate, as the SD columns of Table 4 show. (We also plot the statistics in Fig. 3b to facilitate comparison.)14 There seems to be disagreement over whether performance matching increases or reduces noise in estimated discretionary accruals. Prawitt et al. (2009, p. 1262) hold the view that performance matching reduces noise, while Ayers et al. (2006) and Dechow et al. (2012) hold the opposite view. Our results support Ayers et al. and Dechow et al.’s view.

The effect of performance matching can be properly evaluated only if a fairly close match between the treatment and control firms on ROA is achieved. We show in Table 5 three sets of statistics in this regard: (1) median ROA of the treatment firms before and after earnings management, (2) median ROA of the treatment and control firms matched on reported ROA and (3) median ROA of the treatment and control firms matched on premanaged ROA. The second and third sets of statistics show a very close match of the treatment and control firms on ROA is achieved in our simulations. Incidentally, at the risk of demonstrating an obvious point, the first set of statistics shows earnings management can turn an unskewed sample into a skewed one.
Table 5

Median ROA comparisons

| Seed | No performance matching: Treatment clean ROA | No performance matching: Treatment reported ROA | Performance matching (on reported ROA): Treatment reported ROA | Performance matching (on reported ROA): Control ROA | Performance matching (on premanaged ROA): Treatment clean ROA | Performance matching (on premanaged ROA): Control ROA |
| --- | --- | --- | --- | --- | --- | --- |
| Panel A: Balance sheet approach | | | | | | |
| Testing the hypothesis that discretionary accruals are positive | | | | | | |
| 0 | 1.87 | 1.87 | 1.87 | 1.87 | 1.87 | 1.87 |
| 1 | 1.98 | 2.98 | 2.98 | 3.00 | 1.98 | 2.01 |
| 2 | 1.97 | 3.97 | 3.97 | 3.95 | 1.97 | 1.96 |
| 4 | 1.97 | 5.97 | 5.97 | 5.89 | 1.97 | 1.91 |
| 6 | 1.95 | 7.95 | 7.95 | 7.92 | 1.95 | 1.96 |
| 10 | 2.14 | 12.14 | 12.14 | 11.85 | 2.14 | 2.11 |
| Testing the hypothesis that discretionary accruals are negative | | | | | | |
| −0 | 1.95 | 1.95 | 1.95 | 1.93 | 1.95 | 1.93 |
| −1 | 2.07 | 1.07 | 1.07 | 1.08 | 2.07 | 2.09 |
| −2 | 1.93 | −0.07 | −0.07 | 0.03 | 1.93 | 1.95 |
| −4 | 2.07 | −1.93 | −1.93 | −1.88 | 2.07 | 2.09 |
| −6 | 1.88 | −4.12 | −4.12 | −4.01 | 1.88 | 1.95 |
| −10 | 2.13 | −7.87 | −7.87 | −7.65 | 2.13 | 2.15 |
| Panel B: Income statement approach | | | | | | |
| Testing the hypothesis that discretionary accruals are positive | | | | | | |
| 0 | 1.76 | 1.76 | 1.76 | 1.77 | 1.76 | 1.77 |
| 1 | 1.78 | 2.78 | 2.78 | 2.78 | 1.78 | 1.79 |
| 2 | 1.72 | 3.72 | 3.72 | 3.72 | 1.72 | 1.73 |
| 4 | 1.91 | 5.91 | 5.91 | 5.90 | 1.91 | 1.92 |
| 6 | 1.83 | 7.83 | 7.83 | 7.78 | 1.83 | 1.84 |
| 10 | 1.83 | 11.83 | 11.83 | 11.62 | 1.83 | 1.84 |
| Testing the hypothesis that discretionary accruals are negative | | | | | | |
| −0 | 1.90 | 1.90 | 1.90 | 1.92 | 1.90 | 1.92 |
| −1 | 1.78 | 0.78 | 0.78 | 0.82 | 1.78 | 1.77 |
| −2 | 1.87 | −0.13 | −0.13 | −0.07 | 1.87 | 1.88 |
| −4 | 1.88 | −2.12 | −2.12 | −2.06 | 1.88 | 1.87 |
| −6 | 2.00 | −4.00 | −4.00 | −3.91 | 2.00 | 1.99 |
| −10 | 1.73 | −8.27 | −8.27 | −8.08 | 1.73 | 1.73 |

We test the power of test of three model specifications: (1) the original modified Jones model with no performance matching; (2) the modified Jones model with performance matching, where the control firm is one in the same industry with ROA closest to the treatment firm’s reported ROA (original ROA plus the seeded discretionary accruals); and (3) modified Jones model with performance matching where the control firm is chosen using Kothari et al.’s procedure (on premanaged ROA)

Variable definitions: Treatment clean ROA = median treatment firm ROA before seeded earnings management is added; Treatment Reported ROA = median treatment firm ROA after seeded earnings management is added; Control ROA = median control firm ROA

3.2 Quarterly models

Similar to Matsumoto (2002), we add a dummy variable for the fourth fiscal quarter to our quarterly modified Jones model because that quarter may be different from the other fiscal quarters due to increased auditor scrutiny and firms’ tendency to report special items for that quarter (Francis et al. 1996).15 Our quarterly modified Jones model is as follows:
$$ \begin{aligned} TTAC_{it} /AST_{it - 1} & = \beta_{0} (1/AST_{it - 1} ) \, + \beta_{1} [(\varDelta REV_{it } - \, \varDelta REC_{it} )/AST_{it - 1} ] \\ & \quad + \beta_{2} (PPE_{it} /AST_{it - 1} ) + \beta_{3} Q_{4} + \varepsilon_{it} \\ \end{aligned} $$
where Q4 is a dummy variable, equal to 1 if quarter t is the fourth fiscal quarter for firm i and 0 otherwise.
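To make the specification concrete, the sketch below builds the dependent variable and the four regressors for a single firm-quarter. The function and argument names are hypothetical. As the equation is written, there is no separate constant term, since β0 multiplies the inverse of lagged total assets.

```python
def quarterly_jones_row(ttac, ast_lag, d_rev, d_rec, ppe, fiscal_quarter):
    """Return (y, x) for one firm-quarter of the quarterly modified Jones model.

    y = TTAC / AST_{t-1}
    x = [1 / AST_{t-1}, (dREV - dREC) / AST_{t-1}, PPE / AST_{t-1}, Q4 dummy]
    """
    if ast_lag <= 0:
        raise ValueError("lagged total assets must be positive")
    y = ttac / ast_lag
    x = [
        1.0 / ast_lag,
        (d_rev - d_rec) / ast_lag,
        ppe / ast_lag,
        1.0 if fiscal_quarter == 4 else 0.0,
    ]
    return y, x


# Hypothetical fourth-quarter observation (all amounts in the same currency units).
y, x = quarterly_jones_row(ttac=5.0, ast_lag=100.0, d_rev=12.0, d_rec=2.0,
                           ppe=40.0, fiscal_quarter=4)
print(y, x)
```

Stacking such rows for all firms in a quarter-industry group and fitting them by OLS would give the residuals used as discretionary accrual estimates.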

Under the income statement approach, quarterly total accruals equal Compustat Data8 minus Data108. Compustat reports Data108 for each quarter as a year-to-date figure. The figures reported for the second quarter through the fourth quarter therefore are adjusted to derive the correct figures for the three quarters. Under the balance sheet approach, quarterly total accruals equal (ΔData40 − ΔData49 − ΔData36 + ΔData45 − Data5).
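The year-to-date adjustment and the two accrual definitions can be sketched as follows. The helper names are ours and the numbers in the usage lines are made up. The Compustat item numbers are those given in the text.

```python
def ytd_to_quarterly(ytd):
    """Convert year-to-date values for fiscal quarters 1..4 into per-quarter values.

    The first quarter is already a single-quarter figure. Each later quarter is
    the difference between consecutive year-to-date values.
    """
    quarterly = [ytd[0]]
    for previous, current in zip(ytd, ytd[1:]):
        quarterly.append(current - previous)
    return quarterly


def total_accruals_income_statement(data8, data108_quarter):
    """Data8 minus the de-cumulated (single-quarter) Data108."""
    return data8 - data108_quarter


def total_accruals_balance_sheet(d_data40, d_data49, d_data36, d_data45, data5):
    """dData40 - dData49 - dData36 + dData45 - Data5 (d = quarterly change)."""
    return d_data40 - d_data49 - d_data36 + d_data45 - data5


# Hypothetical year-to-date Data108 values for one fiscal year.
data108_by_quarter = ytd_to_quarterly([10.0, 25.0, 45.0, 70.0])
print(data108_by_quarter)  # [10.0, 15.0, 20.0, 25.0]
print(total_accruals_income_statement(18.0, data108_by_quarter[1]))  # 3.0
```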

The analysis for accruals estimated from balance sheet (income statement) accounts is based on data for firm-quarters in 1985–2006 (1989–2006). The preliminary sample for the balance sheet (income statement) approach includes 1,514,076 (1,278,696) firm-quarters. The last two columns of Table 2 show the final sample sizes for each year. Descriptive statistics for the samples for quarterly data are reported in Table 3 (Panel B).

We apply the same procedure of simulation to quarterly data with one minor difference. As quarterly earnings are lower, the seeded discretionary accruals are halved for the quarterly models, ranging from −5 to +5 % of lagged total assets. The test results are presented in Table 6.
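Table 6

Power of test comparisons for quarterly data

| Seed (%) | No performance matching: Power of test (%) | No performance matching: Estimated DA (%) | No performance matching: SD | No performance matching: Power ratio (%) | Performance matching (on reported ROA): Power of test (%) | Performance matching (on reported ROA): Power decline (%) | Performance matching (on reported ROA): Estimated DA (%) | Performance matching (on reported ROA): SD | Performance matching (on reported ROA): Power ratio (%) | Performance matching (on premanaged ROA): Power of test (%) | Performance matching (on premanaged ROA): Overstated by (%) | Performance matching (on premanaged ROA): Estimated DA (%) | Performance matching (on premanaged ROA): SD | Performance matching (on premanaged ROA): Power ratio (%) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Panel A: Balance sheet approach | | | | | | | | | | | | | | |
| Testing the hypothesis that discretionary accruals are positive | | | | | | | | | | | | | | |
| 0 | 6.40 | −0.01 | 0.069 | −0.03 | 3.60 | | −0.03 | 0.093 | −0.39 | 3.60 | | −0.03 | 0.093 | −0.39 |
| 0.5 | 17.60 | 0.50 | 0.069 | 7.56 | 8.00 | 54.55 | 0.33 | 0.094 | 3.54 | 18.00 | 125.00 | 0.51 | 0.093 | 5.60 |
| 1 | 42.40 | 0.97 | 0.067 | 14.97 | 18.40 | 56.60 | 0.68 | 0.093 | 7.33 | 25.20 | 36.96 | 0.95 | 0.092 | 10.76 |
| 2 | 90.98 | 1.96 | 0.069 | 29.11 | 40.89 | 55.06 | 1.36 | 0.096 | 14.36 | 67.21 | 64.36 | 1.97 | 0.095 | 21.66 |
| 3 | 98.80 | 2.87 | 0.069 | 43.19 | 66.00 | 33.20 | 2.08 | 0.098 | 21.48 | 90.79 | 37.57 | 2.91 | 0.093 | 31.64 |
| 5 | 99.60 | 4.94 | 0.067 | 75.51 | 95.98 | 3.63 | 3.49 | 0.104 | 34.27 | 100.00 | 4.18 | 4.95 | 0.093 | 53.98 |
| Testing the hypothesis that discretionary accruals are negative | | | | | | | | | | | | | | |
| −0 | 5.60 | −0.01 | 0.069 | −0.03 | 5.60 | | −0.03 | 0.093 | −0.39 | 5.60 | | −0.03 | 0.093 | −0.39 |
| −0.5 | 26.91 | −0.58 | 0.066 | −8.69 | 16.47 | 38.81 | −0.50 | 0.092 | −5.44 | 19.59 | 18.98 | −0.64 | 0.091 | −6.98 |
| −1 | 38.55 | −0.97 | 0.069 | −14.12 | 16.13 | 58.17 | −0.63 | 0.096 | −6.72 | 32.22 | 99.75 | −0.99 | 0.093 | −10.90 |
| −2 | 86.64 | −1.94 | 0.068 | −29.36 | 49.19 | 43.23 | −1.44 | 0.096 | −15.37 | 72.88 | 48.17 | −1.95 | 0.093 | −21.44 |
| −3 | 99.60 | −3.02 | 0.068 | −45.59 | 73.09 | 26.61 | −2.22 | 0.097 | −23.33 | 93.36 | 27.73 | −2.96 | 0.093 | −32.22 |
| −5 | 100.00 | −4.93 | 0.069 | −73.60 | 98.80 | 1.20 | −3.74 | 0.102 | −37.09 | 100.00 | 1.22 | −5.01 | 0.094 | −54.32 |
| Panel B: Income statement approach | | | | | | | | | | | | | | |
| Testing the hypothesis that discretionary accruals are positive | | | | | | | | | | | | | | |
| 0 | 7.60 | 0.06 | 0.078 | 1.85 | 4.40 | | 0.03 | 0.095 | 0.16 | 4.40 | | 0.03 | 0.095 | 0.16 |
| 0.5 | 21.20 | 0.48 | 0.080 | 7.27 | 11.20 | 47.17 | 0.30 | 0.097 | 3.36 | 14.98 | 33.75 | 0.43 | 0.096 | 4.81 |
| 1 | 40.00 | 0.93 | 0.079 | 13.49 | 15.20 | 62.00 | 0.53 | 0.093 | 6.04 | 22.18 | 45.90 | 0.81 | 0.094 | 9.02 |
| 2 | 77.91 | 1.98 | 0.079 | 27.00 | 40.00 | 48.66 | 1.32 | 0.097 | 13.97 | 68.02 | 70.04 | 1.94 | 0.094 | 21.17 |
| 3 | 96.00 | 2.98 | 0.077 | 40.96 | 55.60 | 42.08 | 1.74 | 0.098 | 18.09 | 89.92 | 61.73 | 2.88 | 0.093 | 31.75 |
| 5 | 99.60 | 4.90 | 0.079 | 65.91 | 83.40 | 16.26 | 2.76 | 0.104 | 27.05 | 100.00 | 19.90 | 4.94 | 0.095 | 53.66 |
| Testing the hypothesis that discretionary accruals are negative | | | | | | | | | | | | | | |
| −0 | 4.40 | 0.06 | 0.078 | 1.85 | 4.80 | | −0.03 | 0.095 | −0.16 | 4.80 | | −0.03 | 0.095 | −0.16 |
| −0.5 | 10.80 | −0.45 | 0.077 | −4.79 | 8.00 | 25.93 | −0.27 | 0.094 | −2.94 | 13.20 | 65.00 | −0.51 | 0.092 | −5.53 |
| −1 | 36.40 | −1.06 | 0.081 | −12.49 | 19.28 | 47.04 | −0.76 | 0.098 | −7.99 | 34.27 | 77.80 | −1.14 | 0.097 | −11.94 |
| −2 | 80.97 | −1.87 | 0.079 | −23.47 | 38.46 | 52.50 | −1.22 | 0.097 | −13.00 | 67.36 | 75.12 | −1.95 | 0.096 | −20.83 |
| −3 | 100.00 | −2.86 | 0.078 | −36.85 | 62.10 | 37.90 | −1.98 | 0.100 | −20.58 | 90.73 | 46.10 | −2.88 | 0.094 | −31.75 |
| −5 | 100.00 | −4.93 | 0.079 | −64.48 | 92.74 | 7.26 | −3.16 | 0.103 | −31.39 | 99.60 | 7.40 | −4.87 | 0.094 | −53.28 |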

We test the power of test of three model specifications: (1) the original modified Jones model with no performance matching; (2) the modified Jones model with performance matching, where the control firm is one in the same industry with ROA closest to the treatment firm’s reported ROA (original ROA plus the seeded discretionary accruals); and (3) modified Jones model with performance matching where the control firm is chosen using Kothari et al.’s procedure (on premanaged ROA)

Variable definitions: Power of test = the percentage of random samples with t-values of mean estimated discretionary accruals above the threshold; Estimated DA = mean estimated discretionary accruals; SD = mean standard deviation of estimated discretionary accruals; Power ratio = estimated DA/SD; power decline = 1 − power of test with performance matching/power of test with no performance matching; Overstated by = power of test with performance matching (on premanaged ROA)/power of test with performance matching (on reported ROA) − 1

The results in Table 6 are similar to those reported in Table 4, also showing that performance matching reduces power of test. Performance matching based on current quarter ROA reduces the power of test by 47.2, 62, 48.7, 42.1, and 16.3 % respectively at 0.5, 1, 2, 3, and 5 % seed levels for total accruals estimated from income statement accounts (Table 6 Panel B). The decline in power can also be attributed to both the measurement error and noise that performance matching introduces in the discretionary accruals estimate, as Fig. 4 shows. At the 5 % seed level, discretionary accruals are estimated at only 2.76 % on average with performance matching, with measurement error of −2.24 %. In comparison, with no performance matching, discretionary accruals are estimated at 4.9 %, only 0.1 % off target. The drop from 4.9 to 2.76 % caused by performance matching is about 44 %. Notice also that the measurement error when we performance match is negatively correlated with the seed level, consistent with Hypothesis 2.
https://static-content.springer.com/image/art%3A10.1007%2Fs11142-013-9262-7/MediaObjects/11142_2013_9262_Fig4_HTML.gif
Fig. 4

Performance of the quarterly ROA-matched modified Jones model versus the original modified Jones model (income statement approach). a Measurement error at various levels of simulated discretionary accruals. b Average standard deviation of estimates of discretionary accruals

3.3 Which problem is more serious?

Our results show that performance matching based on current ROA both induces measurement error in and adds noise to estimated discretionary accruals. The natural question to ask is this: which problem is more serious? Recall that we compute the power ratio (mean discretionary accruals divided by the standard deviation) for each of the 250 samples in each round of simulation with annual and quarterly discretionary accruals estimated with performance matching. We investigate for each sample whether we can increase the magnitude of the power ratio to a greater extent by replacing the numerator or the denominator with the corresponding statistic when there is no performance matching. Figure 5 shows the percentage of samples for which replacing the numerator (mean) will increase the magnitude of the power ratio to a greater extent for the annual models. Figure 6 shows the same for the quarterly models. The percentage exceeds 50 % for every seed level for both the income statement and balance sheet methods of estimating total accruals for the annual models. The percentage exceeds 50 % for every seed level except one (0.5 %) for the quarterly models, for both the income statement and balance sheet methods of estimating total accruals. Thus, while both the measurement error and additional noise introduced by performance matching affect the power of test, it seems that the measurement error is a somewhat more serious problem.
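The comparison applied to each sample can be sketched as below. The function name and return labels are ours. The usage line plugs in column averages from Table 4 (Panel B, 4 % seed) purely for illustration, whereas the analysis in the text is run sample by sample.

```python
def more_serious_problem(mean_matched, sd_matched, mean_no_matching, sd_no_matching):
    """Compare two counterfactual power ratios for one performance-matched sample.

    ratio_fix_mean: replace the numerator (mean) with the no-matching mean.
    ratio_fix_sd:   replace the denominator (SD) with the no-matching SD.
    Whichever replacement yields the larger magnitude identifies the problem
    that depresses the matched power ratio more. Ties are labelled "noise".
    """
    ratio_fix_mean = abs(mean_no_matching / sd_matched)
    ratio_fix_sd = abs(mean_matched / sd_no_matching)
    if ratio_fix_mean > ratio_fix_sd:
        return "measurement error"
    return "noise"


# Illustrative inputs: Table 4, Panel B, 4 % seed (matched 2.87 / 0.269, unmatched 3.77 / 0.229).
verdict = more_serious_problem(2.87, 0.269, 3.77, 0.229)
print(verdict)  # measurement error
```

Counting the share of the 250 samples labelled "measurement error" at each seed level would give the percentages plotted in Figs. 5 and 6.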
https://static-content.springer.com/image/art%3A10.1007%2Fs11142-013-9262-7/MediaObjects/11142_2013_9262_Fig5_HTML.gif
Fig. 5

Effect of earnings management underestimation of annual ROA-matched models versus that of the greater standard deviation of the discretionary accruals estimate. This chart shows the percentages of randomly drawn samples whose power ratios (mean divided by standard deviation) for discretionary accruals estimated from the annual ROA-matched modified Jones model are depressed more by the model’s tendency to underestimate discretionary accruals than by the model’s tendency to generate discretionary accruals estimates with large standard deviations. a Balance sheet approach. b Income statement approach

https://static-content.springer.com/image/art%3A10.1007%2Fs11142-013-9262-7/MediaObjects/11142_2013_9262_Fig6_HTML.gif
Fig. 6

Effect of earnings management underestimation of quarterly ROA-matched models versus that of the greater standard deviation of the discretionary accruals estimate. This chart shows the percentages of randomly drawn samples whose power ratios (mean divided by standard deviation) for discretionary accruals estimated from the quarterly ROA-matched modified Jones model are depressed more by the model’s tendency to underestimate discretionary accruals than by the model’s tendency to generate discretionary accruals estimates with large standard deviations. a Balance sheet approach. b Income statement approach

3.4 Sensitivity analyses

  1. We also estimate annual and quarterly discretionary accruals from the Jones model (Jones 1991). The inferences are similar.

  2. We also estimate discretionary accruals from firm-specific time-series regressions. The test results are also similar.

  3. Ecker et al. (2011) show that selecting control firms based on size (lagged total assets) rather than industry has more power in detecting earnings management. Matching the treatment and control firms on size does not change our test results.

3.5 Change in regression coefficient

We replicate tests in a prior study to test Hypothesis 3. Ayers et al. (2006) investigate whether estimated discretionary accruals differ between firms that meet or just beat annual earnings benchmarks and those that miss.16 For two benchmarks (prior year earnings and analyst earnings forecast), they estimate the following regression model:
$$ {\text{EM}} = \beta_{0} + \beta_{1} {\text{DisAcc}} + \beta_{2} {\text{Chg}}\_{\text{CF}} + \beta_{j} \varSigma {\text{Ind}}_{j} + \epsilon, $$
where EM equals 1 for firm-years that exactly meet or just beat the earnings benchmark and zero for firm-years that just miss it, DisAcc is estimated discretionary accruals for the year, Chg_CF is change in cash flow from the previous year (annual Compustat data items #308 − #124), and Ind’s are industry dummies.17

Ayers et al. (2006) report that β1 is positive for both benchmarks when discretionary accruals estimated from the original modified Jones-type model are used in the regression, suggesting that firms manage accruals to meet or just beat either earnings benchmark. Ayers et al. (2006) also report the result is much weaker when discretionary accruals estimated from the same model with performance matching are used but provide no details.18 We examine the effect of performance matching on the β1 estimate, and compare average ROA between the “just meet” and “just miss” samples to better understand why the β1 estimate declines when there is performance matching.

We use a larger and more up-to-date sample for the comparison study (our sample includes firm-years in 1994–2006 vs. 1994–2002 in Ayers et al.). The results are presented in Table 7. When discretionary accruals estimated without performance matching are used in the regression, as Panel A shows, the estimated β1 is positive and significant for either benchmark (p value = 0.035 and 0.091 for the prior year earnings and analyst forecast benchmarks respectively). The results are similar to those reported in Ayers et al. (2006). When we performance match based on current year ROA, the estimated β1 declines substantially for either benchmark and is no longer significant at either the 5 or 10 % level. Failing to reject the null hypothesis, one would infer that β1 is no different from zero, i.e., that firms do not take more discretionary accruals to be in the “just beat” bin for either benchmark. We notice that performance matching also reduces the fit of the model, as indicated by the decline in likelihood ratio.
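Table 7

Change in regression coefficient caused by discretionary accruals estimated with performance matching

Regression equation: EM = β0 + β1DisAcc + β2Chg_CF + βj∑Indj + ε

| | Earnings of prior year, no performance matching: Coefficient | Earnings of prior year, no performance matching: p value | Earnings of prior year, performance matching: Coefficient | Earnings of prior year, performance matching: p value | Analyst earnings forecast, no performance matching: Coefficient | Analyst earnings forecast, no performance matching: p value | Analyst earnings forecast, performance matching: Coefficient | Analyst earnings forecast, performance matching: p value |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Panel A: Regression results | | | | | | | | |
| DisACC | 0.1436 | 0.035 | 0.0839 | 0.321 | 0.2995 | 0.091 | 0.0681 | 0.331 |
| Chg_CF | 0.0000 | <0.0001 | 0.0000 | 0.174 | −0.0001 | 0.292 | −0.0001 | 0.014 |
| Likelihood ratio | 1,233.26 | | 86.61 | | 92.44 | | 70.97 | |
| p value (likelihood ratio) | <0.0001 | | 0.0212 | | 0.0073 | | 0.2035 | |
| n | 8,505 | | | | 5,297 | | | |
| Panel B: Comparing ROA | | | | | | | | |
| Just miss | 0.0723 | | | | 0.0676 | | | |
| Meet or just beat | 0.0715 | | | | 0.0606 | | | |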

EM equals 1 for firm-years with earnings that exactly meet or just beat the earnings benchmark and zero for firm-years with earnings that just miss the earnings benchmark, DisACC is estimated discretionary accruals, and Chg_CF is change in cash flow

Note that the inference that β1 is no different from zero, while consistent with our prediction that performance matching will weaken the relationship of discretionary accruals with another variable, may not be a Type II error. It can actually be the correct inference. It is possible that the firm-years in the “meet or just beat” bin do not engage in earnings management, but they have more PRAA (performance-related abnormal accruals) than those in the “just miss” bin due to their better operating performance. In that case, the inference that β1 is positive when there is no performance matching would be a Type I error, caused by the modified Jones model’s inability to distinguish between PRAA and discretionary accruals. And the inference that β1 is no different from zero when we performance match would be the correct inference, caused by the removal of the extra PRAA of the firm-years in the “meet or just beat” bin by performance matching.

To settle this important issue, we compare the mean ROA of firm-years in the “meet or just beat” bin with that of firm-years in the “just miss” bin. As Panel B of Table 7 shows, mean ROA of the “meet or just beat” bin is quite similar to that of the “just miss” bin for either benchmark; interestingly, the former is actually lower than the latter for either benchmark, although the difference is not statistically significant. Untabulated results of other parametric and non-parametric tests also show little difference in ROA between firm-years in the two bins. Thus firm-years in the “meet or just beat” bin are unlikely to have more PRAA than those in the “just miss” bin. It is therefore reasonable to draw the conclusion that the inference that β1 is positive is the correct inference, while the inference that β1 is zero when we performance match is a Type II error, caused by measurement error in discretionary accruals estimated with performance matching.

4 Summary and concluding remarks

Performance matching has become a standard procedure for studies involving the use of discretionary accruals. Many employ it without even first examining whether the sample is skewed (e.g., Blouin et al. 2007; Lawrence et al. 2011). We explore two issues pertaining to performance matching. First, while some speculate that performance matching reduces the power of test, the mechanism by which this reduction in power occurs has been unknown.19 We show the decline in power occurs systematically rather than randomly. And it occurs primarily because the control firm used for performance matching is likely to have more (less) abnormal accruals unrelated to the event of interest than the treatment firm with upward (downward) earnings management, thus inducing a measurement error in estimated discretionary accruals. The measurement error, moreover, is negatively correlated with the true discretionary accruals. The problem, together with the fact that performance matching also increases noise in estimated discretionary accruals, makes it more difficult to detect earnings management when researchers performance match. Second, the extent to which performance matching reduces the power of test has been underestimated. We show the procedure Kothari et al. (2005) use to choose control firms in simulations cannot be implemented in empirical research as it requires knowledge of the treatment firms’ original (“honest”) ROA. Not having such knowledge, researchers invariably choose control firms based on the treatment firms’ reported ROA. The loss of power caused by performance matching in empirical research therefore is higher than the numbers reported by Kothari et al. (2005) suggest. The loss of power affects not only research on whether certain events induce earnings management. In studies that use discretionary accruals estimated with performance matching as the dependent or an independent variable in regression analysis, the regression coefficient will be biased toward zero. The results of our empirical tests are consistent with these predictions.

Our results show that researchers face an uncertain outcome when estimating discretionary accruals for a skewed sample with performance matching. Performance matching would be beneficial if the sample skewness is entirely caused by the firms’ genuine good or poor operating results rather than by earnings management in response to the hypothesized stimulus. The frequency of Type I errors would be lower than what could be achieved without performance matching. Performance matching would be harmful, on the other hand, if the sample skewness is instead caused by earnings management in response to the hypothesized stimulus. The frequency of Type II errors would be much higher than what could be achieved with no performance matching and much higher than previously thought. Since authors in most studies do not know what causes the sample to be skewed, performance matching ex ante is as likely to be harmful as to be beneficial. This raises the question of whether performance matching should be adopted as a standard procedure, as appears to be the case now. Rather than relying on performance matching as a standard procedure, researchers should try to develop new, better discretionary accruals models. Research in this direction appears to be already underway. Ayers et al. (2006) investigate whether estimated discretionary accruals differ significantly between firms that reside in various pairs of adjacent bins of earnings, earnings changes, and earnings surprises. The results may be useful to certain studies in reducing the frequency of Type I errors without resorting to performance matching. Dechow et al. (2012) show that incorporating the reversal of discretionary accruals in future periods in the test will substantially reduce the frequency of Type I errors for skewed samples without performance matching, while often achieving a lower frequency of Type II errors than can be achieved with performance matching. But more efforts to develop better solutions appear to be still needed.

Footnotes
1

Ayers et al. (2006) and Dechow et al. (2012) both argue that performance matching increases noise in the discretionary accruals estimate.

 
2

For any study with a skewed sample there is also a chance that the sample skewness is instead caused by the firms’ genuine good or poor performance or earnings management for other purposes. In that case, performance matching would be desirable to reduce the chance of false inferences. Since the researcher can never be sure what causes the sample to be skewed, one can never say for sure whether performance matching is desirable for any study with a skewed sample.

 
3

Kothari et al. (2005) argue that good performance induces firms to take more normal accruals, stating “working capital accruals increase in forecasted sales growth and earnings because of a firm’s investment in working capital to support the growth in sales” (p. 165).

 
4

Discretionary accruals are expressed as a percentage of lagged total assets because Jones-type models scale all the variables by lagged total assets.

 
5

We follow Kothari et al. (2005) and calculate ROA as reported earnings scaled by lagged total assets.

 
6

PM in DA–PM denotes performance matching.

 
7

NP in DA–NP denotes no performance matching.

 
8

It is common for researchers to use regression residuals as proxies for discretionary accruals. The mean of regression residuals is zero.

 
9

See the footnote of Table 4 in their paper for a description of how the control firm is chosen.

 
10

Calculating accruals using the income statement approach requires cash flow data. Firms started reporting cash flow data after Statement of Financial Accounting Standards (SFAS) No. 95 was issued in 1987.

 
11

We define industries by two-digit SIC codes.

 
12

The t statistic for each sample equals \( \overline{DA} /(s(DA)/\sqrt N ) \), where \( \overline{DA} \) is the sample mean of discretionary accruals estimates, s(DA) is the sample standard deviation of the discretionary accruals estimates, and N is the sample size.
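
As an illustration only, the statistic can be computed as in the following Python sketch. The sample values are hypothetical, the function name `one_sample_t` is an illustrative choice, and the sketch assumes s(DA) uses the usual n − 1 denominator, which the footnote does not state explicitly.

```python
import math
import statistics


def one_sample_t(da):
    """t statistic for the null that the mean discretionary accrual is zero."""
    n = len(da)
    mean_da = statistics.fmean(da)
    # Assumption: sample standard deviation with an n - 1 denominator.
    s_da = statistics.stdev(da)
    return mean_da / (s_da / math.sqrt(n))


# Hypothetical discretionary accruals estimates (fractions of lagged total assets).
sample = [0.02, -0.01, 0.03, 0.00, 0.01]
t_stat = one_sample_t(sample)
print(f"t = {t_stat:.4f}")
```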

 
13

It is unsurprising that the measurement error of the original Jones model is often also negative (positive) for positive (negative) seed levels. Following prior studies, we estimate discretionary accruals by regression. Positive (negative) seeded discretionary accruals cause the regression residuals of the treatment firms to be higher (lower) than those of the other (“clean”) firms. But the magnitudes of the treatment firms’ regression residuals often will be lower than that of the seed because regression minimizes the sum of squares of the residuals.
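
The attenuation can be seen in a stylized Python sketch. It is not the Jones model: it assumes a single hypothetical regressor `x`, noise-free normal accruals, and a seed added to two of ten firms. All names and numbers are illustrative assumptions.

```python
def ols_fit(x, y):
    """Ordinary least squares for y = a + b * x; returns (a, b)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical regressor and noise-free "normal" accruals for ten firms.
x = [float(i) for i in range(1, 11)]
normal = [0.01 + 0.02 * xi for xi in x]

# Seed discretionary accruals of 2% of lagged assets into two treatment firms.
seed = 0.02
treated = {4, 5}  # positions of the firms with x = 5 and x = 6
y = [v + (seed if i in treated else 0.0) for i, v in enumerate(normal)]

# Estimate the model on the pooled sample, as in a cross-sectional regression.
a, b = ols_fit(x, y)
resid = [yi - (a + b * xi) for xi, yi in zip(x, y)]

mean_treated = sum(resid[i] for i in treated) / len(treated)
clean = [i for i in range(len(x)) if i not in treated]
mean_clean = sum(resid[i] for i in clean) / len(clean)
print(f"seed = {seed:.3f}, treated residual = {mean_treated:.3f}, "
      f"clean residual = {mean_clean:.3f}")
```

In this construction the fitted intercept absorbs part of the seed, so the treatment firms' mean residual (about 0.016) falls short of the 0.02 seed, while the clean firms' residuals turn slightly negative.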

 
14

Discretionary accruals estimated with performance matching (DA–PM) are the difference between two variables. The standard deviation of DA–PM may be high or low, depending on the correlation between the two variables. Specifically, the variance of DA–PM equals \( \sigma_x^2 + \sigma_y^2 - 2\sigma_{xy} \), where \( \sigma_x^2 \) is the variance of the discretionary accruals estimate of the treatment firm, \( \sigma_y^2 \) is that of the control firm, and \( \sigma_{xy} \) is the covariance of the two estimates. If we assume \( \sigma_{xy} \) equals zero and \( \sigma_x^2 = \sigma_y^2 \), the variance of DA–PM would be \( 2\sigma_x^2 \), or twice the variance of discretionary accruals estimated with no performance matching. The standard deviation of DA–PM would be \( \sqrt 2 \sigma_x \).
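
A short Python sketch shows how the standard deviation of DA–PM, relative to that of the unmatched estimate, depends on the correlation between the treatment and control estimates. The value of `sigma_x` and the correlations are hypothetical, and equal variances are assumed as in the footnote.

```python
import math


def var_da_pm(var_x, var_y, cov_xy):
    """Variance of the difference between treatment and control estimates."""
    return var_x + var_y - 2.0 * cov_xy


sigma_x = 0.05          # hypothetical standard deviation of the unmatched estimate
var_x = sigma_x ** 2    # assumption: control firm has the same variance

ratios = {}
for rho in (0.0, 0.5, 0.9):
    cov_xy = rho * var_x  # covariance implied by correlation rho and equal variances
    ratios[rho] = math.sqrt(var_da_pm(var_x, var_x, cov_xy)) / sigma_x
    print(f"rho = {rho:.1f}: sd(DA-PM) / sd(DA-NP) = {ratios[rho]:.3f}")
```

With zero correlation the ratio is \( \sqrt 2 \), as stated above. Under these assumptions the ratio falls to 1 at a correlation of 0.5 and drops below 1 only for higher correlations.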

 
15

Results are unchanged when Q4 is not included as an independent variable.

 
16

They also investigate whether estimated discretionary accruals differ between firms residing in other pairs of adjacent bins of earnings, earnings changes, and earnings surprises. We do not replicate those tests in our study.

 
17

For the prior year earnings benchmark, EMt equals 1 if 0 ≤ ΔXt < 0.01 and 0 if −0.01 ≤ ΔXt < 0.00, where ΔXt is the change in net income from year t − 1 to t divided by the market value of equity at the end of year t − 2. For the analyst earnings forecast benchmark, EMt equals 1 if $0 ≤ FEt < $0.01 and 0 if −$0.01 ≤ FEt < $0.00, where FEt is year t’s actual earnings per share minus the most recent analyst forecast prior to the earnings announcement (based on data in the unadjusted I/B/E/S Detail History file). FE is rounded to the nearest penny.
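
The indicator can be sketched in Python as follows. The function name `em_indicator`, its parameters, and the choice to return `None` for observations outside both bins are illustrative assumptions, not part of the original tests.

```python
from typing import Optional


def em_indicator(distance: float, width: float = 0.01) -> Optional[int]:
    """Classify an observation by its distance from the earnings benchmark.

    `distance` is the scaled earnings change (prior year earnings benchmark)
    or the forecast error in dollars (analyst forecast benchmark).
    Returns 1 for just meeting or beating the benchmark, 0 for just missing
    it, and None (an assumption) for observations outside both bins.
    """
    if 0.0 <= distance < width:
        return 1
    if -width <= distance < 0.0:
        return 0
    return None


# Hypothetical observations.
for value in (0.004, -0.006, 0.0, 0.02):
    print(value, em_indicator(value))
```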

 
18

They tabulate the change in test results caused by performance matching for only a forward-looking model.

 
19

It is critical that the mechanism by which the reduction in power occurs be known, for two reasons. First, the model can be improved only if its problems are fully understood. Second, without knowing why the model lacks power, the users are likely to attribute any loss in power caused by performance matching to chance and continue to adopt performance matching as a standard procedure.

 

Acknowledgments

We thank the editor, Patricia Dechow, and an anonymous referee for their comments and suggestions, which substantially improved the paper. An earlier version of the paper with a different title was presented at the 2009 American Accounting Association Annual Meeting (New York, USA), McGill University, and Santa Clara University. We thank the participants for valuable comments and suggestions. We also received valuable comments from workshop participants at the National University of Singapore. Funding for the study was provided by the University Research Fund, National University of Singapore.

Copyright information

© Springer Science+Business Media New York 2013