Abstract
We investigate how the distribution of the penalties incurred by auditors for failing to detect fraud influences their effort to detect fraud and auditees’ commission of fraud. We compare a probabilistic, skewed audit penalty to a penalty that automatically imposes the expected penalty of the probabilistic distribution (hereafter, a deterministic penalty). Our experiments show that a deterministic penalty with the same expected value as the probabilistic, skewed penalty increases audit effort to detect fraud and decreases fraudulent reporting by auditees, and that these benefits hold in a game involving both auditee and auditor players.
Notes
For example, Senator Elizabeth Dole brought a proposal before Congress that would require SEC registrants to obtain financial statement insurance as outlined in Ronen (2002). This proposal changes many incentives found in the audit environment, including auditors’ penalties for failing to detect fraud. See Cunningham (2004) for a description of the legal and market-structure modifications required to implement this approach.
As discussed in Sect. 2, there are many reasons why the shape of the auditor’s penalty distribution may be negatively skewed, with high-magnitude penalties occurring rarely and lesser penalties occurring more frequently. Our study does not examine why the distribution may be negatively skewed; it examines only how the distributional properties of the penalty affect the detection and frequency of fraudulent reporting.
Although prior legal studies suggest that jury damage awards are skewed for the reasons outlined in this paragraph, some may argue that damage awards in auditing are a function of auditor behavior. For example, if the level of auditor negligence is skewed, then damage awards that mirror the level of negligence will also be skewed. Interestingly, the design and analyses in prior research (Wissler et al. 1999; Schkade et al. 2000) control for negligence while still observing skewed damage awards. Other factors that may influence damage awards against auditors include whether the auditor was shown to have scienter or intent to deceive. The level of reliance by users may also influence the damage award, with higher damages awarded when the auditee was more widely traded. If these factors are skewed, then the awards will also be skewed. However, we submit that damage awards have been shown to be skewed while controlling for such factors and, therefore, our assumption of skewed damage awards is reasonable.
The Public Accounting Report’s (2008) “Largest Financial Payouts by Accounting Firms” shows that there are approximately 80 audit-related payouts that resulted from a unique audit engagement. Of these payouts, the largest (smallest) was $633.5 ($10) million, while the mean (median) was about $72 ($38) million. The average for the two highest deciles was roughly $283 and $122 million, while the average for the two lowest deciles was roughly $14 and $11 million. This is consistent with our assumption that auditors face a skewed distribution for failing to detect fraud. Of course, this observed distribution of settlements is likely a function of numerous stochastic processes, including auditor negligence and jury selection. As stated in the previous footnote, our study assumes that a skewed distribution is something faced by auditors, even after controlling for auditor negligence. As a result, penalties in our study do not depend on auditor negligence.
Two forms of incorrect inferences can result. A type II (type I) error occurs when inferences from the sample lead the auditor to incorrectly conclude that no (a) material misstatement exists. The penalty for a type II error is our focus because it is generally believed to be the more significant of the two. This penalty is also the one most subject to influence by policymakers, and avoiding the error of incorrectly concluding that no material misstatement exists is a major objective of many proposed audit reforms.
Many economic theories derive their predictions based on an assumed level of risk aversion among agents. However, an extensive prior literature demonstrates that the same person can be (1) risk averse for losses of high magnitude but low probability, (2) risk seeking for losses of ordinary magnitudes and probabilities, (3) risk averse for gains of ordinary magnitudes and probabilities, and (4) risk seeking for gains of high magnitude but low probability. As a result, we do not base our hypotheses on an assumed level of risk aversion among participants, nor do we measure participants’ level of risk aversion prior to the study. Instead, we randomly assign participants to our experimental conditions, which allows us to draw inferences from any observed differences.
Participants were told at the beginning of the experiment that there would be 50 rounds, including five practice rounds. Because this many rounds may have induced some fatigue among participants, we examine the first 25 rounds to see if our results differ. We observe qualitatively and statistically similar results (not tabulated) for these earlier rounds, suggesting that fatigue did not affect the conclusions reported in this paper. Out of concern that participants’ behavior may change due to opportunistic strategies in the final rounds of the experiment, we also rerun all of our analyses dropping the final 10 and 25 rounds. Again we observe qualitatively and statistically similar results (not tabulated), suggesting that opportunistic end-game behavior did not drive any of the results reported in this paper.
The $10 value was chosen to achieve a rough balance between the probability of type I and type II errors.
This design feature precluded the experiment-wide dominant strategy of sampling zero and rejecting the report, which would result in an expected payoff of $49.50 in each round. We chose to eliminate this option because it does not map well to the real world: auditors are likely unable to avoid audit work while issuing an adverse opinion. By including this constraint, we forced participants to focus their attention on the penalty for incorrect acceptance of a false report. Because this design feature was constant across all cells of the experiment and is unlikely to interact with any of our manipulations, it does not explain any of our results.
As described in the results section, the automated acceptance and rejection rules in the experiment affected nine of the 45 nonpractice rounds. Only the 36 rounds in which participants made sampling decisions are used in our analyses.
In designing the skewed distribution, we could have used a continuous distribution of penalties. However, for simplicity and the ease of explaining the penalty distribution to participants, we chose to use a bimodal distribution that approximated a skewed distribution. Future research could examine whether using a continuous skewed distribution would change the results of our study.
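To make the design concrete, the manipulation described above (a deterministic penalty equal to the expected value of a bimodal, skewed penalty) can be sketched as follows. The probabilities and dollar amounts below are illustrative placeholders, not the parameters used in the experiment.

```python
import random

random.seed(42)

# Illustrative parameters only -- not the experiment's actual values.
# The skewed penalty is bimodal: a severe penalty occurs rarely,
# a mild penalty occurs frequently.
P_SEVERE, SEVERE, MILD = 0.10, 100.0, 10.0

def skewed_penalty():
    """Draw one penalty from the bimodal (skewed) distribution."""
    return SEVERE if random.random() < P_SEVERE else MILD

# The deterministic treatment imposes the expected penalty every time.
deterministic_penalty = P_SEVERE * SEVERE + (1 - P_SEVERE) * MILD  # 19.0

# Over many rounds the two treatments impose the same average penalty,
# even though individual rounds differ sharply under the skewed scheme.
draws = [skewed_penalty() for _ in range(100_000)]
mean_skewed = sum(draws) / len(draws)

print(f"deterministic penalty per round: {deterministic_penalty:.2f}")
print(f"mean skewed penalty over rounds: {mean_skewed:.2f}")
```

The point of the design is visible in the last two lines: the two penalty schemes are matched in expected value, so any behavioral difference between treatments is attributable to the shape of the distribution rather than its mean.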
We recognize that this distribution is perhaps more extreme than the relative penalties that auditors face. However, our experiment is not intended to achieve mundane realism; instead, we wanted to choose a strong manipulation to test our theory while mapping into the prior prospect theory research.
Although this is not a perfect random assignment procedure, we assume that the order in which participants responded to our email was random. Our second experiment replicated our findings using true random assignment.
The recruiting emails, flyers, and the informed consent document signed by all participants before the experiment began stated that the average participant would receive approximately $18. Additionally, the asserters’ (verifiers’) instructions explained that the amount received “will be based on your portion of the total experimental earnings across all asserters (verifiers) in today's study.”
Quiz scores were significantly lower in the symmetric condition (9.1) than in the skewed (9.8) or deterministic (9.8) conditions (p = 0.022, two-tailed). Dropping participants who answered fewer than ten questions correctly did not change the inferences drawn from this study.
Given our automated rule to accept or reject based on the sampling results, we expect these results to be reflected in incorrect agreements (i.e., missed frauds). To investigate this expectation, we performed a separate analysis of incorrect agreements, which yielded similar results. Specifically, a model substituting the number of incorrect agreements for sample-size decisions was significant for penalty distribution (F = 6.83, p = 0.0025). The pattern in the mean rate of incorrect agreements was also consistent with the sampling decisions: the average rate of incorrect agreements was 22.3, 10.2, and 8.2% for the skewed, deterministic, and symmetrical distributions, respectively.
In reviewing Fig. 3, we noted that all three treatment groups appear to vary their sampling decisions in a similar pattern over the 36 rounds. Since the three treatment groups appeared to be responding to something that changed from round to round and was similar across treatments, we investigated whether participants’ sampling decisions were influenced by the reported blue balls. The Pearson correlation between sampling decisions and reported blue balls is 0.295, which is significant at p < 0.0001. We found this result interesting given that reported blue balls provided relevant information only when they numbered fewer than 20 or exactly 100, and participants were not allowed to make sampling decisions in those cases. Thus, in all cases where participants made sampling decisions, the reported blue balls provided no incremental information (for example, a report showing 98 blue balls was equally likely to be fraudulent as a report showing 22 blue balls), and participants appear to have been responding to irrelevant information. We suspect that our participants are demonstrating the robust finding in behavioral decision research that humans often violate normative rules of probabilistic reasoning (Nisbett et al. 1983).
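The correlation reported above can be computed from round-level data with the standard sample Pearson formula. The sketch below uses made-up round-level numbers purely to show the mechanics; they are not the study's data, and the near-perfect correlation they produce is an artifact of the toy inputs.

```python
import math

def pearson_r(xs, ys):
    """Sample Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Hypothetical round-level observations (not the experiment's data):
# the number of blue balls in the auditee's report and the verifier's
# chosen sample size in the same round.
reported_blue = [22, 35, 98, 60, 41, 77, 29, 84]
sample_size   = [12, 15, 30, 20, 16, 25, 13, 28]

print(round(pearson_r(reported_blue, sample_size), 3))
```

In the study itself the analogous computation over participants' actual round-by-round decisions yields r = 0.295.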
Because of technical difficulties, participants in one condition completed only 20 rounds (including practice rounds) instead of the expected number of rounds. As a result, we use only the first 20 rounds of the other condition in our analyses. Inferences do not change when we include the omitted rounds in our analyses.
These statements were made in August 2005 at the American Accounting Association’s annual meeting.
References
Allais, M. (1953). The behavior of rational man in risk situations—A critique of the axioms and postulates of the American school. Econometrica, 21, 503–546.
Ball, S., & Cech, P. (1996). Subject pool choice and treatment effects in economic laboratory research. Research in Experimental Economics, 6, 239–292.
Bernoulli, D. (1738/1954). Exposition of a new theory on the measurement of risk (trans: Sommer, L). Econometrica, 22, 22–36.
Bloomfield, R. J. (1997). Strategic dependence and the assessment of fraud risk: A laboratory study. The Accounting Review, 72(4), 517–538.
Camerer, C. (1990). Behavioral game theory. In R. M. Hogarth (Ed.), Insights in decision making: A tribute to Hillel J. Einhorn. Chicago: University of Chicago Press.
Camerer, C. (1997). Progress in behavioral game theory. Journal of Economic Perspectives, 11, 167–188.
Camerer, C. (2003). Behavioral game theory: Experiments in strategic interaction. Princeton: Princeton University Press.
Cunningham, L. A. (2004). A model financial statement insurance act. Connecticut Insurance Law Journal, 11. Available at SSRN http://ssrn.com/abstract=588268.
Demski, J. S. (2004). Endogenous expectations. The Accounting Review, 79(2), 519–539.
Dopuch, N., & King, R. R. (1992). Negligence versus strict liability regimes in auditing: An experimental investigation. The Accounting Review, 67(1), 97–120.
Dopuch, N., King, R. R., & Schatzberg, J. W. (1994). An experimental investigation of alternative damage-sharing liability regimes with an auditing perspective. Journal of Accounting Research, 32(Supplement), 103–130.
Ellsberg, D. (1961). Risk, ambiguity, and the savage axioms. Quarterly Journal of Economics, 75, 643–669.
Fellingham, J., & Newman, P. (1985). Strategic considerations in auditing. The Accounting Review, 60(4), 634–650.
Fennema, H., & Wakker, P. (1997). Original and cumulative prospect theory: A discussion of empirical differences. Journal of Behavioral Decision Making, 10, 53–64.
Fox, C. R., & Tversky, A. (1998). A belief-based account of decision under uncertainty. Management Science, 44, 879–895.
Kachelmeier, S., & King, R. R. (2002). Using laboratory experiments to evaluate accounting policy issues. Accounting Horizons, 16(3), 219–232.
Kahneman, D., & Tversky, A. (1979). Prospect theory: An analysis of decision under risk. Econometrica, 47, 263–291.
Kim, C. K., & Waller, W. S. (2005). A behavioral accounting study of strategic interaction in a tax compliance game. In R. Zwick & A. Rapoport (Eds.), Experimental business research (Vol. II). Boston: Kluwer.
King, R. R., & Schwartz, R. (1999). Legal penalties and audit quality: An experimental investigation. Contemporary Accounting Research, 16(4), 685–710.
King, R. R., & Schwartz, R. (2000). An experimental investigation of auditors’ liability: Implications for social welfare and exploration of deviations from theoretical predictions. The Accounting Review, 75(4), 429–451.
Kunda, Z. (1990). The case for motivated reasoning. Psychological Bulletin, 108(3), 480–498.
Newman, P., & Noel, J. (1989). Error rates, detection rates, and payoff functions in auditing. Auditing: A Journal of Practice and Theory, 8(Supplement), 50–63.
Nisbett, R. E., Krantz, D. H., Jepson, C., & Kunda, Z. (1983). The use of statistical heuristics in everyday inductive reasoning. Psychological Review, 90, 339–363.
Public Accounting Report. (2008). Largest financial settlements by accounting firms. December 15 Issue: 23.
Ronen, J. (2002). Post-Enron reform: Financial-statement insurance and GAAP revisited. Stanford Journal of Law, Business & Finance, 8(1), 39–68.
Schkade, D., Sunstein, C., & Kahneman, D. (2000). Deliberating about dollars: The severity shift. Columbia Law Review, 100, 1139–1175.
Shibano, T. (1990). Assessing audit risk from errors and irregularities. Journal of Accounting Research, 28(Supplement), 110–140.
Simon, H. A. (1982). Models of bounded rationality: Behavioral economics and business organization. Cambridge, MA: MIT Press.
Tversky, A., & Fox, C. R. (1995). Weighing risk and uncertainty. Psychological Review, 102, 269–283.
Tversky, A., & Kahneman, D. (1981). The framing of decisions and the psychology of choice. Science, 211, 453–463.
Tversky, A., & Kahneman, D. (1992). Advances in prospect theory: Cumulative representation of uncertainty. Journal of Risk and Uncertainty, 5, 297–323.
Von Neumann, J., & Morgenstern, O. (1947). Theory of games and economic behavior. Princeton, NJ: Princeton University Press.
Wissler, R., Hart, A., & Saks, M. (1999). Decision making about general damages: A comparison of jurors, judges, and lawyers. Michigan Law Review, 98, 752–826.
Zimbelman, M., & Waller, W. (1999). An experimental investigation of auditor–auditee interaction under ambiguity. Journal of Accounting Research, 37(Supplement), 135–155.
Acknowledgments
We are grateful for comments from workshop participants at the Brigham Young University Accounting Research Symposium, the University of Utah, the University of Arizona, Boston College, and the University of Nevada, Las Vegas. We also appreciate helpful written comments from Jim Bierstaker, Brian Mayhew, and Jason Smith.
Burton, F.G., Wilks, T.J. & Zimbelman, M.F. The impact of audit penalty distributions on the detection and frequency of fraudulent reporting. Rev Account Stud 16, 843–865 (2011). https://doi.org/10.1007/s11142-011-9152-9