Artificial Intelligence and Fraud Detection

Bao, Yang; Hilary, Gilles; Ke, Bin

doi:10.1007/978-3-030-75729-8_8

Part of the book series: Springer Series in Supply Chain Management (SSSCM, volume 11)

Abstract

Fraud exists in all walks of life, and detecting and preventing it is an important research question for many stakeholders in society. With the rise of big data and artificial intelligence, new opportunities have arisen to detect fraud using advanced machine learning models. This chapter provides a comprehensive overview of the challenges in detecting fraud using machine learning. We use a framework (data, method, and evaluation criterion) to review some of the practical considerations that may affect the implementation of machine-learning models to predict fraud. We then review selected papers from the academic literature across different disciplines that can help address some of these fraud detection challenges. Finally, we suggest promising future directions for this line of research. Because accounting fraud constitutes an important class of fraud, we discuss all of these issues within the context of accounting fraud detection.

We thank Kai Guo for research assistance.

Notes

1. https://www.wsj.com/articles/borrower-beware-credit-card-fraud-attempts-rise-during-the-coronavirus-crisis-11590571800
2. https://www.technologyreview.com/2019/11/18/131912/6-essentials-for-fighting-fraud-with-machine-learning/
3. See Boute et al. (2022) for a more in-depth discussion on the use of AI in financial services.
4. In a 10-year review of corporate accounting fraud commissioned by the Committee of Sponsoring Organizations of the Treadway Commission (COSO), Beasley et al. (2010) find a total cumulative misstatement or misappropriation of nearly $120 billion across 300 fraud cases with available information (a mean of nearly $400 million per case) (Beasley et al., 1999).
5. See SAS No. 99 (American Institute of Certified Public Accountants, 2002) for a discussion of this issue in a U.S. context.
6. https://www.federalreserve.gov/publications/files/changes-in-us-payments-fraud-from-2012-to-2016-20181016.pdf
7. See Zhang et al. (2015) for a good discussion of these issues.
8. https://spectrum.ieee.org/riskfactor/computing/software/michigans-midas-unemployment-system-algorithm-alchemy-that-created-lead-not-gold
9. https://www.freep.com/story/news/local/michigan/2017/07/30/fraud-charges-unemployment-jobless-claimants/516332001/
10. http://www.eurofinas.org/uploads/documents/Non-visible/Eurofinas-Accis_ReportOnFraud_WEB.pdf
11. https://www.corporatecomplianceinsights.com/the-growing-problem-of-corporate-fraud/
12. https://technode.com/2019/12/19/tencent-xiaomi-apps-called-out-for-illegal-data-collection/
13. Supervised models “learn” from labeled data. To train a supervised model, one presents both fraudulent and non-fraudulent records that have been labeled as such. Unsupervised models, by contrast, are left to “learn” the structure of the data on their own, without labels.
14. https://nvlpubs.nist.gov/nistpubs/ir/2019/NIST.IR.8280.pdf
15. https://www.ftc.gov/news-events/blogs/business-blog/2020/04/using-artificial-intelligence-algorithms
16. https://mit-insights.ai/6-essentials-for-fighting-fraud-with-machine-learning/
17. This section relies heavily on Bao et al. (2020).
18. https://www.technologyreview.com/2020/05/11/1001563/covid-pandemic-broken-ai-machine-learning-amazon-retail-fraud-humans-in-the-loop/
19. https://arxiv.org/abs/1808.03305
20. This section borrows heavily from the online appendix of Bao et al. (2020).
21. Recurrent neural networks are artificial neural networks in which connections between nodes form a directed graph along a temporal sequence.
22. https://www.cio.com/article/3525877/serious-fraud-office-cto-ben-denison-reveals-how-ai-is-transforming-legal-work.html
23. https://customerthink.com/why-85-of-the-artificial-intelligence-projects-fail/
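
The distinction drawn in note 13 between supervised and unsupervised models can be illustrated with a minimal sketch. It is not taken from the chapter: the toy transaction amounts, the nearest-centroid rule, the z-score rule, and the threshold of 1.0 are all hypothetical choices made only for illustration. The supervised function is given the fraud labels; the unsupervised function sees the amounts alone.

```python
from statistics import mean, pstdev

# Hypothetical toy records: (transaction amount, label); 1 = fraud, 0 = legitimate.
labeled = [(20.0, 0), (35.0, 0), (25.0, 0), (30.0, 0), (900.0, 1), (1100.0, 1)]


def fit_supervised(records):
    """Supervised: learn one centroid (mean amount) per class from labeled records."""
    labels = {lab for _, lab in records}
    return {lab: mean(x for x, l in records if l == lab) for lab in labels}


def predict_supervised(centroids, x):
    """Assign the label whose learned centroid is closest to x."""
    return min(centroids, key=lambda lab: abs(x - centroids[lab]))


def flag_unsupervised(values, threshold=1.0):
    """Unsupervised: flag values far from the bulk of the data, using no labels.

    The z-score rule and the threshold are illustrative assumptions.
    """
    mu, sigma = mean(values), pstdev(values)
    return [abs(v - mu) / sigma > threshold for v in values]


centroids = fit_supervised(labeled)
prediction = predict_supervised(centroids, 950.0)        # uses the labels
flags = flag_unsupervised([x for x, _ in labeled])       # ignores the labels
print(prediction, flags)
```

In this toy setting both approaches single out the two large amounts, but only the supervised one needed to be told which records were fraudulent. With real fraud data, where labels are scarce and fraud is rare, the two approaches can behave very differently.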

References

Abbasi, A., Albrecht, C., Vance, A., & Hansen, J. (2012). Metafraud: A meta-learning framework for detecting financial fraud. MIS Quarterly, 1293–1327.
American Institute of Certified Public Accountants. (2002). Consideration of fraud in a financial statement audit. Statement on Auditing Standards No. 99. New York.
Amiram, D., Bozanic, Z., & Rouen, E. (2015). Financial statement errors: Evidence from the distributional properties of financial statement numbers. Review of Accounting Studies, 20, 1540–1593.
Ashton, R. H. (1974). Behavioral implications of information overload in managerial accounting reports. Cost and Management, 48(4), 37–40.
Baltrušaitis, T., Ahuja, C., & Morency, L. P. (2018). Multimodal machine learning: A survey and taxonomy. IEEE Transactions on Pattern Analysis and Machine Intelligence, 41(2), 423–443.
Bao, Y., Ke, B., Li, B., Yu, Y. J., & Zhang, J. (2020). Detecting accounting fraud in publicly traded US firms using a machine learning approach. Journal of Accounting Research, 58(1), 199–235.
Beasley, M. S. (1996). An empirical analysis of the relation between the board of director composition and financial statement fraud. The Accounting Review, 71, 443–465.
Beasley, M. S., Carcello, J. V., & Hermanson, D. R. (1999). Fraudulent financial reporting: 1987–1997: An Analysis of U.S. Public Companies. Sponsored by the Committee of Sponsoring Organizations of the Treadway Commission (COSO).
Beasley, M. S., Carcello, J. V., Hermanson, D. R., & Neal, T. L. (2010). Fraudulent financial reporting: 1998–2007: An Analysis of U.S. Public Companies. Sponsored by the Committee of Sponsoring Organizations of the Treadway Commission (COSO).
Bekker, J., & Davis, J. (2020). Learning from positive and unlabeled data: A survey. Machine Learning, 109(4), 719–760.
Beneish, M. D. (1997). Detecting GAAP violation: Implications for assessing earnings management among firms with extreme financial performance. Journal of Accounting and Public Policy, 16, 271–309.
Beneish, M. D. (1999). The detection of earnings manipulation. Financial Analysts Journal, 55, 24–36.
Benbasat, I., & Taylor, R. N. (1982). Behavioral aspects of information processing for the design of management information systems. IEEE Transactions on Systems, Man, and Cybernetics, 12(4), 439–450.
Beutel, A., Akoglu, L., & Faloutsos, C. (2015). Graph-based user behavior modeling: from prediction to fraud detection. Proceedings of the 21th ACM SIGKDD international conference on knowledge discovery and data mining. pp. 2309–2310.
Brazdil, P., Carrier, C. G., Soares, C., & Vilalta, R. (2008). Metalearning: Applications to data mining. Springer Science & Business Media.
Boute, R. N., Gijsbrechts, J., & Van Mieghem, J. A. (2022). Digital lean operations: Smart automation and artificial intelligence in financial services. In V. Babich, J. Birge, & G. Hilary (Eds.), Innovative technology at the interface of finance and operations. Springer Series in Supply Chain Management. Springer Nature.
Brazel, J. F., Jones, K. L., & Zimbelman, M. F. (2009). Using nonfinancial measures to assess fraud risk. Journal of Accounting Research, 47(5), 1135–1166.
Brown, N. C., Crowley, R. M., & Elliott, W. B. (2020). What are you saying? Using topic to detect financial misreporting. Journal of Accounting Research, 58, 237–291.
Burns, N., & Kedia, S. (2006). The impact of performance-based compensation on misreporting. Journal of Financial Economics, 79, 35–67.
Cao, S., Yang, X., Chen, C., Zhou, J., Li, X., & Qi, Y. (2019). TitAnt: Online real-time transaction fraud detection in Ant Financial. arXiv preprint arXiv:1906.07407.
Chen, X., Hilary, G., & Tian, X. (2020). Mandatory data breach transparency and insider trading. Working paper.
Cecchini, M., Aytug, H., Koehler, G. J., & Pathak, P. (2010). Making words work: Using financial text as a predictor of financial events. Decision Support Systems, 50(1), 164–175.
Citron, D. K. (2008). Technological due process. Washington University Law Review, 85, 1249.
Darrough, M., Huang, R., & Zhao, S. (2020). Spillover effects of fraud allegations and investor sentiment. Contemporary Accounting Research, 37, 982–1014.
Davidson, R., Dey, A., & Smith, A. (2015). Executives’ “off-the-job” behavior, corporate culture, and financial reporting risk. Journal of Financial Economics, 117(1), 5–28.
de Roux, D., Perez, B., Moreno, A., Villamil, M. D. P., & Figueroa, C. (2018). Tax fraud detection for under-reporting declarations using an unsupervised machine learning approach. Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining. pp. 215–222.
Dechow, P. M., Sloan, R. G., & Sweeney, A. P. (1995). Detecting earnings management. The Accounting Review, 70(2), 193–226.
Dechow, P. M., Sloan, R. G., & Sweeney, A. P. (1996). Causes and consequences of earnings manipulation: An analysis of firms subject to enforcement actions by the SEC. Contemporary Accounting Research, 13, 1–36.
Dechow, P. M., Ge, W., Larson, C. R., & Sloan, R. G. (2011). Predicting material accounting misstatements. Contemporary Accounting Research, 28(1), 17–82.
Dong, W., Liao, S., & Zhang, Z. (2018). Leveraging financial social media data for corporate fraud detection. Journal of Management Information Systems, 35(2), 461–487.
Dutta, I., Dutta, S., & Raahemi, B. (2017). Detecting financial restatements using data mining techniques. Expert Systems with Applications, 90, 374–393.
Dyck, A., Morse, A., & Zingales, L. (2020). How pervasive is corporate fraud? Working paper, University of Toronto.
Efendi, J., Srivastava, A., & Swanson, E. P. (2007). Why do corporate managers misstate financial statements? The role of option compensation and other factors. Journal of Financial Economics, 85, 667–708.
Ernst & Young. (2010). Driving ethical growth: New markets, new challenges. 11th Global Fraud Survey. Retrieved from https://linomartins.files.wordpress.com/2011/12/2011th_global_fraud_survey.pdf
Fawcett, T. (2006). An introduction to ROC analysis. Pattern Recognition Letters, 27, 861–874.
Fiore, U., De Santis, A., Perla, F., Zanetti, P., & Palmieri, F. (2019). Using generative adversarial networks for improving classification effectiveness in credit card fraud detection. Information Sciences, 479, 448–455.
Glancy, F. H., & Yadav, S. B. (2011). A computational model for financial reporting fraud detection. Decision Support Systems, 50(3), 595–601.
Garip, F. (2020). What failure to predict life outcomes can teach us. Proceedings of the National Academy of Sciences, 117(15), 8234–8235.
Green, P., & Choi, J. H. (1997). Assessing the risk of management fraud through neural network technology. Auditing: A Journal of Practice & Theory, 16, 14–29.
Guo, J., Liu, G., Zuo, Y., & Wu, J. (2018). Learning sequential behavior representations for fraud detection. 2018 IEEE international conference on data mining (ICDM). IEEE, pp. 127–136.
Hajek, P., & Henriques, R. (2017). Mining corporate annual reports for intelligent detection of financial statement fraud–a comparative study of machine learning methods. Knowledge-Based Systems, 128, 139–152.
Hastie, T., Tibshirani, R., & Friedman, J. H. (2009). The elements of statistical learning. Springer.
He, H., & Ma, Y. (2013). Imbalanced learning: Foundations, algorithms, and applications. Wiley.
Healy, P. M. (1985). The effect of bonus schemes on accounting decisions. Journal of Accounting and Economics, 7(1), 85–107.
Hobson, J. L., Mayew, W. J., & Venkatachalam, M. (2012). Analyzing speech to detect financial misreporting. Journal of Accounting Research, 50(2), 349–392.
Hoi, S. C., Sahoo, D., Lu, J., & Zhao, P. (2018). Online learning: A comprehensive survey. arXiv preprint arXiv:1802.02871.
Hu, B., Zhang, Z., Shi, C., Zhou, J., Li, X., & Qi, Y. (2019). Cash-out user detection based on attributed heterogeneous information network with a hierarchical attention mechanism. Proceedings of the AAAI Conference on Artificial Intelligence. pp. 946–953.
Humpherys, S. L., Moffitt, K. C., Burns, M. B., Burgoon, J. K., & Felix, W. F. (2011). Identification of fraudulent financial statements using linguistic credibility analysis. Decision Support Systems, 50(3), 585–594.
Iselin, E. R. (1988). The effects of information load and information diversity on decision quality in a structured decision task. Accounting, Organizations and Society, 13(2), 147–164.
Järvelin, K., & Kekäläinen, J. (2002). Cumulated gain-based evaluation of IR techniques. ACM Transactions on Information Systems, 20, 422–446.
Johnson, S. A., Ryan, H. E., & Tian, Y. S. (2009). Managerial incentives and corporate fraud: The sources of incentives matter. Review of Finance, 13, 115–145.
Karpoff, J. M., Lee, D. S., & Martin, G. S. (2008). The costs to firms of cooking the books. Journal of Financial and Quantitative Analysis, 43(03), 581–612.
Karpoff, J. M., Koester, A., Lee, D. S., & Martin, G. S. (2017). Proxies and databases in financial misconduct research. The Accounting Review, 92(6), 129–163.
Kleinberg, J., Ludwig, J., Mullainathan, S., & Obermeyer, Z. (2015). Prediction policy problems. American Economic Review: Papers & Proceedings, 105(5), 491–495.
KPMG Peat Marwick. (1998). Fraud Survey. KPMG Peat Marwick.
Larcker, D. F., Richardson, S. A., & Tuna, I. (2007). Corporate governance, accounting outcomes, and organizational performance. The Accounting Review, 82(4), 963–1008.
Larcker, D., & Zakolyukina, A. A. (2012). Detecting deceptive discussion in conference calls. Journal of Accounting Research, 50, 495–540.
Li, H., Liu, B., Mukherjee, A., & Shao, J. (2014). Spotting fake reviews using positive-unlabeled learning. Computación y Sistemas, 18(3), 467–475.
Liang, C., Liu, Z., Liu, B., Zhou, J., Li, X., & Yang, S. (2019). Uncovering insurance fraud conspiracy with network learning. Proceedings of the 42nd International ACM SIGIR Conference on Research and Development in Information Retrieval. pp. 1181–1184.
Lin, J., Hwang, M., & Becker, J. (2003). A fuzzy neural network for assessing the risk of fraudulent financial reporting. Managerial Auditing Journal, 18, 657–665.
Liu, S., Hooi, B., & Faloutsos, C. (2019). A contrast metric for fraud detection in rich graphs. IEEE Transactions on Knowledge and Data Engineering, 31(12), 2235–2248.
Ngai, E. W., Hu, Y., Wong, Y. H., Chen, Y., & Sun, X. (2011). The application of data mining techniques in financial fraud detection: A classification framework and an academic review of literature. Decision Support Systems, 50(3), 559–569.
Oentaryo, R., Lim, E.-P., Finegold, M., Lo, D., Zhu, F., Phua, C., et al. (2014). Detecting click fraud in online advertising: A data mining approach. The Journal of Machine Learning Research, 15(1), 99–140.
Perols, J. L., Bowen, R. M., Zimmermann, C., & Samba, B. (2017). Finding needles in a haystack: Using data analytics to improve fraud prediction. The Accounting Review, 92, 221–245.
Purda, L., & Skillicorn, D. (2015). Accounting variables, deception, and a bag of words: Assessing the tools of fraud detection. Contemporary Accounting Research, 32(3), 1193–1223.
Salganik, M., Lundberg, I., Kindel, A., Ahearn, C., Al-Ghoneim, K., Almaatouq, A., Altschul, D., Brand, J., Carnegie, N., Compton, R., Datta, D., Davidson, T., Filippova, A., Gilroy, C., Goode, B., Jahani, E., Kashyap, R., Kirchner, A., McKay, S., et al. (2020). Measuring the predictability of life outcomes with a scientific mass collaboration. Proceedings of the National Academy of Sciences, 117.
Shah, N., Lamba, H., Beutel, A., & Faloutsos, C. (2017). The many faces of link fraud. 2017 IEEE International Conference on Data Mining (ICDM). IEEE, pp. 1069–1074.
Shmueli, G. (2010). To explain or to predict. Statistical Science, 25, 289–310.
Van Vlasselaer, V., Eliassi-Rad, T., Akoglu, L., Snoeck, M., & Baesens, B. (2017). Gotcha! Network-based fraud detection for social security fraud. Management Science, 63(9), 3090–3110.
Varian, H. R. (2014). Big data: New tricks for econometrics. Journal of Economic Perspectives, 28, 3–28.
Wang, D., Lin, J., Cui, P., Jia, Q., Wang, Z., Fang, Y., et al. (2019a). A Semi-supervised Graph Attentive Network for Financial Fraud Detection. 2019 IEEE International Conference on Data Mining (ICDM). IEEE, pp. 598–607.
Wang, Y., Wang, L., Li, Y., He, D., Chen, W., & Liu, T.-Y. (2013). A theoretical analysis of NDCG ranking measures. In Proceedings of the 26th Annual Conference on Learning Theory.
Wang, J., Wen, R., Wu, C., Huang, Y., & Xiong, J. (2019b). FdGars: Fraudster detection via graph convolutional networks in online app review system. Companion Proceedings of The 2019 World Wide Web Conference. pp. 310–316.
Wang, Y., & Xu, W. (2018). Leveraging deep learning with LDA-based text analytics to detect automobile insurance fraud. Decision Support Systems, 105, 87–95.
Whiting, D. G., Hansen, J. V., McDonald, J. B., Albrecht, C., & Albrecht, W. S. (2012). Machine learning methods for detecting patterns of management fraud. Computational Intelligence, 28, 505–527.
Xu, C., Zhang, J., & Sun, Z. (2017). Online reputation fraud campaign detection in user ratings. IJCAI, 3873–3879.
Yuan, S., Wu, X., Li, J., & Lu, A. (2017). Spectrum-based deep neural networks for fraud detection. Proceedings of the 2017 ACM on Conference on Information and Knowledge Management. pp. 2419–2422.
Zhang, J., Yang, X., & Appelbaum, D. (2015). Toward effective big data analysis in continuous auditing. Accounting Horizons, 29(2), 469–476.
Zhang, Y.-L., Zhou, J., Zheng, W., Feng, J., Li, L., Liu, Z., et al. (2019). Distributed deep forest and its application to automatic detection of cash-out fraud. ACM Transactions on Intelligent Systems and Technology (TIST), 10(5), 1–19.
Zheng, P., Yuan, S., Wu, X., Li, J., & Lu, A. (2019). One-class adversarial nets for fraud detection. Proceedings of the AAAI Conference on Artificial Intelligence. pp. 1286–1293.
Zhong, Q., Liu, Y., Ao, X., Hu, B., Feng, J., Tang, J., et al. (2020). Financial defaulter detection on online credit payment via multi-view attributed heterogeneous information network. Proceedings of The Web Conference 2020. pp. 785–795.
Zhu, Y., Xi, D., Song, B., Zhuang, F., Chen, S., Gu, X., et al. (2020). Modeling users’ behavior sequences with hierarchical explainable network for cross-domain fraud detection. Proceedings of The Web Conference 2020. pp. 928–938.

Author information

Authors and Affiliations

Antai College of Economics and Management, Shanghai Jiao Tong University, Shanghai, People’s Republic of China
Yang Bao
McDonough School of Business, Georgetown University, Washington, DC, USA
Gilles Hilary
Department of Accounting, NUS Business School, National University of Singapore, Singapore, Singapore
Bin Ke

Editor information

Editors and Affiliations

McDonough School of Business, Georgetown University, Washington, DC, USA
Volodymyr Babich
University of Chicago Booth, School of Business, Chicago, IL, USA
John R. Birge
McDonough School of Business, Georgetown University, Washington, DC, USA
Gilles Hilary

About this chapter

Cite this chapter

Bao, Y., Hilary, G., Ke, B. (2022). Artificial Intelligence and Fraud Detection. In: Babich, V., Birge, J.R., Hilary, G. (eds) Innovative Technology at the Interface of Finance and Operations. Springer Series in Supply Chain Management, vol 11. Springer, Cham. https://doi.org/10.1007/978-3-030-75729-8_8

DOI: https://doi.org/10.1007/978-3-030-75729-8_8
Published: 01 January 2022
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-75728-1
Online ISBN: 978-3-030-75729-8
eBook Packages: Economics and Finance (R0)