Machine Learning in Accounting Research

Diginomics Research Perspectives

Abstract

We present a compact overview of machine learning applications in financial accounting, audit, and management accounting research. In these fields, machine learning has the potential to provide novel insights into empirical data and to improve predictive performance. As viable opportunities for future research, we highlight the use of deep learning to process unstructured and structured data more efficiently and a greater focus on model interpretability.

Notes

  1. Sometimes, targets are synonymously referred to as “labels,” “annotations,” “responses,” or simply “outputs.”

  2. AAERs contain financial statements with fraudulent account balances identified within two years after the audited financial statement was released (e.g., Green & Choi, 1997).

  3. For more information about social media analytics, see Kinra et al. (2022).

  4. Because the dataset is balanced, random guessing would yield the correct prediction fifty percent of the time.

  5. For example, by counting the class predicted by each model and selecting the majority class.

  6. Undersampling means randomly deleting cases from the majority class (i.e., non-fraudulent cases in the context of fraud detection). In contrast, oversampling duplicates cases in the minority class (i.e., fraudulent cases in the context of fraud detection). According to Drummond and Holte (2003), undersampling is more effective than oversampling against class and cost imbalances.

  7. The curse of dimensionality refers to the fact that an exhaustive search for an optimal solution over the entire feature space becomes computationally harder with every additional feature (Bellman, 1961).

  8. Target prediction on novel data, i.e., data not used in the training process.

  9. Genetic algorithms are inspired by natural selection: a model evolves over generations, and random mutations are allowed and retained if they serve the model objective.

  10. Yelp is a crowd-sourced platform for business reviews.
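
The majority vote in Note 5 and the random resampling in Note 6 can be illustrated with a minimal Python sketch. It is not taken from the chapter or from any cited study: the function names, the toy labels, and the choice to resample until both classes are exactly equal in size are assumptions made only for illustration, and the sketch assumes a binary target whose minority class is the smaller one.

```python
import random
from collections import Counter


def majority_vote(predictions):
    """Return the class predicted by most models for one case (Note 5).

    Ties go to the label encountered first, an arbitrary choice
    made for this sketch.
    """
    return Counter(predictions).most_common(1)[0][0]


def undersample(cases, labels, majority_label, rng):
    """Randomly delete majority-class cases (Note 6).

    Assumption: cases are dropped until both classes are equal in size.
    """
    majority = [i for i, y in enumerate(labels) if y == majority_label]
    minority = [i for i, y in enumerate(labels) if y != majority_label]
    kept = sorted(rng.sample(majority, len(minority)) + minority)
    return [cases[i] for i in kept], [labels[i] for i in kept]


def oversample(cases, labels, minority_label, rng):
    """Randomly duplicate minority-class cases (Note 6).

    Assumption: duplicates are added until both classes are equal in size.
    """
    minority = [i for i, y in enumerate(labels) if y == minority_label]
    majority = [i for i, y in enumerate(labels) if y != minority_label]
    extra = [rng.choice(minority) for _ in range(len(majority) - len(minority))]
    idx = list(range(len(labels))) + extra
    return [cases[i] for i in idx], [labels[i] for i in idx]


# Hypothetical toy data: 2 fraudulent and 8 non-fraudulent cases.
labels = ["fraud"] * 2 + ["non-fraud"] * 8
cases = list(range(10))
rng = random.Random(0)  # fixed seed so the sketch is reproducible

u_cases, u_labels = undersample(cases, labels, "non-fraud", rng)
o_cases, o_labels = oversample(cases, labels, "fraud", rng)

print(majority_vote(["fraud", "non-fraud", "fraud"]))
print(Counter(u_labels))
print(Counter(o_labels))
```

Undersampling discards information from the majority class, whereas oversampling repeats minority cases without adding new information; which of the two is preferable in a given setting is an empirical question (see Note 6).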

References

  • Abdolmohammadi, M. J. (1991). Factors affecting auditors’ perceptions of applicable decision aids for various audit tasks. Contemporary Accounting Research, 7(2), 535–548.

  • Al Asheeri, M. M., & Hammad, M. (2019). Machine learning models for software cost estimation. International Conference on Innovation and Intelligence for Informatics, Computing, and Technologies, 3ICT 2019 (pp. 1–6). IEEE.

  • Al-Radaideh, Q. A., & Al Nagi, E. (2012). Using data mining techniques to build a classification model for predicting employees performance. International Journal of Advanced Computer Science and Applications, 3(2), 144–151.

  • Altman, E. I. (1968). Financial ratios, discriminant analysis and the prediction of corporate bankruptcy. Journal of Finance, 23(4), 589–609.

  • Appelbaum, D., Kogan, A., Vasarhelyi, M. A., & Yan, Z. (2017). Impact of business analytics and enterprise systems on managerial accounting. International Journal of Accounting Information Systems, 25, 29–44.

  • Association of Certified Fraud Examiners. (2020). Report to the Nations: 2020 Global Study on Occupational Fraud and Abuse.

  • Baldwin, A. A., Brown, C. E., & Trinkle, B. S. (2006). Opportunities for artificial intelligence development in the accounting domain: The case for auditing. Intelligent Systems in Accounting, Finance and Management, 14(3), 77–86.

  • Bao, Y., Hilary, G., & Ke, B. (2020). Artificial intelligence and fraud detection. In Innovative technology at the interface of finance and operations (working paper).

  • Bao, Y., Ke, B., Li, B., Yu, J., & Zhang, J. (2020). Detecting accounting fraud in publicly traded U.S. firms using a machine learning approach. Journal of Accounting Research, 58(1), 199–235.

  • Basu, S. (1997). The conservatism principle and the asymmetric timeliness of earnings. Journal of Accounting and Economics, 24(1), 3–37.

  • Beaver, W. H. (1966). Financial ratios as predictors of failure. Journal of Accounting Research, 4, 71–111.

  • Bellman, R. E. (1961). Adaptive control processes. Princeton University Press.

  • Bertomeu, J., Cheynel, E., Floyd, E., & Pan, W. (2020). Using machine learning to detect misstatements. Review of Accounting Studies, 26, 468–519.

  • Bertomeu, J., Cheynel, E., Liao, Y., & Milone, M. (2021). Using machine learning to measure conservatism. Working Paper Washington University in St. Louis, available under: https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3924961

  • Bhimani, A., Datar, S. M., Horngren, C. T., & Rajan, M. V. (2019). Management and cost accounting (7th ed.). Pearson Education, Limited.

  • Boritz, E. J., & Kennedy, D. B. (1995). Effectiveness of neural network types for prediction of business failure. Expert Systems with Applications, 9(4), 503–512.

  • Brandas, C., Muntean, M., & Didraga, O. (2018). Intelligent decision support in auditing: Big data and machine learning approach. In 17th International Conference on Informatics in economy, IE 2018, Education, Research & Business Technologies, The Bucharest University of Economic Studies, Bucharest, Romania, pp. 425–430.

  • Brown, N. C., Crowley, R. M., & Elliott, W. B. (2020). What are you saying? Using topic to detect financial misreporting. Journal of Accounting Research, 58(1), 237–291.

  • Cao, M., Chychyla, R., & Stewart, T. (2015). Big data analytics in financial statement audits. Accounting Horizons, 29(2), 423–429.

  • Cecchini, M., Aytug, H., Koehler, G. J., & Pathak, P. (2010a). Detecting management fraud in public companies. Management Science, 56(7), 1146–1160.

  • Cecchini, M., Aytug, H., Koehler, G. J., & Pathak, P. (2010b). Making words work: Using financial text as a predictor of financial events. Decision Support Systems, 50(1), 164–175.

  • Chandanshive, V. B., & Kambekar, A. R. (2019). Estimation of building construction cost using artificial neural networks. Journal of Soft Computing in Civil Engineering, 3(1), 91–107.

  • Chen, C., Liaw, A., & Breiman, L. (2004). Using random forest to learn imbalanced data. Technical Report 666, Department of Statistics, University of California, Berkeley.

  • Cheriyan, S., Ibrahim, S., Mohanan, S., & Treesa, S. (2018). Intelligent sales prediction using machine learning techniques. In International Conference on Computing, Electronics & Communications Engineering (ICCECE).

  • Cokins, G. (2013). Top 7 trends in management accounting. Strategic Finance, 95(6), 21–30.

  • Craja, P., Kim, A., & Lessmann, S. (2020). Deep learning for detecting financial statement fraud. Decision Support Systems, 139, 113421.

  • Dastile, X., & Celik, T. (2021). Making deep learning-based predictions for credit scoring explainable. IEEE Access, 9, 50426–50440.

  • Dechow, P. M., Ge, W., Larson, C. R., & Sloan, R. G. (2011). Predicting material accounting misstatements. Contemporary Accounting Research, 28(1), 17–82.

  • Dong, W., Liao, S., & Zhang, Z. (2018). Leveraging financial social media data for corporate fraud detection. Journal of Management Information Systems, 35(2), 461–487.

  • Drummond, C., & Holte, R. C. (2003). C4.5, class imbalance, and cost sensitivity: Why undersampling beats oversampling. In Proceedings of the Twentieth International Conference on Machine Learning: Workshop – Learning from imbalanced data sets II, pp. 1–8.

  • du Jardin, P. (2016). A two-stage classification technique for bankruptcy prediction. European Journal of Operational Research, 254(1), 236–252.

  • Dutta, I., Dutta, S., & Raahemi, B. (2017). Detecting financial restatements using data mining techniques. Expert Systems with Applications, 90(30), 374–393.

  • Earley, C. E. (2015). Data analytics in auditing: Opportunities and challenges. Business Horizons, 58(5), 493–500.

  • Fanning, K. M., & Cogger, K. O. (1998). Neural network detection of management fraud using published financial data. Intelligent Systems in Accounting, Finance and Management, 7(1), 21–41.

  • Gepp, A., Linnenluecke, M. K., O’Neill, T. J., & Smith, T. (2018). Big data techniques in auditing research and practice: Current trends and future opportunities. Journal of Accounting Literature, 40, 102–115.

  • Géron, A. (2017). Hands-on machine learning with Scikit-learn and TensorFlow. O’Reilly Media, Inc.

  • Goodfellow, I., Bengio, Y., & Courville, A. (2016). Deep learning. MIT Press.

  • Green, B. P., & Choi, J. H. (1997). Assessing the risk of management fraud through neural network technology. Auditing: A Journal of Practice & Theory, 16(1), 14–28.

  • Gruszczyński, M. (2019). On unbalanced sampling in bankruptcy prediction. International Journal of Financial Studies, 7(2), 28.

  • Gunnarsson, B. R., vanden Broucke, S., Baesens, B., Óskarsdóttir, M., & Lemahieu, W. (2021). Deep learning for credit scoring: Do or don’t? European Journal of Operational Research, 295(1), 292–305.

  • Hagemann, V., & Klug, K. (2022). Human resource management in a digital environment. In L. Hornuf (Ed.), Diginomics research perspectives: The role of digitalization in business and society (pp. @@@). Springer International Publishing.

  • Hajek, P., Olej, V., & Myskova, R. (2014). Forecasting corporate financial performance using sentiment in annual reports for stakeholders’ decision-making. Technological and Economic Development of Economy, 20(4), 721–738.

  • Hashemi, S. T., Ebadati, O. M., & Kaur, H. (2020). Cost estimation and prediction in construction projects: A systematic review on machine learning techniques. SN Applied Sciences, 2, 1703.

  • Hobson, J. L., Mayew, W. J., & Venkatachalam, M. (2012). Analyzing speech to detect financial misreporting. Journal of Accounting Research, 50(2), 349–392.

  • Hoitash, R., & Hoitash, U. (2018). Measuring accounting reporting complexity with XBRL. The Accounting Review, 93(1), 259–287. https://doi.org/10.2308/ACCR-51762

  • Hooda, N., Bawa, S., & Rana, P. S. (2018). Fraudulent firm classification: A case study of an external audit. Applied Artificial Intelligence, 32(1), 48–64.

  • Huang, C.-L., Chen, M.-C., & Wang, C.-J. (2007). Credit scoring with a data mining approach based on support vector machines. Expert Systems with Applications, 33(4), 847–856.

  • Humpherys, S. L., Moffitt, K. C., Burns, M. B., Burgoon, J. K., & Felix, W. F. (2011). Identification of fraudulent financial statements using linguistic credibility analysis. Decision Support Systems, 50(3), 585–594.

  • Institute of Management Accountants. (2008). Statements on management accounting: Definition of management accounting. www.imanet.org

  • International Auditing and Assurance Standards Board. (2016). Exploring the growing use of technology in the audit, with a focus on data analytics. IFAC.

  • James, G., Witten, D., Hastie, T., & Tibshirani, R. (2013). An introduction to statistical learning. Springer. https://doi.org/10.1007/978-1-4614-7138-7

  • Kashyap, P. (2017). Machine learning for decision makers. Apress.

  • Katarya, R., Gautam, A., Bandgar, S. P., & Koli, D. (2020). Analyzing customer sentiments using machine learning techniques to improve business performance. In International Conference on Advances in Computing, Communication Control and Networking (ICACCCN).

  • Kinra, A., Kotzab, H., & Siekmann, F. (2022). Social media analytics in operations and supply chain management: Opportunities, challenges and paradoxes. In L. Hornuf (Ed.), Diginomics research perspectives: The role of digitalization in business and society (pp. @@@). Springer International Publishing.

  • Kirimi, J. M., & Moturi, C. A. (2016). Application of data mining classification in employee performance prediction. International Journal of Computer Applications, 146(7), 28–35.

  • Koh, H. C., & Tan, G. (2011). Data mining applications in healthcare. Journal of Healthcare Information Management, 19(2), 64–72.

  • Kokina, J., & Davenport, T. H. (2017). The emergence of artificial intelligence: How automation is changing auditing. Journal of Emerging Technologies in Accounting, 14(1), 115–122.

  • Kondo, S., Miyakawa, D., Shiraki, K., Suga, M., & Usuki, M. (2019). Using machine learning to detect and forecast accounting fraud. RIETI.

  • Kotsiantis, S., Koumanakos, E., Tzelepis, D., & Tampakas, V. (2006). Forecasting fraudulent financial statements using data mining. International Journal of Computational Intelligence, 3(2), 104–110.

  • Koyuncugil, A. S., & Ozgulbas, N. (2012). Financial early warning system model and data mining application for risk detection. Expert Systems with Applications, 39(6), 6238–6253.

  • Kuger, M. (2019). Global bankruptcy report 2019. Dun & Bradstreet Worldwide Network.

  • Kumar, M. R., & Gunjan, V. K. (2020). Review of machine learning models for credit scoring analysis. Ingeniería Solidaria, 16(1), 1–16.

  • Kumar, P. R., & Ravi, V. (2007). Bankruptcy prediction in banks and firms via statistical and intelligent techniques: A review. European Journal of Operational Research, 180(1), 1–28.

  • Lanen, W., Anderson, S., & Maher, M. (2013). Fundamentals of cost accounting (4th ed.). McGraw-Hill Education.

  • Le, T., Lee, M. Y., Park, J. R., & Baik, S. W. (2018). Oversampling techniques for bankruptcy prediction: Novel features from a transaction dataset. Symmetry, 10(4), 79.

  • Le, T., Son, L. H., Vo, M. T., Lee, M. Y., & Baik, S. W. (2018). A cluster-based boosting algorithm for bankruptcy prediction in a highly imbalanced dataset. Symmetry, 10(7), 250.

  • Lee, I. (2017). Big data: Dimensions, evolution, impacts, and challenges. Business Horizons, 60(3), 293–303.

  • Li, B., Yu, J., Zhang, J., & Ke, B. (2016). Detecting accounting frauds in publicly traded U.S. firms: A machine learning approach. Asian Conference on Machine Learning, PMLR, 45, 173–188.

  • Lin, W.-Y., Hu, Y.-H., & Tsai, C.-F. (2012). Machine learning in financial crisis prediction: A survey. IEEE Transactions on Systems, Man, and Cybernetics, 42(4), 421–436.

  • Lum, K. T., Baker, D. R., & Hihn, J. M. (2008). The effects of data mining techniques on software cost estimation. IEEE international conference on engineering management (EMC).

  • Mai, F., Tian, S., Lee, C., & Ma, L. (2019). Deep learning models for bankruptcy prediction using textual disclosures. European Journal of Operational Research, 274(2), 743–758.

  • Melo-Acosta, G. E., Duitama-Munoz, F., & Arias-Londono, J. D. (2017). Fraud detection in big data using supervised and semi-supervised learning techniques. Proceedings of the 2017 IEEE Colombian conference on communications and computing, COLCOM 2017. https://doi.org/10.1109/COLCOMCON.2017.8088206.

  • Mishra, B. K., Hazra, D., Tarannum, K., & Kumar, M. (2016). Business intelligence using data mining techniques and business analytics. International conference on System Modeling & Advancement in research trends (SMART).

  • No, W. G., Lee, K. K., Huang, F., & Li, Q. (2019). Multidimensional audit data selection (MADS): A framework for using data analytics in the audit data selection process. Accounting Horizons, 33(3), 127–140.

  • Office of Compliance Inspections and Examinations of the Securities and Exchange Commission (2015). Examination priorities for 2015. http://www.sec.gov/about/offices/ocie/national-examination-program-priorities-2015.pdf

  • Perols, J. L. (2011). Financial statement fraud detection: An analysis of statistical and machine learning algorithms. Auditing: A Journal of Practice & Theory, 30(2), 19–50.

  • Perols, J. L., Bowen, R. M., Zimmermann, C., & Samba, B. (2017). Finding needles in a haystack: Using data analytics to improve fraud prediction. The Accounting Review, 92(2), 221–245.

  • Psychoula, I., Gutmann, A., Mainali, P., Lee, S. H., Dunphy, P., & Petitcolas, F. A. P. (2021). Explainable machine learning for fraud detection. IEEE Computer Special Issue on Explainable AI and Machine Learning, forthcoming.

  • Purda, L., & Skillicorn, D. (2015). Accounting variables, deception, and a bag of words: Assessing the tools of fraud detection. Contemporary Accounting Research, 32(3), 1193–1223.

  • Shirata, C. Y., Takeuchi, H., Ogino, S., & Watanabe, H. (2011). Extracting key phrases as predictors of corporate bankruptcy: Empirical analysis of annual reports by text mining. Journal of Emerging Technologies in Accounting, 8(1), 31–44.

  • Son, H., Hyun, C., Phan, D., & Hwang, H. J. (2019). Data analytic approach for bankruptcy prediction. Expert Systems with Applications, 138(30), 112816.

  • Son, H., Kim, C., & Kim, C. (2012). Hybrid principal component analysis and support vector machine model for predicting the cost performance of commercial building projects using pre-project planning variables. Automation in Construction, 27, 60–66.

  • Sun, T. (2019). Applying deep learning to audit procedures: An illustrative framework. Accounting Horizons, 33(3), 89–109.

  • Sun, J., & Li, H. (2008). Data mining method for listed companies’ financial distress prediction. Knowledge-Based Systems, 21(1), 1–5.

  • Sun, T., & Vasarhelyi, M. A. (2017). Deep learning and the future of auditing: How an evolving technology could transform analysis and improve judgment. CPA Journal, 87(6), 25–29.

  • Tang, X., Li, S., Tan, M., & Shi, W. (2020). Incorporating textual and management factors into financial distress prediction: A comparative study of machine learning methods. Journal of Forecasting, 39(5), 769–787.

  • Wang, Y., & Wang, Z. (2016). Integrating data mining into managerial accounting system: Challenges and opportunities. Chinese Business Review, 15(1), 33–41.

  • Williams, T. P., & Gong, J. (2014). Predicting construction cost overruns using text mining, numerical data and ensemble classifiers. Automation in Construction, 43, 23–29.

  • Wu, C.-F., Huang, S.-C., Chiou, C.-C., & Wang, Y.-M. (2021). A predictive intelligence system of credit scoring based on deep multiple kernel learning. Applied Soft Computing, 111, 107668.

  • Zięba, M., Tomczak, S. K., & Tomczak, J. M. (2016). Ensemble boosted trees with synthetic features generation in application to bankruptcy prediction. Expert Systems with Applications, 58, 93–101.

Author information

Corresponding author

Correspondence to Thomas Loy.

Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this chapter

Cite this chapter

Fieberg, C., Hesse, M., Loy, T., Metko, D. (2022). Machine Learning in Accounting Research. In: Hornuf, L. (eds) Diginomics Research Perspectives. Advanced Studies in Diginomics and Digitalization. Springer, Cham. https://doi.org/10.1007/978-3-031-04063-4_6