An Automated Literature Analysis on Data Mining Applications to Credit Risk Assessment

  • Chapter
Artificial Intelligence in Financial Markets

Abstract

This chapter presents an automated literature analysis of data mining applications to credit risk assessment, covering the period from 2010 to 2014. Google Scholar was used to collect the 100 most relevant articles published in management and information systems conferences and journals that contain the keywords ‘data mining’ and ‘credit risk’. This set of articles served as the basis for assessing the main research trends in data mining applications to credit risk, first through text mining and then through the latent Dirichlet allocation (LDA) algorithm, which groups the articles into logical topics.

Five types of credit risk problems were assessed: credit scoring, bankruptcy, credit fraud, credit cards and regulatory issues. Of these, credit scoring received the most attention and was confirmed as the dominant subject, while bankruptcy and credit fraud were each the topic of a significant number of articles. The most interesting finding is that the most advanced data mining techniques, such as support vector machines and ensembles, are being applied to credit risk problems more to tune the techniques themselves than to benefit credit risk assessment. This represents a research gap to be addressed: more focus should be given to the value that data mining brings to risk assessment. The trends identified demonstrate the value of the automated procedure undertaken, which is novel in credit risk applications. The findings also suggest that regulatory issues are demanding research in data quality, in line with the banking regulation prompted by the global financial crisis.
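
The two-step procedure described in the abstract (text mining followed by latent Dirichlet allocation) can be illustrated with a minimal sketch. This is not the authors' implementation: it is a plain Python collapsed Gibbs sampler written for illustration only. The six article titles are invented, and the stopword list, the number of topics, the hyperparameters `alpha` and `beta`, and the function names `tokenize` and `fit_lda` are all assumptions.

```python
import random
import re

# Assumed, deliberately tiny stopword list (illustrative only).
STOPWORDS = {"a", "an", "and", "for", "in", "of", "on", "the", "to", "with", "using"}

def tokenize(text):
    """Text mining step: lowercase, keep alphabetic tokens, drop stopwords."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS]

def fit_lda(docs, n_topics, n_iter=200, alpha=0.1, beta=0.01, seed=0):
    """Collapsed Gibbs sampling for LDA.

    Returns (theta, topic_word, vocab): per-document topic proportions,
    per-topic word counts, and the sorted vocabulary.
    """
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    n_words = len(vocab)
    doc_topic = [[0] * n_topics for _ in docs]
    topic_word = [[0] * n_words for _ in range(n_topics)]
    topic_total = [0] * n_topics
    assignments = []
    # Random initial topic assignment for every token.
    for d, doc in enumerate(docs):
        zs = []
        for w in doc:
            k = rng.randrange(n_topics)
            zs.append(k)
            doc_topic[d][k] += 1
            topic_word[k][word_id[w]] += 1
            topic_total[k] += 1
        assignments.append(zs)
    # Resample each token's topic conditioned on all other assignments.
    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                v = word_id[w]
                k = assignments[d][i]
                doc_topic[d][k] -= 1
                topic_word[k][v] -= 1
                topic_total[k] -= 1
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][v] + beta)
                    / (topic_total[t] + n_words * beta)
                    for t in range(n_topics)
                ]
                k = rng.choices(range(n_topics), weights=weights)[0]
                assignments[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][v] += 1
                topic_total[k] += 1
    theta = [
        [(c + alpha) / (len(doc) + n_topics * alpha) for c in counts]
        for counts, doc in zip(doc_topic, docs)
    ]
    return theta, topic_word, vocab

# Invented titles standing in for the collected articles.
titles = [
    "Credit scoring with support vector machines",
    "Ensemble classifiers for credit scoring models",
    "Bankruptcy prediction using neural networks",
    "Hybrid models for corporate bankruptcy prediction",
    "Detecting credit card fraud with data mining",
    "Fraud detection in credit card transactions",
]
docs = [tokenize(t) for t in titles]
theta, topic_word, vocab = fit_lda(docs, n_topics=3)

# Group each article under its most probable topic.
dominant = [max(range(3), key=lambda k: row[k]) for row in theta]
for title, k in zip(titles, dominant):
    print(k, title)

# Show the three most frequent terms of each topic.
for k in range(3):
    top = sorted(range(len(vocab)), key=lambda v: -topic_word[k][v])[:3]
    print(k, [vocab[v] for v in top])
```

With so few and such short documents the grouping is unstable and depends on the seed; the sketch only shows the mechanics of turning text into token lists and assigning each article to its dominant topic.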


Copyright information

© 2016 The Author(s)

About this chapter

Cite this chapter

Moro, S., Cortez, P., Rita, P. (2016). An Automated Literature Analysis on Data Mining Applications to Credit Risk Assessment. In: Dunis, C., Middleton, P., Karathanasopolous, A., Theofilatos, K. (eds) Artificial Intelligence in Financial Markets. New Developments in Quantitative Trading and Investment. Palgrave Macmillan, London. https://doi.org/10.1057/978-1-137-48880-0_6

  • DOI: https://doi.org/10.1057/978-1-137-48880-0_6

  • Publisher Name: Palgrave Macmillan, London

  • Print ISBN: 978-1-137-48879-4

  • Online ISBN: 978-1-137-48880-0

  • eBook Packages: Economics and Finance, Economics and Finance (R0)
