An Automated Literature Analysis on Data Mining Applications to Credit Risk Assessment

Moro, Sérgio; Cortez, Paulo; Rita, Paulo

doi:10.1057/978-1-137-48880-0_6

Part of the book series: New Developments in Quantitative Trading and Investment (QTAM)

Abstract

This chapter presents an automated literature analysis of data mining applications to credit risk assessment, covering the period from 2010 to 2014. Google Scholar was used to collect the 100 most relevant articles published in management and information systems conferences and journals containing the keywords ‘data mining’ and ‘credit risk’. This set of articles served as the basis for assessing the main research trends in data mining applications to credit risk, first through text mining and then through the latent Dirichlet allocation algorithm, which groups the articles into logical topics.

Five types of credit risk problems were assessed: credit scoring, bankruptcy, credit fraud, credit cards and regulatory issues. Of these, credit scoring received the most attention and was confirmed as the dominant subject for data mining applications, while bankruptcy and credit fraud were each the topic of a significant number of articles. The most interesting finding is that advanced data mining techniques, such as support vector machines and ensembles, are being applied to credit risk problems more to tune the techniques themselves than to show the benefits they bring to credit risk assessment. This is a research gap: more focus should be given to the value of data mining to risk assessment. The trends identified support the value of the automated procedure undertaken, which is novel in credit risk applications. The findings also suggest that regulatory issues are demanding research in data quality, in line with the banking regulation prompted by the global crisis.
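
The two-step procedure named in the abstract (text mining against a set of terms, then latent Dirichlet allocation to group articles into topics) can be illustrated with a minimal sketch. This is not the authors' implementation: the term dictionary, the toy titles, the hyperparameters and the choice of a collapsed Gibbs sampler are all assumptions made here for illustration, and the chapter's actual corpus of 100 articles is not reproduced.

```python
import random

def tokenize(text, dictionary):
    """Text-mining step: keep only the words that appear in a term dictionary."""
    tokens = (w.strip(".,;:()'\"") for w in text.lower().split())
    return [t for t in tokens if t in dictionary]

def lda_gibbs(docs, n_topics, n_iter=200, alpha=0.1, beta=0.01, seed=0):
    """Latent Dirichlet allocation fitted by collapsed Gibbs sampling.

    docs is a list of token lists. Returns (doc_topic, topic_word, vocab),
    where each row of doc_topic and topic_word is a probability distribution.
    """
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    n_words = len(vocab)
    ndk = [[0] * n_topics for _ in docs]             # document-topic counts
    nkw = [[0] * n_words for _ in range(n_topics)]   # topic-word counts
    nk = [0] * n_topics                              # tokens per topic
    z = []                                           # topic of every token
    for d, doc in enumerate(docs):                   # random initial assignment
        zd = []
        for w in doc:
            k = rng.randrange(n_topics)
            zd.append(k)
            ndk[d][k] += 1
            nkw[k][word_id[w]] += 1
            nk[k] += 1
        z.append(zd)
    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                wid, k = word_id[w], z[d][i]
                # Remove the token, then resample its topic from the
                # conditional p(k) ~ (ndk + alpha) * (nkw + beta) / (nk + V * beta).
                ndk[d][k] -= 1
                nkw[k][wid] -= 1
                nk[k] -= 1
                weights = [
                    (ndk[d][t] + alpha) * (nkw[t][wid] + beta) / (nk[t] + n_words * beta)
                    for t in range(n_topics)
                ]
                k = rng.choices(range(n_topics), weights=weights)[0]
                z[d][i] = k
                ndk[d][k] += 1
                nkw[k][wid] += 1
                nk[k] += 1
    doc_topic = [
        [(ndk[d][t] + alpha) / (len(doc) + n_topics * alpha) for t in range(n_topics)]
        for d, doc in enumerate(docs)
    ]
    topic_word = [
        [(nkw[t][v] + beta) / (nk[t] + n_words * beta) for v in range(n_words)]
        for t in range(n_topics)
    ]
    return doc_topic, topic_word, vocab

# Hypothetical term dictionary and invented titles, for illustration only.
DICTIONARY = {"credit", "scoring", "bankruptcy", "fraud", "svm", "ensemble", "regulation"}
TOY_TITLES = [
    "Credit scoring with SVM and ensemble classifiers",
    "An ensemble approach to credit scoring",
    "Bankruptcy prediction under new regulation",
    "Credit fraud detection and bankruptcy risk",
]

docs = [tokenize(title, DICTIONARY) for title in TOY_TITLES]
doc_topic, topic_word, vocab = lda_gibbs(docs, n_topics=2)
for title, dist in zip(TOY_TITLES, doc_topic):
    print([round(p, 2) for p in dist], title)
```

Each printed row gives one title's estimated topic mixture. Which titles end up grouped together depends on the seed, the number of topics and the priors, so results on a corpus this small should be read as a demonstration of the mechanics only.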

Author information

Authors and Affiliations

Instituto Universitário de Lisboa (ISCTE-IUL), ISTAR-IUL, Lisboa, Portugal
Sérgio Moro
ALGORITMI Research Centre, University of Minho, Minho, Portugal
Sérgio Moro & Paulo Cortez
Instituto Universitário de Lisboa (ISCTE-IUL), ISTAR-IUL, Business Research Unit (BRU-IUL), Lisboa, Portugal
Paulo Rita

Editor information

Editors and Affiliations

ACANTO Holding, Hannover, Germany
Christian L. Dunis
University of Liverpool, Liverpool, United Kingdom
Peter W. Middleton
American University of Beirut (AUB), Beirut, Lebanon
Andreas Karathanasopolous
University of Patras, Patras, Greece
Konstantinos Theofilatos

About this chapter

Cite this chapter

Moro, S., Cortez, P., Rita, P. (2016). An Automated Literature Analysis on Data Mining Applications to Credit Risk Assessment. In: Dunis, C., Middleton, P., Karathanasopolous, A., Theofilatos, K. (eds) Artificial Intelligence in Financial Markets. New Developments in Quantitative Trading and Investment. Palgrave Macmillan, London. https://doi.org/10.1057/978-1-137-48880-0_6

DOI: https://doi.org/10.1057/978-1-137-48880-0_6
Published: 23 November 2016
Publisher Name: Palgrave Macmillan, London
Print ISBN: 978-1-137-48879-4
Online ISBN: 978-1-137-48880-0
eBook Packages: Economics and Finance (R0)
