Abstract
This chapter presents an automated literature analysis of data mining applications to credit risk assessment, covering the period from 2010 to 2014. Google Scholar was used to collect the 100 most relevant articles containing the keywords ‘data mining’ and ‘credit risk’ that were published in management and information systems conferences and journals. This set of articles served as the basis for assessing the main research trends in data mining applications to credit risk, first through text mining and then through the latent Dirichlet allocation (LDA) algorithm, which grouped the articles into logical topics.
Five types of credit risk problems were assessed: credit scoring, bankruptcy, credit fraud, credit cards and regulatory issues. Of these, credit scoring received the most attention, while bankruptcy and credit fraud were each the topic of a significant number of articles. The most interesting finding is that the most advanced data mining techniques, such as support vector machines and ensembles, are being applied to credit risk problems more to tune the techniques themselves than to benefit credit risk assessment; several studies focused on tuning rather than on showing the benefits achieved by applying such techniques. This represents a research gap to be addressed, and more focus should be given to the value of data mining for risk assessment. The trends identified support the value of the automated procedure undertaken, which is novel in credit risk applications. Findings also suggest that regulatory issues are demanding research on data quality, in line with the banking regulation prompted by the global crisis.
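As a rough illustration of the two-step procedure described above (text mining followed by LDA), the following minimal Python sketch may help. It is not the authors' implementation: the term dictionary, the toy abstracts, the function names and the hyperparameters (`alpha`, `beta`, `n_iter`) are all assumptions made for the example, and the topic model is a plain collapsed Gibbs sampler written with the standard library only.

```python
import random
from collections import Counter

# Hypothetical dictionary mapping surface forms to canonical terms
# (an assumption for this sketch, not taken from the chapter).
TERM_DICTIONARY = {
    "credit scoring": "credit scoring",
    "scorecard": "credit scoring",
    "bankruptcy": "bankruptcy",
    "insolvency": "bankruptcy",
    "fraud": "credit fraud",
    "support vector machine": "svm",
    "svm": "svm",
    "neural network": "neural network",
    "ensemble": "ensemble",
}


def extract_terms(text):
    """Text mining step: one canonical term per dictionary match in the text."""
    lowered = text.lower()
    terms = []
    for surface, canonical in TERM_DICTIONARY.items():
        terms.extend([canonical] * lowered.count(surface))
    return terms


def fit_lda(docs, n_topics, n_iter=200, alpha=0.1, beta=0.01, seed=0):
    """Collapsed Gibbs sampling for LDA over lists of terms.

    Returns (theta, topic_term): per-document topic proportions and
    per-topic term counts.
    """
    rng = random.Random(seed)
    vocab_size = len({t for doc in docs for t in doc})
    doc_topic = [[0] * n_topics for _ in docs]
    topic_term = [Counter() for _ in range(n_topics)]
    topic_total = [0] * n_topics
    assignments = []
    # Random initial topic assignment for every term occurrence.
    for d, doc in enumerate(docs):
        z_doc = []
        for term in doc:
            z = rng.randrange(n_topics)
            z_doc.append(z)
            doc_topic[d][z] += 1
            topic_term[z][term] += 1
            topic_total[z] += 1
        assignments.append(z_doc)
    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, term in enumerate(doc):
                # Remove the current assignment, then resample it.
                z = assignments[d][i]
                doc_topic[d][z] -= 1
                topic_term[z][term] -= 1
                topic_total[z] -= 1
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_term[k][term] + beta)
                    / (topic_total[k] + vocab_size * beta)
                    for k in range(n_topics)
                ]
                z = rng.choices(range(n_topics), weights=weights)[0]
                assignments[d][i] = z
                doc_topic[d][z] += 1
                topic_term[z][term] += 1
                topic_total[z] += 1
    theta = [
        [(c + alpha) / (len(doc) + n_topics * alpha) for c in counts]
        for counts, doc in zip(doc_topic, docs)
    ]
    return theta, topic_term


# Toy abstracts, invented for illustration only.
ABSTRACTS = [
    "Credit scoring with support vector machines (SVM) and a scorecard baseline.",
    "An ensemble of neural network models for bankruptcy and insolvency prediction.",
    "Detecting credit card fraud with an ensemble classifier.",
    "Tuning SVM kernels for credit scoring.",
]

docs = [extract_terms(a) for a in ABSTRACTS]
theta, topic_term = fit_lda(docs, n_topics=2)
# Assign each article to its most probable topic.
article_topics = [max(range(2), key=lambda k: row[k]) for row in theta]
```

With so few documents the resulting topics are not meaningful; the sketch only shows the shape of the pipeline, in which each article is reduced to dictionary terms and then assigned to its most probable topic.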
© 2016 The Author(s)
About this chapter
Cite this chapter
Moro, S., Cortez, P., Rita, P. (2016). An Automated Literature Analysis on Data Mining Applications to Credit Risk Assessment. In: Dunis, C., Middleton, P., Karathanasopolous, A., Theofilatos, K. (eds) Artificial Intelligence in Financial Markets. New Developments in Quantitative Trading and Investment. Palgrave Macmillan, London. https://doi.org/10.1057/978-1-137-48880-0_6
DOI: https://doi.org/10.1057/978-1-137-48880-0_6
Publisher Name: Palgrave Macmillan, London
Print ISBN: 978-1-137-48879-4
Online ISBN: 978-1-137-48880-0
eBook Packages: Economics and Finance (R0)