Abstract
Credit scoring is one of the most important issues in financial decision-making. The use of data mining techniques to build models for credit scoring has been a hot topic in recent years. Classification problems often have a large number of features, but not all of them are useful for classification. Irrelevant and redundant features in credit data may even reduce classification accuracy. Feature selection is the process of selecting a subset of relevant features, which can decrease the dimensionality, reduce the running time, and improve the accuracy of classifiers. Random forest (RF) is a powerful classification tool that is currently an active research area and has successfully solved classification problems in many domains. In this study, we constructed a fast credit scoring model based on parallel random forests and recursive feature elimination (FRFE). Two public UCI data sets, the Australian and German credit data sets, were used to test our method. Experimental results on these real-world data showed that the proposed method achieves a higher prediction rate than a baseline method on some data sets, and that its performance is comparable to, and sometimes better than, that of feature selection methods widely used in credit scoring.
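The combination named in the abstract, recursive feature elimination driven by a random forest trained in parallel, can be illustrated with a minimal sketch. This is not the authors' FRFE implementation: it assumes scikit-learn's generic `RFE` wrapper and `RandomForestClassifier` with `n_jobs=-1` for parallel tree building, and it uses synthetic data from `make_classification` in place of the UCI credit data sets. The function name `select_features_rfe` and all parameter values are hypothetical choices made for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import RFE
from sklearn.model_selection import train_test_split


def select_features_rfe(X, y, n_keep, n_trees=50, seed=0):
    """Return a boolean mask of the n_keep features retained by RFE.

    Assumption: scikit-learn's RFE ranks features by the forest's
    impurity-based importances and drops the weakest one per round.
    """
    forest = RandomForestClassifier(
        n_estimators=n_trees,
        n_jobs=-1,          # build the trees in parallel on all cores
        random_state=seed,
    )
    selector = RFE(estimator=forest, n_features_to_select=n_keep, step=1)
    selector.fit(X, y)
    return selector.support_


# Synthetic stand-in for a credit data set (NOT the UCI data).
X, y = make_classification(
    n_samples=300, n_features=20, n_informative=5,
    n_redundant=5, random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Select features on the training split only, to avoid leaking test data.
mask = select_features_rfe(X_train, y_train, n_keep=8)

# Retrain on the reduced feature set and score on the held-out split.
model = RandomForestClassifier(n_estimators=50, n_jobs=-1, random_state=0)
model.fit(X_train[:, mask], y_train)
test_accuracy = model.score(X_test[:, mask], y_test)
print(f"kept {int(mask.sum())} of {mask.size} features, "
      f"test accuracy = {test_accuracy:.3f}")
```

Selecting features on the training split alone keeps the held-out accuracy an honest estimate. The same pattern would apply to a real credit data set once its categorical attributes are encoded numerically.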
Copyright information
© 2016 ICST Institute for Computer Sciences, Social Informatics and Telecommunications Engineering
About this paper
Cite this paper
Ha, VS., Nguyen, HN. (2016). FRFE: Fast Recursive Feature Elimination for Credit Scoring. In: Vinh, P., Barolli, L. (eds) Nature of Computation and Communication. ICTCC 2016. Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering, vol 168. Springer, Cham. https://doi.org/10.1007/978-3-319-46909-6_13
DOI: https://doi.org/10.1007/978-3-319-46909-6_13
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-46908-9
Online ISBN: 978-3-319-46909-6
eBook Packages: Computer Science (R0)