Abstract
Recent financial and debt crises have made credit risk management one of the most important issues in financial research. Reliable credit scoring models are crucial for financial agencies evaluating credit applications and have been widely studied in machine learning and statistics. This paper presents a novel feature-weighted support vector machine (SVM) credit scoring model for credit risk assessment, in which an F-score is adopted to rank feature importance. To account for interactions among modeling features, random forest is further introduced to measure relative feature importance. The two feature-weighted versions of SVM are tested against the traditional SVM on two real-world datasets, and the results support the validity of the proposed method.
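As a rough illustration of the idea in the abstract, the sketch below computes a per-feature F-score in the form commonly attributed to Chen and Lin (2006), normalizes the scores into weights, and uses them inside a feature-weighted RBF kernel. This is a minimal sketch under stated assumptions, not the authors' implementation: the abstract does not specify how the weights enter the SVM, so the weighted-distance kernel, the sum-to-one normalization, and the names `f_scores`, `normalize_weights`, and `weighted_rbf` are all hypothetical choices made for illustration.

```python
import math


def f_scores(X, y):
    """F-score of each feature for binary labels in {0, 1}.

    Numerator: squared distance of each class mean from the overall mean.
    Denominator: sum of the two within-class sample variances.
    Assumes each class has at least two samples.
    """
    pos = [row for row, label in zip(X, y) if label == 1]
    neg = [row for row, label in zip(X, y) if label == 0]
    scores = []
    for j in range(len(X[0])):
        col = [row[j] for row in X]
        col_pos = [row[j] for row in pos]
        col_neg = [row[j] for row in neg]
        mean_all = sum(col) / len(col)
        mean_pos = sum(col_pos) / len(col_pos)
        mean_neg = sum(col_neg) / len(col_neg)
        var_pos = sum((v - mean_pos) ** 2 for v in col_pos) / (len(col_pos) - 1)
        var_neg = sum((v - mean_neg) ** 2 for v in col_neg) / (len(col_neg) - 1)
        between = (mean_pos - mean_all) ** 2 + (mean_neg - mean_all) ** 2
        within = var_pos + var_neg
        # A constant feature carries no class information; score it as zero.
        scores.append(between / within if within > 0 else 0.0)
    return scores


def normalize_weights(scores):
    """Scale scores to sum to one (an assumed normalization)."""
    total = sum(scores)
    if total == 0:
        return [1.0 / len(scores)] * len(scores)
    return [s / total for s in scores]


def weighted_rbf(x, z, weights, gamma=1.0):
    """RBF kernel on a feature-weighted squared distance (assumed form)."""
    dist2 = sum(w * (a - b) ** 2 for w, a, b in zip(weights, x, z))
    return math.exp(-gamma * dist2)


# Toy data (made up for illustration): feature 0 separates the classes,
# feature 1 barely does.
X = [[1.0, 5.0], [2.0, 3.0], [8.0, 4.0], [9.0, 6.0]]
y = [0, 0, 1, 1]
weights = normalize_weights(f_scores(X, y))
print(weights)
print(weighted_rbf(X[0], X[2], weights))
```

With weights of this kind, features that separate good and bad applicants more clearly contribute more to the kernel distance. A random-forest importance vector could be substituted for the F-scores in `normalize_weights` without changing the kernel code, though how the paper combines the two is not stated in the abstract.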
Additional information
Project supported by the National Basic Research Program (973) of China (No. 2011CB706506), the National Natural Science Foundation of China (No. 50905159), the Natural Science Foundation of Jiangsu Province (No. BK2010261), and the Fundamental Research Funds for the Central Universities (No. 2011XZZX005), China.
About this article
Cite this article
Shi, J., Zhang, Sy. & Qiu, Lm. Credit scoring by feature-weighted support vector machines. J. Zhejiang Univ. - Sci. C 14, 197–204 (2013). https://doi.org/10.1631/jzus.C1200205
DOI: https://doi.org/10.1631/jzus.C1200205