Abstract
The hidden Markov model (HMM) has been applied successfully in many fields, such as speech recognition and engineering. However, because it assumes that observations are conditionally independent given the state, the HMM has a very limited capacity for recognizing complex patterns that involve more than first-order dependencies, as arise in customer relationship management. The Group Method of Data Handling (GMDH) can overcome this drawback, so we propose a hybrid model that combines the HMM and GMDH to score customer credit. The model has three phases: training the HMM with multiple observations, adding GMDH to the HMM, and optimizing the hybrid model. The proposed hybrid model is compared with other existing methods in terms of average accuracy, Type I error, Type II error and AUC. Experimental results show that the proposed method performs better than HMM/ANN on two credit scoring datasets. The HMM/GMDH hybrid model allows lenders and regulators to develop techniques for measuring customer credit risk.
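The four evaluation measures named above can be illustrated with a minimal sketch. It is not the authors' evaluation code. It assumes labels coded 1 for a good customer and 0 for a bad one, and it assumes the convention that a Type I error is a good customer classified as bad and a Type II error is a bad customer classified as good; the paper may define these the other way round. The function names, the toy data and the 0.5 threshold are all hypothetical.

```python
def credit_metrics(y_true, y_pred):
    """Accuracy and error rates for binary labels (assumed: 1 = good, 0 = bad)."""
    pairs = list(zip(y_true, y_pred))
    n_good = sum(1 for t in y_true if t == 1)
    n_bad = len(y_true) - n_good
    # Assumed convention: Type I = good customer rejected as bad.
    good_as_bad = sum(1 for t, p in pairs if t == 1 and p == 0)
    # Assumed convention: Type II = bad customer accepted as good.
    bad_as_good = sum(1 for t, p in pairs if t == 0 and p == 1)
    correct = sum(1 for t, p in pairs if t == p)
    return {
        "accuracy": correct / len(y_true),
        "type_i_error": good_as_bad / n_good,
        "type_ii_error": bad_as_good / n_bad,
    }


def auc(y_true, scores):
    """AUC as the fraction of (good, bad) pairs ranked correctly; ties count 0.5."""
    good = [s for t, s in zip(y_true, scores) if t == 1]
    bad = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum((g > b) + 0.5 * (g == b) for g in good for b in bad)
    return wins / (len(good) * len(bad))


# Toy data (hypothetical): true labels and model scores for eight customers.
y_true = [1, 1, 1, 0, 0, 1, 0, 1]
scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.7, 0.2, 0.55]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # threshold 0.5 is an assumption

metrics = credit_metrics(y_true, y_pred)
area = auc(y_true, scores)
print(metrics, area)
```

The average accuracy reported in such comparisons is typically the mean of this accuracy over repeated train/test splits. The pairwise AUC used here is quadratic in the sample size, which is fine for a sketch but not for large datasets.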
Acknowledgments
This research is supported by the Natural Science Foundation of China under Grant Nos. 71071101, 71101100 and 71211130018, the New Teachers Fund for Doctor Stations, Ministry of Education under Grant No. 20110181120047, the China Postdoctoral Science Foundation under Grant No. 2011M500418, and the Research Start-up Project of Sichuan University under Grant No. 2010SCU11012.
Cite this article
Teng, GE., He, CZ., Xiao, J. et al. Customer credit scoring based on HMM/GMDH hybrid model. Knowl Inf Syst 36, 731–747 (2013). https://doi.org/10.1007/s10115-012-0572-z
DOI: https://doi.org/10.1007/s10115-012-0572-z