Credit risk prediction using support vector machines

  • Jan-Henning Trustorff
  • Paul Markus Konrad
  • Jens Leker
Original Research


This paper examines the relative performance of least-squares support vector machines and logistic regression models for default classification and default probability estimation. Financial ratios from a data set of more than 78,000 financial statements from 2000 to 2006 serve as default indicators. The focus is on how small training samples and high variance of the financial input data affect classification performance, measured by the area under the receiver operating characteristic curve. The resolution and reliability of the predicted default probabilities are evaluated by decompositions of the Brier score. Support vector machines are shown to significantly outperform logistic regression models, particularly when training samples are small and the variance of the input data is high.
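The two evaluation measures named in the abstract can be illustrated with a minimal sketch. It assumes the area under the ROC curve is computed as the Mann-Whitney statistic and that the Brier score is split with Murphy's decomposition (reliability minus resolution plus uncertainty); the abstract does not say which decomposition the paper uses, so this choice is an assumption. The function names and the toy forecasts are hypothetical and are not taken from the paper's data.

```python
from collections import defaultdict


def auc(scores, labels):
    """Area under the ROC curve, computed as the probability that a randomly
    chosen defaulter (label 1) scores higher than a randomly chosen
    non-defaulter (label 0); ties count one half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def brier_decomposition(probs, labels):
    """Return (brier, reliability, resolution, uncertainty).

    Forecasts are grouped by their distinct predicted value, so that
    brier = reliability - resolution + uncertainty holds exactly
    (assumed Murphy decomposition)."""
    n = len(probs)
    base_rate = sum(labels) / n

    # Collect observed outcomes for each distinct predicted probability.
    groups = defaultdict(list)
    for p, y in zip(probs, labels):
        groups[p].append(y)

    reliability = 0.0  # squared gap between forecast and observed frequency
    resolution = 0.0   # squared gap between observed frequency and base rate
    for p, ys in groups.items():
        observed = sum(ys) / len(ys)
        reliability += len(ys) * (p - observed) ** 2
        resolution += len(ys) * (observed - base_rate) ** 2
    reliability /= n
    resolution /= n

    uncertainty = base_rate * (1.0 - base_rate)
    brier = sum((p - y) ** 2 for p, y in zip(probs, labels)) / n
    return brier, reliability, resolution, uncertainty


# Toy data (hypothetical): eight firms, two distinct predicted default probabilities.
toy_probs = [0.1, 0.1, 0.1, 0.1, 0.6, 0.6, 0.6, 0.6]
toy_labels = [0, 0, 0, 1, 0, 1, 1, 1]

toy_auc = auc(toy_probs, toy_labels)
toy_brier, toy_rel, toy_res, toy_unc = brier_decomposition(toy_probs, toy_labels)
```

On this toy data the AUC is 0.75 and the Brier score is about 0.21, made up of a reliability of about 0.0225, a resolution of about 0.0625, and an uncertainty of 0.25. A lower reliability term indicates better calibrated probabilities, while a higher resolution term indicates forecasts that separate defaulters from non-defaulters more sharply.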


Keywords

Support vector machines · Credit risk prediction · Default classification · Estimation of probabilities of default · Training sample size · Accounting data

JEL Classification

C14 G33 



Copyright information

© Springer Science+Business Media, LLC 2010

Authors and Affiliations

  • Jan-Henning Trustorff (1)
  • Paul Markus Konrad (1)
  • Jens Leker (1)
  1. Institute of Business Administration, University of Münster, Münster, Germany
