# Credit Risk Assessment Using Statistical and Machine Learning: Basic Methodology and Risk Modeling Applications


## Abstract

Risk assessment of financial intermediaries is an area of renewed interest due to the financial crises of the 1980's and 90's. An accurate estimation of risk, and its use in corporate or global financial risk models, could be translated into a more efficient use of resources. One important ingredient to accomplish this goal is to find accurate predictors of individual risk in the credit portfolios of institutions. In this context we make a comparative analysis of different statistical and machine learning modeling methods of classification on a mortgage loan data set with the motivation to understand their limitations and potential. We introduced a specific modeling methodology based on the study of error curves. Using state-of-the-art modeling techniques we built more than 9,000 models as part of the study. The results show that CART decision-tree models provide the best estimation for default with an average 8.31% error rate for a training sample of 2,000 records. As a result of the error curve analysis for this model we conclude that if more data were available, approximately 22,000 records, a potential 7.32% error rate could be achieved. Neural Networks provided the second best results with an average error of 11.00%. The *K*-Nearest Neighbor algorithm had an average error rate of 14.95%. These results outperformed the standard Probit algorithm which attained an average error rate of 15.13%. Finally we discuss the possibilities to use this type of accurate predictive model as ingredients of institutional and global risk models.
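The error-curve idea mentioned in the abstract can be illustrated with a minimal sketch. It assumes an inverse power-law form, error(n) ≈ a + b·n^(−α), where n is the training-sample size and a is the asymptotic error. This form is a common choice in the learning-curve literature, but nothing here confirms it is the exact model the authors fitted. The function names `fit_error_curve` and `predict_error` are hypothetical, and the numbers are synthetic, not the paper's data.

```python
import numpy as np


def fit_error_curve(sizes, errors, alphas=None):
    """Fit errors ~ a + b * sizes**(-alpha) (assumed functional form).

    For each candidate alpha the model is linear in (a, b), so those two
    coefficients come from ordinary least squares; the alpha with the
    smallest residual sum of squares is kept.
    """
    sizes = np.asarray(sizes, dtype=float)
    errors = np.asarray(errors, dtype=float)
    if alphas is None:
        # Hypothetical search grid for the decay exponent, step 0.01.
        alphas = np.linspace(0.05, 2.0, 196)

    best = None
    for alpha in alphas:
        design = np.column_stack([np.ones_like(sizes), sizes ** -alpha])
        coef, *_ = np.linalg.lstsq(design, errors, rcond=None)
        rss = float(np.sum((design @ coef - errors) ** 2))
        if best is None or rss < best[0]:
            best = (rss, float(coef[0]), float(coef[1]), float(alpha))

    _, a, b, alpha = best
    return a, b, alpha


def predict_error(n, a, b, alpha):
    """Extrapolate the fitted curve to a training-sample size n."""
    return a + b * np.asarray(n, dtype=float) ** -alpha


# Synthetic, noise-free illustration (NOT the paper's measurements):
# average test error observed at several training-sample sizes.
sizes = np.array([250.0, 500.0, 1000.0, 2000.0, 4000.0])
errors = 0.07 + 0.5 * sizes ** -0.5

a, b, alpha = fit_error_curve(sizes, errors)
projected = float(predict_error(22000, a, b, alpha))
print(f"asymptotic error a={a:.4f}, decay exponent alpha={alpha:.2f}")
print(f"projected error at n=22000: {projected:.4f}")
```

In practice each point on the curve would be an average over many models trained on resampled subsets of a given size, which is one plausible reason a study of this kind builds thousands of models. Extrapolation far beyond the largest observed sample size is only as reliable as the assumed functional form.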

## Keywords

- Risk Assessment
- Training Sample
- Average Error
- Modeling Technique
- Risk Model
