Mortgage Default: Classification Trees Analysis

  • Original Article
The Journal of Real Estate Finance and Economics

Abstract

We apply the powerful, flexible, and computationally efficient nonparametric Classification and Regression Trees (CART) algorithm to analyze real estate mortgage data. CART is particularly appropriate for our data set because of its strengths in dealing with large data sets, high dimensionality, mixed data types, missing data, different relationships between variables in different parts of the measurement space, and outliers. Moreover, CART is intuitive and easy to interpret and implement. We discuss the pros and cons of CART in relation to traditional methods such as linear logistic regression, nonparametric additive logistic regression, discriminant analysis, partial least squares classification, and neural networks, with particular emphasis on real estate. We use CART to produce the first academic study of Israeli mortgage default data. We find that borrowers’ features, rather than mortgage contract features, are the strongest predictors of default if accepting “bad” borrowers is more costly than rejecting “good” ones. If the costs are equal, mortgage features are used as well. The higher (lower) the ratio of misclassification costs of bad risks versus good ones, the lower (higher) are the resulting misclassification rates of bad risks and the higher (lower) are the misclassification rates of good ones. This is consistent with real-world rejection of good risks in an attempt to avoid bad ones.
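The cost-ratio relationship described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation and uses none of their data: the leaf counts in `LEAVES` are hypothetical, and the function names are assumptions made for illustration. The sketch holds the tree's partition fixed and only relabels each terminal node with the class that has the lower expected misclassification cost; a full CART fit would also let the costs influence the choice of splits.

```python
# Hypothetical terminal nodes of an already-grown tree: (good loans, bad loans).
# These counts are invented for illustration, not taken from the paper.
LEAVES = [(400, 10), (250, 40), (100, 50), (50, 100)]


def label_leaf(n_good, n_bad, cost_ratio):
    """Return the leaf label with the lower expected misclassification cost.

    cost_ratio is the cost of accepting a bad borrower divided by the cost
    of rejecting a good one (the latter is normalized to 1).
    """
    cost_if_labeled_good = cost_ratio * n_bad  # bad loans would be accepted
    cost_if_labeled_bad = 1.0 * n_good         # good loans would be rejected
    return "bad" if cost_if_labeled_good > cost_if_labeled_bad else "good"


def misclassification_rates(leaves, cost_ratio):
    """Return (share of bad loans accepted, share of good loans rejected)."""
    total_good = sum(g for g, _ in leaves)
    total_bad = sum(b for _, b in leaves)
    missed_bad = 0
    rejected_good = 0
    for n_good, n_bad in leaves:
        if label_leaf(n_good, n_bad, cost_ratio) == "good":
            missed_bad += n_bad
        else:
            rejected_good += n_good
    return missed_bad / total_bad, rejected_good / total_good


for ratio in (1, 5, 10):
    bad_rate, good_rate = misclassification_rates(LEAVES, ratio)
    print(f"cost ratio {ratio:>2}: "
          f"bad misclassified {bad_rate:.4f}, good misclassified {good_rate:.4f}")
```

With these invented counts, raising the cost ratio from 1 to 10 lowers the share of bad risks that are accepted from 50% to 5%, while the share of good risks that are rejected rises from 6.25% to 50%. This matches the direction of the trade-off stated above, though the magnitudes carry no empirical meaning.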

Author information

Corresponding author

Correspondence to David Feldman.

About this article

Cite this article

Feldman, D., Gross, S. Mortgage Default: Classification Trees Analysis. J Real Estate Finan Econ 30, 369–396 (2005). https://doi.org/10.1007/s11146-005-7013-7

  • DOI: https://doi.org/10.1007/s11146-005-7013-7
