Hurdle models of loan default

  • Special Issue Paper
Journal of the Operational Research Society

Abstract

Some models of loan default are binary, simply modelling the probability of default, while others go further and model the extent of default (e.g. number of outstanding payments or amount of arrears). The double-hurdle model, originally due to Cragg (Econometrica, 1971), and conventionally applied to household consumption or labour supply decisions, contains two equations: one determines whether or not a customer is a potential defaulter (the ‘first hurdle’), and the other determines the extent of default. In separating these two processes, the model recognizes that there exists a subset of the observed non-defaulters who would never default whatever their circumstances. A Box-Cox transformation applied to the dependent variable is a useful generalization of the model. Estimation is relatively easy using the Maximum Likelihood routine available in STATA. The model is applied to a sample of 2515 loan applicants for whom loans were approved, a sizeable proportion of whom defaulted in varying degrees. The dependent variables used are amount in arrears and number of days in arrears. The value of the hurdle approach is confirmed by finding that certain key explanatory variables have very different effects between the two equations. Most notably, the effect of loan amount is strongly positive on arrears, while being U-shaped on the probability of default. The former effect is seriously under-estimated when the first hurdle is ignored.
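
As a compact illustration of the two-equation structure described above, the double-hurdle model is commonly written as follows. This is a sketch of the textbook form following Cragg (1971), assuming independent normal errors; the symbols $d_i^*$, $y_i^{**}$, $\varepsilon_i$ and $u_i$ are notation chosen here, and the article's own specification may differ in detail.

```latex
\[
d_i^{*} = z_i'\alpha + \varepsilon_i, \qquad \varepsilon_i \sim N(0,1)
\qquad \text{(first hurdle)}
\]
\[
y_i^{**} = x_i'\beta + u_i, \qquad u_i \sim N(0,\sigma^2)
\qquad \text{(second hurdle)}
\]
\[
y_i =
\begin{cases}
y_i^{**} & \text{if } d_i^{*} > 0 \text{ and } y_i^{**} > 0, \\
0 & \text{otherwise,}
\end{cases}
\]
\[
\log L = \sum_{y_i = 0} \ln\!\left[ 1 - \Phi(z_i'\alpha)\,
\Phi\!\left(\frac{x_i'\beta}{\sigma}\right) \right]
+ \sum_{y_i > 0} \ln\!\left[ \Phi(z_i'\alpha)\,\frac{1}{\sigma}\,
\phi\!\left(\frac{y_i - x_i'\beta}{\sigma}\right) \right].
\]
```

A zero is observed either because the borrower never passes the first hurdle or because the second-hurdle latent variable is non-positive, which is why the zero term is one minus a product of two probabilities.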

Notes

  1. (a) STATA version 8.0, Stata Corporation, College Station, Texas.

  2. (b) This may, of course, be a ‘cohort effect’, with borrowers born around 1950 being ‘safer’ than other cohorts. In fact, given the nature of the model, it is logical to assume that it is a cohort effect, since age itself cannot affect the probability of never defaulting. To distinguish the cohort effect from the age effect would require additional observations taken in a different year.

References

  • Cragg JG (1971). Some statistical models for limited dependent variables with application to the demand for durable goods. Econometrica 39: 829–844.

  • Dionne G, Artis M and Guillen M (1996). Count data models for a credit scoring system. J Empirical Finance 3: 303–325.

  • Jones AM (1989). A double hurdle model of cigarette consumption. J Appl Econom 4: 23–39.

  • Jones AM and Yen ST (2000). A Box-Cox double hurdle model. The Manchester School 68: 203–221.

  • Smith MD (2002). On specifying double hurdle models. In: Ullah A, Wan A and Chaturvedi A (eds). Handbook of Applied Econometrics and Statistical Inference. Marcel-Dekker, New York, pp 535–552.

  • McDowell A (2003). From the help desk: hurdle models. Stata J 3: 178–184.

  • Deaton AS and Irish M (1984). Statistical models for zero expenditures in household budgets. J Public Econom 23: 59–80.

  • Stewart MB (1983). On least squares estimation when the dependent variable is grouped. Rev Econ Studies 50: 737–753.

  • Cleveland WS (1979). Robust locally weighted regression and smoothing scatterplots. J Am Stat Assoc 74: 829–836.

Author information

Correspondence to P G Moffatt.

Appendices

Appendix A. STATA code for estimation of Box-Cox double-hurdle model

Notes: ‘listy’ is a previously defined list of variables appearing in the second hurdle; ‘listd’ contains the variables of the first hurdle. ‘theta1’ corresponds to $x_i'\beta$ in (14), ‘theta2’ to $\sigma$, ‘theta3’ to $z_i'\alpha$, and ‘theta4’ to $\lambda$. ‘b’ is a vector of suitable starting values.
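
To illustrate the parameter mapping in these notes, the following is a minimal Python sketch of a Box-Cox double-hurdle log-likelihood. It is not the author's Stata code. It assumes independent normal errors, $\lambda > 0$, and the likelihood form commonly given in the Box-Cox double-hurdle literature (Jones and Yen, 2000); whether it agrees term by term with the article's equation (14) cannot be checked from the text above. The function names are hypothetical, and the argument names theta1 to theta4 simply mirror the notes.

```python
import math


def norm_cdf(v):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(v / math.sqrt(2.0)))


def boxcox(y, lam):
    """Box-Cox transform (y^lam - 1) / lam, for lam > 0."""
    return (y ** lam - 1.0) / lam


def loglik_obs(y, theta1, theta2, theta3, theta4):
    """Log-likelihood contribution of one borrower (assumed form).

    theta1 = x_i'beta, theta2 = sigma, theta3 = z_i'alpha, theta4 = lambda.
    """
    p_pass = norm_cdf(theta3)  # probability of passing the first hurdle
    if y == 0:
        # A zero arises unless both hurdles are passed; the transformed
        # lower limit for y = 0 is -1/lambda.
        p_positive = p_pass * norm_cdf((theta1 + 1.0 / theta4) / theta2)
        return math.log(1.0 - p_positive)
    resid = (boxcox(y, theta4) - theta1) / theta2
    log_normal_density = -0.5 * resid * resid - 0.5 * math.log(2.0 * math.pi)
    # (theta4 - 1) * log(y) is the Jacobian of the Box-Cox transformation.
    return (math.log(p_pass) + (theta4 - 1.0) * math.log(y)
            - math.log(theta2) + log_normal_density)


def loglik(ys, xbs, sigma, zas, lam):
    """Sample log-likelihood given per-borrower linear indices xbs and zas."""
    return sum(loglik_obs(y, xb, sigma, za, lam)
               for y, xb, za in zip(ys, xbs, zas))
```

In an estimation routine, a function like `loglik` would be maximised over $\beta$, $\sigma$, $\alpha$ and $\lambda$ from a vector of starting values, which is the role the notes assign to ‘b’.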

Appendix B. STATA code for estimation of Box-Cox double-hurdle model with interval data

Notes: The Notes to Appendix A also apply here. In addition, the J=8 intervals assumed in the construction of this likelihood function are: 0–30; 30–60; 60–90; 90–120; 120–150; 150–180; 180–210; 210+. The probability associated with the final interval is computed as one minus the sum of the other seven.
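
The interval construction in these notes can be sketched in Python as below. Again, this is not the author's Stata code. It assumes independent normal errors and $\lambda > 0$, and it makes one further assumption that the notes do not state: borrowers with zero days in arrears are counted in the first interval (0–30), so that the eight probabilities sum to one and the last can be taken as one minus the sum of the other seven. The names `CUTS`, `interval_probs` and `loglik_interval` are hypothetical.

```python
import math

# Upper bounds (in days) of the first seven intervals; the eighth is 210+.
CUTS = [30.0, 60.0, 90.0, 120.0, 150.0, 180.0, 210.0]


def norm_cdf(v):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(v / math.sqrt(2.0)))


def boxcox(y, lam):
    """Box-Cox transform (y^lam - 1) / lam, for lam > 0."""
    return (y ** lam - 1.0) / lam


def interval_probs(xb, sigma, za, lam):
    """Probabilities of the J = 8 intervals for one borrower.

    Assumption: zeros fall in the first interval, so
    P(y <= c) = 1 - Phi(za) * (1 - Phi((boxcox(c) - xb) / sigma)).
    """
    p_pass = norm_cdf(za)
    cdf = [1.0 - p_pass * (1.0 - norm_cdf((boxcox(c, lam) - xb) / sigma))
           for c in CUTS]
    probs = [cdf[0]] + [cdf[j] - cdf[j - 1] for j in range(1, len(CUTS))]
    # Final interval: one minus the sum of the other seven.
    probs.append(1.0 - sum(probs))
    return probs


def loglik_interval(groups, xbs, sigma, zas, lam):
    """Sample log-likelihood; groups holds 0-based interval indices (0..7)."""
    return sum(math.log(interval_probs(xb, sigma, za, lam)[g])
               for g, xb, za in zip(groups, xbs, zas))
```

Under these assumptions the final probability equals $\Phi(z_i'\alpha)$ times the upper-tail probability beyond the transformed 210-day cut, which gives a simple check on the "one minus the sum" construction.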

Cite this article

Moffatt, P. Hurdle models of loan default. J Oper Res Soc 56, 1063–1071 (2005). https://doi.org/10.1057/palgrave.jors.2601922
