Statistical Methods of Credit Risk Analysis

Yhip, Terence M.; Alagheband, Bijan M. D.

doi:10.1007/978-3-030-32197-0_8



Abstract

This chapter represents a big leap from expert-judgement modelling to purely quantitative/statistical modelling. The two approaches are vital and complementary tools in a bank’s risk assessment toolbox. The chapter examines the structure of the linear probability model and of probit and logit analysis, shows their similarities and differences, and applies the methods to a sample of companies. It also provides step-by-step guidance on formulating a logit model, and explains how to perform a logit regression using actual data and how to interpret the results. As with all models, including expert-judgement models, the stability or reliability of the estimated parameters, descriptors, and weights is not constant, which makes model validation essential. Poor validation can be costly to a lender.


Notes

1.
Pindyck, Robert and Rubinfeld, Daniel (1976), Econometric Models and Economic Forecasts, 4th Edition, Irwin/McGraw-Hill. Refer to Chapter 11 for a full discussion of these probability models. For a more theoretical discussion, the interested reader may also benefit from consulting Chapters 1 and 2 of G. S. Maddala, Limited-Dependent and Qualitative Variables in Econometrics, Cambridge University Press, 1983.
2.
For proof, see Pindyck and Rubinfeld (1976).
3.
There are other candidates for the link function but the normal probability function is the most common.
4.
Recall that a function is a mechanism that takes an input (or many inputs, for that matter) and produces a unique output. Take a practical example that we work through on any road trip without even thinking about it: Y = f(X) = 100X, where X is time in hours, 100 is the speed limit in kilometres per hour, and Y is distance travelled in kilometres. The inputs to this function are the various values of X, and the output is distance. For example, after 2 hours, the distance travelled is 200 km. Sometimes, however, we want to know how long it will take to cover 200 km; in this case, the input is distance and the output is time in hours. Hence, we need a rule to work backwards, and this is the inverse function. Since the function multiplies X by 100, we now have to do the opposite and divide Y by 100, which gives the inverse function: $ {f}^{-1}(Y)=\frac{Y}{100} $. Plug Y = 200 km into the inverse function (denoted by the −1 superscript) and the answer is 2 hours.
5.
Maximum likelihood estimation finds the values of the parameters of a likelihood function that produce the maximum likelihood. It involves differentiating the log of the likelihood function with respect to the parameters and setting the resulting derivatives equal to zero. The log transformation simplifies the calculus by turning the likelihood function, which is a product of terms, into a sum of terms that is easier to differentiate.
6.
The logit function is written as:
$$ F(z)=\frac{\exp (z)}{1+\exp (z)} $$
where z, the input, is any real number, and F(z) produces a value in the (0,1) range. We know this is the range from the limiting behaviour of F(z): as z goes to −∞, the numerator approaches 0, whilst the denominator approaches 1; hence, F(z) approaches 0. As z goes to +∞, both the numerator and the denominator grow without bound (so the 1 in the denominator can be ignored); hence, F(z) approaches 1. Therefore, we can restate the above expression as a probability, p:
$$ p=\frac{\exp (z)}{1+\exp (z)} $$
$$ 1-p=\frac{1+\exp (z)-\exp (z)}{1+\exp (z)}=\frac{1}{1+\exp (z)} $$
Hence, the odds ratio is
$$ \left(\frac{p}{1-p}\right)=\exp (z) $$
Taking the natural log on both sides results in
$$ \ln \left(\frac{p}{1-p}\right)=z $$
In our case, $ {Z}_i $ is the estimated value for the $ i $th observation, derived from the equation $ {Z}_i=\alpha +\sum_j{\beta}_j{x}_{ij} $. Given the values of $ {Z}_i $, the probability and the odds ratio are easy to calculate.
7.
The field of bankruptcy prediction goes back over half a century. Its pioneers were William Beaver (1967), who applied t tests to evaluate the importance of individual accounting ratios, and Edward I. Altman (1968), who applied multiple discriminant analysis within a pair-matched sample. Indeed, the widely known and used Altman Z-score is named after the author of this influential paper, “Financial Ratios, Discriminant Analysis and the Prediction of Corporate Bankruptcy”, published in the Journal of Finance, 23, pp. 589–609. Since the Altman study, numerous other authors have extended and deepened the field.
8.
The derivative of the linear probability model given by Eq. (8.1) is constant, $ \frac{\partial {y}_i}{\partial {x}_i}=\beta $.
The model generalises to more than one explanatory variable, X_ij. In contrast, in the probit and logit models the derivatives are not constant, so the coefficients cannot be interpreted directly as marginal effects on the dependent variable. The marginal impact of a change in X_i1 is not simply β_i1, which is a constant in the linear model; rather, β_i1 is weighted by a function whose value depends on X_i1 and all other variables in the function.
9.
Siddiqi, N. (2006), Credit Risk Scorecards: Developing and Implementing Intelligent Credit Scoring, John Wiley & Sons. Thomas, Lyn C. (2009), Consumer Credit Models: Pricing, Profit and Portfolios, Oxford University Press, Oxford, UK.
10.
The tests may include the t test for the difference in the means of the two populations or the F test for the difference in the variances of the two populations.
11.
Parameters are the most important features of a function. Put simply, these are the numbers in the model (such as the econometric models presented in the book) that have to be estimated. They are not fed into the model as input (like the values for the explanatory variables). Parameters are important because they determine the output for given values of the predictors.
12.
Siddiqi, N. (2006), Chapter 8, ibid.
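
The relationships in Notes 6 and 8 can be illustrated with a minimal Python sketch. It is an illustration under stated assumptions, not the chapter's own implementation: the function names, the coefficients alpha and betas, and the observation x_i are all hypothetical, chosen only to show how a value of Z_i maps to a probability, to the odds, and back to the log-odds. The marginal-effect function uses the standard logit result that the derivative of p with respect to x_j equals β_j·p·(1 − p), which is one concrete form of the "weighting function" that Note 8 describes.

```python
import math


def logistic(z: float) -> float:
    """F(z) = exp(z) / (1 + exp(z)), arranged to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def odds(p: float) -> float:
    """p / (1 - p), which equals exp(z) when p = F(z)."""
    return p / (1.0 - p)


def log_odds(p: float) -> float:
    """ln(p / (1 - p)), the inverse of the logistic function."""
    return math.log(p / (1.0 - p))


def logit_marginal_effect(beta_j: float, z: float) -> float:
    """Change in p per unit change in x_j for a logit model: beta_j * p * (1 - p)."""
    p = logistic(z)
    return beta_j * p * (1.0 - p)


# Hypothetical values for illustration only (not taken from the chapter).
alpha = -2.0
betas = [0.8, -1.5]
x_i = [1.2, 0.4]

# Z_i = alpha + sum_j beta_j * x_ij
z_i = alpha + sum(b * x for b, x in zip(betas, x_i))
p_i = logistic(z_i)

print(f"Z_i = {z_i:.4f}")
print(f"probability p_i = {p_i:.4f}")
print(f"odds p/(1-p) = {odds(p_i):.4f}")
print(f"log-odds = {log_odds(p_i):.4f}")
print(f"marginal effect of x_1 = {logit_marginal_effect(betas[0], z_i):.4f}")
```

Unlike the constant slope of the linear probability model, the marginal effect here changes with Z_i: it is largest where p is 0.5 and shrinks towards zero as p approaches 0 or 1.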

Author information

Authors and Affiliations

University of the West Indies, Mississauga, ON, Canada
Terence M. Yhip
McMaster University and Hydro One Networks Inc., Toronto, ON, Canada
Bijan M. D. Alagheband

