Nonlinear relationships in bankruptcy prediction and their effect on the profitability of bankruptcy prediction models

Lohmann, Christian; Möllenhoff, Steffen; Ohliger, Thorsten

doi:10.1007/s11573-022-01130-8

Original Paper
Open access
Published: 20 December 2022

Volume 93, pages 1661–1690, (2023)

Journal of Business Economics

Christian Lohmann ORCID: orcid.org/0000-0003-0885-2083¹,
Steffen Möllenhoff¹ &
Thorsten Ohliger²

Abstract

This study uses generalized additive models to identify and analyze nonlinear relationships between accounting-based and market-based independent variables and how these affect bankruptcy predictions. Specifically, it examines the independent variables that Altman (J Financ 23:589–609, 1968; Predicting financial distress of companies. Revisiting the Z-score and ZETA® models. Working paper, 2000) and Campbell et al. (J Financ 63:2899–2939, 2008) used and analyzes what specific form these nonlinear relationships take. Drawing on comprehensive data on listed U.S. companies, we show empirically that the bankruptcy prediction is influenced by statistically and economically relevant nonlinear relationships. Our results indicate that taking these nonlinear relationships into account significantly improves several statistical validity measures. We also use a validity measure that is based on the profitability of the bankruptcy prediction models in the context of credit scoring. The findings demonstrate that taking nonlinear relationships into account can substantially increase the discriminatory power of bankruptcy prediction models.

1 Introduction

The primary aim of research on bankruptcy prediction is to estimate as accurately as possible the probability of a company becoming bankrupt (for an overview see, e.g., Scott 1981; Dimitras et al. 1996; Altman and Saunders 1997; Balcaen and Ooghe 2006; Bellovary et al. 2007). The accuracy of such forecasts largely depends on the methods and models that are applied and on selecting the most suitable explanatory variables for the purpose of predicting bankruptcy (Laitinen and Kankaanpää 1999). In general, the internal validity of bankruptcy prediction models increases with the complexity of the empirical methods and models and with the number of explanatory variables that researchers use. In contrast to that, the external validity of bankruptcy prediction models decreases if a certain complexity threshold is exceeded and overfitting arises.

The methodology that analysts apply has become very diversified and includes structural models (Merton 1974; Black and Cox 1976; Fabozzi et al. 2010), reduced-form models (Jarrow and Turnbull 1995), heuristic methods, such as expert systems (Messier and Hansen 1988), models based on chaos theory (Lindsay and Campbell 1996), univariate and multivariate discriminant analyses (Beaver 1966; Altman 1968; Altman et al. 1977), survival analyses (Lane et al. 1986; Louma and Laitinen 1991; Shumway 2001), neural networks (Charitou et al. 2004; Neves and Vieira 2006), support vector machines (Min and Lee 2005; Wang et al. 2005), and, more recently, gradient boosting models (Jones 2017). For example, Jones (2017) uses 91 explanatory variables on shareholder structure and management compensation, variables that proxy size effects, market-based and accounting-based variables, macro-economic variables, analyst recommendations, and industry variables.

The methods and models used for predicting bankruptcy have improved impressively in recent years with respect to validity measures that are based either on likelihood, such as Nagelkerke’s pseudo-R² (Nagelkerke 1991) or Akaike’s information criterion (Akaike 1973), or on classification, such as the accuracy ratio (Tasche 2005; Trueck and Rachev 2009, pp. 26–28) or the area under curve (AUC; Engelmann 2011). All such models involve a trade-off between statistical validity and comprehensibility. While more complex empirical models may increase the internal and external validity of measures based on likelihood or classification, they often tend to be harder to interpret (Jones et al. 2015, 2017). Models that are designed for predicting bankruptcy as accurately as possible but are not sufficiently plausible and interpretable from an economic perspective are unlikely to be useful in practice (see, e.g., Altman et al. 1994; see also the critical study of neural networks by Hayden and Porath 2011). In contrast to linear regression techniques, more recent bankruptcy prediction models that take nonlinear relationships into account, such as neural networks and gradient boosting models, do not show clearly how the explanatory variables and the probability of bankruptcy interrelate. This is often the case when there are nonlinear relationships between the explanatory variables and the predictor or non-monotonic effects of the explanatory variables on the probability of bankruptcy. An effective bankruptcy prediction model needs to capture the actual effects of the most important explanatory variables on the probability of bankruptcy and still be clear and interpretable. Furthermore, the main criterion for evaluating such a model should be how it affects the profitability of decision making, rather than validity measures that are solely based on either likelihood or classification.
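The two classification-based measures named above are closely linked: the accuracy ratio equals 2 · AUC − 1. The following minimal sketch computes both from model scores. The scores are made-up illustrative values, not results from this study, and the function names are our own.

```python
def auc(scores_bankrupt, scores_solvent):
    """Share of (bankrupt, solvent) pairs in which the bankrupt company
    receives the higher score; ties count as one half."""
    wins = 0.0
    for b in scores_bankrupt:
        for s in scores_solvent:
            if b > s:
                wins += 1.0
            elif b == s:
                wins += 0.5
    return wins / (len(scores_bankrupt) * len(scores_solvent))


def accuracy_ratio(auc_value):
    """Accuracy ratio (Gini coefficient) derived from the AUC."""
    return 2.0 * auc_value - 1.0


# Hypothetical estimated bankruptcy probabilities (illustration only).
bankrupt_scores = [0.30, 0.12, 0.08]
solvent_scores = [0.01, 0.02, 0.05, 0.10, 0.03]

a = auc(bankrupt_scores, solvent_scores)
ar = accuracy_ratio(a)
print(f"AUC = {a:.4f}, accuracy ratio = {ar:.4f}")
```

Both measures depend only on how the model ranks companies, not on the economic consequences of individual misclassifications.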

While several studies only assume the existence of such nonlinear relationships (Brüderl and Schüssler 1990; Atiya 2001; Saunders and Allen 2010), some studies have already provided empirical evidence that there are indeed such relationships among the independent variables of the models they have used. Several studies apply univariate methods on categorical independent variables derived from annual financial statements and analyze nonlinear effects with respect to quantiles of classified data (e.g., Serrano-Cinca 1997; Estrella et al. 2000; Falkenstein et al. 2000; Sobehart et al. 2000; van Gestel et al. 2005; Altman et al. 2010; Hayden 2011). Multivariate forecast models provide more detailed insights into the nonlinear relationships between the independent variables and the predictor or the non-monotonic effects on the probability of bankruptcy. Some of the studies that apply generalized additive models (GAMs) have, in fact, detected a range of nonlinear relationships with respect to analyses of creditworthiness (Burkhard and De Giorgi 2006; Alp et al. 2011; Lohmann and Ohliger 2018; Djeundje and Crook 2019) and bankruptcy prediction (Berg 2007; Hwang et al. 2007; Cheng et al. 2010; Dakovic et al. 2010). However, most of these studies focus on comparing several empirical models from a strictly statistical perspective and neither describe the nonlinear relationships they identify in sufficient detail nor interpret them from an economic perspective. One exception is the study by Lohmann and Ohliger (2017), which examines the specific form of nonlinear relationships between accounting-based independent variables and the predictor for the probability of bankruptcy. Using data on German limited companies, the authors show that nonlinear relationships are observed both below and above specific thresholds with respect to a company’s equity ratio, asset structure ratio based on tangible assets, return on assets, sales, and age.

One problem that many models for predicting bankruptcy share is that neither validity measures based on likelihood nor those based on classification take into account the economically relevant profitability that is associated with a more or less accurate bankruptcy prediction model. If the estimated probability of bankruptcy has an impact on the profitability, these commonly used validity measures may lead to inaccurate conclusions about the economic consequences of a particular empirical model. More specifically, the costs that result from misclassifying companies that are, in fact, bankrupt, and the lost profits that result from misclassifying companies that are, in fact, solvent have to be taken into account (Takahashi et al. 1984; Wilson and Sharda 1994; Yang et al. 1999; Trueck and Rachev 2009, p. 30). Consequently, in economic terms, an empirical model that is statistically more valid may be less desirable than an alternative of lower statistical validity. Overall, traditional validity measures perform inconsistently because they do not take into account the profitability of the bankruptcy prediction models.
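The point that a statistically more valid model can be economically less desirable can be illustrated with a stylized credit approval example. This is a minimal sketch under assumed numbers: the profit of 1 per approved solvent borrower and the loss of 20 per approved bankrupt borrower are hypothetical and are not taken from the profitability measure used later in the paper.

```python
def accuracy(decisions, outcomes):
    """Share of correct classifications.
    decisions: True = approve (predicted solvent); outcomes: True = bankrupt."""
    correct = sum(1 for approve, bankrupt in zip(decisions, outcomes)
                  if approve != bankrupt)
    return correct / len(outcomes)


def profit(decisions, outcomes, margin=1.0, loss=20.0):
    """Total profit of the approval decisions (hypothetical margin and loss).
    Rejected applicants contribute nothing: lost profit is an opportunity cost."""
    total = 0.0
    for approve, bankrupt in zip(decisions, outcomes):
        if approve:
            total += -loss if bankrupt else margin
    return total


# Ten applicants; only the last one goes bankrupt.
outcomes = [False] * 9 + [True]

approve_all = [True] * 10               # misses the bankrupt company
cautious = [True] * 7 + [False] * 3     # rejects it, plus two solvent ones

for name, decisions in [("approve_all", approve_all), ("cautious", cautious)]:
    print(name, accuracy(decisions, outcomes), profit(decisions, outcomes))
```

In this example the rule with the higher classification accuracy earns the lower profit, because the two types of misclassification carry very different costs.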

The present study uses data on listed U.S. companies covering the period 2000–2020 to examine whether taking into account nonlinear relationships in GAMs improves the accuracy with which generalized linear models (GLMs) predict bankruptcy and, if so, to what extent. Our study contributes to research on predicting bankruptcy in two ways: first, we apply GAMs to identify nonlinear relationships between relevant variables. We use the independent variables that Altman (1968, 2000) and Campbell et al. (2008) already applied in their bankruptcy prediction models. We explain that the estimated spline functions can be analyzed and interpreted with respect to every independent variable and its effect on the predictor for the probability of bankruptcy. This provides substantial insights into cause-effect relationships. The comparison between the estimated spline functions and the estimated linear functions reveals in which of the independent variables’ value ranges the estimated linear functions underestimate or overestimate the actual (nonlinear) effects. Second, we show that, compared to GLMs, GAMs increase the validity measures based on either likelihood or classification. We examine in depth the advantages of using GAMs by using a validity measure that is based on profitability and therefore reflects the economic consequences of a more accurate estimation of the probability of bankruptcy. The pairwise comparison between GLMs and GAMs shows that the profitability of bankruptcy prediction models and their associated classifications can be substantially increased by taking into account nonlinear relationships.

The paper is structured as follows: in the next section we will present our empirical data, the dependent and independent variables we used, and the relevant descriptive statistics and correlations. In the third section we will present the results we derived from the estimated bankruptcy prediction models. We will analyze the nonlinear relationships between the independent variables and the predictor for the probability of bankruptcy and we will compare the validity measures that are based on either likelihood or classification. In the fourth section, we will evaluate the profitability of the bankruptcy prediction models with regard to credit approval decisions. Finally, in the last section we will summarize the main results and discuss their practical implications.

2 Empirical data

2.1 Sample refinement

Our empirical study analyzes listed U.S. companies and estimates the probability of bankruptcy within a triennial period. Our empirical analysis is based on a company’s bankruptcy for the fiscal years 2000–2020 and accounting and market data for the fiscal years 2000–2017. The information on a company’s bankruptcy was taken from the UCLA-LoPucki Bankruptcy Research Database and from the bankruptcy database that was built and is maintained by Chava et al. (2011) and Chava (2014). The information on bankruptcy that our data provide shows whether a company filed for bankruptcy under Chapter 7 or Chapter 11 before the end of 2020 and, for companies that did, when. The independent variables were extracted from the COMPUSTAT database and the CRSP database and correspond to the accounting-based and market-based independent variables that Altman (1968, 2000) and Campbell et al. (2008) applied. We did not take into account further accounting-based and market-based independent variables (e.g., Ohlson 1980), as financial ratios that relate to the same area are often correlated to a substantial extent (Beaver et al. 2005). We also collected information on each company’s industry.

Overall, we gathered empirical raw data that comprise 207,990 annual observations on 24,924 listed U.S. companies for the fiscal years 2000–2017. In addition to data on a company’s bankruptcy, each annual observation includes the annual financial statement and corresponding market data. We processed the raw data in six steps that we outline in Table 1 and extracted the training sample on which we based the estimation of the bankruptcy prediction models and the out-of-time validation sample.

Table 1 The six-step procedure of processing the raw data for the fiscal years 2000–2017

Before we processed the raw data we chose our metric independent variables, drawing on Altman (1968, 2000) (five independent variables) and Campbell et al. (2008) (nine independent variables). In the first step, we eliminated all annual observations where one or more independent variables took an implausible value. In particular, we treated as implausible a share of liquid assets in total assets that is larger than 1 and a negative share of liquid assets in the market-valued total assets. In the second step, we excluded all companies in the category “Money & Finance” of the Fama–French 12-industry classification scheme. In the third step, we eliminated all annual observations which included fewer than two independent variables. As a result of this procedure, the raw data were reduced to 138,922 annual observations derived from 16,922 listed U.S. companies.
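The first three processing steps can be sketched as a simple filter. The field names below are hypothetical, since the text does not name the raw-data columns, and the example uses only the five Altman variables and four made-up observations.

```python
# Hypothetical field names; illustration only.
METRIC_VARIABLES = ["WC_TA", "RE_TA", "EBIT_TA", "BVE_TL", "S_TA"]


def keep_observation(obs, variables=METRIC_VARIABLES):
    """Return True if an annual observation survives steps 1 to 3."""
    liquid_book = obs.get("liquid_assets_to_total_assets")
    liquid_market = obs.get("liquid_assets_to_market_total_assets")
    # Step 1: drop observations with implausible values.
    if liquid_book is not None and liquid_book > 1:
        return False
    if liquid_market is not None and liquid_market < 0:
        return False
    # Step 2: drop companies in the "Money & Finance" industry category.
    if obs.get("industry") == "Money & Finance":
        return False
    # Step 3: drop observations with fewer than two available variables.
    available = sum(1 for name in variables if obs.get(name) is not None)
    return available >= 2


observations = [
    {"id": "A", "industry": "Manufacturing", "WC_TA": 0.2, "RE_TA": 0.1},
    {"id": "B", "industry": "Money & Finance", "WC_TA": 0.3, "RE_TA": 0.2},
    {"id": "C", "industry": "Energy", "WC_TA": 0.1},
    {"id": "D", "industry": "Energy", "WC_TA": 0.1, "RE_TA": 0.0,
     "liquid_assets_to_total_assets": 1.4},
]
kept = [obs["id"] for obs in observations if keep_observation(obs)]
print(kept)
```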

In the following two steps, we applied the k-nearest neighbor algorithm with k = 10 and replaced each missing independent variable by the weighted mean of its 10 nearest neighbors. Furthermore, we identified outliers and winsorized each independent variable at the 1% and 99% percentiles. These procedures did not further reduce the usable data. Finally, we split our data into a training sample that includes annual observations from the period 2000–2014 (in conjunction with information on a company’s bankruptcy for the fiscal years 2000–2017) and an out-of-time validation sample that includes annual observations from the period 2015–2017 (in conjunction with information on a company’s bankruptcy for the fiscal years 2015–2020). While we applied the training sample to estimate the bankruptcy prediction models, we used the out-of-time validation sample to examine a model’s external validity and its prediction accuracy in more detail. The bankruptcy prediction models should exhibit sufficient external validity and be usable with existing and new data from a comparable population.
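A minimal sketch of the imputation and winsorizing steps follows. The text does not specify the distance measure, the weighting scheme, or the percentile definition, so the root-mean-square distance over jointly observed variables, the inverse-distance weights, and the "inclusive" percentile method used here are all assumptions.

```python
import math
import statistics


def knn_impute(rows, target, k=10):
    """Fill missing entries (None) of rows[target] with the inverse-distance
    weighted mean of the k nearest rows that observe the missing variable."""
    row = rows[target]
    filled = list(row)
    for j, value in enumerate(row):
        if value is not None:
            continue
        candidates = []
        for i, other in enumerate(rows):
            if i == target or other[j] is None:
                continue
            shared = [(a, b) for a, b in zip(row, other)
                      if a is not None and b is not None]
            if not shared:
                continue
            dist = math.sqrt(sum((a - b) ** 2 for a, b in shared) / len(shared))
            candidates.append((dist, other[j]))
        candidates.sort(key=lambda c: c[0])
        nearest = candidates[:k]
        if nearest:
            weights = [1.0 / (d + 1e-9) for d, _ in nearest]
            filled[j] = (sum(w * v for w, (_, v) in zip(weights, nearest))
                         / sum(weights))
    return filled


def winsorize(values, lower_pct=1, upper_pct=99):
    """Clip values to the given lower and upper percentiles."""
    cuts = statistics.quantiles(values, n=100, method="inclusive")
    low, high = cuts[lower_pct - 1], cuts[upper_pct - 1]
    return [min(max(v, low), high) for v in values]


rows = [[1.0, None], [1.0, 10.0], [3.0, 30.0], [100.0, 500.0]]
imputed = knn_impute(rows, target=0, k=2)
clipped = winsorize(list(range(101)))
print(imputed, clipped[0], clipped[-1])
```

Winsorizing keeps every observation in the sample and only pulls extreme values back to the percentile bounds, which is why these steps do not reduce the number of usable observations.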

2.2 Dependent and independent variables

The dependent variable reflects the type of bankruptcy. We classified each bankrupt company according to the type of failure (Dickerson and Kawaja 1967; Schwarz and Arminger 2010; Erlenmaier 2011) and to the period within which it became bankrupt. A company was classified as “bankrupt” if it had declared bankruptcy under Chapter 7 or Chapter 11 within three years after the annual financial statement that we consulted. The 3-year prediction period is relatively long, but it is still commonly applied. Theodossiou (1993) already observed that a company’s financial characteristics change many years before a company actually files for bankruptcy. For example, Campbell et al. (2008) estimated the probability of bankruptcy for 6 months and 1, 2, and 3 years and, more recently, Paraschiv et al. (2022) apply a 3-year prediction period.

Our training sample for the fiscal years 2000–2014 comprises 2635 annual observations where one of the 1110 corporate bankruptcies followed within a triennial period. The a priori bankruptcy rate is about 2.15% in the training sample. Our out-of-time validation sample for the fiscal years 2015–2017 comprises 209 annual observations where one of the 145 corporate bankruptcies followed within a triennial period. The a priori bankruptcy rate of 1.28% is lower than the a priori bankruptcy rate in the training sample. While the training sample includes the economic slowdown after the dotcom bubble in 2000 and the financial crisis that began around 2007, the out-of-time validation sample covers a period with a historically low number of corporate bankruptcies due to advantageous macroeconomic and financing conditions.
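The labelling rule can be sketched as follows. How exactly the three-year window is measured is not stated in the text, so the sketch assumes calendar dates and a window that excludes the statement date itself and includes its third anniversary. All dates are made up.

```python
from datetime import date


def label_bankrupt(statement_date, filing_date, horizon_years=3):
    """Return 1 if the company filed within `horizon_years` after the
    annual financial statement, else 0 (boundary handling is an assumption)."""
    if filing_date is None:
        return 0
    try:
        horizon_end = statement_date.replace(
            year=statement_date.year + horizon_years)
    except ValueError:  # 29 February with a non-leap target year
        horizon_end = statement_date.replace(
            year=statement_date.year + horizon_years, day=28)
    return 1 if statement_date < filing_date <= horizon_end else 0


# One hypothetical company that files on 30 June 2013.
filing = date(2013, 6, 30)
statements = [date(year, 12, 31) for year in range(2008, 2013)]
labels = [label_bankrupt(s, filing) for s in statements]
print(labels)
```

Because up to three consecutive annual statements fall inside the window before a single filing, the number of observations labelled “bankrupt” exceeds the number of corporate bankruptcies.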

The independent variables consist of metric variables derived from the annual financial statements and stock-market information of the listed U.S. companies. In particular, the bankruptcy prediction models we estimate are based on the independent variables that either Altman (1968, 2000) or Campbell et al. (2008) used. In Altman’s work (1968, 2000), the independent variables consist of the five financial ratios presented in Table 2. We applied a set of independent variables where every variable result was derived entirely from accounting information. To that end, instead of market value of equity (Altman 1968), we used book value of equity (Altman 2000) to calculate the independent variable BVE_TL. Furthermore, we also decided to use the market-based independent variables Campbell et al. (2008) describe. These represent two sets of independent variables (see Table 3) that differ in the valuation of total assets (i.e., adjusted total assets (ATA) vs. market-valued total assets (MTA)). We also take into account the year of each observation and each company’s industry, according to the Fama–French 12-industry classification scheme. In our context, categorical variables are only of secondary importance, because they are not useful in the analysis of nonlinear relationships. Nevertheless, we included them to make our analysis as comprehensive as possible.

Table 2 Overview of the independent variables used in Altman (1968, 2000)

Table 3 Overview of the independent variables used in Campbell et al. (2008)

2.3 Descriptive statistics and correlations

The descriptive statistics of the independent variables are displayed in Table 4 and show the expected characteristics. The value ranges of the independent variables are economically plausible. As we excluded negative equity that is shown on the asset side from total assets, it is feasible that WC_TA < − 1, BVE_TL < 0, TL_ATA > 1, and MB < 0. The minimum and maximum values do not show any differences between solvent and bankrupt companies due to winsorizing of outliers. Several independent variables exhibit a large difference between mean and median values as the winsorized outliers still have a large effect on the mean. The median is more informative. A particularly pronounced example is the independent variable WC_TA that exhibits negative means and positive medians for both solvent and bankrupt companies. Further noticeable differences between mean and median values arise in the independent variable RE_TA. Our analysis reveals statistically significant differences (p < 0.01) between solvent and bankrupt companies in the mean and median values with respect to all metric independent variables.

Table 4 Minimum, mean, median, maximum, and standard deviation of the independent variables for bankrupt, solvent, and all observations (training sample)

Most independent variables in our three sets (Altman 1968, 2000; Campbell et al. 2008, ATA; Campbell et al. 2008, MTA) are moderately or weakly correlated according to Pearson’s correlation coefficients (see Table 5). The correlation between RE_TA and EBIT_TA is high, as both independent variables relate to a company’s earnings. Furthermore, there is a high correlation between SIGMA and RSIZE, and PRICE exhibits high correlations with SIGMA and RSIZE. Pearson’s correlation coefficients also indicate that there are high correlations between RE_TA and WC_TA, EBIT_TA and WC_TA, as well as TL_ATA and NI_ATA. However, these correlations are largely caused by winsorized outliers and other outlying observations, as Spearman’s rank correlation coefficients take the values 0.177 (RE_TA and WC_TA), 0.119 (EBIT_TA and WC_TA), and −0.124 (TL_ATA and NI_ATA). We tested each set of independent variables for multicollinearity between each metric independent variable and all other independent variables. The variance inflation factor shows that there is no multicollinearity, apart from the identified correlations within the three sets of independent variables. The contingency analysis does not reveal any strong relationships between the metric and categorical independent variables. Consequently, the three sets of independent variables demonstrate the expected data structures and serve as a valid database. However, we have to take into account the existing correlations within the three sets of independent variables and we have to be careful when we interpret the regression results of the bankruptcy prediction models.
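The contrast between the two correlation coefficients can be reproduced with a small sketch: a single joint outlier is enough to push Pearson’s coefficient close to 1 while Spearman’s rank correlation stays low. The data are made up for illustration and do not come from the sample.

```python
import math


def pearson(x, y):
    """Pearson's product-moment correlation coefficient."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def ranks(values):
    """Ranks starting at 1; tied values receive their average rank."""
    order = sorted(range(len(values)), key=values.__getitem__)
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman's rank correlation: Pearson's coefficient of the ranks."""
    return pearson(ranks(x), ranks(y))


# Eight weakly related observations plus one joint negative outlier.
x = [1, 2, 3, 4, 5, 6, 7, 8, -50]
y = [4, 7, 2, 8, 1, 6, 3, 5, -60]

r_pearson = pearson(x, y)
r_spearman = spearman(x, y)
print(f"Pearson: {r_pearson:.3f}, Spearman: {r_spearman:.3f}")
```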

Table 5 Correlations within the three sets of independent variables (training sample)

3 Results

3.1 Estimated bankruptcy prediction models

To analyze the relationships between the independent variables and the dependent variable, we estimated both GLMs and nonlinear GAMs with respect to the three sets of independent variables Altman (1968, 2000) and Campbell et al. (2008) used. Predicting bankruptcy requires that a company’s solvency status is coded in a binary manner (“solvency” = 0; “bankruptcy” = 1). Following this approach, we transformed the information on qualitative bankruptcy that we drew from our data into a Bernoulli-distributed measure. This metric measure, which we subsequently used in regression analysis, can be interpreted as the metric probability of a company being in the class “bankruptcy.” In GLMs, this probability depends on a set of independent variables that linearly increase or decrease the predictor, which is then transformed by a distribution function (Nelder and Wedderburn 1972). As every distribution function is non-decreasing (Jacod and Protter 2012), each independent variable has a monotonic effect on the probability within a GLM (Hosmer et al. 2013). We use the logistic distribution function in all GLM and GAM estimations. In GAMs, the functional relationship between each independent variable and the predictor is estimated by an unspecified spline function (Hastie and Tibshirani 1990, 1995). In the following, we use penalized splines with basis functions from the B-spline basis (Kneib 2006) to model the splines. We specify the GAMs concretely by using basis functions of rank g = 3 and 12 intervals that are based on quantiles of equal size for each penalized spline. The smoothing parameter is determined by the generalized cross-validation criterion (Green and Silverman 1994; Eilers and Marx 1996). As a result, applying GAMs allows us to model a non-monotonic relationship between each independent variable and the probability of bankruptcy.
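The spline specification can be made more tangible with a sketch of a cubic B-spline basis on 12 quantile-based intervals. It is an illustration rather than the estimation code of this study. It assumes that g = 3 denotes cubic basis functions, that the knot sequence is extended by three equally spaced knots on each side, and that the penalty acts on second-order differences of the coefficients. None of these details is stated in the text.

```python
import statistics


def quantile_knots(data, n_intervals=12, degree=3):
    """Knots at equal-probability quantiles, extended by `degree` knots
    on each side (the extension scheme is an assumption)."""
    cuts = statistics.quantiles(data, n=n_intervals, method="inclusive")
    breaks = [min(data)] + cuts + [max(data)]
    h_left = breaks[1] - breaks[0]
    h_right = breaks[-1] - breaks[-2]
    left = [breaks[0] - h_left * k for k in range(degree, 0, -1)]
    right = [breaks[-1] + h_right * k for k in range(1, degree + 1)]
    return left + breaks + right


def bspline_basis(x, knots, degree=3):
    """Values of all B-spline basis functions at x (Cox-de Boor recursion)."""
    b = [1.0 if knots[i] <= x < knots[i + 1] else 0.0
         for i in range(len(knots) - 1)]
    for d in range(1, degree + 1):
        nxt = []
        for i in range(len(knots) - d - 1):
            left = right = 0.0
            if knots[i + d] > knots[i]:
                left = (x - knots[i]) / (knots[i + d] - knots[i]) * b[i]
            if knots[i + d + 1] > knots[i + 1]:
                right = ((knots[i + d + 1] - x)
                         / (knots[i + d + 1] - knots[i + 1]) * b[i + 1])
            nxt.append(left + right)
        b = nxt
    return b


def second_difference_penalty(beta, lam=1.0):
    """Roughness penalty on the spline coefficients (assumed order 2)."""
    return lam * sum((beta[i + 2] - 2 * beta[i + 1] + beta[i]) ** 2
                     for i in range(len(beta) - 2))


# Skewed illustrative data for one independent variable.
data = [(i / 10.0) ** 2 for i in range(1, 61)]
knots = quantile_knots(data)
basis_at_median = bspline_basis(statistics.median(data), knots)
print(len(knots), len(basis_at_median), sum(basis_at_median))
```

With 12 intervals and cubic basis functions, each spline has 12 + 3 = 15 coefficients. The smoothing parameter (`lam` in the sketch) controls how strongly differences between neighbouring coefficients are penalized.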

In the three GLM and three GAM estimations, we included the independent variables Altman (1968, 2000) used, the adjusted independent variables (ATA) Campbell et al. (2008) used, and the market-based independent variables (MTA) used in Campbell et al. (2008). The year of the observation and the company’s industry, according to the Fama–French 12-industry classification, are also taken into account as categorical independent variables. The categorical independent variables are treated as dummy variables. For example, Eqs. (1) and (2) show the GLM and the GAM when the independent variables that were used by Altman (1968, 2000) are applied. Thereby, $\eta_{i,t}^{GLM}$ and $\eta_{i,t}^{GAM}$ are the predictors that are transformed by the logistic distribution function $F( \cdot )$ to calculate the probability $\pi_{i,t}$ of company $i$ being in the class “bankruptcy” in year $t$. In contrast to the linear regression coefficients in Eq. (1), the relationships between the metric independent variables and the predictor $\eta_{i,t}^{GAM}$ are estimated by using spline functions $f( \cdot )$ that do not follow a specified form.

$$\pi_{i,t} = F\left( \eta_{i,t}^{GLM} \right) = F\left( \begin{aligned} &\beta_{0} + \beta_{1} \cdot WC\_TA_{i,t} + \beta_{2} \cdot RE\_TA_{i,t} + \beta_{3} \cdot EBIT\_TA_{i,t} \\ &+ \beta_{4} \cdot BVE\_TL_{i,t} + \beta_{5} \cdot S\_TA_{i,t} + \beta_{6} \cdot Industry_{i} + \beta_{7} \cdot Year_{t} + \varsigma_{i,t} \end{aligned} \right) \tag{1}$$

$$\pi_{i,t} = F\left( \eta_{i,t}^{GAM} \right) = F\left( \begin{aligned} &\beta_{0} + f_{1} (WC\_TA_{i,t} ) + f_{2} (RE\_TA_{i,t} ) + f_{3} (EBIT\_TA_{i,t} ) \\ &+ f_{4} (BVE\_TL_{i,t} ) + f_{5} (S\_TA_{i,t} ) + \beta_{6} \cdot Industry_{i} + \beta_{7} \cdot Year_{t} + \varsigma_{i,t} \end{aligned} \right) \tag{2}$$
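The difference between Eqs. (1) and (2) can be illustrated with a stylized single-variable version of both predictors. The intercept, the slope, and the U-shaped function f are hypothetical and are not estimates from Table 6.

```python
import math


def logistic(eta):
    """Logistic distribution function F."""
    return 1.0 / (1.0 + math.exp(-eta))


def glm_probability(x, beta0=-4.0, beta1=-1.5):
    """Eq. (1) with one metric variable: linear predictor."""
    return logistic(beta0 + beta1 * x)


def gam_probability(x, beta0=-4.0, f=lambda v: (v - 0.5) ** 2):
    """Eq. (2) with one metric variable: predictor with a smooth function f.
    The U-shaped f is a hypothetical stand-in for an estimated spline."""
    return logistic(beta0 + f(x))


grid = [-1.0, 0.5, 2.0]
glm_probs = [glm_probability(x) for x in grid]
gam_probs = [gam_probability(x) for x in grid]
print(glm_probs)
print(gam_probs)
```

Because F is non-decreasing, the GLM probabilities move in one direction along the grid, whereas the GAM probabilities can fall and then rise again when f is non-monotonic.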

The estimations of the GLMs and the GAMs are presented in Table 6. Although the GLMs and GAMs are estimated with an intercept and dummy variables for the industry and the year of the observation, Table 6 only reports the results with regard to the metric independent variables. The results from the GLMs include the regression coefficients. The asterisks denote the level of significance based on the likelihood ratio test (Wood 2017, p. 411). The results of the metric independent variables in the GAMs show the equivalent degrees of freedom df_f, which represent the variability of the estimated splines of the metric independent variables. The value df_f = 1 shows that the estimated spline corresponds to a linear function and the increasing degrees of freedom indicate the level of increases in nonlinearity. Again, the asterisks denote the level of significance based on the likelihood ratio test (Wood 2017, p. 411).

Table 6 GLM and GAM estimations and validity measures

With regard to the independent variables Altman (1968, 2000) used in GLM1, the increasing values of WC_TA and RE_TA and the decreasing values of BVE_TL and S_TA have a positive and statistically significant effect on the probability of bankruptcy. The positive coefficients of WC_TA and RE_TA are caused by particularly negative outliers. If outliers below the 5% and above the 95% percentiles were eliminated, the coefficients of WC_TA and RE_TA would change their sign. The estimated coefficients of GLM2 and GLM3 are comparable to those Campbell et al. (2008) report and largely exhibit the expected signs.

The GLM estimation of the adjusted independent variables (ATA) that Campbell et al. (2008) used shows that decreasing values of EXC_RET and increasing values of TL_ATA, SIGMA, and RSIZE significantly increase the probability of bankruptcy. The coefficient of NI_ATA is not statistically significant. Taking into account the market-based independent variables (MTA) that Campbell et al. (2008) used, we find that decreasing values of NI_MTA, CA_MTA, and EXC_RET and increasing values of TL_MTA, SIGMA, RSIZE, and PRICE significantly increase the probability of bankruptcy. The positive coefficient of PRICE is caused by particularly negative outliers. If outliers below the 5% and above the 95% percentiles were eliminated, the coefficient of PRICE would change its sign. Furthermore, we have to note that the estimated coefficients of RSIZE are positive and statistically significant. That result deviates from Campbell et al. (2008, p. 2910) who report a positive and statistically significant coefficient of RSIZE when the adjusted independent variables were applied and a negative and statistically significant coefficient of RSIZE when the market-based independent variables were applied.

According to Table 6, the GLM estimations largely correspond to the GAM estimations with respect to the level of significance of the metric independent variables. The equivalent degrees of freedom of the GAM estimations (df_f > 1.00) reveal that there are nonlinear relationships between the independent variables and the predictor. Consequently, in order to describe the nonlinear relationships’ direction, we have to analyze the spline patterns in detail.

Despite the presence of highly correlated independent variables we restrict our analysis to the variables that Altman (1968, 2000) and Campbell et al. (2008) applied in their studies. The effects of highly correlated independent variables on the GLM and GAM estimations are negligible. We examined the robustness of the GLM and GAM estimations by varying the highly correlated independent variables of the bankruptcy prediction models. The bankruptcy prediction models are very robust against any variation of the highly correlated independent variables. As our analysis focuses on improving a model’s validity by taking into account nonlinear relationships in GAM estimations rather than coefficients of GLM estimations and their interpretation, highly correlated independent variables are not an obstacle for the present analysis. Furthermore, the nonlinear relationships that are estimated by the GAMs are robust against any change in the number of intervals for which each spline function was estimated.

As an additional robustness check we re-estimated the GLMs and GAMs as mixed models by taking into account company-specific random effects. Unobserved company-specific properties may cause omitted variable bias if these unobserved company-specific properties are correlated with the dependent and independent variables. We estimated the mixed models by introducing company-specific random effects and eliminating industry dummy variables. As the computing effort was too high to include the 16,489 companies of the training sample in the mixed models, we drew 30 random samples of 1000 companies out of 16,922 companies. Then, we applied the observations that belong to the training sample period 2000–2014 and estimated 30 mixed models for each GLM and each GAM (i.e., 30 · (3 + 3) = 180 mixed models). As a last step, we validated these estimations by applying the observations that belong to the training sample period 2000–2014 and the validation sample period 2015–2017. We analyzed the distributions of the regression coefficients and the degrees of freedom and we evaluated the estimated spline functions that were statistically significant. The regression coefficients of the GLM are robust with regard to their sign. However, the regression coefficients are largely not statistically significant as the company-specific random effects provide a large proportion of the explanatory power. The degrees of freedom of the mixed models are comparable to the degrees of freedom of the three estimated GAMs. The estimated spline functions of the mixed models show nonlinear relationships similar to those presented in Figs. 1, 2 and 3. Based on this robustness check we have to conclude that neglecting company-specific random effects does not distort the analysis of nonlinear relationships in bankruptcy prediction models.
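For the independent variables that Altman (1968, 2000) used, the mixed-model counterpart of the GLM predictor in Eq. (1) can be sketched as follows. The industry dummy variables are dropped and a company-specific random effect is added. The symbol $b_{i}$ and the assumption of a normally distributed random intercept with variance $\sigma_{b}^{2}$ are ours, as the text does not state the distribution of the random effects.

$$\eta_{i,t}^{GLM,mixed} = \beta_{0} + \beta_{1} \cdot WC\_TA_{i,t} + \beta_{2} \cdot RE\_TA_{i,t} + \beta_{3} \cdot EBIT\_TA_{i,t} + \beta_{4} \cdot BVE\_TL_{i,t} + \beta_{5} \cdot S\_TA_{i,t} + \beta_{7} \cdot Year_{t} + b_{i} + \varsigma_{i,t} , \qquad b_{i} \sim N\left( 0, \sigma_{b}^{2} \right)$$

The mixed-model counterpart of the GAM predictor follows analogously by replacing the linear terms with the spline functions $f_{1} ( \cdot ), \ldots ,f_{5} ( \cdot )$.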

3.2 Nonlinear relationships

The estimated spline functions show the estimated nonlinear relationships between the independent variables and the predictor. The equivalent degrees of freedom of the GAM estimations, and thus the estimated nonlinear relationships, differ with regard to the metric independent variables. Figures 1, 2 and 3 depict the spline patterns of the significant independent variables for the three GAM estimations. The black bold line represents the estimated spline. The value of the independent variable is plotted on the x-axis, while the effect on the predictor is plotted on the y-axis. Higher values on the y-axis indicate a higher probability of bankruptcy. However, because these probabilities also depend on the values of the other variables, we cannot determine them more precisely. The 95% confidence band is shaded gray. To compare the estimated spline patterns with the estimated linear functions, we inserted the linear functions of the GLM estimations and centered the estimated linear functions with the estimated spline patterns at the function value 0. Figures 1, 2 and 3 also depict the empirical density function of the independent variable as a dotted line, with the maximum value on the right side. The empirical density function matches the descriptive statistics that are presented in Table 4. The spline patterns are particularly meaningful within the value range where the empirical density function indicates a large number of observations.

The spline patterns show that there is a wide range of nonlinear relationships between the independent variables and the predictor. The comprehensive analysis of the spline patterns reveals five main structures that can occur separately or in combination: (1) a sharply decreasing effect on the predictor in a relatively small value range with a large number of observations, (2) a sharply increasing effect on the predictor in a relatively small value range with a large number of observations, (3) a relationship between the independent variable and the predictor that can be described by a U-curve, (4) a relationship between the independent variable and the predictor that can be described by an inverted U-curve, and (5) almost no effect on the predictor in a relatively large value range with a small number of observations. We will analyze the nonlinear relationships in Figs. 1, 2 and 3 for selected independent variables in more detail.

From the analysis of the spline function of EBIT_TA in Fig. 1 we were able to draw more valid conclusions. For positive values (EBIT_TA > 0), the independent variable EBIT_TA has a negative and almost linear effect on the predictor and, therefore, decreases the probability of bankruptcy. When EBIT_TA takes negative values, we can assume that the spline pattern for EBIT_TA is almost constant. If EBIT_TA deteriorates, this is not likely to have a further effect on the predictor. As a result, the assumption of a linear relationship will not hold within the whole value range of the independent variable EBIT_TA.

The spline function of BVE_TL in Fig. 1 decreases almost linearly until BVE_TL = 2. That means that the increase in BVE_TL reduces the probability of bankruptcy. The effect of BVE_TL on the predictor decreases in the upper peripheral areas, where we can assume that the spline pattern for BVE_TL > 2 is almost constant. This empirical finding is consistent with the findings of van Gestel et al. (2005) and Lohmann and Ohliger (2017), who showed empirically that the effect of high equity ratios on the probability of bankruptcy converges towards zero. The decreasing effect of additional potential liability when BVE_TL is already high can explain the nonlinear relationship between BVE_TL and the predictor for the probability of bankruptcy. However, linear functions cannot capture the change in the slope of the spline function, which leads ceteris paribus to either underestimating (BVE_TL < 2) or overestimating (2 < BVE_TL) the probability of bankruptcy.
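Why a single straight line cannot reproduce a kinked effect can be illustrated with a small sketch. It assumes a purely hypothetical effect that falls linearly up to a threshold of 2 and is flat above it, evaluated on an evenly spaced grid from 0 to 6; none of these numbers are estimates from the paper, and the sign pattern of the gap depends on how the observations are distributed and how the line is centered.

```python
def hinge_effect(x, threshold=2.0, slope=-1.0):
    """Hypothetical effect on the predictor: linear up to the threshold, flat above."""
    return slope * min(x, threshold)

def fit_line(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

# Evenly spaced grid on [0, 6]; real ratios are far from uniformly distributed.
xs = [i * 0.01 for i in range(601)]
ys = [hinge_effect(x) for x in xs]
intercept, slope = fit_line(xs, ys)

def gap(x):
    """Linear fit minus hinge effect; positive means the line lies above the hinge."""
    return (intercept + slope * x) - hinge_effect(x)
```

In this toy setting the fitted line is flatter than the hinge below the threshold and keeps falling above it, so `gap` changes sign across the value range: the line cannot match both segments at once.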

The estimated spline function of NI_ATA in Fig. 2 is comparable to the estimated spline function of EBIT_TA in Fig. 1. For slightly negative and positive values (NI_ATA > − 0.1), the independent variable NI_ATA has a negative and almost linear effect on the predictor. In contrast to that, negative values (NI_ATA < − 0.1) do not increase the predictor or, as a result, the probability of bankruptcy. These two sections of the estimated spline function cannot be reproduced by a strictly linear function as the sensitivity of the effect that NI_ATA has on the probability of bankruptcy is underestimated for slightly negative and positive values (NI_ATA > − 0.1) and is overestimated for negative values (NI_ATA < − 0.1).

The estimated spline function of TL_ATA in Fig. 2 increases almost linearly within the range 0.0 < TL_ATA < 0.7 and thus increases the probability of bankruptcy. The effect of TL_ATA on the probability of bankruptcy decreases in the upper peripheral areas, where we can assume that the spline pattern for TL_ATA > 0.7 is almost constant. Consequently, the results we derive from the estimated spline function are consistent with previous empirical evidence that the probability of bankruptcy exhibits low sensitivity when equity ratios are low (Lennox 1999; Lohmann and Ohliger 2017).

The spline pattern of RSIZE in Fig. 2 indicates an inverted U-shaped relationship between RSIZE and the predictor. Companies with a relatively low or a relatively high market valuation exhibit a lower probability of bankruptcy. Among these, companies with a relatively low market valuation are likely to be young and still in their “honeymoon” period (Hudson 1987; Everett and Watson 1998; Honjo 2000; Altman et al. 2010).

The nonlinear relationships in GAM3 are comparable to those in GAM2. The independent variable NI_MTA in Fig. 3 exhibits a decreasing effect on the predictor for a narrower value range. However, we can again observe lower and upper thresholds where the slope of the spline function and, therefore, the effect of the independent variable on the predictor change. The spline functions of TL_MTA and CA_MTA indicate that there are almost linear relationships between these independent variables and the predictor over the whole value range. Based on the estimated spline functions that reveal the effective relationships between the independent variables and the predictor, we have to conclude that an estimated linear function will usually be inaccurate, because it partly underestimates or overestimates the effect of the independent variable on the predictor.

3.3 Validity measures

The GLM and GAM estimations are evaluated with respect to several goodness-of-fit criteria. Table 6 reports the likelihood-based validity of the estimated GLMs and GAMs according to Nagelkerke’s pseudo-R² (Nagelkerke 1991) and Akaike’s information criterion (Akaike 1973). Nagelkerke’s pseudo-R² is clearly higher in each GAM than in the corresponding GLM. The relative increase in Nagelkerke’s pseudo-R² amounts to 104.11% (GLM1 vs. GAM1), 169.44% (GLM2 vs. GAM2), and 80.65% (GLM3 vs. GAM3).
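For reference, Nagelkerke’s pseudo-R² and a relative increase of the kind quoted above can be computed as in the following sketch. It uses the standard definition (the Cox and Snell measure rescaled by its maximum attainable value); the function and argument names are illustrative, and no values from Table 6 are reproduced.

```python
import math

def nagelkerke_r2(loglik_null, loglik_model, n_obs):
    """Nagelkerke's pseudo-R^2 from the log-likelihoods of the
    intercept-only model and the fitted model."""
    cox_snell = 1.0 - math.exp(2.0 * (loglik_null - loglik_model) / n_obs)
    upper_bound = 1.0 - math.exp(2.0 * loglik_null / n_obs)
    return cox_snell / upper_bound

def relative_increase_pct(baseline, improved):
    """Relative increase of `improved` over `baseline`, in percent."""
    return 100.0 * (improved - baseline) / baseline
```

For example, `relative_increase_pct(0.10, 0.15)` evaluates to 50, that is, a pseudo-R² that rises from a hypothetical 0.10 to 0.15 has increased by 50%.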

In contrast to Nagelkerke’s pseudo-R², Akaike’s information criterion does take into account a model’s complexity. This allows us to compare directly the validity of GLMs and GAMs (Horowitz 1983; Wood 2017). Applying Akaike’s information criterion, according to which lower values indicate greater validity, leads to a similar conclusion. Given that Akaike’s information criterion explicitly takes into account a model’s complexity, the relative difference between GLMs and GAMs is lower, because the GAM exhibits a larger number of equivalent degrees of freedom. However, the difference according to Akaike’s information criterion is sufficiently large to indicate that the GAM is superior to the corresponding GLM (Hilbe 2009, p. 260).
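The trade-off between fit and complexity can be made concrete with a minimal sketch of Akaike’s information criterion. It assumes that, for a GAM, the equivalent degrees of freedom take the place of the parameter count; the log-likelihoods and degrees of freedom below are invented for illustration and are not taken from Table 6.

```python
def aic(loglik, degrees_of_freedom):
    """Akaike's information criterion; lower values indicate greater validity."""
    return 2.0 * degrees_of_freedom - 2.0 * loglik

# Hypothetical example: the GAM fits better but uses more degrees of freedom.
aic_glm = aic(loglik=-1000.0, degrees_of_freedom=6)
aic_gam = aic(loglik=-950.0, degrees_of_freedom=20.5)
aic_gain = aic_glm - aic_gam
```

In this made-up case the log-likelihood improves by 50, which would lower the criterion by 100, but the 14.5 additional degrees of freedom add a penalty of 29, so the net gain shrinks to 71.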

Table 6 also provides evidence of the models’ validity on the basis of classification. This is a more reliable indicator of a model’s validity with respect to predicting the likelihood of a company going bankrupt. We calculated the area under the curve (AUC), which indicates the model’s overall validity (Engelmann 2011), for the training sample and the validation sample. The values we derived from the GAMs always exceed those we derived from the corresponding GLMs. Applying the statistical test that Delong et al. (1988) recommend, we found that the differences in the training sample and in the validation sample are statistically significant at p-value < 0.01. The results show that the GAM remains superior to the corresponding GLM.
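The AUC can be read as the probability that a randomly chosen bankrupt company receives a higher predicted score than a randomly chosen solvent company, with ties counted as one half. A minimal sketch of that pairwise definition follows; the function name and the scores are illustrative, the quadratic loop is meant for clarity rather than for samples of realistic size, and the test of Delong et al. (1988) is not implemented here.

```python
def auc(scores_bankrupt, scores_solvent):
    """Area under the ROC curve via pairwise comparison of predicted scores."""
    wins = 0.0
    for score_b in scores_bankrupt:
        for score_s in scores_solvent:
            if score_b > score_s:
                wins += 1.0      # correctly ordered pair
            elif score_b == score_s:
                wins += 0.5      # tie counts as half
    return wins / (len(scores_bankrupt) * len(scores_solvent))

# Hypothetical predicted bankruptcy probabilities.
example_auc = auc([0.9, 0.8], [0.1, 0.2, 0.85])
```

Here five of the six bankrupt-solvent pairs are ordered correctly, so `example_auc` equals 5/6.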

4 Profitability of the bankruptcy prediction models

The estimated GLM and GAM bankruptcy prediction models can also be evaluated on their profitability with regard to credit approval decisions. The methodology was developed by Blöchlinger and Leippold (2006) and subsequently applied by Agarwal and Taffler (2008) and Bauer and Agarwal (2014) for listed UK companies and by Paraschiv et al. (2021) for Norwegian small and medium-sized companies. The basic idea is to perform a paired comparison of two different bankruptcy prediction models that compete against each other in the credit market. The bankruptcy prediction model affects the outcome of the competition as it determines the required credit spread and, therefore, the interest charged to the debtor. It is also responsible for distinguishing between solvent debtors and debtors that will declare bankruptcy within the credit period. We evaluate the profitability of the bankruptcy prediction models by comparing GLM1 and GAM1, GLM2 and GAM2, and GLM3 and GAM3.

The methodology of Blöchlinger and Leippold (2006) includes a framework for credit approval decisions and the specification of a set of parameters. The simplified credit offer can be described by Eq. (3). The left side of Eq. (3) denotes a lender’s expected revenue given that the credit spread R_i,t of company i in year t can actually be received if the company remains solvent. Here, we apply the probability that a specific company i in year t will not become bankrupt within the next three years. The right side of Eq. (3) consists of a lender’s expected loss (LGD) given that company i goes bankrupt within the next three years and the expected minimum rate k given that the company remains solvent.

$$R_{i,t} \cdot P\{ Y = 0 \mid company = i,\, year = t \} = LGD \cdot P\{ Y = 1 \mid company = i,\, year = t \} + k \cdot P\{ Y = 0 \mid company = i,\, year = t \}$$

(3)

The most competitive credit offer is given when Eq. (3) holds. As we compare two bankruptcy prediction models with each other, we will calculate different credit spreads for each company–year observation in our data set. Company i can be won as a debtor in year t by the entity whose bankruptcy prediction model determines the lowest credit spread R_i,t. If the credit spreads R_i,t of both bankruptcy prediction models are identical, the credit will be split in equal shares. Therefore, it is preferable to estimate the probability of bankruptcy and the counter-probability as accurately as possible to gain solvent companies as debtors and reject credit applications of companies that will go bankrupt.
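Eq. (3) can be solved for the credit spread directly. The rearrangement below is our own algebra rather than a formula quoted from the source; it introduces the shorthand p_i,t for the estimated bankruptcy probability and assumes that the survival probability 1 − p_i,t is strictly positive.

$$R_{i,t} = k + LGD \cdot \frac{p_{i,t}}{1 - p_{i,t}}, \qquad p_{i,t} = P\{ Y = 1 \mid company = i,\, year = t \}$$

For illustration, with k = 0.3%, LGD = 45% and an estimated bankruptcy probability of 2%, the spread is 0.003 + 0.45 · 0.02 / 0.98 ≈ 1.22%. The spread rises with the odds of bankruptcy, so the model that assigns a solvent company the lower probability also quotes the lower spread.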

We put this framework in concrete terms by making the following five assumptions: (1) Every company applies for a similar volume of credit in each year for which we have company-related information in our data set. For example, if there are ten annual observations of a company, that company applies for ten credits and these credits can be granted by different lenders. (2) The lenders are not restricted with regard to the total volume of credit and we do not take into account refinancing costs or other operating costs of the lenders. (3) The minimum rate k is set to 0.3%. (4) A credit application is always rejected if the credit spread R_i,t is larger than 10%. (5) If a company declares bankruptcy, the LGD determines the loss. There are no further interest payments or redemptions after bankruptcy. We assume three different LGD values, which are applied uniformly and do not depend on the specific bankruptcy date. The LGD can take the values 15%, 45% (according to European Union 2013, Article 161 (1) a), and 75% (according to European Union 2013, Article 161 (1) b).
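The paired competition under these assumptions can be sketched for a single credit application as follows. This is a simplified illustration, not the authors’ implementation: the function names are hypothetical, the spread is obtained by solving Eq. (3) for R_i,t, and the per-credit profit (the spread if the company stays solvent, minus the LGD if it goes bankrupt, per unit of credit) is our reading of assumption (5).

```python
def credit_spread(p_bankrupt, lgd, k=0.003):
    """Break-even spread implied by Eq. (3): R = k + LGD * p / (1 - p)."""
    if p_bankrupt >= 1.0:
        return float("inf")  # no finite spread compensates certain bankruptcy
    return k + lgd * p_bankrupt / (1.0 - p_bankrupt)

def market_shares(p_a, p_b, lgd, k=0.003, cap=0.10):
    """Shares of one credit won by lenders A and B, given each lender's
    estimated bankruptcy probability for the applicant."""
    r_a = credit_spread(p_a, lgd, k)
    r_b = credit_spread(p_b, lgd, k)
    offers_a = r_a <= cap  # applications with a spread above the cap are rejected
    offers_b = r_b <= cap
    if not offers_a and not offers_b:
        return 0.0, 0.0
    if offers_a and (not offers_b or r_a < r_b):
        return 1.0, 0.0
    if offers_b and (not offers_a or r_b < r_a):
        return 0.0, 1.0
    return 0.5, 0.5  # identical spreads: the credit is split equally

def profit(share, spread, went_bankrupt, lgd):
    """Profit per unit of credit volume (assumed definition)."""
    return share * (-lgd if went_bankrupt else spread)
```

With an LGD of 45%, a lender that estimates a bankruptcy probability of 1% quotes a spread of about 0.75% and wins the credit against a competitor that estimates 2% (about 1.22%). If the company then goes bankrupt, the winning lender bears the full loss of 45% of the credit volume.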

Tables 7 and 8 show the lending policy and the profitability of the bankruptcy prediction models for the training sample and the validation sample. In each comparison, GLM1 competes with GAM1, GLM2 competes with GAM2, and GLM3 competes with GAM3. First, the lending policy of the bankruptcy prediction models is illustrated by three key figures that highlight the share in the credit market and the share of bankruptcies in the credit portfolio. Second, the profitability of the bankruptcy prediction models is stated as absolute and relative figures. Besides profit and profit margin, we calculated the return on assets (ROA) and the return on risk-weighted assets (RORWA). The risk-weighted assets are determined according to the Basel III regulatory framework that is stipulated in European Union (2013, Article 153), but without applying the adjustment factor for small and medium-sized companies (according to European Union 2013, Article 153 (4)). The maturity of credit that is required for calculation is assumed to be 2.5 years (according to European Union 2013, Article 162 (1)).
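How a risk weight and the resulting RORWA might be computed is sketched below. The risk-weight function follows our reading of the corporate exposure formula in Article 153 (1), including the factors 12.5 and 1.06 and without the size adjustment of Article 153 (4). The function names, the definition of RORWA as profit divided by exposure times risk weight, and the choice of which probability of default enters the formula are assumptions; the section does not state how the three-year bankruptcy probability is mapped to the regulatory input.

```python
from math import exp, log, sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()

def corporate_risk_weight(prob_default, lgd, maturity=2.5):
    """Risk weight for a corporate exposure (IRB formula, no SME size adjustment)."""
    if not 0.0 < prob_default < 1.0:
        raise ValueError("prob_default must lie strictly between 0 and 1")
    weight = (1.0 - exp(-50.0 * prob_default)) / (1.0 - exp(-50.0))
    correlation = 0.12 * weight + 0.24 * (1.0 - weight)
    b = (0.11852 - 0.05478 * log(prob_default)) ** 2
    # Default probability conditional on a 99.9% adverse systematic factor.
    stressed_pd = _STD_NORMAL.cdf(
        sqrt(1.0 / (1.0 - correlation)) * _STD_NORMAL.inv_cdf(prob_default)
        + sqrt(correlation / (1.0 - correlation)) * _STD_NORMAL.inv_cdf(0.999)
    )
    maturity_adjustment = (1.0 + (maturity - 2.5) * b) / (1.0 - 1.5 * b)
    return (lgd * stressed_pd - lgd * prob_default) * maturity_adjustment * 12.5 * 1.06

def rorwa(profit, exposure, prob_default, lgd, maturity=2.5):
    """Return on risk-weighted assets (assumed definition)."""
    return profit / (exposure * corporate_risk_weight(prob_default, lgd, maturity))
```

With a maturity of 2.5 years the numerator of the maturity adjustment equals one, so only the term 1 / (1 − 1.5 · b) remains. For a probability of default of 1% and an LGD of 45%, the function returns a risk weight of roughly 98%.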

Table 7 Model comparison on the basis of lending policy and profitability (training sample)

Table 8 Model comparison on the basis of lending policy and profitability (validation sample)

The comparisons between the GLMs and GAMs show that GAMs exhibit greater discriminatory power. GAMs estimate a lower probability of bankruptcy for those companies that remain solvent. As a result, the credit spread of the GAM is generally smaller than the credit spread of the corresponding GLM, and the institution that applies a GAM is able to attract a larger number of corporate borrowers. Additionally, applying a GAM more effectively prevents credit from being granted to companies that will go bankrupt. Overall, the greater discriminatory power of GAMs allows a more aggressive lending policy, leads to a larger share in the credit market, and limits the number of credits that are granted to companies that subsequently go bankrupt. These benefits of GAMs lead to higher profit in comparison to GLMs due to higher revenue and smaller losses. The higher profit is also confirmed by the relative profit figures. Moreover, the profit gap between every pair of GLM and GAM increases with the LGD. These results allow us to conclude that taking into account nonlinear relationships significantly increases the discriminatory power and, therefore, the profitability of the bankruptcy prediction model. These findings are consistent in both the training and the validation sample.

5 Conclusion

In order to derive a reliable prediction of bankruptcy, it is necessary to strike a balance between a model’s validity and complexity. In the present study we extend commonly used bankruptcy prediction models by taking into account nonlinear relationships between independent variables and the predictor for the probability of bankruptcy. Omitting the effects of nonlinear relationships may distort the estimates of a company’s probability of going bankrupt. This makes it necessary to evaluate the economic relevance of taking into account nonlinear relationships. For that purpose, it is important to select appropriate validity criteria.

Our findings show that several independent variables that Altman (1968, 2000) and Campbell et al. (2008) used exhibit statistically significant and economically plausible nonlinear relationships with the probability of a company going bankrupt. In the value range where the independent variables exhibit sufficient data points, it is safe to assume that these variables have an almost linear effect on the predictor. However, we did observe nonlinear relationships below and above specific thresholds at which the estimated spline functions change their slope. With respect to the independent variables RE_TA, EBIT_TA, NI_ATA, and NI_MTA, we were able to prove empirically that when each of these independent variables takes small values, there is a converging effect. Below a certain threshold, decreases in the values of each of these independent variables have only a minor or no effect on the probability that a company will go bankrupt. With respect to the independent variables BVE_TL, S_TA, and TL_ATA, we observed that, above a certain threshold, when these independent variables take large values, there is a similar effect on the probability of bankruptcy.

The validity measures that are based on either likelihood or classification indicate that the validity of the GAMs we used, in which we took into account nonlinear relationships, is higher than that of the equivalent GLMs. As a result, we have to acknowledge that there are relevant nonlinear relationships between the independent variables that were introduced by Altman (1968, 2000) and Campbell et al. (2008) and the predictor for the probability of bankruptcy. However, the improvements in the validity measures that are based either on likelihood or on classification may not necessarily be perceived as sufficient to justify choosing a more complex model for predicting bankruptcy. When only such measures are used, there is a risk that the evaluation of bankruptcy prediction models will lead to a wrong conclusion and to choosing an inappropriate model, even if the application of an alternative model increases the profitability on an economically relevant scale. For example, a global, single-item validity measure such as the AUC does not take into account the actual consequences of bankruptcy prediction models and, therefore, distorts conclusions on validity. To prevent this, it is advisable to evaluate bankruptcy prediction models on the basis of realistic assumptions about their application and the associated economic outcome.

In the present study we examined whether the profitability of bankruptcy prediction models can serve as a further validity criterion. To demonstrate the validity of this criterion, we applied two nested models that differ only with respect to nonlinear relationships: the GAM takes them into account, while the GLM does not. Consequently, we can be confident that any increase in a model’s profitability can be attributed to the inclusion or exclusion of existing nonlinear relationships. With respect to paired competition between GLMs and GAMs in the credit market, we found that applying a GAM leads to higher profitability than applying a GLM. This result holds for both the training and the validation sample. Therefore, it is worthwhile to take nonlinear relationships into account when the probability of bankruptcy is estimated.

Data availability

The information on a company’s bankruptcy was taken from the UCLA-LoPucki Bankruptcy Research Database and from the bankruptcy database that was built and is maintained by Sudheer Chava (Chava 2014; Chava et al. 2011). The independent variables were extracted from the COMPUSTAT database and the CRSP database. The data that support the findings of this study were used under license for the current study, and so the data are not publicly available.

References

Agarwal V, Taffler RJ (2008) Comparing the performance of market-based and accounting-based bankruptcy prediction models. J Bank Financ 32:1541–1551. https://doi.org/10.1016/j.jbankfin.2007.07.014
Akaike H (1973) Information theory and an extension of the maximum likelihood principle. In: Petrov BN, Csaki FP (eds) Second International Symposium on Information Theory. Akadémiai Kiadó, Budapest, pp. 267–281
Alp ÖS, Büyükbebeci E, Çekiç AI, Özkurt FY, Taylan P, Weber G-W (2011) CMARS and GAM & CQP: modern optimization methods applied to international credit default prediction. J Comput Appl Math 235:4639–4651. https://doi.org/10.1016/j.cam.2010.04.039
Altman EI (1968) Financial ratios, discriminant analysis and the prediction of corporate bankruptcy. J Financ 23:589–609. https://doi.org/10.2307/2978933
Altman EI (2000) Predicting financial distress of companies. Revisiting the Z-score and ZETA® models. Working paper
Altman EI, Saunders A (1997) Credit risk measurement: developments over the last 20 years. J Bank Financ 21:1721–1742. https://doi.org/10.1016/s0378-4266(97)00036-8
Altman EI, Haldeman RG, Narayanan P (1977) ZETA™ analysis: a new model to identify bankruptcy risk of corporations. J Bank Financ 1:29–54. https://doi.org/10.1016/0378-4266(77)90017-6
Altman EI, Marco G, Varetto F (1994) Corporate distress diagnosis: comparisons using linear discriminant analysis and neural networks (the Italian experience). J Bank Financ 18:505–529. https://doi.org/10.1016/0378-4266(94)90007-8
Altman EI, Sabato G, Wilson N (2010) The value of non-financial information in small and medium-sized enterprise risk management. J Credit Risk 6:95–127. https://doi.org/10.21314/JCR.2010.110
Atiya AF (2001) Bankruptcy prediction for credit risk using neural networks: a survey and new result. IEEE T Neural Networ 12:929–935. https://doi.org/10.1109/72.935101
Balcaen S, Ooghe H (2006) 35 years of studies on business failure: an overview of the classic statistical methodologies and their related problems. Brit Account Rev 38:63–93. https://doi.org/10.1016/j.bar.2005.09.001
Bauer J, Agarwal V (2014) Are hazard models superior to traditional bankruptcy prediction approaches? A comprehensive test. J Bank Financ 40:432–442. https://doi.org/10.1016/j.jbankfin.2013.12.013
Beaver WH (1966) Financial ratios as predictors of failure. J Account Res 4:71–111. https://doi.org/10.2307/2490171
Beaver WH, McNichols MF, Rhie JW (2005) Have financial statements become less informative? Evidence from the ability of financial ratios to predict bankruptcy. Rev Account Stud 10:93–122. https://doi.org/10.1007/s11142-004-6341-9
Bellovary JL, Giacomino DE, Akers MD (2007) A review of bankruptcy prediction studies: 1930 to present. J Financ Edu 33:1–43
Berg D (2007) Bankruptcy prediction by generalized additive models. Appl Stoch Model Bus 23:129–143. https://doi.org/10.1002/asmb.658
Black F, Cox JC (1976) Valuing corporate securities: some effects of bond indenture provisions. J Financ 31:351–367. https://doi.org/10.1111/j.1540-6261.1976.tb01891.x
Blöchlinger A, Leippold M (2006) Economic benefit of powerful credit scoring. J Bank Financ 30:851–873. https://doi.org/10.1016/j.jbankfin.2005.07.014
Brüderl J, Schüssler R (1990) Organizational mortality: the liability of newness and adolescence. Admin Sci Quart 35:530–537. https://doi.org/10.2307/2393316
Burkhard J, de Giorgi E (2006) An intensity-based non-parametric default model for residential mortgage portfolios. J Risk 8:57–95. https://doi.org/10.21314/JOR.2006.137
Campbell JY, Hilscher J, Szilagyi J (2008) In search of distress risk. J Financ 63:2899–2939. https://doi.org/10.1111/j.1540-6261.2008.01416.x
Charitou A, Neophytou E, Charalambous C (2004) Predicting corporate failure: empirical evidence for the UK. Eur Account Rev 13:465–497. https://doi.org/10.1080/0963818042000216811
Chava S (2014) Environmental externalities and cost of capital. Manage Sci 60:2223–2247. https://doi.org/10.1287/mnsc.2013.1863
Chava S, Stefanescu C, Turnbull S (2011) Modeling the loss distribution. Manage Sci 57:1267–1287. https://doi.org/10.1287/mnsc.1110.1345
Cheng KF, Chu CK, Hwang R-C (2010) Predicting bankruptcy using the discrete-time semiparametric hazard model. Quant Financ 10:1055–1066. https://doi.org/10.1080/14697680902814274
Dakovic R, Czado C, Berg D (2010) Bankruptcy prediction in Norway: a comparison study. Appl Econ Lett 17:1739–1746. https://doi.org/10.1080/13504850903299594
Delong ER, Delong DM, Clarke-Pearson DL (1988) Comparing the areas under two or more correlated receiver operating characteristic curves: a nonparametric approach. Biometrics 44:837–845. https://doi.org/10.2307/2531595
Dickerson OD, Kawaja M (1967) The failure rates of business. In: Pfeffer I (ed) The financing of small business. A current assessment. Macmillan, New York, pp 82–94
Dimitras AI, Zanakis SH, Zopounidis C (1996) A survey of business failures with an emphasis on prediction methods and industrial applications. Theory and methodology. Eur J Oper Res 90:487–513. https://doi.org/10.1016/0377-2217(95)00070-4
Djeundje VB, Crook J (2019) Identifying hidden patterns in credit risk survival data using generalised additive models. Eur J Oper Res 277:366–376. https://doi.org/10.1016/j.ejor.2019.02.006
Eilers PHC, Marx BD (1996) Flexible smoothing with B-splines and penalties. Stat Sci 11:89–102. https://doi.org/10.1214/ss/1038425655
Engelmann B (2011) Measures of a rating’s discriminative power: applications and limitations. In: Engelmann B, Rauhmeier R (eds) The Basel II risk parameters: estimation, validation, stress testing—with applications to loan risk management, 2nd edn. Springer, Berlin, pp 263–287. https://doi.org/10.1007/978-3-642-16114-8
Erlenmaier U (2011) The shadow rating approach: experience from banking practice. In: Engelmann B, Rauhmeier R (eds) The Basel II risk parameters: estimation, validation, stress testing—with applications to loan risk management, 2nd edn. Springer, Berlin, pp 37–74. https://doi.org/10.1007/978-3-642-16114-8
Estrella A, Park S, Peristiani S (2000) Capital ratios and credit ratings as predictors of bank failures. Econ Policy Rev 6:33–52
European Union (2013) Regulation (EU) No 575/2013 of the European Parliament and of the Council of 26 June 2013 on prudential requirements for credit institutions and investment firms and amending Regulation (EU) No 648/2012
Everett J, Watson J (1998) Small business failure and external risk factors. Small Bus Econ 11:371–390. https://doi.org/10.1023/A:1008065527282
Fabozzi FJ, Chen R-R, Hu S-Y, Pan G-G (2010) Test of the performance of structural models in bankruptcy prediction. J Credit Risk 6:37–78. https://doi.org/10.21314/JCR.2010.108
Falkenstein E, Boral A, Carty LV (2000) RiskCalc™ for private companies: Moody’s Default Model. Moody’s Investors Service. http://www.efalken.com/papers/riskcalc.pdf. Accessed 31 Mar 2022
Green PJ, Silverman BW (1994) Nonparametric regression and generalized linear models: a roughness penalty approach. Chapman and Hall/CRC, London
Hastie TJ, Tibshirani RJ (1990) Generalized additive models. Chapman and Hall/CRC, London. https://doi.org/10.1002/sim.4780110717
Hastie TJ, Tibshirani RJ (1995) Generalized additive models for medical research. Stat Methods Med Res 4:187–196. https://doi.org/10.1177/096228029500400302
Hayden E (2011) Estimation of a rating model for corporate exposures. In: Engelmann B, Rauhmeier R (eds) The Basel II risk parameters: estimation, validation, stress testing—with applications to loan risk management, 2nd edn. Springer, Berlin, pp 13–24. https://doi.org/10.1007/978-3-642-16114-8
Hayden E, Porath D (2011) Statistical methods to develop rating models. In: Engelmann B, Rauhmeier R (eds) The Basel II risk parameters: estimation, validation, stress testing—with applications to loan risk management, 2nd edn. Springer, Berlin, pp 1–12. https://doi.org/10.1007/978-3-642-16114-8
Hilbe JM (2009) Logistic regression models. CRC Press, Boca Raton
Honjo Y (2000) Business failure of new firms: an empirical analysis using a multiplicative hazards model. Int J Ind Organ 18:557–574. https://doi.org/10.1016/S0167-7187(98)00035-6
Horowitz JL (1983) Statistical comparison of non-nested probabilistic discrete choice models. Transport Sci 17:319–350. https://doi.org/10.1287/trsc.17.3.319
Hosmer DW Jr, Lemeshow S, Sturdivant RX (2013) Applied logistic regression, 3rd edn. Wiley, Hoboken. https://doi.org/10.1002/9781118548387
Hudson J (1987) The age, regional, and industrial structure of company liquidations. J Bus Finan Account 14:199–213. https://doi.org/10.1111/j.1468-5957.1987.tb00539.x
Hwang R-C, Cheng KF, Lee JC (2007) A semiparametric method for predicting bankruptcy. J Forecasting 26:317–342. https://doi.org/10.1002/for.1027
Jacod J, Protter P (2012) Probability essentials, 2nd edn. Springer, Berlin. https://doi.org/10.1007/978-3-642-55682-1
Jarrow RA, Turnbull SM (1995) Pricing derivatives on financial securities subject to credit risk. J Financ 50:53–85. https://doi.org/10.1111/j.1540-6261.1995.tb05167.x
Jones S (2017) Corporate bankruptcy prediction. A high dimensional analysis. Rev Account Stud 22:1366–1422. https://doi.org/10.1007/s11142-017-9407-1
Jones S, Johnstone D, Wilson R (2015) An empirical evaluation of the performance of binary classifiers in the prediction of credit ratings changes. J Bank Financ 56:72–85. https://doi.org/10.1016/j.jbankfin.2015.02.006
Jones S, Johnstone D, Wilson R (2017) Predicting corporate bankruptcy: an evaluation of alternative statistical frameworks. J Bus Finan Account 44:3–34. https://doi.org/10.1111/jbfa.12218
Kneib T (2006) Mixed model based inference in structured additive regression. Dr. Hut-Verlag, Munich
Laitinen T, Kankaanpää M (1999) Comparative analysis of failure prediction methods: the Finnish case. Eur Account Rev 8:67–92. https://doi.org/10.1080/096381899336159
Lane WR, Looney SW, Wansley JW (1986) An application of the cox proportional hazards model to bank failure. J Bank Financ 10:511–531. https://doi.org/10.1016/s0378-4266(86)80003-6
Lennox C (1999) Identifying failing companies: a re-evaluation of the logit, probit and DA approaches. J Econ Bus 51:347–364. https://doi.org/10.1016/S0148-6195(99)00009-0
Lindsay DH, Campbell A (1996) A chaos approach to bankruptcy prediction. J Appl Bus Res 12:1–9. https://doi.org/10.19030/jabr.v12i4.5779
Lohmann C, Ohliger T (2017) Nonlinear relationships and their effect on bankruptcy prediction. Schmalenbach Bus Rev 18:261–287. https://doi.org/10.1007/s41464-017-0034-y
Lohmann C, Ohliger T (2018) Nonlinear relationships in a logistic model of default for a high risk installment portfolio. J Credit Risk 14:45–68. https://doi.org/10.21314/JCR.2017.232
Luoma M, Laitinen EK (1991) Survival analysis as a tool for company failure prediction. OMEGA Int J Manage Sci 19:673–678. https://doi.org/10.1016/0305-0483(91)90015-L
Merton RC (1974) On the pricing of corporate debt: the risk structure of interest rates. J Financ 29:449–470. https://doi.org/10.1111/j.1540-6261.1974.tb03058.x
Messier WF Jr, Hansen J (1988) Inducing rules for expert system development: an example using default and bankruptcy data. Manage Sci 34:1403–1415. https://doi.org/10.1287/mnsc.34.12.1403
Min JH, Lee Y-C (2005) Bankruptcy prediction using support vector machine with optimal choice of kernel function parameters. Expert Syst Appl 28:603–614. https://doi.org/10.1016/j.eswa.2004.12.008
Nagelkerke NJD (1991) A note on a general definition of the coefficient of determination. Biometrika 78:691–692. https://doi.org/10.1093/biomet/78.3.691
Nelder JA, Wedderburn RWM (1972) Generalized linear models. J R Stat Soc Ser A-G 135:370–384. https://doi.org/10.2307/2344614
Neves JC, Vieira A (2006) Improving bankruptcy prediction with hidden layer learning vector quantization. Eur Account Rev 15:253–271. https://doi.org/10.1080/09638180600555016
Ohlson JA (1980) Financial ratios and the probabilistic prediction of bankruptcy. J Account Res 18:109–131. https://doi.org/10.2307/2490395
Paraschiv F, Schmid M, Wahlstrøm RR (2022) Bankruptcy prediction of privately held SMEs using feature selection methods. https://ssrn.com/abstract=3911490. Accessed 29 Oct 2022
Saunders A, Allen L (2010) Credit risk measurement in and out of the financial crisis: new approaches to value at risk and other paradigms, 3rd edn. Wiley, Hoboken
Schwarz A, Arminger G (2010) The basis of credit scoring: On the definition of credit default events. In: Locarek-Junge H, Weihs C (eds) Classification as a tool for research: proceedings of the 11th conference of the international federation of classification societies. Springer, Berlin, pp. 595–602
Scott J (1981) The probability of bankruptcy: a comparison of empirical predictions and theoretical models. J Bank Financ 5:317–344. https://doi.org/10.1016/0378-4266(81)90029-7
Serrano-Cinca C (1997) Feedforward neural networks in the classification of financial information. Eur J Financ 3:183–202. https://doi.org/10.1080/135184797337426
Shumway T (2001) Forecasting bankruptcy more accurately: a simple hazard model. J Bus 74:101–124. https://doi.org/10.1086/209665
Sobehart JR, Keenan SC, Stein RM (2000) Benchmarking quantitative default risk models: a validation methodology. Moody’s Investors Service. http://www.rogermstein.com/wp-content/uploads/53621.pdf. Accessed 31 Mar 2022
Takahashi K, Kurokawa Y, Watase K (1984) Corporate bankruptcy prediction in Japan. J Bank Financ 8:229–247. https://doi.org/10.1016/0378-4266(84)90005-0
Tasche D (2005) Rating and probability of default validation. In: Basel Committee on Banking Supervision (ed) Studies on the validation of internal rating systems. https://www.bis.org/publ/bcbs_wp14.pdf, pp. 28–59. Accessed 31 Mar 2022
Theodossiou P (1993) Predicting shifts in the mean of a multivariate time series process: an application in predicting business failures. J Am Stat Assoc 88:441–449. https://doi.org/10.1080/01621459.1993.10476294
Trueck S, Rachev ST (2009) Rating based modeling of credit risk: theory and application of migration matrices. Elsevier Academic Press, Amsterdam. https://doi.org/10.1016/B978-0-12-373683-3.00003-8
Van Gestel T, Baesens B, Van Dijcke P, Suykens JAK, Garcia J, Alderweireld T (2005) Linear and non-linear credit scoring by combining logistic regression and support vector machines. J Credit Risk 1:31–60. https://doi.org/10.21314/JCR.2005.025
Wang Y, Wang S, Lai KK (2005) A new fuzzy support vector machine to evaluate credit risk. IEEE T Fuzzy Syst 13:820–831. https://doi.org/10.1109/TFUZZ.2005.859320
Wilson R, Sharda R (1994) Bankruptcy prediction using neural networks. Decis Support Syst 11:545–557. https://doi.org/10.1016/0167-9236(94)90024-8
Wood SN (2017) Generalized additive models. An introduction with R, 2nd edn. Chapman and Hall/CRC, Boca Raton. https://doi.org/10.1201/9781315370279
Yang ZR, Platt MB, Platt HD (1999) Probabilistic neural networks in bankruptcy prediction. J Bus Res 44:67–74. https://doi.org/10.1016/S0148-2963(97)00242-7

Funding

Open Access funding enabled and organized by Projekt DEAL. The authors declare that no funds, grants, or other support were received during the preparation of this manuscript.

Author information

Authors and Affiliations

Schumpeter School of Business and Economics, University of Wuppertal, Gaußstraße 20, 42119, Wuppertal, Germany
Christian Lohmann & Steffen Möllenhoff
parcIT GmbH, Erftstraße 15, 50672, Cologne, Germany
Thorsten Ohliger

Corresponding author

Correspondence to Christian Lohmann.

Ethics declarations

Conflict of interest

The authors have no relevant financial or non-financial interests to disclose.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Lohmann, C., Möllenhoff, S. & Ohliger, T. Nonlinear relationships in bankruptcy prediction and their effect on the profitability of bankruptcy prediction models. J Bus Econ 93, 1661–1690 (2023). https://doi.org/10.1007/s11573-022-01130-8

Accepted: 06 December 2022
Published: 20 December 2022
Issue Date: November 2023
DOI: https://doi.org/10.1007/s11573-022-01130-8
