1 Introduction

The primary aim of research on bankruptcy prediction is to estimate as accurately as possible the probability of a company becoming bankrupt (for an overview see, e.g., Scott 1981; Dimitras et al. 1996; Altman and Saunders 1997; Balcaen and Ooghe 2006; Bellovary et al. 2007). The accuracy of such forecasts largely depends on the methods and models that are applied and on selecting the most suitable explanatory variables for the purpose of predicting bankruptcy (Laitinen and Kankaanpää 1999). In general, the internal validity of bankruptcy prediction models increases with the complexity of the empirical methods and models and with the number of explanatory variables that researchers use. In contrast to that, the external validity of bankruptcy prediction models decreases if a certain complexity threshold is exceeded and overfitting arises.

The methodology that analysts apply has become very diversified and includes structural models (Merton 1974; Black and Cox 1976; Fabozzi et al. 2010), reduced-form models (Jarrow and Turnbull 1995), heuristic methods, such as expert systems (Messier and Hansen 1988), models based on chaos theory (Lindsay and Campbell 1996), univariate and multivariate discriminant analyses (Beaver 1966; Altman 1968; Altman et al. 1977), survival analyses (Lane et al. 1986; Luoma and Laitinen 1991; Shumway 2001), neural networks (Charitou et al. 2004; Neves and Vieira 2006), support vector machines (Min and Lee 2005; Wang et al. 2005), and, more recently, gradient boosting models (Jones 2017). For example, Jones (2017) uses 91 explanatory variables on shareholder structure and management compensation, variables that proxy size effects, market-based and accounting-based variables, macro-economic variables, analyst recommendations, and industry variables.

The methods and models used for predicting bankruptcy have improved impressively in recent years with respect to validity measures that are based either on likelihood, such as Nagelkerke’s pseudo-R² (Nagelkerke 1991) or Akaike’s information criterion (Akaike 1973), or on classification, such as the accuracy ratio (Tasche 2005; Trueck and Rachev 2009, pp. 26–28) or the area under the curve (AUC; Engelmann 2011). All such models involve a trade-off between statistical validity and comprehensibility. While more complex empirical models may increase the internal and external validity of measures based on likelihood or classification, they often tend to be harder to interpret (Jones et al. 2015, 2017). Models that are designed for predicting bankruptcy as accurately as possible but are not sufficiently plausible and interpretable from an economic perspective are unlikely to be useful in practice (see, e.g., Altman et al. 1994; see also the critical study of neural networks by Hayden and Porath 2011). In contrast to linear regression techniques, more recent bankruptcy prediction models that take into account nonlinear relationships, such as neural networks and gradient boosting models, do not show clearly how the explanatory variables and the probability of bankruptcy interrelate. This is often the case when there are nonlinear relationships between the explanatory variables and the predictor or non-monotonic effects of the explanatory variables on the probability of bankruptcy. An effective bankruptcy prediction model needs to capture the actual effects of the most important explanatory variables on the probability of bankruptcy and still be clear and interpretable. Furthermore, the main criterion for evaluating such a model should be how it affects the profitability of decision making, rather than validity measures that are solely based on either likelihood or classification.
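As a brief illustration of the classification-based measures mentioned above, the following minimal sketch computes the AUC as the share of bankrupt/solvent pairs in which the bankrupt company receives the higher score, and derives the accuracy ratio from the standard identity AR = 2 · AUC − 1. The scores and labels are made-up toy values, and the function names are illustrative rather than part of any model in this study.

```python
def auc(scores, labels):
    """AUC as the probability that a bankrupt firm is scored above a solvent one.

    Ties between a bankrupt and a solvent firm count as half a correct ranking.
    """
    bankrupt = [s for s, y in zip(scores, labels) if y == 1]
    solvent = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(
        1.0 if b > s else 0.5 if b == s else 0.0
        for b in bankrupt
        for s in solvent
    )
    return wins / (len(bankrupt) * len(solvent))


def accuracy_ratio(scores, labels):
    """Accuracy ratio (Gini coefficient) obtained from the AUC."""
    return 2.0 * auc(scores, labels) - 1.0


# Toy data (hypothetical): estimated bankruptcy probabilities and true status.
toy_scores = [0.9, 0.8, 0.3, 0.2, 0.1]
toy_labels = [1, 0, 1, 0, 0]
toy_auc = auc(toy_scores, toy_labels)
toy_ar = accuracy_ratio(toy_scores, toy_labels)
```

Both measures depend only on how the model ranks companies, not on the economic consequences of the resulting decisions.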

While several studies only assume the existence of such nonlinear relationships (Brüderl and Schüssler 1990; Atiya 2001; Saunders and Allen 2010), some studies have already provided empirical evidence that there are indeed such relationships among the independent variables of the models they have used. Several studies apply univariate methods to categorical independent variables derived from annual financial statements and analyze nonlinear effects with respect to quantiles of classified data (e.g., Serrano-Cinca 1997; Estrella et al. 2000; Falkenstein et al. 2000; Sobehart et al. 2000; van Gestel et al. 2005; Altman et al. 2010; Hayden 2011). Multivariate forecast models provide more detailed insights into the nonlinear relationships between the independent variables and the predictor or the non-monotonic effects on the probability of bankruptcy. Some of the studies that apply generalized additive models (GAMs) have, in fact, detected a range of nonlinear relationships with respect to analyses of creditworthiness (Burkhard and De Giorgi 2006; Alp et al. 2011; Lohmann and Ohliger 2018; Djeundje and Crook 2019) and bankruptcy prediction (Berg 2007; Hwang et al. 2007; Cheng et al. 2010; Dakovic et al. 2010). However, most of these studies focus on comparing several empirical models from a strictly statistical perspective and neither describe the nonlinear relationships they identify in sufficient detail nor interpret them from an economic perspective. One exception is the study by Lohmann and Ohliger (2017), which examines the specific form of nonlinear relationships between accounting-based independent variables and the predictor for the probability of bankruptcy. Using data on limited German companies, the authors show that nonlinear relationships are observed both below and above specific thresholds with respect to a company’s equity ratio, asset structure ratio based on tangible assets, return on assets, sales, and age.

One problem that many models for predicting bankruptcy share is that neither validity measures based on likelihood nor those based on classification take into account the economically relevant profitability that is associated with a more or less accurate bankruptcy prediction model. If the estimated probability of bankruptcy has an impact on the profitability, these commonly used validity measures may lead to inaccurate conclusions about the economic consequences of a particular empirical model. More specifically, the costs that result from misclassifying companies that are, in fact, bankrupt, and the lost profits that result from misclassifying companies that are, in fact, solvent have to be taken into account (Takahashi et al. 1984; Wilson and Sharda 1994; Yang et al. 1999; Trueck and Rachev 2009, p. 30). Consequently, in economic terms, an empirical model that is statistically more valid may be less desirable than an alternative of lower statistical validity. Overall, traditional validity measures perform inconsistently because they do not take into account the profitability of the bankruptcy prediction models.
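The divergence between statistical validity and profitability can be made concrete with a small numerical sketch. All figures below are hypothetical assumptions chosen for illustration (1000 applicants, 20 of them bankrupt, a 2% margin on loans to solvent companies, a 60% loss on loans to bankrupt companies); this is not the profitability measure applied later in this study.

```python
def accuracy(c):
    """Share of correct decisions: bankrupt firms rejected, solvent firms approved."""
    correct = c["rejected_bankrupt"] + c["approved_solvent"]
    return correct / sum(c.values())


def profit(c, margin=0.02, loss=0.60):
    """Profit per unit of exposure: margin earned on approved solvent firms
    minus loss incurred on approved bankrupt firms (both rates are assumptions)."""
    return margin * c["approved_solvent"] - loss * c["approved_bankrupt"]


# Hypothetical model A approves every applicant.
model_a = {"approved_solvent": 980, "rejected_solvent": 0,
           "approved_bankrupt": 20, "rejected_bankrupt": 0}
# Hypothetical model B rejects 60 applicants, 15 of which are in fact bankrupt.
model_b = {"approved_solvent": 935, "rejected_solvent": 45,
           "approved_bankrupt": 5, "rejected_bankrupt": 15}

acc_a, acc_b = accuracy(model_a), accuracy(model_b)
profit_a, profit_b = profit(model_a), profit(model_b)
```

Under these assumptions, model A classifies more companies correctly, yet model B is more profitable because the loss avoided on rejected bankrupt companies outweighs the margin forgone on rejected solvent ones.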

The present study uses data on listed U.S. companies covering the period 2000–2020 to examine whether taking into account nonlinear relationships in GAMs improves the accuracy with which generalized linear models (GLMs) predict bankruptcy and, if so, to what extent. Our study contributes to research on predicting bankruptcy in two ways: first, we apply GAMs to identify nonlinear relationships between relevant variables. We use the independent variables that Altman (1968, 2000) and Campbell et al. (2008) already applied in their bankruptcy prediction models. We explain that the estimated spline functions can be analyzed and interpreted with respect to every independent variable and its effect on the predictor for the probability of bankruptcy. This provides substantial insights into cause-effect relationships. The comparison between the estimated spline functions and the estimated linear functions reveals in which of the independent variables’ value ranges the estimated linear functions underestimate or overestimate the actual (nonlinear) effects. Second, we show that, compared to GLMs, GAMs increase the validity measures based on either likelihood or classification. We examine in depth the advantages of using GAMs by using a validity measure that is based on profitability and therefore reflects the economic consequences of a more accurate estimation of the probability of bankruptcy. The pairwise comparison between GLMs and GAMs shows that the profitability of bankruptcy prediction models and their associated classifications can be substantially increased by taking into account nonlinear relationships.

The paper is structured as follows: in the next section we will present our empirical data, the dependent and independent variables we used, and the relevant descriptive statistics and correlations. In the third section we will present the results we derived from the estimated bankruptcy prediction models. We will analyze the nonlinear relationships between the independent variables and the predictor for the probability of bankruptcy and we will compare the validity measures that are based on either likelihood or classification. In the fourth section, we will evaluate the profitability of the bankruptcy prediction models with regard to credit approval decisions. Finally, in the last section we will summarize the main results and discuss their practical implications.

2 Empirical data

2.1 Sample refinement

Our empirical study analyzes listed U.S. companies and estimates the probability of bankruptcy within a triennial period. Our empirical analysis is based on a company’s bankruptcy for the fiscal years 2000–2020 and accounting and market data for the fiscal years 2000–2017. The information on a company’s bankruptcy was taken from the UCLA-LoPucki Bankruptcy Research Database and from the bankruptcy database that was built and is maintained by Chava et al. (2011) and Chava (2014). The information on bankruptcy that our data provide shows whether a company filed for bankruptcy under Chapter 7 or Chapter 11 before the end of 2020 and, for companies that did, when. The independent variables were extracted from the COMPUSTAT database and the CRSP database and correspond to the accounting-based and market-based independent variables that Altman (1968, 2000) and Campbell et al. (2008) applied. We did not take into account further accounting-based and market-based independent variables (e.g., Ohlson 1980), as financial ratios that relate to the same area are often correlated to a substantial extent (Beaver et al. 2005). We also collected information on each company’s industry.

Overall, we gathered empirical raw data that comprise 207,990 annual observations on 24,924 listed U.S. companies for the fiscal years 2000–2017. In addition to data on a company’s bankruptcy, each annual observation includes the annual financial statement and corresponding market data. We processed the raw data in six steps that we outline in Table 1 and extracted the training sample on which we based the estimation of the bankruptcy prediction models and the out-of-time validation sample.

Table 1 The six-step procedure of processing the raw data for the fiscal years 2000–2017

Before we processed the raw data we chose our metric independent variables, drawing on Altman (1968, 2000) (five independent variables) and Campbell et al. (2008) (nine independent variables). In the first step, we eliminated all annual observations where one or more independent variables took an implausible value. In particular, we treated a share of liquid assets in total assets larger than 1 and a negative share of liquid assets in market-valued total assets as implausible. In the second step, we excluded all companies in the category “Money & Finance” of the Fama–French 12-industry classification scheme. In the third step, we eliminated all annual observations that included fewer than two independent variables. As a result of this procedure, the raw data were reduced to 138,922 annual observations derived from 16,922 listed U.S. companies.

In the following two steps, we applied the k-nearest neighbor algorithm with k = 10 and replaced each missing value of an independent variable by the weighted mean of its 10 nearest neighbors. Furthermore, we identified outliers and winsorized each independent variable at the 1% and 99% percentiles. These procedures did not further reduce the usable data. Finally, we split our data into a training sample that includes annual observations from the period 2000–2014 (in conjunction with information on a company’s bankruptcy for the fiscal years 2000–2017) and an out-of-time validation sample that includes annual observations from the period 2015–2017 (in conjunction with information on a company’s bankruptcy for the fiscal years 2015–2020). While we applied the training sample to estimate the bankruptcy prediction models, we used the out-of-time validation sample to examine a model’s external validity and its prediction accuracy in more detail. The bankruptcy prediction models should exhibit sufficient external validity and be usable with existing and new data from a comparable population.
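The imputation, winsorizing, and sample split described above can be sketched as follows. This is a minimal illustration and not the code used in this study: the inverse-distance weighting, the distance computed on jointly observed variables, the linear-interpolation percentile, and all function and field names are assumptions.

```python
import math


def knn_impute(rows, k=10, eps=1e-9):
    """Fill None entries with the inverse-distance-weighted mean of the k nearest rows.

    Distances use only the variables observed in both rows (an assumption).
    """
    filled = [list(r) for r in rows]
    for i, row in enumerate(rows):
        for j, value in enumerate(row):
            if value is not None:
                continue
            candidates = []
            for m, other in enumerate(rows):
                if m == i or other[j] is None:
                    continue
                shared = [(a - b) ** 2 for a, b in zip(row, other)
                          if a is not None and b is not None]
                if shared:
                    dist = math.sqrt(sum(shared) / len(shared))
                    candidates.append((dist, other[j]))
            nearest = sorted(candidates)[:k]
            if nearest:
                weights = [1.0 / (d + eps) for d, _ in nearest]
                total = sum(w * v for w, (_, v) in zip(weights, nearest))
                filled[i][j] = total / sum(weights)
    return filled


def percentile(values, q):
    """Percentile (q in [0, 100]) with linear interpolation between order statistics."""
    s = sorted(values)
    pos = (len(s) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(s) - 1)
    return s[lo] + (s[hi] - s[lo]) * (pos - lo)


def winsorize(values, lower=1.0, upper=99.0):
    """Clip values to the [lower, upper] percentile range instead of dropping them."""
    lo, hi = percentile(values, lower), percentile(values, upper)
    return [min(max(v, lo), hi) for v in values]


def split_by_year(observations, last_training_year=2014):
    """Out-of-time split: training up to 2014, validation afterwards."""
    train = [o for o in observations if o["year"] <= last_training_year]
    valid = [o for o in observations if o["year"] > last_training_year]
    return train, valid
```

Because winsorizing clips extreme values rather than removing observations, and imputation fills gaps rather than deleting rows, neither step changes the number of annual observations.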

2.2 Dependent and independent variables

The dependent variable reflects the type of bankruptcy. We classified each bankrupt company according to the type of failure (Dickerson and Kawaja 1967; Schwarz and Arminger 2010; Erlenmaier 2011) and to the period within which it became bankrupt. A company was classified as “bankrupt” if it had declared bankruptcy under Chapter 7 or Chapter 11 within three years after the annual financial statement that we consulted. The 3-year prediction period is relatively long, but it is still commonly applied. Theodossiou (1993) already observed that a company’s financial characteristics change many years before a company actually files for bankruptcy. For example, Campbell et al. (2008) estimated the probability of bankruptcy for 6 months and 1, 2, and 3 years and, more recently, Paraschiv et al. (2022) apply a 3-year prediction period.
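The classification rule can be sketched as a small labeling function. The treatment of the interval boundaries (a filing strictly after the statement date and up to exactly three years later counts as “bankrupt”) and the function names are assumptions made for illustration; the dates in the usage lines are hypothetical.

```python
from datetime import date


def add_years(d, years):
    """Shift a date by whole years; 29 February falls back to 28 February."""
    try:
        return d.replace(year=d.year + years)
    except ValueError:
        return d.replace(year=d.year + years, day=28)


def bankrupt_within_horizon(statement_date, filing_date, horizon_years=3):
    """Return 1 if a Chapter 7 or Chapter 11 filing follows within the horizon, else 0."""
    if filing_date is None:  # the company never filed
        return 0
    horizon_end = add_years(statement_date, horizon_years)
    return int(statement_date < filing_date <= horizon_end)


def a_priori_rate(labels):
    """Share of annual observations that are followed by a bankruptcy."""
    return sum(labels) / len(labels)


# Hypothetical statement and filing dates.
label_inside = bankrupt_within_horizon(date(2012, 12, 31), date(2015, 6, 30))
label_outside = bankrupt_within_horizon(date(2012, 12, 31), date(2016, 1, 1))
```

Note that a single bankruptcy can mark several annual observations of the same company as “bankrupt,” since every statement within three years before the filing falls inside the horizon.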

Our training sample for the fiscal years 2000–2014 comprises 2635 annual observations where one of the 1110 corporate bankruptcies followed within a triennial period. The a priori bankruptcy rate is about 2.15% in the training sample. Our out-of-time validation sample for the fiscal years 2015–2017 comprises 209 annual observations where one of the 145 corporate bankruptcies followed within a triennial period. The a priori bankruptcy rate of 1.28% is lower than the a priori bankruptcy rate in the training sample. While the training sample includes the economic slowdown after the dotcom bubble in 2000 and the financial crisis that began around 2007, the out-of-time validation sample covers a period with a historically low number of corporate bankruptcies due to advantageous macroeconomic and financing conditions.

The independent variables consist of metric variables derived from the annual financial statements and stock-market information of the listed U.S. companies. In particular, the bankruptcy prediction models we estimate are based on the independent variables that either Altman (1968, 2000) or Campbell et al. (2008) used. In Altman’s work (1968, 2000), the independent variables consist of the five financial ratios presented in Table 2. We applied a set of independent variables in which every variable was derived entirely from accounting information. To that end, instead of market value of equity (Altman 1968), we used book value of equity (Altman 2000) to calculate the independent variable BVE_TL. Furthermore, we also decided to use the market-based independent variables Campbell et al. (2008) describe. These represent two sets of independent variables (see Table 3) that differ in the valuation of total assets (i.e., adjusted total assets (ATA) vs. market-valued total assets (MTA)). We also take into account the year of each observation and each company’s industry, according to the Fama–French 12-industry classification scheme. In our context, categorical variables are only of secondary importance, because they are not useful in the analysis of nonlinear relationships. Nevertheless, we included them to make our analysis as comprehensive as possible.

Table 2 Overview of the independent variables used in Altman (1968, 2000)
Table 3 Overview of the independent variables used in Campbell et al. (2008)

2.3 Descriptive statistics and correlations

The descriptive statistics of the independent variables are displayed in Table 4 and show the expected characteristics. The value ranges of the independent variables are economically plausible. As we excluded negative equity that is shown on the asset side from total assets, it is feasible that WC_TA < − 1, BVE_TL < 0, TL_ATA > 1, and MB < 0. The minimum and maximum values do not show any differences between solvent and bankrupt companies due to winsorizing of outliers. Several independent variables exhibit a large difference between mean and median values as the winsorized outliers still have a large effect on the mean. The median is more informative. A particularly pronounced example is the independent variable WC_TA that exhibits negative means and positive medians for both solvent and bankrupt companies. Further noticeable differences between mean and median values arise in the independent variable RE_TA. Our analysis reveals statistically significant differences (p < 0.01) between solvent and bankrupt companies in the mean and median values with respect to all metric independent variables.

Table 4 Minimum, mean, median, maximum, and standard deviation of the independent variables for bankrupt, solvent, and all observations (training sample)

Most independent variables in our three sets (Altman 1968, 2000; Campbell et al. 2008, ATA; Campbell et al. 2008, MTA) are moderately or weakly correlated according to Pearson’s correlation coefficients (see Table 5). The correlation between RE_TA and EBIT_TA is high, as both independent variables relate to a company’s earnings. Furthermore, there is a high correlation between SIGMA and RSIZE, and PRICE exhibits high correlations with SIGMA and RSIZE. Pearson’s correlation coefficients also indicate that there are high correlations between RE_TA and WC_TA, EBIT_TA and WC_TA, as well as TL_ATA and NI_ATA. However, these correlations are largely caused by winsorized outliers and other outlying observations, as Spearman’s rank correlation coefficients take the values 0.177 (RE_TA and WC_TA), 0.119 (EBIT_TA and WC_TA), and −0.124 (TL_ATA and NI_ATA). We tested each set of independent variables for multicollinearity between each metric independent variable and all other independent variables. The variance inflation factor shows that there is no multicollinearity, apart from the identified correlations within the three sets of independent variables. The contingency analysis does not reveal any strong relationships between the metric and categorical independent variables. Consequently, the three sets of independent variables demonstrate the expected data structures and serve as a valid database. However, we have to take into account the existing correlations within the three sets of independent variables and we have to be careful when we interpret the regression results of the bankruptcy prediction models.
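The following minimal sketch illustrates how a single outlying observation can produce a high Pearson correlation while Spearman’s rank correlation remains low. The two series are made-up toy values, not data from our sample, and the helper names are illustrative.

```python
import math


def pearson(x, y):
    """Pearson's product-moment correlation coefficient."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)


def ranks(values):
    """1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2.0 + 1.0
        for pos in range(i, j + 1):
            result[order[pos]] = average_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman's rank correlation: Pearson's coefficient applied to the ranks."""
    return pearson(ranks(x), ranks(y))


# Toy series (hypothetical): eight unremarkable points plus one joint outlier.
x = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, -5.0]
y = [0.8, 0.1, 0.6, 0.3, 0.7, 0.2, 0.5, 0.4, -6.0]
r_pearson = pearson(x, y)
r_spearman = spearman(x, y)
```

In this toy example the joint outlier dominates the covariance, so Pearson’s coefficient is close to 1, whereas the rank-based coefficient stays small because the outlier only occupies the lowest rank in both series.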

Table 5 Correlations within the three sets of independent variables (training sample)

3 Results

3.1 Estimated bankruptcy prediction models

To analyze the relationships between the independent variables and the dependent variable, we estimated both GLMs and nonlinear GAMs with respect to the three sets of independent variables Altman (1968, 2000) and Campbell et al. (2008) used. Predicting bankruptcy requires that a company’s solvency status is coded in a binary manner (“solvency” = 0; “bankruptcy” = 1). Following this approach, we transformed the qualitative information on bankruptcy that we drew from our data into a Bernoulli-distributed measure. This measure, which we subsequently used in regression analysis, can be interpreted as the metric probability of a company being in the class “bankruptcy.” In GLMs, this probability depends on a set of independent variables that linearly increase or decrease the predictor, which is further transformed by a distribution function (Nelder and Wedderburn 1972). As every distribution function is monotonically non-decreasing (Jacod and Protter 2012), each independent variable has a monotonic effect on the probability within a GLM (Hosmer et al. 2013). We use the logistic distribution function in all GLM and GAM estimations. In GAMs, the functional relationship between each independent variable and the predictor is estimated by an unspecified spline function (Hastie and Tibshirani 1990, 1995). In the following, we use penalized splines with basis functions from the B-spline basis (Kneib 2006) to model the splines. We specify the GAMs by using basis functions of rank g = 3 and 12 intervals that are based on quantiles of the same size for each penalized spline. The smoothing parameter is determined by the generalized cross-validation criterion (Green and Silverman 1994; Eilers and Marx 1996). As a result, applying GAMs allows us to model a non-monotonic relationship between each independent variable and the probability of bankruptcy.
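To make the spline construction more tangible, the sketch below builds a B-spline basis on quantile-based knots with the Cox–de Boor recursion. It assumes that g = 3 denotes cubic basis functions (degree 3), that the knots are distinct, and that boundary knots are repeated; the difference penalty and the selection of the smoothing parameter by generalized cross-validation are omitted. The function names and the toy data are illustrative.

```python
def quantile_knots(values, n_intervals=12):
    """Knots at equally spaced quantiles, so each interval holds a similar number of observations."""
    s = sorted(values)
    knots = []
    for m in range(n_intervals + 1):
        pos = (len(s) - 1) * m / n_intervals
        lo = int(pos)
        hi = min(lo + 1, len(s) - 1)
        knots.append(s[lo] + (s[hi] - s[lo]) * (pos - lo))
    return knots


def bspline_basis(x, knots, degree=3):
    """Evaluate all B-spline basis functions at x (Cox-de Boor recursion)."""
    # Repeat the boundary knots so that the basis covers [knots[0], knots[-1]].
    t = [knots[0]] * degree + list(knots) + [knots[-1]] * degree
    n_basis = len(t) - degree - 1
    # Degree 0: indicator functions of the knot intervals.
    b = [1.0 if t[i] <= x < t[i + 1] else 0.0 for i in range(len(t) - 1)]
    if x == t[-1]:
        b[n_basis - 1] = 1.0  # include the right boundary in the last interval
    for d in range(1, degree + 1):
        new = []
        for i in range(len(t) - d - 1):
            left = right = 0.0
            if t[i + d] > t[i]:
                left = (x - t[i]) / (t[i + d] - t[i]) * b[i]
            if t[i + d + 1] > t[i + 1]:
                right = (t[i + d + 1] - x) / (t[i + d + 1] - t[i + 1]) * b[i + 1]
            new.append(left + right)
        b = new
    return b


def spline_value(x, knots, coefficients, degree=3):
    """Spline f(x) as a weighted sum of the basis functions."""
    basis = bspline_basis(x, knots, degree)
    return sum(c * v for c, v in zip(coefficients, basis))


# Toy data (hypothetical): 101 equally spaced values of one independent variable.
toy_values = [i / 100 for i in range(101)]
toy_knots = quantile_knots(toy_values)
toy_basis = bspline_basis(0.37, toy_knots)
```

With 12 intervals and cubic basis functions, each spline has 12 + 3 = 15 basis functions before any identifiability constraint, and the basis functions sum to one at every point of the covered range.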

In the three GLM and three GAM estimations, we included the independent variables Altman (1968, 2000) used, the adjusted independent variables (ATA) Campbell et al. (2008) used, and the market-based independent variables (MTA) used in Campbell et al. (2008). The year of the observation and the company’s industry, according to the Fama–French 12-industry classification, are also taken into account as categorical independent variables. The categorical independent variables are treated as dummy variables. For example, Eqs. (1) and (2) show the GLM and the GAM when the independent variables that were used by Altman (1968, 2000) are applied. Thereby, \(\eta_{i,t}^{GLM}\) and \(\eta_{i,t}^{GAM}\) are the predictors that are transformed by the logistic distribution function \(F( \cdot )\) to calculate the probability \(\pi_{i,t}\) of company \(i\) being in the class “bankruptcy” in year \(t\). In contrast to the linear regression coefficients in Eq. (1), the relationships between the metric independent variables and the predictor \(\eta_{i,t}^{GAM}\) are estimated by using spline functions \(f( \cdot )\) that do not follow a specified form.

$$\pi_{i,t} = F\left( {\eta_{i,t}^{GLM} } \right) = F\left( \begin{gathered} \beta_{0} + \beta_{1} \cdot WC\_TA_{i,t} + \beta_{2} \cdot RE\_TA_{i,t} + \beta_{3} \cdot EBIT\_TA_{i,t} \hfill \\ + \beta_{4} \cdot BVE\_TL_{i,t} + \beta_{5} \cdot S\_TA_{i,t} + \beta_{6} \cdot Industry_{i} + \beta_{7} \cdot Year_{t} + \varsigma_{i,t} \hfill \\ \end{gathered} \right)$$
(1)
$$\pi_{i,t} = F\left( {\eta_{i,t}^{GAM} } \right) = F\left( \begin{gathered} \beta_{0} + f_{1} (WC\_TA_{i,t} ) + f_{2} (RE\_TA_{i,t} ) + f_{3} (EBIT\_TA_{i,t} ) \hfill \\ + f_{4} (BVE\_TL_{i,t} ) + f_{5} (S\_TA_{i,t} ) + \beta_{6} \cdot Industry_{i} + \beta_{7} \cdot Year_{t} + \varsigma_{i,t} \hfill \\ \end{gathered} \right)$$
(2)
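The structural difference between Eqs. (1) and (2) can be illustrated with a minimal sketch. The coefficients and the U-shaped function below are purely hypothetical and are not estimates from this study; dummy variables and the error term are omitted, and the variable names x1 and x2 are placeholders.

```python
import math


def logistic(eta):
    """Logistic distribution function F that maps the predictor to a probability."""
    return 1.0 / (1.0 + math.exp(-eta))


def glm_probability(x, intercept, coefficients):
    """Eq. (1) without dummies: linear predictor, then logistic transformation."""
    eta = intercept + sum(beta * x[name] for name, beta in coefficients.items())
    return logistic(eta)


def gam_probability(x, intercept, smooths):
    """Eq. (2) without dummies: sum of variable-specific functions, then logistic transformation."""
    eta = intercept + sum(f(x[name]) for name, f in smooths.items())
    return logistic(eta)


# Hypothetical specification with two placeholder variables.
coefficients = {"x1": -2.0, "x2": 0.5}
smooths = {
    "x1": lambda v: 4.0 * (v - 0.5) ** 2,  # assumed U-shaped effect
    "x2": lambda v: 0.5 * v,               # assumed linear effect
}

grid = [i / 10 for i in range(11)]
glm_curve = [glm_probability({"x1": v, "x2": 0.0}, -3.0, coefficients) for v in grid]
gam_curve = [gam_probability({"x1": v, "x2": 0.0}, -3.0, smooths) for v in grid]
```

Since the logistic function is increasing, the GLM curve over x1 can only rise or fall throughout, whereas the GAM curve inherits the shape of the underlying function and may, for instance, fall and then rise again.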

The estimations of the GLMs and the GAMs are presented in Table 6. Although the GLMs and GAMs are estimated with an intercept and dummy variables for the industry and the year of the observation, Table 6 only reports the results with regard to the metric independent variables. The results from the GLMs include the regression coefficients. The asterisks denote the level of significance based on the likelihood ratio test (Wood 2017, p. 411). The results of the metric independent variables in the GAMs show the equivalent degrees of freedom dff, which represent the variability of the estimated splines of the metric independent variables. The value dff = 1 shows that the estimated spline corresponds to a linear function and the increasing degrees of freedom indicate the level of increases in nonlinearity. Again, the asterisks denote the level of significance based on the likelihood ratio test (Wood 2017, p. 411).

Table 6 GLM and GAM estimations and validity measures

With regard to the independent variables Altman (1968, 2000) used in GLM1, the increasing values of WC_TA and RE_TA and the decreasing values of BVE_TL and S_TA have a positive and statistically significant effect on the probability of bankruptcy. The positive coefficients of WC_TA and RE_TA are caused by particularly negative outliers. If outliers were eliminated below the 5% and above the 95% percentiles, the coefficients of WC_TA and RE_TA would change their sign. The estimated coefficients of GLM2 and GLM3 are comparable to those Campbell et al. (2008) report and largely exhibit the expected signs.

The GLM estimation of the adjusted independent variables (ATA) that Campbell et al. (2008) used shows that decreasing values of EXC_RET and increasing values of TL_ATA, SIGMA, and RSIZE significantly increase the probability of bankruptcy. The coefficient of NI_ATA is not statistically significant. Taking into account the market-based independent variables (MTA) that Campbell et al. (2008) used, we find that decreasing values of NI_MTA, CA_MTA, and EXC_RET and increasing values of TL_MTA, SIGMA, RSIZE, and PRICE significantly increase the probability of bankruptcy. The positive coefficient of PRICE is caused by particularly negative outliers. If outliers were eliminated below the 5% and above the 95% percentiles, the coefficient of PRICE would change its sign. Furthermore, we have to note that the estimated coefficients of RSIZE are positive and statistically significant. That result deviates from Campbell et al. (2008, p. 2910), who report a positive and statistically significant coefficient of RSIZE when the adjusted independent variables were applied and a negative and statistically significant coefficient of RSIZE when the market-based independent variables were applied.

According to Table 6, the GLM estimations largely correspond to the GAM estimations with respect to the level of significance of the metric independent variables. The equivalent degrees of freedom of the GAM estimations (dff > 1.00) reveal that there are nonlinear relationships between the independent variables and the predictor. Consequently, in order to describe the nonlinear relationships’ direction, we have to analyze the spline patterns in detail.

Despite the presence of highly correlated independent variables we restrict our analysis to the variables that Altman (1968, 2000) and Campbell et al. (2008) applied in their studies. The effects of highly correlated independent variables on the GLM and GAM estimations are negligible. We examined the robustness of the GLM and GAM estimations by varying the highly correlated independent variables of the bankruptcy prediction models. The bankruptcy prediction models are very robust against any variation of the highly correlated independent variables. As our analysis focuses on improving a model’s validity by taking into account nonlinear relationships in GAM estimations rather than coefficients of GLM estimations and their interpretation, highly correlated independent variables are not an obstacle for the present analysis. Furthermore, the nonlinear relationships that are estimated by the GAMs are robust against any change in the number of intervals for which each spline function was estimated.

As an additional robustness check we re-estimated the GLMs and GAMs as mixed models by taking into account company-specific random effects. Unobserved company-specific properties may cause omitted variable bias if these unobserved company-specific properties are correlated with the dependent and independent variables. We estimated the mixed models by introducing company-specific random effects and eliminating industry dummy variables. As the computing effort was too high to include the 16,489 companies of the training sample in the mixed models, we drew 30 random samples of 1000 companies out of 16,922 companies. Then, we applied the observations that belong to the training sample period 2000–2014 and estimated 30 mixed models for each GLM and each GAM (i.e., 30 · (3 + 3) = 180 mixed models). As a last step, we validated these estimations by applying the observations that belong to the training sample period 2000–2014 and the validation sample period 2015–2017. We analyzed the distributions of the regression coefficients and the degrees of freedom and we evaluated the estimated spline functions that were statistically significant. The regression coefficients of the GLMs are robust with regard to their sign. However, the regression coefficients are largely not statistically significant, as the company-specific random effects provide a large proportion of the explanatory power. The degrees of freedom of the mixed models are comparable to the degrees of freedom of the three estimated GAMs. The estimated spline functions of the mixed models show nonlinear relationships similar to those presented in Figs. 1, 2 and 3. Based on this robustness check, we conclude that neglecting company-specific random effects does not distort the analysis of nonlinear relationships in bankruptcy prediction models.
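The resampling design of this robustness check can be sketched as follows. The sketch only enumerates the samples and model runs; the mixed-model estimation itself is not shown. It assumes that each sample is drawn without replacement and independently of the other samples, and the company identifiers, seed, and names are placeholders.

```python
import random


def draw_company_samples(company_ids, n_samples=30, sample_size=1000, seed=0):
    """Draw n_samples random subsets of companies, each without replacement."""
    rng = random.Random(seed)  # fixed seed only to make the sketch reproducible
    return [rng.sample(company_ids, sample_size) for _ in range(n_samples)]


company_ids = list(range(16922))  # placeholder identifiers for the 16,922 companies
samples = draw_company_samples(company_ids)

# Three GLMs and three GAMs are re-estimated as mixed models on every sample.
model_specs = ["GLM1", "GLM2", "GLM3", "GAM1", "GAM2", "GAM3"]
runs = [(index, spec) for index in range(len(samples)) for spec in model_specs]
```

Enumerating the runs in this way reproduces the count of 30 · (3 + 3) = 180 mixed models stated above.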

Fig. 1

Estimated spline patterns of the significant independent variables in GAM1. The black bold line represents the estimated spline. The value of the independent variable is plotted on the x-axis, while the effect on the predictor is plotted on the y-axis. The 95% confidence band is shaded gray. The dashed line represents the linear function of the GLM estimation. The estimated linear function is centered with the estimated spline at the function value 0. The empirical density function of the independent variable is depicted as a dotted line, with the maximum value on the right side. The depicted value range of each independent variable is trimmed at 5% and 95% percentiles

Fig. 2

Estimated spline patterns of the significant independent variables of GAM2. The black bold line represents the estimated spline. The value of the independent variable is plotted on the x-axis, while the effect on the predictor is plotted on the y-axis. The 95% confidence band is shaded gray. The dashed line represents the linear function of the GLM estimation. The estimated linear function is centered with the estimated spline at the function value 0. The empirical density function of the independent variable is depicted as a dotted line, with the maximum value on the right side. The depicted value range of each independent variable is trimmed at 5% and 95% percentiles

Fig. 3

Estimated spline patterns of the significant independent variables in GAM3. The black bold line represents the estimated spline. The value of the independent variable is plotted on the x-axis, while the effect on the predictor is plotted on the y-axis. The 95% confidence band is shaded gray. The dashed line represents the linear function of the GLM estimation. The estimated linear function is centered with the estimated spline at the function value 0. The empirical density function of the independent variable is depicted as a dotted line, with the maximum value on the right side. The depicted value range of each independent variable is trimmed at 5% and 95% percentiles

3.2 Nonlinear relationships

The estimated spline functions show the estimated nonlinear relationships between the independent variables and the predictor. The equivalent degrees of freedom of the GAM estimations, and thus the estimated nonlinear relationships, differ with regard to the metric independent variables. Figures 1, 2 and 3 depict the spline patterns of the significant independent variables for the three GAM estimations. The black bold line represents the estimated spline. The value of the independent variable is plotted on the x-axis, while the effect on the predictor is plotted on the y-axis. Higher values on the y-axis indicate a higher probability of bankruptcy. However, because these probabilities also depend on the values of the other variables, we cannot determine them more precisely. The 95% confidence band is shaded gray. To compare the estimated spline patterns with the estimated linear functions, we inserted the linear functions of the GLM estimations and centered the estimated linear functions with the estimated spline patterns at the function value 0. Figures 1, 2 and 3 also depict the empirical density function of the independent variable as a dotted line, with the maximum value on the right side. The empirical density function matches the descriptive statistics that are presented in Table 4. The spline patterns are particularly meaningful within the value range where the empirical density function indicates a large number of observations.

The spline patterns show that there is a wide range of nonlinear relationships between the independent variables and the predictor. The comprehensive analysis of the spline patterns reveals five main structures that can occur separately or in combination: (1) a sharply decreasing effect on the predictor in a relatively small value range with a large number of observations, (2) a sharply increasing effect on the predictor in a relatively small value range with a large number of observations, (3) a relationship between the independent variable and the predictor that can be described by a U-curve, (4) a relationship between the independent variable and the predictor that can be described by an inverted U-curve, and (5) almost no effect on the predictor in a relatively large value range with a small number of observations. We will analyze the nonlinear relationships in Figs. 1, 2 and 3 for selected independent variables in more detail.

The spline function of EBIT_TA in Fig. 1 allows us to draw more differentiated conclusions than the linear function does. For positive values (EBIT_TA > 0), the independent variable EBIT_TA has a negative and almost linear effect on the predictor and, therefore, decreases the probability of bankruptcy. When EBIT_TA takes negative values, we can assume that the spline pattern for EBIT_TA is almost constant. If an already negative EBIT_TA deteriorates further, this is not likely to have an additional effect on the predictor. As a result, the assumption of a linear relationship will not hold within the whole value range of the independent variable EBIT_TA.

The spline function of BVE_TL in Fig. 1 decreases almost linearly until BVE_TL = 2. That means that the increase in BVE_TL reduces the probability of bankruptcy. The effect of BVE_TL on the predictor decreases in the upper peripheral areas, where we can assume that the spline pattern for BVE_TL > 2 is almost constant. This empirical finding is consistent with the findings of van Gestel et al. (2005) and Lohmann and Ohliger (2017), who showed empirically that the effect of high equity ratios on the probability of bankruptcy converges towards zero. The decreasing effect of additional potential liability when BVE_TL is already high can explain the nonlinear relationship between BVE_TL and the predictor for the probability of bankruptcy. However, linear functions cannot capture the change in the slope of the spline function, which leads ceteris paribus to either underestimating (BVE_TL < 2) or overestimating (2 < BVE_TL) the probability of bankruptcy.

The estimated spline function of NI_ATA in Fig. 2 is comparable to the estimated spline function of EBIT_TA in Fig. 1. For slightly negative and positive values (NI_ATA > −0.1), the independent variable NI_ATA has a negative and almost linear effect on the predictor. In contrast, more strongly negative values (NI_ATA < −0.1) do not further increase the predictor or, as a result, the probability of bankruptcy. These two sections of the estimated spline function cannot be reproduced by a strictly linear function, because the sensitivity of the effect that NI_ATA has on the probability of bankruptcy is underestimated for slightly negative and positive values (NI_ATA > −0.1) and overestimated for more strongly negative values (NI_ATA < −0.1).

The estimated spline function of TL_ATA in Fig. 2 increases almost linearly within the range 0.0 < TL_ATA < 0.7 and thus increases the probability of bankruptcy. The effect of TL_ATA on the probability of bankruptcy decreases in the upper peripheral areas, where we can assume that the spline pattern for TL_ATA > 0.7 is almost constant. Consequently, the results we derive from the estimated spline function are consistent with previous empirical evidence that the probability of bankruptcy exhibits low sensitivity when equity ratios are low (Lennox 1999; Lohmann and Ohliger 2017).

The spline pattern of RSIZE in Fig. 2 indicates an inverted U-shaped relationship between RSIZE and the predictor. Companies with a relatively low or a relatively high market valuation exhibit a lower probability of bankruptcy. Among these, companies with a relatively low market valuation are likely to be young and still in their “honeymoon” period (Hudson 1987; Everett and Watson 1998; Honjo 2000; Altman et al. 2010).

The nonlinear relationships in GAM3 are comparable to those in GAM2. The independent variable NI_MTA in Fig. 3 exhibits a decreasing effect on the predictor for a narrower value range. However, we can again observe lower and upper thresholds where the slope of the spline function and, therefore, the effect of the independent variable on the predictor change. The spline functions of TL_MTA and CA_MTA indicate that there are almost linear relationships between these independent variables and the predictor over the whole value range. Based on the estimated spline functions that reveal the effective relationships between the independent variables and the predictor, we have to conclude that an estimated linear function will usually be inaccurate, because it partly underestimates or overestimates the effect of the independent variable on the predictor.
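The under- and overestimation described above can be illustrated with a minimal sketch. It assumes a purely hypothetical threshold effect (flat for non-positive values, linear with slope −4 for positive values) on a synthetic, evenly spaced grid. Neither the function nor the numbers come from the estimated GAMs, and `threshold_effect` and `ols_line` are illustrative names.

```python
def ols_line(xs, ys):
    """Return (intercept, slope) of the least-squares line through (xs, ys)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def threshold_effect(x, slope=-4.0):
    """Hypothetical effect on the predictor: flat for x <= 0, linear for x > 0."""
    return slope * max(x, 0.0)


# Synthetic, evenly spaced ratio values in [-0.5, 0.5]; not the paper's data.
xs = [i / 100 for i in range(-50, 51)]
ys = [threshold_effect(x) for x in xs]
intercept, fitted_slope = ols_line(xs, ys)

print("slope of the threshold effect for x > 0:", -4.0)
print("slope of the threshold effect for x <= 0:", 0.0)
print("slope of a single line fitted over the whole range:", round(fitted_slope, 2))
```

In this setup the single fitted slope lies between the two true slopes, so the line is too flat where the effect is strong and too steep where the effect is absent.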

3.3 Validity measures

The GLM and GAM estimations are evaluated with respect to several goodness-of-fit criteria. Table 6 reports the likelihood-based validity of the estimated GLMs and GAMs according to Nagelkerke’s pseudo-R2 (Nagelkerke 1991) and Akaike’s information criterion (Akaike 1973). Nagelkerke’s pseudo-R2 is clearly higher in each GAM than in the corresponding GLM. The relative increase in Nagelkerke’s pseudo-R2 amounts to 104.11% (GLM1 vs. GAM1), 169.44% (GLM2 vs. GAM2), and 80.65% (GLM3 vs. GAM3).

In contrast to Nagelkerke’s pseudo-R2, Akaike’s information criterion does take into account a model’s complexity. This allows us to compare directly the validity of GLMs and GAMs (Horowitz 1983; Wood 2017). Applying Akaike’s information criterion, according to which lower values indicate greater validity, leads to a similar conclusion. Given that Akaike’s information criterion explicitly takes into account a model’s complexity, the relative difference between GLMs and GAMs is lower, because the GAM exhibits a larger number of equivalent degrees of freedom. However, the difference according to Akaike’s information criterion is sufficiently large to indicate that the GAM is superior to the corresponding GLM (Hilbe 2009, p. 260).
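For readers who want to reproduce the two likelihood-based measures, the following is a minimal sketch using their standard textbook definitions. It assumes that the log-likelihoods of the null model and of the fitted model are already available, and that the equivalent degrees of freedom are used as the complexity term for a GAM. All function names are illustrative and not taken from the study.

```python
import math


def nagelkerke_r2(loglik_null, loglik_model, n_obs):
    """Nagelkerke's pseudo-R2 from the log-likelihoods of the null and fitted model."""
    cox_snell = 1.0 - math.exp(2.0 * (loglik_null - loglik_model) / n_obs)
    upper_bound = 1.0 - math.exp(2.0 * loglik_null / n_obs)
    return cox_snell / upper_bound  # rescaled so that the maximum is 1


def aic(loglik_model, degrees_of_freedom):
    """Akaike's information criterion; lower values indicate greater validity."""
    return 2.0 * degrees_of_freedom - 2.0 * loglik_model


def relative_increase_pct(old, new):
    """Relative change from old to new, in percent."""
    return 100.0 * (new - old) / old
```

The pseudo-R2 depends only on the log-likelihoods, whereas the AIC adds a penalty of two per degree of freedom, which is why the gap between a GLM and a GAM narrows under the AIC.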

Table 6 also provides evidence of the models’ validity on the basis of classification. This is a more reliable indicator of a model’s validity with respect to predicting the likelihood of a company going bankrupt. We calculated the area under curve (AUC), which indicates the model’s overall validity (Engelmann 2011), for the training sample and the validation sample. The values we derived from the GAMs always exceed those we derived from the corresponding GLMs. Applying the statistical test that DeLong et al. (1988) recommend, we found that the differences in the training sample and in the validation sample are statistically significant at p < 0.01. The results show that the GAM remains superior to the corresponding GLM.
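As a minimal sketch of the classification measure, the AUC can be computed as the share of (bankrupt, solvent) pairs in which the bankrupt company receives the higher estimated probability, with ties counted as one half. The pairwise loop is chosen for readability rather than speed, the function names are illustrative, and the significance test for AUC differences is not reproduced here.

```python
def auc(scores, labels):
    """Probability that a randomly drawn bankrupt case (label 1) receives a higher
    score than a randomly drawn solvent case (label 0); ties count one half."""
    bankrupt = [s for s, y in zip(scores, labels) if y == 1]
    solvent = [s for s, y in zip(scores, labels) if y == 0]
    if not bankrupt or not solvent:
        raise ValueError("both classes are required to compute the AUC")
    wins = 0.0
    for b in bankrupt:
        for s in solvent:
            if b > s:
                wins += 1.0
            elif b == s:
                wins += 0.5
    return wins / (len(bankrupt) * len(solvent))


def accuracy_ratio(auc_value):
    """Accuracy ratio implied by the AUC (AR = 2 * AUC - 1)."""
    return 2.0 * auc_value - 1.0
```

An AUC of 0.5 corresponds to a model without discriminatory power and a value of 1 to perfect separation of bankrupt and solvent companies.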

4 Profitability of the bankruptcy prediction models

The estimated GLM and GAM bankruptcy prediction models can also be evaluated on their profitability with regard to credit approval decisions. The methodology was developed by Blöchinger and Leippold (2006) and subsequently applied by Agarwal and Taffler (2008) and Bauer and Agarwal (2014) for listed UK companies and by Paraschiv et al. (2021) for Norwegian small and medium-sized companies. The basic idea is to perform a paired comparison of two different bankruptcy prediction models that compete against each other in the credit market. The bankruptcy prediction model affects the outcome of the competition as it determines the required credit spread and, therefore, the interest charged to the debtor. It is also responsible for distinguishing between solvent debtors and debtors that will declare bankruptcy within the credit period. We evaluate the profitability of the bankruptcy prediction models by comparing GLM1 and GAM1, GLM2 and GAM2, and GLM3 and GAM3.

The methodology of Blöchinger and Leippold (2006) includes a framework for credit approval decisions and the specification of a set of parameters. The simplified credit offer can be described by Eq. (3). The left side of Eq. (3) denotes a lender’s expected revenue given that the credit spread Ri,t of company i in year t can actually be received if the company remains solvent. Here, we apply the probability that a specific company i in year t will not become bankrupt within the next three years. The right side of Eq. (3) consists of a lender’s expected loss, determined by the loss given default (LGD), given that company i goes bankrupt within the next three years and the expected minimum rate k given that the company remains solvent.

$$R_{i,t} \cdot P\{Y = 0 \mid \text{company} = i,\, \text{year} = t\} = LGD \cdot P\{Y = 1 \mid \text{company} = i,\, \text{year} = t\} + k \cdot P\{Y = 0 \mid \text{company} = i,\, \text{year} = t\}$$
(3)

The most competitive credit offer is given when Eq. (3) holds. As we compare two bankruptcy prediction models with each other, we will calculate different credit spreads for each company–year observation in our data set. Company i is won as a debtor in year t by the entity whose bankruptcy prediction model determines the lowest credit spread Ri,t. If the credit spreads Ri,t of both bankruptcy prediction models are identical, the credit will be split in equal shares. Therefore, it is preferable to estimate the probability of bankruptcy and the counter-probability as accurately as possible in order to win solvent companies as debtors and reject credit applications of companies that will go bankrupt.
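Solving Eq. (3) for the credit spread makes the role of the estimated probability explicit. Writing PD_{i,t} as a shorthand (introduced here for illustration only) for the estimated probability P{Y = 1 | company = i, year = t}, so that the counter-probability is 1 − PD_{i,t}, dividing both sides of Eq. (3) by 1 − PD_{i,t} gives

$$R_{i,t} = k + LGD \cdot \frac{PD_{i,t}}{1 - PD_{i,t}}$$

As a worked example with hypothetical values, PD_{i,t} = 2%, LGD = 45%, and k = 0.3% yield R_{i,t} = 0.003 + 0.45 · 0.02 / 0.98 ≈ 1.22%. The spread increases with the estimated probability of bankruptcy, so the model that assigns the lower probability to a company submits the more competitive offer.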

We put this framework in concrete terms by making the following five assumptions: (1) Every company applies for a similar volume of credit in each year for which we have company-related information in our data set. For example, if there are ten annual observations of a company, that company applies for ten credits and these credits can be granted by different lenders. (2) The lenders are not restricted with regard to the total volume of credit and we do not take into account refinancing costs or other operating costs of the lenders. (3) The minimum rate k is set to 0.3%. (4) A credit application is always rejected if the credit spread Ri,t is larger than 10%. (5) If a company declares bankruptcy, the LGD determines the loss. There are no further interest payments or redemptions after bankruptcy. We assume three different LGD values, which are applied uniformly and do not depend on the specific bankruptcy date. The LGD can take the values 15%, 45% (according to European Union 2013, Article 161 (1) a), and 75% (according to European Union 2013, Article 161 (1) b).
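The competition between two lenders under these assumptions can be sketched as follows. The sketch assumes that the spread is obtained by solving Eq. (3) for Ri,t, that an offer is made only if the spread does not exceed the 10% limit, and that the payoff per unit of credit is the spread if the company stays solvent and −LGD if it goes bankrupt. The payoff rule and all function names are illustrative assumptions, not a description of the authors' implementation.

```python
import math


def credit_spread(pd_hat, lgd, k=0.003):
    """Break-even spread from Eq. (3): R = k + LGD * PD / (1 - PD)."""
    if pd_hat >= 1.0:
        return math.inf
    return k + lgd * pd_hat / (1.0 - pd_hat)


def market_shares(pd_a, pd_b, lgd, k=0.003, cutoff=0.10):
    """Return (share_a, share_b, spread_a, spread_b) for one credit application."""
    spread_a = credit_spread(pd_a, lgd, k)
    spread_b = credit_spread(pd_b, lgd, k)
    offers_a = spread_a <= cutoff  # assumption (4): reject above the cutoff
    offers_b = spread_b <= cutoff
    if offers_a and offers_b:
        if spread_a < spread_b:
            shares = (1.0, 0.0)
        elif spread_b < spread_a:
            shares = (0.0, 1.0)
        else:
            shares = (0.5, 0.5)  # identical spreads: split in equal shares
    elif offers_a:
        shares = (1.0, 0.0)
    elif offers_b:
        shares = (0.0, 1.0)
    else:
        shares = (0.0, 0.0)  # both lenders reject the application
    return shares[0], shares[1], spread_a, spread_b


def profits(pd_a, pd_b, went_bankrupt, lgd, k=0.003, cutoff=0.10):
    """Profit per unit of credit for lenders A and B (assumed payoff rule)."""
    share_a, share_b, spread_a, spread_b = market_shares(pd_a, pd_b, lgd, k, cutoff)
    payoff_a = -lgd if went_bankrupt else spread_a
    payoff_b = -lgd if went_bankrupt else spread_b
    profit_a = share_a * payoff_a if share_a else 0.0
    profit_b = share_b * payoff_b if share_b else 0.0
    return profit_a, profit_b
```

Summing `profits` over all company–year observations for each model pair and LGD value would give the kind of aggregate comparison the framework describes, under the stated payoff assumption.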

Tables 7 and 8 show the lending policy and the profitability of the bankruptcy prediction models for the training sample and the validation sample. In each comparison, GLM1 competes with GAM1, GLM2 competes with GAM2, and GLM3 competes with GAM3. First, the lending policy of the bankruptcy prediction models is illustrated by three key figures that highlight the share in the credit market and the share of bankruptcies in the credit portfolio. Second, the profitability of the bankruptcy prediction models is stated as absolute and relative figures. Besides profit and profit margin, we calculated the return on assets (ROA) and the return on risk-weighted assets (RORWA). The risk-weighted assets are determined according to the Basel III regulatory framework that is stipulated in European Union (2013, Article 153), but without applying the adjustment factor for small and medium-sized companies (according to European Union 2013, Article 153 (4)). The credit maturity that is required for this calculation is assumed to be 2.5 years (according to European Union 2013, Article 162 (1)).
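The risk-weighted assets can be sketched with the risk-weight function for corporate exposures under the internal ratings-based approach, as it is commonly stated for Article 153 (1), without the adjustment for small and medium-sized companies. The sketch should be checked against the regulation text before use. Which probability of default enters the formula and how profit is aggregated are assumptions here, and the function names are illustrative.

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal distribution


def irb_risk_weight(pd_hat, lgd, maturity=2.5, scaling=1.06):
    """Risk weight for a corporate exposure; pd_hat must lie strictly between 0 and 1."""
    weight = (1.0 - math.exp(-50.0 * pd_hat)) / (1.0 - math.exp(-50.0))
    rho = 0.12 * weight + 0.24 * (1.0 - weight)      # asset correlation
    b = (0.11852 - 0.05478 * math.log(pd_hat)) ** 2  # maturity adjustment factor
    z = (_STD_NORMAL.inv_cdf(pd_hat)
         + math.sqrt(rho) * _STD_NORMAL.inv_cdf(0.999)) / math.sqrt(1.0 - rho)
    capital = lgd * (_STD_NORMAL.cdf(z) - pd_hat)    # unexpected loss per unit
    maturity_adj = (1.0 + (maturity - 2.5) * b) / (1.0 - 1.5 * b)
    return capital * maturity_adj * 12.5 * scaling


def return_on_assets(profit, exposure):
    """ROA: profit relative to the credit volume granted."""
    return profit / exposure


def return_on_risk_weighted_assets(profit, exposure, risk_weight):
    """RORWA: profit relative to exposure multiplied by its risk weight."""
    return profit / (exposure * risk_weight)
```

With a maturity of 2.5 years the numerator of the maturity adjustment equals one, so only the term 1 / (1 − 1.5 · b) remains.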

Table 7 Model comparison on the basis of lending policy and profitability (training sample)
Table 8 Model comparison on the basis of lending policy and profitability (validation sample)

The comparisons between the GLMs and GAMs show that GAMs exhibit greater discriminatory power. GAMs estimate a lower probability of bankruptcy for those companies that remain solvent. As a result, the credit spread of the GAM is generally smaller than the credit spread of the corresponding GLM, and the institution that applies a GAM is able to attract a larger number of corporate borrowers. Additionally, applying a GAM more effectively avoids granting credit to companies that will go bankrupt. Overall, the greater discriminatory power of GAMs allows a more aggressive lending policy, leads to a larger share in the credit market, and limits the number of credits that are granted to companies that subsequently go bankrupt. These benefits of GAMs lead to higher profit in comparison to GLMs due to higher revenue and smaller losses. The higher profit is also confirmed by the relative profit figures. Moreover, the profit gap between every pair of GLM and GAM increases with the LGD. These results allow us to conclude that taking into account nonlinear relationships significantly increases the discriminatory power and, therefore, the profitability of the bankruptcy prediction model. These findings are consistent in both the training and the validation sample.

5 Conclusion

In order to derive a reliable prediction of bankruptcy, it is necessary to strike a balance between a model’s validity and complexity. In the present study we extend commonly used bankruptcy prediction models by taking into account nonlinear relationships between independent variables and the predictor for the probability of bankruptcy. Omitting the effects of nonlinear relationships may distort the estimates of a company’s probability of going bankrupt. This makes it necessary to evaluate the economic relevance of taking into account nonlinear relationships. For that purpose, it is important to select appropriate validity criteria.

Our findings show that several independent variables that Altman (1968, 2000) and Campbell et al. (2008) used exhibit statistically significant and economically plausible nonlinear relationships with the probability of a company going bankrupt. In the value range where the independent variables exhibit sufficient data points, it is safe to assume that these variables have an almost linear effect on the predictor. However, we did observe nonlinear relationships below and above specific thresholds at which the estimated spline functions change their slope. With respect to the independent variables RE_TA, EBIT_TA, NI_ATA, and NI_MTA, we were able to show empirically that when each of these independent variables takes small values, there is a converging effect. Below a certain threshold, decreases in the values of each of these independent variables have only a minor or no effect on the probability that a company will go bankrupt. With respect to the independent variables BVE_TL, S_TA, and TL_ATA, we observed that, above a certain threshold, when these independent variables take large values, there is a similar effect on the probability of bankruptcy.

The validity measures that are based on either likelihood or classification indicate that the validity of the GAMs we used, in which we took into account nonlinear relationships, is higher than that of the equivalent GLMs. As a result, we have to acknowledge that there are relevant nonlinear relationships between the independent variables that were introduced by Altman (1968, 2000) and Campbell et al. (2008) and the predictor for the probability of bankruptcy. However, the improvements in the validity measures that are based either on likelihood or on classification may not necessarily be perceived as sufficient to justify choosing a more complex model for predicting bankruptcy. When only such measures are used, there is a risk that the evaluation of bankruptcy prediction models will lead to a wrong conclusion and to choosing an inappropriate model, even if the application of an alternative model increases the profitability on an economically relevant scale. For example, a global, single-item validity measure such as the AUC does not take into account the actual consequences of bankruptcy prediction models and, therefore, distorts conclusions on validity. To prevent this, it is advisable to evaluate bankruptcy prediction models on the basis of realistic assumptions about their application and the associated economic outcome.

In the present study we examined whether the profitability of bankruptcy prediction models can serve as a further validity criterion. To demonstrate the validity of this criterion, we applied two nested models that differ only with respect to nonlinear relationships: the GAM takes them into account, while the GLM does not. Consequently, we can be confident that any increase in a model’s profitability can be attributed to the inclusion or exclusion of existing nonlinear relationships. With respect to paired competition between GLMs and GAMs in the credit market, we found that applying a GAM leads to higher profitability than applying a GLM. This result holds for both the training and the validation sample. Therefore, it is worthwhile to take nonlinear relationships into account when the probability of bankruptcy is estimated.