1 Introduction

The issue of default-prediction models for small and medium-sized enterprises (SMEs) has attracted the interest of academics since the 1970s (Edmister, 1972), with renewed attention starting from the 1990s due to the Basel Capital Accords. The unique financial characteristics of SMEs have meant that traditional default-prediction models (developed for large firms) are not adequate for estimating the probability of SMEs defaulting (Ciampi, 2015). SMEs are actually financially riskier and have a lower asset correlation with one another than large businesses (Dietsch & Petey, 2004; Saurina & Trucharte, 2004). As a consequence, the development of default-prediction models for SMEs has become a specific and autonomous stream of finance literature, and this is still the case today.

This field of interest has become particularly topical given the current COVID-19 pandemic, which is making the limits of traditional rating models (mainly based on financial ratios and accounting data) even more marked when applied to SME default prediction. The global COVID-19 crisis has been affecting the financial health of the majority of SMEs, forcing them to base their chances of survival on turnaround plans which, by their very nature, significantly reduce the predictive value of the past accounting data on which financial ratios are based (Ciampi et al., 2021). Furthermore, this crisis is expected to amplify the tendency of SMEs to resort to unorthodox accounting behaviors with the aim of postponing the emergence of economic and/or financial imbalances (Ciampi, 2018), thereby lengthening the time it takes for a firm’s financial weaknesses to be reflected in the level of its financial ratios.

It is, then, necessary to build default-prediction models tailored to SMEs (Calabrese et al., 2013) and based on other information, in addition to traditional financial ratios (Ciampi et al., 2021). Some studies (among others, Ohlson, 1980; Keasey & Watson, 1987; Lin et al., 2010; Ciampi, 2015; Christopoulos et al., 2019) have, for example, already analyzed the effect of corporate governance characteristics on SME failure, demonstrating that adding corporate governance variables to financial ratios significantly improves SME default-prediction accuracy rates. Other studies (among others, Bartoli et al., 2013; Bergerès et al., 2015; Chen et al., 2015; Norden & Weber, 2010; Smondel, 2018) have instead considered data related to the relationship with banks, suggesting that adding this information also improves the predictive power of models. However, to date, no studies have combined corporate governance characteristics and bank-firm information together with financial ratios, while errors in the prediction models proposed in the literature still persist. The challenge is finding a prediction system based on a consistent number of different variables (Ciampi, 2015, 2021).

An appropriate methodology to combine financial, corporate governance and bank-firm data can be identified in Bayesian models (Bernardo & Smith, 1994), which increase in-sample prediction accuracy in comparison to traditional logit models (Figini & Giudici, 2011). This methodological approach was adopted for the present study, specifically by running a merged longitudinal predictive model on a sample of 973 SMEs based in Italy, where small and medium-sized enterprises account for more than 98% of all firms and employ over 80% of the total workforce. The firms in the sample are clients of 36 different co-operative banks (7.6% of the banks operating in Italy). They have a direct presence in more than a third of Italian municipalities, and in 620 municipalities (out of 7903), they operate as a single intermediary (Bank of Italy, 2019). Their commitment to retail banking is evidenced by the fact that 59% of their assets are destined for loans to households and small and medium-sized businesses (6% more than other banks). The data were collected for the years 2012–2014.

The main findings point to the high predictive power of the leverage ratio (total debts/(total debts + equity)), CEO tenure, ownership concentration and board diversity, as well as the number of loans overdue for more than 180 days in the previous 12 months and the number of months during which enterprises were overdrawn (cash and signature) in the previous 12 months.

The paper contributes to the stream of finance literature on SME default prediction by drawing upon a combination of the following: financial ratios (cash flow/turnover, cash flow/total debts, return on investment, acid test ratio, total debts/(total debts + equity), interest charges/turnover, EBITDA/turnover and total debts/EBITDA); corporate governance data (CEO duality, board independence, CEO tenure, ownership concentration, board size, equity incentives of the board, board diversity and audit independence); and bank-firm information (reporting institutions – 12 months; first information requests – 12 months; suffering – 12 months; overdue 180 days – 12 months; revocation – 12 months; expense on used – 12 months; months overdrawn). A close examination of the published literature (as reported in the next section) highlights that although these three categories have each been considered in previous studies analyzing their predictive power in relation to SME defaults, such studies have never combined all three. By doing this, we answer the call for exploration and measurement of the predictive power of different variables and/or categories of variables that emerged from the systematic literature review on SME default prediction by Ciampi et al. (2021). Furthermore, to fully capture the predictive power of the three different categories of variables, we applied a merged longitudinal predictive model (Figini & Giudici, 2011), anchored in the Bayesian literature (Bernardo & Smith, 1994), which performs much better than classical longitudinal models and logit models (Figini & Giudici, 2011). Lastly, by using data from several banks, we had the opportunity to control our analysis for bank characteristics.

2 SME default-prediction modeling: a review of the literature

The academic literature on financial distress forecasting dates back to the work of Beaver (1966), who first tested the predictive ability of a selection of financial ratios by conducting a univariate analysis of a matched-pair sample of large US firms. The topic then gained momentum thanks to contributions from several prominent authors who estimated the likelihood of large firms incurring financial distress, using approaches such as multiple discriminant analysis (Altman, 1968), methods for pricing corporate liabilities (Merton, 1974) and logistic and probit regressions (Ohlson, 1980; Zmijewski, 1984). However, it was only with the seminal investigation of Edmister (1972) that research into default-prediction models specific to SMEs started attracting scholarly attention. Edmister (1972) selected 42 US SMEs and 19 financial ratios and, like Altman (1968), relied on multiple discriminant analysis. He concluded that the application of such a technique results in improved predictions, compared to traditional subjective evaluations.

From the mid-1980s onwards, scholars primarily relied on logistic and probit regressions to predict SME defaults. Keasey and Watson (1987) focused on single-plant, independently owned SMEs located in the northeast of England between 1970 and 1983 and developed one of the first logit models for financial distress prediction, based on accounting and firm-level data. Apart from the UK (Gupta et al., 2015; Lin et al., 2012), similar single-country empirical analyses relying on accounting and firm-level data were later conducted for SMEs in several different geographical contexts, such as Sweden (Yazdanfar, 2011), Greece (Kosmidis & Stavropoulos, 2014) and Lithuania (Kanapickiene & Spicas, 2019). Instead, probit regression, first proposed by Zmijewski (1984), grew in importance as a research method in the early 2000s, when Dietsch and Petey (2002) conducted a large-scale study on a sample of 220,000 French SMEs, employing predominantly accounting and credit scoring data. In the following years, researchers also started to use loan-level data collected from the internal databases of banks operating in Germany (Norden & Weber, 2010), Slovakia (Fidrmuc & Hainz, 2010), Italy (Bottazzi et al., 2011) and Portugal (Duarte et al., 2018). However, although regression models can ensure the transparency of results when examining SMEs’ likelihood of incurring financial failure, these models suffer from high sensitivity to multicollinearity, and data should be coherent with the statistical assumptions that these models imply. Furthermore, although their accuracy is satisfactory, regression models are still limited compared to non-parametric models (Alaka et al., 2018).

In the late 1980s, Bayesian approaches were developed to predict financial failure of SMEs. These models were primarily based on accounting data and presented lower loss functions compared to their traditional counterparts. In their pioneering work, Keasey and Watson (1988) examined a matched-pair sample of 73 failed SMEs located in the northeast of England between 1970 and 1982 and found that a financial failure-prediction model based on a simple Bayesian approach correctly classified 69.9% of non-failed and 65.8% of failed SMEs. In the following years, Fantazzini and Figini (2009) observed a panel of 1003 German SMEs belonging to 352 business sectors between 1996 and 2004 and obtained evidence that Bayesian models outperform classical longitudinal and pooled logit models. Recently, Traczynski (2017) proposed a Bayesian model incorporating market data, which could provide better out-of-sample forecasts using fewer, albeit significant, predictors. Although they combine simplicity, interpretability and predictive accuracy, Bayesian models are highly dependent on the availability of prior knowledge. Therefore, having sufficiently large datasets is a critical issue (Fantazzini & Figini, 2009).

From the early 2000s onwards, credit scoring and hazard models entered the academic debate on SME default prediction. Credit scoring models brought several relevant methodological improvements. In this regard, Sohn and Kim (2013) built a model that considers four different states of financial ratios, while Nam (2013) devised a forecasting tool that captures (month by month) the countercyclical movements of capital requirements for SMEs. More recently, Chi and Meng (2019) proposed a rating system for SMEs that avoids information redundancy by relying on a small selection of 23 indices deriving from accounting-, firm-, loan- and market-level data. Credit scoring models, albeit superior in forecasting accuracy compared to more traditional empirical approaches, usually depend on the availability of several categories of data (e.g., Hasumi & Hirata, 2014; Sohn & Jeon, 2010; Sohn & Kim, 2013; Yu et al., 2019), and their classification performance may be undermined by the subjective biases of the human raters involved (Oliveira et al., 2017). Hazard models have also proved effective in predicting SMEs’ failure, thanks to their ability to deal with time-varying covariates (El Kalak & Hudson, 2016; Gupta et al., 2018). Nevertheless, they require compliance with statistical assumptions concerning the data structure and management of missing data.

Over the last decade, to overcome the main limitation inherent in traditional models, artificial intelligence tools have been developed to predict the financial failure of SMEs. Angelini et al. (2008) first proposed a feed-forward and an ad-hoc artificial neural network to assess the credit risk of 76 Italian SMEs, based on accounting and loan-level data, reporting error rates ranging from 7% to 14%. Mittal et al. (2011) also built a non-parametric multilevel perceptron to assess the credit reliability of a sample of 2864 Indian micro enterprises observed from 2007 to 2009, presenting an overall predictive accuracy of 71.68%. Ciampi and Gordini (2013) analyzed over 7000 Italian SMEs and found that artificial neural networks outperformed multiple discriminant analysis and logistic regression regardless of the level of aggregation considered (size, business sector and geography). Besides being very accurate non-parametric tools, artificial neural networks benefit from low sensitivity to multicollinearity. However, the results they provide lack transparency, causing researchers to refer to artificial neural networks as “black boxes” (Alaka et al., 2018).

Overall, extant studies on default prediction of SMEs have relied on parametric models implying strong statistical assumptions in the data structure (e.g., Gupta et al., 2015; Lin et al., 2012) or non-parametric models lacking transparency (Angelini et al., 2008; Ciampi & Gordini, 2013; Mittal et al., 2011). Moreover, even though accounting-, firm- and loan-level data have been used separately in previous analyses, their joint use is still limited (e.g., Yildirim, 2020). Furthermore, corporate governance data have seldom been included in forecasting models, despite their utmost relevance in the context of SMEs (Ciampi, 2015). When incorporated in quantitative models, corporate governance data have been fed into traditional regression models (Filipe et al., 2016; Keasey & Watson, 1987; Ohlson, 1980; Ono et al., 2014).

The database used in the present paper, being sufficiently large, facilitated use of a Bayesian classifier, thus overcoming the limitations inherent in both traditional and artificial intelligence models. Moreover, the empirical model proposed employs accounting, corporate governance and bank-firm data, thus encompassing all classes of predictors that the extant literature has deemed mandatory for forecasting the financial failure of SMEs.

3 Methodology

3.1 Sample and data collection

The analysis was run on a sample of 973 Italian small and medium-sized enterprises, which are clients of 36 different co-operative banks. The initial dataset was provided by Centrale Rischi Finanziari (CRIF: Italian Credit Register), an Italian rating agency, Centro Servizi Direzionali (CSD), an Italian consulting company working with the co-operative banks, and Cerved, a data-driven company that holds information about governance and ownership structures, and copies of the balance sheets and income statements of all Italian companies. After excluding large firms from the initial sample, we were left with 1847 small and medium-sized enterprises. To classify the small and medium-sized enterprises, we referred to the European definition that considers a firm as an SME if it has fewer than 250 employees and a total revenue below 50 million euro. In the second step, we excluded firms that did not have three years (2012–2014) of complete information, resulting in a sample of 1771 firms, which was split into small and medium-sized enterprises that had defaulted or not defaulted. We referred to the new definition of “default” (Article 178 of Regulation (EU) No. 575/2013 (Capital Requirements Regulation – CRR)), which came into force in Italy on January 1, 2021. With regard to this article, the European Banking Authority (EBA) guidelines specify all aspects relating to application of the definition of default of an obligor. The EBA has identified differing practices used by institutions concerning the definition of default. Consequently, these guidelines provide detailed clarification of application of the definition of default, which includes aspects such as the following: “days past due” criterion for default identification; indications of unlikeliness to pay; conditions for a return to non-defaulted status; treatment of the definition of default in external data; application of the default definition in a banking group; and specific aspects related to retail exposures. 
The EBA considers this harmonization necessary in order to ensure consistent use of the definition of default and to ensure that a harmonized approach is taken across institutions and jurisdictions. By applying this definition, we identified 252 defaulted firms.

We then used propensity score matching to identify a sample of non-defaulted firms comparable to the defaulted ones. To identify a set of companies that did not differ significantly from the defaulted firms on specific criteria, we first split the whole sample into age-location-industry-dimension (employees) subsamples and ran separate propensity score logit regressions for each subsample. By matching defaulted firms to controls in the same place and industry, of the same age and dimension, year by year, we mitigated concerns that a non-random age/location/industry/dimension distribution could affect the results. We applied three-to-one nearest-neighbor matching with replacement. In detail, we identified the matching partners for each defaulted firm by minimizing the propensity score distance between defaulted and non-defaulted firms. For the matching to be meaningful, it was necessary to condition on the support common to both defaulted and non-defaulted firms (Heckman et al., 1998). Implementing the common support condition ensured that any combination of characteristics observed in the defaulted group could also be observed in the control group. We employed the minima and maxima comparison method and deleted all observations where the propensity score was smaller than the minimum and/or larger than the maximum in the opposite group (see Caliendo & Kopeinig, 2008). In addition, we checked whether the matching procedure was able to balance the distribution of the relevant variables in the control and defaulted groups (“balancing property”). Our final sample included 721 control firms, for a total of 973 firms.
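The common support rule and the nearest-neighbor step described above can be sketched in a few lines. This is only an illustration under stated assumptions: the propensity scores are taken as already estimated, the firm identifiers and score values are hypothetical, and the function names `common_support` and `nearest_neighbors` are ours rather than part of the study's code.

```python
def common_support(treated, controls):
    """Keep only firms whose score lies inside the range shared by both groups.

    `treated` and `controls` map a firm id to its estimated propensity score.
    """
    low = max(min(treated.values()), min(controls.values()))
    high = min(max(treated.values()), max(controls.values()))

    def keep(scores):
        return {firm: s for firm, s in scores.items() if low <= s <= high}

    return keep(treated), keep(controls)


def nearest_neighbors(treated, controls, k=3):
    """Match each treated firm to its k closest controls, with replacement."""
    matches = {}
    for firm, score in treated.items():
        ranked = sorted(controls, key=lambda c: abs(controls[c] - score))
        matches[firm] = ranked[:k]  # a control may be reused by other firms
    return matches


# Hypothetical propensity scores, for illustration only.
defaulted = {"D1": 0.62, "D2": 0.35, "D3": 0.97}
non_defaulted = {"C1": 0.30, "C2": 0.40, "C3": 0.58, "C4": 0.66,
                 "C5": 0.05, "C6": 0.50, "C7": 0.36}

defaulted_cs, controls_cs = common_support(defaulted, non_defaulted)
matched = nearest_neighbors(defaulted_cs, controls_cs, k=3)
```

In this toy example the firm with score 0.97 falls outside the range of the control group and is dropped, and one control ends up matched to two defaulted firms, which is what matching with replacement allows.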

This procedure, by controlling for constant unobserved differences between the treatment and the control group, allowed us to reduce self-selection problems and avoid the existence of systematic differences in distribution of the matched samples after the matching process (Rosenbaum & Rubin, 1983).

The resultant sample of 973 SMEs included firms from the manufacturing industry (27%), the service industry (46%) and others (27%), but not financial companies. It was composed as follows: 603 micro enterprises (turnover < 2 mln; employees < 10); 292 small enterprises (turnover < 10 mln; employees < 50); 78 medium-sized enterprises (turnover < 50 mln; employees < 250). The firms in our sample are mainly located in the north and centre of Italy and have the legal form of a limited company (89%) (Table 1).
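The size classes above can be expressed as a simple rule. The sketch below uses the thresholds quoted in the text and assumes, as in the SME definition given earlier, that both the turnover and the employee condition must hold for a firm to fall into a class; the function name `size_class` is hypothetical.

```python
def size_class(turnover_mln, employees):
    """Return the size class implied by the thresholds quoted in the text."""
    # (label, turnover ceiling in million euro, employee ceiling)
    thresholds = [("micro", 2, 10), ("small", 10, 50), ("medium", 50, 250)]
    for label, max_turnover, max_employees in thresholds:
        # Assumption: both conditions must be satisfied for a class to apply.
        if turnover_mln < max_turnover and employees < max_employees:
            return label
    return "large"
```

For instance, a firm with a turnover of 1.5 million euro and 12 employees would be classed as small rather than micro, because it exceeds the micro employee ceiling.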

Table 1 Sample distribution

As anticipated, the firms in the sample are clients of 36 different co-operative banks, with capital of between 5 thousand euro and 322 thousand euro, and total assets between 44 thousand euro and about 8 and a half million euro. In Italy, co-operative credit is the most representative form of banking localism. The small size and orientation towards the local market favor relationship lending and reduction of information asymmetries between lenders and borrowers (Berger & Udell, 1995; Berger et al., 2005; Elsas, 2005; Petersen & Rajan, 1994). The information advantage enjoyed by co-operative banks over their larger counterparts, as well as their proximity to the entrepreneurial and social fabric of the territory, translates into a better capacity to select and monitor opaque borrowers such as SMEs (McKillop et al., 2020). Compared to local banks established in another corporate form, co-operative credit banks have similar purposes, adopt a traditional banking business model and use homogeneous organizational structures and procedures, which are evidenced by recent formation of co-operative banking groups (Iccrea and Cassa Centrale Banca). This aspect contributes to the originality of our data because we could also capture bank-level heterogeneity, and it was possible to extend the results to a greater part of the Italian banking system than most previous studies focusing on single banks or groups of banks (e.g., Dainelli et al., 2013).

For each firm in our sample, Cerved provided data about financial statements and governance and ownership structures; CRIF and CSD made bank-firm hard information available. Finally, data about banks were collected through the Bankscope Bureau Van Dijk database. The years covered by the analyses were 2012, 2013 and 2014; and the total number of observations was 2147.

3.2 Variables

This study used one dependent variable and three groups of independent variables: financial ratios, bank-firm information and corporate governance, plus some control variables at bank and firm level. More details are provided below.

Dependent variable: This was the default risk, which was measured by a dichotomous variable (probability of default), taking a value of 1 if a firm had gone bankrupt or defaulted and 0 otherwise.

Independent variables: As stated above, this study used three groups of independent variables: financial ratios, bank-firm hard information and corporate governance indicators (Table 2).

Table 2 List of financial ratios

With regard to financial ratios, we included ratios covering the profitability, leverage and liquidity areas of a company’s profile. Specifically, we started from a list of 23 ratios previously used in the literature on SME default prediction (Ciampi, 2015) and applied the variance inflation factor (VIF) method (Chatterjee & Hadi, 2012) to select those with the lowest rates of correlation, excluding variables with VIF values above 3 (Pompe & Bilderbeek, 2005). Finally, to identify the best combination of these, we applied the stepwise method (Shin & Lee, 2002; Shin et al., 2005), resulting in seven ratios for inclusion in the regression models.
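To make the VIF screening concrete, the following sketch computes VIFs as the diagonal of the inverse correlation matrix and removes variables until none exceeds the threshold of 3. Dropping one variable at a time, starting with the highest VIF, is our assumption; the text does not state the order of exclusion. The data are synthetic, the function names are hypothetical, and the subsequent stepwise selection is not covered.

```python
import numpy as np


def vif(X):
    """Variance inflation factors: diagonal of the inverse correlation matrix."""
    corr = np.corrcoef(X, rowvar=False)
    return np.diag(np.linalg.inv(corr))


def prune_by_vif(X, names, threshold=3.0):
    """Drop the highest-VIF column until every remaining VIF is <= threshold."""
    names = list(names)
    while X.shape[1] > 1:
        factors = vif(X)
        worst = int(np.argmax(factors))
        if factors[worst] <= threshold:
            break
        X = np.delete(X, worst, axis=1)
        names.pop(worst)
    return X, names


# Synthetic data: "c" is almost a copy of "a", so one of the two must go.
rng = np.random.default_rng(0)
a = rng.normal(size=200)
b = rng.normal(size=200)
c = a + 0.05 * rng.normal(size=200)
X = np.column_stack([a, b, c])

X_kept, kept = prune_by_vif(X, ["a", "b", "c"])
```

After pruning, the two surviving columns are nearly uncorrelated, so their VIFs are close to 1.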

Bank-firm information was selected from a long list of 365 variables provided by CRIF and CSD, by following the same procedure explained above (variance inflation factor and stepwise methods). Table 3 shows the seven variables selected and their measures.

Table 3 Bank-firm hard information

Finally, the corporate governance variables were also selected, firstly by looking at previous literature on the corporate governance-default relationship and then by applying the variance inflation factor and stepwise methods. The selected variables and their measures are listed in Table 4.

Table 4 Corporate governance indicators

Control variables: We included the following control variables: firm size, industry, location, firm age, legal form and bank capitalization. Firm size was measured as the natural logarithm of the number of employees. The entire sample of examined firms consisted of SMEs. However, as reported in the sample description, these could be split into micro, small and medium-sized enterprises. It is thus possible for different turnover levels to influence the default probability. Industry was considered through two dummy variables relating to the sectors to which the firms belonged: manufacturing and services (with “other industries” as the reference category). This partition, which is generally used by Italian banks to develop their scoring and rating systems, allowed us to capture the possible effects of typically diverse profiles of firms operating in the three different categories of business, in terms of financial and governance structures. Location was operationalized through dummy variables for three geographical areas of Italy: north, centre and south/islands (with “centre” as the reference category); this geographical partitioning is frequently used in research on Italian samples and allowed us to capture different economic and industrial features characterizing the Italian business system. Firm age, which is strictly linked to firms’ probability of default (as previous studies have suggested), was measured as the natural logarithm of the years since constitution. Legal form was a dummy variable assuming a value of 1 when the firm was a limited company and 0 otherwise. Finally, at bank level, bank capitalization was included in the regression models, measured as CET 1 (common equity tier 1), calculated as the TIER1/asset ratio.

3.3 The merged longitudinal predictive model

The model applied to study the joint effect of financial ratios, corporate governance variables and bank-firm information was a merged longitudinal predictive model, which performs better than both separate longitudinal predictive models and a single model (Figini & Giudici, 2011). The combined model finds justification in the Bayesian paradigm, which provides a unified and intuitively appealing approach to the problem of drawing inferences from observations. Bayesian statistics view statistical inference as a problem of belief dynamics, in which evidence about a phenomenon is used to revise and update the knowledge about it. Following Bayesian theory (see, e.g., Bernardo & Smith, 1994) and the approach by Figini and Giudici (2011), we first observed a score πi (i = 1, …, n) derived from financial ratios; a score πi* (i = 1, …, n) derived from corporate governance data; and a score πi** (i = 1, …, n) derived from bank-firm information. Second, starting from the original training data D of size n (composed of k financial variables, z corporate governance variables and p bank-firm variables), we generated r new training sets by sampling examples from D uniformly and with replacement, thereby obtaining r bootstrapped training sets. Third, we built r predictive models on the r bootstrapped training sets from the previous step, separately for the financial ratios, corporate governance and bank-firm variables. In the fourth step, we computed the variances of πi, πi* and πi**. In the fifth step, we derived δi as

$$\delta_{i} = \frac{\frac{1}{\sigma^{2}(\pi_{i})}}{\frac{1}{\sigma^{2}(\pi_{i})} + \frac{1}{\sigma^{2}(\pi_{i}^{*})} + \frac{1}{\sigma^{2}(\pi_{i}^{**})}}$$

The final probability of default for each SME was a linear combination of πi, πi* and πi** weighted by δi, computed as follows:

$$\mathrm{PD}_{i} = \delta_{i}\,\pi_{i} + (1 - \delta_{i})\,\pi_{i}^{*} + (1 - \delta_{i})\,\pi_{i}^{**}, \quad i = 1, \ldots, n$$

It can be interpreted as a Bayesian before posterior probability for the models proposed. It should be noted that the data indicator δi had to satisfy suitable regularity conditions: σ²(πi**) + σ²(πi*) + σ²(πi) ≠ 0, σ²(πi**) < ∞, σ²(πi*) < ∞ and σ²(πi) < ∞.
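For a single firm, the weighting and combination steps can be sketched as follows. The sketch takes the r bootstrapped scores from each of the three sources as given, uses their sample variance for σ², and, as an assumption on our part, uses their mean as the point score for each source. The score values and function names are hypothetical. The combination follows the formula for PD exactly as printed, with both non-financial scores weighted by 1 − δi.

```python
from statistics import mean, variance


def delta_weight(var_fin, var_gov, var_bank):
    """Share of total precision (inverse variance) held by the financial score."""
    precisions = [1.0 / var_fin, 1.0 / var_gov, 1.0 / var_bank]
    return precisions[0] / sum(precisions)


def merged_pd(fin_scores, gov_scores, bank_scores):
    """Combine three sets of bootstrapped scores for one firm.

    Returns the merged probability of default and the weight delta.
    """
    delta = delta_weight(variance(fin_scores),
                         variance(gov_scores),
                         variance(bank_scores))
    # Assumption: the point score of each source is the bootstrap mean.
    pi, pi_star, pi_2star = mean(fin_scores), mean(gov_scores), mean(bank_scores)
    pd_i = delta * pi + (1 - delta) * pi_star + (1 - delta) * pi_2star
    return pd_i, delta


# Hypothetical bootstrapped scores (r = 3) for one firm.
fin = [0.10, 0.12, 0.14]    # financial ratios: low variance
gov = [0.20, 0.30, 0.40]    # corporate governance
bank = [0.05, 0.15, 0.25]   # bank-firm information

pd_i, delta = merged_pd(fin, gov, bank)
```

Because the financial score is the most stable across resamples in this toy example, it receives most of the weight (δi is about 0.93).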

Following Figini and Giudici (2011), we further estimated the posterior expectation of θ as follows:

$$\mathrm{E}(\theta \mid y) = \frac{k + a_{0}}{n + a_{0} + \beta_{0}} = \frac{n}{n + a_{0} + \beta_{0}} \cdot \frac{k}{n} + \frac{a_{0} + \beta_{0}}{n + a_{0} + \beta_{0}} \left( \frac{a_{0}}{a_{0} + \beta_{0}} \right)$$
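The two forms of this expression can be checked numerically. The sketch below reads a0 and β0 as the parameters of a Beta prior and k as the number of events observed in n trials, which is the standard reading of this formula but is our assumption, since the text does not spell it out. The input values and function names are purely illustrative.

```python
from fractions import Fraction


def posterior_mean(k, n, a0, b0):
    """Posterior mean of theta after k events in n trials, Beta(a0, b0) prior."""
    return Fraction(k + a0, n + a0 + b0)


def as_weighted_average(k, n, a0, b0):
    """Same quantity written as a weighted average of k/n and the prior mean."""
    w = Fraction(n, n + a0 + b0)  # weight given to the observed frequency
    return w * Fraction(k, n) + (1 - w) * Fraction(a0, a0 + b0)


# Illustrative integer inputs: 3 events in 10 trials, uniform Beta(1, 1) prior.
direct = posterior_mean(3, 10, 1, 1)
decomposed = as_weighted_average(3, 10, 1, 1)
```

With these inputs both forms give 1/3: the observed frequency 3/10 is pulled towards the prior mean 1/2, with weights 10/12 and 2/12 respectively.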

4 Results

4.1 Descriptive statistics

Looking at financial ratios, the mean values of return on investment and EBITDA/turnover were negative, indicating that the firms in our sample were, on average, unprofitable. The degree of indebtedness, on the other hand, was almost as high as the level of liquidity (taking into account the inventory).

The CEO was also chair of the board of directors in less than 50% of the firms in the sample; 20% of directors were not officers and did not have any affiliations with the firm. The maximum number of years that a CEO had served on the board was 37, and the median was 9. In 80% of the firms, a shareholder (or more shareholders belonging to the same family) held the majority of the shares. The number of directors on the board was equal to 3.19 on average, with a maximum value of 25. Women constituted 22% of the directors on average; the mean percentage of independent directors on the audit committee was 29%.

By splitting the sample between default and non-default firms, we could observe that although it was negative for both default and non-default firms, the return on investments was, on average, significantly lower for default firms. Moreover, firms that were more indebted also presented a lower degree of liquidity. Additionally, default firms presented a higher number of credit-limit violation days on a checking account in a month and a higher value of overdue payments at the end of the month. Finally, default firms were smaller than non-default ones (Table 5).

Table 5 Descriptive statistics

In terms of the corporate governance indicators, we observed that in default firms, the CEO was generally not also chair of the board of directors, and he/she tended to lead the firm for a shorter period (5.3 years on average) than CEOs of non-default firms (26.5 years on average); the degree of ownership concentration was slightly higher, while the board size was smaller; finally, there was a significantly lower percentage of women on the board.

Table 6 shows the pairwise correlations, where we can observe that while the return on investments, the EBITDA/turnover ratio and the acid test were negatively correlated to default risk, the total debts/(total debts + equity) ratio and the total debts/EBITDA ratio were significantly and positively correlated. CEO duality, CEO tenure, board independence and board diversity were significantly and negatively correlated with SME default, while ownership concentration was significantly and positively correlated to default risk. Finally, the number of months that firms were overdrawn for, and overdue payments, were both significantly and positively related to SME defaults.

Table 6 Descriptive statistics for default and non-default firms

4.2 Results of merged longitudinal predictive model

Table 7 shows the results obtained by applying a longitudinal predictive model to estimate the predictive power of financial ratios.

Table 7 Chosen financial ratios for one step LPM

The results suggest that the degree of indebtedness is significantly and positively related to default risk: more indebted SMEs have a higher probability of failing, so choices about firms’ financial structure should not be underestimated.

Table 8 shows the results obtained by applying a longitudinal predictive model to estimate the predictive power of corporate governance variables.

Table 8 Chosen corporate governance variables for one step LPM

We found that CEO tenure is negatively related to SME defaults. This means that when a CEO sits on the board for a long time, the probability of a default is reduced. Again, ownership concentration is positively related to the probability of default. This means that for SMEs, the presence of a majority shareholder is not a guarantee of the firm’s continuity. Furthermore, the probability of default is affected by board diversity, in such a way that a higher percentage of women on the board reduces the default risk.

Table 9 shows the results obtained by applying a longitudinal predictive model to estimate the predictive power of bank-firm hard information.

Table 9 Chosen bank-firm hard information for one step LPM

The number of loans overdue by 180 days in the previous 12 months and the number of months overdrawn – cash and signature in the preceding 12 months – significantly affect the probability of default.

Table 10 shows the results for a single model, in which only the leverage ratio, CEO tenure, ownership concentration and the number of loans overdue by more than 180 days in the last 12 months are significant.

Table 10 Chosen predictors for one step LPM

4.3 Robustness check

To corroborate the robustness of our results, we also used an alternative and continuous measure of default risk: financial distress (Migliani et al., 2015). The measure for this variable was based on the model developed by Zmijewski (1984). The Zmijewski Financial Score (ZFS) is one of the most widely used financial distress-prediction models (Carcello & Neal, 2003; Hay et al., 2007). The ZFS is constructed based on an index incorporating multiple financial ratios representing firm profitability (net income/total assets), leverage (total debt/total assets) and liquidity (current assets/current liabilities). A higher ZFS indicates a greater likelihood of financial distress.
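As an illustration of how such an index is built, the sketch below combines the three ratios named above into a linear score and maps it to a probability with the standard normal CDF, since the original model is a probit. The coefficients are those commonly quoted for Zmijewski (1984); they are included here as an assumption and should be checked against the original study. The input figures and function names are hypothetical.

```python
from math import erf, sqrt

# Coefficients as commonly quoted for Zmijewski (1984); treat them as an
# assumption and verify them against the original study before any real use.
INTERCEPT, B_PROFIT, B_LEVERAGE, B_LIQUIDITY = -4.336, -4.513, 5.679, 0.004


def zmijewski_score(net_income, total_debt, current_assets,
                    current_liabilities, total_assets):
    """Linear index built from profitability, leverage and liquidity ratios."""
    profitability = net_income / total_assets
    leverage = total_debt / total_assets
    liquidity = current_assets / current_liabilities
    return (INTERCEPT + B_PROFIT * profitability
            + B_LEVERAGE * leverage + B_LIQUIDITY * liquidity)


def distress_probability(score):
    """Map the index to a probability with the standard normal CDF."""
    return 0.5 * (1.0 + erf(score / sqrt(2.0)))


# Hypothetical balance-sheet figures for two firms (same monetary unit).
healthy = zmijewski_score(10, 30, 50, 25, 100)
strained = zmijewski_score(-5, 90, 20, 40, 100)
```

The loss-making, highly leveraged firm receives the higher score, in line with the interpretation that a higher ZFS indicates a greater likelihood of financial distress.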

We ran a merged longitudinal predictive model for this dependent variable. Table 11 shows the results and confirms the robustness of our analysis.

Table 11 Chosen predictors for one step LPM (Financial distress)

5 Concluding discussion

This paper aimed to examine the joint effect of financial ratios, corporate governance and bank-firm information on SME defaults by applying a merged longitudinal predictive model (Figini & Giudici, 2011) grounded in the Bayesian literature (Bernardo & Smith, 1994). This served to overcome the difficulty, encountered in previous studies, of gathering large amounts of information on SMEs, given their small size and the reluctance of banks to share credit relationship data (Altman et al., 2013).

Our main results relating to financial ratios suggest that leverage is positively related to SMEs’ default risk. Therefore, we suggest that firms pay attention to their level of indebtedness and also to the structure of their debt. Of particular interest are our findings about the effect of corporate governance variables on SMEs’ default risk. Specifically, the results show that CEO tenure has a significant and negative effect on the probability of default, meaning that when a CEO serves on the board for a long period of time, the default risk is reduced. By linking this finding to previous findings, we would argue that when a CEO (probably also the founder of the firm) refrains from acting as a dictator and is surrounded by other people who share his or her vision and are involved in strategic planning, CEO tenure has a positive effect on firm survival. The results further suggest that ownership concentration is positively related to the probability of default. For SMEs, therefore, concentrated ownership may constitute a limitation, because it curtails the possibilities for development of the enterprise and, in a period of crisis, may not allow the firm to raise the capital necessary for recovery. This is particularly true for family firms, which are known to be reluctant to expand the ownership structure in order to maintain control. Such a myopic vision could compromise firms’ survival and lead businesses towards bankruptcy. In addition, our empirical analysis highlighted that a higher percentage of women on boards reduces the default risk. The reason for this could be that women tend to communicate more effectively (Joy, 2008) and thus enhance the dissemination of information from the board to investors (Gul et al., 2011). Moreover, they allocate more time to monitoring and have a significant impact on board inputs by having better attendance records and joining more committees (Adams & Ferreira, 2009).
With regard to bank-firm information, the number of loans overdue by 180 days in the preceding 12 months and the number of months in which firms are overdrawn (cash and signature) in the previous 12 months are two key aspects that should be monitored in a firm’s relationship with banks. Both are indicative of a firm’s inability to honor its debts on time and can be interpreted as a signal of inadequate management of working capital and financing sources.

Despite these interesting results, this paper suffers from some limitations, which indicate new directions for future research. Firstly, we examined a limited number of years. It would therefore be interesting to replicate the complete analysis over a longer time horizon, in order to capture changes in corporate governance and bank-firm relationships, which are unlikely to save a firm once it is already on the verge of bankruptcy. Secondly, although we included three different categories of default determinants in our model, we did not have qualitative data, which could be collected through a field survey. Finally, even though we applied a robust technique, an interesting development for this research could be the application of alternative techniques such as neural networks, support vector machines (SVMs) and other machine learning models (Jones et al., 2015, 2017).