1 Introduction

Nowadays, models that can predict the bankruptcy of a company are of interest to various economic entities, such as banks, credit agencies, governments, and financial analysts, not to mention customers and suppliers. Although bankruptcy detection models have been gradually developing since the 1960s (Beaver, 1966; Altman, 1968), the vast majority of them are still based only on accounting and financial variables as explanatory factors (Fijorek & Grotowski, 2012; Attaran et al., 2012). Ernst and Young (2010) suggested that most managers believe that innovation (specifically in management processes) can play a fundamental role in accelerating a company’s growth. Therefore, corporate governance variables – such as manager turnover, which introduces new entrepreneurship skills, knowledge, and external relationships (Paoloni et al., 2020), and the presence of internal or external auditors (Cenciarelli et al., 2018) – can be a crucial factor in increasing the competitive advantage of a corporation (Chen et al., 2014; Agostini & Nosella, 2017). Indeed, from the 1990s onwards, various authors (e.g., Gilson, 1990; Daily & Dalton, 1994; Gales & Kesner, 1994; Deng & Wang, 2006; Donker et al., 2009; Bredart, 2016; Manzaneque et al., 2016) have investigated the link between financial distress and corporate governance, confirming the significant relationship between corporate governance variables and the predictive power of bankruptcy forecast models. However, despite the increase in the amount of research on corporate governance variables, which covers multiple disciplines, scholarly literature on the topic is still limited and fragmented (Martín-de Castro et al., 2019), especially with regard to the role and impact of corporate governance on companies’ turnarounds. Research tends instead to remain focused on the impact of financial variables in predicting corporate defaults.
Therefore, the aim of this paper is to test the effectiveness of internal corporate governance mechanisms pertaining to the board of directors in anticipating corporate defaults in Italy – in which research on the overall analysis of this issue is still lacking. The study of the inclusion of corporate governance variables is particularly suitable to the Italian context, as this country is typically characterized by non-listed family firms, which tend to operate at higher levels of risk than non-family firms (Contreras et al., 2021) and are normally unable to alter their management (or the original entrepreneur).

Moreover, despite the increasing use of corporate default forecasts in various fields, new, more efficient, and more reliable models for predicting a crisis have not yet been developed. Stemming from Altman (1968), most existing literature has relied on “traditional” statistical models, such as Logit, probit, and linear discriminant analysis. Recently, however, researchers have become increasingly interested in analysing the ability of machine learning models to predict bankruptcy. Previous literature, such as Barboza et al. (2017), has shown that “new age” models are able to offer more accurate forecasts than “traditional” methods, like Logit. Therefore, the second contribution of this paper is to reinforce scholarly literature in this field. In doing so, this paper will confirm the abilities of the machine learning models that previous literature (Jones et al., 2017; Barboza et al., 2017) has shown to be the most efficient and able to offer better predictions than Logit, specifically the Random Forest model. Further testing the efficiency of machine learning techniques in predicting corporate defaults matters all the more because of their various practical applications, especially for users such as credit agencies, banks, and investment companies. In fact, as suggested by Althey (2018), we can expect these techniques to have a significant impact on the field of economics and finance within a short period of time. For example, these techniques could be used when solving policy and decision problems (Kleinberg et al., 2015), when predicting loan repayments and attempting to improve credit scoring (Bjorkegren & Grissen, 2019), or when enhancing economic models (Althey, 2018).
The increasing use of these techniques (Althey, 2018) occurs due to the fact that they can be easily applied without the need to manipulate the original database, as they are resilient to a series of statistical issues, such as omitted variables, multicollinearity, outliers, and heteroscedasticity.

To this end, we use a unique set of non-listed Italian UTP positions to compare the predictive performance of a Logit and a Random Forest model using the ROC (Receiver Operating Characteristic) curve, a method frequently used in previous literature (Swets et al., 2000; Jones et al., 2015, 2017; Barboza et al., 2017). Banks classify positions that will probably not respect their contractual obligations as UTP positions. This is one of the more innovative elements of this study, as previous researchers have mainly concentrated on NPLs (Non Performing Loans, i.e., borrowings that are already long overdue and so are officially in a distressed situation). We therefore adopt a different perspective from the literature on the application of machine learning techniques to the prediction of corporate defaults. We would like to highlight, from the outset, that the ability to correctly forecast the classification as UTP has significant practical managerial consequences: UTP is a status that anticipates the company’s distress and therefore allows managers to find timely and realistic solutions to face a potential financial crisis. The development of models that can predict these situations and send alert signals is of particular interest to all the stakeholders of a company (Davis & Karim, 2008; Dallocchio & Tron, 2020), to banks (as shown by the European Central Bank’s guidance to banks on non-performing loans; see Footnote 1), and to governments (such as in the new Italian Code of Bankruptcy; see Footnote 2).

As predictive variables, we use the turnover of the members of the board of directors and the number of CEOs in the years prior to the crisis as corporate governance indicators, as in Elloumi & Gueyiè (2001), Fahlenbrach et al. (2007), Lin et al. (2020), and Fernando et al. (2020), along with the presence of statutory and external auditors, as in Bredart (2014) and Cenciarelli et al. (2018). The final results confirm that financial variables (Z-Score) have a primary role in predicting corporate distress situations. However, they also corroborate the importance of corporate governance factors. These conclusions are further confirmed by the fact that, when comparing the models with and without the corporate governance variables, we find that the ROC curve is higher for those models that also include corporate governance variables, as in Liang et al. (2016).

2 Literature review

Literature on predicting bankruptcy is quite extensive. However, as highlighted by Wang et al., (2014), there is still no mature or definite theory of corporate failure since the “most reliable” method of predicting corporate bankruptcy has not yet been identified. After the first pieces of research on the topic, performed in the United States by Tamari (1964), Beaver (1966), and Altman (1968), several approaches for forecasting bankruptcies have been elaborated upon and improved. In the 1990s, researchers started to develop more sophisticated models by also including new strategic variables and financial/accounting measures (Altman & Saunders, 1997; Amigoni, 1998; Eccles, 1991). The ability of corporate governance factors to predict corporate defaults, as confirmed by various authors (e.g., Gilson 1990; Daily & Dalton, 1994; Gales & Kesner, 1994; Deng & Wang, 2006; Donker et al., 2009; Bredart, 2016; Manzaneque et al., 2016; Fernando et al., 2020), is explained by the postulates of Agency Theory. According to this theory (Donker et al., 2009), managers are more focused on obtaining short-term results to maximize their compensation and rewards, while shareholders tend to choose long-term strategies. Therefore, this leads to an ethical conflict between managers, like the CEO or the sole director, and the shareholders. Literature in this field is mainly based on the analysis of the US companies (Manzaneque et al., 2016). However, corporate governance mechanisms can differ significantly from one country to another, which is one of the reasons why the extension of this analysis to other geographical contexts is necessary in order to corroborate the existing literature.

Our interest is mainly focused on the board of directors, because it is considered to be one of the main internal corporate governance mechanisms (Norwahida et al., 2012) and a good measure of a company’s ability to create and leverage intellectual capital. The board of directors and the top management of a company have a crucial role in the success of a turnaround, as they have the power to implement a series of strategic actions that might prevent or solve a crisis (Porter, 1987; Garzella, 2005; Grant, 2011; Leng et al., 2021). As a consequence, top management can be considered the most important figures when it comes to renewing the structure and the strategy of a company during a corporate turnaround (Lohrke et al., 2004). As highlighted by previous literature (Schiuma et al., 2008), people with an adequate level of competencies and skills can create a virtuous circle, generating new ideas and techniques able to innovate the product/processes of a company, thereby improving performance. Moreover, as supported by Santana et al., (2017), management in declining firms can also replace the usual downsizing responses, highlighting this paper’s potential relevance in the world of start-ups and small firms (Zingales, 2000). The board has the ability to be a key factor in determining the future of a business. However, overconfident CEOs can lead a company to face higher risks of bankruptcy (Leng et al., 2011, 2021). As a consequence, weak corporate governance can increase the likelihood of opportunistic behavior occurring in the management team, which could lead to a reduction in profitability and the overall value of a company (La Porta et al., 2000), increasing the likelihood of financial distress. Consequently, the interaction between the role of the board and the likelihood of financial distress should be examined. 
In this regard, previous studies (Goodstein et al., 1994; Yemarck, 1996) have been more focused on analyzing the problems related to the size of the board, revealing that larger boards tend to incentivize opportunistic behaviors by the management of a company. By contrast, smaller boards and a larger percentage of independent individuals can increase the performance of the company and reduce the likelihood of financial distress (Jensen, 1993; Fernando et al., 2020). Several studies have also analyzed the impact of management turnover on a firm’s performance, with the general consensus being that the likelihood of management turnover is negatively related to firm performance (Huson et al., 2004). Warner et al. (1988) discovered a positive relationship between low stock returns and the probability of the turnover of the CEO, president, or board chairman. Similarly, Kim (1996) empirically demonstrates that firm stock returns have a persistent negative effect on turnover probability. On the other hand, Huson et al. (2004) find that investors view turnover announcements as good news, presaging performance improvements. However, despite the increasing amount of research on the impact of the turnover of management and board members on the probability of financial distress (Gilson, 1990; Elloumi & Gueyiè, 2001), literature in this area is still limited and is primarily focused on data from the US.

Another key aspect of a company’s corporate governance system is the presence of internal and external auditors. Literature has shown that the presence of internal and external audit systems can have a significant impact on changes to a company’s financial performance and on its probability of default (see Guo et al., 2016 and Cenciarelli et al., 2018, among others). Internal and external auditors can guarantee the quality of the information in the financial reports that the company provides to investors (Bratten et al., 2013), and their role has relevant consequences during a financial crisis (Cenciarelli et al., 2018). The presence of an audit committee can also have a significant positive impact in preventing the risk of fraud and irregularities (Beasley et al., 2000). For distressed firms in particular, statutory auditors and external auditors are obliged to judge the ability of the company to operate as a going concern for the following 12 months. With the European Union’s 2015/848/EU regulation, auditors were required to promptly communicate to the top management of a company the presence of indicators of financial distress. Research in this field has shown auditors’ ability to anticipate the emergence of a financial crisis (Bhimani et al., 2009). Therefore, their presence helps a company to avoid triggering this event. Research on this issue is still limited, especially in European countries (Cenciarelli et al., 2018).

Until recent times, methodological approaches to predicting corporate defaults were based mainly on Logit/probit models (Ohlson, 1980; Aziz et al., 1988; Platt & Platt, 1990; Ward, 1994; Back et al., 1996; McGurr & DeVaney, 1998; Kahya & Theodossiou, 1999; Beyonon & Peel, 2001; Neophytou et al., 2004; Lin & Piesse, 2004; Westgaard & Van der Wijst, 2001; Foreman, 2003; Brockman & Turtle, 2003; among others) and the discriminant analysis model (Altman, 1968; Edmister, 1972; Piesse & Wood, 1992; Altman & Narayanan, 1997; Pompe & Feelders, 1997; McGurr & DeVaney, 1998; Yang et al., 1999; Altman et al., 2013; among others), whose merits have been studied extensively in literature (Efron, 1975; Ohlson, 1980; Altman et al., 1994; among others). However, Begley et al. (1996) supported the need for the development of new models, since traditional ones based on the theories of Altman (1968) and Ohlson (1980) have shown their weaknesses.

A new generation of corporate default predictions has arisen, including studies using hazard models, which tend to predict corporate defaults better than traditional ones (Shumway, 2001). These techniques are not binary classifiers as they calculate the probability of a corporate default over time (Chava & Jarrow, 2004; Tian et al., 2015; among others). They are therefore more suitable in the long run.

The need to improve the forecast of corporate defaults has recently led to the development of new types of models based on machine learning techniques, such as generalized boosting, AdaBoost, and Random Forest (Jones et al., 2017). Samuel (1959) proposed the concept of machine learning, defining it as “a discipline that gives computers the ability to learn without a clear program”. These techniques are based on the running of a series of processes that continue to improve the classification of observations using their common patterns (Tian et al., 2012), thereby learning from experience.

One of the most famous machine learning techniques is the artificial neural network, which is very useful when it comes to modelling complex and non-linear relationships “by mimicking the structure of the brain and connecting artificial neurons using simple structures” (Kim et al., 2020). Machine learning techniques have been used in default predictions before the 1990s, and several authors (Yang et al., 1999; Zhao et al., 2014; Geng et al., 2015; Jones et al., 2017) have demonstrated that they have both better prediction performances than Logit/probit and, on average, good levels of efficiency. They reach a good level of reliability, especially in the Italian context, wherein small companies often do not excel with regard to the accuracy of their accounting information (Falavigna, 2012; Zhao et al., 2014). Jones et al. (2017) identified five main issues with the artificial neural network:

  i) poor performance in case of unbalanced data;

  ii) high error rate with a small sample;

  iii) high difficulties in selecting the hidden layer;

  iv) less capacity to handle large numbers of potentially irrelevant inputs and to handle both categorical and continuous data;

  v) lack of interpretability of data.

Another machine learning technique is the support vector machine, which is widely used in various fields, including corporate default prediction. Support vector machines are based on the concept of a separating hyperplane (Jones et al., 2017), which allows us to identify the greatest distance between the most similar observations that are oppositely classified (Cortes & Vapnik, 1995; Noble, 2006). Therefore, when a sample is completely separable into groups, support vector machines are able to build very accurate models. However, when managing economic and financial data, complete separability is virtually impossible. Therefore, support vector machines allow for a margin of error (Zhou et al., 2014). Support vector machines have been notably used by researchers in corporate default predictions (Min & Lee, 2005; Yu et al., 2010), and have had a predictive quality similar to (Barboza et al., 2017; Jones et al., 2017) or even better than (Shin et al., 2005) the artificial neural network. Liang et al. (2016) also showed that support vector machines are more reliable in anticipating corporate defaults when using financial ratios and corporate governance indicators. Similar to neural networks, the limitations of support vector machines are their lack of interpretability and difficulties in handling large numbers of potentially irrelevant inputs (Jones et al., 2017).

A third technique applied in machine learning is boosting. This technique is able to find the model that best classifies a sample. The boosting technique, based on the continuous use of different sets of the initial sample, creates various training sets and identifies the one with the lowest error rate (Begley et al., 1996; Hastie et al., 2009). Unlike the Logit model, the boosting technique is resistant to over-fitting and, unlike neural networks and support vector machines, is able to deal with irrelevant inputs. It also tends to be better at handling mixed types of data (Jones et al., 2017). Due to these characteristics, this technique – together with AdaBoost, a derived algorithm – has been able to predict corporate defaults with a remarkable degree of reliability: significantly higher than that of Logit/neural networks/support vector machines (Barboza et al., 2017; Jones et al., 2017; Aliaj et al., 2020).

The Random Forest model is similar to boosting, but it is based on the concept of charting decision rules using a tree structure. This technique, created by Breiman (2001), randomly selects a subset of characteristics at each node of the tree, following a bagging technique. Therefore “a particularly strong predictor in the dataset (along with some moderately strong predictors) will be used by most if not all the trees in the top split” (Jones et al., 2017). A more detailed and precise description of Random Forest can be found in Booth et al. (2014) and Calderoni et al. (2015). The Random Forest technique has various advantages: (i) it is robust to outliers; (ii) it is robust to missing data; and (iii) it allows for the identification of the importance of each variable in the classification results (Yeh et al., 2014; Jones et al., 2017). Several authors (Olson et al., 2012; Barboza et al., 2017; Jones et al., 2017; Li & Wang, 2018; Aliaj et al., 2020) have shown that the Random Forest model is more interpretable and can achieve more accurate results in predicting corporate defaults than other machine learning algorithms.
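The two sources of randomness described above (bagging and a random subset of characteristics) can be illustrated with a minimal sketch. It is not the implementation used in this paper: each tree is reduced to a single split (a stump), so the per-node feature subset coincides with a per-tree subset, and the toy data, feature order, and function names are all hypothetical.

```python
import random
from collections import Counter

def bootstrap_indices(n_rows, rng):
    """Bagging step: draw n_rows row indices with replacement."""
    return [rng.randrange(n_rows) for _ in range(n_rows)]

def best_stump(X, y, rows, features):
    """Among the candidate features, keep the (feature, threshold) split
    that misclassifies the fewest rows of the bootstrap sample."""
    best = None
    for f in features:
        for t in sorted({X[i][f] for i in rows}):
            left = [y[i] for i in rows if X[i][f] <= t]
            right = [y[i] for i in rows if X[i][f] > t]
            l_lab = Counter(left).most_common(1)[0][0]
            r_lab = Counter(right).most_common(1)[0][0] if right else l_lab
            errors = sum(v != l_lab for v in left) + sum(v != r_lab for v in right)
            if best is None or errors < best[0]:
                best = (errors, f, t, l_lab, r_lab)
    return best[1:]

def fit_forest(X, y, n_trees=25, n_candidates=2, seed=0):
    """Grow n_trees stumps, each on its own bootstrap sample and its own
    random subset of n_candidates features."""
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        rows = bootstrap_indices(len(X), rng)
        features = rng.sample(range(len(X[0])), n_candidates)
        forest.append(best_stump(X, y, rows, features))
    return forest

def predict(forest, x):
    """Majority vote over all stumps."""
    votes = [l if x[f] <= t else r for f, t, l, r in forest]
    return Counter(votes).most_common(1)[0][0]

# Hypothetical toy data: [z_score, board_turnover, number_of_ceos]
X = [[3.1, 0, 1], [2.8, 1, 1], [2.9, 0, 1], [3.4, 1, 1],
     [0.9, 3, 2], [1.1, 4, 3], [0.7, 2, 2], [1.3, 3, 3]]
y = [0, 0, 0, 0, 1, 1, 1, 1]  # 1 = distressed, 0 = healthy

forest = fit_forest(X, y)
distressed_vote = predict(forest, [0.5, 4, 3])
healthy_vote = predict(forest, [3.5, 0, 1])
```

A full Random Forest grows deep trees and redraws the feature subset at every node; the stump version only keeps the voting and resampling logic visible.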

Despite the reliability of these techniques in predicting corporate defaults, the amount of empirical research on the topic is still limited. Tsai et al. (2014), using a set of Taiwanese companies, compared the predictive power of three neural network techniques, support vector machines, and Random Forest, demonstrating the ability of Random Forest to perform significantly better. Heo et al. (2014), using a sample of Korean companies, showed that AdaBoost has more predictive power than other classifiers, especially for large companies. Kim et al. (2014), using publicly traded U.S. restaurants, demonstrated that, from 1988 to 2010, the AdaBoost and Random Forest models were the best predictors of performance. They had the smallest degree of error overall and in terms of type I error rates, or rather the probability of rejecting the null hypothesis given that it is true. Both Jones et al. (2017) and Barboza et al. (2017), by comparing several statistical methods and using a set of US firms, showed that the Random Forest and the boosting techniques are the best-performing classifiers in several cases. The applications of machine learning techniques in Italy are still limited. Bragoli et al. (2019), using a dataset of Italian companies from 2007 to 2015, showed that machine learning techniques outperform traditional classifiers. Aliaj et al. (2020), using a large sample of Italian companies, demonstrated that, in the Italian context, Random Forest provides the best results, thus corroborating the findings of Barboza et al. (2017). Donato et al. (2020) showed that, using a non-parametric supervised classification algorithm on a random sample of 100 non-listed SMEs, it is possible to predict the distress of a firm sufficiently in advance (4–5 years prior to failure).

From the analysis of the existing literature, it is reasonably evident that boosting and Random Forest are the best techniques for predicting corporate defaults. Moreover, unlike other machine learning techniques, the Random Forest model allows us to easily interpret the results – a fundamental factor for helping executives to improve their businesses. Barboza et al. (2017), despite the increasing number of studies in this field, suggested that “new studies, exploring different models, contexts and datasets, are relevant, since results regarding the superiority of these models are still inconclusive”. Therefore, this paper aims to fill the research gap on the ability of Random Forest and Logit models to predict not only the default of a company, but also its classification as UTP. This issue has not yet been adequately covered in academic literature, despite its remarkable impact from a managerial perspective. In fact, the UTP classification intercepts the signals of a stressed situation and anticipates the potential status of default (Ambrosini & Tron, 2016; Caputo & Tron, 2016). This feature offers an important option for managers, who can implement several corrective actions. After the default and – even worse – after the bankruptcy, the number of adoptable solutions is strictly limited (Tron et al., 2018; Ferri et al., 2020). Furthermore, several papers have revealed the impact of corporate governance indicators (such as the composition and mandated duration of the main governance and control bodies, changes in majority shareholders, etc.) on default probability (Elloumi & Gueyiè, 2001; Switzer et al., 2018; Lin et al., 2020; Fernando et al., 2020) and on the turnaround outcome (Miglani et al., 2020). However, previous studies have not yet thoroughly investigated the impact of governance variables in predicting the default of a company using new machine learning models, like Random Forest.
Only Liang et al. (2016), on the basis of a sample of Taiwanese companies, demonstrated a better performance in predicting defaults when relying on both corporate governance and financial variables, rather than on financial variables alone. Therefore, as suggested by Barboza et al. (2017), we used machine learning models, also fed by corporate governance variables, to analyze the Italian context, along with its specific features in terms of company size and corporate governance.

3 Research methodology

Thanks to the support of a leading bank operating in Italy, and following the example of Dallocchio et al. (2020), we collected data on Italian companies classified as UTP in 2014. The database structure allowed us to verify whether or not a company had been placed on the special register of “UTP positions”.

Out of an initial sample of 10,143 companies made available by the bank, we selected 72 that had the following features:

  • private;

  • set up and registered in Italy;

  • out of trouble (“in bonis”) before August 2014 and still “in bonis” in 2017;

  • able to repay interest and/or part of the principal payment;

  • originally included in the “special register” of the bank;

  • positively concluded a restructuring process at least two years before this study began.

Since the initial sample only included private Italian companies that were able to complete a full turnaround process, two additional samples were identified: firms that defaulted during the period between 2014 and 2016, and firms that did not demonstrate economic and financial distress during the period between 2007 and 2016. To construct these two samples, we followed a two-step process.

Firstly, taking the economic sector of the companies included in the first sample and using their revenues as an approximation of company size, we identified, through the Bureau Van Dijk Database (AIDA), a cluster of 34,124 comparable companies (excluding listed companies).

Secondly, still drawing from the same source, we identified 76 companies that defaulted between 2014 and 2016, while the “healthy” companies (once again selected from the aforementioned cluster) were identified through a process of pairwise sampling, which yielded 72 entities. Therefore, the final database was composed of 220 companies: 72 from the original sample (success in restructuring), 72 healthy companies (no economic and/or financial distress between 2007 and 2016), and 76 companies that defaulted between 2014 and 2016. Then, we started collecting economic/financial data through AIDA, and governance data through the data provider CERVED (see Footnote 3). As previously mentioned, existing literature on machine learning models (Barboza et al., 2017; Jones et al., 2017) has only used economic/financial variables. On top of these, we decided to include corporate governance variables, downloaded from CERVED. An overview of the corporate governance variables that were used in this paper, along with their sources, can be found in Table 1.
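The pairwise sampling step can be sketched as follows. The exact matching rule is not spelled out in the text, so the rule used here (same sector, closest revenue, each healthy firm used at most once) is an assumption, and the firm identifiers, sector codes, and revenue figures are hypothetical.

```python
def pairwise_match(targets, candidates):
    """Greedy one-to-one matching: for each target firm, pick the unused
    candidate from the same sector whose revenue is closest."""
    used = set()
    pairs = {}
    for t in targets:
        pool = [c for c in candidates
                if c["sector"] == t["sector"] and c["id"] not in used]
        if not pool:
            continue  # no comparable healthy firm available for this target
        match = min(pool, key=lambda c: abs(c["revenue"] - t["revenue"]))
        used.add(match["id"])
        pairs[t["id"]] = match["id"]
    return pairs

# Hypothetical firms (revenue in EUR millions)
restructured = [
    {"id": "R1", "sector": "C", "revenue": 12.0},
    {"id": "R2", "sector": "C", "revenue": 40.0},
    {"id": "R3", "sector": "G", "revenue": 5.0},
]
healthy_pool = [
    {"id": "H1", "sector": "C", "revenue": 11.0},
    {"id": "H2", "sector": "C", "revenue": 38.0},
    {"id": "H3", "sector": "G", "revenue": 9.0},
    {"id": "H4", "sector": "G", "revenue": 4.5},
]

pairs = pairwise_match(restructured, healthy_pool)
```

Matching one healthy firm to each restructured firm is what keeps the two groups the same size, which is consistent with the 72 and 72 reported above.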

Table 1 Variable definition and source

To predict the classification of a company as a UTP position, this study used, as control variables, two of the most widely adopted and easy-to-use tools for assessing default risk: the Z’-Score and the Z’’-Score. The Z’-Score was created by Altman (1993) as a revision, for non-listed companies, of the original Z-Score (Altman, 1968). In analytical terms, the Z’-Score model is based on five factors:

  • Liquidity (\(\frac{\text{Current Assets} - \text{Current Liabilities}}{\text{Total Assets}}\)),

  • Profitability (\(\frac{\text{Retained Earnings}}{\text{Total Assets}}\)),

  • Productivity (\(\frac{\text{EBIT}}{\text{Total Assets}}\)),

  • Leverage (\(\frac{\text{Book Value of Equity}}{\text{Book Value of Total Liabilities}}\)),

  • Asset turnover (\(\frac{\text{Sales}}{\text{Total Assets}}\)).

The reliability of Z’-Score in measuring the health of small and medium enterprises in the Italian context has been discussed by several authors (Madonna & Cestari, 2015; Paoloni & Celli, 2018; Dallocchio et al., 2020). For emerging countries and non-manufacturing companies, Altman elaborated upon the Z’’-Score (1995), using a correction factor of 3.25 and deleting the asset turnover. In this case, the Z’’-Score proved to be reliable in the Italian context (Altman et al., 2013). Due to the reliability in the Italian context of Z’ and Z’’, and in line with existing literature, we chose to use the two scores as predictive variables. Furthermore, since the Z-Scores allowed us to anticipate the emergence of a crisis, thanks to their ability to recognise the relationship between the potential corporate default and the accounting indicators in the years before insolvency, the management team can implement a coherent strategy for preventing the crisis (Altman & Le Fleur, 1985).
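As an illustration of how the two scores combine the factors listed above, the sketch below computes both from balance sheet items. The weights are those commonly reported for Altman’s Z’ and Z’’ models and are not stated in this section, so they should be checked against the original sources; the sample figures and all names are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Financials:
    current_assets: float
    current_liabilities: float
    retained_earnings: float
    ebit: float
    book_equity: float
    total_liabilities: float
    sales: float
    total_assets: float

def ratios(f):
    """The five factors: liquidity, profitability, productivity,
    leverage, and asset turnover."""
    x1 = (f.current_assets - f.current_liabilities) / f.total_assets
    x2 = f.retained_earnings / f.total_assets
    x3 = f.ebit / f.total_assets
    x4 = f.book_equity / f.total_liabilities
    x5 = f.sales / f.total_assets
    return x1, x2, x3, x4, x5

def z_prime(f):
    """Z'-Score for non-listed firms (weights assumed from Altman's model)."""
    x1, x2, x3, x4, x5 = ratios(f)
    return 0.717 * x1 + 0.847 * x2 + 3.107 * x3 + 0.420 * x4 + 0.998 * x5

def z_double_prime(f):
    """Z''-Score: asset turnover dropped, constant 3.25 added
    (weights assumed from Altman's model)."""
    x1, x2, x3, x4, _ = ratios(f)
    return 3.25 + 6.56 * x1 + 3.26 * x2 + 6.72 * x3 + 1.05 * x4

# Hypothetical company, figures in EUR thousands
firm = Financials(current_assets=400, current_liabilities=300,
                  retained_earnings=150, ebit=80, book_equity=350,
                  total_liabilities=650, sales=1200, total_assets=1000)

zp = z_prime(firm)
zpp = z_double_prime(firm)
```

Lower values of either score indicate weaker financial health, which is why a negative relationship with the probability of distress is expected.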

Their values were obtained by, once again, relying on AIDA. We collected data from 2007 until the “financial distress moment” (identified as the date of inclusion in the special register of the UTP’s position), as reported by the bank agents responsible for distressed debt positions, through interviews and surveys conducted by the authors. In particular, for each specific type of Z-Score, we calculated the average score across two periods: (i) before the financial distress (2007–2011); (ii) during the financial distress (2012–2014). Therefore, an overview over the financial variables that we use in this paper and their source can be found in Table 2.
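The two period averages can be sketched as follows. The yearly scores are invented for illustration, and the helper name `period_average` is hypothetical.

```python
from statistics import mean

def period_average(z_by_year, start, end):
    """Average the yearly scores whose year falls in [start, end]."""
    return mean(z for year, z in z_by_year.items() if start <= year <= end)

# Hypothetical yearly Z'-Scores for one company
z_by_year = {2007: 2.9, 2008: 2.7, 2009: 2.5, 2010: 2.3, 2011: 2.1,
             2012: 1.5, 2013: 1.2, 2014: 0.9}

z_before = period_average(z_by_year, 2007, 2011)  # before the financial distress
z_during = period_average(z_by_year, 2012, 2014)  # during the financial distress
```

The same function would be applied once to the Z’-Score series and once to the Z’’-Score series, giving four averaged predictors per company.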

Table 2 Variable definition and source

Companies with missing data were finally discarded, allowing us to compare the results of the Logit, which suffered from missing data, and the Random Forest. Therefore, the final sample had 112 constituents: 54 successfully restructured companies, 13 defaulted companies, and 45 healthy companies. We controlled the weighting of every industry represented in the sample: no sector had a weighting higher than about 7% for restructured and healthy companies. In Table 3, the descriptive statistics of corporate governance variables and Z-Score are reported.

Table 3 Descriptive statistics by company status

The criteria adopted for the construction of the control samples appear consistent with the research perspective and reflect the actual health status of the companies analyzed (healthy, restructured, failed). We then constructed a dummy variable to approximate the state of the company (0 = healthy company, 1 = restructured or defaulted company). Next, we analyzed the correlations between the variables, which are reported in Table 4. A higher number of CEOs and board members showed a positive relationship with the dummy variable, while their average term in office showed a negative relationship. As expected, we identified a negative relationship between the Z-Score and the probability of default for all considered periods.
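A minimal sketch of the dummy coding and of a correlation check is given below. It assumes Pearson correlation, which the text does not specify, and the six firms are invented purely to show the expected signs.

```python
from math import sqrt

def status_dummy(status):
    """0 = healthy company, 1 = restructured or defaulted company."""
    return 0 if status == "healthy" else 1

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Hypothetical firms: (status, average Z-Score, number of CEOs)
firms = [("healthy", 3.0, 1), ("healthy", 2.6, 1), ("restructured", 1.4, 3),
         ("defaulted", 0.8, 2), ("healthy", 2.9, 2), ("restructured", 1.1, 3)]

dummy = [status_dummy(s) for s, _, _ in firms]
z_scores = [z for _, z, _ in firms]
n_ceos = [c for _, _, c in firms]

r_z = pearson(dummy, z_scores)   # expected sign: negative
r_ceo = pearson(dummy, n_ceos)   # expected sign: positive
```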

Table 4 Correlation analysis

After developing the correlation analysis, we applied the two following models:

  • Logit model.

  • Random Forest Model.

Both models were run first with the Z’-Score and then with the Z’’-Score.

We selected the Random Forest technique, as several authors (e.g., Barboza et al., 2017) have shown it to be the best-performing machine learning classifier. This model, which is robust to the presence of outliers or missing data, is able to identify the importance of each variable in the classification results. The model also corrects the tendency of decision trees to overfit their training set (Friedman, 2001; Schapire & Freund, 2012), which happens when a model fits the training set too closely. The model was estimated using Stata (footnote 4): the Random Forest tree depth was set to 1,000, the number of predictors for each tree was set to 3, the bootstrap sample size was set to 1,000, and the minimum number of cases for a parent node was set to 2. Following the example of Hastie et al. (2009), the classifiers were trained and tested on each dataset using repeated 70/30 random allocations between training and test samples. The training sample therefore included 78 companies, of which 47 were restructured or defaulted and the remaining 31 were healthy. Given the small size of our sample, as a robustness check, we employed a 10-fold cross-validation approach (see Hastie et al., 2009).
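The repeated 70/30 allocation can be illustrated with the sketch below. It assumes a stratified split (the class mix is preserved in the training sample), which is an assumption on our part rather than something stated in the text; with the sample sizes above it reproduces the 78-company training sample with 47 distressed and 31 healthy firms. The function name `stratified_split` and the seeds are hypothetical.

```python
import random

def stratified_split(labels, train_share, seed):
    """Return (train_idx, test_idx), keeping the class mix of `labels` in the training set."""
    rng = random.Random(seed)
    pos = [i for i, y in enumerate(labels) if y == 1]
    neg = [i for i, y in enumerate(labels) if y == 0]
    rng.shuffle(pos)
    rng.shuffle(neg)
    n_train = int(train_share * len(labels))
    n_pos_train = round(n_train * len(pos) / len(labels))
    n_neg_train = n_train - n_pos_train
    train = pos[:n_pos_train] + neg[:n_neg_train]
    test = pos[n_pos_train:] + neg[n_neg_train:]
    return train, test

# 67 restructured or defaulted companies (54 + 13) and 45 healthy ones.
labels = [1] * 67 + [0] * 45

# Repeated random allocations: one split per seed.
splits = [stratified_split(labels, 0.7, seed) for seed in range(5)]
train, test = splits[0]
print(len(train), sum(labels[i] for i in train), len(test))
```

Each split would then be used to fit the classifier on `train` and to evaluate it on `test`.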

Each of the two models has its own advantages and disadvantages. The Logit model is particularly suitable for the economic and financial field, since it is appropriate for predicting binary events and does not require the independent variables to have equal variance in each group, or even to be normally distributed (Hilbe, 2015). Moreover, the Logit model is less subject to overfitting (Hilbe, 2015). The Random Forest model, in turn, is robust to outliers and to missing data (Lantz, 2019). This makes it particularly suitable for analyzing small databases with an optimum level of generalization (Lantz, 2019).

The Logit model tends to require less computation and, being a linear model, is more easily interpretable than the Random Forest model. However, the Random Forest model achieves higher predictive performance than the Logit model (Jones et al., 2017) and does not require any adjustments to the databases used, since variables do not need to be scaled or normalized (Lantz, 2019). Moreover, the Logit model is exposed to several limitations that do not affect the Random Forest model: heteroskedasticity, serial correlation, non-normality of error terms, and unsuitability for nonlinear relationships.

In order to compare the predictive performance of the models, we used ROC curves, a method commonly used in previous literature (Swets et al., 2000; Jones et al., 2015, 2017; Barboza et al., 2017). The ROC curve, which plots the true positive rate (sensitivity) against the false positive rate (1 − specificity), has an AUC (area under the curve) of exactly 0.5 in the case of a random guess. Every useful classifier should therefore reach a value higher than 0.5. As suggested by Jones et al. (2017), a value higher than 0.9 signals a strong classifier, while a value between 0.8 and 0.9 is indicative of a good or useful classifier.
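To make the AUC benchmark concrete, the sketch below computes the area under the ROC curve through its rank interpretation: the share of (distressed, healthy) pairs in which the distressed company receives the higher predicted score, with ties counted as one half. The labels and scores are hypothetical, and `roc_auc` is an illustrative helper rather than the routine used in the paper.

```python
def roc_auc(labels, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

labels = [1, 1, 1, 0, 0, 0]                  # 1 = restructured or defaulted, 0 = healthy
perfect = [0.9, 0.8, 0.7, 0.3, 0.2, 0.1]     # every distressed firm outranks every healthy one
uninformative = [0.5] * 6                    # the same score for every firm
mixed = [0.9, 0.4, 0.7, 0.6, 0.2, 0.1]       # one distressed firm ranked below a healthy one

print(roc_auc(labels, perfect), roc_auc(labels, uninformative), roc_auc(labels, mixed))
```

A classifier that separates the two groups perfectly reaches 1.0, an uninformative one stays at 0.5, and the mixed case falls in between.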

4 Results

Firstly, we ran the Logit model using the Z’-Score (Model 1) and the Z’’-Score (Model 2) on the training sample.

In the two Logit models, corporate governance variables are generally not statistically significant in anticipating the default of a company. However, the turnover of the board of directors, which can often occur unexpectedly as a result of the actions of the shareholder that caused the crisis, is statistically significant at the 5% level in both models and has a negative coefficient.

In both models, the Z-Scores are statistically significant, but their impact differs between the 2007–2011 and the 2012–2014 periods. As expected, the 12–14 Z-Score is statistically significant at the 1% level in both models, with a negative coefficient. Therefore, the models confirm the ability of the Z-Score to anticipate the emergence of a crisis two years before a company is classified as UTP. Furthermore, the sign of the coefficient confirms the expected negative relationship between the Z-Score and the dummy dependent variable: the higher the Z-Score, the lower the probability of default. Interestingly, the impact of the Z’-Score in predicting the status of a company is higher than that of the Z’’-Score. The 07–11 Z-Score is also statistically significant at the 5% level in both Model 1 and Model 2, but the sign of the coefficient is positive, which is an unexpected result. This coefficient could be a consequence of the use of the average of Z-Scores over five years, which is also a limitation of this work. Alternatively, it could be due to the characteristics of the sample, which could be influenced by outliers, a factor that does not influence the Random Forest technique.

Table 5 Logit results

The results in Tables 5 and 6 confirm the overall good ability of the Logit model to correctly classify the status of a company.

Table 6 Logit Classification

Secondly, we ran the Random Forest model using the Z’-Score (Model 3) and the Z’’-Score (Model 4) on the training sample. The main results are shown in Table 7. The results show that Model 3, based on the Z’-Score, is better than Model 4 at classifying the status of a company. Nevertheless, both models achieve a high degree of accuracy for both the training and test samples.

Table 7 Random forest

To interpret the model performance, we used the relative variable importances (RVIs), which report the number of times on average that a variable is used in the decision trees of the model (Hastie et al., 2009; Friedman & Meulman, 2003). An RVI greater than 0 implies that the variable is used in the decision trees of the model and, therefore, contributes to improving the prediction capabilities of the model itself. RVIs are reported in Fig. 1 (Model 3) and Fig. 2 (Model 4) on a scale from 0 to 1.

Fig. 1 RVIs of Model 3. Notes: Model 3 is based on the Random Forest combined with the Z’-Score

Fig. 2 RVIs of Model 4. Notes: Model 4 is based on the Random Forest combined with the Z’’-Score

Generally, all variables contribute to the overall predictive accuracy of the models. However, the strength of the RVIs differs significantly across variables. The results show that financial variables are still the most important indicators for predicting the financial default of a company. The 12–14 Z-Score is the most important variable in both models, confirming the results obtained in the Logit framework. The 07–11 Z-Score is the second most used variable in both models. In Model 4, moreover, its importance is aligned with that of the 12–14 Z-Score, suggesting that the Random Forest model is able to predict the emergence of a crisis, giving the company time to find an appropriate solution. In contrast to the Logit results, the governance variables make a good overall contribution in both models, which confirms the crucial role of corporate governance variables, as their RVI is higher than 0.5 for all variables. The turnover of the members of the board of directors ranks third in terms of importance, while the number of sole directors and the number of chairmen of the board in the years prior to the crisis are the least relevant variables. Similarly, the presence of internal and external auditors does not seem to significantly affect the probability of default. By contrast, the number of CEOs before the crisis is one of the most impactful variables, highlighting the link between the probability of an economic or financial crisis and the stability of the board of directors and, especially, of the person responsible for managing the company.

The RVIs show the difference between the Logit and the Random Forest models. Logit highlights only three significant variables, while Random Forest shows that all of the variables contribute to improving the performance of the model. Jones (2017) shows that Random Forest allows for the inclusion of many variables, even when they are highly correlated. In the case of Logit, this could lead to multicollinearity and overfitting.

While confirming that financial variables are still the most important indicator in predicting the financial default of a company, the results also confirm the impact of corporate governance variables on the probability of bankruptcy. These results corroborate the importance of turnover among the members of the board of directors, as in Elloumi & Gueyiè (2001), suggesting that stability in the composition of the board of directors can positively impact the performance of a company. Furthermore, we also ran the models excluding corporate governance variables. In this framework, we found that the ROC curve was higher for models that also included governance variables, as in Liang et al. (2016). Therefore, the Random Forest results reveal the importance of governance variables, especially with regard to the turnover of members of the board of directors and the number of CEOs in the year before the crisis. These results corroborate the theses of Gilson (1990), Fahlenbrach et al. (2007), and Fernando et al. (2020).

The results of the Logit and Random Forest models also confirm the Z’-Score and Z’’-Score’s ability to predict the emergence of a crisis, in line with previous literature (Dallocchio et al., 2020; Paoloni & Celli, 2018; Madonna & Cestari, 2016; Altman et al., 2013) researching the Italian context.

Furthermore, the Random Forest model corroborates the thesis that not only the Z-Score but also corporate governance variables can predict both the default of a company and its potential classification as UTP, which is a precursor of default. From a managerial standpoint, this is a key feature, because it allows a company to adopt an appropriate restructuring process well before the overall situation collapses. Anticipating the emergence of a crisis is becoming increasingly important in many countries in which banking systems are suffering under the heavy burden of NPLs and the related capital losses. This is also the case for Italy (along with many other continental European countries), whose new Bankruptcy Code requires companies and their managers to adopt restructuring procedures before credit events occur. In this framework, controlling bodies or supervisory boards are also responsible for intercepting signs of economic and financial discomfort in time. In this setting, the option of using simple models, like the Random Forest model, could have a profound impact on the activities of both managing and controlling entities.

As previously mentioned, we compared the predictive performance of the models using ROC curves. The ROC curve data are shown in Table 8, while the curves are shown in Fig. 3 (for the whole sample) and in Fig. 4 (for the test sample). The Random Forest shows a high degree of accuracy in the training phase. However, this does not by itself imply high reliability, as the robustness and credibility of the model depend on its ability to correctly predict the outcomes for the test sample. Both the overall results and the test results displayed in Table 8 show that the “new” machine learning classifier Random Forest significantly outperforms the traditional Logit model, with both the Z’-Score and the Z’’-Score. The ROC area using the Random Forest and the Z’-Score is 0.9357 in the test sample, a value that signals a strong classifier. These results confirm those found in existing literature (Olson et al., 2012; Tsai et al., 2014; Barboza et al., 2017; Jones et al., 2017). Furthermore, these results confirm the suitability of Random Forest for the Italian context, along with its peculiarities, as in the case of Aliaj et al. (2020).

Table 8 ROC

Fig. 3 ROC curve (all sample)

Fig. 4 ROC curve (test sample)

It is worth noting that Random Forest shows the predictive ability of the Z-Score, contrary to the case of Barboza et al. (2017). However, Barboza et al. (2017) used the original Z-Score, while we used the Z’- and the Z’’-Scores, the first of which, in particular, was shown to have high levels of reliability. We also included governance variables, confirming the theories of the aforementioned study and showing that Random Forest allows us to include different indicators and, therefore, to predict bankruptcy cases more efficiently.

Nevertheless, the ROC area varies drastically depending on whether the Z’- or the Z’’-Score is included. The ROC area in the test sample is 0.9357 in the case of Model 3 and 0.8393 in Model 4. It reaches 0.8321 in Model 1 and 0.7750 in Model 2. The results show that both Logit and Random Forest benefit from improved predictive power when the Z’-Score is considered. This result could be linked to the fact that the Z’’-Score was mainly built for use in emerging markets. It also suggests that the ratio of Sales/Total Assets, which is not included in the Z’’-Score, could represent a key factor when anticipating the emergence of a crisis.
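The role of the Sales/Total Assets ratio can be illustrated with the sketch below. It uses the coefficients commonly published for Altman's Z’-Score (private firms) and Z’’-Score (four-ratio version without a constant); we assume, but cannot confirm from this section, that these match the definitions in Table 2. The ratio values are hypothetical.

```python
def z_prime(x1, x2, x3, x4, x5):
    """Z'-Score (commonly published coefficients, assumed here).

    x1 = working capital / total assets, x2 = retained earnings / total assets,
    x3 = EBIT / total assets, x4 = book value of equity / total liabilities,
    x5 = sales / total assets.
    """
    return 0.717 * x1 + 0.847 * x2 + 3.107 * x3 + 0.420 * x4 + 0.998 * x5

def z_double_prime(x1, x2, x3, x4):
    """Z''-Score (commonly published coefficients, assumed here); no sales / total assets term."""
    return 6.56 * x1 + 3.26 * x2 + 6.72 * x3 + 1.05 * x4

# Two hypothetical firms that differ only in sales / total assets.
base = (0.10, 0.05, 0.04, 0.60)
low_turnover, high_turnover = 0.4, 1.2

z1_low = z_prime(*base, low_turnover)
z1_high = z_prime(*base, high_turnover)
z2 = z_double_prime(*base)  # identical for both firms by construction

print(round(z1_low, 4), round(z1_high, 4), round(z2, 4))
```

Only the Z’-Score distinguishes between the two firms, which is consistent with the idea that asset turnover carries information that the Z’’-Score cannot use.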

However, without exception, in all tests the Random Forest produced better outcomes than the Logit model, due to its ability to make better use of the information contained in the corporate governance variables.

5 Robustness tests

To check the robustness of our results, different tests were carried out. Firstly, we adopted a 10-fold cross-validation approach. K-fold cross-validation, as suggested by Hastie et al. (2009), is particularly useful when working with a small sample size, as in our case. This technique is primarily used in applied machine learning to estimate the skill of a model on unseen data. It is therefore used on limited samples to estimate how the model is expected to perform in general when making predictions on data not used during training. We used this technique on both the Logit and the Random Forest model. The results are shown in Table 9.
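The fold construction behind this robustness check can be sketched as follows. The snippet only shows how 112 observations could be partitioned into 10 folds, each used once as the validation set; the helper name `k_fold_indices` and the seed are hypothetical, and fitting and scoring a model on each fold is left out.

```python
import random

def k_fold_indices(n_samples, k, seed=0):
    """Yield (train_idx, test_idx) pairs; every sample is in exactly one test fold."""
    rng = random.Random(seed)
    idx = list(range(n_samples))
    rng.shuffle(idx)
    folds = [idx[i::k] for i in range(k)]  # k folds of nearly equal size
    for i in range(k):
        test = folds[i]
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train, test

folds = list(k_fold_indices(112, 10))
print([len(test) for _, test in folds])
```

The cross-validated AUC is then the average of the AUC values obtained on the ten held-out folds.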

Table 9 K-Fold cross validation

The results did not show a significant change in AUC performance for the Logit models, but the Random Forest model suffered from a 5% reduction, especially in the case of Model 4. However, the results confirm the superiority of Random Forest with the Z’-Score (Model 3), which is still a strong classifier. This technique again confirms the superiority of the Z’-Score for both approaches, and the role of corporate governance variables in the Random Forest model. Moreover, this thesis is also corroborated by the fact that the ROC area in the test sample is higher for Model 1 (Logit – Z’-Score) than for Model 4 (Random Forest – Z’’-Score).

Secondly, as we used a dummy variable to cover three different possible company statuses (healthy, restructured, or defaulted), we ran the model excluding defaulted companies in order to have only two possible statuses behind the dummy. The new sample, therefore, only includes 99 companies. The results of the Logit are reported in Table 10. The 12–14 Z-Score is again statistically significant at the 1% level in the case of Model 5, and at the 5% level in the case of Model 6. However, in this case, governance variables and the 07–11 Z-Score are statistically significant only in Model 5.

Table 10 Logit new

The new ROC curve data are shown in Table 11, and the curves are drawn in Figs. 5 and 6. Previous results are also confirmed in this case. The Random Forest model again demonstrates a higher predictive power than the Logit model in all cases and samples. Here, too, the Z’-Score and the corporate governance variables in the Random Forest prove to be better discriminant indicators for anticipating the emergence of a crisis. This conclusion is also reinforced by the fact that the ROC area is higher for Model 5 than for Model 8, as in the case of the K-fold test.

Table 11 ROC new

Fig. 5 New ROC curve (all sample)

Fig. 6 New ROC curve (test sample)

6 Conclusions

This paper extends prior empirical research on financial distress and corporate governance to the Italian context. The characteristics of Italian companies’ corporate governance systems are more likely to heighten agency problems and could therefore contribute to worsening situations of financial distress. The final results corroborate the thesis on the central role of corporate governance in crisis management, a topic that is relatively new in comparison to studies on financially sound enterprises. This research has produced evidence regarding the importance of corporate governance variables, especially those linked to the board of directors and the top management of a company, in anticipating the emergence of bankruptcies, while the presence of auditors seems to be less relevant. Although financial variables are the most crucial factors in all models, the Random Forest model shows that corporate governance variables play a primary role, especially the renewal of the CEO and the average term of the board of directors. These results suggest that the stability of the CEO, the composition of the board of directors, and the person responsible for managing the company can deeply affect the probability of an economic or financial crisis. This thesis is also corroborated by the fact that the exclusion of these variables deeply affects the performance of the models, reducing their ability to predict bankruptcies. These findings have important implications for family-owned firms and for banks with regard to improving the performance of their credit models.

Using a unique set of Italian UTP companies, we compare the ability of the Logit and Random Forest models to predict bankruptcies. Despite the limited diffusion and use of machine learning techniques, we confirm that the Random Forest outperforms the Logit model in predicting corporate defaults. Our findings and suggestions for corporate default predictions are as topical as ever. Firstly, machine learning techniques have been proven to be particularly effective in forecasting corporate defaults. Secondly, they have remarkable practical applications for various business operators, such as credit agencies and banks. These techniques, which are relatively easy to implement, are stable predictors and are resilient to a series of statistical issues, like omitted variables, multicollinearity, outliers, and heteroskedasticity. Moreover, the RVIs of the Random Forest technique also allow us to interpret the importance of the variables, a fundamental factor in helping managers with key activities aimed at preventing crises.

Furthermore, the results of the models show the predictive power of the Z’- and Z’’-Scores in the Italian context specifically, corroborating the theses of previous researchers. In addition, we also confirm that the Z’-Score is a better indicator for anticipating potential corporate distress.

The main limitation of our work is, of course, the sample, which only includes Italian non-listed UTP companies. This fact surely affects our results. However, we also consider this feature an additional contribution of our research, as we demonstrate the reliability of these models in predicting the classification of private firms as UTP. This is a factor of pivotal importance, because this classification anticipates the default and can act as a stimulus for fixing emerging problems. Given the ability of these models to manage a great quantity of data, future researchers should include more countries, considering both listed and private companies and, despite the low availability of data, possibly also the date on which a company was classified as UTP. In addition, because overfitting is not a significant issue for the main machine learning techniques, future researchers could include more predictive variables, especially those linked to sustainability, given the limited amount of existing literature on the matter (Elloumi & Gueyiè, 2001; Fahlenbrach et al., 2007; Ricci et al., 2020).