Combining Kohonen maps and prior payment behavior for small enterprise default prediction

Small Business Economics

Abstract

This study aims to verify the potential of combining corporate prior payment behavior and Kohonen maps for small enterprise default prediction. Logistic regression, discrete-time hazard models, and Kohonen maps were applied to a sample of 1200 Italian small enterprises, and two categories of prediction models were calculated: one exclusively based on financial ratios and the other based also on payment behavior-related variables. The main findings are as follows: (1) Kohonen map-based trajectories give significantly higher prediction accuracy rates compared to both logistic and hazard models; (2) the longer the forecast horizon and/or the smaller the firm’s size, the greater are the improvements in prediction accuracy obtainable through Kohonen maps; (3) accuracy rates are higher when company payment behavior-related variables are added to financial ratios as default predictors; and (4) the smaller a firm, the greater is the increase in accuracy obtainable by adding payment behavior-related variables.

Notes

  1. A Kohonen map, also known as a self-organizing map, is a type of artificial neural network using unsupervised learning to build a two-dimensional map of a problem space. The key distinctive feature of a Kohonen map compared to other approaches to problem solving is that it uses competitive learning rather than error-correction learning such as backpropagation with gradient descent. A Kohonen map can generate a visual representation of data on a hexagonal or rectangular grid. Applications include meteorology, oceanography, project prioritization, and oil and gas exploration.

  2. In this paper, ‘cooperating banking group’ refers to the banking group (whose name is not disclosed for reasons of confidentiality) which collaborated with this research project by making data available in relation to company prior payment behavior variables, as described in Section 4.2.

  3. Proportional stratified random sampling is a type of probability sampling in which the entire population is divided into multiple non-overlapping, homogeneous groups (strata), and the members of the sample are randomly chosen from each stratum in proportion to its size. Each member of the population should belong to exactly one stratum, so that every member has an equal chance of being selected. This sampling method is also called random quota sampling. This study used the following stratification variables:

    1. 2015 turnover: below 1.0 million, 1.0–2.5 million, 2.5–4.0 million, 4.0–5.0 million

    2. Business sector: food, clothing, wood products, chemical products, metallurgy, mechanical machines, electronic and optical machines and tools

    3. Geographical location (region of central Italy in which the firm was located): Abruzzo, Emilia Romagna, Lazio, Liguria, Marche, Toscana, or Umbria.

  4. The Central Credit Register is described in footnote number 7.

  5. We would like to thank the anonymous reviewer who suggested the use of a hazard model as a second benchmark of self-organizing map-based trajectories, thus allowing for more striking and valuable results.

  6. The results obtained through discrete-time hazard prediction models are very similar to those of logistic regression models. This is because all the default events in our analysis happened in 2015, so the status associated with the independent variables of each of the 8 years under assessment is non-default for every firm in the sample, except in the last year.

  7. For example, in Italy, there is the Central Credit Register, an information system on the debt of the customers of the banks and financial companies supervised by the Bank of Italy, which contains information on customers’ borrowings from the intermediaries and allows lenders to obtain a wide range of information on the risk position of each customer vis-à-vis the banking system, including past due and/or overdrawn exposures for more than 30, 60, 90, and 180 days.

  8. A second-order method, based on second derivatives of network parameters, was not used in order to look into an equivalent number of points of comparison.
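The competitive learning described in Note 1 can be illustrated with a minimal sketch. It is not the implementation used in this study: the grid size, the linearly decaying learning rate, the Gaussian neighbourhood function, and the names `train_som` and `best_matching_unit` are all assumptions chosen for illustration, and the two-cluster data are synthetic.

```python
import math
import random


def best_matching_unit(weights, x):
    """Return the grid node whose weight vector is closest to x (the 'winner')."""
    return min(
        weights,
        key=lambda node: sum((w - xi) ** 2 for w, xi in zip(weights[node], x)),
    )


def train_som(data, rows=4, cols=4, epochs=50, lr0=0.5, sigma0=2.0, seed=0):
    """Train a small rectangular Kohonen map (all hyperparameters are assumed)."""
    rng = random.Random(seed)
    dim = len(data[0])
    # One weight vector per grid node, initialised at random in [0, 1).
    weights = {
        (r, c): [rng.random() for _ in range(dim)]
        for r in range(rows)
        for c in range(cols)
    }
    total_steps = epochs * len(data)
    step = 0
    for _ in range(epochs):
        for x in rng.sample(data, len(data)):  # shuffled pass over the data
            frac = step / total_steps
            lr = lr0 * (1.0 - frac)                  # decaying learning rate
            sigma = max(sigma0 * (1.0 - frac), 0.5)  # shrinking neighbourhood
            bmu = best_matching_unit(weights, x)     # competition step
            for node, w in weights.items():
                # Nodes close to the winner on the grid are pulled more strongly.
                grid_d2 = (node[0] - bmu[0]) ** 2 + (node[1] - bmu[1]) ** 2
                h = math.exp(-grid_d2 / (2.0 * sigma ** 2))
                for i in range(dim):
                    w[i] += lr * h * (x[i] - w[i])
            step += 1
    return weights


# Synthetic two-cluster data standing in for (scaled) firm-level variables.
_rng = random.Random(1)
toy = [[_rng.uniform(0.0, 0.2), _rng.uniform(0.0, 0.2)] for _ in range(20)]
toy += [[_rng.uniform(0.8, 1.0), _rng.uniform(0.8, 1.0)] for _ in range(20)]
som = train_som(toy)
print(best_matching_unit(som, [0.1, 0.1]), best_matching_unit(som, [0.9, 0.9]))
```

No error signal is propagated back through the network: each observation only moves the winning node and its grid neighbours towards itself, which is what distinguishes this update from error-correction learning.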

Acknowledgments

The authors would like to thank Fabrizio Cipollini, University of Florence, for his suggestions about the statistical methods used in this study. The authors would also like to thank two anonymous reviewers for their helpful comments on an earlier version of this article.

Author information

Corresponding author

Correspondence to Francesco Ciampi.

Appendix

1.1 Description and results of the six selection methods used to choose the ratios and the prior payment behavior variables included in the prediction models

The variable selection criteria used in this study, which are among the most commonly used in default prediction modeling, are the following.

  1. Wilks’s lambda criterion relying on a forward search procedure to explore a (sub)space of possible variable combinations, Fisher F test to interrupt the search, and Wilks’s lambda to compare variable subsets and determine the best (this criterion is optimized for discriminant analysis)

  2. Akaike information criterion with a forward stepwise search and chi-squared as a stopping criterion in a logistic regression model

  3. Akaike information criterion with a backward stepwise search and chi-squared as a stopping criterion in a logistic regression model

  4. Zero-order neural network criterion based on the evaluation criteria designed by Yacoub and Bennani (1997), with a backward search procedure and network retraining after each variable removal

  5. First-order neural network criterion using the first derivatives of network parameters with respect to variables as an evaluation criterion (see Note 8), with a backward search procedure and network retraining after each variable removal

  6. Error network criterion relying on the evaluation of an out-of-sample error calculated with the neural network, with a backward search procedure and network retraining after each variable removal

To select variables, 1000 random bootstrap samples were drawn with replacement from the training sample dataset of year 2014 (1200 firms). Each bootstrap sample included 1200 firms. The following three-step process was then used:

  • Firstly, each selection criterion was applied to each of these 1000 bootstrap samples.

  • Secondly, for each criterion, only those variables that were included in more than 70% of the selection results were retained.

  • Finally, only those variables which were retained by at least three of the abovementioned criteria were used for the development of prediction models.
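The three steps above can be sketched as follows. This is an illustrative outline under stated assumptions, not the code used in this study: `stable_variables` and `mean_gap_criterion` are hypothetical names, the three toy criteria are simple stand-ins for the six criteria listed earlier, and the firm data are synthetic.

```python
import random
from collections import Counter


def stable_variables(rows, criteria, n_boot=1000, freq_threshold=0.70,
                     min_criteria=3, seed=0):
    """Bootstrap-based selection: frequency filter, then vote across criteria."""
    rng = random.Random(seed)
    n = len(rows)
    # Draw the bootstrap samples once (with replacement, same size as the
    # original sample) so that every criterion sees the same resamples.
    samples = [[rows[rng.randrange(n)] for _ in range(n)] for _ in range(n_boot)]

    votes = Counter()
    kept_by_criterion = {}
    for name, select in criteria.items():
        counts = Counter()
        for sample in samples:                 # step 1: run the criterion
            counts.update(set(select(sample)))
        # Step 2: keep variables selected in more than 70% of the samples.
        kept = {v for v, c in counts.items() if c / n_boot > freq_threshold}
        kept_by_criterion[name] = kept
        votes.update(kept)

    # Step 3: keep variables retained by at least `min_criteria` criteria.
    final = {v for v, k in votes.items() if k >= min_criteria}
    return final, kept_by_criterion


def mean_gap_criterion(cutoff, variables=("x1", "x2")):
    """Hypothetical stand-in for a real criterion (e.g. a stepwise search):
    select a variable when its group means differ by more than `cutoff`."""
    def select(sample):
        bad = [r for r in sample if r["default"]]
        good = [r for r in sample if not r["default"]]
        if not bad or not good:
            return set()
        chosen = set()
        for v in variables:
            gap = abs(sum(r[v] for r in bad) / len(bad)
                      - sum(r[v] for r in good) / len(good))
            if gap > cutoff:
                chosen.add(v)
        return chosen
    return select


# Synthetic firms: x1 separates defaulted from non-defaulted firms, x2 is noise.
_rng = random.Random(42)
firms = [{"default": i % 2 == 0,
          "x1": (1.0 if i % 2 == 0 else 0.0) + _rng.uniform(-0.1, 0.1),
          "x2": _rng.uniform(-0.05, 0.05)}
         for i in range(60)]
criteria = {"A": mean_gap_criterion(0.3),
            "B": mean_gap_criterion(0.5),
            "C": mean_gap_criterion(0.7)}
selected, per_criterion = stable_variables(firms, criteria, n_boot=200)
print(sorted(selected))  # prints ['x1']
```

Requiring agreement both across resamples and across criteria favours variables whose selection does not depend on one particular sample or one particular search procedure.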

Tables 18, 19, 20, 21, 22, and 23 show, for each selection criterion used in this study, the variables that appeared in more than 70% of the selection results. The variables which were selected by at least three of these criteria (and which were consequently selected for the development of our prediction models) are highlighted in bold.

Table 18 Variables included in more than 70% of the selection results using a Wilks’s lambda criterion
Table 19 Variables included in more than 70% of the selection results using Akaike information criterion with a forward stepwise search
Table 20 Variables included in more than 70% of the selection results using Akaike information criterion with a backward stepwise search
Table 21 Variables included in more than 70% of the selection results using a zero-order neural network criterion
Table 22 Variables included in more than 70% of the selection results using a first-order neural network criterion
Table 23 Variables included in more than 70% of the selection results using an error neural network criterion
Table 24 Correlation matrix between explanatory variables related to year 2014 and included in the logistic regression models

Cite this article

Ciampi, F., Cillo, V. & Fiano, F. Combining Kohonen maps and prior payment behavior for small enterprise default prediction. Small Bus Econ 54, 1007–1039 (2020). https://doi.org/10.1007/s11187-018-0117-2

  • DOI: https://doi.org/10.1007/s11187-018-0117-2
