Forecasting bankruptcy using biclustering and neural network-based ensembles

Abstract

Most bankruptcy prediction models that have been analyzed in the literature and that are estimated using ensemble-based techniques are still not able to fully embody the true diversity of firm bankruptcy situations. Indeed, these models try to assess all bankruptcy situations either mostly using the same set of variables (bagging, boosting) or using the same set of observations (random subspace). In the first case, an ensemble assumes that any symptom of failure has the same origin. In the second case, it assumes that any financial situation that can lead to failure is the same for all firms. However, there are many situations where these two assumptions do not hold and where a state of bankruptcy may be specific to a given subgroup of firms or may be explained by a particular subset of variables. Certain methods, such as random forest or rotation forest, which combine the characteristics of both random subspace and bagging, appear as solutions to this issue. However, they do not always perform significantly better than other ensemble models. This is why we propose a modeling method that attempts to overcome the limitations of the previous models. It is based on a biclustering technique that seeks out groups of firms that are each characterized by a well-defined subset of variables, and on an ensemble technique that is used to embody, as precisely as possible, the full diversity of the bankruptcy situations that belong to each bicluster. We show how the complementarity between these two techniques can improve forecasts.
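To make the contrast drawn in the abstract concrete, the following is a minimal sketch, not the method of the article, of how each sampling scheme selects its training data: bagging resamples observations while keeping all variables, random subspace keeps all observations while drawing a subset of variables, and a bicluster restricts both at once. The data, the helper names (`bagging_sample`, `random_subspace_sample`, `take`) and the bicluster indices are hypothetical and chosen only for illustration.

```python
import random

def bagging_sample(n_obs, n_vars, rng):
    """Bagging-style draw: bootstrap the observations, keep every variable."""
    rows = [rng.randrange(n_obs) for _ in range(n_obs)]  # with replacement
    cols = list(range(n_vars))
    return rows, cols

def random_subspace_sample(n_obs, n_vars, n_keep, rng):
    """Random-subspace-style draw: keep every observation, subsample variables."""
    rows = list(range(n_obs))
    cols = sorted(rng.sample(range(n_vars), n_keep))  # without replacement
    return rows, cols

def take(X, rows, cols):
    """Return the sub-table of X restricted to the given rows and columns."""
    return [[X[i][j] for j in cols] for i in rows]

rng = random.Random(0)
n_obs, n_vars = 8, 5  # hypothetical: 8 firms described by 5 financial ratios
X = [[rng.gauss(0.0, 1.0) for _ in range(n_vars)] for _ in range(n_obs)]

X_bag = take(X, *bagging_sample(n_obs, n_vars, rng))
X_sub = take(X, *random_subspace_sample(n_obs, n_vars, 3, rng))

# A bicluster restricts both axes at once. The indices below are hand-picked
# for illustration; a biclustering algorithm would estimate them from data.
bicluster_rows, bicluster_cols = [0, 2, 5], [1, 4]
X_bic = take(X, bicluster_rows, bicluster_cols)

print(len(X_bag), len(X_bag[0]))  # all variables, resampled firms
print(len(X_sub), len(X_sub[0]))  # all firms, subset of variables
print(len(X_bic), len(X_bic[0]))  # subset of firms and subset of variables
```

In this sketch an ensemble member trained on `X_bag` or `X_sub` still sees either every variable or every firm, whereas a model trained on `X_bic` sees only one subgroup of firms described by its own subset of variables.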

Notes

  1. The Point-Biserial Correlation estimates the difference between the mean value of distances between groups and that of distances within groups. The C-Index of Hubert and Levin computes the difference between the sum of the k smallest distances and that of the k largest distances. The Gamma of Baker and Hubert computes the number of distances between groups that are greater than the largest distance within groups, and the number of distances within groups that are lower than the lowest distance within groups.
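As an illustration of the first two indices described in this note, here is a minimal sketch that computes them from pairwise Euclidean distances. It uses commonly cited forms of the indices, which are assumptions here and may differ in detail from what the article computes: the point-biserial index is taken as the difference between the mean between-group and mean within-group distances, scaled by sqrt(n_w * n_b) / n_t and by the standard deviation of all distances, and the C-index is taken as (S_w - S_min) / (S_max - S_min), where k is the number of within-group distances. The toy points and labels are hypothetical.

```python
import math
from itertools import combinations
from statistics import mean, pstdev

def pairwise_distances(points, labels):
    """Split Euclidean distances into within-group and between-group lists."""
    within, between = [], []
    for (p, lp), (q, lq) in combinations(zip(points, labels), 2):
        d = math.dist(p, q)
        (within if lp == lq else between).append(d)
    return within, between

def point_biserial(within, between):
    """Scaled gap between mean between-group and mean within-group distance."""
    n_w, n_b = len(within), len(between)
    n_t = n_w + n_b
    s_d = pstdev(within + between)  # population std of all distances
    return (mean(between) - mean(within)) * math.sqrt(n_w * n_b) / (n_t * s_d)

def c_index(within, between):
    """Position of the within-group sum between the k smallest and k largest sums."""
    all_d = sorted(within + between)
    k = len(within)
    s_min = sum(all_d[:k])   # sum of the k smallest distances
    s_max = sum(all_d[-k:])  # sum of the k largest distances
    return (sum(within) - s_min) / (s_max - s_min)

# Hypothetical toy data: two well-separated groups of three points each.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
labels = [0, 0, 0, 1, 1, 1]

within, between = pairwise_distances(points, labels)
print(point_biserial(within, between))
print(c_index(within, between))
```

For a partition this well separated, the point-biserial value is close to 1 and the C-index is at, or numerically very close to, 0, since the within-group distances are exactly the k smallest ones.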

Acknowledgements

We are very grateful to the two anonymous reviewers for their valuable comments.

Author information

Corresponding author

Correspondence to Philippe du Jardin.

Appendix

A: Main studies that have compared the forecast ability of bankruptcy prediction models designed with ensemble techniques

See Table 11.

About this article

Cite this article

du Jardin, P. Forecasting bankruptcy using biclustering and neural network-based ensembles. Ann Oper Res (2019). https://doi.org/10.1007/s10479-019-03283-2

Keywords

  • Financial risk
  • Bankruptcy prediction
  • Ensemble-based model
  • Neural network
  • Biclustering