An insight into the experimental design for credit risk and corporate bankruptcy prediction systems

Journal of Intelligent Information Systems

Abstract

In recent years, the finance and business communities have shown increasing interest in tools for predicting credit and bankruptcy risk, probably because of the need for more robust decision-making systems capable of managing and analyzing complex data. As a result, many techniques have been developed with the aim of producing accurate prediction models. However, the design of the experiments used to assess and compare these models has attracted little attention so far, even though it plays an important role in validating and supporting theoretical claims about performance. The experimental design must be carried out carefully for the results to be meaningful; otherwise, it can become a source of misleading and contradictory conclusions about the benefits of using a particular prediction system. In this work, we review more than 140 papers published in refereed journals in the period 2000–2013, with emphasis on the foundations of experimental design in credit scoring and bankruptcy prediction applications. We provide caveats and guidelines on the use of databases, data splitting methods, performance evaluation metrics and hypothesis testing procedures, with the aim of converging on a systematic, consistent validation standard.

References

  • Abdou, H., & Pointon, J. (2011). Credit scoring, statistical techniques and evaluation criteria: A review of the literature. Intelligent Systems in Accounting. Finance and Management, 18 (2–3), 59–88.

    Google Scholar 

  • Abdou, H., Pointon, J., El-Masry, A. (2008). Neural nets versus conventional techniques in credit scoring in Egyptian banking. Expert Systems with Applications, 35 (3), 1275–1292.

    Google Scholar 

  • Abdou, H.A. (2009a). An evaluation of alternative scoring models in private banking. Journal of Risk Finance, 10 (1), 38–53.

    Google Scholar 

  • Abdou, H.A. (2009b). Genetic programming for credit scoring: The case of Egyptian public sector banks. Expert Systems with Applications, 36 (9:11), 402–11. 417.

    Google Scholar 

  • Abdou, H.A., El-Masry, A., Pointon, J., Abdou, H., El-Masry, A., Pointon, J. (2007). On the applicability of credit scoring models in Egyptian banks. Banks and Bank Systems, 2 (1), 4–20.

    Google Scholar 

  • Ahn, B.S., Cho, S.S., Kim, C.Y. (2000). The integrated methodology of rough set theory and artificial neural network for business failure prediction. Expert Systems with Applications, 18 (2), 65–74.

    Google Scholar 

  • Ahn, H., & Kim, K.J. (2009). Bankruptcy prediction modeling with hybrid case-based reasoning and genetic algorithms approach. Applied Soft Computing, 9 (2), 599–607.

    Google Scholar 

  • Alfaro, E., García, N., Gámez, M., Elizondo, D. (2008). Bankruptcy forecasting: An empirical comparison of AdaBoost and neural networks. Decision Support Systems, 45 (1), 110–122.

    Google Scholar 

  • Alpaydin, E. (2010). Introduction to Machine Learning. Cambridge MA: MIT Press.

    MATH  Google Scholar 

  • Angelini, E., di Tollo, G., Roli, A. (2008). A neural network approach for credit risk evaluation. The Quarterly Review of Economics and Finance, 48 (4), 733–755.

    Google Scholar 

  • Antonakis, A.C., & Sfakianakis, M.E. (2009). Assessing naïve Bayes as a method for screening credit applicants. Journal of Applied Statistics, 36 (5), 537–545.

    MATH  MathSciNet  Google Scholar 

  • Atiya, A.F. (2001). Bankruptcy prediction for credit risk using neural networks: A survey and new results. IEEE Trans on Neural Networks, 12 (4), 929–935.

    Google Scholar 

  • Bache, K., & Lichman, M. (2013). UCI Machine Learning Repository. Irvine, School of Information and Computer Sciences: University of California. http://archive.ics.uci.edu/ml.

  • Baesens, B., van Gestel, T., Viaene, S., Stepanova, M., Suykens, J., Vanthienen, J. (2003). Benchmarking state-of-the-art classification algorithms for credit scoring. Journal of the Operational Research Society, 54 (6), 627–635.

    MATH  Google Scholar 

  • Becerra, V.M., Galvao, R.K.H., Abou-Seads, M. (2005). Neural and wavelet network models for financial distress classification. Data Mining and Knowledge Discovery, 11 (1), 35–55.

    MathSciNet  Google Scholar 

  • Bellotti, T., & Crook, J. (2009). Support vector machines for credit scoring and discovery of significant features. Expert Systems with Applications, 36(2), 3302–3308.

    Google Scholar 

  • Ben-David, A., & Frank, E. (2009). Accuracy of machine learning models versus “hand crafted” expert systems – a credit scoring case study. Expert Systems with Applications, 36(3), 5264–5271.

    Google Scholar 

  • Bensic, M., Sarlija, N., Zekic-Susac, M. (2005). Modelling small-business credit scoring by using logistic regression, neural networks and decision trees. Intelligent Systems in Accounting Finance and Management, 13 (3), 133–150.

    Google Scholar 

  • Berrar, D., & Lozano, J.A. (2013). Significance tests or confidence intervals: which are preferable for the comparison of classifiers. Journal of Experimental & Theoretical Artificial Intelligence, 25 (2), 189–206.

    Google Scholar 

  • Bischl, B., Mersmann, O., Trautmann, H., Weihs, C. (2012). Resampling methods for meta-model validation with recommendations for evolutionary computation. Evol Comput, 20 (2), 249–275.

    Google Scholar 

  • Boguslauskas, V., & Mileris, R. (2009). Estimation of credit risk by artificial neural networks models. Economics of Engineering Decisions, 4, 7–14.

    Google Scholar 

  • Bose, I. (2006). Deciding the financial health of dot-coms using rough sets. Information & Management, 43 (7), 836–846.

    Google Scholar 

  • Brown, I., & Mues, C. (2012). An experimental comparison of classification algorithms for imbalanced credit scoring data sets. Expert Systems with Applications, 39(3), 3446–3453.

    Google Scholar 

  • Callejón, A.M., Casado, A.M., Fernández, M.A., Peláez, J.I. (2013). A system of insolvency prediction for industrial companies using a financial alternative model with neural networks. International Journal of Computational Intelligence Systems, 6 (1), 29–37.

    Google Scholar 

  • Canbas, S., Cabuk, A., Kilic, S.B. (2005). Prediction of commercial bank failure via multivariate statistical analysis of financial structures: The Turkish case. European Journal of Operational Research, 166 (2), 528–546.

    MATH  MathSciNet  Google Scholar 

  • Caouette, J.B., Altman, E.I., Narayanan, P., Nimmo, R. (2008). Managing Credit Risk: The Great Challenge for Global Financial Markets. Hoboken NJ: Wiley.

    Google Scholar 

  • Catal, C. (2012). Performance evaluation metrics for software fault prediction studies. Acta Polytechnica Hungarica, 9 (4), 193–206.

    Google Scholar 

  • Chen, F.L., & Li, F.C. (2010). Combination of feature selection approaches with SVM in credit scoring. Expert Systems with Applications, 37(7), 4902–4909.

    Google Scholar 

  • Chen, M.C., & Huang, S.H. (2003). Credit scoring and rejected instances reassigning through evolutionary computation techniques. Expert Systems with Applications, 24(4), 433–441.

    MATH  Google Scholar 

  • Chen, M.Y. (2013). A hybrid ANFIS model for business failure prediction utilizing particle swarm optimization and subtractive clustering. Information Sciences, 220, 180–195.

    Google Scholar 

  • Chen, N., Ribeiro, B., Vieira, A.S., Duarte, J., Neves, J.C. (2011). A genetic algorithm-based approach to cost-sensitive bankruptcy prediction. Expert Systems with Applications, 38(10:12), 939–12. 945.

    Google Scholar 

  • Chen, S.C., & Huang, M.Y. (2011). Constructing credit auditing and control management model with data mining technique. Expert Systems with Applications, 38(5), 5359–5365.

    Google Scholar 

  • Chen, W., Ma, C., Ma, L. (2009). Mining the customer credit using hybrid support vector machine technique. Expert Systems with Applications, 36(4), 7611–7616.

    Google Scholar 

  • Cheng, C.B., Chen, C.L., Fu, C.J. (2006). Financial distress prediction by a radial basis function network with logit analysis learning. Computers & Mathematics with Applications, 51(3–4), 579–588.

    MATH  MathSciNet  Google Scholar 

  • Chi, L.C., & Tang, T.C. (2006). Bankruptcy prediction: Application of logit analysis in export credit risks. Australian Journal of Management, 31(1), 17–28.

    Google Scholar 

  • Cho, S., Hong, H., Ha, B.C. (2010). A hybrid approach based on the combination of variable selection using decision trees and case-based reasoning using the mahalanobis distance: For bankruptcy prediction. Expert Systems with Applications, 37(4), 3482–3488.

    Google Scholar 

  • Chow, S.L. (1998). Precis of statistical significance: rationale, validity, and utility. Behavioral and Brain Sciences, 21(2), 169–239.

    Google Scholar 

  • Ciampi, F., & Gordini, N. (2013). Small enterprise default prediction modeling through artificial neural networks: An empirical analysis of Italian small enterprises. Journal of Small Business Management, 51(1), 23–45.

    Google Scholar 

  • Cielen, A., Peeters, L., Vanhoof, K. (2004). Bankruptcy prediction using a data envelopment analysis. European Journal of Operational Research, 154(2), 526–532.

    MATH  Google Scholar 

  • Cimpoeru, S.S. (2011). Neural networks and their application in credit risk assessment. evidence from the Romanian market. Technological and Economic Development of Economy, 17(3), 519–534.

    Google Scholar 

  • Cohen, P.R. (1995). Empirical Methods for Artificial Intelligence. Cambridge MA: MIT Press.

    MATH  Google Scholar 

  • Crook, J.N., Edelman, D.B., Thomas, L.C. (2007). Recent developments in consumer credit risk assessment. European Journal of Operational Research, 183(3), 1447–1465.

    MATH  MathSciNet  Google Scholar 

  • Dȧnilȧ, O.M. (2012). Credit risk assessment under Basel Accords. Theoretical and Applied Economics, 18(3), 77–90.

    Google Scholar 

  • Daubie, M., Levecq, P., Meskens, N. (2002). A comparison of the rough sets and recursive partitioning induction approaches: An application to commercial loans. International Transactions in Operational Research, 9(5), 681–694.

    MATH  Google Scholar 

  • Demšar, J. (2006). Statistical comparisons of classifiers over multiple data sets. Journal of Machine Learning Research, 7(1), 1–30.

    MATH  Google Scholar 

  • Eletter, S.F., Yaseen, S.G., Elrefae, G.A. (2010). Neuro-based artificial intelligence model for loan decisions. American Journal of Economics and Business Administration, 2(1), 27–34.

    Google Scholar 

  • Elsayad, A.M. (2010). Implementing automated prediction systems for credit scoring. ICGST International Journal on Automatic Control and Systems Engineering, 10(1), 11–19.

    Google Scholar 

  • Finlay, S. (2011). Multiple classifier architectures and their application to credit risk assessment. European Journal of Operational Research, 210(2), 368–378.

    Google Scholar 

  • Flórez-López, R. (2010). Effects of missing data in credit risk scoring. a comparative analysis of methods to achieve robustness in the absence of sufficient data. Journal of the Operational Research Society, 61(3), 486–501.

    Google Scholar 

  • Forman, G., & Scholz, M. (2010). Apples-to-apples in cross-validation studies: Pitfalls in classifier performance measurement. SIGKDD Explorations Newsletters, 12(1), 49–57.

    Google Scholar 

  • Fritz, S., & Hosemann, D. (2000). Restructuring the credit process: Behaviour scoring for German corporates. International Journal of Intelligent Systems in Accounting Finance & Management, 9(1), 9–21.

    Google Scholar 

  • Galindo, J., & Tamayo, P. (2000). Credit risk assessment using statistical and machine learning: Basic methodology and risk modeling applications. Computational Economics, 15(1–2), 107–143.

    MATH  Google Scholar 

  • García S, Fernández A, Luengo J (2010). Advanced nonparametric tests for multiple comparisons in the design of experiments in computational intelligence and data mining: Experimental analysis of power. Information Sciencies, 180(10), 2044–2064.

  • García, V., Marqués, A.I., Sánchez, J.S. (2012). On the use of data filtering techniques for credit risk prediction with instance-based models. Expert Systems with Applications, 276(18:13), 267–13.

    Google Scholar 

  • van Gestel, T., Baesens, B., Suykens, J.A.K., den Poel, D.V., Baestaens, D.E., Willekens, M. (2006). Bayesian kernel based classification for financial distress detection. European Journal of Operational Research, 172(3), 979–1003.

    MATH  Google Scholar 

  • van Gestel, T., Baesens, B., Martens, D. (2010). From linear to non-linear kernel based classifiers for bankruptcy prediction. Neurocomputing, 73(16–18), 2955–2970.

    Google Scholar 

  • Goletsis, Y., Exarchos, T.P., Katsis, C.D. (2010). Credit scoring using an Ant mining approach. Human Systems Management, 29(2), 79–88.

    Google Scholar 

  • Hamadani, A.Z. (2013). An integrated genetic-based model of naive Bayes networks for credit scoring. International Journal of Artificial Intelligence & Applications, 4(1), 85–103.

    Google Scholar 

  • Hand, D.J. (2009). Measuring classifier performance: a coherent alternative to the area under the roc curve. Machine Learning, 77(1), 103–123.

    Google Scholar 

  • Hand, D.J. (2012). Assessing the performance of classification methods. International Statistical Review, 80(3), 400–414.

    MathSciNet  Google Scholar 

  • Harris, T. (2013). Quantitative credit risk assessment using support vector machines: Broad versus narrow default definitions. Expert Systems with Applications. doi:10.1016/j.eswa.2013.01.044.

    Google Scholar 

  • Hoffmann, F., Baesens, B., Martens, J., Put, F., Vanthienen, J. (2002). Comparing a genetic fuzzy and a neurofuzzy classifier for credit scoring. International Journal of Intelligent Systems, 17(11), 1067–1083.

    MATH  Google Scholar 

  • Hoffmann, F., Baesens, B., Mues, C., van Gestel, T., Vanthienen, J. (2007). Inferring descriptive and approximate fuzzy rules for credit scoring using evolutionary algorithms. European Journal of Operational Research, 177(1), 540–555.

    MATH  Google Scholar 

  • Hong, C.S. (2009). Optimal threshold from ROC and CAP curves. Communications in Statistics – Simulation and Computation, 38(10), 2060–2072.

    MATH  MathSciNet  Google Scholar 

  • Horcher, K.A. (2005). Essentials of Financial Risk Management. NJ: Wiley Hoboken.

    Google Scholar 

  • Hsieh, N.C. (2005). Hybrid mining approach in the design of credit scoring models. Expert Systems with Applications, 28(4), 655–665.

    Google Scholar 

  • Hu, Y.C. (2009). Bankruptcy prediction using ELECTRE-based single-layer perceptron. Neurocomputing, 72(13–15), 3150–3157.

    Google Scholar 

  • Hu, Y.C., & Chen, C.J. (2011). A PROMETHEE-based classification method using concordance and discordance relations and its application to bankruptcy prediction. Information Sciences, 181(22), 4959–4968.

    Google Scholar 

  • Hu, Y.C., & Tseng, F.M. (2007). Functional-link net with fuzzy integral for bankruptcy prediction. Neurocomputing, 70(16–18), 2959–2968.

    Google Scholar 

  • Huang, C.L., Chen, M.C., Wang, C.J. (2007). Credit scoring with a data mining approach based on support vector machines. Expert Systems with Applications, 33(4), 847–856.

    MathSciNet  Google Scholar 

  • Huang, J.J., Tzeng, G.H., Ong, C.S. (2006). Two-stage genetic programming (2SGP) for the credit scoring model. Applied Mathematics and Computation, 174(2), 1039–1053.

    MATH  MathSciNet  Google Scholar 

  • Hughes, G. (1968). On the mean accuracy of statistical pattern recognizers. IEEE Trans on Information Theory, 14(1), 55–63.

    Google Scholar 

  • Im, J.K., Apley, D.W., Qi, C., Shan, X. (2012). A time-dependent proportional hazards survival model for credit risk analysis. Journal of the Operational Research Society, 63(3), 306–321.

    Google Scholar 

  • Japkowicz, N., & Shah, M. (2011). Evaluating Learning Algorithms: A Classification Perspective. New York NY: Cambridge University Press.

    Google Scholar 

  • Jouzbarkand, M., Keivani, F.S., Khodadadi, M., Fahim, S R.S.N., Aghajani, V. (2013). Bankruptcy prediction model by Ohlson and Shirata models in Tehran stock exchange. World Applied Sciences Journal, 21(2), 152–156.

    Google Scholar 

  • Karan, M.B., Ulucan, A., Kaya, M. (2013). Credit risk estimation using payment history data: A comparative study of Turkish retail stores. Central European Journal of Operations Research, 21, 1–16.

    Google Scholar 

  • Kaski, S., Sinkkonen, J., Peltonen, J. (2001). Bankruptcy analysis with selforganizing maps in learning metrics. IEEE Trans on Neural Networks, 12(4), 936–947.

    Google Scholar 

  • Khandani, A.E., Kim, A.J., Lo, A.W. (2010). Consumer credit-risk models via machine-learning algorithms. Journal of Banking & Finance, 34 (11), 2767–2787.

    Google Scholar 

  • Khashman, A. (2010). Neural networks for credit risk evaluation: Investigation of different neural models and learning schemes. Expert Systems with Applications, 37(9), 6233–6239.

    Google Scholar 

  • Kiefer, N.M. (2009). Default estimation for low-default portfolios. Journal of Empirical Finance, 16(1), 164–173.

    MathSciNet  Google Scholar 

  • Kim, Y.S., & Sohn, S.Y. (2004). Managing loan customers using misclassification patterns of credit scoring model. Expert Systems with Applications, 26(4), 567–573.

    Google Scholar 

  • Koh, H.C., Tan, W.C., Goh, C.P. (2006). A two-step method to construct credit scoring models with data mining techniques. International Journal of Business and Information, 1(1), 96–118.

    Google Scholar 

  • Kohavi, R. (1995). A study of cross-validation and bootstrap for accuracy estimation and model selection. Proc. 14th International Joint Conference on Artificial intelligence (Vol. 2, pp. 1137–1143). Canada: Montreal.

  • Korol, T. (2013). Early warning models against bankruptcy risk for Central European and Latin American enterprises. Economic Modelling, 31(1), 22–30.

    Google Scholar 

  • Kotsiantis, S. (2007). Credit risk analysis using a hybrid data mining model. International Journal of Intelligent Systems Technologies and Applications, 2(4), 345–356.

    Google Scholar 

  • Lacerda, E., Carvalho, A.C.P.L.F., Braga, A.P., Ludermir, T.B. (2005). Evolutionary radial basis functions for credit assessment. Applied Intelligence, 22(3), 167–181.

    Google Scholar 

  • Laha, A. (2007). Building contextual classifiers by integrating fuzzy rule based classification technique and k-nn method for credit scoring. Advanced Engineering Informatics, 21(3), 281–291.

    Google Scholar 

  • Lanine, G., & Vennet, R.V. (2006). Failure prediction in the Russian bank sector with logit and trait recognition models. Expert Systems with Applications, 30(3), 463–478.

    Google Scholar 

  • Lee, T.S., & Chen, I.F. (2005). A two-stage hybrid credit scoring model using artificial neural networks and multivariate adaptive regression splines. Expert Systems with Applications, 28(4), 743–752.

    Google Scholar 

  • Lee, T.S., Chiu, C.C., Lu, C.J., Chen, I.F. (2002). Credit scoring using the hybrid neural discriminant technique. Expert Systems with Applications, 23(3), 245–254.

    Google Scholar 

  • Lee, T.S., Chiu, C.C., Chou, Y.C., Lu, C.J. (2006). Mining the customer credit using classification and regression tree and multivariate adaptive regression splines. Computational Statistics & Data Analysis, 50(4), 1113–1130.

    MathSciNet  Google Scholar 

  • Lensberg, T., Eilifsen, A., McKee, T.E. (2006). Bankruptcy theory development and classification via genetic programming. European Journal of Operational Research, 169(2), 677–697.

    MATH  MathSciNet  Google Scholar 

  • Li, A., Shi, Y., He, J. (2008). MCLP-based methods for improving “bad” catching rate in credit cardholder behavior analysis-based methods for improving “bad” catching rate in credit cardholder behavior analysis. Applied Soft Computing, 8(3), 1259–1265.

    Google Scholar 

  • Li, H., & Sun, J. (2009). Gaussian case-based reasoning for business failure prediction with empirical data in China. Information Sciences, 179(1–2), 89–108.

    Google Scholar 

  • Li, H., & Sun, J. (2011). Principal component case-based reasoning ensemble for business failure prediction. Information & Management, 48 (6), 220–227.

    Google Scholar 

  • Li, H., & Sun, J. (2013). Predicting business failure using an RSF-based case-based reasoning ensemble forecasting method. Journal of Forecasting, 32(2), 180–192.

    MathSciNet  Google Scholar 

  • Lin, W.Y., Hu, Y.H., Tsai, C.F. (2012). Machine learning in financial crisis: A survey. IEEE Trans on Systems Man, and Cybernetics–Part C:. Applications and Reviews, 42(4), 421–436.

    Google Scholar 

  • Liu, G., & Zhu, Y. (2006). Credit assessment of contractors: A rough set method. Tsinghua Science & Technology, 11(3), 357–362.

    Google Scholar 

  • Liu, Y., & Schumann, M. (2005). Data mining feature selection for credit scoring models. Journal of the Operational Research Society, 56(9), 1099–1108.

    MATH  Google Scholar 

  • Lu, H., Liyan, H., Hongwei, Z. (2013). Credit scoring model hybridizing artificial intelligence with logistic regression. Journal of Networks, 8(1), 253–261.

    Google Scholar 

  • Malhotra, R., & Malhotra, D.K. (2002). Differentiating between good credits and bad credits using neuro-fuzzy systems. European Journal of Operational Research, 136(1), 190–211.

    MATH  Google Scholar 

  • Malhotra, R., & Malhotra, D.K. (2003). Evaluating consumer loans using neural networks. Omega, 31(2), 83–96.

    Google Scholar 

  • Marinakis, Y., Marinaki, M., Doumpos, M., Matsatsinis, N., Zopounidis, C. (2008). Optimization of nearest neighbor classifiers via metaheuristic algorithms for credit risk assessment. Journal of Global Optimization, 42(2), 279–293.

    MATH  MathSciNet  Google Scholar 

  • Marqués, A.I., García V., Sánchez, J.S. (2012a). Exploring the behaviour of base classifiers in credit scoring ensembles. Expert Systems with Applications, 39(11:10), 244–10. 250.

    Google Scholar 

  • Marqués, A.I., García, V., Sánchez, J.S. (2012b). Two-level classifier ensembles for credit risk assessment. Expert Systems with Applications, 39(12:10), 916–10. 922.

    Google Scholar 

  • Marqués, A.I., García, V., Sánchez, J.S. (2013). On the suitability of resampling techniques for the class imbalance problem in credit scoring. Journal of the Operational Research Society, 64(7), 1060–1070.

    Google Scholar 

  • Martens, D., Baesens, B., van Gestel, T., Vanthienen, J. (2007). Comprehensible credit scoring models using rule extraction from support vector machines. European Journal of Operational Research, 183(3), 1466–1476.

    MATH  Google Scholar 

  • Marusteri, M., & Bacarea, V. (2010). Comparing groups for statistical differences: how to choose the right statistical test. Biochemia Medica, 20(1), 15–32.

    Google Scholar 

  • Matsatsinis, N.F. (2002). CCAS: An intelligent decision support system for credit card assessment. Journal of Multi-Criteria Decision Analysis, 11(4–5), 213–235.

    MATH  Google Scholar 

  • Mileris, R. (2010). Estimation of loan applicants default probability applying discriminant analysis and simple Bayesian classifier. Economics and Management, 15, 1078–1084.

    Google Scholar 

  • Min, J.H., & Lee, Y.C. (2005). Bankruptcy prediction using support vector machine with optimal choice of kernel function parameters. Expert Systems with Applications, 28(4), 603–614.

    Google Scholar 

  • Moradi, M., Salehi, M., Ghorgani, M.E., Yazdi, H.S. (2013). Financial distress prediction of Iranian companies using data mining techniques. Organizacija, 46(1), 20–27.

    Google Scholar 

  • Nagy, G. (2004). Classifiers that improve with use. In: Proc. Conference on Pattern Recognition and Multimedia, (pp. 79–86). Tokyo.

  • Nanni, L., & Lumini, A. (2009). An experimental comparison of ensemble of classifiers for bankruptcy prediction and credit scoring. Expert Systems with Applications, 36(2), 3028–3033.

    Google Scholar 

  • Ong, C.S., Huang, J.J., Tzeng, G.H. (2005). Building credit scoring models using genetic programming. Expert Systems with Applications, 29(1), 41–47.

    Google Scholar 

  • Paleologo, G., Elisseeff, A., Antonini, G. (2010). Subagging for credit scoring models. European Journal of Operational Research, 201(2), 490–499.

    Google Scholar 

  • Park, C.S., & Han, I. (2002). A case-based reasoning with the feature weights derived by analytic hierarchy process for bankruptcy prediction. Expert Systems with Applications, 23(3), 255–264.

    MathSciNet  Google Scholar 

  • Pavlenko, T., & Chernyak, O. (2010). Credit risk modeling using bayesian networks. International Journal of Intelligent Systems, 25, 326–344.

    MATH  Google Scholar 

  • Pavlidis, N.G., Tasoulis, D.K., Adams, N.M., Hand, D.J. (2012). Adaptive consumer credit classification. Journal of the Operational Research Society, 63(12), 1645–1654.

    Google Scholar 

  • Pendharkar, P.C. (2005). A threshold-varying artificial neural network approach for classification and its application to bankruptcy prediction problem. Computers & Operations Research, 32(10), 2561–2582.

    MATH  Google Scholar 

  • Peng, Y., Kou, G., Shi, Y., Chen, Z. (2008). A multi-criteria convex quadratic programming model for credit data analysis. Decision Support Systems, 44(4), 1016–1030.

    Google Scholar 

  • Peng, Y., Wang, G., Kou, G., Shi, Y. (2011). An empirical study of classification algorithm evaluation for financial risk prediction. Applied Soft Computing, 11(2), 2906–2915.

    Google Scholar 

  • Pervan, I., & Kuvek, T. (2013). The relative importance of financial ratios and nonfinancial variables in predicting of insolvency. Croatian Operational Research Review, 4(1), 187–197.

    Google Scholar 

  • Phua, C., Alahakoon, D., Lee, V. (2004). Minority report in fraud detection: Classification of skewed data. SIGKDD Explorations Newsletter, 6(1), 50–59.

    Google Scholar 

  • Pietruszkiewicz, W. (2008). Dynamical systems and nonlinear Kalman filtering applied in classification. Proc. of 7th IEEE International Conference on Cybernetic Intelligent Systems, (pp. 263–268). London: UK.

  • Ping, Y., & Yongheng, L. (2011). Neighborhood rough set and SVM based hybrid credit scoring classifier. Expert Systems with Applications, 38(9:11), 300–11. 304.

    Google Scholar 

  • Piramuthu, S. (2006). On preprocessing data for financial credit risk evaluation. Expert Systems with Applications, 30(3), 489–497.

    Google Scholar 

  • Provost, F.J., & Fawcett, T. (2001). Robust classification for imprecise environments. Machine Learning, 42(3), 203–231.

    MATH  Google Scholar 

  • Raeder, T., Forman, G., Chawla, N. (2012). Learning from imbalanced data: Evaluation matters In Holmes, D E, & Jain, L C (Eds.), Data Mining: Foundations and Intelligent Paradigms, (pp. 315–331). Berlin Heidelberg: Springer-Verlag.

    Google Scholar 

  • Ravi Kumar, P., & Ravi, V. (2007). Bankruptcy prediction in banks and firms via statistical and intelligent techniques - a review. European Journal of Operational Research, 180(1), 1–28.

    MATH  Google Scholar 

  • Ravisankar, P., Ravi, V., Bose, I. (2010). Failure prediction of dotcom companies using neural network-genetic programming hybrids. Information Sciences, 180(8), 1257–1267.

    Google Scholar 

  • Ribeiro, B., Silva, C., Chen, N., Vieira, A.S., Neves, J.C. (2012). Enhanced default risk models with SVM+. Expert Systems with Applications, 152(11:10), 140–10.

    Google Scholar 

  • Sabzevari, H., Soleymani, M., Noorbakhsh, E. (2007). Proc of the 3rd CRC Credit Scoring Conference Edinburgh. UK.

  • Sadatrasoul, S.M., Gholamian, M.R., Siami, M., Hajimohammadi, Z. (2013). Credit scoring in banks and financial institutions via data mining techniques: A literature review. Journal of AI and Data Mining, 1(2), 119–129.

    Google Scholar 

  • Salcedo-Sanz, S., Fernández-Villacañas, J.L., Segovia-Vargas, M.J., Bousoño-Calzón, C. (2005). Genetic programming for the prediction of insolvency in non-life insurance companies. Computers & Operations Research, 32(4), 749–765.

    MATH  Google Scholar 

  • Sechidis, K., Tsoumakas, G., Vlahavas, I. (2011). On the stratification of multi-label data. Proc. European Conference on Machine Learning and Knowledge Discovery in Databases (Vol. 2, pp. 145–158). Greece: Athens.

  • Serrano-Cinca, C., & Gutiérrez-Nieto, B. (2013). Partial least square discriminant analysis for bankruptcy prediction. Decision Support Systems, 54(3), 1245–1255.

    Google Scholar 

  • Shin, K.S., & Lee, Y.J. (2002). A genetic algorithm application in bankruptcy prediction modeling. Expert Systems with Applications, 23(3), 321–328.

    Google Scholar 

  • Siami, M., Gholamian, M.R., Basiri, J. (2013). An application of locally linear model tree algorithm with combination of feature selection in credit scoring. International Journal of Systems Science. doi:10.1080/00207721.2013.767395.

    Google Scholar 

  • Staples, M., & Niazi, M. (2007). Experiences using systematic review guidelines. Journal of Systems and Software, 80(9), 1425–1437.

    Google Scholar 

  • Sun, L., & Shenoy, P.P. (2007). Using Bayesian networks for bankruptcy prediction: Some methodological issues. European Journal of Operational Research, 180(2), 738–753.

    MATH  Google Scholar 

  • Sušteršič, M., Mramor, D., Zupan, J. (2009). Consumer credit scoring models with limited data. Expert Systems with Applications, 36(3), 4736–4744.

    Google Scholar 

  • Tan, C.N.W., & Dihardjo, H. (2001). A study of using artificial neural networks to develop an early warning predictor for credit union financial distress with comparison to the probit model. Managerial Finance, 27(4), 56–77.

    Google Scholar 

  • Thomas, L.C., Edelman, D.B., Crook J.N. (2002). Credit Scoring and Its Applications. Philadelphia, PA: SIAM.

    MATH  Google Scholar 

  • Tsai, C.F. (2009). Feature selection in bankruptcy prediction. Knowledge-Based Systems, 22(2), 120–127.

    Google Scholar 

  • Tsai, C.F., & Chen, M.L. (2010). Credit rating by hybrid machine learning techniques. Applied Soft Computing, 10(2), 374–380.

    Google Scholar 

  • Tsai, C.F., & Cheng, K.C. (2012). Simple instance selection for bankruptcy prediction. Knowledge-Based Systems, 27, 333–342.

    Google Scholar 

  • Tsai, C.F., & Hsu, Y.F. (2013). A meta-learning framework for bankruptcy prediction. Journal of Forecasting, 32(2), 167–179.

    MathSciNet  Google Scholar 

  • Tsai, C.F., & Wu, J.W. (2008). Using neural network ensembles for bankruptcy prediction and credit scoring. Expert Systems with Applications, 34(4), 2639–2649.

    Google Scholar 

  • Tsakonas, A., Dounias, G., Doumpos, M., Zopounidis, C. (2006). Bankruptcy prediction with neural logic networks by means of grammar-guided genetic programming. Expert Systems with Applications, 30(3), 449–461.

    Google Scholar 

  • Tseng, F.M., & Hu, Y.C. (2010). Comparing four bankruptcy prediction models: Logit, quadratic interval logit, neural and fuzzy neural networks. Expert Systems with Applications, 37(3), 1846–1853.

  • Twala, B. (2010). Multiple classifier application to credit risk assessment. Expert Systems with Applications, 37(4), 3326–3336.

  • Verikas, A., Kalsyte, Z., Bacauskiene, M., Gelzinis, A. (2010). Hybrid and ensemble-based soft computing techniques in bankruptcy prediction: A survey. Soft Computing - A Fusion of Foundations, Methodologies and Applications, 14(9), 995–1010.

  • Wang, C.M., & Huang, Y.F. (2009). Evolutionary-based feature selection approaches with new criteria for data mining: A case study of credit approval data. Expert Systems with Applications, 36(3), 5900–5908.

  • Wang, G., & Ma, J. (2011). Study of corporate credit risk prediction based on integrating boosting and random subspace. Expert Systems with Applications, 38(11), 13871–13878.

  • Wang, G., Hao, J., Ma, J., Jiang, H. (2011). A comparative assessment of ensemble learning for credit scoring. Expert Systems with Applications, 38(1), 223–230.

  • Wang, G., Ma, J., Huang, L., Xu, K. (2012). Two credit scoring models based on dual strategy ensemble trees. Knowledge-Based Systems, 26, 61–68.

  • Wang, Y., Wang, S., Lai, K.K. (2005). A new fuzzy support vector machine to evaluate credit risk. IEEE Transactions on Fuzzy Systems, 13(6), 820–831.

  • West, D. (2000). Neural network credit scoring models. Computers & Operations Research, 27(11–12), 1131–1152.

  • West, D., Dellana, S., Qian, J. (2005). Neural network ensemble strategies for financial decision applications. Computers & Operations Research, 32(10), 2543–2559.

  • Witkowska, D. (2006). Discrete choice model application to the credit risk evaluation. International Advances in Economic Research, 12(1), 33–42.

  • Xie, G., Zhao, Y., Jiang, M., Zhang, N. (2013). A novel ensemble learning approach for corporate financial distress forecasting in fashion and textiles supply chains. Mathematical Problems in Engineering, 1–9.

  • Yang, Y. (2007). Adaptive credit scoring with kernel learning methods. European Journal of Operational Research, 183(3), 1521–1536.

  • Yeh, I.C., & Lien, C.H. (2009). The comparisons of data mining techniques for the predictive accuracy of probability of default of credit card clients. Expert Systems with Applications, 36(2), 2473–2480.

  • Yobas, M.B., Crook, J.N., Ross, P. (2000). Credit scoring using neural and evolutionary techniques. IMA Journal of Management Mathematics, 11(2), 111–125.

  • Yu, L., Wang, S., Lai, K.K. (2008). Credit risk assessment with a multistage neural network ensemble learning approach. Expert Systems with Applications, 34(2), 1434–1444.

  • Yu, L., Wang, S., Lai, K.K. (2009). An intelligent-agent-based fuzzy group decision making model for financial multicriteria decision support: The case of credit scoring. European Journal of Operational Research, 195(3), 942–959.

  • Zhang, D., Zhou, X., Leung, S.C.H., Zheng, J. (2010). Vertical bagging decision trees model for credit scoring. Expert Systems with Applications, 37(12), 7838–7843.

  • Zhao, H., Sinha, A., Ge, W. (2009). Effects of feature construction on classification performance: An empirical study in bank failure prediction. Expert Systems with Applications, 36(2), 2633–2644.

  • Zhou, L. (2013). Performance of corporate bankruptcy prediction models on imbalanced dataset: The effect of sampling methods. Knowledge-Based Systems, 41, 16–25.

  • Zhou, L., Lai, K.K., Yu, L. (2009). Credit scoring using support vector machines with direct search for parameters selection. Soft Computing, 13(2), 149–155.

  • Zhou, L., Lai, K.K., Yen, J. (2012). Bankruptcy prediction using SVM models with a new approach to combine features selection and parameter optimisation. International Journal of Systems Science, 1–13.

  • Zhou, X., Jiang, W., Shi, Y., Tian, Y. (2011). Credit risk evaluation with kernel-based affine subspace nearest points learning method. Expert Systems with Applications, 38(4), 4272–4279.

  • Zurada, J., & Zurada, M. (2002). How secure are good loans: Validating loan-granting decisions and predicting default rates on consumer loans. Review of Business Information Systems, 6(3), 65–84.

Acknowledgements

This work has been partially supported by the Mexican Science and Technology Council (CONACYT-Mexico) through a Postdoctoral Fellowship [223351], the Spanish Ministry of Economy under grant TIN2013-46522-P, and the Generalitat Valenciana under grant PROMETEOII/2014/062.

Author information

Corresponding author

Correspondence to J. Salvador Sánchez.

Appendix A: Credit databases

The databases used in the experiments of the reviewed studies are listed in Table 8. For each database, we report the number of samples, the number of independent variables, and the papers in which it has been used.

Table 8 Databases used in the papers reviewed

Cite this article

García, V., Marqués, A.I. & Sánchez, J.S. An insight into the experimental design for credit risk and corporate bankruptcy prediction systems. J Intell Inf Syst 44, 159–189 (2015). https://doi.org/10.1007/s10844-014-0333-4
