Business Credit Scoring of Estonian Organizations

  • Jüri Kuusik
  • Peep Küngas
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10816)

Abstract

Recent hype in social analytics has modernized personal credit scoring, which now takes advantage of rapidly changing non-financial data. Business credit scoring, by contrast, still relies on financial data and traditional methods. Such approaches have the following limitations. First, financial reports are typically compiled once a year, so scoring is infrequent. Second, because financial reports are published with a delay of up to two years, scoring is based on outdated data and is not applied to young businesses. Third, manually crafted models, although human-interpretable, are typically of lower quality than models constructed via machine learning.

In this paper we describe an approach that applies extreme gradient boosting with Bayesian hyper-parameter optimization and ensemble learning to business credit scoring, using frequently updated data such as debts and network metrics from board membership and ownership networks. We report accuracy of the learned model as high as 99.5%. Additionally, we discuss lessons learned and limitations of the approach.
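
As an illustration of what a network metric from a board membership network might look like, the following minimal Python sketch computes a simple degree feature: for each company, the number of other companies with which it shares at least one board member. The abstract does not say which metrics the authors actually use, so the choice of degree, the function name `company_degrees`, and the sample data are all assumptions made for illustration only.

```python
from collections import defaultdict
from itertools import combinations


def company_degrees(memberships):
    """Return, for each company, how many distinct other companies
    share at least one board member with it.

    `memberships` is an iterable of (person, company) pairs.
    This is an illustrative feature, not the authors' metric.
    """
    # Group the companies each person sits on the board of.
    companies_by_person = defaultdict(set)
    for person, company in memberships:
        companies_by_person[person].add(company)

    # Link every pair of companies that share a board member.
    neighbours = defaultdict(set)
    for companies in companies_by_person.values():
        for company in companies:
            neighbours[company]  # make sure isolated companies appear
        for a, b in combinations(sorted(companies), 2):
            neighbours[a].add(b)
            neighbours[b].add(a)

    return {company: len(linked) for company, linked in neighbours.items()}


# Hypothetical membership records, invented for this example.
memberships = [
    ("person_1", "company_A"),
    ("person_1", "company_B"),
    ("person_2", "company_B"),
    ("person_2", "company_C"),
    ("person_3", "company_D"),
]
degrees = company_degrees(memberships)
```

A feature of this kind can be recomputed whenever board records change, which is what makes it a candidate for frequently updated scoring, in contrast to annual financial reports.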

Keywords

Business credit scoring, Machine learning, Boosted decision tree, Hyper-parameter tuning

Copyright information

© Springer International Publishing AG, part of Springer Nature 2018

Authors and Affiliations

  1. Register OÜ, Tallinn, Estonia
  2. University of Tartu, Tartu, Estonia
  3. STACC, Tartu, Estonia
