Application of Bayesian Automated Hyperparameter Tuning on Classifiers Predicting Customer Retention in Banking Industry

Part of the Advances in Intelligent Systems and Computing book series (AISC, volume 1175)

Abstract

This paper compares the accuracy metrics achieved by nine fundamental Machine Learning (ML) classifiers that predict which customers a bank is likely to retain. Bayesian Automated Hyperparameter Tuning (AHT) with the Tree-structured Parzen Estimator was performed on all nine classifiers. After visualizing the nature of the dataset and its constraints of class imbalance and limited training examples, Feature Engineering was performed to compensate for these constraints. The ML techniques comprise four steps. First, six classifiers (namely, K-Nearest Neighbors, Naive Bayes, Decision Tree, Random Forest, SVM, and Artificial Neural Network, ANN) were used individually on the dataset with their default hyperparameters, with and without Feature Engineering. Second, three boosting classifiers (namely, AdaBoost, XGBoost, and GradientBoost) were used without changing their default hyperparameters. Third, Bayesian AHT with the Tree-structured Parzen Estimator was performed on each classifier to optimize its hyperparameters and obtain the best results on the training data. Fourth, AHT was performed on the three boosting classifiers as well. The mean cross-validation training accuracy achieved is higher than the accuracies reported so far for this dataset on Kaggle and in other research papers. Moreover, such an extensive comparison of nine classifiers after Bayesian AHT on a Banking Industry dataset has not previously been reported.

Keywords

  • Bayesian automated hyperparameter tuning
  • Boosting methods
  • Outlier detection
  • Tree-structured Parzen estimator
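
The Tree-structured Parzen Estimator named in the abstract can be illustrated with a small, standard-library-only sketch. It is not the authors' implementation, and none of the names, bounds, or settings below come from the paper. The function `toy_cv_error` is a hypothetical stand-in for "1 minus mean cross-validated accuracy" of a classifier with a single continuous hyperparameter. The helper `tpe_minimize`, the fixed kernel bandwidth, the quantile `gamma`, and the candidate count are all assumptions chosen for readability. The sketch splits past trials into a "good" and a "bad" group, fits a Parzen (Gaussian kernel) density to each, and evaluates next the candidate with the largest ratio of good density to bad density.

```python
import math
import random


def toy_cv_error(x):
    # Hypothetical objective: stands in for 1 - mean cross-validated accuracy.
    return (x - 2.0) ** 2


def parzen_density(x, centers, bandwidth):
    # Parzen estimate: average of Gaussian kernels centred on past trials.
    norm = bandwidth * math.sqrt(2.0 * math.pi)
    total = 0.0
    for c in centers:
        z = (x - c) / bandwidth
        total += math.exp(-0.5 * z * z) / norm
    return total / len(centers)


def tpe_minimize(objective, low, high, n_startup=10, n_iter=40,
                 gamma=0.25, n_candidates=24, seed=0):
    # Simplified one-dimensional TPE loop (illustrative assumptions throughout).
    if n_startup < 2:
        raise ValueError("n_startup must be at least 2")
    rng = random.Random(seed)
    bandwidth = (high - low) / 10.0  # fixed bandwidth, an assumption

    # Warm-up: evaluate a few uniformly random hyperparameter values.
    history = []
    for _ in range(n_startup):
        x = rng.uniform(low, high)
        history.append((x, objective(x)))

    for _ in range(n_iter):
        # Split trials at the gamma quantile of the observed losses.
        ranked = sorted(history, key=lambda t: t[1])
        n_good = max(1, math.ceil(gamma * len(ranked)))
        n_good = min(n_good, len(ranked) - 1)  # keep the bad group non-empty
        good = [x for x, _ in ranked[:n_good]]
        bad = [x for x, _ in ranked[n_good:]]

        # Sample candidates from the "good" density l(x), clipped to bounds.
        candidates = []
        for _ in range(n_candidates):
            centre = rng.choice(good)
            x = min(high, max(low, rng.gauss(centre, bandwidth)))
            candidates.append(x)

        # Choose the candidate that maximises l(x) / g(x).
        def ratio(x):
            l_x = parzen_density(x, good, bandwidth)
            g_x = parzen_density(x, bad, bandwidth)
            return l_x / (g_x + 1e-12)

        next_x = max(candidates, key=ratio)
        history.append((next_x, objective(next_x)))

    best = min(history, key=lambda t: t[1])
    return best, history


(best_x, best_loss), history = tpe_minimize(toy_cv_error, -5.0, 5.0)
print(f"best hyperparameter: {best_x:.3f}, loss: {best_loss:.5f}")
```

In practice, the objective would train a classifier with the proposed hyperparameters and return a cross-validated error, and a maintained library would normally replace this hand-written loop. The sketch only shows why the search concentrates on regions where earlier trials performed well.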


Author information

Corresponding author

Correspondence to Akash Sampurnanand Pandey.

Copyright information

© 2021 Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Pandey, A.S., Shukla, K.K. (2021). Application of Bayesian Automated Hyperparameter Tuning on Classifiers Predicting Customer Retention in Banking Industry. In: Sharma, N., Chakrabarti, A., Balas, V.E., Martinovic, J. (eds) Data Management, Analytics and Innovation. Advances in Intelligent Systems and Computing, vol 1175. Springer, Singapore. https://doi.org/10.1007/978-981-15-5619-7_7
