Abstract
Most markets today are highly saturated, and competition among businesses is correspondingly intense. Retaining existing customers is therefore pivotal, which makes predicting customer loss crucial for efficiently targeting potential churners and attempting to retain them. This study provides an in-depth comparison of various machine learning techniques and advanced preprocessing methods, as well as an overall guide for handling churn prediction problems. Churn prediction is fundamentally a binary classification problem. To address it, this paper constructs, optimizes and trains numerous methods from different machine learning categories (linear, nonlinear, ensemble, neural networks) on the subscription data of a new real-world dataset originating from a popular online drug information platform that provides information on drugs and drug substances as well as professional tools for pharmacotherapy decision making. In contrast with previous works that address traditional customer churn in the telecom, banking or insurance industries, the current study addresses online subscriber churn, where users might churn at any given moment. This study also focuses on the proper preprocessing of the given data via advanced machine learning methods, and on evaluating the models under different conditions to measure their robustness. The results are presented, compared, analyzed and explained. Extensive feature importance analysis is performed not only to explain the models themselves but also to indicate the main factors that contribute toward churning. The findings align with the notion that, under the important condition that the dataset is preprocessed using not only statistical methods but machine learning techniques as well, all methods perform adequately and are generally viable options, but ensemble methods, namely Random Forests, are more flexible and more resistant to outliers. Feature importance analysis indicates that usage, not demographic data, is the prime indicator of churn.
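The two central ideas of the abstract, churn as binary classification and feature importance as a means of explanation, can be illustrated with a minimal sketch. It does not reproduce the study's dataset, models or results. The data are synthetic, the feature names `usage` and `age` are hypothetical, a plain logistic regression stands in for the models compared in the paper, and permutation importance is assumed here as one common importance measure.

```python
import math
import random


def make_synthetic_subscribers(n, seed=0):
    """Synthetic stand-in data (assumption): churn depends on low usage, age is noise."""
    rng = random.Random(seed)
    rows, labels = [], []
    for _ in range(n):
        usage = rng.gauss(0.0, 1.0)  # hypothetical standardized usage feature
        age = rng.gauss(0.0, 1.0)    # hypothetical standardized demographic feature
        p_churn = 1.0 / (1.0 + math.exp(2.0 * usage))  # low usage -> high churn risk
        rows.append([usage, age])
        labels.append(1 if rng.random() < p_churn else 0)  # 1 = churner
    return rows, labels


def predict_proba(w, b, x):
    """Probability of churn under a logistic model."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(rows, labels, lr=0.1, epochs=200):
    """Fit weights and bias by batch gradient descent on the log-loss."""
    n, d = len(rows), len(rows[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for x, y in zip(rows, labels):
            err = predict_proba(w, b, x) - y
            for j in range(d):
                grad_w[j] += err * x[j]
            grad_b += err
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def accuracy(w, b, rows, labels):
    """Fraction of rows whose thresholded prediction matches the label."""
    hits = sum(
        (predict_proba(w, b, x) >= 0.5) == (y == 1) for x, y in zip(rows, labels)
    )
    return hits / len(rows)


def permutation_importance(w, b, rows, labels, names, seed=1):
    """Accuracy drop when one feature column is shuffled across rows."""
    rng = random.Random(seed)
    baseline = accuracy(w, b, rows, labels)
    result = {}
    for j, name in enumerate(names):
        column = [r[j] for r in rows]
        rng.shuffle(column)
        shuffled = [r[:j] + [v] + r[j + 1:] for r, v in zip(rows, column)]
        result[name] = baseline - accuracy(w, b, shuffled, labels)
    return result


FEATURES = ["usage", "age"]
rows, labels = make_synthetic_subscribers(2000)
train_x, train_y = rows[:1500], labels[:1500]
test_x, test_y = rows[1500:], labels[1500:]

weights, bias = fit_logistic(train_x, train_y)
importance = permutation_importance(weights, bias, test_x, test_y, FEATURES)

print("held-out accuracy:", accuracy(weights, bias, test_x, test_y))
print("permutation importance:", importance)
```

Because the synthetic labels are generated from `usage` alone, shuffling that column should cost far more accuracy than shuffling `age`. That pattern is built into the toy data and only illustrates how such an analysis is read; it is not evidence for the study's finding.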
Data availability
The data that support the findings of this study are available from Ergobyte Informatics S.A., but restrictions apply to the availability of these data, which were used under license for the current study, and so are not publicly available. Data are, however, available from the authors upon reasonable request and with permission of Ergobyte Informatics S.A.
Acknowledgements
The authors would like to thank Ergobyte Informatics S.A. for providing the dataset and for their valuable comments and suggestions on the preparation of this study.
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
About this article
Cite this article
Theodoridis, G., Tsadiras, A. Applying machine learning techniques to predict and explain subscriber churn of an online drug information platform. Neural Comput & Applic 34, 19501–19514 (2022). https://doi.org/10.1007/s00521-022-07603-9
DOI: https://doi.org/10.1007/s00521-022-07603-9