
Applying machine learning techniques to predict and explain subscriber churn of an online drug information platform

  • S.I.: Deep learning modelling in real life: (Anomaly Detection, Biomedical, Concept Analysis, Finance, Image analysis, Recommendation)
Neural Computing and Applications

Abstract

Most markets today are saturated, which makes businesses highly competitive. Retaining existing customers is therefore pivotal, and predicting customer loss is crucial for targeting potential churners efficiently and attempting to retain them. This study provides an in-depth comparison of various machine learning techniques and advanced preprocessing methods, as well as an overall guide for handling churn prediction problems. Churn prediction is fundamentally a binary classification problem. To address it, numerous methods from different machine learning categories (linear, nonlinear, ensemble, neural networks) are constructed, optimized and trained on the subscription data of a new real-world dataset originating from a popular online drug information platform that provides information on drugs and drug substances as well as professional tools for pharmacotherapy decision making. In contrast with previous works that address traditional customer churn in the telecom, banking or insurance industries, the current study addresses online subscriber churn, where users might churn at any given moment. The study also focuses on properly preprocessing the data with advanced machine learning methods and on evaluating the models under different conditions to measure their robustness. The results are presented, compared, analyzed and explained. Extensive feature importance analysis is performed not only to explain the models themselves but also to indicate the main factors that contribute toward churning. The findings align with the notion that, provided the dataset is preprocessed using machine learning techniques in addition to statistical methods, all methods perform adequately and are generally viable options, but ensemble methods, namely Random Forests, are more flexible and more resistant to outliers. Feature importance analysis indicates that usage, not demographic data, is the prime indicator of churn.
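
The framing of churn prediction as binary classification, followed by a feature importance analysis, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the study: the data are synthetic, the feature names `usage` and `age` are hypothetical, a hand-written logistic regression stands in for the linear model family, and permutation importance stands in for whatever importance measures the authors applied. Only the Python standard library is used so that the example is self-contained.

```python
import math
import random


def make_subscribers(n, seed=0):
    """Synthetic subscribers. By construction (an assumption), churn
    depends on usage only and is independent of age."""
    rng = random.Random(seed)
    rows, labels = [], []
    for _ in range(n):
        usage = rng.gauss(0.0, 1.0)  # standardized platform usage (hypothetical)
        age = rng.gauss(0.0, 1.0)    # standardized age (hypothetical)
        p_churn = 1.0 / (1.0 + math.exp(2.5 * usage))  # low usage -> likely churn
        rows.append([usage, age])
        labels.append(1 if rng.random() < p_churn else 0)
    return rows, labels


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict_proba(model, x):
    """Predicted churn probability for one subscriber."""
    weights, bias = model
    return sigmoid(sum(w * v for w, v in zip(weights, x)) + bias)


def fit_logistic(rows, labels, epochs=200, lr=0.5):
    """Batch gradient descent on the log-loss; returns (weights, bias)."""
    n, d = len(rows), len(rows[0])
    model = ([0.0] * d, 0.0)
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for x, y in zip(rows, labels):
            err = predict_proba(model, x) - y
            for j in range(d):
                grad_w[j] += err * x[j]
            grad_b += err
        weights, bias = model
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        model = (weights, bias - lr * grad_b / n)
    return model


def accuracy(model, rows, labels):
    """Share of subscribers classified correctly at a 0.5 threshold."""
    hits = sum((predict_proba(model, x) >= 0.5) == bool(y)
               for x, y in zip(rows, labels))
    return hits / len(rows)


def permutation_importance(model, rows, labels, column, seed=1):
    """Drop in accuracy after shuffling one feature column."""
    rng = random.Random(seed)
    shuffled = [x[column] for x in rows]
    rng.shuffle(shuffled)
    permuted = [x[:column] + [v] + x[column + 1:]
                for x, v in zip(rows, shuffled)]
    return accuracy(model, rows, labels) - accuracy(model, permuted, labels)


train_x, train_y = make_subscribers(800, seed=0)
test_x, test_y = make_subscribers(400, seed=42)

model = fit_logistic(train_x, train_y)
test_accuracy = accuracy(model, test_x, test_y)
usage_importance = permutation_importance(model, test_x, test_y, column=0)
age_importance = permutation_importance(model, test_x, test_y, column=1)

print(f"held-out accuracy:      {test_accuracy:.3f}")
print(f"importance of usage:    {usage_importance:.3f}")
print(f"importance of age:      {age_importance:.3f}")
```

Because the synthetic labels are generated from `usage` alone, shuffling that column should hurt accuracy far more than shuffling `age`. The sketch therefore only shows how such an analysis is read; it provides no evidence about the real dataset, where the ranking of features is an empirical result of the study.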


Data availability

The data that support the findings of this study are available from Ergobyte Informatics S.A., but restrictions apply to the availability of these data, which were used under license for the current study, and so are not publicly available. Data are, however, available from the authors upon reasonable request and with permission of Ergobyte Informatics S.A.


Acknowledgements

The authors would like to thank Ergobyte Informatics S.A. for providing the dataset and for their valuable comments and suggestions on the preparation of this study.

Author information

Corresponding author

Correspondence to Georgios Theodoridis.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


About this article

Cite this article

Theodoridis, G., Tsadiras, A. Applying machine learning techniques to predict and explain subscriber churn of an online drug information platform. Neural Comput & Applic 34, 19501–19514 (2022). https://doi.org/10.1007/s00521-022-07603-9

  • DOI: https://doi.org/10.1007/s00521-022-07603-9
