Customer churn prediction system: a machine learning approach

Abstract

Customer churn prediction (CCP) is one of the challenging problems in the telecom industry. With advances in machine learning and artificial intelligence, the ability to predict customer churn has improved significantly. Our proposed methodology consists of six phases. In the first two phases, data pre-processing and feature analysis are performed. In the third phase, feature selection is carried out using the gravitational search algorithm. Next, the data is split into a train set and a test set in the ratio of 80% to 20%. In the prediction phase, popular predictive models, namely logistic regression, naive Bayes, support vector machine, random forest, and decision trees, among others, are applied to the train set, and boosting and ensemble techniques are applied to see their effect on model accuracy. In addition, K-fold cross-validation is used over the train set for hyperparameter tuning and to prevent overfitting. Finally, the results obtained on the test set are evaluated using the confusion matrix and the AUC curve. The AdaBoost and XGBoost classifiers give the highest accuracies, 81.71% and 80.8% respectively. The highest AUC score, 84%, is achieved by both the AdaBoost and XGBoost classifiers, which outperform the other models.
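The split and evaluation steps named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the helper names (`split_80_20`, `confusion_matrix`, `accuracy`, `roc_auc`), the 0.5 decision threshold, and the toy labels and scores are all assumptions made for illustration, and only the Python standard library is used.

```python
import random


def split_80_20(rows, seed=0):
    """Shuffle a copy of rows and split it 80% train / 20% test."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    cut = int(0.8 * len(shuffled))
    return shuffled[:cut], shuffled[cut:]


def confusion_matrix(y_true, y_pred):
    """Return (tn, fp, fn, tp) for binary labels, where 1 means churned."""
    tn = fp = fn = tp = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            tp += 1
        elif truth == 1:
            fn += 1
        elif pred == 1:
            fp += 1
        else:
            tn += 1
    return tn, fp, fn, tp


def accuracy(y_true, y_pred):
    """Fraction of customers whose churn label was predicted correctly."""
    tn, fp, fn, tp = confusion_matrix(y_true, y_pred)
    return (tp + tn) / (tn + fp + fn + tp)


def roc_auc(y_true, scores):
    """AUC as the probability that a random churner is scored above a
    random non-churner (ties count as half a win)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one example of each class")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical test-set labels and model scores (not data from the paper).
y_true = [0, 0, 1, 1, 0, 1, 0, 0, 1, 0]
scores = [0.1, 0.4, 0.35, 0.8, 0.2, 0.9, 0.6, 0.3, 0.7, 0.05]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

train, test = split_80_20(range(100))
print(len(train), len(test))
print(confusion_matrix(y_true, y_pred))
print(accuracy(y_true, y_pred), roc_auc(y_true, scores))
```

Accuracy depends on the chosen threshold, whereas AUC is computed from the ranking of the scores alone, which is one reason the two metrics are reported side by side.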

Author information

Corresponding author

Correspondence to Praveen Lalwani.

Cite this article

Lalwani, P., Mishra, M.K., Chadha, J.S. et al. Customer churn prediction system: a machine learning approach. Computing (2021). https://doi.org/10.1007/s00607-021-00908-y

Keywords

  • Customer Churn Prediction
  • Machine Learning
  • Predictive Modeling
  • Confusion Matrix
  • AUC Curve

Mathematics Subject Classification

  • 68T01
  • 68T05