Applying over 100 classifiers for churn prediction in telecom companies

Multimedia Tools and Applications

Abstract

Machine learning now underpins solutions to problems in many fields, and practitioners should understand its relevance to their own domain. One of its major applications is predictive analytics. Churn prediction is a key step toward customer retention in today’s saturated market [31], and any toolkit offering insight into it can be of real benefit to service-providing companies. A major difficulty that business analysts face in this process, however, is deciding which classifier to select: in a continuously evolving field where developers constantly propose new machine learning algorithms, analysts often struggle to keep track of the available options. In this work, we analyze and compare the performance of over 100 classifiers, drawn from well-known families, for churn prediction in a telecom company. This study can serve as a first step for any data scientist who wants to build a churn prediction system, and we also aim to identify the algorithms that perform best. Churn prediction is a mildly imbalanced classification problem, which degrades classifier performance. The highest accuracy is achieved by the Regularized Random Forest classifier. Because the problem is imbalanced, we also consider the area under the Receiver Operating Characteristic (ROC) curve, and by that measure the Bagging Random Forest classifier produces the best result.
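For readers who want to reproduce this style of comparison, the following minimal sketch (not the authors' code) scores a few representative classifiers on a churn dataset by both accuracy and area under the ROC curve, the two criteria discussed above. It assumes scikit-learn and a hypothetical churn.csv file with numeric feature columns and a binary churn label; since the Regularized Random Forest (R's RRF package) and Weka-style bagging ensembles have no exact scikit-learn counterparts, a plain random forest and bagged decision trees stand in purely for illustration.

import pandas as pd
from sklearn.ensemble import BaggingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score

# Hypothetical dataset: numeric feature columns plus a binary "churn" label.
df = pd.read_csv("churn.csv")
X, y = df.drop(columns=["churn"]), df["churn"]

# Illustrative stand-ins for some of the classifier families compared in the paper.
classifiers = {
    "random_forest": RandomForestClassifier(n_estimators=500, random_state=0),
    "bagged_trees": BaggingClassifier(n_estimators=100, random_state=0),
    "logistic_regression": LogisticRegression(max_iter=1000),
}

# Stratified folds preserve the churn/non-churn ratio in each split,
# which matters because the problem is mildly imbalanced.
cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)
for name, clf in classifiers.items():
    acc = cross_val_score(clf, X, y, cv=cv, scoring="accuracy").mean()
    auc = cross_val_score(clf, X, y, cv=cv, scoring="roc_auc").mean()
    print(f"{name:20s} accuracy={acc:.3f}  roc_auc={auc:.3f}")

On an imbalanced set, a classifier can reach high accuracy simply by rarely predicting churn, so ranking by ROC AUC as well, as done here, guards against that failure mode.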

References

  1. Abbott D (2014) Applied predictive analytics: principles and techniques for the professional data analyst. John Wiley & Sons

  2. Aha D, Kibler D (1991) Instance-based learning algorithms. Mach Learn 6:37–66

  3. Arun Kumar M, Gopal M (2009) Least squares twin support vector machines for pattern classification. Expert Syst Appl 36:7535–7543

  4. Borah P, Gupta D (2019) Functional iterative approaches for solving support vector classification problems based on generalized Huber loss. Neural Comput & Applic: 1–21

  5. Bouckaert R. Class BayesNet. https://weka.sourceforge.io/doc.dev/weka/classifiers/bayes/BayesNet.html. Accessed October 2019

  6. Breiman L (1996) Bagging predictors. Mach Learn 24(2):123–140

  7. Castro EG, Tsuzuki MSG (2015) Churn prediction in online games using players’ login records: a frequency analysis approach. IEEE Transactions on Computational Intelligence and AI in Games 7(3):255–265

  8. Chang C-C, Lin C-J (2001) LIBSVM: a library for support vector machines. http://www.csie.ntu.edu.tw/~cjlin/libsvm/

  9. Cohen WW (1995) Fast effective rule induction. In: Twelfth International Conference on Machine Learning, 115–123

  10. Dalvi PK et al (2016) Analysis of customer churn prediction in telecom industry using decision trees and logistic regression. In: 2016 Symposium on Colossal Data Analysis and Networking (CDAN). IEEE

  11. Diez JJR et al (2006) Rotation Forest: a new classifier ensemble method. IEEE Trans Pattern Anal Mach Intell 28:1619–1630

  12. Fan R-E, Chang K-W, Hsieh C-J, Wang X-R, Lin C-J (2008) LIBLINEAR: a library for large linear classification. http://www.csie.ntu.edu.tw/~cjlin/liblinear/

  13. Farquad MAH, Ravi V, Bapi Raju S (2012) Analytical CRM in banking and finance using SVM: a modified active learning-based rule extraction approach. International Journal of Electronic Customer Relationship Management 6(1):48–73

  14. Fernández A, del Jesus MJ, Herrera F (2010) On the 2-tuples based genetic tuning performance for fuzzy rule based classification systems in imbalanced data-sets. Inf Sci 180(8):1268–1291

  15. Fernández-Delgado M et al (2014) Do we need hundreds of classifiers to solve real world classification problems? J Mach Learn Res 15(1):3133–3181

  16. Frank E (2014) Fully supervised training of Gaussian radial basis function networks in WEKA: 1–5

  17. Frank E, Hall M (2001) A simple approach to ordinal classification. In: 12th European Conference on Machine Learning, 145–156

  18. Frank E, Hall M, Pfahringer B (2003) Locally weighted naive Bayes. In: 19th Conference on Uncertainty in Artificial Intelligence, 249–256

  19. Frank E, Wang Y, Inglis S, Holmes G, Witten IH (1998) Using model trees for classification. Mach Learn 32(1):63–76

  20. Frank E, Witten IH (1998) Generating accurate rule sets without global optimization. In: Fifteenth International Conference on Machine Learning, 144–151

  21. Holte RC (1993) Very simple classification rules perform well on most commonly used datasets. Mach Learn 11:63–91

  22. Huang G-B, Zhu Q-Y, Siew C-K (2004) Extreme learning machine: a new learning scheme of feedforward neural networks. In: 2004 IEEE International Joint Conference on Neural Networks, vol 2. IEEE

  23. Idris A, Khan A, Lee YS (2012) Genetic programming and adaboosting based churn prediction for telecom. In: 2012 IEEE International Conference on Systems, Man, and Cybernetics (SMC). IEEE

  24. Ismail MR, Awang MK, Rahman MNA, Makhtar M (2015) A multi-layer perceptron approach for customer churn prediction. International Journal of Multimedia and Ubiquitous Engineering 10(7):213–222

  25. Cleary JG, Trigg LE (1995) K*: an instance-based learner using an entropic distance measure. In: 12th International Conference on Machine Learning, 108–114

  26. John GH, Langley P (1995) Estimating continuous distributions in Bayesian classifiers. In: Eleventh Conference on Uncertainty in Artificial Intelligence, San Mateo, 338–345

  27. Jayadeva, Khemchandani R, Chandra S (2007) Twin support vector machines for pattern classification. IEEE Trans Pattern Anal Mach Intell 29(5):905–910

  28. Kohavi R (1995) The power of decision tables. In: 8th European Conference on Machine Learning, 174–189

  29. Kohavi R (1995) Wrappers for performance enhancement and oblivious decision graphs. Department of Computer Science, Stanford University

  30. Kohavi R (1996) Scaling up the accuracy of naive-Bayes classifiers: a decision-tree hybrid. In: Second International Conference on Knowledge Discovery and Data Mining, 202–207

  31. Kumar K (2013) Customer retention strategies of telecom service providers

  32. Kuncheva LI (2004) Combining pattern classifiers: methods and algorithms. John Wiley & Sons

  33. le Cessie S, van Houwelingen JC (1992) Ridge estimators in logistic regression. Appl Stat 41(1):191–201

  34. Dong L, Frank E, Kramer S (2005) Ensembles of balanced nested dichotomies for multi-class problems. In: PKDD, 84–95

  35. MATLAB (2018) R2018b. The MathWorks Inc, Natick, Massachusetts

  36. Melville P, Mooney RJ (2003) Constructing diverse classifier ensembles using artificial training examples. In: Eighteenth International Joint Conference on Artificial Intelligence, 505–510

  37. Mozer M, Wolniewicz R, Johnson E, Kaushansky H (1999) Churn reduction in the wireless industry. In: Proceedings of the Neural Information Processing Systems Conference, San Diego, CA

  38. Nasiri JA, Charkari NM, Jalili S (2015) Least squares twin multi-class classification support vector machine. Pattern Recogn 48(3):984–992

  39. Pamina (2018) Telecom churn, Teradata Center for Customer Relationship Management at Duke University. Version 2. Retrieved September 2019

  40. Pao Y-H, Takefuji Y (1992) Functional-link net computing: theory, system architecture, and functionalities. Computer 25(5):76–79

  41. Peng XJ, Xu D, Kong LY, Chen DJ (2016) L1-norm loss based twin support vector machine for data recognition. Inf Sci 340–341:86–103

  42. Platt J (1998) Fast training of support vector machines using sequential minimal optimization. In: Schoelkopf B, Burges C, Smola A (eds) Advances in Kernel Methods: Support Vector Learning

  43. Quinlan JR (1992) Learning with continuous classes. In: 5th Australian Joint Conference on Artificial Intelligence, vol 92

  44. Quinlan R (1993) C4.5: Programs for Machine Learning. Morgan Kaufmann Publishers, San Mateo, CA

  45. R Core Team (2013) R: a language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. http://www.R-project.org/

  46. Richeldi M, Perrucci A (2002) Churn analysis case study. Deliverable D17 2

  47. Richhariya B, Sharma A, Tanveer M (2018) Improved universum twin support vector machine. In: 2018 IEEE Symposium Series on Computational Intelligence (SSCI). IEEE

  48. Richhariya B, Tanveer M (2018) EEG signal classification using universum support vector machine. Expert Syst Appl 106:169–182

  49. Richhariya B, Tanveer M (2020) A reduced universum twin support vector machine for class imbalance learning. Pattern Recogn: 107150

  50. RStudio Team (2015) RStudio: integrated development for R. RStudio Inc, Boston, MA. http://www.rstudio.com/

  51. Savitha R, Suresh S, Sundararajan N (2012) Fast learning circular complex-valued extreme learning machine (CC-ELM) for real-valued classification problems. Inf Sci 187:277–290

  52. Pedregosa F et al (2011) Scikit-learn: machine learning in Python. J Mach Learn Res 12:2825–2830

  53. Seni G, Elder JF (2010) Ensemble methods in data mining: improving accuracy through combining predictions. Synthesis lectures on data mining and knowledge discovery 2(1):1–126

  54. Shankar K et al (2018) Optimal feature-based multi-kernel SVM approach for thyroid disease classification. J Supercomput: 1–16

  55. Shao Y-H et al (2011) Improvements on twin support vector machines. IEEE Trans Neural Netw 22(6):962–968

  56. Sharma S, Rastogi R, Chandra S (2019) Large-scale twin parametric support vector machine using pinball loss function. IEEE Transactions on Systems, Man, and Cybernetics: Systems

  57. Sumner M, Frank E, Hall M (2005) Speeding up logistic model tree induction. In: 9th European Conference on Principles and Practice of Knowledge Discovery in Databases, 675–683

  58. Suresh S, Sundararajan N, Saratchandran P (2008) Risk-sensitive loss functions for sparse multi-category classification problems. Inf Sci 178(12):2621–2638

  59. Tanveer M, Gautam C, Suganthan PN (2019) Comprehensive evaluation of twin SVM based classifiers on UCI datasets. Appl Soft Comput 83:105617

  60. Tanveer M, Khan MA, Ho S-S (2016) Robust energy-based least squares twin support vector machines. Appl Intell 45(1):174–186

  61. Tanveer M, Tiwari A, Choudhary R, Jalan S (2019) Sparse pinball twin support vector machines. Appl Soft Comput 78:164–175

  62. Tanveer M et al (2020) Machine learning techniques for the diagnosis of Alzheimer’s disease: a review. ACM Transactions on Multimedia Computing, Communications, and Applications (TOMM) 16(1s):1–35

  63. Ting KM, Witten IH (1997) Stacking bagged and dagged models. In: Fourteenth International Conference on Machine Learning, San Francisco, CA, 367–375

  64. Vafeiadis T, Diamantaras KI, Sarigiannidis G, Chatzisavvas KC (2015) A comparison of machine learning techniques for customer churn prediction. Simul Model Pract Theory 55:1–9

  65. Vapnik VN (1998) Statistical learning theory. John Wiley & Sons, New York

  66. Webb GI (2000) MultiBoosting: a technique for combining boosting and wagging. Mach Learn 40(2)

  67. Witten IH, Frank E (2005) Data mining: practical machine learning tools and techniques, 2nd edn. Morgan Kaufmann, San Francisco

  68. Xie Y, Li X, Ngai EWT, Ying W (2009) Customer churn prediction using improved balanced random forests. Expert Syst Appl 36(3):5445–5449

  69. Zhao Y, Li B, Li X, Liu W, Ren S (2005) Customer Churn Prediction Using Improved One-Class Support Vector Machine. In: Li X, Wang S, Dong ZY (eds) Advanced Data Mining and Applications. ADMA 2005. Lecture notes in computer science, vol 3584. Springer, Berlin, Heidelberg

Author information

Corresponding author

Correspondence to Deepak Gupta.

Ethics declarations

Conflict of interest

None.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Adhikary, D.D., Gupta, D. Applying over 100 classifiers for churn prediction in telecom companies. Multimed Tools Appl 80, 35123–35144 (2021). https://doi.org/10.1007/s11042-020-09658-z
