Abstract
Machine learning is now key to solving problems across many fields, and practitioners should understand its relevance to their own domain. One of its major applications is predictive analytics. In today's saturated market, churn prediction is a key step toward customer retention [31], so any toolkit that offers insight into churn can be highly beneficial to service-providing companies. A further difficulty that business analysts face in this process is deciding which classifier to select: in the continuously evolving field of machine learning, where developers constantly propose new algorithms, it is hard for analysts to stay informed about all the available options. In this work, we analyze and compare the performance of over 100 classifiers, drawn from well-known families, for churn prediction at a telecom company. This study can serve as a first step for any data scientist who wants to build a churn prediction system, and we also seek to identify the algorithms that yield the best results. Churn prediction is a mildly imbalanced classification problem, and class imbalance degrades classifier performance. The Regularized Random Forest classifier achieves the highest accuracy; since the problem is imbalanced, we also consider the area under the Receiver Operating Characteristic (ROC) curve, on which a bagged Random Forest produces the best result.
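As an illustration of the evaluation protocol described above (not the authors' actual toolchain or dataset), the sketch below compares a random forest and a bagged ensemble on a synthetic, mildly imbalanced binary task, reporting both accuracy and ROC AUC, since accuracy alone can be misleading when one class dominates. The dataset, model settings, and class ratio are assumptions for demonstration only.

```python
# Illustrative sketch: comparing classifiers on a mildly imbalanced binary
# task using both accuracy and ROC AUC (scikit-learn, synthetic data).
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier, RandomForestClassifier
from sklearn.metrics import accuracy_score, roc_auc_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for a telecom churn dataset: ~20% churners (minority class).
X, y = make_classification(n_samples=2000, n_features=20,
                           weights=[0.8, 0.2], random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

classifiers = {
    "random_forest": RandomForestClassifier(n_estimators=100, random_state=0),
    "bagged_trees": BaggingClassifier(n_estimators=100, random_state=0),
}

scores = {}
for name, clf in classifiers.items():
    clf.fit(X_tr, y_tr)
    scores[name] = {
        # Accuracy: fraction of correct predictions (inflated by the majority class).
        "accuracy": accuracy_score(y_te, clf.predict(X_te)),
        # ROC AUC: threshold-independent ranking quality, more informative here.
        "roc_auc": roc_auc_score(y_te, clf.predict_proba(X_te)[:, 1]),
    }
    print(name, scores[name])
```

On imbalanced data the two metrics can rank classifiers differently, which is why the paper reports both.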
References
Abbott, Dean (2014). Applied predictive analytics: principles and techniques for the professional data analyst. John Wiley & Sons
Aha D, Kibler D (1991) Instance-based learning algorithms. Mach Learn 6:37–66
Arun Kumar M, Gopal M (2009) Least squares twin support vector machines for pattern classification. Expert Syst Appl 36:7535–7543
Borah, Parashjyoti, and Deepak Gupta (2019). “Functional iterative approaches for solving support vector classification problems based on generalized Huber loss.” Neural Comput & Applic: 1–21
Bouckaert, Remco “class BayesNet” https://weka.sourceforge.io/doc.dev/weka/classifiers/bayes/BayesNet.html Accessed October 2019
Breiman L (1996) Bagging predictors. Mach Learn 24(2):123–140
Castro EG, Tsuzuki MSG (2015) Churn prediction in online games using players’ login records: a frequency analysis approach. IEEE Transactions on Computational Intelligence and AI in Games 7(3):255–265
Chih-Chung Chang, Chih-Jen Lin (2001). LIBSVM - a library for support vector machines. URL http://www.csie.ntu.edu.tw/~cjlin/libsvm/
William W Cohen (1995). Fast effective rule induction. In: Twelfth International Conference on Machine Learning, 115-123
Dalvi, Preeti K, et al (2016). “Analysis of customer churn prediction in telecom industry using decision trees and logistic regression.” 2016 Symposium on Colossal Data Analysis and Networking (CDAN). IEEE
Diez JJR et al (2006) Rotation Forest: a new classifier ensemble method. IEEE Trans Pattern Anal Mach Intell 28:1619–1630
Rong-En Fan, Kai-Wei Chang, Cho-Jui Hsieh, Xiang-Rui Wang, Chih-Jen Lin (2008). LIBLINEAR - a library for large linear classification. URL http://www.csie.ntu.edu.tw/~cjlin/liblinear/
Farquad MAH, Ravi V, Bapi Raju S (2012) Analytical CRM in banking and finance using SVM: a modified active learning-based rule extraction approach. International Journal of Electronic Customer Relationship Management 6(1):48–73
Fernández A, del Jesus MJ, Herrera F (2010) On the 2-tuples based genetic tuning performance for fuzzy rule based classification systems in imbalanced data-sets. Inf Sci 180(8):1268–1291
Fernández-Delgado M et al (2014) Do we need hundreds of classifiers to solve real world classification problems? The journal of machine learning research 15(1):3133–3181
Frank, Eibe (2014). “Fully supervised training of Gaussian radial basis function networks in WEKA.” : 1–5
Eibe Frank, Mark Hall (2001). A Simple Approach to Ordinal Classification. In: 12th European Conference on Machine Learning, 145–156
Eibe Frank, Mark Hall, Bernhard Pfahringer (2003). Locally Weighted Naive Bayes. In: 19th Conference in Uncertainty in Artificial Intelligence, 249–256
Frank E, Wang Y, Inglis S, Holmes G, Witten IH (1998) Using model trees for classification. Mach Learn 32(1):63–76
Eibe Frank, Ian H. Witten (1998). Generating accurate rule sets without global optimization. In: Fifteenth International Conference on Machine Learning, 144-151
Holte RC (1993) Very simple classification rules perform well on most commonly used datasets. Mach Learn 11:63–91
Huang, Guang-Bin, Qin-Yu Zhu, and Chee-Kheong Siew (2004). “Extreme learning machine: a new learning scheme of feedforward neural networks.” 2004 IEEE international joint conference on neural networks (IEEE Cat. No. 04CH37541). Vol. 2. IEEE
Idris, Adnan, Asifullah Khan, and Yeon Soo Lee (2012). “Genetic programming and adaboosting based churn prediction for telecom.” 2012 IEEE International Conference on Systems, Man, and Cybernetics (SMC). IEEE
Ismail MR, Awang MK, Rahman MNA, Makhtar M (2015) A multi-layer perceptron approach for customer churn prediction. International Journal of Multimedia and Ubiquitous Engineering 10(7):213–222
John G Cleary, Leonard E Trigg (1995). K*: An Instance-based Learner Using an Entropic Distance Measure. In: 12th International Conference on Machine Learning, 108–114
George H John, Pat Langley (1995). Estimating continuous distributions in Bayesian classifiers. In: Eleventh Conference on Uncertainty in Artificial Intelligence, San Mateo, 338-345
Jayadeva, Khemchandani R, Chandra S (2007) Twin support vector machines for pattern classification. IEEE Trans Pattern Anal Mach Intell 29(5):905–910
Ron Kohavi (1995). The Power of Decision Tables. In: 8th European Conference on Machine Learning, 174–189
R Kohavi (1995). Wrappers for performance enhancement and oblivious decision graphs. Department of Computer Science, Stanford University
Ron Kohavi (1996). Scaling up the accuracy of naive-Bayes classifiers: a decision-tree hybrid. In: Second International Conference on Knowledge Discovery and Data Mining, 202-207
Kumar, Krishan (2013). “Customer retention strategies of telecom service providers.”
Ludmila I Kuncheva (2004). Combining pattern classifiers: methods and algorithms. John Wiley and Sons, Inc
le Cessie S, van Houwelingen JC (1992) Ridge estimators in logistic regression. Appl Stat 41(1):191–201
Lin Dong, Eibe Frank, Stefan Kramer (2005). Ensembles of balanced nested dichotomies for multi-class problems. In: PKDD, 84-95
MATLAB. (2018). (R2018b). Natick, Massachusetts: the MathWorks Inc
P Melville, RJ Mooney (2003). Constructing diverse classifier ensembles using artificial training examples. In: Eighteenth International Joint Conference on Artificial Intelligence, 505-510
Mozer, Michael, Richard Wolniewicz, Eric Johnson and Howard Kaushansky. (1999). Churn reduction in the wireless industry, proceedings of the neural information processing systems conference, San Diego, CA
Nasiri JA, Charkari NM, Jalili S (2015) Least squares twin multi-class classification support vector machine. Pattern Recogn 48(3):984–992
Pamina. (2018). Telecom churn, Teradata center for customer relationship management at Duke University. Version 2. Retrieved 2019 September
Pao Y-H, Takefuji Y (1992) Functional-link net computing: theory, system architecture, and functionalities. Computer 25(5):76–79
Peng XJ, Xu D, Kong LY, Chen DJ (2016) L1-norm loss based twin support vector machine for data recognition. Inf Sci 340–341:86–103
J Platt (1998). Fast Training of Support Vector Machines using Sequential Minimal Optimization. In B. Schoelkopf and C. Burges and A. Smola, editors, Advances in Kernel Methods - Support Vector Learning
Quinlan, John R (1992). “Learning with continuous classes.” 5th Australian joint conference on artificial intelligence. Vol. 92
Quinlan R (1993) C4.5: Programs for Machine Learning. Morgan Kaufmann Publishers, San Mateo, CA
R Core Team (2013). R: a language and environment for statistical computing. R Foundation for statistical computing, Vienna, Austria. URL http://www.R-project.org/
Richeldi, Marco, and Alessandro Perrucci (2002). “Churn analysis case study.” Deliverable D17 2
Richhariya B, Sharma A, Tanveer M (2018) “Improved universum twin support vector machine.” 2018 IEEE Symposium Series on Computational Intelligence (SSCI). IEEE
Richhariya B, Tanveer M (2018) EEG signal classification using universum support vector machine. Expert Syst Appl 106:169–182
Richhariya B, Tanveer M (2020) “A reduced universum twin support vector machine for class imbalance learning.” Pattern Recogn: 107150
RStudio Team (2015). RStudio: integrated development for R. RStudio, Inc., Boston, MA URL http://www.rstudio.com/
Savitha R, Suresh S, Sundararajan N (2012) Fast learning circular complex-valued extreme learning machine (CC-ELM) for real-valued classification problems. Inf Sci 187:277–290
Pedregosa F et al (2011) Scikit-learn: machine learning in Python. J Mach Learn Res 12:2825–2830
Seni G, Elder JF (2010) Ensemble methods in data mining: improving accuracy through combining predictions. Synthesis lectures on data mining and knowledge discovery 2(1):1–126
Shankar, K, et al (2018). “Optimal feature-based multi-kernel SVM approach for thyroid disease classification.” J Supercomput: 1–16
Shao Y-H et al (2011) Improvements on twin support vector machines. IEEE Trans Neural Netw 22(6):962–968
Sharma, Sweta, Reshma Rastogi, and Suresh Chandra (2019). “Large-scale twin parametric support vector machine using pinball loss function.” IEEE Transactions on Systems, Man, and Cybernetics: Systems
Marc Sumner, Eibe Frank, Mark Hall (2005). Speeding up Logistic Model Tree Induction. In: 9th European Conference on Principles and Practice of Knowledge Discovery in Databases, 675–683
Suresh S, Sundararajan N, Saratchandran P (2008) Risk-sensitive loss functions for sparse multi-category classification problems. Inf Sci 178(12):2621–2638
Tanveer M, Gautam C, Suganthan PN (2019) Comprehensive evaluation of twin SVM based classifiers on UCI datasets. Appl Soft Comput 83:105617
Tanveer M, Khan MA, Ho S-S (2016) Robust energy-based least squares twin support vector machines. Appl Intell 45(1):174–186
Tanveer M, Tiwari A, Choudhary R, Jalan S (2019) Sparse pinball twin support vector machines. Appl Soft Comput 78:164–175
Tanveer, M, et al (2020). “Machine learning techniques for the diagnosis of Alzheimer’s disease: A review.” ACM Transactions on Multimedia Computing, Communications, and Applications (TOMM) 16.1s : 1–35
Ting, KM, Witten, IH (1997). Stacking Bagged and Dagged Models. In: Fourteenth international Conference on Machine Learning, San Francisco, CA, 367-375
Vafeiadis T, Diamantaras KI, Sarigiannidis G, Chatzisavvas KC (2015) A comparison of machine learning techniques for customer churn prediction. Simul Model Pract Theory 55:1–9
Vapnik VN (1998) Statistical learning theory. John Wiley & Sons, New York
Geoffrey I. Webb (2000). MultiBoosting: A Technique for Combining Boosting and Wagging. Machine Learning. Vol.40 (No.2)
Witten IH, Frank E (2005) Data mining: practical machine learning tools and techniques, 2nd edn. Morgan Kaufmann, San Francisco
Xie Y, Li X, Ngai EWT, Ying W (2009) Customer churn prediction using improved balanced random forests. Expert Syst Appl 36(3):5445–5449
Zhao Y, Li B, Li X, Liu W, Ren S (2005) Customer Churn Prediction Using Improved One-Class Support Vector Machine. In: Li X, Wang S, Dong ZY (eds) Advanced Data Mining and Applications. ADMA 2005. Lecture notes in computer science, vol 3584. Springer, Berlin, Heidelberg
Ethics declarations
Conflict of interest
None.
Cite this article
Adhikary, D.D., Gupta, D. Applying over 100 classifiers for churn prediction in telecom companies. Multimed Tools Appl 80, 35123–35144 (2021). https://doi.org/10.1007/s11042-020-09658-z