Knowledge and Information Systems, Volume 51, Issue 3, pp 719–774

Intelligent data analysis approaches to churn as a business problem: a survey

  • David L. García
  • Àngela Nebot
  • Alfredo Vellido

Survey Paper


Globalization processes and market deregulation policies are rapidly changing the competitive environments of many economic sectors. The appearance of new competitors and technologies increases competition and, with it, the concern of service-providing companies with creating stronger customer bonds. In this context, anticipating the customer’s intention to abandon the provider, a phenomenon known as churn, becomes a competitive advantage. Such anticipation can result from the correct application of information-based knowledge extraction in the form of business analytics. In particular, the use of intelligent data analysis, or data mining, for the analysis of market-surveyed information can be of great assistance to churn management. In this paper, we provide a detailed survey of recent applications of business analytics to churn, with a focus on computational intelligence methods. This is preceded by an in-depth discussion of churn within the context of customer continuity management. The survey is structured according to the stages identified as basic for building predictive models of churn, as well as according to the types of predictive methods employed and the business areas of their application.


Churn analysis · Intelligent data analysis · Computational intelligence · Customer continuity management · Literature survey



We thank the anonymous reviewers for their useful comments and suggestions. This research was partially supported by the Spanish MINECO research project TIN2012-31377.



Copyright information

© Springer-Verlag London 2016

Authors and Affiliations

  1. Department of Computer Science, Universitat Politècnica de Catalunya (UPC BarcelonaTech), Barcelona, Spain
