A predictive investigation of first-time customer retention in online reservation services

  • Yen-Chun Chou
  • Howard Hao-Chun Chuang
Empirical article


This paper reports a predictive investigation of first-time customer retention in an emerging service business: online reservation services. We work with an online platform that enables customers to make reservations at various types of restaurants. With numerous first-time users on the platform, the focal company is eager to identify which of them will become recurring customers. However, the business problem is challenging because each first-time customer has one and only one booking record, which hinders the use of well-established marketing models that demand multiple booking records per customer. By analyzing more than 100,000 observations, we extract booking-related features that are useful in predicting first-time customer retention. Our feature extraction is potentially applicable to other service sectors (e.g., hotel, travel) with similar booking information fields (e.g., reservation timing, party size). We further conduct a comparative study in which, surprisingly, the seemingly simplistic generalized additive model (GAM) consistently outperforms computationally intensive ensemble learning methods in our test cases, including the cutting-edge XGBoost. Our analysis indicates that there is no silver bullet for applied predictive modeling and that GAM should be included in the arsenal of business researchers. We conclude by discussing the implications of our study for online service providers and business data analytics.
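
As a rough illustration of what extracting booking-related features from a single record might look like, the sketch below derives a few candidate predictors from one reservation. It is not the authors' feature set: the `Booking` fields, the `extract_features` helper, and the specific features (lead time, dining hour, weekend flag) are hypothetical and chosen only because the abstract mentions reservation timing and party size.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Booking:
    """One first-time booking record (all field names are hypothetical)."""
    booked_at: datetime   # when the reservation was made
    dining_at: datetime   # when the party is due to dine
    party_size: int


def extract_features(b: Booking) -> dict:
    """Turn a single booking into a flat dictionary of candidate predictors."""
    lead = b.dining_at - b.booked_at
    return {
        # Reservation timing: how far ahead the customer booked
        "lead_time_hours": lead.total_seconds() / 3600.0,
        "party_size": b.party_size,
        "dining_hour": b.dining_at.hour,
        # weekday() returns 5 for Saturday and 6 for Sunday
        "is_weekend": int(b.dining_at.weekday() >= 5),
    }


# Illustrative record, not taken from the study's data
example = Booking(
    booked_at=datetime(2017, 3, 1, 10, 0),
    dining_at=datetime(2017, 3, 4, 19, 30),
    party_size=4,
)
features = extract_features(example)
```

Because every feature comes from one booking, a table built this way could be passed to any classifier, GAM or tree ensemble alike, without requiring a purchase history for each customer.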


E-services; First-time customer retention; Prediction; Analytics; Statistical learning



Copyright information

© Springer-Verlag GmbH Germany, part of Springer Nature 2018

Authors and Affiliations

  1. College of Commerce, National Chengchi University, Taipei, Taiwan
