A systematic framework of predicting customer revisit with in-store sensors

  • Sundong Kim
  • Jae-Gil Lee
Regular Paper


A growing number of off-line stores are willing to conduct customer behavior analysis. In particular, predicting revisit intention is of prime importance, because converting first-time visitors into loyal customers is very profitable. Thanks to noninvasive monitoring, shopping behaviors and revisit statistics have become available for a large proportion of customers who turn on their mobile devices. In this paper, we propose a systematic framework to predict the revisit intention of customers using Wi-Fi signals captured by in-store sensors. Using data collected from seven flagship stores in downtown Seoul, we achieved 67–80% prediction accuracy for all customers and 64–72% prediction accuracy for first-time visitors. Considering customer mobility improved the performance by 4.7–24.3%. Furthermore, we provide an in-depth analysis of the effect of the data collection period and of visit frequency on the prediction performance, and we show the robustness of our model to missing customers. We released some tutorials and benchmark datasets for revisit prediction at


Revisit prediction, Retail analytics, Predictive analytics, Feature engineering, Marketing, Mobility data



This work was supported by the National Research Foundation of Korea (NRF) grant funded by the Korea government (Ministry of Science and ICT) (No. 2017R1E1A1A01075927). We thank Minseok Kim for helping with the surveys of off-line stores and for drawing the floor plans. We also thank ZOYI for active discussions regarding the datasets.




Copyright information

© Springer-Verlag London Ltd., part of Springer Nature 2019

Authors and Affiliations

  1. Graduate School of Knowledge Service Engineering, KAIST, Daejeon, Republic of Korea
  2. Department of Industrial and Systems Engineering, KAIST, Daejeon, Republic of Korea
