A systematic framework of predicting customer revisit with in-store sensors

Abstract

A growing number of off-line stores are willing to conduct customer behavior analysis. In particular, predicting revisit intention is of prime importance, because converting first-time visitors into loyal customers is very profitable. Thanks to noninvasive monitoring, shopping behaviors and revisit statistics have become available for a large proportion of customers who keep their mobile devices turned on. In this paper, we propose a systematic framework to predict the revisit intention of customers using Wi-Fi signals captured by in-store sensors. Using data collected from seven flagship stores in downtown Seoul, we achieved 67–80% prediction accuracy for all customers and 64–72% prediction accuracy for first-time visitors. Considering customer mobility improved the performance by 4.7–24.3%. Furthermore, we provide an in-depth analysis of the effect of the data collection period and visit frequency on the prediction performance, and we demonstrate the robustness of our model to missing customers. We have released tutorials and benchmark datasets for revisit prediction at https://github.com/kaist-dmlab/revisit.

Notes

  1. The proportion of users in their twenties who keep their Wi-Fi on is 29.2%, according to a survey by Korea Telecom (July 2015).
  2. In Fig. 2, the ratio of first-time visitors in store E_GN is over 70%. We made a few assumptions in order to interpret the data as is; they are discussed in Appendix D.
  3. Owing to a nondisclosure agreement, additional store information cannot be disclosed. Readers may assume that dozens of sensors cover the other stores in a similar manner.
  4. https://walkinsights.com/sensors.
  5. Based on the results of Sect. 4.2.3, we consider it safe to perform cross-validation with our model.
  6. For this experiment, we included visit count and date in our feature set, so the overall accuracy is slightly higher than the values reported in Table 4.
  7. http://bit.ly/kaggle_taxi_interview_1st_nn.
  8. https://www.freshhema.com/.
  9. https://en.wikipedia.org/wiki/Amazon_Go.
  10. http://bit.ly/Guggenheim_App.
  11. Scikit-learn 0.20, which was the latest version at the time of submission, was used for the experiments.
  12. http://bit.ly/Kaggle_Guide_Stacking.
  13. We ran another five sets of fivefold cross-validation for this experiment. Thus, the values of the baselines in Table 9 differ slightly from those in Table 8, within the margin of error.

References

  1. Baumann P, Kleiminger W, Santini S (2013) The influence of temporal and spatial features on the performance of next-place prediction algorithms. In: Proceedings of the 2013 ACM international joint conference on pervasive and ubiquitous computing. ACM, pp 449–458
  2. Besse PC, Guillouet B, Loubes J-M, Royer F (2017) Destination prediction by trajectory distribution based model. IEEE Trans Intell Transp Syst 99:1–12
  3. Brébisson A, Simon É, Auvolat A, Vincent P, Bengio Y (2015) Artificial neural networks applied to taxi destination prediction. In: Proceedings of the 2015 ECML/PKDD discovery challenge. Springer, pp 40–51
  4. Chen T, Guestrin C (2016) XGBoost: a scalable tree boosting system. In: Proceedings of the 22nd ACM SIGKDD international conference on knowledge discovery and data mining. ACM, pp 785–794
  5. Friedman JH (2001) Greedy function approximation: a gradient boosting machine. Ann Stat 29(5):1189–1232
  6. Geng W, Yang G (2017) Partial correlation between spatial and temporal regularities of human mobility. Sci Rep 7:6249
  7. Giannotti F, Nanni M, Pinelli F, Pedreschi D (2007) Trajectory pattern mining. In: Proceedings of the 13th ACM SIGKDD international conference on knowledge discovery and data mining. ACM, pp 330–339
  8. Hui SK, Bradlow ET, Fader PS (2009) Testing behavioral hypotheses using an integrated model of grocery store shopping path and purchase behavior. J Consum Res 36(3):478–493
  9. Hwang I, Jang Y (2017) Process mining to discover shoppers’ pathways at a fashion retail store using a wifi-base indoor positioning system. IEEE Trans Autom Sci Eng 14:1786–1792
  10. Jung S, Lim C, Yoon S (2011) Study on selecting process of visitor’s movements in exhibition space. J Archit Inst Korea Plan Des 27(12):53–62
  11. Ke G, Meng Q, Finley T, Wang T, Chen W, Ma W, Ye Q, Liu T-Y (2017) LightGBM: a highly efficient gradient boosting decision tree. In: Advances in neural information processing systems, vol 30. Curran Associates, Inc, pp 3146–3154
  12. Kim S, Lee J-G (2018) Utilizing in-store sensors for revisit prediction. In: IEEE international conference on data mining. IEEE, pp 217–226
  13. Kim T, Chu M, Brdiczka O, Begole J (2009) Predicting shoppers’ interest from social interactions using sociometric sensors. In: CHI’09 extended abstracts on human factors in computing systems. ACM, pp 4513–4518
  14. Lee J-G, Han J, Li X (2011) Mining discriminative patterns for classifying trajectories on road networks. IEEE Trans Knowl Data Eng 23(5):713–726
  15. Lemaître G, Nogueira F, Aridas CK (2017) Imbalanced-learn: a Python toolbox to tackle the curse of imbalanced datasets in machine learning. J Mach Learn Res 18(17):1–5
  16. Lim C, Park H, Yoon S (2013) A study of an exhibitions space analysis according to visitor’s cognition. J Archit Inst Korea Plan Des 29(8):69–78
  17. Lim C, Yoon S (2010) Development of visual perception effects model for exhibition space. J Archit Inst Korea Plan Des 26(5):131–138
  18. Liu G, Nguyen TT, Zhao G, Zha W, Yang J, Cao J, Wu M, Zhao P, Chen W (2016) Repeat buyer prediction for E-commerce. In: Proceedings of the 22nd ACM SIGKDD international conference on knowledge discovery and data mining. ACM, pp 155–164
  19. Lu X, Wetter E, Bharti N, Tatem AJ, Bengtsson L (2013) Approaching the limit of predictability in human mobility. Sci Rep 3:2923
  20. Lv J, Li Q, Sun Q, Wang X (2018) T-CONV: a convolutional neural network for multi-scale taxi trajectory prediction. In: Proceedings of the 2018 IEEE international conference on big data and smart computing. IEEE, pp 82–89
  21. Martin J, Mayberry T, Donahue C, Foppe L, Brown L, Riggins C, Rye EC, Brown D (2017) A study of MAC address randomization in mobile devices and when it fails. Proc Priv Enhanc Technol 2017(4):365–383
  22. Mathew W, Raposo R, Martins B (2012) Predicting future locations with hidden Markov models. In: Proceedings of the 2012 ACM conference on ubiquitous computing. ACM, pp 911–918
  23. Monreale A, Pinelli F, Trasarti R, Giannotti F (2012) WhereNext: a location predictor on trajectory pattern mining. In: Proceedings of the 15th ACM SIGKDD international conference on knowledge discovery and data mining. ACM, pp 637–646
  24. OpenSignal, Inc (2016) Global state of mobile networks (August 2016). Technical report
  25. Park S, Jung S, Lim C (2001) A study on the pedestrian path choice in clothing outlets. Korean Inst Inter Des J 28:140–148
  26. Pedregosa F, Varoquaux G, Gramfort A, Michel V, Thirion B, Grisel O, Blondel M, Prettenhofer P, Weiss R, Dubourg V, Vanderplas J, Passos A, Cournapeau D, Brucher M, Perrot M, Duchesnay É (2011) Scikit-learn: machine learning in Python. J Mach Learn Res 12:2825–2830
  27. Peppers D, Rogers M (2016) Managing customer experience and relationships. Wiley, New York
  28. Prokhorenkova L, Gusev G, Vorobev A, Dorogush AV, Gulin A (2018) CatBoost: unbiased boosting with categorical features support. In: Advances in neural information processing systems, vol 31. Curran Associates, Inc, pp 6639–6649
  29. Ren Y, Tomko M, Salim FD, Ong K, Sanderson M (2017) Analyzing web behavior in indoor retail spaces. J Assoc Inf Sci Technol 68(1):62–76
  30. Sapiezynski P, Stopczynski A, Gatej R, Lehmann S (2015) Tracking human mobility using WiFi signals. PLoS ONE 10(7):e0130824
  31. Scellato S, Musolesi M, Mascolo C, Latora V, Campbell AT (2011) Nextplace: a spatio-temporal prediction framework for pervasive systems. In: Proceedings of the 9th international conference on pervasive computing. Springer, pp 152–169
  32. Sheth A, Seshan S, Wetherall D (2009) Geo-fencing: confining Wi-Fi coverage to physical boundaries. In: Proceedings of the 7th international conference on pervasive computing, pp 274–290
  33. Song C, Qu Z, Blumm N, Barabási A-L (2010) Limits of predictability in human mobility. Science 327(5968):1018–1021
  34. Stanković RS, Falkowski BJ (2003) The Haar wavelet transform: its status and achievements. Comput Electr Eng 29(1):25–44
  35. Syaekhoni A, Lee C, Kwon Y (2018) Analyzing customer behavior from shopping path data using operation edit distance. Appl Intell 48:1912–1932
  36. Tomko M, Ren Y, Ong K, Salim F, Sanderson M (2014) Large-scale indoor movement analysis: the data, context and analytical challenges. In: Proceedings of analysis of movement data, GIScience 2014 workshop
  37. Um S, Chon K, Ro Y (2006) Antecedents of revisit intention. Ann Tour Res 33(4):1141–1158
  38. Wolpert DH (1992) Stacked generalization. Neural Netw 5:241–259
  39. Xue AY, Zhang R, Zheng Y, Xie X, Huang J, Xu Z (2013) Destination prediction by sub-trajectory synthesis and privacy protection against such prediction. In: Proceedings of the 29th IEEE international conference on data engineering. IEEE, pp 254–265
  40. Yada K (2011) String analysis technique for shopping path in a supermarket. J Intell Inf Syst 36(3):385–402
  41. Yalowitz SS, Bronnenkant K (2009) Timing and tracking: unlocking visitor behavior. Visit Stud 12(1):47–64
  42. Yan X, Wang J, Chau M (2015) Customer revisit intention to restaurants: evidence from online reviews. Inf Syst Front 17:645–657
  43. Yan Z, Chakraborty D, Parent C, Spaccapietra S, Aberer K (2013) Semantic trajectories: mobility data computation and annotation. ACM Trans Intell Syst Technol 4(3):1–38
  44. Ying JJC, Lee WC, Weng TC, Tseng VS (2011) Semantic trajectory mining for location prediction. In: Proceedings of the 19th ACM SIGSPATIAL international conference on advances in geographic information systems. ACM, pp 34–43
  45. Yoshimura Y, Krebs A, Ratti C (2017) Noninvasive bluetooth monitoring of visitors’ length of stay at the louvre. IEEE Perv Comput 16(2):26–34

Acknowledgements

This work was supported by the National Research Foundation of Korea (NRF) grant funded by the Korea government (Ministry of Science and ICT) (No. 2017R1E1A1A01075927). We thank Minseok Kim for helping with surveys of off-line stores and for drawing floor plans. We also thank ZOYI for active discussions regarding the datasets.

Author information

Corresponding author

Correspondence to Jae-Gil Lee.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendices

A. Comparison on various classifiers

We compared the performance of eight classifiers. We used the default parameter settings for each classifier except for the tuned parameters listed below.

  • Classifiers provided by Scikit-learn [26] (see Note 11). The parameters used are summarized as follows.

    • LR (Logistic Regression): default settings.

    • DT (Decision Tree): max_depth = 4.

    • RF (Random Forests): n_estimators = 10.

    • AB (AdaBoost): default settings.

    • GB (Gradient Boosting): max_depth = 4.

  • Up-to-date boosting classifiers:

    • CAB (CatBoost): depth = 4, learning_rate = 0.1, iterations = 30.

    • XGB (XGBoost): max_depth = 4, learning_rate = 0.1.

    • LGB (LightGBM): max_depth = 4, learning_rate = 0.1.
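As a rough illustration, the Scikit-learn settings listed above could be instantiated as follows. This is a sketch rather than the authors' code: the dictionary name `SKLEARN_CLASSIFIERS` is hypothetical, `n_estimators` is the library's spelling of the forest-size parameter, and the three boosting libraries are left out so that the example depends on Scikit-learn only.

```python
from sklearn.ensemble import (
    AdaBoostClassifier,
    GradientBoostingClassifier,
    RandomForestClassifier,
)
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier

# Hypothetical registry keyed by the abbreviations used in this appendix.
# Only the parameters named in the list are set; everything else is left
# at the library default.
SKLEARN_CLASSIFIERS = {
    "LR": LogisticRegression(),                     # default settings
    "DT": DecisionTreeClassifier(max_depth=4),
    "RF": RandomForestClassifier(n_estimators=10),
    "AB": AdaBoostClassifier(),                     # default settings
    "GB": GradientBoostingClassifier(max_depth=4),
}
```

Library defaults differ between releases, so running this with a current release need not reproduce results obtained with version 0.20.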

Fig. 13 Comparison between classifiers. LGB turns out to be the most effective among all classifiers. (a) Average accuracy on all experiments, (b) average running time on all experiments

Figure 13 summarizes the comparison results for the eight classifiers in terms of prediction accuracy and running time. To obtain stable results, we repeated fivefold cross-validation 25 times and then reported the averages by aggregating the results of the seven stores. LGB turned out to be the fastest of the three best-performing classifiers (GB, XGB, and LGB). CAB was also very fast and gave comparable results. Interestingly, DT took more time than RF and showed a better result in the default setting. Table 8 details the results in Fig. 13 by listing the accuracy for each of the seven stores. The mean and standard deviation were calculated from the average accuracies of 25 different fivefold cross-validations.
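The evaluation protocol described here (repeat fivefold cross-validation 25 times, then take the mean and standard deviation of the per-repetition average accuracies) can be sketched in plain Python. The function names and the `evaluate` callback are hypothetical; `evaluate` stands for whatever trains a classifier on the training indices and returns its accuracy on the test indices.

```python
import random
import statistics


def kfold_indices(n_samples, k, rng):
    """Shuffle sample indices and deal them into k disjoint folds."""
    indices = list(range(n_samples))
    rng.shuffle(indices)
    return [indices[i::k] for i in range(k)]


def repeated_cv(n_samples, evaluate, k=5, repeats=25, seed=0):
    """Return (mean, std) of the per-repetition average accuracies.

    evaluate(train_idx, test_idx) is a hypothetical callback that trains
    a model on train_idx and returns its accuracy on test_idx.
    """
    rng = random.Random(seed)
    run_means = []
    for _ in range(repeats):
        folds = kfold_indices(n_samples, k, rng)
        scores = []
        for i, test_idx in enumerate(folds):
            # All folds except the i-th form the training set.
            train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
            scores.append(evaluate(train_idx, test_idx))
        run_means.append(statistics.mean(scores))
    return statistics.mean(run_means), statistics.stdev(run_means)
```

Each repetition reshuffles the data, so the standard deviation reflects variation across different fold assignments rather than across folds within a single split.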

Table 8 Prediction accuracy (%) of various classifiers for the revisit prediction task

B. Comparison on stacking models

To achieve additional performance improvement, we applied stacking (meta ensembling) with eight strategies. Stacking is a model ensembling technique that combines multiple predictive models to generate a better model [38]. The stacked model is generally known to outperform each of the individual models owing to its smoothing nature and its ability to highlight each base model. The key idea of stacking is to use the prediction results of the base models as features for the stacking model in the second layer.

To do this, we selected CAB, XGB, and LGB as the base models. We further separated the training set into three subsets and used two subsets to make the prediction labels for the remaining subset. The prediction labels for the testing set were also calculated three (\(= {}_3C_2\)) times, once for each pair of training subsets, and the three sets of labels for the testing set were averaged for the final use. In this way, we generated the label features for both the training and testing sets. These additional features are fed to the final LGB stacking model. We followed a general procedure from the reference in Note 12 and added three options. Figure 14 illustrates the process of creating eight stacking models (\(M_1\)–\(M_8\)) through the choice of the three options. The three options are described as follows.

  • Sampling strategy: A parameter that determines whether to use random oversampling  [15] or downsampling. This option is not directly related to stacking, but we added it to improve the accuracy by addressing the class imbalance problem.

  • # of predictions: A parameter that determines whether to use one model or multiple models for each fold. The former case generates a single additional feature, and the latter case generates three additional features.

  • Using only labels: A parameter that determines whether to use only the prediction labels (one or three features) or to use all existing features with the prediction labels (n+1 or n+3 features where n is the total number of hand-engineered features used).
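The label-feature generation described before this list (three training subsets, each predicted by a model fitted on the other two, with the three testing-set predictions averaged) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: `CentroidModel` is a toy one-feature classifier standing in for a base model such as CAB, XGB, or LGB, and `stacking_features` is a hypothetical helper name.

```python
import random


class CentroidModel:
    """Toy one-feature classifier used only as a stand-in base model."""

    def fit(self, X, y):
        pos = [x for x, label in zip(X, y) if label == 1]
        neg = [x for x, label in zip(X, y) if label == 0]
        self.pos_mean = sum(pos) / len(pos)
        self.neg_mean = sum(neg) / len(neg)
        return self

    def predict(self, X):
        # Assign each sample to the class with the nearer mean.
        return [1 if abs(x - self.pos_mean) <= abs(x - self.neg_mean) else 0
                for x in X]


def stacking_features(X_train, y_train, X_test, make_model, n_subsets=3, seed=0):
    """Build one label feature per sample for the second-layer model."""
    rng = random.Random(seed)
    indices = list(range(len(X_train)))
    rng.shuffle(indices)
    subsets = [indices[i::n_subsets] for i in range(n_subsets)]

    train_feature = [None] * len(X_train)
    test_sum = [0.0] * len(X_test)
    for held_out in range(n_subsets):
        # Fit on the other subsets, then predict the held-out subset.
        fit_idx = [j for s, subset in enumerate(subsets)
                   if s != held_out for j in subset]
        model = make_model().fit([X_train[j] for j in fit_idx],
                                 [y_train[j] for j in fit_idx])
        held_idx = subsets[held_out]
        for j, pred in zip(held_idx, model.predict([X_train[j] for j in held_idx])):
            train_feature[j] = pred
        # The testing set is predicted once per held-out subset.
        for i, pred in enumerate(model.predict(X_test)):
            test_sum[i] += pred
    test_feature = [total / n_subsets for total in test_sum]
    return train_feature, test_feature
```

With the "# of predictions" option set to multiple models, the same routine would be run once per base model, yielding three such features instead of one.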

Fig. 14 Stacking options

Table 9 Prediction accuracy (%) of stacking models for the revisit prediction task with the data of all visitors

Table 9 shows the average accuracy obtained for each of the seven stores in detail (see Note 13). We observed that the performance improvement was modest despite the long running time of the stacking model. Thus, we conjecture that each of the best-performing classifiers already achieves close to the highest attainable accuracy by itself.

C. Lower bounds of prediction accuracy

The visit logs \(v_k\) with the same visit count k are considered to carry the same information. To maximize the accuracy, we must predict the label l of \(v_k\) by the following criterion:

$$\forall v \in v_k:\quad l(v) = \begin{cases} 1, & \text{if } E[RV_{\mathrm{bin}}(v_k)] \ge 1/2,\\ 0, & \text{otherwise}. \end{cases} \tag{2}$$

Considering each proportion \(p_k = |v_k|/\sum _{k}{|v_k|}\) and writing \(E[RV_{\mathrm{bin}}(v_k)]\) as \(r_k\), the lower bound accuracy of a model can be represented as \(LB = \sum _{k}p_k \cdot \max (r_k, 1-r_k)\). In the experiment with only first-time visitors, \(LB = 1/2\) since \(p_1 = 1\) and \(r_1 = 1/2\).
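The lower bound can be computed directly from the per-visit-count statistics. The sketch below assumes two dictionaries keyed by visit count k: `counts[k]` for \(|v_k|\) and `revisit_rates[k]` for \(r_k\). The function name and the numbers in the usage lines are made up for illustration and are not taken from the datasets.

```python
def lower_bound_accuracy(counts, revisit_rates):
    """Compute LB = sum_k p_k * max(r_k, 1 - r_k).

    counts[k] is the number of visit logs with visit count k, and
    revisit_rates[k] is the fraction of those logs followed by a revisit.
    """
    total = sum(counts.values())
    lb = 0.0
    for k, n_k in counts.items():
        p_k = n_k / total
        r_k = revisit_rates[k]
        # Predicting the majority label for group k is correct with
        # probability max(r_k, 1 - r_k).
        lb += p_k * max(r_k, 1.0 - r_k)
    return lb


# Only first-time visitors with a balanced label: LB = 1/2.
first_time_only = lower_bound_accuracy({1: 100}, {1: 0.5})

# Illustrative numbers: 0.6 * 0.7 + 0.4 * 0.8 = 0.74.
mixed = lower_bound_accuracy({1: 60, 2: 40}, {1: 0.3, 2: 0.8})
```

The second call shows why homogeneous groups raise the bound: the further each \(r_k\) is from 1/2, the larger its contribution.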

The lower bound can be interpreted as follows. For higher predictability, the revisit tendency of each \(v_k\) should be homogeneous. In Fig. 15, we can see that store L_MD is more predictable than A_GN, because \(|r_k-0.5|\) of L_MD is larger than that of A_GN for the majority of k.

Fig. 15 Lower bound accuracies of two stores

D. Assumptions to interpret the data

Here, we clarify how we count first-time visitors and explain the underlying assumptions.

  • Assumption 1: Because we do not know whether customers visited a store before the data was collected, we simply assume that the customers did not visit before the collection period. We believe that this assumption is reasonable because the stores in which we collected the data were relatively new at the time we began data collection.

  • Assumption 2: Because customers are captured only when the Wi-Fi of their mobile device is turned on, we assume that the customers’ Wi-Fi turn-on behavior is consistent across their visits to the store. We also assume that there is no correlation between Wi-Fi usage and customer groups (first-time visitors and VIP customers).

  • Assumption 3: We assume that customers visit the store with a device having the same MAC address. For this purpose, we retained only Android devices and removed Apple devices in the preprocessing step, because iOS 8.0 and later versions follow a MAC-address randomization policy  [21], which makes it infeasible to identify the same customer.

Strictly speaking, the proportion of true first-time visitors would be less than 70% once all the effects explained above are considered. Nevertheless, these customers are also likely to be early-stage visitors.

E. Deciding the group movement threshold

We chose the 30 s group movement threshold based on the following reasoning. According to our observations at store E_GN on the afternoons of June 24 and June 26, 2017, 56% of 105 customers, more than half, entered the store with companions. Taking \(p_x=39.2\%\) as the on-site Wi-Fi turn-on rate (always-on: 29.2%, conditionally-on: 10%)  [24] and \(p_y = 56\%\) as the actual proportion of customers in a group, we expected \(p_{yo}=15.5\%\) of the total visitors to be represented as having companions in our collected data for store E_GN (by Eq. 1 in Sect. 5.3.2). With 30 s as the accompaniment threshold, 15% of the total visitors in the same data were identified as having companions. Because this observed group ratio is close to the expected one, we claim that 30 s is an appropriate threshold for distinguishing group movement.
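To make the role of the threshold concrete, the sketch below computes the fraction of observed visitors flagged as having a companion for a given threshold. It rests on an assumption that is not confirmed by this appendix: a visitor is flagged when another device enters within the threshold of their own entrance time. The paper's actual definition is given in Sect. 5.3.2 and may differ; the function name and the timestamps are hypothetical.

```python
def companion_ratio(entry_times, threshold_s=30.0):
    """Fraction of visitors whose entrance time (in seconds) lies within
    threshold_s of another visitor's entrance time.

    Assumed rule for illustration only; not taken from the paper.
    """
    times = sorted(entry_times)
    n = len(times)
    if n == 0:
        return 0.0
    with_companion = 0
    for i, t in enumerate(times):
        # After sorting, only the immediate neighbours need checking.
        near_prev = i > 0 and t - times[i - 1] <= threshold_s
        near_next = i + 1 < n and times[i + 1] - t <= threshold_s
        if near_prev or near_next:
            with_companion += 1
    return with_companion / n


# Hypothetical entrance times in seconds: two pairs and two solo visitors.
ratio_30s = companion_ratio([0, 10, 100, 200, 220, 500])
```

Sweeping `threshold_s` and comparing the resulting ratio with the expected 15.5% is one way such a threshold could be calibrated.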

About this article

Cite this article

Kim, S., Lee, J. A systematic framework of predicting customer revisit with in-store sensors. Knowl Inf Syst 62, 1005–1035 (2020). https://doi.org/10.1007/s10115-019-01373-y

Keywords

  • Revisit prediction
  • Retail analytics
  • Predictive analytics
  • Feature engineering
  • Marketing
  • Mobility data