1 Introduction

In this introduction, we describe the background of our study and propose the research questions to be answered. This section also provides the literature review.

1.1 Background and research questions

Online advertisements play an integral role in today’s World Wide Web. Regardless of whether they are for online or offline shopping, the center of interest in ad technology developments is to increase the sales of the targeted items or shops. Therefore, even post hoc analysis of the advertisements mainly focuses on these outcomes. However, to propose an effective marketing strategy, we need to disentangle what type of the customers advertisements invite to the target shop. The customers invited by the advertisement may show different shopping patterns from customers who visit the shop voluntarily. If these differences exist, an appropriate advertisement strategy for each group is not the same. Studying customer behavior during their shopping will advance understanding of what customers are induced by the advertisements and develop the additional marketing strategy to enhance the business.

Recent studies, for instance, have investigated customers’ offline shopping trajectories, who have received advertisements. The customers who received advertisements increase in-store traffic, resulting in unplanned purchases (Hui et al., 2013a, 2013b). The field experiment in a shopping mall suggests a similar phenomenon in which location-based advertisements change customers’ shopping behavior trajectories (Ghose et al., 2019). These studies imply that just focusing on the traditional simple measurement may neglect the behavioral differences between regular customers and the customers who are invited to the shops by the advertisement.

Although the above literature provides insightful findings on customers’ shopping behavior, several vital issues remain unexplored. First, they only cover a limited area for shopping, such as in a grocery store (Hui et al., 2013a, 2013b) or shopping mall (Ghose et al., 2019). In addition, they only focus on the trajectories of the target shop visit day, and not much is known about the long term trajectory of the customers who received advertisements. In other words, few studies have focused on the entire picture of the long-run customers’ trajectory. However, such comprehension of customers’ offline behavior is essential to maximize the effectiveness of brick-and-mortar advertising. For example, suppose a customer visits several stores when they visit the advertised shop. In that case, additional advertisements for those other stores would have a better conversion rate. In addition, analyzing long-term offline trajectories enables us to understand the type of customers who are likely to revisit the store (i.e., those who will become loyal customers). This long term analysis will enable us to maximize repeat customers through advertisements.

To fill the gaps mentioned above, we analyzed the long-term customer trajectory data from the mobile devices that consist of 4 months of customers’ location data. We first map the experimental data for Online to Offline (O2O) advertisements and the customers’ location data. Thereafter, we address the following three major research questions:

  1. RQ1 Spatial trajectory:

    Do the customers invited by the advertisement show different shopping trajectories than regular customers on the target shop visit day?

  2. RQ2 Persistence of trajectory difference:

    How long does that trajectory difference last?

  3. RQ3 Revisiting:

    How likely customers will return to the same shop after the visit, and what factors are associated with that revisiting behavior?

With the answers to the third research question, we will utilize our findings to propose an effective advertising strategy that maximizes the number of revisit customers. To this end, we use an uplift model, a framework of causal machine learning, to propose an advertisement campaign strategy aiming to maximize the revisiting effect.

This paper contributes to the current state-of-the-art in three ways. First, we study the trajectory differences between the customers invited by the advertisement and regular customers who voluntarily visit the shop on a large scale. With the long-term user trajectory data, we also study the duration of the trajectory differences between the two groups. Our holistic analysis implies that the existing studies omit unignorable behavioral differences between the two groups. We also find that a certain fraction of the users invited to the physical shop revisit the same shop. Utilizing this discovery, we propose a prediction model that maximizes the revisiting effect of the advertisement.

1.2 Literature review

This subsection first reviews prior studies on the human offline trajectory mining in general and then focuses on the offline customer trajectory. Finally, we discuss the application of the mining human trajectory location-based advertisements, including O2O, and we also introduce the important marketing concept in these topics, commercial area.

1.2.1 Mining human offline trajectory

The mining of human offline trajectory has been an essential computer science topic, which focuses on analyzing the user’s physical behavior theoretically or empirically to extract useful information from this high-dimensional data. One of the most fundamental interests in this topic is to find frequent trajectory patterns (Gaffney & Smyth, 1999; Liu et al., 2011), as frequent patterns allow us to categorize users. For example, the frequent trajectory patterns are often used for clustering the users. Accordingly, the researchers study the trajectories’ semantics to generate interpretable results (Ghosh & Ghosh, 2016; Lee et al., 2007; Nanni & Pedreschi, 2006; Zhou et al., 2019). In recent years, embedding techniques such as word2vec have been used to approximate high-dimensional offline trajectories to lower dimensions (Crivellari & Beinat, 2019, 2020).

There are several applications of mining offline human trajectory. The first important application is to predict the next offline trajectory of users (Monreale et al., 2009) because of its usefulness not only for the business but also for the public policy for human mobility (Gambs et al., 2012; Gao et al., 2019). However, the limits of predictions based on offline trajectories have been discussed in the context of social networks (Song et al., 2010). In social network analysis, this prediction problem is defined as edge connection prediction (Althoff et al., 2017; Sapiezynski et al., 2018), which reveals the importance of the social network. In addition, the users’ offline trajectory can predict their demographic information (Bao et al., 2016; Hinds & Joinson, 2018; Ravi et al., 2018; Wang et al., 2017; Zhong et al., 2015).

1.2.2 Customer offline trajectory

Marketing scientists are especially interested in a more specific human offline trajectory and customer shopping trajectory. This is because it conveys the most critical aspect of consumer behavior during their shopping, for instance, the selection of the times of purchase by the customer. From this perspective, marketing scientists have proposed models that incorporate the customer shopping trajectory, such as traditional marketing strategy models or statistical models.

We would like to emphasize that marketing researchers have mainly focused on customers’ in-store offline trajectories in the context of spatial analysis (Bronnenberg, 2005; Bronnenberg et al., 2005). For instance, some early work on customers’ in-shop trajectory study the customers’ shopping path in a super market (Larson et al., 2005)

To understand the in-store trajectory pattern, Hui et al. deeply exploited customer behaviors. They first modeled the customer shopping path (Hui et al., 2009a, 2009b). To investigate the advertisement and shopping trajectory, they conducted field experiments on in-store purchases with coupon (Hui et al., 2013b) detailed analysis with video data (Hui et al., 2013a). A more extended study by Ghose et al. (2019) conducted a large field experiment in a shopping mall to understand the effectiveness of the redeem campaign, and study the customers’ trajectory within the mall.

1.2.3 Location base advertisement and commercial area

Recently, online advertisements have used location data, including the offline trajectory of the targeted customers. This capability makes online advertisements different from traditional advertisements in several aspects (Goldfarb, 2014). Location information from mobile devices has been the source of most information of the customers’ trajectory. Recently, the importance of customers’ location information for online advertisements has garnered significant attention (Agarwal et al., 2011, 2020; Johnson et al., 2016). This information is also effective for marketing strategies, such as coupons (Molitor et al., 2020). In addition, the most recent papers in this topic link the location advertisement to the causal inference machine learning to assess the effect of given advertisements (Kawanaka & Moriwaki, 2019; Moriwaki et al., 2020).

To distribute the location-based advertisement effectively, a typical strategy sets a commercial area for potential customers, because persons far from the target shop are unlikely to visit that shop. Traditionally, marketing researchers have been interested in market areas (Greenhut, 1952), and they study the evolution of the marketing area (Thelen & Woodside, 1997) expansion (Allaway et al., 1994) and their driving force (Allaway et al., 2003).

Overall, previous studies found the customers’ trajectory differences according to the advertisements they received. In addition, the importance of location information for online advertisements has been indicated. In particular, marketing scientists have mainly studied the first point with field experiments, and computer scientists have focused on the second point in the context of the ad technology. Our study aims to connect these two dots using A/B testing for O2O advertisements and offline customers’ trajectory data.

2 Study design and data

In this section, we report our study design and data. We first discuss our strategy to answer the three research questions introduced in Sect. 1. Thereafter, we explain our data set, which consists of the A/B testings for the advertisements. We also introduce a glossary of terms in Table 1 to make our deliberations in this paper consistent and concise.

2.1 Study design to answer the RQs

Our three research questions ask if the difference in the motivation for visiting the shops reflect the trajectory differences. To answer these questions, we compose the A/B testing data for O2O advertisements.

As a regular A/B testing, the users in the experiment are randomly split into two groups to study the effectiveness of the O2O advertisements. One group receives the advertisements (Treatment), whereas the other group does not distribute the advertisements (Control). The primary interest of these A/B testing is the target shops that the online advertisement causes customers to visit. To understand this point, the location data is collected from the users’ mobile devices to check whether the users actually visit the target shop.

In this study, we investigate the experimental data from a different point of view, summarized in the schematic example in Fig. 1. We leverage the situation in which the A/B testing for O2O advertisements causes motivational distinctions. The experimental data contains the users of the two groups, and this difference corresponds to the motivation to visit the shop. The control group users are supposed to visit the target shop voluntarily, because any O2O advertisement is not distributed to this group. In contrast, the treatment group can have an extrinsic motivation provided by the distributed advertisement. To understand if such motivational difference reflects their offline trajectories, we compare the trajectory between the voluntary customers (Control) and those of the invited customers (Treatment). Table 1 summarizes the terms that we introduced in this section.

Fig. 1
figure 1

Schematic of the voluntary and invited visit. Schematic of our research design. We compare the two groups that have different motivations to visit the target shop. The users in a Voluntary visit group visits the target shop without seeing any O2O advertisement. The users in (a) are supposed to have an intrinsic motivation for shopping. On the other hand, b Invited visit group is the users that receive the O2O advertisements, and hence they are supposed to have an extrinsic motivation for shopping. In this study, we study if this motivational difference reflect the trajectory difference

Table 1 Glossary of terms

2.2 Volume of the audiences and the resulting sales

We conduct our study based on the data generated from the A/B testing of the O2O advertisements conducted by CyberAgent, Inc., a Japan-based major advertising agency. They conduct nation-wide advertisement campaigns for offline shopping, aiming at inviting potential customers to a physical shop through online advertisements. Their campaign distributes online advertisements based on customer behavior.

As typical online advertisements, the advertisement campaigns use a demand-side platform (DSP) to deliver their advertisement creatives to potential customers’ mobile devices through a real-time bidding (RTB) system. During the campaigns, users’ visiting histories are used to count the number of visits to target stores. To understand the effectiveness of each campaign, the company randomly split the user pool into two groups as follows: treatment and control. Only the users in the treatment group have a chance to receive the advertisements, whereas those in the control group do not receive any treatment.

We study the multiple campaigns for multiple businesses, covering the various categories from electronic stores to super markets Footnote 1. As we introduced in Sect. 2.1, this study focuses on the customers who visited a target shop, resulting in the number of customers in the control group being 1440 and the number of customers in the treatment group is 1515. We also report the balance test at Appendix A.1.

2.3 Offline behavior trajectory data

In addition to the visiting history to the target shop, we are interested in their trajectories near the target shop visit day. We collect the anonymized customer location data from their mobile devices and label them with the actual commercial place data. We trace the customers’ shopping trajectories, matching the location data from the customer with the commercial place data set. We then only label “visited” if a customer stayed for more than 10 min within a 20 m radius of a location registered in the shop data set.

For this threshold for “visited”, we use the most strict threshold in the service, which is the source of our data (the O2O advertisement product operated by CyberAgent, Inc.Footnote 2). In the service, they use the different thresholds depending on the advertised shop (3 min, 5 min, etc.). Those thresholds are determined based on the consultations between the advertisers and the retailers. Therefore, the thresholds used in the product is recognized as valid among the advertisers who actually pay the advertisement cost and operate the business. Our threshold, “10 min within a 20 m radius” is the most strict threshold among those thresholds. While it would be more appropriate to choose the threshold size depending on the type of the advertised shop, we decided to use the most strict threshold for research purpose. For this reason, we believe that our threshold is valid to study the customer behavior of interest.

In addition, we label the visited location with the three commercial categories as follows: shopping, food, and service. Consequently, our data set contains 23,549,902 location histories for 2955 users from January 1, 2020 to April 1, 2020. For the additional analysis, we also study the post-experiment user location data from April 1, 2020 to July 31, 2020 for the revisit analysis to be described in the subsequent section. Our mobility data is collected legally from the apps on their mobile device strictly following the privacy policy of the data provider (CyberAgent, 2021). On each app, the OS (i.e., iOS or Android) is designed to explicitly ask the user’s consent for mobility data usage. Our data provider obtains these mobility data from the app developers within the user consent.

2.4 Offline behavior features for the uplift model

We also compose the users’ behavior features built on the user offline trajectory data for uplift modeling to be detailed in Sect. 3.4. To prevent the data leakage, we do not use any data for the analysis for RQ1-3 described in Sects. 2.2 and 2.3. The demographic features are predicted values for each demographic characteristics that are provided by CyberAgent, Inc. The prediction model for demographic information is implemented using regression models and XGboost (Chen & Guestrin, 2016). This prediction model is not technically difficult and it is well known that the offline trajectories of the users can predict their demographic information (Bao et al., 2016; Hinds & Joinson, 2018; Ravi et al., 2018; Wang et al., 2017; Zhong et al., 2015). Note that this estimation results are used only for the uplift and not for our main research questions (RQ1–3). In addition, we use 96 shopping categories of the shops that the users visited which covers shopping categories from Fishing Shop to Bakery. Finally, we estimate the users’ home location and calculate the distance from their home to the target shops. We only use the data before the period of the experiments to prevent data leaking.

3 Methods

In this section, we describe the methods to answer the aforementioned three research questions. For RQ1, we discuss our spatial trajectory analysis in Sect. 3.1. Then we explain our panel regression model to study RQ2 at Sect. 3.2. Finally, for RQ3, we propose the method to calculate the revisit effect of the O2O advertisements at Sect. 3.3, and the causal machine learning model to maximize the lift effect of the revisit at Sect. 3.4.

3.1 RQ1: spatial trajectory analysis

With the compiled shopping trajectory data, we study whether the invited customers show different shopping trajectories compared with the voluntary customers. To this end, we model customers’ spatial trajectory. We first align the user location data such that we could analyze the customer behavior on the same coordinate and run a regression model to study the spatial features of each group.

3.1.1 Aligning spatial shopping trajectories from the different places

As discussed in the previous sections, the experiments target multiple offline shops and are on a nationwide scale. This makes it impossible to map the customers’ movement in one actual map as a typical geological analysis does. In other words, the raw trajectory data contains the user location history in different places across the nation, and hence, they do not allow us to summarize their trajectories as a single model. Therefore, we align the customer trajectory data into the same coordinate.

We refer to the location of the target shop for this alignment. For each customer travel point \((u_i, v_i)\), we align it taking longitude and latitude differences as follows:

$$\begin{aligned} (u_i, v_i) = \left(u^{raw}_{i} - u^{ref}_{i}, v^{raw}_{i} - v^{ref}_{i}\right), \end{aligned}$$
(1)

where \(u^{l}_{i}\) and \(v^{l}_{i}\) denote the latitude and longitude of the l, \(l=\{raw, ref\}\). raw represents the raw customer’s geographical position, whereas ref represents the position of the reference points. The reference point in this study is the location of the target shop.

3.1.2 Features of aligned location data

We also note that shop categories are annotated in our location data. Here, we formally denote our aligned location data, \(G\in {\mathbb {R}}_{+}^{ U \times V \times N \times M}\) and \(g_{u, v, c, t} \in T\). \(g_{u, v, c, t}\) represents the number of visits to a shop of category c, and its aligned latitude and longitude are (uv), and t represents its experimental group (Treatment or Control).

3.1.3 Specification of spatial difference of the shopping trajectory

For RQ1, we investigate if the customer in the treatment group shows a different spatial trajectory from the control group, modeling a spatial relationship between shop visiting behaviors and the distance from the target shop.

Because spatial analysis can easily experience difficulties, such as spatial dependency or auto-correlation among regions, we employ geographically weighted regression (GWR) (Brunsdon et al., 1998), a weighted regression model that incorporates spatial heterogeneity. We specify the following geographically weighted logistic regression model:

$$\begin{aligned} y_{i}=\sum _{j=0}^{m} \beta _{bw, j}\left( u_{i}, v_{i}\right) x_{ij}+\varepsilon _{i}, \end{aligned}$$
(2)

where \(y_{i}\) is a binary dependent variable that represents the dominance of either group; \(x_{i,j}\) is a vector of independent variables for area \(\left( u_{i}, v_{i}\right)\); \(\beta _{bw, j}\) represents that for each feature j, we use a different bandwidth bw for weighting. Dependent variables, here is whether the treatment group dominates are \(\left( u_{i}, v_{i}\right)\). For the independent variable at position i, \(x_{i,j}\), we use the distance from the reference points and shop category shares.

We set \(y_{i}\) as 1 when the customers in the treatment group visit a block \(\left( u_{i}, v_{i}\right)\) more frequently than the customers in the control group, and otherwise 0. Because the number of observed data points for each group is different, we normalized the data as described in Appendix A.2.

3.2 RQ2: travel distance differences after the target shop visit day

When a customer changes their trajectory, it would be of paramount interest whether that alternation is a permanent or tentative. We study the shopping trajectory after the day when the customer visits the target shop. To trace the customer trajectory, we calculate the travel distances nearly 3 days before and after the first visit day, denoting the travel distances of user i at day s as \(d_{i,s}\). Here, \(d_{i,s}\) represents the position from the day that customer i visits the target shop, and we focus on 3 days before and after the target shop visit day (i.e., \(s = \{j\}_{j=-3}^{3}\)). For example, \(d_{i,s=0}\) is the travel distance of the day that user i visited the target shop. We calculate the travel distance after the visit to the shop between the control and treatment groups using the following regressions.

$$\begin{aligned} d_{i,s} =\alpha +{\text {Aft}}_{t} \cdot T_{i} \beta + X_{i, s} {\gamma } + \epsilon _{i, s}, \end{aligned}$$
(3)

where \(\alpha\) denotes an intercept; \(T_{i}\) the dummy variable that takes 1 when customer i is in the treatment group and otherwise 0; \({Aft}_{s}\) becomes 1 after the shop visit (i.e., \(s > 0\)); otherwise, 0; \(X_{i s}\) represents the control variables that incorporate customer fixed effects; \(\epsilon _{i s}\) denotes the error term.

In this model specification, \(\beta\) represents the average differences in the travel distance between the voluntary (control) and invited (treatment) group after the shop visit day. For control variables, \(X_{i s}\), we use several sets of fixed effect variables that might affect travel distance as follows: the day of week, the day, the advertisement, and individual fixed effect. We will attempt several settings of X and discuss how it affects the estimation of the travel distance differences. For instance, having a fixed individual effect implies that we control the customer’s difference, such as their travel distance preference. Especially, when we incorporate the individual and time fixed effects, this regression model becomes the same as the framework of difference-in-difference, which is a widely used causal inference technique not only in economics (Angrist & Pischke, 2008; Card & Krueger, 1993; Lechner, 2011) but in a recent computer science study (Kusmierczyk & Gomez-Rodriguez, 2018; Li et al., 2020). We use robust standard errors clustered at the advertisement level or the individual levelFootnote 3

3.3 RQ3: revisit effect of advertisements

We are also interested in studying the long-term effect of the advertisements, the revisiting effect. In other words, we intend to know the ratio of the customers that the advertisement makes them revisit. To study this effect, we subtract the revisit ratio in the control group from the treatment groups’ revisit ratio. Because the control group did not receive an advertisement, the subtracted revisit ratio can be interpreted as the effect of the advertisement. In addition to the simple calculation, we conduct meta-analysis to consider the advertisement differences.

3.3.1 Calculating the revisiting by the advertisement

As in Sect. 2.1, we assume that there are two types of customers: Induced and organic customers. Induced customers are the customers induced by the advertisements. Meanwhile, there are customers who visit the target shops regardless of the advertisements, voluntary customers.

The revisit probability Pr(RV) can be formalized as follows: \(Pr(RV) = Pr(RV_{organic}) + Pr(RV_{induced})\), where \(Pr(RV_{organic})\) and \(Pr(RV_{induced})\) denote the revisit ratios of the organic and induced customers, respectively Footnote 4. With this formalization, we obtain \(Pr(RV_{induced})\). As a definition, all of the re-visiting customers in the control group do not receive an advertisement. Therefore, we estimate \(Pr(RV_{induced})\) as Footnote 5

$$\begin{aligned} Pr(RV_{induced})= & {} Pr(RV) - Pr(RV_{organic}) \end{aligned}$$
(4)
$$\begin{aligned}= & {} Pr_{treatment}(RV) - Pr_{control}(RV)\end{aligned}$$
(5)
$$\begin{aligned}= & {} E[Revisit|T=1] - E[Revisit|T=0] \end{aligned}$$
(6)

3.3.2 Controlling the differences among the advertisements

Although estimating Eq. 4 is straightforward, it might contain some biases, because our data set consists of the experiments for the various advertising campaigns. Differences among them can yield heterogeneous effects on revisiting, resulting in an imprecise estimation. To overcome this potential problem, we employ the model of the variance of the experiments and study the overall effect of the experiments. To this end, we conducted a meta-analysis using the Mantel–Haenszel method with random effects, which is a popular method in clinical research (Harrer et al., 2019). This meta-analysis method returns a weighted risk ratio assuming the variance among the experiments as random effects. To make our results comparable, we will report the odds ratio of revisiting for the invited customer (treatment) group with the voluntary (control) group by the following two methods: the direct calculation described in Eq. 4 and the meta-analysis.

3.4 Prediction model to maximize the revisit effect of O2O Ad

Finally, we propose a prediction model for high-value customers and discuss the marketing strategy using the data we observed in our experiments. In our analysis, we will find the customers who revisit the target shops after the first visit. These customers are supposed to bring more profit to the business than the customers who visit only once. Therefore, a model to predict the revisiting effect is of paramount interest for advertisers.

The most natural solution for this prediction task might be to find the factors associated with customers’ revisit behaviors. For example, logistic regression analysis will promptly find the variables that are associated with the revisiting behavior. Several marketing studies conventionally perform this type of post-experiment analysis. However, the challenge that advertisers face in reality and practice is the efficient distribution of advertisements to potential customers under the budget constraint.

3.4.1 The aim of uplift model

To attain the goal mention above, we use Uplift model that predicts the degree to which an advertisement improves the probability of revisiting.

$$\begin{aligned} \tau _{i} = {\text {P}}\left( Revisit_{i}=1 \mid T_{i}=1\right) -{\text {P}}\left( Revisit_{i}=1 \mid T_{i}=0\right) . \end{aligned}$$
(7)

The first term on the right-hand side of Eq. 7 is the probability that customer i revisits the target shop if it receives the advertisements, whereas the second term is the probability that customer i revisits the target shop if it does not receive the advertisements. Intuitively, Eq. 7 calculates the degree that the advertisement lifts the probability of customer i revisiting.

Our interest in this task is to sort the customers by descending order of \(\tau _{i}\), and distribute the ad following that order to maximize the aggregated effect of \(\tau _{i}\). In the literature of computer science, this \(\tau _{i}\) is called as lift effect and the prediction models of lift effect are called as uplift model (Gutierrez & Gérardy, 2017; Kawanaka & Moriwaki, 2019; Rzepakowski & Jaroszewicz, 2012; Zaniewicz & Jaroszewicz, 2013).

3.4.2 Learning the lift effect of O2O ad by uplift model

The apparent obstacle to learning the lift effect described in Eq. 7 is that we only observe one of the two in the dataFootnote 6. In other word, for user i who did not receive the O2O ad in the experiment, we need to study the probability of revisit “what if” she/he had received the O2OFootnote 7.

To learn a function that predicts the \(\tau\) value for a given customer, we utilize the experimental data. With the data, we want to learn the function as follows:

$$\begin{aligned} \tau \left( X_{i}\right) ={\text {P}}\left( Revisit=1 \mid X_{i}, G=T\right) -{\text {P}}\left( Revisit=1 \mid X_{i}, G=C\right) , \end{aligned}$$
(8)

where \(X_i\) denotes the feature vector of customer i discussed in Sect. 2.4, \(G={T, C}\) represents whether a customer receives a treatment (i.e., receives an advertisement) T or not C. The first term describes the probability of customer i’s revisiting given \(X_i\) when they are in treatment groups and the second term is the probability of revisiting even when they do not receive an advertisement.

We could learn the first and second term separately from the treatment and control group data, but it requires training two different models for a single prediction function. To avoid this cumbersome task, we use the reverted model proposed by (Jaskowski & Jaroszewicz, 2012), which is a transformed version of Eq. 8; that is, introducing the indicator Z as follows:

$$\begin{aligned} Z=\left\{ \begin{array}{ll} 1 &{} \text { if G=T and Revisit=1} \\ 1 &{} \text { if G=C and Revisit=0} \\ 0 &{} \text { otherwise. } \end{array} \right. \end{aligned}$$
(9)

As (Jaskowski & Jaroszewicz, 2012) explains, Z takes 1 if the assignment G is ideal. In other words, we want to see the customers with the treatment revisit, and the customers without the treatment not revisiting. Note that Z takes 0 when a treated customer does not revisit, or a non-treated customer revisits. We will not gain any treatment for both cases. Under the assumption that the treatment assignment is random, we obtainFootnote 8

$$\begin{aligned} \tau \left( X_{i}\right)= & {} {\text {P}}\left( Revisit=1 \mid X_{i}, G=T\right) -\small {{\text {P}}\left( Revisit=1 \mid X_{i}, G=C\right) }\end{aligned}$$
(10)
$$\begin{aligned}= & {} 2 P\left( Z_{i}=1 \mid X_{i}\right) -1. \end{aligned}$$
(11)

Therefore, we will estimate the function \(\tau (X)\) through \(P\left( Z=1 \mid X\right)\).

3.4.3 Evaluation of the uplift model

To evaluate the uplift model, we will study the mechanism through which the advertisement distribution based on the model increases the uplift effect compared to the random advertisement distribution. Following practical conventions (Radcliffe & Surry, 2011; Rzepakowski & Jaroszewicz, 2012; Zaniewicz & Jaroszewicz, 2013), we use the area under the uplift curve (AUUC) for the evaluation. The AUUC describes the uplift gain when we distribute the advertisements from the users with the highest uplift scores to the users with the lowest uplift socre as follows:

$$\begin{aligned} AUUC =\int _{k = 0}^1 f_{1}(k)-f_{0}(k) dk, \end{aligned}$$
(12)

where \(f_{o}(k),\ o \in \{1, 0\}\) denotes the lift effect of the kth sample in the ordered sample o. For \(o=1\), we sort the sample based on the uplift effect prediction (descending). In contrast, we arrange the sample in a random manner when \(o=0\).

The AUUC describes an advertisement strategy that maximizes the revisit effect based on following natural observation. First of all, it is not cost-efficient to distribute an O2O ad to the customer who is not likely to revisit the shop by the ad. Similarly, we do not need to pay attention to the customer who revisits the shop regardless of the ad. Therefore, an optimal advertisement strategy is to prioritize the customers who have a high lift (i.e., revisit effect described as \(\tau\)), which means such customers are likely to revisit the shop because of the ad. The function \(f_{1}(k)\) describes the revisit rate when we distribute the O2O ad to the customer whose \(\tau\) is more than the kth percentile. The function \(f_{0}(k)\), on the other hand, returns the revisit rate when we randomly distribute the O2O ad to the kth percent of the customers. Thus, \(f_{1}(k)-f_{0}(k)\) describes the gain of the revisit (rate) by the advertisements campaign in which we consider the lift effect. Consequently, we attain the maximized lift effect with \(k^{*}\) such that \(f_{1}(k)-f_{0}(k)\) is maximized.

4 Results

This section reports our results that answer the three research questions proposed in the introduction section. First, we focus on the customer trajectory near the target shop during their first target shop visit (RQ1). In addition to presenting the actual customers’ trajectory, we model the customers’ spatial behavior and report the model prediction to show the distinctions between the two groups. Furthermore, we trace the trajectory 3 days after the first target shop visit day and calculate the travel distances to observe the persistence of the advertisement effects (RQ2). Then, we will study the revisiting effect of advertisements (RQ3), comparing the revisiting ratio between the control and treatment groups. Finally, with the finding in RQ3, we propose a counterfactual machine learning model for marketing strategy based on our analysis. We use the uplift model to predict the revisit effect and a strategy for delivering the advertisement to maximize the revisit effects.

4.1 RQ1: shopping trajectory differences

Our first research question asks; ”Do the customers invited by the advertisement show different shopping spatial trajectories on the target shop visit day?” To investigate this trajectory difference, we compare the trajectories of the voluntary visit customers (the control group) and the invited visit customers (the treatment group) during their first target visit day.

4.1.1 Customer location history near the target shop

To answer RQ1, we study the customer location data near the target shop to compare the differences in visit points between the two groups. We first align the customer location history to map their trajectories into the same coordination as in Sect. 3.1.1. After coordination, we focus on an area of 2km from the target shops in radius, and calculate the number of visits in each grid (111 m2), as discussed in Sect. 3.1.3. Figure 2 shows the area near the target shop with a label that represents the group that dominates a particular area. Not compelling but Fig. 2 alludes to the fact that the customers in the control group are dense near the shop and the treatment group are dominant in the outward area.

4.1.2 Customer spatial trajectory modeling

To conclude the hypothesis proposed above, we will employ a GWR to model a spatial relationship between the distance from the target shop and the dominant area. We first model a simple regression, where we only use the distance from the shop. The results of the regression in Table 2 demonstrate a positive relationship between distance and treatment group dominance. The further from the target shop a place is located, the more likely the treatment group dominates that place. These results are qualitatively the same even when we add some control variables that might affect the area preference. In the second model, we add the ratio of the shops in each category, shopping and food, and found a positive relationship between distance and treatment group dominance.

This finding becomes more evident as we observe the prediction results using the estimated models. Figure 3 depicts the prediction result, where the control group dominates the center areas, whereas the treatment group dominates the outer areas. We also found a similar result when we used the second multivariate model. Note that the aim of these prediction models is not to prediction the behavior, but to abstract the important factors from the customers’ trajectories.

These prediction results contrast the trajectories of customers with two different motivations for visiting the target shop. Voluntary visitors gather around the area near the target shops. Meanwhile, the invited customers move to wider areas. The customers who shop in a wide area would bring more benefits to the shops near the target shop. In other words, the invited customers bring a larger externality in the sense that the shops near the target shop benefit from them, even though the campaign did not advertise these nearby shops (Fig. 4).

Table 2 Results: GWR model
Fig. 2
figure 2

Dominant areas around the target shop. We align the target shop to the center of the plot and then split the area around the target shops by grids (the area of a square is about 111 m × 111 m). This figure shows the areas the invited customers (blue, treatment) group dominant and the voluntary customers (red, control) group dominant. We calculate which blocks are dominant following the procedure described in Sect. 3.1.3

Fig. 3
figure 3

GWR prediction (simple model, Model 1). We align the target shop to the center of the plot and then split the area around the target shops by grids (the area of a square is about 111 m × 111 m). This figure describes the prediction by the estimated GWR model (Model 1 in Table 2). The figure shows that the model predicts that the invited customers’ dominants the outer areas (Blue), whereas the voluntary visit customer condence around the target shop (Red)

Fig. 4
figure 4

GWR prediction (multivariate model, Model 2). We align the target shop to the center of the plot and then split the area around the target shops by grids (the area of a square is about 111 m × 111 m). This figure provides the prediction by the GWR model with multiple independent variables (Model 2 in Table 2). Model 2 also predicts the dominance of the outer areas by the treatment groups

4.2 RQ2: persistence of trajectory difference

After RQ1, one interesting question would be; “How long do these trajectory differences persist? (RQ2)”. Such an analysis will enable us to know the duration of the behavior difference between the voluntary and invited customers found in RQ1.

To understand the persistence of the advertisement externality, we investigate the daily travel distances of the customers before and after the first target shop visit day. Because we found that the invited (treatment) group travels wider areas than the control group, the travel distances of each group should be a suitable indicator of such externality. In this analysis, we calculate the daily average travel distance for each group. There are several confounding factors that might affect their travel distance, such as advertisement differences, individual differences, and differences among days. Therefore, we control these differences by incorporating control variables, as discussed in Sect. 3.2.

4.2.1 First look

We first calculate the travel distance difference among the two groups for each day, controlling the advertisement differences by the advertisement fixed effect. We run the regression analysis and report the coefficients of the day-dummy in Fig. 5. The figure shows that the travel distances increase on the target shop visit day, and the distance differences remain relatively higher than the day before the visit. Even though the confidence intervals of days 1 and 2 are on the zero bound, we are not sure if these differences are robust. For this first analysis, we only control for differences among the advertisements, but each individual might respond differently. We, therefore, need to conduct a regression analysis modeling these differences to determine if this finding is robust.

4.2.2 Analysis with elaborated models

We attempt four settings of the control variables and report the results in Table 3. The table calculates the day average travel distances 3 days after the target shop visit day Footnote 9. Consequently, we find that the invited (treatment) group travels a longer distance even after the shop visit day. We estimate Model 1 as a baseline that only incorporates the dummy variables of the advertisements. For Model 2, we also incorporate the day-of-week dummy into the models. These first two models return similar results in the travel distance differences, suggesting that the day-of-week effect is negligible.

On the contrary, the last two models with the heterogeneity among the customers (Models 3 and 4) report smaller travel differences than the first two (Models 1 and 2), indicating that there is a certain heterogeneity among the customers and its overestimates of travel distances. While detecting the overestimation in the first look analysis, our analysis of the persistence of trajectory difference demonstrates consistent results that the treatment group travels longer distances even after the target shop visit day.

Table 3 Travel distance differences: the voluntary visit customer vs the invited customer

4.3 RQ3: revisit odds differences

Finally, we expand the time horizon of the analysis and ask RQ3, studying if the customers revisit the shop within 4 months after the visit. To answer this question, we calculated the re-visitation ratio difference between the treatment and control groups, studying if a user revisits the shop within 4 months after the visit. We also present a causal machine learning model to predict the re-visitation effect of the advertisement and discuss the marketing strategy for re-visitation.

4.3.1 Revisiting ratio difference

As discussed in Sect. 3.3, we interpret the differences among the groups in the revisit ratio as the revisits invited by the advertisements. We calculate the odds ratio of revisiting for the invited customer (treatment) group with the voluntary (control) group. Because our data consists of multiple advertisement campaigns, the effect of advertising may vary. We, therefore, present the odd ratio using Mentel–Haenszel methods to take into account the advertisement effects’ heterogeneity.

Figure 6 reports the results of the two different odds with 95% confidence intervals. In both calculations, the odd ratios are greater than 1, implying that the customers who received an advertisement are more likely to revisit than those who visit a shop voluntary. In other words, the advertisements encourage revisiting of the customers who do not revisit without the advertisements. Figure 6 also shows that the odds estimation by the meta analysis has a narrow confidence interval than the estimation using the direct method. This overestimation by the direct methods suggests that the revisit effect varies among the advertisements. However, after modifying these differences, the odd ratio is still greater than 1, implying that the advertisements increase the revisit ratio.

Fig. 5
figure 5

Travel distances difference (km) before and after the shop visit day. Figure 5 describes the transition of the travel distance differences (km) between the voluntary and invited customers (the error bars represent 95% confidence interval)

Fig. 6
figure 6

Odds ratio of revisiting. Figure 6 compares the odds ratio of revisiting for the invited customer (treatment) group with the voluntary (control) group by the two different methods: Direct and Meta analysis. Direct: The odds ratio of the whole sample. Meta: The odds ratio by the Mantel–Haenszel method that controls the differences between the advertisements

Fig. 7
figure 7

Distribution of the SHAP values of the top 10 higest mean SHAP value features. The top 10 features by the SHAP value by (Lundberg & Lee, 2017) for the uplift model. The descriptions of the features are in Appendix A.4 Except for Distance from home to shop, the other demographic variables represent the probability that a customer in that demographic category. For example, the high value of Other occupation category means the prediction model returns the high probability that the customer is not in the specific occupation category. The color represents the value of the feature (Reg: high, Blue: low). A high SHAP value represents a positive impact on the lift effect, and vice versa. For instance, the figure shows that the customer who lives the spot with fewer woman residents contributes to the high lift effect predictions. Overall, this SHAP value analysis suggests that younger male persons tend to have a higher lift effect

Fig. 8
figure 8

Uplift curve. Figure 8 shows the marketing strategy to maximize the lift effect of revisiting by the advertisements. The baseline describes the revisit effect when we randomly chose the users to get advertisements from the targeted user pool. The figure shows that revisiting is maximized when the customers whose lift effect is more than the 60th percentile receive the advertisements

4.4 Predicting the revisiting lift effects

We found that the advertisement encourages the customers to revisit the target shop. A natural next goal is to study the way to distribute the advertisement to maximize the revisiting effect. As we discussed in Sect. 3.4, this goal is not achievable only without modeling the effects that the advertisement lifts (increases) the probability of revisiting a given customer. To this end, we use uplift modeling to predict the revisiting effect for each customer, and discuss a reasonable advertisement strategy. We split the data set into three and conducted feature selection optimization, train model, and predict the uplift with each subset of the data. With the trained model, we study what features are associated with the revisiting effect prediction.

To construct the uplift model, we first determine the best combination of features to maximize the uplift effects (i.e., the AUUC) from the set of features described in Sect. 2.4. A number of candidate features for the model include customer profiles and location information. The best set of features is useful not only for improving the prediction accuracy but also in the sense that it enables us to know the features that are associated with the revisiting behavior. We conduct feature selection optimization via the state-of-the-art framework for hyper-parameter optimization, Optuna (Akiba et al., 2019). The feature optimization selects 60 features that consist of visit place histories, demographic information, such as age, and distance from home to the target shops.

With the features selected by the Optuna (Akiba et al., 2019), we use the XGBoost classifier (Chen & Guestrin, 2016) to estimate our uplift model and train it on the train data set.Footnote 10 To understand each feature’s contribution to the prediction of the lift effect, we utilize the SHapley Additive exPlanations (SHAP) value (Lundberg & Lee, 2017). The SHAP value represents the degree of the contribution of each feature on the prediction.Footnote 11. We chose the top 10 features based on the mean absolute SHAP values.Footnote 12 A high SHAP value of a feature implies that a high value of that feature has a positive effect on a lift effect prediction and vice versa. Figure 7 reports the distribution of the SHAP values for the top 10 features, and it suggests the importance of predicted demographic information, such as age or occupation. The distribution of the SHAP values of the most three important features suggest that our uplift model predicts a high lift effect for young males. In addition, we find that location information is also important, as represented by Dist from the shop to home. The figure demonstrates that the longer distance predicts a high lift effect. This result implies that such customers wouldn’t have revisit the shop, because they are from the shop, and the O2O advertisements can cause the revisit.

With the trained uplift model, we study the optimal advertisement strategy to maximize the revisit effects. We calculate the cumulative success rate and depict it in Fig. 8. The figure demonstrates the cumulative success rate when we distribute the advertisement to the customers based on the model’s predicted uplift score. The y-axis represents the cumulative success rate (i.e., cumulative revisit rate), and the x-axis represents the percentage of the customers who received the advertisement. The cumulative success rate achieves the peak when approximately 60% of people received the advertisement. Hence, the uplift model’s optimal strategy by this model is to distribute to the customers in the top 60% uplift scores.

5 Limitation

The A/B testings used in this study does not force customers to visit the target shops, but the customers who are willing to do so show up in our trajectory data as we could do in some field experiments. Hence, there is a possibility that the advertisement does not alter the customers’ trajectory and just picks up those who intend to change their trajectory. Therefore, it would be an oversimplification to claim that the O2O advertisements is the only source of the trajectory differences found in this study. d, as we discussed in the introduction section, the existing research continuously found that advertisements alter the behavior trajectory (Ghose et al., 2019; Hui et al., 2013a, 2013b). In addition, this limitation can remain in a more rigid random control study. Even with such limitation, the body of our research manifests that the customers who received the O2O advertisements show different trajectories than the voluntary visit customers: move a wider area, travel a longer distance, and have a higher revisit ratio. We believe that our research conducts a large-scale quasi-field experiment that proposes a promising direction of behavior research in this marketing area. We also would like to mention here that our result can change according to the threshold discussed in Sect. 2.3. To discuss an impact of the threshold on the result, we discussed the robustness check in Appendix A.3.

6 Conclusions and discussion

In summary, our results demonstrate that the customers who are invited by the advertisement to the physical shops show different trajectories than those who visit the shop voluntarily. In our analysis, the invited customers demonstrate a more active shopping trajectory on their visit day, whereas the voluntary customers’ trajectories concentrate near the target shop. The invited customers move not just for longer distances, but also in larger areas than voluntary visit customers. These trajectory differences last even after the target shop visit day; their travel distances are approximately 6.8% longer than the voluntary visit customers. In addition, we found that the advertisements pick up customers who have higher revisit odds, implying that the invited customers are valuable for businesses. While estimating the exact economic value on businesses (e.g., sales, revenue, etc.) is out of the scope of the current paper, we can assume that the economic value to gain a marginal revisit customer is more than the price that an advertiser pay to gain a marginal visitFootnote 13.

Based on this finding, we proposed a causal inference prediction model to devise an advertisement strategy for increasing the revisiting probability. Our model does not just predict the revisit probability for a given customer, rather than how much the advertisement will increase the number of revisits, enabling us to deliver the advertisements efficiently.

These insights provide opportunities for retailers who have their physical shops and the owners of huge commercial facilities, such as shopping malls. For example, the advertisements can invite new customers to their places, and such customers travel to more extensive areas that contribute to the business administrated by shopping malls. The customers who travel to more extensive areas would be exposed to more shopping spots, resulting in a high likelihood of committing unplanned purchases. These implications are also useful for the advertising industry. The price of advertisements conventionally relies on simple metrics of volume of the audiences and the resulting sales. Typical pricing of display advertisement depends the number of clicks a given ad earns (cost-per-click, CPC) or the number of conversions they earn (cost-per-action, CPA). However, our findings suggest some arbitrage between the values that advertisements bring and their actual price. For example, we could augment the value of O2O advertisements, demonstrating the economic value that would be brought to their marketplace.

The results here are not without limitations as discussed in the previous section, calling calls for further studies. First, while we found that O2O advertisements invited customers with a more massive externality, we cannot conclude that the advertisement itself is only the cause of such externality. Second, because we have no access to the customers’ spending history, we are not able to estimate the economic value of the externalizes. Therefore, future work will need to collect customer spending information to estimate the economic value of invited customers to the area near the target shops. As we proposed the revisit uplift model, one possible direction for further work is to propose a causal machine learning model that maximizes the causal externality effect by O2O advertisements.