Abstract
Many firms place ads in search engines in order to motivate users to visit their website. However, users may react differently to an ad. For example, some users who intended to click on the organic result may now click on the ad, which creates unnecessary costs and diminishes profits. This paper presents the first systematic investigation of all effects that an ad can have on users during their search with a given keyword at a certain point in time. It covers the three sequential decisions users make: whether (and where) to click, whether to convert, and what (or how much) to buy. We develop a model by which each effect can be quantified separately and regressed on the search context, allowing insights into what drives user reactions. To demonstrate its application, we conduct a large-scale field experiment on brand bidding, the practice of placing ads for brand names.
Notes
Due to a Google policy, the conversion behaviour and the basket choice of users who visit the firm’s website via the organic result cannot be traced if they use a secure internet connection (https). However, a sufficiently large number of users used a standard internet connection (http), so we base our estimates on their behaviour, assuming that it does not differ significantly from that of https users.
References for appendices
Hausman, J. A. (1975). An instrumental variable approach to full information estimators for linear and certain nonlinear econometric models. Econometrica, 43(4), 727–738.
Plummer, M. (2003). JAGS: A program for analysis of Bayesian graphical models using Gibbs sampling. Proceedings of the 3rd International Workshop on Distributed Statistical Computing.
Wichmann, B., & Hill, D. (1982). Algorithm AS 183: An efficient and portable pseudorandom number generator. Journal of the Royal Statistical Society: Series C: Applied Statistics, 31(2), 188–190.
Zellner, A. (1962). An efficient method of estimating seemingly unrelated regressions and tests for aggregation bias. Journal of the American Statistical Association, 57(298), 348–368.
Additional information
Responsible Editor: Ulrike Lechner
Appendices
Appendix
Identification of the model
Since our model contains several latent variables, which, furthermore, are allowed to interact with each other, it is not intuitively clear that it is identified. Therefore, we now elaborate on this point.
Let us first consider the “control part” of our model, that is, eqs. (3), (7), and (10) (in combination with (1a), (5a), and (14a)). These equations are independent from the others, so that their identification can be analysed separately. They can be represented as
where the \( {\boldsymbol{X}}_{k,t}^{\dots } \)’s are vectors summarizing the exogenous variables in the respective equations. Conditional on the latent utilities \( {U}_{k,t}^{\left\langle 1\right\rangle } \) and \( {U}_{k,t}^{1:\left\langle 1\right\rangle } \), (16) is essentially a system of seemingly unrelated regression equations (SUREs), which is known to be identified (Zellner 1962).
Next, we consider the “treatment part” of our model, but exclude all equations relating to the click level for the moment. The remaining eqs. (6), (8), (9), (11), (12), and (13) (in combination with (5b) and (14b)) can be summarized and represented as
(17) is not a system of SUREs because \( {U}_{k,t}^{\to j:\left\langle 1\right\rangle } \) and \( \log \left({\mathrm{RPCV}}_{k,t}^{\to j:1}\right) \) depend on the ad position, which is endogenous. Since (17) is triangular, however, its likelihood and posterior distribution are the same as for a system of SUREs (Zellner 1962, Hausman 1975). Such systems are identified if the endogenous variable is determined by at least one exogenous variable that is not part of the other equations in the system; concretely, \( {\boldsymbol{X}}_{k,t}^{AP} \) needs to contain at least one variable that none of the other variable vectors contains. This condition is satisfied in our model because \( {\mathrm{Bid}}_{k,t} \) is such a variable.
Given that (17) is identified, \( {f}_k^{CPC} \), \( {\epsilon}_{k,t}^{CPC} \), \( {f}_k^{AP} \), and \( {\epsilon}_{k,t}^{AP} \) are identified in the same way as the components of (16). \( {f}_k^{1:1\to j:\left\langle 1\right\rangle } \) and \( {f}_k^{1:1\to j:1} \) are also identified because \( {f}_k^{1:\left\langle 1\right\rangle } \) and \( {f}_k^{1:1} \) can be replaced with their estimates from the control part. In contrast, \( {\epsilon}_{k,t}^{1:1\to :\left\langle 1\right\rangle } \) and \( {\epsilon}_{k,t}^{1:1\to j:1} \) cannot be recovered individually from the sums \( {\epsilon}_{k,t}^{1:\left\langle 1\right\rangle }+{\epsilon}_{k,t}^{1:1\to :\left\langle 1\right\rangle } \) and \( {\epsilon}_{k,t}^{1:1}+{\epsilon}_{k,t}^{1:1\to j:1} \), which are identified in the same way as the error terms in (16), because \( {\epsilon}_{k,t}^{1:\left\langle 1\right\rangle } \) and \( {\epsilon}_{k,t}^{1:1} \) are not known. Still, the corresponding elements of the covariance matrix are identified. This is because the distributions of \( {\epsilon}_{k,t}^{1:\left\langle 1\right\rangle } \) and \( {\epsilon}_{k,t}^{1:1} \) can be parameterized from the control part. We give an example for demonstration. Since all error terms have zero mean, the covariance of \( {\epsilon}_{k,t}^{1:\left\langle 1\right\rangle }+{\epsilon}_{k,t}^{1:1\to :\left\langle 1\right\rangle } \) and \( {\epsilon}_{k,t}^{1:1}+{\epsilon}_{k,t}^{1:1\to j:1} \) can be decomposed as follows:
\( \mathrm{Cov}\left[{\epsilon}_{k,t}^{1:\left\langle 1\right\rangle }+{\epsilon}_{k,t}^{1:1\to :\left\langle 1\right\rangle },{\epsilon}_{k,t}^{1:1}+{\epsilon}_{k,t}^{1:1\to j:1}\right]=\mathrm{E}\left[{\epsilon}_{k,t}^{1:\left\langle 1\right\rangle}\cdotp {\epsilon}_{k,t}^{1:1}\right]+\mathrm{E}\left[{\epsilon}_{k,t}^{1:\left\langle 1\right\rangle}\cdotp {\epsilon}_{k,t}^{1:1\to j:1}\right]+\mathrm{E}\left[{\epsilon}_{k,t}^{1:1\to :\left\langle 1\right\rangle}\cdotp {\epsilon}_{k,t}^{1:1}\right]+\mathrm{E}\left[{\epsilon}_{k,t}^{1:1\to :\left\langle 1\right\rangle}\cdotp {\epsilon}_{k,t}^{1:1\to j:1}\right]=\mathrm{Cov}\left[{\epsilon}_{k,t}^{1:\left\langle 1\right\rangle },{\epsilon}_{k,t}^{1:1}\right]+\mathrm{Cov}\left[{\epsilon}_{k,t}^{1:1\to :\left\langle 1\right\rangle },{\epsilon}_{k,t}^{1:1\to j:1}\right] \)
Put differently, we have \( \mathrm{Cov}\left[{\epsilon}_{k,t}^{1:1\to :\left\langle 1\right\rangle },{\epsilon}_{k,t}^{1:1\to j:1}\right]=\mathrm{Cov}\left[{\epsilon}_{k,t}^{1:\left\langle 1\right\rangle }+{\epsilon}_{k,t}^{1:1\to :\left\langle 1\right\rangle },{\epsilon}_{k,t}^{1:1}+{\epsilon}_{k,t}^{1:1\to j:1}\right]-\mathrm{Cov}\left[{\epsilon}_{k,t}^{1:\left\langle 1\right\rangle },{\epsilon}_{k,t}^{1:1}\right] \); i.e., the covariance of \( {\epsilon}_{k,t}^{1:1\to :\left\langle 1\right\rangle } \) and \( {\epsilon}_{k,t}^{1:1\to j:1} \) is uniquely determined by the covariance of \( {\epsilon}_{k,t}^{1:\left\langle 1\right\rangle } \) and \( {\epsilon}_{k,t}^{1:1} \), which can be estimated from the control part, and the covariance of \( {\epsilon}_{k,t}^{1:\left\langle 1\right\rangle }+{\epsilon}_{k,t}^{1:1\to :\left\langle 1\right\rangle } \) and \( {\epsilon}_{k,t}^{1:1}+{\epsilon}_{k,t}^{1:1\to j:1} \), which is estimated in the treatment part. Here we have used that \( \mathrm{E}\left[{\epsilon}_{k,t}^{1:\left\langle 1\right\rangle}\cdotp {\epsilon}_{k,t}^{1:1\to j:1}\right]=\mathrm{E}\left[{\epsilon}_{k,t}^{1:1\to :\left\langle 1\right\rangle}\cdotp {\epsilon}_{k,t}^{1:1}\right]=0 \) because \( \mathrm{Cov}\left[{\epsilon}_{k,t}^{1:\left\langle 1\right\rangle },{\epsilon}_{k,t}^{1:1\to j:1}\right]=\mathrm{Cov}\left[{\epsilon}_{k,t}^{1:1\to :\left\langle 1\right\rangle },{\epsilon}_{k,t}^{1:1}\right]=0 \) due to (14a) and (14b).
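The covariance argument above can be checked numerically. The following is only a minimal sketch under stated assumptions: the error terms are simulated as standard normal variables, the covariances 0.3 (control errors) and 0.5 (effect errors), the sample size, and all function names are hypothetical choices made for illustration, and for simplicity the control covariance is computed from the same simulated sample rather than from separate control periods.

```python
import random


def cov(xs, ys):
    """Sample covariance of two equally long sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (n - 1)


def simulate(n=50_000, cov_control=0.3, cov_effect=0.5, seed=1):
    """Simulate control errors (e1, e2) and effect errors (d1, d2).

    The two pairs are drawn independently of each other, mirroring the
    zero cross-covariances the text attributes to (14a) and (14b).
    The numeric values are illustrative assumptions only.
    """
    rng = random.Random(seed)
    e1, e2, d1, d2 = [], [], [], []
    for _ in range(n):
        z1, z2, z3, z4 = (rng.gauss(0.0, 1.0) for _ in range(4))
        e1.append(z1)
        e2.append(cov_control * z1 + (1 - cov_control ** 2) ** 0.5 * z2)
        d1.append(z3)
        d2.append(cov_effect * z3 + (1 - cov_effect ** 2) ** 0.5 * z4)
    return e1, e2, d1, d2


def recovered_effect_cov(e1, e2, d1, d2):
    """Covariance of the effect errors implied by the two observable pieces."""
    sum1 = [a + b for a, b in zip(e1, d1)]  # error sum seen in the treatment part
    sum2 = [a + b for a, b in zip(e2, d2)]
    return cov(sum1, sum2) - cov(e1, e2)    # subtract the control-part covariance


if __name__ == "__main__":
    print(recovered_effect_cov(*simulate()))  # should be close to the assumed 0.5
```

The recovered value only approximates the assumed covariance because the cross terms vanish in expectation, not in every finite sample.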
Finally, let us consider the click-level effects of SEA, that is, eqs. (2a), (2b), and (4b) (in combination with (1b), (4a) and (14b)). As indicated in the paper, it would not be possible to identify the four click-level effects for each observation in the treatment scenario without using a model because each observation contributes only two data points (the respective click probabilities). Modelling them allows their identification only under certain conditions. For our actual model as presented in the paper, it is not easy to see whether these conditions are met, so we use a toy model and dataset for illustration. The toy dataset consists of only two observations for the treatment scenario and is shown in Table 9.
In the toy model, we set \( {p}_{k,t}^{\left\langle 1\right\rangle }={\alpha}^{\left\langle 1\right\rangle }+{\beta}^{\left\langle 1\right\rangle}\cdotp {x}_k \) (replacing (3)) and \( {p}_{k,t}^{i\to \left\langle j\right\rangle }={\alpha}^{i\to \left\langle j\right\rangle }+{\beta}^{i\to \left\langle j\right\rangle}\cdotp {x}_k \) (replacing (4a) and (4b)), where x_{k} is a dummy variable. We assume that α^{〈1〉} is known to equal 0.3 (e.g., because the control part has already been estimated) and set β^{0 → 〈2〉} = β^{1 → 〈0〉} = β^{1 → 〈2〉} = 0, so that only the addition on organic effect may depend on x_{k}. We also proceed as if \( {p}_{k,t}^{\to \left\langle 1\right\rangle } \) and \( {p}_{k,t}^{\to \left\langle 2\right\rangle } \) were observed directly instead of the corresponding numbers of individual searches and clicks. For the toy dataset, (2a) and (2b) would then lead to the following equation system arising during estimation:
We consider three variants of the toy model. In variant 1, we set β^{〈1〉} = β^{0 → 〈1〉} = 0, which means that the probability of a click without an ad and all click-level effects are constant. For this case, the right-hand sides of (I) and (III) on the one hand and of (II) and (IV) on the other hand are identical. Regarding (I) and (III), this entails a direct contradiction because the left-hand sides differ. Regarding (II) and (IV), for which the left-hand sides are equal, one equation is redundant. Although (18) contains four unknown variables (the α’s) and the same number of equations, it is therefore not identified in this case.
In variant 2, we again set β^{0 → 〈1〉} = 0 but assume that β^{〈1〉} is known to equal 0.5. This means that all click-level effects are constant but that the probability of a click without an ad may vary across keywords. For this case, (18) again contains four unknown variables, but there are no contradicting or redundant equations, so that it is identified. Its solution is α^{0 → 〈2〉} = α^{1 → 〈2〉} = 0.1, α^{0 → 〈1〉} = 0.52, α^{1 → 〈0〉} = 0.78.
In variant 3, we assume again that β^{〈1〉} is known to equal 0.5 but allow the addition on organic effect to differ between keywords, that is, impose no restrictions on β^{0 → 〈1〉}. For this case, (18) is again not identified. This is because it now contains five unknown variables (the α’s and β^{0 → 〈1〉}) but still only four equations. Note that this problem would not be solved if we had observations for more days because the additional equations would again be either contradicting or redundant. Obviously, allowing the other click-level effects to also differ between keywords would make things worse.
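The three variants can be reproduced with exact rational arithmetic. The sketch below rests on assumptions that are not confirmed by this appendix: it supposes that (2a) and (2b) read \( {p}^{\to \left\langle 1\right\rangle }={p}^{\left\langle 1\right\rangle}\left(1-{p}^{1\to \left\langle 0\right\rangle }-{p}^{1\to \left\langle 2\right\rangle}\right)+\left(1-{p}^{\left\langle 1\right\rangle}\right){p}^{0\to \left\langle 1\right\rangle } \) and \( {p}^{\to \left\langle 2\right\rangle }={p}^{\left\langle 1\right\rangle }{p}^{1\to \left\langle 2\right\rangle }+\left(1-{p}^{\left\langle 1\right\rangle}\right){p}^{0\to \left\langle 2\right\rangle } \), and it uses two hypothetical observations (0.4, 0.1) for x = 0 and (0.2, 0.1) for x = 1 that were back-solved from the solution reported for variant 2; they are not taken from Table 9. All function names are illustrative.

```python
from fractions import Fraction as F

# Hypothetical toy observations (x_k, p_to_organic, p_to_ad); assumed, not from Table 9.
OBS = [(0, F(2, 5), F(1, 10)), (1, F(1, 5), F(1, 10))]
ALPHA_1 = F(3, 10)  # alpha^<1>, stated as known in the text


def build_system(beta_1, free_beta_01=False):
    """Augmented rows for the unknowns (a01, a10, a12, a02[, b01])."""
    rows = []
    for x, p_org, p_ad in OBS:
        p = ALPHA_1 + beta_1 * x              # click probability without an ad
        eq_org = [1 - p, -p, -p, F(0)]        # assumed form of (2a), rearranged
        eq_ad = [F(0), F(0), p, 1 - p]        # assumed form of (2b)
        if free_beta_01:                      # variant 3: beta^{0-><1>} unrestricted
            eq_org.append((1 - p) * x)
            eq_ad.append(F(0))
        rows.append(eq_org + [p_org - p])
        rows.append(eq_ad + [p_ad])
    return rows


def rref(aug):
    """Reduced row echelon form of an augmented matrix; returns (matrix, pivot columns)."""
    m = [row[:] for row in aug]
    pivots, r = [], 0
    for c in range(len(m[0]) - 1):            # last column is the right-hand side
        piv = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if piv is None:
            continue
        m[r], m[piv] = m[piv], m[r]
        m[r] = [v / m[r][c] for v in m[r]]
        for i in range(len(m)):
            if i != r and m[i][c] != 0:
                factor = m[i][c]
                m[i] = [vi - factor * vr for vi, vr in zip(m[i], m[r])]
        pivots.append(c)
        r += 1
        if r == len(m):
            break
    return m, pivots


def analyse(aug):
    """Return (consistent, identified, solution or None)."""
    m, pivots = rref(aug)
    n_unknowns = len(aug[0]) - 1
    consistent = all(any(v != 0 for v in row[:-1]) or row[-1] == 0 for row in m)
    identified = consistent and len(pivots) == n_unknowns
    solution = [m[i][-1] for i in range(n_unknowns)] if identified else None
    return consistent, identified, solution


if __name__ == "__main__":
    print("variant 1:", analyse(build_system(F(0))))
    print("variant 2:", analyse(build_system(F(1, 2))))
    print("variant 3:", analyse(build_system(F(1, 2), free_beta_01=True)))
```

Under these assumptions, variant 1 is inconsistent, variant 2 has the unique solution quoted in the text, and variant 3 is consistent but has more unknowns than independent equations.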
Summarizing, the toy model is only identified if the “explaining variation” in \( {p}_{k,t}^{\left\langle 1\right\rangle } \) is greater than the “explained variation” in \( {p}_{k,t}^{i\to \left\langle j\right\rangle } \). This is also true for our actual model. Practically, it means that \( {p}_{k,t}^{\left\langle 1\right\rangle } \) should be determined by at least one factor that is not also a determinant of \( {p}_{k,t}^{i\to \left\langle j\right\rangle } \). We exploit unobserved keyword heterogeneity for this purpose. This is not the only possible choice; for example, we have also experimented with distinguishing between weekdays and weekends. However, we have found this variable to be only a very weak predictor of \( {p}_{k,t}^{\left\langle 1\right\rangle } \), so that it can contribute only marginally to the explaining variation. In contrast, unobserved keyword heterogeneity has been found to have a substantial influence on \( {p}_{k,t}^{\left\langle 1\right\rangle } \) (with σ^{〈1〉} = 0.66 as reported in Table 3), making it a good choice.
One can assume that the click-level effects are identified in our actual model for the same reasons for which they are identified in variant 2 of our toy model. However, due to the use of a multinomial logit link function, the inclusion of correlated error terms in the latent utilities, and some other hurdles, this is difficult to track analytically. Therefore, we conducted a simulation study to confirm identification. We considered a reduced version of our model for this purpose to accelerate estimation. Besides the click level, the reduced model also included the conversion level to explore whether its correlation with the click level is identified. The revenue level was not included because it is analogous to the conversion level. Furthermore, the explanatory variables in (3) and (7) were replaced by a single, randomly drawn one, x_{1k, t}. The explanatory variables in (4b) and (8) were replaced by x_{1k, t} and a second random variable x_{2k, t} that summarizes the variables which do not exist in the control scenario (such as the ad position). x_{1k, t} and x_{2k, t} were drawn from standard normal distributions. The equations relating to the ad position and the CPC ((12) and (13)) were not considered. (14a) and (14b) were adapted correspondingly. The data generation mechanism was based on the equations of our model. The number of individual searches was drawn as N_{k, t} ∼ Uniform(1; 10000). The parameters to be estimated were generated as follows:

1. Draw β ∼ Normal(0; 1) for each \( \beta \in \boldsymbol{\beta} \), where \( \boldsymbol{\beta} \) summarizes all regression coefficients (including the intercepts) in the reduced versions of (3), (4b), (7), and (8).
2. Draw \( {\Sigma}^{\prime}\sim {\mathrm{Wishart}}^{-1}\left(\frac{1}{2}\cdotp \mathbf{1}(2);2\right) \) and \( {\Sigma^{\to}}^{\prime}\sim {\mathrm{Wishart}}^{-1}\left(\frac{1}{5}\cdotp \mathbf{1}(5);5\right) \), where 1(n) denotes the n-dimensional identity matrix. Calculate Σ = Σ^{′}/ max(Σ^{′}) · 0.25 and Σ^{→} = Σ^{→′}/ max( Σ^{→′}) · 0.25.
3. Draw σ^{〈1〉} ∼ Uniform(0; 3) or set σ^{〈1〉} = 1 (see below). Set σ^{1 : 〈1〉} = 1 and σ^{1 : 1 → : 〈1〉} = 1.
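The parameter-generation steps above can be sketched as follows. This is a simplified illustration, not the authors' code: it covers only the two-dimensional matrix Σ, it assumes that the first argument of the inverse Wishart distribution is its scale matrix and the second its degrees of freedom, it reads max(Σ′) as the largest matrix element, and the function names and the number of coefficients are hypothetical.

```python
import random


def inverse_wishart_2x2(rng, scale=0.5, df=2):
    """One draw from an inverse Wishart with scale matrix scale * I(2).

    Built as the inverse of a Wishart draw, which is a sum of df outer
    products of N(0, (scale * I)^-1) vectors (assumed parameterisation).
    """
    sd = (1.0 / scale) ** 0.5
    a = b = c = 0.0                      # W = [[a, b], [b, c]]
    for _ in range(df):
        z1, z2 = rng.gauss(0.0, sd), rng.gauss(0.0, sd)
        a += z1 * z1
        b += z1 * z2
        c += z2 * z2
    det = a * c - b * b
    return [[c / det, -b / det], [-b / det, a / det]]  # closed-form 2x2 inverse


def rescale(matrix, target=0.25):
    """Divide by the largest element and multiply by target (step 2)."""
    largest = max(max(row) for row in matrix)
    return [[v / largest * target for v in row] for row in matrix]


def draw_parameters(rng, n_coef, fixed_sigma=None):
    """Steps 1 to 3 for the reduced model; n_coef is a hypothetical count."""
    beta = [rng.gauss(0.0, 1.0) for _ in range(n_coef)]          # step 1
    sigma_cov = rescale(inverse_wishart_2x2(rng))                # step 2 (Sigma only)
    if fixed_sigma is None:
        sigma_kw = rng.uniform(0.0, 3.0)                         # step 3, varied
    else:
        sigma_kw = fixed_sigma                                   # step 3, held constant
    return {"beta": beta, "Sigma": sigma_cov, "sigma_kw": sigma_kw}


if __name__ == "__main__":
    print(draw_parameters(random.Random(0), n_coef=6))
```

Because the largest element of a positive definite matrix lies on its diagonal, the rescaling caps the largest variance at 0.25 while preserving the correlation structure.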
We carried out three experiments, using 100 replications each. In experiment 1, we were interested in the influence the absolute magnitude of unobserved keyword heterogeneity has on the estimation results, so we varied only σ^{〈1〉} across replications. The purpose of experiment 2 was to investigate the influence of the relative magnitude of unobserved keyword heterogeneity; thus, we held σ^{〈1〉} constant across replications but varied β. In experiment 3, the complete data generation mechanism was repeated in each replication, which simulates the application of the model to different real-world datasets.
We measured the simulation error by the absolute deviation \( \left|v-\hat{v}\right| \) between the true value v and its estimate \( \hat{v} \). The average results across all replications are reported in Table 10. As can be seen, all parameters were, on average, accurately recovered. The deviation between the true values and their estimates was usually lower than one decimal. Given that the absolutes of the true values were, on average, comparatively large, this deviation can be neglected. This suggests that our model is identified. Interestingly, the results for experiments 1 and 2 indicate that the deviations are acceptable even if the absolute or relative magnitude of unobserved keyword heterogeneity is comparatively low.
Estimation approach
We estimated our model using the Gibbs sampler JAGS (Plummer 2003). Since the control part of our model is independent of its treatment part (by (14a) and (14b)), as mentioned earlier, we used a two-stage approach. In the first stage, we considered only the control part and, correspondingly, the observations made in the control periods of our field experiment. In the second stage, we used the estimated values from the control part to estimate the treatment part from the observations made in the treatment periods.
The specification for an iteration in the first stage (for \( t\in \mathcal{T} \)) is essentially as follows:

1. Draw α^{〈1〉}, \( {\alpha}_l^{\left\langle 1\right\rangle}\in {\boldsymbol{\alpha}}^{\left\langle 1\right\rangle } \), α^{1 : 〈1〉}, \( {\alpha}_l^{1:\left\langle 1\right\rangle}\in {\boldsymbol{\alpha}}^{1:\left\langle 1\right\rangle } \), α^{1 : 1}, and \( {\alpha}_l^{1:1}\in {\boldsymbol{\alpha}}^{1:1} \). Prior distribution: Normal(0; 10^{2}).
2. Draw \( {\beta_l^{\left\langle 1\right\rangle}}_{l=1,\dots, 3} \), \( {\beta_l^{1:\left\langle 1\right\rangle}}_{l=1,\dots, 2} \), and \( {\beta_l^{1:1}}_{l=1,\dots, 2} \). Prior distribution: Normal(0; 10^{2}).
3. Draw σ^{〈1〉}, σ^{1 : 〈1〉}, and σ^{1 : 1}. Prior distribution: Uniform(0; 10).
4. Draw Σ. Prior distribution: \( {\mathrm{Wishart}}^{-1}\left(\frac{1}{3}\cdotp \mathbf{1}(3);3\right) \).
5. For each keyword k, draw \( {\overset{\sim }{\alpha}}_k^{\left\langle 1\right\rangle } \), \( {\overset{\sim }{\alpha}}_k^{1:\left\langle 1\right\rangle } \), and \( {\overset{\sim }{\alpha}}_k^{1:1} \) as specified in (3), (7), and (10), conditional on σ^{〈1〉}, σ^{1 : 〈1〉}, and σ^{1 : 1}, respectively.
For each keyword k and each point in time t:
6. Calculate \( {\epsilon}_{k,t}^{1:1} \) by (10), given α^{1 : 1}, \( {\boldsymbol{\alpha}}^{1:1} \), \( {\overset{\sim }{\alpha}}_k^{1:1} \), and \( {\beta_l^{1:1}}_{l=1,\dots, 2} \).
7. Draw \( \left({\epsilon}_{k,t}^{\left\langle 1\right\rangle}\kern0.5em {\epsilon}_{k,t}^{1:\left\langle 1\right\rangle}\kern0.5em {\epsilon}_{k,t}^{1:1}\right) \) as specified in (14a), conditional on Σ and \( {\epsilon}_{k,t}^{1:1} \).
8. Calculate \( {U}_{k,t}^{\left\langle 1\right\rangle } \) and \( {p}_{k,t}^{\left\langle 1\right\rangle } \) by (3), given α^{〈1〉}, \( {\boldsymbol{\alpha}}^{\left\langle 1\right\rangle } \), \( {\overset{\sim }{\alpha}}_k^{\left\langle 1\right\rangle } \), \( {\beta_l^{\left\langle 1\right\rangle}}_{l=1,\dots, 3} \), and \( {\epsilon}_{k,t}^{\left\langle 1\right\rangle } \).
9. Calculate \( {U}_{k,t}^{1:\left\langle 1\right\rangle } \) and \( {p}_{k,t}^{1:\left\langle 1\right\rangle } \) by (7), given α^{1 : 〈1〉}, \( {\boldsymbol{\alpha}}^{1:\left\langle 1\right\rangle } \), \( {\overset{\sim }{\alpha}}_k^{1:\left\langle 1\right\rangle } \), \( {\beta_l^{1:\left\langle 1\right\rangle}}_{l=1,\dots, 2} \), and \( {\epsilon}_{k,t}^{1:\left\langle 1\right\rangle } \).
10. Calculate the likelihood function. The likelihood for the click level conditional on \( {p}_{k,t}^{\left\langle 1\right\rangle } \) is given by \( {\prod}_{k,t}\left(\begin{array}{c}{N}_{k,t}\\ {}{N}_{k,t}^{\left\langle 1\right\rangle}\end{array}\right)\cdotp {\left({p}_{k,t}^{\left\langle 1\right\rangle}\right)}^{N_{k,t}^{\left\langle 1\right\rangle }}\cdotp {\left(1-{p}_{k,t}^{\left\langle 1\right\rangle}\right)}^{N_{k,t}-{N}_{k,t}^{\left\langle 1\right\rangle }} \) due to (1a). Similarly, the likelihood for the conversion level conditional on \( {p}_{k,t}^{1:\left\langle 1\right\rangle } \) is given by \( {\prod}_{k,t}\left(\begin{array}{c}{N}_{k,t}^{\left\langle 1\right\rangle}\\ {}{N}_{k,t}^{1:\left\langle 1\right\rangle}\end{array}\right)\cdotp {\left({p}_{k,t}^{1:\left\langle 1\right\rangle}\right)}^{N_{k,t}^{1:\left\langle 1\right\rangle }}\cdotp {\left(1-{p}_{k,t}^{1:\left\langle 1\right\rangle}\right)}^{N_{k,t}^{\left\langle 1\right\rangle }-{N}_{k,t}^{1:\left\langle 1\right\rangle }} \) due to (5a). The likelihood for the revenue level is accounted for by drawing the error terms in step 7.
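The two binomial likelihoods of the final step can be evaluated on the log scale as in the sketch below. It only illustrates the formulas; it is not the JAGS specification used for estimation, and the dictionary keys and function names are hypothetical.

```python
from math import comb, log


def log_binomial(n, k, p):
    """Log of C(n, k) * p^k * (1 - p)^(n - k) for 0 < p < 1."""
    return log(comb(n, k)) + k * log(p) + (n - k) * log(1.0 - p)


def control_loglik(observations):
    """Sum of click-level and conversion-level log-likelihoods.

    Each observation is a dict with hypothetical keys:
      N        number of searches            (N_{k,t})
      N_click  number of organic clicks      (N^<1>_{k,t})
      N_conv   number of conversions         (N^{1:<1>}_{k,t})
      p_click  click probability             (p^<1>_{k,t})
      p_conv   conversion probability        (p^{1:<1>}_{k,t})
    """
    total = 0.0
    for o in observations:
        total += log_binomial(o["N"], o["N_click"], o["p_click"])       # click level
        total += log_binomial(o["N_click"], o["N_conv"], o["p_conv"])   # conversion level
    return total


if __name__ == "__main__":
    example = [{"N": 10, "N_click": 3, "N_conv": 1, "p_click": 0.3, "p_conv": 0.2}]
    print(control_loglik(example))
```

Working with sums of logarithms instead of the product over k and t avoids numerical underflow when many keyword-day observations are combined.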
In the second stage (for \( t\in {\mathcal{T}}^{\to } \)), each iteration consists essentially of the following steps:

1. Draw α^{i → 〈j〉}_{j ≠ i}, \( {\alpha}_l^{\left\langle i\right\rangle \to j}\in {{\boldsymbol{\alpha}}^{\left\langle i\right\rangle \to j}}_{j\ne i} \), α^{1 : 1 → : 〈1〉}, \( {\alpha}_l^{1:1\to :\left\langle 1\right\rangle}\in {\boldsymbol{\alpha}}^{1:1\to :\left\langle 1\right\rangle } \), α^{1 : 1 → : 1}, \( {\alpha}_l^{1:1\to :1}\in {\boldsymbol{\alpha}}^{1:1\to :1} \), α^{CPC}, \( {\alpha}_l^{CPC}\in {\boldsymbol{\alpha}}^{CPC} \), α^{AP}, and \( {\alpha}_l^{AP}\in {\boldsymbol{\alpha}}^{AP} \). Prior distribution: Normal(0; 10^{2}).
2. Draw \( {\beta_l^{i\to \left\langle j\right\rangle}}_{j\ne i;l=1,\dots, 4} \), \( {\beta_l^{1:1\to :\left\langle 1\right\rangle}}_{l=1,\dots, 3} \), \( {\beta_l^{1:1\to :1}}_{l=1,\dots, 3} \), \( {\beta_l^{CPC}}_{l=1,\dots, 4} \), and \( {\beta_l^{AP}}_{l=1,\dots, 4} \), as well as \( {\gamma_l^{1:1\to 2:\left\langle 1\right\rangle}}_{l=1,\dots, 2} \) and \( {\gamma_l^{1:1\to 2:1}}_{l=1,\dots, 2} \). Prior distribution: Normal(0; 10^{2}).
3. Draw σ^{1 : 1 → : 〈1〉}, σ^{1 : 1 → : 1}, σ^{CPC}, and σ^{AP}. Prior distribution: Uniform(0; 10).
4. Draw Σ^{→}. Prior distribution: \( {\mathrm{Wishart}}^{-1}\left(\frac{1}{9}\cdotp \mathbf{1}(9);9\right) \).
5. For each keyword k, draw \( {\overset{\sim }{\alpha}}_k^{1:1\to :\left\langle 1\right\rangle } \), \( {\overset{\sim }{\alpha}}_k^{1:1\to :1} \), \( {\overset{\sim }{\alpha}}_k^{CPC} \), and \( {\overset{\sim }{\alpha}}_k^{AP} \) as specified in (8), (11), (12), and (13), conditional on σ^{1 : 1 → : 〈1〉}, σ^{1 : 1 → : 1}, σ^{CPC}, and σ^{AP}, respectively.
For each keyword k and each point in time t:
6. Draw \( \left({\epsilon}_{k,t}^{\left\langle 1\right\rangle}\kern0.5em {\epsilon}_{k,t}^{1:\left\langle 1\right\rangle}\kern0.5em {\epsilon}_{k,t}^{1:1}\right) \) as specified in (14a), using the estimate of Σ.
7. Calculate \( {U}_{k,t}^{\left\langle 1\right\rangle } \) and \( {p}_{k,t}^{\left\langle 1\right\rangle } \) by (3), given \( {\epsilon}_{k,t}^{\left\langle 1\right\rangle } \), using the estimates of α^{〈1〉}, \( {\boldsymbol{\alpha}}^{\left\langle 1\right\rangle } \), \( {\overset{\sim }{\alpha}}_k^{\left\langle 1\right\rangle } \) and \( {\beta_l^{\left\langle 1\right\rangle}}_{l=1,\dots, 3} \).
8. Calculate \( {U}_{k,t}^{1:\left\langle 1\right\rangle } \) and \( {p}_{k,t}^{1:\left\langle 1\right\rangle } \) by (7), given \( {\epsilon}_{k,t}^{1:\left\langle 1\right\rangle } \), using the estimates of α^{1 : 〈1〉}, \( {\boldsymbol{\alpha}}^{1:\left\langle 1\right\rangle } \), \( {\overset{\sim }{\alpha}}_k^{1:\left\langle 1\right\rangle } \), and \( {\beta_l^{1:\left\langle 1\right\rangle}}_{l=1,\dots, 2} \).
9. Calculate \( \log \left({\mathrm{RPCV}}_{k,t}^{1:1}\right) \) by (10), given \( {\epsilon}_{k,t}^{1:1} \), using the estimates of α^{1 : 1}, \( {\boldsymbol{\alpha}}^{1:1} \), \( {\overset{\sim }{\alpha}}_k^{1:1} \), and \( {\beta_l^{1:1}}_{l=1,\dots, 2} \).
10. Calculate \( {\epsilon_{k,t}^{1:1\to j:1}}_{j=1,\dots, 2} \) and \( \Delta \mathrm{log}\left({\mathrm{RPCV}}_{k,t}^{1:1\to j:1}\right) \) by (9) and (11), given \( \log \left({\mathrm{RPCV}}_{k,t}^{1:1}\right) \), α^{1 : 1 → : 1}, \( {\boldsymbol{\alpha}}^{1:1\to :1} \), \( {\overset{\sim }{\alpha}}_k^{1:1\to :1} \), \( {\beta_l^{1:1\to :1}}_{l=1,\dots, 3} \), and \( {\gamma_l^{1:1\to 2:\left\langle 1\right\rangle}}_{l=1,\dots, 2} \).
11. Calculate \( {\epsilon}_{k,t}^{CPC} \) by (12), given α^{CPC}, \( {\boldsymbol{\alpha}}^{CPC} \), \( {\overset{\sim }{\alpha}}_k^{CPC} \), and \( {\beta_l^{CPC}}_{l=1,\dots, 4} \).
12. Calculate \( {\epsilon}_{k,t}^{AP} \) by (13), given α^{AP}, \( {\boldsymbol{\alpha}}^{AP} \), \( {\overset{\sim }{\alpha}}_k^{AP} \), and \( {\beta_l^{AP}}_{l=1,\dots, 4} \).
13. Draw \( \left({\epsilon_{k,t}^{i\to \left\langle j\right\rangle}}_{j\ne i}\kern0.5em {\epsilon}_{k,t}^{1:1\to :\left\langle 1\right\rangle}\kern0.5em {\epsilon_{k,t}^{1:1\to j:1}}_{j=1,\dots, 2}\kern0.5em {\epsilon}_{k,t}^{CPC}\kern0.5em {\epsilon}_{k,t}^{AP}\right) \) as specified in (14b), conditional on Σ^{→}, \( {\epsilon_{k,t}^{1:1\to j:1}}_{j=1,\dots, 2} \), \( {\epsilon}_{k,t}^{CPC} \), and \( {\epsilon}_{k,t}^{AP} \).
14. Calculate \( {U_{k,t}^{i\to \left\langle j\right\rangle}}_{j\ne i} \) by (4b), given α^{i → 〈j〉}_{j ≠ i}, \( {{\boldsymbol{\alpha}}^{\left\langle i\right\rangle \to j}}_{j\ne i} \), \( {\beta_l^{i\to \left\langle j\right\rangle}}_{j\ne i;l=1,\dots, 4} \), and \( {\epsilon_{k,t}^{i\to \left\langle j\right\rangle}}_{j\ne i} \).
15. Calculate \( {p_{k,t}^{i\to \left\langle j\right\rangle}}_{j\ne i} \) by (4a), given \( {U_{k,t}^{i\to \left\langle j\right\rangle}}_{j\ne i} \).
16. Calculate \( {p_{k,t}^{\to \left\langle j\right\rangle}}_{j=1,\dots, 2} \) by (2a) and (2b), given \( {p}_{k,t}^{\left\langle 1\right\rangle } \) and \( {p_{k,t}^{i\to \left\langle j\right\rangle}}_{j\ne i} \).
17. Calculate \( \Delta {U_{k,t}^{1:1\to j:\left\langle 1\right\rangle}}_{j=1,\dots, 2} \) by (8), given α^{1 : 1 → : 〈1〉}, \( {\boldsymbol{\alpha}}^{1:1\to :\left\langle 1\right\rangle } \), \( {\overset{\sim }{\alpha}}_k^{1:1\to :\left\langle 1\right\rangle } \), \( {\beta_l^{1:1\to :\left\langle 1\right\rangle}}_{l=1,\dots, 3} \), and \( {\epsilon}_{k,t}^{1:1\to :\left\langle 1\right\rangle } \).
18. Calculate \( {U_{k,t}^{\to j:\left\langle 1\right\rangle}}_{j=1,\dots, 2} \) and \( {p_{k,t}^{\to j:\left\langle 1\right\rangle}}_{j=1,\dots, 2} \) by (6), given \( {U}_{k,t}^{1:\left\langle 1\right\rangle } \) and \( \Delta {U_{k,t}^{1:1\to j:\left\langle 1\right\rangle}}_{j=1,\dots, 2} \).
19. Calculate the likelihood function. The likelihood for the click level conditional on \( {p_{k,t}^{\to \left\langle j\right\rangle}}_{j=1,\dots, 2} \) is given by \( \prod \limits_{k,t}\left(\begin{array}{c}{N}_{k,t}\\ {}{N}_{k,t}^{\to \left\langle 1\right\rangle },{N}_{k,t}^{\to \left\langle 2\right\rangle}\end{array}\right)\cdotp {\left({p}_{k,t}^{\to \left\langle 1\right\rangle}\right)}^{N_{k,t}^{\to \left\langle 1\right\rangle }}\cdotp {\left({p}_{k,t}^{\to \left\langle 2\right\rangle}\right)}^{N_{k,t}^{\to \left\langle 2\right\rangle }}\cdotp {\left(1-{p}_{k,t}^{\to \left\langle 1\right\rangle }-{p}_{k,t}^{\to \left\langle 2\right\rangle}\right)}^{N_{k,t}-{N}_{k,t}^{\to \left\langle 1\right\rangle }-{N}_{k,t}^{\to \left\langle 2\right\rangle }} \) due to (1b). The likelihood for the conversion level conditional on \( {p_{k,t}^{\to j:\left\langle 1\right\rangle}}_{j=1,\dots, 2} \) is given by \( \prod \limits_{k,t}\left(\begin{array}{c}{N}_{k,t}^{\to \left\langle 1\right\rangle}\\ {}{N}_{k,t}^{\to 1:\left\langle 1\right\rangle}\end{array}\right)\cdotp {\left({p}_{k,t}^{\to 1:\left\langle 1\right\rangle}\right)}^{N_{k,t}^{\to 1:\left\langle 1\right\rangle }}\cdotp {\left(1-{p}_{k,t}^{\to 1:\left\langle 1\right\rangle}\right)}^{N_{k,t}^{\to \left\langle 1\right\rangle }-{N}_{k,t}^{\to 1:\left\langle 1\right\rangle }}\cdotp \left(\begin{array}{c}{N}_{k,t}^{\to \left\langle 2\right\rangle}\\ {}{N}_{k,t}^{\to 2:\left\langle 1\right\rangle}\end{array}\right)\cdotp {\left({p}_{k,t}^{\to 2:\left\langle 1\right\rangle}\right)}^{N_{k,t}^{\to 2:\left\langle 1\right\rangle }}\cdotp {\left(1-{p}_{k,t}^{\to 2:\left\langle 1\right\rangle}\right)}^{N_{k,t}^{\to \left\langle 2\right\rangle }-{N}_{k,t}^{\to 2:\left\langle 1\right\rangle }} \) due to (5b). The likelihoods for the revenue level, the CPC, and the ad position are accounted for by drawing the error terms in step 13.
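The treatment-scenario likelihood of the final step combines one multinomial term for the click level with two binomial terms for the conversion level. The sketch below illustrates this on the log scale for a single keyword-day observation; it is not the JAGS specification, and the dictionary keys and function names are hypothetical.

```python
from math import exp, lgamma, log


def log_multinomial(n, counts, probs):
    """Log pmf of the given counts plus an implicit remainder category.

    The remainder has count n - sum(counts) and probability 1 - sum(probs);
    with a single count this reduces to the binomial case.
    """
    rest_n = n - sum(counts)
    rest_p = 1.0 - sum(probs)
    out = lgamma(n + 1) - lgamma(rest_n + 1)
    if rest_n > 0:
        out += rest_n * log(rest_p)
    for c, p in zip(counts, probs):
        out -= lgamma(c + 1)
        if c > 0:
            out += c * log(p)
    return out


def treatment_loglik(o):
    """Click-level plus conversion-level log-likelihood for one observation.

    Hypothetical keys: N (searches), N_org / N_ad (clicks on organic result / ad),
    p_org / p_ad (click probabilities), C_org / C_ad (conversions),
    q_org / q_ad (conversion probabilities).
    """
    click = log_multinomial(o["N"], [o["N_org"], o["N_ad"]], [o["p_org"], o["p_ad"]])
    conv_org = log_multinomial(o["N_org"], [o["C_org"]], [o["q_org"]])
    conv_ad = log_multinomial(o["N_ad"], [o["C_ad"]], [o["q_ad"]])
    return click + conv_org + conv_ad


if __name__ == "__main__":
    obs = {"N": 5, "N_org": 2, "N_ad": 1, "p_org": 0.2, "p_ad": 0.1,
           "C_org": 1, "C_ad": 0, "q_org": 0.5, "q_ad": 0.3}
    print(treatment_loglik(obs))
```

Summing `treatment_loglik` over all keywords and points in time gives the logarithm of the product over k and t.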
To generate new random candidate values for the next iteration, we employed the method of Wichmann and Hill (1982). We used 100,000 iterations to estimate the control part of our model and 200,000 iterations to estimate its more complex treatment part. The same numbers of iterations were used for the adaptation and “burn-in” of the sampler.
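For orientation, the Wichmann and Hill (1982) generator combines three small linear congruential generators into one uniform draw on [0, 1). The sketch below shows the generator only; the class name and default seeds are illustrative assumptions, and how the uniform draws are turned into candidate values is not specified here.

```python
class WichmannHill:
    """Wichmann-Hill (1982) combined congruential generator.

    Seeds should be integers between 1 and 30000. The default seeds
    below are arbitrary illustrative values.
    """

    def __init__(self, s1=1, s2=2, s3=3):
        self.s1, self.s2, self.s3 = s1, s2, s3

    def random(self):
        """Return the next pseudo-random float in [0, 1)."""
        # Advance the three component generators.
        self.s1 = (171 * self.s1) % 30269
        self.s2 = (172 * self.s2) % 30307
        self.s3 = (170 * self.s3) % 30323
        # Combine them: fractional part of the sum of the scaled states.
        return (self.s1 / 30269.0
                + self.s2 / 30307.0
                + self.s3 / 30323.0) % 1.0


# Usage: the same seeds always reproduce the same sequence.
gen = WichmannHill(1, 2, 3)
draws = [gen.random() for _ in range(5)]
```

Fixing the seeds makes a sampler run reproducible, which is useful when comparing estimates across the control and treatment parts.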
Estimation result details
Tables 11 and 12 give the estimated covariance matrices for the control and the treatment scenario, respectively, which were omitted from the paper for brevity.
Balance sheet of SEA and derivation
To derive the balance sheet of SEA, we first express the factors in (15b) that depend on the effects of SEA (or their transformations) as described in the paper. E[ΔProfit_{k, t}] is then given by
Expanding (D.1), we get (terms that cancel each other out are shown but crossed out for easier understanding)
Each term in (20) can be interpreted as a profit component. It can be seen that two terms cancel out. The first, \( {p}_{k,t}^{\left\langle 1\right\rangle}\cdotp {p}_{k,t}^{1:\left\langle 1\right\rangle}\cdotp {\mathrm{RPCV}}_{k,t}^{1:1}\cdotp \pi \), describes the profit generated by hypothetical visitors whose behaviour (at any level) is not influenced by SEA and therefore equals E[Profit_{k, t}]. The other term that cancels out, \( {p}_{k,t}^{\left\langle 1\right\rangle}\cdotp {p}_{k,t}^{1\to \left\langle 2\right\rangle}\cdotp {p}_{k,t}^{1:\left\langle 1\right\rangle}\cdotp {\mathrm{RPCV}}_{k,t}^{1:1}\cdotp \pi \), describes the profit generated by cannibalized users whose conversion behaviour and basket choice are not influenced by SEA. This shows that the cannibalization effect, ceteris paribus, does not affect the firm’s revenues, but only its costs (as captured by another term).
The remaining terms can be attributed to the effect(s) to which they relate. Terms that relate to exactly one effect describe the “pure” impact of this effect on expected profits. Terms that relate to more than one effect describe the impact of the interaction of these effects. Such interaction can happen across the three levels of user behaviour investigated. For example, the terms \( \left(1-{p}_{k,t}^{\left\langle 1\right\rangle}\right)\cdotp {p}_{k,t}^{0\to \left\langle 1\right\rangle}\cdotp {p}_{k,t}^{1:\left\langle 1\right\rangle}\cdotp {\mathrm{RPCV}}_{k,t}^{1:1}\cdotp \pi \) and \( {p}_{k,t}^{\left\langle 1\right\rangle}\cdotp \Delta {p}_{k,t}^{1:\left\langle 1\right\rangle \to 1:\left\langle 1\right\rangle}\cdotp {\mathrm{RPCV}}_{k,t}^{1:1}\cdotp \pi \) describe the change in E[ΔProfit_{k, t}] if only the addition on organic effect or only the conversion effect were active, respectively. The impact of the interaction of these effects is captured by the term \( \left(1-{p}_{k,t}^{\left\langle 1\right\rangle}\right)\cdotp {p}_{k,t}^{0\to \left\langle 1\right\rangle}\cdotp \Delta {p}_{k,t}^{1:\left\langle 1\right\rangle \to 1:\left\langle 1\right\rangle}\cdotp {\mathrm{RPCV}}_{k,t}^{1:1}\cdotp \pi \). We distribute terms that relate to several effects equally among these effects, as there is no reason for a different attribution. For example, the aforementioned term is attributed half to the addition on organic effect and half to the conversion effect.
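The equal split of an interaction term can be illustrated with a small numerical sketch. It covers only the three example terms quoted above, not the full balance sheet; the function name, the argument names, and all input values are hypothetical and chosen purely for illustration.

```python
def attribute_example_terms(p_click, p_add, p_conv, d_conv, rpcv, pi):
    """Attribute the three example terms to two effects.

    p_click : p^<1>, baseline click probability
    p_add   : p^{0 -> <1>}, addition on organic effect
    p_conv  : p^{1:<1>}, baseline conversion probability
    d_conv  : Delta p, conversion effect
    rpcv    : RPCV^{1:1}
    pi      : the factor pi from the profit terms
    """
    # "Pure" impact of each effect if it were the only one active.
    pure_addition = (1.0 - p_click) * p_add * p_conv * rpcv * pi
    pure_conversion = p_click * d_conv * rpcv * pi
    # Interaction term: involves both effects at once.
    interaction = (1.0 - p_click) * p_add * d_conv * rpcv * pi
    # Equal attribution: half of the interaction goes to each effect.
    return {
        "addition_on_organic": pure_addition + interaction / 2.0,
        "conversion": pure_conversion + interaction / 2.0,
        "interaction": interaction,
    }


# Hypothetical inputs, for illustration only.
shares = attribute_example_terms(
    p_click=0.5, p_add=0.2, p_conv=0.1, d_conv=0.02, rpcv=100.0, pi=0.3
)
```

With these inputs, each pure term equals 0.3 and the interaction term equals 0.06, so each effect is credited with 0.33. The two attributed amounts always sum to the total of the three terms, so nothing is lost or double-counted by the split.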
Finally, we distinguish terms by whether they are positive (increasing E[ΔProfit_{k, t}]) or negative (decreasing E[ΔProfit_{k, t}]). While the click-level effects are always positive by construction, the conversion effect and the revenue effect can also be negative. Therefore, their sign determines the orientation of the terms that relate to them. Table 13 shows how the balance sheet of SEA appears for the case of non-negative conversion and revenue effects. The extension to the other cases is straightforward. E[ΔProfit_{k, t}] is then the residual item that balances the effects of SEA. Therefore, for the case of E[ΔProfit_{k, t}] ≥ 0, it has to be written on the “passive” side of the balance sheet, as is done in Table 13.
Cite this article
Winter, P., Alpar, P. Effects of search engine advertising on user clicks, conversions, and basket choice. Electron Markets 30, 837–862 (2020). https://doi.org/10.1007/s12525-019-00376-5