Abstract
Marketing researchers have become increasingly interested in spatial datasets. A main challenge of analyzing spatial data is that researchers must a priori choose the size and make-up of the areal units, hence the resolution of the analysis. Analyzing the data at a resolution that is too high may mask “macro” patterns, while analyzing the data at a resolution that is too low may result in aggregation bias. Thus, ideally marketing researchers would want a “data-driven” method to determine the “optimal” resolution of analysis, and at the same time automatically explore the same dataset under different resolutions, to obtain a full set of empirical insights to help with managerial decision making. In this paper, we propose a new approach for multi-resolution spatial analysis that is based on Bayesian model selection. We demonstrate our method using two recent marketing datasets from published studies: (i) the Netgrocer spatial sales data in Bell and Song (Quantitative Marketing and Economics 5:361–400, 2007), and (ii) the Pathtracker® data in Hui et al. (Marketing Science 28:566–572, 2009b; Journal of Consumer Research 36:478–493, 2009c) that track shoppers’ in-store movements. In both cases, our method allows researchers to not only automatically select the resolution of the analysis, but also analyze the data under different resolutions to understand the variation in insights and robustness to the level of aggregation.
Notes
As one example of incorporating prior information, if the data are assumed to be spatially correlated, a hierarchical prior specification (e.g., using a Markov random field, see Banerjee et al. 2003) in which adjacent regions have correlated parameters may be more appropriate. We leave this for future research.
Note that when \( c_1 = 0 \), our methodology is equivalent to model selection using Bayes factors. Thus, calculating the Bayes factor using the Savage-Dickey ratio (Rossi et al. 2005) for every configuration can also be used to determine the most likely configuration. However, neither the prior nor the posterior density in the Savage-Dickey ratio is easy to compute analytically; one would have to rely on high-dimensional numerical integration or MCMC computations, both of which are computationally intensive and involve simulation error.
If the researcher would like to introduce observed covariates to explain the variations in sales rate, the covariates can be introduced through a Poisson regression framework; our methodology can then be used to model the potential spatial clustering of the error terms. We leave this for further research.
As with any prior specification, robustness to the choice should be assessed. We examined various values of α and β; because of the large amount of data, the results were not materially affected by the exact values chosen, provided the prior remained weakly informative.
R code for the implementation of the Ferligoj and Batagelj (1982) algorithm is available from the authors upon request.
We can formally assess the degree of spatial autocorrelation across regions by computing Moran’s I statistic (Moran 1950). In our setting, however, that is not strictly valid because our regions (defined through configuration G) are themselves being estimated.
R code for the implementation of the “functional distance” approach is available from the authors upon request.
References
Banerjee, S., Gelfand, A. E., & Carlin, B. P. (2003). Hierarchical modeling and analysis of spatial data. Chapman and Hall.
Barbujani, G., Jacquez, G. M., & Ligi, L. (1990). Diversity of some gene frequencies in European and Asian populations V. Steep multilocus clines. American Journal of Human Genetics, 47, 867–875.
Bell, D., & Song, S. (2007). Neighborhood effects and trial on the internet: evidence from online grocery retailing. Quantitative Marketing and Economics, 5(4), 361–400.
Bertsimas, D., & Tsitsiklis, J. (1993). Simulated annealing. Statistical Science, 8(1), 10–15.
Bithell, J. F. (2000). A classification of disease mapping methods. Statistics in Medicine, 19, 2203–2215.
Bloom, P. N., Gundlach, G. T., & Cannon, J. P. (2000). Slotting allowances and fees: schools of thought and the views of practicing managers. Journal of Marketing, 64, 92–108.
Bocquet-Appel, J. P., & Bacro, J. N. (1994). Generalized wombling. Systematic Biology, 43(3), 442–448.
Booth, J. G., Casella, G., & Hobert, J. P. (2008). Clustering using objective functions and stochastic search. Journal of the Royal Statistical Society (Series B), 70, 119–139.
Bradlow, E. T., Bronnenberg, B., Russell, G. J., Arora, N., Bell, D. R., Duvvuri, S. D., et al. (2005). Spatial models in marketing. Marketing Letters, 16, 267–278.
Bronnenberg, B. J., Dhar, S. K., & Dube, J.-P. (2007). Consumer packaged goods in the United States: national brands, local branding. Journal of Marketing Research, 44, 4–13.
Choi, J., Hui, S. K., & Bell, D. (2010). Spatio-temporal analysis of imitation behavior across new buyers at an online grocery retailer. Journal of Marketing Research, 47(1), 65–79.
Farley, J. U., & Winston Ring, L. (1966). A stochastic model of supermarket traffic flow. Operations Research, 14(4), 555–567.
Ferligoj, A., & Batagelj, V. (1982). Clustering with relational constraints. Psychometrika, 47, 413–426.
Fong, D. K. H., & DeSarbo, W. S. (2007). A Bayesian methodology for simultaneously detecting and estimating regime change points and variable selection in multiple regression models for marketing research. Quantitative Marketing and Economics, 5(4), 427–453.
Fortin, M.-J., & Drapeau, P. (1995). Delineation of ecological boundaries: comparison of approaches and significance test. OIKOS, 72, 323–332.
Francois, O., Ancelet, S., & Guillot, G. (2006). Bayesian clustering using hidden Markov random fields in spatial population genetics. Genetics, 174, 805–816.
Gangnon, R. E., & Clayton, M. K. (2000). Bayesian detection and modeling of spatial disease clustering. Biometrics, 56, 922–935.
Garber, T., Goldenberg, J., Libai, B., & Muller, E. (2004). From density to destiny: using spatial dimension of sales data for early prediction of new product success. Marketing Science, 23(3), 419–428.
Goffe, W. L., Ferrier, G. D., & Rogers, J. (1994). Global optimization of statistical functions with simulated annealing. Journal of Econometrics, 60, 65–99.
Hajek, B. (1988). Cooling schedules for optimal annealing. Mathematics of Operations Research, 13(2), 311–329.
Hoeting, J., Madigan, D., Raftery, A. E., & Volinsky, C. T. (1999). Bayesian model averaging: a tutorial. Statistical Science, 14(4), 382–417.
Huang, Y., Hui, S., Inman, J., & Suher, J. (2012). Capturing the ‘First Moment of Truth’: Understanding point-of-purchase drivers of unplanned consideration and purchase. Working Paper.
Hui, S. K., Fader, P. S., & Bradlow, E. T. (2009a). Path data in marketing: an integrated framework and prospectus for model building. Marketing Science, 28(2), 320–335.
Hui, S. K., Fader, P. S., & Bradlow, E. T. (2009b). The traveling salesman goes shopping: the systematic deviations of grocery paths from TSP optimality. Marketing Science, 28(3), 566–572.
Hui, S. K., Bradlow, E. T., & Fader, P. S. (2009c). Testing behavioral hypotheses using an integrated model of grocery store shopping path and purchase behavior. Journal of Consumer Research, 36, 478–493.
Jacquez, G. M., Maruca, S., & Fortin, M. J. (2000). From fields to objects: a review of geographic boundary analysis. Journal of Geographical Systems, 2(3), 221–241.
Jacquez, G. M., & Greiling, D. A. (2003). Geographic boundaries in breast, lung, and colorectal cancers in relation to exposure to air toxics in Long Island, New York. International Journal of Health Geographics, 2(4), available at http://www.ij-healthgeographics.com/content/2/1/4.
Ju, J., Gopal, S., & Kolaczyk, E. D. (2005). On the choice of spatial and categorical scale in remote sensing land cover characterization. Remote Sensing of Environment, 96(1), 62–77.
Kaufman, L., & Rousseeuw, P. J. (1990). Finding groups in data: An introduction to cluster analysis. New York: Wiley.
Keane, M. J. (1978). A functional distance approach to regionalisation. Regional Studies, 12, 379–386.
Kolaczyk, E. D., & Huang, H. (2001). Multiscale statistical models for hierarchical spatial aggregation. Geographical Analysis, 33(2), 95–118.
Larson, J. S., Bradlow, E. T., & Fader, P. S. (2007). An exploratory look at supermarket shopping paths. International Journal of Research in Marketing, 22, 395–414.
Lawson, A. B. (2006). Disease cluster detection: a critique and a Bayesian proposal. Statistics in Medicine, 25, 897–916.
Lawson, A. B., Biggeri, A., Bohning, D., Lesaffre, E., Viel, J.-F., & Bertollini, R. (1999). Disease mapping and risk assessment for public health decision making. Chichester: Wiley.
Liechty, J., Pieters, R., & Wedel, M. (2003). Global and local covert visual attention: evidence from a Bayesian hidden Markov model. Psychometrika, 68(4), 519–541.
Louie, M. M., & Kolaczyk, E. D. (2006a). Multiscale detection of localized anomalous structure in aggregate disease incidence data. Statistics in Medicine, 25(5), 787–810.
Louie, M. M., & Kolaczyk, E. D. (2006b). A multiscale method for disease mapping in spatial epidemiology. Statistics in Medicine, 25(8), 1287–1308.
Lu, H., & Carlin, B. P. (2005). Bayesian areal wombling for geographical boundary analysis. Geographical Analysis, 37, 265–285.
Ma, H., & Carlin, B. P. (2007). Bayesian multivariate areal wombling for multiple disease boundary analysis. Bayesian Analysis, 2(2), 281–302.
Manning, C. D., Raghavan, P., & Schutze, H. (2008). Introduction to information retrieval. New York: Cambridge University Press.
Mollié, A. (1996). Bayesian mapping of disease. In W. Gilks, S. Richardson, & D. J. Spiegelhalter (Eds.), Markov chain Monte Carlo in practice. London: Chapman and Hall.
Montgomery, A. L., Li, S., Srinivasan, K., & Liechty, J. C. (2004). Modeling online browsing and path analysis using clickstream data. Marketing Science, 23(4), 579–595.
Moran, P. A. P. (1950). Notes on continuous stochastic phenomena. Biometrika, 37, 17–33.
Murtagh, F. (1985). A survey of algorithms for contiguity-constrained clustering and related problems. The Computer Journal, 28(1), 82–88.
Pieters, R., Rosbergen, E., & Wedel, M. (1999). Visual attention to repeated print advertising: a test for scanpath theory. Journal of Marketing Research, 36(4), 424–438.
Pukkala, E. (1989). Cancer maps of Finland: An example of small-area based mapping. In P. Boyle, C. S. Muir, & E. Grundmann (Eds.), Cancer mapping (pp. 208–215). Berlin: Springer.
Raftery, A. E. (1995). Bayesian model selection in social research. Sociological Methodology, 25, 111–163.
Richardson, S., Montfort, C., Green, M., Draper, G., & Muirhead, C. (1995). Spatial variation of natural radiation and childhood leukaemia incidence in Great Britain. Statistics in Medicine, 14(21/22), 2487–2501.
Robert, C., & Casella, G. (2004). Monte Carlo statistical methods (2nd ed.). Springer.
Rossi, P. E., Allenby, G. M., & McCulloch, R. (2005). Bayesian statistics and marketing. Wiley.
Rousseeuw, P. (1987). Silhouettes: a graphical aid to the interpretation and validation of cluster analysis. Journal of Computational and Applied Mathematics, 20, 53–65.
Sorensen, H. (2003). The science of shopping. Marketing Research, 15(3), 30–35.
Ter Hofstede, F., Wedel, M., & Steenkamp, J.-B. E. M. (2002). Identifying spatial segments in international markets. Marketing Science, 21, 160–177.
Theil, H. (1954). Linear aggregation of economic relations. Amsterdam: North-Holland.
Van der Lans, R., Pieters, R., & Wedel, M. (2008). Eye-movement analysis of search effectiveness. Journal of the American Statistical Association, 103(482), 452–461.
Womble, W. H. (1951). Differential systematics. Science, 114(2961), 315–322.
Appendix
1.1 Finding the configuration G with the highest posterior model probability
We propose a simulated annealing (SA) algorithm (Goffe et al. 1994) to stochastically search for the configuration with the highest posterior model probability, \( {G^*} = \mathop {{\arg \,\max }}\limits_G \Pr \left( {G|y} \right) \). The SA algorithm is a Monte Carlo optimization technique (Robert and Casella 2004) that is well suited to finding the global optimum over a finite, discrete search domain, as is the case here because the number of potential configurations G is finite. Specifically, it has been shown that in a finite search space, the SA algorithm converges to the global optimum with probability 1, provided that the “cooling schedule” (discussed later) is slow enough (Bertsimas and Tsitsiklis 1993; Hajek 1988).
The SA algorithm can be described by the following steps:
1. Start with an initial configuration \( G_0 \) and an initial “temperature” \( T_0 \).
2. For each iteration n (n = 1, …, N):
   a. Propose a candidate move G’ (discussed below).
   b. Move to G’ (i.e., \( {G_{n + 1}} = G\prime \)) with probability min(1, r), where \( r = \exp \left( {\frac{{\ln \Pr (G\prime |y) - \ln \Pr ({G_n}|y)}}{{{T_n}}}} \right) \). Otherwise, \( {G_{n + 1}} = {G_n} \).
   c. Update the temperature using a “cooling schedule”, \( {T_{n + 1}} = k{T_n} \) with \( 0 < k < 1 \).
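The steps above can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' C++ implementation: `log_post` and `propose` are hypothetical caller-supplied functions standing in for ln Pr(G|y) and the candidate move, the toy objective at the end is invented for demonstration, and keeping track of the best state visited is a convenience added here rather than something the text specifies.

```python
import math
import random


def simulated_annealing(log_post, propose, g0, t0, k, n_iter, seed=0):
    """Geometric-cooling simulated annealing over a discrete state space.

    log_post(g) returns ln Pr(g | y) up to an additive constant;
    propose(g, rng) returns a candidate state. Both are hypothetical
    placeholders supplied by the caller.
    """
    rng = random.Random(seed)
    g, temp = g0, t0
    best, best_lp = g0, log_post(g0)
    for _ in range(n_iter):
        cand = propose(g, rng)                      # step (a)
        delta = log_post(cand) - log_post(g)
        # Step (b): accept with probability min(1, exp(delta / temp)).
        # Uphill moves are always accepted, which also avoids overflow.
        if delta >= 0 or rng.random() < math.exp(delta / temp):
            g = cand
            lp = log_post(g)
            if lp > best_lp:
                best, best_lp = g, lp
        temp *= k                                   # step (c)
    return best, best_lp


# Toy usage (not the paper's model): integer states with a single peak at 7.
toy_log_post = lambda x: -float((x - 7) ** 2)
toy_propose = lambda x, rng: x + rng.choice((-1, 1))
best_state, best_value = simulated_annealing(
    toy_log_post, toy_propose, g0=0, t0=1.0, k=0.99, n_iter=2000
)
```

Working with differences of log probabilities means the normalizing constant of Pr(G|y) never has to be computed, which is why the acceptance ratio is written in terms of logarithms.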
We now discuss how a candidate move G’ in step (2a) is defined. First, we define a “boundary representation” of G that automatically incorporates the contiguity structure in the data. The leftmost panel of Fig. 14 shows the areal units of the raw data for a hypothetical example with 4 areal units in total. The upper middle and upper right panels show two potential configurations for G, denoted \( G_1 \) and \( G_2 \), respectively. We write \( G_1 \) as {(1), (3), (2,4)} and \( G_2 \) as {(1,3), (2,4)}.
The lower middle and lower right panels of Fig. 14 show the boundary representations of these configurations. The boundary representation of \( G_1 \), denoted \( B_1 \), appears in the lower middle panel. A boundary representation is an alternative way to describe a configuration and is interpreted as follows: the absence of a boundary between two areal units indicates that they are grouped into the same region. As can be seen in the figure, boundaries 1–2, 3–4, and 1–3 are present in \( B_1 \), but boundary 2–4 is absent; thus areal units 2 and 4 are grouped into the same region, resulting in the configuration \( G_1 \) = {(1), (3), (2,4)}. Likewise, \( B_2 \) is the boundary representation of \( G_2 \). Figure 15 shows all 12 possible configurations for this example, along with their corresponding boundary representations.
From the current configuration \( G_n \) (and its corresponding boundary representation \( B_n \)), the proposed candidate move flips one randomly selected boundary in \( B_n \) from “on” to “off,” or from “off” to “on.” This gives a candidate boundary representation B’, and hence a candidate configuration G’ that, by the definition of boundary representations, already satisfies the contiguity constraint.
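To make the boundary representation concrete, the following Python sketch maps a set of "on" boundaries to a configuration for the 4-unit example and enumerates every on/off combination. The 2×2 layout (units 1 and 2 on top, 3 and 4 below) is inferred from the four boundaries named in the text, and the names `BOUNDARIES`, `UNITS`, and `configuration` are illustrative. One assumption of this sketch is that an "on" boundary which does not actually separate two regions (because the units are still linked through other absent boundaries) is simply ignored.

```python
from itertools import product

# Assumed 2x2 layout: units 1, 2 on top and 3, 4 below.
BOUNDARIES = [(1, 2), (3, 4), (1, 3), (2, 4)]
UNITS = [1, 2, 3, 4]


def configuration(on):
    """Map a set of 'on' boundaries to a configuration (frozenset of regions)."""
    parent = {u: u for u in UNITS}

    def find(u):
        while parent[u] != u:
            u = parent[u]
        return u

    for a, b in BOUNDARIES:
        if (a, b) not in on:          # absent boundary: merge the two units
            parent[find(a)] = find(b)
    regions = {}
    for u in UNITS:
        regions.setdefault(find(u), set()).add(u)
    return frozenset(frozenset(r) for r in regions.values())


# B_1 from the text: boundaries 1-2, 3-4, 1-3 present, 2-4 absent.
g1 = configuration({(1, 2), (3, 4), (1, 3)})

# Enumerate all 2^4 boundary patterns and collect the distinct configurations.
all_configs = {
    configuration({b for b, flag in zip(BOUNDARIES, flags) if flag})
    for flags in product([False, True], repeat=len(BOUNDARIES))
}
print(len(all_configs))
```

Under these assumptions, `g1` is {(1), (3), (2,4)} and the enumeration yields 12 distinct configurations, in agreement with the count given for this example.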
Figure 16 summarizes the “neighborhood structure” of our proposed procedure of generating candidate moves. Configurations that are connected can be reached in one move through their boundary representations. As can be seen, our proposed procedure allows us to search through the domain of configurations efficiently through a sequence of local moves, the key idea behind the SA algorithm.
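As a rough illustration of this neighborhood structure, the Python sketch below lists the configurations reachable in one move from the boundary representation of {(1), (3), (2,4)} in the 4-unit example. The 2×2 layout and the helper names `regions`, `flip`, and `propose` are assumptions made for this sketch and are not taken from the authors' code.

```python
import random

BOUNDARIES = [(1, 2), (3, 4), (1, 3), (2, 4)]   # assumed 2x2 layout of 4 units


def regions(on):
    """Group units that are linked by at least one absent ('off') boundary."""
    label = {u: u for u in (1, 2, 3, 4)}
    for a, b in BOUNDARIES:
        if (a, b) not in on:
            old, new = label[a], label[b]
            label = {u: (new if l == old else l) for u, l in label.items()}
    groups = {}
    for u, l in label.items():
        groups.setdefault(l, []).append(u)
    return sorted(tuple(sorted(g)) for g in groups.values())


def flip(on, boundary):
    """Candidate move: toggle one boundary between 'on' and 'off'."""
    return on ^ {boundary}            # symmetric difference toggles membership


def propose(on, rng):
    """Flip one boundary chosen uniformly at random."""
    return flip(on, rng.choice(BOUNDARIES))


b1 = frozenset({(1, 2), (3, 4), (1, 3)})        # boundaries 1-2, 3-4, 1-3 present
neighbors = [regions(flip(b1, b)) for b in BOUNDARIES]
for b, g in zip(BOUNDARIES, neighbors):
    print(b, g)

candidate = propose(b1, random.Random(0))       # one random candidate move
```

In this sketch, flipping boundary 1–3 "off" merges units 1 and 3 and gives {(1,3), (2,4)}, while flipping 2–4 "on" splits every unit into its own region, so each move changes the configuration only locally.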
We follow standard guidelines in setting the other parameter values for the SA algorithm (Robert and Casella 2004). Specifically, the initial configuration \( G_0 \) is randomly generated. The initial temperature is set to a high value, so that moves from the initial configuration have around an 80% probability of being accepted (Robert and Casella 2004). The number of iterations N is set to one million, and the cooling rate k is set so that the temperature drops to 0.001 at the last iteration. To help ensure that the SA algorithm reaches the globally optimal configuration, we repeat the SA algorithm 100 times, each time from a different random starting point. The algorithm is implemented in C++ using the GNU Scientific Library. The source code is available upon request.
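For a geometric cooling schedule, the cooling rate follows directly from the initial temperature, the final temperature, and the number of iterations. The short Python sketch below works through that arithmetic; the initial temperature `T0` and the "typical" log-posterior change `typical_delta` are hypothetical values chosen for illustration, and the rule used to back out an initial temperature with roughly 80% acceptance is one possible heuristic rather than the procedure used in the paper.

```python
import math

N_ITER = 1_000_000      # number of iterations, as stated in the text
T_FINAL = 0.001         # temperature at the last iteration, as stated
T0 = 10.0               # hypothetical initial temperature (not given in the text)

# Solve T0 * k**N_ITER = T_FINAL for the cooling rate k.
k = math.exp(math.log(T_FINAL / T0) / N_ITER)

# One possible way to pick T0 (an assumption): for a typical downhill change
# `typical_delta` in log posterior, require exp(typical_delta / T0) = 0.8.
typical_delta = -2.0    # hypothetical value
t0_for_80pct = typical_delta / math.log(0.8)
```

With one million iterations, k is extremely close to 1, so the temperature decreases very slowly from one iteration to the next.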
1.2 Derivation of Eq. 9 (Sales Mapping)
Dropping the constant \( \left( {\prod\limits_j {\frac{{{n_j}^{{y_j}}}}{{{y_j}!}}} } \right) \) from consideration because it is the same for all G, we have:
1.3 Derivation of Eq. 15 (Flow Mapping)
Thus,
Technical appendix (Online Supplement)
Here we propose a supplemental algorithm to stochastically search for configurations that are “near” the configuration with the highest posterior model probability (denoted G*). This involves a slight modification to the simulated annealing algorithm proposed in Appendix A. Specifically, before the end of each iteration, if the current configuration \( G_{n+1} \) has a posterior model probability of at least rP(G*|y), where r is a pre-defined constant that defines what is considered “near” (e.g., r = exp(-2), which corresponds to a difference of 2 units on the log-probability scale), the algorithm terminates and returns \( G_{n+1} \) as a “near” configuration. Thus, the supplemental algorithm can be described by the following steps:
1. Start with an initial configuration \( G_0 \) and an initial “temperature” \( T_0 \).
2. For each iteration n (n = 1, …, N):
   a. Propose a candidate move G’ (defined as in the main SA algorithm).
   b. Move to G’ (i.e., \( {G_{n + 1}} = G\prime \)) with probability min(1, r), where \( r = \exp \left( {\frac{{\ln \Pr (G'|y) - \ln \Pr ({G_n}|y)}}{{{T_n}}}} \right) \). Otherwise, \( {G_{n + 1}} = {G_n} \).
   c. Update the temperature using a “cooling schedule”, \( {T_{n + 1}} = k{T_n}\quad (0 < k < 1) \).
   d. If \( P({G_{n + 1}}|y) \geqslant rP(G*|y) \), where r here denotes the pre-defined “nearness” constant rather than the acceptance ratio in step (b), terminate the algorithm and return \( G_{n+1} \); otherwise, continue the algorithm.
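The stopping rule in step (d) is easiest to apply on the log scale, where it becomes ln P(G_{n+1}|y) ≥ ln P(G*|y) + ln r. The Python sketch below illustrates this under stated assumptions: `log_post` and `propose` are hypothetical caller-supplied functions, `log_post_star` is the log posterior of G* found in an earlier run, and the toy objective is invented purely for demonstration.

```python
import math
import random


def find_near_configuration(log_post, propose, g0, t0, k, n_iter,
                            log_post_star, log_r=-2.0, seed=0):
    """Annealing run that stops at the first state within log_r of the optimum.

    Returns the "near" state, or None if no such state is reached.
    log_post and propose are hypothetical caller-supplied functions.
    """
    rng = random.Random(seed)
    g, temp = g0, t0
    threshold = log_post_star + log_r       # ln(r * P(G*|y))
    for _ in range(n_iter):
        cand = propose(g, rng)                              # step (a)
        delta = log_post(cand) - log_post(g)
        if delta >= 0 or rng.random() < math.exp(delta / temp):
            g = cand                                        # step (b)
        temp *= k                                           # step (c)
        if log_post(g) >= threshold:                        # step (d), log scale
            return g
    return None


# Toy usage (not the paper's model): the optimum is 7 with log posterior 0.
toy_log_post = lambda x: -float((x - 7) ** 2)
toy_propose = lambda x, rng: x + rng.choice((-1, 1))
near = find_near_configuration(
    toy_log_post, toy_propose, g0=0, t0=1.0, k=0.99, n_iter=2000,
    log_post_star=0.0,
)
```

Repeating the call with different values of `seed` (and, in a real application, different random starting configurations) is one way to collect several distinct "near" configurations.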
We apply the above supplemental algorithm to the sales mapping problem described in Section 3.4 to identify 100 configurations “near” G*. As an illustration, two examples of “near” configurations are shown in Figs. 17 and 18. Other “near” configurations are available from the authors upon request.
Looking across the 100 nearby solutions, we find that, similar to Figs. 17 and 18, all of them appear very similar to G*. This suggests that the posterior model probability has a fairly “sharp” peak at G*, and the likelihood surface is likely to be unimodal.
Hui, S.K., Bradlow, E.T. Bayesian multi-resolution spatial analysis with applications to marketing. Quant Mark Econ 10, 419–452 (2012). https://doi.org/10.1007/s11129-012-9122-y
DOI: https://doi.org/10.1007/s11129-012-9122-y