This chapter introduces applications of data fusion in marketing from a Bayesian perspective. We discuss several applications, including the classic example of combining data on media viewership for one group of customers with data on category purchases for a different group, a very common problem in marketing. While many missing data approaches focus on creating “fused” data sets that can be analyzed by others, we focus on the overall inferential goal, which, for this classic data fusion problem, is to determine which media outlets attract consumers who purchase in a particular category and are therefore good targets for advertising. The method we describe builds on a common Bayesian approach to missing data: data augmentation within Markov-chain Monte Carlo (MCMC) estimation routines. As we will discuss, it can also be extended to a variety of other data structures, including mismatched groups of customers, data at different levels of aggregation, and more general missing data problems that commonly arise in marketing. The chapter provides a step-by-step guide to developing Bayesian data fusion applications, including an example fully worked out in the Stan modeling language. Readers who are unfamiliar with Bayesian analysis and MCMC estimation may benefit from reading the chapter on Bayesian Models in this handbook first.
Keywords: Data fusion; Data augmentation; Missing data; Bayesian; Markov-chain Monte Carlo
We would like to thank the many co-authors with whom we have had discussions while developing and troubleshooting fusion models and other Bayesian missing data methods, especially Andres Musalem, Fred Feinberg, Pengyuan Wang, and Julie Novak.