VANISH regularization for generalized linear models

  • Oliver J. Rutz
  • Garrett P. Sonnier

Abstract

Marketers increasingly face modeling situations in which the number of independent variables is large, possibly approaching or exceeding the number of observations. In this setting, covariate selection and model estimation pose significant challenges to the usual methods of inference. These challenges are exacerbated when covariate interactions are of interest. Most extant regularization methods make no distinction between main and interaction terms in estimation. The linear VANISH model is an exception: it is a regularization method for models with interaction terms that ensures proper model hierarchy by enforcing the heredity principle. We derive the generalized VANISH model for nonlinear responses, including the duration, discrete choice, and count models widely used in marketing applications. In addition, we propose a VANISH model that accounts for unobserved consumer heterogeneity via a mixture approach. In three empirical applications, we demonstrate that our proposed model outperforms main-effects models as well as other methods that include interaction terms.
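
As a small illustration of the heredity principle mentioned above, the following sketch checks whether a set of selected terms respects model hierarchy. It is not the authors' estimator. The abstract does not say whether strong or weak heredity is enforced, so both variants are shown; the function name `violates_heredity` and the covariate names are hypothetical.

```python
def violates_heredity(main_effects, interactions, strong=True):
    """Return the interaction terms whose parent main effects are missing.

    main_effects: set of covariate names with non-zero main effects.
    interactions: iterable of (a, b) pairs with non-zero interaction effects.
    strong=True  requires both parents to be in the model (strong heredity).
    strong=False requires at least one parent (weak heredity).
    """
    required = 2 if strong else 1
    violations = []
    for a, b in interactions:
        # Count how many of the two parent main effects are present.
        present = (a in main_effects) + (b in main_effects)
        if present < required:
            violations.append((a, b))
    return violations


# Hypothetical selection: "feature" has no main effect in the model.
selected_main = {"price", "display"}
selected_inter = [("price", "display"), ("price", "feature")]

print(violates_heredity(selected_main, selected_inter))                # [('price', 'feature')]
print(violates_heredity(selected_main, selected_inter, strong=False))  # []
```

A regularization method that enforces heredity would, under these assumptions, never return a fitted model for which this check reports a violation.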

Keywords

Non-linear marketing models, High-dimensional data, Interactions, Regularization methods, Bayesian methods

Supplementary material

ESM 1: 11129_2019_9216_MOESM1_ESM.pdf (PDF, 403 kb)

Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2019

Authors and Affiliations

  1. University of Washington, Seattle, USA
  2. The University of Texas at Austin, Austin, USA
