Abstract
We compare several unsupervised probabilistic machine learning methods for market basket analysis, namely binary factor analysis, two topic models (latent Dirichlet allocation and the correlated topic model), the restricted Boltzmann machine and the deep belief net. After an overview of previous applications of unsupervised probabilistic machine learning methods to market basket analysis, we briefly present the methods we investigate and outline their estimation. Performance is measured by tenfold cross-validated log likelihood values. Binary factor analysis vastly outperforms topic models. The restricted Boltzmann machine attains a similar performance advantage over binary factor analysis. Overall, a deep belief net with 45 variables in the first hidden layer and 15 variables in the second hidden layer turns out to be the best model. We also compare the investigated machine learning methods with respect to ease of interpretation and runtimes. In addition, we show how to interpret the relationships between hidden variables and observed category purchases. To demonstrate managerial implications, we estimate the effect of promoting each category both on the increase in purchase probabilities of other product categories and on the relative increase of basket size. Finally, we indicate several possibilities to extend restricted Boltzmann machines and deep belief nets for market basket analysis.
Appendices
Appendix 1: Topic models
Table 10 shows the topic proportions of the investigated LDA and CTM, each with two topics.
Appendix 2: Binary factor analysis
Table 11 contains the ten highest information values for the one-factor BFA model.
Appendix 3: Restricted Boltzmann machine
Tables 12, 13 and 14 list the hidden variables of the selected RBM (DBN), sorted by the sum of absolute marginal effects in descending order. For each hidden variable, we show the five categories with the highest information values, separately for positive and negative weights \(W_{jk}\).
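To make the role of the weights \(W_{jk}\) concrete, the following is a minimal sketch, not the estimation or ranking procedure used for the tables. It assumes a standard binary RBM in which the probability that hidden variable \(k\) is active given a basket \(v\) is a logistic function of \(b_k + \sum_j W_{jk} v_j\). As a stand-in for the information values reported in the tables, it ranks categories by raw weight, separately by sign. The function names, the hidden bias `b_hidden`, the toy weights and the category names are all hypothetical.

```python
import numpy as np

def sigmoid(x):
    """Logistic function."""
    return 1.0 / (1.0 + np.exp(-x))

def hidden_probabilities(v, W, b_hidden):
    """P(h_k = 1 | v) for a binary RBM.

    v: binary basket vector of shape (J,)
    W: weights of shape (J, K), W[j, k] links category j to hidden variable k
    b_hidden: hidden biases of shape (K,)  (assumed parameterization)
    """
    return sigmoid(b_hidden + v @ W)

def top_categories_by_sign(W, k, names, n=5):
    """List up to n categories with the largest positive and the most
    negative weights for hidden variable k.

    Assumption: raw weights are used here as a simple proxy; the tables
    rank by information values instead.
    """
    w = W[:, k]
    order = np.argsort(w)  # ascending by weight
    positive = [names[j] for j in order[::-1] if w[j] > 0][:n]
    negative = [names[j] for j in order if w[j] < 0][:n]
    return positive, negative

# Toy example with three hypothetical categories and two hidden variables.
names = ["milk", "beer", "bread"]
W = np.array([[2.0, -1.0],
              [-0.5, 0.0],
              [1.0, 3.0]])
b_hidden = np.zeros(2)

basket = np.array([1, 0, 1])  # milk and bread purchased
probs = hidden_probabilities(basket, W, b_hidden)
pos, neg = top_categories_by_sign(W, 0, names)
```

Under this parameterization, a positive weight raises the activation probability of the hidden variable when the category is in the basket and a negative weight lowers it, which is why the two signs are listed separately.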
Appendix 4: Deep belief net
Tables 15, 16 and 17 list the hidden variables of the selected DBN, sorted by the sum of absolute marginal effects in descending order. For each hidden variable, we show the five categories with the highest information values, separately for positive and negative weights \(W_{3lj}\).
Cite this article
Hruschka, H. Comparing unsupervised probabilistic machine learning methods for market basket analysis. Rev Manag Sci 15, 497–527 (2021). https://doi.org/10.1007/s11846-019-00349-0
DOI: https://doi.org/10.1007/s11846-019-00349-0
Keywords
- Machine learning
- Market basket analysis
- Factor analysis
- Topic models
- Restricted Boltzmann machine
- Deep learning