Topic modeling in marketing: recent advances and research opportunities
Using a probabilistic approach for exploring latent patterns in high-dimensional co-occurrence data, topic models offer researchers a flexible and open framework for soft-clustering large data sets. In recent years, there has been growing interest among marketing scholars and practitioners in adopting topic models in various marketing application domains. However, to date, there is no comprehensive overview of this rapidly evolving field. By analyzing a set of 61 published papers along with conceptual contributions, we systematically review this highly heterogeneous area of research. In doing so, we characterize extant contributions employing topic models in marketing along three dimensions: data structures and retrieval of input data, implementation and extensions of basic topic models, and model performance evaluation. Our findings confirm that considerable progress has been made in various marketing sub-areas. However, there is still scope for promising future research, in particular with respect to integrating multiple, dynamic data sources, including time-varying covariates, and combining exploratory topic models with powerful predictive marketing models.
Keywords: LDA, Machine learning, Marketing research, Topic modeling
JEL Classification: M30, C00
There is an ongoing trend among marketing scholars (e.g., Flach 2001, pp. 205; Shaw et al. 2001, pp. 127) and practitioners (e.g., Nimeroff 2017) to adopt machine learning techniques in a diverse field of application domains. This trend is intertwined with the digitalization of our economy and the increasing availability of “big” and unstructured data, such as large amounts of texts or other inherently sparse, high-dimensional data (Kahn et al. 2010, p. 4). The focus in this paper is on so-called topic models, a specific model class which recently emerged as a versatile tool to analyze such marketing data.
Stemming from the early ideas of traditional cluster analysis, which are particularly relevant for marketing research (Punj and Stewart 1983, p. 135; Reutterer 2003, pp. 52), and enriched by the idea of probabilistic modeling and mixed membership (Blei 2012, pp. 78; Galyardt 2015, pp. 40), popular topic models like LDA (Latent Dirichlet Allocation) are a flexible, unsupervised machine learning approach to soft-cluster big data (e.g., Blei et al. 2003, pp. 993; Blei and Lafferty 2009, p. 77). Applications in marketing research include, but are not limited to, consumer profiling (e.g., Blanchard et al. 2017, pp. 408; Trusov et al. 2016, pp. 413), the assessment of buying patterns and purchase predictions (e.g., Hruschka 2016, p. 7; Jacobs et al. 2016, pp. 389), and the discovery of online communities and topics (e.g., Ngyen et al. 2015, pp. 9603). In business settings, the insights derived by applying topic models can, for instance, help increase the effectiveness and efficiency of online ads by fitting them thematically to web pages (e.g., Le et al. 2008, pp. 889), or assist in building recommender systems for online market platforms (e.g., Christidis and Mentzas 2013, pp. 4373). Despite the method's rising popularity, to the best of our knowledge, there is currently no systematic overview of the state of research on topic modeling in marketing, which is exactly what this article aims to contribute.
After a search in Google, Google Scholar, and various library databases using keywords like “topic model”, “topic modeling marketing”, and “LDA”, and after scanning the initially found publications for further relevant literature, we identified a total of 61 papers, published between 2008 and the end of 2017. Ten of these were published in core marketing journals (Amado et al. 2017; Blanchard et al. 2017; Büschken and Allenby 2016; Calheiros et al. 2017; Hruschka 2014; Jacobs et al. 2016; Schröder 2017; Song et al. 2017; Tirullinai and Tellis 2014; Trusov et al. 2016). Since the trend of using the method currently spans numerous disciplines (e.g., Schmidt 2013), the remaining 51 articles were published in journals from other fields but explicitly address marketing-relevant research topics. Due to the openness of the framework (Airoldi et al. 2015, p. 4) and the applicability to a large variety of datasets (Blei 2012, p. 83), the field appears highly disordered and diverse. This directly leads to our research questions, the first of which seeks a method-focused classification system for the publications: (RQ1) What are the applied methodological strategies? Specifically, we assess procedures of (1) data retrieval, (2) implementing and extending, and (3) evaluating topic models, which serve as a core clue to the orientation of publications and to the current state of the field. Due to the still ongoing evolution of the methodological approach (Blei et al. 2003, pp. 1015; Galyardt 2015, pp. 42), a substantial share of published research tends to be experimental rather than focused on substantive results (e.g., Jacobs et al. 2016; Phuong and Phuong 2012; Wang et al. 2015). Additionally, because of the large variation of models (Airoldi et al. 2015, pp. 1–567) resulting from the ability to relax basic assumptions of the approach (Blei 2012, pp. 82), we expect diversity in the assessed objects and research interests. 
Therefore, we try to find fields of research by looking at the data, the examined objects, and the research interests: (RQ2) What are current fields of research? Our third and final research question aims at connecting the former two questions by deriving major gaps and providing possible future directions: (RQ3) What are major gaps and future directions of research?
The subsequent sections of the paper are organized as follows: In Sect. 2, we describe the building blocks of LDA (Blei and McAuliffe 2010, pp. 1; Blei 2012, pp. 78), which today is the basic approach in topic modeling and serves as “a springboard for many other topic models” (Blei and Lafferty 2009, p. 72). We also illustrate commonly used extensions of LDA (e.g., Blei and Lafferty 2007, 2009; Blei and McAuliffe 2010; Do and Gatica-Perez 2010; Hoffman et al. 2010), approaches to evaluation (e.g., Newman et al. 2010), and, intertwined with that, essential critique directed at the method (e.g., Schmidt 2013). These aspects are crucial for understanding topic modeling applications in marketing research (Sects. 3 and 4). In Sect. 3, we develop a classification system that focuses on methodological strategies and is based both on the theoretical literature on the subject and on the examined papers. In Sect. 4, we apply that scheme to summarize trends and patterns in current research. Additionally, we try to find sub-patterns by introducing fields of current research, derived from the examined data, objects, and research interests. Lastly, in Sect. 5, we conclude by summarizing gaps and future possibilities in this promising area.
2 Topic modeling
2.1 Latent Dirichlet Allocation
2.2 Comparing LDA to related methods
The just described basic LDA is clearly related to various clustering and other exploratory data compression techniques. Of particular relevance is model-based clustering, using finite mixture models (FMM) (e.g., McLachlan and Peel 2000, pp. 1–39; Titterington et al. 1985, p. 8), the Products of Experts (PoE) model as introduced by Hinton (2002, p. 1), and exploratory factor analytic (EFA) procedures (e.g., Muthen 1978, p. 407).1 Below, we briefly focus on similarities and differences between these methods and LDA.
Finally, LDA is also related to a third family of methods, namely, exploratory or model-based factor analysis for binary variables used for dimensionality reduction. Similar to LDA, factor analytic models aim at compressing high-dimensional data sets into a smaller set of (latent) common factors (equivalent to the topics in LDA), while conserving as much of the original information as possible (Bartholomew et al. 2011, p. 209; Fabrigar and Wegener 2012, p. 10). There are many methodological variants for finding latent groups for binary data (e.g., Hruschka 2016, p. 3; Muthen 1978, pp. 551; Muthen and Christoffersson 1981, pp. 407). While similar in idea, these methods also differ notably from LDA regarding conceptualization, structure, and output. For example, in LDA, the topics (i.e., the factors) are formed by co-occurrences of words, captured in probability distributions that are part of the method’s structure, rather than by correlations between variables.
2.3 Extensions of the basic LDA
Due to its open framework, LDA is highly extendable and can easily be applied to various kinds of data (Airoldi et al. 2015, p. 4; Blei 2012, p. 83). The prerequisite is a large set of documents, each consisting of discrete units, which are distributed unevenly. What exactly the documents and units are plays a minor role from that perspective. For example, mixed membership models in general have been applied to texts (e.g., Wang and Blei 2011, pp. 450), surveys (Gross and Manrique-Vallier 2015, pp. 119), political voting data (Gormley and Murphy 2015, p. 441), population genetics (Shringarpure and Xing 2015, p. 397), image analyses (Cao et al. 2014, pp. 8959), image and text analyses (Blei and Jordan 2003a, p. 128), and more (e.g., Blei and Lafferty 2009, p. 71). Examples of data in marketing-relevant research include purchase histories and consumer data (Hruschka 2016, p. 7; Ishingaki et al. 2015, p. 17; Jacobs et al. 2016, p. 397), internet browser cookies (Trusov et al. 2016, pp. 409), and mobile app usage data (Do and Gatica-Perez 2010, p. 3).
Additionally, in LDA, even basic assumptions like the so-called “bag-of-words” property can be relaxed. This assumption means that the basic LDA ignores the order of words in documents and the order of documents in a text corpus (Blei 2012, pp. 82). Basic statistical assumptions like the assumed distributions have also been altered.2 This flexibility has led to a substantial number of extensions of LDA (e.g., Airoldi et al. 2015, p. 4; Balasubramanyan and Cohen 2015, p. 256; Blei and Jordan 2003a, pp. 2). In fact, the model has served as a basis for the advent of numerous other types of topic models (Blei and Lafferty 2009, p. 72). These include the Correlated Topic Model (CTM) for discovering correlations between topics (Blei and Lafferty 2007, p. 17; Blei and Lafferty 2009, pp. 82), Dynamic LDA for modeling topics as changing over time (Blei and Lafferty 2009, p. 82), the Supervised Topic Model (sLDA), where an additional response variable (e.g., the rating of a text) is integrated to better fit the data (Blei and McAuliffe 2010, pp. 2), Online LDA, which reduces the computational time needed in LDA for massive document streams (Hoffman et al. 2010, p. 2), the Author Topic Model (ATM), which associates each author with a topic probability (Rosen-Zvi et al. 2004, p. 487), and the Author-Recipient Topic Model (ART), which further extends this idea by “building a topic distribution for every author recipient pair” (Balasubramanyan and Cohen 2015, p. 260). Other models also include hierarchies between documents, for instance the Hierarchically Supervised Latent Dirichlet Allocation (HSLDA) (Wood and Perotte 2015, pp. 311). Various network topic models mainly differ in the view of what a network is, what a link consists of, whether the links are conceptualized as within or between documents, and whether additional factors (e.g., time dependency) are included (e.g., Airoldi et al. 2015, p. 7). 
These account for a substantial share of the latest attempts in the field to analyze complex, multi-layered real-world data. Examples are so-called Relational Topic Models (RTMs) (Chang and Blei 2009, p. 81), and Block-LDA (Balasubramanyan and Cohen 2015, pp. 255). However, only a small fraction of these more recently introduced models is currently in use (see, for example, Table 6).
2.4 Procedures and criteria for model evaluation
Stemming from the variability of the approach, in conjunction with a diverse range of data and research interests, there are numerous procedures and metrics used to evaluate the models. These are applied at various stages of the modeling process and include manual, semi-automated, and fully automated methods (Roberts et al. 2015, pp. 12). Likewise, evaluating topic models can include the computational performance (e.g., Jacobs et al. 2016, pp. 394), indicators for optimal parameter settings, model fit (in sample and predictive out of sample), and the assessment of the clustering output (topics). In general, a widely used practice is to run the same model with different numbers of topics while varying other model parameters (e.g., the prior distributions), or to fit different topic models to the same data for comparison (e.g., Hruschka 2014, p. 270; Roberts et al. 2015, p. 18; Wang et al. 2015, p. 4). This may enable the researcher to find a good model, feasible parameter settings, and reasonable topics, and to assess the stability of the output and of the covariate effects (Roberts et al. 2015, pp. 14; pp. 22; p. 24). When doing so, a quite common procedure is to split the data into a training set and a hold-out validation set. This enables scholars to examine how predictive models behave on unseen data, e.g., utilizing hold-out likelihood (e.g., Blei and McAuliffe 2010, p. 8). However, a substantial amount of research has gone into the assessment of the clustering output (topics). One way to examine topics is to evaluate semantic coherence, which is a summary measure to capture “the tendency of a topic’s high probability words to co-occur in the same document” (Mimno et al. 2011 quoted from Roberts et al. 2015, pp. 13; see also Newman et al. 2010, p. 100). Newman et al. (2009, pp. 3) use a slightly different approach. 
They propose a model to measure the likelihood of co-occurrence of words, by using external text data sources to provide regularization instead of the internal text data in the documents. Another important aspect of topics is exclusivity, which “captures whether those high probability words are specific to a single topic” (Roberts et al. 2015, p. 14). A manual approach to condense both procedures is to ask human raters if topics are interpretable and can be associated with a single concept (Newman et al. 2009, pp. 2). Also, one can validate that topics capture a single concept by “reading several example documents” (Roberts et al. 2015, pp. 12), or by comparing already present categories to the automatically generated clusters (Roberts et al. 2015, pp. 12).
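To make the notion of semantic coherence more concrete, the following minimal sketch computes a document co-occurrence score in the style of Mimno et al. (2011): for each pair of a topic's top words, it takes the logarithm of the smoothed co-document frequency divided by the document frequency of the higher-ranked word, and sums over pairs. The toy corpus, the two word lists, and the function name `umass_coherence` are illustrative assumptions and are not taken from any of the reviewed papers.

```python
import math

def umass_coherence(top_words, documents):
    """Document co-occurrence coherence of one topic.

    top_words: the topic's words, ordered from most to least probable.
    documents: list of token lists (the corpus used for counting).
    Higher (less negative) values indicate that the top words tend to
    appear together in the same documents.
    """
    doc_sets = [set(doc) for doc in documents]

    def doc_freq(*words):
        # Number of documents that contain all of the given words.
        return sum(1 for s in doc_sets if all(w in s for w in words))

    score = 0.0
    for m in range(1, len(top_words)):
        for l in range(m):
            denom = doc_freq(top_words[l])
            if denom == 0:
                continue  # higher-ranked word absent from corpus; skip pair
            co = doc_freq(top_words[m], top_words[l])
            score += math.log((co + 1) / denom)  # +1 smoothing avoids log(0)
    return score

# Hypothetical toy corpus: each "document" is a small market basket.
docs = [
    ["beer", "chips", "salsa"],
    ["beer", "chips"],
    ["diapers", "wipes", "lotion"],
    ["diapers", "wipes"],
    ["beer", "salsa", "chips"],
]

coherent_topic = ["beer", "chips", "salsa"]   # words that co-occur
mixed_topic = ["beer", "diapers", "lotion"]   # words from unrelated baskets

print(umass_coherence(coherent_topic, docs))
print(umass_coherence(mixed_topic, docs))
```

In this toy setting the first word list receives the higher score because its words share documents, whereas the second list mixes words that never appear together. In practice, such scores are typically compared across candidate models fitted to the same corpus rather than interpreted in absolute terms.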
2.5 Limitations and critique
Despite the above-mentioned recent advances and extensions of the basic LDA, the method is not immune to criticism and limitations. First, there are obvious problems like intervening variables (such as author and environment specific covariates).3 Despite recent advances that allow the number of topics to be inferred from the data, such as the Hierarchical Dirichlet Process as an extension of LDA (Teh et al. 2006, p. 1575), it is still prevalent to choose the number of topics beforehand (Blei and Lafferty 2009, p. 81) and to employ post hoc procedures to validate the suitability of the choice. However, a poorly chosen number of topics can result in poor performance (Tang et al. 2014, p. 7). Another legitimate critique is the need for extensive parameter optimization before running the model (Asuncion et al. 2009, p. 30), which risks tailoring topic models to the needs of the researcher rather than capturing what is really there (Schmidt 2013). Also, the underlying bag-of-words assumption (where information on word order is lost) has been criticized for oversimplifying documents (Shafiei and Milios 2006, p. 1).
Putting these deficiencies aside, Schmidt (2013) questions assumptions that are at the heart of the method. More specifically, analysts assume that topics are coherent (i.e., they share some common aspect) and stable (i.e., they apply to several documents the same way), leading them to the opinion that the co-occurrence patterns of words are “more meaningful than the words that constitute them” (Schmidt 2013) and appropriately capture a concept by being semantically coherent. However, as Schmidt (2013) nicely illustrates, in some instances, the top few words characterizing a topic are not necessarily a decent summary of the large number of words constituting the whole probability distribution. Transferred to language processing, the most frequent words do not necessarily create the meaning. These problems can be at least partly solved by techniques like word removal, or by changing the bag-of-words assumption to incorporate more information. However, they are also related to the practice of looking only at the top words in the output of the model. Newman et al. (2009, p. 2) express a similar concern by saying that some topics learned by a model “(while statistically reasonable) are not particularly useful for human use”. Intertwined with that, Crain et al. (2012, pp. 144–148) note that LDA tends to learn broad (more diffuse) topics, where adding concepts to the same topic is favored if they share common aspects. Thus, the suggestion of Schmidt (2013) is to put extensive effort into visualizing and validating the model before interpreting the results. This, of course, is considerably easier when analyzing position data on a map than with words and their respective semantic implications. Tang et al. (2014, pp. 4–8) point to limiting factors of LDA using a posterior contraction analysis. 
For example, they show that a small number of documents (no matter how many words these contain) makes it impossible to guarantee a valid topic identification. For LDA to perform well, the underlying topics need to be well separated in the sense of the Euclidean metric, which is the case if, for example, they are concentrated on a small number of words.
Finally, there are numerous papers which compare the performance of LDA to other related methods in certain problem domains, pointing towards possible deficiencies of LDA in these domains. Even if LDA is optimized (e.g., in terms of the number of topics), other methods are capable of outperforming it for specific tasks, data, and setups. The methods under consideration comprise a multitude of quantitative methods. For example, Hruschka (2016, pp. 8) compared the relative performance of LDA, CTM, a Binary Factor Analytic Model, Restricted Boltzmann Machines (RBMs) and Deep Belief Nets (DBNs) for predicting purchase incidences in a market basket. The author shows that Binary Factor Analysis vastly outperformed both topic models, and was itself outperformed by RBMs and DBNs. Schröder (2017, pp. 31) used a Multidimensional Item Response Theory Model (MIRT) to analyze market baskets for identifying latent traits of households and predicting purchase behaviour. Based on AIC (Akaike Information Criterion) and AICc (Corrected Akaike Information Criterion), MIRT outperforms LDA for both the binary and the polytomous purchase data scenarios. Salakhutdinov and Hinton (2009, pp. 6) introduce the Replicated Softmax Model to automatically model low-dimensional latent semantic representations in academic and in newspaper articles. Compared to LDA, their model makes better predictions and has a higher retrieval accuracy.
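Since information criteria such as AIC and AICc appear in these model comparisons, a short sketch of how they are computed may be helpful. It uses the standard definitions AIC = 2k - 2 ln L and AICc = AIC + 2k(k+1)/(n - k - 1), where k is the number of estimated parameters, n the number of observations, and ln L the maximized log-likelihood. The log-likelihoods, parameter counts, and sample size below are hypothetical numbers chosen for illustration only; they are not results from any of the cited studies.

```python
def aic(log_likelihood, n_params):
    """Akaike Information Criterion; lower values indicate a better trade-off."""
    return 2 * n_params - 2 * log_likelihood

def aicc(log_likelihood, n_params, n_obs):
    """AIC with the small-sample correction term."""
    if n_obs - n_params - 1 <= 0:
        raise ValueError("AICc requires n_obs > n_params + 1")
    correction = (2 * n_params * (n_params + 1)) / (n_obs - n_params - 1)
    return aic(log_likelihood, n_params) + correction

# Hypothetical fitted models (values are assumptions for illustration).
n_obs = 500
model_a = {"log_likelihood": -1200.0, "n_params": 40}
model_b = {"log_likelihood": -1180.0, "n_params": 75}

for name, m in (("model A", model_a), ("model B", model_b)):
    print(name,
          aic(m["log_likelihood"], m["n_params"]),
          round(aicc(m["log_likelihood"], m["n_params"], n_obs), 2))
```

In this made-up comparison, the model with the higher log-likelihood is still penalized more heavily for its additional parameters, so the more parsimonious model ends up with the lower criterion values.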
3 Approaches and applications in marketing research
In this section, we provide a structured review of the main methodological implications of applying topic modeling in marketing. By utilizing conceptual articles as well as analyzing empirical work, we derive the following characterizing dimensions to categorize prior applications of topic models in the field of marketing: data structures and data retrieval (3.1), topic model implementation and extensions (3.2), and procedures used for model evaluation (3.3). In the course of developing Sects. 3.1, 3.2, 3.3, we describe typical methodological strategies employed by the relevant literature along with a number of characteristic examples compiled in Tables 1, 2, 3, 4, 5 to answer our research question RQ1. Subsequently, we combine these findings in Table 6 and Table A1 in the appendix, categorizing all available publications into the scheme developed in Tables 1, 2, 3, 4, 5. By doing so, Table 6 provides an integrated view of the extent to which a strategy is utilized in a specific field of research (RQ2).
3.1 Data structures and data retrieval
Data structures and data retrieval for topic modeling in marketing
Units are already present: The units to be pre-processed and used in topic models are already present beforehand like words in texts, products in purchase records, etc.
Extraction of units beforehand: (a) Automatized recognition of discrete units from high dimensional data using algorithms, lexica, etc.; (b) Extraction of discrete units using manually predefined groups / categorical reduction of high dimensional data
Synthetically generated data: (a) Algorithmic generation of entirely artificial data; (b) Generating artificial data based on / including real data
E.g., Blanchard et al. (2017, p. 403; pp. 407); Büschken and Allenby (2016, p. 958); Ishingaki et al. (2015, pp. 5); Knights et al. (2009, pp. 244); Schieber et al. (2011, p. 3); Wang et al. (2015, p. 3);
E.g., Do and Gatica-Perez (2010, pp. 4);
E.g., Ishingaki et al. (2015, pp. 13);
In terms of LDA, the critical definition of documents, topics, and words differs vastly between papers (Table A2 in the appendix, column: conceptualization of data). Obviously, it varies with the examined data. For example, in the paper of Sun et al. (2013, pp. 2–4), a document is a user’s purchase history, each purchased product is a word in this document, and the topics (i.e., the clusters to be retrieved) are the customers’ purchasing preferences. Schieber et al. (2011, pp. 4) try to model topics and individuals’ opinions about products on Twitter. Accordingly, they define a document as a single Tweet, which consists of words and (possibly) contains a few topics. However, the research interest is another important determinant of how the data is conceptualized. Sticking to the example provided, a document can be set up as a single posting of a user (Schieber et al. 2011, p. 5). In contrast, Weng et al. (2010, p. 264), who try to identify influential users on Twitter by the following structure and topic similarities, treat all postings of a user as one document. For Paul and Girju (2009, pp. 1412), who intend to find differences in services related to traveling, an online discussion or thread (involving postings of several users) is a document. Further assumptions are reflected in the methodological conceptualization of the data. For example, scholars often restrict a certain unit in a document (e.g., a sentence) to have a single topic (e.g., Büschken and Allenby 2016, p. 954).
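The conceptualization in which a purchase history is a document and each purchased product is a word can be sketched as follows. The transaction records, customer identifiers, and helper names (`build_documents`, `to_count_matrix`) are hypothetical and serve only to illustrate how raw records might be turned into the document-by-unit count matrix that a topic model expects; none of the cited papers prescribes this particular code.

```python
from collections import Counter, defaultdict

# Hypothetical transaction log: (customer_id, product) pairs.
transactions = [
    ("c1", "beer"), ("c1", "chips"), ("c1", "beer"),
    ("c2", "diapers"), ("c2", "wipes"),
    ("c3", "chips"), ("c3", "salsa"), ("c3", "beer"),
]

def build_documents(records):
    """Group purchases by customer: one 'document' per purchase history."""
    docs = defaultdict(list)
    for customer, product in records:
        docs[customer].append(product)
    return dict(docs)

def to_count_matrix(docs):
    """Return (doc_ids, vocabulary, matrix), where matrix[d][v] is the
    number of times product v occurs in the purchase history of doc d."""
    doc_ids = sorted(docs)
    vocab = sorted({p for items in docs.values() for p in items})
    column = {p: j for j, p in enumerate(vocab)}
    matrix = []
    for d in doc_ids:
        row = [0] * len(vocab)
        for product, n in Counter(docs[d]).items():
            row[column[product]] = n
        matrix.append(row)
    return doc_ids, vocab, matrix

documents = build_documents(transactions)
doc_ids, vocab, counts = to_count_matrix(documents)

print(vocab)
for d, row in zip(doc_ids, counts):
    print(d, row)
```

Changing the grouping key changes the definition of a document: grouping by posting, by user, or by discussion thread would correspond to the alternative conceptualizations described above, while the rest of the pipeline stays the same.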
3.2 Topic model implementation and extensions
Most publications use either basic LDA (e.g., Chen et al. 2013, p. 1), common adaptations like Labeled LDA (e.g., Ramage et al. 2010, p. 132), sLDA (e.g., Blei and McAuliffe 2010, pp. 2), temporal LDA (e.g., Wang et al. 2012, p. 124), the Author Topic Model (e.g., Do and Gatica-Perez 2010, pp. 4), CTM (e.g., Trusov et al. 2016, pp. 413), or other custom adjustments of LDA (e.g., Büschken and Allenby 2016, pp. 957; Paul and Girju 2009, p. 1410; Tirullinai and Tellis 2014, p. 468). Noticeably less commonly, scholars utilize more exotic models like the User Aware Sentiment Topic Model (USTM) (Yang et al. 2015, pp. 415–417), or the Visual Sentiment Topic Model (VSTM) (Cao et al. 2014, pp. 8959). Apart from that, when taking a broader perspective, the papers under consideration show a continuum of scientific strategies for using topic models. One easy-to-implement approach is to treat the clustering output itself as the actual research result, in an exploratory manner (e.g., Cao et al. 2014, p. 8964; Christidis and Mentzas 2013, pp. 4375; Karpienko and Reutterer 2017, p. 17; Luo et al. 2015, pp. 1185; Schröder et al. 2017, pp. 42; Sun et al. 2013, p. 7; Wang et al. 2015, p. 3; Yang et al. 2015, pp. 419).
As a consequence, scholars often perform further analyses and visualizations of certain aspects of the data. For example, it is quite common to use a topic model as an intermediate step within an overall model or research aim (e.g., Cao et al. 2014, pp. 8959; Christidis and Mentzas 2013, pp. 4373; Luo et al. 2015, pp. 1180; Sun et al. 2013, pp. 4; Yang et al. 2015, pp. 420). For instance, Luo et al. (2015, pp. 1180) try to find marketing topics in social media postings and their influence, which is measured by the reaction of users. They utilize LDA to get a topic probability vector for each micro-blog post and subsequently calculate the topic influence, the topic response, and the topic trends by implementing various custom formulas for further processing. The topic influence measures the proportion of micro-blog posts that contain a certain topic, the topic response indicates how much customers actively engage in reposting, and the topic trends indicate the development of the former two over time and across topics (Luo et al. 2015, p. 1182). Different in context but similar in idea, Karpienko and Reutterer (2017, pp. 11) apply LDA to the abstracts of a large compilation of academic marketing articles to derive latent topics of scholarly interest as a preprocessing step before inferring communities of topic combinations, using a version of social network analysis. Subsequently, the authors study the evolution of marketing topics over time and across academic journals. Yet another example of using LDA in conjunction with other data compression techniques is the study by Schröder et al. (2017, pp. 42), who use LDA to derive latent shopping interests based on users’ website browsing behavior. Based on the derived latent shopping interests, the authors examine the existence of different online shopper segments using k-means clustering and study implications for shopping behavior. Sun et al. (2013, p. 2), who try to predict customers’ propensities to join group-purchasing events on an online social platform, use LDA to capture the purchasing preferences of customers beforehand. They propose two models, which use this output for further calculation. The first one, PCIM (product-centric inference model), tries to capture whether the specific product determines if a customer joins a group-purchasing event, given the user’s purchasing preferences. The second one, GICIM (group-initiator-centric model), assumes instead that the group initiator in the social network plays the decisive part in that process, when accounting for the user’s and initiator’s topic mixture (Sun et al. 2013, p. 5). By doing so, they extend the application of topic models from an exploratory approach to one which supports hypothesis testing. Dan et al. (2017, pp. 42) also build a predictive model based on the output derived by LDA for the case of online hotel reviews. Using the latent topics derived from a sentence-constrained LDA version, they develop a latent rating regression approach for making inferences on the relative contribution of each latent topic to guests’ overall hotel evaluations. A more practice-oriented example of this kind of use is provided by Christidis and Mentzas (2013, pp. 4373), who try to build a topic-based recommender system for buyers and sellers on an e-auction platform, consisting of buyer item recommendations and seller text and item recommendations. They use basic LDA to extract the probabilities of words in topics and topics in documents (i.e., the item descriptions / items in the online marketplace). Subsequently, they calculate the cosine similarities between items (using the topic distributions of each item) and similarities between topics and terms to establish the recommendation functionality.
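The last step described above, comparing items via the cosine similarity of their topic distributions, can be illustrated with a small sketch. The item names and topic proportions are invented for illustration, and the function `most_similar` is a hypothetical helper; this is a simplified reading of the general idea, not the recommender system of Christidis and Mentzas (2013).

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def most_similar(item_id, item_topics, top_n=2):
    """Rank all other items by topic-distribution similarity to item_id."""
    target = item_topics[item_id]
    scored = [
        (other, cosine(target, vec))
        for other, vec in item_topics.items()
        if other != item_id
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_n]

# Hypothetical document-topic proportions, e.g., as estimated by LDA
# from item descriptions (three topics per item).
item_topics = {
    "camera": [0.80, 0.15, 0.05],
    "tripod": [0.70, 0.20, 0.10],
    "novel":  [0.05, 0.10, 0.85],
    "lens":   [0.75, 0.20, 0.05],
}

for name, score in most_similar("camera", item_topics):
    print(name, round(score, 3))
```

Because every item is represented in the same low-dimensional topic space, items whose descriptions share no literal words can still be recommended together when they load on the same topics.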
A vital step in current research is to incorporate the output of another complex method or model into topic models (e.g., Cao et al. 2014, pp. 8959; Wang et al. 2015, pp. 2; Yang et al. 2015, pp. 415–418). This is naturally associated with more sophisticated forms of data extraction and preparation. An example of this kind of approach is the USTM framework, which aims at modeling user meta-data, topical aspects, and sentiments in online consumer reviews (Yang et al. 2015, p. 414), to aggregate the opinions of various market segments. To capture the sentiments, the authors utilize seed words and two sentiment lexica and incorporate the sentiment information by using asymmetric Dirichlet priors to assign, e.g., positive words a higher probability in positive topics (Yang et al. 2015, pp. 414–419). Another example is the Image-regulated Graph Topic Model (IGTM), where the authors utilize the SIFT feature algorithm and k-means clustering to extract 500 discrete visual words from images, and build a bag-of-visual-words model for detecting weighted relationships between images via an image relation graph, which consists of nodes (images) and edges (similarities between images). By using further variables, IGTM aims at jointly modeling text and images to enrich topic detection with sentiments (Wang et al. 2015, pp. 2). The authors allocate a topic assignment and an image assignment for each word in a document, where each topic is a multinomial distribution over words and an image is a multinomial distribution over topics (Wang et al. 2015, pp. 2). There are further instances like Cao et al. (2014, pp. 8958) using an Adjective Noun Pairs (ANPs) based detector for image annotations and Visual Sentiment Ontology (VSO) detectors to construct SentiBank. Subsequently, they incorporate this information in their topic models, trying to enhance the prediction of sentiments in images.
However, at the same time, these authors mark a fourth noticeable implementation strategy, which is to combine a set of topic models in a framework, to improve prediction. More specifically, Cao et al. (2014) intend to build a topic model, which analyzes the distributions of visual sentiment features in topics. Since some non-discriminative (i.e., topic irrelevant) sentiment features have high probabilities in topics, they introduce a background topic model and additional estimators to distinguish these from discriminative ones.
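The idea of encoding sentiment seed words through asymmetric Dirichlet priors, mentioned above for the USTM framework, can be sketched in a strongly simplified form. The snippet below only builds a topic-by-word matrix of prior pseudo-counts in which seed words receive extra weight in topics of the matching sentiment. The vocabulary, seed lists, and the `base` and `boost` values are assumptions made for illustration; this is not the specification used by Yang et al. (2015).

```python
def build_seeded_prior(vocab, topic_sentiments, seed_words, base=0.01, boost=1.0):
    """Return prior[k][w]: Dirichlet pseudo-count for word w in topic k.

    vocab: list of words (column order of the prior matrix).
    topic_sentiments: sentiment label of each topic, e.g. "positive".
    seed_words: dict mapping a sentiment label to a set of seed words.
    base: symmetric pseudo-count given to every word (assumed value).
    boost: extra pseudo-count for seed words in matching topics (assumed value).
    """
    prior = []
    for sentiment in topic_sentiments:
        seeds = seed_words.get(sentiment, set())
        row = [base + boost if word in seeds else base for word in vocab]
        prior.append(row)
    return prior

# Hypothetical vocabulary, topic labels, and seed lexicon.
vocab = ["great", "awful", "room", "staff"]
topic_sentiments = ["positive", "negative"]
seed_words = {"positive": {"great"}, "negative": {"awful"}}

eta = build_seeded_prior(vocab, topic_sentiments, seed_words)
for label, row in zip(topic_sentiments, eta):
    print(label, row)
```

A matrix of this kind would then replace the usual symmetric topic-word prior during inference, nudging seed words towards topics of the intended polarity while leaving neutral words such as "room" or "staff" unconstrained.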
Types of topic model implementation
Topic models as part of a more complex model or research objective
Using the output of topic models as research results
Using the output of topic models for further processing in a bigger model or research aim
Using the output of other methods / models as vital input for topic models
Using several topic models in a framework
E.g., basic LDA to cluster textual online reviews from PatientsLikeMe.com (Park and Ha 2016, pp. 1494); Further examples: Cao et al. (2014, p. 8964); Christidis and Mentzas (2013, pp. 4375); Karpienko and Reutterer (2017, p. 17); Luo et al. (2015, pp. 1185); Schröder et al. (2017, p. 42); Sun et al. (2013, p. 7); Wang et al. (2015, p. 3); Yang et al. (2015, p. 419);
E.g., using the clustering output of LDA as input for further models (PCIM & GICIM) (Sun et al. 2013, pp. 4); Further examples: Cao et al. (2014, pp. 8959); Christidis and Mentzas (2013, pp. 4373); Dan et al. (2017, pp. 46); Karpienko and Reutterer (2017, pp. 11); Luo et al. (2015, pp. 1180); Schröder et al. (2017, pp. 42); Yang et al. (2015, pp. 420);
E.g., VSTM, which consists of a foreground and a background topic model (Cao et al. 2014, p. 8959);
Topic model (LDA) extensions
Integrating additional variables: Incorporating extra information into the model in terms of additional variables / parameters. E.g., in the form of components, covariates, prior distributions, etc. (e.g., Büschken and Allenby (2016, p. 958); Phuong and Phuong (2012, pp. 65); Ramage et al. (2010, p. 132); Trusov et al. (2016, pp. 413); Yang et al. (2015, p. 415–418));
Changing the inference method: Changing the inference method (Variational Approximation with EM in original LDA) (Blei et al. 2003, p. 1003) to optimize the predictive performance, the rate of convergence and the computational effectiveness in respect to e.g., the data, the number of topics and hyperparameter settings (Asuncion et al. 2009, pp. 28). E.g., using MCMC (like Gibbs), Laplace, MAP (Maximum a Posteriori), etc. (e.g., Büschken and Allenby (2016, pp. 969); Phuong and Phuong (2012, p. 66); Ramage et al. (2010, p. 133); Trusov et al. (2016, p. 416); Wang et al. (2012, pp. 125); Yang et al. (2015, p. 417));
Changing basic assumptions: Changing basic assumptions entailed in LDA to adapt to specific data and research interests. E.g., assumed distributions (e.g., Trusov et al. 2016, pp. 415), bag-of-words (Yang et al. 2015, p. 418), that the order of documents does not matter (e.g., Wang et al. 2012, pp. 124), etc. Further examples: Büschken and Allenby (2016, p. 958); Phuong and Phuong (2012, pp. 66); Ramage et al. (2010, p. 132);
Optimizing topics to learn by putting constraints into the model in respect to certain purposes and assumptions of the specific research endeavour. E.g., one topic per sentence (Büschken and Allenby 2016, pp. 955), correspondence between topics and labels (Ramage et al. 2010, pp. 132), etc. Further examples: Phuong and Phuong (2012, pp. 65); Trusov et al. (2016, pp. 413);
3.3 Evaluation procedures
Evaluating topic models
Computational performance: Evaluating the computational performance in terms of e.g., computational time, scalability, etc.
Optimal parameter settings: Evaluating/determining the optimal parameter settings, in terms of e.g., number of topics, prior values, etc.
Model fit (in sample and predictive out of sample): Evaluating the predictive performance of the model (as a topic quality indicator), e.g., in sample and out of sample, using real or synthetic data.
Analysis of clustering output: Evaluating the clustering output (topics) in terms of e.g., semantic coherence, exclusivity, etc.
Analysis of the estimator (for inference): Analyzing the estimator for topic inference, e.g., in terms of number of iterations for convergence.
E.g., Ishigaki et al. (2015, pp. 13);
Types of comparisons for model evaluation
Human ratings and scores: Comparing results of an automated process to human ratings/scores and evaluations. E.g., Tirunillai and Tellis (2014, pp. 470).
External reports and categories: Comparing the clustering output of topic models to external reports (e.g., consumer reports) or already present categories. E.g., Tirunillai and Tellis (2014, pp. 470).
Traditional clustering techniques: Comparing the output of topic models to traditional customer segmentation and clustering techniques. E.g., Trusov et al. (2016, p. 417).
Specific metrics, associated with a field: Comparing a topic model to specific algorithms, associated with a research field (like PageRank, in-degree, etc.). E.g., Weng et al. (2010, pp. 267).
Other topic models: Comparing a topic model to other topic models.
Variations of the same topic model: Comparing (mathematical, componential, parameter-wise (like the number of topics)) variations of the same topic model.
Taking a broader perspective, one approach to assessing the clustering output is to evaluate the outcome in retrospect (e.g., Tirunillai and Tellis 2014, pp. 472). Another is to fit the model on limited data and compare its predictions to hold-out data (e.g., Jacobs et al. 2016, p. 397). A third evaluation strategy is to use synthetically generated data, for which the underlying distributions are known beforehand, and to compare expected with actually retrieved results (e.g., Blanchard et al. 2017, p. 402; Büschken and Allenby 2016, p. 971; Knights et al. 2009, p. 243).
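To make the synthetic-data and hold-out ideas more concrete, the following minimal Python sketch draws a small corpus from the LDA generative process with known distributions and scores candidate topic-word distributions by perplexity on that corpus (lower is better). It is an illustration under simplifying assumptions, not the procedure of any reviewed paper: the toy values of `true_theta` and `true_phi` are made up, the function names `generate_corpus` and `perplexity` are hypothetical, and the document-topic proportions are treated as known, whereas a real hold-out evaluation would have to infer them.

```python
import math
import random

def generate_corpus(theta, phi, doc_len, rng):
    """Draw documents from the LDA generative process with known theta and phi."""
    vocab_ids = list(range(len(phi[0])))
    topic_ids = list(range(len(phi)))
    corpus = []
    for doc_theta in theta:
        doc = []
        for _ in range(doc_len):
            z = rng.choices(topic_ids, weights=doc_theta)[0]  # topic of this token
            w = rng.choices(vocab_ids, weights=phi[z])[0]     # word given the topic
            doc.append(w)
        corpus.append(doc)
    return corpus

def perplexity(corpus, theta, phi):
    """Exponential of the negative average per-token log-likelihood."""
    log_lik, n_tokens = 0.0, 0
    for doc, doc_theta in zip(corpus, theta):
        for w in doc:
            p_w = sum(t * topic[w] for t, topic in zip(doc_theta, phi))
            log_lik += math.log(p_w)
            n_tokens += 1
    return math.exp(-log_lik / n_tokens)

rng = random.Random(0)
# Assumed toy ground truth: 2 topics over a 4-word vocabulary, 3 documents.
true_phi = [[0.45, 0.45, 0.05, 0.05], [0.05, 0.05, 0.45, 0.45]]
true_theta = [[0.9, 0.1], [0.1, 0.9], [0.5, 0.5]]
held_out = generate_corpus(true_theta, true_phi, doc_len=200, rng=rng)

# Candidate that ignores all structure: every topic is uniform over the vocabulary.
uniform_phi = [[0.25] * 4, [0.25] * 4]
ppl_true = perplexity(held_out, true_theta, true_phi)
ppl_uniform = perplexity(held_out, true_theta, uniform_phi)
print(f"perplexity (generating model): {ppl_true:.2f}")
print(f"perplexity (uniform baseline): {ppl_uniform:.2f}")
```

Because the uniform baseline assigns probability 1/4 to every token, its perplexity equals the vocabulary size, which gives a natural reference point for judging the candidate that matches the generating distributions.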
4 Topic modeling research in marketing
Synopsis of topic modeling applications in marketing
4.1 Research objectives and evaluating topic models
Looking at Table 6 and Table A1 in the appendix together, we can see that topic modeling depends extensively on somewhat arbitrary choices, such as the hyperparameter settings and the number of topics, and is experimental to a certain degree. Consistently, such uncertainties show in an emphasis on model validation: 41% of all papers pursue both model validation and explorative research, 34% mainly focus on evaluating their model, and 23% pursue explorative research only; one publication states no aim. Closely related is the extensive use of evaluation procedures with a focus on (1) assessing the fit of the proposed models, (2) optimal parameter settings, and (3) a detailed analysis of the clustering output. Scholars also introduce comparisons to examine their models (80% of papers). The dominant strategies are to compare (1) variations of the same topic model and (2) the proposed model to other topic models. A notable finding is the frequent inclusion of human ratings and scores (20%). Despite these efforts, 11 publications apply no evaluation technique at all. Scholars in sales / retailing place more emphasis on model fit and optimal parameter settings than on an analysis of the clustering output. Intertwined with that is the relatively prominent use of comparisons between variations of the same topic model. For social media, scholars evaluate topic models less often. One reason is that a few authors use topic models as part of a more complex model or research aim and evaluate the overarching model rather than the topic model itself (e.g., Herzig et al. 2014, pp. 52). This is also reflected in the “implementing TM” columns in Table 6.
Apart from that, in the online textual consumer reviews section, optimal parameter settings, an analysis of the clustering output, and model fit are predominant, which seems to be a reasonable pattern when looking at the aims of research and the data.
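As an illustration of what an analysis of the clustering output in terms of semantic coherence can look like, the sketch below computes a UMass-style coherence score in the spirit of Mimno et al. (2011): for the top words of a topic, it sums the log ratio of smoothed co-document frequency to document frequency over all ordered word pairs. The toy review corpus and the function name `umass_coherence` are assumptions made for this example, and skipping words with zero document frequency is a simplification of our own.

```python
import math

def umass_coherence(top_words, documents):
    """UMass-style coherence of one topic, given its top words ordered by probability."""
    doc_sets = [set(doc) for doc in documents]

    def doc_freq(*words):
        # Number of documents that contain all of the given words.
        return sum(1 for d in doc_sets if all(w in d for w in words))

    score = 0.0
    for m in range(1, len(top_words)):
        for l in range(m):
            d_l = doc_freq(top_words[l])
            if d_l == 0:
                continue  # assumption: skip unseen words to avoid division by zero
            score += math.log((doc_freq(top_words[m], top_words[l]) + 1) / d_l)
    return score

# Hypothetical tokenized online reviews.
reviews = [
    ["battery", "charge", "screen"],
    ["battery", "charge", "price"],
    ["delivery", "price", "refund"],
    ["delivery", "refund", "screen"],
]
coherent = umass_coherence(["battery", "charge"], reviews)
mixed = umass_coherence(["battery", "refund"], reviews)
print(coherent, mixed)
```

Scores closer to zero (or positive) indicate that the top words of a topic tend to appear in the same documents; here the pair that always co-occurs scores higher than the pair that never does.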
4.2 Utilized data structure
In terms of employed data, scholars heavily rely on units already present in the data (89%). However, a considerable share (mostly in addition) performs an automated recognition of units from high-dimensional data (20%), and 11% extract manually predefined categories beforehand. It is noteworthy that seven papers do not rely on already present units but solely use one of the other data retrieval strategies (e.g., Cao et al. 2014, pp. 8958; Do and Gatica-Perez 2010, pp. 4). Scholars also utilize artificially generated data for evaluation purposes in six publications. The importance of already present units in the data recurs in all specific fields of research, except for sales / retailing, where scholars often deal with manual categorizations of products and brands (e.g., Do and Gatica-Perez 2010, p. 4).
4.3 Model implementation and extensions
Methodologically, scholars most often present the output of topic models as research results (82%), and many also use this output for further processing in a more complex model (66%). In a noticeably smaller share of publications (41%), only one of the two strategies is used, and 11 papers do not present the output of topic models at all but use it solely for a research aim that goes beyond deriving topics from the data. Another popular strategy is to additionally use the output of another method or model as input for topic models (25%). As previously mentioned, for social media, scholars use the output of topic models for further processing noticeably more often than in other fields of research, almost inverting established methodological patterns. In the Online Textual Consumer Reviews and Services Research branch, there is an emphasis on utilizing the output of topic models as research results. These patterns are intertwined with field-specific research interests. In general, scholars tend to use LDA (e.g., Park and Ha 2016, pp. 1491), common extensions, and / or customizations (e.g., Trusov et al. 2016, pp. 413). In total, 87% of papers contain a topic model that is customized in comparison to original LDA. Regarding extensions, scholars mostly use MCMC (64%), and within that Gibbs sampling, for inference, in contrast to the variational approach of Blei et al. (2003). Extended topic models are prevalent, with 59% of publications introducing additional variables, 51% changing basic assumptions, and 41% introducing constraints. This overall pattern is more or less consistent across the specific fields of research.
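Since Gibbs sampling is the most frequently used inference method in the reviewed papers, a compact sketch may help readers who are less familiar with it. The code below implements a standard collapsed Gibbs sampler for basic LDA with symmetric Dirichlet priors in plain Python. It is a didactic sketch, not the implementation of any reviewed paper: the function name `lda_gibbs`, the hyperparameter values, the number of iterations, and the toy corpus of word ids are all assumptions chosen for illustration.

```python
import random

def lda_gibbs(docs, n_topics, vocab_size, alpha=0.1, beta=0.01, n_iter=200, seed=0):
    """Collapsed Gibbs sampling for basic LDA; returns point estimates (theta, phi)."""
    rng = random.Random(seed)
    doc_topic = [[0] * n_topics for _ in docs]                # tokens per (document, topic)
    topic_word = [[0] * vocab_size for _ in range(n_topics)]  # tokens per (topic, word)
    topic_total = [0] * n_topics                              # tokens per topic
    assignments = []
    for d, doc in enumerate(docs):                            # random initialization
        z_doc = []
        for w in doc:
            k = rng.randrange(n_topics)
            z_doc.append(k)
            doc_topic[d][k] += 1
            topic_word[k][w] += 1
            topic_total[k] += 1
        assignments.append(z_doc)

    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                k = assignments[d][i]
                # Remove the current token from all counts.
                doc_topic[d][k] -= 1
                topic_word[k][w] -= 1
                topic_total[k] -= 1
                # Full conditional of the topic assignment, up to a constant.
                weights = [
                    (doc_topic[d][j] + alpha)
                    * (topic_word[j][w] + beta) / (topic_total[j] + vocab_size * beta)
                    for j in range(n_topics)
                ]
                k = rng.choices(range(n_topics), weights=weights)[0]
                assignments[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][w] += 1
                topic_total[k] += 1

    # Smoothed point estimates from the final state of the sampler.
    theta = [[(c + alpha) / (len(doc) + n_topics * alpha) for c in row]
             for row, doc in zip(doc_topic, docs)]
    phi = [[(c + beta) / (topic_total[k] + vocab_size * beta) for c in topic_word[k]]
           for k in range(n_topics)]
    return theta, phi

# Toy corpus: word ids 0-2 and 3-5 form two obvious themes; the last document mixes both.
docs = [[0, 1, 2, 0, 1], [1, 2, 0, 2], [3, 4, 5, 3], [4, 5, 3, 5, 4], [0, 1, 4, 5]]
theta, phi = lda_gibbs(docs, n_topics=2, vocab_size=6)
for row in theta:
    print([round(p, 2) for p in row])
```

Each row of `theta` gives the estimated topic proportions of one document, and each row of `phi` the estimated word distribution of one topic. In practice, estimates are usually averaged over several samples after a burn-in period rather than read off the final state, which this sketch omits for brevity.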
5 Future directions of topic modeling in marketing
Looking at the recent development of topic modeling research, starting with the first article, which appeared in 2008, we see an upward trend in the adoption of the method in marketing. As depicted above (Table 6), the reviewed articles cover important research areas. Still, there are interesting phenomena and unexplored fields of research in marketing that have not been analyzed to date. Below, we highlight some shortcomings in the previous literature, which should offer future researchers in this exciting field many promising research opportunities. By doing so, we address our third research question (RQ3).
5.1 Offline, high dimensional data and additional time-varying information
The research areas covered so far seem to be connected, to a certain extent, to (online and digital) data that are relatively easy to obtain and process. Additionally including offline data and high-dimensional data (e.g., Do and Gatica-Perez 2010, pp. 4) by using sophisticated strategies of data retrieval could further extend the field. Also, as shown by Blei and Lafferty (2006, pp. 5), who propose a Dynamic Topic Model to analyze changes of topics over time, it is worthwhile to consider the inclusion of time-varying information. Adopting topic modeling frameworks that allow the integration of additional information, in particular time-dependent marketing covariates, should have great potential in the field of marketing research. Furthermore, integrating some form of guidance into the process of topic generation can be helpful for interpreting the derived solutions (see e.g., Andrzejewski and Zhu 2009, pp. 43; Blei and McAuliffe 2010, pp. 2).
5.2 Topic models and complexity in marketing research
Modeling marketing phenomena entails complexity, which is reflected in methodologically elaborate conceptualizations and procedures (see the “implementing TM” and “extending TM” columns in Table 6). We expect this development to continue and intensify in the future, both on the level of implementing topic models and on the level of extending the method itself. We highlighted numerous examples of implementing the method. For example, an intriguing approach is presented by Sun et al. (2013), who shift topic modeling from an explorative method to one that supports hypothesis testing by integrating the output of the topic model into two subsequent models. As we noted above (Table 6), extending the method itself is quite an important strategy in dealing with marketing problems, entailing the introduction of additional variables and constraints as well as changes to basic assumptions and the inference method. This is reflected in the literature to a certain extent. Several authors also see the need for an improvement of inference algorithms, the invention of tools to more easily develop and implement topic models, and more automation (e.g., in choosing the number of topics) (Blei et al. 2010, p. 1; p. 10; Blei 2012, pp. 82; Blei 2014, p. 25). Some of these efforts are already in the implementation phase. For example, Bart (2009, pp. 1) technically modifies Gibbs sampling to obtain a faster inference method.
5.3 Presenting and visualizing research results
Despite the large variety of techniques and tools available to suitably visualize the outcome of topic models, such as topic proportions and topic distributions across corpora (e.g., Chaney and Blei 2012, pp. 420; Grün and Hornik 2011, p. 16), many scholars in the field continue to call for new algorithms to visualize topics and to present data and corpora (e.g., Chaney and Blei 2012, p. 419). Additionally, they stress the need for simpler, yet more sophisticated and interactive frameworks for scholars and practitioners (e.g., Blei 2012, p. 84; Blei 2014, p. 25; Kjellin and Liu 2016, pp. 485–460; Zinman and Fritz n.d., p. 2).
5.4 Shortcomings in marketing research using topic models
Researchers utilize numerous methods and measures to validate modeling outcomes. However, there are some shortcomings as well. Firstly, some authors do not use the full spectrum of evaluation techniques, making their research results somewhat opaque. Secondly, despite indicators that point in the right direction, most of the authors fail to address the critical problems raised by Schmidt (2013), both in assessing the clustering output and in model validation (Sect. 2). There are numerous other problems that need to be solved. As already mentioned above, Tang et al. (2014, pp. 4–8) point to some limiting factors of LDA in the form of conditions for applying the method, and Crain et al. (2012, pp. 144–148) and Newman et al. (2009, p. 2) elaborate on problems of learning topics. Blei (2012) also sees the need for the development of further methods for evaluating and selecting topic models when confronted with a particular problem domain or dataset. As we have also mentioned in Sect. 2.5, some related methods have been demonstrated to outperform LDA-type models in specific tasks, data, or setups, which calls for future research to gain a more thorough understanding of the relative advantages of competing methods.
We thank one of the anonymous reviewers for pointing us to this important aspect.
For example, a logistic normal prior distribution is utilized in CTM to be able to correlate topics (Paisley et al. 2015, p. 207).
Intervening variables like author characteristics (e.g., psychological factors, self-selection), user-interface-specific features in online reviews (e.g., a constraint on the number of words to be written), etc., often determine what is written in the documents and add a predictive element to topic models.
Open access funding provided by Vienna University of Economics and Business (WU).
- Airoldi EM, Blei DM, Erosheva EA, Fienberg SE (2015) Introduction to mixed membership models and methods. In: Airoldi EM, Blei DM, Erosheva EA, Fienberg SE (eds) Handbook of mixed membership models and their applications. CRC Press, Florida, p 3
- Andrzejewski D, Zhu X (2009) Latent Dirichlet Allocation with Topic-in-Set Knowledge. SemiSupLearn’09. Proceedings of the NAACL HLT 2009 Workshop on Semi-Supervised Learning for Natural Language Processing. pp. 43–48. ISBN: 978-1-932432-38-1
- Asuncion A, Welling M, Smyth P, Teh YW (2009) On smoothing and inference for topic models. pp. 27–34. https://arxiv.org/abs/1205.2662
- Balasubramanyan R, Cohen WW (2015) Block-LDA: jointly modeling entity-annotated text and entity-entity links. In: Airoldi EM, Blei DM, Erosheva EA, Fienberg SE (eds) Handbook of mixed membership models and their applications. CRC Press, Florida, p 255
- Bart E (2009) Speeding up Gibbs sampling by variable grouping. NIPS Workshop on applications for topic models: text and beyond. https://www.parc.com/publication/3410/speeding-up-gibbs-sampling-by-variable-grouping.html
- Bartholomew DJ, Steele F, Moustaki I, Galbraith JI (2011) Analysis of multivariate social science data, 2nd edn. CRC Press, Taylor & Francis Group, Boca Raton, London
- Blei DM (2014) Build, compute, critique, repeat: data analysis with latent variable models. Annu Rev Stat Appl 1:203–232. https://doi.org/10.1146/annurev-statistics-022513-115657
- Blei DM, Jordan MI (2003a) Modeling annotated data. SIGIR’03. ACM 1581136463/03/0007
- Blei DM, Lafferty JD (2006) Dynamic topic models. Proceedings of the 23rd International Conference on Machine Learning. Pittsburgh, PA
- Blei DM, Lafferty JD (2009) Topic models. In: Srivastava A, Sahami M (eds) Chapman and Hall/CRC. data mining and knowledge discovery series. Taylor and Francis Group, LLC, New York, p 71
- Blei DM, McAuliffe JD (2010) Supervised topic models. NIPS Proceedings
- Blei DM, Ng AY, Jordan MI (2003) Latent Dirichlet Allocation. J Mach Learn Res 3(2003):993–1022
- Boyd-Graber J, Mimno D, Newman D (2015) Care and feeding of topic models: problems, diagnostics, and improvements. In: Airoldi EM, Blei DM, Erosheva EA, Fienberg SE (eds) Handbook of mixed membership models and their applications. CRC Press, Florida, p 225
- Brett MR (2012) Topic modeling: a basic introduction. http://journalofdigitalhumanities.org/2-1/topic-modeling-a-basic-introduction-by-megan-r-brett/
- Büschken J, Allenby GM (2017) Improving text analysis using sentence conjunctions and punctuation. SSRN. https://ssrn.com/abstract=2908915. Accessed 31 Jan 2017
- Chaney AJB, Blei DM (2012) Visualizing topic models. Assoc Adv Artif Intell. pp 419–422
- Chang J, Blei DM (2009) Relational topic models for document networks. Proceedings of the 12th International Conference on Artificial Intelligence and Statistics (AISTATS). Vol. 5 of JMLR
- Chang J, Boyd-Graber J, Gerrish S, Wang C, Blei DM (2009) Reading tea leaves: how humans interpret topic models. http://papers.nips.cc/paper/3700-reading-tea-leaves-how-humans-interpret-topic-models.pdf. Accessed 28 July 2017
- Chen AT, Sheble L, Eichler G (2013) Topic modeling and network visualization to explore patient experiences, pp 1–4. http://faculty.washington.edu/atchen/pubs/Chen_Sheble_Eichler_VAHC2013.pdf. Accessed 16 Jan 2017
- Cho YS, Steeg GV, Galstyan A (2015) Mixed membership blockmodels for dynamic networks with feedback. In: Airoldi EM, Blei DM, Erosheva EA, Fienberg SE (eds) Handbook of mixed membership models and their applications. CRC Press, Florida, p 527
- Clement M, Boßow-This S (2007) Fuzzy clustering mit Hilfe von mixture models. In: Albers S, Klapper D, Konradt U, Walter A, Wolf J. Methodik der empirischen Forschung. 2., überarbeitete und erweiterte Auflage. Gabler. pp 167–182
- Costa Filho IG (2010) Mixture models for the analysis of gene expression: integration of multiple experiments and cluster validation. Dissertation. University of Berlin
- Crain SP, Zhou K, Yang SH, Zha H (2012) Dimensionality reduction and topic modeling. In: Aggarwal CC, Zhai CX (eds) Mining text data. Springer, Heidelberg
- Dan N, Bellio R, Reutterer T (2017) A note on latent rating regression for aspect analysis of user-generated content. Working paper. Department of Marketing WU Vienna
- Do T-M-T, Gatica-Perez D (2010) By their apps you shall understand them: mining large-scale patterns of mobile phone usage. In: Proceedings of the 9th international conference on mobile and ubiquitous multimedia (MUM’10), 1–3 Dec. ISBN: 978-1-4503-0424-5
- Fabrigar LR, Wegener DT (2012) Exploratory factor analysis. Understanding statistics. Oxford University Press, New York
- Fox EB, Jordan MI (2015) Mixed membership models for time series. In: Airoldi EM, Blei DM, Erosheva EA, Fienberg SE (eds) Handbook of mixed membership models and their applications. CRC Press, Florida, p 417
- Frühwirth-Schnatter S (2006) Finite mixture and Markov switching models. Springer, New York
- Galyardt A (2015) Interpreting mixed membership. In: Airoldi EM, Blei DM, Erosheva EA, Fienberg SE (eds) Handbook of mixed membership models and their applications. CRC Press, Florida, p 39
- Gormley IC, Murphy TB (2015) Mixed membership models for rank data. In: Airoldi EM, Blei DM, Erosheva EA, Fienberg SE (eds) Handbook of mixed membership models and their applications. CRC Press, Florida, p 461
- Gormley MR, Dredze M, Van Durme B, Eisner J (2012) Shared components topic models. Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. pp 783–792
- Gross JH, Manrique-Vallier D (2015) A mixed membership approach to political ideology. In: Airoldi EM, Blei DM, Erosheva EA, Fienberg SE (eds) Handbook of mixed membership models and their applications. CRC Press, Florida, p 117
- Han HJ, Mankad S, Gavirneni N, Verma R (2016) What guests really think of your hotel: text analytics of online customer reviews. Cornell Hosp Rep 16(2):3–17
- Heinrich K (2015) Integration von Topic Models und Netzwerkanalyse bei der Bestimmung des Kundenwertes. In: Wissensgemeinschaften 2015. Technische Universität Dresden. Verlag der Wissenschaften. Dresden
- Herzig J, Mass Y, Roitman H (2014) An author-reader influence model for detecting topic-based influencers in social media. ACM 978-1-4503-2954-5/14/09. HT’14, September 1–4
- Hinton GE (2002) Training products of experts by minimizing contrastive divergence. GCNU TR 2000-004
- Ho Q, Xing EP, Airoldi EM (2015) Analyzing time-evolving networks. In: Airoldi EM, Blei DM, Erosheva EA, Fienberg SE (eds) Handbook of mixed membership models and their applications. CRC Press, Florida, p 489
- Hoffman MD, Blei DM, Bach F (2010) Online learning for latent Dirichlet allocation. NIPS Proceedings
- Hruschka H (2016) Hidden variable models for market basket data. University of Regensburg, Regensburg
- Hu B, Ester M (2013) Spatial topic modeling in online social media for location recommendation. RecSys ’13. ACM 978-1-4503-2409-0/13/10
- Hu Z, Wang C, Yao J, Xing E, Yin H, Cui B (2013) Community specific temporal discovery from social media. arXiv:1312.0860v1. Accessed 3 Dec 2013
- Iqbal HR, Ashraf MA, Nawab RMA (2015) Predicting an author’s demographics from text using topic modeling approach. Notebook for PAN at CLEF 2015
- Ishigaki T, Terui N, Sato T, Allenby GM (2015) Topic modeling of market responses for large-scale transaction data. Data science and service research discussion paper. Discussion paper No. 35. Center for Data Science and Service Research Graduate School of Economics and Management. Tohoku University
- Jeong B, Yoon J, Lee JM (2017) Social media mining for product planning: a product opportunity mining approach based on topic modeling and sentiment analysis. Int J Inform Manag. https://doi.org/10.1016/j.ijinfomgt.2017.09.009
- Jo Y, Oh A (2011) Aspect and sentiment unification model for online review analysis. WSDM’11. ACM 978-1-4503-0493-1/11/02
- Kahn A, Baharudin B, Hong Lee L, Kahn K (2010) A review of machine learning algorithms for text-documents classification. J Adv Inform Technol 1(1):4–20
- Karpienko R, Reutterer T (2017) An empirical study of journal positioning and the evolution of marketing subareas. Working paper. Department of Marketing WU Vienna
- Kjellin PE, Liu Y (2016) A survey on interactivity in topic models. IJACSA 7(4):456–461
- Knights D, Mozer MC, Nicolov N (2009) Detecting topic drift with compound topic models. Association for the Advancement of Artificial Intelligence. www.aaai.org. Accessed 18 July 2018
- Lakkaraju H, Bhattacharyya C, Merugu S, Bhattacharya I (2009) Exploiting coherence for the simultaneous discovery of latent facets and associated sentiments. In: Proceedings of the 18th International Conference on World Wide Web (WWW 2009), pp 131–140
- Le D-T, Nguyen C-T, Coltech Q-TH, Phan X-H, Horiguchi S (2008) Matching and ranking with hidden topics towards online contextual advertising. 2008 IEEE/WIC/ACM International Conference on Web Intelligence and Intelligent Agent Technology
- Liu L, Tang J, Han J, Jiang M, Yang S (2010) Mining topic-level influence in heterogeneous networks. CIKM’10, October 25–29, 2010, Toronto
- Lu B, Ott M, Cardie C, Tsou BK (2011) Multi-aspect sentiment analysis with topic models. 11th IEEE International Conference on Data Mining Workshops, pp 1–8
- Luo J, Pan X, Zhu X (2015) Identifying digital traces for business marketing through topic probabilistic model. Technology analysis and strategic management. ISSN: 0953-7325 (Print) 1465-3990 (Online) Journal homepage. http://www.tandfonline.com/loi/ctas20. Accessed 24 Apr 2017
- McLachlan GJ, Peel D (2000) Finite mixture models. Wiley, New York, Chichester, Weinheim, Brisbane, Singapore, Toronto. ISBN 0-471-00626-2
- Mimno D, Wallach HM, Talley E, Leenders M, McCallum A (2011) Optimizing semantic coherence in topic models. In: Proceedings of the conference on empirical methods in natural language processing. Association for Computational Linguistics, pp 262–272
- Moghaddam S, Ester M (2012) On the design of LDA models for aspect-based opinion mining. In: Proceedings of the 21st ACM international conference on information and knowledge management (CIKM’12). ACM 978-1-4503-1156-4/12/10, pp 803–812
- Mukherjee A, Liu B (2012) Aspect extraction through semi-supervised modeling. In: Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics, pp 339–348
- Newman D, Karimi S, Cavedon L (2009) External evaluation of topic models. In: Proceedings of the 14th Australasian Document Computing Symposium. 4 Dec 2009
- Newman D, Lau JH, Grieser K, Baldwin T (2010) Automatic evaluation of topic coherence. Human language technologies: The 2010 Annual Conference of the North American Chapter of the ACL. pp 100–108
- Nimeroff J (2017) How machine learning will be used for marketing in 2017. Forbes technology council. Forbes. https://www.forbes.com/sites/forbestechcouncil/2017/03/10/how-machine-learning-will-be-used-for-marketing-in-2017/#74029c4e6d3d. Accessed 8 June 2017
- Paisley J, Blei DM, Jordan MI (2015) Bayesian nonnegative matrix factorization with stochastic variational inference. In: Airoldi EM, Blei DM, Erosheva EA, Fienberg SE (eds) Handbook of mixed membership models and their applications. CRC Press, Florida, p 205
- Parasuraman A, Grewal D, Krishnan R (2007) Marketing research, 2nd edn. Houghton Mifflin Company, Boston
- Park KB, Ha SH (2016) Mining user-generated contents to detect service failures with topic model. Int J Comput Electr Autom Control Inform Eng 10(8):1491–1496
- Pathak N, DeLong C, Banerjee A (2008) Social topic models for community extraction. The 2nd SNA-KDD Workshop’08 (SNA-KDD’08). ACM 978-1-59593-848-0
- Paul M, Girju R (2009) Cross-cultural analysis of blogs and forums with mixed-collection topic models. In: Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing, pp 1408–1417, Singapore, 6–7 August 2009
- Phuong DV, Phuong TM (2012) A keyword-topic model for contextual advertising. SoICT, pp 63–70. ACM 978-1-4503-1232-5/12/08. https://doi.org/10.1145/2350716.2350728
- Proctor T (2005) Essentials of marketing research, 4th edn. Pearson Education Limited, Harlow
- Rabinovich M, Blei DM (2014) The inverse regression topic model. In: Proceedings of the 31st International Conference on Machine Learning. JMLR: W&CP. Vol. 32
- Rahman MdM, Wang H (2016) Hidden topic sentiment model. WWW 2016, pp 155–165. ACM 978-1-4503-4143-1/16/04
- Ramage D, Dumais S, Liebling D (2010) Characterizing microblogs with topic models. In: Proceedings of the Fourth International AAAI Conference on Weblogs and Social Media
- Reutterer T (2003) Bestandsaufnahme und aktuelle Entwicklungen bei der Segmentierungsanalyse von Produktmärkten. J für Betriebswirtschaft 53(2):52–74
- Roberts ME, Stewart BM, Tingley D (2015) Navigating the local modes of big data: the case of topic models. Draft. June 2015. Prepared for “Computational social science: discovery and prediction”, pp 1–55. https://scholar.harvard.edu/files/dtingley/files/multimod.pdf. Accessed 3 Jan 2018
- Rosen-Zvi M, Griffiths T, Steyvers M, Smyth P (2004) The author-topic model for authors and documents. In: UAI ‘04 Proceedings of the 20th conference on Uncertainty in artificial intelligence. pp 487–494
- Salakhutdinov R, Hinton G (2009) Replicated softmax: an undirected topic model. Advances in neural information processing systems 22. NIPS Proceedings 2009
- Sammut C, Webb GI (2011) Encyclopedia of machine learning. Springer, New York
- Schieber A, Hilbert A, Sommer S, Heinrich K (2011) Analyzing customer sentiments in microblogs – a topic-model-based approach for Twitter datasets. In: Proceedings of the Seventeenth Americas Conference on Information Systems, Detroit, Michigan August 4th–7th
- Schmidt BM (2013) Words alone: dismantling topic models in the humanities. J Digit Humanit 2(1). http://journalofdigitalhumanities.org/2-1/words-alone-by-benjamin-m-schmidt/. Accessed 31 May 2017
- Schröder N, Falke A, Hruschka H, Reutterer T (2017) Analyzing browsing and purchasing across multiple websites based on latent Dirichlet allocation. ALLDATA 2017. ISBN: 978-1-61208-552-4
- Shafiei MM, Milios EE (2006) Latent Dirichlet co-clustering. In: Proceedings of the Sixth International Conference on Data Mining (ICDM’06). 0-7695-2701-9/06. pp 1–10
- Shringarpure S, Xing EP (2015) Population stratification with mixed membership models. In: Airoldi EM, Blei DM, Erosheva EA, Fienberg SE (eds) Handbook of mixed membership models and their applications. CRC Press, Florida, p 397
- Sun Y, Deng H, Han J (2012) Probabilistic models for text mining. In: Aggarwal CC, Zhai CX (eds) Mining text data. Springer, Heidelberg, pp 260–296
- Sun F-T, Griss M, Mengshoel O, Yeh Y-T (2013) Latent topic analysis for predicting group purchasing behavior on the social web. http://repository.cmu.edu/cgi/viewcontent.cgi?article=1157&context=silicon_valley. Accessed 18 Apr 2017
- Sweet TM, Thomas AC, Junker BW (2015) Hierarchical mixed membership stochastic blockmodels for multiple networks and experimental interventions. In: Airoldi EM, Blei DM, Erosheva EA, Fienberg SE (eds) Handbook of mixed membership models and their applications. CRC Press, Florida, p 463
- Tang J, Meng Z, Nguyen XL, Mei Q, Zhang M (2014) Understanding the limiting factors of topic modeling via posterior contraction analysis. In: Proceedings of the 31st International Conference on Machine Learning, Beijing, China, 2014. JMLR: W&CP vol. 32
- Titov I, McDonald R (2008) Modeling online reviews with multi-grain topic models. WWW 2008. IW3C2. ACM 978-1-60558-085-2/08/04
- Titterington D, Smith A, Makov U (1985) Statistical analysis of finite mixture distributions. Wiley. ISBN 0-471-90763-4
- Tran T, Ho T, Do P (2015) Detecting communities and surveying the most influence of online users. ACSIJ 4(6):172–178 (ISSN: 2322-5157)
- Underwood T (2012) Topic modeling made just simple enough. https://tedunderwood.com/2012/04/07/topic-modeling-made-just-simple-enough/
- Wallach HM, Mimno D, McCallum A (2009) Rethinking LDA: Why priors matter. http://dirichlet.net/pdf/wallach09rethinking.pdf. Accessed 20 July 2017
- Wang C, Blei DM (2011) Collaborative topic modeling for recommending scientific articles. In: Proceedings of the 17th ACM SIGKDD international conference on knowledge discovery and data mining (KDD’11). ACM 978-1-4503-0813-7/11/08, pp 448–456Google Scholar
- Wang H, Lu Y, Zhai CX (2011) Latent aspect rating analysis without aspect keyword supervision. In: Proceedings of the 17th ACM SIGKDD international conference on knowledge discovery and data mining (KDD’11). ACM 978-1-4503-0813-7/11/08, pp 618–626Google Scholar
- Wang Y, Agichtein E, Benzi M (2012) TM-LDA: efficient online modeling of latent topic transitions in social media. KDD’12, August 12–16, 2012, Beijing, China. Copyright 2012 ACM 978-1-4503-1462-6/12/08Google Scholar
- Wang Z, Li L, Zhang C, Huang Q (2015) Image-regulated graph topic model for cross-media topic detection. ICIMCS’15, August 19–21, 2015, Zhangjiajie, Hunan, China
- Wedel M, Kamakura WA (1999) Market segmentation, vol 2. Springer Science + Business Media, New York
- Welling M, Hinton G, Osindero S (2002) Learning sparse topographic representations with products of student t distributions. Advances in neural information processing systems. Vol. 15. Vancouver, Canada
- Weng J, Lim E-P, Jiang J, He Q (2010) TwitterRank: finding topic-sensitive influential twitterers. WSDM’10, February 4–6, 2010, New York City, New York, USA. Copyright 2010 ACM 978-1-60558-889-6/10/02, pp 261–270
- Wood F, Perotte A (2015) Mixed membership classification for documents with hierarchically structured labels. In: Airoldi EM, Blei DM, Erosheva EA, Fienberg SE (eds) Handbook of mixed membership models and their applications. CRC Press, Florida, p 305
- Xie Y, Gao Y, Gou J, Cheng Y, Honbo D, Zhang K, Agrawal A, Choudhary A (2012) Probabilistic macro behavioral targeting. DUBMMSM’12, October 29, 2012, Maui, Hawaii, USA. ACM 978-1-4503-1707-8/12/10, pp 7–10
- Yang Z, Kotov A, Mohan A, Lu S (2015) Parametric and non-parametric user-aware sentiment topic models. SIGIR’15, August 09–13, 2015, Santiago, Chile. ACM 978-1-4503-3621-5/15/08, pp 413–422
- Yin Z, Cao L, Han J, Zhai C, Huang T (2011) Geographical topic discovery and comparison. WWW 2011, session: spatio-temporal analysis. March 28–April 1, 2011, Hyderabad, India. ACM 978-1-4503-0632-4/11/03, pp 247–256
- Zinman A, Fritz D (n.d.). Data portraiture and topic models. pp 1–5. http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.156.4544. Accessed 31 Aug 2017
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.