To the best of our knowledge, work on multi-category purchase behavior began in the mid-1970s (Böcker 1975). This work was exploratory in nature and based on pairwise association measures between product categories. The first truly multivariate models that aimed to explain consumers’ multi-category choice decisions appeared in the early 1990s (e.g., Hruschka 1991) and were later extended to include predictors, especially marketing variables (Hruschka et al. 1999; Manchanda et al. 1999; Russell and Petersen 2000). Exploratory approaches also evolved, in parallel with the availability of increasingly large shopping basket databases and with computational advances. The applied methodologies include data compression techniques such as association rule mining (e.g., Brijs et al. 2004; Kamakura 2012), collaborative filtering (e.g., Mild and Reutterer 2003) or various machine learning approaches (e.g., Reutterer et al. 2006, 2007).

In our call for papers we addressed several research gaps in the literature on multi-category purchase behavior that are relevant for this special issue. Of course, we did not expect the submitted papers to eliminate these gaps completely, but we think that the papers included in this issue help to narrow several of them. Two papers deal with expenditure shares or loyalty segments, in contrast to the extant literature with its focus on purchase incidence. These two papers also differ from previous publications in that they combine a cross-category with a brand perspective. Most publications on multi-category purchase behavior give only qualitative managerial implications and do not try to measure cross-category effects on quantitative business goals. Two papers in this issue, on the other hand, assess profit or sales revenue increases across several categories due to promoting categories. Finally, another paper demonstrates the power of visualization aids for complex, high-dimensional data to assist marketing analysts.

The following paragraphs give a brief overview of the five papers contained in this issue.

The paper by Hahsler and Karpienko “Visualizing Association Rules in Hierarchical Groups” presents a post-processing technique for association rules. Association rules have gained popularity for describing asymmetric category associations found in market basket data. In such a context, a high number of categories results in an excessively large number of rules, which makes manual interpretation practically impossible. For example, Hahsler and Karpienko obtain 6668 rules for a data set with 169 grocery categories. The authors propose an interactive visualization tool that allows analysts to explore the set of mined rules. The technique is based on a group matrix representation consisting of nested groups, which can be interactively explored down to the individual rule level. The hierarchy of rules is determined by K-means clustering of the lift values of the corresponding rules. A high lift value indicates a strong association between the antecedents and consequents of a rule. Lift values of rules that were eliminated by the rule mining algorithm are all set to 1.0, i.e., to the value for independence. K-means can then be applied repeatedly to the rules at the subsequent levels.
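To make the role of lift concrete, the following minimal Python sketch computes the lift of a rule from a handful of baskets. It uses the standard definition, lift = supp(A ∪ B) / (supp(A) · supp(B)); the toy baskets and the function names `support` and `lift` are illustrative assumptions and are not taken from the paper.

```python
from typing import FrozenSet, Iterable, List

# Hypothetical toy baskets (illustration only, not data from the paper).
BASKETS: List[FrozenSet[str]] = [
    frozenset({"beer", "chips", "milk"}),
    frozenset({"beer", "chips"}),
    frozenset({"milk"}),
    frozenset({"bread"}),
]


def support(itemset: Iterable[str], baskets: List[FrozenSet[str]]) -> float:
    """Share of baskets that contain every item of `itemset`."""
    items = frozenset(itemset)
    return sum(1 for basket in baskets if items <= basket) / len(baskets)


def lift(antecedent: Iterable[str], consequent: Iterable[str],
         baskets: List[FrozenSet[str]]) -> float:
    """Lift of the rule antecedent -> consequent.

    A value of 1.0 corresponds to independence; values above 1.0
    indicate that the two sides co-occur more often than expected.
    """
    joint = support(set(antecedent) | set(consequent), baskets)
    return joint / (support(antecedent, baskets) * support(consequent, baskets))


if __name__ == "__main__":
    print(lift({"beer"}, {"chips"}, BASKETS))  # strongly associated pair
    print(lift({"beer"}, {"milk"}, BASKETS))   # independent pair
```

In this toy data, beer and chips always occur together and yield a lift of 2.0, whereas beer and milk co-occur exactly as often as expected under independence and yield 1.0, the value assigned to eliminated rules.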

The paper by Reutterer, Hornik, March, and Gruber “A Data Mining Framework for Targeted Category Promotions” introduces a novel multi-step approach for deriving category-level promotions using market basket analysis for individual households. In the first step, household segments are determined by cluster analysis under the restriction that all baskets of a household are assigned to the same cluster. In the second step, association rules are mined for each household segment based on the minimum conditional frequency (called all-confidence) of all association rules that can be generated from the same antecedent. Only a certain number of antecedents, those with the highest all-confidence values, are kept. In the final step, a list of categories recommended for promotion is selected for each segment by an optimization algorithm. The empirical application uses shopping baskets of households enrolled in a grocery retailer’s loyalty program; these baskets cover 268 product categories. After applying their multi-step approach, Reutterer et al. also assess profit increases due to segment-specific targeting under two different scenarios.
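As an illustration of the all-confidence measure, the sketch below uses the common itemset-level definition: the support of an itemset divided by the largest support of any single item in it, which equals the smallest confidence among the rules that can be generated from the itemset. The exact variant used by Reutterer et al. may differ in detail; the baskets and function names are hypothetical.

```python
from typing import FrozenSet, Iterable, List

# Hypothetical toy baskets (illustration only, not data from the paper).
BASKETS: List[FrozenSet[str]] = [
    frozenset({"beer", "chips", "milk"}),
    frozenset({"beer", "chips"}),
    frozenset({"milk"}),
    frozenset({"bread"}),
]


def support(itemset: Iterable[str], baskets: List[FrozenSet[str]]) -> float:
    """Share of baskets that contain every item of `itemset`."""
    items = frozenset(itemset)
    return sum(1 for basket in baskets if items <= basket) / len(baskets)


def all_confidence(itemset: Iterable[str],
                   baskets: List[FrozenSet[str]]) -> float:
    """supp(itemset) / max single-item support within the itemset.

    Assumes every item occurs in at least one basket, so the
    denominator is positive.
    """
    items = frozenset(itemset)
    largest_item_support = max(support({item}, baskets) for item in items)
    return support(items, baskets) / largest_item_support


if __name__ == "__main__":
    print(all_confidence({"beer", "chips"}, BASKETS))
    print(all_confidence({"beer", "milk"}, BASKETS))
```

Ranking itemsets by this value and keeping only the top ones mirrors, under the stated assumptions, the filtering idea described above.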

The paper by Noormann and Tillmanns “Drivers of Private-Label Purchase Behavior Across Quality Tiers and Product Categories” analyzes households’ expenditure shares for four brand choice alternatives, namely three tiers of private brands (generic, standard, or premium) offered by one retailer and national brands (in aggregate), for each of 12 product categories. These shares are computed across 12 months and investigated by fractional multinomial logit models whose predictors are all based on data from the previous 12 months. The following results are consistent across all categories and tiers. Proneness to buy generic (standard, premium) private labels as a rule has a significant positive effect on the respective tier share. Proneness to buy is measured as spending expressed as a percentage of total spending on categories in which the retailer offers a private label. Consumers who bought a higher number of different product categories exhibit a higher private-label share in all tiers. Promotion sensitivity (the share of expenditures associated with promotions across all categories) exerts a negative effect on private-label shares.

The paper by Silberhorn, Boztuğ, and Hildebrandt “Does umbrella branding really work? Investigating cross-category brand loyalty” focuses on the following question: are customers who are loyal to a certain brand in the parent product category more likely than other consumers to be loyal to the same brand in another (extension) category? For each of eight categories, Silberhorn et al. distinguish three prior segments according to the share of the brand in terms of purchase quantities. These three segments consist of first choice buyers with a share of 0.5 or more, second choice buyers with a share below 0.5, and competitive buyers who never purchase the brand. In addition, two groups of frequent and rare buyers are formed by a modal split of category purchase frequencies. Comparisons of appropriate conditional probabilities provide empirical evidence for frequent buyers’ tendency to be cross-category brand loyal for most category pairs. The tractive force of a category c* is defined as the conditional probability of being a first choice buyer in category c for a first choice buyer in category c* minus the conditional probability of being a first choice buyer in category c for a second choice (or a competitive) buyer in category c*, for both frequent and rare buyers. Promotional activities in the parent category are not recommended, as total tractive forces for several of the other categories turn out to be higher.
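The tractive force can be read as a difference of two conditional probabilities: P(first choice in c | first choice in c*) minus P(first choice in c | second choice or competitive in c*). The following minimal Python sketch computes this difference from household-level segment labels. The data layout, the segment labels "first", "second" and "competitive", the category names, and the function names are all hypothetical; in an application along the lines described above, the computation would be run separately for frequent and rare buyers.

```python
from typing import Dict, List

Household = Dict[str, str]  # maps category name -> segment label


def first_choice_share(households: List[Household], target: str,
                       source: str, source_segment: str) -> float:
    """Estimate P(first choice in `target` | `source_segment` in `source`)."""
    group = [h for h in households if h[source] == source_segment]
    if not group:
        raise ValueError("no household belongs to the conditioning segment")
    return sum(1 for h in group if h[target] == "first") / len(group)


def tractive_force(households: List[Household], source: str, target: str,
                   reference_segment: str = "second") -> float:
    """Tractive force of `source` (c*) on `target` (c) relative to a segment."""
    return (first_choice_share(households, target, source, "first")
            - first_choice_share(households, target, source, reference_segment))


# Hypothetical toy data: eight households, two categories.
HOUSEHOLDS: List[Household] = (
    [{"parent": "first", "extension": "first"}] * 3
    + [{"parent": "first", "extension": "second"}]
    + [{"parent": "second", "extension": "first"}]
    + [{"parent": "second", "extension": "competitive"}] * 3
)

if __name__ == "__main__":
    print(tractive_force(HOUSEHOLDS, source="parent", target="extension"))
```

In the toy data, three of four first choice buyers in the parent category are also first choice buyers in the extension category (0.75), against one of four second choice buyers (0.25), so the resulting tractive force is 0.5.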

The paper by Hruschka “Dependences of Multi-Category Purchases on Interactions of Marketing Variables” extends the usual specification of the multivariate probit model by including interaction effects of marketing variables. Such interaction effects have not been investigated in previous studies that analyze multi-category purchases by multivariate logit or multivariate probit models. For a data set covering 25 grocery categories, the extended specification clearly performs better than the usual specification in terms of information criteria. Moreover, many interaction effects are erroneously attributed to the main effects of marketing variables if one applies the usual specification. Stochastic simulation of the two specifications demonstrates that managers who rely on the usual specification run the risk of overestimating sales revenue increases due to sales promotion activities.

Let us conclude this editorial with an outlook on possibilities for future research. Analyzing market baskets that contain individual brands across several product categories appears to be an interesting but challenging task, which could also lead to more detailed managerial implications. This task requires the development of models that are able to cope with much higher complexity, as they must reproduce relations between several hundred brands and distinguish between complementary and substitutional effects. Another promising extension consists of investigating store choice as an additional dependent variable. Finally, we expect that the possibility of combining companies’ internal customer databases with external survey data, online or social media data, and/or geo-location information will stimulate the development of new approaches for leveraging knowledge on cross-category purchase effects to better understand marketing phenomena.