# A data mining framework for targeted category promotions

- 3k Downloads
- 1 Citations

## Abstract

This research presents a new approach to derive recommendations for segment-specific, targeted marketing campaigns on the product category level. The proposed methodological framework serves as a decision support tool for customer relationship managers or direct marketers to select attractive product categories for their target marketing efforts, such as segment-specific rewards in loyalty programs, cross-merchandising activities, targeted direct mailings, customized supplements in catalogues, or customized promotions. The proposed methodology requires customers’ multi-category purchase histories as input data and proceeds in a stepwise manner. It combines various data compression techniques and integrates an optimization approach which suggests candidate product categories for segment-specific targeted marketing such that cross-category spillover effects for non-promoted categories are maximized. To demonstrate the empirical performance of our proposed procedure, we examine the transactions from a real-world loyalty program of a major grocery retailer. A simple scenario-based analysis using promotion responsiveness reported in previous empirical studies and prior experience by domain experts suggests that targeted promotions might boost profitability between 15 % and 128 % relative to an undifferentiated standard campaign.

## Keywords

Cross-category purchases Target marketing Customized coupons Clustering Association rule mining## JEL Classification

C52 C55 M3## 1 Introduction

Even though their popularity might already have exceeded peak levels, with an average of more than twelve memberships per U.S. household and a reported half of U.S. adults being enrolled in at least one, loyalty programs continue to be a mainstay in customer relationship management (Kivetz and Simonson 2003; Ferguson and Hlavinka 2007; Berry 2013). Many companies invest tremendous amounts of money in both, their online and offline loyalty program environments to strive for building and preserving loyalty of their primary clientele. In the marketing literature, however, there is mixed evidence on the effectiveness of loyalty programs and many recent efforts to improve them concentrate on various program design components, tier or reward structures; excellent reviews are provided by Liu (2007), Liu and Yang (2009) and Zhang and Breugelmans (2012).

In a nutshell, loyalty programs have been mainly criticized for their vanishing ability to gain a competitive advantage in an environment where almost all major competitors in a particular industry rival in “loyalty wars” for the most profitable clients (Kumar and Shah 2004; Shugan 2005; Singh et al. 2008). However, some smart companies have learned how to squeeze out valuable customer insights from the vast amount of data permanently accruing from monitoring customer interactions with their touch points and to benefit from personally identifiable purchase history data by customizing targeted marketing activities (Ailawadi et al. 2010; Liu 2007; Bodapati 2008). For example, the U.K.’s biggest and most profitable grocery chain Tesco pioneered data-driven loyalty programs by deriving a lifestyle segmentation of their customer base using behavioral data. At Tesco, these “lifestyle” segments are constructed by looking into the composition of their customers’ shopping baskets and used for deriving segment-specific targeted mailings, coupons and promotions (Humby et al. 2004). A similar approach is adopted by the French grocery retailer Carrefour.

In this paper, we take the perspective of multi-category retailers like Tesco or Carrefour, who need to manage their category level merchandising and target marketing decisions to increase sales generated by their existing customer base (Chen et al. 1999; Rowley 2005). Such retailers frequently make use of targeted promotions to draw consumers’ attention to specific categories. One of the key challenges in such a setting is to decide which categories to promote from among the hundreds or even thousands they offer and to whom, i.e., which customer segment(s) to target. We propose a procedure which addresses both the customer segmentation issue and the task to support managers with selecting attractive categories for deriving segment-specific, targeted category level promotion campaigns^{1}. In addition, we will explore the potential benefits of adapting the presented promotional decision support system and compare its empirical performance relative to that of a simple standardized promotion heuristic.

In the next section, we position the proposed framework against prior related research. Section 3 then presents the building blocks of the methodology to determine product categories to be featured in segment-specific promotional campaigns. Section 4 illustrates the empirical application of the framework by analyzing a real-world transaction dataset collected from a retailer’s loyalty program. We also provide rough estimates of the expected outcome of our procedure to support target marketing campaigns using a simple scenario-based setting and compare the profitability implications with those anticipated from a standardized promotion heuristic. Finally, Sect. 5 discusses the results and provides an outlook on future enhancements of the proposed approach.

## 2 Literature review and research contribution

In our research framework we consider targeted category level promotions in the same manner as Venkatesan and Farris (2012). These authors focus on retailer-initiated coupon campaigns, which are customized to customers’ specific preferences (as reflected in their purchase histories) and are targeted to only a subset of the retailer’s clientele. Such targeted promotions are typically offered by major retailers as part of their loyalty programs or are distributed by specialized target marketing services like Catalina Marketing for cooperating retailers and manufacturers (Zhang and Wedel 2009; Pancras and Sudhir 2007).

In recent years, the effectiveness of targeted promotions has increasingly been studied by marketing researchers (e.g., Rossi et al. 1996; Shaffer and Zhang 1995; Zhang and Krishnamurthi 2004). There is empirical evidence that compared to conventional (i.e., undifferentiated) ones targeted promotions are capable to increase profits (Khan et al. 2009; Musalem et al. 2008). In an early contribution, Bawa and Shoemaker (1989) show for direct mailing coupons that consumers are more responsive to (coupon) promotions if their prior preference for the promoted brand is higher.

Using survey data, Shoemaker and Tibrewala (1985) also report that consumers indicate higher redemption intentions for brands they are loyal to. Zhang and Wedel (2009) show that profit differences are mainly the result of variations in redemption rates and in offline stores the incremental benefit of individual level targeting is relatively small compared to segment-level customization. In a retailer-customized setting, Venkatesan and Farris (2012) also provide support for higher redemption rates of targeted promotions. Furthermore, beyond a lift in coupon redemption the authors also find a mere exposure effect of customized coupon campaigns. This is consistent with a recent study by Sahni et al. (2014); using data from a set of randomized field experiments the authors find significant carryover effects after the promotions expired and evidence for cross-category spillover to non-promoted items. Summing up, these findings suggest that promotions customized to the prior preferences of customer segments can boost company’s profits.

Most prior research on designing targeted promotions has focused on how to detect interesting customer segments to target. For example, the direct approach by Bodapati and Gupta (2004) predicts whether a prospective customer exceeds a predetermined threshold on a defined outcome (e.g., grocery expenditures). Rossi et al. (1996) assess the information content of various information using a target couponing problem that customizes coupons to specific households. Shaffer and Zhang (1995) analytical framework notes the effect of targeting coupons to selected households on firm profits, prices, and coupon face values. Zhang and Krishnamurthi (2004) provide recommendations about when to promote how much to whom, according to the time-varying pattern of purchase behavior and impact of current promotions on future purchases.

Notwithstanding its importance, most of this prior research neglects the selection of which category to promote for the derived customer segment. The approach presented in the next section aims to support decision makers in this respect. Our analytical framework shares some common notions with the approaches introduced by Reutterer et al. (2006) and Boztug and Reutterer (2008). We also segment customers based on the multi-category choices observed in their past purchase history data with the focal company and we also derive the targeted categories based on their aptitude to stimulate cross-category spillover. However, beyond technical aspects, the major differences of the present contribution against these previous studies are as follows: For each customer segment our approach provides the decision maker with a list of candidate categories for segment level targeted promotion campaigns; the list is derived such that the included categories maximize the cross-category spillover effects for non-promoted categories. This task is accomplished by combining various data mining tools with optimization techniques in one integrated analytical framework.

Furthermore, in developing our approach we explicitly distinguish between two types of product categories: The first type contains categories purchased by a significant fraction of a specific company’s customer base. Such “bestsellers” or top-selling categories show high purchase incidence rates and are very frequently bought compared to the rest of the assortment; we therefore denote these as high-frequency categories (HFCs). In a grocery retailing context, such HFCs typically include every day food categories like fresh milk, vegetables, bread or other fast moving consumer good categories. Using Drèze and Hoch (1998) terminology such “type 1 categories” are purchased by customers with the focal company on a regular basis whenever they visit the store.

Thus, such HFCs are perfect for traffic building and useful candidates for undifferentiated (or non-targeted) promotions to draw customers into the store. However, they are less useful for the targeted promotions we aim to derive, because their category expansion effects tend to be modest (Bell et al. 1999). For the purpose of deriving customer segments and selecting categories for segment-specific targeted promotions, we instead focus on a second type of categories we denote as low-frequency categories (LFCs), i.e., categories which show relatively low purchase incidences on the aggregate level but might be characterized by substantial variation across customers. The underlying rationale of considering such LFCs for target marketing purposes is related to the so-called “long tail effect” (Anderson 2006; Elberse 2008), which suggests that multi-category retailers can stimulate previously untapped demand by detecting and promoting specific category combinations that reflect distinctive tastes and preference structures at the individual customer or segment level but are “averaged out” (i.e., vanishing in relative small purchase incidences) on the aggregate level. More precisely, differentiating customer preferences and buying habits are more likely to be reflected in their specific multi-category choices in the “long tail” (i.e., LFCs) than in categories purchased by the vast majority of a company’s clientele. For example, a baby household and a young single household will probably both buy milk, bread and vegetables in combination. However, the latter household is not very likely to purchase any baby hygiene products. Instead, the shopping baskets of young singles might be significantly characterized by convenience food categories, frozen food, etc. Thus, we posit that using characteristic LFC combinations found in customers’ purchase histories might enhance the effectiveness of targeted promotions. The next section presents the technical details of the proposed procedure to derive targeted segment level promotions.

## 3 Methodological framework

*K*-centroids cluster algorithm (KCCA) as introduced by Leisch and Grün (2006). In step 2, an association rule mining (ARM) analysis identifies the segment-specific frequent itemsets in the pooled transactions for each segment detected in the previous step. In accordance with other association rule mining approaches, an additional filter measure separates the statistically interesting frequent itemsets from the less important ones and helps to reduce the number of considered cross-category associations. Finally, in step 3 the itemsets are used as input for an optimization procedure which recommends a list of categories maximizing profits with respect to their own profitability and a profit lift due to expected cross-category purchase associations.

### 3.1 Step 1: identifying household segments

It is common practice among marketing analysts conducting exploratory market basket analysis to assume that each customer transaction (i.e., the shopping basket or market basket) reflects the output of a combined multi-category decision process made during a shopping trip (Manchanda et al. 1999; Russell and Petersen 2000; Hruschka 1991; Kwak et al. 2015). Following previous research this is considered as a “pick-any/*J*” decision task where each transaction can be represented as a *J*-dimensional binary vector \(x_n \in \{1,0\}^J\) of category purchase incidences, with *J* denoting the number of categories considered^{2}. A database of *N* transactions then gives the data set \(X_N = \{ x_n, 1 \le n \le N \}\).

*K*-centroids cluster algorithm introduced by Leisch and Grün (2006). In general,

*K*-centroids methods (such as

*K*-means, McQueen 1967) partition data sets by finding a set of

*K*centroids \(P_K = \{ p_k, 1 \le k \le K \}\) which optimally represent the data set, in the sense that the total distance between the data points \(x_n\) and their centroids \(p(x_n) \in P_K\) become minimal. Formally, with

*d*(

*x*,

*p*) the distance between

*x*and

*p*, one aims at solving

*K*segments \(C_K = \{ c_1, \ldots , c_K \}\) obtained are such that \(c_k\) contains all \(x_n\) for which \(p_k\) is the closest centroid from \(P_K\). Finding such optimal representations is typically based on heuristics which iterate between computing optimal centroids for the current partition and optimal partitions for the current centroids (see Bock 1999; Leisch 2006 for more information). We follow Leisch (2006, Sect. 3.2) in taking

*d*as the extended Jaccard distance, such that \(d(x_n, x_m)\) is the relative frequency of categories purchased in only one of the transactions

*n*and

*m*(but not in both).

For personalized basket data, each transaction can be linked to a household it originates from. To identify household segments, one could follow Boztug and Reutterer (2008) to employ a two-step voting procedure, which first segments the transactions without taking the household information into account, and then assigns households to segments according to the majority of their transactions. Here, we follow a constrained clustering approach which already employs the household information when clustering the transactions via a so-called “must-link” constraint (Wagstaff et al. 2001; Basu et al. 2008), enforcing all transactions corresponding to one household to the same segment. This immediately yields segments of households with similar basket compositions.

*K*-centroids approach very conveniently allows to impose such must-link constraints (Leisch and Grün 2006). Write \(X_{N,h}\) for the transactions in \(X_N\) corresponding to household

*h*, and

*H*for the number of households. Then for all

*h*, all transactions \(x_n\) in \(X_{n,h}\) should have the same centroid \(p(X_{n,h}) \in P_K\), and constrained

*K*-centroids clustering is performed via solving

The thus obtained segmentation yields centroid vectors \(p_k\) which correspond to the prototypical “average” market basket for their corresponding segment \(c_k\) (and are typically similar to the vectors of category purchase frequencies of the respective segments, Leisch and Grün 2006).

A common challenge in the application of *K*-centroids based cluster methods is the determination of an appropriate value of *K* (Aldenderfer and Blashfield 1984; Milligan and Cooper 1985; Kaufman and Rousseeuw 2005). Although this information is not available before the analysis, in most real-world situations *K*-centroids partitioning requires the analyst to predefine a priori the number of expected groups in the dataset or to use heuristics like the ‘elbow’ criterion (Thorndike 1953; Gordon and Vichi 1998), cluster validation indices (Dimitriadou et al. 2002) or index voting. Our approach to choosing *K* is based on the idea of increasing *K* until the corresponding partitions no longer markedly change. More precisely, we employ the “corrected” Rand (1971) for measuring the agreement of two different partitions of the same data set. We compute the constrained *K*-centroids partitions for a suitable range of *K* values, and then inspect the Rand indices of the partitions using *K* and \(K+1\) centroids. We then choose *K* large enough to account for all large changes in the sequence of indices (ensuring that the obtained partition is rather stable with respect to increasing *K*).

### 3.2 Step 2: mining segment-specific frequent itemsets

Whereas the segment centroids \(p_k\) reflect the market basket structure of an average transaction of the segment, they do not provide any information on which categories are exactly bought in combination within the segments’ transactions. As such the set of centroid vectors merely informs the analyst about the specific “interests” of the various household segments in certain (combinations of) categories. For example, observing that a segment features rather frequent purchases of both white wines and red wines does not allow to conclude that these purchases occur together (i.e., in the same baskets). However, identifying interesting category associations clearly is a key ingredient to successful personalized target marketing of the type discussed in the previous section. This can be accomplished by employing transaction data mining techniques for finding frequent so-called itemsets and association rules (e.g., Brijs et al. 2004; Reutterer et al. 2007; Kamakura 2012).

*itemsets*correspond to sets of categories. We say that an itemset

*A*is contained in a transaction

*x*, symbolically \(A \subseteq x\), if

*x*features purchases of all categories in

*A*. The basic measure of interestingness of an itemset is its

*support*, which is the frequency of transactions containing the itemset. Formally, for transactions from segment \(c_k\),

*S*| denotes the cardinality of a set

*S*(i.e., the number of its elements). If the support value of an itemset is above a user-defined threshold (so-called

*minimum support*), the itemset is referred to as “frequent” (Mannila 1997). Even for very large transaction databases, frequent itemsets can efficiently be mined, for example, by using the APRIORI algorithm (Agrawal et al. 1993; Agrawal and Srikant 1994; Bayardo and Agrawal 1999; Zaki et al. 1997).

*association rule*\(A \rightarrow B\) splits an itemset \(C = A \cup B\) into two non-empty disjoint itemsets

*A*and

*B*, the antecedent and the consequent of the rule. The strength of the association is typically measured by the

*confidence*of the rule, which is the conditional frequency of transactions containing

*B*within the transactions containing

*A*. Formally, for transactions from segment \(c_k\),

As in general \(\mathrm {conf} (A \rightarrow B) \ne \mathrm {conf} (B \rightarrow A)\), confidence provides an asymmetric measure of the statistical strength of the association between two itemsets *A* and *B*.

*all-confidence*(Omiecinski 2003), which computes the minimal confidence of all association rules that can be generated from the itemset. Formally,

### 3.3 Step 3: optimization and filtering segment-specific itemsets

If associations are mined with a low minimum support in a dataset showing skewed purchase frequencies, the analyst has to be aware of finding many weakly related cross-support itemsets (Hui et al. 2006). This results from grouping customers with similar interest in purchasing certain categories of the assortment (e.g., customers of a “baby” cluster disproportionately often buy baby related products). As outlined in Sect. 3.2, these problems can be addressed using the all-confidence value suggested by Omiecinski (2003) to reduce the output of the APRIORI algorithm. We thus filter the frequent itemsets obtained from step 2 accordingly, retaining those *F* frequent itemsets with the highest all-confidence values (and hence length at least two), for a suitable value of *F*. The remaining frequent itemsets are then transferred to the proposed optimization model, generating a list of single categories which should be promoted within the corresponding customer segment.

To select only the most valuable items for customized marketing, we use the generalized PROFSET model introduced by Brijs et al. (2004), which determines the most profitable categories based on their profit lift into frequent itemsets of interest, by solving an all-binary optimization problem. The resulting categories imply a high monetary value and the ability to initiate cross-selling in the respective customer segment.

*J*categories to select from, and \(\mathcal {F}\) for the frequent itemsets of interest based on these categories (in our approach, \(\mathcal {F}\) consists of the

*F*frequent itemsets with the highest all-confidence values). Write \(Q_j\) for the binary variable indicating the selection of category

*j*(i.e., \(Q_j = 1\) if

*j*is selected, and zero otherwise). The PROFSET optimization selects the \(\Phi\) best categories by solving

*V*(

*A*) is the “value” (profit margin) of the itemset

*A*which is obtained by suitably aggregating the values of the transactions containing it.

*SP*(

*j*) and

*PP*(

*j*) the sales and purchase prices, respectively, and \(f(j, x_n)\) the number of times

*j*was purchased in transaction \(x_n\). When aggregating the transaction values into itemset values, care must be taken to avoid that transactions contribute to several itemsets (e.g., all itemsets they contain). The original PROFSET approach thus takes

*V*(

*A*) as the sum of the \(v(x_n)\) over all transactions exactly matching

*A*(i.e., featuring purchases of

*exactly*the categories in

*A*). When employing only

*frequent*itemsets, this may result in excluding many transactions in the value aggregation, and hence under-estimating the actual values (see Section 3.3 in Brijs et al. (2004)). This effect is particularly relevant in our approach which is based on using relatively small numbers of frequent itemsets with interesting cross-category associations as measured by all-confidence.

We thus generalize the PROFSET model as follows: For each transaction under consideration, we determine all frequent itemsets in \(\mathcal {F}\) it contains, and distribute the value of the transaction to the values of these itemsets weighted according to the support values of the itemsets, either by direct distribution, or alternatively by randomly selecting the itemsets according to these weights. For example, if a transaction contains exactly the two frequent itemsets \(\{vegetables, water\}\) with support 0.02 and \(\{bottled\ beer, water\}\) with support 0.07, then the weight of the first itemset is \(0.02 / (0.02 + 0.07) = 2 / 9\), and the weight of the second is 7/9.

Employing the PROFSET approach requires the specification of a pre-defined number \(\Phi\) of categories to be selected. Because both marketing budgets and advertising spaces are scarce resources in both on- and offline environments, loyalty program managers tend to be rather interested in focusing their target marketing efforts on a selected few product categories than dealing with a multitude of interrelated itemsets. After using e.g. a branch-and-bound algorithm to solve the all-binary PROFSET optimization problem, the solution determines \(\Phi\) variables, which point to the categories to be selected for maximizing the objective function and subsequently used in target marketing actions.

## 4 Empirical application

The application of the proposed framework to derive recommendations for segment-specific, category level targeted promotions is demonstrated below using a real-world data set. We obtained the data from a major grocery retailer who prefers to stay anonymous. The data set contains transactions realized by members of the loyalty program offered by the focal retailer. Using a simple scenario-based analysis, we illustrate and discuss the benefits and comparative effectiveness of our proposed data-driven target marketing approach vis-a-vis undifferentiated standard promotions.

### 4.1 Data description

To illustrate the practical application of our proposed procedure, we drew two disjoint samples from the transaction data base, each containing the transactions of 3,000 randomly selected households. The first sample is used for selecting an appropriate KCCA segmentation model, i.e., to determine an appropriate value for *K* and the centroids \(P_K\) of the corresponding constrained cluster solution. The second sample is used for performance evaluation, using months 1–10 to update the KCCA segmentation obtained from sample 1 by performing KCCA with the chosen *K* and the centroids initialized with \(P_K\), and using months 11 and 12 as the hold-out sample for the profitability scenario-based analysis.

In grocery retailing it is typical that households have varying lengths of buying histories. On average, households in sample 1 made around 26 transactions with a basket size of six categories in the observation period from the LFC. To robustify the selection of an appropriate value for *K*, we only consider households with buying sequences that are sufficiently long but not extremely long. Specifically, we exclude those households with the smallest 20 % and largest 5 % numbers of transactions, leaving 2,250 households in sample 1 to use for selecting an appropriate KCCA model.

### 4.2 Identifying household segments and extracting itemsets for targeting

To give an illustrative example of the cross-category purchase interrelationships resulting from the proposed algorithm, Fig. 5 depicts the results for two of the clusters generated from the second sample obtained for the segments \(k=8\) and \(k=1\). The different peaks of light-gray bars on the left-hand side in Fig. 5 indicate that the households in both groups show interests in quite different itemsets. To match these peaks to the corresponding categories the ten most frequently purchased itemsets in the corresponding segments are investigated (cf. Fig. 5, right-hand side). The households in segment \(k = 8\) (the “baby” cluster) seem to focus on baby food and baby care categories since these products are purchased at a substantially higher than average rate. A typical household being represented by the baby cluster buys baby hygiene products with a probability of 35.24 % and adds baby food in glass with a probability of 22.82 % and baby food mush/powder with a probability of 17.62 %. The households in segment \(k=1\) (the “wine” cluster) combine different kinds of—in particular—wine or other alcoholic beverages. In contrast to its overall purchase probability of 3.86 %, red/rosé wines occur at a rate about ten times higher than average in a basket purchased by a wine cluster household (red/rosé wine’s group-specific purchase frequency is 32.21 %).

Recommended categories for segment-specific targeting and categories selected for standardized promotion (with corresponding segment-specific or global relative purchase frequencies in brackets)

Segment 1 | Segment 3 | Segment 8 | Standardized |
---|---|---|---|

Wine | Health food | Baby | Promotion |

Red/rosé wines (32 %) | Organic prod. (42 %) | Baby hygiene prod. (35 %) | Bottled beer (18 %) |

White wines (23 %) | Wholemeal prod. (14 %) | Baby food jars (26 %) | Delicatessen (29 %) |

Sparkling wine (12 %) | Organic beef (5 %) | Baby food powder (18 %) | Soft drinks (31 %) |

Beef (12 %) | Frozen ice cream (10 %) | Frozen ice cream (12 %) | Vegetables (48 %) |

In comparison, the last column in Table 1 lists the top-four “bestselling” (in terms of generated sales values for the same households and observation period under study) categories from the set of HFCs. While these categories were excluded from our segmentation and subsequent frequent itemset mining procedure, they would represent promising candidates for an undifferentiated, traffic-building promotional campaign. Next we further explore the potential performance of a segment-specific targeting approach against a standardized campaign.

### 4.3 Scenario analysis to evaluate profitability implications

After performing the stepwise procedure illustrated above, managers are provided with (a) a set of household segments, corresponding centroids and household assignments to segments as well as (b) for each segment an itemset including the categories recommended for target marketing actions. Now, suppose that a loyalty program manager considers to launch a targeted promotions campaign for the previously identified household segments. Of course, an evaluation procedure of first choice would be to run a series of (randomized) field experiments and to compare the effectiveness of targeted promotions relative to an undifferentiated approach (or doing nothing). Because we do not have access to such experimental data, we discuss some basic and preliminary considerations from a managerial perspective. In fact, prior to costly experimentation both analysts and—more importantly—managers typically wish to gain some initial notions on the prospective chances of success of such an approach and for which segments targeting is most likely to pay off. In doing a preliminary feasibility study, we next adopt a simple scenario-based evaluation of such a strategy by making some assumptions based on prior empirical findings.

#### 4.3.1 Scenario settings

The scenario analysis is pursued by estimating the profit margin generated with the itemsets recommended for the different campaigns (targeted vs. standard) using the empirical basket data set at hand. For this purpose we reutilize the transaction data included in the second sample used to illustrate the empirical application of our approach. Note that the households’ transactions for months 1–10 were used to determine segment memberships and for deriving itemset recommendations. This data is now also used to calculate the expected profit margin for the following two months, which serve as a hold-out period. As profit margins are available on all categories, it is possible to determine the profits realized by the focal retailer with a certain itemset. This profit simulation is done using all 268 categories and all 3,000 households in the second sample.

Since we extend the cluster membership of a household determined for the calibration sample to the hold-out period, we also check whether the cluster size coincides relative to the number of included households (cf. Fig. 4, light gray bars) and the generated percentage profit gains (cf. Fig. 4, white bars) for months 11–12. Despite some smaller deviations, the three values obtained mostly correspond to each other. Thus we conclude that the size of the cluster approximately determines the profits generated by the households of the corresponding segments.

We assume that the retailer at hand considers to conduct a segment-specific promotional campaign for a predefined set of four categories per segment (see the examples in Table 1) in the hold-out period. For example, such a campaign could be effected by distributing targeted coupons among the household members for each segment. With targeted coupons, customers typically can earn a discount of a certain monetary amount (or a percentage value equivalent) if they bought at least one product in the mentioned category within a predefined period of time (Kalwani and Yim 1992; DelVecchio et al. 2007). For evaluating the expected effectiveness of such a targeted marketing campaign we use a standardized promotion campaign as a benchmark (i.e., promoting categories with the highest revenue in months 1–10; see the HFC categories listed in the last column of Table 1). To compare the expected profits resulting from both campaign-types we use the expected gains in profit margins for the baskets along the complete set of categories (i.e., the 52 HFC with the previously identified 216 LFC). Following Brijs et al. 2004 we only consider direct product costs and ignore category handling and inventory costs for sake of simplicity. Furthermore, we also do not consider any potential costs to implement a targeted marketing strategy.

To evaluate the relative performance of the two campaign-types in terms of profitability, their respective expected profit increases are calculated in months 11–12 conditional on the selected promotional campaign. Next we add up all the profit margins potentially accruing from two scenarios which we define according to Table 2 and discuss further below. Due to cross-category associations we also have to consider the purchase interrelationships between the promoted itemsets and the rest of the assortment in step 3 of the proposed framework. Therefore, after mining all association rules with a minimum support of 2 % and a length of two (i.e., the major associations) the profits are multiplied with the confidence value of the corresponding rule, revealing the expected indirectly affected accumulated profits generated by selling the promoted categories. Finally, we estimate the percentage profit gain which is expected to be achieved by the corresponding promotion campaign compared to the real profit achieved in month 11–12. Thus, we implicitly assume stationary marketing activities in both the calibration and hold-out periods and do not account for any seasonal or stock-buying effects.

Two scenarios of expected percentage profit growth in response to the corresponding campaigns featuring specific recommended itemsets

Campaign | Setting no. 1 | Setting no. 2 | Profit added to |
---|---|---|---|

Gross-profit growth (%) | Gross-profit growth (%) | ||

Segment-specific promotion | 10.00 | 15.00 | Promoted itemsets |

Cross-merchandising | 6.00 | 10.00 | Associated itemsets |

Standardized promotion | 5.00 | 3.00 | Promoted itemsets |

Standardized promotion | 5.00 | 1.00 | Associated itemsets |

The profit lift values expected for specific combinations of itemsets and campaign types under the two scenarios are included in Table. These values are based on prior empirical findings reported in the relevant marketing literature (e.g., Drèze and Hoch 1998; Zhang and Wedel 2009; Venkatesan and Farris 2012; Sahni et al. 2014) and discussions with domain experts working with the focal retailer. For segment-specific targeted coupons, Drèze and Hoch (1998) report a 25 % increase in the promoted categories after the program has been running for six months and taking costs into account. The overall profitability of the campaign depends on the length of the coupon’s validity period; this applies even more, if the promoted categories are from the LFCs and average per-basket sales in the these categories usually tend to increase over time (Drèze and Hoch 1998). Thus, we assume a pessimistic value of 10 % for setting no. 1. In contrast to Drèze and Hoch (1998) the promoted categories of our approach are matched to the purchasing behavior of the targeted households taking purchase interrelationships into account. Therefore, a more optimistic profit lift of as much as 15 % in the promoted itemsets is assumed for setting no. 2. In addition, in a second study by Drèze and Hoch (1998) the authors applied cross-merchandising techniques (of the type “save a certain amount on category B products if you purchase category A products”, Drèze and Hoch 1998, see Section 3) and found sales increases in the targeted category ranging from 6 to 10 %. Since the targeted items in cross-merchandising campaigns correspond to the right-hand side categories of rules derived from frequent itemsets, we adopt these values as proxies in our scenario-based analysis (cf. Table 2).

For the standardized promotion campaign, we refer to the meta-analyses by Tellis (1988) and Bijmolt et al. (2005) which summarizes empirical research related to price elasticities on which our scenario assumptions are based on. Since the categories determined by the standardized promotion all come from the grocery domain the projected profit increase of 5 % employed in setting no. 1 is very optimistic. Nevertheless, to avoid overestimating the results from the segment-specific framework, a growth of 3 % is still estimated for the accumulated profits of the promoted categories when the standardized promotion method is applied for the more pessimistic scenario setting no. 2^{3}. It is also possible that the sales of associated itemsets could rise as much as the sales of the promoted categories (5 % in setting no. 1), but in fact, the gain will likely be much smaller. Therefore, we assume as a pessimistic outcome that the profit in the associated itemsets will increase by only 1 %.

#### 4.3.2 Results

The bars in Fig. 6 represent the expected profit margin gains in months eleven and twelve within each household segment derived for the two scenarios in Table 2. For the first scenario, the retailer would expect an overall profit margin gain of 15 %. Figure 6a shows that the segment-specific targeted category level program outperforms the standardized promotion in only four out of the eleven household segments under investigation (in particular the wine and the baby clusters). In other words, only for these segment-specific targeted promotions as derived by our proposed framework using LFCs are the recommended option because of a higher expected gain in profitability compared to the standardized promotion campaign. For segments \(k = \{2, 4, 6, 7, 9, 10, 11\}\) Fig. 6a shows that the profit increase achieved by the standardized promotion will be twice as high as the projected profit lift for the segment-specific case. For these groups, the expected gain in profit by adopting a targeted marketing program are unlikely to compensate for the profit potential not realized by conventional standard promotion techniques. However, segment-specific targeting would still be profitable for 32.2 % of the targeted households, while about two thirds of all households would still need to be addressed with the standardized promotion.

## 5 Discussion and future research

We present and empirically demonstrate the performance of a new approach to support loyalty program managers and direct marketers in customizing their segment-specific target marketing activities on a product category level. Our proposed decision-support framework requires customers’ past purchase histories as input data, builds on state-of-art data mining techniques and integrates an optimization procedure which provides the decision maker with a list of candidate product categories for segment-level targeted promotion campaigns. This list is derived such that the included categories maximize the cross-category spillover effects for non-promoted categories.

There are many occasions in which marketing managers can benefit from such itemset recommendations for target marketing purposes. These include but are not limited to designing segment-specific rewards in loyalty programs, cross-merchandising activities, targeted direct mailings, customized supplements in catalogues, and customized promotions. For example, the latter can be delivered both offline directly in the store (e.g., by issuing customized check-out-coupons as provided by Catalina Marketing services) or in online environments by sending targeted emails or during shopping trips in online stores.

We demonstrate the application of the stepwise procedure using transaction data from a real-world loyalty program offered by an anonymous major grocery retailer. In the scope of our empirical application study we also explored the projected profitability implications of utilizing the derived recommendations for designing segment-specific, category level targeted promotions. A scenario-based simulation study suggests that the adoption of targeted promotions might boost profitability between 15 % and 128 % relative to an undifferentiated standard campaign and that at least for some segments targeting can be the preferred option even under very conservative assumptions on the effectiveness of segment-specific target marketing actions. Of course, a more thorough evaluation of the relative effectiveness of targeted campaigns derived by utilizing our approach would be desirable. Such an evaluation strategy could entail a series of randomized field experiments. The evaluation framework introduced by Wang et al. (2016) offers a promising starting point for endeavors toward this direction, which we leave open for future research.

It also would be beneficial to combine our approach, which is primarily concerned with data compression, with a more predictive approach for modeling customers’ multi-category choice decisions (such as the work by Manchanda et al. 1999; Dippold and Hruschka 2013; Hruschka 2013). Further promising extensions of our approach could also concentrate on making the segmentation approach dynamic in order to account for changes in customers’ purchasing habits. Finally, to accommodate larger numbers of categories or for applications on a sub-category or even item-level the proposed approach needs to be made scalable for very high-dimensional transaction data. Such an attempt requires some type of variable selection or variable weighting; the contributions by Carmone Jr. et al. (1999) and Brusco and Cradit (2001) might be promising candidates to deal with this kind of challenges.

## Footnotes

- 1.
In this paper we focus on category level customized promotions, a marketing instrument frequently used in loyalty programs offered by multi-category retailers (Drèze and Hoch 1998; Osuna et al. 2016; Venkatesan and Farris 2012). However, the research framework presented here could be easily adopted and/or extended to specific brands of products or even to an item-based level.

- 2.
Following the discussion in the previous section, in our empirical illustration, we will only employ the LFCs for identifying household segments, in which case

*J*is the number of LFCs as defined a priori by the analyst. However, note that from a purely technical perspective the proposed procedure is agnostic to any preselection and could also be applied for the complete set of categories. - 3.
Note that the profit estimation of the standardized promotion benefits from the assumption of ignoring costs since it ensures a more conservative calculation of the output of our segment-specific approach.

## Notes

### Acknowledgments

Open access funding provided by Vienna University of Economics and Business (WU).

## References

- Agrawal R, Srikant R (1994) Fast algorithms for mining association rules. Proceedings of the 20th International Conference on Very Large Databases. Santiago, Chile, pp 487–499Google Scholar
- Agrawal R, Imielinski T, Swami A (1993) Mining association rules between sets of items in large databases. In: Proceedings of the 1993 ACM SIGMOD International Conference on Management of Data, Washington, pp 207–216Google Scholar
- Ailawadi KL, Bradlow ET, Draganska M, Nijs V, Rooderkerk RP, Sudhir K, Wilbur KC, Zhang J (2010) Empirical models of manufacturer-retailer interaction: A review and agenda for future research. Market Lett 21(3):273–285CrossRefGoogle Scholar
- Aldenderfer MS, Blashfield RK (1984) Cluster analysis, quantitative applications in the social sciences, vol 44. Sage University Paper, Beverly HillsGoogle Scholar
- Anderson C (2006) The long tail: how endless choice is creating unlimited demand. RH Business Books, LondonGoogle Scholar
- Basu S, Davidson I, Wagstaff K (2008) Constrained clustering: advances in algorithms, theory, and applications. Chapman & HallGoogle Scholar
- Bawa K, Shoemaker RW (1989) Analyzing incremental sales from a direct mail coupon promotion. J Market 53(3):66CrossRefGoogle Scholar
- Bayardo RJ, Agrawal S (1999) Mining the most interesting rules. In: Proceedings of the 5th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp 145–154Google Scholar
- Bell DR, Chiang J, Padmanabhan V (1999) The decomposition of promotional response: an empirical generalization. Market Sci 18(4):504–526CrossRefGoogle Scholar
- Berry J (2013) Bulking up: the 2013 colloquy loyalty census – growth and trends in us loyalty program activity. Colloquy JuneGoogle Scholar
- Bijmolt THA, van Heerde HJ, Pieters RGM (2005) New empirical generalizations on the determinants of price elasticity. J Market Res 42(2):141–156CrossRefGoogle Scholar
- Bock HH (1999) Clustering and neural network approaches. In: Gaul W, Locarek-Junge H (eds) Classification in the information age, Proceedings of the 22nd Annual Conference of the Gesellschaft für Klassifikation e.V., Springer, Heidelberg, Germany, pp 42–57Google Scholar
- Bodapati A, Gupta S (2004) A direct approach to predicting discretized response in target marketing. J Market Res 41(1):73–85CrossRefGoogle Scholar
- Bodapati AV (2008) Recommendation systems with purchase data. J Market Res 45(1):77–93CrossRefGoogle Scholar
- Boztug Y, Reutterer T (2008) A combined approach for segment-specific analysis of market basket data. EJOR Euro J Operat Res 187(1):294–312CrossRefGoogle Scholar
- Brijs T, Swinnen G, Vanhoof K, Wets G (2004) Building an association rules framework to improve product assortment decisions. Data Mining Know Dis 8(1):7–23CrossRefGoogle Scholar
- Brusco MJ, Cradit DJ (2001) A variable-selection heuristic for k-means clustering. Psychometrika 66(2):249–270CrossRefGoogle Scholar
- Carmone F Jr, Kara A, Maxwell S (1999) A new model to improve market segment definition by identifying noisy variables. J Market Res 36(4):501–509CrossRefGoogle Scholar
- Chen Y, Hess JD, Wilcox RT, Zhang ZJ (1999) Accounting profits versus marketing profits: a relevant metric for category management. Market Sci 18(3):208–229CrossRefGoogle Scholar
- DelVecchio D, Krishnan HS, Smith DC (2007) Cents or percent? the effects of promotion framing on price expectations and choice. J Market 71(3):158–170CrossRefGoogle Scholar
- Dimitriadou E, Dolnicar S, Weingessel A (2002) An examination of indexes for determining the number of clusters in binary data sets. Psychometrika 67(1):137–160CrossRefGoogle Scholar
- Dippold K, Hruschka H (2013) A model of heterogeneous multicategory choice for market basket analysis. Rev Market Sci 11(1):1–31CrossRefGoogle Scholar
- Drèze X, Hoch SJ (1998) Exploiting the installed base using cross-merchandising and category destination programs. Int J Res Market 15(5):459–471CrossRefGoogle Scholar
- Elberse A (2008) Should you invest in the long tail? Harvard Business Review 86(7/8):88–96
**(hBS Centennial Issue)**Google Scholar - Ferguson R, Hlavinka K (2007) The COLLOQUY loyalty marketing census: sizing up the us loyalty marketing industry. J Consum Market 24(5):313–321CrossRefGoogle Scholar
- Gordon AD, Vichi M (1998) Partitions of partitions. J Class 15(2):265–285CrossRefGoogle Scholar
- Hahsler M, Hornik K, Reutterer T (2006) Implications of probabilistic data modeling for mining association rules. In: From Data and Information Analysis to Knowledge Engineering (Proceedings of the 29th Annual Conference of the Gesellschaft für Klassifikation e.V., University of Magdeburg, March 9–11, 2005), Springer-Verlag, Heidelberg, Studies in Classification, Data Analysis, and Knowledge Organization, pp 598–605Google Scholar
- Hettich S, Hippner H (2001) Assoziationsanalyse. In: Hippner H, Küsters UL, Meyer M, Wilde K (eds) Handbuch data mining im marketing—knowledge discovering in marketing databases. Viewag, Wiesbaden, pp 427–463Google Scholar
- Hornik K (2005) A CLUE for CLUster ensembles. J Stat Software 14(12):1–25CrossRefGoogle Scholar
- Hruschka H (1991) Bestimmung der Kaufverbundenheit mit Hilfe eines probalistischen Messmodells. Zeitschrift für betriebswirtschaftliche Forschung 43(5):418–434Google Scholar
- Hruschka H (2013) Comparing small-and large-scale models of multicategory buying behavior. J Forecast 32(5):423–434CrossRefGoogle Scholar
- Hui X, Tan PN, Kumar V (2006) Hyperclique pattern discovery. Data Mining Know Dis 13(2):219–242CrossRefGoogle Scholar
- Humby C, Hunt T, Phillips T (2004) Scoring points: How Tesco is winning customer loyalty. Kogan Page PublishersGoogle Scholar
- Kalwani MU, Yim CK (1992) Consumer price and promotion expectations: an experimental study. J Market Res (JMR) 29(1):90–100CrossRefGoogle Scholar
- Kamakura WA (2012) Sequential market basket analysis. Market Lett 23(3):505–516CrossRefGoogle Scholar
- Kaufman L, Rousseeuw PJ (2005) Finding groups in data: an introduction to cluster analysis. WileyGoogle Scholar
- Khan R, Lewis M, Singh V (2009) Dynamic customer management and the value of one-to-one marketing. Market Sci 28(6):1063–1079CrossRefGoogle Scholar
- Kivetz R, Simonson I (2003) The idiosyncratic fit heuristic: effort advantage as a determinant of consumer response to loyalty programs. J Market Res 40(4):454–467CrossRefGoogle Scholar
- Kumar V, Shah D (2004) Building and sustaining profitable customer loyalty for the 21st century. J Retail 80(4):317–329CrossRefGoogle Scholar
- Kwak K, Duvvuri SD, Russell GJ (2015) An analysis of assortment choice in grocery retailing. J Retail 91(1):19–33CrossRefGoogle Scholar
- Leisch F (2006) A toolbox for k-centroids cluster analysis. Comp Stat Data Anal 51(2):526–544CrossRefGoogle Scholar
- Leisch F, Grün B (2006) Extending standard cluster algorithms to allow for group constraints, Proceedings in Computational Statistics. In: Rizzi A, Vichi M (eds) COMPSTAT 2006. Physica-Verlag, Heidelberg, pp 885–892Google Scholar
- Liu Y (2007) The long-term impact of loyalty programs on consumer purchase behavior and loyalty. J Market 71(4):19–35CrossRefGoogle Scholar
- Liu Y, Yang R (2009) Competing loyalty programs: Impact of market saturation, market share, and category expandability. J Market 73(1):93–108CrossRefGoogle Scholar
- Manchanda P, Ansari A, Gupta S (1999) The “shopping basket”: a model for multicategory purchase incidence decisions. Market Sci 18(2):95–114CrossRefGoogle Scholar
- Mannila H (1997) Methods and problems in data mining. In: Afrati FN, Kolaitis PG (eds) Database Theory — ICDT ’97, 6th International Conference, Delphi, Greece, January 8–10, 1997, Proceedings, Springer, Lecture Notes in Computer Science, vol 1186, pp 41–55Google Scholar
- McQueen J (1967) Some methods for classification and analysis of multivariate observations. Proceedings of the Fifth Berkeley Symposium on Mathematical Statistics and Probability. University of California Press, Berkeley, CA, USA, pp 281–297Google Scholar
- Milligan GW, Cooper MC (1985) An examination of procedures for determining the number of clusters in a data set. Psychometrika 50(2):159–179CrossRefGoogle Scholar
- Musalem A, Bradlow ET, Raju JS (2008) Who’s got the coupon? Estimating consumer preferences and coupon usage from aggregate information. J Market Res 45(6):715–730CrossRefGoogle Scholar
- Omiecinski E (2003) Alternative interest measures for mining associations in databases. IEEE Trans Know Data Eng 15(1):57–69CrossRefGoogle Scholar
- Osuna I, González J, Capizzani M (2016) Which categories and brands to promote with targeted coupons to reward and to develop customers in supermarkets. J RetailGoogle Scholar
- Pancras J, Sudhir K (2007) Optimal marketing strategies for a customer data intermediary. J Market Res 44(4):560–578CrossRefGoogle Scholar
- Rand WM (1971) Objective criteria for the evaluation of clustering methods. J Am Stat Assoc 66(336):846–850CrossRefGoogle Scholar
- Reutterer T, Mild A, Natter M, Taudes A (2006) A dynamic segmentation approach for targeting and customizing direct marketing campaigns. J Inter Market 20(3–4):43–57Google Scholar
- Reutterer T, Hahsler M, Hornik K (2007) Data Mining und Marketing am Beispiel der explorativen Warenkorbanalyse. Marketing: Zeitschrift für Forschung und. Praxis 29(3):163–179Google Scholar
- Rossi PE, McCulloch RE, Allenby GM (1996) The value of purchase history data in target marketing. Market Sci 15(4):321–340CrossRefGoogle Scholar
- Rowley J (2005) Building brand webs: customer relationship management through the tesco clubcard loyalty scheme. Int J Retail Dist Manage 33(3):194–206CrossRefGoogle Scholar
- Russell GJ, Petersen A (2000) Analysis of cross category dependence in market basket selection. J Retail 76(3):367–392CrossRefGoogle Scholar
- Sahni N, Zou D, Chintagunta PK (2014) Effects of targeted promotions: Evidence from field experiments. Available at SSRN 2530290Google Scholar
- Shaffer G, Zhang ZJ (1995) Competitive coupon targeting. Market Sci 14(4):395–416CrossRefGoogle Scholar
- Shoemaker RW, Tibrewala V (1985) Relating coupon redemption rates to past purchasing of the brand. J Adv Res 25(5):40–47Google Scholar
- Shugan SM (2005) Brand loyalty programs: are they shams? Market Sci 24(2):185–193CrossRefGoogle Scholar
- Singh SS, Jain DC, Krishnan TV (2008) Research note - customer loyalty programs: are they profitable? Manage Sci 54(6):1205–1211CrossRefGoogle Scholar
- Tellis GJ (1988) The price elasticity of selective demand: a meta-analysis of econometric models of sales. J Market Res 24(4):331–341CrossRefGoogle Scholar
- Thorndike RL (1953) Who belongs in the family? Psychometrika 18(4):267–276CrossRefGoogle Scholar
- Venkatesan R, Farris PW (2012) Measuring and managing returns from retailer-customized coupon campaigns. J Market 76(1):76–94CrossRefGoogle Scholar
- Wagstaff K, Cardie C, Rogers S, Schroedl S (2001) Constrained \(k\)-means clustering with background knowledge. In: Proceedings of the International Conference on Machine Learning (ICML), pp 577–584Google Scholar
- Wang Y, Lewis M, Cryder C, Sprigg J (2016) Enduring effects of goal achievement and failure within customer loyalty programs: A large-scale field experiment. Market Sci. doi: 10.1287/mksc.2015.0966, URL 10.1287/mksc.2015.0966, to appear
- Zaki MJ, Parthasarathy S, Ogihara M, Li W (1997) New algorithms for fast discovery of association rules. In: KDD, pp 283–286Google Scholar
- Zhang J, Breugelmans E (2012) The impact of an item-based loyalty program on consumer purchase behavior. J Market Res 49(1):50–65CrossRefGoogle Scholar
- Zhang J, Krishnamurthi L (2004) Customizing promotions in online stores. Market Sci 23(4):561–578CrossRefGoogle Scholar
- Zhang J, Wedel M (2009) The effectiveness of customized promotions in online and offline stores. J Market Res 46(2):190–206CrossRefGoogle Scholar

## Copyright information

**Open Access**This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.