## Abstract

Prices for grocery items differ across stores and time because of promotion periods. Consumers therefore have an incentive to search for the lowest prices. However, when a product is purchased infrequently, the effort to check the price every shopping trip might outweigh the benefit of spending less. I propose a structural model for storable goods that takes into account inventory holdings and search. The model is estimated using data on laundry detergent purchases. I find search costs play a large role in explaining purchase behavior, with consumers unaware of the price of detergent on 70 % of their shopping trips. Therefore, from the retailer’s point of view raising awareness of a promotion through advertising and displays is important. I also find a promotion for a particular product increases the consumer’s incentive to search. This change in incentives leads to an increase in category traffic, which from the store manager’s perspective is a desirable side effect of the promotion.

## Notes

Other papers that estimate the magnitude of search costs include Hortaçsu and Syverson (2004) for the mutual fund industry, Hong and Shum (2006) and Santos et al. (2011) for online book purchases, and Honka (2012) for the car insurance market. Mehta et al. (2003) estimate search costs for grocery shopping items but do not allow consumers to keep an inventory.

Supermarkets do engage to some extent in price flexing, that is, adjusting prices to local conditions, but only for a small subset of products (according to the UK Competition Commission), and this subset does not seem to include laundry detergents.

I also allow liquid and powder detergent purchases within this residual category but code them at the same price and pack size for simplicity.

The pack sizes of the different brands are not exactly equal to 900 g and 1.9 kg. However, only small differences are present, and I therefore ignore these differences when looking at price variation across brands.

To be precise, another condition is the absence of income effects at the trip level. A budget-constrained household might delay its detergent purchase because it needs to buy more urgent products. Even though these products are completely unrelated in consumption, demand would be correlated between them. I looked at whether low-income households behave differently and checked for changes in behavior over the duration of a month (due to monthly wage payments). Doing so, I found no evidence of the presence of budget constraints at the trip level.

The only clearly complementary product is softener, which I therefore exclude from the analysis both in this section and in the structural estimation.

There is no information on the mode of transport for each shopping trip in the data; it is therefore not possible to test this hypothesis at the trip level. I also conducted some further analysis using the distance to each supermarket. Distance did not seem to matter for the purchase probability, which also speaks against the importance of transport costs for the purchase decision regarding laundry detergent.

However, the correlation could also arise if consumers are perfectly informed about prices but they have to incur a cost to visit certain parts of the store. This situation is less relevant for the data I am using, as in the UK, little price information is available prior to visiting the store.

Note that the composition of the whole shopping basket is itself part of a larger optimization problem that the consumer has to solve. In other words, the decision to search and purchase *any* product is part of the consumer’s decision-making process subject to certain constraints. Therefore, one could trace back the reasons why search costs for detergent vary as a function of the shopping basket composition to more basic underlying primitives. I outline such a model verbally in Section A.2 of the Appendix. In the empirical model, I will treat the basket composition as exogenous to the search and purchase decision regarding detergent. This simplification is necessary in order to make the model tractable.

Note that the use of the max-operator in the above equation constitutes a slight abuse of notation. The consumer will make a choice in the purchase stage that maximizes the present discounted value and not the flow utility. The maximization is therefore with respect to the choice-specific value function \(v_{ps,t}\) rather than \(u_{ps,t}\).

This assumption relies on an institutional feature of the supermarket sector in the UK: the almost complete absence of feature advertising. Consumers therefore cannot gather price information before going to the supermarket, and instead have to engage in search within the store.

This is similar to the situation of a static demand model without an outside option. In that kind of model, estimating a constant term in the utility function is not possible. For the same reasons, \(v(c(i_{t}))\) cannot be identified.

I divide expenditure on each trip by the average household-specific expenditure level. The variable therefore only captures household-specific expenditure variation over time.

I am working under the assumption that the three search-cost shifters do indeed represent different levels of search costs rather than transport costs. I addressed this issue in Section 3.1 and am not able to directly test it within the structural model.

To fully avoid any concerns regarding this type of bias, one would also have to model inventory as a function of the products in stock, which the model presented here does not do. Erdem et al. (2003) deal with this issue by including quality-adjusted inventory as a second state variable.

Note that the three search-cost shifters only take positive values, and all enter the logit term with a negative sign. The maximum value the logit term (not scaled by \(\widetilde{s}\)) can take is therefore 0.5.

The only other difference is that the model without search is estimated without allowing for heterogeneity in the price coefficient. I allow for different coefficients in all other terms for two types of consumers (as in the baseline model with search). This approach is taken for the following reason: When allowing for heterogeneity in the price coefficient, I obtain an extremely large price coefficient for one type. This type of consumer is predicted to almost never make a purchase (on fewer than 1 % of his shopping trips). Because this simple validation exercise predicts such implausible behavior for one type (the predictions for the second type do not exhibit such patterns), I decided to restrict the price coefficient. The restriction I impose on the model without search is a drawback for the validation exercise. But because the predictions from the model without search and with heterogeneity in the price coefficient are even worse, restricting heterogeneity stacks the cards in favor of the model without search.

See Table 12 for more details.

In principle, one way of fixing the poor fit along the pack-size dimension would be to add a set of pack-size dummies to the utility function. The stance this paper takes is that underlying structural parameters, such as search and storage costs, should in principle fully explain the choice of pack size. Employing pack-size dummies would therefore constitute a somewhat reduced-form way of fixing the poor pack-size fit of the model. For this reason, I do not include pack-size fixed effects in the model. On the upside, the model with search does make some progress toward improving the predictions of the model while relying on a fully structural specification of consumer utility. Furthermore, including a set of pack-size fixed effects would also cause problems for identification. See Section A.9 of the Appendix for more details.

## References

Bell, D.R., Iyer, G., Padmanabhan, V. (2002). Price competition under stockpiling and flexible consumption. *Journal of Marketing Research, 39*(3), 292–303.

Ching, A., Erdem, T., Keane, M. (2009). The price consideration model of brand choice. *Journal of Applied Econometrics, 24*, 393–420.

Erdem, T., Imai, S., Keane, M.P. (2003). Brand and quantity choice dynamics under price uncertainty. *Quantitative Marketing and Economics, 1*, 5–64.

Erdem, T., Katz, M.L., Sun, B. (2010). A simple test for distinguishing between internal reference price theories. *Quantitative Marketing and Economics, 8*, 303–332.

Goeree, M.S. (2008). Limited information and advertising in the US personal computer industry. *Econometrica, 76*(5), 1017–1074.

Gowrisankaran, G., & Rysman, M. (2009). Dynamics of consumer demand for new durable goods. NBER Working Paper 14737.

Heckman, J., & Singer, B. (1984). A method for minimizing the impact of distributional assumptions in econometric models for duration data. *Econometrica, 52*(2), 271–320.

Hendel, I., & Nevo, A. (2006). Measuring the implications of sales and consumer inventory behavior. *Econometrica, 74*(6), 1637–1673.

Hong, H., & Shum, M. (2006). Using price distributions to estimate search costs. *RAND Journal of Economics, 37*(2), 257–275.

Honka, E. (2012). Quantifying search and switching costs in the U.S. auto insurance industry (unpublished manuscript).

Hortaçsu, A., & Syverson, C. (2004). Search costs, product differentiation, and welfare effects of entry: a case study of S&P 500 index funds. *The Quarterly Journal of Economics, 119*(4), 403–456.

Kamakura, W.A., & Russell, G.J. (1989). A probabilistic choice model for market segmentation and elasticity structure. *Journal of Marketing Research, 26*(4), 379–390.

Kim, J.B., Albuquerque, P., Bronnenberg, B.J. (2010). Online demand under limited consumer search. *Marketing Science, 29*(6), 1001–1023.

Koulayev, S. (2010). Estimating demand in online search markets, with application to hotel bookings (unpublished manuscript).

Manski, C. (2004). Measuring expectations. *Econometrica, 72*(5), 1329–1376.

Mehta, N., Rajiv, S., Srinivasan, K. (2003). Price uncertainty and consumer search: a structural model of consideration set formation. *Marketing Science, 22*(1), 58–84.

Melnikov, O. (2001). Demand for differentiated durable products: the case of the U.S. computer printer market (unpublished manuscript).

Moraga-Gonzalez, J.L., Sandor, Z., Wildenbeest, M.R. (2011). Consumer search and prices in the automobile market (unpublished manuscript).

Osborne, M. (2011). Consumer learning, switching costs and heterogeneity: a structural examination. *Quantitative Marketing and Economics, 9*(1), 25–70.

Rust, J. (1987). Optimal replacement of GMC bus engines: an empirical model of Harold Zurcher. *Econometrica, 55*(5), 999–1033.

Rust, J. (1994). Structural estimation of Markov decision processes. In R.F. Engle, & D.L. McFadden (Eds.), *Handbook of econometrics* (Vol. 4, Chap. 51, pp. 3082–3143). Elsevier Science.

Santos, B.I.D.L., Hortacsu, A., Wildenbeest, M. (2011). Testing models of consumer search using data on web browsing and purchasing behavior. *American Economic Review* (forthcoming).

Schiraldi, P. (2011). Automobile replacement: a dynamic structural approach. *RAND Journal of Economics, 42*(2), 266–291.

Sun, B., Neslin, S.A., Srinivasan, K. (2003). Measuring the impact of promotions on brand switching when consumers are forward looking. *Journal of Marketing Research, 11*, 389–405.

Warner, E.J., & Barsky, R.B. (1995). The timing and magnitude of retail store markdown: evidence from weekends and holidays. *Quarterly Journal of Economics, 110*(2), 321–352.

## Acknowledgements

I would like to thank my advisors John Van Reenen and Pasquale Schiraldi for their invaluable guidance and advice. I am also grateful to Michaela Draganska, Alan Sorenson and Tat Chan, who discussed the paper, for great feedback, as well as to participants at various conferences and seminar participants at the London School of Economics, the Institute for Fiscal Studies, Frankfurt, CREST (Paris), Stanford, UCLA, Rochester, Washington University in St. Louis, Carnegie Mellon, Chicago, San Diego, Zurich, Tilburg and Northwestern. I would also like to thank Rachel Griffith at the Institute for Fiscal Studies for great help with the data and detailed discussions, as well as Pedro Gardete, Joachim Groeger, Wes Hartmann, Guenter Hitsch, Claire LeLarge, Fabio Pinna, Peter Rossi, Thomas Schelkle, Philipp Schmidt-Dengler and two anonymous referees for helpful comments. Any remaining errors are my own. A previous version of this paper was circulated under the title “A Dynamic Model with Consideration Set Formation”.

## Appendices

### Appendix A

### A.1 Household selection

When selecting the households that are included in the estimation, I apply several criteria (all described in the main text). This section provides some further justification for the selection criteria and provides details about how the sample size was affected. The full dataset contains about 40,000 households; the final sample used in the estimation comprises 686 households.

In a first step, I eliminate all households that were in the sample for fewer than 20 weeks, because information from “uncommitted” consumers who spend only a short period of time in the panel might be less reliable. This exclusion reduces the sample to roughly 31,000 households. I then eliminate all households that bought less than 6 kg of detergent per year and households that did not purchase any detergent for a period of at least 16 weeks. These exclusions eliminate households that have extremely low consumption rates and possibly visit a launderette some of the time. In the sample, 90 % of households buy between 10 and 35 kg, the mean being 20 kg. This finding suggests that purchases below 6 kg constitute unusual behavior that the model will not be able to capture. Similarly, a large gap in purchases might be due to the household going on holiday, which also constitutes unusual consumption behavior that the model cannot capture. These two criteria decrease the sample size to 12,000 households. Next, I eliminate all households that bought detergent tablets less than 75 % of the time. This elimination removes households that primarily purchased other types of detergent, such as powder or liquid detergent. This step leaves me with about 2,000 households. Finally, I use only households that bought one of the five brands for which I construct price series at least 75 % of the time. As a result, 686 households fulfill all criteria.
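The sequence of filters described above can be illustrated with a short Python sketch. It is not the code used for the paper: the `Household` fields and the helper `select_households` are hypothetical names chosen for illustration, and only the thresholds (20 weeks, 6 kg per year, 16 weeks, 75 %) are taken from the text.

```python
from dataclasses import dataclass


@dataclass
class Household:
    # All field names are hypothetical; the layout of the raw data is not described.
    weeks_in_panel: int
    kg_per_year: float
    longest_gap_weeks: int    # longest spell without any detergent purchase
    tablet_share: float       # share of detergent purchases that were tablets
    top5_brand_share: float   # share of purchases among the five priced brands


# Selection steps in the order given in the text, with the stated thresholds.
CRITERIA = [
    ("in panel >= 20 weeks", lambda h: h.weeks_in_panel >= 20),
    ("buys >= 6 kg/year and no gap >= 16 weeks",
     lambda h: h.kg_per_year >= 6 and h.longest_gap_weeks < 16),
    ("tablets >= 75% of purchases", lambda h: h.tablet_share >= 0.75),
    ("five priced brands >= 75% of purchases",
     lambda h: h.top5_brand_share >= 0.75),
]


def select_households(households):
    """Apply each criterion in turn and record the remaining sample size."""
    remaining = list(households)
    sizes = [("full sample", len(remaining))]
    for label, keep in CRITERIA:
        remaining = [h for h in remaining if keep(h)]
        sizes.append((label, len(remaining)))
    return remaining, sizes
```

Recording the sample size after every step mirrors the way the text reports the attrition from the full dataset down to the estimation sample.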

Arguably, some of the criteria applied are quite conservative. For example, one might try to eliminate longer periods without purchases but still keep most of the time series for a particular household. I do not make such an attempt here, as I end up with a fairly large sample of households compared to other papers in the literature (e.g., Hendel and Nevo 2006). I also have a much longer time series of purchases (6 years) than what is available in other datasets used to analyze demand for similar products. Therefore, very little is lost by eliminating households in a conservative way if one is doubtful about irregularities in their behavior. Also, other papers such as Osborne (2011) take a random sample of all households to reduce the computational burden in the estimation. Instead, I prefer to apply the conservative criteria of elimination outlined above.

### A.2 Outline of a model of shopping basket size and composition choice

Section 3.1 of the paper shows that the size and composition of the shopping basket influence the purchase probability for detergent. Of course, the size and composition of the whole shopping basket is not truly exogenous, but itself part of a larger optimization problem that the consumer has to solve. In other words, the decision to search and purchase *any* product is part of the consumer’s decision-making process subject to certain constraints. Therefore, one could trace back the reasons why search costs for detergent vary as a function of the shopping basket composition to more basic underlying primitives. To do so, one can think of a model in which the consumer decides whether to search for price information across *all* products in the supermarket (and possibly decides to make a purchase). Although I do not attempt to formally derive (or estimate) such a model, sketching out the trade-offs inherent in such a model is still instructive.

Assume the consumer enters the store with a certain need for various products. The inventory of each product he holds at home and his future consumption needs determine the immediacy of the need. Also, time spent in the store is costly for the consumer. Even without any formal derivations, one can intuitively think of the type of predictions that can be obtained: (1) When the opportunity cost of time is high, a consumer will only want to stock up on the most necessary items. He will therefore purchase fewer items on such a trip and engage in less search. Furthermore, he is more likely to purchase perishable goods, which need to be stocked up on more frequently, rather than durable products such as laundry detergent. (2) Assume the consumer has already searched for a product in a particular product category. The marginal cost of searching for a further product within the category is lower than for another product in a different category, due to products typically being arranged by category. This type of cost saving in the time spent searching gives consumers an incentive to lump together purchases within a category on a particular trip.

The mechanisms just described will create the type of correlation between shopping basket size and composition and the purchase probabilities for detergent reported in Section 3.1. The variation in the shopping basket composition, although being an outcome of the consumer’s decision problem, will therefore reflect differences in the search cost for detergent, caused by the underlying variation in the time constraint and the consumption needs across various products. In the empirical model, I will treat the basket composition as exogenous to the search and purchase decision regarding detergent. This simplification is necessary in order to make the model tractable.

### A.3 Selection of trips in Section 3.2 (consumers missing promotions)

To compute the percentages presented in the table, I first have to define a promotion. I assume (as described in Section 2.4) the 75th percentile of the price distribution of each brand at a particular supermarket to be the regular price. As promotions are infrequent and because the regular price varies little over time, this definition is appropriate. The 75th percentile will always lie outside of the promotion range of the price distribution. I define a promotion as a price that is at least 20 % below the regular price.
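The promotion definition can be written down compactly. The following Python sketch is illustrative only: the function names are hypothetical, and the nearest-rank percentile convention is an assumption, because the text does not say which percentile convention was applied.

```python
import math


def regular_price(prices):
    """75th percentile of a brand's observed prices at one supermarket.

    Uses the nearest-rank percentile, which is an assumption; the text does
    not state the percentile convention.
    """
    ordered = sorted(prices)
    rank = math.ceil(0.75 * len(ordered))  # 1-based nearest rank
    return ordered[rank - 1]


def is_promotion(price, regular, min_discount=0.20):
    """A price at least 20 % below the regular price counts as a promotion."""
    return price <= (1.0 - min_discount) * regular
```

Because promotions are infrequent, the 75th percentile falls on a regular-price observation, so a single discounted week does not move the benchmark.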

I then compute the identity of the product purchased and the price of the product for every purchase made. In the next step, I look up the price for the purchased product on every shopping trip of the same consumer that happened before the actual purchase and after the previous purchase in the detergent category. I drop any previous trip to a store where the particular product was not available. This procedure allows me to find out whether the product purchased had been on promotion on any previous trip of the same consumer for any arbitrary time window. Regardless of the time window, it will always go no further back than the first trip after the previous purchase. Because detergent is purchased infrequently, this constraint is usually not binding for the one-week time window used in the table. I also eliminate all trips to supermarkets in the “Other” category, as I do not have reliable price information for those trips. I do not consider these trips, both in terms of actual purchases and in terms of possible purchases on previous trips.
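The look-back procedure can be sketched as follows. This is a hedged illustration rather than the original implementation: the trip dictionaries, the key names, and the `promo_lookup` callback are all hypothetical, and `bought` refers to a purchase in the detergent category.

```python
def missed_promotion(trips, purchase_idx, promo_lookup, window_days=7):
    """Was the purchased product on promotion on an earlier trip in the window?

    trips: list of dicts sorted by 'day', with keys 'day', 'store', 'bought'
           (detergent product id, or None if no detergent was bought).
    promo_lookup(store, product, day): returns True/False, or None when the
           product was not available at that store (hypothetical interface).
    """
    purchase = trips[purchase_idx]
    product = purchase["bought"]
    for trip in reversed(trips[:purchase_idx]):
        if trip["bought"] is not None:
            break                       # never look past the previous purchase
        if purchase["day"] - trip["day"] > window_days:
            break                       # outside the chosen time window
        if trip["store"] == "Other":
            continue                    # no reliable price information
        on_promo = promo_lookup(trip["store"], product, trip["day"])
        if on_promo is None:
            continue                    # product not stocked: drop the trip
        if on_promo:
            return True
    return False
```

The first `break` encodes the constraint that the window never reaches further back than the first trip after the previous purchase.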

### A.4 Implementation of the product fixed effects

To control for unobserved product quality, I estimate a set of five brand intercepts for a total of six brands in the sample. I impose two constraints on the brand intercepts: (1) The lowest-valued brand has an intercept of zero (\(\min_{j} \xi_{j} = 0\)); I do not estimate this intercept, but impose it in the estimation. In practice, the optimization algorithm searches over a set of five differences between the six *ξ*-terms. This way of constructing the algorithm gives me a unique ranking between all six brands. Setting the lowest-value brand’s intercept to zero, I can use the differences to obtain the remaining *ξ*-terms. (2) All intercepts are constrained to be non-negative (*ξ* ≥ 0). Constraint (1) is a standard normalization constraint, as only quality differences between brands can be identified. I further assume unobserved product quality is a brand-specific characteristic and is scaled up linearly when a consumer purchases a larger pack size. That is, compared to a 1 kg pack size, a 2 kg pack gives twice as much unobserved product quality. Due to the multiplication of the brand intercepts with the pack size purchased, imposing constraint (2) is necessary. It ensures that large pack sizes provide higher utility than small pack sizes. Negative intercepts would imply lower utility from larger pack sizes, which is economically counter-intuitive. Note that the non-negativity constraint may seem to make the outside option of not purchasing relatively less attractive; however, the brand intercepts capture quality differences *relative to the normalized brand*, but not relative to the outside option. The dynamic parameters such as search and storage costs primarily govern the relationship between the outside option and the available purchase options.
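One way to read the two constraints is as a reparametrization from five free differences to six intercepts. The Python sketch below is an assumption about how this could be done, not a description of the actual estimation code: the differences are taken relative to an arbitrary reference brand, and both function names are hypothetical.

```python
def intercepts_from_differences(diffs):
    """Map five free differences to six brand intercepts with minimum zero.

    diffs[k] is read as xi_{k+1} - xi_0 for an arbitrary reference brand 0
    (an assumption; the text only says five differences are searched over).
    """
    relative = [0.0] + list(diffs)
    lowest = min(relative)
    # Shifting by the minimum sets the lowest-valued brand to zero and
    # makes every other intercept non-negative.
    return [r - lowest for r in relative]


def brand_utility(xi_j, pack_size_kg):
    """Unobserved quality scales linearly with the pack size purchased."""
    return xi_j * pack_size_kg
```

Under this reading, both the normalization and the non-negativity constraint hold for any values of the five differences, so the optimizer can search over them without bounds.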

### A.5 Discretizing the state variables

To implement the dynamic programming problem, I discretize the price distribution as well as the search-cost distribution and the inventory variable. The probability distribution regarding the store visited in the next time period is already a discrete distribution by construction.

To cover all the possible price realizations of any of the available products, I construct a grid for prices between 0 and 16 British pounds. A price of zero never occurs; therefore, I use this grid point to deal with temporarily unavailable products. Products at this grid point are made effectively unavailable by assigning them an extremely high price (99999 instead of 0), which reduces utility from this option to effectively minus infinity. This grid point (which captures product availability) will enter the consumer’s expectations about future prices together with the price distribution conditional on availability. I use 25 grid points, which makes the grid fine enough to capture the typical promotion depth for any pack size and brand available during the sample period.
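A minimal Python sketch of such a price grid follows. The even spacing and the nearest-point assignment are assumptions (the text gives only the range, the number of points, and the role of the zero point), and the function names are hypothetical.

```python
N_POINTS = 25
MAX_PRICE = 16.0
UNAVAILABLE_PRICE = 99999.0   # price used in utility when a product is not stocked

# Evenly spaced grid (an assumption; the text only gives range and count).
PRICE_GRID = [MAX_PRICE * k / (N_POINTS - 1) for k in range(N_POINTS)]


def price_to_index(price):
    """Nearest grid index; None (unavailable) maps to the reserved point 0."""
    if price is None:
        return 0
    # Observed prices are strictly positive, so exclude the reserved point.
    return min(range(1, N_POINTS), key=lambda k: abs(PRICE_GRID[k] - price))


def index_to_price(index):
    """Price entering utility for a grid index."""
    return UNAVAILABLE_PRICE if index == 0 else PRICE_GRID[index]
```

Keeping availability as one more grid point means that a single discrete distribution can describe both future prices and the chance that a product is missing from the shelf.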

To discretize the distribution of future expected search costs, I do the following: Based on the estimates of \(\widetilde{s}\) and the vector *γ*, I calculate the search costs for all the shopping trips and compute the distribution of the search costs over a set of grid points. Defining the grid is made particularly easy by the functional form chosen, as the search cost has to be an element of the compact interval \(\widetilde{s} \cdot [0,1]\). I therefore use a grid of values between 0 and 1 and compute the distribution of \(\exp(x_{t}' \gamma)/[1+\exp(x_{t}' \gamma)]\). I then multiply this term by \(\widetilde{s}\). I use 11 grid points for the search-cost distribution in the estimation.
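The discretization of the search-cost distribution can be sketched in Python as below. The even grid and the assignment of each trip to its nearest grid point are assumptions, and the function names are hypothetical; the functional form (a logit term scaled by \(\widetilde{s}\)) is taken from the text.

```python
import math


def search_cost(x, gamma, s_tilde):
    """s_tilde times the logit of the linear index x'gamma."""
    index = sum(xk * gk for xk, gk in zip(x, gamma))
    return s_tilde * math.exp(index) / (1.0 + math.exp(index))


def search_cost_distribution(trips_x, gamma, s_tilde, n_points=11):
    """Share of trips on each point of an even grid over s_tilde * [0, 1]."""
    counts = [0] * n_points
    for x in trips_x:
        logit = search_cost(x, gamma, 1.0)           # unscaled term in [0, 1]
        counts[round(logit * (n_points - 1))] += 1   # nearest grid point
    grid = [s_tilde * k / (n_points - 1) for k in range(n_points)]
    probs = [c / len(trips_x) for c in counts]
    return grid, probs
```

Working on the unit interval first and scaling by \(\widetilde{s}\) afterwards keeps the grid definition independent of the estimated scale parameter.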

Finally, I discretize inventory using a grid ranging from 0 to 15 kg of detergent inventory with 30 grid points in the dynamic problem. When constructing the inventory variable for each consumer, I allow the transition (as a function of the consumption rate and the pack size of a purchase) to be continuous. For every choice available, I compute the expected value function by linearly interpolating between the value functions defined for the closest grid point to the left and to the right of the actual inventory value (which resulted from the continuous transition process). Because the grid is relatively fine, the method of interpolation presumably does not have a large effect.
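The interpolation step is standard and can be illustrated with a few lines of Python. The grid range and size follow the text; the even spacing, the clamping at the grid boundaries, and the function name are assumptions made for this sketch.

```python
INV_MAX = 15.0
N_INV = 30
INV_GRID = [INV_MAX * k / (N_INV - 1) for k in range(N_INV)]


def interpolate_value(value_on_grid, inventory):
    """Linear interpolation of a value function stored on INV_GRID."""
    inv = min(max(inventory, 0.0), INV_MAX)       # clamp to the grid range
    step = INV_MAX / (N_INV - 1)
    left = min(int(inv / step), N_INV - 2)        # grid point to the left
    weight = (inv - INV_GRID[left]) / step        # 0 at left point, 1 at right
    return (1.0 - weight) * value_on_grid[left] + weight * value_on_grid[left + 1]
```

With roughly half a kilogram between grid points, the interpolation error is small whenever the value function is smooth in inventory, which is the sense in which the choice of method should matter little.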

### A.6 Initial inventory

When estimating the model, I have to deal with the problem of an unknown initial inventory. Note that the consumption rate is convex by construction. Once the inventory falls below the rate (*τ*), consumption is reduced until it becomes zero when the inventory is completely depleted. Because of this, the impact of the initial inventory will fade over time. I start with the first observed purchase for each household and assume that no inventory was held before that time period. I then calculate the evolution of the inventory implied by the estimated consumption parameter *τ* and the observed purchases. Only after the first ten trips is the observed behavior used to form the likelihood function. This helps to mitigate the initial inventory problem. Note that I deal with the initial condition problem in a way similar to Hendel and Nevo (2006). Erdem et al. (2003) instead use a more sophisticated approach. They simulate purchases and consumption for an extended period of time prior to the observed data and use several sets of simulated inventories as the initial condition(s). The likelihood is obtained by averaging over the individual likelihoods for each simulated initial condition. As a sensitivity check, I also tried excluding the first 20 trips instead of only 10 from the estimation. This had little impact on the results.
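The reason the initial inventory fades can be seen in a small Python sketch of the inventory recursion. It assumes that consumption in each period equals the smaller of *τ* and the available stock, which is one reading of the description above; the function names and the `initial` argument are hypothetical.

```python
def inventory_path(purchases_kg, tau, initial=0.0):
    """Start-of-period inventories implied by purchases and consumption rate tau.

    Assumes consumption of min(tau, available stock) per period, which is
    one reading of the text rather than the exact specification.
    """
    inventory = initial
    path = []
    for bought in purchases_kg:
        path.append(inventory)
        available = inventory + bought
        inventory = available - min(tau, available)
    return path


def likelihood_trips(trips, burn_in=10):
    """Trips that enter the likelihood: everything after the burn-in."""
    return trips[burn_in:]
```

Two paths that start from different initial inventories coincide once the larger stock has been used up, so discarding the first trips removes most of the dependence on the assumed starting value.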

### A.7 Sensitivity checks

I ran several tests to check the sensitivity of the estimates. First, one might worry some reverse causality exists for the search-cost shifters. This type of mechanism is particularly worrisome in the case of the “number of other cleaning products purchased” search-cost shifter, as the consumer might decide to buy detergent and therefore also buy other cleaning products. Presumably, buying detergent is less likely to have an influence on the overall trip expenditure or the number of household products, which is a wide category. Therefore, I re-estimated the model, dropping the “number of other cleaning products purchased” as a shifter of the search costs. I report the results of this re-estimation together with the baseline results in Table 16 of the Appendix. The estimates of the remaining parameters change relatively little. Furthermore, I tried to include a quadratic storage-cost term for both types of households. Both coefficients are positive but not significantly different from zero.

### A.8 Implementing the simulation

To analyze the predictions from the model, I need to simulate consumer behavior based on the parameter estimates of the model. To this end, I take draws from the distribution of error terms, determine the optimal choice for each consumer, and aggregate over the choices of all simulated consumers to obtain the market shares for each choice. I also randomly draw the identity of the store and the prices that a consumer faces in a particular time period from the empirical distribution of these variables. The probability distributions of store visits and prices can easily be computed from the raw data. In both cases, I compute the distribution from sample frequencies (of store visits/store-specific prices), and it is therefore independent of any parameters of the estimation. I discretize the price distribution for this purpose; the store-visit distribution is discrete by construction. I use a total of 5,000 simulated consumers for each of the two types and simulate the behavior over 1,000 weeks. I calculate total market shares by weighting the market share of each type of consumer with the estimated weight of the respective type.
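Two mechanical pieces of this simulation, drawing from sample frequencies and weighting the type-specific market shares, can be sketched in Python. The function names and data layout are hypothetical, and the choice model itself is not reproduced here.

```python
from collections import Counter


def empirical_distribution(observations):
    """Sample frequencies of the observed values (no model parameters involved)."""
    counts = Counter(observations)
    total = sum(counts.values())
    return {value: n / total for value, n in counts.items()}


def draw(distribution, rng):
    """Draw one value; rng is a random.Random instance."""
    values = list(distribution)
    weights = [distribution[v] for v in values]
    return rng.choices(values, weights=weights, k=1)[0]


def aggregate_shares(shares_by_type, type_weights):
    """Weight each type's simulated market shares by its estimated weight."""
    products = shares_by_type[0].keys()
    return {
        p: sum(w * s[p] for w, s in zip(type_weights, shares_by_type))
        for p in products
    }
```

A seeded generator such as `random.Random(0)` can be passed as `rng` to make simulated store visits and prices reproducible across runs.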

Finally, consumers also differ by the inventory they hold in a particular time period. To get sensible results from the simulation, I need to know the inventory distribution implied by the model. I find this distribution by starting at an arbitrary distribution. I then simulate consumer behavior for enough time periods such that the distribution reaches a steady state. Specifically, I start by assigning an inventory of zero to each consumer. I then simulate the consumer’s behavior over 100 time periods and update the inventory each period according to the rate of depreciation derived in the estimation and the simulated purchases. The inventory changes little from period to period at the end of the 100 simulated time periods. Therefore, the impact of the initial inventory should have faded completely after this time span. I then use this “steady-state” inventory distribution as the initial inventory for the various simulation exercises in the results section, the validation, and the counterfactuals.
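The burn-in used to obtain the inventory distribution can be sketched as follows. The `purchase_rule` argument is a hypothetical stand-in for the model’s optimal policy, which is not reproduced here, and consumption of min(*τ*, available stock) per period is an assumption.

```python
def burn_in_inventories(n_consumers, n_periods, tau, purchase_rule, rng):
    """Simulate from zero inventory and return end-of-burn-in inventories.

    purchase_rule(inventory, rng) -> kg bought this period. It stands in for
    the model's optimal policy, which is not reproduced here.
    """
    inventories = [0.0] * n_consumers
    for _ in range(n_periods):
        for i, inv in enumerate(inventories):
            available = inv + purchase_rule(inv, rng)
            inventories[i] = available - min(tau, available)
    return inventories
```

For example, a toy rule that buys a 1.9 kg pack whenever inventory drops below 0.5 kg keeps every simulated inventory inside a bounded range, and the resulting cross-section could then serve as the starting point for the later simulation exercises.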

When looking at market shares in promotion periods and regular-price weeks (for the validation and the two counterfactuals), I aggregate the market shares separately for all promotion and regular-price weeks. I compute the elasticity by comparing the percentage difference in market share with the price change implied by the promotion. I therefore do not use one particular promotion to assess the effect on demand. Instead, I look at all promotions (which are by construction randomly timed) and average over all promotional weeks. I draw the prices of all other products from their empirical distribution in every week. The promotion of a particular product can therefore sometimes coincide with promotions for other products, as implied by the price distributions. This way of constructing price elasticities is as close as possible to the choice situations consumers face in reality. This makes the elasticities comparable to those calculated from the raw data.
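The elasticity calculation described above amounts to a ratio of two percentage changes. A minimal Python sketch, with hypothetical function names and made-up inputs in the usage note, is:

```python
def mean_share(weekly_shares, is_promo_week, promo):
    """Average market share over all promotion (or all regular-price) weeks."""
    selected = [s for s, flag in zip(weekly_shares, is_promo_week) if flag == promo]
    return sum(selected) / len(selected)


def promotion_elasticity(share_promo, share_regular, discount):
    """Percentage change in share divided by percentage change in price."""
    pct_share = (share_promo - share_regular) / share_regular
    pct_price = -discount   # e.g. a 27 % discount is a price change of -0.27
    return pct_share / pct_price
```

With illustrative shares of 0.30 in promotion weeks and 0.10 in regular-price weeks and a 25 % discount, the share rises by 200 % for a 25 % price cut, so the sketch returns an elasticity of about −8. These numbers are invented for the example and are not results from the paper.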

Furthermore, in the case of all three products reported in Table 8, I set the depth and frequency of the promotion as close as possible to the actual price patterns. A 900 g pack of Ariel at Morrisons was promoted 3.4 % of the time with a 27 % discount. A 900 g pack of Tesco’s Private Label at Tesco was promoted 19 % of the time with a 26 % discount. A 1,900 g pack of Ariel at Morrisons was promoted 4.7 % of the time with a 21 % discount.

### A.9 Pack-size intercepts

One weakness of the paper lies in the fact that the demand model cannot predict pack-size choice particularly well. In principle, a set of pack-size fixed effects could help improve the fit in this respect. Currently, preferences over different pack sizes are modeled in a fairly restrictive way; they are primarily influenced by search and storage costs. High storage costs make buying large packs less attractive, whereas high search costs make large pack sizes more attractive in order to save on future search costs (by purchasing less often). Of course, other explanations for why consumers prefer different pack sizes might exist that are not captured in the model. However, consumer choice across pack sizes (in response to non-linear pricing over pack sizes, namely, quantity discounts) is of crucial importance for identification. As laid out in Section 5.2, pack-size choice is one of three types of variation that help to identify the price coefficient and the search and storage costs. The inclusion of a set of pack-size fixed effects would “soak up” all variation in the pack-size choice dimension and therefore make identifying all three key parameters from the remaining variation impossible. The limited flexibility in pack-size preferences is the cost of achieving identification in the model presented here.

### Appendix B: Tables

## About this article

### Cite this article

Seiler, S. The impact of search costs on consumer behavior: A dynamic approach.
*Quant Mark Econ* **11**, 155–203 (2013). https://doi.org/10.1007/s11129-012-9126-7

### Keywords

- Dynamic demand estimation
- Search costs
- Imperfect information
- Storable goods
- Stockpiling

### JEL Classification

- D12
- D83
- C61
- L81