A semiparametric approach to estimating reference price effects in sales response models

It is well known that store-level brand sales may not only depend on contemporaneous influencing factors like current own and competitive prices or other marketing activities, but also on past prices representing customer response to price dynamics. On the other hand, non- or semiparametric regression models have been proposed in order to accommodate potential nonlinearities in price response, and related empirical findings for frequently purchased consumer goods indicate that price effects may show complex nonlinearities, which are difficult to capture with parametric models. In this contribution, we combine nonparametric price response modeling and behavioral pricing theory. In particular, we propose a semiparametric approach to flexibly estimating price-change or reference price effects based on store-level sales data. We compare different representations for capturing symmetric vs. asymmetric and proportional vs. disproportionate price-change effects following adaptation-level and prospect theory, and further compare our flexible autoregressive model specifications to parametric benchmark models. Functional flexibility is accommodated via P-splines, and all models are estimated within a fully Bayesian framework. In an empirical study, we demonstrate that our semiparametric dynamic models provide more accurate sales forecasts for most brands considered compared to competing benchmark models that either ignore price dynamics or just include them in a parametric way.


Introduction
Sales response functions describe the relationship between the sales of a product (or an entire product category) as dependent variable and predictors that are believed to influence sales as independent variables. In brand sales models based on store-level data, these predictors typically represent prices of (substitute) brands and other marketing variables related to promotional activities (like displays and feature advertising) as well as trend or seasonal indicators. In this context, retailers and academic researchers face several challenges, for example how to process the usually large amount of information (predictors) to arrive at a parsimonious model, how to specify the functional relationships between metric predictors (like prices) and sales, and/or how to accommodate dynamic effects in the model. More specifically, several streams of research for modeling (store) sales response to price variations have developed over the last 40 years. A first stream has focused on choosing the right functional form to adequately capture the relationship between (own or competitive) prices and sales. Non- or semiparametric regression models have been proposed here in recent years in order to capture strong and/or complex nonlinearities in price response, which have been documented for frequently purchased consumer goods in many empirical studies and are difficult to handle with parametric models. A second stream has addressed price dynamics in (store) sales response models by adding variables for lagged (or even lead) price effects, by including price terms for reference price or price-change effects, or by considering time-varying price parameters. Reference price effects are more commonly studied with disaggregate consumer data (i.e., household-level data), and have been less frequently incorporated into response models based on aggregate sales data.
Obviously, reference price effects are much more difficult to model with aggregate data than with disaggregate data. In the latter case, purchase incidence, brand choice, and purchase quantity decisions of consumers can be more easily separated at the individual consumer or household level, and reference price effects can in principle influence all three decisions (although they are most popular in models that focus on brand choice only). Beyond reference price effects, other forms of dynamics like stockpiling, state dependence (brand loyalty) or customer holdover, or consumer learning have also been shown to be very relevant at the individual consumer or household level; see, for example, Neslin and van Heerde (2009) and van Heerde and Neslin (2017) for an overview. Sales response models lack this micro-foundation: both the three different consumer decisions and the possibly individually different dynamics are confounded in aggregate data, making it challenging to disentangle them (see, e.g., Neslin and Shoemaker 1989; van Heerde et al. 2000, 2004). Using aggregate data, reference price effects can therefore only be interpreted for an aggregate of households in the sense that they refer to prices paid or observed in previous periods (e.g., weeks) rather than to prices paid or observed at previous individual purchase occasions. This may be less of a problem if goods, as in the food sector, are purchased on a regular (weekly) basis and are frequently advertised. On the other hand, if consumer goods have longer interpurchase times, household data can also suffer from problems such as a different composition of consumers across periods when modeling dynamic effects over time (e.g., weeks). Moreover, modeling reference price effects with aggregate data also has its advantages due to the greater managerial relevance of aggregate data compared to household-level data.
Although household-level data are richer for explaining customers' purchasing behavior (as indicated above), they have often been criticized by managers for their potential lack of representativeness, which may cause share estimates to differ from those based on store-level data (cf. van Heerde 1999, p. 21). While household-level data cover only a subset of all customers purchasing at a retail store, store-level data cover all these customers, which is a weighty argument from the perspective of a store manager in favor of using aggregate data. For a comprehensive discussion of the pros and cons of the different data types, see van Heerde (1999, pp. 20-21). As a result of the discussion above, it is, however, important to separate reference price effects from other dynamic effects like stockpiling or customer holdover when relying on aggregate data. In this paper, we combine nonparametric sales response modeling with the estimation of price dynamics, where the latter are captured by reference price effects. The fact that no other study has tackled this frontier up to now can probably be explained by the much higher popularity of studying reference price effects with household-level data and the greater difficulty of disentangling different sources of dynamics with aggregate data, as discussed above. We try to fill this research gap and propose a semiparametric model for flexibly estimating price-change or reference price effects based on store-level sales data. We compare different representations for accommodating symmetric versus asymmetric and proportional versus disproportionate price-change effects following adaptation-level and prospect theory, and further compare our flexible autoregressive model specifications to parametric benchmark models. Since management decisions should be based on the model with the highest predictive performance (van Heerde et al. 2002), our primary focus is on predictive model performance rather than on solving the problem of how to tease out different dynamic effects with aggregate data. Actually, focusing on prediction as our main goal relaxes the problem that it is more difficult to disentangle different dynamic effects with aggregate sales data than with disaggregate consumer data. Nevertheless, we address this problem and separate reference price effects from other dynamic effects like stockpiling and customer holdover by including lagged sales as an autoregressive model component.
In an empirical study, we demonstrate that our semiparametric dynamic models can provide more accurate sales forecasts compared to competing benchmark models that either ignore price dynamics or just include them in a parametric way. The main benefit of the proposed model is therefore to help managers to predict sales better. It has been shown before for other models that semiparametric modeling can provide (much) better predictions than parametric modeling (see the literature review in Sect. 2 below); as such, one main academic contribution of the paper is to show that this also holds for reference price models. In addition, we discuss likely implications of our model for related optimal pricing decisions in our outlook onto future research perspectives at the end of the article.
The rest of the paper is organized as follows. Section 2 provides a compact review of the relevant literature on price response modeling based on aggregate data, reflecting the road from parametric to more flexible model specifications as well as the different options of addressing price dynamics, including reference price effects in particular. In Sect. 3, we introduce our Bayesian model estimation framework. Using scanner data for refrigerated orange juice brands sold by a large supermarket chain we compare different model specifications (nonparametric vs. parametric, dynamic vs. static, alternative options of specifying reference price effects) for predictive performance and discuss implications regarding estimated price elasticities in Sect. 4. We conclude in Sect. 5 with a summary of the most important findings, managerial implications, and an outlook on future research opportunities.

Literature review
This section provides an overview of relevant literature for our proposed approach, referring to the functional form of price response models, the incorporation of price dynamics in such models, and the few approaches that have so far combined functional flexibility and the estimation of dynamic price effects in sales response models.
Early approaches used strictly parametric modeling to estimate sales/price response functions, as a rule using the sales variable in logarithmic form (e.g., Hruschka 1997; Montgomery 1997; Foekens et al. 1999; Kopalle et al. 1999; van Heerde et al. 2000, 2002; Hruschka 2006a, b; Andrews et al. 2008). Using log sales instead of sales makes it possible to capture nonlinearities in sales response; however, the observed data are still projected "into a Procrustean bed of a fixed parameterization" (Härdle 1990; as cited in van Heerde 1999, p. 28).
In other words, parametric models only provide consistent estimates if the a priori assumed functional form is correct (e.g., Leeflang et al. 2000). The use of more flexible semi- or nonparametric models can help to overcome this problem, as these allow the shape of functional relationships to be 'extracted' directly from data without prior knowledge about the functional form (e.g., van Heerde 2017). van Heerde et al. (2001) have shown for several food categories that the use of a kernel regression approach can improve the predictive performance of brand sales models based on store-level data compared to parametric modeling. Hruschka (2006a) and Hruschka (2007) used neural nets (multilayer perceptrons) to capture nonlinearities in sales response and reported much better log marginal densities as well as high posterior model probabilities or superior cross-validated predictive densities compared to strictly parametric modeling for all brands considered. Other researchers proposed spline approaches to model sales response and revealed strong nonlinearities in price effects, which in addition were shaped very differently at the individual brand level. Kalyanam and Shively (1998) used stochastic cubic splines, Haupt and Kagerer (2012) and Haupt et al. (2014) applied B-splines, Hruschka (2000) considered both B-splines and cubic smoothing splines, and Steiner et al. (2007), Weber and Steiner (2012), Lang et al. (2015), and Weber et al. (2017) employed Bayesian P-splines. Except for Kalyanam and Shively (1998) and Hruschka (2000), who focused on model fit (the first also used marginal posterior model probabilities, the latter applied AIC), the mentioned spline applications provided further evidence of (much) more accurate sales predictions when using nonparametric instead of parametric response models. These findings are of great importance since (store) managers should prefer the model specification with the best possible predictive performance (van Heerde et al. 2002).
More flexible specifications have the potential to work better and to provide superior forecasts than parametric ones if the sales data at hand include complex nonlinear relationships in price response which are difficult to 'read out' with parametric models (e.g., Lang et al. 2015).
Beyond the choice of the right functional form, one can think about the incorporation of time-dependent (price) effects leading to dynamic instead of static sales or price response models. In the context of price promotions there is empirical evidence that lags or leads of prices can have an impact on current brand or current category sales volumes. van Heerde et al. (2000) and van Heerde et al. (2004) used leads and lags of price indices reflecting promotional price cuts with different types of promotional support, and reported significant and in part also very substantial dynamic effects. Nijs et al. (2001) and Horváth and Fok (2013) accounted for price dynamics by fitting VARX models. Nijs et al. (2001) examined category-demand effects and found that the strong positive short-term effects of price promotions almost completely dissipate over time. Horváth and Fok (2013) analyzed cross-price effects and found evidence of preemptive switching in the sense that a brand's price promotion in one period can decrease a substitute brand's sales in subsequent periods. Foekens et al. (1999) and Kopalle et al. (1999) proposed varying parameter models to account for dynamic (pricing) effects in store-level sales response models, both using the widespread multiplicative functional form for modeling price response. Foekens et al. (1999) reparameterized a brand's own-price elasticity to depend on cumulated previous price discounts (amount and time) for both the brand considered and competing brands, and reported that the magnitude and timing of preceding price cuts can have a significant impact on own-price elasticities in the current period. Kopalle et al. (1999) reparameterized own- and cross-price parameters as functions of geometrically weighted averages of past discounts and in addition developed a normative model for related pricing decisions.
The discussion of lagged prices and dynamic price response modeling is closely connected to the topic of reference prices. Adaptation-level theory, as proposed by Helson (1964), states that the perception of a new stimulus is performed relative to an 'adaptation level'. Applied to a pricing context, the adaptation level for judging newly observed price information is called the reference price. In other words, the reference price constitutes an internal price standard of the consumer and works as the adaptation level to which the consumer compares the current price of a brand observed or paid. Prospect theory also suggests that consumers evaluate alternatives based on a comparison to a standard or reference point, but further distinguishes how consumers value potential gains versus losses from making a decision (Kahneman and Tversky 1979). The value of new price information therefore depends not only on the (absolute) difference to a reference point but also on whether the price difference represents a gain or a loss for the consumer. Consequently, prospect theory offers an option to evaluate price-change effects in a much more differentiated way than adaptation-level theory (see Sect. 3.1 for a more detailed description of the value function underlying prospect theory).
In the pricing literature, adaptation-level and prospect theory are most frequently applied in the context of (brand) choice modeling, i.e., in models that are based on disaggregate consumer data (for an introduction to this topic see Neslin and van Heerde 2009, for a detailed literature review see Mazumdar et al. 2005 and Neumann and Böckenholt 2014, and for recent applications see Boztuğ et al. 2014 and Baumgartner et al. 2018). Exceptions are for example Kucher (1987) and Natter and Hruschka (1997), who incorporated reference price effects into market share models (i.e., using aggregate data). At this point, it is important to note that the reference price as a construct to model dynamic effects of past prices can be either operationalized as the price of the previous purchase occasion (referred to as price-change effect) or determined via more complex reference price formation mechanisms based on several past prices (referred to as price-deviation effect); cf. Kucher (1987). Gedenk (2002, p. 249) has provided an overview of studies for either stream, and Briesch et al. (1997) discussed different reference price formation mechanisms. Accordingly, because reference prices of consumers cannot be directly measured or determined in aggregate sales data, some authors used either the price of the last period or the average of several previous prices as proxy for reference prices in aggregate response models (also see Gedenk 2002, pp. 247-249).
Referring to adaptation-level theory, Simon (1982, pp. 208-213) argued even earlier that it seems realistic to assume not only an absolute price effect but also a price-change effect on brand sales, i.e., that the price of the last period should have an impact on the response to the current period's price. In a first step, he proposed two versions of linear price-change response models, one where a brand's sales depend on the difference between its current price and its previous price, and another where the relative price change instead of the absolute price change was used as independent variable. Both linear price-change response models assume a symmetric and proportional sales response to price increases and decreases. In addition, he also proposed a hyperbolic sine sales response function to accommodate non-proportional (but still symmetric) price-change effects, with the relative price change as its argument. He motivated the use of this functional form using the same behavioral rationale as is inherent to the well-known Gutenberg price response model: small price changes may have only marginal (below average) effects on sales, while large price changes should yield disproportionately large sales effects (e.g., Hruschka 2000). Note that the three models did not include a contemporaneous (static) price effect; however, Simon (1982, p. 211) mentioned that the model could be extended accordingly in case of a sufficiently large data base (as is available today with store-level scanner data). Later, Simon (1992, pp. 253-254) also considered the possibility of an asymmetric price-change response by expanding his linear price-change response model (with the absolute price difference as argument) to capture sales effects from price increases (losses) and price decreases (gains) separately.
He motivated the model extension with his own empirical study, in which he observed a significant price-change effect for price decreases (but not for price increases), as well as by referring to prospect theory, which conversely suggests a higher price elasticity for price increases than for price decreases due to loss aversion of consumers.
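To make the three price-change response shapes described above concrete, the following minimal Python sketch contrasts a symmetric linear form, an asymmetric linear form with separate slopes for gains and losses, and a hyperbolic sine form. The function names, parameter names, and numerical values are hypothetical and chosen for illustration only; the sign convention (a positive price change denotes a price decrease, i.e., a gain) is also an assumption and may differ from the original parameterizations in Simon (1982, 1992).

```python
import math

def linear_response(delta, base, slope):
    """Symmetric and proportional: one slope for gains and losses."""
    return base + slope * delta

def asymmetric_linear_response(delta, base, slope_gain, slope_loss):
    """Proportional but asymmetric: separate slopes for gains and losses."""
    gain = max(delta, 0.0)    # size of a price decrease
    loss = max(-delta, 0.0)   # size of a price increase
    return base + slope_gain * gain - slope_loss * loss

def sinh_response(rel_delta, base, scale, steepness):
    """Symmetric but disproportionate: small changes matter little,
    large changes matter more than proportionally."""
    return base + scale * math.sinh(steepness * rel_delta)

base = 100.0  # hypothetical baseline sales
small = sinh_response(0.05, base, scale=10.0, steepness=8.0) - base
large = sinh_response(0.20, base, scale=10.0, steepness=8.0) - base
mirror = sinh_response(-0.20, base, scale=10.0, steepness=8.0) - base
```

In this sketch, a price change four times as large produces more than four times the sales effect under the hyperbolic sine form, while a price increase of the same size yields the mirror-image effect, which is what 'symmetric but non-proportional' means here.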
In the context of price assessment, Diller (2008, pp. 140-143) linked prospect theory to reference prices and proposed (among others) a reverse s-shaped decreasing function to capture asymmetric price-change or price-deviation effects. The shape of the function looks similar to the logistic price response function but shows a steeper progression for losses than for gains, following prospect theory. As an alternative, he suggested an s-shaped decreasing response function in order to accommodate the existence of possible lower and upper price thresholds. Consequently, the latter function is not in line with prospect theory but resembles the shape of the Gutenberg function, with a flatter middle part on the one hand and much more elastic parts for larger deviations from the reference point on the other hand. Both functions can capture a disproportionate price response pattern. Importantly, Diller (2008, pp. 360-361) pointed out that the choice of the right functional form depends on the empirical data at hand, which makes it necessary to compare different parametric approaches in empirical applications. As mentioned above, using nonparametric estimation techniques can remedy this dilemma by letting the data determine the shape of price-deviation or price-change effects without a priori assumptions about functional forms.
There are certainly pros and cons to consider when deciding whether to use only the price of the last period or a more complex reference price formation mechanism based on prices of several previous periods in an aggregate sales response model. Rinne (1981, pp. 29-30) already used the previous price as reference price (in terms of the last seen price) in his sales response model, as did Kucher (1987, p. 179) due to the "exceptional position" of the previous price among all past prices. In addition, there is empirical evidence that individual consumers would not access price information that lies much beyond the immediate past purchase occasion, simply due to difficulties in accurately remembering prices further back (see Krishnamurthi et al. 1992, and the literature cited therein). If consumers buy products on their shopping trips on a weekly basis, this restricted memory capacity argument with its focus on the immediately last price as reference price is also applicable to aggregate data. Also, established reference price formation models for disaggregate data use periodical (weekly) updates for a brand's reference price at the individual consumer level even if a consumer did not buy that brand in the last period (cf. Erdem et al. 2010). Accordingly, "updating reference prices only when households make purchases would underestimate the reference price" (Erdem et al. 2010, p. 310). This assumes that consumers are monitoring brand prices over periods and therefore are aware of a brand's price in the previous period, which is realistic for frequently purchased consumer goods (at least for segments of consumers). Based on these arguments, we focus on price-change response models using aggregate store-level sales data and the price of the last period as proxy for the reference price of an aggregate of consumers shopping at a retailer; hence, we use the more parsimonious option to operationalize reference prices.
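The two proxies contrasted above (the price of the last period versus an average of several previous prices) can be illustrated with a short Python sketch. The function names, the equal-weight moving average, and the window length are hypothetical choices made for illustration; they are not taken from any of the cited reference price formation models.

```python
def last_price_reference(prices):
    """Reference price for period t is the price of period t-1.

    Returns one reference price for each period t = 1, ..., T-1.
    """
    return list(prices[:-1])

def average_price_reference(prices, window):
    """Reference price for period t is the mean of the previous `window` prices.

    Returns one reference price for each period t = window, ..., T-1.
    """
    refs = []
    for t in range(window, len(prices)):
        refs.append(sum(prices[t - window:t]) / window)
    return refs

weekly_prices = [2.0, 1.0, 3.0, 2.0]  # made-up weekly shelf prices
ref_last = last_price_reference(weekly_prices)
ref_avg = average_price_reference(weekly_prices, window=2)
```

The last-price proxy needs no tuning and loses only one observation, whereas the averaging proxy requires a choice of window (or weights) and discards the first `window` periods, which is one way to read the parsimony argument made above.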
Beyond the approaches discussed above, only very few authors have explicitly considered prospect theory for modeling price effects in store-level sales response models. Based on the reference price model of Greenleaf (1995), Kopalle et al. (1996) assumed that demand for a brand is a linear function of price(s) and a price-deviation effect, the latter being operationalized with two additively separable terms representing gains and losses. The authors developed optimal dynamic pricing strategies and showed that when (a sufficiently large number of) consumers weigh losses more strongly than gains, as suggested by prospect theory, everyday low pricing (i.e., setting constant prices) is optimal for a retailer. Conversely, if (enough) consumers weigh gains more strongly than losses, a hi-lo strategy (cyclical pricing) would be the optimal retailer strategy (for a similar result see Fibich et al. 2007). Assuming asymmetric reference price effects with loss-averse consumers, Fibich et al. (2003) demonstrated that for an infinite planning horizon the optimal pricing strategy 'converges' to a steady-state price, which turns out slightly lower than without considering reference price effects. Pauwels et al. (2007) proposed smooth transition regression models to explore threshold-based price elasticities and found evidence for larger threshold sizes for gains than for losses.
Van Heerde et al. (2004) addressed both functional flexibility and price dynamics in brand sales models (with store-level data). Based on van Heerde et al. (2000), who used leads and lags of price discount variables to capture price dynamics, and based on van Heerde et al. (2001), who applied kernel regression to flexibly estimate price discount effects, the authors combined both features (leads and lags, local polynomial regression) in order to decompose the sales effect of promotions into three different sources: cross-brand effects, cross-period effects, and category expansion effects. Finally, Natter and Hruschka (1997) have so far been the only ones to estimate reference price effects within a flexible approach (via a neural network) and based on aggregate data; however, their approach was directed at market share modeling rather than sales response modeling. To the best of our knowledge, no study so far has employed nonparametric regression to flexibly estimate asymmetric reference price (price-change) effects in store-level sales response models, and we attempt to fill this research gap in the literature with our study. Table 1 summarizes the literature on (store-level) sales response models with focus on estimating price effects discussed above, distinguishing between approaches that have addressed either functional flexibility, or price dynamics (in the form of using lead or lagged prices, reference prices, or time-varying parameters), or both features. In Appendix A, we provide an overview of advantages of using nonparametric regression techniques in general and especially for capturing gain and loss effects, and summarize further convenient properties of applying Bayesian P-splines as we do in this article.
In the following, we present a semiparametric brand sales model which accounts for price dynamics via (asymmetric) price-change effects. Our model therefore combines reference price effects with functional flexibility. Our model specification is based on Weber et al. (2017), who assumed a brand's sales to depend on the brand's own price and prices of substitute brands, further marketing covariates representing the use of displays and odd prices, as well as on store-specific and holiday effects. We additionally consider price dynamics via price-change effects in different ways, that way accommodating adaptation-level and prospect theory. Our focus is on the predictive performance of the proposed approach in comparison to several benchmark models, including linear, exponential, and multiplicative models with or without price dynamics. From a modeling perspective, we want to demonstrate the capability of nonparametric regression to estimate any kind of price-change response from data (symmetric vs. asymmetric response, proportionate vs. disproportionate response). From a managerial perspective, our primary goal is to provide an econometric model that can improve sales predictions from incorporating reference price effects into a flexible sales response model compared to simpler model specifications that either ignore price dynamics, functional flexibility, or both. Likely implications of our proposed model for related optimal pricing decisions will be discussed in Sect. 5.

Table 1 Overview of studies accommodating price dynamics (e.g., reference price effects) and/or functional flexibility in aggregate sales response models

Study | Price dynamics | Functional flexibility
Rinne (1981) | Reference prices (gains and losses) | -
Kucher (1987) | Reference prices (price difference) | -
Kopalle et al. (1996) | Reference prices (gains and losses) | -
Kalyanam and Shively (1998) | … | …
This study | Reference prices (price difference, gains and losses) | Bayesian P-splines

Model specification
In this section, we review the concept of prospect theory and its application to reference prices, propose different options to include a lagged price term in a sales response model to measure price-change effects, and introduce our semiparametric model in different variants to address both functional flexibility and price (and other) dynamics.

Reference prices and prospect theory
As mentioned in the literature review, prospect theory as proposed by Kahneman and Tversky (1979) builds upon adaptation-level theory, which states that a consumer's response to a stimulus (e.g., the current price observed or paid) does not only depend on the stimulus itself but on the 'distance' to the consumer's standard or 'adaptation' level (e.g., her/his reference price). In prospect theory, the comparison to that reference point is expressed by the value function $v(x)$, with argument $x$ representing the deviation of the observed stimulus from the reference point (see Fig. 1). In particular, $x > 0$ refers to a gain and $x < 0$ to a loss and, accordingly, the value $v(x)$ is positive for a gain and negative for a loss. The value function is further concave for gains and convex for losses, and asymmetric with $|v(-x)| > v(x)$ for $x > 0$. This implies that consumers are loss-averse and weigh losses of an amount $x$ more strongly than gains of the same amount. Therefore, the value function is steeper for losses than for gains. In the present context, $x$ represents the price change between the current price ($p_t$) of a brand and its price of the last period ($p_{t-1}$), where the latter corresponds to the reference price. Following adaptation-level theory and prospect theory, different ways to incorporate lagged prices into a sales/price response model are possible.
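The properties of the value function listed above can be checked numerically with a small Python sketch. The power form and the default parameter values (exponents of 0.88 and a loss-aversion coefficient of 2.25) are an assumption borrowed from the commonly cited parameterization in Tversky and Kahneman's later work on cumulative prospect theory; they are not part of the model proposed in this article, which estimates the shape of price-change effects flexibly from data.

```python
def value(x, alpha=0.88, beta=0.88, loss_aversion=2.25):
    """Illustrative prospect-theory value function v(x).

    x > 0 is a gain, x < 0 is a loss relative to the reference point.
    The power form and default parameters are assumptions for illustration.
    """
    if x >= 0:
        return x ** alpha                    # concave for gains (alpha < 1)
    return -loss_aversion * (-x) ** beta     # convex and steeper for losses

gain_value = value(0.5)    # value of a gain of size 0.5
loss_value = value(-0.5)   # value of a loss of the same size
```

With these assumed parameters, a loss of a given size is valued more than twice as heavily (in absolute terms) as a gain of the same size, which reproduces the asymmetry $|v(-x)| > v(x)$ stated above.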


Absolute versus relative price differences
The simplest way to account for a price-change effect is to use absolute or relative price changes, as for example proposed by Simon (1982, p. 209):

$$\Delta p_t = p_{t-1} - p_t, \tag{1}$$

$$\Delta^{rel} p_t = \Delta p_t / p_{t-1}. \tag{2}$$

For both variants, negative values correspond to losses and positive values to gains (which is in line with the representation given in Fig. 1). However, using a linear or parametric nonlinear sales response function, this specification results in only one single effect estimate for both gains and losses, since only one parameter is determined for the price-change effect. A great advantage of flexible (e.g., spline) functions therefore is their ability to capture possibly different shapes for the gain and loss parts of the value function, even if gain and loss effects are not separated from each other as in (1) and (2).
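As a small worked example, the following Python sketch computes absolute and relative price changes from a price series. It reads the sign convention stated above (positive values are gains, negative values are losses) as the previous price minus the current price, which is an interpretation of the text; the function name and the prices are made up for illustration.

```python
def price_changes(prices):
    """Return (absolute, relative) price changes for periods t = 1, ..., T-1.

    Assumed sign convention: positive = gain (price fell relative to the
    previous period), negative = loss (price rose).
    """
    absolute, relative = [], []
    for prev, curr in zip(prices, prices[1:]):
        delta = prev - curr             # absolute change: p_{t-1} - p_t
        absolute.append(delta)
        relative.append(delta / prev)   # relative change: delta / p_{t-1}
    return absolute, relative

weekly_prices = [2.00, 1.50, 2.00, 2.00]  # made-up: promotion in week 2
abs_chg, rel_chg = price_changes(weekly_prices)
```

In this example the promotion and the return to the regular price are equal in absolute terms (0.50 each), but not in relative terms: the price cut is a gain of 25% of the previous price, while the subsequent increase is a loss of about 33%, because the base price $p_{t-1}$ differs.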

Separate price terms for gains and losses
An approach to account for individual price-change effects of gains and losses in parametric models is to simply split the (relative) difference between the current and lagged price, according to (1) and (2), into two separate variables, one for gains and one for losses. Now, the loss and gain parts of the value function shown in Fig. 1 are captured by two separate terms. Consequently, asymmetric and disproportionate (except for the linear model) price-change effects are allowed for gains and losses. One disadvantage of this specification for parametric models may be that whenever one of the two variables takes a positive value the second one is truncated to zero, thus generating 'many zeros' in the vector of observations for these two covariates (which may lead to an estimation bias). Using nonparametric techniques like splines, estimation results are much less (if at all) affected by this truncation due to their local fitting property. We elaborate on this local fitting property in more detail in Sect. 4.2.
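The split into separate gain and loss variables can be sketched in Python as follows. Coding both variables as nonnegative is an assumption based on the remark above that whenever one variable takes a positive value the other is truncated to zero; the function name and the example values are hypothetical.

```python
def split_gain_loss(change):
    """Split a signed price change into nonnegative gain and loss parts.

    Assumed convention: change > 0 is a gain, change < 0 is a loss.
    At most one of the two returned values is positive.
    """
    gain = max(change, 0.0)
    loss = max(-change, 0.0)
    return gain, loss

rel_changes = [0.25, -0.2, 0.0, 0.0, 0.1]  # made-up relative price changes
gains, losses = zip(*(split_gain_loss(c) for c in rel_changes))
zero_share_gains = sum(1 for g in gains if g == 0) / len(gains)
```

Even in this tiny example, most entries of each covariate are zero, which illustrates the 'many zeros' issue noted above for parametric specifications.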

Reference prices and other dynamic effects
As specified above, we treat a brand's price of the previous period ($p_{t-1}$) as reference price and include its deviation from the price of the current period ($p_t$), in absolute or relative terms ($\Delta^{rel} p_t = \Delta p_t / p_{t-1}$), to capture reference price effects, i.e., price dynamics. At this point, it is important to note that empirical findings in the field of price promotions indicate that price cuts (corresponding to gains in our context) can lead to strong sales spikes during a promotion. The reasons for the incremental sales volume of a brand during promotional periods are manifold: price discounts are typically substantial, they are advertised via in-store displays and/or out-of-store flyers, and consumers might stockpile as a result of accelerating their purchases by buying earlier and/or a larger amount of the promoted item. Many studies based on household-level panel data have revealed that purchase acceleration effects can be considerable; an about equally large number of studies based on aggregate store sales data, however, could not find the expected postpromotion dip after the deal period (i.e., the corresponding sales decrease just as a result of stockpiling; for an overview of related studies and findings see, e.g., van Heerde et al. 2001). As a consequence, managers who rely on store sales data may overestimate the gain effect from a price decrease during the promotion and underestimate the loss effect when the price has increased again in the post-promotion period, which would be contradictory to what prospect theory suggests. Stockpiling is hard to detect in store-level sales data (see Neslin and Shoemaker 1989), and Neslin and Schneider Stone (1996) provided a number of arguments for explaining this mystery (also see Neslin and van Heerde 2009). Accordingly, the possible lack of postpromotion dips in aggregate data can be the result of an aggregation bias, caused by potentially very different purchasing patterns of individual households. For example, some households can be deal-to-deal buyers, some pursue excessive stockpiling while others consume faster, or some others may build up brand loyalty and repeat-purchase the promoted brand even after the promotion at a higher price (also referred to as customer holdover or purchase reinforcement).
Consequently, other dynamic effects like stockpiling or customer holdover must be accommodated in aggregate sales response models to ensure that the dynamics in the data are indeed driven by pricing effects (in our model, by reference price effects) and not, or at least only partially, by other types of dynamics. Based on simulated data, Slonim and Garbarino (2009) showed that the level of stockpiling can indeed explain the mixed findings on asymmetric reference price effects in the literature, which are either in line with prospect theory (greater sensitivity to losses) or not (greater sensitivity to gains). In particular, they found that the conditions that affect the level of stockability of a product (e.g., holding costs of consumers, perishability of products, required storage space, frequency and depth of price deals) drive the direction and magnitude of the asymmetry of reference price effects. In order to account for stockpiling and customer holdover effects, we follow Pauwels et al. (2007) and include lagged sales (a brand's sales of the previous period) as an additional predictor, leading to an autoregressive sales response model. A negative effect of lagged sales on current sales would then indicate more stockpiling of consumers on average, while a positive effect would point to a customer holdover effect. Of course, in aggregate data both effects may be simultaneously at work, and positive repeat purchase effects (customer holdover) may overcompensate purchase acceleration effects (stockpiling), depending on the composition of different types of households in the sample and the extent to which they stockpile. For example, Macé and Neslin (2004) found that households with less income, a larger number of household members, working women, and cars are more prone to stockpiling, and Chan et al. (2008) reported that brand loyals and heavy users stockpile (more) for future consumption compared to brand switchers and light users. 2

Semiparametric modeling approach
We use the following additive semiparametric dynamic sales response model with unknown smooth (nonparametric) functions for price and price-change effects, as described in (5). Accordingly, the (log) unit sales of a brand in a specific store and week are assumed to depend on the brand's own price, a reference price term including the brand's price of the previous week as reference price of the (aggregate of) consumers, own promotional activities for the brand (odd pricing, use of a display), prices of substitute brands captured at the price-quality tier level, cross-promotional activities of substitute brands in the same tier (use of a display), a potential holiday effect, and unobserved store-specific effects. In addition, we include the (log) unit sales of the brand in the previous week as autoregressive model component to separate other dynamic effects from the reference price effect, as discussed above. Since we estimate the sales response for each brand separately, we omit brand indices in the following for simplicity. The model in (5) comprises the following terms:
• a store-specific random intercept for store s;
• $f_1$, a decreasing function of the brand's own price ($p_{st}$);
• $f_2$, an increasing function for the reference price effect ($p^{\mathrm{ref}}_{st}$), see below for more details;
• $f_{c_i}$, increasing functions for cross-price effects ($p^{c_i}_{st}$);
• a parameter vector capturing parametric effects for promotional activities and seasonality ($v'_{st}$);
• a parameter capturing the autoregressive effect of the one-period lagged unit sales ($\log(q_{s,t-1})$);
• a Gaussian error term with mean zero and variance $\sigma^2$.
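Written out, the terms listed above combine into an additive predictor of the following form. This display is a reconstruction from the verbal description rather than the original equation, and the Greek symbols $\alpha_s$ (store intercept), $\gamma$ (parametric effects), $\varphi$ (autoregressive parameter), and $\varepsilon_{st}$ (error term) are notation assumed here for illustration.

```latex
\log(q_{st}) = \alpha_s + f_1(p_{st}) + f_2\bigl(p^{\mathrm{ref}}_{st}\bigr)
  + \sum_{i} f_{c_i}\bigl(p^{c_i}_{st}\bigr) + v'_{st}\,\gamma
  + \varphi \log(q_{s,t-1}) + \varepsilon_{st},
\qquad \varepsilon_{st} \sim N(0, \sigma^2)
```

When gains and losses enter as two separate variables, the single function $f_2$ would be replaced by one smooth function for each of them.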
A more detailed description of all variables and estimated effects is provided in Table 4 in Sect. 4.1, where we will introduce the data set used in our empirical study.
The unknown smooth nonlinear functions f are modeled via Bayesian P-splines, which were introduced by Lang and Brezger (2004). Later, Brezger and Steiner (2008) proposed an extended Gibbs sampling procedure to additionally accommodate monotonicity constraints, which is economically reasonable and now well-established for modeling price response. Reference price effects or price-change effects are captured via the four options discussed in Sect. 3.1: we use either a single price-difference variable that measures the difference between the current and previous price in absolute terms (abs-diff) or relative terms (rel-diff), or we split up the price-change effect into a gain and a loss effect, measuring the price change again in monetary units (abs-gl) or as a percentage change (rel-gl). We further estimate two nested variants of the semiparametric model in order to assess the impact of considering (price) dynamics on the predictive model performance: a static variant without the reference price term and without the autoregressive part (static), and a simpler dynamic variant without the reference price term but including one-period lagged sales as autoregressive part (dyn-ar).
For the sake of clarity, we omit both the nonparametric cross-price terms and all parametric terms except the autoregressive part (which are common to all model versions) in the following overview, in order to focus on the differences between the various model specifications (see (5) for details):
• static (static), (6): own-price function $f_1(p_{st})$ only;
• dynamic, with autoregressive part, without price dynamics (dyn-ar), (7): $f_1(p_{st})$ plus the autoregressive term in $\log(q_{s,t-1})$;
• dynamic, with autoregressive part, price difference in absolute terms (abs-diff), (8): as (7), plus $f_2(\Delta p_{st})$;
• dynamic, with autoregressive part, price difference in relative terms (rel-diff), (9): as (7), plus $f_2(\Delta^{\mathrm{rel}} p_{st})$;
• dynamic, with autoregressive part, gains and losses in absolute terms (abs-gl), (10): as (7), plus separate functions of the absolute gain and loss variables;
• dynamic, with autoregressive part, gains and losses in relative terms (rel-gl), (11): as (7), plus separate functions of the relative gain and loss variables.
In our empirical study, we additionally compare the semiparametric approach to several parametric sales response models that are described next.

Benchmark models
As benchmark models, we consider two widely used parametric nonlinear models (the exponential or log-linear model, and the multiplicative or log-log model) as well as the simple linear sales response model. Including the three parametric models allows us to further evaluate the potential of flexible regression models for capturing reference price effects and especially to assess possible interactions of accounting for price dynamics and/or functional flexibility with regard to improvements in the accuracy of sales predictions. We did not expect that the linear model would perform very well, since it is nowadays well-established that price response is usually nonlinear for frequently purchased consumer goods, as considered here (see the literature review in Sect. 2). Nevertheless, the linear model represents a natural benchmark model, especially as it constituted the starting approach in the German-language pricing literature for modeling price-change effects (also compare Sect. 2). We eventually moved the empirical results related to the linear model to the Appendix for the interested reader, because it actually performed much worse than the other models in our application, as expected.
In the strictly parametric models, the unknown smooth price functions $f_j$ in (5) (and in (6)-(11), respectively) are replaced by parametric linear effects. The linear and exponential models differ only in the specification of the dependent variable (and consequently in the related autoregressive part), which is unit sales ($q$) in the former case and log unit sales ($\log(q)$) in the latter case. The multiplicative model uses log unit sales as dependent variable and furthermore log-transformations for all price covariates. Accordingly, the three benchmark models can be characterized as follows (omitting, for simplicity, the terms for store intercepts, promotional activities, and seasonality, which are identical across the models, to highlight the differences between the three parametric models):
• linear, (12): unit sales $q_{st}$ as dependent variable, with linear effects of $p_{st}$, $p^{\mathrm{ref}}_{st}$, and $p^{c_i}_{st}$ and an autoregressive term in $q_{s,t-1}$;
• exponential, (13): log unit sales $\log(q_{st})$ as dependent variable, with linear effects of the same price covariates and an autoregressive term in $\log(q_{s,t-1})$;
• multiplicative, (14): log unit sales $\log(q_{st})$ as dependent variable, with linear effects of the log-transformed price covariates and an autoregressive term in $\log(q_{s,t-1})$.
For the linear and exponential models, $p^{\mathrm{ref}}$ represents one of the four different specifications for capturing the price-change or reference price effect (simple price-difference term vs. separate gain and loss variables, measured in absolute vs. relative terms) as in (8)-(11), and $p^{c_i}$ denotes the cross-price terms, as introduced in (5). For the multiplicative model, we use the ratio between the previous price and the current price, $\Delta p_t = p_{t-1}/p_t$, instead of the price difference as equivalent specification for the price-change effect (corresponding to the difference in log prices). As for the price-difference term in (1), negative values of the log price-ratio term correspond to losses and positive values to gains. Note that an operationalization of the reference price term as in (2) is not reasonable for the multiplicative model, resulting in only two specifications for the price-change effect (simple log price ratio, separate log price ratios for gains and losses).
Further note that, as for the semiparametric model, we again estimated two nested variants of each of the three parametric benchmark models: a static model without the reference price term and without the autoregressive part (static), and a simpler dynamic model without the reference price term but including the autoregressive part (dyn-ar), compare Sect. 3.3. Overall, we estimate and validate 22 different models (6 variants each of the linear, exponential, and semiparametric models, as well as 4 variants of the multiplicative model), and all models are estimated within a fully Bayesian framework using the public domain software package BayesX (Brezger et al. 2005). Table 2 summarizes the capabilities of the four types of models to estimate asymmetric or disproportionate price-change effects, depending on the specification of the reference price term(s). If a simple price-difference (or price-ratio) term is used, the linear model only allows a symmetric and proportional price-change effect, the exponential and multiplicative models a symmetric and disproportionate effect, and the semiparametric model an asymmetric and disproportionate effect. If separate gain and loss terms are used, the linear model enables an asymmetric but only proportional price-change effect, the exponential and multiplicative models an asymmetric and disproportionate effect, and the semiparametric model once again an asymmetric and disproportionate effect. The linear model is not in line with prospect theory, which suggests asymmetric and disproportionate price-change effects. The exponential model is able to reproduce disproportionately increasing values of gains and losses (if modeled separately) as suggested, e.g., by the Gutenberg function.
The multiplicative model is still a bit more flexible than the exponential model and is not only able to capture disproportionately increasing gains and losses but also to mimic disproportionately decreasing returns to scale (again if gains and losses are modeled as separate terms), as suggested, e.g., by the logistic function. The semiparametric model can approximate any curvature from data including concave and convex shapes as well as (asymmetrically) s-shaped and reverse s-shaped patterns as provided, e.g., by logistic or Gutenberg functions.
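To make the differences between the three parametric benchmarks concrete, the following sketch evaluates each of them for a single store-week. It is an illustration under simplifying assumptions: the coefficient values are made up, cross-price, promotion, and store terms are left out, the retransformation correction for the log models is ignored, and all function and key names are hypothetical.

```python
import math


def linear(b, p, dp, q_lag):
    """Linear model: unit sales, linear in price, price change, lagged sales."""
    return b["const"] + b["price"] * p + b["ref"] * dp + b["ar"] * q_lag


def exponential(b, p, dp, q_lag):
    """Exponential (log-linear) model: log sales are linear in the prices."""
    return math.exp(b["const"] + b["price"] * p + b["ref"] * dp
                    + b["ar"] * math.log(q_lag))


def multiplicative(b, p, p_prev, q_lag):
    """Multiplicative (log-log) model: log sales are linear in log prices."""
    return math.exp(b["const"] + b["price"] * math.log(p)
                    + b["ref"] * math.log(p_prev / p)
                    + b["ar"] * math.log(q_lag))


b = {"const": 5.0, "price": -1.0, "ref": 0.8, "ar": 0.1}  # made-up values

# Vary only the price-change term while holding the current price fixed.
base = exponential(b, 2.0, 0.0, 100.0)
gain_lift = exponential(b, 2.0, 0.5, 100.0) / base    # multiplier for a gain of 0.50
loss_drop = exponential(b, 2.0, -0.5, 100.0) / base   # multiplier for a loss of 0.50
```

With a single price-difference term, the exponential model is symmetric on the log scale (`gain_lift * loss_drop` equals one), while the implied changes in unit sales are disproportionate; the linear model shifts sales by the same absolute amount in both directions.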

Empirical study
This section describes the data we use in our empirical study, provides technical details on model estimation and model validation, and presents and discusses the corresponding results.

Data
For our empirical analysis, we use scanner data for refrigerated orange juice sold by a large supermarket chain in the Chicago metropolitan area (Dominick's Finer Foods). The data were provided by the James M. Kilts Center of the University of Chicago and contain weekly unit sales of 64 oz. packages for $m \in \{1, \ldots, M = 8\}$ brands that can be divided into three price-quality tiers: the premium brand tier with two brands, the national brand tier with five brands, and the private label brand tier represented by the supermarket's own store brand. The data were collected in $s \in \{1, \ldots, S = 81\}$ stores of the supermarket chain and cover a time 75,88] . Descriptive statistics for weekly brand prices, market shares, unit sales, and the share of weeks with a different store-specific price compared to the previous week (referred to as "price changes") are displayed in Table 3.
For a more parsimonious model specification, we capture cross-price effects at the tier level by using the sales-weighted mean price across the competing brands belonging to a considered price-quality tier per store and week (see, e.g., Kopalle et al. 1999). The three cross-price variables are denoted as p (prem) , p (nat) , and p (priv) in the following, referring to the premium, national, and store brand tiers, and they are captured in (5) by the cross-price terms with index c i ∈ {(prem), (nat), (priv)} . Since there is only one store brand (the retailer's own brand), sales response models for this brand include only the two competitive price variables p (prem) and p (nat) . Note that using this more parsimonious specification of cross-price effects still allows the estimation of cross-effects across tiers, even if the information about prices of competing brands within a tier per store and week is concentrated as a sales-weighted average of the corresponding individual prices. This is important as previous research has shown that cross-price effects across tiers can be substantial; especially if higher-tier brands are temporarily reduced in price during a promotion, it can be expected that they can steal sales from lower-tier brands. For the dynamic models, we additionally need the lagged price (i.e., the price of the previous period, p t−1 ) to model price-change effects. For this reason, we eventually dropped the first week in the data for the estimation of all models (including the static model versions) to preserve comparability across the estimation results. Note that the share of weeks with a different store-specific price compared to the previous week ranges between 39.3% for Florida Natural and 56.7% for Tropicana.
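The tier-level cross-price variables can be illustrated with a small helper for one store-week. This is a sketch only: it assumes that the weights are the same week's unit sales of the competing brands and that the focal brand is excluded from its own tier; the brand labels and the function name are hypothetical.

```python
def tier_price(prices, sales, tier_brands, own_brand):
    """Sales-weighted mean price of the competing brands in one tier."""
    rivals = [b for b in tier_brands if b != own_brand]
    total = sum(sales[b] for b in rivals)
    if total == 0:
        return None  # assumption: undefined without competing sales in the tier
    return sum(prices[b] * sales[b] for b in rivals) / total


prices = {"A": 2.0, "B": 3.0, "C": 1.0}   # made-up prices for one store-week
sales = {"A": 10, "B": 30, "C": 5}        # made-up unit sales
cross_price = tier_price(prices, sales, ["A", "B"], own_brand="C")
```

For a tier that contains only the focal brand, as for the store brand, no competing brand remains and the helper returns `None`, which mirrors the omission of the private label cross-price variable in the models for that brand.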
The data further provide information on the use of displays and odd prices by the retailer, which we include in our models as control variables together with an indicator variable for a holiday in the current week to capture seasonality. 3 Table 4 provides a summary of all variables included in our models as well as an overview of the related effects estimated in the (most complex) semiparametric models as an example. Eilers and Marx (1996), who originally introduced the P-spline approach into the statistical literature, recommended using between 20 and 40 equidistant knots within the range of observed levels of an independent variable of interest. That way, sufficient flexibility for the spline should be guaranteed (i.e., not fewer than 20 knots) and at the same time overfitting can be avoided (i.e., not more than 40 knots). For our empirical study, we use 20 knots, which is also in line with Lang and Brezger (2004) and represents the default setting in the BayesX software. Strictly speaking, we use 20 knots for estimating all contemporaneous own- and cross-price effects. However, in order to provide a fair model comparison, we use only 10 knots each for estimating gain and loss effects in the models with two separate price-change terms (cf. (10) and (11)), but 20 knots for estimating the price-change effect in the models that use only one single price-difference (price-ratio) term (cf. (8) and (9)). This is reasonable since the gain and loss terms each capture only one branch of the value function and therefore each cover only about half of the data range of the price-difference term.
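The role of the knots can be illustrated by building a B-spline basis on equidistant knots together with a difference penalty on adjacent coefficients, the two ingredients of a P-spline. The sketch below is not the BayesX implementation: the cubic degree and the second-order difference penalty are assumptions made here (the text only fixes the number of knots), and the function names are hypothetical.

```python
def bspline_basis(x, lo, hi, n_knots=20, degree=3):
    """All B-spline basis functions at x for equidistant knots on [lo, hi]."""
    step = (hi - lo) / (n_knots - 1)
    # Extend the knot sequence by `degree` extra knots on each side.
    knots = [lo + (i - degree) * step for i in range(n_knots + 2 * degree)]
    # Degree 0: indicator of the knot interval that contains x.
    b = [1.0 if knots[i] <= x < knots[i + 1] else 0.0
         for i in range(len(knots) - 1)]
    # Cox-de Boor recursion up to the requested degree.
    for d in range(1, degree + 1):
        b = [(x - knots[i]) / (knots[i + d] - knots[i]) * b[i]
             + (knots[i + d + 1] - x) / (knots[i + d + 1] - knots[i + 1]) * b[i + 1]
             for i in range(len(b) - 1)]
    return b  # n_knots + degree - 1 values


def second_diff_penalty(n_basis):
    """Penalty matrix K = D'D for second-order differences of coefficients."""
    rows = range(n_basis - 2)
    D = [[0.0] * n_basis for _ in rows]
    for r in rows:
        D[r][r], D[r][r + 1], D[r][r + 2] = 1.0, -2.0, 1.0
    return [[sum(D[r][i] * D[r][j] for r in rows) for j in range(n_basis)]
            for i in range(n_basis)]


basis_20 = bspline_basis(2.0, 1.0, 3.0)              # 20 knots, as for price effects
basis_10 = bspline_basis(0.3, 0.0, 1.0, n_knots=10)  # 10 knots, as for gain or loss terms
K = second_diff_penalty(len(basis_20))
```

Under these assumptions, 20 knots give 22 basis functions and 10 knots give 12, so a model with separate gain and loss terms spends roughly as many coefficients on the price-change effect as a model with a single price-difference term.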

Table 4 Summary of variables used in our models and overview of estimated effects in the semiparametric reference price model as specified in (5) (columns: Variable, Effect, Description; first row: store-specific random intercept for store s)

All models, the static and dynamic ones as well as the semiparametric and parametric ones, are estimated with BayesX using a Gibbs sampler to draw from the posterior distribution. We use a total of 12,000 iterations, with a burn-in period of 2000 iterations and a thinning value of 10 to minimize the autocorrelation of the samples, i.e., we finally saved D = 1000 draws from the Markov chain. To account for parameter uncertainty, model performance (see (16)-(19) below and Sect. 4.3.3) is assessed using the individual parameter (Gibbs) draws instead of the posterior means of estimated parameters (e.g., Montgomery 1997; Hruschka 2006b; Lang et al. 2015). That is, predictions $\hat{\eta}_{st}$ for (log) unit sales ($\log(q_{st})$ and $q_{st}$, respectively) are calculated as the mean across the 1000 draw-based predictions $\hat{\eta}_{st,d}$:

$$\hat{\eta}_{st} = \frac{1}{D} \sum_{d=1}^{D} \hat{\eta}_{st,d} \tag{15}$$

Since we are interested in predictions for a brand's unit sales (instead of log unit sales), and especially in order to compare the performance of models estimated in the log sales space (exponential, multiplicative, and semiparametric model) with models estimated in the sales space (linear model), conditional mean predictions for unit sales are computed for the exponential, multiplicative, and semiparametric sales response models via $\hat{q}_{st} = \exp(\hat{\eta}_{st} + \hat{\sigma}^2/2)$ (see, e.g., Greene 2008, p. 100). For the linear models, $\hat{q}_{st} = \hat{\eta}_{st}$.
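The two prediction steps described above, averaging over the saved draws and retransforming log predictions to unit sales, can be sketched as follows. The function names are hypothetical, and treating the estimated error variance as a single plug-in value is an assumption of this sketch.

```python
import math


def draw_based_prediction(draw_predictions):
    """Mean of the draw-specific predictions for one store-week."""
    return sum(draw_predictions) / len(draw_predictions)


def unit_sales_from_log(log_prediction, sigma2_hat):
    """Conditional mean of unit sales when log unit sales are Gaussian."""
    return math.exp(log_prediction + sigma2_hat / 2.0)


log_pred = draw_based_prediction([4.0, 4.2, 4.4])   # three made-up draws instead of 1000
sales_pred = unit_sales_from_log(log_pred, sigma2_hat=0.09)
naive_pred = math.exp(log_pred)                     # ignores the variance term
```

The retransformed prediction always exceeds the naive exponentiated one when the estimated variance is positive, which is why the correction matters when log-space models are compared with the linear model in the sales space.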
We compare the different models with regard to their prediction accuracy by using two error measures: the Root Mean Squared Sales Prediction Error (RMSE; see, e.g., van Heerde et al. 2001) and the Root Median Squared Sales Prediction Error (RMedSE; see, e.g., Franses and Ghijsels 1999), i.e., the square root of the mean and of the median, respectively, of the squared differences between observed and predicted unit sales. In particular, we compute the Average Root Mean or Median Squared Sales Prediction Error (ARMSE/ARMedSE) in holdout samples based on a C-fold cross-validation with C = 10 folds. That is, we randomly split the total sample of observations for a brand into 10 folds, use each time C − 1 = 9 parts of the sample for model estimation, calculate the RMSE or RMedSE for the remaining part (holdout), and finally average over the 10 individual RMSE/RMedSE values. Note that all (lagged) prices are assumed to be always known both during model estimation and model validation, i.e., even if observations of a certain previous period are not explicitly part of the respective estimation or holdout sample. We use the (A)RMedSE measure in addition to the more widespread (A)RMSE measure in order to correct for the possibility of huge misses due to outliers in holdout samples (Franses and Ghijsels 1999): suppose that, due to the random split of the sample, the range of values for one of the price variables in one of the holdout exercises is larger than the corresponding range of levels across the other folds used for model estimation. In this case, sales forecasting for holdout observations at price levels in domains not covered by the estimation sample becomes an extrapolation. Unlike parametric functions, whose shape is globally affected or determined by one or only a few parameters, (P-)splines fit the data locally, which gives them their high flexibility to capture more complex shapes.
This local fitting, however, at the same time makes splines or any other nonparametric regression technique more sensitive at the boundaries of the data range, since predictions outside the data range would be guided only by the nearest domain of the spline. As a consequence, it can happen that extrapolated sales predictions for 'new' price levels turn out exorbitantly high or low if the spline is very steep at the boundaries of the data range. Using the RMedSE as measure for the cross-validation procedure guards against this 'extrapolation problem' (as opposed to the RMSE). Note that extrapolation is generally not recommendable per se, but could theoretically appear here due to the random sample split. Since the 'extrapolation problem' did occur in very few instances when computing the RMSE measure, we removed the corresponding observations (21 observations, as a rule representing isolated, exceptionally low prices or high gains) from the data to preserve the comparability both between the different model specifications and between the two performance measures, leaving a total of 54,841 observations for model estimation. 4
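The validation procedure can be sketched with the standard library alone. The code below is an illustration of the two error measures and of the averaging over folds; `fit_predict` is a hypothetical callback that stands in for estimating a model on the training indices and predicting the holdout indices, and the way the folds are formed is an assumption.

```python
import math
import random
import statistics


def rmse(actual, predicted):
    """Root mean squared prediction error."""
    return math.sqrt(statistics.fmean([(a - p) ** 2 for a, p in zip(actual, predicted)]))


def rmedse(actual, predicted):
    """Root median squared prediction error (robust to a few huge misses)."""
    return math.sqrt(statistics.median([(a - p) ** 2 for a, p in zip(actual, predicted)]))


def cross_validate(actual, fit_predict, folds=10, seed=0):
    """Average holdout RMSE and RMedSE over a random split into `folds` parts."""
    idx = list(range(len(actual)))
    random.Random(seed).shuffle(idx)
    rmses, rmedses = [], []
    for c in range(folds):
        hold = idx[c::folds]
        hold_set = set(hold)
        train = [i for i in idx if i not in hold_set]
        preds = fit_predict(train, hold)      # hypothetical: fit on train, predict holdout
        obs = [actual[i] for i in hold]
        rmses.append(rmse(obs, preds))
        rmedses.append(rmedse(obs, preds))
    return statistics.fmean(rmses), statistics.fmean(rmedses)


# One extreme miss dominates the RMSE but leaves the RMedSE untouched.
print(rmse([1, 2, 3], [1, 2, 5]), rmedse([1, 2, 3], [1, 2, 5]))
```

The small example at the end shows why the median-based measure is less affected by isolated extrapolation errors than the mean-based one.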

Estimated effects
In the following, we first illustrate our estimation results for the semiparametric model using the brand "Citrus Hill" as an example. Figure 2 shows plots of the abs-diff model given in (8), i.e., using absolute differences for estimating the price-change effect. Depicted are the estimated mean effects for the price variables and the lagged sales variable including 95% pointwise credible intervals as well as partial residuals (e.g., Fahrmeir et al. 2013, p. 77), and the estimated effects for own-display use, tier-specific cross-displays, 9- and 99-ending prices, and the holiday covariate. Note that estimated median effects (blue lines, almost always hidden) coincide with the mean effects (red lines) for all price variables and the lagged sales variable.
First, the estimated effects and effect sizes show face validity. The own-price effect turns out much stronger than any of the cross-price effects. Since "Citrus Hill" is a national brand, it could further be expected that its unit sales are more strongly affected by brands of both the premium and national brand tier than by the store brand (as becomes evident from the nearly flat cross-tier price effect with respect to the store brand, see middle-right panel). Moreover, the own-price effect shows a threshold effect near the price of \$2.00, and all price effects have very tight confidence bands. 99-ending prices have a much larger effect size than other prices ending in 9, and a holiday in a week leads to a significant decrease in the brand's unit sales in this week. Neither the own-display effect nor the cross-display effect of the competing national brands is significant, as the respective credible intervals include zero.

Fig. 2 Estimation results for the semiparametric abs-diff model using the brand "Citrus Hill" as example: estimated effects and partial residuals for price and lagged sales variables including 95% pointwise credible intervals (red lines, gray-shaded credible intervals), as well as estimated effects for display, price ending, and holiday covariates including error bars
The more interesting part is the estimated dynamic price effect, displayed in the top-middle panel of Fig. 2. Note that positive values of Δp t correspond to gains and negative values to losses (compare (1)). We observe that the loss part of the spline is rather flat (except for very large losses), while the gain part is steeply increasing over the whole price difference range to the right of the zero-point. That is, a higher current price compared to the price of the previous week shows only a small effect, while price cuts strongly stimulate sales. Consequently, customers seem to value gains more than losses for "Citrus Hill", which contradicts the assumption of loss aversion as suggested by prospect theory (see Sect. 3.1).
Finally, the autoregressive model part displayed in the top-right panel shows a significant positive effect of one-period lagged sales on the sales of the current period (autoregressive parameter estimate of 0.11, p < 0.05) and suggests a (moderate) customer holdover effect rather than stockpiling across the aggregate of consumers for "Citrus Hill". Note that this lagged sales effect turns out small in comparison to the own-price and reference price effects. Recall that one would expect a significant negative parameter estimate for lagged sales in case of distinct stockpiling across (parts of) consumers. 5

Fig. 3 Estimated price-change effects from the exponential (left panels), multiplicative (top-middle panel), and semiparametric models (right panels) for the brand "Citrus Hill", capturing the price difference in absolute terms (abs-diff, top panels) or relative terms (rel-diff, bottom panels). See Fig. 6 in the Appendix for a variant relating to the 'sales space'

In Fig. 3, we focus on the dynamic model part and compare the estimated price-change effects (1) for the exponential (left panels), multiplicative (top-middle panel), and semiparametric response models (right panels) and (2) for the two options to capture the price difference $\Delta p_{st}$, either in absolute monetary units (top panels, abs-diff) or as a percentage change (bottom panels, rel-diff). Recall that we did not estimate a rel-diff version of the multiplicative model (see Sect. 3.4). Independent of the specification of the price difference in absolute or relative terms, the advantage of using a flexible regression approach becomes obvious: the spline clearly fits the data much better than the two parametric models.
Without loss of generality, the effect plots refer to the space where the models were estimated, i.e., the log sales space for all these models. Note that this implies that estimated price effects for the exponential (or log-linear) and multiplicative (or log-log) models turn out linear in the log space but exponential in the sales space. To illustrate this, we additionally plotted the estimated price-change effects for all three types of models in the sales space, see Fig. 6 in the Appendix. Here, we observe that the exponential and multiplicative models tend to a convex shape for absolute differences (with a slightly better fit of the multiplicative model for high gain values), but that both parametric models are far too inflexible to capture the strong kink for the gain effect inherent to the data near the upper bound of the price-difference range. The price-change effect is determined by only one parameter estimate in both the exponential and the multiplicative model, which makes them rather inflexible compared to the spline model, at least for this kind of price variable (the price-difference term comprises negative and positive values for losses and gains, respectively). Differences in estimated effects between the abs-diff and rel-diff models are small, except for the spline model, where the strong kink for gains is even more distinct if the price difference is captured in relative terms (compare the right bottom panels in Figs. 3 and 6).

Fig. 4 Estimated price-change effects from the exponential (left panels), multiplicative (middle panels), and semiparametric models (right panels) for the brand "Citrus Hill", capturing gain and loss effects in absolute monetary units (abs-gl) with two separate terms. See Fig. 7 for a variant relating to the 'sales space' and Fig. 8 for a variant with gains and losses defined in percentage terms (please find both figures in the Appendix)
So far, using the flexible spline approach, we could detect a large difference between the impact of gains and losses on the unit sales of "Citrus Hill", although gain and loss effects were not modeled separately. Figure 4 now displays the estimated price-change effects for the exponential (left panels), multiplicative (middle panels), and semiparametric 6 response models (right panels) when separate gain and loss terms are used and price differences are measured in absolute monetary units (abs-gl models, cf. (10)). Plots of the gain and loss effects in the sales space are again provided for all three types of models, see Fig. 7 in the Appendix. As expected, the two parametric models are now better able to separate the existing differences between gain and loss effects: the loss effects now turn out very flat, as was obvious from the semiparametric abs-diff and rel-diff models before (cf. the right panels in Figs. 3 and 6). In contrast, due to the greater flexibility of the spline model, differences between modeling the price-change effect via one single price-difference term and via separate gain and loss effects are small. Although the convex gain effect is now much better represented by the exponential model, the model is still too inflexible to capture the strong kink for large perceived savings (compare the top-middle and top-right panels in Fig. 7). The multiplicative model does a much better job at the upper bound of the observed range for gains and fits high perceived savings well, but is not flexible enough to adequately capture the more mid-sized gains observed in the range around 0.6 (compare the top-middle panels in Figs. 4 and 7). We will show in Sect. 4.3.3, where the models are compared for all brands with regard to their predictive performance, that once price effects are modeled flexibly using the spline approach, improvements in predictive accuracy from treating gain and loss effects with two separate nonparametric terms are at best very small or not achievable at all (compared to using a single price-difference term). We here abstain from discussing the estimation results for "Citrus Hill" obtained from the rel-gl models, i.e., when gains and losses are defined in percentage terms, cf. (11), since the plots differ only marginally from those for the abs-gl models in Figs. 4 and 7. See Fig. 8 in the Appendix for a variant of Fig. 4 with gains and losses defined in percentage terms.

Figure 5 shows plots for the estimated price-change effect obtained from the flexible rel-diff models (see (9)) for all eight orange juice brands: the two premium brands "Florida Natural" and "Tropicana Pure" (top row), the five national brands "Citrus Hill", "Florida Gold", "Minute Maid", "Tree Fresh", and "Tropicana" (middle rows), and "Dominick's" own store brand (bottom row). Interestingly, although the effect sizes for the gain effect differ (in parts largely) between brands, we get a uniform picture for all brands similar to that for the brand "Citrus Hill": gain and loss effects are asymmetric, gain effects turn out (much) larger than loss effects, and loss effects are rather flat. Virtually no loss effects exist for the premium brand "Tropicana Pure" and for the national brand "Tree Fresh", while loss effects increase only very moderately for the other brands in case of larger perceived losses.
Also note that the data are highly sparse at the lower bound of $\Delta^{\mathrm{rel}} p_{st}$ for "Florida Natural" and "Minute Maid", as indicated by the wider confidence bands. The increasing loss effect in the domain of very large perceived losses should therefore be interpreted with caution for these brands. Gain effects for the brands "Florida Natural", "Citrus Hill", and "Tropicana" are highly nonlinear (disproportionate) and increase steeply, and only the gain effects of "Florida Gold" and "Minute Maid" show decreasing returns to scale, as suggested by prospect theory. Further, gain effects are very differently shaped across brands such that they can hardly be captured adequately with a (single) parametric function.

Fig. 5 Estimated price-change effects from the flexible rel-diff model for all eight brands in the refrigerated orange juice category, i.e., capturing the price differences as relative percentage changes ($\Delta^{\mathrm{rel}} p_{st}$) using a single price-difference term
Overall, consumers generally seem to weigh losses much less than gains (or not at all) in the refrigerated orange juice category. In other words, if prices are increased, only the contemporaneous price effect decreases sales, while if prices are decreased, an additional gain effect exists. In addition, the positive lagged sales effect observed for all eight orange juice brands, independent of the specification used for capturing the reference price effect (abs-diff, rel-diff, abs-gl, rel-gl), speaks in favor of a customer holdover effect rather than distinct stockpiling of (at least some) consumers in promotional weeks. In particular, the lagged sales effect obtained from the semiparametric models is consistently significantly positive for all brands except "Minute Maid", with an estimate for the autoregressive component ( ) in the range between 0.03 and 0.20 (and positive between 0.007 and 0.03 for "Minute Maid" but not significant in most cases). Notably, the estimated lagged sales effect differs only marginally between the four different dynamic model variants at the individual brand level. Note that the lagged sales effect turns out rather flat for most brands compared to both the estimated own and reference price effects, 7 which is why we continued to model the autoregressive part parametrically.
Our two findings of (1) gain-seeking behavior rather than loss aversion of consumers (in contrast to what prospect theory postulates) and (2) a moderate customer holdover effect rather than a stockpiling effect are not necessarily as expected. One possible explanation could be that (some) consumers may not only buy more orange juice in weeks with perceived price savings but may also consume a larger amount of orange juice in these weeks, and that some of them continue to repurchase the last brand bought in the next period. This explanation seems supported by the plausible fact that the customer holdover effect is larger for the two premium brands than for most national brands (except for "Tree Fresh", where the estimate is similarly high at around 0.20), and low for "Dominick's", the retailer's own store brand (around 0.05). In addition, refrigerated orange juice is a perishable product and therefore less stockable. We further checked our autoregressive models for possible collinearity problems, which might have occurred if the previous own price ( p s;t−1 ) and previous sales ( q s;t−1 ) of the considered brand had been highly correlated, and which might have led to wrong signs for estimated effects in case of high collinearity. Note first, however, that using reference price terms (which include the previous price) in general reduces correlations compared to including the previous own price per se as a variable. Absolute pairwise correlations between one-period lagged log unit sales on the one hand and the simple price difference, gains, or losses (measured in absolute terms, respectively) on the other hand range, for example, between 0.2 and 0.4, 0.1 and 0.2, or 0.2 and 0.5 across brands, and are therefore not critical.
Furthermore, variance inflation factors for lagged log unit sales and the simple price difference (measured in absolute terms) are always lower than 2 and 3.5, respectively, across brands and hence lie far below the critical value of 10; the latter would indicate serious multicollinearity problems (also compare Fig. 2, which refers to exactly this specification of variables in the semiparametric abs-diff model).
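To make the collinearity check more concrete, the following minimal sketch uses the standard definition of the variance inflation factor, VIF_j = 1 / (1 − R_j²), where R_j² is the coefficient of determination from regressing predictor j on all other predictors. The function name is hypothetical, and the pairwise values are only a simplified illustration (for two predictors, R_j² equals the squared pairwise correlation); the VIFs reported above stem from the full set of covariates and are therefore larger.

```python
def vif_from_r_squared(r_squared):
    """Variance inflation factor for a predictor whose regression on the
    remaining predictors yields the given R^2 (standard definition)."""
    if not 0.0 <= r_squared < 1.0:
        raise ValueError("R^2 must lie in [0, 1)")
    return 1.0 / (1.0 - r_squared)


# Illustration only: pairwise correlations of the magnitude reported in the text.
# With just two predictors, R^2 is the squared pairwise correlation.
pairwise_vif = {r: vif_from_r_squared(r ** 2) for r in (0.2, 0.4, 0.5)}

# The critical value VIF = 10 corresponds to R^2 = 0.9.
vif_at_threshold = vif_from_r_squared(0.9)
```

Under this simplification, even the largest pairwise correlation of 0.5 implies a VIF of only about 1.33.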

Price elasticities
From a managerial perspective, price elasticities are of interest as well. Table 5 shows estimated own-price elasticities as examples for the static and the two dynamic abs-diff and abs-gl variants of the exponential, multiplicative, and semiparametric models for all eight brands. Computation of own-price elasticities is straightforward for the static model versions (including the semiparametric model), but more difficult for the dynamic model specifications. For the latter, price elasticities depend on both the own-price term and the parameter(s) of the reference price term(s); details are provided in Appendix B.
We determined (weighted) mean price elasticities for the exponential and semiparametric models by evaluating the elasticity at every observed price p st to explicitly account for the whole price distribution in the data, instead of computing price elasticities only at the average price level of a brand. In addition to mean price elasticities for the full span of price levels, we further divided the price range of the brands into three subranges and report mean price elasticities for low, medium, and high price levels (local elasticities) to gain deeper insights into the price elasticity structures and to uncover aggregation biases. Note that price elasticities for the multiplicative dynamic model versions also do not depend on the price level and therefore continue to be constant when considering the entire price range, but may still differ across the three subranges due to different numbers of price changes (abs-diff model) or gains and losses (abs-gl models) within each of the three subranges.
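The averaging scheme can be sketched for the simplest case, the static exponential model. The sketch assumes a log-linear specification ln(q) = a + beta * p, for which the own-price elasticity at price p is beta * p. The coefficient, the prices, and the split of the price range into three equal-width subranges are hypothetical choices for illustration, not values or definitions taken from the study.

```python
from statistics import mean


def exponential_elasticity(beta, price):
    """Own-price elasticity of a static exponential model ln(q) = a + beta * p."""
    return beta * price


def mean_elasticities(prices, beta):
    """Average the pointwise elasticity over all observed prices (full range)
    and within three equal-width price subranges (low, medium, high)."""
    lowest, highest = min(prices), max(prices)
    width = (highest - lowest) / 3.0
    buckets = {"low": [], "medium": [], "high": []}
    for p in prices:
        e = exponential_elasticity(beta, p)
        if p <= lowest + width:
            buckets["low"].append(e)
        elif p <= lowest + 2 * width:
            buckets["medium"].append(e)
        else:
            buckets["high"].append(e)
    result = {"full range": mean(exponential_elasticity(beta, p) for p in prices)}
    for name, values in buckets.items():
        result[name] = mean(values) if values else float("nan")
    return result


# Hypothetical weekly prices (in $) and a hypothetical price coefficient.
prices = [1.50, 1.60, 1.99, 2.20, 2.49, 2.50, 2.99, 3.00]
beta = -1.2
elasticities = mean_elasticities(prices, beta)
```

Because every observation enters the average, frequently observed price levels automatically receive more weight, which is what distinguishes this mean from an elasticity evaluated at the average price only.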
As can be seen from Table 5, mean own-price elasticities (see columns full range) decrease from the static through the abs-diff to the abs-gl model for almost every brand and model type. This does not always hold at the more disaggregate level within the medium and high price subranges (except for "Florida Natural"), and for lower prices we even predominantly observe the reverse pattern. This underlines that it might not be enough to pay attention to global elasticities only. Further, as a rule, the discrepancies in estimated price elasticities between the abs-diff and abs-gl models are much larger for the two parametric models, while only moderate or even negligible for the semiparametric model (both for the full price range and within the three subranges). The latter could be expected due to the capability of the semiparametric model to identify asymmetric and disproportionate price-change effects even if a single price-difference term is used (as in the abs-diff model version). For some brands, the semiparametric model suggests either much higher (e.g., "Florida Natural", "Florida Gold") or noticeably lower price elasticities (e.g., "Tree Fresh") on the fully aggregated level (see columns full range) and independent of the model specification (static, abs-diff, abs-gl). For some other brands, elasticities from the parametric models closely approach those of the semiparametric model provided that gains and losses were modeled separately in the parametric models (e.g., "Tropicana Pure", "Citrus Hill"). But even in the latter case, there are noticeable differences in the local elasticities between the parametric and semiparametric models. Differences in estimated own-price elasticities between the two parametric models and the semiparametric one are particularly large for low prices of the brands "Florida Gold" and "Tree Fresh" and high prices of "Dominick's".
For "Florida Gold" ("Tree Fresh"), the semiparametric model suggests a much lower (higher) price sensitivity in the low price range, and for "Dominick's" a much higher price sensitivity in the high price range compared to each of the two parametric models, respectively. Altogether, the semiparametric model leads to different managerial insights with respect to own-price elasticities in many cases.

Predictive performance
As described in Sect. 4.2, the predictive performance of the 22 models is evaluated in terms of the Average Root Median Squared Sales Prediction Error ( ARMedSE ) and the Average Root Mean Squared Sales Prediction Error ( ARMSE ) in holdout samples (based on a 10-fold cross-validation procedure). Table 6 summarizes the ARMedSE values for the exponential, multiplicative, and semiparametric models and for all brands. Further provided are the relative improvements in ARMedSE values of each of the dynamic models over the static model by model type. The corresponding predictive validity results when using the ARMSE measure instead of ARMedSE as well as the corresponding results for the much worse performing linear models can be found in the Appendix (Tables 8, 9 and 10).
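The two measures can be sketched as follows. The exact definitions are given in Sect. 4.2; this sketch assumes that, within each holdout fold, the root of the median (or mean) squared prediction error is computed and that these values are then averaged across folds. The function names and the toy data are hypothetical, and only two folds are used instead of ten to keep the example short.

```python
from math import sqrt
from statistics import mean, median


def squared_errors(actual, predicted):
    """Squared sales prediction errors within one holdout fold."""
    return [(a - p) ** 2 for a, p in zip(actual, predicted)]


def armedse(folds):
    """Average over folds of the root median squared prediction error."""
    return mean(sqrt(median(squared_errors(a, p))) for a, p in folds)


def armse(folds):
    """Average over folds of the root mean squared prediction error."""
    return mean(sqrt(mean(squared_errors(a, p))) for a, p in folds)


def relative_change(model_error, static_error):
    """Percentage change relative to the static model (negative = improvement)."""
    return 100.0 * (model_error - static_error) / static_error


# Hypothetical holdout folds: (actual unit sales, predicted unit sales).
folds = [
    ([100, 120, 80, 200], [110, 115, 90, 150]),
    ([50, 60, 70], [53, 56, 70]),
]
armedse_value = armedse(folds)
armse_value = armse(folds)
```

In the first toy fold, one large miss (200 versus 150) inflates the mean-based measure but leaves the median-based one unchanged, which illustrates why the two measures can rank models differently.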
From Table 6, we can at first observe that extending the static models by including only the lagged sales variable (dyn-ar) hardly improves or even decreases the predictive performance for all brands except "Tree Fresh", independent of the type of model. For "Tree Fresh", improvements in ARMedSE over the static model are still only very moderate ranging between −1.9 % for the semiparametric model and −3.6 % for the exponential model. This resembles our findings on the autoregressive effects for the semiparametric models, where the lagged sales effect turned out rather flat compared to the estimated own and reference price effects across brands (see the discussion at the end of Sect. 4.3.1).
Concerning the exponential sales response model, using a single price-difference term to capture the reference price effect (abs-diff, rel-diff) is clearly and consistently inferior to separating gain and loss effects with two individual price terms (abs-gl, rel-gl). The largest improvements in ARMedSE over the static model when using a single price-difference term are obtained for the brands "Citrus Hill" (−11 %, abs-diff), "Florida Natural" (−9 %, abs-diff), and "Tree Fresh" (−8 %, abs-diff), whereas the predictive performance even worsens for the brand "Dominick's" (+0.6 %, rel-diff). In contrast, predictive accuracy for all brands greatly benefits from considering price dynamics via separate gain and loss terms, with improvements in ARMedSE between −9 % for "Florida Gold" as well as "Minute Maid" (abs-gl) and −24 % for "Tropicana Pure" and "Citrus Hill" (rel-gl). Here, except for "Florida Natural", the specification of gains and losses in relative terms (rel-gl) turns out to be at least as good as or even superior to measuring the price difference in monetary units (abs-gl) for improving the predictive model performance. Taking a look at the results for the linear sales response model in Table 9, we find that the static version of the exponential model (i.e., not addressing price or other dynamics at all) already outperforms the best dynamic linear specification. This suggests that the linear sales response model is highly misspecified since it is not able to accommodate the expected nonlinearities in price response for frequently purchased consumer goods, like orange juice.
For the multiplicative models, we see a development in predictive model performance parallel to that of the exponential response models: modeling the reference price effect with a single price-difference term (abs-diff) is much less helpful or again even decreases predictive validity for some brands over the static multiplicative model compared to the use of separate gain and loss variables (abs-gl). Improvements in predictive validity from the latter dynamic model (abs-gl) range between −7 % for "Florida Gold" and −21 % for "Tropicana Pure". Note that although the static multiplicative models clearly outperform their exponential counterparts, differences in ARMedSE values between the best dynamic exponential and multiplicative models are very small or even marginal, which is reflected by the fact that the multiplicative (exponential) model predicts better for five (three) brands. Neither of the two parametric model types is therefore superior when price dynamics are accommodated; both perform similarly well.
The following findings are obtained for the semiparametric sales response model with flexibly estimated price effects. First, the static semiparametric model (i.e., ignoring price and other dynamics) always provides more accurate sales predictions than the best dynamic exponential or multiplicative model versions capturing the price-change effect via a single reference price term (abs-diff and rel-diff). Improvements in ARMedSE from accommodating functional flexibility range up to −11 % for the brand "Tree Fresh" here (percentages not displayed in the table). For "Tree Fresh", the static flexible model even outperforms each of the dynamic exponential and multiplicative model versions (i.e., including the abs-gl and rel-gl models). Second, adding price dynamics further improves the predictive performance of the flexible approach, but in contrast to the exponential and multiplicative models, all four dynamic specifications (abs-diff, rel-diff, abs-gl, rel-gl) perform very similarly. Again, once price effects are accommodated flexibly, it does not seem to make a great difference whether the price-change effect is captured by one single price-difference term or two separate variables for gains and losses (as was already evident from our discussions of the estimated effects in Sect. 4.3.1 and price elasticities in Sect. 4.3.2).

Table 6 (note): Best models per brand and type of model (i.e., exponential, multiplicative, and semiparametric) are marked in bold.
Finally, the last rows in Table 6 contrast the best flexible model with the best exponential and multiplicative models. Accordingly, improvements in predictive accuracy from semiparametric instead of nonlinear parametric modeling of price effects (own-price, cross-price, and price-change effects) lie between −3 % and −11 % (exponential model) or −3 % and −12 % (multiplicative model) for seven out of eight brands, with "Tree Fresh" and "Tropicana" benefiting most from accommodating functional flexibility, respectively. For "Tropicana Pure", the exponential and multiplicative models with separate variables for gains and losses already do a good job and semiparametric modeling does not pay off here. The latter is important to note, since nonparametric techniques are only more powerful if nonlinearities (here nonlinear effects in price response) are too complex to be captured by parametric nonlinear models.
The results for the second predictive performance measure, the Average Root Mean Squared Sales Prediction Error ( ARMSE ), closely resemble those for the ARMedSE measure in many aspects, which is why we have put the corresponding results in the Appendix (see Table 8). First, including the lagged sales variable but no price dynamics (dyn-ar) only marginally improves or even worsens the predictive performance compared to the static model for all brands except "Dominick's", independent of the type of model. For "Dominick's", improvements are moderate, ranging between −3 % and −4 % across model types. Second, the best (dynamic) exponential and multiplicative models again perform similarly well, and no recommendation can be made in favor of one or the other. The multiplicative (exponential) model performs somewhat better for five (three) out of the eight brands. Third, relative improvements in ARMSE for the best semiparametric models over the best exponential (multiplicative) models are similar to those for ARMedSE and lie between −3 % and −15 % (−3 % and −14 %) for seven out of eight brands, with "Tree Fresh" and "Tropicana" as before benefiting most from addressing functional flexibility. And fourth, once price effects are modeled flexibly, the four different dynamic model specifications (abs-diff, rel-diff, abs-gl, rel-gl) come very close in their prediction accuracy. On the other hand, there are some noteworthy differences in the patterns of the ARMSE versus the ARMedSE results. For the two parametric models, using a single price-difference term to capture the reference price effect (abs-diff, rel-diff) is no longer consistently inferior to separating gain and loss effects with two individual price terms (abs-gl, rel-gl), as can be seen for the two brands "Tropicana" and "Dominick's".
For "Minute Maid", accommodating price dynamics does not pay off at all when measured by ARMSE (as opposed to ARMedSE ), independent of the type of model (exponential, multiplicative, semiparametric) and the specification of the reference price term (abs-diff, rel-diff, abs-gl, rel-gl). For "Tree Fresh", the improvements from the semiparametric model over the two parametric models are large (−14 % compared to the exponential model, −10 % compared to the multiplicative model), but once price effects are modeled flexibly, adding price dynamics does not provide further benefits. And finally, semiparametric modeling does not pay off at all for the premium brand "Florida Natural" here.
In total, for parametric models (including the linear model) the picture is clearer when using ARMedSE as the measure of predictive accuracy. Here, the predictive performance always strongly benefited from capturing gains and losses with two separate covariates compared to a simple price-difference term. This clear implication in favor of separating gains and losses did not hold for all brands if ARMSE was employed to assess the predictive performance. Moreover, once price effects were modeled flexibly, adding a reference price term always further improved ARMedSE , while this did not apply to all brands when using ARMSE instead. Nevertheless, our findings clearly suggest the semiparametric approach as the method of choice for sales prediction. First, the semiparametric model enabled better predictions for seven out of eight brands regardless of which measure was used, with sometimes very large improvements in predictive accuracy compared to all parametric models (see the brands "Tropicana" and "Tree Fresh" in Tables 6 and 8). For the remaining brand, semiparametric modeling was not inferior to parametric modeling. Second, because of the latter aspect (one cannot do worse with the flexible approach), one need not search for the best parametric model at the individual brand level. And third, once price effects are modeled flexibly, it no longer seems to make a great difference which dynamic specification is employed to adequately capture reference price effects. This is due to the high flexibility of P-splines (local fitting property) to uncover large differences (different curvatures) between gain and loss effects, even if these were not modeled separately but only via a single price-difference term.

A note on loss aversion
The previous discussions have shown that the semiparametric approach predicts at least as well as, or (considerably) better than, the parametric models considered, and that once (reference) price effects are modeled flexibly, the decision whether price-change effects should be accounted for by a single price-difference term or two separate terms for gains and losses is obviously of minor importance. Since prospect theory is a prominent behavioral concept stating that consumers should weigh losses of a certain amount more strongly than gains of the same amount (also compare Sects. 2 and 3), and because we did not find loss aversion but instead gain-seeking behavior of consumers for all considered brands in the refrigerated orange juice category without exception, a closer look at loss-gain ratios seems worthwhile. Loss-gain ratio statistics are more widespread in a brand choice modeling context (e.g., Neumann and Böckenholt 2014), and have not yet been applied to sales response models to the best of our knowledge.
For our parametric models with separate gain and loss terms (gl models), the loss-gain ratio can easily be determined by dividing the estimated parameter for losses ( 2L ) by the corresponding one for gains ( 2G ); see (13) and (14) in connection with the abs-gl model in (10) for the derivation of the parameters. 8 For the semiparametric models, the loss-gain ratio extends to a flexible nonlinear ratio based on the derivative of the estimated effect curve. The calculation is nonetheless rather similar to the simple loss-gain ratio for parametric models: we divide the derivative of the loss part by that of the gain part and aggregate the resulting ratios to a weighted mean, with the number of observations supporting the particular points as weights. Here, X denotes the set of price-difference observations over which the mean is taken, i.e., all price-difference observations in the data set (full range) or those contained in a particular predefined subrange, respectively. Table 7 displays as examples the loss-gain ratios for the exponential, multiplicative, and semiparametric abs-diff and abs-gl models, referring to either the full range of price differences observed for the brands or to small, mid-sized, and large price differences at a more disaggregate subrange level (measured each time in monetary units). Note that for the parametric models with a single price-difference term (abs-diff) the loss-gain ratio implicitly equals 1 (and is therefore not included in the table), while this does not apply to the semiparametric model.
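One possible reading of this calculation is sketched below. Several elements are assumptions made for illustration and are not taken from the paper: price differences are coded so that positive values are gains and negative values are losses, slopes enter in absolute magnitudes, derivatives are approximated by central finite differences, and the effect curve and all numbers are hypothetical stand-ins for an estimated P-spline.

```python
from collections import Counter


def parametric_loss_gain_ratio(beta_loss, beta_gain):
    """Ratio of the loss coefficient to the gain coefficient (absolute magnitudes)."""
    return abs(beta_loss) / abs(beta_gain)


def derivative(f, x, h=1e-4):
    """Central finite-difference approximation of f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)


def flexible_loss_gain_ratio(effect, price_diffs):
    """Weighted mean of |f'(-x)| / |f'(x)| over the observed sizes x of price
    differences, weighted by the number of observations at each size."""
    counts = Counter(round(abs(d), 2) for d in price_diffs if d != 0)
    weighted_sum = 0.0
    total = 0
    for size, n in counts.items():
        loss_slope = abs(derivative(effect, -size))
        gain_slope = abs(derivative(effect, size))  # assumed to be nonzero
        weighted_sum += n * loss_slope / gain_slope
        total += n
    return weighted_sum / total


def effect(d):
    """Hypothetical effect curve: steep, convex gain part and flat loss part."""
    return 0.8 * d + 0.5 * d ** 2 if d >= 0 else 0.2 * d


# Hypothetical price differences (in $): three of size 0.50, one of size 1.00.
price_diffs = [0.5, -0.5, 0.5, 1.0]
flexible_ratio = flexible_loss_gain_ratio(effect, price_diffs)
parametric_ratio = parametric_loss_gain_ratio(-0.1, 0.5)
```

Restricting `price_diffs` to a predefined subrange before calling the function would give the corresponding local ratio. A value below 1, as in this toy example, indicates that gains are weighted more strongly than losses.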
Aggregated over the entire range of price differences (see columns full rg.), the loss-gain ratios for the parametric and the semiparametric models turn out consistently (much) smaller than 1, which suggests that consumers buying refrigerated orange juice brands at Dominick's Finer Foods stores are not loss-averse as a rule (as was already visible from the estimated price effect curves for losses and gains), contrary to what prospect theory postulates. Taking a look at the more disaggregated results within the separate price-difference intervals |Δp t | ≤ 0.50$, |Δp t | ∈ (0.50$, 1.00$], and |Δp t | > 1.00$ confirms this finding in principle. Here, we find loss aversion only very sporadically and only for some brands, and in no single case for (absolute) price differences below 0.50$. Additionally, the loss-gain ratios are lowest across the three subranges here for most brands, implying that consumers weigh small gains much more strongly than small losses compared to price differences larger than 0.50$ (an exception is "Tree Fresh", where consumers are least loss-averse for large price differences greater than 1.00$). Clear loss aversion of consumers (i.e., with a loss-gain ratio greater than 2) is evident in only two cases: for large price changes of "Florida Gold", and for mid-sized price changes of "Tree Fresh". Note that the loss-gain ratios hardly differ between the two semiparametric model versions (abs-diff vs. abs-gl), which could be expected based on our previous findings. Finally, the loss-gain ratios obtained from the exponential and the multiplicative abs-gl models are well below 1 for all brands (never larger than 0.5 and mostly not exceeding 0.25), which speaks again in favor of estimating gain and loss effects with two separate terms when using the parametric models.
Neumann and Böckenholt (2014) reported an average loss-gain ratio of 1.49 (indicating moderate loss aversion) based on a meta-analysis of 33 studies conducted in a parametric brand choice modeling context (i.e., using random utility models of brand choice), which contradicts the findings of our study at first glance. However, the authors also showed that loss aversion can vary substantially depending on, e.g., product characteristics and the model specification used. For example, loss aversion turned out significantly stronger for durable product categories, which commonly bear a higher financial risk than nondurables (like, e.g., orange juice). Further, loss-gain ratios were found to be significantly lower for so-called sticker shock models that (like in our approach) use a price main effect in addition to gain and loss terms (as opposed to the use of gain and loss terms only). The meta study of Neumann and Böckenholt (2014) was motivated by the fact that previous findings on loss aversion regarding price effects were very inconsistent, i.e., some studies indicated strong evidence for loss aversion while others found none at all (for details, see the literature cited therein). 9 Mazumdar et al. (2005) emphasized much earlier that the empirical evidence on asymmetric reference price effects is mixed, referring to a number of empirical studies on brand choice not supporting loss aversion.
Interestingly, Natter and Hruschka (1997) also could not find loss aversion of consumers in an empirical study for laundry detergent brands based on their estimated market share models (i.e., based on aggregate data), and reported larger coefficients for gains than for losses (in terms of absolute magnitudes). They provided a number of possible explanations for their findings that could favor gain-seeking behavior over loss aversion: price cuts are frequently supported by POS advertising like displays and have therefore a stronger effect; costs of brand switching motivate consumers to utilize price cuts on their more preferred brands; and/or there exists a high share of brand switchers who are attracted by lower prices. Still, it is important to mention that our results of course depend on the specification of the dynamic parts in our models and our decision to separate reference price effects from other dynamic effects by including a one-period lagged sales term. As discussed before in the introduction and in Sect. 3.2, it is generally difficult to disentangle different dynamic (price) effects with aggregate sales data, and the small or negligible loss effects for price increases observed in our data might have been underestimated due to an unidentified stockpiling effect if one existed. In other words, since post-promotion dips caused by stockpiling are hard to detect in aggregate sales response models, for example as a result of very different (re)purchasing patterns of individual households, the negative effect of losses might be undervalued and actually larger.
But even then, this should not be a problem for sales prediction, which is the main objective of our proposed model.

Conclusions
In this article, we proposed a semiparametric approach to flexibly estimating reference price effects in brand sales models. In particular, we focused on the so-called price-change response of consumers (prominently introduced by Simon 1982), using aggregate store-level sales data and the price of the last period as proxy for the reference price of (an aggregate of) consumers. We compared different options to capture this dynamic price-change effect, following adaptation-level and prospect theory. While adaptation-level theory states that consumers evaluate new price information for a brand relative to an adaptation level (which is the brand's price of the last period in our context), prospect theory goes one step further and claims that consumers should value losses of a certain amount more strongly than gains of the same amount (corresponding to price increases and price decreases of the same amount in our context), and that the value function is convex for losses and concave for gains. Accommodating functional flexibility for price effects via nonparametric regression helps to simultaneously analyze a potential asymmetry and/or disproportionality of the price-change response without the need to assume a certain functional form for it a priori. In other words, by letting the data determine the shape of the price-change effect, we can easily verify whether the implications of these behavioral theories hold for the data at hand. We further compared the semiparametric approach to parametric benchmark models in order to assess the added value of using nonparametric regression for estimating price (change) effects.
To compare the predictive performance of our models, we conducted an empirical study using store-level scanner data of the Dominick's Finer Foods (DFF) database for refrigerated orange juice. For model specification, we assumed a brand's sales to depend on the brand's own price, the brand's sales of the previous period, prices of substitute brands, promotional activities, store-specific and holiday effects, as well as on the brand's previous price to capture the price-change or reference price effect in the following ways: via a single price-difference term versus two separate price terms for perceived gains and losses, where the price change with respect to the previous price is measured in absolute monetary units or as a percentage change, respectively. We further estimated two nested variants of our semiparametric model in order to evaluate the impact of accounting for (price) dynamics on the predictive model performance: a static variant without the reference price term and without the lagged sales effect, and a simpler dynamic variant without the reference price term but including one-period lagged sales as autoregressive part. To assess the added value of employing nonparametric regression for estimating the price-change effect flexibly (as well as own- and cross-price effects), we also compared our semiparametric approach to the exponential (log-linear) and the multiplicative (log-log) sales response function (as well as to the simple linear model) as parametric benchmark models.
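The four ways of coding the price change can be illustrated with a minimal sketch. The sign convention (previous price minus current price, so that price cuts appear as positive gains), the division by the previous price for the percentage variant, and all names and numbers are assumptions made for illustration; the exact definitions are those of the model equations (9) to (11).

```python
def price_change_terms(prices):
    """Build price-change covariates from a series of weekly prices, using the
    previous week's price as the reference price."""
    rows = []
    for previous, current in zip(prices, prices[1:]):
        abs_diff = previous - current   # assumed sign: > 0 gain, < 0 loss
        rel_diff = abs_diff / previous  # change relative to the previous price
        rows.append({
            "abs_diff": abs_diff,             # abs-diff: single term, monetary units
            "rel_diff": rel_diff,             # rel-diff: single term, relative change
            "abs_gain": max(abs_diff, 0.0),   # abs-gl: separate gain term
            "abs_loss": max(-abs_diff, 0.0),  # abs-gl: separate loss term
            "rel_gain": max(rel_diff, 0.0),   # rel-gl: separate gain term
            "rel_loss": max(-rel_diff, 0.0),  # rel-gl: separate loss term
        })
    return rows


# Hypothetical weekly prices (in $): a price cut, a price increase, no change.
weekly_prices = [2.00, 1.50, 2.50, 2.50]
terms = price_change_terms(weekly_prices)
```

In the gl variants, at most one of the gain and loss terms is nonzero in any given week, which is what allows a parametric model to estimate different slopes for price decreases and price increases.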

Results and managerial implications
The main results of our empirical study can be summarized as follows: first, accounting for price-change or reference price effects can substantially improve the predictive performance of brand sales models (as measured by the cross-validated average root median or mean squared errors, ARMedSE or ARMSE ). For the parametric models (linear, exponential, and multiplicative), accommodating gain and loss effects with two separate price terms (abs-gl, rel-gl) largely improves ARMedSE values (i.e., reduces prediction errors) in holdout samples, whereas using a single price-difference term (abs-diff, rel-diff) provides only small (marginal) improvements or even decreases the predictive accuracy measured by ARMedSE compared to the static model. A look at the estimated effects and the corresponding partial residuals of the competing dynamic model specifications reveals the reason for the clearly worse performance of the abs-diff and rel-diff models: the price-change effect is asymmetric for nearly all orange juice brands analyzed, but using a single price-difference term for both gains and losses does not allow the detection of asymmetric effects of price changes. The use of separate variables for perceived gains versus perceived losses helps to overcome this limitation. In contrast, the semiparametric models do not have this shortcoming: due to their much greater flexibility, they are able to capture such asymmetries and therefore possibly different shapes for gain and loss effects, even if gain and loss effects are not separated from each other with two different price terms. This explains why the four different dynamic specifications for the reference price effect perform similarly well here. In other words, once price (change) effects are accommodated flexibly, the form of the specification of the reference price term becomes secondary.
The predictive validity results based on the ARMSE measure closely resemble those for the ARMedSE measure with one noticeable exception. For the parametric models, modeling gain and loss effects via two separate reference price terms did not consistently improve the predictive performance for all brands compared to using a single price-difference term only. In these cases, however, taking price dynamics into account improved the predictive accuracy either only marginally or not at all, independent of the form of including the reference price term.
Second, as the spline functions are furthermore able to account for disproportionate effects of any shape, each of the semiparametric model variants provided more accurate sales predictions than its linear, exponential, or multiplicative counterparts for all brands considered but one (for each of the two predictive validity measures). For the one brand, the semiparametric model predicted similarly well or marginally better nevertheless. Interestingly, even the static semiparametric model leads to lower ARMedSE values than each of the two dynamic exponential or multiplicative models when capturing the price-change effect with only a single price-difference term. This also held for all but one brand (two brands) for the exponential (multiplicative) model when the predictive accuracy was evaluated by the ARMSE measure. This underlines the power of nonparametric estimation techniques in the present context. Overall, improvements in predictive accuracy from accommodating price effects flexibly over the best dynamic exponential or multiplicative models ranged up to −11 % ( −12 %) in terms of ARMedSE and up to −15 % ( −14 %) in terms of ARMSE at the individual brand level.
Third, referring to the shapes of the flexibly estimated price-change effects, we observed rather steep, disproportionate gain effects but rather flat loss effects for nearly all brands. Accordingly, consumers seem to weigh gains much more strongly than losses in the refrigerated orange juice category, which contradicts prospect theory. Loss-gain ratios smaller than 1 as a rule underlined this finding. For the gain effect, decreasing returns to scale (as suggested by prospect theory) were found only for two of the national brands. For most of the other brands, the estimated gain effect curves turn out neither strictly concave nor strictly convex, showing more or less complex nonlinearities, which in addition differed across brands. This is exactly the strength of nonparametric modeling: there is no need to search for the right functional specification(s) in advance; the shape of effects is estimated directly from the data. Also note that we controlled in our models for other dynamic effects like stockpiling or customer holdover by including one-period lagged sales. Still, it could be that larger stockpiling effects have gone undetected because post-promotion dips are generally hard to detect in aggregate response models. In this case, the negative effect of losses might have been undervalued. On the other hand, refrigerated orange juice is a less stockable product, which suggests that the bias in the estimated loss effects, if one exists, should not be that large.
From a managerial point of view, using our more complex semiparametric approach seems worth the effort as it provides several advantages. First, as already discussed above, predictions were never worse, often better, and sometimes considerably better than those from any of the parametric benchmark models. Second, semiparametric modeling especially pays off if price effects involve complex nonlinearities which are difficult or impossible to capture via parametric models. Even if complex nonlinearities (e.g., strong kinks or several thresholds) are not at work and improvements from using the more complex model are only small, one need not worry about which parametric model to use for which brand to arrive at the best possible brand sales predictions. Our study has shown that nonlinearities for gain effects may be complex and may further turn out very differently at the individual brand level, which favors the use of flexible estimation techniques. Third, free software (BayesX) is available to easily estimate the semiparametric model (as well as the parametric models).

Limitations and outlook
Our study of course has some limitations. First, our empirical findings relate to only one product category (refrigerated orange juices) which does not yet allow a generalization of the results. More studies for different product categories and based on aggregate data are necessary to confirm or to complement our findings and to augment existing findings on loss aversion (or gain-seeking behavior), which so far have almost exclusively related to a brand choice modeling context, i.e., to disaggregate data.
Second, our article has its focus on functional flexibility and reference price effects (i.e., exploring asymmetries and disproportionalities of the price-change effect), and contributes to the stream of nonparametric models in marketing. Except for the random store intercept, which captures heterogeneity in baseline sales across stores (e.g., due to differently sized store sales areas) in all of our models, we did not address potential heterogeneity of marketing effects across stores. Weber and Steiner (2012) have shown that accounting for heterogeneity may not be helpful per se to improve the predictive performance of a store-level sales response model, whereas accommodating functional flexibility can substantially reduce prediction errors. Only a few approaches exist that have accounted simultaneously for both functional flexibility and heterogeneity in store-level sales response models (e.g., Hruschka 2006a; Lang et al. 2015; Weber et al. 2017). The latter two studies have shown that extending an already flexible model to additionally accommodate store heterogeneity in marketing effects may further improve the predictive model performance. Transferred to our context, it could be interesting to analyze if and how strongly the price-change effect differs across stores. On the other hand, existing (flexibly estimated) nonlinear effects might also just be a form of heterogeneity, as existing differences in price effects across stores can 'add up' to a nonlinear (homogeneous) effect. 10 In the latter case, explicitly considering heterogeneity in addition to functional flexibility might not necessarily further improve predictive accuracy. Finally, accommodating time-varying parameters as an alternative or in addition to considering heterogeneity and/or functional flexibility could also accomplish improvements in predictions.
Third, we did not address the issue of price endogeneity, a point of increasing controversy in the relevant literature. A number of different approaches have been proposed to treat endogeneity in prices (for an overview of endogeneity in aggregate market response models, see Hruschka 2017). However, Rossi (2014, p. 671) has pointed out that using an invalid instrument to accommodate price endogeneity can "cause the estimates to differ even when there is no endogeneity bias". Typical candidates for appropriate instrumental variables in aggregate response models are lagged prices and costs (or wholesale prices). The former, however, are not exogenous in the case of our dynamic models with reference price terms, and hence cannot be used as instruments here (also cf. Hruschka 2017). In addition, standard techniques (e.g., 2SLS) to account for endogeneity do not necessarily work in (flexible) nonlinear models, as we have with our flexibly estimated nonlinear price and reference price effects. Generalized Method of Moments estimation (if a valid instrument is available) or the copula-based method as an instrument-free alternative could be promising ways to accommodate endogeneity within our semiparametric approach (e.g., Hruschka 2017).
Fourth, we followed the stream of researchers who proposed using the price of the last period as reference price (cf. Sect. 2). Alternatively, the reference price could be built from the prices of several previous weeks, following, for example, the concepts of adaptive or extrapolative expectations (e.g., Natter and Hruschka 1997; Briesch et al. 1997; Baumgartner et al. 2018).
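As an illustration of the adaptive expectations idea, the sketch below updates the reference price by exponential smoothing of past prices, r_t = alpha * r_(t-1) + (1 - alpha) * p_(t-1). This particular updating rule, the parameter name `alpha`, and the initialization with the first observed price are assumptions chosen for illustration; the cited studies may use different formulations.

```python
from typing import List


def adaptive_reference_prices(prices: List[float], alpha: float) -> List[float]:
    """Exponentially smoothed reference prices (an assumed, common formulation).

    alpha = 0 reproduces the last-period-price reference used in this study
    (apart from the initial value); larger alpha gives more weight to the
    older price history.
    """
    if not 0.0 <= alpha < 1.0:
        raise ValueError("alpha must lie in [0, 1)")
    if not prices:
        return []
    # Assumption: initialize the reference price with the first observed price.
    refs = [prices[0]]
    for t in range(1, len(prices)):
        refs.append(alpha * refs[t - 1] + (1.0 - alpha) * prices[t - 1])
    return refs
```

Gain and loss terms would then be computed against `refs[t]` instead of `prices[t - 1]`, while the rest of the model specification could stay unchanged.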
Finally, a further challenge would be to develop optimal dynamic pricing strategies for the different models. Here, one could at first lean on the research of Kopalle et al. (1996) or Fibich et al. (2003) (cf. Sect. 2) and compare pricing implications obtained from models that ignore asymmetries of the price-change effect versus models that do capture asymmetries in gain and loss effects. This could be especially interesting for the semiparametric approach where using either a single price difference term or separate terms for gain and loss effects performed similarly well in our empirical study regarding predictive validity. Based on the findings of Kopalle et al. (1996) or Fibich et al. (2003), we would expect a hi-lo or pulsing strategy (cyclical pricing) as optimal pricing strategy for most refrigerated orange juice brands, since gain effects turned out to be (much) larger than loss effects as a rule. For the two brands "Tree Fresh" and "Florida Gold", we did find loss aversion for moderate and/or larger price differences, making predictions about the expected dynamic pricing strategies for these two brands more difficult. Furthermore, Lang et al. (2015) reported higher expected total chain profits for their semiparametric (static) sales response models compared to the multiplicative sales response function. The expected loss in profit for the multiplicative model was accompanied by a larger number of lower optimal price levels across weeks than suggested by the semiparametric model. In other words, semiparametric modeling led to a larger number of higher price levels compared to parametric modeling in their study. Weber et al. (2017) analyzed expected category profits and basically confirmed the findings of Lang et al. (2015) that flexible (static) price response modeling offers a huge potential for increasing a retailer's profits compared to parametric modeling. 
Moreover, we obtained (as a rule) lower price elasticities for our dynamic models (abs-diff, abs-gl, compare Sect. 4.3.2) compared to using a static model (except for very low price levels), independent from the type of model (parametric or semiparametric). This suggests that static sales response models might overestimate price sensitivities of consumers and therefore can lead to suboptimal pricing strategies. We leave the issue of optimal dynamic pricing for future research.

(2) It is generally difficult to identify exceptional price effects (like distinct threshold or saturation effects for gains and losses) with parametric functions, at least not without prior knowledge about their supposed location within the price range. (3) In linear or parametric nonlinear sales response models gain and loss effects are captured via two separate truncated functions. This is not necessary when applying nonparametric regression, and may help to avoid estimation biases due to truncation.
Using Bayesian P-splines as we do in the article at hand has further convenient properties: (1) Too much flexibility, leading to undesirable overfitting effects that may be harmful for predictions, can be easily controlled (i.e., penalized). (2) Monotonicity constraints can be easily imposed (e.g., Brezger and Steiner 2008). This is particularly reasonable for estimating price response, since from an economic rationale we expect brand sales to monotonically decrease (increase) in own price (prices of substitute brands). Note that the common parametric price response functions used for brand sales modeling are inherently monotonic. (3) Using a Bayesian estimation framework allows the amount of smoothness of price or other marketing effects to be estimated simultaneously with all other model parameters, instead of applying additional smoothing parameter selection procedures that become necessary in frequentist settings (Steiner et al. 2007).
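The penalization idea in property (1) can be sketched with the frequentist counterpart of the P-spline approach, in which roughness is measured by squared differences of adjacent spline coefficients and scaled by a smoothing parameter. This is an illustrative simplification, not the Bayesian estimation used in the article: the difference order of 2, the function names, and the fixed `smoothing` value are assumptions. In the Bayesian version, a random walk prior on the coefficients plays the role of the penalty, and its variance is estimated jointly with the other parameters.

```python
from typing import List, Sequence


def differences(coefs: Sequence[float], order: int = 2) -> List[float]:
    """Differences of the given order between adjacent spline coefficients."""
    out = list(coefs)
    for _ in range(order):
        out = [b - a for a, b in zip(out, out[1:])]
    return out


def roughness_penalty(coefs: Sequence[float],
                      smoothing: float,
                      order: int = 2) -> float:
    """Difference penalty: smoothing * sum of squared coefficient differences.

    Added to the lack-of-fit criterion, it shrinks wiggly coefficient
    sequences; with order 2, a perfectly linear sequence is not penalized.
    """
    return smoothing * sum(d * d for d in differences(coefs, order))
```

For example, the coefficient sequence `[1, 2, 3, 4]` lies on a straight line and incurs no second-order penalty, whereas the zigzag sequence `[0, 1, 0, 1]` is penalized, which is how overly flexible fits are discouraged.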

Fig. 6 Estimated price-change effects from the exponential (left panels), multiplicative (top-middle panel), and semiparametric models (right panels) for the brand "Citrus Hill", capturing the price difference in absolute terms (abs-diff, top panels) or relative terms (rel-diff, bottom panels), and relating to the 'sales space'. See Fig. 3 for the variant relating to the 'log sales space'

Fig. 7 Estimated price-change effects from the exponential (left panels), multiplicative (middle panels), and semiparametric models (right panels) for the brand "Citrus Hill", capturing gain and loss effects in absolute monetary units (abs-gl) with two separate terms, and relating to the 'sales space'. See Fig. 4 for the variant relating to the 'log sales space'

Fig. 8 Estimated price-change effects from the exponential (left panels) and semiparametric models (right panels) for the brand "Citrus Hill", capturing gain and loss effects as relative percentage changes (rel-gl) with two separate terms (note that there is no rel-gl model version for the multiplicative models). See Fig. 4 for a variant with gains and losses defined in absolute monetary units (abs-gl)

3
Best models per brand and type of model (i.e., exponential, multiplicative, and semiparametric) are marked in bold