1 Introduction

When facing heterogeneous buyers, price discrimination allows a seller to capture a larger portion of the total market surplus than offering a single product quality. Price discrimination is prevalent, but sellers often employ just a small number of product types, despite our casual and statistical observations that suggest significant heterogeneity among buyers’ willingness to pay. The lack of sufficient product variety has been commonly attributed to the existence of some fixed costs of launching products of different qualities (e.g., Dixit and Stiglitz 1977; Spence 1980). In many instances, however, these costs tend to be small or immaterial, thereby making it difficult to justify the observed patterns of firm strategy by resorting to such costs alone.

Motivated by these observations, this paper proposes a theory of price discrimination that incorporates a now well-established bias from rational decision making, namely consumer loss aversion (Kahneman and Tversky 1979). Specifically, we introduce the expectation-based model of reference-dependent preferences of Kőszegi and Rabin (2006) into a standard screening model à la Mussa and Rosen (1978).Footnote 1 In our setup, a monopolist seller offers a menu of bundles before a buyer privately observes his willingness to pay and decides whether to make a purchase. As in Kőszegi and Rabin (2006), henceforth referred to as KR, the buyer anticipates his future consumption choice for each possible contingency and experiences “gain–loss utility” with reference to his own past expectation of contingent consumption, in addition to standard “consumption/intrinsic utility.” Furthermore, the expectation must be correct; that is, it must be consistent with the buyer’s optimal consumption choice in each realization of uncertainty. This requirement of rational expectation, or personal equilibrium (PE), implies that the menu must satisfy incentive compatibility and (ex post) participation constraints that account for reference-dependent preferences and loss aversion.Footnote 2

In addition to the large existing literature documenting empirical support for loss aversion in a variety of economic situations, a slew of recent studies point to the specific role played by expectations in the formation of reference points (e.g., Mas 2006; Abeler et al. 2011; Card and Dahl 2011; Crawford and Meng 2011; Ericson and Fuster 2011; Gill and Prowse 2012; Sprenger 2015). The price discrimination setting offers a natural ground to explore the expectation-based approach to reference point formation, because its essential ingredient is the uncertainty of consumer demand. We usually know products that are available before discovering the specific conditions that determine our preferences. Consider, for example, a sports fan whose willingness to pay for the sports TV package is influenced by the performance of his favorite team during pre-season. This consumer may form an expectation that he would purchase the premium package if and only if the team ends up having a promising pre-season. But, once the regular season starts, he compares the expected purchase to what he could have consumed.

We show that loss aversion indeed serves to limit the benefits of price discrimination and can even result in the optimality of a full pooling menu in a situation where buyers with standard preferences would be separated via a menu with a strictly increasing quality-price schedule. Moreover, the expectation-based approach brings into play an additional determinant of the optimal contractual form: the optimal form depends on an interplay between the extent of consumer loss aversion and the shape of the distribution of the consumer’s willingness to pay. In particular, our results suggest that, given a sufficient level of loss aversion, the firm is more likely to shy away from screening in markets with a large population of consumers with low willingness to pay.

Our main message is most clearly conveyed in the case of binary consumer types, where the effect of loss aversion manifests itself in two ways. First, when each consumer compares the alternative of non-participation to the bundle of his choice ex post, he experiences a loss on quality and a gain in money. Thus, as the consumer becomes more loss averse, he becomes willing to pay more for a given quality, which implies that the seller can profitably increase the quality for the type whose participation constraint is binding (i.e., the low willingness-to-pay type).

Second, for the consumer who acquires an information rent (i.e., the high willingness-to-pay type), deviation to the lower quality-price bundle opens another channel of gain–loss comparisons across the two utility dimensions. In this case, however, the comparison is weighted by the ex ante likelihood of the alternative event. Given loss aversion, the deviation incentive is greater when the low willingness-to-pay type, and thus a lower price, is anticipated with a larger probability.

The combination of the above two effects generates the following: When the likelihood of low willingness-to-pay consumer is sufficiently large and the degree of loss aversion lies in an intermediate range, the seller’s optimal strategy is to offer the same bundle to both types.Footnote 3 In the case of a continuum of consumer types, focusing on the case in which full separation is optimal under standard preferences, we establish conditions under which partial or even full pooling is optimal among menus with monotone quality and price.

In our model, multiple personal equilibria may arise from a menu. Our treatment above follows the standard mechanism design approach by assuming that the firm can select the PE and hence focusing on truthful self-selection. An alternative approach suggested by KR is to assume that it is the consumer who is capable of choosing his favorite PE, or the preferred personal equilibrium (PPE). We derive the optimal menu under the concept of PPE and binary consumer types, and show that a pooling menu continues to be optimal under a wide range of parameter values. To our knowledge, this is the first non-trivial analysis of PPE in a model of adverse selection to date.

Our paper contributes to the growing literature on firm behavior under boundedly rational agents (see the surveys of Ellison 2006; Spiegler 2011; Kőszegi 2014). Among this literature, monopolist’s screening problems with loss averse consumers have recently been studied independently by Herweg and Mierendorff (2013), Orhun (2009), and Carbajal and Ely (2016).

Herweg and Mierendorff (2013) consider a seller who chooses from two-part tariffs for a loss averse consumer with uncertain demand and demonstrate the optimality of flat tariff. They model the consumer also in the frame of KR, but with gain–loss arising only from the money dimension, and characterize optimal contract when the consumer can commit to ex ante participation. Our analysis differs in several aspects. First, our setup allows for both ex post and ex ante participation. Second, we consider a general class of menus under gain–loss utilities that arise from both money and quality dimensions and derive the precise channel via which consumer loss aversion generates bunching over quality as well as price. Moreover, our treatment of gain–loss utility gives rise to non-trivial PPE analysis.

In Orhun (2009) and Carbajal and Ely (2016), the seller offers a menu to a consumer who already knows his type and admits an exogenously given reference point that is type-dependent. These authors also demonstrate the possibility of optimal pooling. However, their utility models do not involve gain–loss comparisons across multiple types; moreover, the issue of optimal menus that are PE (PPE) is not explored. The main concern of Carbajal and Ely (2016) is to explain how the shape of optimal contract depends on the reference point.

Loss aversion has been fruitfully incorporated in other contexts of firm behavior. Heidhues and Kőszegi (2014), Spiegler (2012), and Rosato (2013) consider monopoly pricing with complete information. In these models, the monopolist can optimally commit to a random pricing strategy. In contrast, we explore the role of loss aversion in a model with demand uncertainty and menu contracts. The consumer’s expectations concern his future demand, not the price realization. Courty and Nasiry (2015) derive the uniformity of optimal price irrespective of product quality in a monopoly model with consumer loss aversion and random utility shocks. They do not however address the issue of screening as we do here.

Also using the KR model, Heidhues and Kőszegi (2008) explain why firms with differentiated products and heterogeneous costs may end up charging a uniform price. The competition model of differentiated products is also explored by Karle and Peitz (2013) and Zhou (2011). De Meza and Webb (2007) and Herweg et al. (2010) study the role of loss aversion in agency problems. In auctions, loss averse bidders are introduced by Lange and Ratan (2010) and Eisenhuth (2010), and Grillo (2013) considers a cheap talk game in which the receiver is loss averse.

Finally, our paper complements other approaches aimed at understanding the implications of biased consumers for monopolist behavior. Time-inconsistent preferences or self-control problems have been explored in the context of contract design by DellaVigna and Malmendier (2004), Eliaz and Spiegler (2006), Esteban et al. (2007) and Heidhues and Kőszegi (2010); Eliaz and Spiegler (2008) and Grubb (2009) investigate the role of overconfident consumers. Jeleva and Villeneuve (2004) show that pooling menu could be optimal in an insurance model with adverse selection if the consumer has imprecise belief about the underlying risk. Here, optimal pooling arises if the likelihood of low risk consumer is sufficiently large; however, this parameter also affects the corresponding insurance coverage, while in our optimal pooling menu the product quality depends on the degree of loss aversion and not on the distribution of willingness to pay.

This paper is organized as follows. Section 2 lays out a price discrimination setup with KR’s reference-dependent preferences for the binary-type case. In Sect. 3, we characterize the optimal menu in our model by adopting truthful personal equilibrium as the solution concept. The optimal menu under preferred personal equilibrium is characterized in Sect. 4. We discuss some alternative models of reference points, and their consequences, in Sect. 5. Section 6 concludes. All proofs are relegated to the “Appendix” unless mentioned otherwise. We also present the details of some omitted analyses in a Supplementary Material.

2 The setup

2.1 Price discrimination with loss averse consumers

Consider a market that consists of a monopolistic seller of some product and its buyer. Let \(b=(q,t) \) denote a “bundle” in which the product of quality q is sold for the payment of t. A “menu” of bundles is referred to as \(M \subseteq \mathbb {R}_+^2\). We refer to \(\emptyset =(0,0)\) as the null bundle, or outside option. The seller’s profit from a bundle \(b=(q,t)\) is \(t-cq\), where \(c>0\) is the constant marginal cost of production. There is no cost of offering a bundle.

The buyer’s willingness to pay for the product, or “type,” \(\theta \in \varTheta \) is unknown at the time of menu offer from the seller but later learned privately at the time of actual consumption. Let F denote the commonly known cumulative distribution function on \(\varTheta \).

Upon observing menu M, but before learning his type, the buyer forms a “reference point,” \(R: \varTheta \rightarrow M \cup \{\emptyset \}\), which specifies a (deterministic) contingent plan of purchase at each possible type realization (including the possibility of opting out). Let \(R(\theta ^{\prime }) = (q^r(\theta ^{\prime }), t^r(\theta ^{\prime }))\) for each \(\theta ^{\prime } \in \varTheta \). Given reference point R, the type-\(\theta \) buyer’s ex post utility from consuming bundle \(b = (q,t)\) is given by the sum of two components, “consumption/intrinsic” utility and “gain–loss” utility, as followsFootnote 4:

$$\begin{aligned} u(b \mid \theta , R) := m(b;\theta ) + \int _{\theta ^{\prime } \in \varTheta } n(b; \theta , \theta ^{\prime }, R(\theta ^{\prime })) {\text {d}}F(\theta ^{\prime }), \end{aligned}$$
(1)

where

  • the consumption/intrinsic utility is measured by

    $$\begin{aligned} m(b;\theta ) := \theta v(q) - t \end{aligned}$$

    such that \(v(\cdot )\) is a (differentiable) function satisfying \(v(0)=0\), \(v^{\prime }(\cdot )>0\),\(v^{\prime \prime }(\cdot )<0\), \({\lim }_{q\rightarrow 0}v^{\prime }(q)=\infty \) and \({\lim }_{q\rightarrow \infty }v^{\prime }(q)=0\); and

  • the gain–loss utility is given by

    $$\begin{aligned} n(b; \theta , \theta ^{\prime }, R(\theta ^{\prime })) := \mu \left( \theta v(q) - \theta ^{\prime } v(q^r(\theta ^{\prime })) \right) + \mu \left( t^r(\theta ^{\prime })-t \right) , \end{aligned}$$
    (2)

    where \(\mu \) is a piecewise-linear gain–loss function such that, for any \(k_{1},k_{2}\in \mathbb {R}_{+}\),

    $$\begin{aligned} \mu (k_{1}-k_{2}):=\left\{ \begin{array}{ll} k_{1}-k_{2} &{}\quad \text {if }k_{1}\ge k_{2} \\ \lambda (k_{1}-k_{2}),\lambda >1 &{}\quad \text {if }k_{1}<k_{2}.\end{array} \right. \end{aligned}$$

The utility function in (1) adapts the model of KR to our price discrimination setting. Note that the overall gain–loss utility here is measured in expectation over the uncertainty surrounding the payoff type of the decision maker rather than the randomness of outcomes per se (for each type, the reference bundle is deterministic). Each type-\(\theta \) buyer compares himself with another hypothetical type \(\theta ^{\prime }\); as such, type-\(\theta \) buyer experiences gain–loss from the difference between his bundle and that of each hypothetical type \(\theta ^{\prime }\) in terms of final utilities. Following Tversky and Kahneman (1991) and Kőszegi and Rabin (2006), we assume that the gain–loss utility is additively separable across the two consumption dimensions, quality and monetary transfer. In Sect. 5, we formally discuss how our utility formulation differs from some alternative formulations of reference point in the price discrimination setup.
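As an illustration, the utility in (1) can be evaluated numerically when the type space is finite, so that the integral becomes a probability-weighted sum. The sketch below is only an assumed example: the valuation \(v(q)=\sqrt{q}\), the value \(\lambda =2\), the numerical menu, and all function names are ours and are not part of the model's primitives.

```python
import math

LAMBDA = 2.0  # assumed loss-aversion coefficient (any value > 1 would do)


def v(q):
    """Assumed valuation: sqrt satisfies v(0)=0, v'>0, v''<0 and both limit conditions."""
    return math.sqrt(q)


def mu(x, lam=LAMBDA):
    """Gain-loss function: gains count one-for-one, losses are scaled by lam."""
    return x if x >= 0 else lam * x


def utility(bundle, theta, reference, probs, lam=LAMBDA):
    """u(b | theta, R) from (1) for finitely many types.

    reference maps each type theta' to its planned bundle (q, t);
    probs maps each type theta' to its probability.
    """
    q, t = bundle
    intrinsic = theta * v(q) - t
    gain_loss = 0.0
    for theta_ref, (q_ref, t_ref) in reference.items():
        quality_term = mu(theta * v(q) - theta_ref * v(q_ref), lam)
        money_term = mu(t_ref - t, lam)
        gain_loss += probs[theta_ref] * (quality_term + money_term)
    return intrinsic + gain_loss


# Hypothetical two-type example in which both types plan to buy (q, t) = (4, 1).
reference = {1.0: (4.0, 1.0), 2.0: (4.0, 1.0)}
probs = {1.0: 0.5, 2.0: 0.5}
u_buy = utility((4.0, 1.0), 1.0, reference, probs)  # low type follows the plan
u_out = utility((0.0, 0.0), 1.0, reference, probs)  # low type opts out instead
```

In this example the low type obtains \(-1\) from following the plan and \(-5\) from opting out: the loss in quality relative to both reference bundles outweighs the monetary gain.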

The following time line will be useful to illustrate the model and compare it with the standard screening model.Footnote 5

figure a

2.2 Personal equilibrium

We now introduce the notion of personal equilibrium proposed by KR, which incorporates the idea that the reference point formed by an economic agent should be in accordance with his actual choices.

Definition 1

Given any menu M, \(R: \varTheta \rightarrow M \cup \{\emptyset \}\) is a personal equilibrium (PE) if, for all \(\theta \in \varTheta \),

$$\begin{aligned} u(R(\theta )|\theta ,R)\ge u(b|\theta ,R),\quad \forall b \in M \cup \{\emptyset \}. \end{aligned}$$
(3)

We say that R is a truthful personal equilibrium (TPE) if it is a PE given \(M=R\).

Condition (3) requires that each bundle \(R(\theta )\) in the PE be optimal for type \(\theta \) with R as the reference point so that \(R(\theta )\) is the bundle the buyer actually chooses if his type turns out to be \(\theta \). Note that the equilibrium utility of each type must be no lower than its utility from choosing the null option since the buyer can always opt out after the realization of his type.

In the case of a TPE, the reference point itself is offered as a menu and therefore each type only needs to prefer his choice of bundle over the other type’s bundle or the null bundle. That is, R is a TPE if and only if the incentive compatibility and individual rationality requirements hold as follows: For each \(\theta ,\theta ^{\prime } \in \varTheta \),

$$\begin{aligned} u(R(\theta ) |\theta ,R) \ge u(R(\theta ^{\prime }) |\theta ,R)\qquad \qquad \qquad ({\text {IC}}_{\theta }) \end{aligned}$$
$$\begin{aligned} u(R(\theta )|\theta ,R)&\ge u(\emptyset |\theta ,R).\qquad \qquad \qquad \qquad ({\text {IR}}_{\theta }) \end{aligned}$$

Since these inequalities, henceforth referred to as the (IC) and (IR) constraints, are implied by (3), the following result is immediate.

Proposition 1

Suppose R is a personal equilibrium (PE) of some menu M. Then, R is a truthful personal equilibrium (TPE).

This result is a version of revelation principle since it implies that it is without loss to focus on direct menus, i.e., menus in which every bundle is purchased in equilibrium.
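Condition (3) and the TPE requirement can be checked mechanically for a finite menu. The following is a minimal sketch under assumptions that are ours: two types, \(v(q)=\sqrt{q}\), \(\lambda =2\), and hypothetical function names. It also illustrates the logic of Proposition 1: removing unchosen bundles from a menu cannot destroy a PE.

```python
import math

NULL = (0.0, 0.0)  # the null bundle (outside option)


def mu(x, lam):
    """Gain-loss function: losses are scaled by lam > 1."""
    return x if x >= 0 else lam * x


def utility(bundle, theta, reference, probs, lam):
    """u(b | theta, R) for finitely many types, with assumed v(q) = sqrt(q)."""
    q, t = bundle
    own = theta * math.sqrt(q)
    total = own - t
    for th, (q_ref, t_ref) in reference.items():
        total += probs[th] * (mu(own - th * math.sqrt(q_ref), lam) + mu(t_ref - t, lam))
    return total


def is_pe(menu, reference, probs, lam, tol=1e-9):
    """Condition (3): every planned bundle is optimal within menu plus the null bundle."""
    options = set(menu) | {NULL}
    if any(b not in options for b in reference.values()):
        return False
    for theta, planned in reference.items():
        u_plan = utility(planned, theta, reference, probs, lam)
        if any(utility(b, theta, reference, probs, lam) > u_plan + tol for b in options):
            return False
    return True


def is_tpe(reference, probs, lam):
    """R is a TPE if it is a PE when the menu consists of R's own bundles."""
    return is_pe(set(reference.values()), reference, probs, lam)


probs = {1.0: 0.5, 2.0: 0.5}
pooling = {1.0: (4.0, 3.0), 2.0: (4.0, 3.0)}     # low type exactly indifferent to opting out
overpriced = {1.0: (4.0, 3.5), 2.0: (4.0, 3.5)}  # low type prefers to opt out
larger_menu = {(4.0, 3.0), (1.0, 2.9)}           # adds a bundle that nobody chooses
```

Here `is_pe(larger_menu, pooling, probs, 2.0)` and `is_tpe(pooling, probs, 2.0)` both hold, while `is_tpe(overpriced, probs, 2.0)` fails because the low type's (IR) constraint is violated.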

The concept of personal equilibrium is, however, subject to the problem of multiple equilibria. When the seller offers a TPE menu R, the buyer might form an alternative reference point \(R^{\prime }\ne R\) and play it as a PE so that the seller fails to achieve the desired outcome. Moreover, the alternative PE could give the buyer a higher ex ante expected utility than the TPE. For instance, the TPE may generate a negative ex ante utility while there exists another PE in which the buyer does not buy at all.

One approach to resolve the issue of multiple PEs proposed by KR is to assume that the consumer always chooses the PE that maximizes his ex ante expected utility, or the preferred personal equilibrium (PPE). Let \(\mathcal {P} (M)\) denote the set of all PEs that can arise when the seller offers a menu M; that is, R belongs to \(\mathcal {P} (M)\) if \(R\subseteq M \cup \{\emptyset \}\) and R satisfies condition (3). Also, given a menu R, let U(R) denote the buyer’s corresponding ex ante expected utility:

$$\begin{aligned} U (R) := \int _{\theta \in \varTheta } u(R(\theta ) \mid \theta , R) {\text {d}}F(\theta ). \end{aligned}$$

Definition 2

Given any menu M, \(R: \varTheta \rightarrow M \cup \{\emptyset \}\) is a preferred personal equilibrium (PPE) if \(R \in \mathcal {P} (M)\) and \(U(R)\ge U(R^{\prime }) \text{ for } \text{ all } R^{\prime }\in \mathcal {P} (M)\). We say that R is a truthful preferred personal equilibrium (TPPE) if it is a PPE given \(M=R\).
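With finitely many types and bundles, the set \(\mathcal {P}(M)\) and a PPE can be found by brute-force enumeration of all contingent plans. The sketch below does this under assumptions that are ours (two types, \(v(q)=\sqrt{q}\), \(\lambda =2\), a single offered bundle, hypothetical function names); it is meant as an illustration of Definition 2, not as the method used in the analysis.

```python
import itertools
import math

NULL = (0.0, 0.0)  # the null bundle (outside option)


def mu(x, lam):
    return x if x >= 0 else lam * x


def utility(bundle, theta, reference, probs, lam):
    """u(b | theta, R) for finitely many types, with assumed v(q) = sqrt(q)."""
    q, t = bundle
    own = theta * math.sqrt(q)
    total = own - t
    for th, (q_ref, t_ref) in reference.items():
        total += probs[th] * (mu(own - th * math.sqrt(q_ref), lam) + mu(t_ref - t, lam))
    return total


def personal_equilibria(menu, types, probs, lam, tol=1e-9):
    """Enumerate every plan R: types -> menu + null bundle that satisfies condition (3)."""
    options = sorted(set(menu) | {NULL})
    found = []
    for plan in itertools.product(options, repeat=len(types)):
        ref = dict(zip(types, plan))
        if all(
            utility(ref[th], th, ref, probs, lam) + tol >= utility(b, th, ref, probs, lam)
            for th in types
            for b in options
        ):
            found.append(ref)
    return found


def ex_ante_utility(ref, probs, lam):
    """U(R): expected utility of following plan R with R as the reference point."""
    return sum(probs[th] * utility(ref[th], th, ref, probs, lam) for th in ref)


def preferred_pe(menu, types, probs, lam):
    """A PPE: a PE with the highest U (raises ValueError if no deterministic PE exists)."""
    pes = personal_equilibria(menu, types, probs, lam)
    return max(pes, key=lambda ref: ex_ante_utility(ref, probs, lam))


types = [1.0, 2.0]
probs = {1.0: 0.5, 2.0: 0.5}
menu = {(4.0, 3.0)}
pes = personal_equilibria(menu, types, probs, 2.0)
ppe = preferred_pe(menu, types, probs, 2.0)
```

In this example three plans are PEs: both types buy (with \(U=-0.5\)), only the high type buys (\(U=-1.25\)), and nobody buys (\(U=0\)). The PPE is therefore the no-purchase plan, which matches the possibility noted earlier that a truthful PE may yield negative ex ante utility while a no-purchase PE exists.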

We characterize the seller’s profit-maximizing menu of bundles under both notions, PE and PPE. In Sect. 3, the seller is assumed to be able to select her favorite PE from the menu that she offers; in Sect. 4, the buyer selects the PPE. While it may be unrealistic to assume that the seller can always manipulate the consumer’s beliefs, it also seems plausible that some consumers would respond naively to the menu on the table when they form beliefs about their future contingent actions.

In both treatments, we restrict attention to the set of direct menus by focusing on TPE and TPPE. This is without loss for the analysis of PE menu due to Proposition 1, but a similar revelation principle for PPE menus may not hold. To see this, suppose that R is a PPE given some menu \(M \ne R\). It is possible that R is not a PPE given itself—that is, R is not a TPPE—because we cannot a priori rule out the existence of some \(R^{\prime } \in \mathcal {P} (R)\) that does not belong to \(\mathcal {P} (M)\) and generates a higher ex ante expected utility. This failure of revelation principle poses a great challenge for complete analysis of optimal menu design since such analyses rely critically on the revelation principle, as well known from the mechanism design literature. We address this issue in more detail in Sect. 4.3.

3 Optimal TPE menu

3.1 Binary consumer types

We begin by characterizing the seller’s optimal PE menu for the case of binary consumer types. Let \(\varTheta = \{\theta _L, \theta _H\}\) such that \(0<\theta _{L}<\theta _H\). The probability measure on \(\theta _L\) is denoted by \(p\in (0,1)\). For ease of exposition, we refer to a reference point in this case simply as \(R:=\{r_{L},r_{H}\}\) where \(r_{i}=(q_{i}^{r},t_{i}^{r})\) for \(i=H,L\).Footnote 6

3.1.1 The seller’s problem

Proposition 1 implies that the set of PE menus is equivalent to the set of TPE menus and hence there is no loss in restricting ourselves to menus that are themselves TPEs. We sometimes refer to such a menu simply as a TPE menu and let \(\mathcal {M}\) denote the set of all TPE menus. The seller’s problem, denoted as [P], is given by

$$\begin{aligned} \max _{\{(q_{L},t_{L}),(q_{H},t_{H})\} \in \mathcal {M}} p(t_{L}-cq_{L})+(1-p)(t_{H}-cq_{H}). \qquad \qquad \qquad [P] \end{aligned}$$

Under the reference-dependent preference framework, a broader class of menus can be supported as TPEs, compared to the standard screening model. In particular, it is possible to have the low-type buyer purchasing the higher quality-price bundle and vice versa. Given such a menu, the high type suffers a loss from deviating to mimic the low type and paying more than anticipated, and this no longer supports the usual incentive compatibility argument for the necessity of quality monotonicity of a feasible menu.

One class of menus that can be easily ruled out is one where one type of buyer receives a lower quality but pays more than the other type (including the case of a higher payment for the same quality or the same payment for a lower quality). The reason is simple: If the former type deviates to the latter’s bundle, then he will enjoy a higher gain–loss utility as well as a higher intrinsic utility.

We are therefore left with the following three classes of menus to consider.

  1. Pooling menu: \(q_{H}=q_{L}\) and \(t_{H}=t_{L}\)

  2. Screening menu: \(q_{H}>q_{L}\) and \(t_{H}>t_{L}\)

  3. Reverse-screening menu: \(q_{H} < q_{L}\) and \(t_{H} < t_{L}\)

We let \(\mathcal {M}^{P}\), \(\mathcal {M}^{S}\), and \(\mathcal {M}^{R}\) denote the set of pooling, screening, and reverse-screening menus, respectively, that satisfy the (IC) and (IR) constraints. For the full expressions of these constraints, see Section S.1 of the Supplementary Material.

3.1.2 Symmetric information benchmark

Before the main analysis, we examine the optimal menu when the seller and buyer are symmetrically informed. This will give us an insight into how the informational asymmetry interacts with loss aversion to generate the optimality of pooling. Consider a profit-maximizing seller who is symmetrically informed of \(\theta \) and thus can commit to a menu ex ante such that she imposes \((q_{i},t_{i})\) upon observing each type \(\theta _{i}\) being realized. Specifically, we modify the seller’s problem [P] by dropping the (IC) constraints. Let us denote by \([P^s]\) the seller’s profit maximization problem among contracts that satisfy the (IR) constraints only.

The following result gives a necessary condition for the optimal menu with symmetric information.

Lemma 1

The solution to \([P^s]\) must be such that \(\theta _{H} v(q_{H}) \ge \theta _{L} v(q_{L})\) and \(t_{H} \ge t_{L}\).

Using the above Lemma and the fact that both (IR) constraints are binding, we obtain

$$\begin{aligned} t_{L} = \frac{(\lambda +1)}{2} \theta _{L} v(q_{L}) \; \text{ and } \; t_{H} =t_{L} + \frac{\theta _{H} v (q_H) -\theta _{L} v(q_{L})}{B(p,\lambda )}, \end{aligned}$$
(4)

where

$$\begin{aligned} B(p,\lambda ) :=\frac{1+(1-p) +p\lambda }{1+p +(1-p)\lambda }. \end{aligned}$$
(5)

Here, \(B(p,\lambda )\) measures the relative impact of loss aversion on deviation incentives in our model, where gain–loss utilities arise stochastically in both quality and monetary dimensions. Deviating from purchasing the reference bundle to the null bundle induces a loss in quality but a gain in money. Notice that \(B(p,1)=1\).
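The factor in (5) is easy to tabulate. The short sketch below (function name and parameter values are ours, chosen only for illustration) evaluates \(B(p,\lambda )\) and checks the normalization \(B(p,1)=1\).

```python
def B(p, lam):
    """Relative impact factor B(p, lambda) from (5)."""
    return (1 + (1 - p) + p * lam) / (1 + p + (1 - p) * lam)


# Hypothetical parameter values, for illustration only.
no_loss_aversion = B(0.3, 1.0)  # equals 1: no loss aversion
mostly_low_types = B(0.8, 3.0)  # equals 1.5: deviation gains weigh more heavily
mostly_high_types = B(0.2, 3.0)  # below 1: deviation costs weigh more heavily
```

Since the numerator and denominator of (5) swap when p is replaced by \(1-p\), the factor also satisfies \(B(p,\lambda )B(1-p,\lambda )=1\), so that \(B(\tfrac{1}{2},\lambda )=1\) for every \(\lambda \).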

Assuming \(\theta _{H} v (q_H) > \theta _{L} v(q_{L})\) at the optimum,Footnote 7 we can plug (4) into the objective function and take the first-order conditions to obtain

$$\begin{aligned} \frac{c}{v^{\prime }(q_{L})}&= \frac{ \left[ (\lambda +1)B(p,\lambda )-2(1-p)\right] \theta _{L}}{2p B(p,\lambda )} \end{aligned}$$
(6)
$$\begin{aligned} \frac{c}{v^{\prime }(q_{H})}&= \frac{\theta _{H}}{B(p,\lambda )}. \end{aligned}$$
(7)

Note from (6) and (7) that \(q_{L} \ge q_{H}\) if and only if

$$\begin{aligned} \frac{(\lambda +1)B(p,\lambda ) -2(1-p)}{2p} \ge \frac{\theta _{H}}{\theta _{L}}, \end{aligned}$$
(8)

which holds for \(\lambda \) exceeding some threshold since \((\lambda +1)B(p,\lambda )\) strictly increases in \(\lambda \) without bound. Thus, with \(\lambda \) high enough to satisfy (8), the symmetrically informed seller can maximize profit by endowing the low type with a higher quality but charging the high type with a larger transfer (see (4)). Note that the optimal qualities are the same across the two types only when (8) holds as equality, which is a knife-edge phenomenon. Furthermore, the same quality does not necessarily mean the same transfer.
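To make the threshold in (8) concrete, the sketch below computes it by bisection and evaluates (6) and (7) in closed form. It relies on assumptions of ours: \(v(q)=\sqrt{q}\), so that \(c/v^{\prime }(q)=2c\sqrt{q}\), together with hypothetical function names and parameter values. The bisection uses the monotonicity of the left-hand side of (8) in \(\lambda \) noted above.

```python
def B(p, lam):
    """Relative impact factor B(p, lambda) from (5)."""
    return (1 + (1 - p) + p * lam) / (1 + p + (1 - p) * lam)


def lhs_8(p, lam):
    """Left-hand side of (8)."""
    return ((lam + 1) * B(p, lam) - 2 * (1 - p)) / (2 * p)


def reversal_threshold(p, ratio, hi=1e6, iters=200):
    """Smallest lambda in [1, hi] at which (8) holds for ratio = theta_H / theta_L.

    Assumes hi is large enough for (8) to hold there.
    """
    lo = 1.0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if lhs_8(p, mid) >= ratio:
            hi = mid
        else:
            lo = mid
    return hi


def symmetric_qualities(p, lam, theta_L, theta_H, c):
    """(q_L, q_H) solving (6)-(7) under the assumed v(q) = sqrt(q)."""
    b = B(p, lam)
    k_L = ((lam + 1) * b - 2 * (1 - p)) * theta_L / (2 * p * b)
    k_H = theta_H / b
    # c / v'(q) = 2 c sqrt(q) = k  implies  q = (k / (2 c))**2
    return (k_L / (2 * c)) ** 2, (k_H / (2 * c)) ** 2


threshold = reversal_threshold(0.5, 2.0)
q_L, q_H = symmetric_qualities(0.5, 3.0, 1.0, 2.0, 1.0)
```

With \(p=\tfrac{1}{2}\) we have \(B=1\), so the left-hand side of (8) reduces to \(\lambda \) and the threshold equals \(\theta _{H}/\theta _{L}=2\). At \(\lambda =3\) the example gives \(q_{L}=2.25>q_{H}=1\), i.e., a quality reversal.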

This implies that a pooling menu, which is the main focus of our analysis, does not arise when buyers are loss averse but do not hold private information. Neither does it emerge as a consequence of asymmetric information alone, as in Mussa and Rosen (1978). The optimality of pooling is indeed a consequence of the interplay between loss aversion and asymmetric information, as we demonstrate in later sections. Intuitively, pooling will emerge as the optimal menu when the quality reversal is desirable due to loss aversion but is not feasible in the presence of asymmetric information.

3.1.3 Results

We now turn to the analysis of [P], i.e., finding an optimal menu when the seller and buyer are asymmetrically informed. A unified analysis of all possible menus is not available since different classes of menus entail different forms of gain–loss utility. Our analysis below considers each class separately to identify an optimal menu within that class, which will then lead us to characterize the overall optimal menu. Note that any pooling menu lies on the boundary of the set of feasible screening menus (\(\mathcal {M}^S\)) or reverse-screening menus (\(\mathcal {M}^R\)). The optimality of pooling will thus arise if two inequality constraints, \(q_H \ge q_L\) and \(q_H \le q_L\), which we impose to find an optimal menu within \(\mathcal {M}^S\) and \(\mathcal {M}^R\), turn out to be binding. In what follows, whenever we mention an “optimal screening (pooling or reverse-screening) menu,” it will mean optimality within the set of screening (pooling or reverse-screening) menus.

Pooling menu

We begin by characterizing the seller’s profit-maximizing choice within the class of pooling menus. Consider a pooling menu \(R =\{r= (q,t)\} \in \mathcal {M}^{P}\). Clearly, the \(({\text {IR}}_{H})\) constraint is implied by the \(({\text {IR}}_{L})\) constraint since, if both types choose the same bundle, type \(\theta _{H}\) is better off in terms of both intrinsic and gain–loss utilities while the outside payoff is type-independent. Now, \(({\text {IR}}_{L})\) can be written as

$$\begin{aligned} u(r|\theta _{L}, R )&= \theta _{L}v(q)-t-(1-p)\lambda (\theta _{H}-\theta _{L})v(q) \\&\ge u(\emptyset |\theta _{L}, R) =p[t-\lambda \theta _{L}v(q)]+(1-p)[t-\lambda \theta _{H}v(q)], \end{aligned}$$

or after rearrangement,

$$\begin{aligned} t \le \frac{(\lambda +1)}{2}\theta _{L} v(q). \end{aligned}$$
(9)

Clearly, (9) must be binding at the optimum. The following result is then immediate from the first-order condition of the seller’s profit maximization.

Proposition 2

The optimal pooling menu, \(\{(q^{p}, t^{p})\}\), is such that \(\theta _{L}v^{\prime }(q^{p})=\frac{2c}{\lambda +1}\).

Thus, the seller finds it optimal to sell a higher quality to a consumer with higher \(\lambda \). This is because the buyer wants to avoid the loss from non-participation and, therefore, is willing to pay more for a given amount of consumption if he is more loss averse, as can be seen in (9) above.
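For completeness, the steps behind (9) and Proposition 2 can be spelled out as follows; this is a sketch that uses only the expressions displayed above. The term \((1-p)\lambda \theta _{H}v(q)\) appears on both sides of \(({\text {IR}}_{L})\) and cancels, so that

$$\begin{aligned} \theta _{L}v(q)-t+(1-p)\lambda \theta _{L}v(q)&\ge t-p\lambda \theta _{L}v(q) \\ \Longleftrightarrow \quad [1+(1-p)\lambda +p\lambda ]\,\theta _{L}v(q)&\ge 2t \\ \Longleftrightarrow \quad t&\le \frac{\lambda +1}{2}\theta _{L}v(q). \end{aligned}$$

Since both types purchase the same bundle, the seller's profit from a pooling menu is \(t-cq\). Substituting the binding constraint (9) gives the concave problem

$$\begin{aligned} \max _{q\ge 0}\; \frac{\lambda +1}{2}\theta _{L}v(q)-cq, \end{aligned}$$

whose first-order condition is \(\frac{\lambda +1}{2}\theta _{L}v^{\prime }(q^{p})=c\), or equivalently \(\theta _{L}v^{\prime }(q^{p})=\frac{2c}{\lambda +1}\). The limit conditions on \(v^{\prime }\) ensure that this condition has an interior solution.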

Screening menu

Consider a screening menu \(R = \{r_{L} =(q_{L},t_{L}), r_{H}=(q_{H},t_{H})\}\in \mathcal {M}^{S} \) where \(q_{L}<q_{H}\) and \(t_{L}<t_{H}\). As in the standard screening model, we can show that the \(({\text {IC}}_{H})\) and \(({\text {IR}}_{L})\) constraints are binding at the optimum while the other constraints are not. Using a similar derivation to (9), the \(({\text {IR}}_{L})\) constraint can be written as

$$\begin{aligned} t_{L} \le \frac{\lambda +1}{2}\theta _{L}v(q_{L}), \end{aligned}$$
(10)

which must be binding at the optimum. Thus, for the same reason as in the optimal pooling menu above, the optimal quality for the low type increases with loss aversion. We refer to this as the participation effect of loss aversion, meaning that a greater aversion to the loss resulting from comparison with non-participation enables the seller to charge more and thus increase the quality for the low-type consumer.

Next, write the \(({\text {IC}}_{H})\) constraint as

$$\begin{aligned} u(r_{H}|\theta _{H},R)&= \theta _{H}v(q_{H})-t_{H}+p[\theta _{H}v(q_{H})-\theta _{L}v(q_{L})-\lambda (t_{H}-t_{L})] \\&\ge u(r_{L}|\theta _{H},R) = \theta _{H}v(q_{L})-t_{L}+p(\theta _{H}-\theta _{L})v(q_{L}) \\&\quad +(1-p)[(t_{H}-t_{L})-\lambda \theta _{H}(v(q_{H})-v(q_{L}))], \end{aligned}$$

which can then be rewritten as

$$\begin{aligned} {[}1+(1-p)+p\lambda ] (t_{H}-t_{L})\le [1+p+(1-p)\lambda ]\theta _{H}[v(q_{H})-v(q_{L})]. \end{aligned}$$
(11)

The benefit of type \(\theta _{H}\) deviating to \(r_{L}\), captured by the LHS of (11), consists of reduced payment, \(t_{H}-t_{L}\), and its positive impact on the gain–loss utility, \((1-p + p\lambda )(t_{H}-t_{L})\). To understand the latter, note first that the gain from paying \(t_{L}\) instead of \(t_{H}\) is weighted by the probability \(1-p\) with which the buyer expected the payment to be \(t_{H}\). At the same time, by the deviation, the high type avoids the loss equal to \(\lambda (t_{H}-t_{L})\) that he would have incurred from sticking with his equilibrium choice, which is weighted by the probability p with which \(\theta _{L}\) would have occurred.

The cost of deviation, captured by the RHS of (11), results from a reduced quality from \(q_{H}\) to \(q_{L}\) and can be explained similarly. One can then see that \(B(p,\lambda ) =\frac{1+(1-p) +p\lambda }{1+p +(1-p)\lambda }\), defined previously in (5), reflects the relative (benefit–cost) impact factor of deviating to a lower quality, lower price bundle, which would result in a monetary gain but a quality loss.
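The step from \(({\text {IC}}_{H})\) to (11) can be sketched as follows, writing \(\varDelta t:=t_{H}-t_{L}\) and \(\varDelta v:=v(q_{H})-v(q_{L})\) (shorthand introduced here only for this derivation). Subtracting \(u(r_{L}|\theta _{H},R)\) from \(u(r_{H}|\theta _{H},R)\) and noting that \(\theta _{H}v(q_{H})-\theta _{L}v(q_{L})-(\theta _{H}-\theta _{L})v(q_{L})=\theta _{H}\varDelta v\), we obtain

$$\begin{aligned} u(r_{H}|\theta _{H},R)-u(r_{L}|\theta _{H},R)&= \theta _{H}\varDelta v-\varDelta t+p\left[ \theta _{H}\varDelta v-\lambda \varDelta t\right] -(1-p)\left[ \varDelta t-\lambda \theta _{H}\varDelta v\right] \\&= [1+p+(1-p)\lambda ]\,\theta _{H}\varDelta v-[1+(1-p)+p\lambda ]\,\varDelta t. \end{aligned}$$

Requiring this difference to be nonnegative yields (11), and dividing both sides by \(1+p+(1-p)\lambda \) gives \(B(p,\lambda )\varDelta t\le \theta _{H}\varDelta v\).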

When binding, (11) can be written as

$$\begin{aligned} t_{H} = t_{L} + \frac{\theta _{H} [v (q_{H}) -v(q_{L})]}{B(p,\lambda )}. \end{aligned}$$
(12)

Notice from (11) that higher \(\lambda \) amplifies both the benefit and cost of deviation. If a higher \(\lambda \) makes \(B(p,\lambda )\) larger (smaller), then the loss aversion makes screening less (more) effective in enabling the extraction of more payment from the high type. We will refer to this as the screening effect of loss aversion, which could be favorable or adverse to the seller depending on the value of p. Also, (12) implies that, for fixed \(\lambda \), the effectiveness of screening is decreasing in the likelihood of low type, i.e., \(B(p,\lambda )\) is increasing in p.

Now, we describe the optimal screening menu and compare it with the optimal pooling menu.

Proposition 3

  (a) The optimal screening menu, \(\{(q^s_L,t^s_L),(q^s_H,t^s_H)\}\), is such that

    $$\begin{aligned} \frac{c}{v^{\prime }(q^s_{L})}&= \max \left\{ \frac{ (\lambda +1)B(p,\lambda )\theta _{L}-2(1-p)\theta _{H}}{2p B(p,\lambda )},0 \right\} \end{aligned}$$
    (13)
    $$\begin{aligned} \frac{c}{v^{\prime }(q^s_{H})}&= \frac{\theta _H}{B(p,\lambda )}, \end{aligned}$$
    (14)

    where \(q^s_L\), if not equal to 0, increases in \(\lambda \) and \(q^s_H\) decreases (increases) in \(\lambda \) if \(p > \frac{1}{2}\) \((p<\frac{1}{2})\).

  (b) Any screening menu is dominated by the optimal pooling menu if and only if

    $$\begin{aligned} \frac{\theta _H}{\theta _{L}} \le \left( \frac{\lambda +1}{2} \right) B(p,\lambda ), \end{aligned}$$
    (15)

    which in turn holds if and only if \(\lambda \ge \lambda _S \) for some threshold \(\lambda _S >1\) that decreases in p and increases in \(\frac{\theta _H}{\theta _L}\).

In part (a) of Proposition 3, the optimal quality \(q_{L}\) increasing with \(\lambda \) should be expected from the participation effect. The behavior of \(q_{H}\) is related to the fact that \(B(p,\lambda )\) increases with \(\lambda \) if and only if \(p > \frac{1}{2}\): That is, a higher \(\lambda \) means the adverse (favorable) screening effect if \(p > \frac{1}{2}\left( p< \frac{1}{2}\right) \).

Part (b) states the condition under which pooling dominates screening. The inequality (15) holds when the participation effect, measured by \(\frac{\lambda +1}{2}\) [see (10) above], is large and/or when the screening effect works against the profitability of screening as \(B(p,\lambda )\) gets large. There are a couple of noteworthy observations here. First, with sufficiently large \(\lambda \), the dominance of pooling over screening remains even when \(p<\frac{1}{2}\) such that the screening effect works favorably for the screening seller. This is because the participation effect dominates the screening effect, namely \(\frac{\lambda +1}{2}\) increases with \(\lambda \) faster than \(B(p,\lambda )\) decreases. Second, the threshold, \(\lambda _S\left( p,\frac{\theta _{H}}{\theta _{L}}\right) \), is decreasing in p, and this implies that screening is less attractive relative to pooling when the low-type consumers are more abundant. This follows from the fact that a higher (ex ante) likelihood of \(\theta _{L}\) generates a greater deviation incentive for the high type via the gain–loss utility (\(\partial B(p,\lambda )/ \partial p > 0\)).
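The threshold \(\lambda _S\) has no simple closed form for interior p, but it can be approximated numerically. The sketch below is one possible approach, assuming bisection on the gap between the two sides of (15); the names `pooling_gap` and `lambda_S` and the argument `ratio` (standing for \(\theta _H/\theta _L > 1\)) are hypothetical.

```python
import math


def impact_factor(p, lam):
    """B(p, lambda) from (5)."""
    return (1 + (1 - p) + p * lam) / (1 + p + (1 - p) * lam)


def pooling_gap(p, lam, ratio):
    """RHS minus LHS of (15); a non-negative value means (15) holds."""
    return 0.5 * (lam + 1) * impact_factor(p, lam) - ratio


def lambda_S(p, ratio, iterations=200):
    """Approximate the smallest lambda at which (15) holds, by bisection.

    Assumes ratio = theta_H / theta_L > 1. At p = 0 the RHS of (15)
    equals 1 for every lambda, so the threshold is infinite.
    """
    if p <= 0:
        return math.inf
    lo, hi = 1.0, 2.0
    while pooling_gap(p, hi, ratio) < 0:  # expand until (15) holds at hi
        hi *= 2
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if pooling_gap(p, mid, ratio) < 0:
            lo = mid
        else:
            hi = mid
    return hi
```

For example, `lambda_S(1.0, 1.5)` should agree with the limit \(2\sqrt{\theta _H/\theta _L}-1\) as p approaches 1, and the returned value should fall as p rises.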

Reverse-screening menu

Let us consider next a reverse-screening menu \(R = \{r_{L} =(q_{L},t_{L}), r_{H}=(q_{H},t_{H})\} \in \mathcal {M}^{R} \) such that \(q_{L} > q_{H}\) and \(t_{L} > t_{H}\), satisfying the (IC) and (IR) constraints. The reverse-screening menu is a useful device to exploit the aforementioned participation effect by giving a higher quality to the low type. Giving a higher quality to the low type, however, may create a deviation incentive for the high type. This incentive can be curbed should the high type suffer a sufficient loss from a higher deviation price. How this loss is affected by the parameters in our model will determine when the reverse-screening menu is optimal.

We first provide a couple of necessary conditions for a reverse-screening menu to be feasible or optimal.

Lemma 2

  1. (a)

    A reverse-screening menu can be a TPE only if

    $$\begin{aligned} \frac{\theta _H}{\theta _L} \le \frac{\lambda +1}{2}. \end{aligned}$$
    (16)
  2. (b)

    Any optimal reverse-screening menu must satisfy \(\theta _H v(q_H) \ge \theta _L v(q_L)\).

Part (a) states that loss aversion must be high enough to sustain a reverse-screening menu as a TPE. According to part (b), the seller does not want to reverse the qualities to the extent that the utility from quality consumption is reversed.

We now compare reverse-screening and pooling menus.

Proposition 4

Any reverse-screening menu is dominated by the optimal pooling menu if and only if

$$\begin{aligned} \frac{\theta _{H}}{\theta _{L}} \ge \frac{1+p+(1-p)\lambda }{2}, \end{aligned}$$
(17)

which in turn holds if and only if \(\lambda \le \lambda _{R} \) for some threshold \(\lambda _{R}\) that increases in p and \(\frac{\theta _{H}}{\theta _{L}}\).

Thus, if \(\lambda \) is large enough to violate (17), reverse-screening in fact dominates pooling. This arises due to the participation effect that makes the increase in \(q_L\), rather than \(q_H\), more effective in extracting surplus. Since the high-type consumer derives a higher level of utility from any given contract and therefore cares less about an improvement in quality than the low-type consumer, the attractiveness of exploiting the high type’s higher marginal intrinsic utility can be outweighed by the participation effect when the consumer is significantly loss averse.

Condition (17) shows that pooling tends to dominate reverse-screening as p gets larger. The logic is similar to that behind part (b) of Proposition 3: A higher p makes it more tempting for the high type to deviate. When the realization of the low type has been anticipated to be more likely, under screening, the high type experiences a greater loss from sticking to \(r_H\) that involves a higher payment while, under reverse-screening, the same consumer finds it less costly to deviate to \(r_L\).
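Since (17) is linear in \(\lambda \), the threshold \(\lambda _R\) can be obtained by rearranging it. The sketch below does so; the function name `lambda_R` and the argument `ratio` (standing for \(\theta _H/\theta _L\)) are illustrative assumptions.

```python
import math


def lambda_R(p, ratio):
    """Threshold implied by (17): the condition holds iff lambda <= lambda_R.

    Obtained by solving ratio = (1 + p + (1 - p) * lambda) / 2 for lambda.
    """
    if p >= 1:
        return math.inf  # (17) then reads ratio >= 1, which always holds
    return (2 * ratio - 1 - p) / (1 - p)
```

With `ratio = 1.5`, the value at p = 0 is 2, matching the limit \(\frac{2\theta _H}{\theta _L}-1\).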

Optimal menu

We are now ready to characterize the menu that maximizes the expected profit among all TPE menus.

Theorem 1

There exists some \(\hat{p} \in (0,1)\) such that \(\lambda _S \le \lambda _R \) if and only if \(p \ge \hat{p}\). Then, the optimal menu that solves [P] is

  1. (a)

    a pooling menu if \(p \ge \hat{p}\) and \(\lambda \in [\lambda _S, \lambda _R]; \)

  2. (b)

    a screening menu if \(\lambda < \min \{ \lambda _R,\lambda _S \}; \)

  3. (c)

    a reverse-screening menu if \(\lambda > \max \{ \lambda _R,\lambda _S \}; \)

  4. (d)

    either screening or reverse-screening menu (but not both) if \(p < \hat{p}\) and \(\lambda \in [\lambda _R, \lambda _S].\)

Proof

First, it is straightforward to see that

$$\begin{aligned} \lim _{p \rightarrow 0} \lambda _S&=\infty >\frac{2\theta _H}{\theta _L} -1 = \lim _{p \rightarrow 0} \lambda _R\\ \lim _{p \rightarrow 1} \lambda _S&=2\sqrt{\frac{\theta _H}{\theta _L}} -1 <\infty = \lim _{p \rightarrow 1} \lambda _R. \end{aligned}$$

Thus, by the intermediate value theorem and the monotonicity of \(\lambda _S\) and \(\lambda _R\), we can find \(\hat{p} \in (0,1)\) such that \(\lambda _S\le \lambda _R\) if and only if \(p \ge \hat{p} \). Then, parts (a) to (d) of the claim immediately follow from combining part (b) of Proposition 3 with Proposition 4. \(\square \)

Pooling is optimal if there is enough mass of low types and the consumer is sufficiently, but not too, loss averse. Otherwise, a screening or reverse-screening menu is optimal. In the latter case, there is a region of parameters, as shown in part (d), in which we have not been able to fully sort between screening and reverse-screening menus, but in most cases we expect the screening (reverse-screening) menu to be optimal if \(\lambda \) is low (high).
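The case distinction in Theorem 1 can be sketched numerically. The code below is a minimal illustration, assuming bisection for \(\lambda _S\) from (15), the rearrangement of (17) for \(\lambda _R\), and a second bisection for \(\hat{p}\); all function names and the argument `ratio` (for \(\theta _H/\theta _L > 1\)) are hypothetical.

```python
import math


def impact_factor(p, lam):
    """B(p, lambda) from (5)."""
    return (1 + (1 - p) + p * lam) / (1 + p + (1 - p) * lam)


def lambda_S(p, ratio, iterations=200):
    """Smallest lambda satisfying (15), by bisection (infinite at p = 0)."""
    if p <= 0:
        return math.inf
    gap = lambda lam: 0.5 * (lam + 1) * impact_factor(p, lam) - ratio
    lo, hi = 1.0, 2.0
    while gap(hi) < 0:
        hi *= 2
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        lo, hi = (mid, hi) if gap(mid) < 0 else (lo, mid)
    return hi


def lambda_R(p, ratio):
    """Largest lambda satisfying (17)."""
    return math.inf if p >= 1 else (2 * ratio - 1 - p) / (1 - p)


def p_hat(ratio, iterations=60):
    """Value of p at which lambda_S and lambda_R cross, by bisection on p."""
    lo, hi = 1e-6, 1 - 1e-6
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if lambda_S(mid, ratio) > lambda_R(mid, ratio):
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def optimal_menu_type(p, lam, ratio):
    """Classify (p, lambda) into the four cases of Theorem 1."""
    ls, lr = lambda_S(p, ratio), lambda_R(p, ratio)
    if lam < min(ls, lr):
        return "screening"
    if lam > max(ls, lr):
        return "reverse-screening"
    if ls <= lr:
        return "pooling"
    return "screening or reverse-screening"
```

Evaluating `optimal_menu_type` on a grid of \((\lambda ,p)\) with `ratio = 1.5` gives a rough partition of the parameter space into the four cases.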

The central message of Theorem 1 is the optimality of pooling. Another noteworthy theoretical prediction of our model is the possibility of optimal reverse-screening under sufficiently large \(\lambda \). We nonetheless show below that this latter result does not hold in a model with a continuum of buyer types [Theorem 2, part (c)] or with an alternative gain–loss utility specification (Proposition 7).

The following example illustrates how the optimal menu varies with the parameter values. Here, pooling is optimal for a wide range of parameter values, while reverse-screening requires \(\lambda \) to be larger than 2.Footnote 8

Example 1

Suppose that \(\frac{\theta _{H}}{\theta _{L}}=1.5\). Figure 1 divides the space of \((\lambda ,p)\) into four regions according to Theorem 1 and illustrates the type of optimal menu in each region.

Fig. 1 Optimal TPE menu

It can be shown, though only numerically, that in the region (d), there is a threshold value of \(\lambda \) for each p below (above) which the screening (reverse-screening) menu is optimal. Below the dotted line, the optimal screening menu entails exclusion of the buyer with low willingness to pay [see (13) above].Footnote 9

Notice in the above example that, at low values of p, loss aversion actually generates a benefit from serving also the low-type buyer who would otherwise be excluded by the profit-maximizing seller. This is due to the participation effect that enables the firm to sell a higher quality-price bundle to the low type than in the model without loss aversion.Footnote 10

3.2 A continuum of consumer types

In this section, we explore the scope of our findings beyond binary consumer types by considering a continuum-type case. Section S.2 of the Supplementary Material offers a detailed analysis, including formal proofs and numerical examples of the main results.

Suppose that \(\theta \in [\underline{\theta },\overline{\theta }]\) with a cdf F, which has a strictly positive and continuously differentiable pdf f. Define the “virtual value” function as

$$\begin{aligned} J(\theta ) :=\theta -\frac{1-F(\theta )}{f(\theta )}, \end{aligned}$$

and assume that it is strictly increasing. Without loss aversion, this assumption leads to full separation of types.

Let \((q,t): [\underline{\theta },\overline{\theta }] \rightarrow \mathbb {R}_{+}\times \mathbb {R}\) denote a menu offered by the seller. For simplicity, we assume that \(q(\cdot )\) and \(t(\cdot )\) are continuous.Footnote 11 We restrict attention to two classes of monotone menus: (i) both \(q(\cdot )\) and \(t(\cdot )\) are non-decreasing; and (ii) both \(q(\cdot )\) and \(t(\cdot )\) are non-increasing while \(\theta v(q(\theta ))\) is non-decreasing. With some abuse of terminology, we refer to the former class of menus as screening menus and the latter as reverse-screening menus.

Given a feasible TPE menu, with some abuse of notation, let \(U(\theta ^{\prime };\theta )\) denote the payoff of type \(\theta \) reporting \(\theta ^{\prime }\) and let \(U(\theta ) := U(\theta ;\theta )\). Then, the (IC) constraint can be written as

$$\begin{aligned} U(\theta )= \max _{\theta ^{\prime }\in [\underline{\theta },\overline{\theta }]}U(\theta ^{\prime };\theta ),\quad \forall \theta , \end{aligned}$$
(18)

while the (IR) constraint can be written as

$$\begin{aligned} U (\theta ) \ge \int _{\underline{\theta }}^{\overline{\theta }} (t(s) -\lambda s v(q(s))) {\text {d}} F(s), \quad \forall \theta . \end{aligned}$$
(19)

In both screening and reverse-screening menus we consider, \(\theta v(q(\theta ))\) is non-decreasing and, hence, we can define

$$\begin{aligned} \hat{\theta }(\theta ,\theta ^{\prime }) :=\sup \{ r \in [\underline{\theta },\overline{\theta }]\, |\, s v (q(s)) \le \theta v(q(\theta ^{\prime })),\,\, \forall s \le r\}. \end{aligned}$$

Note that if type \(\theta \) (mis)reports to be type \(\theta ^{\prime }\) and receives \(q (\theta ^{\prime })\), he experiences a utility gain (loss) in the quality dimension, compared to the types below (above) \(\hat{\theta } (\theta ,\theta ^{\prime })\).

We can then write

$$\begin{aligned} U(\theta ^{\prime };\theta )&= \theta v(q(\theta ^{\prime })) -t(\theta ^{\prime }) + \left[ \int _{\underline{\theta }}^{\hat{\theta }(\theta ,\theta ^{ \prime })}(\theta v(q(\theta ^{\prime }))-s v(q(s))){\text {d}}F(s)\right. \nonumber \\&\quad \left. +\int _{\theta ^{\prime }}^{ \overline{\theta }}(t(s)-t(\theta ^{\prime })){\text {d}}F(s)\right] \nonumber \\&\quad -\lambda \left[ \int _{\hat{\theta }(\theta ,\theta ^{\prime })}^{\overline{\theta } }(sv(q(s))-\theta v(q(\theta ^{\prime }))){\text {d}}F(s)+\int _{\underline{\theta } }^{\theta ^{\prime }}(t(\theta ^{\prime })-t(s)){\text {d}}F(s)\right] . \end{aligned}$$

The first-order condition for incentive compatibility amounts to the followingFootnote 12:

$$\begin{aligned} \left. \frac{\partial }{\partial \theta ^{\prime }} U(\theta ^{\prime };\theta ) \right| _{\theta ^{\prime } =\theta }= & {} \theta \left( v (q(\theta ))\right) ^{\prime } \left[ 1+F(\theta )\right. \nonumber \\&\left. +\lambda (1-F(\theta ))\right] -t^{\prime }(\theta )\left[ 1 + (1-F(\theta ))+\lambda F(\theta )\right] = 0. \end{aligned}$$
(20)

To see the intuition behind this expression, consider the cost and benefit to type \(\theta \) of slightly overstating his type. On the one hand, the intrinsic utility from quality consumption marginally increases by \(\theta (v(q(\theta )))^{\prime }\). From this, the gain that type \(\theta \) enjoys relative to the types below increases by \(\theta (v(q(\theta )))^{\prime } F(\theta )\) while the loss, which type \(\theta \) suffers relative to the types above, decreases by \(\lambda \theta (v(q(\theta )))^{\prime }(1-F(\theta ))\). Thus, the overall marginal benefit in the quality dimension is proportional to \(1+F(\theta )+\lambda (1-F(\theta ))\). On the other hand, due to a higher payment after the deviation, the intrinsic utility decreases by \(t^{\prime }(\theta )\). From this, the gain that type \(\theta \) enjoys relative to the types above decreases by \(t^{\prime } (\theta )(1-F(\theta ))\) while the loss, which type \(\theta \) suffers relative to the types below, increases by \(\lambda t^{\prime } (\theta )F(\theta )\). Thus, the overall marginal cost in the money dimension is proportional to \(1+ (1- F(\theta )) + \lambda F(\theta )\).

We can rewrite (20) as

$$\begin{aligned} t^{\prime }(\theta )=(v(q(\theta )))^{\prime }\frac{\theta (1+F(\theta )+ \lambda (1-F(\theta )))}{1+(1-F(\theta ))+\lambda F(\theta )}= (v(q(\theta )))^{\prime }G(\theta ,\lambda ), \end{aligned}$$
(21)

where

$$\begin{aligned} G(\theta ,\lambda ):=\frac{\theta }{H (\theta ,\lambda )} \; \text{ and } \; H (\theta ,\lambda ) := \frac{ 1+(1-F(\theta ))+\lambda F(\theta )}{1+F(\theta )+\lambda (1-F(\theta ))}. \end{aligned}$$

Note that \(H(\theta ,\lambda )\) is the continuum-type counterpart of \(B(p,\lambda )\) in (12). It affects the rate at which the payment increases as the consumer’s type, and thus its corresponding quality, marginally increases. Without reference-dependent utility, the rate of increase is proportional to \(G(\theta ,1)=\theta \); this should be adjusted using \(H(\theta ,\lambda )\) in the presence of reference-dependent utility. We refer to \(G (\theta ,\lambda )\) as the “gain–loss-adjusted type,” whose behavior is crucial for determining the optimal quality schedule. Note that, for \(\lambda > 1\), \(G(\theta ,\lambda ) > \theta \) if \(\theta < F^{-1}\left( \frac{1}{2}\right) \) (and \(G(\theta ,\lambda ) < \theta \) if \(\theta > F^{-1}\left( \frac{1}{2}\right) \)), so the gain–loss-adjusted type is leveled out. Moreover, \(H(\theta ,\lambda )\) increases in \(\theta \) and does so faster with higher \(\lambda \), which may cause \(G(\theta ,\lambda )=\frac{\theta }{H(\theta ,\lambda )} \) to decrease.
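To illustrate the leveling-out property, the sketch below evaluates \(H\) and \(G\) under an assumed uniform distribution on \([1,2]\). The distribution, the helper `uniform_cdf`, and the sample values are illustrative assumptions rather than part of the model.

```python
def uniform_cdf(theta, lo=1.0, hi=2.0):
    """CDF of an assumed uniform type distribution on [lo, hi]."""
    return min(max((theta - lo) / (hi - lo), 0.0), 1.0)


def H(theta, lam, cdf=uniform_cdf):
    """Continuum-type counterpart of B(p, lambda), with F(theta) in place of p."""
    F = cdf(theta)
    return (1 + (1 - F) + lam * F) / (1 + F + lam * (1 - F))


def G(theta, lam, cdf=uniform_cdf):
    """Gain-loss-adjusted type: theta divided by H(theta, lambda)."""
    return theta / H(theta, lam, cdf)


# Below the median (1.5) the adjusted type exceeds theta; above it, the reverse.
for theta in (1.2, 1.5, 1.8):
    print(theta, round(G(theta, 3.0), 4))
```

In this particular parameterization the adjusted type at 1.8 is lower than at 1.2, which is an instance of \(G\) decreasing in \(\theta \).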

We next present our results of this section.

Theorem 2

Consider the case of a continuum of consumer types, and restrict attention to monotone menus. The optimal TPE menu has the following properties:

  1. (a)

    Suppose that (i) \(\theta (1+F(\theta )+\lambda (1-F(\theta ))) \) is non-decreasing in \(\theta \) and (ii) \(\frac{\lambda ^2 +2\lambda -3}{2(\lambda +1)} > \frac{1}{\overline{\theta }f(\overline{\theta })}\). Then, pooling occurs around the highest type \(\overline{\theta }\).

  2. (b)

    Suppose that \(\underline{\theta } >0\), \(\theta f(\theta ) > F(\theta ) \ \forall \theta \), and \(f^{\prime }(\theta )\le 0 \ \forall \theta \). Then, there exists some \(\overline{\lambda }>1\) such that, for any \(\lambda > \overline{\lambda }\), pooling occurs over the entire interval \([\underline{\theta },\overline{\theta }]\).

  3. (c)

    Any reverse-screening menu is dominated by a pooling menu.

In part (a), condition (i) guarantees that a quality-transfer schedule that deters deviation to a marginal type does so to all other types and hence global incentive compatibility is implied by local consideration.Footnote 13 Condition (ii) is equivalent to requiring that \(G_{\theta } (\overline{\theta },\lambda )<0\), i.e., the gain–loss-adjusted type decreases with the original type around the top. Since there is no information rent to account for at the top, this means that the gain–loss-adjusted virtual value also decreases, leading to pooling at the top. Note that the inequality never holds if \(\lambda =1\).
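The equivalence between condition (ii) and a negative slope of \(G\) at the top can be checked numerically. The sketch below assumes a uniform distribution on \([1,2]\) (so that \(\overline{\theta }f(\overline{\theta })=2\)) and uses a backward finite difference; the function names and the step size are hypothetical choices.

```python
def condition_ii(lam, theta_bar, f_bar):
    """Condition (ii) of Theorem 2(a); f_bar is the density at theta_bar."""
    return (lam**2 + 2 * lam - 3) / (2 * (lam + 1)) > 1 / (theta_bar * f_bar)


def G_uniform(theta, lam, lo=1.0, hi=2.0):
    """G(theta, lambda) under an assumed uniform distribution on [lo, hi]."""
    F = (theta - lo) / (hi - lo)
    return theta * (1 + F + lam * (1 - F)) / (1 + (1 - F) + lam * F)


def G_slope_at_top(lam, lo=1.0, hi=2.0, h=1e-6):
    """Backward finite-difference approximation of G_theta at theta_bar."""
    return (G_uniform(hi, lam, lo, hi) - G_uniform(hi - h, lam, lo, hi)) / h


for lam in (1.0, 1.3, 2.0):
    print(lam, condition_ii(lam, 2.0, 1.0), round(G_slope_at_top(lam), 4))
```

Under these assumptions the condition switches from false to true at the same \(\lambda \) at which the approximate slope changes sign.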

Part (b) gives a set of conditions sufficient for full pooling to be optimal. The first condition, \(\underline{\theta } >0\), prevents the optimal menu from excluding the bottom type, as required by a full pooling menu. To understand the second condition, let us first note

$$\begin{aligned} \lim _{\lambda \rightarrow \infty } G(\theta ,\lambda ) = \theta \frac{1-F(\theta )}{ F(\theta )}. \end{aligned}$$

Thus, for sufficiently high \(\lambda \), the gain–loss-adjusted type decreases going from \(\underline{\theta }\) to \(\overline{\theta }\) while it may not be in between. Then, the condition that \(\theta f(\theta ) > F(\theta ) \ \forall \theta \) ensures that this expression monotonically decreases over the entire interval so that \(G_{\theta }(\theta ,\lambda )\) is always negative for sufficiently high \(\lambda \). The last condition, \(f^{\prime }(\theta ) \le 0\), ensures (along with the second condition) that \(G_{\theta \theta } (\theta ,\lambda )\le 0\) for sufficiently high \(\lambda \), which means worsening of the information rent problem due to loss aversion. Note that this condition is consistent with the observation in the previous binary-type analysis that the screening effect adversely affects the profitability of a screening menu when the low type is abundant.

Part (c) shows that, in contrast to the binary-type case, the reverse-screening menu can no longer be optimal with continuously many types. Recall that we consider reverse-screening menus whose quality/transfer schedule is non-increasing. Thus, the class of menus that are dominated by a pooling menu here includes any menu in which the quality/transfer schedule is strictly decreasing over some local interval of types while being constant elsewhere. To understand this result, recall that a key feature of optimal reverse-screening with binary types was the participation effect: For the low willingness-to-pay consumer, the participation constraint must be binding at the optimum and therefore the additional loss arising from non-participation allows the firm to extract a greater payment from this type by offering a higher quality product [see (10)]. With a continuum of types, this effect no longer applies. The participation constraint similarly binds for the lowest type, but the corresponding revenue impact is only marginal. On the other hand, just as in the binary case, the incentive compatibility requirement works against the profitability of reverse-screening menus.

Remark 1

Our derivation of the optimal menu is based on the restriction to monotone menus. Therefore, Theorem 2 implies the following: When the conditions stipulated in part (a) or (b) are met, the optimal TPE menu involves either pooling or else a strict violation of monotonicity (“local reverse-screening”). Neither contractual form is predicted by the standard model with increasing virtual value.

4 Optimal TPPE menu

4.1 The seller’s problem

Let us next consider a consumer who is capable of choosing the best PE from a given menu of bundles. We restrict attention to the binary consumer type case and TPPE menus, i.e., TPE menus that generate the highest ex ante utility to the consumer among all corresponding PEs.

Given any TPE menu \(R = \{b_L, b_H\} \in \mathcal {M}\), let

$$\begin{aligned} C(R) := \left\{ R^{\prime } = \{ b_L^{\prime }, b_H^{\prime } \} \ne R\, | \, b_i^{\prime } = \emptyset , b_L, \text{ or } b_H \text{ for } \text{ each } i = L,H \right\} , \end{aligned}$$

that is, the set of all menus other than R that can arise from each of the two types choosing a bundle contained in R. In order for a TPE menu \(R = \{b_L, b_H\}\) to be a TPPE, it must be that for every alternative consumption plan \(R^{\prime }\in C(R)\), either \(R^{\prime }\) fails to be a PE or the buyer’s ex ante payoff from \(R^{\prime }\) does not exceed that from R. This requirement will be met if and only if R and \(R^{\prime }\) satisfy at least one of the five inequalities below:

$$\begin{aligned} u(b_L^{\prime }|\theta _L, R^{\prime }) < u( \tilde{b} |\theta _L, R^{\prime })\quad \text{ for } \tilde{b} \in R {\setminus } \{b_L^{\prime }\} \qquad \qquad \qquad ({\text {FIC}}_L) \end{aligned}$$
$$\begin{aligned} u (b_L^{\prime }| \theta _L, R^{\prime }) < u (\emptyset | \theta _L,R^{\prime }) \qquad \qquad \qquad \quad \qquad \qquad \qquad \qquad ({\text {FIR}}_L) \end{aligned}$$
$$\begin{aligned} u (b^{\prime }_H | \theta _H, R^{\prime }) < u (\tilde{b} | \theta _H, R^{\prime })\quad \text{ for } \tilde{b} \in R {\setminus } \{b_H^{\prime }\} \qquad \qquad \qquad ({\text {FIC}}_H) \end{aligned}$$
$$\begin{aligned} u (b_H^{\prime }| \theta _H, R^{\prime }) < u (\emptyset | \theta _H, R^{\prime }) \qquad \qquad \qquad \quad \qquad \qquad \qquad \qquad ({\text {FIR}}_H) \end{aligned}$$
$$\begin{aligned} U(R^{\prime }) \le U(R). \qquad \qquad \qquad \quad \qquad \qquad \qquad \qquad \qquad \qquad (U) \end{aligned}$$

Fixing a consumption plan R, the first four inequalities above represent violations of the four (IC) and (IR) conditions, respectively, for an alternative plan \(R^{\prime }\) to constitute itself a PE. These inequalities will be referred to as the (FIC) and (FIR) conditions. The last inequality means that the buyer’s ex ante payoff from \(R^{\prime }\) does not exceed that from R. We say that \(R\in \mathcal {M}\) satisfies the PPE requirement with respect to \(R^{\prime }\) if at least one of the above five inequalities is satisfied.
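The set \(C(R)\) is small enough to enumerate directly. The sketch below lists the alternative consumption plans for a two-bundle menu; the string labels and the function name `alternative_plans` are illustrative assumptions, and the code only enumerates plans without evaluating any of the five inequalities.

```python
from itertools import product

# Each type may pick the null bundle or either offered bundle (labels are hypothetical).
OPTIONS = ("null", "b_L", "b_H")


def alternative_plans(intended=("b_L", "b_H")):
    """All plans (choice of type L, choice of type H) other than the intended one."""
    return [plan for plan in product(OPTIONS, repeat=2) if plan != intended]


plans = alternative_plans()
print(len(plans))  # 3 * 3 - 1 alternatives
for low_choice, high_choice in plans:
    print(low_choice, high_choice)
```

Each listed plan must then fail to be a PE, through one of the (FIC) or (FIR) inequalities, or yield no higher ex ante payoff, through (U).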

A TPE menu \(R \in \mathcal {M}\) is a TPPE if and only if it satisfies the PPE requirement with respect to \(R^{\prime }\) for every \(R^{\prime }\in C(R)\). Let \(\mathcal {M}^e\) denote the set of all such menus. Then, the seller’s corresponding optimization program is given as followsFootnote 14:

$$\begin{aligned} \max _{\{(q_{L},t_{L}),(q_{H},t_{H})\} \in \mathcal {M}^e} p(t_{L}-cq_{L})+(1-p)(t_{H}-cq_{H}). \qquad \qquad \qquad [P^e] \end{aligned}$$

4.2 Results

We begin our analysis of optimal TPPE menu by exploring a necessary condition for a screening menu to be a TPPE. Suppose that the firm offers \(R =\{b_L, b_H\}\) such that \(b_L \ne b_H\) intended to screen the high-type consumer. The problem is that the consumer may instead form, or deviate to, an alternative consumption plan from the offered bundles. In particular, choosing a constant bundle poses a potential benefit in terms of gain–loss utilities. Our first result provides the conditions for a screening menu to satisfy the PPE requirement with respect to the pooling menus. To state the result, define

$$\begin{aligned} \alpha (p,\lambda ) := {\left\{ \begin{array}{ll} \frac{\lambda +1}{2} &{} \text{ if } p < \frac{\lambda +2}{\lambda +3}\\ \frac{1+(1-p)(\lambda -1)}{1-(1-p)(\lambda -1)} &{} \text{ if } p \ge \frac{\lambda +2}{\lambda +3} \end{array}\right. } \quad \text{ and } \quad \beta (p,\lambda ) := {\left\{ \begin{array}{ll} \frac{1-p(\lambda -1) }{1+p(\lambda -1)} &{} \text{ if } p \le \frac{1}{\lambda +3} \\ \frac{2}{\lambda +1} &{} \text{ if } p > \frac{1}{\lambda +3}. \end{array}\right. } \end{aligned}$$
(22)

Lemma 3

Fix any screening menu \(R = \{b_L, b_H\}\). Then, we obtain the following:

  1. (a)

    R satisfies the PPE requirement with respect to \(R^H := \{b_H\}\) if and only if

    $$\begin{aligned} \frac{t_H - t_L}{v_H -v_L} \ge \theta _L \alpha (p,\lambda ) (\text{ With } \text{ the } \text{ inequality } \text{ being } \text{ strict } \text{ if } p < \frac{\lambda +2}{\lambda +3}); \end{aligned}$$
    (23)
  2. (b)

    R satisfies the PPE requirement with respect to \(R^L := \{b_L\} \) if and only if

    $$\begin{aligned} \frac{t_H - t_L}{v_H -v_L} \le \theta _H \beta (p,\lambda ) (\text{ With } \text{ the } \text{ inequality } \text{ being } \text{ strict } \text{ if } p > \frac{1}{\lambda +3}). \end{aligned}$$
    (24)

Furthermore, conditions (23) and (24) imply that R is a TPE.

Part (a) is derived from the following considerations. If \(b_H\) were so expensive relative to \(b_L\) as to satisfy (23), the consumer would not deviate to \(R^H\) (under which he would always consume \(b_H\)) for one of two reasons: Either the low type prefers \(b_L\) to \(b_H\) so that \(R^H\) cannot be a PE, or the expected transfer from \(R^H\) is sufficiently higher than that from R such that \(R^H\) overall yields a lower ex ante payoff than R. Part (b) and condition (24) are derived similarly by considering \(R^L\). These two conditions also turn out to ensure that the screening menu R is itself a TPE, greatly facilitating our characterization below.
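Conditions (23) and (24) bound the price-per-utility slope \(\frac{t_H-t_L}{v_H-v_L}\) from below and above. The sketch below encodes \(\alpha \) and \(\beta \) from (22) and tests a candidate screening menu against both bounds; the function `satisfies_lemma3` and the numbers used are hypothetical.

```python
def alpha(p, lam):
    """alpha(p, lambda) as defined piecewise in (22)."""
    if p < (lam + 2) / (lam + 3):
        return (lam + 1) / 2
    k = (1 - p) * (lam - 1)
    return (1 + k) / (1 - k)


def beta(p, lam):
    """beta(p, lambda) as defined piecewise in (22)."""
    if p <= 1 / (lam + 3):
        k = p * (lam - 1)
        return (1 - k) / (1 + k)
    return 2 / (lam + 1)


def satisfies_lemma3(t_L, t_H, v_L, v_H, theta_L, theta_H, p, lam):
    """Check (23) and (24), including strictness, for a menu with v_H > v_L."""
    slope = (t_H - t_L) / (v_H - v_L)
    lower, upper = theta_L * alpha(p, lam), theta_H * beta(p, lam)
    ok_23 = slope > lower if p < (lam + 2) / (lam + 3) else slope >= lower
    ok_24 = slope < upper if p > 1 / (lam + 3) else slope <= upper
    return ok_23 and ok_24


# Illustrative check: the admissible slope window is non-empty at lambda = 1.2
# but empty at lambda = 2 when theta_H / theta_L = 1.5 and p = 0.5.
print(satisfies_lemma3(0.0, 1.2, 0.0, 1.0, 1.0, 1.5, 0.5, 1.2))
print(satisfies_lemma3(0.0, 1.2, 0.0, 1.0, 1.0, 1.5, 0.5, 2.0))
```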

It follows from (23) and (24) that a screening TPPE menu exists only if the RHS of (23) is smaller than the RHS of (24), which delivers the necessary condition for the existence of a screening TPPE menu. We next show that this condition is also sufficient and holds if \(\lambda \) is not too large. A reverse-screening TPPE menu can exist only if \(\lambda \) is sufficiently large. In contrast, one can always find a pooling TPPE menu that yields a positive profit.

Proposition 5

Define \(\overline{\lambda }_S \in (1, \infty )\) such that \(\theta _L \alpha \left( p,\overline{\lambda }_S\right) =\theta _H \beta \left( p,\overline{\lambda }_S\right) \). Also, define

$$\begin{aligned} \overline{\lambda }_R := \max \left\{ \frac{2\theta _H-(1+p)\theta _L}{(1-p)\theta _L}, 1+ \frac{1}{p}\right\} > \overline{\lambda }_S. \end{aligned}$$

We obtain the following:

  1. (a)

    There exists a screening TPPE menu if and only if \(\lambda < \overline{\lambda }_S\). Also, there exist \(\underline{p}\) and \(\overline{p}\) with \(0<\underline{p}<\overline{p}<1\) such that, as p increases, \(\overline{\lambda }_S\) is (continuously) decreasing for \(p < \underline{p} \), constant for \(p \in [\underline{p},\overline{p}]\), and increasing for \(p > \overline{p}\).

  2. (b)

    There exists a reverse-screening TPPE menu only if \(\lambda \ge \overline{\lambda }_R\).

  3. (c)

    There always exists a pooling TPPE menu that yields a positive profit.

An immediate implication of Proposition 5 is that only pooling menus can be sustained as TPPE if the loss aversion parameter is in the range \([\overline{\lambda }_S,\overline{\lambda }_R)\). Furthermore, part (a) shows that screening is feasible over the smallest range of \(\lambda \) when p takes an intermediate value: \(\overline{\lambda }_S\) is minimized when \(p \in [\underline{p}, \overline{p}]\).

To gain some intuition, note first that the gain–loss utilities are generated by the difference between the actual realized type and the expectation. Therefore, they occur more often when the type distribution has a greater variance, which, in the case of binary types, is true when p is closer to a half. In contrast to the PE analysis earlier, we are now concerned with the consumer’s ex ante payoff comparisons across multiple PEs: a greater variance in the type distribution makes the contingent consumption plan less attractive ex ante.

It remains to show the shape of the profit-maximizing TPPE menu. It turns out that a screening menu is optimal whenever it can be supported as a TPPE. We state our next theorem.

Theorem 3

The optimal menu that solves \([P^e]\) is

  1. (a)

    a pooling menu if \(\lambda \in [\overline{\lambda }_S,\overline{\lambda }_R)\);

  2. (b)

    a screening menu if \(\lambda < \overline{\lambda }_S\), and the optimal \(q_L\) and \(q_H\) solve

    $$\begin{aligned} \frac{c}{v^{\prime } (q_L)}&= \frac{\theta _L \alpha (p,\lambda )}{p} - \frac{\theta _H (1-p) \beta (p,\lambda )}{p} \end{aligned}$$
    (25)
    $$\begin{aligned} \frac{c}{v^{\prime } (q_H)}&= \theta _H \beta (p,\lambda ). \end{aligned}$$
    (26)

Our proof of part (b) consists of two steps. First, we take a screening menu and solve a relaxed problem by imposing the PPE requirement only for a subset of deviations, \(R^L\), \(R^H\), and \(R^{\emptyset H} =\{\emptyset ,H\}\). As shown in Lemma 3, the deviations to \(R^L\) and \(R^H\) can be deterred by invoking (23) and (24). In order to deter the deviation to \(R^{\emptyset H}\), the transfer for the low type, \(t_L\), should not be too large since otherwise the buyer would be better off ex ante choosing \(R^{\emptyset H}\), i.e., (U) is violated. This imposes another upper bound on \(t_L\) in addition to the bound imposed by \(({\text {IR}}_L)\) as part of the TPE conditions. These two bounds can be written together as \(t_L \le \theta _L \alpha (p,\lambda ) v(q_L)\) (where \(\alpha (p,\lambda )\) is as defined in (22)). This constraint and (24) must be binding at the optimum of the relaxed problem, which leads to the first-order conditions given in (25) and (26). The second step of the proof then shows that the optimal menu for the relaxed problem satisfies all other PPE requirements.
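For a concrete feel for (25) and (26), the sketch below solves them in closed form under an assumed utility \(v(q)=\sqrt{q}\), for which \(c/v^{\prime }(q) = 2c\sqrt{q}\). The functional form, the function name `screening_qualities`, and the parameter values are illustrative assumptions; the sketch handles only the interior case in which the RHS of (25) is positive.

```python
def alpha(p, lam):
    """alpha(p, lambda) from (22)."""
    if p < (lam + 2) / (lam + 3):
        return (lam + 1) / 2
    k = (1 - p) * (lam - 1)
    return (1 + k) / (1 - k)


def beta(p, lam):
    """beta(p, lambda) from (22)."""
    if p <= 1 / (lam + 3):
        k = p * (lam - 1)
        return (1 - k) / (1 + k)
    return 2 / (lam + 1)


def screening_qualities(theta_L, theta_H, p, lam, c):
    """Solve (25)-(26) assuming v(q) = sqrt(q), so that 2 * c * sqrt(q) = RHS."""
    rhs_L = (theta_L * alpha(p, lam) - theta_H * (1 - p) * beta(p, lam)) / p
    rhs_H = theta_H * beta(p, lam)
    if rhs_L <= 0:
        raise ValueError("no interior solution for q_L at these parameters")
    q_L = (rhs_L / (2 * c)) ** 2
    q_H = (rhs_H / (2 * c)) ** 2
    return q_L, q_H


# Illustrative parameters (assumptions) with lambda below the screening threshold.
q_L, q_H = screening_qualities(theta_L=1.0, theta_H=1.5, p=0.5, lam=1.2, c=1.0)
print(round(q_L, 6), round(q_H, 6))
```

In this example the high type receives the higher quality, as a screening menu requires.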

We have not derived a boundary beyond which reverse-screening begins to dominate pooling, which can still be optimal when \(\lambda \ge \overline{\lambda }_R\).Footnote 15 Nonetheless, Theorem 3 demonstrates that the additional insurance motive captured by the PPE requirement favors pooling for a wide range of parameter values. We offer a numerical illustration in Fig. 2. To highlight the contrast with the TPE results earlier, we set \(\frac{\theta _H}{\theta _L} = 1.5\) as in Example 1 and plot \(\overline{\lambda }_S\) and \(\overline{\lambda }_R\) together with \(\lambda _S\) and \(\lambda _R\) appearing in Fig. 1.

Fig. 2 Optimal TPPE menu

Remark 2

Notice the shaded region at the top left of Fig. 2 where the optimal TPE menu is pooling but screening is the optimal strategy under TPPE. The introduction of PPE requirements reduces the profitability of both types of menu. For instance, an optimal pooling TPE menu may entail an alternative PE in which the buyer never makes a purchase.Footnote 16 When the likelihood of the low type is large, the PPE requirements have a greater impact on pooling than on screening.

4.3 A role for redundant bundle

The analysis of PPE menus above followed the spirit of the revelation principle, focusing on the direct revelation menus. The restriction to direct menus is without loss if the seller is allowed to select the truthful equilibrium, or TPE in our setup. However, the notion of PPE also seeks optimality from the agent’s perspective and hence renders the revelation principle inapplicable. In this section, we present a new possibility that an indirect menu can improve the seller’s profit upon the optimal pooling TPPE menu previously characterized. The alternative menu that we propose features two bundles, but both consumer types pool on a single bundle, with the other remaining redundant.

Suppose that the optimal TPPE menu is a pooling menu \(M = \{b^*=(q^*, t^*)\}\) for which the PPE requirement against (the deviation to) the null menu \(R= \{ \emptyset ,\emptyset \}\) boils down to condition (U). This condition then imposes an upper bound on the transfer as follows:

$$\begin{aligned} t^*\le \big [ p\theta _L + (1-p)\theta _H -p (1-p)(\lambda -1) (\theta _H -\theta _L) \big ]v(q^*) = \varPhi v (q^*), \end{aligned}$$
(27)

where

$$\begin{aligned} \varPhi := p \theta _L + (1-p)\theta _H - p(1-p)(\lambda -1) (\theta _H -\theta _L). \end{aligned}$$

Let us now modify M and design a new menu \(M^{\prime } = \{ b=(q,t), b^{\prime }=(q^{\prime },t^{\prime })\}\), where

  • \(q=q^*\) and \(t=\varPhi v (q^*) +\epsilon \) for \(\epsilon >0\);

  • \(q^{\prime } =\delta \) and \(t^{\prime } = \theta _H \frac{2}{\lambda +1} v (q^{\prime }) -\delta ^{\prime }\) for \(\delta , \delta ^{\prime } >0\).

Since \( q = q^*\) and \(t > t^*\), the seller’s profit is higher under \(M^{\prime }\) than under M, provided that both types pooling on b constitutes a PPE.

This latter observation is indeed true under the following parametric restrictions.

Assumption 1

  1. (i)

    \( \overline{\lambda }_S< \lambda < 1+ \frac{1}{p}\);

  2. (ii)

    \(\theta _H \frac{2}{\lambda +1}< \varPhi < \theta _L \frac{\lambda +1}{2}\);

  3. (iii)

    \( \frac{1}{\varPhi } > \max \Big \{ \left( \frac{p\lambda +1}{2} \right) \frac{1}{\theta _H}, \left( \frac{1-(1-p)(\lambda -1)}{1+(1-p)(\lambda -1)}\right) \frac{1}{\theta _L} \Big \}\);

  4. (iv)

    \(\theta _H \frac{2}{\lambda +1} < \theta _L \frac{1+p +(1-p)\lambda }{1+(1-p) + p \lambda }\).

Proposition 6

Suppose that Assumption 1 holds. Then, there are sufficiently small values of \(\epsilon , \delta ,\) and \(\delta ^{\prime }\) such that in the PPE of menu \(M^{\prime } = \{ b, b^{\prime }\}\), both types choose b and the corresponding expected profit exceeds that from the optimal pooling TPPE menu \(M = \{ b^* \}\).

A formal proof is presented in Section S.4 of the Supplementary Material. To understand this result, note first that, by Theorem 3 (since \(1+ \frac{1}{p}\le \overline{\lambda }_R\)), part (i) of Assumption 1 implies that the optimal TPPE menu is a pooling menu; also, by part (ii), (U) is implied by \(({\text {FIC}}_H)\) and \(({\text {FIC}}_L)\) and hence (U) captures the PPE requirement against \(R = \{\emptyset ,\emptyset \}\).

Next, consider our menu \(M^{\prime }\). Here, pooling on b violates condition (U) given in (27) but still satisfies the PPE requirement against \( \{\emptyset , \emptyset \}\). This is because the redundant bundle \(b^{\prime }\) is constructed such that the high type would deviate from the null bundle to choose \(b^{\prime }\). To provide such incentives and thereby break pooling on \(\emptyset \) as a PE, we need to ensure that the screening effect of loss aversion works in favor of separation. An important implication of Assumption 1 is therefore that p must be sufficiently low (recall from Sect. 3.1.3 that, for fixed \(\lambda \), the effectiveness of screening is decreasing in p).

Introducing a redundant bundle, however, generates new constraints: first, the consumer must be incentivized not to choose \(b^{\prime }\) over b, and second, the PPE requirements must be satisfied against new potential PEs involving \(b^{\prime }\). In terms of the latter, since three bundles (including the null bundle) are available to each of the two types, there are \(3^2 = 9\) possible choice profiles, and we need to check the 8 possible deviations from the desired pooling PE, while each deviation must be consistent with \(({\text {FIC}}_L)\), \(({\text {FIC}}_H)\) or (U). Parts (iii) and (iv) of Assumption 1 are invoked to handle these requirements.

Assumption 1 is satisfied by a non-trivial set of parameter values. For instance, in Fig. 3, we set \(\theta _H/\theta _L =1.5\) and depict those parameter values in the shaded region.

Fig. 3 PPE menu that yields a higher profit than the optimal TPPE menu
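The parametric restrictions in Assumption 1 can be checked numerically for given primitives. The following is a minimal sketch, not part of the formal analysis: the function names `phi` and `check_assumption_1` are hypothetical, and the lower bound \(\overline{\lambda }_S\) in part (i) is defined elsewhere in the paper, so only the upper bound \(\lambda < 1+\frac{1}{p}\) of part (i) is tested here.

```python
def phi(p, lam, theta_L, theta_H):
    """Phi as defined below (27): upper bound on t* per unit of v(q*)."""
    return (p * theta_L + (1 - p) * theta_H
            - p * (1 - p) * (lam - 1) * (theta_H - theta_L))


def check_assumption_1(p, lam, theta_L, theta_H):
    """Evaluate the parts of Assumption 1 that depend only on (p, lam, theta_L, theta_H).

    The lower bound in part (i) involves a threshold defined outside this
    section, so this sketch reports only the upper bound of part (i).
    """
    Phi = phi(p, lam, theta_L, theta_H)
    k = 1 - p  # probability of the high type

    part_i_upper = lam < 1 + 1 / p
    part_ii = theta_H * 2 / (lam + 1) < Phi < theta_L * (lam + 1) / 2
    # Part (iii) divides by Phi, so require Phi > 0 before evaluating it.
    part_iii = Phi > 0 and 1 / Phi > max(
        ((p * lam + 1) / 2) / theta_H,
        ((1 - k * (lam - 1)) / (1 + k * (lam - 1))) / theta_L,
    )
    part_iv = (theta_H * 2 / (lam + 1)
               < theta_L * (1 + p + k * lam) / (1 + k + p * lam))
    return {"i_upper": part_i_upper, "ii": part_ii,
            "iii": part_iii, "iv": part_iv}


if __name__ == "__main__":
    # Illustrative values with theta_H / theta_L = 1.5, as in Fig. 3.
    print(check_assumption_1(p=0.2, lam=3.0, theta_L=1.0, theta_H=1.5))
```

With \(\theta _H/\theta _L = 1.5\), the illustrative point \(p=0.2\), \(\lambda =3\) gives \(\varPhi = 1.24\) and passes the four checks coded above; whether it also exceeds \(\overline{\lambda }_S\) must be verified separately.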

We do not know the full extent of optimal contracting under general indirect menus. In the case of a pooling menu, it is relatively easy to break undesired PEs by introducing a redundant bundle, since pooling menus admit a relatively small number of potential deviations compared to screening or reverse-screening menus. Deriving the profit-maximizing menu among all indirect menus with an arbitrary number of redundant bundles involves an optimization problem with an intractable number of constraints. From the perspective of mechanism design theory, the analysis of the general optimal menu amounts to searching for the second-best mechanism over the entire mechanism space, direct or indirect, without the help of the revelation principle. To our knowledge, this question is yet to be tackled by the literature.

5 Alternative reference points

In this section, we discuss some alternative approaches to modeling gain–loss utilities in the price discrimination setup, together with their consequences.

5.1 Bundles as stochastic reference point

Our approach to modeling a stochastic reference point is that each type-\(\theta \) consumer compares the utility from his consumption, i.e., \(\theta v(q)\), with the utility that each hypothetical type \(\theta ^{\prime }\) would have derived from consuming his reference bundle, i.e., \(\theta ^{\prime } v(q^r(\theta ^{\prime }))\). Thus, the gain–loss term on the intrinsic utility component for each type \(\theta \) amounts to

$$\begin{aligned} \int _{\theta ^{\prime } \in \varTheta } \mu \left( \theta v(q) - \theta ^{\prime } v(q^r(\theta ^{\prime }))\right) {\text {d}}F(\theta ^{\prime }), \end{aligned}$$
(28)

where \(\mu \) is the loss aversion indicator function as defined in (2).

An alternative approach is to compare just the physical outcomes. This would mean rewriting (28) into

$$\begin{aligned} \theta \int _{\theta ^{\prime } \in \varTheta } \mu \left( v(q) - v(q^r(\theta ^{\prime }))\right) {\text {d}}F(\theta ^{\prime }), \end{aligned}$$
(29)

and (2) into

$$\begin{aligned} n(b; \theta , \theta ^{\prime }, R(\theta ^{\prime })):= & {} n(b; \theta , R(\theta ^{\prime }))\\= & {} \theta \mu \left( v(q) - v(q^r(\theta ^{\prime })) \right) + \mu \left( t^r(\theta ^{\prime })-t \right) . \end{aligned}$$

According to (29), each type-\(\theta \) consumer evaluates his consumption bundle against reference bundles with his own willingness to pay, ignoring a potential comparison against other possible selves that he could have been.Footnote 17 To further clarify the difference from (28), suppose that the reference bundle is identical for two distinct types, i.e., \(R(\theta ^{\prime }) = R(\theta ^{\prime \prime })\). In the alternative approach, the gain–loss utility is also treated identically; in contrast, we consider the case in which the gain–loss utilities would differ across the two distinct types. Our approach recognizes the fact that the same bundle could generate different consequences for different types.
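The difference between (28) and (29) can be illustrated numerically in the binary-type case. The sketch below is illustrative only: it assumes \(\mu (x)=x\) for gains and \(\mu (x)=\lambda x\) for losses, which matches the coefficients appearing in the incentive constraints of this section, and it uses \(v(q)=\sqrt{q}\) as a hypothetical valuation function. The function names are likewise hypothetical.

```python
import math


def mu(x, lam):
    """Gain-loss valuation: gains count one-for-one, losses are scaled by lam."""
    return x if x >= 0 else lam * x


def gain_loss_28(theta, q, q_ref, types, probs, lam, v=math.sqrt):
    """Quality-dimension term in (28): own utility theta*v(q) is compared with
    the utility theta'*v(q_ref[theta']) of each hypothetical type theta'."""
    return sum(pr * mu(theta * v(q) - th * v(q_ref[th]), lam)
               for th, pr in zip(types, probs))


def gain_loss_29(theta, q, q_ref, types, probs, lam, v=math.sqrt):
    """Quality-dimension term in (29): only physical outcomes are compared,
    and the difference is weighted by the consumer's own theta."""
    return theta * sum(pr * mu(v(q) - v(q_ref[th]), lam)
                       for th, pr in zip(types, probs))


if __name__ == "__main__":
    types, probs, lam = (1.0, 1.5), (0.4, 0.6), 2.0
    q_ref = {1.0: 4.0, 1.5: 4.0}  # identical reference quality for both types
    for theta in types:
        print(theta,
              gain_loss_28(theta, 4.0, q_ref, types, probs, lam),
              gain_loss_29(theta, 4.0, q_ref, types, probs, lam))
```

When both types share the same reference quality and the consumer receives exactly that quality, the term under (29) vanishes for every type, whereas under (28) the high type registers a gain and the low type a loss relative to the other hypothetical self.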

Beyond the conceptual difference discussed above, the two approaches also generate different results. In particular, with (29), reverse-screening can never be incentive feasible. The properties of screening and pooling menus nonetheless remain identical. The next result characterizes the optimal TPE menu with the alternative utility model in the binary-type case. A corresponding analysis for the continuum-type case is presented in Section S.5 of the Supplementary Material.

Proposition 7

Suppose that the buyer’s gain–loss utility is as given by (29). Also, suppose that \(\varTheta = \{\theta _L, \theta _H\}\). Then, the optimal menu that solves [P] is a pooling menu if and only if \(\lambda \ge \lambda _S\), where \(\lambda _S\) is as defined in Proposition 3.

Proof

Note first that the alternative gain–loss specification does not affect the (IR) constraints and hence the optimal pooling menu. Also, the (IC\(_H\)) constraint for screening is given by

$$\begin{aligned} u(r_{H}|\theta _{H},R)&= \theta _{H}v(q_{H})-t_{H}+p[\theta _{H}v(q_{H})-\theta _{H}v(q_{L})-\lambda (t_{H}-t_{L})] \\&\ge u(r_{L}|\theta _{H},R) = \theta _{H}v(q_{L})-t_{L}+p(\theta _{H}-\theta _{H})v(q_{L}) \\&\quad +(1-p)[(t_{H}-t_{L})-\lambda \theta _{H}(v(q_{H})-v(q_{L}))], \end{aligned}$$

which clearly leads to the same expression as (11). Therefore, Proposition 3 remains true.

Next, we show that reverse-screening cannot be a PE. Consider a reverse-screening menu with \(t_{L}>t_{H}\) and \(q_{L}>q_{H}.\) Then, \(({\text {IC}}_{H})\) is written as

$$\begin{aligned}&\theta _{H}v(q_{H})-t_{H}+p\left[ -\lambda \theta _{H}(v(q_{L})-v(q_{H}))+(t_{L}-t_{H})\right] \\&\quad \ge \theta _{H}v(q_{L})-t_{L}+(1-p)\left[ \theta _{H}(v(q_{L})-v(q_{H}))-\lambda (t_{L}-t_{H})\right] , \end{aligned}$$

which simplifies to

$$\begin{aligned} (t_{L}-t_{H})\left[ 1+p+(1-p)\lambda \right] \ge \theta _{H}(v(q_{L})-v(q_{H}))\left[ 1+(1-p)+p\lambda \right] . \end{aligned}$$
(30)

Analogously, \(({\text {IC}}_{L})\) is written as

$$\begin{aligned}&\theta _{L}v(q_{L})-t_{L}+(1-p)\left[ \theta _{L}(v(q_{L})-v(q_{H}))-\lambda (t_{L}-t_{H})\right] \\&\quad \ge \theta _{L}v(q_{H})-t_{H}+p\left[ -\lambda \theta _{L}(v(q_{L})-v(q_{H}))+(t_{L}-t_{H})\right] , \end{aligned}$$

which simplifies to

$$\begin{aligned} (t_{L}-t_{H})\left[ 1+p+(1-p)\lambda \right] \le \theta _{L}(v(q_{L})-v(q_{H}))\left[ 1+(1-p)+p\lambda \right] . \end{aligned}$$
(31)

Combining (30) and (31) yields

$$\begin{aligned} B(p,\lambda )\theta _{H}\le \frac{t_{L}-t_{H}}{v(q_{L})-v(q_{H})}\le B(p,\lambda )\theta _{L}, \end{aligned}$$
(32)

where \(B(p,\lambda )=\frac{1+(1-p)+p\lambda }{1+p+(1-p)\lambda }.\) Since \(\theta _{H}>\theta _{L}\) and \(B(p,\lambda )>0\), the two inequalities in (32) cannot hold simultaneously. This completes the proof. \(\square \)
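The impossibility of reverse-screening under (29) can also be checked by brute force. The following sketch evaluates \(({\text {IC}}_{H})\) and \(({\text {IC}}_{L})\) directly from the utility specification on a small grid of reverse-screening menus. It is an illustration rather than a substitute for the proof: the valuation \(v(q)=\sqrt{q}\), the grid, the parameter values, and all function names are hypothetical, and \(\mu \) is assumed to weight gains by one and losses by \(\lambda \).

```python
import math
from itertools import product


def mu(x, lam):
    """Gain-loss valuation: gains count one-for-one, losses are scaled by lam."""
    return x if x >= 0 else lam * x


def utility(bundle, theta, reference, p, lam, v=math.sqrt):
    """Utility of `bundle` for type theta under specification (29).

    reference = (b_L, b_H) lists the bundles expected for the low and high
    type, which are realized with probabilities p and 1 - p.
    """
    q, t = bundle
    total = theta * v(q) - t  # intrinsic utility
    for prob, (q_r, t_r) in zip((p, 1 - p), reference):
        total += prob * (theta * mu(v(q) - v(q_r), lam) + mu(t_r - t, lam))
    return total


def satisfies_both_ic(b_L, b_H, theta_L, theta_H, p, lam):
    """True if neither type gains by switching to the other type's bundle."""
    ref = (b_L, b_H)
    ic_H = utility(b_H, theta_H, ref, p, lam) >= utility(b_L, theta_H, ref, p, lam)
    ic_L = utility(b_L, theta_L, ref, p, lam) >= utility(b_H, theta_L, ref, p, lam)
    return ic_H and ic_L


def search_reverse_screening(theta_L, theta_H, p, lam):
    """Collect grid menus with q_L > q_H and t_L > t_H that satisfy both ICs."""
    feasible = []
    for q_H, dq, t_H, dt in product((0.5, 1.0, 2.0), (0.1, 0.5, 1.0, 3.0),
                                    (0.0, 0.5, 1.0), (0.01, 0.1, 0.3, 0.5, 1.0, 2.0)):
        b_L, b_H = (q_H + dq, t_H + dt), (q_H, t_H)
        if satisfies_both_ic(b_L, b_H, theta_L, theta_H, p, lam):
            feasible.append((b_L, b_H))
    return feasible


if __name__ == "__main__":
    print(search_reverse_screening(theta_L=1.0, theta_H=1.5, p=0.3, lam=2.0))
```

Consistent with (32), the search returns no menu: summing the two incentive constraints leaves a strictly negative term proportional to \((\theta _{H}-\theta _{L})(v(q_{L})-v(q_{H}))\).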

A key modeling choice that facilitates the KR approach in our setup is that the buyer and seller have symmetric information when the seller designs and offers the menu, but the buyer later learns some additional payoff-relevant private information. As observed in Sect. 3.1.2, this incomplete information is critical to our results. After receiving new information, the buyer evaluates his consumption not only by its intrinsic utility but also by comparing it with the utility or outcome previously anticipated for every other possible contingency. In particular, the buyer’s ex post preference is affected by the average gain–loss utility with respect to the (commonly known) prior distribution.

Using the prior to evaluate gain–loss comparisons offers a convenient way of modeling expectation-based reference-dependent utility. An interesting direction for future research would, however, be to consider alternative approaches to incorporating gain–loss comparisons across multiple types.Footnote 18 Such a model would still be consistent with KR’s rational expectations framework, which attempts to endogenize the reference point: the buyer would form a contingent consumption plan before learning his private information, and this plan would have to be optimal for each realized type under the alternative utility model.

5.2 Average bundle

An important motivation for adopting the KR model of reference-dependent preferences arose from recognizing the role of expectations. While in the KR model the reference point is stochastic and equals the distribution of expected outcomes, the models of disappointment aversion (Bell 1985; Loomes and Sugden 1986) formulate the reference point as fixed, and in particular, as the expected utility certainty equivalent of a gamble. A similar approach in our price discrimination setup would be to take the expected utility of the contingent bundles as the reference point.Footnote 19

Formally, with binary types and menu \(\left\{ b_{L},b_{H}\right\} \), consider type-\(\theta \) buyer’s gain–loss utility from bundle \(b=(q,t)\) to be

$$\begin{aligned} \mu \left[ \theta v(q)-\left( p\theta _{L}v(q_{L})+(1-p)\theta _{H}v(q_{H})\right) \right] +\mu \left[ \left( pt_{L}+(1-p)t_{H}\right) -t \right] . \end{aligned}$$
(33)

In Section S.5 of the Supplementary Material, we solve for the optimal menu under this alternative specification of reference-dependent preferences. It turns out that this analysis is very close to that of optimal TPE menus in Sect. 3. Whenever a pooling menu maximizes the firm’s profit under TPE, it does so here as well.
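To make specification (33) concrete, the sketch below evaluates it for a given binary-type menu. It is only an illustration: \(v(q)=\sqrt{q}\) and the numerical values are hypothetical, the function names are not from the paper, and \(\mu \) is assumed to weight gains by one and losses by \(\lambda \).

```python
import math


def mu(x, lam):
    """Gain-loss valuation: gains count one-for-one, losses are scaled by lam."""
    return x if x >= 0 else lam * x


def gain_loss_average(theta, bundle, b_L, b_H, theta_L, theta_H, p, lam,
                      v=math.sqrt):
    """Gain-loss utility (33): the reference point is the expected utility and
    the expected transfer of the contingent bundles b_L and b_H."""
    q, t = bundle
    ref_utility = p * theta_L * v(b_L[0]) + (1 - p) * theta_H * v(b_H[0])
    ref_transfer = p * b_L[1] + (1 - p) * b_H[1]
    return mu(theta * v(q) - ref_utility, lam) + mu(ref_transfer - t, lam)


if __name__ == "__main__":
    b_L, b_H = (1.0, 0.5), (4.0, 2.0)  # hypothetical menu (q, t)
    args = dict(b_L=b_L, b_H=b_H, theta_L=1.0, theta_H=1.5, p=0.5, lam=2.0)
    print(gain_loss_average(1.5, b_H, **args))  # high type choosing b_H
    print(gain_loss_average(1.0, b_L, **args))  # low type choosing b_L
```

In contrast to the stochastic reference point, each type here faces a single comparison per dimension, so a type can experience a loss in one dimension and a gain in the other even when it follows its planned choice.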

5.3 Additive separability

Our formulation of gain–loss utilities treats the quality and money dimensions in an additively separable form. This is consistent with the endowment effect observed in many empirical studies. An alternative formulation would be to apply the gain–loss utility to the total utility, \(\theta v(q)-t\). It turns out that the predictions of our model under such a gain–loss specification are no different from those of the model with standard preferences. See Section S.5 of the Supplementary Material.

6 Conclusion

We often find sellers offering menus with just a small number of bundles. This paper demonstrates that such observations are consistent with profit-maximizing firms that face loss averse consumers. We show that, in the binary-type case, a pooling menu is the seller’s optimal menu for a range of values of the loss aversion parameter if the low willingness-to-pay consumers are sufficiently abundant. This result arises as a consequence of the interplay between loss aversion and asymmetric information. The benefits from screening with multiple bundles become even more restricted when the consumer is capable of choosing the personal equilibrium that generates the highest ex ante payoff. We also identify conditions under which partial or even full pooling dominates screening for the seller facing a continuum of consumer types.

The optimal menus described in our analysis above have the feature that the buyer’s ex ante expected utility (including anticipated gain–loss) often falls below zero. This can be problematic for the seller if the consumer can calculate the ex ante loss and find some commitment device to stay away from the menu altogether. In our previous working paper Hahn et al. (2012), we showed that introducing an additional ex ante participation constraint to the analysis (requiring the buyer’s ex ante expected utility to be nonnegative) does not alter our central message. In fact, the loss averse consumer’s ex ante insurance motives can induce the profit-maximizing firm to offer pooling menus under a wider range of parameters.

The same conclusion also holds in an alternative model of ex ante contracting where the buyer’s participation decision is made before his type is realized. That is, the buyer, when deciding whether to accept the menu offered by the seller, is uncertain about his willingness to pay. Analyzing the optimal PPE menu in this alternative model reveals that the pooling menu is optimal for a larger set of parameters under the ex ante participation constraint than under the ex post one. Again, the buyer’s insurance motives reinforce the benefits of pooling.Footnote 20 These additional results, together with those reported in Sect. 5 for additively separable gain–loss utilities, demonstrate that the optimality of pooling is a general phenomenon with loss averse consumers, valid under different decision-making scenarios and time lines.

Our theory offers potential explanations for why some sellers fail to fully realize the benefits from further price discrimination in industries that seem to have low fixed costs of adding another product variant. For example, seats in existing entertainment venues provide different views and the cost of offering multiple seating categories is essentially zero. But the practice of price discrimination in this industry, sometimes known as “scaling the house,” displays wide variations both within and across markets as well as across time (see the survey of Courty 2000). In particular, many ticket sellers indeed choose to offer uniform pricing or very few seating categories.Footnote 21 In a study of another industry with potentially low fixed product costs, Crawford and Shum (2007) report that 70% of over 1000 US cable TV providers in their sample year of 1995 offered only a single package of channels and estimate substantial unrealized returns from price discrimination.Footnote 22

Our prediction that price discrimination would be more likely under certain market conditions is consistent with further evidence. In contrast to pop concerts, high-brow entertainment events, such as classical concerts, usually offer many seating categories (e.g., Huntington 1993). In their cross-sectional study of cable TV providers, Crawford and Shum (2007) report evidence that markets offering more cable packages tend to be “populated by households with greater tastes for cable service quality” (Crawford and Shum 2007, p. 201). Our results can also shed light on observed pricing practices in other industries. For example, buses and motels usually offer a single type of seat or room, and this contrasts with the standard features of trains and hotels that frequently serve upscale travelers.

While we take the uncertainty to affect willingness to pay directly, variations in willingness to pay may arise from other sources, for example, income shocks. In such a case, however, the buyer should also realize gain–loss utility in that uncertain monetary dimension. Also, our model suggests that, contrary to common observations, reverse-screening can be optimal if the consumer is significantly loss averse (at least with only a few consumer types). Interestingly, Ayres (1995) and Ayres and Siegelman (1995) found a case of car dealers who offered substantially lower prices to white consumers than to nonwhite consumers. Given the high willingness to pay estimated for white buyers, these authors suggested racial bias behind the observed practice. In a recent paper, however, Bang et al. (2014) provide a rational justification of such “reverse price discrimination.” Although these accounts are concerned with third-degree price discrimination, they suggest that reverse-screening may not be a mere theoretical possibility.