Probability weighting and insurance demand in a unified framework

We provide a comprehensive analysis of the impact of probability weighting on optimal insurance demand in a unified framework. We identify decreasing relative overweighting as a new local condition on the probability weighting function that is useful for comparative static analysis. We discuss the effects of probability weighting on coinsurance, deductible choice, insurance demand for low-probability, high-impact risks versus high-probability, low-impact risks, and insurance demand in the presence of nonperformance risk. Probability weighting can make better or worse predictions than expected utility depending on the insurance demand problem at hand.


Introduction
Insurance choices are important financial decisions for households and can have a significant impact on their welfare (Bhargava et al. 2017). Researchers have aspired to find a valid descriptive model of insurance choice under risk for decades to be able to predict individual behavior and conduct policy analysis. In the expected utility (EU) model, the curvature of the utility function measures the individual's degree of risk aversion (Pratt 1964), which then drives optimal insurance demand (Mossin 1968). From a descriptive standpoint, utility curvature alone is often too rigid to explain insurance choices in the field and may require implausibly high levels of risk aversion (e.g., Sydnor 2010). Incorporating probability weighting into the decision model is a common approach to address these shortcomings (e.g., Barseghyan et al. 2013; Hansen et al. 2016; Wakker et al. 1997).
In this paper, we provide a comprehensive analysis of the impact of probability weighting on optimal insurance demand in a unified framework. Our model allows for clear comparisons to the predictions obtained under the standard EU model. We explore a number of common insurance demand problems. Specifically, we consider the effects of probability weighting on coinsurance, deductible choice, insurance demand for low-probability, high-impact (LPHI) risks versus high-probability, low-impact (HPLI) risks, and insurance demand in the presence of nonperformance risk. We are the first to formalize these problems of optimal insurance demand in one efficient model. We thus provide a comprehensive assessment of the merits and limitations of probability weighting when it comes to explaining insurance demand.
Three empirical observations motivate our study. First, people tend to "overinsure" modest risks (e.g., Sydnor 2010). Probability weighting is a potential explanation. Typical probability weighting functions imply higher insurance demand than EU when considering coinsurance in the binary loss model and for deductible choice. The reason is a substitution effect between overweighting of the loss probability and utility curvature. Second, studies have documented less demand for insurance covering LPHI risks than for insurance covering HPLI risks, which is at odds with EU predictions (e.g., Browne et al. 2015; Slovic et al. 1977). We show that, under mild conditions, probability weighting makes the same prediction as EU and is therefore not able to explain underinsurance of LPHI risks. Third, individuals are sensitive to contract nonperformance risk and reduce their insurance demand by more than EU predicts (e.g., Wakker et al. 1997; Zimmer et al. 2018). While probability weighting may appear as a promising solution, it actually implies higher insurance demand than EU under reasonable assumptions. So if anything, it exacerbates the puzzle. Together, these results reveal that the descriptive appeal of probability weighting is limited to overinsurance puzzles and does not extend to underinsurance puzzles.
We conduct most of our analysis in the simplest model of insurance demand with a binary loss risk. This simplification makes the model tractable and allows us to derive rich comparative statics without having to assume a particular functional form of the probability weighting function. We extend our results for coinsurance to optimal deductible choice. This not only shows that our results do not depend on the binary loss assumption, but also covers an important range of applications on insurance markets. We introduce decreasing relative overweighting (DRO) as a local property of the probability weighting function and collect all probabilities where it holds in the DRO region. In this region, small probabilities are overweighted more than large probabilities relative to their baseline value. We relate DRO to other properties of the probability weighting function. 1 DRO regions are large theoretically and empirically, so most loss probabilities relevant in insurance demand fall well within the region. 2 Given the usefulness of DRO for the comparative statics of insurance demand, this property may be helpful in other applications as well. In a final step, we use experimentally calibrated preferences to provide numerical illustrations of our results.
We are not the first to look at optimal insurance demand outside of the EU model. Machina (1995) analyzes the robustness of several classical results in insurance theory under non-expected utility preferences (see also Schlesinger 1997). Doherty and Eeckhoudt (1995) examine optimal insurance demand under Yaari's (1987) dual theory. Yet others have looked at insurance demand through the lens of regret aversion (Braun and Muermann 2004), disappointment aversion (Huang et al. 2012), ambiguity aversion (Alary et al. 2013; Snow 2011), and prospect theory (Schmidt 2016). We focus on Quiggin's (1982) rank-dependent utility (RDU) to study the role of probability weighting in insurance demand. We explicitly allow the probability weighting function to take the commonly observed inverse S-shape (e.g., Abdellaoui et al. 2011) and abstain from parametric assumptions, which are abundant in empirical work (e.g., Barseghyan et al. 2013; Hansen et al. 2016; Harrison and Ng 2018). While some results exist (see Schmidt 1998), we are the first to bring together a whole range of insurance demand problems, including LPHI versus HPLI risks and nonperformance risk. Analyzing these issues in one unified framework allows us to take a broader perspective on the effects of probability weighting.
The paper proceeds as follows. The next section introduces the baseline model and defines DRO. In Section 3, we characterize optimal insurance demand under probability weighting, derive the substitution between overweighting and utility curvature, which resolves the overinsurance puzzle for modest risks, and extend our results to deductible insurance. In Section 4, we present situations where underinsurance has been observed. We analyze how insurance demand differs for LPHI versus HPLI risks, and study the demand effects of nonperformance risk under probability weighting. Section 5 offers numerical illustrations based on experimentally calibrated preferences. Section 6 presents a more in-depth discussion of the literature and explains how our results offer new insights. The last section concludes.
1 In many cases, the DRO region takes the form of $(0, \bar{p})$ with $\bar{p} < 1$. This is weaker than requiring star-shapedness at 1 of the dual to the probability weighting function (see Ryan 2006) but stronger than Tversky and Wakker's (1995) lower subadditivity property.
2 For typical inverse S-shaped probability weighting functions, the DRO region includes all probabilities below a threshold of at least 75%. Our focus on loss probabilities in the DRO region is thus less restrictive than overweighting, because only probabilities below 40% or so are overweighted in most cases.
Quiggin's (1982) rank-dependent utility (RDU) allows us to isolate the effect of probability weighting on optimal insurance demand. We start with a simple two-state model with a binary loss. We will extend our results to deductible choice in Section 3.3 and consider three states of the world in Section 4.2 to accommodate nonperformance risk.

Model setup
Let $x = (E_1, x_1; \ldots; E_n, x_n)$ be an ordered prospect with outcomes $x_1 \geq \ldots \geq x_n$ for a partition $\{E_i\}_i$ of the state space into events. Under RDU, ordered prospects are evaluated according to
$$RDU(x) = \sum_{i=1}^{n} \pi_i \, u(x_i), \tag{1}$$
where $u$ denotes an increasing and concave utility function, and $\pi_i$ is the decision weight for event $E_i$. Let $\mathbb{P}$ be the probability distribution and $w$ the individual's probability weighting function. Insurance covers losses and losses are bad news, so decision weights are defined by
$$\pi_i = w\Big(\mathbb{P}\Big(\bigcup_{j=i}^{n} E_j\Big)\Big) - w\Big(\mathbb{P}\Big(\bigcup_{j=i+1}^{n} E_j\Big)\Big), \tag{2}$$
with the convention that $\bigcup_{j=n+1}^{n} E_j = \emptyset$. 3 To avoid violations of first-order stochastic dominance, we assume that $w$ is increasing with $w(0) = 0$ and $w(1) = 1$ (see Quiggin 1982; Tversky and Kahneman 1992). EU is a special case of (2) for $w(p) = p$, because then $\pi_i = \mathbb{P}(E_i)$. This enables us to isolate the effect of probability weighting on insurance decisions.
To model insurance demand, we assume the individual has initial wealth $x$ and is subject to a potential loss of $L < x$, which occurs with probability $p$. The individual chooses a level of insurance coverage to protect himself against the risk of loss. We denote the coinsurance rate by $\alpha$ and make the common assumption that $0 \leq \alpha \leq 1$. The insurer charges a loading $m$ on top of the expected cost of insurance, so the premium is $\alpha m p L$. Insurance is called actuarially fair for $m = 1$. If $m > 1$, the contract is actuarially unfair or loaded, and if $m < 1$, it is actuarially favorable or subsidized. 4 The no-loss state results in final wealth of $x_1 = x - \alpha m p L$, the loss state results in final wealth of $x_2 = x - \alpha m p L - (1 - \alpha) L$.
3 If only gains were involved, decision weights would be given by $\pi_i = w\big(\mathbb{P}\big(\bigcup_{j=1}^{i} E_j\big)\big) - w\big(\mathbb{P}\big(\bigcup_{j=1}^{i-1} E_j\big)\big)$, see Sarin and Wakker (1998). Insurance decisions have rarely been interpreted in the gain domain. Exceptions are Schmidt's (2016) analysis based on third-generation prospect theory (see Schmidt et al. 2008) and Köszegi and Rabin's (2007) choice-acclimated personal equilibrium.
4 We assume $m < 1/p$ because otherwise purchasing any amount of insurance would be state-wise dominated by remaining uninsured. We will tighten the upper bound on $m$ later in the analysis.
Because $\alpha \leq 1$, final wealth in the no-loss state is never less than final wealth in the loss state. According to (2), the decision weights are then given by $\pi_1 = 1 - w(p)$ and $\pi_2 = w(p)$. The individual's optimal insurance demand $\alpha_W^*$ maximizes the following objective function:
$$W(\alpha) = (1 - w(p)) \, u(x - \alpha m p L) + w(p) \, u(x - \alpha m p L - (1 - \alpha) L). \tag{3}$$
The notation $W$ indicates the presence of probability weighting. EU is nested in (3) for $w(p) = p$ with the following objective function:
$$U(\alpha) = (1 - p) \, u(x - \alpha m p L) + p \, u(x - \alpha m p L - (1 - \alpha) L). \tag{4}$$
The notation $U$ is short for expected utility. We can rewrite the probability weighting function as $w(p) = p + \Delta(p)$, where $\Delta(p)$ denotes the absolute amount of overweighting. It measures by how many percentage points a given probability is overweighted. 5 We can then compare the two objective functions as follows:
$$W(\alpha) = U(\alpha) - \Delta(p) \, [u(x_1) - u(x_2)]. \tag{5}$$
As long as insurance is partial ($\alpha < 1$), probability weighting reduces the individual's perceived welfare if and only if the loss probability is overweighted.
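The relation between the RDU and EU objective functions can be checked numerically. The following Python sketch is only an illustration under stated assumptions: logarithmic utility, a coinsurance rate of 0.6, and the parameter values are hypothetical choices of ours, and the function names are not part of the original analysis. The variable `w_p` stands for the weighted loss probability w(p).

```python
import math

def final_wealth(alpha, x, L, p, m):
    """Final wealth in the no-loss and the loss state for coinsurance rate alpha."""
    premium = alpha * m * p * L
    x1 = x - premium                      # no-loss state
    x2 = x - premium - (1.0 - alpha) * L  # loss state: retained loss is (1 - alpha) * L
    return x1, x2

def rdu_objective(alpha, w_p, u, x, L, p, m):
    """RDU objective: weight 1 - w(p) on the no-loss state, w(p) on the loss state."""
    x1, x2 = final_wealth(alpha, x, L, p, m)
    return (1.0 - w_p) * u(x1) + w_p * u(x2)

def eu_objective(alpha, u, x, L, p, m):
    """EU objective: the special case w(p) = p."""
    return rdu_objective(alpha, p, u, x, L, p, m)

# Hypothetical illustration values (assumptions, not taken from the paper).
x, L, p, m = 10_000.0, 1_000.0, 0.05, 1.2
w_p = 0.10        # loss probability overweighted by 5 percentage points
u = math.log      # concave utility, assumed for illustration
alpha = 0.6       # partial coverage

x1, x2 = final_wealth(alpha, x, L, p, m)
W = rdu_objective(alpha, w_p, u, x, L, p, m)
U = eu_objective(alpha, u, x, L, p, m)
gap = (w_p - p) * (u(x1) - u(x2))  # absolute overweighting times the utility gap

print(W, U - gap)  # the two numbers coincide up to rounding
```

With partial coverage and an overweighted loss probability, `W` lies below `U`; at full coverage (`alpha = 1`) the two states yield the same wealth and the objectives coincide.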

Decreasing relative overweighting (DRO)
We also make use of the relative amount of overweighting, which we define as follows:
$$\delta(p) = \frac{\Delta(p)}{p} = \frac{w(p) - p}{p}. \tag{6}$$
It relates the absolute amount of overweighting to the value of the loss probability. For example, if $p = 0.05$ and $p = 0.10$ are both overweighted with $w(0.05) = 0.10$ and $w(0.10) = 0.15$, the absolute amount of overweighting is identical for the two probabilities, $\Delta(0.05) = \Delta(0.10) = 0.05$. In relative terms, however, 0.05 is overweighted by more because $\delta(0.05) = 1 > 0.5 = \delta(0.10)$, or equivalently $w(0.05)/0.05 = 2 > 1.5 = w(0.10)/0.10$.
Descriptive decision theory has primarily found inverse S-shaped probability weighting functions (e.g., Abdellaoui et al. 2011; Stott 2006; Wu and Gonzalez 1996). Functions with this shape overweight small and underweight large probabilities with a unique interior fixed point where $w(p^*) = p^*$. They are concave up to an inflection point $\tilde{p}$ and convex afterward. Many functional forms have been proposed to accommodate an inverse S-shape including those in Goldstein and Einhorn (1987), Tversky and Kahneman (1992), Wu and Gonzalez (1996), Prelec (1998), and Wakker (2010). We provide an overview in Appendix A.1 and discuss some of their properties. Panel (a) in Fig. 1 shows an example of inverse S-shaped probability weighting based on the Goldstein and Einhorn functional form with parameters $r = 0.5$ and $s = 0.7$. Panels (b) and (c) depict the associated absolute and relative amount of overweighting. This motivates the following definition.
Definition 1 A probability weighting function has decreasing relative overweighting (DRO) at probability $p$ if there is an open neighborhood of $p$ where $\delta$ is decreasing. The DRO region is the collection of all probabilities where the probability weighting function satisfies DRO.
In the DRO region, smaller probabilities are more overweighted than larger ones relative to their baseline value, just as in the example after Equation (6). If $w$ is differentiable, the DRO region contains all probabilities where $\delta'(p) < 0$. 6 So while overweighting can be characterized as a positive relative amount of overweighting, a condition on the level of $\delta$, DRO is a condition on its slope. Panel (c) of Fig. 1 shows that the DRO region covers all overweighted probabilities but extends to 0.79, far beyond the fixed point and the inflection point, including many probabilities that are underweighted. Many probability weighting functions have large DRO regions, which contain the vast majority of probabilities relevant in insurance demand (see Appendix A.2). Our focus on loss probabilities in the DRO region is thus innocuous.
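The extent of the DRO region in this example can be traced numerically. The sketch below assumes the usual two-parameter Goldstein and Einhorn form, w(p) = s p^r / (s p^r + (1 - p)^r), with r = 0.5 and s = 0.7; the exact parametrization is our assumption, and the function names and the grid search are illustrative only.

```python
def goldstein_einhorn(p, r=0.5, s=0.7):
    """Assumed functional form: w(p) = s p^r / (s p^r + (1 - p)^r)."""
    num = s * p ** r
    return num / (num + (1.0 - p) ** r)

def relative_overweighting(p, w=goldstein_einhorn):
    """Relative amount of overweighting, (w(p) - p) / p."""
    return (w(p) - p) / p

def dro_upper_bound(w=goldstein_einhorn, n=100_000):
    """First grid point at which relative overweighting stops decreasing."""
    prev = relative_overweighting(1.0 / n, w)
    for i in range(2, n):
        q = i / n
        cur = relative_overweighting(q, w)
        if cur >= prev:   # the decrease ends here
            return q
        prev = cur
    return 1.0            # decreasing on the whole grid

# Interior fixed point w(p) = p for r = 0.5: p / (1 - p) = s^2.
fixed_point = 0.7 ** 2 / (1.0 + 0.7 ** 2)
p_bar = dro_upper_bound()

print(round(fixed_point, 3), round(p_bar, 3))
```

Under these assumptions the search ends near 0.79, well above the fixed point of roughly 0.33, which matches the description of panel (c).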
DRO can be related to other properties of the probability weighting function. Ryan (2006) shows that, with a concave utility function, RDU preferences are Jewitt (1989) risk-averse if and only if $w(p)/p$ is non-increasing on $(0, 1]$. This is equivalent to requiring that the weak version of DRO holds globally, in which case the dual to the probability weighting function is star-shaped at 1 (see Chateauneuf et al. 2004). 7 In our analysis, we do not require DRO to hold globally. Doing so would exclude empirically relevant shapes of the probability weighting function such as an inverse S-shape because the DRO region is $(0, \bar{p})$ in this case with $\bar{p} < 1$, see Proposition 7(i). DRO regions of this form imply that Tversky and Wakker's (1995) lower subadditivity condition holds for all probabilities up to $\bar{p}$.

Insurance demand under probability weighting and under EU
We first consider how optimal insurance demand depends on the loading. All formal derivations and proofs are provided in Appendix B. We obtain an upper bound $\bar{m}_W$ and a lower bound $\underline{m}_W$ on the loading. For loadings above the upper bound, the optimal level of insurance coverage is zero ($\alpha_W^* = 0$ if $m \geq \bar{m}_W$). Therefore, we call $\bar{m}_W$ the no-insurance bound. For loadings below the lower bound, full insurance coverage is optimal ($\alpha_W^* = 1$ if $m \leq \underline{m}_W$). We call $\underline{m}_W$ the full-insurance bound. Partial insurance is optimal for loadings between the two bounds ($0 < \alpha_W^* < 1$ for $\underline{m}_W < m < \bar{m}_W$). The two bounds are given by
$$\underline{m}_W = \frac{w(p)}{p} \quad \text{and} \quad \bar{m}_W = \frac{w(p)}{p} \cdot \frac{u'(x - L)}{(1 - w(p)) \, u'(x) + w(p) \, u'(x - L)}. \tag{7}$$
The corresponding EU bounds arise as special cases of (7) for $w(p) = p$. We denote them by $\underline{m}_U$ and $\bar{m}_U$, and let the optimal level of coverage under EU be $\alpha_U^*$. According to Mossin (1968), full insurance is optimal when the price of insurance is actuarially fair ($\alpha_U^* = 1$ if $m = 1$) and partial insurance is optimal when the loading exceeds unity ($\alpha_U^* < 1$ if $m > 1$). Since the objective function in (3) nests EU, we can compare optimal insurance demand across RDU and EU.
7 Ghossoub and He (2020) provide an overview of notions of (comparative) risk aversion for RDU preferences and their characterization. If the probability weighting function is star-shaped at 0, the DRO region is empty because star-shapedness is equivalent to $w(p)/p$ being non-decreasing on $(0, 1]$.
Proposition 1 For a given utility function, overweighting (underweighting) of the loss probability:
(i) increases (decreases) the full-insurance bound and the no-insurance bound, that is, $\underline{m}_W > \underline{m}_U$ and $\bar{m}_W > \bar{m}_U$ ($\underline{m}_W < \underline{m}_U$ and $\bar{m}_W < \bar{m}_U$);
(ii) increases (decreases) the level of insurance demand for loadings between $\min(\underline{m}_W, \underline{m}_U)$ and $\max(\bar{m}_W, \bar{m}_U)$.
Furthermore, a higher loss probability:
(iii) decreases the ratio $\underline{m}_W / \underline{m}_U$ for loss probabilities in the DRO region;
(iv) decreases the ratio $\bar{m}_W / \bar{m}_U$ for overweighted loss probabilities in the DRO region.
Results 1(i) and 1(ii) state that overweighting of the loss probability rationalizes higher insurance demand than predicted by EU. Results 1(iii) and 1(iv) say that, in the DRO region, probability weighting drives a larger wedge between the demand thresholds the lower the loss probability. It may seem obvious that overweighting of the loss probability raises insurance demand, but there are actually two conflicting effects at work. Overweighting of the loss probability raises both the marginal cost and the marginal benefit of insurance relative to EU. Overweighting puts more weight on the loss state where marginal utility is high, and decreases the weight on the no-loss state where marginal utility is low. This raises the marginal cost of insurance because paying an additional dollar in premium is more costly in the loss state than the no-loss state. At the same time, it raises the marginal benefit of insurance because the indemnity is only received when the loss occurs. The positive net effect derives from Equation (5). Insurance transfers final wealth from the no-loss state to the loss state, which shrinks the marginal utility gap across states. Therefore, probability weighting makes the individual appreciate insurance more if and only if the loss probability is overweighted. The increase in the marginal benefit then dominates the increase in the marginal cost of insurance, and demand for insurance increases.
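Results 1(i) and 1(ii) can be illustrated with a small numerical sketch. It assumes exponential (CARA) utility, for which the marginal utility ratio u'(x)/u'(x - L) equals exp(-aL) and the first-order condition of the coinsurance problem has a closed-form solution. The absolute risk aversion of 0.00063 and the loss of $1,000 are values quoted in the paper's illustration; the 3 percentage points of overweighting, the loading of 1.5, and all function names are hypothetical choices of ours.

```python
import math

def bounds(w_p, p, a, L):
    """Full-insurance and no-insurance loading bounds under CARA utility.

    Uses u'(x) / u'(x - L) = exp(-a * L), which holds for u(c) = -exp(-a * c).
    """
    m_full = w_p / p
    m_none = m_full / ((1.0 - w_p) * math.exp(-a * L) + w_p)
    return m_full, m_none

def optimal_coverage(w_p, p, m, a, L):
    """Coinsurance rate solving the first-order condition, clipped to [0, 1].

    Interior condition: m p (1 - w) u'(x1) = w (1 - m p) u'(x2), and under CARA
    u'(x2) / u'(x1) = exp(a * (1 - alpha) * L).
    """
    ratio = m * p * (1.0 - w_p) / (w_p * (1.0 - m * p))
    alpha = 1.0 - math.log(ratio) / (a * L)
    return min(1.0, max(0.0, alpha))

a, L, p = 0.00063, 1_000.0, 0.069
w_p = p + 0.03                      # hypothetical overweighting of 3 percentage points

m_full_W, m_none_W = bounds(w_p, p, a, L)
m_full_U, m_none_U = bounds(p, p, a, L)   # EU: w(p) = p

m = 1.5                             # hypothetical loading between the bounds
alpha_W = optimal_coverage(w_p, p, m, a, L)
alpha_U = optimal_coverage(p, p, m, a, L)

print(m_full_U, m_none_U, m_full_W, m_none_W)
print(alpha_U, alpha_W)
```

In this example both bounds are higher under overweighting, and at the common loading the RDU individual chooses more coverage than the EU individual with the same utility function.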
Proposition 1 is similar to Schmidt's (1998) Proposition 3.1, but there are important differences. He allows for overinsurance (i.e., $\alpha > 1$), which we exclude by assumption, and identifies the range of loadings for which full insurance is optimal. In our case, this range is $(0, \underline{m}_W]$ and contains loadings where individuals would purchase more than full insurance if they had the chance to do so. Schmidt (1998) focuses on a strictly concave probability weighting function, which rules out the inverse S-shape. Our result does not require any assumptions about the curvature of the probability weighting function. We also provide insights into the extensive margin by looking at the loading factor that chokes off insurance demand along with some comparative statics.
EU has often been criticized for its prediction of partial coverage at actuarially unfair premiums, as it tends to conflict with choices observed in the laboratory and the field (e.g., Jaspersen et al. 2021; Shapira and Venezia 2008; Sydnor 2010). Probability weighting allows us to explain the evidence. According to result 1(i), overweighting of the loss probability implies a full-insurance bound above unity so that full insurance can be optimal even when premiums are actuarially unfair. Results 1(i) and 1(ii) extend to the intensive margin: $\underline{m}_W$, $\bar{m}_W$, and $\alpha_W^*$ are increasing in the absolute amount of overweighting. So the more the loss probability is overweighted, the larger the range of loadings above one where full insurance is in demand. Furthermore, the ratio $\bar{m}_W / \underline{m}_W$ is decreasing in $p$ because the no-insurance bound changes at a faster rate than the full-insurance bound. So for loss probabilities in the DRO region, $\underline{m}_W$, $\bar{m}_W$, and $\bar{m}_W - \underline{m}_W$ are all decreasing in $p$. At lower loss probabilities, the range of loadings where insurance is in demand is wider and individuals are willing to purchase insurance at higher relative prices.
Figure 2 illustrates these results. We use the estimates in Barseghyan et al. (2013), who analyze choices for three different insurance products: automobile comprehensive, automobile collision, and homeowners insurance. They find average loss probabilities of 2.1%, 6.9% and 8.4%, and estimate a single coefficient of absolute risk aversion and a product-specific absolute amount of overweighting from the observed insurance choices. For our illustration, we use their estimates and set the loss to $1,000, which roughly corresponds to the deductible choice that the individuals in Barseghyan et al.'s (2013) study faced. Following their analysis, we use an exponential utility function, so no assumption on the individual's wealth is necessary. 8
In Figure 2, we plot the full-insurance bound and the no-insurance bound as a function of the absolute amount of overweighting. The gray area corresponds to RDU and the black area to EU. In all three panels, $(\underline{m}_W, \bar{m}_W)$ is above $(\underline{m}_U, \bar{m}_U)$ when $\Delta(p)$ is positive, is equal to $(\underline{m}_U, \bar{m}_U)$ when $\Delta(p)$ is zero, and is below $(\underline{m}_U, \bar{m}_U)$ when $\Delta(p)$ is negative, consistent with result 1(i). Since the bounds under probability weighting are increasing in the absolute amount of overweighting, the gray areas fan out. Comparing the panels from right to left, the gray areas become "steeper" and wider as $p$ decreases. This illustrates the effect of the loss probability on the RDU bounds. The black EU areas do not change much at all. In fact, $\underline{m}_U$ is constant at 1 across panels and $\bar{m}_U$ increases slightly when going from right to left. This is consistent with results 1(iii) and 1(iv) because both the no-insurance bound and the full-insurance bound are more sensitive to changes in the loss probability under RDU than under EU.
All three panels show a sizable effect of probability weighting even for modest overweighting. The dashed vertical lines in Fig. 2 represent the absolute amounts of overweighting estimated in Barseghyan et al. (2013). At these levels, the reservation price for insurance can be one and a half to four times as high as under EU. Particularly, when the loss probability is small, an individual who overweights the loss probability will likely still buy full insurance at loadings where his EU twin would not buy any insurance at all. This happens for loadings anywhere above the black area and below the gray area. Hence, probability weighting can help rationalize insurance choices when individuals exhibit higher demand than predicted by EU.
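How much overweighting is needed before an RDU individual buys full insurance at loadings where the EU individual buys none can be gauged with a short calculation. The sketch below is our own illustration, not the authors' computation: it assumes exponential utility with the absolute risk aversion of 0.00063 and the loss of $1,000 used for Figure 2, and it uses the fact that such loadings exist once the full-insurance bound under probability weighting, w(p)/p, reaches the EU no-insurance bound. Function and variable names are hypothetical.

```python
import math

A, LOSS = 0.00063, 1_000.0   # CARA coefficient and loss size used in the illustration

def eu_no_insurance_bound(p, a=A, L=LOSS):
    """EU no-insurance bound under CARA, where u'(x) / u'(x - L) = exp(-a * L)."""
    return 1.0 / ((1.0 - p) * math.exp(-a * L) + p)

def min_overweighting_for_gap(p, a=A, L=LOSS):
    """Smallest absolute overweighting with w(p) / p >= EU no-insurance bound."""
    return p * (eu_no_insurance_bound(p, a, L) - 1.0)

loss_probs = [0.021, 0.069, 0.084]   # average loss probabilities of the three products
thresholds = {p: min_overweighting_for_gap(p) for p in loss_probs}

for p, d in thresholds.items():
    print(f"p = {p:.3f}: overweighting of at least {100 * d:.2f} percentage points")
```

Under these assumptions the required overweighting is smallest for the lowest loss probability, in line with the observation that the effect is most pronounced when the loss probability is small.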
The underlying mechanism behind these observations is that RDU with standard probability weighting functions leads to increased risk aversion toward low-probability risks. The finding of increased risk aversion for losses of low probability is already part of Tversky and Kahneman's (1992) fourfold pattern of risk attitudes. Ghossoub and He (2020) provide an overview of notions of risk aversion under RDU and also characterize their comparative versions. Requiring an RDU preference to be more risk-averse than the corresponding EU preference for all risks often imposes restrictions on the probability weighting function such as concavity (e.g., Hong et al. 1987;Ryan 2006) or dominance of the identity line (see Quiggin 1993). These restrictions rule out the common inverse S-shape. A focus on risks with a loss probability that is overweighted alleviates this issue.

Substitution between overweighting and utility curvature
Proposition 1 rests on the assumption of identical utility functions when comparing insurance demand between RDU and EU. While this helps us isolate the effect of probability weighting, empirical studies tend to find less concave utility functions when allowing probabilities to be distorted (see Barseghyan et al. 2013;Diecidue and Wakker 2002;Fox et al. 1996;Rabin 2000;Selten et al. 1999). Here, we shed light on how both motives, utility curvature and overweighting of loss probabilities, jointly explain insurance demand.
While the full-insurance bound $\underline{m}_W$ in (7) is constant in utility curvature, the no-insurance bound $\bar{m}_W$ and the optimal level of coverage $\alpha_W^*$ are increasing in utility curvature. 9 So, in general, optimal insurance demand reflects a substitution between utility curvature and overweighting of the loss probability. We can obtain additional insights by specifying the utility function. We use exponential or iso-elastic utility because they measure utility curvature with a single parameter and are common in empirical applications. We can then analyze combinations of utility curvature and absolute amount of overweighting that generate the same level of insurance demand. Graphically, this yields iso-insurance demand curves in the utility curvature-overweighting plane. The substitution effect makes these curves downward-sloping, and the size of their slope measures the degree of substitution. This allows us to identify factors that are associated with stronger substitution. We summarize our results in the following proposition.
Fig. 2 Full-insurance and no-insurance bounds as a function of the absolute amount of overweighting for the three insurance products analyzed by Barseghyan et al. (2013). Our calculations are based on model 1b of their analysis and assume exponential utility with absolute risk aversion of 0.00063. The dashed line indicates the value of $\Delta(p)$ estimated by Barseghyan et al. for each market. The black shaded area contains the loadings where an EU individual buys partial coverage, with full insurance for loadings below and no coverage for loadings above the area. The gray shaded area is the analogous area for probability weighting
9 The result for $\bar{m}_W$ is obtained by factoring out $w(p)/p$ from (7). The first-order condition for $\alpha_W^*$ implies a positive relationship between utility curvature and insurance demand. Both conclusions rely on Pratt (1964), who shows that $u'(x)/u'(x - L)$ is negatively associated with the curvature of $u$.
Proposition 2 Let the utility function be exponential or iso-elastic utility and consider the iso-insurance demand curves in the utility curvature-overweighting plane.
(i) The curves are decreasing in any case and convex if $w(p) < 0.5$;
(ii) They are steeper and more convex, the higher the level of insurance demand, the lower the loss probability as long as $w(p) < 0.5$, and the lower the loss severity.
Figure 3 shows examples of iso-insurance demand curves for iso-elastic utility. We set the loading to $m = 2$ and assume that 50% of wealth is at risk. Each panel sets a different level of insurance demand and illustrates the iso-insurance demand curve for four different loss probabilities. Per result 2(i), the curves are downward-sloping and slightly convex. In each panel, the curves are steeper the lower the loss probability. By comparing the curves across panels, we observe that higher insurance demand makes the curves steeper. This is consistent with result 2(ii). Similar results can be obtained for the no-insurance bound $\bar{m}_W$, which is also characterized by a substitution between utility curvature and overweighting of the loss probability. In Section 5.2, we provide a sense of magnitude for this effect using experimentally calibrated preferences.
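An iso-insurance demand curve can be computed in closed form for iso-elastic utility. The sketch below is our own illustration under stated assumptions: it takes the interior first-order condition of the coinsurance problem, m p (1 - w(p)) u'(x1) = w(p) (1 - m p) u'(x2), with u'(c) = c^(-gamma), and solves it for the relative risk aversion gamma that makes a given coverage level optimal. Wealth is normalized to 1 with half of it at risk and a loading of 2, as in the figure; the target coverage of 0.5, the loss probabilities, the grid of overweighting values, and the function name are hypothetical.

```python
import math

def required_curvature(delta, p, alpha, m=2.0, x=1.0, L=0.5):
    """Relative risk aversion that makes coverage alpha optimal (interior solution).

    delta is the absolute amount of overweighting, so w(p) = p + delta.
    """
    w_p = p + delta
    x1 = x - alpha * m * p * L            # final wealth without a loss
    x2 = x1 - (1.0 - alpha) * L           # final wealth with a loss
    ratio = m * p * (1.0 - w_p) / (w_p * (1.0 - m * p))
    return math.log(ratio) / math.log(x1 / x2)

alpha_target = 0.5
deltas = [d / 100 for d in range(0, 9)]   # 0 to 8 percentage points of overweighting
curve = [required_curvature(d, 0.10, alpha_target) for d in deltas]

# Slope over the first percentage point of overweighting for two loss probabilities.
slope_p10 = required_curvature(0.01, 0.10, alpha_target) - required_curvature(0.0, 0.10, alpha_target)
slope_p05 = required_curvature(0.01, 0.05, alpha_target) - required_curvature(0.0, 0.05, alpha_target)

print([round(g, 2) for g in curve])
print(round(slope_p10, 2), round(slope_p05, 2))
```

In this parametrization the required curvature falls as overweighting rises, the decline flattens along the curve (convexity), and the curve is steeper for the lower loss probability and for a higher target coverage.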
The substitution between overweighting and utility curvature has practical implications for the inference of preferences from insurance choices. Overweighting of the loss probability reduces the degree of utility curvature needed to rationalize a given level of insurance demand. Under EU, the utility curvature required to explain insurance choices can be implausibly large (Sydnor 2010), but allowing for overweighting renders more sensible estimates (Barseghyan et al. 2013). The convexity of the iso-insurance demand curves implies that this substitution effect is stronger at the extensive margin than the intensive margin. So introducing a 1 percentage point overweighting of the loss probability reduces the required utility curvature by more than increasing an existing amount of overweighting by 1 percentage point. Result 2(ii) implies that probability weighting is particularly suitable as an explanation for high insurance demand against modest risks (see Sydnor 2010). These mechanisms underlie the empirical appeal of probability weighting as a descriptor of certain insurance choices in the field. As we will see in Section 4, this appeal is limited and does not extend to other types of insurance demand phenomena.
Fig. 3 Iso-insurance demand curves in the utility curvature-overweighting plane for iso-elastic utility. We consider a loading of $m = 2$ and that 50% of wealth is at risk. We vary the loss probability $p$ in each panel and the level of insurance demand $\alpha_W^*$ across panels

Deductible insurance
For many insurance decisions, individuals face severity risk conditional on the occurrence of loss and choose a deductible that determines how much of the risk they retain and how much they transfer to the insurer. The binary loss model does not allow us to distinguish between deductible insurance and coinsurance. Therefore, we will now introduce severity risk and consider the individual's choice of an optimal deductible level. A caveat to our analysis is that a straight deductible is not necessarily optimal under RDU with an inverse S-shaped probability weighting function, see Bernard et al. (2015), Ghossoub (2019) and Xu et al. (2019). However, it is a very common shape of the indemnity schedule, which is found for property losses in auto insurance and homeowners insurance. Many health plans also contain deductibles. As it turns out, our main results in Proposition 1 do not rely on the binary loss assumption and can be extended to deductible choice.
A loss occurs with probability $p$, and we let the loss severity $\ell$ take values in $[0, L]$ according to the cumulative distribution function $F(\ell)$. We let $F$ be continuous and assume that $L$ is the maximum possible loss, which is the smallest value such that $F(L) = 1$. Then, $F(\ell) < 1$ for all losses $\ell < L$. The indemnity schedule is a straight deductible, $I(\ell) = \max(\ell - D, 0)$ with deductible level $D \in [0, L]$, and the premium is
$$P(D) = m p \int_0^L I(\ell) \, dF(\ell) = m p \int_D^L (\ell - D) \, dF(\ell) = m p \int_D^L (1 - F(\ell)) \, d\ell. \tag{8}$$
The last equality is obtained via integration by parts. We thus have $P'(D) = -m p (1 - F(D))$. The individual's optimal deductible $D_W^*$ maximizes
$$\begin{aligned} W(D) &= (1 - w(p)) \, u(x - P(D)) - \int_0^L u(x - P(D) - \min(\ell, D)) \, dw(p(1 - F(\ell))) \\ &= (1 - w(p)) \, u(x - P(D)) + p \int_0^L w'(p(1 - F(\ell))) \, u(x - P(D) - \min(\ell, D)) \, dF(\ell) \\ &= (1 - w(p)) \, u(x - P(D)) + p \int_0^D w'(p(1 - F(\ell))) \, u(x - P(D) - \ell) \, dF(\ell) + w(p(1 - F(D))) \, u(x - P(D) - D), \end{aligned} \tag{9}$$
see also Bernard et al. (2015). The second equality holds because $w$ is differentiable and $F$ is continuous, the third equality is obtained by distinguishing between losses below the deductible and losses above the deductible. The objective function under EU is a special case of (9) by setting $w(p) = p$ such that $w'(p) = 1$ for all $p$. It is given by
$$U(D) = (1 - p) \, u(x - P(D)) + p \int_0^D u(x - P(D) - \ell) \, dF(\ell) + p (1 - F(D)) \, u(x - P(D) - D), \tag{10}$$
see also Schlesinger (2013). The three terms represent the utility if no loss occurs, the utility of losses below the deductible, and the utility of losses above the deductible.
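The integration-by-parts step behind the premium and its derivative can be verified numerically. The following sketch assumes a uniform loss severity on [0, L], a hypothetical loading and loss probability, and a simple midpoint rule; none of these choices come from the paper, and the function names are ours.

```python
def midpoint_integral(f, lo, hi, n=20_000):
    """Midpoint-rule approximation of the integral of f over [lo, hi]."""
    h = (hi - lo) / n
    return h * sum(f(lo + (i + 0.5) * h) for i in range(n))

L, p, m = 1_000.0, 0.069, 1.3        # hypothetical values

def F(l):
    """Uniform severity distribution on [0, L] (assumption)."""
    return l / L

def density(l):
    """Density of the uniform severity distribution."""
    return 1.0 / L

def premium_direct(D):
    """m * p * expected indemnity, integrating (l - D) against the density."""
    return m * p * midpoint_integral(lambda l: (l - D) * density(l), D, L)

def premium_by_parts(D):
    """Same premium after integration by parts: m * p * integral of 1 - F."""
    return m * p * midpoint_integral(lambda l: 1.0 - F(l), D, L)

def premium_slope(D):
    """Derivative of the premium with respect to the deductible."""
    return -m * p * (1.0 - F(D))

D = 250.0
h = 1.0
finite_difference = (premium_by_parts(D + h) - premium_by_parts(D - h)) / (2 * h)

print(premium_direct(D), premium_by_parts(D))
print(finite_difference, premium_slope(D))
```

Both premium expressions agree, and the finite-difference slope matches -m p (1 - F(D)): raising the deductible lowers the premium only through losses that exceed it.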
Even under EU with a concave utility function, the objective function of the deductible choice problem is not necessarily concave (see Meyer and Ormiston 1999;Schlesinger 1981). Therefore, we have to assume that both U and W are concave in D so that we can utilize the first-order approach.
In this case, we obtain an upper bound $\bar{m}_W$ on the loading with no insurance demand for loadings above this threshold ($D_W^* = L$ for $m \geq \bar{m}_W$), and a lower bound $\underline{m}_W$ on the loading with full insurance demand for loadings below this threshold ($D_W^* = 0$ for $m \leq \underline{m}_W$). Partial insurance is optimal for loadings between the two bounds. These results mirror what we found in the coinsurance problem. The bounds are given by $\underline{m}_W = w(p)/p$ and by
$$\bar{m}_W = \frac{w'(0) \, u'(x - L)}{(1 - w(p)) \, u'(x) + p \int_0^L w'(p(1 - F(\ell))) \, u'(x - \ell) \, dF(\ell)}.$$
The full-insurance bound $\underline{m}_W$ is exactly the same as in the case of coinsurance. We therefore recoup the first part of Proposition 1(i): Overweighting (underweighting) of the loss probability increases (decreases) the full-insurance bound. Obviously, we then also recoup Proposition 1(iii). The no-insurance bound $\bar{m}_W$ differs from the expression in Eq. (7) for coinsurance. In the deductible case, the EU no-insurance bound is
$$\bar{m}_U = \frac{u'(x - L)}{(1 - p) \, u'(x) + p \int_0^L u'(x - \ell) \, dF(\ell)},$$
and probability weighting now has several effects on the no-insurance bound. If $\lim_{p \to 0} w'(p) = \infty$, as is the case for many parametric classes of inverse S-shaped probability weighting functions, we have that $\bar{m}_W = \infty$, and probability weighting induces the individual to always buy some insurance regardless of the size of the loading. This prediction seems unrealistic. When we focus on $w'(0) < \infty$, probability weighting raises the no-insurance bound when the loss probability is overweighted and the probability weighting function is concave on $(0, p]$. The same conditions ensure that probability weighting lowers the deductible compared to the level optimal for EU. A lower deductible corresponds to increased insurance demand. The next proposition summarizes.
Proposition 3 Consider the problem of optimal deductible choice for a given utility function.

(i) Overweighting (underweighting) of the loss probability increases (decreases) the full-insurance bound.
(ii) Probability weighting increases the no-insurance bound if $\lim_{p \to 0} w'(p) = \infty$, or if $w$ overweights the loss probability and is concave on $(0, p]$.
(iii) Probability weighting lowers the optimal deductible if $w$ overweights the loss probability and is concave on $(0, p]$.

Compared to the coinsurance case, overweighting of the loss probability now needs to be flanked by the concavity of the probability weighting function for probabilities below the loss probability. The reason is that, in addition to the effects discussed in Section 3.1, probability weighting now also affects the marginal utility of losses below the deductible because these losses are retained by the individual. Overweighting of $p$ and concavity up to $p$ imply that the DRO region includes $(0, p]$. The assumptions in Proposition 3(ii) and (iii) are thus more restrictive than the assumptions for the corresponding coinsurance results. From an empirical standpoint, however, they are still mild. They are satisfied for an inverse S-shaped weighting function for loss probabilities that are below the interior fixed point $p^*$ and below the inflection point $\tilde{p}$. Many empirical studies find $\tilde{p} \ge p^*$ and $p^* \approx 0.37$ (e.g., Prelec 1998), so that most loss probabilities relevant in insurance demand are covered. In this case, the prediction of higher insurance demand under probability weighting continues to hold for some loss probabilities that are between $p^*$ and $\tilde{p}$ and are thus underweighted. The appeal of probability weighting as an explanation for higher insurance demand thus extends to the case of deductible choice and is not driven by the binary loss assumption.
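The two conditions used in Proposition 3, overweighting of the loss probability and concavity of the weighting function up to the loss probability, are easy to check numerically for a given parametric form. The sketch below uses Prelec's (1998) one-parameter function $w(p) = \exp(-(-\ln p)^a)$, whose fixed point and inflection point both lie at $1/e \approx 0.37$. The parameter value $a = 0.65$ and the helper names are illustrative assumptions, not estimates or notation from this paper.

```python
import math


def prelec(p, a=0.65):
    """One-parameter Prelec weighting function; a = 0.65 is a hypothetical value."""
    return math.exp(-((-math.log(p)) ** a))


def overweights(w, p):
    """True if the weighting function w overweights probability p."""
    return w(p) > p


def concave_on(w, p, steps=2000, tol=1e-12):
    """Check concavity of w on (0, p] via second differences on an interior grid."""
    h = p / steps
    for i in range(2, steps):
        x = i * h
        # A positive second difference signals local convexity.
        if w(x - h) - 2.0 * w(x) + w(x + h) > tol:
            return False
    return True


for loss_prob in (0.05, 0.30, 0.50):
    print(loss_prob,
          overweights(prelec, loss_prob),
          concave_on(prelec, loss_prob))
```

For this functional form, both conditions hold for loss probabilities below $1/e$ and both fail above it, in line with the discussion of the fixed point and the inflection point.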

LPHI versus HPLI risks
The study of low-probability, high-impact (LPHI) risks such as natural catastrophes has become a focus of the insurance economics literature. Such risks pose a major financial challenge to consumers and to society as a whole (e.g., Abadie and Gardeazabal 2003;Bouwer et al. 2007). High insurance penetration can make households and the entire economy significantly more resilient against such shocks (Von Peter et al. 2012), suggesting that insurance against LPHI risks has greater value than insurance against high-probability, low-impact (HPLI) risks. Browne et al. (2015) show that, under EU, individuals should purchase more insurance against LPHI risks than comparable HPLI risks, but observe the opposite in the data. 10 Other studies document a similar preference in favor of insurance against HPLI risks (e.g., Slovic et al. 1977).
In this section, we investigate whether probability weighting can explain this behavior. Our results thus far do not provide an answer. As long as loss probabilities are overweighted, RDU predicts higher demand against both HPLI risks and LPHI risks than EU does; see Proposition 1(i) and 1(ii). It is not clear how the two demand levels under RDU compare to each other. Our results on the effects of changes in the loss probability in Proposition 1(iii) and 1(iv) are first-order risk changes, whereas the comparison between LPHI and HPLI risks is better conceptualized as an increase in risk in the sense of Rothschild and Stiglitz (1970), which is a second-order risk change. We summarize our findings in the following proposition.
Proposition 4 Under probability weighting, the no-insurance bound $\overline{m}_W$, the full-insurance bound $\underline{m}_W$, and optimal insurance demand $\alpha^*_W$ are higher for LPHI risks than HPLI risks as long as the corresponding loss probabilities are in the DRO region.
Proposition 4 extends Browne et al.'s (2015) EU prediction of higher insurance demand against LPHI risks than comparable HPLI risks to probability weighting as long as DRO holds for the relevant loss probabilities. Overweighting is not required. The result holds for a probability weighting function that underweights the associated loss probabilities as long as low probabilities are relatively less underweighted than larger ones so that DRO remains satisfied. Under EU, the full-insurance bound is always one and therefore unaffected by the change from an HPLI risk to an LPHI risk but the no-insurance bound and optimal insurance demand increase.
Probability weighting and EU make the same prediction about insurance demand against LPHI versus HPLI risks: we should observe higher demand for the former than the latter, and not the other way around. Proposition 4 holds for any increasing and concave utility function, so this prediction continues to hold even if individuals who weight probabilities have less concave utility functions than their EU counterparts. As we show in Proposition 7 in Appendix A.2, DRO regions are large for common probability weighting functions and contain all loss probabilities relevant in insurance demand. The DRO condition in Proposition 4 is thus likely to be fulfilled.
Our proof relies on Rothschild and Stiglitz's (1970) notion of an increase in risk, which presupposes equal expected losses. This may appear as a knife-edge case. In reality, the expected loss for LPHI risks is often greater than for HPLI risks, especially when considering the fat-tailed nature of catastrophic events. Extended warranties for appliances are a typical example of an HPLI risk. Jindal (2014) finds that washing machines have a 25-29% failure rate with an average repair cost of $249 and an average replacement cost of $599, resulting in an expected loss of $62-$72 for repair and $150-$173 for replacement. Flood risk is often considered an example of an LPHI risk with some degree of underinsurance. A back-of-the-envelope calculation based on aggregate data from the National Flood Insurance Program (NFIP) shows an average insured loss amount of $67,500 and an average insured loss probability of 0.8655%. 11 This results in an expected loss of $584, which exceeds the expected loss for the HPLI risk example by more than a factor of three.
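The expected-loss figures in this comparison follow from simple multiplication and can be reproduced as below. The inputs are the numbers quoted in the text; the variable names are illustrative.

```python
# Reproduce the back-of-the-envelope expected losses quoted in the text.
washer_failure_rates = (0.25, 0.29)   # failure rate range from Jindal (2014)
repair_cost = 249.0                   # average repair cost in dollars
replacement_cost = 599.0              # average replacement cost in dollars

flood_loss = 67_500.0                 # average insured flood loss (NFIP)
flood_probability = 0.008655          # 0.8655% average insured loss probability

expected_repair = tuple(rate * repair_cost for rate in washer_failure_rates)
expected_replacement = tuple(rate * replacement_cost for rate in washer_failure_rates)
expected_flood = flood_probability * flood_loss

# Compare the flood figure with the largest HPLI figure (replacement at 29%).
ratio_to_largest_hpli = expected_flood / max(expected_replacement)

print(expected_repair, expected_replacement, expected_flood, ratio_to_largest_hpli)
```

Even against the upper end of the replacement-cost range, the expected flood loss is more than three times as large.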
Proposition 4 remains valid when LPHI risks have higher expected losses than HPLI risks as long as the utility function displays non-increasing absolute risk aversion. Standard comparative static arguments show that optimal insurance demand is then increasing in the loss severity. Both EU and probability weighting share the prediction of higher insurance demand against LPHI risks compared to HPLI risks even if we allow for differences in expected losses. Probability weighting cannot explain why researchers often find the reverse behavior in the data.
Several alternatives have been proposed to explain underinsurance of LPHI risks. One suggestion is diminishing marginal sensitivity (Schmidt 2016), though evidence of this behavioral assumption is mixed (e.g., Chung et al. 2019;Harbaugh et al. 2010;Jaspersen et al. 2021). Friedl et al. (2014) argue that social comparison can make insurance against highly correlated risks less attractive and present evidence from a laboratory experiment. Subjective probability estimates are another possibility. Recent experimental evidence suggests that underinsurance of LPHI risks is not observed when objective probabilities are provided (Bajtelsmit et al. 2015;Laury et al. 2009). Krawczyk et al. (2017) document persistent underestimation of loss probabilities for LPHI risks even if subjects learn about the risk over time.
In the extreme, individuals may simply neglect rare events. Kahneman and Tversky (1979) emphasized the importance of neglect within the editing phase. Neglect or underestimation of rare events is much more common than overweighting when people make decisions based on experience as opposed to when they learn about their options from descriptions (Hertwig et al. 2004;Hertwig and Erev 2009). When people neglect rare events, lack of insurance demand is to be expected and is not at odds with probability weighting.

Nonperformance risk
Thus far, we have assumed that the insurer will follow through with the promised indemnity when a loss happens. In real life, however, promises are not always kept. Reasons for nonperformance include insurer default, claims being denied or contested, delays in the claims handling process, and contractual uncertainty when it comes to the interpretation of the insurance contract (see Schlesinger 2013;Li and Peter 2021). Experiments have shown that individuals tend to react strongly to nonperformance risk, purchasing less insurance than predicted by EU. In Wakker et al.'s (1997) hypothetical survey, respondents required a 20% premium reduction to compensate for a 1% default probability. Similar results have been documented in other hypothetical studies (Albrecht and Maurer 2000;Zimmer et al. 2009), in incentive-compatible experiments (Zimmer et al. 2018;McIntosh et al. 2019), and in the field (Cole et al. 2013).
Probability weighting appears to be a promising explanation for low insurance demand due to nonperformance risk, because the inverse S-shape will overweight the probability of the worst outcome, namely experiencing a loss but not receiving a payment from the insurer. Wakker et al. (1997) find that probability weighting indeed reduces the willingness to pay for insurance when nonperformance risk is present. They only consider full insurance and can therefore not speak to the demand implications of probability weighting. When considering optimal demand, we will see that probability weighting has several effects on the optimal level of coverage. From a practical standpoint, the prediction of overinsurance tends to persist even in the presence of nonperformance risk.
We follow the primary theoretical models of Schlesinger and von der Schulenburg (1987) and Doherty and Schlesinger (1990). The setup is the same as in Section 2.1, except that the insurer now pays the claim with probability $q \in (0, 1)$ and nonperformance occurs with probability $(1 - q)$. The insurance premium is adjusted accordingly to $\alpha mpqL$. If the claim goes unpaid, final wealth is $x - \alpha mpqL - L$. The individual's objective function under EU is

$$U_{NP}(\alpha) = (1 - p)\, u(x - \alpha mpqL) + pq\, u\left(x - \alpha mpqL - (1 - \alpha)L\right) + p(1 - q)\, u(x - \alpha mpqL - L),$$

where the subscript NP abbreviates nonperformance. 12 Under RDU, the individual's insurance demand maximizes

$$W_{NP}(\alpha) = \pi_1\, u(x - \alpha mpqL) + \pi_2\, u\left(x - \alpha mpqL - (1 - \alpha)L\right) + \pi_3\, u(x - \alpha mpqL - L), \qquad (14)$$

with decision weights $\pi_1 = 1 - w(p)$, $\pi_2 = w(p) - w(p(1 - q))$ and $\pi_3 = w(p(1 - q))$. We assume that the compound lottery is reduced before obtaining the decision weights. 13

Under EU, nonperformance risk invalidates most of the comparative statics of insurance demand because it introduces complex effects into the individual's cost-benefit analysis (Doherty and Schlesinger 1990). The possibility of contract nonperformance reduces the marginal benefit of insurance because insurance is no longer completely reliable. It also increases the marginal cost of insurance because marginal utility in the nonperformance state is higher than in the other states. However, the insurance premium takes nonperformance risk into account. This represents a countervailing wealth effect, which reduces the marginal cost of insurance, because nonperformance risk makes coverage more affordable. The following proposition presents the effects of nonperformance risk on insurance demand and draws a comparison between RDU and EU.

12 We focus on total nonperformance for simplicity. Doherty and Schlesinger (1990) and Mahul and Wright (2007) also discuss partial recovery.
13 Assuming reduction of compound lotteries (ROCL) is not innocuous (see Bernasconi 1994). Segal (1990) shows in his Theorem 2(a) that ROCL does not imply compound independence or mixture independence, so it is not at odds with RDU. Segal's (1990) recursive RDU model allows for violations of ROCL. Recent experimental evidence suggests that this model performs worse at explaining insurance choices than conventional RDU with ROCL (see Lambregts et al. 2021).

Proposition 5 Assume the DRO region covers all probabilities up to the loss probability.
(i) At the margin, nonperformance risk reduces the full-insurance bound and the no-insurance bound. (ii) At the margin, nonperformance risk reduces optimal insurance demand if the utility function exhibits non-increasing absolute risk aversion.
Let the utility function be given and assume that $w$ overweights the loss probability and is concave on $(0, p]$.
(iii) Probability weighting leads to higher insurance demand than EU if and only if $q > \bar{q}$ for an endogenous performance threshold $\bar{q} \in [0, 1)$.
According to results 5(i) and 5(ii), nonperformance risk has the intuitive effect of reducing insurance demand if the level of nonperformance risk is small and the DRO region is large enough. Small nonperformance probabilities lower the no-insurance bound, making it less likely that individuals will purchase any amount of coverage. Small levels of nonperformance risk also lower optimal insurance demand for those loadings where coverage is in demand. In these cases, the negative effect of a less reliable insurance contract due to nonperformance risk outweighs the positive effect of a lower premium. Results 5(i) and 5(ii) contain EU as a special case, thereby extending Doherty and Schlesinger's (1990) analysis. They find that demand effects are indeterminate when allowing for an arbitrary level of nonperformance risk, whereas we find a definitive negative effect by focusing on small levels of nonperformance risk. 14

Result 5(iii) holds for a given level of nonperformance risk, which does not have to be small. Under the stated assumptions, RDU predicts higher insurance demand than EU as long as the performance probability is large enough. In this case, probability weighting does worse than EU at explaining the underinsurance puzzle for nonperformance risk. This result may appear surprising at first sight, especially against the background of Wakker et al. (1997), who find a lower willingness to pay for insurance due to probability weighting when nonperformance risk is present. The decision weights in Equation (14) put less weight on the no-loss state ($\pi_1 < 1 - p$) due to overweighting of the loss probability, and more weight on the nonperformance state ($\pi_3 > p(1 - q)$) because overweighting of the loss probability and concavity of $w$ on $(0, p]$ imply overweighting of the nonperformance state. The probability of the intermediate state, where a loss happens and the insurance contract performs, may be overweighted ($\pi_2 > pq$) or underweighted ($\pi_2 < pq$). Overweighting occurs when $q$ exceeds a critical level and underweighting occurs when $q$ is below this critical level.
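The behavior of the three decision weights can be illustrated numerically. The sketch below computes $\pi_1 = 1 - w(p)$, $\pi_2 = w(p) - w(p(1 - q))$ and $\pi_3 = w(p(1 - q))$ and locates, by bisection, the performance probability at which the intermediate state switches from being underweighted to being overweighted. It assumes a Prelec weighting function with a hypothetical parameter $a = 0.65$ and a loss probability of 10%; the function names are illustrative, and the bisection presumes that $w$ overweights $p$ and is concave on $(0, p]$, so that at most one interior switch exists.

```python
import math


def prelec(p, a=0.65):
    """Prelec weighting function with w(0) = 0; a = 0.65 is a hypothetical value."""
    if p <= 0.0:
        return 0.0
    return math.exp(-((-math.log(p)) ** a))


def decision_weights(p, q, w):
    """Rank-dependent weights for the no-loss, performance, and nonperformance states."""
    pi3 = w(p * (1.0 - q))   # worst state: loss occurs and the claim goes unpaid
    pi2 = w(p) - pi3         # intermediate state: loss occurs and the claim is paid
    pi1 = 1.0 - w(p)         # best state: no loss
    return pi1, pi2, pi3


def critical_q(p, w, tol=1e-10):
    """Performance probability at which pi2 = p * q (0.0 if pi2 >= p * q throughout)."""
    def gap(q):
        return decision_weights(p, q, w)[1] - p * q

    lo, hi = 1e-6, 1.0
    if gap(lo) >= 0.0:
        return 0.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if gap(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


p = 0.10
q_switch = critical_q(p, prelec)
for q in (0.05, 0.50, 0.90, 0.99):
    pi1, pi2, pi3 = decision_weights(p, q, prelec)
    print(q, round(pi1, 4), round(pi2, 4), round(pi3, 4), pi2 > p * q)
print("switch at q =", round(q_switch, 3))
```

In this parameterization the no-loss state is always underweighted and the nonperformance state is always overweighted, while the sign of $\pi_2 - pq$ changes once as $q$ increases.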
As a result, the marginal benefit of insurance with nonperformance risk may actually be larger under RDU than under EU as long as the performance probability is large enough because then the individual attaches sufficient weight on the intermediate state where the contract performs. So while probability weighting always increases the marginal cost of insurance relative to EU, its effect on the marginal benefit can be positive or negative. In the absence of nonperformance risk, we know from Proposition 1(ii) that the impact of probability weighting on the marginal benefit is stronger than its impact on the marginal cost. This dominance persists when introducing nonperformance risk as long as the nonperformance probability is not too large.
The assumptions on the probability weighting function in Proposition 5(iii) are the same as in the case of deductible choice, see Proposition 3(ii) and (iii). As discussed there, they are more restrictive than assuming DRO on (0, p] but are still satisfied by inverse S-shaped probability weighting functions and loss probabilities relevant in insurance demand. The important insight from Proposition 5(iii) is that for low, and thus empirically plausible, levels of nonperformance risk, probability weighting predicts higher insurance demand than EU. If EU already predicts demand that is too high compared to empirically observed behavior, probability weighting will only widen the gap between theory and evidence and make matters worse.
The restrictive assumption in Proposition 5(iii) is that we keep utility curvature fixed when introducing probability weighting. As discussed in Proposition 2, utility functions are usually less concave in the presence of probability weighting. To take this into account, we will now assume exponential or iso-elastic utility to leverage Clarke's (2016) Theorem 2. Insurance demand with nonperformance risk is a special case of index insurance. Focusing on those cases where some insurance is purchased for some levels of risk aversion, Clarke shows that insurance demand can be strictly decreasing in risk aversion for $m = 1$, or hump-shaped in risk aversion for $m \ge 1$. For iso-elastic utility, we denote the relative risk aversion parameters by $\gamma_U$ for EU and by $\gamma_W$ for probability weighting with $\gamma_U > \gamma_W$. If insurance demand under EU is hump-shaped in risk aversion, let $\gamma^*$ denote the level of risk aversion where it peaks. We then obtain the following result.
Proposition 6 Assume iso-elastic utility with parameter $\gamma_U$ under EU and $\gamma_W$ under probability weighting, with $\gamma_W < \gamma_U$. Assume that $w$ overweights the loss probability and is concave on $(0, p]$. Let $\bar{q}$ denote the threshold from Proposition 5(iii) for iso-elastic utility with parameter $\gamma_W$.
(i) For $m = 1$, if insurance demand is strictly decreasing in risk aversion and $q > \bar{q}$, probability weighting predicts higher insurance demand than EU.
For $m \ge 1$ with hump-shaped insurance demand that peaks at $\gamma^*$, several cases are possible.
(ii) If $\gamma_U \le \gamma^*$ and $q \le \bar{q}$, probability weighting predicts lower insurance demand than EU.
(iii) If $\gamma_U > \gamma^*$, there is a $\hat{\gamma} < \gamma^*$ so that probability weighting predicts higher insurance demand than EU for $\gamma_W \in (\hat{\gamma}, \gamma_U)$ and $q \ge \bar{q}$, and lower insurance demand for $\gamma_W < \hat{\gamma}$ and $q \le \bar{q}$.
The proof is obtained by combining Clarke's (2016) Theorem 2(i) and (ii) with our Proposition 5(iii). Proposition 6 presents those cases where the effect of the change in utility curvature is aligned with the effects of probability weighting. When insurance demand is hump-shaped and $\gamma_U$ lies to the right of the peak, the critical value $\hat{\gamma}$ partitions $(0, \gamma_U)$ into levels of risk aversion associated with lower insurance demand and levels of risk aversion with higher insurance demand. Similarly, the threshold $\bar{q}$ from Proposition 5(iii) separates nonperformance probabilities where probability weighting predicts higher insurance demand from those where it predicts lower insurance demand. We obtain Proposition 6 by combining cases with the same sign. Proposition 6 also holds for exponential utility because Clarke's Theorem 2 does and because our Proposition 5(iii) holds for any concave utility function.
Some cases remain indeterminate. Take $m > 1$ with a small level of nonperformance risk ($q > \bar{q}$) and either $\gamma_U \le \gamma^*$, or $\gamma_U > \gamma^*$ but $\gamma_W < \hat{\gamma}$. In this case, the decline in utility curvature predicts lower insurance demand, either because we are in the upward-sloping portion of the hump or because the decrease in utility curvature is large. However, probability weighting predicts an increase in insurance demand. The net effect then depends on the relative strength of both changes. To shed some light on those cases, we present numerical illustrations in Section 5.3.

Experimental design
To speak to the magnitude of some of our analytical results, we use data from an incentivized experiment to calibrate utility curvature and probability weighting functions. We then use those preferences to illustrate our findings. For the experiment, we recruited 94 subjects from a population primarily consisting of students. 15 Subjects first completed a general knowledge questionnaire of 20 multiple-choice questions. They received 20€ in compensation for answering at least 50% of the questions correctly. Four of our subjects did not pass the questionnaire and are not considered further in the analysis. Each subject then made 90 insurance-like choices over a possible loss ($L = 10$€ or $L = 20$€). In 45 choices, the subject chose between a risky loss of $L$ with probability $p$ and a certain loss of $mpL$. In the other 45 choices, the subject chose between a risky gain of another 20€ $- L$ with probability $p$ or 20€ with probability $(1 - p)$, and a certain gain of 20€ $- mpL$. Both sets of choices had the same possible combinations of $p$, $m$ and $L$, which are displayed in Table 1. The order of all 90 choices, as well as their left/right ordering, was randomized.

The role of utility curvature and probability weighting
In Proposition 2, we showed that utility curvature and overweighting of the loss probability jointly rationalize insurance demand. To compare the relative importance of each preference motive, we first use the subjects' decisions to estimate parameters of preference functionals at the individual level, assuming full integration of assets. We use iso-elastic utility with relative risk aversion $\gamma$, that is, $u(x) = x^{1-\gamma}/(1 - \gamma)$ for $\gamma \ne 1$ and $u(x) = \ln x$ for $\gamma = 1$, and estimate parameters for the six probability weighting functions given in Appendix A.1. Goldstein and Einhorn's (1987, GE) functional form fits our data best, so we use it for our main analysis. 16 We then use the estimated parameters to show how differences in preferences across individuals affect predicted insurance demand. We assume $x = 20$ and $L = 10$ and calculate the no-insurance bound $\overline{m}_W$ over the range of absolute amounts of overweighting of the loss probability and utility curvature parameters $\gamma$ observed in the subject population. For this, we hold one parameter constant at its median and vary the other parameter in the 10th, 25th, 50th, 75th, and 90th percentiles of the data, moving from lowest to highest. The top panel of Table 2 provides the percentiles of the absolute amount of overweighting and of $\gamma$ in the data based on the estimated preference functionals. 17 The bottom panel of Table 2 shows the results. The second column denotes the parameter being varied. Reading from left to right, the no-insurance bound increases in the amount of overweighting of the loss probability and in utility curvature. This is consistent with the substitution between overweighting and utility curvature outlined in Proposition 2. Reading down the columns, the no-insurance bound decreases as $p$ increases because the loss probabilities considered here are in the DRO region of $w_{GE}$; see Proposition 7(iv) in Appendix A.2. The rightmost column of Table 2 shows the interdecile range of the no-insurance bound for the respective parameter.

Three points are noteworthy. First, differences in overweighting and utility curvature can lead to sizable variation in $\overline{m}_W$. Second, such differences generate more variation in $\overline{m}_W$ at low rather than high loss probabilities. Third, $\overline{m}_W$ is more sensitive to changes in overweighting than changes in utility curvature at all considered loss probabilities. All in all, our findings suggest that probability weighting is the dominating preference motive because it has a stronger impact on the no-insurance bound than utility curvature, at least for our set of experimentally calibrated preferences.

16 Results for other probability weighting functions are comparable, see Appendix C.2. We only include subjects for whom the maximum likelihood estimator converges. For the GE function, this is the case for 77 out of 90 subjects. The second column of Table 4 in Appendix A.2 provides an overview.
17 We include subjects with negative $\gamma$ and thus convex utility functions in the analysis. For them, $\overline{m}_W$ is the loading where they are indifferent between no insurance and full insurance because individuals with a convex utility function would never purchase partial insurance. However, $\overline{m}_W$ is continuous in $\gamma \in \mathbb{R}$ and its comparative statics with respect to $\gamma$ and the amount of overweighting are similar for concave and convex utility functions.
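The kind of calculation behind Table 2 can be sketched as follows. The code assumes the binary-loss coinsurance model, in which the first-order condition at zero coverage gives the no-insurance bound $\overline{m}_W = w(p)\, u'(x - L) / \{p\, [(1 - w(p))\, u'(x) + w(p)\, u'(x - L)]\}$. This expression is derived here as an assumption and is not copied from Eq. (7). The Goldstein-Einhorn parameters $\delta = 0.8$ and $\eta = 0.6$ and the curvature $\gamma = 0.5$ are hypothetical values, not the estimates from the experiment.

```python
def ge_weight(p, delta=0.8, eta=0.6):
    """Goldstein-Einhorn weighting function; delta and eta are hypothetical values."""
    a = delta * p ** eta
    return a / (a + (1.0 - p) ** eta)


def crra_marginal_utility(wealth, gamma):
    """Marginal utility u'(x) = x^(-gamma) of iso-elastic utility."""
    return wealth ** (-gamma)


def no_insurance_bound(p, x, L, gamma, w=None):
    """Loading above which zero coverage is optimal (binary loss, coinsurance).

    With w=None the weighting function is the identity, which gives the EU bound.
    """
    wp = p if w is None else w(p)
    mu_no_loss = crra_marginal_utility(x, gamma)
    mu_loss = crra_marginal_utility(x - L, gamma)
    return wp * mu_loss / (p * ((1.0 - wp) * mu_no_loss + wp * mu_loss))


x, L, gamma = 20.0, 10.0, 0.5
loss_probs = (0.01, 0.05, 0.10, 0.20)
bounds_rdu = [no_insurance_bound(p, x, L, gamma, ge_weight) for p in loss_probs]
bounds_eu = [no_insurance_bound(p, x, L, gamma) for p in loss_probs]

for p, b_w, b_u in zip(loss_probs, bounds_rdu, bounds_eu):
    print(p, round(b_w, 3), round(b_u, 3))
```

With these illustrative parameters the bound under probability weighting exceeds the EU bound at every loss probability and declines as $p$ increases, which is the qualitative pattern described for Table 2.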
The calculations in Table 2 are based on a loss that puts 50% of wealth at risk, which is high compared to most naturally occurring insurance decisions except liability and perhaps homeowners insurance when considering a total loss. In Appendix C.1, we provide an illustration for losses that put only 25% or 10% of wealth at risk. Differences in overweighting and utility curvature then lead to less variation in m W , but probability weighting is even more important than utility curvature in relative terms. In Appendix C.2, we repeat our illustration for different forms of the probability weighting function. While the no-insurance bound is always decreasing in p, the size of this effect varies by functional form. We provide a detailed discussion in the appendix.

Calculation of nonperformance thresholds
Propositions 5(iii) and 6 highlight the role of an endogenous performance threshold $\bar{q}$ to sign the effect of probability weighting on insurance demand under nonperformance risk. We will now use the experimentally calibrated preferences to calculate this threshold and provide a sense of its magnitude. We set $x = 20$ and $L = 10$ and simulate sixteen insurance decisions, using the estimated preference functionals of the subjects in our sample. 18 We vary the loss probability across four values with $p \in \{0.01, 0.05, 0.10, 0.20\}$ and the loading across four values with $m \in \{1.10, 1.25, 1.50, 2.00\}$. For each combination of $p$ and $m$, we determine optimal insurance demand as a function of the performance probability for each individual with and without probability weighting. 19 We make two such comparisons and report them in Table 3. For the results in panel (a), we keep the utility function fixed when introducing probability weighting as in Proposition 5(iii). In panel (b), we re-estimate each individual's preference functional when probability weighting is muted to obtain the best-fitting level of utility curvature under EU. In each comparison, we identify the value of the performance probability $q$ where the individual's insurance demand under probability weighting coincides with their insurance demand under EU. This is $\bar{q}$ for this particular individual. For each combination of loss probability and loading, we thus obtain a distribution of $\bar{q}$ values across individuals. Table 3 reports the averages of these distributions. For example, the top-left value of $\bar{q} = 0.05$ means that, for insurance with a loading of 1.10 covering a 1% chance of loss, probability weighting will, on average, imply higher insurance demand than EU as long as claims have at least a 5% chance of getting paid. In other words, for probability weighting to rationalize lower insurance demand than EU, a nonperformance probability of more than 95% is required.

18 Our results are virtually unchanged if the loss puts only 25% or 10% of wealth at risk.
19 We use Goldstein and Einhorn's (1987) probability weighting function because it has the best overall fit with our data, and restrict the comparison to individuals with a concave utility function, since this is required in Propositions 5(iii) and 6. In each insurance decision, we further discard those individuals who have the same insurance demand at all levels of nonperformance risk as no meaningful comparison is possible. This case mostly appears for combinations of high $p$ and high $m$ because then insurance demand is uniformly zero.
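A threshold of this kind can be approximated for a single hypothetical individual with a simple grid search. The sketch below assumes the three-state objective with premium $\alpha mpqL$ and decision weights $\pi_1 = 1 - w(p)$, $\pi_2 = w(p) - w(p(1 - q))$, $\pi_3 = w(p(1 - q))$, iso-elastic utility, and a Goldstein-Einhorn weighting function. The parameters ($\delta = 0.8$, $\eta = 0.6$, $\gamma = 0.5$, $p = 0.2$, $m = 1.1$) are illustrative and are not the estimates behind Table 3, and the grid search is a stand-in for whatever numerical routine was actually used. The threshold is reported as the largest grid value of $q$ at which demand under probability weighting does not exceed demand under EU, which presumes a single crossing.

```python
import math


def ge_weight(p, delta=0.8, eta=0.6):
    """Goldstein-Einhorn weighting function; delta and eta are hypothetical values."""
    a = delta * p ** eta
    return a / (a + (1.0 - p) ** eta)


def crra(wealth, gamma):
    """Iso-elastic utility with relative risk aversion gamma."""
    if gamma == 1.0:
        return math.log(wealth)
    return wealth ** (1.0 - gamma) / (1.0 - gamma)


def optimal_coverage(q, p, m, x, L, gamma, w=None, n=500):
    """Grid-search the coinsurance rate alpha in [0, 1]; w=None gives EU."""
    weight = (lambda s: s) if w is None else w
    pi3 = weight(p * (1.0 - q))   # loss occurs, claim unpaid
    pi2 = weight(p) - pi3         # loss occurs, claim paid
    pi1 = 1.0 - weight(p)         # no loss

    def objective(alpha):
        premium = alpha * m * p * q * L
        return (pi1 * crra(x - premium, gamma)
                + pi2 * crra(x - premium - (1.0 - alpha) * L, gamma)
                + pi3 * crra(x - premium - L, gamma))

    best = max(range(n + 1), key=lambda i: objective(i / n))
    return best / n


def performance_threshold(p, m, x, L, gamma, w, steps=100):
    """Largest grid q at which demand under weighting does not exceed EU demand."""
    threshold = 0.0
    for i in range(1, steps + 1):
        q = i / steps
        if optimal_coverage(q, p, m, x, L, gamma, w) <= optimal_coverage(q, p, m, x, L, gamma):
            threshold = q
    return threshold


p, m, x, L, gamma = 0.20, 1.10, 20.0, 10.0, 0.5
q_bar = performance_threshold(p, m, x, L, gamma, ge_weight)
print("approximate threshold:", q_bar)
```

Repeating this calculation for each subject's estimated parameters and averaging over subjects would produce one cell of a table like Table 3.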
In both panels, the average performance thresholds are so low that virtually all empirically relevant levels of nonperformance risk lead to higher insurance demand under probability weighting than under EU. The average thresholds are closer to one when a high loading is coupled with a high loss probability, but these cases become less realistic at the same time. A 20% loss probability with a loading of 2 creates an insurance premium that is 40% of the insured asset's value, and many individuals may prefer not to insure altogether, regardless of the insurer's performance probability. Comparing the average thresholds between panels, we notice a slight increase when allowing for a different level of utility curvature. So the additional flexibility about the utility function implies that lower levels of nonperformance risk allow for RDU to predict less insurance demand than EU. However, the difference between the two panels never exceeds 5 percentage points, so in absolute terms these levels of nonperformance risk are still high.
To put things in perspective, we note that many of the loss probabilities and loading factors in Table 3 are representative of standard consumer insurance markets. The predicted homeowners loss probability in Barseghyan et al. (2013) has a mean of 8.4% and a standard deviation of 4.4%. Insurers typically operate in competitive markets with moderate loadings. A back-of-the-envelope calculation using twenty years of industry-level data from the National Association of Insurance Commissioners (NAIC 2017) shows that the mean ratio of premiums to losses (an approximation of $m$) for homeowners insurance is 1.39 with a maximum of 1.67. For private auto insurance, the mean is 1.31 and the maximum is 1.44.
The nonperformance probabilities implied by Table 3 are higher than those observed in practice, and in most cases much higher. For example, A.M. Best studied impairments between 1969 and 2002 and found an average annual impairment frequency of 0.8% in the U.S. property/casualty industry (Best 2004).  use rating transitions for annuity providers and show that cumulative average impairment rates can be in the double digits when allowing for long time horizons of more than 10 years and for initial ratings of B++/B or worse. Insurers with an A/A-rating or better will have a cumulative average impairment rate of less than 10% even after 15 years. In this case, almost all combinations of loss probability and loading in Table 3 predict higher insurance demand under RDU than under EU. So while the substitution between utility curvature and overweighting can resolve commonly observed overinsurance puzzles, it does not provide a good explanation for underinsurance due to nonperformance risk. If anything, it exacerbates the puzzle because RDU predicts even higher insurance demand than EU in the presence of nonperformance risk despite the fact that the probability of the nonperformance state will typically be overweighted. Contrary to the case of underinsurance of LPHI risks, neglect or underestimation of rare events does not help here. If individuals neglect nonperformance risk altogether, their insurance demand should not react much at all. Peter and Ying (2020) propose uncertainty as a potential explanation for lower insurance demand due to nonperformance risk.

Relationship to prior literature
Our results complement and extend the literature on probability weighting and insurance demand. Several authors have explained a preference for full insurance at actuarially unfair premiums. One explanation is based on first-order risk aversion (Schmidt 1998;Segal and Spivak 1990;Schlesinger 1997), which can be accommodated by EU (Dionne and Li 2014), but arises more prominently in RDU with a concave or convex probability weighting function, see Segal and Spivak's (1990) Proposition 4. In a similar vein, Doherty and Eeckhoudt (1995) use Yaari's (1987) dual theory with a concave probability weighting function to explain full insurance at unfair premiums. By assumption, these papers rule out the commonly found pattern of inverse S-shaped probability weighting (e.g., Abdellaoui et al. 2011). Our focus on binary risks allows us to be more flexible about the probability weighting function. We are also able to provide a more detailed characterization of optimal insurance demand under probability weighting and to carve out the substitution between overweighting and utility curvature explicitly, including its determinants.
Others have investigated risk sharing in markets with aggregate uncertainty under the dual theory, under RDU, or under Schmeidler's (1989) more general Choquet Expected Utility. Schmidt (1999) characterizes efficient risk sharing under the dual theory with a concave probability weighting function. Chateauneuf et al. (2000), Tsanakas and Christofides (2006), Chakravarty and Kelsey (2015), and Carlier and Dana (2008) go beyond the dual theory but focus on convex probability weighting functions. Xia and Zhou (2016) require all individuals in the economy to have the same probability weighting function. Boonen and Ghossoub (2020) allow for inverse S-shaped probability weighting and for differences across individuals. Overall, the focus in this literature is on the characterization of Pareto optima and less on the comparative statics of insurance demand. There is usually no comparison of insurance demand for LPHI risks versus HPLI risks and no consideration of nonperformance risk.
Several empirical studies have used probability weighting to explain insurance demand. The identification strategy in Barseghyan et al. (2013) is based on a binary loss risk and approximates the utility function by a second-order Taylor expansion. Their focus is on the willingness to pay for insurance. Similar studies include Harrison and Ng (2018), Hansen et al. (2016), and Collier et al. (2021). We, in turn, derive new results about optimal insurance demand without approximating the utility function and without functional form assumptions on the probability weighting function. Contrary to the commonly held belief, results about willingness to pay do not automatically carry over to optimal demand. Chiu (2012) identifies comparability assumptions that allow him to leverage results about the effect of risk preferences on willingness to pay and apply them to the optimal demand for stochastic improvements. 20 He relies on EU. What these comparability assumptions look like under RDU is, to the best of our knowledge, unknown.
Our application of probability weighting to underinsurance problems is novel in the literature. Doherty and Schlesinger's (1990) model of nonperformance risk has been extended to recovery conditional on default (Mahul and Wright 2007), to the insurer-reinsurer relationship (Bernard and Ludkovski 2012), to risk management instruments other than insurance (Briys et al. 1991;Schlesinger 1993), to divergent beliefs about nonperformance risk (Cummins and Mahul 2003), and to endogenous default risk (Biffis and Millossovich 2012). All of these studies are based on EU. Wakker et al. (1997) study the willingness to pay for full insurance in the presence of nonperformance risk. Probability weighting then implies a stronger negative effect of nonperformance risk than EU (see also Segal 1988). Propositions 5(i) and 5(ii) show that this negative effect persists when considering optimal demand; however, when comparing demand levels between RDU and EU, the ranking can now go either way (see Proposition 5(iii) and 6). The more plausible case is higher insurance demand under RDU than under EU, as illustrated in Section 5.3. This discrepancy highlights that findings about willingness to pay may not be applicable to optimal demand. In reality, individuals not only choose whether to buy insurance but also how much to buy, because many contracts offer different levels of coverage.

Implications and conclusions
In this paper, we study the effect of probability weighting on optimal insurance demand. We investigate three established empirical findings. People overinsure modest risks, underinsure LPHI risks compared to HPLI risks, and underinsure in response to nonperformance risk. We are the first to formalize these different insurance demand problems in one efficient framework, which allows us to take a broader perspective on the merits and limitations of probability weighting. In the course of our analysis, we identify decreasing relative overweighting (DRO) as a useful local property of the probability weighting function and focus on loss probabilities in the DRO region for many of our results. Given its usefulness in the context of insurance demand, we anticipate that the DRO property may turn out to be helpful in other economic applications of probability weighting as well.
Many of our results are based on a simple model of insurance demand with a binary risk of loss. This allows us to be flexible in terms of the probability weighting function. We characterize optimal insurance demand under RDU and compare it to EU. Overweighting of the loss probability increases insurance demand, which leads to a substitution between overweighting and utility curvature. We derive determinants of the intensity of this substitution effect and explain how it underlies the descriptive appeal of probability weighting as a solution to the overinsurance puzzle of modest risks (Sydnor 2010). We also extend our results for coinsurance to deductible choice and show that they are robust. When it comes to underinsurance problems, however, probability weighting has little to offer. Just like EU, it predicts higher insurance demand for LPHI risks than for HPLI risks, and is therefore unable to explain the lack of insurance demand for LPHI risks (Browne et al. 2015). Finally, we investigate insurance demand in the presence of nonperformance risk with probability weighting. EU has been criticized for its inability to explain how strongly individuals react to even modest levels of nonperformance risk (Cole et al. 2013; Zimmer et al. 2018). Under plausible assumptions, RDU predicts higher insurance demand than EU in such contexts, and is therefore even further away from the evidence. Collectively, our results show that the prediction of higher insurance demand due to probability weighting carries over to situations where researchers have instead documented underinsurance.
The juxtaposition of these results reveals that probability weighting is best understood as an incomplete solution for insurance demand puzzles. Its descriptive appeal depends on whether the objective is to explain overinsurance or underinsurance. The results in our paper motivate further research in this area. For example, what are the properties of a particular insurance choice that activate or deactivate probability weighting as a preference motive? How and when does probability weighting interact with other drivers of insurance choices, such as reference dependence, subjective probabilities, and the use of heuristics? Insurance markets provide a real-world laboratory to test the descriptive merits of competing models of choice under risk, and the new results in this paper are a step forward toward a better understanding of the advantages and limitations of probability weighting as a descriptive theory of insurance demand behavior.

A.1 Common parametric classes of probability weighting functions
We use several classes of probability weighting functions to illustrate our results. We introduce them here for reference and note some of their features. 21 Many probability weighting functions have initially been developed for gain lotteries, and while differences between gain and loss domain estimates tend to be small (Tversky and Kahneman 1992), they do exist (Etchart-Vincent 2004). We start with the probability weighting function introduced by Goldstein and Einhorn (GE, 1987):
$$w_{GE}(p) = \frac{s\,p^{r}}{s\,p^{r} + (1-p)^{r}},$$
with $r, s > 0$ to ensure monotonicity. Examples are shown in Panel (a) of Fig. 4. The GE probability weighting function can be S-shaped ($r > 1$), inverse S-shaped ($r < 1$), concave ($r = 1$, $s > 1$) or convex ($r = 1$, $s < 1$), and the identity is obtained for $r = s = 1$. Parameter $r$ mostly controls curvature, while $s$ steers elevation.
The functional form proposed for Prospect Theory's rank-dependent version, Cumulative Prospect Theory (CPT), was introduced by Tversky and Kahneman (TK, 1992; see Panel (b) of Fig. 4):
$$w_{TK}(p) = \frac{p^{r}}{\left(p^{r} + (1-p)^{r}\right)^{1/r}},$$
with $r > 0.2793$ for monotonicity. It is inverse S-shaped for $r < 1$, reverts to the identity for $r = 1$, and is S-shaped for $1 < r \le 2$. 22 Parameter $r$ controls both curvature and elevation, but in opposite directions. A two-parameter version of the TK function was initially proposed by Wu and Gonzalez (WG, 1996) and is given by
$$w_{WG}(p) = \frac{p^{r}}{\left(p^{r} + (1-p)^{r}\right)^{s}},$$
with $r, s > 0$. 23 As can be seen in Panel (c) of Fig. 4, $s$ controls elevation while $r$ is inversely related to curvature. The function $w_{WG}$ is inverse S-shaped for $r < 1$, contains the identity as a special case for $r = 1$, and is S-shaped for $r > 1$ as long as $s > (r-1)/r$. 24
Prelec (1998) derives two probability weighting functions axiomatically. The one-parameter version (Prl-I) is given by
$$w_{Prl-I}(p) = \exp\left(-(-\ln p)^{r}\right),$$
with $r > 0$ for monotonicity. We obtain an inverse S-shape for $r < 1$ (see Panel (d) of Fig. 4), the identity for $r = 1$, and an S-shape for $r > 1$. This class of probability weighting functions intersects the identity at its inflection point at $p = p^* = 1/e$. It has fixed elevation and its curvature is inversely related to $r$. The two-parameter version (Prl-II, see Panel (e) of Fig. 4) is
$$w_{Prl-II}(p) = \exp\left(-s(-\ln p)^{r}\right),$$
and $r, s > 0$ guarantee monotonicity. This class of probability weighting functions controls elevation via $s$ and exhibits an inverse S-shape for $r < 1$ and an S-shape for $r > 1$. For $r = 1$, it nests the class of power weighting functions $w_{Pwr}(p) = p^{s}$ (not illustrated), which are either convex for $s > 1$, concave for $s < 1$, or revert to the identity for $s = 1$.
Another relevant class of probability weighting functions, which has received praise for its empirical appeal (see Wakker 2010), is the neo-additive class, with parameters $r \in (0, 1)$ and $s \in (-r, r)$. They are straight lines that are flatter than the identity and have discontinuities at both ends of the unit interval. Parameter $r$ inversely measures slope and $s$ controls elevation. We illustrate this class in Panel (f) of Fig. 4.
22 For $r \in (2, 2.6112)$, it has two inflection points and is convex-concave-convex. For $r \ge 2.6112$, it is convex.
23 For monotonicity, we need to set $s \le 1$ when $r > 1$ and $s < \bar{s}$ when $r < 1$. The upper bound $\bar{s}$ on $s$ is implicitly defined via s + ((1 − r)∕r) r + ((1 − r)∕r) r−1 s r = 1 . It is strictly increasing in $r$ from $\lim_{r \to 0} \bar{s}(r) = 2$ to $\lim_{r \to 1} \bar{s}(r) = \infty$.
24 For $r > 1$ and $s \le (r-1)/r$, it does not have an interior fixed point. In this case, it can be convex, S-shaped, or have multiple inflection points.
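The parametric classes can be compared numerically with a few lines of code. The following is a minimal sketch that assumes the commonly cited functional forms for the GE, TK, WG, and Prelec families; the function names and the parameter value 0.65 are illustrative choices, not taken from the paper, and the parametrization used in the paper may differ in detail.

```python
import math

# Assumed standard forms; p must lie strictly between 0 and 1.

def w_ge(p, r, s):
    """Goldstein-Einhorn form: s*p^r / (s*p^r + (1-p)^r)."""
    return s * p**r / (s * p**r + (1.0 - p)**r)

def w_tk(p, r):
    """Tversky-Kahneman one-parameter form."""
    return p**r / (p**r + (1.0 - p)**r) ** (1.0 / r)

def w_wg(p, r, s):
    """Two-parameter extension of TK; s = 1/r recovers w_tk."""
    return p**r / (p**r + (1.0 - p)**r) ** s

def w_prelec1(p, r):
    """Prelec one-parameter form with fixed point 1/e."""
    return math.exp(-((-math.log(p)) ** r))

def w_prelec2(p, r, s):
    """Prelec two-parameter form; r = 1 gives the power function p^s."""
    return math.exp(-s * (-math.log(p)) ** r)

if __name__ == "__main__":
    # Inverse S-shape: small probabilities overweighted, large ones underweighted.
    for p in (0.01, 0.05, 0.5, 0.9):
        print(p, round(w_tk(p, 0.65), 4), round(w_prelec1(p, 0.65), 4))
```

Setting r = 1 (and s = 1 for GE) returns the identity in each family, which gives a quick sanity check on any implementation.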

Decreasing relative overweighting
The DRO region plays an important role in our comparative static analysis. We will now derive the DRO region for various classes of probability weighting functions. Any inverse S-shaped probability weighting function with fixed point p* and inflection point p has two points p′ and p″, one smaller and one larger than p* and p, which define the likelihood insensitivity region. For p ∈ (p′, p″) the probability weighting function is flatter than the identity, w′(p) < 1, while it is steeper than the identity on (0, p′) ∪ (p″, 1), see Lemma 1 in Baillon et al. (2020). Likewise, any S-shaped probability weighting function has two points p′ and p″ such that w′(p) > 1 for p ∈ (p′, p″) and w′(p) < 1 for p ∈ (0, p′) ∪ (p″, 1). We will first establish some general results about DRO regions for probability weighting functions with no more than one change in curvature (inverse S-shaped, S-shaped, concave, convex, neo-additive), and then apply them to specific parametric classes.
Fig. 4 Examples of probability weighting functions
Proposition 7 Consider a probability weighting function w that is twice continuously differentiable on (0, 1) with no more than one inflection point.
(iii) If w is concave or neo-additive, the DRO region is (0, 1]; if w is convex, the DRO region is empty.
For parametric probability weighting functions, we obtain p as follows:
◻ DRO regions of inverse S-shaped probability weighting functions include all overweighted probabilities, all probabilities where w is concave, and some probabilities that are underweighted and where concavity does not hold. DRO regions can be quite large. For example, p Prl−I is negatively associated with r and lim r→1p Prl−I = exp(−1∕e) ≈ 0.6922 . When the Prelec-I function is inverse S-shaped (i.e., for r < 1 ), the DRO region includes at least all probabilities below 69.22% , and even larger probabilities the smaller the value of parameter r. A similar argument shows that the lowest p TK for inverse S-shaped TK weighting functions is 0.75 so that DRO regions in this case include at least all probabilities up to 75% , and contain probabilities above 75% for values of r below 1. Both of the upper bounds are in a region where probabilities are underweighted.
This suggests that our focus on loss probabilities in the DRO region is a mild assumption. Figure 5 plots p for various parameter values of inverse S-shaped probability weighting functions. With sufficient elevation (high s in GE or low s in WG and Prl-II), the DRO region is large. For almost all r and s parameter values reported in the literature (see Table 5 in Stott 2006), nearly all loss probabilities relevant in insurance lie in the DRO region, the main exception being maybe the probability to claim on a health insurance policy (e.g., Bhargava et al. 2017).
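The size of the DRO region can be checked numerically. The sketch below is an illustration under stated assumptions: it uses the standard Prelec-I form, treats DRO at p as the condition that the elasticity p·w′(p)/w(p) is below one, and locates the upper bound of the DRO region by bisection. The closed form exp(−r^(1/(1−r))) is our own derivation from that elasticity condition for r < 1, and all function names are hypothetical.

```python
import math

def prelec1(p, r):
    """Prelec one-parameter weighting function (assumed standard form)."""
    return math.exp(-((-math.log(p)) ** r))

def elasticity(w, p, h=1e-6):
    """Numerical elasticity p*w'(p)/w(p) using a central difference."""
    dw = (w(p + h) - w(p - h)) / (2.0 * h)
    return p * dw / w(p)

def dro_upper_bound(w, lo=1e-4, hi=1.0 - 1e-4, tol=1e-10):
    """Bisection for the p at which the elasticity crosses one.

    Assumes a single crossing with elasticity < 1 at lo and > 1 at hi,
    which holds for Prelec-I with r < 1.
    """
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if elasticity(w, mid) < 1.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def prelec1_bound_closed_form(r):
    """Solves r * (-ln p)^(r-1) = 1 for p when r < 1."""
    return math.exp(-(r ** (1.0 / (1.0 - r))))

if __name__ == "__main__":
    for r in (0.3, 0.5, 0.65, 0.9):
        numeric = dro_upper_bound(lambda p: prelec1(p, r))
        print(r, round(numeric, 4), round(prelec1_bound_closed_form(r), 4))
```

As r approaches one from below, the closed form approaches exp(−1/e), in line with the lower limit of roughly 69.22% stated in the text.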
Based on our own experimental data (see Section 5), we calibrate the different classes of probability weighting functions and calculate the upper bound p of the DRO region for each subject under the given function. Table 4 summarizes the distribution of p for subjects exhibiting inverse S-shaped probability weighting. Even the lowest 10th

Appendix B: Mathematical proofs
Determining the Loading Thresholds in Equation (7)
The first-order condition for objective function (3) is
$$\frac{\partial W}{\partial \alpha} = -mpL\left[(1-w(p))\,u'(x_1^*) + w(p)\,u'(x_2^*)\right] + w(p)\,L\,u'(x_2^*) = 0, \tag{B.1}$$
where $x_1^*$ and $x_2^*$ denote final wealth in the no-loss and the loss state at the optimal level of insurance demand $\alpha_W^*$. The second-order condition is satisfied because the objective function is globally concave in $\alpha$ due to $u'' < 0$. The first-order expression is then strictly decreasing in $\alpha$. We derive the full-insurance bound and the no-insurance bound by solving $\partial W/\partial\alpha|_{\alpha=1} \ge 0$ and $\partial W/\partial\alpha|_{\alpha=0} \le 0$ for $m$. This yields Equation (7). Diminishing marginal utility implies $\underline{m}_W < \overline{m}_W$.

Proof of Proposition 1 and Additional Comparative Statics
The full-insurance bound $\underline{m}_W$ can be rewritten as $1 + (w(p) - p)/p$, which is increasing in the absolute amount of overweighting. We rearrange the no-insurance bound as follows: Therefore, $\overline{m}_W$ increases in the absolute amount of overweighting. This shows result 1(i).
For 1(ii), we need to compare insurance demand under EU and under RDU. When the individual overweights the loss probability, we know from 1(i) that $\underline{m}_U < \underline{m}_W$ and $\overline{m}_U < \overline{m}_W$. We distinguish different cases for the loading factor. If $m \le \underline{m}_U$, full coverage is optimal under EU and under RDU. If $\underline{m}_U < m \le \underline{m}_W$, less than full coverage is optimal under EU, whereas full coverage is optimal under RDU. If $m \in (\underline{m}_W, \overline{m}_U)$, 25 partial insurance is optimal under EU and under RDU but insurance demand is higher under RDU because the first-order expression under RDU is strictly positive when evaluated at the optimal EU demand. If $\overline{m}_U \le m < \overline{m}_W$, no insurance is optimal under EU but partial insurance is optimal under RDU. For $m \ge \overline{m}_W$ no insurance is optimal under both EU and RDU. So the individual always purchases at least as much insurance under RDU as under EU, and strictly more for $m \in (\underline{m}_U, \overline{m}_W)$. The reasoning for underweighting is analogous.
To show results 1(iii) and 1(iv), we first provide comparative statics with respect to the loss probability $p$. We rewrite the full-insurance bound as $\underline{m}_W = 1 + (w(p) - p)/p$, that is, one plus relative overweighting, which is decreasing in $p$ for loss probabilities in the DRO region. For the effect on $\overline{m}_W$, consider the numerator of $\partial\overline{m}_W/\partial p$, which after some rearrangement is given by The first square bracket is negative because DRO at $p$ is equivalent to the probability weighting function being inelastic at $p$, that is, $pw'(p) < w(p)$. The second square bracket is negative due to diminishing marginal utility. Hence, an increase in the loss probability decreases $\overline{m}_W$. The ratio $\overline{m}_W/\underline{m}_W$ is decreasing in $p$, regardless of whether $p$ is in the DRO region or not, because An increase in $p$ raises $w(p)$, which increases the denominator so that $\overline{m}_W/\underline{m}_W$ decreases. If $p$ is in the DRO region, the range of loadings where partial insurance is optimal decreases in $p$. This follows from $\overline{m}_W - \underline{m}_W = \underline{m}_W \cdot \left(\overline{m}_W/\underline{m}_W - 1\right)$ because both factors are decreasing in $p$.
These results imply that $\underline{m}_W/\underline{m}_U$ is decreasing in $p$ for $p$ in the DRO region because $\underline{m}_U = 1$ and $\underline{m}_W$ is decreasing in $p$ under DRO. This shows (iii). For (iv), we compute Taking the derivative of $\overline{m}_W/\overline{m}_U$ with respect to $p$ and rearranging its numerator yields: For $p$ in the DRO region, the square bracket is less than $w(p)(w(p) - p)(u'(x) - u'(x - L))$, which is negative for $w(p) > p$. Then, a lower loss probability raises $\overline{m}_W/\overline{m}_U$.
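The case distinction in the proof can be illustrated numerically. The sketch below is not the authors' code: it assumes a binary loss with final wealth W0 − α·m·p·L in the no-loss state and W0 − α·m·p·L − (1 − α)·L in the loss state, log utility, and a Prelec-I weighting function with r = 0.65. All names and parameter values are hypothetical choices made for illustration.

```python
import math

def prelec1(p, r=0.65):
    """Prelec one-parameter weighting function (illustrative choice)."""
    return math.exp(-((-math.log(p)) ** r))

def log_marginal_utility(x):
    """u'(x) for u(x) = ln(x)."""
    return 1.0 / x

def loading_bounds(wp, p, L, W0, du):
    """Return (full-insurance bound, no-insurance bound) for loss-state weight wp."""
    m_full = wp / p
    m_none = wp * du(W0 - L) / (p * ((1.0 - wp) * du(W0) + wp * du(W0 - L)))
    return m_full, m_none

def optimal_alpha(wp, p, L, W0, m, du):
    """Maximize (1-wp)*u(x1) + wp*u(x2) over alpha in [0, 1]."""
    def foc(a):
        x1 = W0 - a * m * p * L      # wealth without a loss
        x2 = x1 - (1.0 - a) * L      # wealth after a loss
        return -m * p * L * ((1.0 - wp) * du(x1) + wp * du(x2)) + wp * L * du(x2)
    if foc(1.0) >= 0.0:
        return 1.0                   # full insurance
    if foc(0.0) <= 0.0:
        return 0.0                   # no insurance
    lo, hi = 0.0, 1.0
    for _ in range(100):             # bisection: foc is decreasing in alpha
        mid = 0.5 * (lo + hi)
        if foc(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

if __name__ == "__main__":
    p, L, W0 = 0.05, 50.0, 100.0
    print("EU bounds ", loading_bounds(p, p, L, W0, log_marginal_utility))
    print("RDU bounds", loading_bounds(prelec1(p), p, L, W0, log_marginal_utility))
    for m in (0.9, 1.5, 2.2, 3.0, 5.0):
        a_eu = optimal_alpha(p, p, L, W0, m, log_marginal_utility)
        a_rdu = optimal_alpha(prelec1(p), p, L, W0, m, log_marginal_utility)
        print(m, round(a_eu, 3), round(a_rdu, 3))
```

Setting the loss-state weight equal to p recovers EU, so the same routine serves both models. With an overweighted loss probability, demand under RDU is never below demand under EU across the loadings tried.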

Proof of Proposition 2
We rewrite the first-order condition (B.1) for an interior solution as follows: We can then solve for $\alpha_W^*$ when the utility function is exponential or iso-elastic. For exponential utility, we have $u(x) = (1 - e^{-Ax})/A$ with $A > 0$, and optimal insurance demand is given by Solving for utility curvature $A$ yields with the following derivatives: For iso-elastic utility, we have $u'(x) = x^{-\gamma}$ with $\gamma > 0$. Optimal insurance demand is then given as follows: Solving for the utility curvature parameter $\gamma$ renders This differs from (B.10) only by a non-negative factor that does not depend on the amount of overweighting $w(p) - p$. Therefore, the first and second derivatives of $\gamma$ with respect to the amount of overweighting have the same sign as for exponential utility. This proves 2(i).
For 2(ii), note that the derivative of $A$ with respect to the amount of overweighting is more negative the higher $\alpha_W^*$, the lower $p$ as long as $w(p) < 0.5$, and the lower $L$. When the second derivative of $A$ with respect to the amount of overweighting is positive, it is more positive the higher $\alpha_W^*$, the lower $p$, and the lower $L$. For exponential utility, this is directly evident from Equations (B.11) and (B.12). For iso-elastic utility, $\ln(x_1^*/x_2^*)$ is decreasing in $\alpha_W^*$, increasing in $p$, and increasing in $L$, so 2(ii) follows from (B.11) and (B.12) with a similar argument.
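The substitution between overweighting and utility curvature can be made concrete for exponential utility. The sketch below is our own derivation under the assumption that the interior first-order condition takes the form m·p·[(1 − w)·u′(x1) + w·u′(x2)] = w·u′(x2) with x1 − x2 = (1 − α)·L; it is not copied from the paper's equations, and the names and numbers are hypothetical.

```python
import math

def _ratio(wp, p, m):
    """m*p*(1-w) / ((1-m*p)*w); an interior optimum needs this to exceed one."""
    return m * p * (1.0 - wp) / ((1.0 - m * p) * wp)

def alpha_star(wp, p, L, m, A):
    """Interior optimal coinsurance rate for u'(x) = exp(-A*x)."""
    return 1.0 - math.log(_ratio(wp, p, m)) / (A * L)

def implied_A(alpha, wp, p, L, m):
    """Absolute risk aversion that rationalizes an observed interior alpha."""
    return math.log(_ratio(wp, p, m)) / ((1.0 - alpha) * L)

def foc(alpha, wp, p, L, m, A, W0):
    """First-order expression; it should vanish at alpha_star."""
    du = lambda x: math.exp(-A * x)
    x1 = W0 - alpha * m * p * L
    x2 = x1 - (1.0 - alpha) * L
    return -m * p * L * ((1.0 - wp) * du(x1) + wp * du(x2)) + wp * L * du(x2)

if __name__ == "__main__":
    p, L, m, A, W0 = 0.05, 50.0, 3.0, 0.02, 100.0
    a = alpha_star(0.13, p, L, m, A)
    print("alpha*:", round(a, 4), "FOC residual:", foc(a, 0.13, p, L, m, A, W0))
    # Less overweighting requires more curvature to explain the same demand.
    for wp in (0.10, 0.13, 0.16):
        print(wp, round(implied_A(a, wp, p, L, m), 5))
```

Under these assumptions the derivative of the implied curvature with respect to w(p) equals −1/(w(p)·(1 − w(p))·(1 − α)·L), which is more negative for higher α, lower L, and lower w(p) as long as w(p) < 0.5, matching the qualitative statements in 2(ii).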

Proof of Proposition 3
After some rearrangements and simplifications, the first-order condition for objective function (9) is given as follows: For $D = 0$, we obtain $P(0) = mp\int_0^L \ell\, dF(\ell)$ for the premium and hence This is nonpositive for $m \le w(p)/p$, indicating that raising the deductible above zero would lower the value of the objective function. Hence, full insurance is optimal. For no insurance, we set $D = L$ and obtain $P(L) = 0$. Notice that because $F(L) = 1$ and $w(0) = 0$. Rewrite the first-order expression as follows: When $m > \overline{m}_W$, the square bracket is strictly positive for $D = L$. Due to continuity, it remains positive for values of $D$ slightly below $L$, indicating that the value of the objective function can be increased by raising the deductible. This establishes that no insurance is optimal, $D_W^* = L$. Conversely, when $m < \overline{m}_W$, the square bracket is strictly negative for $D = L$. Due to continuity, it remains negative for values of $D$ slightly below $L$, and the objective function can be increased by lowering the deductible. Therefore, some insurance is optimal, $D_W^* < L$. Again by continuity, this implies that $D_W^* = L$ for $m = \overline{m}_W$ because otherwise the optimal deductible level would exhibit a jump at $m = \overline{m}_W$.
Given that the full-insurance bound is the same as in the coinsurance problem, result 3(i) follows immediately. For result 3(ii), $\lim_{p \to 0} w'(p) = \infty$ implies $\overline{m}_W = \infty$, which exceeds $\overline{m}_U$. When $w'(0) < \infty$, we obtain that $\overline{m}_W \ge \overline{m}_U$ if and only if Overweighting of the loss probability and concavity of the probability weighting function on $(0, p]$ imply that $w'(0) > 1$. 26 This together with overweighting of $p$ implies that the first square bracket is positive. The second square bracket is non-negative because $p(1 - F(\ell))$ is in the concave portion of $w$ so that For result 3(iii), we first characterize the optimal deductible $D_U^*$ under EU via the corresponding first-order condition, which is given by When inserting $D_U^*$ into the first-order expression under RDU, we can utilize the above condition and obtain the following: The first square bracket is negative if and only if Overweighting of $p$ and concavity up to $p$ of the probability weighting function imply that $(0, p]$ is part of the DRO region. In this case, the left-hand side of (B.22) exceeds $w(p)/p$, which exceeds the right-hand side of (B.22) because $p$ is overweighted. The second square bracket in (B.21) is also negative because the probability weighting function is concave on $(0, p]$. As a result, we obtain that This indicates that probability weighting induces the individual to lower the optimal deductible, which is equivalent to higher insurance demand in the deductible choice problem.
26 If we had $w'(0) \le 1$ for a probability weighting function that is concave on $(0, p]$, then $w(p) = \int_0^p w'(t)\,dt \le w'(0)\int_0^p$

Proof of Proposition 4
We fix the expected loss by setting $pL = \text{const}$. Then, $(dp/dp) \cdot L + p \cdot (dL/dp) = 0$ so that $dL/dp = -L/p$. By rewriting $\underline{m}_W$ as $1 + (w(p) - p)/p$, we can see that $\underline{m}_W$ increases following a mean-preserving spread of the insured risk as long as the loss probability remains in the DRO region. Under the binary risk assumption, a mean-preserving spread reduces $p$ and simultaneously increases $L$. We calculate this derivative explicitly: the derivative of relative overweighting with respect to $p$ is $(pw'(p) - w(p))/p^2$, so that DRO at $p$ is equivalent to $pw'(p) < w(p)$.
We then take the derivative of $\overline{m}_W$ with respect to $p$ while using $dL/dp = -L/p$. After some simplifications, the numerator is given by: The first term is negative due to $u'' < 0$. The sign of the second term depends on the square bracket, which is negative if $pw'(p) < w(p)$. So for loss probabilities in the DRO region, $\overline{m}_W$ increases after a mean-preserving spread of the insured risk.
For optimal insurance demand $\alpha_W^* \in (0, 1)$, recall first-order condition (B.1). The implicit function theorem and the concavity of the objective function imply that the sign of $\partial\alpha_W^*/\partial p\,|_{pL=\text{const}}$ coincides with the sign of $\partial^2 W/\partial\alpha\,\partial p\,|_{pL=\text{const}}$. This cross-derivative is given by: The five terms represent different economic effects on insurance demand when the insured risk becomes riskier. 27 To determine the net effect, we solve for $u'(x_1^*)$ from the first-order condition, substitute, rearrange and combine terms to obtain: The second term is negative because $u'' < 0$. The first term is negative if DRO holds at $p$. To see this, recall that an interior solution requires $m > \underline{m}_W = w(p)/p$. The square bracket in (B.28) is then no larger than This is negative for $p$ in the DRO region. Hence, optimal insurance demand increases following a mean-preserving spread of the insured risk because such a change lowers $p$.
27 A lower $p$ decreases the marginal cost of insurance because there is less weight on the loss state (first term); a larger $L$ increases the marginal cost of insurance because it lowers final wealth in the loss state (second term); a lower $p$ reduces the marginal benefit of insurance because there is less weight on the loss state (third term); a larger $L$ increases the marginal benefit of insurance because it raises the dollar amount per unit of coverage (fourth term); a larger $L$ increases the marginal benefit of insurance because it reduces final wealth in the loss state (fifth term).
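The effect of a mean-preserving spread can be illustrated with a small numerical experiment. The sketch below holds the expected loss p·L fixed while lowering p, and solves for the optimal coinsurance rate under RDU. It assumes log utility, a Prelec-I weighting function with r = 0.65, and the binary wealth structure x1 = W0 − α·m·p·L, x2 = x1 − (1 − α)·L; these choices and all names are hypothetical and serve only as an illustration.

```python
import math

def prelec1(p, r=0.65):
    """Prelec one-parameter weighting function (illustrative choice)."""
    return math.exp(-((-math.log(p)) ** r))

def optimal_alpha(p, L, W0, m, weight, du):
    """Optimal coinsurance rate in [0, 1] for a binary loss under RDU."""
    wp = weight(p)
    def foc(a):
        x1 = W0 - a * m * p * L      # wealth without a loss
        x2 = x1 - (1.0 - a) * L      # wealth after a loss
        return -m * p * L * ((1.0 - wp) * du(x1) + wp * du(x2)) + wp * L * du(x2)
    if foc(1.0) >= 0.0:
        return 1.0
    if foc(0.0) <= 0.0:
        return 0.0
    lo, hi = 0.0, 1.0
    for _ in range(100):             # bisection: foc is decreasing in alpha
        mid = 0.5 * (lo + hi)
        if foc(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

if __name__ == "__main__":
    expected_loss, W0, m = 2.5, 200.0, 3.0
    for p in (0.10, 0.05, 0.025):    # all within the DRO region for r = 0.65
        L = expected_loss / p        # keeps p * L constant
        a = optimal_alpha(p, L, W0, m, prelec1, lambda x: 1.0 / x)
        print(p, L, round(a, 3))
```

In this parametrization, demand moves from no insurance to partial and then to full insurance as the loss becomes rarer and larger, which is the direction the proposition predicts for loss probabilities in the DRO region.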
Proof of Proposition 5
The first-order condition for optimal insurance demand in the presence of nonperformance risk is
$$\frac{\partial W_{NP}}{\partial \alpha} = -mpqL\left[\pi_1 u'(x_1^*) + \pi_2 u'(x_2^*) + \pi_3 u'(x_3^*)\right] + \pi_2 L u'(x_2^*) = 0, \tag{B.30}$$
where $\pi_1 = 1 - w(p)$, $\pi_2 = w(p) - w(p(1-q))$, and $\pi_3 = w(p(1-q))$ are the decision weights of the three states. In (B.31), the first round bracket is positive due to diminishing marginal utility; the second round bracket is positive because the DRO region contains $(0, p]$ so that $w(\delta)/\delta$ is decreasing for $\delta \le p$ and $w'(0) = \lim_{\delta \to 0} w(\delta)/\delta$. Also, $\lim_{q \to 1} d\underline{m}_W/dq = \infty$ whenever $\lim_{p \to 0} w'(p) = \infty$. In any case, starting at no nonperformance risk ($q = 1$), any small decrease of $q$ reduces $\underline{m}_W$.
The threshold $\overline{m}_W$ is implicitly defined via $\partial W_{NP}/\partial\alpha|_{\alpha=0} = 0$. To determine the effect of nonperformance risk at the margin, we interpret $\partial W_{NP}/\partial\alpha|_{\alpha=0}$ as a function of $m$ and $q$, apply the implicit function theorem to obtain $dm/dq$, and evaluate it at $q = 1$. Define then direct computation shows that For $q = 1$, we know from Equation (7) that As a result, which is infinite whenever $\lim_{p \to 0} w'(p) = \infty$. If $w'(0)$ is finite, the round bracket is positive because we assumed that $(0, p]$ is part of the DRO region. Therefore, starting at no nonperformance risk ($q = 1$), any small decrease of $q$ reduces $\overline{m}_W$. This shows result 5(i).
For $q = 1$, this expression simplifies to
$$\frac{\partial^2 W_{NP}}{\partial \alpha\, \partial q}\bigg|_{q=1} = -mpL\left[(1-w(p))u'(x_1^*) + w(p)u'(x_2^*)\right] + \alpha_W^* m^2 p^2 L^2\left[(1-w(p))u''(x_1^*) + w(p)u''(x_2^*)\right] - mp^2 L w'(0)\left[u'(x_2^*) - u'(x_3^*)\right] - \alpha_W^* mpL^2 w(p)u''(x_2^*) + pLw'(0)u'(x_2^*).$$
The cross-derivative is infinite when $\lim_{p \to 0} w'(p) = \infty$. If $w'(0)$ is finite, we can use the first-order condition at $q = 1$, which is given by
$$mpL\left[(1-w(p))u'(x_1^*) + w(p)u'(x_2^*)\right] = w(p)Lu'(x_2^*), \tag{B.45}$$
and substitute it into $\partial^2 W_{NP}/\partial\alpha\,\partial q\,|_{q=1}$. After some rearrangements, this yields the following: The first square bracket is positive because the DRO region contains $(0, p]$, the second square bracket is positive due to $u'' < 0$, and the third square bracket is nonnegative because of non-increasing absolute risk aversion. As a result, the cross-derivative is positive at $q = 1$, and potentially infinite, indicating that any small level of nonperformance risk reduces optimal insurance demand.
To prove result 5(iii), we need to assess the effect of probability weighting on optimal insurance demand at a given level of nonperformance risk. Let $\alpha_{U,NP}^*$ denote optimal insurance demand under EU. It is characterized by the following first-order condition: Solving for $u'(x_2^*)$ yields When evaluating the individual's first-order expression under RDU at $\alpha_{U,NP}^*$ and substituting condition (B.48), we obtain the following after some simplifications and rearrangements: In the absence of nonperformance risk (i.e., for $q = 1$), this simplifies to which is positive because of $w(p) > p$. Probability weighting increases insurance demand in the absence of nonperformance risk, as shown in Proposition 1(ii). For the remainder of the proof, let $q \in (0, 1)$. Our assumptions ensure that $(0, p]$ is part of the DRO region. Therefore, $w(p(1-q))/(p(1-q)) > w(p)/p$, which is equivalent to $pq\,\pi_3 - p(1-q)\,\pi_2 > 0$. We can then rearrange (B.49) as follows: The sign coincides with the sign of the curly bracket. Nonperformance risk implies $x_3^* < x_1^*$ so that $u'(x_3^*)/u'(x_1^*) > 1$ due to $u'' < 0$. We denote the first fraction in the curly bracket by $f(q)$ and examine its behavior on $(0, 1)$ as a function of $q$. For $q \to 1$, the numerator of $f(q)$ converges to $(w(p) - p)$, which is positive because the loss probability is overweighted. The denominator of $f(q)$ converges to 0 from above due to DRO so that $\lim_{q \to 1} f(q) = \infty$. For $q \to 0$, we obtain from L'Hôpital's rule that which may or may not exceed unity depending on whether the probability weighting function is steeper or flatter than the identity at the loss probability. For $q \in (0, 1)$, the derivative of $f(q)$ with respect to $q$ is given by The numerator is non-negative because the loss probability is overweighted and the weighting function is assumed to be concave up to the loss probability $p$. Indeed, the curly bracket in (B.53) can be rearranged to (B.49) .