Violations of coalescing in parametric utility measurement

The majority consensus in the empirical literature is that probability weighting functions are typically inverse-S shaped, that is, people tend to overweight small and underweight large probabilities. A separate stream of literature has reported event-splitting effects (also called violations of coalescing) and shown that they can explain violations of expected utility. This leads to the questions whether (1) the observed shape of weighting functions is a mere consequence of the coalesced presentation and, more generally, whether (2) preference elicitation should rely on presenting lotteries in a canonical split form instead of the commonly used coalesced form. We analyze data from a binary choice experiment where all lottery pairs are presented in both split and coalesced forms. Our results show that the presentation in a split form leads to a better fit of expected utility theory and to probability weighting functions that are closer to linear. We thus provide some evidence that the extent of probability weighting is not an ingrained feature, but rather a result of processing difficulties.


Introduction
Experiments on decision making under risk mostly employ a coalesced presentation of lotteries, i.e., branches which lead to the same consequences are combined and the respective probabilities are added up. However, presenting gamble pairs in a canonical split form makes them easier to compare and process for the decision maker since, in the case of binary choice, both gambles involve the same set of probabilities. For illustration, consider the classic paradox of Allais (1953), also termed common consequence effect, where M$ denotes millions of dollars. Figure 1 presents the Allais paradox in the commonly used coalesced form. Here, subjects [...] weighting function in their utility representation. Nowadays, the majority consensus in the literature is that this function is typically inverse-S shaped, i.e., small probabilities are overweighted, whereas large ones are underweighted (Wu and Gonzalez 1996; Gonzalez and Wu 1999; Abdellaoui 2000; Bleichrodt et al. 2001; although not unanimously, see, e.g., Hertwig 2012 for a critique). The evidence has mainly been derived using coalesced presentation of the lotteries. We analyze how the shape of the probability weighting function differs if the gamble pairs are presented in a split form instead. Since probabilities are easier to compare in split form, an absence or diminished extent of the typical non-linear shape in this form could indicate that the previous evidence mostly reflects difficulties in processing probabilities instead of an ingrained non-linear weighting of probabilities.
Question 2 is related but more general. As violations of EU decrease under the split form, it could be more suitable for elicitation of von Neumann-Morgenstern utility functions in EU. The split form could thus improve prescriptive decision analysis, as the assessment of von Neumann-Morgenstern functions is central in this context (von Winterfeldt and Edwards 1986; McCord and de Neufville 1986; Fischoff 1991; Bleichrodt et al. 2001).
Our analyses of both questions are parametric: we fit the parameters of EU and of rank-dependent utility (RDU), with the latter corresponding to the gain-domain parameters of cumulative prospect theory (CPT). Firstly, to answer Question 2, we ask whether the fit of EU improves if we use choice data from split lotteries. We then extend this analysis from EU to RDU. To keep the analysis manageable, we restrict our attention to pure gain gambles.
To date, only limited work has examined what intelligible impact, if any, failing to account for splitting effects has in the RDU framework. Indeed, little is known about the impact of splitting effects on the estimated values of the central parameters of RDU or on the model fit. Real-life gambles do not always occur in a split form, but explicitly presenting them in a split form could improve the fit of RDU. Moreover, it could alter the features of the probability weighting function, namely, the magnitude of the usually observed non-linear shape, which brings us back to Question 1. And this is indeed a result that we find: The split form significantly improves prescriptive decision analysis.
The paper is organized as follows. We discuss the theoretical background and related literature in Sect. 2. We lay out the experimental design and estimation approaches in Sect. 3 and present the results in Sect. 4. Finally, we discuss the limitations and implications for future research in Sect. 5 and 6.

Expected utility, rank-dependent utility and cumulative prospect theory
We consider a set of real-valued outcomes $X$. $\mathcal{P}$ denotes the set of all gambles or lotteries over $X$. A gamble $P \in \mathcal{P}$ satisfies the axioms of Kolmogoroff (1933), i.e., $0 \le p(x_i) \le 1 \ \forall x_i \in X$ and $p(X) = 1$. Preferences of the decision maker are formalized by a binary relation $\succsim \ \subseteq \mathcal{P} \times \mathcal{P}$. A function $V: \mathcal{P} \to \mathbb{R}$ represents $\succsim$ on $\mathcal{P}$ if and only if $P \succsim Q \Leftrightarrow V(P) \ge V(Q)$. For a gamble with $n$ possible outcomes, the preferences in EU can be represented by

$$V(P) = \sum_{i=1}^{n} p(x_i)\, u(x_i),$$

where $u$ is the von Neumann-Morgenstern utility function. In parametric analysis, $u$ is commonly assumed to be a power function $u(x_i) = x_i^{\alpha}$, where $1 - \alpha$ is the coefficient of relative risk aversion.
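As a minimal sketch of the EU representation with power utility, the following Python snippet evaluates a gamble given as a list of (outcome, probability) branches. The function names, the illustrative gambles and the value $\alpha = 0.5$ are assumptions chosen for illustration only; they are not the estimates or the code used in this paper.

```python
def power_utility(x, alpha):
    """Power utility u(x) = x**alpha for a non-negative outcome x."""
    return x ** alpha


def expected_utility(gamble, alpha):
    """Expected utility of a gamble given as a list of (outcome, probability) branches."""
    total_prob = sum(p for _, p in gamble)
    if abs(total_prob - 1.0) > 1e-9:
        raise ValueError("branch probabilities must sum to 1")
    return sum(p * power_utility(x, alpha) for x, p in gamble)


# Illustrative gambles (hypothetical inputs): the same lottery written
# in coalesced form and with its best branch split into two branches.
coalesced = [(40, 0.2), (2, 0.8)]
split = [(40, 0.1), (40, 0.1), (2, 0.8)]

eu_coalesced = expected_utility(coalesced, alpha=0.5)  # alpha = 0.5 is an arbitrary choice
eu_split = expected_utility(split, alpha=0.5)
```

Because EU is linear in probabilities, both presentations of the lottery receive the same value (up to floating-point rounding), so EU by itself cannot generate event-splitting effects.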
In the CPT framework, outcomes of a gamble are ordered in an increasing order $x_1 \le \dots \le x_k \le 0 \le x_{k+1} \le \dots \le x_n$ and preferences can be represented by a sum of two RDU functionals

$$V(P) = \sum_{i=1}^{k} \pi_i^{-}\, v(x_i) + \sum_{i=k+1}^{n} \pi_i^{+}\, v(x_i),$$

where sign dependence, reference dependence and rank dependence are all satisfied (Tversky and Kahneman 1992). In parametric analysis, the value function is mostly taken as a two-part power function:

$$v(x) = \begin{cases} x^{\alpha} & \text{if } x \ge 0, \\ -\lambda\,(-x)^{\beta} & \text{if } x < 0. \end{cases}$$

It assigns a number $v(x)$ to each outcome $x$ to describe the subjective value of the outcome relative to a reference point. The reference-dependent S-shaped value function with $v(0) = 0$ firstly exhibits diminishing sensitivity to gains and losses, such that the function is concave (with $0 < \alpha < 1$ exhibiting risk aversion for gains) or convex (with $0 < \beta < 1$ exhibiting risk seeking behavior for losses), respectively. Secondly, the value function implies loss aversion (when $\lambda > 1$) in that a loss of a given amount has more impact on the attractiveness of a prospect than a gain of an equivalent amount: $-v(-x) > v(x)$ for all $x > 0$ (Kahneman and Tversky 1979; Tversky and Kahneman 1991). Note, however, that we focus exclusively on the gain domain in our analyses.
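The decision weights $\pi^{+}$ for the cumulative probabilities of positive outcomes in (1) are defined by

$$\pi_n^{+} = w^{+}(p_n), \qquad \pi_i^{+} = w^{+}(p_i + \dots + p_n) - w^{+}(p_{i+1} + \dots + p_n) \quad \text{for } k < i < n.$$

The probability weighting function $w^{+}$ is strictly increasing and continuous. It is defined for the whole probability domain $[0, 1]$ and satisfies $w^{+}(0) = 0$ and $w^{+}(1) = 1$.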
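The rank-dependent decision weights for gains can be sketched in Python as follows. The helper names and the illustrative gamble are assumptions made for this sketch; the weighting function used here is the single-parameter form of Tversky and Kahneman (1992) with an arbitrary illustrative parameter value of 0.61, not an estimate from our data.

```python
def decision_weights(probs, w):
    """Rank-dependent decision weights for gains.

    probs: branch probabilities, ordered from the lowest to the highest outcome.
    w:     probability weighting function with w(0) = 0 and w(1) = 1.
    Each weight is w(prob. of this or a better outcome) - w(prob. of a strictly better outcome).
    """
    weights = []
    tail = 0.0  # cumulative probability of strictly better outcomes
    for p in reversed(probs):
        weights.append(w(tail + p) - w(tail))
        tail += p
    weights.reverse()
    return weights


def tk_weight(p, gamma=0.61):
    """Single-parameter weighting function of Tversky and Kahneman (1992); gamma is illustrative."""
    return p ** gamma / (p ** gamma + (1 - p) ** gamma) ** (1 / gamma)


# Hypothetical gamble with outcomes (2, 2, 98) in split form and (2, 98) in coalesced form.
split_weights = decision_weights([0.8, 0.1, 0.1], tk_weight)
coalesced_weights = decision_weights([0.9, 0.1], tk_weight)

# Total weight attached to the outcome 2 under each presentation.
weight_on_low_split = split_weights[0] + split_weights[1]
weight_on_low_coalesced = coalesced_weights[0]
```

Since the weights are differences of the weighting function at cumulative probabilities, the weights of adjacent branches with the same outcome telescope: the total weight on the low outcome is the same in both presentations (up to rounding). This is the sense in which the rank-dependent construction satisfies coalescing for any weighting function.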
In their work, Tversky and Kahneman (1992) propose fitting the data to the following single-parameter functional form of the probability weighting function:

$$w(p) = \frac{p^{\gamma}}{\left(p^{\gamma} + (1-p)^{\gamma}\right)^{1/\gamma}}.$$

The typically observed inverse-S shape of this weighting function (henceforth TKW) exhibits overweighting of small probabilities (up to the crossover point where $w(p) = p$) and underweighting of large probabilities. Furthermore, to contribute to the empirical tractability of the model, we consider additional parametric specifications of the probability weighting function. For example, the two-parameter linear-in-log-odds specification introduced by Goldstein and Einhorn (1987) has been claimed to be the most commonly used specification of the probability weighting function (Booij et al. 2010). This function (henceforth GEW) is given by

$$w(p) = \frac{\delta p^{\gamma}}{\delta p^{\gamma} + (1-p)^{\gamma}},$$

where two parameters independently, instead of one, explain the shape of the weighting function. Namely, the $\gamma$ parameter (which is usually assumed to satisfy $0 < \gamma < 1$ to maintain the inverse-S shape) controls the curvature of the function and thus serves as a discriminability indicator, while the $\delta$ parameter² ($\delta > 0$) explains the elevation of the function and thus serves as an attractiveness indicator (Tversky and Kahneman 1992; Gonzalez and Wu 1999). Given that the two properties often do not covary, this specification can offer a substantial advancement in relation to the Tversky and Kahneman (1992) specification outlined above (Gonzalez and Wu 1999; Booij et al. 2010). We should note, however, that the exact values of the named weighting parameter estimates are subject to possible interaction effects,³ e.g., between $\delta$ and $\alpha$, which both take into account some part of risk aversion (Nillson et al. 2011; Glöckner and Pachur 2012).
Another two-parameter specification enabling similar interpretation of the $\gamma$ and $\delta$ parameters as GEW was introduced by Prelec (1998) (henceforth P2W):

$$w(p) = \exp\left(-\delta\,(-\ln p)^{\gamma}\right).$$

A special case of the P2W with $\delta = 1$ is expressed in the single-parameter form (henceforth P1W):

$$w(p) = \exp\left(-(-\ln p)^{\gamma}\right).$$

It can reportedly outperform the other weighting functions presented above if used in combination with the power value function in (3) (Stott 2006). Previous studies have delivered mixed results regarding which of the four functional forms of the probability weighting function provides the best fit (see, e.g., Wu and Gonzalez 1996; Gonzalez and Wu 1999; Sneddon and Luce 2001). In this paper, we thus consider the power value function in combination with all four functional forms of the weighting function presented above: the one-parameter TKW and P1W and the two-parameter GEW and P2W. In addition, we introduce a fifth weighting function attributable to the EU: $w(p) = p$ (restricting the $\gamma$ parameter to the value of unity such that no weighting takes place, henceforth denoted EUW) for benchmark purposes as a special linear case of the one-parameter weighting function.

² Although generally there tend to be no further restrictions imposed on $\delta$, Glöckner and Pachur (2012) propose restricting the $\delta$ parameter values (in GEW) between 0 and 4.
³ If a relatively large number of free parameters are being estimated for choice data on proportionally fewer games, some interactions might occur due to negative correlations of $\delta$ and $\alpha$. Generally, the functional relationships between the CPT model's parameters are still subject to debate (Nillson et al. 2011).
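The five weighting specifications can be written compactly in Python. This is a sketch under the standard functional forms from the cited literature; the function names are our own labels and the parameter values used in the example are arbitrary illustrations, not estimates from this paper.

```python
import math


def tkw(p, gamma):
    """Tversky and Kahneman (1992) one-parameter weighting function."""
    return p ** gamma / (p ** gamma + (1 - p) ** gamma) ** (1 / gamma)


def gew(p, gamma, delta):
    """Goldstein and Einhorn (1987) linear-in-log-odds weighting function."""
    num = delta * p ** gamma
    return num / (num + (1 - p) ** gamma)


def p2w(p, gamma, delta):
    """Prelec (1998) two-parameter weighting function."""
    if p == 0.0:
        return 0.0  # limit value, avoids log(0)
    if p == 1.0:
        return 1.0
    return math.exp(-delta * (-math.log(p)) ** gamma)


def p1w(p, gamma):
    """Prelec (1998) one-parameter special case (delta = 1)."""
    return p2w(p, gamma, 1.0)


def euw(p):
    """Linear benchmark: no probability weighting."""
    return p


# Illustrative evaluation on a small grid with an arbitrary gamma below one.
grid = [0.05, 0.25, 0.5, 0.75, 0.95]
tkw_values = [tkw(p, 0.61) for p in grid]
p1w_values = [p1w(p, 0.61) for p in grid]
```

With the curvature parameter set to one (and the elevation parameter set to one in the two-parameter forms), all four specifications collapse to the linear benchmark, which is why EUW can be treated as a nested special case in the model comparisons.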

Event-splitting effects
By definition, an event-splitting effect (also called violation of coalescing) occurs when a reversal of preference arises in response to a coalesced- versus split-form change of the same choice. By using the term split form of a gamble throughout the paper, we refer to the canonical split form, for which both gambles of a choice are split in a way that allows the corresponding ranked branches to have equal probabilities while keeping the number of branches minimal (also implying the same number of branches in both gambles).⁴ For example, a choice between $A$ = (€40, 0.1; €40, 0.1; €2, 0.8) and $A'$ = (€98, 0.1; €2, 0.1; €2, 0.8) is called the canonical split form of the choice between the coalesced $B$ = (€40, 0.2; €2, 0.8) and $B'$ = (€98, 0.1; €2, 0.9) (Birnbaum and Navarrete 1998). Kahneman (2003) suggests that "most decision makers will spontaneously transform the former prospect into the latter and treat them as equivalent in subsequent operations of evaluation and choice" (p. 727). This observation largely coincides with the definition of coalescing, which refers to an assumption that any two or more branches leading to the same outcome can be combined by adding their probabilities without affecting the utility of the gamble, such that $A = (x, p; x, q; y, 1 - p - q) \sim B = (x, p + q; y, 1 - p - q)$ and additionally $C = (x, p; y, q; y, 1 - p - q) \sim D = (x, p; y, 1 - p)$. Coalescing implies that $A \succ C$ if and only if $B \succ D$. From above, $A \sim B$ and $C \sim D$. Therefore, $A \succ C$ if and only if $B \sim A \succ C \sim D$. Thus, by transitivity, $B \succ D$ (Birnbaum et al. 2017). Because coalescing and transitivity should be satisfied within the CPT framework with any $w(p)$ function, there should also be no splitting effects (Birnbaum and Navarrete 1998; Luce 1998; Birnbaum 2008; see Appendix 1 for a proof illustrating how the CPT's assumption of rank dependency implies the satisfaction of coalescing).

⁴ Before examining coalescing in more detail, it is important to define the elements called branches that comprise any gamble in question. A branch is a probability-consequence pair that is distinct in the presentation to the decision maker. To illustrate this, consider two gambles: a three-branch gamble $A = (x, p; x, q; y, 1 - p - q)$, for which the three branches lead to outcomes $x$, $x$ and $y$ with probabilities $p$, $q$ and $(1 - p - q)$, respectively; and a two-branch gamble $B = (x, p + q; y, 1 - p - q)$, for which the two branches are $(x, p + q)$ and $(y, 1 - p - q)$ for outcomes $x$ and $y$ respectively. A keen observer will notice that the two gambles appear objectively equivalent, as the combination of outcomes for the first two branches of gamble $A$, $(x, p)$ and $(x, q)$, leads to the same result as the first branch of gamble $B$, $(x, p + q)$. Thus gamble $B$ is the two-branch coalesced form of the three-branch gamble $A$; and conversely, gamble $A$ is the split form of gamble $B$.
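The construction of a canonical split form can be sketched in Python as follows. This is one possible reading of the verbal definition (equal probabilities on corresponding ranked branches with a minimal number of branches), implemented by cutting both gambles at the union of their cumulative probability breakpoints. The function names are hypothetical, and exact fractions are used to avoid rounding issues.

```python
from fractions import Fraction


def coalesce(gamble):
    """Combine branches with equal outcomes; return branches ranked from best to worst."""
    merged = {}
    for x, p in gamble:
        merged[x] = merged.get(x, Fraction(0)) + p
    return sorted(merged.items(), reverse=True)


def canonical_split(g1, g2):
    """Re-express two gambles on a common, minimal set of ranked probability intervals.

    Gambles are lists of (outcome, Fraction probability) whose probabilities sum to 1.
    """
    def ranked_cumulative(gamble):
        cum, out = Fraction(0), []
        for x, p in coalesce(gamble):
            cum += p
            out.append((x, cum))
        return out

    c1, c2 = ranked_cumulative(g1), ranked_cumulative(g2)
    # Union of cumulative breakpoints of both gambles, in increasing order.
    cuts = sorted({c for _, c in c1} | {c for _, c in c2})

    def resplit(cum_branches):
        result, lower = [], Fraction(0)
        for upper in cuts:
            # Outcome of the first ranked branch that covers the interval (lower, upper].
            outcome = next(x for x, c in cum_branches if c >= upper)
            result.append((outcome, upper - lower))
            lower = upper
        return result

    return resplit(c1), resplit(c2)


# The coalesced choice B versus B' from the text.
B = [(40, Fraction(2, 10)), (2, Fraction(8, 10))]
B_prime = [(98, Fraction(1, 10)), (2, Fraction(9, 10))]
A, A_prime = canonical_split(B, B_prime)
```

Applied to the coalesced pair from the text, the sketch returns three branches per gamble with probabilities 0.1, 0.1 and 0.8, i.e., the split pair $A$ and $A'$ given above.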
In the meantime, there exists abundant evidence showing that people do not treat coalesced-form and split-form gambles as equal (e.g., Conlisk 1989; Starmer and Sugden 1993; Humphrey 1995). In fact, it has been shown that splitting the branch with the highest available outcome can increase the attractiveness of a gamble in comparison to a coalesced form of the same gamble. Conversely, splitting the branch with the lowest available outcome decreases the attractiveness of the gamble (Starmer and Sugden 1993; Humphrey 1995, 2001). Splitting both the highest and lowest branches in a binary gamble with two equiprobable positive branches tends to make the gamble worse, in compliance with loss aversion (Birnbaum 2008).
Interestingly, it appears that violations of coalescing cannot be attributed to lack of knowledge, as they are persistent even in people with doctoral degrees who are familiar with the literature on decision making (Birnbaum 1999). Neither can the splitting effects be explained by errors, as they are still persistent when errors are factored out (Birnbaum et al. 2017), nor can the effects be attributed to the particular format⁵ used for presenting, or framing, the gambles (Birnbaum 2004, 2006; Birnbaum et al. 2008). Decision heuristics, like anchoring and adjustment, cannot account for the observed splitting effects either (Humphrey 1996). Meanwhile, the results regarding the effects of specific learning and experience are still mixed (see, e.g., Humphrey 2006; Birnbaum and Schmidt 2015).
Indeed, it seems that people simply "do not obey coalescing" (Birnbaum 2007, p. 171). The assumption that coalesced and split forms of the same gamble would be treated equivalently is thus "empirically false" (Birnbaum 2008, p. 464). And yet, CPT is still argued by many to be "the 'best', if imperfect, description of decision making under risk and uncertainty" (Birnbaum 2008, p. 463). In light of this controversy, a question arises regarding what intelligible impact, if any, the splitting effects have on the conclusions drawn by the "imperfect" CPT framework.
We examine which of the two gamble presentation forms (coalesced versus split) leads to more normatively accurate, or rational, results. In this context, the term rationality is used in the sense proposed by von Neumann and Morgenstern (1947). It implies being in line with the normative preference axioms of the EU and, most notably, the substitution axiom.
We hypothesize that the normative EU explains data comprised of split-form gambles better than data comprised of coalesced-form gambles, while it is unclear whether this is to be the case for the descriptive RDU model.⁶ A couple of previous studies have already focused on the CPT's fit to data that indirectly test coalescing (e.g., Birnbaum and Chavez 1997; Birnbaum and Navarrete 1998), but little is known about how well selected data comprised of coalesced- versus split-form gamble pairs fit an RDU model with varying functional specifications.

Experimental procedure
We examine these questions using data from a pairwise choice experiment (see Birnbaum et al. 2017), conducted with 54 student subjects at the University of Kiel in Germany (all undergraduate students, 61% in the economics and business administration programs; 22.0 years old on average; of them, 21 female). The experiment is based on a random-lottery incentive mechanism, which is a commonly used one-step choice-based elicitation approach that lets subjects face multiple pairs of gambles in a sequence and choose a preferred gamble for each of the pairs (see, e.g., Hey and Orme 1994; Wu and Gonzalez 1996).
At the end of the experiment, one pair is chosen randomly and played out for real. Each pair consists of a risky gamble $R = (x_1, p_1; x_2, p_2; x_3, p_3; x_4, p_4)$ and a safe gamble $S = (y_1, q_1; y_2, q_2; y_3, q_3; y_4, q_4)$ with two to four outcomes $x_i$, $y_i$ and respective probabilities $p_i$, $q_i$ that are systematically varied. The choices between the gambles are presented in a pseudo-random order and the outcomes $x_i$, $y_i$ are ordered from the lowest to the highest within each gamble (see Appendix 2 for an example of the presentation format and the experiment instructions).
The dataset comprises 28 gamble pairs (14 of them presented in a split form, see Appendix 3 for an overview of all gambles and the respective descriptive statistics of gamble choices), implying 4 × 28 = 112 choice situations faced by each subject over four repetitions, that is, 54 × 4 × 28 = 6048 choice situations observed in total, 3024 of them in a split form. Note, however, that the gambles in our study are relatively specific in that no certain outcomes are included and the values and probabilities of high outcomes are relatively similar between the gambles within a decision. The subjects received, on average, a €19.1 cash reward (including a €5.0 show-up reward) for an approximately 90-minute session, leading to an average reward of €12.8 per hour.

Structural modeling
We apply structural methods to jointly estimate several core parameters of the EU and RDU frameworks. In particular, we use maximum likelihood estimation (MLE) to determine the most likely parameter values to have generated the given dataset within the specification bounds. In addition, the log-likelihood of the MLE allows us to measure the goodness of fit of the respective frameworks.

⁶ (continued) [...] function (e.g., Birnbaum and McIntosh 1996; Birnbaum and Chavez 1997). Hence, we additionally hypothesize finding significant differences in $\gamma$ (and $\delta$) parameters between coalesced- and split-form gambles, respectively. We expect that this result would remain robust also when controlling for additional variables such as the numbers of branches or outcomes in a gamble as well as demographic information.
RDU includes subjective values of outcomes and subjective weights of probabilities. For the subjective values, we use a power value function in all models. We thus add the $\alpha$ parameter as the first one in the list of estimable parameters. For the probability weighting, we examine four weighting function specifications given that, firstly, the probability weighting function appears to be central for considering the coalesced- versus split-form data and, secondly, the explanatory power of the RDU model depends on the function specifications and the corresponding interaction effects (e.g., Stott 2006). We thus add the $\gamma$ (and $\delta$, where relevant) parameter to the list of estimable parameters (see Appendix 7 for a full list of parameters in each model).
In addition, we extend the RDU to accommodate stochastic behavior by applying an exponential specification of the choice rule⁷ of Luce (1959) (see also Rieskamp 2008). The exponential specification of Luce's choice rule is defined as

$$p(R, S) = \frac{e^{\phi V(R)}}{e^{\phi V(R)} + e^{\phi V(S)}},$$

where $p(R, S)$ stands for the probability of choosing the risky gamble $R$ over the safe gamble $S$ and the sensitivity parameter $\phi > 0$ specifies how sensitively the model reacts to differences between the subjective values $V(R)$ and $V(S)$ of the gambles $R$ and $S$, respectively (Rieskamp 2008; Nillson et al. 2011). Consequently, given that it is feasible to structurally estimate all specified parameters jointly with MLE, we add the $\phi$ parameter to the list of estimable parameters.
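A minimal Python sketch of the exponential choice rule is given below. The function name is hypothetical, and the logistic rewriting is an implementation choice made for this sketch (it is algebraically identical to the ratio of exponentials but avoids numerical overflow for large subjective values); it is not a description of the estimation code used for the paper.

```python
import math


def choice_probability(v_risky, v_safe, phi):
    """Probability of choosing R over S under the exponential Luce rule.

    Equals exp(phi*V(R)) / (exp(phi*V(R)) + exp(phi*V(S))), rewritten as a
    logistic function of phi * (V(R) - V(S)) for numerical stability.
    """
    z = phi * (v_risky - v_safe)
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


# Illustrative values (hypothetical): a larger phi makes choices more deterministic.
p_low_sensitivity = choice_probability(3.0, 1.0, phi=0.5)
p_high_sensitivity = choice_probability(3.0, 1.0, phi=2.0)
```

When both gambles have the same subjective value the rule predicts a choice probability of one half regardless of the sensitivity parameter, and the probabilities of choosing $R$ and of choosing $S$ always sum to one.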
Using the choice rule, we attempt to quantify the goodness of fit of the RDU model predictions given the actual choices between the gambles. A useful tool for the purpose is the deviation measure $G^2$, expressed as

$$G^2 = -2 \sum_{i=1}^{N} \ln f_i(y \mid \theta),$$

with $i$ denoting the choice among gambles and $N$ denoting the total number of gambles. $f_i(y \mid \theta)$ expresses the probability that the RDU model with its parameter values $\theta$ predicts a choice $y$, such that $f_i(y \mid \theta) = p(R, S)$ if the gamble $R$ is chosen and $f_i(y \mid \theta) = 1 - p(R, S)$ if the gamble $S$ is chosen. Low values of $G^2$ are indicators of good choice predictions and, hence, a good fit of the RDU model (Rieskamp 2008). A directly related measure of fit is the Akaike information criterion⁸ (AIC) that additionally adjusts for the complexity of the model (namely, the number of parameters in a given specification) and thus allows comparing the explanatory power of differing models. The AIC is defined as

$$\text{AIC} = G^2 + 2n,$$

where $n$ stands for the number of free parameters in a model (Akaike 1973). As a rule, an AIC difference of $\Delta\text{AIC} > 10$ for two given models strongly favours the model with the lowest AIC measure over the other (Burnham and Anderson 2002).
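The two fit measures can be sketched in Python as follows. The function names are hypothetical, and the input is assumed to be the list of model probabilities assigned to the choices that were actually observed. As a reference point, the sketch also computes the deviation of a model that predicts every choice with probability 0.5; the number of parameters passed to the AIC function is an arbitrary illustration.

```python
import math


def g_squared(observed_choice_probs):
    """Deviation measure: -2 times the summed log probability of the observed choices."""
    return -2.0 * sum(math.log(f) for f in observed_choice_probs)


def aic(observed_choice_probs, n_params):
    """Akaike information criterion: G^2 plus twice the number of free parameters."""
    return g_squared(observed_choice_probs) + 2 * n_params


# Reference point: a model predicting each of the 3024 choices in one
# presentation form with probability 0.5 (pure guessing, no free parameters).
n_choices = 3024
guessing_baseline = g_squared([0.5] * n_choices)

# Hypothetical comparison of two models on the same three observed choices.
aic_simple = aic([0.6, 0.7, 0.55], n_params=2)
aic_flexible = aic([0.62, 0.71, 0.56], n_params=3)
delta_aic = aic_flexible - aic_simple
```

The guessing baseline gives a scale for reading reported AIC values: a fitted model is informative only to the extent that its deviation falls below this value, and an additional parameter must lower $G^2$ by more than 2 to lower the AIC.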

Fitting coalesced- and split-form data to RDU and EU
For the main hypothesis, we analyze the AIC measures for the EU specification using the coalesced-form and split-form data separately. This approach allows us to consider whether, in some settings, one could find grounds for preferring the one form of presenting the gambles over the other. The linear case of the RDU one-parameter weighting function (with restricting the $\gamma$ parameter to unity, thus attaining EUW) shows clear evidence that the split-form gamble pairs provide a better fit than coalesced-form gambles in the EU specification (coalesced form $\text{AIC}_{\text{EUW}}$ = 3870, split form $\text{AIC}_{\text{EUW}}$ = 3816, see $\Delta$AIC in Table 1). However, the results also indicate that for the RDU specifications, the coalesced-form data provide a better fit than the split-form data. Namely, three out of the four core RDU specifications show a $\Delta$AIC that notably exceeds 10, thus providing a result in favor of the coalesced-form data, while the fourth (TKW) shows a $\Delta$AIC smaller than 10, providing an insufficiently conclusive result.
Interestingly, the opposite holds when examining directly comparable one-type gambles with split highest and lowest branches exclusively (see $\Delta$AIC in Table 2 and the list of gambles in Appendix 3): The model that uses split-form data outperforms the same model that uses equivalent coalesced-form data in all four RDU specifications. The question of whether experiments ought to include split-form gamble pairs rather than their coalesced equivalents to ensure a more accurate preference elicitation thus finds some confirmation here but is still open for further examination.
The four RDU specifications outperform the EU specification in terms of the model fit (see $\Delta\text{AIC}_{\text{EUW}}$ in Table 2) for the coalesced-form data. For the split-form data of the directly comparable one-type gambles, the differences between $\text{AIC}_{\text{EUW}}$ and the respective measures of fit for the four RDU specifications are no longer significant.
Finally, the one-parameter P1W outperforms TKW and the two-parameter P2W outperforms GEW in terms of the fit measures in practically all of the examined models. Therefore, in the following discussion of results, we focus on these best-performing models in particular; see Appendices 4-6 for the full results of all examined RDU specifications and model versions with various control variables. In total, we use four versions of the model (referred to as models M1 to M4, see Appendix 7 for summary).
[Tables 1 and 2 are not reproduced here. Recovered legend entries and notes: "The model that uses coalesced-form data outperforms the model that uses split-form data"; "The model that uses split-form data outperforms the model that uses coalesced-form data"; "Better fit for CPT than EU"; "The four RDU specifications are applied on split-form data as compared to coalesced-form data with the whole dataset considered"; "The four RDU specifications are applied on split-form data as compared to coalesced-form data with 16 out of the 28 gamble pairs considered".]

Comparing RDU parameter estimates for the coalesced- and split-form data
The results for the weighting function parameters $\gamma$ and $\delta$ of model M1 in Table 3 are unequivocal. There exist significant differences at a 99% significance level between these parameters when considered in the contexts of coalesced- and split-form gamble pairs. Indeed, also the direction of the differences is consistent over the RDU specifications, with $\gamma_{\text{split}}$ remaining significantly higher than $\gamma_{\text{coa}}$ and $\delta_{\text{split}}$ remaining significantly lower than $\delta_{\text{coa}}$ (all at a 99% significance level, according to Wald tests). These results are, however, worthy of attention not only for their significant differences between the respective coalesced- and split-form $\gamma$ and $\delta$, but also due to the absolute values of these weighting function parameters. While the coalesced form allows maintaining the $\gamma_{\text{coa}}$ value significantly different (lower) from unity, thus confirming the established RDU predictions (Wald tests, p values < 0.001), the split-form $\gamma_{\text{split}}$ reveals an unusual picture. The estimated parameter value is not significantly different from unity in the one-parameter P1W specification and significantly different (higher) from unity in the two-parameter P2W specification (Wald tests, p values < 0.001). These results thus conflict with the usual results of the RDU framework, namely, the aforementioned $0 < \gamma < 1$. Model M2 might provide additional insights into this uncommon result. This model, compared to M1, considers the weighting function exclusively, keeping the utility function parameters unchanged. According to Chow tests,⁹ the respective weighting parameter values estimated in M1 and M2 do not significantly differ between these model versions for the one-parameter P1W specifications (neither for the corresponding coalesced-, nor split-form parameters), while the two-parameter P2W specification estimates in M2 are different from the estimates in M1 at a 95% significance level for $\gamma_{\text{split}}$, $\delta_{\text{coa}}$ and $\delta_{\text{split}}$.

[Table 3 is not reproduced here. Recovered table note: The symbol ** for the corresponding split parameters indicates a difference between the coalesced- and split-form parameter values at a 95% significance level, *** indicates a difference at a 99% significance level. All differences in this model are due to a binary splitcoal variable. All constant terms are significant at a 99% level. As for the constant term representing the $\phi$ parameter, its values are likely dependent on the $\alpha$ as well as the $\delta$ parameters, which explains the differences over the four specifications. Standard errors (in parentheses) are clustered at subject level.]
The results of M2 confirm the results of M1 and reveal the familiar trend of significantly different weighting function parameters $\gamma$ and $\delta$ in the coalesced-form as compared to the split-form gamble pairs (Wald tests, p values < 0.001), with $\gamma_{\text{coa}}$ significantly different (lower) from unity (Wald tests, p values < 0.001). In this case, the $\gamma_{\text{split}}$ parameters report practically no curvature, which implies that no weighting of probabilities for split-form gamble pairs could be identified in the P1W case (essentially making it equivalent to the EUW case), while for the two-parameter function P2W, the elevation parameter $\delta_{\text{split}}$ alone assures that the weighting function is curved.¹⁰ As depicted in Fig. 3 where we plot the results of M1 graphically, the results of the one-parameter weighting functions are admittedly easier to interpret than the two-parameter weighting functions. The implications of splitting (namely, that split-form gambles result in less probability weighting than coalesced-form gambles) hold for both P1W and P2W, but the particular shapes of the split and coalesced P2W should be interpreted with caution, as the $\delta$ values are quite high and indicate more pessimism in the split form.
Although there are some limitations to our results, we can indeed conclude that violations of coalescing for split-form gamble pairs in the RDU framework not only explain differences in the model fit, but also affect the subjective weighting function. Namely, the use of the split-form gamble pairs appears to change gamble choices and considerably diminish the weighting as compared to the coalesced-form gambles.
Note that we also check the robustness of these results by considering further weighting function specifications and adjusting the RDU parameter estimates to further variables. Our analysis (see M1, M2 in Appendix 4 and M3, M4 in Appendices 5 and 6) shows that the familiar trend of significantly different weighting function parameters $\gamma$ and $\delta$ for the coalesced- versus split-form gambles remains strong and consistent across all RDU specifications and considered models. Taken together, our results indicate that one source of the more pronounced deviation from EU could be comparison difficulties caused by the coalesced form of the presented gambles.

⁹ In the given case when linear-form restrictions are met, a Chow test (for comparisons between, rather than within, models) provides results that are essentially equivalent to the results of a Wald test.
¹⁰ For the elevation parameter, the estimated value and value change between the models exhibit a curious development, which appears to be attributable to the interaction effects between the value function and weighting function specifications. As moderately significant differences in $\alpha$ values are only found in cases where the power value function is combined with a two-parameter weighting function, one can reasonably conclude that it is a result of an increased "latitude to interact" (see Stott 2006, p. 102).

Discussion and limitations
The results of this paper are insightful in a number of ways. Firstly, we have shown that the fit of EU is indeed better if the gamble pairs are presented in a split form. This result indicates that the split form improves prescriptive decision analysis. For example, we could conclude that one should use split-form rather than coalesced-form gamble pairs when advising decision making.
Secondly, we have found evidence for significant differences in magnitude between the $\gamma$ parameters in the RDU weighting functions for coalesced- and split-form gamble pairs in the given dataset. Meanwhile, the somewhat mixed evidence regarding the logically independent elevation parameter $\delta$ calls for further examination of this property. Note that the interaction effects between $\delta$ and $\alpha$ cannot be ruled out in this setting and are a possible source of the mixed evidence. Still, although the values of $\delta$ are quite high (particularly for coalesced gambles), there is no indication of strong effects of cross-parameter compensations: $\gamma$ and $\delta$ do not vary considerably in concert, and $\delta$ values are relatively stable in all model specifications. Whether this result reflects some more fundamental theoretical issues or is merely a method artifact is left for further research.

[Fig. 3, caption partially recovered: "[...] Table 3 above. The solid lines refer to coalesced-form data and the dashed lines refer to split-form data in the one-parameter (black, P1W) and two-parameter (gray, P2W) Prelec weighting functions, respectively".]
Thirdly, it appears that presenting gamble pairs in a split form changes gamble choices as to bring the RDU closer to the EU. This is the case not only for the measures of model fit, but also for the properties of the weighting function. With the curvature parameter c largely closer to unity (and often not significantly different from unity) for split-form than for coalesced-form gambles, the results indicate that the subjects tend to pay less account to subjective probability weighting when evaluating split-form gamble pairs of a certain type. 11 They act comparatively more ''normatively'' than expected and thus put one of the cornerstones of the RDU (and by implication, CPT) into question.
Note, however, that the gambles in our study were quite specific in that values and probabilities of the high outcomes were relatively similar between the gambles within a decision. Note also that no certain outcomes were included that might drive the typically stronger deviations from linearity of the weighting function. We thus acknowledge that the particular shape of the weighting function might be different with additional gambles; tests for generality are left to further research.

Conclusions
The results of this paper invite its readers to carefully rethink RDU and its perspective on subjective probability weighting or, more particularly, on the stability of the probability weighting function against the editing of lotteries. We have provided some reasons to conclude that the non-linearity in the weighting function might be more pronounced as a result of coalescing. That is, probability weighting does not necessarily appear to be an ingrained feature, but rather a result facilitated by processing difficulties.
What do our results imply for utility theory and its applications? Firstly, one could argue in favor of using gamble pairs in a split form and employing EU as a decision criterion, given that it performs rather well for the split-form pairs. However, real-life gambles do not always occur in a split form. Therefore, a second recommendation could be to increasingly employ other theories that imply splitting effects. Other models, like the transfer of attention exchange (TAX) model or the rank-affected multiplicative weights (RAM) model, could potentially be rivals to EU, RDU and CPT (Birnbaum and Chavez 1997; Birnbaum 1999, 2008). Thirdly, one could continue employing RDU (and by implication, CPT), albeit cautiously, knowing that the splitting has a fairly intelligible effect on the weighting function.
Footnote 11: We identify certain contradictions in the estimated CPT parameter values: some one-parameter models lead to the common inverse-S-shaped probability weighting function, while others result in a linear function. Note that the linear shape appears to become less pronounced when further relevant explanatory variables are added to a model. We thus confirm the proposition by Birnbaum and Navarrete (1998), among others, who argue that CPT becomes more pertinent if the parameters can additionally depend on the number of branches or outcomes in a gamble. Controlling for this has indeed brought the CPT parameters more in line with the theory than otherwise. At the same time, we also find that the importance of the splitting itself is not diminished when these additional variables are added. This result extends the discussion of bounded rationality as a reason behind the observed splitting effects and indicates that the underlying psychological reasoning reaches beyond minimized cognitive expenditures related to the absolute frequency of considered attributes.
Much remains to be done. Firstly, because this paper concerns coalesced- and split-form gamble pairs in the gain domain exclusively, we advise extending the scope of forthcoming experiments to also include mixed and loss-only gambles (to examine the splitting effects in the respective parameters for the loss domain, including the loss aversion parameter). Secondly, because the construction of splitting appears to have an influence on the resulting model fit, we advise extending the binary concept of splitting and examining a variety of directly comparable coalesced versus differently split datasets. Thirdly, because the interpretations of the psychological reasons behind the splitting effects are still manifold, we advise gathering further insights from parallel lines of research, such as neuroeconomics and others.
Acknowledgements Open Access funding provided by Projekt DEAL.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

Appendix 1: Proof of coalescing within the framework of CPT
In its most general representation as a part of the RDU family, CPT guarantees satisfaction of coalescing (i.e., the equivalence of coalesced and split forms of a given gamble) even if no editing rules are assumed (Birnbaum 2008). In particular, as demonstrated by Birnbaum and Navarrete (1998, pp. 57-58), RDU implies coalescing for any w(p) function in the rank-dependent representation, where the (decumulative) probability of receiving x_i or more is denoted by p_i, while the probability of receiving more than x_i is denoted by p_{i-1}.
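For reference, the rank-dependent form and the coalescing step can be sketched as follows. The notation beyond w, p_i and p_{i-1} is an assumption made for this sketch: outcomes are ranked from best (x_1) to worst (x_n), u denotes the utility function, p_0 = 0, w(0) = 0 and w(1) = 1, and the three-branch gamble (x, p; y, q; z, r) has x ≥ y > z with p + q + r = 1.

```latex
\begin{align*}
V(G) &= \sum_{i=1}^{n} \bigl[\, w(p_i) - w(p_{i-1}) \,\bigr]\, u(x_i), \\[4pt]
V(x, p;\, y, q;\, z, r) &= w(p)\,u(x) + \bigl[\, w(p+q) - w(p) \,\bigr]\, u(y)
                          + \bigl[\, 1 - w(p+q) \,\bigr]\, u(z), \\[4pt]
\text{and for } y = x:\quad
V(x, p;\, x, q;\, z, r) &= w(p+q)\,u(x) + \bigl[\, 1 - w(p+q) \,\bigr]\, u(z)
                          = V(x, p+q;\, z, r).
\end{align*}
```

The terms in w(p) cancel once the two upper branches share the same outcome, which is why the result does not depend on the shape of w.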
If outcomes x = y, then V(x, p; x, q; z, r) = V(x, p + q; z, r).

Appendix 2: Format of the presented gambles
See Fig. 4. At the beginning of the experiment, the subjects received written instructions that were also read aloud. Following the instructions, four transparent dominance questions were used to test the subjects' understanding. Only after completing these were the subjects allowed to proceed with the experiment. Four booklets with pairwise choices between presented gambles were then consecutively provided to each of the subjects, with every choice being replicated four times using counterbalanced left-right positioning.
In addition, two transparent stochastic dominance problems were included in each of the booklets to check whether the subjects were being sufficiently attentive. Importantly, both coalesced and split forms of the same gambles were included in the booklets, but the booklets differed in how the gambles were included. Namely, while half of the subjects received four booklets with intermixed coalesced- and split-form gambles, the other half received booklets containing gambles in only one form or the other.
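What makes a dominance problem "transparent" can be illustrated with a small check for first-order stochastic dominance. This is a sketch only: the function names are hypothetical, and the lotteries are those of Examples 1 and 2 in the instructions rather than the attention-check items actually used in the booklets.

```python
def cdf_at(lottery, x):
    """P(prize <= x) for a lottery given as (prize, probability) pairs."""
    return sum(prob for prize, prob in lottery if prize <= x)


def dominates(b, a, tol=1e-12):
    """True if lottery b first-order stochastically dominates lottery a."""
    prizes = sorted({prize for prize, _ in a} | {prize for prize, _ in b})
    # b is never more likely than a to pay x or less ...
    no_worse = all(cdf_at(b, x) <= cdf_at(a, x) + tol for x in prizes)
    # ... and is strictly less likely to do so for at least one x.
    strictly = any(cdf_at(b, x) < cdf_at(a, x) - tol for x in prizes)
    return no_worse and strictly


# Example 1 of the instructions: B pays at least 25, A at most 20.
ex1_a = [(0, 0.50), (20, 0.50)]
ex1_b = [(25, 0.75), (50, 0.25)]
print(dominates(ex1_b, ex1_a))

# Example 2 of the instructions: neither lottery dominates the other.
ex2_a = [(0, 0.50), (5, 0.25), (50, 0.25)]
ex2_b = [(12, 0.50), (14, 0.50)]
print(dominates(ex2_a, ex2_b), dominates(ex2_b, ex2_a))
```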
For further details, please refer to the instructions (translated from German) on the following pages.
Fig. 4 The presentation design of the gambles. Note: The subjects were asked to mark their preferred gamble by circling it.

Instructions
Thank you for your willingness to participate in this study. Your data and personal details will be treated confidentially, will not be passed on to anyone and will not be used for any other purposes than this study.

The study
We want to know how people choose between lotteries, that is, how they choose between different options of winning money. For this purpose, two lotteries are offered for you to play in each task. Your task is to decide which of the two lotteries you prefer to participate in.

Example 1
Imagine a lottery wheel with 100 lottery tickets. 50 tickets are marked with "0 €", whereas the other 50 tickets are marked with "20 €". If you draw a ticket from the lottery wheel, you have a 50-50 chance of winning either 0 € or 20 €. Both amounts can be drawn with 50% probability.
Imagine now a second lottery wheel, which contains 75 "25 €" tickets and 25 "50 €" tickets. Would you rather draw a ticket from the first or second lottery wheel? This task is described in the questionnaires as follows:
Example 1: Which lottery would you like to play?
□ A
with a probability of 50% you win 0 €
with a probability of 50% you win 20 €
OR
□ B
with a probability of 75% you win 25 €
with a probability of 25% you win 50 €
In this case, it is clear that you should prefer the second lottery B. In this lottery you would win at least 25 €, whereas for lottery A even the maximum prize is only 20 €.

Example 2
In this study, however, it will often depend on your own opinion which lottery you prefer. An example of this is the next task:
Example 2: Which lottery would you like to play?
□ A
with a probability of 50% you win 0 €
with a probability of 25% you win 5 €
with a probability of 25% you win 50 €
OR
□ B
with a probability of 50% you win 12 €
with a probability of 50% you win 14 €
There are three different possible prizes in Lottery A. Lottery B has two different, equally probable prizes. Suppose you choose Lottery A: With 25 lottery tickets you could receive the high prize of 50 €. However, if you draw one of the other 75 tickets from Lottery A, i.e., 0 € or 5 €, you would have been better off if you had chosen Lottery B, as in Lottery B you will win at least 12 €. Which lottery would you rather play? Some would rather play the first lottery A, others would rather play the second lottery B. It is just a matter of opinion. There is no "right" or "wrong" answer! Once you have made your choice, tick the appropriate lottery in the box.

Rules
For all lotteries in this study, you can imagine a lottery wheel with 100 tickets with cash winnings. All tickets can be randomly drawn from the wheel with the same probability. Therefore, the probability of winning a prize corresponds to the number of tickets with exactly this prize in the wheel. In each task, the probability and amount of all possible winnings for both lotteries are given. The probabilities add up to 100% in each lottery.
Please answer the questions alone and do not talk to the other participants during the study. If you violate this rule, you will be excluded from the study and will not receive any payment.
The study will take about 1 h and 15 min. We ask all participants to stay until the end and not to leave earlier. There are enough tasks for the whole period. Payment is not made until the end of the study.

Payment
Each participant receives a participation fee of 5 €. In addition, one of the tasks is randomly selected at the end of the study. For this purpose, you are asked to draw a number from a bag. It is checked which lottery you have chosen for this task and this lottery is then played, again by drawing a number.
The lowest possible prize is 0 €. If you draw this prize, you go home with 5 €. The highest possible prize is 60 €. In this case you would receive 65 € in total. The other possible prizes lie in between; the 5 € participation fee is always added. Your earnings are paid out immediately in cash.
Thus, your payment depends on the decisions you make during the study!

Further procedure
On the next page, please answer the general questions first.
On page four you will find four exercises. When you have answered them, please raise your hand. Someone will check the exercises to see if you have understood them.
You will then receive your first task sheet. Each time you have finished answering a sheet, please raise your hand, then you will receive the next task sheet.

Questions?
If you have any questions, raise your hand and someone will come to you. Please don't ask your neighbors.
Appendix 3: Structure of the presented gambles in coalesced and split forms
See Table 4.

Birnbaum and Navarrete (1998) demonstrate that, for a more general CPT model, median parameters do indeed differ between two-outcome and three-outcome gambles. The authors argue that the estimated weighting function parameters for two-outcome gambles are consistent with previous, frequently replicated results (in that 0 < c < 1, see, e.g., Tversky and Kahneman 1992; Wu and Gonzalez 1996), whereas for three-outcome gambles the results indicate c > 1, consistent also with Birnbaum and McIntosh (1996) and Birnbaum and Chavez (1997). In line with this, the number of branches or outcomes in a gamble should have a significant impact on the CPT weighting function parameter estimates, even when the splitting is controlled for.
We consider these additions to the literature in M3 and find that both weighting function parameter estimates c and d are significantly explained by the number of branches as well as the number of outcomes in a gamble in all four weighting function specifications. Furthermore, Chow tests for comparisons between M2 and M3 indicate that adding the outcomes variable has also significantly altered the values of c_coa, c_split, d_coa and d_split (at a 99% significance level). Although similar to that in M1 and M2, the familiar significant differences between the weighting function parameters for coalesced- and split-form gambles are still present.

Appendix 7: Overview of the estimated model versions
See Table 11.

Notes to the tables of parameter estimates:
The symbol ** for the corresponding split parameters indicates a difference between the coalesced-form and split-form parameter values at a 95% significance level; *** indicates a difference at a 99% significance level. Standard errors (in parentheses) are clustered at subject level.
Where splitcoal is the only explanatory variable, all differences in the model are due to this binary variable, and all constant terms are significant at a 99% level.
Where further explanatory variables are included, all significant variables (at a 99% level) are presented in bold font. Alongside the non-linear predictions of the weighting function parameters, the respective regression coefficients are presented in italic font.
The variables are coded as follows. The binary variable splitcoal takes the value of 1 if the corresponding gamble is presented in a split form, 0 otherwise; the outcomes variable indicates the total number of outcomes in a gamble pair (taking a value of 4-6, as in the coalesced form); the branches variable indicates the total number of branches in a gamble pair (taking a value of 4-8); constant refers to the constant term. Furthermore, repetition takes on a value of 1-4 depending on the filling order; gender takes on a value of 1 for male subjects, 0 for female subjects; and major takes on a value of 1 for subjects studying economics or business administration, 0 otherwise.
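The variable coding above can be read as follows in a short sketch. It assumes, purely for illustration, that a weighting function parameter is a linear index of the covariates, so that the coalesced-form and split-form values differ by the splitcoal coefficient; the actual link between coefficients and the reported non-linear predictions is not restated here, and all coefficient values and names below are hypothetical.

```python
def predict_parameter(coefs, covariates):
    """Linear index: constant plus the sum of coefficient * covariate value."""
    return coefs["constant"] + sum(
        coefs[name] * value for name, value in covariates.items()
    )


# Hypothetical coefficients for the curvature parameter c (illustration only,
# not estimates from the tables).
c_coefs = {"constant": 0.50, "splitcoal": 0.25, "branches": 0.0}

# splitcoal is 1 for split-form and 0 for coalesced-form gambles;
# branches counts the branches in a gamble pair (4 to 8).
c_coa = predict_parameter(c_coefs, {"splitcoal": 0, "branches": 4})
c_split = predict_parameter(c_coefs, {"splitcoal": 1, "branches": 4})
print(c_coa, c_split, c_split - c_coa)
```

In this reading, a significant splitcoal coefficient corresponds to a significant difference between the coalesced-form and split-form parameter values.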