1 Introduction

Behavioral research on decision making has identified many framing effects (Kahneman 2011), in which objectively identical information presented in different ways leads to different decisions. Since probabilities are a central element in any decision problem under risk, effects caused by different ways of communicating probabilities are of particular importance both for increasing our understanding of decision making, and for supporting decision makers in practice.

Different ways of communicating probabilities to decision makers lead to various bias phenomena (Ohlert and Weißenberger 2015). It has been shown that biases such as base rate neglect (subjects ignore the frequency at which an event occurs at large and focus only on conditional probabilities) strongly depend on whether probabilities are expressed as fractions (like 40%) or as ratios of two numbers (e.g., 4 out of 10) (Gigerenzer and Hoffrage 1995; Hoffrage et al. 2002). When probabilities are expressed as ratios, using different numbers to represent the same probability influences decision making. This effect is known as the “Ratio bias” and refers to the tendency of individuals to judge a probability expressed as a ratio of high numbers as more likely than the same—or an even higher—probability expressed as a ratio of low numbers. For example, a ratio bias occurs when individuals prefer a lottery that offers a 9/100 chance of winning over a lottery that offers a 1/10 chance of winning, even though the actual probability of winning is higher with the second lottery.

Since the first studies that identified the ratio bias phenomenon in the 1970ies (Piaget and Inhelder 1975), and a wave of related studies in the following years (Denes-Raj and Epstein 1994; Denes-Raj et al. 1995; Kirkpatrick and Epstein 1992; Miller et al. 1989), this phenomenon has been controversially discussed in the literature. The present study aims to contribute to this literature by exploring the context in which ratio bias phenomena are likely to occur. In doing so, we also try to overcome several limitations of previous empirical studies pointed out in the literature. Ratio bias experiments have often been criticized to create a setting which favored occurrence of a ratio bias. Most early studies only contained choices in which the lottery expressed via high numbers was objectively worse. Thus, deviations in favor of the low-number alternative were not possible and researchers could not distinguish between the ratio bias as a systematic deviation and general errors in any direction. Only few studies have so far addressed this question (Dale et al. 2007; Lefebvre et al. 2011), with mixed results. Another problem that we address in this paper is that most previous studies forced respondents to make a choice between two alternatives, without allowing for indifference (Passerini et al. 2012). Combining the possibility to deviate from rational behavior in either direction with indifference allows us to consider two levels of deviations in both directions: indicating indifference when one alternative would be better, and indicating preference for the inferior alternative.

Considering this variety of possible deviations within one study not only allows us to contrast the ratio bias to random mistakes in choice. It also allows us to study the relationship of behaviors across settings. If we find that \(x\%\) of subjects deviate in favor of the high-number alternative, and \(y < x\%\) deviate in favor of the low-number alternative, two (extreme) scenarios are possible: either the same \(y\%\) deviate in both cases, and only an additional \(x-y\%\) exhibit the bias in favor of the high-number alternative; or the two groups are completely disjoint. This distinction has important consequences for the practice of decision analysis and support. If most subjects who exhibit a ratio bias never deviate from rational behavior in favor of a low-number alternative, and others do the opposite that would indicate that many individuals have a preferred way of representing probabilities which would lead them to correct inferences. Thus, it could be possible to tune the presentation of probabilities to individual characteristics to help decision makers to achieve a correct interpretation. If, on the other hand, the same person frequently deviates in either directions, other ways of de-biasing must be found.

This study also aims at identifying factors that lead to occurrence of the ratio bias. We study both problem-specific factors, as well as individual factors. Most studies on the ratio bias so far have been conducted with quite small probabilities. We systematically study small, medium and large probabilities. We also control for demographic factors such as gender (Booij et al. 2010; Schmidt and Traub 2002) and age (Bonsang and Dohmen 2015; Harbaugh et al. 2002). These factors are known to have an impact on decision making behavior under risk, although some well-established stereotypes have recently come under scrutiny (Filippin and Crosetto 2016). Therefore, they might also impact the occurrence of a ratio bias. Following the discussions about the impact of incentives on occurrence of the ratio bias (Dale et al. 2007; Denes-Raj and Epstein 1994; Lefebvre et al. 2011), we use an incentivized task.

In line with previous literature on the ratio bias, this study is exploratory in nature. While we study the occurrence of the ratio bias across a wider range of conditions than previous studies, there is still no established theory that would allow us to formulate clear hypotheses indicating under which conditions this bias is more likely to occur. However, creating and analyzing a systematic set of controlled observations is in our view an important first step towards theory development.

The remainder of the paper is structured as follows. In Sect. 2, we provide a comprehensive overview of the relevant literature. This survey allows us to identify gaps in existing research and to derive our specific research questions. Section 3 describes the experiment we conducted, the results of which are presented in Sect. 4. Section 5 concludes the paper by discussing its main result, offering some final conclusions, and by providing an outlook onto future research.

2 Literature review and research questions

2.1 Overview

The first empirical study that found evidence for the ratio bias phenomenon was conducted as early as 1975 (Piaget and Inhelder 1975) with an experimental design called the two-urn choice task, but the sample only included children. More frequent studies of the ratio bias using different experimental designs started emerging in the early 1990s [for example, (Denes-Raj and Epstein 1994; Kirkpatrick and Epstein 1992; Miller et al. 1989)]. The ratio bias is of particular interest for the field of health economics (due to its important implications for policy), but can potentially be a very generally relevant phenomenon in any decision-making context.

So far, the ratio bias seems to be one manifestation of human irrationality and of a “conflict between the head and the heart” (Pacini and Epstein 1999). The ratio bias effect has not only been found in studies that used lotteries in the gain domain, but also in the assessment of risks involving undesirable events, such as probabilities of death (Bonner and Newell 2008; Pinto-Prades et al. 2006; Yamagishi 1997). According to the ratio bias effect, individuals choose the lower probability of winning instead of the higher one, simply because of the way the ratios are expressed. This means that they have full knowledge of the probabilities involved, and that there is neither an issue of missing information as in ambiguity aversion (Frisch and Baron 1988), nor of any particular complexity in the tasks (Huck and Weizsaecker 1999). It appears that individuals are biased towards choices with large absolute frequencies of success, rather than large probabilities of success.

Even though a wide array of experiments have replicated the ratio bias in the past, more recent publications have started to question the significance of this effect (Lefebvre et al. 2011) in more realistic settings, in particular if actual monetary incentives are involved (Dale et al. 2007).

The typical experimental design used in most empirical studies presents a series of pairs of lotteries to subjects and asks them to choose their preferred lottery in each pair. The two lotteries of each pair can have either the same or different probabilities of winning, but there is always one lottery expressed as a ratio of small numbers and one as a ratio of higher numbers. To make these lotteries more understandable for subjects, they are often described as urns or bags that contain a defined number of items of different colors. The existence of the ratio bias effect was not only observed in scenarios that presented two equal probabilities, but also in scenarios in which the high-number ratio offered a smaller probability of winning than the low-number ratio. For example, in a study by Denes-Raj and Epstein (1994), nearly half of the subjects preferred the option 8/100 to the option 1/10. Alonso and Fernandez-Berrocal (2003) performed a study among 105 Spanish high school students that also provided strong evidence for the ratio bias for even larger differences between probabilities. In that study, participants preferred ratios representing probabilities of 10% over 20% and even 30%.

Further studies investigated the ratio bias phenomenon in less abstract contexts. As early as in the late 1980s, Miller et al. (1989) found that two events with the same low probability, but expressed as different ratios, elicited different levels of suspiciousness among subjects. For example, subjects had to imagine a situation in which a child randomly selects a cookie from a jar containing one chocolate-chip cookie and 19 oatmeal cookies and another jar containing ten chocolate-chip cookies and 190 oatmeal cookies. In both cases, the child draws a chocolate-chip cookie, which is the preferred type. When subjects were asked how suspicious they would be that the child peeked into the jar before drawing a cookie, they were more suspicious in the case of the jar with the 1 out of 20 probability than the jar with the 10 out of 200 probability.

Several researchers have introduced incentives in their experiments. While Kirkpatrick and Epstein (1992) and Dale et al. (2007) were able to confirm the existence of the ratio bias when real incentives were offered to subjects, Lefebvre et al. (2011) found no evidence for the ratio bias in their experiments that included salient incentives.

In the field of medical decision making, Yamagishi (1997) found that causes of death were considered as riskier by subjects when the mortality rates were expressed as deaths per 10,000 people rather than deaths per 100 people, even when the actual mortality rate was higher in the latter case. Bonner and Newell (2008) found that cancer is perceived as riskier when subjects are told that 36,500 people die from cancer every year as opposed to 100 people die from cancer every day. Similarly, Pinto-Prades et al. (2006) found a ratio bias in the context of choices between medical treatments. Subjects are less willing to accept a certain risk of death when frequencies were expressed as deaths per 1000 people rather than deaths per 100 people.

Denes-Raj et al. (1995) found evidence for the ratio bias in a series of three experiments in which the context was described with different degrees of detail and emotional involvement. One scenario was described with vivid details, another one in a mechanical and impersonal way, and the third one in an intensely emotion-arousing way. Passerini et al. (2012) replicated this study and also found the existence of a ratio bias. However, they considered some task and scenario descriptions to be misleading, since they encourage the activation of the experiential system and inhibit the rational system. When they ran slightly adjusted experiments (e.g., removing one phrase from the text) in addition to the original experiment, the ratio bias became much less frequent or even disappeared.

In a study by Rudski and Volksdorf (2002), the ratio bias was much more prevalent when choices were presented graphically as opposed to textually. Therefore, they argue that the presentation of a choice problem has a significant effect on the balance of automatic versus analytic cognition. Passerini et al. (2012) replicated the original study, but found an almost equal presence of the ratio bias effect in both the textual and the graphical version. The same authors also replicated a study by Rudski and Volksdorf (2002) with some small modifications. They eliminated one sentence that they deemed problematic. With this small change, the ratio bias disappeared, which suggests that occurrence of the ratio bias depends on the way in which a problem is described. They also replicated another study by Alonso and Fernandez-Berrocal (2003), again once in its original form and additionally with slightly modified texts. The findings of the replicated original study (using the same texts) were consistent with the previous findings, but the ratio bias disappeared when one sentence was removed and another replaced. This means that also in this experiment, like in the two previously mentioned ones, the task description had considerable influence on the occurrence of the ratio bias.

In addition to the effect of task descriptions, Passerini et al. (2012) consider the absence of a third option expressing indifference as a major limitation in most ratio bias studies. In the absence of an indifference option, subjects have no possibility of giving the correct answer in a case of two equal probabilities.

Lefebvre et al. (2011) also provide a critical perspective on existing research on the ratio bias. In particular, they identified four main aspects of previous studies that question the generalizability and relevance of the ratio bias phenomenon. First, most experiments were conducted without salient and strong financial incentives. Second, subjects were often explicitly told to follow their gut feeling and the experiments involved direct observation by the researcher. The authors claim that this might lead to an experimenter demand effect. Third, many of the studies used psychology students as subjects, who may be particularly prone to experimenter demand effects. Last, many experiments did not control for potential error rates or noise, which might lead to an overestimation of the ratio bias effect. The authors performed four experiments addressing these weaknesses to defend their claims. In one of them, they observed a very weak ratio bias effect, while in the other three experiments, no ratio bias was found. These results cast serious doubts on the relevance of the ratio bias phenomenon.

Interestingly, one recurring finding is that the effect is considerably stronger if individuals estimate the behavior of others (others-perspective) rather than their own behavior (self-perspective). This is most likely because most people are convinced that they are more rational than others, and therefore assume that others will exhibit a stronger ratio bias effect than themselves (Alonso and Fernandez-Berrocal 2003; Denes-Raj et al. 1995; Kirkpatrick and Epstein 1992; Passerini et al. 2012).

2.2 Research questions

In this study, we intend to address several issues identified in previous literature. The first is the set of limited choices used in many previous studies. To the best of our knowledge, only two studies (Dale et al. 2007; Lefebvre et al. 2011) tried to compare the ratio bias to random errors made in situations in which the high-number lottery is objectively better, and none of them combined this reversed direction of deviations with the possibility to indicate indifference. By allowing for both possibilities, we can more precisely identify whether there is a systematic bias in favor of high-number alternatives that exceeds the frequency of random errors in the other direction. Our first research question is therefore:

RQ1: Is there a systematic bias towards high-number alternatives in the presence of different choice options that allow for the selection of either the high-number or the low-number alternative, as well as indicating indifference?

Although previous research has addressed the presentation of lotteries in various formats, quantitative characteristics of the choice problems (such as the level of probabilities involved) were, somewhat surprisingly, not systematically analyzed yet. Furthermore, previous research did not consider individual characteristics of subjects such as gender or age that have been shown to influence decision behavior under risk. Our second research question is, therefore,

RQ2: Which factors lead to more or less frequent occurrence of the ratio bias?

More specifically, RQ2 consists of two separate sub-questions:

RQ2a: Do probability levels, and the difference between probabilities of the two alternatives, influence occurrence of the ratio bias?

RQ2b: Do characteristics of the subject such as gender and age influence occurrence of the ratio bias?

Finally, the variety of choice problems we use in our experiment allows us to study whether subjects behave in a similar way across different settings. This will allow us to see if a tendency towards the ratio bias can be considered a stable personal trait, or a random behavior that occurs in some settings and not in others. We therefore formulate our third research question in general as follows:

RQ3: Is there consistent behavior across decision problems, in which the objective relation of high-number and low-number alternatives is different?

Behavior can be consistent in several ways: a subject might consistently show a bias towards the high-number or the low-number alternative, regardless of whether the objectively correct solution is indifference or preference for the other alternative. Alternatively, a subject might consistently show a certain strength of deviation (indifference or preference for the wrong alternative) both in problems in which the high-number or the low-number alternative is objectively better. We denote these two types of behavior as a “Direction effect” and a “Common deviation effect”. Furthermore, consistency could be related to the drivers of behavior (characteristics of the problem or the subject) identified under RQ2, or might be independent of them. We therefore formulate three sub-questions of RQ3:

RQ3a: Do subjects exhibit a direction effect, i.e., a consistent tendency towards the high-number alternative in problems in which both alternatives are objectively equivalent as well as in problems where the low-number alternative is objectively better?

RQ3b: Do subjects exhibit a common deviation effect, i.e., a consistent tendency towards indifference or towards preference for the objectively worse alternative across problems in which the high-number or the low-number alternative is objectively better?

RQ3c: Are consistent deviations from rational behavior across different problem settings related to the same characteristics of the problem or the subject?

A similar research question as RQ3a could also be formulated for deviations in favor of the low-number alternative, but since we are mainly interested in deviations favoring the high-number alternative, we do not study this aspect further.

3 Experiment

3.1 Task and incentives

In the main part of the experiment, subjects were exposed to 27 choice problems involving two lotteries, in which probabilities were represented as a ratio of high numbers for one alternative, and a ratio of low numbers for the other alternative. Subjects could indicate preference for either one of the lotteries or indifference. In nine out of these 27 problems, the lottery represented via low numbers had the higher probability of winning (representing the usual setup of ratio bias experiments), in nine problems, this relation was reversed and in nine problems, the two probabilities were the same. In each of these groups, three problems each involved small, medium and large probabilities. To avoid a cognitive overload of subjects, which could have resulted in the experiment merely being a “guessing game”, we decided not to include non-salient ratios (such as 812:1392 or 117:1404). Table 1 provides an overview of the stimulus material used.

To minimize the experimenter demand effect, and to control for learning effects, two randomizations were performed. For each of the 27 problems, the order in which alternatives were presented to subjects was randomly chosen. In some problems, the high-number alternative was presented first, in the others, the low-number alternative was presented first. Across subjects, the ordering chosen for a given problem was the same, i.e., in a given problem, all subjects saw the high-number (or the low-number) alternative first. Furthermore, the order in which problems were presented was randomized across subjects. For this randomization, three versions of the questionnaire were prepared that contained the same 27 problems in different sequence.

Table 1 Problems used in the experiment

We provided a substantial, performance-dependent monetary incentive to subjects. This addresses a limitation of most previous experiments, which were conducted without or with very low (for example, Dale et al. 2007; Denes-Raj and Epstein 1994; Kirkpatrick and Epstein 1992) financial incentives. Thus, we followed the recommendations of Camerer and Hogarth (1999), who found that higher incentives have the largest effects in judgment and decision tasks, such as choices between lotteries.

The instructions presented the ratios as numbers of red tickets in a bag that contained the indicated total number of red and white tickets. All lotteries were actually played out by simulating them in an Excel sheet. If a red ticket was drawn in the chosen lottery, the participant earned a point. If a white ticket was drawn, the participant did not earn a point. This means that participants could theoretically earn between 0 and 27 points, and the expected number of points is maximized by always choosing the lottery that provides the highest probability of winning. The five participants who obtained the highest number of points received an Amazon.com voucher worth 50 Euro. Since the experiment was conducted in a class where attendance was compulsory for students, no show-up fee was paid to participants.

3.2 Experiment and subjects

The entire experiment was carried out as a paper and pencil exercise at the beginning of an undergraduate lecture in Business Administration at the University of Vienna. Students were not informed in advance that an experiment would take place in the session. The experiment was deliberately conducted at the beginning of the lecture rather than after the lecture, to avoid participants making random choices to be able to leave the lecture hall as quickly as possible, or slower participants getting distracted by those who already finished and therefore left the lecture hall earlier.

Participants were informed upfront about the possibility to win a price, to minimize the drop-out rate. Then a questionnaire was handed out to subjects, which contained instructions, the 27 choice problems indicated in Table 1, and demographic questions about age and gender. Furthermore, participants also had to state their e-mail address so that the winners of the vouchers could be contacted after the experiment. Subjects were asked not to use a calculator or mobile phone when making their choices. Instead, they were asked to trust their gut feeling and to avoid spending too much time deliberating each of the tasks.

A total of 136 undergraduate students enrolled in a Business Administration class at the University of Vienna (Faculty of Business, Economics and Statistics) participated in the study, of which 82 (60.3%) were female and 54 (39.7%) were male. The age of the subjects ranged from 18 to 45 years, with a mean of 22.1 years and a median of 21 years.

The three versions of the questionnaire containing problems in different sequence were handed out in such a way that each person had a different version of the questionnaire than the next person sitting on the left or right side of them. This prevented participants from copying answers. Versions were almost evenly distributed, 46 subjects received one version of the questionnaire, and 45 each one of the two other versions. A \(\chi ^2\) test indicted that for none of the 27 choice problems, the distribution of answers was significantly influenced by the arrangement of questions at the \(5\%\) level. For one out of 27 problems, the difference was significant at the \(10\%\) level (\(p=0.0973\)).

Since the problems shown in Table 1 involve both “easy” numbers such as 80:90, and more “difficult” numbers such as 77:92, we also tested whether this apparent difficulty makes a difference. We compared the results of the first question in each of the nine blocks in Table 1, which contains the “easiest” numbers, to results of the other questions in the same block using a \(\chi ^2\) test. The only two significant differences were found in the upper left block of Table 1, i.e., in problems with small probabilities in which the low-numbered alternative is better. However, here the fraction of correct answers in the “easy” problem 1:9 vs. 9:90 was in fact lower than for the other two problems in this group. We therefore conclude that difficulty of interpreting the numbers involved is not an issue.

4 Results

4.1 Systematic bias (RQ1)

A systematic bias towards the high-number alternative means that irrational decisions in favor of the high-number alternative occur more frequently than irrational decisions in favor of the low-number alternative. To analyze this research question, we have to distinguish between problems in which the objectively correct solution is preference for one alternative, and problems in which the objectively correct solution is indifference. In the first case, there are two possible deviations in the same direction, indifference, which can be considered a weak bias, and preference for the wrong alternative as a stronger bias. In the latter case, deviations in both directions are possible.

The questionnaire contained nine problems in which the high-number alternative had a higher probability of winning than the low-number alternative, and nine problems with the opposite arrangement. A ratio bias exists if deviations occur more frequently when the low-number is objectively better than when the high-number alternative is objectively better. Table 2 provides an overview of the frequency of response types in these two types of problems.

Table 2 Occurrence of bias effects

We indeed find that deviations in favor of the high-number alternative occur more frequently than deviations in favor of the low-number alternative. A \(\chi ^2\) test indicates that this difference is highly significant. We therefore can conclude that a ratio bias is present in our data. Our results are roughly comparable to the experiments of Dale et al. (2007), who found about 40% of instances of deviations in favor of the high-number alternative, and about 25% deviations in favor of the low-number alternative. In contrast, the numbers found by Lefebvre et al. (2011) were considerably lower.

In the problems in which both alternatives represent the same probabilities, any preference indicates a deviation. Table 3 provides an overview of choices in these problems. We again find that a significantly higher number of deviations is in favor of the high-number alternative rather than in favor of the low-number alternative. For both alternatives, the fraction of occurrences of incorrect preferences is similar to the fraction of incorrect preferences in Table 2.

Table 3 Frequency of choices in problems with equal probabilities

We therefore can answer RQ1 positively and conclude that a systematic deviation in favor of the high-number alternative exists both in cases in which the low-number alternative is objectively better, and in cases in which the two alternatives are equivalent.

4.2 Influence factors (RQ2)

We again have to distinguish between problems in which the objectively correct answer was preference for one alternative, and problems in which the normatively correct answer was indifference.

In the first group of problems, responses can be ordered in a meaningful sequence from rational behavior to strong deviation. We therefore used an ordered multinomial regression approach employing a random effect model (with respondent as grouping variable) to account for possible effects of unobserved respondent characteristics. All estimations were performed using function clmm from the ordinal package (Christensen 2015) of the R system (R Core Team 2015). Response type is the dependent variable. A dummy variable indicates whether the deviation is in favor of the high-number or the low-number alternative. Problem characteristics refer to the absolute difference between the two probabilities, and the probability level involved. To allow for nonlinear effects, probability levels were coded via two dummy variables for small probabilities (of about 0.1), and large probabilities (of about 0.8), the moderate probabilities of about 0.5 thus formed the reference case. Demographic characteristics are gender and age.

We estimate three models. The first model (\(m1_{all}\)) uses all problems in which one alternative was objectively better, and includes the dummy variable for problem type. To study whether problem characteristics or individual characteristics have different effects depending on the direction, we estimated two additional models. Model \(m1_{high}\) was estimated for deviations in favor of the high-number alternatives, model \(m1_{low}\) for the opposite direction. We had to use two separate models, since the inclusion of interaction terms between problem type and other variables led to considerable numerical problems in the estimation.

Table 4 Results of multinomial regression analysis

Table 4 summarizes the results of this analysis. A likelihood ratio test indicates that all models provide a significantly better fit than a baseline model without the independent variables (but including the random effects). The model using all problems shows a significant effect of problem type in the expected direction. Deviations in favor of the low-number alternative are significantly less likely to occur than deviations corresponding to a ratio bias. Deviations, and in particular the ratio bias, are also more likely if the probability involved is small. Furthermore, male subjects are less prone to exhibit either type of deviation. This effect is also clearly visible in the distributions of answers by gender shown in Fig. 1.

Fig. 1
figure 1

Distribution of responses according to gender for ratio bias (left) and reverse deviations (right)

There is a significant effect of the probability levels involved in both models. In the case of the ratio bias, problems involving small probabilities lead to a significantly higher occurrence of the bias. Deviations in the reverse direction are more likely in problems involving large probabilities, although this effect is not as significant as for the ratio bias. Surprisingly, neither model indicates an effect of the difference in probabilities between the two lotteries, probably due to the small differences in probabilities. Given the relatively small sample size, the very weakly significant effect of age in the first model should not be considered relevant.

In problems with equal probabilities, responses cannot be ranked by rationality and we can no longer use an ordered multinomial model. We therefore followed a two-step approach to analyze this data. We first estimated a logistic regression model (\(m2_{bias}\)) on a dummy variable indicating whether any deviation from rational choice (indifference) occurred. In the second step, we considered only the cases in which a deviation was present, and estimated a logistic regression model (\(m2_{high}\)) in which the dependent variable identified preference for the high-number alternative. In both cases, we again used a random effect model to take into account possible unobserved characteristics of the subjects.

Table 5 Regression results for problems with equal probabilities

Table 5 shows the results of this analysis. The regression on occurrence of a bias only shows a weakly significant effect of gender. In total, the fit of this model to the data is not significantly better than the fit of the null model. There is no gender effect on the type of bias, but as was already visible in the previous results, the ratio bias seems to be particularly frequent in problems involving small probabilities.

The fact that gender has an influence only on whether a deviation from rational behavior occurs, but not on its direction, is clearly visible in Fig. 2. The fraction of responses consistent with rational behavior (indifference) is larger for male subjects, but the relative size of the two different types of deviations is very similar across genders.

Fig. 2
figure 2

Distribution of responses in problems with equal probabilities by gender

4.3 Behavior across problems (RQ3)

To study the relationship between behaviors in the three settings (equivalence, high- and low-numbered alternative objectively better), we counted the number of occurrences of rational behavior and different deviations in the nine problems of each settings for each subject and compared these frequencies for the same subject across different settings.

4.3.1 Direction effect (RQ3a)

To study the direction effect, we have to compare deviations in favor of the high-number alternatives when the low-number alternative was objectively better, to deviations in favor of the high-number alternative in the setting with equal probabilities. In the former case, deviations can take the form of indifference, or of preference for the high-number alternative, which we analyze separately.

Fig. 3
figure 3

Consistency of deviations in favor of the high-number alternative across problem types

The left part of Fig. 3 shows the number of cases in which indifference was indicated although the low-number alternative was better vs. the number of preferences for the high-number alternative when both were objectively equivalent. The right part of the figure shows the number of mistaken preferences for the high-number alternative on the y-axis. In the bubble charts used in this and the following figures, the sizes of the circles (and the numbers inscribed in them) indicate the number of subjects who exhibited the respective deviation for the number of problems indicated on the two axes.

As the left part of Fig. 3 shows, indifference rather than preference for the low-number alternative occurs quite independently of whether the subject prefers the high-number alternative in the setting of objective equivalence. In particular, many subjects indicated indifference rather than preferring the low-number alternative, but never deviated from indifference in favor of the high-number alternative. In contrast, as the right part of Fig. 3 shows, subjects who preferred the high-number alternative when the two alternatives were equivalent frequently also preferred the high-number alternative when the low-number alternative was better. Eight subjects even exhibited this behavior in all nine problems of each type. This leads to a highly significant correlation between the number of preferences for the high-number alternative between both types of problems.

4.3.2 Common deviation effect (RQ3b)

Here we study whether subjects consistently exhibit the same type of deviation (indifference or preference for the wrong alternative) across problems in which the low-number or the high-number alternative is objectively better. Figure 4 analyzes the two types of deviations separately.

Fig. 4
figure 4

Consistency of strength of deviations across problems in which high-number and low-number alternative is better

As the left part of Fig. 4 shows, a majority of 72 subjects never indicated indifference in either setting. However, subjects who indicated indifference when one alternative was better did so quite consistently across setting. One subject even indicated indifference in all nine problems in both settings. This leads to a very high correlation across the two settings.

In contrast, as shown in the right part of Fig. 4, subjects tended to indicate preference for the wrong alternative only in one of the settings, but not in the other. For example, nine subjects indicated preference for the low-number alternative in all nine problems in which the high-number alternative offered a better chance, but these subjects never exhibited the standard ratio bias. On the other hand, several subjects exhibited a ratio bias at several times, but never deviated from rational behavior in the reverse direction. Since there were also some subjects exhibiting deviations in both directions several times, the correlation in this case is not significant. This allows us to conclude that the ratio bias is a particular form of deviation from rational behavior, that is distinct from random mistakes in the opposite direction.

4.3.3 Common influence factors (RQ3c)

To test whether the consistencies we found for RQ3a and RQ3b are caused by similar characteristics of problems or subjects influencing behavior in both settings, or whether they are the result of some other commonalities, we performed a regression analysis in which we analyzed whether indifference or preference for the high-number alternative in a setting in which the low-number alternative is actually better can be explained by the influence factors studied in RQ2, or by behavior in the other settings. However, in estimating this model, we have to avoid endogeneity effects. It is not possible to directly combine behavior in other settings and additional factors in one model. If behavior in the other settings is also influenced by the same factors, then the model would contain “independent” variables which themselves depend on other independent variables.

To avoid these endogeneity problems and isolate the possible effects of common factors in different types of problems, we used a two-step approach. We first determined which responses would be predicted by external factors using models m1 and m2, respectively. We then counted the number of “unexpected” choices, i.e., responses that differ from those predicted by models m1 and m2. In the second step, we used these unexpected choices rather than the total number of choices as independent variables to explain choices in the other setting. This model now also can include the other factors as independent variables, since the unexpected choices by definition are not influenced by these factors.

We estimated two such models to explain occurrence of indifference (model \(m3_{weak}\)) or preference (model \(m3_{strong}\)) for the high-number alternative in settings in which the low-number alternative is better. If behavior in different problems is driven by the same underlying set of factors, then the model should not indicate a significant relationship between behavior in the different problems, all commonalities in behavior should be explained by the other factors. We therefore compared the model including both the explanatory factors and the unexpected behavior in other problems to a baseline model, which includes only the (common) explanatory factors using a likelihood ratio test. Results of this analysis are shown in Table 6. In this table, “N weak reverse” and “N strong reverse”refer to the number of statements of indifference and preference for the low-number alternative in problems in which the high-number alternative was better, and “N pref high” and “N pref low” to the number of preferences for the high- and low-number alternative in problems where the two alternatives were equivalent.

Table 6 Logistic regression of weak and strong ratio bias on unexpected behavior in other problems and influence factors

Model \(m3_{weak}\) indicates a significant relationship between the number of indifference statements when the high-number alternative is in fact better and the number of indifference statements if the low-number alternative is better. In contrast, preference for the high-number alternative can reliably be predicted by the same factors across problems in which the low-number alternative is objectively better, and those in which both alternatives are equivalent. Preference for the high-number alternative in problems where alternatives are equivalent (variable N pref high) has only a very weakly significant influence on preference for the high-number alternative in model \(m3_{strong}\). This behavior can thus almost completely be explained by the common explanatory factors. The likelihood ratio test indicates that adding the variables representing the numbers of choices in other problems does not improve fit of the model significantly over the baseline model.

5 Discussion and conclusions

Our empirical results provide clear answers to our research questions. Concerning RQ1, we obtained clear evidence that there is a systematic effect in favor of high-number alternatives across different problem types. Our data confirm the results obtained by Dale et al. (2007), and is in contrast to the results of Lefebvre et al. (2011), which are to the best of our knowledge the only two studies so far comparing the ratio bias to random errors in the opposite direction. We also find that strict preference for the “wrong” alternative in both directions occurs more frequently than indifference, and that the difference between directions is larger for strict preference than for indifference. Thus, we conclude that the ratio bias is indeed a systematic phenomenon, which seems to be independent whether the objectively correct solution is indifference or preference for the other alternative.

Referring to RQ2, we find that gender has a significant effect on the occurrence of a ratio bias. This corresponds to studies on other effects in decision making under risk like probability weighting (Fehr-Duda et al. 2006) or loss aversion (Schmidt and Traub 2002), which were also shown to affect women more frequently than men. Concerning problem characteristics, the level of probabilities involved has a consistent and significant effect, the ratio bias is more likely with small probabilities. Therefore, previous studies that focused on lotteries involving small probabilities might have overestimated the overall impact of the ratio bias. Interestingly, we did not find a significant effect of the magnitude of probability differences. This result is probably due to the fact that probability differences in our experiment were quite small. Other experiments found ratio bias phenomena even when probability differences were in the range of 20–30% (Alonso and Fernandez-Berrocal 2003).

Concerning RQ3, we find clear evidence that indifference on the one hand, and preference for the high-number alternative on the other hand, are two distinct phenomena. Subjects who deviated from rational behavior by indicating indifference quite frequently did so both when the high-number or the low-number alternative was objectively better. In contrast, subjects indicating irrational preference for the high-number alternative did so in situations in which the other alternative was better or both were objectively equal, but did not prefer the low-number alternative when the high-number alternative was objectively better. These subjects thus exhibit a systematic deviation in favor of high-number alternatives. Furthermore, our analysis of behavior across different types of problems has shown that irrational indication of indifference is driven by different factors than preference for the high-number alternative. To the best of our knowledge, this is a new result that allows to better delineate the ratio bias as a systematic phenomenon from other forms of deviations from rational behavior.

These results have several implications for the practice of decision analysis. Since we have found that some subjects have a tendency towards indicating indifference, this option should be avoided when eliciting preferences or probabilities unless it can be assumed that it is the objectively correct answer. Furthermore, we have found that decisions involving low probabilities are more likely to be affected by a ratio bias, so elicitation methods should at least also involve problems with high probabilities if questions about low probabilities cannot be avoided. Furthermore, decision analysts should, in general, be aware that answers might be affected by a ratio bias and use different ways of representing probabilities and check for consistency across these representations.

Although our experiment has provided some clear answers to the research questions we have posed, it is certainly not without limitations. The additional option of indicating indifference seems to have caused a number of subjects to choose that option across a variety of problem settings. However, it is not clear whether these subjects really wanted to indicate that the two alternatives are equivalent for them, or whether they interpreted indifference as a kind of “I don’t know” answer. Thus, further experiments could add this fourth possibility. Furthermore, we have conducted our experiment with a sample from a quite homogeneous student population, therefore its results cannot be generalized to the population at large. Conducting similar experiments with different groups of subjects would also allow to broaden the set of demographic variables that can be controlled for. In addition, individual characteristics related to the way subjects approach decision making in general such as decision-making styles (Scott and Bruce 1995) or analytical vs. intuitive orientation (Allinson and Hayes 1996) could be examined in future studies on the ratio bias, as well as subjects’ level of numeracy (i.e., their understanding of basic mathematical concepts). This would shed more light on the conditions under which the ratio bias occurs at the individual level.