The Monty Hall dilemma starts with an initial choice among three alternatives, one of which conceals a prize. But prior to being shown whether the chosen alternative has the prize, a subject is shown that one of the unchosen alternatives is not the one with the prize. The subject is then given the option to stay with the initial choice or switch to the remaining alternative. Given that the unchosen alternative that was revealed is never the one with the prize, switching to the remaining alternative would increase the subject’s chances of winning. Why this is so is not intuitively obvious. It becomes clearer once it is realized that the original choice will be correct only one third of the time, so that once an empty unchosen door is revealed by the experimenter (who purposely avoids opening the door with the prize), switching will increase the chances of winning to two thirds. Yet most subjects choose not to switch, and they stay even after considerable experience with this task (Granberg & Brown, 1995).
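The one-third versus two-thirds argument can be checked by enumerating every equally likely combination of prize location and initial pick. The following is a minimal sketch, assuming that the prize location and the initial pick are independent and uniformly distributed and that the experimenter always reveals an empty unchosen door; the function name is hypothetical and is used only for illustration.

```python
from fractions import Fraction
from itertools import product


def monty_hall_win_probabilities():
    """Return exact (P(win | stay), P(win | switch)) for the three-door task."""
    doors = range(3)
    stay = Fraction(0)
    switch = Fraction(0)
    for prize, pick in product(doors, doors):
        # Assumption: prize location and initial pick are independent and uniform,
        # so each of the nine combinations has probability 1/9.
        weight = Fraction(1, 9)
        if pick == prize:
            # The initial pick was correct: staying wins, switching loses.
            stay += weight
        else:
            # The experimenter must reveal the only other empty door,
            # so the remaining door holds the prize: switching wins.
            switch += weight
    return stay, switch


stay, switch = monty_hall_win_probabilities()
print(f"stay: {stay}, switch: {switch}")  # stay: 1/3, switch: 2/3
```

Exact fractions are used rather than simulation so that the result does not depend on sampling noise.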

To determine whether suboptimal choice with this task is a general phenomenon, Herbranson and Schroeder (2010) created a nonverbal version of the task and gave it to both humans and pigeons. Humans were given 200 trials with feedback to observe whether extended experience with the task would increase the use of an optimal switching strategy. The results were very similar to those of Granberg and Brown (1995), in which humans eventually learned to switch, but did so only about two thirds of the time. That is, they tended to match the probabilities of being correct for staying and switching, whereas the optimal strategy would have been to switch all of the time. This tendency to match the probabilities associated with staying and switching is typical of humans when outcomes are probabilistic (Fantino & Esfandiari, 2002). That is, humans have a tendency to search for a strategy that will work all of the time.

Interestingly, Herbranson and Schroeder (2010) found that even though pigeons initially showed a stronger bias to stay with their initial choice than did humans, the pigeons did acquire the switching strategy, and after 30 sessions of training they used it almost exclusively. From these results, it appears that pigeons, but not humans, learn to effectively solve the task.

One reason for suboptimal choice on the part of humans is the mistaken belief that when there are two options and the correct response is unknown, the probability of reinforcement for staying or switching is the same (50%). The tendency to perceive the probabilities associated with the two remaining doors in the Monty Hall dilemma as being equal has been attributed to an equiprobability bias (Lecoutre, 1992). But it does not explain the overwhelming initial tendency that humans have to stay.

In the Monty Hall dilemma, humans may be more likely to stick with their initially chosen door because of the illusion of control (Langer, 1975). That is, people often believe that they have some influence over random events. They picked the door that they picked for a reason, so why should they switch? Another reason for staying with their initial choice is that they would have greater regret if they were to switch away from their initial choice and lose than if they were to stay and lose (Gilovich, Medvec, & Chen, 1995). Related to this fear of regret by humans is the feeling of ownership of their original choice. This ownership effect, commonly referred to as the endowment effect, can be seen when people request more money to give up an object that they have been told that they own than they would pay for it if it were not theirs (Kahneman, Knetsch, & Thaler, 1986; Thaler, 1980).

Nonhuman animals have also shown evidence of an endowment effect (Brosnan et al., 2007; Lakshminaryanan, Chen, & Santos, 2008). For example, primates prefer to keep a treat that has been given to them rather than exchange that treat for an equally preferred (or more preferred) treat. This phenomenon is likely to result from the more general tendency of animals (including humans) to be loss-averse. An endowment-like effect (sometimes referred to as the sunk cost effect) also has been found in pigeons. When pigeons start responding on a schedule of reinforcement, they often prefer to continue with that schedule rather than switch to a better schedule (i.e., one that predicts reinforcement sooner; Magalhães & White, 2014; Pattison, Zentall, & Watanabe, 2012).

Support for the influence of ownership on human performance in the Monty Hall dilemma was found by Granberg and Dorr (1998). They reported that subjects who had the initial choice made for them tended to switch more often than those who made the initial choice themselves. Granberg and Dorr reasoned that when the choice was made for them, subjects did not have a sense of endowment, and thus were more prepared to switch to the other alternative.

Stagner, Rayburn-Reeves, and Zentall (2013) proposed that pigeons may not show an extended tendency to stay with their original choice because they did not take ownership of that choice. Stagner et al. asked whether pigeons that were required to “invest” more in their initial choice by making 20 pecks rather than one would perform more like humans. However, they found that not only did the pigeons not stay with their original choice more often than the normal one-peck pigeons, but they actually learned to switch faster. Stagner et al. concluded that the added response requirement may have made the initial choice more salient or made the relative consequences of staying versus switching more important.

In the present research, we asked whether pigeons, too, would be more likely to switch if they did not make the initial choice—that is, if they were offered only one of the three alternatives and were then permitted to stay with that original alternative (in which case there was a 33% chance of reinforcement) or switch to the other alternative (in which case there was a 67% chance of reinforcement).

Method

Subjects

Fourteen pigeons, eight White Carneaux and six homing pigeons (Columba livia), served as subjects. All of the pigeons had taken part in an experiment involving a simultaneous red/green color discrimination in which one color served as the positive stimulus and the other color served as the negative stimulus for the first half of each session, at which point there was a reversal of the discrimination. The pigeons were maintained at 85% of their free-feeding weight and were allowed free access to water and grit. The subjects were individually housed in wire cages in a colony room maintained on a 12:12-h light:dark cycle. All pigeons were maintained in accordance with a protocol approved by the Institutional Animal Care and Use Committee at the University of Kentucky.

Apparatus

The experiment was conducted in a BRS/LVE (Laurel, MD) sound-attenuating standard operant test chamber measuring 36 cm high, 30 cm from the response panel to the back wall, and 30 cm across the response panel. Three circular response keys (2.5 cm in diameter) were aligned horizontally on the response panel and separated from each other by 6.0 cm. A 12-stimulus projector with lamps (General Electric, 1829) that could project green and white hues was mounted behind each response key. Mixed-grain reinforcement was provided from a raised and illuminated grain feeder. Reinforcement consisted of 1.5-s access to mixed grain. The experiment was controlled by a microcomputer and interface located in an adjacent room.

Training

Subjects were randomly assigned to one of two groups. For subjects in the free choice group, each trial began with the onset of three white response keys. A single peck to any key turned off the three keys for 1.0 s and then turned on the selected key and one of the initially unchosen keys, both green. A single peck to the initially chosen key resulted in reinforcement with a probability of .33. A single peck to the initially unchosen key resulted in reinforcement with a probability of .67.

For subjects in the forced group, each trial began with the onset of one white key, selected at random, with the constraint that each key was presented equally often over the course of a session. A single peck to that key turned off the key for 1.0 s and then turned on the key that had just been pecked and one of the initially unpresented keys, both green. Another single peck to the initially pecked key resulted in reinforcement with a probability of .33. A single peck to the key that initially had not been presented resulted in reinforcement with a probability of .67.

For both groups, trials were separated by a 5-s intertrial interval that was illuminated by a houselight. All sessions consisted of 96 trials and were conducted six days a week, for a total of 60 sessions.
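The trial structure for the two groups can be summarized in a short simulation. This is only a sketch under stated assumptions, not the control program used in the experiment: the names `run_trial`, `first_peck`, and `second_peck` are hypothetical, the alternative key is assumed to be equally likely to be either of the other two keys, and the constraint that each key was presented equally often within a session is replaced by simple random selection.

```python
import random

KEYS = ("left", "center", "right")
P_STAY, P_SWITCH = 1 / 3, 2 / 3  # reinforcement probabilities given in the text


def run_trial(group, second_peck, rng, first_peck=None):
    """Simulate one trial and return (switched, reinforced).

    group       -- "free" or "forced"
    second_peck -- hypothetical policy: (initial, alternative) -> chosen key
    rng         -- a random.Random instance
    first_peck  -- hypothetical policy for the initial peck (free group only)
    """
    if group == "free":
        if first_peck is None:
            raise ValueError("the free choice group needs a first_peck policy")
        initial = first_peck(KEYS)
    elif group == "forced":
        # Simplification: the single white key is drawn at random on each trial.
        initial = rng.choice(KEYS)
    else:
        raise ValueError(f"unknown group: {group!r}")

    # Assumption: the alternative is equally likely to be either other key.
    alternative = rng.choice([k for k in KEYS if k != initial])

    chosen = second_peck(initial, alternative)
    if chosen not in (initial, alternative):
        raise ValueError("second_peck must return one of the two green keys")

    switched = chosen != initial
    reinforced = rng.random() < (P_SWITCH if switched else P_STAY)
    return switched, reinforced


# Usage: a simulated forced-group subject that always switches.
rng = random.Random(0)
trials = [run_trial("forced", lambda initial, alternative: alternative, rng)
          for _ in range(96 * 60)]  # 96 trials per session, 60 sessions
print(sum(reinforced for _, reinforced in trials) / len(trials))
```

With an always-switch policy the printed proportion approaches two thirds as the number of trials grows; an always-stay policy approaches one third.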

Results and discussion

In the typical Monty Hall task, the subject gets to make the initial choice and then gets to decide to stay (in which case reinforcement is provided one third of the time) or switch (in which case reinforcement is provided two thirds of the time). In most research, early in training pigeons have been found to switch less often than stay, and in the present experiment, pigeons in the free choice group switched only 39.6% of the time on their first session of training, a level that was significantly below chance, t(6) = 2.97, p = .03, Cohen’s d = 2.43. With experience, however, the pigeons quickly learned to switch more often than stay, and after about 35 sessions they reached an asymptotic level of switching (see acquisition data in Fig. 1).

Fig. 1 Acquisition of switching after initial choice. The free choice group was free to make an initial choice among the three response keys. For the forced group, the computer made the initial choice of key. For both groups, the probability of reinforcement for switching was .67, and the probability of reinforcement for staying was .33.

In contrast, the pigeons that did not have the opportunity to make their initial choice, the forced group, started out switching at 48.2% on their first session of training, a level that did not differ significantly from chance, t < 1, and they did not learn to switch at a much higher rate over the course of the experiment. Examination of Fig. 1 suggests that the pigeons in both groups had reached relatively stable levels of switching by the end of training. For this reason, we averaged the percentages of switching over the last five sessions of training (Sessions 56–60). For the free choice group, a repeated measures t test performed on the percentage of switching responses early in training (pooled over the first five sessions of training, 46.2%), as compared with the end of training (the last five sessions of training, 73.6%), indicated a significant increase in switching, t(6) = 3.58, p = .004, Cohen’s d = 2.92. Additionally, over the last five sessions of training, the free choice group switched significantly more often than the forced group (52.8%), t(12) = 2.76, p = .02, Cohen’s d = 1.59. Furthermore, the free choice group switched significantly more often than would be expected by chance (50%), t(6) = 3.11, p = .02, Cohen’s d = 2.54, but the forced group did not, t(6) = 1.76, p > .05. Although the free choice group switched more than the forced group, as can be seen in the right-hand column of Table 1, there was considerable variability in the degrees of switching by that group.

Table 1 Mean probabilities for each pigeon to switch or stay, separated for the initial and second keys chosen on each trial

Given that the forced group did not switch as much as the free choice group, we asked whether the pigeons’ key preferences could have been responsible for the failure to choose optimally. If, for example, the pigeon preferred the left key over the center key and the center key over the right key, one might expect the pigeon to switch whenever the forced key was center and the alternative was left or the forced key was right and the alternative was left or center. However, one might expect the pigeon to stay if the forced key was left and the alternative key was center or right, or if the forced key was center and the alternative key was right.

To further analyze the pigeons’ choices to stay or switch, we considered the response key presented for the forced group and asked whether those data suggested a pattern of stay and switch responding for each pigeon that depended on which two response keys were presented. Table 1 lists the sequences of initial keys pecked and alternatives presented for each pigeon (left, center, or right). Of course, for pigeons in the forced group the initial choice was made for them. For these pigeons, the data from Table 1 have been summarized in Table 2 and ordered according to key preference, as indicated by the tendency to stay with or switch to that key whenever possible. The pattern was quite clear: Each pigeon appeared to have a most preferred key, a less preferred key, and a least preferred key. If the most preferred key was initially presented to the pigeon, it would typically stay with that key (93.8%). If the second most preferred key was initially presented to the pigeon, it would stay with that key about half of the time, 50.9% (switching when the alternative was the more preferred key, staying when the alternative was the least preferred key). Finally, if the least preferred key was initially presented to the pigeon, it would almost never stay (0.9%). Thus, following their initial peck to the lit white key, these pigeons stayed on about half of the trials and switched on the other half, depending on which of the two keys was more preferred. Most importantly, they showed little tendency to switch beyond what their preference between the two available keys would predict.
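For comparison with the observed staying percentages (93.8%, 50.9%, and 0.9%), the staying rates predicted by a pure key-preference rule can be enumerated directly. The sketch below is illustrative only: it assumes a strict preference ordering, a pigeon that always pecks the more preferred of the two lit keys, and an alternative that is equally likely to be either of the other two keys. The labels and the function name are hypothetical.

```python
from fractions import Fraction
from itertools import permutations


def stay_rates_by_rank(ranking=("most", "middle", "least")):
    """Predicted P(stay) for each initially presented key under pure key preference.

    `ranking` lists the keys from most to least preferred.
    """
    rank = {key: position for position, key in enumerate(ranking)}
    stays = {key: Fraction(0) for key in ranking}
    for initial, alternative in permutations(ranking, 2):
        # Assumption: each initial key is paired with either other key
        # with probability 1/2.
        if rank[initial] < rank[alternative]:
            # The initial key is the more preferred of the pair: stay.
            stays[initial] += Fraction(1, 2)
    return stays


rates = stay_rates_by_rank()
for key, p_stay in rates.items():
    print(key, p_stay)  # most 1, middle 1/2, least 0

# Overall switching rate when each key is presented equally often.
overall_switch = 1 - sum(rates.values()) / len(rates)
print(overall_switch)  # 1/2
```

The predicted values of 100%, 50%, and 0% are close to the observed percentages, which is consistent with the interpretation that key preference accounted for most of the forced group's second-choice behavior.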

Table 2 Sessions 1–2: Probabilities of choice when each key was available and probabilities of switching

The pigeons in the free choice group also had key preferences, and early in training they chose their most preferred key most of the time and stayed with that choice more often than they switched (see Table 2). After considerable training, they continued to choose their preferred key on their initial choice (90.0% of the time), and in spite of the fact that they could have always stayed with their preferred choice, they learned to switch more often than the pigeons in the forced group (see Table 3).

Table 3 Sessions 56–60: Probabilities of choice when each key was available and probabilities of switching

The finding that the free choice pigeons tended to switch more often than the forced pigeons appears to be paradoxical, because pigeons in the free choice group always had the option to stay with their preferred key, whereas the pigeons in the forced group often were forced to choose between two less preferred keys. That is, early in training, pigeons in the forced group would be expected to switch whenever the alternative offered at the second choice was more preferred than the key initially presented. The paradox may be resolved, however, because for pigeons in the free choice group, their initial tendency to stay with their preferred key resulted in 33% reinforcement, and any tendency to switch away from their preferred key after their initial choice would have doubled the probability of reinforcement to 67%. For pigeons in the forced group, however, their key preference would have resulted in 50% reinforcement, because it caused them to switch on half of the trials: all of the trials on which their least preferred key was initially presented and half of the trials on which their second preferred key was presented. Thus, it may have been harder for pigeons in the forced group to discriminate between the consequences of choosing on the basis of their key preference (resulting in 50% reinforcement) and switching more consistently (67% reinforcement).
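The size of the contrast faced by each group can be made explicit with a short calculation. This is a sketch under the idealized assumptions stated in the paragraph above (a free choice pigeon that always stays with its preferred key, and a forced pigeon whose key preference produces switching on exactly half of the trials); the function name is hypothetical.

```python
from fractions import Fraction

P_STAY, P_SWITCH = Fraction(1, 3), Fraction(2, 3)  # values from the procedure


def expected_reinforcement(p_switch):
    """Expected probability of reinforcement when switching on a fraction p_switch of trials."""
    return p_switch * P_SWITCH + (1 - p_switch) * P_STAY


optimal = expected_reinforcement(Fraction(1))             # always switch
free_baseline = expected_reinforcement(Fraction(0))       # always stay with the preferred key
forced_baseline = expected_reinforcement(Fraction(1, 2))  # key preference: switch on half of trials

print(free_baseline, forced_baseline, optimal)  # 1/3 1/2 2/3
print(optimal - free_baseline)    # 1/3: gain available to the free choice group
print(optimal - forced_baseline)  # 1/6: gain available to the forced group
```

Under these assumptions, moving from the preference-driven baseline to consistent switching improves reinforcement by 33 percentage points for the free choice group but by only about 17 percentage points for the forced group.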

It should be noted that the forced procedure used in the present experiment was a bit different from that used by Granberg and Dorr (1998). In the Granberg and Dorr experiment, one subject made the initial choice and a different subject was given the opportunity to stay or switch, whereas in the present experiment only one key was available; that is, the computer made the initial choice. Furthermore, in research with humans, following the initial choice, the subject is shown that one of the original alternatives did not contain the prize, whereas in the research with animals, following the initial response, one of the alternatives is removed (but see Herbranson & Schroeder, 2010, who found similar results with humans using the pigeon procedure).

The differences in procedure notwithstanding, the results of the present experiment and those of Stagner et al. (2013) suggest that for pigeons the endowment effect does not play an important role in the Monty Hall dilemma. The question remains, however, whether it plays a role in the slow and incomplete acquisition of optimal choice in the Monty Hall dilemma for humans. Although humans eventually choose to switch more than stay, they rarely choose to switch more than two thirds of the time. Such probability matching by humans is not uncommon when outcomes are probabilistic (Koehler & James, 2010). Although probability matching results in better-than-chance reinforcement, a better strategy would be to always switch. The reason that humans match the probability of reinforcement has been attributed to the mistaken notion that with probabilistic outcomes that are randomly arranged, some distribution of switching and staying will yield a better outcome than probability matching (Gaissmaier, Schooler, & Rieskamp, 2006). This misconception may result from the fact that in our culture and educational system, there is almost always a correct (reinforced) answer to every question (i.e., a predictable response that will result in reinforcement on every trial). Many animals, on the other hand, appear to learn that consistent selection of the alternative with the higher probability of reinforcement is more likely to maximize reinforcement (Bitterman, 1975), perhaps because they live in a more probabilistic world.
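The cost of probability matching relative to always switching follows from a short calculation. As a worked illustration (the symbols p and E are introduced here and do not appear in the original analysis), let p be the proportion of trials on which a subject switches, with switching reinforced with probability 2/3 and staying with probability 1/3:

```latex
E(p) = p \cdot \tfrac{2}{3} + (1 - p) \cdot \tfrac{1}{3} = \frac{1 + p}{3}

E\!\left(\tfrac{1}{2}\right) = \tfrac{1}{2} \quad \text{(indifferent responding, i.e., chance)}

E\!\left(\tfrac{2}{3}\right) = \tfrac{5}{9} \approx .56 \quad \text{(probability matching)}

E(1) = \tfrac{2}{3} \approx .67 \quad \text{(always switching)}
```

Because E(p) increases linearly with p, no mixture of staying and switching can exceed the two-thirds rate obtained by switching on every trial.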

Support for the hypothesis that past experience with consistent outcomes may be responsible for humans’ attempt to do better than 67% reinforcement has come from research on base-rate neglect (Goodie & Fantino, 1995). When humans are trained on matching to sample (i.e., when a sample color is presented, a choice of the matching comparison color is correct), they quickly learn to choose the matching comparison stimulus. However, when matching one sample color is reinforced most of the time but matching the other sample color is reinforced less than half of the time, humans typically neglect the fact that choice of one comparison color is reinforced most of the time, regardless of the color of the sample. That is, humans continue to match the samples. By doing so, they neglect the base rate with which choice of the comparison colors is reinforced. Interestingly, when pigeons are given the same task, they show better sensitivity to the base rates and get more correct than humans do (Fantino, Kanevsky, & Charlton, 2005); however, if pigeons are given extensive training on matching to sample with 100% reinforcement for matching and then are transferred to the task in which choice of one of the comparisons is reinforced most of the time, they, too, show evidence of base-rate neglect (see also DiGian & Zentall, 2007; Zentall & Clement, 2002; Zentall, Singer, & Miller, 2008). Thus, extensive experience with a task in which a stimulus (in this case, the sample) provides a highly reliable cue for correct comparison choice biases pigeons to neglect the fact that the stimulus is no longer a reliable cue, resulting in suboptimal choice.

In the present experiment, stimulus (spatial) preferences appear to have prevented the pigeons in the forced group from choosing optimally. Stimulus preferences may also account for what would appear to be evidence for a cognitive-dissonance-like finding in monkeys and children (Egan, Santos, & Bloom, 2007). In that experiment, subjects were given a choice between two (of three) colored candies. They were then given a second choice between candy of the color that they did not originally take and the third-colored candy (which they were not originally offered). Egan et al. found that when subjects were given the second choice, they tended to reject the originally rejected color more than would be expected by chance. They reasoned that cognitive dissonance was responsible for the bias. That is, the fact that they had rejected one of the candies on their original choice caused them to reject it again when given a second choice. Given that the pairs of candies were presented randomly, the subjects should not have been biased to reject the candy a second time. However, Chen (2008) noted that any (even small) differential preferences among the three colors could have resulted in just such a bias. He argued that the preferences could be represented in the order 1 > 2 > 3, and that three possible original choices could be presented: 1 versus 2, 1 versus 3, and 2 versus 3. If the first choice was 1 versus 3, then the second choice would have been 2 versus 3, and the subject would have chosen 2 and rejected 3 again. Similarly, if the first choice was 2 versus 3, then the second choice would have been 1 versus 3, and the subject would have chosen 1 and rejected 3 again. Only if the first choice was 1 versus 2 would the second choice have been 2 versus 3, and the subject would have chosen the originally rejected 2 and rejected 3. Thus, without positing cognitive dissonance, in two cases out of three, on the second choice the subjects would have rejected the color originally rejected. And that is exactly what Egan et al. found.
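Chen's (2008) argument can be verified by enumerating the three possible first pairings. The following is a minimal sketch, assuming a strict and stable preference ordering and equally likely pairings; the function name is hypothetical.

```python
from fractions import Fraction
from itertools import combinations


def rejected_again_rate(ranking=(1, 2, 3)):
    """Fraction of first pairings after which the originally rejected item is rejected again.

    `ranking` lists the items from most to least preferred.
    """
    rank = {item: position for position, item in enumerate(ranking)}

    def prefer(a, b):
        # A subject with stable preferences always takes the higher-ranked item.
        return a if rank[a] < rank[b] else b

    pairs = list(combinations(ranking, 2))  # assumption: each pairing equally likely
    rejected_again = 0
    for a, b in pairs:
        chosen = prefer(a, b)
        rejected = b if chosen == a else a
        (third,) = [item for item in ranking if item not in (a, b)]
        # Second choice: the originally rejected item versus the item not yet offered.
        if prefer(rejected, third) != rejected:
            rejected_again += 1
    return Fraction(rejected_again, len(pairs))


print(rejected_again_rate())  # 2/3
```

The enumeration yields a rejection rate of two thirds, above the one half expected if first and second choices were unrelated, without any appeal to cognitive dissonance.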

The present results support the conclusion reached by Stagner et al. (2013) that pigeons do not appear to be affected by manipulations that in humans affect the tendency to switch in the Monty Hall task. It may be that pigeons are not affected by endowment-like processes in the way that humans are, which can at least partially account for why pigeons do much better on tasks such as these.