In the typical Monty Hall task, the subject gets to make the initial choice and then gets to decide to stay (in which case reinforcement is provided one third of the time) or switch (in which case reinforcement is provided two thirds of the time). In most research, early in training pigeons have been found to switch less often than stay, and in the present experiment, pigeons in the free choice group switched only 39.6% of the time on their first session of training, a level that was significantly below chance, t(6) = 2.97, p = .03, Cohen’s d = 2.43. With experience, however, the pigeons quickly learned to switch more often than stay, and after about 35 sessions they reached an asymptotic level of switching (see acquisition data in Fig. 1).
In contrast, the pigeons that did not have the opportunity to make their initial choice, the forced group, started out switching at 48.2% on their first session of training, a level that did not differ significantly from chance, t < 1, and they did not learn to switch at a much higher rate over the course of the experiment. Examination of Fig. 1 suggests that the pigeons in both groups had reached relatively stable levels of switching by the end of training. For this reason, we averaged the percentages of switching over the last five sessions of training (Sessions 56–60). For the free choice group, a repeated measures t test comparing the percentage of switching responses early in training (pooled over the first five sessions, 46.2%) with that at the end of training (the last five sessions, 73.6%) indicated a significant increase in switching, t(6) = 3.58, p = .004, Cohen’s d = 2.92. Additionally, over the last five sessions of training, the free choice group switched significantly more often than the forced group (52.8%), t(12) = 2.76, p = .02, Cohen’s d = 1.59. Furthermore, the free choice group switched significantly more often than would be expected by chance (50%), t(6) = 3.11, p = .02, Cohen’s d = 2.54, but the forced group did not, t(6) = 1.76, p > .05. Although the free choice group switched more than the forced group, as can be seen in the right-hand column of Table 1, there was considerable variability in the degree of switching by that group.
Table 1 Mean probabilities for each pigeon to switch or stay, separated for the initial and second keys chosen on each trial
Given that the forced group did not switch as much as the free choice group, we asked whether the pigeons’ key preferences could have been responsible for the failure to choose optimally. If, for example, the pigeon preferred the left key over the center key and the center key over the right key, one might expect the pigeon to switch whenever the forced key was center and the alternative was left or the forced key was right and the alternative was left or center. However, one might expect the pigeon to stay if the forced key was left and the alternative key was center or right, or if the forced key was center and the alternative key was right.
To further analyze the pigeons’ choices to stay or switch, we considered the response key presented for the forced group and asked whether those data suggested a pattern of stay and switch responding for each pigeon that depended on which two response keys were presented. Table 1 lists the sequences of initial keys pecked and alternatives presented for each pigeon (left, center, or right). Of course, for pigeons in the forced group the initial choice was made for them. The data for these pigeons from Table 1 have been summarized in Table 2 and ordered according to key preference, as indicated by the tendency to stay with or switch to that key whenever possible. The pattern was quite clear: Each pigeon appeared to have a most preferred key, a less preferred key, and a least preferred key. If the most preferred key was initially presented to the pigeon, it would typically stay with that key (93.8%). If the second most preferred key was initially presented, it would stay with that key about half of the time, 50.9% (switching when the alternative was the more preferred key, staying when the alternative was the least preferred key). Finally, if the least preferred key was initially presented, it would almost never stay (0.9%). Thus, following their initial peck to the lit white key, these pigeons stayed on about half of the trials and switched on the other half, depending on which of the two available keys was more preferred. Most importantly, they showed little tendency to switch beyond what their preference between the two choice keys would predict.
Table 2 Sessions 1–2: Probabilities of choice when each key was available and probabilities of switching
The pigeons in the free choice group also had key preferences, and early in training they chose their most preferred key most of the time and stayed with that choice more often than they switched (see Table 2). After considerable training, they continued to choose their preferred key on their initial choice (90.0% of the time), and in spite of the fact that they could have always stayed with their preferred choice, they learned to switch more often than the pigeons in the forced group (see Table 3).
Table 3 Sessions 56–60: Probabilities of choice when each key was available and probabilities of switching
The finding that the free choice pigeons tended to switch more often than the forced pigeons appears to be paradoxical, because pigeons in the free choice group always had the option to stay with their preferred key, whereas the pigeons in the forced group often were forced to choose between two less preferred keys. That is, early in training, pigeons in the forced group would be expected to switch whenever the alternative offered at the time of choice was a more preferred key. The paradox may be resolved, however, because for pigeons in the free choice group, their initial tendency to stay with their preferred key resulted in 33% reinforcement, and any tendency to switch away from their preferred key after their initial choice would have raised the probability of reinforcement toward 67%, double the rate for staying. For pigeons in the forced group, however, their key preference would have resulted in 50% reinforcement, because it caused them to switch on half of the trials: all of the trials on which their least preferred key was initially presented and half of the trials on which their second preferred key was presented. Thus, it may have been harder for pigeons in the forced group to discriminate between the consequences of choosing on the basis of their key preference (resulting in 50% reinforcement) and switching more consistently (67% reinforcement).
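The reinforcement rates in this argument can be checked by exact enumeration. The Python sketch below is illustrative only and is not the authors' analysis. It assumes that the prize key and the forced key are each equally likely to be any of the three keys, that a nonreinforced alternative is always the one removed (chosen at random when the initial key holds the prize), and that a pigeon follows a strict, hypothetical preference order (left over center over right). All function and variable names are hypothetical.

```python
from fractions import Fraction

KEYS = ("left", "center", "right")
# Hypothetical strict preference order: lower rank = more preferred.
RANK = {"left": 0, "center": 1, "right": 2}


def trial_outcomes(initial):
    """Yield (probability, alternative, prize) for every way a trial can unfold."""
    others = [k for k in KEYS if k != initial]
    for prize in KEYS:
        if prize == initial:
            # Either other key may be removed, so either may remain (assumed 50/50).
            for alt in others:
                yield Fraction(1, 6), alt, prize
        else:
            # The nonreinforced other key is removed; the prize key remains.
            yield Fraction(1, 3), prize, prize


def evaluate(policy, initial_dist):
    """Return exact (P(switch), P(reinforcement)) for a stay/switch policy."""
    p_switch = Fraction(0)
    p_reinf = Fraction(0)
    for initial, p_init in initial_dist.items():
        for p, alt, prize in trial_outcomes(initial):
            weight = p_init * p
            switch = policy(initial, alt)
            final = alt if switch else initial
            if switch:
                p_switch += weight
            if final == prize:
                p_reinf += weight
    return p_switch, p_reinf


def by_preference(initial, alt):
    """Switch only when the alternative is the more preferred key."""
    return RANK[alt] < RANK[initial]


def always_switch(initial, alt):
    """Switch on every trial, regardless of key."""
    return True


# Forced group: the computer picks the initial key (assumed uniform).
forced = {k: Fraction(1, 3) for k in KEYS}
# Free choice group: the pigeon always starts on its preferred key.
free = {"left": Fraction(1)}

print("forced, by preference:", evaluate(by_preference, forced))
print("free, by preference:  ", evaluate(by_preference, free))
print("always switch:        ", evaluate(always_switch, forced))
```

Under these assumptions, choosing by key preference yields switching on 1/2 of the trials and reinforcement on 1/2 of the trials in the forced condition, whereas it yields no switching and reinforcement on 1/3 of the trials in the free choice condition. Always switching yields reinforcement on 2/3 of the trials. The gap between the preference-based outcome and the optimal outcome is therefore smaller for the forced group (1/2 vs. 2/3) than for the free choice group (1/3 vs. 2/3).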
It should be noted that the forced procedure used in the present experiment was a bit different from that used by Granberg and Dorr (1998). In the Granberg and Dorr experiment, one subject made the initial choice and a different subject was given the opportunity to stay or switch, whereas in the present experiment only one key was available; that is, the computer made the initial choice. Furthermore, in research with humans, following the initial choice, the subject is shown that one of the original alternatives did not contain the prize, whereas in the research with animals, following the initial response, one of the alternatives is removed (but see Herbranson & Schroeder, 2010, who found similar results with humans using the pigeon procedure).
The differences in procedure notwithstanding, the results of the present experiment and those of Stagner et al. (2013) suggest that for pigeons the endowment effect does not play an important role in the Monty Hall dilemma. The question remains, however, whether it plays a role in the slow and incomplete acquisition of optimal choice in the Monty Hall dilemma for humans. Although humans eventually choose to switch more than stay, they rarely choose to switch more than two thirds of the time. Such probability matching by humans is not uncommon when outcomes are probabilistic (Koehler & James, 2010). Although probability matching results in better-than-chance reinforcement, a better strategy would be to always switch. The reason that humans match the probability of reinforcement has been attributed to the mistaken notion that with probabilistic outcomes that are randomly arranged, some distribution of switching and staying will result in better-than-probability matching (Gaissmaier, Schooler, & Rieskamp, 2006). This misconception may result from the fact that in our culture and educational system, there is almost always a correct (reinforced) answer to every question (i.e., a predictable response that will result in reinforcement on every trial). Many animals, on the other hand, appear to learn that consistent selection of the alternative with the higher probability of reinforcement is more likely to maximize reinforcement (Bitterman, 1975), perhaps because they live in a more probabilistic world.
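The cost of probability matching relative to always switching can be made concrete with a short calculation. The Python sketch below is illustrative and rests on one assumption that the text does not state: on each trial the subject switches with a fixed probability, independently of where the prize is. The function name is hypothetical.

```python
from fractions import Fraction

P_SWITCH_WINS = Fraction(2, 3)  # reinforcement probability after switching
P_STAY_WINS = Fraction(1, 3)    # reinforcement probability after staying


def expected_reinforcement(p_switch):
    """Expected reinforcement when switching with probability p_switch.

    Assumes the stay/switch decision is independent of the prize location.
    """
    p_switch = Fraction(p_switch)
    return p_switch * P_SWITCH_WINS + (1 - p_switch) * P_STAY_WINS


chance = expected_reinforcement(Fraction(1, 2))    # indifferent responding
matching = expected_reinforcement(Fraction(2, 3))  # probability matching
maximizing = expected_reinforcement(1)             # always switch

print(chance, matching, maximizing)  # prints: 1/2 5/9 2/3
```

Under this assumption, matching yields reinforcement on 5/9 (about 56%) of the trials, which is above the 50% obtained by indifferent responding but below the 2/3 (about 67%) obtained by always switching.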
Support for the hypothesis that past experience with consistent outcomes may be responsible for humans’ attempt to do better than 67% reinforcement has come from research on base-rate neglect (Goodie & Fantino, 1995). When humans are trained on matching to sample (i.e., when a sample color is presented, a choice of the matching comparison color is correct), they quickly learn to choose the matching comparison stimulus. However, when matching one sample color is reinforced most of the time but matching the other sample color is reinforced less than half of the time, humans typically neglect the fact that choice of one comparison color is reinforced most of the time, regardless of the color of the sample. That is, humans continue to match the samples. By doing so, they neglect the base rate with which choice of the comparison colors is reinforced. Interestingly, when pigeons are given the same task, they show better sensitivity to the base rates and get more correct than humans do (Fantino, Kanevsky, & Charlton, 2005); however, if pigeons are given extensive training on matching to sample with 100% reinforcement for matching and then are transferred to the task in which choice of one of the comparisons is reinforced most of the time, they, too, show evidence of base-rate neglect (see also DiGian & Zentall, 2007; Zentall & Clement, 2002; Zentall, Singer, & Miller, 2008). Thus, extensive experience with a task in which a stimulus (in this case, the sample) provides a highly reliable cue for correct comparison choice biases pigeons to neglect the fact that the stimulus is no longer a reliable cue, resulting in suboptimal choice.
In the present experiment, stimulus (spatial) preferences appear to have prevented the pigeons in the forced choice group from choosing optimally. Stimulus preferences may also account for what would appear to be evidence for a cognitive-dissonance-like finding in monkeys and children (Egan, Santos, & Bloom, 2007). In that experiment, subjects were given a choice between two (of three) colored candies. They were then given a second choice between candy of the color that they did not originally take and the third-colored candy (which they were not originally offered). Egan et al. found that when subjects were given the second choice, they tended to reject the originally rejected color more than would be expected by chance. They reasoned that cognitive dissonance was responsible for the bias. That is, the fact that they had rejected one of the candies on their original choice caused them to reject it again when given a second choice. Given that the pairs of candies were presented randomly, the subjects should not have been biased to reject the candy a second time. However, Chen (2008) noted that any (even small) differential preferences among the three colors could have resulted in just such a bias. He argued that the preferences could be represented in the order 1 > 2 > 3, and that three possible original choices could be presented: 1 versus 2, 1 versus 3, and 2 versus 3. If the first choice was 1 versus 3, then the second choice would have been 2 versus 3, and the subject would have chosen 2 and rejected 3 again. Similarly, if the first choice was 2 versus 3, then the second choice would have been 1 versus 3, and the subject would have chosen 1 and rejected 3 again. Only if the first choice was 1 versus 2 (with 2 rejected) would the second choice have been 2 versus 3, and the subject would then have chosen the originally rejected 2 and rejected 3. Thus, without positing cognitive dissonance, in two cases out of three, on the second choice the subjects would have rejected the color originally rejected. And that is exactly what Egan et al. found.
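Chen's enumeration can be reproduced in a few lines. The Python sketch below is an illustration under simplifying assumptions rather than a reconstruction of either study's analysis: preferences are strict and deterministic (rank 1 is always chosen over 2, and 2 over 3), and the three possible first pairs are equally likely. The function name is hypothetical.

```python
from fractions import Fraction
from itertools import combinations

COLORS = (1, 2, 3)  # hypothetical preference ranks: 1 is most preferred


def rejects_same_color_again(first_pair):
    """Return True if the color rejected first is rejected again on the second choice."""
    rejected = max(first_pair)  # the less preferred color of the first pair
    novel = next(c for c in COLORS if c not in first_pair)  # the color not yet offered
    second_pick = min(rejected, novel)  # the more preferred color of the second pair
    return second_pick != rejected


pairs = list(combinations(COLORS, 2))  # (1, 2), (1, 3), (2, 3)
n_again = sum(rejects_same_color_again(pair) for pair in pairs)
proportion = Fraction(n_again, len(pairs))

print(proportion)  # prints: 2/3
```

With strict preferences the originally rejected color is rejected again in two of the three cases, well above the one half expected if there were no preference among the colors. Weaker, probabilistic preferences would presumably give a value between 1/2 and 2/3, but that case is not modeled here.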
The present results support the conclusion reached by Stagner et al. (2013) that pigeons do not appear to be affected by manipulations that in humans affect the tendency to switch in the Monty Hall task. It may be that pigeons are not affected by endowment-like processes in the way that humans are, which can at least partially account for why pigeons do much better on tasks such as these.