Metacognition is typically defined as thinking about thinking or as the ability to consciously monitor cognitive processes. The ability to monitor the state of one’s memory is referred to as metamemory. Previous research has suggested that some animal species may have a capacity for metacognition. Some of the earliest evidence suggestive of metacognition in non-human animals was reported in studies conducted on rhesus monkeys by several different investigators (Beran, Smith, Redford, & Washburn, 2006; Hampton, 2001; Hampton & Hampstead, 2006; Hampton, Zivin, & Murray, 2004; Kornell, Son, & Terrace, 2007; Smith & Washburn, 2005; Smith, Shields, Schull, & Washburn, 1997; Smith, Shields, Allendoerfer, & Washburn, 1998; Smith, Shields, & Washburn, 2003; Smith, Beran, Redford, & Washburn, 2006). Recently, Foote and Crystal (2007) trained rats to discriminate between noise durations ranging from 2 to 8 sec. During test trials, the rats could choose between taking the duration test and potentially earning a large reward for correct responding, or declining the test and earning a guaranteed smaller reward. Rats declined the test most frequently on trials with difficult-to-discriminate intermediate durations, and they showed better accuracy for difficult discrimination tests which they chose to take than for difficult discrimination tests which they were forced to take. As Crystal and Foote (2009) recently noted, this pattern of results has been the prevailing standard, first outlined by Inman and Shettleworth (1999), for determining whether an animal has knowledge of its own cognitive state. The two criteria identified by Inman and Shettleworth (1999) and observed by Foote and Crystal (2007) were: (1) an increase in the frequency of declining a test as task difficulty increases and (2) higher accuracy on trials in which they chose to take the test than on forced tests, with the accuracy difference increasing as task difficulty increases. 
The enhanced accuracy on trials in which an animal chooses to take the test has been termed the chosen–forced advantage (Crystal & Foote, 2009). This chosen–forced advantage was initially viewed as reflecting the operation of metacognitive processes because an animal with metacognition would presumably only choose to take the test when it “knows that it knows” the correct response. On forced trials, the animals are required to respond even on trials in which they would have declined the test had that option been available, thereby driving down the accuracy relative to chosen test trials.

The attribution of metacognitive states to nonhuman animals has received considerable scrutiny. Hampton (2009) noted that adaptive cognitive control reflective of metacognitive behavior could be the result of either private or public mechanisms. Private mechanisms are those based on cognitive states which only the subject has privileged access to, whereas public mechanisms are those based on publicly available information, such as discrimination difficulty, history of reinforcement, etc., which would be available to the subject as well as to others. Hampton (2009) argued that public mechanisms can adequately account for most or all cases of metacognitive behavior in nonhuman animals. Carruthers (2008) has also argued that much of the data on metacognition in animals can be explained by first-order reasoning processes which involve beliefs and desires of varying strengths and do not require the attribution of metacognitive processes to animals. Consistent with these critical assessments, it has been shown that both the chosen–forced advantage and the increase in escape responding can be generated by quantitative models which incorporate basic discrimination processes and no metacognitive ability (Crystal & Foote, 2009; Jozefowiez, Staddon, & Cerutti, 2009; Smith, Beran, Couchman, & Coutinho, 2008; Smith, Beran, Couchman, Coutinho, & Boomer, 2009; Staddon & Jozefowiez, 2007). For example, Crystal and Foote (2009) have persuasively argued that the two behavioral effects initially proposed as potentially indicative of metacognition can be explained in terms of operations performed on primary representations (i.e., response strength associated with the subjective level of a stimulus in the case of perceptual discriminations, and response strength which declines as a function of memory trace decay in the case of memory tasks) without any reliance on secondary representations (i.e., knowing the state of one’s knowledge). In agreement with the formal model of Smith et al. (2008), they argued that the reward for making the escape response results in a low-frequency threshold for selecting the escape option, which is constant across stimulus conditions. When response strengths for the primary responses in the perceptual discrimination task or the memory task fall below the low-frequency threshold, the escape response is selected. If the previous results obtained by Foote and Crystal (2007) in rats were due to low-level mechanisms operating on primary representations, then many other animal species should also be capable of exhibiting the chosen–forced advantage and a higher rate of escaping difficult test conditions. However, Inman and Shettleworth (1999) and Sutton and Shettleworth (2008) found little evidence of this pattern of behavior in pigeons despite the use of multiple experiments and tests within each study. The difference between rats and pigeons could be due to task differences or a genuine species difference, or it may be that the finding in rats is not replicable.
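
The threshold account described above can be illustrated with a small simulation. The sketch below is not the model of Crystal and Foote (2009) or Smith et al. (2008); it is a minimal illustration under assumptions made here: Gaussian noise on a one-dimensional primary representation, response strength defined as the distance of the noisy percept from the category boundary, and a fixed escape threshold. The function name `simulate` and all parameter values are hypothetical.

```python
import random


def simulate(distance, n_trials=20000, noise_sd=1.0, threshold=0.8, seed=0):
    """Return (escape_rate, chosen_accuracy, forced_accuracy).

    distance: how far the stimulus lies from the category boundary
              (small values correspond to a difficult discrimination).
    All parameters are illustrative assumptions, not fitted values.
    """
    rng = random.Random(seed)
    escapes = chosen_n = chosen_correct = forced_correct = 0
    for _ in range(n_trials):
        # Primary representation: true stimulus level plus Gaussian noise.
        percept = distance + rng.gauss(0.0, noise_sd)
        correct = percept > 0          # the true category is the positive side
        strength = abs(percept)        # strength of the winning primary response
        # Forced test: the animal must respond whatever the strength.
        forced_correct += correct
        # Choice trial: escape whenever strength falls below the fixed threshold.
        if strength < threshold:
            escapes += 1
        else:
            chosen_n += 1
            chosen_correct += correct
    chosen_acc = chosen_correct / chosen_n if chosen_n else float("nan")
    return escapes / n_trials, chosen_acc, forced_correct / n_trials


if __name__ == "__main__":
    for d in (2.0, 1.0, 0.3):  # easy -> difficult
        esc, chosen, forced = simulate(d)
        print(f"distance={d}: escape={esc:.2f} chosen={chosen:.2f} forced={forced:.2f}")
```

Under these assumptions, the escape rate rises as the discrimination becomes harder, and accuracy on chosen tests exceeds accuracy on forced tests, with the gap widening as difficulty increases. No representation of the animal's own knowledge state is involved at any point.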

In the study reported here, pigeons were tested in a duration matching-to-sample procedure to determine whether they would exhibit a chosen–forced advantage and increased escape responding as the task was made more difficult. The pigeons were trained to discriminate between 2-sec and 8-sec durations of feeder light illumination by responding to red and green comparison stimuli. Pecking one color was correct for the short duration, and pecking the other color was correct for the long duration. After presentation of the sample duration, either a vertical or a horizontal line was presented on one of the side keys. Pecking one of the line orientations produced the red and green comparisons (i.e., a forced test trial). Correct responses were reinforced with 8-sec access to grain, whereas incorrect responses were not reinforced. Pecking the other line orientation resulted in 4-sec access to grain (i.e., a no-test trial). When accuracy was 85% or better for the 2- and 8-sec durations on forced test trials, pigeons received a choice testing phase with 25% forced test trials, 25% no-test trials, and 50% choice trials. On choice trials, both line orientations were presented simultaneously, and pigeons could choose to take the test or decline the test. Following the choice testing phase, additional test phases were undertaken in sequence. These consisted of sample omission test sessions and retention interval (RI) test sessions (i.e., sessions with delays between sample offset and onset of the line orientations). The sample omission phase was conducted to assess whether the pigeons would escape the test more often on trials in which the sample was absent. As noted by Crystal and Foote (2009), performance on sample-omitted trials could be based on a very weak primary representation of a sample stimulus from the previous trial, but presumably the strength of this primary representation would be lower than the threshold for declining the test.
As a result, a higher rate of escaping the test would be predicted on sample-omitted trials. Similarly, the rate of escaping the test is also predicted to increase as the RI is increased. According to Crystal and Foote (2009), the primary representation of the strength of responding to the short or the long comparison stimulus would decrease as the RI is increased. At a long RI, the strength of the escape response would be higher than the strength of responding to the short or long comparison stimulus and the rate of escape responding would be higher than it would be at a much shorter RI.

In the study by Foote and Crystal (2007), the intertrial interval (ITI) was 8 min, whereas in the Inman and Shettleworth (1999) and Sutton and Shettleworth (2008) studies, the ITI was 20 sec. It is possible that the length of the ITI could influence an animal’s propensity to accept or decline a memory test. With a short ITI, animals may be inclined to take the test and try for the larger reward even when their memory for the sample is poor. On the other hand, with a longer ITI, animals may be more inclined to escape a test for the smaller reward more often and only choose to take a test for the larger reward when their memory for the sample is good. Consequently, in a final testing phase, both ITI and RI were varied in order to assess whether ITI length affected escape rates. Results obtained over the various phases of testing in this study will determine whether pigeons are capable of exhibiting: (1) more frequent selection of the escape option on sample omission test trials and on long-delay test trials; and (2) higher accuracy on trials in which the test is chosen than on trials in which the test is forced.

Method

Subjects

Ten adult Silver King pigeons, maintained at approximately 80% of their free-feeding weight and housed individually with constant access to water and grit, served as subjects. Post-session feedings of Purina Pigeon Chow (Elmira Feed & Supply, Elmira, Ontario) were provided to maintain their target weights. The colony room was illuminated on a 12:12 (light/dark) cycle by fluorescent lighting turned on at approximately 7:00 a.m. each day. Testing was conducted 5 days per week between 9:00 a.m. and 6:00 p.m. Four birds had previously been trained in a standard operant chamber to discriminate sequences of light flashes, but they had no previous training in experiments on metamemory. Six birds were experimentally naïve.

Apparatus

Four Coulbourn modular operant test cages (model #E10-10; Coulbourn Instruments, Whitehall, PA), each housed within isolation cubicles (model #10-20; Coulbourn Instruments), were used. Each cubicle utilized baffled air intake exhaust systems and ventilation fans. Each test cage contained three horizontally aligned, translucent plastic keys positioned approximately at a pigeon’s standing sight line. Behind each key was a projector which displayed red, green, a white vertical line, and a white horizontal line onto a frosted rear projection screen (model #E21-18; Coulbourn Instruments). A 5.7 × 5-cm opening was located directly below the center key which, during reinforcement, provided access to a hopper containing mixed grain. Within the opening was a lamp (model #E14-10 with bulb #S11819X; Coulbourn Instruments) that was illuminated only during sample presentation and during reinforcement. Located 6.5 cm above the center key was a houselight that directed light upward to reflect light from the top of the cage (model #14-10; Coulbourn Instruments). The presentation of all experimental events and the recording of response choices were accomplished through a Med-Associates interface and a microcomputer running MEDState Notation programs.

Procedure

Shaping and initial duration matching-to-sample training

All birds were trained to eat mixed grain from the illuminated food hopper. After hopper training, the pigeons were autoshaped to peck at red, green, a white vertical line, or a white horizontal line randomly presented on either the left or the right side keys. Each pigeon was trained to discriminate between short (2 sec) and long (8 sec) durations of hopper light illumination. Following termination of the sample duration, red and green comparison stimuli were presented in a counterbalanced fashion on the left and right side keys. For five birds, red was correct following the short sample, and green was correct following the long sample. The correct response contingencies were reversed for the remaining five birds. For all birds, a single peck to one of the comparison stimuli turned them both off and, if correct, permitted an 8-sec access to mixed grain from the food hopper. Incorrect responses to the comparison stimuli resulted in a brief blackout followed immediately by the presentation of the same sample and comparison stimulus configuration. A correct response on a correction trial produced an 8-sec access to mixed grain, although only the choice response on the initial (noncorrection) trial was used to calculate matching accuracy. Within each block of four trials, all combinations of the two sample durations with comparison stimuli counterbalanced on the left and right side keys occurred once. The order of presentation was randomized individually for each bird. A randomly selected ITI of 8, 16, 32 or 64 sec, spent in darkness, separated the trials. All sessions ended upon completion of 96 trials. Once individual birds attained an accuracy level of 85% or higher for both short and long samples on at least two consecutive sessions, they progressed to forced trials training.

Forced trials training

Each session of forced trials training consisted of 96 trials. Immediately after the presentation of the sample, either a vertical or a horizontal line was presented on either the left or the right key (counterbalanced across trials). For five birds, the vertical line signaled a forced test trial, while the horizontal line signaled a forced escape trial. This was reversed for the remaining five birds. A single peck to the forced test stimulus terminated the line and was followed, after a delay of 0.5 sec, by presentation of the comparison stimuli on the side keys. A single peck to the correct color comparison terminated the comparison stimuli and was immediately followed by an 8-sec access to the food hopper. A single peck to the incorrect color comparison terminated the comparison stimuli and added 8 sec to the ITI. A single peck to the forced escape stimulus terminated the line and was followed, after a delay of 0.5 sec, by presentation of a 4-sec access to the food hopper. Within each block of 16 trials, there were eight forced test trials and eight forced escape trials. Trials were separated by a randomly selected ITI (8, 16, 32, or 64 sec). Training continued until pigeons reached an accuracy criterion of 85% correct or higher on both short and long sample trials for two consecutive sessions. Three pigeons (two experienced birds and one experimentally naïve bird) were removed from the study because they failed to meet this criterion within 40 sessions of training.

Choice testing

Each session of choice testing consisted of 96 trials, and pigeons were tested for 20 sessions. Twenty-four trials (25%) were forced test trials and 24 (25%) were forced escape trials identical to those in the previous training phase. Forty-eight trials (50%) were choice test trials in which both the vertical line and the horizontal line were presented (counterbalanced over the left and right side keys) after the sample. On choice trials, the pigeons could choose either to take the memory test or to escape the memory test.

Sample omission testing

Following the choice testing phase, the pigeons received ten sessions of sample omission testing. Each session consisted of 108 trials. Ninety-six trials were the same as those described for the choice testing phase. On the other 12 trials, which were randomly selected, the sample was omitted. On these trials, the vertical and horizontal line stimuli were presented immediately after the ITI, and the pigeon could choose either to take the memory test or to escape it. If the pigeon chose to take the test on a sample-omitted trial, pecking either the red or the green key resulted in 8-sec access to food with a probability of 0.5.

RI testing and choice performance after RI testing

Following sample omission testing, the pigeons received 20 sessions of RI testing. For three pigeons, this occurred immediately after completion of sample omission testing, while the remaining four pigeons received between 14 and 20 sessions of training that were identical to the choice testing phase prior to RI testing. During RI testing, sessions were identical to the choice testing phase, except that on one half of the trials of each type (forced test, forced escape, and choice) a 10-sec RI was inserted between the termination of the sample and the presentation of the line orientation stimuli. There were a total of 64 trials in each session (32 with an RI of 0 sec, 32 with an RI of 10 sec). The 32 trials at a given RI consisted of eight forced test trials, eight forced escape trials, and 16 choice trials. Following RI testing, all pigeons received from 5 to 17 sessions of training identical to that of the choice testing phase.
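
The trial composition of an RI testing session described above can be sketched as follows. This is a hypothetical illustration, not the MEDState Notation program used in the study. The function name, the trial-type labels, and the shuffle over the whole session are assumptions made here; the text does not state how trial order or sample duration was balanced in this phase.

```python
import random
from collections import Counter


def build_ri_session(ris=(0, 10), per_ri=None, seed=None):
    """Build one RI testing session as a shuffled list of (trial_type, ri) pairs.

    Counts per RI follow the text: 8 forced test, 8 forced escape, 16 choice.
    Shuffling across the whole session is an assumption of this sketch.
    """
    if per_ri is None:
        per_ri = {"forced_test": 8, "forced_escape": 8, "choice": 16}
    trials = [
        (trial_type, ri)
        for ri in ris
        for trial_type, count in per_ri.items()
        for _ in range(count)
    ]
    random.Random(seed).shuffle(trials)
    return trials


if __name__ == "__main__":
    session = build_ri_session(seed=1)
    print(len(session))                           # total trials in the session
    print(Counter(ri for _, ri in session))       # trials at each RI
    print(Counter(kind for kind, _ in session))   # trials of each type
```

The counts reproduce the totals given in the text: 32 trials at each RI and 64 trials per session.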

ITI and RI testing with the choice procedure

During this phase of the experiment, pigeons received 24 sessions of RI testing identical to that of the previous phase except that the ITI was 6 sec during 12 sessions and 60 sec during the remaining 12 sessions. The sessions alternated between testing with an ITI of 6 sec and testing with an ITI of 60 sec. In all statistical analyses reported in this article, the rejection region was p < 0.05.

Results

Choice testing and sample omission testing

The top panel of Fig. 1 presents the accuracy on forced and choice test trials during the choice testing phase as well as the percentage of trials on which the memory test was escaped. Accuracy was between 80 and 90% correct on both forced and choice test trials. The escape percentage was 55%, which is relatively high given that discrimination accuracy was above 80% correct. An analysis of variance (ANOVA) was conducted on the accuracy data, with trial type (forced test, choice test) and sample duration (short, long) as within-subject factors. There were no significant main effects of trial type or sample duration, and no interaction, all F(1,6) ≤ 2.46. Therefore, accuracy was not significantly higher when pigeons chose to take the test than when they were forced to take the test. This occurred even though pigeons chose to take the test on approximately 45% of the trials.

Fig. 1

Top panel: Mean percentage correct on forced test (black bar) and choice test (light-gray bar) trials and the percentage of choice trials on which the escape option (dark-gray bar) was selected during the initial choice testing phase. Lower panel: Mean percentage correct on forced test (black bar) and choice test (light-gray bar) trials when the sample was presented during sample omission testing. The mean percentage of choice trials on which the escape option was selected on trials in which the sample was either presented (dark-gray bar) or omitted (last bar) is also shown. Error bars: Standard error of the mean (SEM)

The bottom panel of Fig. 1 shows the accuracy on forced test and choice test trials during sample omission testing when the sample was presented, as well as the percentage of escape trials when the sample was presented and when it was omitted. Accuracy was high and very similar on forced test trials and choice test trials. An ANOVA with trial type (forced test, choice test) and sample duration (short, long) as within-subject factors indicated that there were no significant main effects or interaction, all F < 1. Pigeons were no more accurate when they chose to take the test than when they were forced to take the test on sample-presented trials. On both sample-presented and sample-omitted trials, the pigeons escaped on between 50 and 60% of the trials. There was no significant difference in the percentage of escape responses between sample-presented and sample-omitted trials, t < 1. As would be expected, on sample-omitted trials in which pigeons chose to take the test, they responded to the comparison correct for the short sample [M = 87.5, SD = 8.32] at a level significantly above chance, t(6) = 11.91. In contrast to what would be expected if pigeons exhibited functional use of the escape response, the frequency of escaping the test did not increase on trials in which no sample was presented.
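
The above-chance test on sample-omitted trials can be checked against the reported summary statistics. The sketch below recomputes a one-sample t statistic, assuming that chance is 50% (two comparison stimuli) and that n = 7 (inferred from the reported 6 degrees of freedom). The function name is illustrative only.

```python
import math


def one_sample_t(mean, sd, n, mu0):
    """One-sample t statistic computed from summary statistics."""
    standard_error = sd / math.sqrt(n)
    return (mean - mu0) / standard_error


# M and SD are taken from the text; n = 7 and chance = 50% are assumptions.
t = one_sample_t(mean=87.5, sd=8.32, n=7, mu0=50.0)
print(f"t(6) = {t:.2f}")
```

The result is about 11.9, which agrees with the reported t(6) = 11.91 to within rounding of M and SD.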

RI testing and choice performance after RI testing

The top panel of Fig. 2 shows the accuracy on forced test and choice test trials as a function of RI, and the bottom panel shows the percentage of trials on which the memory test was escaped. The data for one pigeon were excluded because accuracy on the forced test trials decreased to chance levels during RI testing. For the remaining six pigeons, accuracy decreased when the RI was increased to 10 sec for both the forced test and choice test trials. At the 0-sec RI, accuracy was higher on long-sample trials than on short-sample trials. Accuracy on long-sample trials was reduced more than accuracy on short-sample trials by increasing the RI to 10 sec. Overall, accuracy did not appear to be greater on chosen test trials than on forced test trials. An ANOVA was conducted on these data with trial type, sample duration, and RI as within-subject factors. There was a significant effect of RI, F(1,5) = 171.19, and a significant trial type × RI interaction, F(1,5) = 7.73. Most importantly, there was a significant trial type × sample duration × RI interaction, F(1,5) = 9.21. Overall, there was no difference in accuracy between forced and choice trials at either the 0-sec RI or the 10-sec RI, all F(1,5) ≤ 2.06. At the 0-sec RI, accuracy was significantly greater on long-sample trials than on short-sample trials regardless of whether the test was forced or chosen, F(1,5) = 21.34 and 6.75, respectively. Increasing the RI to 10 sec reduced accuracy more for the long sample than for the short sample, F(1,5) = 20.28 and 4.75, respectively. At the 10-sec RI, accuracy did not significantly differ for short-sample and long-sample trials regardless of whether the test was forced or chosen, F(1,5) ≤ 1.56. An additional analysis indicated that accuracy at the 10-sec RI was not significantly different from chance for forced short- and long-sample trials or for chosen short- and long-sample trials, ts(5) ≤ 1.91.
These results indicate that there was no overall chosen–forced advantage and that there was no evidence of a significantly higher accuracy for the short sample than for the long sample at the long RI.

Fig. 2

Top panel: Mean percentage correct on forced test and choice test trials. Bottom panel: Mean percentage of choice trials on which the escape option was selected as a function of the retention interval (0 and 10 sec). Error bars: SEM

The bottom panel of Fig. 2 also shows that escape responding occurred frequently during RI testing and that it only showed a very slight increase as the RI was increased. An ANOVA conducted on these data with sample duration and RI as within-subject factors failed to find any significant main effects or interactions, all F(1,5) ≤ 2.65. The frequency of escaping the test did not increase on trials in which the RI was increased.

Figure 3 shows the accuracy and escape data obtained when pigeons were returned to the experimental conditions they had received during initial choice testing (i.e., with a 0-sec RI). One additional pigeon was removed from the study at this point because it began to escape the test on almost every trial. During this phase, the remaining five pigeons began to show consistently greater accuracy when they chose to take the test than when they were forced to take the test. This occurred regardless of whether the sample duration was short or long. An ANOVA on the accuracy data with trial type and sample duration as within-subject factors only showed a significant main effect of trial type, F(1,4) = 9.44. No other main effect or interaction was statistically significant. The percentage of escape responding did not differ significantly from that observed during the initial choice testing phase, t(4) < 1.66.

Fig. 3

The mean percentage correct on forced test (black bar) and choice test (light-gray bar) trials, and the percentage of choice trials on which the escape option (dark-gray bar) was selected during choice test sessions which followed retention interval (RI) testing. Error bars: SEM

ITI and RI testing

The top panel of Fig. 4 presents accuracy on forced test and choice test trials as a function of ITI and RI. At the 0-sec RI, accuracy was slightly higher on choice trials than on forced trials. At the 10-sec RI, matching accuracy was lower, and there no longer appeared to be higher accuracy on choice trials than on forced trials. An ANOVA was conducted on these data with ITI, RI, and trial type as within-subject factors. There was a significant main effect of RI, F(1,4) = 109.52, as well as a trial type × RI interaction that approached significance, F(1,4) = 5.98, p = 0.07. At the 0-sec RI, accuracy was significantly greater when pigeons chose to take the test (M = 81.2, SD = 4.02) than when they were forced to take the test (M = 73.3, SD = 7.17), F(1,4) = 19.71; however, at the 10-sec RI, there was no difference in accuracy for chosen (M = 50.4, SD = 6.16) and forced (M = 54.0, SD = 3.18) tests, F(1,4) = 1.07. There were no other significant main effects or interactions. An additional analysis was conducted on the 10-sec RI data to determine if there was any difference in accuracy for short- and long-sample trials at any of the ITI × trial type combinations. In agreement with the results of the previous RI testing phase, at the 10-sec RI, accuracy did not significantly differ for short-sample and long-sample trials at any of the ITI × trial type combinations, all F(1,4) ≤ 1.74. At the 10-sec RI, accuracy was not significantly different from chance for short- and long-sample trials regardless of whether the test was chosen or forced, all t(4) ≤ 2.59.

Fig. 4

Top panel: Mean percentage correct on forced test and choice test trials as a function of the intertrial interval (ITI; 6 and 60 sec) and the RI (0 and 10 sec). Lower panel: Mean percentage of choice trials on which the escape option was selected as a function of the ITI (6 and 60 sec) and RI (0 and 10 sec). Error bars: SEM

The bottom panel of Fig. 4 presents the percentage of escape responding as a function of ITI and RI. As in previous test phases, escape responding occurred frequently, and it showed a slight but non-significant increase at the longer RI. An ANOVA failed to produce any significant main effects or interaction effects, all F(1,5) ≤ 5.57.

Choice trial performance after ITI and RI testing

The pigeons were returned to the experimental conditions they had received during initial choice testing (i.e., with a 0-sec RI) for 11 sessions. With additional training, the one pigeon removed from the study during the previous RI testing phase regained accurate performance on forced trials, and the data for this pigeon were therefore included in Fig. 5. Accuracy was greater when pigeons chose to take the test than when they were forced to take the test. An ANOVA on the accuracy data with trial type and sample duration as within-subject factors showed a significant main effect of trial type, F(1,5) = 19.89. No other effects were statistically significant. The percentage of escape responding was similar to that observed during earlier testing phases.

Fig. 5

Mean percentage correct on forced test (dark bar) and choice test (light-gray bar) trials and percentage of choice trials on which the escape option (dark-gray bar) was selected during the choice test sessions which followed ITI and RI testing. Error bars: SEM

Figures 6 and 7 show the results for Pigeon 19 and Pigeon 41, respectively. During various stages of testing, these pigeons showed both a chosen–forced advantage and a large increase in escape responding at the longer RI. During the initial choice phase and sample omission test phase, Pigeon 41 did not exhibit consistently higher accuracy on choice tests than on forced tests, but Pigeon 19 did. During sample omission testing, Pigeon 41 showed substantially higher escape rates on trials in which no sample was presented, but Pigeon 19 did not. During RI testing, neither pigeon exhibited consistently higher accuracy on choice trials than on forced trials; however, escape rates were much higher for the more difficult 10-sec RI trials than for the easier 0-sec RI trials. During the choice test sessions which followed RI testing, both pigeons exhibited a higher accuracy on choice trials than on forced trials. During the ITI/RI test phase at the 0-sec delay, Pigeon 19 only exhibited a higher accuracy on the choice trials than on the forced trials when the ITI was 6 sec; however, Pigeon 41 exhibited a higher accuracy on choice trials regardless of whether the ITI was 6 or 60 sec. Pigeon 19 escaped more often when the ITI was 6 sec than when it was 60 sec, and it escaped more at the 10-sec RI than at the 0-sec RI. For Pigeon 41, escape rates were similar for the 6- and 60-sec ITI test sessions, and at the 10-sec RI, Pigeon 41 escaped on almost all of the trials. This result makes the accuracy data at the 10-sec RI for choice trials questionable for this pigeon because they are based on a total of only six trials. When these pigeons were returned to choice test sessions following ITI/RI testing, they continued to show higher accuracy on choice tests than on forced tests.
Thus, although there were some differences, both pigeons provided evidence of the two behavioral effects previously considered to be important for metamemory but more recently explained in terms of operations performed on primary representations: (1) they consistently selected the “no-test” option more frequently on long-delay test trials, and (2) from the RI testing phase onward, they exhibited higher accuracy on trials in which they chose to take the test than on those in which they were forced to take the test. However, neither bird showed a larger difference in accuracy between choice and forced trials as the task difficulty increased (i.e., the 10-sec RI).

Fig. 6

Performance across all phases of the experiment for Pigeon #19 (P19)

Fig. 7

Performance across all phases of the experiment for Pigeon #41 (P41)

Discussion

The data obtained during initial choice testing, sample omission testing, and RI testing appeared to replicate the previous failures of pigeons to demonstrate the two behavioral effects initially proposed as indicative of metacognition (Inman & Shettleworth, 1999; Sutton & Shettleworth, 2008). Pigeons did not show higher accuracy on choice trials than on forced trials, and they did not exhibit significantly higher escape rates as task difficulty increased. However, after the RI testing phase, higher accuracy was observed on choice trials than on forced trials, and it continued to occur at the shorter RI throughout the remainder of the study. The chosen–forced advantage had not previously been obtained in studies with pigeons when the choice to take or escape the test was given before the test stimuli were presented (Inman & Shettleworth, 1999; Sutton & Shettleworth, 2008).

This result, while consistent with explanations that rely on primary representations rather than secondary representations (Crystal & Foote, 2009), still leaves many questions unanswered. Why did the chosen–forced advantage take so long to emerge? Once it emerged, why was it only obtained when, during the ITI/RI test, the RI was 0 sec, and not when the RI was 10 sec? Why did escape responding not significantly increase as the RI was increased during the ITI/RI test? Individual differences appear to be a contributing factor. As noted earlier, both Pigeon 19 and Pigeon 41 exhibited greater escape rates when the RI was 10 sec, and these pigeons also showed higher accuracy on choice trials than on forced trials at the short RI. The failure of these birds to exhibit a similar or larger difference in accuracy between choice and forced trials at the longer RI may have occurred because there was no memory trace available at the 10-sec RI. Escape rates were also very high for both of these birds at the longer RI, and this may have affected the reliability of assessing accuracy on choice trials. The high rate of escape of these two birds at the long RI is consistent with models that rely solely on primary representations and fading memory traces, presumably because at a long RI the initial primary representation has completely faded. It would be worthwhile to conduct additional tests in this task at RIs between 0 and 10 sec. While the overall pattern of results does not provide clear evidence for all of the predictions of behavior set forth by low-level models, the findings come closer to those predictions than the findings of earlier studies in pigeons. In order to exhibit these effects, it appears that pigeons require a great deal of experience with the task and a favorable set of task parameters, and it also appears that some pigeons may be more likely to exhibit the predicted behaviors than others.
Individual differences have also been reported in studies conducted on tufted capuchin monkeys (Fujita, 2009) and rhesus monkeys (Hampton, 2001). In Fujita’s study, the two capuchin monkeys both selected the escape option more often as task difficulty increased, but only one of the monkeys showed the chosen–forced advantage. In Hampton’s study, the two rhesus monkeys both chose the escape option more often as task difficulty increased, and they both showed an overall chosen–forced advantage. However, only one monkey was significantly more accurate on chosen tests than on forced tests at the longer delays.

The ITI in the Foote and Crystal (2007) study was 8 min, while in the Inman and Shettleworth (1999) and Sutton and Shettleworth (2008) studies it was 20 sec. Given the relatively short ITI in the pigeon studies, pigeons may be less inclined to escape a trial and may more frequently choose to take a test. Rats faced with an 8-min ITI, on the other hand, might be expected to escape more often and to choose to take a test only when they are sure of the correct response. In the present study, the ITIs during initial training and testing ranged from 8 to 64 sec, with a mean value of 30 sec. Neither escape rates nor the chosen–forced accuracy difference was differentially affected by an ITI of 6 or 60 sec during a testing session. It may be that pigeons’ escape behavior is insensitive to ITI duration or that a much longer ITI is required to obtain an effect. In future testing, we will examine ITI durations more similar to those used in the Foote and Crystal study.

In Experiment 2 of the Sutton and Shettleworth study, the option to take or escape the test was provided prior to presentation of the comparison stimuli, and escape rates were very low when the RI was 0 sec (M = 3.43%). In Experiment 3 of their study, the escape option was provided at the same time as the comparison stimuli, and the escape rate increased substantially (M = 57.03%). In our study, the escape option was provided prior to presentation of the comparison stimuli, and the escape rate was similar to that observed in Sutton and Shettleworth’s Experiment 3. Thus, elevated escape rates can be obtained in pigeons even when the escape option is provided prior to presentation of the comparisons.

Previous studies of memory for duration samples in pigeons have often reported a choose–short effect, which is characterized by above-chance accuracy on short-sample trials and chance or below-chance accuracy on long-sample trials at extended RIs (see Grant, Spetch, & Kelly, 1997). Several different explanations have been provided for this choose–short effect. According to the subjective shortening hypothesis (Spetch & Wilkie, 1983), the representation of the long sample in working memory shortens as the RI increases and becomes increasingly similar to the representation of the short sample. On the other hand, the instructional ambiguity/confusion hypothesis explains the choose–short effect in terms of a confusion that can occur between the ITI and RI (see Zentall, 2007). According to this hypothesis, pigeons confuse the RI with the ITI because of similarity in the ambient stimulus conditions and, when presented with choice stimuli at the end of an RI, they may respond as if no sample had been presented on that trial. Because the lack of a sample is more similar to a short sample than to a long sample, pigeons would be biased to respond to the stimulus that is correct for the short sample. In our study, no choose–short effect was observed during either the initial RI test or the subsequent ITI and RI testing phase. While the stimulus conditions during the ITI and RI were similar, at the end of the RI a vertical and/or a horizontal line was presented prior to the presentation of comparison stimuli on test trials. This may have been sufficient to disambiguate the RI from the ITI prior to test responding and to prevent the occurrence of a choose–short effect. Alternatively, some other aspect of the training and procedure used in the current study may have been responsible for the absence of a choose–short effect.

The results of our study provide evidence that pigeons can exhibit higher accuracy on a chosen memory test than on a forced test after considerable training. Two pigeons also showed much higher escape rates on trials with a long RI than on trials with a short RI. While these results do provide evidence that pigeons can exhibit the two behavioral effects initially proposed as indicative of metacognition, the effects are neither as strong nor as consistent as those in the data from studies on rhesus monkeys (Beran et al., 2006; Hampton, 2001; Hampton & Hampstead, 2006; Hampton et al., 2004; Kornell et al., 2007; Smith et al., 2003, 2006, 2009) and rats (Foote & Crystal, 2007). However, not all monkey species display the behavior pattern that was previously thought to be indicative of metacognition. Beran et al. (2009) recently reported that capuchin monkeys did not use the escape response in density discrimination tasks. This failure of capuchin monkeys to use the escape response and previous failures of pigeons to show any evidence of a chosen–forced advantage or an increase in escape responding have been viewed by Smith et al. (2009) as evidence against associative explanations of the metacognitive pattern of performance. They argued that neither pigeons nor capuchin monkeys are associatively challenged and that both should therefore be very capable of using cues that would allow them to maximize reward and exhibit the ‘metacognitive’ pattern of performance predicted by quantitative models which incorporate basic discrimination processes. The failure of pigeons in previous studies (Inman & Shettleworth, 1999; Sutton & Shettleworth, 2008) to exhibit the chosen–forced advantage and rates of declining the test that depend on task difficulty was viewed by Smith et al. (2009) as evidence against low-level associative models and as support for a non-associative psychological explanation of metacognitive performance patterns.
However, Hampton (2009) noted that this conclusion may be premature and that there are probably many task-specific methodological factors that affect whether the chosen–forced advantage and the increase in escape rates will be exhibited in a particular species. Recent findings provide empirical support for this point of view. Fujita (2009) recently reported that capuchin monkeys tested in a delayed matching-to-sample task chose the escape option more often as task difficulty increased and that one of the two monkeys showed a consistent chosen–forced advantage. In our experiments, we showed that pigeons are indeed capable of exhibiting the chosen–forced advantage and that some individual pigeons also showed an increase in escape responding at a long RI. Given these findings, it seems that low-level mechanisms cannot be dismissed as explanations for the behavioral effects initially proposed as indicative of metacognition. As Crystal and Foote (2009) have suggested, new methods for studying metacognition in animals are needed that predict behavioral effects which cannot be explained solely in terms of mechanisms operating on primary representations.