The first experiment examined how the pigeons learned to indicate the ordinal position of the objects in this type of multiple-choice discrimination using response targets. After first shaping the pigeons to peck at the response targets in isolation, the first phase of the experiment examined how the pigeons performed when required to make only one choice between two targets placed over the set of three objects. After the pigeons showed good evidence of learning this simple task, the second phase added a third target, and the pigeons were required to make two successive choices on each display. A diagram of this latter procedure is provided in Fig. 1. In this case, after pecking a ready signal, a trial began with a target area superimposed on each of the three objects. Following this three-alternative choice, there was a second choice test between the two targets not chosen on the first choice. The pigeons were rewarded for selecting, at each choice, the target on the pictorially closest object still available (front, then middle when required). We deliberately employed a large and diverse set of stimuli to prevent the pigeons from memorizing specific scenes or object configurations and to promote the formation of as flexible a response rule as possible.
Four male pigeons (Columba livia) were tested. One pigeon (I3) was experimentally naïve at the start of the experiment. The other pigeons (C1, G2, & L4) had previously been tested with different discriminations including 3D action, time perception, and auditory processing. The pigeons were maintained at 80%–85% of their free-feeding weight based on daily weighing. They had free access to water and grit and were kept on a 12:12 light:dark cycle.
Testing occurred in two flat-black Plexiglas operant chambers. Both chambers were equipped with an infrared touch screen (EZscreen EZ-150-Wave-USB) on one wall and a 28-V houselight that was continuously lit. Food reinforcement was provided through a square 5 cm × 5 cm access hole centrally positioned below the touch screen. Stimuli were presented on the monitor (Dell E153FPf or NEC Accusync LCD51VM-BK, 1,024 × 768 resolution), situated directly behind the touch screen. Pigeons C1 & G2 were always tested in the first chamber, and I3 & L4 were tested in the second. All experimental events were controlled through a Microsoft Visual Basic program.
Scene stimuli were 10.5 × 10.5 cm square images rendered using 3DS Max (v2013, Autodesk). All scenes had a depicted ground, which was a brown, textured plane that was shaded to give it the appearance of receding in depth. By vertically flipping the images, this ground was presented at either the bottom or the top of the scene. Each scene also contained three objects: a cube, a hollowed cylinder, and a sphere (see Fig. 1). These objects could be colored as either matte red, blue, or green, with lighting and shading to provide the appearance of depth and edges within the object. A total of 27 different object/color combinations were tested, ranging from having all three objects being different colors to all three being the same color, or combinations in between. The primary light source for shading the scene was situated in one of four locations equidistant from the origin of the scene, located 20° to the left of the camera, 65° to the left of the camera, 20° to the right of the camera, or 65° to the right of the camera. These scene variations (27 color configurations × 4 light configurations × 2 ground position configurations = 216) were applied to all 3D scene configurations of the objects, which are described next.
The 3D scene configurations were generated by systematically varying the rendered placement and size of the three objects. This variation allowed for different pictorial depth cues—object interposition/occlusion, object relative size, and object height in field—to be present to varying degrees. Our systematic variation allowed for four different size configurations, four different height-in-field configurations, and eight different occlusion configurations.
Object size was measured as the difference between an object’s highest and lowest pixel in the scene image (i.e., its vertical extent). The four size configurations consisted of the following arrangements: One configuration had a size difference between each of the front, middle, and back objects, thus making relative size an informative depth cue in the comparison of all the objects. The second configuration had all three objects at the same size, making relative size an uninformative depth cue. The third configuration had a size difference between the front and the middle objects, while the middle and back objects were about equal in size, and the fourth configuration had a size difference between the middle and back objects, while the front and middle objects were about equal. Thus, these two size configurations permitted two of the objects to be compared using size as an informative cue. When configurations required size differences between objects, their vertical extents differed on average by 40%, with a minimum difference of 10% and, in rare cases, differences as large as 125% (i.e., over twice as large in comparison). In absolute terms, the front object averaged approximately 41 mm, the middle object 36 mm, and the back object 31 mm, with standard deviations of 5.5 mm.
The four height-in-field configurations were organized analogously to the four size configurations. An object’s relative height in field was measured using the midpoint of the object. One configuration placed the objects so that all object comparisons were informed by this depth cue, while the second placed them at equal heights relative to the field. The third and fourth configurations permitted two of the objects, but not the third, to be judged using this cue. To make this depth cue informative, the distance between the midpoints of objects had to be at least 20% of their averaged size; this difference averaged 78% across our stimulus set, with a maximum of 230% (i.e., more than two object-extents of separation). To prevent height on the screen from acting as a discriminative cue, half the trials were tested with the scene inverted so that the textured ground appeared toward the top of the image and the front object was located at the top of the array, thus inverting height in field relative to the objects.
Eight occlusion configurations were used that varied the degree of interposition among the three objects. In the most informative condition, the front object partially overlapped both the middle and the back object, and the middle object partially overlapped the back object. In the completely unoccluded condition, there was no overlap among any of the three objects. The remaining six configurations varied the degree of overlap between different combinations of the objects (e.g., front and back overlap, but not the middle). On average, the overlapping portion was about 13% of the whole unoccluded object, with a minimum of 10% and a maximum of 20% occluded. In addition to these three depth-relevant properties of the displays, the front-to-back and left-to-right orders of the objects’ positions were also equated across the three objects.
With these different variations, we had a potential total of 995,328 distinct scene stimuli (4 size configurations × 4 height-in-field configurations × 8 occlusion configurations × 6 front-to-back orders × 6 left-to-right orders × 216 lighting and scene variations). Certain combinations were physically impossible in 3D and were consequently excluded, and we ensured that each scene configuration retained at least one cue available for judging among the objects. As a result, each session pseudorandomly sampled from a pool of 233,280 possible scenes, drawing 1,080 scene configurations and rendering them with our 216 scene and lighting variations. We balanced the counts of each configuration type: sessions comprised all size configurations (276 of each informative condition, 252 of the uninformative condition), all height-in-field configurations (276 of each informative condition, 252 of the uninformative condition), all left-to-right orders (180 each), and all front-to-back object orders (180 each). Occlusion was absent in half of the scene configurations (540), with the other half composed equally of the seven remaining occlusion configurations.
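The stimulus-space counts above can be verified with a few lines of arithmetic (an illustrative check only; the variable names are ours, not part of the original software):

```python
# A quick check of the stimulus-space arithmetic reported above.
size_configs = 4
height_configs = 4
occlusion_configs = 8
front_to_back_orders = 6       # 3! orderings of three objects
left_to_right_orders = 6
scene_variations = 27 * 4 * 2  # color x lighting x ground positions = 216

total = (size_configs * height_configs * occlusion_configs *
         front_to_back_orders * left_to_right_orders * scene_variations)
assert total == 995_328        # potential distinct scene stimuli

# After exclusions, each session sampled from 1,080 scene configurations
# crossed with the 216 scene and lighting variations.
pool = 1_080 * scene_variations
assert pool == 233_280
```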
Finally, the target response areas were identical 12-mm (outer diameter) annuli. They consisted of a series of colored bands around the ring so that they would remain visible against any shading of an object’s color (see Fig. 1). They were randomly placed within the interior of each object on each trial, with the constraint that they not touch any exterior edges.
After the naïve pigeon (I3) was autoshaped to peck the display for food reward, the pigeons were all trained to peck the target area in isolation over a few sessions. Each trial started with a peck at a white 2.9-mm circular ready signal. Following this, the pigeons were first trained to peck at a single target placed on a solid black background. Once pecking was established, this was followed by training to peck at a target randomly located within one of 15 uniformly colored 15.5 × 10.5 cm rectangles that were randomly used across trials. Once pecking was established, the rectangles were replaced by the scenes described previously and discrimination training began.
Phase 1: One-choice discrimination training
The pigeons were first trained to make the essential ordinal depth discrimination using a simpler one-choice variation of the task. In this procedure, only two of the three objects in the scene were randomly selected to have targets placed on them.
After starting each trial with a peck to the ready signal, a scene with targets on two different objects was displayed, located randomly within the bottom third of the computer screen. Random scene placement on the screen was used to prevent absolute spatial biases and promote the pigeons’ processing of the scene and its contents. The pigeons were then required to peck (variable ratio = 6–9) at the target placed on the object appearing closest in depth to obtain food reward (hopper duration 2.5 s, except L4 who was given 4.5 s). Pecks at the other target on the display were considered incorrect selections and terminated the trial. Trials not completed within 10 min were terminated. After a choice, there was an intertrial interval (ITI) of 3 s.
Each session consisted of 216 trials, each testing a different scene configuration, pseudorandomly selected so that the sampling of scene configurations remained balanced across every five sessions of testing. Due to a programming error, pigeons I3 and L4 received scenes lit only from the right at 20° or from the left at 65° during Experiment 1. Phase 1 consisted of 20 sessions for pigeons I3 and L4, and 32 sessions for C1 and G2. Only the first 20 sessions were analyzed.
Phase 2: Two-choice discrimination training
We then started the two-choice phase. Here, targets were placed on all three objects, and the pigeons were required to make two successive choices (each choice VR6–9). In the first choice, selecting the target located on the perceptually closest object of the three provided immediate food reward. Selecting either of the other two targets was considered incorrect. Regardless of first-choice accuracy, the screen was then blanked for 1 s, and the same scene returned minus the just-selected target. If the first choice was rewarded, food was available during the 1-s blank period and for 1.5 s after the scene’s return. The object behind the just-selected target remained present. The pigeons then had to make a second choice between the two remaining targets. Again, selecting the target on the object closest in depth among those still available resulted in food reward, while an incorrect selection had no programmed consequences. Either outcome ended the trial and initiated a 3-s intertrial interval. Each session consisted of 216 two-choice trials. This phase lasted 35–45 complete sessions depending on the bird. Incomplete sessions, possibly due to satiation from the 432 possible reinforcers, were omitted.
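The scoring contingencies of a two-choice trial can be sketched as follows (a minimal illustration with hypothetical names; the actual task was programmed in Visual Basic, and this sketch omits the VR6–9 peck requirement, reinforcement timing, and ITI):

```python
import random

def run_two_choice_trial(depth_order, choose):
    """Score one two-choice trial against the nearest-object rule.

    depth_order: the three object labels ordered nearest to farthest.
    choose: a function that picks one option from the available targets
            (a stand-in for the pigeon's peck).
    Returns (first_correct, second_correct).
    """
    available = list(depth_order)

    # First choice among all three targets: the nearest object is correct.
    first_pick = choose(available)
    first_correct = first_pick == depth_order[0]

    # The chosen target is removed; the scene and its objects remain.
    available.remove(first_pick)

    # Second choice between the two remaining targets: the nearer one is correct.
    second_pick = choose(available)
    second_correct = second_pick == min(available, key=depth_order.index)
    return first_correct, second_correct

# A chooser that pecks at random performs near the 33% / 50% chance levels.
random.seed(0)
results = [run_two_choice_trial(["front", "middle", "back"], random.choice)
           for _ in range(10_000)]
first_rate = sum(fc for fc, _ in results) / len(results)
second_rate = sum(sc for _, sc in results) / len(results)
```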
At this point, we made two procedural modifications to help integrate the two successive choices more seamlessly as a single trial. First, food reward after correct first choices was made probabilistic (50%); reward after correct second choices remained at 100%. Second, on 64 randomly selected trials, a correct first choice was required to advance to the second choice; on these trials, an incorrect first choice simply ended the trial. These changes extended training for another 28–36 sessions. When evaluated, neither modification had changed the pigeons’ choice behavior, so all of these sessions were combined for the analyses below. Depending on the bird, Phase 2 lasted 63–79 sessions; the results examine the first 60 sessions.
Phase 1: One-choice discrimination training
The pigeons acquired the one-choice discrimination easily. The left portion of Fig. 2 shows the increase in average choice accuracy for the four pigeons over the first 20 sessions of Phase 1. Because only a single choice was required between the two target areas, chance was considered equal to 50%. The pigeons were significantly above chance by the seventh session (seventh session accuracy = 61%), t(3) = 5.7, p = .010, d = 2.9.
To better understand how the pigeons learned the task, we examined their performance when the two targets were located on the objects at the three possible depth combinations (i.e., front vs. middle, front vs. back, & middle vs. back). A repeated-measures (RM) analysis of variance (ANOVA; depth combination × four-session block) evaluating choice accuracy over Phase 1 revealed a significant main effect of depth combination, F(2, 6) = 22.0, p = .002, ηp2 = .88, and a main effect of session block, F(4, 12) = 3.8, p = .031, ηp2 = .56, but no interaction between the two factors, F(8, 24) = 2.0, p = .085.
Looking at the levels of accuracy over the final four-session block of Phase 1, all four pigeons were most accurate when the targets were located on the front and back objects (mean accuracy = 70%), t(3) = 3.8, p = .032, d = 1.9, and poorest when they were located on the front and middle objects (59%), t(3) = 3.2, p = .048, d = 1.6, with the middle and back combination in between (64%), t(3) = 4.8, p = .017, d = 2.4.
Reaction time (RT) was measured as the time from stimulus onset to the first peck on a target. Long RTs over 10 s were excluded from this analysis (about 2% of trials). Over the last four-session block of Phase 1, mean RT for correct choice responses was faster (1,456 ms) than for incorrect choice responses (1,603 ms). This correct/incorrect RT difference was found in all four birds and across all three combinations of depth placement. Overall, RTs were fastest when the targets were on the front and middle objects (1,405 ms), slowest when they were on the middle and back objects (1,652 ms), with the front and back combination in between (1,433 ms). An RM ANOVA evaluating reaction time in this block across these depth combinations confirmed a significant main effect, F(2, 6) = 16.6, p = .004, ηp2 = .85; RTs were log transformed prior to analysis for normality. Post hoc comparisons suggested that the pigeons were slower with the middle/back discrimination relative to both the front/middle (p = .009) and front/back conditions (p = .039). There was no difference between the front/middle and front/back conditions (p = .627).
Phase 2: Two-choice discrimination training
The right portion of Fig. 2 shows mean choice accuracy for all four pigeons during Phase 2. Accuracy for the first choice (black symbols), where three targets were available, was judged relative to chance set at 33%, while accuracy for the second choice (gray symbols), where two targets were available, was judged relative to chance set at 50%. Despite the increased difficulty of the first choice with the addition of the third target, second-choice accuracy in Phase 2 remained at levels comparable to those observed during Phase 1 and did not change over the phase, F(14, 42) = 0.8, p = .65. In contrast, first-choice accuracy grew slowly over the phase, F(14, 42) = 3.1, p = .002, ηp2 = .51. The pigeons gradually increased their accuracy on this more challenging first choice over the initial 40 sessions of Phase 2 before reaching an apparent asymptote at around 60%. Examining just the first session of Phase 2, the pigeons exhibited above-chance discrimination immediately upon the introduction of the third target (first-choice accuracy = 50%; chance = 33%), t(3) = 4.0, p = .028, d = 2.0, and the addition of a second choice (second-choice accuracy = 64%; chance = 50%), t(3) = 5.2, p = .014, d = 2.6.
Examining the last two blocks of this phase, the pigeons correctly selected the front object on average 68% of the time and made errors by selecting the middle object (23%) or back object (9%) on the remainder of the trials. The pigeons made two correct choices in a row 46% of the time, which should happen by chance 17% of the time. The pigeons made a correct first choice followed by an incorrect second choice 22% of the time, and made an incorrect first choice followed by a correct second choice 20% of the time. The pigeons made two incorrect choices in a row on 12% of trials where that was possible.
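The 17% baseline for two consecutive correct choices follows directly from the two independent chance levels (a quick check for illustration, not part of the original analysis):

```python
from fractions import Fraction

# Chance of a correct choice at each step under random target selection.
p_first = Fraction(1, 3)   # three targets available on the first choice
p_second = Fraction(1, 2)  # two targets available on the second choice

# Chance of two correct choices in a row: 1/3 * 1/2 = 1/6, about 17%.
p_both = p_first * p_second

# The four observed outcome proportions from the last two blocks sum to 1.
observed = {
    "correct, correct": 0.46,
    "correct, incorrect": 0.22,
    "incorrect, correct": 0.20,
    "incorrect, incorrect": 0.12,
}
```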
Certain depth combinations of objects supported better accuracy than others in Phase 1. We conducted a similar analysis for those depth combinations for the second choices in Phase 2. One important difference here is that now the pigeons had already made a first choice to one of the three targets (front, middle, or back object), which had been either correct or not. Examining the last two blocks of Phase 2, all four pigeons performed better on their second choice when the first choice was correct (66%) versus when the first choice was incorrect (62%). A more detailed RM ANOVA (depth combination) on second choice accuracy revealed a significant main effect of depth combination, F(2, 6) = 38.3, p < .001, ηp2 = .93.
When the pigeons’ second choice was between the middle and back objects (i.e., they had made a correct front-target first choice), they were successful at choosing the middle object (accuracy = 66%, chance = 50%), t(3) = 6.4, p = .008, d = 1.0. When the pigeons’ second choice was between the front and back objects (i.e., they had made an incorrect middle-object first choice), they were also accurate in choosing the front object (accuracy = 66%), t(3) = 4.1, p = .026, d = 0.9. When the pigeons’ second choice was between the front and middle objects (i.e., they had made an incorrect back-object first choice), they were at chance in choosing the front object (accuracy = 50%), t(3) = 0.1, p = .924.
This experiment found that pigeons can learn a multiple-choice discrimination involving the placement of target response areas on top of different objects placed at varying apparent depths in a complex scene. All four pigeons learned to make above-chance first and second choices as determined by the ordinal position of the objects in pictorial depth. Although the approaches were completely different, these results converge with those reached by Cavoto and Cook (2006) in their examination of depth perception by pigeons. Their discrimination involved a go/no-go task in which pigeons had to peck at one particular depth order (S+) of specifically colored objects for food reward while suppressing their pecking to the five other possible orderings of the same objects (S−). In contrast, the current task required the pigeons to choose a target on the nearest object followed by the middle object, but critically, the correct object identity and scene composition varied immensely from trial to trial. Despite these differences in the nature of the required discrimination, the response methodology, and the appearance of the stimulus sets, both studies provide good converging evidence that pigeons can perceive apparent depth relations among objects rendered in 3D scenes.
In general, when two of the three objects had targets, the pigeons’ best performance occurred when the back object was a possible response option (i.e., front vs. back or middle vs. back). There are a few possible factors that may contribute to this finding. First, the constraints of 3D physics impact the appearance of the objects even though the target areas associated with the objects were equal in size. When relative size was an informative cue, objects in front were larger than objects behind, and when occlusion was an informative cue, objects in front were less likely to be occluded (i.e., never for the front object and sometimes for the middle object). The back object was therefore sometimes smaller, and its full contour and surface features were sometimes not visible. Both of these features may have made the front object more salient as the rewarded response option independent of the processing of ordinal position, and the middle object would have also shared some of these same benefits.
Second, if pigeons have a particular difficulty with amodal completion (Qadri & Cook, 2015), for example, then the depth-independent shape recognition of the back objects would be more impacted than front objects and impair possible object-specific associations. Third, given that choosing the back object was never rewarded, the back object may have been more aversive overall. In fact, selection of the back object in the first choice led to chance level performance in the second choice, possibly indicating poor perceptual conditions for the depth discrimination or disengagement by the pigeons. Alternatively, the pigeons may have recognized that since they were on the second choice of the scene, they needed to select the middle object within the scene, which would be incorrect if the front object was still available, as is the case when they make an error on the first choice. Such an ordinal-object bias would boost performance in the middle versus back discrimination and impair the front versus middle discrimination, consistent with the observed data.
Lastly, the response rule used by the pigeons generalized well across the target areas used to judge an object’s position. This flexibility was reflected in several ways. First, when the third response area and second-choice requirement were added in Phase 2, the pigeons showed significant above-chance transfer. Second, the pigeons could use the rule successfully across two successive choices, first applying it to the front object and then to the middle object. Finally, this rule had to be general enough to deal with the very large number of variable situations employed here. Besides their difference in apparent depth, the objects varied in their identity, location, and color, along with the various scene and lighting characteristics present or absent on each trial. Given the number of possible scenes and the number of training trials, stimuli rarely repeated more than a few times across the entire experiment, likely forcing the pigeons to form as flexible a response rule as possible.