Introduction

Every day, animals face countless problem-solving situations. One of the key ingredients of successful problem-solving is the inhibition of certain prepotent responses that exist because they are either preprogrammed or have been extensively reinforced in the past. Developmental studies on animals and children have shown that inhibitory problems, rather than a lack of conceptual comprehension, can often prevent subjects from solving certain tasks (e.g., Deacon 1997; Diamond 1990).

Numerous species have been observed to perform well on detour tasks in which they have to walk around a barrier (e.g., fence) to obtain food or a social reward (chimpanzees: Köhler 1925; Kellog and Kellog 1933; chicks: Regolin et al. 1994, 1995a; cats: Poucet et al. 1983; dogs: Pongracz et al. 2001; quokkas: Wynne and Leguet 2004; quails, canaries and herring gulls: Zucca et al. 2005; fish: Bisazza et al. 1997; snails: Atkinson 2003). It is very likely that species differ in the way they solve this task. Snails, for example, rely on sensory feedback while moving along a barrier, whereas 2-day-old chicks have been reported to solve detour tasks in their initial choice, although some evidence suggests that they may perform better in seeking social partners than prey (Regolin et al. 1995b). Nevertheless, there are some task features to which various species respond in similar ways. Thus, comparisons of the behavior in front of transparent and opaque barriers with various species (chicks: Regolin et al. 1994; cats: Poucet et al. 1983; dogs: Chapuis et al. 1983) have confirmed an early finding of Köhler (1925) with chimpanzees, that transparent barriers pose a more difficult problem than opaque ones.

In contrast to detour problems in which subjects have to move around a barrier themselves, Köhler (1925) showed that chimpanzees perform poorly when pushing an object around a barrier. However, later studies have found that chimpanzees and other primates can solve mechanical mazes, computerized mazes, and bent-wire detour problems that require the inhibition of direct solutions and the activation of alternative indirect routes (Bingham 1929; Guillaume and Meyerson 1930; Davis et al. 1957; Davis 1958, 1968; Washburn and Rumbaugh 1992; Washburn et al. 1991). One of the tasks that has received considerable research attention in connection to the ability to inhibit prepotent responses is the detour-reaching task or object retrieval task. In this task, a toy is placed into a plexiglass box with an opening on one side only (Diamond 1981). When 7–9-month-old human infants see the toy through the closed plexiglass side of the box, they will reach straight for it, despite tactile feedback from the plexiglass. In contrast to 9-month-olds, older infants find the opening of the box (Diamond 1990). When tested with an opaque box, 7–9-month-old infants perform better (Diamond 1981, 1990), suggesting that the visibility of the attractive toy evokes the urge to reach for it directly. In several studies Diamond has shown that performance on the object retrieval task is linked to maturation of the dorsolateral prefrontal cortex (e.g., Diamond and Goldman-Rakic 1986; Diamond 1991a, b).

Adult rhesus monkeys, marmosets and vervet monkeys easily find the opening of the object retrieval box (Diamond and Goldman-Rakic 1989; Roberts et al. 1991; Taylor et al. 1990a, b). In contrast, cotton-top tamarins fail to perform above chance in this task (and continue reaching directly) even after 24 trials (Santos et al. 1999). However, given sufficient experience that included a training phase with an opaque box, tamarins were also able to overcome their initial difficulties and solve the reaching problem.

Surprisingly, performance in object-reaching tasks has yet to be examined in the great apes. This lack of information about humans’ closest living relatives is particularly puzzling because increases in prefrontal cortex have been postulated in human evolution as a key development in executive problem-solving and forward-planning. Crucially, the great apes, rather than the monkeys that have been tested, are the group that displays the largest prefrontal cortex among nonhuman primates (see Semendeferi 1999). Therefore, data on great apes’ performance are crucial for making inferences about the evolution of inhibitory skills in humans. Moreover, obtaining data from all great ape species, not just one species (e.g., chimpanzees), is essential for making inferences about the evolution of human and ape cognitive skills (see Beck 1982; Parker et al. 1999). A joint appraisal could help to infer points where this capability may have increased.

Some authors have suggested that orangutans rather than our closest relative, the chimpanzee, possess greater inhibitory skills. For instance, Shumaker et al. (2002) found that two orangutans solved a reversed contingency task that chimpanzees had systematically failed (Boysen and Berntson 1995; Boysen et al. 1996). However, the study by Shumaker et al. (2002) has been criticized because orangutans showed no initial preference for the larger quantity, and, therefore, they did not have to inhibit reaching for the larger quantity of food (Kralik et al. 2002; Vlamings et al. 2006). Moreover, Vlamings et al. (2006) (see also Uher and Call 2008) tested all the great apes in a reverse contingency task and found no evidence of the putative differences between chimpanzees and orangutans. Some orangutans (and chimpanzees) passed the task while some orangutans (and chimpanzees) failed it. Note that some monkeys, if given enough trials, can also pass this task even without using correction procedures (Albiach-Serrano et al. 2007; Murray et al. 2005).

The aim of the current study was to compare the ability of great apes to solve a modified version of the classical detour-reaching task, taking great care to use the same method with all species. This is important because, although data from multiple species are available (for instance, in the case of the classical detour task), inter-species comparisons are often difficult to interpret because the methods used to study each species differ substantially. Upon completing our first experiment, we conducted a follow-up with those apes that initially failed the task to assess the reasons for their failure and to see if a minimum amount of training would help them overcome their initial bias. Since the current study also included apes from two different captive populations (zoo and sanctuary), we investigated whether the origin of the apes had an effect on performance. This comparison is important to map inter-population differences in cognition, a topic that has received relatively little research attention (but see Bania et al. 2009; Call and Tomasello 1996; Furlong et al. 2008). Finally, we tested 3–5-year-old children on the same task, as this age represents a key transition period as evidenced by neurological and behavioral studies. Neurological studies reported that the prefrontal lobe, an area implicated in the inhibition of prepotent responses, shows an important growth spurt between the age of 4 and 4 years (Luria 1973; Huttenlocher 1979; Thatcher 1992). Among the major changes detected during this period are the development in the size and complexity of cells, which include myelinisation, fissuration and synaptic density (Carver et al. 2001a).

Behavioral studies have indicated that 3–4-year-old children perform poorly on a variety of tasks requiring inhibitory control, including the day–night task (Gerstadt et al. 1994; Diamond et al. 2002), the tapping-imitation task (Luria 1973; Diamond and Taylor 1996), the delay of gratification task (Mischel and Mischel 1983; Mischel et al. 1989) and the dimensional card-sorting task (Frye et al. 1995). However, there is a marked improvement in inhibitory control between 3 and 5 years of age (Carlson and Moses 2001; Kopp 1982; Zelazo and Frye 1998). Livesey and Morgan (1991) administered a go/no-go task to 4- and 5-year-old children, and found that performance was near perfect in the latter age group. Further, it has been reported that the preschool period is an important period for the development of inhibitory skills as measured by the stop-signal task (Carver et al. 2001a, b), in which subjects have to inhibit their responses following an auditory signal in a forced choice discrimination task. Even the youngest age group (children between 4 and 9 years old were tested) had some capacity for withholding responses, and performance improved significantly with age, in particular in the youngest age group. According to Bell and Livesey (1985), inhibitory behavior is linked to self-regulation, the development of which is reported to be an important milestone in the young child (Kopp 1982; Lee et al. 1983). As far as we know, there have been no studies that investigate inhibitory control in preschool children in a problem-solving setting. Additionally, except for the study of Carlson et al. (2002), there have been no studies that directly compare inhibitory skills among 3-, 4- and 5-year-olds. Based on the previous studies, we expected older children to perform better than younger children in the detour-reaching task used in this study.

Experiment 1: zoo apes

Methods

Subjects

We tested 27 great apes housed at the Wolfgang Köhler Primate Research Center (WKPRC) in Leipzig (Germany). There were 6 gorillas (Gorilla gorilla; age range 5–25 years; 2 males, 4 females), 7 orangutans (Pongo pygmaeus; age range 5–33 years; 2 males, 5 females), 4 bonobos (Pan paniscus; age range 6–20 years; 3 males, 1 female) and 10 chimpanzees (Pan troglodytes; age range 5–27 years; 3 males, 7 females). All animals had participated in various experimental cognitive studies at the WKPRC (see http://wkprc.eva.mpg.de). Subjects were not food deprived prior to or during the experiment and water was available ad libitum during the experiment. We included all subjects older than 3 years of age that were available for testing at the time.

Materials

The apparatus consisted of a bottomless opaque wooden box (50 × 53 × 30 cm) resting on a platform and attached to the mesh by a metal frame (see Fig. 1a). The front panel of the box had two transparent Plexiglas doors (18 × 18 cm) suspended by hinges that opened inwards when pushed. We used transparent doors because previous studies had shown that several species find transparent barriers more challenging than opaque ones (e.g., Chapuis et al. 1983; Diamond 1990; Regolin et al. 1994). The mesh contained two large holes that coincided with the box doors through which the animal could stick his or her entire arm. A 5-cm piece of banana could be placed right behind one of the hinged doors via one of two lateral trap doors, which were out of reach of the subject. An opaque plexiglass panel (80 × 90 cm) blocked the subject’s access to the hinged doors during baiting. This prevented the subject from seeing where the food was placed. In one of the conditions, the banana was covered by a small opaque cup (5 cm in diameter and 6 cm in height). If the subject tried to reach for the food directly, the door pushed the food away and the reward fell out of reach of the subject. To get the reward, subjects had to reach indirectly, through the empty door, and grab the reward from behind (see Fig. 1b). Two cameras recorded testing. One camera was placed underneath the box, the other one at the side. The signal of both cameras was fed to a single tape in a DV-walkman.

Fig. 1
Experimental setup (a) and solution to the task (b)

Procedure

Each subject was tested individually in an observation room, except for one orangutan and three chimpanzees that were accompanied by their young offspring. At the beginning of each trial, the experimenter installed an opaque panel and placed a piece of banana behind either the left or right door of the box. In the visible condition the reward was fully visible, whereas in the invisible condition the experimenter covered the reward with an upside-down opaque cup. Including trials in the visible and invisible conditions allowed us to assess whether seeing the reward made the problem harder than not seeing it. After baiting was completed, the experimenter removed the opaque panel and sat down approximately 1 m behind the box. Subjects were allowed to approach the box and take the food by reaching through the doors (see Fig. 1b). The trial ended when subjects got or lost the food. Then the opaque panel was lowered again and the next trial was conducted. If subjects failed to respond to the box, they were vocally encouraged to do so. If subjects did not respond after 5 min, the session was terminated. Testing stopped after subjects became unwilling to participate in three consecutive sessions. All sessions were videotaped.

Design

We used a 4 (species) × 2 (condition: visible/invisible) split plot, with species as the between and condition as the within subjects factor. The order in which the conditions were presented was counterbalanced within species. Each condition consisted of ten trials [5 trials for each food location (left/right)]. The order of food location was counterbalanced across trials using a randomization program, with the constraint that food was never in the same location three times in a row. Subjects were tested in two sessions (one per condition), each run on a different day. The median number of days between conditions was one.
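The randomization program itself is not described, so the following is only a minimal sketch of one way such a trial order could be generated, assuming simple rejection sampling. The function names (`has_run`, `make_side_order`) and the "L"/"R" coding are hypothetical and chosen for illustration.

```python
import random


def has_run(seq, run_length):
    """Return True if seq contains run_length identical items in a row."""
    return any(
        len(set(seq[i:i + run_length])) == 1
        for i in range(len(seq) - run_length + 1)
    )


def make_side_order(trials_per_side=5, max_run=2, rng=None):
    """Shuffle a balanced left/right order until no side repeats more than
    max_run times in a row (rejection sampling; an assumed approach)."""
    rng = rng or random.Random()
    order = ["L"] * trials_per_side + ["R"] * trials_per_side
    while True:
        rng.shuffle(order)
        if not has_run(order, max_run + 1):
            return list(order)


if __name__ == "__main__":
    # Ten trials, five per side, never three in a row on the same side.
    print(make_side_order(rng=random.Random(2010)))
```

Rejection sampling keeps every admissible order equally likely, which a greedy trial-by-trial repair would not guarantee.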

Data scoring and analyses

We scored from the videotapes whether subjects obtained the reward. A second observer coded 20% of the sessions. Inter-observer reliability was excellent (Cohen’s kappa = 0.94). We analyzed the percent of correct trials as a function of species, condition and sex. We used nonparametric statistics because the assumption of homogeneity of variance was not met. We corrected multiple pair-wise comparisons with the Bonferroni–Holm procedure (Holm 1979).
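For readers unfamiliar with the Bonferroni–Holm procedure (Holm 1979), the sketch below shows the step-down adjustment in its usual textbook form. The function name `holm_adjust` and the example P values are hypothetical; they are not the values analyzed in this study.

```python
def holm_adjust(p_values):
    """Return Holm step-down adjusted P values, in the input order."""
    m = len(p_values)
    # Indices of the P values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # The smallest P is multiplied by m, the next by m - 1, and so on.
        candidate = min(1.0, (m - rank) * p_values[idx])
        # Enforce monotonicity: an adjusted P never falls below an earlier one.
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted


if __name__ == "__main__":
    # Hypothetical P values from three pairwise comparisons.
    print(holm_adjust([0.01, 0.04, 0.03]))
```

A comparison is then declared significant when its adjusted P value falls below the chosen alpha level.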

Results

Two chimpanzees and one bonobo were dropped from the analysis because they did not complete the two sessions. Figure 2 presents the percentage of correct trials as a function of species. There were no significant differences between conditions. Consequently, we collapsed this variable in subsequent analyses. There were significant differences between species (Kruskal–Wallis test: χ2 = 17.52; df = 3; P = 0.001). Pairwise comparisons revealed that orangutans outperformed all other species (Mann–Whitney tests: Z > 2.71; P < 0.01 in all cases). Males and females performed at comparable levels (Mann–Whitney test: Z = 0.59; P = 0.55). Unlike other species, orangutans significantly improved their performance during testing from 29% correct in the first 5-trial block to 91% correct in the last 5-trial block (Friedman test: χ2 = 14.22; df = 3; P = 0.003). Only one out of seven orangutans solved the problem in the first trial, although she subsequently missed other trials (no subject from any other species solved the problem on their first trial). We found no evidence that age affected performance in orangutans (Spearman r = 0.07; P = 0.88; N = 7).

Fig. 2

Mean percentage (+SEM) of correct trials as a function of species in “Experiment 1”

Discussion

Orangutans outperformed all other great ape species. Although initially they also performed poorly, unlike other species, they learned to inhibit reaching directly for the reward within 20 trials. It is conceivable that this reflects the more developed inhibitory skills of orangutans, as some authors have suggested (Shumaker et al. 2002, but see Vlamings et al. 2006). Perhaps they succeeded because they were not drawn as strongly to the food as their African counterparts, which face stronger food competition with group mates (Shumaker et al. 2002). Additionally, orangutans’ more deliberate means of locomotion in the canopy (see Povinelli and Cant 1995) may have contributed as well. However, the fact that orangutans also made numerous mistakes in the initial trials shows that they experienced the same initial pull as the other ape species.

Unlike previous studies with human infants and monkeys (e.g., Diamond 1990; Santos et al. 1999) the visibility of the reward had no effect on performance. However, the barrier in the current study was always transparent whereas other studies made the barrier opaque, which may explain the differences between studies.

Although the species differences reported here may represent a genuine species difference, caution is required for two reasons. First, our sample size is small, and therefore the impact of large inter-individual differences on the group comparisons should not be underestimated. It is conceivable that the origin of our subjects (zoo) may have had an important effect on performance, although we can rule out that a differential experimental history was responsible for the observed differences between orangutans and other apes because all species routinely received the same tests. Second, even if the species differences were genuine, the strength of the prepotent response in unsuccessful subjects is unknown. It is conceivable that unsuccessful subjects could learn to overcome their responses if provided with some sort of training. In the next two experiments, we addressed these two open questions.

Experiment 2: two training regimes for apes

In this experiment, we investigated whether subjects could learn to refrain from reaching directly for the reward after receiving a special training that consisted of blocking the baited door so that subjects were forced to use the other door and grab the reward from behind. We divided the subjects that had failed “Experiment 1” into two groups. Each group received one of two training regimes. The fixed regime taught subjects to always reach through the same door to get the reward, while the variable training regime trained subjects to reach through either door, depending on where the reward was placed. Once training was completed, we presented the visible condition of “Experiment 1”.

Methods

Subjects

We included all subjects that had failed “Experiment 1” except one gorilla and one chimpanzee that were dropped because of their extreme side-biased responding. Thus, the final sample consisted of 5 gorillas (age range 5–25 years), 4 bonobos (age range 6–20 years) and 9 chimpanzees (age range 5–27 years).

Materials

We used the same box and setup as in “Experiment 1” with two exceptions. First, baiting took place behind a transparent (not an opaque) panel (80 × 90 cm) placed between the mesh and the box. Second, we used a piece of opaque plexiglass (5 × 30 cm) placed vertically behind the door where the reward was located to block the door movement during training (Fig. 3).

Fig. 3
Picture showing the plastic piece blocking the movement of the door to train subjects to reach indirectly for the reward

Procedure

The basic testing procedure was the same as in “Experiment 1” except that there were two phases: training and testing. During training, the experimenter inserted the transparent panel between the apparatus and the subject. Then the experimenter captured the subject’s attention, placed the reward behind one of the doors and blocked the door vertically by sliding down a piece of opaque plastic in full view of the subject. Upon removal of the transparent panel, the subject was allowed to approach the apparatus and take the food by reaching through the unblocked door. As soon as the subject acquired the food, the safety panel was lowered and the procedure was repeated for a total of six trials. In the fixed training regime, the reward was always placed behind the same door (with left/right side counterbalanced across subjects) whereas in the variable regime, the reward appeared equally often behind each door.

Upon completing the six training trials, subjects received six test trials which were identical to visible trials in “Experiment 1”, except that we used a transparent rather than an opaque panel to prevent subjects from reaching before baiting was completed. On the first test trial, the reward always appeared on the opposite side from where it was placed on the last training trial. In the rest of the trials the order of the food location was counterbalanced across trials. If subjects did not respond or retrieve the food during training or test trials, the experimenter encouraged them vocally. If subjects did not respond after 5 min, the session ended. When subjects were not able to retrieve the food or became unwilling to be tested in six consecutive sessions, testing stopped. All sessions were videotaped.

Design

We used a between subjects design with training condition (fixed/variable) as the between subjects factor. Each training condition consisted of six trials. Half of the subjects were presented with the fixed condition, the other half with the variable condition. Participants were matched across conditions on the basis of sex, age and species so that there were no significant differences between groups in any of these variables.

Data scoring and analyses

We scored from the videotapes whether subjects kept knocking against the blocked door during training and whether they obtained the reward in the test. A second observer coded 20% of the sessions for each of these two variables. Inter-observer reliability was excellent for knocking (Cohen’s kappa = 1.0) and success (Cohen’s kappa = 0.94).
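Cohen’s kappa corrects raw agreement between two observers for the agreement expected by chance. The sketch below illustrates the standard calculation; the function name `cohens_kappa` and the example codes are hypothetical and are not the study’s data.

```python
from collections import Counter


def cohens_kappa(codes_a, codes_b):
    """Cohen's kappa for two observers who coded the same trials."""
    if len(codes_a) != len(codes_b) or not codes_a:
        raise ValueError("both observers must code the same non-empty set of trials")
    n = len(codes_a)
    # Proportion of trials on which the observers agree.
    observed = sum(a == b for a, b in zip(codes_a, codes_b)) / n
    # Agreement expected by chance from each observer's marginal frequencies.
    freq_a = Counter(codes_a)
    freq_b = Counter(codes_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        # Kappa is undefined when both observers used one identical category;
        # returning 1.0 here is a convention chosen for this sketch.
        return 1.0
    return (observed - expected) / (1 - expected)


if __name__ == "__main__":
    # Hypothetical codes: 1 = reward obtained, 0 = reward lost.
    observer_1 = [1, 1, 0, 0, 1, 0, 1, 0, 1, 1]
    observer_2 = [1, 1, 0, 0, 1, 0, 1, 0, 1, 0]
    print(round(cohens_kappa(observer_1, observer_2), 2))
```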

We compared the percent of correct trials before and after training. To do so, we selected the last six trials of “Experiment 1” and compared them to the six test trials of the current experiment. Additionally, we compared the overall percent of correct trials in the test as a function of the training regime. Finally, we correlated the percent of correct trials in the test with the percentage of trials in which subjects knocked on the door during training for each of the training regimes. Due to our reduced sample size, we were unable to analyze species differences and consequently, pooled all species together. We used nonparametric statistics because the assumption of homogeneity of variance was not met.

Results

Two gorillas failed to learn during training and another one quit responding during testing, and they were dropped from subsequent analyses. Figure 4 presents the percentage of correct trials before and after training in the remaining subjects. Subjects improved their performance in the test compared to the pretest both in the fixed (Wilcoxon test: Z = 2.41; P = 0.016) and variable condition (Wilcoxon test: Z = 2.54; P = 0.011). Additionally, there were no significant differences in the test between the two training methods (Mann–Whitney test: Z = 0.81; P = 0.42). However, the distribution of responses differed substantially between training regimes. Subjects in the fixed training showed a more restricted distribution of responses with the values clustered around the median (inter-quartile range 50–67%; kurtosis 3.17) whereas subjects in the variable training showed a greater dispersal in their responses (inter-quartile range 12.5–70%; kurtosis 1.04). We found no evidence that either sex (Mann–Whitney test: Z = 0.68; P = 0.50) or age (Spearman r = 0.33; P = 0.20; N = 17) significantly affected performance.

Fig. 4
Mean percentage (+SEM) of correct trials before and after training as a function of training regime

Training regimes also differed in the percentage of trials in which subjects knocked against the plexiglass during training. Subjects with variable training knocked against the plexiglass significantly more often than subjects with fixed training (Mann–Whitney test: Z = 2.75; P = 0.006). Interestingly, the percentage of knocking during training predicted success during the test for the variable training regime (Spearman r = −0.83; P = 0.003; N = 10). Subjects who knocked less often against the plexiglass during training performed better in the test than those who knocked frequently against the plexiglass during training. In contrast, subjects in the fixed training regime showed no relation between knocking against the plexiglass during training and test performance (Spearman r = −0.11; P = 0.81; N = 7).

Discussion

Most subjects could learn to overcome their prepotent responses with little additional training (six forced trials)—only two gorillas failed to benefit from this training. Both training regimes resulted in comparable overall levels of effectiveness but different distributions. On the one hand, the fixed regime trained subjects to always reach through one of the doors and when presented with the test, in which the location of the reward changed from trial to trial, subjects got about 50% of the trials correct because they always reached through the same door—the one they had been trained to use. On the other hand, the variable regime did not train subjects to reach through one particular door, but required the subjects to reach through different doors in different trials depending on the position of the reward. Some subjects seem to have understood this because they did well in the test, better in general than most subjects in the fixed training regime. Interestingly, those successful subjects were also those who knocked less on the blocked door during training. In fact, those subjects who knocked on the door more often during training not only performed worse than the other subjects in the variable training regime but also generally worse than most subjects in the fixed training regime.

Thus, the fixed regime trained subjects to produce a single response and therefore was easier to acquire—all subjects except one scored 50% or above. The downside of this regime, however, was that only one subject (14%) scored above 50%. In contrast, the variable regime trained flexibility, and therefore it was more effective in the test phase in which 40% of the subjects scored above 50% correct, but its downside was that another 40% of the subjects scored below 20% correct. Taken together, these results indicated that subjects could be trained to overcome their prepotent reaching response toward the reward in a few trials, but the training regime had a profound impact on the flexibility that subjects displayed during the test. In the next experiment, we returned to the question of species differences by testing additional groups of apes in the original problem.

Experiment 3: sanctuary apes

In this experiment, we sought to confirm the species differences found in “Experiment 1” with another sample of great apes. In this case, the apes came from sanctuaries located in countries with wild ape populations. Unlike zoo apes, sanctuary apes had been born in the wild and had been transferred to sanctuaries after they were orphaned or confiscated from humans who kept them illegally as pets. At the sanctuaries, apes had the opportunity to join other apes in social groups and spend some time roaming the forest habitat surrounding the sanctuaries. The inclusion of apes with different rearing histories allowed us to investigate the effects of ape origin on performance.

Methods

Subjects

We tested 14 bonobos (8 males and 6 females; mean age 5.5 years; range 4–8 years), 12 chimpanzees (6 males and 6 females; mean age 6.8 years; range 4–15 years), and 11 orangutans (6 males and 5 females; mean age 6.4 years; range 4–10 years). The bonobos were housed at the Lola Ya Bonobo sanctuary in the Democratic Republic of Congo, the chimpanzees at the Ngamba Island sanctuary in Uganda, and the orangutans at the Orangutan Care Center and Quarantine in Pasir Panjang, Kalimantan, Indonesia. We dropped one orangutan that, instead of reaching for the food, destroyed the testing unit. Subjects were tested individually and they were not food- or water-deprived during testing. We included all subjects older than 3 years of age that were available for testing at the time.

Materials

We used the same apparatus as in “Experiment 1”.

Procedure

The procedure was identical to that of “Experiment 1” except that we only used visible trials. After subjects had entered the testing unit, the experimenter showed a food reward to the subject and placed it behind one of the swinging doors. In order to get the reward, subjects had to introduce their arm through the empty door and grab the food from behind. Subjects received one 10-trial session with the food appearing the same number of times to the left and right of the subject with the restriction that it never appeared more than two times in a row on the same side. We scored and analyzed the data in the same way as in “Experiment 1”. In addition, we directly compared the performance of zoo and sanctuary apes. To do so, we calculated the percentage of correct trials for the first ten trials of the zoo apes and compared them to the ten trials administered to sanctuary apes.

Results

Figure 5 presents the percentage of correct trials for each species housed at the sanctuaries. Contrary to our expectations, species did not differ significantly in their ability to obtain the reward (Kruskal–Wallis test: χ2 = 1.69; df = 2; P = 0.43). Similarly, there were no significant sex (Mann–Whitney test: Z = 0.91; P = 0.36) or age differences (Spearman r = −0.002; P = 0.99; N = 37). However, we found important individual differences. One bonobo, two orangutans, and no chimpanzees got the reward more often than would be expected by chance (at least 9 out of 10 times). Furthermore, three bonobos, two orangutans and four chimpanzees got the reward in the first trial. Only one bonobo and one orangutan solved the problem on the first trial and were above chance in the whole session.
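The criterion of at least 9 correct trials out of 10 can be checked against a binomial distribution. The sketch below assumes that each trial is independent with a 0.5 probability of success under chance responding and uses a one-tailed test at alpha = 0.05; these assumptions, and the function names, are ours for illustration and are not stated in the analysis above.

```python
from math import comb


def upper_tail(k, n, p=0.5):
    """Probability of k or more successes in n independent trials."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def min_successes(n, p=0.5, alpha=0.05):
    """Smallest number of successes whose upper-tail probability is below alpha."""
    for k in range(n + 1):
        if upper_tail(k, n, p) < alpha:
            return k
    return None  # no outcome reaches significance with this n


if __name__ == "__main__":
    # Under the assumed chance level of 0.5:
    print(upper_tail(8, 10))   # 56/1024, about 0.055, not below 0.05
    print(upper_tail(9, 10))   # 11/1024, about 0.011, below 0.05
    print(min_successes(10))   # 9
```

Under these assumptions, 8 correct trials out of 10 would not exceed chance, whereas 9 would, which matches the criterion used here.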

Fig. 5

Mean percentage (+SEM) of correct trials as a function of species in “Experiment 3”

Sanctuary apes performed significantly better than zoo apes (Mann–Whitney test: Z = 3.07; P = 0.002) although such a difference depended on the species. Sanctuary chimpanzees and bonobos performed significantly better than their zoo counterparts (Mann–Whitney test: chimpanzees: Z = 3.05, P = 0.002; bonobos: Z = 2.43, P = 0.015). In contrast, there were no significant differences between the two orangutan samples (Mann–Whitney test: Z = 0.64; P = 0.52). Combining the data from both samples revealed a significant difference between species (Kruskal–Wallis test: χ2 = 6.79; df = 2; P = 0.034). Orangutans performed significantly better than chimpanzees and bonobos. There were no sex differences (Mann–Whitney test: Z = 1.01; P = 0.31).

Discussion

We partially confirmed the results of “Experiment 1”. Orangutans performed at comparable levels to those observed in “Experiment 1”, thus showing good inhibitory skills. In contrast, the chimpanzees and the bonobos in the current experiment performed better than those included in “Experiment 1”. Combining the data from both samples revealed that orangutans outperformed chimpanzees and bonobos. In general, subjects did not perform above 50% correct, and only a minority of subjects mastered the task by the end of testing. This attests to a considerable level of task difficulty. In the next experiment we tested 3–5-year-old children. These ages are particularly interesting in children because they represent an important transition period for inhibitory and executive function abilities (e.g., Carlson et al. 1998; Kopp 1982).

Experiment 4: children

Methods

Subjects

Fifty-five children from eight schools in Leipzig participated in this experiment. These children were drawn from the participant database of the Department of Developmental Psychology of the Max Planck Institute for Evolutionary Anthropology. Children came from families of mixed socio-economic backgrounds. Thirteen of these children were dropped from the study because they did not respond to the task or refused to be tested after a few trials. Thus, the final group consisted of 42 children: fourteen 3-year-olds (M = 3.0; range 35–37 months), fourteen 4-year-olds (M = 4.0; range 47–49 months) and fourteen 5-year-olds (M = 5.0; range 59–61 months). There were equal numbers of boys and girls in each age group.

Materials

The apparatus was identical to the one used with apes, but it was painted differently (and had stickers glued on the top and sides) to make it more attractive to children (Fig. 6). Additionally, the door openings were padded to prevent potential injuries. A yellow curtain with printed bears hung in front of the box to prevent the child from seeing the box being baited. We used six different toys as rewards that were exchanged for small stickers that children could keep. A pilot study showed that children in this age range were very motivated to obtain those stickers.

Fig. 6

Testing setup for children. The top row of pictures depicts a child reaching directly for the reward and losing it. The bottom row depicts a child reaching indirectly and getting the reward

Procedure

Children were tested individually by two experimenters in a quiet room at their own school. During the walk to the test room, both experimenters talked to the child to make her feel comfortable with the experimenters and the new situation. Experimenter 1 asked whether the child wanted to play a game and went inside the room and took his position behind the box while Experimenter 2 waited outside with the child. Upon being called by Experimenter 1, the child and Experimenter 2 entered the room and approached the box. Experimenter 1 said: “Watch this (name of the child). I am going to hide this (name of the toy). If you are able to retrieve the (name of the toy) and bring it to Experimenter 2, Experimenter 2 will give you a sticker”. While Experimenter 1 put the toy behind one of the doors, Experimenter 2 kneeled down, presented the child with the stickers and asked which sticker she would like to have.

After the child had chosen a sticker, Experimenter 1 said: “Ok, you will get this sticker if you can bring me the toy”. Experimenter 2 immediately opened the curtain and the child was allowed to approach the box and attempt to get the toy by reaching through the doors. After the subject got the toy, Experimenter 1 closed the curtain while saying: “Perfect, you really did a good job”. Experimenter 2 rewarded the child by saying: “Perfect, you really did a good job, now you will get the sticker”. If the subject lost the toy, Experimenter 1 closed the curtain while saying: “Ooohh, it does not work”. Experimenter 2 said: “Ooohh, now you won’t get a sticker.” Then Experimenter 1 asked whether the child wanted to try again and put a new toy behind one of the doors. If the child solved the task on the previous trial, it was allowed to choose a new sticker. If children did not respond to the box or kept telling the experimenter that they could not get the toy, Experimenter 1 encouraged them by saying: “Just try, try to get the toy”. If the subject did not respond after approximately 5 min, the trial ended. Experimenter 1 asked whether the child wanted to try with a different toy. When the child refused to retrieve the toy on three consecutive trials, testing stopped. All sessions were videotaped.

Design

We used a between-subjects design with age as the factor. Each subject received one session of six trials [3 trials for each toy location (left/right)]. The order of the location at which the toy was placed was counterbalanced across trials using a randomization program.
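The trial order described above can be illustrated with a minimal sketch. It assumes that counterbalancing means each location appears equally often within the six trials, with the sequence shuffled at random; the function name `counterbalanced_order` and its parameters are hypothetical and are not taken from the randomization program actually used.

```python
import random


def counterbalanced_order(n_trials=6, sides=("left", "right"), seed=None):
    """Return a randomly ordered list in which every side appears equally often.

    Assumption: 'counterbalanced' means an equal number of trials per side
    (3 left, 3 right for six trials); the original program is not described.
    """
    if n_trials % len(sides) != 0:
        raise ValueError("n_trials must be a multiple of the number of sides")
    rng = random.Random(seed)  # a fixed seed makes a session order reproducible
    order = list(sides) * (n_trials // len(sides))
    rng.shuffle(order)
    return order


if __name__ == "__main__":
    # One hypothetical session order for one child
    print(counterbalanced_order(seed=1))
```

A sketch of this kind does not constrain runs of the same side (e.g., three consecutive left trials); whether the original program imposed such a constraint is not stated.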

Data analysis

We used the same scoring procedure and analyses as in “Experiment 3”. A second observer coded 20% of the sessions. Inter-observer reliability was excellent (Cohen’s kappa = 1.0). We also scored the time it took subjects to solve each trial. Inter-observer reliability was again excellent (r = 0.93). Based on the data available in the literature, we expected older children to perform better than younger ones. Therefore, we used one-tailed nonparametric statistics.
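For readers unfamiliar with the reliability index reported above, Cohen’s kappa can be computed from two observers’ trial-by-trial codes as in the following minimal sketch. The function name `cohens_kappa` and the example codes are hypothetical illustrations, not the study’s data or scoring software.

```python
from collections import Counter


def cohens_kappa(coder_a, coder_b):
    """Cohen's kappa for two equally long lists of categorical codes."""
    if len(coder_a) != len(coder_b) or not coder_a:
        raise ValueError("both coders must rate the same, non-empty set of trials")
    n = len(coder_a)
    # Proportion of trials on which the two observers agree
    observed = sum(a == b for a, b in zip(coder_a, coder_b)) / n
    # Agreement expected by chance, from each observer's category frequencies
    freq_a, freq_b = Counter(coder_a), Counter(coder_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both observers used one identical category throughout
    return (observed - expected) / (1 - expected)


if __name__ == "__main__":
    # Made-up codes for illustration only
    first = ["correct", "incorrect", "incorrect", "correct"]
    second = ["correct", "incorrect", "incorrect", "correct"]
    print(cohens_kappa(first, second))
```

Perfect agreement across more than one category yields a kappa of 1.0, which is the value reported for the trial scores.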

Results

Figure 7 presents the percentage of correct trials for each age group (N = 14 for each group). As predicted, older children performed significantly better than younger ones (Kruskal–Wallis test: χ2 = 4.75; df = 2; P = 0.047, one-tailed). However, this difference was most clearly shown when comparing 3-year-olds to 4- and 5-year-olds pooled together (Mann–Whitney test: Z = 1.91; P = 0.028). Five 4-year-olds (36%) and three 5-year-olds (21%) solved the task whereas only one 3-year-old (7%) did so.
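As an illustration of the omnibus test used here, the following minimal sketch computes the Kruskal–Wallis statistic (with tie correction) for any number of groups, together with the chi-square tail probability for the special case df = 2, which reduces to exp(−H/2). The function names and the example scores are hypothetical; the individual children’s scores are not reproduced in the text, so the sketch does not recompute the reported result from raw data.

```python
import math
from collections import Counter


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def kruskal_wallis_h(groups):
    """Tie-corrected Kruskal-Wallis H for a list of groups of scores."""
    pooled = [v for g in groups for v in g]
    n = len(pooled)
    ranks = average_ranks(pooled)
    total, start = 0.0, 0
    for g in groups:
        rank_sum = sum(ranks[start:start + len(g)])
        total += rank_sum ** 2 / len(g)
        start += len(g)
    h = 12 / (n * (n + 1)) * total - 3 * (n + 1)
    ties = Counter(pooled)
    correction = 1 - sum(t ** 3 - t for t in ties.values()) / (n ** 3 - n)
    if correction == 0:
        raise ValueError("all scores are identical")
    return h / correction


def chi2_tail_df2(h):
    """Upper-tail probability of a chi-square variable with exactly 2 df."""
    return math.exp(-h / 2)


if __name__ == "__main__":
    # Made-up percentage-correct scores for three age groups
    scores = [[0, 0, 17, 33], [0, 17, 50, 67], [17, 33, 50, 83]]
    h = kruskal_wallis_h(scores)
    print(h, chi2_tail_df2(h))
```

Applied to the reported statistic, `chi2_tail_df2(4.75)` is about 0.093, and half of that is about 0.047, which appears consistent with the reported one-tailed value under the assumption that it was obtained by halving the two-tailed probability.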

Fig. 7

Mean percentage (+SEM) of correct trials as a function of age in “Experiment 4”

Analysis of the latencies indicated a significant difference in the time it took subjects to make a choice [χ²(2, N = 42) = 15.69; P = 0.001]. Three-year-old children needed significantly more time in comparison to 4- (68 vs. 8 s, Mann–Whitney: Z = 3.81; P = 0.001) and 5-year-old children (68 vs. 25 s, Mann–Whitney: Z = 2.99; P = 0.003). Additionally, 75% of the 3-year-old children missed 1–3 trials because they were unable to solve the task after approximately 5 min or indicated that it was not possible to get the toy and refused to be tested. In contrast, only one 4-year-old child and no 5-year-old child had a trial missing.

Discussion

As expected, older children outperformed younger ones in this task. These data fit well with other results on inhibitory control indicating that the period between 3 and 5 years of age is an important transition period for the acquisition of inhibitory control. Numerous studies have reported 3-year-olds having motor- and cognitive inhibition problems in a variety of tasks (Carlson and Moses 2001; Diamond and Taylor 1996; Frye et al. 1995; Gerstadt et al. 1994; Mischel and Mischel 1983). However, even the performance of the older children was quite low (between 20 and 30%), which again confirms previous studies reporting far-from-perfect performance in these age groups (Carver et al. 2001a, b; Carlson et al. 2002; Mischel and Mischel 1983; Mischel et al. 1989).

We also noted that 3-year-old children needed more time to complete a test trial. The majority of 3-year-old children passed the 5-min limit on 1–3 test trials or indicated that they were unable to solve the task and refused to be tested further. At first, all 3-year-olds tried to reach for the toy directly. When this strategy failed, they often stopped trying and kept repeating that it was impossible to get the toy. Others just kept looking through the door behind which the toy was placed or tried to somehow lift it. These findings might indicate that, in comparison to 4- and 5-year-olds, 3-year-olds were less able to cope with the frustration of ‘losing a sticker’. Unable to activate any efficient solution, they might have stopped trying. In contrast, 4- and 5-year-olds kept trying and completed test trials more quickly.

It is conceivable that the test setting, including being watched by two experimenters while being unable to solve a task, might have put extra pressure on participants, decreasing the likelihood of producing the detour-solution. This explanation, however, cannot easily account for the observed age differences. Future studies could investigate whether children show similar performance when left alone in the room. Being alone might alleviate some of the social pressure that they may have felt and, additionally, not being able to relate their failures to the experimenter might help them to better monitor and evaluate their own actions.

General discussion

We found some evidence that great apes could solve the detour-reaching task, thus inhibiting their initial tendency to reach directly for the food. Orangutans were generally more skilful than the other species, although unsuccessful subjects showed notable improvement after minimal training. However, the type of training substantially affected the subsequent task performance. Sanctuary chimpanzees and bonobos outperformed their zoo counterparts, but this was not the case for orangutans. Four- and 5-year-old children outperformed 3-year-olds even though the performance of the older children was far from perfect and comparable to some of the ape groups tested.

Orangutans were the species that performed best in this task. They appeared better able to take into account the larger picture, generate the detour-solution and inhibit their prepotent responses than the other species. One possible explanation is that they are more deliberate than the other apes owing to their means of locomotion in the canopy. Additionally, the differences may have been accentuated by their unique social organization based on a dispersed social system that reduces direct individual food competition. As a consequence they might be less impulsive, less focused and better able to overview certain situations than the other species. It is worth noting that the orbital frontal cortex of orangutans, an area commonly associated with inhibitory skills, deviates from that of the other species (Semendeferi et al. 1997; Semendeferi 1999). However, other tests for inhibitory control have found no evidence of species differences between orangutans, chimpanzees and bonobos (Amici et al. 2008; Vlamings et al. 2006). Interestingly, all three species display high levels of fission–fusion, a social organization in which party membership changes constantly. In contrast, gorillas, which have a more stable social organization, showed lower levels of inhibitory control than those species with higher levels of fission–fusion (Amici et al. 2008). Note that gorillas were also the species that displayed more difficulties during the training phase of “Experiment 2”, but our small sample does not allow us to draw firm conclusions on this point.

Despite the great difficulty that some subjects experienced with the task, they were able to improve their performance after some minimal remedial training. The results of this training suggested that most unsuccessful apes experienced cognitive rather than motor inhibition problems. Due to the salience of the food item, subjects might have been so focused on the door behind which the food was placed that they were unable to detach their attention from it and take into account other aspects of the task (the other door) that would have allowed them to solve the task. It is important to emphasize that the type of training had a profound impact on the subsequent task performance. Those subjects trained to reach through one particular door continued to do so during the test, which means that they failed all those trials in which the food appeared on the nontrained side. Thus, this kind of fixed training fostered motor inflexibility whereas training alternation between doors fostered flexibility; but this worked only for those individuals that could inhibit reaching directly toward the reward during training. Recall that knocking against the door during training was inversely correlated with success during the subsequent test phase, but only for those subjects that were trained with the alternation regime.

Research on frontal patients has indicated that insufficient inhibitory control can lead to two different outcomes: (a) the inability to focus on just one thing (as measured in, for example, the Stroop Test) and (b) the opposite, the inability to expand one’s attention beyond one thing (Diamond 1990). Frontal patients can be so focused on a salient stimulus that they fail to take into account the larger picture, for example alternative routes to solve a task (Luria 1973; Diamond 1990). The children and apes tested in the present study seemed to have problems with taking into account the larger picture too. They seemed to be so focused on the door behind which the toy was presented that they failed to take into account that the solution involved the other door. Unlike apes, however, children made no mistakes after they found the correct solution. In pure cognitive inhibition problems, like the nine-dot problem (Maier 1930), subjects usually do not make mistakes once they know the solution. It is therefore suggested that, in contrast to the apes, children are facing pure cognitive inhibition problems in the present detour-reaching study.

To further investigate the nature of the inhibitory problems of the children who were unable to solve the task, it would be interesting to present children with the training blocked condition. If performance reached perfection after blocking, failures in the original task are likely due to cognitive inhibition problems: the inability to activate a detour-solution due to the salience of the toy. If children increased performance but kept making mistakes, this is likely due to motor inhibition problems: being unable to inhibit the prepotent direct-reaching response. In addition it would be interesting to test older children. As reported earlier, inhibitory skills continue to develop in childhood (Kopp 1982; Mischel and Mischel 1983; Mischel et al. 1989; Carver et al. 2001a, b; Carlson et al. 2002). Children become cognitively more flexible and are better able to disengage from salient objects. Future studies on inhibitory skills in typically developing children are important, as inhibitory skills are good predictors of cognitive as well as social competence, scholastic performance and the ability to cope with stress and frustration (Mischel et al. 1989). Given the current interest in theory of mind development, investigation of inhibitory skills is relevant too, as theory of mind and inhibitory skills seem to be closely related (Perner and Lang 2002; Carlson et al. 2002; Zelazo et al. 2002).

Chimpanzees and bonobos housed in sanctuaries clearly outperformed their counterparts housed in zoos, although this was not true for orangutans. These differences are intriguing and potentially important (Boesch 2007). However, caution is needed when trying to extrapolate these results to the whole population of zoo or sanctuary apes because our samples are small and come from particular origins. Caution should also be applied with the human sample because it may only reflect the performance of middle-class western children, and it may not be representative of other human populations that differ with regard to the experiences that children are exposed to (e.g., schooling). Moreover, the ape populations themselves are not so homogeneous since there are large inter-individual differences within populations regarding the rearing histories of the individuals and also potentially important genetic differences. For instance, all zoo chimpanzees were descendants of individuals that came from Liberia whereas all sanctuary chimpanzees came from Eastern wild populations. Future studies could include multiple groups that vary systematically in certain factors (e.g., rearing practices or cultural background) to assess their impact on inhibitory control.

There is another issue that recommends caution when making broad generalizations. Although sanctuary chimpanzees and bonobos outperformed their zoo counterparts on a detour-reaching task, there are other tasks in which the pattern is reversed and these same zoo apes outperformed sanctuary apes. For instance, zoo apes outperformed sanctuary apes in quantity estimation and gaze following (compare Hanus and Call 2007; Bräuer et al. 2005 with Herrmann et al. 2007). Nevertheless, the potential differences between populations are an important issue that deserves careful scrutiny as it may provide important clues regarding the epigenesis of cognitive skills in the great apes. Documenting the differences is important, but no less important is documenting the similarities. In fact, there are some tasks in which no differences were apparent between zoo and sanctuary apes (e.g., quantity estimation, object tracking tasks, see Herrmann et al. 2007; Barth and Call 2006). This means that what is required to assess the status of various populations is a battery of multiple tasks and a balanced consideration of the similarities and differences between populations. At the very least, this manuscript should contribute to dispelling the myth that sanctuary apes are a nonviable research population due to their origin, as we have seen that they are capable of solving tasks that challenge the inventiveness of 4- and 5-year-old children.

In summary, we found that some great apes were capable of solving a detour-reaching task after only a few trials and those that failed could be trained to solve the task after minimal training. However, the training regime had a substantial effect on how flexibly subjects could deploy their training in the original situation. In general, orangutans outperformed other ape species, but there were also important differences between groups depending on the apes’ origin. Namely, sanctuary chimpanzees and bonobos outperformed their zoo counterparts. Four- and 5-year-old children performed better than 3-year-old children. Finally, the performance of the older children was roughly equivalent to that of sanctuary apes.