Analogical reasoning is a cornerstone of human thought (Sternberg, 1977). It is fundamental for a diversity of cognitive processes of considerable importance, such as classification (Ramscar & Pain, 1996), transfer of learning in new contexts (Gentner, Loewenstein, & Thompson, 2003), problem solving (Gick & Holyoak, 1980), and creative thinking (Holyoak & Thagard, 1996). Analogical thinking requires an ability to associate a relation instantiated by a first set of stimuli (source domain) to another relation shown by a second set of stimuli (target domain) (Gentner, 2002). Consider, for instance, the following analogy: Fish is to water as bird is to air. Understanding this analogy requires that the listener identify the relational similarity between the elements of the source domain (fish and water) and determine that the same relation exists between the elements of the target domain (bird and air). The ability to solve analogy problems is promoted during childhood by the acquisition of relational language (Gentner, Simms, & Flusberg, 2009; Loewenstein & Gentner, 2005). However, it remains unclear whether language learning acts as a booster that shifts attention toward the relational structure of a problem, or is simply a mandatory prerequisite for the appropriate encoding of analogy problems. Studies on analogy making in animals are perfect means to assess this issue of great theoretical importance.

Analogical reasoning has mostly been studied in animals with the relational matching-to-sample (RMTS) task (but see Haun & Call, 2009; Hribar, Haun, & Call, 2011; Kennedy & Fragaszy, 2008). In this task, the subject first perceives a set of stimuli (source domain) that are all the same (same relation) or all different (different relation), and then perceives two new sets of stimuli. One of the latter two sets (the target set) shows the same (same or different) relation as the source set, and the other shows the alternative relation. The animal is rewarded for choosing the comparison pair showing the same relation as the sample pair. A historical perspective on the use of the RMTS task pinpoints the pioneering contribution of Premack (1983), who reported that a language-trained chimpanzee (named Sarah) could solve the RMTS task. According to Premack, that success indicates that linguistic encoding is required for successful appreciation of second-order relations (i.e., relations between relations). Indeed, numerous studies following this first demonstration have failed to reveal RMTS abilities in language-naïve apes and monkeys (e.g., Flemming, Beran, & Washburn, 2007; Oden, Thompson, & Premack, 1988, 1990; Premack, 1983), with the notable exception of Thompson, Oden, and Boysen (1997), who reported that training to associate tokens to relations permits relational matching in language-naïve chimpanzees. However, this theoretical perspective on the contribution of language to successful RMTS has been challenged in light of a new set of experimental findings. Thus, a first set of studies showed that baboons (Fagot, Wasserman, & Young, 2001), and then pigeons (Cook & Wasserman, 2007), can solve this task when the relations are illustrated by arrays of multiple stimuli rather than by pairs of items. 
In addition, recent studies have revealed that nonhuman primates can pass the RMTS task with pairs of items rather than arrays after extensive training on the task (in a capuchin, Truppa, Mortari, Garofoli, Privitera, & Visalberghi, 2011; in baboons, Fagot & Parron, 2010; Fagot & Thompson, 2011) or with specific reinforcement contingencies (Flemming, Thompson, Beran, & Washburn, 2011). Vonk (2003) further reported relational matching in a gorilla and orangutans with limited training. These findings demonstrate some abilities to process second-order relations, and this is arguably a phylogenetic basis of analogical reasoning.

The demonstration that animals can match relations with relations is a very important first step in the assessment of analogy making in the species considered. However, analogical reasoning implies cognitive processes that are not restricted to the ability to match relations with relations. One important component of analogical reasoning in humans is that the encoding of the source domain often depends on the properties of the target domain (Holyoak & Thagard, 1989). Imagine an analogy task in which the source domain contains the “bee–hive” pair of items. Various relations exist between a bee and a hive, such as that the bee “lives in” or “works in” the hive. In that case, the relation to be processed must be specified by the relational properties of the target items. Thus the relation “works in” is highlighted when the target domain contains the “secretary–office” pair, but the relation “lives in” becomes salient when the target domain comprises the “human–house” pair.

Previous RMTS tasks used with animals have only manipulated a single stimulus dimension at a time, such as the shape (e.g., Fagot & Thompson, 2011; Truppa et al., 2011) or color (e.g., Fagot & Parron, 2010) of the stimuli. With these procedures, inspection of the source stimuli provided all the cues necessary to determine the (same or different) relation to be processed, with no need to reencode the source domain in terms of the properties of the target domain. This aspect of the task is at odds with analogy making in humans, in which the source domain is reencoded after the items of the target domain have been processed (Gitomer, Curtis, Glaser, & Lensky, 1987; Yan, Forbus, & Gentner, 2003). The strategy implying a reencoding of the source domain is of a higher level of complexity. It requires that the subject select the appropriate relation among a set of possible ones, and further imposes working memory constraints that are absent in the RMTS tasks previously employed with animals.

A second important feature of analogical reasoning in humans is the capacity to match relations instantiated by very different means. Consider, for instance, the “identity” relation: Two objects might be considered identical because they have the same color (e.g., tomato, fire truck), the same shape (moon, ball), or the same function (train, plane). Humans would have no real difficulty associating these pairs during an analogy task, because the items within each pair have something in common, independent of the dimension (shape, color, or function) shared by these items. Unfortunately, that ability to match relations across dimensions cannot be tackled by the version of RMTS procedure used so far with animals, again because these tasks only imply the processing of a single dimension at a time (e.g., the shape dimension, Fagot & Thompson, 2011; Truppa et al., 2011; or the color dimension, Fagot & Parron, 2010). Empirical data are needed in this domain to evaluate the power and flexibility of analogical thinking in animals.

We present below two experiments on baboons suggesting for the first time that an animal species can flexibly reencode the relations instantiated by the source domain as a function of the relational properties of the target domain (Exp. 1) and can, moreover, solve analogy problems involving the matching of relations instantiated by different stimulus dimensions (Exp. 2).

Experiment 1: Reencoding of the source relation depending on the target relation

In Experiment 1, we tested the ability of baboons to flexibly reencode the source domain as a function of the relational properties of the target domain. Baboons were trained and tested with a new version of the RMTS task in which the color and the shape of the source items were manipulated independently. In this task, the relevant (shape or color) dimension to process was only indicated by the properties of the target (comparison) pairs, which appeared after the sample had disappeared, therefore imposing a reencoding of the relational properties of the source pair—stored in working memory—after the target pairs have been processed.

Method

Subjects and apparatus

The subjects were four male Guinea baboons (Papio papio; age range 4–7.6 years) previously trained in the RMTS task using pairs of monochromatic shapes as stimuli (Fagot & Thompson, 2011). They lived in a social group of 30 individuals within a 700-m² enclosure and had free access to ten operant-conditioning test chambers, each equipped with a 19-in. touch screen, a food dispenser, and a radio frequency identification (RFID) reader that identified each baboon via a microchip implanted in each arm (Fagot & Bonté, 2010; Fagot & Paleressompoulle, 2009). A test program written with E-Prime Version 2 Professional used each subject’s identity to determine its “last stopping point” in the sequence of trial presentations, and assigned the independent variables to be experienced by that subject during the trial.

Procedure

For this study, we used a sequential zero-delay two-dimensional RMTS procedure (see Fig. 1 and the illustrative Video S1 provided as supplemental information). After identification of the subject via the microchip, one of four possible types of source pairs was presented in the center of the touch screen. As is shown in Fig. 1, the source pair could comprise (1) two items of the same color and same shape, (2) two identical shapes of different colors, (3) two different shapes of a unique color, or (4) two items differing in both color and shape. Note that these four conditions allowed us to distinguish consistent trials, in which the information provided by the shapes of the sample items and their colors illustrated the same relation (e.g., the relation of sameness in Option 1 and of differentness in Option 4), from inconsistent trials, in which the shape and the color of the sample items illustrated different relations (i.e., the relation of sameness for shapes and differentness for colors in Option 2, or the inverse relations in Option 3). These shapes measured a maximum of 100 × 100 pixels (6.8° of visual angle), and were separated by a minimum of four pixels. Touching the source pair made it disappear and triggered the immediate display of two comparison stimulus pairs centered on the left and right sides of the screen. The structure of the comparison pairs identified the relevant dimension on which the correct matching response was to be based. When color was the relevant dimension (color trials), the comparison pairs comprised two vertical 150 × 50 pixel bars (7.4° of visual angle), separated from each other by four pixels, that were of the same (same relation) or different (different relation) colors. In this case, the shape of the comparison could not inform the subject as to the correct response, as it was constant across the two stimulus pairs. 
Therefore, the subjects had to focus on the color of the stimuli and select the comparison pair showing the same (same or different) relation that had been illustrated by the color of the source items. When the shape was the relevant dimension (shape trials), the comparison pairs only contained white shapes. One comparison pair was drawn from two identical shapes (same relation) and the other from two different shapes (different relation). The color of the comparison shapes was consistently white, and therefore noninformative as to the correct response. Therefore, the subjects had to select the comparison pair matching the source pair with respect to the shape cues. Particularly interesting in this design are the trials in which the subjects had to provide a same response when the alternative dimension illustrated a different relation (i.e., inconsistent trials). These trials could not be solved by processing the overall variability of the stimulus pairs, with all dimensions combined, but required independent processing of the two dimensions.

Fig. 1

Illustration of the experimental displays used in Experiment 1. Each panel illustrates a trial for which the color (left-hand panels) or the shape (right-hand panels) was the relevant dimension to process. The source pair is shown in the middle of each panel, and the comparison pairs are shown on the left and right sides (S+ = target pair, S– = foil pair) of each panel. The upper four panels represent the consistent trials, in which the color and shape of the sample items illustrated the same (same or different) relation (e.g., the same shape and same color, in the upper left example). The bottom four panels represent the inconsistent trials, in which the color and shape of the sample item pairs illustrated different (same or different) relations (e.g., different colors but the same shape, in the bottom left example). Note that the sample and comparison pairs were presented in two successive screens in these trials, with no delay between these displays.

The source and target pairs were drawn from two distinct sets of six colors and six shapes, precluding physical matching by either color or shape. The left/right locations of the target pair were counterbalanced within each block. Correct responses were reinforced by a drop of dry wheat, and incorrect responses gave rise to a green screen and a 3-s time-out. Solving this task required the animal to remember the relationships instantiated by the source pair and to reencode that pair a posteriori, once the relevant dimension has been indicated by the structure of the comparison pairs.
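The trial logic described above can be sketched in a few lines of code. The Python fragment below is only an illustration under stated assumptions, not the E-Prime program used in the study: the names `Item`, `relation`, `classify_source`, and `correct_relation`, as well as the example shape and color values, are hypothetical. It shows how a source pair carries one relation per dimension, and how the correct response is defined only once the comparison pairs have revealed the relevant dimension.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Item:
    """One stimulus item, described by its two manipulated dimensions."""
    shape: str
    color: str


def relation(a: Item, b: Item, dimension: str) -> str:
    """Return 'same' or 'different' for one dimension of a pair."""
    return "same" if getattr(a, dimension) == getattr(b, dimension) else "different"


def classify_source(pair: tuple) -> dict:
    """Encode both relations of a source pair and whether they agree."""
    a, b = pair
    shape_rel = relation(a, b, "shape")
    color_rel = relation(a, b, "color")
    return {
        "shape": shape_rel,
        "color": color_rel,
        # Consistent trials: both dimensions illustrate the same relation.
        "consistent": shape_rel == color_rel,
    }


def correct_relation(pair: tuple, relevant_dimension: str) -> str:
    """Relation the chosen comparison pair must show, given the dimension
    that the comparison pairs reveal as relevant ('color' or 'shape')."""
    return classify_source(pair)[relevant_dimension]


# Hypothetical source pair of type (2): identical shapes, different colors.
source = (Item("star", "red"), Item("star", "blue"))
print(classify_source(source))
print(correct_relation(source, "color"))
```

In this sketch, an inconsistent source pair yields opposite correct responses depending on the relevant dimension, which is why the pair has to be held in memory and reencoded after the comparison pairs appear.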

Training and testing

In the two-dimensional RMTS task, flexible reencoding of the source items can only be demonstrated if the relevant (color or shape) dimension to be processed varies randomly from one trial to the next. To meet this requirement, the baboons underwent a long and effortful training regimen involving alternated blocks of color and shape trials of progressively reduced block sizes (from 120-trial blocks, to 80-, 40-, 16-, 8-, 4-, and 2-trial blocks) to teach the baboons to flexibly shift their attention from one stimulus dimension (e.g., the color) to the other (e.g., the shape). Blocks of color and shape trials were presented in an ABBA BAAB order (the “A” blocks involving color trials only and the “B” blocks shape trials only). The baboons had to reach 80 % correct or more in each block to move to the next “A” or “B” block, until the block sizes were reduced to 16 trials. At that point, the baboons were submitted to randomized color or shape trial blocks of 8, 4, and 2 trials, and were required to achieve 100 sequences of 80 % correct in two consecutive blocks before the block size was reduced. Table 1 summarizes the numbers of color and shape blocks performed by each subject throughout training. This extensive training required an average of 58,541 trials per baboon (range: 38,400–71,690 trials) and took from 5 to 6 weeks, depending on the subject.

Table 1 Numbers of training blocks performed by each subject as a function of block size (from 120- to 2-trial blocks) for each type of trial (color and shape), as well as total numbers of trials and the corresponding standard deviations (SDs) per subject and trial type

Two test phases followed the initial training. Test Phase 1 aimed to determine whether the baboons reencoded the source items as a function of the properties of the target items. It consisted of six consecutive 120-trial blocks, each comprising 60 color trials randomly intermixed with 60 shape trials (see Fig. 1). The second test phase determined whether baboons preferentially processed relational rather than perceptual similarity in the two-dimensional RMTS task. Test Phase 2 consisted of ten 128-trial blocks that comprised 112 baseline trials (56 color and 56 shape trials) identical to those of Test Phase 1, intermixed with 16 probe cross-mapping trials (see Fig. 2). The probe trials differed from baseline trials in that the incorrect comparison pair shared with the sample pair an attribute of the relevant dimension. Illustrative examples of cross-mapped trials are provided in Fig. 2 for same (upper panels) and different (bottom panels) trials involving a relational judgment on the color (left-hand panels) and the shape (right-hand panels) dimensions.
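The composition of the test blocks can be made concrete with a short sketch. The Python code below is an illustration only: the function `build_block`, the dictionary names, and the trial labels are hypothetical, and simple shuffling is an assumption about how the trials were intermixed. The trial counts are those given in the text.

```python
import random

# Trial counts per block, taken from the description of the two test phases.
TEST_PHASE_1 = {"color": 60, "shape": 60}
TEST_PHASE_2 = {"color": 56, "shape": 56, "probe": 16}


def build_block(counts, rng=None):
    """Return a randomly ordered list of trial labels for one block."""
    if rng is None:
        rng = random.Random()
    trials = [label for label, n in counts.items() for _ in range(n)]
    rng.shuffle(trials)  # intermix trial types within the block
    return trials


block = build_block(TEST_PHASE_2, random.Random(0))
print(len(block), block.count("probe"))
```

Passing a seeded `random.Random` instance makes a given block order reproducible, which is convenient when checking the composition of a block.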

Fig. 2

Illustration of the cross-mapped trials of Test Phase 2 of Experiment 1. In these trials, S– systematically shared one stimulus dimension with the sample pair. For example, in the color cross-mapped trials of the upper left panel, the light orange color used to draw the sample items was also used to draw the negative, S– comparison pair. Similarly, in the cross-mapped shape trials (right-hand panels), one of the shapes within the sample pair served to draw S–. The same procedure was followed for the same trials (upper panels) and the different trials (bottom panels). Note that Fig. 2 only illustrates consistent trials, but that the same procedure was used for inconsistent trials. It was reasoned that a correct choice for S+ in these cross-mapped trials, either consistent or inconsistent, would imply that the baboons gave priority to the relational cues.

Results and discussion

During Test Phase 1, the four baboons performed 77 % correct on average (range: 74.9 %–80.8 %; see also Video S2, available in the online supplemental materials). For the statistical analyses, we first distinguished each dimension (color or shape) by relation (same or different) condition. Individual performance in each condition was analyzed by way of two-tailed binomial tests. Bonferroni corrections were applied to all of these tests to counteract the problem of multiple comparisons. A first analysis showed that all subjects performed well above chance in each of these four conditions (all ps < .0001; see Fig. 3). A second analysis assessed the effect of trial order and distinguished the trials for which the relevant (color or shape) dimension was identical to that of the previous (i.e., n – 1) trial, from those trials that required an attentional shift because of a change in the relevant dimension. The average performance for the “shift” trials was 73.7 % correct (range: 70.9 %–78.6 %), and for the “no-shift” trials it was 80.3 % (range: 78.6 %–83.4 %); the four baboons showed reliable performance in both types of trials (two-tailed binomial tests, all ps < .0001). This finding suggests that the baboons based their responses on the relational cues provided in the current trial, with very limited proactive interference from the previous trial.
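The statistical procedure named above can be sketched as follows. This Python code is a minimal illustration, not the authors' analysis script: the function names `two_tailed_binomial_p` and `bonferroni` are hypothetical, the chance level is taken to be .5 because each trial offers two comparison pairs, and the Bonferroni correction is assumed to take its standard form (multiplying each p value by the number of tests). The counts in the usage lines are invented for illustration and are not data from the study.

```python
from math import comb


def two_tailed_binomial_p(correct, n):
    """Exact two-tailed binomial test against a chance level of .5.

    With p = .5, every outcome i has probability comb(n, i) / 2**n, so the
    two-tailed p value sums all outcomes that are no more likely than the
    observed one. Integer arithmetic keeps the computation exact.
    """
    observed = comb(n, correct)
    tail = sum(comb(n, i) for i in range(n + 1) if comb(n, i) <= observed)
    return tail / 2 ** n


def bonferroni(p_value, n_tests):
    """Bonferroni-adjusted p value, capped at 1."""
    return min(1.0, p_value * n_tests)


# Hypothetical example: 45 correct responses out of 60 trials,
# corrected for 4 tests (one per dimension-by-relation condition).
p_raw = two_tailed_binomial_p(45, 60)
print(bonferroni(p_raw, 4))
```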

Fig. 3

Percentages of correct responses in Test Phase 1 of Experiment 1, broken down per condition of the stimulus dimension (color vs. shape) by relation (same, different)

The final analysis contrasted the trials in which the relations shown by the color or shape cues of the source pair were either consistent (the color and shape dimensions showed the same relation) or inconsistent (they showed different relations). Relational inconsistency induced a performance decline, and more so for the color (55.4 % vs. 94.3 % correct for inconsistent vs. consistent trials, respectively) than for the shape trials (62.2 % vs. 96.3 % correct for inconsistent vs. consistent). That decline shows that the processing of the relational color and shape cues interfered within a trial. Computation of Bonferroni-corrected two-tailed binomial tests provided mixed results for three of the baboons (see Table 2), but revealed that one baboon (DAN) was reliably above chance in both the color (68.5 % correct) and shape (70.2 %) inconsistent trials (all ps < .0001), suggesting that this animal efficiently processed the relational cues provided by each dimension. Of most importance, the same analyses showed that DAN was reliably above chance in the same color trials (76.2 % correct) and same shape trials (75 % correct), in which the relation expressed by the dimension to be neglected conflicted with that of the relevant dimension (all ps < .0001). The high performance of DAN in these two types of trials demonstrates that this animal did not encode an overall estimate of the perceptual variability of the stimuli to solve the task, but considered instead the relational cues provided by each dimension. Altogether, this remarkable performance demonstrates a capacity to reencode the source domain of the task as a function of the relational properties of the target domain.

Table 2 Mean percentages correct obtained for each baboon during Test Phase 1 of Experiment 1, as a function of trial type (color or shape) and relational consistency across dimensions (consistent or inconsistent relations)

Figure 4 shows the average performance obtained in Test Phase 2. All of the baboons performed above chance in both baseline (M = 78.5 %) and probe (M = 76.9 %) cross-mapped trials, although the sample and the foil comparison had an item in common in the latter trials (all ps < .0001). This finding strongly suggests that the reencoding of the source items in Test Phase 1 involved the processing of relational rather than more purely perceptual cues, because the baboons gave priority to the processing of relational cues when the relational and perceptual cues conflicted.

Fig. 4

Percentages of correct responses in the baseline and probe trials of Test Phase 2 of Experiment 1. * p < .0001

Experiment 2: Matching relations across dimensions

Humans are capable of matching relations across dimensions and can, for example, indicate that two red objects illustrate the relation of sameness (considering color cues), just as do two identical triangles (considering the shape cues). To our knowledge, this ability to match relations across dimensions has never been investigated in animals. In Experiment 2, the baboons received cross-dimensional relational trials and were requested to match stimulus pairs on the basis of relational cues expressed by different dimensions in the sample (e.g., color cue) and comparison (e.g., shape cue) pairs. It was reasoned that correct relational matching responses with this design would provide evidence that the baboons could match relations across dimensions.

Method

Cross-dimensional relational matching was tested in Experiment 2 with the same baboons and apparatus described above. After a period of retraining using the same procedure as during Test Phase 1 of Experiment 1 (range 228–4,928 training trials), each baboon completed ten blocks of 128 randomized trials consisting of 112 baseline trials (56 color and 56 shape trials) identical to the baseline trials of Experiment 1 (see Fig. 1) and 16 probe trials (8 color sample and 8 shape sample trials). In contrast to the baseline trials, the source pair in the probe trials used unidimensional stimuli. Thus, the stimuli composing the source pair (see Fig. 5) were either two vertical bars with the same or a different color (color sample trials) or two of the same or different white shapes (shape sample trials). Touching the source pair triggered the presentation of the two comparison pairs, which were drawn from the dimension not illustrated in the source pair (see Fig. 5). Thus, the source relation was shown with color cues and the target relation with shape cues in the color sample trials, and vice versa for the shape sample trials. The baboons were rewarded if they touched the comparison pair that instantiated the same relational structure as the source pair.
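The structure of a cross-dimensional probe trial can be summarized in a short sketch. The Python code below is illustrative only; the names `other_dimension` and `make_probe` and the dictionary layout are hypothetical and do not describe the actual test program. It shows that the source relation is expressed on one dimension while both comparison pairs are expressed on the other, so that only the relation itself can be matched.

```python
def other_dimension(dimension):
    """Return the dimension not used by the source pair."""
    return "shape" if dimension == "color" else "color"


def make_probe(sample_dimension, source_relation):
    """Describe one probe trial as (dimension, relation) tuples.

    The target pair shows the source relation on the other dimension;
    the foil pair shows the alternative relation on that same dimension.
    """
    comparison_dimension = other_dimension(sample_dimension)
    foil_relation = "different" if source_relation == "same" else "same"
    return {
        "source": (sample_dimension, source_relation),
        "target": (comparison_dimension, source_relation),
        "foil": (comparison_dimension, foil_relation),
    }


# Color sample trial: two bars of different colors as the source pair.
print(make_probe("color", "different"))
```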

Fig. 5

Illustration of the trial procedure of Experiment 2 for the color- and shape-sample trials

Results and discussion

A preliminary analysis (one-way analysis of variance) verified whether performance in the probe trials varied on average across the blocks. Because the effect of block was not significant [F(9, 27) = .85, p > .05], data were pooled across the ten test blocks for the statistical analyses. Overall, performance was 76.6 % correct on average in the baseline trials (range: 74.7 %–81.6 %) and 63.4 % correct (range: 54.4 %–76.2 %) in the probe trials. Following the same statistical procedure used in Experiment 1, the average scores achieved by each individual (Fig. 6) were compared to chance level with Bonferroni-corrected two-tailed binomial tests. All subjects demonstrated reliable performance (all ps < .0001) in the baseline trials, and three of them (i.e., CAUET, DAN, and VIVIEN) continued to perform well above chance in the probe trials (all ps < .01), in which they were required to match relations across the color and shape dimensions.

Fig. 6

Percentages of correct responses in the baseline and probe trials of Experiment 2. * p < .01

In Experiment 2, we assumed that the vertical rectangles would signal the need to process color relations, while two white shapes would signal the need to process the shape cues, because the subjects had been trained to do so in the design of Experiment 1. However, inspection of Fig. 5 suggests that an alternative strategy was possible in the case of the same trials. Consider, for instance, the upper left panel of Fig. 5. In these trials, the baboons might consider the shape of the sample items, and match the sample and comparison pairs according to shape cues, regardless of color variations. Alternatively, they might match these two pairs on color cues in the trial of the upper right panel of Fig. 5. However, this strategy would not be efficient in the case of the different trials (see the bottom two panels of Fig. 5). To further the exploration of cross-dimensional matching in baboons, we thus investigated whether the three baboons that were above chance on average in the probe trials (with same and different trials combined) also showed above-chance performance when the analysis was restricted to the different trials. Above-chance performance in these trials was confirmed for CAUET (66.2 % correct, Bonferroni-corrected binomial test, p < .0025), but not for DAN (54 % correct) or VIVIEN (25 % correct). The results therefore indicate that at least one of the baboons could match relations across dimensions.

General discussion

One important property of analogical reasoning in humans is the fact that the dimension to process in the source domain is often determined by the relational properties of the target domain. Analogical reasoning therefore implies flexibility in the encoding of the source domain. The first major contribution of our study has been to demonstrate that a nonhuman primate species can express this flexibility. This finding is clearly demonstrated in Experiment 1, in which DAN could use the information provided by the comparison pair to appropriately consider the relation expressed by the relevant dimension of the sample pairs. It moreover did so in the consistent as well as in the more difficult inconsistent trials, in addition to the random blocks of trials, ruling out the possible effect of proactive interference from the previous trials.

Early comparative studies have already shown that apes (Thompson et al., 1997; Vonk, 2003) and baboons (Fagot & Thompson, 2011; Truppa et al., 2011) can match relations with relations in the unidimensional RMTS task involving pairs of items, but these findings were subject to criticisms by Penn, Holyoak, and Povinelli (2008), who argued that the nonhuman primates might use estimates of the perceptual variability in the source and target domains, rather than relations per se. This hypothesis is clearly ruled out by the fact that DAN continued to provide same responses when the stimulus dimension to process within the source pair indicated a same relation but the dimension to reject in that pair indicated a different relation. The second major contribution of our study has been to reveal that, in contrast to Penn et al.’s position, baboons may use a more complex cognitive strategy to solve the RMTS task, and continued to match relations even when perceptual variability cues conflicted with the relational judgments.

The last major contribution of this study has been to provide the first evidence that a baboon can match relations across dimensions. We observed that three baboons were successful on average in Experiment 2, and that one of them continued to be successful when the analysis was restricted to the different trials, on which the relational strategy was the only viable one. Notably, however, the baboon (CAUET) that showed the strongest evidence of cross-dimensional matching in Experiment 2 was not the same baboon (i.e., DAN) that demonstrated the best performance in Experiment 1. This finding is not surprising, considering that the cognitive processes required by these two tasks are somewhat different. Experiment 1 required the baboons to match the sample and comparison pairs on the same dimension, while neglecting the interference of the alternative dimension. By contrast, Experiment 2 required the subject to make use of the relations expressed by both dimensions and to compare and combine the information derived from these two dimensions for correct matching. Regardless of the identity of the most successful baboons in Experiments 1 and 2, our study demonstrates that these two tasks are solvable by baboons, suggesting that these animals have at least the basic abilities to reason about second-order relations and to solve analogy problems.

An abundant literature has indicated that analogical performance is improved in human infants by the acquisition of verbal labels (Rattermann & Gentner, 1998). The immediate theoretical implication of our study is that language learning is not a mandatory prerequisite for the appropriate encoding of analogical problems. We do not want to assert that the baboon is a nonhuman primate species particularly gifted for analogical reasoning. Admittedly, their success in the task could only be obtained after thousands of training trials, and this training was likely the key factor leading to successful RMTS. The main function of word learning might be to boost analogical reasoning by rapidly orienting the attention of the children toward the relational structure of a problem (Loewenstein & Gentner, 2005). That function was likely accomplished in our study by the extensive training during which the baboons learned to flexibly shift their attention from dimension to dimension and relation to relation.

In spite of the remarkable achievement reported here, an important difference remains between analogical reasoning in humans and animals: The ability to solve analogical reasoning RMTS problems has been demonstrated in animals only in tasks involving dimensions and relations on which they have already been trained (e.g., Fagot & Thompson, 2011; Truppa et al., 2011). This aspect of the comparative literature contrasts sharply with analogical reasoning in humans, which readily extends beyond particular test dimensions or domains of knowledge (e.g., Gentner, 2003). The next challenge for researchers will be to identify the source of that difference between humans and animals. We suspect that its source will be found in the interplay between word (or symbol) learning and attentional processes.