Introduction

Two core systems of number or one? This question has pervaded the developmental and comparative literatures on numerical cognition for more than a decade. At issue is whether quantification skills that are the basis for mathematical competencies are predicated on one system or two systems for representing and judging quantity or number. There is no debate about the existence of a first core system of number, often called the approximate number system, or ANS (Brannon & Roitman, 2003; Cantlon, Platt, & Brannon, 2009; Gallistel & Gelman, 2000). In this system, quantities are represented inexactly, with increasing variance in discriminating or estimating set sizes as the true set sizes increase. This system produces the well-established distance and size effects, whereby performance in discriminating sets is best predicted by the ratio of those sets to each other (e.g., Beran, 2007), an outcome that reflects the workings of Weber’s law.

This system has been seen in a wide variety of animals including amphibians (e.g., Krusche, Uller, & Dicke, 2010), birds (e.g., Ain, Giret, Grand, Kreutzer, & Bovet, 2009; Rugani, Cavazzana, Vallortigara, & Regolin, 2013), marine mammals (e.g., Abramson, Hernandez-Lloreda, Call, & Colmenares, 2011), terrestrial non-primate mammals (Baker, Shivik, & Jordan, 2011; Perdue, Talbot, Stone, & Beran, 2012; Vonk & Beran, 2012), and many primates (Beran, 2007, 2008; Cantlon & Brannon, 2006a, b; Hanus & Call, 2007; Lewis, Jaffe, & Brannon, 2005). It also is seen in the discrimination performance of children (Cantlon, Safford, & Brannon, 2010; Cordes & Brannon, 2009; Huntley-Fenner & Cannon, 2000) and adult humans who are prevented from using formal counting routines (Beran, Taglialatela, James, Flemming, & Washburn, 2006; Boisvert, Abroms, & Roberts, 2003; Cordes, Gelman, Gallistel, & Whalen, 2001; Whalen, Gallistel, & Gelman, 1999).

The second system operates very differently, and is best described as a precise but size-limited system that accurately represents small numbers but cannot accommodate numbers greater than four because of limits in attention and working memory capacity. A model of this system, often called the object file model, has been described in detail elsewhere (e.g., Feigenson & Carey, 2005; Feigenson, Carey, & Hauser, 2002; Uller, Carey, Huntley-Fenner, & Klatt, 1999), but the key point here is that a two-core-systems hypothesis for numerical cognition requires evidence that performance with small sets of items looks better than performance with large sets (Feigenson, Dehaene, & Spelke, 2004; Xu, 2003; also see Hyde, 2011; Hyde & Wood, 2011, for a discussion about other stimulus features that impact which of these two systems might be activated). Although there has been less evidence for this system in the comparative literature than in the developmental literature, some reports have suggested that fish may show evidence of these two systems (Agrillo, Miletto Petrazzini, & Bisazza, 2014; Agrillo, Piffer, Bisazza, & Butterworth, 2012; Gómez-Laplaza & Gerlai, 2011; Piffer, Agrillo, & Hyde, 2012; but see Gómez-Laplaza & Gerlai, 2011; Potrich, Sovrano, Stancher, & Vallortigara, 2015), as might birds (e.g., Garland, Low, & Burns, 2012; Hunt, Low, & Burns, 2008), salamanders (Krusche, Uller, & Dicke, 2010; Uller et al., 2003), and beluga whales (Abramson, Hernandez-Lloreda, Call, & Colmenares, 2013). Among primate studies, only one paper reported that semi-free ranging rhesus monkeys also showed the hallmark behavioral features of two core systems (Hauser et al., 2000a, b), whereas the vast majority of the studies with primates show ratio effects indicative of the approximate number system only (e.g., Barnard et al., 2013; Beran, 2004, 2008; Evans, Beran, Harris, & Rice, 2009; Cantlon & Brannon, 2006a, b; Hanus & Call, 2007; Merritt, MacLean, Crawford, & Brannon, 2011; Nieder & Miller, 2004).

A recent report with human participants suggested that enumeration of small sets was “superprecise” and reflected a specialized mechanism for representing small numbers of items as is outlined for the object file model (Choo & Franconeri, 2014). That study was based on the longstanding finding that humans are particularly fast and accurate at reporting small numbers of items (usually four or fewer) compared to the gradually increasing response time slope and gradually decreasing performance slope for all progressively larger numbers above four. This subitization process for small numbers (Mandler & Shebo, 1982) seemingly supports the idea of two systems for enumeration of visual items. Although alternate accounts of subitizing exist that still allow for meeting Weber’s law (Lemmon, 1927), the similarity in size limits for enumerating small numbers of items, for visual memory capacity (Luck & Vogel, 1997), and for tracking moving objects (Pylyshyn & Storm, 1988) suggests that perhaps this object file system is at work in numerical judgments. The object file system is proposed to be the second core system, and one that can be adopted for other purposes such as object tracking (see Franconeri, Alvarez, & Enns, 2007).

Choo and Franconeri (2014) gave adult participants two sets of dots on a computer screen. Small set comparisons always consisted of three items in one set and one, two, four, or five items in the other, whereas large set comparisons consisted of 30 items in one set and 10, 20, 40, or 50 items in the other. Participants had to indicate whether the second set was smaller or larger than the consistent reference set (the set with three or 30 items). The response times in making these decisions were faster for the small set sizes than for the large ones. The same was true in a second experiment in which the participants judged whether the two arrays had the same number of items or differing numbers of items: responses were again faster for small sets than for large sets. Choo and Franconeri suggested that these results demonstrated super-precision for small collections of items, a claim that might lend support to the idea of two separate systems for numerical and quantitative representation.

Given the ongoing debate in the comparative literature about whether small quantities are somehow “special” in terms of the precision with which they are perceived and represented, the Choo and Franconeri (2014) task could be adapted for use with non-human animals, and then used to assess whether those animals also showed the same pattern of greater precision for small sets compared to large sets (see Agrillo et al., 2012). In Experiment 1, we presented an adapted version of this task to capuchin monkeys in which they had to choose the larger of two dot arrays shown on the screen, and we used the same quantity comparisons as in Choo and Franconeri. In Experiment 2, we expanded the number of comparisons in the small range (all combinations of one to nine items) and the large range (all comparisons of ten to 90 items in 10-item increments). This allowed us to assess whether performance was similar across small and large set sizes in terms of the role of ratio effects. If this were true, this would argue against a two-system interpretation, or at least indicate that monkeys do not show the high performance for small numbers that humans showed. If monkeys instead performed relatively better with small sets compared to large sets, this could indicate a two-system interpretation and match previous results from human participants.

Critically, the six monkeys that we tested during Experiment 1 were involved in their first experimental test using the joystick-computerized apparatus that is commonly used in numerical tests and other cognitive tests with monkeys (e.g., Beran, 2008; Evans, Beran, Chan, Klein, & Menzel, 2008). This was important because it allowed us to control for any previous experience with dots, two-dimensional quantities on computer screens, or the making of relative judgments about quantities. The lack of evidence of two number systems among some comparative studies could be a result of those previous studies testing “task-savvy” animals that had experience with a number of tests of numerical cognition (e.g., Beran, 2008) and that might therefore have come to rely more heavily on the ANS (see Bisazza, Agrillo, & Lucon-Xiccato, 2014). Naïve monkeys thus allowed us to best assess any spontaneous super-precision for small sets. Previous research with naïve baboons (Barnard et al., 2013) had shown that only one core system appeared to be at work in the discriminations of those primates, and we predicted the same was likely to be true for capuchin monkeys.

Experiment 1

Methods

Participants

We tested six adult capuchin monkeys (Cebus apella) including two males (Benny: age 11 years; Mason: age 16 years) and four females (Bailey: age 15 years; Gonzo: age 8 years; Gretel: age 11 years; Lexi: age 6 years). All monkeys had been trained within the previous 6 months to use a joystick with their hands to control a cursor on a computer screen (see Evans et al., 2008 for training details), but this was their only prior experimental history with the joystick apparatus. However, one monkey (Mason) had also been trained on a touchscreen computer and had participated in several facial-recognition studies that involved discriminating conspecific faces (Pokorny & de Waal, 2009a, b; Pokorny, Webb, & de Waal, 2011). The monkeys were socially housed and separated voluntarily for computer testing. The monkeys had continuous access to water. They received a daily diet of fruits and vegetables, and thus they were not food deprived for the purposes of this experiment or any other experiment. The experiment was conducted with the approval of the Georgia State University Institutional Animal Care and Use Committee and followed all federal guidelines.

Apparatus

The monkeys were tested using the Language Research Center’s Computerized Test System. This system consists of a personal computer, digital joystick, color monitor, and pellet dispenser (Evans et al., 2008). To engage in test trials, the monkeys manipulated the joystick with their hands and these manipulations led to isomorphic movements of a small cursor on the computer screen. When monkeys made correct responses in the task they received 45-mg banana-flavored chow pellets (Bio-Serv, Frenchtown, NJ, USA) that were delivered by a pellet dispenser that was connected to the computer through a digital I/O board (PDISO8A; Keithley Instruments, Cleveland, OH, USA). The task program was written in Visual Basic 6.0.

Design and procedure

The task involved quantity discriminations between two choices. Monkeys initiated trials by moving the centrally located cursor to a digital button at the top of the screen. When contacted, that button disappeared and two arrays of identical white dots (5 mm in diameter) were presented at the left center and right center of the screen, with the cursor centered between them. Both arrays were enclosed in a thin border to present them as two discrete sets, and the background was black. Monkeys could take as long as they wanted to make a response. They were rewarded with a single food pellet for selecting the array with more dots, and then a 1-s inter-trial interval occurred before the start button appeared for the next trial. Incorrect selections led to a 20-s timeout during which the screen remained blank before the start button appeared for the next trial.

There were two conditions. In the first condition (Small Magnitude), every trial presented an array of three dots on one randomly selected side of the screen, and one, two, four, or five dots in the other array (also randomly selected). In the second condition (Large Magnitude), every trial presented an array of 30 dots and 10, 20, 40, or 50 dots in the other array (again, with random side presentations).
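The two conditions can be sketched in a few lines of Python. This is only an illustration of the design described above, not the original task program (which was written in Visual Basic 6.0); the names `CONDITIONS`, `make_trial`, and `ratios` are hypothetical.

```python
import random

# Set sizes taken from the text: a fixed reference array (3 or 30 dots)
# paired with one of four comparison arrays.
CONDITIONS = {
    "small": {"reference": 3, "comparisons": [1, 2, 4, 5]},
    "large": {"reference": 30, "comparisons": [10, 20, 40, 50]},
}


def make_trial(condition, rng=random):
    """Build one hypothetical trial for the given magnitude condition."""
    spec = CONDITIONS[condition]
    reference = spec["reference"]
    comparison = rng.choice(spec["comparisons"])
    # The reference array appears on a randomly selected side of the screen.
    if rng.random() < 0.5:
        left, right = reference, comparison
    else:
        left, right = comparison, reference
    return {
        "left": left,
        "right": right,
        # The rewarded choice is the array with more dots.
        "correct_side": "left" if left > right else "right",
        # Ratio is defined as smaller set size divided by larger set size.
        "ratio": min(left, right) / max(left, right),
    }


def ratios(condition):
    """Return the sorted smaller/larger ratios available in a condition."""
    spec = CONDITIONS[condition]
    ref = spec["reference"]
    return sorted(min(ref, c) / max(ref, c) for c in spec["comparisons"])
```

Because the large-magnitude set sizes are the small-magnitude set sizes multiplied by ten, `ratios("small")` and `ratios("large")` return the same four values, so the two conditions differ in magnitude but not in ratio.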

Each monkey completed as many trials as it chose to perform during the approximately 3- to 4-hour test session, during which time water was always available as was visual access to conspecifics in nearby parts of the enclosure. Half of the monkeys started with the Small Magnitude condition and half started with the Large Magnitude condition. Monkeys completed four blocks of trials where each block consisted of 200 trials in one condition and then 200 trials in the other condition. To collect the full data set required nine sessions for Bailey, five sessions for Benny, five sessions for Gonzo, three sessions for Gretel, five sessions for Lexi, and four sessions for Mason.

Results

Figure 1 illustrates the accuracy of the monkeys in choosing the larger set as a function of magnitude condition (small sets or large sets) and as a function of trial block. We conducted a repeated-measures ANOVA with these two factors included. There was a significant main effect of block, F (3, 15) = 11.21, p < .001, ηp² = .87. There was not a significant main effect of magnitude condition, F (1, 5) = 3.46, p = .12, ηp² = .40, and there was not a significant interaction, F (3, 15) = 0.77, p = .52, ηp² = .13. The main effect of block reflected improved performance with experience, as evidenced by a significant linear fit from the test of within-subjects contrast, F (1, 5) = 19.71, p = .007, ηp² = .80. There was not a difference in performance as a function of magnitude condition, and capuchin monkeys did not privilege small sets over large sets.

Fig. 1
Mean accuracy for the monkeys with the small and large magnitude arrays for each trial block. Error bars indicate 95 % confidence intervals using the Cousineau (2005) method for calculating confidence intervals for within-subject designs
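For readers unfamiliar with the Cousineau (2005) approach named in the caption, the following is a minimal sketch of the normalization it relies on: each subject's scores are re-centered on the grand mean before the per-condition confidence intervals are computed, which removes between-subject variability from the error bars. The function name `cousineau_ci` and the idea of passing the critical t value by hand are assumptions made for illustration; this is not the authors' analysis code.

```python
from math import sqrt
from statistics import mean, stdev


def cousineau_ci(data, t_crit):
    """Within-subject confidence intervals via Cousineau-style normalization.

    data   : list of rows, one row per subject, one column per condition.
    t_crit : two-tailed critical t value for n - 1 degrees of freedom
             (supplied by the caller; e.g., about 2.571 for n = 6, alpha = .05).
    Returns (condition_means, half_widths).
    """
    grand_mean = mean([x for row in data for x in row])
    # Subtract each subject's own mean and add back the grand mean.
    normalized = [[x - mean(row) + grand_mean for x in row] for row in data]
    n = len(data)
    columns = list(zip(*normalized))
    condition_means = [mean(col) for col in columns]
    # Standard error of each condition on the normalized scores.
    half_widths = [t_crit * stdev(col) / sqrt(n) for col in columns]
    return condition_means, half_widths
```

When subjects differ only in their overall level and not in their pattern across conditions, the normalized scores are identical across subjects and the interval half-widths shrink to zero, which is the sense in which the method isolates within-subject variability.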

Fig. 2
The mean response times for small and large magnitude arrays and each trial block. Error bars indicate 95 % confidence intervals using the Cousineau (2005) method for calculating confidence intervals for within-subject designs

We also examined response times as a function of magnitude condition (Small or Large) and each of the four specific comparisons within each set size (Fig. 2). We first removed any trials with response times that exceeded 10 s, as these were extremely rare (<0.5 % of the total trials). There was not a statistically significant effect of magnitude, F (1, 5) = 1.79, p = .23, ηp² = .26. There was, however, an effect of block, F (3, 15) = 5.82, p = .008, ηp² = .54. The main effect of block reflected progressively slower responding across blocks, as evidenced by a significant linear fit from the test of within-subjects contrast, F (1, 5) = 7.54, p = .04, ηp² = .60. There was not a statistically significant interaction of magnitude and trial block, F (3, 15) = 0.07, p = .97, ηp² = .01. The same pattern of results occurred when all trials were kept in the analysis but we re-coded any trials with response times of greater than 10 s as having a response time of exactly 10 s.
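The effect sizes in the response-time analysis can be cross-checked from the reported F values and degrees of freedom, assuming that the effect-size symbol denotes the standard partial eta squared, for which SS_effect / (SS_effect + SS_error) equals (F × df_effect) / (F × df_effect + df_error). The helper name `partial_eta_squared` is hypothetical.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Recover partial eta squared from an F ratio and its degrees of freedom."""
    effect_term = f_value * df_effect
    return effect_term / (effect_term + df_error)


# F values and degrees of freedom as reported for the response-time analysis.
reported = {
    "magnitude": (1.79, 1, 5),
    "block": (5.82, 3, 15),
    "block (linear contrast)": (7.54, 1, 5),
    "magnitude x block": (0.07, 3, 15),
}

for label, (f_value, df_effect, df_error) in reported.items():
    print(label, round(partial_eta_squared(f_value, df_effect, df_error), 2))
```

Rounded to two decimals, these four values agree with the effect sizes reported for the response-time analysis (.26, .54, .60, and .01).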

Fig. 3
The group performance of the monkeys with the small and large magnitude arrays shown as a function of the ratio between sets (A) or the difference in dot quantity between sets (B). Error bars indicate 95 % confidence intervals using the Cousineau (2005) method for calculating confidence intervals for within-subject designs

Discussion

In the current study, experimentally-naïve capuchin monkeys were equally proficient and equally fast in discriminating small and large sets of items in a relative quantity judgment task. Crucially, there was no evidence for the “superprecision” for small sets over large sets as was documented among human adults using a similar paradigm. There also was no evidence for faster discriminations and responses toward small sets compared to large sets, unlike in human participants (Choo & Franconeri, 2014).

The comparisons used in Experiment 1 did not allow us to fully assess whether ratio effects presented similarly in these monkeys for the small range and large range of quantities. These ratio effects, calculated by dividing the smaller set size by the larger set size, are consistently reported to best account for the performance of non-human primates in two-choice quantity discrimination tasks (Barnard et al., 2013; Beran, 2004, 2008; Evans, Beran, Harris, & Rice, 2009; Cantlon & Brannon, 2006a, b; Hanus & Call, 2007; Merritt, MacLean, Crawford, & Brannon, 2011; Nieder & Miller, 2004). Assessing ratio effects, as well as other potential contributing factors to performance such as the overall magnitude of sets, requires a wide range of comparisons. Experiment 2 presented this range, and it did so for small and large magnitude ranges.

Experiment 2

Methods

Participants

The same six monkeys participated as in Experiment 1. In the interim between Experiment 1 and Experiment 2, these monkeys completed one other study in which they made quantity judgments (Parrish, Agrillo, Perdue, & Beran, 2015). Thus, they had some additional experience in the kind of test given in Experiment 1.

Apparatus

This was the same as in Experiment 1.

Design and procedure

The monkeys performed the same two-choice discrimination in which they compared arrays of the same dot stimuli as in Experiment 1. Now, however, there was a much greater range of quantity comparisons, and those were presented in the small magnitude and large magnitude ranges. For the small magnitude range, all possible combinations of one to nine items were presented (except comparisons of equal set sizes). For the large magnitude range, all combinations of 10, 20, 30, 40, 50, 60, 70, 80, and 90 items were presented (except comparisons of equal quantities). Thus, for each magnitude range, the difference in the quantities to be compared on a given trial ranged from one to eight (small range) or from 10 to 80 (large range). For both ranges, the ratio of items within the possible comparisons ranged from .11 (one vs. nine or 10 vs. 90) to .89 (eight vs. nine or 80 vs. 90). The magnitude (small or large) and specific comparison were chosen randomly on each trial, and the side of the screen with the larger array was randomly determined on each trial. Each monkey completed 3,500 trials in this experiment across a variable number of daily test sessions (Gretel – two sessions; Benny and Gonzo – four sessions; Lexi – five sessions; Mason – six sessions; Bailey – 17 sessions).
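The comparison set described above can be enumerated directly. The sketch below is illustrative only (the function name `comparisons` is hypothetical, and it is not the task program); it lists every unordered pair of unequal set sizes in each range together with its difference and its smaller/larger ratio.

```python
from itertools import combinations


def comparisons(values):
    """List every pair of unequal set sizes with its difference and ratio."""
    rows = []
    for smaller, larger in combinations(sorted(values), 2):
        rows.append({
            "smaller": smaller,
            "larger": larger,
            "difference": larger - smaller,
            "ratio": smaller / larger,
        })
    return rows


small_range = comparisons(range(1, 10))        # set sizes 1 to 9
large_range = comparisons(range(10, 100, 10))  # set sizes 10, 20, ..., 90

for name, rows in (("small", small_range), ("large", large_range)):
    diffs = sorted({r["difference"] for r in rows})
    low = min(r["ratio"] for r in rows)
    high = max(r["ratio"] for r in rows)
    print(name, len(rows), diffs, round(low, 2), round(high, 2))
```

Each range yields 36 comparisons, with identical ratios across the two ranges (from 1/9 ≈ .11 to 8/9 ≈ .89) and differences of one to eight items or 10 to 80 items, matching the description in the text.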

Results

Figure 3a presents the group data as a function of the ratio between sets. We compared the regression slopes for the ratio data for small and large magnitude ranges using analysis of covariance (ANCOVA, with ratio as the covariate). There was a significant difference between the two magnitude ranges, F (1, 51) = 18.83, p < .001. As can be seen in the figure, performance was better for the large magnitude range, and the slope of the decreasing function was shallower for the large magnitude range. Figure 3b shows performance as a function of the difference between sets. There were differences of one to eight items for the small magnitude range and differences of 10 to 80 items (in 10-unit increments) in the large magnitude range. A paired-samples t-test again showed a significant difference between the magnitude ranges, t(7) = 3.29, p = .013. Once again, this difference reflected better performance in the large magnitude range compared to the small magnitude range. This pattern of results stands in contrast to evidence of super-precision for small magnitudes among human adults (Choo & Franconeri, 2014).

Figure 4 shows the individual monkeys’ performance in each range as a function of the ratio between sets. The ANCOVAs revealed that Gonzo, F (1, 51) = 7.51, p < .01, Mason, F (1, 51) = 5.97, p = .018, Benny, F (1, 51) = 13.11, p < .001, and Gretel, F (1, 51) = 30.02, p < .001, all showed better performance for the large magnitude range. Bailey, F (1, 51) = 3.94, p = .053, and Lexi, F (1, 51) = 0.01, p = .92, showed no significant difference between the two magnitude ranges.

Fig. 4
Data for individual monkeys with the small and large magnitude arrays shown as a function of the ratio between sets

General discussion

These results are valuable in light of the ongoing debate in the comparative and developmental literature over the mechanisms underlying numerical cognition. Much of the previous research had indicated strong support for a single quantity representation system among non-human primate species (Barnard et al., 2013; Beran, 2004, 2007, 2008; Evans et al., 2009; Cantlon & Brannon, 2006a, b; Hanus & Call, 2007; Judge, Evans, & Vyas, 2005; Merritt et al., 2011; Nieder & Miller, 2004, but see Hauser et al. 2000a, b). The present data provide similar support, and in a test that was based on one used with adult humans that had instead indicated two systems might be at work. These experimentally-naïve capuchin monkeys demonstrated the reported pattern of results from the outset of testing, suggestive of only one system at work. This was true in Experiment 1 in terms of their accuracy in choosing the larger set and in terms of comparable response times to large-magnitude sets and small-magnitude sets (but see Tomonaga & Matsuzawa, 2002, for a different test with chimpanzees in which there was evidence of faster responding to small sets). In Experiment 2, strong ratio effects were evident (as were distance effects), but these were equivalent for the small and large quantity ranges for a few monkeys, or, in the majority of cases, monkeys showed greater discrimination performance with the larger quantity range. One reason for this might be greater overall dot stimulus amount onscreen despite consistent relative amounts of dot stimuli in the two ranges. This might lead to the slight increase in performance if those sets led to more attentive discrimination. However, as illustrated in Figs. 3 and 4, the clearest conclusion is that performance is highly similar whether monkeys are choosing between two relatively small dot magnitudes or two relatively large ones.

It is interesting to consider why there is a difference between the treatment of small and large sets for humans (in some studies) but not often such a difference in tests with non-human animals (and very rarely such a difference for non-human primates). Sensitivity to a wide range of quantities would clearly be advantageous among all animal species in the realms of foraging, predation, and sexual selection. A system that allowed one to generate approximate representations for telling apart important differences (e.g., five vs. eight pieces of food) but not for telling apart small differences in large magnitudes (e.g., 12 vs. 13 predators, 21 vs. 22 pieces of food) would be valuable. The approximate number system does just this, and is seen in many species.

More precise and exact representations of small quantities and numbers also could serve an adaptive function, particularly where keeping track of exactly what or who is around you is critical (e.g., as in the case of seeing two vs. three predators in the vicinity). Thus, one can see why such a system might exist. However, the more precise and rapid apprehension and representation of small numbers of items is not often evident in other species (including in the present experiments) and is not always present even in tests with humans. It therefore remains unclear whether it is a core system or instead may be the result of specialized learning or experience or special stimulus presentation formats (e.g., Hyde, 2011). More research is needed to clarify that issue, but the present results showed that these naïve capuchin monkeys do not show privileged capacities for dealing with small numbers of items. Such privilege may be unique to human quantity comparisons, or perhaps is the result of specific kinds of experiences or task demands in making such judgments.