Dealing accurately with numerical quantities is fundamental to success in modern societies. On a daily basis, we are asked to make judgments about concepts that are expressed numerically, be it monetary value, weather, temporal duration, or spatial distance. But what is the cognitive basis for our ability to engage with numerical quantities such as these? Recently, it has been proposed that the answer to this question is an innate and inexact analog system known as the Approximate Number System (ANS). This evolutionarily ancient system enables us, for example, to rapidly decide, without explicitly counting, which of two orange trees has the greater number of fruit, or which of two herds has the greater number of gazelle. The ANS supports approximate numerical operations, such as comparison and addition, on both visual and auditory arrays in adults, children, and nonhuman animals (Barth et al., 2006; Brannon & Terrace, 2000; Cordes, Gelman, Gallistel, & Whalen, 2001; Dehaene, 1992, 1997; Gallistel & Gelman, 1992).

These findings raise the possibility that the ANS is the cognitive basis of everyday numeracy skills such as exact addition, subtraction, and multiplication with Arabic numerals. At least four sources of evidence support this possibility. First, the ANS appears to be automatically activated in response to Arabic numerals by adults as well as children (Dehaene, 1997; Moyer & Landauer, 1967). Second, children who have had no formal mathematical instruction seem to harness the ANS when asked to perform approximate symbolic arithmetic operations (additions of numerosities represented as Arabic numerals; Gilmore, McCarthy, & Spelke, 2007). Third, several different measures of children’s ANS proficiency (using both symbolic and nonsymbolic stimuli) have been found to correlate with their performance on tests of early symbolic numeracy skills (e.g., symbolic stimuli, Durand, Hulme, Larkin, & Snowling, 2005; Holloway & Ansari, 2009; nonsymbolic stimuli, Halberda, Mazzocco, & Feigenson, 2008; Mundy & Gilmore, 2009). Fourth, one measure of ANS proficiency taken at the start of formal schooling, the symbolic numerical distance effect, has been shown to predict young children’s symbolic mathematics competence a year later (De Smedt, Verschaffel, & Ghesquière, 2009). However, other measures often assumed to reflect ANS proficiency, notably the nonsymbolic numerical distance effect (NDE), have been found not to correlate with school-level mathematics achievement (Holloway & Ansari, 2009; Mundy & Gilmore, 2009). It may be that the NDE is a poor measure of ANS proficiency (Gilmore, Attridge, & Inglis, in press).

A natural question arises from this set of findings: If adults exhibit an automatic ANS-based response to symbolic numerosities, and if the ANS is involved in early symbolic numerical operations in children, does the ANS also influence more sophisticated numerical operations of the types conducted by adults in everyday life? In other words, is the ANS a fundamental part of the way humans of all ages engage with numerical quantities? Or, alternatively, is the large variance in mathematical competence possessed by adults unrelated to differences in ANS acuity? In view of the educational potential for harnessing the ANS to develop effective instruction, it is clear that further evidence that speaks to the relationship between the ANS and formal mathematics achievement would be desirable. Our goal in the present article is to explore this relationship.

The approximate number system

Representations of numerosities within the ANS are noisy, and they grow noisier as the magnitude of the to-be-represented numerosity increases. To capture this noise, Barth et al. (2006) proposed that ANS representations of a numerosity n follow a normal distribution with mean n and standard deviation w × n. Here, w is the internal Weber fraction, which gives a measure of the acuity of an individual’s ANS. Thus, on a nonsymbolic comparison task, in which participants are asked to select which of two arrays of colored dots is numerically greater, those participants with high w values have less precise representations, and consequently lower accuracy rates.
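The link between w and comparison accuracy can be made concrete with a short sketch. It takes the normal model described above and adds two assumptions that are not stated in the text: the representations of the two arrays are independent, and the participant chooses whichever array has the larger internal representation. Under those assumptions the difference between the two representations is normal with mean n1 − n2 and standard deviation w × sqrt(n1² + n2²), so accuracy is the normal cumulative distribution evaluated at |n1 − n2| / (w × sqrt(n1² + n2²)). The function names are illustrative only.

```python
import math


def normal_cdf(x: float) -> float:
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def predicted_accuracy(n1: int, n2: int, w: float) -> float:
    """Probability of picking the larger array for numerosities n1 and n2.

    Assumes independent normal representations with mean n and SD w * n
    (the independence and decision rule are assumptions of this sketch).
    """
    if n1 == n2:
        raise ValueError("the two arrays must differ in numerosity")
    if w <= 0:
        raise ValueError("w must be positive")
    sd_of_difference = w * math.sqrt(n1 ** 2 + n2 ** 2)
    return normal_cdf(abs(n1 - n2) / sd_of_difference)


if __name__ == "__main__":
    # Accuracy falls as w grows and as the ratio approaches 1.
    for w in (0.2, 0.4, 0.8):
        row = [round(predicted_accuracy(n, 20, w), 3) for n in (10, 12, 14, 16)]
        print(f"w = {w}: accuracy at ratios 0.5, 0.6, 0.7, 0.8 -> {row}")
```

In this form accuracy depends only on the ratio of the two numerosities and on w, which is why a single w can summarize performance across trials of different sizes.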

Halberda et al. (2008) gave 14-year-old children a nonsymbolic comparison task, calculated each individual’s w (henceforth ANS acuity), and related these to standardized mathematics achievement tests that had been taken at ages 5–11. They found strong relationships between these two measures at each testing point (r² values varied from .11 to .33). These correlations retained significance after controlling for covariates such as IQ and working memory measures. This result seems to suggest that ANS acuity and mathematics achievement are closely related. However, both ANS acuity (Halberda & Feigenson, 2008) and symbolic mathematics achievement are developmental; therefore, because Halberda et al. did not test their participants concurrently on the two tasks, it is possible that their ANS measure (taken at age 14) had been influenced by developmental patterns not reflected in the mathematics achievement tests taken at ages 5–11. For example, it is conceivable that individual differences in early mathematics achievement lead, over several years of school experience, to differential levels of exposure to numerical ideas (both in terms of quality and quantity of the exposure), which in turn might lead to differential ANS acuities. Thus, measuring ANS acuity some years after mathematical achievement might overstate the relationship between these two constructs. Some support for this possibility comes from Iuculano, Tang, Hall, and Butterworth’s (2008) finding that the nonsymbolic addition performance of 8- to 9-year-old children did not correlate with their exact symbolic addition performance when tested concurrently.

To investigate these issues, we conducted Experiment 1. Our aim was to determine whether there is a relationship between children’s ANS acuities and formal mathematics achievements when the two constructs are measured concurrently.

Experiment 1

Method

Participants

Participants were 39 children (20 male) aged 7.6 to 9.4 years (M = 8.4). They took part, with parental consent, at school and were rewarded with stickers. Three tasks were administered, as listed below.

Nonsymbolic comparison task

Participants completed a computer-based nonsymbolic comparison task, based on the version used by Pica, Lemer, Izard, and Dehaene (2004), in which they selected the more numerous of two dot arrays. The two arrays (one red and one blue on a white background) were presented side by side simultaneously on a 15-in. LCD laptop screen. The ratios between the numerosities of the left and right arrays were 0.5, 0.6, 0.7, 0.8, and their inverses, and the numerosity of the arrays ranged from 5 to 22. The color and side of screen of the correct array were fully counterbalanced. Participants were asked to select, as quickly and as accurately as possible, which array was more numerous. Responses were recorded via the leftmost (left bigger) and rightmost (right bigger) buttons on a five-button response box.

Each of 128 trials began with a fixation point for 1,000 ms, followed by the dot arrays for 1,500 ms. If the participant had not responded within 1,500 ms, the arrays were followed by a white screen with a black question mark. This allowed participants to still respond while preventing them from counting the arrays. Participants rarely exceeded this duration (2.5% of trials), and the mean RT was well within this limit (776 ms). The design is summarized in Fig. 1. Experimental trials were preceded by a practice block of eight trials.

Fig. 1 The experimental paradigm used in Experiment 1. The two dot arrays were colored red and blue

To prevent participants reliably using strategies on the basis of continuous quantities correlated with number (dot size, total enclosure area), the stimuli were created following the method adopted by Pica et al. (2004). For each problem, two sets of stimuli were created: one in which the dot size and total enclosure area decreased with numerosity, and one in which the dot size and total enclosure area increased with numerosity.

Woodcock–Johnson III tests of achievement

The Calculation subtest of the Woodcock–Johnson III Tests of Achievement was administered with the standard procedure (participants had no time limits and continued until they had answered six questions incorrectly in succession). An example problem is given in Table 1.

Table 1 Details about the various subtests of the Woodcock–Johnson III Tests of Achievement used in Experiments 1 and 2

Wechsler abbreviated scale of intelligence

Participants completed the Matrix Reasoning subtest of the Wechsler Abbreviated Scale of Intelligence (WASI) following the standard procedure. The raw scores were converted into t scores to give an age-standardized measure of nonverbal intelligence.

Results

Those participants who appeared to be using strategies that were based on continuous quantities correlated with number (i.e., those who were not using their ANS) on a majority of trials were eliminated from the sample (i.e., those participants whose accuracy rates on the two sets of stimuli created by Pica et al.’s [2004] method differed by more than 0.5, N = 10). In addition, we removed those participants whose performance was not above chance (N = 5). This left 24 participants for the main analysis.
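The two exclusion rules above can be sketched as a simple filter. The 0.5 threshold on the accuracy difference between the two stimulus sets comes from the text; how "above chance" was determined is not stated, so the one-sided binomial test against 0.5 at alpha = .05 used here is an assumption of this sketch, as are all function and parameter names.

```python
import math


def binomial_upper_tail(k: int, n: int, p: float = 0.5) -> float:
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(
        math.comb(n, i) * p ** i * (1.0 - p) ** (n - i) for i in range(k, n + 1)
    )


def keep_participant(
    acc_set_a: float,
    acc_set_b: float,
    n_correct: int,
    n_trials: int,
    max_gap: float = 0.5,   # threshold reported in the text
    alpha: float = 0.05,    # assumed significance level for the chance test
) -> bool:
    """Return True if a participant passes both exclusion rules.

    acc_set_a / acc_set_b are accuracy rates on the two stimulus sets
    (continuous quantities decreasing vs. increasing with numerosity).
    """
    relies_on_continuous_cues = abs(acc_set_a - acc_set_b) > max_gap
    above_chance = binomial_upper_tail(n_correct, n_trials) < alpha
    return (not relies_on_continuous_cues) and above_chance


if __name__ == "__main__":
    # Hypothetical participants, not data from the study.
    print(keep_participant(0.70, 0.68, 88, 128))  # similar accuracy on both sets
    print(keep_participant(0.95, 0.40, 86, 128))  # large gap between sets
    print(keep_participant(0.52, 0.50, 66, 128))  # close to chance overall
```

A large gap between the two sets indicates that responses track dot size or enclosure area rather than number, because those cues point in opposite directions across the sets.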

Accuracy rates varied from 0.59 to 0.81, with a mean of 0.69 (SD = 0.06), and were subjected to an ANOVA with ratio as a within-subjects factor. There was a significant effect of ratio, F(3, 69) = 11.76, p < .001, and a significant linear trend, F(1, 23) = 50.32, p < .001. As is characteristic of the ANS, accuracy rates were highest at the 0.5 ratio, and lowest at the 0.8 ratio. These data are shown in Fig. 2.

Fig. 2 Accuracy rates by problem ratio in Experiment 1. Error bars show ±1 SEs of the means

Using the log-likelihood method, each participant’s accuracy data were individually fitted to the model proposed by Barth et al. (2006) (Footnote 1). Values of the w parameter varied from 0.34 to 1.17, with a mean of 0.65 (SD = 0.23) (Footnote 2). The relationship between participants’ nonsymbolic comparison accuracy and their Woodcock–Johnson Calculation subtest scores is shown in Fig. 3. ANS acuity, as measured by w parameters, was found to negatively correlate with Woodcock–Johnson Calculation subtest scores, after controlling for age-standardized WASI matrix reasoning scores and age, pr = −.548, p = .008. In other words, high ANS acuities (w parameters close to zero) were related to high scores on the Woodcock–Johnson calculation subtest.
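To illustrate what fitting w by maximum likelihood involves, the following is a minimal sketch and not the analysis code used in the study. It assumes independent normal representations with standard deviation w × n and a "choose the larger representation" decision rule, and it finds the w that maximizes the binomial log-likelihood by a plain grid search. The grid bounds, step size, data layout, and example counts are all hypothetical.

```python
import math


def normal_cdf(x: float) -> float:
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def p_correct(n1: int, n2: int, w: float) -> float:
    """Model accuracy for one pair of numerosities (assumed decision rule)."""
    return normal_cdf(abs(n1 - n2) / (w * math.sqrt(n1 ** 2 + n2 ** 2)))


def log_likelihood(trials, w: float) -> float:
    """Binomial log-likelihood of the data for a given w.

    trials: iterable of (n1, n2, n_correct, n_trials) tuples, one per pair.
    """
    total = 0.0
    for n1, n2, n_correct, n_trials in trials:
        # Clamp to avoid log(0) when the model predicts near-perfect accuracy.
        p = min(max(p_correct(n1, n2, w), 1e-9), 1.0 - 1e-9)
        total += n_correct * math.log(p) + (n_trials - n_correct) * math.log(1.0 - p)
    return total


def fit_w(trials, w_min: float = 0.05, w_max: float = 2.0, step: float = 0.005) -> float:
    """Return the grid value of w with the highest log-likelihood."""
    n_steps = int(round((w_max - w_min) / step))
    grid = [w_min + i * step for i in range(n_steps + 1)]
    return max(grid, key=lambda w: log_likelihood(trials, w))


if __name__ == "__main__":
    # Hypothetical counts for one participant: 32 trials at each of four ratios.
    example = [(10, 20, 29, 32), (12, 20, 26, 32), (14, 20, 23, 32), (16, 20, 20, 32)]
    print("estimated w:", round(fit_w(example), 3))
```

A grid search is used only because it is easy to read and needs no external libraries; a numerical optimizer would serve the same purpose.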

Fig. 3 The relationship found in Experiment 1 between standardized residuals (controlling for age-standardized matrix reasoning scores) for ANS Acuity (nonsymbolic comparison internal Weber fraction) and the Woodcock–Johnson calculation subtest

Discussion

Our goal in Experiment 1 was to determine whether the relationship found by Halberda et al. (2008) between ANS acuity at age 14 and mathematics achievement at ages 5–11 could be replicated when the two measures were taken concurrently. We measured the ANS acuity and mathematics achievement of 7- to 9-year-old children and found a strong relationship between the two constructs. Those children with high ANS acuity tended to have high mathematics achievement scores. Since both ANS acuity and mathematics ability are developmental, one can naturally ask whether these two constructs codevelop into adulthood. In other words, do children develop their mathematics ability and ANS acuity together in a mutually reinforcing cycle? To explore this issue, we conducted a second experiment with adult participants.

Experiment 2

The primary goal of Experiment 2 was to determine whether the relationship between ANS acuity and mathematics achievement we found with 7- to 9-year-old children in Experiment 1 would also hold with adults who were experienced in everyday mathematics. Along with a measure of achievement focused on numerical calculation (used by Halberda et al. [2008] and in Experiment 1), we took several other measures of achievement that, taken together, better reflect the broad nature of mathematics. In particular, along with various measures of numerical skill (calculation, calculation fluency, etc.), we also took measures of nonnumerical mathematical skills related to logical inference and geometry.

Method

Participants

Participants were 101 adults (50 male; ages 18–48, M = 23) who were recruited from the participant panel of the University of Nottingham’s Learning Sciences Research Institute; each was paid £20 for taking part. Testing was conducted individually. Computer tasks were presented on a 17-in. Philips 170B LCD. In addition to the tasks reported in the present article, participants tackled a number of other numerical cognition tasks (nonsymbolic addition, subitizing, etc.) that are not discussed here. The order of tasks was counterbalanced between participants, except that all tasks with symbolic stimuli were presented after tasks with nonsymbolic stimuli, so as to avoid cuing counting strategies (cf. Gilmore et al., in press).

Nonsymbolic comparison task

Participants completed the nonsymbolic comparison task from Experiment 1, with minor changes to the stimulus characteristics and procedure (pilot testing revealed that using identical stimuli to those given to the children may have led to ceiling effects). The numerosities of the arrays ranged from 9 to 70, and the pairs differed by the ratios 0.625, 0.714, and 0.833 (5:8, 5:7, and 5:6) and their inverses. Again, the stimuli were created following the method adopted by Pica et al. (2004).

Each of 120 trials began with a fixation point for 1,000 ms, followed by the dot arrays until response. Participants were asked to select, as quickly as possible, which array was more numerous. Responses were recorded via the leftmost (left bigger) and rightmost (right bigger) buttons on a five-button response box.

There was a response time limit of 1,249 ms to prevent ceiling effects. The limit was determined by taking the mean plus one standard deviation of the reaction times (RTs) found in pilot testing. This was enforced with a “Please speed up” message, followed by the next trial. Few trials led to the display of this message (approximately 3.6%), and these were recorded as incorrect responses. The experimental trials were preceded by a block of 10 practice trials. Participants were prompted to take breaks after every 20 trials.

Woodcock–Johnson III tests of achievement

Again, the Calculation subtest of the Woodcock–Johnson III Tests of Achievement was administered with the standard procedure. However, unlike in Experiment 1, we also administered the Math Fluency, Applied Problems, Quantitative Concepts, and Number Series subtests of the Woodcock–Johnson, again with the standard procedure. Descriptions of each of these subtests are given in Table 1.

Non-numerical tasks

Participants answered 32 paper-based conditional inference problems following the design used by Evans and Handley (1999). In addition, they were given 20 min to complete a reduced version of the paper-based van Hiele Level Geometry Test (Usiskin, 1982), and they also completed the matrix reasoning subtest of the WASI, following the standard procedure.

Results

As in Experiment 1, those participants who appeared to be using non-numeric cues for the majority of trials (N = 25), or whose performance was not above chance (N = 11), were eliminated from the analysis. This left 64 participants (an additional participant did not complete all of the tasks in the study).

Accuracy rates varied from 0.58 to 0.85, with a mean of 0.73 (SD = 0.06), and were subjected to a one-way ANOVA with ratio as a within-subjects factor. Again, responses showed the ratio effect characteristic of the ANS, F(2, 126) = 107.6, p < .001, and a significant linear trend, F(1, 63) = 219.0, p < .001. These data are shown in Fig. 4.

Fig. 4 Accuracy rates by problem ratio in Experiment 2. Error bars show ±1 SEs of the means

Values of the w parameter varied from 0.22 to 0.95, with a mean of 0.39 (SD = 0.14). The relationship between participants’ nonsymbolic comparison accuracy and their Woodcock–Johnson Calculation subtest scores is shown in Fig. 5. Unlike in Experiment 1, ANS acuity, as measured by w parameters, was not found to correlate with Woodcock–Johnson calculation subtest scores, after controlling for age-standardized matrix reasoning scores and age, pr = +.161, p = .211. This nonsignificant positive correlation was found to be significantly different from the significant negative correlation found in Experiment 1, Fisher’s r-to-z transformation, z = 3.07, p = .001 (Footnote 3). In other words, in addition to being not significantly different from zero, the correlation between ANS acuity and mathematical achievement found in adults was significantly different from that found in children.
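The comparison of the two correlations can be checked with a short worked example. The sketch below applies the standard Fisher r-to-z test for two independent correlations to the values reported above (pr = −.548 with 24 children, pr = +.161 with 64 adults). It assumes the usual standard error based on n − 3 for each sample, with no further adjustment for the covariates; whether any adjustment was applied in the original analysis is not stated here. The function name is illustrative.

```python
import math


def normal_cdf(x: float) -> float:
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def fisher_z_difference(r1: float, n1: int, r2: float, n2: int):
    """Test whether two independent correlations differ.

    Returns (z, one_sided_p, two_sided_p). Uses atanh as the r-to-z
    transformation and sqrt(1/(n1 - 3) + 1/(n2 - 3)) as the standard error.
    """
    z1 = math.atanh(r1)
    z2 = math.atanh(r2)
    standard_error = math.sqrt(1.0 / (n1 - 3) + 1.0 / (n2 - 3))
    z = (z2 - z1) / standard_error
    one_sided_p = 1.0 - normal_cdf(abs(z))
    return z, one_sided_p, 2.0 * one_sided_p


if __name__ == "__main__":
    # Children (Experiment 1) versus adults (Experiment 2).
    z, p_one, p_two = fisher_z_difference(-0.548, 24, 0.161, 64)
    print(f"z = {z:.2f}, one-sided p = {p_one:.4f}, two-sided p = {p_two:.4f}")
    # z is roughly 3.07 with these inputs.
```

Under these assumptions the statistic agrees with the reported z = 3.07, and the one-sided p value is close to the reported .001.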

Fig. 5 The relationship found in Experiment 2 between standardized residuals (controlling for age-standardized matrix reasoning scores) for ANS Acuity (nonsymbolic comparison internal Weber fraction) and the Woodcock–Johnson calculation subtest

In addition, having controlled for age-standardized matrix reasoning scores, no significant correlation was found between ANS acuity and any of the Math Fluency, pr = +.184, p = .156, Applied Problems, pr = −.098, p = .453, Quantitative Concepts, pr = −.110, p = .401, or Number Series, pr = +.081, p = .535, subtests of the Woodcock–Johnson III Tests of Achievement. Nor were there significant correlations between ANS acuity and overall scores on the conditional inference task, pr = −.110, p = .400, or on the van Hiele Levels Geometry Test, pr = −.164, p = .208.
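For readers unfamiliar with the pr statistic reported above, a partial correlation is the correlation between two variables after the linear influence of one or more covariates has been removed from both. The sketch below computes it with the standard recursive formula rather than by regressing out residuals; the two approaches give the same value. The function names and the example data are hypothetical and are not taken from the study.

```python
import math


def pearson(x, y) -> float:
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def partial_corr(x, y, covariates) -> float:
    """Correlation between x and y controlling for every covariate.

    Removes one covariate at a time using
    r_xy.z = (r_xy - r_xz * r_yz) / sqrt((1 - r_xz^2) * (1 - r_yz^2)),
    where each r already controls for the remaining covariates.
    """
    if not covariates:
        return pearson(x, y)
    z, rest = covariates[0], covariates[1:]
    r_xy = partial_corr(x, y, rest)
    r_xz = partial_corr(x, z, rest)
    r_yz = partial_corr(y, z, rest)
    return (r_xy - r_xz * r_yz) / math.sqrt((1.0 - r_xz ** 2) * (1.0 - r_yz ** 2))


if __name__ == "__main__":
    # Hypothetical scores: both measures share variance only through `reasoning`.
    reasoning = [-1, -1, -1, -1, 1, 1, 1, 1]
    noise_a = [-1, -1, 1, 1, -1, -1, 1, 1]
    noise_b = [-1, 1, -1, 1, -1, 1, -1, 1]
    acuity = [r + a for r, a in zip(reasoning, noise_a)]
    calculation = [r + b for r, b in zip(reasoning, noise_b)]
    print("zero-order r:", pearson(acuity, calculation))
    print("partial r:", partial_corr(acuity, calculation, [reasoning]))
```

In the hypothetical data the zero-order correlation is produced entirely by the shared covariate, so controlling for it removes the association.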

Summary

In Experiment 2, we found no significant relationship between adults’ ANS acuity and any measure of mathematical achievement. We asked participants to answer a wide variety of mathematical tasks, including calculation, speeded calculation, conditional reasoning, and applied problems, and found no relationships between these scores and participants’ ANS acuities. In particular, the correlation between ANS acuity and calculation achievement for adults was significantly different from that for children.

General discussion

Halberda et al. (2008) found a relationship between individuals’ ANS acuity, tested at age 14, and their mathematics achievement at ages 5–11. They speculated that this may be because the ANS plays a causal role in individual differences in symbolic mathematical competency. Since adults also have access to the ANS, which appears to be automatically activated when participants view Arabic numerals (Moyer & Landauer, 1967), it is natural to hypothesize that a similar relationship holds in adults. In the present study, we confirmed that in 7- to 9-year-old children, there is a strong relationship between ANS acuity and numerical calculation achievement (when tested concurrently), but demonstrated that the same relationship does not hold with adults. This finding rules out the possibility that ANS acuity is directly implicated in the large individual differences found in adults’ numerical calculation achievement. Together, these findings suggest that not only do ANS acuity and mathematical achievement change with age, but so does the strength of the association between them.

One speculative hypothesis that would account for this set of data is to suppose that the ANS plays a bootstrapping role in the development of whole number understanding. For example, children may come to understand whole numbers by assigning verbal and symbolic names to visual and auditory stimuli that give rise to similar ANS representations. Thus, for young children, we might expect that their fluency with symbolic numbers would be associated with their ANS acuity, since their symbolic numbers would be nothing more than tags for ANS representations. However, once children have reached a certain sophistication with numerical concepts, other factors (working memory capacity, strategy choice, teaching effectiveness, etc.) may come to dominate individual differences in mathematical performance, leading to a decline in the relationship with ANS acuity. While speculative, this hypothesis does suggest that a detailed microgenetic study of how ANS acuity, and the relationship between ANS acuity and mathematics achievement, develops through formal schooling would be a valuable contribution to our understanding of how numerical concepts are formed.

There is now growing consensus that our ability to deal with complex symbolic numerical concepts on a day-to-day basis is related in some way to the ANS, an innate analog system that supports rapid, approximate numerical judgments. However, the exact nature of this relationship remains unclear. The finding that the correlation between ANS acuity and mathematics achievement that exists in childhood is not present in adulthood indicates that there is no simplistic relationship between the ANS and symbolic mathematics achievement. Studying the pattern of decline in the relationship between ANS acuity and mathematics achievement as participants gain in maturity and mathematical experience may ultimately shed further light on the cognitive basis of the wide range of numerical operations that we each perform during everyday life.