A number-line task with a Bayesian active learning algorithm provides insights into the development of non-symbolic number estimation


To characterize numerical representations, the number-line task asks participants to estimate the location of a given number on a line flanked with zero and an upper-bound number. An open question is whether estimates for symbolic numbers (e.g., Arabic numerals) and non-symbolic numbers (e.g., number of dots) rely on common processes with a common developmental pathway. To address this question, we explored whether well-established findings in symbolic number-line estimation generalize to non-symbolic number-line estimation. For exhaustive investigations without sacrificing data quality, we applied a novel Bayesian active learning algorithm, dubbed Gaussian process active learning (GPAL), that adaptively optimizes experimental designs. The results showed that the non-symbolic number estimation in participants of diverse ages (5–73 years old, n = 238) exhibited three characteristic features of symbolic number estimation.

Data and code availability

The data and code for all experiments and models are available for download. None of the experiments described here were preregistered.


  1. Recent studies have suggested that the choice of given numbers in the number-line task could affect the comparison between the MLLM and the CPMs (Opfer et al., 2016; Slusser et al., 2013). For example, a task with given numbers evenly distributed across the number line (Slusser et al., 2013) is more likely to support the CPMs, whereas given numbers concentrated in the early part of the number line favor the MLLM (Siegler & Opfer, 2003). However, the designs selected by GPAL are model-neutral, since the GP infers the underlying psychophysical functions without a priori assumptions about their shape.



The present work was supported by grant FA9550-16-1-0053 to MAP and JIM from the Air Force Office of Scientific Research (AFOSR) and R305A160295 to JEO from the Institute of Education Sciences (IES).

Author information

Authors and Affiliations


Corresponding author

Correspondence to Sang Ho Lee.

Ethics declarations

Ethics approval

Experiments in the current study were approved by the Institutional Review Board (IRB) of The Ohio State University.

Consent to participate

Informed consent was obtained from all adult participants and legal guardians of children included in the study.

Consent for publication

Participants were informed that no identifying information about them would be available in the article.

Conflict of interest

There are no known conflicts of interest regarding this article.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.



  1. GPAL simulations

In the main body of the paper, we introduced several functions that are likely to be observed in the number-line task. Here, in simulations with artificial data, we assessed the technical soundness and ability of GPAL to identify and recover these functions as data-generating models. That is, if GPAL works as claimed, the method should be able to recover the functional form of any model, including MLLM and CPMs.

Specifically, the simulations used four functions: linear (MLLM with λ = 0), logarithmic (MLLM with λ = 1), 1CPM (β = 0.5), and 2CPM (β = 0.5). Each function generated estimates for the given numbers selected by GPAL, and normally distributed random errors with a variance of 25 were added to these estimates. For simplicity, the number range was fixed to 0–100. For each of the four functions, a GPAL simulation of ten trials was repeated 100 times. Appendix Figure 6 shows results obtained at the end of ten simulated experimental trials, averaged over 100 independent simulation runs. The solid black curves in the graphs are the GPAL-inferred functions. They were practically indistinguishable from the data-generating functions (dotted red curves), demonstrating that GPAL can successfully recover all the functions of interest.
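The four data-generating functions and the noise model can be sketched as follows. The functional forms are not given in this appendix, so the code assumes the forms commonly attributed to the cited models (Opfer et al., 2016, for the MLLM; Slusser et al., 2013, for the CPMs), with the MLLM scaling parameter `a` fixed to 1. All function and parameter names are illustrative, and the given numbers are drawn uniformly at random here rather than selected by GPAL.

```python
import numpy as np

def mllm(x, upper, lam, a=1.0):
    """Mixed log-linear model (assumed form): blend of a linear and a log function.
    Requires x >= 1 because of the logarithm."""
    x = np.asarray(x, dtype=float)
    log_part = upper / np.log(upper) * np.log(x)
    return a * ((1.0 - lam) * x + lam * log_part)

def cpm1(x, upper, beta):
    """One-cycle proportional model (assumed form), anchored at 0 and upper."""
    x = np.asarray(x, dtype=float)
    return upper * x**beta / (x**beta + (upper - x) ** beta)

def cpm2(x, upper, beta):
    """Two-cycle proportional model (assumed form), with an extra anchor at upper / 2."""
    x = np.asarray(x, dtype=float)
    half = upper / 2.0
    lo = np.clip(x, 0.0, half)    # clipping keeps both branches free of negative bases
    hi = np.clip(x, half, upper)
    first = half * lo**beta / (lo**beta + (half - lo) ** beta)
    second = half + half * (hi - half) ** beta / ((hi - half) ** beta + (upper - hi) ** beta)
    return np.where(x <= half, first, second)

UPPER = 100.0
MODELS = {
    "linear": lambda x: mllm(x, UPPER, lam=0.0),
    "logarithmic": lambda x: mllm(x, UPPER, lam=1.0),
    "1CPM": lambda x: cpm1(x, UPPER, beta=0.5),
    "2CPM": lambda x: cpm2(x, UPPER, beta=0.5),
}

def simulate_estimates(model, x, rng, noise_var=25.0):
    """Add normal noise with the stated variance (25, i.e. SD = 5) to model output."""
    x = np.asarray(x, dtype=float)
    return model(x) + rng.normal(0.0, np.sqrt(noise_var), size=x.shape)

rng = np.random.default_rng(0)
given_numbers = rng.integers(1, 101, size=10)   # stand-in for GPAL-selected designs
simulated = {name: simulate_estimates(f, given_numbers, rng) for name, f in MODELS.items()}
```

With β = 0.5, both proportional models overestimate small numbers and underestimate large ones within each cycle, which is the pattern the recovery simulation needs to distinguish from the logarithmic curve.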

Fig. 6. Results of function recovery simulations using GPAL. Note. In each graph, the solid black curve is the Gaussian process (GP) mean function obtained as an average over 100 independent simulation runs. The dotted red curve is the data-generating function (ground truth). The blue area depicts the 95% confidence region.
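To illustrate how a GP mean function and a 95% region of this kind can arise from ten adaptively chosen trials, here is a minimal sketch of GP regression with an active-learning loop. It is not the authors' GPAL implementation: the squared-exponential kernel, the fixed hyperparameters, the zero prior mean, and the "pick the design with the largest posterior variance" rule are all assumptions made for illustration.

```python
import numpy as np

def rbf(a, b, length_scale=20.0, signal_sd=50.0):
    """Squared-exponential kernel (assumed; hyperparameters fixed, not learned)."""
    d = np.subtract.outer(a, b)
    return signal_sd**2 * np.exp(-0.5 * (d / length_scale) ** 2)

def gp_posterior(x_obs, y_obs, grid, noise_sd=5.0):
    """Posterior mean and variance of a zero-mean GP on the grid."""
    K = rbf(x_obs, x_obs) + noise_sd**2 * np.eye(len(x_obs))
    K_s = rbf(x_obs, grid)
    mean = K_s.T @ np.linalg.solve(K, y_obs)
    var = rbf(grid, grid).diagonal() - np.sum(K_s * np.linalg.solve(K, K_s), axis=0)
    return mean, np.clip(var, 0.0, None)

def run_active_learning(true_fn, n_trials=10, noise_sd=5.0, seed=0):
    """Simulate n_trials, each time querying the most uncertain given number."""
    rng = np.random.default_rng(seed)
    grid = np.arange(1.0, 101.0)
    x_obs, y_obs = [], []
    next_x = grid[len(grid) // 2]          # arbitrary first design
    for _ in range(n_trials):
        x_obs.append(next_x)
        y_obs.append(true_fn(next_x) + rng.normal(0.0, noise_sd))
        mean, var = gp_posterior(np.array(x_obs), np.array(y_obs), grid, noise_sd)
        next_x = grid[np.argmax(var)]      # uncertainty sampling (assumed rule)
    return grid, np.array(x_obs), mean, var

# Logarithmic ground truth on a 0-100 line (assumed form, defined for x >= 1).
log_fn = lambda x: 100.0 / np.log(100.0) * np.log(x)
grid, designs, mean, var = run_active_learning(log_fn)
band_low = mean - 1.96 * np.sqrt(var)      # pointwise 95% region around the GP mean
band_high = mean + 1.96 * np.sqrt(var)
```

Because the acquisition rule depends only on the GP's own uncertainty, the selected designs do not presuppose any of the candidate models, which is the sense in which such designs are model-neutral.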

  2. Outlier detection

Participants with extremely high posterior variance of GP (> 3SD from the mean) were excluded from the analysis, because the predictions of GP with particularly high variance were not considered reliable. Two children and six adults met this criterion. These outliers were detected separately for children and adults.
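The exclusion rule can be sketched as below. The sketch assumes that each participant is summarized by a single posterior-variance value (for example, the variance averaged over the number line); how the variance was summarized per participant is not stated here, so that choice and all names are hypothetical.

```python
import numpy as np

def flag_high_variance(posterior_var, n_sd=3.0):
    """Flag values more than n_sd sample SDs above the group mean."""
    v = np.asarray(posterior_var, dtype=float)
    return v > v.mean() + n_sd * v.std(ddof=1)

def flag_by_group(posterior_var, groups, n_sd=3.0):
    """Apply the rule separately within each group (e.g. children, adults)."""
    posterior_var = np.asarray(posterior_var, dtype=float)
    groups = np.asarray(groups)
    flags = np.zeros(posterior_var.shape, dtype=bool)
    for g in np.unique(groups):
        idx = groups == g
        flags[idx] = flag_high_variance(posterior_var[idx], n_sd)
    return flags

# Made-up values: one participant per group has a very large posterior variance.
variances = np.concatenate([np.ones(19), [100.0], np.full(19, 10.0), [1000.0]])
labels = np.array(["child"] * 20 + ["adult"] * 20)
excluded = flag_by_group(variances, labels)
```

Computing the threshold within each group keeps a generally noisier group from masking or inflating outliers in the other.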

  3. Model fitting with the raw data

The model evaluation and parameter estimation in the current study relied on the GP-estimated posterior mean functions instead of the raw data. We also fitted the same three models of interest to the raw data to explore whether the sparse data obtained by GPAL would lead to the same outcome and conclusion. The raw data from children generally supported the results in the current study. Appendix Table 3 shows the DIC values of the MLLM and the CPMs fitted to the raw data. The MLLM generally showed smaller DIC values than the CPMs, as in the model comparison using the GP-estimated posterior mean functions (Table 2). The estimates of the logarithmicity measure (μλ) in the MLLM obtained from the raw data (Appendix Fig. 7) also showed patterns consistent with those in Fig. 4. Younger children showed a larger posterior mean of μλ than older children, and μλ increased with the upper bound in both age groups. However, wide HPDIs, especially with 50 as the upper bound, suggested that the sparse raw data in each number range were not as reliable as the functions inferred by the GP using the full data across number ranges. The raw data analysis was not feasible with adults because their data were extremely sparse, as shown in Fig. 5. Some adults had only one or two data points for some number ranges, making by-range model fitting unviable.

Table 3. DIC values for the hierarchical Bayesian models MLLM, 1CPM, and 2CPM, measured with the raw data from children
Fig. 7. The logarithmicity measure (μλ) from the raw data of children plotted against upper bounds. Note. Error bars indicate the Bayesian 95% highest posterior density interval (HPDI).

  4. Interpretation of the 95% highest posterior density intervals (HPDIs) in Fig. 4

In Bayesian inference, statistical evidence for a difference between the posterior distributions of μλ from different age groups was considered strong when their 95% HPDIs did not overlap. One concern about the analysis with the three age groups was that dividing children into two groups with reduced sample sizes might substantially reduce statistical power. However, the HPDIs across upper bounds and age groups shown in Fig. 4 (and also Appendix Fig. 7) suggest that the divided groups have sufficient statistical power to conclude that the values of μλ are meaningfully different across upper bounds and age groups.
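The non-overlap criterion can be made concrete with a short sketch that computes an HPDI from posterior samples and checks whether two intervals overlap. It assumes a unimodal posterior represented by MCMC draws and uses the standard "narrowest window containing 95% of the sorted samples" estimator; the function names and the simulated draws are illustrative and are not the study's posteriors.

```python
import numpy as np

def hpdi(samples, prob=0.95):
    """Narrowest interval containing a fraction `prob` of the sorted samples."""
    s = np.sort(np.asarray(samples, dtype=float))
    n = len(s)
    k = int(np.ceil(prob * n))               # number of samples inside the interval
    widths = s[k - 1:] - s[: n - k + 1]      # width of every window of k samples
    i = int(np.argmin(widths))
    return s[i], s[i + k - 1]

def intervals_overlap(a, b):
    """True if closed intervals a = (lo, hi) and b = (lo, hi) share any point."""
    return a[0] <= b[1] and b[0] <= a[1]

# Illustrative draws for two groups (made-up numbers).
rng = np.random.default_rng(1)
younger = rng.normal(0.8, 0.05, size=4000)
older = rng.normal(0.4, 0.05, size=4000)
strong_evidence = not intervals_overlap(hpdi(younger), hpdi(older))
```

Non-overlap of two 95% HPDIs is a conservative criterion: intervals can overlap slightly even when the posterior of the difference excludes zero.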

  5. Education level coding in the partial correlation analysis

To control for the effects of education level in the Bayesian partial correlations between λ and w values, education level was coded as 0 for kindergarteners, as the grade number for 1st–7th graders, and as 13 for adults. Thirteen corresponds to a high school graduate, which was the minimum education level of the adult participants. We did not differentiate adults by education level because their number-line estimation varied little compared to children. For the correlation analysis, the data were collapsed over children and adults for the upper bounds of 50, 100, 200, and 400.
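The coding scheme amounts to a small lookup, sketched here. The group labels (`"kindergarten"`, `"grade"`, `"adult"`) and the function name are hypothetical; only the numeric codes come from the text.

```python
def code_education(group, grade=None):
    """Map a participant to the education covariate described above.

    group: "kindergarten", "grade" (with grade in 1..7), or "adult".
    """
    if group == "kindergarten":
        return 0
    if group == "adult":
        return 13                      # high school graduate, the adult minimum
    if group == "grade" and isinstance(grade, int) and 1 <= grade <= 7:
        return grade                   # 1st-7th graders keep their grade number
    raise ValueError(f"unsupported participant: group={group!r}, grade={grade!r}")

# Example participants as (group, grade) pairs.
participants = [("kindergarten", None), ("grade", 3), ("grade", 7), ("adult", None)]
education = [code_education(g, gr) for g, gr in participants]
```

Coding all adults as 13 treats education as a floor for that group, consistent with the decision not to differentiate adults by education level.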

About this article

Cite this article

Lee, S.H., Kim, D., Opfer, J.E. et al. A number-line task with a Bayesian active learning algorithm provides insights into the development of non-symbolic number estimation. Psychon Bull Rev 29, 971–984 (2022).

  • Numerical cognition
  • Cognitive development
  • Cognitive modeling
  • Gaussian process
  • Active learning
  • Hierarchical Bayesian modeling