1 Introduction

Most theories of the development of counting focus on the cognitive representations of the counting tasks (e.g., Schaeffer et al. 1974, Gelman and Gallistel 1978, Sarnecka and Carey 2008). In particular, the cardinality principle – the principle that the last number tag used in a count represents the cardinality of the items counted – has received a lot of attention. The principle has been used as a benchmark to determine when a child knows how to count (e.g., Gelman and Gallistel 1978, Wynn 1992, Sarnecka and Carey 2008, Davidson et al. 2012).

LeFevre et al. (2006) distinguish conceptual and procedural knowledge of counting and they study how the two types of knowledge depend on each other. In this paper, we follow their distinction, by first identifying three conceptual components and then analyzing the procedure of counting. Predecessors to this approach are Schaeffer et al. (1974), who wrote about the “action sequence” of counting, and Cavanagh and He (2011), who presented a series of steps involved in counting.

We also adopt the perspective of situated cognition (Zhang and Norman 1994). From this perspective, counting is no longer studied as a purely internal cognitive process; it is treated as inseparable from the spatial and physical properties of the objects to be counted. Consequently, the physical properties of external situations, such as properties of the objects and their arrangement, must be considered. The external aspects of the counting procedure have not been sufficiently analyzed in previous research.

Another aspect that has been neglected in earlier literature on counting is that the collection of elements that are counted typically has a distribution in space (for example, the cookies to be counted are located on a plate), while, as mentioned above, the recital of numerals and the pointing to the objects are processes in time. This means that counting typically involves a mapping from the time domain to the space domain.Footnote 1 This form of counting thus involves a form of cross-modality. In our analysis, we will also present counting situations with various distributions of numerals in space. In such situations, counting involves a one-to-one mapping from the space domain to the space domain. As we will argue, this relieves the working memory of the counter from the task of reciting the numerals and is thus a cognitively easier task.

We analyze the counting process as a sequence of six procedural steps: (i) identifying the collection to be counted; (ii) selecting an uncounted item; (iii) incrementing the count; (iv) marking the counted item; (v) stopping when all items are counted; and (vi) identifying the cardinality of the collection with the last numeral. The last step corresponds to the cardinality principle, which from our perspective becomes but one part of the counting procedure.
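The six steps can be illustrated by a minimal Python sketch. The function name `count_collection`, the `belongs` predicate, and the use of a set of positions to stand in for "marking" are assumptions made for illustration only; they are one possible rendering of the procedure, not a model of how a child actually performs it.

```python
from typing import Callable, Iterable, Optional, Sequence


def count_collection(
    scene: Iterable[object],
    belongs: Callable[[object], bool],
    numerals: Sequence[str],
) -> Optional[str]:
    """Return the numeral that names the cardinality of the selected collection.

    Hypothetical illustration: `belongs` draws the boundary of the collection,
    and `numerals` is the counter's (finite) counting list.
    """
    # (i) Identify the collection to be counted.
    collection = [item for item in scene if belongs(item)]

    marked = set()   # positions of items that have already been counted
    position = -1    # index of the last numeral used; -1 means none yet

    # (v) Stop when all items are counted.
    while len(marked) < len(collection):
        # (ii) Select an uncounted item.
        index = next(i for i in range(len(collection)) if i not in marked)
        # (iii) Increment the count.
        position += 1
        if position >= len(numerals):
            raise ValueError("the counting list is too short for this collection")
        # (iv) Mark the counted item.
        marked.add(index)

    # (vi) Identify the cardinality of the collection with the last numeral.
    return numerals[position] if position >= 0 else None


if __name__ == "__main__":
    plate = ["cookie", "cookie", "cup", "cookie"]
    counting_list = ["one", "two", "three", "four", "five"]
    print(count_collection(plate, lambda x: x == "cookie", counting_list))  # three
```

In this sketch the record of marked items is held by the procedure itself. In a situated setting, part of that record may be carried by the external arrangement, for example by moving counted cookies to another plate.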

The ability to perform the six steps presumes that the counter already has the necessary conceptual knowledge. We claim that three conceptual knowledge components are relevant. The first is understanding that cardinality is a property of collections; the second is the knowledge about numerals; and the third is the conceptual understanding of one-to-one mappings between collections. When we speak of “understanding” or “having the conceptual knowledge”, we do not mean a theoretical or meta-theoretical understanding, but rather a cognitive grasp of a conceptual or semantic structure underlying the performance of mathematical tasks.

Our main argument in this article is that grasping the three conceptual knowledge components is crucial for the six steps to be performed. Dividing counting into these steps provides us with a framework for understanding what is involved in counting and why situatedness helps in evaluating the difficulty of each step.

By studying various types of counting situations, we show that some of the steps may be facilitated by the external organization of the situation. Using the perspective of situated cognition, we analyze how the balance between external and internal representations implies different loads on the working memory and attention of the counting agent. To some extent following Fuson (1988), we classify various one-to-one correspondences that can be formed between a numeral from the child’s counting list and an object from the counted collection, as well as various counting routines, depending on the amount of external cognitive support that is provided. This analysis shows that even if the counter knows how to use the cardinality principle – what is called a CP-knower (Sarnecka and Carey 2008; Sarnecka 2015) – the other steps of the counting process can be blocked, depending on which kind of collection is being counted.

After presenting the relevant theoretical background in Section 2, the conceptual knowledge components are introduced in Section 3, where we also argue that the domains develop independently and then become integrated in the counting process (Schaeffer et al. 1974; LeFevre et al. 2006). In Section 4, we turn to the procedural knowledge. We describe the six procedural counting steps in more detail and show how they depend on the conceptual knowledge components. Then in Section 5, we present several counting situations that involve various external setups, and we show how this influences the loads on working memory and attention. In particular, we contrast time-to-space mappings with space-to-space mappings.

2 Background

In this section, we present a review of the main conceptual principles of counting that have been proposed in earlier research. Because of the complexity of the literature, a full survey is not possible here. Our objective in this paper is to highlight the importance of studying procedural knowledge of counting in situated cognition contexts.

We make a conceptual distinction between “cardinality” and “numerosity” (Quinon 2020). By “numerosity” we refer to all the various connotations of the concept of number, including “cardinality”, “ordinality” and approximate estimates of the number of items in a collection. It is well established that humans, and many other animal species, are equipped with the so-called Approximate Number System (ANS), which enables them to approximate cardinalities of collections (Dehaene 2011). The ANS is activated by a variety of phenomenal inputs, including symbolic input. It provides humans with an idea that collections have quantities, but it does not provide information about the exact size of those quantities (Carey 2009). In this paper, our focus is on exact cardinality, and not on approximation or estimation of size. By “cardinality” we refer to the narrower situation where a numeral refers to the property of a collection. “Counting” typically refers to a process where a one-to-one mapping is created by mapping the numerals from a certain counting list to a collection of items. In this sense, the counting process manifests the principle that numerals refer to cardinality. Counting also manifests the ordinal progression of numerals, and thereby establishes a connection between the ordinal and the cardinal meaning of numbers.

The ability to form a one-to-one correspondence between elements of a collection and numerals is what Benacerraf (1965) called transitive counting, as opposed to reciting numerals, which is called intransitive counting. The intransitive use is exemplified by “Alice counted to one hundred before searching for the hidden children” and the transitive use by “Snow White counted the dwarfs”. Rips et al. (2008, p. 624) call the transitive use “enumerating”. In this paper, our focus is on transitive counting.

The central idea in many theories of number concept acquisition is that fulfilling the cardinality principle (the last number tag used in a count represents the number of objects counted) is a necessary and sufficient condition for being recognized as a counting agent. The cardinality principle is a part of early work by Schaeffer et al. (1974), who claim that an individual must conceptually grasp the following principles in order to be able to count correctly and competently:

  • The cardinality principle.

  • The mastery of the counting procedure or the coordination of ordered number names and objects counted.

  • The knowledge that x + 1 is subsequent to x.

Later in this paper, the second principle will be described as the ability to form one-to-one correspondences between numerals and objects in a collection. The third principle is often called the successor principle.

Another very influential analysis is that of Gelman and Gallistel (1978) who also formulate three principles:

  • The cardinality principle.

  • The one-to-one correspondence principle: Items to be counted must be put into one-to-one correspondence with members of the set of numerals that are used to count with (e.g., a set of number words).

  • The stable-order principle: The numerals (number words or number symbols) have a fixed order in which they are consistently used.

Even though Gelman and Gallistel’s principles and those of Schaeffer et al. are formulated in slightly different ways, both sets of principles cover similar types of conceptual content: (i) the idea that the ordinal (counting progression) aspect of number and the cardinal (quantity of elements in a collection) aspect of number are closely related; (ii) that number tags are used to count; and (iii) that in both the list of numerals and the collection of items that are being counted there is a sort of successor.

The cardinality principle has been extensively explored in the context of bootstrapping, introduced by Carey (2009). According to bootstrapping, the concept of exact number is acquired by a child as a result of a coordination between several other partial concepts. In her book, Carey discusses possible scenarios of bootstrapping. The version she argues for states that the child has acquired the concept of number when, after learning the cardinal meanings of the first four numerals, she has also grasped the general principle that the cardinality of a collection is named by the last numeral used in counting the elements of this collection. In other words, the child has learnt to correlate the ordinal use of numerals with their cardinal meaning. The cardinality principle is also central for the work of Sarnecka. According to Sarnecka and Carey (2008, p. 665), a child becomes a cardinality knower when she can perform the operation described above on the numerals from her counting list. They write that “the cardinality principle is a procedural rule about counting and answering the question ‘how many’.” They also see the principle as a way of getting from the ordinal character of reciting numerals to establishing the cardinality of a collection:

“In other words, the cardinal[ity] principle guarantees that for any counting list, in any language, the sixth word in the list must mean 6, the twentieth word must mean 20, and the thousandth word must mean 1000”. Sarnecka (2015, p. 7)

We agree with the idea that assessing the cardinality of a collection has a procedural character. Unlike Sarnecka and Carey, however, we argue that this is just one among several procedural rules involved in counting.

Carey’s and Sarnecka’s accounts have been critically assessed by other researchers, resulting in other interpretations of the cardinality principle. For instance, Davidson et al. (2012) point out that Sarnecka and Carey (2008) provide no evidence that children who fulfill the cardinality principle (CP-knowers) on the numerals from the beginning of the counting list, applied to small finite sets, can generalize the principle even over the list of numerals that they know. In contrast, Davidson et al. (2012) observe that there is no one uniform group of CP-knowers, so children may master the cardinality principle at different levels. In a more radical way, Rips et al. (2008) suggest that an agent cannot be claimed to understand the concept of number if she does not grasp the generalized cardinality principle, which states that children can generalize over all the natural numbers (whether or not they know their names).

Another area of study related to the cardinality principle is how other principles, such as the ability to form one-to-one correspondences or the ability to grasp the successor principle, interact with the cardinality principle. For example, Sarnecka and Carey (2008, p. 665) draw attention to the relation between the cardinality principle and the successor function:

“Alternatively, the cardinal principle can be viewed as something more profound – a principle stating that a numeral’s cardinal meaning is determined by its ordinal position in the list. This means, for example, that the fifth numeral in any count list – spoken or written, in any language – must mean five. … If so, then knowing the cardinal principle means having some implicit knowledge of the successor function – some understanding that the cardinality for each numeral is generated by adding one to the cardinality for the previous numeral”.

From the perspective of concept analysis, the concept “successor” is not necessary for characterizing the cardinality principle, and the two principles are logically independent. This analysis is backed up by Davidson et al. (2012) who found evidence that many CP-knowers, in particular less proficient counters, failed to show any knowledge of the successor principle, even for small numbers. This indicates that the cardinality principle, the concept of successor and the concept of one-to-one correspondence are acquired independently.

In this paper, we focus on the most general characterization of the cardinality principle, that is, understanding that the last name on the counting list names the quantity of counted objects. Taking the perspective of situated cognition, we argue that depending on the context, the difficulty of applying the cardinality principle varies considerably. In particular, it depends on how much the working memory and the attention of the counting individual are taxed. Since, as we claim in this paper, counting involves several procedural steps, the working memory of young children may simply not be sufficient to cover all those steps. For instance, it might not be sufficient for both counting and remembering the last word uttered in the counting process (see Fuson 1988, p. 209). Therefore, children will perform as CP-knowers in some counting situations, but not in others. In brief, the passing from 2-, 3- and 4-knowers to CP-knowers is not the cognitive Rubicon that Sarnecka and Carey (2008) envision; we rather see the development of counting as characterized by a fine-grained delta of small rivers that are gradually stepped over.

3 Three Conceptual Knowledge Components in the Development of Counting

3.1 The Three Components

Before we can analyze the six steps of the counting process, we will make explicit the conceptual knowledge of the counter that is required to enable her to perform the steps correctly. We argue that in order to know how to count, one needs:

  (1) To know that numerosity is a property of collections.

  (2) To know a list of numerals (for example, verbal number words or symbolic Arabic digits).

  (3) To know how to create one-to-one mappings from one collection of items to another collection.

In the following three subsections, we present each of these knowledge components in more detail.

3.2 Properties of Collections

Collections can have various properties.Footnote 2 Some of these properties are shared with objects, for example weight and location: “These beans weigh 500 grams”. “The radishes are in the plastic bowl in the fridge”. Many properties are, however, unique to collections. For example, collections can be ordered or unordered, uniform (consisting of the same type of objects) or mixed, dense or spread out. Most importantly, collections have cardinality, that is, they contain a certain number of elements. The cardinality of a finite collection is traditionally expressed by a word or a symbol for a natural number.

The interest in numerical invariances of collections has a long history. In a series of “conservation tasks”, Piaget (1952) tested children in order to understand which properties of collections are acquired by children at different stages of development. In one of the experiments, two equinumerous collections of objects, for example red and blue marbles, are placed into two parallel lines, one of red and the other of blue marbles, that are equally long. Then the objects in one line are spread out. The child is asked: “Are there the same amount of red marbles as blue marbles or are there more red or more blue marbles?” A child that has not understood cardinality will answer that there are more objects in the longer line. Failing the Piagetian conservation tasks means that a child has not (yet) understood that cardinality is a property of a collection and that cardinality is invariant over the spatial layout of the collection. In other words, the child does not distinguish the dimension of “size” from the dimension of “magnitude”.

Just as children must learn to separate height from volume in Piaget’s conservation tasks, they must learn to make a distinction between “more” referring to volume or mass (four elephants weigh more than five ants) and “more” referring to number (five ants are more than four elephantsFootnote 3).

As observed by Fuson (1988), a child understanding that cardinality is an invariant property of one type of collection will not necessarily generalize to the understanding that cardinality is an invariant property of all types of collections. For instance, numerical properties are most straightforwardly applied to collections of the same category of objects (three cats, four birds, etc.), but might be difficult to assess in collections composed of dissimilar objects. Collections of “things” belonging to various categories involve a more abstract notion of number and are more difficult to manage for children. In Section 5, we suggest that the varying difficulties in assessing the cardinality of different types of collections can be analyzed in terms of situated cognition.

3.3 Numerals

Numerals are a crucial invention for structuring the way we experience quantifiable dimensions of the world. We have a rough sense of quantity without them, but numerals are necessary to impose exact, discrete quantities on our experiences. Everett (2017) even goes as far as claiming that the creation of symbolic representations of numbers was the cultural invention that shaped all further human culture (see also Pantsar and Quinon 2019; Decock 2008, and Coolidge and Overmann 2012). As we will see, however, it is important to point out that there are two basic forms of numerals: verbal (saying “one”, “two”, “three”, and so on) and visual (writing “1”, “2”, “3”, and so on). This distinction has been downplayed in the literature, but we shall argue that it is important from the perspective of situated cognition.Footnote 4 The verbal form – reciting words – is temporal, while the visual is spatial. Thus, the two representations belong to different conceptual domains.

We first discuss the learning of the verbal forms. Fuson et al. (1982) describe several stages that the learner goes through while learning names of numbers. The first numerals are learnt in a purely syntactic way, as a list of arbitrary words – up to ten or twelve depending on the language. Next, children learn bigger numerals. The list becomes more systematic and most languages have a method for recursively generating new number words. Most frequently, the only new words that must be learnt are the bases, for example ‘ten’, ‘hundred’, ‘thousand’, ‘million’, etc.

Acquiring numerals presupposes the ability to apply the stable order principle (Gelman and Gallistel 1978; Fuson 1988; Davidson et al. 2012), that is, that numerals are always recited in the same order. Keeping in mind where you are in the list of numerals when you are counting involves a load on working memory (Schaeffer et al. 1974; Fuson 1988). For example, the shift from ‘sixty-nine’ to ‘seventy’ requires keeping in mind that you are now leaving the sixties and entering the seventies and at the same time shifting from a word marked with ‘nine’ to an unmarked word. Children often make mistakes of the kind “sixty-nine, sixty-ten, sixty-eleven …”. With practice, the correct shifts become routine. The stable order principle is applied only when counting becomes automatized.
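The generative structure of the verbal list, and the decade shift at which children err, can be made concrete with a short Python sketch. It assumes English number words from 1 to 99 only; the names `number_word` and `overextended_word` are hypothetical and serve only to illustrate the point made above.

```python
UNITS = ["", "one", "two", "three", "four", "five", "six", "seven", "eight", "nine"]
TEENS = ["ten", "eleven", "twelve", "thirteen", "fourteen", "fifteen",
         "sixteen", "seventeen", "eighteen", "nineteen"]
DECADES = ["", "", "twenty", "thirty", "forty", "fifty",
           "sixty", "seventy", "eighty", "ninety"]


def number_word(n: int) -> str:
    """Return the English number word for 1 <= n <= 99."""
    if not 1 <= n <= 99:
        raise ValueError("this sketch only covers 1 to 99")
    if n < 10:
        return UNITS[n]          # memorized as an arbitrary list
    if n < 20:
        return TEENS[n - 10]     # also largely memorized
    decade, unit = divmod(n, 10)
    # From twenty on, words are generated: decade word plus unit word.
    # A unit of 0 gives the unmarked decade word, e.g. "seventy".
    return DECADES[decade] if unit == 0 else f"{DECADES[decade]}-{UNITS[unit]}"


def overextended_word(decade: int, k: int) -> str:
    """Model the error of staying in a decade instead of shifting to the next.

    The decade word is kept and the unit sequence simply continues past nine.
    """
    if not 2 <= decade <= 9 or not 1 <= k <= 19:
        raise ValueError("decade must be 2..9 and k must be 1..19")
    return f"{DECADES[decade]}-{number_word(k)}"


if __name__ == "__main__":
    print(number_word(69), number_word(70))                   # sixty-nine seventy
    print(overextended_word(6, 10), overextended_word(6, 11))  # sixty-ten sixty-eleven
```

The correct rule has to track two things at once, the current decade and the current unit, whereas the overextended rule tracks only the unit sequence. This is one way to picture why the shift places a load on working memory.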

Fuson (Fuson et al. 1982; Fuson 1988) already established that in children’s minds there exists a structure underlying the progression of numerals. Sarnecka and Carey (2008) speak of the placeholder structure that at first consists of a sequence of symbolic representations of numerals. Quinon (2020) suggests that children start grasping the idea of a regular sequence before they learn a specific sequence of names of numerals. The idea that there exists an underlying intuition regarding the preverbal placeholder structure is confirmed by, for instance, a result by Slusser and Sarnecka (2011), who show that children who know one placeholder number line (that is, who know a sequence of number names in some language) understand a number-name sequence in another language almost simultaneously.

One precondition for learning to use numerals is obviously that the language spoken by the counters must have a conceptual structure that can handle quantity, quantification or numerosity. This does not hold for anumeric languages, such as the Amazonian Pirahã and Mundurukú, that only contain number words corresponding to ‘one’, ‘two’ and ‘three’ (Everett 2017, Pica et al. 2004, see also Butterworth et al. 2008 for a study of languages used by indigenous Australians). However, the limitations of these languages do not imply any cognitive limitations of their speakers. People from these tribes use core cognitive resources to approximate and compare quantities, and they can also learn to count when they have been exposed to an appropriate alternative language structure (Everett 2017, p. 116ff).

It is important to notice that reciting the numerals is a process that is extended in time. This suggests that reciting involves the temporal dimension and as such can be opposed to the spatial dimension that is predominant in visual representations of collections. Furthermore, counting is typically done in a rhythmical manner (Quinon 2020). The equidistance in time of rhythmical progression is analogous to the equidistances between recursively generated numerals.

A comment on learning visual forms of numerals is appropriate here. In addition to learning how to recite (and generate) a list of numbers, children must also learn the visual appearance of the symbols for numbers (Dehaene and Cohen 1995). When using the visual symbols, no time element needs to be involved, as we illustrate in Section 5.

3.4 One-to-One Mappings

The third conceptual knowledge component of counting is the understanding of the concept of one-to-one correspondence. In the most recent experimental psychological literature, this knowledge component is not traditionally listed as a necessary condition for counting, although Sarnecka and Wright (2013) highlight that grasping “exact equality” (or “equinumerosity”) is – next to the cardinality principle – a necessary condition for the acquisition of the exact number concept. For us, this ability is the central component in counting.

In parallel with our distinction between verbal and visual numerals, we make a distinction between three kinds of one-to-one mappings: (1) From time to space: This is transitive counting where the counter recites numerals while pairing the words, by pointing or in some other way, with objects in a spatially presented collection. It can be seen as a mapping from the ordinal structure of verbal numerals to the cardinality of the collection that is counted. (2) From space to space: This is the form of transitive counting where the counter maps spatially located visual symbols of numerals to the spatially distributed elements of a collection. (3) From time to time: This is transitive counting where the counter recites numerals while pairing the words with objects that pass by in a temporal order (think of the classical counting of sheep). In Section 5, we will illustrate all three kinds and discuss the differences between them in terms of cognitive loads on memory and attention.

We differ from the account of Sarnecka and Wright (2013) in several aspects. Firstly, we treat the cardinality principle and equinumerosity as depending on a much more basic ability, which is forming one-to-one correspondence.

Furthermore, given the three kinds of one-to-one mappings introduced here, we submit that the set-up for counting, both visual and temporal, matters. Sarnecka and Wright more or less ignore this aspect, but it is central for our analysis of situated counting in Section 4. In more general terms, Sarnecka and Wright want to know how people understand equinumerosity. In contrast, our focus in this paper is not on the result, but on the process of counting and how it depends on temporal and spatial aspects. In consequence, we highlight that the ability to understand and then create one-to-one correspondences – or rather different manifestations of those abilities – is crucial for learning to count. The importance is noted by Fuson (1988, p. 206):

“A very important step is taken when children first begin to connect counting and cardinal meanings, when children first indicate that they understand that counting has a result instead of just being an isolated activity.”

She calls this “a transition from the counting meaning to the cardinal meaning of numbers”. One may as well say that it is a transition from the ordinal meaning of numbers to the cardinal meaning. In our terminology, this transition occurs when the knowledge about numerals is integrated with the knowledge about one-to-one mappings.

In support of our position, observations of anumeric tribes suggest that the ability to form one-to-one correspondence develops independently of other abilities necessary for processing finite quantities, such as knowledge of numerals, and knowledge that numerals refer to collections of elements. In a series of experiments conducted with an Amazonian tribe of Pirahã, it has been established that the capacity to create one-to-one correspondences can develop through explicit teaching (Gordon 2004; Everett 2017).Footnote 5

In the most basic “line matching” task, the Pirahã participants were presented with an evenly spaced line of items. They were then asked to place the same quantity of other items in an array parallel to the original line. Note that this and the following task are examples of producing a space-to-space mapping. In an “orthogonal matching” task, the participants were again asked to make a line out of items. The new line was supposed to be equal in number to the original line presented to them, but in this case rotated ninety degrees (rather than parallel to the original line, as in the basic line matching task). In another task, the subjects were presented with a line of items and then asked to produce a line of the same quantity, but only after the original line was hidden from view (Everett 2017, pp. 122–123).

The participants could perform the task with more than 4 elements only in the line matching task. In the two other cases, the Pirahã were capable of accurately and repeatedly matching quantities only if the number of items did not exceed three. For quantities greater than three, errors worked their way into the responses. In these cases, their responses reflected a reliance on approximation or analogue estimation, rather than an exact differentiation of cardinalities. In contrast, English speakers accustomed to numerals have no problem performing one-to-one matching in all three tasks.

The three conceptual knowledge components we have presented here are largely independent. Firstly, knowing that numerosity is a property of collections corresponds to knowing that the answer to a question of the form “How many?” refers to a collection and that the answer is independent of the spatial locations of the objects in the collection and is also independent of other features of collections, such as the total surface occupied by the collection, and of the features of the objects in collections, such as their shape or color (Sarnecka 2015).

Secondly, a child can learn to recite the numerals without knowing that numerosity is a property of collections, or knowing how to establish one-to-one correspondences. The independence of the acquisition of a purely syntactic sequence of numerals has been highlighted by Fuson (1988).Footnote 6 Further evidence for this independence comes from recent experiments by Flowers et al. (2019).

Thirdly, a child can know how to create a one-to-one mapping without knowing the words for the numerals or that numerosity is a property of collections. For example, a tea table for dolls can be set with each doll sitting in front of a cup. The child can create this one-to-one mapping between different collections of objects without using the numerals.Footnote 7 Again, note that this is a space-to-space mapping. Another example, from our own experience, is a child who, not knowing many items in the list of numerals, pointed in turn to each of the persons sitting around a table saying “one, one, one, …” in synchrony with the pointing gestures. This is a case of time-to-space mapping.
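That a one-to-one mapping requires neither numerals nor any explicit number can be illustrated by a small Python sketch of the doll-and-cup example. The function names `pair_off` and `same_number` are hypothetical, and the sketch deliberately avoids computing lengths: it only pairs items until one collection runs out.

```python
from typing import Iterable, List, Tuple

_MISSING = object()  # sentinel meaning "this collection has run out"


def pair_off(first: Iterable[object], second: Iterable[object]
             ) -> Tuple[List[Tuple[object, object]], List[object], List[object]]:
    """Pair items one by one; return the pairs and any unpaired leftovers."""
    first_iter, second_iter = iter(first), iter(second)
    pairs = []
    while True:
        x = next(first_iter, _MISSING)
        y = next(second_iter, _MISSING)
        if x is _MISSING or y is _MISSING:
            break
        pairs.append((x, y))
    # Whatever was picked up but not paired, plus everything not yet visited.
    left_first = ([] if x is _MISSING else [x]) + list(first_iter)
    left_second = ([] if y is _MISSING else [y]) + list(second_iter)
    return pairs, left_first, left_second


def same_number(first: Iterable[object], second: Iterable[object]) -> bool:
    """Two collections are equinumerous if pairing leaves nothing over."""
    _, left_first, left_second = pair_off(first, second)
    return not left_first and not left_second


if __name__ == "__main__":
    dolls = ["Anna", "Ben", "Cleo"]
    cups = ["red cup", "blue cup"]
    pairs, dolls_without_cup, spare_cups = pair_off(dolls, cups)
    print(pairs)              # [('Anna', 'red cup'), ('Ben', 'blue cup')]
    print(dolls_without_cup)  # ['Cleo']
    print(same_number(dolls, cups))  # False
```

No numeral appears anywhere in the procedure, which is the sense in which this knowledge component can be independent of the other two.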

After this presentation of the three forms of background knowledge components, we next turn to how they are combined in order to achieve the ability to count the objects of a collection.Footnote 8

4 Mapping Numerals to Collections

As we noted in the previous section, children who learn to count must know that number is a property of collections. Humans are equipped with several cognitive systems that enable them to assess sizes of collections, and have access to cultural inventions, such as language and symbolic representations, that give them an opportunity to determine sizes to which they do not have immediate cognitive access. We begin by investigating ways in which the exact numerosity of a collection can be assessed.

4.1 Subitizing

Kaufman et al. (1949) already proposed that humans have a mechanism for visually discriminating small numbers (up to about 4) of objects that enables them to identify the numerosity without counting. They called this mechanism subitizing. The subitizing ability is often seen as based on parallel individuation, that is, the ability to track several objects simultaneously. The idea is that the system represents elements by individual representations or individual object files (Kahneman et al. 1992). For example, three dogs are represented by dog-dog-dog, and not by some individual symbol.

There is no consensus among researchers that subitizing is a numerical ability at all, since the representations consist of two, three or four separate individual object files into which the cardinality is encoded only implicitly. However, it is claimed that it is subitizing that grounds the first numerical knowledge (Carey 2009). Studies by Wynn (1990, 1992) and Sarnecka and Gelman (2004) show that children learn to name small quantities at a very early stage of language development.

Carey (2009) claims that children learn the meaning of number names in order of magnitude and one by one. The main argument in favor of this is the so-called “give-N” test (Wynn 1990). In the experiment, a child is presented with a bowl that contains several plastic apples and is asked to give three apples to a toy animal. The performances of children move through a series of levels, called number-knower levels. Children who do not know the meanings of any number words are called pre-number knowers. After this stage comes the one-knower level. One-knowers know that “one” refers to the cardinality of collections with one object, and that all of the other number words mean something different from 1. Two-knowers know that “one” refers to the cardinality of collections with one object and that “two” refers to the cardinality of collections with two objects, but they make no distinctions among any other numbers. The two-knower level is followed by a three-knower and sometimes a four-knower level. It is thought that the children’s performance on the give-N test for small numbers builds on the subitizing system. When children can generalize to larger numbers fulfilling the cardinality principle, they are called CP-knowers.

4.2 The Steps of Counting

The focus of this section is the procedural knowledge that is required for successful counting. To assess the exact cardinality of larger collections, humans need to learn to count. The cardinality principle is the understanding that the last numeral from the counting list is the same as the cardinal expressing the numerosity of the collection. However, to be able to count, more than the cardinality principle needs to be grasped.

In addition to the three conceptual knowledge components, our second theoretical assumption is that counting by creating a one-to-one mapping between numerals and a collection can be broken down into several substeps (partly paralleling those of Cavanagh and He 2011, p. 24). These steps presume that the counter has relevant information from the three conceptual knowledge components.

  (i) Identify the collection to be counted.

Such an identification involves a cognitive creation of a boundary that can be marked by some perceptual feature, for example that the elements in the collection are spatially separated from other objects, or a merely imagined boundary. In most empirical studies the collection of objects is, first, clearly separated from other objects; second, of the same kind; and, third, often arranged in a linear ordering. As we shall show below, relaxing any of these conditions makes counting more difficult. In many situations, children find it difficult to create an appropriate boundary for the collection, which means that they will fail in the following steps.

  (ii) Select an uncounted item in the collection.

When the objects in the collection are separated from other objects and when they are linearly ordered, this step will be easy to perform. Otherwise, attention and/or working memory will be taxed.

  (iii) Increment the count in the list of numerals.

The performance of this step depends on whether the counter is performing a mapping using verbal or visual numerals. Using verbal numerals requires that the list of numerals is well internalized. Uncertainties in intransitive verbal counting will be amplified in transitive counting. In the case of visual numerals, the counter must have learnt that the numerals refer to subsequent elements of a sequence or to quantities of elements in collections.

  (iv) Mark the just counted item.

As we will see in the following section, this step can be performed in several different ways.

Steps (ii), (iii) and (iv) are then repeated.

  (v) Stop when there are no uncounted items.

Again, if the objects in the counted collection are separated from other objects, and if they are linearly ordered, this step will be easy to perform. Otherwise it will be more difficult.

  (vi) Identify the cardinality of the collection with the last numeral.

This is precisely applying the cardinality principle.

It should be noted that step (i) presumes knowledge that number is a property of collections, steps (ii), (v) and (vi) presume knowing the numerals, and steps (iv) and (vi) presume the conceptual readiness to construct one-to-one mappings. Thus, all three knowledge components that we have identified are required for performing the steps.
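Steps (i) to (vi) can be read as a simple loop. The following is a minimal sketch under stated assumptions, not a cognitive model: the names `count_collection`, `belongs` and `NUMERALS` are hypothetical, the counter is assumed to know a fixed list of verbal numerals, and marking is idealized as error-free.

```python
# Assumed counting list available to the counter.
NUMERALS = ["one", "two", "three", "four", "five",
            "six", "seven", "eight", "nine", "ten"]


def count_collection(scene, belongs, numerals=NUMERALS):
    """Return the last numeral used when counting the items in `scene`
    that satisfy `belongs`, or None if the collection is empty."""
    # (i) Identify the collection: draw a boundary around the relevant items.
    collection = [item for item in scene if belongs(item)]
    marked = set()        # positions of items already counted, see step (iv)
    position = 0          # how far the recital of numerals has come
    last_numeral = None
    while True:
        # (ii) Select an uncounted item (here simply the first unmarked one).
        uncounted = [i for i in range(len(collection)) if i not in marked]
        # (v) Stop when there are no uncounted items.
        if not uncounted:
            break
        chosen = uncounted[0]
        # (iii) Increment the count in the list of numerals.
        if position >= len(numerals):
            raise ValueError("the numeral list is too short for this collection")
        last_numeral = numerals[position]
        position += 1
        # (iv) Mark the just counted item.
        marked.add(chosen)
    # (vi) Identify the cardinality with the last numeral.
    return last_numeral
```

For example, `count_collection(["green cup", "red ball", "green apple"], lambda item: item.startswith("green"))` yields `"two"`. In the sketch the set `marked` and the variable `position` stand for what must be tracked during counting; the cases in the next section differ in whether this tracking is done internally or by the physical situation.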

These steps will be exemplified in the next section when we present different cases of counting. Performing the steps can be of varying difficulty, depending on the objects to be counted and their physical layout. Fuson (1988, p. 68) writes: “Given the difficulty young children have in coordinating two activities […], one might expect children to make many errors when trying to coordinate the action of producing a word with that of producing a point.” It should be noted that in this passage, Fuson is referring to time-to-space mappings.

Thus, even if a child knows some ways of establishing a one-to-one mapping between numerals and counted objects, they may still fail in other cases if the physical layout of the collection is complicated. Hence, we argue that, in many practical situations, being CP-knowers is far from sufficient for children to be able to count. To give an example of the relevance of step (i), compare a situation where a spatially separated set of identical objects forms the collection to be counted with a collection of green objects of different kinds that are spatially located among objects of different colors. It is clear that the second case involves a greater load on visual perception and working memory, and counting in this case is therefore prone to more errors than in the first case, even though the formal structure of the two problems is identical and the cardinality principle applies in the same way. In accordance with this, Zhang and Norman (1994, p. 88) note in their classic paper on situated cognition: “[I]somorphic representations of a common formal structure can cause dramatically different cognitive behaviors. One obvious example is the representation of numbers” (this example is developed in Zhang and Norman (1995)).

5 Situated Procedures of Counting

In this section, we present a number of counting situations. We want to show that several other factors beside the cardinality principle determine whether a child will find it easy or difficult to count in an accurate manner.

We analyze some ways in which a collection can be physically structured and some ways in which counting can be performed. We follow the same method as did Zhang and Norman (1994) in their analysis of different versions of the Tower of Hanoi problem (see Footnote 9). They write that

“[t]he basic principle of distributed representations is that the representational system of a distributed cognitive task is a set of internal and external representations, which together represent the abstract structure of the task” (Zhang and Norman 1994, p. 87).

They define the concepts as follows:

“Internal representations are in the mind, as propositions, productions, schemas, mental images, connectionist networks, or other forms. External representations are in the world, as physical symbols (e.g., written symbols, beads of abacuses, etc.) or as external rules, constraints, or relations embedded in physical configurations (e.g., spatial relations of written digits, visual and spatial layouts of diagrams, physical constraints in abacuses, etc.). Generally, there are one or more internal and external representations involved in any distributed cognitive task” (Zhang and Norman 1994, p. 89).

The division between internal and external representations is important because as the amount of external representations increases in terms of what is represented in the physical structure of the task, the amount of internal cognitive work that needs to be done by the agent decreases. In relation to the task of counting, the main question concerns to what extent the steps (i) – (vi) involve a load on working memory and attention. The more memory and attention are required, the greater is the risk that the counter makes an error in some of the steps. Here we note that in the cases where verbal numerals are used in counting, these are internal representations, while visual numerals are external.

Some of the cases we present below have parallels in the work of Fuson (1988). She provides a detailed error analysis of the cases that support our arguments concerning the role of the situatedness relative to the counting steps. In contrast to her work, our focus is on the interplay between internal and external representations and how this affects the load on memory and attention. For all cases, we assume that the counted collection has already been identified, so that step (i) is fulfilled.

5.1 Mapping Time to Space

In this subsection, we analyze several cases of counting situations, where reciting verbal numerals is coordinated with pointing at the elements of a collection. These cases involve establishing a one-to-one mapping between the temporal domain of the recital and the spatial domain of the collection. As we shall show, the differences between the domains are a major source of the difficulties that the counter encounters while generating the mapping.

Case 1

The collection is linearly ordered. Counting is done with the aid of a finger.

In this case, it is easy for the counter to move a finger stepwise from one end of the collection to the other. The finger functions as a spatially located memory extension. Selecting the uncounted object (step ii) then becomes automatic. The physical presence of the finger pointing at the objects is a visual scaffold for remembering where in the establishment of the one-to-one mapping the counter is. Without a physical pointing device, this task must be managed by the working memory (see Case 3). The counted item is also externally “marked” when the finger moves past it (iv). Similarly, once the finger passes the last object, the procedure stops and thus also (v) is fulfilled without any particular attention. In brief, only the incrementing in reciting numerals (step iii) when the finger passes an object must be attended to. The cognitive load in this situation is thus mainly determined by step (iii) (Fig. 1).

Fig. 1. Counting by moving a finger along a linearly ordered collection while reciting the numerals

In her analysis of this case tested on children of age 3 to 6 years, Fuson (1988, p. 64) writes that because

“words are located temporally and objects are located spatially, some sort of intermediary is needed to connect the two. The counter establishes this correspondence by using an indicating act, often the act of pointing, which has both a temporal and a spatial location.”

In this quote, she brings out that the mapping is from the time domain to the space domain. She notes (Fuson 1988, p. 177) that a fruitful correspondence between an indicating act and the indicated object must satisfy four requirements:

  • Each indicating act must be directed toward an object;

  • Each indicating act must not be directed toward more than one object;

  • Every object must be indicated (this is our step (v));

  • No object is indicated more than once.

Violating these requirements leads to various types of errors. Fuson finds that errors in counting decrease with age, and that the most prevalent error subtypes consist in pointing to one object while saying two count words, or skipping objects while pointing. The errors indicate that at least younger children have difficulty attending simultaneously to both the process of reciting the list of numerals and the process of moving the finger step by step along the row of objects.
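The four requirements can be stated as checks on a record of indicating acts. The following minimal sketch assumes a hypothetical data format in which each indicating act is the list of objects it was directed toward; the function name `check_indicating_acts` and the violation labels are illustrative and are not taken from Fuson (1988).

```python
from collections import Counter


def check_indicating_acts(acts, objects):
    """Return the list of requirements violated by a sequence of acts.

    acts:    list of lists; each inner list holds the objects one
             indicating act (e.g. one pointing) was directed toward.
    objects: the objects in the collection to be counted.
    """
    violations = []
    # Requirement 1: each indicating act must be directed toward an object.
    if any(len(act) == 0 for act in acts):
        violations.append("act directed toward no object")
    # Requirement 2: no act may be directed toward more than one object.
    if any(len(act) > 1 for act in acts):
        violations.append("act directed toward more than one object")
    # Count how often each object was indicated across all acts.
    times_indicated = Counter(obj for act in acts for obj in act)
    # Requirement 3: every object must be indicated.
    if any(times_indicated[obj] == 0 for obj in objects):
        violations.append("object never indicated")
    # Requirement 4: no object is indicated more than once.
    if any(times_indicated[obj] > 1 for obj in objects):
        violations.append("object indicated more than once")
    return violations
```

An empty result means that the acts establish a one-to-one correspondence with the objects. Pointing twice at the same object while skipping another, for instance `check_indicating_acts([["a"], ["a"], ["c"]], ["a", "b", "c"])`, violates the third and fourth requirements.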

Case 2

The collection is ordered, but not linearly ordered. Counting is done with the aid of a finger.

This case results in the same analysis as above except that the counter must make sure that the finger is following the order of the collection. The cognitive load is thus mainly determined by step (iii), but some attention must be given to (ii). If the collection forms a closed curve, for example a circle, the counter must also attend to (v) so that working memory keeps track of which is the last object to be counted. Thus, this case taxes the working memory more than Case 1 (Fig. 2).

Fig. 2. Counting by moving a finger along a non-linearly ordered collection while reciting the numerals

Fuson (1988) made two studies of this case, one with children aged 3½ to 6 and one with children aged 2½ to 3½. In the first study, she finds that the only recount errors are those where the children continued around the circle and recounted objects they had counted at the beginning. The children were aware of the rule saying that counting should stop just before they arrive again at the object where they started the counting. However, their working memory obviously had problems remembering the location of the first object. She notes (Fuson 1988, p. 198) that this increase in memory demand also affected the production of number words. In the second study, one of the dots to be counted had a different color than the rest, and the children were instructed to start counting at that dot. Almost all children (17 out of 19) stopped counting at the correct point, giving evidence that they understood the stop rule (v). Marking the first dot thus helped offload working memory, and the results thereby improved.

Case 3

The collection is linearly ordered. Counting is done without the aid of a finger or any other physical indexing device.

This case is also similar to Case 1, except that the last item counted must be kept in visual working memory while the next item is selected. Thus step (ii) is not externally fulfilled by the position of a finger as in Case 1, but instead adds to the cognitive load (Fig. 3).

Fig. 3. Counting a linearly ordered collection while reciting the numerals without using any external device

Case 4

The collection is not ordered. Objects are not movable. Counting is done without the aid of a finger or any other indexical device.

This case is cognitively more demanding than the previous ones, since all of the steps (i) – (vi) in it are internal and must be kept in working memory. Even if a finger is used, it may be difficult to fulfill step (iv). One strategy to achieve the marking is to impose an external linear ordering on the collection, say by counting the objects from left to right. However, if the collection is extended in the vertical direction and comparatively dense, it may be difficult to visually determine which object is to the left of another (Fig. 4).

Fig. 4. Counting an unordered collection while reciting the numerals without using any external device

In a study, Fuson (1988, Sect. 4.2) compares this case with Case 1. As expected, she found that the error rate for non-ordered collections increases considerably. Children both miss some objects when counting and count some objects twice. Remembering which objects are counted thus puts a higher load on working memory.

Case 5

The collection is not linearly ordered. Objects are movable.

In this case, step (iv) can be fulfilled externally by moving the objects from left to right, say, while counting them so that the counted and uncounted objects are visually divided. By situating the task in the physical location of the objects that is changed by a bodily action, the working memory is partially offloaded. Hence, this case is cognitively less demanding than Case 4 (Fig. 5).

Fig. 5. Counting an unordered collection with movable objects using a finger while reciting the numerals

Fuson (1988, Sect. 4.4.1) investigated the case where children could (but were not asked to) move the blocks that were counted. About half of the younger children (age 3½ to 5½) and almost all of the oldest children (age 5½ to 6) moved the blocks while counting. Most of those who moved the blocks used the movement as an indicating act, sorting the counted from the uncounted blocks. The error pattern differed from that for counting objects in a row: errors with more than one pointing per object were frequent when pointing to objects in a row, but it never occurred that one block was moved and then moved again. This can be interpreted in the following way: when an object can be moved it becomes easier to determine – visually and motorically – that it has been counted than when an object is just pointed to.
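How moving the objects externalizes the marking in step (iv) can be shown in a minimal sketch. The two lists stand for the physical piles of uncounted and counted objects; the name `count_by_moving` and the assumed numeral list are hypothetical. Note that no separate record of which objects have been counted is kept: the piles themselves carry that information.

```python
# Assumed counting list available to the counter.
NUMERALS = ["one", "two", "three", "four", "five",
            "six", "seven", "eight", "nine", "ten"]


def count_by_moving(pile, numerals=NUMERALS):
    """Count movable objects by shifting each one to a 'counted' pile.

    Returns the last numeral recited (None for an empty pile) together
    with the pile of counted objects.
    """
    uncounted = list(pile)   # objects still on the left, say
    counted = []             # objects already moved to the right
    last_numeral = None
    for numeral in numerals:
        # Step (v) is visible in the world: the uncounted pile is empty.
        if not uncounted:
            break
        # Steps (ii) and (iv) coincide: taking an object from one pile and
        # putting it on the other both selects it and marks it.
        counted.append(uncounted.pop())
        # Step (iii): the numeral recited for this object.
        last_numeral = numeral
    if uncounted:
        raise ValueError("the numeral list is too short for this collection")
    # Step (vi): the last numeral gives the cardinality.
    return last_numeral, counted
```

Because an object can only be in one pile at a time, a double count of the kind seen with pointing cannot arise in this sketch, which mirrors the observation that no block was moved twice.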

Case 6

The collection is not linearly ordered. Objects are mobile while being counted. 

This task is cognitively more challenging than all the previous ones since fulfilling steps (ii), (iv) and (v) involves an extremely heavy load on working memory even if the collection only involves a few objects (Fig. 6; see Footnote 10).

Fig. 6. In this situation, the objects of the collection are moving around while the numerals are recited

Case 7

Counting in a mental spatial representation.

If a person is given the task of counting the number of windows in the apartment where he or she lives while being at a distant location, the person must mentally imagine the rooms of the apartment and from memory reconstruct the windows in the rooms. This task thus involves objects in a mental representation of space rather than objects in the present spatial surrounding as in all the previous cases. In particular, this makes step (ii) more difficult than in Cases 1–5, but also steps (iv) and (v) depend on the accuracy of the mental representation.

More variations of counting can be considered, but we hope that the cases we have presented illustrate how an analysis in terms of attention and working memory can determine the cognitive difficulty of a particular counting situation.

5.2 Mapping Space to Space

We next turn to the case where external visual symbols are used instead of internal verbal numerals. If the numerals are presented in a linear spatial layout, then the working memory involved in reciting the numerals in the correct order (step (iii)) can be offloaded. New methods for counting can then be exploited. On the other hand, such cases presume that the counter is familiar with the visual symbolic representations of the numerals. This knowledge is different from the ability to recite numerals (Dehaene and Cohen 1995).

Case 8

Numerals are linearly ordered from left to right on a piece of paper. The objects to be counted are linearly ordered below the numerals. The counter has a pencil (see Footnote 11).

In this case the one-to-one mapping can be created by drawing lines from the numerals to the objects. The counter first draws a line from “1” to the leftmost object, then a line from “2” to the second leftmost, etc. The procedure is stopped when there are no more objects without lines in the collection. Applying the cardinality principle just means identifying the last numeral. In this setup all of (i) – (vi) are more or less automatically fulfilled. The counter just has to make sure that there are not two lines going to the same object, that there is no object without any line from a numeral, and that no numeral is skipped in the procedure. All this can be visually determined. Hence the temporal domain is not involved and the working memory used for reciting numerals is not needed. This case thus involves less cognitive load than all the previous cases since the spatial locations of the numerals and the objects help construct the one-to-one mapping and thereby fulfill the steps of the counting procedure. The counter does not even need to be able to recite the verbal numerals (Fig. 7; see Footnote 12).

Fig. 7. Counting a linearly ordered collection of objects with linearly ordered numerals with the help of a pen

Case 8 can be modified in several ways in parallel to Cases 1–6 in the previous subsection. However, for all such extensions, the fact that the numerals are located in physical space (say on a sheet of paper) makes these cases cognitively less demanding than the corresponding cases where the numerals are produced as internal representations by counting verbally in a temporal sequence. In brief, space-to-space mappings are cognitively less demanding than corresponding time-to-space mappings. Most of the counting errors documented by Fuson (1988) concern the establishment of the one-to-one mapping, but her studies are about mappings from time to space. We predict that if they were repeated for space-to-space mappings, significantly fewer errors would be observed.
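The space-to-space mapping of Case 8 amounts to pairing two rows element by element. The following minimal sketch makes this explicit under the assumption that both rows are given left to right; the name `count_by_drawing_lines` and the data format are hypothetical.

```python
def count_by_drawing_lines(written_numerals, objects_left_to_right):
    """Pair written numerals with objects, as if drawing one line per object.

    Returns the list of (numeral, object) lines and the last numeral that
    received a line (None if there are no objects).
    """
    if len(objects_left_to_right) > len(written_numerals):
        raise ValueError("not enough written numerals for this collection")
    # Each pair is one pencil line; zip stops when the objects run out,
    # which corresponds to step (v) being fulfilled by the layout itself.
    lines = list(zip(written_numerals, objects_left_to_right))
    # Step (vi): the last numeral with a line gives the cardinality.
    cardinality = lines[-1][0] if lines else None
    return lines, cardinality
```

With numerals `["1", "2", "3", "4", "5"]` and three cookies, the lines end at `"3"`. In contrast to the time-to-space cases, nothing in the sketch has to remember a position in a recited sequence; the written row of numerals carries the order.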

5.3 Mapping Time to Time

Finally, we turn to the third type of one-to-one mappings – those from the temporal domain to itself.

5.3.1 Case 9. Runners passing a finishing line

Counting objects that pass a fixed place in space, by reciting numbers, involves a mapping from time to time. In such cases the spatial locations of the objects are irrelevant, except for the moment they pass the fixed place. Steps (i) – (vi) are more or less externally fulfilled (given that there is only one type of object that passes the place). However, objects should not pass the place faster than they can be counted, and it may be difficult to determine when step (v) is fulfilled. If there are long temporal intervals between successive objects, the last numeral must be preserved in working memory.

The nine cases that have been analyzed here are all instances of applying the cardinality principle. As the analysis shows, however, the steps (i) – (v) must also be performed. We have seen that counting can be more or less difficult depending on the available external representations that can help offload the working memory of the counter. We predict that the more working memory is involved in a particular case of counting, the more errors the counter will make. To some extent, Fuson’s (1988) experiments support this prediction, but more tests involving variations of the counting situation can be made to test it.

Davidson et al. (2012, p. 167) write that Sarnecka and Carey (2008) “may have overestimated the knowledge that results from becoming a CP-knower, since they did not explore whether differences between children were meaningfully related to [the experience of counting] … and did not test numbers beyond five”. To this we would like to add that our analysis shows that variations in children’s working memory and attention will lead to different performances in counting situations where the physical constraints more or less support the one-to-one mapping procedure.

The distinction between the temporal and spatial domains that generate the three types of one-to-one correspondence, and also the feasibility of different situated cognition contexts, could serve as a base for an educational program, such as digital games enhancing children’s math development (see e.g. Haake et al. 2015). We predict, for example, that once a child has learned the symbols for the numerals, the space-to-space mapping in Case 8 is easier to master than a corresponding time-to-space mapping. We also predict that the bodily involvement in drawing lines between external representations of numbers and objects supports the learning of the counting process more than the reciting of numerals and thereby leads to fewer matching errors.

In our proposal for using situated counting in educational programs, we follow a recent study by Johnson et al. (2019), who suggest that children are far more competent at solving mathematical problems than is usually believed. The authors highlight that even very young preschool children can engage with and make sense of sophisticated mathematical ideas in a favorable context (see Footnote 13). We suggest that contexts for counting situations can be systematized by the situatedness and the loads on attention and working memory they necessitate.

6 Conclusions

Our ambition in this paper has been to provide a conceptual analysis of the partial concepts that are necessary to build a mature concept of natural number. We view our work as a combination of philosophical analysis and cognitive modeling of concepts.

The upshot of our analysis is that counting is a multi-faceted competence that may develop gradually over a long time. We have highlighted that there are three conceptual knowledge components – properties of collections, numerals, and one-to-one mappings – that a child must master before transitive counting can be learned. And even so, there are several steps that should be accomplished without error in order to determine the cardinality of a collection. Our main contributions in the article are, firstly, that the counting process can be broken down into a series of steps that require prior grasping of the three components of conceptual knowledge, and, secondly, that the cognitive difficulty of performing these steps depends on the counting situation.

We have argued that successful counting does not only involve fulfilling the cardinality principle, but also depends to a large extent on the physical structure of the counting situation, that is, to what extent steps (i) – (vi), presented in Subsection 4.2, can be supported by external representations. Following the tradition of situated cognition, in particular Zhang and Norman (1994), our model explains a lot of data concerning children’s counting performance and generates new predictions. Our analysis extends the error analysis performed by Fuson (1988) by specifying how the difficulty of the six steps we have presented can be determined by how working memory and attention are taxed in counting situations that vary in their physical structure and in the bodily engagement in the counting. In particular, analyzing the differences between time-to-space mappings and space-to-space mappings shows that the latter involve considerably less cognitive load during the counting process.