Enumerating by pointing to locations: a new method for measuring the numerosity of visual object representations

Haladjian, Harry Haroutioun; Pylyshyn, Zenon W.

doi:10.3758/s13414-010-0030-5

Published: 06 November 2010

Volume 73, pages 303–308, (2011)

Attention, Perception, & Psychophysics

Abstract

The fast and accurate enumeration of a small set of objects, called subitizing, is thought to involve a different mechanism from other numerosity judgments, such as those based on estimation. In this report, we examine the subitizing limit using a novel enumeration task that obtained the perceived locations of enumerated objects. Observers were shown brief masked displays (50, 200, and 350 ms) of 2–9 small black discs randomly placed on a gray screen and then asked to place a marker where each disc had been located. The number of these markers provided an estimate of the number of items processed. This “pointing” methodology enabled observers to accurately “enumerate” displays containing up to six items in contrast with the four-item limit typically found when using standard reporting methods (and replicated here in Experiment 2). These results suggest a different account of the limits found in most subitizing and enumeration studies.

Introduction

The ability to identify and locate a number of objects in a visual scene serves many cognitive functions, such as navigating through crowded environments or playing team sports. Counting is a related ability, where accurate enumeration requires individuating discrete perceptual objects. Some theories of enumeration posit that it is achieved via a magnitude estimation mechanism. This system is observed in animals and supports the internal representations of numerosity and duration (Meck & Church, 1983). This mechanism is believed to also enable verbal counting in humans when number labels are mapped to discrete units on this continuous representation (Gallistel & Gelman, 1992).

Results from some studies, however, suggest that different mechanisms may be responsible for computing small numerosities. When enumerating sets of 1–4 items, observers make few errors with modestly increasing reaction times (RT) as set size increases. The rate of errors and RT, however, increases substantially for sets with more than four items (Trick & Pylyshyn, 1993). The term subitizing refers to this quick and accurate small-set enumeration (Kaufman, Lord, Reese, & Volkmann, 1949). This observation has led some to argue that, in addition to magnitude representations, another system can individuate and select up to four items in parallel and then numerals can be mapped to these items (Feigenson, Dehaene, & Spelke, 2004; Trick & Pylyshyn, 1994a, 1994b). The interpretation of these results, however, remains contested. Some studies attribute the difference in small- and large-set enumeration to capacity limitations of information transfer into short-term memory (Cowan, 2001; Klahr, 1973). Others posit subitizing as pattern-recognition, where familiar patterns formed by fewer items are recognized more quickly, for example, like the patterns on dice (Mandler & Shebo, 1982). Another view argues that the RT slope does not change suddenly but rather increases as a continuous function of increasing variability (Whalen, Gallistel, & Gelman, 1999). Whether or not two separate systems serve enumeration remains an open question.

In this report, we introduce a new experimental methodology designed to examine the relationship between spatial representations and the representation of sets of objects in order to characterize the mechanism that supports subitizing. Specifically, we designed an experiment that measures the accuracy of the spatial encoding of objects and indirectly provides an indication of how many objects were recalled, which serves as a measure of enumeration (see Footnote 1). Observers were shown a brief stimulus (50, 200, or 350 ms) consisting of 2–9 small black discs randomly placed on a gray screen and immediately masked. Then the observers used a mouse to “point to” where the objects had been by placing markers on a blank screen at the former locations of each disc. This methodology allowed us to analyze both location and enumeration accuracy and their relationship.

Experiment 1

Method

Twenty-four Rutgers University undergraduates participated in one 45-min session for course credit or payment. The experiment was programmed in MATLAB® using Psychophysics Toolbox 3.0.8 (Brainard, 1997) and run on a PC with the Windows® XP operating system. The stimuli were displayed on a 19-inch (c. 48.3 cm) color HP P1100 CRT monitor (1,280 × 1,024 pixel resolution at 70 Hz).

The test stimulus consisted of 2–9 identical black discs that appeared on a gray background (to reduce contrast and minimize after-images and phosphor decay). These discs were 35 pixels in diameter (~1° visual angle) and randomly placed on the screen with the following constraints: the edges of any two discs had to be at least 115 pixels (~3°) and at most 715 pixels (~20°) apart; additionally, discs could not appear within ~200 pixels (~5°) of the screen edges. This produced an effective viewing display of 21.1° by 16.6° (768 × 614 pixels). The minimum distance between discs was set at approximately 3° since attention requires at least 1° of visual separation for accurate discrimination (Bahcall & Kowler, 1999) and at least 2° when stimuli extend 15° into the periphery (Intriligator & Cavanagh, 2001).
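The placement constraints can be illustrated with a short sketch. This is not the authors' MATLAB code: it is a minimal Python illustration using sequential rejection sampling, and it assumes that the spacing limits apply to every pair of discs, that they are measured edge to edge, and that disc centers are drawn from the 768 × 614 pixel region centered on the screen. The function name `place_discs` and the retry limit are hypothetical.

```python
import math
import random

DISC_DIAMETER = 35               # px, from the Method section
MIN_EDGE_GAP = 115               # px, minimum edge-to-edge spacing
MAX_EDGE_GAP = 715               # px, maximum edge-to-edge spacing
REGION_W, REGION_H = 768, 614    # px, effective display region
SCREEN_W, SCREEN_H = 1280, 1024  # px, monitor resolution


def place_discs(n, rng, max_tries=10_000):
    """Return n (x, y) disc centers that satisfy the spacing constraints.

    Assumption: centers are sampled inside the effective region, and the
    limits hold for all pairs of discs (the text does not say which pairs).
    """
    x0 = (SCREEN_W - REGION_W) / 2
    y0 = (SCREEN_H - REGION_H) / 2
    lo = MIN_EDGE_GAP + DISC_DIAMETER  # edge gap converted to center distance
    hi = MAX_EDGE_GAP + DISC_DIAMETER
    centers, tries = [], 0
    while len(centers) < n:
        tries += 1
        if tries > max_tries:          # stuck: discard the layout and restart
            centers, tries = [], 0
            continue
        candidate = (x0 + rng.uniform(0, REGION_W),
                     y0 + rng.uniform(0, REGION_H))
        if all(lo <= math.dist(candidate, c) <= hi for c in centers):
            centers.append(candidate)
    return centers


layout = place_discs(9, random.Random(0))
```

Adding discs one at a time and rejecting only the offending candidate is much cheaper than redrawing the whole layout, which matters for the nine-disc displays where most random layouts violate the minimum spacing.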

Observers sat approximately 60 cm from the screen in a darkened room. They were instructed to look carefully for the brief display of black discs in order to notice the number of discs and remember their locations. Each trial began with a 2,500-ms gray screen with a white central fixation cross, on which the observers were instructed to fixate. Then, the test display was flashed for 50, 200, or 350 ms, followed by a 16-ms black screen and an 85-ms mask (created by randomly assigning a white or black value to a grid of 4 × 4 pixel squares). Finally, a gray screen with a crosshair cursor appeared and observers placed markers (“X”) on the recalled location of each disc (see Fig. 1). It was emphasized that the number of markers placed on the screen should correspond to the number of discs on the test display, even if the observer was unsure about their exact locations. When the observers were finished marking the disc locations, they pressed the space bar to start the next trial. Observers received 12 trials of each of the 24 test conditions (3 durations and 8 numerosities); these 288 trials were randomly distributed throughout the experiment. Observers were encouraged to take a break at any point during the experiment. The primary measures of interest were enumeration accuracy and the magnitude of location errors, which were analyzed using within-subjects ANOVA.
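The trial structure described above (3 durations × 8 numerosities × 12 repetitions, randomly ordered) can be sketched as follows. This is an illustrative Python reconstruction, not the original experiment script; `build_trials` is a hypothetical name.

```python
import itertools
import random

DURATIONS_MS = (50, 200, 350)   # display durations from the text
NUMEROSITIES = range(2, 10)     # 2-9 discs
REPETITIONS = 12                # trials per condition


def build_trials(rng):
    """Return a shuffled list of (duration_ms, numerosity) trials."""
    conditions = list(itertools.product(DURATIONS_MS, NUMEROSITIES))
    trials = conditions * REPETITIONS   # 24 conditions x 12 = 288 trials
    rng.shuffle(trials)
    return trials


trials = build_trials(random.Random(0))
```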

Results

Enumeration accuracy

Numerical accuracy, measured as the proportion of trials in each condition in which the observer provided the correct number of location marks, was high for displays containing up to six items and decreased significantly for larger numerosities. The ANOVA indicates main effects for display duration [F(2, 6,336) = 85.9, p < 0.001, η² = 0.789] and numerosity [F(7, 6,336) = 128.4, p < 0.001, η² = 0.848], with an interaction [F(14, 6,336) = 8.0, p < 0.001, η² = 0.258]. Performance in the 50-ms display duration was significantly worse than the 200-ms and 350-ms durations for displays with 6–9 items. Figure 2 shows the enumeration accuracy as a function of numerosity for each display duration. (Note: all error bars in this report represent 95% confidence intervals.)

We also analyzed the average number of miscounts in each condition. Over- and under-counting were treated the same in this analysis by taking the absolute value of miscounts (84% of errors were underestimates). ANOVA results for miscounts also indicate main effects for display duration [F(2, 6,332) = 90.6, p < 0.001, η² = 0.798] and numerosity [F(7, 6,332) = 92.5, p < 0.001, η² = 0.801], with an interaction [F(14, 6,332) = 22.3, p < 0.001, η² = 0.492]. The average counting error increased with greater numerosities, but less so for the longer display durations (see Fig. 3).
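The two enumeration measures, proportion of correct counts and mean absolute miscount per condition, can be computed along the following lines. The sketch assumes each trial is recorded as a hypothetical `(duration_ms, n_shown, n_marked)` tuple; the function names and the example trials are invented for illustration and are not data from the study.

```python
from collections import defaultdict


def summarize_enumeration(trials):
    """Per-condition accuracy and mean absolute miscount.

    trials: iterable of (duration_ms, n_shown, n_marked) tuples
    (a hypothetical record format, not the authors' data layout).
    """
    by_condition = defaultdict(list)
    for duration, shown, marked in trials:
        by_condition[(duration, shown)].append(marked - shown)
    summary = {}
    for condition, errors in by_condition.items():
        summary[condition] = {
            # proportion of trials with exactly the right number of markers
            "accuracy": sum(e == 0 for e in errors) / len(errors),
            # over- and under-counts are treated alike via the absolute value
            "mean_abs_miscount": sum(abs(e) for e in errors) / len(errors),
        }
    return summary


def share_underestimates(trials):
    """Fraction of miscounted trials in which too few markers were placed."""
    errors = [marked - shown for _, shown, marked in trials if marked != shown]
    return sum(e < 0 for e in errors) / len(errors) if errors else None


# Made-up example trials, for illustration only.
example = [(50, 7, 7), (50, 7, 6), (50, 7, 5), (50, 7, 8),
           (200, 3, 3), (200, 3, 3)]
summary = summarize_enumeration(example)
```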

Additionally, a simple linear function was computed between display numerosity and response numerosity (combined durations). For small sets (2–6 items), the degree of linear fit between display numerosity and numerical responses was high: adjusted r² = 0.971 (p < 0.001) and β = 0.985. For larger sets (7–9 items), the fit was not as good: adjusted r² = 0.350 (p < 0.001) with a lower slope (β = 0.591).
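A simple linear regression of this kind can be sketched as below. The reported β values are close to the square root of the corresponding adjusted r² (0.985² ≈ 0.970; 0.591² ≈ 0.349), which is what one would expect if β is a standardized coefficient; in a one-predictor regression that coefficient equals Pearson's r. This reading is an inference from the numbers, not something the text states, so the sketch returns both the raw slope and the standardized coefficient. The function name `simple_regression` is hypothetical.

```python
import math


def simple_regression(x, y):
    """Ordinary least-squares fit of y on a single predictor x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx                   # raw (unstandardized) slope
    r = sxy / math.sqrt(sxx * syy)      # standardized coefficient = Pearson r
    r2 = r * r
    adj_r2 = 1 - (1 - r2) * (n - 1) / (n - 2)   # one predictor
    return {
        "slope": slope,
        "intercept": mean_y - slope * mean_x,
        "beta": r,
        "r2": r2,
        "adj_r2": adj_r2,
    }


# Made-up responses for illustration: accurate up to 5, one undercount at 6.
fit = simple_regression([2, 3, 4, 5, 6], [2, 3, 4, 5, 5])
```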

“Pointing” accuracy

Location error in the pointing task is reported as the Euclidean distance between a stimulus disc and a paired response disc, which was determined as follows. Using Delaunay triangulation (Kendall, 1989) and nearest-neighbor methods, we identified the likely associated response marker for each stimulus disc in each trial and then calculated the pixel distance between the centers of these paired discs. Some trials resulted in unpaired discs, for example, when an observer miscounted the stimulus. These trials were excluded from the location analysis (approximately 15% of total trials).
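The pairing step can be illustrated with a simplified stand-in. The sketch below does not reproduce the Delaunay-based procedure, whose details are not given here; it uses a plain greedy nearest-neighbor matching instead, and it treats any trial whose marker count differs from the disc count as unpaired. Both choices are assumptions made for illustration, and the function names are hypothetical.

```python
import math


def pair_markers(discs, markers):
    """Greedy one-to-one pairing of discs and markers by ascending distance.

    Returns a list of (disc_index, marker_index, distance_px) tuples.
    This is a simplification, not the Delaunay-based method in the text.
    """
    candidates = sorted(
        (math.dist(d, m), i, j)
        for i, d in enumerate(discs)
        for j, m in enumerate(markers)
    )
    used_discs, used_markers, pairs = set(), set(), []
    for distance, i, j in candidates:
        if i not in used_discs and j not in used_markers:
            used_discs.add(i)
            used_markers.add(j)
            pairs.append((i, j, distance))
    return sorted(pairs)


def location_errors(discs, markers):
    """Per-disc error distances, or None when the counts do not match."""
    if len(discs) != len(markers):
        return None   # assumption: miscounted trials are excluded outright
    return [distance for _, _, distance in pair_markers(discs, markers)]


# Made-up coordinates for illustration.
errors = location_errors([(0, 0), (100, 0)], [(103, 4), (0, 5)])
```

Greedy matching can be suboptimal when markers are crowded together; an optimal assignment method would be a reasonable alternative if that matters for the analysis.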

ANOVA results for the magnitude of location errors in each condition indicate main effects for display duration [F(2, 4,952) = 20.7, p < 0.001, η² = 0.430] and numerosity [F(7, 4,952) = 66.3, p < 0.001, η² = 0.819], but without an interaction [F(14, 4,952) = 1.3, p = 0.187, η² = 0.051]. Figure 4 shows the average error distance in pixels and degrees of visual angle. Errors increased for larger numerosities and for the shortest duration: the 50-ms display produced significantly larger errors than the other durations at all numerosities except 8. A regression analysis on the combined durations showed a larger increase (slope) of location errors with numerosity for displays with 2–6 items (β = 0.258, adjusted r² = 0.066, p < 0.001) than for displays with 7–9 items (β = 0.182, adjusted r² = 0.033, p < 0.001).
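Because location errors are reported in both pixels and degrees of visual angle, a conversion sketch may be useful. The 60-cm viewing distance comes from the Method section; the pixel pitch of 0.029 cm is an assumption, back-calculated so that 768 pixels span roughly 21° at that distance, and is not a measured property of the monitor. The function name `pixels_to_degrees` is hypothetical.

```python
import math

VIEWING_DISTANCE_CM = 60.0   # from the Method section
PIXEL_PITCH_CM = 0.029       # ASSUMPTION: back-calculated, not measured


def pixels_to_degrees(pixels, pitch_cm=PIXEL_PITCH_CM,
                      distance_cm=VIEWING_DISTANCE_CM):
    """Visual angle (degrees) subtended by a centered extent of `pixels`."""
    size_cm = pixels * pitch_cm
    return math.degrees(2 * math.atan(size_cm / (2 * distance_cm)))


disc_deg = pixels_to_degrees(35)     # disc diameter, roughly 1 degree
width_deg = pixels_to_degrees(768)   # effective display width, roughly 21 degrees
```

The formula assumes the extent is centered on the line of sight; for small errors measured in the periphery it is only an approximation.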

Experiment 2

The results of Experiment 1 suggest that the “pointing method” allows more items to be processed (and indirectly enumerated) in subitizing than typically reported (e.g., Trick & Pylyshyn, 1994b). Since subitizing experiments have used a variety of methods, there remains the possibility that the subitizing range increase found in this experiment may be due to aspects of our methodology other than pointing to recalled object locations. Therefore, Experiment 2 compared our indirect “pointing method” of inferring how many items had been processed with the more conventional method that relies on observers’ explicit report of the cardinality of the set. Other aspects of the experiment were the same as in the first experiment (e.g., the nature of the displays, the use of the mouse to report cardinality, performance measures). Since Experiment 2 involves the use of a different set of observers, we replicated our pointing method on the new subject population in order to provide a within-subjects comparison of the pointing response versus the symbolic numeral response.