Spatial partitions systematize visual search and enhance target memory
Humans are remarkably capable of finding desired objects in the world, despite the scale and complexity of naturalistic environments. Broadly, this ability is supported by an interplay between exploratory search and guidance from episodic memory for previously observed target locations. Here we examined how the environment itself may influence this interplay. In particular, we examined how partitions in the environment—like buildings, rooms, and furniture—can impact memory during repeated search. We report that the presence of partitions in a display, independent of item configuration, reliably improves episodic memory for item locations. Repeated search through partitioned displays was faster overall and was characterized by more rapid ballistic orienting in later repetitions. Explicit recall was also both faster and more accurate when displays were partitioned. Finally, we found that search paths were more regular and systematic when displays were partitioned. Given the ubiquity of partitions in real-world environments, these results provide important insights into the mechanisms of naturalistic search and its relation to memory.
Keywords: Visual search, Spatial memory
Because humans are embodied agents, much of our behavior is contingent on the ability to locate and access objects in space, whether tools, resources, other individuals, or sources of information. For the most part, this is accomplished in one of two ways: through search or through memory. In other words, we either explore the environment or leverage episodic memory for where we have previously observed a target object.¹ There has been considerable interest in examining the interplay between these processes, as well as the conditions under which one or the other might be preferred. To date, the literature suggests that the use of memory is enhanced when search is more difficult. Memory use is rare in relatively simple displays with low orienting costs (Kunar, Flusberg, & Wolfe, 2008; Wolfe, Klempen, & Dahlen, 2000), but memory is used increasingly often as search becomes more challenging, for instance through decreased target discriminability or increased stimulus eccentricity (Solman & Smilek, 2012), or through increased orienting costs arising from the need for eye or head movements (Solman & Kingstone, 2014; Solman & Smilek, 2010; cf. Ballard, Hayhoe, & Pelz, 1995). Similarly, when the elements of a task support search, for instance through semantic cues that enable inference of the target locations (Eckstein, Drescher, & Shimozaki, 2006; Neider & Zelinsky, 2006; Torralba, Oliva, Castelhano, & Henderson, 2006), even targets in complex, naturalistic displays are less likely to be found via episodic memory (Võ & Wolfe, 2012, 2013).
Of central interest in these studies is the nature of the search–memory interplay in naturalistic settings, with the aim of improving our understanding of routine naturalistic behavior. Here we focused on an aspect of naturalistic environments that has received limited attention in the context of search—the ubiquity of partitions in the world. The bulk of human environments are multiply subdivided into buildings, rooms, items of furniture, and further down into nested compartments, drawers, and containers. There are several good reasons to believe that such partitioning might influence search. First, it has long been known that grouping or otherwise regularizing the configuration of items in search displays can facilitate target detection and improve efficiency (e.g., Bundesen & Pedersen, 1983; Farmer & Taylor, 1980; Humphreys, Quinlan, & Riddoch, 1989; Treisman, 1982; Williams, Pollatsek, & Reichle, 2014). Second, there is evidence that visual attention is typically deployed in a coarse-to-fine ordering: selecting groups first, then homing in on individual objects within them (e.g., Rao, Zelinsky, Hayhoe, & Ballard, 2002; Zelinsky, Rao, Hayhoe, & Ballard, 1997). In this way, subdivided spaces might provide a natural or complementary structure for guiding attention. Finally, on larger scales, we find that human navigation often relies on landmark knowledge—suggesting that spatial encoding is better served by referencing readily identifiable features, rather than by encoding absolute positions (Foo, Warren, Duchon, & Tarr, 2005).
The influence of display partitioning on search has been examined recently by Nakashima and Yokosawa (2013). Participants searched among Cs and Os in either uniform arrays or arrays subdivided by black borders. Nakashima and Yokosawa reported that partitions impair easier searches, perhaps due to perceptual disruption, but critically, these same partitions can also facilitate more difficult searches. One explanation for this result is that partitions, like other forms of grouping, enable more systematic processing of the items in the display (cf. Williams et al., 2014), thereby avoiding attentional inefficiencies such as retracing searched locations or dwelling on and reinspecting items that have already been examined.
In this study, we extended the investigation of partitions in search, with a dual purpose. First, and primarily, we explored how partitions might influence the use of memory in search, using the repeated-search paradigm (Wolfe et al., 2000). Second, using trajectory analysis, we examined how partitions influence the strategic/systemic components of the search process itself, in hopes of clarifying the mechanistic underpinnings of Nakashima and Yokosawa’s (2013) results. In the present study, participants searched repeatedly through partitioned and open (nonpartitioned) displays of object images, and then they were tested on their explicit memory for item positions. We used a masked, mouse-contingent display to enforce serial scanning and enable detailed path analysis metrics. We made several predictions. Most focally, we expected that partitioning search displays would lead to improved memory for item locations, both during search and during explicit free recall. In addition, we expected to find faster search through partitioned displays, supported by more regular search paths.
Critically, note that in this study we approached search vis-à-vis exploratory behavior in general, as opposed to visual search in particular (cf. Hollingworth, 2012; Smith, Hood, & Gilchrist, 2008; Solman & Kingstone, 2015). Indeed, here we used masked displays, which directly limit and largely preclude any influence of visual features in guiding search. By limiting the influence of low-level featural guidance, we emphasized the two search components of focal interest in the present study—memory use and exploratory strategy.
A group of 35 participants (six male, 29 female) from the University of British Columbia participated for course credit. All reported normal or corrected-to-normal visual acuity. We obtained informed consent from all participants, and all experimental procedures and protocols were reviewed and approved by the University of British Columbia Behavioral Research Ethics Board.
Each participant completed two blocks—open displays in one block, and partitioned displays in the other, with the block order counterbalanced across participants. In each block, participants searched for each of the 48 items in five separate repetitions, for a total of 240 search trials. Incorrect trials were recycled to the end of the search period, so that each participant correctly located each target five times. Following search, a single explicit-memory test was presented for the location of each of the 48 items. Our analysis of explicit memory included only the Partition (open, partitioned) factor, whereas the search measures included both the Partition and Repetition (1, 2, 3, 4, 5) factors.
Search and memory trials proceeded in largely the same way (Fig. 2). A trial began with a masked display and a blank green square, where the target template would subsequently appear. Participants initiated search by moving the mouse-contingent window onto the central green square, which triggered the presentation of the target template; they could then use the window to explore the display. Note that prior to triggering the target, the search display was not visible through the window. During search trials, participants moved the window over the display to inspect the items, and they were instructed to click on the item matching the target template. A brief (250-ms) feedback display flashed either green or red, to indicate that the response was correct or incorrect, respectively. A response was deemed correct if the click location was within 80 pixels of the target item’s center. During memory trials, only the mask display was visible (i.e., the item identities were unavailable), and participants were instructed to click on the location where they believed each item had been. No feedback was provided on memory trials.
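The correctness criterion described above can be illustrated with a minimal Python sketch. This is not the original experiment code: the function name `is_correct_click` is hypothetical, and treating a click at exactly 80 pixels as correct is an assumption, since the text says only "within 80 pixels."

```python
import math

HIT_RADIUS_PX = 80  # from the text: correct if within 80 pixels of the target center


def is_correct_click(click_xy, target_center_xy, radius=HIT_RADIUS_PX):
    """Return True if the click lands within `radius` pixels of the target center.

    The inclusive boundary (<=) is an assumption; the text does not say
    how a click at exactly 80 pixels was scored.
    """
    dx = click_xy[0] - target_center_xy[0]
    dy = click_xy[1] - target_center_xy[1]
    return math.hypot(dx, dy) <= radius
```

The same check could plausibly serve for scoring both search responses and explicit-memory responses, given that the trials proceeded in largely the same way.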
The experiment was written and executed in Python using the pygame module, and run on an Apple Mini running OS X 10.6.4 on a 2.4-GHz Intel Core 2 Duo processor. The stimulus displays were presented on a 24-in. Dell Acer V243H monitor at a resolution of 1,920 × 1,080. The seating distance was not rigidly controlled, but was approximately 60 cm. For both search and memory trials, in addition to response time and response location, we recorded the position of the mouse-contingent window at a rate of ~20 Hz.
Given that the search items were readily identifiable natural-object images, error rates were low for most participants, with a few exceptions. We excluded these error-prone participants with a recursive outlier removal process. We identified the Partition × Repetition condition cell with the greatest error for each participant, then recursively excluded those participants whose error rates were more than 3.5 standard deviations from the group. This led to the removal of three participants (Zs = 8.7, 8.7, and 21.1). The remainder of the analysis proceeded with N = 32.
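One way such a recursive exclusion might be implemented is sketched below. The text does not specify how the z scores were computed, so this sketch makes two labeled assumptions: each participant is compared against the mean and standard deviation of the other retained participants (a leave-one-out z score), and only the single most extreme participant is removed on each pass. The function name `recursive_outlier_exclusion` and the input format are hypothetical.

```python
from statistics import mean, stdev


def recursive_outlier_exclusion(worst_cell_error, z_cutoff=3.5):
    """Repeatedly drop the most extreme participant until none exceeds the cutoff.

    worst_cell_error: dict mapping participant id -> the highest error rate
    across that participant's Partition x Repetition cells.
    Returns (retained_ids, excluded), where excluded is a list of (id, z).
    """
    retained = dict(worst_cell_error)
    excluded = []
    while len(retained) > 2:
        # Assumption: leave-one-out z score against the other retained participants.
        scores = {}
        for pid, value in retained.items():
            others = [v for other, v in retained.items() if other != pid]
            spread = stdev(others)
            if spread > 0:
                scores[pid] = (value - mean(others)) / spread
        if not scores:
            break
        worst = max(scores, key=lambda pid: abs(scores[pid]))
        if abs(scores[worst]) <= z_cutoff:
            break  # nobody left beyond the cutoff
        excluded.append((worst, scores[worst]))
        del retained[worst]
    return sorted(retained), excluded
```

Removing one participant per pass avoids the masking that can occur when a single extreme value inflates the group standard deviation, although other conventions would be equally defensible.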
We first evaluated the accuracy of explicit memory and the response speed during this portion of the task. Next, we evaluated the spatial magnitude of the errors made. Participants were significantly more accurate in explicit-memory testing for partitioned (M = 90.9 %) than for open (M = 82.4 %) displays, t(31) = 4.210, p < .001, and were significantly faster in producing these correct responses, t(31) = 2.931, p < .01 (M = 1,596 vs. M = 1,751 ms).
As we noted above, accuracy was quite high, so the analysis of errors was limited, with missing cells due to perfect performance in one or more conditions leading to a reduction of the sample size to 23. Error magnitude was estimated by computing the distance between the response location and the location of the target. Interestingly, we found that although fewer errors were made overall in the partitioned displays, when errors were produced, they were farther from the target in the partitioned displays (M = 321 pixels) than in the open displays (M = 257 pixels), t(22) = 2.276, p < .05. One explanation for this effect is that errors in explicit localization reflect a combination of memory inaccuracies (near-misses) and memory failures (arbitrary location selections). The larger average error distance seen in the partitioned condition may reflect a change in the distribution of these errors, with location memory in partitioned displays having a stronger all-or-none character, that is, selectively fewer near-misses. To test this explanation, we distinguished between near-misses (errors adjacent to the correct location) and failures (errors farther away). A Partition (partitioned, open) × Distance (near, far) analysis of variance (ANOVA) was conducted on the numbers of errors in explicit memory. There were more errors overall in the open displays (as we reported previously), F(1, 31) = 17.7, MSE = 7.45, p < .001, and more near than far errors, F(1, 31) = 9.44, MSE = 4.06, p < .005. Critically, the interaction was also significant, F(1, 31) = 16.8, MSE = 2.03, p < .001, with a larger reduction in near (open, M = 5.28; partitioned, M = 2.22) than in far (open, M = 3.16; partitioned, M = 2.16) misses.
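The error-magnitude measure and the near/far split can be sketched as follows. The 80-pixel hit radius comes from the text, but the text defines near-misses only as errors "adjacent to the correct location," so the pixel threshold `NEAR_RADIUS_PX` used here is a hypothetical stand-in for adjacency, and the function names are illustrative.

```python
import math
from collections import Counter

HIT_RADIUS_PX = 80    # from the text
NEAR_RADIUS_PX = 250  # hypothetical stand-in for "adjacent to the correct location"


def classify_response(click_xy, target_xy,
                      hit_radius=HIT_RADIUS_PX, near_radius=NEAR_RADIUS_PX):
    """Return (label, distance): label is 'correct', 'near', or 'far'."""
    distance = math.dist(click_xy, target_xy)  # Euclidean error magnitude in pixels
    if distance <= hit_radius:
        return "correct", distance
    if distance <= near_radius:
        return "near", distance
    return "far", distance


def count_errors(responses):
    """Tally errors per (condition, label) cell for a Partition x Distance table.

    responses: iterable of (condition, click_xy, target_xy) tuples.
    """
    counts = Counter()
    for condition, click_xy, target_xy in responses:
        label, _ = classify_response(click_xy, target_xy)
        if label != "correct":
            counts[(condition, label)] += 1
    return counts
```

Per-participant counts of this kind would form the cells of the Partition × Distance analysis; if the display layout were available, adjacency could instead be defined by neighboring item positions.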
A trial was accurate if the searcher clicked on the item matching the target object, and inaccurate otherwise. Since the items were images of highly distinctive common natural objects, we expected to find relatively few errors. Indeed, across conditions, fewer than 2 % of the trials on average were errors. The rate of errors was not influenced by partition condition, F(1, 31) = 1.38, p = .248, nor by repetition, F(4, 124) < 1, p = .511. The interaction was also nonsignificant, F(4, 124) = 1.47, p = .215.
In the following analyses, we used trajectory data to present qualitative overviews of search performance and learning. Trajectory analyses offer detailed information on the time course of search, the accuracies of individual movements and movement components, and the overall character and potential strategic underpinnings of search paths. In the following discussion, we examine how repetition influenced the directed (i.e., toward the target) rate of change of mouse movements during search, yielding a portrait of how quickly searchers were able to orient to the target and how this orienting shifted from random search to accurate ballistic movements over the course of learning.
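As a rough illustration of a directed rate-of-change measure, the sketch below computes how quickly the window's distance to the target shrinks between successive position samples, and then extracts the peak of that profile. This is one plausible reading of the measure rather than the authors' implementation: the sample format `(t_ms, x, y)`, the function names, and the omission of any temporal resampling are assumptions.

```python
import math


def directed_speed(samples, target_xy):
    """Speed toward the target between successive samples.

    samples: list of (t_ms, x, y) window positions in time order.
    Returns a list of (t_mid_ms, speed_px_per_s); positive values mean
    the window is approaching the target.
    """
    profile = []
    for (t0, x0, y0), (t1, x1, y1) in zip(samples, samples[1:]):
        dt = t1 - t0
        if dt <= 0:
            continue  # skip duplicated or out-of-order timestamps
        d0 = math.dist((x0, y0), target_xy)
        d1 = math.dist((x1, y1), target_xy)
        profile.append(((t0 + t1) / 2, (d0 - d1) / dt * 1000))
    return profile


def peak_of(profile):
    """Return (latency_ms, amplitude) of the largest directed speed."""
    return max(profile, key=lambda point: point[1])
```

Under this reading, random exploration yields a flat, noisy profile, whereas a memory-guided ballistic movement yields an early, high peak.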
For these analyses, we omitted the highly variable first repetition (latency difference: M = –317, SD = 1,761; peak difference: M = 2.05, SD = 79.7), in which no memory was expected (although some incidental memory might have been present even in the first repetition, i.e., for later targets observed while searching for earlier targets). We focused instead on later repetitions, where the dominant behavior would be expected to be memory-driven. For peak latencies (Fig. 4, panel C), we found only a main effect of repetition, F(3, 93) = 155.2, MSE = 109,652.157, p < .001, with earlier peaks as repetitions increased. There was no significant difference between the open and partitioned conditions, F(1, 31) < 1, p = .440, and no interaction, F < 1, p = .835. For peak amplitudes (Fig. 4, panel D), we found a main effect of repetition, with the amplitude increasing with repetitions, F(3, 93) = 102.7, MSE = 11,929.831, p < .001; a significant effect of partition, with larger amplitudes in the partitioned than in the open condition, F(1, 31) = 4.21, MSE = 45,321.709, p < .05; and a significant Partition × Repetition interaction, F(3, 93) = 4.62, MSE = 11,270.565, p < .005. This interaction was followed up with paired-samples t tests at each repetition to compare the partitioned and open conditions. We found no difference for Repetitions 2, t(31) = –0.187, p = .853, and 3, t(31) = 0.592, p = .558, but significantly higher amplitudes for the partitioned than for the open conditions at Repetitions 4, t(31) = 3.657, p < .001, and 5, t(31) = 2.424, p < .05.
We next examined whether partitioning the display influenced the structure of search, regardless of performance. If search is influenced by the partitions, we would expect that searchers should transition more often between items within a partition than between partitions. We tested this prediction by examining the transition probabilities for the experimental partition set (the set actually displayed), as compared to the transition probabilities for a control partition set (obtained by mirroring the layout of the experimental partitions). Note that the control set was not displayed at any time—we used this strictly as a control case for examining the transition rates. We evaluated these transition probabilities for both the open and the partitioned conditions, with the expectation that transitions within a partition should be equivalent for the experimental and control partitions in the open condition (since no visual markers differentiated these regions), but that within-partition transitions should be amplified in the experimental relative to the control partitions for the partitioned condition, reflecting a bias toward segmenting search episodes by partition, rather than searching the entire display indiscriminately.
For each sample, we determined the item, if any, on which that sample fell, through strict collision with the rectangle where the item was displayed. Samples falling outside any item were given a null coding. Each item was associated with a given experimental partition and with a control partition. Transitions were identified by finding successive samples (ignoring null-coded samples) falling on different items. The transition was recorded along with its classification as being either within a partition or between partitions for both the experimental and control partition sets.
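The coding procedure just described can be sketched in Python as follows. This is an illustrative reconstruction under stated assumptions, not the original analysis code: the rectangle format `(left, top, width, height)`, the half-open bounds test, and the names `item_at` and `within_partition_rate` are all hypothetical.

```python
def item_at(sample_xy, item_rects):
    """Return the index of the item whose rectangle contains the sample, else None.

    item_rects: list of (left, top, width, height) display rectangles.
    """
    x, y = sample_xy
    for index, (left, top, width, height) in enumerate(item_rects):
        if left <= x < left + width and top <= y < top + height:
            return index
    return None  # null coding: the sample fell outside every item


def within_partition_rate(samples, item_rects, partition_of):
    """Proportion of item-to-item transitions that stay within one partition.

    partition_of: list mapping item index -> partition label. Pass the
    experimental labels or the control labels to score either partition set.
    """
    visited = [item_at(sample, item_rects) for sample in samples]
    visited = [item for item in visited if item is not None]  # ignore null samples
    within = between = 0
    for previous, current in zip(visited, visited[1:]):
        if previous == current:
            continue  # still on the same item, so no transition
        if partition_of[previous] == partition_of[current]:
            within += 1
        else:
            between += 1
    total = within + between
    return within / total if total else None
```

Scoring the same trajectory twice, once with the experimental labels and once with the control labels, gives the pair of rates compared in the analysis.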
The present experiment revealed a number of effects of display partitioning on the performance of search and on explicit memory for search target locations. Evaluating the item–item transitions during search, we found that partitions strongly influenced the trajectory of search, such that searchers were more likely to move from one item to another within the same partition. This systematicity may have facilitated exhaustive search, reducing the demands on memory for which items had already been inspected. We also found that, in later repetitions, searchers in the partitioned condition moved toward the target with an increased peak speed, although the latency of this movement was not significantly altered by the partitions. This increase in movement speed may explain the modest overall RT difference. The increased peak speed in conjunction with an absence of latency differences suggests that retrieval time is essentially fixed, but that the accuracy of either the representation or the guidance may be improved for partitioned as compared to open displays, leading to a more rapid orienting movement. In terms of explicit item location memory, we found a clear effect of partition, with explicit memory being more accurate for partitioned displays, and with these recalled locations being generated more quickly. Notably, the reduction in errors between conditions occurred preferentially for “near-misses”—suggesting that instead of increasing the number of target locations encoded, partitions facilitate a shift from approximate location memory to precise memory. In the case of open displays, memory is sufficient to localize a target within a small cluster of adjacent positions, but more often it fails to pinpoint the exact position.
This study occupies a unique middle territory between visual-search and spatial-memory paradigms. A variety of studies have explored the effects of different configurational factors on spatial memory. When items are presented in a spatial arrangement, the sequence of items generated during free recall tends to cluster on the basis of item proximity (Hirtle & Jonides, 1985; McNamara, 1992; however, this effect may depend closely on the temporal order of item presentation: McNamara, Halpin, & Hardy, 1992; see also Tversky, 1991). Of more direct relevance to the present results, when explicit groupings are formed (e.g., by color, shape, or boundaries), relational memory is often improved for within-group as compared to between-group pairs (Hommel, Gehrke, & Knuf, 2000; McNamara, 1986). With respect to boundaries in particular, there is evidence that children may overestimate distances across boundaries (Cohen, Baldwin, & Sherman, 1978; Kosslyn, Pick, & Fariello, 1974), and in adults boundaries may facilitate the formation of hierarchical representations in memory (Stevens & Coupe, 1978).
Although considerable attention has been given to spatial memory, there are some hurdles to overcome when linking these results to the effects in visual search. Most studies of spatial memory have focused on short-term memory for position sequences, or else on item–item priming or relative position judgments. In other words, when absolute positional recall is measured, this is only for short-term memory of ordered sequences of three or four targets; when larger and unsequenced sets of items have been presented, the measures have mostly focused on relative bearing and relative distance judgments.
There are reasons to suggest that search may provide a more ecological window on spatial memory for object positions. Routine naturalistic search involves (1) the precise localization of large numbers of objects, (2) target sequences generally unrelated to the order of initial exposure, and (3) a gradual buildup of memory through repeated interactions. There is even evidence that the act of searching for an object may confer unique advantages for spatial memory—in real scenes, a target that has been searched for is remembered better than either an item viewed incidentally or an item viewed in the course of an explicit memorization task (Võ & Wolfe, 2012). On the other hand, the search literature itself has for the most part had surprisingly little to say regarding spatial memory. The most popular models of visual search (Itti & Koch, 2000, 2001; Pomplun, Reingold, & Shen, 2003; Wolfe, 1994, 2007) have restricted their attention to bottom-up featural guidance, with some acknowledgement of top-down biases (e.g., from context and expectancies). These models provide exceptionally good accounts of search when it is driven exclusively by the visual properties of an array, but behaviors arising from ongoing search through relatively stable environments remain beyond their scope. Although the importance of memory at multiple spatial and temporal scales is understood (see, e.g., Shore & Klein, 2000, for a review), and although some models do include memory terms at the within-trial level (e.g., Guided Search 4.0: Wolfe, 2007), to date no well-established models of search have incorporated the effects of repeated exposure and the commensurate buildup of spatial memory.
The data here add to a growing body of work addressing the factors feeding into the nebulous “top-down” category incorporated in models of human search performance. To date, top-down considerations have primarily involved general semantic knowledge and related expectancies for particular objects in particular settings (e.g., Chen & Zelinsky, 2006; Eckstein, Drescher, & Shimozaki, 2006; Ehinger, Hidalgo-Sotelo, Torralba, & Oliva, 2009; Henderson, 2003; Navalpakkam & Itti, 2002, 2005; Neider & Zelinsky, 2006; Torralba et al., 2006; Zelinsky et al., 1997), or otherwise, memory developed over the course of repeated presentations (Chun & Jiang, 1998; Jiang & Wagner, 2004; Kunar, Flusberg, & Wolfe, 2008; Olson & Chun, 2002; Solman & Kingstone, 2014; Solman & Smilek, 2010, 2012; Võ & Wolfe, 2012, 2013; Wolfe et al., 2000). Two additional sources of top-down guidance are addressed in the present research: (1) strategic biases in scanpath organization, and (2) semantic-independent configurational aspects of the search display. The present research has confirmed that arbitrary structure (i.e., partitioning) encourages systematicity in scanpaths (De Lillo, Kirby, & James, 2014; Gilchrist & Harvey, 2006; Hooge & Erkelens, 1996; Solman & Kingstone, 2015), and further has demonstrated that arbitrary structure leads to improved memory for target locations.
Several possible mechanisms may underlie the observed memory improvement. First, we note that the more regularized scanpaths during search through partitioned displays may have facilitated accurate spatial encoding by allowing observers to avoid reinspections or gaps during search. Several studies of random search have also shown that paths are adapted to regularities in the arrangement of display items (e.g., clusters: De Lillo, Kirby, & James, 2014; grids: Gilchrist & Harvey, 2006; or circles: Hooge & Erkelens, 1996). The observation that scanpaths are also regularized by partitions suggests that this is a reasonable mechanism for the improvements observed by Nakashima and Yokosawa (2013) in random, perceptually driven search.
It is also possible that both the scanpath effects and the memory improvement may be traced to a common support from the spatial landmarks provided by the boundaries (e.g., corners or edges: Foo et al., 2005; or distinctive context: Cherry & Park, 1993). These background features could provide reference points for spatial memory, both during encoding and during recall. Indeed, in studies of recall for sequences of locations, memory span is increased when the locations are regularly arrayed, symmetrical, or form continuous, nonintersecting paths (Kemps, 1999, 2001), and when the structure of the sequence conforms to the structure of the locations (De Lillo, 2004; De Lillo, Kirby, & James, 2014). In this view, global context in a display serves as an anchor for location memory, facilitating guidance and potentially helping to diversify and separate item representations. If we view group membership as an additional feature for each item, then both encoding and guidance ought to be enhanced for grouped items. Partitions, then, offer an extremely flexible grouping signal—by creating a set membership property independent of lower-level grouping properties, like proximity or color.
A related possibility arises from observations of coarse-to-fine visual orienting (Rao et al., 2002; Zelinsky et al., 1997) and evidence for hierarchical or semihierarchical encoding in spatial memory (De Lillo, 2004; McNamara, 1986, 1992; Stevens & Coupe, 1978; Tversky, 1991). In particular, although it may be difficult to recall a single, precise coordinate in the full display, it may be much easier to encode two smaller pieces of information about each target—that is, the particular partition and the location within that smaller region. Alternatively, this coarse-to-fine encoding could instead emerge over time, so that each target is associated with only a single piece of location memory, but the spatial resolution of this memory improves over time. In this case, partitions in the display might provide a convenient scaffold for early coarse memory, with this advantage facilitating subsequent searches and more detailed encodings.
Despite the complexities of naturalistic search—a torrent of sensory data, effectively boundless environments, and extremely large set sizes—humans demonstrate a remarkable facility for locating the objects they need during routine activity. This facility likely depends on the effective combination of visual ability with memory, each supporting the other, when necessary. Here we have shown that the environment itself may influence this interplay. Whereas previous work has established that when the environment offers semantic cues to facilitate prediction as a guiding factor in search, episodic target memory is reduced (Võ & Wolfe, 2012, 2013), we demonstrated the complementary result—that when the environment offers structural supports to facilitate spatial encoding, episodic target memory is increased.
Of course, under naturalistic conditions, even search without episodic memory is likely to be supported to some extent by more general semantic memory—that is, knowledge about where a given class of object is likely to be, as distinct from knowledge of where a particular instance has been directly observed. For the present purposes, we group this particular form of memory-guided search with random search, because both processes are marked by the need for exploration, in contrast to directed orienting in the case of episodic memory.
The following analyses are largely insensitive to the number of time points resampled, provided that the sampling was not excessively coarse. In general, resampling rates should be chosen so that the resampling errors are smaller than the variability in the supporting data.
This work was supported by the Natural Sciences and Engineering Research Council of Canada, Grant No. RGPIN 170077-11.
- Brodeur, M. B., Dionne-Dostie, E., Montreuil, T., & Lepage, M. (2010). The Bank of Standardized Stimuli (BOSS), a new set of 480 normative photos of objects to be used as visual stimuli in cognitive research. PLoS ONE, 5, e10773. doi:10.1371/journal.pone.0010773
- Navalpakkam, V., & Itti, L. (2002). A goal oriented attention guidance model. In H. H. Bülthoff, C. Wallraven, S.-W. Lee, & T. A. Poggio (Eds.), Biologically motivated computer vision 2002 (Lecture Notes in Computer Science, Vol. 2525, pp. 453–461). Berlin, Germany: Springer.
- Tversky, B. (1991). Spatial mental models. In G. H. Bower (Ed.), The psychology of learning and motivation: Advances in research and theory (Vol. 27, pp. 109–145). Orlando, FL: Academic Press.