Binding serial order to representations in working memory: a spatial/verbal dissociation

Gmeindl, Leon; Walsh, Megan; Courtney, Susan M.

doi:10.3758/s13421-010-0012-9

Published: 02 November 2010

Volume 39, pages 37–46, (2011)

Memory & Cognition

Abstract

Verbal information is coded naturally as ordered representations in working memory (WM). However, this may not be true for spatial information. Accordingly, we used memory span tasks to test the hypothesis that serial order is more readily bound to verbal than to spatial representations. Removing serial-order requirements improved performance more for spatial locations than for digits. Furthermore, serial order was freely reproduced twice as frequently for digits as for locations. When participants reordered spatial sequences, they minimized the mean distance between items. Participants also failed to detect changes in serial order more frequently for spatial than for verbal sequences. These results provide converging evidence for a dissociation in the binding of serial order to spatial versus verbal representations. There may be separable domain-specific control processes responsible for this binding. Alternatively, there may be fundamental differences in how effectively temporal information can be bound to different types of stimulus features in WM.

Converging evidence indicates a dissociation in working memory (WM) between verbal information, such as words and numbers, and visuospatial information, such as objects and locations (for reviews, see Jonides et al., 1996; Smith & Jonides, 1995, 1999). Accordingly, the most widely cited model of WM (Baddeley & Hitch, 1974) postulated separate memory stores for verbal and visuospatial information: the “phonological loop” and the “visuospatial sketchpad,” respectively. Recent neuroimaging and neuropsychological evidence suggests, moreover, that verbal and spatial storage systems involve different types of information coding and control mechanisms (e.g., Courtney, Roth, & Sala, 2007; Curtis & D’Esposito, 2004; Levy & Goldman-Rakic, 2000). In the present study, we examined specifically whether there is a dissociation in the binding of serial-order information to verbal versus spatial stimulus representations in WM.

Differences in the degree to which, or the mechanisms by which, temporal information can be effectively bound to stimulus representations in WM would warrant a new interpretation of performance on tests that are widely used to quantify WM ability in both healthy participants and patients, such as the standardized digit span and spatial span tests (Wechsler Adult Intelligence Scale III, The Psychological Corporation, 1997; see also Berch, Krikorian & Huha, 1998, for a review of the related Corsi block-tapping task). These tests require the reproduction of items (digits or locations) in the same order as that in which they were presented. Sequence length is incremented until the participant fails to reproduce sequences of the same length on two successive trials. There are a number of scoring methods, but the conventional span score is the longest sequence length recalled in the correct order.

We hypothesized that serial order is more readily bound to verbal information, which is usually encountered in a temporally ordered manner (e.g., Tallal, Merzenich, Miller, & Jenkins, 1998) and appears to be rehearsed serially (Baddeley, 1986; Lee & Estes, 1977), than to spatial information. The latter might naturally be coded in WM as multilocation configurations (Bor, Duncan, & Owen, 2001; Gmeindl, Nelson, Wiggin, & Reuter-Lorenz, 2010; Jiang, Olson, & Chun, 2000) and not rehearsed serially. As a result, requiring memory for the serial order of spatial stimuli (as in the standardized spatial span test) may oblige participants to recruit alternative, more effortful encoding or rehearsal mechanisms than they would otherwise adopt, thereby masking effective memory capacity for spatial information (Smyth & Scholey, 1994a). In contrast, if verbal items are naturally coded and rehearsed using serial mechanisms, requiring maintenance of serial order in addition to item identity should have a smaller negative impact on verbal than on spatial WM performance.

This hypothesis is consistent with the results of previous research. For example, several studies (e.g., Dutta & Nairne, 1993; Healy, 1975, 1977; Jones, 1976) provided evidence consistent with separable codes and/or mechanisms employed for representing serial order and spatial locations in WM. In contrast, memory for serial order may normally rely on the same WM mechanisms (e.g., the phonological loop) that code and maintain verbal material (Healy, 1975, 1977). Moreover, maintenance of the serial order of verbal stimuli might be a fortuitous by-product of the serial rehearsal naturally employed for the short-term maintenance of verbal representations; the serial rehearsal of verbal stimuli (e.g., F–B–L–M) may simultaneously refresh the serial order as well as the identities of those stimuli held in WM. Serial rehearsal could allow order to be reconstructed from the relative strength of contextual information associated with stimulus representations held in memory (see, e.g., Howard & Kahana, 2002), without an explicit coding, or “tagging,” of serial positions. In contrast, if spatial locations are naturally maintained as multilocation configural representations, memory for the serial order of spatial locations may necessitate either the coordination of separable verbal and spatial WM subsystems (if serial order is coded verbally) or the recruitment of alternative (e.g., serial) spatial coding or rehearsal mechanisms.

However, perhaps serial mechanisms, such as covert shifts of attention (Awh, Jonides, & Reuter-Lorenz, 1998) or motor sequencing operations (e.g., Postle, Idzikowski, Della Sala, Logie, & Baddeley, 2006), are used for rehearsal of spatial information, just as an articulatory motor system is used for subvocal rehearsal of verbal information. In this case, the serial nature of the spatial rehearsal system might allow temporal information to be bound effectively to spatial locations. Thus, contrary to our primary hypothesis, serial order might be bound equally well to both verbal and spatial stimulus representations in WM. In this case, poorer performance on spatial span tests (or Corsi block-tapping) than on digit span tests (as is observed in normative studies; e.g., Isaacs & Vargha-Khadem, 1989; Kessels, van den Berg, Ruis, & Brands, 2008) would indeed reflect more severe limitations on the representational capacity or precision of spatial, relative to verbal, WM, rather than a dissociation in the ability to bind serial order to spatial versus verbal representations. In other words, spatial information may be fundamentally more difficult to encode, maintain, and/or retrieve than is verbal information, independently of requirements to remember serial order. This alternative hypothesis leads to the prediction that, whereas removal of the requirement to reproduce the serial order of stimuli should improve performance for both spatial and verbal tasks, spatial task performance should not improve more than verbal task performance improves.

To test these hypotheses, we administered parallel verbal and visuospatial tasks to healthy adults in two experiments. In Experiment 1, in one condition (same order), participants were required to reproduce target items in the same order as that in which those items had been presented. In a separate condition (any order), participants were allowed to reproduce the target items in any order. In Experiment 2, participants were not required to reproduce item sequences. Rather, they were required to judge whether target and test sequences contained the same items in the same serial order.

Experiment 1

Method

Participants

Twenty-four adults (17 females, 7 males; ages 18–30 years, M = 21.9 years, SE = 0.8) provided written, informed consent and received $20 per hour. The study protocol was approved by the Institutional Review Boards of the Johns Hopkins University and the Johns Hopkins Medical Institutions. All participants but one were right-handed, and all reported normal or corrected-to-normal vision and hearing. All participants were naïve as to the purpose of the study.

Apparatus and stimuli

A Dell Inspiron 8200 laptop computer presented stimuli and recorded responses (Berch et al., 1998). Responses in the verbal tasks were made using the top-row number keys “0” through “9,” and a touch screen (KEYTEC, Inc.) was used to record responses (in terms of pixel coordinates) in the spatial tasks. A chinrest located ~38 cm from the display was used to control the visual angle of the stimuli. Custom MATLAB 7.3 code (The MathWorks, Inc.; Brainard, 1997) ensured precise timing of stimulus presentation and data collection. Visual stimuli were presented against a gray background. In the verbal tasks, white digits subtending ~0.9° of visual angle in width and ~1.2° of visual angle in height appeared sequentially at the center of the display. In the spatial tasks (described below), an array of ten filled blue squares (Fig. 1) approximated the layout of blocks used in the standardized spatial span test (Wechsler Adult Intelligence Scale III, The Psychological Corporation, 1997). Some squares were indicated as memory targets by changing color from blue to orange; these colors were isoluminant to minimize afterimages. Each square subtended ~4.4° of visual angle in width and in height, and the maximum eccentricity of the display elements was ~22.8°. Participants wore headphones for auditory-stimulus presentation and to minimize distraction. One auditory stimulus (approximately 600 Hz) provided a response cue, and another (approximately 300 Hz) provided acknowledgment of each detected response (keypress or finger tap). In the verbal tasks, a white dash (–) subtending ~0.9° of visual angle provided a visual response cue as well.
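The stimulus sizes above are given in degrees of visual angle at a fixed viewing distance. As a rough illustration of what those angles imply on the screen, the following sketch applies the standard relation size = 2 · d · tan(θ/2). The function names are hypothetical, and the 38 cm distance is the approximate value stated in the text, so the resulting sizes are approximate as well.

```python
import math


def size_from_visual_angle(angle_deg, distance_cm):
    """Physical extent (cm) that subtends angle_deg at distance_cm."""
    return 2.0 * distance_cm * math.tan(math.radians(angle_deg) / 2.0)


def visual_angle_from_size(size_cm, distance_cm):
    """Visual angle (degrees) subtended by size_cm at distance_cm."""
    return math.degrees(2.0 * math.atan(size_cm / (2.0 * distance_cm)))


# Approximate chinrest distance reported in the text.
VIEWING_DISTANCE_CM = 38.0

# Side length of one square (~4.4 degrees) and digit width (~0.9 degrees).
square_cm = size_from_visual_angle(4.4, VIEWING_DISTANCE_CM)
digit_width_cm = size_from_visual_angle(0.9, VIEWING_DISTANCE_CM)

print(f"square side: {square_cm:.2f} cm")
print(f"digit width: {digit_width_cm:.2f} cm")
```

Under these assumptions each square would be a little under 3 cm on a side.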

Procedure

Participants first were given instructions and demonstrations and then performed four practice blocks, each consisting of five trials. One practice block was presented for each of the four combinations of the two primary independent variables (stimulus type and serial-order requirements, described below). Within the set of practice blocks, stimulus type (verbal, spatial) was presented in a blocked and alternating fashion (ABAB), with the particular stimulus type presented first counterbalanced across participants. Serial-order requirements (same order, any order) were crossed with stimulus type. Practice blocks began with sequences of two items. A staircase adjustment procedure^{Footnote 1} was used whereby the sequence length following a correct/incorrect response sequence was incremented/decremented, respectively, by one item.

Following completion of all four practice blocks, four test blocks (one block per condition) were presented. Each block contained 14 trials and began with a sequence of three items. The sequence length was subsequently adjusted by the staircase procedure as the trial block progressed. As with the set of practice blocks, stimulus type (verbal, spatial) was presented in a blocked and alternating fashion (ABAB), and serial-order requirements (same order, any order) were crossed with stimulus type.
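The staircase rule described above (one item longer after a correct response sequence, one item shorter after an incorrect one) can be sketched as follows. This is a minimal illustration, not the authors' MATLAB code: the function name, the example outcomes, and the floor of one item are assumptions, since the text does not state a minimum sequence length.

```python
def staircase_lengths(outcomes, start_length=3, min_length=1):
    """Return the sequence length presented on each trial, followed by the
    length that would have been presented on one further trial.

    outcomes: iterable of booleans (True = response sequence scored correct).
    min_length: assumed floor; the source does not specify one.
    """
    lengths = [start_length]
    for correct in outcomes:
        step = 1 if correct else -1
        lengths.append(max(min_length, lengths[-1] + step))
    return lengths


# Hypothetical outcomes for one 14-trial test block.
outcomes = [True, True, True, False, True, True, False,
            True, False, True, True, False, True, False]

lengths = staircase_lengths(outcomes)
print(lengths)  # 15 values: 14 presented lengths plus the would-be 15th
```

Because every trial moves the length by one item, the presented lengths tend to oscillate around the longest sequence a participant can reliably reproduce.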

A schematic illustration of the trial structure is shown in Fig. 1. Target stimuli were presented sequentially for 750 ms each, with an interstimulus interval (ISI) of 250 ms. In the verbal tasks, target digits were presented at the center of the screen. In the spatial tasks, each target square in the sequence turned orange for 750 ms, during which time the other nine squares in the array remained blue. All ten squares were blue during the ISI. In both verbal and spatial tasks, 250 ms after the offset of the final stimulus in the sequence, an auditory response cue was presented. In the verbal tasks, a dash appeared at the center of the screen simultaneously with the auditory response cue and persisted while the participant reported the digit sequence by pressing the number keys. In the spatial tasks, the array of blue squares persisted while the participant reported the spatial sequence by tapping squares; squares remained blue throughout the response phase. For both verbal and spatial tasks, an acknowledgment beep was sounded upon detection of each response (keypress or finger tap). Participants were instructed to use only their preferred index finger for all responses.

To minimize the possibility that participants would adopt a (nonoptimal) strategy of remembering which items were not presented, especially in the any-order conditions, each target item had, on average, a .28 probability of being repeated within the same target sequence, and participants were required to reproduce items the correct number of times. That is, in the digit tasks, each key corresponding to a target digit had to be pressed the same number of times that the digit was presented within the target sequence, and in the spatial tasks, each square had to be touched the same number of times that it was presented as a target square within the target sequence. Critically, this repetition procedure was implemented for both same-order and any-order conditions. The repetition is illustrated in the following example digit sequence: 5–2–9–3–2, where the digit 2 is repeated. It should be noted that stimulus repetition is a departure from the standardized span tests. The sets of pseudorandom target sequences that we constructed prior to testing were matched across corresponding verbal and spatial conditions; thus, for example, for the three-item sequence 4–7–2, included in our set of target sequences, the digits 4, 7, and 2 were presented as a digit sequence and target squares at locations 4, 7, and 2 were presented as a spatial sequence (see Fig. 1).^{Footnote 2}

In the same-order conditions, a response sequence was incorrect if it did not match both the identity (digit or location) and serial order of target items (i.e., target items had to be reproduced in the same serial order as that in which they had been presented, including repeated items). In the any-order conditions, a response sequence was incorrect if it included items not presented in the target sequence, omitted target items, and/or included an incorrect number of repetitions. The WM performance score was operationally defined as the mean sequence length of the last ten sequences of each test block, including the sequence length that would have been presented had there been a 15th trial in the block.
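The scoring rules above can be expressed compactly. The sketch below is one possible reading, with hypothetical function names. It treats a same-order response as correct only if it matches the target exactly, and an any-order response as correct only if it contains the same items the same number of times (a multiset comparison). The WM score is then the mean of the last ten sequence lengths out of fifteen, where the fifteenth is the length that would have followed the final trial. The example lengths are invented for illustration.

```python
from collections import Counter


def is_correct(target, response, same_order):
    """Score one response sequence against its target sequence."""
    if same_order:
        # Identity and serial position must both match, including repeats.
        return list(response) == list(target)
    # Any order: same items, each the same number of times.
    return Counter(response) == Counter(target)


def wm_score(lengths, n_last=10):
    """Mean sequence length over the last n_last entries of lengths.

    lengths is assumed to hold the 14 presented lengths plus the length
    that would have been presented on a 15th trial.
    """
    tail = lengths[-n_last:]
    return sum(tail) / len(tail)


target = [5, 2, 9, 3, 2]            # example sequence from the text
reordered = [2, 2, 3, 5, 9]         # all items, correct repetition count
missing_repeat = [2, 3, 5, 9]       # the digit 2 reported only once

print(is_correct(target, reordered, same_order=False))
print(is_correct(target, reordered, same_order=True))
print(is_correct(target, missing_repeat, same_order=False))

# Hypothetical block: 14 presented lengths plus the would-be 15th.
block_lengths = [3, 4, 5, 6, 5, 6, 7, 6, 7, 6, 7, 8, 7, 8, 7]
print(wm_score(block_lengths))
```

Using a multiset rather than a set is what enforces the repetition requirement: a response that lists every distinct target item but drops a repeat is still scored incorrect.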

Data analysis

Repeated measures analyses of variance were conducted on WM scores. In addition, response sequences in the any-order condition were analyzed to reveal the degree to which participants freely reproduced the serial order of target sequences when they were not required to do so. Because we hypothesized that serial order might be more readily bound to verbal than to spatial information in WM, we predicted that participants would freely reproduce serial order more frequently in the digit/any-order task than in the spatial/any-order task.

Results

Replicating the pattern of normative data from standardized span tests (e.g., Isaacs & Vargha-Khadem, 1989; Kessels et al., 2008), performance (Fig. 2) was better for the digit task (95% CI: [6.68, 7.60]) than for the spatial task (95% CI: [6.22, 7.11]) when participants were required to maintain serial order, t(23) = 2.36, p = .03, partial η² = .20. Performance reliably improved when the serial-order requirement was omitted, F(1, 23) = 44.65, p < .001, partial η² = .66; this finding held for both the digit task, t(23) = 2.44, p = .02, partial η² = .21, and the spatial task, t(23) = 5.74, p < .001, partial η² = .59. Of particular importance, however, the improvement in performance was greater for the spatial task than for the verbal task, F(1, 23) = 14.27, p = .001, partial η² = .38. In fact, with removal of serial-order requirements, participants performed reliably better on the spatial task (95% CI: [7.66, 9.23]) than on the verbal task (95% CI: [7.22, 7.83]), t(23) = 2.70, p = .01, partial η² = .24.
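As a consistency check on the effect sizes reported above, partial η² for a one-degree-of-freedom effect can be recovered from the test statistic with the standard conversion η²p = (F · df_effect) / (F · df_effect + df_error), where F = t² for a t test. The sketch below uses hypothetical function names, and it assumes this conversion applies to the reported tests. It reproduces the reported values to within about .01, the small discrepancies being attributable to rounding of the reported statistics.

```python
def partial_eta_squared_from_f(f_value, df_effect, df_error):
    """Partial eta squared from an F statistic and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


def partial_eta_squared_from_t(t_value, df_error):
    """Partial eta squared from a t statistic (equivalent to F = t ** 2)."""
    return partial_eta_squared_from_f(t_value ** 2, 1, df_error)


# (label, reported partial eta squared, recomputed value)
checks = [
    ("digit vs. spatial, same order", 0.20, partial_eta_squared_from_t(2.36, 23)),
    ("effect of order requirement", 0.66, partial_eta_squared_from_f(44.65, 1, 23)),
    ("order requirement, digits", 0.21, partial_eta_squared_from_t(2.44, 23)),
    ("order requirement, spatial", 0.59, partial_eta_squared_from_t(5.74, 23)),
    ("interaction", 0.38, partial_eta_squared_from_f(14.27, 1, 23)),
    ("spatial vs. digit, any order", 0.24, partial_eta_squared_from_t(2.70, 23)),
]

for label, reported, recomputed in checks:
    print(f"{label}: reported {reported:.2f}, recomputed {recomputed:.3f}")
```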

When serial-order requirements were removed, the proportion of correct responses in which serial order was nevertheless reproduced in the digit/any-order task (95% CI: [0.34, 0.62]) was twice that observed in the spatial/any-order task (95% CI: [0.12, 0.37]), as shown in Fig. 3a; the corresponding main effect of stimulus type was reliable, F(1, 23) = 10.78, p = .003, partial η² = .32. The stimulus type × sequence length interaction was not reliable, p > .05.

Because performance improved much more for the spatial task (25%) than for the verbal task (5%) when serial-order requirements were removed, we next conducted supplementary analyses of performance in the spatial/any-order condition to investigate whether participants might have strategically reorganized target items. In particular, we considered that participants may have implemented a chunking strategy whereby subsets of items were grouped into local spatial configurations (Bor, Duncan, Wiseman, & Owen, 2003), a strategy that would tend to result in a reduction of the mean distance between successive locations reported in the recall phase, relative to the mean distance between successive locations presented in the corresponding target sequences. In contrast, if participants simply had benefited from the reduced memory load conferred by removal of serial-order requirements and had not strategically reorganized target items, there should be no reliable reduction in the mean distance between successive items at the recall phase. We therefore calculated the mean interitem distance between successive squares touched during correct response sequences in which serial order was not reproduced (i.e., when target locations were touched the correct number of times, but in orders different from those of target sequences). We then compared this distance with the mean distance between successive targets presented on the very same trials (Fig. 3b). For all sequence lengths from 3 to 11 items, the mean interitem distance was reliably shorter for response sequences than for the corresponding target sequences (Wilcoxon signed-rank tests: Ws ≤ 11, ps ≤ .02).^{Footnote 3} Furthermore, when we omitted immediate repetitions of target and response locations from mean distance calculations, this pattern held for all sequence lengths except for 9-item sequences, for which the difference between mean target and mean response distances just failed to reach significance, W = 24, p = .07.
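The interitem-distance measure described above can be sketched as the mean Euclidean distance between successive locations in a sequence. The coordinates below are hypothetical placeholders, not the actual layout of the ten squares, and the function name is an assumption. The option to skip immediate repetitions is one reading of "omitted immediate repetitions": steps between two identical successive locations, which contribute a distance of zero, are dropped before averaging.

```python
import math


def mean_interitem_distance(locations, coords, skip_immediate_repeats=False):
    """Mean Euclidean distance between successive locations in a sequence.

    locations: sequence of location labels in presented or touched order.
    coords: mapping from label to (x, y) position.
    skip_immediate_repeats: if True, ignore steps where the same location
        occurs twice in a row (assumed reading of the supplementary analysis).
    """
    steps = list(zip(locations, locations[1:]))
    if skip_immediate_repeats:
        steps = [(a, b) for a, b in steps if a != b]
    if not steps:
        return 0.0
    total = sum(math.dist(coords[a], coords[b]) for a, b in steps)
    return total / len(steps)


# Hypothetical coordinates (arbitrary units) for four locations.
coords = {1: (0, 0), 2: (3, 0), 3: (3, 4), 4: (0, 4)}

target_order = [1, 3, 2, 4]     # presented order, crossing the array
response_order = [1, 2, 3, 4]   # same items, reordered along the perimeter

target_mean = mean_interitem_distance(target_order, coords)
response_mean = mean_interitem_distance(response_order, coords)
print(target_mean, response_mean)
```

In this toy example the reordered response has the shorter mean step, which is the pattern the analysis above tests for across correct any-order trials.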

Because interitem distances may have been strategically minimized to reduce the duration of recall responses (a possibility discussed below), we also analyzed the interresponse intervals (time between successive touch responses), response sequence durations, and the time between the recall cue and the first response, for all correct responses in the spatial/any-order condition. We compared these measures between sequences in which the serial order of targets was freely reproduced and sequences in which participants reordered items. Sequence lengths from three to eight items were analyzed because they were the only ones that provided observations for both types of response sequences. None of the three measures was reduced for reordered sequences: mean interresponse interval (serial order reproduced, M = 586 ms, SE = 52; serial order not reproduced, M = 589 ms, SE = 44), mean recalled-sequence duration (serial order reproduced, M = 3.7 s, SE = 0.3; serial order not reproduced, M = 3.9 s, SE = 0.2), and mean latency from the recall cue to the first response (serial order reproduced, M = 1,114 ms, SE = 129; serial order not reproduced, M = 1,291 ms, SE = 109). Differences between these two sequence types also did not vary systematically with sequence length.

Discussion

In order to test whether serial order is more readily bound to verbal than to spatial representations in WM, we compared performance on digit span and spatial span tests that either required participants to reproduce the serial order of target items or allowed them to recall target items in any order. Perhaps unsurprisingly, performance improved when we removed the requirement to reproduce serial order. However, of particular theoretical importance, Experiment 1 revealed a larger improvement in spatial than in verbal performance. Furthermore, participants freely reproduced serial order twice as often in the verbal task as in the spatial task. In the absence of serial-order requirements, participants also reduced the mean distance between items correctly reproduced at the recall phase, relative to the mean distance between targets presented in the very same trials, suggesting strategic reorganization. In sum, several findings of Experiment 1 indicate that serial order is more readily bound to verbal than to spatial representations in WM.

We next conducted an experiment in which recognition test versions of the WM tasks (cf. Smyth & Scholey, 1996a, b) were administered to test a corollary prediction: If serial order is more difficult to bind to spatial than to verbal representations in WM, participants should fail to detect changes in serial order more frequently for spatial than for verbal sequences. On recognition test trials, a target sequence was followed by a test sequence that was identical to the target sequence, contained a nontarget item that replaced a target item, or consisted of the same target items but presented in a different order. Furthermore, because participants indicated whether the test sequence matched or did not match the target sequence simply by pressing one of two response keys, motor processing requirements were minimized and equated across the verbal and spatial recognition tasks, thereby eliminating the possibility that differences in demand for overt motor control or sequence generation could account for any observed differences in performance across stimulus modalities.