The capacity of visual working memory is surprisingly limited. Although we encounter thousands of objects during our daily lives, we can remember only a few pieces of information at any moment (Cowan, 2001, 2010). Some research has suggested that this capacity limit is closely related to our ability to resolve the interference from competing information (Bartsch & Oberauer, 2023; Endress & Szabó, 2017; Oberauer & Lin, 2017; Shipstead & Engle, 2013). For example, to identify a lemon, we must correctly associate its shape (“oval”), color (“yellow”), and location (“on the table”) while distinguishing the lemon from objects with similar features, such as a lime or an apple. This interference from overlapping features can be minimized by forming distinct object representations through feature binding (Cowell et al., 2019; Hedayati et al., 2022; A. Y. Li et al., 2022; Oberauer, 2019; Schneegans & Bays, 2017; Swan & Wyble, 2014). When feature binding fails, binding errors can occur (Treisman, 1996): between-item interference can result in the incorrect combination of features belonging to different objects. These errors manifest commonly in everyday life, such as misremembering the proper fruit for a dinner recipe, but also in more severe conditions, such as Alzheimer’s disease, which has been associated with deficits in visual binding (Parra et al., 2009).

The study of feature binding in working memory has a rich history in the psychological sciences. Early influential models considered binding to operate over independent feature maps, with attention serving as the “glue” between features sharing common spatial locations (Treisman & Gelade, 1980). This seminal work led to debates about the format of the object representation, with experiments positing that shape and color features can be bound directly to each other (Luck & Vogel, 1997; Zhang & Luck, 2008), bound by virtue of space or time (Schneegans et al., 2022; Schneegans & Bays, 2017), or bound hierarchically as both integrated wholes and independent features (Hedayati et al., 2022; A. Y. Li et al., 2022). Converging findings reveal that the format of the object representation can likely include multiple forms of binding, such that features may be initially bound to shared spatial maps but are then bound to form object representations that no longer depend on space (Shepherdson et al., 2022). This view has support from neuroimaging research in which early posterior neocortex may represent features bound to location (Henderson et al., 2022; Hubel & Wiesel, 1962; M. Li et al., 2014; Schneegans & Bays, 2017; Sprague & Serences, 2013; Thyer et al., 2022), with fully specified objects bound to spatial context represented in anterior regions like the medial temporal lobes (Cooper & Ritchey, 2019; Cowell et al., 2019; Liang et al., 2020; Martin et al., 2018; Wu & Buckley, 2022; Yeung et al., 2013; Yeung et al., 2017; Yeung et al., 2019; Yonelinas et al., 2019). Critically, this body of work suggests a division between feature-to-location binding, which may depend on posterior regions of the neocortex, and object-to-location binding, which may depend on anterior regions such as the medial temporal lobes.

To study feature binding, working memory researchers often quantify the errors that occur when binding fails. For example, one influential procedure quantifies feature-to-location binding errors using continuous reconstruction tasks (Ma et al., 2014). In a typical variant of this task, a number of colored squares are first shown during the study phase of an experiment. After a delay period, participants are then cued to reconstruct the color that was studied at an indicated location. Participants select the target color by moving their mouse cursor along a circular color wheel. Studies like these quantify binding errors as the proportion of participants’ final selections that correspond with the uncued, nontarget colors, which have been incorrectly associated with the target location. When estimated using mixture models, these nontarget feature responses are known as “swap” errors (see Bays, 2016; Bays et al., 2009). This work typically finds that feature-to-location binding errors increase in tandem with memory load, suggesting that failures in feature integration are more likely to occur as participants attempt to hold increasingly more information in mind. However, while existing research in working memory has most often studied binding errors using simpler features like color or orientation when cued by location (i.e., feature-to-location binding errors), it remains an open question how memory load may influence binding errors for more complex objects during working memory (i.e., object-to-location binding errors).

One previous method to study complex feature binding errors has been through eye tracking (Barense et al., 2012; Erez et al., 2013; Ryan et al., 2007; Yeung et al., 2013; Yeung et al., 2017; Yeung et al., 2019), which provides rich information about the internal representation at millisecond temporal resolution (for reviews, see Hannula et al., 2010; Kragel & Voss, 2022; Ryan et al., 2020; Voss et al., 2017; Wynn et al., 2019). For example, object-to-location binding errors can be measured as the proportion of eye fixations made towards the relevant target (i.e., previously studied objects) compared with irrelevant nontarget objects (i.e., similar lures) over the entire gaze trajectory. This body of work has found that patients with medial temporal lobe damage (Barense et al., 2012; Erez et al., 2013; Ryan et al., 2000) and older adults at risk for Alzheimer’s disease (Yeung et al., 2013; Yeung et al., 2017; Yeung et al., 2019) have aberrant viewing behavior compared with healthy older adults. Populations with medial temporal lobe damage often fail to direct their gaze towards important locations in scenes, evidence that these regions may be essential for successfully binding complex objects to locations within spatial environments (Ryan et al., 2000; Yeung et al., 2019). Analyzing the continuous trajectory of eye movements can therefore be used to quantify object-to-location binding errors, operationalized as the fixations made towards the location of irrelevant objects in a spatial environment.

The majority of existing working memory studies quantify binding errors using only the final participant response during the test phase of an experiment (Fig. 1a) without considering the entire trajectory of navigating to those responses (Fig. 1b; but see Hao et al., 2021; Park & Zhang, 2022, for recent examples of mouse tracking in a working memory experiment). The standard approach may exclude important information, akin to analyzing only the final gaze position during eye tracking without considering the eye movements that form the path leading up to the final gaze position (see Barense et al., 2012; Erez et al., 2013; Golomb et al., 2008; Golomb & Kanwisher, 2012; Golomb et al., 2014; Hannula et al., 2010; Liu et al., 2017; Wynn et al., 2020; Yeung et al., 2017; Yeung et al., 2019, for examples of eye-tracking experiments that consider the scan path across the entire trial). Furthermore, existing working memory research has primarily focused on simpler feature-to-location binding errors, limiting our understanding of how binding operates over more complex object features in working memory. Inspired by eye-tracking methodologies, we quantified object-to-location binding errors in the present study not only from the final response at test but also as participants continuously reconstructed shape–color objects using mouse tracking. This approach enabled us to convert a dependent variable with a single observation each trial (i.e., the final response on a cognitive task) to a dependent variable with hundreds of observations each trial (i.e., time-series data at millisecond temporal resolution), providing a rich characterization of participant behavior leading up to the final response at test.

Fig. 1
figure 1

a Simultaneous shape–color reconstruction task. To ensure that the task could be displayed on most participant computers, the task appeared in a box spanning 1,080 × 1,080 pixels. Participants were asked to remember shape–color objects during the study phase for 2,000 ms (set size was manipulated according to Fig. 3b). After an ISI of 300 ms, a mask appeared at the positions of the studied objects (300 ms). There was then a retention interval of 1,000 ms between the masks and the cue where no items were shown. The location of one study object was then cued by a fixation cross (500 ms), and the test phase display appeared with the mouse cursor positioned at the centre of the screen. Participants reconstructed the cued, target shape and color along a 2D response space (untimed; shown enlarged in panel b; for more details, see A. Y. Li et al., 2022). While the participant was reconstructing the target object, the program displayed their reconstruction corresponding to their mouse cursor position on shape and color space at the cued location (see video example: https://osf.io/ycq5s). Thus, participants were required to match the target object from memory with what is perceived on the display using the mouse cursor. Critically, along the circle’s circumference, we mapped the Validated Circular Shape Space (A. Y. Li et al., 2020), and along the radius we mapped a circular color wheel sampled from CIELAB color space. b We recorded the participant mouse movement (gray line) continuously during the entire test phase as participants reconstructed objects. In this way, we could quantify the mouse movements made towards target and nontarget object features dynamically during object reconstruction. On the above, θ refers to angular distance from the target shape, whereas r refers to radial distance from the target color. (Color figure online)

To test object-to-location binding errors in working memory, we adapted a simultaneous reconstruction task to include online mouse tracking (Fig. 1b; A.Y. Li et al., 2022). Drawing from previous eye-tracking literature, we define object-to-location binding errors as the between-item interference stemming from irrelevant lure objects from study. We recorded the mouse path trajectories every trial as participants reconstructed objects (analogous to recording gaze trajectories from an eye-tracking experiment). Moreover, we developed our task as a downloadable executable file so that participants could access the task online and run it on their own machines, and we tested the data reliability and testing efficiency of this experimental method when paired with virtual conferencing rooms. Thus, in addition to studying the effect of memory load on object-to-location binding errors for more complex shape–color objects than commonly tested in the literature (see Ma et al., 2014), this work provides a novel online approach to quantify mouse-path trajectories while participants reconstruct objects from memory.

Method

We first examined the data reliability of online executables when paired with virtual conference rooms (Fig. 2), attempting to replicate previous memory load findings from the literature (A. Y. Li et al., 2022; Ma et al., 2014; Sone et al., 2021). In the individual testing condition, participants downloaded an executable program on their own computers, and then completed the memory task (Fig. 3c) one-on-one with the experimenter in virtual conference calls (i.e., over Zoom; Fig. 3a). Critically, we also examined the testing efficiency of our virtual conference room approach, testing many participants in groups within the concurrent testing condition (Fig. 3b). Having established the efficacy of our online approach, we then conducted mouse-tracking trajectory analyses to understand how participant reconstruction behavior may be influenced by memory load. Finally, we examined the influence of memory load on object-to-location binding errors during object reconstruction. For details about converting Python-based experiments into a downloadable executable, including a full video demo, see a tutorial and commented code on GitHub: https://github.com/james-y-yuan/executable-pipeline.
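For illustration, the compilation step can be scripted directly from Python using PyInstaller's programmatic interface. The snippet below is a minimal sketch rather than our exact build configuration; the script name and executable name are placeholders (see the GitHub tutorial above for the tested pipeline):

```python
# Minimal sketch of compiling a Python experiment into a single-file
# executable with PyInstaller. Script and executable names are placeholders.
import PyInstaller.__main__

PyInstaller.__main__.run([
    "experiment.py",           # the local experiment script (placeholder name)
    "--onefile",               # bundle everything into one executable file
    "--windowed",              # suppress the console window when launched
    "--name", "memory_task",   # name of the resulting executable (placeholder)
])
```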

Fig. 2
figure 2

Executable pipeline paired with online virtual conferencing. a The first step involves building a local experiment (e.g., using Python). Several adjustments to the experiment may be needed for testing participants online (see https://github.com/james-y-yuan/executable-pipeline for details). b To store the data, options include secure cloud services (e.g., Dropbox) or a lab server. Alternatively, executables can save data to the participant’s local machine, and participants can manually upload the data to a secure cloud service. At this stage, researchers must carefully consider security and privacy concerns (see Discussion for further considerations). c To create the executable, open-source packages (e.g., PyInstaller) automatically compile the local experiment. To reduce concerns with installation and firewalls blocking the program, the experimenter can also include an installer as well as digitally sign the program. d Finally, the executable can be downloaded, installed, and run by participants. In our study, participants ran the task on their own machines during a synchronous virtual call to improve data quality at the beginning of our experiment, but this step could be completed asynchronously depending on the research question. (Color figure online)

Fig. 3
figure 3

Experimental procedure. Participants across both (a) individual testing and (b) concurrent testing studied to-be-remembered shape–color objects in (c) trials that varied in memory load (one object, two objects, or three objects). During the study phase (2,000 ms), participants studied either one, two, or three objects. The objects were randomly sampled to be separated by a minimum of 60 degrees on shape and color space each, such that visual similarity was explicitly controlled. We compared individual testing (a), where a single participant completed the task while they were in a virtual conference room with the experimenter, to concurrent testing (b), where many participants completed the task while they were in the same virtual conference room with the experimenter. In this way, we examined whether we could scale data collection using concurrent testing in which many participants were present in the same virtual conference call (i.e., over Zoom), and whether a downloadable executable method of online testing could replicate previous memory findings from the lab (A. Y. Li et al., 2022). Furthermore, we could examine whether binding errors occur throughout object reconstruction using a novel online mouse-tracking task. (Color figure online)

Participants

Sixty participants were recruited from the University of Toronto and from the community (Mage = 20.41 years, SDage = 3.83 years, 31 females). Thirty participants (Mage = 21.97 years, SDage = 4.90 years, 20 females) were tested individually (Fig. 3a), and 30 participants (Mage = 18.90 years, SDage = 1.37 years, 12 females) were concurrently tested (Fig. 3b). Students received course credit, and community members were compensated with $10/hour CAD.

Procedure

We incorporated mouse-tracking into the simultaneous reconstruction task, such that the position of the mouse cursor was recorded approximately once every 20 ms (for more details about the task design, see A. Y. Li et al., 2022; Fig. 1). Shapes were sampled from the Validated Circular Shape Space (A. Y. Li et al., 2020), and colors were derived from a circle defined on CIELAB color space (L = 70, a = 20, b = 38, radius = 60; Zhang & Luck, 2008). To ensure that there were no systematic mappings between particular features and locations on the task, we jittered shape mappings by participant and color mappings by trial, following previous work (A. Y. Li et al., 2022).
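For readers implementing a similar task, the cursor-sampling logic can be sketched as follows. This is a simplified illustration assuming a pygame event loop, not our actual task code (which is documented on GitHub):

```python
# Simplified sketch (not our exact task code) of polling the mouse cursor
# approximately once every 20 ms during the untimed test phase with pygame.
# Assumes pygame has been initialized and a display window is open.
import pygame

def record_mouse_path(poll_ms=20):
    """Return (timestamp_ms, x, y) cursor samples until the participant clicks."""
    clock = pygame.time.Clock()
    path = []
    responding = True
    while responding:
        for event in pygame.event.get():
            if event.type == pygame.MOUSEBUTTONDOWN:
                responding = False  # the mouse click registers the final response
        x, y = pygame.mouse.get_pos()
        path.append((pygame.time.get_ticks(), x, y))
        clock.tick(1000 // poll_ms)  # cap the loop at ~50 Hz, one sample per ~20 ms
    return path
```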

To manipulate memory load, participants were presented with twenty trials each of one, two, or three shape–color objects in a random order (Fig. 3c). Objects were displayed within a fixed 1,080 × 1,080-pixel square, coded using absolute coordinates so that visual images were never stretched or distorted across different monitor resolutions (see Fig. 3c). Object locations were sampled randomly so that no objects overlapped any other objects. The shapes and colors of the objects were sampled from VCS space and CIELAB space, respectively. When sampling from each feature space, values for a given trial were chosen from a set of six points spread equidistantly across the circle (i.e., 60 degrees apart). Thus, for every trial, all objects were at least 60 degrees different in shape and color from every other object, so that the visual similarity of objects was always tightly controlled. This sampling approach also ensured that the target object’s shape and color features were random for each trial.
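A minimal sketch of this sampling scheme is shown below. The random rotation of the six-point grid is our illustrative way of implementing the constraint that feature values were random each trial; the exact implementation may differ:

```python
# Sketch of the trial sampling constraints: six candidate points spaced
# 60 degrees apart on each circular feature space, randomly rotated per
# trial, sampled without replacement so all objects differ by >= 60 degrees.
import random

def sample_trial_features(set_size):
    """Return (shape_deg, color_deg) pairs for one trial."""
    shape_offset = random.uniform(0, 360)  # random rotation of the grid
    color_offset = random.uniform(0, 360)
    shape_points = [(shape_offset + 60 * i) % 360 for i in range(6)]
    color_points = [(color_offset + 60 * i) % 360 for i in range(6)]
    shapes = random.sample(shape_points, set_size)  # without replacement
    colors = random.sample(color_points, set_size)
    return list(zip(shapes, colors))
```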

Each trial proceeded in the typical fashion of continuous reconstruction tasks (see Fig. 1a). After the initial study phase of 2,000 ms, there was an ISI of 300 ms. The masks then appeared at the location of the studied objects for 300 ms. After a retention interval of 1,000 ms, a cue appeared at the location of one of the studied objects (500 ms). The test phase display then appeared, with the mouse cursor positioned at the centre of the screen. Participants used their mouse to reconstruct the target shape–color object that was cued at the indicated location (Fig. 3c). At the onset of the initial mouse movement, the reconstructed object appeared at the cued location corresponding to the position of the mouse cursor on the test display. This portion of the task was untimed, and we recorded the mouse movement throughout the entire test phase until the participant made a response (Fig. 1b). Thus, participants completed each trial by matching their memory of the target object with what was perceived on the display as the mouse cursor moved along shape and color space (for a video, see: https://osf.io/ycq5s). Upon completion of the task, which ranged from 30 to 60 minutes, the anonymized data were uploaded to our lab server (Fig. 2), and participants were debriefed verbally and compensated. To ensure that our task could be reasonably completed online within this time frame, each set size condition included 20 trials (see Discussion for more information about this methodological decision).

In the individual testing condition, we completed the task across 30 sessions of one-on-one virtual conference calls (Fig. 3a). In the concurrent testing condition, we completed the task across six sessions with multiple participants per group (Fig. 3b). We tested up to 18 participants in the same virtual conference call, with the other sessions containing up to five participants each. The concurrent testing condition was intended to mimic a typical online study, where an experimenter may wish to recruit many participants on a rolling basis.

Statistical analysis

Online task data reliability

To determine the data reliability of our online executable pipeline using the simultaneous reconstruction task (Fig. 1), we first examined whether we could replicate previously observed memory load effects from the lab (e.g., A. Y. Li et al., 2022). We then tested the efficiency of our virtual conference room approach by comparing memory performance between the individual and concurrent testing conditions (Fig. 3).

Memory performance was quantified as error, defined as the absolute angular distance between the reconstructed feature and target feature on the individual circular shape and color spaces (see Fig. 1b). More specifically, we defined fine-grained feature responses as those for which shape or color error was less than or equal to 15 degrees. Furthermore, we defined fine-grained object responses as those for which both shape and color error were less than or equal to 15 degrees (see A. Y. Li et al., 2022). We predicted that increases in memory load should lead to decreases in fine-grained responses, in line with previous findings in the literature (A. Y. Li et al., 2022; Ma et al., 2014; Sone et al., 2021).
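Concretely, the error metric and the fine-grained criterion can be expressed in a few lines of Python (a sketch of the definitions above, not our analysis scripts):

```python
# Sketch of the error metric: absolute angular distance on a 360-degree
# circular feature space, and the fine-grained criterion of <= 15 degrees.
def circular_error(response_deg, target_deg):
    """Absolute angular distance in degrees, in the range [0, 180]."""
    diff = abs(response_deg - target_deg) % 360
    return min(diff, 360 - diff)

def is_fine_grained_object(shape_resp, shape_targ, color_resp, color_targ):
    """Fine-grained object response: both features within 15 degrees."""
    return (circular_error(shape_resp, shape_targ) <= 15 and
            circular_error(color_resp, color_targ) <= 15)
```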

Separate random-intercept linear mixed models were used to predict fine-grained shape, color, and object responses from memory load (see Magezi, 2015, for how within-subject psychology experiments can be analyzed using linear mixed models). For each model, fine-grained shape, color, or object responses were used as the dependent variable, and memory load (Set Size 1, 2, or 3) was defined as the fixed factor. In a follow-up analysis, to directly compare between individual and concurrent testing conditions (Fig. 3), we additionally included testing condition as a fixed factor. In all analyses, individual participants were included as a clustering variable to account for the within-subject design. All linear mixed model analyses were conducted using jamovi (The Jamovi Project, 2021).
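For readers who prefer a scripted workflow, an equivalent random-intercept model can be specified in Python with statsmodels. This is a sketch assuming a long-format DataFrame with hypothetical column names; the analyses reported here were conducted in jamovi:

```python
# Sketch of an equivalent random-intercept linear mixed model in Python
# (the reported analyses were run in jamovi). Column names are hypothetical:
# 'fine_grained' (dependent variable), 'set_size' (fixed factor), and
# 'participant' (clustering variable).
import statsmodels.formula.api as smf

def fit_memory_load_model(df):
    model = smf.mixedlm("fine_grained ~ C(set_size)",
                        data=df, groups=df["participant"])
    return model.fit()
```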

Mouse-tracking trajectory analysis

Once the reliability of our online approach had been established, we then conducted a series of mouse-tracking trajectory analyses. We first calculated descriptive results, including the onset time of the mouse movement as well as the mean and variability of the trial duration across mouse trajectories. We next analyzed the continuous trajectory, including only the samples recorded after the mouse cursor began moving and excluding the period when the cursor remained stationary at the centre of the screen at the start of the test phase.

Because we recorded the position of the mouse cursor approximately once every 20 ms, and because the duration of trials was not fixed, the number of data points varied across trials based on their duration (e.g., a 1,000 ms trial would give us 50 data points, whereas a 2,000 ms trial would give us 100 data points). Thus, even two paths that follow an identical trajectory over shape–color space will differ in the number of data points collected if their durations are not the same (see Fig. 5a for a visual depiction). For this reason, we linearly interpolated each trajectory to the same path length to normalize the duration of all mouse paths (Fig. 5b). This approach allowed us to project the trajectories across all trials of the experiment onto the same axis. We then converted the shape–color object corresponding to the position of a given mouse cursor at each time point into error (i.e., how far that position was from the target position of the shape–color object; Fig. 1b). By depicting the normalized trajectories in terms of error from the target, we could graphically visualize participant object reconstruction behavior over relative positions along the trajectory (Fig. 5). Put simply, we visualized each point along the trial in a manner akin to how the final response is typically depicted (Fig. 4).
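The interpolation step itself is straightforward; a minimal sketch using NumPy is shown below:

```python
# Sketch of the normalization step: linearly interpolate a variable-length
# error trace (one value per ~20-ms sample) onto a fixed number of points
# spanning 0-100% of the trajectory, so all trials share the same axis.
import numpy as np

def normalize_trajectory(errors, n_points=100):
    errors = np.asarray(errors, dtype=float)
    old_positions = np.linspace(0.0, 1.0, len(errors))
    new_positions = np.linspace(0.0, 1.0, n_points)
    return np.interp(new_positions, old_positions, errors)
```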

Fig. 4
figure 4

Results from individual and concurrent testing manipulations. a–f Qualitatively, there was little difference between the overall response error distributions between conditions for the shape and color features across each method of testing. g We replicated a previously described memory effect, such that fine-grained shape (g), color (h), and object responses (i) decreased as the memory load increased (all ps < .001). Critically, there was no difference in any comparison between individual testing compared with concurrent testing, with all fine-grained responses replicated within 4% across conditions. These results demonstrate the data reliability of online executables when paired with a synchronous virtual conference call. Furthermore, these results suggest that testing many participants in the same virtual call during concurrent testing can be a scalable method that increases testing efficiency. Error bars refer to 95% confidence interval for the mean. *** refers to p < .001 and n.s. refers to not significant. (Color figure online)

In the final trajectory analysis, we analyzed how the normalized mouse paths changed across relative positions along the trajectory as a function of memory load (Fig. 6). We visualized the mouse trajectories for each participant in terms of shape errors and color errors separately, extending our previous visualization which projected errors along shape–color space together onto the same axis (Fig. 5). The dependent variable in our linear mixed models was the shape error and color error of each data point along the normalized mouse trajectory. Fixed factors were memory load (one object, two objects, three objects) and relative position in the trial (start, 25%, 50%, 75%, and final response). This approach allowed us to statistically assess the magnitude of the error across memory load conditions and the defined positions along the trajectory. As clustering variables, we included information about relative position, individual trials, and participants.

Fig. 5
figure 5

Mouse-tracking trajectory analysis. a Example trials with identical trajectories but different duration. Because data points are collected every 20 ms, even identical trajectories will produce a different number of data points if the mouse travels at different speeds. b For this reason, we used linear interpolation to normalize all trials of the experiment so that they could be visualized together onto the same axis. We then converted the shape–color object corresponding to the current position of the mouse cursor at each time point as error from the target shape–color object (Fig. 1b). c–e This approach produces multivariate error distributions for each set size across relative positions along the trajectory (start, 25%, 50%, 75%, final). As displayed above, errors along shape–color space gradually cluster near zero error as the mouse trajectory reaches the final response. Mouse trajectories are visualized as shape error (y-axis) and color error (z-axis) in terms of the relative position along the trajectory (x-axis). (Color figure online)

Quantifying object-to-location binding errors

Next, we examined whether memory load influences object-to-location binding errors during object reconstruction; see Fig. 7. For each trial, we identified the number of times a participant’s mouse hovered over nontarget, uncued objects presented during the study phase (i.e., a nontarget lure object; Fig. 7a). More specifically, any period in time, of any duration, in which the mouse cursor was within 15 degrees of both shape and color error from a nontarget object was defined as a “nontarget mouse hover” (see example: https://osf.io/7tfn6/). We report this analysis only for trials with one nontarget object (Set Size 2) and two nontarget objects (Set Size 3), as only these trials presented participants with nontarget objects. The benefit of this approach is that we can examine the trajectory leading up to the final response at test, akin to the analysis of eye-tracking experiments that quantify the interference between target and nontarget object lures (Barense et al., 2012; Erez et al., 2013; Yeung et al., 2013; Yeung et al., 2017; Yeung et al., 2019). Thus, mouse tracking provides a potentially sensitive marker of object-to-location binding errors that occur during object reconstruction.
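The hover criterion can be sketched as follows, reusing the circular_error helper sketched in the previous section. We count discrete entries into a nontarget's window, of any duration, consistent with the definition above:

```python
# Sketch of the nontarget hover count: a hover is registered whenever the
# cursor enters the window within 15 degrees of a nontarget's shape AND
# color; each entry counts once, regardless of its duration.
def count_nontarget_hovers(path, nontargets, criterion=15):
    """path: (shape_deg, color_deg) cursor samples; nontargets: lure objects."""
    hovers = 0
    inside = [False] * len(nontargets)
    for shape_pos, color_pos in path:
        for i, (nt_shape, nt_color) in enumerate(nontargets):
            near = (circular_error(shape_pos, nt_shape) <= criterion and
                    circular_error(color_pos, nt_color) <= criterion)
            if near and not inside[i]:
                hovers += 1  # a new entry into this nontarget's window
            inside[i] = near
    return hovers
```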

We used the same linear mixed models described previously, except we predicted nontarget hovers from memory load while including potentially confounding factors in the model. We included trial duration as a fixed effect, as mouse paths that span longer durations tend to move over broader areas of the available shape–color space, inflating the chance of random hovers over nontargets. We also included the overall variability of the mouse path trajectory over shape and color spaces as fixed effects, which would similarly cause systematic increases in hovers due to mouse paths spanning a broader area.

Ruling out alternative explanations

To ensure that our measure of nontarget mouse hovers reflects actual object-to-location binding errors that increase with memory load rather than random hovers driven by chance, we accounted for several alternative explanations. First, our operationalization of mouse hovers is systematically influenced by the number of objects in a trial. This is because mouse hovers will be more likely to occur by chance on trials with two nontargets (Set Size 3) compared with one nontarget (Set Size 2), given that twice as much area counts as a hover when two nontargets are present (see Fig. 7a). Thus, if more hovers are observed in Set Size 3 than in Set Size 2, this could be attributed merely to more mouse movements being captured as nontarget hovers by chance alone. For this reason, we conducted a control analysis that equated the area that could be counted as a nontarget hover across set sizes. Using the same sampling parameters as the actual nontargets (see Procedure), we generated an “invisible” nontarget in the Set Size 2 trials and then counted the number of mouse hovers over both the actual nontarget and the invisible nontarget. By doing so, the area counted as a hover in Set Size 2 is made equal to the area in Set Size 3.
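One way to implement this control, assuming the six-point candidate grids described in the Procedure, is sketched below; the helper draws the invisible nontarget from the trial's unused grid points so that it obeys the same 60-degree separation constraint:

```python
# Sketch of the "invisible" nontarget control for Set Size 2 trials: draw
# an extra feature location from the trial's unused grid points so it obeys
# the same sampling constraints but is never displayed to participants.
import random

def make_invisible_nontarget(shape_points, color_points, used):
    """shape_points / color_points: the trial's six-point candidate grids;
    used: (shape_deg, color_deg) pairs assigned to the displayed objects."""
    used_shapes = {s for s, _ in used}
    used_colors = {c for _, c in used}
    shape = random.choice([p for p in shape_points if p not in used_shapes])
    color = random.choice([p for p in color_points if p not in used_colors])
    return (shape, color)
```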

Next, we accounted for the theoretical possibility that an increase in mouse hovers over nontarget objects might be driven by random guessing behavior. Whereas the previous analysis controlled for the random hovers driven by chance from our analytical approach, the present analysis controlled for the random hovers driven by chance from participant mouse movements. That is, perhaps more hovers could be observed at higher memory loads not because of binding errors, but because participants tend more often to forget the target object and resort to a guess (e.g., Zhang & Luck, 2008, 2009). This possibility is usually addressed by fitting a uniform distribution from a mixture model, derived from the final responses at test across all trials of an experiment. Here, we addressed the possibility of random guessing behavior trial-by-trial during object reconstruction. If participants select a random shape–color object on the stimulus space (Fig. 7a), we expect to find more hovers at any given point on the shape–color space, not only over nontargets. Thus, we counted the number of hovers over the points directly opposite to nontarget objects on shape–color space each trial, which should increase with memory load only if random guessing is involved. By contrast, we expected that increases in nontarget mouse hovering could not be explained by random guesses at higher memory loads, and so we predicted that hovers over nontargets should be unrelated to hovers over a point on the opposite side of shape–color space. Thus, in a subsequent linear mixed model control analysis, we included the hovers over the opposite side of shape–color space as a fixed effect to control for the possibility of random guessing behavior.
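This control reduces to counting hovers over the point diametrically opposite each nontarget, which can reuse the hover counter sketched above:

```python
# Sketch of the guessing control: count hovers over the points directly
# opposite the nontargets (180 degrees away on both feature dimensions).
def count_opposite_hovers(path, nontargets, criterion=15):
    opposite = [((s + 180) % 360, (c + 180) % 360) for s, c in nontargets]
    return count_nontarget_hovers(path, opposite, criterion)
```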

Finally, we examined whether object-to-location binding errors occurred throughout the entire period of object reconstruction, providing a rich characterization of participant behavior leading up to the final response at test. We compared the frequency of hovers per trial in the first 75% of the trial’s mouse path (i.e., object reconstruction) to the final 25% of the trial’s mouse path (i.e., the final response at test). As longer mouse path lengths provide more opportunities to hover over nontarget objects, we normalized the hover frequencies to ensure the results were in proportion to the entire mouse path. Specifically, we multiplied the hover frequency over the first 75% of the mouse path by 4/3, and the frequency over the final 25% of the mouse path by 4, so that all hover frequencies we report in the results are in proportion to 100% of the mouse path.
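In code, this normalization is a simple rescaling of the two segment counts:

```python
# Sketch of the path-length normalization: rescale hover counts from each
# segment so both are expressed in proportion to 100% of the mouse path.
def normalize_hover_counts(hovers_first_75, hovers_final_25):
    return {"reconstruction": hovers_first_75 * 4 / 3,  # 75% of path -> 100%
            "final_response": hovers_final_25 * 4}      # 25% of path -> 100%
```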

Results

For anonymized data, see https://osf.io/a4vsb/, and a tutorial for creating downloadable executables is available at https://github.com/james-y-yuan/executable-pipeline. See the raw error distributions for shape and color features from the final response at test in Fig. 4a–f (i.e., the raw error distributions akin to a typical continuous reconstruction task; Ma et al., 2014), and the error distributions over the entire mouse trajectory in Figs. 5 and 6.

Fig. 6
figure 6

Shape and color error trajectories are shown separately as a function of memory load. During object reconstruction, participants initially moved rapidly along shape (a, b, c, g) and color space (d, e, f, h) before slowing down, an effect that interacted with memory load and the relative positions along the trajectory (g, h). Critically, because distance on circular space is a proxy for visual similarity, we know that participants moved more rapidly over dissimilar shapes and colors (i.e., higher error) compared with similar shapes and colors (i.e., lower error). Indeed, the slope of errors from the first half of the trajectory differed from the final half of the trajectory and was influenced by memory load. These results provide evidence that decision uncertainty was partially driven by interference between working memory (i.e., the target object in memory) and the similarity of the currently perceived object (i.e., the reconstructed object corresponding to the current position of the mouse cursor). Error bars refer to 95% confidence interval for the mean. *** refers to p < .001. (Color figure online)

Online task data reliability

Using online executables paired with virtual conference rooms, we first aimed to replicate a previous effect from in-person testing using the simultaneous reconstruction task—namely, that fine-grained responses decrease as memory load increases (A. Y. Li et al., 2022; Ma et al., 2014; Sone et al., 2021). To assess the testing efficiency of our approach, we also compared individual testing, where participants completed the task one-by-one with the experimenter (Fig. 3a), to concurrent testing, where many participants completed the task in the same virtual conference call (Fig. 3b).

We used separate linear mixed models to predict fine-grained shape responses, fine-grained color responses, and fine-grained object responses from set size in the individual testing condition. Fine-grained feature responses were calculated for the shape feature and color feature independently, and fine-grained object responses were calculated for both shape and color features together. We found that participants were less able to reconstruct the exact target shape (Fig. 4g), target color (Fig. 4h), or target shape–color object (Fig. 4i) as the number of studied objects increased. For the individual-testing condition, there was a significant main effect of memory load on the proportion of fine-grained shape responses, F(2, 58) = 11.78, p < .001, ηp2 = 0.29; fine-grained color responses, F(2, 58) = 48.48, p < .001, ηp2 = 0.63; and fine-grained object responses, F(2, 58) = 28.99, p < .001, ηp2 = 0.50 (Fig. 4g–i). These results suggest that testing participants online with executables can produce reliable data, as we replicated previous findings from the lab (see A. Y. Li et al., 2022).

Revealing the testing efficiency of concurrent virtual conference calls, we were able to replicate the findings from the individual testing condition within 4% of values for all fine-grained responses (see Fig. 4). For the concurrent testing condition, we observed a significant main effect of number of objects on the proportion of fine-grained shape responses, F(2, 58) = 67.60, p < .001, ηp2 = 0.70; fine-grained color responses, F(2, 58) = 13.20, p < .001, ηp2 = 0.31; and fine-grained object responses, F(2, 58) = 40.97, p < .001, ηp2 = 0.59 (Fig. 4g–i). Critically, when we included the individual and concurrent testing conditions within the linear mixed model as a fixed factor, we found no difference between the individual and concurrent testing conditions for fine-grained shape memory, F(1, 58) = 0.20, p = .66, fine-grained color memory, F(1, 58) = 0.019, p = .89, or fine-grained object memory, F(1, 58) = 0.010, p = .75 (Fig. 4g, h, i).

These results not only suggest that data from online testing can be reliable when paired with synchronous virtual conference calls, but also provide evidence that concurrent testing can be a scalable way to increase experimental efficiency while maintaining high data quality during online testing. See further validation measures when participants are tested with downloadable executables on GitHub: https://github.com/james-y-yuan/executable-pipeline.

Mouse-tracking trajectory analysis

One participant was excluded because their mouse paths were not recorded due to a technical error. Notably, we found no significant difference in trial duration across memory load conditions (one object: Mean = 6.42 seconds, SD = 3.67 seconds; two objects: Mean = 6.08 seconds, SD = 3.37 seconds; three objects: Mean = 6.12 seconds, SD = 3.73 seconds), F(1, 2359) = 3.07, p = .08. There was also no significant difference in the onset time of the mouse movement across memory load conditions (one object: Mean = 0.59 seconds, SD = 0.28 seconds; two objects: Mean = 0.57 seconds, SD = 0.32 seconds; three objects: Mean = 0.62 seconds, SD = 0.43 seconds), F(1, 2359) = 1.65, p = .20.

Next, we looked at the entire mouse-path trajectory during object reconstruction (Fig. 5a), enabling us to study participant behavior leading up to the final response at millisecond temporal resolution. To visualize mouse paths across memory load conditions onto the same axis, we linearly interpolated the trial-by-trial trajectories (Fig. 5a) to be of the same relative path length (Fig. 5b). We then plotted absolute shape error (y-axis) and color error (z-axis) from the target by relative position along the trajectory (x-axis), allowing us to visualize participant mouse paths in terms of continuous error to the target (Fig. 5ce). This approach extends the analysis of error distributions from only the final response at test (e.g., Ma et al., 2014; Fig. 4af) into the entire continuous mouse-path trajectory (Fig. 5ce). In other words, we used mouse-tracking to convert a dependent variable with a single observation each trial (i.e., the final response on a continuous reconstruction task) into time-series data with hundreds of observations each trial (i.e., the entire continuous mouse path trajectory).

We next displayed the mouse trajectories by shape errors and color errors individually for each participant (Fig. 6). Although the average trial duration and the onset time did not differ across memory load conditions (see previous analysis), the magnitude of errors robustly differed across memory load conditions (Set Size: one object, two objects, and three objects) and across relative positions along the trajectory (i.e., absolute error at: onset, 25%, 50%, 75%, final response). We found a main effect of memory load, shape: F(1, 16511) = 926.42, p < .001, ηp2 = 0.05; color: F(1, 16511) = 49.93, p < .001, ηp2 = 0.0042, a main effect of position along the trajectory, shape: F(1, 16511) = 937.68, p < .001, ηp2 = 0.44; color: F(1, 16511) = 1432.44, p < .001, ηp2 = 0.55, and an interaction between memory load and relative position along the trajectory, shape: F(1, 16511) = 18.12, p < .001, ηp2 = 0.0061; color: F(1, 16511) = 10.21, p < .001, ηp2 = 0.0035. In other words, we found that (i) errors for shape and color were higher when participants studied more objects, (ii) errors were lower as participants neared their final response during object reconstruction, and (iii) the slope of errors across relative positions along the trajectory differed by memory load (i.e., interaction between memory load and position along the trajectory; Fig. 6).

These results suggest that participants initially moved rapidly over shape–color space before slowing down as they neared the target, and this effect was modulated by memory load. Because distance on circular space is a proxy for visual similarity (e.g., validated in previous experiments; for shape, see A. Y. Li et al., 2020; for color, see Schurgin et al., 2020), these results reveal that participants moved rapidly over dissimilar objects (i.e., high error) before slowing down over similar objects (i.e., low error) as they neared their final response. That is, the fact that mouse movement slowed as the objects became more similar to the target during reconstruction suggests interference between the contents of working memory (i.e., the target object held in mind) and the currently perceived object (i.e., the object on the display corresponding to the position of the mouse cursor; Fig. 1a). By analogy to eye tracking, our results would suggest that participants move their eyes rapidly over dissimilar objects relative to the target before slowing down in their gaze behavior when objects are similar to the target.

Quantifying object-to-location binding errors using mouse tracking

The previous analyses examined how memory load influenced object reconstruction for the target shape–color object, finding interference between the contents of working memory (i.e., the target object) with the similarity of the viewed object over time (i.e., the object corresponding to the position of the mouse cursor). In our next analysis, we looked at how memory load might influence object-to-location binding errors, providing evidence of between-item interference stemming from previously studied but currently irrelevant nontarget object lures. Inspired by previous eye-tracking methodologies (Erez et al., 2013; Ryan et al., 2000; Yeung et al., 2013; Yeung et al., 2017; Yeung et al., 2019), we operationalized object-to-location binding errors as the number of mouse hovers over nontarget objects from the study phase (Fig. 7a).

Fig. 7
figure 7

Mouse-tracking analysis of object-to-location binding errors (i.e., mouse hovers over nontarget object lures). a From the mouse path on each trial, we counted the number of “mouse hovers” over nontarget objects as participants reconstructed the target object’s shape and color. A hover was defined as a mouse movement, of any length of time, within 15 degrees of a nontarget object’s shape and color. b In the initial analysis, we found that participants were nearly 3× more likely to mouse hover over nontarget object features at the highest memory load. Subsequent control analyses ensured that this effect was not driven by random chance stemming from our operationalization of nontarget mouse hovers nor by random guessing behavior (see Results). c When we separately analyzed the first 75% of the mouse path (i.e., object reconstruction behavior) from the final 25% of the mouse path (i.e., the final response at test), we found that participants were 2.64 times more likely to make nontarget mouse hovers during object reconstruction. This comparison was normalized by path length, so that the hover frequencies in (c) account for path length differences across object reconstruction and the final response. In summary, these results provide evidence of between-item interference stemming from previously studied but currently irrelevant lures during object reconstruction. Error bars refer to 95% confidence interval for the mean. *** refers to p < .001. (Color figure online)

We used a linear mixed model to predict hovers over nontarget objects from set size, controlling for the potential confounding factors of trial duration and variability of the shape and color paths. We found that memory load significantly predicted the number of hovers over nontarget objects, F(1, 113) = 121.88, p < .001 (Fig. 7b), such that participants were more likely to mouse over nontarget objects during the condition with two nontarget objects at Set Size 3 (M = 0.27 hovers per trial, SD = 0.18) compared with the condition with one nontarget object at Set Size 2 (M = 0.10 hovers per trial, SD = 0.093). Participants were almost three times more likely to mouse over nontarget object features at the highest memory load, suggestive of object-to-location binding errors that occurred throughout object reconstruction (i.e., the influence of previously studied but currently irrelevant objects). The variability of the mouse movement in shape space also significantly predicted the number of hovers over nontargets, F(1, 113) = 4.80, p = .031, whereas the variability of the mouse movement in color space, F(1, 113) = 0.62, p = .43, and trial duration, F(1, 113) = 0.25, p = .62, did not.

Ruling out alternative explanations

In the subsequent analyses, we addressed several alternative theoretical possibilities (for rationale, see Statistical Analysis). To ensure that our results could not be explained by random chance stemming from our operationalization of nontarget mouse hovers, we generated an “invisible” object in the trials with one nontarget object (Set Size 2) to equate area to the trials with two nontarget objects (Set Size 3). In addition, to account for the possibility that more hovers at higher set size might be explained by random guessing behavior, we also included the number of hovers over the opposite side of shape–color space from the nontargets into a linear mixed model. That is, if random guessing increased with set size and entirely accounted for our results in Fig. 7b, then these opposite-side hovers should also increase in tandem with nontarget hovers. Even after these stringent controls, participants were still 22% more likely to mouse over nontargets at the highest memory load during object reconstruction, F(1, 112) = 7.01, p = .009. That is, these control analyses assume that participant mouse behavior was driven by random chance, yet we still observed evidence that memory load influenced mouse hovers over nontarget object features. Thus, 22% reflects the lowest possible difference in object-to-location binding errors between Set Size 2 and Set Size 3 trials after assuming that mouse hover behavior stems from random chance. By contrast, trial duration, F(1, 112) = 2.48, p = .12, standard deviation of the shape path, F(1, 112) = 0.93, p = .34, and standard deviation of the color path, F(1, 112) = 0.73, p = .40, did not significantly predict the number of hovers over nontarget object features. Critically, hovers over the opposite side of shape–color space did not increase in tandem with nontarget object hovers, F(1, 112) = 0.16, p = .69. Thus, our results are not explained by random chance stemming from our operationalization of nontarget mouse hovers, nor are they explained by random guessing behavior from our participants, as in that case the number of hovers over locations opposite to nontargets should have increased in tandem with hovers over nontarget objects.

Finally, we calculated the average number of hovers observed per trial using only the first 75% of each trial (i.e., during object reconstruction) compared with only the final 25% of each trial (i.e., the final response at test). After normalizing the responses to the length of the mouse path each trial, we found that the average number of hovers over nontargets in the first 75% of the path (M = 0.29 hovers per trial, SD = 0.19) exceeded the average number of hovers over nontargets in the final 25% of the path by nearly a factor of 3 (M = 0.11 hovers per trial, SD = 0.20), t(234) = 7.25, p < .001; Fig. 7c. Participants were 2.64 times more likely to make mouse hovers over nontarget object features during the first 75% of the mouse path compared with the final 25% after correcting for path length, revealing that mouse reconstruction behavior can provide a sensitive marker of between-item interference from previously studied but currently irrelevant object lures.

Overall, our mouse trajectory analyses reveal that object reconstruction behavior provides rich information about the internal representation leading up to the final response on a continuous memory task. We observed interference between the contents of working memory with the currently perceived object (Figs. 5, 6), as well as from previously studied but currently irrelevant object lures (i.e., object-to-location binding errors; Fig. 7).

Discussion

By implementing mouse-tracking during an online executable version of a simultaneous reconstruction task, we show that memory load influences continuous mouse trajectories during object reconstruction. We observed interference between the contents of working memory and perception, such that participant mouse movement was more rapid when the currently viewed objects were dissimilar (i.e., higher error) before slowing down when the currently viewed objects were similar (i.e., low error) to the target object (Figs. 5, 6). Furthermore, we observed robust interference between previously studied target and lure objects during reconstruction (i.e., object-to-location binding errors), such that participants were nearly 3 times more likely to “mouse over” lure object features at the highest memory load (Fig. 7b). This between-item interference from object lures occurred primarily during the earlier reconstruction phase rather than the period leading up to the final response at test, suggesting that mouse-tracking can provide a sensitive index of object-to-location binding errors. We complement these trajectory analyses with data reliability and testing efficiency metrics of our online mouse-tracking task using an executable pipeline, finding that testing many participants in the same virtual conference room (i.e., over Zoom) can be an efficient method to ensure high data quality (Figs. 2, 3, 4). Taken together, these results show that tracking the dynamic mouse movement on a continuous memory task during online testing can provide a rich characterization of participant behavior leading up to the final response at test.

Mouse-tracking as participants reconstructed shape–color objects over perceptually uniform space (Fig. 1b) revealed two critical insights. First, mouse-tracking converts tasks that would measure only a single observation each trial (e.g., yes/no on a discrete task or the final error response on a continuous reconstruction task) into tasks measuring hundreds of observations each trial at millisecond temporal resolution (see Hao et al., 2021; Park & Zhang, 2022, for previous mouse-tracking approaches applied to working memory). Even when the number of trials in an experiment may be limited due to time constraints inherent with online testing, our mouse-tracking analyses were well powered because information from the entire trajectory could be incorporated into the analyses (Figs. 5, 6). However, while 20 trials per set size was sufficient for our specific purposes, future work will need to consider whether other research questions will require more trials. Approaches involving model-fitting (e.g., for models applied to working memory, see Bays, 2016; Oberauer, 2021; Rademaker et al., 2018; Schurgin et al., 2020; Sutterer & Awh, 2016; Zhang & Luck, 2008; also see evidence accumulation models: Bornstein et al., 2017; Evans & Wagenmakers, 2020; Fradkin & Eldar, 2022; Krueger et al., 2017; Shenhav et al., 2018) or individual differences (e.g., Baker et al., 2020; Xu et al., 2017) will have different considerations. Due to our limited trial numbers, it is also difficult to directly compare our trajectory measures (i.e., object-to-location binding errors indexed as nontarget object hovers) to more traditional model-based approaches based on the final response at test (i.e., feature-to-location binding errors indexed as “swaps” on a mixture model; see Bays, 2016; Bays et al., 2009). Nevertheless, we believe that our operationalization of binding errors is complementary to how binding errors are typically defined from the final response at test.

Second, in addition to measuring well-established aspects of motor movement (Woodworth, 1899), mouse-tracking provides a powerful tool to measure decision uncertainty leading up to the final response on a cognitive task. On our task, participants needed to match what they were representing in mind with what they were viewing on the display during object reconstruction (Fig. 1; for a video example, see https://osf.io/ycq5s). We observed that participants rapidly moved towards the target in the first half of the trial, before slowing down as they neared the target in the final half of the trial and approached the decision (Fig. 6). We suspect that a major contributor to this slowing behavior was the increasing interference as participants neared the end of the trial. As distance on shape and color space is a proxy for visual similarity (for shape, see A. Y. Li et al., 2020; for color, see Schurgin et al., 2020), the similarity between the target (held in working memory) and the reconstructed object (perceived on the screen) increased parametrically as the participant neared the target during their reconstruction. The slowing effects suggest interference between the target object from working memory and the similarity of the currently perceived object corresponding to the mouse cursor. Traditionally, interference is considered as arising largely from lures and targets presented at study, yet we observed interference between the contents of working memory and information presented during the response phase of the task itself (Figs. 5, 6). These findings extend previous mouse-tracking work finding between-item interference for colors (Hao et al., 2021; Park & Zhang, 2022) to more complex shape–color object representations, as well as show that interference may arise from the similarity match between what is being held in memory with what is shown on the screen.

Although we suggest mouse-tracking can measure mnemonic-perceptual interference, there are other important sources of variance that should also be considered. For example, motor movements have long been known to have an initial accelerating and later decelerating component (Woodworth, 1899). These “coarse-to-fine” adjustments are typical for motor movement trajectories (e.g., Marteniuk et al., 1987), and are likely present in mouse-tracking tasks as well. However, these motor movements cannot be the only driver of the slowing we observed as the participant reached their final response (Figs. 5, 6). There was a robust difference between trajectories across memory load conditions, even though the overall time it took to complete these trials did not differ. That is, the motor requirements across conditions in the test phase were identical (i.e., participants reconstructed a single shape–color object; Fig. 3c). What distinguished the conditions was the amount of material presented at study, indicating that it was differences in the contents of memory rather than motor requirements that guided trajectory differences. For this reason, we suggest that interference between the contents of memory and what is perceived contributes to mouse trajectory behavior above and beyond motor movements alone. That is not to say that interference is the only factor that drives trajectory behavior—future research could more directly decouple the multiple sources of variance that contribute to the final response at test, potentially using different feature spaces that manipulate the similarity match between what is held in memory with what is perceived. Such an approach could also record response trajectories that do not involve mouse tracking, isolating the contribution of motor movements (e.g., participants could “hold down” arrow keys or move a joystick to reconstruct objects across the dimensions of perceptual space). Finally, this line of research could examine the interaction between trajectory speed and target–lure similarity (Figs. 5, 6) in other domains involving trajectories such as gaze behavior or spatial navigation. These future investigations may offer exciting avenues to further study the interaction between memory and perception.

Notably, our continuous measure of mouse tracking may be analogous to previously described measures of eye tracking (e.g., Barense et al., 2012; Erez et al., 2013; Golomb et al., 2008; Golomb et al., 2014; Golomb & Kanwisher, 2012; Hannula et al., 2010; Kragel & Voss, 2022; Liu et al., 2017; Ryan et al., 2020; Voss et al., 2017; Wynn et al., 2019; Wynn et al., 2020; Yeung et al., 2017; Yeung et al., 2019). For example, eye tracking can be used to examine the errors made by participants as they switch their gaze between target and nontarget object features (i.e., object-to-location binding errors). Previous eye-tracking studies have found that amnesic patients as well as older adults at risk for Alzheimer’s disease show altered gaze behavior compared with healthy controls, which was interpreted as interference between the contents of memory with the perceived object lure (Barense et al., 2012; Erez et al., 2013; Yeung et al., 2017; Yeung et al., 2019). Our online mouse-tracking study extends this work, showing that memory load (i.e., the number of objects held in working memory) influences object reconstruction (Figs. 5, 6) and increases object-to-location binding errors (Fig. 7). Future studies could more directly examine how age and risk for cognitive decline may interact with a continuous mouse-tracking measure during working memory. One possibility could be to examine both the binding errors that occur at the object-level (as was done in the present study; Fig. 7), but also at the feature-level (i.e., the individual nontarget shapes and colors bound to spatial locations). Such an approach could compare the similarities and differences between eye-tracking and mouse-tracking more directly, identifying important factors like the type of binding error (i.e., object-to-location or feature-to-location), time, and search variability.

Another line of future investigation may be to consider whether aspects of our mouse-tracking work are applicable to navigation tasks derived from the Morris water maze (Morris, 1984), as well as to the literature on spatial schemas more generally. The Morris water maze is a paradigm widely used to investigate rodents’ navigational abilities, in which rodents are placed in a water tank and learn to navigate to the location of a hidden platform on which they can stand. This paradigm, combined with the analysis of search trajectories, has recently been popularized for studying the formation of spatial schemas in rodents (Richards et al., 2014), and variations of schema learning tasks have been studied in humans using computer-based versions (Antony et al., 2022; Cockcroft et al., 2022; Tompary et al., 2020). Future work could explore the similarities and differences between the search paths of mouse cursors towards target locations and the paths taken by actual mice towards physical platforms. Visual inspection of our participants’ mouse-cursor trajectories shows that they resemble actual mouse behavior (Fig. 7a): When a nontarget mouse hover occurs during a trial, participants generally make a detour (they “check out a location”) before continuing to navigate towards the target object. These trajectory errors, like false memories for platform locations in a Morris water maze, yield meaningful information about participants’ internal representations of the target and nontarget objects.
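As a concrete illustration of this detour behavior, a hover analysis might flag time windows in which the cursor lingers near a candidate object location. The sketch below is illustrative only; the radius and dwell-time thresholds are placeholders rather than the criteria used in our analysis:

```python
import numpy as np

def detect_hovers(t, x, y, locations, radius=40.0, min_dwell=0.1):
    """Flag 'hover' events: runs of samples in which the cursor stays
    within `radius` pixels of an object location for at least
    `min_dwell` seconds.

    locations : dict mapping object label -> (x, y) screen position
    Returns a list of (label, start_time, end_time) tuples.
    """
    t, x, y = map(np.asarray, (t, x, y))
    hovers = []
    for label, (lx, ly) in locations.items():
        inside = np.hypot(x - lx, y - ly) < radius
        edges = np.diff(inside.astype(int))
        starts = np.flatnonzero(edges == 1) + 1   # first sample inside
        ends = np.flatnonzero(edges == -1) + 1    # first sample outside
        if inside[0]:
            starts = np.r_[0, starts]
        if inside[-1]:
            ends = np.r_[ends, inside.size]
        for s, e in zip(starts, ends):
            if t[e - 1] - t[s] >= min_dwell:
                hovers.append((label, t[s], t[e - 1]))
    return hovers
```

Hovers over nontarget locations that are followed by a course correction towards the target would then serve as candidate object-to-location binding errors.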

A notable difference, however, is that previous water-maze studies analyze trajectories over physical space (with actual mice in a water maze) or over 2D coordinate space (with abstract human tasks), whereas our study examines shape–color feature space. That is, our mouse-tracking approach could be used to extend existing research on the formation and influence of spatial schemas into an investigation of how visual features are integrated into analogous higher-order representations. To give one example, an analysis of statistical divergence is often used in the spatial schema literature to examine how closely rodents’ or participants’ search trajectories fit a learned distribution of spatial locations over time (Antony et al., 2022; Richards et al., 2014). A closer statistical fit is taken as evidence that navigation is being guided or influenced by aggregate information about spatial locations (that is, by spatial schemas). While we did not conduct this analysis here, a future study could investigate the analogous phenomenon over shape–color space, where participants’ search trajectories map onto a statistical distribution of objects defined by shape–color dimensions. Similarly, key findings on the structure of spatial information, such as the role of grid and place cells in representing space (Bellmund et al., 2018; Moser et al., 2015), can also be investigated in the domain of shape–color object space. For instance, might some visual neurons encode conjunctions in a learned shape–color space, just as place cells do in coordinate space? Alternatively, “navigation” through physical space and through shape–color object space may rely on a common coordinate system (i.e., a cognitive map; de Cothi et al., 2022; Epstein et al., 2017; Theves et al., 2019; Whittington et al., 2022). For these reasons, applying trajectory analyses in the domain of continuous object space may yield new insights into the neural basis of feature integration across generalized task contexts during working memory.
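For readers who wish to adapt such a divergence analysis to feature space, the following generic sketch (the specific measures used by Richards et al., 2014, and Antony et al., 2022, differ in their details) compares the histogram of visited points in a 2D shape–color space against a reference distribution using the Kullback–Leibler divergence:

```python
import numpy as np
from scipy.stats import entropy

def trajectory_divergence(samples, reference_pdf, bins):
    """KL divergence between the points visited in a 2D feature space
    (e.g., shape x color) and a reference distribution on the same grid.

    samples : (N, 2) array of visited (shape, color) coordinates
    reference_pdf : 2D array of reference probabilities, one per bin
    bins : bin specification passed through to np.histogram2d
    """
    hist, _, _ = np.histogram2d(samples[:, 0], samples[:, 1], bins=bins)
    p = hist.ravel() + 1e-9            # additive smoothing avoids log(0)
    q = reference_pdf.ravel() + 1e-9
    return entropy(p / p.sum(), q / q.sum())  # KL(p || q), in nats
```

A divergence that decreases over learning would indicate that reconstruction trajectories increasingly conform to the learned distribution of objects, the feature-space analogue of a spatial schema.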

Comparisons with existing online testing approaches

In developing our task, we created an open-source pipeline for developing online tasks in Python (see https://github.com/james-y-yuan/executable-pipeline for details, including additional timing measures). Although downloadable executables were described at least two decades ago (see Hewson et al., 1996), our work lays out a concrete implementation of this idea, providing open-source code in a freely accessible programming language. Moreover, we compared data reliability between participants tested individually and those tested concurrently over virtual conference calls (i.e., over Zoom), replicating memory results within 4% (see Fig. 4). Importantly, we highlight that concurrent testing may be a straightforward method to increase testing efficiency while maintaining high data quality, even with existing online approaches (e.g., online experiments built with JavaScript; Anwyl-Irvine et al., 2021; de Leeuw, 2015; de Leeuw & Motz, 2016). We suggest downloadable executables complement existing browser-based approaches to online testing; as with any tool, researchers must carefully consider the strengths and weaknesses of applying a given tool to their specific research question.
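To illustrate the general approach (the actual pipeline is in the repository linked above; the snippet below is an independent, minimal sketch using only the Python standard library), a task window can log cursor position with high-resolution timestamps and write the samples to disk:

```python
# Minimal cursor-logging sketch (illustrative; not the pipeline code).
import csv
import time
import tkinter as tk

samples = []  # (timestamp_s, x, y) tuples

def on_motion(event):
    # time.perf_counter provides sub-millisecond timing resolution
    samples.append((time.perf_counter(), event.x, event.y))

root = tk.Tk()
root.bind("<Motion>", on_motion)
root.after(5000, root.destroy)  # record for 5 s in this toy example
root.mainloop()

with open("mouse_trace.csv", "w", newline="") as f:
    csv.writer(f).writerows(samples)
```

A script like this can then be bundled into a single-file download with a packaging tool such as PyInstaller (e.g., `pyinstaller --onefile task.py`), which is one common route to a downloadable executable.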

In our study, downloadable executables greatly simplified transitioning a mouse-tracking task already built for in-person experimentation into a format amenable to online testing. However, one possible limitation of using downloadable executables for online testing is the potential for security concerns. For example, a researcher could hypothetically access aspects of a participant’s computer, such as keyboard input outside of the experiment, or take other malicious actions. We note that a similar security concern exists for browser-based online testing, in that a researcher could in theory ask participants to click on a malicious link during an online task. In our study, we explicitly minimized these possible security concerns by recording only the position of the mouse cursor and select keys during the experiment (see Fig. 2). The data were further anonymized and stored on our secure lab server, and we digitally signed the executable with detailed information from our laboratory. Finally, participants downloaded the executable from our lab website and completed informed consent directly with the experimenters over a video call in a virtual conference room.

Importantly, we collected online data from 30 participants in just six testing sessions in the concurrent testing condition while maintaining high data quality (Figs. 3, 4). In a typical online experiment, participants often complete tasks with no oversight. By contrast, we speculate that the social pressure of having the experimenter present with other participants during a synchronous virtual conference call may have increased the likelihood that participants would focus on the task. Future studies could explore different experimental setups, such as a version of our online approach in which participants are tested with their video cameras on. Just as students in an online university course report feeling increased engagement, accountability, and connectedness when their cameras are on as opposed to off (Schwenck & Pryor, 2021), participants may complete an experimental task in a more engaged manner with their cameras on. Though such an approach may increase conscientiousness, future studies will also need to address privacy concerns, including the possibility that others can view participants’ living circumstances. Nevertheless, our results suggest that testing many participants in the same virtual conference call (i.e., over Zoom, with or without video cameras on) may be a straightforward method to collect high-quality data, even for approaches that do not involve downloadable executables (e.g., online browser-based experiments; Anwyl-Irvine et al., 2021; Sauter et al., 2020).

Conclusion

Altogether, our results provide evidence that capacity limits in visual working memory may arise partly from resolving, or failing to resolve, interference during object reconstruction. We found two sources of between-item interference during object reconstruction that increased with memory load: (i) the similarity match between the contents of working memory and what is currently being viewed (Figs. 5, 6) and (ii) object-to-location binding errors from previously studied but currently irrelevant lure objects (Fig. 7). Indeed, our trajectory analyses quantify object reconstruction behavior during the entire test phase of a simultaneous reconstruction task, converting what would typically be a dependent variable with a single observation per trial (i.e., the final response) into a dependent variable with hundreds of observations per trial at millisecond temporal resolution (i.e., time-series data). For this reason, analyzing only the final response on a continuous memory task may fail to capture important aspects of behavior, akin to analyzing only the final gaze position during eye tracking or only the final position of a rodent during spatial learning. Future work could explore whether factors such as motor movements (Marteniuk et al., 1987), value-based choice (de Martino & Cortese, 2022), attention (Dowd & Golomb, 2019; Olivers et al., 2006), working memory distortions (Chunharas et al., 2022), ensemble coding (Whitney & Yamanashi Leib, 2018), or subjective similarity between the to-be-remembered objects (Lin & Luck, 2009) influence object reconstruction behavior beyond the effects we present here. In conclusion, our results show that measuring mouse-tracking behavior permits a more complete understanding of the internal representations leading up to the final response on a cognitive task, providing new insight into the dynamic interplay between working memory and perception.