Introduction

Working memory is known to be limited both in the amount of detail that can be retained from a visual scene (Bays et al., 2009; Brady & Alvarez, 2015; Ma et al., 2014; Schurgin et al., 2020) and in the total number of objects that can be recalled, regardless of how detailed these objects are (Cowan, 2001; Luck & Vogel, 1997; Zhang & Luck, 2008). These two types of limitation are generally predicted by competing models based on shared resources or discrete slots, respectively (Bays & Husain, 2008; Rouder et al., 2008). However, the memory benefit produced by shared features in visual working memory does not seem to be easily accounted for by these two main classes of working memory models (Quinlan & Cohen, 2012), and other studies have shown that both types of limitation should be accounted for concurrently to fit data (Awh et al., 2007; Cowan et al., 2013; Hardman & Cowan, 2015; Oberauer & Eichenberger, 2013; Xu & Chun, 2006). The approach taken in the present study is to consider that models of capacity limits should better account for how shared features are processed in working memory, in particular when those features allow room to be spared in memory.

There are a few known factors that can help individuals recode information more efficiently, in particular when visual scenes offer the possibility of associating features with one another during the task at hand (Gao et al., 2016; Jiang et al., 2000, 2004; Peterson & Berryhill, 2013; Woodman et al., 2003). For instance, the phenomenon of binding has been shown to allow objects to be integrated based on multiple features (Alvarez & Cavanagh, 2004; Bays et al., 2011; Fougnie & Alvarez, 2011; Saiki, 2019; Wheeler & Treisman, 2002; Xu, 2002). In the case of binding, imagine that a participant is presented with an array of four novel objects, each made of three features, thus totaling 12 different features. Being able to recognize one of the multi-feature objects does not mean that the participant would be able to recall precisely the three constituent features of each object. With a working memory capacity of four items, for instance, one could encode just two features of two different objects, or any other combination such as three features for one object and only one feature for another, but always totaling four features. In this case, a capacity of four items does not allow an individual to retain the 12 features of the whole scene. However, multi-feature objects can be better memorized when features occur repeatedly over time and can be recoded as chunks. Once formed in long-term memory, those chunks can allow participants to hold a greater amount of information in working memory (Brady et al., 2009; Ngiam et al., 2019; Orbán et al., 2008). Provided that each multi-feature object chunk counts as one item, a participant with a capacity limit of four items could therefore perfectly recall an array of four recognized objects, each made of three features, thus recalling 12 different features.

In other cases, objects are not necessarily already encoded in long-term memory. For instance, previous studies have shown that bottom-up stimulus characteristics (e.g., Gestalt cues) can help participants group information to better recall stimulus items in visual scenes (Woodman et al., 2003). This is, for instance, the case with spatial information (De Lillo, 2004; Dry et al., 2012; Feldman, 1999; Haladjian & Mathy, 2015; Korjoukov et al., 2012; Sargent et al., 2010). Apparent capacity limits in working memory can also easily be exceeded thanks to perceptual organization when items share the same colors in a visual scene (Brady & Tenenbaum, 2013). Morey et al. (2015), for instance, showed an advantage for singletons in displays containing repetitions (compared to displays containing no repetitions). Their interpretation was that grouping repeated colors on the spot can free up room in working memory. One explanation of how grouping or chunking functions is that compression of information could be at work whenever room is preserved in memory, particularly when individuals can find a way to recode information efficiently (Brady et al., 2009).

The compression account

Compression of information involves recoding information into a more compact form. Although there is controversy regarding whether visual working memory capacity is fixed regardless of information content (Alvarez & Cavanagh, 2004; Awh et al., 2007), the compression account predicts that storage is particularly efficient when information contains regularities. During the last decade, compression has been put forward to account for intelligence (Chekaf et al., 2018; Hutter, 2004), memory (Brady et al., 2009; Chekaf et al., 2016; Mathy & Feldman, 2012), language (Christiansen & Chater, 2016; Ferrer-i-Cancho et al., 2013; Kirby et al., 2015), and perception (Haladjian & Mathy, 2015; Nassar et al., 2018; Ramzaoui & Mathy, 2021). Compression is an information-theoretic concept based on algorithmic complexity. The algorithmic complexity of an object corresponds to its shortest possible representation (Li & Vitányi, 2008). The gain offered by the shortest representation of an object (in comparison to the length of the original object) allows one to estimate the compressibility of that object. Regarding the computational aspects of the theory, the shortest representation usually takes the form of the shortest program for a Turing machine. Although algorithmic complexity is not computable (because one can never know whether maximal compression has been reached by a given recoding process), estimates can be obtained (see http://www.complexitycalculator.com; Gauvrit et al., 2016; Soler-Toscano et al., 2014). However, because different languages can be used interchangeably in place of Turing machines, indirect approaches can be taken by developing more practical metrics in a given domain. For instance, a compressibility metric has been developed in categorization to describe multi-feature objects (Feldman, 2003), and this metric (or very similar ones, see Kemp, 2012; Vigo, 2006) has proven useful to account for subjective complexity during learning (Bradmetz & Mathy, 2008; Feldman, 2000; Lafond et al., 2007; Mathy & Bradmetz, 2004), so we decided to use this practical implementation in the present study to build our visual displays. The next section presents the metric and introduces the idea that a compression process can include pitfalls as well as benefits.
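As a rough, runnable illustration of the notion of compressibility (a crude proxy only; this is neither Feldman's metric nor the algorithmic-complexity estimator cited above), a general-purpose compressor shows how regular material admits a shorter description than irregular material of the same kind:

```python
import random
import zlib

def compression_ratio(text: str) -> float:
    """Compressed length divided by raw length (smaller = more compressible)."""
    raw = text.encode("utf-8")
    return len(zlib.compress(raw, 9)) / len(raw)

# A regular sequence of feature descriptions vs. an irregular one of similar length.
regular = "blue square plain left " * 30
random.seed(0)
tokens = ["blue", "red", "square", "circle", "plain", "hatched", "left", "right"]
irregular = " ".join(random.choice(tokens) for _ in range(120))

print(compression_ratio(regular))    # small ratio: highly regular, highly compressible
print(compression_ratio(irregular))  # larger ratio: less regularity to exploit
```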

The good and bad aspects of compression in working memory

To better understand how compression functions in working memory more globally, both advantages (optimization of information) and disadvantages (loss of information) should be studied together. Although the present work involves arrays of multi-feature objects to be recognized, for the sake of easy visualization we first present the over-compression idea in the context of sequential stimuli. If participants make use of a compression process to recode information, capacity should be increased in situations where patterns exist (for instance, a display set such as ■☐■☐, presented sequentially as shown, can be retained as the compact description "2■☐"). This description would be sufficient to recognize the same sequence reproduced as a probe display, or to reject a sequence in which something has changed from the studied sequence. However, the putative compression process could also lead to specific errors when patterns are badly (or overly) compressed. If a participant studied the display set ■☐♦☐, it might be recoded as the rule "2■☐, but rotate the second dark square." However, the qualification about rotating the second dark square could be lost from working memory, so, if the test display were ■☐■☐, the participant would incorrectly find a perfect match to the now over-compressed, lossy representation "2■☐".
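The intuition can be sketched in a few lines of Python, using a toy repeat-based code and the symbols above (an illustration of the idea, not a model of the participants):

```python
def shortest_repeating_unit(seq: str) -> str:
    """Return the shortest unit u such that seq == u * k (seq itself if none exists)."""
    n = len(seq)
    for size in range(1, n + 1):
        if n % size == 0 and seq[:size] * (n // size) == seq:
            return seq[:size]
    return seq

def compress(seq: str) -> str:
    """Recode a sequence as '<k>x<unit>' when it is k repetitions of a unit."""
    unit = shortest_repeating_unit(seq)
    k = len(seq) // len(unit)
    return f"{k}x{unit}" if k > 1 else seq

print(compress("■☐■☐"))    # '2x■☐' -> lossless: the probe ■☐■☐ is matched exactly

# Over-compression: the exception in ■☐♦☐ ("rotate the second dark square") is
# dropped before recoding, leaving the lossy code for ■☐■☐.
studied = "■☐♦☐"
stored = compress(studied.replace("♦", "■"))   # '2x■☐', exception lost
probe = "■☐■☐"
print(stored == compress(probe))               # True -> false recognition of the probe
```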

We presume that a poorly managed compression process could distort perceptual organization instead of producing benefits. The above-mentioned studies have mostly insisted on the benefits of compressibility. Our hypothesis in the present study is that compression in working memory could also be detrimental to the recall process. The reason is that memory content can be expected to be over-simplified if memorization is driven by a compression process that seeks to reduce information load. Moreover, simplification errors should depend on compressibility levels: errors should be less random (i.e., more systematic) when stimulus sets are more structured. We should thus expect a greater number of compression errors with greater compressibility.

Our stimulus displays were made of objects comprising a variable number of conjunctions of features, in order to manipulate the complexity of the display. Complexity was manipulated to allow recoding of information based on the compressibility metric defined by Feldman (2003), which is adapted to Boolean dimensions (i.e., dimensions made of two discrete features). Feldman (2003) described complexity using spatial dimensions represented by hypercubes, drawn as Hasse diagrams. A hypercube is simply an extension of a square representing four two-dimensional objects, or of a cube representing eight three-dimensional objects. These diagrams are useful for representing the similarities between objects. In a cube, two objects connected by an edge differ by only one feature. The number of edges that one needs to follow to join two objects represents the number of dissimilarities between them. A four-dimensional hypercube represents two joined three-dimensional cubes, so, starting from a given object, one needs to follow three edges and then switch cubes to reach the object that differs from it in all four features. Marking the locations of selected objects with black dots in a hypercube allows one to represent the relational structure among those objects.

Figure 1 shows the similarity structure of the stimulus items (both with and without the probe) for each of our experimental conditions. To capture how this similarity structure can be represented with a minimal number of features, the compressibility metric of Feldman is based on the minimal disjunctive normal forms corresponding to the selected features for a given structure in a hypercube (Table 1). In Fig. 1, the complexity number indicates the minimal number of features that describe the chosen subset of stimuli, the stimuli are marked with a dot in each of the hypercubes, and the final-structure column always contains an additional dot in the hypercube, which represents the chosen probe. For instance, the hypercube in the first row and third column of Fig. 1 can be summarized by the spatial rule “any object that is on the left and at the bottom.” Using the reference set of features at the top of Fig. 1, this spatial rule corresponds to the feature rule “any object that has a left disc and that is square.” Because the rule mentions two literals (i.e., the two features just mentioned), the complexity of the rule sums to 2. Another example is the hypercube in the first row and first column of Fig. 1, which can be summarized by the spatial rule “any object on the left and at the bottom (except the object at the back of the right cube).” Because this rule mentions four literals, its complexity sums to 4. Using the specific dimensions of our experiment, the rule would be exemplified as “any object that has a left disc and is square (except the hatched red one)” (note that the experiment actually used rotations of the dimensions, so the features for a given trial were randomized while obeying the specific structure). A more complex example is the last row of the first column, summarized by “any object on the left and at the bottom (either at the front of the left cube or at the back of the right cube), plus the object at the top right corner at the front of the right cube.” The sum of the literals is here 10. One implementation would be “a square with a left disc (either plain and blue, or red and hatched), plus the plain red circle with a right disc.”
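A rough, runnable approximation of this metric (our own sketch, not the authors' or Feldman's code) is to minimize the set's characteristic Boolean function to a sum-of-products (disjunctive normal) form with sympy and count the literals. Because a plain sum of products does not factor shared terms, it can over-count relative to Feldman's (2003) catalog, returning 6 rather than 4 for the "square and left-disc except the hatched red one" example above; the bit assignment below is likewise hypothetical.

```python
from sympy import Symbol, preorder_traversal, symbols
from sympy.logic.boolalg import SOPform

# Four Boolean dimensions (cf. Table 1): a = blue/red, b = square/circle,
# c = plain/hatched, d = left-disc/right-disc (1 = the first-named feature).
a, b, c, d = symbols("a b c d")

def dnf_literal_count(objects):
    """Minimal sum-of-products for an object set, plus its literal count.

    Upper-bounds Feldman's (2003) complexity, whose catalog also allows factoring."""
    expr = SOPform([a, b, c, d], [list(o) for o in objects])
    literals = sum(1 for node in preorder_traversal(expr) if isinstance(node, Symbol))
    return expr, literals

# Hypothetical 3-object display: square, left-disc, all (colour, texture)
# combinations except the red hatched one (the complexity-4 example above).
display = [(1, 1, 1, 1), (1, 1, 0, 1), (0, 1, 1, 1)]
expr, k = dnf_literal_count(display)
print(expr, k)   # (a & b & d) | (b & c & d) -> 6 literals (catalog value: 4)
```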

Fig. 1
figure 1

Initial structures and complexity estimates, with and without the probe. Change in Complexity corresponds to the difference in complexity between the Initial structure and the Final structure. The sum of shared features represents the number of features that a probe (lure) shares with the stimuli of the initial set

Table 1 Minimal formulae for conditions used in Experiment 1 and Experiment 2 based on the catalog of Feldman (2003). For instance, the features a-a’, b-b’, c-c’, and d-d’ represent the features Blue-Red, Square-Circle, Plain-Hatched, and Left-disc-Right-disc, respectively, as shown in Fig. 1. Correspondence between letters and features was randomly assigned in each trial

In the present study, we aimed at predicting accuracy and response times (RTs) in a comprehensive way as a function of stimulus-set complexity. Our experiments allowed us to study memory for diverse study sets of three objects presented simultaneously, followed by a test object that participants were asked to categorize as new (absent from the array) or old (present in the array), as exemplified in Fig. 2. Our general method consisted of displaying arrays of three objects that were designed to allow associations to be formed. The objects were four-dimensional stimuli varying in shape, color, texture (plain vs. hatched), and position of a white disc within the shape (left vs. right).

Fig. 2
figure 2

Timeline of a single trial in the main experiment

This method allowed us to study how introduction of a lure could modify the participants’ representation. Consider, for example, the array to be studied in Fig. 2. Given that two of the three discs appear on the left of the objects (i.e., for the two squares), an over-compressed representation might include the disc on the left in all three object representations. In that case, a probe item that was a blue circle with the disc on the left would be incorrectly judged to be present in the array. As another example based on Fig. 2, the probe item shown – a red circle with the disc on the right – could be incorrectly identified based on an over-regularized representation in which blue was assigned to all squares, and red to all circles. However, the latter representation is less plausible than the former because both “red” and “disc on the right” are statistically under-represented in the study display. By studying various conditions similar to this example, we discuss the advantage of the present method, which we believe goes beyond previous research that mostly focused on the benefits of compression processes. The present study also goes a step further by investigating how both errors and RTs could result from the putative compression process.

Experiment 1

Overview

The aim of Experiment 1 was to study immediate memory for visual scenes made of arrays of three multi-dimensional objects presented simultaneously. Each array was followed by a probe, and the task consisted of deciding as quickly as possible whether the probe was new or old. The design of a given trial is exemplified in Fig. 2. Based on the compressibility metric presented above, our goal was to assess the quality of the participants’ representations using probes that, depending on the conditions, could completely or partially match the three stimulus objects of the memory array. We recorded RTs and accuracy to measure the effect of the compressibility of the memory sets. Our method also consisted of characterizing the effect of the probe interacting with the memory set, by describing whether introduction of the probe could fool the participants. The idea to be tested was that the participants’ representations might be modified by the introduction of the probe.

Method

Participants

Thirty French participants (Mage = 31.8 years; SD = 10.6) volunteered to take part in the experiment. The sample included 24 females and six males, who had completed between 0 and 8 years of higher education. The experiment was approved by the local ethics committee (CERNI) of the Université Côte d'Azur and was conducted with the informed consent of the participants. To estimate our minimal sample size, we referred to the study by Feldman (2000), in which 45 subjects were asked to memorize similar sets of three four-dimensional Boolean objects. Feldman achieved sufficient power to show a relationship between complexity and proportion of correct recall (R² = .37; a similarly strong relationship was shown with three three-dimensional objects using 22 participants, R² = .98). In Feldman's study, however, participants had to observe the 16 four-dimensional objects for 20 s (the four to-be-memorized objects, called “positive examples,” appeared in the upper half and all other objects appeared in the lower half). The participants were then asked to categorize all 16 objects, presented randomly, as positive or negative during a block called the “categorization task.” This categorization task seemed more demanding than our memorization task, so we roughly estimated that a sample of 30 subjects would at least allow us to observe the Initial complexity effect.

Dimensionality and compressibility of stimulus sets

The task was created using the Python library PsychoPy (Peirce, 2007). The four-dimensional stimuli of Kibbe and Kowler (2011) were used to draw the individual stimulus items, which varied in shape (circle vs. square), color (blue vs. red), texture (plain vs. hatched), and disc position (left vs. right).

The combinations of features allowed the construction of 16 possible objects. We used these stimuli because our experiment required four-dimensional stimuli that did not vary in size, in order to have equal distances between stimuli. We also chose to use displays of three objects because this cardinality facilitated equalization of distances within arrays; all objects were equidistant in the array, as they were arranged at the vertices of an equilateral triangle. Each trial used three different objects shown simultaneously on a single display with a white background (Fig. 2). Table 1 shows all of the combinations that were used in the task to generate the trials, based on the complexity of the display structure and the complexity of the structure including the probe. Following introduction of the probe, our manipulation made the complexity of the four items taken together (i.e., the three stimuli plus the probe) increase, decrease, or stay constant.
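The stimulus space and layout can be sketched as follows (a hypothetical reconstruction; the radius value and the naming are assumptions, not values taken from the original script):

```python
from itertools import product
from math import cos, radians, sin
import random

SHAPES = ("circle", "square")
COLOURS = ("blue", "red")
TEXTURES = ("plain", "hatched")
DISCS = ("left", "right")

# The 2 x 2 x 2 x 2 feature space yields the 16 possible objects.
ALL_OBJECTS = list(product(SHAPES, COLOURS, TEXTURES, DISCS))
assert len(ALL_OBJECTS) == 16

def triangle_positions(radius: float = 200.0):
    """Three equidistant screen positions (pixels, centred origin) at the
    vertices of an equilateral triangle inscribed in a circle of the given radius."""
    return [(radius * cos(radians(angle)), radius * sin(radians(angle)))
            for angle in (90, 210, 330)]

# A hypothetical trial: sample three distinct objects and pair them with positions.
rng = random.Random(1)
display_set = rng.sample(ALL_OBJECTS, 3)
trial = list(zip(display_set, triangle_positions()))
print(trial)
```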

The second column of Fig. 1 indicates the complexity of the initial set displayed in the first column (in the first column, the chosen stimuli are marked with a dot in each of the hypercubes). For instance, the last stimulus set has a complexity of 10 because ten features are necessary to minimally describe the entire set. The third column indicates the complexity of the final stimulus set (the initial set to which we added the probe/lure). For instance, the last case in the figure has a final complexity of 12 because the entire set of four stimuli is not easily compressible; in that case, the addition of the probe makes the entire structure more complex than the initial structure of the three stimuli. The last column of Fig. 1 also indicates the sum of shared features between the initial structure and a lure. For instance, the sum of shared features is easily visible in Fig. 3a, where there are only two shared features between the initial structure on the left and the probe on the right (the feature red and the feature hatched). In Fig. 3b, however, there are four shared features between the initial structure on the left and the probe on the right. The number of shared features served as a way to double the number of observations for each case of interest, by taking advantage of all of the possible variations that were allowed.

Fig. 3
figure 3

Sample of initial sets (stimulus triplet on the left side) and probes (singleton stimulus on the right side). (a) The initial structure presents a complexity of 4, while the final structure presents a complexity of 8. There are two features in common between the initial structure and the probe. This case is coded 4-8-2. (b) Second case: The initial structure presents a complexity of 10, while the final structure presents a complexity of 8. There are four features in common between the initial structure and the probe. We therefore coded this case 10-8-4

The choice of structures and probes was restricted so that the two independent variables could be manipulated orthogonally. We used a total of nine possible structures, listed in Table 1. Each case was numbered using the values described in Fig. 1 for each variable, using a triplet for the three respective measures (initial complexity, final complexity, and sum of shared features). For instance, in Fig. 3a and Fig. 3b, the structures are, respectively, 4-8-2 and 10-8-4. The structures 4-8-2, 4-8-4, 6-6-4, 6-9-3, 6-9-5, 6-10-4, 7-6-4, 10-8-4, and 10-12-4 are shown in Fig. 1 and recapitulated in Table 1. The number of shared features between the initial set and the probe was considered a control variable.

The combinations of different initial structures and different probes allowed us to generate two main independent variables: (1) Initial/display set complexity (inversely corresponding to the compressibility of the display set), and (2) Change between Initial Set Complexity and Final Set Complexity (inversely corresponding to the compressibility of the display set and the probe combined). For instance, in Fig. 3a, the initial structure presents a complexity of 4 while the final structure presents a complexity of 8, yielding an increase of four points in complexity (Change = +4). Also, note that because there are two features in common between the initial structure and the probe in the case of Fig. 3a, we code this case using the triplet 4-8-2 (corresponding to, respectively, Initial Set Complexity, Final Set Complexity, and Number of shared features between the initial set and the probe). Another example is 10-8-4, in which the change in complexity is equal to 8 − 10 = −2, which reflects the idea that the probe can decrease the complexity of the initial structure.

Two predictions follow from the two main factors Initial Complexity and Change in Complexity:

  1. A low complexity of the initial set was hypothesized to account for greater memory performance in general (both change and no-change trials should be performed more easily when the initial set can potentially be compressed, with faster response times given that a more compressed representation could be rapidly decompressed); conversely, performance was expected to be lower when the complexity of the initial set is higher (as participants cannot find a way to simplify information, information load is higher, and decompression takes longer).

  2. Introduction of the probe (i.e., a fourth element) should interact with the initial set of three objects. In particular, the probe was hypothesized to deceive the participant when the complexity of the four elements taken together decreases in comparison to that of the initial set (i.e., a negative Change), because the probe then fits well with the display set; conversely, it was hypothesized that the participant would better detect a change trial when complexity increases, because an increase in complexity implies a probe very different from the display set. This second, less intuitive prediction is still based on the compression account: participants can be deceived by a lure if they over-compress information. One simple example would be a presentation of three colored squares (white, dark grey, black): if these were not correctly encoded (for instance, given a lossy compression such as “all squares” or “squares from white to dark”), the participant would have a greater chance of falsely recognizing a light grey square, but probably not a circle, which would involve a larger modification of complexity. We thus expected that a large increase in complexity due to introduction of the probe would have a greater chance of not fitting a compressed or over-compressed representation. The factor Number-of-shared-features was not of direct interest in the present study and was included only to better generalize our results. To elaborate on an example based on our material, let us presume that a participant has trouble encoding the display set (the first three items only) in Fig. 3a as “anything that is not-right-disc and not-circle (but not hatched and red at the same time).” Note that participants are not expected to encode the set more minimally, for instance using the description “not-right-disc and not-circle” (or “square and left-disc”), because they are instructed to perform successive trials in which all eight potential features can be determinant for a correct response. Taking this constraint into account, when trying to optimize the storage process using the available compressibility, the participant could encode the stimulus set as “square and left-disc” and leave out the exception. In that case, the introduction of a square probe with red hatching would satisfy the over-compressed representation. A false alarm could therefore be committed when the participant is required to decide whether the probe was present in the original stimulus set. When the probe corresponds to one of the three displayed objects, the probability of an omission would be very low based on the participant's over-compressed representation. However, if the probe were a circle, the three squares of Fig. 3a mixed with the circle would not produce a new cohesive ensemble. The complexity of the new ensemble would not match that of the initial set of three stimuli, so rejection of the probe would be facilitated.

Procedure

Participants sat approximately 60 cm from the computer screen. They entered their response for each trial using two keys of a numeric keypad (keys 4 and 6), depending on whether they judged that the probe had been presented in the stimulus display or not. After receiving a series of basic instructions for the task, participants started the experiment with a warm-up series of 15 trials that included feedback. Next, the participants were administered 540 trials with no feedback. Within each of the nine experimental structures, the probe in the no-change trials corresponded to one of the three previously presented objects for 30 trials, and there were 30 other change trials in which the probe (i.e., a lure) was absent from the stimulus display. Participants could take a quick break after the first 200 trials and after 400 trials. The order of the 540 trials was randomized for each participant.

Each trial began with a 2,000-ms fixation cross followed by a 500-ms blank screen (white). The stimulus display then appeared for 1,000 ms, followed by a second 500-ms blank screen (white). Note that the stimulus display was not followed by a random mask, to allow maximal encoding of the stimuli. The probe was then shown for 300 ms before a final blank screen (white) that allowed sufficient time for participants to enter their responses (see Fig. 2). The next trial was initiated automatically by the program. We measured the number of hits (i.e., the participant recognized the probe as one of the stimulus items of the stimulus display), false alarms (FAs), omissions, and correct rejections (CRs), as well as RTs.
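A schematic PsychoPy sketch of this trial timeline is given below. Stimulus construction is reduced to placeholders, and the window settings, key names, and response mapping are assumptions rather than the authors' actual script.

```python
from psychopy import core, event, visual

win = visual.Window(size=(1024, 768), color="white", units="pix")
fixation = visual.TextStim(win, text="+", color="black", height=40)

def show(stims, duration):
    """Draw the given stimuli, flip the window, and hold the frame for `duration` s."""
    for stim in stims:
        stim.draw()
    win.flip()
    core.wait(duration)

def run_trial(array_stims, probe_stim):
    show([fixation], 2.0)          # 2,000-ms fixation cross
    show([], 0.5)                  # 500-ms blank
    show(array_stims, 1.0)         # 1,000-ms three-object array (no mask)
    show([], 0.5)                  # 500-ms blank
    rt_clock = core.Clock()        # time responses from probe onset (an assumption)
    show([probe_stim], 0.3)        # 300-ms probe
    win.flip()                     # blank screen until response
    key, rt = event.waitKeys(keyList=["num_4", "num_6"], timeStamped=rt_clock)[0]
    return key == "num_4", rt      # hypothetical mapping: 4 = "old", 6 = "new"

# Minimal demo with placeholder stimuli (the real script drew the Kibbe & Kowler items).
square1 = visual.Rect(win, width=80, height=80, fillColor="blue", pos=(-150, -80))
square2 = visual.Rect(win, width=80, height=80, fillColor="red", pos=(150, -80))
circle = visual.Circle(win, radius=40, fillColor="blue", pos=(0, 120))
probe = visual.Circle(win, radius=40, fillColor="red", pos=(0, 0))
response, rt = run_trial([square1, square2, circle], probe)
win.close()
```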

Results

The present analysis focuses on two aspects of participants' performance. For each of the following predictions, we ran a linear mixed-effects model using participants as a random variable and allowing different intercepts and slopes across participants (see Brown, 2020; Singmann & Kellen, 2019). Each model used only one dependent variable per analysis, either drift rates or FA rates. Drift rates allowed us to combine the proportion of correct responses, mean correct RTs, and their variance. Drift rate is a parameter of the diffusion model (Ratcliff, 1978), which was developed for speeded binary decision processes. This parameter, generally described as one of the most relevant of the model, describes how quickly the decision process reaches one of the two response options. Drift rates provided an overview of correct performance, instead of using the three dependent variables hit rate, hit RT, and correct rejection RT (see the Online Supplementary Material (OSM), which provides the details for these variables). Because our data set is rather small and we used a two-alternative forced-choice task (and to avoid the complex parameter-fitting procedure of the original Ratcliff diffusion model), drift rates were estimated at a macroscopic level using the EZ diffusion model (Wagenmakers et al., 2007) and were calculated using the R package EZ2. Beforehand, we checked that RTs were right-skewed, that skewness was more pronounced with increasing complexity, that RTs were comparable for each type of response (correct and incorrect), and that RTs were comparable for all incorrect responses across subjects and conditions.
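The drift rates here were computed with the R package EZ2; for readers who want the gist, the closed-form EZ equations of Wagenmakers et al. (2007) recover the drift rate (together with boundary separation and non-decision time) from the proportion correct, the variance of correct RTs, and the mean correct RT. A minimal sketch of those equations follows (proportions correct of exactly 0, .5, or 1 require the edge correction described in that paper, and the numbers in the example are invented):

```python
import math

def ez_diffusion(pc: float, vrt: float, mrt: float, s: float = 0.1):
    """EZ-diffusion point estimates (Wagenmakers et al., 2007).

    pc  : proportion correct (not exactly 0, .5, or 1)
    vrt : variance of correct RTs, in s^2
    mrt : mean correct RT, in s
    s   : scaling parameter (0.1 by convention)
    Returns (drift rate v, boundary separation a, non-decision time Ter).
    """
    s2 = s ** 2
    logit = math.log(pc / (1.0 - pc))
    x = logit * (logit * pc ** 2 - logit * pc + pc - 0.5) / vrt
    v = math.copysign(1.0, pc - 0.5) * s * x ** 0.25
    a = s2 * logit / v
    y = -v * a / s2
    mdt = (a / (2.0 * v)) * (1.0 - math.exp(y)) / (1.0 + math.exp(y))
    return v, a, mrt - mdt

# Invented example: 90% correct, mean correct RT 0.72 s, RT variance 0.045 s^2.
v, a, ter = ez_diffusion(pc=0.90, vrt=0.045, mrt=0.72)
print(v, a, ter)
```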

Initial Complexity and Change of Complexity were considered the main independent, manipulated factors. We also used the number of shared features between the probes and the display set as a covariate to better control for similarity effects. We also computed interactions between factors to refine our analyses, but we made sure our models were not overly complex by reporting values of the Akaike Information Criterion (AIC; see Akaike, 1987), which offers a trade-off between the goodness of fit of a model and its simplicity.
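A rough Python analogue of this model-comparison step (hypothetical column names; the original analyses were presumably run in R) could use statsmodels, fitting by maximum likelihood so that AICs of models with different fixed effects remain comparable. Because statsmodels' MixedLM does not expose an AIC attribute directly, it is derived here from the log-likelihood:

```python
import pandas as pd
import statsmodels.formula.api as smf

def fit_and_aic(formula: str, data: pd.DataFrame, re_formula: str = "~1"):
    """Fit a linear mixed model (random intercepts/slopes per participant) by ML
    and return the fitted result together with an approximate AIC."""
    model = smf.mixedlm(formula, data, groups=data["participant"],
                        re_formula=re_formula)
    result = model.fit(reml=False)            # ML so fixed-effect AICs are comparable
    k = result.params.shape[0] + 1            # approximate parameter count (+1 residual)
    return result, 2 * k - 2 * result.llf

# Hypothetical data frame with columns:
# participant, drift, initial_complexity, change_complexity, shared_features
# data = pd.read_csv("exp1_drift.csv")
# full, aic_full = fit_and_aic(
#     "drift ~ initial_complexity * change_complexity + shared_features",
#     data, re_formula="~initial_complexity + change_complexity")
```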

Firstly, we posited that higher complexity of the initial set should account for lower memory performance. We thus expected lower drift rates and higher FA rates because of a failure to encode stimulus features of the highest complexity sets.

Secondly, an increase in complexity caused by the introduction of a lure probe should account for greater memory performance (lower FA rates and higher drift rates). The rationale was that an increase in complexity due to introduction of the lure probe has a greater chance of not fitting a compressed (or over-compressed) representation of the display set. The decision (i.e., reject the lure) was thus expected to be easier with a greater change in complexity. Conversely, we expected higher FA rates and lower drift rates when introduction of the lure does not increase complexity, with the idea that this manipulation can help detect an errant compression process.

The OSM provides more detailed descriptive statistics of the hit rates, FA rates, hit RTs, and correct rejection RTs across the paired structures.

We removed only 2.5% of the data, corresponding to RTs less than 250 ms or greater than 2,000 ms. We chose to keep relatively long RTs (skewness = 1.23) to study the potential effect of complexity.

For each of the two dependent variables, the following subsections first describe the result of the mixed model applied to a given dependent variable before presenting the statistical tests for each of the paired comparisons. For the paired comparisons, we applied the Bonferroni correction for multiple paired comparisons (each correction was applied within a given dependent variable, therefore never exceeding two comparisons, thus with a threshold at p = .025). Tables 2 and 3 show the results of the linear mixed models run on drift rates and FA rates. The paired comparisons across all dependent variables are shown in Fig. 4.

Table 2 General effect of Initial set complexity, Change in complexity, and Sum of shared features on drift rates
Table 3 General effect of Initial set complexity, Change in complexity, and Sum of shared features on false alarm rates
Fig. 4
figure 4

Effects of Initial complexity and Change of complexity in Experiment 1. By panels, effects of the Initial set complexity as measured by (a) drift rates (×100), and (b) false alarm rates, and effect of Change in complexity as measured by (c) drift rates (×100), and (d) false alarm rates. The plots are restricted to the paired comparisons ceteris paribus. Error bars represent ±1 SE. Note. The triplets of numbers used to code conditions correspond to, respectively, Initial Set Complexity, Final set complexity, and Number of shared features between the initial set and the probe

Drift rates

To analyze whether complexity levels affected memorization, we first ran a linear mixed model to study the influence of Initial Complexity and Change of Complexity on drift rates. We also used paired comparisons when the factor Sum of shared features could be held constant.

Table 2 shows the results of the linear mixed model run on drift rates as a function of Initial Set Complexity, Change in Complexity, and adding Sum of shared features as a control factor. The Change in Complexity factor could have positive values (i.e., complexity increased when introducing the lure), negative values (i.e., complexity decreased when introducing the lure), or a null value (complexity did not change in spite of the new structure). See Table 1.

Contrary to our expectations, the mixed model showed no significant decrease of drift rates as a function of Initial Set Complexity. We nevertheless observed an increase of drift rates with a higher change in complexity between the initial set and the final set (t(2,258) = 2.905, p = .004), as expected, but this effect was counterbalanced when complexity of the initial set was the highest (as captured by the interaction between the two factors: t(2,258) = −2.220, p = .027). A decrease was nevertheless observed for the two additional paired comparisons testing the effect of Initial Set Complexity (Fig. 4a): drift rates decreased significantly both for the pair 4-8-4 versus 10-8-4 (t(29.00) = −6.48, p < .001) and for the pair 6-6-4 versus 7-6-4 (t(28.95) = −2.56, p = .016). We also conducted analyses based on the comparisons 6-6-4 versus 6-10-4 (in the condition 6-6-4 the probe did not change complexity, whereas in the condition 6-10-4 the probe increased complexity by 4 points) and 10-8-4 versus 10-12-4 (a Change of −2 and +2, respectively). In Fig. 4c, these paired comparisons did not show a significant effect (6-6-4 vs. 6-10-4, t(2,900) = 0.59, p = .059; 10-8-4 vs. 10-12-4, t(2,900) = −0.58, p = .565). Drift rates combine correct-response rates and correct-response RTs, but these last two paired comparisons concerned the effect of the modification of complexity by the lure probe, for which drift rate appears to be an insufficient measure. We therefore examined FA rates separately.

False alarm (FA) rates

The following two analyses concern situations in which a lure was displayed, allowing us to study both Initial Complexity and Change of Complexity. For FA rates, we ran a linear mixed model including an interaction term between the two factors of interest, and we also ran paired comparisons as a function of the factor Change of Complexity (by maintaining the two other factors constant).

Table 3 shows the result of the linear mixed model run on FA rates as a function of Initial Set Complexity, Change in Complexity, and adding Sum of shared features as a control factor. An FA corresponds to an incorrect answer in a change trial (a false recognition of the lure as part of the initial display set).

Contrary to our expectations, the mixed model showed no significant effect of Initial Set Complexity on FA rates. We nevertheless observed a decrease of FA rates with a higher change in complexity between the initial set and the final set (t(1,862) = −7.127, p < .001), as expected, but this effect was counterbalanced when complexity of the initial set was the highest (as captured by the interaction between the two factors: t(7,731) = 5.865, p < .001). This result is apparent in the analyses based on the comparisons 6-6-4 versus 6-10-4 and 10-8-4 versus 10-12-4: in Fig. 4b, paired comparisons showed a significant increase in FAs with increased Initial Set Complexity (4-8-4 vs. 10-8-4, t(1,733) = 3.579, p < .01; 6-6-4 vs. 7-6-4, t(1,721) = 3.135, p = .002), and in Fig. 4d, paired comparisons showed a significant decrease in FAs with an increased change in complexity when Initial Set Complexity was low (6-6-4 vs. 6-10-4, t(1,742) = −3.494, p < .001) but not when Initial Set Complexity was higher (10-8-4 vs. 10-12-4, t(1,712) = 1.72, p = .086). For the latter comparison, the greater complexity of the initial set (i.e., 10) might not have facilitated memorization whatsoever (we observed the lowest hit rates for these two structures, as shown in the preceding analysis).

To go further with our analysis of FA rates, we attempted to control for the effect of the number of new features introduced with the probe (features in the lure not present in the initial set). Indeed, if participants only saw a set of squares in the initial set before being presented with a circle in the test display, this new feature “circle” could allow easier rejection of the lure (Mewhort & Johns, 2000), which could account for a decrease in the FA rate. The appearance of one or two new features in the lure did significantly decrease the FA rates in comparison to the conditions in which no new feature appeared (F(1,29.058) = 25.31, p < .001). Nevertheless, when we selected only the conditions for which no new feature appeared in the lure, we still found a significant effect of Initial Set Complexity (F(1,445) = 20.13, p < .001). There was, however, no effect of Change in Complexity (F(1,32) = 1.59, p = .216), but note that the conditions having no new features in the probe were the three conditions with the highest Initial Set Complexity, and the previous analyses indicated that no effect of Change in Complexity appeared when initial complexity was too high. Thus, the effects observed on FA rates were not entirely linked to the number of new features introduced with the lure.
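Concretely, a probe feature counts as “new” if its value on a given dimension appears in none of the three display objects. A small helper under that reading of the definition, with objects encoded as hypothetical 4-tuples of feature labels:

```python
def n_new_features(probe, display_set):
    """Count probe feature values that appear on no object of the display set."""
    return sum(
        all(obj[dim] != probe[dim] for obj in display_set)
        for dim in range(len(probe))
    )

# If all three display objects are squares with a left disc, a right-disc circle
# probe introduces two new features ("circle" and "right").
display = [("square", "blue", "plain", "left"),
           ("square", "blue", "hatched", "left"),
           ("square", "red", "plain", "left")]
print(n_new_features(("circle", "red", "hatched", "right"), display))  # -> 2
```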

Comparative analysis of drift rates and FA rates for all independent variables

Our last analysis was based on the two dependent variables (drift rates and FA rates), which were both predicted to be influenced by the two main factors (Initial Set Complexity and Change in Complexity). Our goal was to verify whether the full model including the interaction term was effectively the most parsimonious model. The two mixed-model analyses progressively included the two factors of interest, their interaction, and the additional control factor Shared features. The successive models were tested to obtain the most parsimonious account of the data, by computing the AIC for each model (see Table 4). A lower AIC corresponds to a better model (i.e., a parsimonious model that achieves an adequate fit with minimal model complexity). In the case of drift rates, the mixed model including the three factors led to the best AIC (i.e., -2618.618). For the FA rates, the best model corresponded to the model including only Sum of shared features (i.e., -1795.354).

Table 4 Comparisons of mixed model based on the AIC criterion for Drift rates and False alarm (FA) rates

Discussion

Our aim was to test whether visual memory capacity is determined by a process of data compression, by studying correct responses, errors, and RTs. Our compressibility metric was based on the complexity of the stimulus material, with more complex material being theoretically less compressible. We measured accuracy and speed of responses in order to analyze FA rates and drift rates. The drift rates integrated accuracy and RTs for correct responses. Our assumption was that individuals can develop, on the spot, a compressed representation of a display set obeying regularities (e.g., correlated features), and this design was thought to test the compression hypothesis without the need to retrieve pre-existing chunks (as is the case in the studies by Brady et al., 2009, and Reder et al., 2016). We also assumed that the compression process involves a trade-off, with the risk that compressing information too much results in lossy representations. We expected that compressibility in the display set would enable greater and faster memory performance, based on the idea that a shorter compressed representation would take less time to be decompressed. A more novel prediction was that the introduction of a lure would interact with the compressed representation to modify the initially perceived complexity.

For the factor Initial Set Complexity (i.e., the complexity of the display set), as predicted, our findings showed a decrease in performance (smaller drift rates) with higher complexity, indicating a higher cognitive load for less compressible display sets in pairwise comparisons of the complexity structures. The effect of complexity on FA rates was not found in the mixed model, but it was detected in the paired comparisons.

When we manipulated a change in complexity with introduction of a lure, our findings showed less pronounced effects. Within the paired comparisons, we observed a large decrease of FAs with a higher change in complexity when the complexity of the initial set was low (i.e., in the pair 6-6-4 vs. 6-10-4). We did not observe the expected effect when the complexity of the initial set was high (i.e., in the pair 10-8-4 vs. 10-12-4). Our interpretation is that the lure might not affect the decision process when encoding of the stimulus set was already too difficult. Analyses of drift rates, however, showed the expected effect of Change in Complexity in the mixed model: a higher change in complexity resulted in higher drift rates, meaning that a probe can more easily be identified as not belonging to the initial set when complexity changes. Change in Complexity nevertheless interacted with the Initial Set Complexity factor, again revealing that the effect of Change in Complexity was absent when the complexity of the initial set was high (10-8-4 vs. 10-12-4). Aside from the general effect captured by the mixed model, no significant direct effect could be isolated from the paired comparisons.

Finally, the effect of new features in the probe was tested and an effect of Initial Set Complexity on FA rates was still found even when no new feature was introduced in the probe.

Experiment 2

Experiment 1 suggested an effect of change in complexity caused by the addition of a probe to the initial stimulus set. This effect could be due to a modified perception of the initial complexity of the stimulus set when the probe has to be compared to the memorized stimuli. However, this effect was only observed in the paired condition 6-6-4 versus 6-10-4, with lower performance in 6-6-4, as participants seemed to be more easily lured by a probe that did not modify the level of complexity; this difference was not observed in the more complex pair of conditions 10-8-4 versus 10-12-4. We thus concluded from Experiment 1 that the effect of change in complexity primarily depends on initial stimulus set complexity, as participants might not be able to properly encode information when complexity is too high, but the experimental conditions did not allow us to study intermediate levels of complexity.

In Experiment 2, new conditions were tested to study a larger and better-controlled range of complexity effects. The new conditions were designed to better capture the two opposite effects of change in complexity (i.e., an increase vs. a decrease in complexity with introduction of the probe) for different levels of initial set complexity (low vs. medium vs. high, corresponding to initial stimulus set complexity of 4, 6, and 10, respectively).

Method

Participants

Forty French participants (Mage = 25.9 years; SD = 6) volunteered to take part in this experiment. The sample included 25 females and 15 males, who had completed between 0 and 5 years of higher education and were recruited at Université Côte d’Azur or from the Alpes-Maritimes county through advertisements. We targeted a sample roughly equivalent to that of Experiment 1, but recruitment yielded a few additional participants.

Procedure and materials

The procedure of Experiment 2 was similar to that of Experiment 1, except that we focused on a more comprehensive set of complexity levels. Three levels of initial set complexity were selected: low (4), medium (6), and high (10). For each level, the probe could either decrease or increase the complexity of the initial set (see Table 1).

Results

As in Experiment 1, the analysis first focused on the effect of Initial Set Complexity on drift rates, which were hypothesized to reflect better memory performance with lower initial complexity. As in Experiment 1, we checked that RTs were right-skewed, that skewness became more pronounced with increasing complexity, that RTs were comparable for each type of response (correct and incorrect), and that RTs were comparable for all incorrect responses across subjects and conditions. The second hypothesis again posited that a decrease in complexity caused by the introduction of the lure probe should account for lower memory performance (higher FA rates). Again, the rationale of this prediction is that if participants over-compress information from the initial stimulus set, there is a greater chance that they will be lured by a probe that decreases the complexity of the four objects.

To use a range of RTs similar to that of Experiment 1, we removed 7% of the data, corresponding to RTs less than 250 ms or greater than 2,000 ms. The Appendix shows the descriptive statistics of the hit rates, FA rates, hit RTs (i.e., the time taken to recognize the probe as being part of the stimulus items), and correct rejection RTs across the conditions (grouped by level of initial complexity). As in Experiment 1, all linear mixed-effects models were run using participants as a random effect with individual intercepts and slopes.

Drift rates

Figure 5a shows the drift rates as a function of Initial Set Complexity. As in Experiment 1, the linear mixed model testing the sole effect of the factor Initial Set Complexity showed a significant decrease of drift rates with higher complexity (F(1,235) = 12.031, p < .001).

Fig. 5
figure 5

Effects of Initial complexity and Change in complexity in Experiment 2. By panel, (a) Initial Set Complexity measured by the drift rates (×100), (b) Initial Set Complexity and change in complexity measured by drift rates (×100), (c) Initial Set Complexity and change in complexity measured by false alarm (FA) rates, (d) Initial Set Complexity measured by drift rates (×100) with fixed Shared Features, and (e) Initial Set Complexity measured by FA rates with fixed Shared Features. Error bars represent ±1 SE. Note. For panels d and e, the triplets of numbers used to code conditions correspond to, respectively, Initial Set Complexity, Final set complexity, and Number of shared features between the initial set and the probe

Figure 5b shows the drift rates as a function of Complexity of the initial set and Change of Complexity. Again, as in Experiment 1, the expected increase in performance when complexity increased with the introduction of the probe was not observed for the highest initial complexity. A striking result was the effect of the type of probe on drift rates when complexity of the initial set was low. In this case, performance was the weakest when introduction of a lure decreased the complexity of an already simple initial display set, whereas performance was the highest when presentation of the lure increased complexity. Paired comparisons also showed a decrease of drift rates with an increase of Initial Set Complexity for both paired conditions (4-8-4 vs. 10-8-4, t(3,900) = −6.04, p < .001; 10-8-4 vs. 10-12-4, t(3,900) = −2.12, p = .041).

Table 5 shows the results of the linear mixed model run on drift rates for the factors Initial Set Complexity and Change in Complexity (adding Sum of shared features as a control factor), confirming all main effects and interactions.

Table 5 General effect of Initial Set Complexity, Change in Complexity, and Sum of shared features on drift rates shown in Fig. 5b

Figure 5d shows the combined effect of both Initial Set Complexity and Change in Complexity on memory performance. The combined effect was independent of the factor Shared features, which suggests that complexity effects were not based solely on similarity effects between the probe and the stimulus set.

FA rates

As in Experiment 1, the following analyses of FA rates concern situations in which a lure was displayed, allowing us to study the combined effect of Initial Complexity and Change of Complexity and their interaction.

Figure 5c shows the effect of Change in Complexity as a function of Initial Set Complexity on FA rates. Consistent with our assumptions, participants correctly rejected the probe when it increased the complexity of the initial stimulus set, with no effect of initial stimulus set complexity. In contrast, when the probe decreased complexity, participants had difficulty rejecting it, and this difficulty varied with the level of Initial Set Complexity. Table 6 shows the result of the linear mixed model run on FA rates as a function of Initial Set Complexity and Change in Complexity (again, adding Sum of shared features as a control factor). The results are similar to those of Experiment 1, as the mixed model showed an interaction indicating a different effect of Change in Complexity as a function of Initial Set Complexity. The analysis also indicated a significant interaction between Initial Set Complexity and Shared features. Nevertheless, Fig. 5e shows a combined effect of both Initial Set Complexity and Change in Complexity on performance independent of the factor Shared features, which suggests that complexity effects are not based solely on similarity effects between the probe and the stimulus set. It is also interesting that the case 10-8-4, for which Initial Complexity was the highest (thus making the initial set difficult to memorize) and Change in Complexity the lowest (i.e., −2, thus making the lure deceptive in case of an over-compression process), produced the highest FA rate. Overall, however, the clearest decreasing trend was observed from 6-10-4 to 4-8-4, and because the Change in Complexity is similar in these two cases, initial complexity seems to take precedence in accounting for this decreasing trend. Moreover, paired comparisons also indicated an effect of Initial Set Complexity on FA rates, with FA rates increasing with Initial Complexity (4-8-4 vs. 10-8-4, t(2,230) = 5.57, p < .001), but no effect of Change in Complexity when Initial Set Complexity was too high (10-8-4 vs. 10-12-4, t(2,199) = −1.11, p = .269).

Table 6 General effect of Initial Set Complexity, Change in Complexity, and Sum of shared features on false alarm rates shown in Fig. 5c

To refine these results on FA rates, we controlled for the effect of the number of new features introduced with the probe (i.e., the features in the lure that were not part of the initial set), as in Experiment 1. The appearance of one or two new features in the lure significantly decreased the FA rates compared to the conditions in which no new feature appeared (F(1,38.597) = 28.60, p < .001). Nevertheless, when we selected only the conditions for which no new feature appeared in the lure, we found both a significant effect of Change in Complexity (F(1,119) = 21.17, p < .001) and an interaction between Initial Set Complexity and Change in Complexity (F(1,112) = 20.14, p < .001). Thus, the effects observed on FA rates were not entirely linked to the number of new features introduced with the lure.

Also, the condition 4-2-8 deserves attention, as it is the only condition presenting a low initial complexity together with a decrease in complexity; more importantly, the FA rate for this condition was the highest observed in our experiments. To refine the analysis of this condition, we split the data according to block number in the task. We only included the conditions with zero new features introduced with the probe, as well as conditions in which complexity decreased. For simplicity, the data were split as a function of the first block (first decile of trials) versus the rest of the blocks collapsed. Figure 6 shows that FA rates in the condition 4-2-8 were higher in the first block than in the rest of the blocks (while still remaining higher than in the other conditions). This decreasing pattern, which did not appear for other conditions and seemed specific to the condition 4-2-8, may indicate that participants were mostly deceived by this condition at the beginning of the task but progressively adapted to the task at hand and encoded the features better.
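The block split used here amounts to a simple grouping operation; a pandas sketch with hypothetical column names:

```python
import pandas as pd

def fa_by_block_split(df: pd.DataFrame, first_fraction: float = 0.1) -> pd.DataFrame:
    """Mean false-alarm rate per condition, first decile of trials vs. the rest.

    Expects one row per change trial with columns: participant, condition,
    trial_index (0-based within participant), false_alarm (0/1)."""
    n_trials = df.groupby("participant")["trial_index"].transform("max") + 1
    df = df.assign(block=(df["trial_index"] < first_fraction * n_trials)
                   .map({True: "first", False: "rest"}))
    return (df.groupby(["condition", "block"])["false_alarm"]
              .mean().unstack("block"))
```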

Fig. 6
figure 6

False alarm rates for conditions with zero new features and where complexity decreased with the lure between the first blocks and the other blocks. Error bars represent ±1 SE. Note. The triplets of numbers used to code conditions correspond to, respectively, Initial Set Complexity, Final set complexity, and Number of shared features between the initial set and the probe

Comparison between Experiment 1 and Experiment 2

Finally, we wanted to verify whether Experiment 1 and Experiment 2 yielded comparable memory performance for variables that were similar in the two experiments. To test the robustness of our results in the simplest way, we focused only on the single effect of the factor Initial Set Complexity, because the two experiments used the same levels of Initial Set Complexity (i.e., 4, 6, and 10). The two other factors were globally not comparable and were left out (simply because the changes in complexity that were manipulated were not necessarily the same in the two experiments). Because we could not systematically focus on comparable conditions involving a change in complexity between the two experiments, the present analysis only focused on hits, for which there was no change.

To analyze the effect of the factor Initial Set Complexity on hits, the hit rates and hit RTs were each converted to z scores. The z scores were computed by aggregating performance per participant and then standardizing these values using the mean and SD computed within each experiment. This transformation was intended to reduce cohort effects between experiments. As can be seen in Fig. 7, we observed better performance with lower complexity in both experiments, and memory performance was very similar across the two experiments.
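This standardization is a per-experiment groupby transform; a pandas sketch with hypothetical column names:

```python
import pandas as pd

def zscore_within_experiment(df: pd.DataFrame, measure: str) -> pd.DataFrame:
    """Aggregate a measure per participant/complexity cell, then z-score it using
    the mean and SD of that measure within each experiment.

    Expects columns: experiment, participant, initial_complexity, <measure>."""
    cell = (df.groupby(["experiment", "participant", "initial_complexity"])[measure]
              .mean().reset_index())
    grouped = cell.groupby("experiment")[measure]
    cell[measure + "_z"] = (cell[measure] - grouped.transform("mean")) / grouped.transform("std")
    return cell
```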

Fig. 7
figure 7

Effects of Initial Set Complexity in Experiment 1 and Experiment 2 measured by (a) hit rates converted to Z scores, and (b) hit response times (RTs) converted to Z scores. Error bars represent ±1 SE

For both dependent variables, the Bayesian repeated-measures ANOVA indicated that the model including only the factor Initial Complexity showed evidence against the null hypothesis (BF10 > 1e+10). This model was also considered better than the null model that did not include the factor Initial Complexity, for both dependent variables (BFm > 1e+14). However, for both variables, including the factor Experiment or the interaction term did not increase the model probability (BFm < 3). See Table 7. We can thus conclude that the effect of complexity was quite robust across experiments (i.e., the effect was constant under a variety of conditions).

Table 7 Model comparison for the factors Initial Set Complexity and Experiment based on (a) hit rates converted to Z scores, and (b) hit response times (RTs) converted to Z scores shown in Fig. 7

Discussion

In agreement with our findings in Experiment 1, the results of Experiment 2 indicate an effect of stimulus set complexity, with higher Initial Set Complexity corresponding to lower memory performance, and this effect cannot be explained solely by similarity effects (measured by the number of shared features between the probe and the stimuli). For the factor Change of Complexity (i.e., the change in complexity hypothesized to occur when the probe mixes with the stimuli), the results showed variable effects depending on Initial Set Complexity, as in Experiment 1. When initial complexity is low, a large decrease in FAs following an increase in complexity with the probe can be observed. This corresponds to the greater ability of the participants to detect a change when the stimuli can be more easily encoded. However, with moderate initial complexity this decrease in FAs is less pronounced, and no decrease is detectable when the stimulus set complexity is initially high. Overall, because the rate of FAs is comparatively low when complexity increases with introduction of the lure, we conclude that FAs are essentially provoked by a decrease in complexity, as predicted by the over-compression hypothesis. Participants seem to encode stimuli as lossy representations, which eventually deceive them when a lure that does not substantially modify the complexity level is introduced. We also observed overall lower performance in this case, indicating that the decision was difficult even when it did not result in FAs. These results are in accordance with Experiment 1, which only showed an effect of change in complexity with a moderate complexity of the initial set. Thus, Experiment 2 extends the results of Experiment 1 by indicating a clearer interaction between Initial Stimulus Set Complexity and Change of Complexity produced by a lure.

General discussion

Previous studies have not yet considered the hypothesis that a cognitive compression process could also produce over-simplified information because of limitations of the compression process itself. We thus hypothesized that the over-regularization of features of a visual scene could produce false recognition of patterns, not because of storage capacity limits but because of compression limits. In two experiments, we prompted a compression process by using material whose underlying information varied in compressibility, and our analysis targeted how much compression could take place successfully for a given complex stimulus. In Experiment 1, we used a diverse set of patterns allowing a great number of haphazard variations of features in order to study different associations of stimulus sets and probes, whereas Experiment 2 used a better-controlled set of stimuli to separate the complexity of the visual scene from the complexity of the probe.

Our findings, summarized in Table 8, confirm previous results showing that working memory performance is higher when regular patterns are present in visual material, indicating that data compression can occur on the spot for newly encountered visual material (Chekaf et al., 2016). This result can help explain why chunking processes seem to reduce the load in working memory (Cowan et al., 2012; Norris & Kalm, 2018; Thalmann et al., 2019). A potential role of working memory might be to compress information to form chunks in long-term memory, which would free up capacity for subsequently encoded material. A question for future research is whether a chunk is initially encoded into working memory or whether long-term memory is sufficiently autonomous to progressively compact diverse elements into a chunk. We also observed higher performance for more compressible sets, which can be attributed to the idea that shorter compressed representations take less time to be decompressed. This was observed in previous research showing that the RTs taken to categorize learned stimuli depend on their compressibility (Bradmetz & Mathy, 2008), or on simple logical rules plus exceptions that resemble a compressibility metric (see Fific et al., 2010; Nosofsky et al., 1994). Our results thus confirm that perceptual grouping can help observers summarize information and make faster decisions when data compression is achieved.

Table 8 Summary of main results for each dependent variable (DV; i.e., false alarm (FA) rates and drift rates) across experiments

We believe that our results also yield additional insights. They show that, in attempting to benefit from redundant information in a display set, participants can make errors typical of a compression process. False recognition could be due to interference when the display set and the target share similar features (Oberauer & Lin, 2017), but this aspect was controlled. From the observation that different patterns of errors occurred across the experimental conditions defined by the compressibility metric, we can only infer that participants tended to over-compress information, which caused false recognition of a lure, in particular when the introduction of the probe decreased the complexity of the initial set (the design here simulating extra compression of information). These findings can contribute to a better understanding of why some memory errors seem less costly than others (Sims, 2015). Indeed, there are two ways of benefiting from structured information using data compression: participants can encode information using a lossless compression process (i.e., the initial object is faithfully represented), or they can attempt to compress information maximally, which can result in a lossy compressed representation (i.e., the initial object might not be faithfully compressed; see Haladjian and Mathy (2015), who studied memory precision for spatial information).

As in Mewhort and Johns (2000), we observed in both experiments an effect on the FA rate of lures containing features that were not presented in the display set. However, when the lure was entirely made of features presented in the display set, we still observed an effect of Initial Set Complexity (Exp. 1 and Exp. 2) and an interaction between Initial Set Complexity and Change in Complexity in Experiment 2. This interaction was not observed in Experiment 1 because these conditions were the most complex, therefore limiting the effect of Change in Complexity.

From a theoretical standpoint, the hypothetical slots in working memory could integrate chunks formed by a lossy compression process, which would explain why some data favor both discrete-capacity memory models and models based on a shared resource, which predict variation in resolution in working memory. Lossy compression could also relate to the memory distortion effects found in studies in which ensemble statistics bias memory for individual items (Brady & Alvarez, 2011). Our results seem to indicate that lossy compression can occur on the spot, which confirms a study by Nassar et al. (2018), who showed that participants can implement a lossy form of data compression to improve working memory performance through reinforcement learning. Following Nassar et al., we believe that lossy compression effects could help reconcile theories of working memory capacity based on either discrete limitations (i.e., a series of slots) or continuous limitations (i.e., a divisible resource). Compression in our study operated on conjunctions of features, instead of simply being driven by the mean of different values of a single continuous dimension as in Brady and Alvarez (2011). Our data, however, contrast with those of Nassar et al. because, in an additional analysis, we observed that performance (a greater number of hits and a lower number of FAs) significantly improved as a function of trial number (additional mixed models showed a significant effect of trial number on these two dependent variables, both ps < .0003). In our case, this can mean that the amount of information in our material was adequate to fit the processing ability of participants, who could progressively adopt a lossless rather than a lossy compression strategy. In sum, our particular material might have induced specific compression processes by letting participants adapt to the task at hand, and the specific range of complexity in our material could also explain why our findings diverge from those of Nassar et al.
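
As a hedged sketch of the additional trial-number analysis mentioned above (the exact model specification of the original analysis is not reproduced here), a mixed model with random intercepts per participant could look as follows; the column names (hit, trial, participant) are hypothetical, and a binary dependent variable would more properly call for a logistic mixed model.

```python
# Hedged sketch: linear mixed model testing whether performance changes across
# trials, with random intercepts per participant. Column names are placeholders.
import statsmodels.formula.api as smf

def trial_effect(df, dv="hit"):
    model = smf.mixedlm(f"{dv} ~ trial", data=df, groups=df["participant"])
    result = model.fit()
    # Coefficient and p-value for the trial predictor index the change over trials
    return result.params["trial"], result.pvalues["trial"]
```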

Our main conclusion is that memory performance for visually presented material depends on compressibility. For each visual display, memory was successful to the degree that the set of stimuli could be faithfully compressed, and memory seemed to fail for the inverse reason: some of the most compressible sets of stimuli could not be faithfully compressed and appeared to be over-compressed. Although we used a metric based on the logical structure of information, this does not mean that we consider compression to operate exclusively by converting visual information into logical descriptions. We do not assume that stimuli are encoded in a verbal format or strictly represented as Boolean functions rather than in a visual format. Rather, the ease with which a display can be stored seems to depend on the internal structure of the information, that is, on how features are distributed among objects, which allows a certain degree of compression. The encoded features could be either verbal or visual information stored more minimally, because both types of information were likely to be encoded (we intentionally used a procedure that let participants encode the stimuli freely). The exact format in which participants encoded information remains uncertain and needs further study. Unfortunately, subjective reports are unlikely to provide reliable clues about which modality was preferred, because introspection can only reveal partial aspects of the encoding process, not to mention that introspection can be misleading (e.g., participants may have the impression that verbalizing the features helped them, whereas memory performance was mostly driven by how visual information was implicitly encoded) and that verbalization may itself interfere with this process. Further studies should therefore consider the simpler approach of using stimuli followed by masks, together with articulatory suppression during encoding.

To conclude, our study shows that data compression can occur during the short-term memorization of stimulus sets made of discrete features. This phenomenon can be detected in performance that correlates with complexity and in typical memory errors that seem to derive from a lossy compression process.