
Visual long-term memory is not unitary: Flexible storage of visual information as features or objects as a function of affect

Abstract

Research has shown that observers store surprisingly detailed long-term memory representations of visual objects after only a single viewing. However, the nature of these representations is currently not well understood. In particular, it may be that the nature of such memory representations is not unitary but reflects the flexible operation of two separate memory subsystems: a feature-based subsystem that stores visual experiences in the form of independent features, and an object-based subsystem that stores visual experiences in the form of coherent objects. Such an assumption is usually difficult to test, because overt memory responses reflect the joint output of both systems. Therefore, to disentangle the two systems, we (1) manipulated the affective state of observers (negative vs. positive) during initial object perception, to introduce systematic variance in the way that visual experiences are stored, and (2) measured both the electrophysiological activity at encoding (via electroencephalography) and later feature memory performance for the objects. The results showed that the nature of stored memory representations varied qualitatively as a function of affective state. Negative affect promoted the independent storage of object features, driven by preattentive brain activities (feature-based memory representations), whereas positive affect promoted the dependent storage of object features, driven by attention-related brain activities (object-based memory representations). Taken together, these findings suggest that visual long-term memory is not a unitary phenomenon. Instead, incoming information can be stored flexibly by means of two qualitatively different long-term memory subsystems, based on the requirements of the current situation.

Recent research has shown that observers are able to successfully recognize thousands of visually presented objects after only a single viewing, even when highly detailed knowledge about visual features is necessary for correct recognition (e.g., Brady, Konkle, Alvarez, & Oliva, 2008), suggesting that humans possess a visual long-term memory system with a massive storage capacity for incoming visual information.

However, one important unanswered question concerns the nature of the stored visual long-term memory representations. From a perceptual perspective, two qualitatively different processing steps are involved when visual scenes are initially represented in the cognitive system, and both of their outputs may provide the basis on which previously encountered objects are later recognized. First, signals from the retina are analyzed to extract visual features such as orientation, color, and so forth, a process by which highly detailed representations of independent features are created that are closely linked to the physical properties of the visual scene. Second, these independent feature representations are recoded on the basis of a stored inner model of the structure of the world that reflects learned regularities in the visual input, a process by which informative features are integrated into coherent object representations and uninformative features are discounted, leading to the phenomenal experience of a visual scene that is segregated into coherent objects (Riesenhuber & Poggio, 1999; Serences & Yantis, 2006; Tarr, 1995; Treisman & Gelade, 1980).

The prevailing view is that only the output of the object-processing stage is stored in long-term memory, whereas the initial independent feature representations are quickly lost (Dehaene, 2014; Rensink, 2002). Such a view is based on studies suggesting that information is stored in visual memory in terms of objects (e.g., Cowan, 2001; Luck & Vogel, 1997), as well as research in the domain of perceptual memory and change blindness, demonstrating that sensory information is available only for short amounts of time (e.g., Rensink, 2002; Sperling, 1960). The underlying mechanism is often assumed to be that only those sensory features selected by attention are bound together into an object representation, and thereby converted into a more durable form of representation that can be stored in long-term memory (Gegenfurtner & Sperling, 1993; Treisman & Gelade, 1980).

However, there is also evidence that initial sensory-based representations can be stored over long time periods in surprising detail. One line of evidence shows that spatial frequency information, which is considered to be extracted in a first, feed-forward sweep in the visual cortex, can be retained with high precision over at least 24 h (Magnussen & Dyrnes, 1994; Magnussen, Greenlee, Aslaksen, & Kildebo, 2003; but see Lages & Paul, 2006). Another line of evidence shows that a single exposure to a visual image enhances processing when the same stimulus is encountered again (i.e., repetition priming), an effect that shows little decrease after days, months (Mitchell & Brown, 1988), or even years (Mitchell, 2006), and that even occurs for novel visual stimuli that observers have had no previous experience with (Musen & Treisman, 1990).

Thus, it seems that visual experiences can be stored by two separate visual long-term memory subsystems: a feature-based memory subsystem retaining visual experiences in the form of independent feature representations, and an object-based subsystem retaining visual experiences in the form of coherent objects (for such a model, see Johnson, 1983). If so, an interesting possibility emerges: The nature of visual long-term memories may not be unitary, but qualitatively different depending on the subsystem used for storing.

One prerequisite for such an assumption is that incoming visual information can be flexibly stored in either feature-based or object-based form. Indeed, from a functional point of view, such differential storage of incoming information was postulated early on by Piaget (1970) in his model of assimilation and accommodation, which has recently been elaborated in models of the so-called predictive brain (e.g., A. Clark, 2013). The basic idea is that functional storage requires incoming information to be stored in relation to an already-stored inner model of the structure of the world that reflects learned perceptual knowledge about objects. As long as the current inner model is appropriate, it can be imposed on incoming visual information so that visual experiences can be resource-efficiently stored as coherent objects based on the currently stored object knowledge (i.e., “assimilation” in Piagetian terms, or “prediction” in predictive-brain terms). However, in situations in which the current inner model does not sufficiently represent the incoming information, the inner model itself has to be refined on the basis of the inconsistent information (i.e., “accommodation” in Piagetian terms, or “prediction-error based refinement of predictions” in predictive-brain terms). In such a case, it would be functional to store visual experiences in the form of feature representations in order to refine current inner object models based on the stored inconsistent feature information. Thus, depending on the appropriateness of the currently stored inner object models, visual experiences may be stored in the form of either independent feature representations or coherent objects.

The aim of the present study was to examine whether the storage format of visual long-term memory representations indeed varies as a function of the appropriateness of currently stored inner models. To empirically test such a hypothesis, two requirements have to be met. First, experimental conditions must be established that vary in the appropriateness of the currently stored inner models, so that there is systematic variance in the way visual experiences are stored. In particular, to control for potentially confounding effects of differences between visual stimuli, it would be optimal to vary only the experienced appropriateness of the inner models and to use identical visual stimuli. One possibility to optimally fulfill this requirement would be to manipulate the affective state of observers. As has been proposed in prominent theories on affect–cognition interactions, affect signals the validity of one’s current inner model of the world, with positive affect validating and negative affect invalidating it (Clore & Huntsinger, 2007), with the consequence that positive affect triggers processes of assimilation, and negative affect, processes of accommodation (Bless & Fiedler, 2006), an assumption that has been supported by behavioral (e.g., Fiedler, Nickel, Asbeck, & Pagel, 2003) and neurophysiological (e.g., Kuhbandner et al., 2009) evidence.

The second requirement that will have to be met to test whether incoming visual information is stored in the form of independent feature representations or coherent objects is to establish a measurement that can reliably identify qualitative differences in the ways visual experiences are stored. One possibility would be to measure memory for the features of visually presented objects and examine whether the features of an object are remembered in a dependent or an independent way (Brady, Konkle, Alvarez, & Oliva, 2013). That is, if presented objects are stored in the form of coherent object representations, feature memory should more likely behave in an all-or-none fashion, because either an object would be remembered, and thus all object-defining features, or an object would not be remembered, and thus none of the features (dependent storage). By contrast, if presented objects are stored in the form of feature representations, memories for individual features should vary independently from each other, because they have not been organized into coherent object representations (independent storage).

However, inferring the existence of two separate visual memory subsystems from behaviorally observed differences in dependency between features alone may be difficult. The reason is that the operating of the two memory subsystems is not measured directly, but only inferred from a model-dependent interpretation of observed memory results. One possibility to examine the operation of the two memory subsystems more directly would be to measure neural activities during initial encoding and examine whether successful remembering of features is driven by early preattentive or later attention-related brain activities (i.e., a subsequent-memory effect; Paller, Kutas, & Mayes, 1987). Because attention is assumed to be a prerequisite for the integration of features into coherent object representations (e.g., Treisman & Gelade, 1980), the former would indicate storage in the form of feature representations, whereas the latter would indicate storage in the form of coherent object representations.

To meet the requirements above in the present study, participants were shown pictures of real-world objects that varied along two feature dimensions (see Fig. 1) while experiencing either positive or negative affect, and we recorded the electroencephalographic (EEG) signals of the participants while they performed the perception task. To prevent strategic effects during encoding, object pictures were shown for only a short duration (200 ms) and we did not mention that memory for the objects would be tested later.

Fig. 1

Experimental procedure. In an incidental study phase (left panel), a series of visual objects was shown (200 ms each, with a blank interstimulus interval of 2,200 ms) with the instruction to decide for each object whether or not to buy it. In a surprise memory test (right panel), participants were asked to select the object they had seen during the study phase. Four response options were shown, originating from the combination of two features (state and color) with two values each. The object pictures shown here are for example only; for the pictures actually presented, see Brady et al. (2008) and Brady et al. (2013)

To examine whether the objects were stored in qualitatively different ways, behaviorally, we determined the degree of dependency between stored features (for details, see the Method section and Footnote 1).

Neurophysiologically, we compared poorer (one remembered feature) and richer (two remembered features) object memories and determined whether the number of remembered features was driven by early, preattentive or later, attention-related brain activities. If object memories were stored in the form of either independent feature representations or coherent objects as a function of affective state, then both the behavioral and EEG measures should systematically vary as a function of induced affect, with negative affect promoting independent storage of object features driven by preattentive brain activities (feature-based memory representations), and positive affect promoting dependent storage of object features driven by attention-related brain activities (object-based memory representations).

Method

Participants

We recruited 40 undergraduates (35 females, five males; M age = 22.3 years, SD = 4.1) who participated for course credit. The sample size was based on a power analysis (G*Power 3.1.7) in order to allow sufficient power (1 − β = .80, α = .05, two-tailed) to detect medium-sized effects (d = 0.5). All participants provided written informed consent, reported normal or corrected-to-normal visual acuity, and passed the Ishihara test for normal color vision. The study was conducted in accordance with the Helsinki Declaration and the University Research Ethics Standards. The data reported here are a subset (with additional EEG data collection) of the data already presented in Spachtholz, Kuhbandner, and Pekrun (2016). All data exclusions, manipulations, and measures in the study are reported.
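As a rough cross-check of the sample-size reasoning, the following sketch uses a normal approximation for a two-tailed one-sample or paired comparison. This is an assumption for illustration only: the text does not state which test family was entered into G*Power, and the function name `approx_sample_size` is hypothetical. An exact t-based computation would typically ask for a few more participants than this approximation.

```python
import math
from statistics import NormalDist


def approx_sample_size(d, alpha=0.05, power=0.80):
    """Normal-approximation sample size for a two-tailed test of effect size d.

    Assumption: a one-sample or paired design; this is not G*Power's
    exact noncentral-t computation.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value, two-tailed
    z_power = NormalDist().inv_cdf(power)          # quantile for the desired power
    return math.ceil(((z_alpha + z_power) / d) ** 2)


n_required = approx_sample_size(0.5)
```

Under these assumptions, the approximation falls below the 40 participants recruited.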

Material

We selected 200 images of real-world objects from published sets of stimuli (Brady et al., 2008, 2013). For each object we created four different images, resulting from the combination of two different states (e.g., open/closed) and two different colors (e.g., yellow/blue). The two state versions were already available from the stimulus sets. To create two color versions, we first selected a random hue value for the first version and then rotated this hue value (which can be represented as an angle on an isoluminant color circle) by 180° for the second version. We selected only objects whose colors were not intrinsically related to the objects (see Fig. 1 for examples).
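The color manipulation can be illustrated with a minimal sketch. It assumes hue is expressed in degrees on a color circle; the HSV conversion from Python's standard library is only a stand-in, because HSV is not isoluminant and the color space actually used is not specified here. The function names are hypothetical.

```python
import colorsys


def opposite_hue(hue_deg):
    """Rotate a hue angle (in degrees) by 180 degrees on the color circle."""
    return (hue_deg + 180.0) % 360.0


def hue_to_rgb(hue_deg, saturation=1.0, value=1.0):
    """Convert a hue angle to RGB via HSV (stand-in, NOT an isoluminant space)."""
    return colorsys.hsv_to_rgb(hue_deg / 360.0, saturation, value)


first_hue = 30.0                        # hypothetical hue for the first version
second_hue = opposite_hue(first_hue)    # hue for the second version
first_rgb = hue_to_rgb(first_hue)
second_rgb = hue_to_rgb(second_hue)
```

Rotating twice returns the original hue, so the two color versions of an object are symmetric counterparts of each other.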

Design and procedure

Participants were tested individually with E-Prime 2.0 (Psychology Software Tools, Inc., Pittsburgh, PA), using a procedure adapted from Brady et al. (2013). The experiment consisted of an incidental study phase and a surprise test phase. During the study phase, we directed participants’ attention to the centrally presented objects by asking them to decide whether to buy each of them. Each trial started with the presentation of an object for 200 ms, followed by a blank screen for 1,700 ms, during which participants made their buying decisions via button presses. The next trial started after a blank screen of 500-ms duration.

The study phase was divided into two blocks of 100 objects each. At the beginning of each block, either positive or negative affect was induced, by asking participants to recall a happy or sad autobiographical event for 3 min while listening to appropriate music (Jefferies, Smilek, Eich, & Enns, 2008). The order of the affect conditions and the assignment of objects to affect conditions were counterbalanced across participants. In the test phase, participants completed a forced choice recognition memory test. Each object was presented in all four possible feature combinations, and participants were asked to select the picture they had seen during the study phase (see Fig. 1). Memory for half of the objects of each affect condition was tested immediately after the study phase. The remaining half were tested in a delayed memory test one day after the study phase (the results from the delayed test were not analyzed because participants’ performance showed floor effects in both the negative condition, M PBoth = .03, SD = .06, and the positive condition, M PBoth = .08, SD = .10; see Footnote 2).

Participants initially completed 20 practice trials of the study task using objects different from those used later in the experiment. The success of the affect induction was retrospectively measured after each affect-induction block using the Affect Grid (Russell, Weiss, & Mendelsohn, 1989), which assesses experienced affect on the dimensions of valence (1 = extremely negative, 9 = extremely positive) and arousal (1 = low arousal, 9 = high arousal).

Data analysis

The degree of dependency between stored features can be calculated as the strength of association between memories for the individual features. For example, complete dependency would imply that if the first feature is remembered successfully, the second feature should also be remembered, and if the first feature is not remembered, the second feature should also not be remembered. This strength of association corresponds to the correlation between memory for the state feature and memory for the color feature, which, for binary variables (remembered vs. not remembered), can be calculated using the phi coefficient. Performing this calculation requires the probabilities of remembering both features (P Both), a single feature, that is, either state (P Single_State) or color (P Single_Color), and none of the features (P None) of the objects. P Both, P Single_State, P Single_Color, and P None are directly related to the observed proportions of correctly reporting both features, only one of the features (either state or color), or none of the features. However, to estimate the respective probabilities, the effect of guessing must also be considered. Observers could report neither of the features correctly only when they remembered none of the features and did not guess any feature by chance. If observers reported only one feature (either state or color) correctly, there would be two possibilities: Either they remembered only one feature and did not guess the other feature by chance, or they remembered neither of the features and guessed exactly one by chance. If observers reported both features correctly, there would be three possibilities: They remembered both features, they remembered only one feature (state or color) and guessed the other by chance, or they remembered none of the features and guessed both features by chance.
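The phi coefficient described above can be sketched as follows. The sketch treats the four probabilities as the cells of a 2 × 2 table (state remembered or not, by color remembered or not) and applies the standard phi formula; the function and argument names are hypothetical.

```python
import math


def phi_coefficient(p_both, p_single_state, p_single_color, p_none):
    """Phi coefficient between memory for state and memory for color.

    Cells of the 2 x 2 table:
      state yes / color yes -> p_both
      state yes / color no  -> p_single_state
      state no  / color yes -> p_single_color
      state no  / color no  -> p_none
    """
    state_yes = p_both + p_single_state
    state_no = p_single_color + p_none
    color_yes = p_both + p_single_color
    color_no = p_single_state + p_none
    denom = math.sqrt(state_yes * state_no * color_yes * color_no)
    if denom == 0:
        return float("nan")  # undefined when a marginal probability is zero
    return (p_both * p_none - p_single_state * p_single_color) / denom


fully_dependent = phi_coefficient(0.5, 0.0, 0.0, 0.5)      # all-or-none storage
fully_independent = phi_coefficient(0.25, 0.25, 0.25, 0.25)  # independent storage
```

All-or-none storage yields a coefficient of 1, whereas features stored independently yield 0.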

To estimate P Both, P Single_State, P Single_Color, and P None, we formulated a model representing these relations (see Table 1). The best-fitting parameters were determined for each participant and condition using maximum likelihood estimation (Myung, 2003), in which the parameters were restricted to a range of [0, 1].

Table 1 Formulas for predicting the observed proportions of the four possible response events from the probabilities of remembering both features (P Both), only one of the features (either state P Single_State or color P Single_Color), or none of the features (P None)
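The relations between memory probabilities and observed response proportions described in the text can be sketched as follows. Two assumptions are made for illustration: each binary feature is guessed correctly with probability .5 (there are two values per feature), and a coarse grid search stands in for the maximum likelihood routine, whose actual optimizer is not specified. All function names are hypothetical.

```python
import math

GUESS = 0.5  # assumption: chance of guessing one binary feature correctly


def predicted_proportions(p_both, p_state, p_color, p_none, g=GUESS):
    """Predicted proportions of (both, state only, color only, neither) correct."""
    both = p_both + g * (p_state + p_color) + g * g * p_none
    state_only = (1 - g) * p_state + g * (1 - g) * p_none
    color_only = (1 - g) * p_color + g * (1 - g) * p_none
    neither = (1 - g) ** 2 * p_none
    return both, state_only, color_only, neither


def log_likelihood(params, counts):
    """Multinomial log-likelihood (up to a constant) of observed response counts."""
    total = 0.0
    for pred, count in zip(predicted_proportions(*params), counts):
        if count == 0:
            continue
        if pred <= 0.0:
            return -math.inf  # an observed event the model deems impossible
        total += count * math.log(pred)
    return total


def fit_grid(counts, steps=20):
    """Grid search over probabilities in [0, 1] that sum to 1."""
    best, best_ll = None, -math.inf
    for i in range(steps + 1):
        for j in range(steps + 1 - i):
            for k in range(steps + 1 - i - j):
                m = steps - i - j - k
                params = (i / steps, j / steps, k / steps, m / steps)
                ll = log_likelihood(params, counts)
                if ll > best_ll:
                    best, best_ll = params, ll
    return best


# Hypothetical counts for one participant and condition (100 tested objects):
# both correct, state only, color only, neither.
estimate = fit_grid((55, 20, 15, 10))
```

Under the .5 guessing assumption, the proportion of "neither correct" responses cannot exceed .25 on average, which corresponds to the case in which nothing is remembered.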

EEG recording and analysis

Electrocortical activity was recorded from 30 active electrodes (Brain Products, Gilching, Germany), which were positioned according to the extended 10–20 system and originally referenced to an electrode at Cz. The signals were digitized with a sampling rate of 500 Hz (BrainAmp Amplifiers, Brain Products, Gilching, Germany), and the impedances of all electrodes were kept below 20 kΩ. Recording was done in a dimly lit, sound-attenuated, and electrically shielded chamber.

Offline, the continuous data of the study phase were segmented into epochs of −600 to 1,800 ms, time-locked to stimulus onset, and epochs containing electrode or movement artifacts were removed. The data were then subjected to an infomax independent components analysis, and artifactual components were identified by visual inspection of the component topographies and power spectra. The main sources of artifacts were eye blinks, eye movements, and muscle activity. Components identified as artifactual were removed, and the remaining components were back-projected into EEG signal space. Epochs were again inspected and were rejected if they contained residual artifacts. On average, 93 trials (range 78–100 trials) per participant remained for the analysis (negative: M Both = 24.4, M Single = 17.8; positive: M Both = 24.5, M Single = 16.6). The relatively small number of trials per condition was compensated for by the large sample size of the study. The analysis was performed using FieldTrip (Oostenveld, Fries, Maris, & Schoffelen, 2011) and custom MATLAB code. For the event-related potential (ERP) analysis, epochs were band-pass-filtered (0.05 to 40 Hz), resliced into epochs from −150 to 750 ms relative to stimulus onset, and re-referenced to an average reference. A baseline correction was applied using the entire prestimulus interval.
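The final two ERP preprocessing steps (average reference and prestimulus baseline correction) can be illustrated with a minimal plain-Python sketch. It is not the FieldTrip/MATLAB pipeline used in the study: the function names are hypothetical, the time axis is assumed to exclude the 750 ms end point, and recovery of the original Cz reference channel is ignored.

```python
SAMPLING_RATE_HZ = 500
STEP_MS = 1000 // SAMPLING_RATE_HZ  # 2 ms between samples at 500 Hz

# Assumed time axis of an ERP epoch: -150 ms up to (but excluding) 750 ms.
times_ms = list(range(-150, 750, STEP_MS))


def average_reference(sample):
    """Re-reference one time sample (one voltage per electrode) to the average."""
    mean = sum(sample) / len(sample)
    return [v - mean for v in sample]


def baseline_correct(epoch, times):
    """Subtract the mean of the entire prestimulus interval (t < 0) from one channel."""
    pre = [v for v, t in zip(epoch, times) if t < 0]
    baseline = sum(pre) / len(pre)
    return [v - baseline for v in epoch]


# Tiny hypothetical example: three electrodes at one sample, one 4-sample epoch.
rereferenced = average_reference([1.0, 2.0, 3.0])
corrected = baseline_correct([2.0, 2.0, 5.0, 6.0], [-4, -2, 0, 2])
```

At a 500-Hz sampling rate, each sample point covers 2 ms, which is the conversion used when time clusters are expressed in sample points.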

For the statistical analysis, in a first step we identified subsequent-memory effects (i.e., time clusters in which activity was related to the number of stored object features) for each affect condition separately. To this end, we contrasted the activity at encoding for trials in which both features were later recalled correctly (Bothobs) versus trials in which a single feature was later recalled correctly (Singleobs). For this purpose, a two-stage randomization procedure was used for each sample point after stimulus onset (Blair & Karniski, 1993; Karniski, Blair, & Snider, 1994). At the first stage, paired t tests were performed for each electrode and the resulting t values were recorded. Then the sum of the squared t values, t sum2, was calculated over all electrodes, as a measure of both the strength and the spatial extent of the differences between conditions. Then, to correct these results for multiple comparisons across electrodes, 10,000 permutation runs were performed in which conditions were randomly swapped within participants. In each run, paired t tests were performed for each electrode and t sum2 was recorded. This created a distribution of values of t sum2 that would be expected under the null hypothesis of no difference between conditions. From this reference distribution, the corrected p value (p corr) for a given observed t sum2 from the first stage of the analysis could be calculated as the proportion of permutation runs yielding an equal or higher value of t sum2. To also account for multiple comparisons across time, time clusters with significant differences between the conditions were only considered when they extended over six or more consecutive sample points (i.e., 12 ms or longer; for a similar procedure, see Volberg & Greenlee, 2014).
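The two-stage randomization procedure for a single sample point, together with the duration criterion, can be sketched as follows. This is an illustrative plain-Python sketch, not the custom MATLAB code used in the study. Swapping conditions within a participant is implemented as flipping the sign of that participant's condition differences, which gives the same paired t values. The function names and the .05 threshold are assumptions.

```python
import math
import random


def paired_t(diffs):
    """Paired t value from within-participant differences (assumes nonzero variance)."""
    n = len(diffs)
    mean = sum(diffs) / n
    var = sum((d - mean) ** 2 for d in diffs) / (n - 1)
    return mean / math.sqrt(var / n)


def t_sum_squared(diff_matrix):
    """Sum of squared t values over electrodes; diff_matrix[p][e] is the
    condition difference for participant p at electrode e."""
    n_electrodes = len(diff_matrix[0])
    return sum(
        paired_t([row[e] for row in diff_matrix]) ** 2 for e in range(n_electrodes)
    )


def permutation_p(diff_matrix, n_runs=10000, seed=0):
    """Proportion of permutation runs with a statistic >= the observed one."""
    rng = random.Random(seed)
    observed = t_sum_squared(diff_matrix)
    hits = 0
    for _ in range(n_runs):
        flipped = []
        for row in diff_matrix:
            sign = rng.choice((1.0, -1.0))  # swap conditions for this participant
            flipped.append([sign * v for v in row])
        if t_sum_squared(flipped) >= observed:
            hits += 1
    return hits / n_runs


def long_runs(p_values, alpha=0.05, min_len=6):
    """(start, end) indices of runs of >= min_len consecutive points with p < alpha."""
    runs, start = [], None
    for i, p in enumerate(list(p_values) + [math.inf]):  # sentinel closes a final run
        if p < alpha and start is None:
            start = i
        elif p >= alpha and start is not None:
            if i - start >= min_len:
                runs.append((start, i - 1))
            start = None
    return runs
```

In use, `permutation_p` would be called once per poststimulus sample point, and `long_runs` applied to the resulting series of corrected p values; with 2-ms samples, `min_len=6` corresponds to 12 ms.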

In a second step, we compared the subsequent-memory effects between affect conditions. To this end, we averaged activity over the time windows of the subsequent-memory effects detected in the first step of the analysis. Then, first, we examined whether these effects were evident in each affect condition separately, and second, we determined whether the effects differed between affect conditions by using additional permutation tests.

Results

Two participants were excluded because their proportions of reporting neither of the two features correctly were greater than .25, indicating that they did not have any memory for the features.

Affect induction

As compared to the positive condition, participants’ ratings in the negative condition were lower on both the valence dimension (M Neg = 3.0, SD = 1.1; M Pos = 7.3, SD = 1.0), t(37) = –15.9, p < .001, 95% CI [3.77, 4.87], d z = 2.57, and the arousal dimension (M Neg = 3.6, SD = 1.6; M Pos = 5.8, SD = 1.9), t(37) = –6.1, p < .001, 95% CI [1.47, 2.95], d z = 0.99, indicating that the affect induction was successful.
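The effect sizes reported alongside the paired t tests are consistent with the standard relation d z = |t| / √n. The following sketch assumes that this relation was used, with n = 38 participants after exclusions, as implied by the 37 degrees of freedom; the function name is hypothetical.

```python
import math


def cohens_dz(t_value, n):
    """Effect size for a paired t test: d_z = |t| / sqrt(n)."""
    return abs(t_value) / math.sqrt(n)


dz_arousal = cohens_dz(-6.1, 38)  # arousal comparison, t(37) = -6.1
```

Applied to the arousal comparison, this relation gives a value that rounds to the reported 0.99. Small discrepancies can arise elsewhere because the t values are themselves rounded.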

Memory performance

Overall, the model for estimating P Both, P Single_State, P Single_Color, and P None fitted the data very well (R 2 Pos = .97, R 2 Neg = .96; positive affect condition: M PBoth = .28, SD = .17; M PSingle_state = .20, SD = .13; M PSingle_color = .09, SD = .11; M PNone = .43, SD = .22; negative affect condition: M PBoth = .24, SD = .19; M PSingle_state = .27, SD = .17; M PSingle_color = .15, SD = .12; M PNone = .35, SD = .21; see Fig. 2A).

Fig. 2

Memory performance. (a) Probabilities of remembering either both features (state and color), only a single feature (either only state or only color), or none of the features, shown as a function of affective state. (b) Mean numbers of remembered features across objects and the dependency between the state and color features are shown as a function of affective state. The dashed lines represent predictions for the cases in which the two object features are stored completely dependently (orange line) or completely independently (blue line). Error bars represent standard errors

The mean numbers of remembered features across all objects did not differ between the positive (M = .85, SD = .35) and negative (M = .89, SD = .14) affect conditions, t(37) = 0.95, p = .347, 95% CI [–.05, .14], d z = 0.15, indicating that available memory resources did not differ between affect conditions (see Fig. 2B). The dependency (i.e., the phi coefficient) between the two object features was higher in the positive affect condition (M = .38, SD = .38) than in the negative affect condition (M = .11, SD = .41), t(37) = 3.26, p = .002, 95% CI [.10, .44], d z = 0.53 (see Fig. 2B).
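The mean number of remembered features can be related to the estimated probabilities with a short sketch. It assumes that the measure is the expected number of features per object, that is, 2 × P Both + P Single_State + P Single_Color; this definition is an assumption for illustration, and the function name is hypothetical.

```python
def mean_remembered_features(p_both, p_single_state, p_single_color):
    """Expected number of remembered features per object (0, 1, or 2)."""
    return 2 * p_both + p_single_state + p_single_color


# Group-mean probabilities reported for the two affect conditions.
positive = mean_remembered_features(0.28, 0.20, 0.09)
negative = mean_remembered_features(0.24, 0.27, 0.15)
```

With the reported group means, this gives .85 for the positive condition, matching the reported value, and .90 for the negative condition, which is within rounding of the reported .89.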

EEG results

The permutation test revealed four subsequent-memory effects (two in an early and two in a late time range). Figure 3A shows the results for the early time range (0 to 400 ms), and Fig. 4A shows the results for the late time range (400 to 750 ms). These figures depict the p values after correction for multiple comparisons (p corr, as described in the Method section) for the comparison between amplitudes in trials in which both features were reported correctly (Bothobs trials) and amplitudes in trials in which a single feature was reported correctly (Singleobs trials).

Fig. 3

EEG results: Subsequent-memory effects—early time range. (a) Results of the first permutation test for ERP differences at encoding between trials in which both features versus only a single feature were subsequently reported correctly, for the negative (upper panel) and positive (lower panel) affect conditions. The bars mark significant subsequent-memory effects. (b) Scalp distributions of t values and results of the second permutation test for the difference between activities (both vs. single trials), averaged over each of the time clusters (upper panels: 86–100 ms; lower panels: 130–162 ms) identified in panel A, as a function of affect. (c) ERP waveforms as a function of the number of correctly remembered features (both vs. single trials) and affect condition (upper panels: negative affect; lower panels: positive affect) at the electrodes marked by asterisks on the head plots in this column. Light gray rectangles mark the time clusters identified in panel A

Fig. 4

EEG results: Subsequent-memory effects—late time range. (a) Results of the first permutation test for ERP differences at encoding between trials in which both features versus only a single feature were subsequently reported correctly, for the negative (upper panel) and positive (lower panel) conditions. The bars mark significant subsequent-memory effects, and the dashed lines represent the mean reaction times for each affect condition. (b) Scalp distributions of t values and results of the second permutation test for the difference between activities (both vs. single trials), averaged over each of the time clusters (upper panels: 436–502 ms; lower panels: 616–634 ms) identified in panel A, as a function of affect. (c) ERP waveforms as a function of the number of correctly remembered features (both vs. single trials) and affect condition (upper panels: negative affect; lower panels: positive affect) at the electrodes marked by asterisks on the head plots in this column. Light gray rectangles mark the time clusters identified in panel A

Early ERP effects

For the negative affect condition, one subsequent-memory effect was identified from 86 to 100 ms. The amplitudes in Bothobs trials were smaller than those in Singleobs trials over a broad range of central and parietal electrodes (see Fig. 3B, upper panels), and the time course of the amplitudes was characterized by a negative-going deflection (see Fig. 3C, upper panels). Both the topographical distribution and time course of the subsequent-memory effect are well in line with a difference in the amplitude of the visual C1 component, which is typically characterized by a widespread centro-parietal negativity peaking around 70–100 ms (see, e.g., V. P. Clark, Fan, & Hillyard, 1994) and is generally considered to be preattentive (V. P. Clark & Hillyard, 1996). A permutation test for the activity averaged over the time interval of the effect (86–100 ms) showed that the subsequent-memory effect was significant in the negative affect condition (p corr = .010) but not in the positive affect condition (p corr = .913) (see Fig. 3B, upper panels); the interaction was significant (p corr = .023). Taken together, this indicates that the storage of object features is related to the amplitude of the preattentive C1 in negative but not in positive affective states.

For the positive affect condition, in contrast, one subsequent-memory effect was identified from 130 to 162 ms. The amplitudes in Bothobs trials were higher than those in Singleobs trials over a range of posterior and occipital electrodes (see Fig. 3B, lower panels), and the overall time course of the amplitudes was characterized by a positive-going deflection (see Fig. 3C, lower panels). Both the topographical distribution and time course of the subsequent-memory effect are well in line with a difference in the amplitude of the visual P1 component, which is typically characterized by an occipital positivity peaking around 100–150 ms (see, e.g., Di Russo, Martínez, Sereno, Pitzalis, & Hillyard, 2002) and is generally considered to be attention-related (see, e.g., Mangun & Hillyard, 1991). A permutation test for the activity averaged over the time interval of the effect (130–162 ms) showed that the subsequent-memory effect was significant in the positive affect condition (p corr = .003) but not in the negative affect condition (p corr = .500) (see Fig. 3B, lower panels); the interaction was significant (p corr = .045). Taken together, this indicates that the storage of object features is related to the attention-related P1 amplitude in positive but not in negative affective states.

Late ERP effects

For the negative affect condition, we identified two subsequent-memory effects (436–502 and 616–634 ms); see Fig. 4A, top panel. For both effects, the amplitudes in Bothobs trials were smaller than those in Singleobs trials over a broad range of parietal and occipital electrodes (see Fig. 4B). No subsequent-memory effects were observed in the positive affect condition (Fig. 4A, bottom panel). A permutation test for the activity averaged over each of the time intervals showed that the subsequent-memory effects were significant in the negative affect condition (p corrs = .006 and .030 for the 436–502 ms and 616–634 ms intervals, respectively) but not in the positive affect condition (p corrs = .647 and .988, respectively); both interactions were significant (p corrs = .025 and .050, respectively).

Encoding effects

To examine whether the differences between affect conditions regarding subsequent-memory effects were related to differences in encoding processes, we compared the average activities over Bothobs and Singleobs trials in each of the time clusters of the four subsequent-memory effects between affect conditions. We did not observe any significant differences, all p corrs > .356 (see Fig. 5), indicating that the encoding processes in the four time clusters did not vary as a function of affect.

Fig. 5

EEG results: Encoding effects, showing scalp distributions of encoding-related activities (averaged across Bothobs and Singleobs trials) for the negative (upper panels) and positive (lower panels) affect conditions, averaged over each of the four time clusters exhibiting significant subsequent-memory effects

Discussion

The present findings demonstrate that visual long-term memory is not a unitary phenomenon but consists of two qualitatively different memory subsystems that can be used flexibly for storing incoming visual information. When participants experienced negative affect during initial encoding, visual objects were more likely to be stored in the form of independent feature representations mediated by preattentive brain activities. By contrast, when participants experienced positive affect during initial encoding, visual objects were more likely to be stored in the form of coherent object representations mediated by attention-related brain activities. Taken together, this indicates that incoming visual information can be stored flexibly either by a feature-based memory subsystem that retains visual experiences in the form of independent feature representations or by an object-based memory subsystem that retains visual experiences in the form of coherent object representations, depending on the requirements of the current situation as signaled by the affective state of the observer.

Such an assumption is further corroborated by the observed affect-dependent effects of stimulus evaluation on subsequent object memory, as measured by postperceptual differences in the ERP signals. To ensure that participants attended to the stimuli, they were asked to make evaluative (buying) decisions about each stimulus, and because the stimuli were visible for only 200 ms, these evaluations had to be based on stored mental representations of the objects. As was indicated by the analyses of the late ERP effects, subsequent feature memory varied as a function of evaluation-related brain activities only in the negative, not in the positive, affect conditions (see Fig. 4). In particular, such a differential subsequent memory effect was found despite the fact that comparable brain activities were observed during the initial evaluation (see Fig. 5). Such a pattern of findings suggests that evaluation processes operated on distinct types of representations, depending on the affective state. In the negative affect condition, individual features were differentially affected by evaluation processes, indicating that independently stored feature representations were operated upon. By contrast, in the positive affect condition, features were comparably affected by evaluation processes, indicating that coherent object representations were operated upon.

The reason for the existence of two memory subsystems may be that two opposing requirements have to be met for adaptive learning (A. Clark, 2013; Piaget, 1970). On the one hand, to keep stability, incoming information has to be processed with respect to a currently stored inner model of the structure of the world that reflects learned regularities in the visual input (assimilation). On the other hand, to allow for adaptation, the currently stored inner model has to be continuously refined on the basis of newly incoming information (accommodation). The object-based memory subsystem seems to serve the function of assimilation; the feature-based memory subsystem seems to serve the function of accommodation. Interestingly, such an assumption suggests that the two memory subsystems may not differ only in storage format, but also in the allocation of available processing resources across the features of an object. If the object-based memory subsystem serves to store incoming information on the basis of currently stored inner models, then available resources should be allocated more broadly across features, in order to minimize matching errors. If the feature-based memory subsystem serves to refine current inner models, then resources should be more strongly focused on individual features, in order to maximize perceptual precision for future model refinement. Indeed, there is preliminary evidence for such an assumption. As has already been reported in a related article with an increased behavioral sample (Spachtholz et al., 2016), available resources are traded between the quantity of encoded features and their individual strength as a function of affect, with positive affect promoting quantity and negative affect promoting strength.

The aim of the present study was to examine the nature of the visual representations stored in long-term memory. Interestingly, several previous studies have examined the nature of the visual representations stored in short-term memory (e.g., Fougnie, Asplund, & Marois, 2010; Luck & Vogel, 1997; Wheeler & Treisman, 2002). Similar to the findings of the present study in the domain of long-term memory, in short-term memory it also seems to be the case that incoming visual information can be stored in either feature-based or object-based form. That is, the features of the environment seem to be initially stored in parallel in independent visual memory systems, and by focusing attention on some of the features, they are bound together and maintained in visual short-term memory as integrated “objects.” However, a number of fundamental differences between visual short-term and long-term memory suggest that the term “object” may refer to qualitatively different types of representations in short-term versus long-term visual memory. From a functional perspective, it is commonly assumed that visual short-term memory serves the function of stabilizing quickly fading sensory information for a few seconds across eye movements and blinks (e.g., Hollingworth, Richard, & Luck, 2008). By contrast, visual long-term memory serves the function of storing object information for future encounters with those objects. In particular, the functional storing of object information in long-term memory requires the storing of incoming information in relation to already-stored perceptual knowledge about the objects. This differential functionality is also reflected in methodological differences between the studies on visual short-term and long-term memory. Whereas visual short-term memory is typically measured by the ability to indicate whether a display consisting of meaningless simple stimuli, such as colored squares or oriented lines, is the same as the one presented about 1 s before (i.e., a change detection task), visual long-term memory is typically measured by the ability to indicate whether a real-world object was part of a personally experienced episode from the past. Thus, given these functional and methodological differences, the relationship between the representations stored in short-term and long-term memory is difficult to infer from the existing research, which may be an interesting avenue to investigate in the future (e.g., Brady, Störmer, & Alvarez, 2016).

Author note

P.S. and C.K. developed the study concept and the design, performed the data analysis, and wrote the manuscript. Both authors approved the final version of the manuscript for submission.

Notes

  1. It is important to note that the dependency between object features cannot be inferred from the probability of remembering both features of an object, because a memory response with two successfully remembered features might reflect either dependent storage of the two features as a coherent object representation or independent storage of the two individual features with sufficient strength.

  2. The discrepancy between the relatively low memory performance in the delayed memory test and the previously reported high capacity of visual memory (e.g., Brady et al., 2008) can be explained by the fact that different task parameters were used in the present study (stimulus presentation time of 200 ms vs. 3 s; incidental vs. intentional encoding; 24-h delayed vs. immediate test). All of these parameter choices are known to decrease memory performance (Andermane & Bowers, 2015; Block, 2009; Brady, Störmer, & Alvarez, 2016), and in fact, studies with similar parameters have shown comparable feature memory performance (e.g., Brady et al., 2013).
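
The point made in Note 1 can be illustrated with a small worked example. The sketch compares the observed probability of remembering both features with the probability expected if the two features were stored independently (the product of the two single-feature probabilities). The function names and all probabilities are hypothetical, and this difference score is only one possible way to express dependency, not necessarily the measure used in the study.

```python
def joint_if_independent(p_a, p_b):
    """Probability of remembering both features if storage is independent."""
    return p_a * p_b

def dependency_index(p_a, p_b, p_both):
    """Observed joint probability minus the joint probability expected under independence."""
    return p_both - joint_if_independent(p_a, p_b)

# Scenario 1 (hypothetical): strong but independent feature memories.
# Each feature is remembered with probability .60, so both are remembered
# with probability .60 * .60 = .36 by chance alone.
independent = dependency_index(p_a=0.60, p_b=0.60, p_both=0.36)

# Scenario 2 (hypothetical): weaker but fully dependent feature memories.
# Each feature is remembered with probability .36, and whenever one feature
# is remembered the other is remembered as well.
dependent = dependency_index(p_a=0.36, p_b=0.36, p_both=0.36)

print(f"independent storage: index = {independent:.4f}")
print(f"dependent storage:   index = {dependent:.4f}")
```

Both scenarios yield the same probability of remembering both features (.36), yet only the second deviates from the independence prediction, which is why the joint probability alone is not diagnostic.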

Author information

Corresponding authors

Correspondence to Philipp Spachtholz or Christof Kuhbandner.

About this article

Cite this article

Spachtholz, P., Kuhbandner, C. Visual long-term memory is not unitary: Flexible storage of visual information as features or objects as a function of affect. Cogn Affect Behav Neurosci 17, 1141–1150 (2017). https://doi.org/10.3758/s13415-017-0538-4

  • DOI: https://doi.org/10.3758/s13415-017-0538-4

Keywords

  • Visual long-term memory
  • Memory representations
  • Subsequent memory effect
  • Emotions
  • EEG