Does context recollection depend on the base-rate of contextual features?

Nieznański, Marek; Obidziński, Michał; Ford, Daria

doi:10.1007/s10339-023-01153-1

Research Article
Open access
Published: 11 September 2023

Volume 25, pages 9–35, (2024)

Cognitive Processing

Abstract

Episodic recollection is defined by the re-experiencing of contextual and target details of a past event. The base-rate dependency hypothesis assumes that the retrieval of one contextual feature from an integrated episodic trace cues the retrieval of another associated feature, and that the more often a particular configuration of features occurs, the more effective this mutual cueing will be. Alternatively, the conditional probability of one feature given another feature may be neglected in memory for contextual features since they are not directly bound to one another. Three conjoint recognition experiments investigated whether memory for context is sensitive to the base-rates of features. Participants studied frequent versus infrequent configurations of features and, during the test, they were asked to recognise one of these features with (vs. without) another feature reinstated. The results showed that the context recollection parameter, representing the re-experience of contextual features in the dual-recollection model, was higher for frequent than infrequent feature configurations only when the binding of feature information was made easier and the differences in the base-rates were extreme, otherwise no difference was found. Similarly, base-rates of features influenced response guessing only in the condition with salient differences in base-rates. The Bayes factor analyses showed that the evidence from two of our experiments favoured the base-rate neglect hypothesis over the base-rate dependency hypothesis; the opposite result was obtained in the third experiment, but only when high base-rate disproportion and facilitated feature binding conditions were used.

Introduction

Dual-process models of memory postulate that recognition memory performance reflects the contribution of two distinct components referred to as recollection and familiarity (Yonelinas 2002). Recollection reflects the conscious reinstatement of details from a learning episode, including both target and contextual information, whereas familiarity reflects a more automatic and general activation of a memory trace. A variation of the dual-process view of memory is the fuzzy-trace theory (e.g., Brainerd and Reyna 1990, 2002, 2004), which assumes that two qualitatively different types of representations, verbatim trace and gist trace, are encoded in parallel during a study experience. Verbatim trace stores perceptual item-specific information about a stimulus, whereas gist trace represents more general meaning-based information. Overall, recollection reflects verbatim trace retrieval, whereas familiarity is based on gist trace processing (e.g., Reyna 2012; cf. Nieznański et al. 2019).

Recently, Brainerd and colleagues (Brainerd et al. 2014a; Brainerd et al. 2015) have impugned the unitary view of recollection and proposed a model that distinguishes between the conscious recollection of contextual information and the vivid reinstatement of target information. In this model, target recollection derives from the retrieval of verbatim traces of old items, whereas context recollection is based—like familiarity—on gist trace processing. Most recently, however, Brainerd et al. (2022a) have acknowledged that contextual details may be stored in a type of memory trace that is separate from verbatim and gist, namely, a contextual trace. They argued that contextual details are typically associated with multiple old items, which makes them distinct from surface and semantic details specific to particular items. This three-dimensional structure was supported by a meta-analysis of conjoint recognition studies, which distinguished a semantic familiarity (gist trace-based) factor, a context recollection (contextual trace-based) factor, and a target recollection (verbatim trace-based) factor.

Our research stems from an assumption that the strength of the contextual trace can reflect the frequency of occurrence of a particular contextual feature among multiple old items. The more items share the same contextual feature, the stronger the contextual trace of this feature should be. We also hypothesize that the probability that a probe containing a particular contextual feature will evoke context recollection of another associated contextual feature is affected by the frequency of these two contextual features co-occurring. For example, context recollection that a cue word printed in a large font size was green should be higher when most of the presented large-font-size words were printed in green. In other words, we predict that context recollection is sensitive to the base-rate of contextual information experienced during study and reflects the frequency of context-context pairings. This base-rate dependency account finds some support in studies on multidimensional source recognition (e.g., Meiser and Bröder 2002) or in studies on ‘pattern completion’ (e.g., Horner et al. 2015; Horner and Burgess 2013). However, there are also some compelling arguments in favour of an alternative view—the base-rate neglect account, which refers to the phenomenon known from the judgment and decision-making literature that people have a strong tendency to favour diagnostic information over the base-rates when judging the probability of an event (e.g., Kahneman and Tversky 1973). The aim of the current study is to estimate the evidence in favour of the base-rate dependency hypothesis versus the base-rate neglect hypothesis in the recollection of correlated contextual features.

Arguments in favour of base-rate dependency in memory

The dependency of context memory on the experienced base-rate of contextual features is consistent with the mutual cuing hypothesis (e.g., Arnold et al. 2019; Boywitt and Meiser 2012; Meiser 2014; Meiser and Bröder 2002), which claims that the successful retrieval of one contextual feature serves as a cue for the other contextual feature. The positive stochastic dependence among concurrent retrieval processes for multiple contextual features observed by Meiser and colleagues suggests that these features are integrated into a coherent episodic trace (but see Starns and Hicks 2005; Vogt and Bröder 2007). Encoding events into integrated traces facilitates the joint retrieval of the configurations of features. Importantly, such a dependence was observed when participants declared that they consciously recollected the contextual feature (the state of “remembering”) but not in the state based on familiarity (“knowing”). This supports our prediction that the context recollection process, which is defined in the dual recollection theory as a state of vivid reinstatement of contextual features (Brainerd et al. 2014a; Brainerd et al. 2015), is sensitive to the frequency of context-context configurations.

Since the mutual cuing hypothesis predicts that the successful retrieval of one contextual feature facilitates the retrieval of the other, such facilitation should be even more likely when one of these features does not need to be retrieved but is provided to the participant. In such a case, the cueing of the second feature is not conditional on the successful retrieval of the first; the provided feature is ready for use as a cue. Therefore, in our experiments, we introduced a manipulation of the reinstatement of one of the features as a condition that should enhance base-rate dependency.

The dependency of the retrieval of one element on the retrieval of another element was also demonstrated for elements that are not subordinates, that is, are not contextual features. In the Horner and Burgess (2013) experiments, participants were required to learn location-person-object triplets. The authors analysed how dependent the retrieval of one element (e.g., the person) is on the retrieval of another element (e.g., the object) when cued by a third element (e.g., the location), and they confirmed an interdependence in the ability to retrieve the different elements comprising the same event. Other studies also found support for the view that event elements are integrated into coherent ‘event engrams’ that enable episodic recollection (Horner et al. 2015). Incidental aspects of an event, such as contextual details, are also retrieved along with other elements of a complete event. The retrieval of all these constituents of an event when presented with a partial cue is named ‘pattern completion’. The holistic recollection of event elements resulting from their associative structure is even regarded as the defining characteristic of episodic memory, and it was observed both for simultaneously and separately encoded event elements (Horner et al. 2015; James et al. 2020; but see Trinkler et al. 2006).

Research on the mutual cuing hypothesis and research on pattern completion converge in their theoretical conclusions but use quite different paradigms. Taking this into account, in Experiment 1 we used a procedure more like that of source memory research (e.g., Meiser and Bröder 2002), while in Experiments 2 and 3 we also used a procedure like that of pattern completion research, with colour-object-location triplets (e.g., Horner and Burgess 2013).

Important support for the base-rate dependency in memory also comes from Anderson and Schooler’s (1991) environmental explanation of such memory phenomena as practice, retention, and spacing effects. They describe the memory system as making statistical inferences and reflecting the structure that exists in the environment. According to their observations, the memory system tries to make available those memories that are most likely to be useful in a given time and environment. Therefore, we can expect that memory will also mirror the frequencies of feature configurations experienced during the study phase of a memory experiment. This should happen whether or not participants consciously notice the frequency structure of features, just as awareness of the fact that an event is repeated is not needed for the practice effect to occur.

Arguments in favour of base-rate neglect in memory

As Johnson et al. (1993) stated in their source monitoring framework, source attributions can be influenced by prior knowledge, schemas, or expectations. The strength of prior associations between features, especially when attentional resources are restricted, may influence item-context binding processes (Nieznański 2013). However, as demonstrated by Bayen et al. (2000), schema-based expectancy seems to influence guessing rather than the ability to remember the source. Source guessing is informed by (a) schema-based bias, which is cross-situational and based on general world knowledge, and (b) probability matching, which is based on situation-specific item-source contingency (e.g., Bell et al. 2020; Spaniol and Bayen 2002). The latter mechanism reflects base-rates experienced at encoding, so that, when source memory is not available, participants guess the source of detected-old items consistently with the proportion of sources associated with the particular type of items (e.g., Bayen and Kuhlmann 2011; Kuhlmann et al. 2012; Wulff et al. 2021). This line of research clearly indicates that specific contingencies of item types and sources influence guessing but not source detection, and this assertion is based on analyses using the two-high threshold multinomial model for source monitoring (Bayen et al. 1996), which enables the separation of the processes of item detection, source discrimination, and response bias. Therefore, the probability-matching account suggests that base-rate dependency appears in metamemory judgments rather than in object-level memory processes.

Base-rate neglect is well-known as one of the many errors and fallacies of human probability judgment, which were initially described in the Kahneman and Tversky research program (e.g., Kahneman and Tversky 1973; Tversky and Kahneman 1983; Tversky and Kohler 1994). In the domain of memory, some analogues of such fallacies were investigated by Brainerd and colleagues. For example, Brainerd et al. (2014b) described conjunction illusions, that is, instances in which participants falsely remember that a target from a single source was presented in multiple sources (see also: Brainerd et al. 2017; Nakamura and Brainerd 2017). In recent reviews, instances when the structure of real-world events is not preserved by our memories were referred to by Brainerd (2021, 2022) as ‘deep distortions’. The study of these phenomena has been inspired by the fuzzy-trace theory’s idea of gist memory, which implies that the retrieval of gist traces supports the acceptance of items belonging to different reality states that are mutually incompatible. For example, a related distractor may be accepted when the question is whether it is a related new item, but also when the question is whether it is a target, because the target and the related distractor share a gist. Deep distortions are a new family of false memories that operate at a higher level of measurement than surface distortions. Compared to traditional false memories, they are theoretically more fundamental and measurable by analysing relations between two or more memories. Emergent relations among these memories of events or sources, usually studied using the conjoint recognition paradigm, are confronted with certain normative principles and are classified as deep distortions when they violate the axioms and rules of logic or classical probability theory (Brainerd 2021, 2022). An interesting recent example of a violation of the laws of logic comes from the Brainerd et al. (2022b) experiments, which showed that old? and new? judgments do not produce equivalent recognition accuracy. Despite logical equivalence, accuracy for judgments that an item is old differs from accuracy for judgments that it is not new, and accuracy for judgments that an item is new differs from accuracy for judgments that it is not old.

Our aim was to analyse relations between memories for frequent versus infrequent configurations of features. It is possible that base-rate neglect in context memory is another example of when the structure of an everyday experience is not preserved by our memories; in this case, our memory does not act on the logic of conditional probability. An attempt to demonstrate base-rate neglect in source memory was made by Lu and Nieznański (2020); however, that study did not apply modelling analyses to separate the contribution of context recollection.

Experiment 1: Context memory for equally versus unequally distributed features in neutral and reinstated test conditions

The general goal of Experiment 1 was to ascertain the presence or absence of an effect of an apparent correlation between contextual features on memory for one of these features. All the presented items differed in two dimensions, colour and size. The memory for the colour dimension was tested, and the distribution of colours by font size was manipulated within-subjects. For small-size items, the colours were equally distributed, whereas for large-size items saliently more items were presented in one colour than another. The main question was whether the base-rates experienced during study influence context memory or merely affect the guessing bias. For evenly distributed features, the influence of the base-rate should result in the absence of differences in context memory performance, while for disproportionately distributed features, the impact of the base-rate should result in differences in context memory performance. Moreover, if context-to-context associations are encoded into an integrated memory trace, reinstating the item size should reactivate colour memory, resulting in better context memory test performance (e.g., Symeonidou and Kuhlmann 2021, but see Hicks and Starns 2016).

In the condition with the reinstated large or small font size at test, compared with the condition with the neutral (medium) font size, applying the (implicit or explicit) knowledge about the correlation between contextual features should be easier. In this condition, participants were directly informed that the font used to present the word at test was the same size as the font used at study; therefore, they could use their knowledge about the base-rates of colours in particular fonts (e.g., that words printed in green were often presented in large font and rarely in small font). Applying the learnt correlation between features is also possible in the neutral condition, depending on the ability to spontaneously mentally reinstate the font size of the presented word (cf. Starns and Hicks 2013). However, since the study font size may be forgotten or falsely attributed, context memory in the neutral condition should be much less affected by the base-rates than in the condition with the reinstated font.

Participants

In this experiment, 78 participants were recruited from among first- and second-year psychology undergraduates. They received extra credits in their courses. One participant was excluded because he reported colour blindness. Participants’ mean age was 20.93 years (SD = 3.46); 18 were men.

Stimuli

As the materials, we used 123 nouns in Polish taken from the dataset prepared by Imbir (2016). According to the ratings available in this dataset, the selected words were all low in arousal, of medium valence and frequency, and of medium or high imaginability. In detail, our materials met the following criteria: all were nouns, 4–6 letters long, with a valence rating (on a scale from 1 to 9) between 3 and 7, an arousal rating lower or equal to 3.6, imaginability higher or equal to 4; and a frequency of appearance in the language from 300 to 1500 (Mandera et al. 2014).

Procedure and design

The participants were examined at individual workstations in the University Lab. The presentation of the stimuli and the response recording were controlled using the E-Prime 2.0 program (Psychology Software Tools, Pittsburgh, PA).

At study, 81 words were presented: two-thirds of them (54) in font Colour 1 and one-third (27) in font Colour 2. Thus, the base-rates were manipulated within subjects. For approximately half of the participants, Colour 1 was green and Colour 2 was blue, and vice versa for the other half. The participants were asked to try to remember words along with their colour and size. They were notified that some colours are more frequent in a particular font size than in another. The words were presented in a random order, at a rate of 4 s, with an interstimulus interval of 250 ms. Among the 81 words, 45 were presented in large font size (96 pts) and 36 in small font size (24 pts); the font type was Arial, bold. Among the 54 words in Colour 1, two-thirds (36) were presented in large and one-third (18) in small font size. Among the 27 words in Colour 2, one-third (9) were presented in large and two-thirds (18) in small font size. Overall, there were more Colour 1 than Colour 2 words, Colour 1 words were more often in large font than small font, and the opposite was true for Colour 2 words. Figure 1 illustrates the proportions of words in each colour and font. Formally, the probability of a particular colour given a particular font size can be computed using the conditional probability formula, as follows:

$$P\left({C}_{1}|L\right)=\frac{P(L\cap {C}_{1})}{P(L)}= \frac{36/81}{45/81}=0.8,$$

$$P\left({C}_{1}|S\right)=\frac{P(S\cap {C}_{1})}{P(S)}= \frac{18/81}{36/81}=0.5,$$

$$P\left({C}_{2}|L\right)=\frac{P(L\cap {C}_{2})}{P(L)}= \frac{9/81}{45/81}=0.2,$$

$$P\left({C}_{2}|S\right)=\frac{P(S\cap {C}_{2})}{P(S)}= \frac{18/81}{36/81}=0.5.$$

where C₁ = Colour 1, C₂ = Colour 2, L = large font, and S = small font. Therefore, when a particular test probe is recognized as being presented at study in the large font, it is also expected that it was presented in Colour 1 rather than Colour 2 (the a priori hypothesis that it was Colour 1 is 4 times more probable than that it was Colour 2). However, when a word is presented in small font at test, it is equally probable that it was in Colour 1 or 2 at study.
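
The conditional probabilities above follow directly from the cell counts of the study list. As a quick check, the following Python sketch recomputes them from the counts reported in the procedure. The dictionary layout and the function name `p_colour_given_size` are illustrative assumptions and are not part of the original materials.

```python
from fractions import Fraction

# Study-list cell counts reported in the text: (colour, font size) -> number of words.
# The labels "C1", "C2", "L", "S" mirror the symbols used in the formulas.
COUNTS = {
    ("C1", "L"): 36,
    ("C1", "S"): 18,
    ("C2", "L"): 9,
    ("C2", "S"): 18,
}


def p_colour_given_size(colour, size, counts=COUNTS):
    """Return P(colour | size) = P(size and colour) / P(size) as an exact fraction."""
    total = sum(counts.values())
    joint = Fraction(counts[(colour, size)], total)
    marginal = Fraction(sum(n for (_, s), n in counts.items() if s == size), total)
    return joint / marginal


if __name__ == "__main__":
    for size in ("L", "S"):
        for colour in ("C1", "C2"):
            p = p_colour_given_size(colour, size)
            print(f"P({colour}|{size}) = {float(p):.1f}")
```

Dividing `p_colour_given_size("C1", "L")` by `p_colour_given_size("C2", "L")` gives the 4-to-1 ratio mentioned in the text for large-font probes.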

At test, the studied words were presented intermixed with 42 distractors. Reinstatement of font size at test was manipulated between subjects. The words were presented in the same font size as at study, large (96 pts) or small (24 pts), for 37 participants, and in a new medium (48 pts) font size for 40 participants. In the reinstated condition, half of the distractors were presented in large font and the other half in small font. At test, the participants were informed that their task was to recognize whether the word had been presented and to answer “yes” or “no” to the question shown under the test word on a particular slide. There were three types of probe questions, counterbalanced across participants and presented equally often with each type of test item: (a) Was this word presented in Colour 1?; (b) Was this word presented in Colour 2?; and (c) Was this word presented in either Colour 1 or 2? The slides were presented in random order. The test trials were participant-paced, with the next trial appearing immediately after a response.

Data analysis

Bayesian analyses were conducted in JASP (JASP Team 2019; jasp-stats.org, see: van Doorn et al. 2020). We used the Bayes factor BF₁₀ to quantify the predictive performance of an alternative hypothesis relative to a null hypothesis. A BF₁₀ between 1 and 3 is considered weak evidence, between 3 and 10 moderate evidence, and above 10 strong evidence in favour of the alternative hypothesis. Symmetrically, a BF₁₀ lower than 1 supports the null hypothesis: a value between 0.333 and 0.1 indicates moderate evidence, and a value below 0.1 strong evidence for the null hypothesis. When the dependent variables were normally distributed and the variances were homogeneous across the groups, we performed Bayesian t tests; otherwise, we reported the Mann–Whitney U test or the Wilcoxon signed-rank test. As priors we used the default options in JASP, that is, the Cauchy distribution with r set to $1/\sqrt{2}$.
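
The evidence categories described above can be written as a small lookup. The Python sketch below is only an illustration of those thresholds and is not part of the JASP workflow. The function name `interpret_bf10`, the handling of values that fall exactly on a threshold, and the label "weak" for values between 0.333 and 1 (which the text implies by symmetry but does not state) are assumptions.

```python
def interpret_bf10(bf10):
    """Label a Bayes factor BF10 using the thresholds stated in the text.

    Assumptions (not from the source): boundary values (exactly 3 or 10) fall
    into the lower category, and 1/3 < BF10 < 1 is labelled weak evidence for H0.
    """
    if bf10 <= 0:
        raise ValueError("A Bayes factor must be positive.")
    # BF10 > 1 favours the alternative (H1); BF10 < 1 favours the null (H0).
    favoured = "H1" if bf10 > 1 else "H0"
    # Work with the ratio in favour of whichever hypothesis is supported.
    strength = bf10 if bf10 > 1 else 1 / bf10
    if strength == 1:
        return "no evidence either way"
    if strength > 10:
        label = "strong"
    elif strength > 3:
        label = "moderate"
    else:
        label = "weak"
    return f"{label} evidence for {favoured}"


if __name__ == "__main__":
    for value in (15, 5, 2, 0.5, 0.2, 0.05):
        print(value, "->", interpret_bf10(value))
```

Taking the reciprocal for values below 1 keeps the two directions of evidence symmetric, so one set of thresholds (3 and 10) covers both hypotheses.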

Multinomial modelling analyses were based on hierarchical Bayesian modelling using the latent-trait approach (Klauer 2010). This approach uses the multivariate normal distribution of the transformed individual parameters as the prior distribution on a group level. Markov chain Monte Carlo sampling methods are employed to obtain the parameter posterior estimates (for more information about hierarchical multinomial processing tree models and examples of their application see: Arnold et al. 2019; Ernst et al. 2019; Heck et al. 2018; Klauer 2010). All hierarchical multinomial modelling analyses were conducted using the R package TreeBUGS (Heck et al. 2018).

Multinomial model for conjoint recognition paradigm

In the present research, the multinomial dual-recollection model (Brainerd et al. 2015) was used as a measurement model. The original model was developed for a context memory experiment with targets presented on List 1 and List 2 at study, and with three types of probe questions presented during the test phase: “Was it on List 1?”, “Was it on List 2?” or “Was it on either List 1 or List 2?”. The model defines the following retrieval processes: (a) The RT₁ (RT₂) parameter (target recollection), which is the probability that a List 1 (List 2) target cue provokes the conscious reinstatement of its presentation during the study; (b) The RC₁ (RC₂) parameter (context recollection), which is the probability that a List 1 (List 2) target cue provokes the conscious reinstatement of the contextual details of List 1 (List 2) presentation; and (c) The F₁ (F₂) parameter (familiarity), which represents the probability that a List 1 (List 2) target cue provokes a sufficiently high familiarity to make the target be perceived as old. Moreover, two response bias parameters are also defined: one (b) for accepting non-retrieved items (targets or distractors) for List 1? probe questions or List 2? probe questions, and another (b₁₂) for accepting non-retrieved items for List 1 or 2? probe questions (see Brainerd et al. 2015, Table 2). In comparison with the original model, the present research replaced the List 1 targets and the List 2 targets with the targets presented in Colour 1 or Colour 2.

A part of the multinomial model applied in the current research is presented in Fig. 2. One tree can be depicted for each probe question and item type. In Fig. 2, only the model of processing of targets presented in Colour 1 and large font is shown as an example. On the left are the item types used at the test with the specified question probes (Colour 1?, Colour 2?, and Colour 1 or Colour 2?). On the right are the participants’ responses (accept or reject), which are connected with the question probes and the item types by the branches of the processing trees representing the latent cognitive processes postulated by the dual recollection theory. As can be seen in Fig. 2, when a target context is congruent with the question probe (C1?|Target_C1), the target cues are accepted if the context recollection (RC₁) or the target recollection (RT₁) is successful and, if neither is successful, the response bias (b₁) can produce a “yes” response. When a target context is incongruent with the question probe (C2?|Target_C1), the target cues are rejected if the context recollection is successful but are accepted if the context recollection fails (1 − RC₁) and the target recollection (RT₁) is successful, and a “yes” response may also be produced by the response bias (b₂). On the probes with the Colour 1 or Colour 2? question (C1or2?|Target_C1), the participants respond “yes” if the context recollection, target recollection or familiarity (F₁) is successful; if all of these retrieval processes fail, the response bias (b₁₂) can produce acceptance. For distractors, only the response bias (b₁ for C1?, b₂ for C2?, and b₁₂ for C1or2?) can produce acceptance (cf. Brainerd et al. 2015). Separate models of this type were created for large and small font-size items.
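
To make the branch structure concrete, the Python sketch below turns the verbal description above into acceptance probabilities for a Colour 1 target and for distractors. It follows only what this paragraph states; the authoritative trees are those in Fig. 2 and in Brainerd et al. (2015), which are not reproduced here. The function names, argument names, and the parameter values in the demo are hypothetical.

```python
def p_yes_target_c1(probe, rc, rt, f, b1, b2, b12):
    """P("yes") for a Colour 1 target, following the verbal description of the trees.

    rc, rt, f: context recollection, target recollection, and familiarity
    probabilities for Colour 1 targets; b1, b2, b12: response bias parameters.
    """
    no_rc, no_rt = 1 - rc, 1 - rt
    if probe == "C1?":
        # Congruent probe: accept via RC or RT; otherwise bias b1 may yield "yes".
        return rc + no_rc * rt + no_rc * no_rt * b1
    if probe == "C2?":
        # Incongruent probe: successful context recollection leads to rejection;
        # accept if RC fails and RT succeeds; otherwise bias b2 may yield "yes".
        return no_rc * rt + no_rc * no_rt * b2
    if probe == "C1or2?":
        # Accept via RC, RT, or familiarity; if all fail, bias b12 may yield "yes".
        return (rc + no_rc * rt + no_rc * no_rt * f
                + no_rc * no_rt * (1 - f) * b12)
    raise ValueError(f"Unknown probe: {probe!r}")


def p_yes_distractor(probe, b1, b2, b12):
    """P("yes") for a distractor: only the response bias can produce acceptance."""
    return {"C1?": b1, "C2?": b2, "C1or2?": b12}[probe]


if __name__ == "__main__":
    # Hypothetical parameter values, chosen only to illustrate the computation.
    params = dict(rc=0.2, rt=0.3, f=0.4, b1=0.1, b2=0.1, b12=0.5)
    for probe in ("C1?", "C2?", "C1or2?"):
        print(probe, round(p_yes_target_c1(probe, **params), 3))
```

Under this reading, when b₁ equals b₂ the difference between acceptance on the congruent and incongruent probes equals RC₁, which shows how the design isolates context recollection from target recollection and bias.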

Results

Results based on descriptive measures

Descriptive statistics concerning the mean acceptance rates and the mean corrected acceptance rates (CAR) (i.e., the probability of a “yes” response for targets minus the probability of a “yes” response for distractors) for particular colour/font configurations and for each type of probe question are presented in Tables 14 and 15 in the “Appendix 1”. Figure 3 presents only the means and 95% credible intervals of accurate CARs, and Fig. 4 presents false alarms for distractors. The grand means of accurate CARs, compared between large font size items (M = 0.278, SD = 0.358) and small font size items (M = 0.256, SD = 0.313), pooling over configuration types and test conditions, were not significantly different, t(153) = 0.82. However, when the grand means of accurate CARs were compared between Colour 1 items (M = 0.226, SD = 0.312) and Colour 2 items (M = 0.309, SD = 0.354), a significant difference was found, t(153) = 2.81, Cohen’s d = 0.23, p = 0.006, indicating that participants attributed the less frequently presented colour more accurately.
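
As a minimal illustration of the CAR definition given above (the probability of a “yes” response for targets minus the probability of a “yes” response for distractors), the following Python sketch computes the measure from response counts. The function name and the counts in the demo are hypothetical and do not come from the experiment.

```python
def corrected_acceptance_rate(target_yes, n_targets, distractor_yes, n_distractors):
    """Return the CAR: P("yes" | target) minus P("yes" | distractor)."""
    if n_targets <= 0 or n_distractors <= 0:
        raise ValueError("Both item counts must be positive.")
    acceptance_targets = target_yes / n_targets
    acceptance_distractors = distractor_yes / n_distractors
    return acceptance_targets - acceptance_distractors


if __name__ == "__main__":
    # Hypothetical counts for one participant and one probe type.
    print(corrected_acceptance_rate(target_yes=8, n_targets=10,
                                    distractor_yes=3, n_distractors=15))
```

A CAR of zero means targets were accepted no more often than distractors, so positive values reflect memory-based acceptance over and above response bias.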

Table 1 Group-level parameter estimates (standard deviations) and 95% Bayesian Credible Intervals of the dual-recollection multinomial model obtained in Experiment 1
