Introduction

Each day we are inundated with information and face the task of determining what is important and unimportant. The information deemed important is, hopefully, encoded sufficiently for later use, whereas the unimportant information is easily discarded. Indeed, many researchers have posited that a healthy memory system is predicated on the ability to diminish, expunge, forget, or somehow render inaccessible information that is no longer relevant (e.g., Anderson & Schooler, 2000; Bjork, 2011) and to focus on the most important information (Castel, 2008). Although a great deal of work has considered how the importance of information affects encoding and retrieval, very little research has examined how retrieval may influence the importance we assign to information. Castel, Rhodes, McCabe, Soderstrom, and Loaiza (2012) reported a notable exception, observing that forgotten information was perceived as less important than remembered information. Accordingly, in this paper, we describe and test two accounts of such a forgetting bias and attempt to elucidate the mechanisms that drive retrospective evaluations of the value of remembered and forgotten information.

Value-directed remembering and the forgetting bias

The relative importance of information can influence what is later remembered. Castel and colleagues have reported abundant evidence for the selective nature of encoding and remembering, such that individuals are more likely to remember highly valuable information than less valuable information, a finding termed value-directed remembering (Castel, 2008). For example, Castel, Benjamin, Craik, and Watkins (2002, Experiment 1) had participants study words that were arbitrarily paired with numbers indicating the value of remembering that information, with higher numbers indicating more valuable information. Across multiple study-test cycles, participants were more likely to remember high-value relative to low-value information. Subsequent studies have similarly demonstrated a memorial benefit for valuable information. For example, value-directed remembering is evident across the lifespan (Castel, Humphreys, Lee, Galván, Balota, & McCabe, 2011; Hayes, Kelly, & Smith, 2013; Koriat, Ackerman, Adiv, Lockl, & Schneider, 2014) and is impaired in individuals with neuropsychological (Castel, Balota, & McCabe, 2009) or attentional deficits (Castel, Lee, Humphreys, & Moore, 2011). Further, the influence of value on remembering has been demonstrated when attempting to remember faces (DeLozier & Rhodes, 2015), names (Festini, Hartley, Tauber, & Rhodes, 2013), and information in more complex scenarios, such as the amount of money one is owed (Castel, Friedman, McGillivray, Flores, Murayama, Kerr, & Drolet, 2016), the health risks of medication (Friedman, McGillivray, Murayama, & Castel, 2015), and the risks of food-borne allergies (Middlebrooks, McGillivray, Murayama, & Castel, in press). In addition, these effects have been examined using neuropsychological models and methods (e.g., Adcock, Thangavel, Whitfield-Gabrieli, Knutson, & Gabrieli, 2006; Cohen, Rissman, Suthana, Castel, & Knowlton, 2016) to determine how dopamine and semantic processing give rise to better memory for high-value information. Other data suggest that, when given the opportunity to control their own learning, individuals prioritize high-value information (e.g., Castel, Murayama, Friedman, McGillivray, & Link, 2013; DeLozier & Rhodes, 2015; Koriat et al., 2006).

Although demonstrations of value-directed remembering are legion, there is far less evidence on how valuable and less valuable information is retrospectively judged, particularly when that information has been forgotten. A possibility is that information once deemed valuable may be downgraded in importance when it is forgotten. For example, one might explain forgetting to send an email to a colleague as a consequence of a message that was not particularly important. Castel et al. (2012) tested this possibility by having participants retrospectively judge the value of information following a memory test (see also Madan & Spetch, 2012). Specifically, they had participants study lists of words with each word randomly paired with a value from 1 to 12. Immediately after the presentation of each list, participants completed a free recall test in which they were instructed to recall as many words from that list as possible. Following the final free recall test, participants were given a sheet of paper with all of the words that had been presented and instructed to recall the value that was initially associated with each word. Consistent with prior work on value-based remembering, recall was positively related to the value associated with each item during the study phase. Most important for present purposes, participants rated remembered items as being more valuable than forgotten items, a finding that held even when controlling for the actual study value of the item.

A second experiment demonstrated that the forgetting bias extends to the subjective value an individual assigns to information. Specifically, participants were given personality traits (e.g., honest, intelligent, vulgar) and rank-ordered the value of each trait when evaluating a significant other on a scale from 1 to 8. These traits were then re-presented as study items, accompanied by the value the participant had assigned in the earlier rating phase. On a subsequent free recall test, participants were more likely to remember high-value relative to low-value items. Following this test, participants were given a list of the studied traits and asked to recall the value associated with each. As in Experiment 1, participants provided higher ratings for remembered compared with forgotten traits. Thus, Castel et al.’s (2012) results are indicative of a forgetting bias – individuals deem forgotten information to be less important than remembered information.

Potential accounts of the forgetting bias

Although Castel et al. (2012) reported a robust forgetting bias, the source of this bias remains unclear. Castel et al. suggested that the relative, perceived fluency of an item (cf. Kelley & Rhodes, 2002) may drive the forgetting bias. Specifically, remembered items may be perceived as more fluent, and thus more familiar, and receive higher value ratings, whereas forgotten items may be perceived as less fluent/familiar and receive lower value ratings. Indeed, manipulating the fluency of an item can affect the perceived familiarity of that item. For example, Whittlesea, Jacoby, and Girard (1990) presented participants with a short list of words followed by a test word, which was covered with either a light or a heavy mask. Participants were both faster to identify the word and more likely to report that the test word was “old” when it was covered with a light mask as opposed to a heavy mask. Thus, the perception of fluency may create higher levels of familiarity (see also Jacoby & Whitehouse, 1989; Westerman, 2008).

Along with feelings of familiarity, there has also been an abundance of research showing the influence of perceptual fluency on many other metacognitive judgments (e.g., Alter & Oppenheimer, 2009; Johnston, Dark, & Jacoby, 1985; Kleider & Goldinger, 2004; Reber & Schwarz, 1999; Song & Schwarz, 2008; Werth & Strack, 2003). For example, Werth and Strack (2003) had participants study questions and answers in formats that were easy (high figure-ground contrast) or difficult (low figure-ground contrast) to read. Participants were then asked to judge the likelihood that they would have known the answer. Easily read items elicited higher ratings that the participant would have known the answer. Reber and Schwarz (1999) likewise reported that participants were more likely to endorse statements such as “Osorno is in Chile” if the statement was presented in an easy-to-read color. The forgetting bias may similarly reflect a fluency-based attribution. That is, recalled words may be more fluent/familiar and thus regarded as more valuable. Conversely, participants may have lower feelings of fluency/familiarity for forgotten words and thus assign them lower values. We tested this account by manipulating the perceptual fluency of items when they were presented for a value rating (Experiment 1) and by manipulating the familiarity of an item prior to making a value rating (Experiment 2).

In addition to the fluency hypothesis, it is also possible that the forgetting bias may be driven by a general belief that remembered information is more important than forgotten information (cf. Mueller, Dunlosky, Tauber, & Rhodes, 2014). That is, participants may believe that if they remembered an item it must be important and thus assign it a high value within an experimental context that asks them to consider prior study value. Conversely, participants may interpret forgetting an item with the belief that it was unimportant and thus assign it a low value. To that end, participants may assign value to an item following a two-stage process. First, participants attempt to remember whether an item was recalled during the initial phase. Next, based on this memory for a past test, items that are recalled as “remembered” are assigned a value within the upper end of the possible range of values and items recalled as “forgotten” are assigned a low value in the possible range of values. That is, participants may hold a theory that remembered information is more valuable than forgotten information and apply this theory to value judgments after interrogating memory for the prior test.

Finn and Metcalfe (2007, 2008; see also Serra & Ariel, 2014) have provided compelling evidence that memory for a past test influences a different domain of judgment: predictions of future memory performance (i.e., Judgments of Learning or JOLs; see Rhodes, 2016, for a review) for the same items across multiple study-test trials. Participants typically demonstrate an underconfidence with practice effect in such situations (Koriat, Sheffer, & Ma’ayan, 2002), exhibiting overconfidence on an initial trial (i.e., JOLs exceed performance) and underconfidence on subsequent trials (i.e., JOLs underestimate performance). One factor leading to underconfidence on later trials is that participants use their memory-for-past-test as a basis for judgment. Accordingly, items remembered on a past trial are given high JOLs and items forgotten on a past trial are given low JOLs, leading participants to underestimate additional learning and exhibit underconfidence (see Ariel & Dunlosky, 2011; Tauber & Rhodes, 2012, for additional factors driving this effect).

Applying a similar logic, the forgetting bias may reflect participants’ memory for past tests. That is, when making a value judgment, participants may interrogate memory to determine whether an item was recalled on a prior test. If it is deemed remembered, the item may be assigned a higher value than an item deemed forgotten (i.e., not recalled) on the prior test. We investigated this account in Experiment 3 by asking participants to indicate whether an item was “remembered” or “forgotten” on the previous recall test, prior to making a value rating. In particular, if participants apply a general theory that remembered information is more valuable than forgotten information, value judgments should be higher for items deemed “recalled,” regardless of the actual status as a remembered or forgotten item. A theory-based account would also predict that value judgments should be similar for all items deemed “forgotten,” independent of the objective status of an item as forgotten or remembered. If participants do not rely on their memory-for-past-test to remember value, then value ratings should be differentiated based only on whether the item was remembered or forgotten and uninfluenced by memory for a past test.

Such a pattern of findings, whereby items regarded as “recalled” are given higher value ratings, may be anticipated by the directed forgetting (DF) literature. In particular, studies of item-method DF present participants with lists of items that alternate among instructions to forget (F) or remember (R) a particular item. When subsequently tested on the list, memory is generally superior for R items relative to F items, a finding that holds even when participants are instructed to recall or recognize all studied items, regardless of the original designation (e.g., Woodward & Bjork, 1971). Several studies have asked participants to label recalled or recognized items as having been originally presented under R or F instructions (e.g., Davis & Okada, 1971; Gallant & Yang, 2014; Thompson, Fawcett, & Taylor, 2011; Woodward & Bjork, 1971). Although participants are generally accurate at identifying the origin of an item (e.g., Thompson et al., 2011), there is some evidence that errors are informed by a prior recall episode. For example, Woodward and Bjork (1971) had participants study multiple lists of words under item-method DF instructions. Participants took an immediate test after each list with the goal of recalling only R items and then a final test, after all lists were presented, under instructions to recall any studied item. Participants also identified any F items among the items output on the final test. Although errors were infrequent, Woodward and Bjork (1971) noted that “…the immediate-recall history of a word heavily influenced whether the word was labeled as an F word or not” (p. 114). In particular, R words not recalled on the initial test that were output on the final test were frequently mistakenly labeled as F words and F words recalled on the initial test that were output on a final test were mistakenly labeled as R words. Similarly, in the current study, the immediate recall history of items may influence the value ratings given, with prior recall positively related to value.

In all, the experiments reported should serve to test the mechanism(s) driving the forgetting bias. We note that these accounts are not entirely mutually exclusive. For example, an item might be judged as “recalled” because it is highly familiar or fluent. Such overlapping mechanisms would thus predict that manipulations of familiarity and fluency should exert strong effects on retrospective assessments of value and result in patterns of data similar to differentiating items regarded as remembered or forgotten.

Experiment 1

In Experiment 1 we investigated a fluency-based account of the forgetting bias. As noted previously, this account proposes that forgotten information may be perceived as less fluent (as a result of being initially forgotten) or less familiar and thus deemed less valuable than remembered information. Accordingly, participants may attribute the ease with which an item is processed during the rating phase to the value associated with that item during the study phase and thus provide higher ratings of value to more fluent remembered items relative to less fluent forgotten items (cf. Kelley & Rhodes, 2002).

Experiment 1 tested this account by manipulating the ease with which items were perceived during the value-rating phase. Specifically, following Castel et al. (2012), participants first studied four lists containing 12 words that were each randomly paired with a number from 1 to 12 specifying the value of the word. After the presentation of each list, participants engaged in a free recall test for the words from that list. Finally, participants were presented with each word they had studied and recalled the original value that was paired with that word. Importantly, we manipulated the fluency of the items during the value judgment by presenting words with either a high or low figure-ground contrast (cf. Werth & Strack, 2003). If fluency influences value ratings, then participants should assign higher values to words presented in the fluent condition (high figure-ground contrast) and lower values to words in the disfluent condition (low figure-ground contrast) regardless of whether they previously recalled those items.

Method

Participants

Forty undergraduate students at Colorado State University participated in the experiment for partial course credit.

Materials

Materials consisted of 60 nouns taken from the Kucera and Francis (1967) norms. These were randomly divided into four sets of 12 items that were presented equally often in low or high figure-ground contrast. The sets were equated for frequency via the Kucera and Francis norms (M = 42.56; SE = 5.34), number of letters (M = 5.94; SE = .21), and number of syllables (M = 2.02; SE = .11). The remaining 12 items served as a practice list prior to beginning the study phase.

Procedure

Participants were presented with four lists each containing 12 words. Words were presented on the screen for 2 s followed by the presentation of the next word. Each word was randomly assigned a value from 1 to 12, which was presented directly below the word. Participants were told to treat the value as points in a game whereby higher value words were worth more points. Further, participants were instructed to maximize their final point value by remembering as many words as possible. Following the presentation of each list, participants were given 1 min to recall as many words from the previous list as possible on an answer sheet that was provided (they were not given feedback on performance). A practice list was presented prior to the four experimental lists to familiarize participants with the procedure. The study-test procedure was repeated until all four experimental lists had been presented. The order of the lists was counterbalanced such that each list occurred equally often at each position in the presentation order.
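The list construction and counterbalancing described above can be sketched as follows. This is a minimal illustration rather than the software used in the experiment: the function names and placeholder words are hypothetical, each value from 1 to 12 is assumed to be used exactly once per list, and a cyclic Latin square is assumed as one way to place each list equally often at each serial position.

```python
import random

def assign_values(words, rng):
    """Pair each word with a unique value from 1 to len(words) at random.

    Assumption: every value is used exactly once per list.
    """
    values = list(range(1, len(words) + 1))
    rng.shuffle(values)
    return list(zip(words, values))

def latin_square_orders(n_lists):
    """Cyclic Latin square: each list appears once at each serial position."""
    return [[(start + pos) % n_lists for pos in range(n_lists)]
            for start in range(n_lists)]

rng = random.Random(1)  # fixed seed only so the sketch is reproducible
words = [f"word{i:02d}" for i in range(1, 13)]  # hypothetical placeholder items
study_list = assign_values(words, rng)
orders = latin_square_orders(4)  # one list order per counterbalancing group
```

Cycling participants through the four rows of `orders` would place every list at every position equally often, which is the property the procedure requires.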

Following the presentation and recall of all lists, participants were presented with all of the words that they had previously studied. The words were blocked in the same lists from study, and lists were presented in the same order as they were studied. Words within each list were randomized anew for each participant and were presented one at a time in the center of the screen. Each word was randomly assigned to be presented in either high figure-ground contrast (i.e., high fluency; black words on a white background) or low figure-ground contrast (i.e., low fluency; lime green words on a cyan background; see Footnote 1). Participants were given an unlimited amount of time to recall the value that was initially assigned to the word. Participants were encouraged to be as accurate as possible but no constraints were placed on their responses (i.e., the frequency of using a particular value was not constrained).

Results

As trials are nested within participants and values varied across the trials, our primary method of analysis was mixed-effects modeling with random participant effects (random item effects were not incorporated as the assignment of the value was counterbalanced across participants; see Murayama, Sakaki, Yan, & Smith, 2014). Random participant slopes as well as a random participant intercept were specified for all the main effects. Study lists were treated as fixed effects with three effect-coded variables in the model. Prior to conducting analyses, study value was centered at the group mean (this was also the case for analyses in the subsequent experiments). To facilitate ease of interpretation, all figures are presented with data collapsed into quartiles (i.e., values 1–3; 4–6; 7–9; 10–12); however, all analyses treated value as a continuous variable in the mixed-effects model.
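The variable coding described in this paragraph can be illustrated with a short sketch. The helper names are hypothetical, and the choice of the last list as the one coded −1 on all three list variables is an assumption; the text does not say which list served that role. Because the values 1 to 12 each occur once per list, the mean study value is 6.5 for every participant, so centering yields the same result whether the mean is taken within or across participants.

```python
def center(values):
    """Subtract the mean so that 0 corresponds to the average study value."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]

def effect_code(flag):
    """Map a two-level factor to -1 / +1 (e.g., forgotten / remembered)."""
    return 1 if flag else -1

def effect_code_list(list_index, n_lists=4):
    """Three effect-coded columns for four lists.

    Assumption: the last list is the one coded -1 on every column.
    """
    if list_index == n_lists - 1:
        return [-1] * (n_lists - 1)
    return [1 if j == list_index else 0 for j in range(n_lists - 1)]

def value_bin(value, width=3):
    """Collapse values 1-12 into the plotting groups 1-3, 4-6, 7-9, 10-12."""
    return (value - 1) // width + 1

study_values = list(range(1, 13))
centered = center(study_values)              # runs from -5.5 to 5.5
bins = [value_bin(v) for v in study_values]  # used for figures only
```

The bins are used only for plotting; the models themselves take the centered, continuous value.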

Recall

Figure 1 displays the mean proportion of words recalled as a function of study value and fluency. As expected, participants recalled more words that were paired with higher point values. On average, participants recalled nearly half (M = 0.44; SE = 0.014) of the words. A generalized mixed-effects model predicting recall performance (based on a Bernoulli distribution, with 0 = not recalled and 1 = recalled) from study value, fluency (effect coded −1 = low fluency, 1 = high fluency), the interaction between them, and study lists revealed that participants were more likely to recall words paired with higher values, Exp (b) = 1.22, z = 7.19, p < .01 (see Footnote 2). Fluency condition also positively predicted recall performance, Exp (b) = 1.16, z = 2.75, p < .05. Given that fluency was manipulated after the recall phase and randomly assigned to items, we treat this finding with caution and suggest that the effect on memory is likely spurious.
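Because Exp (b) is an odds ratio, a value of 1.22 means that each one-point increase in study value multiplies the odds of recall by 1.22. The sketch below converts that into an approximate change in recall probability. It is illustrative only: the function name is hypothetical, the overall recall rate of .44 stands in for the model intercept (which is not reported), and random effects and the other predictors are ignored.

```python
def shift_probability(p_baseline, odds_ratio, steps=1):
    """Apply an odds ratio `steps` times to a baseline probability."""
    odds = p_baseline / (1 - p_baseline)
    new_odds = odds * odds_ratio ** steps
    return new_odds / (1 + new_odds)

# Baseline of .44 is the reported overall recall rate, used here as a
# stand-in for the model intercept (an assumption; it is not reported).
p_plus_one = shift_probability(0.44, 1.22)       # one point above the mean value
p_plus_three = shift_probability(0.44, 1.22, 3)  # three points above the mean value
```

Under these assumptions, a word worth one point more than average would be recalled with a probability of roughly .49 rather than .44.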

Fig. 1

The mean proportion recalled as a function of point value (in groups of three) and fluency in Experiment 1. Error bars reflect one standard error of the mean.

Value ratings

Of primary interest are value ratings for remembered versus forgotten items as a function of fluency. If participants attribute fluent processing to items originally assigned a high value, then a main effect should be evident such that high-fluency items are given higher value ratings than low-fluency items. In contrast to this hypothesis, Fig. 2 shows little difference in remembered value across levels of fluency. This was confirmed via a mixed-effects model predicting remembered value from study value, recall (effect coded: −1 = forgot, 1 = remembered; the variable was not centered to preserve the consistent meaning of −1 and 1 across participants), fluency, their two-way and three-way interactions, and study lists. The three-way interaction and the two-way interactions between study value and fluency and between recall and fluency were not significant, zs < 1. There was also no main effect of fluency, z < 1. However, the results revealed main effects of study value, b = .17, z = 7.48, p < .01, and recall, b = 0.95, z = 8.18, p < .01. As the positive coefficient for recall indicates, participants assigned higher values to remembered words relative to forgotten words.
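One way to write out the model described in this paragraph is given below. The notation is an assumption introduced for illustration and does not appear in the original analysis: V is the remembered value for trial i of participant j, S is centered study value, R is recall (−1/1), F is fluency (−1/1), the L terms are the three effect-coded list variables, and the u terms are participant-level random effects on the intercept and the three main effects.

```latex
\begin{aligned}
V_{ij} ={}& (\beta_0 + u_{0j}) + (\beta_1 + u_{1j})\,S_{ij}
          + (\beta_2 + u_{2j})\,R_{ij} + (\beta_3 + u_{3j})\,F_{ij} \\
        & + \beta_4\,S_{ij}R_{ij} + \beta_5\,S_{ij}F_{ij} + \beta_6\,R_{ij}F_{ij}
          + \beta_7\,S_{ij}R_{ij}F_{ij} \\
        & + \sum_{k=1}^{3}\gamma_k\,L_{kij} + \varepsilon_{ij}
\end{aligned}
```

In this notation the reported effects of study value and recall correspond to \beta_1 = .17 and \beta_2 = 0.95. Because R is coded −1 and 1, the model-implied difference between remembered and forgotten items is 2\beta_2.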

Fig. 2

The mean remembered value for words that were remembered and words that were forgotten in Experiment 1 as a function of fluency and actual value (in groups of three). Error bars reflect one standard error of the mean.

Memory accuracy for the values associated with each item was modest (M = 0.15; SE = .011), but exceeded chance (.083), t(39) = 6.36. Further, memory accuracy was not influenced by study value or fluency, zs < 1.
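The chance level of .083 corresponds to guessing one of 12 equally likely values (1/12). The sketch below shows that calculation together with a one-sample t statistic against chance. The per-participant accuracy scores are hypothetical and serve only to make the example runnable; they are not the data from the experiment.

```python
import math
import statistics

N_VALUES = 12
CHANCE = 1 / N_VALUES  # probability of guessing the exact value, about .083

def one_sample_t(scores, mu):
    """Return the t statistic and degrees of freedom for a test against `mu`."""
    n = len(scores)
    se = statistics.stdev(scores) / math.sqrt(n)
    return (statistics.mean(scores) - mu) / se, n - 1

# Hypothetical per-participant accuracy scores, for illustration only.
accuracy = [0.10, 0.15, 0.21, 0.08, 0.17, 0.19, 0.12, 0.15]
t_stat, df = one_sample_t(accuracy, CHANCE)
```

With 40 participants, the same computation yields 39 degrees of freedom, as in the reported test.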

Experiment 2

Experiment 1 replicated the forgetting bias reported by Castel et al. (2012). Specifically, participants regarded remembered items as more valuable than forgotten items. More importantly, there was little evidence that fluency had any impact on value ratings. That is, value ratings did not differ for high-fluency versus low-fluency items and fluency did not interact with value or the status of an item as remembered or forgotten. Thus, one might conclude that remembered value is independent of the experience participants might have or the attributions participants might make when attempting to remember the prior value of an item. However, the findings (a) depend on a null effect for fluency and (b) reflect only one possible manipulation of fluency/familiarity (but see Reber & Schwarz, 1999). Accordingly, in Experiment 2, we sought to employ a stronger manipulation of familiarity than was used in Experiment 1.

As in Experiment 1, participants in Experiment 2 studied four lists of items, each randomly paired with a value and followed by an immediate test of free recall. However, prior to the judgment phase, participants were re-exposed to the study items so as to augment the familiarity of those items. Specifically, each item was presented, without its accompanying value, either once or three times in this familiarity phase. Following the familiarity phase, participants were shown each item and asked to recall the accompanying value. If participants mistakenly attribute familiarity to items of high value, then items presented three times should be regarded as more valuable than items presented once. Such a pattern would suggest that participants, in part, use current processing to determine the prior value of an item.

Method

Participants

Sixty undergraduate students at Colorado State University participated for partial course credit.

Materials and procedure

The materials, initial study, and recall phase were identical to Experiment 1. Once the final list had been presented and the final recall test administered, participants in Experiment 2 moved on to the familiarity phase, in which each list was presented for additional study, but without its original study value. In this phase, words were presented in blocks corresponding to the order of the lists from the original study phase (e.g., the first list presented in the study phase comprised the first block of the familiarity phase, the second list of the study phase was the second block of the familiarity phase, etc.). Half of the lists were presented once and half were presented three times with the number of presentations of a particular list counterbalanced across participants. For lists presented three times, the entire list was presented before starting the list anew until all three presentations had been completed. Prior to the familiarity phase, participants were instructed that they would have the opportunity to restudy each of the four previously presented lists and were encouraged to carefully attend to the words. Words within each list were presented in a uniquely randomized order at a 2-s rate.
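The structure of the familiarity phase can be sketched as a presentation schedule. This is a hypothetical reconstruction: the function name and placeholder words are invented, and drawing a fresh random order on every pass through a repeated list is an assumption, since the text states only that words within each list appeared in a uniquely randomized order.

```python
import random

def familiarity_schedule(lists, repeated, rng, n_repeats=3):
    """Return the restudy sequence: lists in study order, each shown 1 or 3 times."""
    schedule = []
    for index, words in enumerate(lists):
        passes = n_repeats if index in repeated else 1
        for _ in range(passes):
            order = words[:]    # copy so the original list is untouched
            rng.shuffle(order)  # assumption: fresh random order on every pass
            schedule.extend((index, word) for word in order)
    return schedule

rng = random.Random(2)  # fixed seed only so the sketch is reproducible
lists = [[f"L{i}_w{j}" for j in range(12)] for i in range(4)]  # placeholders
schedule = familiarity_schedule(lists, repeated={0, 2}, rng=rng)
```

Which two lists are passed as `repeated` would vary across participants to implement the counterbalancing of presentation frequency.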

Once all lists had been restudied, participants proceeded to the value-rating phase. The procedure was identical to Experiment 1, with the exception that all words were presented in an identical manner (white font on a black background).

Results

Recall

Figure 3 displays the mean proportion of words recalled as a function of study value and repetition. On average, participants recalled nearly half (M = 0.42; SE = 0.01) of the words. As in Experiment 1, a generalized mixed-effects model predicting dichotomous recall performance from study value, familiarity (−1 = low familiarity, 1 = high familiarity), their interaction, and study lists showed that participants were more likely to recall words paired with higher study values, Exp (b) = 1.20, z = 8.84, p < .01. Familiarity was not significantly related to recall performance, z < 1.

Fig. 3

The mean proportion recalled as a function of point value (in groups of three) and repetition in Experiment 2. Error bars reflect one standard error of the mean. Note: Rep = Repetition

Value ratings

As in Experiment 1, our primary interest was in value ratings for remembered versus forgotten items as a function of repetition (see Fig. 4). If people use the familiarity of an item as an index of its value, then repeated items should be regarded as more valuable than items presented only once prior to the judgment phase. We performed a mixed-effects model predicting remembered value from study value, recall (effect coded), familiarity, their two-way and three-way interactions, and study lists.

Fig. 4

The mean remembered value and actual value for words that were initially remembered and words that were initially forgotten in Experiment 2 as a function of repetition and point value (in groups of three). Error bars reflect one standard error of the mean. Note: Rep = Repetition

The three-way interaction and the interaction between study value and familiarity were not significant, zs < 1. The main effects of study value, recall, and familiarity were all significant, bs = 0.15, 0.54, and 0.20, zs = 7.73, 6.36, and 3.23, ps < .01. These effects were qualified by two significant two-way interactions. First, there was a significant interaction between study value and recall, b = .07, z = 4.05, p < .01. Simple slope tests revealed a positive relationship between study value and remembered value for forgotten items, b = .08, z = 3.41, p < .01, as well as for remembered items, b = .22, t = 7.98, p < .01. This effect was particularly strong for remembered items, leading to the interaction.

There was also a significant interaction between recall and familiarity, b = −.17, z = 2.96, p < .01. Follow-up tests showed that, for forgotten items, remembered value was higher for words presented three times relative to those presented once, b = .37, t = 4.84, p < .01. Conversely, remembered value was not influenced by repetition for remembered items, z < 1.
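With recall effect coded as −1 (forgotten) and 1 (remembered), the simple slopes reported for the two interactions follow from the main-effect and interaction coefficients: the slope of a predictor at a given level of recall is the main effect plus the interaction coefficient times the recall code. The sketch below works through that arithmetic using the reported coefficients. The function name is hypothetical, and the calculation assumes the remaining predictors are held at zero.

```python
def simple_slope(b_main, b_interaction, moderator_code):
    """Slope of a predictor at one level of an effect-coded (-1 / +1) moderator."""
    return b_main + b_interaction * moderator_code

FORGOTTEN, REMEMBERED = -1, 1

# Study value x recall: main effect 0.15, interaction 0.07
value_forgotten = simple_slope(0.15, 0.07, FORGOTTEN)
value_remembered = simple_slope(0.15, 0.07, REMEMBERED)

# Familiarity x recall: main effect 0.20, interaction -0.17
fam_forgotten = simple_slope(0.20, -0.17, FORGOTTEN)
fam_remembered = simple_slope(0.20, -0.17, REMEMBERED)
```

The resulting slopes are approximately .08 and .22 for study value and .37 for familiarity among forgotten items, in line with the reported follow-up tests. The familiarity slope for remembered items is close to zero (about .03), consistent with the absence of a repetition effect for those items.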

Memory accuracy for the values associated with each item (M = 0.13; SE = .01) exceeded chance (.083), t(59) = 5.39. Further, memory accuracy was not influenced by study value or familiarity, zs < 1.34.

Experiment 3

As in Experiment 1, participants in Experiment 2 provided higher value ratings for remembered than forgotten items. More importantly, Experiment 2 provided moderate support for the influence of familiarity on participants’ value ratings. Specifically, whereas the manipulation of familiarity was unrelated to value ratings for remembered items, participants accorded forgotten items higher value ratings when they had been seen three times rather than once during the familiarity phase. However, we note that the influence of familiarity on value ratings was still comparatively modest relative to whether an item was remembered or forgotten. Combined with the results of Experiment 1, such data suggest that attributions about current processing made during the judgment phase play, at most, a minor role in the forgetting bias, in contrast to Castel et al.’s (2012) speculation.

In Experiment 3 we explored an alternative account of the forgetting bias. As noted previously, one possibility is that participants assign a value to an item following a two-stage process. First, participants may attempt to remember whether an item was recalled during the initial phase. Next, based on this judgment, items that are remembered as “recalled” are assigned a value within the upper end of the possible range of values and items remembered as “forgotten” are assigned a low value in the possible range of values. Thus, memory for a past test may lead participants to consider higher or lower values based on whether the item was forgotten or remembered.

To create the pattern described, with remembered items consistently deemed more valuable than forgotten items, participants would need to be reasonably good at remembering their performance on a past test. For example, were memory-for-past-test near chance, applying a theory would be ineffective and lead to similar values for remembered compared with forgotten items. This concern appears unwarranted, as several prior studies suggest that participants can remember past test performance at levels that far exceed chance (Finn & Metcalfe, 2008; Gardiner & Klee, 1976).

We investigated a memory-for-past-test account of the forgetting bias in Experiment 3. As in the prior experiments, participants studied items randomly paired with values and were immediately tested on these items. During the subsequent value judgment phase, we solicited two judgments for each item. First, participants were asked to indicate whether the item was “recalled” or “forgotten” when they were tested. Next, they were asked to recall the value that was paired with the item. Based on prior work (Finn & Metcalfe, 2008; Gardiner & Klee, 1976), we anticipated that participants would be reasonably proficient at remembering prior recall performance. More importantly, identifying whether an item was deemed remembered or forgotten allowed us to investigate a theory-based approach to value judgments. In particular, if participants apply a general theory that remembered information is more valuable than forgotten information, value judgments should be similar for items deemed “recalled,” regardless of whether the item was actually remembered or forgotten, and higher than value judgments for items deemed “forgotten.” Likewise, value judgments should be similar for all items deemed “forgotten,” regardless of the objective status of an item as forgotten or remembered. If memory-for-past-test does not influence judgment, then value ratings should be affected only by whether the item was remembered or forgotten and uninfluenced by memory for a past test. We tested these possibilities in Experiment 3.

Method

Participants

Sixty undergraduate students at Texas Christian University participated in the experiment for partial course credit.

Materials and procedure

The materials used and the procedure for Experiment 3 were identical to Experiment 2 with two major exceptions. First, participants did not engage in restudy (i.e., a familiarity phase) prior to the value rating task. Second, the procedure for the value rating task was altered. Specifically, for each item, participants first indicated whether that item had been recalled during the initial test phase. Next, participants provided a value rating for the item in the same manner as in Experiment 2.

Results

Recall

Overall, participants recalled 43% of the words. We fit a generalized mixed-effects model predicting recall (a dichotomous variable) from study value, memory for past test accuracy (effect coded; −1 = “not correctly recalled”, 1 = “correctly recalled”; the variable was not centered, to preserve the consistent meaning of −1 and 1 across participants), their interaction, and study lists. The results (see Fig. 5) revealed a main effect of study value, Exp(b) = 1.26, z = 8.14, p < .01, as participants remembered more valuable information. Memory for past test also predicted recall, Exp(b) = 1.73, z = 3.54, p < .01, indicating that items that were correctly judged were associated with better recall.
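To make the size of these effects concrete, the sketch below converts the reported coefficients into odds ratios over a chosen range of the predictor. It assumes that Exp(b) is the exponentiated coefficient of a logistic model (i.e., an odds ratio per one-unit change in the predictor), which is the usual reading but is not stated explicitly here; the function names are illustrative and do not come from the analysis itself.

```python
def odds_ratio_over(exp_b: float, units: float) -> float:
    """Odds ratio implied by a change of `units` in a predictor whose per-unit ratio is exp_b."""
    return exp_b ** units


def apply_odds_ratio(p: float, odds_ratio: float) -> float:
    """Probability obtained after multiplying the odds of `p` by `odds_ratio`."""
    odds = p / (1.0 - p) * odds_ratio
    return odds / (1.0 + odds)


# Effect-coded levels (-1 and 1) are two units apart, so the ratio
# between the two levels is Exp(b) squared (assumption: logit link).
between_levels = odds_ratio_over(1.73, 2)

# An 11-point difference in study value (1 vs. 12) compounds the per-point ratio.
across_values = odds_ratio_over(1.26, 11)

print(round(between_levels, 2), round(across_values, 2))
```

These conversions hold the other predictors fixed and ignore the interaction term and the random effects, so they serve only as a reading aid for the coefficients.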

Fig. 5

The mean proportion recalled as a function of point value (in groups of three) in Experiment 3. Error bars reflect one standard error of the mean

Judgments of recall

We first examined the accuracy of memory for past test by assessing how frequently participants correctly identified a remembered item as “recalled” and a forgotten item as “not recalled” (see Fig. 6). A mixed-effects model predicting memory accuracy from study value, memory status (effect coded: −1 = forgotten, 1 = remembered), their interaction, and study lists showed a main effect of study value, Exp(b) = 0.94, z = 3.49, p < .01, indicating that, controlling for memory status (i.e., whether the item was remembered or forgotten), participants’ memory for past test decreased with increasing study value. There was also a main effect of memory status, Exp(b) = 1.58, t = 3.12, p < .01, such that, controlling for study value, memory for past test was more accurate for remembered items than for forgotten items. Value did not interact with Memory Status, z < 1.

Fig. 6

The mean proportion of remembered and forgotten items correctly judged as a function of point value (in groups of three) in Experiment 3

A more conventional way to analyze these data is in terms of signal detection theory, based on the proportion of items judged “recalled” (see Table 1). A hit corresponded to a remembered item correctly deemed “recalled,” and a false alarm corresponded to a forgotten item incorrectly classified as “recalled.” For each participant, only four items were associated with any particular value, too few to support the calculations that signal detection analyses require. Thus, we grouped the values into four bins of three, from the lowest values (1–3) to the highest values (10–12), before computing signal detection measures.

Table 1 Mean recognition performance for past test by value in Experiment 3

The proportion of items deemed “recalled” was analyzed in a 2 (Memory Status: recalled, not recalled) × 4 (Value: 1–3, 4–6, 7–9, 10–12) repeated-measures ANOVA. Overall, hits (M = 0.794; SE = .029) greatly exceeded false alarms (M = 0.350; SE = .040), F(1, 51) = 92.836, p < .001, ηp² = .645. The proportion of items called “recalled” did not vary as a function of Value, nor did Value interact with Memory Status, Fs < 1. Further analyses showed that measures of discriminability (d’) and response criterion (C’) did not differ by value, Fs < 1. Thus, on the whole, participants were adept at discriminating recalled from forgotten items, but performance was far from ceiling.
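For readers less familiar with these measures, the sketch below shows one common way to bin the point values and to compute discriminability and criterion from the four response counts for a single participant and bin. It is illustrative only and is not the analysis code used for the experiment: the function names are hypothetical, and the log-linear correction (adding 0.5 to each count and 1 to each total) is an assumption introduced here to keep the rates away from 0 and 1, since the treatment of extreme rates is not described above.

```python
from statistics import NormalDist


def bin_value(value: int) -> str:
    """Map a point value (1-12) to one of four bins of three values each."""
    if not 1 <= value <= 12:
        raise ValueError("value must be between 1 and 12")
    low = 3 * ((value - 1) // 3) + 1
    return f"{low}-{low + 2}"


def sdt_measures(hits: int, misses: int,
                 false_alarms: int, correct_rejections: int) -> tuple[float, float]:
    """Return (d_prime, criterion) for one participant and one value bin.

    hits / misses: remembered items judged "recalled" / "forgotten".
    false_alarms / correct_rejections: forgotten items judged "recalled" / "forgotten".
    """
    # Assumption: log-linear correction so that rates of 0 or 1 stay finite.
    hit_rate = (hits + 0.5) / (hits + misses + 1)
    fa_rate = (false_alarms + 0.5) / (false_alarms + correct_rejections + 1)
    z = NormalDist().inv_cdf  # inverse of the standard normal CDF
    d_prime = z(hit_rate) - z(fa_rate)
    criterion = -0.5 * (z(hit_rate) + z(fa_rate))
    return d_prime, criterion


# Made-up counts for illustration: 12 items in a bin, 5 remembered and 7 forgotten.
d_prime, criterion = sdt_measures(hits=4, misses=1, false_alarms=2, correct_rejections=5)
print(bin_value(11), round(d_prime, 2), round(criterion, 2))
```

Larger values of the discriminability measure indicate better separation of remembered from forgotten items, and a criterion near zero indicates no overall bias toward responding “recalled” or “forgotten.”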

Value ratings

Participants in Experiment 3 indicated whether an item was remembered or forgotten on the initial test. Accordingly, we can assess value ratings both when there is a correspondence between memory for past test and objective performance (e.g., an item was remembered and correctly deemed “recalled”) and for instances where the judgment diverges from objective performance (e.g., an item was forgotten but incorrectly deemed “recalled”). As noted previously, if participants apply a general theory that remembered information is more valuable than forgotten information, then value judgments should be higher for items deemed “recalled,” regardless of the actual status as a remembered or forgotten item. Similarly, items deemed “forgotten” should be given lower value judgments than remembered information, regardless of the actual status of the item as remembered or forgotten.

Figure 7 displays value ratings for remembered and forgotten items as a function of value and of whether the item was correctly judged. By this classification system, remembered items that were correctly judged were deemed “remembered” whereas remembered items associated with an error in judgment were deemed “forgotten.” The same classification applies to forgotten items, such that forgotten items correctly judged were deemed “forgotten” whereas forgotten items associated with an error in judgment were deemed “recalled.” A mixed-effects model was fit to predict value ratings from study value, recall (effect coded; −1 = forgotten, 1 = remembered), memory for past test accuracy, their two-way and three-way interactions, and study lists. The results revealed a significant main effect of study value, b = 0.11, z = 4.25, p < .01. The main effects of recall and memory for past test accuracy were not significant, zs < 1.43. There was also a significant interaction between recall and memory for past test accuracy, b = 0.94, z = 11.72, p < .01. Furthermore, these effects were qualified by a significant three-way interaction, b = .06, z = 2.83, p < .05.

Fig. 7

The mean remembered value and actual value for words that were initially remembered and words that were initially forgotten as a function of whether or not they were correctly judged in Experiment 3. A correct judgment would entail a remembered item being deemed “recalled” and a forgotten item being deemed “forgotten.” An error in judgment would entail a remembered item being judged “forgotten” and a forgotten item being judged “recalled.” Error bars represent one standard error of the mean

To elucidate this three-way interaction, we computed the simple interaction between recall and memory for past test accuracy at low (1 SD below the mean) and high (1 SD above the mean) study values. The interaction was stronger when study value was high, b = 1.17, z = 12.41, p < .01, indicating that value judgments were higher for items deemed “remembered,” regardless of whether the item was actually remembered or forgotten. The interaction was weaker when study value was low, but it was still significant and showed the same pattern, b = .72, z = 5.48, p < .01. Overall, these results comport with the hypothesis that value judgments are higher for items deemed “remembered” than for those deemed “forgotten,” regardless of whether the item was actually remembered or forgotten.
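As a rough sketch of how a simple interaction of this kind can be derived from a full model, the two-way coefficient can be shifted by the three-way coefficient multiplied by the chosen offset in study value. The Python below assumes that study value was mean-centered and that the values 1–12 occurred equally often; neither detail is stated above, and the function name is hypothetical. Because it starts from the rounded coefficients, it can only approximate the reported estimates.

```python
from statistics import pstdev


def simple_interaction(b_two_way: float, b_three_way: float, offset: float) -> float:
    """Two-way interaction evaluated `offset` units away from the mean of the moderator."""
    return b_two_way + b_three_way * offset


# Assumption: point values 1-12 are equally frequent, so their SD is about 3.45.
value_sd = pstdev(range(1, 13))

# Rounded coefficients from the text: two-way b = 0.94, three-way b = 0.06.
high = simple_interaction(0.94, 0.06, +value_sd)
low = simple_interaction(0.94, 0.06, -value_sd)
print(f"+1 SD: {high:.2f}, -1 SD: {low:.2f}")
```

Under these assumptions the sketch yields values in the neighborhood of the reported simple interactions; any remaining gap could reflect rounding of the three-way coefficient or a different estimation procedure, which is not described here.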

Recall accuracy for the values associated with each item was modest (M = 0.142; SE = .023), but exceeded chance (.083, i.e., 1 of 12 possible values), t(87) = 3.257, p = .002. In addition, as value increased, participants’ cued-recall accuracy decreased, b = −.09, z = 3.05, p < .01. Given that cued-recall accuracy was near floor, we do not further interpret this finding.

General discussion

The current study demonstrated that people deem forgotten information to be less valuable than remembered information and tested two possible mechanisms that drive this finding. Our results suggest that the fluency or familiarity of an item being judged plays a minor role in participants’ value judgments. For example, manipulating the ease with which an item could be read had no impact on participants’ value judgments (Experiment 1). Moreover, manipulating the familiarity of an item by varying the number of presentations prior to judgment had little impact on value ratings, with any effect confined to forgotten items previously seen three times (Experiment 2). We note that we have certainly not exhausted the possible range and nature of manipulations of fluency that might influence attributions regarding value. Indeed, effects of familiarity may be more prevalent when participants are unaware of the source of fluency (e.g., Jacoby & Whitehouse, 1989) or when a subset of items is particularly fluent relative to other items (e.g., Jacoby & Dallas, 1981; Wänke & Hansen, 2015; Westerman, 2008). For example, fluency effects might be enhanced were item repetition manipulated within lists rather than between lists. Thus, the potential role of fluency and familiarity in the forgetting bias warrants continued exploration.

Evidence from Experiment 3 largely favors an account of the forgetting bias based on memory for the outcome of a prior test (Finn & Metcalfe, 2008; Serra & Ariel, 2014). In particular, participants in Experiment 3 first indicated whether an item was remembered on the initial test of free recall prior to judging the value of that item. Based on a memory-for-past-test account, items deemed “recalled” should be accorded higher ratings than items deemed “forgotten,” regardless of the objective status of that item as remembered or forgotten. Our results were consistent with this prediction. For example, consider the data shown in Fig. 7 for items that were objectively (actually) remembered. Those items that were correctly judged “recalled” garnered considerably higher ratings than items incorrectly judged to be “forgotten.” A similar pattern was apparent for objectively forgotten items, with items correctly judged “forgotten” given substantially lower ratings than forgotten items incorrectly judged as “recalled.” Thus, our data suggest that it is not the objective status of an item as remembered or forgotten that drove value judgments, but the perceived status of the item as “recalled” or “forgotten,” based on memory for past test, that was the most important factor (see Footnote 3).

Such a mechanism based on memory for a past test may reflect an adaptive use of memory (e.g., Anderson & Schooler, 2000; Bjork, 2011; Schooler & Hertwig, 2005). Indeed, memory may be attuned to the environment such that the most important information is also the most accessible. Important information should also garner more resources than less accessible information. For example, individuals should be more likely to persist in a search for more important or valuable information and terminate search more quickly for information that is less valuable (cf. Dougherty & Harbison, 2007; see Ariel, Dunlosky, & Bailey, 2009, for a similar idea applied to study choices during encoding). This strategy might be adaptive, but our data suggest some circumstances in which it could be misapplied. For example, although our participants were quite good at identifying which information was previously forgotten or remembered, they were far from perfect, suggesting that some information may be mistakenly regarded as forgotten and thus unimportant.

Further, beliefs regarding the influence of value on memory may not be entirely accurate. For instance, individuals appear to hold a theory that information rendered important after encoding, when value would have little impact on memory, should still be quite memorable (Kassam, Gilbert, Swencionis, & Wilson, 2009; Soderstrom & McCabe, 2011). As an illustration, Soderstrom and McCabe (2011) had participants study pairs of words randomly associated with a value from 1 to 6 and judge the likelihood of remembering that information in the future. Value information was presented either before the pair was studied or after it was studied, when value would not influence encoding. Their results showed that participants provided greater predictions of performance for more valuable information, regardless of when value information was presented. Thus, although the forgetting bias reported here and elsewhere (Castel et al., 2012) may reflect adaptive mechanisms based on a general belief or theory about memory, we do not argue for the unqualified sagacity of this view. Future work will profit by exploring the implications of such a theory for control over memory processes.

In all, we attempted to determine why people devalue forgotten information. Our results indicate that forgotten information is retrospectively deemed less important than remembered information. This bias does not appear to be driven by perceptions of the fluency of an item or its familiarity. Rather, our data suggest that individuals invoke a theory about the value of remembered or forgotten information by interrogating memory for performance on a past test. By this theory, information judged as having been remembered is deemed to be more important than information judged as having been forgotten, regardless of whether this comports with the actual state of memory on a past test. Thus, our memory for the past informs the relevance of the present.