Imagine that you are at a departmental function, meeting unfamiliar researchers from other labs and trying to remember their names. You even use the trick you once read about, of repeating the person’s name in conversation with them: “Hi George . . . nice to make your acquaintance, George . . . I really love your work, George . . .”. Now imagine your chagrin the next day, when you bump into George in the corridor, but can’t for the life of you remember his name. Why didn’t the name repetition trick help consolidate your memory of that name?

There’s a good explanation for your forgetfulness: you ignored the spacing effect. Repeated exposure to information strengthens memory for it, but spaced repetitions are most effective (Madigan, 1969; Melton, 1970; and more recently Cepeda et al., 2006; Delaney et al., 2010). Had you repeated George’s name throughout the event and not in a massed fashion, your chances of retaining that information for future recall would have increased significantly.

Recently, Kahana and colleagues (Kuhn et al., 2018) have appealed to the spacing effect to explain an interesting aspect of free recall. In immediate free recall of word lists, the most recently studied items are generally better recalled than mid-list items, an effect often attributed to the retrieval of final items from short-term memory (Davelaar et al., 2005; Talmi et al., 2005). When cognitively demanding activity intervenes between study and test, this effect dissipates. Furthermore, when after initial immediate free recall and subsequent cognitive activity, a final free recall test is administered, the prominent recency effect observed in immediate free recall is actually reversed, with list-end items being remembered more poorly than earlier items (Craik, 1970). In other words, there is a positive within-list recency effect for immediate retrieval, and a negative within-list recency effect for delayed re-retrieval. Initially, this negative recency effect was adduced to provide evidence in favor of dual-store models of word list memory. Craik argued that items from the end of a word list were held in a short-term memory buffer for the least amount of time. Therefore, although they had a higher probability of being immediately retrieved, they had the least strength in longer-term memory. Single-store models sought alternative explanations of the negative recency effect. Contrary to accounts that attribute it to strategic rehearsal processes (e.g., Tan & Ward, 2000), Kuhn et al. (2018) provide evidence (see Footnote 1) that negative recency is a function of spacing (i.e., the amount of time or the number of cognitive steps intervening between initial study and immediate free recall). This challenges the strategic rehearsal processes account, which ascribes importance only to an item's list position in the presentation phase and not to its output position in the immediate recall phase.
Kuhn and colleagues propose that since items presented late in an encoding list are likely to be immediately recalled early on, those items have the least amount of spacing between activations, and therefore the subsequent memory for them is weaker than for items earlier in the encoding sequence. However, the crucial factor is not study list position alone, but the degree of spacing resulting from the study list position and the output position in the immediate free recall test. This account can accommodate the negative recency phenomenon within a single-store model of list memory retrieval patterns.

If the spacing of prior exposures is responsible for the long-term negative-recency effect, we should expect to see that effect not only on subsequent (final) free recall, but on other expressions of memory traces, such as recognition tests. This is because the second exposure via recall is said to serve as a further encoding event (Craik, 1970; Kuhn et al., 2018). That becomes apparent upon considering that retrieval in initial free recall is a paradigmatic case of retrieval practice (recently reviewed by McDermott, 2021). Although retrieval practice affects recollection more strongly than familiarity, in a multi-list paradigm like the current one, it was found to improve overall recognition (Chan & McDermott, 2007, Exp. 3). Similarly, spacing effects are reported for recognition memory, especially for its recollective aspect (Glenberg, 1976; Hintzman, 1969; Parkin & Russo, 1993; Zhao et al., 2015). Therefore, to the extent that the mechanisms responsible for the spacing effect yield negative recency for recall, they should yield negative recency for recognition. If the effect of study-recall spacing duration is found for subsequent (final) recognition testing as well, it would provide additional support for the explanation put forward by Kuhn and colleagues; absent such a finding, the evidence in favor of the spacing account and against the strategic retrieval account would be equivocal.

List serial position effects on delayed (subsequent/final) recognition tests have been explored in several classic studies. Craik et al. (1970) reported a negative recency effect in delayed recognition following immediate free recall tests, parallel to the negative recency effect found for final free recall (Craik, 1970). In contrast, Cohen (1970) reported positive recency effects in delayed recognition following immediate free recall tests. Engle (1974) noted that in the study of Craik et al. (1970), inclusion in recognition success analysis was conditional on being successfully recalled in the immediate test, while in Cohen (1970) inclusion in the recognition analysis was not conditional on prior production in immediate free recall. Engle (1974) therefore directly compared conditional and unconditional analyses, as well as manipulating presentation rate. He reported that words recalled in initial free recall (IFR) exhibited negative recency at all presentation rates, but that was not the case for non-recalled words. Engle (1974) also reported that subsequent recognition confidence ratings for words that had been recalled in IFR increased linearly with IFR output position, reflecting greater subsequent memory strength. Similarly, McCabe and Madigan (1971) reported that the final (recency) item in a five-element sequence of word pairs was identified with the lowest confidence level in delayed recognition after immediate free recall. In contrast, Engle and Durban (1977), examining both auditory and visual presentations, found a small positive recency effect, and Darley and Murdock (1971), who employed a three-alternative forced choice recognition test, found no recency effects.

Notably, delayed recognition accuracy (or confidence) as a function of specific study-output lag for each IFR item was not reported in any of the abovementioned studies. Lacking that data, even in those studies that do report a negative recency effect, it is difficult to adjudicate between Kuhn et al.’s (2018) claim that spacing is responsible for long-term negative recency in delayed recognition and an alternative explanation—that list-end items strategically receive more immediate rote rehearsal (McCabe & Madigan, 1971; Reitman, 1970). In that alternative account, those list-end items are also said to be produced in initial output positions based on registration in short-term memory, while primacy items, when immediately remembered, are retrieved from long-term memory formed during initial encoding, and so are also more successfully identified in delayed recognition tests.

Furthermore, there is reason to suspect that in a final recognition test a different pattern of effects might be observed than that reported by Kuhn et al. (2018) for final free recall. Numerous studies by Zacks and colleagues (e.g., Swallow et al., 2009; reviewed in Radvansky & Zacks, 2017) have documented the strong impact of proximity to event boundaries on subsequent memory. In recognition memory, the most potent retrieval cue is the memorandum itself, which is used as the mnemonic probe: a copy cue (Tulving, 1983). In a test of recognition, the strategic retrieval processes employed in free recall may be overshadowed by copy cue strength. Moreover, proximity to boundaries (i.e., to the study list end and/or to an early output position in the initial free recall stage), which leads to strong encoding, might yield the best final recognition performance.

Therefore, examination of negative recency in delayed recognition, and identification of its possible mechanism if occurring, require detailed information about the lags between study presentation and production in IFR. Fortunately, the PEERS data set includes a number of experimental sessions in which delayed recognition tests were administered at the conclusion of learning and immediate free recall testing of all lists. Furthermore, unlike earlier reports, these data allow examination of the exact serial order of recall in immediate free recall, enabling more precise quantification of the study-to-free-recall lag. We were therefore able to assess whether the negative recency effect in delayed recognition is indeed a function of spacing between initial encoding and immediate free recall in the lists presented during the earlier phases of the experiment.

In addition to investigating whether final recognition performance would support the spacing hypothesis of Kuhn et al. (2018) by performing the same analyses they executed, but on the recognition data, we took advantage of the scope of the PEERS database to conduct additional analyses tracking recognition probabilities as a function of list recency and within-list recency. The most important of these analyses examines the degree of monotonicity of the effects of spacing. As detailed below, this revealed some unexpected findings.

Method

The current investigation is based on recognition data collected as part of Experiment 1 of the Penn Electrophysiology of Encoding and Retrieval Study (PEERS; see Footnote 2). One hundred seventy-one participants took part in Experiment 1, which consisted of seven experimental sessions (see Footnote 3). Two of these participants were excluded from the present analyses due to data corruption, leaving N = 169. Each of the seven sessions consisted of 16 lists of 16 words presented one at a time on a computer screen. Each study list was followed by an immediate free recall test, and each session ended with a comprehensive recognition test. Half of the sessions were randomly chosen to include a final free recall test, which took place before the recognition test.

Earlier PEERS publications report details of the method (for a more complete description, see Healey et al., 2014; Lohnas & Kahana, 2014). In brief, each item was on the screen for 3,000 ms, followed by a jittered 800–1,200-ms interstimulus interval (see Footnote 4). After the last item in the list, there was a 1,200–1,400-ms jittered delay, after which a tone sounded, a row of asterisks appeared, and the participant was given 75 seconds to vocally recall the just-presented items in any order. If a session was randomly selected for a final free recall (FFR) test, participants performed the FFR test following the 16th immediate free recall (IFR) test. An instruction screen informed participants that they had 5 minutes to recall all the items from the preceding lists in any order. An old/new recognition test was administered either after FFR or after the last list’s IFR test. In total, 320 words were presented serially on the computer screen, with target/lure ratio varying with session, and targets comprising 80%, 75%, 62.5%, or 50% of the total trial items. Participants were instructed to indicate for each word whether the test word had been presented previously. Recognition trials were self-paced. Feedback on accuracy and reaction time was provided after each trial.

Results

Before turning to the analyses relevant to the question under consideration, it is instructive to get a sense of the overall patterns of performance in delayed recognition. One interesting pattern is the probability of final recognition as a function of the item’s serial position during encoding, considering both the list that each item came from and the item’s serial position within that list, for a total of 256 (16 × 16) possible positions. We therefore conducted a repeated-measures analysis of variance (ANOVA), with list number (Lists 1–16) and within-list serial position (Slots 1–16) as within-subject factors. Mauchly’s test of sphericity indicated that degrees of freedom in both main effects required correction; the Greenhouse–Geisser epsilon was applied accordingly. We found a main effect of between-list position, F(11.34, 1905.58) = 47.92, p < .001, and a main effect of within-list position, F(12.72, 2136.08) = 10.02, p < .001, but no interaction between those factors, F(83.19, 13976.51) = 1.09, p = .270. As can be seen in Fig. 1a (top), there is a positive long-term recency effect across lists, such that, as might be expected, participants recognized more items from recent than from remote lists; this is reflected in the strong positive-slope linear trend in the across-list effect, F = 347.1, p < .001, and the absence of a significant quadratic trend, F = 3.61, p > .05. This detrimental effect of study–test delay, with the concomitant number of intervening lists (possibly leading to retroactive interference on subsequent recognition), is not surprising. Moreover, the key factor for investigating the spacing proposal is the within-list position at study (and at initial free recall); because the across-list position is not relevant and did not interact with the within-list effects, we will not comment further on that finding.
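The 16 × 16 grid of encoding positions described above can be illustrated with a minimal Python sketch. It is not the PEERS analysis code: the trial format (a tuple of list number, serial position, and a recognized flag), the function names, and the unweighted averaging across lists are all assumptions made for illustration.

```python
from collections import defaultdict


def recognition_by_position(trials):
    """Return {(list_number, serial_position): proportion recognized}.

    `trials` is a hypothetical iterable of (list_number, serial_position,
    recognized) tuples for target items only; positions run from 1 to 16.
    """
    hits = defaultdict(int)
    counts = defaultdict(int)
    for list_number, serial_position, recognized in trials:
        key = (list_number, serial_position)
        counts[key] += 1
        hits[key] += int(recognized)
    return {key: hits[key] / counts[key] for key in counts}


def within_list_curve(grid, n_positions=16):
    """Average the grid over lists, giving one value per serial position.

    Each list contributes equally (an assumption; weighting by trial
    count would be an equally defensible choice).
    """
    curve = {}
    for position in range(1, n_positions + 1):
        values = [p for (_, sp), p in grid.items() if sp == position]
        if values:
            curve[position] = sum(values) / len(values)
    return curve


# Toy data: two lists, two serial positions.
toy_trials = [(1, 1, True), (1, 1, False), (2, 1, True), (1, 2, False)]
toy_grid = recognition_by_position(toy_trials)
toy_curve = within_list_curve(toy_grid)
```

In this sketch the grid corresponds to the list-by-position layout of Fig. 1a, and the averaged curve to the within-list serial position curve of Fig. 1b.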

Fig. 1

Final recognition as a function of serial position during encoding. All 16 lists of Fig. 1a (top) are averaged together in Fig. 1b (bottom), to create a within-list serial position curve. Error bars indicate SEM

To provide a clearer picture of the shape of the within-list serial position curve, we aggregated the data across all 16 lists; in this analysis, we included all items, whether or not they were successfully recalled in the initial free recall (IFR) stage. The shape of the average serial position curve presented in Fig. 1b (bottom) illustrates that there is also a primacy effect within each list. In fact, recognition probability appears to decrease with each additional item on the list, until reaching its lowest value at the middle of the list (Item 9), followed by the asymptotic mid-list plateau observed in classic serial position curves for immediate free recall. At the penultimate point, recognition probability rises again (Item 15). This is reflected in the presence of both linear (F = 49.32, p < .001) and quadratic (F = 26.46, p < .001) components in the within-list effect. However, comparison with Fig. 3 and with the free recall data provided by Lohnas and Kahana (2014, Fig. 3a) indicates that this spike is not an indication of advantage in principle, but rather a function of the greater proportion of words in this position that were successfully recalled in initial free recall (~81% vs. an average of 59% for items in Positions 9–14), for which subsequent recognition is better. Notably, and once again unlike classic serial position curves for immediate free recall, there is a sharp decline in final recognition success for the final 16th item, despite items in that position having been initially recalled at a rate of ~94% (Lohnas & Kahana, 2014, Fig. 3a). The implications of these performance trends will be explored in the Discussion.

To examine the proposal of Kuhn et al. (2018) regarding spacing effects on negative recency, we proceeded to analyze recency effects, examining memory for later list items compared to earlier items. We assume that the mere retrieval of an item produces learning (e.g., Karpicke & Roediger, 2008). Therefore, to avoid confounds of the effects of final free recall before long-term recognition, all the analyses presented here (including Fig. 1, above) consider final recognition data only for PEERS Experiment 1 sessions that did not include the FFR test (see Footnote 5). Moreover, the analyses consider recognition data separately for items that participants did or did not initially recall during their IFR trials, except for the detailed spacing analysis, which considers the number of items between encoding and subsequent IFR, and is therefore based solely on the former items.

To test the hypothesis that spacing between encoding and initial retrieval influences final recognition, we classified the initially recalled items into two categories according to their output positions, defining early output positions as the first half of outputs, and late output positions as the second half (as in Kuhn et al., 2018). This partitioning resulted in three classes of items: not initially recalled, recalled early, and recalled late. Figure 2 shows the probability of final recognition for these three item types as a function of recency of encoding, considering both the item’s list number and the item’s serial position within that list. It illustrates three major effects in final recognition. First, participants correctly recognized almost all the items that they recalled during IFR, as was reported for FFR by Kuhn et al. (2018). Second, participants recognized more items from recent than from remote lists. However, this effect was seen primarily for items that participants failed to recall in IFR. Third, we found a pronounced within-list negative-recency effect, which was most dominant for items that were recalled in early output positions during IFR. To provide a clearer picture of the relationship between this negative recency and item type, in Fig. 3 we aggregated the data across all 16 lists (see Footnote 6). Like the results of Kuhn et al. (2018, Fig. 2a) regarding free recall, Fig. 3 demonstrates that the negative recency effect is attenuated when the analysis is restricted to either not-recalled items or items recalled late in the recall period. Thus, this finding provides additional support for the spacing-based account of negative recency (Craik, 1970; Kuhn et al., 2018).
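The three-way partition described above (not recalled, recalled early, recalled late) can be sketched in Python as follows. This is an illustrative sketch only: the function name, the handling of intrusions and repetitions (dropped before output positions are counted), and the assignment of the middle item to the late half when the number of recalls is odd are assumptions, not details taken from the source.

```python
def classify_items(study_list, recall_sequence):
    """Label each studied word 'early', 'late', or 'not_recalled'.

    Hypothetical helper: output positions are counted over the first
    correct recall of each studied word, so intrusions and repetitions
    are ignored (an assumption).
    """
    outputs = []
    for word in recall_sequence:
        if word in study_list and word not in outputs:
            outputs.append(word)

    half = len(outputs) / 2  # e.g. 2.5 for five correct recalls
    labels = {}
    for word in study_list:
        if word not in outputs:
            labels[word] = "not_recalled"
        else:
            output_position = outputs.index(word) + 1
            # With an odd number of recalls the middle item falls in the
            # late half (an assumption about how ties are broken).
            labels[word] = "early" if output_position <= half else "late"
    return labels


# Toy example: "x" is an intrusion and the second "d" is a repetition.
toy_labels = classify_items(["a", "b", "c", "d"], ["d", "a", "x", "d", "c"])
```

Here the last studied word, "d", is produced first and is therefore labeled early, which is the typical situation for end-of-list items that the spacing account relies on.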

Fig. 2

Final recognition as a function of recency of encoding, for three classes of items: those recalled in early output positions in IFR (first half of recalls), those recalled in late output positions in IFR (last half of recalls), and items that were not recalled in IFR

Fig. 3

Upper part of panel: Final recognition probabilities as a function of within-list position, averaged across all 16 lists of Fig. 2. Error bars depict 1 SEM (calculated using the methods described in Cousineau, 2005, and Morey, 2008). The large SEM seen at Positions 15 and 16 for items recalled late and unrecalled items results from a large drop in the number of items recalled at these positions (see Appendix, Supplementary Table 1, for the exact number of items recognized in each position). Lower part of panel: Final free recall probabilities, as reported by Kuhn et al., 2018. For both recall and recognition, the negative recency effect is strongest (i.e., retrieval probability is relatively weaker) for items recalled in early output positions but attenuated (i.e., retrieval probability is relatively stronger) for items recalled in later output positions

This inference is further supported by an analysis focusing on the initially recalled items (the two top curves in Fig. 3), which allowed us to specifically compare items recalled in the first half of recalls to those recalled in the last half, as a function of the encoding serial position (ranging from 1–16). The number of items in each of these serial positions varied greatly. Only 107 participants (out of 169) recalled one or more items from the 16th encoding position in late output positions, whereas all 169 did so when considering early output positions. This was simply because items from encoding position 16 were much more likely to be produced early in the recall sequence. Therefore, we subjected these data to mixed model analyses, which are most appropriate for unbalanced data (Baayen et al., 2008; Tibon & Levy, 2015). For each of the serial positions, outlier participants whose recognition performance was 3 SD below the group mean were excluded (see Footnote 7). Item serial position during initial encoding and IFR output position (first vs. second half) were fixed factors, with participant as a random factor (West, 2009). This analysis yielded a main effect of item serial position, F(15, 5171) = 16.38, p < .001, and a main effect of IFR output position, F(1, 5171) = 49.96, p < .001. Importantly, and consistent with the spacing-based account of negative recency, there was a significant interaction between serial position and IFR output position, F(15, 5171) = 9.66, p < .001. As shown in Fig. 3, the proportion of items recognized in Encoding Positions 1–11 was not affected by IFR output position, whereas the proportion of recognized items from Encoding Positions 12–16 was significantly lower if produced in the first half of IFR output positions than if produced in the last half. This is confirmed by the mixed model analyses that compared recognition probability between the first and last half, separately for each of the serial positions (see Table 1 for the complete statistical report, and Supplementary Table 1 for the detailed descriptive statistics). This indicates the specificity of the negative recency effect on final recognition to items previously produced in the first half of IFR output positions, as might be expected given the shorter spacing between the initial encoding and retrieval of these items.
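The outlier rule mentioned above (excluding participants whose recognition performance fell 3 SD below the group mean) can be sketched as follows. The details are assumptions made for illustration: the sketch uses the sample standard deviation, treats a score exactly at the cutoff as retained, and uses hypothetical participant identifiers; the source's Footnote 7 may specify the rule differently.

```python
import statistics


def exclude_low_outliers(scores, n_sd=3.0):
    """Drop participants scoring more than `n_sd` SDs below the group mean.

    `scores` is a hypothetical {participant_id: recognition proportion}
    mapping for one serial position. The sample SD (n - 1 denominator)
    is used here, which is an assumption.
    """
    values = list(scores.values())
    cutoff = statistics.mean(values) - n_sd * statistics.stdev(values)
    return {pid: score for pid, score in scores.items() if score >= cutoff}


# Toy data: 19 participants at .90 and one at .10.
toy_scores = {f"p{i}": 0.9 for i in range(1, 20)}
toy_scores["p20"] = 0.1
toy_kept = exclude_low_outliers(toy_scores)
```

Note that with small groups no score can lie 3 SD below the mean, so a rule of this kind only has an effect when a serial position has a reasonably large number of contributing participants.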

Table 1 Results of a mixed model analysis comparing recognition probability between the first and last half of retrievals, separately for each of the serial positions (retrieval position half was used as fixed factor and participant as a random factor)

To gauge the spacing account more directly, we examined the probability of correct final recognition as a function of the exact spacing between the initial presentation of the items and their recall during the IFR (measured as the number of intervening items; possible values range from 0 to 30). As explained in the Introduction, the spacing account of negative recency posits that memory for items should improve consistently as the spacing between the two learning episodes (initial encoding and IFR) increases. Therefore, end-of-list items, which have shorter spacings than early list items on average, should be remembered less well, and this effect should be strongest for end-of-list items that are recalled early (shortest spacings of all). Consistent with the spacing account, Fig. 4 portrays the positive correlation between spacing and recognition probability, such that final recognition performance rises with the spacing between an item’s position during study and its position in IFR. Following the methods used by Kuhn et al. (2018), we computed several variations of the correlation between spacing amount and recognition probability separately for each participant (see Footnote 8). Across all possible spacings (0–30), the distribution of correlation coefficients was significantly positive (mean correlation = 0.27), t(167) = 15.41, p < .001. However, Fig. 4 indicates that such correlations might not be equivalent across all spacing positions. We therefore decomposed the general correlation into three separate examinations of spacing-recognition correlations for spacing amounts 0–10, 11–20, and 21–30. Indeed, these subset correlations demonstrate that the overall positive correlation derives completely from the leftmost part of the graph (spacings 0–10), mean r = .34, t(166) = 15.54, p < .001, as clearly seen in Fig. 4, whereas the correlation in the other two subcategories is either null (mean r = −.02), t(160) = 0.67, p = .502, or negative (mean r = −.73), t(168) = 29.42, p < .001 (see Footnote 9). This seems to indicate that spacing between prior exposures indeed improves subsequent recognition performance, but that once sufficient spacing is provided, further spacing does not provide additional memory strength.
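The spacing measure and the per-participant correlation analysis can be sketched in Python as below. The spacing formula is an inference from the stated 0–30 range for 16-item lists (items studied after the target plus items recalled before it), not a formula quoted from the source, and the function names are hypothetical. The source computed several variations of the correlation, which this sketch does not attempt to reproduce.

```python
import math
import statistics

LIST_LENGTH = 16  # items per study list, as stated in the Method


def spacing(serial_position, output_position, list_length=LIST_LENGTH):
    """Number of items intervening between study and recall of an item.

    Assumed definition: items studied after the target plus items
    recalled before it. Both positions are 1-based.
    """
    return (list_length - serial_position) + (output_position - 1)


def pearson_r(xs, ys):
    """Pearson correlation; returns None if either variable is constant."""
    mean_x, mean_y = statistics.mean(xs), statistics.mean(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return None
    return sxy / math.sqrt(sxx * syy)


def one_sample_t(values):
    """t statistic for testing whether the mean of `values` differs from 0."""
    n = len(values)
    return statistics.mean(values) / (statistics.stdev(values) / math.sqrt(n))


# The last studied item recalled first has no intervening items;
# the first studied item recalled 16th has the maximum spacing.
shortest = spacing(serial_position=16, output_position=1)
longest = spacing(serial_position=1, output_position=16)
```

Under these assumptions, one correlation would be computed per participant (for example, between spacing values and the recognition proportion at each spacing), and the resulting set of coefficients would then be passed to `one_sample_t`. Restricting the spacing values to 0–10, 11–20, or 21–30 before computing the correlation gives the subset analyses described above.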

Fig. 4

Probability of final recognition as a function of the spacing between initial presentation and initial recall. A positive correlation was found, demonstrating that the probability of recognizing an item during final recognition increases as the number of items between the initial presentation and IFR increases. Error bars depict 1 SEM (calculated using the methods described in Cousineau, 2005, and Morey, 2008). Note that the large SEM seen at the right end of the graph derives from the scarcity of items recalled after such a large spacing (see Appendix, Supplementary Table 2, for the exact number of items recognized for each spacing)

Discussion

We assessed the spacing account of the delayed recency effect in retrieval of studied word lists (Kuhn et al., 2018), by examining whether its predictions would hold not only for final free recall but also for final recognition testing. We found that this was indeed the case: in comprehensive delayed recognition testing conducted on a large corpus of studied words, the probability of a word being correctly recognized was significantly influenced by the spacing between its initial presentation and its initial immediate free recall. As in the analyses reported by Kuhn et al. (2018), this was found both in comparison of halves of recall sessions and for individual spacings. However, a more nuanced examination of this relationship demonstrated that unlike the case in final free recall (Kuhn et al., 2018, Fig. 2b), this relationship was not monotonic. Increases from immediate repetition up to ~10 intervening items led to greater likelihood of correct subsequent endorsement, but additional spacing did not further improve performance.

Considering both the consonance and the dissonance between the recognition and recall spacing effects may be instructive regarding the possible mechanistic bases of the spacing effect in general (Greene, 1989). Although this issue is not discussed by Kuhn et al. (2018), the spacing account accords well with the notion that immediate post-encoding consolidation processes, including changes on the neural level, are required for the stability of newly formed memories (Ben-Yakov & Dudai, 2011; Dudai et al., 2015). Alternatively, the advantage of spaced repetitions might be in the changing temporal context (Howard & Kahana, 1999, 2002) of the activation of the word representations; association with multiple contexts might increase the likelihood of the subsequent recall and recognition of a studied probe (Smith, 1982; Smith & Handy, 2016). A third possibility, suggested by many earlier studies of within-session spacing effects (reviewed by Delaney et al., 2010), is that participants use different encoding strategies for each encounter with a stimulus in massed versus spaced presentations. This claim is supported by participant reports (Delaney et al., 2010).

Resolution of this question (whether consolidation factors, temporal context changes, or encoding processing differences are responsible for negative recency in delayed recognition) might be provided by the non-monotonic effects in the current data (Fig. 4). This finding accords with an early report of non-monotonic spacing effects on retrieval by Glenberg (1976, Exp. 3), in which memory for trigrams was tested at lags of 8, 32, or 64 items after a second presentation, which lagged 0, 1, 8, 20, or 40 items after the initial presentation. Glenberg reported a nonmonotonic spacing advantage at the two longer retention intervals, which approached asymptote following a lag of 8 items between study presentations, comparable to our finding of an asymptotic trend beginning at a lag of ~10 items. Beyond that report, however, the variety of paradigms employed in earlier research makes it difficult to draw direct comparisons. As noted by Benjamin and Tullis (2010), few studies of spacing effects have made use of recognition testing. Additionally, nonmonotonicity in the effects of spacing on the order of days and even months between presentations (as summarized, e.g., in the meta-analysis of Cepeda et al., 2006) is likely to involve very different processes than lags of a maximum of 75 seconds as in the present study. A more fundamental difference between the present paradigm and earlier research is that in a preponderance of studies, the second (and sometimes third, etc.) encounter with to-be-remembered information is in an additional study trial identical to the initial study trial. Under such circumstances, differences in strategic encoding processes between spaced and massed presentations (Delaney et al., 2010) might very well play a role. In contrast, in the current paradigm, the second exposure to the information is when it is produced in the IFR. In this case, the participants are not engaging in intentional encoding, but rather in intentional retrieval. It therefore seems that the two most robust candidates for explaining the current results are consolidation factors and temporal context changes.

We propose that for recognition memory, which enables retrieval judgments based on familiarity, the benefits of repeated encoding might significantly depend on metabolic re-potentiation of the neuronal networks representing the words to be remembered (Feng et al., 2019; Xue et al., 2011). Once those neurons have had enough time to recoup their ability to conduct the processes required for Hebbian plasticity, little further benefit might be derived from additional delay (Smolen et al., 2016). While temporal context factors may also affect recognition, in the final recognition test paradigm, those influences might be attenuated. In this paradigm, the multiple temporally structured study–test stages of the preceding part of the experiment are disrupted by the presentation of a comprehensive set of recognition probes in random order, so those factors might be overshadowed by the strength of the recognition test copy cue, as noted in the Introduction. Therefore, in the current data as well as in the comparable retention intervals in Glenberg (1976), spacing benefits become asymptotic after ~10 items. In contrast, for recall, the greater the difference between the temporal context of the initial (study) and subsequent (initial free recall) exposures to a given target word, the greater the variety of associative cues available, and the greater the chance that it will be recollected in the final free recall test. As opposed to the recognition test, in which the probes are presented randomly, disrupting the possibility of using temporal context retrieval of one target to cue additional targets, in free recall participants may use temporal context to perform further retrievals of items that were temporally proximal to just-retrieved words. Thus, while both temporal context and consolidation process factors affect both free recall and recognition, the two tasks may be sensitive to these influences to different degrees. Consolidation factors might be more relevant for final familiarity-based recognition (yielding an asymptotic limit to spacing benefit), while temporal context factors might play a greater role in recollection-based final free recall (in which Kuhn et al., 2018, observed no asymptote in spacing benefits).

One aspect of the current data may require explanation through an additional process. Examination of the overall recognition probabilities for each serial position (Fig. 1b; that graph also includes the words not retrieved in initial free recall) reveals that there is a sharp drop in recognition probability for the last list item; indeed, it is the item least likely to be later recognized, despite the very strong recency effect reported for initial free recall of items in that position (~94%; Lohnas & Kahana, 2014; Fig. 3a). Although the differences are numerically small, given the power inherent in the dataset, this anomaly should not be ignored. We suggest that it results from the fact that initial free recall began immediately after list presentation. The last list item, enjoying the standard recency advantage in immediate free recall, may be reported from short-term memory, and therefore subject to minimal retrieval processing and remembered less well in the long run (Crowder, 1976; McCabe & Madigan, 1971; Reitman, 1970; Tan & Ward, 2000). In contrast, primacy and mid-list words that are not held in a short-term store require IFR retrieval by strategic recall processes. Those retrieval processes might serve as the type of deep encoding that underlies the testing effect (Karpicke & Roediger, 2008), leading to stronger subsequent recognition in a delayed test. This distinction cannot account for the asymptotic profile of spacing effects observed for all the other input position items in the recognition test but may be relevant to understanding poor delayed recognition of the final input position items. We noted above that the negative recency effect was initially adduced to provide evidence for dual-store models of word list memory (Craik, 1970); the finding of Kuhn et al. (2018) that the crucial factor is not study list position but rather study-to-initial recall spacing provides an alternative explanation that accommodates single-store models. 
Seemingly, this anomalously poor delayed recognition of the final list items might be seen as providing support for a limited version of dual-store models, applying only to the final list item. It seems to us, though, that resolution of the larger issue is better served by paradigms that combine anatomical and physiological assays and interventions with behavioral patterns (e.g., Innocenti et al., 2013; Kloth et al., 2020; Talmi et al., 2005) in providing evidence for or against process dissociation.

In conclusion, performance on delayed comprehensive recognition tasks generally supports the spacing account of the negative-recency effect observed in retrieval of word lists (Kuhn et al., 2018). Moreover, the pattern of nonmonotonicity in negative recency effects hints at a link between these behavioral findings and cellular-molecular processes implicated in the early consolidation of new learning. The rich pattern of results revealed by the large dataset, which includes multiple tests of a large number of participants, indicates that rehearsal and retrieval strategy factors may also play a role in affecting what we will subsequently remember and what we will subsequently forget.