The repetition of errors in recall: a review of four ‘fragmentation’ experiments

Laming, Donald

doi:10.1007/s00426-021-01598-z

The repetition of errors in recall: a review of four ‘fragmentation’ experiments

Review
Open access
Published: 14 March 2022

Volume 86, pages 1699–1724, (2022)
Cite this article

Download PDF

You have full access to this open access article

Psychological Research Aims and scope Submit manuscript

The repetition of errors in recall: a review of four ‘fragmentation’ experiments

Download PDF

Donald Laming ORCID: orcid.org/0000-0002-6260-2846¹

1906 Accesses
Explore all metrics

Abstract

This review reanalyses the data from four experiments originally designed to test the fragmentation hypothesis. Participants were asked to recall triple or quadruple associates, cued by each of their components in turn, and to guess if they could not remember. There were many errors in recall and many of those errors were repetitions of previous errors. This reanalysis focuses, not on the fragmentation hypothesis, but on the repetition of errors. It works backwards through sequences of test trials to discover the best prior match to the responses on each trial. It reports frequencies of different categories of repetition, conditional probabilities of repetition, correct recalls, and the probability of repetition in relation to the lag between trial and match in the test sequence. These results may be summarised as (1) every event (a stimulus or a response or just a retrieval) to which the participant attends is separately recorded in memory, creating an ordered record of those events that have engaged the participant’s attention; (2) the compilation of the record is automatic; while attention to a stimulus is at the participant’s disposal, the consequent entry into memory is not, and (3) the retrieval of a potential response from memory is spontaneous; that retrieval becomes an overt response if it is compatible with the cue. This makes sense of a number of historic anomalies in the study of recall and informs some contemporary problems in the study of short-term memory.

Contiguity in episodic memory

Article 21 November 2018

Testing the primary and convergent retrieval model of recall: Recall practice produces faster recall success but also faster recall failure

Article 08 February 2019

The influence of encoding manipulations on the dynamics of free recall

Article 17 July 2014

Use our pre-submission checklist

Avoid common mistakes on your manuscript.

Introduction

This review reanalyses the data from four experiments on memory for stimuli comprised of three or four distinct components. Recall was tested by presenting a single component as cue and asking: What went with it? Each individual component was presented in turn as cue, but separated, of course, by tests of other stimuli. In three of these experiments participants were asked to guess if they could not remember. Many of those guesses turned out to be repetitions, in whole or in part, of the cue and responses on some previous test trial.

Figure 1 reproduces a sample stimulus from Jones (1978). There is a yellow cup located in the centre of the bottom shelf in a simulated shop window. Viewing the complete collection of stimuli, that cup might have been any one of nine Objects (ball, cup, comb, [toy] flute, fork, [toy] gun, milk bottle, scissors, or a screwdriver). It might have been painted in any one of nine Colours (black, blue, brown, green, grey, orange, pink, red, yellow), and might have been placed in any one of nine Locations (left, centre or right on the top, middle or bottom shelf in the window). Participants were shown a set of nine stimuli, so constructed that each attribute value (Colour, Location, Object) appeared exactly once in the set (and different sets, of course, combined the attribute values differently). Those nine stimuli had to be presented in some order and Serial position in that presentation was used as a fourth attribute, both as cue and as component to be recalled.

The stimuli were presented on slides in a Kodak carousel projector for 2.2 s each. After counting backwards by threes for 25 s from some large three-digit number, there followed 36 trials, each represented by a cue on a separate page of a booklet. Each page contained one of the attribute values and asked what other values had been combined with it. So, yellow as cue asked for cup, bottom centre, and 6th; while 6 asked for yellow, cup, bottom centre. Each trial comprised a cue and three responses. The different stimuli were cued every nine trials and participants were asked to guess if they could not recall. Most of those ‘guesses’ turned out to be repetitions, in whole or in part, of the combination of cue and responses on some previous trial, and analysis of those repetitions provides a means of observing the storage and retrieval of memories over the space of 20 min.

There were 36 participants who each viewed three different sets of nine stimuli, each followed by 36 test trials, in the space of an hour. Each set of stimuli was presented and tested, and the participants rested, before the next was presented. Thirty-six participants, three sets of stimuli and 36 test trials give a total of 3888 test trials.

Table 1 reproduces the data from one presentation set by one participant to show, first, that many sets of responses are repetitions, in whole or in part, of some previous trial and, second, to illustrate the variety of those repetitions. Trials 1–9 are the stimuli; Trials 10–45 are test trials. Each stimulus has four attributes, Colour, Location, Object and Sequential position. Column 2 is the cue attribute, which changes with each trial, and Columns 3–6 the combination of cue and responses actually recorded. Columns 9–12 reproduce the content of the preceding stimulus or trial that best matches these responses (the most probable source). Column 7 is the trial number of that source (less than 10 indicates a stimulus) and Column 8 the nature of the match. C, L, O and S indicate correct responses, so CLOS means that all responses are correct (Trials 21, 31). ‘_’ indicates an error, so CLO_ indicates a triple match with a stimulus (Trials 16, 20 22, 40); likewise, CL_S, C_OS and _LOS mean that two responses are correct, but the third is wrong; and CL__ etc. means only one response is correct.

Table 1 Sample data from one presentation set by one participant in Jones (1978). Trials 1–9 are the stimuli; Trials 10–45 are test trials

Full size table

There are also different patterns of repetition to be distinguished. The combination of cue and three responses on a trial will be called an answer. The individual components of an answer are the cue and (three) responses. Where one answer exactly matches the cue and three responses on some previous trial (but not a correct recall of a stimulus, Trial 29), I speak of the recall of an answer; but it will be necessary to distinguish answers that are copies of the answer on the immediately preceding trial, answer (lag 0) or, simply, Ans0 (Trials 24 and 45), because the proportion of immediate repetitions is near 1. The answer that is repeated might itself be the recall of a previous error (Trials 35 and 41) or might be an incomplete recall of a stimulus (Trial 11). If only one or two responses (and the cue) match the answer on some previous trial, that is a cued recall (crc, Trials 14, 32, 33, 44). In the case that two or three responses, but not the cue, match some previous answer, I speak of a yoked recall (yrc, many trials in Table 1), and a yoked guess is two or three responses that match some stimulus other than the correct one (ygs, Trials 10, 12, 13, 23, 34). Finally, Null (Trial 15) signifies three independent guesses.

Given such a frequency and variety of repetition, the next question must be statistical significance. Since, when recall fails, there are 729 different combinations of three attribute values that might comprise a guess, the probability of matching any preceding trial by chance is tiny. The manner of calculating those probabilities is explained later under “Results”, but, anticipating those calculations, the statistical evaluation of different categories of repeated errors in the present experiment is set out in Table 2, where the numbers of answers actually recalled are compared with the calculated probabilities. That comparison between numbers of repetitions and chance probabilities provides the justification for this review. Most of those repeated errors must have been retrieved from memory. Analysis of those repetitions provides a novel means of observing the storage and retrieval of memories in the short term.

Table 2 Incidences of different categories of repeated errors in the experiment by Jones (1978), with a statistical assessment of each category

Full size table

The experiments

In this article, I examine/re-examine the data from four experiments of which the first, Jones (1978), has already been described.

Lansdale and Laming (1995): This study repeated Jones’ (1978) experiment using colour slides of a billiards table with a coloured ball (C: black, blue, brown, cyan, green, orange, pink, white, yellow, but not red) somewhere between the centre pockets, a distinctive pattern (P: nine different patterns) of eight red balls somewhere on the further half of the table and a white object (O: beer mug, book, brush, clock, cup, gloves, milk bottle, newspaper, vase) on the left hand edge of the table. As before, these components were independently assembled to produce nine different presentation sets of nine stimuli such that each component appeared exactly once in each set. Participants viewed the stimuli in a set for 2.1 s each; then, after an interval of 25 s counting backwards by threes, there followed 27 test trials. On each trial participants were cued with one of the components from one of the stimuli and asked to recall the other two. They were asked to guess if they could not recall. Each stimulus was tested at intervals of nine trials in the sequence of 27. Successive trials always presented a different attribute (C, P, O) as cue.

Each participant was asked to study and recall three sets of stimuli in a session of one hour. Twenty-seven participants completed a design balanced across all nine presentation sets. Two very similar experiments were reported. In the first, participants recorded their responses on printed sheets of paper and added their confidence (ratings 1–5) that each response was correct. In the second, the stimuli were presented on a computer peripheral and latencies were measured. Taken together, these two experiments comprised a total of 4374 test trials.

Laming (2021): The stimuli were 90 advertisements culled from glossy magazines. Each advert consisted of a brand name (B), a picture (P) of the product and a slogan (S). They were divided into nine product groups (cosmetics, fashion, food and drink, furniture, technology, holidays, jewellery, perfumes and shoes) with ten advertisements in each group. The slides were presented at 15 s intervals using a Kodak carousel projector. Each product group was tested in a separate session, lest the mixing of, say, perfumes and food provided additional cues which brand name, picture and slogan went together. The test trials comprised a series of 30 cues on slides, each slide showing one component from one of the stimuli. A picture cue showed the picture from the original advertisement with the brand and slogan blanked out. Brand names and slogans were typed in Times 12pt, the brand names in capitals, and photographed against a black ground.

There were 30 participants, who were tested on the different product groups at various intervals after presentation, ranging from an immediate test up to 4 months (but the loss of recall with lapse of time is not reviewed here). The cues were ordered such that each stimulus was tested every ten trials and successive cues presented different attributes. Participants were asked to write their responses in booklets^{Footnote 1} and, to assist, a complete set of brand names, slogans and pictures was projected on a screen. The pictures were labelled A, ⋯, J and the brand names and slogans were ordered alphabetically to obscure any association between them. Participants were instructed to select a guess from this screen if they could not remember; 30 s was allowed for responding to each cue. It is inevitable that over the space of 4 months not all participants will attend all the testing sessions; this experiment yielded a total of 5880 test trials.

Ross and Bower (1981, Expt. 3)^{Footnote 2}: “The learning materials were groups of four words [quartets] that have slight pre-experimental connections to a common concept, frame, or script (see Schank & Abelson, 1977).” Ross and Bower (p. 5). There were four lists of 12 quartets. The quartets were projected, one at a time, for 10 s onto a screen by an overhead projector. Thereafter, each quartet was cued twice with different component words. Participants were asked to write their recalls of the other three words of the quartet in a booklet, with precautions to prevent them looking back at previous responses. There were 18 participants who studied and recalled four lists in random order, giving a total of 1728 test trials.

These four experiments were originally designed to test the fragmentation hypothesis (Jones, 1976), which says, simply, that the representation of a stimulus in memory fragments. The contents of the fragment (e.g. Colour, Location, Object, Serial position) determine what attributes will function as cue and what other attributes can be retrieved. The most stringent test of this hypothesis subsists in cueing each stimulus by each of its individual components in turn and then examining the consistency of the retrievals. To my knowledge, only these four experiments have tested the hypothesis in this particular manner. Other experiments have also addressed the fragmentation hypothesis, most recently Joensen et al. (2019), but do not identify the sources of responses with the precision achieved here. So, these four experiments are selected for reanalysis because of the data they provide about the sources of individual responses. The fragmentation hypothesis has no further role to play, except that the repetition of previous answers generates data that suggests the fragmentation hypothesis without that hypothesis having any relevance to memory (Laming, 2019).

The data are reanalysed here solely to identify the most probable source for each combination of responses. Such a re-analysis of Lansdale and Laming (1995) has already been published (Laming, 2019) and the initial publication of the third experiment (Laming, 2020) already focuses on repeated errors. (The statistical analyses from these two experiments are placed in a supplementary file (Review_tables.doc) and the results merely summarised below). But the experiments by Jones (1978) and Ross and Bower (1981, Expt. 3) are here re-analysed ab initio, working from the original data. The analysis is set out under ‘Results’; then ‘Three Propositions’ presents some immediate post hoc interpretations and their relation to previous work; further comment is in the Discussion. This arrangement of the argument means that certain topics recur and some sub-headings are repeated three times.

Results

The reanalysis works through each sequence of test trials in inverse order. Beginning at the end, it seeks the most probable source for each combination of cue and responses. A test of the fragmentation hypothesis would look only at correlations between the responses to the three cues addressing each individual stimulus; the analysis/re-analysis here examines the correlations between each combination of cue and responses and all preceding such combinations, including the original stimuli. The analyses are exploratory and the interpretations below post hoc. The results are subdivided into.

1.
repetitions of previous errors;
2.
conditional proportions of repetitions; and
3.
the relation between lag and recall.

Repetitions of previous errors

For each trial n (n = 10, ⋯ 45), the reanalysis works backwards from trial n − 1 to 10 and then through the stimulus set (n = 9, ⋯ 1), looking for the best match to the cue and three responses. A previous answer (or stimulus) that matches all four attributes is always deemed a more probable source than a match of only three or two. Given two matches of equal size, the most recent (with the smallest lag) is preferred over the more remote.

Suppose the answer on trial n is not correct, but matches the cue and three responses on some previous (erroneous) trial. The probability of a complete match by chance, not this particular match, but any match, depends on the number of previous answers that could be matched by a suitable choice of responses. There is some number of previous answers containing the trial cue as a response. Let that number be x; it varies from trial to trial, but is usually 0, 1 or 2. There is also some number of combinations of three responses (728, because the stimulus addressed must be excluded from the calculation) that might have been output. The probability of matching a previous answer (any previous answer) by chance is therefore x/728. The occurrence of a match is a Bernoulli variable with variance (1 − x/728)(x/728). The sum of such variables over the totality of trials is a generalised binomial, with mean equal to the sum of the probabilities and variance to the sum of the individual variances. This mean is compared in Table 2 with the numbers of answers actually recalled.

If the output on trial n is not a complete repetition of any previous answer or stimulus, it may combine three or two matching attributes. Such a combination may consist of the cue and one or two matching attributes (a cued pair or triple) or of two or three attributes excluding the cue (a yoked pair or triple). Answers containing one or two correct responses must now be excluded from the calculation to preclude the number of matches being inflated by partial recalls of stimuli. The probabilities of cued pairs and triples and of yoked pairs and triples are calculated, along the lines set out above for an answer, except that additional combinations of cue and responses must now be excluded from the calculation. If the cue is a Colour, there is one combination of Location and Object that would give a triple match to a stimulus; that combination must be excluded. In addition, there may be Serial position values that, when added to one of the other 80 combinations (of Location and Object), would give a complete match to a previous answer. Such a triple combination of guesses is also excluded.

Yoked repetitions are simpler. Three Location, Object and Serial position guesses (excluding the cue) may be matched to any previous answer that does not include the trial cue, else it would be a repetition of a complete answer. It may also be matched to an incorrect stimulus, where it would be separately classified as a yoked guess. The Bernoulli probabilities and variances are again summed to provide a generalised binomial variable for comparison with the total number of cued/yoked pairs/triples recalled. These calculations for an experiment with three-attribute stimuli are explained in great detail in Lansdale and Laming (1995, pp. 44–50).

Jones (1978): In Table 2 ‘Total trials’ in Col. 3 is the number of trials, conditional on the cue, on which a repetition of the designated category might have been observed. These numbers relate to a total of 972 trials with each attribute as cue, from which quadruple-correct answers (and, for triple-matches, triple-correct and repeated answers) have been deleted. The sums of the Bernoulli probabilities are small in relation to the numbers of previous errors that are repeated, so that most such repetitions must have been true recalls from memory. One might summarise the results as follows: yoked triples and guesses to Object as cue, and several different pair repetitions are not significant (well, not at 0.001).

Lansdale and Laming (1995): The statistical evaluation of this experiment has already been published (Laming, 2019) and is here placed in a supplementary file (Table A). There was a total of 1458 trials with each attribute as cue. The repetition of complete answers is again highly significant. Of pair repetitions, C and O pair together, both as cued and as yoked recalls, but pair combinations involving P are generally not significant. Colour and Object, of course, have natural names; Pattern does not.

Laming (2021): The statistical evaluation of this experiment is also placed in a supplementary file (Table B). There was a total of 1960 trials with each attribute as cue. Excepting yoked guesses, everything is significant at 0.001, except for pair-repetitions that do not include the picture. Briefly, the brand might be recalled with the picture, or the slogan, but without the picture, brand and slogan are lost. The picture plays a pivotal role in the recall of these advertisements.

Ross and Bower (1981, Expt. 3): As with Jones (1978) and the other two experiments, each set of responses was presumed to be a recall of the previous trial or stimulus with which it had the greatest congruence. Where two sources showed equal congruence, the most recent was preferred, so that the second complete recall of a quartet was sourced to the first recall. Where recall was incomplete, participants would sometimes produce additional words from other quartets, quartets other than the correct one. Such additional words are treated as true recalls from a previous trial or stimulus, choosing the most recent when there was more than one such source. Intrusions from previous lists or from outside the experiment altogether are ignored.

The participants were not instructed to guess if they failed to recall and certainly did not have a finite pool of words from which the probability of a repeated recall could be calculated. It is not therefore possible to assess the statistical significance of the aggregate repetitions. Instead, Table 3 below simply records the numbers of each category of recall. The categories are ‘single’, ‘double’ and ‘triple retrieval’, and if ‘Cue’ is included, the recall is obtained either from a stimulus or from a previous recall of a stimulus (i.e. a correct recall). If there is no mention of ‘Cue’, the responses have been retrieved from some other preceding trial (i.e. an error). ‘Cue only’ signifies no recall at all. Table 3 further separates retrievals according as they are sourced to a stimulus, to a previous trial, or to a previous trial that was itself a recall of a previous trial.

Table 3 Numbers of different categories of retrieval in Ross and Bower (1981, Expt 3)

Full size table

Summing up: repetitions of previous errors: In the data from Jones (1978), Lansdale and Laming (1995) and Laming (2021) many previous errors are repeated in recall. Individual categories of repetition are typically significant at 0.001. The exceptions are triples and pairs that lack a particular component (Object in Jones, 1978, Picture in Laming, 2020) or include Pattern in Lansdale and Laming (1995). The experiment by Ross and Bower (1981) shows a much greater proportion of completely correct recalls and the prevalence of repeated errors is accordingly less.

Conditional proportions of repetitions

Many errors are retrievals of previous errors; Figs. 2, 4 and 6 display conditional proportions of retrievals, conditional on the source. For example, the pale green histogram bars are where the cue on trial n + 1 happens to be one of the responses made on trial n. These bars show the proportions, conditional on the category of error on trial n, of a complete reproduction of the preceding set of responses (Ans0). The proportions are all near 1, except when trial n is Null, composed of two or three independent guesses. Likewise, the dark green histogram bars show the corresponding proportions for answers with lag > 0. The proportion conditional on a previous (identical) answer as source is greater than the rest. Putting this more generally, the entries in Table 2 (and Tables A an B in Review_tables.doc), already classified according to category of error; can be further decomposed according to the category of the presumed source. Each entry under ‘Number observed’ (col. 4) has already been traced to a source. For each trial enumerated under ‘Total trials’ (col. 3) there are one or more prior trials that could have been retrieved to generate that particular category of error, and I take the most recent of those prior trials as the potential source. The proportions in Figs. 2, 4 and 6 are the quotients of these two quantities.

The repetition of (erroneous) answers applies equally to correct recalls. A completely correct recall on first cueing places an additional copy of the stimulus in memory, which supports an increased likelihood of a correct recall at the next test, while a failure to recall has the contrary effect. There is no feedback following any of these tests, no further sight of the stimulus, so that the sequence of successive recalls should be a martingale (see Laming, 2005).^{Footnote 3} While the probabilities following particular sequences of recalls diverge, the expectation remains unchanged.

Jones (1978): Figure 2 decomposes the entries in Table 2 according to the category of the presumed source. Different categories of source are represented by different clusters of histogram bars spaced along the abscissa. Within each cluster different coloured bars show the quotient of the number of repetitions in that category (of repetition) divided by the number of trials on which such a repetition might have been observed.

There are two significant features. First, if the trial cue on trial n + 1 is also one of the responses on the immediately preceding trial (n), repetition of that preceding trial (Ans0) is almost certain, except in the case that the preceding trial is Null (three unrelated guesses). The average proportion of repetitions (Null sources excepted) is 0.985; it does not vary significantly between sources [χ²(N = 175) = 10.653, with 7 df, p = 0.154], while retrievals from Null sources (0.135) are much less frequent [χ²(N = 279) = 105.620, with 1 df, p < 0.001]. Second, the average proportion of repetitions for answers (lag > 0) is 0.232, but increases to 0.658 when the source is itself an (identical) answer. Likewise, the average proportion of repetitions for a cued triple is 0.383, but increases to 0.818 when the source is an answer. Complete (erroneous) answers tend especially to be repeated.

Figure 3 shows the proportions of completely correct (CLOS) answers following each sequence of preceding correct and incorrect trial responses for that particular stimulus. The proportion correct on first test is 0.152 and, if subsequent tests depended solely on independent access to the stimulus, the proportion of four correct CLOS answers would be 0.0005. In fact, the proportion of four correct CLOS answers in sequence is 0.080. It is plain from Fig. 3 that a correct (CLOS) answer increases the conditional probability of a correct answer on the next test. The small black dots mark the means at the second, third and fourth trials. They are linked by thin black lines to the corresponding data points representing the proportions correct on the preceding trial. This is to emphasise that while the probabilities for individual sequences diverge, the expectations do not.

The open circles in Fig. 3a are calculations from a model in the "Appendix".^{Footnote 4} There are four trials for each stimulus spaced at intervals of nine within the trial sequence; the combinations of responses on those trials have different accessibilities (probabilities of retrieval). The probability of retrieving the original stimulus, a_s, is assumed constant, but thereafter a₁ is the accessibility of the responses on the previous trial (the first trial at the second test, but the second trial at the third test and the third trial at the fourth test). At the second and third tests, a₂ is the accessibility of the preceding test but one, while a₃ is the accessibility of the first test at the fourth trial. The parameter estimates are: â_s = 0.152 for the stimulus, â₁ = 0.650 for the preceding trial, â₂ = 0.451 and a₃ = 0.270. There are 16 terminal proportions to be modelled with four free parameters [χ²(N = 972) = 13.417, with 11 df, p = 0.267].

The black squares in Fig. 3b show the corresponding proportions for repeated answers. At first test they show the proportion of answers (lag > 0) sourced from triple, pair and Null sources (0.218) and, at subsequent tests, the proportion sourced from previous answers (0.320, 0.271).

Lansdale and Laming (1995): Figure 4 decomposes the entries in Table A in Review_tables.doc in like manner to Fig. 2. As before, if the trial cue matches one of the responses on the immediately preceding trial, repetition of that preceding answer (lag 0) is almost certain, except in the case that the preceding trial is Null (two unrelated guesses). The average proportion of repetitions (Null sources excepted) is 0.971; it does not vary significantly between sources [χ²(N = 173) = 10.922, with 7 df, p = 0.142], while retrievals from Null sources (0.328) are much less frequent [χ²(N = 432) = 213.013, with 1 df, p < 0.001]. The average proportion of repetitions for answers (lag > 0) is 0.158, but increases to 0.514 (see Fig. 5 below) when the source is itself an (identical) answer.

Figure 5 shows the proportions of completely correct (CPO) answers following each sequence of preceding correct and incorrect trials. The proportion correct on first test is 0.111 and, if subsequent tests depended solely on independent access to the stimulus, the proportion of three correct CPO answers to a stimulus would be 0.0014. In fact, the proportion of three correct CPO answers in sequence is 0.0261. As before, a correct (CPO) answer increases the conditional probability of a correct answer on the next test. The small black circles show the means at the second and third trials, linked by thin black lines to the data points representing the proportions correct on the preceding trial. The open circles are predictions from the model in the "Appendix". The parameter estimates (cf. above) are: â_s = 0.098 for the stimulus, â₁ = 0.373 for the preceding trial and â₂ = 0.252 [χ²(N = 1458) = 12.658, with 4 df, p = 0.013].

The black squares in Fig. 5 show the corresponding proportions for repeated answers. At first test they show the proportion of answers (lag > 0) sourced from pair and Null sources (0.158) and, at second test, the proportion of answers sourced from existing answers (0.514). In both cases the proportions of answers retrieved exceed those of correct recalls [1st test: χ²(N = 3534) = 9.246, with 1 df, p = 0.002; 2nd test: χ²(N = 266) = 6.738, with 1 df, p = 0.009].

Laming (2021): Figure 6 decomposes the entries from the recall of advertisements in Table B in Review_tables.doc in like manner to Fig. 2. As above, if the trial cue matches one of the responses on the immediately preceding trial, repetition of that preceding answer (lag 0) is highly probable, except in the case that the preceding trial is Null. The average proportion of repetitions (Null sources excepted) is 0.835; it varies somewhat between sources [χ²(N = 121) = 18.403, with 8 df, p = 0.018], while retrievals from Null sources (0.218) are much less frequent [χ²(N = 286) = 114.362, with 1 df, p < 0.001]. Second, the average proportion of repetitions for answers (lag > 0) is 0.244, but increases to 0.653 (see Fig. 6 below) when the source is itself an (identical) answer.

Figure 7 shows the proportions of completely correct (BPS) answers following each sequence of preceding correct and incorrect trials addressing that stimulus. The proportion correct on first test is 0.380 and, if subsequent tests depended solely on independent access to the stimulus, the proportion of three correct BPS answers would be 0.055. In fact, the proportion of three correct BPS answers in sequence is 0.330. A correct (BPS) answer greatly increases the conditional probability of a correct answer on the next test. As above, the small black circles show the means at the second and third trials, linked by thin black lines to the data points representing the proportions correct on the preceding trial. The open circles in Fig. 7 show predictions from the model in the "Appendix". The parameter estimates are: â_s = 0.420 for the stimulus, â₁ = 0.754 for the preceding trial and â₂ = 0.345.

The black squares in Fig. 7 show the corresponding proportions for repeated answers. At first test they show the proportion of answers (lag > 0) sourced from pair and Null sources (0.209) and, at second test, the proportion of answers sourced from previous answers (0.653).

Ross and Bower (1981, Expt. 3): Notwithstanding that each stimulus (quartet) had four components, the stimuli were cued only twice. The histogram clusters in Fig. 8 present the distributions over different categories of second recall, correct recalls as well as errors, for each different category of first recall. Since there was no set of alternatives from which to select a guess, conditional probabilities cannot be calculated (cp. Figs. 2, 4, 6). If a recall contains the trial cue (Cue and 1, 2 or 3 retrievals), it has been retrieved from the quartet addressed by that cue (or a previous recall of that quartet). The second recall, in this case, tends to duplicate the first. If the recall does not contain the trial cue (1, 2 or 3 retrievals only), it necessarily comes from some other source and is an error.

Figure 9 shows the proportions of completely correct recalls on first and second tests. If the first recall is completely correct, the probability of a completely correct recall on second test is greatly increased, with a complementary decrease if the first recall fails. The small black circle shows the mean proportion correct on second test and is but little increased over the proportion on first test. The process is a martingale. The open circles are predictions from a simplified version of the model in the "Appendix". In this model a recall is completely correct if either the cue accesses the original quartet (a_s = 0.274) or a correct first recall (a₁ = 0.909).

Summing up: conditional proportions of repetitions: Repeated errors are often sourced from previous repetitions. This is abundantly clear in the first three experiments where conditional probabilities can be calculated; it is less apparent in Ross and Bower (1981, Expt. 3). The relation between source and repetition is of no apparent significance except for repetitions of the immediately preceding trial and for repeated answers. If a trial cue turns out to be one of the responses on the immediately preceding trial, the probability of a complete repetition is so near to 1 that there is no significant difference between sources, except when the source is Null. A similar, but more muted, relationship is seen in the repetition of complete (erroneous) answers. The same process applies to completely correct recalls, where it generates a martingale. Individual (conditional) sequences diverge from one test to the next, but the unconditional expected proportion of correct recalls does not change. These properties are apparent in all three experiments where conditional probabilities can be calculated. The data from Ross and Bower (1981, Expt. 3) display the martingale property over two successive tests.

The relation between lag and recall

Each recall of a previous error, complete answer or cued or yoked triple or pair, may be sourced from any preceding trial. Lag is the difference in trial number between the present trial and the presumed source, less 1, so that a repetition of the immediately preceding trial has lag 0. Given two matches of equal congruence, the most recent (with the shortest lag) is preferred over the more remote. This might appear to bias the lag-recall-relation to shorter lags, but this is illusory. What is presented in the figures below is the probability of repetition. While the number of each sort of error retrieved is constrained to the most recent source, so also is the number of opportunities for such a retrieval. The figures below plot the quotient of number of repetitions over number of opportunities, both constrained to the most recent source, and show how that estimated probability varies with lag.

A cued repetition can originate only from a preceding trial containing the current trial cue, so that answer, cued triple and pair are mutually exclusive. The numbers of answers etc. that might be repeated decreases progressively as lag increases, and the lag-recall relation becomes increasingly variable. Accordingly, these data are plotted cumulatively in Figs. 10, 11, and 12, that is, as ‘Answers’, ‘Answers and triples’, and as ‘All cued repetitions’. The cumulative totals show less variation, trial to trial, and the trend is clearer. For the same reason yoked recalls in Fig. 10 are also cumulated. They are, however, separated from cued repetitions because the essential condition for a yoked recall is that the source does not contain the current trial cue (else the yoked recall would be classified as a answer or cued triple). They originate from those previous trials from which a cued repetition cannot be sourced. There are eight times as many such trials and, to maintain comparability, the proportions of yoked triples and pairs have been increased eightfold.

What is the functional form of the relation between lag and probability of repetition? To provide a basis for comparison, I have chosen to fit the hyperbola.

$$ p_{{{\text{lag}}}} \, = \,{\text{A}}_{{{\text{cat}}}} /\left( {{\text{1}}\, + \,r \times {\text{lag}}} \right), $$

(1)

where the constant A_cat, different for different categories of repetition, adjusts each curve to the level of repetitions, and the coefficient r, fixed for any one experiment, adjusts that relation to the succession of trials. This question is usually asked with respect to time, and here the trial sequence must stand proxy. The constants in Eq. (1) were estimated by least squares, weighted by the number of cases at each lag.

Jones (1978): Figure 10 plots the proportions of occasions that the trial at lag l is the presumed source for each kind of error, answer or, cued or yoked, triple or pair. Repetitions of each of these kinds of error show conventional recency except that yoked pairs show a depression over the first three lags (≤ 2). (This might not appear noteworthy, but see below). The proportions decrease from about 0.8 for the immediately preceding answer or pair to near zero at a lag of 32. The histogram bars at the right hand end of the abscissa are the cumulative proportions of, respectively, complete correct (CLOS) recalls, correct triple recalls and pair recalls, and triple and pair yoked guesses. The proportions of complete correct (CLOS) recalls, correct triple recalls and pair recalls are comparable to the repetition of answers, cued triples and pairs at about lag 7, while the proportions of triple and pair yoked guesses match the inflated (8 ×) proportions of triple and pair yoked retrievals at about the same lag.

Equation 1 provides a passable representation of the repetition of answers, but not of any other cumulative category unless (as in Fig. 10) the estimated probabilities for the first few lags are excluded.

Lansdale and Laming (1995): Figure 11 plots the proportions of occasions that the trial at lag l is the presumed source for each kind of error.^{Footnote 5} The proportions decrease from about 0.5 for the immediately preceding answer or pair to near zero at a lag of 25. As before, the proportions for yoked pairs have been increased eight-fold: Independently of this adjustment, yoked pairs show inverse recency up to lag 7 (Kendall rank correlation 0.714, p = 0.007, one-tailed, comparing lags 0–7 only); thereafter the proportions decrease much as do the proportions of answers.^{Footnote 6} The histogram at the right hand end of the abscissa in Fig. 11 reproduces the cumulative proportions of, respectively, complete correct (CPO) recalls, correct pair recalls and yoked guesses. In each case, the proportions of correct recalls are comparable to the recall of answers, cued pairs and yoked pairs at about lag 5.

Equation 1 provides a passable representation of the repetition of answers, but not of cued or yoked pairs unless (as in Fig. 11) the estimated probabilities for the first five lags are excluded.

Laming (2021): Figure 12 plots the proportions of occasions that the trial at lag l is the presumed source for each kind of error. The proportions decrease from about 0.5 for the immediately preceding answer or pair to near zero at a lag of 30. As before, the proportions for yoked pairs have been increased, now ninefold, because the stimuli were cued every ten trials. Independently of this adjustment, yoked pairs show inverse recency up to lag 8 (Kendall rank correlation 0.500, p = 0.030, one-tailed, comparing lags 0–8 only); thereafter their probabilities decrease much as do those for answers. The histogram at the right hand end of the abscissa in Fig. 12 reproduces the cumulative proportions of, respectively, complete correct (BPS) recalls, correct pair recalls and yoked guesses. In this case the proportions of correct recalls are comparable to the recall of answers and cued pairs at very short lag. Equation 1 provides passable representations of the repetitions only if the first few lags are excluded (see Fig. 12).

Ross and Bower (1981, Expt. 3): This experiment yielded more correct and partially correct recalls than the previous three and correspondingly fewer repeated errors. Figure 13 shows those errors as a function of lag. Each trial was sourced by running backwards in the sequence of trials to find the maximum match, with the trial cue, not itself a response, excluded. The repetitions in Fig. 13 are those for which the maximum match did not contain the trial cue (i.e. errors). Recalls of 1, 2 or 3 words from lag l are mutually exclusive, so these entries have been cumulated as in Figs. 10, 11 and 12. Two sets of data points have been fitted with Eq. (1) with the parameters again estimated by weighted least squares. The additional parameter r has the value 0.669. It should also be noted that repetitions from lag 0 have high probability.

Ross and Bower cued each quartet twice, with different words as cue, at a range of different trial intervals. Figure 14 shows correct recalls, complete or partial, as a function of lag, where lag is trial distance from stimulus – 1. The filled symbols show recalls on the first test, the open symbols recalls on the second. Comparison with Fig. 13 shows the rate of loss of accessibility to be much slower (r = 0.034), one twentieth of the former rate; and responses on the second test trial, where there are two entries in memory to access, appear not to show any trend at all.

Summing up: the relation between lag and recall: The results from the first three experiments are generally consistent. Figures 10, 11 and 12 present a picture of memory function over the short term; the repetitions equate to a direct observation of individual retrievals. But the record is almost certainly not complete. All the first three experiments show conventional recency (cf. Tan & Ward, 2000) for most categories of repetition, but inverse recency for yoked pairs at the lowest lags; this suggests an interaction between different categories of retrieval (see Retrieval As A Function Of Lag/Inverse recency below). The repeated errors in Ross and Bower (1981, see Fig. 13) follow a similar pattern; but correct and partially correct recalls of the original quartets lose accessibility at a much slower rate.

Equation 1 provides a guide to the curvature of the data. But the functional form of the relation between lag and probability of repetition is complicated by the presence of multiple sources. If there have been previous correct recalls of a particular stimulus (Figs. 3, 5, 7, 9), the probability of a further correct recall is much increased. This is true also for repetitions of complete answers. Even so, the repetition of complete answers has arguably the smallest number of alternative sources and presents the clearest picture of the lag-recall relation. In Jones (1978) only 158 out of 557 repetitions of answers were sourced from a previous answer, admitting more than one possible source. In Lansdale and Laming (1995) the proportion was 54 out of 357 and in Laming (2021) 96 out of 495. Amongst simple formulae, Eq. (1) is passable, but many other functional forms could be substituted with suitable values to their constants.

The rate of loss of accessibility is usually estimated with respect to elapsed time; here the succession of trials stands proxy. Equation 1 has a constant A_cat to adjust to different levels of repetitions, and a coefficient r to adjust to the succession of trials. That coefficient is forced to be the same for all categories in an experiment and the estimated values are set out in Table 4 together with the estimated duration of trials. That temporal duration does not differ much between experiments; neither does the rate parameter (r) except for correct recalls in the experiment by Ross and Bower (1981), where the loss of accessibility of correct recalls proceeds at only 1/20th of the rate for repeated errors. I emphasise that those two rates refer to responses made by the same participants in the same series of trials. The only difference is the relation of the responses to the quartet addressed by the cue.

Table 4 Estimated rate parameters for the hyperbolas in Figs. 10, 11, 12, 13 and 14

Full size table

Three propositions

The repetition of errors has not previously been studied. This review has assembled the results of four experiments to spell out their implications for our understanding of memory. I set out three propositions concerning what one might call the mechanics of memory. References to previous work will illustrate the wider applicability of these propositions:

1.
Every event (a stimulus or a response or just a retrieval) to which the participant attends is separately recorded in memory, creating an ordered record of those events that have engaged the participant’s attention.

Repetition of errors

Table 5 sets out the proportions of each kind of repeated error in these experiments. In Jones (1978) there was a repeated error of some kind on 0.6 of trials. Bearing in mind that an error cannot be repeated until it has been output a first time, 25% of those trials must be discounted (each stimulus was cued 4 times), leaving a maximum proportion of 0.75. For Lansdale and Laming (1995) and Laming (2021) the equivalent maximum (with 3 tests of each stimulus) is 0.67, and the observed proportions are 0.435 and 0.256. It is not just the fact of repetition, but the large proportion of trials that it engages. The fourth experiment, Ross and Bower (1981, Expt. 3) presents a different picture. The proportion of completely correct recalls was much higher, and most of the errors consisted of the retrieval of a single word from the wrong quartet.

Table 5 Proportions of each kind of error repeated in the four experiments, calculated with respect to the total number of trials

Full size table

The repetition of previous errors is most prevalent when correct recall is difficult and participants are asked to guess. But those previous errors are available for repetition and must have been recorded in memory. Of course, I cannot speak to those response combinations that were not repeated, but I prefer the idea that every event is separately recorded in memory to the task of explaining why these particular events, and only these, were retained.

Historically, there have been other examples. Bartlett (1932) studied the retention of stories by asking participants for a written reproduction. Successive reproductions of “The War of the Ghosts” showed that changes from the original introduced in a first reproduction persisted through subsequent reproductions. “In a chain of reproductions obtained from a single individual, the general form, or outline, is remarkably persistent, once the first version has been given. ⋯ With frequent reproduction the form and items of remembered detail very quickly become stereotyped and thereafter suffer little change.” (Bartlett, 1932, p. 93). Bartlett’s study was replicated by Kay (1955) with participants making seven successive reproductions at intervals of a week and listening to the original again after each reproduction. Participants did not learn to correct the errors in their original reproductions. Instead, “The outstanding result was ⋯ each subject established an extraordinarily close relation between one reproduction and another.” (Kay, 1955, p. 93). The participants recalled chiefly what they had originally written and each reproduction rewrote that original reproduction in memory.

Mandler (1972, pp. 158–160; see also Hanawalt & Tarr, 1961) compared cued recall with recognition of a list of 60 words selected from 20 different categories. Recall was cued by the category labels. Intrusions on the recall test that subsequently appeared as lures in the recognition test were recognised as ‘Old’ 92%. This is to be compared with 97% for words originally presented and correctly recalled and only 57% for words presented but not recalled. The repetition of previous responses, whether correct or errors, appears to be widespread, and the re-analysis in Table 2 and Tables A and B in Review_tables.doc simply exhibits this phenomenon in detail. Many answers are recorded in memory and may be retrieved to provide the response(s) on some subsequent trial. Moreover, such retrievals are more potent than the original stimulus, a property that underlies the ‘generation effect’ and the ‘testing effect’ (Roediger & Karpicke, 2006; Slamecka & Graf, 1978).

Proposition 1 also speaks of an ‘ordered record’. Figures 10, 11, 12 and 13 show that the probability of repetition depends on how recently the original error was made and recency must also be represented in memory. In free recall experiments in which the participants have rehearsed out loud (Brodie & Murdock, 1977; Murdock & Metcalfe, 1978) it is possible to predict the ensuing sequence of recalls, not perfectly, but with significant success. Rehearsal follows a relatively simple pattern in which short sequences of words are recycled until the next stimulus word is presented (visually), whereafter rehearsal restarts, incorporating the latest stimulus. If that pattern of rehearsal is modelled and then run on after the signal to begin recall, it provides a good approximation to the sequence of recalls actually recorded (Laming, 2006, 2008). For rehearsal to repeatedly recycle previous rehearsals, those previous rehearsals have to be stored in memory in historic order.

Correct recalls

Table 2 and Tables A and B in Review_tables.doc concern only the repetition of errors, but Figs. 3, 5, 7 and 9 demonstrate a related effect with correct recalls. A completely correct first recall increases the probability of a correct recall on the next test, and this enhancement extends to second and third recalls. It applies also to the repeated recall of answers (Figs. 3, 5, 7). While the repetition of a previous erroneous answer must be a retrieval from memory, the retrieval enters an additional copy of the answer in memory.

One alternative is that those stimuli that show increased recall on second test were learned more thoroughly on initial presentation. This is Estes’ (1960) idea of all-or-none learning. Estes presented his argument with several experiments of RTT design, one presentation of a stimulus–response pair for study followed by two tests. If the first test fails, the pair has not been retained in memory and the second test must also fail (correct guesses excepted). One such experiment (Estes et al., 1960) paired eight consonant triples with the digits 1–8 and showed a pattern of responses on repeated test very much like Fig. 9 (Ross & Bower, 1981, Expt. 3). This experiment was replicated by Jones (1962), but with four trials following the one presentation of the stimulus–response pairs. Jones’ (1962) results show a pattern very much like Fig. 3a (from Jones, 1978), including improvement on a fourth test following an initial failure (see Laming, 2019, Fig. 11). While enhanced attention to some stimuli may produce increased recall (and does; see Laming, 2021), the all-or-none hypothesis cannot stand.

Reminiscence

The improvement on repeated testing accounts for the phenomenon of reminiscence. Ballard (1913) reported a seemingly counterintuitive finding: In the course of his employment as an HM Inspector of Schools, he would ask groups of schoolchildren to learn a poem with deliberately insufficient time allowed. The children were asked to write out as much as they could remember. Ballard would then return unexpectedly a few days later and ask the schoolchildren to write the poem out again. Scoring the number of whole lines reproduced without error that number increased on repeated test, over the score on initial test, by up to 20% (see Woodworth & Schlosberg, 1955, Fig. 25–6, p. 794). Since retention usually decreases with lapse of time, this increase seemed paradoxical, and was labelled ‘reminiscence’.

Reminiscence (more recently hypermnesia) has proved a fragile phenomenon, difficult to replicate (for reviews, see Payne, 1987; Erdelyi, 2010). But the principle that every event to which the participant attends is separately recorded in memory means that lines of poetry written during the initial test would have been recorded in memory and available for retrieval on a subsequent test. Now Ballard scored only the numbers of lines reproduced without error; this is like selecting completely correct recalls only in Fig. 3, etc. The second test, of course, showed an increased amount recalled (cf. Roediger & Thorpe, 1978). But the schoolchildren had no access to the poem in between whiles and the process has to be a martingale. So if Ballard had also examined lines that were reproduced incorrectly, he would surely have found his school children repeating their errors on the second test. In short, comparison with Figs. 3, 5, 7 and 9 makes reminiscence appear an artefact resulting from a biased selection of data.

2.
The compilation of the record is automatic; while attention to a stimulus is at the participant’s disposal, the consequent entry into memory is not.

Figures 2, 4 and 6 plot the proportions of different kinds of error conditional on their presumed source. In view of the ‘generation’ and ‘testing’ effects (Roediger & Karpicke, 2006; Slamecka & Graf, 1978), it is perhaps plausible that sources that are themselves repetitions of previous errors should be recorded in memory, but what about Null sources comprising two or three independent guesses? Why should participants remember those? An alternative possibility is that repetitions of a Null source are no more than apparent, a combination of chance guesses.

Table 6 tabulates the number of trials (in the first three experiments) on which a Null source might have been matched completely, the number of trials on which such a complete match was observed, the chance probability of such a match and, finally, a normal deviate corresponding to the excess number of answers over expectation, calculated from the binomial distribution. While some of these answers might be chance matches, it is clear that most are true retrievals of Null sources, which must have been recorded in memory. So, why do participants retain combinations of two or three independent guesses? If entry into memory is automatic, that question is immediately answered.

Table 6 Answers retrieved from Null sources with a statistical evaluation

Full size table

3.
The retrieval of a potential response from memory is spontaneous; that retrieval becomes an overt response if it is compatible with the cue.

Yoked recalls

There is a long-held tradition that learning occurs by the successive association of one idea with another (e.g. Ebbinghaus, 1885/1964, Ch. IX). Successive paired-associate trials, in which participants learn a list of stimulus–response pairs, giving a particular response to each different stimulus, were thought to strengthen such an association (e.g. Thorndike, 1913). More recently Anderson and Bower (1973), Brown et al., (2007), Lewandowsky and Murdock (1989), Lohnas et al. (2015), Polyn et al. (2009) and Raaijmakers and Shiffrin (1980), among others, have proposed extensive theories, based on the idea that the stimulus is instrumental in retrieving the response. These theories accommodate, though only approximately, a range of different experimental procedures. But this is no more than a habit of thought; it is not an established fact.

Analysis here shows that events on individual trials are recorded in memory, whence they may be retrieved to provide a response. So, successive paired-associate trials increase the pool of previous S–R pairs. Suppose that these pairs are retrieved spontaneously. If the pair S_i–R_i comes to mind while S_i happens to be the cue, then R_i is produced as response. One can distinguish between S_i as the agent in procuring R_i as response, on the one hand, and spontaneous retrieval, on the other, only to the extent that experimental data permit. But the same probability of recall obtains on both scenarios (Laming, 2019, pp. 46–48). Experiments can estimate the probability of responding R_i to S_i as cue, but that estimate tells us nothing about how R_i is accessed.

But the incidences of yoked pairs and triples in Tables 2 and Tables A and B in Review_tables.doc speak powerfully to this issue. A yoked recall is a pair (or triple) sourced to some previous trial that does not contain the cue from the current trial. It follows that the current trial cue could have had no part in eliciting this particular (yoked) recall. The rule that applies equally to both cued and yoked recalls is simply that recall must be compatible with the trial cue. Recall from a source that contains the current trial cue is, ipso facto, compatible; but so also are yoked recalls from sources that do not contain the trial cue. The incidence of most categories of yoked recall is highly significant. Since in those cases the trial cue could not have been the agent of recall, retrieval must be spontaneous.

Inverse recency

Are repeated combinations of errors retrieved as successions of single retrievals, component by component, or are they the product of a single retrieval, a triple or a pair (a ‘chunk’, Miller, 1956)? While it is common for the experimenter to analyse recalls in terms of individual components, there is no reason why memory should follow suit. The inverse recency exhibited by yoked pairs in Figs. 11 and 12 speaks powerfully to this question.

The lag–recall relations in Fig. 11 for yoked pairs on the one hand and for answers and cued pairs on the other are different; the difference is highly significant, this after allowing for the different levels of repetition (Laming, 2019, p. 22).^{Footnote 7} Suppose the trial cue fails to elicit a retrieval, and the participant then selects a component at random and uses that component as cue to retrieve a partner from some previous answer. That is the very procedure that is assumed in the retrieval of cue pairs and, in consequence, their respective lag–recall relations should be the same. But, in fact, they are very different. It follows that yoked pairs are not retrieved through random selection of a cue; they must instead be retrieved as a pair. Suppose further that a complete answer is observed when a yoked pair or triple happens to match the trial cue; that is, the procedures for retrieving yoked pairs and complete answers are the same. In that case their respective lag-recall relations would also be the same. But, again, they are very different. It follows that complete answers do not result from a yoked pair matching the trial cue; they are the product of a single retrieval, a triple.

This argument requires two qualifications. First, the probabilistic nature of retrieval does not preclude some yoked pairs being the confluence of two single-component retrievals, or some answers being the confluence of a yoked pair matching the trial cue. All that can be said is that yoked pairs are generally retrieved as a pair and answers generally retrieved as a triple. Second, the argument that yoked pairs are retrieved as pairs could be inverted to argue that cued pairs are the consequence of a single retrieval matching the trial cue. I return to this issue below (Inverse recency, again).