Abstract
Memory scientists usually compare mean performance on some measure(s) (accuracy, confidence, latency) as a function of experimental condition. Some researchers have made within-subject variability in task performance a focal outcome measure (e.g., Yao et al., Journal of Clinical and Experimental Neuropsychology, 38, 227–237, 2016). Here, we explored between-subject variability in accuracy as a function of experimental conditions. This work was inspired by an incidental finding in a previous study, in which we observed greater variability in accuracy of memory performance on cued recall (CR) versus free recall (FR) of English animal/object nouns (Mah et al., Frontiers in Psychology, 14, 1146200, 2023). Here we report experiments designed to assess the reliability of that pattern and to explore its causes (e.g., differential interpretation of instructions, [un]relatedness of CR word pairs, encoding time). In Experiment 1 (N = 120 undergraduates), we replicated the CR:FR variability difference with a more representative set of English nouns. In Experiments 2A (N = 117 Prolific participants) and 2B (N = 127 undergraduates), we found that the CR:FR variability difference persisted in a forced-recall procedure. In Experiment 3 (N = 260 Prolific participants), we used meaningfully related word pairs and still found greater variability in CR than in FR performance. In Experiment 4 (N = 360 Prolific participants), we equated CR and FR study phases by having all participants study pairs and, again, observed greater variability in CR than FR. The same was true in Experiment 5 (N = 120 undergraduates), in which study time was self-paced. Comparisons of variability across subjects can yield insights into the mechanisms underlying task performance.
Data availability
All data/experiment programs are available online (https://osf.io/3tra5/).
Code availability
All analysis scripts/experiment programs are available online (https://osf.io/3tra5/).
Notes
See the preregistration (https://osf.io/xfj6a) for the final word pool and details of the word selection procedure. Briefly, we began with the MRC Psycholinguistic Database (Wilson, 1988) of 21,561 nouns and then in several steps selected from that pool a set of nouns that are average on multiple dimensions (i.e., within a central mass of the database-wide distribution).
Results were similar when looking at accuracy separately by test order (i.e., for those who did CR first vs. second); see Supplementary Material 3C.
We also fit and compared Bayesian computational models of FR and CR performance (for this and all subsequent experiments). These analyses generally agreed with the ones reported here (see SOM 2 and our preregistration for more details about these analyses).
If two out of three raters considered a word to have a salient non-noun meaning or to be too obscure, that word was removed from the pool.
Before conducting Experiment 2A, we pilot tested the procedure on Prolific (N = 16). This testing revealed a high rate of exclusions (14/16 participants reported not understanding at least 75% of words), so we preregistered an additional inclusion criterion for this sample: English as a first language (self-reported on Prolific), in addition to self-reported English fluency. This had the added benefit of making our Prolific sample more comparable to our student samples in terms of language status.
This was due to our sampling procedure (i.e., opening more study slots than we needed to maximize data collected while trying to anticipate exclusions), but the results were the same when including/excluding the seven additional participants.
Results differed as a function of test order: the Pitman–Morgan test was significant for those who did FR before CR, but not for those who did CR before FR. This may be due to slightly lower CR performance in the latter group constraining CR variance (see Supplementary Material 4D).
Perhaps this is why the corresponding Bayesian analysis did not provide compelling evidence for a CR:FR difference (see Supplementary Material 4A).
Results were generally similar when comparing those who did CR before FR and vice versa, though the variability difference was greater and CR accuracy was higher for those who did FR before CR (see Supplementary Material 5D). It is possible that doing FR first (the easier task) better prepares participants for CR, and that this increase in accuracy (i.e., off the CR floor) serves to increase CR variability.
Results were nearly identical when excluding the excess seven participants above our target N. Specifically, the Pitman–Morgan p < .001, bootstrapped CR:FR variance ratio = 1.54 (95% percentile bootstrap CI [1.34, 1.77]).
90% because our hypothesis of CR > FR variability was one-sided.
We intended for the experiment program to auto-advance to the next word/pair after 30 s, but did not detect the programming error leading to the design reported in-text until after data had been collected. Study trials >30 s were rare (~4.5% of all trials), and excluding these trials or participants who had more than one such trial (n = 20) or participants who had any such trials (n = 33) did not change the results of our primary analysis. So, for subsequent analyses we did not exclude any trials/participants on this basis.
The inclusion/exclusion of the additional four participants above our preregistered target N did not change the results of our primary confirmatory analysis, so they were included for all subsequent analyses.
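The notes above refer to two analyses of paired CR and FR accuracy scores: the Pitman–Morgan test and a percentile bootstrap of the CR:FR variance ratio. The following is a minimal Python sketch of both, not the authors' analysis code (their scripts are on OSF). The function names `pitman_morgan_t` and `bootstrap_variance_ratio`, the default number of resamples, and the choice to resample participants as pairs are assumptions made for illustration. The Pitman–Morgan statistic rests on the fact that, for paired scores, the covariance between each participant's sum and difference equals the difference between the two variances.

```python
import math
import random
import statistics


def pitman_morgan_t(x, y):
    """Return (t, df) for the Pitman-Morgan test of equal variances in paired data.

    Each participant's sum (x + y) is correlated with their difference (x - y).
    cov(x + y, x - y) = var(x) - var(y), so the correlation is zero when the
    variances are equal and positive when var(x) > var(y).
    """
    if len(x) != len(y) or len(x) < 3:
        raise ValueError("need paired scores from at least 3 participants")
    sums = [a + b for a, b in zip(x, y)]
    diffs = [a - b for a, b in zip(x, y)]
    mean_s, mean_d = statistics.fmean(sums), statistics.fmean(diffs)
    cov = sum((s - mean_s) * (d - mean_d) for s, d in zip(sums, diffs))
    ss_s = sum((s - mean_s) ** 2 for s in sums)
    ss_d = sum((d - mean_d) ** 2 for d in diffs)
    r = cov / math.sqrt(ss_s * ss_d)
    df = len(x) - 2
    # Standard t transform of a correlation; undefined if |r| is exactly 1.
    t = r * math.sqrt(df / (1.0 - r * r))
    return t, df


def bootstrap_variance_ratio(cr, fr, n_boot=2000, level=0.95, seed=1):
    """Return (point, lower, upper) for var(cr) / var(fr).

    Assumption: participants are resampled with replacement as (cr, fr) pairs,
    and the interval uses simple order-statistic percentiles.
    """
    if len(cr) != len(fr) or len(cr) < 2:
        raise ValueError("need paired scores from at least 2 participants")
    rng = random.Random(seed)
    n = len(cr)
    ratios = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        var_fr = statistics.variance([fr[i] for i in idx])
        if var_fr == 0:
            continue  # skip degenerate resamples with no FR spread
        ratios.append(statistics.variance([cr[i] for i in idx]) / var_fr)
    ratios.sort()
    alpha = (1.0 - level) / 2.0
    lower = ratios[int(alpha * (len(ratios) - 1))]
    upper = ratios[int((1.0 - alpha) * (len(ratios) - 1))]
    point = statistics.variance(cr) / statistics.variance(fr)
    return point, lower, upper


if __name__ == "__main__":
    # Synthetic scores for illustration only (NOT the study's data):
    # CR is constructed to be more spread out across participants than FR.
    rng = random.Random(0)
    ability = [rng.gauss(0.0, 1.0) for _ in range(120)]
    fr = [0.40 + 0.05 * a + rng.gauss(0.0, 0.05) for a in ability]
    cr = [0.40 + 0.15 * a + rng.gauss(0.0, 0.05) for a in ability]
    print(pitman_morgan_t(cr, fr))
    print(bootstrap_variance_ratio(cr, fr))
```

A p value for the Pitman–Morgan statistic would come from a t distribution with n − 2 degrees of freedom. Passing `level=0.90` gives a two-sided 90% interval whose lower bound can serve as a one-sided 95% bound for a directional CR > FR hypothesis.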
References
* denotes Supplementary Material reference
Anderson, J. R., Lebiere, C., Lovett, M., & Reder, L. (1998). ACT-R: A higher-level account of processing capacity. Behavioral and Brain Sciences, 21(6), 831–832. https://doi.org/10.1017/S0140525X98221765
Christensen, H., Mackinnon, A. J., Korten, A. E., Jorm, A. F., Henderson, A. S., Jacomb, P., & Rodgers, B. (1999). An analysis of diversity in the cognitive performance of elderly community dwellers: Individual differences in change scores as a function of age. Psychology and Aging, 14(3), 365–379. https://doi.org/10.1037/0882-7974.14.3.365
Cleary, A. M. (2018). Dependent measures in memory research: From free recall to recognition. In H. Otani & B. L. Schwartz (Eds.), Handbook of research methods in human memory. Routledge.
Cowan, N., Saults, J. S., Elliott, E. M., & Moreno, M. V. (2002). Deconfounding serial recall. Journal of Memory and Language, 46(1), 153–177. https://doi.org/10.1006/jmla.2001.2805
Cox, G. E., Hemmer, P., Aue, W. R., & Criss, A. H. (2018). Information and processes underlying semantic and episodic memory across tasks, items, and individuals. Journal of Experimental Psychology: General, 147(4), 545–590. https://doi.org/10.1037/xge0000407
Eronen, M. I., & Bringmann, L. F. (2021). The theory crisis in psychology: How to move forward. Perspectives on Psychological Science, 16(4), 779–788. https://doi.org/10.1177/1745691620970586
Goldsmith, M., & Koriat, A. (2007). The strategic regulation of memory accuracy and informativeness. In A. S. Benjamin & B. H. Ross (Eds.), Skill and strategy in memory use (Vol. 48, pp. 1–60). Elsevier Academic Press.
Hartigan, J. A., & Hartigan, P. M. (1985). The dip test of unimodality. The Annals of Statistics, 13(1), 70–84. https://doi.org/10.1214/aos/1176346577
Jamieson, R. K., Mewhort, D. J. K., & Hockley, W. E. (2016). A computational account of the production effect: Still playing twenty questions with nature. Canadian Journal of Experimental Psychology/Revue Canadienne de Psychologie Expérimentale, 70(2), 154–164. https://doi.org/10.1037/cep0000081
La Plume, A. A., Paterson, T. S. E., Gardner, S., Stokes, K. A., Freedman, M., Levine, B., Troyer, A. K., & Anderson, N. D. (2021). Interindividual and intraindividual variability in amnestic mild cognitive impairment (aMCI) measured with an online cognitive assessment. Journal of Clinical and Experimental Neuropsychology, 43(8), 796–812. https://doi.org/10.1080/13803395.2021.1982867
Loftus, G. R., & Masson, M. E. J. (1994). Using confidence intervals in within‐subject designs. Psychonomic Bulletin & Review, 1(4), 476–490. https://doi.org/10.3758/BF03210951
Mah, E. Y., Campbell, A., Tamburri, C., Grannon, K., & Lindsay, D. S. (2023). A direct replication of Popp and Serra (2016, Experiment 1): Better free recall and worse cued recall of animal names than object names. Frontiers in Psychology, 14, 1146200. https://doi.org/10.3389/fpsyg.2023.1146200
Morgan, W. A. (1939). A test for the significance of the difference between two variances in a sample from a normal bivariate distribution. Biometrika, 31, 13–19.
Morrison, A. B., Rosenbaum, G. M., Fair, D., & Chein, J. M. (2016). Variation in strategy use across measures of verbal working memory. Memory & Cognition, 44(6), 922–936. https://doi.org/10.3758/s13421-016-0608-9
Murdock, B. B. (1983). A distributed memory model for serial-order information. Psychological Review, 90(4), 316–338. https://doi.org/10.1037/0033-295X.90.4.316
Nairne, J. S., VanArsdall, J. E., & Cogdill, M. (2017). Remembering the living: Episodic memory is tuned to animacy. Current Directions in Psychological Science, 26(1), 22–27. https://doi.org/10.1177/0963721416667711
Oberauer, K., & Lewandowsky, S. (2019). Addressing the theory crisis in psychology. Psychonomic Bulletin & Review, 26(5), 1596–1618. https://doi.org/10.3758/s13423-019-01645-2
Pitman, E. J. G. (1939). A note on normal correlation. Biometrika, 31, 9–12.
Popp, E. Y., & Serra, M. J. (2016). Adaptive memory: Animacy enhances free recall but impairs cued recall. Journal of Experimental Psychology: Learning, Memory, and Cognition, 42(2), 186–201. https://doi.org/10.1037/xlm0000174
Raaijmakers, J. G. W., & Shiffrin, R. M. (1980). SAM: A theory of probabilistic search of associative memory. In G. H. Bower (Ed.), Psychology of learning and motivation (Vol. 14, pp. 207–262). Elsevier. https://doi.org/10.1016/S0079-7421(08)60162-0
Ratcliff, R., Thapar, A., & McKoon, G. (2011). Effects of aging and IQ on item and associative memory. Journal of Experimental Psychology: General, 140(3), 464–487. https://doi.org/10.1037/a0023810
Roediger, H. L., & Payne, D. G. (1985). Recall criterion does not affect recall level or hypermnesia: A puzzle for generate/recognize theories. Memory & Cognition, 13(1), 1–7. https://doi.org/10.3758/BF03198437
Roediger, H. L., Watson, J. M., McDermott, K. B., & Gallo, D. A. (2001). Factors that determine false recall: A multiple regression analysis. Psychonomic Bulletin & Review, 8(3), 385–407. https://doi.org/10.3758/BF03196177
Siedlecki, K. L. (2007). Investigating the structure and age invariance of episodic memory across the adult lifespan. Psychology and Aging, 22(2), 251–268. https://doi.org/10.1037/0882-7974.22.2.251
Simons, D., Shoda, Y., & Lindsay, D. S. (2017). Constraints on Generality (COG): A proposed addition to all empirical papers. Perspectives on Psychological Science, 12, 1123–1128. https://doi.org/10.1177/1745691617708630
van Rooij, I., & Baggio, G. (2021). Theory before the test: How to build high-verisimilitude explanatory theories in psychological science. Perspectives on Psychological Science, 16(4), 682–697. https://doi.org/10.1177/1745691620970604
Watkins, M. J. (1990). Mediationism and the obfuscation of memory. American Psychologist, 45(3), 328–335. https://doi.org/10.1037/0003-066X.45.3.328
Wilson, M. (1988). MRC psycholinguistic database: Machine-usable dictionary version 2.00. Behavior Research Methods, Instruments, & Computers, 20(1), 6–10. https://doi.org/10.3758/BF03202594
Yao, C., Stawski, R. S., Hultsch, D. F., & McDonald, S. W. S. (2016). Selective attrition and intraindividual variability in response time moderate cognitive change. Journal of Clinical and Experimental Neuropsychology, 38(2), 227–237. https://doi.org/10.1080/13803395.2015.1102869
Zerr, C. L., Berg, J. J., Nelson, S. M., Fishell, A. K., Savalia, N. K., & McDermott, K. B. (2018). Learning efficiency: Identifying individual differences in learning rate and retention in healthy adults. Psychological Science, 29(9), 1436–1450. https://doi.org/10.1177/0956797618772540
Acknowledgements
We would like to thank Henry L. Roediger, Colleen M. Kelley, John Dunlosky, Larry Jacoby, Reed Hunt, and Roger Ratcliff for their helpful insights and suggestions.
Funding
This work was supported by an NSERC Discovery grant (#RGPIN-2016-03944) awarded to D.S.L.
Author information
Contributions
E.Y.M. and D.S.L. conceived of the experiments; E.Y.M. programmed the experiments, collected the data, and analyzed the data. E.Y.M. and D.S.L. drafted and revised the manuscript.
Ethics declarations
Conflicts of interest/Competing interests
We do not have any conflicts of interest to declare.
Ethics approval
All the experiments reported herein were approved by the ethics review board of the University of Victoria, and were conducted in accordance with the ethical standards as laid down in the 1964 Declaration of Helsinki and its later amendments or comparable ethical standards.
Consent to participate
All participants who took part in the experiment consented to participate.
Consent for publication
Both authors consent to the publication of this manuscript.
Additional information
Publisher's note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Open Practices Statement
The data and materials for all experiments are available online (https://osf.io/3tra5/), and all experiments were preregistered: Experiment 1 (https://osf.io/xfj6a), Experiments 2A and 2B (https://osf.io/3w6fm), Experiment 3 (https://osf.io/v67gy), Experiment 4 (https://osf.io/de7bu), and Experiment 5 (https://osf.io/my53w).
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Mah, E.Y., Lindsay, D.S. Variability across subjects in free recall versus cued recall. Mem Cogn 52, 23–40 (2024). https://doi.org/10.3758/s13421-023-01440-4
DOI: https://doi.org/10.3758/s13421-023-01440-4