Efficiently measuring recognition performance with sparse data

Behavior Research Methods

Abstract

We examine methods for measuring performance in signal-detection-like tasks when each participant provides only a few observations. Monte Carlo simulations demonstrate that standard statistical techniques applied to a d′ analysis can lead to large numbers of Type I errors (incorrectly rejecting a hypothesis of no difference). Various statistical methods were compared in terms of their rates of Type I errors and Type II errors (incorrectly accepting a hypothesis of no difference). Our conclusions are the same whether these two types of errors are weighted equally or Type I errors are weighted more heavily. The most promising method is to combine an aggregate d′ measure with a percentile bootstrap confidence interval, a computer-intensive nonparametric method of statistical inference. Researchers who prefer statistical techniques more commonly used in psychology, such as a repeated measures t test, should use γ (Goodman & Kruskal, 1954), since it performs slightly better than or nearly as well as d′. In general, when repeated measures t tests are used, γ is more conservative than d′: It makes more Type II errors, but its Type I error rate tends to be much closer to that of the traditional .05 α level. It is somewhat surprising that γ performs as well as it does, given that the simulations that generated the hypothetical data conformed completely to the d′ model. Analyses in which H − FA (the hit rate minus the false alarm rate) was used had the highest Type I error rates. Detailed simulation results can be downloaded from www.psychonomic.org/archive/Schooler-BRM-2004.zip.
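The measures named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the data layout (one `(hits, misses, false_alarms, correct_rejections)` tuple per participant and condition), the +0.5 correction that keeps rates away from 0 and 1, and the number of bootstrap resamples are all assumptions made for illustration. The sketch pools counts across participants to obtain an aggregate d′, then resamples participants with replacement to form a percentile bootstrap confidence interval for the difference between two conditions.

```python
import random
from statistics import NormalDist

def d_prime(hits, misses, fas, crs):
    """d' = z(H) - z(FA), with an assumed +0.5 correction to avoid rates of 0 or 1."""
    h = (hits + 0.5) / (hits + misses + 1.0)
    fa = (fas + 0.5) / (fas + crs + 1.0)
    z = NormalDist().inv_cdf
    return z(h) - z(fa)

def gamma_2x2(hits, misses, fas, crs):
    """Goodman-Kruskal gamma for a 2x2 table (concordant vs. discordant pairs)."""
    concordant = hits * crs
    discordant = misses * fas
    if concordant + discordant == 0:
        return 0.0  # assumption: treat the undefined case as no association
    return (concordant - discordant) / (concordant + discordant)

def h_minus_fa(hits, misses, fas, crs):
    """Hit rate minus false alarm rate, computed from raw counts."""
    return hits / (hits + misses) - fas / (fas + crs)

def aggregate_d_prime(tables):
    """Pool counts over participants first, then compute a single d'."""
    totals = [sum(column) for column in zip(*tables)]
    return d_prime(*totals)

def bootstrap_ci(participants, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for the aggregate d' difference (condition A - B).

    `participants` is a list of (table_a, table_b) pairs, one pair per person.
    Whole participants are resampled so the repeated measures stay paired.
    """
    rng = random.Random(seed)
    n = len(participants)
    diffs = []
    for _ in range(n_boot):
        sample = [participants[rng.randrange(n)] for _ in range(n)]
        d_a = aggregate_d_prime([p[0] for p in sample])
        d_b = aggregate_d_prime([p[1] for p in sample])
        diffs.append(d_a - d_b)
    diffs.sort()
    lower = diffs[int((alpha / 2) * n_boot)]
    upper = diffs[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper

if __name__ == "__main__":
    # Hypothetical sparse data: five old and five new test items per condition.
    data = [((4, 1, 1, 4), (3, 2, 2, 3)),
            ((5, 0, 2, 3), (2, 3, 1, 4)),
            ((3, 2, 0, 5), (3, 2, 3, 2)),
            ((4, 1, 2, 3), (2, 3, 2, 3))]
    print(bootstrap_ci(data))
```

Under this scheme, a hypothesis of no difference between conditions would be rejected when the interval excludes zero. Pooling before computing d′ is what makes the measure usable with sparse data, since individual participants with only a few trials often produce hit or false alarm rates of exactly 0 or 1.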

References

  • Cohen, J. (1988). Statistical power analysis for the behavioral sciences (2nd ed.). Hillsdale, NJ: Erlbaum.

  • Deese, J. (1959). On the prediction of occurrence of particular verbal intrusions in immediate recall. Journal of Experimental Psychology, 58, 17–22.

  • Di Stefano, J. (2003). How much power is enough? Against the development of an arbitrary convention for statistical power calculations. Functional Ecology, 17, 707–709.

  • Efron, B., & Tibshirani, R. (1991). Bootstrap methods for standard errors, confidence intervals, and other measures of statistical accuracy. Statistical Science, 1, 54–77.

  • Gallo, D. A., Roediger, H. L., III, & McDermott, K. B. (2001). Associative false recognition occurs without strategic criterion shifts. Psychonomic Bulletin & Review, 8, 579–586.

  • Goodman, L. A., & Kruskal, W. H. (1954). Measures of association for cross classifications. Journal of the American Statistical Association, 49, 732–764.

  • Green, D. M., & Swets, J. A. (1966). Signal detection theory and psychophysics. New York: Wiley.

  • Heit, E., Brockdorff, N., & Lamberts, K. (2003). Adaptive changes of response criterion in recognition memory. Psychonomic Bulletin & Review, 10, 718–723.

  • Martinez, W. L., & Martinez, A. R. (2002). Computational statistics handbook with MATLAB. New York: Chapman & Hall/CRC.

  • Miller, M. B., & Wolford, G. L. (1999). Theoretical commentary: The role of criterion shift in false memory. Psychological Review, 106, 398–405.

  • Mooney, C. Z., & Duval, R. D. (1993). Bootstrapping: A nonparametric approach to statistical inference. London: Sage.

  • Nelson, T. O. (1984). A comparison of current measures of the accuracy of feeling-of-knowing predictions. Psychological Bulletin, 95, 109–133.

  • Read, J. D. (1996). From a passing thought to a false memory in 2 minutes: Confusing real and illusory events. Psychonomic Bulletin & Review, 3, 105–111.

  • Roediger, H. L., III, & McDermott, K. B. (1995). Creating false memories: Remembering words that were not presented in lists. Journal of Experimental Psychology: Learning, Memory, & Cognition, 21, 803–814.

  • Snodgrass, J. G., & Corwin, J. (1988). Pragmatics of measuring recognition memory: Applications to dementia and amnesia. Journal of Experimental Psychology: General, 117, 34–50.

  • Stretch, V., & Wixted, J. T. (1998). On the difference between strength-based and frequency-based mirror effects in recognition memory. Journal of Experimental Psychology: Learning, Memory, & Cognition, 24, 1379–1396.

  • Wixted, J. T., & Stretch, V. (2000). The case against a criterion-shift account of false memory. Psychological Review, 107, 368–376.

  • Zoubir, A. M., & Boashash, B. (1998). The bootstrap and its application in signal processing. IEEE Signal Processing Magazine, 15, 56–76.

Author information

Corresponding author

Correspondence to Lael J. Schooler.

Additional information

This work was begun while L.S. was supported by NSRA Fellowship 1F32HD/MHC7787-01A1 at Indiana University.

Note—This article was accepted by the previous editor, Jonathan Vaughan.

About this article

Cite this article

Schooler, L.J., Shiffrin, R.M. Efficiently measuring recognition performance with sparse data. Behavior Research Methods 37, 3–10 (2005). https://doi.org/10.3758/BF03206393
