How crucial is the response format for the testing effect?


Combining study and test trials during learning is more beneficial for long-term retention than repeated study without testing (i.e., the testing effect). Less is known about the relative efficacy of different response formats during testing. We tested the hypothesis that overt testing (typing responses on a keyboard) during a practice phase benefits later memory more than covert testing (only pressing a button to indicate successful retrieval). In Experiment 1, three groups learned 40 word pairs either by repeatedly studying them, by studying and overtly testing them, or by studying and covertly testing them. In Experiment 2, only the two testing conditions were manipulated, in a within-subjects design. In both experiments, participants received cued recall tests after a short (~19 min) and a long (1 week) retention interval. In Experiment 1, all groups performed equally well at the short retention interval. The overt testing group reliably outperformed the repeated study group after 1 week, whereas the covert testing group did not differ significantly from either of these groups. Hence, the testing effect was demonstrated for overt testing but not for covert testing. In Experiment 2, overtly tested items were retrieved better and more quickly than covertly tested items. Further, this advantage does not seem to be due to any differences in retrieval effort during learning. To conclude, overt testing was more beneficial for later retention than covert testing, but the effect size was small. Possible explanations are discussed.




  1. Please note that the study-only group was added later at a reviewer's request. Participants were randomly allocated to the two test groups, but for this reason allocation to the study-only group was not random.

  2. During final testing in Experiment 1, participants pressed ENTER only after having filled in their response. This response latency measure does not allow retrieval latency to be calculated, because it includes both the time it takes to retrieve an item and the time it takes to write the response. However, because the final testing procedure and the items used were identical for both groups, any difference between the two groups should primarily be due to retrieval latency. To compare with the findings of Experiment 2, we ran two separate Mann–Whitney U tests (again because the data were not normally distributed). When tested after the short retention interval, participants accessed the information reliably faster in the overt testing group (Median = 3,404.79) than in the covert testing group (Median = 4,054.82), Mann–Whitney U = 367.00, p = 0.02. After a week, the median latencies were still slightly faster in the overt testing group (Median = 4,930.84) than in the covert testing group (Median = 5,501.00), but this difference was not statistically reliable, Mann–Whitney U = 440.00, p = 0.18. Notably, the pattern of results is the same as in Experiment 2.
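For readers unfamiliar with the test used in the second note, the following is a minimal sketch of a two-group Mann–Whitney U comparison. It is not the authors' analysis script: the function name `mann_whitney_u`, the latency values, and the choice of a tie-corrected normal approximation without continuity correction are all assumptions made for illustration, and statistical packages may report slightly different p values.

```python
from math import erfc, sqrt


def mann_whitney_u(x, y):
    """Return (U_x, U_y, two-sided p) via the normal approximation.

    Hypothetical helper for illustration only. Ties receive average ranks
    and the variance is tie-corrected; no continuity correction is applied.
    """
    # Pool both samples, remembering group membership (0 = x, 1 = y).
    pooled = sorted((v, g) for g, sample in enumerate((x, y)) for v in sample)
    n1, n2 = len(x), len(y)
    n = n1 + n2

    rank_sum_x = 0.0
    tie_term = 0.0
    i = 0
    while i < n:
        # Find the run of tied values starting at position i.
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2  # mean of the 1-based ranks i+1 .. j
        t = j - i
        tie_term += t**3 - t
        rank_sum_x += avg_rank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        i = j

    u_x = rank_sum_x - n1 * (n1 + 1) / 2
    u_y = n1 * n2 - u_x

    mean_u = n1 * n2 / 2
    var_u = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    z = (min(u_x, u_y) - mean_u) / sqrt(var_u)
    p = erfc(abs(z) / sqrt(2))  # two-sided p from the standard normal
    return u_x, u_y, p


# Made-up response latencies, NOT data from the study.
overt = [3100.0, 3350.5, 3404.8, 3620.1, 3900.3, 4210.7]
covert = [3500.2, 3980.4, 4054.8, 4300.9, 4720.6, 5105.0]

u_overt, u_covert, p_value = mann_whitney_u(overt, covert)
print(f"U = {min(u_overt, u_covert):.2f}, p = {p_value:.3f}")
```

Because the test works on ranks rather than raw values, it does not require normally distributed latencies, which is the reason the note gives for using it.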




This research was supported by a grant from The Swedish Research Council (2009-2334) to Fredrik Jönsson. We thank Tara Soltani for help with parts of the data collection in Experiment 1.

Author information

Correspondence to Fredrik U. Jönsson.


About this article

Cite this article

Jönsson, F.U., Kubik, V., Larsson Sundqvist, M. et al. How crucial is the response format for the testing effect? Psychological Research 78, 623–633 (2014). doi:10.1007/s00426-013-0522-8



  • Target Word
  • Retention Interval
  • Word Pair
  • Response Format
  • Test Cycle