The effect of corpus size in predicting reaction time in a basic word recognition task: Moving on from Kučera and Francis

  • Curt Burgess
  • Kay Livesay
Cognitive Research


Word frequency is one of the strongest determinants of reaction time (RT) in word recognition tasks; it is an important theoretical and methodological variable. The Kučera and Francis (1967) word frequency count (derived from the 1-million-word Brown corpus) is used by most investigators concerned with the issue of word frequency. Word frequency estimates from the Brown corpus were compared with those from a 131-million-word corpus (the HAL corpus; conversational text gathered from Usenet) in a standard word naming task with 32 subjects. RT was predicted equally well by both corpora for high-frequency words, but the larger corpus provided better predictors for low- and medium-frequency words. Furthermore, the larger corpus provides estimates for 97,261 lexical items; the smaller corpus, for 50,406 items.
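
The comparison described above can be illustrated with a minimal sketch, which is not the authors' analysis. It assumes that raw counts are normalized to occurrences per million tokens, log-transformed with +1 smoothing, and correlated with naming RT using Pearson's r; the abstract does not specify these choices. The function names and every number in the toy data are hypothetical and serve only to show the mechanics.

```python
import math

BROWN_SIZE = 1_000_000    # corpus size stated in the abstract
HAL_SIZE = 131_000_000    # corpus size stated in the abstract


def per_million(count, corpus_size):
    """Convert a raw count to occurrences per million tokens."""
    return count * 1_000_000 / corpus_size


def log_frequency(count, corpus_size):
    """Log10 of per-million frequency; the +1 smoothing is an assumption
    so that words with a zero count still get a defined value."""
    return math.log10(per_million(count, corpus_size) + 1)


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical toy data: invented counts and RTs (ms), not from the study.
small_counts = [120, 15, 2, 0]          # counts in a 1-million-word corpus
large_counts = [16000, 2100, 310, 45]   # counts in a 131-million-word corpus
naming_rts = [480.0, 510.0, 545.0, 590.0]

small_logf = [log_frequency(c, BROWN_SIZE) for c in small_counts]
large_logf = [log_frequency(c, HAL_SIZE) for c in large_counts]

# Correlation of each corpus's log frequency with RT on the toy data.
r_small = pearson_r(small_logf, naming_rts)
r_large = pearson_r(large_logf, naming_rts)
print(f"small corpus r = {r_small:.3f}, large corpus r = {r_large:.3f}")
```

Normalizing to a per-million scale puts counts from corpora of different sizes on the same footing. In a small corpus, a low-frequency word may have a count of zero or one, which is a coarse estimate; a larger corpus can distinguish among such words.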



  1. Breland, H. M. (1996). Word frequency and word difficulty: A comparison of counts on four corpora. Psychological Science, 7, 96–99.
  2. Buchanan, L., Burgess, C., & Lund, K. (1996). Overcrowding in semantic neighborhoods: Modeling deep dyslexia. Brain & Cognition, 32, 111–114.
  3. Burgess, C. (1998). From simple associations to the building blocks of language: Modeling meaning in memory with the HAL model. Manuscript submitted for publication.
  4. Burgess, C., & Hollbach, S. C. (1988). A computational model of syntactic ambiguity as a lexical process. In Proceedings of the Tenth Annual Cognitive Science Society Meeting (pp. 263–269). Hillsdale, NJ: Erlbaum.
  5. Burgess, C., & Lund, K. (1994). Multiple constraints in syntactic ambiguity resolution: A connectionist account of psycholinguistic data. In Proceedings of the Cognitive Science Society (pp. 90–95). Hillsdale, NJ: Erlbaum.
  6. Burgess, C., & Lund, K. (1997a). Modeling cerebral asymmetries of semantic memory using high-dimensional semantic space. In M. Beeman & C. Chiarello (Eds.), Getting it right: The cognitive neuroscience of right hemisphere language comprehension (pp. 215–244). Hillsdale, NJ: Erlbaum.
  7. Burgess, C., & Lund, K. (1997b). Modeling parsing constraints with high-dimensional context space. Language & Cognitive Processes, 12, 177–210.
  8. Burgess, C., & Lund, K. (1997c). Representing abstract words and emotional connotation in high-dimensional memory space. In Proceedings of the Cognitive Science Society (pp. 61–66). Hillsdale, NJ: Erlbaum.
  9. Burgess, C., Lund, K., & Kromsky, A. (1997). Examining issues in developmental psycholinguistics with a high-dimensional memory model. Abstracts of the Psychonomic Society, 2, 66.
  10. Burgess, C., Tanenhaus, M. K., & Hoffman, M. (1994). Parafoveal and semantic effects on syntactic ambiguity resolution. In Proceedings of the Cognitive Science Society (pp. 96–99). Hillsdale, NJ: Erlbaum.
  11. Cattell, J. M. (1886). The time it takes to see and name objects. Mind, 11, 63–65.
  12. Chiarello, C. (1988). Lateralization of lexical processes in the brain: A review of visual half-field research. In H. A. Whitaker (Ed.), Contemporary reviews in neuropsychology (pp. 36–76). New York: Springer-Verlag.
  13. Clark, S. E., & Burchett, R. E. R. (1994). Word frequency and list composition effects in associative recognition and recall. Memory & Cognition, 22, 55–62.
  14. Dobbs, A. R., Friedman, A., & Lloyd, J. (1985). Frequency effects in lexical decisions: A test of the verification model. Journal of Experimental Psychology: Human Perception & Performance, 11, 81–92.
  15. Dupuy, H. J. (1974). The rationale, development, and standardization of a basic word vocabulary test. Vital & Health Statistics, 2, 71.
  16. Forster, K. I. (1976). Accessing the mental lexicon. In R. J. Wales & E. C. T. Walker (Eds.), New approaches to language mechanisms (pp. 257–287). Amsterdam: North-Holland.
  17. Francis, W. N., & Kučera, H. (1982). Frequency analysis of English usage: Lexicon and grammar. Boston: Houghton Mifflin.
  18. Gernsbacher, M. A. (1984). Resolving 20 years of inconsistent interactions between lexical familiarity and orthography, concreteness, and polysemy. Journal of Experimental Psychology: General, 113, 256–281.
  19. Graf, P., & Williams, D. (1987). Completion norms for 40 three-letter word stems. Behavior Research Methods, Instruments, & Computers, 19, 422–445.
  20. Grainger, J., O’Regan, J. K., Jacobs, A. M., & Segui, J. (1989). On the role of competing word units in visual word recognition: The neighborhood frequency effect. Perception & Psychophysics, 45, 189–195.
  21. Hyönä, J., & Olson, R. K. (1995). Eye fixation patterns among dyslexic and normal readers: Effects of word length and word frequency. Journal of Experimental Psychology: Learning, Memory, & Cognition, 21, 1430–1440.
  22. Jurado, M. A., Junque, C., Pujol, J., Oliver, B., & Vendrell, P. (1997). Impaired estimation of word occurrence frequency in frontal lobe patients. Neuropsychologia, 35, 635–641.
  23. Kučera, H., & Francis, W. N. (1967). Computational analysis of present-day American English. Providence, RI: Brown University Press.
  24. Livesay, K., & Burgess, C. (1997). Mediated priming: A representational and empirical account using the HAL model. In Proceedings of the Cognitive Science Society (pp. 436–441). Hillsdale, NJ: Erlbaum.
  25. Lovelace, E. A. (1988). On using norms for low-frequency words. Bulletin of the Psychonomic Society, 26, 410–412.
  26. Lund, K., & Burgess, C. (1996). Producing high-dimensional semantic spaces from lexical co-occurrence. Behavior Research Methods, Instruments, & Computers, 28, 203–208.
  27. Lund, K., & Burgess, C. (1997, December). Recurrent neural networks and global co-occurrence models: Developing contextual representations of word meaning. Paper presented at the NIPS*97 (Neural Information Processing Systems) Neural Models of Concept Learning postconference workshop, Breckenridge, CO.
  28. Lund, K., Burgess, C., & Atchley, R. A. (1995). Semantic and associative priming in high-dimensional semantic space. In Proceedings of the Cognitive Science Society (pp. 660–665). Hillsdale, NJ: Erlbaum.
  29. Lund, K., Burgess, C., & Audet, C. (1996). Dissociating semantic and associative word relationships using high-dimensional semantic space. In Proceedings of the Cognitive Science Society (pp. 603–608). Hillsdale, NJ: Erlbaum.
  30. MacDonald, M. C. (1994). Probabilistic constraints and syntactic ambiguity resolution. Language & Cognitive Processes, 9, 157–201.
  31. MacDonald, M. C., Pearlmutter, N. J., & Seidenberg, M. S. (1994). The lexical nature of syntactic ambiguity resolution. Psychological Review, 101, 676–703.
  32. McClelland, J. L., & Rumelhart, D. E. (1985). Distributed memory and the representation of general and specific information. Journal of Experimental Psychology: General, 114, 159–188.
  33. Monsell, S. (1991). The nature and locus of word frequency effects in reading. In D. Besner & G. W. Humphreys (Eds.), Basic processes in reading: Visual word recognition (pp. 148–197). Hillsdale, NJ: Erlbaum.
  34. Morton, J. (1969). Interaction of information in word recognition. Psychological Review, 76, 165–178.
  35. Plaut, D. C. (1996). Relearning after damage in connectionist networks: Toward a theory of rehabilitation. Brain & Language, 52, 25–82.
  36. Rudell, A. P. (1993). Frequency of word usage and perceived word difficulty: Ratings of Kucera and Francis words. Behavior Research Methods, Instruments, & Computers, 25, 455–463.
  37. Schwanenflugel, P., & Shoben, E. (1985). The influence of sentence constraints on the scope of facilitation for upcoming words. Journal of Memory & Language, 24, 232–252.
  38. Sears, C. R., Hino, Y., & Lupker, S. J. (1995). Neighborhood size and neighborhood frequency effects in word recognition. Journal of Experimental Psychology: Human Perception & Performance, 21, 876–900.
  39. Smith, E. E., Shoben, E. J., & Rips, L. J. (1974). Structure and process in semantic memory: A featural model for semantic decisions. Psychological Review, 81, 214–241.
  40. Tanenhaus, M. K., & Carlson, G. N. (1989). Lexical structure and language comprehension. In W. Marslen-Wilson (Ed.), Lexical representation and process (pp. 529–561). Cambridge, MA: MIT Press.
  41. Thorndike, E. L., & Lorge, I. (1944). The teacher’s word book of 30,000 words. New York: Columbia University, Teachers College Press.
  42. Troia, G. A., Roth, F. P., & Yeni-Komshian, G. H. (1996). Word frequency and age effects in normally developing children’s phonological processing. Journal of Speech & Hearing Research, 39, 1099–1108.
  43. Trueswell, J. C., Tanenhaus, M. K., & Garnsey, S. M. (1994). Semantic influences on parsing: Use of thematic role information in syntactic ambiguity resolution. Journal of Memory & Language, 33, 285–318.

Copyright information

© Psychonomic Society, Inc. 1998

Authors and Affiliations

  1. Psychology Department, University of California, Riverside