Behavior Research Methods, Volume 46, Issue 1, pp 240–253

ESCOLEX: A grade-level lexical database from European Portuguese elementary to middle school textbooks

  • Ana Paula Soares
  • José Carlos Medeiros
  • Alberto Simões
  • João Machado
  • Ana Costa
  • Álvaro Iriarte
  • José João de Almeida
  • Ana P. Pinheiro
  • Montserrat Comesaña


In this article, we introduce ESCOLEX, the first European Portuguese children’s lexical database with grade-level-adjusted word frequency statistics. Computed from a 3.2-million-word corpus, ESCOLEX provides 48,381 word forms extracted from 171 elementary and middle school textbooks for 6- to 11-year-old children attending the first six grades in the Portuguese educational system. Like other children’s grade-level databases (e.g., Carroll, Davies, & Richman, 1971; Corral, Ferrero, & Goikoetxea, 2009; Lété, Sprenger-Charolles, & Colé, 2004; Zeno, Ivens, Millard, & Duvvuri, 1995), ESCOLEX provides four frequency indices for each grade: overall word frequency (F), index of dispersion across the selected textbooks (D), estimated frequency per million words (U), and standard frequency index (SFI). It also provides a new measure, contextual diversity (CD). In addition, ESCOLEX provides each word’s number of letters, part(s) of speech, number of syllables, syllable structure, and adult frequencies taken from P-PAL (a European Portuguese corpus-based lexical database; Soares, Comesaña, Iriarte, Almeida, Simões, Costa, …, Machado, 2010; Soares, Iriarte, Almeida, Simões, Costa, França, …, Comesaña, in press). ESCOLEX will be a useful tool both for researchers interested in language processing and development and for professionals in need of verbal materials adjusted to children’s developmental stages. ESCOLEX can be downloaded along with this article or from
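
As a rough illustration of how the five indices relate to one another, the sketch below computes them for a single word form from per-textbook counts. It assumes the formulas commonly given for grade-level databases in the Carroll et al. (1971) tradition: D as an entropy-based dispersion, U as a dispersion-weighted frequency per million, and SFI = 10(log10 U + 4). It also treats CD as the number of textbooks containing the word. The exact computations used for ESCOLEX are not given in this abstract and may differ; the function name `frequency_indices` and its inputs are hypothetical.

```python
import math


def frequency_indices(word_counts, text_sizes):
    """Illustrative F, D, U, SFI, and CD for one word form in one grade.

    word_counts[i] is the word's count in textbook i and text_sizes[i] is the
    total token count of textbook i (both are hypothetical inputs).
    """
    if len(word_counts) != len(text_sizes) or len(text_sizes) < 2:
        raise ValueError("need matching counts and sizes for at least two texts")
    n = len(text_sizes)   # number of textbooks
    N = sum(text_sizes)   # corpus size in tokens
    F = sum(word_counts)  # overall (raw) frequency
    if F == 0:
        raise ValueError("the word must occur at least once")

    # D: entropy-based dispersion over per-text relative frequencies.
    # 0 means the word occurs in a single text, 1 means an even spread.
    p = [c / s for c, s in zip(word_counts, text_sizes)]
    total_p = sum(p)
    weighted_logs = sum(pi * math.log(pi) for pi in p if pi > 0)
    D = (math.log(total_p) - weighted_logs / total_p) / math.log(n)

    # U: frequency per million tokens, discounted when dispersion is low.
    f_min = sum(c * s for c, s in zip(word_counts, text_sizes)) / N
    U = (1_000_000 / N) * (F * D + (1 - D) * f_min)

    # SFI: log-scaled U; U = 1 per million maps to SFI = 40.
    SFI = 10 * (math.log10(U) + 4)

    # CD: number of texts that contain the word at least once.
    CD = sum(1 for c in word_counts if c > 0)

    return {"F": F, "D": D, "U": U, "SFI": SFI, "CD": CD}


# Same raw frequency (F = 20), spread evenly versus concentrated in one text.
even = frequency_indices([5, 5, 5, 5], [1000, 1000, 1000, 1000])
clustered = frequency_indices([20, 0, 0, 0], [1000, 1000, 1000, 1000])
print(even)
print(clustered)
```

Under these assumptions, two words with the same F can receive quite different U and SFI values, which is why the dispersion-adjusted indices are reported alongside the raw count.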


Children lexical databases · Word frequency · Child language processing · Reading · Literacy


Author note

This work is part of the research project “Procura-PALavras (P-PAL): A software program for deriving objective and subjective psycholinguistic indices for European Portuguese words” (PTDC/PSI-PCO/104679/2008), funded by FCT (Fundação para a Ciência e Tecnologia), by FEDER (Fundo Europeu de Desenvolvimento Regional) through the European programs QREN (Quadro de Referência Estratégico Nacional), and by COMPETE (Programa Operacional Factores de Competitividade). We are grateful to Porto Editora for providing the textbooks without which ESCOLEX would not have been possible.


References

  1. Adams, M. J. (1990). Beginning to read: Thinking and learning about print. Cambridge: MIT Press.
  2. Adelman, J. S., Brown, G. D. A., & Quesada, J. F. (2006). Contextual diversity, not word frequency, determines word naming and lexical decision times. Psychological Science, 17, 814–823.
  3. Balota, D. A., Yap, M. J., Cortese, M. J., Hutchison, K. A., Kessler, B., Loftis, B., … Treiman, R. (2007). The English Lexicon Project. Behavior Research Methods, 39, 445–459. doi: 10.3758/BF03193014
  4. Benedict, H. (1979). Early lexical development: Comprehension and production. Journal of Child Language, 6, 183–200. doi: 10.1017/S0305000900002245
  5. Bloom, L. (1973). One word at a time: The use of single word utterances before syntax. The Hague: Mouton.
  6. Blomert, L. (2011). The neural signature of orthographic-phonological binding in successful and failing reading development. NeuroImage, 57, 695–703.
  7. Booth, J. R., Burman, D. D., Meyer, J. R., Gitelman, D. R., Parrish, T. B., & Mesulam, M. M. (2004). Development of brain mechanisms for processing orthographic and phonologic representations. Journal of Cognitive Neuroscience, 16, 1234–1249.
  8. Bowey, J. (2005). Grammatical sensitivity: Its origins and potential contribution to early reading skill. Journal of Experimental Child Psychology, 90, 318–343.
  9. Breland, H. M. (1996). Word frequency and word difficulty: A comparison of counts in four corpora. Psychological Science, 7, 96–99.
  10. Brown, R. (1973). A first language: The early stages. London: George Allen & Unwin.
  11. Carroll, J. B., Davies, P., & Richman, B. (Eds.). (1971). The American Heritage word frequency book. Boston: Houghton Mifflin.
  12. Castles, A., & Coltheart, M. (1993). Varieties of developmental dyslexia. Cognition, 47, 149–180.
  13. Castles, A., Davis, C., Cavalot, P., & Forster, K. (2008). Tracking the acquisition of orthographic skills in developing readers: Masked priming effects. Journal of Experimental Child Psychology, 97, 165–182.
  14. Chéreau, C., Gaskell, M. G., & Dumay, N. (2007). Reading spoken words: Orthographic effects in auditory priming. Cognition, 102, 341–360.
  15. Coady, J. A., & Aslin, R. N. (2003). Phonological neighbourhoods in the developing lexicon. Journal of Child Language, 30, 441–469.
  16. Coltheart, M., Rastle, K., Perry, C., Langdon, R., & Ziegler, J. (2001). DRC: A dual route cascaded model of visual word recognition and reading aloud. Psychological Review, 108, 204–256. doi: 10.1037/0033-295X.108.1.204
  17. Corral, S., Ferrero, M., & Goikoetxea, E. (2009). LEXIN: A lexical database from Spanish kindergarten and first-grade readers. Behavior Research Methods, 41, 1009–1017. doi: 10.3758/BRM.41.4.1009
  18. Damian, M. F., & Bowers, J. S. (2009). Orthographic effects in rhyme monitoring: Are they automatic? European Journal of Cognitive Psychology, 22, 1–11.
  19. Dickinson, D. K., & Snow, C. E. (1987). Interrelationships among prereading and oral language skills in kindergartners from two social classes. Early Childhood Research Quarterly, 2, 1–25.
  20. Doctor, E. A., & Coltheart, M. (1980). Children’s use of phonological encoding when reading for meaning. Memory & Cognition, 8, 195–209.
  21. Dollaghan, C. A. (1994). Children’s phonological neighbourhoods: Half empty or half full? Journal of Child Language, 21, 257–272.
  22. Ehri, L. C. (1995). Phases of development in learning to read words by sight. Journal of Research in Reading, 18, 116–125.
  23. Fenk-Oczlon, G., & Fenk, A. (2008). Complexity trade-offs between the subsystems of language. In M. Miestamo, K. Sinnemäki, & F. Karlsson (Eds.), Language complexity: Typology, contact, change (pp. 43–65). Amsterdam: John Benjamins.
  24. Goldfield, B. A., & Reznick, J. S. (1990). Early lexical acquisition: Rate, content, and the vocabulary spurt. Journal of Child Language, 17, 171–183.
  25. Goswami, U., Ziegler, J. C., & Richardson, U. (2005). The effects of spelling consistency on phonological awareness: A comparison of English and German. Journal of Experimental Child Psychology, 92, 345–365.
  26. Harm, M. W., & Seidenberg, M. S. (1999). Phonology, reading acquisition, and dyslexia: Insights from connectionist models. Psychological Review, 106, 491–528.
  27. Kučera, H., & Francis, W. N. (1967). Computational analysis of present day American English. Providence: Brown University Press.
  28. Lambert, E., & Chesnet, D. (2001). NOVLEX: Une base de données lexicales pour les élèves de primaire. L'Année Psychologique, 101, 277–288.
  29. Lété, B., Peereman, R., & Fayol, M. (2008). Consistency and word-frequency effects on spelling among first- to fifth-grade French children: A regression-based study. Journal of Memory and Language, 58, 952–977.
  30. Lété, B., Sprenger-Charolles, L., & Colé, P. (2004). MANULEX: A grade-level lexical database from French elementary school readers. Behavior Research Methods, Instruments, & Computers, 36, 156–166. doi: 10.3758/BF03195560
  31. Liberman, I. Y., & Shankweiler, D. (1985). Phonology and the problems of learning to read and write. Remedial and Special Education, 6, 8–17.
  32. Lonigan, C. J., Burgess, S. R., & Anthony, J. L. (2000). Development of emergent literacy and early reading skills in preschool children: Evidence from a latent-variable longitudinal study. Developmental Psychology, 36, 596–613.
  33. Luce, P. A., Pisoni, D. B., & Goldinger, S. D. (1990). Similarity neighborhoods of spoken words. In G. T. M. Altmann (Ed.), Cognitive models of speech processing (pp. 122–147). Cambridge: MIT Press.
  34. Marconi, L., Ott, M., Pesenti, E., Ratti, D., & Tavella, M. (1993). Lessico elementare: Dati statistici sull’italiano letto e scritto dai bambini delle elementari. Bologna: Zanichelli.
  35. Martínez, J. A., & García Pérez, M. E. (2008). ONESC: A database of orthographic neighbors for Spanish read by children. Behavior Research Methods, 40, 191–197. doi: 10.3758/BRM.40.1.191
  36. Mason, J. (1980). When do children begin to read: An exploration of four year old children’s letter and word reading competencies. Reading Research Quarterly, 15, 203–227.
  37. Masterson, J., Stuart, M., Dixon, M., & Lovejoy, S. (2010). Children’s printed word database: Continuities and changes over time in children’s early reading vocabulary. British Journal of Psychology, 101, 221–242.
  38. McClelland, J. L., & Elman, J. L. (1986). The TRACE model of speech perception. Cognitive Psychology, 18, 1–86. doi: 10.1016/0010-0285(86)90015-0
  39. Monsell, S. (1991). The nature and locus of word frequency effects in reading. In D. Besner & G. W. Humphreys (Eds.), Basic processes in reading: Visual word recognition (pp. 148–197). Hillsdale: Erlbaum.
  40. Moret-Tatay, C., & Perea, M. (2011). Is the go/no-go lexical decision task preferable to the yes/no task with developing readers? Journal of Experimental Child Psychology, 110, 125–132.
  41. Muneaux, M., & Ziegler, J. C. (2004). Locus of orthographic effects in spoken word recognition: Novel insights from the neighbour generation task. Language & Cognitive Processes, 19, 641–660.
  42. Nagy, W. E., & Herman, P. A. (1987). Breadth and depth of vocabulary knowledge: Implications for acquisition and instruction. In M. McKeown & M. Curtis (Eds.), The nature of vocabulary acquisition (pp. 19–35). Hillsdale: Erlbaum.
  43. Newman, S. D. (2012). The homophone effect during visual word recognition in children: An fMRI study. Psychological Research, 76, 280–291.
  44. Norris, D. (2006). The Bayesian reader: Explaining word recognition as an optimal Bayesian decision process. Psychological Review, 113, 327–357. doi: 10.1037/0033-295X.113.2.327
  45. Pattamadilok, C., Morais, J., de Vylder, O., Ventura, P., & Kolinsky, R. (2009). The orthographic consistency effect in the recognition of French spoken words: An early developmental shift from sublexical to lexical orthographic activation. Applied Psycholinguistics, 30, 441–462.
  46. Pattamadilok, C., Perre, L., Dufau, S., & Ziegler, J. C. (2009). On-line orthographic influences on spoken language in a semantic task. Journal of Cognitive Neuroscience, 21, 169–179.
  47. Perea, M., Panadero, V., Moret-Tatay, C., & Gómez, P. (2012). The effects of inter-letter spacing in visual-word recognition: Evidence with young normal readers and developmental dyslexics. Learning and Instruction, 22, 420–430.
  48. Perea, M., Soares, A. P., & Comesaña, M. (in press). Contextual diversity is a main determinant of word identification times in young readers. Journal of Experimental Child Psychology. doi: 10.1016/j.jecp.2012.10.014
  49. Peereman, R., Dufour, S., & Burt, J. S. (2009). Orthographic influences in spoken word recognition: The consistency effect in semantic and gender categorization tasks. Psychonomic Bulletin & Review, 16, 363–368. doi: 10.3758/PBR.16.2.363
  50. Peereman, R., Lété, B., & Sprenger-Charolles, L. (2007). Manulex-infra: Distributional characteristics of grapheme–phoneme mappings, and infralexical and lexical units in child-directed written material. Behavior Research Methods, 39, 579–589. doi: 10.3758/BF03193029
  51. Perfetti, C. A. (1985). Reading ability. New York: Oxford University Press.
  52. Plaut, D. C., McClelland, J. L., Seidenberg, M. S., & Patterson, K. (1996). Understanding normal and impaired word reading: Computational principles in quasi-regular domains. Psychological Review, 103, 56–115. doi: 10.1037/0033-295X.103.1.56
  53. Rastle, K., & Brysbaert, M. (2006). Masked phonological priming effects in English: Are they real? Do they matter? Cognitive Psychology, 53, 97–145. doi: 10.1016/j.cogpsych.2006.01.002
  54. Rastle, K., McCormick, S. F., Bayliss, L., & Davis, C. J. (2011). Orthography influences the perception and production of speech. Journal of Experimental Psychology: Learning, Memory, and Cognition, 37, 1588–1594. doi: 10.1037/a0024833
  55. Reichle, E. D., Pollatsek, A., Fisher, D. L., & Rayner, K. (1998). Toward a model of eye movement control in reading. Psychological Review, 105, 125–157. doi: 10.1037/0033-295X.105.1.125
  56. Seidenberg, M. S., & Tanenhaus, M. K. (1979). Orthographic effects in rhyme monitoring. Journal of Experimental Psychology: Human Learning & Memory, 5, 546–554. doi: 10.1037/0278-7393.5.6.546
  57. Shaywitz, B. A., Shaywitz, S. E., Pugh, K. R., Mencl, W. E., Fulbright, R. K., Skudlarski, P., … Gore, J. C. (2002). Disruption of posterior brain systems for reading in children with developmental dyslexia. Biological Psychiatry, 52, 101–110.
  58. Simões, A. M., & Almeida, J. J. (2001). Jspell: Um módulo de análise morfológica para uso em processamento de linguagem natural. In A. Gonçalves & C. N. Correia (Eds.), Actas do Encontro Nacional da Associação Portuguesa de Linguística (pp. 485–495). Lisbon: Associação Portuguesa de Linguística.
  59. Slobin, D. I. (1973). Cognitive prerequisites for the development of grammar. In C. A. Ferguson & D. I. Slobin (Eds.), Studies of child language development (pp. 175–208). New York: Holt, Rinehart & Winston.
  60. Smolensky, P. (1996). On the comprehension/production dilemma in child language. Linguistic Inquiry, 27, 720–731.
  61. Snowling, M. (1980). The development of grapheme–phoneme correspondence in normal and dyslexic readers. Journal of Experimental Child Psychology, 29, 294–305.
  62. Soares, A. P., Comesaña, M., Iriarte, A., Almeida, J. J., Simões, A., Costa, A., et al. (2010). P-PAL: A European Portuguese lexical database. Linguamática, 2, 67–72.
  63. Soares, A. P., Iriarte, A., Almeida, J. J., Simões, A., Costa, A., França, P., … Comesaña, M. (in press). Procura-PALavras (P-PAL): A new measure of word frequency for contemporary European Portuguese. Psicologia: Reflexão e Crítica.
  64. Stuart, M., Dixon, M., Masterson, J., & Gray, B. (2003). Children’s early reading vocabulary: Description and word frequency lists. British Journal of Educational Psychology, 73, 585–598.
  65. Taft, M., Castles, A., Davis, C., Lazendic, G., & Nguyen-Hoan, M. (2008). Automatic activation of orthography in spoken word recognition: Pseudohomograph priming. Journal of Memory and Language, 58, 366–379.
  66. Tomasello, M. (2003). Constructing a language: A usage-based theory of language acquisition. Cambridge: Harvard University Press.
  67. Turkeltaub, P. E., Gareau, L., Flowers, D. L., Zeffiro, T. A., & Eden, G. F. (2003). Development of neural mechanisms for reading. Nature Neuroscience, 6, 767–773.
  68. Unsworth, S. J., & Pexman, P. M. (2003). The impact of reader skill on phonological processing in visual word recognition. Quarterly Journal of Experimental Psychology, 56, 63–81.
  69. Vellutino, F. R., Fletcher, J. M., Snowling, M. J., & Scanlon, D. M. (2004). Specific reading disability (dyslexia): What have we learned in the past four decades? Journal of Child Psychology and Psychiatry, 45, 2–40.
  70. Ventura, P., Kolinsky, R., Pattamadilok, C., & Morais, J. (2008). The developmental turn point of orthographic consistency effects in speech recognition. Journal of Experimental Child Psychology, 100, 135–145.
  71. Ventura, P., Morais, J., & Kolinsky, R. (2006). The development of orthographic consistency effect in speech recognition: From sub-lexical to lexical involvement. Cognition, 105, 547–576.
  72. Zahar, R., Cobb, T., & Spada, N. (2001). Acquiring vocabulary through reading: Effects of frequency and contextual richness. The Canadian Modern Language Review, 57, 541–572.
  73. Zeno, S., Ivens, S. H., Millard, R. T., & Duvvuri, R. (1995). The educator’s word frequency guide. Brewster: Touchstone Applied Science.
  74. Zevin, J. D., & Seidenberg, M. S. (2002). Age of acquisition effects in word reading and other tasks. Journal of Memory and Language, 47, 1–29. doi: 10.1006/jmla.2001.2834
  75. Ziegler, J. C., Ferrand, L., & Montant, M. (2004). Visual phonology: The effects of orthographic consistency on different auditory word recognition tasks. Memory & Cognition, 32, 732–741. doi: 10.3758/BF03195863
  76. Ziegler, J. C., & Goswami, U. (2005). Reading acquisition, developmental dyslexia, and skilled reading across languages: A psycholinguistic grain size theory. Psychological Bulletin, 131, 3–29. doi: 10.1037/0033-2909.131.1.3
  77. Ziegler, J. C., & Muneaux, M. (2007). Orthographic facilitation and phonological inhibition in spoken word recognition: A developmental study. Psychonomic Bulletin & Review, 14, 75–80.
  78. Zipf, G. K. (1949). Human behavior and the principle of least effort. Cambridge: Addison-Wesley.

Copyright information

© Psychonomic Society, Inc. 2013

Authors and Affiliations

  • Ana Paula Soares (1)
  • José Carlos Medeiros (4)
  • Alberto Simões (2)
  • João Machado (1)
  • Ana Costa (1)
  • Álvaro Iriarte (2)
  • José João de Almeida (3)
  • Ana P. Pinheiro (1)
  • Montserrat Comesaña (1)

  1. School of Psychology, University of Minho, Minho, Portugal
  2. Centre for Humanistic Studies, University of Minho, Minho, Portugal
  3. Computer Science and Technology Center, University of Minho, Minho, Portugal
  4. Porto Editora, Porto, Portugal
