Empirically derived probabilities for grapheme-to-phoneme correspondences in English


Prior probabilities of graphemes and conditional probabilities for their pronunciation as specific phonemes are given based on a corpus of 17,310 English words. Phonemes are as given in recent editions of Webster’s New Collegiate Dictionary, with minor revisions; graphemes are defined as letters or letter clusters corresponding to single phonemes. Grapheme-phoneme probabilities were derived from a revised table of frequency of occurrence of phoneme-to-grapheme correspondences generated in a study of spelling regularities (P. R. Hanna, J. S. Hanna, Hodges, & Rudorf, 1966). This quantitative descriptive information provides an index of the strength of particular grapheme-phoneme associations in English. Suggestions are made for the utilization of these probabilities as estimates of spelling/sound predictability in reading research.
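The derivation described in the abstract can be illustrated with a minimal sketch. It assumes the input is a table of occurrence counts keyed by (phoneme, grapheme) pairs; the function name `grapheme_phoneme_probabilities` and the counts in `toy_counts` are hypothetical and are not taken from the article or from the Hanna et al. (1966) corpus. The sketch also does not reproduce any weighting or revisions the authors may have applied.

```python
from collections import defaultdict


def grapheme_phoneme_probabilities(counts):
    """Turn phoneme-to-grapheme counts into grapheme-to-phoneme probabilities.

    counts maps (phoneme, grapheme) -> positive occurrence count.
    Returns (prior, conditional), where prior[g] estimates P(g) and
    conditional[(g, p)] estimates P(p | g).
    """
    # Total occurrences of each grapheme, summed over all of its phonemes.
    grapheme_totals = defaultdict(int)
    for (_phoneme, grapheme), n in counts.items():
        grapheme_totals[grapheme] += n

    grand_total = sum(grapheme_totals.values())

    # Prior probability: share of all correspondences that use this grapheme.
    prior = {g: n / grand_total for g, n in grapheme_totals.items()}

    # Conditional probability: share of this grapheme's occurrences
    # that are pronounced as the given phoneme.
    conditional = {
        (g, p): n / grapheme_totals[g] for (p, g), n in counts.items()
    }
    return prior, conditional


# Hypothetical toy counts, for illustration only.
toy_counts = {
    ("/k/", "c"): 60,
    ("/s/", "c"): 20,
    ("/k/", "k"): 15,
    ("/s/", "s"): 105,
}

prior, conditional = grapheme_phoneme_probabilities(toy_counts)
print(prior["c"])                 # estimated P(grapheme = c)
print(conditional[("c", "/k/")])  # estimated P(/k/ | c)
```

With these toy counts, the conditional probabilities for each grapheme sum to 1 across its phonemes, and the priors sum to 1 across graphemes, which is a useful sanity check on any real table processed this way.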


  1. Allen, J. (1980). Speech synthesis from text. In J. C. Simon (Ed.), Spoken language generation and understanding (pp. 383–396). Dordrecht, The Netherlands: Reidel.

  2. Coltheart, M. (1985). Cognitive neuropsychology and the study of reading. In M. I. Posner & G. S. M. Marin (Eds.), Attention and performance XI (pp. 3–37). Hillsdale, NJ: Erlbaum.

  3. Denes, P. B. (1963). On the statistics of spoken English. The Journal of the Acoustical Society of America, 35, 892–904.

  4. Friendly, M., Franklin, P. E., Hoffman, D., & Rubin, D. C. (1982). The Toronto Word Pool: Norms for imagery, concreteness, orthographic variables, and grammatical usage for 1,080 words. Behavior Research Methods & Instrumentation, 14, 375–399.

  5. Glushko, R. J. (1979). The organization and activation of orthographic knowledge in reading aloud. Journal of Experimental Psychology: Human Perception & Performance, 5, 674–691.

  6. Goodman, R. A., & Caramazza, A. (1986). Phonologically plausible errors: Implications for a model of the phoneme-grapheme conversion mechanism in the spelling process. In G. Augst (Ed.), Proceedings of the International Colloquium on Graphics and Orthography (pp. 300–325). Siegen, W. Germany.

  7. Hanna, P. R., Hanna, J. S., Hodges, R. E., & Rudorf, E. H. (1966). Phoneme-grapheme correspondences as cues to spelling improvement. Washington, DC: U.S. Department of Health, Education, and Welfare.

  8. Henderson, L. (1985). Issues in the modelling of pronunciation assembly in normal reading. In K. E. Patterson, J. C. Marshall, & M. Coltheart (Eds.), Surface dyslexia: Neuropsychological and cognitive studies of phonological reading. London: Erlbaum.

  9. Parkin, A. J. (1982). Phonological recoding in lexical decision: Effects of spelling-to-sound regularity depend on how regularity is defined. Memory & Cognition, 10, 43–53.

  10. Patterson, K. E., & Morton, J. (1985). From orthography to phonology: An attempt at an old interpretation. In K. E. Patterson, J. C. Marshall, & M. Coltheart (Eds.), Surface dyslexia: Neuropsychological and cognitive studies of phonological reading. London: Erlbaum.

  11. Reggia, J. A., & Berndt, R. S. (1986). Modelling reading aloud and its relevance to acquired dyslexia. Computer Methods & Programs in Biomedicine, 22, 13–19.

  12. Seidenberg, M. S., Waters, G. S., Barnes, M. A., & Tanenhaus, M. K. (1984). When does irregular spelling or pronunciation influence word recognition? Journal of Verbal Learning & Verbal Behavior, 23, 383–404.

  13. Venezky, R. L. (1970). The structure of English orthography. The Hague: Mouton.

  14. Webster’s New Collegiate Dictionary (6th ed.). (1961). Springfield, MA: Merriam-Webster.

  15. Webster’s New Collegiate Dictionary (8th ed.). (1981). Springfield, MA: Merriam-Webster.

Author information



Corresponding author

Correspondence to Rita Sloan Berndt.

Additional information

This project was supported by Grant N00014-85-K-0390 from the Office of Naval Research. R. S. Berndt is supported by Grant KO4-NS-00851 from the National Institute of Neurological and Communicative Disorders and Stroke. The authors are grateful to Patricia Grady, Anne Haendiges, Gloria Kimbles, Michael Morris, and Aita Salasoo for their assistance with various aspects of this project, and also to Ruth D. Bass for careful typing of the manuscript and tables.

The idea of using the corpus generated from P. R. Hanna, J. S. Hanna, Hodges, and Rudorf (1966) for the purposes set out in this study was suggested to us by Max Coltheart.

About this article

Cite this article

Berndt, R.S., Reggia, J.A. & Mitchum, C.C. Empirically derived probabilities for grapheme-to-phoneme correspondences in English. Behavior Research Methods, Instruments, & Computers 19, 1–9 (1987).


  • Dene
  • Dyslexia
  • Reverse Probability
  • Letter Cluster
  • Inconsistent Word