Word frequency distributions and lexical semantics

Published in Computers and the Humanities

Abstract

This paper addresses the relation between meaning, lexical productivity, and frequency of use. Using density estimation as a visualization tool, we show that differences in semantic structure can be reflected in probability density functions estimated for word frequency distributions. We call attention to an example of a bimodal density, and suggest that bimodality arises when distributions of well-entrenched lexical items, which appear to be lognormal, are mixed with distributions of productively created nonce formations.
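The mixture account of bimodality can be illustrated with a minimal sketch. This is not the authors' procedure or data: the component parameters, the sample sizes, the normal-reference bandwidth rule, and the helper names (`gaussian_kde`, `count_modes`, `quantile_sample`) are all assumptions chosen for illustration. Log frequencies of entrenched items are modeled as normal (that is, lognormal frequencies), and nonce formations as a tight low-frequency component.

```python
import math
from statistics import NormalDist, stdev


def gaussian_kde(sample, bandwidth):
    """Return a function x -> Gaussian kernel density estimate at x."""
    n = len(sample)
    norm = 1.0 / (n * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        return norm * sum(
            math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in sample
        )

    return density


def count_modes(sample, grid_size=400, min_rel_height=0.05):
    """Count local maxima of the estimated density on an evenly spaced grid.

    The bandwidth uses the normal-reference rule 1.06 * sd * n^(-1/5),
    an assumed default rather than a choice taken from the paper.
    Maxima lower than min_rel_height times the highest peak are ignored.
    """
    n = len(sample)
    h = 1.06 * stdev(sample) * n ** (-0.2)
    density = gaussian_kde(sample, h)
    lo, hi = min(sample) - 3 * h, max(sample) + 3 * h
    xs = [lo + (hi - lo) * i / (grid_size - 1) for i in range(grid_size)]
    ys = [density(x) for x in xs]
    floor = min_rel_height * max(ys)
    return sum(
        1
        for i in range(1, grid_size - 1)
        if ys[i] > ys[i - 1] and ys[i] > ys[i + 1] and ys[i] >= floor
    )


def quantile_sample(mu, sigma, n):
    """Deterministic stand-in for a random sample: n evenly spaced quantiles."""
    dist = NormalDist(mu, sigma)
    return [dist.inv_cdf((i + 0.5) / n) for i in range(n)]


# Hypothetical log-frequency data (all parameters are illustrative assumptions).
entrenched = quantile_sample(4.0, 0.8, 300)  # well-entrenched items
nonce = quantile_sample(0.3, 0.3, 200)       # productively created nonce forms
mixed = entrenched + nonce

print("modes, entrenched only:", count_modes(entrenched))
print("modes, mixture:", count_modes(mixed))
```

With real data, the log-transformed frequencies of an affix's word types would replace the synthetic lists, and the number of modes found will depend on the bandwidth chosen.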

Cite this article

Baayen, R.H., Lieber, R. Word frequency distributions and lexical semantics. Comput Hum 30, 281–291 (1996). https://doi.org/10.1007/BF00115137
