Quantitative Regularities of the Diversity of Lexical Meaning

  • Pranas Zunde
  • Hongyi Zhou


A number of regularities in the use of natural language have been extensively tested and confirmed, such as the rank-frequency distribution (widely known as Zipf’s Law) and the type-token distribution of words. Most of these regularities are based on formal attributes of words (number of word occurrences, number of different lexicographical types of words, etc.). By contrast, very little has been done to investigate potential regularities involving semantic attributes of natural language. The study reported here focuses on the quantitative aspects of the diversity of referential meaning that linguistic elements acquire in the process of overall semantic attribution.

Specifically, the objective was to investigate, for selected languages, the distribution of words and morphemes by the number of their dictionary meanings. The truncated negative binomial, Waring, Yule, Borel, and zeta distributions were selected as the most likely theoretical candidates, and statistical methods were used to evaluate the goodness of fit of these theoretical distributions to the empirical data. Results are presented on the distribution of words by the number of dictionary meanings and on the lexical frequency distribution of morphemes. For English, Spanish, Russian, and Hungarian words and for English morphemes, the empirical frequencies by number of meanings were best fitted by the negative binomial, Waring, and Yule distributions, both across and within the major grammatical categories of words (i.e., nouns, verbs, adjectives).
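The fitting step described above can be illustrated with a minimal sketch for one of the candidate laws, the Yule distribution, taken here in its Yule-Simon form P(k) = ρ·B(k, ρ + 1) for k = 1, 2, … The abstract does not state which estimation method or test statistic the authors used, so maximum likelihood and a Pearson chi-square statistic are assumptions, and the counts below are invented purely for illustration; they are not data from the study.

```python
import math

def yule_pmf(k, rho):
    """Yule-Simon probability of k = 1, 2, ...: rho * B(k, rho + 1)."""
    log_beta = math.lgamma(k) + math.lgamma(rho + 1) - math.lgamma(k + rho + 1)
    return rho * math.exp(log_beta)

def fit_yule(counts, lo=1e-6, hi=1e3, iters=200):
    """Maximum-likelihood rho, where counts[k] = number of words with k meanings.

    The score function times rho equals
        N - sum_k counts[k] * sum_{j=1..k} rho / (rho + j),
    which is strictly decreasing in rho, so bisection finds its single root.
    Assumes at least one word has more than one meaning.
    """
    n = sum(counts.values())

    def g(rho):
        return n - sum(c * sum(rho / (rho + j) for j in range(1, k + 1))
                       for k, c in counts.items())

    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if g(mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def chi_square(counts, rho):
    """Pearson statistic over classes 1..kmax plus one pooled tail class."""
    n = sum(counts.values())
    kmax = max(counts)
    stat, covered = 0.0, 0.0
    for k in range(1, kmax + 1):
        p = yule_pmf(k, rho)
        expected = n * p
        stat += (counts.get(k, 0) - expected) ** 2 / expected
        covered += p
    # Tail class k > kmax: observed count is 0, so it contributes its expectation.
    stat += n * max(0.0, 1.0 - covered)
    return stat

if __name__ == "__main__":
    # Hypothetical counts (number of meanings -> number of words), not study data.
    observed = {1: 5000, 2: 1700, 3: 800, 4: 450, 5: 280, 6: 190}
    rho_hat = fit_yule(observed)
    print("estimated rho:", round(rho_hat, 3))
    print("chi-square:", round(chi_square(observed, rho_hat), 1))
```

The same pattern (a probability function, a parameter estimate, a goodness-of-fit statistic) would apply to the other candidate distributions, each with its own estimation details.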

The results of fitting the frequencies of word associations to theoretical distributions are also described. For a sample of 67 stimulus words, the distributions of word associations were best fitted, with remarkable consistency, by the truncated negative binomial distribution. Potential generalizations and implications of these findings are discussed.
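For reference, a common form of the truncated negative binomial distribution is written out below. The abstract does not specify the parameterization or the truncation point, so this assumes that truncation removes the zero class and uses the symbols k, p, and q = 1 − p as illustrative notation:

```latex
P(X = x) \;=\; \frac{\dbinom{k + x - 1}{x}\, p^{k} q^{x}}{1 - p^{k}},
\qquad x = 1, 2, 3, \ldots, \quad k > 0,\; 0 < p < 1,\; q = 1 - p .
```

The denominator 1 − p^k is the probability of a nonzero outcome under the untruncated law, so dividing by it renormalizes the remaining classes to sum to one.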


Negative Binomial Distribution · Theoretical Distribution · Parameter Estimation Method · Spanish Word · Transitive Verb
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Plenum Press, New York 1990

Authors and Affiliations

  • Pranas Zunde (1)
  • Hongyi Zhou (1)

  1. School of Information and Computer Science, Georgia Institute of Technology, Atlanta, USA
