Abstract
Part IV thus far has examined how statistical universals might contribute to the formation of linguistic units such as words and their values. This chapter will continue to examine these units, especially in terms of the length distribution of words and compounds.
Notes
- 1.
The procedure requires a dictionary to convert a word to a phoneme sequence. In Chap. 11, a text was transformed into phoneme sequences by using such a dictionary, but words that are not in the dictionary cannot easily be transformed.
- 2.
Note, too, that the range of lengths on the horizontal axis is too small for a logarithmic axis to reveal any useful trend.
- 3.
The corresponding graph for a shuffled text is obviously identical to that for the original natural language text.
- 4.
This graph, too, is presented on semilog axes, both because of Miller and Mandelbrot’s theoretical analysis and for the same reason as mentioned for Fig. 13.1.
- 5.
The corpus includes some long hyphenated chunks whose status as “compounds” is sometimes doubtful. Nevertheless, they are included in this analysis because they reflect some of the reality of hyphen usage.
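As an illustration of Notes 1 and 3, the following is a minimal sketch, not the chapter's actual procedure. It assumes a toy pronunciation dictionary (`TOY_PHONEMES`, hypothetical), measures word length in phonemes, sets aside words missing from the dictionary, and assumes that shuffling permutes only the order of words. The helper name `length_distribution` and the sample text are likewise invented for illustration.

```python
import random
from collections import Counter

# Hypothetical toy pronunciation dictionary (word -> phoneme list).
# A real study would load a full pronunciation dictionary instead.
TOY_PHONEMES = {
    "the": ["DH", "AH"],
    "cat": ["K", "AE", "T"],
    "sat": ["S", "AE", "T"],
    "on": ["AA", "N"],
    "mat": ["M", "AE", "T"],
}


def length_distribution(words, phoneme_dict):
    """Count words by length in phonemes.

    Returns a Counter mapping length -> number of word tokens, plus the
    list of words that could not be converted because they are missing
    from the dictionary.
    """
    counts = Counter()
    skipped = []
    for word in words:
        phonemes = phoneme_dict.get(word.lower())
        if phonemes is None:
            # Out-of-dictionary words cannot be transformed (cf. Note 1).
            skipped.append(word)
            continue
        counts[len(phonemes)] += 1
    return counts, skipped


# Invented sample text; "zzyzx" is deliberately absent from the dictionary.
text = "the cat sat on the mat zzyzx".split()
dist, skipped = length_distribution(text, TOY_PHONEMES)

# Shuffle word order with a fixed seed; lengths and counts are unaffected.
shuffled = text[:]
random.Random(0).shuffle(shuffled)
dist_shuffled, _ = length_distribution(shuffled, TOY_PHONEMES)
```

Under these assumptions, `dist` and `dist_shuffled` are equal, since a permutation of word tokens leaves every word's length and frequency unchanged, while `skipped` collects the tokens that the dictionary could not convert.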
References
Bentz, Christian and Ferrer-i-Cancho, Ramon (2016). Zipf’s law of abbreviation as a language universal. In Proceedings of the Leiden Workshop on Capturing Phylogenetic Algorithms for Linguistics.
Kanwal, Jasmeen, Smith, Kenny, Culbertson, Jennifer, and Kirby, Simon (2017). Zipf’s law of abbreviation and the principle of least effort: Language users optimise a miniature lexicon for efficient communication. Cognition, 165, 45–52.
Mandelbrot, Benoit B. (1953). An informational theory of the statistical structure of language. In Proceedings of Symposium of Applications of Communication Theory, pages 486–502.
Miller, George A. (1957). Some effects of intermittent silence. The American Journal of Psychology, 70(2), 311–314.
Piantadosi, Steven T., Tily, Harry, and Gibson, Edward (2011). Word lengths are optimized for efficient communication. Proceedings of the National Academy of Sciences, 108(9), 3526–3529.
Zipf, George K. (1949). Human Behavior and the Principle of Least Effort: An Introduction to Human Ecology. Addison-Wesley Press.
Copyright information
© 2021 The Author(s)
About this chapter
Cite this chapter
Tanaka-Ishii, K. (2021). Size and Frequency. In: Statistical Universals of Language. Mathematics in Mind. Springer, Cham. https://doi.org/10.1007/978-3-030-59377-3_13
DOI: https://doi.org/10.1007/978-3-030-59377-3_13
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-59376-6
Online ISBN: 978-3-030-59377-3
eBook Packages: Mathematics and Statistics, Mathematics and Statistics (R0)