I discuss the ubiquity of power law distributions in language organisation (and elsewhere), and argue against Miller’s (The mating mind: How sexual choice shaped the evolution of human nature, William Heinemann, London, 2000) claim that large vocabulary size is a consequence of sexual selection. Instead, I argue that power law distributions are evidence that languages are best modelled as dynamical systems, but that they raise some issues for models of iterated language learning.
There have been many attempts to model ‘Zipf Curves’ more accurately, beginning with Mandelbrot (1953) and Simon (1955) and continuing more recently with Church and Gale (1995), who use mixtures of Poisson distributions to model word and n-gram distributions for applications such as information retrieval and speech recognition. I ignore these here as they are not relevant to the specific goals of this paper.
Such effects can be monitored, for example, using the ‘top 20’ on-line dictionary queries published by Cambridge University Press, http://www.dictionary.cambridge.org/top20/top20_0205.asp.
Albert R, Barabasi A (2002) Statistical mechanics of complex networks. Rev Mod Phys 74:47–97
Baayen H (1991) A stochastic process for word frequency distributions. In: Proceedings of the assoc for computational linguistics, Morgan Kaufmann, Menlo Park, CA, pp 271–278
Baayen H (2001) Word frequency distributions. Kluwer, Dordrecht
Bak P (1996) How nature works: the science of self-organized criticality. Copernicus Press, New York
Bornholdt S, Ebel H (2001) World Wide Web scaling exponent from Simon's 1955 model. Phys Rev E 64:035104
Briscoe EJ (2000) Evolutionary perspectives on diachronic syntax. In: Pintzuk S, Tsoulas G, Warner A (eds) Diachronic syntax: models and mechanisms. Oxford University Press, Oxford, pp 75–108
Briscoe EJ, Copestake AA, Lascarides A (1995) Blocking. In: Saint-Dizier P, Viegas E (eds) Computational lexical semantics. Cambridge University Press, Cambridge, pp 273–302
Buttery P, Korhonen A (2005) Large-scale analysis of verb subcategorization differences between child directed speech and adult speech. In: Proceedings of the interdisciplinary workshop on the identification and representation of verb features and verb classes, Saarland University
Church K, Gale W (1995) Poisson mixtures. Nat Lang Eng 1:1–36
Clark E (2003) First language acquisition. Cambridge University Press, Cambridge
Conwell E, Demuth K (2007) Early syntactic productivity: evidence from dative shift. Cognition 103:163–179
Copestake AA, Briscoe EJ (1995) Regular polysemy and semi-productive sense extension. J Semant 12:15–67
Diamond J (1997) Guns, germs and steel: the fate of human societies. Random House, New York
Ferrer i Cancho R, Sole R (2001) The small world of human language. Proc R Soc B Biol Sci 268(1482):2261–2265
Guiraud H (1954) Les Caractères Statistiques du Vocabulaire. Presses Universitaires de France, Paris
Kirby S (2001) Spontaneous evolution of linguistic structure: an iterated learning model of the emergence of regularity and irregularity. IEEE Trans Evol Comput 5(2):102–110
Korhonen A (2002) Subcategorization acquisition. Computer Laboratory, University of Cambridge, Technical Report UCAM-CL-TR-530
Korhonen A, Krymolowski Y, Briscoe EJ (2006) A large subcategorization lexicon for natural language processing applications. In: Proceedings of the 5th international conference on language resources and evaluation (LREC06), Genova, Italy
Mandelbrot B (1953) An informational theory of the statistical structure of language. In: Jackson W (ed) Communication theory. Butterworths, London
Manning C, Schutze H (1999) Foundations of statistical natural language processing. MIT Press, Cambridge
Miller G (2000) The mating mind: how sexual choice shaped the evolution of human nature. William Heinemann, London
Preiss J, Korhonen A, Briscoe EJ (2002) Subcategorization acquisition as an evaluation method for WSD. In: Proceedings of the language resources and evaluation conference (LREC02), Morgan Kaufmann, Menlo Park, CA, pp 1551–1556
Sampson G (2001) Empirical linguistics. Continuum, London
Schulze C, Stauffer D (2006) Recent developments in computer simulations of language competition. Comput Sci Eng 8:86–93
Sharman R (1989) Observational evidence for a statistical model of language, IBM UKSC Report 205
Simon H (1955) On a class of skew distribution functions. Biometrika 42:435–440
Wichmann S (2005) On the power law distribution of language family sizes. J Linguist 41:117–131
Yook S, Jeong H, Barabasi A-L, Tu Y (2001) Weighted evolving networks. Phys Rev Lett 86:5835–5838
Zipf G (1935) The psycho-biology of language: an introduction to dynamic philology. Houghton-Mifflin, New York
Zipf G (1949) Human behavior and the principle of least effort. Addison-Wesley, Cambridge
I would like to thank the anonymous reviewers for their helpful comments, and Paula Buttery and Anna Korhonen for analysis and plots of data from the Valex lexicon.
About this article
Cite this article
Briscoe, T. Language learning, power laws, and sexual selection. Mind & Society 7, 65–76 (2008). https://doi.org/10.1007/s11299-007-0040-8
- Zipf curve
- Iterated learning model
- Small world distribution
- Evolutionary linguistics
- Diathesis alternation