Identifying Trends in Word Frequency Dynamics
The word-stock of a language is a complex dynamical system in which words can be created, evolve, and become extinct. Even more dynamic are the short-term fluctuations in word usage by individuals in a population. Building on the recent demonstration that word niche is a strong determinant of future rise or fall in word frequency, here we introduce a model that allows us to distinguish persistent from temporary increases in frequency. Our model is illustrated using a 10^8-word database from an online discussion group and a 10^11-word collection of digitized books. The model reveals a strong relation between changes in word dissemination and changes in frequency. Aside from their implications for short-term word frequency dynamics, these observations are potentially important for language evolution, as new words must survive in the short term in order to survive in the long term.
Keywords: Word dynamics, Fluctuations, Statistical model, Internet communities
We thank Janet Pierrehumbert for discussions during preliminary stages of the project. This work was supported by the Northwestern University Institute on Complex Systems (E.G.A.), the Max Planck Institute for the Physics of Complex Systems (E.G.A.), and a Sloan Research Fellowship (A.E.M.).