International Journal of Theoretical Physics, Volume 35, Issue 2, pp 395–417

Zipf's law and the effect of ranking on probability distributions

  • R. Günther
  • L. Levitin
  • B. Schapiro
  • P. Wagner
Article

Abstract

Ranking procedures are widely used in the description of many different types of complex systems. Zipf's law is one of the most remarkable frequency-rank relationships and has been observed independently in physics, linguistics, biology, demography, etc. We show that ranking plays a crucial role in making it possible to detect empirical relationships in systems that exist in one realization only, even when the statistical ensemble to which the systems belong has a very broad probability distribution. Analytical results and numerical simulations are presented which clarify the relations between the probability distributions and the behavior of expected values for unranked and ranked random variables. This analysis is performed, in particular, for the evolutionary model presented in our previous papers which leads to Zipf's law and reveals the underlying mechanism of this phenomenon in terms of a system with interdependent and interacting components as opposed to the “ideal gas” models suggested by previous researchers. The ranking procedure applied to this model leads to a new, unexpected phenomenon: a characteristic “staircase” behavior of the mean values of the ranked variables (ranked occupation numbers). This result is due to the broadness of the probability distributions for the occupation numbers and does not follow from the “ideal gas” model. Thus, it provides an opportunity, by comparison with empirical data, to obtain evidence as to which model relates to reality.
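The effect of ranking on expected values can be illustrated with a small numerical sketch. The code below is not the evolutionary model of the paper: it assumes, purely for illustration, independent Pareto-distributed "occupation numbers" (the function name `simulate` and the parameters `num_realizations`, `num_components`, `alpha`, and `seed` are hypothetical). It compares the mean of each component taken in its original position (unranked) with the mean of each rank position after sorting every realization in decreasing order (ranked).

```python
import random
import statistics


def simulate(num_realizations=2000, num_components=20, alpha=1.5, seed=0):
    """Compare unranked and ranked means for a broad distribution.

    Assumption: components are i.i.d. Pareto(alpha) variables. This is an
    illustrative stand-in, not the model analysed in the paper.
    """
    rng = random.Random(seed)
    unranked = [[] for _ in range(num_components)]
    ranked = [[] for _ in range(num_components)]

    for _ in range(num_realizations):
        # One realization of the system: a value for every component.
        sample = [rng.paretovariate(alpha) for _ in range(num_components)]
        # Ranking: sort the same realization in decreasing order.
        ordered = sorted(sample, reverse=True)
        for i in range(num_components):
            unranked[i].append(sample[i])
            ranked[i].append(ordered[i])

    # Mean over realizations, per component index and per rank position.
    unranked_means = [statistics.fmean(values) for values in unranked]
    ranked_means = [statistics.fmean(values) for values in ranked]
    return unranked_means, ranked_means


if __name__ == "__main__":
    unranked_means, ranked_means = simulate()
    for position, (u, r) in enumerate(zip(unranked_means, ranked_means), 1):
        print(f"{position:3d}  unranked mean = {u:10.3f}  ranked mean = {r:10.3f}")
```

Under these assumptions every unranked component shares the same distribution, so its mean carries no information about position, whereas the ranked means decrease with rank by construction. Sorting only reorders values within a realization, so both sets of means sum to the same total.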

References

  1. Auerbach, F. (1913). Das Gesetz der Bevölkerungskonzentration [The law of population concentration], Petermanns Mitteilungen, 59, 74.
  2. Borodovsky, M. Yu., and Gusein-Zade, S. M. (1989). A general rule for ranked series of codon frequencies in different genomes, Journal of Biomolecular Structure and Dynamics, 6, 1001.
  3. Brokes, B. C. (1982). Qualitative analysis in the humanities: The advantage of ranking techniques, in Studies on Zipf's Law, H. Guiter and M. V. Arapov, eds., Studienverlag Dr. N. Brockmeyer, Bochum, Germany.
  4. Frankhauser, P. (1991). The Pareto-Zipf-distribution of urban systems as stochastic process, in Models of Selforganization in Complex Systems, W. Ebeling, M. Peschel, and W. Weidlich, eds., Akademie Verlag, Berlin.
  5. Guiter, H., and Arapov, M. V., eds. (1982). Studies on Zipf's Law, Studienverlag Dr. N. Brockmeyer, Bochum, Germany.
  6. Günther, R., and Wagner, P. (1995). Analysis of real texts, NRI Internal Report.
  7. Günther, R., Schapiro, B., and Wagner, P. (1992). Physical complexity and Zipf's law, International Journal of Theoretical Physics, 31, 525–543.
  8. Günther, R., Schapiro, B., and Wagner, P. (1993). Critical specific complexity: Recent results, NMI Internal Report.
  9. Katsikas, A. A., and Nicolis, J. S. (1990). Chaotic dynamics of generating Markov partitions and linguistic sequences mimicking Zipf's law, Nuovo Cimento, 12D, 177.
  10. Kohonen, T. (1982). Analysis of a simple self-organizing process, Biological Cybernetics, 44, 135–140.
  11. Levitin, L. B., and Schapiro, B. (1993). Zipf's law and information complexity in an evolutionary system, in Proceedings IEEE International Symposium on Information Theory, San Antonio, Texas, p. 76.
  12. Li, W. (1992). Random texts exhibit Zipf's-law-like word frequency distributions, IEEE Transactions on Information Theory, 38, 1842.
  13. Mandelbrot, B. B. (1953). An information theory of the statistical structure of language, in Communication Theory, W. Jackson, ed., London, pp. 486–502.
  14. Mandelbrot, B. B. (1983). The Fractal Geometry of Nature, Freeman, New York.
  15. Nicolis, J. S., and Tsuda, I. (1989). On the parallel between Zipf's law and 1/f process in chaotic systems possessing coexisting attractors, Progress of Theoretical Physics, 82, 254–274.
  16. Orlov, J. K. (1982). Ein Modell der Häufigkeitsstruktur des Vokabulars, in Studies on Zipf's Law, H. Guiter and M. V. Arapov, eds., Studienverlag Dr. N. Brockmeyer, Bochum, Germany.
  17. Pareto, V. (1897). Cours d'Économie Politique, Lausanne and Paris [reprinted in Oeuvres Complètes, Droz, Geneva].
  18. Schapiro, B. (1994). An approach to the physics of complexity, Chaos, Solitons and Fractals, 4, 115–123.
  19. Sharman, R. A. (1989). Observational evidence for a statistical model of language, IBM UKSC Report 205, September 1989.
  20. Shreider, Yu. A. (1967). Theoretical derivation of text statistical features, Problemy Peredachi Informatsii, 3, 57–63.
  21. Willis, J. C. (1922). Age and Area, Cambridge University Press, Cambridge.
  22. Zipf, G. K. (1935). The Psychobiology of Language, Houghton-Mifflin, Boston.
  23. Zipf, G. K. (1949). Human Behavior and the Principle of Least Effort, Addison-Wesley, Cambridge, Massachusetts.

Copyright information

© Plenum Publishing Corporation 1996

Authors and Affiliations

  • R. Günther (1)
  • L. Levitin (1, 2)
  • B. Schapiro (1)
  • P. Wagner (1, 3)
  1. Naturwissenschaftliches und Medizinisches Institut (NMI), Reutlingen, Germany
  2. College of Engineering, Boston University, Boston
  3. Zentrum für Paralleles Rechnen, Mathematisches Institut der Universität zu Köln, Köln, Germany
