Semantic Space as a Metapopulation System: Modelling the Wikipedia Information Flow Network

  • Chapter
Towards a Theoretical Framework for Analyzing Complex Linguistic Networks

Abstract

The meaning of a word can be defined as an indefinite set of interpretants: other words that circumscribe the semantic content of the word they represent (Derrida 1982). Each interpretant, in turn, has its own set of interpretants, and so on. The resulting indefinite chain of meaning takes a rhizomatic shape that can be represented and analysed with the techniques of network theory (Dorogovtsev and Mendes 2013).

References

  1. Albert, R., Barabási, A.-L.: Statistical mechanics of complex networks. Reviews of Modern Physics 74(1), 47 (2002)

  2. Amancio, D.R., Oliveira Jr., O.N., da F. Costa, L.: Unveiling the relationship between complex networks metrics and word senses. EPL (Europhysics Letters) 98(1), 18002 (2012)

  3. Amancio, D.R., Oliveira Jr., O.N., da Fontoura Costa, L.: Identification of literary movements using complex networks to represent texts. New Journal of Physics 14(4), 043029 (2012)

  4. Balcan, D., Kabakçıoğlu, A., Mungan, M., Erzan, A.: The information coded in the yeast response elements accounts for most of the topological properties of its transcriptional regulation network. PLoS One 2(6), e501 (2007)

  5. Balloux, F., Lugon-Moulin, N.: The estimation of population differentiation with microsatellite markers. Molecular Ecology 11(2), 155–165 (2002)

  6. Baronchelli, A., Gong, T., Puglisi, A., Loreto, V.: Modeling the emergence of universality in color naming patterns. Proceedings of the National Academy of Sciences 107(6), 2403–2407 (2010)

  7. Barwise, J.: Information flow: the logic of distributed systems. Cambridge University Press (1997)

  8. Bastian, M., Heymann, S., Jacomy, M., et al.: Gephi: an open source software for exploring and manipulating networks. In: ICWSM, vol. 8, pp. 361–362 (2009)

  9. Bergmann, S., Ihmels, J., Barkai, N.: Similarities and differences in genome-wide expression data of six organisms. PLoS Biology 2(1), e9 (2003)

  10. Bizer, C., Lehmann, J., Kobilarov, G., Auer, S., Becker, C., Cyganiak, R., Hellmann, S.: DBpedia - A crystallization point for the Web of Data. Web Semantics: Science, Services and Agents on the World Wide Web 7(3), 154–165 (2009)

  11. Borge-Holthoefer, J., Arenas, A.: Categorizing words through semantic memory navigation. The European Physical Journal B-Condensed Matter and Complex Systems 74(2), 265–270 (2010)

  12. Briët, J., Harremoës, P.: Properties of classical and quantum Jensen-Shannon divergence. Physical Review A 79(5), 052311 (2009)

  13. Capocci, A., Servedio, V.D.P., Colaiori, F., Buriol, L.S., Donato, D., Leonardi, S., Caldarelli, G.: Preferential attachment in the growth of social networks: The internet encyclopedia Wikipedia. Physical Review E 74(3), 036116 (2006)

  14. Crooks, A.T.: Constructing and implementing an agent-based model of residential segregation through vector GIS. International Journal of Geographical Information Science 24(5), 661–675 (2010)

  15. Deleuze, G., Guattari, F.: Rhizom, vol. 67. Merve (1977)

  16. Deleuze, G., Guattari, F.: A thousand plateaus: Capitalism and schizophrenia. Bloomsbury Publishing (1988)

  17. Derrida, J.: Margins of philosophy. University of Chicago Press (1982)

  18. Dorogovtsev, S.N., Mendes, J.F.F.: Evolution of networks: From biological nets to the Internet and WWW. Oxford University Press (2013)

  19. Dorogovtsev, S.N., Mendes, J.F.F.: Language as an evolving word web. Proceedings of the Royal Society of London. Series B: Biological Sciences 268(1485), 2603–2606 (2001)

  20. Duncan, O.D., Duncan, B.: A methodological analysis of segregation indexes. American Sociological Review, 210–217 (1955)

  21. Eco, U.: Semiotics and the Philosophy of Language, vol. 398. Indiana University Press (1986)

  22. Ferrer i Cancho, R., Solé, R.V.: Two Regimes in the Frequency of Words and the Origins of Complex Lexicons: Zipf’s Law Revisited. Journal of Quantitative Linguistics 8(3), 165–173 (2001)

  23. Fitch, W.T.: Linguistics: an invisible hand. Nature 449(7163), 665–667 (2007)

  24. Fuchs, V.R.: A note on sex segregation in professional occupations. Explorations in Economic Research 2(1), 105–111 (1975)

  25. Gerlach, M., Altmann, E.G.: Stochastic model for the vocabulary growth in natural languages. Physical Review X 3(2), 021006 (2013)

  26. Grosse, I., Bernaola-Galván, P., Carpena, P., Román-Roldán, R., Oliver, J., Eugene Stanley, H.: Analysis of symbolic sequences using the Jensen-Shannon divergence. Physical Review E 65(4), 041905 (2002)

  27. Heaps, H.S.: Information retrieval: Computational and theoretical aspects. Academic Press, Inc. (1978)

  28. Hopper, P.J., Traugott, E.C.: Grammaticalization. Cambridge University Press (2003)

  29. Hutchens, R.: One Measure of Segregation. International Economic Review 45(2), 555–578 (2004)

  30. de Jesus Holanda, A., Pisa, I.T., Kinouchi, O., Martinez, A.S., Ruiz, E.E.S.: Thesaurus as a complex network. Physica A: Statistical Mechanics and its Applications 344(3), 530–536 (2004)

  31. Kim, J., Krapivsky, P.L., Kahng, B., Redner, S.: Infinite-order percolation and giant fluctuations in a protein interaction network. Physical Review E 66(5), 055101 (2002)

  32. van Leijenhorst, D.C., Van der Weide, T.P.: A formal derivation of Heaps’ Law. Information Sciences 170(2), 263–272 (2005)

  33. Lieberman, E., Michel, J.-B., Jackson, J., Tang, T., Nowak, M.A.: Quantifying the evolutionary dynamics of language. Nature 449(7163), 713–716 (2007)

  34. Lin, J.: Divergence measures based on the Shannon entropy. IEEE Transactions on Information Theory 37(1), 145–151 (1991)

  35. Macdonald, P.J., Almaas, E., Barabási, A.-L.: Minimum spanning trees of weighted scale-free networks. EPL (Europhysics Letters) 72(2), 308 (2005)

  36. Masucci, A.P., Kalampokis, A., Eguíluz, V.M., Hernández-García, E.: Wikipedia information flow analysis reveals the scale-free architecture of the semantic space. PLoS One 6(2), e17333 (2011a)

  37. Masucci, A.P.: Formal versus self-organised knowledge systems: A network approach. Physica A: Statistical Mechanics and its Applications 390(23), 4652–4659 (2011)

  38. Masucci, A.P., Kalampokis, A., Eguíluz, V.M., Hernández-García, E.: Extracting directed information flow networks: an application to genetics and semantics. Physical Review E 83(2), 026103 (2011b)

  39. Masucci, A.P., Rodgers, G.J.: Network properties of written human language. Physical Review E 74(2), 026102 (2006)

  40. Masucci, A.P., Rodgers, G.J.: Multi-directed Eulerian growing networks. Physica A: Statistical Mechanics and its Applications 386(1), 557–563 (2007)

  41. Menczer, F.: Growing and navigating the small world web by local content. Proceedings of the National Academy of Sciences 99(22), 14014–14019 (2002)

  42. Montemurro, M.A., Zanette, D.H.: Towards the quantification of the semantic information encoded in written language. Advances in Complex Systems 13(02), 135–153 (2010)

  43. Mora, R., Ruiz-Castillo, J.: Additively decomposable segregation indexes. The case of gender segregation by occupations and human capital levels in Spain. The Journal of Economic Inequality 1(2), 147–179 (2003)

  44. Muchnik, L., Itzhack, R., Solomon, S., Louzoun, Y.: Self-emergence of knowledge trees: Extraction of the Wikipedia hierarchies. Physical Review E 76(1), 016106 (2007)

  45. Mungan, M., Kabakçıoğlu, A., Balcan, D., Erzan, A.: Analytical solution of a stochastic content-based network model. Journal of Physics A: Mathematical and General 38(44), 9599 (2005)

  46. Navigli, R.: Word sense disambiguation: A survey. ACM Computing Surveys (CSUR) 41(2), 10 (2009)

  47. Petersen, A.M., Tenenbaum, J.N., Havlin, S., Eugene Stanley, H., Perc, M.: Languages cool as they expand: Allometric scaling and the decreasing need for new words. Scientific Reports 2 (2012)

  48. Prim, R.C.: Shortest connection networks and some generalizations. Bell System Technical Journal 36(6), 1389–1401 (1957)

  49. Ramasco, J.J., Mungan, M.: Inversion method for content-based networks. Physical Review E 77(3), 036122 (2008)

  50. Ratner, N.B., Gleason, J.B., Narasimhan, B.: An introduction to psycholinguistics: what do language users know. In: Gleason, J.B., Ratner, N.B. (eds.) Psycholinguistics. Harcourt Brace College, Philadelphia (1999)

  51. Rényi, A.: On measures of entropy and information. In: Fourth Berkeley Symposium on Mathematical Statistics and Probability, pp. 547–561 (1961)

  52. Salton, G.: Automatic Text Processing: The Transformation, Analysis, and Retrieval of Information by Computer. Addison-Wesley (1989)

  53. Samsonovic, A.V., Ascoli, G.A.: Principal semantic components of language and the measurement of meaning. PLoS One 5(6), e10921 (2010)

  54. Schelling, T.C.: Models of segregation. The American Economic Review, 488–493 (1969)

  55. Serrano, M.A., Flammini, A., Menczer, F.: Modeling statistical properties of written text. PLoS One 4(4), e5372 (2009)

  56. Shannon, C.E.: A mathematical theory of communication. ACM SIGMOBILE Mobile Computing and Communications Review 5(1), 3–55 (2001)

  57. Sigman, M., Cecchi, G.A.: Global organization of the Wordnet lexicon. Proceedings of the National Academy of Sciences 99(3), 1742–1747 (2002)

  58. Simon, H.A.: On a class of skew distribution functions. Biometrika, 425–440 (1955)

  59. Sinatra, R., Condorelli, D., Latora, V.: Networks of motifs from sequences of symbols. Physical Review Letters 105(17), 178702 (2010)

  60. Skyrms, B.: Signals: Evolution, learning, and information. Oxford University Press (2010)

  61. Stauffer, D., Aharony, A.: Introduction to percolation theory. Taylor and Francis (1991)

  62. Steyvers, M., Tenenbaum, J.B.: The Large-Scale Structure of Semantic Networks: Statistical Analyses and a Model of Semantic Growth. Cognitive Science 29(1), 41–78 (2005)

  63. Theil, H., Finizza, A.J.: A note on the measurement of racial integration of schools by means of informational concepts (1971)

  64. Violi, P.: Meaning and experience. Indiana University Press (2001)

  65. Watts, D.J., Strogatz, S.H.: Collective dynamics of ‘small-world’ networks. Nature 393(6684), 440–442 (1998)

  66. Zanette, D., Montemurro, M.: Dynamics of text generation with realistic Zipf’s distribution. Journal of Quantitative Linguistics 12(1), 29–40 (2005)

  67. Zipf, G.K.: Human behavior and the principle of least effort. Addison-Wesley Press (1949)

Author information

Corresponding author

Correspondence to A. Paolo Masucci.

Copyright information

© 2016 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Masucci, A.P., Kalampokis, A., Eguíluz, V.M., Hernández-García, E. (2016). Semantic Space as a Metapopulation System: Modelling the Wikipedia Information Flow Network. In: Mehler, A., Lücking, A., Banisch, S., Blanchard, P., Job, B. (eds) Towards a Theoretical Framework for Analyzing Complex Linguistic Networks. Understanding Complex Systems. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-47238-5_6

  • DOI: https://doi.org/10.1007/978-3-662-47238-5_6

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-662-47237-8

  • Online ISBN: 978-3-662-47238-5

  • eBook Packages: Engineering, Engineering (R0)
