Semantic Space as a Metapopulation System: Modelling the Wikipedia Information Flow Network

  • A. Paolo Masucci
  • Alkiviadis Kalampokis
  • Víctor M. Eguíluz
  • Emilio Hernández-García
Part of the Understanding Complex Systems book series (UCS)

Abstract

The meaning of a word can be defined as an indefinite set of interpretants, that is, other words that circumscribe the semantic content of the word they represent (Derrida 1982). In the same way, each interpretant has its own set of interpretants representing it, and so on. Hence the indefinite chain of meaning assumes a rhizomatic shape that can be represented and analysed with the techniques of network theory (Dorogovtsev and Mendes 2013).
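
To make the chain of interpretants concrete, here is a minimal Python sketch of the idea as a directed graph. It is an illustration only, not the authors' method: the toy lexicon `INTERPRETANTS` and the helper names `chain_of_meaning` and `in_degree` are hypothetical and are not taken from the chapter.

```python
from collections import deque

# Hypothetical toy lexicon (an assumption for illustration):
# each word points to the words that act as its interpretants.
INTERPRETANTS = {
    "river": ["water", "flow"],
    "water": ["liquid", "river"],
    "flow": ["water", "movement"],
    "liquid": ["water"],
    "movement": ["flow"],
}


def chain_of_meaning(word, lexicon, max_depth):
    """Follow interpretants breadth-first, up to max_depth steps from word.

    Returns a dict mapping each reached word to its distance from the start.
    """
    depth = {word: 0}
    queue = deque([word])
    while queue:
        current = queue.popleft()
        if depth[current] == max_depth:
            continue  # do not expand beyond the requested depth
        for interpretant in lexicon.get(current, []):
            if interpretant not in depth:  # chains may loop back on themselves
                depth[interpretant] = depth[current] + 1
                queue.append(interpretant)
    return depth


def in_degree(lexicon):
    """Count how many words use each word as an interpretant."""
    counts = {word: 0 for word in lexicon}
    for interpretants in lexicon.values():
        for interpretant in interpretants:
            counts[interpretant] = counts.get(interpretant, 0) + 1
    return counts


if __name__ == "__main__":
    print(chain_of_meaning("river", INTERPRETANTS, max_depth=2))
    print(in_degree(INTERPRETANTS))
```

Because interpretants point back to words already visited, the chain is a graph with cycles rather than a tree, which is why the sketch records visited words instead of recursing without bound.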

Keywords

Degree Distribution · Percolation Threshold · Minimum Span Tree · Shannon Entropy · Semantic Space
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

References

  1. Albert, R., Barabási, A.-L.: Statistical mechanics of complex networks. Reviews of Modern Physics 74(1), 47 (2002)
  2. Amancio, D.R., Oliveira Jr., O.N., da F. Costa, L.: Unveiling the relationship between complex networks metrics and word senses. EPL (Europhysics Letters) 98(1), 18002 (2012)
  3. Amancio, D.R., Oliveira Jr., O.N., da Fontoura Costa, L.: Identification of literary movements using complex networks to represent texts. New Journal of Physics 14(4), 043029 (2012)
  4. Balcan, D., Kabakçıoğlu, A., Mungan, M., Erzan, A.: The information coded in the yeast response elements accounts for most of the topological properties of its transcriptional regulation network. PLoS One 2(6), e501 (2007)
  5. Balloux, F., Lugon-Moulin, N.: The estimation of population differentiation with microsatellite markers. Molecular Ecology 11(2), 155–165 (2002)
  6. Baronchelli, A., Gong, T., Puglisi, A., Loreto, V.: Modeling the emergence of universality in color naming patterns. Proceedings of the National Academy of Sciences 107(6), 2403–2407 (2010)
  7. Barwise, J.: Information flow: the logic of distributed systems. Cambridge University Press (1997)
  8. Bastian, M., Heymann, S., Jacomy, M., et al.: Gephi: an open source software for exploring and manipulating networks. In: ICWSM, vol. 8, pp. 361–362 (2009)
  9. Bergmann, S., Ihmels, J., Barkai, N.: Similarities and differences in genome-wide expression data of six organisms. PLoS Biology 2(1), e9 (2003)
  10. Bizer, C., Lehmann, J., Kobilarov, G., Auer, S., Becker, C., Cyganiak, R., Hellmann, S.: DBpedia-A crystallization point for the Web of Data. Web Semantics: Science, Services and Agents on the World Wide Web 7(3), 154–165 (2009)
  11. Borge-Holthoefer, J., Arenas, A.: Categorizing words through semantic memory navigation. The European Physical Journal B-Condensed Matter and Complex Systems 74(2), 265–270 (2010)
  12. Briët, J., Harremoës, P.: Properties of classical and quantum Jensen-Shannon divergence. Physical Review A 79(5), 052311 (2009)
  13. Capocci, A., Servedio, V.D.P., Colaiori, F., Buriol, L.S., Donato, D., Leonardi, S., Caldarelli, G.: Preferential attachment in the growth of social networks: The internet encyclopedia Wikipedia. Physical Review E 74(3), 036116 (2006)
  14. Crooks, A.T.: Constructing and implementing an agent-based model of residential segregation through vector GIS. International Journal of Geographical Information Science 24(5), 661–675 (2010)
  15. Deleuze, G., Guattari, F.: Rhizom, vol. 67. Merve (1977)
  16. Deleuze, G., Guattari, F.: A thousand plateaus: Capitalism and schizophrenia. Bloomsbury Publishing (1988)
  17. Derrida, J.: Margins of philosophy. University of Chicago Press (1982)
  18. Dorogovtsev, S.N., Mendes, J.F.F.: Evolution of networks: From biological nets to the Internet and WWW. Oxford University Press (2013)
  19. Dorogovtsev, S.N., Mendes, J.F.F.: Language as an evolving word web. Proceedings of the Royal Society of London. Series B: Biological Sciences 268(1485), 2603–2606 (2001)
  20. Duncan, O.D., Duncan, B.: A methodological analysis of segregation indexes. American Sociological Review, 210–217 (1955)
  21. Eco, U.: Semiotics and the Philosophy of Language, vol. 398. Indiana University Press (1986)
  22. Ferrer i Cancho, R., Solé, R.V.: Two Regimes in the Frequency of Words and the Origins of Complex Lexicons: Zipf’s Law Revisited. Journal of Quantitative Linguistics 8(3), 165–173 (2001)
  23. Fitch, W.T.: Linguistics: an invisible hand. Nature 449(7163), 665–667 (2007)
  24. Fuchs, V.R.: A note on sex segregation in professional occupations. Explorations in Economic Research 2(1), 105–111 (1975)
  25. Gerlach, M., Altmann, E.G.: Stochastic model for the vocabulary growth in natural languages. Physical Review X 3(2), 021006 (2013)
  26. Grosse, I., Bernaola-Galván, P., Carpena, P., Román-Roldán, R., Oliver, J., Stanley, H.E.: Analysis of symbolic sequences using the Jensen-Shannon divergence. Physical Review E 65(4), 041905 (2002)
  27. Heaps, H.S.: Information retrieval: Computational and theoretical aspects. Academic Press, Inc. (1978)
  28. Hopper, P.J., Traugott, E.C.: Grammaticalization. Cambridge University Press (2003)
  29. Hutchens, R.: One Measure of Segregation. International Economic Review 45(2), 555–578 (2004)
  30. de Jesus Holanda, A., Pisa, I.T., Kinouchi, O., Martinez, A.S., Ruiz, E.E.S.: Thesaurus as a complex network. Physica A: Statistical Mechanics and its Applications 344(3), 530–536 (2004)
  31. Kim, J., Krapivsky, P.L., Kahng, B., Redner, S.: Infinite-order percolation and giant fluctuations in a protein interaction network. Physical Review E 66(5), 055101 (2002)
  32. van Leijenhorst, D.C., Van der Weide, T.P.: A formal derivation of Heaps’ Law. Information Sciences 170(2), 263–272 (2005)
  33. Lieberman, E., Michel, J.-B., Jackson, J., Tang, T., Nowak, M.A.: Quantifying the evolutionary dynamics of language. Nature 449(7163), 713–716 (2007)
  34. Lin, J.: Divergence measures based on the Shannon entropy. IEEE Transactions on Information Theory 37(1), 145–151 (1991)
  35. Macdonald, P.J., Almaas, E., Barabási, A.-L.: Minimum spanning trees of weighted scale-free networks. EPL (Europhysics Letters) 72(2), 308 (2005)
  36. Masucci, A.P., Kalampokis, A., Eguíluz, V.M., Hernández-García, E.: Wikipedia information flow analysis reveals the scale-free architecture of the semantic space. PLoS One 6(2), e17333 (2011a)
  37. Masucci, A.P.: Formal versus self-organised knowledge systems: A network approach. Physica A: Statistical Mechanics and its Applications 390(23), 4652–4659 (2011)
  38. Masucci, A.P., Kalampokis, A., Eguíluz, V.M., Hernández-García, E.: Extracting directed information flow networks: an application to genetics and semantics. Physical Review E 83(2), 026103 (2011b)
  39. Masucci, A.P., Rodgers, G.J.: Network properties of written human language. Physical Review E 74(2), 026102 (2006)
  40. Masucci, A.P., Rodgers, G.J.: Multi-directed Eulerian growing networks. Physica A: Statistical Mechanics and its Applications 386(1), 557–563 (2007)
  41. Menczer, F.: Growing and navigating the small world web by local content. Proceedings of the National Academy of Sciences 99(22), 14014–14019 (2002)
  42. Montemurro, M.A., Zanette, D.H.: Towards the quantification of the semantic information encoded in written language. Advances in Complex Systems 13(02), 135–153 (2010)
  43. Mora, R., Ruiz-Castillo, J.: Additively decomposable segregation indexes. The case of gender segregation by occupations and human capital levels in Spain. The Journal of Economic Inequality 1(2), 147–179 (2003)
  44. Muchnik, L., Itzhack, R., Solomon, S., Louzoun, Y.: Self-emergence of knowledge trees: Extraction of the Wikipedia hierarchies. Physical Review E 76(1), 016106 (2007)
  45. Mungan, M., Kabakçıoğlu, A., Balcan, D., Erzan, A.: Analytical solution of a stochastic content-based network model. Journal of Physics A: Mathematical and General 38(44), 9599 (2005)
  46. Navigli, R.: Word sense disambiguation: A survey. ACM Computing Surveys (CSUR) 41(2), 10 (2009)
  47. Petersen, A.M., Tenenbaum, J.N., Havlin, S., Stanley, H.E., Perc, M.: Languages cool as they expand: Allometric scaling and the decreasing need for new words. Scientific Reports 2 (2012)
  48. Prim, R.C.: Shortest connection networks and some generalizations. Bell System Technical Journal 36(6), 1389–1401 (1957)
  49. Ramasco, J.J., Mungan, M.: Inversion method for content-based networks. Physical Review E 77(3), 036122 (2008)
  50. Ratner, N.B., Gleason, J.B., Narasimhan, B.: An introduction to psycholinguistics: what do language users know. In: Gleason, J.B., Ratner, N.B. (eds.) Psycholinguistics. Harcourt Brace College, Philadelphia (1999)
  51. Rényi, A.: On measures of entropy and information. In: Fourth Berkeley Symposium on Mathematical Statistics and Probability, pp. 547–561 (1961)
  52. Salton, G.: Automatic Text Processing: The Transformation, Analysis, and Retrieval of Information by Computer. Addison-Wesley (1989)
  53. Samsonovich, A.V., Ascoli, G.A.: Principal semantic components of language and the measurement of meaning. PLoS One 5(6), e10921 (2010)
  54. Schelling, T.C.: Models of segregation. The American Economic Review, 488–493 (1969)
  55. Serrano, M.A., Flammini, A., Menczer, F.: Modeling statistical properties of written text. PLoS One 4(4), e5372 (2009)
  56. Shannon, C.E.: A mathematical theory of communication. ACM SIGMOBILE Mobile Computing and Communications Review 5(1), 3–55 (2001)
  57. Sigman, M., Cecchi, G.A.: Global organization of the Wordnet lexicon. Proceedings of the National Academy of Sciences 99(3), 1742–1747 (2002)
  58. Simon, H.A.: On a class of skew distribution functions. Biometrika, 425–440 (1955)
  59. Sinatra, R., Condorelli, D., Latora, V.: Networks of motifs from sequences of symbols. Physical Review Letters 105(17), 178702 (2010)
  60. Skyrms, B.: Signals: Evolution, learning, and information. Oxford University Press (2010)
  61. Stauffer, D., Aharony, A.: Introduction to percolation theory. Taylor and Francis (1991)
  62. Steyvers, M., Tenenbaum, J.B.: The Large-Scale Structure of Semantic Networks: Statistical Analyses and a Model of Semantic Growth. Cognitive Science 29(1), 41–78 (2005)
  63. Theil, H., Finizza, A.J.: A note on the measurement of racial integration of schools by means of informational concepts (1971)
  64. Violi, P.: Meaning and experience. Indiana University Press (2001)
  65. Watts, D.J., Strogatz, S.H.: Collective dynamics of ‘small-world’ networks. Nature 393(6684), 440–442 (1998)
  66. Zanette, D., Montemurro, M.: Dynamics of text generation with realistic Zipf’s distribution. Journal of Quantitative Linguistics 12(1), 29–40 (2005)
  67. Zipf, G.K.: Human behavior and the principle of least effort. Addison-Wesley Press (1949)

Copyright information

© Springer-Verlag Berlin Heidelberg 2016

Authors and Affiliations

  • A. Paolo Masucci (1, email author)
  • Alkiviadis Kalampokis (2)
  • Víctor M. Eguíluz (3)
  • Emilio Hernández-García (3)
  1. Centre for Advanced Spatial Analysis, University College London, London, UK
  2. Dipartimento di Scienze e Tecnologie, Università degli Studi di Napoli “Parthenope”, Napoli, Italy
  3. Instituto de Física Interdisciplinar y Sistemas Complejos IFISC (CSIC-UIB), Palma de Mallorca, Spain
