The Structure and Dynamics of Linguistic Networks

  • Monojit Choudhury
  • Animesh Mukherjee
Part of the Modeling and Simulation in Science, Engineering and Technology book series (MSSET)

Human beings are unique in the biological world: they are the only organisms known to be capable of thinking, communicating and preserving a potentially infinite number of ideas, which form the pillars of modern civilization. This ability is a consequence of complex and powerful human languages, characterized by their recursive syntax and compositional semantics [40]. It has been argued that language is a dynamic complex adaptive system that has evolved through self-organization to serve human communication needs [80]. The complexity of human languages has long attracted the attention of physicists, who have tried to explain several linguistic phenomena through models of physical systems (see, e.g., [32, 42]).

Like any physical system, a linguistic system (i.e., a language) can be viewed from three different perspectives [52]. At one extreme, a language is a collection of utterances produced by the speakers of a linguistic community in the course of their interactions with other speakers of the same community. This is analogous to the microscopic view of a thermodynamic system, where every utterance and its corresponding context contributes to the identity of the language, i.e., the grammar. At the other extreme, a language can be characterized by a set of grammar rules and a vocabulary. This is analogous to the macroscopic view. Between these two extremes, one can also conceive of a mesoscopic view of language, where linguistic entities such as letters, words or phrases are the basic units and the grammar is an emergent property of the interactions among them.
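
As an illustration of the mesoscopic view, the sketch below treats words as nodes and links two words whenever they occur next to each other in an utterance. It is only a minimal sketch under stated assumptions: the toy corpus, the adjacency-based notion of "interaction", and the names `build_cooccurrence_network` and `degree` are hypothetical choices made for illustration and are not taken from the chapter.

```python
from collections import defaultdict


def build_cooccurrence_network(utterances):
    """Return an undirected word network as a dict: word -> set of neighbours.

    Assumption (illustrative only): two words "interact" when they are
    adjacent in at least one utterance.
    """
    neighbours = defaultdict(set)
    for utterance in utterances:
        words = utterance.lower().split()
        # Pair every word with the word that immediately follows it.
        for left, right in zip(words, words[1:]):
            if left != right:  # ignore self-loops such as "very very"
                neighbours[left].add(right)
                neighbours[right].add(left)
    return dict(neighbours)


def degree(network, word):
    """Number of distinct neighbours of `word` (0 if the word is absent)."""
    return len(network.get(word, set()))


# Hypothetical toy corpus standing in for the utterances of a community.
toy_corpus = [
    "the dog chased the cat",
    "the cat saw the bird",
]

network = build_cooccurrence_network(toy_corpus)
for word in sorted(network):
    print(word, degree(network, word), sorted(network[word]))
```

Even in this toy corpus, the function word "the" ends up with more neighbours than any content word. Node degree is one of the simplest network-level properties that can be read off such a graph, and it arises from the pairwise interactions rather than being specified in advance.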


Keywords: Information Retrieval, Degree Distribution, Natural Language Processing, Cluster Coefficient, Preferential Attachment


References

  1. M. E. Adilson, A. P. S. de Moura, Y. C. Lai, and P. Dasgupta. Topology of the conceptual network of language. Physical Review E, 65(065102):1–4, 2002.
  2. A. Agarwal, S. Chakrabarti, and S. Aggarwal. Learning to rank networked entities. In Proceedings of KDD, 2006.
  3. A. Akmajian. Linguistics. An introduction to Language and Communication. MIT Press, Cambridge, MA, 1995.
  4. A. Albright and B. Hayes. Rules vs. analogy in English past tenses: A computational/experimental study. Cognition, 90:119–161, 2003.
  5. A.-L. Barabási and R. Albert. Emergence of scaling in random networks. Science, 286:509–512, 1999.
  6. C. Biemann. Chinese whispers - an efficient graph clustering algorithm and its application to natural language processing problems. In Proceedings of TextGraphs: the Second Workshop on Graph Based Methods for Natural Language Processing, pages 73–80, New York, NY, June 2006. Association for Computational Linguistics.
  7. C. Biemann. Unsupervised part-of-speech tagging employing efficient graph clustering. In Proceedings of the COLING/ACL 2006 Student Research Workshop, pages 7–12, Sydney, Australia, July 2006. Association for Computational Linguistics.
  8. C. Biemann, I. Matveeva, R. Mihalcea, and D. Radev, editors. Proceedings of the Second Workshop on TextGraphs: Graph-Based Algorithms for Natural Language Processing. Association for Computational Linguistics, Rochester, NY, 2007.
  9. S. Brin and L. Page. The anatomy of a large-scale hypertextual Web search engine. CNIS, 30(1–7):107–117, 1998.
  10. N. Chomsky. The Minimalist Program. MIT Press, Cambridge, MA, 1995.
  11. M. Choudhury, M. Thomas, A. Mukherjee, A. Basu, and N. Ganguly. How difficult is it to develop a perfect spell-checker? A cross-linguistic analysis through complex network approach. In Proceedings of the Second Workshop on TextGraphs: Graph-Based Algorithms for Natural Language Processing, pages 81–88, Rochester, NY, 2007. Association for Computational Linguistics.
  12. A. Clark. Inducing syntactic categories by context distribution clustering. In C. Cardie, W. Daelemans, C. Nédellec, and E. T. K. Sang, editors, Proceedings of the Fourth Conference on Computational Natural Language Learning and of the Second Learning Language in Logic Workshop, Lisbon, 2000, pages 91–94. Association for Computational Linguistics, Somerset, NJ, 2000.
  13. A. M. Collins and M. R. Quillian. Retrieval time from semantic memory. Journal of Verbal Learning and Verbal Memory, 8:240–247, 1969.
  14. W. Croft. Typology and Universals. Cambridge University Press, Cambridge, MA, 1990.
  15. B. de Boer. Self-organisation in vowel systems. Journal of Phonetics, 28(4):441–465, 2000.
  16. I. S. Dhillon, S. Mallela, and D. S. Modha. Information-theoretic co-clustering. In Proceedings of The Ninth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD-2003), pages 89–98, 2003.
  17. W. B. Dolan, L. Vanderwende, and S. Richardson. Automatically deriving structured knowledge base from on-line dictionaries. In Proceedings of the Pacific Association for Computational Linguistics, 1993.
  18. S. N. Dorogovtsev and J. F. F. Mendes. Language as an evolving word Web. Proceedings of the Royal Society of London B, 268(1485):2603–2606, December 22, 2001.
  19. G. Erkan and D. Radev. LexRank: Graph-based lexical centrality as salience in text summarization. JAIR, 22:457–479, December 4, 2004.
  20. C. Fellbaum. WordNet, an Electronic Lexical Database for English. MIT Press, Cambridge, MA, 1998.
  21. R. Ferrer-i-Cancho. The structure of syntactic dependency networks: insights from recent advances in network theory. In G. Altmann, V. Levickij, and V. Perebyinis, editors, Problems of Quantitative Linguistics, pages 60–75. Ruta, Chernivtsi, 2005.
  22. R. Ferrer-i-Cancho. Why do syntactic links not cross? Europhysics Letters, 76:1228–1235, 2006.
  23. R. Ferrer-i-Cancho, A. Capocci, and G. Caldarelli. Spectral methods cluster words of the same class in a syntactic dependency network. International Journal of Bifurcation and Chaos, 17(7), 2007.
  24. R. Ferrer-i-Cancho and R. V. Solé. The small world of human language. Proceedings of The Royal Society of London. Series B, Biological Sciences, 268(1482):2261–2265, November 2001.
  25. R. Ferrer-i-Cancho and R. V. Solé. Two regimes in the frequency of words and the origin of complex lexicons: Zipf's law revisited. Journal of Quantitative Linguistics, 8:165–173, 2001.
  26. R. Ferrer-i-Cancho and R. V. Solé. Patterns in syntactic dependency networks. Physical Review E, 69(051915), 2004.
  27. S. Finch and N. Chater. Bootstrapping syntactic categories using statistical methods. In Background and Experiments in Machine Learning of Natural Language: Proceedings of the 1st SHOE Workshop, pages 229–235. Katholieke Universiteit, Brabant, Holland, 1992.
  28. D. Freitag. Toward unsupervised whole-corpus tagging. In COLING '04: Proceedings of the 20th International Conference on Computational Linguistics, page 357, Morristown, NJ, 2004. Association for Computational Linguistics.
  29. M. Galley and K. McKeown. Improving word sense disambiguation in lexical chaining. In Proceedings of IJCAI, 2003.
  30. M. Gamon. Graph-based text representation for novelty detection. In Proceedings of the Workshop on TextGraphs at HLT-NAACL, pages 17–24, 2006.
  31. S. Gauch and R. Futrelle. Experiments in Automatic Word Class and Word Sense Identification for Information Retrieval. In Proceedings of the 3rd Annual Symposium on Document Analysis and Information Retrieval, pages 425–434, Las Vegas, NV, April 1994.
  32. M. Gell-Mann. Language and complexity. In J. W. Minett and W. S.-Y. Wang, editors, Language Acquisition, Change and Emergence: Essays in Evolutionary Linguistics. City University of Hong Kong Press, July 2005.
  33. D. Gibson, J. M. Kleinberg, and P. Raghavan. Inferring Web communities from link topology. In Proceedings of the Ninth ACM Conference on Hypertext and Hypermedia, pages 225–234, 1998.
  34. A. B. Goldberg and J. Zhu. Seeing stars when there aren’t many stars: Graph-based semi-supervised learning for sentiment categorization. In HLT-NAACL 2006 Workshop on Textgraphs: Graph-based Algorithms for Natural Language Processing, 2006.
  35. J. H. Greenberg and J. J. Jenkins. Studies in the psychological correlates of the sound system of American English. Word, 20:157–177, 1964.
  36. T. M. Gruenenfelder and D. B. Pisoni. Modeling the mental lexicon as a complex system: Some preliminary results using graph theoretic measures. In Research on Spoken Language Processing Progress Report No. 27, Bloomington, Indiana University, 27–47, 2005.
  37. Z. Gyöngyi, H. Garcia-Molina, and J. Pedersen. Combating Web spam with TrustRank. In Proceedings of VLDB, pages 576–587, 2004.
  38. A. D. Haghighi, A. Y. Ng, and C. D. Manning. Robust textual inference via graph matching. In HLT '05: Proceedings of the Conference on Human Language Technology and Empirical Methods in Natural Language Processing, pages 387–394, Morristown, NJ, 2005. Association for Computational Linguistics.
  39. Z. S. Harris. Mathematical Structures of Language. Wiley, New York, 1968.
  40. M. D. Hauser, N. Chomsky, and W. T. Fitch. The faculty of language: What is it, who has it, and how did it evolve? Science, 298:1569–1579, 2002.
  41. R. Ferrer-i-Cancho, A. Mehler, O. Pustylnikov, and A. Díaz-Guilera. Correlations in the organization of large-scale syntactic dependency networks. In TextGraphs-2: Graph-Based Algorithms for Natural Language Processing, pages 65–72. Association for Computational Linguistics, 2007.
  42. Y. Itoh and S. Ueda. The Ising model for changes in word ordering rules in natural languages. Physica D: Nonlinear Phenomena, 198(3–4):333–339, 2004.
  43. J. Jannink and G. Wiederhold. Thesaurus entry extraction from an on-line dictionary. In Proceedings of Fusion, 1999.
  44. B. Jedynak and D. Karakos. Unigram language models using diffusion smoothing over graphs. In Proceedings of the Second Workshop on TextGraphs: Graph-Based Algorithms for Natural Language Processing, pages 33–36, Rochester, NY, 2007. Association for Computational Linguistics.
  45. V. Kapatsinski. Sound similarity relations in the mental lexicon: Modeling the lexicon as a complex network. Speech Research Lab Progress Report, Indiana University, Bloomington, IN, 2006.
  46. V. Kapustin and A. Jamsen. Vertex degree distribution for the graph of word co-occurrences in Russian. In Proceedings of the Second Workshop on TextGraphs: Graph-Based Algorithms for Natural Language Processing, pages 89–92, Rochester, NY, 2007. Association for Computational Linguistics.
  47. J. Ke, M. Ogura, and W. S.-Y. Wang. Optimization models of sound systems using genetic algorithms. Computational Linguistics, 29(1):1–18, 2003.
  48. J. M. Kleinberg. Authoritative sources in a hyperlinked environment. Journal of ACM, 46, 1999.
  49. R. Kumar, J. Novak, P. Raghavan, and A. Tomkins. Structure and evolution of blogspace. Communications of the ACM, 47(12):35–39, 2004.
  50. M. Lesk. Automatic sense disambiguation using machine readable dictionaries: How to tell a pine cone from an ice cream cone. In Proceedings of SIGDOC, 1986.
  51. J. Liljencrants and B. Lindblom. Numerical simulation of vowel quality systems: the role of perceptual contrast. Language, 48:839–862, 1972.
  52. H. Liljenstrom. Micro Meso Macro: Addressing Complex Systems Couplings. World Scientific Publishing, Singapore, 2005.
  53. P. A. Luce and D. B. Pisoni. Recognizing spoken words: The neighborhood activation model. Ear and Hearing, 19:1–36, 1998.
  54. I. Maddieson. Patterns of Sounds. Cambridge University Press, Cambridge, 1984.
  55. W. Marslen-Wilson. Activation, competition, and frequency in lexical access. In G. T. M. Altmann, editor, Cognitive Models of Speech Processing: Psycholinguistic and Computational Perspectives, pages 148–173. MIT Press, Cambridge, MA, 1990.
  56. R. McDonald, F. Pereira, K. Ribarov, and J. Hajič. Non-projective dependency parsing using spanning tree algorithms. In HLT '05: Proceedings of the conference on Human Language Technology and Empirical Methods in Natural Language Processing, pages 523–530, Morristown, NJ, 2005. Association for Computational Linguistics.
  57. R. Mihalcea. Graph-based ranking algorithms for large vocabulary word sense disambiguation. In Proceedings of HLT-EMNLP, 2005.
  58. R. Mihalcea and D. Radev. Graph-based algorithms for information retrieval and natural language processing. Tutorial at HLT/NAACL 2006, 2006.
  59. R. Mihalcea and D. Radev, editors. Proceedings of the Second Workshop on TextGraphs: Graph-Based Algorithms for Natural Language Processing. Association for Computational Linguistics, 2006.
  60. R. Mihalcea and P. Tarau. TextRank: Bringing order into texts. In Proceedings of EMNLP, 2004.
  61. R. Mihalcea, P. Tarau, and E. Figa. PageRank on semantic networks with applications to word sense disambiguation. In Proceedings of COLING, 2004.
  62. G. A. Miller and W. G. Charles. Contextual correlates of semantic similarity. Language and Cognitive Processes, 6(1):1–28, 1991.
  63. G. A. Miller and P. M. Gildea. How children learn words. Scientific American, 257(3):86–91, 1987.
  64. A. Mukherjee, M. Choudhury, A. Basu, and N. Ganguly. Modeling the co-occurrence principles of the consonant inventories: A complex network approach. International Journal of Modern Physics C, 18(2):281–295, 2007.
  65. A. Mukherjee, M. Choudhury, A. Basu, and N. Ganguly. Self-organization of sound inventories: Analysis and synthesis of the occurrence and co-occurrence networks of consonants. Journal of Quantitative Linguistics,
  66. J. Nath, M. Choudhury, A. Mukherjee, C. Biemann, and N. Ganguly. Unsupervised parts-of-speech induction for Bengali. In Proceedings of the Sixth International Language Resources and Evaluation Conference (LREC), 2008.
  67. D. Nettle. Using social impact theory to simulate language change. Lingua, 108:95–117, 1999.
  68. H. G. Nusbaum, D. B. Pisoni, and C. K. Davis. Sizing up the Hoosier mental lexicon: Measuring the familiarity of 20,000 words, Indiana University. Research on Speech Perception Progress Report No. 10, pages 357–376, 1984.
  69. B. Pang and L. Lee. A sentimental education: Sentiment analysis using subjectivity summarization based on minimum cuts. In Proceedings of the 42nd Meeting of the Association for Computational Linguistics (ACL '04), Main Volume, pages 271–278, Barcelona, Spain, July 2004.
  70. S. Pinker. The Language Instinct: How the Mind Creates Language. HarperCollins, New York, 1994.
  71. S. Pinker and A. Price. On language and connectionism: Analysis of a parallel distributed processing model of language acquisition. Cognition, 28:195–247, 1988.
  72. R. Rapp. A practical solution to the problem of automatic part-of-speech induction from text. In Conference Companion Volume of the 43rd Annual Meeting of the Association for Computational Linguistics (ACL-05), Ann Arbor, MI, 2005.
  73. M. Richardson, A. Prakash, and E. Brill. Beyond PageRank: Machine learning for static ranking. In Proceedings of WWW, pages 707–715, 2006.
  74. H. Schütze. Part-of-speech induction from scratch. In Proceedings of the 31st Annual Meeting on Association for Computational Linguistics, pages 251–258, Morristown, NJ, 1993. Association for Computational Linguistics.
  75. H. Schütze. Distributional part-of-speech tagging. In Proceedings of the 7th Conference on European Chapter of the Association for Computational Linguistics, pages 141–148, San Francisco, CA, 1995. Morgan Kaufmann Publishers Inc.
  76. J.-L. Schwartz, L.-J. Boë, N. Vallée, and C. Abry. The dispersion-focalization theory of vowel systems. Journal of Phonetics, 25:255–286, 1997.
  77. M. Sigman and G. A. Cecchi. Global organization of the wordnet lexicon. Proceedings of the National Academy of Science, 99(3):1742–1747, 2002.
  78. M. M. Soares, G. Corso, and L. S. Lucena. The network of syllables in Portuguese. Physica A: Statistical Mechanics and its Applications, 355(2–4):678–684, 2005.
  79. Z. Solan, D. Horn, E. Ruppin, and S. Edelman. Unsupervised learning of natural languages. Proceedings of National Academy of Sciences, 102(33):11629–11634, 2005.
  80. L. Steels. Language as a complex adaptive system. In Proceedings of PPSN VI, pages 17–26, 2000.
  81. D. Steriade. Knowledge of similarity and narrow lexical override. BLS, 29:583–598, 2004.
  82. M. Steyvers and J. B. Tenenbaum. The large-scale structure of semantic networks: Statistical analyses and a model of semantic growth. Cognitive Science, 29(1):41–78, 2005.
  83. M. Tamariz. Exploring the Adaptive Structure of the Mental Lexicon. Ph.D. thesis, Department of Theoretical and Applied Linguistics, University of Edinburgh, Scotland, 2005.
  84. K. Toutanova, C. D. Manning, and A. Y. Ng. Learning random walk models for inducing word dependency distributions. In ICML '04: Proceedings of the Twenty-First International Conference on Machine Learning, page 103, New York, NY, 2004.
  85. J. Véronis. HyperLex: Lexical cartography for information retrieval. Computer Speech and Language, 18(3):223–252, 2004.
  86. M. S. Vitevitch. Phonological neighbors in a small world (network): What can graph theory tell us about the mental lexicon? Departmental Colloquy co-sponsored by the Linguistics and Psychology Departments, Rice University, January 27, 2006.
  87. D. Widdows and B. Dorow. A graph model for unsupervised lexical acquisition. In Proceedings of COLING, 2002.

Copyright information

© Birkhäuser Boston, a part of Springer Science+Business Media, LLC 2009

Authors and Affiliations

  1. Microsoft Research India, Bangalore, India
  2. Department of Computer Science and Engineering, Indian Institute of Technology, Kharagpur, India
