Network Differences between Normal and Shuffled Texts: Case of Croatian

  • Domagoj Margan
  • Sanda Martinčić-Ipšić
  • Ana Meštrović
Part of the Studies in Computational Intelligence book series (SCI, volume 549)

Abstract

This paper is an initial attempt to study the properties of Croatian word order using complex networks. We present network properties of normal and shuffled Croatian texts for different co-occurrence window sizes and different linkage boundaries. The network analysis shows that shuffling a text decreases the network diameter, because shuffling establishes links that did not previously exist. This indicates that syntax plays a significant role in Croatian, even though it is a mostly free word-order language.
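
The comparison described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that a co-occurrence window of size n links each word to the n words that follow it, that a "linkage boundary" means a sentence delimiter that links may or may not cross, and that shuffling permutes all words of the text while keeping sentence lengths. The function names `cooccurrence_network`, `shuffle_text`, and `diameter` are hypothetical and use only the Python standard library.

```python
import random
from collections import deque


def cooccurrence_network(sentences, window=2, cross_boundaries=False):
    """Build an undirected co-occurrence graph as {word: set of neighbours}.

    Assumption: each word is linked to the `window` words that follow it.
    If cross_boundaries is False, links never cross a sentence boundary.
    """
    if cross_boundaries:
        # Treat the whole text as one long token sequence.
        sentences = [[w for s in sentences for w in s]]
    graph = {}
    for words in sentences:
        for i, w in enumerate(words):
            graph.setdefault(w, set())
            for v in words[i + 1:i + 1 + window]:
                if v != w:  # skip self-loops
                    graph[w].add(v)
                    graph.setdefault(v, set()).add(w)
    return graph


def shuffle_text(sentences, seed=0):
    """Permute all words of the text, keeping the original sentence lengths.

    Assumption: this is one plausible shuffling scheme, chosen for illustration.
    """
    rng = random.Random(seed)
    words = [w for s in sentences for w in s]
    rng.shuffle(words)
    shuffled, pos = [], 0
    for s in sentences:
        shuffled.append(words[pos:pos + len(s)])
        pos += len(s)
    return shuffled


def eccentricity(graph, source):
    """Largest breadth-first-search distance from `source` within its component."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in graph[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return max(dist.values())


def diameter(graph):
    """Longest shortest path, measured inside each connected component."""
    return max((eccentricity(graph, n) for n in graph), default=0)


if __name__ == "__main__":
    # Toy placeholder tokens, not data from the paper.
    text = [["w1", "w2", "w3", "w4"], ["w3", "w5", "w6"], ["w6", "w7", "w1"]]
    for label, sents in (("normal", text), ("shuffled", shuffle_text(text))):
        g = cooccurrence_network(sents, window=1)
        print(label, "diameter:", diameter(g))
```

On a real corpus, the same two calls would be repeated for each window size and boundary setting, and the diameters of the normal and shuffled networks compared. A toy text this small is not expected to reproduce the reported effect.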

Keywords

complex networks, linguistic co-occurrence networks, Croatian corpus, shuffled text, randomized text

Copyright information

© Springer International Publishing Switzerland 2014

Authors and Affiliations

  • Domagoj Margan (1)
  • Sanda Martinčić-Ipšić (1)
  • Ana Meštrović (1)
  1. Department of Informatics, University of Rijeka, Rijeka, Croatia
