Text Type Differentiation Based on the Structural Properties of Language Networks

  • Conference paper
Information and Software Technologies (ICIST 2016)

Abstract

In this paper, co-occurrence language network measures from literature and legal texts are compared on the global and the local scale. Our dataset consists of four legal texts and four short novellas, all written in English. For each text we construct one directed and weighted network, where the weight of a link between two nodes represents the overall co-occurrence frequency of the corresponding words. We choose four literature-law pairs of texts with approximately the same number of different words for comparison. The aim of this experiment is to investigate how complex network measures operate in different structures of texts and which of them are sensitive to different text types. Our results show that on the global scale only average strength exhibits some uniform behaviour, due to the differences in textual complexity. In general, global measures may not be well suited to discriminate between the mentioned genres of texts. However, from the local perspective, rank plots of in- and out-selectivity (average node strength) indicate that there are more noticeable structural differences between legal texts and literature.
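
The construction described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes that co-occurrence means one word directly following another (a window of one), skips any preprocessing, and reads selectivity as node strength divided by node degree. The function names and the sample sentence are hypothetical and chosen only for illustration.

```python
from collections import Counter, defaultdict


def build_cooccurrence_network(tokens):
    """Directed, weighted edges: (w1, w2) -> how often w2 directly follows w1.

    Assumption: a co-occurrence window of one word; the abstract does not
    state the window size.
    """
    return Counter(zip(tokens, tokens[1:]))


def strengths_and_degrees(edges):
    """Return in/out strength (sum of link weights) and in/out degree per node."""
    in_s, out_s = defaultdict(int), defaultdict(int)
    in_d, out_d = defaultdict(int), defaultdict(int)
    for (source, target), weight in edges.items():
        out_s[source] += weight
        out_d[source] += 1
        in_s[target] += weight
        in_d[target] += 1
    return in_s, out_s, in_d, out_d


def selectivity(strength, degree):
    """Assumed definition: strength / degree, i.e. average weight per link."""
    return {node: strength[node] / degree[node] for node in degree}


def rank_values(values):
    """Values sorted in descending order, as they would appear in a rank plot."""
    return sorted(values.values(), reverse=True)


# Hypothetical toy text; real input would be a full tokenized document.
tokens = "the court finds that the court may order the party".split()
edges = build_cooccurrence_network(tokens)
in_s, out_s, in_d, out_d = strengths_and_degrees(edges)
in_sel = selectivity(in_s, in_d)
out_sel = selectivity(out_s, out_d)

# Global measure: total link weight divided by the number of distinct words.
nodes = set(tokens)
average_strength = sum(edges.values()) / len(nodes)

print(rank_values(in_sel))
print(rank_values(out_sel))
print(average_strength)
```

Comparing the sorted in- and out-selectivity lists of two texts with similar vocabulary size corresponds to the local-scale comparison the abstract mentions, while `average_strength` is one example of a global measure.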

Corresponding author

Correspondence to Sanda Martinčić-Ipšić.

Copyright information

© 2016 Springer International Publishing Switzerland

About this paper

Cite this paper

Martinčić-Ipšić, S., Miličić, T., Meštrović, A. (2016). Text Type Differentiation Based on the Structural Properties of Language Networks. In: Dregvaite, G., Damasevicius, R. (eds) Information and Software Technologies. ICIST 2016. Communications in Computer and Information Science, vol 639. Springer, Cham. https://doi.org/10.1007/978-3-319-46254-7_43

  • DOI: https://doi.org/10.1007/978-3-319-46254-7_43

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-46253-0

  • Online ISBN: 978-3-319-46254-7

  • eBook Packages: Computer Science, Computer Science (R0)
