CICLing 2012: Computational Linguistics and Intelligent Text Processing, pp 142-153
Core-Periphery Organization of Graphemes in Written Sequences: Decreasing Positional Rigidity with Increasing Core Order
Abstract
This paper analyzes the positional rigidity of graphemes (as well as words considered as single units) in written sequences using complex network methodology. In particular, information about the adjacent co-occurrence of graphemes in a corpus is used to construct a network whose nodes represent the distinct signs used. The core-periphery structure of this network is uncovered using the k-core decomposition technique, suitably generalized for directed networks. This allows identification of a core signary or “graphem-ome” of the corresponding writing system, i.e., the group of frequently co-occurring graphemes. The distribution of the frequency with which such signs occur at different positions in a sequence (e.g., at the beginning, at the end, or in the middle) shows that while signs belonging to the periphery often occur only at specific positions, those in the innermost cores may occur at many different positions. This is quantified with a positional entropy measure that shows a systematic increase with core order for the different databases used in this study (corpora of English, Chinese, and Sumerian sentences, as well as a database of Indus civilization inscriptions).
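The pipeline described in the abstract can be illustrated with a minimal sketch. Several choices here are assumptions, not details taken from the paper: the directed generalization of k-core decomposition is taken to be pruning by in-degree, self-loops are ignored, positions are binned into three classes (beginning, middle, end), and the entropy is Shannon entropy in bits. The function names `build_network`, `in_core_order`, and `positional_entropy` are hypothetical.

```python
import math
from collections import Counter, defaultdict


def build_network(sequences):
    """Directed edge a -> b whenever sign b immediately follows sign a.

    Self-loops (a followed by a) are ignored; this is an assumption.
    """
    succ, pred = {}, {}
    for seq in sequences:
        for s in seq:
            succ.setdefault(s, set())
            pred.setdefault(s, set())
        for a, b in zip(seq, seq[1:]):
            if a != b:
                succ[a].add(b)
                pred[b].add(a)
    return succ, pred


def in_core_order(succ, pred):
    """Core order of each node under in-degree pruning (assumed variant)."""
    indeg = {v: len(p) for v, p in pred.items()}
    remaining = set(indeg)
    order = {}
    k = 0
    while remaining:
        low = [v for v in remaining if indeg[v] <= k]
        if not low:
            k += 1
            continue
        # Peel every node whose in-degree has dropped to k or below.
        while low:
            v = low.pop()
            if v not in remaining:
                continue
            remaining.discard(v)
            order[v] = k
            for w in succ[v]:
                if w in remaining:
                    indeg[w] -= 1
                    if indeg[w] <= k:
                        low.append(w)
    return order


def positional_entropy(sequences):
    """Shannon entropy (bits) of each sign over begin/middle/end positions."""
    counts = defaultdict(Counter)
    for seq in sequences:
        last = len(seq) - 1
        for i, s in enumerate(seq):
            if i == 0:
                pos = "begin"
            elif i == last:
                pos = "end"
            else:
                pos = "middle"
            counts[s][pos] += 1
    entropy = {}
    for s, c in counts.items():
        total = sum(c.values())
        entropy[s] = -sum((n / total) * math.log2(n / total) for n in c.values())
    return entropy


if __name__ == "__main__":
    # Toy sequences of signs, invented purely for illustration.
    toy = [list("xab"), list("axb"), list("abx"), list("yab")]
    succ, pred = build_network(toy)
    print(in_core_order(succ, pred))
    print(positional_entropy(toy))
```

In this toy input, a sign confined to one position gets zero entropy, while a sign that appears at the beginning, middle, and end gets the maximum of log2(3) bits. Comparing the mean entropy of signs grouped by core order would then give a rough analogue of the trend the abstract reports; the binning scheme and the pruning criterion strongly affect such a comparison and would need to match the paper's definitions.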
Keywords
linguistic networks, core-periphery organization, positional entropy, core signary