Abstract
We study the persistent homology of a data set of syntactic parameters of world languages. We show that, while homology generators behave erratically over the whole data set, non-trivial persistent homology appears when one restricts to specific language families, and that different families exhibit different persistent homology. We focus on the Indo-European and the Niger–Congo families, for which we compare persistent homology over different cluster filtering values. The persistent components appear to correspond to linguistic subfamilies, while the meaning, in historical linguistic terms, of persistent generators of the first homology is less clear. We investigate the possible significance of the persistent first homology generator that we find in the Indo-European family. We show that it is not due to the Anglo-Norman bridge (which is a lexical rather than a syntactic phenomenon), but is related instead to the position of Ancient Greek and the Hellenic branch within the Indo-European phylogenetic network.
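To make the notion of persistent components concrete, the following minimal sketch computes the zeroth persistent homology (the scales at which connected components merge) of a Vietoris–Rips filtration on binary vectors. It is an illustration under stated assumptions, not the pipeline used in the paper: the Hamming metric, the union-find implementation, and the names `hamming`, `h0_deaths`, `components_at` and `toy` are all hypothetical choices, and the five vectors are made-up toy data rather than syntactic parameters from any database. First homology is not computed here.

```python
from itertools import combinations


def hamming(u, v):
    """Number of coordinates in which two equal-length binary vectors differ."""
    return sum(a != b for a, b in zip(u, v))


def h0_deaths(points):
    """Scales at which connected components merge in the Vietoris-Rips
    filtration of `points` under the Hamming metric (assumed metric).

    Every point is born at scale 0. The returned list holds the death
    scale of each finite H0 bar; one further component never dies.
    """
    parent = list(range(len(points)))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Process edges in order of increasing distance.
    edges = sorted(
        (hamming(points[i], points[j]), i, j)
        for i, j in combinations(range(len(points)), 2)
    )
    deaths = []
    for dist, i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:  # this edge joins two components, so one H0 class dies
            parent[ri] = rj
            deaths.append(dist)
    return deaths


def components_at(deaths, scale):
    """Number of connected components once all edges of length <= scale are in."""
    return 1 + sum(d > scale for d in deaths)


# Hypothetical toy data: two tight groups of binary vectors (not real languages).
toy = [
    (1, 1, 1, 0, 0, 0, 0, 0),
    (1, 1, 1, 1, 0, 0, 0, 0),
    (1, 1, 0, 0, 0, 0, 0, 0),
    (0, 0, 0, 0, 1, 1, 1, 1),
    (0, 0, 0, 1, 1, 1, 1, 1),
]
deaths = h0_deaths(toy)
```

In this toy example three components die at scale 1 and a fourth survives until scale 6, so two components persist over the range of scales from 1 up to 6. Long-lived components of this kind are what the abstract relates to linguistic subfamilies.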
Acknowledgements
This work was performed within the activities of the last author’s Mathematical and Computational Linguistics lab and CS101/Ma191 class at Caltech. The last author was partially supported by NSF Grants DMS-1007207, DMS-1201512, DMS-1707882, and PHY-1205440.
About this article
Cite this article
Port, A., Gheorghita, I., Guth, D. et al. Persistent Topology of Syntax. Math. Comput. Sci. 12, 33–50 (2018). https://doi.org/10.1007/s11786-017-0329-x
DOI: https://doi.org/10.1007/s11786-017-0329-x
Keywords
- Linguistics
- Syntax
- Persistent homology
Mathematics Subject Classification
- 91F20
- 55U10
- 68P05