
Persistent Topology of Syntax


We study the persistent homology of a data set of syntactic parameters of world languages. We show that, while homology generators behave erratically over the whole data set, non-trivial persistent homology appears when one restricts to specific language families. Different families exhibit different persistent homology. We focus on the Indo-European and Niger–Congo families, for which we compare persistent homology over different cluster filtering values. The persistent components appear to correspond to linguistic subfamilies, while the historical-linguistic meaning of persistent generators of the first homology is less clear. We investigate the possible significance of the persistent first homology generator that we find in the Indo-European family. We show that it is not due to the Anglo-Norman bridge (which is a lexical, not a syntactic, phenomenon), but is instead related to the position of Ancient Greek and the Hellenic branch within the Indo-European phylogenetic network.
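The idea that persistent connected components can track subfamilies can be illustrated with a minimal sketch. It assumes each language is encoded as a binary vector of syntactic parameters, compares vectors by Hamming distance, and computes only the zeroth persistent homology (connected components) of the Vietoris-Rips filtration using a union-find structure. The toy vectors, the function names `hamming` and `h0_barcode`, and the choice of metric are illustrative assumptions; this is neither the SSWL data nor the Perseus pipeline, and it does not compute first homology.

```python
from itertools import combinations


def hamming(u, v):
    """Number of positions where two binary parameter vectors differ."""
    return sum(a != b for a, b in zip(u, v))


def h0_barcode(points):
    """Return (birth, death) bars of H0 for a Vietoris-Rips filtration.

    Every point is born at scale 0. A component dies at the length of the
    edge that merges it into another component. Exactly one component
    never dies, recorded with death=None.
    """
    n = len(points)
    parent = list(range(n))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Process all pairwise edges in order of increasing distance.
    edges = sorted(
        (hamming(points[i], points[j]), i, j)
        for i, j in combinations(range(n), 2)
    )
    bars = []
    for dist, i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[ri] = rj          # two components merge at this scale
            bars.append((0, dist))   # one of them dies here
    bars.append((0, None))           # the surviving component
    return bars


# Hypothetical toy data: rows are made-up "languages", columns are
# made-up binary parameters. Two groups are planted on purpose.
toy = [
    (1, 1, 1, 0, 0, 0),
    (1, 1, 1, 0, 0, 1),
    (1, 1, 0, 0, 0, 0),
    (0, 0, 0, 1, 1, 1),
    (0, 0, 1, 1, 1, 1),
]

bars = h0_barcode(toy)
for birth, death in bars:
    print(birth, death)
```

In this toy input, three bars die at scale 1 (merges inside each planted group) and one bar survives until scale 4, when the two groups finally join. That single long bar is the kind of persistent component that the abstract associates with subfamilies.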





This work was performed within the activities of the last author’s Mathematical and Computational Linguistics lab and CS101/Ma191 class at Caltech. The last author was partially supported by NSF Grants DMS-1007207, DMS-1201512, DMS-1707882, and PHY-1205440.

Author information


Corresponding author

Correspondence to Matilde Marcolli.


About this article


Cite this article

Port, A., Gheorghita, I., Guth, D. et al. Persistent Topology of Syntax. Math. Comput. Sci. 12, 33–50 (2018).



Keywords

  • Linguistics
  • Syntax
  • Persistent homology

Mathematics Subject Classification

  • 91F20
  • 55U10
  • 68P05