Bulletin of Mathematical Biology, Volume 79, Issue 3, pp 619–634

Dimensional Reduction for the General Markov Model on Phylogenetic Trees

Original Article


We present a method of dimensional reduction for the general Markov model of sequence evolution on a phylogenetic tree. We show that taking certain linear combinations of the associated random variables (site pattern counts) reduces the dimensionality of the model from exponential in the number of extant taxa, to quadratic in the number of taxa, while retaining the ability to statistically identify phylogenetic divergence events. A key feature is the identification of an invariant subspace which depends only bilinearly on the model parameters, in contrast to the usual multi-linear dependence in the full space. We discuss potential applications including the computation of split (edge) weights on phylogenetic trees from observed sequence data.
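The scale of the reduction described above can be illustrated with a small sketch. It does not reproduce the linear combinations or the invariant subspace constructed in the article. As a stand-in, it uses pairwise marginal tables, which are also sums (linear combinations) of site pattern counts and whose total number of entries grows quadratically with the number of taxa. The function names, the four-state alphabet, and the toy alignment are all assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations
from math import comb

STATES = "ACGT"  # assumed four-state DNA alphabet


def site_pattern_counts(alignment):
    """Count each column (site pattern) of an alignment of equal-length strings."""
    return Counter(zip(*alignment))


def pairwise_marginals(pattern_counts, n_taxa):
    """Sum site pattern counts into a state-pair table for every pair of taxa.

    Illustrative stand-in only: these are linear combinations of the
    site pattern counts, but not the ones defined in the article.
    """
    tables = {pair: Counter() for pair in combinations(range(n_taxa), 2)}
    for pattern, count in pattern_counts.items():
        for i, j in tables:
            tables[(i, j)][(pattern[i], pattern[j])] += count
    return tables


def dimensions(n_taxa, k=len(STATES)):
    """Return (number of possible site patterns, number of pairwise-table entries)."""
    return k ** n_taxa, comb(n_taxa, 2) * k * k


# Hypothetical toy alignment: four taxa, six sites.
alignment = ["ACGTAC", "ACGTTC", "ACCTAC", "GCGTAC"]
counts = site_pattern_counts(alignment)
tables = pairwise_marginals(counts, len(alignment))
full_dim, reduced_dim = dimensions(len(alignment))
print(full_dim, reduced_dim)
```

For ten taxa the same `dimensions` function gives 4^10 possible site patterns against 45 × 16 pairwise entries, which shows why a reduction from exponential to quadratic growth matters in practice.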


Representation theory; Markov chains; Affine group



This work was inspired by a question Alexei Drummond put to Barbara Holland during her presentation at the New Zealand Phylogenetics Meeting, DOOM 2016. I would also like to thank the anonymous reviewer for their careful and substantive comments, which led to a greatly improved manuscript.

Funding: This work was supported by the Australian Research Council Discovery Early Career Fellowship DE130100423.



Copyright information

© Society for Mathematical Biology 2017

Authors and Affiliations

  1. School of Physical Sciences, Mathematics, University of Tasmania, Hobart, Australia
