Abstract
It is known that the Kimura 3ST model of sequence evolution on phylogenetic trees can be extended quite naturally to arbitrary split systems. However, this extension relies heavily on mathematical peculiarities of the associated Hadamard transformation, and an analogous augmentation of the general Markov model has thus far proved elusive. In this paper, we rectify this shortcoming by showing how to extend the general Markov model on trees to include incompatible edges, and further to more general network models. This is achieved by exploring the algebra of the generators of the continuous-time Markov chain together with the “splitting” operator that generates the branching process on phylogenetic trees. For simplicity, we first discuss the two-state case and then show that our results extend to more states with little complication. Intriguingly, upon restriction of the two-state general Markov model to the parameter space of the binary symmetric model, our extension is indistinguishable from the Hadamard approach only on trees; as soon as any incompatible splits are introduced, the two approaches give rise to differing probability distributions with disparate structure. Through exploration of a simple example, we argue that our extension to more general networks has desirable properties that the previous approaches do not share. In particular, our construction allows for convergent evolution of previously divergent lineages, a property that is of significant interest for biological applications.
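As a rough illustration of the objects the abstract refers to, the following sketch builds a two-state rate matrix from two generators and checks how the Hadamard matrix acts on it. The generator names `L1` and `L2`, the column-sum-zero convention, and all function names are assumptions made for this example and are not taken from the paper; the sketch covers a single edge only, not the splitting operator or the network construction.

```python
import math

# Assumed generators of the two-state general Markov model.
# Convention (an assumption): columns of a rate matrix sum to zero.
L1 = [[-1.0, 0.0], [1.0, 0.0]]   # substitutions from state 0 to state 1
L2 = [[0.0, 1.0], [0.0, -1.0]]   # substitutions from state 1 to state 0
HADAMARD = [[1.0, 1.0], [1.0, -1.0]]


def matmul(x, y):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def commutator(x, y):
    """Lie bracket [x, y] = xy - yx."""
    xy, yx = matmul(x, y), matmul(y, x)
    return [[xy[i][j] - yx[i][j] for j in range(2)] for i in range(2)]


def rate_matrix(a, b):
    """Q = a*L1 + b*L2 for non-negative rates a and b."""
    return [[a * L1[i][j] + b * L2[i][j] for j in range(2)] for i in range(2)]


def transition_matrix(a, b, t):
    """exp(Q t) in closed form, using Q^2 = -(a + b) Q."""
    q = rate_matrix(a, b)
    s = a + b
    # When s == 0 the rate matrix is zero, so the coefficient is irrelevant.
    c = (1.0 - math.exp(-s * t)) / s if s > 0 else t
    return [[(1.0 if i == j else 0.0) + c * q[i][j] for j in range(2)]
            for i in range(2)]


def hadamard_conjugate(m):
    """H m H / 2, i.e. m expressed in the Hadamard basis (H^-1 = H / 2)."""
    hmh = matmul(matmul(HADAMARD, m), HADAMARD)
    return [[hmh[i][j] / 2.0 for j in range(2)] for i in range(2)]


if __name__ == "__main__":
    # The two generators close under the bracket: [L1, L2] = L1 - L2.
    print("commutator:", commutator(L1, L2))
    # Binary symmetric case (a == b): the Hadamard basis diagonalises Q.
    print("symmetric:", hadamard_conjugate(rate_matrix(0.4, 0.4)))
    # General case (a != b): an off-diagonal term (b - a) survives.
    print("general:", hadamard_conjugate(rate_matrix(0.3, 0.7)))
    print("transition:", transition_matrix(0.3, 0.7, 1.5))
```

In this sketch the Hadamard conjugation is diagonal only when the two rates are equal, which is one way to see why the Hadamard-based extension does not carry over directly to the general Markov model.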
Cite this article
Sumner, J.G., Holland, B.R. & Jarvis, P.D. The Algebra of the General Markov Model on Phylogenetic Trees and Networks. Bull Math Biol 74, 858–880 (2012). https://doi.org/10.1007/s11538-011-9691-z