Lie Markov models with purine/pyrimidine symmetry

Journal of Mathematical Biology

Abstract

Continuous-time Markov chains are a standard tool in phylogenetic inference. If homogeneity is assumed, the chain is formulated by specifying time-independent rates of substitutions between states in the chain. In applications, there are usually extra constraints on the rates, depending on the situation. If a model is formulated in this way, it is possible to generalise it and allow for an inhomogeneous process, with time-dependent rates satisfying the same constraints. It is then useful to require that, under some time restrictions, there exists a homogeneous average of this inhomogeneous process within the same model. This leads to the definition of “Lie Markov models” which, as we will show, are precisely the class of models where such an average exists. These models form Lie algebras and hence concepts from Lie group theory are central to their derivation. In this paper, we concentrate on applications to phylogenetics and nucleotide evolution, and derive the complete hierarchy of Lie Markov models that respect the grouping of nucleotides into purines and pyrimidines—that is, models with purine/pyrimidine symmetry. We also discuss how to handle the subtleties of applying Lie group methods, most naturally defined over the complex field, to the stochastic case of a Markov process, where parameter values are restricted to be real and positive. In particular, we explore the geometric embedding of the cone of stochastic rate matrices within the ambient space of the associated complex Lie algebra.
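
The closure property behind this definition can be checked mechanically. The following Python sketch is an illustration and not the authors' code: it tests whether the linear span of a set of 4×4 rate matrices is closed under the commutator \([A,B]=AB-BA\), which is the Lie algebra condition referred to above. The helper names (`is_lie_closed`, `equal_input_basis`, `symmetric_basis`) and the two example models are assumptions chosen for illustration: an equal-input model, in which the substitution rate depends only on the target state, and a model of symmetric rate matrices. Rows are taken to sum to zero, which is a convention assumed here.

```python
from fractions import Fraction
from itertools import combinations

N = 4  # number of nucleotide states


def zeros():
    return [[Fraction(0)] * N for _ in range(N)]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(N)) for j in range(N)]
            for i in range(N)]


def commutator(a, b):
    """Lie bracket [a, b] = ab - ba."""
    ab, ba = matmul(a, b), matmul(b, a)
    return [[ab[i][j] - ba[i][j] for j in range(N)] for i in range(N)]


def flatten(m):
    return [x for row in m for x in row]


def rank(vectors):
    """Rank of a list of vectors, by exact Gaussian elimination."""
    rows = [list(v) for v in vectors]
    r = 0
    for col in range(len(rows[0]) if rows else 0):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(len(rows)):
            if i != r and rows[i][col] != 0:
                f = rows[i][col] / rows[r][col]
                rows[i] = [x - f * y for x, y in zip(rows[i], rows[r])]
        r += 1
        if r == len(rows):
            break
    return r


def is_lie_closed(basis):
    """True if every commutator of basis elements lies in the span of the basis."""
    flat = [flatten(b) for b in basis]
    base_rank = rank(flat)
    return all(rank(flat + [flatten(commutator(a, b))]) == base_rank
               for a, b in combinations(basis, 2))


def equal_input_basis():
    """Illustrative basis R_j: every state is replaced by state j at unit rate."""
    basis = []
    for j in range(N):
        m = zeros()
        for i in range(N):
            m[i][j] += 1
            m[i][i] -= 1
        basis.append(m)
    return basis


def symmetric_basis():
    """Illustrative basis of symmetric rate matrices with zero row sums."""
    basis = []
    for i, j in combinations(range(N), 2):
        m = zeros()
        m[i][j] += 1
        m[j][i] += 1
        m[i][i] -= 1
        m[j][j] -= 1
        basis.append(m)
    return basis
```

Under these assumptions `is_lie_closed(equal_input_basis())` holds, since \([R_i,R_j]=R_j-R_i\), whereas `is_lie_closed(symmetric_basis())` fails, since the commutator of two symmetric matrices is antisymmetric and so leaves the symmetric span whenever it is nonzero.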

Notes

  1. The reader may notice that we have changed the terminology of Sumner et al. (2012a): we refer to the desired property as “locally multiplicative closure” instead of “multiplicative closure”. Global multiplicative closure for a continuous-time Markov model is a deep problem related to the convergence of the Baker–Campbell–Hausdorff formula (see Blanes and Casas 2004). Restricting attention to local closure is not a serious drawback, as the nature of the problem is local.

  2. Note that this group is isomorphic to the dihedral group \({\mathbf {D}}_4\), which describes the symmetries of a square. However, it also admits a more natural description in our setting as \({\mathfrak {S}}_2 \wr {\mathfrak {S}}_2\), the wreath product of \({\mathfrak {S}}_2\) with itself (see Rotman 1995, Chapter VII).
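
The group in Note 2 can be enumerated directly. The sketch below is illustrative only and assumes that the group in question consists of the permutations of the four nucleotides that preserve the partition into purines {A, G} and pyrimidines {C, T}, either fixing or swapping the two blocks. All function and variable names are hypothetical.

```python
from itertools import permutations

NUCLEOTIDES = ("A", "G", "C", "T")
BLOCKS = (frozenset("AG"), frozenset("CT"))  # purines, pyrimidines


def preserves_partition(perm):
    """perm sends NUCLEOTIDES[i] to perm[i]; the blocks may be fixed or swapped."""
    mapping = dict(zip(NUCLEOTIDES, perm))
    images = {frozenset(mapping[x] for x in block) for block in BLOCKS}
    return images == set(BLOCKS)


def compose(p, q):
    """Permutation obtained by applying q first and then p."""
    p_map = dict(zip(NUCLEOTIDES, p))
    return tuple(p_map[x] for x in q)


def order(p):
    """Smallest k >= 1 such that p composed with itself k times is the identity."""
    k, current = 1, p
    while current != NUCLEOTIDES:
        current = compose(p, current)
        k += 1
    return k


# All partition-preserving permutations of the nucleotides.
group = [p for p in permutations(NUCLEOTIDES) if preserves_partition(p)]

# Number of group elements of each order.
order_counts = {}
for p in group:
    order_counts[order(p)] = order_counts.get(order(p), 0) + 1

is_abelian = all(compose(p, q) == compose(q, p) for p in group for q in group)
```

The enumeration yields eight elements: the identity, five elements of order 2 and two of order 4, and the group is non-abelian. Among groups of order 8 this profile is that of \({\mathbf {D}}_4\) (the quaternion group, the other non-abelian case, has six elements of order 4). The count \(2^2 \cdot 2 = 8\) also matches the order of the wreath product \({\mathfrak {S}}_2 \wr {\mathfrak {S}}_2\).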

References

  • Alexandrov AD (2005) Convex polyhedra. Springer Monographs in Mathematics. Springer, Berlin. ISBN 3-540-23158-7 (translated from the 1950 Russian edition by N. S. Dairbekov, S. S. Kutateladze and A. B. Sossinsky, with comments and bibliography by V. A. Zalgaller and appendices by L. A. Shor and Yu. A. Volkov)

  • Birkhoff G (1938) Analytical groups. Trans Am Math Soc 43(1):61–101. ISSN 0002-9947. doi:10.2307/1989902

  • Blanes S, Casas F (2004) On the convergence and optimization of the Baker–Campbell–Hausdorff formula. Linear Algebra Appl 378:135–158. ISSN 0024-3795. doi:10.1016/j.laa.2003.09.010

  • Bogopolski O (2008) Introduction to group theory. EMS Textbooks in Mathematics, European Mathematical Society (EMS), Zürich. ISBN 978-3-03719-041-8. doi:10.4171/041 (translated, revised and expanded from the Russian original)

  • Campbell JE (1897) On a law of combination of operators (second paper). Proc Lond Math Soc 28:381–390

  • Casanellas M, Fernández-Sánchez J (2010) Relevant phylogenetic invariants of evolutionary models. J Math Pures Appl 96:207–229

  • Casanellas M, Sullivant S (2005) The strand symmetric model. In: Algebraic statistics for computational biology. Cambridge University Press, New York, pp 305–321. doi:10.1017/CBO9780511610684.020

  • Casanellas M, Fernández-Sánchez J, Kedzierska A (2012) The space of phylogenetic mixtures for equivariant models. Algorithms Mol Biol 7:33

  • Davies EB (2010) Embeddable Markov matrices. Electron J Probab 15(47):1474–1486. ISSN 1083-6489. doi:10.1214/EJP.v15-733

  • Donten-Bury M, Michałek M (2012) Phylogenetic invariants for group-based models. J Algebr Stat 3(1):44–63. ISSN 1309-3452

  • Draisma J, Kuttler J (2008) On the ideals of equivariant tree models. Math Ann 344:619–644

  • Felsenstein J (1981) Evolutionary trees from DNA sequences: a maximum likelihood approach. J Mol Evol 17:368–376

  • Fernández-Sánchez J (2013) Code for Lie Markov models with purine/pyrimidine symmetry. http://www.pagines.ma1.upc.edu/jfernandez/purine_pyrimidine.html

  • Hasegawa M, Kishino H, Yano T (1988) Phylogenetic inference from DNA sequence data. Statistical theory and data analysis, II (Tokyo, 1986). North-Holland, Amsterdam

  • James G, Liebeck M (2001) Representations and characters of groups, 2nd edn. Cambridge University Press, New York

  • Johnson JE (1985) Markov-type Lie groups in \(GL(n,{R})\). J Math Phys 26:252–257

  • Jukes T, Cantor C (1969) Evolution of protein molecules. In: Mammalian protein metabolism, pp 21–132

  • Kimura M (1980) A simple method for estimating evolutionary rates of base substitution through comparative studies of nucleotide sequences. J Mol Evol 16:111–120

  • Kimura M (1981) Estimation of evolutionary distances between homologous nucleotide sequences. Proc Natl Acad Sci 78:1454–1458

  • Michałek M (2011) Geometry of phylogenetic group-based models. J Algebra 339:339–356. ISSN 0021-8693. doi:10.1016/j.jalgebra.2011.05.016

  • Posada D, Crandall KA (1998) Modeltest: testing the model of DNA substitution. Bioinformatics 14:817–818

  • Rotman J (1995) An introduction to the theory of groups, 4th edn, volume 148 of Graduate Texts in Mathematics. Springer, New York. ISBN 0-387-94285-8

  • Sagan BE (2001) The symmetric group: representations, combinatorial algorithms, and symmetric functions, 2nd edn. Graduate Texts in Mathematics. Springer, Berlin

  • Semple C, Steel M (2003) Phylogenetics. Oxford University Press, Oxford

  • Stein W et al (2012) Sage Mathematics Software (Version 4.8). The Sage Development Team. http://www.sagemath.org

  • Sumner JG, Fernández-Sánchez J, Jarvis PD (2012a) Lie Markov models. J Theor Biol 298:16–31. ISSN 0022-5193. doi:10.1016/j.jtbi.2011.12.017

  • Sumner JG, Jarvis PD, Fernández-Sánchez J, Kaine BT, Woodhams MD, Holland BR (2012b) Is the general time-reversible model bad for molecular phylogenetics? Syst Biol 61:1069–1074

  • Tavaré S (1986) Some probabilistic and statistical problems in the analysis of DNA sequences. Lect Math Life Sci (American Mathematical Society) 17:57–86

  • Yap V, Pachter L (2004) Identification of evolutionary hotspots in the rodent genomes. Genome Res 14(4):574–579

Acknowledgments

JFS was partially supported by Ministerio de Educación y Ciencia MTM2009-14163-C02-02, MTM2012-38122-C03-01 and Generalitat de Catalunya, 2009 SGR 1284. JGS and PDJ were partially supported by Australian Research Council grant DP0877447. MDW was partially supported by Australian Research Council grant FT100100031.

Author information

Corresponding author

Correspondence to Jesús Fernández-Sánchez.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (pdf 72 KB)

About this article

Cite this article

Fernández-Sánchez, J., Sumner, J.G., Jarvis, P.D. et al. Lie Markov models with purine/pyrimidine symmetry. J. Math. Biol. 70, 855–891 (2015). https://doi.org/10.1007/s00285-014-0773-z

  • DOI: https://doi.org/10.1007/s00285-014-0773-z
