A Phylogenetic Mixture Model for Heterotachy

  • Andrew Meade
  • Mark Pagel

Abstract

We present a likelihood-based phylogenetic mixture model designed to analyse data that exhibit within-site rate variation, or heterotachy. Heterotachy is the phenomenon in which a site in a gene-sequence or other alignment changes its rate of evolution throughout the tree. The method accounts for heterotachy by summing the likelihood of the data at each site over more than one set of branch lengths on the same tree. The branch length set that is best for one site may differ from the set that is best for another site, thereby allowing different sites to have different rates of change throughout the tree. We show that the model improves the accuracy of phylogenetic reconstruction when the sequence data are not derived from a single underlying evolutionary process. We apply the method to a number of simulated and published data sets and show that many sequence data sets have complex evolutionary signals of heterotachy. The presence of such signals has important consequences for the correct reconstruction of phylogenies as well as for tests of hypotheses that rely on accurate branch length information. These include molecular clocks, analyses of tempo and mode of evolution, comparative studies and ancestral state reconstruction. The model is implemented in a Bayesian Markov chain Monte Carlo framework, can be used for the analysis of both nucleotide and morphological data, and is available from the authors’ Web site.
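
The per-site summation described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It makes several assumptions that the abstract does not state: the tree is reduced to a toy case of two taxa, so each "branch length set" collapses to a single number; substitutions follow the Jukes-Cantor model; and each branch length set carries a mixture weight, with the weights summing to one. The function names are hypothetical.

```python
import math


def jc69_pair_site_likelihood(a, b, t):
    """Likelihood of one aligned site (bases a and b) for two taxa separated
    by total branch length t (expected substitutions per site), under the
    Jukes-Cantor model. Toy stand-in for a full tree likelihood."""
    decay = math.exp(-4.0 * t / 3.0)
    if a == b:
        p_change = 0.25 + 0.75 * decay   # probability the base is unchanged
    else:
        p_change = 0.25 - 0.25 * decay   # probability of this specific change
    return 0.25 * p_change               # 0.25 = equilibrium frequency of base a


def mixture_log_likelihood(seq1, seq2, branch_length_sets, weights):
    """Log-likelihood of an alignment where each site's likelihood is a
    weighted sum over several branch length sets (assumed weights)."""
    if len(branch_length_sets) != len(weights):
        raise ValueError("one weight is needed per branch length set")
    if not math.isclose(sum(weights), 1.0):
        raise ValueError("weights must sum to 1")
    total = 0.0
    for a, b in zip(seq1, seq2):
        # Sum the site likelihood over every branch length set.
        site = sum(
            w * jc69_pair_site_likelihood(a, b, t)
            for w, t in zip(weights, branch_length_sets)
        )
        total += math.log(site)          # sites are treated as independent
    return total


if __name__ == "__main__":
    s1, s2 = "ACGTACGT", "ACGTTCGA"
    print(mixture_log_likelihood(s1, s2, [0.05, 1.0], [0.5, 0.5]))
```

Because the sum is taken separately at every site, a slowly evolving site can draw most of its likelihood from the short branch length set while a fast site draws from the long one, which is the sense in which different sites are allowed different rates of change on the same tree.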

Keywords

  • Mixture Model
  • Akaike Information Criterion
  • Branch Length
  • Covarion Model
  • Ancestral State Reconstruction
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer-Verlag Berlin Heidelberg 2008

Authors and Affiliations

  • Andrew Meade (1)
  • Mark Pagel (1)

  1. School of Biological Sciences, Philip Lyle Building, The University of Reading, Reading, UK
