Journal of Molecular Evolution, Volume 67, Issue 5, pp 437–447

The Phylogenetic Informativeness of Nucleotide and Amino Acid Sequences for Reconstructing the Vertebrate Tree

  • Jeffrey P. Townsend
  • Francesc López-Giráldez
  • Robert Friedman


To aid in future efforts to accurately reconstruct the vertebrate tree, a quantitative measure of phylogenetic informativeness was applied to nucleotide and amino acid sequences for a set of 11 genes. We identified orthologues and assembled published fossil-calibrated divergence times between taxa that had been sequenced for each gene. Rates of molecular evolution for each site were estimated to characterize the molecular evolutionary pattern of genes and to calculate the phylogenetic informativeness. The fast-evolving gene albumin yielded the highest informativeness over the period from 60 million years ago to 500 million years ago. In contrast, calmodulin yielded the lowest informativeness, presumably because functional constraint minimized substitutions in the amino acid sequence. The gene c-myc showed an intermediate level of informativeness. The nucleotide sequence of cytochrome b showed extremely high utility for recent epochs, but low utility for times before 100 million years ago. We ranked nine other genes for their utility during the epochs of the divergence of the muroid rodents, early placental mammals, early vertebrates, and early metazoa, yielding results consistent with, but more precise than, previous studies. Interestingly, DNA sequence always exceeded amino acid sequence in informativeness over all time scales, yet support values were at best moderately higher. For epochs not subject to strong phylogenetic conflict due to convergence, we advocate gleaning the additional power of the threefold increase in number of characters that is present for DNA sequences over resorting to the less noisy but less informative amino acid sequences.
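The informativeness calculation described above can be illustrated with a minimal sketch. It assumes the per-site profile of Townsend (2007, Syst Biol 56:222–231), ρ(T; λ) = 16λ²T·exp(−4λT), summed over the sites of a gene. The function names and the site rates below are hypothetical placeholders chosen for illustration; they are not the rates estimated in this study.

```python
import math


def site_informativeness(rate, t):
    """Informativeness at time t of one site evolving at `rate`.

    Assumes the per-site form 16 * rate^2 * t * exp(-4 * rate * t).
    """
    return 16.0 * rate ** 2 * t * math.exp(-4.0 * rate * t)


def profile(rates, times):
    """Sum the per-site informativeness over all sites, for each time."""
    return [sum(site_informativeness(r, t) for r in rates) for t in times]


# Hypothetical site rates (substitutions per site per million years).
fast_gene = [0.004, 0.005, 0.006]
slow_gene = [0.0001, 0.0002, 0.0003]

# Times before present, in millions of years.
times = [10, 60, 100, 300, 500]

fast_profile = profile(fast_gene, times)
slow_profile = profile(slow_gene, times)

for t, f, s in zip(times, fast_profile, slow_profile):
    print(f"{t:>4} Mya  fast={f:.3e}  slow={s:.3e}")
```

Under this form a single site is most informative at T = 1/(4λ), so the fast hypothetical sites dominate at recent times and the slow ones at deep times. Real profiles depend on the full distribution of estimated site rates and on the number of sites in each gene.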


Keywords: Information, Phylogeny, Polytomy, Power, Rapid radiation, Vertebrate



Copyright information

© Springer Science+Business Media, LLC 2008

Authors and Affiliations

  • Jeffrey P. Townsend (1)
  • Francesc López-Giráldez (1)
  • Robert Friedman (2)

  1. Department of Ecology and Evolutionary Biology, Yale University, New Haven, USA
  2. Department of Biological Sciences, University of South Carolina, Columbia, USA
