Abstract
All comparative analyses rely on at least one phylogenetic hypothesis. However, the reconstruction of the evolutionary history of species is not the primary aim of these studies. In fact, it is rarely the case that a well-resolved, fully matching phylogeny is available for the interspecific trait data at hand. Therefore, phylogenetic information usually needs to be combined across various sources that often rely on different approaches and different markers for the phylogenetic reconstruction. Building hypotheses about the evolutionary history of species is a challenging task, as it requires knowledge of the underlying methodology and an ability to flexibly manipulate data in diverse formats. Although most practitioners are not experts in phylogenetics, the appropriate handling of phylogenetic information is crucial for making evolutionary inferences in a comparative study, because the results depend directly on the underlying phylogeny. In this chapter, we provide an overview of how to interpret and combine phylogenetic information from different sources, and review the various tree-tailoring techniques, touching upon issues that are crucial for the understanding of other chapters in this book. We conclude that whichever method is used to generate trees, the phylogenetic hypotheses will always include some uncertainty that should be taken into account in a comparative study.
The original version of this chapter was revised: Online Practical Material website has been updated. The erratum to this chapter is available at https://doi.org/10.1007/978-3-662-43550-2_23
Notes
- 1. See also the Glossary at the end of the chapter.
Glossary
- Additive tree/phylogeny
A phylogeny is termed additive when the tips are not all equidistant from the root. In an additive phylogeny, branch lengths represent the expected number of substitutions; differences among taxa in the rate of molecular evolution therefore lead to differences in branch lengths.
- Branch
A continuous line that connects two nodes or a node to a tip in the phylogeny.
- Branch length
Represents the “distance” between the two nodes or the node and tip connected by the branch. The “distance” can be measured in number of evolutionary transitions (if the phylogeny is reconstructed using maximum parsimony methods), number of expected substitutions, which is an estimate of the rate of molecular evolution, or divergence times.
- Gene duplication
When a second copy of an existing gene emerges within a single genome. Gene duplication is a major mechanism by which new genetic material is generated.
- Homology
Shared similarity between taxa that is due to inheritance from a common ancestor.
- Homoplasy
Similarity between taxa that results from convergent evolution, for example due to similar selection pressures.
- Horizontal gene transfer
The transfer of genetic material between individuals of different species that is not the result of inheritance from a common ancestor.
- Hybridization
Mating between individuals of two distinct species of plants or animals resulting in viable offspring.
- Incomplete lineage sorting
Occurs when coalescence times of alleles are within the time span of speciation events or shorter. Incomplete lineage sorting results in gene genealogies that are not concordant with the species phylogeny.
- Nodes
Represent the putative ancestors of the taxa represented in the phylogeny.
- Orthologous genes
Genes originating from a common ancestor (i.e. homologous genes) that have undergone independent evolution following a speciation event.
- Parallel or convergent evolution
Evolution of phenotypes or sequences under similar selective regimes leading to higher similarities than would be expected based on the degree of shared ancestry.
- Paralogous genes
Genes originating from a duplication event recent enough to reveal their common ancestry.
- Polytomy
When more than two branches originate from a single node in the phylogeny. Polytomies reflect uncertainty in the timing of speciation events. This uncertainty arises either because there are insufficient data to determine the order of events with confidence (so-called “soft polytomies”) or because the speciation events were so rapid that there was insufficient time for the substitutions needed to discriminate between their timings to accumulate (so-called “hard polytomies”).
- Root
Represents the most recent common ancestor of all the tips (taxa) in the phylogeny. All branches of the phylogeny lead to the root and the root connects all nodes.
- Saturation
Occurs when two aligned, presumably orthologous, sequences have accumulated such an elevated number of repeated substitutions that these provide a poor estimate of their time of divergence. Saturation occurs because there is a higher probability of reverse mutations (changes to a nucleotide present in the past) as time of divergence increases and hence apparent differences between orthologous sequences become lower than expected based on the time of divergence.
- Substitution rate
Also referred to as the rate of molecular evolution, this is the rate at which organisms accumulate genetic differences over time. It is usually calculated as the number of substitutions per site per unit time. Non-synonymous and synonymous substitutions are distinguished by whether or not, respectively, changes in the nucleotide sequence affect the translated amino acid sequence.
- Tips
Also called leaves (following the tree analogy for phylogenies), these are the taxa whose relationships are being estimated with the phylogeny.
- Ultrametric tree/phylogeny
A phylogeny is termed ultrametric when all the tips are equidistant from the root. In other words, the distance between any two species in the tree is the same as long as the path crosses the root of the tree. In ultrametric trees, the branch lengths usually represent divergence times. Ultrametric trees can also be estimated under the assumption of a constant rate of substitution that is the same for all taxa, also called a molecular clock. However, recent studies of diverse species have called the molecular clock into question, showing that the rate of molecular evolution varies even among closely related species and is correlated with species-specific traits and even environmental variables.
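Several of the glossary definitions (additive versus ultrametric trees, polytomies, root-to-tip distances) can be made concrete with a small sketch. The Python code below is only an illustration under stated assumptions: the dictionary-based tree representation, the function names, and the two toy trees are all hypothetical and are not taken from the chapter or from any phylogenetics library. It sums branch lengths from the root to each tip, reports a tree as ultrametric when those sums are all equal (within a tolerance), and lists nodes with more than two descendant branches as polytomies.

```python
from typing import Dict, List, Tuple

# Hypothetical toy representation (an assumption of this sketch): each internal
# node maps to a list of (child, branch_length) pairs; tips are names that do
# not appear as keys in the mapping.
Tree = Dict[str, List[Tuple[str, float]]]


def root_to_tip_distances(tree: Tree, root: str) -> Dict[str, float]:
    """Sum branch lengths along the path from the root to every tip."""
    distances: Dict[str, float] = {}
    stack = [(root, 0.0)]
    while stack:
        node, dist = stack.pop()
        children = tree.get(node, [])
        if not children:  # no descendant branches: this node is a tip
            distances[node] = dist
        for child, length in children:
            stack.append((child, dist + length))
    return distances


def is_ultrametric(tree: Tree, root: str, tol: float = 1e-9) -> bool:
    """True when all tips are equidistant from the root (within tol)."""
    dists = list(root_to_tip_distances(tree, root).values())
    return max(dists) - min(dists) <= tol


def polytomies(tree: Tree) -> List[str]:
    """Nodes from which more than two branches originate."""
    return [node for node, children in tree.items() if len(children) > 2]


# Toy tree ((A:1,B:1):2,C:3): every tip is 3 units from the root.
clock_tree: Tree = {
    "root": [("n1", 2.0), ("C", 3.0)],
    "n1": [("A", 1.0), ("B", 1.0)],
}

# Toy tree ((A:1,B:2.5):2,C:3,D:0.5): unequal root-to-tip distances
# (additive, not ultrametric) and three branches leaving the root.
additive_tree: Tree = {
    "root": [("n1", 2.0), ("C", 3.0), ("D", 0.5)],
    "n1": [("A", 1.0), ("B", 2.5)],
}

if __name__ == "__main__":
    print(is_ultrametric(clock_tree, "root"))
    print(is_ultrametric(additive_tree, "root"))
    print(polytomies(additive_tree))
```

The tolerance argument is a design choice of this sketch: branch lengths estimated by real software are floating-point values, so an exact equality test on root-to-tip distances would be too strict.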
Copyright information
© 2014 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
Garamszegi, L.Z., Gonzalez-Voyer, A. (2014). Working with the Tree of Life in Comparative Studies: How to Build and Tailor Phylogenies to Interspecific Datasets. In: Garamszegi, L. (eds) Modern Phylogenetic Comparative Methods and Their Application in Evolutionary Biology. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-43550-2_2
DOI: https://doi.org/10.1007/978-3-662-43550-2_2
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-662-43549-6
Online ISBN: 978-3-662-43550-2
eBook Packages: Biomedical and Life Sciences (R0)