Abstract
Molecular phylogenetics is the study of evolutionary relationships among organisms using molecular sequence data. After selecting sequences and obtaining an optimal alignment that captures their patterns of divergence, the next step is to build a graphical representation called a phylogenetic tree, in which each sequence occupies a branch. A tree represents a prediction of the evolutionary relationships among organisms. Beyond uncovering these relationships, phylogenetic analysis has numerous applications, such as guiding mutagenesis in the laboratory, peptide design, and quantification of gene variants. This chapter focuses on the methodology of building a phylogenetic tree, which requires careful selection of parameters as well as statistical analysis of the predictions for accuracy and robustness. We also discuss a number of tools based on algorithms with different underlying assumptions. These tools cover the different steps of a phylogenetic analysis, including inference and visualization of phylogenetic trees, estimation of divergence times, mining of online databases, estimation of rates of molecular evolution, inference of ancestral sequences, and testing of evolutionary hypotheses.
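To make the alignment-to-tree step concrete, the following is a minimal sketch rather than the chapter's own procedure. It assumes a toy, pre-aligned set of four DNA sequences (invented here for illustration), corrects pairwise differences with the Jukes-Cantor model, and clusters the resulting distances with UPGMA. UPGMA is chosen only because it is short to write down; it assumes a constant rate of evolution across lineages, and other distance, likelihood, or Bayesian methods may be more appropriate for real data. The function names are hypothetical.

```python
import math
from itertools import combinations


def p_distance(a, b):
    """Fraction of differing sites between two aligned sequences (gapped columns skipped)."""
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        raise ValueError("no comparable sites")
    return sum(x != y for x, y in pairs) / len(pairs)


def jc69_distance(a, b):
    """Jukes-Cantor corrected distance: d = -(3/4) * ln(1 - 4p/3)."""
    p = p_distance(a, b)
    if p >= 0.75:
        raise ValueError("sequences too divergent for the Jukes-Cantor correction")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


def upgma(names, seqs):
    """Build a rooted topology (nested tuples) by UPGMA on Jukes-Cantor distances."""
    # Each cluster id maps to (subtree, number of leaves in that subtree).
    clusters = {i: (name, 1) for i, name in enumerate(names)}
    dist = {
        frozenset((i, j)): jc69_distance(seqs[i], seqs[j])
        for i, j in combinations(range(len(names)), 2)
    }
    next_id = len(names)
    while len(clusters) > 1:
        # Pick the closest pair of current clusters.
        pair = min((frozenset(p) for p in combinations(clusters, 2)),
                   key=lambda k: dist[k])
        i, j = sorted(pair)
        (ti, ni), (tj, nj) = clusters.pop(i), clusters.pop(j)
        # Distance from the merged cluster is the size-weighted average.
        for k in clusters:
            dist[frozenset((next_id, k))] = (
                ni * dist[frozenset((i, k))] + nj * dist[frozenset((j, k))]
            ) / (ni + nj)
        clusters[next_id] = ((ti, tj), ni + nj)
        next_id += 1
    tree, _ = next(iter(clusters.values()))
    return tree


# Toy alignment (hypothetical data, for illustration only).
names = ["A", "B", "C", "D"]
seqs = ["ACGTACGTAC", "ACGTACGTAT", "TCGAACTTAC", "TCGAACTAGC"]
tree = upgma(names, seqs)
print(tree)
```

The sketch returns only the branching order; branch lengths, bootstrap support, and model selection, which the chapter treats as essential for assessing accuracy and robustness, are deliberately left out.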
Even if we didn’t have a single fossil, the evidence for evolution would be absolutely secure because of comparative anatomy, comparative biochemistry, and geographical distribution
Dr. Richard Dawkins (The Blind Watchmaker)
Acknowledgments
This work is supported by a grant from the NIH Research Project Grant Program (2R01GM079656). The authors are grateful to Dr. David C. Marciano, Dr. Angela Wilkins, and Dr. Rhonald C. Lua for their helpful comments.
Copyright information
© 2018 Springer Nature Singapore Pte Ltd.
About this chapter
Cite this chapter
Atri, B., Lichtarge, O. (2018). Computational Approaches to Studying Molecular Phylogenetics. In: Shanker, A. (ed) Bioinformatics: Sequences, Structures, Phylogeny. Springer, Singapore. https://doi.org/10.1007/978-981-13-1562-6_9
DOI: https://doi.org/10.1007/978-981-13-1562-6_9
Publisher Name: Springer, Singapore
Print ISBN: 978-981-13-1561-9
Online ISBN: 978-981-13-1562-6
eBook Packages: Biomedical and Life Sciences (R0)