Abstract
Phylogenetics is the study of evolutionary relationships among organisms. Sequence alignment is commonly performed as the first step of a phylogenetic analysis to determine the similarities among DNA and protein sequences. Inferring these relationships is needed to analyze diseases, predict the genetic structure of pathogens, and classify organisms. The structure we seek is the phylogenetic tree, whose leaves represent living organisms, called taxa, and whose intermediate nodes represent hypothetical ancestors. We start this chapter by defining the basic terminology of phylogenetics. We then describe various methods of constructing phylogenetic trees and propose a simple distributed algorithm to construct such trees. In the maximum parsimony problem, we are given an evolutionary tree and a number of taxa, and our aim is to label the internal nodes of this tree so that the data are explained with the minimum number of mutations. We review a number of algorithms for this purpose and introduce a new algorithm for distributed parsimony implementation. After reviewing the probabilistic maximum likelihood method of constructing phylogenetic trees, we provide a brief review of distributed approaches that use maximum likelihood. Phylogenetic networks are more general because they can represent events such as horizontal gene transfer and recombination, which evolutionary trees cannot represent. We discuss these networks briefly to conclude this chapter.
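The maximum parsimony problem stated above can be illustrated with a minimal sketch of the bottom-up pass of Fitch's classical algorithm for a fixed rooted binary tree. This is not the chapter's distributed algorithm. The nested-tuple tree encoding and the names `fitch_site` and `parsimony_score` are assumptions made for this illustration, and the sketch only counts mutations (the top-down pass that assigns one concrete label to each internal node is omitted).

```python
from typing import Dict, Set, Tuple, Union

# Hypothetical encoding for this sketch: a leaf is a taxon name (str),
# an internal node is a pair (left_subtree, right_subtree).
Tree = Union[str, Tuple["Tree", "Tree"]]


def fitch_site(tree: Tree, states: Dict[str, str]) -> Tuple[Set[str], int]:
    """Bottom-up Fitch pass for one alignment column.

    Returns the candidate state set at the root of `tree` and the minimum
    number of mutations needed in the subtree below it.
    """
    if isinstance(tree, str):
        # A leaf is fixed to the observed character of its taxon.
        return {states[tree]}, 0
    left, right = tree
    left_set, left_cost = fitch_site(left, states)
    right_set, right_cost = fitch_site(right, states)
    common = left_set & right_set
    if common:
        # The children can agree on a state, so no extra mutation is needed.
        return common, left_cost + right_cost
    # The children disagree: keep both options and count one mutation.
    return left_set | right_set, left_cost + right_cost + 1


def parsimony_score(tree: Tree, sequences: Dict[str, str]) -> int:
    """Sum the per-column Fitch costs over equal-length aligned sequences."""
    length = len(next(iter(sequences.values())))
    total = 0
    for i in range(length):
        column = {taxon: seq[i] for taxon, seq in sequences.items()}
        total += fitch_site(tree, column)[1]
    return total


if __name__ == "__main__":
    example_tree: Tree = (("A", "B"), ("C", "D"))
    example_seqs = {"A": "AC", "B": "AG", "C": "TC", "D": "TC"}
    print(parsimony_score(example_tree, example_seqs))
```

Each column is scored independently, so the total is the sum of the per-site costs; this independence is also what makes per-site work easy to split across processes.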
Copyright information
© 2015 Springer International Publishing Switzerland
About this chapter
Cite this chapter
Erciyes, K. (2015). Phylogenetics. In: Distributed and Sequential Algorithms for Bioinformatics. Computational Biology, vol 23. Springer, Cham. https://doi.org/10.1007/978-3-319-24966-7_14
DOI: https://doi.org/10.1007/978-3-319-24966-7_14
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-24964-3
Online ISBN: 978-3-319-24966-7
eBook Packages: Computer Science, Computer Science (R0)