The Performance of Two Supertree Schemes Compared Using Synthetic and Real Data Quartet Input

  • Original Article
  • Journal of Molecular Evolution

Abstract

Despite impressive advancements in technological and theoretical tools, construction of phylogenetic (evolutionary) trees is still a challenging task. The availability of enormous quantities of molecular data has made large-scale phylogenetic reconstruction involving thousands of species a more viable goal. For this goal, separate trees over different, overlapping subsets of species, representing histories of various markers of these species, are collected. These trees, typically with conflicting signals, are subsequently combined into a single tree over the full set, an operation known as supertree construction. The amalgamation of such trees into a single tree lies at the heart of many tasks in phylogenetics, yet remains a daunting endeavor, especially in light of conflicting signals. In this work, we study the performance of matrix representation with parsimony (MRP), the most widely used supertree method to date, when confronted with quartet trees. Quartet trees are the most basic informational unit when amalgamation of unrooted trees is attempted, and they remain relevant in more general settings even though standard supertree methods are not necessarily confined to quartets. This study involves both real and simulated data, and the effects of several parameters on the results are evaluated, revealing a number of anomalies associated with MRP. We show that these anomalies are surmountable when using a recently introduced supertree method, weighted quartet MaxCut (wQMC).
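To make the MRP input concrete, the following is a minimal sketch of how a set of quartet trees could be turned into a character matrix under the standard Baum-Ragan coding, in which each internal edge of a source tree becomes one binary character and taxa absent from that tree are coded as missing. An unrooted quartet ab|cd has a single internal edge, so it contributes one column. The function name `mrp_matrix`, the tuple representation of a quartet, and the choice of which side receives `1` are assumptions made for illustration; the abstract does not specify the exact encoding pipeline used in the study.

```python
from typing import Dict, Iterable, Tuple

# Hypothetical representation: ((a, b), (c, d)) stands for the quartet ab|cd.
Quartet = Tuple[Tuple[str, str], Tuple[str, str]]


def mrp_matrix(quartets: Iterable[Quartet]) -> Dict[str, str]:
    """Encode quartets as an MRP character matrix (one column per quartet).

    Taxa on the first side of the split are coded '1', taxa on the second
    side '0', and taxa that do not appear in the quartet '?' (missing).
    """
    quartets = list(quartets)
    # Collect the full taxon set across all input quartets.
    taxa = sorted({t for left, right in quartets for t in left + right})
    rows = {t: [] for t in taxa}
    for left, right in quartets:
        for t in taxa:
            if t in left:
                rows[t].append("1")
            elif t in right:
                rows[t].append("0")
            else:
                rows[t].append("?")
    return {t: "".join(chars) for t, chars in rows.items()}


# Two overlapping quartets over five taxa: ab|cd and ac|de.
matrix = mrp_matrix([(("a", "b"), ("c", "d")), (("a", "c"), ("d", "e"))])
for taxon, row in matrix.items():
    print(taxon, row)
```

With quartet input every column has exactly four informative entries, so the matrix consists mostly of missing data as the number of taxa grows. A parsimony program would then search for the tree minimizing the number of state changes over this matrix; that search step is not shown here.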





Author information

Corresponding author

Correspondence to Sagi Snir.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (PDF 422 KB)

About this article

Cite this article

Avni, E., Yona, Z., Cohen, R. et al. The Performance of Two Supertree Schemes Compared Using Synthetic and Real Data Quartet Input. J Mol Evol 86, 150–165 (2018). https://doi.org/10.1007/s00239-018-9833-0

  • DOI: https://doi.org/10.1007/s00239-018-9833-0
