Species Tree Inference with SNP Data

Plant Comparative Genomics

Part of the book series: Methods in Molecular Biology (MIMB, volume 2512)

Abstract

While the inference of species trees from molecular sequences has become a common type of analysis in studies of species diversification, few programs so far allow for the use of single-nucleotide polymorphisms (SNPs) for the same purpose. In this book chapter, I discuss the use of the Bayesian program SNAPP, which infers the species tree by mathematically integrating over all possible genealogies at each SNP. In particular, I focus on a molecular clock model developed for SNAPP, allowing the inference of divergence times together with the species tree topology and the population size, directly from SNP datasets in variant call format. With the growing availability of SNP datasets for multiple closely related species, this approach is becoming increasingly relevant for the reconstruction of the temporal framework of recent species diversification.

Acknowledgments

I thank Julie Lee-Yaw, Amanda Haponski, Livia Loureiro, Sue Sherman-Broyles, Bohao Fang, Yayan Kusuma, Daniel Poveda-Martínez, Xiaoxi Yang, Cecilia Fiorini, Kristen Finch, Armel Donkpegan, Marta Liber, Jie Gao, and Julia Canitz for testing the snapp_prep.rb script. Funding was provided by the Research Council of Norway (FRIPRO 275869).

Author information

Corresponding author

Correspondence to Michael Matschiner.

Copyright information

© 2022 The Author(s), under exclusive license to Springer Science+Business Media, LLC, part of Springer Nature

Cite this protocol

Matschiner, M. (2022). Species Tree Inference with SNP Data. In: Pereira-Santana, A., Gamboa-Tuz, S.D., Rodríguez-Zapata, L.C. (eds) Plant Comparative Genomics. Methods in Molecular Biology, vol 2512. Humana, New York, NY. https://doi.org/10.1007/978-1-0716-2429-6_2

  • DOI: https://doi.org/10.1007/978-1-0716-2429-6_2

  • Publisher Name: Humana, New York, NY

  • Print ISBN: 978-1-0716-2428-9

  • Online ISBN: 978-1-0716-2429-6

  • eBook Packages: Springer Protocols
