Abstract

Although massively parallel sequencing (MPS) allows researchers to obtain huge amounts of data at low cost compared to Sanger sequencing, current costs and computational constraints for whole genome sequencing and analysis generally prohibit phylogenomic and population genomic studies of non-model organisms and species with large, complex genomes. Therefore, new methods have been developed to select specific genomic regions for re-sequencing. Many of these methods can be applied to non-model organisms or species with very few genetic resources. Choosing which method to use for the study system of interest often depends on a variety of factors. In this chapter, we describe various re-sequencing methods with a focus on target enrichment. Additionally, we lay out experimental design considerations, bioinformatics pipelines, and proper reporting of results for target enrichment.

Karolina Heyduk and Jessica D. Stephens contributed equally with all other contributors.

Author information

Corresponding author

Correspondence to Travis C. Glenn.

Annex: Quick Reference Guide

Fig. QG3.1 Representation of the wet-lab procedure workflow

Fig. QG3.2 Main steps of the computational analysis pipeline

Table QG3.1 Experimental design considerations
Table QG3.2 Available software recommendations

Copyright information

© 2016 Springer International Publishing Switzerland

About this chapter

Cite this chapter

Heyduk, K., Stephens, J.D., Faircloth, B.C., Glenn, T.C. (2016). Targeted DNA Region Re-sequencing. In: Aransay, A., Lavín Trueba, J. (eds) Field Guidelines for Genetic Experimental Designs in High-Throughput Sequencing. Springer, Cham. https://doi.org/10.1007/978-3-319-31350-4_3
