Reduced Representation Methods for Subgenomic Enrichment and Next-Generation Sequencing

Part of the Methods in Molecular Biology book series (MIMB, volume 772)


Several methods have been developed to enrich DNA for subsets of the genome prior to next-generation sequencing. These front-end enrichment strategies provide powerful and cost-effective tools for researchers interested in collecting large-scale genomic sequence data. In this review, I provide an overview of both general and targeted reduced representation enrichment strategies that are commonly used in tandem with next-generation sequencing. I focus on several key issues that are likely to be important when deciding which enrichment strategy is most appropriate for a given experiment. Overall, these techniques can enable the collection of large-scale genomic data in diverse species, providing a powerful tool for the study of evolutionary biology.

Key words

Reduced representation; Sequence capture; Targeted resequencing; Ancient DNA; Sample bar coding



I thank Emily Hodges, Frank Albert, Martin Kircher, Adrian Briggs, Hernán Burbano, Gordon Luikart, and Matthias Meyer for many helpful conversations on NGS and targeted enrichment. Research contributing to this review was supported by an NSF international postdoctoral fellowship (OISE-0754461).



Copyright information

© Springer Science+Business Media, LLC 2012

Authors and Affiliations

  1. Division of Biological Sciences, University of Montana, Missoula, USA
