Target enrichment sequencing in cultivated peanut (Arachis hypogaea L.) using probes designed from transcript sequences

  • Original Article
Molecular Genetics and Genomics

Abstract

Enabled by next-generation sequencing, target enrichment sequencing (TES) is a powerful method for enriching genomic regions of interest and identifying sequence variations. The objective of this study was to explore the feasibility of designing probes from transcript sequences for TES-based calling of sequence variants in peanut, an important allotetraploid crop with a large genome. We applied an in-solution hybridization method to enrich DNA sequences of seven peanut genotypes. Our results showed that TES with probes designed from transcript sequences is feasible in polyploid peanut. Using a set of 31,123 probes, a total of 5131 and 7521 genes were targeted in the peanut A and B genomes, respectively. For each genotype, the probe target capture regions were efficiently covered at high depth. The average on-target rate of sequencing reads was 42.47%, with a considerable number of off-target reads coming from genomic regions homologous to target regions. Given predefined genomic regions of interest and the same amount of sequencing data, TES provided the highest coverage of target regions when compared with whole genome sequencing, RNA sequencing, and genotyping by sequencing. Single nucleotide polymorphism (SNP) calling and subsequent validation revealed a high validation rate (85.71%) for homozygous SNPs, providing valuable markers for peanut genotyping. This study demonstrated the success of applying TES for SNP identification in peanut and should provide valuable guidance for TES application in other non-model species without an available reference genome.
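
To make the on-target rate concrete, the following is a minimal sketch of how such a metric could be computed from aligned read intervals and target capture regions. It is not the authors' pipeline: the function names, the half-open coordinate convention, and the chromosome labels in the usage note are all assumptions made for illustration, and a read is counted as on-target if it overlaps any target region by at least one base.

```python
from bisect import bisect_right
from collections import defaultdict


def build_index(targets):
    """Group target intervals by chromosome, then sort and merge overlaps.

    Each target is a (chrom, start, end) tuple with half-open coordinates
    [start, end). This layout is an assumption of the sketch.
    """
    by_chrom = defaultdict(list)
    for chrom, start, end in targets:
        by_chrom[chrom].append((start, end))
    index = {}
    for chrom, intervals in by_chrom.items():
        merged = []
        for start, end in sorted(intervals):
            if merged and start <= merged[-1][1]:
                # Overlapping or adjacent: extend the previous interval.
                merged[-1] = (merged[-1][0], max(merged[-1][1], end))
            else:
                merged.append((start, end))
        starts = [s for s, _ in merged]
        index[chrom] = (starts, merged)
    return index


def is_on_target(index, chrom, start, end):
    """Return True if the read interval [start, end) overlaps a target."""
    if chrom not in index:
        return False
    starts, merged = index[chrom]
    # Last merged target whose start lies before the read's end.
    i = bisect_right(starts, end - 1) - 1
    return i >= 0 and merged[i][1] > start


def on_target_rate(targets, reads):
    """Fraction of reads that overlap at least one target region."""
    if not reads:
        return 0.0
    index = build_index(targets)
    hits = sum(is_on_target(index, c, s, e) for c, s, e in reads)
    return hits / len(reads)
```

With hypothetical targets `[("A01", 100, 200), ("A01", 150, 300), ("B01", 50, 80)]` and five reads of which two overlap a target, `on_target_rate` returns 0.4. Under this definition, reads that align to homologous regions outside the targets count as off-target, which is consistent with the abstract's note on where many off-target reads originate.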

Acknowledgements

The research presented in this article was sponsored by the Florida Peanut Producers Association and National Peanut Board.

Author information

Contributions

JW conceived the study. JW and ZP designed and coordinated the experiments. ZP, LW, and WF conducted the experiment. ZP and WF analyzed the data. ZP, DP, and DL prepared figures. ZP drafted the manuscript. All authors revised the manuscript.

Corresponding author

Correspondence to Jianping Wang.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Ethical approval

This article does not contain any studies with human participants or animals performed by any of the authors.

Additional information

Communicated by S. Hohmann.

Cite this article

Peng, Z., Fan, W., Wang, L. et al. Target enrichment sequencing in cultivated peanut (Arachis hypogaea L.) using probes designed from transcript sequences. Mol Genet Genomics 292, 955–965 (2017). https://doi.org/10.1007/s00438-017-1327-z

  • DOI: https://doi.org/10.1007/s00438-017-1327-z
