Abstract
Enabled by next-generation sequencing, target enrichment sequencing (TES) is a powerful method for enriching genomic regions of interest and identifying sequence variations. The objective of this study was to explore the feasibility of designing probes from transcript sequences for TES-based calling of sequence variants in peanut, an important allotetraploid crop with a large genome. We applied an in-solution hybridization method to enrich DNA sequences of seven peanut genotypes. Our results showed that TES with probes designed from transcript sequences is feasible in polyploid peanut. A set of 31,123 probes targeted a total of 5131 and 7521 genes in the peanut A and B genomes, respectively. For each genotype used in this study, the probe target capture regions were efficiently covered at high depth. The average on-target rate of sequencing reads was 42.47%, with a substantial portion of off-target reads coming from genomic regions homologous to the target regions. Given predefined genomic regions of interest and the same amount of sequencing data, TES provided the highest coverage of target regions when compared with whole genome sequencing, RNA sequencing, and genotyping by sequencing. Single nucleotide polymorphism (SNP) calling and subsequent validation revealed a high validation rate (85.71%) for homozygous SNPs, providing valuable markers for peanut genotyping. This study demonstrated the successful application of TES for SNP identification in peanut and should provide useful guidance for applying TES in other non-model species without a reference genome.
Acknowledgements
The research presented in this article was sponsored by the Florida Peanut Producers Association and the National Peanut Board.
Author information
Contributions
JW conceived the study. JW and ZP designed and coordinated the experiments. ZP, LW, and WF conducted the experiment. ZP and WF analyzed the data. ZP, DP, and DL prepared figures. ZP drafted the manuscript. All authors revised the manuscript.
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Ethical approval
This article does not contain any studies with human participants or animals performed by any of the authors.
Additional information
Communicated by S. Hohmann.
About this article
Cite this article
Peng, Z., Fan, W., Wang, L. et al. Target enrichment sequencing in cultivated peanut (Arachis hypogaea L.) using probes designed from transcript sequences. Mol Genet Genomics 292, 955–965 (2017). https://doi.org/10.1007/s00438-017-1327-z
DOI: https://doi.org/10.1007/s00438-017-1327-z