Abstract
Whole-genome resequencing (WGR) is a high-throughput method for detecting genomic variations in breeding-related research. Accuracy and sensitivity are two of the most important concerns in variation calling from WGR data, especially for samples with low-depth resequencing data, which are used to reduce cost and save time in studies such as surveys of core germplasm from natural populations or genome-based breeding selection in segregating populations. An approach called pooled mapping was developed to call variations from low-depth resequencing data of natural or segregating populations with high accuracy and sensitivity. First, pooled mapping builds a library of confident polymorphic loci across the genomes of the population; then, the genotype of each sample is called efficiently at these confident loci. The reliability of pooled mapping was confirmed using simulated datasets, real resequencing data and experimental genotyping. With onefold (1×) simulated resequencing data, pooled mapping identified SNPs with high accuracy (99.59 %) and sensitivity (93 %), compared with the commonly used method (accuracy: 29 %; sensitivity: 56 %). For the real low-depth resequencing data (≈0.8×) of 281 Brassica oleracea accessions, four loci corresponding to 1063 sites were selected for KASP genotyping to confirm the performance of pooled mapping. Across all 875 homozygous sites analyzed, pooled mapping achieved an accuracy of 98.24 % and a sensitivity of 90.97 %. In conclusion, pooled mapping is an efficient means of determining reliable genomic variations from limited resequencing data for population samples. It will be a valuable tool in population genomic analysis and genome-based breeding research.
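The two-step idea described above can be illustrated with a toy sketch. This is not the authors' implementation: the abstract does not specify thresholds, data structures or the genotype model, so the input layout, the function names `build_locus_library` and `call_genotypes`, and the cut-offs `min_alt_reads` and `min_alt_samples` are all assumptions made for illustration. The sketch only shows why pooling helps: an allele that looks like noise in one low-depth sample can gain enough support once reads from the whole population are combined.

```python
from collections import Counter

# Toy input (hypothetical): for each sample, the bases observed in reads
# covering each locus. A real pipeline would derive this from alignments.
REFERENCE = {("C01", 100): "A", ("C01", 250): "G", ("C01", 400): "T"}
SAMPLES = {
    "s1": {("C01", 100): "A", ("C01", 250): "C"},
    "s2": {("C01", 100): "T", ("C01", 400): "T"},
    "s3": {("C01", 100): "TT"},
    "s4": {("C01", 100): "AT", ("C01", 250): "G"},
    "s5": {("C01", 400): "T"},
}


def build_locus_library(samples, reference, min_alt_reads=3, min_alt_samples=2):
    """Step 1: pool reads from all samples and keep loci where a
    non-reference allele has enough pooled support (assumed thresholds)."""
    pooled = {}    # locus -> Counter of bases over all samples
    carriers = {}  # (locus, base) -> number of samples showing that base
    for observations in samples.values():
        for locus, bases in observations.items():
            pooled.setdefault(locus, Counter()).update(bases)
            for base in set(bases):
                carriers[(locus, base)] = carriers.get((locus, base), 0) + 1

    library = {}
    for locus, counts in pooled.items():
        ref = reference[locus]
        alts = [(n, base) for base, n in counts.items() if base != ref]
        if not alts:
            continue  # only the reference allele was seen
        n_reads, alt = max(alts)  # best-supported alternative allele
        if n_reads >= min_alt_reads and carriers[(locus, alt)] >= min_alt_samples:
            library[locus] = (ref, alt)
    return library


def call_genotypes(observations, library):
    """Step 2: genotype one sample, but only at loci in the library.
    With a single read, a heterozygote cannot be told apart from a
    homozygote; this toy caller simply reports what was observed."""
    calls = {}
    for locus, (ref, alt) in library.items():
        observed = {b for b in observations.get(locus, "") if b in (ref, alt)}
        if not observed:
            calls[locus] = None          # no usable read: missing genotype
        elif observed == {ref}:
            calls[locus] = ref + ref
        elif observed == {alt}:
            calls[locus] = alt + alt
        else:
            calls[locus] = ref + alt
    return calls


library = build_locus_library(SAMPLES, REFERENCE)
calls = {name: call_genotypes(obs, library) for name, obs in SAMPLES.items()}
for name, sample_calls in calls.items():
    print(name, sample_calls)
```

In this toy data the single `C` read at position 250 is seen in only one sample and is filtered out in step 1, whereas the `T` allele at position 100 is supported by reads from three samples and enters the library. Restricting step 2 to library loci is what keeps isolated sequencing errors from being reported as variants in individual samples.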
Acknowledgments
This work was funded by the 973 Program 2012CB113900 to XW and FC; the 973 Program 2013CB127000 and the 863 Program 2012AA100101 to XW; the National Natural Science Foundation of China NSFC Grant 31301771 and National Science and Technology Ministry (2014BAD01B09) to FC; and the Science and Technology Innovation Program of the Chinese Academy of Agricultural Sciences. Research was carried out in the Key Laboratory of Biology and Genetic Improvement of Horticultural Crops, Ministry of Agriculture, P. R. China.
Ethics declarations
Conflict of interest
All authors declare that they have no conflict of interest.
Additional information
Lixia Fu and Chengcheng Cai have contributed equally to this work.
About this article
Cite this article
Fu, L., Cai, C., Cui, Y. et al. Pooled mapping: an efficient method of calling variations for population samples with low-depth resequencing data. Mol Breeding 36, 48 (2016). https://doi.org/10.1007/s11032-016-0476-9
DOI: https://doi.org/10.1007/s11032-016-0476-9