Abstract
Advances in DNA sequencing provide tools for efficient, large-scale discovery of molecular markers in plants. Discovery options include large-scale amplicon sequencing, transcriptome sequencing, gene-enriched genome sequencing and whole-genome sequencing. Examples of each of these approaches, and of their potential to generate molecular markers for specific applications, are described. Sequencing the whole genomes of parents identifies all the polymorphisms available for analysis in their progeny. Sequencing PCR amplicons of sets of candidate genes from DNA bulks can define the variation in these genes that is available for exploitation in a population or germplasm collection. Sequencing the transcriptomes of genotypes that vary for the trait of interest may identify genes whose patterns of expression could explain the phenotypic variation. Sequencing genomic DNA enriched for genes, by hybridization with probes for all or some of the known genes, simplifies the sequencing and analysis of differences in gene sequences across large numbers of genotypes and genes, especially in complex genomes.
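The idea that comparing the genome sequences of two parents reveals the polymorphisms that can segregate in their progeny can be illustrated with a toy sketch. This is not the authors' pipeline: it assumes the two parental sequences have already been aligned to the same coordinates, and the function name `parental_snps` and the example sequences are hypothetical. Real marker discovery works from read alignments and must also account for coverage and sequencing error.

```python
def parental_snps(parent_a, parent_b):
    """Return (position, base_a, base_b) for each aligned site where two parents differ.

    Assumption: both strings are already aligned, so index i refers to the
    same genomic position in each parent. Positions are reported 1-based.
    """
    if len(parent_a) != len(parent_b):
        raise ValueError("sequences must be aligned to the same length")

    valid = set("ACGT")
    snps = []
    pairs = zip(parent_a.upper(), parent_b.upper())
    for pos, (a, b) in enumerate(pairs, start=1):
        # Skip gaps and ambiguous calls (for example '-' or 'N');
        # they are not evidence of a polymorphism.
        if a in valid and b in valid and a != b:
            snps.append((pos, a, b))
    return snps


# Hypothetical aligned fragments from two parents.
candidates = parental_snps("ACGTNACGTA", "ACCTAACGTG")
for pos, a, b in candidates:
    print(f"position {pos}: parent A = {a}, parent B = {b}")
```

Each site reported here is a candidate marker only; in practice it would need to be validated by genotyping the progeny.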
Acknowledgements
Financial support from the Australian Research Council, the Grains Research and Development Corporation, the Sugar Research and Development Corporation, and the Rural Industries Research and Development Corporation is acknowledged. GKS's visit to SCPS was sponsored by the BOYSCAST fellowship of the Department of Science and Technology, India.
Additional information
[Henry RJ, Edwards M, Waters DLE, Gopala Krishnan S, Bundock P, Sexton TR, Masouleh AK, Nock CJ and Pattemore J 2012 Application of large-scale sequencing to marker discovery in plants. J. Biosci. 37 1–13] DOI 10.1007/s12038-012-9253-z
Cite this article
Henry, R.J., Edwards, M., Waters, D.L.E. et al. Application of large-scale sequencing to marker discovery in plants. J Biosci 37, 829–841 (2012). https://doi.org/10.1007/s12038-012-9253-z