Application of large-scale sequencing to marker discovery in plants

Journal of Biosciences

Abstract

Advances in DNA sequencing provide tools for efficient large-scale discovery of markers for use in plants. Discovery options include large-scale amplicon sequencing, transcriptome sequencing, gene-enriched genome sequencing and whole-genome sequencing. Examples of each of these approaches, and of their potential to generate molecular markers for specific applications, are described. Sequencing the whole genomes of parents identifies all the polymorphisms available for analysis in their progeny. Sequencing PCR amplicons of sets of candidate genes from DNA bulks can define the variation in these genes that might be exploited in a population or germplasm collection. Sequencing the transcriptomes of genotypes that vary for the trait of interest may identify genes whose expression patterns could explain the phenotypic variation. Sequencing genomic DNA enriched for genes, by hybridization with probes for all or some of the known genes, simplifies the sequencing and analysis of gene sequence differences across large numbers of genotypes and genes, especially when working with complex genomes.
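
As an illustration of the amplicon-from-DNA-bulk approach mentioned above, the following is a minimal Python sketch of how candidate polymorphisms might be flagged in a pooled sample. It is not the authors' pipeline. The function name `pooled_variant_sites`, the thresholds `min_depth` and `min_freq`, and the toy `pileup` data are all assumptions made for this example; real analyses would work from aligned reads and would need to account for sequencing error and pool size.

```python
from collections import Counter


def pooled_variant_sites(pileup, reference, min_depth=20, min_freq=0.05):
    """Flag positions where a non-reference base reaches min_freq in a pool.

    pileup    : dict mapping 0-based position -> string of observed bases
    reference : reference sequence for the amplicon
    Returns a dict mapping position -> (alternate base, observed frequency).
    Thresholds are illustrative assumptions, not recommended values.
    """
    variants = {}
    for pos, bases in pileup.items():
        # Count only unambiguous base calls at this position.
        counts = Counter(b for b in bases.upper() if b in "ACGT")
        depth = sum(counts.values())
        if depth < min_depth:
            continue  # too few reads to estimate an allele frequency
        ref = reference[pos].upper()
        alt_counts = {b: n for b, n in counts.items() if b != ref}
        if not alt_counts:
            continue
        # Report the most frequent non-reference base at this site.
        alt, n = max(alt_counts.items(), key=lambda kv: kv[1])
        freq = n / depth
        if freq >= min_freq:
            variants[pos] = (alt, freq)
    return variants


# Hypothetical toy data: a 4-bp amplicon sequenced from a DNA bulk.
reference = "ACGT"
pileup = {
    0: "A" * 40,             # no variation
    1: "C" * 36 + "T" * 4,   # alternate base T at 10%
    2: "G" * 10,             # below the depth threshold
    3: "T" * 99 + "A" * 1,   # 1% alternate, below the frequency threshold
}

print(pooled_variant_sites(pileup, reference))
```

In this sketch the frequency threshold stands in for the trade-off between detecting rare alleles in a bulk and rejecting sequencing errors; an appropriate value would depend on the number of individuals pooled and the error rate of the platform.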

Acknowledgements

Financial support from the Australian Research Council, the Grains Research and Development Corporation, the Sugar Research and Development Corporation, and the Rural Industries Research and Development Corporation is acknowledged. The visit of GKS to SCPS was sponsored by a BOYSCAST fellowship of the Department of Science and Technology, India.

Author information

Corresponding author

Correspondence to Robert J Henry.

Additional information

[Henry RJ, Edwards M, Waters DLE, Gopala Krishnan S, Bundock P, Sexton TR, Masouleh AK, Nock CJ and Pattemore J 2012 Application of large-scale sequencing to marker discovery in plants. J. Biosci. 37 1–13] DOI 10.1007/s12038-012-9253-z

About this article

Cite this article

Henry, R.J., Edwards, M., Waters, D.L.E. et al. Application of large-scale sequencing to marker discovery in plants. J Biosci 37, 829–841 (2012). https://doi.org/10.1007/s12038-012-9253-z

  • DOI: https://doi.org/10.1007/s12038-012-9253-z
