Journal of Biosciences

Volume 37, Issue 5, pp 829–841

Application of large-scale sequencing to marker discovery in plants

  • Robert J Henry
  • Mark Edwards
  • Daniel L E Waters
  • Gopala Krishnan S
  • Peter Bundock
  • Timothy R Sexton
  • Ardashir K Masouleh
  • Catherine J Nock
  • Julie Pattemore


Advances in DNA sequencing provide tools for efficient large-scale discovery of markers for use in plants. Discovery options include large-scale amplicon sequencing, transcriptome sequencing, gene-enriched genome sequencing and whole genome sequencing. Examples of each of these approaches, and of their potential to generate molecular markers for specific applications, are described. Sequencing the whole genomes of parents identifies all the polymorphisms available for analysis in their progeny. Sequencing PCR amplicons of sets of candidate genes from DNA bulks can define the variation in these genes that might be exploited in a population or germplasm collection. Sequencing the transcriptomes of genotypes that vary for the trait of interest may identify genes whose patterns of expression could explain the phenotypic variation. Sequencing genomic DNA enriched for genes, by hybridization with probes for all or some of the known genes, simplifies sequencing and the analysis of differences in gene sequences across large numbers of genotypes and genes, especially when working with complex genomes.


DNA markers plants sequencing 



Copyright information

© Indian Academy of Sciences 2012

Authors and Affiliations

  • Robert J Henry (1)
  • Mark Edwards (2)
  • Daniel L E Waters (2)
  • Gopala Krishnan S (2, 3)
  • Peter Bundock (2)
  • Timothy R Sexton (4)
  • Ardashir K Masouleh (2)
  • Catherine J Nock (2)
  • Julie Pattemore (5)
  1. Queensland Alliance for Agriculture and Food Innovation, The University of Queensland, Brisbane, Australia
  2. Southern Cross Plant Science, Southern Cross University, Lismore, Australia
  3. Division of Genetics, Indian Agricultural Research Institute, New Delhi, India
  4. Department of Forest Sciences, The University of British Columbia, Vancouver, Canada
  5. EH Graham Centre for Agricultural Innovation, School of Agricultural and Wine Sciences, Charles Sturt University, Wagga Wagga, Australia
