Chromosome Research, Volume 19, Issue 7, pp 939–953

Exploring giant plant genomes with next-generation sequencing technology



Genome size in plants is characterised by its extraordinary range. Although it appears that the majority of plants have small genomes, in several lineages genome size has reached giant proportions. The recent advent of next-generation sequencing (NGS) methods has, for the first time, made detailed analysis of even the largest plant genomes a possibility. In this review, we highlight investigations that have utilised NGS for the study of plants with large genomes, and describe ongoing work that aims to harness the power of these technologies to gain insights into their evolution. In addition, we emphasise some areas of research where the use of NGS has the potential to generate significant advances in our current understanding of how plant genomes evolve. Finally, we discuss some of the future developments in sequencing technology that may further improve our ability to explore the content and evolutionary dynamics of the very largest genomes.


genome size evolution; next-generation sequencing; repetitive DNA; second-generation sequencing; transposable element



Chromatin immunoprecipitation followed by sequencing


Illegitimate recombination


Long interspersed nuclear element


Long terminal repeat


Mega base pairs of DNA


Miniature inverted repeat transposable element


Next-generation sequencing


Short interspersed nuclear element


Small interfering RNA

SMRT sequencing

Single molecule real-time sequencing


Small RNA


Transposable element


Third-generation sequencing


Terminal inverted repeat


Unequal homologous recombination



We thank James Tosh and Andrew Leitch for helpful comments on an earlier version of this manuscript, and Andrew Leitch, Richard Nichols, Mike Fay, Simon Renny-Byfield, Jiří Macas, Petr Novák and Pavel Neumann for useful discussion on the analysis of NGS data from plants with very large genomes. We also thank two anonymous reviewers and the editors for helpful comments that allowed us to improve this manuscript. Research into the dynamics of genome evolution in Fritillaria is part of a Natural Environment Research Council (NERC)-funded project (‘Evolutionary Dynamics of Genome Obesity’; grant number NE/G01724/1) to the Royal Botanic Gardens, Kew (UK) and Queen Mary, University of London (UK), and is being conducted in collaboration with the Biology Centre ASCR, Institute of Plant Molecular Biology (Czech Republic); 454 sequencing for this project is supported by the NERC Biomolecular Analysis Facility at the University of Liverpool (UK); plant material for this research has been kindly provided from specimens grown by Laurence Hill, Jeremy Broome, Richard Kernick and Kit Strange.



Copyright information

© Springer Science+Business Media B.V. 2011

Authors and Affiliations

  1. Jodrell Laboratory, Royal Botanic Gardens, Surrey, UK
  2. School of Biological and Chemical Sciences, Queen Mary University of London, London, UK
