Tree Genetics & Genomes, Volume 7, Issue 6, pp 1277–1285

A sample view of the pedunculate oak (Quercus robur) genome from the sequencing of hypomethylated and random genomic libraries

  • Isabelle Lesur
  • Jérome Durand
  • Federico Sebastiani
  • Niclas Gyllenstrand
  • Catherine Bodénès
  • Martin Lascoux
  • Antoine Kremer
  • Giovanni G. Vendramin
  • Christophe Plomion
Original Paper


Genomic resources have recently been developed for several species of Fagaceae to identify the genetic factors underlying the adaptation of these long-lived, biologically predominant, and commercially and ecologically important species to their environment. Breakthroughs in sequencing technology are now making it both possible and affordable to sequence genomes the size of the oak genome (740 Mb/C). However, an understanding of the composition and structure of the oak genome is required before launching a sequencing initiative. We constructed random (Rd) and hypomethylated (Hp) genomic libraries for pedunculate oak (Quercus robur) and sample-sequenced 2.33 and 2.36 Mb of shotgun DNA from the Rd and Hp libraries, respectively, to provide a first insight into the repetitive element and gene content of the oak genome. We found striking similarities between Rd sequences and previously analyzed BAC end sequences of pedunculate oak: a similar percentage of known repeat elements (5.56%), an almost identical simple sequence repeat (SSR) density (29 SSRs per 100 kb), and an identical profile of SSR motifs (in descending order of frequency: dinucleotide, pentanucleotide, trinucleotide, tetranucleotide, and hexanucleotide motifs). Conversely, the Hp fraction was, as expected, enriched in nuclear genes (2.44-fold enrichment). This enrichment was associated with a lower frequency of retrotransposons than in Rd sequences. We also identified twice as many SSR motifs in the Rd library as in the Hp library. This work provides useful information before opening a new chapter in oak genome sequencing.
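To put the figures quoted in the abstract in proportion, the arithmetic can be sketched as follows. This is only an illustrative sketch, not the authors' pipeline: it assumes 1 Mb = 1000 kb, that the reported SSR density applies uniformly to the Rd sample, and that fold enrichment is defined as a ratio of hit proportions between the two libraries. The function names and the hit counts in the usage note are hypothetical.

```python
# Figures quoted in the abstract.
GENOME_SIZE_MB = 740.0   # oak genome size (Mb/C)
RD_SAMPLE_MB = 2.33      # shotgun sequence from the random (Rd) library
SSR_PER_100_KB = 29.0    # reported SSR density

def sampled_fraction(sample_mb, genome_mb=GENOME_SIZE_MB):
    """Fraction of the genome covered by a sample, ignoring overlaps."""
    return sample_mb / genome_mb

def expected_ssr_count(sample_mb, density_per_100kb=SSR_PER_100_KB):
    """Approximate SSR count implied by a density, assuming 1 Mb = 1000 kb
    and a uniform density across the sample (an assumption)."""
    return density_per_100kb * (sample_mb * 1000.0) / 100.0

def fold_enrichment(hp_hits, hp_total, rd_hits, rd_total):
    """Ratio of the proportion of gene hits in Hp to that in Rd.
    This definition is an assumption; the abstract does not state it."""
    return (hp_hits / hp_total) / (rd_hits / rd_total)
```

Under these assumptions, `sampled_fraction(RD_SAMPLE_MB)` is roughly 0.3% of the genome and `expected_ssr_count(RD_SAMPLE_MB)` is on the order of 680 SSRs. With made-up counts such as `fold_enrichment(244, 1000, 100, 1000)`, the function returns about 2.44, which matches the reported enrichment only because the inputs were chosen to do so.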


Quercus robur, Genome composition, Hypomethylated libraries, SSR



This project was supported by INRA and the European Union: a postdoctoral fellowship awarded to I. Lesur (FORESTTRAC project, no. FP7-244096) and a PhD fellowship awarded to J. Durand (EVOLTREE project, no. 16322).

Supplementary material

S1: S1_SSR_hypo_random.txt (11295_2011_412_MOESM1_ESM.txt, TXT, 82 kb). Hypomethylated and random genomic sequences containing at least one SSR.
S2: S2_nuclear_SWI_hit_hypo_random.txt (11295_2011_412_MOESM2_ESM.txt, TXT, 93 kb). Hypomethylated and random genomic sequences potentially coding for a nuclear gene.



Copyright information

© Springer-Verlag 2011

Authors and Affiliations

  • Isabelle Lesur (1)
  • Jérome Durand (1)
  • Federico Sebastiani (2)
  • Niclas Gyllenstrand (3)
  • Catherine Bodénès (1)
  • Martin Lascoux (4)
  • Antoine Kremer (1)
  • Giovanni G. Vendramin (2)
  • Christophe Plomion (1, corresponding author)

  1. INRA, UMR1202 BIOGECO, Cestas, France
  2. Plant Genetics Institute, National Research Council, Florence, Italy
  3. Department of Plant Biology and Forest Genetics, Swedish University of Agricultural Sciences, Uppsala, Sweden
  4. Evolutionary Biology Center, Uppsala University, Uppsala, Sweden
