Abstract
Genomic resources have recently been developed for a number of species of Fagaceae, with the purpose of identifying the genetic factors underlying the adaptation of these long-lived, biologically predominant, commercially and ecologically important species to their environment. Sequencing genomes the size of the oak genome (740 Mb/C) is now becoming both possible and affordable owing to breakthroughs in sequencing technology. However, an understanding of the composition and structure of the oak genome is required before launching a sequencing initiative. We constructed random (Rd) and hypomethylated (Hp) genomic libraries for pedunculate oak (Quercus robur) and sample-sequenced 2.33 and 2.36 Mb of shotgun DNA from the Rd and Hp libraries, respectively, to provide a first insight into the repetitive element and gene content of the oak genome. We found striking similarities between Rd sequences and previously analyzed BAC end sequences of pedunculate oak: a similar percentage of known repeat elements (5.56%), an almost identical simple sequence repeat (SSR) density (29 SSRs per 100 kb), and an identical profile of SSR motifs (in descending order of frequency: dinucleotide, pentanucleotide, trinucleotide, tetranucleotide, and hexanucleotide motifs). Conversely, the Hp fraction was, as expected, enriched in nuclear genes (2.44-fold enrichment). This enrichment was associated with a lower frequency of retrotransposons than in Rd sequences. We also identified twice as many SSR motifs in the Rd library as in the Hp library. This work provides useful information before opening a new chapter in oak genome sequencing.
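The two summary statistics quoted above, SSR density per 100 kb and fold enrichment, are simple ratios. The following is a minimal sketch of how they can be computed, not the authors' pipeline. The function names are hypothetical, 1 Mb is assumed to mean 10^6 bases, and the gene fractions passed to `fold_enrichment` are illustrative placeholders rather than values reported in the paper. As a back-of-envelope check, 29 SSRs per 100 kb over 2.33 Mb would correspond to roughly 676 SSRs.

```python
def ssr_density_per_100kb(n_ssrs: int, total_bp: int) -> float:
    """Number of SSRs per 100 kb of sequence."""
    if total_bp <= 0:
        raise ValueError("total_bp must be positive")
    return n_ssrs / total_bp * 100_000


def expected_ssr_count(density_per_100kb: float, total_bp: int) -> float:
    """SSR count implied by a given density over total_bp bases."""
    return density_per_100kb * total_bp / 100_000


def fold_enrichment(target_fraction: float, background_fraction: float) -> float:
    """Ratio of a feature's frequency in a target library to a background library."""
    if background_fraction <= 0:
        raise ValueError("background_fraction must be positive")
    return target_fraction / background_fraction


if __name__ == "__main__":
    # 29 SSRs per 100 kb and 2.33 Mb come from the abstract;
    # 1 Mb = 1,000,000 bases is an assumption.
    rd_bases = 2_330_000
    print(expected_ssr_count(29, rd_bases))

    # Hypothetical gene-hit fractions, chosen only to illustrate the ratio.
    hp_gene_fraction = 0.0244
    rd_gene_fraction = 0.0100
    print(fold_enrichment(hp_gene_fraction, rd_gene_fraction))
```

Expressing density per 100 kb rather than per library makes the Rd and Hp samples directly comparable, since the two libraries differ slightly in sequenced length (2.33 versus 2.36 Mb).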
Acknowledgments
This project was supported by INRA and the European Union: a postdoctoral fellowship awarded to I. Lesur (FORESTTRAC project, no. FP7-244096) and a PhD fellowship awarded to J. Durand (EVOLTREE project, no. 16322).
Additional information
Communicated by D. Grattapaglia
Electronic supplementary material
Below is the link to the electronic supplementary material.
S1: S1_SSR_hypo_random.txt. Hypomethylated and random genomic sequences containing at least one SSR (text file) (TXT 82 kb)
S2: S2_nuclear_SWI_hit_hypo_random.txt. Hypomethylated and random genomic sequences potentially coding for a nuclear gene (text file) (TXT 93 kb)
About this article
Cite this article
Lesur, I., Durand, J., Sebastiani, F. et al. A sample view of the pedunculate oak (Quercus robur) genome from the sequencing of hypomethylated and random genomic libraries. Tree Genetics & Genomes 7, 1277–1285 (2011). https://doi.org/10.1007/s11295-011-0412-4