Characteristics of the tomato nuclear genome as determined by sequencing undermethylated EcoRI digested fragments

  • Original Paper
Theoretical and Applied Genetics

Abstract

A collection of 9,990 single-pass nuclear genomic sequences, corresponding to 5 Mb of tomato DNA, was obtained using a methylation filtration (MF) strategy and reduced to 7,053 unique undermethylated genomic islands (UGIs) distributed as follows: (1) 59% non-coding sequences, (2) 28% coding sequences, (3) 12% transposons, 96% of which are class I retroelements, and (4) 1% organellar sequences integrated into the nuclear genome over approximately the past 100 million years. A more detailed analysis of coding UGIs indicates that the unmethylated portion of tomato genes extends as far as 676 bp upstream and 766 bp downstream of coding regions, with averages of 174 and 171 bp, respectively. Based on the analysis of the UGI copy distribution, the undermethylated portion of the tomato genome is determined to account for the majority of the unmethylated genes in the genome and is estimated to constitute 61±15 Mb of DNA (~5% of the entire genome), which is significantly less than the 220 Mb estimated for the gene-rich euchromatic arms of the tomato genome. This result indicates that, while most genes reside in the euchromatin, a significant portion of euchromatin is methylated in the intergenic spacer regions. Implications of the results for sequencing the genome of tomato and other solanaceous species are discussed.
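The proportions quoted in the abstract can be turned into rough absolute figures. The sketch below is illustrative only and is not the authors' analysis. It uses the numbers stated above; because the percentages are rounded, the per-category counts are approximations. The helper names are hypothetical, and expressing 61±15 Mb as a share of the 220 Mb euchromatin assumes that the undermethylated DNA lies within the euchromatic arms.

```python
# Figures quoted in the abstract.
TOTAL_UGIS = 7053
CATEGORY_PERCENT = {
    "non-coding": 59,
    "coding": 28,
    "transposons": 12,
    "organellar": 1,
}
UNDERMETHYLATED_MB = 61.0      # estimated undermethylated DNA
UNDERMETHYLATED_ERR_MB = 15.0  # stated uncertainty (the "±15")
EUCHROMATIN_MB = 220.0         # estimated gene-rich euchromatic arms


def approximate_counts(total, percents):
    """Convert rounded percentages into approximate UGI counts (hypothetical helper)."""
    return {name: round(total * pct / 100) for name, pct in percents.items()}


def euchromatin_share(mb, err, euchromatin_mb):
    """Return (low, central, high) shares of euchromatin that are undermethylated.

    Assumption: all undermethylated DNA is located in the euchromatin.
    """
    return ((mb - err) / euchromatin_mb,
            mb / euchromatin_mb,
            (mb + err) / euchromatin_mb)


counts = approximate_counts(TOTAL_UGIS, CATEGORY_PERCENT)
low, central, high = euchromatin_share(
    UNDERMETHYLATED_MB, UNDERMETHYLATED_ERR_MB, EUCHROMATIN_MB
)

for name, n in counts.items():
    print(f"{name}: ~{n} UGIs")
print(f"undermethylated share of euchromatin: {central:.0%} ({low:.0%} to {high:.0%})")
```

Under these assumptions, roughly a quarter to a third of the euchromatin would be undermethylated, which is consistent with the statement that a significant portion of euchromatin is methylated.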

Acknowledgements

This work was partially supported by National Science Foundation Grants DBI-0116076, "Exploitation of tomato as a model for comparative and functional genomics", and 0421634, "Sequence and annotation of the euchromatin of tomato". Sequencing of MF tomato clones was accomplished at the Institute for Genomic Research (TIGR, www.tigr.org). Thanks to Dr. Valentina Vysotskaia from Exelixis, Inc. for sharing sequences of three BACs (181O9, 181C9, and 181K1) with us, and to Dr. Rick Durrett and Dr. Pablo Rabinowicz for helpful discussions.

Author information

Corresponding author

Correspondence to S. D. Tanksley.

Additional information

Communicated by R. Hagemann

About this article

Cite this article

Wang, Y., van der Hoeven, R.S., Nielsen, R. et al. Characteristics of the tomato nuclear genome as determined by sequencing undermethylated EcoRI digested fragments. Theor Appl Genet 112, 72–84 (2005). https://doi.org/10.1007/s00122-005-0107-z

  • DOI: https://doi.org/10.1007/s00122-005-0107-z
