
Large-scale computational analysis of poplar ESTs reveals the repertoire and unique features of expressed genes in the poplar genome

Molecular Breeding

Abstract

Perennial woody plants differ from annual herbaceous plants in several ways and are expected to have evolved a unique repertoire of functional genes with distinct expression profiles. Poplar, a model tree species for which a large number of ESTs are publicly available, was used to carry out a large-scale comparative analysis with the expressed sequences of eight plant species. First, we obtained 105,831 poplar ESTs from public databases and identified a set of 25,282 unigenes (i.e., tentative non-redundant sequences). The majority of the unigenes (56%) had significant matches to Arabidopsis genes. We then estimated poplar multigene families by counting the tBLASTX matches of each unigene against the poplar unigene dataset itself. Forty-seven percent of the 25,282 unigenes were subsequently organized into 3,481 multigene families, 89% of which had fewer than five members. In poplar, protein kinases represent the largest family, followed by GTP-binding proteins and Myb transcription factors. Several multigene families had a higher copy number in poplar than in Arabidopsis, hinting at potential lineage-specific proliferation of poplar protein families. Such expansion may be related to the adaptation of perennial poplars to the high degree of environmental stress that affects their growth and survival. Comparison of poplar unigenes with the Arabidopsis transcriptome revealed that genes involved in transcriptional regulation are the most divergent, while metabolism-related genes are the most conserved.
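The abstract does not say how self-matches were turned into families, so the following is only a minimal sketch of one plausible approach: single-linkage grouping of unigenes that share any self-hit, followed by the two summary fractions the abstract reports. The function names, the toy unigene IDs, and the assumption that hits arrive as already-filtered (query, subject) pairs are all hypothetical and not taken from the paper.

```python
def group_into_families(unigene_ids, hits):
    """Single-linkage grouping (an assumption, not the paper's stated method):
    unigenes connected by any chain of self-hits land in the same family."""
    parent = {u: u for u in unigene_ids}

    def find(u):
        # Follow parent pointers to the root, compressing the path as we go.
        while parent[u] != u:
            parent[u] = parent[parent[u]]
            u = parent[u]
        return u

    for query, subject in hits:
        if query == subject:
            continue  # a unigene matching itself carries no family information
        root_q, root_s = find(query), find(subject)
        if root_q != root_s:
            parent[root_s] = root_q

    groups = {}
    for u in unigene_ids:
        groups.setdefault(find(u), []).append(u)
    # Only groups with at least two members count as multigene families.
    return [sorted(members) for members in groups.values() if len(members) >= 2]


def summarize(unigene_ids, families):
    """Report family count and the two fractions analogous to those in the abstract."""
    in_families = sum(len(f) for f in families)
    small = sum(1 for f in families if len(f) < 5)
    return {
        "families": len(families),
        "fraction_of_unigenes_in_families": in_families / len(unigene_ids),
        "fraction_of_families_under_5": small / len(families) if families else 0.0,
    }


# Toy input: hypothetical unigene IDs and pre-filtered significant hit pairs.
unigenes = ["U1", "U2", "U3", "U4", "U5", "U6"]
hits = [("U1", "U1"), ("U1", "U2"), ("U2", "U3"), ("U4", "U5")]

families = group_into_families(unigenes, hits)
stats = summarize(unigenes, families)
print(families)
print(stats)
```

Single linkage tends to merge families through shared domains, so a real analysis would likely need a stricter match criterion than the one assumed here.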



Author information


Corresponding author

Correspondence to Kyung-Hwan Han.


About this article

Cite this article

Park, S., Oh, S. & Han, KH. Large-scale computational analysis of poplar ESTs reveals the repertoire and unique features of expressed genes in the poplar genome. Molecular Breeding 14, 429–440 (2004). https://doi.org/10.1007/s11032-004-0603-x


  • DOI: https://doi.org/10.1007/s11032-004-0603-x
