Abstract
Perennial woody plants differ from annual herbaceous plants in several ways and are expected to have evolved a unique repertoire and expression profile of functional genes. Poplar, a model tree species for which a large number of ESTs are publicly available, was used to carry out a large-scale comparative analysis with the expressed sequences of eight plant species. First, we obtained 105,831 poplar ESTs from public databases and identified a set of 25,282 unigenes (i.e., tentative non-redundant sequences). The majority of the unigenes (56%) had significant matches to Arabidopsis genes. We then estimated poplar multigene families by counting the tBLASTX matches of each unigene against the poplar unigene dataset itself. Forty-seven percent of the 25,282 unigenes were subsequently organized into 3,481 multigene families, 89% of which had fewer than five members. In poplar, protein kinases represent the largest family, followed by GTP-binding proteins and Myb transcription factors. Several multigene families had a higher copy number in poplar than in Arabidopsis, hinting at potential lineage-specific proliferation of poplar protein families. Such expansion may be related to the adaptation of perennial poplars to the high degree of environmental stress that affects growth and survival. Comparison of poplar unigenes with the Arabidopsis transcriptome revealed that genes involved in transcriptional regulation are the most divergent, while metabolism-related genes are the most conserved.
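The family-estimation step described above (searching the unigene set against itself and grouping unigenes by their matches) can be illustrated with a minimal sketch. The abstract does not state the grouping rule or the significance cutoff, so the single-linkage grouping, the `max_evalue` threshold of 1e-10, the function name `group_families`, and the toy hit list below are all assumptions made for illustration, not the authors' procedure.

```python
def group_families(hits, max_evalue=1e-10):
    """Group unigenes into putative families by single linkage.

    hits: iterable of (query_id, subject_id, evalue) tuples, for example
    parsed from tabular self-against-self tBLASTX output.
    max_evalue: assumed significance cutoff (not taken from the paper).
    """
    parent = {}

    def find(x):
        # Union-find lookup with path halving.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for query, subject, evalue in hits:
        # Ignore self-hits and matches weaker than the cutoff.
        if query == subject or evalue > max_evalue:
            continue
        root_q, root_s = find(query), find(subject)
        if root_q != root_s:
            parent[root_s] = root_q

    families = {}
    for gene in list(parent):
        families.setdefault(find(gene), set()).add(gene)
    # A family needs at least two members; singletons are not families.
    return [members for members in families.values() if len(members) >= 2]


# Hypothetical hits: U1-U2-U3 chain together, U4-U5 pair up,
# U6-U7 is too weak to count, and U8 only matches itself.
toy_hits = [
    ("U1", "U1", 0.0),
    ("U1", "U2", 1e-50),
    ("U2", "U3", 1e-30),
    ("U4", "U5", 1e-20),
    ("U6", "U7", 1e-3),
    ("U8", "U8", 0.0),
]
families = group_families(toy_hits)
family_sizes = sorted(len(members) for members in families)
small_share = sum(size < 5 for size in family_sizes) / len(family_sizes)
```

Here `family_sizes` plays the role of the copy-number distribution, and `small_share` corresponds to the kind of summary reported above as the share of families with fewer than five members. Single linkage is only one possible choice; it tends to merge families that share a common domain, so a stricter rule may be preferable on real data.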
Cite this article
Park, S., Oh, S. & Han, KH. Large-scale computational analysis of poplar ESTs reveals the repertoire and unique features of expressed genes in the poplar genome. Molecular Breeding 14, 429–440 (2004). https://doi.org/10.1007/s11032-004-0603-x
DOI: https://doi.org/10.1007/s11032-004-0603-x