Tree Genetics & Genomes, Volume 3, Issue 1, pp 61–70

Conserved ortholog sets in forest trees

  • Konstantin V. Krutovsky
  • Christine G. Elsik
  • Marta Matvienko
  • Alex Kozik
  • David B. Neale
Original Paper

Abstract

Putative single-copy genes and conserved ortholog sets (COS) were identified in three model plant species, thale cress (Arabidopsis thaliana), rice (Oryza sativa ssp. japonica), and poplar [black cottonwood, Populus trichocarpa (Torr. & Gray ex Brayshaw)], and used to find putative COS in four conifers (order Coniferales). Unique transcript sets were assembled from expressed sequence tag (EST) sequences in loblolly pine (Pinus taeda L.), white spruce [Picea glauca (Moench) Voss], Douglas-fir [Pseudotsuga menziesii (Mirb.) Franco var. menziesii], and sugi [Cryptomeria japonica (Thunberg ex Linnaeus f.) D. Don]. These transcript sets were compared with the COS identified in the three model plant species using comparative sequence analysis. Almost half of the single-copy genes in the herbaceous species (Arabidopsis and rice) had additional copies and homologs in poplar and conifers. The identified tentative COS have many applications in evolutionary genomics studies, phylogenetic analysis, and comparative mapping.

Keywords

  • COS
  • Cryptomeria japonica
  • EST
  • Ortholog
  • Picea glauca
  • Pinus taeda
  • Populus trichocarpa
  • Pseudotsuga menziesii
  • Unique transcript

Supplementary material

  • 11295_2006_52_MOESM1_ESM.fasta: C_japonica_55-COS-genes-shared-with-ARP (FASTA, 38 kb)
  • 11295_2006_52_MOESM2_ESM.fasta: P_glauca_359-COS-genes-shared-with-ARP (FASTA, 355 kb)
  • 11295_2006_52_MOESM3_ESM.fasta: P_menziesii_90-COS-genes-shared-with-ARP (FASTA, 56 kb)
  • 11295_2006_52_MOESM4_ESM.fasta: P_taeda_216-COS-genes-shared-with-ARP (FASTA, 175 kb)
  • 11295_2006_52_MOESM5_ESM.fasta: P_trichocarpa-753-COS-genes-shared-with-ARP (FASTA, 338 kb)
  • 11295_2006_52_MOESM6_ESM.fasta: Poplar-9605-single-copy-genes-locat-annot (FASTA, 3290 kb)
  • 11295_2006_52_MOESM7_ESM.fasta: rice-12004-single-hits (FASTA, 3849 kb)
  • 11295_2006_52_MOESM8_ESM.xls (9.8 MB): Table 1S, contigs ESTs ID (XLS, 10230 kb)
  • 11295_2006_52_MOESM9_ESM.xls (9.4 MB): Table 2S, COS summary (XLS, 9884 kb)
  • 11295_2006_52_MOESM10_ESM.xls (46 kb): Table 3S, 26 COS annotation (XLS, 46 kb)
  • 11295_2006_52_MOESM11_ESM.xls (514 kb): Table 4S, 753 COS trees (XLS, 526 kb)

Copyright information

© Springer-Verlag 2006

Authors and Affiliations

  • Konstantin V. Krutovsky (1)
  • Christine G. Elsik (2)
  • Marta Matvienko (3)
  • Alex Kozik (4)
  • David B. Neale (5, 6)

  1. Department of Forest Science, Texas A&M University, College Station, USA
  2. Department of Animal Science, Texas A&M University, College Station, USA
  3. Allometra, LLC, Davis, USA
  4. Genome Center, University of California, Davis, USA
  5. Department of Plant Sciences, University of California, Davis, USA
  6. Institute of Forest Genetics, Pacific Southwest Research Station, US Department of Agriculture Forest Service, Davis, USA
