Abstract
An interactive clustering model based on positional weight matrices is described and results obtained using the model to analyze gene regulation patterns in archaea are presented. The 5′ flanking sequences of ORFs identified in four archaea, Sulfolobus solfataricus, Pyrobaculum aerophilum, Halobacterium sp. NRC-1, and Pyrococcus abyssi, were clustered using the model. Three regular patterns of clusters were identified for most ORFs. One showed genes with only a ribosome-binding site; another showed genes with a transcriptional regulatory region located at a constant location with respect to the start codon. A third pattern combined the previous two. Both P. aerophilum and Halobacterium sp. NRC-1 exhibited clusters of genes that lacked any regular pattern. Halobacterium sp. NRC-1 also presented regular features not seen in the other organisms. This group of archaea seems to use a combination of eubacterial and eukaryotic regulatory features as well as some unique to individual species. Our results suggest that interactive clustering may be used to examine the divergence of the gene regulatory machinery in archaea and to identify the presence of archaea-specific gene regulation patterns.
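As a rough illustration of the positional weight matrix idea that the clustering model builds on, the following minimal sketch derives a log-odds matrix from a handful of equal-length sequences and uses it to score candidates. It is not the authors' interactive clustering model. The function names (`build_pwm`, `score`), the pseudocount, the uniform background frequency, and the toy sequences are all assumptions made for this example.

```python
import math

BASES = "ACGT"

def build_pwm(seqs, pseudocount=1.0, background=0.25):
    """Build a log-odds position weight matrix from equal-length DNA sequences.

    Hypothetical helper: pseudocount and uniform background are assumptions.
    """
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("all sequences must have the same length")
    pwm = []
    for pos in range(length):
        # Start every base at the pseudocount so unseen bases keep a finite score.
        counts = {b: pseudocount for b in BASES}
        for s in seqs:
            counts[s[pos]] += 1
        total = sum(counts.values())
        # Log2 ratio of the observed frequency to the background frequency.
        pwm.append({b: math.log2(counts[b] / total / background) for b in BASES})
    return pwm

def score(pwm, seq):
    """Sum the per-position log-odds weights for a sequence of matching length."""
    return sum(col[base] for col, base in zip(pwm, seq))

# Made-up sequences standing in for one cluster of aligned 5' flanking regions.
toy_cluster = ["GGAGG", "GGAGG", "GGTGG", "GGAGA"]
pwm = build_pwm(toy_cluster)
```

A sequence resembling the cluster, such as `score(pwm, "GGAGG")`, scores higher than an unrelated one, such as `score(pwm, "TTTTT")`. A clustering procedure could use that difference to decide which cluster a flanking sequence fits best.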
Acknowledgements
We thank two anonymous reviewers for their critical suggestions on revising the initial manuscript.
Additional information
Communicated by K. Horikoshi and F. Robb
About this article
Cite this article
Wan, XF., Bridges, S.M. & Boyle, J.A. Revealing gene transcription and translation initiation patterns in archaea, using an interactive clustering model. Extremophiles 8, 291–299 (2004). https://doi.org/10.1007/s00792-004-0388-1
DOI: https://doi.org/10.1007/s00792-004-0388-1