Background

Many important cellular components are ribonucleoprotein (RNP) complexes, such as the spliceosome and the ribosome, which have key roles in gene regulation and translation. The telomerase RNP is a reverse transcriptase that maintains the telomeric repeats of eukaryotic chromosomes. Telomerase is composed of two proteins, the functionally essential reverse transcriptase TERT and the non-essential TEP1 (also known as TP1 or TLP1), as well as the telomerase RNA. TEP1 [1, 2] is also a component of the enigmatic vault RNP [3]. The vault is a huge structure (13 MDa) of unknown function. The vault RNP is mainly composed of the major vault protein (MVP), but also contains smaller amounts of TEP1 and VPARP as well as the vault RNA. Although vaults are predominantly cytoplasmic, a portion of them is found associated with nuclear pores [4]. Vaults have been suggested to be involved in multidrug resistance, nucleo-cytoplasmic transport, and the formation of RNPs [5]. While investigating the components of these RNPs, we noticed an interesting protein similarity.

Results and Discussion

The complete sequence of the Tetrahymena thermophila telomerase p80 component (Swiss:Q94818), a homologue of TEP1, was used to seed a PSI-blast (Position-Specific Iterated BLAST) search at NCBI using the default inclusion threshold [6]. In the first round, the search identified the vertebrate telomerase/vault component TEP1 and uncharacterised bacterial sequences from Clostridium thermocellum, Cytophaga hutchinsonii, Streptomyces coelicolor and Salmonella typhimurium. The second round of searching identified vertebrate homologues of the Ro60 ribonucleoprotein, with E-values as low as 6 × 10⁻¹², as well as three further bacterial sequences from Pseudomonas fluorescens, Nostoc punctiforme and Deinococcus radiodurans. The D. radiodurans sequence has previously been identified as a homologue of the vertebrate Ro60 protein [7]. A short region of similarity between p80 and Ro60 was noted previously, but the biological significance of this observation was not discussed further [2].

Ro60 is the protein component of the Ro RNP complex, which also contains a Y RNA. The region of similarity between Ro60 and TEP1 was over 800 amino acids in length. Protein domains typically range from 30 to 500 amino acids in length, so this region was too long to be a single domain, and smaller regions of these proteins were therefore investigated. A PSI-blast search with the C-terminal residues 514 to 719 of p80 revealed matches to known vWA proteins, indicating that the C-terminal region in these proteins is a vWA domain. This new search also identified VPARP, a poly-ADP-ribose polymerase associated with the vault complex [8] (see also http://www.vaults.arc.ucla.edu/), as containing a closely related vWA domain, which was noted previously [9]. It is somewhat surprising that two components of the vault have such closely related vWA domains. It has been suggested that the vWA domain in VPARP binds a metal ion and might be involved in complex assembly [10]. However, the region of the vWA domain is unlikely to be a site of major vault protein (MVP) binding [9], so perhaps it is involved in an interaction between TEP1 and VPARP, or binds an as yet unidentified transient component.

Using PSI-blast as above, similarity to the amino-terminal 500 amino acids of p80 was found to be restricted to TEP1, Ro60 and uncharacterised bacterial proteins. A multiple sequence alignment of this region is shown in Figure 1. This region is large, ranging from 286 residues in the C. thermocellum homologue to 485 residues in the p80 protein from Tetrahymena thermophila, and so may not correspond to a single protein domain. We therefore call this evolutionarily conserved region a module. The longer members of this family have multiple long insertions that are not found in the shorter homologues. The region is named the TROVE module after the Telomerase, Ro and Vault ribonucleoproteins in which it is found. The alignment of the TROVE module contains a few absolutely conserved residues. None of these conserved residues is of the polar types of amino acid typically found in active sites, so it seems unlikely that this region has an enzymatic function. Tetrahymena p80 is known to bind telomerase RNA [11], so the RNA-binding activity must reside in either the TROVE module or the vWA domain of p80. Given the known functions of vWA domains, it is likely that the RNA-binding function resides within the TROVE module.
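The check for absolutely conserved residues described above can be illustrated with a minimal sketch. The alignment below is a made-up toy example, not the TROVE alignment, and the set of residues treated as typical of active sites (CATALYTIC_LIKE) is an assumption chosen for illustration only.

```python
from typing import List, Tuple

# Assumed, illustrative set of polar/charged residues often seen in active sites.
CATALYTIC_LIKE = set("DEHKRSTCYNQ")


def conserved_columns(alignment: List[str]) -> List[Tuple[int, str]]:
    """Return (column index, residue) for columns identical in every sequence.

    Columns consisting only of gaps are ignored.
    """
    if not alignment:
        return []
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")
    result = []
    for i in range(length):
        column = {seq[i] for seq in alignment}
        if len(column) == 1:
            residue = column.pop()
            if residue != "-":
                result.append((i, residue))
    return result


# Hypothetical toy alignment (three sequences, eight columns).
toy_alignment = [
    "MG-LKAGW",
    "MGALRAGF",
    "MG-LKSGW",
]

conserved = conserved_columns(toy_alignment)
# Keep only conserved residues that belong to the assumed catalytic-like set.
polar_conserved = [(i, r) for i, r in conserved if r in CATALYTIC_LIKE]

print(conserved)
print(polar_conserved)
```

In this toy case every invariant column holds a non-polar residue, so the filtered list is empty, which mirrors the kind of observation made for the TROVE alignment.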

Figure 1

An alignment of TROVE modules. The alignment was generated using MAFFT [15] and coloured using Chroma with the default colouring scheme [16]. The 4th position in the RNP-1 motif proposed by van Horn et al. [12] is marked with an asterisk. The Swiss-Prot or GenBank accession numbers for the proteins in the alignment are as follows: Cthermocellum (ZP_00060193), Scoelicolor (Q9X9W7), Tthermophila_p80 (Q94818), Mmusculus_TEP1 (P97499), Hsapiens_TEP1 (Q99973), Npunctiforme (ZP_00108461), Dradiodurans_Ro (Q9RUW8), Celegans_Ro (Q27274), Mmusculus_Ro (O08848), Styphimurium (Q8ZLH8), Pfluorescens (ZP_00086137).

An RNA-binding RRM domain has been proposed in the Ro60 proteins from human, frog and worm [12] owing to the presence of the two classic motifs RNP-1 and RNP-2 [13]. This domain would lie within the proposed TROVE module. Although there are interesting similarities with the RNP RNA-binding motifs, examination of the broader TROVE alignment indicates that the 4th RNP-1 position (marked with an asterisk in Figure 1) is a conserved polar residue, whereas in known RRM domains this position is a buried beta-sheet anchor residue and is consistently non-polar. In addition, the spacing of the proposed RNP-1 and RNP-2 motifs in human and frog would be one of the shortest separations observed in RRM domains, and is inconsistent with the known structures of RRMs. On the basis of this sequence analysis, the presence of an RRM, although plausible given the function of Ro60, seems unlikely.
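The polarity argument for a single alignment position can be sketched as follows. The residue groupings (POLAR, NONPOLAR) are a simplified assumption, and both example columns are hypothetical strings invented for illustration; they are not taken from Figure 1 or from any RRM alignment.

```python
from typing import Dict

# Assumed, simplified residue classes; other groupings are equally defensible.
NONPOLAR = frozenset("AVLIMFWGPC")
POLAR = frozenset("STNQYDEKRH")


def column_profile(column: str) -> Dict[str, float]:
    """Return the fraction of polar and non-polar residues in one column.

    Gap characters ('-') are excluded from the count.
    """
    residues = [r for r in column.upper() if r != "-"]
    if not residues:
        return {"polar": 0.0, "nonpolar": 0.0}
    n = len(residues)
    return {
        "polar": sum(r in POLAR for r in residues) / n,
        "nonpolar": sum(r in NONPOLAR for r in residues) / n,
    }


# Hypothetical columns: one residue per sequence at the position of interest.
hypothetical_trove_column = "DDEDN-D"
hypothetical_rrm_column = "AGAAGVA"

trove_profile = column_profile(hypothetical_trove_column)
rrm_profile = column_profile(hypothetical_rrm_column)

print(trove_profile)
print(rrm_profile)
```

A position that is uniformly polar in one family and uniformly non-polar in another, as in these invented columns, is the pattern that argues against the two families sharing the same buried anchor residue.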

Common domains are often found in proteins involved in related cellular processes. For example, the PAZ domain is found in Dicer and Piwi proteins, which are involved in post-transcriptional gene silencing [14] and are both part of the RISC complex. The discovery of the TROVE module in three RNPs is intriguing and suggests that these three RNPs might be involved in inter-related processes.

What is the function of the bacterial TROVE-containing proteins? The Deinococcus homologue is known to be part of a Ro-like RNP that even contains a Y-like RNA molecule [7]. A phylogenetic tree built from the TROVE module alignment (see Figure 3) shows that the Deinococcus homologue does indeed cluster with the known Ro60 proteins, as does the Nostoc punctiforme homologue. It therefore seems likely that Nostoc punctiforme also contains a Ro RNP. The other bacterial homologues cannot be attributed to either the TEP1-like or the Ro60-like subfamily, so we cannot assign any function to these proteins except that they may be part of an as yet unidentified RNP complex. Given the wide but patchy distribution of TROVE module-containing proteins, we suggest that they are an ancient RNA-binding component of RNP complexes.

Authors' Contributions

AB carried out sequence analysis and produced figures. AB and VK authored the manuscript.

Figure 2

A schematic view of the domain architectures of TROVE module containing proteins. Domains shown with Pfam [17] or SMART [18] accessions: WD40 (PF00400), BRCT (PF00533), PARP (PF00644), VIT (SM0609), vWA (SM0327, PF00092), TEP1_N (PF05386). The MVPint domain is the MVP interaction domain [8, 9].

Figure 3

A tree constructed using the neighbour-joining algorithm implemented in the QuickTree program [19]. 500 bootstrap replicates were used and values over 75% are shown.
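The bootstrap step mentioned in this caption can be sketched as resampling alignment columns with replacement. This is a minimal illustration and not the QuickTree implementation: the toy alignment is invented, the uncorrected p-distance is an assumed choice of distance measure, and the neighbour-joining step itself is not shown.

```python
import random
from typing import List


def bootstrap_alignment(alignment: List[str], rng: random.Random) -> List[str]:
    """Resample alignment columns with replacement, keeping rows aligned."""
    length = len(alignment[0])
    picks = [rng.randrange(length) for _ in range(length)]
    return ["".join(seq[i] for i in picks) for seq in alignment]


def p_distance(a: str, b: str) -> float:
    """Fraction of differing sites among columns where neither sequence has a gap."""
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        return 0.0
    return sum(x != y for x, y in pairs) / len(pairs)


# Hypothetical toy alignment; a real analysis would use the TROVE alignment.
toy = ["MGALKAGW", "MGALRAGF", "MG-LKSGW"]

rng = random.Random(0)  # fixed seed so the sketch is reproducible
replicates = [bootstrap_alignment(toy, rng) for _ in range(500)]

# Each replicate would yield a distance matrix for a tree-building method
# such as neighbour joining; only the first replicate is computed here.
first = replicates[0]
matrix = [[p_distance(s, t) for t in first] for s in first]
print(matrix)
```

The support for a grouping is then the percentage of replicate trees in which that grouping appears, and only values above a chosen cut-off (75% in Figure 3) are displayed.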