The recent completion of the Arabidopsis thaliana genome sequence and its accessibility in annotated form [1] marks an essential breakthrough for basic and applied plant science. Extensive bioinformatics analysis, using both extrinsic and intrinsic data, initially detected 25,498 genes within the Arabidopsis genome. Around 69% of the corresponding proteins could be classified according to their sequence similarity to proteins of known function in plants and other organisms. Approximately 51% of the genes contain a functional domain detectable by InterPro. InterPro has proven to be especially powerful for functional domain detection [1,2,3].

One of the most abundant domains detected in the Arabidopsis proteome is the RING-finger domain, which was found 365 times in the initial characterization of the genome [1]. In fact, 1.42% of Arabidopsis proteins contain a RING-domain signature. Thus, it is overrepresented in Arabidopsis as compared to other complete eukaryotic genomes (Drosophila melanogaster, Caenorhabditis elegans and Saccharomyces cerevisiae), which contain 0.7-0.75% of RING-domain proteins. To date, the significance of this observation remains unclear.

The RING domain was originally named after the acronym for the first protein it was found in, encoded by the Really Interesting New Gene [4,5]. The motif is related to the zinc-finger domain; however, zinc fingers consist of two pairs of zinc ligands coordinately binding one zinc ion, whereas RING fingers consist of four pairs of ligands binding two ions. Two other motifs that consist of four metal-ligand pairs binding two zinc ions are related to the RING finger - the PHD (plant homeodomain) [6] and LIM (Lin11/Isl-1/Mec-3) [7] domains (Figure 1).

Figure 1

Schematic presentation of the structure of prototypical RING, PHD and LIM domains. The metal-ligand residues, either cysteine (C) or histidine (H), are shown as numbered spheres. Two pairs of metal ligands coordinate one zinc ion (hexagon). The numbers next to the loops connecting the metal-ligand residues indicate the minimum and maximum number of loop residues. (a) The structure of a RING domain (RING-HC type). The metal-ligand pairs 1 and 3 coordinate one zinc ion, while pairs 2 and 4 coordinate the second one in a so-called cross-brace arrangement. (b) The structure of a PHD domain reveals a cross-brace arrangement similar to the RING domain. (c) The LIM domain structure is distinct in its consecutive zinc ligation scheme: the first zinc ion is coordinated by the metal-ligand pairs 1 and 2, while the second ion is coordinated by pairs 3 and 4.

The RING-domain structure has been resolved at atomic resolution for three proteins - promyelocytic leukemia protein (PML), equine herpes virus protein (IEEVH) and the human recombination protein RAG1 [8,9,10]. These studies revealed that the RING domain forms a distinct so-called cross-brace structure, in which metal-ligand pairs 1 and 3 coordinate to bind one zinc ion, and pairs 2 and 4 bind the second one (Figure 1a). Also, the structures of the LIM-domain proteins CRP1, CRIP and CRP2 [11,12,13] have been resolved and indicate that LIM domains behave like a double zinc finger, coordinating two zinc ions with the two consecutive pairs of ligands (Figure 1c). Finally, the recent solution of the structure of the PHD domain of KAP-1 [14] demonstrates that the zinc-ligation scheme of the PHD domain is similar to the RING domain; that is, it folds into a cross-brace structure (Figure 1b).

In functional terms, the RING domain can basically be considered a protein-interaction domain, and RING-finger proteins have been implicated in a range of diverse biological processes and biochemical activities, from transcriptional and translational regulation to targeted proteolysis [15,16,17]. For several RING proteins a biochemical ubiquitin ligase activity has been observed [18,19]. Thus it has been suggested that the abundance of RING proteins in Arabidopsis might reflect a bias towards target-specific proteolysis as a means of controlling gene activity. However, in most cases a RING domain by itself is not sufficient for ubiquitin ligase activity and the additional structural features required are not known. Thus, to date it remains unclear how many of the RING proteins encoded in the Arabidopsis genome could indeed be involved in protein degradation.

In this in-depth study, we evaluate and classify RING domain proteins of Arabidopsis by computational analyses as well as manual curation. We present a set of Arabidopsis RING domains, which we classify into related clusters and sort according to their potential to form a RING-type cross-brace structure on the basis of recent results from structural analyses.

Results and discussion

InterPro analysis of the Arabidopsis proteome for RING-domain proteins

In an initial characterization of the Arabidopsis proteome [1] by InterPro analysis (release 1.0) [2] a total of 365 RING domains were detected. We searched the proteome for RING domains again, using an updated version of InterPro (release 3.1). In total, our analysis retrieved 446 domains. An overall percentage of 10-15% of erroneously assigned exons has been estimated for the proteome. However, exons that contain functional domains, and which are therefore detectable and adjustable via similarity-based methods, are expected to have significantly fewer wrong assignments. We therefore have confidence in the detected number of RING-finger-containing proteins, although a deviation in the lower single-digit percentage range cannot be excluded.

We evaluated the first-pass computational analysis and manually annotated all detected domains according to a set of criteria that would qualify them as RING domains. All our results are organized on two websites that cross-reference each other [20,21]. Here we present an overview and a summary of the results; corresponding details and supplementary materials can be found on the websites.

Criteria for RING domains

Several of the detected domains did not represent a full-length RING domain. We inspected them in detail and attempted to complete them using the adjacent sequences in the respective proteins. This proved successful in a number of cases; in several others, however, it was not possible. We also noticed that in some domains the conserved spacing of the metal ligands was lacking, indicative of probable false-positive detection. In fact, in addition to programs relying on defined strings, modules not based on defined patterns (such as PfamHMM, which is based on hidden Markov models) are part of InterPro. As a result, domains not containing the defined patterns might have been detected by InterPro as RING domains. Thus, we inspected each of the 446 initially detected RING domains in detail, in order to eliminate false positives from the analysis set.

To verify the RING domains, we defined a set of criteria based on well characterized examples of RING-domain proteins. First, we required domains to contain at least seven of the full complement of eight metal ligands. Second, the metal-ligand residues had to correspond to the RING pattern, either to the prototypical RING pattern (metal ligand 4 is histidine, all others are cysteine, 'RING-HC', see below) or the frequent RING-H2 pattern (metal ligands 4 and 5 histidine, all others cysteine). Third, the spacing of the core residues (metal ligands 3 to 6) had to be conserved. These criteria leave room for subgroups of RING-finger structures in which the spacing between positions 7 and 8 is different from the generally conserved two residues [22], and for cases in which a metal ligand is missing, mostly at the seventh or eighth position. Finally, we also allowed additional deviations from the canonic criteria, which ensure that known variant-type RING domains [23] are included in our set. These modifications are metal ligand substitutions that are observed in a few well characterized RING domains, such as those in MDM2, mouse c-Cbl, Rbx1 and CART1. They include threonine for cysteine substitutions at metal ligand positions 1 or 3 and aspartate for cysteine substitution at metal ligand 8.
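The ligand criteria above can be sketched as a small check. This is a minimal illustration rather than the procedure used in the study: the eight-character ligand string, the '-' symbol for a missing ligand and the function name classify_ligands are assumptions made here. The spacing criterion for metal ligands 3 to 6 is deliberately left out, and the sketch accepts a missing ligand at any position and any number of the listed substitutions, which the text does not specify.

```python
# Canonical ligand residues at metal-ligand positions 1-8.
RING_HC = "CCCHCCCC"  # histidine at position 4
RING_H2 = "CCCHHCCC"  # histidine at positions 4 and 5

# Substitutions named in the text: Thr for Cys at positions 1 or 3,
# Asp for Cys at position 8.
ALLOWED_SUBSTITUTIONS = {1: "T", 3: "T", 8: "D"}


def classify_ligands(ligands: str) -> str:
    """Classify a domain from its eight metal-ligand residues.

    `ligands` holds one character per metal-ligand position (1-8);
    '-' marks a missing ligand (hypothetical encoding).
    """
    if len(ligands) != 8:
        raise ValueError("expected one character per metal-ligand position")
    if ligands == RING_HC:
        return "RING-HC"
    if ligands == RING_H2:
        return "RING-H2"
    # At least seven of the eight ligands must be present.
    if ligands.count("-") > 1:
        return "rejected"
    # A variant must agree with one canonical arrangement everywhere except
    # for a missing ligand or one of the allowed substitutions.
    for canonical in (RING_HC, RING_H2):
        if all(
            observed == expected
            or observed == "-"
            or observed in ALLOWED_SUBSTITUTIONS.get(position, "")
            for position, (observed, expected) in enumerate(
                zip(ligands, canonical), start=1
            )
        ):
            return "RING-variant"
    return "rejected"
```

For example, classify_ligands("TCCHCCCC") reports a variant because threonine is tolerated at position 1, whereas a histidine at position 2 is rejected because no substitution is allowed there.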

We did not allow for similar substitutions at other positions, as the ligands might not be interchangeable in every position [14]. However, we carried out additional computational analyses with more relaxed patterns. For instance, if metal ligand positions 1, 3, 5 and 8 are simultaneously allowed to be encoded by either cysteine, histidine, threonine or aspartate (see above), a total of 81 additional domains in 79 proteins are found. Although these domains are not detectable by InterPro analysis, they might have the potential to encode novel RING-domain variants. However, several LIM domains are included in this set. Moreover, the corresponding RING arrangement was unclear in most other cases and might represent distantly related motifs, such as the U-box [24]. Thus we did not include these additional domains in our proper set.

Other domains in RING proteins

A complete InterPro analysis on the protein set retrieved with the RING motif revealed other domains present in these proteins. The RING domain is closely related to the LIM and PHD domains, and 141 of the RING-domain proteins are also reported to carry a PHD domain. We inspected these predicted PHD domains and the vast majority of the respective metal ligands overlap with the detected RING domains. However, in most cases the RING-domain signature prevails, as some highly conserved residues characteristic of PHD domains are mostly missing. In fact, only in three cases did we clearly favor a PHD-domain architecture, and these domains were eliminated from our set. This means that the Arabidopsis proteome possibly contains 138 fewer PHD domains than detected. Thus, because of the overlap in the RING- and PHD-domain signatures, the frequency of the PHD domain has initially been significantly overestimated.

After the PHD domain, bipartite nuclear-localization signals were the second most frequently detected domain (63 times) in our RING-domain protein set. Most other domains are much less frequent however (< 10 times), and some combinations are obviously absent. For instance, the RING/B-box/coiled-coil protein family found in several eukaryotes seems to be absent from Arabidopsis.

Elimination of false positives

On the basis of our criteria above, the metal-ligand arrangement of every one of the initial 446 domains was reinspected. We noted that some proteins with a generally high content of cysteine and histidine residues represent false positives. Indeed, that is the case for a group of five cellulose synthases, which contain several zinc-finger domains. We eliminated these from our set.

We also eliminated 41 additional domains in which at least two metal ligands or possible substituting residues are missing, and 10 domains in which the spacing of the core residues (metal ligands 3 to 6) did not satisfy the criteria. Thus, in total, 59 probable false-positive domains have been eliminated from our initial set (Table 1).

Table 1 Proteins with false-positive detected RING domains that were eliminated from the dataset

Classification of RING domains

We classified the remaining set of 387 domains according to metal-ligand arrangement. The originally described RING domains were characterized by a histidine at metal-ligand position 4. We have termed the domains in our set with a corresponding arrangement RING-HC domains. We found 118 domains of this type in 111 different proteins. However, the cysteine usually present at metal-ligand position 5 is frequently substituted by a histidine as well, and we identify these domains as RING-H2 domains. Of this type of domain, 215 were found in 214 proteins. The remaining 54 domains, in which not all of the metal ligands were either cysteine or histidine, but where one metal ligand is missing or substituted according to our criteria described above, were classified as RING-variant domains [23].
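The bookkeeping behind these numbers can be checked with a few lines of arithmetic. The breakdown of the 59 eliminated domains is a reading of the text, not a stated fact: it assumes that the three PHD-type domains are counted among the 59 and that the five cellulose synthases contribute one domain each.

```python
# Counts taken from the text; the split of the 59 eliminated domains is an
# assumption (3 PHD-type + 5 cellulose synthase + 41 + 10).
initial_domains = 446
eliminated = {
    "PHD-type architecture": 3,
    "cellulose synthases": 5,  # assumed: one domain per protein
    "two or more ligands missing": 41,
    "core spacing not conserved": 10,
}
classes = {"RING-HC": 118, "RING-H2": 215, "RING-variant": 54}

total_eliminated = sum(eliminated.values())
remaining = initial_domains - total_eliminated

# The three classes should account for every remaining domain.
assert total_eliminated == 59
assert remaining == sum(classes.values()) == 387

for name, count in classes.items():
    print(f"{name}: {count} domains ({100 * count / remaining:.1f}% of {remaining})")
```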

Derivation of additional patterns for computational analysis

An inherent problem with the computational detection of RING-finger domains is their relatedness to the PHD domain (see above). This ambiguity seems to be due to a lack of structural determinants that separate a given domain in one group or the other. Recently, the first solution structure for a PHD domain has been obtained [14]. Its comparison with related structures revealed some key features that separate LIM, PHD and RING domains.

The LIM domain is clearly set apart from the two others, with a more conserved spacing and conserved hydrophobic residues not found in RING or PHD domains. Among the conserved hydrophobic residues, one is located in front of metal ligand 3 and one after metal ligand 4, and these features result in a zinc ligation by consecutive metal ligand pairs. By contrast, hydrophobic residues in front of metal ligand 5 and after metal ligand 6 seem to result in a cross-brace arrangement, which is observed in RING and PHD domains. Despite this commonality, two features separate RING and PHD domains. First, the loop between metal ligands 4 and 5 can be up to five residues in PHD domains, rather than only up to three in RING domains. Second, in PHD domains the residue two positions in front of metal ligand 7 is an aromatic residue, which alters the hydrophobic core of the domain and thus its structural characteristics.

On the basis of the above data, we defined four patterns that specify a RING domain with increasing stringency (Table 2). In our first pattern (Stringent 1) we required that all metal ligands are present, according to our criteria outlined above, and that the position two residues in front of metal ligand 7 is not an aromatic amino acid. These criteria were satisfied by 324 domains, whereas it was not the case for 63 domains. Failure to match the criteria is mainly the result of a missing or substituted metal ligand; that is, these domains are classified as RING variants. However, among those 63 domains, 13 carry an aromatic residue two positions in front of metal ligand 7 and we thus consider them unlikely to form a RING domain. Rather, they are structurally more similar to PHD domains. Therefore, of the 141 PHD domains predicted in our initial RING-domain set (see above) as few as 16 might indeed represent a PHD structure.

Table 2 Conventional and novel motif signatures used to identify RING domains in this study

In our second pattern (Stringent 2) we added another criterion; we required the positions in front of metal ligand 5 and after metal ligand 6 to be hydrophobic or serine or threonine, which have been observed in some well characterized RING proteins [14]. Hydrophobic residues at these positions are critical for cross-brace structure formation. Thirty more domains failed this test. However, the remaining 294 domains can be considered fairly certain to form a RING structure.

Next (Stringent 3), we abolished the acceptance of threonine or serine after metal ligand 6, as these residues are rarely found in this position in well characterized RING proteins. Ten more domains did not comply with this requirement, leaving 284 domains.

Finally, in our most stringent pattern (Stringent 4) we not only excluded aromatic residues from position 2 in front of metal ligand 7, but also other hydrophobic residues. An additional 55 domains from our set did not match this criterion. However, the 229 domains fulfilling these criteria can be considered to form a RING structure with near certainty.
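The four nested patterns can be read as a cascade of residue checks. The sketch below is one possible encoding under stated assumptions: the DomainFeatures record, its field names and the residue sets (F, W and Y as aromatic; A, I, L, M and V as the other hydrophobic residues) are illustrative choices made here, since the exact definitions are given in Table 2 and not in the text.

```python
from dataclasses import dataclass

# Assumed residue classes; the definitive sets are those of Table 2.
AROMATIC = set("FWY")
HYDROPHOBIC = set("AILMV") | AROMATIC
SER_THR = set("ST")


@dataclass
class DomainFeatures:
    """Hypothetical summary of the positions the patterns inspect."""
    ligands_complete: bool  # all eight metal ligands present
    before_ml5: str         # residue directly in front of metal ligand 5
    after_ml6: str          # residue directly after metal ligand 6
    two_before_ml7: str     # residue two positions in front of metal ligand 7


def stringency_level(d: DomainFeatures) -> int:
    """Return the highest pattern (1-4) the domain satisfies, or 0 for none."""
    # Stringent 1: complete ligands, no aromatic residue before ligand 7.
    if not d.ligands_complete or d.two_before_ml7 in AROMATIC:
        return 0
    # Stringent 2: hydrophobic, Ser or Thr before ligand 5 and after ligand 6.
    before_ok = d.before_ml5 in HYDROPHOBIC | SER_THR
    after_ok = d.after_ml6 in HYDROPHOBIC | SER_THR
    if not (before_ok and after_ok):
        return 1
    # Stringent 3: Ser and Thr no longer accepted after ligand 6.
    if d.after_ml6 in SER_THR:
        return 2
    # Stringent 4: no hydrophobic residue at all two positions before ligand 7.
    if d.two_before_ml7 in HYDROPHOBIC:
        return 3
    return 4
```

Because each pattern adds a condition to the previous one, a single returned level is enough to say which of the four patterns a domain matches.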

Clustering of related RING domains

We sought to define groups of related RING-domain proteins beyond their classification by metal-ligand arrangement. However, it turned out that, at least in part, RING-domain proteins strongly deviate from each other outside the conserved domain. Thus it is not feasible to relate them using conventional phylogenetic methods. To circumvent this problem, the isolated RING-finger domains were used instead for further analysis. However, bootstrap values were again too low to reliably relate RING domains using phylogenetic methods. Therefore, a single-linkage-clustering method was applied to obtain clusters of related RING domains. We sorted our set on the basis of similarity restricted to the RING domains and excluding non-conserved amino- and carboxy-terminal parts of the respective proteins. A BLAST analysis [25] using a cut-off value of 10^-15 was chosen to define meaningful similarities (Table 3). This analysis resulted in the definition of 54 clusters of RING domains (that is, two or more similar domains), in which 295 domains are grouped. Notably, with only one exception (cluster 2.8), all the RING-domain clusters only contain members from the same respective class of metal-ligand arrangement; that is, prototypical RING-HC domains are only found in clusters with other RING-HC domains. This finding confirms both the significance of our clustering and of the group definition described above. Of the 54 clusters, 28 consist of a pair of domains, whereas clusters with multiple domains contain up to 75 different domains. However, with the exception of the large clusters 2.1 (75 domains) and 2.2 (26 domains) found in the prototypical RING-H2 class, most clusters contain fewer than ten domains.

Table 3 List of Arabidopsis RING domains

Redundancy of clustered proteins

To investigate overall protein similarities and reveal potential functional redundancy between proteins of a given cluster, we decided to produce alignments of the full-length proteins by ClustalW analysis [26]. Links to these alignments are provided with each cluster on our web page. Overall similarities vary between clusters. For instance, the 75 domains of cluster 2.1 contain several members of a RING-H2 family that has been described [27,28]. The proteins in this cluster are generally short, with only very little additional sequence outside the RING domain, and are highly similar to each other. Thus, any functional overlaps could already be contained in the RING domain itself. Other clusters, for example 2.6 and 2.8, consist of genes that are derived from tandem duplication.

In numerous other cases, however, the full-length alignments reveal additional similarities outside the RING domain. For instance, cluster 1.1 includes the RMA1 protein, which has been shown to be a membrane-bound ubiquitin ligase [29]. The proteins in this cluster show some sequence similarity besides the RING domain and share additional features, such as a transmembrane anchor. Thus, it seems likely that these other proteins might also be membrane-bound ubiquitin ligases. Another interesting group is cluster 2.2, which comprises 26 domains. The corresponding proteins also contain significant similarity in a stretch amino-terminal to the RING domain. Moreover, subgroups of proteins within the cluster even display additional similarity in the more distal amino-terminal regions. Two of the proteins in this cluster, AIP2 and CIP8, have been described [30,31]. For CIP8, a ubiquitin ligase activity has been demonstrated [32] and the same might be true for a subgroup of the cluster, which has a high structural similarity with CIP8 (C.H., unpublished data). Thus, this cluster might represent proteins that are functionally redundant to some extent. Notably, the similarity among members of cluster 2.2 also extends to their genomic organization: 22 of the 26 proteins are encoded by a single exon, underpinning their close relatedness and probable common evolutionary origin.


Gene redundancy in Arabidopsis has previously been shown to limit the number of mutants detectable by phenotype [33]. The completed genome sequence shows that a high degree of redundancy might indeed obscure the quest for many phenotypes. Accordingly, we suggest that there probably exists a high degree of functional redundancy among Arabidopsis RING-domain proteins. This would also correlate with the fact that surprisingly few genes in the complete set are characterized as mutants. To our knowledge, this is the case for only two of them, COP1 and PRT1 [34,35]. Notably, for both these proteins, a functional requirement for the RING domain has been demonstrated, and both are unique with respect to their RING domains.

In this study, we present an ordered set of manually curated RING domains of Arabidopsis. In summary, our set includes all bona fide RING domains, as well as common RING-variant domains. Notably, additional Arabidopsis proteins might have potential to form variant RING-finger domains, as has been suggested, for instance, for the HOS1 protein [36]. However, their primary sequences do not support this notion unambiguously and we chose not to include any RING-domain variants in our analysis for which no structural experimental evidence is yet available. Clearly, our findings show that predictions of cysteine-rich domains have to be met with skepticism. On a proteomics level, they can be misleading in drawing general conclusions, as is amply demonstrated by the overestimation of the abundance of the PHD domain owing to its overlapping classification with RING domains. Additional structural data are needed and have to be taken into account in computational analyses to resolve these issues. Our curated set of RING domains in Arabidopsis will serve as a vital starting point for further genome analysis in this field.

Materials and methods

The non-redundant Arabidopsis genome protein set available at MIPS [37] was screened for proteins containing RING-finger domains. Detailed results of this analysis are available on the web [20,21]. Analysis was undertaken using several discrete steps described in detail below.

Whole-genome analysis for proteins containing RING-finger motifs

For initial analysis the InterPro system [2] (iprscan version 3.2) was used to calculate protein domains for all Arabidopsis proteins. The results were filtered for RING-finger domains matching the InterPro domains PF00097, PS00518 or PS50089 (corresponding to domain names ZF-C3HC4, ZINC_FINGER_C3HC4 and ZF_RING, respectively). Arabidopsis proteins containing one or more RING-finger domains were analyzed further.

Overlapping domains were detected frequently, for example when several patterns matched the same region of a protein. Such overlapping domains were unified, and only the domain with the most amino-terminal starting point was used for further analysis.
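The unification of overlapping hits can be illustrated as follows. This is a sketch under assumptions, not the original implementation: the (start, end, signature) tuple layout and the function name unify_overlapping_hits are hypothetical, and hits are treated as one group whenever they overlap directly or through a chain of other hits.

```python
from typing import List, Tuple

# (start, end, signature) with 1-based inclusive coordinates; hypothetical layout.
Hit = Tuple[int, int, str]


def unify_overlapping_hits(hits: List[Hit]) -> List[Hit]:
    """Return one representative hit per group of overlapping hits.

    The representative is the hit with the most amino-terminal start,
    as described in the text.
    """
    representatives: List[Hit] = []
    group_end = 0  # right-most residue covered by the current group
    for hit in sorted(hits, key=lambda h: h[0]):
        start, end, _ = hit
        if representatives and start <= group_end:
            # Overlaps the current group: keep the earlier representative,
            # but remember how far the group now extends.
            group_end = max(group_end, end)
        else:
            # No overlap: this hit opens a new group.
            representatives.append(hit)
            group_end = end
    return representatives
```

With the hits (50, 90, "PS50089"), (45, 88, "PF00097") and (200, 240, "PS00518") in one protein, the first two overlap and are represented by the hit starting at residue 45, while the third is kept as a separate domain.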

Prediction of additional domains

Proteins containing a RING-finger domain were screened for the presence of additional domains using the InterPro package (see above).

Classification of RING-finger domains

The RING-finger domain comprises different subtypes, namely the C3HC4 type and the C3H2C3 type. We refer to these types as RING-HC and RING-H2, respectively. To differentiate between these two subtypes an additional fine analysis was carried out: RING-finger-containing genes were classified as C3HC4-type (RING-HC) for the patterns C-x-H-x-[LIVMFY]-C-x(2)-C-[LIVMYA] or C-x(2)-C-x(9-39)-C-x(1-3)-H-x(2-3)-C-x(2)-C-x(4-48)-C-x(2)-C and as C3H2C3-type (RING-H2) for the patterns C-x-H-x-[LIVMFY]-H-x(2)-C-[LIVMYA] or C-x(2)-C-x(9-39)-C-x(1-3)-H-x(2-3)-H-x(2)-C-x(4-48)-C-x(2)-C. RING-finger domains detected by InterPro that did not match these patterns were marked as 'others/unclear type'. Novel patterns for the evaluation of RING domains were defined as described in the text and Table 2.
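The four patterns above are written in a PROSITE-like syntax and can be translated mechanically into regular expressions. The following sketch shows one way to do this with Python's standard re module; the helper names prosite_to_regex and classify_ring_type are hypothetical, the converter handles only the syntax elements that occur in these four patterns, and testing RING-HC before RING-H2 is an arbitrary choice made here.

```python
import re

# One pattern element: 'x', a residue letter or a [..] residue set,
# optionally followed by (n) or (n-m).
TOKEN = re.compile(r"(x|[A-Z]|\[[A-Z]+\])(?:\((\d+)(?:-(\d+))?\))?")

HC_PATTERNS = [
    "C-x-H-x-[LIVMFY]-C-x(2)-C-[LIVMYA]",
    "C-x(2)-C-x(9-39)-C-x(1-3)-H-x(2-3)-C-x(2)-C-x(4-48)-C-x(2)-C",
]
H2_PATTERNS = [
    "C-x-H-x-[LIVMFY]-H-x(2)-C-[LIVMYA]",
    "C-x(2)-C-x(9-39)-C-x(1-3)-H-x(2-3)-H-x(2)-C-x(4-48)-C-x(2)-C",
]


def prosite_to_regex(pattern: str) -> str:
    """Translate e.g. 'C-x(2)-C' into the regular expression 'C.{2}C'."""
    pieces = []
    for residue, low, high in TOKEN.findall(pattern):
        piece = "." if residue == "x" else residue
        if low and high:
            piece += "{%s,%s}" % (low, high)  # x(9-39) -> .{9,39}
        elif low:
            piece += "{%s}" % low             # x(2) -> .{2}
        pieces.append(piece)
    return "".join(pieces)


HC_REGEXES = [re.compile(prosite_to_regex(p)) for p in HC_PATTERNS]
H2_REGEXES = [re.compile(prosite_to_regex(p)) for p in H2_PATTERNS]


def classify_ring_type(sequence: str) -> str:
    """Assign a sequence to RING-HC, RING-H2 or 'others/unclear type'."""
    if any(rx.search(sequence) for rx in HC_REGEXES):
        return "RING-HC"
    if any(rx.search(sequence) for rx in H2_REGEXES):
        return "RING-H2"
    return "others/unclear type"
```

The patterns are searched anywhere in the sequence, so the function can be applied either to an isolated domain or to a full-length protein.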

Clustering of RING-finger domains

The isolated RING domains were related using BLASTP [25] (version 2.1.2) by testing the isolated RING-finger domains against a database containing all RING-finger domains assembled during the previous analysis steps. Domains related by a hit below a threshold of 10^-15 were united into clusters of related domains. This procedure is 'greedy'; for example, although domain A relates to domain B and B relates to C, A and C are not necessarily closely related enough to pass the given threshold themselves. Nevertheless, this procedure in general succeeded in grouping and/or separating individual subfamilies.
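Single-linkage clustering of this kind can be sketched with a union-find structure. The example below is an illustration under assumptions and not the code used in the study: it takes the cut-off to be a BLAST E-value, expects the pairwise hits as (query, subject, E-value) tuples that have already been parsed from the BLASTP output, and uses the hypothetical function name single_linkage_clusters.

```python
from typing import Dict, Iterable, List, Tuple


def single_linkage_clusters(
    domains: Iterable[str],
    hits: Iterable[Tuple[str, str, float]],
    threshold: float = 1e-15,
) -> List[List[str]]:
    """Group domains that are linked by any chain of hits below `threshold`.

    Every domain named in `hits` must also appear in `domains`.
    """
    domains = list(domains)
    parent: Dict[str, str] = {d: d for d in domains}

    def find(x: str) -> str:
        # Follow parent links to the cluster representative, shortening
        # the path on the way (path halving).
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for query, subject, evalue in hits:
        if query != subject and evalue < threshold:
            # One sufficiently strong hit is enough to merge two clusters.
            parent[find(query)] = find(subject)

    groups: Dict[str, List[str]] = {}
    for d in domains:
        groups.setdefault(find(d), []).append(d)
    # A cluster is defined as two or more similar domains.
    return [sorted(members) for members in groups.values() if len(members) >= 2]
```

In the situation described above, hits A-B and B-C below the threshold place A, B and C in one cluster even if the direct A-C hit is weak, which is the transitive behaviour that the text calls 'greedy'.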

Multiple alignments

For RING-domain clusters with two or more members, multiple alignments of the respective complete protein sequences were done using the ClustalW program [26,38] with default parameter settings.

Manual expert curation

The individual RING-finger domains and clusters underwent manual inspection. Manual adjustments to clusters and rejections of individual domains and clusters on the basis of expert knowledge were carried out as explained in Results and discussion.

Additional data files

The curated set of clustered Arabidopsis RING domains, with their sequences and metal ligands, is provided in a supplemental table. Links to the individual genes and ClustalW analyses are included.