Ubiquitin, a small protein of 76 amino acids, is highly conserved in all eukaryotes. In a multi-step process, ubiquitin is covalently linked to lysine residues of substrate proteins. If a single molecule of ubiquitin is linked to a protein, this is referred to as mono-ubiquitination, a process that is of particular importance for protein trafficking but has also been shown to regulate retrovirus budding and to modulate protein function directly [1]. A lysine residue of a ubiquitin molecule attached to a substrate can itself serve as an acceptor for an additional ubiquitin molecule, and this process can be repeated so that poly-ubiquitinated proteins form. Poly-ubiquitin chains serve as recognition signals for the 26S proteasome, the major regulator of protein abundance in cells, and poly-ubiquitination thus often initiates proteolysis of the substrate. But poly-ubiquitination can also regulate protein function directly without affecting stability, in ways similar to mono-ubiquitination and other post-translational modifications. The mechanisms underlying proteolysis-independent regulation by poly-ubiquitination are only poorly understood but might function by changing conformation or adding or obscuring a binding site (Figure 1; for reviews see [1-3]).

Figure 1

The ubiquitin proteasome system. (a) Ubiquitin is activated by a ubiquitin-activating enzyme (E1) and transferred onto substrate proteins by ubiquitin-conjugating enzymes (E2) and ubiquitin ligases (E3), resulting in (b) either attachment of a single ubiquitin molecule (mono-ubiquitination), attachment of multiple ubiquitin units to several substrate lysine residues on the same protein (multi-ubiquitination) or synthesis of ubiquitin chains (poly-ubiquitination). (c) Many poly-ubiquitinated proteins are subsequently degraded by the 26S proteasome, which consists of the catalytic 20S complex and the regulatory 19S particles. Degradation substrates are either delivered to the proteasome by soluble ubiquitin receptors or recognized by the intrinsic ubiquitin-binding activity of the 19S particle. At the 19S proteasome the ubiquitin chain is disassembled, and the substrate is unfolded before it can enter the cavity of the 20S subunit where proteolysis takes place. Finally, proteolytic fragments exit the proteasome in a poorly understood way. (d) Ubiquitination can also directly regulate protein function in a proteolysis-independent manner, via mono-, multi- or poly-ubiquitinated proteins.

The transfer of ubiquitin is a multi-step process that involves at least three classes of enzymes: ubiquitin-activating enzymes, generally called E1 enzymes; ubiquitin-conjugating enzymes or E2s; and ubiquitin ligases, E3s (Figure 1). E3 ubiquitin ligases are of particular importance because they confer substrate specificity to the system by interacting directly with substrate proteins and thereby directing the transfer of ubiquitin. The human genome encodes an estimated 500-600 ubiquitin ligases, a number comparable to the 518 predicted kinases [4, 5]. If you consider that each ubiquitin ligase is active on several substrates, you can get some impression of the complexity and importance of the ubiquitin system.

Ubiquitination is a highly dynamic process and is balanced by deconjugation of ubiquitin by deubiquitinating enzymes (DUBs). The more than 70 DUBs that are estimated to be encoded in the human genome are responsible for the reversible nature of ubiquitin modifications and have important roles in recycling ubiquitin from proteasome substrates, in stabilizing proteins by counteracting their poly-ubiquitination, and in opposing the proteolysis-independent regulatory roles of ubiquitin modifications (for reviews see [6, 7]). DUBs together with E1, E2 and E3 enzymes and the proteasome make up the ubiquitin-proteasome system.

The large number of proteins that constitute the ubiquitin-proteasome system and the enormous number of ubiquitination substrates mean that global approaches are required if we are to understand fully the role of ubiquitination in cell biology, development, and disease. Large-scale studies of the entire system are still in their early stages, but they have already made important contributions to the field. Here, we review the approaches taken in some of these studies and their findings.

Proteomic approaches to characterizing the ubiquitin-proteasome system

Multi-protein complexes and protein-protein interactions have important roles in the ubiquitin-proteasome system. Both the 26S proteasome (see Figure 1c) and E3 ubiquitin ligases have been studied extensively using protein-complex purification coupled with mass-spectrometric protein identification [8-11]. Studies of the subunit composition of proteasomes from various organisms have revealed that the 26S proteasome complex consists of the 20S complex (made up of seven α and seven β subunits) and the 19S complex (made up of six ATPase and twelve non-ATPase subunits) [12]. The 20S complex is well characterized as forming the catalytic core; the 19S regulatory complex is believed to be responsible for substrate recognition and unfolding (Figure 1c), but the specific functions of most of the 19S subunit components are still not well understood. There is accumulating evidence for a non-proteolytic role for the proteasome in processes such as transcription, chromatin packaging, and DNA repair [13, 14]. Additional subunits found in the majority of proteasome complexes have been identified following the development of new protein-purification and protein-identification techniques [8, 15]. Recently, hybrid 26S proteasome complexes have been characterized, in which one copy of the 19S is present at one end of the 20S core and the other end is capped by Blm10, a newly characterized HEAT repeat protein in yeast. Blm10 and its mammalian ortholog PA200 function as 20S activators [16, 17]. These complexes seem to have labile structures that cannot be preserved during the purification steps. Although the hybrid complexes reconstituted in vitro have higher peptidase activity than the classic versions, their roles in the degradation of ubiquitinated substrates in vivo are unclear [16, 17]. Given the heterogeneous population and functional diversity of the various proteasome complexes, it remains a challenging task to purify and identify the subpopulations of proteasome complexes and to correlate the differences in their composition with their distinct functions in vivo.

A diverse group of proteasome-interacting proteins, including ubiquitin ligases, DUBs, heat-shock proteins and many other proteins, has been identified by affinity purification and mass spectrometry as well as from genome-wide two-hybrid screens for protein-protein interactions [8, 15, 18-21]. Many of the identified interactions seem to be labile under conditions of active ATP hydrolysis by the 19S regulatory complex because addition of ATP coincides with the release of the interacting proteins [8]. This modulation has been suggested to be a part of the mechanism of protein degradation [22]. New methodologies are needed if we are to identify and characterize proteasome-interacting proteins fully, to understand how the different interacting proteins influence the catalytic cycle, and to clarify how they link the ubiquitin-proteasome system to other biological processes.

Mass-spectrometric approaches have also contributed much to our current understanding of the complex composition of E3 ubiquitin ligases. Two types of ubiquitin ligase that play important roles in cell-cycle regulation have been extensively investigated: SCFs and the anaphase-promoting complex/cyclosome (APC/C). SCF ubiquitin ligases are named after three of their four subunits - Skp1, Cdc53 (also known as Cul1) and one member of the F-box protein family - and they also include the Ring-H2 protein Hrt1 (also known as Roc1 or Rbx1) [23]. The substrate specificity of SCFs depends on the different F-box proteins that are tethered to the Cdc53-Hrt1 ubiquitin-ligase module by Skp1. Using sequential rounds of epitope tagging, affinity purification and mass spectrometry (a procedure called SEAM), Skp1 was found to form a variety of complexes, including some that are most likely to have functions other than ubiquitination [9, 24, 25].

In comparison, proteomic approaches have shown that the APC/C, which regulates mitosis, has a more complex structure than SCFs with at least 13 components [26]. Despite some success in identifying the subunits and also the modifications of APC/C, the molecular functions of the individual subunits are largely unknown. Exceptions are the RING-finger subunit Apc11 and the cullin-like subunit Apc2, which are believed to have a direct role in ubiquitin transfer [27].

Perhaps because of the large number of ubiquitin ligases present in the genome, ubiquitin-ligase proteomics is still in its infancy. Affinity purification coupled with mass spectrometry has promised great advances in the study of the composition of protein complexes and the identification of their interacting partners. Further advances in proteomic research are expected to provide more information on the protein complexes involved in the ubiquitin-proteasome system, including their post-translational modifications, the stoichiometry of their subunits and how they are assembled.

Identification of ubiquitination substrates in vitro

The large number of putative E3 ubiquitin ligases makes systematic characterization of their substrates a formidable task, but it is one that will be important if we are to gain a global view of the dynamics of the ubiquitin system. E3-substrate interactions are generally only transient: after the transfer of ubiquitin, substrates are usually released from the E3 ligase, degraded by the proteasome, or both. This makes detection of E3-ligase-substrate interactions difficult. Two-hybrid assays have successfully identified some substrates of ubiquitin ligases [28], but identification of proteins that can interact with E3 ligases does not necessarily pinpoint substrates. A more effective strategy is to identify E3 substrates by their ubiquitination or degradation by the 26S proteasome.

One of the first effective large-scale attempts used Xenopus oocyte extracts to identify substrates of the APC/C, a ubiquitin ligase that regulates mitosis [29-32]. The approach exploited the unique regulation of APC/C activity: it is inactive during interphase but active during mitosis. When added to mitotic extracts, APC/C substrates are ubiquitinated and rapidly degraded by the proteasome, but the same substrates are unchanged in interphase lysates. In a large-scale approach Xenopus cDNA clones that had been in vitro-translated and labeled were divided into small pools and incubated with interphase and mitotic oocyte extracts. Proteins that disappeared specifically from mitotic extracts were isolated, and this led to the identification of several important APC/C substrates, including cyclin B, the DNA-replication inhibitor geminin, and the anaphase inhibitor securin [29-32].

A different in vitro approach was applied to identifying the potential substrates of the ubiquitin ligase that is formed by a heterodimer of BRCA1 and BARD1 [33, 34]; the BRCA1/BARD1 heterodimer functions as a tumor suppressor that is important for protection from breast and ovarian cancer, and its ubiquitin-ligase activity has been linked to its protective function [35]. Sato and colleagues [33] immunoprecipitated BRCA1/BARD1 complexes and added them to an in vitro ubiquitination reaction that used ubiquitin tagged with the FLAG epitope. The rationale behind the approach was that substrates of ubiquitination by BRCA1 should be bound to the immunoprecipitated BRCA1 and subsequently ubiquitinated with FLAG-ubiquitin in vitro. Proteins conjugated to FLAG-ubiquitin were purified and identified by mass spectrometry [33]. A more directed strategy restricted the hunt for BRCA1-BARD1 substrates to components of the centrosome [34], because BRCA1 has been implicated in the regulation of centrosome duplication [36]. Starita and colleagues [34] incubated mammalian centrosome-containing cell fractions with recombinant BRCA1-BARD1 ligase complexes and biotinylated ubiquitin. Ubiquitinated proteins were detected through the biotin tag on ubiquitin and subsequently identified by mass spectrometry [34]. Both of these strategies [33, 34] identified promising candidate BRCA1-BARD1 substrates - nucleophosmin/B23 [33] and γ-tubulin [34] - that might be connected to the tumor-suppressor function of BRCA1. A high-throughput strategy has also been used to identify substrates of the yeast ubiquitin ligase Rsp5 in vitro. A luminescent assay involving biotinylated ubiquitin was used to screen several hundred purified yeast proteins for Rsp5-dependent ubiquitination in vitro. Previously known, as well as new, candidate substrates of Rsp5 were identified [37].

A more general in vitro approach [38] used total HeLa cell lysates for large-scale identification of ubiquitinated proteins. Cell lysates were incubated with ubiquitin tagged with six histidines (6×His-ubiquitin) and an ATP-regenerating system to sustain ubiquitination in vitro. The 6×His-ubiquitin was covalently attached to proteins by the E1, E2, and E3 enzymes present in the cell lysates, and this allowed purification of ubiquitinated proteins on the basis of the affinity of 6×His-ubiquitin to Ni2+ ions (Ni-chelate chromatography) [38]. Over 100 ubiquitin-linked proteins were identified by mass spectrometry, of which a relatively high proportion was already implicated in the ubiquitin proteasome pathway, such as E2 and E3 enzymes and proteasome subunits. Because relatively mild purification conditions were chosen, both ubiquitinated proteins and proteins associated with them were identified. This is illustrated by the identification of 16 out of 18 subunits of the 19S proteasome; covalent modification of 19S proteasome subunits with ubiquitin has so far not been reported, but an intrinsic affinity of the 19S proteasome for poly-ubiquitin chains is well known [39] and is most likely to be responsible for the identification of these proteins in the study [38]. Bona fide ubiquitinated proteins can be distinguished from associated, copurifying proteins by fractionation strategies that use highly denaturing conditions and break non-covalent interactions. Such stringent purification conditions have been widely used to demonstrate covalent attachment of ubiquitin to specific proteins [40], as well as in proteome-wide approaches to identifying ubiquitinated proteins, as discussed below.

Ubiquitination substrates in vivo

Identification of all ubiquitinated proteins in a cell under a given growth condition or developmental state is an ambitious aim, but it no longer seems impossible given the tremendous pace at which mass-spectrometry-based proteomics is developing (reviewed in [41]). Ubiquitin profiling was pioneered by Peng and colleagues [42] and usually involves expression of 6×His-tagged ubiquitin in cells (Figure 2). The cellular ubiquitin system conjugates 6×His-ubiquitin to target proteins and allows their purification by Ni-chelate chromatography. Because Ni-chelate purification is compatible with fully denaturing conditions, proteins that are associated with ubiquitinated proteins but are not ubiquitination substrates themselves can efficiently be removed. The purified ubiquitinated proteins are fragmented by trypsin (or similar proteases) to generate peptides, which can be used for mass-spectrometric identification of the proteins present in the purified fraction. More than 1,000 candidate ubiquitination substrates were identified using this method in the relatively simple eukaryote Saccharomyces cerevisiae [42], whose genome encodes roughly 5,800 proteins. Surprisingly, most of the well-studied (and less abundant) ubiquitinated proteins were absent from the list, suggesting that many more yeast proteins than the identified 1,000 candidates are ubiquitination substrates.

Figure 2

Global strategies that use mass spectrometry (MS) to study ubiquitination. (a) Diagram of the lysine residues in ubiquitin; the carboxy-terminal Arg-Gly-Gly (RGG) motif is also indicated. (b) In ubiquitin profiling, 6×His-tagged ubiquitin expressed in cells is conjugated to substrate proteins, and this facilitates purification of ubiquitinated proteins under denaturing conditions by Ni-chelate chromatography, in which histidine-tagged proteins bind specifically to immobilized Ni2+ ions. Purified ubiquitinated proteins are digested with trypsin and the resulting peptides are analyzed by mass spectrometry to identify the proteins present in the sample. (c) Precise ubiquitination sites can be determined by mass spectrometry because of a characteristic mass shift caused by diglycine that is retained on ubiquitinated lysine residues within peptides after trypsin digestion. (d) A similar strategy allows differentiation between the various types of ubiquitin chain linkage that can lead to diverse ubiquitin-chain topologies. Depending on the lysine residue in ubiquitin that was used for the ubiquitin-ubiquitin linkage, different linkage-specific signature peptides with characteristic masses are produced by trypsin digestion. These signature peptides can be detected and distinguished by mass spectrometry.

At first glance, it seems that a surprisingly large fraction of the proteome is ubiquitinated. But misfolded proteins, which can be generated by translation inaccuracy, folding problems or oxidative damage, are ubiquitinated and degraded as part of the protein quality-control pathway [43]. One can therefore expect that at least a small fraction of any protein will be ubiquitinated, and that sufficiently sensitive analytical methods might find that all proteins can be ubiquitination substrates. It is important to bear in mind that current ubiquitin-profiling experiments can indicate only whether any of a given protein is ubiquitinated but cannot give any estimate of what fraction of the protein is ubiquitinated. This imposes some limitations on how the results of large-scale studies can be interpreted.

To find more specific substrates of the ubiquitin-proteasome system, recent proteomic approaches have focused on specific parts of the system. Ubiquitin profiling has been used successfully to study the endoplasmic reticulum associated degradation pathway (ERAD) [44]. Membrane-enriched fractions from yeast cells expressing 6×His-ubiquitin were used as a starting material for purification of ubiquitinated proteins and their subsequent identification by mass spectrometry. More than 80 candidate ERAD substrates were identified [44].

Mayor and colleagues [45] enriched for proteasome substrates on a poly-ubiquitin-binding protein resin and followed this with denaturing Ni-chelate chromatography in order to purify ubiquitinated proteins from yeast cells expressing 6×His-ubiquitin. Remarkably, by profiling a yeast strain with a mutation in the proteasomal ubiquitin receptor Rpn10, they could identify 54 candidate ubiquitination substrates that require Rpn10 for degradation [45]. Among them were the transcription factor Gcn4 and the cell cycle regulator Sic1, two known proteasome substrates whose abundance is low. This study [45] demonstrates how subtractive ubiquitin profiling can help to define substrates of particular pathways of the ubiquitin-proteasome system. It is not hard to imagine that a similar strategy, in which cells defective in a particular E3 ligase are compared with wild-type cells, could be used for large-scale identification of the specific substrates of individual ubiquitin ligases. Furthermore, the introduction of stable-isotope-labeling strategies into mass-spectrometric proteomic analyses promises to transform ubiquitin-profiling experiments by enabling detection of quantitative changes in ubiquitin profiles [46-48].
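The subtractive logic described above can be sketched in a few lines of Python. This is a minimal illustration, not the analysis used in [45]: it assumes that candidates are simply the proteins identified in the mutant profile but not in the wild-type profile, and the function name and the protein identifiers are hypothetical placeholders.

```python
def subtractive_profile(mutant_hits, wild_type_hits):
    """Return proteins found ubiquitinated in the mutant but not in wild type."""
    return sorted(set(mutant_hits) - set(wild_type_hits))


# Toy identifications, invented purely for illustration (not real data).
wild_type = {"ProteinA", "ProteinB", "ProteinC"}
rpn10_mutant = {"ProteinA", "ProteinB", "ProteinC", "ProteinD", "ProteinE"}

candidates = subtractive_profile(rpn10_mutant, wild_type)
print(candidates)  # ['ProteinD', 'ProteinE']
```

A presence/absence comparison of this kind is sensitive to sampling depth; a quantitative comparison of abundances would be needed to detect proteins that are ubiquitinated in both strains but to different extents.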

Ubiquitination sites and ubiquitin-chain topology

The pioneering ubiquitin-profiling experiments of Peng and colleagues [42] demonstrated the feasibility of large-scale identification of ubiquitin-attachment sites in substrate proteins. This is possible because, after trypsin digestion, the two carboxy-terminal residues of ubiquitin remain attached to the lysine residue of the substrate protein (Figure 2c). These two additional glycine residues lead to a characteristic 114 Da increase in the mass of the ubiquitinated substrate peptide, which is diagnostic for the ubiquitinated residue and can be monitored by mass spectrometry [42]. Over 100 precise ubiquitin attachment sites have been identified by analyzing peptide-mass data from global ubiquitin-profiling experiments [42, 44]. Bioinformatic analyses of these data sets showed that ubiquitination sites are almost exclusively exposed on the protein surface, and located preferentially in a sequence environment that is predicted to form a loop structure [49]. No conserved ubiquitination motif could be defined, however.
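The 114 Da figure can be checked from the elemental composition of the diglycine remnant. The sketch below assumes standard monoisotopic atomic masses; the helper names and the matching tolerance are illustrative choices and are not taken from the cited studies.

```python
# Monoisotopic atomic masses in daltons (standard values).
MONOISOTOPIC = {"C": 12.0, "H": 1.00782503, "N": 14.0030740, "O": 15.9949146}

# A glycine residue inside a peptide chain has the composition C2H3NO.
GLYCINE_RESIDUE = {"C": 2, "H": 3, "N": 1, "O": 1}


def composition_mass(composition):
    """Sum the monoisotopic masses of all atoms in an elemental composition."""
    return sum(MONOISOTOPIC[element] * count for element, count in composition.items())


# The remnant left on the modified lysine is two glycine residues.
DIGLYCINE_SHIFT = 2 * composition_mass(GLYCINE_RESIDUE)


def has_diglycine_shift(unmodified_mass, observed_mass, tolerance=0.01):
    """True if the observed peptide mass exceeds the unmodified mass by one remnant."""
    return abs((observed_mass - unmodified_mass) - DIGLYCINE_SHIFT) <= tolerance


print(round(DIGLYCINE_SHIFT, 4))  # 114.0429
```

In practice the comparison is made by database search software that treats the remnant as a variable modification on lysine; the function above only illustrates the arithmetic.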

A related strategy allowed detection of different ubiquitin chain topologies in vivo [42, 50]. Formation of a poly-ubiquitin chain requires isopeptide linkages between the terminal carboxyl group of a free ubiquitin molecule and one of seven lysine residues present in a substrate-attached ubiquitin (Figure 2a,d). The most important chain topology is formed through the lysine in position 48 of ubiquitin [51]. Chains linked through Lys48 are the principal recognition signals for the proteasome and generally induce substrate degradation [52]. Chains linked through Lys63 do not induce substrate degradation but have direct effects on protein activity [53, 54]; the biological role of other ubiquitin chain topologies is unclear. From the analytical perspective, the ubiquitin chain linkage can be regarded as a specific example of a ubiquitination site in a substrate: the substrate in this case is ubiquitin itself. Chain linkage can therefore be determined by the characteristic 114 Da mass shift, as described above (Figure 2c,d) [41].
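How trypsin digestion yields a different signature peptide for each linkage can be illustrated with a short in silico digest. The sketch below makes several assumptions that are not in the text: it uses the human ubiquitin sequence (yeast ubiquitin differs at a few positions, so some peptides would differ), it applies the simple rule that trypsin cleaves after lysine or arginine unless the next residue is proline, and it assumes that a diglycine-modified lysine is not cleaved. The function name is hypothetical.

```python
# Human ubiquitin, 76 residues (supplied here as an assumption; not from the source).
UBIQUITIN = (
    "MQIFVKTLTGKTITLEVEPSDTIENVKAKIQDKEGIPPDQQ"
    "RLIFAGKQLEDGRTLSDYNIQKESTLHLVLRLRGG"
)


def signature_peptide(sequence, linked_lysine):
    """Tryptic peptide containing a diglycine-modified lysine (1-based position)."""
    if sequence[linked_lysine - 1] != "K":
        raise ValueError("position %d is not a lysine" % linked_lysine)
    # 1-based positions after which trypsin is assumed to cut:
    # after K or R, not before P, and not at the modified lysine.
    cut_after = [
        i + 1
        for i, residue in enumerate(sequence[:-1])
        if residue in "KR" and sequence[i + 1] != "P" and i + 1 != linked_lysine
    ]
    start = max([c for c in cut_after if c < linked_lysine], default=0)
    end = min([c for c in cut_after if c > linked_lysine], default=len(sequence))
    peptide = sequence[start:end]
    offset = linked_lysine - start  # position of the modified lysine in the peptide
    return peptide[:offset] + "(GG)" + peptide[offset:]


lysines = [i + 1 for i, residue in enumerate(UBIQUITIN) if residue == "K"]
for position in lysines:
    print("Lys%d" % position, signature_peptide(UBIQUITIN, position))
```

Under these assumptions a Lys48 linkage gives LIFAGK(GG)QLEDGR and a Lys63 linkage gives TLSDYNIQK(GG)ESTLHLVLR; because each peptide has a different sequence, each has a different mass and fragmentation pattern.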

Rather surprisingly, analysis of mass data from large-scale ubiquitin-profiling experiments [42] has revealed that all seven lysine residues in ubiquitin are used to form ubiquitin chains in vivo. The abundance of the different chain-linkage types was ranked and suggested that linkage through Lys48 is the most abundant topology, followed by Lys63 and Lys11 chains and the less frequent linkages through Lys33, Lys27, Lys6, and Lys29 (the latter is detected only in combination with Lys33 linkage) [42]. These results emphasize the complexity of ubiquitin biology. At the same time, interpretation of these experiments [42] is somewhat limited because they can describe only the linkage between two ubiquitins and cannot determine to which protein the chain was attached or whether the chain was attached to a substrate at all (Figure 2d). Similarly, it is unclear whether ubiquitin chains are homogenous or can contain mixed linkage types. The only rigorously studied example so far of a substrate-attached ubiquitin chain in vivo demonstrated the presence of a homogenous ubiquitin chain [50]. Many more studies are necessary, however, before we can decide whether mixed chains exist in vivo and whether they encode biologically important information.

Chemistry-based and global in vivo approaches to deubiquitination

DUBs are an important component of the ubiquitin-proteasome system. They are proteases and can therefore be targeted by activity-dependent probes, which form covalent bonds with their active sites and have been successfully applied to other classes of proteases [55, 56]. An elegant strategy using an activity-based approach to target DUBs has led to the identification of numerous deubiquitinating activities in cell lysates and the discovery of the new class of DUBs that contain an OTU domain (a domain characteristic of the ovarian tumor superfamily of proteins) [57]. Briefly, ubiquitin fused to a hemagglutinin (HA) epitope tag at its amino terminus and to one of various cysteine-reactive probes (which react with cysteine proteases) at the carboxyl terminus is incubated with total cell lysates. The active site of each DUB forms a covalent bond with the cysteine-reactive group on the HA-ubiquitin probe (Figure 3a) and can therefore be immunopurified using the HA tag and subsequently identified by mass spectrometry (Figure 3b) [58]. Activity-based ubiquitin probes have also been used to generate profiles of DUB activity in different cell lines and tissues and to identify proteins that interact with DUBs (Figure 3b) [58, 59].

Figure 3

Activity-based profiling of deubiquitinating enzymes and interacting proteins. (a) Ubiquitin fused to an amino-terminal epitope tag (for example hemagglutinin, HA) and a carboxy-terminal reactive group forms a covalent conjugate with deubiquitinating enzymes (DUBs; for details of the generation of these ubiquitin probes, see [57]). (b) The DUB-ubiquitin conjugates can be immunopurified using the HA epitope. Immunopurification under native conditions allows identification of DUBs and their interacting proteins by mass spectrometry (MS). The immunopurified fractions can be further separated by gel electrophoresis, and DUB-ubiquitin conjugates can be detected by anti-HA immunoblotting. Proteins corresponding to HA-reactive bands can be eluted from silver-stained gels (not shown) and the DUBs can be identified by mass spectrometry.

The problem of identifying DUBs that react with a specific ubiquitinated protein has been elegantly addressed using a collection of RNA-interference (RNAi) vectors that knock down the expression of more than 50 DUBs in mammalian cells [60]. Because the steady-state level of ubiquitin conjugates reflects the balance between ubiquitination and deubiquitination, knockdown of the activity of a specific DUB increases the fraction of the ubiquitinated form of its substrates. This strategy helped to identify the DUB USP1 as the deubiquitinating activity that acts on the mono-ubiquitinated version of FANCD2 (a protein defective in the Fanconi anemia complementation group D2) [60]. A similar collection of small interfering RNAs (siRNAs) has enabled the identification of the familial cylindromatosis tumor suppressor gene (CYLD) as a DUB involved in regulation of the NFκB transcriptional control pathway [61].
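The reasoning behind the knockdown screen can be made concrete with a toy steady-state calculation. The two-state, first-order model below (substrate interconverting between unmodified and ubiquitinated forms, with no degradation) is an assumption introduced here for illustration and is not taken from [60]; the rate constants are arbitrary.

```python
def ubiquitinated_fraction(k_ub, k_deub):
    """Steady-state ubiquitinated fraction for S <-> S-Ub with first-order rates."""
    return k_ub / (k_ub + k_deub)


# Arbitrary illustrative rate constants (not measured values).
baseline = ubiquitinated_fraction(k_ub=1.0, k_deub=4.0)
knockdown = ubiquitinated_fraction(k_ub=1.0, k_deub=1.0)  # DUB activity reduced fourfold

print(baseline, knockdown)
```

In this toy model, reducing the deubiquitination rate fourfold raises the ubiquitinated fraction from one-fifth to one-half, which is the kind of shift the screen looks for when each DUB is knocked down in turn.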

More than 25 years have passed since the initial discovery of the ubiquitin system. Ubiquitin has since extended its role from a protein-degradation signal to a regulatory protein modification that affects all areas of biology. The importance of the ubiquitin-proteasome system in biology was acknowledged with the 2004 Chemistry Nobel Prize to Aaron Ciechanover, Avram Hershko, and Irwin Rose, who first established its main features. The complexity and significance of the ubiquitin-proteasome system have started to attract global approaches that are beginning to make important contributions to our understanding of the system. Chemistry-based approaches to deubiquitination have demonstrated the effectiveness of these strategies, and analogous activity-based probes for studying ubiquitin transfer will be of similar importance. Proteomics using mass spectrometry has had a tremendous impact on the field, as it has helped to describe the nature and regulation of multi-protein complexes that themselves regulate the ubiquitin-proteasome system. Large-scale ubiquitin-profiling experiments have highlighted the involvement of the system in a wide range of processes and demonstrated the complexity of ubiquitin-chain topology. Mass-spectrometric approaches promise to be particularly powerful in the future because one of the previous limitations - the inherently non-quantitative nature of these experiments - has been overcome by stable-isotope-based quantification strategies [48-50].

Some of the global approaches described here for the study of the ubiquitin system have also been applied to the study of other ubiquitin-like proteins, such as SUMO, ISG15, and Nedd8 [62, 63]. The strategies that have been proven to be effective for studying ubiquitin biology will be just as important for rapidly advancing our understanding of the role of the growing family of ubiquitin-like modifiers.