Introduction

Proteomics, one of the most important areas of research in the post-genomic era, is not new in terms of its experimental foundations [1]. It is a natural consequence of the huge advances in genome sequencing, bioinformatics and the development of robust, sensitive, reliable and reproducible analytical techniques [212]. Genomics projects have produced a large number of DNA sequences from a wide range of organisms, including humans and mammals. This "genomics revolution" has changed the concept of the comprehensive analysis of biological processes and systems. It is now hypothesized that biological processes and systems can be described based on the comparison of global, quantitative gene expression patterns from cells or tissues representing different states. The discovery of posttranscriptional mechanisms that control rate of synthesis and half-life of proteins and the ensuing nonpredictive correlation between mRNA and protein levels expressed by a particular gene indicate that direct measurement of protein expression also is essential for the analysis of biological processes and systems. Global analysis of gene expression at the protein level is now also termed proteomics. The standard method for quantitative proteome analysis combines protein separation by high resolution (isoelectric focusing/SDS-PAGE) two-dimensional gel electrophoresis (2DE) with mass spectrometric (MS) or tandem MS (MS/MS) identification of selected protein spots [5, 9, 11, 1316]. Important technical advances related to 2DE and protein MS have increased sensitivity, reproducibility, and throughput of proteome analysis while creating an integrated technology. Quantitation of protein expression in a proteome provides the first clue into how the cell responds to changes in its surrounding environments. The resulting over- or under-expressed proteins are deemed to play important roles in the precise regulation of cellular activities that are directly related to a given exogenous stimulus. Conventional 2DE, in combination with advanced mass spectrometric techniques, has facilitated the rapid characterization of thousands of proteins in a single polyacrylamide gel. The uniqueness of 2DE for easy visualisation of protein isoforms, using two physical parameters such as isoelectric point and molecular weight, renders this technology itself extremely informative. The method routinely analyzes more than 1000 different protein spots separated on a single two-dimensional gel and, thus, is well suited for the global analysis of protein expression in an organism. However, high-throughput quantitation of proteins from different cell lysates remains a challenging issue, owing to the poor reproducibility of 2DE, as well as low sensitivity and narrow linear dynamic ranges in the detection methods [1721] Recent developments of fluorescent dyes, such as the different commercially available SYPRO dyes, partially addressed some of these problems [2230]. These dyes detect as little as 1 ng of proteins, and at the same time they offer more than 1000-fold linear dynamic range. The more critical issue, however, is the reproducibility problem of 2DE. Even the identical protein samples that are run on two separate two-dimensional gels will normally produce very similar but not identical 2DE protein maps, owing to the gel-to-gel and operator-to-operator variations. This can be circumvented using multiplexing methods such as fluorescent two-dimensional "Difference Gel Electrophoresis" (2-D DIGE), which substantially reduces variability by displaying two or more complex protein mixtures labeled with different fluorescent dyes in a single 2D gel [21, 3138].

In this review, we focus on the latest developments in 2DE within the context of large-scale proteomics to reveal the advantages, limits and perspectives of the 2DE-based proteomic approach.

A - Gel-based proteomic: from sample preparation to protein separation

1 - Protein solubilisation

In order to take advantage of the high resolution capacity of 2DE, proteins have to be completely denatured, disaggregated, reduced and solubilised to disrupt molecular interactions and to ensure that each spot represents an individual polypeptide.

Although a large number of standard protocols has been published, these protocols have to be adapted and further optimized for the type of sample (bacteria/yeast/mammalian cells; cells/tissue; animal/vegetal material; etc...) to be analyzed, as well as for the proteins of interest (cytosolic/nuclear; total "soluble" or membrane "insoluble" proteins; etc...).

After cell disruption, native proteins must be denatured and reduced to disrupt intra- and intermolecular interactions, and solubilized while maintaining the inherent charge properties. Sample solubilization is carried out using a buffer containing chaotropes (urea and/or thiourea), nonionic (Triton X-100) and/or zwitterionic detergents (CHAPS), reducing agents (DTT), carrier ampholytes and most of the time protease and phosphatase inhibitor cocktails are mandatory.

2 - The first dimension: isoelectric-focusing with immobilized pH gradients (IPGs)

Proteins are amphoteric molecules; they carry positive, negative or zero net charge, depending on their amino acid composition. The net charge of a protein is the sum of all the negative and positive charges. The isoelectric point (pI) of a protein is the specific pH at which the net charge of the protein is zero. Proteins are positively charged at pH values below their pI and are negatively charged at pH values above their pI. IEF is an electrophoretic separation based on this specific biochemical characteristic of proteins.

Basically, the first dimension of the 2DE is achieved with a "strip". It is a dry gel that is formed by the polymerization of acrylamide monomers, linked by bis-acrylamide with molecules of covalently linked immobilin. Immobilins are chemical components that are derived from acrylamide and have additional ionizable non-amphoteric functions. Immobilins of various pKa can create an immobilized pH gradient inside the acrylamide gel. Immobilin was developed by Professors Righetti and Görg at the beginning of the 1990s and is now widely used in 2DE because the IEF gradient is very stable over time and in a high electric field, and shows good reproducibility and a large capacity for separation [9, 3946].

The strip acrylamide gels are dried and cast on a plastic backing. Prior to use, they are rehydrated in a solution containing a pI-corresponding cocktail of carrier ampholytes and with the correct amount of proteins in the solubilization buffer. The carrier ampholytes are amphoteric molecules with a high buffering capacity near their pI. Commercial carrier ampholyte mixtures, which comprise species with pIs spanning a specific pH range, help the proteins to move.

When an electric field is applied, the negatively charged molecules (proteins and ampholytes) move towards the anode (positive/red electrode) and the positively charged molecules move towards the cathode (negative/black electrode). When the proteins are aligned according to their pI, the global net charge is zero and the protein is unable to move and is then focused. Focusing is achieved with a dedicated apparatus that is able to deliver up to 8000 or 10,000 V, but with a limitation in current intensity (50 μA maximum/strip) to reduce heat. The strips are usually first rehydrated without current for at least 5 h (passive rehydration), rehydrated with 50 V for 5 h (active rehydration) and then focused until at least 30 to 80 kV/h.

The equilibration step is critical for 2DE. In this step, the strips are saturated with sodium dodecyl sulfate (SDS), an anionic detergent that can denature proteins and form a negatively charged protein/SDS complex. The amount of SDS bound to a protein is directly proportional to the mass of the protein. Thus, proteins that are completely covered by negative charges are separated on the basis of molecular mass.

The equilibration solution also contains buffer, with urea and glycerol. Equilibration of the strips is achieved in two steps: (1) with an equilibration solution containing DTT, to maintain a reducing environment; and (2) with an equilibration solution containing iodoacetamide, to alkylate reduced thiol groups, preventing their re-oxidation during electrophoresis.

3 - The second dimension: SDS-PAGE

In SDS polyacrylamide gel electrophoresis (SDS-PAGE), migration is determined not by the intrinsic electric charge of polypeptides but by their molecular weight. The SDS-denatured and reduced proteins are separated according to an apparent molecular weight, in comparison with a molecular weight marker. A linear relationship between the logarithm of the molecular weight and the distance of migration of the proteins can be used; it depends essentially on the percentage of polyacrylamide.

Equilibrated strips are embedded with 1% (w/v) low-melting-point agarose in TRIS/Glycine/SDS running buffer and with 0.01% bromophenol blue on the top of the second dimension acrylamide gel. Gels are usually run with 1 or 2 W of current in the first hour, followed by 15 mA/gel overnight with a temperature regulation (10°C to 18°C). When the bromophenol blue migration front reaches the bottom of the gel, the second dimension is finished and the acrylamide gel can be removed from the glass plates.

4 - Gel staining

The gel must firstly be immersed in a fixation solution containing acid (phosphoric acid or acetic acid) and alcohol (ethanol or methanol) as a function of the staining protocol selected. Numerous stains can be used, but with very different costs [17]. Conventional "visible" dyes are Coomassie Blue, colloidal Coomassie Blue and silver nitrate, with quite different sensitivities: 50, 10 and 0.5 ng of detectable protein/spot respectively [17, 20, 25, 4751]. Commercially available fluorescent dyes, such as Sypro Ruby, Flamingo and Deep Purple, have sensitivities of about 1 ng of detectable protein/spot [21, 23, 26, 28, 5255]. Fluorescent dyes have the advantage of a 4 log dynamic linear range but the disadvantage of being more expensive. In comparison with fluorescent dyes, silver nitrate stain has a dynamic linear range of only 1.5 log, and is not recommended for a gel comparison study.

5 - Bioinformatics analysis of 2-D images

Stained gels are scanned on a "visible" or "fluorescent" scanner as a function of the staining protocol selected. The image can then be imported to specific software to be analysed and compared. For a comparison study, at least three repetitions of the same sample should be run; many migration artifacts can occur during 2DE and, to reduce such variability, a mean of several gels is essential. Software, such as Image Master, Progenesis, PDQuest and Samespots, can be used to detect spots and to compare the spot intensity between samples [5360]. Spots of interest, i.e. spots specific to a sample or spots over-expressed on a condition/treatment, can be selected for further MS analysis. Several "computer-based" comparisons can be performed with a 2DE map. As a proteomic map is specific of a given cell, tissue or organism in a specific physiological condition, it is possible to compare not only one spot to one spot, but a set of spots to a set of spots, for example between two closed organisms. In a precedent study, we investigated the natural variation in the proteome among 8 Arabidopsis thaliana ecotypes, of which 3 were previously shown to display atypical responses to environmental stress [61]. The 2DE proteomic maps revealed important variations in terms of function between ecotypes [62]. Hierarchical clustering of proteomes according to either the amount of all anonymous spots, that of the 25 major spots or the functions of these major spots identified the same classes of ecotypes, and grouped the three atypical ecotypes (Fig. 1).

Figure 1
figure 1

Hierarchical clustering of Arabidopsis thaliana ecotypes based on 2DE spot amounts. Soluble proteins from ecotype Col-0 were separated using pH 4-7 IPG in first dimension and 11% SDS-PAGE in second dimension. Proteins spots were visualized by colloidal Coomassie blue staining. The amount of each spot was estimated by its normalized volume as obtained by image analysis [62]. Euclidian distances were then computed for all spots to build the similarity matrix for ecotypes, and clustering was performed using the Ward's method to link the variables.

6 - Protein identification

To identify the proteins within the spots of interest (according to image analysis), a gel with a greater amount of protein is prepared. In this case, IEF step must be performed at least until 100 kV/h. The other steps of the 2DE are very similar to the previously described protocol. Colloidal Coomassie Blue or fluorescent dyes are recommended for the staining of the preparative gel, because they have good compatibility with MS [22, 23, 28, 63]. In contrast, silver nitrate will give poor results, even if MS-compatible protocols are available [21, 49, 50]. It should be noted that a specific spot picker robot, able to work with fluorescence, is essential when working with fluorescent dyes. On a precedent study, we analyzed the total protein maps visualized when using classical visible stains and different fluorescent dyes [49]. For this purpose, a soluble extract from Arabidopsis thaliana was taken as a model of sequenced eukaryotic genome and resolved by 2-DE. Besides specificities in background quality, propensity to saturation, and staining reproducibility, large differences were observed between dyes in terms of sensitivity, especially for low abundance spots. The effects of the staining procedure on MALDI-TOF MS characterization were analyzed too on a set of 48 protein spots that were selected for their contrasting abundance, pI, and Mr. Gels were stained with either classical visible stains colloidal (Coomassie blue and silver nitrate), and different fluorescent dyes (Sypro Ruby and Deep Purple). It appeared that Sypro Ruby combined several favorable features: no dependence of the identification rate upon the physicochemical properties of proteins, no impact on frequency of missed cleavages, and a higher predicted identification rate (Fig. 2).

Figure 2
figure 2

Impact of staining procedures on mass spectrometry compatibility using visible stains and fluorescent dyes. In these studies, the effects of protein spot properties were integrated to derive prediction of the MS results obtainable with the different dyes for all spots in the gels [24, 49]. 100 μg of total protein extracts from Arabidopsis thaliana were focused on pI 4-7 range and separated on gels covering the 15-150 kDa range. By comparison to sensitivity properties of dyes, these simulations enable a first estimation of the overall proteomic capacity of dyes. They argue for a clear advantage in using fluorescent dyes, particularly SR, which cumulates high sensitivity, acceptable identification success on gels loaded with low protein amount, and constant protein sequence coverage. Abbreviations used: colloidal Coomassie blue (CCB), silver nitrate (SN), Sypro Ruby (SR), Deep Purple (DP).

B - Benefits of Gel-based proteomic

1 - A reduced gel to gel variation using "Difference in Gel Electrophoresis" (2D-DIGE)

Difference in gel electrophoresis (DIGE), first conceived by Unlu et al. in 1997, takes advantages of structurally similar cyanine-based dyes to label different pools of protein samples, which are then co-separated on a single 2DE gel [34].

The biggest advantage of DIGE over other two-dimensional- based technologies is that it enables the analysis of two or more protein samples simultaneously on a single 2DE [31, 32, 35, 36]. Because the same proteins present in two different samples were prelabeled with two different dyes (i.e., Cy3 and Cy5, respectively), they could be combined and separated on the same 2DE without the loss of the relative protein abundance in the original samples. At the end of protein separation, the relative ratio of proteins in the two original samples could be readily obtained by comparing the fluorescence intensity of the same protein spots under different detection channels (e.g., Cy3 and Cy5) using a commercial fluorescence gel scanner. Because only one gel is used in DIGE, and the same proteins from two different protein samples comigrate as single spots, there is no need for the generation of "averaged" gels, as well as superimposition of different gels, making spot comparison and protein quantitation much more convenient and reliable. This makes DIGE potentially amendable for high-throughput proteomics applications.

DIGE has shown significant advantages over conventional 2DE in a number of applications. Up to three kinds of fluorescent cyanine dyes have been employed in DIGE, namely, Cy2, Cy3, and Cy5, which allows for simultaneous analysis of up to three different protein samples in a single gel. DIGE is a valuable method for high-throughput studies of protein expression profiles, providing opportunities to detect and quantify accurately "difficult" proteins, such as low-abundance proteins.

A problem in DIGE lies in the hydrophobicity of the cyanine dyes, which label the protein by reacting covalently, to a large extent, with surface- exposed lysines in the protein, and lead to removal of multiple charges from the protein. Consequently, this decreases the solubility of the labeled protein, and in some cases may lead to protein precipitation prior to gel electrophoresis. To address this problem, minimal labeling is generally employed in DIGE. Typically the labelling reaction is optimized such that only 1-5% of total lysines in a given protein are labeled. Alternatively, Shaw et al. have developed a new batch of DIGE Cy3 and Cy5 dyes, which label only free cysteines in a protein by "saturation" labeling [37]. This strategy offers greater sensitivity than the conventional DIGE method. The biggest drawback, however, is that it only labels proteins that contain free cysteines, meaning that a certain percentage of proteins in a proteome will not be labeled with this strategy, let alone downstream detection and characterization of these proteins.

2 - Characterization of protein isoforms

An area of increasing interest in proteomics is the identification of post-translational modifications and/or spliced forms of a same gene or protein [6467]. The process of determining whether a protein is expressed in a particular proteome has become a relatively simple task with the automation of the 'in-gel' digest and subsequent identification of the resulting peptides with mass spectrometers. Today, most proteins are identified by either assigning them definitive protein attributes, such as peptide masses generated by MALDI-TOF mass spectrometry and the short amino acid sequences generated by tandem MS. It is clear that when several spliced variants are present in a proteome, such approach for protein identification mostly characterizes peptides common to all spliced variants. In a precedent study, we used the advantages of 2DE separation to analyze alpha-amylase diversity in human saliva [68]. Because each alpha-amylase isoforms exist as a discrete purified protein, any information obtained from the analysis of this protein is unique to its original proteome (Fig. 3). 2DE was combined with systematic MALDI-TOF MS analysis and more than 140 protein spots identifying the alpha-amylase were shown to constitute a stable but very complex pattern. Careful analysis of mass spectra and simultaneous hierarchical clustering of the observed peptides and of the electrophoretic features of spots defined several groups of isoforms (Fig. 3 right part) with specific sequence characteristics, potentially related with special biological activities. In a recent study, 2DE separation was successfully used to analyze isoforms and polymers forms of bovine milk proteins [69]. A combination of reducing and non-reducing steps was used to reveal proteins polymers occurring before or after heat treatment of milk (Fig. 4). This original 2DE strategy revealed numerous disulfide-mediated interactions and was proposed to analyze reduction/oxidation of milk and dairy product proteins.

Figure 3
figure 3

Characterization of isoforms of alpha-amylase in human saliva. A 2DE map of human salivary proteins was performed [68]. Proteins were resolved using pH 3-10NL IPG and 12% SDS-PAGE, and stained with colloidal Coomassie blue. Alpha-amylase spots subsequently identified by MALDI-TOF MS are indicated by respective spot number (left part). According to the mass features of alpha-amylase (right part) identified spots, (A) coverage of the alpha-amylase sequence by the total population of peptides identified (black boxes) in the different alpha-amylase spots, (B) simultaneous clustering of the 67 alpha-amylase spots according to the MW range measured on gels, the MS identification of peptides in the N-terminal and C-terminal and central regions of the sequence, (C) individual spot coverage of the alpha-amylase sequence by peptides identified (black boxes) by MALDI-TOF MS.

Figure 4
figure 4

Non-reducing 2DE for the study of disulfide-mediated interactions between proteins in raw milk. 2DE of proteins in raw milk separated under non-reducing conditions using a 7-cm pH 4-7 pI range strip for the first dimension, and a 10 to 18% gradient acrylamide gel for the second dimension. The specific/interesting spots as indicated by arrows were submitted to MALDI-TOF mass to identify proteins involved in polymers [69].

C - Limits of "Gel-based proteomic"

1 - Membrane proteins

The resolution of membrane proteins remains an area of considerable concern in gel-based proteomics [7075]. There remains an attitude that it is difficult or impossible to effectively resolve membrane proteins using 2DE. Indeed, few membrane proteins are seen on 2D gels when conventional sample-preparation methods are used. Membrane proteins are poorly soluble in the detergent/chaotrope conditions available for IEF, and are inherently insoluble in gel matrices under these conditions and thus are poorly resolved by IEF and subsequent 2DE. Fractionation, in combination with the correct solubilizing reagents, produces sample extracts that are highly enriched for membrane proteins. Sequential extraction of proteins from a sample by increasing protein solubility at each step is an effective strategy for first removing the more abundant soluble proteins and then for concentrating the less abundant and less soluble membrane proteins. A specific 2DE strategy, using the cationic detergent benzyldimethyl-n-hexadecylammonium chloride in the first and the anionic detergent SDS in the second dimension, was successfully used to analyze membranes proteins in various systems [7678].

2 - Low-abundance proteins

Low-abundance proteins are rarely seen on traditional 2D maps because large quantities of abundant soluble proteins obscure their detection [21, 7981]. Most 2DE-based proteomic studies employ a 'one-extract-one-gel' approach and the majority of proteins identified in these studies are in high abundance. These low-abundance proteins are considered to be some of the most important, including regulatory proteins, signal transduction proteins and receptors. Consequently, the analysis of low-abundance proteins is becoming increasingly common in cellular proteomics. The dynamic range of protein concentration can differ considerably between biological samples. For yeast, the most abundant proteins are present at around 2 000 000 copies per cell, whereas the least abundant proteins are present at around 100 copies per cell, a dynamic range of only 4 orders of magnitude. However, in plasma, the predicted dynamic range of proteins is up to 12 orders of magnitude. Analysis of individual compartments not only provides information on protein localization, but also allows detection of protein populations otherwise not detectable in whole cell proteomes. Detection of the low-abundance proteins requires most of the time removal of abundant proteins from the sample. For example, the complexity of the serum and plasma proteome presents extreme analytical challenges in comprehensive analysis due to the wide dynamic range of protein concentrations. Therefore, robust sample preparation methods remain one of the important steps in the proteome characterization workflow. A specific depletion of high-abundant proteins from human serum and plasma using affinity columns is of particular interest to improve dynamic range for proteomic analysis and enable the identification of low-abundant plasma proteins [80, 82]. On another hand, IPG technology can be used with narrow (2-3 pH units) and very narrow (~1 pH unit) gradients that enable many more proteins to be resolved. Indeed, the advent of immobilized pH gradients has greatly improved the reproducibility of 2D gels and has made it easier for new users to implement this technology. The loading capacity of narrow-range IPGs is considerably higher than broad-range IPGs, thus enabling the visualization and identification of previously unseen proteins. Sub-fractionation can be used to improve the recovery of low-abundance proteins too. For example, membrane preparation methods are commercially available and allow a specific separation between abundant/soluble proteins and membrane/low-abundance proteins. More recently, a system is available to perform a specific depletion of high-abundant proteins and a reduction of protein concentration differences [82, 83]. The protein population is "equalized", by sharply reducing the concentration of the most abundant components, while simultaneously enhancing the concentration of the most dilute species.

3 - Alkaline proteins

Alkaline proteins were particularly difficult to resolve on 2D gels. First, the most common commercially available pH gradients, until recently, were pH 4-7 and pH 3-10 and these do not provide significant alkaline-protein resolution. As more alkaline pH range immobilized pH gradients become commercially available, resolution of proteins in IPGs up to pH 12 has been demonstrated. Strongly alkaline proteins such as ribosomal and nuclear proteins with closely related pIs between 10.5 and 11.8 were focused to the steady state by using 3-12, 6-12 and 9- 12 pI ranges [8486]. For highly resolved 2-D patterns, different optimization steps with respect to pH engineering and gel composition were necessary, such as the substitution of dimethylacrylamide for acrylamide, the addition near the cathode of a paper strip soaked with DTT providing a continuous influx of DTT to compensate for the loss of DTT [41, 45], and the addition of isopropanol to the IPG rehydration solution in order to suppress the reverse electroendosmotic flow which causes highly streaky 2-D patterns in narrow pH range IPGs 9-12 and 10-12 [86].

4 - High molecular weight proteins

Conventional 2DE, using SDS-PAGE as second dimension is not very useful to visualize high-molecular-weight (HMW) proteins. Even with gradient acrylamide gels, it is very difficult to obtain a good separation of proteins up to 250 kDa. HMW proteins and protein-complexes can be analysed using agarose gel IEF [87]. Agarose 2-DE is sufficiently good at separating HMM proteins with molecular masses as large as 500 kDa from various diseased tissues and cells [88]. Indeed, this method was successfully used to analyze HMW complexes from yeast [87], HMW proteins from human plasma [89], hepatocellular carcinoma, prostate and colorectal cancer [88, 9092].

Conclusion

Thanks to its high resolving power and its large sample loading capacity, 2DE allows several hundred proteins to be displayed simultaneously on a single gel, producing a direct and global view of a sample proteome at a given time point. Reference maps of numerous distinct samples have now been published, providing, to researchers worldwide, standardized libraries of proteins known to be present in these samples. But 2DE has some limitations that must be taken into account. Despite maximal precautions, there will be some degree of gel-to-gel and run-to-run variability in the expression of the same protein set, which could be overcome by maintaining a variability coefficient of reference spots as low as possible. It can be largely circumvented using a DIGE strategy. Additionally, some proteins may escape the capabilities of conventional 2DE for several reasons, including the poor solubility of membrane proteins and out of range characteristics of extreme proteins such as high or low pI and molecular weight. Despite all these drawbacks, 2DE can demonstrate changes in relative abundance of visualized proteins and can detect protein isoforms, variants, polymer complexes and posttranslational modifications. Quantitative proteomics can be achieved by assessing differences in protein expression across gels using 2DE dedicated software and proteins in varying spots can be identified by MS. The uniqueness of 2DE for easy visualization of protein isoforms renders this technology itself extremely informative and it is currently the most rapid method for direct targeting of protein expression differences.