Background

Most research in biology relies on comparative observations of two or more conditions in a quantitative or descriptive manner [1]. Quantitative measurements (isotope labeling, label free methods, etc.) in comparative proteomics have been explored [2,3], but it is important to determine whether two datasets derived from different experimental conditions can be compared. Comparability (qualitative similarity of datasets) should be assessed prior to the quantitative comparison of LC-MS/MS datasets.

Data normalization across samples from different biological conditions is another critical point of comparative proteomics. Individual datasets from LC-MS/MS can be obtained with careful sample preparation and mass spectrometry application (injection volume, injection concentration, reproducibility, etc.) to minimize deviations between samples. However, sample deviation is often fundamental and originates from different biological conditions and cannot be assessed by the extant reproducibility of any one sample or dataset normalization.

To approach such problems in comparative proteomics, we hypothesized that proteins expressed consistently across various cellular conditions can be used as internal standards for quantification as well as a dataset comparability indicator. Genes in the glycolytic pathway are widely used as internal standards to normalize DNA microarrays and quantitative PCR studies and were ideal for our purpose. This study selected constitutively expressed proteins from Lactococcus lactis’ glycolytic pathway as internal standards for comparative proteomic analyses [4-10].

We employed a simple, applicable, and accurate spectral counting method demonstrated by numerous researchers to quantify proteins [5,11-16]. This spectral counting method is particularly useful with protein mixtures and whole cell proteomic analyses [1]. The relative amount of a protein between two samples was estimated by comparing two normalized spectral abundance factors (NSAF).

We assessed the approach’s reliability by comparing whole cell proteomes and relative amount of green fluorescent protein (GFP) from two strains expressing GFP at low or high levels, L. lactis HR279 and JHK24 respectively [17-19]. The plasmid pHR086 present in HR279 is an Escherichia coli- L. lactis shuttle vector containing a nisin-inducible GFP expression cassette and pJH24 present in JHK24 is the high copy variant of pHR086. A previous comparative protein expression study demonstrated these high and low copy vectors showed strong correlation between GFP fluorescence intensity and GFP amount per cell [18].

In this study, relative increases in GFP expression among whole L. lactis cell proteomes was calculated using the number of GFP’s MS/MS spectra and the comparison to nine internal standards. Relative increase determined by spectral counting was then compared to values obtained from GFP fluorescence emission. LC-MS/MS dataset reproducibility from one sample and dataset comparability between two samples was evaluated using internal standards. We define relative number of total spectra (R TS ) as a presumptive parameter to evaluate mass spectrometric (MS) dataset’s comparability. In addition, our statistical analysis illustrates the importance of assessing a dataset’s comparability before calculating the relative protein quantities between two proteomic datasets.

Results and discussion

Strategy for the comparability assessment using internal standards

We define ‘comparability’ as the determination of whether two datasets have similar quality in order to correctly reflect proteomic changes occurring between two experimental conditions. Data reproducibility is key in determining comparability between single sample analytical replicates. Ideally, the change (ave_SRA[rep, k]) and standard deviation (SD_SRA[rep, k]) from averaged relative amounts of each protein from whole proteome biological replicates is 1.0 and 0, respectively. In this case, standard deviation reflects reproducibility between replicates.

The comparability assessment, however, becomes problematic when two samples from different experimental conditions are compared. Because datasets from two independent experimental conditions inherently exhibit a difference in each protein’s amount; the standard deviation of the relative amount of total protein (SD_SRA[comp, k]) should not be directly used as a dataset comparability assessment parameter. Instead, a subset of consistently expressed proteins should serve as internal standards for this quality assessment. Consistent expression levels in two different experimental conditions infer that internal standards used are “pseudo-replicates” shared by the two samples. Therefore, the standard deviation of the relative amount of internal standards (SD_SRA[comp, INj]) should be used in comparability assessments between two samples from different experimental conditions.

We also define ‘relative number of total spectra’ (R TS ) as an additional parameter to assess dataset comparability prior to calculating each protein’s relative amount. Linear correlation between R TS and the standard deviation of internal standards (SD_SRA[comp, INj] and SD_SRA[rep, INj]) helped determine R TS threshold value for the comparability assessment that permits a viable comparison.

Results of LC-MS/MS data and identification of proteins

Our experimental system employed four different conditions (Figure 1), L. lactis cells containing either a high or low copy plasmid expressing GFP, sampled at exponential phase (High-1, Low-1, respectively) and early stationary phase (High-2, Low-2, respectively). Three biological replicates of each sample were prepared in the separate sets of experiments. The replicates referred in this work are biological replicates, not analytical or technical replicates of a single biological sample. The samples and the total number of the MS/MS spectra used to identify the proteins (SpC total ) are summarized in Table 1. The biological replicates from the early exponential phase samples, High-1 and Low-1, exhibited a SpC total ranging from 2406 to 4514 resulting in a large value of R TS (1.20 ~ 1.79). In contrast, the SpC total of the early stationary phase samples, High-2 and Low-2, had more uniform numbers between 3810 and 4492 and, consequently, a low R TS value close to 1.0.

Figure 1
figure 1

Cell growth curve of L. lactis HR279 (triangle) and JHK24 (circle). The fluorescence from the HR279 and JHK24 are depicted as open triangles and open circles. The arrows indicated the induction of GFP expression by adding a nisin and the sample collection points.

Table 1 The summary of the LC-MS/MS results

As shown in Table 1, approximately 300 proteins were determined from each sample of three biological replicates. Between 76% to 86% of proteins were present in all biological replicates, and more than 90% of proteins appeared in at least two of the three biological replicates. Replicates with a small R TS value (for example, the sample High-2) showed only 8 proteins uniquely detected among individual replicates. However, the replicates of Low-1, which showed a large R TS , exhibited 25 proteins that were uniquely present in only one of the three biological replicates.

Biological variability

SD_SRA[rep, k] variation has been used for the indication of the quantitative reproducibility between sample replicates. Biological replicates of High-2 exhibited strong quantitative consistency with a 0.37-0.41-fold standard deviation. However, High-1 replicates exhibited a wider range of standard deviations (0.7-1.09 fold; Additional file 1: Table S1; Figure 2 - Correlation between R TS and variability of biological replicates.). Under two replicates’ ideal reproducibility, SRA[rep, k] would exhibit a value of zero (i.e. a change of 1.0-fold) or commonly a single value. However, when SRA[rep, k] exhibits a normal distribution (Additional file 2: Figure S1); the reproducibility can be measured by SD_SRA[rep, k] of the SRA[rep, k] distribution.

Figure 2
figure 2

Correlation between R TS and variability of biological replicates. Three way correlations on 3D space was projected on each xyz-plane describing the linear correlation between (A) R TS and standard deviation of total proteins (SD_SRA[rep, k]), (B) R TS and standard deviation of internal standards (SD_SRA[rep, INj]), and (C) standard deviation of total protein (SD_SRA[rep, k]) and internal standards (SD_SRA[rep, INj]). The X-axis of graph (A) and Y-axis of graph (C) shard the same value and range of SD_SRA[rep, k]. The circles, squares, diamonds, and triangles represent comparisons between biological replicates of samples High-1, High-2, Low-1 and Low-2 in Table 2, respectively.

Once MS analytical reproducibility is guaranteed, dataset quality variation occurs mainly due to biological variability between samples. Dataset normalization tools such as NSAF or technical replicates cannot adjust for the variation between samples; resulting in inaccurate quantitation unless dataset quality is the same. Injecting the same amount of protein to MS can minimize variability in dataset quality but does not necessarily reflect the same initial biological specimen amount. For example, cell protein recovery varies depending on the morphology (cell wall rigidity, exopolysaccharide production) and total amount of protein per cell. Cell lysate’s protein concentrations obtained from growth on different substrates or taken at different growth stages differed up to 3.9-fold even though the same number of cells (as determined by optical density) were initially used (Additional file 1: Table S2). Thus, normalization using protein amounts injected into the MS analysis cannot adjust variation from samples’ different biological conditions.

Internal standards

The nine glycolytic proteins chosen as internal standards exhibited smaller quantitative variations than the total protein pool (Additional file 1: Table S1). SD_SRA[rep, INj] between replicate of the High-2 samples were 0.15- 0.27 fold and showed good reproducibility. High-1’s biological replicates still showed a higher value of SD_SRA[rep, INj] (0.34- to 0.84 fold). Figure 2A presents SD_SRA[rep, INj] linearly correlated to total proteins (SD_SRA[rep, k]), exhibiting a correlation coefficient of 0.84.

Glycolytic enzymes were chosen in this particular study since they are constitutively expressed throughout the cell growth. Alternative sets of internal standards could be used in different experimental conditions or biological systems. For example, the enzymes involved in the Calvin cycle can be used as internal standards for the plant cell. Comparative proteomics between wild type (W) and the mutant (KE) strains of Oryza sativa subsp. japonica showed that the enzymes involved in the photosynthesis were constitutively expressed (Additional file 1: Table S3). In the individual comparisons between the replicates of W and KE strains, the SD_SRA[comp, INj] of selected enzymes were between 0.19- to 0.43-fold suggesting constitutive expression and, consequently, potential use as internal standards (Additional file 1: Table S4)

A common practice in mRNA expression studies is to normalize the expression levels to one or more internal standard(s) [4,6,7,9,10]. This controls for sample variation due to differences in RNA preparation. We hypothesized that the same approach would work in comparative proteomics and used several constitutively expressed proteins from the glycolytic pathway of L. lactis as internal standards. Alternative sets of internal standards could be used in different experimental conditions or biological systems. Indeed, the narrow range of standard deviations in the relative amounts of nine glycolytic enzymes at different growth stages (SD_SRA[comp, INj]) suggests that the expression level of nine enzymes in two different strains and two independent growth stages of L. lactis were maintained at a constant level (Table 2) and did not correlated to the RTS values (Additional file 2:Figure 2S).

Table 2 Relative amount of GFP between high and low expression system at different stage of cell growth

Comparability assessment

The comparability assessment of two samples obtained from different biological conditions starts with comparing the two constitutively expressed internal standards (glycolytic enzymes) subsets. Threshold value obtained from the analysis of replicates (0.46-fold) is applied to assess the comparability between two independent sample sets.

We used standard deviations from biological replicates to design an acceptable range for our comparability assessment. The minimum SD_SRA[rep, k] obtained was 0.38-fold. This experimental observation led us to a determined threshold for the comparability assessment: a standard deviation difference of 0.76-fold (twice the experimentally determined minimum value). Standard deviations of 0.38- and 0.76-fold are equivalent to linear correlation coefficients of 0.98 and 0.95, respectively, when each protein’s NSAF was plotted on a log-log plot (Additional file 2: Figure S3). Consequently, the SD_SRA[rep, INj] was 0.46-fold; this became our threshold for comparability assessments from standard deviation correlations between internal standards and total protein (Figure 2).

R TS - LC data quality

While internal standards are capable of adjusting biological variability between two independent samples, sample comparability should be assessed prior to calculations. R TS is a presumptive parameter for quality assessment correlating with SD_SRA[comp, INj].

Biological replicates from exponential phase samples (High-1, Low-1) exhibited SpC total of 2406–4514 resulting in larger R TS values (1.20-1.79). In contrast, early stationary phase samples (high-2, low-2) SpC total had more uniform numbers between 3810 and 4492 and, consequently, lower R TS values closer to 1.0.

R TS positively correlated with SD_SRA[rep, k]: r 2 = 0.89 and SD_SRA[rep, INj]: r 2 = 0.86 (Figure 2B & 2C). Increasing the value of R TS increased the SD_SRA[rep, k] value; suggesting poor dataset comparability at higher R TS values. The calculated threshold for the comparability assessment was R TS of 1.35 using linear correlation where SD_SRA[rep, k] and SD_SRA[rep, INj] were 0.76- and 0.46-fold, respectively. R TS exhibited linear correlations with the standard deviations of the internal standard proteins from replicates within one sample (SD_SRA[rep, INj]) and between replicates of two different samples (SD_SRA[comp, INj]) (Figure 3). The correlation (r 2 = 0.84) allowed us to calculate an R TS value of 1.34 using a standard deviation of 0.46-fold; showing good agreement with values obtained from biological replicates.

Figure 3
figure 3

Correlation of R TS and the standard deviation of a relative amount of internal standards. Correlation of R TS and the standard deviation of a relative amount of internal standards between replicates (SD_SRA[rep, INj]) and between two samples (SD_SRA[comp, INj]). The standard deviations obtained from the comparison of High-1vs Low-1, High-2 vs. Low-2 and replicates in each group are shown as squares, triangles and circles, respectively. The diamonds depict the outliers from the comparisons between the sample High-1 and Low-1.

R TS values close to 1.0 exhibited similar ion chromatograms and protein identification results (Table 1). In addition, R TS values correlated to the quantitative reproducibility of each protein in the replicates. SD_SRA[rep, k] and SD_SRA[rep, INj] between replicates linearly correlates to the R TS value (Figure 2). These observations support the notion that the R TS value is able to represent the quality of two MS datasets. The R TS value was introduced to assess the dataset comparability prior to the calculation because it contains information on the sample’s absolute quantity of peptide identification information which SD_SRA[comp, INj] does not.

The correlation between R TS and the quantitative reproducibility has been evaluated using the public database. Six replicates of Escherichia coli whole cell proteomic datasets were retrieved from the proteomeXchange website (http://www.proteomexchange.org/) [20]. The Ave_SRA[rep, k] between six replicates were within ±1.77 folds, however, the SD_SRA[rep, k] exhibited wide ranges from 0.5-fold to 3.0-fold depending on the dataset quality. From the SD_SRA[rep, INj], we were able to validate R TS as a the quality assessment parameter (Additional file 2: Figure S4)

Quality levels between MS datasets of two samples plays a critical role in relative protein quantitation. When a good quality sample (many proteins identified, large number of SpCtotal), is compared with a poor quality sample (few identified proteins, small SpCtotal), calculating the relative protein amount can be more inaccurate than comparing two poor quality samples as demonstrated in our data. SpCtotal from biological replicate A of High-1 was 2406 (1A; poor quality), biological replicate B and C of Low-1 sample had of 2878 (3B; poor quality) and 4150 (3C; good quality), respectively. GFP’s relative amount calculated using 1A/3B was 3.88 ± 1.69 fold; closer to the actual value measured by fluorescence (2.86 ± 0.42 fold) than the comparison between 1A/3C (7.09 ± 3.17 fold).

GFP expression

The bioinformatical analysis about the comparability and quality assessment was validated by the biological experiment using GFP expression. Fluorescence determined the relative amount of GFP between High-1 and Low-1 to be 2.86 ± 0.42 fold (Figure 4A). GFP expression increase between High and Low samples was 3.92-fold with a standard deviation of ±1.14 fold when the comparability assessment criteria was not used (Figure 4A; IS(−)/CA(−)). This is 140% higher and contains a larger standard deviation than values obtained from fluorescence measurements. A student’s T test calculated a p-value of 2.4 × 10−2 (IS(−)/CA(−)) and 8.4 × 10−3 (IS(+)/CA(−)) were obtained when internal standard and comparability assessment were not employed (Figure 4A).

Figure 4
figure 4

The impact of comparability assessment and the use of internal standards. The relative increase of GFP expression calculated by the spectral counting method (black bar) was compared to that obtained by the external measurement using fluorescent (white bar). IS and CA represented the use of internal standards and comparability assessment, and the sign of plus (+) and minus (−) indicates the “the use” and “without the use” of method, respectively. For example, IS(+)/CA(+) meant the relative amount of GFP calculated with the use of internal standard and comparability assessment of replicates. The p-values were obtained by the Student t-test (see the Experimental procedures). (A) and (B) is the comparison of sample obtained at early and late exponential phase, respectively.

Comparisons between replicates showed increases ranging from 2.06 ± 0.94 to 7.09 ± 3.14 fold (average 3.46 ± 1.97 fold increase) when internal standards were applied to protein quantification (Table 2 & Figure 4A). Although the calculated accuracy value improved, the standard deviation was still large (RSD of 57%). However, our comparability criteria eliminates MS dataset outliers (7.09 ± 3.17 or 2.06 ± 0.94) resulting in a smaller range for the relative amount of GFP (2.12 ± 0.66 to 3.48 ± 0.59 fold). The pooled value of relative GFP increase between comparable datasets (IS(+)/CA(+)) was 2.88 ± 0.92 fold; showing good correlation with the value obtained from external fluorescence measurements (Figure 4A). Student’s t-test showed a p-value of 0.95 between values obtained from external measurement and our mass spectrometric approach.

GFP’s relative increase amount between High-2 and Low-2 were more uniform (Table 2). Comparisons between replicates had SD_SRA[comp, INj] values lower than 0.46-fold, suggesting good comparability. GFP expression increase between each biological replicates (IS(+)) were in the range of 3.27 ± 0.70 to 4.73 ± 1.10 fold; numbers similar to the 4.00 ± 0.62 fold increase obtained from the external fluorescence measurements (Table 2). Pooled relative increase of GFP expression from MS dataset was 4.05 ± 0.97 fold (Figure 4B; IS(+)/CA(+)), correlating to values observed via fluorescence with a p-value of 0.73. However, the relative GFP amount was 4.69 ± 0.44 fold and the p-value was 1.1 × 10−4 without the use of internal standards (Figure 4B; IS(−)/CA(−)).

Conclusions

We have developed and discussed a novel method to accurately calculate a protein’s relative quantity in two whole cell proteomes. We determined that a preliminary proteomic dataset screen is necessary before an accurate relative protein abundance comparison can be made. In particular, we showed that assessing the relative number of total spectra (R TS ) between two datasets can forecast the quality of the ensuing quantitative comparison. Once two datasets were deemed comparable, we used nine glycolytic enzymes as internal standards to calculate protein relative abundance in the two proteomes. We used GFP as a model protein to demonstrate GFP’s relative abundance. Any two proteomic datasets with a R TS value less than 1.4 produced a value in close agreement with direct GFP fluorescence measurements (p-value of 0.73 and 0.95). While methods developed here employ spectral counting, their application is not limited to the label-free approaches.

Methods

Cell culture

L. lactis HR279 and L. lactis JHK24 were cultivated in M17 media (BD, Franklin Lakes, NJ) containing 3% (w/v) glucose (M17-G) supplemented with 5 μg/ml erythromycin (Sigma, St. Louis, MO). Fermentations were initiated by inoculating 5 ml seed culture and incubated at 30°C without shaking. M17-G media’s volume and initial pH were 300 ml and 6.5, respectively. pH was not controlled during the fermentations. The optical density (OD) was measured by a Beckman DU 7400 spectrophotometer (Beckman, Fullerton, CA) at 600 nm. Green fluorescent protein (GFP) expression was induced by adding nisin to a final concentration of 25 ng/ml when cell OD reached 0.7. GFP expression was monitored by measuring cell fluorescence. Cell pellets were washed three times in phosphate-buffered saline (PBS) and normalized to an OD600nm of 0.1 before analysis. Fluorescence from 100 μl of normalized cells was measured in an ABI770 real time thermo cycler using excitation and emission wavelengths of 488 nm and 520 nm, respectively [19].

Sample preparation and protein digestion

Fifty milliliters of L. lactis in media was removed at early and late exponential phase of cell growth as indicated in Figure 1. Early and late exponential phase samples from L. lactis HR279 were designated as Low-1 and Low-2 and samples from L. lactis JHK24 were High-1 and High-2, respectively. Initial cell mass amounts were normalized to an OD600nm of 1.0 by dilution or concentration to a final volume of 25 ml. Cells underwent centrifugation, washing (three times with PBS), and resuspension (1 ml of lysis buffer containing 100 mM Tris and 8.0 M urea, pH 9.0). Cell disruption used 300 μg silica beads (Sigma-Aldrich, ST. Louis, MO) and a bead beater (FastPrep; QBiogen, Irvine, CA) for six 30 second pulses, each with a 30 second interval on ice in between pulses. Centrifugation removed bead and cell debris. The soluble fraction was kept at −80°C until further analysis. Bio-Rad protein assay kit (BioRad, Valencia, CA) measured protein concentration.

Reduction used 4 μl of 450 mM dithiothreitol (DTT, Sigma-Aldrich, ST. Louis, MO) added to 25 μl of supernatant and incubated for 45 min at 55°C. Digestion required 2.5 μg of mass spectrometry grade trypsin (Promega, Madison, WI) added to the reduced protein mixture and incubated overnight at 37°C. The tryptic peptides were then purified by C18 Ziptip (Millipore, Billerica, MA) according to the manufacturer’s manual. Briefly, the Ziptip was first prepared by washing with 50% acetonitrile (ACN)/H2O followed by 0.1% (v/v) trifluoroacetic acid (TFA) in H2O. The tryptic peptide solution was then loaded onto the Ziptip and washed with 0.1% (v/v) TFA in H2O. The peptides were eluted with 50% ACN in H2O. The purified sample was dried prior to MS analysis.

Protein identification

Digested samples were analyzed by the Genome Center Proteomics Core at the University of California, Davis. Protein identification was performed using an Eksigent Nano LC 2-D system (Eksigent, Dublin, CA) coupled to a LTQ ion trap mass spectrometer (Thermo-Fisher, San Jose CA) through a Picoview Nano-spray source. Peptides were loaded onto an Agilent nanotrap (Zorbax 300SB-C18, Agilent Technologies) at a loading flow rate of 5.0 μL/min. Peptides were then eluted from the trap and separated by a nano-scale 75 μm x 15 cm New Objectives picofrit column packed in house with Michrom Magic C18 AQ packing material. Peptides were eluted using a 90 minute gradient of 2-80% buffer B (Buffer A = 0.1% formic acid, Buffer B = 95% acetonitrile/0.1% formic acid). The top 10 ions in each survey scan were subjected to automatic low energy collision-induced dissociation (CID).

For the protein identification among the replicates, a scoring system was developed to determine the presence of each protein from datasets where one biological replicate identifies a protein and another does not. Three parameters were evaluated to score the presence of a particular protein: (a) the number of unique peptides (Pep uniq ) used for the identification of a protein, (b) the probability values from X! Tandem (−log (e)) and (c) the number of times a protein showed up in all three biological replicates. First, proteins were scored as described in Table 3. Then, if the cumulative score of a protein in the three biological replicates was greater than or equal to three the same number of replicates, it was considered to be present in the sample. For example, if the protein was identified with high confidence (Pep uniq  ≥ 2 and –log (e) ≥ 10) in at least one of the three replicates or with low confidence (Pep uniq  = 1 and 2 ≤ −log (e) ≤ 10) in all three replicates, the score will be three and thus considered to be present in the sample.

Table 3 Protein scoring system for the protein determination in replicates

Database searching and false discovery rate (FDR)

Tandem mass spectra were extracted and charge states were deconvoluted by BioWorks version 3.3. All MS/MS samples were analyzed using X! Tandem (www.thegpm.org). X! Tandem was set up to search against the L. lactis whole proteome with protein supplements expressed from heterologous plasmids. X! Tandem searched with a fragment ion mass tolerance of 0.60 Da, and specified methionine oxidation as a variable modification. Peptide false discovery rates (FDR) were determined by independent MS/MS spectra searches against forward (target) and reverse (decoy) database of L. lactis IL1403 (including plasmid proteins). FDR was calculated as R/(F + R) where R and F were the number of peptides from decoy and target databases. The search was performed at a fixed 1% FDR level.

Bioinformatics- Protein functionality coded on L. lactis IL1403 were obtained from NCBI (http://www.ncbi.nlm.nih.gov) and JGI (http://img.jgi.doe.gov) [17].

Calculation of relative number of total spectra (R TS )

Quality assessment of LC-MS/MS datasets between two samples. Relative number of total spectra (R TS ) was determined using equation 1, where SpC A,i corresponds to the number of spectra for the protein i in sample A, NA and NB are the number of proteins in sample A and B, respectively.

$$ {R}_{TS}=\frac{Max\left({\displaystyle \sum_{i=1}^{N_A}Sp{C}_{A,i},{\displaystyle \sum_{i=1}^{N_{{}_B}}Sp{C}_{B,i}}}\right)}{Min\left({\displaystyle \sum_{i=1}^{N_A}Sp{C}_{A,i},{\displaystyle \sum_{i=1}^{N_{{}_B}}Sp{C}_{B,i}}}\right)} $$
(1)

R TS is the ratio of total number of tandem mass spectra used for the identification of proteins in the sample A and B. It has a value larger than or equal to, 1.0.

Calculation of relative quantification between independent samples

The relative amount of a specific protein between samples A and B was calculated using the number of tandem mass spectra of the specific protein and the internal standards. Nine glycolytic enzymes involved in carbohydrate catabolism were used as internal standards (Table 4). The NSAF of protein k in sample A (P A,k ) was divided by the NSAF of internal standard j of sample A (IN A,j ). To calculate the ratio of protein k between sample A and B, the normalized value of protein k in sample A was divided by the value of the same protein k in sample B. Since we employ nine internal standards, the resulting ratios were averaged. However, ratios could not be directly averaged. For example, a ratio of 2.0 corresponds to a two-fold increase and a ratio of 0.5 corresponds to a two-fold decrease. Thus the net average change of two replicates, which gives a two-fold increase and a two-fold decrease, respectively, should be zero. However, arithmetically, the average ratio of the example above would be 1.25, which is incorrect. To convert the ratio to the linear scalar value the scalar relative amount (SRA) was defined.

Table 4 Internal standards used in this study
$$ SRA\left[\alpha \Big|\beta \right]=\left\{\begin{array}{c}\hfill \alpha \ge \beta,\ \left(\alpha /\beta \right)-1\hfill \\ {}\hfill \alpha <\beta,\ 1-\left(\beta /\alpha \right)\hfill \end{array}\right. $$
(2)

Where α and β are the two values or functions in the ratio that we wish to calculate. In this equation, plus and minus only indicate the direction of the change. For the description of relative amount (ratio), a value of one-fold has to be added to the value of SRA [α| β]. For instance, SRA [α| β] of +0.5 and −0.5 corresponds to a 1.5-fold increase and decrease, respectively.

Using this definition, the SRA of protein k between two replicates A and B (SRA[rep, k]) can be described as follows where P replicate A,k is the amount of protein k in replica A.

$$ SRA\ \left[ NSAP\ \left({P}_{replicate\ A,\ k}\right)\Big| NSAF\ \left({P}_{replicate\ B,\ k}\right)\right] = SRA\ \left[SpC\left({P}_{replicate\ A,\ k}\right)\Big|SpC\ \left({P}_{replicate\ B,\ k}\right)\right] $$
(3)

In this equation, A and B indicates samples and k represents a protein. A and B can be replicates of one sample in a same condition (SRA[rep, k]) or two samples from different biological conditions (SRA[comp, k]). When the protein k is an internal standard, it is designated as SRA[A|B, IN j ] and then used to evaluate the comparability between two samples.

Calculation of GFP expression using internal standards

In order to adjust the biological variability, a set of internal standard has been used to calculate the GFP expression. The adjusted SRA of protein expression between sample A and B (SRA [A|B, k] adj ) was calculated as follows where IN A,j is the jth internal standard in sample A and N is the total number of internal standards (N = 9 in work).

$$ SRA\ {\left[\mathrm{A}\Big|\mathrm{B},\ k\right]}_{adj} = \frac{1}{N}{\displaystyle \sum_{j=1}^N}SRA\ \left[\frac{NSAF\left({P}_{A,\ k}\right)}{NSAF\ \left(I{N}_{A,\ j}\right)}\ \Big|\frac{NSAF\left({P}_{B,\ k}\right)}{NSAF\ \left(I{N}_{B,\ j}\right)}\ \right] $$
(4)

Because the same internal standard j (IN j ) and protein k (P k ) are used to calculate the SRA, equation (3) can be simplified as follows and depends solely on the number of spectra.

$$ SRA\ {\left[\mathrm{A}\Big|\mathrm{B},\ k\right]}_{adj} = \frac{1}{N}{\displaystyle \sum_{j=1}^N}SRA\ \left[\frac{SpC\left({P}_{A,\ k}\right)}{SpC\ \left(I{N}_{A,\ j}\right)}\ \Big|\frac{SpC\ \left({P}_{B,\ k}\right)}{SpC\ \left(I{N}_{B,\ j}\right)}\ \right] $$
(5)

Statistical analysis

Student t-test was used to compare GFP expression’s relative increase calculated using LC-MS/MS and external measurements using fluorescence. The t-test was performed with two-tailed and two samples with unequal variance (heteroscedastic) conditions.

Availability of supporting data

The mass spectrometric datasets used in this experiments and the corresponding GPM protein identification results were available on the ProteomeXchange site (www.proteomexchange.org) with the submission reference of 1-20150322-14021.