1 Introduction

Label-free quantification for proteomic analyses has gained popularity over the last decade. Advantages of label-free approaches compared with label-based methods (e.g., SILAC [1, 2]) include simplicity of sample preparation and applicability to any organism. Additionally, the reduced sample complexity allows an increase in the number of peptides sequenced, which results in a greater dynamic range and more comprehensive proteome coverage [3, 4]. Spectral counting [5] and ion abundance [6–10] have been used for label-free quantification and are known to correlate with protein abundance [11]. Furthermore, an algorithm combining spectral counting and ion abundance measurements was developed by Feener and co-workers [12], enabling quantification of an increased number of proteins.

The number of spectral counts (SpCs) for a protein is simply the number of MS/MS spectra that result in identification of its proteolytic peptides. In bottom-up proteomic strategies, data-dependent MS/MS acquisition software selects peptide ions based upon their abundance and charge state, which typically favors identification of more abundant peptides/proteins. Applying dynamic exclusion for previously selected peptides limits the number of SpCs for those abundant peptides and enables the selection of peptides of lower abundance, resulting in higher protein sequence coverage, more confident protein identifications, and increased depth of proteome coverage [13].

Normalization of SpCs is performed to reduce the variance observed between samples and replicates. The variability in SpCs can be caused by numerous factors including sample preparation, gel-to-gel variance (if a gel-based approach is part of the proteomic workflow), and changes in chromatography. Carvalho et al. [14] pointed out the importance of normalizing spectral counting data in order to quantify proteins; however, the most effective normalization method for label-free spectral counting has yet to be elucidated.

For total spectral count (TSpC) normalization [15], the technical replicate with the highest TSpC is chosen and the remaining technical replicates for that sample are normalized to it. Subsequently, the values across different samples are normalized to the sample with the highest technical-replicate TSpC. The normalization is performed for each protein individually, and comparisons of average normalized SpCs of the same proteins are made between samples. Comparison of absolute or normalized SpCs between different proteins to determine their relative abundance is generally precluded by the fact that longer proteins yield, on average, a higher number of spectral counts than shorter proteins. To account for this, Washburn and co-workers [16–18] developed a method, termed normalized spectral abundance factor (NSAF), in which the SpC for a given protein is divided by its length (L) to give a spectral abundance factor (SAF). To account for variations between runs, the SAF for a given protein (i.e., SpC/L) is subsequently normalized to the sum of all SAFs for proteins identified within that run to create a normalized SAF (i.e., NSAF) that can be used to compare the relative abundance of proteins both between and within samples. In the latter method, the authors assume that the sum of all SAFs should be conserved between replicates in order to correct for differences in sampling rates. In TSpC normalization, the assumption is that the sum of all SpCs (i.e., the TSpC) should be conserved between runs/samples. In comparison, normalization to selected proteins (NSP) does not assume that the sampling rate is conserved between replicates/samples; rather, it relies on the premise that the total SpC for a standard protein should be conserved between replicates/samples if it is present at the same concentration. In practice, the standard protein(s) can be an endogenous housekeeping protein or an exogenous protein; the latter has the advantage that the amount of protein added to each sample is known precisely. Any change observed in the SpC for the standard protein is assumed to reflect the variation between replicates and samples for the entire identified proteome. Consequently, the relative changes in SpC for the standard protein between replicates/samples are used as correction factors to normalize the SpCs of all proteins.
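To make the distinctions between these three schemes concrete, the following minimal Python sketch applies TSpC, NSAF, and NSP normalization to a toy SpC matrix (proteins × runs). The counts, protein lengths, and the choice of which protein serves as the NSP standard are invented for illustration and are not values from this study; the scaling conventions (normalizing runs to the run with the highest total, and to the run in which the standard protein has the most counts) follow the descriptions above.

```python
import numpy as np

# Toy data: rows = proteins, columns = runs/replicates (illustrative values only)
spc = np.array([[120.,  95.],    # protein A
                [ 30.,  40.],    # protein B
                [  5.,   9.]])   # protein C (used here as the NSP standard)
lengths = np.array([450., 210., 385.])   # protein lengths in residues (assumed)
std_idx = 2                              # index of the standard/spike-in protein (assumed)

# TSpC normalization: scale every run to the run with the highest total SpC
tspc = spc.sum(axis=0)
tspc_norm = spc * (tspc.max() / tspc)

# NSAF: SAF = SpC / length, then divide by the sum of all SAFs within each run
saf = spc / lengths[:, None]
nsaf = saf / saf.sum(axis=0)

# NSP: scale each run so the standard protein has the same SpC in every run
nsp_factor = spc[std_idx].max() / spc[std_idx]
nsp_norm = spc * nsp_factor

print("TSpC-normalized:\n", tspc_norm)
print("NSAF:\n", nsaf)
print("NSP-normalized:\n", nsp_norm)
```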

Significant biological change in protein abundance is determined either by fold-change [19, 20], by significance testing [21–23], or by a combination of both [14]. Even though SpC data sets do not necessarily meet the criteria for a normal distribution, Student’s t-tests are often applied in such experiments to determine statistical significance. Zhang et al. [21] performed a control experiment with yeast samples and six spike-in proteins at three different concentrations to calculate the false positive rates (FPRs) for different significance tests. When only one replicate was utilized, and thus the assumption of a normal distribution was clearly violated, the G-test provided the lowest FPRs; however, when three replicates were utilized, t-testing performed similarly.
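For reference, a minimal sketch of the G-test as it is commonly applied to two-run SpC comparisons is given below. The counts and run totals are invented, and the formulation used here (expected counts proportional to each run's total SpC, significance from a chi-squared distribution with one degree of freedom) is a standard likelihood-ratio form rather than the exact implementation used in [21].

```python
from math import log
from scipy.stats import chi2

def g_test(x1, x2, t1, t2):
    """G-test for one protein: x1, x2 are its SpCs in the two runs,
    and t1, t2 are the total SpCs of those runs."""
    e1 = t1 * (x1 + x2) / (t1 + t2)   # expected count in run 1 under the null
    e2 = t2 * (x1 + x2) / (t1 + t2)   # expected count in run 2 under the null
    g = 2.0 * sum(x * log(x / e) for x, e in ((x1, e1), (x2, e2)) if x > 0)
    return g, chi2.sf(g, df=1)        # G statistic and its p value (1 d.o.f.)

# Illustrative numbers only, not data from this study
g, p = g_test(x1=18, x2=40, t1=50000, t2=60000)
print(f"G = {g:.2f}, p = {p:.4f}")
```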

We are utilizing label-free quantification to gain proteomic insights into the pathogenicity of the fungus Magnaporthe oryzae (M. oryzae). M. oryzae causes rice blast disease, which destroys millions of hectares of rice each year and results in losses valued at billions of dollars [24]. Since half of the world’s human population relies on rice as a source of nutrition [25], understanding fungal development as it relates to disease progression is important for the development of control strategies. Dean and co-workers sequenced the whole genome of M. oryzae in 2005 [26], providing a reliable database for proteomic approaches. Using 1D gel and liquid chromatography-mass spectrometry (GeLC-MS), we aim to study the development of M. oryzae, wild type and mutants, at different time points in its life cycle. Since producing biological samples over a wide range of conditions (e.g., time, treatments, mutants) is difficult and the downstream proteomic workflow is very time-consuming, establishing a reliable normalization method is critical. Thus, we have used M. oryzae conidia spectral counting data to compare the normalization methods TSpC, NSAF, and NSP in their ability to account for variance between samples due to differences in sample preparation and chromatographic performance.

2 Experimental

2.1 Sample Preparation

M. oryzae conidia were harvested from 8-d-old minimal medium plates. Three biological replicates, each containing 2 million conidia, were pooled to account for biological variance. Conidia were lysed by bead beating in a buffer of 1X PBS (Fisher Scientific, Pittsburgh, PA, USA), 2 M urea (Sigma Aldrich, St. Louis, MO, USA), and 0.1% SDS (Bio-Rad, Hercules, CA, USA) to create the biological sample from which all experiments were derived. Protein concentration was determined via a BCA assay (Thermo Fisher Scientific, Rockford, IL, USA). Samples 1 and 2, derived from the same biological sample, were prepared and processed on different days. Equine myoglobin (Sigma Aldrich) and chicken ovalbumin (Sigma Aldrich) were chosen as spike-in proteins, and 25 ng of each was added to 50 μg of total protein for each sample. The samples were loaded onto 10%–20% gradient 1D-SDS-PAGE gels (Bio-Rad). More starting material than is ultimately injected is necessary to accommodate the number of fractions, losses during recovery, and the number of injections of a complex protein/peptide mixture; 50 μg is appropriate for the gel size utilized and allows adequate amounts of peptide material to be recovered post-digestion. After Coomassie staining (Bio-Rad), 10 gel-band fractions were excised and in-gel digestion [27] was performed on each fraction. Briefly, each gel fraction was destained with 100 μL of 50:50 ammonium bicarbonate (Sigma Aldrich)/acetonitrile (ACN) (Burdick and Jackson, Muskegon, MI, USA). Reduction was performed with 100 μL of 10 mM dithiothreitol (Sigma Aldrich) at 56 °C for 30 min, alkylation with 100 μL of 90 mM iodoacetamide (Sigma Aldrich) in the dark for 30 min, and digestion overnight at 37 °C with trypsin (protein:protease ratio of 5:1). Acetonitrile was added and discarded between each step to dehydrate the gel pieces. To extract the peptides, 200 μL of 5% formic acid (Sigma Aldrich) in ACN was added to each fraction and incubated for 15 min at 37 °C. The supernatants were transferred into new tubes. ACN (100 μL) was then added to the gel pieces and the resulting supernatant was combined with the first extract; this was repeated once more for each fraction. Sample 2' was produced by pooling one-third of the volume of adjacent in-gel digested fractions of sample 2 to give a sample with only five gel fractions that had nevertheless undergone the same sample processing as the 10-fraction sample. All samples were dried down and stored at −20 °C until nanoLC-MS/MS analysis.

2.2 NanoLC-MS/MS

A 75 μm i.d. IntegraFrit capillary (New Objective, Woburn, MA, USA) trap was packed to 5 cm with Magic C18AQ packing material (Michrom BioResources, Auburn, CA, USA). A 75 μm i.d. PicoFrit capillary column (New Objective, Woburn, MA, USA) was packed to 15 cm with the same packing material. Separation was carried out using a nanoLC-1D+ system from Eksigent (Dublin, CA, USA) with a continuous, vented column configuration as previously reported by our group [28]. A 2 μL (200 ng) sample was aspirated into a 10 μL loop and loaded onto the trap; only 200 ng were analyzed per injection so as not to overload the nanoLC column. The flow rate was set to 350 nL/min for separation on the analytical column. Mobile phase A was composed of 98% H2O (Burdick and Jackson), 2% ACN, and 0.2% formic acid (Sigma Aldrich), and mobile phase B was composed of 98% ACN, 2% H2O, and 0.2% formic acid. A 1 h linear gradient from 5% to 50% B was performed. All measurements were performed at room temperature, and three technical replicates of each sample were run to allow for the statistical comparisons between samples that are necessary for label-free quantification.

A hybrid LTQ-Orbitrap XL MS (Thermo Fisher Scientific, Bremen, Germany) was used to perform MS analysis. For data-dependent acquisition, the parameters recently published by our group as optimal for achieving maximum proteome coverage were used verbatim [29]. External calibration was performed following the manufacturer’s instructions using the manufacturer’s calibration mix, and lock-mass internal calibration using polydimethylcyclosiloxane (m/z 445.120025) was enabled [30].

2.3 Data Analysis

Data analysis was performed by searching each .RAW file, independently, against a concatenated target-reverse M. oryzae database (MG8_GeneCall10.fasta) from the Broad Institute using MASCOT Distiller version 2.3.01 (Matrix Science Inc., Boston, MA, USA). MASCOT parameters were ±5 ppm peptide ion tolerance, ±0.6 Da MS/MS fragment ion tolerance, and two allowed missed cleavages. Carbamidomethylation of cysteine was set as a fixed modification, and oxidation of methionine and deamidation of glutamine and asparagine were set as variable modifications. Peptide lists (.dat files) were created for each .RAW file by MASCOT. ProteoIQ version 2.1.01_SILAC_beta08 (BioInquire, Athens, GA, USA) was used to create five different label-free spectral counting projects: (a) Sample 1, (b) Sample 2, (c) Sample 2', (d) combination of samples 1 and 2, and (e) combination of samples 2 and 2'. A 1% protein FDR was applied to each project independently (i.e., the FDR was calculated based on the cumulative results of the sample files included in that particular project) [31]. Log2 SpC ratios were calculated, and a pairwise t-test was performed on the proteins identified in samples 1 and 2.
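A minimal sketch of this final quantification step is shown below for a single protein, assuming its normalized SpCs are available as triplicate arrays per sample. The numbers are placeholders, the unpaired two-sample form of the t-test and the pseudocount used to avoid a log of zero are assumptions on our part, and ProteoIQ performs the corresponding calculations internally.

```python
import numpy as np
from scipy.stats import ttest_ind

# Normalized SpCs for one protein across the three technical replicates of each sample
# (illustrative values, not data from this study)
sample1 = np.array([34., 31., 37.])
sample2 = np.array([45., 49., 42.])

# Log2 ratio of the mean normalized SpCs (small pseudocount guards against zeros)
pseudo = 0.5
log2_ratio = np.log2((sample2.mean() + pseudo) / (sample1.mean() + pseudo))

# Two-sample t-test between the replicate measurements
t_stat, p_value = ttest_ind(sample1, sample2)

print(f"log2(sample 2 / sample 1) = {log2_ratio:.2f}, p = {p_value:.3f}")
```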

3 Results and Discussion

The experimental workflow is shown in Figure 1. Sample 1 and sample 2 were derived from the same biological sample but were prepared and processed, after protein extraction, on different days. M. oryzae conidial protein spiked with equine myoglobin and chicken ovalbumin was loaded onto 1D-SDS-PAGE gels, and in-gel digestion was performed. The 10 fractions of each sample were analyzed by nanoLC-MS/MS using different traps and analytical columns for each sample. Additionally, one-third of the final volume of adjacent in-gel digested fractions of sample 2 was combined to produce sample 2' (five fractions), which was also analyzed by nanoLC-MS/MS.

Figure 1

Experimental workflow. The samples were prepared and processed on two different days. Myoglobin (25 ng) and ovalbumin (25 ng) were added to 50 μg of M. oryzae conidial protein. One-dimensional SDS-PAGE separation and in-gel digestion were performed. Ten fractions from d 1 and 10 fractions from d 2 were analyzed in triplicate by nanoLC-MS/MS using different traps and columns for each sample set. Additionally, adjacent fractions from the d 2 sample were pooled and also analyzed in triplicate by nanoLC-MS/MS

Sample 1 yielded 76,638 TSpC (the sum of three technical replicates) and 1185 identified proteins (see Figure 2). Sample 2 yielded 95,025 TSpC and 1477 identified proteins. The number of proteins shared between samples 1 and 2 was 1121; samples 1 and 2 contained 64 and 356 unique proteins, respectively. The differences in protein identifications (24%) and TSpC (19%) between samples 1 and 2 were attributed to sample processing on different days, reagent quality, gel-to-gel variance, and the use of a different trap and column. A slightly higher percentage difference in the number of proteins identified was observed in a study by Cooper et al. [22], who reported differences of up to 30% in the number of proteins identified between nine replicate soybean peptide samples spiked with different amounts (0.005 to 2.5 pmol) of a bovine apotransferrin tryptic digest, separated by MudPIT, and analyzed on an LTQ-Orbitrap XL mass spectrometer.

Figure 2

Venn diagrams showing the TSpCs and the total number of proteins identified at 1% protein FDR in each sample. Sample 2 yielded the highest number of identified proteins (1477). While a 19% difference in TSpCs between samples 1 and 2 was observed, a reduction in TSpCs of roughly 50% was observed when adjacent fractions were combined and only half the number of fractions was analyzed

In order to determine whether normalization methods can recover from variables such as large differences in sample complexity, we mimicked such a situation by doubling the sample complexity of sample 2 through pooling of adjacent gel fractions to create sample 2'. The TSpC for sample 2' was 49,067, about half the TSpC of sample 2, as expected from the decrease in the number of fractions (10 fractions for sample 2 and only five for sample 2'). Moreover, only 1087 proteins were identified from sample 2', and these were a subset of the proteins identified from sample 2.

SpC scatter plots for proteins from the combined analysis of samples 1 and 2 are shown in Figure 3. The first plot shows the unnormalized SpCs for each protein (sum of the three technical replicates) versus the average SpCs of the two samples. A regression line slope of 1 is anticipated in the absence of biological variation, as is the case here. The unnormalized scatter plots show slopes of 0.901 and 1.099 for samples 1 and 2, respectively, indicating that sample handling on different days, gel-to-gel variance, differences in reagent quality, and slightly different chromatography have some effect on SpC reproducibility. TSpC normalization corrected the slopes to 1.0093 for sample 1 and 0.9907 for sample 2, while NSAF normalization corrected the slopes to 1.001 and 0.9878, respectively. NSP yielded some improvement by normalization to the spike-in proteins myoglobin and/or ovalbumin. NSP to myoglobin and ovalbumin together yielded slopes of 1.0262 for sample 1 and 0.9738 for sample 2; normalizing to myoglobin alone gave slopes of 0.968 for sample 1 and 1.032 for sample 2, and normalizing to ovalbumin alone gave slopes of 1.0696 for sample 1 and 0.9304 for sample 2. Normalization factors can be found in Online Resource 1, Supplementary Tables S1 and S2.
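The slope check behind Figure 3 can be reproduced with a short script of the following form, given per-protein summed SpCs for the two samples. The arrays are placeholders, and the zero-intercept least-squares fit used here is our assumption about how such a regression would typically be performed, not a detail confirmed by the study.

```python
import numpy as np

def slopes_vs_average(spc_a, spc_b):
    """Fit y = m * x (through the origin) of each sample's per-protein SpCs
    against the two-sample average; slopes near 1 indicate no systematic bias."""
    avg = (spc_a + spc_b) / 2.0
    m_a = np.sum(avg * spc_a) / np.sum(avg * avg)
    m_b = np.sum(avg * spc_b) / np.sum(avg * avg)
    return m_a, m_b

# Illustrative arrays: summed SpCs (three replicates) per protein for each sample
spc_1 = np.array([110., 42., 9., 300., 18.])
spc_2 = np.array([130., 55., 12., 345., 25.])
print(slopes_vs_average(spc_1, spc_2))
```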

Figure 3

SpC scatter plots from the combined analysis of samples 1 and 2. Unnormalized SpCs and normalized SpC (NSpC) data for each protein are plotted versus the average SpCs for that protein derived from both samples. NSAF normalization, with slopes of 1.001 and 0.999 for samples 1 and 2, corrects best in comparison with NSP and TSpC normalization

The sample complexity was doubled in the case of sample 2' to simulate a drastic change in sample complexity and to evaluate the ability of the normalization methods to compensate for it. In the unnormalized scatter plots, shown in Figure 4, the slope for sample 2' (0.6968) was almost half the slope for sample 2 (1.3032). The normalized plots show that normalization can correct even for such drastic differences in sample complexity: the slope for sample 2' was corrected to 1.0188 with TSpC normalization and to 1.0122 with NSAF. Despite good sequence coverage (see Online Resource 1, Supplementary Figure S1) and the use of two spike-in proteins with different attributes, NSP did not perform as well as TSpC and NSAF normalization; however, better performance was observed for ovalbumin than for myoglobin. Slopes of 0.9791 for sample 2 and 1.0209 for sample 2' were observed for normalization to ovalbumin, whereas normalization to myoglobin yielded slopes of 1.2636 for sample 2 and 0.7364 for sample 2'. Myoglobin was identified with 21 SpCs (8.98 NSpCs) in sample 1, 21 SpCs (7.18 NSpCs) in sample 2, and 20 SpCs (13.24 NSpCs) in sample 2' across all three replicates. In comparison, ovalbumin was identified with 63 SpCs (26.66 NSpCs) in sample 1, 88 SpCs (30.1 NSpCs) in sample 2, and 45 SpCs (29.81 NSpCs) in sample 2', correlating with the pattern of TSpCs in each sample (76,638; 95,025; 49,067) (see also Online Resource 1, Supplementary Figure S2 and Supplementary Table S3). This observation suggests that, as a larger protein, ovalbumin yielded a greater number of SpCs and thus better reflected the variation between samples.

Figure 4

Unnormalized SpC data and NSpCs from the combined analysis of sample 2 and sample 2' are plotted versus the average SpCs for each protein. The gross error, mimicked by doubling the sample complexity, was corrected best by NSAF normalization, which adjusted the slope for sample 2' to 1.0122. Normalizing to myoglobin had almost no effect because smaller proteins experience large variation in SpCs and therefore lose their reliability as normalization standards

As an additional metric, we evaluated which normalization method gives rise to the lowest variance across the technical replicates. TSpC normalization and NSAF resulted in lower median coefficients of variation (CV) for the samples, while the median CVs using NSP were significantly higher when normalized to both spike-in proteins (see Online Resource 1, Supplementary Figure S3). These data indicate that TSpC and NSAF are superior normalization techniques compared with NSP, most likely because the former methods utilize the entire identified proteome for normalization, which allows better correction of variability within similar biological samples. NSP may be better suited to instances in which global protein expression differences exist between two biological samples [20].
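The replicate-variance metric can be computed as follows, given a proteins × replicates matrix of normalized SpCs for each normalization method; the matrix below is an invented placeholder, not data from this study.

```python
import numpy as np

def median_cv(nspc):
    """Median coefficient of variation across technical replicates.
    nspc: 2-D array, rows = proteins, columns = technical replicates."""
    means = nspc.mean(axis=1)
    stds = nspc.std(axis=1, ddof=1)
    cv = np.where(means > 0, stds / means, np.nan) * 100.0   # CV in percent
    return np.nanmedian(cv)

# Illustrative normalized SpCs for four proteins over three technical replicates
nspc = np.array([[30., 33., 28.],
                 [ 5.,  8.,  6.],
                 [90., 95., 88.],
                 [ 2.,  1.,  3.]])
print(f"median CV = {median_cv(nspc):.1f}%")
```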

After identifying the optimal normalization method(s), we sought to determine the best means of detecting true biological change between two samples. Previously, a single threshold has typically been applied to an entire data set to define which proteins are changing significantly. However, results from this study and others have shown that higher-SpC proteins may require separate criteria for detecting significant change because of their lower variance and, conversely, that the high variance of low-SpC proteins may preclude them from quantification. Moreover, methods such as significance testing and fold-change thresholds have been applied with criteria that are often selected arbitrarily and without regard to their true predictive value.

To demonstrate this, we produced the volcano plots shown in Figure 5 using the data obtained here for samples 1 and 2. Volcano plots have long been utilized in genomic microarray analysis to quickly identify species exhibiting both large and highly significant changes, and they were more recently applied to proteomic spectral counting data sets by Yates and co-workers [14]. In these plots, the expression change for a given protein is plotted on the x-axis while the corresponding statistical significance is plotted on the y-axis. Figure 5a shows that, when applying a standard P value cutoff of 0.05, 276 of the 1511 proteins identified (~18%) between samples 1 and 2 would be falsely discovered to have changed in abundance. Similarly, 290 such proteins (~19%) would be falsely discovered to have changed if the traditional 2-fold threshold for expression change were utilized. Interestingly, we also found that these two methods lead to false discovery of different proteins, in particular proteins at different abundance levels. As indicated by the Venn diagram in Figure 5a, only 100 of the same proteins were falsely discovered by both the 0.05 P value cutoff and the 2-fold expression change threshold. Upon closer inspection, nearly all of the proteins falsely discovered by the 2-fold cutoff were low abundance (i.e., low SpC) proteins, while those discovered by the 0.05 P value cutoff were slightly biased towards higher abundance proteins. This phenomenon can be seen in Figure 5b, which depicts volcano plots for different SpC levels: S ≤ 3, 3 < S ≤ 10, and S > 10, where S indicates the SpC per replicate (six replicates in this case, three per sample). These plots show that the fold-change distribution narrows as S increases, indicating that more abundant proteins yield lower-variance data than less abundant proteins.
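A volcano plot of this kind can be generated with a few lines of matplotlib, assuming per-protein log2 ratios and t-test p values have already been computed; the arrays below are randomly generated placeholders, and the 2-fold and P = 0.05 guide lines simply mirror the cutoffs discussed above.

```python
import numpy as np
import matplotlib.pyplot as plt

# Placeholder results: one log2 ratio and one t-test p value per protein
rng = np.random.default_rng(0)
log2_ratio = rng.normal(0.0, 0.8, 500)
p_values = rng.uniform(0.001, 1.0, 500)

plt.scatter(log2_ratio, -np.log10(p_values), s=8, alpha=0.5)
plt.axvline(+1.0, linestyle="--")             # 2-fold increase
plt.axvline(-1.0, linestyle="--")             # 2-fold decrease
plt.axhline(-np.log10(0.05), linestyle="--")  # P = 0.05 cutoff
plt.xlabel("log2(sample 2 / sample 1)")
plt.ylabel("-log10(P value)")
plt.title("Volcano plot of normalized SpC ratios")
plt.show()
```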

Figure 5

Volcano plots comparing the normalized SpCs between sample 1 and sample 2. The log2 expression ratio is plotted versus the –log10 of the P value obtained from significance testing (pairwise t-test). (a) Plot comparing all proteins identified between samples 1 and 2. Proteins outside the given fold-change limits or above the P value cutoffs are considered to have significantly changed. The absolute number of proteins meeting each criterion is given in parentheses. The Venn diagram shows the overlap in the number of proteins deemed to have changed when applying either a 2-fold change cutoff or a P value cutoff of 0.05. (b) Plots comparing proteins at different SpC levels. The proteins in each plot are defined by S, the SpC obtained per replicate injection

In Tables 1 and 2, narrower SpC-per-replicate (S) bins were utilized to calculate the false positive rates (FPRs) achieved at different SpC levels for different stringency criteria. Because we could be certain no biological change had occurred between samples 1 and 2, we were able to define the FPR for a given criterion as the number of proteins (falsely) discovered in an SpC bin divided by the total number of proteins (N) falling within that same bin. When comparing different fold-change cutoffs (Table 1), there again appears to be a strong propensity for low-SpC proteins to be falsely discovered: across all fold-change cutoffs, proteins with lower SpCs have higher FPRs. Although a higher fold-change cutoff could be applied to reduce the overall FPR for all proteins, these data suggest that little confidence could be placed in results for low abundance proteins even when higher stringency cutoffs are applied. For example, if a 2.5-fold cutoff were utilized, the FPR for higher abundance proteins (S > 5) would be acceptable, but the FPR for lower abundance proteins (S ≤ 5) would be no better than 10% and would be particularly poor, 39%, for very low abundance proteins (S < 1.67).

Table 1 False positive rates for SpC bin widths at different fold-change cutoffs
Table 2 False positive rates for SpC bin widths at different P value cutoffs

In Table 2, which shows the FPRs for P value cutoffs of different stringency, the opposite trend is observed. Excluding proteins with a low number of SpCs (S < 1.67), the FPR increases slightly as S proceeds from low to high values, indicating that more abundant proteins are more apt to yield a false positive at a given P value. If a single P value cutoff were selected, such as 0.01, these data show that both high abundance proteins (S > 20) and very low abundance proteins (S < 1.67) would yield the majority of false positives. The importance of these observations is two-fold: first, the large variance of low-SpC proteins increases their probability of exhibiting erroneously large fold-changes and, second, high-SpC proteins have a greater propensity to yield low P values simply as a result of their lower variance.
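Because no biological change exists between samples 1 and 2, every protein flagged by a cutoff in this comparison is a false positive, so the FPR per bin reduces to a counting exercise. The sketch below illustrates the calculation; the arrays standing in for the per-protein S values, log2 ratios, and p values are synthetic, and the bin edges merely echo the levels discussed in the text.

```python
import numpy as np

def fpr_by_spc_bin(s, log2_ratio, p_value, bin_edges,
                   fold_cutoff=None, p_cutoff=None):
    """FPR per SpC-per-replicate (S) bin under a null comparison:
    every protein passing the cutoff(s) counts as a false positive."""
    flagged = np.ones_like(s, dtype=bool)
    if fold_cutoff is not None:
        flagged &= np.abs(log2_ratio) >= np.log2(fold_cutoff)
    if p_cutoff is not None:
        flagged &= p_value <= p_cutoff
    rates = {}
    for lo, hi in zip(bin_edges[:-1], bin_edges[1:]):
        in_bin = (s >= lo) & (s < hi)
        n = in_bin.sum()
        rates[(lo, hi)] = flagged[in_bin].sum() / n if n else float("nan")
    return rates

# Synthetic null data only: low-S proteins are given larger ratio variance
rng = np.random.default_rng(1)
s = rng.gamma(2.0, 4.0, 1000)                   # SpC per replicate
log2_ratio = rng.normal(0.0, 1.0 / np.sqrt(s))  # variance shrinks as S grows
p_value = rng.uniform(0.0, 1.0, 1000)
print(fpr_by_spc_bin(s, log2_ratio, p_value,
                     bin_edges=[0, 1.67, 5, 10, 20, np.inf],
                     fold_cutoff=2.5, p_cutoff=0.1))
```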

Given that different protein levels are uniquely affected by the two testing methods, we sought to apply dual constraints in order to better control the FPR for all protein levels. Table 3 shows different combinations of P value and fold-change cutoffs applied to the entire data set as well as to different SpC levels. In general, researchers seek to maintain a low FPR; however, using constraints that yield too low an FPR will result in a high false negative rate (FNR). Thus, we consider it reasonable to target an FPR of 10%, so as not to exclude too many true positives that would occur in future experiments. The combination that best accomplishes this for the entire data set, regardless of SpC, is a P value cutoff of 0.1 and a fold-change cutoff of 2.5. Notice that this combination pairs a stringent fold-change cutoff with a lax P value cutoff, because the majority of proteins in the data set have a relatively low SpC (median S = 5.3) and are more strongly affected by fold-change cutoffs. Consequently, this combination results in undesirably low FPRs for higher-SpC proteins (S > 5) and undesirably high FPRs for lower-SpC proteins (S ≤ 2). Even when very low abundance proteins are excluded from the data set (i.e., when considering only proteins having S ≥ 1.67), similar outcomes are reached. It is apparent from the other combinations that no single combination will satisfy the 10% target FPR across all SpC levels. As a result, we suggest applying different constraints to different SpC levels.

Table 3 False positive rates for combined P value and fold-change cutoffs applied to different SpC bin widths

Since large fold-change cutoffs would likely result in a large FNR for higher abundance proteins, we suggest applying more stringent P value cutoffs and less stringent fold-change cutoffs for this set of proteins. For instance, we would not set the fold-change cutoff any higher than 1.5-fold for proteins having S > 10, as the chances of observing a larger fold-change, particularly a fold increase, for high abundance proteins are reduced by the low linear dynamic range of spectral counting. Conversely, we suggest utilizing more stringent fold-change cutoffs and less stringent P value cutoffs for low-SpC proteins (i.e., S ≤ 10). Very low SpC proteins (S < 1.67) should either be excluded from consideration or be subjected to both stringent P value and stringent fold-change cutoffs because of their disposition to yield false positives. Here we defined very low SpC proteins as those with fewer than 1.67 SpC per replicate injection, or an average of five total spectral counts between two samples. Earlier studies by Old et al. [11] and Collier et al. [3] also proposed a cutoff of five or more total spectral counts across two samples, each with triplicate injections, to ensure accurate quantification. Gammulla et al. [32] utilized even more stringent criteria, allowing quantification only of proteins having six or more spectral counts in each sample when triplicate injections were performed. We chose to define our constraints and bin widths using SpC per replicate (S) so that they would be independent of the number of injections and samples; consequently, comparisons can be drawn in future experiments regardless of the number of samples or replicates.
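The tiered acceptance rule suggested above could be encoded along the following lines; the specific cutoff pairs mirror the tiers discussed in the text but are illustrative choices, and, as cautioned below, the exact values a laboratory should use would come from its own control experiments.

```python
import math

def significant_change(s, log2_ratio, p_value):
    """Tiered dual-constraint test, with S = SpC per replicate injection.
    Cutoff pairs follow the tiers discussed in the text (illustrative values)."""
    if s < 1.67:                  # very low SpC: exclude (or demand both stringent cutoffs)
        return False
    if s <= 10:                   # low SpC: stringent fold-change, lax P value
        fold_cutoff, p_cutoff = 2.5, 0.10
    else:                         # high SpC: lax fold-change, stringent P value
        fold_cutoff, p_cutoff = 1.5, 0.01
    return abs(log2_ratio) >= math.log2(fold_cutoff) and p_value <= p_cutoff

# Example: a moderately abundant protein with a 1.8-fold change and P = 0.03
print(significant_change(s=12.0, log2_ratio=math.log2(1.8), p_value=0.03))
```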

It should be emphasized that these results were obtained in the context of a gel-based proteomics experiment, which may have led to higher false positive rates than other sample preparation techniques (e.g., MudPIT) because of inherent differences in their reproducibility. Additionally, the use of technical rather than biological replicates here may have contributed to the low variation observed for higher-SpC proteins. As such, we caution readers against applying these same criteria directly to their data; instead, we recommend performing similar control experiments in order to define the variability specific to their laboratory, protocol, and sample type.

4 Conclusions

TSpC normalization, NSAF, and NSP for label-free spectral counting data were investigated using in-gel tryptic digests of M. oryzae conidial protein. TSpC normalization and NSAF revealed very good correlations and low variance for all data sets. With normalization, it was possible to correct for variance caused by sample preparation, gel-to-gel variance, chromatographic performance, and even drastic changes in sample complexity. We further demonstrated that accurate quantification depends on the number of SpCs. When applying different constraints for significance tests and/or fold-change cutoffs, we observed biases in the FPR across different SpC levels. In particular, higher-SpC proteins yielded lower-variance data and, as a result, required less stringent fold-change cutoffs to achieve accurate quantification. Conversely, lower-SpC proteins showed less reproducibility and required higher fold-change cutoffs in combination with significance testing to ensure accurate quantification. Consequently, we suggest applying different constraints to different SpC levels in order to circumvent these biases and maintain a constant FPR for all proteins.