Background

In recent years cancer diagnostics has been taking enormous advantage of genomics and proteomics, novel fields of modern biology. Proteomics is the study of the proteome, the complete protein components of the cell, tissue or organism, which in contrast to the genome is dynamic and fluctuates depending on a combination of numerous internal and external factors (e.g., physiological status, dietary behavior, stress, disease and medical treatment). Identifying and understanding changes in the proteome related to disease development and therapy progression is the subject of clinical/disease proteomics [1, 2]. It is currently well appreciated that because of the complexity of molecular processes involved in cancer no particular molecular feature alone, neither gene nor protein, could be a reliable biomarker in cancer diagnosis. Instead, multi-component molecular classifiers, exemplified by multi-gene cancer signatures implemented in the functional genomics field, are built and successfully applied. Multi-gene signatures identified for breast cancer have proved their diagnostic power even though detailed knowledge about the function of particular genes that build such signatures may not be available at present [3, 4].

The low molecular weight (<10 kDa) component of the blood proteome is a promising source of previously undiscovered biomarkers. Since this protein fraction is below the limit of effective resolution of conventional gel electrophoresis, mass spectrometric analysis appears to be a method of choice [5], and consequently is an emerging method of clinical proteomics and cancer diagnostics [rev. in: [6–9]]. The milestone paper in this field was published in 2002 by the group of Petricoin and Liotta, who showed that components of the serum proteome identified by mass spectrometry differentiate patients with ovarian cancer from healthy individuals [10]. Since that time, in spite of a certain controversy regarding this pioneering work [11], numerous papers have been published that aimed to verify the applicability of mass spectrometric analyses of the serum (or plasma) proteome for cancer diagnostics. Although no single peptide could be expected to be a reliable biomarker in such analyses, multi-peptide sets of markers selected in numerical tests have already been shown in a few studies to have potential prognostic and predictive value for cancer diagnostics [rev. in: [12–16]]. The approach that takes into consideration features of the whole proteome (e.g. protein fingerprints given by mass spectra or 2D gel electrophoresis) but does not rely on particular identified protein(s) could be called proteome pattern analysis or proteome profiling. In this approach, whose strategy is similar to the search for multi-gene signatures in functional genomics, multi-component sets of peptides/proteins (which are exemplified by ions registered at defined m/z values in the mass spectrum) define specific proteomic patterns (or profiles), allowing one to classify samples even though their particular components lack differentiating power when analyzed separately.
Importantly, such a pattern/profile reflects features of the specimen's proteome and allows its classification even without detailed knowledge about particular elements [17–19]. Mass spectrometry methods particularly suitable for proteome pattern analysis are Matrix-Assisted Laser Desorption-Ionization spectrometry (MALDI) and its derivative Surface-Enhanced Laser Desorption/Ionization spectrometry (SELDI) coupled to a Time-of-Flight (ToF) analyzer, which combine high throughput, fair sensitivity and accuracy of annotation of m/z values of ions in recorded mass spectra of complex protein mixtures such as biological specimens [20, 21]. The relevance of mass spectrometry-based serum (or plasma) proteome pattern analysis has already been tested for several types of human malignancies, although none of the identified peptide signatures has yet been approved for diagnostics in clinical practice [15, 22–26].

Breast cancer is the most common malignancy in women, comprising about 18% of all female cancers, and 1 million new cases occur worldwide each year. In Western countries the disease is the single commonest cause of death among women aged 40–50, accounting for about a fifth of all deaths in this age group [27]. The most important tools in screening and early detection of breast cancer are imaging techniques: mammography, ultrasonography and magnetic resonance imaging. Unfortunately, up to 20% of new breast cancer cases cannot be detected by these methods [28], indicating a constant need for novel molecular markers suitable for screening and early detection of this cancer. Several studies have already addressed the possibility of applying SELDI or MALDI mass spectrometric analyses of the blood proteome in diagnostics of breast cancer, and elicited serum (or plasma) proteome patterns specific for patients with breast cancer at either early or late clinical stages [29–38]. Among the peptides identified in such differentiating patterns were fragments of C3a [33] and of FPA, fibrinogen, C3f, C4a, ITIH4, apoA-IV, bradykinin, factor XIIIa and transthyretin [35]. In addition, mass spectrometry analyses of the blood proteome allowed the identification of patterns specific for breast cancer patients with different outcome and response to therapy [39–43]. Different methodological approaches, both experimental and computational, have been implemented in such studies, and the proposed proteome patterns specific for breast cancer consisted of different peptide sets. However, several peptides that differentiated cancer and control samples appeared reproducibly when comparative analysis across different studies was performed [44], demonstrating the high potential of mass spectrometry-based analyses of the blood proteome pattern in diagnostics of breast cancer once problems with standardization of experimental and computational design are solved.

Here we examined the potential applicability of the serum proteome pattern identified by MALDI-ToF mass spectrometry, either alone or in combination with protein biomarkers analyzed by immunoassays, in early detection of breast cancer. The spectral components that were annotated on the basis of recorded mass spectra were successfully used to build classifiers that allowed reliable identification of early stage breast cancer patients. Importantly, the classifier based on serum proteome pattern outperformed available biomarkers analyzed in blood by immunoassays.

Methods

Characteristics of patient and control groups

The clinical part of the study was carried out at the Maria Sklodowska-Curie Memorial Cancer Center and Institute of Oncology, Gliwice Branch, between May 2006 and January 2008. Ninety-two patients diagnosed with clinical stage I or II breast cancer were included in the study; their average age was 58.5 years (range 31–74 years). Patients were classified according to the TNM scale; the majority were scored as T1 and T2 (47% and 45%, respectively) as well as N0 and N1 (75% and 24%, respectively), and none had diagnosed metastases (all M0). Biopsy material was used to assess histopathological tumor grade (27% G1, 45% G2, 28% G3), as well as expression of estrogen receptor (63% ER+) and progesterone receptor (60% PR+) by immunohistochemistry. Serum samples were collected before the start of therapy. One hundred and four female volunteers were included as a control group; they were required to be free of any known acute or chronic illness and not to have been treated with any anticancer therapy in the past. The average age in this group was 54 years (range 32–77 years). The study was approved by the appropriate Ethics Committee and all participants provided informed consent indicating their voluntary participation.

Preparation of serum samples

Samples were collected and processed following a standardized protocol. Blood was collected in a 5 ml Vacutainer Tube (Becton Dickinson), incubated for 30 min. at room temperature to allow clotting, and then centrifuged at 1000 g for 10 min. to remove the clot. The serum was aliquoted and stored at -70°C. Directly before analysis, samples were diluted 1:5 with 20% acetonitrile (ACN) in water, then applied onto an Amicon Ultra-4 membrane (50 kDa cut-off) in a spin column and centrifuged at 3000 g for 30 min. This removed the majority (up to 80%) of albumin and other abundant high-molecular weight proteins from the serum samples (not shown).

Mass spectrometry

Samples were analyzed using an Autoflex MALDI-ToF mass spectrometer (Bruker Daltonics, Bremen, Germany); the analyzer worked in the linear mode and positive ions were recorded in the mass range of 2,000–10,000 Da. Mass calibration was performed after every four samples using standards in the range of 5,000 to 17,500 Da (Protein Calibration Standard I, Bruker Daltonics). Prior to analysis each sample was loaded onto a ZipTip C18 tip-microcolumn by passing it through 10 times; the column was washed with water and the sample was then eluted with 1 μl of matrix solution (30 mg/ml sinapinic acid in 50% ACN/H2O and 0.1% TFA with addition of 1 mM n-octyl glucopyranoside) directly onto the 600 μm AnchorChip (Bruker Daltonics) plate. ZipTip extraction/loading was repeated twice for each sample, and for each spot on the plate two spectra were acquired after 120 laser shots (i.e. four spectra were recorded for each sample). Spectra were exported from the Bruker FlexAnalysis 2.2 software in standard 8-bit binary ASCII format; they consisted of approximately 45,400 measurement points describing mass to charge ratios (m/z) for consecutive [M+H]+ ions and the corresponding signal abundances, covering the range of analyzed m/z values.

Analysis of protein tumor markers in plasma

Plasma samples were obtained after centrifugation of blood on a Ficoll gradient (Lymphoprep™, ICN), and then levels of selected markers were quantified using standard methods of immuno-diagnostics. Enzyme-Linked Immunosorbent Assay (ELISA) was used for assessment of leptin (DRG Diagnostics) and osteopontin (R&D Systems), Chemiluminescent Microparticle Immunoassay (CMIA) for assessment of CEA (Abbott), Time Resolved Amplified Cryptate Emission (TRACE) for assessment of CYFRA 21.1 (Brahms), and Microparticle Enzyme Immunoassay (MEIA) for assessment of CA15.3 (Abbott). In addition, the level of osteopontin was analyzed in serum samples as described above.

Data Processing and Statistical Analysis

The preprocessing of data, which included averaging of technical repeats, interpolation of missing or non-aligned points, binning of neighboring points to reduce data complexity, removal of the spectral area below baseline and total ion current (TIC) normalization, was performed according to procedures considered to be standard in the field [45, 46]. In the second step the spectral components, which reflected [M+H]+ ions recorded at defined m/z values, were identified using decomposition of mass spectra into their Gaussian components. The spectra were modeled as a sum of Gaussian bell-shaped curves, then models were fitted to the experimental data by a variant of the expectation maximization (EM) algorithm [47]. In a few cases when the standard deviation of a Gaussian exceeded a value of 50, the corresponding spectral component was excluded from further, more detailed analyses. Based on the decomposition of the average mass spectrum into the Gaussian components, the classifier features were computed by the scalar product with the Gaussian curves treated as kernel functions. The classification used a version of the Support Vector Machine (SVM) algorithm described by Schölkopf and coworkers [48]. The size of the training sample was changed from 20% to 90% of the whole dataset, and for each size the two-step training/validation procedure was repeated 1000 times to estimate the average error rate and its 95% confidence interval, which characterized the accuracy of classification. In order to further characterize the quality of classification, receiver operating characteristic (ROC) curves were computed by changing the value of the classification threshold in the SVM classifiers, and averaging the obtained specificity/sensitivity proportions over 1000 random validation experiments.
We tested the performance of classification with classifiers built of different numbers of spectral components by estimating the level of total errors, as well as the numbers of false positive and false negative classifications. Construction and validation of a classifier is a statistical process, i.e. many different classifiers built of a given number of spectral components were tested (1000 random splits of the dataset), and those which passed the quality threshold could be built of different spectral components. Thus, to identify the components that are the best determinants of a specific proteome pattern we looked for the most frequent components in classifiers that correctly classified samples. The performance of classifiers built of optimized components was assessed by standard logistic regression (1000 iterations with a 50/50 split of the training/validation set).
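For illustration only, the TIC normalization and the kernel-based feature computation described above can be sketched in Python. This is a minimal sketch under stated assumptions, not the code used in the study: the function names, the use of an unnormalized Gaussian kernel, and the toy spectrum are all hypothetical.

```python
import math


def tic_normalize(intensities):
    """Scale a spectrum so that its total ion current (sum of intensities) is 1."""
    total = sum(intensities)
    if total <= 0:
        raise ValueError("spectrum has no positive signal")
    return [y / total for y in intensities]


def gaussian_kernel(x, mu, sigma):
    """Unnormalized Gaussian bell curve centered at mu (assumed kernel form)."""
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2)


def component_features(mz, intensities, components):
    """One feature per component: scalar product of the spectrum with its kernel.

    `components` is a list of (mu, sigma) pairs taken from the decomposition.
    """
    return [
        sum(y * gaussian_kernel(x, mu, sigma) for x, y in zip(mz, intensities))
        for mu, sigma in components
    ]


# Toy spectrum (hypothetical values): flat background with one peak at m/z 2303.
mz = [2300.0 + 0.5 * i for i in range(21)]  # 2300.0 .. 2310.0
raw = [1.0 + 10.0 * gaussian_kernel(x, 2303.0, 1.0) for x in mz]
spectrum = tic_normalize(raw)

# Two hypothetical components: one on the peak, one on the background.
features = component_features(mz, spectrum, [(2303.0, 1.0), (2308.0, 1.0)])
```

In this toy case the feature for the component centered on the peak is larger than the feature for the component placed on the background, which is the property a classifier would exploit.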

Results and discussion

Classifiers built on spectral components that determine proteome patterns

The low-molecular-weight fraction of the blood serum proteome consists of numerous peptides, proteins and their fragments. Some of these interact with each other, and a substantial fraction of this blood proteome compartment is carried by albumin as cargo peptides [49, 50]. For this reason we implemented dilution of serum samples with a denaturing organic solvent (acetonitrile) that destroyed the majority of protein interactions and allowed analysis of individual peptides dissociated from (not interacting with) other proteins (e.g., albumin). Characteristic features of MALDI ionization are that most ions created during laser irradiation are singly charged (multiply charged ions, especially those with low m/z values, have very low abundances and can be neglected), and that these ions are not fragmented under the ionization conditions applied. In other words, peaks registered in a MALDI mass spectrum correspond to mono-protonated peptide/protein molecular ions [M+H]+ described by m/z values that reflect actual molecular weights increased by the mass of the proton. However, when MALDI mass spectra are recorded over a wide range of m/z values (like the 2–10 kDa range in this study) the expected mass accuracy is relatively low and reaches 0.01–0.1% of the analyte's molecular mass, which corresponds to a few Daltons in the range of m/z values analyzed. In consequence, the relative broadening of spectral peaks recorded for the [M+H]+ ions could reflect the low resolution of the analyzer operating in the linear mode or might result from overlapping of ions originating from proteins/peptides of very similar molecular masses. In addition, because of technological imperfections there might be some shift in the positions of peptide ions between measurements, which adds more complexity to analyses of large datasets.
For this reason, some approaches used for analysis of large datasets rely on alignment of identified spectral peaks [45], which requires numerical "stretching" of spectra before further analyses.
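The statement that a mass accuracy of 0.01–0.1% corresponds to a few Daltons can be checked with simple arithmetic. The short Python sketch below is illustrative only; the helper name is hypothetical.

```python
def mass_error_da(mz, accuracy_percent):
    """Absolute mass uncertainty (in Da) for a relative accuracy given in percent."""
    return mz * accuracy_percent / 100.0


# Limits of the m/z range recorded in this study (2-10 kDa).
for mz in (2000.0, 10000.0):
    low = mass_error_da(mz, 0.01)
    high = mass_error_da(mz, 0.1)
    print(f"m/z {mz:.0f}: +/- {low:.1f} to {high:.1f} Da")
```

At the lower end of the range the uncertainty is between 0.2 and 2 Da, and at the upper end between 1 and 10 Da.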

Here we decided to implement an original mathematical procedure based on modeling average spectra and then fitting actual experimental spectra into such a model. Averaging was performed over either the whole dataset or data for cancer patients only, depending on whether the model was used to discriminate cancer and normal samples or different clinical outcomes of patients. We tested models with different numbers of components, and found that for the mass spectra analyzed in the present work 300 components ensured both sufficient fidelity of the model and its efficient computation (not shown). As a result of computation an "average" spectrum was decomposed into spectral components, each characterized by the exact molecular weight (m/z value of the recorded [M+H]+ ion) and by the interval into which the corresponding peaks were expected to fit in at least 95% of actual spectra in the dataset (+/-95% CI). The resulting spectral components reflect peaks recorded in multiple samples during mass spectrometric analysis, which contained either single peptide/protein ions or a combination of a few ions of very similar m/z values. This approach allowed us to avoid artifacts resulting from peak alignment and facilitated quantitative analysis of data by simple assessment of signal volumes that fitted to a given component within its 95% CI. Having identified and quantified spectral components, one could find certain components whose abundances were significantly different between groups of samples (e.g. between cancer patient and healthy samples), which could be defined as "differentiating". However, to obtain more reliable classification of samples we used spectral components to build multi-component classifiers that determined proteome patterns characteristic for groups, and looked for the most frequent components in classifiers that classified samples correctly.
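The idea of decomposing an average spectrum into Gaussian components can be illustrated with a textbook EM iteration for a one-dimensional Gaussian mixture in which each m/z point is weighted by its intensity. This is a minimal sketch and not the specific EM variant used in the study [47]: the function names, the initialization, and the synthetic two-peak spectrum are assumptions made for illustration.

```python
import math


def gaussian_pdf(x, mu, sigma):
    """Normal density with mean mu and standard deviation sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def em_decompose(mz, intensity, mus, sigmas, weights, n_iter=100):
    """Fit a 1-D Gaussian mixture to a spectrum treated as a weighted histogram.

    Returns the updated (mus, sigmas, weights) after n_iter EM iterations.
    """
    mus, sigmas, weights = list(mus), list(sigmas), list(weights)
    k = len(mus)
    total = sum(intensity)
    for _ in range(n_iter):
        # E-step: responsibility of each component for each m/z point.
        resp = []
        for x in mz:
            p = [weights[j] * gaussian_pdf(x, mus[j], sigmas[j]) for j in range(k)]
            s = sum(p)
            resp.append([pj / s if s > 0 else 0.0 for pj in p])
        # M-step: intensity-weighted updates of position, width and share.
        for j in range(k):
            mass = sum(y * r[j] for y, r in zip(intensity, resp))
            if mass <= 0:
                continue  # component explains no signal; leave it unchanged
            mu = sum(y * r[j] * x for x, y, r in zip(mz, intensity, resp)) / mass
            var = sum(y * r[j] * (x - mu) ** 2
                      for x, y, r in zip(mz, intensity, resp)) / mass
            mus[j] = mu
            sigmas[j] = max(math.sqrt(var), 1e-3)  # guard against collapse
            weights[j] = mass / total
    return mus, sigmas, weights


# Synthetic "average spectrum" with two peaks (hypothetical values).
mz = [2290.0 + 0.5 * i for i in range(81)]  # 2290.0 .. 2330.0
intensity = [3.0 * gaussian_pdf(x, 2300.0, 2.0) + 1.0 * gaussian_pdf(x, 2320.0, 3.0)
             for x in mz]
mus, sigmas, weights = em_decompose(mz, intensity, [2295.0, 2325.0], [5.0, 5.0], [0.5, 0.5])

# Components with a standard deviation above 50 would be excluded, as in the text.
kept = [(m, s) for m, s in zip(mus, sigmas) if s <= 50.0]
```

On this synthetic input the fitted positions settle close to the two simulated peaks. A real spectrum with 300 components would require a careful initialization, which is not addressed here.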

Identification of components that determine proteome patterns specific for healthy persons and breast cancer patients

First we compared the serum proteome patterns of 104 healthy women and 92 early stage breast cancer patients. Spectral components corresponding to protein/peptide [M+H]+ ions recorded in MALDI mass spectra were used to build classifiers to perform cancer/healthy control classifications as described above. The best classification performance was obtained with classifiers built of 2–5 features, i.e. spectral components (Fig. 1A). To estimate the sensitivity and specificity of classification, ROC curves [51] were computed for classifiers built of 3 or 4 spectral components. According to our estimations these classifiers allowed classification of cancer patients with 85% specificity and 82–83% sensitivity (Fig. 1B).
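The repeated random-split estimate of the error rate can be sketched as follows. This is a hedged illustration rather than the pipeline used in the study: a simple midpoint-threshold rule on a single feature stands in for the SVM classifier, the interval is taken as the 2.5th to 97.5th percentile of the observed error rates (an assumption about how the 95% interval was obtained), and all names and data are hypothetical.

```python
import random
import statistics


def fit_midpoint(train):
    """Stand-in classifier (not an SVM): threshold halfway between class means."""
    pos = [x for x, label in train if label == 1]
    neg = [x for x, label in train if label == 0]
    return (statistics.mean(pos) + statistics.mean(neg)) / 2.0


def error_rate(threshold, test):
    """Fraction of test samples misclassified when x > threshold means 'cancer'."""
    wrong = sum((x > threshold) != (label == 1) for x, label in test)
    return wrong / len(test)


def repeated_split_error(data, train_fraction=0.5, repeats=1000, seed=0):
    """Mean error and percentile interval over repeated random train/test splits."""
    rng = random.Random(seed)
    n_train = int(len(data) * train_fraction)
    errors = []
    for _ in range(repeats):
        while True:
            shuffled = data[:]
            rng.shuffle(shuffled)
            train, test = shuffled[:n_train], shuffled[n_train:]
            # Redraw the split if one class is missing from the training part.
            if {label for _, label in train} == {0, 1}:
                break
        errors.append(error_rate(fit_midpoint(train), test))
    errors.sort()
    lo = errors[int(0.025 * (repeats - 1))]
    hi = errors[int(0.975 * (repeats - 1))]
    return statistics.mean(errors), (lo, hi)


# Toy, perfectly separable data: (feature value, label), label 1 = cancer.
data = [(10.0 + 0.1 * i, 1) for i in range(20)] + [(0.1 * i, 0) for i in range(20)]
mean_error, interval = repeated_split_error(data, train_fraction=0.5, repeats=200)
```

Changing `train_fraction` from 0.2 to 0.9 would mimic the range of training-set sizes described in the Methods.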

Figure 1

Estimation of the performance of classification of breast cancer samples. A – The total error rate was plotted against the number of features (i.e. spectral components) in the classifier. Shown are average error rates and 95% confidence intervals calculated based on 1000 random validation experiments with 50:50 training/validation split of data. B – Estimation of the sensitivity and specificity of the classification for classifiers built of three or four spectral components. The ROC curve was computed by changing the value of the probability threshold in the SVM classifier from 0.0 to 1.0, and averaging the specificity obtained versus sensitivity rate over 1000 random repeats of training and validation.
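The threshold sweep behind such a ROC curve can be sketched in Python as follows. The sketch assumes that the classifier returns a probability-like score per sample; the function name and the toy scores are hypothetical, and the averaging over repeated validation runs is omitted.

```python
def roc_points(scores, labels, thresholds):
    """Return (threshold, sensitivity, specificity) for each threshold.

    A sample is called positive (cancer) when its score is >= the threshold.
    """
    n_pos = sum(1 for y in labels if y == 1)
    n_neg = len(labels) - n_pos
    points = []
    for t in thresholds:
        true_pos = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= t)
        true_neg = sum(1 for s, y in zip(scores, labels) if y == 0 and s < t)
        points.append((t, true_pos / n_pos, true_neg / n_neg))
    return points


# Toy scores for four cancer samples (label 1) and four controls (label 0).
scores = [0.9, 0.8, 0.6, 0.4, 0.7, 0.3, 0.2, 0.1]
labels = [1, 1, 1, 1, 0, 0, 0, 0]
thresholds = [i / 10 for i in range(11)]  # 0.0, 0.1, ..., 1.0
curve = roc_points(scores, labels, thresholds)
```

With a threshold of 0.0 every sample is called positive (sensitivity 1, specificity 0), and with a threshold of 1.0 none is (sensitivity 0, specificity 1); intermediate thresholds trace the curve between these extremes.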

In further analyses we looked for the most frequent spectral components in classifiers that correctly classified breast cancer samples. The three most important components corresponded to the following [M+H]+ peptide ions: m/z = 2865.54, m/z = 3578.73, and m/z = 2303.48 (Fig. 2A). Most interestingly, two of these (m/z = 2865.54 and m/z = 3578.73) were present in nearly all well-performing classifiers, while the third (m/z = 2303.48) was present in 78% of classifiers; it was noteworthy that all other spectral components appeared in classifiers less frequently (<50%; Table 1). Importantly, these most frequent components of cancer classifiers had very high potency to differentiate control and cancer samples by themselves; the statistical significance of differences obtained in univariate analyses for these three peaks was at the level of p-values from 10⁻²⁰ to 10⁻¹⁴ (they remained highly significant after application of the Bonferroni correction for multiple testing; Table 1). Fig. 2B shows fragments of mass spectra in the near vicinity of the components that were the most frequent features of these breast cancer classifiers; the actual spectral lines for samples from all 196 individuals are shown together with the model component. The levels of such components in samples from individual breast cancer patients and healthy controls were quantified and are shown as box-plots (Fig. 2B).
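Counting how often each component appears in well-performing classifiers is a simple tally, sketched below in Python. The function name is hypothetical and the four feature sets are toy input constructed for illustration (the value 4000.00 is a placeholder); they are not data from this study.

```python
from collections import Counter


def component_frequencies(classifiers):
    """Fraction of classifiers in which each component (by m/z label) appears."""
    counts = Counter(mz for components in classifiers for mz in set(components))
    n = len(classifiers)
    return {mz: count / n for mz, count in counts.most_common()}


# Hypothetical feature sets of four well-performing three-component classifiers.
well_performing = [
    (2865.54, 3578.73, 2303.48),
    (2865.54, 3578.73, 2303.48),
    (2865.54, 3578.73, 2303.48),
    (2865.54, 3578.73, 4000.00),
]
frequencies = component_frequencies(well_performing)
```

In this toy input the first two components occur in every classifier and the third in three of four, so the tally ranks them in the same order as a frequency table would.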

Figure 2

Characterization of spectral components essential for cancer classification. A – The three most frequent differentiating components are marked with arrows along the mass spectra of serum samples of cancer patients (red lines) and healthy controls (green lines). B – Actual spectral plots of three selected components for cancer patients (red lines) and healthy controls (green lines), as well as modeled Gaussian kernels (blue curves); X-axes represent the m/z values, Y-axes represent intensities. Box-plots on the right represent quantification of the abundance of spectral components in samples from cancer patients (red) and healthy controls (green) (shown are minimum, lower quartile, median, upper quartile and maximum values; outliers are marked by asterisks).

Table 1 Characteristics of spectral components that differentiated samples from breast cancer patients and healthy controls.

We also found that 49 out of 300 modeled spectral components (i.e., 16%) had themselves a high potential to differentiate control and cancer samples in univariate analyses (p-value < 0.05 after the Bonferroni correction). Furthermore, all 14 spectral components that appeared in at least 1% of classifiers built of 4 features retained a very high differentiation potential in univariate analyses (p-value < 0.0002 after the Bonferroni correction; Table 1). In addition, we cross-compared spectral components that showed some differentiating power in our study (90 spectral components with uncorrected p-value < 0.005) with spectral peaks that were reported in some other published studies to differentiate breast cancer from healthy control samples (uncorrected p-value < 0.005). The correspondence of [M+H]+ ions was based on ± 0.2% of the m/z values. We found that at least 15 of these spectral components had a corresponding differentiating peak in comparable studies (although not always showing the same tendency; Table 2). This reproducibility, observed in spite of large differences in experimental and computational design, indicates a potential for convergence toward a common proteome pattern specific for breast cancer samples. Interestingly, two spectral components that appeared the most important for cancer classification in our study (i.e., m/z = 2865.54 and m/z = 3578.73) were not reported as differentiating peaks in other studies. We note, however, that in our study serum was analyzed after removal of albumin and components bound to it, which apparently influenced the pattern of mass spectra of the low-molecular-weight fraction of the blood proteome. We observed markedly increased levels of some spectral components in albumin-depleted samples as compared to those analyzed directly (not shown), which could possibly be explained by a reduced efficiency of ionization and detection of certain less abundant peptides in the presence of albumin [49].
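Two computations in this paragraph are easy to make explicit: the Bonferroni correction over the 300 modeled components and the ± 0.2% rule used to match m/z values across studies. The Python sketch below is illustrative; the function names are hypothetical, the tolerance is taken relative to the first m/z value (an assumption), and the comparison peaks are made-up numbers.

```python
def bonferroni(p_values):
    """Bonferroni-adjusted p-values: each p multiplied by the number of tests, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


def mz_match(mz_a, mz_b, tolerance=0.002):
    """True when two m/z values agree within +/- tolerance (0.002 = 0.2%) of mz_a."""
    return abs(mz_a - mz_b) <= tolerance * mz_a


# With 300 components, an adjusted p-value < 0.05 corresponds to this raw threshold.
n_components = 300
per_test_threshold = 0.05 / n_components

adjusted = bonferroni([0.01, 0.5, 1e-6])

# Hypothetical peaks from another study compared with the component at m/z 2865.54.
close_peak = mz_match(2865.54, 2870.0)   # difference 4.46 Da, tolerance about 5.73 Da
far_peak = mz_match(2865.54, 2880.0)     # difference 14.46 Da
```

For a component near m/z 2865 the 0.2% window is about ± 5.7 Da, which is of the same order as the mass accuracy expected in linear-mode spectra.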

Table 2 Comparison of discriminating spectral components/peptide peaks found in this study and in other published work.

Serum proteome patterns identified by MALDI-ToF analyses are similar for different sub-groups of early stage breast cancer patients

Having established that MALDI-ToF analysis of serum peptides identified proteome patterns characteristic for cancer patients, we next examined whether features of peptide profiles would differentiate specific subgroups of patients. First, the group of patients was divided into two equal subgroups according to their age (younger or older than 56.5 years, which was the median), and then spectral classifiers were built according to the methodology described above. In this particular case the performance of classification was about 50% independently of the number of spectral components (features) in classifiers (Fig. 3A), and consequently the classifier had about 50% specificity and 50% sensitivity as shown by the corresponding ROC curve (Fig. 3B). This indicated that there was no real difference in serum proteome patterns between subgroups of patients divided according to their age. This result could be expected because in the whole group there was only 1 patient younger than 35 years, which is normally considered an early appearance of cancer, and thus our two age-related subgroups most probably reflect a random division of the group. Having this "negative control" classification, we next aimed to identify serum proteome patterns specific for subgroups of patients with different clinical and molecular outcomes. We compared patients with different primary tumor size (T1 vs. T2), lymph node status (N0 vs. N1), histopathological grade (G1 and G2 vs. poorly differentiated G3), and also two well-established breast cancer prognostic and predictive molecular markers, expression of estrogen receptor or progesterone receptor [rev. in: [52–54]]. For each comparison the performance of classification (total error of classifiers built of 1 to 20 features) and the corresponding ROC curves for classifiers built of 15 spectral components (these were representative of ROC curves computed for classifiers built of 1 to 20 features) are shown in Fig. 3.
Most importantly, we observed a low performance of putative classification with a high level of errors for all analyses carried out. Although analyses based on the nodal status and the histopathological grade showed relatively moderate levels of total error (Fig. 3A), they had a very high level of false negative classifications (not shown), which was related to the unbalanced sizes of the subgroups compared (see Table 3); the shape of the corresponding ROC curves also reflects this imbalance (Fig. 3B).

Figure 3

Estimation of differences of serum proteome patterns between sub-groups of breast cancer patients. Patients were differentiated by age, primary tumor size (T), lymph nodal status (N), histopathological grade (G), and estrogen (ER) and progesterone (PR) receptor expression. A – The total error rates of classification plotted against the number of features in the classifiers as in Fig. 1A; the actual line width corresponded to 95% confidence intervals. B – ROC curves computed for classifiers built of 15 spectral components for each comparison (computation was done as described in Fig. 1B).

Table 3 Comparison of serum proteome patterns among different sub-groups of breast cancer patients.

The spectral components identified by Gaussian model decomposition were also used for univariate analyses of differences between the subgroups described above. Table 3 presents examples of the top five spectral components with the lowest p-values identified for each of such comparisons. Most importantly, although in standard analyses the levels of some components were different between the subgroups compared, none of these differences appeared significant after application of the Bonferroni correction for multiple testing (not shown). This result was in complete agreement with results of classification by multi-component classifiers (Fig. 3), which clearly showed similar serum proteome patterns identified by MALDI-ToF analyses in different sub-groups of the early stage breast cancer group. This finding suggested that the multi-component cancer classifier described above might be potentially applicable for early detection of breast cancer, independent of further more detailed clinical and pathological features.

A classifier built on MS-based serum proteome pattern outperforms available protein biomarkers analyzed in blood by immunoassays

To further assess the potential diagnostic power of the multi-component classifier described above, we compared the reliability of classification based on the biomarker identified by mass spectrometry with that based on available protein biomarkers analyzed in blood by immunoassays. Five markers were selected: carcinoembryonic antigen (CEA), carbohydrate antigen CA15.3, cytokeratin fragment CYFRA-21.1, leptin and osteopontin, which had putative diagnostic value for breast cancer, especially at advanced clinical stages, yet none of them was routinely used for early diagnostics of breast cancer [55–60]. The plasma levels of these biomarkers were quantified in a group of early stage breast cancer patients (which largely overlapped with the group examined using MALDI-ToF mass spectrometry) and compared with corresponding levels in a group of healthy donors (Table 4). We observed that the level of osteopontin was markedly increased in plasma of cancer patients, and the difference had a high level of statistical significance (p = 0.0003). The differences were much less significant for the four other markers, and therefore osteopontin alone was used in further analyses. The anti-osteopontin antibody used for ELISA recognized all four isoforms (OPN-a, OPN-b, OPN-c, OPN-d) and their different proteolytic fragments present in blood, and thus direct correlation of the ELISA results with MALDI-ToF analyses was not possible. When the plasma level of osteopontin was used for cancer classification it showed 88% sensitivity but only 28% specificity (as tested by the standard logistic regression method).

Table 4 Levels of tumor markers in plasma of breast cancer patients and healthy controls.

With the aim of constructing a putative marker useful in early diagnosis of breast cancer, we decided to combine features of the serum proteome pattern identified by MALDI-ToF MS analysis and the level of osteopontin measured by ELISA. Three spectral components, m/z = 2865.54, m/z = 3578.73, and m/z = 2303.48, which were the most frequent components of the cancer classifier described above, were selected for these analyses. The marker built of these three spectral components showed 78% specificity and 88% sensitivity when tested by the standard logistic regression method. Then, the level of osteopontin was re-tested in serum samples from the cancer patients and healthy individuals subjected to the MS-based study. In this case, however, the average concentration of osteopontin in serum was about two-fold lower than that in plasma, and the difference between cancer patients and healthy persons was much less pronounced. The biomarker built of the serum level of osteopontin alone showed 84% specificity but only 12% sensitivity when tested by the standard logistic regression method. Finally we tested the performance of a marker built of four features, the three most frequent spectral components (m/z = 2303.48, 2865.54, and 3578.73) and osteopontin. This combined marker showed 78% specificity and 88% sensitivity, the same as the marker built of three spectral peaks alone.

Conclusion

Here we confirmed the high potential of serum proteome pattern analysis by MALDI-ToF spectrometry for the differentiation between early breast cancer patients and healthy controls. Most importantly, a classifier built on this analysis outperforms those based on available protein biomarkers analyzed by immunoassays in blood. However, further combination of MS-based serum proteome pattern analysis with traditional cancer markers might possibly result in a biomarker with a reliability high enough for practical implementation in the early detection and diagnostics of breast cancer.