Introduction

Laryngeal carcinoma (LC) is one of the common forms of head and neck cancer [1], accounting for approximately 5% of systemic malignancies [2]. LC is classified as glottic, supraglottic or subglottic carcinoma; the glottic type is the most frequent, accounting for more than half of cases. Glottic carcinoma can usually be diagnosed at a relatively early stage because hoarseness appears early. Supraglottic and subglottic carcinomas are more difficult to identify and treat because hoarseness appears late and metastasis to the cervical lymph nodes occurs early, and they often lead to the loss of laryngeal function. Currently, the major diagnostic methods for LC in clinical practice are the laryngeal mirror and the fiberoptic laryngoscope, but the final diagnosis still relies on biopsy and pathology. For early LC, an integrated therapy including surgical resection and radiotherapy may be effective, especially for glottic carcinoma, in which radiotherapy alone may preserve vocal function. However, for advanced LC, a total laryngeal resection or even a combined resection is recommended, which may lead to the postoperative loss of vocal function [3]. Therefore, early detection, diagnosis and treatment can substantially improve the survival and postoperative quality of life of patients with LC.

Surface-enhanced laser desorption/ionization time-of-flight mass spectrometry (SELDI-TOF-MS) is a novel proteomic technique [4–6, 23–26] that screens large numbers of samples by combining mass spectrometry with chromatography. It is simple to operate, requires only a small sample volume and supports high-throughput analysis. It can be applied directly to crude samples such as serum, tissue, cell lysates and urine. In tumor biomarker screening in particular, spectra obtained from patients with malignancies can be compared with those from patients with other diseases and from healthy subjects to discover novel and specific biomarkers of the tumor or other diseases, and to obtain proteomic information on these proteins immediately, which represents a major advance in the detection of early tumor biomarkers. The technique can examine dozens or even hundreds of proteins simultaneously, whereas conventional methods focus on a single protein at a time, and it improves the sensitivity and specificity of diagnosis [7–10, 25, 26].

In this study, we employed an advanced proteomic approach, SELDI-TOF-MS, to identify relevant biomarkers that could replace invasive and nonspecific tests for the early diagnosis of LC. A total of 185 serum samples from LC patients and non-cancer controls were collected and analyzed. A panel of differentially expressed protein peaks was selected as biomarkers for the diagnosis of LC.

Materials and methods

Clinical data

The experiment was performed at Taizhou Municipal Hospital, Zhejiang, China, in March 2011. Sixty-eight patients with laryngeal squamous cell carcinoma (26 glottic, 38 supraglottic and four subglottic carcinomas) were recruited at Taizhou Municipal Hospital and The First Affiliated Hospital of Medical College, Zhejiang University. All LC patients were diagnosed according to combined clinical criteria, including indirect laryngoscopy and fiberoptic laryngoscopy, and the diagnosis was confirmed by histopathological analysis. Comparative studies were also performed using non-cancer control samples (75 healthy volunteers and 42 patients with vocal fold polyps). The studies were approved by the local Ethics Committee of Taizhou Municipal Hospital; all patients and volunteers gave written informed consent for their participation. The patients and serum samples were then divided into two groups: the “training” set and the blinded “test” set (Table 1). The blood samples were collected in 5 ml BD Vacutainers without anticoagulant and allowed to clot at room temperature for up to 1 h; the samples were then centrifuged at 4°C for 5 min at 10,000 rev/min. The sera were frozen and stored at −80°C for future analysis.

Table 1 Characteristics of LC patients and non-LC controls

Protein chip analysis

For sample pretreatment and proteomic profiling analysis, the serum samples from the diseased and control groups were randomized, and the investigator was blinded to their identity. Serum samples were processed on Q10 chips (Bio-Rad, USA) according to the manufacturer’s protocols. Briefly, the array spots were pre-activated with binding buffer (50 mM Tris–HCl, pH 8, containing 0.05% Triton X-100) at room temperature for 15 min in a humidity chamber. Each serum sample was first diluted 1:2 with U9 solution (9 mol/l urea, 2% CHAPS [3-([3-cholamidopropyl]dimethylammonio)-1-propanesulfonate]) and incubated for 30 min. Denatured serum samples were further diluted 1:5 in binding buffer. A portion (20 μl) of each diluted sample was spotted onto the pre-activated protein array chips and incubated in a humidity chamber for 60 min at room temperature. The chips were then washed twice with fresh binding buffer to remove non-selectively bound proteins and allowed to air-dry for 15 min. Then, 0.5 μl of a saturated solution of sinapinic acid (3,5-dimethoxy-4-hydroxy-cinnamic acid) in 50% acetonitrile and 0.5% trifluoroacetic acid was applied twice to each spot. After air-drying for approximately 5 min at room temperature, the chips were scanned with the ProteinChip reader (model PBS IIc, Ciphergen Biosystems Inc., Fremont, CA, USA) to determine the masses and intensities of all peaks in the m/z range of 1,000–50,000. The reader was set up as follows: mass range, 1,000–50,000 Da; optimized mass range, 2,000–20,000 Da; laser intensity, 190; and laser sensitivity, 9. Mass calibration was performed using an all-in-one peptide reference standard (Ciphergen Biosystems) containing vasopressin (1,084.2 Da), somatostatin (1,637.9 Da), bovine insulin β chain (3,495.9 Da), recombinant human insulin (5,807.6 Da) and hirudin (7,033.6 Da).
The default background subtraction was applied, and the peak intensities were normalized using the total ion current over the m/z range of 1,000 to 50,000. A biomarker detection software package (Ciphergen Biomarker Wizard; Ciphergen Biosystems) was used to autodetect protein peaks. Protein peaks were selected in a first pass using a signal/noise ratio of 3 and a minimum peak threshold of 20% of all spectra. This process was completed with a second pass of peak selection at 0.2% of the mass window, and the estimated peaks were added. The selected protein peaks were averaged as clusters and exported to a commercially available software package (Biomarker Patterns, Ciphergen Biosystems) for further classification analysis.
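The total ion current (TIC) normalization described above can be illustrated with a minimal sketch. It assumes that each spectrum is rescaled so that its TIC within the chosen m/z window equals the mean TIC across all spectra; the vendor software's exact procedure is not described here, and the function name `tic_normalize` and the data layout are hypothetical.

```python
def tic_normalize(spectra, mz_min=1000.0, mz_max=50000.0):
    """Rescale spectra to a common total ion current (TIC).

    spectra: list of spectra, each a list of (mz, intensity) pairs.
    Assumption: every spectrum is scaled so that its TIC inside
    [mz_min, mz_max] equals the mean TIC over all spectra.
    """
    # TIC of each spectrum, restricted to the normalization window.
    tics = [
        sum(intensity for mz, intensity in spectrum if mz_min <= mz <= mz_max)
        for spectrum in spectra
    ]
    if any(tic <= 0 for tic in tics):
        raise ValueError("each spectrum needs a positive TIC in the window")
    mean_tic = sum(tics) / len(tics)
    # Apply one scale factor per spectrum to all of its intensities.
    return [
        [(mz, intensity * mean_tic / tic) for mz, intensity in spectrum]
        for spectrum, tic in zip(spectra, tics)
    ]


# Two made-up spectra with the same shape but different overall signal.
example = [[(2000.0, 1.0), (3000.0, 3.0)], [(2000.0, 2.0), (3000.0, 6.0)]]
normalized = tic_normalize(example)
```

After normalization the two made-up spectra coincide, which is the purpose of the step: differences in overall signal between spots are removed so that peak intensities can be compared across samples.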

Detection and statistical data analysis

The profiling spectra of serum samples from the training set were normalized using total ion current normalization in Ciphergen ProteinChip Software (version 3.1). Peak labeling was performed with Biomarker Wizard software, version 3.1 (Ciphergen Biosystems). The m/z range between 2,000 and 20,000 was selected for analysis because it contained the majority of the resolved proteins and peptides. The m/z range between 0 and 2,000 was excluded from analysis to avoid interference from adducts, artifacts of the energy-absorbing molecules, and other possible chemical contaminants. A two-sample t-test was used to compare mean normalized intensities between the case and control groups. A P value below 0.01 was considered statistically significant. Peaks with low P values were selected, and their intensities were transferred to Biomarker Pattern Software (BPS, Ciphergen Biosystems) to construct the classification tree for LC. Briefly, the intensities of the selected peaks were submitted to BPS as a “root node”. Based on peak intensity, a threshold was determined by BPS to split the root node into two child nodes. If the peak intensity of a sample was lower than or equal to the threshold, the sample was assigned to the “left-side child node”; samples with peak intensities higher than the threshold were assigned to the “right-side child node”. After successive rounds of splitting, the tree that discriminated the training set with the least error was obtained.
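The threshold-based splitting of a node can be sketched as follows. This is an illustration only: it assumes a CART-style search that scores candidate thresholds by Gini impurity, which may differ from the criterion used internally by BPS, and the function names and intensity values are hypothetical.

```python
def gini(labels):
    """Gini impurity of a list of binary labels (1 = LC, 0 = control)."""
    n = len(labels)
    if n == 0:
        return 0.0
    p = sum(labels) / n  # fraction of LC samples in the node
    return 2.0 * p * (1.0 - p)


def best_split(intensities, labels):
    """Find the intensity threshold that best separates the two classes.

    Samples with intensity <= threshold go to the left child node and
    the rest go to the right child node. Returns (threshold, score),
    where score is the size-weighted Gini impurity of the two children,
    or None if all intensities are identical.
    """
    pairs = list(zip(intensities, labels))
    values = sorted(set(intensities))
    best = None
    # Candidate thresholds: midpoints between consecutive distinct values.
    for low, high in zip(values, values[1:]):
        threshold = (low + high) / 2.0
        left = [y for x, y in pairs if x <= threshold]
        right = [y for x, y in pairs if x > threshold]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(pairs)
        if best is None or score < best[1]:
            best = (threshold, score)
    return best


# Made-up normalized intensities of one peak in six training samples.
peak = [1.0, 2.0, 3.0, 8.0, 9.0, 10.0]
is_lc = [0, 0, 0, 1, 1, 1]
split = best_split(peak, is_lc)
```

In a full tree the same search would be repeated over every selected peak at every node, and the peak and threshold with the lowest score would define the split.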

All protein peak intensities of the samples in the test set were evaluated by BPS using the classification model. The LC and control samples were then discriminated based on their proteomic profile characteristics. Sensitivity was defined as the proportion of LC cases correctly predicted as LC, and specificity as the proportion of control samples correctly predicted as controls. Accuracy was defined as the proportion of all samples classified correctly.
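The definitions above can be written out as a short sketch. The function name `diagnostic_metrics` is hypothetical; the example counts are the training-set figures given in the Discussion (three of 38 LC patients and five of 62 non-LC subjects misclassified).

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Return (sensitivity, specificity, accuracy) as fractions.

    tp: LC samples classified as LC        fn: LC samples classified as control
    tn: controls classified as control     fp: controls classified as LC
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy


# Training set: 38 LC (3 missed) and 62 non-LC (5 misclassified).
sens, spec, acc = diagnostic_metrics(tp=35, fn=3, tn=57, fp=5)
print(f"sensitivity {sens:.1%}, specificity {spec:.1%}, accuracy {acc:.1%}")
```

With these counts the sketch gives a sensitivity of 92.1% and a specificity of 91.9%, matching the values reported for the training set.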

Results

Quality control and reproducibility

The quality control (QC) serum sample, a pool of four serum samples from healthy control subjects with blood type O (two women and two men), was used to determine reproducibility and as a control protein profile for each Q10 protein chip experiment. The coefficients of variation (CV) for both intensity and mass/charge (m/z) were calculated from duplicate sample testing. The intrachip and interchip CVs for intensity were less than 5%, and the intrachip and interchip CVs for m/z were less than 0.05%. These values showed good reproducibility of the spectra over time (Fig. 1).
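For reference, the coefficient of variation is the standard deviation expressed as a percentage of the mean. The sketch below assumes the sample standard deviation (n − 1 denominator); the replicate values are made up for illustration and are not measurements from this study.

```python
import statistics


def coefficient_of_variation(values):
    """CV in percent: sample standard deviation divided by the mean."""
    mean = statistics.mean(values)
    if mean == 0:
        raise ValueError("CV is undefined for a mean of zero")
    return 100.0 * statistics.stdev(values) / mean


# Hypothetical replicate intensities of one QC peak.
replicates = [90.0, 100.0, 110.0]
cv_percent = coefficient_of_variation(replicates)
```

Intrachip CV would be computed from replicate spots on the same chip, and interchip CV from the same QC sample run on different chips.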

Fig. 1

Spectra illustrating reproducibility of four separate analyses from the healthy controls of blood type O

Detection of the protein peaks

Proteomic data from the samples of the training set (consisting of 38 LC and 62 non-LC control samples) were analyzed with Biomarker Wizard software, version 3.1. Up to 118 protein peaks per spot were detected between m/z 2,000 and m/z 20,000, demonstrating the effectiveness of SELDI technology in separating low-molecular-weight proteins (<20,000 Da) (Fig. 2). Additionally, the proteomic spectra from patients at different stages of LC were compared to evaluate the consistency of these biomarkers in early diagnosis. Interestingly, we found an m/z peak at 13,752 in serum from early-stage patients with well differentiated carcinoma, which diminished in serum samples from later-stage patients (moderately differentiated and poorly differentiated carcinoma) (Fig. 5).

Fig. 2

Representative protein spectrum of a single LC serum sample detected by SELDI-TOF-MS on a Q10 chip, showing protein peaks with m/z between 2,000 and 20,000

Protein fingerprint analysis of serum samples in patients with LC and non-LC controls

The protein profiles of the serum samples from the 38 patients with LC, the 22 patients with vocal fold polyps and the 40 healthy control subjects were examined by SELDI-TOF-MS. The data were analyzed with Biomarker Wizard software, version 3.1; 32 m/z peaks were found to discriminate patients with LC from non-LC control subjects (P < 0.01). Among these peaks, 21 were up-regulated and 11 were down-regulated in patients with LC compared with non-LC control subjects.

Identification of biomarker pattern and construction of diagnostic model

The comparison among different samples showed that the serum profiles from patients with LC and control subjects were very similar apart from a few inter-sample variations. Therefore, the few variations that consistently differentiated these groups could be considered potential disease biomarkers. We used the Biomarker Wizard function of ProteinChip software to identify clusters of peaks differentially present in LC serum samples compared with samples from patients with vocal fold polyps and healthy control subjects. We obtained 32 discriminating protein peaks in serum samples. To develop biomarker patterns for the diagnosis of LC, three peaks (5,915, 6,440 and 9,190 Da) (Table 2) were selected to construct a classification tree (Fig. 3). Figure 4 shows the tree structure and sample distribution. When applied to the 38 patients with LC, 22 patients with vocal fold polyps and 40 healthy subjects of the training set, the classification tree using the combination of the three peaks achieved a calculated sensitivity of 92.1% and a specificity of 91.9% (Table 3).

Table 2 Mean signal intensities of various proteins and peptides comparing LC with non-LC (mean ± SD)
Fig. 3

Differential expression of SELDI peaks at m/z 5,915, 6,440 and 9,190 in LC and control sera. HV healthy volunteers, VFP vocal fold polyps, LC laryngeal carcinoma patients. The two spectra shown for each group are from different individuals

Fig. 4

The decision tree of the diagnostic model for LC

Table 3 Prediction results of the diagnostic model for LC

Test of the diagnostic model for laryngeal carcinoma in a blind test

We used 85 samples, including 30 from patients with LC, 20 from patients with vocal fold polyps and 35 from healthy subjects, to test the LC diagnostic model in the blind test. The classification tree discriminated the LC samples from the control samples with a calculated sensitivity of 86.7% and a specificity of 89.1% (Table 3).
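The reported percentages can be checked for consistency with the group sizes by searching for the whole-number counts that round to them. The sketch below is only an arithmetic check: the helper name `counts_matching` is hypothetical, and the counts it returns for the blind test are inferred from the percentages rather than stated in the text.

```python
def counts_matching(percent, n, tol=0.05):
    """Return every count k in 0..n whose rate k/n rounds to `percent`.

    tol = 0.05 corresponds to percentages reported to one decimal place.
    """
    return [k for k in range(n + 1) if abs(100.0 * k / n - percent) < tol]


# Blind test set: 30 LC samples and 20 + 35 = 55 non-LC samples.
test_lc_correct = counts_matching(86.7, 30)
test_control_correct = counts_matching(89.1, 55)

# Training set: 38 LC samples and 22 + 40 = 62 non-LC samples.
train_lc_correct = counts_matching(92.1, 38)
train_control_correct = counts_matching(91.9, 62)
```

Under this check, each reported percentage corresponds to exactly one possible count: 26 of 30 and 49 of 55 in the blind test, and 35 of 38 and 57 of 62 in the training set.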

Discussion

LC is one of the common forms of head and neck cancer, accounting for approximately 5% of systemic malignancies. It ranks as the second most frequent craniocervical malignancy. Early diagnosis and treatment of LC are crucial for a favorable outcome. If LC is identified at an early stage, the 5-year survival rate can reach 83% [11]. However, because a highly specific diagnostic method is lacking and symptoms appear late, the majority of LC patients are already at an advanced stage when they are diagnosed, which means that the window for optimal treatment is missed and the outcome is poor. Therefore, finding single or combined biomarkers for early diagnosis is an important and clinically relevant issue for LC [27–29].

However, it is difficult to find a serum tumor biomarker. Because almost all tumor-specific proteins are present at low concentrations, it is relatively difficult to detect and identify these low-abundance proteins in the early diagnosis of malignancies. Moreover, the concentration of these proteins changes dynamically, particularly under stress and disease conditions or after therapy, which may affect detection. In recent years, with the rapid progress in proteomics and bioinformatics, it has become possible to evaluate the development of malignant disease dynamically at the level of the whole proteome. The newly developed SELDI-TOF-MS technique provides a new technical basis for the identification of novel tumor biomarkers and has the following merits [12, 13]: (1) the test can be performed directly on crude samples; (2) it is a large-scale, ultra-trace, high-throughput and automated protein screening analysis; (3) it requires a very small sample volume; (4) it can detect a combined protein spectrum; (5) it may reveal changes at the genomic level and lead to the discovery of new genes based on the characteristics of the protein. Therefore, this technique is expected to be applied to the diagnosis and treatment of disease at the protein level, providing numerous, optimal and dynamic protein fingerprints and playing an adjuvant role in diagnosing a disease or monitoring therapeutic efficacy. With further improvement of the technique and standardization of the operational procedure, it may become suitable for clinical use and play an important role in the fight against malignancies (Fig. 5).

Fig. 5

Representative m/z peak at 13,752 in LC patients at different stages: patients with poorly differentiated, moderately differentiated and well differentiated carcinoma. These data suggest that different biomarkers may be found at different stages of LC. The two spectra shown for each group are from different individuals

In this study, we performed a mass spectrometric analysis of serum samples from patients with LC, patients with vocal fold polyps and healthy volunteers using the SELDI-TOF-MS technique. The resulting 32 protein peaks that differed markedly among these groups were analyzed using BPS, which revealed that no single protein could fully distinguish LC from the benign disease. Further analysis showed that three biomarker proteins with m/z 6,440, 5,915 and 9,190 were automatically selected by Biomarker Pattern Software to establish a classification tree model. Of the 100 training-set samples, three of the 38 LC patients and five of the 62 non-LC subjects were misclassified by this diagnostic model. The sensitivity of the diagnostic model was 92.1% and the specificity was 91.9%, suggesting that SELDI in combination with bioinformatics can facilitate the diagnosis of LC [27–29]. The present study also showed that combined biomarker proteins were preferable to a single biomarker protein for the correct diagnosis of LC. This result is consistent with those reported by Xiao et al. [14] and Zhang et al. [15]. The advantage of combined biomarkers over a single one may be due to the following factors: (1) The development of the disease results from a series of proteins rather than a single protein; these proteins might be tumor-related proteins or peptides, metabolites and/or soluble membrane antigens, and a group of them that is specific to the tumor can be identified by comparing the protein spectra of patients with the tumor and of healthy subjects. (2) LC has several subtypes, and a single biomarker might not be able to detect all types of the cancer; the development of proteomics provides the technical basis required for combining biomarkers.
It can be expected that, as the sensitivity and resolution of proteomic techniques improve, tumor biomarkers identified by this global screening approach, particularly combined biomarkers, may be applied extensively to the clinical diagnosis of malignancies, with a sensitivity and specificity far higher than those of the serological examinations currently used in clinical settings.

In our diagnostic model, the three peaks with different m/z values may be biomarkers unique to LC or shared with other diseases. The up-regulated candidate protein biomarker (5,915 Da) has been identified as an internal fragment of fibrinogen alpha-E chain [16]; the theoretical mass of this fragment is 5,908 Da, which is very close to the mass of our marker 1 (5,915 Da). It has also been reported as a positive marker in severe acute respiratory syndrome (SARS) [16] and as an indicator of relapse in gastric cancer [17]. The down-regulated candidate protein biomarker (6,440 Da) has been identified as apolipoprotein C-I [18]. The up-regulated candidate protein biomarker (9,190 Da) has been identified as haptoglobin alpha-1 chain [19]. Fibrinogen alpha-E chain is the alpha component of fibrinogen and participates in blood clotting; apolipoprotein C-I is mainly expressed in the liver and is activated when monocytes differentiate into macrophages; haptoglobin binds free hemoglobin (Hb) released from erythrocytes with high affinity and thereby inhibits its oxidative activity. Additionally, Hellgren et al. [20] identified a 13.7-kDa protein marker derived from transthyretin. This information will help us in further investigations.

In conclusion, we showed that proteomic approaches such as protein chips and SELDI-TOF-MS, in combination with bioinformatics tools, can facilitate the discovery of new biomarkers and provide a rapid and mass-accurate mode of analysis for detecting multiple disease-related proteins simultaneously, reproducibly and with high throughput [21, 22]. With the panel of three selected biomarkers, combined detection achieved high sensitivity and specificity.