Rapid stratified serum spectroscopic diagnostics
ATR-FTIR spectra from 433 patients (3897 spectra) were analysed to investigate the sensitivities and specificities achievable on a patient level. To assess the power of the RBF-SVM analysis, 525 iterations were run, each with a different training and test spectral dataset (split 1/3 test and 2/3 training on a patient basis). Supplementary information S4 displays a histogram of the range of sensitivities and specificities achieved for the cancer versus non-cancer stratum (histograms for all other strata are displayed in supplementary information S5). The sensitivity and specificity ranges are 81–97 % and 51–95 % respectively for cancer versus non-cancer, 46–80 % and 60–93 % for metastatic cancer versus brain cancer, 48–100 % and 31–100 % for glioma versus meningioma, 50–100 % and 2–100 % for high-grade glioma versus low-grade glioma and 28–95 % and 68–98 % for the metastatic origin stratum. Table 2 shows the mean, mode and optimum sensitivities and specificities for each stratum. The optimum sensitivity and specificity are those that best describe the sample set based upon disease grouping.
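The patient-level partitioning and the per-iteration metrics described above can be sketched as follows. This is a minimal illustration, not the code used in the study: the function names, the seeding scheme and the label encoding (1 = positive class) are assumptions, and the classifier itself is left out.

```python
import random

def patient_level_split(patient_ids, test_fraction=1/3, seed=0):
    """Return (train_idx, test_idx) so that every spectrum of a given
    patient falls on the same side of the split (hypothetical helper)."""
    rng = random.Random(seed)
    patients = sorted(set(patient_ids))
    rng.shuffle(patients)
    n_test = round(len(patients) * test_fraction)
    test_patients = set(patients[:n_test])
    train_idx = [i for i, p in enumerate(patient_ids) if p not in test_patients]
    test_idx = [i for i, p in enumerate(patient_ids) if p in test_patients]
    return train_idx, test_idx

def sensitivity_specificity(y_true, y_pred, positive=1):
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    sens = tp / (tp + fn) if (tp + fn) else float("nan")
    spec = tn / (tn + fp) if (tn + fp) else float("nan")
    return sens, spec
```

Repeating the split with a different `seed` for each iteration, training a classifier on `train_idx` and scoring it on `test_idx`, would give one sensitivity and specificity pair per iteration, from which histograms such as those in the supplementary information can be built.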
The optimum, mode and mean sensitivities and specificities observed for all strata range from 51.4 to 100 %, with the optimum sensitivities and specificities achieving 86.3–100 %. The cancer versus non-cancer stratum achieved a mean sensitivity and specificity of 89.8 and 77.5 % respectively, metastatic cancer versus brain cancer of 79.7 and 64.0 % respectively, glioma versus meningioma of 66.7 and 82.1 % respectively, high-grade glioma versus low-grade glioma of 80.9 and 48.5 % respectively and the origin of metastasis of 64.8 and 86.9 % respectively.
These results show the power of ATR-FTIR spectroscopy to diagnose disease states based upon a stratified approach; however, variance still exists in the spectral datasets due to the selection of patient populations in the test and training sets. For each stratum, sensitivity and specificity vary between classification model iterations. This shows that certain patient partitions provide better classification of the remaining test patient data set. One likely reason is redundant data that maximizes the within-group spectral variance across the data variables of the spectral fingerprint region: when patient data containing higher intra-group spectral variance are partitioned together to form the training set, poorer classification models result.
Feature extraction for stratified serum spectroscopic diagnostics
To maximize classification accuracy, the most salient features of a spectrum can be extracted and ranked based on their similarity to a target set, thus assigning scores based on each feature’s ability to discriminate between classes and maximizing inter-group differences. The spectral features used are the peak centroid (a measure of the peak’s central point), peak skew (a measure of asymmetry in the peak’s shape), peak kurtosis (a measure of whether the peak is sharply peaked or flat-topped), peak amplitude and root-mean-squared (RMS) energy. These features were extracted from pre-defined sub-bands of each spectrum, and the corresponding inter-band ratios between features were then ranked by their information gain score.
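As an illustration of the five features named above, the sketch below computes them for one sub-band. The exact definitions used in the study are not given in this section, so the intensity-weighted moment definitions here are an assumption, as are the function names and the requirement that absorbance is baseline-corrected and non-negative.

```python
import math

def band_features(wavenumbers, absorbance, lo, hi):
    """Features of the sub-band lo..hi (cm^-1), treating absorbance as
    weights over wavenumber (assumed definitions, not the paper's)."""
    band = [(w, a) for w, a in zip(wavenumbers, absorbance) if lo <= w <= hi]
    total = sum(a for _, a in band)
    if len(band) < 2 or total <= 0:
        raise ValueError("band needs >= 2 points and positive total absorbance")
    centroid = sum(w * a for w, a in band) / total
    # Intensity-weighted central moments about the centroid.
    m2 = sum(a * (w - centroid) ** 2 for w, a in band) / total
    m3 = sum(a * (w - centroid) ** 3 for w, a in band) / total
    m4 = sum(a * (w - centroid) ** 4 for w, a in band) / total
    return {
        "centroid": centroid,
        "skew": m3 / m2 ** 1.5,
        "kurtosis": m4 / m2 ** 2,
        "amplitude": max(a for _, a in band),
        "rms_energy": math.sqrt(sum(a * a for _, a in band) / len(band)),
    }

def inter_band_ratio(features_a, features_b, name):
    """Ratio of one feature between two sub-bands (hypothetical helper)."""
    return features_a[name] / features_b[name]
```

For example, `inter_band_ratio(band_features(w, a, 1176, 1242), band_features(w, a, 1020, 1115), "rms_energy")` would give one candidate inter-band ratio for a spectrum with wavenumbers `w` and absorbances `a`.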
Following feature extraction and variable ranking the most discriminatory characteristics of the spectrum (from 1800 to 900 cm−1) were extracted (Table 3 displays the most discriminatory regions with proposed biomolecular assignments) highlighting spectral components relating to proteins, lipids, carbohydrates and nuclear material.
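The variable ranking step can be sketched with a simple information gain calculation. How the continuous feature values were discretized in the study is not stated here, so the single split at the median used below is an assumption, and all names are illustrative.

```python
import math
import statistics
from collections import Counter

def entropy(labels):
    """Shannon entropy (bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(values, labels, threshold):
    """Entropy reduction from splitting the samples at `threshold`."""
    left = [y for v, y in zip(values, labels) if v <= threshold]
    right = [y for v, y in zip(values, labels) if v > threshold]
    n = len(labels)
    conditional = sum(len(part) / n * entropy(part)
                      for part in (left, right) if part)
    return entropy(labels) - conditional

def rank_features(feature_table, labels):
    """Rank features (name -> list of values) by information gain,
    splitting each at its median (assumed discretization)."""
    scores = {name: information_gain(vals, labels, statistics.median(vals))
              for name, vals in feature_table.items()}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
```

Taking the first k entries of the ranked list would correspond to selecting the top k features for a feature-fed classifier.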
Interestingly, the features observed for the 2-class strata, which focus on the detection and diagnosis of primary brain cancer and enable classification of cancer versus non-cancer, metastatic versus brain cancer, glioma versus meningioma and high-grade glioma versus low-grade glioma (top 10 features for each 2-class stratum are displayed in supplementary information S6), originate from the following spectral regions: Amide I (vibrations originating from α-helix structures, β-pleated sheets, turns and random coil [ν C=O (80 %), ν C–N (10 %), CNN (10 %)]), Amide II (vibrations originating from α-helix structures, β-pleated sheets, turns and random coil [δ N–H (60 %), ν C–N (40 %)]), C–O stretch of lipids/proteins, CH2 of lipids/proteins and contributions from nuclear materials (DNA/RNA via PO2− stretches) [18–27]. These spectral regions have been described previously in research discriminating between brain cancer states using tissue spectroscopy [22, 25]. The former highlighted the Amide I (1655 cm−1), Amide II (1547 and 1582 cm−1), carbohydrate (1173 cm−1), glycogen (1014 cm−1) and phosphate regions as describing the majority of difference between infrared spectra of tissue originating from non-cancerous patients and tumour subtypes.
The features observed for the metastatic stratum (top 10 features for each primary site are displayed in supplementary information S7), which focuses upon secondary brain tumours and enables discrimination between the organs of origin of the metastatic cancer (lung vs. melanoma vs. breast), originate from vibrations of C–O, C=O and C–H associated with lipids and protein macromolecules, contributions associated with nucleic material (DNA/RNA via PO2−) and minimal contributions from the Amide spectral regions. This correlates with research performed by Gazi et al. [23, 24], who utilized FTIR microscopy to investigate discrimination of metastatic prostate cancer tissue and organ-confined prostate cancer. Gazi et al. show increases in the carbohydrate, phosphate and lipid hydrocarbon intensities between organ-confined prostate cancer and prostate cancer bone metastases tissue specimens. Krafft et al., when performing IR spectroscopic imaging of brain tissue, highlight spectral features at 1026, 1080 and 1153 cm−1 as molecular markers for brain metastases of the primary tumour renal cell carcinoma; the intensity at 1735 cm−1, assigned to the carbonyl vibrations (C=O) of ester groups, as indicative of brain metastases of breast cancer; an increase in Amide II intensity and broadening of the Amide I low wavenumber shoulder near 1625 cm−1 for brain metastases of lung cancer; and an intensity minimum near 1400 cm−1 for brain metastases of colorectal cancer. The similar regions observed in the tissue spectroscopic studies and the serum-based spectroscopic studies provide corroborating evidence of the power of the analysis, as the serum biochemical profile is understood to reflect the tissue status.
In order to examine the ability of feature extraction to improve the diagnostic capability of stratified serum diagnostics, a 525 iteration feature-fed SVM was performed for the cancer versus non-cancer stratum using all of the 130 features discovered during the feature extraction process, the top 30 features and the top 2 features, based on a variable ranking process. All 130 features are displayed in supplementary information Table S8, highlighting the spectral regions described previously.
Supplementary information S9 displays the histograms showing the sensitivities and specificities achieved when analysing 525 iterations of a 130 feature-fed SVM (A), 30 feature-fed SVM (B) and 2 feature-fed SVM (C) for the cancer versus non-cancer stratum. When compared to the full fingerprint region SVM shown in supplementary information S3, the sensitivities and specificities observed reach higher percentages and occur over a smaller range: 81–97 % and 51–95 % respectively for the fingerprint region SVM, 82–98 % and 66–97 % for the 130 feature-fed SVM, 81–98 % and 66–95 % for the top 30 feature-fed SVM and 81–96 % and 51–95 % for the top 2 feature-fed SVM.
The mode sensitivity and specificity for the full fingerprint region SVM of the cancer versus non-cancer stratum were 89.4 and 78.0 % respectively, compared to mode sensitivities and specificities of 92.3 and 80.5 % when using 130 spectral features, 91.3 and 82.9 % when using the top 30 features and 89.4 and 70.7 % when using 2 spectral features (Table 4). The mean sensitivity and specificity for the feature extracted models follow the same trend, with all 130 features achieving 91.5 % sensitivity and 83.0 % specificity, 30 features achieving 90.6 % sensitivity and 81.9 % specificity and 2 features achieving 88.7 % sensitivity and 77.7 % specificity. The mean sensitivity and specificity achieved using the full fingerprint region SVM (89.8 % and 77.5 % respectively) are similar to those that can be achieved using the top 2 spectral features. The top 2 spectral features that describe the differences between the cancer versus non-cancer disease groupings are (1) the RMS energy of the C–O groups, PO2−, RNA/DNA region (1176–1242 cm−1) versus the PO2− stretch vibrations of nucleic acids, RNA/DNA (1020–1115 cm−1) and (2) the skew of the C–O groups, PO2−, RNA/DNA region (1176–1242 cm−1) versus the CH2 of lipids/proteins and Amide II region (1483–1537 cm−1) [18–27].
We achieved the optimum sensitivities and specificities from our model consisting of all 130 spectral features for cancer versus non-cancer. Features are ranked in order of how representative they are of the original data, thus a reduction in the diagnostic ability from 2 spectral features, compared to all 130 or top 30, is not surprising due to the reduction in spectral information available during feature-fed-SVM.
The ability to select and rank spectral features enables the extraction of data that describes the differences within the disease groupings without introducing additional variance from other contributing patient factors, and enables biochemical differences to be observed via spectral peaks, which a full spectral SVM does not allow. In addition, the selection of spectral features, based upon the collection of the full FTIR spectrum, allows for targeting of the most discriminatory regions during a sparse frequency collection approach [28, 29] and a reduction in the processing power required for classification of disease states, providing a quicker and more efficient spectroscopic diagnostic process.
Vibrational spectroscopy can provide rapid, label-free and objective analysis for clinical practice [26, 27]. This proof of principle project provides substantial translational laboratory research to enable the development of clinical serum spectroscopic diagnostics. The rapidity, ease-of-use, low sample volume, reproducibility and detection characteristics shown by this methodology would provide for a rapid and responsive diagnostic tool that can be used throughout the patient pathway. As such the potential clinical impact of serum spectroscopic diagnostics for brain tumours can be:
A robust, rapid diagnostic test with high sensitivity and specificity that can distinguish brain tumours from non-cancerous disease, prompting more timely onward referral of patients for further testing.
A test capable of monitoring response to treatment (surgery, radiotherapy, chemotherapy) and detecting recurrent disease, enabling serial sampling and testing with less cost, resource and radiation exposure compared to conventional methods. In addition, such a test may overcome the time lag required to observe changes in tumour size and characteristics on MRI.
In order to understand the reliability of a diagnostic model, the Kappa value is used to assess the inter-observer agreement whilst correcting for chance (see Materials and Methods), where a Kappa value of <0 indicates less than chance agreement, 0.01–0.20 slight agreement, 0.21–0.40 fair agreement, 0.41–0.60 moderate agreement, 0.61–0.80 substantial agreement and 0.81–1.00 almost perfect agreement. Figure 1 shows Kappa values from a range of currently used diagnostic tests and proposed spectroscopic diagnoses.
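For reference, the chance-corrected agreement described above can be sketched with the standard Cohen's Kappa, (observed agreement − expected agreement) / (1 − expected agreement). The exact calculation used in the study is given in its Materials and Methods and is not reproduced here, so this is an illustrative assumption; the function names are hypothetical, and values between 0 and 0.01 are grouped with "slight" for simplicity.

```python
from collections import Counter

def cohen_kappa(rater_a, rater_b):
    """Cohen's Kappa for two equal-length lists of categorical calls."""
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    # Agreement expected by chance from each rater's marginal frequencies.
    expected = sum(count_a[k] * count_b[k]
                   for k in set(count_a) | set(count_b)) / (n * n)
    return (observed - expected) / (1 - expected)

def agreement_band(kappa):
    """Map a Kappa value to the agreement bands quoted in the text."""
    if kappa < 0:
        return "less than chance"
    if kappa <= 0.20:
        return "slight"
    if kappa <= 0.40:
        return "fair"
    if kappa <= 0.60:
        return "moderate"
    if kappa <= 0.80:
        return "substantial"
    return "almost perfect"
```

Here `rater_a` could hold the spectroscopic model's predictions and `rater_b` the reference diagnosis for the same patients.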
Figure 1 shows a Kappa value of 0.12 (A) when comparing the histopathological diagnosis of glioblastoma for 34 patients between local, institutional and central neuro-oncopathology reporting, with the conclusion that concordance was sub-optimal when comparing local and central review; however, the Kappa value did increase to moderate agreement (k = 0.51) when comparing institutional and central review. For mammography (B, C, F, G), a review of 31 community radiologists concerning 30 women with cancer and 83 without was undertaken to assess the advantages of single versus double interpretation, comparing the Kappa values from 465 pairs of radiologists and 31,465 pairs of unique pairs. The mean Kappa value when diagnosing non-cancer was 0.30 (B) for single interpretation, increasing to 0.34 (C) on double interpretation, and for cancer was 0.59 (F) for single interpretation, increasing to 0.70 (G) for double interpretation. The correlation between Gleason score at biopsy and prostatectomy of 371 patients undergoing radical prostatectomy revealed a Kappa value of 0.42 (D) based upon prostate cancer histopathology, with the conclusion that this concordance lies within classical clinical standards, and a peer review assessment of 1086 abnormal cervical smears evaluating laboratory cytology performance achieved an overall Kappa value of 0.62 (H) when assessing the diagnoses of 10 cytologists. The Kappa values above are derived from tests that require interpretation of tissue architecture or other diagnostic markers, showing a range of Kappa values from 0.12 to 0.70 for these currently used diagnostic tests. It is also interesting to consider a risk factor based test that is performed within the primary care centre in order to direct future treatment and patient care. Examples of such measures are the Framingham Risk Score (FRS) and the European Systemic Coronary Risk Evaluation (SCORE) system for assessing high cardiovascular risk.
FRS is widely used within the USA and SCORE is widely used throughout Europe; when comparing the diagnosis of SCORE against that of FRS, a Kappa value of 0.42, equating to moderate agreement, was achieved. As can be seen from this literature analysis, there exists a range of Kappa values from slight agreement to substantial agreement for currently used diagnostic procedures. Kendall et al. used Raman spectroscopy to identify and classify neoplasia in Barrett’s oesophagus when analysing tissue in vitro; in a study utilizing three pathologists to provide a consensus opinion, the Kappa value using Raman spectroscopy achieved 0.89 (I). The Kappa values for the ATR-FTIR (J–N) stratified serum diagnostic tests show similar high levels of agreement when compared against the diagnosis provided following a multidisciplinary team meeting. For cancer versus non-cancer (J) Kappa = 0.77, metastatic versus brain cancer (K) Kappa = 0.90, glioma versus meningioma (L) Kappa = 0.79, high-grade glioma versus low-grade glioma (M) Kappa = 0.70 and the average metastatic model (N) Kappa = 0.74 (lung Kappa = 0.81, skin Kappa = 0.67 and breast Kappa = 0.75). All strata within the stratified serum diagnostics approach showed Kappa values in the substantial and almost perfect agreement ranges.