Study on Preprocessing and Classifying Mass Spectral Raw Data Concerning Human Normal and Disease Cases

  • Xenofon E. Floros
  • George M. Spyrou
  • Konstantinos N. Vougas
  • George T. Tsangaris
  • Konstantina S. Nikita
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4345)


Mass spectrometry is becoming an important tool in the biological sciences. Tissue samples or easily obtained biological fluids (serum, plasma, urine) are analysed by a variety of mass spectrometry methods, producing spectra characterized by very high dimensionality and a high level of noise. Here we describe a feature extraction method for mass spectra that consists of two main steps. In the first step, an algorithm for low-level preprocessing of mass spectra is applied, including denoising with the Shift-Invariant Discrete Wavelet Transform (SIDWT), smoothing, baseline correction, peak detection and normalization of the resulting peak-lists. After this step, we claim to have reduced the dimensionality and redundancy of the initial mass spectra representation while keeping all the meaningful features (potential biomarkers) required for disease-related proteomic patterns to be identified. In the second step, the peak-lists are aligned and fed to a Support Vector Machine (SVM), which classifies the mass spectra. This procedure was applied to SELDI-QqTOF spectral data collected from normal and ovarian cancer serum samples. The classification performance was assessed for distinct values of the parameters involved in the feature extraction pipeline. The method described here for low-level preprocessing of mass spectra results in 98.3% sensitivity, 98.3% specificity and an AUC (Area Under Curve) of 0.981 in spectra classification.
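
The first step of the pipeline (spectrum in, normalized peak-list out) can be illustrated with a minimal Python sketch. It is not the authors' implementation. A centered moving average stands in for the SIDWT denoising and smoothing stages, a rolling minimum stands in for the baseline correction, and the normalization divides by the total peak intensity; the abstract does not specify any of these choices. All function names, window sizes, the height threshold and the synthetic spectrum are assumptions made for illustration only.

```python
import math

def smooth(intensity, window=5):
    """Centered moving average (a stand-in for SIDWT denoising, not SIDWT itself)."""
    half = window // 2
    n = len(intensity)
    out = []
    for i in range(n):
        chunk = intensity[max(0, i - half):min(n, i + half + 1)]
        out.append(sum(chunk) / len(chunk))
    return out

def subtract_baseline(intensity, window=101):
    """Estimate the baseline as a rolling minimum and subtract it (assumed method)."""
    half = window // 2
    n = len(intensity)
    return [
        value - min(intensity[max(0, i - half):min(n, i + half + 1)])
        for i, value in enumerate(intensity)
    ]

def detect_peaks(mz, intensity, min_height):
    """Return (m/z, height) pairs for local maxima at or above min_height."""
    peaks = []
    for i in range(1, len(intensity) - 1):
        is_local_max = intensity[i - 1] < intensity[i] >= intensity[i + 1]
        if is_local_max and intensity[i] >= min_height:
            peaks.append((mz[i], intensity[i]))
    return peaks

def normalize(peaks):
    """Scale peak heights so that they sum to 1 (assumed normalization)."""
    total = sum(height for _, height in peaks)
    return [(m, height / total) for m, height in peaks] if total else []

def spectrum_to_peak_list(mz, intensity, min_height=5.0):
    """Chain the steps: smooth, correct baseline, detect peaks, normalize."""
    cleaned = subtract_baseline(smooth(intensity))
    return normalize(detect_peaks(mz, cleaned, min_height))

def gaussian(x, center, height, width):
    """Gaussian peak shape used only to build the synthetic spectrum."""
    return height * math.exp(-((x - center) ** 2) / (2.0 * width ** 2))

# Synthetic spectrum: two peaks on a slowly decreasing baseline (made-up data).
mz = [1000.0 + 0.5 * i for i in range(600)]
intensity = [
    20.0 - 0.01 * i
    + gaussian(x, 1100.0, 100.0, 4.0)
    + gaussian(x, 1200.0, 60.0, 4.0)
    for i, x in enumerate(mz)
]

peak_list = spectrum_to_peak_list(mz, intensity)
for peak_mz, rel_height in peak_list:
    print(f"m/z {peak_mz:.1f}  relative intensity {rel_height:.3f}")
```

Each spectrum is reduced from hundreds of intensity values to a short list of (m/z, relative intensity) pairs, which is the dimensionality reduction the abstract refers to. In the second step, such peak-lists would be aligned across spectra into fixed-length feature vectors before being passed to a classifier.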


ovarian cancer mass spectra preprocessing biomarkers feature extraction early diagnosis classification 





Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Xenofon E. Floros (1)
  • George M. Spyrou (2)
  • Konstantinos N. Vougas (2)
  • George T. Tsangaris (2)
  • Konstantina S. Nikita (1)

  1. Electrical and Computer Engineering Faculty, National Technical University of Athens, Athens, Greece
  2. Foundation for Biomedical Research of the Academy of Athens, Athens, Greece
