Feature Extraction from Mass Spectra for Classification of Pathological States

  • Alexandros Kalousis
  • Julien Prados
  • Elton Rexhepaj
  • Melanie Hilario
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3721)


Mass spectrometry is becoming an important tool in proteomics. The representation of mass spectra is characterized by very high dimensionality and a high level of redundancy. Here we present a feature extraction method for mass spectra that directly models domain knowledge, reduces the dimensionality and redundancy of the initial representation, and controls the level of granularity of feature extraction by seeking to optimize classification accuracy. Experiments show that the feature extraction preserves the initial discriminatory content of the learning examples.
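The idea of choosing the granularity of feature extraction by classification accuracy can be illustrated with a small sketch. This is not the authors' method: it assumes plain bin averaging as a stand-in for feature extraction, a nearest-centroid classifier, leave-one-out evaluation, and synthetic spectra. Every function name and parameter below (`bin_average`, `select_granularity`, `widths`, and so on) is hypothetical.

```python
import random

def bin_average(spectrum, width):
    """Stand-in feature extractor: mean intensity over windows of `width` points."""
    return [sum(spectrum[i:i + width]) / len(spectrum[i:i + width])
            for i in range(0, len(spectrum), width)]

def nearest_centroid_predict(train_X, train_y, x):
    """Predict the label whose class mean is closest (squared Euclidean) to x."""
    best_label, best_dist = None, float("inf")
    for label in sorted(set(train_y)):
        rows = [r for r, y in zip(train_X, train_y) if y == label]
        centroid = [sum(col) / len(rows) for col in zip(*rows)]
        dist = sum((a - b) ** 2 for a, b in zip(x, centroid))
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label

def loo_accuracy(X, y):
    """Leave-one-out accuracy of the nearest-centroid classifier."""
    hits = 0
    for i in range(len(X)):
        train_X = X[:i] + X[i + 1:]
        train_y = y[:i] + y[i + 1:]
        hits += nearest_centroid_predict(train_X, train_y, X[i]) == y[i]
    return hits / len(X)

def select_granularity(spectra, labels, widths):
    """Return (width, accuracy) with the best accuracy; ties go to the coarser width."""
    scored = []
    for w in widths:
        X = [bin_average(s, w) for s in spectra]
        scored.append((loo_accuracy(X, labels), w))
    best_acc, best_w = max(scored)  # tuple comparison: accuracy first, then width
    return best_w, best_acc

def make_spectrum(rng, diseased, n_points=64):
    """Synthetic spectrum: Gaussian noise, plus one broad peak for the diseased class."""
    s = [rng.gauss(0.0, 0.2) for _ in range(n_points)]
    if diseased:
        for i in range(20, 28):
            s[i] += 2.0
    return s

rng = random.Random(0)
labels = [0] * 10 + [1] * 10
spectra = [make_spectrum(rng, bool(label)) for label in labels]
width, acc = select_granularity(spectra, labels, widths=[1, 2, 4, 8, 16])
print("chosen bin width:", width)
print("leave-one-out accuracy:", acc)
print("features per spectrum:", len(bin_average(spectra[0], width)))
```

Preferring the coarser width on ties reflects the stated goal of reducing dimensionality without losing discriminatory content. With real spectra, the candidate granularities and the classifier would need to be chosen for the data at hand.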


Keywords: Feature Extraction; Peak Detection; Discriminatory Information; Initial Representation; Spatial Redundancy



Copyright information

© Springer-Verlag Berlin Heidelberg 2005

Authors and Affiliations

  • Alexandros Kalousis (1)
  • Julien Prados (1)
  • Elton Rexhepaj (1)
  • Melanie Hilario (1)
  1. Computer Science Department, University of Geneva, Geneva, Switzerland
