Feature Extraction from Mass Spectra for Classification of Pathological States
Mass spectrometry is becoming an important tool in proteomics. Mass spectra, however, have a representation of very high dimensionality and a high level of redundancy. Here we present a feature extraction method for mass spectra that directly models domain knowledge, reduces the dimensionality and redundancy of the initial representation, and controls the granularity of feature extraction by seeking to optimize classification accuracy. A number of experiments show that the feature extraction preserves the initial discriminatory content of the learning examples.
Keywords: Feature Extraction, Peak Detection, Discriminatory Information, Initial Representation, Spatial Redundancy
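The abstract does not spell out how the granularity of feature extraction is tuned, so the following is only a minimal sketch of the general idea of choosing a granularity by classification accuracy. It assumes, purely for illustration, that features are obtained by collapsing adjacent m/z points into bins (keeping the maximum intensity per bin) and that accuracy is estimated by leave-one-out evaluation of a nearest-centroid classifier. The function names, the binning scheme, and the classifier are all hypothetical and are not taken from the paper.

```python
from typing import Dict, List, Sequence, Tuple


def bin_spectrum(intensities: Sequence[float], width: int) -> List[float]:
    """Collapse each run of `width` adjacent points into its maximum intensity.

    Assumption: max-binning is a stand-in for the paper's feature extraction.
    """
    return [max(intensities[i:i + width]) for i in range(0, len(intensities), width)]


def loo_accuracy(features: Sequence[Sequence[float]], labels: Sequence[str]) -> float:
    """Leave-one-out accuracy of a nearest-centroid classifier (illustrative choice)."""
    correct = 0
    for held_out in range(len(features)):
        # Build one centroid per class from all examples except the held-out one.
        centroids: Dict[str, List[float]] = {}
        for label in sorted(set(labels)):
            rows = [f for i, (f, y) in enumerate(zip(features, labels))
                    if i != held_out and y == label]
            if rows:
                centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]
        x = features[held_out]
        # Predict the class whose centroid is closest in squared Euclidean distance.
        predicted = min(
            centroids,
            key=lambda c: sum((a - b) ** 2 for a, b in zip(x, centroids[c])),
        )
        correct += predicted == labels[held_out]
    return correct / len(features)


def select_granularity(
    spectra: Sequence[Sequence[float]],
    labels: Sequence[str],
    widths: Sequence[int],
) -> Tuple[int, float]:
    """Return the (bin width, accuracy) pair with the best leave-one-out accuracy.

    Ties go to the larger width, i.e. the lower-dimensional representation.
    """
    best: Tuple[int, float] = (0, -1.0)
    for width in sorted(widths):
        feats = [bin_spectrum(s, width) for s in spectra]
        acc = loo_accuracy(feats, labels)
        if acc >= best[1]:
            best = (width, acc)
    return best
```

Under these assumptions, a call such as `select_granularity(spectra, labels, [1, 2, 4, 8])` returns the coarsest bin width whose estimated accuracy is not worse than any finer one, which mirrors the stated goal of reducing dimensionality while preserving discriminatory content.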