Improved Mass Spectrometry Peak Intensity Prediction by Adaptive Feature Weighting

  • Alexandra Scherbart
  • Wiebke Timm
  • Sebastian Böcker
  • Tim W. Nattkemper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5506)


Mass spectrometry (MS) is a key technique for the analysis and identification of proteins. Predicting spectrum peak intensities from precomputed molecular features would pave the way to a better understanding of spectrometry data and improved spectrum evaluation. The goal is to model the relationship between peptides and peptide peak heights in MALDI-TOF mass spectra using only the peptide’s sequence information and chemical properties. To cope with this high-dimensional data, we propose a regression-based combination of feature weightings and a linear predictor that focuses on relevant features. This offers simpler models, scalability, and better generalization. We show that estimating feature relevance and re-training improves overall performance compared to using the entire feature space.
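The idea of weighting features by relevance and then re-training a linear predictor on the relevant ones can be illustrated with a minimal sketch. The abstract does not specify the weighting scheme or the predictor, so everything here is an assumption for illustration only: absolute Pearson correlation stands in for the feature weighting, a ridge-regularized least-squares fit stands in for the linear predictor, the function names are hypothetical, and the data are synthetic rather than MALDI-TOF spectra.

```python
import numpy as np


def feature_weights(X, y):
    """Relevance score per feature: absolute Pearson correlation with the target.

    Assumption: a stand-in for the paper's feature weighting, which the
    abstract does not specify.
    """
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    denom = np.sqrt((Xc ** 2).sum(axis=0) * (yc ** 2).sum())
    denom[denom == 0] = 1.0  # constant features end up with weight 0
    return np.abs(Xc.T @ yc) / denom


def fit_ridge(X, y, lam=1e-3):
    """Ridge-regularized linear predictor with intercept; returns (coef, intercept)."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc = X - x_mean
    gram = Xc.T @ Xc + lam * np.eye(X.shape[1])
    coef = np.linalg.solve(gram, Xc.T @ (y - y_mean))
    return coef, y_mean - x_mean @ coef


def select_and_retrain(X, y, k, lam=1e-3):
    """Keep the k highest-weighted features and re-train the predictor on them."""
    weights = feature_weights(X, y)
    keep = np.sort(np.argsort(weights)[::-1][:k])
    coef, intercept = fit_ridge(X[:, keep], y, lam)
    return keep, coef, intercept


def predict(X, keep, coef, intercept):
    """Apply the re-trained predictor to the selected feature columns."""
    return X[:, keep] @ coef + intercept


# Synthetic example: 50 features, only features 3 and 17 carry signal.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 50))
y = 2.0 * X[:, 3] - 1.5 * X[:, 17] + 0.1 * rng.normal(size=200)

# Estimate relevance and re-train on the first 150 samples, evaluate on the rest.
keep, coef, intercept = select_and_retrain(X[:150], y[:150], k=5)
residual = predict(X[150:], keep, coef, intercept) - y[150:]
rmse = np.sqrt(np.mean(residual ** 2))
print("selected features:", keep, "held-out RMSE:", rmse)
```

In this toy setting the reduced model uses 5 of the 50 features, which mirrors the abstract's argument for simpler models; whether it generalizes better than the full feature space on real spectra is an empirical question that the sketch does not answer.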


Partial Little Square, Feature Space, Feature Weighting, Learning Architecture, Local Linear Function
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag Berlin Heidelberg 2009

Authors and Affiliations

  • Alexandra Scherbart (1)
  • Wiebke Timm (1, 2)
  • Sebastian Böcker (3)
  • Tim W. Nattkemper (1)
  1. Biodata Mining & Applied Neuroinformatics Group, Faculty of Technology, Bielefeld University, Germany
  2. Intl. NRW Grad. School of Bioinformatics & Genome Research, Bielefeld University, Germany
  3. Bioinformatics Group, Jena University, Germany
