Algorithm for accurate similarity measurements of peptide mass fingerprints and its application

  • Flavio Monigatti
  • Peter Berndt


We present a simple algorithm that accurately estimates the similarity between peptide mass fingerprint spectra acquired on matrix-assisted laser desorption/ionization (MALDI) spectrometers. The algorithm, which combines mass correlation with intensity rank correlation, was used to cluster similar spectra and to generate consensus spectra from a data store of more than 100,000 spectra. The resulting first spectra library of 1248 unambiguously identified, distinct protein digests was used to search for missed cleavage patterns that have not been reported so far and to shed light on some peptide ionization characteristics. The findings of this study could be implemented directly in peptide mass fingerprint search algorithms to decrease the false positive error rate to <0.25%. Furthermore, the results contribute to the understanding of the peptide ionization process in MALDI experiments.
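
The abstract names two ingredients, mass correlation and intensity rank correlation, without giving the formula that combines them. The sketch below is only one plausible way to combine such ingredients and is not the authors' published method. Every detail in it is an assumption made for illustration: the 50 ppm tolerance, the greedy peak pairing, the Dice-style overlap used as the mass term, the clipping of negative rank correlations to zero, and the product of the two terms. The m/z values in the usage example are hypothetical.

```python
from math import sqrt
from typing import List, Tuple

Peak = Tuple[float, float]  # (m/z, intensity); representation is an assumption


def match_peaks(a: List[Peak], b: List[Peak], tol_ppm: float) -> List[Tuple[Peak, Peak]]:
    """Greedily pair peaks of two spectra whose m/z values agree within tol_ppm."""
    a, b = sorted(a), sorted(b)
    i = j = 0
    pairs = []
    while i < len(a) and j < len(b):
        ma, mb = a[i][0], b[j][0]
        if abs(ma - mb) <= tol_ppm * 1e-6 * max(ma, mb):
            pairs.append((a[i], b[j]))
            i += 1
            j += 1
        elif ma < mb:
            i += 1
        else:
            j += 1
    return pairs


def ranks(values: List[float]) -> List[float]:
    """1-based ranks; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda k: values[k])
    result = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        mean_rank = (pos + end) / 2 + 1
        for k in range(pos, end + 1):
            result[order[k]] = mean_rank
        pos = end + 1
    return result


def pearson(x: List[float], y: List[float]) -> float:
    """Pearson correlation; returns 0.0 when either input has no variance."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((v - mx) ** 2 for v in x)
    syy = sum((v - my) ** 2 for v in y)
    if sxx == 0 or syy == 0:
        return 0.0
    sxy = sum((p - mx) * (q - my) for p, q in zip(x, y))
    return sxy / sqrt(sxx * syy)


def similarity(a: List[Peak], b: List[Peak], tol_ppm: float = 50.0) -> float:
    """Illustrative score in [0, 1]: mass overlap times intensity rank correlation."""
    if not a or not b:
        return 0.0
    pairs = match_peaks(a, b, tol_ppm)
    mass_score = 2 * len(pairs) / (len(a) + len(b))  # Dice-style overlap (assumed)
    if len(pairs) < 3:
        return 0.0  # too few shared peaks for a meaningful rank correlation
    # Spearman correlation = Pearson correlation of the intensity ranks
    rank_a = ranks([p[0][1] for p in pairs])
    rank_b = ranks([p[1][1] for p in pairs])
    rank_score = max(0.0, pearson(rank_a, rank_b))
    return mass_score * rank_score


if __name__ == "__main__":
    # Hypothetical peak lists, not taken from the paper's data
    spectrum_1 = [(842.51, 100.0), (1045.56, 80.0), (1479.80, 60.0), (2211.10, 40.0)]
    spectrum_2 = [(842.52, 90.0), (1045.57, 85.0), (1479.79, 30.0), (1638.86, 20.0)]
    print(similarity(spectrum_1, spectrum_2))
```

Using ranks rather than raw intensities makes the intensity term insensitive to overall scaling between acquisitions, which is one reason a rank-based measure is attractive for MALDI data, where absolute intensities vary from shot to shot.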


Peptide; MALDI; Relative Entropy; Mass Spectrometric Data; Peptide Mass Fingerprint



Copyright information

© American Society for Mass Spectrometry 2004

Authors and Affiliations

  1. F. Hoffmann-La Roche Ltd., RCMG, Basel, Switzerland
