PTMSearch: A Greedy Tree Traversal Algorithm for Finding Protein Post-Translational Modifications in Tandem Mass Spectra

  • Attila Kertész-Farkas
  • Beáta Reiz
  • Michael P. Myers
  • Sándor Pongor
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6912)

Abstract

Peptide identification by tandem mass spectrometry (MS/MS) and database searching is becoming the standard high-throughput technology in many areas of the life sciences. The analysis of post-translational modifications (PTMs) is a major source of complications in this area, which calls for efficient computational approaches. In this paper we describe PTMSearch, a novel algorithm in which the PTM search space is represented by a tree structure, and a greedy traversal algorithm is used to identify a path within the tree that corresponds to the PTMs that best fit the input data. Tests on simulated and real (experimental) PTMs show that the algorithm performs well in terms of speed and accuracy. Estimates are given for the error caused by the greedy heuristics and for the size of the search space, and a scheme is presented for the calculation of statistical significance.
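
The abstract does not give the details of the tree or of the scoring function, so the following is only a minimal sketch of the general idea of a greedy walk over a PTM tree, not the authors' implementation. It assumes that each tree level corresponds to one residue of a candidate peptide, that each branch assigns one candidate mass shift (0.0 meaning unmodified) to that residue, and that a branch is scored by whether the resulting prefix mass matches an observed peak. The names `RESIDUE_MASS`, `CANDIDATE_SHIFTS`, `matches` and `greedy_ptm_path` are hypothetical, and the spectrum is simplified to a list of neutral prefix masses.

```python
from typing import List, Sequence, Tuple

# Monoisotopic residue masses in Da (small illustrative subset).
RESIDUE_MASS = {"G": 57.02146, "A": 71.03711, "S": 87.03203,
                "T": 101.04768, "K": 128.09496}

# Assumed candidate PTM mass shifts per residue; 0.0 means "unmodified".
# 79.96633 Da is phosphorylation, 42.01057 Da is acetylation.
CANDIDATE_SHIFTS = {"S": [0.0, 79.96633], "T": [0.0, 79.96633],
                    "K": [0.0, 42.01057]}


def matches(mass: float, peaks: Sequence[float], tol: float) -> bool:
    """Return True if any observed peak lies within tol of mass."""
    return any(abs(mass - p) <= tol for p in peaks)


def greedy_ptm_path(peptide: str, peaks: Sequence[float],
                    tol: float = 0.02) -> Tuple[List[float], int]:
    """Walk the tree one residue (level) at a time, keeping only the
    locally best branch instead of enumerating every combination."""
    prefix = 0.0            # mass of the prefix chosen so far
    path: List[float] = []  # chosen shift per residue = path in the tree
    score = 0               # number of matched prefix masses
    for aa in peptide:
        best_shift, best_hit = 0.0, -1
        for shift in CANDIDATE_SHIFTS.get(aa, [0.0]):
            hit = int(matches(prefix + RESIDUE_MASS[aa] + shift, peaks, tol))
            if hit > best_hit:  # ties keep the first (unmodified) branch
                best_shift, best_hit = shift, hit
        prefix += RESIDUE_MASS[aa] + best_shift
        path.append(best_shift)
        score += best_hit
    return path, score
```

For the peptide "GSAK" with prefix peaks at 57.02146, 224.01982, 295.05693 and 423.15189 Da, this sketch selects the phosphorylation shift on S and leaves the other residues unmodified. Because each level commits to a single branch, the walk visits a number of nodes proportional to the peptide length rather than to the number of PTM combinations, at the cost of possibly missing the globally best path; this is the kind of error for which the abstract says estimates are given.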

Keywords

Experimental Spectrum; Collision-Induced Dissociation; Tandem Mass Spectrum; Theoretical Spectrum; Random Match

Copyright information

© Springer-Verlag Berlin Heidelberg 2011

Authors and Affiliations

  • Attila Kertész-Farkas (1)
  • Beáta Reiz (2, 3)
  • Michael P. Myers (1)
  • Sándor Pongor (1, 2)

  1. International Centre for Genetic Engineering and Biotechnology, Trieste, Italy
  2. Bioinformatics Group, Biological Research Centre, Hungarian Academy of Sciences, Szeged, Hungary
  3. Institute of Informatics, University of Szeged, Szeged, Hungary
