Protein Identification from Tandem Mass Spectra by Database Searching

Bioinformatics for Comparative Proteomics

Part of the book series: Methods in Molecular Biology (MIMB, volume 694)

Abstract

Protein identification from tandem mass spectra is one of the most versatile and widely used proteomics workflows: it can identify proteins, characterize post-translational modifications, and provide semi-quantitative measurements of relative protein abundance. This manuscript describes the concepts, prerequisites, and methods required to identify the proteins in a tandem mass spectrometry dataset by searching protein sequence databases with a tandem mass spectrometry search engine. The discussion covers extraction, preparation, and formatting of spectral datafiles; selection of appropriate search parameter settings; and basic interpretation of the results.

Acknowledgments

The preparation of this manuscript was supported, in part, by CPTI Grant R01 CA126189.

Copyright information

© 2011 Springer Science+Business Media, LLC

Cite this protocol

Edwards, N.J. (2011). Protein Identification from Tandem Mass Spectra by Database Searching. In: Wu, C., Chen, C. (eds) Bioinformatics for Comparative Proteomics. Methods in Molecular Biology, vol 694. Humana Press. https://doi.org/10.1007/978-1-60761-977-2_9

  • DOI: https://doi.org/10.1007/978-1-60761-977-2_9

  • Publisher Name: Humana Press

  • Print ISBN: 978-1-60761-976-5

  • Online ISBN: 978-1-60761-977-2

  • eBook Packages: Springer Protocols
