Qscore: An algorithm for evaluating SEQUEST database search results

Abstract

A scoring procedure is described for measuring the quality of the results for protein identifications obtained from spectral matching of MS/MS data using the Sequest database search program. The scoring system is essentially probabilistic and operates by estimating the probability that a protein identification has come about by chance. The probability is based on the number of identified peptides from the protein, the total number of identified peptides, and the fraction of distinct tryptic peptides from the database that are present in the identified protein. The score is not strictly a probability, as it also incorporates information about the quality of the individual peptide matches. The result of using Qscore on a large test set of data was similar to that achieved using approaches that validate individual spectral matches, with only a narrow overlap in scores between identified proteins and false positive matches. In direct comparison with a published method of evaluating Sequest results, Qscore was able to identify an equivalent number of proteins without any identifiable false positive assignments. Qscore greatly reduces the number of Sequest protein identifications that have to be validated manually.
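The abstract names three inputs to the chance probability but does not give the formula. As an illustration only, one plausible reading is a binomial tail: if n peptides are identified in total and a fraction f of the distinct tryptic peptides in the database belong to a given protein, the probability that k or more of the n matches land in that protein by chance is a sum of binomial terms. The sketch below makes that assumption. The function names, the negative log transform, and the binomial model itself are hypothetical and are not taken from the paper, and the peptide match quality term that the abstract mentions is left out.

```python
import math


def chance_probability(k: int, n: int, f: float) -> float:
    """Binomial tail P(X >= k) for X ~ Binomial(n, f).

    k: identified peptides assigned to the protein
    n: total identified peptides
    f: fraction of distinct tryptic peptides in the database
       that belong to the protein
    This is an assumed model, not the published Qscore formula.
    """
    if not 0.0 <= f <= 1.0:
        raise ValueError("f must lie in [0, 1]")
    if not 0 <= k <= n:
        raise ValueError("k must satisfy 0 <= k <= n")
    tail = sum(
        math.comb(n, i) * f**i * (1.0 - f) ** (n - i)
        for i in range(k, n + 1)
    )
    # Guard against floating-point rounding slightly above 1.
    return min(1.0, tail)


def illustrative_score(k: int, n: int, f: float) -> float:
    """Hypothetical score: negative log10 of the chance probability."""
    p = chance_probability(k, n, f)
    return math.inf if p == 0.0 else -math.log10(p)


if __name__ == "__main__":
    # A protein holding 0.1% of the database's tryptic peptides,
    # in a run with 200 identified peptides overall.
    for k in (1, 2, 3):
        print(k, illustrative_score(k, 200, 0.001))
```

Under this assumed model the score rises with each additional peptide assigned to the same protein, which matches the abstract's description of a measure that separates multi-peptide identifications from chance matches.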

Copyright information

© American Society for Mass Spectrometry 2002

Authors and Affiliations

  • Roger E. Moore (1)
  • Mary K. Young (1)
  • Terry D. Lee (1)

  1. Division of Immunology, Beckman Research Institute of the City of Hope, Duarte, USA
