SALSA: An Algorithm for Mining Specific Features of Tandem MS Data

When using Sequest and similar tools described in previous chapters, we typically start with peptide MS-MS data and ask, “What proteins do these peptides come from?” These programs are well suited to identifying proteins from peptide MS-MS data. The problem changes, however, if we want to do more than simply identify which proteins are present in a sample. Consider the following scenarios:
  • We know that our sample contains many proteins, but we wish to identify only those that bear some specific modification. The modification could be posttranslational, such as phosphorylation, or it could result from reaction with a drug or other chemical.

  • We want to identify peptides in a mixture that all share some sequence identity but may differ in other ways. For example, the mixture may contain peptides from both wild-type and mutant forms of a protein.

  • We know or suspect that our sample contains a particular protein, but we also suspect that it may be present in multiple modified forms and we wish to detect all of them.


