Methods in Molecular Biology™ Volume 564, 2009, pp 245-259
Date: 26 May 2009

Algorithms and Databases

Summary

The capacity of proteomics methods and mass spectrometry instrumentation to generate data has grown substantially in recent years. This growth in data volume has in turn led to an increased reliance on software to identify peptide or protein sequences from the recorded mass spectra. A variety of algorithms can be applied to process these data, each performing a specific task such as spectrum quality filtering, spectral clustering and merging, assigning a sequence to a spectrum, or assessing the validity of these assignments.

The key algorithms in mass spectral processing pipelines are those that assign a sequence to a spectrum. The most commonly used variants depend crucially on the information contained in the sequence database that they use as the basis for identification. Because these sequence databases are constructed in different ways, and can therefore vary substantially in the amount and type of data they contain, they are also discussed here.
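
The dependence of sequence assignment on database content can be illustrated with a minimal sketch. It is not the method of any particular search engine: it assumes a toy in-memory database, a simplified tryptic digestion rule (cleave after K or R unless followed by P, no missed cleavages), and matching on precursor mass alone, without fragment ion scoring. The function names, the example sequences, and the 0.01 Da tolerance are all hypothetical choices made for illustration.

```python
# Monoisotopic residue masses (Da) for the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056  # mass of H2O added to the residue sum of a free peptide


def tryptic_peptides(protein):
    """Simplified in silico digest: cleave after K/R unless followed by P."""
    peptides, start = [], 0
    for i, aa in enumerate(protein):
        next_aa = protein[i + 1] if i + 1 < len(protein) else ""
        if aa in "KR" and next_aa != "P":
            peptides.append(protein[start:i + 1])
            start = i + 1
    if start < len(protein):  # trailing peptide without a C-terminal K/R
        peptides.append(protein[start:])
    return peptides


def peptide_mass(peptide):
    """Neutral monoisotopic mass of an unmodified peptide."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER


def candidates(precursor_mass, database, tol=0.01):
    """Return (protein name, peptide) pairs whose mass is within tol Da."""
    hits = []
    for name, protein in database.items():
        for pep in tryptic_peptides(protein):
            if abs(peptide_mass(pep) - precursor_mass) <= tol:
                hits.append((name, pep))
    return hits


# Hypothetical toy databases: the second lacks the protein of interest.
full_db = {"protA": "MKTAYIAKQR", "protB": "GASPVK"}
reduced_db = {"protB": "GASPVK"}

observed = peptide_mass("TAYIAK")  # stand-in for a measured precursor mass
print(candidates(observed, full_db))
print(candidates(observed, reduced_db))
```

In this sketch the same observed mass yields a candidate against the full database and none against the reduced one, which is the sense in which a database-dependent algorithm can only identify what its database contains.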