Volume 564 of the series Methods in Molecular Biology™ pp 245-259


Algorithms and Databases

  • Lennart Martens (EMBL Outstation - Hinxton, European Bioinformatics Institute)
  • Rolf Apweiler (EMBL Outstation - Hinxton, European Bioinformatics Institute), corresponding author



The capacity of proteomics methods and mass spectrometry instrumentation to generate data has grown substantially in recent years. This growth in data volume has in turn led to an increased reliance on software to identify peptide or protein sequences from the recorded mass spectra. Diverse algorithms can be applied to process these data, each performing a specific task such as spectrum quality filtering, spectral clustering and merging, assigning a sequence to a spectrum, or assessing the validity of these assignments.

The key algorithms in mass spectral processing pipelines are the ones that assign a sequence to a spectrum. The most commonly used variants of these depend crucially on the information contained in the sequence database, which they use as the basis for identification. Since these sequence databases are constructed in different ways and can therefore vary substantially in the amount and type of data they contain, they are also discussed here.
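To illustrate why such algorithms depend on the contents of the sequence database, the following is a minimal Python sketch of one step in a database search: selecting candidate peptides by precursor mass. It is not the method of any particular search engine. The function names, the toy database, and the mass tolerance are assumptions made for illustration. Trypsin is assumed as the protease (cleavage after K or R, except before P), with no missed cleavages and no modifications. A real search algorithm would go on to score each candidate against the fragment ions of the spectrum.

```python
# Monoisotopic residue masses (Da) for the 20 standard amino acids.
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056  # added once per peptide for the free termini


def tryptic_peptides(protein):
    """In silico tryptic digest: cleave after K or R unless followed by P."""
    peptides, start = [], 0
    for i, aa in enumerate(protein):
        last = i == len(protein) - 1
        if last or (aa in "KR" and protein[i + 1] != "P"):
            peptides.append(protein[start:i + 1])
            start = i + 1
    return peptides


def peptide_mass(peptide):
    """Neutral monoisotopic mass of an unmodified peptide."""
    return sum(MONO[aa] for aa in peptide) + WATER


def candidates(database, precursor_mass, tol_da=0.01):
    """Return (protein_id, peptide) pairs whose mass matches the precursor.

    `database` is a hypothetical dict mapping protein IDs to sequences.
    """
    hits = []
    for protein_id, sequence in database.items():
        for peptide in tryptic_peptides(sequence):
            if abs(peptide_mass(peptide) - precursor_mass) <= tol_da:
                hits.append((protein_id, peptide))
    return hits


# Toy database (hypothetical sequences, not real proteins).
toy_db = {"P1": "AGKPEPTIDERLLK", "P2": "GGGK"}
matches = candidates(toy_db, 372.2736)
```

Only peptides that can be derived from the database can ever be returned as candidates, so a sequence missing from the database cannot be identified, however good the spectrum is.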

Key words

Sequence database; Search algorithm; Mass spectrum; Clustering; Merging; Quality assignment; Tandem-MS; Identification; Protein; Peptide