A Statistical Comparison of SimTandem with State-of-the-Art Peptide Identification Tools
Similarity search in theoretical mass spectra generated from protein sequence databases is a widely accepted approach for identifying peptides from query mass spectra produced by shotgun proteomics. Since query spectra contain many inaccuracies and database sizes have grown rapidly in recent years, more accurate mass spectra similarity measures and database indexing techniques are still in demand. We present a statistical comparison of the parameterized Hausdorff distance with the freely available tools OMSSA and X!Tandem and with the cosine similarity. We show that a precursor mass filter combined with a modification of the previously proposed parameterized Hausdorff distance outperforms state-of-the-art tools in both the speed of search and the number of identified peptide sequences (even though the q-value is only 0.001). Our method is implemented in the freely available application SimTandem, which can be used in the TOPP framework based on OpenMS.
Keywords: peptide identification, tandem mass spectrometry, similarity search, parameterized Hausdorff distance, precursor mass filter, SimTandem
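To make the search pipeline named in the abstract more concrete, here is a minimal sketch of a precursor mass filter followed by cosine similarity scoring, the baseline measure mentioned above. It is not SimTandem's implementation and does not reproduce the parameterized Hausdorff distance. The function names, the candidate dictionary layout, the 2.0 Da tolerance and the 1.0 m/z bin width are all assumptions chosen for illustration.

```python
import math
from collections import Counter


def precursor_filter(query_mass, candidates, tol=2.0):
    """Keep candidates whose precursor mass is within tol (Da) of the query.

    The tolerance value is an illustrative assumption.
    """
    return [c for c in candidates if abs(c["mass"] - query_mass) <= tol]


def bin_spectrum(peaks, width=1.0):
    """Sum the intensities of (m/z, intensity) peaks into fixed-width bins."""
    bins = Counter()
    for mz, intensity in peaks:
        bins[int(mz // width)] += intensity
    return bins


def cosine_similarity(a, b):
    """Cosine of the angle between two binned spectra (0.0 if either is empty)."""
    dot = sum(value * b.get(key, 0.0) for key, value in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_candidates(query_mass, query_peaks, candidates, tol=2.0):
    """Filter candidates by precursor mass, then sort them by cosine score."""
    query_bins = bin_spectrum(query_peaks)
    kept = precursor_filter(query_mass, candidates, tol)
    scored = [
        (cosine_similarity(query_bins, bin_spectrum(c["peaks"])), c["name"])
        for c in kept
    ]
    return sorted(scored, reverse=True)
```

The filter runs first because it is cheap: candidates with an incompatible precursor mass are discarded before any spectrum comparison is computed, which is where most of the search time would otherwise go.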