Comparing Peptide Spectra Matches Across Search Engines

  • Rune Matthiesen
  • Gorka Prieto
  • Hans Christian Beck
Protocol
Part of the Methods in Molecular Biology book series (MIMB, volume 2051)

Abstract

Mass spectrometry is highly efficient at sequencing small peptides generated by, for example, tryptic digestion of a complex mixture. Current instruments can generate 50,000–100,000 MS/MS spectra in a single run. Of these, ~30–50% are typically assigned to peptide matches at a 1% false discovery rate (FDR) threshold. Explaining the remaining spectra requires further research. We address here whether the 30–50% of spectra that are matched yield consensus matches across different database-dependent search pipelines. Although the majority of spectrum-to-peptide assignments concur across search engines, we conclude that database-dependent search engines still require improvement.
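
As an illustration of what a consensus check across search engines might look like, the following is a minimal Python sketch, not the authors' pipeline. The engine names, scan identifiers, and peptide sequences are invented for the example, and the input is assumed to be already filtered at the chosen FDR threshold. Treating the isobaric residues I and L as equivalent is also an assumption of this sketch.

```python
from collections import Counter

# Hypothetical input: for each search engine, a mapping from spectrum ID to
# the peptide it assigned (assumed to be pre-filtered, e.g., at 1% FDR).
psms = {
    "engine_A": {"scan1": "PEPTIDEK", "scan2": "LSSPATLNSR", "scan3": "AVDLLK"},
    "engine_B": {"scan1": "PEPTIDEK", "scan2": "LSSPATINSR", "scan4": "GGYTR"},
    "engine_C": {"scan1": "PEPTIDEK", "scan2": "LSSPATLNSR", "scan3": "AVDLLR"},
}


def normalize(peptide):
    """Map I to L so that isobaric I/L differences are not counted as conflicts."""
    return peptide.upper().replace("I", "L")


def classify_spectra(psms_by_engine):
    """Label each spectrum as 'consensus', 'conflict', or 'single'.

    'single'    : only one engine reported a match for the spectrum
    'consensus' : two or more engines matched it, all to the same peptide
    'conflict'  : two or more engines matched it, to different peptides
    """
    by_spectrum = {}
    for assignments in psms_by_engine.values():
        for spectrum_id, peptide in assignments.items():
            by_spectrum.setdefault(spectrum_id, []).append(normalize(peptide))

    labels = {}
    for spectrum_id, peptides in by_spectrum.items():
        if len(peptides) == 1:
            labels[spectrum_id] = "single"
        elif len(set(peptides)) == 1:
            labels[spectrum_id] = "consensus"
        else:
            labels[spectrum_id] = "conflict"
    return labels


labels = classify_spectra(psms)
counts = Counter(labels.values())
for label, n in counts.items():
    print(f"{label}: {n} of {len(labels)} matched spectra")
```

Spectra labeled as conflicts are the cases of interest here: each engine reports a confident match, yet the engines disagree on the peptide.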

Key words

Database-dependent search, Peptide assignments

Notes

Acknowledgments

R.M. is supported by Fundação para a Ciência e a Tecnologia (FCT investigator program 2012) and by iNOVA4Health (UID/Multi/04462/2013), a program financially supported by Fundação para a Ciência e Tecnologia/Ministério da Educação e Ciência through national funds and cofunded by FEDER under the PT2020 Partnership Agreement. This work is also funded by FEDER funds through the COMPETE 2020 Programme and by national funds through FCT (Portuguese Foundation for Science and Technology) under project numbers PTDC/BTM-TEC/30087/2017 and PTDC/BTM-TEC/30088/2017.

Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2020

Authors and Affiliations

  • Rune Matthiesen (1)
  • Gorka Prieto (2)
  • Hans Christian Beck (3)
  1. Computational and Experimental Biology Group, CEDOC, Chronic Diseases Research Centre, NOVA Medical School, Faculdade de Ciências Médicas, Universidade NOVA de Lisboa, Lisboa, Portugal
  2. Department of Communications Engineering, Faculty of Engineering of Bilbao, University of the Basque Country (UPV/EHU), Bilbao, Spain
  3. Department of Clinical Biochemistry and Pharmacology, Odense University Hospital, Odense C, Denmark
