Visualization, Inspection and Interpretation of Shotgun Proteomics Identification Results

  • Ragnhild R. Lereim
  • Eystein Oveland
  • Frode S. Berven
  • Marc Vaudel
  • Harald Barsnes
Chapter

Abstract

Shotgun proteomics is a high-throughput technique for protein identification, capable of identifying up to several thousand proteins in a single sample. Making sense of this volume of data requires proteomics analysis software that makes the results intuitively accessible to beginners and experienced scientists alike. This chapter explains where to start when analyzing shotgun proteomics data, focusing on the most common pitfalls in protein identification analysis and how to avoid them. Finally, it discusses how to move beyond the list of identified proteins and place the results in a broader biological context.

Keywords

  • Protein identification
  • Visualization
  • Protein annotation
  • Validation

Abbreviations

  • PTM: Post-Translational Modification
  • PSM: Peptide Spectrum Match
  • FDR: False Discovery Rate
  • FNR: False Negative Rate
  • GO: Gene Ontology
  • PI: Protein Inference

Copyright information

© Springer International Publishing Switzerland 2016

Authors and Affiliations

  • Ragnhild R. Lereim (1, 2, 3)
  • Eystein Oveland (2, 3, 4)
  • Frode S. Berven (1, 2, 3)
  • Marc Vaudel (1)
  • Harald Barsnes (1)

  1. Proteomics Unit, Department of Biomedicine, University of Bergen, Bergen, Norway
  2. KG Jebsen Centre for Multiple Sclerosis Research, Department of Clinical Medicine, University of Bergen, Bergen, Norway
  3. Norwegian Multiple Sclerosis Competence Centre, Department of Neurology, Haukeland University, Bergen, Norway
  4. Department of Clinical Medicine, University of Bergen, Bergen, Norway
