Mapping Biological Networks from Quantitative Data-Independent Acquisition Mass Spectrometry: Data to Knowledge Pipelines

  • Erin L. Crowgey (Email author)
  • Andrea Matlock
  • Vidya Venkatraman
  • Justyna Fert-Bober
  • Jennifer E. Van Eyk
Part of the Methods in Molecular Biology book series (MIMB, volume 1558)


Data-independent acquisition mass spectrometry (DIA-MS) offers unique advantages for qualitative and quantitative proteome analysis of a biological sample, providing consistent sensitivity and reproducibility across large sample sets. These advantages in LC-MS/MS are being realized in both fundamental research laboratories and clinical research applications. However, translating high-throughput raw LC-MS/MS proteomic data into biological knowledge is a complex task that requires many algorithms and tools, for which there is no widely accepted standard and for which best practices are only slowly being implemented. At present, no single tool or approach captures the full range of information that proteomics uniquely supplies, including the dynamics of rapidly reversible chemical modifications of proteins, irreversible amino acid modifications, signaling truncation events, and the presence of proteins derived from allele-specific transcripts. This chapter highlights the key steps and publicly available algorithms required to translate DIA-MS data into knowledge.

Key words

Citrullination, Data-independent acquisition, Phosphorylation, Post-translational modifications, Protein networks, SWATH


Copyright information

© Springer Science+Business Media LLC 2017

Authors and Affiliations

  • Erin L. Crowgey (1), Email author
  • Andrea Matlock (2)
  • Vidya Venkatraman (2)
  • Justyna Fert-Bober (2)
  • Jennifer E. Van Eyk (2)

  1. Nemours Alfred I. DuPont Hospital for Children, Wilmington, USA
  2. Advanced Clinical BioSystems Research Institute, Cedars Sinai Medical Center, Heart Institute, Los Angeles, USA
