Abstract
In recent years, data-independent acquisition (DIA) has emerged as a powerful analysis method in biological mass spectrometry (MS). Compared to the previously predominant data-dependent acquisition (DDA), it offers a way to achieve greater reproducibility, sensitivity, and dynamic range in MS measurements. To make DIA accessible to non-expert users, the multifunctional, automated, high-throughput pipeline DIAproteomics was implemented in the computational workflow framework “Nextflow” (https://nextflow.io). This allows high-throughput processing of proteomics and peptidomics DIA datasets on diverse computing infrastructures. This chapter provides a short summary and a usage protocol for the most important modes of operation of this pipeline for analyzing peptidomics datasets from the command line. In brief, DIAproteomics is a wrapper around the OpenSwathWorkflow and relies on spectral libraries that either already exist or are generated ad hoc from matching DDA runs. The OpenSwathWorkflow extracts chromatograms from the DIA runs and performs chromatographic peak-picking. Further downstream in the pipeline, these peaks are scored, aligned, and statistically evaluated for qualitative and quantitative differences across conditions, depending on the user’s interest. DIAproteomics is open-source and available under a permissive license. We encourage the scientific community to use or modify the pipeline to meet their specific requirements.
Acknowledgments
This work was supported by the German Ministry for Research and Education (BMBF) as part of the German Network for Bioinformatics infrastructure (FKZ: 31A535A) and by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) under Germany’s Excellence Strategy – EXC 2180-390900677. In addition, the work was initiated through a travel stipend by the Boehringer Ingelheim Fonds for basic research in medicine and supported by the Chan Zuckerberg Initiative program “Essential Open-Source Software for Science (EOSS).” We thank all nf-core and OpenMS team members for supporting the development and debugging of the pipeline as well as for the provision of the template.
Copyright information
© 2024 The Author(s), under exclusive license to Springer Science+Business Media, LLC, part of Springer Nature
About this protocol
Cite this protocol
Bichmann, L., Gupta, S., Röst, H. (2024). Data-Independent Acquisition Peptidomics. In: Schrader, M., Fricker, L.D. (eds) Peptidomics. Methods in Molecular Biology, vol 2758. Humana, New York, NY. https://doi.org/10.1007/978-1-0716-3646-6_4
DOI: https://doi.org/10.1007/978-1-0716-3646-6_4
Publisher Name: Humana, New York, NY
Print ISBN: 978-1-0716-3645-9
Online ISBN: 978-1-0716-3646-6
eBook Packages: Springer Protocols