Abstract
To interpret glycopeptide tandem mass spectra, it is necessary to estimate the theoretical glycan compositions and peptide sequences that may be present, known collectively as the search space. The simplest approach is to build a naïve search space from sets of glycan compositions in public databases and to assume that the target glycoprotein is pure. Often, however, purified glycoproteins contain co-purified glycoprotein contaminants that can confound the assignment of tandem mass spectra made under naïve assumptions. In addition, there is an increasing need to characterize glycopeptides from complex biological mixtures. Fortunately, liquid chromatography-mass spectrometry (LC-MS) methods for glycomics and proteomics are now mature and accessible. We demonstrate the value of using an informed search space, built from measured glycomes and proteomes, for the interpretation of glycoproteomics data. We show this using α-1-acid glycoprotein (AGP) mixed into a set of increasingly complex matrices. As the mixture complexity increases, the naïve search space balloons and the ability to assign glycopeptides with acceptable confidence diminishes. In addition, it is not possible to identify glycopeptides that were not foreseen as part of the naïve search space. A search space built from released-glycan glycomics and proteomics data is smaller than its naïve counterpart while including the full range of proteins detected in the mixture. This maximizes the ability to assign glycopeptide tandem mass spectra with confidence. As the mixture complexity increases, the number of tandem mass spectra per glycopeptide precursor ion decreases, resulting in lower overall scores and reduced depth of coverage for the target glycoprotein. We suggest using AGP as a standard to gauge the effectiveness of analytical methods and bioinformatics search parameters for glycoproteomics studies.
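The combinatorial nature of the search space can be illustrated with a minimal Python sketch. It is not the software or procedure used in this work. It assumes that a search space is the set of (protein, peptide, site, glycan composition) candidates formed by pairing every sequon-containing tryptic peptide with every glycan composition. The function names, protein sequences, and glycan lists are made up for illustration. Here the naïve space pairs all candidate proteins with a database glycan list, and the informed space pairs only the proteins and glycans assumed to have been measured.

```python
import re

# N-glycosylation sequon: Asn, then any residue except Pro, then Ser or Thr.
SEQUON = re.compile(r"N(?=[^P][ST])")


def tryptic_peptides(sequence):
    """Cleave after K or R unless followed by P; no missed cleavages.

    Simplification: sequons that straddle a cleavage site are not handled.
    Splitting on an empty match requires Python 3.7 or later.
    """
    return [p for p in re.split(r"(?<=[KR])(?!P)", sequence) if p]


def build_search_space(proteins, glycans):
    """Pair every sequon in every tryptic peptide with every glycan."""
    space = set()
    for name, sequence in proteins.items():
        for peptide in tryptic_peptides(sequence):
            for site in SEQUON.finditer(peptide):
                for glycan in glycans:
                    space.add((name, peptide, site.start(), glycan))
    return space


# Toy inputs: these sequences and lists are invented, not real data.
database_proteins = {
    "target": "MKNGTAAKLLNPSR",
    "matrix_1": "AANVSGGKNQTLLR",
    "matrix_2": "GGNASLLKTTNFTR",
}
database_glycans = [
    "HexNAc2Hex5",
    "HexNAc4Hex5",
    "HexNAc4Hex5NeuAc1",
    "HexNAc4Hex5NeuAc2",
    "HexNAc5Hex6NeuAc3",
    "HexNAc6Hex7NeuAc4",
]

# Hypothetical measurements: proteins seen by proteomics and glycan
# compositions seen by released-glycan glycomics.
detected_proteins = {
    name: database_proteins[name] for name in ("target", "matrix_1")
}
measured_glycans = ["HexNAc4Hex5NeuAc2", "HexNAc5Hex6NeuAc3"]

naive = build_search_space(database_proteins, database_glycans)
informed = build_search_space(detected_proteins, measured_glycans)

print(len(naive), len(informed))  # prints "30 6"
```

In this toy case the candidate count is the number of sequon sites multiplied by the number of glycan compositions, so adding proteins or glycans to either list grows the space multiplicatively. A real informed space may also contain proteins or glycans that a naïve list omits, which this example does not show.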
Acknowledgments
Funding was provided from NIH grants P41GM105603 and R21CA177476. Thermo-Fisher Scientific provided access to the Q-Exactive Plus mass spectrometer used in this work.
Ethics declarations
Conflict of interests
The authors have no conflicts of interest.
Additional information
Published in the topical collection Glycomics, Glycoproteomics and Allied Topics with guest editors Yehia Mechref and David Muddiman.
Kshitij Khatri and Joshua A. Klein contributed equally to this work.
Electronic supplementary material
Below is the link to the electronic supplementary material.
ESM 1 (PDF 1129 kb)
Cite this article
Khatri, K., Klein, J.A. & Zaia, J. Use of an informed search space maximizes confidence of site-specific assignment of glycoprotein glycosylation. Anal Bioanal Chem 409, 607–618 (2017). https://doi.org/10.1007/s00216-016-9970-5
DOI: https://doi.org/10.1007/s00216-016-9970-5