An Automated, High-Throughput Method for Interpreting the Tandem Mass Spectra of Glycosaminoglycans

Duan, Jiana; Amster, I. Jonathan

doi:10.1007/s13361-018-1969-z

Focus: Application of Photons and Radicals for MS: Research Article
Published: 22 May 2018

Volume 29, pages 1802–1811, (2018)

Journal of The American Society for Mass Spectrometry

Abstract

The biological interactions between glycosaminoglycans (GAGs) and other biomolecules are heavily influenced by structural features of the glycan. The structure of GAGs can be assigned using tandem mass spectrometry (MS²), but analysis of these data, to date, requires manual interpretation, a slow process that presents a bottleneck to the broader deployment of this approach to solving biologically relevant problems. Automated interpretation remains a challenge, as GAG biosynthesis is not template-driven, and therefore, one cannot predict structures from genomic data, as is done with proteins. The lack of a structure database, a consequence of the non-template biosynthesis, requires a de novo approach to interpretation of the mass spectral data. We propose a model for rapid, high-throughput GAG analysis by using an approach in which candidate structures are scored for the likelihood that they would produce the features observed in the mass spectrum. To make this approach tractable, a genetic algorithm is used to greatly reduce the search space of isomeric structures that are considered. The time required for analysis is significantly reduced compared to an approach in which every possible isomer is considered and scored. The model is coded in a software package using the MATLAB environment. This approach was tested on tandem mass spectrometry data for long-chain, moderately sulfated chondroitin sulfate oligomers that were derived from the proteoglycan bikunin. The bikunin data were previously interpreted manually. Our approach examines glycosidic fragments to localize SO₃ modifications to specific residues and yields the same structures reported in the literature, only much more quickly.

Introduction

Glycosaminoglycans (GAGs) are linear, polydisperse carbohydrates consisting of a repeating uronic sugar and amino sugar copolymer. GAGs serve a multitude of roles in biology including cell-cell and cell-matrix interactions, generation of energy, changes in protein binding conformation, and molecular recognition [1,2,3]. Certain GAGs have also been observed as potential biomarkers for disease states [4]. The degree of GAG-protein binding has been shown to be highly dependent on GAG structure and, more specifically, on the position of modifications within the generic repeating copolymer chain [5, 6].

Despite the simple polymeric backbone in GAGs, a single sugar residue can exhibit varying levels of three key modifications, namely O-sulfation, N-deacetylation/sulfation, and uronic sugar stereochemistry [2]. Moreover, the biosynthesis of GAGs is not template driven, resulting in non-uniform dispersion of these modifications across the chain [7, 8]. Database-derived approaches are widely used for protein mass spectra assignment (either top-down or bottom-up) due to the predictability of amino acid sequences from genome sequences but fail when applied to biomolecules whose production is not template-derived [9, 10]. In contrast to the approaches that are successful for protein/peptide analysis, a de novo approach is required for the computer-based analysis of the tandem mass spectra of GAGs.

Considerable progress has been made in GAG analysis using mass spectrometry [1, 11]. At the MS¹ level, a parts per million accurate mass measurement, using high-resolution instruments such as Fourier transform ion cyclotron resonance mass spectrometry (FTICR-MS), allows assignment of composition, from which GAG chain length, number of modifications, and types of modification can be assigned [12]. Tandem MS (MS²) of GAGs using various ion activation methods, such as collision-induced dissociation (CID) [13,14,15], infrared multiphoton dissociation [16,17,18,19], electron-detachment dissociation (EDD) [16, 18,19,20,21,22,23,24], and negative-electron transfer dissociation (NETD) [25,26,27], yields structurally informative fragment ions [28]. Glycosidic bond fragmentation provides monosaccharide composition, while cross-ring fragmentation is used to assign the location of modifications within each residue [29]. Because this is a de novo analytical approach, complete structure analysis requires an information-rich mass spectrum that contains sufficient fragment peaks to fully assign all the variable features. Recent developments in ion activation for GAGs have led to a variety of approaches to produce informative MS² spectra [21, 23, 28, 30]. However, the interpretation of such complex mass spectra is generally a tedious manual process that relies upon the expertise of the data analyst. A better understanding of the structural features that promote GAG activity would benefit from an automated, accurate and high-throughput analytical process.

The complexity of the data sets and the time required for analysis increase dramatically as the chain length and the number of modifications increase. Two families of GAGs, heparin/heparan sulfate (Hp/HS) and chondroitin/dermatan sulfate (CS/DS), often contain large numbers of labile sulfate modifications. For these compounds, conventional MS² methods are often inadequate for complete structural determination, either because they do not produce a comprehensive set of fragment ions required to assign all variable features or because they lead to decomposition products that confound the analysis [8, 31]. For example, fragmentation can be accompanied by decomposition of sulfomodifications, producing peaks that are reduced in mass by multiples of 80 mass units but match the mass of standard glycosidic fragments of their counterparts with fewer sulfate modifications [28, 32]. If one does not recognize the peaks that arise from such decomposition, incorrect structural assignments will result. Common de novo strategies that have been successful for protein sequencing [25, 33,34,35] will inevitably be exposed to substantially more false positives due to the high likelihood of SO₃ loss fragments in GAG MS and MS². Na⁺/H⁺ exchange has been shown to decrease SO₃ loss and makes characterization of highly sulfated species possible [30]; however, SO₃ loss is almost always observed in MS² spectra.

An alternative to the above approach to interpretation is to generate a list of possible fragment peaks for a candidate structure and to score the match with the experimental data. This process can be repeated for all possible isomers having a given elemental composition. Comparison of the experimental MS² against the theoretical fragment list allows us to rank each permutation based on closeness-of-fit to the experimental results. This method becomes impractical to perform manually when the number of possible permutations for a composition exceeds the capability to examine the data. For example, Arixtra, a heparin with five monosaccharides, is the largest highly sulfated GAG to have complete mass spectral characterization [30]. The total number of possible permutations for a GAG scales logarithmically with respect to chain length. For both chondroitin/dermatan sulfate and heparan sulfate/heparin, the number of permutations for a given chain length and number of modifications is calculated as n-choose-k combinations, where n is the number of possible modifiable sites and k is the number of modifications:

$$ N_{\mathrm{total}} \propto \log N_{\mathrm{chain\ length}} \tag{1} $$

$$ \binom{n}{k} = \frac{n!}{k!\,(n-k)!} \tag{2} $$
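As a rough illustration of how quickly Eq. (2) grows, the short Python sketch below evaluates the binomial coefficient for a few site counts. It is not part of the MATLAB software described in this paper. The site counts (21, 61, and 75) are assumptions chosen for illustration: 21 and 61 sites with 5 sulfates reproduce the two dp43 counts quoted in Results and Discussion, and 75 sites corresponds to an assumed 25 disaccharides with three O-sulfation positions each.

```python
from math import comb


def count_isomers(n_sites: int, n_mods: int) -> int:
    """Ways to place n_mods identical modifications on n_sites sites (Eq. 2)."""
    return comb(n_sites, n_mods)


def count_all_patterns(n_sites: int) -> int:
    """Total number of patterns when the modification count is unknown,
    i.e. the sum of Eq. 2 over k = 0..n_sites."""
    return sum(comb(n_sites, k) for k in range(n_sites + 1))


if __name__ == "__main__":
    # Assumed site counts; see the lead-in paragraph.
    print(count_isomers(21, 5))    # one candidate site per disaccharide
    print(count_isomers(61, 5))    # several candidate sites per disaccharide
    print(f"{count_all_patterns(75):.2e}")  # composition not yet assigned
```

Fixing the composition first (known k) removes most of the search space before any MS² scoring is attempted, which is why the composition assignment step precedes the genetic algorithm.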

Tools for comparison of user-input structures with fragment peaks from tandem MS have been developed [12, 36, 37], but the requirement for a known starting structure limits applicability for high-throughput analysis.

To address this bottleneck for high-throughput sequencing of GAGs, efforts in computer-assisted methods look to improve upon the speed of analysis and to reduce the amount of user input and supervision. Several software packages have been developed to overcome modern challenges in GAG analysis, although a few require additional steps at the experimental level for optimal software performance. The heparin/HS oligosaccharide sequencing tool (HOST) [38] is a computational tool designed for sequencing heparin/HS oligosaccharides using enzymatic digestion combined with ESI-MSⁿ. The method scores and returns the best matching sequences of GAGs based on disaccharide composition analysis, yielding predicted compositions and calculating expected fragmentation patterns in silico. The theoretical fragments are then compared to heparin/HS oligosaccharide MSⁿ data and scored to return the most likely sequence. However, disaccharide analysis requires complete enzymatic digestion of the GAG using heparin lyases I, II and III over multiple hours of incubation (16 h), limiting the method’s overall speed and applicability in a high-throughput GAG analysis platform.

Another piece of software known as GAG-ID [39] has been shown to discriminate and identify 21 synthetic tetrasaccharides eluted from LC-MS/MS using a scoring system based on peak intensities. It is the first of its kind to automate the interpretation of mixtures when coupled to LC-MS/MS but requires complete chemical derivatization of the GAG, replacing all labile sulfate modifications with more stable acetyl groups. As with the enzymatic digestion required by HOST, derivatization may not be a viable option for universal GAG analysis.

HS-SEQ [40] is a de novo GAG sequencing computation framework that has been used to automate the structural identification of HS of dp4, 5, 6, 8, and 15. The method determines a precursor sequence (unmodified GAG backbone) and uses information from the tandem MS to best assign possible sulfate and acetate modifications. Assignments are made based on confidence values and are used to generate a list of top candidates. This is the first GAG software that requires only the tandem MS for sequence information. Although HS-SEQ is certainly a high-throughput option, structural assignment conflicts can arise in the form of sulfate loss fragments, internal fragments, or random matches. The authors of HS-SEQ note that the software removes the assignments with lower confidence to resolve conflicting assignments, but they also believe that this may produce false hits when examining samples extracted from biological sources.

The software developed in our laboratory is designed to sequence GAGs of indefinite length by comparing fragments of theoretical structures (in silico) against experimental data without the need for construction of a database, instead using a genetic algorithm optimization technique to limit the number of permutations while keeping analysis time to a maximum of a few minutes. The method assigns structures based on greatest likelihood using fragment ion products as a critical parameter for the genetic algorithm fitness criterion. Fragments that are in direct conflict with the highest scoring structure(s) are not discarded but reviewed again for possible additional components. We have tested this approach on MS² data from intact CS chains released from the proteoglycan, bikunin. These chains vary in length from 27 to 43 saccharide residues, and vary in the degree of O-sulfomodification from 4 to 7, and thus represent a challenging test of this automated procedure.

Experimental Methods

Mass Spectrometry Analysis

Bikunin GAG MS and MS² data reported in [41] was used as a proof-of-principle data set for the purposes of testing genetic algorithm efficacy. The monoisotopic peaks were selected via the SNAP algorithm from Bruker DataAnalysis software. Analysis of the MS² was performed with the software alone and with no user supervision or assistance.

Computational Methods

MS¹ analysis of parent ion mass is performed using a composition assignment software module written in the MATLAB coding environment. Monoisotopic peaks and charge states are acquired from Bruker DataAnalysis and deconvoluted to a neutral mass. A composition is derived from one or more neutral mass(es) by searching a data matrix of possible chain lengths, degrees of sulfation, deacetylation, and sodium/hydrogen exchange. The user input also includes the possibility of reducing end modifications, and nonreducing ends that can terminate in unsaturated uronic acids, as is common in enzymatically produced GAG oligomers. Theoretical neutral masses in the spreadsheet are compared against user specified masses with a user-defined mass tolerance. The sequences that match are then used for performing the MS² analysis.
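The composition search described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the MATLAB module itself: it assumes negative-mode ions of the form [M − zH]ᶻ⁻, considers only an unlinked chondroitin sulfate chain of HexA-GalNAc disaccharides with at most one sulfate per disaccharide, and ignores deacetylation, Na/H exchange, and end modifications. The function names and the 5 ppm default tolerance are hypothetical; the monoisotopic residue masses are standard values.

```python
PROTON = 1.007276    # Da
HEXA = 176.03209     # uronic acid residue, C6H8O6
GALNAC = 203.07937   # N-acetylgalactosamine residue, C8H13NO5
SO3 = 79.95682       # sulfate modification
WATER = 18.01056     # H2O carried by the free chain termini


def neutral_mass(mz, charge):
    """Deconvolute an assumed [M - zH]^z- ion to its neutral monoisotopic mass."""
    return charge * (mz + PROTON)


def ppm_error(observed, theoretical):
    return (observed - theoretical) / theoretical * 1e6


def candidate_compositions(max_disaccharides):
    """Yield (n_disaccharides, n_sulfates, neutral mass) for a simplified CS chain."""
    for n in range(1, max_disaccharides + 1):
        for s in range(n + 1):  # assumption: at most one sulfate per disaccharide
            yield n, s, n * (HEXA + GALNAC) + s * SO3 + WATER


def assign_composition(mz, charge, tol_ppm=5.0, max_disaccharides=25):
    """Return every (n_disaccharides, n_sulfates) whose mass matches within tol_ppm."""
    observed = neutral_mass(mz, charge)
    return [(n, s)
            for n, s, mass in candidate_compositions(max_disaccharides)
            if abs(ppm_error(observed, mass)) <= tol_ppm]
```

A full implementation would extend the enumeration with the additional variables named in the text (deacetylation, sodium/hydrogen exchange, reducing-end and nonreducing-end modifications).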

For MS² assignment, we implement a genetic algorithm based on fundamental aspects common to all genetic algorithms [42,43,44]. The software uses a binary vector to represent glycan structures, where on-bits denote an occupied site of SO₃ modification. The first step generates two glycan structures at random that fit the expected composition (initialization step) and then proceeds to “breed” these structures into a new generation of candidates (crossover step). The new generation is also subject to potential mutations in the form of exchanges between on- and off-bits (mutation step) in an effort to avoid converging upon a local maximum. Theoretical structures created in the crossover and mutation steps are then tested against the experimental MS² data, where the score of each structure is determined based on a closeness-of-fit paradigm (fitness). The scoring system is subject to various factors that will be discussed in detail in future papers. In the case of bikunin, the score of a structure is a naïve model that determines the top candidate based on the number of matching glycosidic fragments. The three primary steps (crossover, mutation, and fitness) are iterated until the maximum fitness value does not change after numerous cycles. The number of iterations required before termination of the algorithm can be defined by the user but defaults to a value of 3. The structure(s) with the highest scores are then examined using additional data interpretation tools that assign fragment peak masses alongside their charge, intensity, and mass error (in ppm).
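The initialization, crossover, mutation, and fitness steps can be sketched as follows. This is a simplified Python illustration, not the authors' MATLAB code: the population size, mutation rate, single-point crossover with a repair step, and elitism are all assumptions made for the example, and `fitness` is any caller-supplied function that scores a binary vector against the MS² data. The `patience` parameter plays the role of the termination setting described above (default 3).

```python
import random


def random_structure(n_sites, n_sulfates, rng):
    """Random binary vector with exactly n_sulfates on-bits (initialization)."""
    bits = [1] * n_sulfates + [0] * (n_sites - n_sulfates)
    rng.shuffle(bits)
    return bits


def repair(bits, n_sulfates, rng):
    """Flip randomly chosen bits until the composition is restored."""
    while sum(bits) != n_sulfates:
        flip_from = 1 if sum(bits) > n_sulfates else 0
        idx = rng.choice([i for i, b in enumerate(bits) if b == flip_from])
        bits[idx] = 1 - flip_from


def crossover(a, b, n_sulfates, rng):
    """Single-point crossover, then repair so the child fits the composition."""
    cut = rng.randrange(1, len(a))
    child = a[:cut] + b[cut:]
    repair(child, n_sulfates, rng)
    return child


def mutate(bits, rate, rng):
    """With probability `rate`, exchange one on-bit with one off-bit."""
    child = bits[:]
    if rng.random() < rate and 0 < sum(child) < len(child):
        on = rng.choice([i for i, b in enumerate(child) if b == 1])
        off = rng.choice([i for i, b in enumerate(child) if b == 0])
        child[on], child[off] = 0, 1
    return child


def run_ga(fitness, n_sites, n_sulfates, pop_size=20, patience=3,
           mutation_rate=0.3, seed=0, max_generations=500):
    """Iterate crossover, mutation, and scoring until the best score is
    unchanged for `patience` consecutive generations."""
    rng = random.Random(seed)
    population = [random_structure(n_sites, n_sulfates, rng) for _ in range(2)]
    best = max(population, key=fitness)
    stale = generations = 0
    while stale < patience and generations < max_generations:
        parents = sorted(population, key=fitness, reverse=True)[:2]
        children = [mutate(crossover(parents[0], parents[1], n_sulfates, rng),
                           mutation_rate, rng) for _ in range(pop_size)]
        population = parents + children  # keep the parents (elitism)
        challenger = max(population, key=fitness)
        if fitness(challenger) > fitness(best):
            best, stale = challenger, 0
        else:
            stale += 1
        generations += 1
    return best
```

Because every operator preserves the number of on-bits, each candidate always matches the composition assigned at the MS¹ level, so the search never leaves the set of valid isomers.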

Experimental MS² data collected by FTICR are extracted from Bruker Apex user interface software using the SNAP peak-picking algorithm. Monoisotopic peak masses and intensities are extracted in the form of comma-separated value (.csv) files. The MATLAB software prompts the user for a .csv file containing mass-to-charge in column 1 and intensity in column 2, with mass-to-charge sorted in ascending order. Parent ion mass and charge must be provided by the user, as well as the mass of any linker region on the reducing end. Composition details (chain length and the numbers of sulfate groups, N-acetyl groups, and Na/H exchanges) are calculated by a composition calculation module and then given to the software in the preliminary step before initializing the genetic algorithm.
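The expected input layout (mass-to-charge in column 1, intensity in column 2, ascending mass-to-charge) can be checked with a small reader such as the Python sketch below. The function name is hypothetical, and skipping non-numeric rows such as a header line is an assumption about the file, not a documented behavior of the authors' software.

```python
import csv
import io


def read_peak_list(handle):
    """Parse (m/z, intensity) rows from an open CSV file or file-like object.

    Rows that are not numeric in the first two columns (for example a header
    or a blank line) are skipped.
    """
    peaks = []
    for row in csv.reader(handle):
        try:
            peaks.append((float(row[0]), float(row[1])))
        except (ValueError, IndexError):
            continue
    if any(a[0] > b[0] for a, b in zip(peaks, peaks[1:])):
        raise ValueError("m/z values must be sorted in ascending order")
    return peaks
```

For a file on disk, the reader can be called as `read_peak_list(open(path, newline=""))`; an in-memory `io.StringIO` works equally well for testing.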

For bikunin proteoglycan a linker mass of 641.1473 (Gal4S-Gal-Xyl-Serine) was used with the remainder of the bikunin chain length represented as a binary vector.

The software integrates separate functional modules that calculate the masses of theoretical fragment ions, perform the standard genetic algorithm operations, and score theoretical structures against experimental data.
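As an illustration of the fragment mass module, the Python sketch below builds a ladder of neutral reducing-end glycosidic fragment masses from a binary vector and a linker mass. It rests on several assumptions that are not stated in the text: residues alternate uronic acid and GalNAc outward from the linker, each bit refers to one GalNAc counted from the reducing end, the linker mass already includes the reducing-terminal atoms (so a Y-type mass is the linker mass plus the residues added so far), and Z-type masses are taken as the Y-type masses minus water. Function names are hypothetical.

```python
HEXA = 176.03209     # uronic acid residue, C6H8O6 (Da)
GALNAC = 203.07937   # N-acetylgalactosamine residue, C8H13NO5 (Da)
SO3 = 79.95682       # sulfate modification (Da)
WATER = 18.01056     # H2O (Da)


def y_ladder(bits, linker_mass):
    """Neutral masses of reducing-end (Y-type) fragments, shortest first.

    bits[i] == 1 means the i-th GalNAc from the reducing end carries SO3.
    """
    masses = []
    running = linker_mass
    for bit in bits:
        running += HEXA                  # fragment ending in a uronic acid
        masses.append(running)
        running += GALNAC + bit * SO3    # fragment ending in a GalNAc
        masses.append(running)
    return masses


def z_ladder(bits, linker_mass):
    """Z-type masses, assumed here to be the Y-type masses minus water."""
    return [mass - WATER for mass in y_ladder(bits, linker_mass)]
```

Calling `y_ladder(bits, 641.1473)` with the bikunin linker mass quoted above gives one candidate's reducing-end ladder; moving an on-bit shifts every fragment between the old and new site by the mass of SO₃, which is what lets glycosidic fragments localize a sulfate to a residue.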

Results and Discussion

As GAG chain length and modification increases, the number of possible structural permutations exceeds a value suitable for practical, computationally efficient search methods. For the chondroitin sulfate oligomers studied here, the number of structural possibilities is as large as 3.7E22 for an oligomer of length 50 (Eq. (2)). The number of possibilities is narrowed down when composition can be assigned and the number of known sulfate modifications is determined. While the paradigm for comparing theoretical structures against experimental data can differ, a minimum number of elements such as fragment type, fragment intensity, and sequence coverage must be considered for complete GAG characterization [45]. Thus, instead of trying to shortcut these facets of analysis, we chose an approach that reduces the total search space. Hundreds of millions of structures may exist for a specific GAG composition, but for a pure sample, only one of these structures is a valid assignment. The impracticality of searching through a massive number of incorrect structures is reduced dramatically when a genetic algorithm search heuristic is applied [44].

The genetic algorithm is an optimization tool that has been used for a wide variety of applications [46,47,48,49,50,51]. It mimics the evolutionary process, by using a survival of the fittest mechanism that quickly eliminates large groups of candidates from a pool if they share a feature that does not meet a specific set of criteria [44]. Here we examine the application of this approach to GAG MS² analysis. We have developed software in the MATLAB coding environment that utilizes the genetic algorithm. GAG sequences are expressed as a binary code where on-bits (1s) and off-bits (0s) represent the presence or absence of modifications, respectively, and can be applied to both CS/DS and HS/Hp GAG classes, Figure 1 [42, 43]. The binary sequence is shortened or lengthened to accommodate the appropriate composition calculated from the parent-ion mass. The number of on- and off-bits in the genome is also adjusted based on the number of modifications observed. The final structure is determined via a genetic algorithm, the workflow for which is shown in Figure 2.

Improvements in analysis time and search space reduction can be observed using CID MS² data from several fractions of intact CS chains for the proteoglycan bikunin [41]. The advantage of using these data is threefold. First, the mass spectra are rich in structurally informative fragments. Structural assignment of bikunin from MS² was done previously with manual de novo analysis of these fragments. Software suitable for analysis should make the same assignments using these fragments without any user supervision. A second advantage is that modifications are limited to a single sulfate group per disaccharide. Sulfate modifications have been shown to occur only on the 4-O position of the amino sugar using enzymatic disaccharide analysis. Reducing the total number of possible modifications diminishes the search space dramatically. For example, a CS dp43 with 5 sulfate groups has 20,349 possible structures when only examining the occupancy of the 4-O position but 5,949,147 possible structures when every sulfate position (2-O, 4-O, 6-O) is taken into consideration. A simplified search space allows us to demonstrate proof of principle while still maintaining computational efficiency. Finally, the structures of bikunin fractions have been manually verified and reported in the literature [41]. A common motif among bikunin fractions was observed after manual sequence analysis. We were particularly interested to see if the unsupervised approach with our software also yielded these same patterns. Candidate structures of bikunin GAGs produced in the genetic algorithm cycles are assigned scores based on the number of matched glycosidic fragments in the experimental data. The fitness of a candidate structure is determined using three separate tiers of scoring:

$$ f_1 = \sum_{i=1}^{dp} N_{\mathrm{RE}} - \sum_{i=1}^{dp} N_{\mathrm{RE}+\mathrm{SO}_3} \tag{3} $$

$$ f_2 = \sum_{i=1}^{dp} N_{\mathrm{NRE}} - \sum_{i=1}^{dp} N_{\mathrm{NRE}+\mathrm{SO}_3} \tag{4} $$

$$ f_3 = \sum_{i=1}^{dp} I_{\mathrm{glyc}} \tag{5} $$

Unambiguous mass tags such as the linker region dictate that greater emphasis should be placed on the reducing end (Y and Z fragments), which provides a more valid structural assignment. The primary fitness of a structure is therefore based on its calculated f₁ value, which considers the number of glycosidic fragments from the reducing end (N_RE) that are matched in the experimental data. The software then checks whether any match is potentially a sulfate decomposition peak by adding the mass of an SO₃-H exchange (79.9568 Da) and searching the experimental data again for a matching mass. The value of f₁ is then reduced by the number of peaks determined to be a product of sulfate decomposition (N_RE + SO3).

If the value of f₁ is tied among multiple structures, a secondary ranking is then determined with f₂, the value of which is based on the number of glycosidic matches from the non-reducing end (B and C fragments). As in the calculation of f₁, potential sulfate decompositions are taken into account. Non-reducing end fragments are a tier below reducing end fragments since they could potentially match internal fragments due to the lack of an unambiguous mass tag. Incorrect assignment of internal fragments as non-reducing end fragments limits the validity of assignment.

A tertiary score f₃ is used after matching glycosidic fragments from both reducing and non-reducing ends. Typically, a small selection of candidate structures (2–4) may end up with equal f₁ and f₂ values, in which case the summation of the intensities of all matched glycosidic fragments is the tiebreaker. This simple algorithm can and should be continuously fine-tuned for other purposes as software development continues but is sufficient for proof-of-principle purposes.
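The three tiers of Eqs. (3) to (5) can be sketched in Python as a tuple that is compared lexicographically, so that f₂ only matters when f₁ is tied and f₃ only when both are tied. This is an illustrative sketch rather than the authors' implementation. It assumes that the experimental peaks have been converted to neutral masses, that a match means agreement within a ppm tolerance (the 10 ppm default is hypothetical), and that f₃ sums the intensities of all matched fragments, including those flagged as possible SO₃-loss products.

```python
SO3 = 79.9568  # Da, the value quoted in the text


def find_peak(mass, peaks, tol_ppm):
    """Intensity of the first (mass, intensity) peak within tol_ppm, else None."""
    for peak_mass, intensity in peaks:
        if abs(peak_mass - mass) / mass * 1e6 <= tol_ppm:
            return intensity
    return None


def tier_score(theoretical, peaks, tol_ppm):
    """Return (matches minus suspected SO3-loss matches, summed match intensity)."""
    matched = suspect = 0
    intensity_sum = 0.0
    for mass in theoretical:
        intensity = find_peak(mass, peaks, tol_ppm)
        if intensity is None:
            continue
        matched += 1
        intensity_sum += intensity
        # A peak 79.9568 Da higher suggests this match may be a decomposition product.
        if find_peak(mass + SO3, peaks, tol_ppm) is not None:
            suspect += 1
    return matched - suspect, intensity_sum


def fitness(re_fragments, nre_fragments, peaks, tol_ppm=10.0):
    """Tiered score (f1, f2, f3); Python compares tuples element by element."""
    f1, intensity_re = tier_score(re_fragments, peaks, tol_ppm)
    f2, intensity_nre = tier_score(nre_fragments, peaks, tol_ppm)
    return (f1, f2, intensity_re + intensity_nre)
```

Candidates can then be ranked with `max(candidates, key=...)` or `sorted(..., reverse=True)` on the returned tuples, which reproduces the primary, secondary, and tertiary ordering without any explicit tie-breaking logic.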

Eleven bikunin samples of different compositions were tested using the genetic algorithm. Of these 11, the single highest scoring candidate of the genetic algorithm for 9 of these samples matched the structures reported in literature. Without user supervision, the genetic algorithm results also reaffirm the common bikunin motif reported in literature [41], Figure 3. For the remaining two samples, the genetic algorithm software reported multiple top-scoring candidates. MS² data for these two samples could not unambiguously differentiate these structures; however, the structures reported in literature for these samples were present among the top-scoring candidates. This highlights the importance of data quality for optimal software performance. A lack of informative fragmentation peaks can result in structural ambiguities, but information-rich mass spectra can be interpreted with minimal trouble. However, a genetic algorithm approach has no theoretical minimum for data quality. Spectra not containing sufficient fragmentation for complete glycan characterization can still be interpreted based on available fragment ions and a partial sequence can be generated. Although the spectral quality of bikunin GAG tandem MS is high, more complex and longer chain intact GAGs of proteoglycans may yield less than the full suite of fragments necessary for complete sequencing. In this event, our approach can still be used to determine some portion of the overall glycan structure, as has been done recently for decorin glycans [52].

In addition to matching previously reported structures, a closer examination of other high-scoring candidate structures among samples shows a consistent motif across compositions. Additional structural motifs shown in Figure 4 consistently score within the top five structures of the genetic algorithm. These alternate structures are the ones consisting of similar f₁ and f₂ scores but have low-intensity values for some of their fragment matches (affecting the value of f₃). The high degree of similarity between the primary component identified in literature and the alternate structures may be a result of (A) our scoring method being favored towards reducing end fragments, (B) assigning low-intensity noise peaks as glycosidic fragments, or (C) the possibility of a mixture containing some minor components.

Figure 5 compares the analysis time of the genetic algorithm with that of an exhaustive search of every possible permutation of a composition. The genetic algorithm finds the correct answer within a small fraction of the time (0.9–2.5% on average) required to examine every possible structure, under the assumption that sulfation only occurs on the 4-O position of the N-acetylgalactosamine. The decrease in search time is primarily due to the rapid elimination of unlikely features from the genetic algorithm gene pool. As reported [41], bikunin’s sulfation occurs near the reducing end. Isomeric structures that contain sulfate groups in the non-reducing end ranked lowest in the scoring process, resulting in rapid elimination of a test structure and all structures of similar sulfation patterns within a single iteration. A greater number of iterations were spent refining high-scoring structures once poorly scored structures had been eliminated from consideration. The algorithm is designed to rerun the entire genetic process from scratch multiple times in order to avoid plateauing at local maxima. Convergence upon the same highest scoring structure 5 times was the baseline criterion for an acceptable structural assignment. The repetition number is also a user-adjustable parameter.
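The restart strategy can be sketched in Python as a wrapper around any single genetic algorithm run. This is a hedged illustration: the function name, the `max_runs` safeguard, and the interpretation of the criterion as five hits in total (rather than five consecutive hits) are assumptions, and `run_once` stands in for whatever routine performs one complete run from a fresh random start.

```python
from collections import Counter


def consensus_assignment(run_once, required_hits=5, max_runs=100):
    """Repeat independent runs until one structure has been returned
    `required_hits` times.

    run_once(run_index) must return the best binary vector of one full run;
    run_index can be used as the random seed of that run.
    Returns (structure, runs_used), or (None, max_runs) if no consensus.
    """
    tally = Counter()
    for run_index in range(max_runs):
        best = tuple(run_once(run_index))
        tally[best] += 1
        if tally[best] >= required_hits:
            return list(best), run_index + 1
    return None, max_runs
```

Runs that stall on a local maximum contribute scattered votes, whereas the global optimum tends to accumulate repeated hits, so the tally doubles as a rough indicator of how ambiguous the spectrum is.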

Of particular significance, the efficiency of this approach is found to increase as the total number of permutations increases. For a pure sample, only a single structure can be assigned to the MS² spectrum, but the number of structures with drastically different modification patterns increases with respect to chain length. An increase in chain length also increases the number of GAG structures that could potentially share a feature not observed in the MS². Structures containing these features drop out of the algorithm as possible options once a single structure of that particular type is scored.

Calculations shown here are run on a 2.4-GHz dual-core processor with 4 GB of RAM, a standard laptop or desktop computer. Speed of calculations can increase with more powerful processors such as a GPU workstation or computer cluster. It is important to note that the genetic algorithm in MATLAB is operated with separate function calls at each step of the algorithm’s cycle. Parallelization of these function calls is particularly attractive for samples of higher chain length and, in theory, could make spectra interpretation no longer the bottleneck for structural elucidation of GAGs. Additional GAG structures determined using this genetic-algorithm based GAG analysis software have been reported [53].

Conclusions

The software performance is limited by two factors: (1) the quality of the MS² data and (2) the specificity of the fitness function. The former limitation can be reduced by using a high-performance instrument such as an FTICR or Orbitrap mass spectrometer. Some fragment mass values differ by less than 1 Da, increasing the possibility of ambiguity in low-performance instruments. High-resolution mass spectra with single-digit or lower ppm mass error minimize margins for incorrect assignment. Acquisition conditions must also be optimized for glycan fragmentation and ideally limit production of confounding fragments such as SO₃ loss or internal cleavages.

The latter factor, specificity of the fitness function in the genetic algorithm, is one that can be fine-tuned to GAG analysis by tandem mass spectrometry. The fitness function presented in this paper is simple, arbitrary, and based on the basics of glycan analysis. This approach works for the examples selected here because only glycosidic bond cleavage was assigned. Higher level structure analysis based on cross-ring cleavages requires a more sophisticated fitness function. A more complete and non-arbitrary scoring algorithm is being developed that assigns statistical weights and importance factors to various fragment peaks. Additionally, peak intensity, while not considered heavily in this iteration of the code, can also signify important characteristics in GAG structure. Details for creating an optimized scoring algorithm will be discussed in future work.

Peak picking for GAG fragmentation is not discussed in this paper but is an important consideration moving forward. Bikunin fragment peaks were selected by the SNAP algorithm using averagine and manually validated; this approach is practical for lowly sulfated samples, but averagine is insufficient for highly sulfated compounds due to contributions of sulfur to the A + 2 isotope peak. A fully automated and GAG-specific peak picking system is currently in development.

The software is applicable to lowly sulfated GAGs such as bikunin as well as to moderately and highly sulfated samples, for both the CS/DS and HS/Hp classes. Short-chain HS with more than one SO₃ modification per disaccharide and long-chain chondroitin sulfate such as decorin with approximately one SO₃ per disaccharide have been determined using our software [52, 53].

The uronic sugar stereochemistry is a variable modification in GAGs that is difficult to observe using just mass spectrometry. EDD data of heparin and heparan sulfate GAGs has produced a small subset of diagnostic fragments capable of distinguishing between glucuronic and iduronic acid epimers [22]. Chemometric applications have yielded a diagnostic fragment ratio that can definitively determine the C₅ stereochemistry [54]. Application of this ratio can be integrated into the software after basic structural features have been assigned using the approach presented here.

Funding Information

The authors gratefully acknowledge funding from the National Institutes of Health, grants P41GM103390 and R21HL136271.

References

Xie, B., Costello, C.E.: Carbohydrate structure determination by mass spectrometry. Carbohydr. Chem. Biol. Med. Appl. 29–57 (2008)
Gandhi, N.S., Mancera, R.L.: The structure of glycosaminoglycans and their interactions with proteins. Chem. Biol. Drug Des. 72, 455–482 (2008)
Article CAS PubMed Google Scholar
Rabenstein, D.L.: Heparin and heparan sulfate: structure and function. Nat. Prod. Rep. 19, 312–331 (2002)
Article CAS PubMed Google Scholar
Ohtsubo, K., Marth, J.D.: Glycosylation in cellular mechanisms of health and disease. Cell. 126, 855–867 (2006)
Article CAS PubMed Google Scholar
Zhao, Y.J., Singh, A., Li, L.Y., Linhardt, R.J., Xu, Y.M., Liu, J., Woods, R.J., Amster, I.J.: Investigating changes in the gas-phase conformation of Antithrombin III upon binding of Arixtra using traveling wave ion mobility spectrometry (TWIMS). Analyst. 14, 6980–6989 (2015)
Article Google Scholar
Zhao, Y.J., Singh, A., Xu, Y.M., Zong, C.L., Zhang, F.M., Boons, G.J., Liu, J., Linhardt, R.J., Woods, R.J., Amster, I.J.: Gas-phase analysis of the complex of fibroblast growth factor 1 with heparan sulfate: a traveling wave ion mobility spectrometry (TWIMS) and molecular modeling study. J. Am. Soc. Mass Spectrom. 28, 96–109 (2017)
Article CAS PubMed Google Scholar
Thanawiroon, C., Rice, K.G., Toida, T., Linhardt, R.J.: Liquid chromatography/mass spectrometry sequencing approach for highly sulfated heparin-derived oligosaccharides. J. Biol. Chem. 279, 2608–2615 (2004)
Article CAS PubMed Google Scholar
Jones, C.J., Beni, S., Limtiaco, J.F.K., Langeslay, D.J., Larive, C.K.: Heparin characterization: challenges and solutions. Annu. Rev. Anal. Chem. 4(4), 439–465 (2011)
Elias, J.E., Haas, W., Faherty, B.K., Gygi, S.P.: Comparative evaluation of mass spectrometry platforms used in large-scale proteomics investigations. Nat. Methods. 2, 667–675 (2005)
Cox, J., Neuhauser, N., Michalski, A., Scheltema, R.A., Olsen, J.V., Mann, M.: Andromeda: a peptide search engine integrated into the MaxQuant environment. J. Proteome Res. 10, 1794–1805 (2011)
Chi, L.L., Amster, J., Linhardt, R.J.: Mass spectrometry for the analysis of highly charged sulfated carbohydrates. Curr. Anal. Chem. 1, 223–240 (2005)
Cooper, C.A., Gasteiger, E., Packer, N.H.: GlycoMod—a software tool for determining glycosylation compositions from mass spectrometric data. Proteomics. 1, 340–349 (2001)
Kailemia, M.J., Patel, A.B., Johnson, D.T., Li, L.Y., Linhardt, R.J., Amster, I.J.: Differentiating chondroitin sulfate glycosaminoglycans using collision-induced dissociation; uronic acid cross-ring diagnostic fragments in a single stage of tandem mass spectrometry. Eur. J. Mass Spectrom. 21, 275–285 (2015)
Flangea, C., Serb, A.F., Schiopu, C., Tudor, S., Sisu, E., Seidler, D.G., Zamfir, A.D.: Discrimination of GalNAc (4S/6S) sulfation sites in chondroitin sulfate disaccharides by chip-based nanoelectrospray multistage mass spectrometry. Cent. Eur. J. Chem. 7, 752–759 (2009)
Huang, R.R., Pomin, V.H., Sharp, J.S.: LC-MS (n) analysis of isomeric chondroitin sulfate oligosaccharides using a chemical derivatization strategy. J. Am. Soc. Mass Spectrom. 22, 1577–1587 (2011)
Leach, F.E., Xiao, Z.P., Laremore, T.N., Linhardt, R.J., Amster, I.J.: Electron detachment dissociation and infrared multiphoton dissociation of heparin tetrasaccharides. Int. J. Mass Spectrom. 308, 253–259 (2011)
Bin Oh, H., Leach, F.E., Arungundram, S., Al-Mafraji, K., Venot, A., Boons, G.J., Amster, I.J.: Multivariate analysis of electron detachment dissociation and infrared multiphoton dissociation mass spectra of heparan sulfate tetrasaccharides differing only in hexuronic acid stereochemistry. J. Am. Soc. Mass Spectrom. 22, 582–590 (2011)
Wolff, J.J., Laremore, T.N., Leach, F.E., Linhardt, R.J., Amster, I.J.: Electron capture dissociation, electron detachment dissociation and infrared multiphoton dissociation of sucrose octasulfate. Eur. J. Mass Spectrom. 15, 275–281 (2009)
Wolff, J.J., Laremore, T.N., Busch, A.M., Linhardt, R.J., Amster, I.J.: Influence of charge state and sodium cationization on the electron detachment dissociation and infrared multiphoton dissociation of glycosaminoglycan oligosaccharides. J. Am. Soc. Mass Spectrom. 19, 790–798 (2008)
Leach, F.E., Ly, M., Laremore, T.N., Wolff, J.J., Perlow, J., Linhardt, R.J., Amster, I.J.: Hexuronic acid stereochemistry determination in chondroitin sulfate glycosaminoglycan oligosaccharides by electron detachment dissociation. J. Am. Soc. Mass Spectrom. 23, 1488–1497 (2012)
Leach, F.E., Wolff, J.J., Laremore, T.N., Linhardt, R.J., Amster, I.J.: Evaluation of the experimental parameters which control electron detachment dissociation, and their effect on the fragmentation efficiency of glycosaminoglycan carbohydrates. Int. J. Mass Spectrom. 276, 110–115 (2008)
Wolff, J.J., Chi, L.L., Linhardt, R.J., Amster, I.J.: Distinguishing glucuronic from iduronic acid in glycosaminoglycan tetrasaccharides by using electron detachment dissociation. Anal. Chem. 79, 2015–2022 (2007)
Wolff, J.J., Laremore, T.N., Aslam, H., Linhardt, R.J., Amster, I.J.: Electron-induced dissociation of glycosaminoglycan tetrasaccharides. J. Am. Soc. Mass Spectrom. 19, 1449–1458 (2008)
Wolff, J.J., Laremore, T.N., Busch, A.M., Linhardt, R.J., Amster, I.J.: Electron detachment dissociation of dermatan sulfate oligosaccharides. J. Am. Soc. Mass Spectrom. 19, 294–304 (2008)
Huang, Y., Yu, X., Mao, Y., Costello, C.E., Zaia, J., Lin, C.: De novo sequencing of heparan sulfate oligosaccharides by electron-activated dissociation. Anal. Chem. 85, 11979–11986 (2013)
Leach, F.E., Riley, N.M., Westphall, M.S., Coon, J.J., Amster, I.J.: Negative electron transfer dissociation sequencing of increasingly sulfated glycosaminoglycan oligosaccharides on an orbitrap mass spectrometer. J. Am. Soc. Mass Spectrom. 28, 1844–1854 (2017)
Wolff, J.J., Leach, F.E., Laremore, T.N., Kaplan, D.A., Easterling, M.L., Linhardt, R.J., Amster, I.J.: Negative electron transfer dissociation of glycosaminoglycans. Anal. Chem. 82, 3460–3466 (2010)
Wolff, J.J., Amster, I.J., Chi, L., Linhardt, R.J.: Electron detachment dissociation of glycosaminoglycan tetrasaccharides. J. Am. Soc. Mass Spectrom. 18, 234–244 (2007)
Domon, B., Costello, C.E.: A systematic nomenclature for carbohydrate fragmentations in FAB-MS/MS spectra of glycoconjugates. Glycoconjugate J. 5, 397–409 (1988)
Kailemia, M.J., Li, L.Y., Ly, M., Linhardt, R.J., Amster, I.J.: Complete mass spectral characterization of a synthetic ultralow-molecular-weight heparin using collision-induced dissociation. Anal. Chem. 84, 5475–5478 (2012)
Kailemia, M.J., Ruhaak, L.R., Lebrilla, C.B., Amster, I.J.: Oligosaccharide analysis by mass spectrometry: a review of recent developments. Anal. Chem. 86, 196–212 (2014)
Zaia, J., Costello, C.E.: Tandem mass spectrometry of sulfated heparin-like glycosaminoglycan oligosaccharides. Anal. Chem. 75, 2445–2455 (2003)
Dancik, V., Addona, T.A., Clauser, K.R., Vath, J.E., Pevzner, P.A.: De novo peptide sequencing via tandem mass spectrometry. J. Comput. Biol. 6, 327–342 (1999)
Ma, B., Zhang, K.Z., Hendrie, C., Liang, C.Z., Li, M., Doherty-Kirby, A., Lajoie, G.: PEAKS: powerful software for peptide de novo sequencing by tandem mass spectrometry. Rapid Commun. Mass Spectrom. 17, 2337–2342 (2003)
Taylor, J.A., Johnson, R.S.: Implementation and uses of automated de novo peptide sequencing by tandem mass spectrometry. Anal. Chem. 73, 2594–2604 (2001)
Campbell, M.P., Hayes, C.A., Struwe, W.B., Wilkins, M.R., Aoki-Kinoshita, K.F., Harvey, D.J., Rudd, P.M., Kolarich, D., Lisacek, F., Karlsson, N.G., Packer, N.H.: UniCarbKB: putting the pieces together for glycomics research. Proteomics. 11, 4117–4121 (2011)
Maxwell, E., Tan, Y., Tan, Y., Hu, H., Benson, G., Aizikov, K., Conley, S., Staples, G.O., Slysz, G.W., Smith, R.D., Zaia, J.: GlycReSoft: a software package for automated recognition of glycans from LC/MS data. PLoS One. 7, (2012)
Saad, O.M., Leary, J.A.: Heparin sequencing using enzymatic digestion and ESI-MSn with HOST: a heparin/HS oligosaccharide sequencing tool. Anal. Chem. 77, 5902–5911 (2005)
Chiu, Y.L., Huang, R.R., Orlando, R., Sharp, J.S.: GAG-ID: heparan sulfate (HS) and heparin glycosaminoglycan high-throughput identification software. Mol. Cell. Proteomics. 14, 1720–1730 (2015)
Hu, H., Huang, Y., Mao, Y., Yu, X., Xu, Y.M., Liu, J., Zong, C.L., Boons, G.J., Lin, C., Xia, Y., Zaia, J.: A computational framework for heparan sulfate sequencing using high-resolution tandem mass spectra. Mol. Cell. Proteomics. 13, 2490–2502 (2014)
Ly, M., Leach III, F.E., Laremore, T.N., Toida, T., Amster, I.J., Linhardt, R.J.: The proteoglycan bikunin has a defined sequence. Nat. Chem. Biol. 7, 827–833 (2011)
Baeck, T., Schwefel, H.-P.: An overview of evolutionary algorithms for parameter optimization. Evol. Comput. 1, 1–23 (1993)
Fogel, L.J., Owens, A.J., Walsh, M.J.: Artificial intelligence through a simulation of evolution. Proceedings of the Second Cybernetic Sciences Symposium: Biophysics and cybernetic systems. 131–155 (1965)
Forrest, S.: Genetic algorithms—principles of natural-selection applied to computation. Science. 261, 872–878 (1993)
Han, L., Costello, C.E.: Mass spectrometry of glycans. Biochem. Mosc. 78, 710–720 (2013)
Kilgour, D.P.A., Neal, M.J., Soulby, A.J., O’Connor, P.B.: Improved optimization of the Fourier transform ion cyclotron resonance mass spectrometry phase correction function using a genetic algorithm. Rapid Commun. Mass Spectrom. 27, 1977–1982 (2013)
Das, S., Suganthan, P.N.: Differential evolution: a survey of the state-of-the-art. IEEE Trans. Evol. Comput. 15, 4–31 (2011)
Knowles, J.D., Corne, D.W.: Approximating the nondominated front using the Pareto archived evolution strategy. Evol. Comput. 8, 149–172 (2000)
Phillips, S.J., Anderson, R.P., Schapire, R.E.: Maximum entropy modeling of species geographic distributions. Ecol. Model. 190, 231–259 (2006)
Tavazoie, S., Hughes, J.D., Campbell, M.J., Cho, R.J., Church, G.M.: Systematic determination of genetic network architecture. Nat. Genet. 22, 281–285 (1999)
Verdonk, M.L., Cole, J.C., Hartshorn, M.J., Murray, C.W., Taylor, R.D.: Improved protein-ligand docking using GOLD. Proteins Struct. Funct. Genet. 52, 609–623 (2003)
Yu, Y.L., Duan, J.N., Leach, F.E., Toida, T., Higashi, K., Zhang, H., Zhang, F.M., Amster, I.J., Linhardt, R.J.: Sequencing the dermatan sulfate chain of decorin. J. Am. Chem. Soc. 139, 16986–16995 (2017)
Singh, A., Kett, W.C., Severin, I.C., Agyekum, I., Duan, J.N., Amster, I.J., Proudfoot, A.E.I., Coombe, D.R., Woods, R.J.: The interaction of heparin tetrasaccharides with chemokine CCL5 is modulated by sulfation pattern and pH. J. Biol. Chem. 290, 15421–15436 (2015)
Agyekum, I., Patel, A.B., Zong, C.L., Boons, G.J., Amster, I.J.: Assignment of hexuronic acid stereochemistry in synthetic heparan sulfate tetrasaccharides with 2-O-sulfo uronic acids using electron detachment dissociation. Int. J. Mass Spectrom. 390, 163–169 (2015)


Author information

Authors and Affiliations

Department of Chemistry, University of Georgia, Athens, GA, 30606, USA
Jiana Duan & I. Jonathan Amster

Authors

Jiana Duan
I. Jonathan Amster

Corresponding author

Correspondence to I. Jonathan Amster.


About this article

Cite this article

Duan, J., Jonathan Amster, I. An Automated, High-Throughput Method for Interpreting the Tandem Mass Spectra of Glycosaminoglycans. J. Am. Soc. Mass Spectrom. 29, 1802–1811 (2018). https://doi.org/10.1007/s13361-018-1969-z


Received: 27 February 2018
Revised: 06 April 2018
Accepted: 14 April 2018
Published: 22 May 2018
Issue Date: September 2018
DOI: https://doi.org/10.1007/s13361-018-1969-z



Introduction