Introduction

Electrospray ionization mass spectrometry (ESI-MS) is a powerful label- and immobilization-free method for detecting protein–ligand complexes in vitro, establishing their binding stoichiometry and quantifying their affinities [1]. The assay relies on the ESI mass spectrum to provide a quantitative distribution snapshot of free and ligand-bound proteins in solution. A unique feature of ESI-MS is the ability to monitor multiple binding equilibria simultaneously. As such, the assay is well suited to screening compound libraries against target proteins to rapidly identify ligands and estimate their affinities and has been used to analyze a variety of different types of libraries, including those composed of oligosaccharides, peptides, or small organic molecules [2,3,4,5].

The formation of gaseous ions resulting from the non-specific association of components of buffer or salts (added or present as impurities) to proteins (free protein and protein–ligand complexes) can complicate the interpretation of ESI-MS data acquired for libraries of compounds with similar molecular weights (MWs) [6,7,8]. These adducts form as the gas-phase protein and protein complex ions are produced from small ESI offspring droplets that evaporate to dryness [9,10,11,12]. Buffer and salts remaining in these droplets near the end of the evaporation process will associate non-specifically with the protein and its complexes. If the resulting gas-phase protein–adduct interactions are sufficiently stable (kinetically) to survive the ion source, they will be detected. When using a high (> 100 mM) concentration of ammonium acetate, the typical buffer used in ESI-MS analysis of proteins and their complexes, adducts of acetate/acetic acid and ammonia/ammonium may be observed in the mass spectrum. The presence of non-volatile salts in the sample, required for protein function (e.g., metal co-factors) or present as impurities, generally leads to adduct formation. For example, even in the absence of any added salt, Na+ and K+ are commonly observed adducts.

Adduct formation is undesirable in ESI-MS protein–ligand binding measurements for several reasons. It leads to the reduction of the signal-to-noise ratio (S/N) of the protein and protein–ligand complex ions and, when individual adducts are not resolved, can produce broad peaks, which complicates determination of accurate MWs. When the direct ESI-MS assay is used to screen libraries, adducts associated with a given protein–ligand complex may overlap with the ion signal of other complexes with similar MWs, leading to incorrect assignments and errors in the affinity measurements. Increasing the library size generally exacerbates the problem and, thereby, may impede high-throughput screening of small molecule libraries.

Here, we introduce a straightforward approach that addresses the problem of adduct overlap in the identification and quantification of protein–ligand complexes in ESI-MS library screening. The method, called the sliding window adduct removal method (SWARM), relies on the statistical nature of non-specific adduct formation during ESI, which produces indistinguishable (or nearly so) distributions of adducts for the free and ligand-bound forms of a given protein at the same charge state. Implementation of SWARM involves sequentially subtracting from the mass spectrum, in order of ascending mass-to-charge ratio (m/z), the ion signal corresponding to adducts associated with a specific protein and its ligand complexes. This is done using the adduct distribution measured for an appropriate reference species (usually the free protein or a related protein–ligand complex). We tested the key assumption of SWARM using ESI mass spectra measured for protein–oligosaccharide interactions in aqueous solutions that produced either low- or high-abundance adducts. We then demonstrated the utility of the method by applying it to ESI-MS screening results acquired for a library of oligosaccharides and a library of novel bifunctional ligands recently developed for high-throughput and quantitative glycan screening using the competitive universal proxy receptor assay (CUPRA) [13].

Materials and Methods

Proteins

Human carbonic anhydrase type 1 (hCA1, MW 28,875 Da) and lysozyme from chicken egg white (Lyz, MW 14,310 Da) were purchased from Sigma-Aldrich Canada (Oakville, Canada). A recombinant fragment of the C-terminus of human galectin-3 (hGal-3C, MW 16,327 Da) was a gift from Prof. C. Cairo (University of Alberta).

Ligands

Oligosaccharides

The structures of the oligosaccharides (OS1–OS19) used in this study are given in Supporting Information. The oligosaccharides OS1, OS4, OS6, OS7, OS11, and OS13–OS15 were purchased from Elicityl SA (Crolles, France); OS12 and OS16–OS18 were purchased from IsoSep (Tullinge, Sweden); OS2 and OS19 were purchased from Dextra (Reading, UK); and OS3, OS5, and OS8–OS10 were a gift from Prof. T. Lowary (University of Alberta).

CUPRA Ligands

The structures of the 19 CUPRA ligands (CL1–CL19), as well as that of the corresponding glycan-devoid CUPRA ligand (CL0) used in this study, are given in Supporting Information. As described in more detail elsewhere, the bifunctional CUPRA ligands consist of a common sulfonamide moiety (which serves as an affinity tag) linked to a variety of human glycan structures through a dipeptide linker [13].

Methods

Mass Spectrometry

All ESI-MS measurements were conducted in positive ion mode using a Synapt G2S quadrupole-ion mobility separation-time of flight (Q-IMS-TOF) mass spectrometer (Waters, Manchester, UK) equipped with a nanoflow ESI (nanoESI) source. NanoESI was performed using tips produced from borosilicate capillaries (1.0 mm o.d., 0.68 mm i.d.), pulled to ~5 μm at one end using a P-1000 micropipette puller (Sutter Instruments, Novato, CA). To carry out nanoESI, a platinum wire was inserted into the tip containing the analyte solution and a capillary voltage of 0.9–1.0 kV was applied. All ESI-MS binding measurements were performed at a source temperature of 60 °C, and cone, trap, and transfer voltages of 30 V, 5 V, and 2 V, respectively. Default values were used for the remaining instrumental parameters. Data acquisition and processing were performed with MassLynx software (version 4.1).

Deconvolution of ESI Mass Spectra Using UniDec

For spectral deconvolution using UniDec software [14], the following parameters were used: m/z range 2300–3500 (for hCA1); bin every 1.0; charge range 5–20; mass range 28,000–31,500 (for hCA1); sample mass every 1.0 Da; peak FWHM (Th) 4.0; peak shape function Gaussian; maximum number of iterations 100.

Implementation of SWARM

The SWARM algorithm was implemented in Python using standard libraries and general third-party data processing libraries (pandas, NumPy, and SciPy). Because mass spectra aggregated from multiple scans may have been collected using different sampling rates (number of data points/Da), each mass spectrum was first re-sampled and then smoothed using a Savitzky-Golay (SG) filter [15]. Additionally, asymmetric least squares (ALS) baseline subtraction was performed, when necessary [16]. Prior to carrying out SWARM, the relevant protein and protein–ligand complexes to be considered must be identified. A tool is provided in our in-house SWARM program to facilitate enumeration of ions in the mass spectrum corresponding to free and ligand-bound protein ions based on the known MWs of the protein target and library components. To implement SWARM, the non-specific adduct template, which captures the distribution of adducts associated with the protein/protein–ligand complexes at a given charge state, is extracted from the mass spectrum. The adduct distribution measured for the free protein is normally used. However, the distribution corresponding to a ligand-bound form may also be used. The selected template (isotopically averaged by smoothing) is aligned with the signal of the neighboring protein–ligand complex, scaled to match the ion abundance (intensity) of the fully protonated species, and subtracted. The procedure is repeated for all complexes at a given charge state.
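
The pre-processing steps described above (re-sampling, SG smoothing, and ALS baseline subtraction) can be sketched as follows. This is a minimal illustration and not the in-house SWARM program: the function names, the linear interpolation used for re-sampling, the grid step, and the number of ALS iterations are all assumptions. The default SG and ALS settings are taken from the values quoted in the figure captions.

```python
import numpy as np
from scipy import sparse
from scipy.signal import savgol_filter
from scipy.sparse.linalg import spsolve


def resample_uniform(mz, intensity, step=0.01):
    """Linearly interpolate a spectrum onto a uniform m/z grid (assumed scheme)."""
    grid = np.arange(mz.min(), mz.max(), step)
    return grid, np.interp(grid, mz, intensity)


def als_baseline(y, lam=1e7, p=0.01, n_iter=10):
    """Asymmetric least squares baseline estimate.

    lam is the smoothness penalty and p the asymmetry weight; points above
    the current baseline get weight p, points below get weight 1 - p.
    """
    n = len(y)
    # Second-difference operator used for the smoothness penalty.
    D = sparse.diags([1.0, -2.0, 1.0], [0, 1, 2], shape=(n - 2, n))
    penalty = lam * (D.T @ D)
    w = np.ones(n)
    for _ in range(n_iter):
        W = sparse.diags(w)
        z = spsolve((W + penalty).tocsc(), w * y)
        w = np.where(y > z, p, 1.0 - p)
    return z


def preprocess(mz, intensity, step=0.01, window=41, order=4,
               subtract_baseline=True):
    """Re-sample, smooth with an SG filter, and optionally remove the baseline."""
    grid, y = resample_uniform(mz, intensity, step)
    y = savgol_filter(y, window_length=window, polyorder=order)
    if subtract_baseline:
        y = y - als_baseline(y)
    return grid, y
```

Note that the SG window is specified in data points, so its width in Th depends on the re-sampling step chosen.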

ESI-MS Affinities

The affinity (Ka,i, Eq. (1)) of each library component (Li) for target protein (P) was quantified using the direct ESI-MS assay [1]. The value of Ka,i was calculated from the total abundance (Ab) ratio (Ri, Eq. (2)) of the ligand-bound (PLi)-to-free protein (P) ions and the initial concentrations of protein ([P]0) and ligand ([Li]0):

$$ \mathrm{K}_{\mathrm{a},i}=\frac{R_i}{{\left[\mathrm{L}_i\right]}_0-\frac{R_i}{\sum \limits_i R_i+1}{\left[\mathrm{P}\right]}_0} \tag{1} $$
$$ R_i=\frac{Ab\left(\mathrm{PL}_i\right)}{Ab\left(\mathrm{P}\right)}=\frac{\left[\mathrm{PL}_i\right]}{\left[\mathrm{P}\right]} \tag{2} $$

Equations (1) and (2) were also used for individual ligand binding measurements (i.e., when i = 1). The reported affinities are average values from replicate measurements performed at a minimum of three different Li concentrations.
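
As a worked illustration of Eqs. (1) and (2), the short function below converts measured ion abundances into association constants. The function name, argument names, and the numerical values in the usage note are hypothetical and chosen only to show the arithmetic; concentrations are assumed to be in M.

```python
import numpy as np


def association_constants(ab_free, ab_bound, ligand_conc, protein_conc):
    """Return K_a,i (M^-1) for each ligand from Eqs. (1) and (2).

    ab_free      -- total abundance of free protein ions, Ab(P)
    ab_bound     -- abundances of each complex, Ab(PL_i)
    ligand_conc  -- initial ligand concentrations [L_i]_0 (M)
    protein_conc -- initial protein concentration [P]_0 (M)
    """
    ab_bound = np.asarray(ab_bound, dtype=float)
    ligand_conc = np.asarray(ligand_conc, dtype=float)
    ratios = ab_bound / ab_free                               # Eq. (2)
    bound_conc = ratios / (ratios.sum() + 1.0) * protein_conc  # [PL_i]
    return ratios / (ligand_conc - bound_conc)                # Eq. (1)
```

For a single ligand with a hypothetical ratio R = 0.5, [P]0 = 5 μM, and [L]0 = 10 μM, the bound concentration is 1.67 μM and the function returns 6.0 × 10⁴ M⁻¹.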

Results and Discussion

The key assumption underpinning SWARM is that, for a given protein, the distributions of non-specific adducts measured by ESI-MS for the free and ligand-bound species are identical at a given charge state. To test the validity of this assumption, we compared adduct distributions for the free and tetrasaccharide ligand-bound ions of two different protein–oligosaccharide complexes. Shown in Figure 1a is a representative ESI mass spectrum acquired for aqueous ammonium acetate (200 mM, pH 6.8, and 25 °C) solutions of hGal-3C (5 μM) and of OS7 (10 μM). The ion signal corresponding to protonated hGal-3C and (hGal-3C + OS7) complex, at charge states +8 and +9, was identified. Also detected were ions corresponding to the attachment of Na+ and/or K+ cations. The total abundance ratio of the protonated ligand-bound-to-free hGal-3C ions corresponds to an affinity of (4.1 ± 0.1) × 10⁴ M⁻¹ (Supporting Table S3). This value agrees, within a factor of 2, with the reported affinity ((9.2 ± 0.3) × 10⁴ M⁻¹) [17], suggesting that little or no in-source dissociation of the (hGal-3C + OS7) complex occurred during ESI-MS analysis. The insets shown in Figure 1a are the normalized distributions of adducts measured for the free hGal-3C superimposed on that measured for the (hGal-3C + OS7) complex, at charge states +8 and +9. It can be seen that, at each charge state, the distributions are essentially indistinguishable.

Figure 1

ESI mass spectra measured in positive ion mode for aqueous ammonium acetate (200 mM, pH 6.8, 25 °C) solutions of (a) hGal-3C (P, 5 μM) and OS7 (L, 10 μM); (b) hGal-3C (5 μM), OS7 (10 μM), and NaCl (1.5 mM). Shown in insets is the normalized distribution of adducts observed for the free protein (red) superimposed on that measured for the ligand-bound protein (blue), at the same charge state

Analogous measurements were performed on aqueous ammonium acetate solutions (200 mM, pH 6.8) of hGal-3C (5 μM), OS7 (10 μM), and varying concentrations of NaCl. Shown in Figure 1b is a representative mass spectrum acquired for a solution containing 1.5 mM NaCl. As expected, the relative abundance and number of adducts increased substantially in the presence of a high concentration of non-volatile salt. Nevertheless, the normalized adduct distributions of the free and ligand-bound protein ions are found to be very similar (insets, Figure 1b), and the affinity, calculated from the abundances of the protonated ligand-bound-to-free protein ions ((4.5 ± 0.1) × 10⁴ M⁻¹), is indistinguishable from that measured in the absence of NaCl.

ESI-MS measurements were also performed on solutions of Lyz and the specific ligand, chitotetraose (OS19). Analysis of the mass spectra (Supporting Figure S1) produced results that are similar to those for the hGal-3C-OS7 interaction. The adduct distributions measured for free and ligand-bound Lyz at a given charge state are nearly indistinguishable and the measured affinity ((1.1 ± 0.1) × 10⁵ M⁻¹) (Supporting Table S3) determined from the total abundance ratio of protonated ligand-bound-to-free Lyz ions is in excellent agreement with the reported value [7].

Although no correction (for adducts) of the ESI mass spectra shown above was required to identify or quantify the protein–oligosaccharide interactions, these data are useful for illustrating the implementation of SWARM. Shown in Figure 2a and b are expanded views of the region of the mass spectra shown in Figure 1a and b, respectively, corresponding to the +8 charge state ions of hGal-3C and the (hGal-3C + OS7) complex. To apply SWARM, each mass spectrum was first smoothed using the SG filter. In the case of Figure 2b, baseline subtraction, using the ALS method [16], was also performed following SG smoothing. In each case, the adduct distribution template, corresponding to the distribution measured for free hGal-3C, was then subtracted from both the free and ligand-bound hGal-3C (following scaling to match the relative abundance of the protonated complex) to give the post-SWARM mass spectrum. SWARM was similarly applied to the +9 charge state ions of hGal-3C and the (hGal-3C + OS7) complex (Supporting Figure S2). Notably, analysis of the total abundances of the ligand-bound and free hGal-3C ions, following application of SWARM, yields affinities that are identical, within error, to the values obtained prior to treatment of the mass spectra (Supporting Table S3).
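
The template extraction and sequential subtraction described above can be sketched as follows. This is an illustrative reconstruction under stated assumptions, not the authors' code: the function names are hypothetical, positions and widths are given in data points on a uniformly sampled spectrum rather than in Th, the protonated peak is protected by zeroing the first `peak_pts` points of the template, and negative residuals are clipped to zero.

```python
import numpy as np


def extract_template(y, apex, width, peak_pts):
    """Adduct template from a reference species.

    The window starts at the apex of the protonated peak and spans `width`
    points. Intensities are expressed relative to the apex, and the first
    `peak_pts` points are zeroed so the protonated peak itself is retained
    (an assumption of this sketch).
    """
    template = np.array(y[apex:apex + width], dtype=float) / y[apex]
    template[:peak_pts] = 0.0
    return template


def swarm(y, apex_indices, template):
    """Subtract the scaled template from each species in ascending m/z order.

    Processing in ascending order means the apex intensity of each species
    has already been corrected for adducts of all lighter species before it
    is used as the scaling factor.
    """
    out = np.asarray(y, dtype=float).copy()
    for idx in sorted(apex_indices):
        segment = out[idx:idx + len(template)]   # view into `out`
        segment -= out[idx] * template[:len(segment)]
        np.clip(segment, 0.0, None, out=segment)
    return out
```

In this sketch the template is taken from a reference whose adduct window does not overlap other species, mirroring the requirement that the reference distribution be free of interference.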

Figure 2

Application of SWARM to a portion (m/z 2020–2060) of the ESI mass spectra shown in (a) Figure 1a and in (b) Figure 1b. Shown in black is the original mass spectrum; yellow is the mass spectrum following smoothing with SG filter (window = 41, order = 4); purple is the mass spectrum following ALS baseline subtraction (smoothness = 10⁷, asymmetry = 0.01); red is the mass spectrum following treatment with SWARM (window width = 35 Th). Shown in insets are the adduct distribution templates used to implement SWARM

SWARM was then applied to ESI-MS library screening data acquired for two different libraries to demonstrate the improvements in ligand identification and affinity quantification achieved with the method. One of the libraries, consisting of 18 oligosaccharides (OS1–OS18), was screened against hGal-3C. We have previously shown that hGal-3C binds all 18 of these oligosaccharides in 200 mM aqueous ammonium acetate (pH 6.8, 25 °C), with affinities ranging from 1 × 10⁴ to 3 × 10⁵ M⁻¹ [17, 18]. Shown in Supporting Figure S3 are the representative mass spectra acquired in positive ion mode for aqueous ammonium acetate (200 mM, pH 6.8) solutions of hGal-3C (10.4 μM) and the oligosaccharide library (OS1–OS18, each 5 μM) with (a) no phosphate-buffered saline (PBS) added and (b) PBS (1X, 137 mM NaCl, 10 mM Na2HPO4, 2 mM KH2PO4, 2.7 mM KCl, pH 7.4) at 500-fold dilution and (c) 200-fold dilution. An expanded view of the m/z 2030 to m/z 2225 portion of the mass spectra (pre- and post-SWARM), which contains the signal corresponding to the +8 charge state of hGal-3C ions, is given in Figure 3.
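
The m/z window quoted above can be checked against the expected positions of the protonated ions, computed from the known MWs. The sketch below is a hypothetical stand-in for the enumeration step; the function names and the example ligand mass of 700 Da are assumptions and do not correspond to any library component.

```python
PROTON_MASS = 1.007276  # Da


def mz_of(mass, charge):
    """m/z of the fully protonated ion of a neutral species of given mass."""
    return (mass + charge * PROTON_MASS) / charge


def enumerate_ions(protein_mass, ligand_masses, charges):
    """Map (species label, charge) to expected m/z for free protein and 1:1 complexes."""
    ions = {}
    for z in charges:
        ions[("P", z)] = mz_of(protein_mass, z)
        for name, mass in ligand_masses.items():
            ions[("P+" + name, z)] = mz_of(protein_mass + mass, z)
    return ions


# hGal-3C (MW 16,327 Da) with one hypothetical 700 Da ligand at charge state +8.
ions = enumerate_ions(16327.0, {"L": 700.0}, charges=[8])
```

With these inputs the free protein is expected near m/z 2041.9 and the hypothetical complex near m/z 2129.4, both inside the m/z 2030 to 2225 window.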

Figure 3

Application of SWARM to oligosaccharide library screening data. (a) A portion (m/z 2030–2230) of ESI mass spectrum acquired in positive ion mode for aqueous ammonium acetate (200 mM, pH 6.8, 25 °C) solutions of hGal-3C (P, 10.4 μM) and oligosaccharide library (OS1–OS18 (≡L1–L18), each 5.0 μM). (b) and (c) ESI mass spectra acquired for same solution as in (a) but with the addition of 1X PBS buffer at 500-fold dilution and 200-fold dilution, respectively. The original mass spectrum is shown in red; the mass spectrum following treatment with SWARM is shown in blue. SWARM implemented using SG filter (window = 41, order = 4) and ALS baseline subtraction (smoothness = 10⁷, asymmetry = 0.01) after every SWARM cycle (window width = 45 Th). (d) Comparison of ligand affinities measured from the original ESI-MS data shown in (a)–(c) with values reported in references [17] and [18]. (e) Comparison of ligand affinities measured from the ESI-MS data shown in (a)–(c) following treatment with SWARM with values reported in references [17] and [18]

In the absence of PBS, an ion signal (predominantly protonated ions) corresponding to free hGal-3C and the 1:1 complexes of hGal-3C with each of the 18 ligands was detected. The addition of PBS buffer resulted in more extensive adduct formation, although the protonated ions of hGal-3C and all of the hGal-3C-ligand complexes were dominant. However, in the mass spectrum acquired at 200-fold dilution, it is difficult to conclusively identify a number of the low-abundance hGal-3C-ligand complexes (e.g., for OS3–OS5 and OS17). Additionally, the relative abundances of the complexes that could be identified were significantly skewed by the pronounced adduct distributions, vide infra. Treatment of the mass spectra acquired for the PBS solutions with SWARM greatly facilitates the positive identification of the ligand-bound hGal-3C ions. Notably, all 18 ligands can be identified from the mass spectrum measured for the 200-fold PBS dilution solution (Figure 3c).

Plotted in Figure 3d and e are the affinities for each of the 18 ligands determined from the aforementioned ESI mass spectra, before and after treatment with SWARM, respectively, and reported values, which were measured one ligand at a time by ESI-MS [17, 18]. The affinities determined from the pre-SWARM data acquired in ammonium acetate show reasonably good correlation with the individual measurements, with a squared correlation coefficient (R²) of 0.75 and slope of 1.17. For the affinities measured from the solutions containing PBS, there is significantly more scatter, compared to the reported values, with an R² of 0.48 and a slope of 1.46 for the 200-fold dilution data. Importantly, treatment of the three mass spectra with SWARM produced very similar affinities and the correlation plots shown in Figure 3e have similar R² values (0.77 or 0.78) and slopes (0.9–1.1).
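
The slope and R² values used in such comparisons can be obtained from an ordinary least-squares fit of the library-derived affinities against the individually measured ones. The sketch below assumes a linear fit with a free intercept on untransformed affinities; whether the reported values were obtained this way (for example, with or without an intercept, or on a logarithmic scale) is not stated here, and the function name is hypothetical.

```python
import numpy as np
from scipy.stats import linregress


def compare_affinities(reference, measured):
    """Return (slope, R^2) for measured vs. reference affinities."""
    fit = linregress(np.asarray(reference, dtype=float),
                     np.asarray(measured, dtype=float))
    return fit.slope, fit.rvalue ** 2
```

A slope near 1 together with a high R² indicates that the library screening reproduces the one-ligand-at-a-time measurements without systematic over- or underestimation.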

As an additional test of SWARM, the method was applied to the ESI-MS screening data measured for the CL library (CL0–CL19) and hCA1. As noted above, all of these library components contain a sulfonamide moiety, which is recognized by hCA1. The affinities of each of the library components for hCA1 have been measured in 200 mM aqueous ammonium acetate (pH 6.8, 25 °C) and shown to range from 5 × 10⁴ to 5 × 10⁵ M⁻¹ [13]. Shown in Supporting Figure S4 is a representative mass spectrum acquired in positive ion mode for an aqueous ammonium acetate solution (200 mM, pH 6.8) of hCA1 (10.4 μM) and the CL library (each 3 μM). An expanded view of the m/z 2875 to m/z 3100 portion of the mass spectrum, which contains the signal corresponding to the +10 charge state of the hCA1 ions, is given in Figure 4a. The ion signal corresponding to free hCA1 and the 1:1 complexes of hCA1 with each of the 20 ligands was detected. hCA1 is a metalloenzyme that has Zn(II) bound in the active site and all of the detected hCA1 ions were found to be associated with Zn2+. Also detected was signal corresponding to hCA1 bound to acetate. This complex likely arises from the specific but weak interaction between carbonic anhydrases (e.g., hCA type 2) and acetate [19]. Although the affinity of this interaction has not been reported for hCA1, acetate binding to Co(II)-containing bovine CA has been measured to be approximately 400 M⁻¹ [20]. Importantly, since both acetate and sulfonamide compete for the same binding site, acetate binding to the (hCA1 + CL) complexes is not observed. Because of the specific interaction between hCA1 and acetate, it is inconvenient to use the non-specific adduct distribution measured for free hCA1 as the template for SWARM. Instead, the adduct distribution corresponding to (hCA1 + CL0), which does not overlap with that of any other hCA1 ions, was used. The resulting post-SWARM mass spectrum, in which the selected template was applied to all of the (hCA1 + CL) complexes, as well as free hCA1 and hCA1 bound to acetate, is shown in Supporting Figure S4 and Figure 4b (which shows the +10 charge state region). Notably, treatment of the mass spectrum with SWARM clearly reveals that the (hCA1 + acetate) complex is a specific interaction and does not originate from non-specific association of acetate during the ESI process.

Figure 4

Application of SWARM to CUPRA library screening data. (a) A portion (m/z 2875–3100) of ESI mass spectrum acquired in positive ion mode for an aqueous ammonium acetate (200 mM, pH 6.8, 25 °C) solution of hCA1 (P, 10.4 μM) and the CL library (CL0–CL19 (≡L0–L19), each 3 μM). (b) Mass spectrum from (a) following treatment with SWARM using adduct distribution template shown in inset. SWARM was implemented using SG filter (window = 41, order = 4) and ALS baseline subtraction (smoothness = 10⁸, asymmetry = 0.001) after every SWARM cycle (window width = 25 Th). (c) and (d) Comparison of ligand affinities measured from the ESI-MS data shown in (a) and (b), respectively, with values reported in reference [13]

Plotted in Figure 4c and d are the affinities of CL0–CL19 for hCA1 determined from the ESI mass spectrum in Figure S4, before and after treatment with SWARM, respectively, and the reported values, which were measured one ligand at a time by ESI-MS [13]. Prior to SWARM, the estimated affinities determined for the library correlate poorly with the values from individual measurements (R² of 0.43) and were systematically high (4- to 10-fold overestimation). Importantly, following treatment with SWARM, the affinities measured for the library correlate much better with the results from individual binding measurements (R² of 0.8 and slope > 0.9).

The aforementioned examples clearly demonstrate the benefits of treating ESI-MS screening data with SWARM. However, it is appropriate to ask whether similar results can be obtained using commonly used spectral deconvolution methods, without SWARM [14, 21,22,23,24,25,26,27]. To that end, the mass spectrum shown in Figure 4a was converted to the zero-charge mass spectrum using the widely used UniDec deconvolution software (Figure 5a) [14]. Analysis of the relative abundances of the free and ligand-bound hCA1 measured from the deconvoluted mass spectrum yields binding data that are similar to those obtained directly from the mass spectrum: there is poor correlation with the individual binding data (R² of 0.47) and the affinities are systematically high (Figure 5c). However, application of SWARM to the deconvoluted mass spectrum (Figure 5b) produces affinity data that are similar to those obtained by directly treating the original mass spectrum with SWARM (Figure 5d).

Figure 5

(a) Mass spectrum shown in Figure 4a following deconvolution using UniDec. (b) Deconvoluted mass spectrum following treatment with SWARM using SG filter (window = 41, order = 4) and ALS baseline subtraction (smoothness = 10¹⁴, asymmetry = 0.001) after every SWARM cycle (window width = 250 Da). (c) and (d) Comparison of ligand affinities measured from the ESI-MS data shown in (a) and (b), respectively, with values reported in reference [13]

Conclusions

Non-specific association of buffer and salts to proteins and their specific complexes during gas-phase ion formation can complicate the analysis of ESI-MS library screening data. Overlap of signal corresponding to free and ligand-bound protein ions and their adducts can hinder the identification of ligands and introduce error into the measured affinities. This work describes a straightforward method to quantitatively correct ESI mass spectra of low-to-moderate resolution for ion signal overlap associated with adduct formation. Results obtained for two protein-tetrasaccharide complexes support the key assumption of the method, namely that the distributions of adducts associated with a given protein (free protein and ligand-bound forms) are nearly identical at a given charge state. However, it should be noted that there may be instances, for example, when ligand binding induces significant protein conformational changes, when the free and ligand-bound proteins exhibit differences in their adduct distributions. Application of SWARM to ESI-MS screening data acquired for two different libraries and target proteins demonstrates the improvements in ligand identification and affinity quantification achieved following correction of the mass spectra for adduct formation.