Introduction

Discussions of ion-neutral clusters in the electrospray ionization (ESI) mass spectrometry literature are limited even though these clusters are a ubiquitous feature of the mass spectra obtained with this technique [1,2,3,4,5,6,7,8,9]. Singly and multiply charged ion-neutral clusters consisting of at least one ion and one or more neutral species are formed during the electrospray process [7,8,9]. Neither Dole’s charge residue model (CRM) [10] nor the Iribarne-Thomson ion evaporation model (IEM) [11,12,13,14] alone completely explains the presence of the various ions and ion-neutral clusters observed by Gamero-Castaño and de la Mora in their study of electrospray-produced salt clusters using a high-resolution differential mobility analyzer [7]. This work suggests a mixed set of ESI processes leading to gas-phase ions and ion-neutral clusters that do not fit neatly into either a pure-CRM or a pure-IEM framework. An earlier investigation of cluster ion behavior using an ESI quadrupole mass spectrometer at concentrations leading to ESI saturation also claims that these species can be rationalized under the CRM, the IEM, and additional processes occurring in the atmospheric pressure interface [8]. Other investigations by Meng and Fenn propose an entirely different theory in which clusters are formed during the electrospray process by the precipitation of solutes as solvent evaporates [9]; they reject the hypothesis that gas-phase reactions could produce ion-neutral clusters in the ESI source and, on the basis of their instrument's design, also reject the notion that adiabatic expansion during transfer to the vacuum system would result in cluster condensation.

The understanding of ion-neutral clusters is further obscured when considering the processes occurring during the transit of ions from the ESI source through the atmospheric pressure interface (API). APIs transport ions and bulk gas from an atmospheric pressure ion source to low-pressure vacuum chambers; during this transfer, adiabatic expansion and rapid cooling occur, resulting in cluster condensation [15,16,17]. API design for coupling ESI sources to mass analyzers has significantly improved over the years by preventing neutral species from entering the mass spectrometer [18]. This has decreased the extent to which ions are observed as solvated ion-neutral clusters, but some buffer additives (e.g., Na+) and contaminants are present in high abundance and can form very stable clusters. Importantly, the inherent design and operation of APIs bias the mass spectrum towards observing ion-neutral clusters and fragments depending on the strength of applied electric fields. These processes have been addressed but are underappreciated and remain poorly understood [19,20,21,22]. Recent comparisons between clustering in ion trap, triple quadrupole, and quadrupole-Orbitrap mass spectrometers highlight the significant influence of API design on the observed cluster distribution in the mass spectrum [2]. This comparative study demonstrates the influence of buffer gas and contaminants in buffer gases on the observed ion distribution.

These complexities and the pervasiveness of ion-neutral clusters raise the question as to why ion-neutral clusters have received limited attention. There exists a widely adopted canonical model of ESI that assumes ions observed in the mass spectrum should adhere to the forms of [M+H]+ and [M–H]- in the positive and negative modes, respectively. Generally speaking, these protonated and deprotonated molecules provide sufficient sensitivity for many analytical tasks, are abundantly formed by a majority of compounds, and have been the historical focus of targeted assays on triple-quadrupole systems due to their interpretability in the mass spectrum and the quality of their fragmentation spectra. However, the mass spectrum (and particularly the high-resolution mass spectrum) is extremely complicated and multiplexed. Numerous software packages have been developed to help deal with these complexities by grouping or annotating all the signals arising from a single molecule using a variety of approaches [23,24,25]. In non-targeted profiling, recognizing the presence of these adducts in the mass spectrum is of great importance for correctly interpreting mass spectral features.

Here, we quantitatively examine the behavior of all the ions arising from a single molecule across various column diameters in order to investigate the influence of solvent flow rate on the observed mass spectrum. We chose to focus on bile acids in the negative mode because previous work in our laboratory identified these molecules as being prone to forming ion-neutral clusters. Additionally, bile acids are complex, small molecules representative of real world metabolomics studies [26, 27]. Many previous investigations focusing on ion-neutral clustering rely on infusion based experiments and unit-mass resolution quadrupole mass analyzers. The use of reverse phase ultra-performance liquid chromatography (UPLC), high-resolution mass analyzers, and programmatic data analysis aims to extend the early work of this field towards real-world applications where the complexity of these processes can be embraced, explored, and the implications understood within the scope of modern analytical tasks.

Experimental/Methods

Preparation of Standards

Bile acid standards were obtained from the Human Metabolome Library through the Human Metabolome Database [28]. All solvents were LCMS grade or better. Bile acid standards were prepared from 1 mg mL-1 methanol stock solutions as both single component standards and calibration standard mixes by diluting the methanol stock in a 70:30 v/v solution of LCMS grade water (Fisher Scientific, Hampton, NH, USA) and LCMS grade acetonitrile (Fisher Scientific, Hampton, NH, USA) with 0.1% formic acid by volume (Thermo Fisher Scientific Pierce, Rockford, IL, USA). Single component standards were diluted to a final concentration of 1 μg mL-1. The seven calibration standard mixes span the concentration range from 125 ng mL-1 to 10 μg mL-1 with each individual component present at the same concentration within a single standard mix. The prepared standards were stored at –80 °C.

UPLC Conditions

A Waters Acquity UPLC system was used for reverse phase liquid chromatography with either a 2.1 mm × 50 mm or 1.0 mm × 50 mm HSS T3 Acquity column (Waters Corporation) with 1.8 μm diameter particles. A Waters Acquity UPLC M-Class was used for micro-flow reverse phase liquid chromatography with a 300 μm × 50 mm HSS T3 M-Class capillary column (Waters Corporation) with 1.8 μm diameter particles. The M-Class UPLC was also used with a custom passivated iKey Separation Device (Waters Corporation) with a 150 μm × 50 mm inner diameter channel packed with 1.8 μm diameter HSS T3 particles. All columns were thermostatically controlled to 45 °C using column heaters onboard the UPLC stack or within the iKey device. Fluidic connections and transfer lines were installed according to manufacturer recommendations.

Both UPLC systems were operated with a 5 μL injection loop in the partial loop injection mode (1 μL injection volume) using a 70:30 v/v water and acetonitrile with 0.1% formic acid weak wash solution to match both the composition of the standards and gradient starting conditions. Buffer A was water with 0.1% formic acid and buffer B was acetonitrile with 0.1% formic acid. Gradient elution was conducted using a 10 minute linear gradient starting at 70% A and ending at 3% A. The column was re-equilibrated at the starting conditions after the end of the gradient. The 2.1 mm column, 1.0 mm column, 300 μm column, and 150 μm iKey Separation Device were operated at flow rates of 600, 250, 15, and 5 μL min-1, corresponding to initial backpressures of 6300, 7525, 7750, and 5000 psi, respectively.
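As a quick consistency check, the on-column mass loadings quoted in later sections (0.125 to 10 ng) follow from the standard concentrations and the 1 μL injection volume. The following is a minimal sketch; the function name is illustrative, and only the two endpoint concentrations stated above are used.

```python
def on_column_mass_ng(conc_ng_per_ml: float, injection_ul: float = 1.0) -> float:
    """Mass injected on column (ng) for a standard of the given concentration."""
    # 1 mL = 1000 uL, so (ng/mL) * uL / 1000 gives ng
    return conc_ng_per_ml * injection_ul / 1000.0

lowest = on_column_mass_ng(125.0)      # 125 ng/mL calibration standard
highest = on_column_mass_ng(10_000.0)  # 10 ug/mL calibration standard
print(lowest, highest)
```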

Ion Sources and MS Configuration

The 2.1 mm, 1.0 mm, and 300 μm columns were connected to a Waters Z-Spray ion source. The iKey Separation Device was operated with the Waters ionKey source. Both sources were installed and configured on a Waters Xevo G2 TOF and operated with LockSpray to provide accurate mass measurements and compensate for m/z calibration drift [29, 30]. It should be noted that this mass spectrometer does not have the Waters StepWave found in their more modern implementation of the API. The sources were optimized based on manufacturer recommended settings with slight modifications. TOF operation and data acquisition were achieved using Waters MassLynx (ver. 4.1) appropriately configured for each instrument configuration. Mass spectra were acquired as centroided spectra at 5 Hz in the negative ion mode from 50 to 2000 m/z using the MSE acquisition mode [31]. MSE acquisition alternates between acquiring two independent mass spectra: one mass spectrum is acquired with low collisional energy and the other with high collisional energy without any precursor ion selection. A collisional energy ramp from 50 to 110 V was performed during high energy operation [32]. All source configurations were operated with the sampling cone voltage set to 10 V, the extraction cone set to 4 V, the source temperature set to 150 °C, and the cone gas set to a flow rate of 30 L Hr-1. The Z-Spray and ionKey sources were operated at capillary voltages of –1.8 and –2.5 kV, respectively. The desolvation gas flow rate and temperature depended on the column flow rate, following manufacturer guidelines. The Z-Spray source used with the 300 μm, 1.0 mm, and 2.1 mm columns was operated with desolvation gas flow rates and temperatures of 600 L Hr-1 (300 °C), 600 L Hr-1 (450 °C), and 800 L Hr-1 (600 °C), respectively; the ionKey source was operated without desolvation gas.

Data Collection, Peak Picking, and Compound Spectra Determination

A standard curve was acquired on each of the four instrument configurations starting with the lowest concentration bile acid standard mix. This process was then repeated in triplicate. Uncertainties are reported as either standard deviations or propagated and added in quadrature when appropriate. Single component standards were used to verify compound retention time and identity across the column formats as isobaric bile acids were included in the analysis (Supplement).

The Waters .raw files were converted to the Net Common Data Format (netCDF) [33] using Waters Databridge conversion software. Next, these files were grouped into individual calibration curves for processing with xcms (ver. 1.46.0) [34, 35] using the centWave algorithm [36] within the R environment (ver. 3.2.5) [37]. Relevant portions of the xcms R object were exported from the R environment and imported into Igor Pro (WaveMetrics, Inc. ver. 7.04) for all subsequent data processing, analysis, and visualization. Additionally, the exact mass bins of each xcms feature were used to bin and extract high-resolution ion chromatograms to avoid arbitrarily binning the high-resolution data. These data were also exported from the R environment and imported into Igor Pro allowing for analysis on features (peak areas, descriptive peak statistics, etc.) and scan-by-scan as extracted ion chromatograms (EICs).

A custom processing algorithm was developed in Igor Pro with the aim of grouping together all the signals from both the MS and MSE spectra that arise from a single compound.

These reconstructed sets of ions are termed compound spectra [23]. The implementation of this algorithm focuses on quantitatively characterizing in-source chemistry (ion-neutral clustering) although fragmentation data from the MSE are also processed and presented. The algorithm was realized by first searching the xcms peak group data for any features corresponding to the mass of [M–H]- for a species of interest. The [M–H]- peak was known to be present in large abundance for the compounds of interest. Then, all features with retention times (peak apex) near the center of the [M–H]- exact mass match are grouped together as preliminary compound spectra. Last, Pearson’s r was calculated for every high-resolution EIC against the [M–H]- EIC within the retention time window of the [M–H]- EIC to verify each ion actually belongs to the compound spectrum. Subsequent analysis was conducted using these verified compound spectra.
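The three steps described above (exact mass search, retention time grouping, and EIC correlation) can be sketched as follows. This is a minimal Python illustration and not the Igor Pro implementation used in this work: the mass tolerance, retention time tolerance, and correlation cutoff are assumed values that are not stated here, and the feature table, m/z values, and EICs are synthetic.

```python
import numpy as np

def pearson_r(a, b):
    """Pearson correlation coefficient between two equal-length traces."""
    a = np.asarray(a, dtype=float) - np.mean(a)
    b = np.asarray(b, dtype=float) - np.mean(b)
    denom = np.sqrt(np.sum(a * a) * np.sum(b * b))
    return float(np.sum(a * b) / denom) if denom > 0 else 0.0

def compound_spectrum(features, eics, scan_rt, target_mz,
                      mz_tol=0.01, rt_tol=2.0, r_min=0.9):
    """Return ids of features verified as belonging to one compound.

    mz_tol, rt_tol (s), and r_min are assumed thresholds for illustration.
    """
    # Step 1: locate the [M-H]- feature by exact mass (largest area wins).
    hits = [f for f in features if abs(f["mz"] - target_mz) <= mz_tol]
    if not hits:
        return []
    anchor = max(hits, key=lambda f: f["area"])
    # Step 2: preliminary grouping by peak-apex retention time.
    prelim = [f for f in features if abs(f["rt"] - anchor["rt"]) <= rt_tol]
    # Step 3: verify by correlating EICs inside the anchor's RT window.
    window = (scan_rt >= anchor["rtmin"]) & (scan_rt <= anchor["rtmax"])
    ref = eics[anchor["id"]][window]
    return [f["id"] for f in prelim
            if pearson_r(eics[f["id"]][window], ref) >= r_min]

# Synthetic example: two co-eluting ions and one drifting background ion.
scan_rt = np.arange(300) * 0.2                        # scan times in seconds
peak = np.exp(-0.5 * ((scan_rt - 30.0) / 2.0) ** 2)   # shared elution profile
features = [
    {"id": "MH",  "mz": 407.28, "rt": 30.0, "rtmin": 24.0, "rtmax": 36.0, "area": 1e5},
    {"id": "2MH", "mz": 815.57, "rt": 30.0, "rtmin": 24.0, "rtmax": 36.0, "area": 3e4},
    {"id": "bg",  "mz": 255.23, "rt": 30.4, "rtmin": 20.0, "rtmax": 40.0, "area": 5e3},
]
eics = {"MH": 1e5 * peak, "2MH": 3e4 * peak, "bg": 50.0 * scan_rt}
verified = compound_spectrum(features, eics, scan_rt, target_mz=407.28)
print(verified)
```

In this sketch the background ion passes the retention time filter but is rejected by the correlation step, which is the role the Pearson's r verification plays in the procedure described above.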

Results

Species Dependent Compound Spectra

Low collisional energy compound spectra exhibited significantly different distributions of ion-neutral clusters depending on the bile acid investigated. These compounds show similar patterns by broad bile acid subclasses as defined by the functional groups found on the steroidal ring or on the conjugation site; this is consistent with previous modeling work attempting to predict in-source phenomena for metabolite annotation where classes of compounds were found to exhibit similar, though not identical, in-source behavior [1]. Many of the same ion-neutral clusters are present for each species investigated but in differing relative abundances. For these reasons, and to keep the data presented manageable, the presented analysis is limited to four representative bile acids: cholic acid (C24H40O5), 3-oxocholic acid (C24H38O5), glycocholic acid (C26H43NO6), and taurocholic acid (C26H45NO7S) (Figure 1). These species were baseline resolved or nearly baseline resolved with little to no co-elution (Supplemental Information). Figure 2 shows the low-energy MS and collisional energy ramped MSE compound spectra of the four representative bile acids. These data were acquired during the same chromatographic run on the 1.0 mm × 50 mm T3 column at 5 ng on column using the Z-Spray ESI source. Each of the four bile acids generated multiple signals arising from ion-neutral clusters. For example, cholic acid produced at least 20 features with three or more points across the calibration range for all the flow regimes, with increasing numbers of features observed at higher mass loadings (Supplemental Information). Little to no fragmentation was observed in the low collisional energy MS mode. Many ion-neutral clusters present in the MS compound spectrum are also present in the MSE compound spectrum; this means that some of these clusters are stable enough to survive at least a portion of the MSE collisional energy ramp, which starts at 50 V.

Figure 1

Structures of the representative bile acid classes. (A) Cholic acid, (B) 3-oxocholic acid, (C) glycocholic acid, (D) taurocholic acid

Figure 2

Verified compound spectra of cholic acid, 3-oxocholic acid, glycocholic acid, and taurocholic acid. Ion signal intensities are shown at the chromatographic peak maximum. All data were acquired on the Z-Spray source with a 1.0 mm × 50 mm T3 column at a mass loading of 5 ng on column. MS (low-energy) ion signals are plotted as positive (black) bars and MSE (high-energy) ion signals are plotted as negative (red) bars. The number of features (isotopes included) in the MS (low-energy) spectra are included in the title of each graph subset. Various high-abundance and commonly observed ions are annotated based on their exact mass values and isotopic ratios: (1) [M–H]-, (2) [M–H+HCl]-, (3) [M–H+FA]-, (4) [M–2H+FA+Na]-, (5) [2M–H]-, (6) [2M–2H+Na]-

The spectral similarity of the compound spectra was investigated by calculating the cosine similarity of the four representative bile acids across all concentrations and other species investigated [38, 39] (see Supplement for details). The maximum concentration (10 ng on column) was used as the reference mass spectrum against which all other spectra are compared. The procedure uses mass-gridded and normalized compound spectra in which each EIC extracted across all files is ordered in a fixed list regardless of whether or not a peak was detected in that individual file. Changes in concentration for a bile acid of interest lead to substantial changes in the compound spectrum, reflected in the calculated spectral similarity scores. This highlights the fact that the ions or ion-neutral clusters arising from a single species do not all increase linearly with concentration and that each ion or cluster has its own sensitivity, limit of detection, and linear range (discussed in subsequent sections). The ions in each compound spectrum are not fixed in relative abundance but change as a function of concentration.
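The cosine similarity between a compound spectrum and the reference spectrum can be sketched as below. This is a minimal illustration, assuming the spectra have already been placed on the same fixed ion grid with zeros where no peak was detected; the ion list and relative intensities are hypothetical and are not measured values.

```python
import numpy as np

def cosine_similarity(spectrum, reference):
    """Cosine of the angle between two intensity vectors on a shared ion grid."""
    a = np.asarray(spectrum, dtype=float)
    b = np.asarray(reference, dtype=float)
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(np.dot(a, b) / denom) if denom > 0 else 0.0

# Fixed ion order shared by every file (hypothetical three-ion grid).
ion_grid = ["[M-H]-", "[M-H+FA]-", "[2M-H]-"]
reference = np.array([1.00, 0.50, 0.20])  # highest loading, base-peak normalized
low_conc = np.array([1.00, 0.00, 0.00])   # only [M-H]- detected at low loading

score = cosine_similarity(low_conc, reference)
print(f"cosine similarity to reference: {score:.3f}")
```

Because the cosine score depends only on relative intensities, a score below one for the low loading reflects the clusters that fall below detection rather than the overall drop in signal.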

Solvent Flow Rate

The source configuration significantly influenced the compound spectrum observed for each bile acid driven primarily by the solvent flow rate determined by the column inner diameter (Figure 3). The solvent flow rate has two major effects on the electrospray process: (1) lower flow rates generate smaller droplets leading to increased ionization efficiency and higher sensitivity [40, 41], and (2) the lower flow rates bias the concentration of eluting compound towards higher instantaneous analyte concentration despite increased chromatographic peak width on the smaller scale separation devices. It should be noted that the physical source design (e.g., spray direction, distance from ESI emitter to MS inlet, emitter type, etc.) does not appear to significantly change the observed ion-neutral clustering in a way that deviates from our analysis of solvent flow rate. The use of the ionKey source and iKey Separation Device compared with the use of standard analytical columns with a standard ESI source highlights this finding. It is the ESI process itself occurring at each flow rate that primarily changes the behavior of ion-neutral clusters rather than the source hardware. The results obtained with the completely different ionKey source system follow the results obtained using standard columns.

Figure 3

Compound spectra of the four representative bile acid species are shown for the four different instrument configurations: 150 μm iKey (black), 300 μm capillary column (red), 1.0 mm column (blue), and 2.1 mm column (purple)

We identified two general classes of ion-neutral clusters in these systems that arise from the bile acids. First, contaminant-containing clusters: these clusters consist of ions or neutrals arising from species not intentionally added to the system (e.g., no sodium was added to our buffer system, but Na+ is present in trace amounts in the solvents). Second, buffer-additive-containing clusters (e.g., formic acid was intentionally added to our buffer system) and retained species clusters such as homodimers/homo-multimers arising from the compound of interest, which has been chromatographically retained (e.g., [nM–H]-, where n = 1, 2, 3…). Low-flow LC configurations (M-Class UPLC: 5, 15 μL min-1) bias the compound spectrum towards homodimers and buffer containing clusters. High-flow LC configurations (Acquity UPLC: 250, 600 μL min-1) also contain homodimers and buffer containing clusters in substantial abundance, but other contaminant-containing clusters congest the mass spectrum due to the increased solvent flux.

Concentration Dependence of Observed Ion-Neutral Adducts

Calibration curves of select ions in the four representative compound spectra are presented in Figure 4 along with the sum of all the ions in the MS level compound spectra. These data were acquired on the 1.0 mm × 50 mm T3 column and reported in triplicate with the error bars representing the standard deviation of the averaged, integrated peak areas. Calibration curves obtained on the other columns are presented in the Supplemental Information. The average relative standard deviation (RSD) of each point in Figure 4 is less than ~10%; this is consistent with nearly all the columns investigated. More importantly, the RSD of the ion-neutral adducts for all column/source configurations is consistent with the corresponding [M–H]- RSD showing that these clusters are highly reproducible under constant instrument conditions.

Figure 4

Calibration curves of select ions from cholic acid (black), 3-oxocholic acid (red), glycocholic acid (blue), and taurocholic acid (purple) are plotted in averaged triplicate with error bars representing the standard deviation of the average. These data were all collected on the Z-Spray source with a 1.0 mm × 50 mm T3 column. Weighted least squares linear regression results are also plotted. The sum of the MS ions in the compound spectra (bottom left), [M–H]- (top left), [2M–H]- (bottom right), and [M–2H+FA+Na]- (top right) are provided to illustrate representative behaviors

Figure 4 and supplemental calibration curves show all calibration points above the limit of detection. Weighted least squares linear regression was performed on the calibration curves because non-linearity was observed with some of the ions at high concentrations. These fits were included in Figure 4 and the supplemental data to draw the eye to deviations from linear behavior and should not be thought of as traditional calibration curves fitted over the linear range. The calibrations were conducted using the same on-column mass loadings for each column, which led to different regions of the linear range being investigated on each column. Some points are below the limit of quantification (excluded from analysis), and other points exhibit behavior indicative of saturating the electrospray. Saturation is obvious at the 10 ng loading on the iKey, but some deviation from linear behavior is observed for the [M–H]- ion at the 10 ng loading for all columns. Electrospray saturation occurs where the instantaneous concentration of bile acid eluting from the column exceeds ~10⁻⁵ M [42]. The 10⁻⁵ M saturation concentration is generally agreed upon, but compound-dependent differences are known to exist and limit the linear range [43].
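A weighted least squares straight-line fit of the kind described above can be sketched as follows. The weighting scheme is not specified here, so this sketch assumes weights of 1/σ², with σ taken as a fixed 5% of each peak area; the intermediate mass loadings, the response factor, and the 20% shortfall at the top point are likewise hypothetical values chosen to mimic saturation.

```python
import numpy as np

def weighted_linear_fit(x, y, sigma):
    """Straight-line fit minimizing sum(((y - (m*x + b)) / sigma)**2)."""
    x, y, sigma = (np.asarray(v, dtype=float) for v in (x, y, sigma))
    w = 1.0 / sigma**2                        # assumed weighting: 1/sigma^2
    sw, swx, swy = w.sum(), (w * x).sum(), (w * y).sum()
    swxx, swxy = (w * x * x).sum(), (w * x * y).sum()
    slope = (sw * swxy - swx * swy) / (sw * swxx - swx**2)
    intercept = (swy - slope * swx) / sw
    return slope, intercept

# Hypothetical calibration: linear response except at the highest loading.
loadings = np.array([0.125, 0.25, 0.5, 1.0, 2.5, 5.0, 10.0])  # ng on column
areas = 2.0e4 * loadings
areas[-1] *= 0.8               # top point 20% low, mimicking saturation
sigma = 0.05 * areas           # assumed 5% relative standard deviation

slope, intercept = weighted_linear_fit(loadings, areas, sigma)
print(f"slope = {slope:.0f} counts/ng, intercept = {intercept:.1f}")
```

With these weights the low-loading points dominate the fit, so the line stays close to the low-concentration response and the saturated point stands out as a deviation, which is how the fits are used in Figure 4.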

Least squares linear regression (non-weighted) was also performed from 0.125 ng to 5 ng to better understand the linear range and to understand how best to deal with these phenomena in the case of non-targeted metabolomics where internal standards are absent (Supplement). The sum of the MS compound spectrum from 0.125 ng to 5 ng consistently exhibits the highest r² value while also containing the most calibration points across the calibration range. The [M–H]- calibration curves also have high r² values, but the species that exhibited extensive clustering behavior (cholic acid and 3-oxocholic acid) show lower r² values compared with the sum of the compound spectra. The lower r² values for the [M–H]- calibration curve are most obvious for the low-flow systems, probably due to an increased abundance of the [2M–H]- cluster. The summed compound spectra approach is a promising route to robust quantification when complex in-source phenomena are observed; this is consistent with previous work quantifying ginkgolides and bilobalide, where various adducts were summed to improve linearity [5]. It should be highlighted that in the case of obvious saturation (e.g., iKey at 10 ng on column) the summed calibration method does not improve linearity since the deviation from linear behavior is so severe.
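The benefit of summing the compound spectrum can be illustrated with a toy model. The numbers below are hypothetical and not measured data: the sketch simply assumes that signal diverted into a dimer grows with the square of the loading and is lost from the [M–H]- channel, so that only the sum remains linear.

```python
import numpy as np

def r_squared(x, y):
    """Coefficient of determination for an unweighted straight-line fit."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    slope, intercept = np.polyfit(x, y, 1)
    resid = y - (slope * x + intercept)
    return 1.0 - np.sum(resid**2) / np.sum((y - y.mean())**2)

loadings = np.array([0.125, 0.25, 0.5, 1.0, 2.5, 5.0])  # ng on column (assumed)
dimer = 400.0 * loadings**2           # toy [2M-H]- signal, quadratic in loading
monomer = 2.0e4 * loadings - dimer    # toy [M-H]- signal, depleted by the dimer
summed = monomer + dimer              # summed compound spectrum

r2_monomer = r_squared(loadings, monomer)
r2_summed = r_squared(loadings, summed)
print(f"r2 [M-H]- = {r2_monomer:.5f}, r2 summed = {r2_summed:.5f}")
```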

The sensitivities of the various ion source configurations follow the anticipated ESI behavior, with the sensitivity increasing at lower flow rates for the mass loadings between 0.125 and 5 ng on column. For example, the [M–H]- ion of taurocholic acid on the 2.1 mm column exhibits an absolute sensitivity of 1.99 × 10⁴ counts ng⁻¹ with an r² value of 0.9994. The 1.0 mm column and 300 μm capillary column show increasing sensitivities relative to the 2.1 mm column of 1.6× and 6.3× with r² values of 0.9995 and 0.9971, respectively. The iKey shows a relative increase of only 4.3× over the 2.1 mm column, with an r² value of 0.9908, due to peak broadening on the iKey device.

The highest mass loadings (5–10 ng on column) and lowest flow rates produce higher-order clusters observed in the mass range corresponding to [3M–H]- and [4M–H]- (Figure 3). Homo-multimer formation occurs to the greatest extent on the iKey device, which was operated at the lowest flow rate (3 μL min-1), leading to the highest instantaneous concentrations and saturation at 10 ng. The mass differences observed on the iKey are consistent with pure homo-multimer formation following the pattern of [nM–H]- for cholic acid, glycocholic acid, and taurocholic acid. Interestingly, 3-oxocholic acid does not form a [3M–H]- cluster but rather a [3M–2H+Na]-. Various other unidentified clusters exist in these mass ranges when using the iKey device, but the most important finding is that at higher flow rates the higher-order clusters (n > 2) contain other species, making annotation of these signals more challenging and more complex compared to the iKey. With the exception of taurocholic acid, the pure homo-multimer [nM–H]- with n > 2 is absent for all species investigated on all source/column configurations other than the iKey. The clusters in this mass range on the 300 μm column are not pure homo-multimer species.

Cluster Dependence Across Solvent Flow Rates

The ion source, concentration, and analyte dependence can be further investigated as a function of instantaneous analyte concentration linking the behavior of the various experimental configurations together. The instantaneous mass of a bile acid species eluting from the column was calculated by assuming that 100% of the injected mass elutes during the observed peak. Then, the fraction of the total signal at each point defining the chromatographic peak can be used to estimate the mass of compound being eluted. Similarly, the integrated peak areas can be plotted in the same coordinate-space by integrating the solvent flow rate across the entire peak. The number of moles of bile acid injected on column is divided by the integrated solvent volume to provide an average concentration across the peak.
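The scan-by-scan and peak-averaged concentration estimates described above can be sketched as follows. This is a minimal illustration under the stated assumption that all of the injected mass elutes within the peak; the scan interval, peak shape, and molar mass (an approximate value for cholic acid) are assumptions chosen for the example, and the function names are hypothetical.

```python
import numpy as np

def instantaneous_conc_molar(intensity, scan_time_s, injected_ng,
                             flow_ul_min, mw_g_mol):
    """Estimated molar concentration eluting during each scan of a peak."""
    intensity = np.asarray(intensity, dtype=float)
    frac = intensity / intensity.sum()        # fraction of total signal per scan
    mol = frac * injected_ng * 1e-9 / mw_g_mol            # moles per scan
    vol_l = flow_ul_min * 1e-6 / 60.0 * scan_time_s       # litres per scan
    return mol / vol_l

def average_conc_molar(n_scans, scan_time_s, injected_ng,
                       flow_ul_min, mw_g_mol):
    """Injected moles divided by the solvent volume integrated over the peak."""
    vol_l = flow_ul_min * 1e-6 / 60.0 * scan_time_s * n_scans
    return injected_ng * 1e-9 / mw_g_mol / vol_l

# Synthetic Gaussian peak: 60 scans at an assumed 0.2 s scan interval.
scans = np.arange(60)
eic = np.exp(-0.5 * ((scans - 30.0) / 6.0) ** 2)
conc = instantaneous_conc_molar(eic, scan_time_s=0.2, injected_ng=5.0,
                                flow_ul_min=250.0, mw_g_mol=408.6)
avg = average_conc_molar(len(eic), scan_time_s=0.2, injected_ng=5.0,
                         flow_ul_min=250.0, mw_g_mol=408.6)
print(f"apex: {conc.max():.2e} M, peak average: {avg:.2e} M")
```

Because every scan spans the same solvent volume at constant flow, the mean of the scan-by-scan values equals the peak-averaged concentration, which is what allows both to be plotted on the same axis.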

This scan-by-scan approach allows for a much broader range of concentrations to be evaluated compared to only examining integrated peak areas, takes advantage of all the available data, and highlights the changing relative abundance of ions across an eluting chromatographic peak. Figure 5 shows the [M–H]- ion plotted against the instantaneous concentration of bile acid. It should be noted that the assumptions made to calculate instantaneous analyte concentration scan-by-scan may fail at the highest on-column mass loadings due to ESI saturation. At high concentrations, the ESI response to instantaneous changes in concentration becomes non-linear biasing the scan-by-scan concentrations. This is observed in Figure 5 since the integrated peaks show slight deviations from linearity in log-log space compared to the individual points which were calculated assuming signal response is directly proportional to concentration.

Figure 5

[M–H]- bile acid signal intensity plotted against the calculated concentration of bile acid eluting from the column for both integrated peak areas (large markers, right axis) and extracted ion chromatograms (small markers, left axis) for all instrument configurations. The integrated peak areas are the average of triplicate injections with error bars representing the standard deviation of the average. Extracted ion chromatograms are plotted spectrum by spectrum but are also displayed in triplicate

The data acquired using 1.0 and 2.1 mm columns fall along a single linear trend line for each source in Figure 5. Interestingly, the 300 μm capillary column and 150 μm iKey exhibit multiple offset linear trend lines for all the species except for taurocholic acid on the 300 μm capillary column. These linear patterns are highly reproducible and highlighted in the supplemental information. Each individual linear trend line corresponds to a different mass of bile acid injected on column, and all the data presented in Figure 5 are shown in triplicate (i.e., multiple injections at the same concentration all fall on the same trend line). We hypothesize that this behavior arises from two sources: (1) non-linear ESI response at high concentrations biases the calculation of the x-axis, and (2) the iKey and 300 μm capillary column exhibit the highest extent of homodimer formation which behaves non-linearly as a function of concentration.

These hypotheses are supported by the fact that taurocholic acid ([M–H]-) falls along a single trend line on the 300 μm capillary column, produces few ion-neutral clusters, and shows little deviation from linear behavior across the entire calibration range. Conversely, glycocholic acid ([M–H]-) on the same column shows multiple trend lines, produces more ion-neutral clusters than taurocholic acid, and also shows little deviation from linear behavior across the entire calibration range. Thus, cluster formation in competition with ([M–H]-) appears to also produce these trend lines.

The other ions in the compound spectrum can be investigated by examining the ratio of some ion-neutral cluster of interest divided by the corresponding [M–H]- signal intensity. Two examples are provided following the data displayed in Figure 4: the ratio of [2M–H]-/[M–H]- (Figure 6) and [M–2H+FA+Na]-/[M–H]- (Figure 7). These two species represent the general phenomena occurring in these experiments. The [2M–H]-/[M–H]- ratio represents the dominant phenomenon occurring at lower solvent flow rates, where conditions lead to higher concentrations of bile acid relative to solvent vapors, buffer additives, and contaminants. Conversely, [M–2H+FA+Na]-/[M–H]- represents contaminant influence that becomes more important at higher flow rate operation with standard analytical columns. This ion-neutral cluster contains sodium, which is a contaminant species present in higher relative abundances at higher solvent flow rates. The [2M–H]-/[M–H]- ratio linearly increases with the concentration of M because the formation of this species is dependent on M (note the log-scale). The abundance of the [M–2H+FA+Na]- relative to [M–H]- should decrease at lower solvent flow rates, which limit the abundance of Na+. This trend is observed in the data. These results across instrument configurations show that the concentration of various species in the solvent and the various available ion-neutral adduction pathways determine the observed ion distribution.
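The linear dependence of the [2M–H]-/[M–H]- ratio on the concentration of M can be sketched with a simple association equilibrium. This is an illustrative assumption rather than a mechanism established by these data: K is a hypothetical effective association constant, I denotes ion signal intensity, and signal is taken to be proportional to ion abundance.

```latex
[\mathrm{M-H}]^- + \mathrm{M} \rightleftharpoons [\mathrm{2M-H}]^-,
\qquad
K = \frac{I_{[\mathrm{2M-H}]^-}}{I_{[\mathrm{M-H}]^-}\,[\mathrm{M}]}
\;\;\Longrightarrow\;\;
\frac{I_{[\mathrm{2M-H}]^-}}{I_{[\mathrm{M-H}]^-}} = K\,[\mathrm{M}],
\qquad
\log\frac{I_{[\mathrm{2M-H}]^-}}{I_{[\mathrm{M-H}]^-}} = \log K + \log[\mathrm{M}]
```

Under this assumption the ratio would appear as a line of unit slope on log-log axes, with any dependence of K on flow rate or solvent composition showing up as a vertical offset between configurations.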

Figure 6

Ratio plots of the [2M–H]- ion signal intensity relative to the [M–H]- ion signal intensity plotted against the calculated concentration of bile acid eluting from the column. Integrated peak areas (large markers, right axis) and extracted ion chromatograms (small markers, left axis) are displayed for all instrument configurations

Figure 7

Log scale ratio plots of the [M–2H+FA+Na]- ion signal intensity relative to the [M–H]- ion signal intensity plotted against the calculated concentration of bile acid eluting from the column. Integrated peak areas (large markers, right axis) and extracted ion chromatograms (small markers, left axis) are displayed for all instrument configurations

Close examination of Figure 6 shows that clustering under different flow regimes is not entirely explained by the instantaneous concentration of bile acid eluting from the column. This is due to numerous other processes occurring in the electrospray across flow regimes during gradient elution. Not only does the concentration profile of each compound differ with each column, the ESI sensitivity is also changing due to different flow rates. Examining the two species in Figures 6 and 7 also tells an incomplete story since the distribution of clusters is different with each column. This again arises from numerous possible factors including the composition of the solvent during the eluting peak [9], the solute concentration [43, 44], the concentration of various contaminants [9], gas-phase chemistry arising from neutral species in thermal equilibrium with the produced ions [22], and solvent flow rate through the electrospray emitter [40, 41].

Discussion

Ionization and Clustering Mechanisms

The various observations above can be reconciled with standard ESI models [40, 41, 44,45,46] (CRM and IEM) and with the potential for the gas-phase formation (or transformation) of clusters during transit from the source through the API, without knowing which processes are ultimately controlling the observed ion distribution [8]. Lower solvent flow rates will generate smaller solvent droplets with higher concentrations of the retained analyte species in each droplet if the same mass of analyte is loaded on the column. As droplet evaporation, droplet fission, and/or ion-evaporation occur, the likelihood of multiple analyte species forming dimers and higher-order clusters increases under low-flow conditions relative to higher-flow systems due to increased analyte concentration. High-flow rate systems will produce larger droplets with lower analyte concentrations. As these droplets undergo fission and/or ion-evaporation, the ratio of contaminant species and buffer additives to analyte species will be larger than in the low-flow systems, leading to more contaminant and buffer containing clusters.

Similarly, in the gas phase, one would expect all the thermalized ions and ion-neutral clusters produced in the electrospray to cool rapidly during transfer through the high-pressure stages of the API, reaching thermal equilibrium [15, 17, 19,20,21, 47]. At constant contaminant concentrations in the mobile phase (e.g., Na+), increased solvent flow rates result in increased gas-phase concentrations of contaminants and buffer additives relative to the eluting compound. This shifts the equilibrium towards contaminant- and buffer-additive-containing clusters and away from the homodimer (retained species) population. Conversely, low-flow LC conditions will produce much higher concentrations of bile acid and lower concentrations of contaminant species, leading to a relative increase in homodimers or retained-species clusters.

All of the observed ions arise from complex processes where various ions/clusters formed in the ion source are being transmitted to the detector to various extents and fragmenting and/or clustering into other potentially observable ions and clusters even under low collision energy MS mode [19,20,21, 47, 48]. The cluster distribution in the ion source or in the initial stages of the API is unknown, but the observed ions (the mass spectrum) depend on the initial ion distribution and configuration of the mass spectrometer (i.e., internal temperature of ions in the source, collisional/translational energies through each optic, thermal equilibration during adiabatic expansion between pressure stages, etc.) (see [19] and [47] for a detailed discussion of clusters and the processes occurring in APIs). This makes understanding the exact formation mechanism of each member of these compound spectra very difficult. More importantly, the behavior and general trends that are observed for each species can be investigated to provide some constraints to better understand these complex systems. This becomes valuable for groups attempting to extract the maximum amount of chemical information from the mass spectrum. These data suggest that the matrix effects and differences across different platforms can actually be understood and investigated for method optimization. Sample matrix changes the chemical environment during the ionization process as well as during the ion’s initial transfer into the API leading to changes in sensitivity, ionized species, and analytical performance. Many of these reactions are observable if the mass spectrum is carefully examined and statistical grouping tools are applied to track the observed ions back to the individual compound.

Linearity and Concentration Response

The improved linearity observed for the summed MS compound spectra makes sense when examining the trends of the various ion-neutral clusters. For example, [M–2H+Na+FA]- saturates at the highest mass loadings while the [2M–H]- cluster begins to increase exponentially. This behavior was also observed by Zook et al. [8] when the infused analyte concentration became sufficiently high to saturate the electrospray (~10⁻⁵ M). It is likely that this system is Na+ limited, leading to the saturation of the [M–2H+Na+FA]- cluster, while the [2M–H]- cluster depends only on the concentration of the bile acid species, M.
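The contrast between a Na+-limited adduct and an analyte-driven dimer can be illustrated with a toy model. The sketch below is not fitted to the data in this work: the functional forms (a saturating hyperbolic term for [M–2H+Na+FA]- and a second-order term for [2M–H]-), the function names, and every constant are assumptions chosen only to reproduce the qualitative trend.

```python
# Toy illustration only: functional forms and constants are assumptions,
# not values derived from the measurements discussed in the text.

def sodium_adduct_signal(analyte, sodium_limit=1.0, half_saturation=2.0):
    """Hypothetical Na+-limited response that plateaus at sodium_limit."""
    return sodium_limit * analyte / (half_saturation + analyte)


def homodimer_signal(analyte, k_dimer=0.01):
    """Hypothetical second-order response that depends only on analyte M."""
    return k_dimer * analyte ** 2


mass_loadings = [1, 2, 5, 10, 20, 50, 100]  # arbitrary units

for loading in mass_loadings:
    adduct = sodium_adduct_signal(loading)
    dimer = homodimer_signal(loading)
    print(f"loading={loading:>4}  adduct={adduct:.3f}  dimer={dimer:.3f}")
```

In this sketch, doubling the loading at the high end barely changes the adduct signal, while the dimer signal keeps growing. That is the pattern expected when one cluster is limited by a fixed Na+ supply and the other is not.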

Metwally et al. differentiate between peak splitting due to adduct formation and ion suppression in their work investigating salt-induced signal suppression in protein detection using electrospray [6]. The authors define ion suppression as the point at which salt concentrations reduce the total analyte signal. This is mostly consistent with the data presented herein, but requires some clarification in light of this work and earlier studies examining these effects [8, 9]. Summing the ion intensities of each ion in the MS-level compound spectra improves linearity compared to only using the most abundant signal, [M–H]-; this is clearly consistent with peak splitting. At sufficiently high analyte (or salt) concentrations, where charge competition begins to dominate and reduce ionization efficiency, saturation as defined by Metwally et al. occurs. Simultaneously, the formation of homo-multimers ([nM–H]- or [nM+H]+) increases when the saturation is driven by the analyte concentration. Thus, a substantial increase in homo-multimer formation becomes a dominant feature of the mass spectrum under analyte-saturating conditions. This differs from salt-induced ion suppression in that the analyte itself induces the suppression effects. Peak splitting therefore continues to occur at high concentrations, alongside ion suppression. Below concentrations leading to ion suppression, the intensities of the various adducts should be summed to increase linearity because the signal is simply being divided into different ion m/z bins.
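The effect of peak splitting on linearity can be sketched with synthetic numbers. In the example below, the total signal is assumed to be strictly proportional to concentration (i.e., below ion suppression) and is divided among three ions with a concentration-dependent split. The splitting rule, the helper names (`split_signal`, `r_squared`), and all constants are hypothetical and serve only to show why summing restores linearity.

```python
# Synthetic illustration: the splitting rule and constants are assumptions.

def split_signal(conc, response=100.0, k_dimer=40.0, adduct_fraction=0.2):
    """Divide a linear total signal among three ions (hypothetical rule)."""
    total = response * conc
    dimer = total * conc / (conc + k_dimer)        # grows with concentration
    adduct = (total - dimer) * adduct_fraction     # fixed share of the rest
    deprotonated = total - dimer - adduct          # what is left for [M-H]-
    return {"[M-H]-": deprotonated, "[M-2H+Na+FA]-": adduct, "[2M-H]-": dimer}


def r_squared(x, y):
    """Coefficient of determination for a simple least-squares line."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy ** 2 / (sxx * syy)


concentrations = [1, 2, 5, 10, 20, 50]  # arbitrary units
spectra = [split_signal(c) for c in concentrations]

single_ion = [s["[M-H]-"] for s in spectra]
summed = [sum(s.values()) for s in spectra]

r2_single = r_squared(concentrations, single_ion)
r2_summed = r_squared(concentrations, summed)
print(f"R^2 using [M-H]- only: {r2_single:.4f}")
print(f"R^2 using summed ions: {r2_summed:.4f}")
```

Because the total is linear by construction, the summed response fits a straight line essentially perfectly, whereas [M–H]- alone curves as signal shifts into the dimer. Real data above the suppression threshold would not behave this way, since the total itself would fall below the linear trend.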

The instantaneous analyte concentration across a chromatographic peak is important for the formation of, and the competition between, various ions and ion-neutral adducts. One needs to be careful not to oversimplify this generalization; previous work has shown that increasing the buffer concentration (i.e., salt) does not directly lead to increased buffer-containing adducts [49, 50]. Again, this is likely due to the complex relationships between various ion formation pathways and subsequent chemistry occurring in the API. Increasing salt or buffer concentrations reduces the extent to which analyte species are initially ionized due to charge competition. This immediately reduces the concentration of analyte ions, preventing the formation of analyte-ion-containing clusters regardless of where they form.

Conclusions

The complexity and diversity of ions observed for bile acids detected in the negative mode are extensive and deviate significantly from the canonical model of molecular or “pseudo-molecular” ions ([M–H]- or [M+H]+) dominating the mass spectrum. Decades of work have attempted to understand the behavior and formation mechanisms of ion-neutral clusters generated from electrosprays, highlighting the fact that this is a ubiquitous phenomenon of electrospray ionization and atmospheric pressure ionization techniques in general. Understanding the relationship of the ions and ion-neutral clusters generated in electrosprays and their subsequent transformations during transmission through API mass spectrometers remains a significant challenge to both experimentalists and modelers. The results presented herein provide qualitative and quantitative descriptions of the behavior of these clusters in the context of LC-ESI-MS, but further studies are needed to fully explore the mechanisms of cluster formation.

The chemical diversity of the molecules under investigation in metabolomics workflows, particularly non-targeted workflows, necessitates that these processes be given more attention, as quantification and robust annotation remain ongoing challenges [1, 3]. Quantification by summing all signals arising from a compound under low collisional energy operation improves linearity and increases the sensitivity. The complexity observed with this structurally similar set of molecules should not be thought of as a rare, isolated, or worst-case scenario but as a ubiquitous phenomenon [1, 3, 19, 51]. Many of these adducts are energetically stable, with binding energies sufficiently high that collisional dissociation of all the clusters would inevitably cause unwanted and substantial fragmentation of other species.

The distribution of ions changes across LC flow regimes following the availability of ions and neutral species. Standard analytical columns (2.1 and 1.0 mm) operated under typical UPLC conditions lead to higher levels of contaminant-containing ion-neutral adducts compared to micro-flow regimes. The micro-flow regimes produce ion distributions with an increased likelihood of observing homodimers and higher order clusters with retained analytes driven primarily by instantaneous solute concentration.

The complexity of high-resolution UPLC-ESI-MS data requires data-driven workflows that make fewer assumptions and operate on both individual data files and data sets. Most models and statistical methods would be incapable of dealing with compound spectra where the relative abundance of each ion changes across the concentration range (often non-linearly). Spectral similarity scores developed for matching experimental spectra to reference spectra show limitations for these compound spectra because the ions shift in relative abundance as a function of concentration. When taken as a whole and considering the amount of data produced on high-resolution mass spectrometers, integration of all available data for a single sample set or individual experiment remains a distant goal, but we have reached a level of understanding where the benefits of such approaches are undeniable for improving annotation and compound identification.
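The sensitivity of similarity scoring to concentration-dependent ion ratios can be shown with a minimal sketch. The example below uses plain cosine similarity as a stand-in for the scores used in spectral matching. The two compound spectra are invented for illustration, with relative intensities keyed by ion label rather than by measured m/z.

```python
import math

# Hypothetical compound spectra for one analyte at two concentrations.
# Intensities are invented and keyed by ion label, not by measured m/z.
low_concentration = {"[M-H]-": 100.0, "[M-2H+Na+FA]-": 20.0, "[2M-H]-": 2.0}
high_concentration = {"[M-H]-": 100.0, "[M-2H+Na+FA]-": 25.0, "[2M-H]-": 150.0}


def cosine_similarity(spec_a, spec_b):
    """Cosine similarity between two spectra stored as {ion: intensity}."""
    ions = set(spec_a) | set(spec_b)
    dot = sum(spec_a.get(ion, 0.0) * spec_b.get(ion, 0.0) for ion in ions)
    norm_a = math.sqrt(sum(v * v for v in spec_a.values()))
    norm_b = math.sqrt(sum(v * v for v in spec_b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # an empty spectrum matches nothing
    return dot / (norm_a * norm_b)


same = cosine_similarity(low_concentration, low_concentration)
shifted = cosine_similarity(low_concentration, high_concentration)
print(f"low vs. low:  {same:.3f}")
print(f"low vs. high: {shifted:.3f}")
```

Both spectra belong to the same compound and contain the same ions, yet the score drops substantially once the dimer becomes prominent. A reference spectrum acquired at one concentration may therefore match poorly against the same compound measured at another.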