Modelling the acid/base 1H NMR chemical shift limits of metabolites in human urine

Introduction Despite the use of buffering agents, the 1H NMR spectra of biofluid samples in metabolic profiling investigations typically suffer from extensive peak frequency shifting between spectra. These chemical shift changes are mainly due to differences in pH and divalent metal ion concentrations between the samples. This frequency shifting results in a correspondence problem: it can be hard to register the same peak as belonging to the same molecule across multiple samples. The problem is especially acute for urine, which can have a wide range of ionic concentrations between different samples.
Objectives To investigate the acid, base and metal ion dependent 1H NMR chemical shift variations and limits of the main metabolites in a complex biological mixture.
Methods Urine samples from five different individuals were collected and pooled, and pre-treated with Chelex-100 ion exchange resin. Urine samples were treated with either HCl or NaOH, or were supplemented with various concentrations of CaCl2, MgCl2, NaCl or KCl, and their 1H NMR spectra were acquired.
Results Nonlinear fitting was used to derive acid dissociation constants and acid and base chemical shift limits for peaks from 33 identified metabolites. Peak pH titration curves for a further 65 unidentified peaks were also obtained for future reference. Furthermore, the peak variations induced by the main metal ions present in urine, Na+, K+, Ca2+ and Mg2+, were also measured.
Conclusion These data will be a valuable resource for 1H NMR metabolite profiling experiments and for the development of automated metabolite alignment and identification algorithms for 1H NMR spectra.
Electronic supplementary material The online version of this article (doi:10.1007/s11306-016-1101-y) contains supplementary material, which is available to authorized users.

Introduction
1H NMR is widely used for the metabolomic analysis of biofluids, as it provides quantitative, structural information on a wide range of metabolites in a non-destructive and highly reproducible manner (Nicholson and Wilson 1989; Wishart 2008; Zhang et al. 2010). However, metabolite chemical shifts are sensitive to the chemical environment, and subtle matrix effects, such as differences in pH and ionic strength, generally lead to inter-sample peak position variation (Weljie et al. 2006).
Urine is a popular biofluid for metabolomic investigations, as it contains a wide range of metabolites that can provide insight into a number of metabolic processes and disease states (Lindon et al. 1999). However, differences in its other components (such as urea, salts and other ions) are common, and often result in large variations in metabolite peak chemical shifts (Lindon et al. 1999). The main metal ions present in urine are sodium, potassium, and the divalent ions calcium and magnesium; however, it is the divalent ions that are the main contributors (after pH) to peak chemical shift variability (Ackerman et al. 1996; Lindon et al. 2007; Yang et al. 2008).
A number of studies have investigated different sample preparation techniques in order to limit the pH and metal ion dependent NMR peak position variation. Buffers are commonly added to urine samples to control the pH variation between samples. Lauridsen et al. (2007) recommended a minimum final concentration of 0.3 M for normal urine, and 1 M for concentrated urine samples. Xiao et al. (2009) recommended using 1.5 M buffer solutions at a 1:10 volume ratio with urine. While the use of a strong buffer may limit the chemical shift variation of many metabolite peaks, a small number of metabolites, such as citrate and histidine, still exhibit strong inter-sample chemical shift variations (Lindon et al. 1999). In addition, very salty samples reduce NMR sensitivity, especially for cryogenically cooled probes. An alternative approach to minimise pH-dependent shifts is to move the sample pH to extreme values, away from metabolite pKa values and towards the acid/base chemical shift limits (Beneduci et al. 2011; Lehnert and Hunkler 1986; Sze and Jardetzky 1994; Wevers et al. 1999). A potential problem with this approach, however, may be the degradation of pH-sensitive metabolites through hydrolysis or redox reactions.
A number of approaches have been proposed to remove the divalent metal ions from urine samples and therefore remove their influence on metabolite peak chemical shifts. Chelation of the divalent metals with EDTA has been used with promising results (Asiago et al. 2008; Ross et al. 2007). However, the use of EDTA introduces a number of 1H NMR signals for the free form as well as the Ca- and Mg-bound chelate forms. Deuterated EDTA is available, but this is quite expensive. Studies using biological 31P NMR have often used ion-exchange resins (such as Chelex-100) to remove paramagnetic metal ions, and some early work in 1H NMR metabolic profiling also used these to remove metal ions (Briceño et al. 2006; Cade-Menun and Preston 1996; Fan et al. 1997, 2001). Another proposed method for the removal of calcium and magnesium is the formation of insoluble fluoride salts of the metals by the addition of NaF or KF (Jiang et al. 2012). The metal precipitates can then be removed by centrifugation of the samples; however, this method may not be suitable for all metals and a combined approach with chelation may prove to be optimal (Jiang et al. 2012).
Despite these attempts to limit metabolite peak shifts, NMR spectra often require extensive preprocessing prior to chemometric analysis. A number of analytical strategies have been adopted to overcome shifting NMR peaks, such as spectral binning, automated alignment algorithms, or deconvolution methods that match metabolite standards to the NMR spectra (Alm et al. 2009; Anderson et al. 2011; Hao et al. 2012; Liebeke et al. 2013; Veselkov et al. 2009; Weljie et al. 2006). However, each of these methods comes with certain drawbacks. For example, spectral binning results in loss of information and lower resolution, and still does not solve the problem of correspondence of peaks near bin boundaries; peak alignment strategies can introduce artefacts and are not suitable for the (common) situation where adjacent peaks cross over in frequency between samples; and successful peak deconvolution may require a lot of user input to get the best quality results.
Part of the problem lies in not knowing the extent of the shifts for individual peaks. Different metabolite protons have different chemical shift ranges and different sensitivities to varying sample properties, and therefore global methods to account for peak shifts cannot be applied. A popular commercial peak-fitting software package, NMR Suite (Chenomx, Edmonton, Canada), provides chemical shift ranges of individual peaks and multiplets over common pH ranges, and this assists the user in identifying unknown metabolites; however, this information is not freely available, and also does not account for shifts caused by other ions.
The pH dependence of NMR chemical shifts is not always detrimental: this phenomenon has been used to measure titration curves of various metabolites (Fan 1996). When a monobasic ligand (L) becomes protonated (HL), the change in the local electron density affects the chemical shift for certain nuclei (Szakács et al. 2004a). Since the protonation event occurs effectively instantaneously on the NMR time scale, the observed chemical shift (δ_obs) is actually a weighted average of the limiting chemical shifts of the unprotonated (δ_L) and the protonated (δ_HL) states of the molecule (Ackerman et al. 1996; Szakács et al. 2004a). The weighting factors correspond to the pH-dependent mole fractions of L and HL, which can be expressed in terms of the actual pH and the acid dissociation constant Ka of the molecule as described by the Henderson-Hasselbalch equation (Ackerman et al. 1996; Szakács et al. 2004a). Therefore the observed chemical shift can be expressed as Eq. (1) (Ackerman et al. 1996), and the nonlinear fit of a molecule's 1H NMR titration curve to this equation reveals the acid and base limiting chemical shifts δ_HL and δ_L respectively, as well as the acid dissociation constant Ka for that molecule.
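For reference, the weighted-average relationship described above can be written out explicitly. The block below is a reconstruction from the definitions in the text (δ_obs, δ_L, δ_HL, Ka) and from the standard treatment of a ligand H_nL with n stepwise macroscopic constants pK_1 ≤ ... ≤ pK_n; the symbol δ_{H_iL} for the limiting shift of the species carrying i protons is an assumed notation, and the published Eqs. 1 and 2 may be typeset differently.

```latex
% Eq. 1 (one protonation site): mole-fraction weighted average of L and HL
\delta_{obs} = \frac{\delta_{L} + \delta_{HL}\,10^{(pK_a - pH)}}{1 + 10^{(pK_a - pH)}}

% Eq. 2 (n protonation sites, H_nL): the species H_iL is weighted by its
% cumulative protonation constant, i.e. the sum of the i largest pK values
\delta_{obs} = \frac{\delta_{L} + \sum_{i=1}^{n} \delta_{H_iL}\,
      10^{\left(\sum_{j=n-i+1}^{n} pK_j\right) - i\,pH}}
     {1 + \sum_{i=1}^{n} 10^{\left(\sum_{j=n-i+1}^{n} pK_j\right) - i\,pH}}
```

Setting n = 1 in the second expression recovers the first, and in both cases δ_obs tends to δ_L at high pH and to the fully protonated limit at low pH.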
For multibasic ligands, the corresponding general expression (Eq. 2) is much more complex than for the monobasic ligand, due to interaction between protons bound at different binding sites as well as the statistics of proton binding (Onufriev et al. 2001). Three different types of equilibrium constant (macroscopic, microscopic and quasisite) have been described for multibasic ligands (Onufriev et al. 2001; Ullmann 2003). The nonlinear fit of an NMR titration curve to Eq. 2 gives the macroconstants, but for molecules with multiple protonation sites with similar pKa values, these macrospecies are actually mixtures of microspecies that hold identical numbers of protons but differ in the site of protonation (Ullmann 2003). Thus, the macroconstants only refer to the stoichiometry of proton binding, not the site of protonation, and a ligand with n protonation sites has 2^n microstates. It is also possible to characterise the titration curve of a multibasic ligand with n protonation sites as a sum of n noninteracting so-called quasisites; see Onufriev et al. (2001) for a more in-depth treatment.
Many studies that have used NMR to investigate pH titration curves have focused on solutions of single metabolites (Bezençon et al. 2014) or mixtures of a small number of metabolites (Xiao et al. 2009), whereas in metabolomics investigations complex biological matrices are analysed. Here, we have characterised the pH- and ion-dependent (Na+, K+, Ca2+, Mg2+) chemical shift changes that occur in the 1H NMR spectra of urine. Knowledge of the acid, base and metal ion dependent chemical shift limits of the main metabolites in a complex biological mixture will be important for metabolite assignments, and we hope it will prove valuable in the future for designing alignment algorithms and automatic peak detection methods.
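The macroconstant picture above can be made concrete with a small numerical sketch. It assumes the standard population-weighted form of the titration model for a ligand with n stepwise macroscopic pKa values; the function name `observed_shift` and the example numbers are illustrative placeholders, not values or code from this study.

```python
def observed_shift(pH, pKas, limits):
    """Population-weighted average chemical shift (ppm) at a given pH.

    pKas   : the n stepwise macroscopic pKa values, in any order
    limits : n + 1 limiting shifts; limits[i] belongs to the species
             carrying i protons (limits[0] is the fully deprotonated L)
    """
    if len(limits) != len(pKas) + 1:
        raise ValueError("need one limiting shift per protonation state")
    # The first proton added to L corresponds to the largest pKa, so the
    # cumulative protonation constant of H_iL is the sum of the i largest.
    ordered = sorted(pKas, reverse=True)
    weights = [1.0]  # relative population of L
    log_beta = 0.0
    for i, pKa in enumerate(ordered, start=1):
        log_beta += pKa
        weights.append(10.0 ** (log_beta - i * pH))
    total = sum(weights)
    return sum(w * d for w, d in zip(weights, limits)) / total


# Hypothetical one-site peak: the shift moves between its two limits
# as the pH passes through the pKa.
for pH in (2.0, 3.75, 7.0):
    print(pH, round(observed_shift(pH, [3.75], [8.45, 8.25]), 4))
```

With a single pKa the function reduces to the monobasic case; with several, each macrospecies contributes in proportion to its pH-dependent population, which is all that the macroconstants can describe.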

Materials and methods
Urine samples from five different individuals were collected in 1 day between 10 am and 3 pm, and placed on ice. The collected samples were pooled, and then frozen at -80 °C in aliquots of approximately 150 ml. Prior to the NMR experiments, urine was passed through a column of Chelex-100 ion exchange resin (BioRad) to remove the majority of Ca2+ and Mg2+ ions, and the pH indicator standards imidazole, formate, Tris and piperazine were added to a final concentration of 1 mM, 1 mM, 0.25 mM and 0.25 mM, respectively. We measured the main metal ion concentrations in urine (Ca2+, Mg2+, Na+ and K+) with ion selective electrodes, and pH measurements were performed with a glass electrode with inbuilt temperature sensor (Fisher Scientific).
For the pH titration experiment, two volumes of Chelex-treated urine (200 ml) were treated dropwise with either 1 M HCl or 1 M NaOH while stirring. Samples were taken for NMR (400 µl) at 0.2 pH unit intervals and were prepared for NMR with addition of H2O (180 µl) and 2H2O (20 µl) containing 4,4-dimethyl-4-silapentane-1-sulfonic acid-2H6 (DSS) to give a final concentration of 0.1 mM. For the ion titration experiment, Chelex-treated urine (400 µl) was supplemented with H2O containing various concentrations of CaCl2, MgCl2, NaCl or KCl, ranging from 0 to 1 M, and D2O (20 µl) containing DSS to give a final concentration of 0.1 mM. NMR samples were centrifuged for 5 min at 13,000 rpm and 550 µl was transferred to a 5 mm NMR tube. Spectra were acquired on a Bruker Avance DRX600 NMR spectrometer (Bruker BioSpin, Rheinstetten, Germany), with 1H frequency of 600 MHz, and a 5 mm inverse probe at a constant temperature of 300 K. Samples were introduced with an automatic sampler and spectra were acquired following the procedure described by Beckonert et al. (2007). Briefly, a one-dimensional NOESY sequence was used for water suppression; data were acquired into 64 K data points over a spectral width of 12 kHz, with 8 dummy scans and 128 scans per sample. Spectra were processed in iNMR 3.4 (Nucleomatica, Molfetta, Italy). Fourier transform of the free-induction decay was applied with a line broadening of 0.5 Hz. Spectra were manually phased and automated first order baseline correction was applied. Metabolites were assigned at Metabolomics Standards Initiative (MSI) level 2 using the Chenomx NMR Suite 5.1 (Chenomx, Inc., Edmonton, Alberta, Canada). Metabolite peak positions from the different samples relative to DSS were obtained using MATLAB scripts written in-house by Dr Gregory Tredwell, and appropriate chemical shifts were determined for multiplets.
A version of the scripts for peak picking and spline fits is part of the BATMAN project (batman.r-forge-project.org) (Liebeke et al. 2013). The observed chemical shifts of the various metabolite peaks were modelled with respect to pH with the general formula (Eq. 2) for multibasic acids (HnL), using the nlinfit function within MATLAB. The number of sites was assumed from the chemical structure. This enabled the estimation of pKa values and acid and base chemical shift limits for individual metabolite peaks.
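The fitting step can be illustrated for the simplest, single-site case. The sketch below is not the in-house MATLAB code and does not use nlinfit: it swaps in a plain grid search over trial pKa values, exploiting the fact that for a fixed pKa the one-site model is linear in the two limiting shifts, so these follow from ordinary least squares. All function names and the synthetic data are hypothetical.

```python
def fraction_protonated(pH, pKa):
    """Mole fraction of HL at a given pH (Henderson-Hasselbalch)."""
    return 1.0 / (1.0 + 10.0 ** (pH - pKa))


def fit_limits(pHs, shifts, pKa):
    """Least-squares (delta_L, delta_HL, SSE) for one trial pKa, or None if singular."""
    saa = sab = sbb = say = sby = 0.0
    for pH, y in zip(pHs, shifts):
        b = fraction_protonated(pH, pKa)  # weight of HL
        a = 1.0 - b                       # weight of L
        saa += a * a
        sab += a * b
        sbb += b * b
        say += a * y
        sby += b * y
    det = saa * sbb - sab * sab
    if det == 0.0:
        return None
    # Solve the 2x2 normal equations for the two limiting shifts.
    d_L = (say * sbb - sby * sab) / det
    d_HL = (sby * saa - say * sab) / det
    sse = 0.0
    for pH, y in zip(pHs, shifts):
        b = fraction_protonated(pH, pKa)
        sse += (y - (d_L * (1.0 - b) + d_HL * b)) ** 2
    return d_L, d_HL, sse


def fit_titration(pHs, shifts, lo=0.0, hi=14.0, step=0.01):
    """Return (pKa, delta_L, delta_HL, SSE) with the smallest SSE on the pKa grid."""
    best = None
    for k in range(int(round((hi - lo) / step)) + 1):
        pKa = lo + k * step
        result = fit_limits(pHs, shifts, pKa)
        if result is not None and (best is None or result[2] < best[3]):
            best = (pKa,) + result
    return best


# Synthetic, noise-free titration curve sampled every 0.2 pH units from 2 to 12.
pHs = [2.0 + 0.2 * i for i in range(51)]
shifts = [8.45 * (1.0 - fraction_protonated(p, 3.75))
          + 8.25 * fraction_protonated(p, 3.75) for p in pHs]
print(fit_titration(pHs, shifts))
```

A grid search is slower than a gradient-based optimiser but needs no starting guess; for multi-site peaks a general nonlinear least-squares routine, as used in the study, is the more practical choice.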

Results and discussion
In order to systematically characterise the pH and ionic variation in a biologically relevant sample matrix, we collected urine samples from five different individuals and pooled them to obtain a single large-volume representative human urine sample. Our aim was to manipulate the pH of the urine over a wide pH range and measure the resultant peak variability with 1H NMR. It is likely that the large pH changes would alter metal ion concentrations in the urine, which could themselves interfere with the metabolite peak chemical shifts. To limit the effect of these changes, the main divalent metal ions, Ca2+ and Mg2+, were removed from the urine sample using Chelex ion exchange resin before the pH adjustment step. The treatment reduced the concentrations of Ca2+ and Mg2+ without compromising the metabolic composition of the urine (Supplementary Fig. 1, Supplementary Table 1). While the ion exchange resin was successful in removing the divalent ions, K+ concentrations were only slightly reduced by the resin treatment, and Na+ concentrations were slightly increased as they were displaced from the resin by the divalent ions. This was unfortunate, as the sample ionic strength has the potential to alter certain metabolite pKa values and acid/base chemical shift limits (Xiao et al. 2009). However, as the increase in Na+ from the Chelex resin was relatively small compared with commonly used buffer concentrations (Lauridsen et al. 2007; Xiao et al. 2009), and as the main ionic contributors to NMR peak shift variations are the divalent ions Ca2+ and Mg2+ (Ackerman et al. 1996; Lindon et al. 2007, 2011; Yang et al. 2008), we felt that this was a suitable baseline urine sample to continue our investigations.
The 1H NMR data for a single urine sample at 51 pH values over the range of 2-12 are shown in Fig. 1. Large chemical shift changes occur for a number of metabolite peaks. Individual peak positions were picked across all samples with the help of MATLAB scripts written in-house. In total the shifts for 163 individual peaks were obtained, corresponding to 53 multiplets from 33 identified metabolites and 65 currently unassigned peaks. As Eqs. 1 and 2 assume that the reference chemical shift does not change, chemical shifts of the metabolite peaks were determined relative to DSS. DSS is a salt of a strong acid, and therefore its ionization state and hence chemical shift is stable over the pH range 2-12 (Szakács et al. 2004b). The nonlinear fitting of peak or multiplet pH dependent chemical shifts with respect to Eq. 2 was performed with the MATLAB nlinfit function, and pKa values and acid/base limits for 33 identified metabolites were modelled (Table 1). The modelled pKa values were in fairly good agreement with literature values (Lundblad and Macdonald 2010), though some metabolites did show some differences. These are likely to be due to ionic interactions with sodium ions that could not be removed from the samples, and matrix effects of the complex biofluid. Characterising these effects was the goal of this study, as these data would then more accurately represent future metabolite profiling experiments.
An example of these model fits for one, two and three site protonation models of formate, alanine, and citrate respectively, is shown in Fig. 2. The two pairs of geminal methylene protons of citrate produce a strongly coupled AB spin system in 1H NMR spectra (Moore and Sillerud 1994) and the two A and B chemical shifts are shown in Fig. 2c, d. The models closely agree with measured chemical shifts in most cases, although one of the citrate peaks (Fig. 2d) was better fitted by a four site protonation model. It is not clear why this was the case. As previously described, the urea amide peak at 5.7 ppm is seen to gradually disappear at both high and low pH values, due to enhanced rates of proton exchange at these pH values (Xiao et al. 2009).
NMR samples are generally buffered to neutral or near-neutral pH using phosphate buffer, but there is no absolute need for this, and some studies have chosen more extreme sample pH endpoints in order to move away from the pKa values of highly shifting metabolites (Beneduci et al. 2011; Lehnert and Hunkler 1986; Sze and Jardetzky 1994; Wevers et al. 1999). If the chemical shift differences of metabolites between spectra could be minimised, spectral quality would be increased and data processing simplified. For a mixture of nine urinary metabolites, Xiao et al. (2009) found the pH range of 7.1-7.7 was optimal for most of these metabolites. Figure 3 shows the extent of 163 metabolite peak chemical shift changes over single pH units for the pH range 2-12. The pH intervals 5-6, 6-7 and 7-8 have some of the lowest median peak chemical shift changes of 0.0006, 0.0003 and 0.0003 ppm respectively; however, these pH ranges also show some of the largest chemical shift changes for a small number of metabolite peaks. It is clear, though, that for all the pH ranges measured in this study, a small number of metabolite peaks still have significant chemical shift variability, which is likely to present problems for data processing no matter the chosen sample pH. These data will therefore be invaluable to the design of more sophisticated data processing methods in order to account for the metabolite peak chemical shift variability at a given sample pH.
Our current data indicate that it might be worth considering NMR buffers with higher pKa values, as the pH range from 8 to 9 also had a low median peak chemical shift change of 0.0003 ppm, but also showed a decreased range for the outlying highly shifting peaks, and the pH range 11-12 had the lowest average chemical shift difference (Fig. 3). Working at high pH values would also have the advantage that the signal from the urea peak would be removed or have reduced intensity. There would of course also be some disadvantages to such an approach: it is likely that certain metabolite classes (e.g. nucleotides) would have reduced stability at higher pH values. However, there is no reason why adjustment to high pH values would not be possible so long as studies were kept internally consistent (Sze and Jardetzky 1994).
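The per-interval summary described above can be sketched as follows. The text does not state exactly how the change "over single pH units" was computed, so this example assumes it is the absolute difference between a peak's chemical shift at the two ends of each interval; the function names and the three toy peaks are hypothetical.

```python
from statistics import median


def interval_changes(titration, edges):
    """Absolute shift change of every peak across each consecutive pH interval.

    titration : {peak_name: {pH: shift_ppm}} with a value at every edge pH
    edges     : ordered interval boundaries, e.g. [2, 3, ..., 12]
    """
    changes = {}
    for lo, hi in zip(edges[:-1], edges[1:]):
        changes[(lo, hi)] = [abs(curve[hi] - curve[lo])
                             for curve in titration.values()]
    return changes


def summarise(changes):
    """Median and maximum change per interval, in ppm."""
    return {interval: (median(values), max(values))
            for interval, values in changes.items()}


# Toy data: two nearly static peaks and one strongly shifting peak.
toy = {
    "peak_A": {6: 3.0000, 7: 3.0010, 8: 3.0010},
    "peak_B": {6: 7.8000, 7: 7.4000, 8: 7.1000},
    "peak_C": {6: 1.4700, 7: 1.4702, 8: 1.4703},
}
for interval, (med, largest) in summarise(interval_changes(toy, [6, 7, 8])).items():
    print(interval, med, largest)
```

The toy data reproduce the pattern noted in the text: the median change in an interval can be very small while a single strongly shifting peak still moves by several tenths of a ppm.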
The effects of the main metal ions present in human urine, Na+, K+, Ca2+ and Mg2+, on metabolite chemical shift changes were also investigated. We added the metals (as chloride salts) to the ion-exchange-resin-treated urine sample, to final concentrations ranging from 0.01 mM to 1 M. As expected the divalent metal ions, in particular Ca2+, induced the most prominent chemical shift changes (Fig. 4, Supplementary Fig. 2), followed by Mg2+, and then the monovalent metal ions Na+ and K+.
Concentrations of Ca2+ ions between 10 and 100 mM showed the largest chemical shifts, with little further change in peak chemical shift at higher concentrations (Fig. 5). Normal concentrations for urinary calcium are between 0.225 and 9.47 mM for males and between 0.125 and 8.92 mM for females, though this is largely dependent on diet (Wu 2006), while Jiang et al. (2012) measured average Ca2+ concentrations in rat, mouse and human urine samples of 7.4, 1 and 0.9 mM respectively. The magnitudes of the Ca2+-induced peak shifts correlated well with those induced by decreasing the pH, with a maximum peak shift of 0.99 ppm (Supplementary Table 2, Supplementary Data). In contrast, concentrations of Mg2+ ions between 10 and 100 mM showed only moderate peak shifts, but these continued for concentrations up to 1 M, and the magnitude of Mg2+-induced chemical shift changes correlated well with acid-induced changes with a maximum of 0.76 ppm (Fig. 5). For the monovalent ions, concentrations between 331 mM and 1 M showed peak shifts >0.1 ppm for only a small number of metabolites, with maximum peak chemical shift changes of 0.33 and 0.24 ppm for Na+ and K+, respectively (Fig. 5). As seen with the pH-induced changes on peak chemical shifts, the largest metal ion-induced peak variations were for the azole-class metabolites histidine and methyl-histidines. Histidine's affinity for a number of metal ions is well known (Maley and Mellor 1949).
A complication of attempting to characterise the metal ion effects on metabolite peak shifts in the 1H NMR spectra of human urine was the decrease in pH due to displacement of H+ ions by the metal ions (Supplementary Fig. 3). This effect was more pronounced for the divalent ions, especially Ca2+ and to a lesser extent Mg2+. We considered using a buffered urine sample for this test, to try and reduce the decrease in pH, but decided against it as interaction of the buffer itself with the different metal ions could not be ruled out. Good's buffers, a class of zwitterionic N-substituted aminosulfonic acid buffers, have been considered to have weak metal complexation properties; however, some reports suggest that many of these buffers in fact strongly bind metals (Mash et al. 2003). Furthermore, the addition of these buffers would introduce interfering peaks in the 1H NMR spectra, potentially overlapping with important metabolite signals, as well as introducing additional counter ions. We therefore considered it to be the 'least flawed' approach to work with the unbuffered urine, and attempt to correct to some extent based on our knowledge of the pH-responsiveness of individual metabolite resonances. For the metal ion experiments, when the metal ion induced peak shifts are plotted with respect to the changing pH, it is clear that the metal ions alter peak chemical shifts independently of the pH (Supplementary Fig. 4). Not only are differences between the monovalent and divalent metal ions on peak chemical shift changes evident, but differences between Ca2+ and Mg2+ ions can also be observed. While we have yet to develop a model to predict metabolite chemical shift changes based on metal ion concentrations, these data will still be useful baseline data.
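One way to picture the correction described above is to subtract, for each metal-supplemented sample, the chemical shift that the pH titration model would predict at that sample's measured pH; whatever remains is the part of the shift not explained by pH alone. The sketch below assumes a single-site peak and uses hypothetical parameter values and function names; it is an illustration of the idea, not the procedure used to produce the supplementary figures.

```python
def ph_only_shift(pH, pKa, delta_L, delta_HL):
    """Shift (ppm) expected from pH alone for a one-site peak."""
    ratio = 10.0 ** (pKa - pH)  # [HL] / [L]
    return (delta_L + delta_HL * ratio) / (1.0 + ratio)


def metal_residual(observed_ppm, measured_pH, pKa, delta_L, delta_HL):
    """Observed shift minus the pH-model prediction at the measured pH."""
    return observed_ppm - ph_only_shift(measured_pH, pKa, delta_L, delta_HL)


# Hypothetical peak parameters, as would come from a pH titration fit.
pKa, delta_L, delta_HL = 6.0, 7.70, 8.60

# Hypothetical metal-supplemented samples: (measured pH, observed shift in ppm).
samples = [(6.4, 8.00), (6.2, 8.15), (6.0, 8.30)]
for pH, observed in samples:
    print(pH, round(metal_residual(observed, pH, pKa, delta_L, delta_HL), 3))
```

A residual near zero would mean that the pH change alone accounts for the observed shift; a residual that grows with metal concentration points to a direct ion effect on the resonance.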

Conclusion
The pH-dependent chemical shift changes for a large number of metabolite peaks were modelled effectively using a modified equation based on the Henderson-Hasselbalch equation (Eq. 2). From these models we have obtained pKa values and acid/base peak chemical shift limits for 33 identified metabolites from human urine, as well as provided chemical shift data for a further 65 unassigned metabolite peaks for a wide pH range of 2-12. Furthermore, we have characterised the effects of the four metal ions commonly present in human urine, Na+, K+, Ca2+ and Mg2+, over a wide concentration range to better understand the ionic effects on metabolite peak chemical shift variability. Knowledge of the modelled pKa and acid/base limits will give confidence in metabolite peak positions for a given sample pH, and together these data will be valuable for the development of automated metabolite alignment and identification algorithms for 1H NMR spectra.
Acknowledgments We would like to thank the two anonymous reviewers whose comments helped improve this manuscript.
Funding This work was funded by the BBSRC (BB/E20372/1).

Compliance with Ethical Standards
Conflict of Interest Gregory D. Tredwell, Jacob G. Bundy, Maria De Iorio and Timothy M. D. Ebbels declare that they have no conflict of interest.
Ethical approval All procedures performed involving human participants were in accordance with the ethical standards of Imperial College London.
Informed Consent Informed consent was obtained from all sample donors and they provided their urine voluntarily. No identifying information is included in this article.
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.