(3, 2)D 1H, 13C BIRDr,X-HSQC-TOCSY for NMR structure elucidation of mixtures: application to complex carbohydrates

Overlap of NMR signals is the major cause of difficulties associated with NMR structure elucidation of molecules contained in complex mixtures. 2D homonuclear correlation spectroscopy in particular suffers from the low dispersion of 1H chemical shifts; the larger dispersion of 13C chemical shifts is often used to reduce this overlap while still providing the proton–proton correlation information, e.g. in the form of a 2D 1H, 13C HSQC-TOCSY experiment. For this methodology to work, the 13C chemical shifts must be resolved. When 13C chemical shifts overlap, 1H chemical shifts can be used to achieve the desired resolution. The proposed (3, 2)D 1H, 13C BIRDr,X-HSQC-TOCSY experiment achieves this while preserving the singlet character of cross peaks in the F1 dimension. The required high resolution in the 13C dimension is thus retained, while the cross peak overlap occurring in a regular HSQC-TOCSY experiment is eliminated. The method is illustrated on the analysis of a complex carbohydrate mixture obtained by depolymerisation of a fucosylated chondroitin sulfate isolated from the body wall of the sea cucumber Holothuria forskali.


Introduction
The spread of 13C chemical shifts provides much needed separation of resonances in the indirectly detected dimension of 2D heterocorrelated NMR experiments. This separation is crucial for establishing networks of coupled spins by hyphenated techniques that combine homo- and heteronuclear polarisation transfers, such as 2D 1H-13C HSQC-TOCSY. Indeed, 2D 1H-13C HSQC-TOCSY spectra have been used to analyse complex oligosaccharides (Rodriguez-Carvajal et al. 2003; Sato et al. 2008; Panagos et al. 2012), polysaccharides (Uhrin et al. 1994), glycoproteins (Debeer et al. 1994), peptides (Sonti et al. 2012), fatty acids (Willker and Leibfritz 1998), metabolites (Bingol and Bruschweiler 2011; Bingol et al. 2014, 2016) and food material (Kew et al. 2017; Wei et al. 2011). However, as the sample complexity increases, 13C chemical shifts eventually cease to be unique, making the interpretation of 2D HSQC-TOCSY spectra problematic. This is often the case for biomolecules with repeating structural units, especially if these are "randomly" modified. For example, methylation, sulfation or acetylation of polysaccharides at irregular positions induces chemical shift changes that can affect nuclei at considerable distances from the substitution site, sometimes beyond the monosaccharide unit carrying the modification. When such polysaccharides are partially depolymerised, their natural heterogeneity is further increased by the size variation of the produced oligomers, which causes molecular-size-dependent variations of chemical shifts. Taken together, small variations in 1H and/or 13C chemical shifts cause clusters of cross peaks to appear in heterocorrelated spectra of biomolecules and their mixtures. This is of particular concern for complex oligosaccharides, which are characterised by low 13C chemical shift dispersion (Fig. S1) caused by their monotonous structural makeup with the prevalence of CH-OH groups. This complicates the analysis of carbohydrate mixtures considerably and, above all, hides subtle but important structural variations.
Provided the overlap exists only in the 13C dimension, the resulting ambiguity can in principle be resolved by a 3D HSQC-TOCSY experiment (Krishnamurthy 1995). However, given the crowded nature of spectra of complex samples, increased dimensionality of NMR experiments will likely further reduce the achievable resolution and fail to deliver the desired outcome. Avoiding signal overlap may require much longer 3D experiments that are beyond practical limits. A more promising solution is to stay in two dimensions, while relying on the combined 13C and 1H chemical shifts to separate signals. This is one of the principles of G-matrix Fourier transform (GFT) spectroscopy (Kim and Szyperski 2003). Here, more than one frequency is sampled simultaneously during the indirectly detected dimension of an (X,2)D experiment. Provided the 1H chemical shifts are unique, the overlap caused by 13C chemical shift degeneracy will be removed.
The proposed (3, 2)D HSQC-TOCSY experiment thus samples the combined 1H and 13C chemical shifts of directly bonded proton and carbon pairs, while the same F1 frequency labelling is preserved for the TOCSY cross peaks. This concept was recently explored in the context of a dual-receiver NMR experiment, which combined simultaneous acquisition of several 2D spectra (Pudakalakatti et al. 2014), including (3, 2)D HSQC-TOCSY. The authors acquired two (3, 2)D HSQC-TOCSY spectra coding for the Ω1H ± Ω13C offset frequencies in the indirectly detected dimension and analysed them using covariance spectroscopy.
In this work we present an implementation of a (3, 2)D 1H, 13C HSQC-TOCSY experiment that removes the modulation of cross peaks in the F1 dimension by proton-proton scalar couplings, which is inherent to a regular (3, 2)D 1H, 13C HSQC-TOCSY experiment. Such treatment is essential, as it avoids introducing further signal overlap into already crowded spectra. A simple manipulation of the original spectra produces high-resolution spectra with F1 singlets at only one of the Ω13C ± κΩ1H offset frequencies.
The method is illustrated on an analysis of a mixture of oligosaccharides produced by β-eliminative depolymerisation (Gao et al. 2015) of fucosylated chondroitin sulfate (fCS) isolated from the body wall of the sea cucumber Holothuria forskali (Panagos et al. 2014). This mixture is referred to throughout the paper as "the fCS mixture". Fucosylated chondroitin sulfates appear in a range of marine organisms (Chen et al. 2011; Ustyuzhanina et al. 2016, 2017) and display a variety of biological roles and potential biomedical applications (Panagos et al. 2014; Nagase et al. 1995; Mourao et al. 1996; Borsig et al. 2007). The studied fCS is composed of the following repeating trisaccharide unit, [→3)GalNAcβ4,6S(1→4)[FucαX(1→3)]GlcAβ(1→]n, where X stands for different sulfation patterns (S) of fucose. For this species X = 3,4S (46%), 2,4S (39%) and 4S (15%), where the numbers before S refer to individual carbons of fucose (Fig. 1). Only the two major sulfation patterns were present in sufficient concentration to yield detectable signals. As seen in the 1D 1H and 2D 1H, 13C HSQC spectra of the fCS mixture (Fig. S1), heterogeneous sulfation and a mixture of different sizes result in a high degree of signal overlap in both the 1H and 13C dimensions. Such overlap complicates structural studies of fCS, an issue that can be addressed by the use of the (3, 2)D 1H, 13C HSQC-TOCSY experiment.

Results and discussion
The pulse sequences
Bruker implementations of the sensitivity-enhanced 2D 1H-13C HSQC (Kay et al. 1992; Schleucher et al. 1994) and 2D 1H-13C HSQC-TOCSY (Kay et al. 1992; Schleucher et al. 1994; Palmer et al. 1991) experiments were used as a basis for the development of the new experiments, referred to here as (3, 2)D BIRDr,X-HSQC and (3, 2)D BIRDr,X-HSQC-TOCSY. Their pulse sequences (Fig. 2) start with Ω1H labelling during the initial variable t1 period. Here, the evolution of proton-proton and proton-carbon couplings is suppressed by a central BIRDr,X pulse (Garbow et al. 1982; Uhrin et al. 1993). In addition to 13C spins, this pulse also inverts protons attached to 12C, thus allowing Ω1H labelling of 13C-attached protons while refocusing their evolution due to couplings with 12C-attached protons and 13C nuclei. The BIRDr,X pulse is flanked by pulsed field gradients of opposite polarity, which help to suppress the signals of 12C-attached protons. Depending on the phase of the first 90° 1H pulse of the pulse sequences, the 13C chemical shifts labelled during the regular t1 interval after the transfer of polarisation to carbons are modulated by either cos(2πΩ1Hκt1) or sin(2πΩ1Hκt1). Ω1H is the difference between the 1H chemical shift and the 1H r.f. carrier frequency and κ is a scaling factor between the two simultaneously incremented indirectly detected periods. As a result, cross peaks in (3, 2)D BIRDr,X-HSQC(-TOCSY) spectra appear as in-phase or antiphase doublets centred around the 13C chemical shifts of the 1H, 13C cross peaks and separated by 2κΩ1H in F1. This generates a κΩ1H-dependent displacement of signals. The cosine- and sine-modulated spectra are acquired in an interleaved manner and processed to produce two simplified spectra by the addition or subtraction of the original spectra. Each simplified spectrum thus contains only one component of the F1 doublets, as positive signals with increased intensity. They are referred to here as the Ω13C + κΩ1H and the Ω13C − κΩ1H spectrum. The overall theoretical sensitivity of the (3, 2)D BIRDr,X-HSQC(-TOCSY) experiment is ½ of that of a regular 2D HSQC(-TOCSY) experiment. In practice, an additional signal-to-noise reduction is observed, which can be attributed to the 1JCH mismatch and relaxation effects during the BIRDr,X pulse; a signal-to-noise ratio of 30-45% relative to 2D HSQC spectra was measured for the corresponding cross peaks in the (3, 2)D BIRDr,X-HSQC spectra of the fCS mixture.
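The way the cosine- and sine-modulated data sets collapse each F1 doublet into a singlet can be illustrated with a minimal numerical sketch. It is not the authors' Bruker AU processing, which adds and subtracts the processed spectra; here the analogous linear combinations are formed on simulated t1 interferograms. The offsets, spectral width and the function names `simulate_t1_pair`, `combine` and `peak_frequency` are hypothetical, and relaxation, quadrature detection in t1 and the TOCSY transfer are ignored.

```python
import numpy as np

def simulate_t1_pair(omega_c_hz, omega_h_hz, kappa, n_points, dwell_s):
    """Return cosine- and sine-modulated t1 interferograms for one CH pair."""
    t1 = np.arange(n_points) * dwell_s
    carbon = np.exp(2j * np.pi * omega_c_hz * t1)        # 13C shift labelling
    proton_phase = 2 * np.pi * kappa * omega_h_hz * t1   # scaled 1H offset
    return np.cos(proton_phase) * carbon, np.sin(proton_phase) * carbon

def combine(cos_fid, sin_fid):
    """Linear combinations that each keep one component of the F1 doublet."""
    plus = cos_fid + 1j * sin_fid    # evolves at omega_c + kappa * omega_h
    minus = cos_fid - 1j * sin_fid   # evolves at omega_c - kappa * omega_h
    return plus, minus

def peak_frequency(fid, dwell_s):
    """Frequency (Hz) of the strongest point in the magnitude spectrum."""
    spectrum = np.fft.fftshift(np.fft.fft(fid))
    freqs = np.fft.fftshift(np.fft.fftfreq(len(fid), d=dwell_s))
    return freqs[np.argmax(np.abs(spectrum))]

# Hypothetical offsets (Hz) chosen to fall on the FFT grid (10 Hz per point).
dwell = 1.0 / 10240.0
cos_fid, sin_fid = simulate_t1_pair(1000.0, 300.0, 1.0, 1024, dwell)
plus, minus = combine(cos_fid, sin_fid)
f_plus = peak_frequency(plus, dwell)
f_minus = peak_frequency(minus, dwell)
print(f"sum spectrum peak: {f_plus:.1f} Hz, difference spectrum peak: {f_minus:.1f} Hz")
```

In this idealised case each combined data set contains a single line, at the 13C offset plus or minus the scaled 1H offset, whereas either input alone would give a doublet split by 2κΩ1H.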
(3, 2)D BIRDr,X-HSQC
Acquisition of the (3, 2)D BIRDr,X-HSQC spectra is not essential, but may help to identify the position of the direct proton-carbon correlation cross peaks in the (3, 2)D BIRDr,X-HSQC-TOCSY spectra and thus aid in their analysis. An overlay of a 2D 1H, 13C HSQC spectrum and the Ω13C ± κΩ1H (3, 2)D BIRDr,X-HSQC spectra of the fCS mixture focusing on the anomeric region is shown in Fig. 3. A complete spectrum is shown in Fig. S2. These spectra illustrate the essential attributes of the presented methodology. Cross peaks in the indirectly detected dimension of the (3, 2)D BIRDr,X-HSQC spectra are displaced by ± κΩ1H frequencies relative to the cross peaks in the regular HSQC spectrum. In this instance, the 1H carrier frequency was set on the HOD signal, so that the (3, 2)D BIRDr,X-HSQC cross peaks to the left (positive Ω1H) and to the right (negative Ω1H) of the water signal are displaced in opposite directions in the final spectra. Expansions enclosed in rectangles illustrate how the cross peaks overlapping in the HSQC spectrum become resolved in the (3, 2)D BIRDr,X-HSQC spectra. Depending on the local topology of cross peaks, increased resolution is achieved in the Ω13C + κΩ1H or in the Ω13C − κΩ1H spectrum or in both, as discussed later.
Fig. 3 The anomeric region of 2D 1H, 13C HSQC (violet) and two (3, 2)D BIRDr,X-HSQC spectra (red and blue, pulse sequence of Fig. 2) of the fCS mixture; κ = 1 was used. The 1D 1H spectrum and the projection of the 2D 1H, 13C HSQC spectrum are shown at the top and the side, respectively. For further discussion see the text
(3, 2)D BIRDr,X-HSQC-TOCSY
The cross peak displacement seen in the (3, 2)D BIRDr,X-HSQC spectra is propagated to the TOCSY peaks in the (3, 2)D BIRDr,X-HSQC-TOCSY spectra. An example of (3, 2)D BIRDr,X-HSQC-TOCSY spectra of the fCS mixture is shown in Fig. 4, where an overlay of four partial heterocorrelated spectra is presented. Complete spectra are shown in Figs. S3 and S4. The spectra in Fig. 4 are dominated by the cross peaks in the 4.0-4.4 ppm 1H region that belong to the direct and TOCSY correlations of H5/H6 protons of GalNAc. These are ignored for the purpose of this discussion, as we inspect four groups of HSQC cross peaks (shown in pink) resonating at ~ 66.5 ppm in F1 and around 4.90, 4.36, 4.16 and 3.97 ppm in F2. Their vertical displacement in the (3, 2)D BIRDr,X-HSQC-TOCSY spectra is indicated by black arrows, while their corresponding TOCSY cross peaks, connected by horizontal arrows, are circled. In this example, the TOCSY cross peaks are stronger than the direct correlation peaks, as the magnetisation was efficiently transferred from the 13C- to 12C-attached protons during a 40 ms TOCSY mixing time. The HSQC cross peaks at δ1H of 4.90 and 4.36 ppm do not show any TOCSY correlations in the displayed region of the spectra and will be discussed later. On the other hand, the signals resonating at 4.16 and 3.97 ppm show several HSQC-TOCSY correlations (violet circles), e.g. cross peaks between 4.44 and 4.66 ppm that overlap in the proton dimension of a regular HSQC-TOCSY spectrum. This overlap is removed in the (3, 2)D BIRDr,X-HSQC-TOCSY spectra (cross peaks circled by full and dashed red and blue lines). Analysis of the spectra showed that the HSQC cross peaks belong to protons H3 and H2 of fucose sulfated at positions 2,4 (2,4S, 4.16 ppm) and 3,4 (3,4S, 3.97 ppm), respectively. These protons show TOCSY transfers to protons H2 (4.56-4.44 ppm in 2,4S fucose, dashed circle) and H3 (4.66-4.49 ppm in 3,4S fucose, full line circle), respectively. Their separation in the (3, 2)D BIRDr,X-HSQC-TOCSY spectra is a consequence of the unique 1H chemical shifts of the H3 (4.16 ppm) and H2 (3.97 ppm) protons.
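The size of the F1 displacement for these two fucose signals can be estimated with a short worked calculation. This is only a sketch: the 1H carrier position (4.70 ppm, taken as an approximate HOD shift), the 13C frequency (201.2 MHz on a nominal 800 MHz instrument) and the function name `f1_positions_ppm` are assumptions not stated in the text, and the sign convention of the displayed spectra may differ.

```python
def f1_positions_ppm(delta_c_ppm, delta_h_ppm, kappa=1.0,
                     h_carrier_ppm=4.70,   # assumed HOD position
                     h_freq_mhz=800.0,     # nominal 1H frequency
                     c_freq_mhz=201.2):    # assumed 13C frequency
    """Return F1 positions (13C ppm scale) in the sum and difference spectra."""
    omega_h_hz = (delta_h_ppm - h_carrier_ppm) * h_freq_mhz  # 1H offset in Hz
    shift_ppm = kappa * omega_h_hz / c_freq_mhz              # offset on the 13C ppm axis
    return delta_c_ppm + shift_ppm, delta_c_ppm - shift_ppm

# Both CH pairs resonate at ~66.5 ppm in the regular HSQC spectrum.
pos_h3 = f1_positions_ppm(66.5, 4.16)   # 2,4S fucose H3
pos_h2 = f1_positions_ppm(66.5, 3.97)   # 3,4S fucose H2
separation_ppm = abs(pos_h3[0] - pos_h2[0])

for label, pos in (("2,4S H3", pos_h3), ("3,4S H2", pos_h2)):
    print(f"{label}: sum {pos[0]:.2f} ppm, difference {pos[1]:.2f} ppm")
print(f"F1 separation: {separation_ppm:.2f} ppm")
```

Under these assumptions the 0.19 ppm difference in 1H chemical shift (152 Hz at 800 MHz) translates into roughly 0.76 ppm on the 13C axis for κ = 1, so two cross peaks that coincide at ~ 66.5 ppm in F1 become clearly separated.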
Transfer of magnetisation from the H3 and H2 protons is seen to continue in opposite directions to protons H4 and H1 of their individual monosaccharide rings, an observation that allowed their assignment. These (H4, H1)/C2 and (H4, H1)/C3 cross peaks are also circled in Fig. 4 and appear in pairs between 4.7 and 5.7 ppm. The observed doubling of signals with distinct 1H chemical shifts within each circle is a consequence of depolymerisation. Cleavage of the GalNAc(1→4)GlcA glycosidic bond leads to the appearance of fucose that is now linked to a terminal modified GlcA carrying a C4-C5 double bond (ΔGlcA, compounds II and IV in Fig. 1). At the same time, fucose linked to inner GlcA units (compounds I-III in Fig. 1) is still present to some extent, leading to doubling of signals. A comparison with the chemical shifts of fucose in the fCS polysaccharide showed that the H1 and H4 chemical shifts of the terminal fucose linked to ΔGlcA are consistently lower, by up to 0.2 ppm. These fucose cross peaks are labelled with a Δ symbol in Fig. 4.
Fig. 4 Overlay of four partial heterocorrelated spectra of the fCS mixture. Regular 2D 1H, 13C HSQC (pink) and 2D 1H, 13C HSQC-TOCSY (violet) spectra and two (3, 2)D BIRDr,X-HSQC-TOCSY spectra (red and blue, pulse sequence of Fig. 2). The vertical and horizontal black arrows show the displacement and the positions of the direct and the TOCSY (circled) cross peaks, respectively. For further commentary see the text

(3, 2)D BIRDr,X-HSQC-TOCSY and the conformation of fCS
The other two groups of the above-mentioned cross peaks, at ~ 66.5 ppm in F1 and 4.90 and 4.36 ppm in F2, were assigned to H5/C5 of fucose based on the H5 to H6 TOCSY transfers that produced H6/C5 cross peaks in the otherwise empty region of the HSQC-TOCSY spectra (1.30-1.45/~ 66.5 ppm). Figure 5 illustrates how the ambiguity of a regular HSQC-TOCSY experiment (Fig. 5b, violet cross peaks), where it is not clear if the transfer originates from one or both of these protons, is removed in the (3, 2)D BIRDr,X-HSQC-TOCSY spectra.
The (3, 2)D BIRDr,X-HSQC-TOCSY spectra also enabled the assignment of H6 protons of both fucose types. As the J(H4,H5) of fucose is close to zero, no transfer of magnetisation from H5 to H4 protons took place, making the H6/C5 correlations strong. As seen in the spectra in Fig. 5, protons H6 appear at around 1.41 and 1.33 ppm for the native and terminal fucose, respectively, following the chemical shift trends seen for the other fucose protons in these different environments. Correlations in the opposite direction, from H6/C6 to H5, are also resolved in a (3, 2)D BIRDr,X-HSQC-TOCSY spectrum presented in Fig. S4. Here, the extent of the heterogeneity of the fCS mixture is clearly visible, especially upon inspection of the Ω13C − κΩ1H spectrum.
In the native polysaccharide, the fucose ring is stacked above the neighbouring GalNAc residue, enabling formation of an unusual hydrogen bond from its H5 proton to the ring oxygen of GalNAc, as described previously (Panagos et al. 2014; Aeschbacher et al. 2017). β-elimination breaks the GalNAc(1→4)GlcA glycosidic bond, transforming fucose into a terminal unit of the newly formed oligosaccharides II and IV. This results in a significant lowering of the H5 chemical shift of fucose (− 0.51 ppm) and is indicative of successful depolymerisation. As seen above, the H1, H4 and H6 protons of fucose are affected in a similar manner, although to a lesser extent. Analysis of the (3, 2)D BIRDr,X-HSQC-TOCSY spectra thus allowed the identification of chemical shift changes of fucose protons caused by the different conformation of this residue in fCS oligosaccharides.

Which (3, 2)D BIRDr,X-HSQC-TOCSY spectrum to inspect?
As already hinted, it is useful to analyse both Ω13C ± κΩ1H (3, 2)D BIRDr,X-HSQC-TOCSY spectra. This point is elaborated next in more detail, initially using a cartoon representation (Fig. 6a-c) of possible cross peak displacements in (3, 2)D BIRDr,X-HSQC spectra. It can be seen that when the proton and carbon chemical shifts in the overlapped areas are correlated (i.e. both 1H and 13C chemical shifts increase), a better cross peak separation is achieved in the Ω13C − κΩ1H spectrum (Fig. 6b). The opposite is true for anti-correlated signals (i.e. when 1H chemical shifts increase, but 13C chemical shifts decrease, Fig. 6c), where a better separation is achieved in the Ω13C + κΩ1H spectrum. If the 13C chemical shifts are identical, while the 1H chemical shifts vary, the separation of signals is identical in both spectra (Fig. 6a). Nevertheless, as an accidental overlap with other cross peaks may occur in one of the spectra, it is worthwhile to inspect both. These points are illustrated on an overlay of several experimental spectra shown in Fig. 6d. Focusing on the overlapping 2,4S-H2 and 3,4S-H3 fucose HSQC cross peaks at ~ 4.52 ppm (pink circle) and the 2,4S-H3 HSQC-TOCSY cross peaks at ~ 4.16 ppm, it can be seen that their separation is better in the Ω13C − κΩ1H (red circles) than in the Ω13C + κΩ1H (blue circle) spectrum. This is a consequence of the topology of the 2,4S-H2 HSQC cross peaks (indicated by a dotted arrow), which follows the pattern shown in Fig. 6b.
Fig. 5 2D 1H, 13C HSQC-TOCSY spectrum (violet) and two (3, 2)D BIRDr,X-HSQC-TOCSY spectra (red and blue) focusing on a H5/C5 and b H6/C5 cross peaks. The black arrows indicate displacement along the vertical axis and horizontal lines connect the H5 and H6 cross peaks. Blue cross peaks labelled by an asterisk in b indicate additional CH3 signals. For further discussion see the text
Finally, increased separation or elimination of an accidental overlap of cross peaks can be achieved by acquiring spectra with κ ≠ 1. For κ > 1, this can lead to increased relaxation and peak broadening. Depending on the nature of the overlap, the use of κ < 1 may therefore be preferred. Another variable that influences the separation of signals in F1 is the position of the 1H r.f. carrier. This can be explored without changing the duration of the indirectly detected periods.
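The trade-off associated with the choice of κ can be made concrete with a small calculation. The sketch below assumes that the 1H evolution period is κ times the 13C t1 period (39.17 ms maximum, as given in Materials), a nominal 1H frequency of 800 MHz and a purely hypothetical mono-exponential 1H transverse relaxation time of 100 ms; the function name `kappa_tradeoff` is also hypothetical. It is an order-of-magnitude illustration rather than a description of the actual relaxation behaviour during the BIRDr,X element.

```python
import math

def kappa_tradeoff(kappa, delta_h_diff_ppm, h_freq_mhz=800.0,
                   t1_max_s=0.03917,   # 13C t1 acquisition time from Materials
                   h_t2_s=0.1):        # hypothetical 1H T2
    """Return (F1 separation in Hz, max 1H evolution in s, relative signal)."""
    separation_hz = kappa * delta_h_diff_ppm * h_freq_mhz
    h_evolution_s = kappa * t1_max_s
    # Signal left at the last increment from 1H relaxation alone (assumed model).
    relative_signal = math.exp(-h_evolution_s / h_t2_s)
    return separation_hz, h_evolution_s, relative_signal

# Two protons differing by 0.19 ppm, as for fucose H3 (4.16) and H2 (3.97 ppm).
results = {k: kappa_tradeoff(k, 0.19) for k in (0.5, 1.0, 2.0)}
for k, (sep, evo, sig) in results.items():
    print(f"kappa={k}: separation {sep:.0f} Hz, "
          f"1H evolution {evo * 1e3:.1f} ms, relative signal {sig:.2f}")
```

The separation grows linearly with κ, while the time the magnetisation spends evolving on protons, and with it the relaxation loss, grows as well.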

Conclusions
The presented methodology extends the application of the powerful 2D 1H, 13C HSQC-TOCSY experiment to samples with signal overlap in the 13C dimension. The proposed solution, in the form of the (3, 2)D BIRDr,X-HSQC-TOCSY experiment, preserves high digital resolution in the indirectly detected dimension while accepting some signal losses. The method was illustrated on the assignment of fucose resonances in a mixture of oligosaccharides containing FucX(1→3)GlcA and FucX(1→3)ΔGlcA disaccharide fragments. The achieved resolution also made possible a detailed assignment of individual resonances belonging to 2,4 or 3,4 sulfated fucose in different oligosaccharides (I or III and II or IV) and accounted for reducing-ring effects resulting from depolymerisation (data not shown). Although illustrated here on the TOCSY transfer, the same principles can also be applied to a NOESY experiment. In conclusion, the (3, 2)D BIRDr,X-HSQC-TOCSY/NOESY experiments are a powerful addition to the limited arsenal of NMR techniques specifically designed to enable structure elucidation of molecules contained in mixtures.

Materials
Fucosylated chondroitin sulfate polysaccharide was isolated from the body wall of the sea cucumber Holothuria forskali, as described previously (Panagos et al. 2014). The sample was depolymerised using a modified protocol (see Supplementary Information) for β-eliminative depolymerisation of carbohydrates in anhydrous solutions (Gao et al. 2015). Structures contained in the studied mixture are shown in Fig. 1.
The sample (20 mg) was dissolved in D2O (550 µL) and measured at 300 K. The (3, 2)D 1H, 13C HSQC and (3, 2)D 1H, 13C HSQC-TOCSY spectra of the fCS mixture were acquired on a 4-channel Avance III 800 MHz Bruker spectrometer equipped with a 5 mm TCI CryoProbe™ with automated matching and tuning. The following parameters were used: 2048 and 2048 complex points in t2 and t1, respectively, and spectral widths of 8 and 130 ppm in F2 and F1, yielding t2 and t1 acquisition times of 160 and 39.17 ms, respectively. Four scans were acquired for each t1 increment using a relaxation delay of 1.4 s. The overall acquisition time was 7 h and 47 min (for two interleaved spectra, using 4096 t1 points). The TOCSY transfer used a 40 ms DIPSI-2 mixing sequence (Rucker and Shaka 1989). Forward linear prediction to 4096 points was applied in F1 and zero filling to 4096 points was applied in F2. A cosine square window function was used for apodization prior to Fourier transformation in both dimensions. Identical parameters were used to acquire regular 2D 1H, 13C HSQC and 2D 1H, 13C HSQC-TOCSY spectra (Bruker pulse programs hsqcedetgpsisp2.3 and hsqcdietgpsisp.2) in half of the time required for the GFT experiments. Spectra were processed using a Bruker AU program provided in the Electronic Supplementary Material.
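The reported timings can be cross-checked with simple arithmetic. The sketch below rests on two assumptions that are not stated in the text: a 13C frequency of 201.2 MHz on the nominal 800 MHz instrument, and a reading of the 2048 points as 1024 complex pairs (a common Bruker convention of counting real and imaginary points together), which is the reading that reproduces the quoted acquisition times. The function name `acquisition_time_s` is hypothetical.

```python
def acquisition_time_s(n_complex_pairs, sw_ppm, freq_mhz):
    """Acquisition time = number of complex pairs / spectral width in Hz."""
    return n_complex_pairs / (sw_ppm * freq_mhz)

# Assumed: 1024 complex pairs per dimension, 13C at 201.2 MHz.
t2_acq = acquisition_time_s(1024, 8.0, 800.0)     # direct 1H dimension
t1_acq = acquisition_time_s(1024, 130.0, 201.2)   # indirect 13C dimension

# Total time reported: 7 h 47 min for 4096 t1 points with 4 scans each.
total_s = 7 * 3600 + 47 * 60
n_transients = 4096 * 4
per_scan_s = total_s / n_transients
# Remainder after the 1.4 s relaxation delay and t2 acquisition
# (pulse sequence elements, TOCSY mixing, t1 evolution, system overhead).
remainder_s = per_scan_s - 1.4 - t2_acq

print(f"t2 acquisition: {t2_acq * 1e3:.1f} ms, t1 acquisition: {t1_acq * 1e3:.2f} ms")
print(f"time per scan: {per_scan_s:.3f} s, unaccounted remainder: {remainder_s:.3f} s")
```

With these assumptions the t2 acquisition time comes out at 160 ms and the t1 acquisition time at about 39.15 ms, close to the quoted 39.17 ms, and the total experiment time corresponds to roughly 1.71 s per scan.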