Introduction

The presence of liquid water is a prerequisite for life as we know it. Essentially every drop of liquid water on Earth is inhabited by diverse communities of living organisms. Even in the nutrient-depleted, dark abyssal ocean, one milliliter of water contains more than one million unicellular organisms (Gasol and Kirchman 2018). Life abounds even in droplets of storm clouds (Šantl-Temkiv et al. 2013) and in sediments, down to more than a kilometer below the sea floor (Kallmeyer et al. 2012). Only excessive heat in Earth’s interior or in deep-sea hydrothermal fluids prevents life from flourishing (Yang et al. 2021). During life and upon death, all organisms release polar organic molecules into their environment (Dittmar et al. 2021). These traces of life dissolve in water and travel with flowing waters through land- and seascapes (Dittmar and Stubbins 2014). Dissolved organic matter (DOM) is the nutritional basis for bacteria and other heterotrophic microorganisms (Dittmar et al. 2021). A fraction, however, escapes immediate microbial decomposition and has accumulated to form one of the largest organic carbon pools on Earth’s surface (Dittmar and Stubbins 2014). Many DOM compounds are amphiphilic, i.e., they have polar functionalities that confer water solubility and, at the same time, apolar side chains (Dittmar and Kattner 2003). These compounds may keep hydrophobic persistent organic pollutants in aqueous solution, contributing to their global distribution via flowing waters (Dittmar and Kattner 2003). Similarly, municipal and industrial wastewaters often contain large amounts of DOM, which interacts with potentially toxic pollutants, with implications for environmental and human health (Komatsu et al. 2020; Anaraki et al. 2021). Despite its importance to life, its role in the global biogeochemical cycle and its potential relevance for the distribution of persistent organic pollutants, the molecular composition of DOM remains largely unknown (Dittmar et al. 2021).

Marine DOM is the largest pool of organic carbon in the ocean (~ 660 Pg of carbon), containing a similar amount of carbon as atmospheric carbon dioxide or all living biomass on Earth combined (Dittmar and Stubbins 2014). Because the ocean contains the largest quantities of DOM on Earth, we focus here on marine DOM, but refer frequently to progress made in freshwater systems or soils. DOM is a key component of global biogeochemical processes, and minor changes in its size and turnover dynamics potentially affect the global carbon cycle (Dittmar and Stubbins 2014). Global change stressors, such as ocean warming, acidification and deoxygenation, glacial and sea ice melting and changed inflow from rivers, may influence the oceanic dissolved organic carbon (DOC) cycle (Lønborg et al. 2020), resulting in thus far unknown feedbacks in the global biogeochemical cycle. The impact of climate change on the fate and cycling of oceanic DOC and vice versa is poorly understood (Ridgwell and Arndt 2015; Wagner et al. 2020). Marine DOM is primarily produced and turned over by marine microorganisms. Only in coastal areas and on continental margins are microbial degradation products of vascular plant debris a major component of DOM (Dittmar and Stubbins 2014). Depending on its lifetime in the ocean, DOM is operationally divided into labile and refractory fractions, with lifetimes ranging from hours to tens of millennia (Hansell 2013). Most DOM produced by phytoplankton is labile and turned over within hours to days, yet a minor fraction resists decomposition over thousands of years and is distributed across the globe via ocean currents (Hansell 2013). This long-lived fraction has accumulated to form the large DOM pool we observe in the ocean. The millennium-scale stability of DOM remains enigmatic. Several hypotheses have been proposed to explain its longevity (Dittmar 2015; Dittmar et al. 2021). 
Resistance against microbial degradation might be due to the stable molecular structures present in DOM. In addition, the very low concentrations of individual components make encounters between microbes and DOM molecules rare events, which slows down degradation. Ecological and environmental processes may also lead to long-term accumulation of DOM (Dittmar et al. 2021). For more in-depth information on the biogeochemistry of marine DOM, the reader is referred to textbooks (Holland and Turekian 2014; Hansell and Carlson 2014).

DOM is one of the most complex natural mixtures on Earth. More than ten thousand molecular formulas have been identified in marine DOM with the help of ultra-high resolution Fourier-transform ion cyclotron resonance mass spectrometry (FT-ICR-MS) (Riedel and Dittmar 2014). A multitude of isomers exists for each molecular formula, and marine DOM consists of hundreds of thousands if not millions of different organic compounds at very low concentrations (Zark et al. 2017). Most oceanic DOM remains poorly characterized on an individual molecular level. A few known biochemical structures exist in DOM, and the bulk of DOM was roughly categorized into structural classes, such as aromatics, olefins, sugars, carboxyl-rich alicyclic molecules (CRAM) or material derived from linear terpenoids (Dittmar 2015). Unraveling the isomeric and structural diversity of DOM in different marine and freshwater ecosystems is crucial to understand the driving forces behind production, transportation and long-term stability of marine DOM and its role in global biogeochemical cycles.

Targeted analysis of specific compounds by liquid chromatography/mass spectrometry approaches helps to identify their role in microbial interactions in the ocean (Kujawinski 2011), but the fraction of molecules within targeted analytical windows is small in the case of DOM (Dittmar and Stubbins 2014). Untargeted analysis covers a broader range of the molecular spectrum and enables the determination of molecular diversity.

The molecular composition of marine DOM in various ecosystems has been extensively studied on a molecular formula level using FT-ICR-MS as the analytical gold standard within this research area (Stenson et al. 2002; Koch et al. 2005, 2007). FT-ICR-MS provides an accuracy of less than 0.1 mDa and therefore allows the extremely sensitive separation of thousands of individual molecular masses from which their corresponding molecular formulas can be calculated. Drawbacks of mass spectrometry are the selectivity of ionization techniques and that isomers are not resolved.

Nuclear magnetic resonance (NMR) spectroscopy overcomes these limitations of mass spectrometry, yielding complementary structural information (Seidel et al. 2022). Major drawbacks of NMR are its low sensitivity and low spectral resolution. Large amounts of DOM are required, and overlapping signals of the numerous DOM constituents result in broad unresolved spectral features. The recent development of ultra-high field superconducting magnets (up to 28.2 Tesla), small-volume helium-cooled microcryoprobes, advanced radio frequency (RF) electronics and novel pulse sequences greatly improved the sensitivity and spectral resolution of NMR measurements. These advances facilitate in principle the in-depth characterization of marine DOM on an individual molecular level. However, several improvements and concepts have not yet found application in the NMR spectroscopic analysis of DOM. The qualitative and quantitative analysis of highly complex mixtures such as marine DOM, using high field NMR spectroscopy, complemented by FT-ICR-MS and other analytical techniques is an active, quickly emerging, yet very challenging research field (Seidel et al. 2022). This review provides an overview of the current state of the art and challenges, as well as an outlook and discussion of future perspectives of NMR spectroscopy for the analysis of DOM in the ocean. We also give examples from transferable applications from NMR studies in freshwater and soil environments. It does not cover the physical fundamentals of NMR spectroscopy in detail, for which the reader is referred to one of the numerous classical textbooks (e.g., Hesse et al. 2008; Friebolin 2010; Keeler 2010; Günther 2013; Simpson and Simpson (Eds.) 2014; Lambert et al. 2019). 
However, a brief overview of the basic NMR principles is presented in the “Basic principle” and “Solution-state and solid-state NMR spectroscopy” subsections of the “NMR spectroscopy” section, followed by general aspects of the sample preparation and a description of the most relevant NMR experiments. The third section provides an overview of the application of solution-state NMR for the analysis of DOM, whereas the fourth section deals with current trends and future perspectives of NMR spectroscopy for the molecular-level analysis of DOM.

NMR spectroscopy

Basic principle

Atomic nuclei with a non-zero nuclear spin quantum number (I ≠ 0), which are basically all nuclei with an odd number of protons and/or neutrons, possess a magnetic moment µ. According to the Zeeman effect, the energy of such a nucleus splits into 2I + 1 levels under the influence of a static external magnetic field B0, which is by convention parallel to the z-axis. The magnetic moment is not static but precesses around B0, and its precession frequency is termed the Larmor frequency. The Larmor frequency is not only specific to each kind of nucleus but also differs slightly with the chemical environment among nuclei of the same species. Nuclei with nuclear spin quantum number I = ½ possess two energy levels, which are populated according to Boltzmann statistics and designated as α (lower energy state) and β (higher energy state), respectively. The sum of the magnetic moments is termed the macroscopic or bulk magnetization \(\overrightarrow {M}\) and, in the ground state, likewise precesses around B0. Nuclei can be excited from the α- into the β-state, thereby tilting the macroscopic magnetization \(\overrightarrow {M}\), by electromagnetic radiation of frequency \({\upnu }\) in the radio frequency range that equals the respective Larmor frequency, according to:

$$\nu = \frac{\gamma B_{0}}{2\pi}$$
(1)

where \(\gamma\) is a nucleus-specific constant termed the gyromagnetic ratio.
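As a rough numerical illustration of Eq. (1), the following sketch computes Larmor frequencies for a few nuclei at a given field strength. The gyromagnetic ratios are rounded literature values entered here as assumptions, and the function name `larmor_frequency_mhz` is purely illustrative.

```python
import math

# Gyromagnetic ratios in rad s^-1 T^-1 (rounded literature values, assumed here).
GAMMA = {
    "1H": 267.522e6,
    "13C": 67.283e6,
    "15N": -27.116e6,  # negative sign: opposite sense of precession
    "31P": 108.394e6,
}


def larmor_frequency_mhz(nucleus: str, b0_tesla: float) -> float:
    """Return |nu| = |gamma| * B0 / (2 * pi) in MHz, following Eq. (1)."""
    return abs(GAMMA[nucleus]) * b0_tesla / (2 * math.pi) / 1e6


if __name__ == "__main__":
    for nucleus in GAMMA:
        freq = larmor_frequency_mhz(nucleus, 28.2)
        print(f"{nucleus}: {freq:.1f} MHz at 28.2 T")
```

With these values, a 28.2 T magnet corresponds to a 1H frequency of about 1.2 GHz, which is why such instruments are commonly named after their proton frequency.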

In practice, the excitation is realized with an RF-pulse generated in a coil within the NMR probe. Pulses frequently applied in NMR experiments are 90° pulses (tilting the macroscopic magnetization into the x-y-plane) and 180° pulses (inverting the macroscopic magnetization along the z-axis or refocusing it). The effect of a 90° pulse on the macroscopic magnetization is depicted in Fig. 1. After excitation by the RF-pulse, the macroscopic magnetization \(\overrightarrow {M}\) returns to equilibrium while precessing around B0. According to Faraday’s law of induction, this induces a current in the RF-coil in the form of a free induction decay (FID), which is amplified and represents the actual signal of the NMR measurement. Fourier transformation (FT) of the time domain signal (FID) results in the NMR spectrum in frequency units. To obtain NMR spectra with an improved signal-to-noise ratio (SNR), a larger number of scans (one scan = relaxation delay + RF-pulse + acquisition) is usually collected, as the SNR increases with the square root of the number of scans. To ensure that all spins have returned to their Boltzmann equilibrium state before the next RF-pulse is applied, an equilibration time, commonly referred to as relaxation or recycle delay, precedes each RF-pulse. The sequence of events mentioned above (relaxation delay, RF-pulse and acquisition) is commonly referred to as a pulse sequence. More complex spectra (cf. “2D NMR experiments” and “3D NMR experiments” sections) are acquired using pulse sequences that contain several RF-pulses and delays in the presence of magnetic field gradients.
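The square-root dependence of the SNR on the number of scans has a practical consequence that a short calculation makes explicit: doubling the SNR costs four times the measurement time. The sketch below assumes that the duration of the RF-pulse itself is negligible; all function names and the example timings are hypothetical.

```python
import math


def snr_gain(scans_before: int, scans_after: int) -> float:
    """Relative SNR improvement when the number of scans is changed."""
    return math.sqrt(scans_after / scans_before)


def scans_for_gain(scans_before: int, gain: float) -> int:
    """Number of scans needed to improve the SNR by the factor `gain`."""
    return math.ceil(scans_before * gain ** 2)


def experiment_time_s(scans: int, relaxation_delay_s: float, acquisition_s: float) -> float:
    """Total time, assuming each scan = relaxation delay + acquisition (pulse length neglected)."""
    return scans * (relaxation_delay_s + acquisition_s)


if __name__ == "__main__":
    base_scans = 16
    needed = scans_for_gain(base_scans, 2.0)
    print(f"{needed} scans are needed to double the SNR of {base_scans} scans")
    print(f"time: {experiment_time_s(needed, 2.0, 1.0):.0f} s")
```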

Fig. 1

a Macroscopic magnetization \(\overrightarrow {M}\) precessing around the static external magnetic field B0 and excitation of the macroscopic magnetization \(\overrightarrow {M}\) by a 90° radio frequency (RF) pulse. b Macroscopic magnetization tilted into the x–y-plane due to the 90° RF-pulse, still precessing around B0. c and d Macroscopic magnetization \(\overrightarrow {M}\) returning back to its Boltzmann equilibrium state while precessing around B0 and thus inducing an oscillating electric current in the RF-coil that is detected as free induction decay

Since the resonance frequencies, and thus the positions of signals in an NMR spectrum, depend on the field strength of the instrument, the frequency axis is usually converted into the dimensionless chemical shift scale, which is expressed in parts per million (ppm, quantity symbol δ). This is achieved by referencing the spectrum to a standard with a defined chemical shift according to:

$$\delta = \frac{{\nu_{{{\text{sample}}}} - \nu_{{{\text{reference}}}} }}{{\nu_{{{\text{spectrometer}}}} }} \times 10^{6}$$
(2)

The scale is represented from higher ppm values on the left side to lower ppm values on the right side (cf. for example the scale of the NMR spectrum depicted in Fig. 3).
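Eq. (2) can be illustrated with a short conversion between frequency offsets and chemical shifts. The sketch below uses made-up frequencies; it shows that the same chemical shift corresponds to different offsets in Hz on instruments of different field strength, which is the reason for using the ppm scale.

```python
def chemical_shift_ppm(nu_sample_hz: float, nu_reference_hz: float,
                       nu_spectrometer_hz: float) -> float:
    """Chemical shift in ppm according to Eq. (2)."""
    return (nu_sample_hz - nu_reference_hz) / nu_spectrometer_hz * 1e6


def offset_hz(delta_ppm: float, nu_spectrometer_hz: float) -> float:
    """Frequency offset from the reference that corresponds to a given shift."""
    return delta_ppm * nu_spectrometer_hz / 1e6


if __name__ == "__main__":
    # Hypothetical signal 3630 Hz above the reference on a 500 MHz instrument.
    delta = chemical_shift_ppm(500_003_630.0, 500_000_000.0, 500e6)
    print(f"delta = {delta:.2f} ppm")
    # The same shift on a 1200 MHz instrument lies further from the reference in Hz.
    print(f"offset at 1200 MHz = {offset_hz(delta, 1200e6):.0f} Hz")
```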

NMR spectroscopic experiments can in principle be performed with any nucleus that is NMR active. Nearly all elements possess at least one isotope that is NMR active and therefore amenable to an NMR spectroscopic measurement (Harris 1976). Since the chemical shift of a given nucleus depends on its neighboring groups and/or atoms, NMR spectroscopy makes it possible to selectively analyze specific elements and their chemical environment. Fortunately, 1H, which is found in almost all organic compounds, is the second most sensitive of all NMR active nuclei (the most sensitive nucleus is 3H; however, it is of lesser relevance due to its low natural abundance and its radioactivity) (Patching 2016).

The number of features that can be differentiated by a specific spectroscopic method can be characterized by its theoretical peak capacity. According to Hertkorn et al. (2007), the theoretical peak capacity of FT-ICR-MS is in the range of some million peaks (less than 10 million), with every peak representing at least one compound. In contrast, two-dimensional (2D) 1H,13C NMR spectroscopy (for an explanation of 2D NMR spectroscopy see “2D NMR experiments” section) offers a theoretical peak capacity of ~ 2 million peaks and three-dimensional (3D) NMR spectroscopy of ~ 100 million peaks. However, in NMR spectroscopy one distinct compound frequently shows several peaks, depending on its molecular structure. A medium-sized organic compound (Mw ~ 200 g/mol) might already account for 10–20 peaks on average, thus reducing the number of compounds theoretically separable in a 2D NMR spectrum to the range of hundreds of thousands.
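The estimate at the end of this paragraph follows from a simple division, sketched below. The calculation assumes, as a simplification, that peaks of different compounds never coincide; the function name is illustrative only.

```python
def resolvable_compounds(peak_capacity: float, peaks_per_compound: float) -> float:
    """Upper bound on separable compounds if each compound occupies several peaks."""
    return peak_capacity / peaks_per_compound


if __name__ == "__main__":
    capacity_2d = 2e6  # theoretical peak capacity of 2D 1H,13C NMR quoted in the text
    for peaks in (10, 20):
        n = resolvable_compounds(capacity_2d, peaks)
        print(f"{peaks} peaks per compound -> about {n:,.0f} compounds")
```

For 10 to 20 peaks per compound this gives 100,000 to 200,000 compounds, i.e., the range of hundreds of thousands stated above.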

Solution-state and solid-state NMR spectroscopy

NMR spectroscopy can be applied to both liquids (pure liquids or dissolved samples) and solids and is then termed solution-state and solid-state NMR spectroscopy, respectively. The instrumental setups of the two techniques differ. Solid-state NMR suffers from poor spectral resolution and sensitivity due to broad signals caused by strong anisotropic interactions, such as dipole–dipole coupling, chemical shift anisotropy and quadrupolar coupling. Advanced solid-state NMR approaches such as cross-polarization and magic-angle spinning (CP-MAS) and high-power homo- and heteronuclear decoupling are often used to mitigate these effects. When applying MAS, the sample is spun at an angle of ~54.7° with respect to the external magnetic field, which is generally referred to as the magic angle. In solution, in contrast, the anisotropic effects are averaged out to zero by rapid molecular tumbling. Therefore, solution-state NMR spectroscopy is technically easier to realize but requires that the whole sample is soluble in an appropriate solvent (cf. “Sample preparation” section). For these reasons, solution-state NMR spectroscopy is in general the preferred technique for the analysis of DOM (cf. Fig. 2). However, in cases where the sample is not soluble in any solvent suitable for solution-state NMR or relatively large amounts of a solid sample are available, solid-state NMR might be the method of choice.
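The value of ~54.7° is not arbitrary. The anisotropic interactions mentioned above scale with the angular factor (3cos²θ − 1)/2, which is not stated in the text but is the standard second Legendre polynomial, and the magic angle is the angle at which this factor vanishes. A minimal check:

```python
import math


def second_legendre(theta_rad: float) -> float:
    """Angular factor (3 cos^2(theta) - 1) / 2 that scales the anisotropic interactions."""
    return 0.5 * (3 * math.cos(theta_rad) ** 2 - 1)


# The factor is zero when cos^2(theta) = 1/3.
MAGIC_ANGLE_RAD = math.acos(1 / math.sqrt(3))

if __name__ == "__main__":
    print(f"magic angle = {math.degrees(MAGIC_ANGLE_RAD):.2f} degrees")
    print(f"angular factor at the magic angle = {second_legendre(MAGIC_ANGLE_RAD):.1e}")
```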

Fig. 2

Flowchart depicting the wealth of options to analyze a marine dissolved organic matter (DOM) sample with nuclear magnetic resonance (NMR) spectroscopy. The sample can be subjected to an NMR measurement either with or without extraction. Without extraction, solvent suppression techniques are needed to suppress the signal arising in the NMR spectrum due to the protons of water and only 1D 1H NMR spectra are feasible to record. In addition, the sample volume is heavily limited by the capacity of the NMR tube. The measurement after extraction benefits from both, the removal of the matrix (i.e., water and inorganic constituents that may disturb the NMR measurement) and concentration effects since usually several liters of water are extracted to obtain a few milligrams of DOM sample. Because sample extracts are usually dried, the sample can also be measured by solid-state NMR. Since solid-state NMR suffers from lower sensitivity and resolution compared to solution-state NMR, this is especially useful in case that the sample is not soluble in an appropriate solvent for solution-state NMR. Due to the concentration as a result of the extraction step, dozens of different NMR experiments, including multidimensional techniques, can be performed

Sample preparation

The presence of huge quantities of salts (~ 35 g/L) compared to the very low amounts of organic compounds (~ 1 mg/L) in marine DOM in its native state poses challenges to investigate it by analytical techniques in general and by NMR spectroscopy in particular. Moreover, the increased conductivity of the samples due to the high salt content reduces the performance of cryoprobes. For the effective usage of different analytical techniques, it is therefore necessary to remove the whole matrix (i.e., water and inorganic constituents) in order to obtain highly concentrated organic material. The introduction of microcryoprobes and high field NMR magnets significantly reduced the amount of material required for performing multidimensional NMR experiments in a reasonable measurement time (Seidel et al. 2022), which otherwise demands large quantities of water to be collected (Hertkorn et al. 2006; Panagiotopoulos et al. 2007; Lam et al. 2007; Arakawa et al. 2017). Two methods are routinely used for the extraction of DOM: solid-phase extraction (SPE) for the isolation of hydrophobic and amphiphilic DOM (Dittmar et al. 2008; Green et al. 2014) and ultrafiltration for the isolation of high molecular weight DOM (HMWDOM, also referred to as UDOM) (Benner et al. 1997). These methods recover about 65% (SPE) (Green et al. 2014) and 25% (UDOM) (Benner et al. 1997) of marine DOC. Reverse osmosis coupled with electrodialysis (RO/ED), which has the potential to recover more than 80% of marine DOC, has rarely been used to isolate DOM from seawater (Green et al. 2014).
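The numbers in this paragraph allow a back-of-the-envelope estimate of how much seawater must be processed for a given amount of extract. The sketch below is a simplification: it assumes that the recovery quoted for DOC applies equally to the mass of organic material, and it uses the approximate concentration of ~ 1 mg/L from the text. The target mass of 5 mg is a hypothetical example.

```python
def seawater_volume_l(target_mass_mg: float,
                      concentration_mg_per_l: float = 1.0,
                      recovery: float = 0.65) -> float:
    """Volume of seawater needed to isolate `target_mass_mg` of DOM.

    concentration_mg_per_l: approximate organic content of seawater (~1 mg/L in the text)
    recovery: fraction recovered by the extraction (about 0.65 for SPE, 0.25 for ultrafiltration)
    """
    if not 0 < recovery <= 1:
        raise ValueError("recovery must be in the interval (0, 1]")
    return target_mass_mg / (concentration_mg_per_l * recovery)


if __name__ == "__main__":
    for method, recovery in (("SPE", 0.65), ("ultrafiltration", 0.25)):
        volume = seawater_volume_l(5.0, recovery=recovery)
        print(f"{method}: about {volume:.1f} L for 5 mg of DOM")
```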

As mentioned before, the sample must be dissolved in an appropriate solvent for solution-state NMR spectroscopy. An appropriate solvent is stable, does not react with the sample and does not produce major signals in the NMR spectrum. Since the nucleus that is most often observed in NMR spectroscopic experiments is 1H, usually deuterated solvents, such as D2O, DMSO-d6, MeCN-d3, MeOH-d3, MeOD-d4, CDCl3 or C6D6, are used. These solvents cause only very small signals in the corresponding 1H NMR spectra, which arise from residual protons that are unavoidable for technical reasons. In addition, the residual solvent signal is often used as an internal chemical shift reference standard. Beyond that, deuterated solvents are used for keeping the external magnetic field stable. For this purpose, an internal 2H lock system is used that basically acts as a separate spectrometer by monitoring the deuterium resonance of the solvent and automatically correcting for drifts of the magnetic field.

Available NMR experiments

Most NMR experiments presented within this review are in principle applicable to any NMR active nucleus. However, this is often not feasible due to the low receptivity of many nuclei, which would result in almost infinite measurement times. Fortunately, most elements relevant to the biogenic environment possess at least one sufficiently receptive nucleus, such as 1H, 13C, 15N and 31P. Some frequently applied NMR spectroscopic techniques are presented within this section. In contrast to other spectroscopic techniques, each NMR spectroscopic experiment provides very specific information. Thus, recording different NMR spectra usually enhances the knowledge about the sample and facilitates compositional and structural assignments. A tabular overview of the discussed NMR spectroscopic techniques and their applications for the analysis of marine and freshwater DOM is given at the end of this section.

1D NMR experiments

The simplest and most widely applied of all NMR spectroscopic experiments is the 1D 1H NMR experiment. It basically contains only one RF-pulse (most commonly a 90° pulse), after which the induced signal is detected. If properly acquired, it provides quantitative information about the protons in the sample. This information is delivered in the form of the peak integral, which is proportional to the number of protons responsible for the specific signal. In addition to the quantitative component of an NMR spectrum, the chemical shift of a signal provides structural insights, because it strongly depends on the chemical environment of an individual proton. The magnetic field experienced by a specific nucleus is influenced by its neighboring atoms and groups. Electron-withdrawing groups decrease the electron density at the observed nucleus, which is termed deshielding. As a consequence of the deshielding, the opposing magnetic field caused by the electrons around the nucleus is reduced and the magnetic field experienced by the nucleus is therefore increased. Thus, radiation of a higher frequency is required to fulfill the resonance condition [cf. Eq. (1)] and the signal is shifted towards higher ppm values (termed downfield shift). Electron-donating groups cause the opposite effect, termed upfield shift (shift towards lower ppm values). Besides the electron-withdrawing properties of the surrounding groups, the shielding/deshielding of nuclei is also affected by anisotropic effects and hydrogen bonding.

If the signals of individual protons are sufficiently resolved, they also provide information about neighboring atoms in the form of the multiplet structure of the respective signal. The appearance as multiplets is caused by scalar couplings, which are also referred to as indirect spin–spin coupling or J-coupling. The fine structure of a J-coupling multiplet can be determined according to the ‘2*n*I + 1 rule’, where n is the number of equivalent coupling partners and I is their nuclear spin. The magnitude of the coupling generally decreases with increasing number of bonds between the coupled nuclei. Therefore, usually only J-couplings between atoms separated by one (termed 1J-coupling) to five bonds (5J-coupling) are relevant. However, 4J- and 5J-couplings are only observed in special cases. In principle, J-coupling can be observed between both heteronuclear and homonuclear spins. In practice, only couplings to nuclei with high natural abundance, such as 1H, 19F or 31P, are of major relevance. A further point to consider is chemical and magnetic equivalence. Nuclei are chemically equivalent if they are interconvertible by symmetry operations and furthermore magnetically equivalent if they have exactly the same coupling partners. Chemically equivalent nuclei have the same chemical shift and can be further divided into homotopic and enantiotopic protons. Diastereotopic protons are chemically inequivalent and hence have different chemical shifts. The specific features of homotopic, enantiotopic and diastereotopic groups are covered by the stereochemical concept of topicity. Fig. 3 shows which information can be derived from a 1D 1H NMR spectrum.

Fig. 3

1D 1H nuclear magnetic resonance spectrum (500 MHz, CDCl3) of (R)-2-ethoxypropionic acid (C5H10O3). The chemical shift (δ) scale (f1) is given in parts per million (ppm). Signals of protons (2-CH, 1’-CHa/b) adjacent to electronegative elements and/or electron-withdrawing groups are deshielded and appear at higher chemical-shifts. The protons of each methyl group (3-CH3, 2’-CH3) are chemically and magnetically equivalent and therefore have an integral of 3. The protons of the methylene group (1’-CHa/b) are diastereotopic and thus neither chemically nor magnetically equivalent. Therefore, they appear as two distinct signals at different chemical shifts. The proton signals show multiplet structures due to J-coupling interactions with the neighboring coupled nuclei (green arrows). For example, the signal of the 2’-CH3 protons appears as a triplet due to the expected 3J-coupling to two protons (2*n*I + 1 rule, n = 2, I = ½)
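The 2*n*I + 1 rule used in the caption above can be written down in a few lines. The sketch below also returns the relative line intensities for coupling to n equivalent spin-½ nuclei, which follow the binomial coefficients (Pascal's triangle) in first-order spectra. This intensity pattern is not stated in the text and is added here as a well-known supplement; the function names are illustrative.

```python
from fractions import Fraction
from math import comb


def multiplicity(n_partners: int, spin: Fraction = Fraction(1, 2)) -> int:
    """Number of lines in a first-order multiplet: 2 * n * I + 1."""
    return int(2 * n_partners * spin + 1)


def spin_half_intensities(n_partners: int) -> list[int]:
    """Relative line intensities for coupling to n equivalent spin-1/2 nuclei."""
    return [comb(n_partners, k) for k in range(n_partners + 1)]


if __name__ == "__main__":
    # 2'-CH3 protons of Fig. 3: coupled to the two protons of the neighboring CH2 group.
    print(multiplicity(2), spin_half_intensities(2))
    # A CH proton next to a CH3 group: coupled to three equivalent protons.
    print(multiplicity(3), spin_half_intensities(3))
```

For n = 2 and I = ½ this gives three lines with relative intensities 1:2:1, i.e., the triplet described for the 2’-CH3 protons.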

A second frequently recorded 1D NMR spectrum is the 1D 13C NMR spectrum. Due to the low natural abundance of 13C (only 1.1% of all carbon atoms), homonuclear J-couplings are usually not relevant. However, coupling to 1H nuclei takes place, which leads to a splitting of signals into multiplets. In order to reduce spectral complexity, 13C NMR spectra are often recorded proton decoupled, which means that signal splitting due to coupling with 1H nuclei is suppressed. 1H decoupled 13C NMR spectra are not quantitative if the decoupling is applied during the whole pulse sequence (power-gated decoupling), whereas they remain quantitative if the decoupling is applied only during the acquisition period (inverse-gated decoupling). Nevertheless, 13C NMR spectra are frequently recorded with power-gated decoupling, because proton decoupling during the whole pulse sequence causes 13C signal enhancements as a result of the nuclear Overhauser effect (NOE) (Overhauser 1953; Solomon 1955); since the degree of enhancement varies between carbons, these spectra are not quantitative. Coupling to other highly abundant nuclei, such as 19F or 31P, still takes place in proton decoupled 13C experiments. Recording the 13C spectra with distortionless enhancement by polarization transfer (DEPT) (Doddrell et al. 1982) provides both an enhancement of sensitivity and the possibility to distinguish between CH- (methine), CH2- (methylene) and CH3- (methyl) groups. Carbons with no attached hydrogen (e.g., quaternary and carbonyl carbons) show no signal in the regular DEPT experiment. Quaternary carbons can be detected in DEPT including the detection of quaternary nuclei (DEPTQ) (Burger and Bigler 1998), J-modulated spin-echo (JMOD) (Le Cocq and Lallemand 1981) and attached proton test (APT) (Patt and Shoolery 1982) experiments.
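To give a feeling for the size of the enhancements mentioned above, the following sketch evaluates two textbook upper limits that are not stated in the text: the maximum NOE factor 1 + γ(1H)/(2γ(13C)), valid for small molecules with purely dipolar relaxation, and the polarization transfer factor γ(1H)/γ(13C) that underlies experiments such as DEPT. The gyromagnetic ratios are rounded literature values and the function names are illustrative.

```python
# Gyromagnetic ratios in rad s^-1 T^-1 (rounded literature values, assumed here).
GAMMA_1H = 267.522e6
GAMMA_13C = 67.283e6


def max_noe_factor(gamma_saturated: float, gamma_observed: float) -> float:
    """Upper limit of the NOE signal factor: 1 + gamma_sat / (2 * gamma_obs)."""
    return 1 + gamma_saturated / (2 * gamma_observed)


def polarization_transfer_factor(gamma_source: float, gamma_observed: float) -> float:
    """Upper limit of the gain from polarization transfer: gamma_source / gamma_obs."""
    return gamma_source / gamma_observed


if __name__ == "__main__":
    print(f"max. NOE factor for 13C{{1H}}: {max_noe_factor(GAMMA_1H, GAMMA_13C):.2f}")
    print(f"max. transfer gain 1H -> 13C: "
          f"{polarization_transfer_factor(GAMMA_1H, GAMMA_13C):.2f}")
```

In practice, the NOE factor of an individual carbon lies anywhere between 1 and this upper limit, which is why power-gated spectra cannot be integrated reliably.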

Although the 1D 1H NMR experiment is the most sensitive experiment, it often suffers from poor spectral resolution due to overlapping J-coupling multiplets spread over the limited 1H chemical shift range (~ 10 ppm), even at the ultra-high magnetic fields currently available. Recent advancements in proton–proton homonuclear decoupling methodologies, popularly known as pure-shift NMR, enhance the spectral resolution by collapsing J-coupled multiplets into singlets for each chemically non-equivalent proton, even at lower magnetic fields (Zangger 2015; Castañar 2017). One of the most sensitive variants of pure-shift NMR experiments is the PSYCHE (pure shift yielded by chirp excitation) experiment (Foroozandeh et al. 2014).

Whereas chemical shifts and J-couplings provide information about the chemical environment and coupling partners, respectively, NOEs and ROEs (rotating frame Overhauser effects) are valuable NMR structural parameters that provide information about the conformation and configuration of small organic compounds and biomolecules. Their corresponding spectra display correlations between protons that are in spatial proximity (less than 5 Å). It is most common to record these types of experiments in a 2D fashion, which are referred to as nuclear Overhauser effect spectroscopy (NOESY) (Jeener et al. 1979; Kumar et al. 1980) and rotating frame Overhauser effect spectroscopy (ROESY) (Bothner-By et al. 1984; Bax and Davis 1985a), respectively. However, 1D NOE/ROE techniques, such as NOE difference (NOEDIFF) spectroscopy (Richarz and Wüthrich 1978) and 1D-NOESY (Kessler et al. 1986), also exist.

Another, quite important NMR experiment is termed pulsed field gradient spin-echo (PFGSE) (Stejskal and Tanner 1965). In this experiment, the magnetization is dephased by a magnetic field gradient after its initial excitation by a 90° RF-pulse. After an evolution time ∆/2, the magnetization is inverted by a 180° RF-pulse, followed by the application of a second magnetic field gradient after an evolution time of ∆/2. Usually, this experiment is applied several times while incrementing the strength of the magnetic field gradient. From this series of spectra, the diffusion coefficients of individual compounds can be calculated and 2D spectra can be constructed, which is usually referred to as diffusion ordered spectroscopy (DOSY) (Morris and Johnson 1992). Since this technique leads to a virtual separation of individual compounds according to their diffusion coefficients, it is often designated as NMR chromatography.
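How a diffusion coefficient is obtained from such a gradient series can be sketched with the Stejskal–Tanner relation for rectangular gradient pulses, I = I0·exp(−D·b) with b = (γ·g·δ)²·(∆ − δ/3). This equation is not given in the text; here δ denotes the gradient pulse length (not a chemical shift), g the gradient strength and ∆ the diffusion time. All parameter values below are hypothetical, and the fit is a plain least-squares line through ln(I) versus b.

```python
import math

GAMMA_1H = 267.522e6  # rad s^-1 T^-1 (rounded literature value, assumed here)


def b_value(gradient_t_per_m: float, pulse_len_s: float, diffusion_time_s: float,
            gamma: float = GAMMA_1H) -> float:
    """Stejskal-Tanner b factor (s/m^2) for rectangular gradient pulses."""
    return (gamma * gradient_t_per_m * pulse_len_s) ** 2 * (diffusion_time_s - pulse_len_s / 3)


def attenuated_intensity(i0: float, diffusion_coeff: float, b: float) -> float:
    """Signal intensity after diffusion-induced attenuation."""
    return i0 * math.exp(-diffusion_coeff * b)


def fit_diffusion_coefficient(b_values: list[float], intensities: list[float]) -> float:
    """Least-squares slope of ln(I) versus b; the diffusion coefficient is -slope."""
    log_i = [math.log(i) for i in intensities]
    n = len(b_values)
    mean_b = sum(b_values) / n
    mean_y = sum(log_i) / n
    sxy = sum((b - mean_b) * (y - mean_y) for b, y in zip(b_values, log_i))
    sxx = sum((b - mean_b) ** 2 for b in b_values)
    return -sxy / sxx


if __name__ == "__main__":
    # Hypothetical series: six gradient strengths, 2 ms gradient pulses, 100 ms diffusion time.
    gradients = [0.05, 0.1, 0.2, 0.3, 0.4, 0.5]  # T/m
    bs = [b_value(g, 0.002, 0.1) for g in gradients]
    signal = [attenuated_intensity(1.0, 5e-10, b) for b in bs]  # simulated, D = 5e-10 m^2/s
    print(f"fitted D = {fit_diffusion_coefficient(bs, signal):.2e} m^2/s")
```

In a DOSY spectrum, this fit is carried out for every signal in the spectrum, and the fitted diffusion coefficients form the second axis.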

2D NMR experiments

The introduction of a second dimension in NMR spectroscopic experiments has the advantage of an improved resolution, which often allows the separation of peaks that would otherwise overlap in a 1D NMR spectrum. In addition, 2D NMR experiments can provide detailed evidence about through-bond (J-couplings) and/or through-space (dipole–dipole cross relaxation) atom connectivity, thus facilitating the atomic level structural analysis of individual molecules as well as of complex mixtures. They can be recorded either in a homonuclear (correlations between the same kind of nuclei are observed) or in a heteronuclear (correlations between two different kinds of nuclei are observed) fashion. In general, a 2D NMR pulse sequence contains four blocks: preparation, evolution (t1), mixing and detection (t2). A 2D NMR spectrum is recorded by acquiring a series of FIDs as a function of incremented t1 evolution periods. The application of a fast Fourier transformation (FFT) along t2 followed by an FFT along the t1 time domain data results in a 2D NMR spectrum with two frequency axes f2 and f1, termed direct (f2) and indirect (f1) dimensions, respectively. Usually, both axes of a 2D NMR spectrum represent chemical shift scales of the respective nuclei. However, other types of 2D NMR experiments also exist where, e.g., the diffusion coefficients (DOSY, cf. “1D NMR experiments” section) or the J-couplings are displayed in the f1 dimension. The latter is designated as the J-resolved (JRES) (Aue et al. 1976b) experiment and is especially useful if the 1D 1H NMR spectrum is too crowded (i.e., massive peak overlap) to allow the direct measurement of coupling constants (distance between two adjacent local maxima of a multiplet signal measured in Hz). The JRES experiment separates the chemical shift information from the coupling constants by displaying the pure 1H chemical shifts as singlets on the f2 axis and the J-couplings along the f1 dimension. 
Beyond the resolution of 1H-1H (homonuclear) couplings, it can also be used to resolve 1H-X (heteronuclear) couplings.
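The two-step Fourier transformation described above (first along t2, then along t1) can be illustrated on a small synthetic data set. The sketch below uses a plain discrete Fourier transform instead of an optimized FFT to stay dependency-free, and the "FIDs" are a hypothetical single decaying signal placed at known frequency bins. Real processing software additionally applies window functions, zero filling and phase correction, which are omitted here.

```python
import cmath
import math


def dft(signal):
    """Plain O(n^2) discrete Fourier transform of a complex sequence."""
    n = len(signal)
    return [
        sum(signal[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def dft2(fids):
    """Transform fids[t1][t2] into spectrum[f1][f2]: first along t2, then along t1."""
    along_t2 = [dft(fid) for fid in fids]  # one transform per recorded FID
    n1, n2 = len(along_t2), len(along_t2[0])
    columns = [dft([along_t2[i][j] for i in range(n1)]) for j in range(n2)]
    return [[columns[j][i] for j in range(n2)] for i in range(n1)]


def synthetic_fids(n1, n2, k1, k2, decay=0.1):
    """Hypothetical data: one decaying signal at bin k1 (indirect) and k2 (direct)."""
    return [
        [
            cmath.exp(2j * math.pi * (k1 * t1 / n1 + k2 * t2 / n2))
            * math.exp(-decay * (t1 + t2))
            for t2 in range(n2)
        ]
        for t1 in range(n1)
    ]


def strongest_peak(spectrum):
    """Return the (f1, f2) index pair of the largest magnitude in the spectrum."""
    indices = ((i, j) for i in range(len(spectrum)) for j in range(len(spectrum[0])))
    return max(indices, key=lambda ij: abs(spectrum[ij[0]][ij[1]]))


if __name__ == "__main__":
    spectrum = dft2(synthetic_fids(8, 16, k1=3, k2=5))
    print("cross-peak found at (f1, f2) bins:", strongest_peak(spectrum))
```

Each row of the input corresponds to one FID recorded at a fixed t1 increment, so the number of rows sets the resolution in the indirect dimension and, at the same time, the measurement time.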

One of the most prominent 2D NMR spectroscopic experiments is correlation spectroscopy (COSY) (Jeener 1971; Aue et al. 1976a; Brereton et al. 1991; Vanzijl et al. 1995; Jeener and Alewaeters 2016). Most often, it is recorded as a homonuclear 1H,1H COSY in which the J-couplings between protons are observed. It displays the 1D 1H NMR spectrum as the diagonal peaks of the spectrum and signals for protons that are connected with each other via J-couplings as cross-peaks. Most commonly, 3J-couplings are observed because they give rise to more intense cross-peaks than long-range couplings (coupling through four or more bonds). By taking into account the information extracted from 1D spectra, and provided that the signals in the 1H,1H COSY spectrum are well resolved (i.e., no overlapping peaks), it is possible to define distinct structural fragments, such as an ethyl unit (CH3-CH2, cf. Fig. 4). Homonuclear COSY spectra of nuclei other than 1H can be recorded as well, as can heteronuclear COSY spectra, such as a 1H,13C COSY. The latter is designated as HETCOR (heteronuclear correlation spectroscopy) (Bodenhausen and Freeman 1977; Freeman and Morris 1978; Bax and Morris 1981). In this experiment, cross-peaks for one-bond correlations between 13C and 1H are observed.

Fig. 4

2D 1H,1H COSY (correlation spectroscopy) nuclear magnetic resonance spectrum (500 MHz, CDCl3) of (R)-2-ethoxypropanoic acid (C5H10O3). Both chemical shift (δ) scales (f1 and f2) are given in parts per million (ppm). The diagonal peaks represent the 1D 1H nuclear magnetic resonance spectrum and the off-diagonal peaks (‘cross-peaks’) indicate which CHn-groups are connected to each other via J-couplings. The cross-peaks highlighted in green indicate that the protons of the 1'-CH2-group (~ 3.6 ppm) are connected via three bonds to the protons of the 2'-CH3-group (~ 1.2 ppm), thus forming an ethyl unit (CH3-CH2), and the cross-peaks highlighted in blue indicate that the 3-CH3-group (~ 1.5 ppm) must be connected to the 2-CH-group (~ 4.0 ppm). Please note that the protons of the 1'-CH2-group appear as two distinct signals due to diastereotopic splitting

An experiment very similar to the COSY is TOCSY (total correlation spectroscopy) (Braunschweiler and Ernst 1983; Bax and Davis 1985b; Cavanagh and Rance 1990; Kövér et al. 1998), historically sometimes also designated as HOHAHA (homonuclear Hartmann-Hahn) spectroscopy. It also belongs to the group of 2D homonuclear experiments and is most often recorded as a 1H,1H TOCSY. It shows correlations from a given proton to all protons of an unbroken spin network and not only to those that are connected via two or three bonds. TOCSY correlations between two remote protons are usually observable as long as there are spin–spin couplings between all interstitial protons. It is a very useful experiment for identifying individual constituents of a complex mixture and for the analysis of sugars. The presence of heteroatoms such as oxygen, or a zero spin–spin coupling between the protons, interrupts the TOCSY transfer.

The HETCOR experiment is a 13C-detecting experiment. Since 13C is far less sensitive than 1H, this is a decisive disadvantage. This problem was tackled by the development of 1H-detecting experiments, such as the HMQC (heteronuclear multiple quantum coherence) (Müller 1979; Bax et al. 1983), HSQC (heteronuclear single quantum coherence) (Bodenhausen and Ruben 1980) or HMBC (heteronuclear multiple bond correlation) (Bax and Summers 1986; Hurd and John 1991). The initial step of these experiments involves the excitation of the 1H magnetization, followed by the polarization transfer from highly sensitive 1H nuclei to insensitive X-nuclei by an INEPT (insensitive nuclei enhancement by polarization transfer) element. After an evolution period, the magnetization is transferred back to the protons by reverse-INEPT and detected. The HMQC and HSQC spectra display signals for couplings via one bond and thus show which hydrogen is connected to which carbon. In contrast, the HMBC spectrum displays signals for long-range couplings, usually via two to four bonds (with three bonds being most common). This type of information is especially useful to connect structural fragments that were defined by 1D, COSY, HSQC and HMQC spectra.

The 1,1-ADEQUATE (adequate sensitivity double-quantum transfer experiment) (Reif et al. 1996) spectrum shows correlations between protons and carbons that are separated by two bonds and thus indirectly provides the information about which carbons are bonded to each other when recorded together with an HSQC or HMQC spectrum. Other useful variants of the ADEQUATE experiment are the 1,n, the n,1 or the n,n-ADEQUATE, which display the spin–spin correlations over more than two bonds. The 13C-detecting analogue of the 1,1-ADEQUATE is designated as INADEQUATE (incredible natural abundance double-quantum transfer experiment) (Bax et al. 1980, 1981). This experiment relies on one-bond carbon–carbon couplings. Thus, only two adjacent 13C nuclei lead to signals in the spectrum. At natural abundance, the probability of finding two adjacent 13C nuclei is only 0.01%. Therefore, the INADEQUATE experiment is extremely insensitive and usually requires isotopically enriched samples or highly concentrated samples at natural abundance.
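
The figure of about 0.01% follows from simple arithmetic, sketched below. The calculation assumes a 13C natural abundance of roughly 1.1% (a value not given in the text) and that the isotopes at two neighbouring carbon positions are distributed independently; the function name is illustrative.

```python
# Assumed approximate natural abundance of 13C (not stated in the text).
C13_NATURAL_ABUNDANCE = 0.011


def adjacent_pair_probability(abundance=C13_NATURAL_ABUNDANCE):
    """Probability that two given neighbouring carbon positions are both 13C,
    assuming independent isotope distribution."""
    return abundance ** 2


p_natural = adjacent_pair_probability()
p_enriched = adjacent_pair_probability(0.99)  # hypothetical 99% 13C enrichment

print(f"natural abundance: {p_natural * 100:.3f} %")   # 0.012 %
print(f"99% enriched:      {p_enriched * 100:.1f} %")  # 98.0 %
```

The gain of nearly four orders of magnitude on enrichment is why INADEQUATE is usually reserved for labeled or highly concentrated samples.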

In the above-mentioned experiments, the magnetization transfer occurs via through-bond spin–spin couplings. Other useful NMR experiments such as NOESY or ROESY rely on dipole–dipole cross relaxation. As in the COSY spectrum, the diagonal peaks represent the 1D 1H NMR spectrum. For the NOESY spectrum, the diagonal peaks are of opposite phase compared to the cross-peaks for small molecules and in the same phase for large molecules. Peaks from exchangeable protons are always in the same phase as the diagonal peaks. However, for medium-sized compounds the NOE may become zero, which results in no observable peak. To overcome this limitation, the ROESY was developed. In this experiment, ROE cross-peaks appear in the opposite phase compared to the diagonal peaks, chemical exchange peaks (at least one of the correlating protons is exchangeable) have the same phase as the diagonal peaks and signals are also observed for medium-sized compounds. However, ROESY is prone to through-bond correlation artifacts (TOCSY cross-peaks appear in the same phase as diagonal peaks), which might be misinterpreted as through-space correlations. In addition, the spin-lock pulse used in the ROESY sequence causes sample heating, which is particularly problematic for salty samples measured in helium-cooled cryo-probes.

3D NMR experiments

Approximately one decade after the development of 2D NMR techniques (Jeener 1971; Aue et al. 1976a; Jeener and Alewaeters 2016), 3D NMR experiments were also established (Plant et al. 1986; Griesinger et al. 1987). Today, NMR experiments with up to eleven dimensions have been reported (Rinaldi and Monwar 2017). Since a classical 3D NMR spectrum is typically recorded within days and a 4D NMR spectrum will usually already take weeks to record (Rinaldi and Monwar 2017), NMR spectra with more than three dimensions are of limited practical relevance.

3D NMR experiments can be constructed either by combining two consecutive 2D NMR experiments of up to two types of nuclei or as triple resonance experiments, by correlating three different nuclei. Traditionally, 3D NMR experiments are used for the 3D structure determination of proteins. Thus, commonly correlated nuclei in triple resonance experiments are 1H, 13C and 15N. However, triple resonance experiments correlating other nuclei, such as 1H, 13C and 19F (Li and Rinaldi 1996) and 1H, 13C and 31P (Berger and Bast 1993), have also been reported. Most often, proteins are isotopically labeled before subjecting them to 3D NMR experiments. Presumably because of the low sensitivity of nD NMR experiments (n greater than or equal to 3) and the huge molecular diversity of DOM (and thus, the high amount of sample needed to record spectra with sufficient resolution within a realistic time), nD NMR spectroscopy has rarely been used for the analysis of such complex mixtures. The few published examples include in particular studies from the working group around André J. Simpson (Simpson 2002; Simpson et al. 2003; Woods et al. 2012). For this reason, 3D NMR spectroscopy is not further reviewed herein, but its principles have already been discussed in more detail in other reviews (see, e.g., Rinaldi and Monwar 2017).

Solvent suppression

In NMR spectroscopy, the analog signal is converted into a digital signal by means of an analog-to-digital converter. Most commonly, analog-to-digital converters digitize the analog signal with ~ 16 bits. In general, the analog-to-digital converter must be adjusted with respect to the largest signal. When recording a 1H NMR spectrum without using a deuterated solvent, the largest signal by far will be that of the solvent (at least if the solvent possesses protons, which is the case for most common organic laboratory solvents). If the sample to be measured is very dilute, the sample signal(s) will be represented by the last bit together with the noise, a circumstance that is commonly known as the ‘dynamic range problem’ of NMR. For this reason, the solvent signal must be reduced when using non-deuterated solvents (which is sometimes unavoidable, e.g., when acquiring spectra in vivo) or when measuring very dilute samples. The easiest way to remove the solvent signal is to saturate it by a low-power and long-lasting selective RF-pulse before applying the actual pulse sequence of the desired NMR experiment. However, this approach has some disadvantages, and over the last decades numerous techniques for solvent signal suppression have been developed (Zheng and Price 2010).
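
The dynamic range problem can be made concrete with a rough estimate. The sketch below is a deliberate simplification: it assumes that the signal amplitude scales linearly with proton concentration, that the solvent signal exactly fills the converter range, and that water contributes about 111 mol/L of protons (about 55.5 mol/L H2O with two protons each). These values and the function names are assumptions for illustration; effects such as signal averaging and oversampling are ignored.

```python
def adc_levels(bits=16):
    """Number of discrete levels an ADC with the given bit depth can encode."""
    return 2 ** bits


def analyte_levels(solvent_protons_molar, analyte_protons_molar, bits=16):
    """ADC levels left for the analyte when the solvent signal fills the full
    converter range (amplitude assumed proportional to proton concentration)."""
    return adc_levels(bits) * analyte_protons_molar / solvent_protons_molar


# Assumed proton concentration of non-deuterated water, in mol/L.
WATER_PROTONS_MOLAR = 111.0

# Hypothetical analyte proton concentrations in mol/L.
for analyte in (1e-1, 1e-3, 1e-5):
    levels = analyte_levels(WATER_PROTONS_MOLAR, analyte)
    status = "resolved" if levels >= 1 else "below one ADC step"
    print(f"{analyte:.0e} mol/L -> {levels:.3g} levels ({status})")
```

Under these assumptions, a millimolar analyte in non-deuterated water already falls below a single step of a 16-bit converter, which is the situation the text describes as the sample signal being represented by the last bit together with the noise.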

NMR spectroscopy of dissolved organic matter

NMR spectra of complex mixtures containing hundreds of thousands of compounds with different molecular structures look entirely different from those of individual small organic compounds. In very complex mixtures, the molecular-level structural characterization of the components from scratch is commonly impossible. Fig. 5 shows a 1D 1H NMR spectrum of Suwannee River DOM. At first glance, it may appear even less complex than the spectrum of a single organic compound (Fig. 3), since it does not show any fine structure of the signals. However, nuclei from thousands of distinct molecular formulas and structures contribute to the signal intensity of this spectrum. The elucidation of distinct molecular structures is not possible. Thus, 1D 1H NMR spectra of DOM are frequently divided into predefined chemical shift regions representing certain compound classes, such as aliphatics (0.0–1.9 ppm), acetate and CRAM (1.9–3.1 ppm), carbohydrates and methoxy groups (3.1–4.9 ppm), olefins (5.3–6.5 ppm) and aromatics (6.5–10.0 ppm) (Hertkorn et al. 2016). Similar classification schemes in 1D NMR also exist for nuclei other than 1H. The aforementioned regions are then individually integrated to draw conclusions about the relative abundance of the different compound classes.
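
The region-wise integration described above can be sketched as follows. This is a minimal illustration rather than an established processing workflow: the chemical shift regions are those quoted from Hertkorn et al. (2016), whereas the trapezoidal integration, the function names and the synthetic spectrum (two broad Gaussian humps) are assumptions made for the example.

```python
import math

# Chemical shift regions in ppm, as listed in the text (Hertkorn et al. 2016).
REGIONS_PPM = {
    "aliphatics": (0.0, 1.9),
    "acetate and CRAM": (1.9, 3.1),
    "carbohydrates and methoxy groups": (3.1, 4.9),
    "olefins": (5.3, 6.5),
    "aromatics": (6.5, 10.0),
}


def integrate_region(ppm, intensity, low, high):
    """Trapezoidal integral over all segments lying inside [low, high].
    The ppm axis must be sorted in ascending order."""
    area = 0.0
    for k in range(len(ppm) - 1):
        x0, x1 = ppm[k], ppm[k + 1]
        if low <= x0 and x1 <= high:
            area += 0.5 * (intensity[k] + intensity[k + 1]) * (x1 - x0)
    return area


def relative_abundance(ppm, intensity, regions=REGIONS_PPM):
    """Fraction of the total integrated intensity found in each region."""
    areas = {
        name: integrate_region(ppm, intensity, low, high)
        for name, (low, high) in regions.items()
    }
    total = sum(areas.values())
    return {name: area / total for name, area in areas.items()}


# Hypothetical featureless spectrum: two broad humps on a 0-10 ppm axis.
ppm_axis = [i / 100 for i in range(1001)]
spectrum = [
    math.exp(-((x - 1.2) / 0.6) ** 2) + 0.5 * math.exp(-((x - 3.8) / 0.7) ** 2)
    for x in ppm_axis
]

for name, fraction in relative_abundance(ppm_axis, spectrum).items():
    print(f"{name}: {fraction:.1%}")
```

Note that the window between 4.9 and 5.3 ppm is not assigned to any class in the scheme quoted above, so intensity in that window does not enter the fractions.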

Fig. 5

1D 1H nuclear magnetic resonance (NMR) spectrum (900 MHz, CD3OD) of commercially available Suwannee River dissolved organic matter [International Humic Substance Service (IHSS), catalog number 2R101N] isolated by reverse osmosis (Green et al. 2015) (unpublished data). The chemical shift (δ) scale (f1) is given in parts per million (ppm). Although it may not appear as complex as the 1D 1H NMR spectrum of an individual compound, molecular-level characterization is no longer possible because no distinct peaks are observed. Thus, the 1H NMR spectra of DOM are usually divided into regions of distinct chemical shifts, representing certain compound classes, e.g., aliphatics (0.0–1.9 ppm), acetate and carboxyl-rich alicyclic molecules (CRAM) (1.9–3.1 ppm), carbohydrates and methoxy groups (3.1–4.9 ppm), olefins (5.3–6.5 ppm) and aromatics (6.5–10.0 ppm). Classification according to Hertkorn et al. (2016)

In contrast, the 2D 1H,1H COSY NMR spectrum of a DOM sample (Fig. 6) appears much more complex than the COSY NMR spectrum of a single compound (Fig. 4). This is due to the introduction of a second dimension which is accompanied by a dramatic increase in the theoretical peak capacity by more than two orders of magnitude (Hertkorn et al. 2007). As expected from the high resolution of 2D NMR spectra, a large number of peaks do not overlap and show up as distinct peaks. The structure elucidation on an individual compound basis still remains very challenging but is not completely excluded in this case. However, due to their extreme molecular diversity, all kinds of NMR spectra from complex mixtures are frequently analyzed using statistical methods (cf. Multivariate statistics section).

Fig. 6

2D 1H,1H COSY (correlation spectroscopy) nuclear magnetic resonance spectrum (800 MHz, CD3OD) of dissolved organic matter isolated by solid-phase extraction from a North Equatorial Pacific intermediate water sample. Both chemical shift (δ) scales (f1 and f2) are given in parts per million (ppm). Due to the introduction of a second dimension and the associated increase in the peak capacity, numerous peaks are resolved and thus allow deeper structural insights compared to 1D nuclear magnetic resonance spectroscopy. Data have already been published by Seidel et al. (2022)

Overview of NMR spectroscopic applications for the analysis of dissolved organic matter

Tables 1, 2 and 3 provide an overview of existing NMR spectroscopic techniques that have been or could be applied for the structural characterization of DOM. A more detailed description of the techniques presented therein is given in the “Available NMR experiments” and in the “Current state and future perspectives in the analysis of DOM” sections. The vast majority of NMR spectroscopic studies on DOM published so far relied on 1D NMR spectroscopic techniques, although this type of NMR spectrum offers by far the smallest information content (cf. Fig. 7). Likely reasons are the typically small sample amounts and the relatively low sensitivity of NMR spectrometers. On commonly available low-field NMR spectrometers, this combination usually leads to impractically long measurement times for recording multidimensional NMR spectra of such samples.

Table 1 Brief overview of 1D NMR experiments applicable for the structural analysis of DOM. If not otherwise noted, studies applied liquid-state NMR on marine DOM. Exemplary studies employing solid-state NMR are marked with a cross ( +) and exemplary studies on freshwater or artificial DOM with an asterisk (*) or a hash (#), respectively. If applicable, studies were always assigned to liquid-state NMR on marine DOM, even if the respective study did not focus on it
Table 2 Brief overview of multidimensional NMR experiments applicable for the structural analysis of DOM. Even if not explicitly mentioned in Table 1, in most multidimensional NMR spectroscopic studies also lower dimensional NMR spectroscopic experiments were performed. If not otherwise noted, all studies applied liquid-state NMR on marine DOM. Exemplary studies employing solid-state NMR are marked with a cross ( +) and exemplary studies on freshwater or artificial DOM with an asterisk (*) or a hash (#), respectively. If applicable, studies were always assigned to liquid-state NMR on marine DOM, even if the respective study did not focus on it
Table 3 Further NMR techniques and current trends in method development applicable for the structural analysis of DOM. If not otherwise noted, all studies applied liquid-state NMR on marine DOM. Exemplary studies on freshwater DOM are marked with an asterisk (*). If applicable, studies were always assigned to liquid-state NMR on marine DOM, even if the respective study did not focus on it
Fig. 7

Number of existing publications using nuclear magnetic resonance (NMR) spectroscopy for the analysis of marine dissolved organic matter (DOM), information content and measurement time of one- and multidimensional NMR spectroscopic experiments. Note that the information content and measurement time are scaled logarithmically. The number of publications was taken from Tables 1, 2 and 3, including also studies mentioned under hyphenated NMR and solvent suppression. Publications were only assigned to the highest dimensionality applied in the respective study. The information content was related to the theoretical peak capacity of the respective NMR spectroscopic method according to Hertkorn et al. (2007). Measurement times were estimated for a typical DOM sample consisting of a few milligrams of material as follows: 1D: 1 h, 2D: 24 h, 3D: 720 h. 2D NMR spectroscopic techniques constitute the best compromise for most DOM samples between measurement time and information content

NMR spectroscopic studies on dissolved organic matter

The application of NMR spectroscopy for the analysis of DOM has a long tradition. One major milestone was reached almost 50 years ago, when Stuermer and Payne (1976) investigated a marine fulvic acid sample that was obtained by extracting 2000 L of seawater with Amberlite XAD-2 resin. They used a 90 MHz NMR spectrometer (HFX-90, Bruker/Spectrospin, Germany) for recording 13C NMR spectra and even more remarkably two different 60 MHz NMR instruments that were operated in the continuous wave mode (a spectrometer design that was used before the much faster FT spectrometers became popular) for recording 1H NMR spectra. Since then, an enormous improvement of NMR spectrometers as well as an evolution of NMR spectroscopic techniques took place. This is demonstrated in Fig. 8 depicting selected NMR spectra of DOM that were recorded over ~ 40 years. Nowadays, numerous studies using NMR spectroscopy to analyze DOM, including the application of solution-state and solid-state NMR, have already been reported. The following section provides an exhaustive overview of studies on marine DOM that were conducted using solution-state NMR techniques and is divided into two subsections. The first subsection covers examples where 1D NMR experiments were applied, and the second subsection also includes those where multidimensional experiments were performed. In addition, in both subsections selected examples studying freshwater DOM and/or using solid-state NMR spectroscopy are presented. An almost comprehensive overview of NMR spectroscopic studies on marine and freshwater DOM is presented in Tables 1, 2 and 3 (“Overview of NMR spectroscopic applications for the analysis of DOM” section).

Fig. 8

Evolution of sensitivity and resolution of nuclear magnetic resonance (NMR) spectroscopic experiments performed for the analysis of marine dissolved organic matter (DOM) over almost 40 years. The chemical shift (δ) scales are given in parts per million (ppm). a 1D solution-state 13C NMR spectrum of seawater fulvic acids (Stuermer and Payne 1976), b 1D solid-state 13C NMR spectra of marine DOM at different depths (Benner et al. 1992), c 2D solution-state HSQC (heteronuclear single quantum coherence) NMR spectra of marine DOM (Hertkorn et al. 2013). Due to the increasing field strengths, resolution and sensitivity of NMR spectrometers, spectra of increasing quality could be recorded over the last decades, while the required sample amounts decreased drastically. Panel a was reprinted from Stuermer and Payne (1976) with permission from Elsevier. Panel b was reprinted from Benner et al. (1992) with permission from The American Association for the Advancement of Science. Panel c was reprinted from Hertkorn et al. (2013) with permission from Copernicus Publications. All figures were slightly altered to serve this work

Studies using 1D NMR spectroscopy

Probably one of the earliest examples of the application of NMR spectroscopy to DOM was provided by Stuermer and Harvey (1974). According to the integration of specific regions of the 1H NMR spectrum of fulvic acids isolated from the Sargasso Sea, the ratio of methylene protons (1.0–1.3 ppm), protons “adjacent to functional groups” (2.1–2.8 ppm) and aromatic protons (7.6–7.9 ppm) was 15:10:1. Apart from the solvent (D2O with methanol as internal standard), no further experimental details were mentioned and no NMR spectrum was displayed in this study. Regardless of the missing experimental details, this study clearly shows that concepts such as the integration of specific regions were already used in the first NMR studies on DOM to draw conclusions about the structural composition of the sample. Further studies in the following years focused more or less exclusively on the characterization of fulvic and humic acids, which are both operationally defined fractions of DOM.

Harvey et al. (1983) extracted fulvic and humic acids from near-surface seawaters (3–55 m depth) from different sampling locations within the Gulf of Mexico. The 1H NMR spectra of all extracts appeared relatively similar, with the exception that aromatic signals were detected in a noteworthy amount only for the humic acids. This is contrary to the finding of Stuermer and Harvey (1974), who detected aromatic protons in measurable amounts also for fulvic acids. However, this was likely due to the fact that the aromatic structures present in fulvic acids are highly substituted, and thus hydrogen-poor and poorly detectable with 1H NMR spectroscopy. A similar concern had already been reported by Hatcher et al. (1980), who calculated the percentage of aromatic structures in humic acids derived from marine sediments by 1H and 13C NMR spectroscopy, respectively. The integration of the 13C NMR spectra revealed up to 4.5 times higher estimates of aromatic moieties than the integration of the 1H NMR spectra. Taking into account the results obtained from other analytical techniques and evidence from the literature, Harvey et al. (1983) assumed a great structural similarity and a common pathway of formation for fulvic and humic acids.

Wilson et al. (1983) recorded both 1H and 13C NMR spectra of marine humic substances. From these spectra, it could be concluded that humic substances largely consist of branched alkyl structures and carbohydrates and contain only low amounts of aromatic structural motifs. This is consistent with the previously presented studies on humic and fulvic acids. In addition, the comparison of these spectra with those of algal exudates and intracellular algal material led to the finding that the spectra of algal exudates largely resemble those of the marine humic substances investigated in this study. Hence, algal exudates may largely contribute to the composition of DOM by serving as a precursor.

Solution-state 1H and solid-state 13C NMR spectroscopy were also applied by Malcolm (1990) to analyze soil, stream and marine humic substances. For the extraction of marine humic acids, more than 35 000 L of seawater were processed. However, with less than 100 mg, the yield of humic acids remained relatively low compared to the huge amount of processed water. It was stated that this sample was excluded from the analysis by 1H NMR spectroscopy because of “lack of sample for analysis,” which illustrates the enormous operational challenges related to DOM analysis by NMR spectroscopy at that time. Interestingly, a 13C NMR spectrum was recorded for this sample although 13C is by far the less receptive nucleus for NMR spectroscopic measurements. Unlike Harvey et al. (1983), Malcolm (1990) stated that compositional differences of humic substances exist not only across different environments (soil, stream and ocean) but also between humic and fulvic acids derived from the same environment.

In earlier studies on DOM, XAD resins were predominantly used for the isolation and concentration of DOM by SPE. During the late 1990s, the isolation and concentration of DOM by ultrafiltration became more popular, although the DOC recovery rates are relatively low at approximately 25% (Benner et al. 1997). Since ultrafiltration cuts off all compounds with a nominal mass of less than 1,000 Dalton, a completely different fraction of DOM is investigated when following this approach. This adds further pieces to the puzzle of the structural characterization of DOM.

UDOM samples from surface waters of different sample sites in the Atlantic and Pacific Oceans were analyzed with respect to their carbohydrate, acetate and lipid contents by Aluwihare et al. (1997) using 1H NMR spectroscopy. In addition, these samples were compared with samples of algal exudates that were subjected to time-dependent biodegradation. These decomposition experiments led to a pattern in 1H NMR spectra that could be related to acyl polysaccharides and was similar to that observed in natural UDOM samples. The presence of persistent acyl polysaccharides in DOM due to algal primary production and the potential role of aggregation as a sink for surface DOM were further investigated using 1H NMR spectroscopy in follow-up studies by Aluwihare and Repeta (1999) and Gogou and Repeta (2010), respectively.

UDOM was also investigated by Repeta et al. (2002). This study focused on aqueous and sedimentary marine DOM, but one surface seawater sample from the North Pacific was included as reference. The 13C NMR spectrum of this sample was compared with an alpine creek sample from another study. The ratio of certain structural motifs was essentially identical for both samples. In addition, the 1H NMR spectroscopic data from freshwater, sedimentary marine and aqueous marine UDOM were relatively similar. All investigated samples exhibited characteristics for acyl polysaccharides, indicating that this component is ubiquitous for UDOM regardless of its origin.

Quan and Repeta (2007) used both solution-state and solid-state NMR to investigate the periodate overoxidation of UDOM. Benzoic acid was used as an internal standard to quantify carbohydrates and lipids by integration of the solution-state 1H NMR spectrum. After ten days of periodate oxidation, only a minor amount of these compounds remained intact. Rather, the majority was oxidized to formic acid, methanol and acetic acid as detected by 1H NMR spectroscopy. Based on these experiments, evidence for the presence of methyl and 6-deoxy sugars was obtained.

An excellent example for the application of heteronuclear NMR experiments is the investigation of nitrogen-containing compounds in UDOM by Aluwihare et al. (2005). Solution-state 1H NMR spectroscopy was combined with solid-state 15N NMR spectroscopy to analyze the effect of mild and strong acid hydrolysis on UDOM. The quantitative analysis of the 15N NMR spectra revealed that a surface UDOM sample of Woods Hole (5 m depth) consisted of 43% N-acetyl aminopolysaccharides, 21% hydrolyzable proteins, 29% non-hydrolyzable amides and 8% amines. In contrast, a Middle Atlantic Bight deep-sea sample (1000 m depth) consisted of only 17% N-acetyl aminopolysaccharides and 12% hydrolyzable proteins but contained 71% non-hydrolyzable amides.

In contrast to the aforementioned studies, in which XAD resins or ultrafiltration were used for the isolation and concentration of DOM, SPE cartridges are nowadays used more frequently; these are usually packed with a styrene–divinylbenzene polymer. Apart from fulvic and humic acids, several other polar to non-polar compounds are isolated using this approach. One key publication in this context is that of Dittmar et al. (2008), in which the efficiency of different SPE cartridges for the isolation and concentration of DOM was compared. In that study, 1H NMR spectroscopy revealed that the molecular composition of DOM extracts from North Brazilian seawater obtained with either a styrene–divinylbenzene polymer or C18 as the sorbent was relatively similar.

An alternative approach for the isolation and concentration of DOM that was also developed during the late 2000s is RO/ED. The spectroscopic and chemical properties of DOM extracted by RO/ED from Atlantic Ocean waters were extensively studied by Koprivnjak et al. (2009), using, among other analytical techniques, solution-state 1H NMR spectroscopy and solid-state 13C NMR spectroscopy. Surprisingly, the NMR spectroscopic properties of RO/ED-extracted DOM differ from those of DOM extracted by ultrafiltration or SPE. The proportion of alkyl carbons was greater in RO/ED samples, whereas the proportion of carbohydrates was lower compared to samples extracted by ultrafiltration but higher compared to samples extracted by SPE. This points to the fact that each of these three isolation and concentration techniques only enables the investigation of a limited, operationally defined subset of a given DOM sample. Thus, a deeper molecular-level understanding of DOM might be obtained if samples were more frequently processed using all three isolation techniques and the results subsequently compared.

Lam and Simpson (2008) applied 1H NMR spectroscopy to unprocessed DOM without preconcentration, using an SPR-W5-WATERGATE pulse sequence for the solvent suppression. This might be valuable to get a rough compositional overview of DOM samples in a high-throughput manner. It should be noted that the water suppression sequence affects sample signals near the solvent signal. Thus, the affected regions must often be excluded from quantitative analysis. In addition, it is currently not possible to acquire multidimensional NMR spectra of unprocessed DOM because impractically long measurement times would be required to acquire spectra of sufficient quality. Zheng and Price (2012) applied WATERGATE solvent suppression in combination with PFGSE experiments to unprocessed pond-, river- and seawater to determine the average diffusion coefficients of three major DOM constituents, namely carbohydrates, CRAM and aliphatic compounds. From this, the average hydrodynamic radii of these compound classes were calculated.
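
How a hydrodynamic radius follows from a diffusion coefficient can be sketched with the Stokes–Einstein relation for a sphere, r = kB·T / (6·π·η·D). The text does not state which relation Zheng and Price (2012) used, so this choice is an assumption, as are the viscosity of water at 298.15 K and the diffusion coefficient in the example, which is a hypothetical value and not a result from that study.

```python
import math

BOLTZMANN_J_PER_K = 1.380649e-23  # exact SI value of the Boltzmann constant
WATER_VISCOSITY_PA_S = 0.89e-3    # assumed viscosity of water at 298.15 K


def hydrodynamic_radius(diffusion_m2_per_s,
                        temperature_k=298.15,
                        viscosity_pa_s=WATER_VISCOSITY_PA_S):
    """Stokes-Einstein radius in metres of a sphere with the given
    translational diffusion coefficient."""
    return (BOLTZMANN_J_PER_K * temperature_k
            / (6 * math.pi * viscosity_pa_s * diffusion_m2_per_s))


# Hypothetical diffusion coefficient, for illustration only.
example_d = 3.0e-10  # m^2/s
radius_nm = hydrodynamic_radius(example_d) * 1e9
print(f"D = {example_d:.1e} m^2/s -> r = {radius_nm:.2f} nm")  # about 0.82 nm
```

Because the radius is inversely proportional to the diffusion coefficient, slower-diffusing compound classes correspond to larger average hydrodynamic radii.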

While essentially all NMR spectroscopic studies investigating dissolved organic nitrogen used solid-state NMR spectroscopy, a few solution-state NMR studies on dissolved organic phosphorus (DOP) have already been reported (Nanny and Minear 1994a, b, 1997; Bell et al. 2017, 2020). Nanny and Minear (1994a, b, 1997) investigated lake freshwater samples in all of their studies. They identified phosphomono- and diesters (Nanny and Minear 1994a). Furthermore, they reported that DNA is likely to be present in the samples. In a second study published in the same year, they synthesized several lanthanide shift reagents and screened them for their application in 31P NMR spectroscopy (Nanny and Minear 1994b). Among the examined shift reagents, praseodymium ethylenediaminetetraacetate resolved overlapping peaks in 31P NMR spectra and in addition enhanced the sensitivity of certain signals. It enabled the differentiation between phosphomonoesters, phosphodiesters and phosphotriesters (orthophosphates) that were detected next to phosphonates. Tri- and tetrapolyphosphates were identified in a third study (Nanny and Minear 1997).

Recently, Bell et al. (2017, 2020) characterized DOP using solution-state 31P NMR spectroscopy. In the first study, they assessed the potential of RO/ED for the isolation of DOP (Bell et al. 2017). Here, 31P NMR spectroscopy played a crucial role in ensuring that the samples were not biased by the extraction method. In a follow-up study, they investigated seasonal and tidal changes of DOP extracted from the water of a salt-marsh estuary located in South Carolina with the RO/ED technique described in their previous publication (Bell et al. 2020). Notably, the recovery rates of DOP in this study reached 90% ± 13%. By integration of the 31P NMR spectra, they concluded that DOP mainly comprises phosphomonoesters (~ 61% of P detected by NMR) and phosphodiesters (~ 31%). The remaining 8% were distributed between pyrophosphates (~ 4%), phosphonates (~ 2%) as well as di-/triphosphate nucleotides and polyphosphates (~ 1% each). Seasonal changes were observed in fall, when the concentration of monoesters decreased while those of phosphonates, diesters and di-/triphosphate nucleotides increased. Tidal changes in the composition of DOP were negligible.

Selected examples using 2D and 3D NMR spectroscopy

Panagiotopoulos et al. (2007) fractionated UDOM by cation exchange chromatography (Ag+ and Pb2+) after acid hydrolysis and analyzed the fractions using 1D and 2D (1H,1H COSY and 1H,13C HSQC) NMR spectroscopic techniques. They succeeded in identifying several sugars (2- and 3-O-methylrhamnose, 2-O-methylfucose, 3-O-methylglucose) based on the analysis of the NMR spectroscopic data. Key correlations used for the identification of α-2-O-methylfucose are exemplarily shown in Fig. 9. In addition, 3- and 4-deoxy sugars were detected. However, their complete molecular structures remained unsolved.

Fig. 9

a 2D 1H,1H COSY (correlation spectroscopy) nuclear magnetic resonance (NMR) spectrum and b 2D 1H,13C HSQC (heteronuclear single quantum coherence) NMR spectrum of a fraction (F3C1) obtained after acid hydrolysis and cation exchange chromatography of an ultrafiltered dissolved organic matter sample and structure of α-2-O-methylfucose as structurally elucidated from the spectra (Panagiotopoulos et al. 2007). The figure was slightly altered and reprinted from Panagiotopoulos et al. (2007) with permission from Elsevier

Woods et al. (2011a) applied hydrophilic interaction liquid chromatography (HILIC) to separate a Suwannee River DOM sample into 80 fractions to overcome the analytical challenges related to the complexity of DOM. Each of these fractions was subsequently screened by 1D 1H NMR spectroscopy to select fractions for further analysis. Selected fractions were then subjected to 2D NMR spectroscopic experiments, and the structures of various, rather simple molecules were assigned by comparing the acquired NMR spectra with those of a database. Moreover, the 1D 1H NMR data set was combined with fluorescence data processed by a statistical PARAFAC (parallel factor analysis) approach. This combined dataset was further statistically analyzed using principal component analysis (PCA). In this way, DOM polarity fractions were statistically related to functional moieties, such as carbohydrates or aromatics. One year later, Woods et al. (2012) applied 2D HILIC/HILIC separation to collect 126 fractions of a Suwannee River DOM sample and used 2D and 3D NMR experiments (including inter alia HSQC-TOCSY and DOSY-TOCSY) to investigate selected fractions. The authors succeeded in identifying oxidized sterols as well as hopanoid-type structures as major components of Suwannee River DOM. Two other examples of the application of 3D NMR spectroscopy to the analysis of natural organic matter include the investigation of soil organic matter with DOSY-TOCSY (Simpson 2002) and HMQC-TOCSY (Simpson et al. 2003) experiments. In both studies, signals in the NMR spectra were assigned to lignin structures. In addition, in the latter study these signals were also found in the 1D and 2D NMR spectra of additional environmental samples, such as a marine sediment and an oak forest soil (Simpson et al. 2003).

Few 2D NMR spectroscopic data of bulk, non-fractionated DOM (apart from fractionating effects due to the isolation method) have been reported. Different NMR experiments, including also 2D techniques such as 1H,13C HSQC and 1H,13C DEPT-HSQC, were used by Hertkorn et al. (2006) to characterize a major constituent of refractory UDOM as CRAM. For an estimation of the CRAM content in DOM samples, binning of 13C NMR data was combined with the application of a mixing model, resulting in an estimated minimum CRAM content of 23% for surface and 51% for deep-sea UDOM.

Lam et al. (2007) demonstrated that CRAM is also a major constituent of freshwater DOM in Lake Ontario using 1H,13C HMQC and 1H,13C HMBC NMR spectroscopy. Moreover, a second fraction was structurally assigned as material derived from linear terpenoids. In addition to that, some NMR spectroscopic signals were assigned to heteropolysaccharides and aromatics. However, these compounds only represented a minor constituent of Lake Ontario DOM.

Two-dimensional NMR spectroscopic experiments (HMQC and TOCSY) were also applied by Kaiser et al. (2003) to investigate structural differences of DOM from the Tagliamento River (Italy) extracted either by SPE or by ultrafiltration. While the ultrafiltered extracts contained inter alia peptides/proteins and aliphatic structural motifs, the SPE extracts contained primary aliphatic esters, ethers and compounds bearing hydroxy groups. This finding points again to the fact that different isolation techniques may complement each other for the molecular-level characterization of DOM (cf. “Studies using 1D NMR spectroscopy” section).

Arakawa et al. (2017) assigned at least 4% of marine SPE-DOC to carotenoid degradation products based on gas chromatography/mass spectrometry analysis. NMR spectroscopic experiments combined with the simulation of NMR spectra confirmed this structural assignment and further indicated that even more than 4% of DOM might consist of carotenoid degradation products.

In contrast to most other NMR spectroscopic studies on DOM using 2D NMR techniques, Hertkorn et al. (2013) used a high-field instrument operating at 800 MHz. Their work represents the most comprehensive NMR spectroscopic study on marine DOM to date. A large variety of NMR spectroscopic experiments (including 1H,1H JRES, 1H,1H COSY, 1H,1H TOCSY, 1H,13C DEPT-HSQC, 1H,13C HSQC-TOCSY, 1H,13C HMBC and 1H,13C HSQC) was used for the characterization of four SPE-DOM samples taken at different water depths in the ocean (5, 48, 200 and 5446 m). The total NMR instrument time required for this sample series was longer than three months. The NMR spectroscopic data were further complemented and compared with those obtained from FT-ICR-MS measurements. Many NMR spectroscopic signals were assigned to CRAM. The 13C NMR spectrum showed a nearly Gaussian-shaped signal at ~ 174.3 ppm, indicating signal overlap of structurally diverse carboxylic acids. According to the NMR spectroscopic results, sp2-hybridized carbon atoms are present, but less common in marine DOM. A major source of sp2-hybridized carbon in DOM might be thermogenic organic carbon (also termed black carbon), which stems from biomass burning on land or hydrothermal heating of marine sediments (Dittmar and Koch 2006; Dittmar and Paeng 2009). To evaluate the impact of substitution, steric congestion and condensation on the 1H and 13C NMR chemical shifts for this fraction of DOM, Hertkorn et al. proposed a highly condensed structural model compound with certain characteristics (cf. Fig. 10) and simulated its HSQC and COSY spectra.

Fig. 10

Model compound to elucidate putative nuclear magnetic resonance (NMR) characteristics of thermogenic organic carbon. The conceptual model includes protons of fjord regions (red) and bay regions (green), protons in unconstrained regions (blue), protons attached to highly condensed regions (pink) as well as protons attached to aromatic rings decorated with heteroatoms (orange). HSQC (heteronuclear single quantum coherence) and COSY (correlation spectroscopy) NMR spectra were simulated for this compound to depict the influence of the different chemical environments on the NMR spectroscopic properties (i.e., on the chemical shifts of the expected correlations). Correlations similar to those of the prediction were also observed in the experimental spectra of dissolved organic matter. Figure and highlighting of structural features according to Hertkorn et al. (2013)

A further study combining NMR and FT-ICR-MS was recently published by Seidel et al. (2022). The authors performed 1D 1H and 2D 1H,1H COSY NMR spectroscopy on a set consisting of five SPE-DOM samples from the surface and deep of the central Atlantic and Pacific oceans. Compared to previous studies using 2D NMR spectroscopic techniques (e.g., Hertkorn et al. 2013), Seidel et al. (2022) have successfully reduced the experimental times to two days per sample while using unprecedentedly small quantities of SPE-DOM (~ 1 mg). However, only sensitive NMR experiments involving 1H were recorded but no 2D heteronuclear correlation experiments due to a very limited access to the high-field NMR instruments. A second unique feature of this study is that a statistical approach [principal coordinates analysis (PCoA)] was used for dimension reduction of NMR and FT-ICR-MS data, and canonical correlation analysis was used to link both datasets. Two subsets of compounds with specific characteristics regarding their molecular formulas and chemical shifts were identified (cf. Fig. 11, subsets are characterized by either blue or red color of the signal dots). In particular, the subset highlighted in blue includes compounds with aromatic/olefinic and/or aliphatic features (determined according to their characteristic chemical shift regions) covering a broad range of molecular formulas (H/C ratios ~ 0.5–1.5 and O/C ratios ~ 0.1–0.9). In contrast, the second subset (red) mainly consists of compounds with oxygen-containing functional groups within a relatively narrow range of O/C ratios between 0.4 and 0.7 and H/C ratios greater than 1.0.

Fig. 11

Flowchart depicting the key steps of marine dissolved organic matter (DOM) analysis as carried out by Seidel et al. (2022). 1H,1H COSY (correlation spectroscopy) nuclear magnetic resonance and Fourier-transform ion cyclotron resonance mass spectrometry datasets consisting of five DOM samples taken at different locations and depths were acquired and analyzed by principal coordinates analysis. Both data sets were then linked using canonical correlation analysis. As represented by signals in red and blue, two distinct subsets of compounds were identified. Molecular formulas highlighted in a specific color in the van-Krevelen diagram (bottom right) correspond to signal plots highlighted in the same color in the COSY NMR spectrum (top right) and vice versa. Reprinted (adapted) with permission from Seidel et al. (2022). Copyright 2022 American Chemical Society

In summary, the development of advanced NMR spectroscopic techniques gradually led to a more complete picture of the composition of DOM on a structural level. The first studies investigated operationally defined DOM fractions, namely humic and fulvic acids. The extraction efficiencies improved with the implementation of advanced extraction and concentration methods (SPE, ultrafiltration, RO/ED). However, these methods also only permit the investigation of operationally defined subsets of DOM. While the first studies employed 1D NMR spectroscopy and related only three spectral regions to distinct structural features (Stuermer and Harvey 1974), recent studies usually employ multidimensional NMR techniques that are frequently combined with other analytical techniques, such as the physical separation of compounds by chromatographic methods, FT-ICR-MS and fluorescence spectroscopy, or with statistical approaches. This allows either the characterization of minor compound groups of DOM, such as carotenoid degradation products (Arakawa et al. 2017), or even the de novo elucidation of distinct molecular structures of more dominant compounds, such as sugars as the monomers of acylated polysaccharides in UDOM (Panagiotopoulos et al. 2007).

Current state and future perspectives

Over the past decades, major technical and methodological advances in NMR spectroscopy have made it possible to measure multidimensional NMR spectra of mass-limited samples more time-efficiently. Thus, the acquisition of larger DOM data sets is now possible, which is needed for advanced chemometric approaches. In addition, the development of specific chemical derivatization strategies for the selective detection of various compound classes or functional groups might allow for further molecular insights into DOM. Basic concepts and considerations on these topics are summarized in the following section.

Technical improvements

Ultra-high magnetic fields

The low sensitivity of NMR compared to other analytical techniques such as FT-ICR-MS is a major bottleneck for the use of NMR spectroscopy in the analysis of DOM. Lately, ultra-high field 1.2 GHz (28.2 Tesla) NMR spectrometers have become commercially available (Schwalbe 2017). As the SNR in NMR is proportional to \(B_{0}^{3/2}\), it increases by a factor of 2.8 when doubling the magnetic field, e.g., from a 600 MHz instrument to a 1.2 GHz instrument. This may not seem to be a big improvement, but the SNR on a given instrument is proportional to the square root of the number of scans (which means that four times more scans must be recorded to double the SNR). In other words: to obtain the same SNR on a 600 MHz instrument as on a 1.2 GHz instrument, one must record the spectra for eight times as long (2.8² ≈ 8). A second, maybe even more important improvement when using high-field instruments is the increase in resolution of n-dimensional NMR spectra, which is proportional to \(B_{0}^{n}\) (Schwalbe 2017). Consequently, a 3D NMR spectrum of the same sample is recorded with eight times higher resolution on a 1.2 GHz instrument than on a 600 MHz instrument.
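The scaling arguments in this paragraph can be checked with a few lines of arithmetic. The following Python sketch only encodes the proportionalities stated in the text (SNR ∝ B0^(3/2), SNR ∝ square root of the number of scans, resolution ∝ B0^n); the function names are illustrative and all other influences on sensitivity (probe, sample, relaxation) are deliberately ignored.

```python
def snr_gain(field_low_mhz, field_high_mhz):
    """Relative SNR gain between two fields, assuming SNR is proportional to B0^(3/2)."""
    return (field_high_mhz / field_low_mhz) ** 1.5


def time_factor(snr_ratio):
    """Factor by which the number of scans (and thus the measurement time)
    must grow to raise the SNR by snr_ratio, assuming SNR ~ sqrt(scans)."""
    return snr_ratio ** 2


def resolution_gain(field_low_mhz, field_high_mhz, n_dims):
    """Relative resolution gain of an n-dimensional spectrum, assuming ~ B0^n."""
    return (field_high_mhz / field_low_mhz) ** n_dims


gain = snr_gain(600, 1200)
print(round(gain, 2))                 # 2.83
print(round(time_factor(gain), 1))    # 8.0
print(resolution_gain(600, 1200, 3))  # 8.0
```

The same two functions can be reused for other field pairs, e.g., to estimate how much longer an experiment would take on an 800 MHz instrument than on a 1.2 GHz instrument under these idealized assumptions.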

Small diameter cryoprobes and NMR tubes

To overcome the relatively low sensitivity of NMR spectroscopy, large sample amounts are used to obtain reasonable NMR spectra. In a classical 5 mm diameter NMR tube, typically more than 1 mg of a single compound is dissolved in 0.5 to 0.6 mL of solvent for the acquisition of a 1D 1H NMR spectrum (recall that 1H is the most receptive nucleus in NMR spectroscopy). In the case of 1D 13C NMR spectra, more than 1 mg for each carbon atom in the respective compound is usually needed to obtain high-quality spectra in an acceptable measurement time. In classical organic chemistry, the rule of thumb even says that 10 mg per carbon atom should be used to record 13C spectra. Recent developments of helium-cooled cryoprobes and micro(cryo)probes have greatly improved the NMR sensitivity for mass-limited samples. The key aspects of micro- and cryoprobes have been excellently described before (Anklin 2015) and are briefly summarized below (cf. also Table 4). In general, NMR probes with inner diameters of 10 mm, 5 mm, 3 mm, 1.7 mm and 1 mm are commercially available. They can either be operated at room temperature (RT, thus referred to as RT probes), or the probe coil and preamplifier are cooled with a stream of cold helium gas to 20 K (coil) and 77 K (preamplifier), respectively. Due to the low temperatures, the technical realization of the latter is referred to as a cryoprobe. Whereas the application of a 3 mm RT probe only leads to an improvement of mass sensitivity of 40%, a 1 mm RT probe offers an improvement of 300%. A 5 mm cryoprobe already shows a mass sensitivity improved by a factor of five compared to a conventional 5 mm RT probe. The mass sensitivity of a 1.7 mm microcryoprobe is more than 15 times higher than that of a 5 mm RT probe. As a welcome side effect, less sample volume is needed for the microprobes (Table 4), which is especially useful in situations where the sample amount is limited.

Table 4 Typical features of micro- and cryoprobes used in NMR spectroscopy (Anklin 2015)

Multidimensional NMR spectroscopy

As already outlined in Fig. 7 (“Overview of NMR spectroscopic applications for the analysis of DOM” section), multidimensional NMR spectra provide by far higher peak capacity and as such higher potential information content. However, so far only a few studies exist in which multidimensional NMR spectroscopic techniques were applied for the analysis of DOM. This is probably due to the fact that NMR spectroscopy is an intrinsically insensitive spectroscopic technique. Furthermore, in case of marine DOM usually only small sample amounts can be sampled since only a few milligrams of DOM are present in one liter of water, and water cannot be recovered from the deep-sea in high volumes. These circumstances can now be tackled by using high-field NMR instruments that are equipped with small-volume cryoprobes offering a mass sensitivity improved by much more than one order of magnitude compared to conventional 5 mm RT probes (cf. “Small diameter cryoprobes and NMR tubes“ section). Beyond that, all multidimensional NMR spectroscopic studies on DOM that were published so far are limited to 1H and 13C as the observed nuclei. Thus, the application of multidimensional NMR spectroscopic techniques for analyzing biologically relevant nitrogen and phosphorus containing compounds (using the NMR active nuclei 15N and 31P, respectively) offers a huge and yet unexplored potential to gain further insights into the molecular composition of DOM.

Data acquisition

In recent years, great progress has been made in the development of methods for time-saving pulse sequences for multidimensional NMR experiments. These strategies aim either at recording fewer data points while maintaining or even improving the resolution or at combining multiple experiments into one pulse sequence. The first strategy is referred to as non-uniform sampling (NUS), and the latter is termed as NMR by ordered acquisition using 1H detection (NOAH), frequently also designated as NMR supersequences or nested sequences.

Non-uniform sampling

NUS is an acquisition method for multidimensional NMR experiments. Simple 1D NMR experiments consist of only one RF-pulse, followed by an acquisition period. In contrast, 2D NMR experiments are more complex and comprise several pulses, an evolution time t1, a mixing time and a detection time t2 (cf. 2D NMR experiments section). In simple 2D NMR experiments, two consecutive RF-pulses are applied that are separated by the evolution time t1. The FID is acquired within the detection time t2 and its FFT results in a conventional standard 1D NMR spectrum. By incrementing the evolution time t1, a series of FIDs is recorded. The 2D NMR spectrum is constructed by applying two consecutive FFTs: Initially, the FFT along t2 (direct dimension, resulting in a series of 1D NMR spectra) followed by a FFT along t1 (indirect dimension, resulting in a 2D NMR spectrum). When the t1 increments are equidistant in time, this acquisition method is termed as uniform sampling.

The spectral width is determined by the increment step size ∆t1, and the resolution is determined by the largest increment t1,max. Thus, the measurement time of an NMR experiment can be drastically reduced without loss of resolution by reducing the number of increments while keeping the largest evolution time t1,max constant. In non-uniform sampling, the reduction in data points follows specific schemes, which are referred to as sampling schedules. In addition, the time points that are missing with respect to the uniformly sampled acquisition must be reconstructed. However, the choice of the sampling schedule and of the reconstruction method seems to be critical only in some cases. In fact, it has already been demonstrated that the outcome of NUS for the acquisition of 2D NMR data of a small molecule rarely depends on the chosen sampling schedule and reconstruction method, and that NUS can reduce the measurement time by a factor of eight without losses in spectral quality (Delaglio et al. 2017). A further advantage of NUS is the possibility to simultaneously improve the SNR and the resolution, two parameters that cannot be improved simultaneously in uniform sampling under all circumstances (Palmer et al. 2015). In addition, NUS neither influences quantification nor produces spectral artifacts (Delaglio et al. 2017), if carefully implemented. Care should be taken when choosing the NUS percentage, as a number of sampled points that is low compared to the number of signals in the spectrum degrades the spectral quality. In such cases, NUS reconstruction often fails to reproduce the weak signals in the spectrum.
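To make the idea of a sampling schedule concrete, the following Python sketch draws a subset of t1 increments from a uniform grid while always retaining the first and the last increment, so that t1,max (and thus the nominal resolution) is unchanged. This is a deliberately simplified illustration: the uniform random selection, the function name and the parameter values are assumptions made here, and the schedules and reconstruction algorithms used in practice are more elaborate and are not shown.

```python
import random


def nus_schedule(n_full, fraction, seed=0):
    """Return sorted indices of the t1 increments to be recorded.

    n_full   -- number of increments of the uniformly sampled experiment
    fraction -- fraction of increments to keep (e.g., 0.125 for 12.5%)
    The first and last increments are always kept so that t1,max is preserved.
    Uniform random selection is an assumption of this sketch.
    """
    n_keep = max(2, round(n_full * fraction))
    rng = random.Random(seed)  # fixed seed for a reproducible schedule
    inner = rng.sample(range(1, n_full - 1), n_keep - 2)
    return sorted([0, n_full - 1] + inner)


# Hypothetical example: 256 uniform increments reduced to 12.5%,
# corresponding to an eightfold shorter experiment.
schedule = nus_schedule(n_full=256, fraction=0.125)
print(len(schedule))              # 32
print(schedule[0], schedule[-1])  # 0 255
```

Because the increment step size and the largest increment are unchanged, the spectral width and the nominal resolution of the indirect dimension remain the same; only the number of recorded FIDs drops from 256 to 32.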

NMR supersequences

The most time-consuming step in an NMR pulse sequence is usually the recovery delay. The recovery delay (or relaxation delay) is needed to allow the spins to return to their Boltzmann equilibrium state before the next RF-pulse is applied. In NMR supersequences, different NMR experiments that all detect 1H are combined into one pulse sequence that possesses only one recovery delay (Fig. 12) (Kupče and Claridge 2017, 2018; Kupče et al. 2021). To avoid disturbance between the experiments, the sequence starts with the most insensitive module and phase cycling and refocusing gradients are applied. Whereas phase cycling is a technique to correct for artifacts in the NMR spectrum resulting from the quadrature detection, refocusing gradients ensure the coherence of spins. Despite the possibility of hundreds of combinations of experiments in supersequences, the order of the modules used in the sequence greatly influences the sensitivity of the spectra. NMR supersequences suffer from transverse relaxation (T2)-related sensitivity losses. The supersequence depicted in Fig. 12 starts with a 1H,15N HMQC experiment preserving the macroscopic magnetization along the z-axis of all protons that are not directly attached to 15N.

Fig. 12

Combination of five conventional, 1H detecting nuclear magnetic resonance experiments into one time-saving supersequence that contains only one recovery delay. The sequence starts with a 1H,15N HMQC (heteronuclear multiple quantum coherence) experiment as the most insensitive of all applied experiments, followed by more sensitive experiments, i.e., 1H,13C HSQC (heteronuclear single quantum coherence), 1H,13C HMBC (heteronuclear multiple bond correlation), 1H,1H COSY (correlation spectroscopy) and 1H,1H NOESY (nuclear Overhauser effect spectroscopy) experiments. The order of experiments from the least sensitive to the most sensitive is crucial to avoid disturbance between experiments. Figure adapted from Kupče and Claridge (2017)

Both NUS and supersequences have already been implemented in commercial software packages such as TopSpin (Bruker, Germany) and are therefore in principle available to most users. However, despite their numerous advantages, supersequences and NUS have not been used for the NMR spectroscopic analysis of DOM so far but may improve its analysis in the future.

Data analysis

Since DOM is one of the most complex mixtures, it is in general not possible to analyze the NMR spectra of unfractionated samples on the level of individual compounds. Thus, NMR spectra of complex mixtures are frequently analyzed in an untargeted manner. So far, most NMR spectroscopic studies on DOM relied on the visual comparison of NMR spectra across samples or on basic data analysis steps such as the integration over pre-defined chemical shift regions. The full information content of 2D NMR spectra of DOM samples has rarely been explored by established multivariate statistical tools (e.g., Seidel et al. 2022). Prior to statistical exploration, data must be pre- and postprocessed. Preprocessing steps include referencing, phase correction and baseline correction. Since these steps are commonly applied in NMR data processing, they are not further covered in this review. The three most common steps applied in postprocessing are binning, alignment and normalization of NMR spectra.

Postprocessing

The chemical shift of a specific atom of a given compound is sensitive to the matrix of the sample and to fluctuations in temperature. Thus, pH, ionic strength, concentration and temperature should be kept as stable as possible to maximize comparability among samples. Remaining fluctuations of chemical shifts must be considered when comparing samples. Several binning and alignment procedures of NMR spectra have been developed for this purpose.

Spectral alignment ensures that signals arising from a certain nucleus of a given compound appear at the same chemical shift throughout a series of spectra, by iteratively shifting the peaks so that they will match. Several alignment techniques have been developed over the last decades, including fuzzy warping (Wu et al. 2006), the use of a genetic algorithm (Forshed et al. 2003) and a previously proposed interval correlation shifting (Savorani et al. 2010). A recent review with a deeper discussion on peak alignment is provided by Vu and Laukens (2013).

Binning, sometimes also called bucketing, divides the spectrum into small regions with defined chemical shift widths, which are referred to as bin or bucket sizes. The bin size may be fixed (equally distributed over the whole spectrum, e.g., covering a range of 0.05 ppm for 1D 1H NMR) or variable. In the latter case, bin sizes may be defined by an algorithm or set manually, e.g., with respect to specific functional groups and thus using a priori knowledge. In addition to equidistant binning, other binning techniques exist, including Gaussian binning (using overlapping bins) (Anderson et al. 2008), adaptive-intelligent binning (using variable bin sizes) (De Meyer et al. 2008) and an optimized bucketing algorithm (using local minima of the NMR spectrum to define bin sizes) (Sousa et al. 2013). Gaussian binning still uses equidistant bins, but the intensity in each bin is weighted according to a Gaussian function that is centered in the middle of each bin and overlaps with adjacent bins. In the case of equidistant binning methods, the intensity of a given peak might be split between two adjacent bins, which might significantly affect further data analysis. Adaptive-intelligent binning, optimized bucketing and other advanced binning algorithms try to overcome this problem by essentially using local minima of the spectra as bin edges and thus covering only complete peaks within bins. However, the performance of a certain binning method strongly depends on the dataset, on the further analysis to be conducted and, in the case of the advanced binning methods, also on user-chosen parameters.
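Equidistant binning, the simplest of the procedures described above, can be sketched in a few lines of Python. The function name, the argument names and the toy data are illustrative assumptions; the sketch simply sums all intensities that fall into each fixed-width chemical shift interval and does not implement any of the advanced (Gaussian, adaptive-intelligent or optimized) schemes.

```python
import math


def bin_spectrum(ppm, intensity, ppm_min, bin_width, n_bins):
    """Sum intensities into n_bins equidistant bins starting at ppm_min.

    Data points outside the binned range are ignored.
    """
    bins = [0.0] * n_bins
    for shift, value in zip(ppm, intensity):
        index = math.floor((shift - ppm_min) / bin_width)
        if 0 <= index < n_bins:
            bins[index] += value
    return bins


# Toy spectrum with four data points and 0.05 ppm wide bins.
ppm_axis = [0.01, 0.02, 0.06, 0.12]
signal = [1.0, 2.0, 3.0, 4.0]
print(bin_spectrum(ppm_axis, signal, ppm_min=0.0, bin_width=0.05, n_bins=3))
# [3.0, 3.0, 4.0]
```

A peak whose data points straddle a bin edge would contribute to two neighboring entries of the returned list, which is exactly the peak-splitting effect that the advanced binning algorithms aim to avoid.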

Despite all the effort that has gone into the development of new alignment and binning algorithms, their application can also lead to worse results in the subsequent data analysis than simply using the raw or equidistantly binned data. This fact has been excellently summarized by Savorani et al. (2010) who stated “It is important to bear in mind that spectral alignment also can be a destructive process as it can remove useful physical information related to the signal shifts in the spectra.”

For quantitative comparison, NMR data are usually normalized. Normalization should compensate for the effect of variable concentrations across different samples as well as for the influence of fluctuating instrumental parameters (such as receiver gain) and spectral parameters (such as number of scans). Normalization can be applied with respect to the NMR signals of a specific compound whose concentration remains constant across different samples [e.g., creatine is often used for the normalization of NMR spectra of urine (Emwas et al. 2018)], to the integral of a specified subset of signals or to the integral of all signals of the NMR spectrum. Especially in the field of metabolomics, normalization to the integral of all signals is often not appropriate, and many other normalization techniques were developed to overcome several issues (Craig et al. 2006; Kohl et al. 2012; Giraudeau et al. 2014; Emwas et al. 2018; Zacharias et al. 2018). We refer to Emwas et al. (2018) for a profound discussion on further NMR pre- and postprocessing steps for 1D 1H NMR data. Most of the cited research and review papers belong to the field of metabolomics, where NMR spectroscopy is used much more frequently than in the environmental sciences. The key approaches from metabolomics are also applicable to the analysis of other complex mixtures such as DOM, without a direct metabolic context.
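Two of the normalization variants mentioned above, normalization to the integral of all signals and normalization to a reference signal, can be illustrated as follows. The sketch operates on already binned spectra; the function names and the toy values are assumptions made for illustration, and the more advanced normalization techniques from the metabolomics literature are not covered.

```python
def normalize_total(bins):
    """Scale a binned spectrum so that the sum of all bins equals 1."""
    total = sum(bins)
    if total == 0:
        raise ValueError("spectrum has zero total intensity")
    return [value / total for value in bins]


def normalize_to_reference(bins, ref_index):
    """Scale a binned spectrum relative to the bin of a reference signal."""
    reference = bins[ref_index]
    if reference == 0:
        raise ValueError("reference bin has zero intensity")
    return [value / reference for value in bins]


# Two toy spectra of the same composition recorded at different total intensity
# (e.g., different concentration or number of scans) become identical.
print(normalize_total([1.0, 3.0]))  # [0.25, 0.75]
print(normalize_total([2.0, 6.0]))  # [0.25, 0.75]
print(normalize_to_reference([2.0, 4.0, 8.0], ref_index=0))  # [1.0, 2.0, 4.0]
```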

Multivariate statistics

Without chromatographic separation or derivatization, the NMR spectroscopic data of most complex mixtures can only be analyzed and compared with other NMR data either by subjective visual inspection or, more objectively, by statistical methods. Since NMR data are represented by multiple variables, i.e., discrete chemical shifts or bins, multivariate statistics are needed. Presumably, the three most frequently applied multivariate statistical methods in the analysis of NMR data are principal component analysis (PCA), hierarchical cluster analysis (HCA) and k-means cluster analysis.

Principal component and principal coordinates analysis

In the case of DOM, each NMR spectrum of a sample taken along a gradient, e.g., along an ocean transect, represents one observation and the intensity of every single bin in the binned NMR spectra is a variable. To represent a dataset with n variables in a conventional way, one would need an n-dimensional space. PCA reduces dimensions by combining correlating variables into a new variable. The reduced dimensions are termed principal components (PCs) and are the eigenvectors of linear combinations of the variables in the n-dimensional space that were fitted to the data using, e.g., the least squares method. The first dimension (PC1) explains most of the variance in the data. The second dimension (PC2) is perpendicular to PC1 and explains the second largest share of the variance, and so on. The number of PCs is equal to the smaller of the number of variables and the number of observations, but in practice only a few significant PCs (usually fewer than four) exist. The proportion of variance that is represented by each PC is determined by its eigenvalue. All observations are then projected onto the chosen number of PCs, with the new coordinate values termed scores. One frequently used representation of the results of a PCA is a score plot, which represents the scores of the individual observations either in a 2D plot (usually using the two PCs that explain most of the variance) or in a 3D plot (when three PCs are used). A second quantity often used in PCA is the loading, which indicates how much a variable contributes to a given PC. The two- or three-dimensional representation of the loadings (referred to as a loading plot) therefore contains the variables, with the axes representing the contribution to the respective PCs. It is important to note that PCA only captures linear trends in multivariate data and normal distribution of data is a prerequisite. To overcome these limitations, principal coordinates analysis (PCoA), which is also designated as classical multidimensional scaling, Torgerson or Torgerson–Gower scaling, might be used; it is conducted on arbitrary distance, similarity, or dissimilarity matrices of the data. When the Euclidean distance is used for the PCoA, it will give the same result as the PCA of the same dataset. The mathematical background and further information on PCA and PCoA are described in detail in one of numerous excellent reviews (e.g., Wold et al. 1987) or textbooks (e.g., Vandeginst et al. 1998; Cox and Cox 2008; Esbensen and Geladi 2009; Geladi and Linderholm 2020).
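The relationship between loadings, eigenvalues and scores can be illustrated with a minimal Python sketch that extracts only the first principal component. It is an assumption of this sketch that the leading eigenvector of the covariance matrix is found by simple power iteration, which keeps the example free of external libraries; statistical software typically uses more robust decompositions, and all names and toy data below are hypothetical.

```python
import math


def first_pc(data, n_iter=200):
    """Return (loadings, explained_variance_fraction, scores) of PC1.

    data -- list of observations (e.g., binned spectra), each a list of variables.
    Power iteration is used for brevity; it assumes a non-constant dataset and
    a start vector that is not orthogonal to PC1.
    """
    n, p = len(data), len(data[0])
    means = [sum(row[j] for row in data) / n for j in range(p)]
    centered = [[row[j] - means[j] for j in range(p)] for row in data]
    # Sample covariance matrix of the variables.
    cov = [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(p)]
           for i in range(p)]
    v = [1.0] * p
    for _ in range(n_iter):
        w = [sum(cov[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Eigenvalue of PC1 (Rayleigh quotient) relative to the total variance.
    eigenvalue = sum(v[i] * sum(cov[i][j] * v[j] for j in range(p))
                     for i in range(p))
    total_variance = sum(cov[i][i] for i in range(p))
    scores = [sum(r[j] * v[j] for j in range(p)) for r in centered]
    return v, eigenvalue / total_variance, scores


# Toy data: four observations, two perfectly correlated variables.
loadings, explained, scores = first_pc([[0, 0], [1, 2], [2, 4], [3, 6]])
print([round(x, 3) for x in loadings])  # [0.447, 0.894]
print(round(explained, 3))              # 1.0
```

In this toy case, a single PC captures the entire variance because both variables change proportionally; the loadings show that the second variable contributes twice as much to PC1 as the first.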

Hierarchical cluster analysis

Another common tool in multivariate statistics is HCA, which orders samples, and the variables that describe them, based on their similarity. The similarity between the sets of variables of the different samples is determined by using an arbitrary distance or similarity measure, such as the Euclidean, Manhattan or Minkowski distance. The choice of the best distance measure strongly depends on the dataset and the research question. Clusters can be constructed either ‘bottom-up’ (also termed agglomerative, which means that clusters are constructed by combining single elements) or ‘top-down’ (also termed divisive, which means that the clusters are constructed by dividing the entirety of elements into smaller groups). The results of hierarchical cluster analysis are often depicted in heatmaps, where the columns represent the different samples and the rows the variables. Frequently, the heatmaps are accompanied by a dendrogram that highlights both the order in which the clusters were constructed and the similarity of the clusters to each other. Further reading on HCA is provided in Everitt et al. (2011) and Köhn and Hubert (2015).
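The ‘bottom-up’ construction of clusters can be sketched as follows. The example uses the Euclidean distance and single linkage (the distance between two clusters is the smallest distance between any two of their members); the linkage criterion is an assumption of this sketch, since several alternatives exist, and the function names and toy data are illustrative only.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two observations."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def agglomerate(points, n_clusters, distance=euclidean):
    """Agglomerative clustering with single linkage.

    Starts with one cluster per observation and repeatedly merges the two
    closest clusters until n_clusters remain. Returns the clusters (lists of
    observation indices) and the merge history, which corresponds to the
    information displayed in a dendrogram.
    """
    clusters = [[i] for i in range(len(points))]
    merges = []
    while len(clusters) > n_clusters:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = min(distance(points[i], points[j])
                        for i in clusters[a] for j in clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        d, a, b = best
        merges.append((sorted(clusters[a]), sorted(clusters[b]), d))
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return [sorted(c) for c in clusters], merges


# Toy data: four observations described by a single variable.
clusters, merges = agglomerate([[0.0], [0.1], [5.0], [5.2]], n_clusters=2)
print(clusters)  # [[0, 1], [2, 3]]
```

Running the procedure down to a single cluster yields the complete merge history; cutting this history at a chosen distance is what allows the number of clusters to be decided after the analysis.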

k-means cluster analysis

k-means clustering divides the data into k clusters, where k must be selected manually. In contrast, in HCA one can define the number of clusters after the analysis by inspecting the dendrogram. k-means clustering is less computationally intensive than HCA and therefore well suited for large data sets. It is an iterative process which starts with k clusters that are randomly initialized by placing k centroids in the data space and assigning each data point to the nearest centroid. Afterward, the centroids are relocated to the mean of each cluster and all data points are assigned to the new centroids. This process (assigning data to the nearest centroid, shifting each centroid to the mean of its cluster) is continued until the cluster members no longer change. Because the results strongly depend on the randomly chosen initial centroids, the whole process is performed several times, each time starting with a new random choice of centroids. The quality of each clustering result is rated by the sum of squared distances of each point to the centroid of the cluster to which the point belongs. For obvious reasons, the sum of squared distances becomes zero when the number of clusters is equal to the number of data points. Since the value for k must be selected manually, a priori knowledge of the data is of advantage. Alternatively, the optimal number of clusters can be determined using different approaches (Kodinariya and Makwana 2013) such as the elbow method (Thorndike 1953). We refer to Everitt et al. (2011) and Ashabi et al. (2020) for general background reading.
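The iterative procedure described above (assignment to the nearest centroid, relocation of the centroids, repetition with several random starts and rating by the sum of squared distances) can be written as a compact Python sketch. Choosing the initial centroids from the data points themselves, the number of restarts and all names are assumptions made for illustration.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, n_restarts=10, max_iter=100, seed=0):
    """Return (sum_of_squared_distances, labels, centroids) of the best run."""
    rng = random.Random(seed)
    best = None
    for _ in range(n_restarts):
        # Random start: k data points serve as initial centroids (assumption).
        centroids = [list(p) for p in rng.sample(points, k)]
        labels = None
        for _ in range(max_iter):
            new_labels = [min(range(k), key=lambda c: sq_dist(p, centroids[c]))
                          for p in points]
            if new_labels == labels:  # cluster members no longer change
                break
            labels = new_labels
            for c in range(k):
                members = [p for p, lab in zip(points, labels) if lab == c]
                if members:  # empty clusters keep their previous centroid
                    centroids[c] = [sum(col) / len(members)
                                    for col in zip(*members)]
        sse = sum(sq_dist(p, centroids[lab]) for p, lab in zip(points, labels))
        if best is None or sse < best[0]:
            best = (sse, labels, centroids)
    return best


# Toy data: two well-separated groups of three observations each.
data = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0],
        [10.0, 10.0], [10.0, 11.0], [11.0, 10.0]]
sse, labels, centroids = kmeans(data, k=2)
print(round(sse, 3))  # 2.667
```

Repeating the call for increasing values of k and plotting the returned sum of squared distances against k gives the curve that is inspected in the elbow method.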

Application in DOM studies

For the statistical analysis of NMR data, a relatively large set of samples is needed. Because DOM samples are usually available only in small amounts and NMR spectroscopy is relatively insensitive (and thus demands long measurement times for samples of low concentration), it was hardly feasible to acquire such data sets in the past. Consequently, statistically analyzed NMR data of DOM are rare. However, one unprecedented example including multivariate statistical analysis has recently been published by Seidel et al. (2022) (cf. “Selected examples using 2D and 3D NMR spectroscopy” section). By applying two-dimensional heterocorrelation analysis, Abdulla et al. (2010b, 2013) provided two additional examples for the statistical analysis of NMR data derived from DOM. In the first study, two-dimensional heterocorrelation analysis was applied to link FTIR and NMR data (Abdulla et al. 2010b). In a follow-up study, specific chemical shift regions from the same 13C NMR data were linked to specific molecular formulas obtained by FT-ICR-MS, again by using correlation analysis (Abdulla et al. 2013). In principle, this approach allows the assignment of structural features (such as carboxyl groups) to specific molecular formulas obtained from FT-ICR-MS data. Hertkorn et al. (2016) applied HCA and PCA to assess the similarity among six samples derived from three subtropical wetlands. For each sampling site, one sample was representative of long hydroperiods and one of short hydroperiods. Based on the multivariate analysis of the NMR data, the highest similarity was found between samples from the same site, regardless of the hydroperiod. However, differences were also detected between the hydroperiods. A fifth outstanding example was provided by Thomsen et al. (2002), who used PCA and HCA to analyze the 1D 13C NMR data of eight different DOM samples. In contrast to the results obtained by Hertkorn et al. (2016), the clustering observed in PCA differed in principle from that observed in HCA. It was concluded that PCA is in general suitable for grouping DOM into pre-defined groups (i.e., fulvic acids, humic acids and humic substances) according to their NMR properties, provided that a higher number of samples is used, which points again to the need for a relatively large sample set. Remarkably, the NMR instrument used in this study was operated at 250 MHz for 1H (corresponding to a 13C operating frequency of approximately 62.9 MHz) and typically 70 000 scans were acquired per sample. These experiments would be much more efficient if performed on currently available high-field NMR instruments equipped with microcryoprobes. This might be of significant benefit for future applications of multivariate statistics to the analysis of NMR data acquired on DOM samples. The application of high-field NMR, micro- and cryoprobes as well as NUS will enable the acquisition of larger data sets in future studies and will thus enhance our understanding of the structural diversity of DOM.
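The efficiency argument can be made tangible with a short calculation. The sketch below relies on the general rule that the signal-to-noise ratio grows with the square root of the number of scans, so that a set-up which is more sensitive by a given factor needs fewer scans by the square of that factor to reach the same signal-to-noise ratio. The gain factors are illustrative assumptions, not measured values for any specific instrument, and the function name is hypothetical.

```python
import math


def scans_for_same_snr(reference_scans, sensitivity_gain):
    """Scans needed to match the reference signal-to-noise ratio.

    Assumes S/N is proportional to sqrt(number of scans), so a sensitivity
    gain of g reduces the required number of scans by a factor of g**2.
    """
    return math.ceil(reference_scans / sensitivity_gain ** 2)


reference_scans = 70_000  # scans per sample quoted above for the 250 MHz study
for gain in (2, 4, 15):   # illustrative sensitivity gains (assumptions)
    print(gain, scans_for_same_snr(reference_scans, gain))
```

Under these assumptions, and for the same sample amount, even a moderate gain in sensitivity shortens the acquisition considerably, because the saving scales quadratically with the gain.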

Molecular-level analysis of dissolved organic matter using isotopically labeled reagents

The targeted derivatization of certain functional groups using isotopically labeled reagents, followed by NMR spectroscopic analysis, holds considerable potential for the structural characterization of DOM. Isotopic labeling can make the sample more receptive to a specific NMR spectroscopic measurement by introducing NMR active nuclei, e.g., by incorporating 13C, which has a natural abundance of only 1.1%. If isotopes with a low natural abundance, such as 13C or 2H, are used for the derivatization, it is straightforward to differentiate between the sample signals before and after derivatization. Therefore, certain functional groups can be ‘marked’ by using reagents or reaction conditions that specifically derivatize these groups. Preferably, this should be feasible without side reactions and with almost complete conversion of the targeted compounds.

Derivatization with isotopically labeled reagents has been applied in the analysis of DOM, but not yet in combination with NMR spectroscopy. Probably the most impressive example so far is the reduction of DOM with sodium borodeuteride (NaBD4) (Baluha et al. 2013). NaBD4 specifically converts ketones and aldehydes into the corresponding deuterated alcohols (cf. Scheme 1).

Scheme 1

Derivatization strategies for the isotopic labeling of dissolved organic matter. Sodium borodeuteride (NaBD4) allows for the specific reduction of (a) ketones or (b) aldehydes to the corresponding monodeuterated (a) secondary or (b) primary alcohols, respectively. The monodeuterated alcohols can be selectively investigated by 2H nuclear magnetic resonance (NMR) spectroscopy. Potentially, the differentiation between secondary and primary alcohols by 2H NMR spectroscopy might also be possible

By mass spectrometric comparison of untreated and reduced (either with sodium borohydride or sodium borodeuteride) DOM samples, ~30% of the mass peaks of DOM were related to species containing one or two ketone/aldehyde functionalities. This was possible due to the specific mass differences of the reduced samples. Further insights into the deuterated samples could be obtained by NMR spectroscopy. Well-established 2H NMR spectroscopy might be used to selectively detect the reduced species. In addition, primary alcohols (the reduction products of aldehydes) and secondary alcohols (the reduction products of ketones) might be distinguishable by NMR spectroscopy due to their chemical shifts. Notably, mass spectrometry does not allow for this differentiation. In the following paragraphs, we propose several other derivatization strategies that might be used to expand or complement this approach. The proposed reactions are fundamental, and we refer to current organic chemistry textbooks for further background reading (e.g., Organic Chemistry by Clayden et al. 2012). An obvious expansion might be the reduction with lithium aluminum deuteride (LiAlD4, cf. Scheme 2). Like its non-deuterated counterpart, lithium aluminum hydride, LiAlD4 reduces both ketones/aldehydes and carboxylic acids/esters/amides. By first reducing a sample with NaBD4 and subsequently with LiAlD4, it might be possible to distinguish between the three groups: aldehydes, ketones and esters/carboxylic acids/amides. Since amides are reduced to amines and esters/carboxylic acids to alcohols, it might also be possible to distinguish between them due to their NMR spectroscopic characteristics. When applying this approach, one must be aware of several potential side reactions of LiAlD4, such as the reduction of nitriles or nitro groups. However, these functional groups are less common in DOM and the side reactions might therefore be of minor relevance.
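The mass differences exploited in the borohydride/borodeuteride comparison described above can be estimated with a short calculation. The sketch assumes that each reduced carbonyl group takes up one hydride (H or D) from the reagent and one proton (H) from a non-deuterated protic solvent or work-up, and it uses rounded tabulated monoisotopic masses. The function name is illustrative and not taken from Baluha et al. (2013).

```python
# Monoisotopic masses in Da (standard tabulated values, rounded)
MASS_H = 1.007825
MASS_D = 2.014102


def reduction_mass_shift(n_carbonyls, deuterated):
    """Mass gained when n ketone/aldehyde groups are reduced to alcohols.

    Assumption: per carbonyl, one H or D comes from the reagent (NaBH4 or
    NaBD4) and one H from the protic solvent or work-up.
    """
    from_reagent = MASS_D if deuterated else MASS_H
    return n_carbonyls * (from_reagent + MASS_H)


for n in (1, 2):
    shift_h = reduction_mass_shift(n, deuterated=False)
    shift_d = reduction_mass_shift(n, deuterated=True)
    print(n, round(shift_h, 4), round(shift_d, 4), round(shift_d - shift_h, 4))
```

Under these assumptions, the spacing between the borohydride- and borodeuteride-reduced peaks of the same species scales with the number of reduced carbonyl groups, which is what allows that number to be read from the mass spectra.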

Scheme 2

Derivatization strategies proposed for the isotopic labeling of dissolved organic matter. Reduction of (a) carboxylic acids (R' = H) or esters (R' ≠ H) and (b) amides with lithium aluminum deuteride (LiAlD4) to the corresponding dideuterated (a) primary alcohols or (b) amines, respectively. The deuterated products can be selectively investigated by 2H nuclear magnetic resonance spectroscopy

A second possible application of deuterium labeling might be the utilization of deuterium gas in hydrogenation reactions [Scheme 3a]. Depending on the particular reaction conditions, basically all naturally occurring multiple bonds can be deuterated and selectively investigated using 2H NMR techniques.

Scheme 3

Derivatization strategies proposed for the isotopic labeling of dissolved organic matter. (a) Hydrogenation of multiple bonds with deuterium gas, (b) alkylation of sulfur containing compounds with deuterated ethyl iodide (in a similar manner also deuterated methyl iodide or 13C labeled ethyl/methyl iodide could be used) (Wang and Schrader 2015), (c) acetylation of alcohols with 13C labeled acetic anhydride, (d) acetylation of primary (R''' = H) or secondary amines (R''' ≠ H) with 13C labeled acetic anhydride and (e) esterification of carboxylic acids with 13C labeled methanol. Whereas the deuterated products can be selectively investigated by 2H nuclear magnetic resonance (NMR) spectroscopy, the investigation of the 13C labeled products is possible by 13C NMR spectroscopy

A third application might be the derivatization of sulfur-containing species by selective alkylation with deuterated methyl iodide or ethyl iodide, as previously reported for the mass spectrometric analysis of crude oil by Wang and Schrader (2015) [Scheme 3b]. In a similar manner, 13C labeled methyl/ethyl iodide might be used, which would allow this class of compounds to be investigated selectively by 13C NMR spectroscopy.

Further derivatization strategies for 13C labeling might include the acetylation of alcohols and amines with 13C labeled acetic anhydride (note that 13C labeled derivatives of acetic anhydride are available with 13C incorporation at different positions) [Scheme 3c + d] or the esterification of carboxylic acids with 13C labeled methanol [Scheme 3e]. However, care must be taken since many potential side reactions might occur. In addition, acetic anhydride will always react with both amines and alcohols, making the selective investigation of these functional groups with this derivatizing agent impossible. When using 2H labeled acetic anhydride or MeOH-d3/MeOD-d4, both techniques could also be adapted to deuterium labeling. 13C labeled compounds are in general more expensive than their deuterated counterparts, which might make them less attractive. However, 13C is a more receptive nucleus than 2H, making NMR spectroscopic measurements more feasible. In addition, quadrupolar nuclei, such as 2H (nuclear spin quantum number I = 1), yield broader signals, which is usually a decisive disadvantage when analyzing NMR spectra.

A completely different strategy might be the derivatization with fluorine-containing reagents. This would be of great advantage because the natural abundance of NMR active 19F is 100% and the receptivity of 19F is nearly as high as that of 1H, which makes it an ideal nucleus for NMR spectroscopic measurements. Furthermore, fluorine-containing compounds rarely occur in DOM. Many NMR probes can measure 19F, which makes 19F NMR spectroscopy easily implementable at most NMR facilities.
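The receptivity arguments made for 2H, 13C and 19F can be checked with a short calculation. The sketch uses the common definition of receptivity as proportional to γ³ · I(I + 1) · isotopic abundance and expresses it relative to 1H at natural abundance. The gyromagnetic ratios and natural abundances are approximate literature values entered here as assumptions, and the function name is illustrative.

```python
# Approximate literature values (assumptions for this sketch):
# gyromagnetic ratio in 10^6 rad s^-1 T^-1, nuclear spin I, natural abundance
NUCLEI = {
    "1H": (267.522, 0.5, 0.99989),
    "2H": (41.066, 1.0, 0.000115),
    "13C": (67.283, 0.5, 0.0107),
    "19F": (251.815, 0.5, 1.0),
}


def receptivity(nucleus, abundance=None, reference="1H"):
    """Receptivity relative to the reference nucleus at natural abundance.

    Proportional to gamma**3 * I * (I + 1) * abundance. Pass `abundance`
    (a fraction between 0 and 1) to model an isotopically enriched label.
    """
    def raw(name, fraction):
        gamma, spin, natural = NUCLEI[name]
        used = natural if fraction is None else fraction
        return gamma ** 3 * spin * (spin + 1) * used

    return raw(nucleus, abundance) / raw(reference, None)


for name in ("2H", "13C", "19F"):
    print(name, "natural:", f"{receptivity(name):.3g}",
          "fully enriched:", f"{receptivity(name, abundance=1.0):.3g}")
```

With these values, a fully enriched 13C label is more receptive than a fully enriched 2H label, and 19F at natural abundance reaches roughly four fifths of the receptivity of 1H. Line broadening due to the quadrupolar 2H nucleus is not captured by this simple measure.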

Common fluorine-containing reagents such as trifluoroethanol (note that trifluoromethanol is an unstable gas), trifluoroacetic anhydride or 2,2,2-trifluoroethyl iodide (note that trifluoromethyl iodide is a gas) can be used for the selective esterification, acetylation or alkylation, respectively, as mentioned above for the 2H and 13C labeling strategies (cf. Scheme 3).

In addition, exchangeable protons might become selectively observable using 2H NMR spectroscopy by exchanging them with deuterium (e.g., three cycles of adding and evaporating a protic deuterated solvent, such as D2O or MeOD-d4).

Another, entirely different approach is the NMR spectroscopic analysis of artificial, isotopically labeled DOM that is produced by microbes grown on isotopically labeled substrates, such as 13C labeled sodium bicarbonate (NaH13CO3) or 13C labeled glucose. A recent study investigated the diel-dependent metabolite excretion of a model community consisting of a marine diatom and a marine bacterium by NMR spectroscopy and mRNA analysis (Uchimiya et al. 2022). In that study, NaH13CO3 was used as the inorganic carbon source for the diatoms to label the produced metabolites and to make the NMR analysis of the diatoms’ endometabolome feasible. In another study, 13C and 15N labeled high-molecular weight dissolved organic nitrogen was derived from cyanobacteria by culturing them with NaH13CO3 and 15N labeled sodium nitrate (Na15NO3) and investigated by NMR spectroscopy. This led to the molecular-level characterization of previously less elucidated constituents of high-molecular weight dissolved organic nitrogen and thus to a more diverse view on this material (Cao et al. 2017). This approach might be further extended to produce artificial and isotopically labeled DOM for its molecular-level analysis.

Conclusion

NMR spectroscopy is a powerful tool for the molecular-level characterization of DOM. Key structural features of DOM have been identified by NMR spectroscopy. However, the use of multidimensional NMR spectroscopy in this analytical field is still at an early stage and its potential is far from exhausted. A major obstacle that must be overcome is that the usually small sample amounts meet the inherently low sensitivity of NMR spectroscopy, which results in extremely long measurement times for recording certain spectra in high quality.

Technical improvements in the magnetic field strengths of the instruments as well as the development of micro- and helium-cooled cryoprobes have made the use of multidimensional NMR spectroscopic techniques for the study of DOM more feasible. Today, instruments with field strengths of up to 28.2 T (corresponding to a 1H resonance frequency of 1.2 GHz) are commercially available and 1.7 mm microcryoprobes offer a mass sensitivity improved by a factor of 15 compared to conventional 5 mm RT probes. These technical advancements can be further combined with new data acquisition strategies, such as non-uniform sampling (NUS). NUS reduces the measurement time severalfold (depending on the NUS sparsity) without compromising spectral resolution and sensitivity and thus pushes the limits of NMR spectroscopy in the analysis of DOM to a new level. By combining several NMR experiments into one supersequence, the required measurement time may be further reduced. Thus far, only two examples using high-field instruments and no examples of the application of NUS or supersequences for the analysis of marine DOM have been reported (cf. Table 3).

Due to the rather small NMR data sets that have been acquired on DOM so far, the application of multivariate statistical methods has remained the exception. With the ongoing technical improvements, larger data sets are becoming increasingly available and multivariate statistical methods might become the standard in the analysis and comparison of DOM data sets rather than the exception. First steps have already been taken in this direction by combining state-of-the-art NMR equipment with a chemometrics approach for the characterization of marine DOM (Seidel et al. 2022).

Since DOM is too complex for the isolation of most of its individual constituents, further progress in its molecular-level characterization by NMR spectroscopy might be achieved by derivatization strategies such as isotopic labeling of specific functional groups. This might include isotopic labeling with NMR active isotopes of low natural abundance, such as 2H and 13C, but also isotopes that are less common in DOM samples, such as 19F.

To summarize, the combination of technical and methodical improvements with new approaches in sample derivatization and the application of multivariate statistical approaches offers an almost unexplored playground for the molecular-level analysis of DOM.