1 Introduction

Streams, rivers, and lakes receive, store, and transport upward of 5.1 Pg C yr−1 from the terrestrial environment and outgas 3.9 Pg C yr−1 as CO2 (Drake et al. 2018), and they are important hotspots for the transformation of terrestrial organic matter (Wilson and Xenopoulos 2009; Xenopoulos et al. 2021). Dissolved organic matter (DOM) constitutes a significant fraction of reactive organic carbon in aquatic ecosystems flowing through terrestrial landscapes (Battin et al. 2009; Kowalczuk et al. 2005). Microbially mediated metabolism of DOM contributes markedly to carbon cycling and the energy flow of river and stream ecosystems. The active microbial community is closely associated with bio-labile aliphatic and protein-rich DOM fractions (Chen et al. 2022; Drake et al. 2019; Osterholz et al. 2016; Zhou et al. 2021). Changes in the molecular composition of DOM affect the environment of aquatic ecosystems by altering its chemical reactivity and bio-availability, the underwater light climate, especially the penetration of underwater UV radiation, food web dynamics, and the coupled biogeochemical cycles (Tranvik and Bertilsson 2008; Weyhenmeyer et al. 2015). Biologically reactive carbon and dissolved inorganic carbon released during microbial metabolism of DOM in streams serve as an important carbon source and influence the uptake, outgassing, and biogeochemical processes of carbon (Battin et al. 2009).

The chemical composition of DOM in streams and rivers is closely related to catchment attributes, including urban and agricultural land use, wetland coverage, and soil type (Drake et al. 2019; Spencer et al. 2019). Runoff over residential landscapes, together with urban point-source inputs from wastewater treatment plants and pipeline failure, may contribute substantial amounts of protein-rich and bio-labile DOM to receiving streams and rivers (Zhou et al. 2022). However, non-point source DOM inputs from impervious areas including roads, parking lots, and residential areas in urbanized landscapes are poorly constrained (Hosen et al. 2014). More than 40% of the global non-glaciated land surface is covered by cropland or pasture, and deforestation and intensified agricultural land use can enhance the mobilization of microbial DOM and the export of nutrients to fluvial ecosystems (Drake et al. 2020; Spencer et al. 2019; Vaughn et al. 2021). Microbial derived DOM increases with greater agricultural land use, and forest- and wetland-dominated landscapes are enriched in aromatic and bio-stable fractions of DOM (Hosen et al. 2014; Wilson and Xenopoulos 2009). To date, changes in the molecular composition and bio-lability of DOM in fluvial ecosystems along urbanization and agricultural land use gradients have not been well elucidated despite the scale of these impacts on Earth.

In this study, we aimed to unravel how the chemical composition and bio-lability of DOM in streams and rivers may change over a large gradient of human impacts in the catchment (urban and agricultural land use and population density). We collected samples from 76 medium-sized stream or river catchments (from 1 to 4850 km2) in south-eastern China along a large gradient in urban and agricultural land use within the same physiographical and hydrogeological unit. Optical measurements coupled with ultrahigh-resolution mass spectrometry (Fourier transform ion cyclotron resonance mass spectrometry, FT-ICR MS) were used to trace the processing and molecular composition of DOM in streams. Laboratory bio-incubations were used to quantify the bio-lability of DOM. We hypothesized that urbanization and intensified agricultural land use would shift the DOM pool toward increasing relative heteroatom content and that the DOM thus would be less aromatic and more aliphatic in nature, i.e. a shift from a more allochthonous to a more autochthonous signature (Coble et al. 2022). Concurrently with this shift in DOM composition, we hypothesized an increase in bio-lability at the anthropogenically impacted sites in line with previous studies (Drake et al. 2019; Vaughn et al. 2021; Wilson and Xenopoulos 2009).

2 Materials and methods

2.1 Study area

The selected sampling rivers were located in the river catchments of Yangtze and Qiantang, southeastern China, in the Anhui, Jiangsu, and Zhejiang provinces (Fig. S1). The studied streams and rivers have similar topography and a humid subtropical climate, with an average annual temperature of 17 °C and annual mean precipitation of approximately 1500 mm, the rainy season primarily being concentrated in May–September. The two river sites in the Yangtze River in this study were included to create a large gradient in urban and agricultural land use as these two sites were located in the metropolitan area of Nanjing and served as disturbed catchment endmembers (urban + agricultural land use = 87.2%; Fig. S1). All the samples were collected from independent streams or rivers to avoid the influence of DOM quality in the upstream sites on the downstream-linked sites, and the area of the sampled catchments ranged from 1 to 4850 km2 (Fig. S1).

Water samples were collected from 76 streams and rivers with large gradients in urban and agricultural land use in December 2019 to avoid high discharge events. A 30 m resolution land use data set was downloaded from http://www.globallandcover.com/, and 1 km resolution population density data for the stream catchments were downloaded from the Resource and Environmental Science Data Center, Chinese Academy of Sciences at https://www.resdc.cn/. The land use classifications used were urban, agricultural, forest, grassland, waterbody, and bare soil. Urban comprises manmade constructions in large metropolitan areas with high levels of impervious surfaces. Agricultural lands include row crops and actively tilled land, while pristine lands represent forested areas. The catchment boundary for each sampled stream was delineated following the products of HydroSheds at https://www.hydrosheds.org/page/hydrobasins, and the contribution percentages of urban and agricultural land use were determined using ArcGIS 10.2.

2.2 Sample collection and processing

Water samples were collected from the center of each stream at 0–0.5 m depth and were kept on ice and in the dark while in the field. The samples were immediately transported to the laboratory after collection and first filtered through a pre-combusted (450 °C for 4 h) GF/F glass fiber filter (0.7 μm pore size) to determine DOC concentration (Spencer et al. 2014), ammonium (NH4+-N), nitrite (NO2-N), nitrate (NO3-N), and phosphate (PO43−-P) and then through pre-rinsed Millipore membrane cellulose filters (0.22 μm porosity) to eliminate potential scattering of particles, followed by DOM optical spectroscopy (Coble 2007). DOC concentration was measured at a high temperature (680 °C) using a TOC-V CPN analyzer (Shimadzu, Tokyo, Japan) after acidification of the samples with H3PO4. Total nitrogen (TN), total phosphorus (TP), total dissolved nitrogen (TDN), and total dissolved phosphorus (TDP) were digested at a high temperature (120 °C) for 40 min and then measured using Shimadzu UA-2550PC UV-Vis. The concentrations of NH4+-N, NO2-N, NO3-N and PO43−-P were determined on a flow injection analyzer (Skalar SAN++, Delft, the Netherlands).

2.3 DOM optical properties

DOM absorbance was measured on a Shimadzu UV-2550 UV-Vis spectrophotometer with matching 5-cm quartz cells at room temperature and Milli-Q water in the reference cell. DOM samples were diluted when the absorbance at 254 nm was higher than 0.3 before they were measured for DOM optical properties following the methods detailed elsewhere (Miller and McKnight 2010). The absorption coefficient of DOM at 254 nm, i.e. a254, was calculated by multiplying the measured absorbance at 254 nm by 2.303/r, where r is the path length in meters and the factor 2.303 converts between log10 and natural log (Coble 2007). Specific ultraviolet absorbance, i.e. SUVA254, increases with increasing aromaticity of DOM (Weishaar et al. 2003). The spectral slope of DOM absorption spectra, i.e. S275–295, decreases with an increasing terrestrial organic matter signal and was calculated using a nonlinear fitting over the spectral range 275–295 nm (Helms et al. 2008). DOM fluorescence excitation-emission matrices (EEMs) were measured using an F-7000 fluorescence spectrophotometer (Hitachi, Tokyo). The measured EEMs were first corrected for water Raman scattering by subtracting the Milli-Q blank EEMs, and Rayleigh scatter peaks were eliminated using the drEEM toolbox in MATLAB R2015b (Murphy et al. 2013). The inner-filter effect was corrected using absorbance at the corresponding excitation and emission wavelengths (Kothawala et al. 2013).
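The absorbance-derived quantities described above can be sketched in a few lines of Python. This is an illustration only, not the authors' processing code. It assumes that SUVA254 is the decadic absorbance per meter divided by DOC in mg L−1, and it estimates the spectral slope with a log-linear least-squares fit, which is a simplification of the nonlinear fitting reported in the text. All function names and sample values are hypothetical.

```python
import math

def absorption_coefficient(absorbance, path_length_m):
    """Napierian absorption coefficient (m^-1) from decadic absorbance."""
    return 2.303 * absorbance / path_length_m

def suva254(absorbance_254, path_length_m, doc_mg_per_l):
    """SUVA254 (L mg-C^-1 m^-1), assuming decadic absorbance per metre over DOC."""
    return (absorbance_254 / path_length_m) / doc_mg_per_l

def spectral_slope(wavelengths_nm, coefficients):
    """Slope S (nm^-1) of a(w) = a(w0) * exp(-S * (w - w0)).

    Uses ordinary least squares on ln(a); this is a stand-in for the
    nonlinear fit described in the text, not the same procedure.
    """
    logs = [math.log(a) for a in coefficients]
    n = len(wavelengths_nm)
    mean_x = sum(wavelengths_nm) / n
    mean_y = sum(logs) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(wavelengths_nm, logs))
    sxx = sum((x - mean_x) ** 2 for x in wavelengths_nm)
    return -sxy / sxx

# Hypothetical sample: absorbance 0.15 at 254 nm in a 5-cm (0.05 m) cell, DOC 1.6 mg/L
a254 = absorption_coefficient(0.15, 0.05)
suva = suva254(0.15, 0.05, 1.6)
print(round(a254, 2), round(suva, 2))
```

If a sample was diluted before measurement, the absorbance would need to be multiplied by the dilution factor before calling these functions.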

The biological freshness index (BIX) was calculated at an excitation wavelength of 310 nm as the ratio of the fluorescence intensity at an emission wavelength of 380 nm to that at 430 nm (Huguet et al. 2009). BIX reflects the relative contribution of recently produced autochthonous organic matter. The humification index (HIX) was evaluated as the ratio of integrated emission intensity from 435 nm to 480 nm to that from 300 nm to 345 nm, corresponding to an excitation wavelength of 254 nm (Zsolnay et al. 1999). HIX shows a strong positive relationship with the degree of DOM humification and is often used to indicate DOM source. In addition, the ratio of the integrated fluorescence of humic-like peak C to that of protein-like peak T, i.e. IC: IT, was used to trace the variability of the relative terrestrial humic-rich DOM input compared to protein-like DOM (Zhou et al. 2017).
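As a rough illustration of how the two fluorescence indices can be obtained from emission scans, the sketch below computes BIX and HIX from lists of emission wavelengths and intensities. It assumes the commonly used BIX emission pair of 380 nm and 430 nm at 310 nm excitation, linear interpolation between measured wavelengths, and trapezoidal integration on a 1-nm grid. The helper names are hypothetical and the spectra are synthetic.

```python
def interpolate(wavelengths, intensities, target):
    """Linearly interpolate the intensity at `target` (wavelengths ascending)."""
    for k in range(len(wavelengths) - 1):
        w0, w1 = wavelengths[k], wavelengths[k + 1]
        if w0 <= target <= w1:
            i0, i1 = intensities[k], intensities[k + 1]
            return i0 + (i1 - i0) * (target - w0) / (w1 - w0)
    raise ValueError("target wavelength outside measured range")

def band_area(wavelengths, intensities, lower, upper, step=1.0):
    """Trapezoidal integral of the emission spectrum between `lower` and `upper`."""
    n = int(round((upper - lower) / step))
    values = [interpolate(wavelengths, intensities, lower + k * step)
              for k in range(n + 1)]
    return sum((values[k] + values[k + 1]) / 2.0 * step for k in range(n))

def bix(em, intensity_ex310, em_num=380.0, em_den=430.0):
    """BIX: intensity ratio at two emission wavelengths for 310 nm excitation."""
    return (interpolate(em, intensity_ex310, em_num)
            / interpolate(em, intensity_ex310, em_den))

def hix(em, intensity_ex254):
    """HIX: area 435-480 nm divided by area 300-345 nm for 254 nm excitation."""
    return (band_area(em, intensity_ex254, 435.0, 480.0)
            / band_area(em, intensity_ex254, 300.0, 345.0))

# Synthetic emission scans on a 1-nm grid (values are illustrative only)
em = list(range(300, 501))
scan = [w - 200.0 for w in em]
print(bix(em, scan), hix(em, scan))
```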

2.4 Bio-incubation of DOC

Biodegradable DOC (BDOC) is defined as the fraction of DOC degraded by microorganisms during 28 days of bio-incubation (Abbott et al. 2014; Vonk et al. 2015). One hundred millilitres of filtrate passed through 0.7 μm filters were transferred to 120 mL acid-cleaned and pre-rinsed brown glass bottles, and each bottle was then inoculated with 2 mL of site-specific stream water filtered through 2.7 μm GF/D filters as bacterial inoculum. All bottles received an amendment of nutrients to diminish nutrient limitation, with ambient nitrogen increased by 80 μM NH4+-N and phosphorus by 10 μM PO43−-P during the incubation experiment (Abbott et al. 2014; Vonk et al. 2015). The bottles were gently shaken several times a day to ensure oxygenation, loosely capped to limit evaporation, and placed in darkness at room temperature. All water samples were re-filtered through 0.7 μm filters for DOC measurements after 28 days of bio-incubation to eliminate potential flocculation during the bio-incubation (Abbott et al. 2014). The water volume was determined on day 0 and day 28, and no notable evaporation was recorded. Percent BDOC (%BDOC) is defined as the percent loss of DOC during the 28 days of laboratory bio-incubation (Abbott et al. 2014; Vonk et al. 2015). Filtrates passed through 0.22 μm filters were analyzed for DOM optical properties following the aforementioned approaches.
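The %BDOC definition above amounts to a simple relative loss. A minimal sketch, with a hypothetical function name and made-up concentrations, is:

```python
def percent_bdoc(doc_day0, doc_day28):
    """Percent loss of DOC over the incubation, relative to the initial DOC."""
    if doc_day0 <= 0:
        raise ValueError("initial DOC must be positive")
    return 100.0 * (doc_day0 - doc_day28) / doc_day0

# Hypothetical sample: 2.0 mg/L on day 0 and 1.5 mg/L on day 28
print(percent_bdoc(2.0, 1.5))
```

Both concentrations must be in the same units. The sketch applies no volume correction, consistent with the statement that no notable evaporation was recorded.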

2.5 Parallel factor analysis (PARAFAC)

PARAFAC was performed using the drEEM toolbox in MATLAB R2015b. The EEM array consisted of 152 samples (pre- and post-28 days of laboratory bio-incubation) with 251 emission wavelengths and 45 excitation wavelengths and was split into six random ‘halves’ (three calibration/three validation ‘halves’) to validate the model. A five-component model was validated using split-half analysis (Bro 1997), random initialization, and examination of residual error plots (Murphy et al. 2013). The spectral shapes of the five components were compared with those identified earlier in other ecosystems using an online fluorescence spectra library called OpenFluor (Murphy et al. 2014).

2.6 FT-ICR MS measurements and data processing

A subset of eight stream or river water samples taken in catchments with anthropogenically disturbed land use (urban + agricultural land use), ranging from 3.5% (pristine) to 87.2% (highly disturbed), was analyzed to obtain a more detailed picture of the DOM molecular composition using FT-ICR MS. The eight samples were collected from streams covering large gradients in urban and agricultural land use and were used to explore the effect of point and nonpoint source inputs on DOM molecular composition in fluvial ecosystems. The filtrates passed through 0.22 μm filters were solid-phase extracted (SPE) using 3 mL PPL cartridges (Agilent) and analyzed by electrospray ionization (ESI) FT-ICR MS in negative-ion mode (Choi et al. 2019). Briefly, ~40 mL of filtrate (the volume was adjusted based on DOC concentration to allow extraction of a similar load of DOC, i.e. ~60 μg C) was acidified to pH = 2. The filtrates were slowly passed through the PPL cartridge at ~5 mL min−1, and subsequently 6 mL 0.01 M HCl was passed through the cartridges before they were dried with pure N2 gas and eluted using 1 mL of methanol. A Milli-Q water sample was solid-phase extracted to serve as a procedural blank. Molecular formulae were assigned to peaks with signal-to-noise >6σ RMS, and the mass error of the assigned formulae did not exceed 0.3 ppm after internal calibration (Fu et al. 2020).

The modified aromaticity index (AImod) was determined, and AImod increased with increasing aromaticity of DOM (Koch and Dittmar 2016; Koch and Dittmar 2006). Using O/C and H/C molar ratios exhibited in van Krevelen diagrams together with AImod, DOM was categorized into: 1) polycyclic condensed aromatics (AImod > 0.66), 2) polyphenolic compounds (0.5 < AImod ≤ 0.66), 3) highly unsaturated and phenolic compounds (AImod < 0.5 and H/C < 1.5), 4) aliphatics (1.5 ≤ H/C < 2 and N = 0), 5) peptide-like compounds (1.5 ≤ H/C < 2 and N > 0), and 6) sugar-like compounds (H/C ≥ 1.5 and O/C > 0.9) (Coward et al. 2019; Kellerman et al. 2019; Spencer et al. 2014).
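The classification rules above can be sketched as a small function. This is an illustration under several assumptions, not the authors' code. It uses the AImod expression of Koch and Dittmar (2006) in its corrected 2016 form, sets AImod to zero when the numerator or denominator is not positive, resolves overlapping criteria by checking the aromaticity classes first and the sugar-like class before the aliphatic and peptide-like classes, assigns AImod exactly equal to 0.5 to the highly unsaturated and phenolic class, and returns "unclassified" for formulae that match none of the listed rules (for example H/C ≥ 2 with O/C ≤ 0.9). Function names are hypothetical.

```python
def ai_mod(c, h, o, n=0, s=0, p=0):
    """Modified aromaticity index from element counts (assumed corrected form)."""
    numerator = 1 + c - 0.5 * o - s - 0.5 * (n + p + h)
    denominator = c - 0.5 * o - n - s - p
    if numerator <= 0 or denominator <= 0:
        return 0.0
    return numerator / denominator

def classify(c, h, o, n=0, s=0, p=0):
    """Assign a formula to one of the six compound classes listed in the text."""
    ai = ai_mod(c, h, o, n, s, p)
    hc = h / c
    oc = o / c
    if ai > 0.66:
        return "polycyclic condensed aromatics"
    if ai > 0.5:
        return "polyphenolic"
    if hc < 1.5:
        return "highly unsaturated and phenolic"
    if oc > 0.9:
        return "sugar-like"
    if hc < 2:
        return "aliphatic" if n == 0 else "peptide-like"
    return "unclassified"

# Illustrative formulae (not taken from the study data)
for name, counts in [("C20H10O2", (20, 10, 2)),
                     ("C18H32O2", (18, 32, 2)),
                     ("C6H12O6", (6, 12, 6))]:
    print(name, round(ai_mod(*counts), 3), classify(*counts))
```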

2.7 Statistical analyses

Principal component analysis (PCA), linear and nonlinear fittings, FT-ICR MS data processing, and parallel factor analysis were conducted using MATLAB R2015b. Means and standard deviations were calculated using R 4.0.5 (x64), and the location of sampling sites was mapped using ArcGIS 10.2. Results with p < 0.05 were reported as significant in linear or nonlinear fittings.

3 Results

3.1 Water chemistry results

Urban land use ranged from 0.1% (pristine) to 56.2% (highly urbanized), and agricultural land use from 2.0% (pristine) to 70.0% (highly disturbed) for the 76 studied streams and rivers (Fig. S1). We found that agricultural land use and population density increased significantly with urban land use in the 76 stream and river catchments studied (Fig. S2). TN ranged from 0.6 mg·L−1 to 8.9 mg·L−1 (mean of 2.5 ± 1.6 mg·L−1), TP from 0.01 mg·L−1 to 0.40 mg·L−1 (mean of 0.07 ± 0.07 mg·L−1), DOC from 0.5 mg·L−1 to 7.4 mg·L−1 (mean of 1.6 ± 1.1 mg·L−1), and a254 from 1.5 to 21.7 m−1 (mean of 7.5 ± 4.8 m−1) (Table S1). TN, TP, DOC, and a254 all increased with increasing urban and agricultural land use and increasing population density in the upstream catchments (p < 0.001) (Fig. 1). BIX increased significantly with increasing urban (r2 = 0.21, p < 0.001) and agricultural (r2 = 0.08, p < 0.001) land use and increasing population density (r2 = 0.25, p < 0.001), while HIX and IC: IT decreased (p < 0.001) (Fig. 2).

Fig. 1

Relationships between total nitrogen (TN), total phosphorus (TP), dissolved organic carbon (DOC), DOM absorption a254, and urban land use (a-d) for the 76 studied streams and rivers. Relationships between TN, TP, DOC, a254, and agricultural land use (e-h) for the 76 sampled streams and rivers. Relationships between TN, TP, DOC, a254, and population density in the sampled stream and river catchments (i-l)

Fig. 2

Relationships between biological freshness index (BIX), humification index (HIX), fluorescence peak integration ratio (IC: IT), and the ratio of summed fluorescence intensity of humic-like (C1 + C2) to protein-like (C3 + C4 + C5) components, i.e. Humic: Protein and urban land use (a-d). Relationships between BIX, HIX, IC: IT, Humic: Protein, and agricultural land use (e-h). Relationships between BIX, HIX, IC: IT, Humic: Protein, and population density (i-l) in the sampled catchments

3.2 DOM optical results

Among the PARAFAC components, C1 [Ex/Em ≤ 240 (310)/404 nm] was similar to microbial humic-like material that was re-worked by microorganisms from terrestrial rivers (Kothawala et al. 2014; Kowalczuk et al. 2005), and C2 [Ex/Em ≤ 245 (350)/476 nm] corresponded to terrestrial humic-like substances derived from soil organic matter (Murphy et al. 2008; Stedmon and Markager 2005) (Fig. S3). The spectrum of C3 [Ex/Em ≤ 230/348 nm] was similar to tryptophan. C4 [Ex/Em ≤ 235 (280)/340 nm] was congruent with tryptophan-like fluorophores that associate with amino acids (Murphy et al. 2008; Osburn et al. 2012), and C5 [Ex/Em ≤ 230 (270)/300 nm] was categorized as a tyrosine-like substance (Murphy et al. 2011) (Fig. S3). The percentage of C1 and C2 to the summed fluorescence intensity of all the five components (%C1 and %C2) decreased with increasing urban and agricultural land use (p < 0.001) (Fig. 3a-b, Fig. 3f-g), while %C3-%C5 increased (p < 0.001) (Fig. 3c-e, Fig. 3h-j). The ratio of the summed fluorescence intensity of humic-like (C1 + C2) to that of protein-like (C3 + C4 + C5) components (i.e. Humic: Protein ratio, which is an indicator of the contribution of the terrestrial DOM export to rivers (Zhou et al. 2022)), decreased significantly with increasing urban (r2 = 0.20, p < 0.001) and agricultural (r2 = 0.12, p < 0.001) land use and increasing population density (r2 = 0.13, p < 0.001) in the upstream catchments (Fig. 2).

Fig. 3

Relationships between the relative contribution of each PARAFAC component (C1-C5) and urban (a-e) and agricultural land use (f-j) for the DOM samples collected from the 76 studied streams and rivers

3.3 Bio-incubation results

The bio-availability of DOC, i.e., %BDOC, ranged from 1.0% (pristine catchments) to 67.8% (densely populated catchments) with a mean of 23.7% ± 13.8% (Fig. 4). %BDOC decreased (r2 = 0.30, p < 0.001) with increasing SUVA254 (indicating higher aromatic content) for the 76 sampled streams and rivers. No significant relationship was found between %BDOC and catchment urban land use, agricultural land use, or population density (Fig. S4). We found significant and positive relationships between DOC and the fluorescence intensity of each component after 28 days of bio-incubation compared with that before bio-incubation (Fig. S5). The intercept of the linear fitting between DOC prior to and post-bio-incubation was not significantly different from zero and was therefore set to zero, and the slope of the linear fitting was 0.69 (Fig. S5), indicating that ~31% of DOC in the studied streams and rivers was highly bio-labile. For the DOM samples with high urban land use (%urban > 10%) in the upstream catchments, we found a decline of protein-like C3-C5 and a corresponding increase of humic-like C2 after 28 days of bio-incubation (Fig. S6).
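The step from a zero-intercept slope of 0.69 to a bio-labile fraction of about 31% can be made explicit. With the intercept fixed at zero, the least-squares slope is Σxy/Σx², and one minus the slope is the average fraction of DOC lost. The sketch below uses hypothetical names and synthetic values, not the study data.

```python
def slope_through_origin(x, y):
    """Least-squares slope of y = b * x with the intercept fixed at zero."""
    return sum(a * b for a, b in zip(x, y)) / sum(a * a for a in x)

def labile_fraction(doc_before, doc_after):
    """Average fraction of DOC lost, taken as one minus the zero-intercept slope."""
    return 1.0 - slope_through_origin(doc_before, doc_after)

# Synthetic DOC values (mg/L) before and after incubation
before = [0.8, 1.6, 3.2, 6.4]
after = [0.69 * v for v in before]
print(round(labile_fraction(before, after), 2))
```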

Fig. 4

Relationship between the percentage of bio-degradable DOC (%BDOC) and specific UV absorbance (SUVA254) for the DOM samples collected from the 76 studied streams and rivers

3.4 PCA modeling results

We built a PCA model that included %urban and %agricultural land use, population density, water chemistry parameters (TN, TDN, TP, and TDP), and DOM-related variables, namely DOC, a254, and the relative contributions of the component fluorescence intensities (%C1-%C5) (Fig. 5). In the PCA, PC1 explained 51.6% and PC2 19.3% of the variability in DOM-related variables (Fig. 5a). %Urban and %agricultural land use, population density, TN, TDN, TP, TDP, DOC, a254, and protein-like %C3-%C5 demonstrated positive PC1 loadings, while humic-like %C1-%C2 exhibited negative PC1 loadings (Fig. 5a). C1 and C2 displayed spectral shapes similar to terrestrial soil organic matter, while the spectral shapes of C3-C5 were congruent with DOM impacted by residential effluents (Zhou et al. 2020), indicating that PC1 was positively associated with anthropogenic DOM inputs from residential areas. Accordingly, we found that PC1 scores increased strongly with increasing urban land use (r2 = 0.66, p < 0.001) and population density (r2 = 0.74, p < 0.001) in the 76 stream catchments (Fig. 5b-c), especially at high values of %urban and %agricultural land cover.

Fig. 5

Principal component analysis (PCA) of urban (%Urban) and agricultural (%Agri.) land use, population density, water quality parameters, DOM absorption coefficient at 254 nm, i.e. a254, and contribution percentage of the five PARAFAC components %C1-%C5 (a). Relationships between PC1 scores and urban land use (b), and population density (c) in the 76 stream and river catchments

3.5 FT-ICR MS results

The total assigned formulae for the eight sampling sites ranged from 2059 to 7523. The mean relative abundance-weighted m/z of the eight samples ranged from 350.9 to 397.5 with a mean of 370.5 ± 14.5, and the relative abundance-weighted DBE ranged from 7.4 to 8.9 with a mean of 7.9 ± 0.5 (Fig. S7). We found that both the mean relative abundance-weighted m/z and DBE declined, though not significantly so, with increasing catchment urban and agricultural land use, and population density (Fig. S7). The mean ratio of mass to charge (m/z) of FT-ICR MS peaks was lower (m/z ~ 350) for the samples collected from densely urbanized catchments (disturbed (urban + agricultural) land use = 87.2%) compared with the more pristine catchments (disturbed land use = 3.5%) with a mean m/z ~ 400 (Fig. 6a). The relative abundance-weighted contributions (%RA) of aliphatic, peptide-like, and sugar-like compounds increased in intensely urbanized catchments concurrently with a relative depletion in %RA of polycyclic condensed aromatics and polyphenolic compounds (Fig. 6b; Table S2). We further conducted Spearman’s rank order correlation analysis for each molecular formula occurring in > 50% of all samples, and the results showed that the relative abundance of aliphatics, peptide-like, and sugar-like compounds increased with increasing urban and agricultural land use in the catchment, and with increasing DOC (rho > 0.5, p < 0.05) (Fig. 6c; Fig. S8a), while the relative abundance of less-oxidized highly unsaturated and phenolic compounds decreased (rho < −0.5, p < 0.05) (Fig. 6d). We found that a fraction of aliphatic and peptide-like formulae increased with increasing BDOC (Fig. S8b). The %RA of CHO-containing formulae ranged from 53.0% to 66.5% with a mean of 61.3 ± 5.0%, and it decreased significantly with increasing urban land use (r2 = 0.80, p = 0.003) (Fig. 6e) and population density (Fig. S9a), while %RA of CHOS-containing formulae ranged from 7.4% to 18.6% with a mean of 11.9 ± 4.2%, and it increased significantly with increasing urban land use (r2 = 0.84, p = 0.001) (Fig. 6f) and population density (Fig. S9b). In addition, AImod ranged from 0.28 to 0.48 with a mean of 0.35 ± 0.07, but no significant relationship was found between AImod and catchment urban, agricultural land use, or population density (Fig. S7g-i). We found close linkages among the DOM molecular indices for the studied samples (Fig. S10).
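The relative abundance-weighted metrics reported above are intensity-weighted averages over the assigned formulae of a sample. The sketch below illustrates the arithmetic with hypothetical function names and a made-up peak list. It assumes the standard double bond equivalent expression DBE = 1 + (2C − H + N + P)/2 and takes peak intensity as the weighting variable; the source does not spell out either detail.

```python
def dbe(c, h, n=0, p=0):
    """Double bond equivalents of a formula (O and S do not contribute)."""
    return 1 + 0.5 * (2 * c - h + n + p)

def weighted_mean(values, intensities):
    """Relative abundance-weighted mean of a per-formula property."""
    return sum(v * i for v, i in zip(values, intensities)) / sum(intensities)

def percent_ra(labels, intensities, target):
    """Share of summed intensity (%) carried by formulae labelled `target`."""
    selected = sum(i for lab, i in zip(labels, intensities) if lab == target)
    return 100.0 * selected / sum(intensities)

# Made-up peak list: (m/z, intensity, elemental class, C, H, N)
peaks = [(301.1, 2.0, "CHO", 15, 14, 0),
         (355.2, 1.0, "CHOS", 16, 20, 0),
         (412.2, 1.0, "CHON", 20, 23, 1)]
mz = [p[0] for p in peaks]
inten = [p[1] for p in peaks]
classes = [p[2] for p in peaks]
dbes = [dbe(p[3], p[4], p[5]) for p in peaks]
print(weighted_mean(mz, inten), weighted_mean(dbes, inten),
      percent_ra(classes, inten, "CHO"))
```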

Fig. 6

Ultrahigh resolution mass spectra across the m/z range 150–800 of DOM samples collected from catchments with disturbed land use (urban + agricultural) ranging from 3.5% (pristine) to 87.2% (highly disturbed) (a). van Krevelen diagram showing enriched aliphatic formulae and depleted contribution of condensed aromatic substances for the DOM collected from the stream in the disturbed relative to the pristine catchment (b). The size of the dots represents the differences between the relative abundances of the samples collected from the disturbed and the pristine catchments (b). CA: polycyclic condensed aromatics; HUP: highly unsaturated and phenolic compounds; PP: polyphenolic compounds. Spearman correlation coefficients (rho) between the relative abundance (RA) of molecular formulae for the eight stream and river DOM samples and urban land use (c), and agricultural land use (d) of the corresponding catchments. The size of the dots represents the absolute value of rho in panels c-d. Relationships between RA-weighted contribution percentages of CHO-containing and CHOS-containing compounds (%) and urban land use (e-f) of the sampled catchments

4 Discussion

This study included an extensive field sampling campaign aiming to unravel the linkages between DOM optical properties, molecular composition, bio-lability, and catchment point and nonpoint source input across a large gradient of urban and agricultural land use in southeastern China. Previous studies have revealed that discharge impacts the concentration and optical composition of DOM in single streams and estuaries (Hur et al. 2014; Stedmon and Markager 2005; Yoon and Raymond 2012). Land use change could potentially affect the water flow in various environments (Eng et al. 2013; Gusarov 2020; Roa-García et al. 2011). However, all the samples included in this study were collected from independent medium-sized streams and rivers with no reservoirs in the sampled catchments. Our results indicate that catchments with enhanced urban and agricultural land use and relatively high population density (hereafter referred to as disturbed) fueled the export of protein-like, aliphatic-rich substances, as well as a relatively greater abundance of heteroatom-containing DOM with high bio-lability to the fluvial ecosystems.

4.1 Chemical composition of DOM across urban and agricultural land use gradients

Urbanization and intensified agricultural land use enhanced the mobilization of nitrogen, phosphorus, and DOC as supported by the higher concentrations of TN, TDN, TP, TDP, and DOC and higher levels of a254 in the disturbed catchments compared with pristine and forested catchments (Fig. 1). Substantial organic debris and household leachates from urbanized residential regions can be flushed to receiving streams and rivers (Hosen et al. 2014; Williams et al. 2010). In China, urbanization is often closely linked with intensified agricultural land use at the expense of forest loss in various catchments and is characterized by densely populated residential areas (Hosen et al. 2014; Williams et al. 2016). In urbanized catchments, low forest cover and high nutrient loading potentially enhance the microbial metabolism, thus altering the cycling of organic carbon across the urbanized stream and river ecosystems (Hosen et al. 2014; Zhou et al. 2022). The DOM fluorescence metrics provided evidence of a shift in DOM optical composition with catchment disturbance (Fig. 2). BIX increased, while HIX and IC: IT decreased significantly with increasing disturbance, verifying that DOM in rivers within highly urbanized catchments is of recent, likely aquatically produced, microbial origin. Higher BIX represents enhanced prevalence of aquatically fixed DOM of autochthonous nature in urbanized streams and rivers (Hosen et al. 2014). In comparison, HIX and IC: IT all increased with increasing terrestrial soil organic-rich signals as seen in other studies (Zhou et al. 2017; Zhou et al. 2022; Zsolnay et al. 1999). The PARAFAC analysis revealed that the percentages of the humic-like components %C1-%C2 decreased, while those of the protein-like components %C3-%C5 increased with increasing catchment disturbance (Fig. 3), substantiating that DOM in such areas contained a large fraction of protein-like substances associated with amino acids (Williams et al. 2016; Williams et al. 2010).

The FT-ICR MS results further revealed that the molecular composition of DOM differed as a function of urban and agricultural land cover. The molecular composition of DOM constrains its reactivity and persistence as polycyclic aromatics, polyphenols, and highly unsaturated and phenolic compounds can be photoreactive but bio-stable, while aliphatic compounds and peptides can be highly bio-labile (Kellerman et al. 2018). Large contributions from terrestrial soil organic matter in pristine catchments, such as condensed aromatics and polyphenols, likely shifted the overall composition of DOM to higher aromaticity compared with DOM from disturbed catchments (Fig. 6; Figs. S7, S8 and S9). We found lower molecular weight and higher H/C ratios and contributions of aliphatic and peptide-like compounds bound in CHON- and CHOS-containing formulae with a correspondingly lower contribution of aromatics and polycyclic condensed aromatic compounds of DOM in disturbed catchments compared with pristine catchments (Fig. 6a-b). Aromatics and polycyclic condensed aromatics have been used to reflect allochthonous sources (Kurek et al. 2020; Wagner et al. 2015). The relative abundance-weighted CHO-containing formulae decreased, while CHOS-containing formulae increased with increasing catchment disturbance, indicating enhanced dominance by autochthonous microbial degradation products with low molecular weight. This suggested that different molecular components of DOM were intrinsically and closely associated with each other (Fig. S10). Previous studies have shown that CHOS formulae primarily occurred in wastewater and septic-impacted aquatic ecosystems (Gonsior et al. 2011; Wagner et al. 2015). DOM sources from these urbanized and agricultural catchments were thus highly reduced and contained N and S compared with DOM sources from forested and pristine catchments (Drake et al. 2019), and DOM had lower molecular weight and aromaticity but higher hydrophobicity compared with pristine catchments (McElmurry et al. 2014). Our results, therefore, suggested that anthropogenic inputs altered the in-stream molecular composition of DOM by suppressing the soil organic matter signature observed for pristine catchments. Aliphatic and peptide-like DOM can be problematic in conventional water treatment, and they have a propensity to form disinfection byproducts upon chlorination (Kothawala et al. 2017; Lavonen et al. 2013). In addition, the presence of protein-like organic material increases the potential to form highly cytotoxic and genotoxic nitrogenous disinfection byproducts and thereby threatens the safety of the drinking water supply (Plewa et al. 2008).

4.2 Land use regulates BDOC and its implications

Our study showed that enhanced anthropogenic disturbance results in an elevated input of highly biodegradable DOC to streams and rivers. Aliphatic-rich DOM comprised an important fraction of organic substances in household effluents, often with high BDOC. In the densely populated catchment, runoff from asphalt landscapes, including roads, parking lots, and other impervious surfaces in urban areas, would be expected to exhibit high BDOC (Hosen et al. 2014). We found that 50 ~ 60% of DOC in disturbed sites could be biodegraded and respired as CO2 compared with %BDOC < 10% in pristine forested catchments (Fig. 4). %BDOC decreased with increasing SUVA254 across the sampled 76 streams and rivers (Fig. 4), showing that the anthropogenic disturbance generated low-molecular-weight, aliphatic energy-rich DOM with high bio-lability, which was distinctive from DOM composition in pristine catchments. We found a reduction of protein-like fluorophores and a corresponding increase in humic-like components after 28 days of bio-incubation in the DOM samples collected from the catchments with high urban land use (Fig. S6), demonstrating that protein-like substances associated with amino acids are more vulnerable to microbial degradation (Drake et al. 2019, Hosen et al. 2014, Williams et al. 2016). In comparison, aromatic DOM mobilized from pristine catchments is often biologically stable during the water transit time in these catchments (Drake et al. 2019). Where forests have been lost to continuous cropping and urbanization, it can be expected that DOM will be utilized at faster rates owing to enhanced microbial metabolism with elevated nutrient loading and BDOC. This bio-labile DOM comprises a large fraction of the carbon exported in agricultural and urbanized catchments, and higher nutrients and BDOC are expected to disproportionately increase respiration.

In this study, it is challenging to partition the specific effects of urban and agricultural land use and population density on the chemical compositional variability in DOM due to the high correlation between them, as is typical in China, and rivers tend to integrate across landscapes. Moreover, our study was based on a snapshot sampling strategy, but DOM may vary substantially seasonally and during extreme weather events (Coble et al. 2022; Zhou et al. 2020). Therefore, high-frequency DOM absorbance and fluorescence sensors can be useful surrogates for DOC and optical composition of DOM and have been rapidly applied in routine observational programs, especially in lake and drinking water supply reservoirs (Coble et al. 2022, Zhou et al. 2020). The linkage between SUVA254 and BDOC enables the possibility to model the biogeochemical fate of DOM in various stream and river systems with simple optical proxies or DOM-related indices retrieved in situ or via remote sensing, thus improving the ability to make broad-scale inferences at regional scales. Forests can regulate the river discharge (Lin et al. 2022), and many sampled streams and rivers included in this study were highly forested, and the ongoing deforestation will potentially reduce the forest-regulated water flow of these streams and rivers. Recent studies have further revealed that deforestation will mobilize substantial aliphatic and heteroatomic DOM with high bio-lability to the fluvial ecosystems (Drake et al. 2019; Spencer et al. 2019). Our results, therefore, highlight the importance of land use management for the biogeochemical cycling of organic carbon in the fluvial ecosystem. With the continuously increasing impervious cover area in the suburban catchments, point and nonpoint source effluent is dramatically altering the chemical composition and bio-lability of DOM in the downstream-linked stream and river systems. 
Given the importance of DOM in the global biogeochemical cycling of carbon, the impacts of urban and agricultural land use on the chemical composition and bio-lability of DOM should be considered in modeling the variability of its biogeochemical processes.

5 Conclusions

Our results highlight that DOM composition and its bio-lability are driven by the sources and the physicochemical conditions under which it is produced and preserved. Intensified urban and agricultural land use is often closely associated with high population density and has fueled the export of nutrients and protein-rich DOM with high BDOC to the receiving streams and rivers. The molecular level analyses revealed a shift in fluvial DOM sources from terrestrial soil organic substances in pristine catchments to autochthonously sourced aliphatics with increased heteroatom content and compounds with low molecular weight and aromaticity in urbanized catchments. We observed that aliphatic and protein-rich DOM derived from the disturbed catchments were highly bio-labile. With shifts in land use due to deforestation, especially from pristine forested to highly urbanized and densely populated catchments, more energy-rich aliphatic DOM associated with high bio-lability will be produced and discharged to downstream-linked fluvial ecosystems. As the degradation of this highly bio-labile DOM can enhance metabolism, future studies might benefit from unraveling the causal linkages between the composition of DOM and the aquatic effluxes of CO2 in highly urbanized landscapes.