Predictability and variance explained
The predictability of SOC varies substantially between observations and process-based models and across scales. Random forest models using three key state factors (MAT, NPP, and TEX) to predict SOC from different sources explained between 26 % of the variance for global soil profile measurements and 81 % for the MIMICS model (Table 1). The soil profile measurements were taken at the plot level and exhibited greater heterogeneity due to fine-scale controls (e.g., proximity to vegetation, topographic wetness, soil aggregation; Wiesmeier et al. 2019) that are often not included in broad-scale analyses or models. As a result, the predictability of such SOC measurements is often lower (Doetterl et al. 2015; Hengl et al. 2014, 2017). Here, only 26 % of the variance in soil profiles was explained by MAT, NPP, and TEX using a random forest model, indicating that important controls at the plot scale are still missing and that the local conditions driving SOC accumulation may not be well matched by the relatively coarse spatial scales of MAT and NPP. Future studies should thus consider using finer-scale, observationally derived covariate products when feasible and available. Furthermore, other fine-scale controls (e.g., pH, exchangeable calcium, extractable iron and aluminum, among other properties) are important in explaining the variability in SOC across soil profiles (Heckman et al. 2020; Nave et al. 2021; Rasmussen et al. 2018), but these controls are not yet incorporated in most biogeochemical models and, therefore, could not be compared here.
In contrast, observationally derived data products (e.g., HWSD and NCSCD-adj. HWSD) and process-based biogeochemical models have imposed underlying structures that rely on fewer input variables and are more easily learned by a random forest algorithm. Indeed, these gridded observational and model products exhibit less fine-scale heterogeneity in SOC stocks than that seen in soil profile measurements. Using a random forest model for each of the observational and biogeochemical model products globally, the same three covariates (MAT, NPP, and TEX) explained 58 % and 53 % of the variance in the HWSD and NCSCD-adj. HWSD, respectively, versus 63 %, 81 %, and 53 % in the CASA-CNP, MIMICS, and CORPSE model outputs, respectively (Table 1). The predictability of SOC also varied between individual biomes and broad biome classifications (Fig. S4-S5). For example, temperate deciduous broadleaf and evergreen needleleaf forests showed greater predictability than tropical evergreen broadleaf forests, in both observational data products and process-based models (Fig. S5).
While we focused our analyses on the more parsimonious RF model that included MAT, NPP, and TEX, we briefly note that adding mean annual precipitation (MAP) to the random forest models increased the variance explained only marginally, by 1–5 % across the different data sources globally (Table S2). The smallest difference was observed for MIMICS, consistent with the fact that the version of the model used in the biogeochemical testbed did not contain a soil moisture function and thus did not use MAP as an input (Wieder et al. 2018). Therefore, subsequent analyses focused on the environmental variables (MAT, NPP, TEX) that were direct inputs to all of the biogeochemical models.
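The percentages of variance explained reported above can be read as a coefficient of determination expressed in percent. The following is a minimal sketch under that assumption; the source does not state how the statistic was computed (for random forests it is often evaluated on out-of-bag or held-out predictions). The function name and the SOC values are hypothetical and purely illustrative.

```python
from statistics import fmean


def percent_variance_explained(observed, predicted):
    """Return 100 * (1 - SSE / SST) for paired observed and predicted values."""
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have the same length")
    mean_observed = fmean(observed)
    # Residual sum of squares: what the predictions fail to capture.
    sse = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    # Total sum of squares: spread of the observations around their mean.
    sst = sum((o - mean_observed) ** 2 for o in observed)
    if sst == 0:
        raise ValueError("observed values have zero variance")
    return 100.0 * (1.0 - sse / sst)


# Hypothetical SOC stocks (illustrative numbers, not data from the study).
observed_soc = [2.0, 4.0, 6.0, 8.0]
perfect_prediction = list(observed_soc)
mean_only_prediction = [fmean(observed_soc)] * len(observed_soc)

explained_perfect = percent_variance_explained(observed_soc, perfect_prediction)
explained_mean_only = percent_variance_explained(observed_soc, mean_only_prediction)
```

Under this definition, a model that always predicts the mean explains 0 % of the variance, so the gain from adding a covariate such as MAP is simply the difference between two such percentages computed on the same observations.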
In addition to overall SOC predictability across data and models, we also explored the variable importance and emergent relationships for MAT, NPP, and TEX for each of the SOC sources. We found an apparent mismatch between observations and biogeochemical models, where TEX was the most important predictor for the observations (both soil profiles and global datasets), in contrast to MAT and NPP for the model outputs (Fig. 2). This result also holds for clay as a predictor, as often used in biogeochemical models (Fig. S8; though only 48 % of the variance in HWSD was explained). Furthermore, we note that the importance of MAT was higher in the NCSCD-adj. HWSD compared to the HWSD alone, suggesting potential differences in underlying temperature sensitivity and freeze-thaw dynamics that are important to consider when benchmarking biogeochemical models globally.
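The source does not say which variable importance measure was used. As an illustration only, the sketch below uses permutation importance, a common model-agnostic choice: each covariate is shuffled in turn and the resulting increase in mean squared error is recorded. The predictor `toy_predict` is a hypothetical stand-in for a fitted random forest emulator, and the covariate ranges are invented for the example.

```python
import random


def mean_squared_error(predict, rows, targets):
    """Mean squared error of predict(row) against the targets."""
    return sum((predict(r) - t) ** 2 for r, t in zip(rows, targets)) / len(rows)


def permutation_importance(predict, rows, targets, n_features, n_repeats=20, seed=0):
    """Mean increase in MSE after shuffling each feature column in turn."""
    rng = random.Random(seed)
    baseline = mean_squared_error(predict, rows, targets)
    importances = []
    for j in range(n_features):
        increases = []
        for _ in range(n_repeats):
            column = [r[j] for r in rows]
            rng.shuffle(column)  # break the link between feature j and the target
            permuted = [r[:j] + (v,) + r[j + 1:] for r, v in zip(rows, column)]
            increases.append(mean_squared_error(predict, permuted, targets) - baseline)
        importances.append(sum(increases) / n_repeats)
    return importances


def toy_predict(row):
    """Hypothetical emulator: SOC depends strongly on TEX, weakly on MAT, not on NPP."""
    mat, npp, tex = row
    return 0.2 * tex - 0.1 * mat


# Synthetic covariates in the order (MAT, NPP, TEX); ranges are assumptions.
data_rng = random.Random(42)
rows = [
    (data_rng.uniform(-10, 30), data_rng.uniform(0, 1200), data_rng.uniform(0, 100))
    for _ in range(200)
]
targets = [toy_predict(r) for r in rows]

importances = permutation_importance(toy_predict, rows, targets, n_features=3)
ranking = sorted(zip(("MAT", "NPP", "TEX"), importances), key=lambda p: p[1], reverse=True)
```

In this toy setting TEX ranks first and NPP contributes nothing, mirroring the kind of ranking that would be compared across observations and model outputs.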
While biogeochemical models placed a higher importance on both temperature and plant productivity globally (Fig. 2), greater nuance exists at the biome level. In our RF emulators of the biogeochemical model outputs, MAT stood out as the most important variable in forests, whereas NPP was most important in herbaceous biomes (Fig. 3). In contrast, TEX was again the most important variable for HWSD in both forest and herbaceous biomes (Fig. 3; Fig. S6), though the NCSCD-adj. HWSD showed an increased importance of MAT for herbaceous biomes at high latitudes (Fig. 3; Fig. S6). The performance of the RF models in broad forest and herbaceous land classes was comparable to that of the global results (Fig. S4). Interestingly, among individual biomes where the RF also achieved similar performance to the global results (Fig. S5), we find agreement between the observations and model outputs in select biomes (Fig. S7). Namely, MAT had high explanatory power in temperate deciduous broadleaf and mixed forests, and NPP had high explanatory power in temperate grasslands. This suggests that biogeochemical models can match observations in select biomes (e.g., temperate forests and grasslands), but targeted model improvements are needed in biomes where observations and models show divergent controls on SOC stocks (Fig. S6-S7). These targeted improvements could include a closer examination of model parameterizations in specific biomes (for example, increasing mineralogical controls or increasing/decreasing temperature dependencies) and of the distribution of SOC among different model pools. Modifications may also require incorporating missing controls into biome-specific model formulations or conducting additional experiments and sampling campaigns in data-poor regions to further inform parameterizations.
Using random forest models trained on the observations and model output, we explored the dependence of measured and modeled SOC on individual covariates. These random forest models act as emulators of the observations and biogeochemical models, allowing us to examine the relationship between SOC and each covariate (i.e., through partial dependence plots) while controlling for all other covariates. This method also allows non-linear relationships to emerge without imposing a functional form a priori.
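Partial dependence can be sketched in a few lines: for each value on a grid, the covariate of interest is set to that value in every sample, the emulator is evaluated, and the predictions are averaged. The example below is a minimal illustration, not the authors' implementation; `toy_emulator` is a hypothetical stand-in for a fitted random forest, with an assumed saturating response to NPP and a linear response to MAT.

```python
import math


def partial_dependence(predict, rows, feature, grid):
    """Average prediction with one feature fixed at each grid value in turn."""
    curve = []
    for value in grid:
        total = 0.0
        for r in rows:
            # Overwrite only the feature of interest; keep the other covariates as observed.
            modified = r[:feature] + (value,) + r[feature + 1:]
            total += predict(modified)
        curve.append(total / len(rows))
    return curve


def toy_emulator(row):
    """Hypothetical emulator with covariates ordered (MAT, NPP, TEX)."""
    mat, npp, tex = row
    return 5.0 - 0.2 * mat + 10.0 * (1.0 - math.exp(-npp / 500.0))


# Illustrative samples (not data from the study).
samples = [
    (-5.0, 200.0, 30.0),
    (5.0, 600.0, 50.0),
    (15.0, 900.0, 70.0),
    (25.0, 1200.0, 40.0),
]

npp_grid = [0.0, 500.0, 1000.0]
npp_curve = partial_dependence(toy_emulator, samples, feature=1, grid=npp_grid)
```

Because the other covariates are averaged over rather than held at a single reference value, the resulting curve reflects the marginal response of the emulator across the sampled conditions, and no functional form has to be specified in advance.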
We observed stark differences between the observations and biogeochemical models (Fig. 4). Both the soil profiles and observational data products (HWSD and NCSCD-adj. HWSD) showed a greater dependence (steeper positive slope) on TEX (clay and silt content) compared to the models (near zero or negative slope; Fig. 4a; Fig. S2; Table S1). By contrast, the soil biogeochemical models were more sensitive to MAT and NPP, compared to the observational products. Though the soil biogeochemical models do contain some representation of mineral-organic associations (where MIMICS and CORPSE do so in a more explicit, mechanistic way), the importance of TEX appeared insignificant in explaining SOC content globally. Rather, climate and vegetation seemed to play a predominant role in driving the distribution of SOC in soil biogeochemical models. This discrepancy motivates a closer look at how mineralogical controls are implemented in each biogeochemical model and the resulting distributions of SOC stocks between modeled pools.
Both observations and models showed the highest SOC in grid cells with colder MAT (Fig. 4b). Indeed, SOC had the highest temperature sensitivity (steeper slope) in colder regions (MAT < 0 °C) and a lower temperature sensitivity (shallower slope) in warmer regions (MAT > 10 °C) (Fig. 4b), corroborating recent studies (Koven et al. 2017; Wieder et al. 2019a). This higher temperature sensitivity of SOC at low MAT primarily occurs in high-latitude forests, and therefore, a clear difference in temperature sensitivity was seen across herbaceous (here grasslands and savannas) and forest biomes (Fig. 3; Fig. S6). Without the NCSCD-adjusted values at high latitudes, the HWSD data product on its own showed a muted temperature sensitivity, far lower than that of the soil profiles and biogeochemical models (Fig. 4b), suggesting that the HWSD alone may not fully capture the underlying temperature sensitivity that is needed to benchmark biogeochemical models. Replacing the high-latitude values with the NCSCD data product, known to be more accurate in those regions (Hugelius et al. 2013; Koven et al. 2017), improved the emergent temperature sensitivity such that it approached that of the soil profiles.
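If temperature sensitivity is read as the slope of the partial dependence curve, as the parenthetical remarks above suggest, it can be compared between cold and warm regions with simple finite differences. The sketch below assumes the curve has already been evaluated on a MAT grid; the grid and SOC values are hypothetical and chosen only to illustrate the calculation.

```python
def mean_slope(xs, ys, lower, upper):
    """Average finite-difference slope over grid segments lying within [lower, upper]."""
    slopes = []
    for i in range(len(xs) - 1):
        if lower <= xs[i] and xs[i + 1] <= upper:
            slopes.append((ys[i + 1] - ys[i]) / (xs[i + 1] - xs[i]))
    if not slopes:
        raise ValueError("no grid segments fall inside the requested range")
    return sum(slopes) / len(slopes)


# Hypothetical partial dependence of SOC on MAT (degrees C); not values from Fig. 4b.
mat_grid = [-10.0, -5.0, 0.0, 5.0, 10.0, 15.0, 20.0, 25.0]
soc_curve = [30.0, 24.0, 18.0, 14.0, 11.0, 10.0, 9.5, 9.0]

cold_slope = mean_slope(mat_grid, soc_curve, -10.0, 0.0)   # MAT < 0 degrees C
warm_slope = mean_slope(mat_grid, soc_curve, 10.0, 25.0)   # MAT > 10 degrees C
cold_is_more_sensitive = abs(cold_slope) > abs(warm_slope)
```

Comparing the magnitudes of the two slopes gives a single number per data source, which makes it easier to set the soil profiles, the data products, and the models side by side.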
While the soil biogeochemical models use similar temperature functions, the parameterizations of these functions and the fluxes or carbon pools to which they are applied differ between the models (Wieder et al. 2018). As a result, the biogeochemical models exhibited a range of temperature sensitivities (Fig. 4b). The degree of temperature sensitivity of the soil profiles most closely matched that of the MIMICS and CASA-CNP models (Fig. 4b). CORPSE had the highest MAT dependence (Fig. 4b), which may be due in part to the cessation of decomposition in frozen soils in that model (Wieder et al. 2019a) and is reflected in its higher SOC content at high latitudes (Fig. 1).
Influence of plant productivity
Plant inputs are the main source of carbon to the soil and, thus, plant productivity across biomes plays an important role in carbon accumulation. We found that, in both the observations and soil biogeochemical models, SOC content exhibited a saturating relationship with increasing net primary productivity (NPP; Fig. 4c). However, the soil profiles and the biogeochemical models showed a significantly higher sensitivity (steeper increase) to NPP than the HWSD data product. This suggests that underlying environmental sensitivities seen in the soil profiles may not be effectively represented in the HWSD data product, and thus, care should be taken when using such data products to benchmark soil biogeochemical models. Indeed, the soil profiles showed an NPP sensitivity that fell in the middle of those seen in the models; specifically, at high NPP, the soil profiles showed a lower sensitivity than CORPSE and a higher sensitivity than MIMICS and CASA-CNP, while at low NPP, the soil profiles were less sensitive than all three of the models (Fig. 4c). This qualitative difference in the shape of the soil profile curve is interesting. Further investigation with other datasets (especially those with site-level NPP measurements) is warranted, both to verify the pattern, given the low variance explained for the soil profiles, and to understand its root cause and potential implications for model representations. It is important to note that gridded NPP (and MAT) estimates may differ from the actual conditions experienced by the soil profiles being measured, as NPP can be highly variable at the site level within a grid cell. Elucidating the sensitivity of long-term SOC storage to changes in plant inputs, and to underlying site-level heterogeneity, is critical for understanding potential feedbacks to land-use and land-cover change globally.