Abstract
The main hurdle in instrumentalizing agricultural soils to sequester atmospheric carbon is the development of methods to measure soil carbon stocks which are robust, scalable, and widely applicable. Our objective is to develop an approach that can help overcome these hurdles. In this paper, we present the Wageningen Soil Carbon STOck pRotocol (SoilCASTOR). SoilCASTOR uses a novel approach fusing satellite data, direct proximal sensing-based soil measurements, and machine learning to yield soil carbon stock estimates. The method has been tested and applied in the USA on fields with agricultural land use. Results show that the estimates are precise and repeatable and that the approach could be rapidly scalable. The precision of farm C stocks is below 5%, enabling detection of the soil organic carbon changes desired for the 4 per 1000 initiative. The assessment can be done robustly with as few as 0.5 samples per hectare for farms varying from 20 to 150 hectares. These findings could enable the structural implementation of carbon farming.
1 Introduction
Increasing soil organic matter can mitigate climate change by sequestering atmospheric carbon (Batjes 2019; Bossio et al. 2020; Amelung et al. 2020; IPCC 2021). Agricultural soils are of particular interest as they have undergone significant anthropogenically induced changes (Quine et al. 1997; Van Oost et al. 2007; Sanderman et al. 2017). However, effective management could restore and increase the carbon reserves (Spencer et al. 2011; Batjes 2019; Bossio et al. 2020). Enhanced amounts of soil organic carbon (SOC) can have co-benefits such as enhanced water retention, higher biodiversity, and higher resilience to climate-change induced droughts (Guillaume et al. 2022). From an environmental and economic policy dimension, there is increasing interest in instrumentalizing agricultural soils for enhanced natural carbon sequestration (Sikora 2020). There are numerous studies and initiatives that attempt to instrumentalize soil carbon stocks to actively offset carbon emissions, also referred to as carbon farming (Spencer et al. 2011; Black et al. 2020). In order to ensure the long-term removal of carbon from the atmosphere, the carbon must stay in the soil for an extended period of time (concept of permanence, Lutzow et al. 2006, Oldfield et al. 2022a). Land users can actively contribute to the increase of their carbon stocks by changing their soil management, e.g., by ceasing tillage or changing fertilizer use (the concept of Additionality, Black et al. 2020). Protocols are emerging to facilitate Monitoring, Reporting, and Verification (MRV) in the framework of carbon farming, but challenges remain regarding robust scientifically backed methods and the documentation on permanence, additionality, and leakage (Oldfield et al. 2022a). 
In light of current policy developments such as the European Green Deal and the EU climate action on sustainable carbon cycles, the importance of robust carbon monitoring is likely to increase in the coming years (Elkerbout 2020; Amelung et al. 2020; The European Commission 2022).
The main challenge in instrumentalizing soils to sequester atmospheric carbon is the implementation of a method which is robust, affordable, and scalable and which is coupled to feasible land-use management advice (Spencer et al. 2011). There are methods which assess soil carbon stocks using satellite data only (Köchy et al. 2015; Poggio et al. 2021), but they are limited with regard to their resolution and their ability to monitor changes over time on field and farm levels. For example, for SoilGrids, this is 250 by 250 m, which constitutes a resolution which is too coarse for numerous fields to accurately assess carbon stocks for carbon credit certification (Black et al. 2020). An additional drawback of satellite-based global models is that these models are geared toward optimal predictions on a global or national scale, limiting the reliability on farm-level (Poggio et al. 2021). Remote sensing can be supplemented by field-based measurements using wet chemistry (Van Der Voort et al. 2019). Alternatively, it can be supplemented by proximal sensing methods such as soil spectroscopy (Bellon-Maurel and McBratney 2011; Soriano-Disla et al. 2014; Gobrecht et al. 2014; Shen et al. 2022). The high cost of wet chemistry measurements constitutes a major hurdle for the implementation of carbon farming (Kragt et al. 2012; Alexander et al. 2015; Tang et al. 2016). An additional limitation of wet chemistry is that lab facilities are not readily accessible and available everywhere, in particular in developing countries. Soil spectroscopy methods are gaining ground rapidly in agricultural sciences as key tools to map soil properties rapidly and for large areas without the need of wet chemistry measurements (Nocita et al. 2015; Smith et al. 2020; Trontelj 2021). However, these types of measurements are associated with lower accuracy as compared to classical wet chemistry measurements (Soriano-Disla et al. 2014). 
Fusing remote and proximal sensing data can combine the strengths of both approaches and deliver cost-effective soil mapping (Asgari et al. 2020). Sampling strategies are also key to accurately map soil carbon stocks as they must capture the spatial heterogeneity of an area (Goovaerts 1998; van der Voort et al. 2016). Robustly mapped soil carbon stocks are needed in order to capture the temporal change and assert permanence of carbon and the impact of additionality (Van Der Voort et al. 2019; Smith et al. 2020; Oldfield et al. 2022a, b). Grid-based sampling (e.g., measuring every 10 m) yields robust estimates of spatial heterogeneity and can assess temporal changes but is highly labor intensive (Bivand et al. 2013; Nussbaum et al. 2014; van der Voort et al. 2016; Van Der Voort et al. 2019). Alternatives to grid-based sampling, such as conditioned Latin hypercube sampling (cLHS), can capture similar levels of heterogeneity with a lower average sampling density (Brus 2019; Minasny and McBratney 2006; Yang et al. 2016). Machine learning can be used to predict patterns of carbon stocks and reduce the need for additional sampling (Bivand et al. 2013; Nussbaum et al. 2014; Smeaton et al. 2021). However, overfitting in machine learning can limit the scalability of approaches (e.g., with random forests, Bivand et al. 2013). An overfitted model may be applicable to the region it was trained on but not to other regions and is therefore not scalable. Furthermore, it is key that the best-performing machine-learning model is selected in order to obtain the lowest possible residual errors (Padarian et al. 2020; Khaledian and Miller 2020). Changes in carbon stocks are optimally demonstrated at the decision-making level, i.e., at the farm level, but presently, this is challenging due to the significant cost associated with MRV processes (de Gruijter et al. 2016).
This research gap can be addressed by developing methods which can determine soil carbon using a robust, affordable, and widely applicable approach which is useable at the farm-level.
Our objective in this study is to develop and test a method which can facilitate carbon stock monitoring in a wide range of settings. This approach leverages available satellite data, on-field soil spectroscopy measurements, and machine learning techniques to create efficient sampling protocols and generate carbon stock estimates. The approach works at the farm scale and is tested on a range of arable fields in the USA covering a range of soil types. In order to test the developed method, the optimal sampling density is determined and the errors of the carbon stock estimates are calculated at the farm and field level.
2 Material and methods
2.1 Carbon stock assessment protocol and field area
This section describes the key steps of the Wageningen Soil Carbon STOck pRotocol (SoilCASTOR): (1) the selection of spatial covariates from (satellite) data sources, (2) the selection of sampling locations, (3) the soil spectroscopy measurement of SOC in the field, (4) the training of a model on the available data, and (5) the calculation of the carbon stock with uncertainty estimates (Fig. 1). The code supporting this approach was written in the statistical language R using RStudio (RStudio 2021, version 2021.09.0). The proposed protocol was tested on two farms in the states of Arkansas and Iowa in the USA. The fields are located between 35° and 41.3° latitude and between −92.0° and −91.6° longitude (Fig. 2). The Arkansas farm consists of five fields with a total area of ~140 ha, and the Iowa farm consists of three fields with a total area of ~95 ha. The smallest field is ~10 ha and the largest ~64 ha. The land use of all fields is agricultural, and all soils are Alfisols. In Iowa, the soils are characterized as silt loam to silty clay loam. The parent material is Pleistocene loess. In Arkansas, the soils are also classified as silt loam. The parent material is also loess with occasional glacial deposits (Boiko et al. 2021).
Key steps of the Wageningen Soil Carbon STOck pRotocol (SoilCASTOR) method are (1) collection of covariates from (satellite) data sources, (2) the selection of sampling locations with conditioned Latin hypercube sampling (cLHS), (3) measurement of soil organic carbon (SOC) with the near-infrared (NIR) scanner in the field, (4) modeling of SOC using machine learning (ML), and (5) soil carbon stock estimates with uncertainties.
2.1.1 Selection of spatial covariates
The method utilizes all available (satellite) data sources and indices to find effective covariates to predict soil carbon stock (Fig. 1). These covariates were selected based on factors that are known to correlate with soil carbon stocks, such as vegetation and soil moisture (Jobbagy and Jackson 2000; McBratney et al. 2003; Seneviratne et al. 2010; Wang et al. 2021). Data sources encompass Sentinel-1 and Sentinel-2 imagery, a digital elevation map, and ISRIC SoilGrids (Escadafal 1989; Nellis and Briggs 1992; Marsett et al. 2006; Van Doninck et al. 2012; Zakharov et al. 2020; Wang et al. 2021; Poggio et al. 2021). Relevant covariates were extracted for points on a 0.001 by 0.001 degree grid which corresponds to a ~10 m by ~10 m grid in the designated sampling areas (McBratney et al. 2003). To approximate vegetation, both the transformed vegetation index (TVI) (Nellis and Briggs 1992) and the soil adjusted total vegetation index (SATVI) from Sentinel-2 were utilized (Marsett et al. 2006). In order to assess soil moisture with Sentinel-1, the volumetric soil moisture (VSM) following Zakharov et al. (2020) was used. Covariates for SOC encompassed Sentinel-2 spectral images, the shortwave infrared (SWIR) bands (B11 and B12), and the second brightness index (BI2) (Escadafal 1989; Wang et al. 2021). From ISRIC SoilGrids, relevant covariates such as clay content, cation exchange capacity, and bulk density were extracted (Poggio et al. 2021). This approach to extracting covariates relevant for predicting SOC content is globally applicable and modular, i.e., it can take up more (local) data sources and covariates when available. In order to design a sampling scheme for each site, the fields were divided into a grid of ~10 m resolution. Each grid point becomes a potential sampling location (SI Fig. 1). Subsequently, all covariates were retrieved for each grid point. Additional details on the covariates can be found in Supplemental Information (SI) Table 1.
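As an illustration of how such index covariates are derived from band reflectances, the sketch below uses commonly cited forms of TVI, SATVI, and BI2. It is a hedged example, not the authors' R implementation: the exact formulas and band choices used in SoilCASTOR are documented in SI Table 1, and the band mapping (green = B3, red = B4, NIR = B8, SWIR1 = B11, SWIR2 = B12), the soil adjustment factor L = 0.5, and the clamping of negative values under the square root are assumptions made here.

```python
import math

def ndvi(nir, red):
    """Normalized difference vegetation index from NIR and red reflectance."""
    return (nir - red) / (nir + red)

def tvi(nir, red):
    """Transformed vegetation index, commonly written as sqrt(NDVI + 0.5).

    The max() guard against negative arguments is an assumption of this sketch.
    """
    return math.sqrt(max(0.0, ndvi(nir, red) + 0.5))

def satvi(swir1, swir2, red, soil_factor=0.5):
    """Soil adjusted total vegetation index (commonly cited form, L = 0.5 assumed)."""
    return ((swir1 - red) / (swir1 + red + soil_factor)) * (1 + soil_factor) - swir2 / 2

def bi2(red, green, nir):
    """Second brightness index: root mean square of red, green, and NIR reflectance."""
    return math.sqrt((red ** 2 + green ** 2 + nir ** 2) / 3)

# Hypothetical surface reflectances (0-1 scale) for one grid point.
point = {"green": 0.08, "red": 0.10, "nir": 0.40, "swir1": 0.30, "swir2": 0.20}
covariates = {
    "TVI": tvi(point["nir"], point["red"]),
    "SATVI": satvi(point["swir1"], point["swir2"], point["red"]),
    "BI2": bi2(point["red"], point["green"], point["nir"]),
}
```

In practice these functions would be evaluated for every grid point, yielding one column per index in the covariate table.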
Data sources were cleaned to exclude covariates with insufficient coverage. Covariates were excluded when they were available for less than 99% of the potential sampling points. This yielded a total of 45 variables. For the retained variables, missing values were imputed with the median value of the covariate. These missing values occur only in a few cases, and imputation is needed to avoid the removal of valuable covariates due to single missing data points.
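The cleaning rule described above (drop covariates with less than 99% coverage, then impute remaining gaps with the median) can be sketched as follows. This is a minimal illustration rather than the authors' R code; the table layout (a dictionary mapping covariate names to lists of values, with None marking a gap) and the toy covariate names are hypothetical.

```python
from statistics import median

def clean_covariates(table, min_coverage=0.99):
    """Drop sparsely covered covariates and median-impute the rest.

    `table` maps a covariate name to a list of values, one per potential
    sampling point, with None marking a missing value (hypothetical layout).
    """
    cleaned = {}
    for name, values in table.items():
        present = [v for v in values if v is not None]
        coverage = len(present) / len(values)
        if coverage < min_coverage:
            continue  # available for less than 99% of the points: exclude
        fill = median(present)
        cleaned[name] = [fill if v is None else v for v in values]
    return cleaned

# Toy example with 200 potential sampling points per covariate.
toy = {
    "clay": [20.0] * 199 + [None],     # 99.5% coverage: kept and imputed
    "vsm": [0.3] * 150 + [None] * 50,  # 75% coverage: dropped
}
result = clean_covariates(toy)
```

Using the median rather than the mean keeps the imputed value insensitive to outliers in the covariate.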
2.1.2 Sample location selection using cLHS
Conditioned Latin hypercube sampling (cLHS) was used to select optimal locations of field measurements (Minasny and McBratney 2006; Brus 2019; Saurette et al. 2022; Fig. 1). With the cLHS, a subset of the potential sampling locations is selected using a stratified random procedure based on the multivariate distribution of the covariates (SI Fig. 1). The asset of this method is that, for example, two points that are similar in the multidimensional covariate space are not both selected for sampling. This allows for a lower sampling density than classical grid-based sampling. The selected sampling points by cLHS effectively capture the range of covariates of the plot. The optimal sampling density of the cLHS was evaluated by testing the effects of different sampling density on uncertainty in SOC prediction (see Section 2.1.5).
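To make the selection principle concrete, the sketch below implements a strongly simplified variant of cLHS. Each covariate is split into as many equal-probability strata as there are samples, and the sample set is improved by random swaps until each stratum holds close to one sample. This is an assumption-laden illustration: the published algorithm (Minasny and McBratney 2006) uses simulated annealing and additional objective terms for correlations and categorical covariates, whereas this version only accepts non-worsening swaps. All function names and the toy covariates are hypothetical.

```python
import random

def quantile_edges(values, n):
    """Edges that split `values` into n equal-probability strata."""
    ordered = sorted(values)
    return [ordered[(k * len(ordered)) // n] for k in range(1, n)]

def stratum(value, edges):
    """Index (0..n-1) of the stratum that `value` falls in."""
    return sum(value >= e for e in edges)

def objective(sample_idx, strata):
    """Sum over covariates and strata of |count - 1|; 0 is a perfect Latin hypercube."""
    n = len(sample_idx)
    total = 0
    for cov_strata in strata:
        counts = [0] * n
        for i in sample_idx:
            counts[cov_strata[i]] += 1
        total += sum(abs(c - 1) for c in counts)
    return total

def simple_clhs(covariates, n_samples, n_iter=2000, seed=0):
    """Pick n_samples point indices by random swaps that never worsen the objective."""
    rng = random.Random(seed)
    n_points = len(covariates[0])
    strata = []
    for values in covariates:
        edges = quantile_edges(values, n_samples)
        strata.append([stratum(v, edges) for v in values])
    current = rng.sample(range(n_points), n_samples)
    best = objective(current, strata)
    for _ in range(n_iter):
        new_point = rng.randrange(n_points)
        if new_point in current:
            continue
        candidate = list(current)
        candidate[rng.randrange(n_samples)] = new_point
        score = objective(candidate, strata)
        if score <= best:
            current, best = candidate, score
    return sorted(current), best

# Two hypothetical covariates on 500 potential sampling points.
gen = random.Random(1)
elevation = [gen.uniform(200, 260) for _ in range(500)]
moisture = [gen.uniform(0.1, 0.4) for _ in range(500)]
picked, score = simple_clhs([elevation, moisture], n_samples=10)
```

A lower final score means the selected points cover the range of every covariate more evenly, which is the property that allows a lower sampling density than a regular grid.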
2.1.3 Field measurements
Field samples were taken at a depth between 0 and 30 cm with an open spiral soil auger, and the SOC was measured using the AgroCares near-infrared (NIR) scanner (AgroCares 2022) in Iowa and Arkansas. A single sample was taken per location. The NIR scanner was trained on a dataset of ~18,000 lab-based measured samples using a one-dimensional convolutional neural network (Tsakiridis et al. 2020; Tsimpouris et al. 2021; Yang et al. 2020a, b). The exact sampling locations were given as XY-coordinates, provided by the cLHS method. If no suitable spot was found at the location, the field sampler could deviate up to 2 m around the point. If sampling was still not possible, the sample location was skipped. Sampling was done on bare land where possible, and plant debris was removed if necessary. Stones larger than 2 mm and roots were removed (Van Der Voort et al. 2019; Walthert et al. 2002). The sample was not dried before the measurement. After thorough mixing, the soil sample was scanned with the NIR scanner, and the data were immediately transferred digitally. Bulk density was not determined in the field but estimated from soil organic matter and clay content using a pedotransfer function calibrated on arable soils in the Netherlands (Commissie Bemesting Akkerbouw en Vollegrondsgroententeelt 2022). Additional details on the sampling procedure can be found in the SI.
2.1.4 Modeling carbon stocks
Model selection for SOC stock estimates
In order to predict the soil carbon content for each point in the ~10 × ~10 m grid, machine learning (ML) models were built using the field measurements of SOC (%) measured with the NIR scanner and the covariates. In order to ascertain that the optimal model was used, we evaluated the use of two model target variables and a range of ML models and data transformations.
The two target variables that were evaluated are firstly the SOC (%) of the NIR scanner and secondly the difference between the measured SOC (%) and ISRIC SoilGrids SOC (%) (hereafter referred to as SOCdif). The SoilGrids SOC is the SOC as predicted by a global model (Poggio et al. 2021). The SoilGrids SOC content for 0–30 cm was calculated from the SOC content of 0–5 cm, 5–15 cm, and 15–30 cm, weighted by the thickness of the three soil layers. The rationale of using SOC as a target variable is that the model optimizes for the locally measured SOC values. The rationale of using SOCdif as target variable is that the NIR scanner can capture local heterogeneity of SOC and thereby fine-tune the global estimates of SOC. Models were built on the farm level. The tested algorithms are linear regression, partial least square regression, ridge regression, lasso regression, elastic net regression, decision trees, and random forest regression (Bivand et al. 2013). The applied data transformations on the target variable are log-transformation, box-cox transformation, standardization (value minus mean divided by the standard deviation), and no transformation. Subsequently, tenfold cross-validation on all NIR measurements (n=205) was used to evaluate the performance of each model in the form of the root mean square error (RMSE) of SOC in the validation datasets. The optimal approach was selected after comparing the performance (RMSE) of the various model options differing in target variable, algorithm, and data transformation. With the best model (lowest RMSE), SOC (%) was predicted for all grid points.
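The two building blocks of this step, the thickness-weighted SoilGrids SOC for 0–30 cm and the selection of the model with the lowest cross-validated RMSE, can be sketched as below. This is an illustration under stated assumptions, not the authors' pipeline: only two toy candidate models (a mean predictor and a one-covariate least-squares line) stand in for the seven algorithms and four transformations, the data are synthetic, and squared errors are pooled over all folds before taking the root. All names are hypothetical.

```python
import math
import random

LAYERS_CM = {"0-5": 5, "5-15": 10, "15-30": 15}  # layer thickness in cm

def soilgrids_soc_0_30(soc_by_layer):
    """Thickness-weighted mean SOC (%) over 0-30 cm."""
    total = sum(LAYERS_CM.values())
    return sum(soc_by_layer[k] * t for k, t in LAYERS_CM.items()) / total

def fit_mean(xs, ys):
    """Baseline model: always predict the training mean."""
    m = sum(ys) / len(ys)
    return lambda x: m

def fit_line(xs, ys):
    """Ordinary least squares on a single covariate."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx if sxx else 0.0
    return lambda x: my + slope * (x - mx)

def kfold_rmse(fit, xs, ys, k=10, seed=0):
    """k-fold cross-validated RMSE, pooling squared errors over all folds."""
    idx = list(range(len(xs)))
    random.Random(seed).shuffle(idx)
    sq_err = 0.0
    for fold in (idx[i::k] for i in range(k)):
        held_out = set(fold)
        train = [i for i in idx if i not in held_out]
        model = fit([xs[i] for i in train], [ys[i] for i in train])
        sq_err += sum((model(xs[i]) - ys[i]) ** 2 for i in fold)
    return math.sqrt(sq_err / len(xs))

# SoilGrids SOC for 0-30 cm from three hypothetical layer values (%).
soc_global = soilgrids_soc_0_30({"0-5": 2.0, "5-15": 1.5, "15-30": 1.0})

# Synthetic covariate x and target y (e.g., SOCdif, taken here as measured minus SoilGrids).
gen = random.Random(42)
xs = [gen.uniform(0, 1) for _ in range(60)]
ys = [0.8 * x - 0.3 + gen.gauss(0, 0.05) for x in xs]

candidates = {"mean": fit_mean, "line": fit_line}
scores = {name: kfold_rmse(fit, xs, ys) for name, fit in candidates.items()}
best_model = min(scores, key=scores.get)
```

Because every candidate is scored on the same folds, the comparison reflects differences between models rather than differences between data splits.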
2.1.5 Estimating carbon stock with uncertainties
Conversion from SOC content to C stock
In order to obtain carbon stock estimates for the individual fields, the soil carbon content of the top 30 cm (g C kg−1) was converted to a carbon stock per grid cell (g C per 100 m2) by multiplying the soil C content (g C kg−1) with the bulk density (kg m−3), the depth of the soil (m), and the area of the grid cell (100 m2). This amount was then converted to the unit of 10,000 m2, or one hectare, to align with carbon credit certification protocols (Black et al. 2020). Finally, the field and farm C stock (in ton C) was calculated as the sum of the soil C of all grid cells located within the field or farm, respectively.
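The unit conversion described above can be written out as a short worked example. The input values (10 g C kg−1, a bulk density of 1300 kg m−3) are hypothetical and chosen only to make the arithmetic easy to follow; the function names are likewise illustrative.

```python
def cell_stock_g(soc_g_per_kg, bulk_density_kg_m3, depth_m=0.30, cell_area_m2=100.0):
    """Carbon stock of one grid cell in g C: content x bulk density x depth x area."""
    return soc_g_per_kg * bulk_density_kg_m3 * depth_m * cell_area_m2

def stock_t_per_ha(stock_g, cell_area_m2=100.0):
    """Convert a grid-cell stock (g C) to t C per hectare (10,000 m2)."""
    return stock_g / 1e6 * (10_000.0 / cell_area_m2)

def field_stock_t(cell_stocks_g):
    """Field or farm stock in t C: sum over all grid cells within its boundary."""
    return sum(cell_stocks_g) / 1e6

# One hypothetical cell: 10 g C/kg, 1300 kg/m3, 0.30 m depth, 100 m2.
one_cell = cell_stock_g(10.0, 1300.0)   # about 390,000 g C
per_ha = stock_t_per_ha(one_cell)       # about 39 t C per hectare

# A hypothetical 1-ha field made of 100 identical cells.
field_total = field_stock_t([one_cell] * 100)  # about 39 t C
```

With these inputs, the per-hectare value falls within the range of stocks reported for the studied fields.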
Estimate of uncertainty associated with scanner error
Estimates of carbon stock are unavoidably associated with errors. To assess a change in soil carbon content, e.g., in the context of carbon farming for carbon credits, it is crucial to quantify these errors (Minasny et al. 2017; Black et al. 2020). Here, we assessed the error attributed to the field measurements of SOC with the NIR scanner and quantified how the error propagates when estimating the carbon stock. Based on previous validation studies with >18,000 independent sample locations all over the world, we conservatively assume that a single SOC measurement of the NIR scanner is associated with a 30% error (i.e., the error follows a normal distribution with SD 30%; thus, the majority (68.3%) of samples is associated with an error ranging between −30% and +30%) over the range of 5 to 100 g C kg−1. This is a conservative assessment; in reality, the error can be lower (see SI for details). The effect of the NIR scanner error on the C stock estimate was quantified with Monte Carlo simulations, which is in line with current carbon credit certification protocols (Black et al. 2020). The Monte Carlo simulation was applied to the NIR field measurements (n=205) by adding a random error to the SOC (mean 0%, SD 30%) 100 times. For each iteration, the whole C stock estimation procedure (i.e., model selection, grid-level SOC prediction with the best model, and calculation of C stock on field and farm level) was repeated. Subsequently, the uncertainty range of the field-level and farm-level C stock was quantified. Additional details can be found in the SI.
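The propagation of the scanner error can be illustrated with a reduced Monte Carlo sketch. Unlike the procedure above, which repeats model selection and grid-level prediction in every iteration, this example simply uses the mean of the perturbed measurements as a stand-in for the modeled field SOC. The bulk density, the SOC values, and the clamping of negative perturbed values at zero are assumptions made for this illustration.

```python
import random
from statistics import mean, stdev

def monte_carlo_stock(soc_g_per_kg, bulk_density_kg_m3, depth_m=0.30,
                      rel_sd=0.30, n_iter=100, seed=0):
    """Return n_iter stock estimates (t C/ha) under a relative scanner error.

    Each measurement is multiplied by (1 + e) with e ~ Normal(0, rel_sd).
    The field SOC is approximated by the mean of the perturbed values,
    a simplification of the full model-based procedure.
    """
    rng = random.Random(seed)
    stocks = []
    for _ in range(n_iter):
        perturbed = [max(0.0, s * (1 + rng.gauss(0, rel_sd))) for s in soc_g_per_kg]
        # g C/kg * kg/m3 * m = g C/m2; times 10,000 m2/ha and 1e-6 t/g gives a factor 0.01
        stocks.append(mean(perturbed) * bulk_density_kg_m3 * depth_m * 0.01)
    return stocks

# Thirty hypothetical scanner readings with a mean of 10 g C/kg.
readings = [8.0, 9.0, 10.0, 11.0, 12.0] * 6
simulated = monte_carlo_stock(readings, bulk_density_kg_m3=1300.0)
stock_mean = mean(simulated)
stock_sd = stdev(simulated)
stock_range = (min(simulated), max(simulated))
```

Because the errors of individual measurements partly cancel when they are combined, the relative spread of the simulated stocks is much smaller than the 30% assumed for a single reading.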
Evaluation of sampling density
To explore the optimum sampling density of the on-field soil spectroscopy measurements, the field-level carbon stock was estimated with varying numbers of sampling points. By reducing the sampling density to the minimum, the costs associated with carbon farming can equally be minimized (Kragt et al. 2012; Tang et al. 2016). For each field, a fraction of the cLHS-derived sampling points was randomly selected from the full set of sampling points of the field. The tested fractions were 10, 20, 30, 40, 50, 60, 70, 80, and 90%. This approach ensures that the minimum sampling density for a field can be determined in a region where prior knowledge of the SOC in the neighboring fields is reasonably available. In other words, to evaluate the optimum sampling density of field A, data of the other fields of the same farm (i.e., fields B–E) were leveraged to build the machine learning model. A random error of 30% on the measured SOC was added for all simulations. For each tested fraction, the sampling points were randomly drawn 100 times, and 100 different field-level carbon stock estimates were computed. To evaluate the appropriateness of that sampling density, the relative error (coefficient of variation, CV, in percent) of those 100 carbon stocks was calculated. A low CV indicates that the field carbon stock estimate is similar among different subsets of sampling points, i.e., that the carbon stock can be estimated robustly with that sampling density. A flowchart exemplifying these steps can be found in the SI.
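The subsampling experiment can be sketched in reduced form as below. As a simplifying assumption, the field estimate is the mean of the noisy subsample rather than a machine learning prediction trained on neighboring fields. With a fixed bulk density and depth, the stock is proportional to the mean SOC, so the CV of the mean equals the CV of the stock. The synthetic SOC values and function names are hypothetical.

```python
import random
from statistics import mean, stdev

def cv_by_fraction(soc_values, fractions, n_rep=100, rel_sd=0.30, seed=0):
    """CV (%) of the field estimate for each fraction of retained sampling points."""
    rng = random.Random(seed)
    result = {}
    for frac in fractions:
        size = max(2, round(frac * len(soc_values)))
        estimates = []
        for _ in range(n_rep):
            subset = rng.sample(soc_values, size)  # random subset of the points
            noisy = [max(0.0, s * (1 + rng.gauss(0, rel_sd))) for s in subset]
            estimates.append(mean(noisy))          # stand-in for the modeled stock
        result[frac] = 100 * stdev(estimates) / mean(estimates)
    return result

# Fifty hypothetical SOC readings (g C/kg) for one field.
gen = random.Random(7)
field_soc = [gen.uniform(6.0, 16.0) for _ in range(50)]
fractions = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]
cv = cv_by_fraction(field_soc, fractions)
```

Plotting the CV against the sampling fraction shows where the curve flattens, which is the point beyond which additional samples add little.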
3 Results and discussion
3.1 Carbon stock assessment protocol
3.1.1 Covariates
The covariates encompass indicators of soil carbon content (e.g., vegetation indexes and soil moisture, Van Doninck et al. 2012; Escadafal 1989; Marsett et al. 2006; Nellis and Briggs 1992; Wang et al. 2021; Zakharov et al. 2020). However, additional parameters, related to, e.g., the land management, ground water level, and even fertilizer level, are not included, even though they could be potentially important covariates for soil carbon stock (Minasny et al. 2017). The method here is modular, i.e., it could absorb and utilize other covariates when available. Potentially, SoilCASTOR can improve when additional covariates are included. More research is needed to ascertain this.
3.1.2 cLHS Sampling design and sampling density
The cLHS method and the required sampling density were evaluated by comparing model performance (expressed as the coefficient of variation, CV, in percent) across a range of sampling densities (0.1–1.0 samples per hectare) (Fig. 3) (Bivand et al. 2013; Brus 2019; Minasny and McBratney 2006). By evaluating which sampling density is required to get optimal results, efficient and effective sampling campaigns for carbon farming can be set up. A major cost component is the labor investment of fieldwork (Gobrecht et al. 2014; de Gruijter et al. 2016). If model performance plateaus at a certain sampling density, additional sampling is not necessary. In total, 205 field samples were taken, 141 in Arkansas and 64 in Iowa (Fig. 2). Details on the number of samples per field can be found in the SI. Results show that the cLHS sampling-based model results reach their optimum (lowest CV %) at a sampling density of around 0.5 samples per ha for the majority of the fields (Fig. 3). In other words, about one sample per 2 ha is required to achieve a robust estimate of the farm C stock with a deviation of less than 5%. Additional measurements do not add much to the improvement of the model. This implies that with a relatively low labor time investment (15–30 min per hectare), large areas can be covered, overcoming a key obstacle in the implementation of carbon farming (Evans et al. 2015; Tang et al. 2016). Although cLHS is established in soil mapping (Yang et al. 2016, 2020b), direct comparisons of cLHS and grid-based studies are rare and require more extensive research (Saurette et al. 2022).
3.1.3 Fieldwork and the NIR-based scanner
NIR-based scanners lend themselves to carbon stock analysis because they allow for high throughput (Bellon-Maurel and McBratney 2011). However, they are impeded by high associated errors (Bellon-Maurel and McBratney 2011; Gobrecht et al. 2014). For this project, the AgroCares HandHeld NIR scanner was used, which leverages a measurement dataset exceeding 18,000 samples (AgroCares 2022). A conservative estimate of the error (mean 0%, SD 30%) for the range of 5 to 100 g C kg−1 is assumed. This error was propagated and resulted in a range of expected stocks. For example, for field A, the average stock is 20.1 tC ha−1 within a range of 18.0–22.4 tC ha−1 (details in Section 2.1.5). Errors on GPS locations are minimal (max 2 m) as the sample is measured in the field and the data are entered directly and automatically associated with the correct location. Additional information on the HandHeld scanner can be found in SI 4. The SoilCASTOR protocol is also modular when it comes to the implementation of the scanner; thus, if better-performing NIR scanners become available, they can be implemented.
3.1.4 Machine learning model
The performance of the range of ML models (linear regression, partial least square regression, ridge regression, lasso regression, elastic net regression, decision trees, and random forest regression), transformations (log-transformation, box-cox transformation, standardization, and no transformation, Bivand et al. 2013), and target variables (SOC and SOCdif, Poggio et al. 2021) was compared (Fig. 4 and Table 1). The optimally performing algorithm was the random forest regression model with a box-cox transformation on the target variable SOCdif (root mean square error, RMSE = 0.240, r2 = 0.76). The random forest scored highest for both the target variable SOC% and the residual (difference between measured SOC% and ISRIC SoilGrids) (Fig. 4 and Table 1). The top-five performing models have an RMSE ranging from 0.240 to 0.253 and an r2 from 0.76 to 0.73, respectively (Table 1). This approach of running and evaluating multiple machine learning models, transformations, and target variables is comprehensive (Fig. 4) and allows for the selection of the model which is optimal in that instance (Padarian et al. 2020; Khaledian and Miller 2020). When the protocol is applied to other regions or datasets, other models may perform better, in which case the best-performing model will be automatically selected.
Overview of tested machine learning (ML) models, transformations, and both target variables, evaluated by r2 and root mean square error (RMSE). Carbon prediction means SOC as target variable; residual prediction refers to SOCdif. Abbreviation dt is for decision trees, elastic for elastic net regression, lm for linear regression, pls for partial least square regression, rf for random forest, and ridge for ridge regression. Transformations are box-cox (red); log-transformation (blue); none (green); and standardized (yellow). Standardized is calculated as (value − mean)/standard deviation.
3.1.5 Carbon stocks and variability
The carbon stocks of the Arkansas and Iowa farms in the top 30 cm range from 32.8–38.7 and 25.4–29.6 tC ha−1, respectively (Table 2). Carbon stocks in the fields for the same depth interval range between 14 and 84 tC ha−1, with a mean of 34 and a median of 30 tC ha−1 (Table 2). This puts the stocks in the range found in other studies (Jobbagy and Jackson 2000; Köchy et al. 2015; Nussbaum et al. 2014). There are significant inter- and intra-field differences (Fig. 5). The carbon stocks were determined per farm, per field, and per hectare (Table 2). The carbon stocks per farm differ strongly, with Arkansas having a higher stock per ha (35 tC ha−1) and holding nearly double the total carbon stock (5059 tC) as compared to Iowa with a lower stock per ha (28 tC ha−1) and lower total stock (2630 tC). When looking at individual fields, the ranges are even greater. Field C in Arkansas has the lowest stock per ha with an average of ~19 tC ha−1, contrasted by field B in Arkansas with the highest stock per ha at ~55 tC ha−1. The coefficient of variation (CV; standard deviation divided by the mean), a metric to evaluate the variability, is 4.3% for the Iowa and 5.0% for the Arkansas farm. The CV of the individual field-level C stock is slightly higher than for the composed farms and ranges between 5.4 and 9.4%. This shows that in order to gain a robust understanding of carbon stock dynamics, and in particular the element of carbon leakage, fields need to be individually assessed (Black et al. 2020, FAO 2020). These values are still below the thresholds established by most accreditation protocols (Black et al. 2020). The method captures small-scale (tens of meters) variability which can be matched with field-based assessments. The variability is strongly dependent on the field (Fig. 6). The bulk density was derived using the pedotransfer function (Commissie Bemesting Akkerbouw en Vollegrondsgroententeelt 2022).
Potential errors on bulk density estimates were not available and were therefore not propagated. NIR measurements could potentially also be used to determine bulk density, but it is not yet known how robust these results would be. More work is needed to ascertain this (Bellon-Maurel and McBratney 2011). To our knowledge, the SoilCASTOR approach is novel in the way it combines multiple data sources and comprehensive machine learning and offers robust soil carbon stock estimates.
Overview of carbon stocks (tC ha−1) for all fields. Subplots indicate for (a) field A, (b) field B, (c) fields D, C, and E (clockwise), (d) field F, and (e) fields H (left) and G (right). Note the total area in fields ranges from 10 to 64 ha. In addition, legends differ per field for visual clarity.
3.2 Upscaling and integration into carbon farming
In order to be used effectively for soil carbon stock-derived carbon credits, quantification methods need to be widely applicable, compatible with carbon credit requirements and find the balance between cost and benefit for the land user.
3.2.1 Applicability on the globe and across a range of scales
The covariates used as inputs for SoilCASTOR steps one (selection of covariates) and two (sampling location selection) are globally available. However, additional regional datasets may be available, e.g., for the EU (Tóth et al. 2013). In other geographic areas, the available covariates may therefore differ from the situation here. The set-up of the model is modular, meaning that if richer datasets are available, they can be incorporated. This may, however, have implications for appropriate sampling densities, relevant covariates, and target variables and therefore needs to be assessed separately. Additional studies on farms in other regions and countries are needed in order to evaluate the optimal sampling density and associated accuracy. Furthermore, cLHS-powered and traditional grid-based soil carbon sampling strategies should be compared in order to identify the most robust and scalable solutions (Saurette et al. 2022). Scalability could be further limited because SoilCASTOR requires that a field analyst goes to the field and takes samples.
3.2.2 Carbon credit requirements
In order to transform the carbon monitoring data so it can be reported and verified by an independent organization, it is necessary to propagate error. Furthermore, it is key that there is a distinguishable difference between the initial and altered carbon storage of the soil (Black et al. 2020; Verra VCS 2020). Within this study, we followed the approach of Verra VCS VMD0053 on the model calibration (Verra VCS 2020). The error propagation gave a range for each field or farm (e.g., Arkansas farm, average stock ~35 tC ha−1, ranging from ~33–39 tC ha−1). Impacts of adjusted land management (Lessmann et al. 2022) need to be large enough (e.g., bigger than 0.5 tC ha−1 per year) over a 5-year period in order to cause a measurable and discernible difference over time. Within carbon certification processes, there are penalties for high uncertainties (Black et al. 2020; Verra VCS 2020). The present research project covers only a single time point, and a time series with samples taken in the same fashion is necessary in order to assess changes in soil carbon over a period (Van Der Voort et al. 2019). Increased carbon stocks also need to remain stored for more than a transient time period. In other words, permanence must be asserted (Oldfield et al. 2022a). In this context, it is crucial to consider the turnover times of soil organic matter (of added carbon) in both the top and deep soil (Van der Voort et al. 2016, 2019). Turnover, in particular of labile compounds, can be rapid, leading to a positive bias of mitigation measures designed to store carbon in soil (van der Voort et al. 2017; Berthelin et al. 2022). In order to effectively develop impactful land use management changes, soil science and biogeochemically driven modules that can forecast the impacts of these changes on soil carbon stocks should be incorporated into SoilCASTOR.
Furthermore, robust time-series sampling would need to be undertaken 5 to 10 years from now to evaluate the intermediate changes (Van Der Voort et al. 2019). Such modules could, e.g., build on RothC (Jenkinson and Coleman 2008) and utilize radiocarbon to constrain both decadal and millennial carbon turnover (Graven 2015; Galvez et al. 2020).
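The role of turnover time in permanence can be sketched with a single-pool, first-order model. This is a deliberate simplification and not RothC, which distinguishes several carbon compartments and rate modifiers. The function names and all parameter values below are hypothetical and serve only as an illustration.

```python
import math


def one_pool_stock(t_years, c0, annual_input, turnover_time):
    """Stock at time t for dC/dt = annual_input - C / turnover_time.

    c0 in tC/ha, annual_input in tC/ha per year, turnover_time in years.
    """
    if turnover_time <= 0:
        raise ValueError("turnover_time must be positive")
    steady_state = annual_input * turnover_time
    return steady_state + (c0 - steady_state) * math.exp(-t_years / turnover_time)


def fraction_remaining(t_years, turnover_time):
    """Fraction of a one-off carbon addition still present after t years."""
    if turnover_time <= 0:
        raise ValueError("turnover_time must be positive")
    return math.exp(-t_years / turnover_time)


# Illustrative comparison of a labile and a more stable pool after 20 years.
for tau in (5.0, 50.0):  # hypothetical turnover times in years
    print(tau, fraction_remaining(20.0, tau))
```

In this simplified view, carbon added to a pool with a short turnover time is largely respired within a crediting period, which is the source of the positive bias discussed above.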
3.2.3 The carrot and stick of carbon credits
Actively induced changes in land-use practices (additionality) have been shown to positively impact carbon stocks (Lessmann et al. 2022). However, in order to incentivize land users to implement carbon farming, the investment must be offset by the carbon credit value (Kragt et al. 2012; Tang et al. 2016). The SoilCASTOR calculations can be done rapidly and at low computational cost, but a time investment of ~15 min ha−1 remains a source of cost for carbon farming. However, proximal sensing data from the field remain a requirement for numerous carbon verification projects (Black et al. 2020). Nonetheless, by eliminating wet chemistry measurements, carbon farming may become more accessible to a range of farmers (Tang et al. 2016). More research is needed in this direction in order to ascertain that carbon farming is feasible with the socio-economic toolboxes that are present.
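The field effort implied by these figures can be estimated with simple arithmetic. The sketch below combines the time investment of ~15 min ha−1 with the lowest sampling density of 0.5 sample per hectare reported for farms of 20 to 150 hectares. The function name `field_effort` is hypothetical, and travel time and labor rates are not considered.

```python
import math

SAMPLES_PER_HA = 0.5    # lowest robust sampling density reported in this paper
MINUTES_PER_HA = 15.0   # approximate field time reported in this paper


def field_effort(area_ha, samples_per_ha=SAMPLES_PER_HA, minutes_per_ha=MINUTES_PER_HA):
    """Return (number of samples, field hours) for a farm of a given area."""
    if area_ha <= 0:
        raise ValueError("area_ha must be positive")
    n_samples = math.ceil(area_ha * samples_per_ha)
    hours = area_ha * minutes_per_ha / 60.0
    return n_samples, hours


for area in (20, 150):  # smallest and largest farm sizes considered
    print(area, field_effort(area))
```

Multiplying the resulting hours by a local labor rate would give a first-order cost estimate to set against the expected carbon credit value.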
4 Conclusion and outlook
This paper presents the SoilCASTOR method, which can be applied to robustly determine the carbon stock for a range of (agricultural) soil types with a relatively low measurement cost and time investment. The method presents a novel approach and leverages satellite data, field-based NIR-scanner measurements of SOC, and machine learning to obtain optimal estimates of soil carbon stock. Therefore, it can be widely instrumentalized to assess potential changes in carbon stocks in the framework of carbon farming. The carbon stock in the top 30 cm for the fields analyzed ranges between 19 and 55 tC ha−1, and the stock could be determined with a spatial precision of up to 10 m.
As an outlook, it will be key to include this method in certified soil carbon sequestration offsetting protocols, so it can be fully integrated in MRV (Black et al. 2020). It would especially be key to connect it to a module which can give advice on how to increase stocks (e.g., leveraging RothC; Coleman and Jenkinson 2014). Additionally, it would be insightful if it were applied to a wider range of fields (e.g., grasslands) and over a number of years (e.g., resampling after 3–5 years). Also, it would be helpful to directly compare cLHS sampling to a grid-based sampling strategy in the context of carbon stocks (Saurette et al. 2022). Uncertainty estimates could be improved when the uncertainty range for bulk density estimates becomes available. Furthermore, it could be investigated whether this approach would also be appropriate for regional approaches (including multiple farms) instead of the current single farm-level focus. Another key element which can be further investigated is the maximum level up to which soil carbon stocks can be increased (Stewart et al. 2007; Castellano et al. 2015).
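The influence of bulk density uncertainty on the stock estimate can be sketched as follows. The sketch uses the standard mass-balance relation between SOC content, bulk density, and layer depth, and a first-order propagation of independent relative errors. It ignores coarse fragments and is not necessarily identical to the SoilCASTOR calculation; the function names and example values are hypothetical.

```python
import math


def carbon_stock(soc_g_per_kg, bulk_density_g_cm3, depth_cm=30.0):
    """Carbon stock in tC/ha for a soil layer.

    The factor 0.1 converts (g/kg) * (g/cm3) * cm to tC/ha.
    """
    return soc_g_per_kg * bulk_density_g_cm3 * depth_cm * 0.1


def stock_standard_error(stock, rel_se_soc, rel_se_bd):
    """First-order standard error of a product of two independent factors."""
    return stock * math.sqrt(rel_se_soc ** 2 + rel_se_bd ** 2)


# Illustrative values only (not measurements from this study):
stock = carbon_stock(soc_g_per_kg=10.0, bulk_density_g_cm3=1.4)
print(stock, stock_standard_error(stock, rel_se_soc=0.05, rel_se_bd=0.12))
```

Because the relative errors add in quadrature, the larger of the two terms dominates the result, so a narrower bulk density uncertainty would tighten the stock estimate only when that term is the larger one.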
Data availability
The data is available in a csv file in the supplemental information.
Code availability
The code is available upon request by the readers.
References
AgroCares (2022) HandHeld Scanner Agrocares. https://www.agrocares.com/products/scanner/. Accessed 21 Mar 2022
Alexander P, Paustian K, Smith P, Moran D (2015) The economics of soil C sequestration and agricultural emissions abatement. SOIL 1:331–339. https://doi.org/10.5194/SOIL-1-331-2015
Amelung W, Bossio D, de Vries W et al (2020) Towards a global-scale soil climate mitigation strategy. Nat Commun 11:5427–5427. https://doi.org/10.1038/S41467-020-18887-7
Asgari N, Ayoubi S, Jafari A, Demattê JAM (2020) Incorporating environmental variables, remote and proximal sensing data for digital soil mapping of USDA soil great groups. Int J Remote Sens 41:7624–7648. https://doi.org/10.1080/01431161.2020.1763506
Batjes NH (2019) Technologically achievable soil organic carbon sequestration in world croplands and grasslands. Land Degrad Dev 30:25–32. https://doi.org/10.1002/LDR.3209
Bellon-Maurel V, McBratney A (2011) Near-infrared (NIR) and mid-infrared (MIR) spectroscopic techniques for assessing the amount of carbon stock in soils – critical review and research perspectives. Soil Biol Biochem 43:1398–1410. https://doi.org/10.1016/J.SOILBIO.2011.02.019
Berthelin J, Laba M, Lemaire G et al (2022) Soil carbon sequestration for climate change mitigation: mineralization kinetics of organic inputs as an overlooked limitation. Eur J Soil Sci 73:e13221. https://doi.org/10.1111/EJSS.13221
Bivand RS, Pebesma E, Gómez-Rubio V (2013) Applied spatial data analysis with R, 2nd edn. Springer, New York
Black C, Brummit C, Campbell N, DuBuisson M, Harburg D, Matosziuk L, Motew M, Pinjuv G, Smith E (2020) Methodology for improved agricultural land management. Available on: https://verra.org/methodologies/vm0042-methodology-for-improved-agricultural-land-management-v1-0/. Accessed 21 Mar 2022
Boiko O, Kagone S, Senay G (2021) Soil properties dataset in the United States. https://doi.org/10.5066/P9TI3IS8
Bossio DA, Cook-Patton SC, Ellis PW et al (2020) The role of soil carbon in natural climate solutions. Nat Sustain 3:391–398. https://doi.org/10.1038/s41893-020-0491-z
Brus DJ (2019) Sampling for digital soil mapping: a tutorial supported by R scripts. Geoderma 338:464–480. https://doi.org/10.1016/j.geoderma.2018.07.036
Castellano MJ, Mueller KE, Olk DC et al (2015) Integrating plant litter quality, soil organic matter stabilization, and the carbon saturation concept. Glob Chang Biol 21:3200–3209. https://doi.org/10.1111/gcb.12982
Coleman K, Jenkinson DS (2014) RothC-a model for the turnover of carbon in soil model description and users guide. Harpenden. Available on: www.rothamsted.ac.uk/sites/default/files/RothC_guide_DOS.pdf. Accessed 21 Mar 2022
Commissie Bemesting Akkerbouw en Vollegrondsgroententeelt (2022) Soil and soil density profiles. Wageningen. https://www.handboekbodemenbemesting.nl/nl/handboekbodemenbemesting.htm. Accessed 12 Jan 2022
De Gruijter JJ, McBratney AB, Minasny B et al (2016) Farm-scale soil carbon auditing. Geoderma 265:120–130. https://doi.org/10.1016/j.geoderma.2015.11.010
Elkerbout M (2020) The European green deal after corona: implications for EU climate policy.
Escadafal R (1989) Remote sensing of arid soil surface color with Landsat thematic mapper. Adv Space Res 9(1):159–163. https://doi.org/10.1016/0273-1177(89)90481-X
Evans MC, Carwardine J, Fensham RJ et al (2015) Carbon farming via assisted natural regeneration as a cost-effective mechanism for restoring biodiversity in agricultural landscapes. Environ Sci Policy 50:114–129. https://doi.org/10.1016/J.ENVSCI.2015.02.003
FAO (2020) A protocol for measurement, monitoring, reporting and verification of soil organic carbon in agricultural landscapes – GSOC-MRV Protocol. https://doi.org/10.4060/cb0509en
Galvez ME, Fischer WW, Jaccard SL, Eglinton TI (2020) Materials and pathways of the organic carbon cycle through time. Nat Geosci 13:535–546. https://doi.org/10.1038/s41561-
Gobrecht A, Roger JM, Bellon-Maurel V (2014) Major issues of diffuse reflectance NIR spectroscopy in the specific context of soil carbon content estimation: a review. Adv Agron 123:145–175. https://doi.org/10.1016/B978-0-12-420225-2.00004-2
Goovaerts P (1998) Geostatistical tools for characterizing the spatial variability of microbiological and physico-chemical soil properties. Biol Fertil Soils 27:315–334
Graven HD (2015) Impact of fossil fuel emissions on atmospheric radiocarbon and various applications of radiocarbon over this century. Proc Natl Acad Sci 1–4. https://doi.org/10.1073/pnas.1504467112
Guillaume T, Makowski D, Libohova Z et al (2022) Soil organic carbon saturation in cropland-grassland systems: storage potential and soil quality. Geoderma 406:115529. https://doi.org/10.1016/J.GEODERMA.2021.115529
IPCC (2021) Summary for policymakers. In: Masson-Delmotte V, Zhai P, Pirani A, Connors SL, Péan C, Berger S, Caud N, Chen Y, Goldfarb L, Gomis MI, Huang M, Leitzell K, Lonnoy E, Matthews JBR, Maycock TK, Waterfield T, Yelekçi O, Yu R, Zhou B (eds) Climate change 2021: the physical science basis. Contribution of working Group I to the Sixth Assessment Report of the intergovernmental panel on climate change. Cambridge University Press. Available on: https://www.ipcc.ch/report/ar6/wg1/. Accessed 21 Mar 2022
Jenkinson DS, Coleman K (2008) The turnover of organic carbon in subsoils. Part 2. Modelling carbon turnover. Eur J Soil Sci 59:400–413. https://doi.org/10.1111/j.1365-2389.2008.01026.x
Jobbagy EG, Jackson RB (2000) The vertical distribution of soil organic carbon and its relation to climate and vegetation. Ecol Appl 10:423–436
Khaledian Y, Miller BA (2020) Selecting appropriate machine learning methods for digital soil mapping. Appl Math Model 81:401–418. https://doi.org/10.1016/J.APM.2019.12.016
Köchy M, Hiederer R, Freibauer A (2015) Global distribution of soil organic carbon – Part 1: Masses and frequency distributions of SOC stocks for the tropics, permafrost regions, wetlands, and the world. SOIL 1:351–365. https://doi.org/10.5194/SOIL-1-351-2015
Kragt ME, Pannell DJ, Robertson MJ, Thamo T (2012) Assessing costs of soil carbon sequestration by crop-livestock farmers in Western Australia. Agric Syst 112:27–37. https://doi.org/10.1016/J.AGSY.2012.06.005
Lessmann M, Ros GH, Young MD, de Vries W (2022) Global variation in soil carbon sequestration potential through improved cropland management. Glob Chang Biol 28:1162–1177. https://doi.org/10.1111/GCB.15954
Lutzow MV, Kogel-Knabner I, Ekschmitt K et al (2006) Stabilization of organic matter in temperate soils: mechanisms and their relevance under different soil conditions - a review. Eur J Soil Sci 57:426–445. https://doi.org/10.1111/j.1365-2389.2006.00809.x
Marsett RC, Qi J, Heilman P et al (2006) Remote sensing for grassland management in the ARID southwest. Rangel Ecol Manag 59:530–540. https://doi.org/10.2111/05-201R.1
McBratney AB, Mendonça Santos ML, Minasny B (2003) On digital soil mapping. Geoderma 117:3–52. https://doi.org/10.1016/S0016-7061(03)00223-4
Minasny B, Malone BP, McBratney AB et al (2017) Soil carbon 4 per mille. Geoderma 292:59–86. https://doi.org/10.1016/J.GEODERMA.2017.01.002
Minasny B, McBratney AB (2006) A conditioned Latin hypercube method for sampling in the presence of ancillary information. Comput Geosci 32:1378–1388. https://doi.org/10.1016/j.cageo.2005.12.009
Nellis MD, Briggs JM (1992) Transformed vegetation index for measuring spatial variation in drought impacted biomass on Konza Prairie, Kansas. Trans Kansas Acad Sci 95:93. https://doi.org/10.2307/3628024
Nocita M, Stevens A, van Wesemael B et al (2015) Soil spectroscopy: an alternative to wet chemistry for soil monitoring. Adv Agron 132:139–159. https://doi.org/10.1016/BS.AGRON.2015.02.002
Nussbaum M, Papritz A, Baltensweiler A, Walthert L (2014) Estimating soil organic carbon stocks of Swiss forest soils by robust external-drift kriging. Geosci Model Dev 7:1197–1210. https://doi.org/10.5194/GMD-7-1197-2014
Oldfield EE, Eagle AJ, Rubin RL et al (2022a) Crediting agricultural soil carbon sequestration. Science 375:1222–1225. https://doi.org/10.1126/SCIENCE.ABL7991
Oldfield EE, Lavallee JM, Kyker-Snowman E, Sanderman J (2022b) The need for knowledge transfer and communication among stakeholders in the voluntary carbon market. Biogeochem 2022:1–6. https://doi.org/10.1007/S10533-022-00950-8
Padarian J, Minasny B, McBratney AB (2020) Machine learning and soil sciences: a review aided by machine learning tools. SOIL 6:35–52. https://doi.org/10.5194/SOIL-6-35-2020
Poggio L, De Sousa LM, Batjes NH et al (2021) SoilGrids 2.0: producing soil information for the globe with quantified spatial uncertainty. SOIL 7:217–240. https://doi.org/10.5194/SOIL-7-217-
Quine TA, Govers G, Walling DE et al (1997) Erosion processes and landform evolution on agricultural land — new perspectives from caesium-137 measurements and topographic-based erosion modelling. Earth Surf Process Landforms 22:799–816. https://doi.org/10.1002/(SICI)1096-9837(199709)22:9%3c799::AID-ESP765%3e3.0.CO;2-R
RStudio Team (2015) RStudio: integrated development environment for R. Boston, MA. Retrieved http://www.rstudio.com/. Accessed 12 Jan 2022
RStudio (2021) RStudio (2021.09.0). RStudio, PBC. https://rstudio.com/products/rstudio/download/
Sanderman J, Hengl T, Fiske GJ (2017) Soil carbon debt of 12,000 years of human land use. Proc Natl Acad Sci USA 114:9575–9580. https://doi.org/10.1073/PNAS.1706103114
Saurette DD, Berg AA, Laamrani A et al (2022) Effects of sample size and covariate resolution on field-scale predictive digital mapping of soil carbon. Geoderma 425:116054. https://doi.org/10.1016/j.geoderma.2022.116054
Seneviratne SI, Corti T, Davin EL et al (2010) Investigating soil moisture-climate interactions in a changing climate: a review. Earth-Science Rev 99:125–161. https://doi.org/10.1016/j.earscirev.2010.02.004
Shen Z, Ramirez-Lopez L, Behrens T et al (2022) Deep transfer learning of global spectra for local soil carbon monitoring. ISPRS J Photogramm Remote Sens 188:190–200. https://doi.org/10.1016/J.ISPRSJPRS.2022.04.009
Sikora A (2020) European Green Deal – legal and financial challenges of the climate change. ERA Forum 21:681–697. https://doi.org/10.1007/S12027-020-00637-3
Smeaton C, Hunt CA, Turrell WR, Austin WEN (2021) Marine sedimentary carbon stocks of the United Kingdom’s exclusive economic zone. Front Earth Sci 50. https://doi.org/10.3389/FEART.2021.593324
Smith P, Soussana JF, Angers D et al (2020) How to measure, report and verify soil carbon change to realize the potential of soil carbon sequestration for atmospheric greenhouse gas removal. Glob Chang Biol 26:219–241. https://doi.org/10.1111/GCB.14815
Soriano-Disla JM, Janik LJ, ViscarraRossel RA et al (2014) The performance of visible, near-, and mid-infrared reflectance spectroscopy for prediction of soil physical, chemical, and biological properties. Appl Spectrosc Rev 49:139–186. https://doi.org/10.1080/05704928.2013.811081
Spencer S, Ogle SM, Breidt FJ et al (2011) Designing a national soil carbon monitoring network to support climate change policy: a case example for US agricultural lands. Greenh Gas Meas Manag 1:167–178. https://doi.org/10.1080/20430779.2011.637696
Stewart CE, Paustian K, Conant RT et al (2007) Soil carbon saturation: concept, evidence and evaluation. Biogeochemistry 86:19–31. https://doi.org/10.1007/s10533-007-9140-0
Tang K, Kragt ME, Hailu A, Ma C (2016) Carbon farming economics: what have we learned? J Environ Manage 172:49–57. https://doi.org/10.1016/J.JENVMAN.2016.02.008
The European Commission (2022) Sustainable carbon cycles. https://climate.ec.europa.eu/eu-action/forests-and-agriculture/sustainable-carbon-cycles_en. Accessed 30 Nov 2022
Tóth G, Jones A, Montanarella L (2013) LUCAS topsoil survey: methodology, data, and results. Publ off Eur Union. https://doi.org/10.2788/97922
Trontelj Chambers (2021) Machine learning strategy for soil nutrients prediction using spectroscopic method. Sensors 21(12):4208. https://doi.org/10.3390/S21124208
Tsakiridis NL, Keramaris KD, Theocharis JB, Zalidis GC (2020) Simultaneous prediction of soil properties from VNIR-SWIR spectra using a localized multi-channel 1-D convolutional neural network. Geoderma 367:114208. https://doi.org/10.1016/j.geoderma.2020.114208
Tsimpouris E, Tsakiridis NL, Theocharis JB (2021) Using autoencoders to compress soil VNIR–SWIR spectra for more robust prediction of soil properties. Geoderma 393:114967. https://doi.org/10.1016/J.GEODERMA.2021.114967
Van der Voort TS, Hagedorn F, Mcintyre C et al (2016) Variability in 14C contents of soil organic matter at the plot and regional scale across climatic and geologic gradients. Biogeosciences 13:3427–3439. https://doi.org/10.5194/bg-2015-649
Van Der Voort TS, Mannu U, Hagedorn F et al (2019) Dynamics of deep soil carbon - insights from 14C time series across a climatic gradient. Biogeosciences 16:3233–3246. https://doi.org/10.5194/BG-16-3233-2019
Van der Voort TS, Zell CI, Hagedorn F et al (2017) Diverse soil carbon dynamics expressed at the molecular level. Geophys Res Lett. https://doi.org/10.1002/2017GL076188
Van Doninck J, Peters J, Lievens H et al (2012) Accounting for seasonality in a soil moisture change detection algorithm for ASAR Wide Swath time series. Hydrol Earth Syst Sci 16:773–786. https://doi.org/10.5194/HESS-16-773-2012
Van Oost K, Quine TA, Govers G et al (2007) The impact of agricultural soil erosion on the global carbon cycle. Science 318:626–629. https://doi.org/10.1126/science.1145724
Verra VCS (2020) VMD0053 - model calibration, validation, and uncertainty guidance for the methodology for improved agricultural land management. Washington, United States. Available on: https://www.verra.org/wp-content/uploads/imported/methodologies/VMD0053_Model-Calibration-Validation-and-Uncertainty-Guidance-for-the-Methodology-for-Improved-Agricultural-Land-Management.pdf. Accessed 21 Mar 2022
Walthert L, Lüscher P, Luster J, Peter B (2002) Langfristige Waldökosystem-Forschung LWF. Kernprojekt Bodenmatrix. Aufnahmeanleitung zur ersten Erhebung 1994–1999. Swiss Federal Institute for Forest, Snow and Landscape Research WSL, Birmensdorf. https://doi.org/10.3929/ethz-a-004375470
Wang K, Qi Y, Guo W et al (2021) Retrieval and mapping of soil organic carbon using sentinel-2A spectral images from bare cropland in autumn. Remote Sens 13:1072. https://doi.org/10.3390/RS13061072
Yang J, Wang X, Wang R, Wang H (2020a) Combination of convolutional neural networks and recurrent neural networks for predicting soil properties using vis–NIR spectroscopy. Geoderma 380:114616. https://doi.org/10.1016/J.GEODERMA.2020.114616
Yang L, Li X, Shi J et al (2020b) Evaluation of conditioned Latin hypercube sampling for soil mapping based on a machine learning method. Geoderma 369:114337. https://doi.org/10.1016/j.geoderma.2020.114337
Yang L, Qi F, Zhu A-X et al (2016) Evaluation of integrative hierarchical stepwise sampling for digital soil mapping. Soil Sci Soc Am J 80:637–651. https://doi.org/10.2136/sssaj2015.08.0285
Zakharov I, Kapfer M, Hornung J et al (2020) Retrieval of surface soil moisture from sentinel-1 time series for reclamation of wetland sites. IEEE J Sel Top Appl Earth Obs Remote Sens 13:3569–3578. https://doi.org/10.1109/JSTARS.2020.3004062
Acknowledgements
We would like to acknowledge our collaborators, colleagues, and partners. We’d like to thank in particular Maarten van Doorn and Astrid Berndsen for their help in the field, Sam Sarjant for his expertise in NIR scanner deep learning, and our colleagues from the entire NMI and AgroCares team for their support. We’d like to thank Carolyn King and Diane Lafex for helping us store the samples. We’d like to also thank our project partners Rabo Carbon Bank for enabling the sampling.
Funding
This research was funded by the Nutrient Management Institute. Costs for fieldwork were covered by the Rabo Carbon Bank in the framework of carbon farming, a proof-of-concept pilot.
Author information
Contributions
TvdV and SV led the manuscript preparation. SV developed the method, YF led the data analysis, supported by TV. TV developed the content regarding carbon credits. GR contributed to the development of key concepts and paper writing. All co-authors contributed to the manuscript by discussion, writing, and comments.
Ethics declarations
Ethics approval
Not applicable
Consent to participate
Not applicable
Consent for publication
Not applicable
Conflict of interest
The authors declare no competing interests.
About this article
Cite this article
van der Voort, T.S., Verweij, S., Fujita, Y. et al. Enabling soil carbon farming: presentation of a robust, affordable, and scalable method for soil carbon stock assessment. Agron. Sustain. Dev. 43, 22 (2023). https://doi.org/10.1007/s13593-022-00856-7
DOI: https://doi.org/10.1007/s13593-022-00856-7
Keywords
- SOC
- Soil carbon
- Soil organic carbon
- Carbon sequestration
- Carbon farming
- Climate change mitigation
- Spatial statistics
- Machine learning
- 4 per 1000