1 Introduction

The total stock of organic carbon in the plant-soil biosphere amounts to a pool of about 2100 Gt C, of which approximately 30% is plant biomass and 70% soil organic C (SOC) to a depth of 1 m (Janzen 2015; Paustian et al. 2016). Since these two constituents are part of a dynamic system together with the atmosphere and represent a relatively large pool compared to the latter (≈ 800 Gt C), even small variations in the C flux balance between these three components can significantly affect CO2 concentrations in the atmosphere (Paustian et al. 1997; Stockmann et al. 2013). The United Nations Framework Convention on Climate Change (UNFCCC) and the Kyoto Protocol recognized the importance of SOC stocks in 1992, and biospheric storage of C is now part of the Intergovernmental Panel on Climate Change (IPCC) framework for estimating greenhouse gas (GHG) inventories (IPCC 2006).

An international research initiative (“soil carbon 4 per mille”) was recently launched at COP21 (UNFCCC 21st annual Conference of the Parties in Paris 2015), with the aim of increasing carbon stocks on managed agricultural lands worldwide. The ambition is to increase actual SOC stocks to a depth of 1 m by 0.4% per year through the implementation of better management techniques, which could potentially reduce global GHG emissions by 20 to 35% (Minasny et al. 2017a). Increasing SOC stocks is a cost-effective alternative to counteract climatic change (Freibauer et al. 2004), and it can be a win-win strategy, not only reducing the growth rate of atmospheric CO2 but also improving soil fertility and the sustainability of the world’s soils to ensure sufficient food production (Henryson et al. 2018). The most common management techniques and cropping systems being considered for agricultural land include reduced tillage, aboveground crop residue handling, the application of manures and other recycled organic materials, increased net primary productivity through more precise and efficient fertilization, reduced fallow periods, permanent surface protection with perennial crops, and the use of cover crops (Paustian et al. 2016). However, SOC dynamics is a time-dependent reversible process, and improved management techniques need to be continuously practiced in order to contribute to an improved GHG budget (Andrén and Kätterer 2001).

The relative distribution of SOC in the first meter of world soils shows that approximately 50% is within the arable (0–30 cm) layer, and about 25% in each of the upper (30–50 cm) and deeper (50–100 cm) subsoil layers (Batjes 1996; Jobbagy and Jackson 2000). The use of management practices and cropping systems to increase SOC stocks mostly considers the arable layer, and knowledge on management-induced changes in subsoil C is a priority research topic (Swift 2001; Lorenz and Lal 2005). Global estimates for the total amount of C potentially sequestered as SOC in agricultural soils vary from 1 to 2.5 Gt C year−1 (Paustian et al. 2016; Smith 2016; Minasny et al. 2017a). Several factors contribute to this uncertainty, including the variability in SOC stock change rates attributed to the improvement of different cropping systems and management techniques, which still need refinement (Minasny et al. 2017b).

The change in SOC stocks for agroecosystems is dynamic and determined by C inputs through net primary productivity (NPP) as well as recycled organic materials and C outputs from decomposition (Fig. 1). SOC stocks are in equilibrium when the C inputs are equal to the outputs. Carbon is being sequestered in soils only when the C inputs are greater than the outputs (negative CO2 emissions), and the net removal of CO2 from the atmosphere through photosynthesis left in the field is transformed into soil organic matter pools with long turnover times (long-lived SOC) (Kätterer et al. 2012; Powlson et al. 2011b). The magnitude of the plant C inputs from aboveground post-harvest residues and rhizodeposition (i.e., root-derived material) that remain in the field is driven by NPP, which is therefore a crucial factor with respect to potential C sequestration in soils (Bolinder et al. 2007). Consequently, the influence of management techniques, such as tillage or the addition of recycled organic materials, on NPP also needs to be considered when estimating SOC stock change rates (Ogle et al. 2012; Hijbeck et al. 2017). The decay of the actual SOC stock as a result of soil biological activity is the major C output component. It varies significantly between regions with pedo-climatic conditions, and with management techniques (e.g., tillage) and cropping systems (e.g., annual versus perennial crops) (Paustian et al. 2016; Minasny et al. 2017a). It is expected that the responsiveness to improved management practices is lower for soils with already high carbon stocks (Powlson et al. 2011b; Minasny et al. 2017b), particularly if soil C concentration is high relative to the amount of clay (Merante et al. 2017).

Fig. 1

The major fluxes of carbon in the food chain affecting the soil organic carbon (SOC) balance for agroecosystems are net primary productivity (NPP) and CO2 release from soil through decomposition. Measures that stimulate NPP, decrease decomposition of SOC, or increase recycling of products will favor carbon sequestration. SOC stocks are in equilibrium when the C inputs are equal to the outputs

High background levels of SOC stocks already present in soils make it difficult to assess short-term (1–5 years) to medium-term (> 5–10 years) changes. The treatment effects of different management practices on SOC are more easily measurable when they have been accumulating over periods longer than 10 years (Smith 2004; Kätterer et al. 2012). Continuous soil monitoring initiatives such as LUCAS (Land Use and Coverage Area Frame Survey), modeling approaches (e.g., RothC, Century) and long-term (> 10 years) field experiments (LTEs) are among the most common tools for studying changes in SOC stocks in agroecosystems. There are more than 600 LTEs in the world, with the majority located in Europe and North America, where some are over 100 years old (Mitchell et al. 1991; Debreczeni and Körschens 2003). Increasing concerns regarding anthropogenic GHG emissions have led scientists to scrutinize LTEs (often by including more recent medium-term experiments) in literature reviews, commonly using meta-analysis statistical methods (Philibert et al. 2012).

Therein, the effect of management techniques and cropping systems on changes in SOC is calculated using effect size indices relative to a reference treatment (Borenstein et al. 2009). They are expressed either as response ratios (RRs) (%) or as stock change rates (SCRs, kg C ha−1 year−1). The RR value is obtainable with either SOC concentrations or mass, while SCR requires dry soil bulk density. These values are important for evaluating policy measures and for guiding future research activities. For instance, a substantial number of organizations and countries are supporting the soil carbon 4 per mille initiative (www.4p1000.org), and they need tools to assess their potential contribution towards this program. Moreover, IPCC guidelines consider effect size indices at the tier I and II levels, where they characterize management and land use changes, accounting also for gradients in climate and soil properties through reference SOC stocks (IPCC 2006).

RR and SCR are relative measures of SOC changes, evaluating the potential C sequestration using the reference treatment as a baseline, assuming that the relative measured increase in SOC is sourced from atmospheric CO2 (e.g., Stockmann et al. 2013). SOC stocks in the LTEs from which they are derived may be either declining or increasing, for both the improved management practice and the reference. Although in both cases the effect of the improved management practice is positive from a global GHG budget perspective, this approach has raised concerns. For example, there is a possible bias considering that it can be more difficult to gain SOC than to lose it, and using RRs calculated from LTEs where SOC is either increasing (net gain in C) or decreasing less (avoidance of C losses) may influence predictions of changes in SOC stocks (Sanderman and Baldock 2010). There are also several other factors and selection criteria influencing the RR and SCR expressions, in particular the time perspective of input data. Results from long-term field experiments indicate that changes in SOC stocks are usually faster in the initial phase after a modification in management has taken place, but they do not necessarily stop after a few decades. For example, after about 150 years of continuous annual farmyard manure applications, the classical Hoosfield Barley experiment at Rothamsted shows that SOC is still increasing, although the stock changes are slightly leveling out with time and asymptotically moving towards a new equilibrium value (Ludwig et al. 2007; Powlson et al. 2012).
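The time dependence described above can be illustrated with a minimal sketch. It assumes a single SOC pool with first-order decay, which is a simplification chosen here for illustration and not a model used in any of the reviews; the function names and all parameter values (initial stock, decay constant, C inputs) are hypothetical. Under these assumptions the treatment stock approaches a new equilibrium asymptotically, so an SCR computed relative to the reference becomes smaller the longer the study runs.

```python
import math


def soc_stock(t, c0, inputs, k):
    """SOC stock (kg C/ha) after t years in a one-pool, first-order model.

    dC/dt = inputs - k * C, so C(t) = C_eq + (c0 - C_eq) * exp(-k * t),
    with the equilibrium stock C_eq = inputs / k.
    """
    c_eq = inputs / k
    return c_eq + (c0 - c_eq) * math.exp(-k * t)


def scr(t, c0, inputs_treat, inputs_ref, k):
    """Stock change rate (kg C/ha/yr): treatment minus reference, per year."""
    diff = soc_stock(t, c0, inputs_treat, k) - soc_stock(t, c0, inputs_ref, k)
    return diff / t


# Hypothetical values, for illustration only.
C0 = 50_000.0     # initial stock, identical in both treatments (kg C/ha)
K = 0.02          # decay constant (1/yr)
I_REF = 1_000.0   # C input retained in the reference (kg C/ha/yr)
I_TREAT = 1_300.0 # C input retained with improved management (kg C/ha/yr)

for years in (5, 20, 50, 150):
    print(years, "years:", round(scr(years, C0, I_TREAT, I_REF, K)), "kg C/ha/yr")
```

In this sketch the reference happens to start at its own equilibrium; choosing a reference that is gaining or losing SOC changes both stocks but not the declining pattern of SCR with study length.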

Furthermore, for experiments that have been running for decades, initial SOC stocks are in many cases not available since SOC was not the focus when they were established. A common assumption is then that initial differences between experimental units are negligible (e.g., Dolan et al. 2006) and calculations of effect size indices are most often restricted to the last or later measurements of SOC. However, when initial conditions are known, initial treatment differences in SOC can be accounted for (e.g., Ladha et al. 2011), and it is also possible to account for the fact that soils with higher initial SOC contents will lose proportionally more C with time (Poeplau et al. 2016a). If time series for SOC measurements are available they can preferably be used (e.g., Lehtinen et al. 2014), thereby quantifying the influence of study lengths on SOC changes (Haddaway et al. 2016b).

The selection criteria for soil depth and the rationale used to group different depth increments are important and vary between reviews. The choice of units (i.e., concentration based only, mass based only, or a combination of both) is also variable, and the use of pedo-transfer functions to estimate dry soil bulk density is common. Furthermore, the effect on the mass of soil, which can be especially important for some management techniques and cropping systems (e.g., tillage), can be corrected using an equivalent soil mass concept (Meurer et al. 2018).
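To make the role of bulk density concrete, the following minimal sketch converts a SOC concentration into a stock for a fixed-depth layer, using the standard unit conversion. The function names and the example values are hypothetical. It also shows why a fixed-depth comparison can mislead when bulk density differs between treatments: the same concentration then corresponds to a different soil mass and hence a different stock. It does not implement an equivalent soil mass correction.

```python
def soil_mass_mg_per_ha(bulk_density_g_cm3, depth_cm):
    """Dry soil mass (Mg/ha) of a layer.

    g/cm3 * cm = g/cm2; 1 ha = 1e8 cm2 and 1 Mg = 1e6 g, hence the factor 100.
    """
    return bulk_density_g_cm3 * depth_cm * 100.0


def soc_stock_mg_per_ha(soc_percent, bulk_density_g_cm3, depth_cm):
    """SOC stock (Mg C/ha) of a layer from concentration (% by mass)."""
    return soc_percent / 100.0 * soil_mass_mg_per_ha(bulk_density_g_cm3, depth_cm)


# Hypothetical example: same concentration, different bulk density, 0-30 cm.
stock_dense = soc_stock_mg_per_ha(1.5, 1.4, 30.0)
stock_loose = soc_stock_mg_per_ha(1.5, 1.2, 30.0)
print(stock_dense, stock_loose)  # the denser layer holds more C at equal depth
```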

To our knowledge, there is no publication summarizing reviews and meta-analyses that considers several different management practices simultaneously. In particular, previous studies did not address both RR and SCR effect size indices. Our objectives were (i) to synthesize previously published reviews on the effect of aboveground crop residue removal, cover crops, manures, and nitrogen fertilization on changes in SOC and (ii) to discuss the main outcomes for these management practices related to interactions between SOC changes and yield, pedo-climatic conditions, and cropping systems. We also address selection criteria used and relationships between RR and SCR.

2 Materials and methods

We summarized and analyzed results from 20 publications for the main effect of selected management practices in agroecosystems on SOC dynamics (Tables 1, 2, and 3). They cover the effect of aboveground crop residue removal, cover crops, recycled organic materials, and the effect of nitrogen fertilization. The selection criteria for publications were that they should be literature reviews or meta-analyses based on results from LTEs. Seven of them were review articles (Smith et al. 1997; Körschens et al. 2013; VandenBygaart et al. 2003; Alvarez 2005; Wang et al. 2015; Lehtinen et al. 2014; Powlson et al. 2011a). Ten publications applied meta-analysis statistical methods (Ladha et al. 2011; Aguilera et al. 2013; Maillard and Angers 2014; Kopittke et al. 2017; Lu et al. 2011; Liu et al. 2014; Luo et al. 2010; Lu 2015; McDaniel et al. 2014; Poeplau and Don 2015). We also included three research papers that had a significant appraisal section related to previous studies and included an important number of LTEs for a given type of comparison: Lemke et al. (2010) with 19 LTEs for aboveground crop residue removal and Blanco-Canqui (2013) and Poeplau et al. (2015) analyzing 14 LTEs for cover crops.

Table 1 Characteristics and effect size indices for the reviews comparing aboveground (AG) crop residue removal and cover crops using paired comparisons
Table 2 Characteristics and effect size indices for the reviews on recycled organic materials (ROMs) using paired comparisons
Table 3 Characteristics and effect size indices for the reviews on N-fertilization effects using paired comparisons

For simplicity, the selected publications are all referred to as “reviews” in the text that follows. The reviews had different coverage, from worldwide to regional or country-specific, and some addressed more than one of the selected management practices (Tables 1, 2, and 3). We used a Systematic Map created within the Swedish EviEM (Evidence-Based Environmental Management) council (www.eviem.se/en/projects/Soil-organic-carbon-stocks/) in our search for reviews. This interactive map includes accessible meta-data from 735 LTEs reporting the effects of different management practices on SOC. A detailed explanation of the search process and criteria is given by Haddaway et al. (2015, 2016a, 2017). The Systematic Map also highlights 127 reviews and meta-analyses, from which we selected those relevant for our study and to which we added a few more recent ones (Kopittke et al. 2017; Lu 2015; Poeplau and Don 2015; Wang et al. 2015). We did not retain reviews on agricultural systems specifically combining effects of more than one of the management practices we chose to cover (e.g., organic or integrated farming versus conventional agriculture comparisons). Reviews on tillage were also not retained because we have recently published extensive analyses on this subject (Haddaway et al. 2017; Meurer et al. 2018), but the main findings on this issue are included in the discussion.

For each of these reviews (Tables 1, 2, and 3), we compiled, or calculated when not directly reported, the mean and maximum soil depth of the assessments and the proportion of observations made for soil depth greater than 30 cm. Furthermore, we assembled the mean length of the studies used in the reviews, as well as the proportion of studies with a duration of less than 5 and 10 years. These characteristics mainly correspond to, or were calculated using, the total number of paired comparisons (N) included in the reviews, unless otherwise specified in the text. We also documented information on the findings for other topics addressed, in particular interactions with yield, climate, soil texture, and type of crops or rotations.

We extracted two effect size indices, the mean relative RR and the mean SCR that authors reported in the reviews (Figs. 2, 3, 4, and 5). RR represents the percentage change (%) for a management practice compared against a reference treatment (i.e., (management treatment − reference treatment)/reference treatment × 100). SCR corresponds to the difference between the SOC stock for a management practice and the SOC stock in the reference treatment, expressed as a rate (kg C ha−1 year−1) over a certain period of time (i.e., (management treatment − reference treatment)/study length period). When feasible, for reviews not presenting both effect size indices, we calculated them from SOC (mass and/or concentration) data and study length periods provided in the publications (or supplementary materials) for each N. Consequently, we completed information for both RR and SCR (Lemke et al. 2010; Blanco-Canqui 2013), for RR (Körschens et al. 2013; Powlson et al. 2011a) and for SCR (Alvarez 2005). The two effect size indices were all calculated using the last sampling point in time for a given study and paired comparison. Only one review (Lehtinen et al. 2014) included multiple time observations for some of the LTEs in their database; the authors then used the average RR. Another exception was the review by Aguilera et al. (2013), which used information on initial SOC data when available. In particular, Ladha et al. (2011) made their entire analysis accounting for initial differences in SOC in the paired comparisons (see detailed discussion therein). Only Kopittke et al. (2017) reported median (and not mean) values for RR and SCR.
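The two definitions above translate directly into code. The sketch below is only an illustration: the function names are hypothetical, and the example pair of stocks and the study length are made-up numbers rather than data from any review. RR can be computed from either concentrations or stocks, provided both treatments use the same unit, whereas SCR needs stocks in kg C ha−1.

```python
def response_ratio(soc_treatment, soc_reference):
    """RR (%) = (treatment - reference) / reference * 100.

    Inputs may be concentrations or stocks, as long as both use the same unit.
    """
    return (soc_treatment - soc_reference) / soc_reference * 100.0


def stock_change_rate(stock_treatment, stock_reference, study_years):
    """SCR (kg C/ha/yr) = (treatment stock - reference stock) / study length."""
    return (stock_treatment - stock_reference) / study_years


# Hypothetical paired comparison at the last sampling date of a 20-year study.
rr = response_ratio(55_000.0, 50_000.0)
scr_value = stock_change_rate(55_000.0, 50_000.0, 20.0)
print(rr, scr_value)
```

Both indices share the same numerator, so for a given pair they differ only by the reference stock and the study length: SCR = RR/100 × reference stock/study length.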

Fig. 2

The effect of aboveground crop residue removal on the mean relative response ratios (RRs) and mean stock change rate (SCR) effect size indices for soil organic carbon (review characteristics and # accordingly with Table 1)

Fig. 3

The effect of cover crops on the mean relative response ratios (RRs) and mean stock change rate (SCR) effect size indices for soil organic carbon (review characteristics and # accordingly with Table 1)

Fig. 4

The effect of recycled organic materials (ROM) as manure (only with the mineral reference treatment) on the mean relative response ratios (RRs) and mean stock change rate (SCR) effect size indices for soil organic carbon (review characteristics and # accordingly with Table 2)

Fig. 5

The effect of N-fertilization on the mean relative response ratios (RRs) and mean stock change rate (SCR) effect size indices for soil organic carbon (review characteristics and # accordingly with Table 3)

For aboveground crop residue removal, the effect of leaving residues in the field was compared with a reference treatment consisting of residues removed from the field (Liu et al. 2014; Lehtinen et al. 2014; VandenBygaart et al. 2003; Wang et al. 2015). Some of the treatments also included the addition of straw at various rates (Lemke et al. 2010; Powlson et al. 2011a; Smith et al. 1997), and one review allowed a comparison between chopped and unchopped straw (Lu 2015). In the review by Luo et al. (2010), the reference treatment specifically consisted of stubble burning (Table 1). For cover crops, paired comparisons used a reference treatment without cover crops. The cover crop species used in the individual experiments were highly variable (Blanco-Canqui 2013), including both legumes and non-legumes (Poeplau and Don 2015). In the review by McDaniel et al. (2014), almost all paired comparisons (97%) involved leguminous cover crops, while that of Poeplau et al. (2015) examined only the effect of perennial ryegrass (mostly undersown) as a cover crop. Aguilera et al. (2013) specifically examined scenarios where cover crops replaced bare soil (Table 1).

The reviews for the effect of recycled organic materials differ from the former comparisons because they involved the use of different reference treatments, which were either a mineral fertilized or an unfertilized (no mineral) treatment (Table 2). Manure was the most prevalent organic amendment addressed in the reviews (Maillard and Angers 2014; Ladha et al. 2011; Kopittke et al. 2017; Körschens et al. 2013; VandenBygaart et al. 2003), although some also included other recycled organic materials such as sewage sludge and slurry (Smith et al. 1997; Aguilera et al. 2013).

The effects of nitrogen fertilization involved paired comparisons with an unfertilized or control treatment (Table 3). The fertilization treatments were described either as conventionally managed (Aguilera et al. 2013; Körschens et al. 2013; VandenBygaart et al. 2003) or as N applications at various rates (Ladha et al. 2011; Lu et al. 2011; Alvarez 2005). There was some disparity in the unfertilized or control plots used as a reference for the effect of N fertilization in the reviews. They were either identified only as control plots (VandenBygaart et al. 2003; Alvarez 2005; Lu et al. 2011) or clearly described as plots with no mineral fertilization (0 NPK) compared against mineral fertilized (NPK) plots (Körschens et al. 2013; Aguilera et al. 2013), while the reference in the review by Ladha et al. (2011) received no N but could have received PK. Furthermore, the review by Alvarez (2005) allowed a specific comparison with the fertilized treatments for which the aboveground crop residues were either incorporated or removed (or burnt).

The cropping systems involved crop types and rotations common to the location of the collected studies used in each of the reviews (Tables 1, 2, and 3). The selection criterion was often quite general, with the cropping systems included being broadly defined as cropland, agricultural cropland, or arable rotations (Smith et al. 1997; Körschens et al. 2013; Kopittke et al. 2017; VandenBygaart et al. 2003; Lu et al. 2011; Luo et al. 2010; Lehtinen et al. 2014; Blanco-Canqui 2013; Poeplau et al. 2015; Poeplau and Don 2015). However, some reviews specified that permanent grasslands (and forests) were excluded (Aguilera et al. 2013), that studies on grasslands were included (Maillard and Angers 2014) or that fallow was excluded (McDaniel et al. 2014). Other reviews specified that they considered almost exclusively small-grain cereal rotations (Lemke et al. 2010; Powlson et al. 2011a) or that studies included at least a cereal-based crop rotation (Ladha et al. 2011) or cereals grown in combination with corn and soybeans (Alvarez 2005). Reviews by Wang et al. (2015), Liu et al. (2014), and Lu (2015) also included cropping systems with rice.

3 Results and discussion

3.1 SOC changes

Of the 20 reviews, nine addressed aboveground crop residue removal, seven covered recycled organic materials, six considered the effect of N fertilization, and five the effect of cover crops (Tables 1, 2, and 3 and Figs. 2, 3, 4, and 5). Some reviews addressed more than one practice, so these counts add up to more than 20.

3.1.1 Aboveground crop residue removal

Compared to the reference treatment with aboveground crop residue removal, the mean RR in SOC with residue incorporation ranged from 2.7 to 18.2% (Fig. 2). Five of the reviews allowed data to be reported as SCR and showed a range of mean values from 53 to 590 kg C ha−1 year−1. Smith et al. (1997) reported the highest values for RR and SCR. The inclusion of a bare fallow as a reference treatment and the large amounts of straw applied for a few of the paired comparisons explain those higher values. We found no specific explanation for the lowest SCR value in the study by Lu (2015), although it had the lowest mean study duration (Table 1) among the reviews addressing the effect of crop residue removal.

The effect of aboveground crop residue removal on SOC is mostly an input-driven consequence (i.e., the amount of carbon entering the soil is reduced), although it can also significantly influence losses occurring via soil erosion. With some exceptions (Liu et al. 2014; Wang et al. 2015; Lu 2015), the focus of reviews was on agroecosystems dominated by small-grain cereals. Depending on growing conditions and harvesting techniques, only about 50% of the straw may actually leave the field; a large proportion is left behind as stubble, chaff, and uncollected straw (Powlson et al. 2011a). Therefore, the impacts on SOC are generally expected to be greater for grain-maize because the potential aboveground crop residues represent approximately twice the amount of those from small-grain cereals (Wilhelm et al. 2004). For instance, Anderson-Texeira et al. (2009) showed in some North American experiments that SCR for grain-maize varied from 300 to as much as 800 kg C ha−1 year−1. Among the reviews including maize, only Lu (2015) found that the effect increased with the proportion of maize in the rotations, where the RR for single-maize wheat (14%) was higher than that for rice-rice or wheat-rice systems (9 to 10%). They also found a higher RR for chopped compared to unchopped aboveground crop residues (13 and 9%, respectively), an effect that was more pronounced when the residues were incorporated into the soil with tillage.

3.1.2 Cover crops

Compared to the effects obtained for aboveground crop residue removal, the results for cover crops were more consistent. The mean RR in SOC with cover crops ranged from 7.8 to 13.1%, while SCR showed a range of values from 270 to 430 kg C ha−1 year−1 (Fig. 3). The effect of cover crops on SOC is also mostly input-driven because they provide an additional source of aboveground and belowground crop residue carbon entering the soil. The associated reduction in losses of SOC from soil erosion can also be particularly important for systems with cover crops in permanent woody cropping systems such as olives and vineyards, which may contribute to higher changes in SOC. For instance, RR as high as 27 to 55% and SCR between 1160 and 1590 kg C ha−1 year−1 can be observed (Palese et al. 2014; Gonzalez-Sanchez et al. 2012; Favretto et al. 1992).

3.1.3 Recycled organic materials

With the exception of the review by Ladha et al. (2011), the results presented were obtained with a mineral fertilized treatment used as the reference and refer to recycled organic materials that were mainly defined as manure (Fig. 4). When specified, the manure treatments were either applied with mineral fertilizer (Ladha et al. 2011; Körschens et al. 2013) or consisted of manure alone or manure in combination with mineral fertilizer (Maillard and Angers 2014; Aguilera et al. 2013). The mean RR in SOC ranged from 23.5 to 43.4% and SCR showed a range of values from 203 to 1310 kg C ha−1 year−1. In contrast to the manure treatments in the other reviews, the data analyzed by Aguilera et al. (2013) may also include composted materials, possibly explaining the higher values of that particular study.

Two of the reviews allow a specific comparison for manures using either a mineral fertilized or an unfertilized treatment as the reference, and clearly show that the effect is lower with the former. Indeed, in the study by Maillard and Angers (2014), the SCR was 311 kg C ha−1 year−1 with the mineral fertilized reference, compared to 522 kg C ha−1 year−1 with the unfertilized (no mineral) reference treatment (data not shown). Similarly, the data in Körschens et al. (2013) show that the RR was 33% with the mineral fertilized reference, compared to 46% with the unfertilized reference treatment (data not shown).

The effect of recycled organic materials on SOC obviously varies depending upon the quantity applied, and with the quality of the materials that is driving the proportions of organic material converted to more resistant SOC. The highest effects occurred for treatments with sewage sludge and municipal solid waste, with RR ranging from 98 to 117% and SCR from 1650 to 5290 kg C ha−1 year−1 (data not shown), respectively (Smith et al. 1997; Aguilera et al. 2013). Aguilera et al. (2013) also assessed the effect of liquid animal manure in their study, but it was not significant. However, as pointed out in the meta-analysis by Maillard and Angers (2014), there is a lack of studies allowing realistic comparisons between the effects of liquid versus solid manures on SOC stocks.

3.1.4 N-fertilization

The mean RR in five reviews for N fertilized treatments ranged from 3.5 to 10.0%, whereas only three reviews reported data as SCR and showed a range of mean values from 197 to 480 kg C ha−1 year−1 (Fig. 5). Similar to the effects for cover crops, the effect of N fertilization was also more consistent among studies, in particular for RR, with the exception of a high SCR value from the study by Aguilera et al. (2013). The role of major nutrients in SOC dynamics is complex because of the simultaneous effects on NPP and on the rate of heterotrophic respiration through N mining and mineralization of organic matter (Poeplau et al. 2016b), but it remains well recognized that there is a positive effect of N fertilization on SOC in agroecosystems (Kätterer et al. 2012). This is mainly an input-driven effect, where the increase in NPP (and yields) results in higher amounts of annual C inputs to soil from aboveground post-harvest crop residues and rhizodeposition (e.g., Christopher and Lal 2007). For instance, a worldwide review on grasslands showed that fertilization resulted in a mean SCR of 300 kg C ha−1 year−1 (Conant et al. 2001). For annual crops, this effect can be limited or absent depending on the management of aboveground crop residues. This was shown in the review by Alvarez (2005), where some of the paired comparisons (N = 26) included reference treatments where straw was removed or burnt, for which the SCR was even negative (− 111 kg C ha−1 year−1). Furthermore, contrary to the other management practices we examined here, the reference treatment with no applications of N is not a commonly used agronomic practice. When no mineral N is applied, farmers usually compensate for this with N from other sources such as manures, although no N applications may sometimes occur in agricultural systems and regions where there is a lack of alternative sources.
However, N fertilization generally increases SOC overall, as shown for example by the relationship established in Alvarez (2005), who found that SOC storage increased by 2 kg C ha−1 for each cumulative 1 kg of N ha−1 applied. Kätterer et al. (2012), analyzing LTEs under Nordic conditions, obtained similar values of 1 to 2 kg C ha−1 year−1 in the topsoil (0–20 cm) for each kg of N applied. Using unfertilized treatments as a reference may somewhat overestimate the effect of mineral N fertilization, as pointed out by VandenBygaart et al. (2003), since both NPP and SOC responses decrease at high N rates. Nevertheless, the relationship between N fertilization and SOC is positive also when considering only the range of data receiving N (i.e., excluding the zero N treatments).
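As a worked illustration of the relationship reported by Alvarez (2005), the sketch below applies the slope of 2 kg C ha−1 per cumulative kg of N ha−1 to a hypothetical fertilization scenario. The function name, the N rate, and the duration are assumptions made for this example only, and the linear extrapolation ignores the decreasing response at high N rates noted above.

```python
def soc_gain_from_cumulative_n(n_rate_kg_ha_yr, years, kg_c_per_kg_n=2.0):
    """Return (total SOC gain in kg C/ha, equivalent SCR in kg C/ha/yr).

    kg_c_per_kg_n defaults to the slope reported by Alvarez (2005);
    the relationship is applied linearly, which is a simplification.
    """
    cumulative_n = n_rate_kg_ha_yr * years    # kg N/ha applied in total
    total_gain = kg_c_per_kg_n * cumulative_n  # kg C/ha stored
    return total_gain, total_gain / years


# Hypothetical scenario: 100 kg N/ha/yr applied for 20 years.
total, per_year = soc_gain_from_cumulative_n(100.0, 20.0)
print(total, per_year)
```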

3.1.5 Average mean effect size indices and scaling-up in space

For the effect of aboveground crop residue removal, the overall average mean RR and SCR we calculated across reviews were 9.9% and 212 kg C ha−1 year−1 (Table 4). The overall average mean values we calculated for RR and SCR when they were weighted by the number of paired comparisons (N) in each review were 10.3% and 117 kg C ha−1 year−1. The weighted SCR estimate was almost half the unweighted one because of low SCR values for a large number of observations in one of the reviews (Lu 2015; Fig. 2). The two reviews with a worldwide coverage (Liu et al. 2014; Lemke et al. 2010) had RRs (Table 1 and Fig. 2) similar to the average that we calculated (Table 4). For cover crops, manures and N fertilization, the overall average mean RR and SCR were similar when calculated either across reviews or when weighted by the number of paired comparisons (Table 4). The reviews for cover crops with a worldwide coverage (Table 1 and Fig. 3) for RR (McDaniel et al. 2014) and SCR (Poeplau and Don 2015) gave values comparable to the overall averages we calculated. Similarly, for N fertilization, the mean RR (Fig. 5) in the review with a worldwide coverage by Ladha et al. (2011) was also comparable to our overall averages (Table 4).
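The difference between the two averaging approaches can be sketched as follows. The function names and the three review values are hypothetical and are not the values in Table 4; they are chosen only to show how one review with many paired comparisons and a low SCR pulls the N-weighted mean well below the simple mean across reviews.

```python
def mean_across_reviews(values):
    """Simple average: every review counts equally."""
    return sum(values) / len(values)


def mean_weighted_by_n(values, n_pairs):
    """Average weighted by the number of paired comparisons (N) per review."""
    return sum(v * n for v, n in zip(values, n_pairs)) / sum(n_pairs)


# Hypothetical mean SCRs (kg C/ha/yr) and N for three reviews.
scr_means = [300.0, 250.0, 60.0]
n_pairs = [20, 30, 200]

unweighted = mean_across_reviews(scr_means)
weighted = mean_weighted_by_n(scr_means, n_pairs)
print(round(unweighted, 1), round(weighted, 1))
```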

Table 4 The overall average mean effect of the different management practices derived from the mean response ratio (RR) and mean stock change rate (SCR) calculated either across reviews or weighted by the number of paired comparisons (N) used in each review

The results for both RR and SCR clearly show that SOC responded most to manure applications, followed by aboveground crop residue removal and cover crops with RRs of about 10%, while N fertilization gave the smallest response with an RR of 7.2%. However, in terms of SCR, cover crops showed a higher response than aboveground crop residue retention. In fact, the latter even had a lower SCR than N fertilization (Table 4). A larger number of observations appears to be available for studying the effect of aboveground crop residue removal (Table 4), with 995 paired comparisons in total for RR and 279 for SCR, compared to recycled organic materials (418 and 217, respectively) and cover crops (129 and 176, respectively). It should be recognized, however, that the data included in many of the different reviews overlap. Since many field experiments also include a zero N treatment as a reference, the number of observations available for assessing the effect of N fertilization is also large. With the exception of cover crops, there are, not so surprisingly, fewer observations allowing a calculation of SCR.

Comparing the results for SCR with reviews covering other commonly assessed management practices, such as no-tillage (or direct seeding) or perennial forage crops versus annual crops, indicates that the management practices examined here give SCRs of intermediate magnitude. For example, in a worldwide review focusing on boreo-temperate regions, Meurer et al. (2018) showed that SCR for no-tillage (compared to a conventionally tilled reference treatment) ranged from 94 to 341 kg C ha−1 year−1 for the 0–30 and 0–60 cm depths, respectively. However, in the same study and using an equivalent soil mass approach, SCR ranged from only 64 to 232 kg C ha−1 year−1. The highest mean SCR values obtained for the management practices examined here remain lower than those typically observed for perennial forage crops (when the reference is annual crops), where values generally range from 500 to 600 kg C ha−1 year−1 (e.g., VandenBygaart et al. 2010; Kätterer et al. 2013). It should also be recognized that there is a synergy between management practices: the combined effects of, for example, cover crops, leaving crop residues in the field, and application of organic amendments promote SOC stock changes (e.g., Aguilera et al. 2013; Blanco-Canqui 2013), but a synthesis of this issue was beyond the scope of this paper.

The effects of different management practices on SOC stocks discussed above only concern investigations made at the field scale (i.e., experimental plots). Not all of them are necessarily directly scalable in space. For example, the number of animals determines the total amount of manure produced, and animal manure is most often already recycled either on the farms where it is produced or on nearby agricultural soils. If the number of animals does not change, there will be no increase in net C sequestration resulting from manures because the amounts that are regionally distributed remain the same. A regional impact or influence at a larger scale is only achievable when the flows of organic materials back to soils from society increase or if their characteristics change due to different treatments such as fermentation, composting, etc. (Fig. 1). Increasing amounts may result from a larger number of animals and from increasing availability of organic materials through transfer from applications made on other types of land use (e.g., forest) than agricultural soils. More efficient recycling and changes in feedstocks or treatment methods may change the decomposability of organic materials. For example, cereal straw harvested for incineration in municipal heating plants could instead be pyrolyzed, and part of the carbon remaining after gasification (i.e., biochar) applied to soil. Returning this biochar (with a much longer turnover time compared to straw) to soil would probably more than compensate for the SOC losses resulting from straw removal. Another difficulty in scaling up the values for recycled organic materials relates to the amounts applied in LTEs, which do not necessarily reflect today’s agronomic practices. Application rates have been lowered in many regions during recent decades since they are subject to legal agro-ecosystem nutrient balance-based regulations. Sewage sludge applications are also subject to other types of regulations. For instance, in Sweden, regulations for sewage sludge limit SOC sequestration rates to around 80 kg C ha−1 year−1 (Kirchmann et al. 2017).

Increasing NPP per unit area is probably the most effective option for SOC sequestration (Kätterer et al. 2012). Measures that stimulate photosynthetic rates per unit area and time, as well as the allocation of carbon to belowground plant tissues through plant breeding and proper agricultural management, will likely lead to soils richer in SOC (Fig. 1). Since decomposition of SOC is essentially governed by site-specific characteristics such as climate and edaphic conditions, management practices that lower decomposition rates without negatively affecting NPP (e.g., no-till) are less effective than those promoting NPP (e.g., cover crops).

3.2 Interaction of SOC changes with yield, texture, climate, cropping systems and initial C

Not all the reviews addressed interactions between SOC changes and yield, texture, climate, and cropping systems (Table 5). We discuss only the main effects reported in the reviews. The indirect effect of yield is included because increased NPP leads to higher crop residue carbon inputs to soil, thereby potentially affecting changes in SOC.

Table 5 The number of reviews for each management practice indicating interactions between soil organic carbon (SOC) changes and yield, texture, climate and cropping systems

3.2.1 Yield

The studies by Liu et al. (2014) and Lehtinen et al. (2014) on crop residue management showed a yield increase of 12.3 and 6%, respectively, when leaving straw in the field. There was a positive relationship between this yield increase and the increase in RR for SOC changes, although it was not significant in the study by Lehtinen et al. (2014). Wang et al. (2015) also found a yield increase for straw retention of 7%, positively related to the increase in SCR, and Lu (2015) obtained a significant positive regression relating SCR with yield increases. For cover crops, only the study by Poeplau et al. (2015) assessed the effect on yield of the main crop (i.e., cover crops were undersown in the main crop consisting mostly of cereals), where the relationship was not significant. However, reviews specifically examining the effects on grain yield (but not including SOC changes) when cereal is the main crop show that there is often a grain yield increase of about 5% with undersown legume crops (i.e., a nutrient effect), although it can be absent or even negative for non-legume cover crops (e.g., Valkama et al. 2015). For manure additions, only the study by Körschens et al. (2013) reported a yield benefit of 6% with manures, compared with mineral fertilization alone, but without relating it directly to RR or SCR. In a recent meta-analysis on 20 LTEs in Europe, Hijbeek et al. (2017) described the difficulties in establishing the effect of organic inputs on crop yields and changes in SOC. However, for a subset of data, these authors could see an additional yield effect of manure (2.2%) that was somewhat related to increases in SOC. In a study using multiple time observations for bovine farmyard manure from European LTEs, Zavatarro et al. (2017) considered that the yield increases with manure applications were possibly also attributable to effects other than providing nutrients, including improvement of soil physical and chemical properties. Compared to the mineral fertilizer reference, the mean RR for manure with or without mineral fertilizer in that study was 32.9%, similar to the average we calculated for manure across reviews (Table 4). For the effect of N fertilization, Lu et al. (2011) showed that both aboveground and belowground net primary production increased, but only the latter correlated significantly with increases in the RR.

3.2.2 Soil texture

The studies that assessed the influence of soil texture on RR and SCR for the effect of cover crops (Poeplau et al. 2015; Poeplau and Don 2015) and for manures (Maillard and Angers 2014) found that it was not significant. Among the reviews on N fertilization, the effect was only assessed by Alvarez (2005), who reported that the effect on SCR was greater in coarser (i.e., more sandy) soils than in fine-textured ones. For studies on crop residue removal, interactions with soil texture were reported more frequently. Luo et al. (2010) detected a variation in RR with soil type. Liu et al. (2014) found that the relative effect of straw removal on RR for clayey soils was lower than for sandy soils (RR was about 10 and 16%, respectively). On the contrary, Lehtinen et al. (2014) found that the effect on RR was greater for soils with a clay content exceeding 35%, compared with clay contents between 18 and 35%. Wang et al. (2015) and Lu (2015) were not able to find any significant relationship between soil texture and either RR or SCR.

3.2.3 Climate

For the effect of crop residue removal, three reviews evaluating the effect of climate (Liu et al. 2014; Wang et al. 2015; Lehtinen et al. 2014) could not establish any effects on RR or SCR. However, Lu (2015) found an effect across different regions in China. Furthermore, Luo et al. (2010) observed the largest increase in RR in the Australian regions with low rainfall (300 to 400 mm) and the smallest increase in regions with an average annual temperature around 18 to 19 °C. One review on the effect of manures (Ladha et al. 2011) and two reviews on the effect of cover crops (Poeplau et al. 2015; Poeplau and Don 2015) that assessed the effect of climate concluded that the effects on SOC changes were not climate dependent. However, Maillard and Angers (2014) found a trend for lower SCR in tropical compared to temperate climates. For N fertilization effects, Ladha et al. (2011) reported that RR was highest (16%) for tropical conditions and lowest (3%) for temperate climates, while Alvarez (2005) found the SCR to be smaller under dry conditions or tropical climates, compared to humid and temperate conditions. Lu et al. (2011) also found increased changes in SOC due to N fertilization under wetter conditions (RR decreased with mean annual precipitation).

3.2.4 Cropping systems and initial C

Liu et al. (2014) and Wang et al. (2015) found no effect of cropping systems on SOC changes using paired comparisons for crop residue removal. When testing the effect of plant functional types such as non-legume versus legume cover crops, Poeplau and Don (2015) found no influence on SCR. Maillard and Angers (2014) was the only review for manure assessing differences in RR and SCR related to cropping systems, and found no significant differences among annual crops, perennial crops, and rice paddies. Similarly, Alvarez (2005) was the only review addressing this issue for N fertilization, where the effect on SOC changes was greater under rotations containing more crops per year or with corn as the main component of the cropping system.

Liu et al. (2014) found that RR decreased with increasing initial SOC content for straw residue incorporation, indicating that for a given amount of C inputs to soil, the percentage change is smaller for soils with a larger initial SOC stock, which is in agreement with the trend reported by Minasny et al. (2017a). However, Luo et al. (2010) found that RR (with the combined practice of stubble retention and conservation tillage) showed the greatest increase when initial SOC stocks were higher, but the amount of soil C accumulation (i.e., SCR) was comparable with soils having lower initial SOC stocks. Maillard and Angers (2014) found no relationship between initial SOC concentrations and either RR or SCR in the review on recycled organic materials. They emphasized that this was possibly due to a lack of information on initial data, as well as the strong effect of the large external C inputs to soil. None of the other reviews examined the relationship between changes in RR and SCR regarding SOC contents.

3.3 Criteria and relationship between RR and SCR

Several factors influence the RR and SCR obtained in different reviews, including variations in soil depth and the duration of studies. Other dissimilarities also occur, such as the units used. All of these need to be considered when comparing data from reviews and the relationship between the two indices.

3.3.1 Soil depth

With the exception of the review by Luo et al. (2010), with a mean sampling depth of 13 cm, the mean sampling depth of studies included in all the reviews corresponded to the arable layer, which ranged from 15 to 30 cm; the mean depth was mostly ≥ 20 cm and never exceeded 30 cm (Tables 1, 2, and 3). With a few exceptions, reviews considering data from deeper soil layers rarely included depths > 60 cm. When subsoil (i.e., > 30 cm) samples were considered, their proportion in the entire datasets was small, ranging from about 1 to at most 14%. This does not necessarily imply that information from subsoils is not available, because some of the studies had specific selection criteria that excluded layers > 20 cm (Lu 2015; Poeplau et al. 2015) or > 30 cm (Smith et al. 1997; Körschens et al. 2013; Lehtinen et al. 2014). Furthermore, Maillard and Angers (2014) compiled worldwide data mostly to a depth of 30 cm (and excluded soil depths < 15 cm), and found only one study for a whole profile of 100 cm. In another worldwide review, Poeplau and Don (2015) only found three studies investigating the effect of cover crops on SOC stocks below the plow layer (and subsoil data were excluded from their analysis). Mean soil depth may not be a major issue regarding the variations observed for RR and SCR between the reviews considered here, but these reviews clearly do not represent SOC changes in subsoils well. Similarly, in a worldwide review on the effect of different tillage practices, Haddaway et al. (2017) also found that sampling below 30 cm was rare, representing only 19% of their entire database (i.e., 66 out of 351 studies).

3.3.2 Study duration

Compared to the mean soil depth, there was much larger variation in the mean duration of the studies used in the reviews for assessing RR and SCR, which ranged from 4 to 72 years, although it was mostly between about 10 and 25 years (Tables 1, 2, and 3). When stated, specific selection criteria for including studies in the reviews varied, from considering only LTEs > 20 years (Smith et al. 1997; Körschens et al. 2013), > 15 years (Kopittke et al. 2017), or > 10 years (Wang et al. 2015; Powlson et al. 2011a). Other reviews allowed much more variable lengths in their selection criteria, such as > 3 years (Aguilera et al. 2013; Maillard and Angers 2014; Lu 2015; McDaniel et al. 2014) or only excluding studies with a duration < 1 year (Lu et al. 2011; Liu et al. 2014).

A few of the reviews treated the effect of study length in detail. For instance, when assessing the effects of crop residue removal, the mean RR was much greater after 25 to 30 years (Wang et al. 2015) or for periods > 20 years (Lehtinen et al. 2014), compared to studies with shorter duration. For the same management practice, Lu (2015) also found an increase in RR and SCR with the experimental duration, but suggested that the majority of the increases in SOC changes occurred within the first 15 years. In contrast, Liu et al. (2014) found that RR significantly increased with study length only for medium-term experiments (3 to 15 years), and suggested that soil C saturation may occur after 12 years of straw return. Luo et al. (2010) observed no consistent trend in the magnitude of change in RR with the duration of studies for aboveground crop residue removal. Furthermore, there was no indication of soil C saturation in the review on cover crops by Poeplau and Don (2015), and the SCR increased linearly with study duration. For N fertilization, Lu et al. (2011) found no significant correlation between experimental duration and RR. None of the reviews on the effect of manuring specifically addressed the issue of study length versus changes in RR or SCR. It is likely that the higher heterogeneity with respect to study duration has a greater impact on the variations observed for RR and SCR between reviews than, for example, soil depth.

In comparison, two reviews on the effect of no-tillage vs. conventional tillage in boreo-temperate regions showed that the mean annual SOC stock change rates did not vary between study lengths of 10–20 years (Meurer et al. 2018) or slightly increased with time (Haddaway et al. 2017), and these trends were not affected by climate. However, Six et al. (2004) found that for drier climates, SCR for no-tillage could even be negative during the first 5–10 years before a positive trend occurs.

3.3.3 Units and calculations

Several reviews used SOC concentration (%) for calculating RR (Körschens et al. 2013; Lu et al. 2011; Liu et al. 2014; Luo et al. 2010; Lehtinen et al. 2014; Powlson et al. 2011a; McDaniel et al. 2014). Other reviews considered mass (kg C ha−1)-based calculations (Smith et al. 1997; Maillard and Angers 2014; Kopittke et al. 2017; VandenBygaart et al. 2003; Wang et al. 2015; Lu 2015; Poeplau et al. 2015), or a mixture of data expressed in terms of both concentration and mass (Ladha et al. 2011; Aguilera et al. 2013; Lemke et al. 2010; Blanco-Canqui 2013). Liu et al. (2014) and Luo et al. (2010) converted mass-based data back to a concentration basis using dry soil bulk density, while Ladha et al. (2011) and Aguilera et al. (2013) preferred the use of concentration-based data for RR (when both mass- and concentration-based data were available).

For calculating SCR, the units are always on a mass basis, and the calculation most often involves the use of pedotransfer functions to estimate missing data for bulk density (Aguilera et al. 2013; Kopittke et al. 2017; Wang et al. 2015; Lu 2015; Poeplau and Don 2015). For SCR analysis, differences in dry soil bulk density caused by a given management practice also influence the results, and using an equivalent soil mass approach is preferable (Ellert and Bettany 1995). For example, Meurer et al. (2018) showed that SCR may be overestimated by almost 50% when comparing different tillage practices without considering equivalent soil mass principles. Only two reviews considered an equivalent soil mass approach: Poeplau and Don (2015) for the effect of cover crops and Kopittke et al. (2017) for the effect of manures.
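The bulk density issue can be made concrete with a minimal sketch. It assumes a single soil layer with a uniform SOC concentration, which is a simplification of the equivalent soil mass approach rather than the full procedure of Ellert and Bettany (1995). The function names and the input values are hypothetical and only serve as an illustration.

```python
def soc_stock_fixed_depth(conc_pct, bulk_density, depth_cm):
    """SOC stock (Mg C ha-1) in a fixed-depth layer.

    conc_pct: SOC concentration (% of dry soil mass)
    bulk_density: dry soil bulk density (g cm-3)
    depth_cm: layer thickness (cm)
    """
    return conc_pct * bulk_density * depth_cm


def soil_mass(bulk_density, depth_cm):
    """Dry soil mass (Mg ha-1) in a fixed-depth layer."""
    return bulk_density * depth_cm * 100.0


def soc_stock_equivalent_mass(conc_pct, reference_mass):
    """SOC stock (Mg C ha-1) held in a reference soil mass (Mg ha-1).

    Assumes the concentration is uniform within the sampled layer.
    """
    return conc_pct / 100.0 * reference_mass


# Hypothetical example: identical SOC concentration, but the treatment
# (e.g., no-tillage) has a denser soil than the tilled reference.
ref_stock = soc_stock_fixed_depth(1.5, 1.30, 30.0)
trt_stock_fixed = soc_stock_fixed_depth(1.5, 1.45, 30.0)
trt_stock_esm = soc_stock_equivalent_mass(1.5, soil_mass(1.30, 30.0))

print(f"reference, fixed depth:  {ref_stock:.2f} Mg C ha-1")
print(f"treatment, fixed depth:  {trt_stock_fixed:.2f} Mg C ha-1")
print(f"treatment, equiv. mass:  {trt_stock_esm:.2f} Mg C ha-1")
```

With these inputs the fixed-depth comparison suggests a gain of several Mg C ha−1 for the denser soil, whereas the equivalent soil mass comparison shows no difference, because the apparent gain comes only from sampling more soil mass.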

3.3.4 Relationship between RR and SCR

For eleven of the reviews, both the RR and SCR effect size indices were available (Tables 1, 2, and 3). Since RR is a ratio calculated from the total change in SOC that had occurred at the end of a given study period, and not a rate like SCR, the two indices were not correlated (R2 = 0.02). When RR is expressed as a rate (i.e., % year−1) by normalizing it by the mean study duration, so that RR and SCR become comparable, the correlation becomes significant (R2 = 0.52, Fig. 6). For some of the mean RRs extracted from the reviews, we did not have the possibility to account for the duration of individual studies. In those cases, we used the reported mean study duration as a reasonable approximation. The Maillard and Angers (2014) data were excluded because the exact study duration (i.e., stated as > 20 years) was not available. Furthermore, we only present correlations excluding the very high SCR value for manures and the very low SCR value for crop residue removal from the studies by Aguilera et al. (2013) and Lu (2015), respectively (see discussion in Sect. 3.1). Including these latter two data points gives a better correlation (R2 = 0.76), but including only the Lu (2015) data point gives almost no correlation (R2 = 0.13). SCRs tend to increase with increasing RRs, but not always. This relates to the actual SOC stocks in the LTEs from which the indices are calculated, i.e., a large percentage response does not necessarily indicate a large change in absolute SOC stocks when these are small (Ladha et al. 2011). Differences in soil depth between reviews also play a role here, as do other factors such as assumptions about bulk density. When data are available, it appears useful to report both indices, since this allows more insight into comparisons between reviews.
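The normalization and the correlation described above can be sketched as follows. The sketch assumes that R2 is the squared Pearson correlation coefficient between the two indices, and the per-review values are hypothetical placeholders, not the data plotted in Fig. 6.

```python
def rr_per_year(rr_pct, mean_duration_years):
    """Normalize a response ratio (%) by the mean study duration (% year-1)."""
    if mean_duration_years <= 0:
        raise ValueError("the study duration must be positive")
    return rr_pct / mean_duration_years


def r_squared(x, y):
    """Squared Pearson correlation coefficient between two sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy ** 2 / (sxx * syy)


# Hypothetical review-level values: mean RR (%), mean duration (years),
# and mean SCR (kg C ha-1 year-1).
rr = [10.0, 30.0, 8.0, 12.0]
duration = [20.0, 30.0, 10.0, 40.0]
scr = [150.0, 400.0, 300.0, 100.0]

rr_rate = [rr_per_year(r, d) for r, d in zip(rr, duration)]

print(f"R2, RR vs SCR:            {r_squared(rr, scr):.2f}")
print(f"R2, RR per year vs SCR:   {r_squared(rr_rate, scr):.2f}")
```

A normalized RR of 0.5% year−1, for example, corresponds to 5 per mille per year, which makes the index directly comparable to an annual stock change rate.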

Fig. 6 Relationship between the mean relative response ratios (RRs) normalized with mean study duration and mean stock change rate (SCR) effect size indices for soil organic carbon (N = 11)

With respect to the soil carbon 4 per mille initiative, most of the normalized RRs are well above the 4 per mille level (Fig. 6), with an overall average value of 0.7% (i.e., 7 per mille). This initiative has quantified the offset of atmospheric CO2 emissions based on a blanket calculation of SOC stocks to a depth of 1 m for agricultural land. There are uncertainties in the total amount of C offset relating to the average SOC contents and areas of agricultural land, as widely discussed in Minasny et al. (2017a), and the initiative can be viewed more as a concept than as precise numbers (Minasny et al. 2017b). However, the SCRs associated with the 4 per mille initiative are approximately in the order of 500 to 650 kg C ha−1 year−1, or more. The values we have compiled are generally below such levels (Table 4 and Fig. 6). This is a consequence of the sampling depth considered in the reviews, which mostly corresponded to a topsoil depth of 20 cm and almost never exceeded 30 cm, as mentioned in Sect. 3.3.1. As discussed therein, there are also measurements available from subsoils. Although subsoil data are limited, changes in subsoil C can be important when they are included, as illustrated by Meurer et al. (2018), who showed that the effect of no-tillage on SOC is overestimated when excluding measurements below the topsoil. There is definitely a need for measuring changes in subsoil C in existing long-term field experiments. As such data emerge, they should be used to update effect size indices for characterizing SOC changes.
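The link between a relative target and an absolute stock change rate can be shown with simple arithmetic. The sketch below assumes that the 4 per mille rate is applied as a plain annual fraction of the SOC stock. The topsoil stock used at the end is a hypothetical value for illustration only.

```python
FOUR_PER_MILLE = 0.004  # annual relative increase in SOC stock


def scr_for_target(stock_mg_c_ha, annual_fraction=FOUR_PER_MILLE):
    """Stock change rate (kg C ha-1 year-1) needed to reach a relative target."""
    return stock_mg_c_ha * 1000.0 * annual_fraction


def stock_implied_by_scr(scr_kg_c_ha_yr, annual_fraction=FOUR_PER_MILLE):
    """SOC stock (Mg C ha-1) for which a given SCR equals the relative target."""
    return scr_kg_c_ha_yr / annual_fraction / 1000.0


# Stocks for which 500 and 650 kg C ha-1 year-1 correspond to 4 per mille.
low_stock = stock_implied_by_scr(500.0)
high_stock = stock_implied_by_scr(650.0)
print(f"implied stocks: {low_stock:.1f} to {high_stock:.1f} Mg C ha-1")

# Hypothetical topsoil stock of 50 Mg C ha-1 (not a value from the reviews).
topsoil_scr = scr_for_target(50.0)
print(f"4 per mille of a 50 Mg C ha-1 topsoil: {topsoil_scr:.0f} kg C ha-1 year-1")
```

Under this assumption, rates of 500 to 650 kg C ha−1 year−1 correspond to stocks of 125 to 162.5 Mg C ha−1, which is why the same relative rate translates into a much smaller SCR when only a shallower layer with a smaller stock is considered.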

4 Conclusions

The overall average RR was highest for manures, followed by the effect of aboveground crop residue removal and cover crops; a lower value was associated with N fertilization. However, the overall average SCR for N fertilization exceeded that for the effect of aboveground crop residue removal, which was lowest. The highest average SCR was for manures, followed by that for cover crops. The total number of paired comparisons involved in the reviews was highest for the effect of crop residue removal and N fertilization, followed by that for manures, and lowest for cover crop comparisons. Except for cover crops, the number of paired comparisons using the RR indices largely dominates over that using SCR. When assessed, the interactions between changes in SOC and texture or climate were about equally often significant as not, with results not always consistent among reviews. The (relative) indirect effect of yield on SOC changes is sometimes detectable, in particular for the effect of crop residue removal, while it seems difficult to find interactions between SOC changes and crop types or rotations for a given management practice. The mean soil depth considered in reviews was less variable than mean study durations. When present, the proportion of data on SOC changes in subsoils (> 30 cm) is usually about ≤ 10%. The mean study durations of reviews are subject to very different selection criteria, partly driven by data availability from individual studies on a given management practice. Inconsistent results for the effect of study duration on changes in SOC are discernible, most likely related to the heterogeneity in selection criteria used for length of experiments. When normalizing the RRs with the mean study duration of each review, there is a reasonable relationship between RRs and SCRs. Consequently, it appears relevant to calculate RRs as an average percentage change per year, in addition to the usual RR for the entire length of each LTE. Furthermore, when data allow (i.e., mass-based data), it is suggested to include both indices in reviews because this allows a better appreciation of SOC changes. Although the average effects of management practices on SOC changes are reasonably well known at the field scale, as shown in our synthesis, the variation between individual sites is large and the underlying mechanisms for these differences require more focus in future research.