1 Introduction

The need for food production for a growing population has driven the intensification of agricultural practice within the last century, with severe consequences for environmental health. As a result, key planetary boundaries such as nutrient cycling, greenhouse gas emissions, and biodiversity loss have been crossed, partly due to agricultural practices (Steffen et al. 2015). Sustainable agricultural systems that aim to ensure food security while preserving natural resources need to be implemented on a broad scale. Within the last decades, organic agricultural systems have increasingly gained attention as alternative ways to produce food and ensure food security by focusing on environmental sustainability (Reganold and Wachter 2016; Lorenz and Lal 2016). Now that societies are becoming concerned about the environmental impact of food production systems, market shares of organic agricultural products are increasing globally (Willer et al. 2021), despite 5 to 34% lower yields in organic systems compared to conventional (Seufert et al. 2012). The principles of organic farming prohibit the use of chemical fertilizers and pesticides and favor biologically active and functional soils to support agricultural productivity. While both organic and conventional farming systems are largely based on a mechanistic understanding of nature and science, biodynamic farming also has roots in the non-scientific realm. It is based on personal inspirations (Steiner 1925) and incorporates ideas that have a ritual character. Despite its unmeasurable aspects, the biodynamic farming system includes rules and practices that may have measurable effects on crops and soils.

Soil quality plays a vital role in providing ecosystem services, e.g., by regulating nutrient and water cycles and providing habitat for soil biodiversity (Adhikari and Hartemink 2016; Greiner et al. 2017). Yet, due to the complex nature of soil processes and the large timescales, changes in soil quality are often overlooked in day-to-day farming operations and even more so in policy decisions (Adhikari and Hartemink 2016). While the definition of a universal soil quality index is inherently difficult, soil organic carbon (SOC) is the most important soil quality indicator (Milne et al. 2015), followed by soil pH and available phosphorus (Bünemann et al. 2018).

The SOC content, an outcome of the interplay of primary producers, decomposers, and soil mineralogy (Lehmann and Kleber 2015), has a direct and indirect impact on biological, chemical, and physical soil properties (Lal 2016). Increasing SOC contents can partly offset anthropogenic CO2 emissions (Minasny et al. 2017; VandenBygaart 2018; de Vries 2018). However, in the history of agriculture, SOC contents have mostly been depleted rather than replenished (Sanderman et al. 2017). Today, in times of agricultural intensification, SOC contents are further decreasing (Keel et al. 2019). This is particularly true when grassland is converted to arable farming (Oberholzer et al. 2014). Moreover, SOC losses may increase in the coming decades due to warming that triggers enhanced SOC mineralization in agricultural soils. Wiesmeier et al. (2016) have postulated an additional need for organic matter inputs to agricultural soils in order to stabilize SOC pools in Central European soils. Consequently, there is an urgent need for agricultural practices to counteract SOC losses and build additional SOC. Against this backdrop, continuous data from long-term field experiments provide the backbone for monitoring agricultural practices and their impact on SOC contents, but they cannot replace on-farm monitoring programs that cover the diversity of soils and management forms (Gubler et al. 2019).

Although the stabilization of soil organic matter relies on soil biological activity (Lehmann and Kleber 2015), recent reviews show that biological soil quality indicators are still under-represented in the scientific literature of the last decades compared to chemical or physical indicators (Bünemann et al. 2018; Lehmann et al. 2020). High biological soil quality is not only a value in itself, addressing the function of soil as a habitat, but it also drives other biological processes that support crop production, such as mineralization of nutrients, root colonization by symbionts, and suppression of plant diseases (Cadisch and Giller 1997; Bender et al. 2013; van der Heijden et al. 2016). Microbial biomass carbon and nitrogen (Cmic and Nmic) are among the most widely used biological soil quality indicators (Bünemann et al. 2018; Jenkinson et al. 2004; Anderson and Domsch 2010). Soil basal respiration, another frequently measured biological soil quality indicator, is defined as the steady rate of respiration, which originates from the mineralization of organic matter (Pell et al. 2005). The ratio between soil respiration and Cmic results in the metabolic quotient (qCO2), which characterizes the metabolic efficiency of a given soil microbial community (Anderson and Domsch 1993). The activity of phosphatase enzymes controls the biotic pathways of phosphorus that play a crucial role in plant growth (Caldwell 2005). Soil pH is a key soil quality indicator that drives the diversity of soil bacterial communities, whereas fungal community composition appears to be less strongly affected by pH (Lauber et al. 2009; Rousk et al. 2010).

Although it has been shown repeatedly that organic farming systems improve soil quality compared to conventional systems (Mäder et al. 2002; Lori et al. 2017), the changes and differences in SOC contents are often controversially discussed (Leifeld and Fuhrer 2010; Kirchmann et al. 2008). Enhanced SOC contents under organic compared to conventional management were documented in a recent meta-analysis (Gattinger et al. 2012). However, monitoring of SOC contents over 21 years in the world’s oldest farming system comparison trial, the DOK trial, located in the temperate climate of north-western Switzerland (Therwil), reported decreasing or at best stable SOC contents in organic and conventional systems (Fliessbach et al. 2007). The current interest in determining changes in SOC, a large sample archive, advances in SOC methodology, and another 21 years of management call for a renewed assessment of soil organic carbon dynamics after 42 years of organic and conventional management.

Therefore, in this study, we present a timeline of SOC contents in the eight farming systems implemented in the DOK (bioDynamic, bioOrganic, Konventionell (German for conventional)) farming system comparison trial across six 7-year crop rotation periods of arable cropping, and discuss our results in relation to previous observations and methodological approaches for monitoring SOC contents. We aimed to answer the following research questions: (a) how long does it take to confirm changes in SOC, and (b) which farming system stabilizes or builds additional SOC? Furthermore, we hypothesize that a long-term change in SOC content is indicated and driven by enhanced biological soil quality. Finally, we discuss possible mechanisms driving the effect of farming systems on soil biology and SOC changes.

2 Material and methods

The study presented here is based on the long-term analysis of SOC changes in archived soil samples and on the analysis of mainly biological soil quality indicators at the end of the 6th crop rotation period in 2019.

2.1 Field setup

The DOK trial was set up in 1978 and is located close to Therwil (47° 30.16′N, 7° 32.35′E) in the vicinity of Basel (Switzerland), and managed by the Swiss Centre of Excellence for Agricultural Research (Agroscope, Reckenholz) and the Research Institute of Organic Agriculture (FiBL, Frick). In order to underpin this, advisory farmer groups were established for each farming system. The soil is a haplic luvisol on deep deposits of alluvial loess at the southern end of the upper Rhine valley. On average, it contains 12% sand, 72% silt, and 16% clay (Leifeld et al. 2009). The climate is relatively dry and mild with a mean precipitation of 840 mm per year. A 30-year average annual temperature of 9.7 °C was reported at the start of the field trial; this has increased to 10.9 °C since then (Meteoblue 2022).

The field trial mimics certified conventional and organic systems that are practiced in the region. The DOK trial follows a system comparison approach in which farming systems mainly differ in fertilization strategy and plant protection (Table 1). With eight farming systems, this leads to a total of 96 experimental plots with a size of 5 × 20 m each (Fig. 1). In all farming systems, the same 7-year crop rotation is applied including 2 years of grass-clover and a catch crop before summer crops (Table 2). The five main crops are grass-clover, maize, wheat, potato, and soybean.

Table 1 Management details of the DOK farming systems (–: not applicable).
Fig. 1 Field setup of the DOK trial with spatial orientation of blocks, columns, rows, and subplots. Each year three crops are cultivated in the subplots A, B, and C. N, NOFERT, unfertilized; D, BIODYN, biodynamic; O, BIOORG, bioorganic; K, CONFYM, conventional with farmyard manure; and M, CONMIN, conventional purely mineral fertilization. Organic fertilization: 0.7 and 1.4 correspond to organic fertilization at 0.7 and 1.4 livestock units per hectare.

Table 2 Crop rotation details in each of the six crop rotation periods (CRP) (Grass-clover is a standard mixture of Trifolium pratense 6%, Trifolium repens 12%, Dactylis glomerata 17%, Festuca pratensis 36%, Lolium perenne 21%, and Phleum pratense 8%. The catch crop in CRP 1 and 2 was harvested and used as fodder. Thereafter it was used as a green manure consisting of Trifolium alexandrinum 17%, Vicia sativa 57%, Phacelia tanacetifolia 13%, Guizotia abyssinica 3%, and Brassica juncea 3%).

The biodynamic (BIODYN) and the bioorganic (BIOORG) systems receive slurry and farmyard manure, whereas farmyard manure, slurry, and supplemental mineral fertilizers are applied in the conventional system CONFYM. All systems with manure amendment (BIODYN, BIOORG, and CONFYM) are practiced at two intensities, corresponding to fertilization by 0.6 and 1.2 livestock units (LU) per hectare in the first two crop rotation periods (CRP) and 0.7 and 1.4 LU per hectare from the third CRP onwards (Krause et al. 2020). Systems at high fertilization intensity correspond to the average stocking density in Switzerland, and those at low fertilization intensity mimic a reduction in stocking density by 50%. The second conventional treatment, CONMIN, mimics a stockless system with mineral fertilizer only. This treatment was established after the first CRP and was left unfertilized during the first CRP. The control treatment, NOFERT, has remained unfertilized since 1978. The amount of organic fertilizer is normalized referring to the amount of manure before treatment-specific manure processing. Manure processing increases in duration and aeration intensity from CONFYM (stacked manure) to BIOORG (rotted manure) to BIODYN (composted manure) (Table 1). Therefore, the mean organic carbon and nitrogen inputs via manure and slurry differ between farming systems, whereas potassium and phosphorus inputs are unaffected by manure processing and serve as an additional control for the normalization. Mineral fertilizers in CONMIN and CONFYM are given as calcium-ammonium-nitrate (CAN), urea (CH4N2O), potassium chloride (KCl), and triple super phosphate (TSP) up to the levels of the Swiss federal regulations (Richner and Sinaj 2017). It should be noted that organic fertilizer nitrogen in CONFYM is not fully accounted for as plant-available, which results in higher total nitrogen inputs in CONFYM than in CONMIN (Table 3).

Mimicking mixed crop-livestock systems, the biomass of wheat, maize, and grass-clover is removed from the field, while crop residues of potato and soybean remain on site. Wheat straw is returned to the field plots as part of the system-specific manures.

Table 3 Mean annual inputs of nutrients and organic matter from 2nd to 6th crop rotation period and relative annual inputs as compared to conventional farming with fertilization corresponding to 1.4 livestock units per hectare (CONFYM 1.4) [%]. Treatments are listed from low to high fertilization intensity. NOFERT, unfertilized; BIODYN, biodynamic; BIOORG, bioorganic; CONFYM, conventional with farmyard manure; and CONMIN, conventional purely mineral fertilization. Organic fertilization: 0.7 and 1.4 correspond to organic fertilization at 0.7 and 1.4 livestock units per hectare. Ntot refers to total nitrogen inputs; Nmin is the sum of NH4+-N and NO3−-N from slurry or mineral fertilizer inputs. OM refers to organic matter inputs. P and K refer to phosphorus and potassium.

Plant protection in BIOORG and BIODYN is conducted according to the respective standards (Lampkin 1990). In the conventionally managed systems, herbicides, fungicides, insecticides, and molluscicides are applied. Potatoes in the BIOORG system are treated with copper sulphate (CuSO4) and in both organic systems Bacillus thuringiensis is sprayed against potato beetles. After the third CRP, economic thresholds for pest and disease treatments following the integrated plant protection approach were introduced in conventional systems. Plant protection in NOFERT is identical to organically managed systems. Crop rotation changes slightly at the end of each crop rotation period, but in the same way for all farming systems (Table 2) (Besson and Niggli 1991; Krause et al. 2020). Apart from the more frequent mechanical weeding in the organically managed plots, soil tillage is conducted in the same way for all farming systems to a depth of 20 cm.

The experiment has a split-split-plot design with four field replicates for each of the three crops planted each year.
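The plot count given in the field setup follows directly from this design. The short sketch below enumerates the combinations of farming system, subplot, and field replicate; the treatment labels are taken from the figure and table captions, while the variable names are illustrative only.

```python
from itertools import product

# Eight farming systems as labelled in the captions of this section.
systems = [
    "NOFERT", "BIODYN 0.7", "BIOORG 0.7", "CONFYM 0.7",
    "BIODYN 1.4", "BIOORG 1.4", "CONFYM 1.4", "CONMIN",
]
subplots = ["A", "B", "C"]   # three crops cultivated in parallel each year
replicates = [1, 2, 3, 4]    # four field replicates

# Every combination of system, subplot, and replicate is one experimental plot.
plots = list(product(systems, subplots, replicates))
plots_per_system = len(subplots) * len(replicates)

print(len(plots))         # total number of experimental plots
print(plots_per_system)   # plots contributing to each system mean
```

The twelve plots per farming system correspond to the n = 12 reported for the least square means in the tables and figures.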

2.2 Quantification of nutrient contents of organic fertilizers

Dry matter-dependent nutrient contents in treatment-specific organic fertilizers such as slurry and manure were quantified each year. Briefly, the nitrogen contents of fertilizers were determined according to the Kjeldahl method by chemical oxidation and subsequent steam distillation. Organic matter content was determined by weight loss after combustion and potassium and phosphorus content was determined by photometric analysis of ash extract.

2.3 Soil sampling

Across 42 years, soil samples for long-term assessment have been taken after crop harvest as a composite sample of 15 to 20 soil cores of 3 cm diameter from each of the 96 experimental plots to a depth of 20 cm. In the lab, samples were freed of large organic particles and roots, before being passed through a 2 mm sieve and air-dried at room temperature. Since three parallel crops were cultivated each year, sampling dates differed depending on the harvest of the respective crop. When grass-clover was cultivated for two or three consecutive years, soil sampling was performed in autumn after the last cut.

Additional soil samples for measuring biological soil quality indicators were taken as described above at the end of the 6th CRP on February 26, 2019 before the first management procedures for grass-clover, catch crop, and winter wheat. Cooling boxes were used for sample transport to the lab, where moisture was adjusted to approx. 40 % of the soil’s maximum water holding capacity and the samples were sieved to 2 mm and stored at 4 °C before downstream analysis.

2.4 Quantification of soil organic carbon and total nitrogen content

Archived soil samples were used to quantify SOC and total nitrogen (Ntot) contents using a Vario Max Cube equipped with a thermal conductivity detector (Elementar Analysensysteme, DE). SOC was determined as the difference between total soil carbon and inorganic carbon analyzed in two subsamples of 1 g. The first subsample was combusted at 900 °C in an oxygen carrier gas to measure total soil carbon and total nitrogen. Even though the soil at the site is considered to be free of carbonates, the background level was measured by heating the second subsample to 500 °C in a muffle furnace for 4 h to remove the organic carbon. Subsequently, the second subsample was combusted as above at 900 °C to determine inorganic carbon contents and SOC was calculated as the difference of total and inorganic carbon contents. Outlier removal (42 samples) resulted in a dataset of 1878 measurements from the years 1982 and 1983 and every second year between 1984 and 2018.
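As a plausibility check on the calculation and the dataset size described above, the following minimal sketch computes SOC by difference and counts the expected measurements. It assumes that all 96 plots were analyzed in every sampling year, which is not stated explicitly; the function and variable names and the example values are hypothetical.

```python
def soc_by_difference(total_c, inorganic_c):
    """Return SOC as total carbon minus inorganic carbon (same units, e.g. g kg-1)."""
    return total_c - inorganic_c

# Sampling years: 1982 and 1983, then every second year from 1984 to 2018.
sampling_years = [1982, 1983] + list(range(1984, 2019, 2))

n_plots = 96      # experimental plots in the trial
n_outliers = 42   # samples removed as outliers

# Assumption: every plot was analyzed in every sampling year.
n_expected = len(sampling_years) * n_plots - n_outliers

# Hypothetical example values in g kg-1 for one sample.
example_soc = soc_by_difference(total_c=14.3, inorganic_c=0.1)

print(len(sampling_years), n_expected)
```

Under this assumption, the 20 sampling years and 96 plots give 1920 samples, and removing 42 outliers leaves the 1878 measurements reported.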

2.5 Quantification of soil quality indicators

To measure soil pH in water (pH-H2O), 10 g of a dried soil sample was suspended in 25 ml demineralised water and left at room temperature overnight; the next day, pH was measured in the settling suspension using a calibrated electrode.

Cmic and Nmic were analyzed according to the original method of Vance et al. (1987) as adapted by Fliessbach et al. (2007). Moist soil samples were taken from the cooling room and a subsample of approx. 150 g was allowed to equilibrate at 20 °C for 7 days in the dark. Soil moisture was controlled and six parallel subsamples of a dry matter equivalent of 20 g were weighed into glass jars. Three subsamples were extracted directly with 80 ml of 0.5 M K2SO4 on a rotating shaker at 180 rpm. The other three subsamples were placed in a desiccator lined with moistened paper towels together with a beaker filled with ethanol-free CHCl3. The desiccator was evacuated until the CHCl3 was vigorously boiling and was left under vacuum for 24 h in the dark. After removal of the CHCl3, these samples were extracted in the same way as the others the day before. The six extracts per sample were then analyzed in a C-N-analyzer for liquid sample injection (Analytik Jena). Soil microbial biomass was calculated using the following formulas: Cmic [µg g−1] = EC/kEC and Nmic [µg g−1] = EN/kEN, where EC = TOC fumigated − TOC unfumigated and kEC = 0.45 is an empirically determined correction factor (Joergensen 1996), and, analogously, EN = TNb fumigated − TNb unfumigated and kEN = 0.54 (Joergensen and Mueller 1996).
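The two formulas can be illustrated with a minimal sketch. It assumes that the extract concentrations have already been converted to µg per g dry soil and that the three fumigated and three unfumigated replicates are averaged before taking the difference; both points, as well as the function name and the example values, are assumptions for illustration and not taken from the method description.

```python
from statistics import mean

K_EC = 0.45  # correction factor for carbon (Joergensen 1996)
K_EN = 0.54  # correction factor for nitrogen (Joergensen and Mueller 1996)

def microbial_biomass(fumigated, unfumigated, k):
    """Return the flush (fumigated minus unfumigated) divided by the factor k.

    Inputs are assumed to be extractable C or N in ug per g dry soil,
    one value per replicate extract.
    """
    flush = mean(fumigated) - mean(unfumigated)
    return flush / k

# Hypothetical replicate values in ug g-1 dry soil.
toc_fumigated, toc_unfumigated = [148.0, 150.0, 152.0], [59.0, 60.0, 61.0]
tnb_fumigated, tnb_unfumigated = [30.0, 31.0, 32.0], [3.0, 4.0, 5.0]

cmic = microbial_biomass(toc_fumigated, toc_unfumigated, K_EC)  # approx. 200 ug g-1
nmic = microbial_biomass(tnb_fumigated, tnb_unfumigated, K_EN)  # approx. 50 ug g-1

print(round(cmic, 1), round(nmic, 1), round(cmic / nmic, 2))
```

With these example values the Cmic/Nmic ratio is about 4; actual values depend on the measured extracts.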

Alkaline phosphomonoesterase activity was determined according to Alef et al. (1995) with p-nitrophenol. One gram of soil was incubated at 37 °C in a solution containing disodium p-nitrophenyl phosphate as an alternative substrate. After 60 min, the reaction was stopped by adding 2 M NaOH and the cleared solution was measured at 400 nm against a calibration curve with nitrophenol. The result was given as µg nitrophenol g−1 soil h−1.

Soil basal respiration was measured in pre-incubated (7 days at 22 °C) samples as CO2 evolved over a period of 72 h. Soil samples (20 g dry matter) were weighed into perforated centrifuge tubes and placed into a screw cap bottle (Schott, 250 ml) in the presence of 0.025 N NaOH as a CO2 trap for a 24 h pre-incubation period in the bottle. The actual measurement was started by adding exactly 20 ml of 0.025 N NaOH. After 72 h, the soil was taken out of the bottle and the alkali was titrated with 0.025 N HCl as described in the reference methods of the Swiss agricultural research centres (Agroscope 1996-2018).

2.6 Statistical analysis

The impact of the farming system on soil organic carbon contents for each sampling year was determined by means of a linear mixed effects model with repeated measurements using R version 4.0.2 and RStudio (Team R 2020). The lme function of the nlme package was used with farming system nested in subplot and column as repeated random factors to account for spatial heterogeneity of the field trial (Pinheiro et al. 2020). Subsequently, ANOVA was employed to determine the impact of the farming system, sample year and their interaction, and soil clay contents. The mean annual change in SOC contents for each experimental plot was calculated as the slope from a linear regression of SOC contents from 1982 to 2018. Statistical evaluations of biological soil quality indicators were based on a factorial ANOVA with the clay content, column, and subplot nested in the column as co-variables (Figure S1) using JMP® Pro 14.1.0. In the case of model significance, a Tukey HSD test was applied to test the differences between the farming systems.
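The per-plot slope calculation can be sketched as an ordinary least-squares fit of SOC content against sampling year. The snippet below is an illustration only: the analysis itself was run in R, the function name is hypothetical, and the SOC series is synthetic, constructed so that its slope is known in advance.

```python
def ols_slope(years, values):
    """Return the ordinary least-squares slope of values regressed on years."""
    n = len(years)
    mean_x = sum(years) / n
    mean_y = sum(values) / n
    sxx = sum((x - mean_x) ** 2 for x in years)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, values))
    return sxy / sxx

# Sampling years as described for the archived samples.
years = [1982, 1983] + list(range(1984, 2019, 2))

# Synthetic SOC series for one plot in g kg-1 (hypothetical values):
# a start value of 14.0 g kg-1 and a constant gain of 0.0358 g kg-1 per year.
soc = [14.0 + 0.0358 * (year - 1982) for year in years]

slope_g_per_kg_yr = ols_slope(years, soc)
slope_mg_per_kg_yr = slope_g_per_kg_yr * 1000.0  # convert g kg-1 to mg kg-1

print(round(slope_mg_per_kg_yr, 1))
```

Because the slope is fitted separately for each of the 96 plots, the resulting values can be treated like any other plot-level response variable in the subsequent ANOVA.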

To assess the overall impact of farming systems on biological soil quality, we ran a principal component analysis (PCA) followed by a factor rotation using the varimax routine (JMP 14.1.0). The PCA included soil pH, Cmic, Nmic, SOC, Ntot, alkaline phosphatase activity, basal respiration, and the annual change in soil carbon content across 42 years. Biplots show the ordination of the principal components and their averages (n = 12) in addition to the loading factors that drive the ordination.

3 Results and discussion

3.1 Monitoring changes of soil organic carbon contents in the DOK trial

Across 2nd to 6th CRP, BIODYN 1.4 exhibited the highest mean annual increase in SOC contents at a rate of 35.8 mg kg−1 yr−1. SOC contents in BIOORG 1.4 increased slightly, while in CONFYM 1.4 SOC contents remained constant (Table 4). In CONMIN and BIODYN 0.7 similar mean annual SOC losses of −27.8 mg kg−1 yr−1 and −27.5 mg kg−1 yr−1 were observed, followed by even higher losses in BIOORG 0.7 and CONFYM 0.7. NOFERT exhibited the highest annual loss of −91.7 mg kg−1 yr−1.
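To put these rates into perspective, the sketch below converts them into cumulative changes over the observation period. It assumes a strictly linear trend and a period of 35 years (five crop rotation periods of 7 years each); both are simplifying assumptions, and the calculation is retrospective rather than a projection.

```python
# Mean annual SOC changes reported in the text, in mg kg-1 yr-1.
annual_change_mg = {
    "BIODYN 1.4": 35.8,
    "CONMIN": -27.8,
    "BIODYN 0.7": -27.5,
    "NOFERT": -91.7,
}

# Assumption: 2nd to 6th crop rotation period, 7 years each.
years_covered = 5 * 7

# Cumulative change in g kg-1, assuming a constant annual rate.
cumulative_change_g = {
    system: rate * years_covered / 1000.0
    for system, rate in annual_change_mg.items()
}

for system, change in cumulative_change_g.items():
    print(f"{system}: {change:+.2f} g kg-1")
```

Under these assumptions, the gain in BIODYN 1.4 amounts to roughly 1.3 g kg−1 and the loss in NOFERT to roughly 3.2 g kg−1 over the period considered.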

Table 4 Soil pH, soil organic carbon (SOC), and total nitrogen (Ntot) contents in spring 2019, after 42 years of organic and conventional farming. Mean annual changes in soil organic carbon contents from 2nd to 6th crop rotation period. Data show least square means (n = 12), standard errors and different letters in a column denote significant difference of the post-hoc Tukey test at p = 0.05. Treatments are listed from low to high fertilization intensity. NOFERT, unfertilized; BIODYN, biodynamic; BIOORG, bioorganic; CONFYM, conventional with farmyard manure; and CONMIN, conventional purely mineral fertilization. Organic fertilization: 0.7 and 1.4 correspond to organic fertilization at 0.7 and 1.4 livestock units per hectare.

In 1986, 8 years after the start of the experiment, the first statistically significant difference between any of the farming systems emerged when SOC contents in BIODYN 1.4 exceeded the two systems without organic fertilization (NOFERT and CONMIN) by 1.8 g kg−1 (Fig. 2). However, CONMIN was introduced as such only then, being unfertilized during the first CRP. Twenty-two years after the start of the experiment, the difference in SOC became significant between BIODYN 1.4 and CONFYM 1.4 at the same fertilization intensity. We never found significant differences between the SOC contents of BIOORG and CONFYM at 1.4 LU or between any of the farming systems at 0.7 LU. ANOVA showed that the interaction of farming system and sampling year, as well as clay content, significantly affected SOC content over the course of the field trial (Fig. 2).

Fig. 2 Temporal development of soil organic carbon contents in the eight farming systems of the DOK trial. Data show least square means (n = 12) and results of repeated two-way ANOVA (FS farming system, year: sampling year, clay: soil clay content) using a linear mixed effect model accounting for spatial arrangement of farming systems within the field setup. Arrows point at years when significant differences between treatments (empty arrow: BIODYN 1.4>CONMIN, NOFERT, full arrow: BIODYN 1.4>CONFYM 1.4) were detectable for the first time. NOFERT, unfertilized; BIODYN, biodynamic; BIOORG, bioorganic; CONFYM, conventional with farmyard manure; and CONMIN, conventional purely mineral fertilization. Organic fertilization: 0.7 and 1.4 correspond to organic fertilization at 0.7 and 1.4 livestock units per hectare.

In an earlier study, Fliessbach et al. (2007) presented the development of SOC content in the DOK trial between 1977 and 1998 and observed that BIODYN 1.4 was the only system with stable SOC contents, while all other farming systems lost SOC. The earlier analyses were closely correlated (r2 = 0.88) to those presented here, despite methodological differences (wet oxidation vs. element analysis). Also, the differences between the farming systems were similar to those found in the present study. However, Fliessbach et al. (2007) included mixed samples from 16 blocks (Fig. 1) taken before a homogeneous grass-clover oats mixture was sown in May 1977. The inclusion of these samples might have captured an initial decrease in SOC due to the change in land management, in addition to the different sampling pattern. Since the SOC contents of these samples are out of range compared to the extended measurements until 2018, we refute the statement of Fliessbach et al. (2007) of decreasing or at best stable SOC contents. This is justified in light of the stable SOC contents in CONFYM 1.4 and the increasing SOC contents in BIOORG 1.4 and especially BIODYN 1.4 after 42 years (Fig. 2).

Modified management practices in the DOK trial might explain some temporal changes in SOC contents. Specifically, at the beginning of the third CRP (1992–1998) stocking densities were increased from 0.6 to 0.7 and from 1.2 to 1.4 LU ha−1 in order to better conform with the livestock densities commonly found in Swiss agriculture (Mäder et al. 2006; Krause et al. 2020; Swiss Federal Office for Agriculture 2020). The crop rotation was then adapted to the higher stocking densities by introducing a third year of grass-clover in the third CRP. In the 4th CRP, the third year of grass-clover was replaced by soybean, and red beet by maize (Table 2). The incorporation of legumes may have had a particularly positive influence on the SOC contents (Stagnari et al. 2017). Overall, the changes in manure application rate and crop diversity may have provided a positive stimulus for the increase of SOC content from the 4th CRP onwards.

The future evolution of SOC contents in the DOK trial is not trivial to predict. The mean annual SOC changes given in Table 4 refer to past observations. Extrapolating these trends would neglect possible saturation levels (Stewart et al. 2007) or enhanced SOC mineralization due to rising temperatures (Wiesmeier et al. 2016). Quantification of SOC contents in separated density fractions revealed a farming system effect only in the labile occluded particulate organic matter fraction, with enhanced carbon contents in BIODYN 1.4 compared to CONFYM 1.4, CONMIN, and NOFERT (Mayer et al. 2022). This illustrates the need for continuous soil carbon management, as enhanced SOC contents in BIODYN 1.4 seem prone to losses upon disturbance and/or management change.

3.2 Effect of farming systems on SOC contents

Similar to our study, a long-term field experiment at the Rodale Institute (PA, USA) found higher SOC levels under organic management compared to a mineral fertilizer-based farming system (Drinkwater et al. 1998; Pimentel et al. 2005). Long-term experiments in Rothamsted (UK) also demonstrate the positive effects on soil quality of applying large annual rates of manure, through increasing SOC stocks over more than 150 years (Johnston and Poulton 2018). The influence of organic matter management has been demonstrated repeatedly as the main driver of differences in changes in SOC contents between farming systems (Johnston and Poulton 2018; Ludwig et al. 2011; Heitkamp et al. 2009). Our results essentially confirm these observations, as a steady carbon loss was found at 0.7 LU fertilization intensity, whereas SOC contents remained stable or increased at 1.4 LU. Yet, the increase in SOC content in BIODYN 1.4 was achieved with a 20% lower total organic matter input compared to CONFYM 1.4 (Table 3). The SOC content in BIOORG 1.4 also increased slightly with a 15% lower input of organic matter compared to CONFYM 1.4. The differences in organic matter inputs between the farming systems are explainable by the losses during farming system-specific storage and composting procedures, which increase with increasing duration and aeration from CONFYM to BIOORG to BIODYN. This suggests that the assumption of higher organic matter inputs being the main reason for higher SOC content under organic compared to conventional management (Leifeld and Fuhrer 2010; Autret et al. 2016) needs to be refined by considering the quality of organic matter inputs and biological soil quality. Fertilization schedules in organically managed systems, which often involve smaller and more frequent manure applications than in conventional systems, could also influence soil carbon dynamics and soil microbial transformation processes.

Apart from fertilization, carbon enters agricultural soils via crop residues, roots, and rhizodeposition. Averaged over 2nd to 6th CRP and across the five major crops (maize, wheat, soybean, potato, and grass-clover), NOFERT exhibited about 44%, BIODYN 0.7 72%, BIOORG 0.7 75%, CONFYM 0.7 89%, BIODYN 1.4 82%, BIOORG 1.4 84%, and CONMIN 97% of the yields reported in CONFYM 1.4. This demonstrates higher carbon inputs to conventional systems through crop residues. Yet, Kätterer et al. (2011) state that carbon derived from rhizodeposition and roots is more likely to contribute to SOC contents than carbon from aboveground biomass. In a 16-year-old field experiment in France, SOC stocks under organic management without manure amendment increased by 0.28 t ha−1 yr−1 compared to no change in conventional or low-intensity cropping despite higher aboveground and belowground biomass inputs in the latter (Autret et al. 2016). The authors explain this observation by the presence of a legume cover crop and increased carbon inputs from the roots in a nutrient-poor environment. Belowground biomass and rhizodeposition are difficult to measure and are often based on fixed ratios (Bolinder et al. 2007). Hirte et al. (2018) used multiple pulse labeling with 13C−CO2 and found an equal belowground transfer of carbon in wheat grown in BIOORG 1.4 and CONFYM 1.4 plots. However, the same comparison in maize revealed a higher transfer to the soil in BIOORG 1.4 compared to CONFYM 1.4, supporting the suggestion that the shoot-to-root ratio is not constant and needs to be determined in dependency of soil quality and fertilization strategy. The findings of Hirte et al. (2018) help to explain the small difference in SOC content between CONFYM 1.4 and BIOORG 1.4, despite the much larger yield difference between the two systems.

3.3 Soil quality indicators after 42 years of organic and conventional management

After 42 years of system-specific soil management, the SOC and Ntot contents in spring 2019 exhibited clear differences between the farming systems (Table 4). Among the systems with fertilization intensity of 1.4 LU, SOC and Ntot contents in BIODYN were significantly higher than those in BIOORG and CONFYM. SOC and Ntot contents were lower in CONMIN and the farming systems fertilized at 0.7 LU. The lowest SOC and Ntot contents were found in NOFERT. In general, SOC and Ntot contents exhibited similar differences between the experimental treatments, and no long-term effect of farming system on soil C/N ratio could be detected (Table 4). Soil pH ranged from 5.94 in NOFERT to 6.78 in BIODYN 1.4. Generally, systems at 1.4 LU had a higher soil pH compared to those at 0.7 LU, except BIODYN 0.7, which was on par with BIOORG 1.4 and CONFYM 1.4 (Table 4). In this context, it is important to note that CONMIN and CONFYM were limed between 1999 and 2005 when soil pH had dropped below the critical level of 6 (Oberholzer et al. 2009).

Cmic and Nmic in soils of the DOK trial were highest in BIODYN 1.4, followed by BIOORG 1.4, BIODYN 0.7, and CONFYM 1.4. The farming systems at 0.7 LU ranked lower, but again, BIODYN 0.7 did not differ significantly from CONFYM 1.4, which in turn was similar to farming systems at 0.7 LU. The lowest Cmic was found in CONMIN and NOFERT, while Nmic in CONMIN was higher than in NOFERT (Table 5). Lower values were observed for the Cmic/Nmic ratio in BIODYN compared to CONFYM at fertilization intensities of 1.4 and 0.7 LU. The Cmic/SOC ratio in BIODYN 1.4 was significantly higher than in CONFYM 1.4. At both intensity levels, BIODYN systems had higher Cmic/SOC ratios than the systems without manure input indicating an improved habitat function at the same SOC content (Table 5).
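The two ratios discussed above are simple quotients, but the unit conversion for Cmic/SOC is easy to get wrong. The sketch below assumes Cmic and Nmic in mg kg−1 (equivalent to µg g−1) and SOC in g kg−1, and expresses Cmic/SOC as a percentage; the function names and the example values are hypothetical and do not reproduce entries of Table 5.

```python
def cmic_to_soc_percent(cmic_mg_per_kg, soc_g_per_kg):
    """Return microbial biomass carbon as a percentage of soil organic carbon."""
    soc_mg_per_kg = soc_g_per_kg * 1000.0  # convert g kg-1 to mg kg-1
    return cmic_mg_per_kg / soc_mg_per_kg * 100.0

def cmic_to_nmic(cmic_mg_per_kg, nmic_mg_per_kg):
    """Return the ratio of microbial biomass carbon to microbial biomass nitrogen."""
    return cmic_mg_per_kg / nmic_mg_per_kg

# Hypothetical example values for one plot.
cmic, nmic, soc = 300.0, 50.0, 15.0

print(round(cmic_to_soc_percent(cmic, soc), 2))  # share of SOC held in microbial biomass, in %
print(round(cmic_to_nmic(cmic, nmic), 2))        # Cmic/Nmic ratio
```

Expressing Cmic relative to SOC allows systems with different SOC contents to be compared on an equal footing, which is how the ratio is used in the paragraph above.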

Table 5 Effects of farming systems on microbial biomass carbon (Cmic) and nitrogen (Nmic) in spring 2019, after 42 years of organic and conventional farming. Data show least square means (n = 12) and standard errors; different letters in a column denote significant differences according to the post-hoc Tukey test at p = 0.05. Treatments are listed from low to high fertilization intensity. NOFERT, unfertilized; BIODYN, biodynamic; BIOORG, bioorganic; CONFYM, conventional with farmyard manure; and CONMIN, conventional purely mineral fertilization. Organic fertilization: 0.7 and 1.4 correspond to organic fertilization at 0.7 and 1.4 livestock units per hectare.

Positive effects of organic and biodynamic agricultural practices on biological soil quality have been reported from system comparison trials based on the concept of microbial biomass and soil enzyme activities (Gunapala and Scow 1998; Fliessbach et al. 2007; Heitkamp et al. 2011; Heinze et al. 2009). In a global meta-analysis, Lori et al. (2017) reported 84% higher protease activity under organic soil management. Another global meta-analysis found higher soil biological quality under biodynamic compared to organic management for 43% of the indicators studied (Christel et al. 2021). This is supported by the finding that the highest alkaline phosphatase activity in the DOK trial occurred in BIODYN 1.4. In BIODYN 0.7, BIOORG 1.4, and CONFYM 1.4, the activity was at the same level and higher than in CONMIN, BIOORG 0.7, and CONFYM 0.7. The lowest phosphatase activity was observed in NOFERT (Table 6).

Table 6 Effects of farming systems on microbial enzyme activities in spring 2019, after 42 years of organic and conventional farming. Data show least square means (n = 12) and standard errors; different letters in a column denote significant differences according to the post-hoc Tukey test at p = 0.05. Treatments are listed from low to high fertilization intensity. NOFERT, unfertilized; BIODYN, biodynamic; BIOORG, bioorganic; CONFYM, conventional with farmyard manure; and CONMIN, conventional purely mineral fertilization. Organic fertilization: 0.7 and 1.4 correspond to organic fertilization at 0.7 and 1.4 livestock units per hectare.

The ordination of the principal components visualizes the capability of biological soil quality indicators to distinguish the farming systems along the gradient between BIODYN 1.4 and NOFERT (Fig. 3). While the unfertilized control exhibited the lowest soil quality, the outstanding role of the two biodynamic systems is particularly interesting, since at the respective intensity level they exhibit the best performance for biological soil quality and SOC build-up (Tables 4, 5, and 6). Notably, the biological soil quality of BIODYN 0.7 was similar to CONFYM 1.4 (Tables 5 and 6). The system comparison approach implemented in the DOK trial does not allow for evaluating the effects of specific biodynamic management practices such as long aerobic composting procedures, use of biodynamic preparations, or plant protection measures. Some of the open questions became topics of experiments prompted by the DOK results and discussions. Berner et al. (2008) designed a factorial field experiment on reduced soil tillage, manure management, and biodynamic preparations. Here, the biodynamic preparations never produced significant effects on soil quality (Krauss et al. 2020). Consequently, we assume an improved organic input quality after long manure composting under aerobic conditions to be the main factor for enhanced soil quality and increasing SOC contents in the BIODYN systems.

Fig. 3

Principal component analysis (PCA) of soil quality indicators from Tables 4, 5, and 6. NOFERT, unfertilized; BIODYN, biodynamic; BIOORG, bioorganic; CONFYM, conventional with farmyard manure; and CONMIN, conventional purely mineral fertilization. Organic fertilization: 0.7 and 1.4 correspond to organic fertilization at 0.7 and 1.4 livestock units per hectare. Big symbols represent means per treatment, while smaller symbols show individual observations from each experimental plot (n = 12). Arrows represent factor loadings of rotated principal components of individual soil quality indicators. PC2 is driven by soil basal respiration and SOC change while all other indicators are more closely related to PC1 (Table 7).
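To illustrate how such an ordination is obtained, the following minimal Python sketch computes a PCA on standardized indicators with NumPy. It is an assumption-laden illustration, not the analysis used in this study: the data are synthetic random numbers, the function name `pca` is hypothetical, and the rotation of the components reported in Table 7 is not reproduced here.

```python
import numpy as np


def pca(X, n_components=2):
    """Unrotated PCA on standardized variables via SVD.

    Returns (scores, loadings, explained_variance_ratio) for the first
    n_components. Loadings are correlations between the original
    variables and the components.
    """
    X = np.asarray(X, dtype=float)
    # Standardize each indicator to zero mean and unit variance.
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    U, s, Vt = np.linalg.svd(Z, full_matrices=False)
    eigvals = s**2 / (Z.shape[0] - 1)
    explained = eigvals / eigvals.sum()
    scores = U[:, :n_components] * s[:n_components]
    loadings = Vt[:n_components].T * np.sqrt(eigvals[:n_components])
    return scores, loadings, explained[:n_components]


# Synthetic data only: 12 "plots" x 4 "indicators" (not DOK measurements).
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(12, 4))
X_demo[:, 1] += 2.0 * X_demo[:, 0]  # make two indicators correlated

scores, loadings, explained = pca(X_demo)
print(scores.shape, loadings.shape)
```

In a biplot, the scores give the positions of the plots, and the loadings give the directions and lengths of the indicator arrows.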

Basal respiration rates indicate soil microbial activity and the ability to mineralize organic matter. Such rates were highest in BIODYN 1.4 and lowest in NOFERT (Table 6), while the soils of all other farming systems were similar in this respect. The ordination of the principal components in Fig. 3 supports this view and, moreover, the results are consistent with a recent meta-analysis across six European long-term field trials showing enhanced respiration in organically fertilized soils (García-Palacios et al. 2018). The ratio of basal respiration to microbial biomass carbon (qCO2) indicates the energy demand of soil microbes. This was found to be lower in BIODYN 1.4 and BIOORG 1.4 than in NOFERT and CONMIN (Table 6). The capacity of the soil microbial community to mineralize organic matter is of agronomic importance, since its catabolic versatility drives nutrient recycling processes in soils. This applies especially to organic farming systems, where readily available fertilizers cannot be used to support crop growth. Interestingly, in the DOK trial the basal soil respiration and the annual SOC change rates estimated from the long-term trend over 42 years were correlated (r² = 0.25; p < 0.001). In the multivariate ordination, they point in the same direction and their factor loadings are high on PC2, while all other independent indicators included in the PCA are more closely related to PC1 (Fig. 3, Table 7). While enhanced soil respiration and rising SOC contents might at first seem contradictory, it must be noted that the long-term stabilization of soil organic matter depends on microbially induced carbon turnover and transfer of non-respired carbon into particulate or mineral-bound carbon pools (Lehmann and Kleber 2015). Consequently, the ability of soil microbes to mineralize organic carbon sources can be seen as a precursor and early indicator of subsequently increasing SOC contents.
A further explanation may be found in the lower metabolic quotient (qCO2) in BIODYN 1.4 and BIOORG 1.4 compared to NOFERT and CONMIN. This indicates that the soil microbes in these systems need less energy to maintain their metabolism, allowing them to focus on biomass growth rather than physiological maintenance (Table 6). Further support for farming systems affecting soil biological processes comes from a study that identified organic matter inputs as the main driving factor for distinct microbial communities in the farming systems of the DOK trial (Hartmann et al. 2014). Soil microbial biomass and other biological indicators are proposed as early indicators of SOC change, but under the conditions of the DOK trial, it merely appeared that the soil biology in the farming systems changed in step with SOC change. The observation that soil biological quality differs between farming systems implemented at similar fertilization rates underscores that not only input rates but also input quality and soil biology need to be considered in agricultural management decisions to maintain long-term soil quality.
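The metabolic quotient and the coefficient of determination referred to above can be computed as in the following minimal Python sketch. The function names and all numerical values are hypothetical and serve only as an illustration; units are deliberately left generic because they depend on the measurement protocol. For a simple linear relationship between two variables, r² equals the squared Pearson correlation coefficient, which is what the sketch assumes.

```python
import numpy as np


def metabolic_quotient(basal_respiration, cmic):
    """qCO2: basal respiration per unit of microbial biomass carbon."""
    resp = np.asarray(basal_respiration, dtype=float)
    biomass = np.asarray(cmic, dtype=float)
    if np.any(biomass <= 0):
        raise ValueError("Cmic must be positive")
    return resp / biomass


def r_squared(x, y):
    """Squared Pearson correlation between two variables."""
    r = np.corrcoef(np.asarray(x, dtype=float),
                    np.asarray(y, dtype=float))[0, 1]
    return r**2


# Hypothetical values for illustration only (not DOK measurements).
respiration = [2.0, 3.0, 4.5]
cmic_values = [4.0, 10.0, 18.0]
qco2 = metabolic_quotient(respiration, cmic_values)
print(qco2)  # a lower value means less respiration per unit biomass

soc_change = [-0.2, 0.0, 0.1]  # hypothetical annual SOC change rates
print(r_squared(respiration, soc_change))
```

In this reading, a system with high basal respiration can still show a low qCO2 if its microbial biomass is large, which is how enhanced respiration and a lower energy demand per unit biomass can coexist.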

Table 7 Rotated factor loadings of the principal component analysis. Numbers in italics represent the dominant influence of the indicator on the respective PC.

4 Conclusions

In this study, we showed that mixed farming systems with stocking densities of 1.4 LU ha−1 maintain SOC contents, while pure mineral fertilization or organic fertilization at 0.7 LU ha−1 led to a decline in SOC contents. This demonstrates the beneficial effect on soil quality when livestock farming is coupled with arable cropping systems. Notably, it took more than two decades before organic management practices resulted in enhanced SOC contents compared to conventional farming systems at the same fertilization intensity. Raising SOC levels is possible for organic farming systems, but it is a slow process that is intimately linked with biological soil quality and the quality of input materials. In particular, high basal respiration appears to be decisive for raising SOC contents over time. Consequently, maintaining long-term soil quality relies on sophisticated organic matter recycling and management decisions that consider biological soil quality, such as the composting of manure before field application.