Dynamics of soil organic carbon in the steppes of Russia and Kazakhstan under past and future climate and land use

Changes in land use and climate are the main drivers of change in soil organic matter contents. We investigated the impact of the largest policy-induced land conversion to arable land, the Virgin Lands Campaign (VLC), from 1954 to 1963, of the massive cropland abandonment after 1990 and of climate change on soil organic carbon (SOC) stocks in steppes of Russia and Kazakhstan. We simulated carbon budgets from the pre-VLC period (1900) until 2100 using a dynamic vegetation model to assess the impacts of observed land-use change as well as future climate and land-use change scenarios. The simulations suggest for the entire VLC region (266 million hectares) that the historic cropland expansion resulted in emissions of 1.6⋅ 1015 g (= 1.6 Pg) carbon between 1950 and 1965 compared to 0.6 Pg in a scenario without the expansion. From 1990 to 2100, climate change alone is projected to cause emissions of about 1.8 (± 1.1) Pg carbon. Hypothetical recultivation of the cropland that has been abandoned after the fall of the Soviet Union until 2050 may cause emissions of 3.5 (± 0.9) Pg carbon until 2100, whereas the abandonment of all cropland until 2050 would lead to sequestration of 1.8 (± 1.2) Pg carbon. For the climate scenarios based on SRES (Special Report on Emission Scenarios) emission pathways, SOC declined only moderately for constant land use but substantially with further cropland expansion. The variation of SOC in response to the climate scenarios was smaller than that in response to the land-use scenarios. This suggests that the effects of land-use change on SOC dynamics may become as relevant as those of future climate change in the Eurasian steppes.


Introduction
Soil organic carbon (SOC) stocks received increasing attention in the context of climate change, food security and degradation of soils (FAO 2017). Being a main component of soil, SOC is an indicator for soil health, fertility and water availability. The anthropogenic influence on SOC stocks affects the capacity of soils to be a carbon sink or could losses were estimated as 1.14±0.18 PgC a −1 (mean ± standard deviation) between 1920 and 1999 (Houghton et al. 2012). Assuming the current SOC stock in the upper 100 cm to be between 1325 PgC (Köchy et al. 2015) and 1550 PgC (Global Soil Data Task 2000), then approximately between 2.6 and 5.6% of the total initial stocks were lost. Yet, land-use related emissions are likely to be underestimated .
The potential of recovering SOC stocks was estimated as 50 to 66% of the historically lost SOC (Lal 2004) when applying recommended land-use practices that sequester carbon on croplands. In general, cropland abandonment is associated with net carbon sequestration (Kurganova et al. 2015;Kämpf et al. 2016) whereas cropland expansion typically reduces SOC stocks (Lepers et al. 2005;Ramankutty et al. 2018). The area that was converted to cropland globally in the twentieth century is estimated at around 400 Mha and mostly resulted from the conversion of natural grasslands and steppe vegetation (Ramankutty and Foley 1999;Klein Goldewijk 2001). The conversion of steppe soils alone was associated with the reduction of SOC stocks by 38-43% in the upper 10 cm (Mikhailova et al. 2000) or 25% in the upper 100 cm (Beniston et al. 2014).
Climate change is expected to have negative effects on SOC stocks which are probably influenced by positive feedbacks, e.g. via enhanced microbial decomposition under elevated soil temperatures (Allison et al. 2010;Lal 2020). When weather extremes, such as heavy rainfall, droughts and storms, increase in frequency and intensity, erosion and deterioration of the soil will become more likely. Increasing mean temperatures enhance decomposition rates of soil organisms as long as soil moisture is sufficent (Davidson and Janssens 2006), although the long-term effect is under debate (Lloyd and Taylor 1994). Lloyd and Taylor (1994) found that the initially increasing decomposition rates decline after a few years of elevated temperatures, probably because of a decline of microbial biomass, which may rise again when microbial communities are adapted to higher temperatures (Allison et al. 2010). On the other hand, rising temperatures also induce drier conditions and with it reduced plant growth, which leads to reduced replenishment of SOC by organic matter and a decline in SOC stocks (Bischoff et al. 2016). Hence, it is expected that warming would increase the release of carbon in the form of carbon dioxide as well as methane, both of which are powerful greenhouse gases (Houghton 2018).
Besides climatic change, the conversion of grassland to cropland has profound effects on SOC stocks in Eurasian grasslands. Probably, the fastest and most widespread policy-driven cropland expansion in human history was the 'Virgin Lands Campaign' (VLC) in the south of European Russia, western Siberia and northern Kazakhstan between 1954and 1963(Durgin 1962McCauley 1976;Kraemer et al. 2015;Frühauf et al. 2020). The Campaign allowed to increase grain output in the first two years after the start with beneficial weather conditions, but was officially ended in 1963, following a sequence of droughts (McCauley 1976;Josephson et al. 2013). Despite the acknowledgment of the Campaign's failure to alleviate the Soviet's grain shortages (Einaudi 1964;Josephson et al. 2013), cropland cultivation in the Eurasian steppes continued until the fall of the Soviet Union in 1991. Following the breakdown of the Soviet Union in 1990, approximately 32 Mha of previously converted cropland were abandoned (Lesiv et al. 2018). The successional vegetation and secondary steppe began to re-establish on most of the abandoned croplands (Schierhorn et al. 2013;Kämpf et al. 2016), whereas the conversion of former cropland to pasture was negligible (Hankerson et al. 2019;Dara et al. 2020). Since the 2000s, some of the abandoned croplands have been recultivated partially in response to economic growth, higher world market prices for grains and favourable policies (Meyfroidt et al. 2016).
The large-scale land-use changes, both cropland expansion and abandonment, had substantial consequences for the SOC dynamics in the region. Field measurements in the Kulunda steppe of Altai Krai in Russia revealed SOC losses in the upper 25 cm of the soil of 20-35% due to cropland expansion (Bischoff et al. 2016;Guggenberger et al. 2020). An assessment of SOC stocks with a terrestrial ecosystem model enabled the evaluation of the carbon dynamics for the part of the VLC region in European Russia and found that substantial amounts of carbon were sequestered in the soils of abandoned croplands (Schierhorn et al. 2013).
The aim of our study was to evaluate the impact of landuse changes and climate change on the SOC stocks of the entire VLC region. Here, we used the dynamic vegetation and crop growth model LPJmL (Bondeau et al. 2007;Rolinski et al. 2018) to simulate the carbon cycle including all relevant fluxes between the atmosphere, vegetation and soil. We quantified carbon emissions caused by land-use change and climate change and highlighted the relative importance of both dynamics. Specifically, we addressed the following research questions: 1. What are the legacies of the VLC cropland expansion and the post-soviet abandonment for SOC stocks? 2. How likely is the recovery of the SOC stocks until 2100? 3. Are land-use changes or climate change more important for historical and future changes in SOC stocks?

Study area
We examined the central area that was subject to the Virgin Lands Campaign (VLC), 226 million hectares (Mha) in total. The VLC was initiated after World War II by the General Secretary of the Communist Party of the Soviet Union, Nikita Khrushchev (Josephson et al. 2013;Kraemer et al. 2015) to alleviate grain supply shortages in the Soviet Union (McCauley 1976;Wein 1980). During the campaign from 1954 to 1963, 45 Mha of 'virgin' steppes, temperate grasslands and fallow lands were converted to cropland (Durgin 1962), mostly in the south of European Russia, northern Kazakhstan and western Siberia (Fig. 1). The campaign was a major state development activity, included the resettlement of more than 300,000 people and aimed to transform the target regions into a main agricultural production area in the Soviet Union. After this largescale conversion, further 42 Mha of grassland steppe were converted to cropland, mainly for wheat cultivation, until 1980 (Wein 1980). The climatic conditions in the VLC region are continental semi-humid in the north to semi-arid in the south and characterized by low annual precipitation (500 mm in the eastern, 400 mm in the northern, 200 mm in the southern part), falling mostly in summer but with high interannual variability (Afonin et al. 2008;Blinnikov 2021). From north to south, the natural vegetation in the VLC region consists of forest steppe, meadow, dry and semi-arid steppes (Ogureeva et al. 1999;WWF-Russia 2018). The northern boundary of meadow (grassland) steppe is located between 53 and 54 • N in the European part of Russia and between 55 and 56 • N in the Ural and Siberian regions (Tchebakova et al. 2009).
Crop production is constrained by short vegetation periods, early frost and strong winds. Following the climatic gradient, carbon-rich black soils (Chernozems) dominate in the north of the study area, while brownish soils with smaller organic carbon contents (Kastanozems) are found in the south. Salt-affected soils can be found throughout the VLC area (Josephson et al. 2013) which is a natural phenomenon in arid steppe regions, where the soils contain a large amount of soluble salts (FAO/IIASA/ISRIC-/ISSCAS/JRC 2012). Crop yields can be high in years with above-average rainfall, such as in the early 1950s (Durgin 1962). However, consecutive drought years are common and exacerbate soil salinization and soil erosion, mainly caused by wind, pose major environmental challenges and cause high yield variability (Kraemer et al. 2015). The volatile weather conditions are also reflected in the high fluctuations in grain yields in the 1960s ranging between 0.3 and 1.1 t ha −1 (Petrick et al. 2013). The low yields are mainly caused by low fertilizer application rates , extreme climate conditions, and the very short cropping season (one of the shortest worldwide) (Kühling et al. 2017), which requires using adapted, yet old and low-yielding spring wheat varieties (Morgounov et al. 2010). The low yield levels and general productivity limit the carbon sequestration in agricultural soils.

Dynamic global vegetation model LPJmL
Simulations of the vegetation, the soil dynamics and the water cycle were performed using the dynamic global vegetation, hydrologic and agricultural model LPJmL ('Lund Potsdam Jena managed Land'; Sitch et al. 2003;Bondeau et al. 2007;Rolinski et al. 2018). The model was designed to simulate the global carbon cycle and separate flux components (gross primary productivity (GPP), net primary productivity (NPP), heterotrophic respiration, disturbances such as fire or harvest) and carbon pools (vegetation carbon, soil organic carbon), and it was validated in comparison to local measurements and global data sets (Schaphoff et al. 2013;Müller et al. 2017;Schaphoff et al. 2018b). Previously, LPJmL was also used for carbon flux and stock simulations in parts of the former Soviet Union (Schierhorn et al. 2013;  . For the assessment of the carbon budget, net ecosystem exchange (NEE) was derived by substracting net primary productivity from the sum of the carbon emissions due to heterotrophic respiration, fire, harvest and deforestation, so that negative values denote a net flux into the biosphere or carbon sequestration. In LPJmL, plants are represented in the form of functional types, which represent major growth habits of plants in response to environmental conditions. For the natural vegetation, plant functional types (PFTs) were parametrized and for agricultural activities, crop functional types (CFTs) were defined. The PFTs were distinguished by their ability to capture light, take up water and nutrients and their rates of growth, respiration and mortality. The composition of the natural vegetation was then calculated internally by establishment of PFTs according to their bioclimatic limitations and competition between the PFTs for water and light. Nine PFTs were parametrized within LPJmL from which four prevail in the study region (Table S1). The 12 agricultural crop types include temperate and tropical cereals, rice, maize, pulses, temperate and tropical roots, sunflower, soybean, groundnuts, rapeseed and sugarcane with or without irrigation (Bondeau et al. 2007).
Carbon from dead plant material enters PFT-specific litter pools for above-and belowground material so that differences in degradability between different PFTs and different plant organs can be accounted for. The decomposition of litter emits 70% of the decomposed carbon as CO 2 to the atmosphere, while the remainder is transferred to the slow (1.5%) and fast (98.5%) soil carbon pools across the soil layers. Decomposed soil carbon is also emitted to the atmosphere. Decomposition rates of litter and soil carbon pools are sensitive to soil temperatures and soil moisture, which are calculated dynamically for each fo the 5 soil layers (0-20, 20-50, 50-100, 100-200, 200-300 cm). A more detailed description of soil carbon dynamics is provided by Schaphoff et al. (2018a).
We ran the simulations using model version 3.6 (Rolinski et al. 2018) with the following improvements: The treatment of cropland expansion was adjusted since largescale conversion of natural vegetation plays an important role for the carbon cycle in the study region. Therefore, that aboveground tree biomass (100% of the sapwood and 66% of the heartwood) is removed from the fields upon land conversion and only the remainder (34% of the heartwood) is added to belowground litter pools. For comparison of the simulated patterns of vegetation (appendix A) and carbon pools with reference data ('Comparison of simulations with soil organic carbon measurements'), we used input data as specified in 'Climate scenarios and input data sets' in the model runs.

Climate scenarios and input data sets
We used climate and soil input data with a spatial resolution of 0.5 • × 0.5 • for the simulations with LPJmL ('Dynamic global vegetation model LPJmL'). Climate input data comprised raster data from 1901 to 2099 with daily values of air temperature, precipitation, long and shortwave radiation (Weedon et al. 2014). Following the framework 'Global Assessment of Land Use Dynamics, Greenhouse Gas Emissions and Ecosystem Services' (GLUES; Liniger et al. 2017) within the project KULUNDA (Frühauf et al. 2020), we included scenarios for climate projections until the year 2100 based on the storylines of the IPCC Special Report on Emissions Scenarios (SRES, Nakićenović et al. 2000). From the available storylines, we included the marker scenarios for the A1, A2 and B1 scenario families (appendix B). For each of the scenarios, the results of three general circulation models (GCMs) were chosen from the CMIP3 data set as input data from 1901 to 2100, namely MIUB ECHO-G, MPI ECHAM5, and UKMO HadCM3 (Meehl et al. 2007).

Land-use data sets
To represent actual land use, a data set is necessary with the information on the area fraction in each grid cell per year, crop functional type and irrigation system. Here, we used two different land-use data sets (Fig. 2) and derived two counterfactual future scenarios (Table 1).  The data set LU D is based on HYDE (Fader et al. 2010;Klein Goldewijk et al. 2010), a widely used data set with spatially explicit cropland extent from 1700 to 2000. LU D includes information on irrigated cropland and irrigation systems (Portmann et al. 2010;Jägermeyr et al. 2015). However, HYDE does not reflect the agricultural dynamics in the VLC region ( Fig. 2, red dashed line). In LU D , cropland extent was roughly stable from 1960 to 1990 (83.3 ± 2.3 Mha) so that it represented neither expansion nor abandonment (Fig. S4a).
Therefore, additional land-use data in the VLC region and the former Soviet Union were compiled from 1940 to 2012 based on official statistics on sown areas (detailed description in appendix C), which were found to be a good proxy for cropland expansion (Durgin 1962;Prishchepov et al. 2020) and abandonment (Ioffe and Nefedova 2004;Schierhorn et al. 2013). For simulating the effects of cropland expansion during the VLC era (Kraemer et al. 2015), the LU V data set was essential ( For modelling purposes, the available land-use data had to be extended to the entire period from 1700 to 2099. To ensure consistent land-use data sets from the different sources for the VLC region, LU V was extrapolated with cropland information from HYDE (Klein Goldewijk et al. 2010) from 1700 to 1940. For the projections during the twenty-first century, both data sets were extrapolated from 2001 to 2099 with cropland and grassland area trends from the LU D data set.
In order to capture the uncertainty in future land-use dynamics, two diverging scenarios were generated based on LU V : in the abandonment scenario LU V-A , all cropland in 2000 is gradually abandoned until 2050 (Fig. S4c), whereas in the expansion scenario LU V-E all areas abandoned after 1991 were gradually re-converted to wheat cultivation until 2050 (Fig. S4d).

Model setup
Model runs were performed with a daily temporal and 0.5 • × 0.5 • spatial resolution. The model was not initialized based on current conditions, but was run from bare soil in a spinup phase of 5000 years. The aim of the spinup simulation is to establish an equilibrium between the carbon fluxes and stocks in the soil and vegetation, which can be assumed to exist under the absence of anthropogenic disturbances and stable climatic conditions (prehistoric equilibrium). Starting simulations from this equilibrium state ensures that simulated carbon fluxes reflect the response to changes in drivers during the simulation period and not a mismatch between initialized carbon pools and parameterized processes. The prehistoric equilibrium is subsequently disturbed in a transient simulation, which introduces historical land use from 1700 onward and historical climate dynamics from 1901 onward. For the simulation years before 1901, the climate data from 1901 to 1930 were recycled because of the lack of alternative historical data.

Reference data for comparison
For comparison with simulated SOC stocks, we used SOC measurements in the VLC region (Table S4). The measurements sampled in the region between 47 • N and 55 • N and 40 • E and 108 • E cover a climatic gradient from forest to dry steppe types. The aim was to investigate the impact of cropland expansion on SOC stocks. The available data are SOC inventories from 56 sites, each consisting of two to five horizons. Observed values (obs) (e.g. in Bischoff et al. 2016) are given in mgC g −1 soil, whereas the unit of the simulation (sim) is gC m −2 . Conversion of the units was achieved with sim = 10 −3 · obs · BD · d 100 (1) with the measured bulk densities BD (e.g. between 1.2 and 1.54 · 10 6 g m −3 ; Bischoff et al. 2016) and the soil layer depth d in cm.

Analysis methods
Results were analysed for six different periods of 16 year length each (Table 2), which are highlighted as grey bands in figures showing time series. These periods are defined along significant changes in time (e.g. pre-VLC, VLC period, post-collapse of the Soviet Union) and intended to support the discussion of results. Simulation results were analysed for four subregions of the VLC area which are characterized by different climatic conditions and vegetation types ( Fig. S8 and description in section E). Regions differ in precipitation (1: dry, 2: moderate, 3: higher, 4: lowest, Fig. S9), vegetation types (1: steppe, 2: forest, 3: mostly forest, 4: cropland) and land-use intensity (1: very intense, 2: low, 3: nearly none, 4: intense).
Agreement with SOC measurements was statistically tested by linear regressions using the function lm of programming software R (R Core Team 2020). Temporal trends were derived by the partial Mann-Kendall test (Hamed and Rao 1998;Yue and Wang 2004) which calculates the median value of all differences between subsequent values and their successors. Its result was considered to be significant for P < 0.05.
To evaluate the relative importance of land-use and climatic drivers on SOC, we calculated standardized regression coefficients or β-weights (Nathans et al. 2012). This measure gives the change of a predicted variable when an explanatory variable is changed by one standard deviation. For the calculation, we used function lm.beta of package 'QuantPsyc' of programming software R (R Core Team 2020).

Results
Simulation results are presented for current SOC stocks (3.1), which we compared to field measurements. Temporal changes of the inventory of organic carbon in the soil and the vegetation as well as the net ecosystem exchange are evaluated (3.2) before depicting changes of the spatial patterns of SOC stocks (3.3) and analysing the drivers for changes in SOC (3.4).
While vegetation dynamics are also important for SOC dynamics, we here focus on the presentation of SOC. Results on LPJmL's ability to reproduce the gradient from steppe to forest (Fig. S1, Table S2) as well as changes in vegetation patterns under climate change (Figs. S2 and S3) and associated changes in patterns of evapotranspiration (Fig. S10) are included in the supplementary material only.

Comparison of simulations with soil organic carbon measurements
We found high agreement between simulations with LPJmL and field-based SOC measurements in the topsoil (0-20 cm) in the study area (Table S4) and the model results (for the period 1985-1995) (Fig. 3). The variability of measurements at single sites cannot be captured because of the coarser spatial resolution of the model grid, but the overall agreement with simulated values was significant (Table S5, Fig. 3a). The agreement with measurements to 50 cm soil depth was also significant (Table S5, Figs. S5a and S6a). When considering deeper soil layers, the number of available observations was low and the coefficient of determination of the regressions was not significant, but the simulated values were in the right order of magnitude ( Fig. S6b and c).

Simulation results of the carbon balance
Vegetation in the VLC region contained on average 2.16 kg C m −2 (i.e. in total 5.7 Pg C) at the beginning of the twentieth century for the simulations with the land-use data set LU V and on average 1.77 kg C m −2 (4.8 Pg C) for LU D (Fig. S11). During the twentieth century, values remained quite stable, except from 1940 to 1950 when abandonment during World War II and increasing atmospheric CO 2 concentrations led to an increase in vegetation carbon (Bastos et al. 2016). For the LU V data set, vegetation carbon during the VLC era (1954 to 1963) declined, followed by a stable phase without large fluctuations. With widespread cropland abandonment after 1991, vegetation carbon increased. At the end of the twenty-first century, the loss rates of vegetation carbon were lowest for the moderate scenario B1 (−7.5 gC m −2 a −1 for LU D and LU V ) and largest for scenario A2 with the highest temperature increases (−29.3 and −34.6 gC m −2 a −1 for LU D and LU V , respectively). The annual carbon flux between atmosphere and biosphere for the entire VLC area fluctuated mostly between −0.2 and 0.2 Pg C a −1 and did not differ significantly between scenarios and climate models until 2010. Since annual values of net ecosystem exchange (NEE, see 'Dynamic global vegetation model LPJmL', with negative values for land carbon uptake) varied strongly due to climatic fluctuations, we analysed the long-term fluxes as cumulative sums of NEE since 1900 (Fig. S12). In the first half of the twentieth century, carbon fluxes were relatively balanced for LU V and already showed carbon losses for LU D . Both showed a peak before the VLC era when vegetation carbon increased (Fig. S11), which led to carbon sequestration of −0.34 kg C m −2 for LU V and carbon losses of 0.14 kg C m −2 for LU D in 1947 in comparison to 1900. The beginning of the VLC era was characterized by high positive fluxes for LU V (0.17±0.05 kg C m −2 ), which denoted high carbon emissions to the atmosphere. The following period until 2020 showed rather stable conditions and huge differences depending on the land-use scenarios. At the end of the twenty-first century, standard variations across land-use scenarios (0.34 to 0.45 kg C m −2 a −1 ) were much lower than those across climate scenarios (0.79 to 0.98 kg C m −2 a −1 ).
We summarize the net carbon balance of uptake and emissions over the defined six periods of 16 years each (Table 2). For the LU V data set, the biosphere was a source of carbon only during the VLC period and a carbon sink until 2055 otherwise (Table 2). Carbon sequestration was small before the VLC era (period I from 1925 to 1940) (about −0.14 Pg C), but large carbon emissions occurred during the VLC era (1.57 Pg C in period II). Following the period of widespread cropland abandonment after the collapse of the Soviet Union, carbon storage increased at the end of the twentieth century (period III) and nearly doubled during the middle of the twenty-first century (period IV) (Fig. S11). From 1990 to 2100, the cumulative carbon emissions are highest for the expansion scenario LU V-E with 3.5 (±0.9) Pg C, lower for the scenario with LU V with 1.8 (±1.1) Pg C, whereas they were negative for the abandonment scenario LU V-A (-1.8±1.2 Pg C). Nevertheless, the region is projected to be a carbon source for the future periods V and VI, so that the overall balance from 1900 to 2100 results in a loss of 2.7 Pg C to the atmosphere und static future land use (LU V and LU D , Table 2). The scenarios for future cropland expansion and abandonment resulted in the same values until 2000 but caused higher losses for LU V-E and lower losses for LU V-A for periods V and VI, as expected. The net balance from 1900 to 2100 under cropland expansion was calculated as 4.5 Pg C and for abandonment as −0.9 Pg C ( Table 2).
The carbon fluxes were less pronounced for LU D for which larger losses were calculated before the VLC era but less during the time of cropland expansion. Carbon sequestration after cropland abandonment until 2055 was smaller as well but losses in the future are estimated in the same order of magnitude so that the net loss for LU D in 2100 was projected to be on average 3.7 Pg C.
SOC stocks for the VLC region were simulated for the entire soil column of 3 m. The SOC content, that Simulation results for soil organic carbon in the VLC region (kg C m −2 ) from 1900 to 2100 for the climate scenarios A1, A2 and B1 (color-coded; thick lines denote averages over three GMCs and shaded areas the range) and four different land-use data sets (line types). Shadings indicate VLC era and breakdown of the Soviet Union (light grey) and periods for further analysis (dark grey). Endpoints are denoted in right inset by dots (average values) and lines (ranges over three GCMs) per land-use dataset. The upper panel shows average values for the entire VLC region, and the lower figures for the four subregions R1 to R4 accumulated until 1900 in this region, added up to 18.7 kg C m −2 for LU V and 18.4 kg C m −2 for LU D (Fig. 4) which is 8.8 (LU V ) or 10.2 (LU D ) times the carbon stocks in the vegetation. SOC stocks globally are estimated for soil depth until 3 m as 2397 PgC (1836 and 3257 PgC were given as 2.5th and 97.5th percentile; Carvalhais et al. 2014). Our simulations suggest that the soils in the VLC area contain about 2.1% (1.6-2.7%) of the global stocks. SOC average values are determined by the eastern region R3 with 40.5 and 40.9 kg C m −2 , where the further development is determined by the climatic conditions. Lowest SOC in the western region R1 (6.4 kg C m −2 ) are reduced during the VLC era but stabilize and even increase after 1990. Loss of SOC was highest in the central regions 2 and 4 (0.22 and 0.14 kg C m −2 ), where most of the cropland expansion took place.
The SOC level remained stable for the simulation with the LU V data set until the middle of the twentieth century with slight fluctuations and increasing values before 1915 and 1947. Both periods were times of reduced land use because of the world wars, which was also reflected in preceding peaks in vegetation carbon (Fig. S11). Average decline of the SOC stocks from 1950 to 1965 (period II) is nearly twice as high for scenario LU V (−0.23 kgC m −2 ) as for LU D (−0.14 kgC m −2 ).
During the VLC era, the moderate decline for LU V was −9.0 gC m −2 a −1 . The SOC stocks declined for all realizations with the three climate model results until 2020 with steady rates of −3.8 gC m −2 a −1 . From 2060 on, the decline for all simulations persists (−8.3 gC m −2 a −1 ) but with higher deviations between the simulations. Strongest decline from 2070 to 2099 of −14.3 gC m −2 a −1 was simulated for the scenario with cropland expansion LU V-E and lowest for LU V-A (−4.1 gC m −2 a −1 ), whereas the rates were in between for LU D and LU V with −12.4 and −10.2 gC m −2 a −1 , respectively (significant trends according to Mann-Kendall test, Yue and Wang 2004).

Spatial distributions of the soil organic carbon balance in the VLC region
The spatial distributions of the SOC stock for LU V show a gradient of higher values of about 38 kgC m −2 in the northeast to lower values in the southwest (Fig. S7). In the dry steppe region R1, average SOC stock was 5.7±0.04 kgC m −2 , in the typical steppe area R2 21.7±0.02 kgC m −2 and in the forested eastern R3 about 38.4±0.00 kgC m −2 . To visualize the development over time, we show absolute values for the first period (1925 to 1940) (Fig. S7 a) and difference values for the subsequent 5 periods (Fig. S7 b to f). At the end of the twentyfirst century, the grassland area R3 in the east gained on average 0.4 kgC m −2 to 6.1±0.06 kgC m −2 , whereas the northern region R2 and the forest area lost on average 1.2 kgC m −2 to 19.5±0.77 kgC m −2 and 1.9 kgC m −2 to 36.5±0.66 kgC m −2 , respectively. Locally, both losses and gains in SOC increased over time, but with different spatial patterns. While in the second period, about 1.1% of the VLC area lost 1 kgC m −2 , about 19.6% were lost in the third and 54.2% in the last period. Losses of more than 3 kgC m −2 occurred from the third period (0.9% of the area) and increased to 15.7% for the last period. Losses of more than 6 kgC m −2 occurred on 1.1% of the area with maximum values of 9.5 kgC m −2 . On the other hand, carbon accumulated in the soil on 21.4% of the area and on 6.7% this gain was more than 1 kgC m −2 . Differences to the future land-use scenario LU V-E amounted to larger carbon losses of up to 1 kg C m −2 in the northern part with forested vegetation and reduced carbon losses on grasslands in the western part (e.g. 52 • N, 43 • W) or in the Kazakh part (e.g. 51 • N, 70 • W).
We estimated the SOC loss in the upper 20 cm for grid cells with specific land-use histories. First, we selected cells in which natural vegetation was entirely converted to cropland in the VLC era (e.g. 56.75 • N, 60.25 • E) and in which cropland was abandoned after the breakdown of the Soviet Union. For those cells, the loss in comparison to the pre-VLC period (1920 to 1935) for the land-use data set LU V amounted to 13-18%, whereas SOC remained at the same level for LU D (changes between −1.8 and 0.2%). Cropland abandonment reduced SOC loss by up to 1.5 kg C m −2 in the border region between grassland dominated and tree-dominated regions and especially in the eastern part of the region (east of 42 • W). For the example cells representing the full conversion, the upper 20 cm of the soil showed losses of −5.1 to −1.9% compared to the period before the conversion but recovered after the abandonment. In the simulations with LU D , the changes between both periods were mostly positive denoting SOC gains of up to 5.9%. In our simulation results, we found a significant relationship between cropland expansion and SOC loss (r 2 = 0.58). We could associate the conversion of natural vegetation to cropland with a reduction of the upper SOC stock of 11% (±4.9%).

Analysis of the driving factors of the changes in SOC in the VLC region
For the evaluation of the relative importance of land-use changes and climatic conditions for the development of SOC, we calculated the standardized regression measure βweights ('Analysis methods'). To take into consideration the response time of SOC, we decided to apply the method to decadal averages of the SOC values and of the driver variables cropland and grassland shares, precipitation and temperature for each of the grid cells within the VLC area. As target variable, we calculated the temporal difference of the decadal SOC averages per grid cell. For the entire VLC area, cropland expansion had the strongest effect on SOC loss and temperature a smaller effct on SOC gain, whereas grassland extent and precipitation are of lesser importance (Fig. 5, left). Cropland expansion had a similar effect on SOC loss in all subregions except the eastern R3 with predominantly natural vegetation. In the central regions R2 and R4, also grassland expansion caused SOC loss, while temperature was positively associated with SOC gain. Overall, cropland change was the dominant driver for changes in SOC, i.e. cropland expansion is associated with SOC loss and abandonment with SOC gain.

Discussion
We provide the first systematic assessment of large-scale land-use change impacts in the VLC region on the carbon cycle. The conversion of 45 Mha of steppe vegetation to cropland from 1954 to 1963 affected 17% of the entire region, of which 32 Mha (12%) were abandoned after the breakdown of the Soviet Union in 1990. Our results suggest that carbon emissions during the VLC era and carbon sequestration after the abandonment are underestimated when land-use data fail to properly reflect the expansion and abandonment in the region (Schierhorn et al. 2013).
Although climate change under all future scenarios will lead to larger carbon emissions, land-use dynamics have the potential to determine the regional carbon balance substantially. Previous investigations on the dynamics and effects of the large-scale land-cover change in the steppe belt of Eurasia, including in the VLC region, mainly assessed the massive cropland abandonment after 1991 (Schierhorn et al. 2013;Nguyen et al. 2018) and its role for grain and fodder crop production after the collapse of the Soviet Union (Meyfroidt et al. 2016;Prishchepov et al. 2012). Earlier studies combined land-use change with estimates on the SOC changes based on measurements ) and extrapolation of SOC chronosequences (Kurganova et al. 2014;Kurganova et al. 2015). Additionally, SOC measurements on grasslands and croplands up to 30 years after the conversion were used to calculate net effects on the SOC (Bischoff et al. 2016;Wertebach et al. 2017). Long-term losses for the former Soviet Union from 1850 to 2015 were evaluated by a bookkeeping model with 10.7(±4.3) PgC (Houghton and Nassikas 2017). With our study, we complement observation-based studies on land-use change effects in several parts of the VLC region with a consistent, modelbased assessment for the entire VLC region.
The LPJmL model can well represent the vegetation patterns of the VLC region. Our simulations show a vegetation composition of grass-dominated steppes in the south and a transition towards tree-dominated areas in the northern and eastern parts. The transition zone was simulated for 1990 between 54.2 • and 56.5 • N depending on the longitude. Observation based maps of vegetation show in the border region of northern Kazakhstan and Russia the transition from grassland steppe to forested area between 54 • and 55 • N (Tchebakova et al. 2009) which matches with our results. Wertebach et al. (2017) describe the vegetation at sites in the Tyumen province (between 55.7 • N, 65.7 • E and 57.8 • N, 69.2 • E) as belonging to the forest steppe ecozone, i.e. the transition between forested and treeless regions. In this area, LPJmL simulates a tree coverage of less than 50% south of 56.5 • N so that we consider the representation of the vegetation composition as reliable. Also the observed SOC stocks as, e.g., reported in the literature (see Table S4) can be reproduced well by the model, even though small-scale variability in SOC estimates cannot be reproduced due to the coarseness of the simulation and input data.
Conversion of steppe grassland to cropland during the VLC was a major driver for SOC losses and caused extensive soil erosion in this semi-humid to semi-arid region (McCauley 1976;Titlyanova and Naumov 1995;Schierhorn et al. 2013;Bischoff et al. 2016). Although methodological uncertainties remain in assessing SOC losses from observations, the cultivation of virgin steppe landscapes was identified as a major driver of carbon losses in these studies. For example, Orlov et al. (1996) found that the impact of the organic matter removal at harvest on the reduction of carbon replenishment of the soil is usually underestimated. Thus, not only consequences for SOC stocks but also for other soil properties, such as water availability for plant growth, may have been misjudged (Orlov et al. 1996). Our results confirm the strong influence of the land-use changes on SOC losses and associated deterioration of soil functions.
Our estimates of the effect of cropland expansion on SOC stocks are on the lower range of observed values. The stability of organic carbon in the soil is known to be associated with the texture and the structure within the soil, e.g. expressed in the particle size distribution (Lehmann and Kleber 2015), which is not accounted for in LPJmL. Findings from the Kulunda steppes in western Siberia suggest that tillage alone did not lead to an increase in mineralization of organic matter associated in macroaggregates (Bischoff et al. 2017). The decline in SOC after conversion to cropland was primarily attributed to the reduction of plant residue input into the soil. The LPJmL model accounts for SOC loss inherent in residue management. For the entire Kulunda steppe, a loss of 20-35% of the soil organic carbon is estimated . Regarding the VLC region, our results for the entire soil column suggest a loss of 1.6 Pg carbon which is about 3.5% of the SOC stock before the conversion. For the upper layer of 20 cm, soils converted to cropland lost 13-18% of their SOC. Although the sampling sites for the comparison were chosen only when the respective land-use history was not contradicting the land-use in the respective grid cell, the simulation results do not reflect a detailed reconstruction of the situation at the experimental sites. The deviation between these estimates are thus partly caused by differences in the land-use history and partly by the scale effect and differences in computing results at a grid with an extent of 0.5 • × 0.5 • vs. extrapolating field data from specific sites. In the upper 50 cm, soils of West Siberia lost 1.1-1.3 Pg C according to experimental data, i.e. 10-40%, of their initial C stock due to agricultural activities during the twentieth century (Titlyanova and Naumov 1995) which is consistent with our results. In the dry steppe and semidesert regions of Kazakhstan, the relative SOC stock change was evaluated with a loss of 17-24% after 50 years of continuous tillage (Morgounov and Trethowan 2008). The conversion of grassland to cropland showed highest losses in the upper 10 cm with 38-45% (Mikhailova et al. 2000) but influences SOC content also in deeper layers (25% less in first 100 cm, Beniston et al. 2014).
Cropland abandonment in the VLC region after 1991 occured on 33.1 Mha in the LU V data set, which is equivalent to 37% of the cropland in the 1950s in this region. Assessments of cropland abandonment are available for several regions of the former Soviet Union, yet not for the VLC region, e.g. in the northern Kazakhstan steppe regions, where 40% of the 4.7 Mha cropland were abandoned by 2013 (Dara et al. 2018), and in the Kostanay province, where 45% of the 3.1 Mha cropland in 1990 were abandoned by 2000 (Kraemer et al. 2015).  (2010) conclude that uncultivated virgin land is a carbon sink and that abandonment may replenish SOC. In the steppe ecosystems of the Russian Federation, a total carbon sink was estimated from measurements between 52±13 and 81±50 TgC a −1 . For the abandoned ecosystems in the steppe zone, a stable carbon sink of 114 to 201 gC m −2 a −1 was derived Golubyatnikov et al. 2020). In the topsoil (0-20 cm), cropland abandonments were associated with average carbon sequestration rates from 22-47 gC m −2 a −1 (Kurganova and Lopes de Gerenyu 2008;Vuichard et al. 2008) to 92-126 gC m −2 a −1 (Kurganova et al. 2010;Kurganova et al. 2014), depending on the calculation approach and time period. In our results, abandonment of cropland after 1990 led to re-establishment of natural vegetation, which caused carbon sequestration (0.5 PgC or 11.7 gC m −2 a −1 , Table 2). With some delay, the increasing carbon stock in the biosphere transferred into the SOC pool (on average 0.86 kgC m −2 in grid cells with complete abandonment between 1990 and 2005).
For most of the twenty-first century (2010 to 2075), changes in climate and atmospheric CO 2 concentrations contribute to increasing vegetation carbon while SOC stocks decline, unless a total abandonment of cropland is assumed (LU V-A ). In the scenario with further cropland abandonment until 2050 (LU V-A ), model results project the highest levels of soil and vegetation carbon. Without further land-use change from 2050 to 2100, future climatic conditions are projected to lead to substantial carbon losses. Restoration of abandoned cropland to natural vegetation in the dry steppe zone in the south-eastern part of the VLC region resulted in the recovery of the SOC stocks. The recovery rate depends on the soil types (Calcisols and Solonetz) and reaches 64% and 89% of the natural carbon pool within the upper 50 cm of the soil after 42 and 12 years, respectively (Kalinina et al. 2015). For both soil types, a complete recovery is projected after about 100 years. For other major soil types (Retisols, Luvisols, Chernozems), Kurganova et al. (2015) found that a new equilibrium of SOC stock can be reached after 30 to 40 years.
Projections may be altered by changes in salinity, which was not considered in our modelling approach. Increasing salinity is often associated with decreasing plant productivity and therefore also reduced SOC stocks (Wong et al. 2010). Investigations in the Kulunda steppe showed that organic carbon was not reduced by higher salinity but more effectively stored in the soil because of its association with minerals and reduced availability for microbial decomposition (Bischoff et al. 2018). This implies that semi-arid steppe regions can store organic carbon in comparable magnitudes when soil moisture conditions are sufficient, independent of salinity (Bischoff et al. 2018). We did not explicitly study different forms of cropland management, such as conservation agriculture, that can also strongly modify the dynamics of SOC under cropland.
Our results demonstrate that the development of agricultural production area would have more impact on the changes in SOC stocks than climate change alone. Until the middle of the twenty-first century, increasing temperatures and rising atmospheric CO 2 concentrations are beneficial for the vegetation as well as the ability of the vegetation and soils to sequester carbon in the study region. After 2070, all projections showed declines in carbon fluxes to the biosphere and drastic changes in the vegetation. Assumptions about future land use are decisive for whether the biosphere will constitute a carbon source or sink at the end of the twenty-first century. These findings should inform current discussions if the abandoned cropland in the Eurasian steppes should be recultivated . Our results are also useful to identify areas where the recultivation of former cropland in the VLC region can contribute to increasing global grain production with low carbon emissions (Meyfroidt et al. 2016).

Conclusion
We examined the effects of changes in land use and climate for soil organic carbon content in the entire area that was exposed to the Virgin Lands Campaign. Our analysis particularly focused on the widespread cropland expansion during the Campaign as well as the abandonment after the fall of the Soviet Union. The results from the dynamic vegetation model highlight the importance of accurately accounting for changes in cropland extent, particularly in regions that witnessed fundamental restructuring of their agricultural sector, such as our study region in the former Soviet Union. With the given importance of cropland extent for the region's carbon dynamics and the apparent conflict between agricultural production and carbon sequestration for healthy soils and climate mitigation, different agricultural management systems need to be tested in and for the region. The findings are important for assessing the carbon costs of recultivation and potential carbon gains that can be expected from restoring steppe ecosystems.
The legacies of the VLC era were evaluated with a locally more accurate land-use dataset with a loss of 1.6 PgC, which is 1 PgC more than the result with a standard dataset not representing the cropland expansion period. Potential recovery of SOC until 2100 was achieved in the simulation results only for the assumption of complete cropland abandonment. Therefore, further investigations considering soil improving techniques on cropland are necessary to evaluate the possibilities of agricultural activities without soil depletion. In comparison to climatic drivers, cropland extent turned out to be the major driver for SOC changes.

Acknowledgements
The study was part of the project Kulunda (BMBF under grant numbers 01LL0905L and 01LL0905I). We thank for financial support of the following projects: EU FP7 ERA.Net Russia Plus: 449 CLIMASTEPPE, funding Goszadanie of Institute of Steppe 'Problems of steppe management under the conditions of modern challenges: optimization of the interaction between environmental and socio-economic systems' No.AAAA-A21-121011190016-1. DFF-Danish ERC Support Program (grant number: 116491, 9127-00001B) and the BMBF project ReKKS under grant number 01LZ1704A. We also thank the GERUKA project, which is funded by the German Federal Ministry of Food and Agriculture (BMEL), the Federal Office for Agriculture and Food (BLE) and the EPIKUR project, which is funded by the Leibniz Association's 'Joint Initiative for Research and Innovation' ('Pakt für Forschung und Innovation'). I. Kurganova acknowledges support from state assignments of the Pushchino Scientific Centre of Biological Researches of RAS (under grant number AAAA-A18-118013190177-9). The authors acknowledge help by Jannes Breier. We thank Alexander Tsypin for assistance in collection the historical land-use change datasets. We acknowledge the modelling groups, the Program for Climate Model Diagnosis and Intercomparison (PCMDI) and the WCRP's Working Group on Coupled Modelling (WGCM) for their roles in making available the WCRP CMIP3 multi-model data set. Support of this data set is provided by the Office of Science, U.S. Department of Energy.
Funding Open Access funding enabled and organized by Projekt DEAL.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons. org/licenses/by/4.0/.