Introduction

The world population is projected to reach about 9.7 billion by the middle of the century, according to the medium variant of the World Population Prospect (United Nations 2019). The increased population, combined with a higher calorie demand per capita, will pose a significant challenge to ensure that food production can meet the increasing food demand (Godfray et al. 2010; Vermeulen et al. 2012). This challenge is further complicated by expected reductions in crop production caused by climate change and other environmental issues (Ray et al. 2019; Ortiz-Bobea et al. 2021; Soleymani 2022). Therefore, a sustainable solution requires understanding the complexity of agricultural systems and their interaction with other biogeochemical dynamics (Cramer et al. 1999; Sitch et al. 2003; Lindeskog et al. 2013). A class of global gridded crop models (GGCMs) which are able to simulate both crops and biogeochemistry have been developed to explore these multisystem interactions (Prentice et al. 1989; Bondeau et al. 2007; Monfreda et al. 2008; Müller et al. 2017). They have a broad range of applications in the simulation of crop productivity, climate impact on yields and the effect of management practices such as irrigation and fertilisation (Bloh et al. 2018; Franke et al. 2020; Ringeval et al. 2021).

To model grain yield accurately, GGCMs need to capture the behaviour of processes such as phenology, carbon assimilation and assimilate allocation (Fletcher and Jamieson 2009; Ringeval et al. 2021). The overall allocation to economically valuable organs such as grain is represented by the harvest index (HI), defined as the ratio of grain dry matter to aboveground biomass. HI describes the crop success in partitioning photosynthates to produce economic biomass and it is known to vary due to genetic and environmental factors (Qin et al. 2013; Porker et al. 2020). Given its relevance to grain yield and its economic importance, HI is a crucial variable to be addressed in crop modelling.

In most crops, HI has increased through domestication and breeding (Lorenz et al. 2010), and during the last decades, rising crop productivity has mainly relied on increased HI. For example, wheat cultivars released between 1860 and 1982 showed that 80% of the improvement in yield was associated with an increase in harvest index. For maize, differences between old and modern hybrids yields are also directly related to HI increase (Sinclair 1998; Lorenz et al. 2010; Liu et al. 2020). Correspondingly, grain yield is highly sensitive to factors that affect HI, such as water stress or nutrient management, which have been shown to affect the proportion of biomass converted to grains in wheat (Dai et al. 2016; Porker et al. 2020; Soleymani 2022).

Different approaches are used to represent HI in the GGCMs. The LPJml model uses a defined optimum and minimum HI as prescribed parameters for different cultivars, with a modifier for the effect of water stress (Bondeau et al. 2007; Ringeval et al. 2021). Other models like the EPIC family (a group of GGCMs composed of several similar site-based models) include potential HI as a cultivar parameter that can be modified by empirical response functions to N dynamics and drought stress during the productive phase between anthesis and harvest (Balkovič et al. 2013; Olin et al. 2015a; Ringeval et al. 2021). This approach is parsimonious, but does not allow the analysis of the effect of multiple factors affecting HI. Setting HI as a parameter also limits the potential use of crop models to explore the implications of future environmental change, conditions where atmospheric CO2 concentration, climate and nutrient availability may change substantially and simultaneously (Qin et al. 2013; Müller et al. 2019). Besides, an accurate prediction of HI increases the certainty about the robustness of the representation of production processes and yield simulation (Fletcher and Jamieson 2009).

To address this challenge, LPJ-GUESS has introduced a fully prognostic HI calculation. It is calculated mechanistically as a function of the assimilated carbon, the developmental stage and based on the daily fraction of net primary production (NPP) allocated to different plant tissues and the retranslocation of carbon from other organs to grains (Olin et al. 2015b). The simulation of HI in LPJ-GUESS was introduced as part of a package of updates, including a nitrogen (N) dynamic module that accounted for the effect of N limitation and N (Olin et al. 2015b). These implementations improved the LPJ-GUESS performance to simulate the productivity of grasslands, wheat yield in Europe, and global maize and wheat yield at the country scale (Olin et al. 2015a, b; Blanke et al. 2018).

LPJ-GUESS has shown acceptable performance in the simulation of global historical yield for wheat and maize globally (Olin et al. 2015a; Bodin et al. 2016; Müller et al. 2019). However, several parameters relevant to the allocation scheme are weakly constrained and the sensitivity of simulated yield to these parameter choices is not well characterised. Furthermore, the lack of global-scale compilations of reference data of HI (Boote et al. 2013; Iizumi et al. 2014; Ringeval et al. 2021) means that LPJ-GUESS has not been tested against simultaneous constraints for both yield and HI at the global scale. In this paper, we report on the sensitivity of yield and HI outputs from LPJ-GUESS to the variation of eight parameters related to production and allocation. We further identify the most suitable choice of parameter values to simulate yield and HI across a globally distributed range of reference sites for two cultivars of wheat (Triticum aestivum L.) and two cultivars of maize (Zea Mays L.) by comparing simulated and reference yield, as well as simulated and observed HI values. Finally, a global evaluation demonstrates the fit improvement in HI and yield with the new parameter setup.

Materials and methods

LPJ-GUESS crop model

LPJ-GUESS (Smith et al. 2014) is a process-based dynamic vegetation model that simulates vegetation response to climate, atmospheric carbon dioxide levels ([CO2]) and N dynamics. Different plant functional types (PFT) represent several vegetation categories according to growth form, phenology, photosynthetic pathway, distributional temperature limits and N requirements (Olin et al. 2015a). The land-use change and crop modules (Lindeskog et al. 2013) represent crops as PFTs differing in climatic thresholds and management-related parameters like baseline sowing and harvest dates. The model also includes management options such as irrigation, tillage and inter-growing season grass cover (Olin et al. 2015b). The main processes simulated daily to represent crops are soil hydrology, photosynthesis, canopy conductance, respiration, phenology, plant N demand, and carbon allocation (Smith et al. 2001; Olin et al. 2015b).

Carbon allocation of daily NPP and retranslocation of nutrients after anthesis and during senescence are critical factors to simulate yield in LPJ-GUESS and, subsequently, HI. These processes depend on the crop development stage, defined daily as a number between 0 and 2 in LPJ-GUESS, depending on air temperature, vernalisation and day length. Anthesis is represented by a developmental stage of 1; values below 1 represent the vegetative phase, while values above 1 represent the reproductive phase. During the vegetative phase, allocation is mainly represented by a logistic growth of roots, leaves and stems. During the reproductive phase, assimilates are allocated to grains (Olin et al. 2015b). LPJ-GUESS also considers a temporary carbon pool to supply demand on days when assimilation is below respiration cost. When the stem stops growing after anthesis, redistribution of the temporary carbon pool to storage organs starts (Penning de Vries et al. 1989; Olin et al. 2015b).

Senescence is integral to annual crop development but can also be prematurely induced in leaves by adverse conditions and stress. In LPJ-GUESS, the onset of senescence occurs when the available N in leaves declines below the level necessary to maintain the current leaf area index (LAI) (Yin et al. 2000; Smith et al. 2001; Gregersen et al. 2013; Olin et al. 2015b). The necessary N to maintain LAI depends on the N uptake and the N demand from leaves according to their C:N, which is not a fixed parameter in LPJ-GUESS. Instead, it is constrained between a minimum and a maximum leaf C:N. The optimum leaf C:N is estimated as 0.75 of this range (Wample et al. 1991; Smith et al. 2014; Olin et al. 2015b).

Model setup

LPJ-GUESS v4.1, revision 10304, was used in this study. The crop model in this version is based on the developments presented by Olin et al. (2015b), including the daily carbon allocation scheme and N dynamics in crops. The land cover was set to simulate cropland only in each 0.5° × 0.5° gridcell and the simulations were carried out starting in 1980 for maize, spring and winter wheat. All the crops were simulated for both rainfed and irrigated conditions. Tillage, N fertiliser application and a grass cover crop between growing seasons were turned on. The model dynamically estimated sowing and harvest dates based on climate suitability and heat unit accumulation (Lindeskog et al. 2013). The period for 1980–2010 used transient climate data from the AgMERRA dataset (Ruane et al. 2015). Prior to this period, the simulations were spun-up for 500 years with fixed [CO2] for the year 1980 and climate to build up C and N pools. N input was provided based on the atmospheric N deposition dataset from Lamarque et al. (2010) and the cropland N fertilisation database from AgGRID (AgMIP Gridded Crop Modelling Initiative) (Elliott et al. 2015).

Parameters

The effect of eight crop LPJ-GUESS parameters was evaluated on yield, harvest index (HI), NPP, carbon mass, LAI and N pool. A total of 17,280 simulations were performed, combining all the levels from each parameter (2 × 3 × 3 × 3 × 4 × 4 × 4 × 5). A specific combination of parameters is referred to as a “setup” in the following. To evaluate sensitivity to all parameters in the crop model, including their interactions, would be computationally unfeasible. Therefore, the evaluated parameters and the range of variation were chosen based on literature and expert assessment of those parameters, which were likely to affect the simulated harvest index and are poorly constrained by observations. These parameters were primarily related to N status in the plant, retranslocation of C and N towards the grain, leaf thickness and light extinction (Fig. 1).

  • Stem retranslocation (Sret)

    Sret represents the retranslocation of carbohydrates of easy mobilisation, mainly glucose and starch, from the stem to grains. This labile C pool represents 0.4 of the stem carbon at flowering, and it is retranslocated to the grains close to the end of the grain-filling period with a rate of 0.1 day−1. Retranslocation is induced when the total demand for sugar exceeds the supply or when the growth rate of the developing storage organ drops below a certain level (Penning de Vries et al. 1989; Olin et al. 2015b). This process is briefly considered in some models since it is not reported to be crucial to simulating yield, and stem starch residuals have not shown a significant relationship with yield in maize (Penning de Vries et al. 1989; Liang et al. 2019). Therefore, two possibilities were tested: inclusion and exclusion of Sret.

  • Specific leaf area (SLA)

    SLA is calculated in LPJ-GUESS for natural vegetation according to leaf longevity described by Reich et al. (1992), but is required as a cultivar parameter for crops. SLA could vary according to fertilisation and water status, cultivar and plant density, among other factors (Amanullah and Inamullah 2016). Values of 45 and 50 m2 kg C−1 have been reported for maize and 35 and 40 m2kg C−1 for wheat (Penning de Vries et al. 1989; Mohammadi 2007; Olin et al. 2015a). Therefore, for this study, SLA was set to vary between 40, 45 and 50 m2 kg C−1 for maize and between 30, 35 and 40 m2 kg C−1 for wheat.

  • Minimum C:N ratio in leaves (C:Nmin) and C:N range (C:Nrange)

    Since tissue C:N varies in LPJ-GUESS according to dynamics between plant demand and supply of N (Smith et al. 2014). C:Nmin represents the maximum N concentration in leaves; below this value, the leftover N is translocated to the labile N pool. Thus, higher values of this parameter cause higher amounts of N translocated from leaves. C:Nrange is a factor that, when multiplied by C:Nmin, equals the maximum C:N. The inappropriate constraint of N limits will overestimate the N use efficiency (Smith et al. 2014; Olin et al. 2015b). For this study, C:Nmin was set to vary between 12.5, 15 and 17.5, and C:Nrange between 2, 2.78, 3.5 and 5 for both crops since the original crop implementation of LPJ-GUESS, based on grass reports showing a C:Nmin of 16 (Olin et al. 2015a).

  • Retranslocation rate of N and C (Nret, Cret)

    During the senescence process, retranslocation to grains of N and C stored before anthesis occurs, but not instantaneously. In Olin et al. (2015b), this process is set to occur at a rate of 0.1 day−1. For this study, the fractional rates of retranslocation for N and C, respectively, Nret and Cret, were set to vary between 0.1, 0.2, 0.3 and 0.4 day−1. Quicker retranslocation implies a shorter senescence period and greater depletion of previously stored pools, increasing the dry matter ratio between grains and other organs, i.e., HI.

  • Nitrogen extinction coefficient (kN)

    The N extinction coefficient is directly related to the light extinction coefficient. It represents the decline in leaf N concentration from top to bottom of the canopy, typically following an exponential decrease. A higher extinction coefficient means a more drastic decrease in N concentration. N distribution is one of the most important determinants of photosynthesis rate, carbon gain, and senescence regulation in the canopy in LPJ-GUESS, affecting yield and HI (Yin et al. 2000; Olin et al. 2015b; Hikosaka et al. 2016). kN was set to vary between 0.175, 0.233 and 0.291 m2 m−2 for maize, and 0.15, 0.2 and 0.25 for wheat. Central values of these ranges were obtained from Olin et al. (2015b). The range was selected from references for wheat calculated under different conditions during the vegetative period (Yin et al. 2003; Olin et al. 2015b).

  • Nitrogen demand reduction (Ndred)

Fig. 1
figure 1

Diagram of the relationship between evaluated parameters and plant organs during different developmental stages (DS)

Ndred represents the reduction in N demand by leaves after anthesis, affecting photosynthesis, carbon gain, leaf senescence, yield and HI. This process is not fully understood but is known to occur gradually due to root senescence resulting in a change in the N source-sink relationship (Zhao et al. 2020). Some reports show that N content at anthesis in wheat is as high as 90% of N at maturity (Mi et al. 2000). Since this parameter is not well explored and the time span at which it occurs is unclear, a wide range of Ndred was set, varying between 0, 2.72, 7.39, 20.09 and 100. A value of 0 means a slow reduction equal to the original LPJ-GUESS scheme, while 100 means a drastic decrease of N demand after the anthesis.

Harvest index

A systematic literature review of published peer-reviewed research was conducted in August and September 2019, employing the widely used databases Google Scholar and Web of Science. A secondary search was also performed through publications cited by those in the primary search. The search was intended to identify studies published worldwide reporting the harvest index for wheat and maize. Studies published since 1990 were targeted to avoid the inclusion of old cultivars. Although studies published before 1990 were included in a few cases, in regions where no publications were found searching by the initial time target. The following search protocol was employed:

  1. 1.

    Search in English, Spanish and Portuguese for each of the four crop names, “wheat”, “maize or corn” (Spanish: Trigo and maiz; Portuguese: Trigo and milho), combined with “harvest index” (Spanish: índice de cosecha; Portuguese: índice de colheita).

  2. 2.

    A manual review of the results to identify papers containing relevant data and potential research to include in the database depending on the availability of the document and the inclusion of harvest index values for any of the crops included in the study at a specific location.

  3. 3.

    A further round of searches and manual review targeted regions where few or no studies were found in the first search: Africa and South America for all the crops and North America and Europe for rice.

  4. 4.

    Finally, the word “meta-analysis” was also combined with all the described search terms to get studies with previously compiled datasets.

Google Scholar is prone to excessive results from around 150 000 records. While the Web of Science produced around 1000 records in the first type of search to less than 10 in the latter, for this reason, only the first 100 results in each were considered. From these, a total of 46 relevant records were identified for maize and 50 for wheat. Additionally, 64 publications (34 for maize and 34 for wheat) reported in a harvest index meta-analysis countrywide in China, were collected to complement the database. These studies were published in Chinese between 2006 and 2010. Selected records were scrutinised to filter duplicate studies or identify different studies using the same HI data. Studies were retained if they included the compulsory target variables: location (country, city and coordinates), year of the harvest, harvest index value and type of wheat (spring or winter). Information on whether crops were irrigated, rainfed and fertilised traditionally (by synthetic fertilisers) or alternatively (organic, ecological) was also recorded if mentioned in the study.

When the dataset in meta-analysis studies or any compulsory variable was not included in the published document or supplementary data, the authors were contacted and requested to share the information; no response caused the rejection of the record. Detailed information about the number of records is contained in Table 1 and supplementary data S1. The commercial control harvest index data was chosen in studies about nutrition, plant density, or any other management practice. If a control treatment was not explicitly included in the study, the middle levels of the treatments were selected, as well as the well-watered treatments in irrigation experiments or cultivars used as a reference (checks) in the breeding studies and ambient treatments in air concentration enrichment experiments.

Table 1 Summary of entries by crop

Locations and evaluation data for calibration

Twenty locations for wheat and 22 for maize were selected from the compiled HI database at a 0.5° × 0.5° gridcell scale, including productive locations with high and medium–low yields and covering as many regions worldwide as possible (Table S2, S3). Yields from these locations were extracted from the global gridcell scaled (0.5° × 0.5° resolution) yield data reported by Ray et al. (2019), covering the period between 1970 and 2013. In addition, the average yield of the countries to which the selected grid cells belonged was extracted from FAO reports (Food and Agriculture Organization of the United Nations 2020). These datasets will be referred to as “Ray” and “FAO”, respectively, in this manuscript.

We preferentially calibrated against large-scale (gridcell or country) yields, rather than those reported for site level, because LPJ-GUESS is intended for application at large scale. Therefore we wished to avoid overparameterising to idiosyncrasies of particular sites or studies. Likewise, we used aggregated, rather than site-level, HI observations in the calibration. In addition, The Ray dataset was estimated based on the crop statistics from about 20 000 political units, so it carries some uncertainty in downscaling (Ray et al. 2019). To offset this, a comparison against the average reference yield of a larger political unit using the FAO dataset provides a cross-check for robustness.

Analysis of model sensitivity

Results were analysed for 2001–2010, taking the mean for each location over this period. According to their average simulated yields across all the setups, locations were separated into two groups. Locations with average yields above the third quartile from all the simulated yields in all setups were categorised as “High” and “Medium” when average yields were below. A “Low” category was not included since the few low-yield locations (with an average yield below the first quartile) made grouping challenging. Besides, these locations showed similar behaviour to medium-yield locations.

The mean, the standard deviation from all setups and locations and the standard deviation from setups only (mean value of locations per setup) of all the output variables were calculated for each parameter level to inspect the variability caused by the parameter variation. Then a one factorial ANCOVA for the “High” and “Medium” groups was carried out for yield and HI, correcting the location effect by considering locations as a categorical covariate. Finally, the percentages of the sum of squares from ANCOVA (%SS) were calculated for all the main effects (parameters) and for interaction effects as the joined percentage of all the interactions where each parameter was included (Eqs. 1 and 2).

$$\%{\mathrm{SSmain}}_{i}=\frac{{\mathrm{SSmain}}_{i}}{{\mathrm{SS}}_{\mathrm{Total}}}\times 100,$$
(1)
$${\%\mathrm{SSinter}}_{i}=\frac{\sum_{j=1}^{n}{\mathrm{SSinter}}_{i\times j}}{{\mathrm{SS}}_{\mathrm{Total}}}\times 100,$$
(2)

where i represents the parameter, j is the interaction including the ith parameter, SSmain is the sum of squares of main effects, SSinter is the sum of squares of interaction and SSTotal is the total sum of squares.

Parameter calibration

Only setups that simulated senescence appropriately were retained for the rest of the analysis, it means setups with a percentage of dead leaves at harvest for irrigated maize above 70% and above 50% for irrigated spring and winter wheat in more than half of the locations (12 for maize and 11 for wheat). These values were approximated for maize based on the decrease of chlorophyll in leaves during senescence reported for fertilised maize (He et al. 2004). The threshold was set lower for wheat since the breeding programs during the last years have selected many stay-green cultivars due to the associated improvement in grain yield (Kipp et al. 2014). The simulated yields were masked according to the irrigated and rainfed areas reported in each location by the Spatial Production Allocation Model “SPAM” 2005 (You et al. 2014) and adjusted to fresh weight assuming a 12% net water content for wheat and 13% for maize (Müller et al. 2017). The difference between simulated yields against both reference datasets (Ray and FAO) was calculated as a fraction of the reference yield for each location in every setup. For maize, the best setup for each location was selected by minimising the yield difference separately for FAO and Ray datasets and constraining HI to setups that produced values between the percentiles 5th and 95th of the compiled HI database (0.30 and 0.59) to avoid atypical HI values.

To assess whether a single generic setup was identified for maize or whether there was variation in the best setup suggesting a variation in cultivar, a k-means algorithm was performed on the best setups set to group the locations iteratively in clusters based on simulated yield and HI (Shamim Reza 2015). The number of clusters was selected based on the within-group sum of squares method, choosing the number of clusters at which the rate of change of the sum of squares with cluster number approaches zero. Spring and winter wheat were already included in the model, and the selected locations were already classified by cultivar, so no clustering was performed for wheat.

Subsequently, the 80th percentile of the yield difference (q80) was calculated for each setup separating locations by different cultivars; q80 was used to ensure that most of the locations had low yield differences instead of a central tendency statistic like mean, affected by extremely low values or median which is not sensitive to extremely high values. The average of simulated HI was also calculated per setup for each cultivar and if this value was out of the range between 0.30 and 0.59 for maize and 0.30 and 0.45 for wheat, the setup was not further included in the analysis. Wheat had a lower upper limit because the frequency distribution of the observed HI was skewed towards higher values compared to maize. Therefore, the setups were ranked by the low to high q80, and the best setups were considered to be those with lower q80 for each cultivar.

Global evaluation

The best setup, along with the mean parameter values across the best ten setups were both selected to simulate globally, as well as the LPJ-GUESS original parameter setup defined by Olin et al. (2015b). Global simulations were performed on a 0.5° × 0.5° grid based on the same driving datasets as the sensitivity simulations (Sect. 2.2), and simulated yields were also masked according to “SPAM” 2005 (You et al. 2014) and adjusted to fresh weight as described above. For maize, global outputs were aggregated at a country level, and cultivars were distributed by country, minimising the difference between aggregated and FAO-reported yields. While for wheat, the cultivar distribution was taken from the recent AgMIP climate change evaluation using a crop model ensemble (Jägermeyr et al. 2021). This process was performed separately for the selection method (best or ten best mean) and reference data (Ray and FAO), producing four global simulations per crop plus the original setup simulation. A production-weighted mean absolute error (WMAE) for yield was calculated at the country and gridcell scale as in Eqs. 3 and 4 to compare the five global simulations and select the best setup.

$$\mathrm{WMAE}=\sum_{i=1}^{n}{\mathrm{Frac}\_\mathrm{Prod}}_{i}\times \left|{\mathrm{Observed}}_{i}-{\mathrm{Simulated}}_{i}\right|,$$
(3)
$${\mathrm{Frac}\_\mathrm{Prod}}_{i}=\frac{{\mathrm{Production}}_{i }}{{W}_{\mathrm{prod}}},$$
(4)

where n, depending on the comparison scale, is either the global number of grid cells or countries where yield was simulated and reported in the Ray or FAO datasets, respectively. Production is the gridcell or country-scale crop production. Observed is the reference yield at gridcell or country scale (Ray or FAO), and Wprod is the aggregated world production, all reported from Ray or FAO, respectively, and averaged for 2001–2010. Simulated is the LPJ-GUESS simulated yield aggregated by country or for gridcell according to the scale comparison. Ray production was calculated by multiplying yield by harvested areas reported in “SPAM” 2005 (You et al. 2014). For HI evaluation, the mean absolute error (MAE) was calculated by comparing the n compiled HI data (observed) against the simulated HI (Eq. 5).

$$\mathrm{MAE}=\frac{\sum_{i=1}^{n}\left|{\mathrm{Observed}}_{i}-{\mathrm{Simulated}}_{i}\right|}{n}.$$
(5)

Finally, the best 50 setups per cultivar from the selected reference dataset were used to perform a descriptive analysis to show the relationship between cultivars and parameters, the distribution of parameter values and the yield difference range caused by parameters in these 50 setups.

Results and discussion

Sensitivity analysis

Variation of maize yield simulated in locations from “Medium” (Figs. 2A, 4A) was mainly influenced by Sret, C:Nrange and Cret, and to a lower extent by Nret, Ndred and C:Nmin. Except for C:Nmin, the same parameters also affected HI in this group (Figs. 3A, 4B). In locations from “High”, yield was more sensitive to Nret and less to C:Nmin compared to “Medium” (Figures S5A, 4A). HI was more sensitive in “High” to Nret and similarly sensitive to the parameters in “Medium” (Figs. 3A, S6A, 4B).

Fig. 2
figure 2

Mean values of simulated yield for 2001–2010 for each evaluated parameter level. A Maize medium-yield locations. B Wheat medium-yield locations. Vertical bars represent the standard deviation considering all the variation from parameters and locations. Horizontal bars only consider variation from parameters. The same plot for high-yield locations can be found in Fig. S5

Fig. 3
figure 3

Same as Fig. 2, but for simulated HI

Fig. 4
figure 4

Percentage of the sum of squares for main effects (red) and added interaction effects per parameter (green): A simulated yield and B simulated HI of maize and wheat

Simulated wheat yields showed high sensitivity to variation in Sret, C:Nrange, Nret, C:Nmin and Ndred. Ndred caused a more significant change in “High”, while C:Nrange had a stronger effect in “Medium”(Figs. 2B, S5B, 4A). Sret, C:Nrange, Nret and Ndred had the most influence on HI and, as for yield, Ndred had a stronger effect than other parameters in “High”, whilst C:Nrange did in “Medium” (Figs. 3B, S6B, 4B). N-related parameters affected yield in both crops, but the effect was higher in “High” locations. This occurred because of soil N limitation in “Medium” locations (Table S4), which constrained yield in setups with parameter values that, in principle, allow higher production. In particular, 0.1 for Nret and 0 for Ndred, which imply slower retranslocation and smooth N demand reduction after anthesis and, therefore, a more extended period of carbon assimilation (Figure S7).

Lower Ndred increases both HI and yield because N uptake after anthesis is one significant source for N in grains and allows plants to keep more foliar area and produce more assimilates. This correlation between dry weight and yield with slower post-anthesis N uptake has been previously reported for maize and wheat. Higher post-anthesis uptake occurred when available soil N increased during grain filling, increasing yield and green area index (Mi et al. 2000; Zhao et al. 2020). This indicates that N demand reduction is not a quick process, as represented by high Ndred values.

In wheat, yield and HI are more influenced by Nret than Cret. This occurred because a higher N translocation rate directly causes leaf senescence and the photosynthetic rate when the ratio of nitrogen-limited LAI to actual LAI reaches lower values than 1 (Yin et al. 2000; Smith et al. 2001; Gregersen et al. 2013; Olin et al. 2015b). On the other side, a higher carbon translocation causes a rapid decrease of leaf C:N, reaching the maximum N concentration and indirectly affecting the nitrogen-limited LAI ratio to actual LAI and leaf senescence. This decrease of LAI caused by Nret was stronger in wheat, while in maize Cret and Nret showed a similar response on LAI (Figure S9). The latter response is probably related to the limited available soil N content in productive areas of maize, decreasing Nret sensitivity (Table S4). One way to assess this is the small effect of low Nret (0.1) in actual leaf C:N in maize compared to wheat (Figure S8).

Higher wheat sensitivity to C:Nmin, compared to maize, was notable in outputs like GPP, LAI and leaf carbon mass, which directly influence yield and HI. This is also likely due to the higher soil N content in wheat soils, which suggests that carbon assimilation in wheat is often constrained by the maximum N concentration allowed in the leaves by LPJ-GUESS. In maize, C:Nmin was not equally critical since the limited soil N content is the main factor affecting carbon assimilation in most locations simulated here.

Maize was more sensitive to C:Nrange, which regulates the minimum N concentration. Higher ranges of C:N allow lower N concentrations in leaves, decreasing photosynthetic capacity and, in turn, GPP and yield (Fig. 2, S7). In addition, lower N limits favour increased LAI and vegetative growth with less N requirement, reducing HI (White et al. 2000; Hassan et al. 2007). Opposite to C:Nmin, the limited soil N in maize soils implies that the minimum N allowed in the leaves by LPJ-GUESS constrained maize carbon assimilation in more locations than wheat. N content in soil was not standardised since that would have critically affected yield, precluding the use of sensitivity simulations for parameter calibration.

Yield and HI of both crops were highly sensitive to Sret, which affects grain carbon accumulation, but also accumulation in vegetative organs. Consequently, it had a more significant effect and was the most critical parameter of HI. Stem dry weight loss has been widely observed after anthesis in both crops (Kiniry et al. 1992; Xue et al. 2014; Nazir et al. 2021) and is generally assumed to go to the grains, although it lacks experimental confirmation (Penning de Vries et al. 1989; Olin et al. 2015b). Alternative hypotheses propose that remobilisation is not constant and only occurs when the growth rate of the developing storage organs drops below a certain level (Penning de Vries et al. 1989), or that labile carbon from the stem can be used to synthesise structural material and maintain the plant and roots. Some evidence has shown that stem retranslocation to grains depends on the genotype and stress conditions (Kiniry et al. 1992; Xue et al. 2014; Nazir et al. 2021). If Sret was assumed to go to the grains in LPJ-GUESS, it directly affected the ratio between vegetative and harvested organs, increasing both simulated yield and HI and the likelihood of HI overestimation. In this study, Sret was included or discarded, but the fraction of easily mobilised carbohydrates retranslocated to grains could lie between the two extremes depending on stress conditions. Constraining this would substantially reduce the uncertainty in yield and HI simulations.

Overall, simulated GPP and growth in LPJ-GUESS were mainly affected by the allowed leaf N concentration. This correlation was found previously in shrubs and grasses but is stronger in herbaceous species due to the higher allocation of nutrients in herbaceous stems compared to woody stems (Tang et al. 2018). N and carbon retranslocation and N uptake reduction after anthesis also affected GPP to a low extent due to the effect on the carbon assimilation period, but had a stronger effect on yield and HI, since the grains are the dominant carbon sink after anthesis. This result matches previous reports in wheat and rice, where the amount of N taken up, including the post-anthesis period, has a proportional relationship with yield (Fageria 2014; Belete et al. 2018). Similarly, the stem labile retranslocation directly affects the carbon ratio between grain and vegetative organs. For that reason, it is the most influential parameter affecting HI.

Parameter calibration

The selection of the best setup for each location for maize resulted in two distinct clusters of parameters, similar for both Ray and FAO datasets. Mainly temperate regions were included in cluster 1, and subtropical and tropical regions in cluster 2 (Table S2). The only deviations were the inclusion of China in Cluster 1 in Ray-based analysis and Germany and UK in cluster 2 in FAO-based analysis. Based on this result, locations with latitudes above 35 degrees (N or S), except in China, were considered to belong to cluster 1 and grow high-yielding maize. The remaining locations were in cluster 2 and were assigned low-yielding maize. Both clusters consisted of eleven simulated locations. The parameterisation from cluster 1 will be referred to as high-yielding maize and cluster 2 as low-yielding maize in the following.

From the original 17,280 setups, 15,617 were retained after filtering by the senescence criteria for maize (not performed by cultivar since this filter was applied before clustering), 11,525 for winter wheat and 8901 for spring wheat. After constraining by HI, the considered number of setups for the minimisation of yield difference decreased to 7904 for high-yielding maize, 13,439 for low-yielding maize, 5435 for winter wheat and 3506 for spring wheat. The minimisation of yield difference then allowed to identify the best and the mean of the ten best setups per cultivar and crop separately by reference dataset (Table 2).

Table 2 Selected setups according to reference dataset by crop and cultivar and original setups in LPJ-GUESS

High-yielding maize had lower Nret and Cret values due to the slower retranslocation favouring a more extended photo assimilation period, and C:N parameters that produce higher N concentration: lower C:Nmin and C:Nrange. Wheat only showed cultivar difference in C:Nrange and kN with higher values and Nret with lower values for winter wheat. None of the selected setups included stem retranslocation (Sret); SLA and kN did not show a clear pattern between cultivars in both crops. Still, both parameters showed middle to high values related to lower yield and HI values. Similarly, no pattern was found in Ndred, but considering the wide range of this parameter, only low values were part of the best setups indicating that slow N demand reduction after anthesis, similar to the original LPJ-GUESS setup, fits better with reference yield. Other studies support this result since both crops have shown to continue uptaking N after anthesis, depending on the soil N availability. In addition, maize has been reported to absorb more N than wheat to satisfy grain N demand (Mi et al. 2000; Fageria 2014; Zhao et al. 2020).

Global simulations using the best Ray-calibrated setups produced higher values of WMAE for both crops compared to the FAO-calibrated setups. In wheat, the mean of the best ten Ray-calibrated setups at country and gridcell scales showed higher WMAE for yield, even compared to the original setup, indicating that only a few setups improved wheat yield estimation. At the gridcell scale, the best FAO setup performed better, while at the country scale, the mean of the best ten FAO-calibrated setups performed better. Besides, both FAO-calibrated setups fitted HI better in both crops. Therefore, the mean of the best ten FAO-calibrated setups was selected as the best LPJ-GUESS parameterisation for global simulations in both crops (Table 3).

Table 3 WMAE of yield and MAE for HI of the selected and original setups evaluated globally at gridcell and country scale, gray shades represent the setups with the lowest values for each scale and for H

The distribution of the best 50 setups per cultivar and crop using FAO as reference data (Figs. 5, 6, S11, S12) only showed Sret = 0 for both crops and cultivars, except winter wheat which had 12 setups with Sret = 1. This means that yield and HI estimations were better adjusted without labile carbon retranslocation from the stem to grain. The most contrasting parameters between wheat cultivars, similar to the ten best setups mean, were C:Nrange, Nret and kN. The best-fitted setups for spring wheat only had C:Nrange of 2, while winter wheat had different values in 32 setups, showing higher sensitivity to this parameter.

Fig. 5
figure 5

Box plots of the yield difference and distribution of the best 50 setups by parameter levels considering FAO country yield as reference. Orange denotes spring wheat and green winter wheat

Fig. 6
figure 6

Box plots of the yield difference and distribution of the best 50 setups by parameter levels considering FAO country yield as reference. Orange denotes high-yielding and green for low-yielding maize

Spring wheat had a more frequent value for kN of 0.15, while in winter wheat, it was 0.25, indicating that winter wheat has a more pronounced decrease of N concentration moving from the top to the bottom of the canopy. However, sensitivity to this parameter was low, so it does not affect yield or HI significantly. Nret had higher values in spring wheat, causing winter wheat to keep the green tissue for a more extended time, while Cret had a similar trend in both cultivars, showing independence between the effect of carbon and N retranslocation rate in LPJ-GUESS. Winter wheat also had low values of Ndred, representing slower N demand reduction after anthesis and, again, a more extended green period. This response between cultivars agrees with the longer growing period previously reported for winter wheat compared to spring wheat (He et al. 2019).

The best 50 setups for maize also showed similar responses to the mean of the 10 best setups; low-yielding maize only included setups with C:Nrange values of 5 and high values of C:Nmin between 15 and 17.5, while high-yielding maize included C:Nrange values of 3.5, 2.8 and, mostly, of 2, whilst C:Nmin was always 12.5, indicating that high-yielding maize requires lower C:N leaf and consequently higher concentrations of N. Nret and Cret had lower values for high-yielding maize indicating slower retranslocation of N and carbon to the grains. In high-yielding maize, Ndred distribution included the whole parameter range, but the lowest two levels, 0 and 2.7, represented 80% of the best 50 setups. SLA was not a parameter of high sensitivity, but high values of SLA are more frequent in all the best setups of all cultivars in both crops. Accordingly, the final selected setups included SLA values close to 50 for maize and 40 for wheat (Table 2; Figs. 5, 6).

Lower values of SLA (thicker leaves) have been reported to decrease LAI and water stress favouring NPP and harvest index in wheat and other cereals (White et al. 2000; Chen et al. 2020). LPJ-GUESS captures this effect on LAI but not in leaf carbon mass assimilation (Figure S9, S10), meaning that this parameter does not alter the photosynthetic capacity; this can also be observed in the minor GPP variation caused by SLA (Figure S7).

Global evaluation

The mean of the best ten FAO-calibrated setups was selected for global evaluation due to its low WMAE at both scales (Table 3). This setup produced aggregated yields by country with satisfactory goodness of fit compared to FAO-reported yields averaged between 2001 and 2010 (Fig. 7). Although there was an underestimation of wheat yield in some countries, the ordinary least squares linear regression line between simulated and observed yields by country was not significantly different to 1:1 line considering intercept and slope (R2 = 0.53). In maize, the regression line had a significantly different slope and intercept to the 1:1 line (R2 = 0.52), but the 95% confidence intervals of the regressions included most of the 1:1 line for both crops. The first five producers by crop were also well captured, except for Pakistan in wheat (Fig. 7).

Fig. 7
figure 7

By country comparison between simulated yields with LPJ-GUESS and reported by FAO averaged values (2001–2010). Circle size is proportional to production reported by FAO during the same period. Coloured dots show the five top producers in the world in order (for wheat and maize)—red: China, USA;, blue: India, China; green: USA, Brazil: yellow: Russia, Mexico; and purple: Pakistan, Argentina. Red lines represent the adjusted linear regression between simulated and observed yields. Shaded areas show the 95% confidence interval, and black lines represent the 1:1 line

Global simulations at the gridcell level showed similar patterns to those reported in the Ray dataset, with most productive areas showing differences below 3 t ha−1. The model overestimated maize yield in regions like the southeast of the USA and Argentina. However, there is a clear improvement worldwide in yield estimation compared to the original model setup, which presented a substantial overestimation for almost all the simulated countries (Olin et al. 2015a). The improvement in Africa, Asia and South America at the gridcell scale is significant (Fig. 8). Differences between simulated and reference yields are more evident at the gridcell than at the country scale, suggesting an improvement in the estimation in some areas of those previously poorly simulated countries.

Fig. 8
figure 8

Gridded yield simulated with LPJ-GUESS using the selected setup, and mean of the parameters from the best ten setups (top). The yield difference between the selected setup and Ray-reported yield (simulated-Ray) (middle) and yield difference between the original setup and Ray-reported yield (original-Ray) (bottom) for maize (left) and wheat (right). Average from the decade 2001–2010

In wheat, differences between yield from new and original setups against Ray global gridded yield were similar, and no significant improvement is noticeable. Both setups show underestimation in western and eastern Europe and East and South Asia overestimation. However, the calibration process significantly improved the estimation of HI for both crops. HI was strongly overestimated in the original setup, even showing values above 0.75 in both crops in several locations (Fig. 9). In contrast, the global distributions of HI had a similar median compared to reference data in the calibrated setup. A more subtle evaluation of HI response will require datasets which show how HI varies systematically as a function of growing conditions. Such responses were not apparent in the compiled HI database.

Fig. 9
figure 9

Box plot for HI from the compiled database and simulated HI using the original model and the newly selected setup for wheat and maize

Additionally, the included low-yielding maize cultivar also showed lower HI than high-yielding maize in the global simulation, with a global mean HI for irrigated maize of 0.33 and 0.53, respectively. Low-yielding maize was distributed in subtropical and tropical countries (Figure S13). In wheat, the new cultivar distribution had more winter wheat areas in Argentina, South Africa, Australia and the USA (Jägermeyr et al. 2021). Winter wheat showed lower HI values than spring wheat, with a global mean HI for irrigated wheat of 0.39 and 0.49, respectively.

Even though the cultivar global distribution improved the representation of crops production in LPJ-GUESS, given the significant influence of some studied parameters here on HI and yield, further observations to constrain them in a variety of cultivars would be particularly valuable to ensure that calibrated ranges generated in studies such as this one correspond closely to reality and represent genotypic differences such as tolerance to abiotic stresses, growth properties and productivity (Balkovič et al. 2013; Soleymani 2022). In the absence of such observations, especially for carbon assimilation and retranslocation rates, the values attained in the calibration performed in this study may be used as a basis for other large-scale modelling exercises.

The sensitivity and parameterisation performed here improved the estimation of global yield and harvest index of maize and wheat compared to the original setup used in LPJ-GUESS (Olin et al. 2015b), implying an improvement of the representation of crop production processes (Fletcher and Jamieson 2009). The improved version of LPJ-GUESS to simulate major crops growth and yield involves progress in the investigation of growing and management conditions at the regional and global scale (Rosenzweig et al. 2013). This is going to be particularly useful in estimating more accurately the effects of future climate change scenarios on global productivity of fsitwood systems, food security and global economics, since agriculture represents between 1 and 60% of national GDP in some countries (Rosenzweig et al. 2013; Mbow et al. 2019).

Conclusions

The sensitivity analysis showed that the main parameters affecting simulated carbon assimilation, GPP and yield were those related to N concentration, such as the leaf minimum C:N ratio and C:N ratio range. The carbon reallocation parameters, such as the retranslocation of labile carbon from stem to grain after anthesis and the retranslocation rate of N and carbon during senescence from leaf to grain, were more critical for HI. Although these parameters also affected GPP and yield to a lower extent, they directly influenced the allocation of carbon in grains and vegetative organs, causing high variation in HI. The extent to which labile carbon is retranslocated to grains was the most crucial parameter for HI simulation. It was the only parameter that had the effect of decreasing HI without significantly affecting yield.

The carbon assimilation rate and the period with active photosynthesis were highly affected by the N and carbon retranslocation rate and N demand reduction after anthesis. Lower carbon and N retranslocation rates and lower N uptake demand reduction after anthesis cause a more extended period of productive green tissue and higher assimilation, with consequent implications for GPP, yield and HI. Exclusion of the labile carbon retranslocation from stem to grains produced a better fit for yield, keeping HI within acceptable limits.

For maize, two cultivars were created according to yield distribution in the selected locations. Maize cultivars were contrasting in C:N ratio parameters as well as N and carbon retranslocation rates. The high-yielding cultivar of maize had lower values of leaf C:N minimum and range, as well as carbon and N retranslocation rates compared to low-yielding maize, indicating the need for higher leaf N concentration, a higher capacity for carbon assimilation and a more extended production period. On the other hand, wheat cultivars were only contrasting in the range of leaf C:N and N retranslocation rate, which was lower for winter wheat, indicating a longer productive and green period.

The cultivar parameterisation and global distribution developed in this study improved the global yield and HI estimation compared to the original setup used in LPJ-GUESS (Olin et al. 2015b), implying an improvement of the representation of crop production processes (Fletcher and Jamieson 2009). The calibrated version of the LPJ-GUESS crop model is a powerful tool for studies investigating how changes in management and growing conditions, as well as the future climate change scenarios, affect the global crop growth and yield of maize and wheat.