1 Introduction

The predictability and uncertainty of weather and climate are hot topics in atmospheric and oceanic sciences. The uncertainties involved in numerical simulations and predictions are rooted not only in the uncertainty of initial error (Morss and Battisti 2004; Aberson 2011) but also in model error (Williams et al. 2001; Berthelot et al. 2005; Carrassi and Vannitsem 2011; Jarvinen et al. 2012; Hally et al. 2013; Wan et al. 2012) and belong to the “first kind” and “second kind” categories of weather and climate predictability problems (Mu et al. 2002). The uncertainties in model errors mainly arise not only from the mathematical descriptions of earth system processes (Cramer et al. 2001) but also from the uncertainties related to physical parameters in the model (Zaehle et al. 2005). Reducing the uncertainties in physical parameter errors in numerical models through observations, optimization methods, or data assimilation is thus a crucial area of research. Efforts in this area will help to improve the simulation ability and forecasting skill of such models in atmospheric and oceanic science studies (Lu and Hsieh 1997; Janiskova and Morcrette 2005; Pulido et al. 2012; Smith et al. 2013).

Numerical models include many physical parameters that describe the physical processes of atmosphere–ocean coupled general circulation models. It would be costly and impractical to reduce the uncertainties of all physical parameter errors through observation and other methods. A more effective route is to select certain key physical parameters, regarded as “sensitive and important,” in order to improve the simulation ability and forecasting skill of models. There has been much discussion about how to identify the sensitivity and importance of parameters (White et al. 2000; Knorr and Heimann 2001; Li et al. 2012; Pappas et al. 2013; Wang et al. 2013). Pitman (1994) analyzed the percentage change of 18 physical parameters using the Biosphere Atmosphere Transfer Scheme (BATS), a land surface model, to reveal their relative importance. When the percentage change of one parameter was analyzed, the other 17 parameter values were held fixed, so the interaction among parameters was ignored (Jackson et al. 2003; Bastidas et al. 2006). Other methods have also been employed to explore the sensitivity of certain physical parameters, such as the adjoint method, factorial experimentation, and the Multi-Objective Generalized Sensitivity Analysis (MOGSA) method (Henderson-Sellers 1992; Wang et al. 2001; Rayner et al. 2005; Bastidas et al. 2006; Rosero et al. 2010). The adjoint method is based on a linear approximation; while it may be valid for small parameter errors over short time periods, it is less applicable to large parameter errors and long integration times. The factorial experimentation, Monte Carlo, and MOGSA methods can account for the impact of interactions among physical parameters by producing a stratified sample over the range of parameter values, and they can also identify key physical parameters (Zaehle et al. 2005; Bastidas et al. 2006).
Henderson-Sellers (1992) employed factorial experimentation to rank parameters using the BATS model. The five most sensitive parameters were vegetation roughness length, vegetation albedo <0.7 μm, maximum leaf-area index, vegetation albedo ≥0.7 μm, and saturated soil hydrologic conductivity. Zaehle et al. (2005) applied a Monte Carlo-type stratified sampling approach to identify the sensitive and important physical parameters in a model and to provide reasonable estimates of a model output variable. They used the Latin hypercube sampling method to create random samples of parameter values. Based on these samples, parameter sensitivity and importance were determined by calculating ranked partial correlation coefficients (RPCC), which estimate the contribution of a particular parameter to the total uncertainty of the model output. They reported that the five most sensitive and important physical parameters for net primary production (NPP) were \( {\alpha}_{C3} \), \( {\alpha}_a \), θ, \( {g}_m \), and \( {r}_{\mathrm{growth}} \). Bastidas et al. (1999) used the MOGSA method to identify which parameters within a model were sensitive. Their study reported the number of sensitive parameters at significance levels below 1 %, from 1 to 5 %, and above 5 %.

In this work, a new approach is established to ascertain which subset or combination of relatively more sensitive and important parameters causes the maximum uncertainty in numerical simulation and forecast results. This question matters because the parameter combination should be considered when reducing the uncertainty of numerical simulations with, for example, data assimilation and optimization methods (Vrugt et al. 2005; De Lannoy et al. 2006). After providing an overview of the new approach in Section 2, we describe its methodology and the experimental procedures used to test it in more detail in Section 3. The results from the experiments are reported in Section 4, and a discussion and summary of the key findings are presented in Section 5.

2 Overview of the new approach

2.1 The new approach

To ascertain the combination of relatively more sensitive and important parameters that cause maximum uncertainty in numerical simulation and forecast results, we propose a new approach, which consists of the following three steps.

The first step is to choose the physical parameters. Generally, there are three types of parameters in a numerical model. Taking the Lund–Potsdam–Jena (LPJ) model as our example, the normalizing coefficient for the exponential distribution, which is randomly created, related to computational stability, and unrelated to observations, belongs to the first type of parameter. Wood density is obtained from direct observational data and falls into the second type. Finally, the parameters in the allometric equations are derived from indirect observations of leaf area index and tree height and hence belong to the third type. The last two types of parameters describe physical processes and can be obtained through observations. In this study, we only consider parameters that can be obtained through direct or indirect observations.

The second step is to examine the sensitivity of each single physical parameter, of which there are many in numerical models. Supposing there are n physical parameters obtained from observational data, the less sensitive ones among the n parameters first need to be eliminated through sensitivity tests of every physical parameter. There are two reasons for this step. Firstly, the cost in terms of human and material resources needed to identify the combination of relatively more sensitive and important physical parameters among all parameters would be enormous. Secondly, not all of the n parameters will cause large uncertainties in the numerical simulation. In the LPJ model, if n = 24 and the combination approach is applied directly, finding the most significant combination of five relatively more sensitive and important parameters among the total of 24 (Table 1) requires \( {C}_{24}^5 \) optimization experiments, and the computational cost is high. We also find, by implementing simple sensitivity experiments using the LPJ model, that not all of the 24 parameters bring large uncertainty to the numerical simulation. For example, the extent of uncertainty in simulated NPP is 72.55 g C m−2 year−1 when a parameter error of 3.05 is added to the standard value of \( {g}_m^{*} \) while the other parameters remain unchanged (Table 1). However, the extent of uncertainty in simulated NPP is just 2.48 g C m−2 year−1 when a parameter error of −0.04 is added to the standard value of \( {est}_{\max}^{*} \). In a previous study, Zaehle et al. (2005) first identified the 12 more sensitive and important parameters among a total of 36. Hence, it is necessary to implement this step to eliminate some of the less sensitive parameters. To accomplish the second step, the conditional nonlinear optimal perturbation related to parameter error approach (CNOP-P; Mu et al. 2010) is employed.
A detailed explanation of the CNOP-P approach is provided in the following section. The n parameters are optimized individually using the CNOP-P approach. The CNOP-Ps and their cost function values for every parameter are obtained within the range of reasonable parameter error. The physical explanation of the cost function induced by the CNOP-P is the maximal extent of uncertainty in numerical simulations caused by the optimal value. The sensitivity of every parameter can be identified according to the cost function value, and the more sensitive parameters can be chosen based on an appropriate threshold, i.e., we choose m physical parameters where n > m. The advantage of using the CNOP-P approach is to identify the sensitivity of every parameter, which in theory is the optimal way to ensure the ranking of parameters in terms of their sensitivity. However, it ignores the impact of interactions among parameters on the uncertainty of the numerical simulation. Therefore, this step alone does not identify the key combination of relatively more sensitive and important parameters.

Table 1 The chosen parameters within the LPJ model

The third step is to examine the sensitivity and importance of multiple parameters together, i.e., to reveal the most significant combination of parameters among all parameters previously identified in the second step. The aim of this step is to find the relatively more sensitive and important parameters (k) among the previously identified m parameters, where m > k. Continuing with the LPJ model as our example, we suppose n = 24, m = 10, and k = 5 (i.e., 10 parameters were chosen following the second step). In the third step, we want to find the combination of five relatively more sensitive and important parameters among those 10 that cause the greatest uncertainty in the numerical simulation. To begin, \( {C}_{10}^5=252 \) groups of parameters combinations are built. For these groups, 252 CNOP-Ps and their cost function values are obtained using the CNOP-P approach within the range of reasonable parameter error. The parameters combination causing the maximal cost function value among the 252 cost function values is regarded as the relatively more sensitive and important subset of five parameters among the total of 10. The experiments in this step are different to those conducted in the second step (see Fig. 1), and in carrying out this step, the impact of nonlinear interactions among parameters on the uncertainty of the numerical simulation can be considered.
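
The saving from step 2 can be made concrete with a short sketch. It counts the five-parameter subsets with and without the pre-selection and then runs the exhaustive search of step 3 on a toy cost. The names `best_combination`, `toy_weight`, and `toy_max_cost` are hypothetical; `toy_max_cost` merely stands in for one CNOP-P optimization and is not the LPJ model.

```python
import itertools
import math

# Five-parameter subsets if all 24 parameters were searched directly.
n_all = math.comb(24, 5)
# Five-parameter subsets after step 2 has reduced the candidates to 10.
n_reduced = math.comb(10, 5)


def best_combination(candidates, k, max_cost):
    """Return the k-subset of `candidates` with the largest optimized cost.

    `max_cost(subset)` is a hypothetical callable standing in for one
    CNOP-P optimization restricted to the parameters in `subset`.
    """
    return max(itertools.combinations(candidates, k), key=max_cost)


# Toy stand-in (an assumption, not the LPJ model): each parameter has an
# individual weight, and parameters 7 and 8 interact when both are present.
toy_weight = {i: 1.0 / (i + 1) for i in range(10)}


def toy_max_cost(subset):
    bonus = 0.8 if (7 in subset and 8 in subset) else 0.0
    return sum(toy_weight[i] for i in subset) + bonus


best = best_combination(range(10), 5, toy_max_cost)
```

Here `n_all` is 42,504 and `n_reduced` is 252. In this toy setting, the winning subset contains parameters 7 and 8 even though neither ranks among the five largest individual weights, which illustrates why the third step evaluates combinations rather than reusing the single-parameter ranking.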

Fig. 1 Flowchart depicting the steps involved in the new method

2.2 The CNOP-P approach

In the above new approach, the maximal uncertainty in numerical simulations is obtained within the range of reasonable parameter error in both single-parameter and multi-parameter experiments using the CNOP-P approach. The CNOP-P is a type of parameter perturbation that could cause the maximal cost function with a certain constraint and at an optimal time. This type of parameter perturbation, which could lead to the maximal uncertainty in numerical simulation and forecasting results, is a parameter error or parameter error combination. The advantages of the CNOP-P approach are not only its ability to obtain the parameter error combination causing the maximal uncertainty but also that it can be used to consider the impact of nonlinear interactions among parameters on the level of uncertainty.

In the work of Mu et al. (2010), the CNOP-P approach was proposed according to types of predictability, and it has been applied to study ENSO predictability, estimations of terrestrial ecosystems, and the Kuroshio large meander (KLM) (Mu et al. 2010; Duan and Zhang 2010; Sun and Mu 2012a, b; Wang et al. 2012). Yu et al. (2012) analyzed the roles of initial error and model error in generating a significant spring predictability barrier (SPB) for El Niño events using the CNOP-P approach and noted that initial errors play a more important role than parameter errors in causing a significant SPB for El Niño events. In addition, Wang et al. (2012) discussed the impact of model error on the KLM. They found that initial condition errors had greater effects on the prediction of the KLM than errors in model parameters, but that the latter cannot be ignored. We now review the derivation of the CNOP-P approach for the readers’ convenience.

Let the nonlinear differential equations be as follows:

$$ \left\{\begin{array}{l}\frac{\partial U}{\partial t}=F\left(U,P\right)\kern1em U\in {R}^n,\kern0.5em t\in \left[0,T\right]\\ {}{\left.U\right|}_{t=0}={U}_0\end{array}\right. $$
(1)

where F is a nonlinear operator, P is a parameter vector in Eq. (1), and \( {U}_0 \) is an initial value. Let \( {M}_{\tau } \) be the propagator of the nonlinear differential equations from the initial time 0 to τ. Then \( u\left(\tau \right) \) is a solution of the nonlinear equations at time τ and satisfies \( u\left(\tau \right)={M}_{\tau}\left({u}_0,p\right) \).

Let \( U\left(T;{U}_0,P\right) \) and \( U\left(T;{U}_0,P\right)+u\left(T;{U}_0,p\right) \) be the solutions of the nonlinear differential equations (1) with P and P + p, respectively, where P and p are parameter vectors. \( u\left(T;{U}_0,p\right) \) describes the departure from the reference state \( U\left(T;{U}_0,P\right) \) caused by p. The solutions satisfy:

$$ \left\{\begin{array}{l}U\left(T;{U}_0,P\right)={M}_T\left({U}_0,P\right)\\ {}U\left(T;{U}_0,P\right)+u\left(T;{U}_0,p\right)={M}_T\left({U}_0,P+p\right)\end{array}\right.. $$
(2)

For a chosen norm ‖·‖, a parameter perturbation \( {p}_{\delta } \) is called a CNOP-P if and only if

$$ J\left({p}_{\delta}\right)=\underset{p\in \varOmega }{ \max }J(p), $$
(2)

where

$$ J(p)=\left\Vert {M}_T\left({U}_0,P+p\right)-{M}_T\Big({U}_0,P\Big)\right\Vert $$
(3)

Here, P is the reference state of the parameters in Eq. (1), and p is the perturbation of the reference state. In the second and third steps, the dimension of p is 1 and 5, respectively. p ∈ Ω is the constraint condition. The CNOP-P is the parameter perturbation whose nonlinear evolution attains the maximum value of the cost function J at time T.
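
To make the definition of J(p) concrete, the following minimal sketch evaluates the cost function for a toy scalar model. Everything in it is an assumption chosen for illustration: the logistic equation dU/dt = rU(1 − U/K) replaces F, the parameter vector is P = (r, K), the propagator `propagate` uses forward-Euler time stepping, and the norm is the absolute value. It is not the LPJ model.

```python
def propagate(u0, params, t_end=10.0, dt=0.01):
    """Toy propagator M_T: forward-Euler steps of dU/dt = r*U*(1 - U/K)."""
    r, k = params
    u = u0
    for _ in range(int(round(t_end / dt))):
        u += dt * r * u * (1.0 - u / k)
    return u


def cost(p, u0=0.1, reference=(0.5, 1.0)):
    """J(p) = |M_T(U0, P + p) - M_T(U0, P)| for a scalar state."""
    perturbed = tuple(ref + dp for ref, dp in zip(reference, p))
    return abs(propagate(u0, perturbed) - propagate(u0, reference))


j_zero = cost((0.0, 0.0))  # no perturbation, so no departure
j_r = cost((0.1, 0.0))     # perturb only the growth rate r
j_k = cost((0.0, 0.1))     # perturb only the carrying capacity K
```

A CNOP-P for this toy problem would be the p inside the constraint set Ω that maximizes `cost(p)`; the sketch only evaluates J at a few hand-picked perturbations.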

3 Experimental procedures and model

3.1 Experimental design

Twenty-four parameters within the LPJ model were chosen for examination based on the study by Zaehle et al. (2005). Table 1 shows the physical meanings, standard values, and minimum and maximum values of all parameters. Because the parameter values differ, the chosen parameters were normalized using linear transformations for the convenience of data processing and nonlinear optimization. The physical parameters were mapped into the range from −1 to 1 with the following simple piecewise linear function:

$$ \left\{\begin{array}{l}y=\frac{x-\mathrm{Stavalue}}{\mathrm{Maxvalue}-\mathrm{Stavalue}}\kern1em \mathrm{when}\kern0.5em x\ge \mathrm{Stavalue}\\ {}y=\frac{x-\mathrm{Stavalue}}{\mathrm{Stavalue}-\mathrm{Minvalue}}\kern1em \mathrm{when}\kern0.5em x<\mathrm{Stavalue}\end{array}\right. $$

where x and y are the values before and after the transformation, and Stavalue, Maxvalue, and Minvalue are the standard, maximum, and minimum values, respectively, of the physical parameter. When x = Minvalue, y = −1; when x = Maxvalue, y = 1; and when x = Stavalue, y = 0. The constraint condition in Eq. (2), p ∈ Ω, is a box constraint, with the bound set to δ = 0.2 for every parameter (|p| ≤ δ). Parameter errors with δ = 1 are also reasonable; however, the parameter errors obtained with the CNOP-P approach would then cause the terrestrial ecosystem to become unstable. Four study regions were chosen in China (northern, northeastern, southern, and arid/semi-arid), and 24 cases within these regions, whose longitudes and latitudes are shown in Table 2, were used in the numerical results below.
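
The piecewise linear mapping can be sketched as below, together with its inverse, which converts a normalized perturbation such as ±δ back to physical units. The function names and the numerical values (standard 2.0, minimum 1.0, maximum 4.0) are hypothetical and are not taken from Table 1.

```python
def normalize(x, sta, min_value, max_value):
    """Map a physical value x to y in [-1, 1], with y = 0 at the standard value."""
    if x >= sta:
        return (x - sta) / (max_value - sta)
    return (x - sta) / (sta - min_value)


def denormalize(y, sta, min_value, max_value):
    """Inverse mapping from normalized y back to the physical value."""
    if y >= 0.0:
        return sta + y * (max_value - sta)
    return sta + y * (sta - min_value)


# Hypothetical parameter: standard 2.0, minimum 1.0, maximum 4.0.
STA, MIN, MAX = 2.0, 1.0, 4.0
delta = 0.2
upper = denormalize(+delta, STA, MIN, MAX)  # physical value at y = +0.2
lower = denormalize(-delta, STA, MIN, MAX)  # physical value at y = -0.2
```

Because the two branches use different scale factors, a symmetric normalized bound |p| ≤ δ generally corresponds to an asymmetric interval in physical units whenever the standard value is not midway between the minimum and maximum.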

Table 2 The sensitivity of single parameters among the 10 parameters using the CNOP-P approach (numbers are the sequence numbers of the parameters in Table 1; “125.75 45.75” means 125.75°E, 45.75°N, and similarly for the other locations)
Table 3 The five most sensitive parameters using the CNOP-P approach (numbers are the sequence numbers of the parameters in Table 1; “125.75 45.75” means 125.75°E, 45.75°N, and similarly for the other locations)
Table 4 The sensitivity of single parameters among the 10 parameters using the OAT method (numbers are the sequence numbers of the parameters in Table 1; “125.75 45.75” means 125.75°E, 45.75°N, and similarly for the other locations)
Table 5 Same as Table 3, but for a different reference state (“125.75 45.75” means 125.75°E, 45.75°N, and similarly for the other locations)
Table 6 Comparison of the three-, four-, and six-parameter sensitive combinations using the CNOP-P approach (numbers are the sequence numbers of the parameters in Table 1; “125.75 45.75” means 125.75°E, 45.75°N, and similarly for the other locations)

Figure 2 shows the detailed experimental design used to identify the relatively more sensitive and important parameter combination. Taking annual NPP as the variable in the cost function, we set n = 24, m = 10, and k = 5 and sought the five relatively more sensitive and important parameters among the 10 parameters. First, 10 physical parameters were chosen from the original 24 according to the second step introduced in Section 2. Next, we built \( {C}_{10}^5=252 \) parameter combinations. For these combinations, 252 CNOP-Ps and their cost function values were obtained using the CNOP-P approach within the range of reasonable parameter error. The five-parameter combination with the maximal cost function value among the 252 was regarded as the most significant subset of relatively more sensitive and important parameters among the total of 10. In the second step, the sensitivity of each single parameter was identified using the CNOP-P approach, which in theory is the optimal way to rank every parameter in terms of its sensitivity. For verification, we compared the CNOP-P approach with the traditional one-at-a-time approach (OAT; Pitman 1994; Saltelli 1999) for identifying the sensitivity of every parameter. The OAT approach gives the variation caused by a representative parameter perturbation, such as ±10 % or ±20 %, while the other parameter values are held fixed. The variation due to the perturbed parameter is therefore convenient to obtain, and the computational cost is relatively small. Parameter sensitivity was identified according to the extent of variation in the numerical simulation, and a finite difference method was employed to calculate the variation due to the perturbed parameter in the factor space. Clearly, the interaction among the parameters and the optimal settings of factors are ignored when parameter sensitivity is determined with the OAT approach. In our study, the parameter errors used to run the LPJ model with the OAT approach were ±0.2, consistent with those in the experiments using the CNOP-P approach.
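
The OAT procedure described above can be sketched as follows. The function `oat_sensitivity` and the three-parameter `toy_model` are hypothetical illustrations in normalized parameter space, not the LPJ model; the only value taken from the text is the perturbation size of ±0.2.

```python
def oat_sensitivity(model, reference, delta=0.2):
    """One-at-a-time sensitivity: perturb each parameter by +/-delta while
    holding the others at their reference values, and record the larger
    absolute change in the model output."""
    base = model(reference)
    result = {}
    for i in range(len(reference)):
        changes = []
        for sign in (+1.0, -1.0):
            trial = list(reference)
            trial[i] += sign * delta
            changes.append(abs(model(trial) - base))
        result[i] = max(changes)
    return result


def toy_model(y):
    """Toy output with a linear term, a quadratic term, and an interaction."""
    return 3.0 * y[0] + y[1] ** 2 + 2.0 * y[0] * y[2]


sens = oat_sensitivity(toy_model, [0.0, 0.0, 0.0])
ranking = sorted(sens, key=sens.get, reverse=True)
```

In this toy case the third parameter receives zero sensitivity, because its only effect is through the interaction term, which vanishes when the first parameter sits at its reference value. This is the kind of interaction that a one-at-a-time design cannot detect.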

Fig. 2 Detail of steps 2 and 3 of the new method

In our study, an evolutionary algorithm (differential evolution (DE); Storn and Price 1997) was employed to obtain the maximum value in Eq. (2), because the cost function may be non-differentiable with respect to the parameters, which are the optimization variables. Some studies have applied evolutionary algorithms to investigate parameter estimation and uncertainties in land surface schemes and extreme events with the MM5 model and other models (Duan et al. 1992; Kruger 1993; Zhang et al. 2000). The advantage of evolutionary algorithms is that they obtain the optimal value of Eq. (2) without requiring the gradient. The DE algorithm has been applied to explore terrestrial ecosystem responses to climate change, and details of the algorithm have been provided (Sun and Mu 2012a, b, 2013). In addition, the validity of the DE algorithm was checked before it was applied to search for the optimal value with the LPJ model (Sun and Mu 2009). The CNOP-P was obtained by solving Eq. (2), with an initial estimate supplied to the optimization process. To obtain the CNOP-P effectively, 12 random initial estimates were chosen, and the final optimal value was repeatedly verified using the DE algorithm.
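
As an illustration of gradient-free maximization under the box constraint |p| ≤ δ, the sketch below implements a generic DE/rand/1/bin scheme in plain Python. It is not the authors' implementation: the population size, mutation weight, crossover rate, clipping rule, and the non-differentiable `toy_cost` are all assumptions. The 12 differently seeded runs loosely mirror the 12 random initial estimates mentioned above.

```python
import random


def differential_evolution_max(cost, dim, delta=0.2, pop_size=20,
                               generations=100, weight=0.8, crossover=0.9,
                               seed=0):
    """Maximize cost(p) over the box |p_i| <= delta with DE/rand/1/bin."""
    rng = random.Random(seed)
    pop = [[rng.uniform(-delta, delta) for _ in range(dim)]
           for _ in range(pop_size)]
    scores = [cost(p) for p in pop]
    for _ in range(generations):
        for i in range(pop_size):
            # Three distinct population members other than the target i.
            a, b, c = rng.sample([j for j in range(pop_size) if j != i], 3)
            forced = rng.randrange(dim)  # at least one mutated component
            trial = []
            for d in range(dim):
                if d == forced or rng.random() < crossover:
                    value = pop[a][d] + weight * (pop[b][d] - pop[c][d])
                    value = min(delta, max(-delta, value))  # clip to the box
                else:
                    value = pop[i][d]
                trial.append(value)
            trial_score = cost(trial)
            if trial_score >= scores[i]:  # greedy selection, larger is better
                pop[i], scores[i] = trial, trial_score
    best = max(range(pop_size), key=scores.__getitem__)
    return pop[best], scores[best]


def toy_cost(p):
    """Toy cost with an interaction term; the absolute value makes it
    non-differentiable where the argument changes sign."""
    return abs(p[0] + 2.0 * p[1] + 5.0 * p[0] * p[1])


# Repeat the search from 12 different random starts and keep the best result.
runs = [differential_evolution_max(toy_cost, dim=2, seed=s) for s in range(12)]
best_p, best_j = max(runs, key=lambda run: run[1])
```

For this toy cost the maximum over the box lies at the corner p = (0.2, 0.2), where J = 0.8, so agreement among the repeated runs gives a simple check that the search has not stalled at the smaller local maxima found at other corners.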

3.2 The LPJ model

We used the LPJ model in the present study as an example to validate the theoretical framework (Sitch et al. 2003). The LPJ model is process-based and can describe the dynamics of land processes in atmospheric and oceanic science studies, the carbon exchange between land and the atmosphere, and the hydrological cycle. The model, which originates from the biome model family (Prentice et al. 1992), can simulate the distribution of plant functional types (PFTs), with 10 PFTs used to distinguish different photosynthetic (C3, C4), phenological (deciduous, evergreen), and physiognomic (tree, grass) features. The parameters are identified in all PFTs when the parameter sensitivity analysis is implemented. The LPJ model explicitly considers photosynthesis, mortality, fire disturbance, and soil heterotrophic respiration. In this model, carbon is stored in seven PFT-associated pools with leaves, sapwood, heartwood, fine roots, a fast and a slow decomposing above-ground litter pool, a below-ground litter pool, and two soil carbon pools for each grid cell. A detailed description and evaluation of the model can be found in Sitch et al. (2003). The LPJ model has been widely employed to discuss the variation in terrestrial ecosystems and the carbon cycle (Werner et al. 2007; Hickler et al. 2008), and its simulation of PFTs has been shown to be in agreement with observations in China. However, owing to a lack of observational data regarding NPP in China, the NPP simulated using the LPJ model could only be compared with simulations from other models at similar spatial and temporal scales. Nevertheless, numerical results indicate that the LPJ model can be employed to examine variations in terrestrial ecosystems in response to climate change (Sun 2009).

The LPJ model was run with climate data comprising monthly precipitation, temperature, wet frequency, and cloud cover. Furthermore, a dataset of global atmospheric CO2 concentrations obtained from a carbon cycle model, ice core measurements, and atmospheric observations (Kicklighter et al. 1999) was used. Soil texture data were based on the Food and Agriculture Organization (FAO) soil dataset (Zobler 1986).

To obtain the equilibrium state, the LPJ model was run over a period of 1000 model years by repeatedly using climate data from the Climatic Research Unit (CRU) 0.5° global climate dataset for the period 1901–1930 (Mitchell and Jones 2005). Generally, the equilibrium state changes when the parameters change, and the model then needs to be spun up again. Therefore, in our study, the equilibrium state under the varied parameters was explored. The run time needed for the state to attain equilibrium after the parameters changed was not long: the state attained equilibrium within 100 model years, which was taken as the optimization time. The earlier simulation results for 1000 model years using unvaried parameters were taken as the reference state.

4 Numerical results

4.1 The sensitivity and importance of single parameter

The CNOP-P and its cost function value were used to examine the sensitivity and importance of each single parameter, and Table 2 shows the results among all parameters for the 24 cases. Only the 10 chosen parameters are shown for convenience, as the cost functions of the remaining 14 parameters were negligible. We found that the most sensitive and important parameter controlling NPP was the intrinsic quantum efficiency in C3 plants (\( {\alpha}_{C3} \)), and this was the case in most of northern, northeastern, and southern China. In three cases of northeastern China, the most sensitive and important parameter controlling NPP was the maximum canopy conductance analog (\( {g}_m \)); however, \( {\alpha}_{C3} \) was still the second most sensitive and important parameter. In the other six cases of northeastern China, \( {g}_m \) was the second most sensitive and important parameter. For the six cases of northern and southern China, the second most sensitive and important parameter was the fraction of photosynthetically active radiation (PAR) assimilated at the ecosystem level relative to leaf level (\( {\alpha}_a \)), while the co-limitation shape parameter (θ*) was the third most sensitive and important parameter in these cases. For northeastern China, \( {\alpha}_a \), \( {r}_{\mathrm{growth}} \) (representing growth respiration per unit NPP), and θ* were, respectively, the third, fourth, and fifth most sensitive and important parameters. The numerical results suggest that photosynthesis and canopy conductance in the soil hydrology are important physical processes for NPP in northern China.

Furthermore, in southern China and in part of northern China, \( {\alpha}_{C3} \) and \( {\alpha}_a \) are the two most important parameters, suggesting that photosynthesis is an important physical process for NPP in these two regions and that precipitation is sufficient, while canopy conductance in soil hydrology may be secondary. In the arid and semi-arid regions, however, the most sensitive and important parameter differed from case to case. In most cases, the evaporation parameter (\( {\alpha}_m \)) was the most sensitive and important parameter, although \( {\alpha}_{C3} \), \( {g}_m \), and \( {\alpha}_a \) were also important in all cases of arid and semi-arid China. The results showed that the parameter related to evaporation was the most sensitive and important for water-limited regions. However, the most sensitive and important parameter differed among water-limited regions, with the other important parameters being those that describe photosynthesis and canopy conductance in soil hydrology, providing further indication that these are important physical processes overall. In summary, the numerical results showed that the most sensitive and important parameters may differ among regions, particularly between water-limited and non-water-limited regions.

4.2 The sensitivity and importance of multiple parameters

In the results reported in the previous section, the sensitivity and importance of each single parameter were identified using the CNOP-P approach, with the dimension of p equal to 1 in Eq. (3). However, those results do not show which combination or subset of parameters is the most sensitive and important. This is because each parameter is optimized individually to determine its sensitivity, and the combination of the five top-ranked parameters from the single-parameter sensitivity analysis may not be equivalent to the combination of five parameters from the multiple-parameter sensitivity analysis. Next, the sensitivity of parameter combinations is explored with the dimension of p equal to 5 in Eq. (3).

To address this, we first needed to discover, using the CNOP-P approach, which five parameters among the 10 were the most sensitive and important when the nonlinear interaction among the parameters is considered. Table 3 shows the maximal cost function values and the five most sensitive and important parameters among the 10, obtained using the CNOP-P approach for the different study regions. The results suggest that the five most sensitive and important parameters are similar in northern, northeastern, and southern China. For example, for the second and third cases, the five parameters were θ*, \( {\alpha}_a \), \( {\alpha}_{C3} \), \( {r}_{\mathrm{growth}} \), and \( {g}_m \). However, in the arid and semi-arid regions of China, the five most sensitive and important parameters were different. For example, for the 16th, 18th, 20th, and 24th cases, the five most sensitive and important parameters were \( {\alpha}_a \), \( {\alpha}_{C3} \), \( {r}_{\mathrm{growth}} \), \( {\alpha}_m \), and \( {k}_{rp} \). These five parameters represent photosynthesis, respiration, hydrology, and the allocation of the annual carbon increment. The uncertainties in these physical processes lead to an overestimate of GPP through \( {\alpha}_a \) and \( {\alpha}_{C3} \) and an underestimate of autotrophic respiration through \( {r}_{\mathrm{growth}} \), \( {\alpha}_m \), and \( {k}_{rp} \). The uncertainties of these physical processes therefore lead to a large uncertainty in NPP, which is highly sensitive to these parameters. Meanwhile, for the 19th case, the five most important parameters were θ*, \( {\alpha}_a \), \( {\alpha}_{C3} \), \( {r}_{\mathrm{growth}} \), and \( {k}_{rp} \). Similar physical processes influence the variation of NPP in this case.

While the sensitivity and importance of each single parameter were identified in the single-parameter experiments, nonlinear interactions among the parameters are neglected when the CNOP-P approach is applied to one parameter at a time. Next, we examine whether the five most sensitive and important parameters obtained with the multi-parameter CNOP-P approach were the same as the five top-ranked parameters reported in Section 4.1. We found that in northern, northeastern, and southern China, the five most important parameters using the CNOP-P approach were similar to the five top-ranked parameters in Section 4.1. For example, the five important parameters using the CNOP-P approach for the second case were \( {\alpha}_a \), \( {\alpha}_{C3} \), \( {r}_{\mathrm{growth}} \), \( {g}_m \), and \( {\alpha}_m \), which was the same set of parameters as that determined using the single-parameter experiment. However, in the arid and semi-arid regions, the five most sensitive and important parameters differed between the multi-parameter and single-parameter applications of the CNOP-P approach. For example, for the 19th case, the five parameters were θ*, \( {\alpha}_a \), \( {\alpha}_{C3} \), \( {r}_{\mathrm{growth}} \), and \( {k}_{rp} \). However, based on the single-parameter experiments using the CNOP-P approach, the five parameters were \( {\alpha}_a \), \( {\alpha}_{C3} \), \( {g}_m \), \( {\alpha}_m \), and \( {k}_{\mathrm{mort}1} \). The numerical results imply that the parameter \( {r}_{\mathrm{growth}}^{*} \), representing the respiration process, is very important when nonlinear interactions are considered during the plant growth process, although the single-parameter sensitivity of \( {r}_{\mathrm{growth}} \) may not show it to be more sensitive than other parameters. The results also illustrate that nonlinear interactions among parameters and complex physical processes can be explored using the CNOP-P approach. The CNOP-P approach may be able to reasonably estimate the carbon cycle process; however, the single-parameter sensitivity method may overestimate the respiration process without considering the \( {r}_{\mathrm{growth}} \) parameter (Table 3).

4.3 Sensitivity experiment using the OAT approach

In Section 4.1, we reported the results from single-parameter sensitivity experiments implemented using the CNOP-P approach. To compare these results with those based on other methods, the traditional OAT approach was employed to identify the sensitivity of the single parameter. The cost function was computed using Eq. (3) when p = ±0.2. The larger the cost function is, the more sensitive and important the parameter. As shown in Table 4, parameter sensitivity and importance were similar to those using the CNOP-P approach. Due to the design of the CNOP-P approach, the sensitivity and importance of every parameter are optimal. Zaehle et al. (2005) employed a Monte Carlo technique and reported a similar conclusion based on 81 cases. These results reveal that sensitivity and importance are similar when parameters are considered individually.

4.4 Sensitivity experiment on the five most important parameters determined for different basic states

To examine whether the sensitivity of the parameters depends on the choice of parameter values, the sensitive parameters are identified again after the reference values of the parameters are changed. A random number drawn from a normal distribution with zero mean and a standard deviation of 0.1 is superimposed onto the reference state to form the new reference state. The sensitivity of every parameter is determined with the CNOP-P approach for the 24 cases. The results showed that the sensitivity of every parameter given the new reference state is similar to that for the previous reference state (not shown). The five most important parameters are identified according to the sensitivity of every parameter. The numerical results show that the combinations of important parameters in 21 out of 24 cases given the new reference state are the same as those for the previous reference state (Table 5). For the remaining three cases, the combinations differ from those for the previous reference state: in each of them, one parameter in the combination for the new reference state differs from the parameters in the combination for the previous reference state. The three cases are located in the arid and semi-arid region of China. These numerical results imply that the combinations of identified important parameters are the same for different reference states in most cases, with slight differences in the arid and semi-arid region.
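
The reference-state experiment can be sketched as below. The helper names are hypothetical, and two choices are assumptions not stated in the text: the noise is applied in the normalized parameter space, and the perturbed values are clipped to the range from −1 to 1.

```python
import random


def perturbed_reference(reference, sd=0.1, seed=0, lower=-1.0, upper=1.0):
    """Add zero-mean Gaussian noise (standard deviation sd) to each
    normalized reference value and clip the result to [lower, upper]."""
    rng = random.Random(seed)
    return [min(upper, max(lower, y + rng.gauss(0.0, sd))) for y in reference]


def count_matching(combos_old, combos_new):
    """Count the cases whose parameter combination is unchanged,
    ignoring the order of parameters within a combination."""
    return sum(set(a) == set(b) for a, b in zip(combos_old, combos_new))


# Normalized standard values are all zero for the 10 retained parameters.
new_reference = perturbed_reference([0.0] * 10)

# Hypothetical two-case example: the first combination is unchanged,
# the second differs in one parameter.
n_same = count_matching([(1, 2, 3), (4, 5, 6)], [(3, 2, 1), (4, 5, 7)])
```

Applied to the 24 cases, `count_matching` would return the number of cases with identical combinations under the two reference states, which the text reports as 21.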

4.5 Sensitivity experiment on the most important parameters determined for different sizes of the optimization set

In the above experiments, the size of the optimization set was 5. In this section, we explore how the sensitive parameter combination changes when the size of the optimization set changes. The sensitive parameter combinations were determined for combinations of three, four, and six parameters (Table 6). In northern, northeastern, and southern China, the sensitive combination of a larger size contains the combination of a smaller size. For example, for the case (115.75°E, 33.25°N), the sensitive parameter combinations of sizes three to six are, respectively, (\( {\alpha}_a^{*} \), \( {\alpha}_{C3}^{*} \), \( {g}_m^{*} \)), (\( {\theta}^{*} \), \( {\alpha}_a^{*} \), \( {\alpha}_{C3}^{*} \), \( {g}_m^{*} \)), (\( {\theta}^{*} \), \( {\alpha}_a^{*} \), \( {\alpha}_{C3}^{*} \), \( {r}_{\mathrm{growth}}^{*} \), \( {g}_m^{*} \)), and (\( {\theta}^{*} \), \( {\alpha}_a^{*} \), \( {\alpha}_{C3}^{*} \), \( {r}_{\mathrm{growth}}^{*} \), \( {g}_m^{*} \), \( {k}_{\mathrm{mort}1}^{*} \)). These regions are the humid and semi-humid regions of China. Part of the arid and semi-arid region behaves similarly. For example, for the case (116.75°E, 37.25°N), the sensitive parameter combinations of sizes three to six are, respectively, (\( {\alpha}_a^{*} \), \( {\alpha}_{C3}^{*} \), \( {\alpha}_m \)), (\( {\alpha}_a^{*} \), \( {\alpha}_{C3}^{*} \), \( {r}_{\mathrm{growth}}^{*} \), \( {\alpha}_m \)), (\( {\alpha}_a^{*} \), \( {\alpha}_{C3}^{*} \), \( {r}_{\mathrm{growth}}^{*} \), \( {\alpha}_m \), \( {k}_{\mathrm{rp}} \)), and (\( {\alpha}_a^{*} \), \( {\alpha}_{C3}^{*} \), \( {r}_{\mathrm{growth}}^{*} \), \( {\alpha}_m \), \( {k}_{\mathrm{allom}3} \), \( {k}_{\mathrm{rp}} \)). However, in the other part of the arid and semi-arid region, the sensitive combination of a larger size does not completely contain that of a smaller size. For example, for the case (116.25°E, 36.75°N), the sensitive parameter combinations of sizes three to six are, respectively, (\( {\alpha}_a^{*} \), \( {\alpha}_{C3}^{*} \), \( {\alpha}_m \)), (\( {\alpha}_a^{*} \), \( {\alpha}_{C3}^{*} \), \( {\alpha}_m \), \( {est}_{\max}^{*} \)), (\( {\alpha}_a^{*} \), \( {\alpha}_{C3}^{*} \), \( {r}_{\mathrm{growth}}^{*} \), \( {\alpha}_m \), \( {k}_{\mathrm{rp}} \)), and (\( {\alpha}_a^{*} \), \( {\alpha}_{C3}^{*} \), \( {r}_{\mathrm{growth}}^{*} \), \( {\alpha}_m \), \( {k}_{\mathrm{rp}} \), \( \beta \)). The above numerical results imply that in regions with strong interaction between land and atmosphere, such as the arid and semi-arid region, the identified parameter combination may depend on its size. However, in regions with weak interaction between land and atmosphere, such as the humid region, the identified parameter combination may not depend on its size.
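The containment property discussed above can be checked mechanically. The following sketch encodes the three example cases quoted in the text as sets and tests whether each combination contains the next smaller one; the ASCII parameter names (with the asterisks dropped) and the helper `is_nested` are illustrative choices, not notation from the study.

```python
def is_nested(combinations):
    """Return True if every combination contains the next smaller one."""
    ordered = sorted(combinations, key=len)
    return all(small <= large for small, large in zip(ordered, ordered[1:]))

# Combinations of sizes three to six for the three example cases in the text.
cases = {
    "115.75E, 33.25N": [
        {"alpha_a", "alpha_C3", "g_m"},
        {"theta", "alpha_a", "alpha_C3", "g_m"},
        {"theta", "alpha_a", "alpha_C3", "r_growth", "g_m"},
        {"theta", "alpha_a", "alpha_C3", "r_growth", "g_m", "k_mort1"},
    ],
    "116.75E, 37.25N": [
        {"alpha_a", "alpha_C3", "alpha_m"},
        {"alpha_a", "alpha_C3", "r_growth", "alpha_m"},
        {"alpha_a", "alpha_C3", "r_growth", "alpha_m", "k_rp"},
        {"alpha_a", "alpha_C3", "r_growth", "alpha_m", "k_allom3", "k_rp"},
    ],
    "116.25E, 36.75N": [
        {"alpha_a", "alpha_C3", "alpha_m"},
        {"alpha_a", "alpha_C3", "alpha_m", "est_max"},
        {"alpha_a", "alpha_C3", "r_growth", "alpha_m", "k_rp"},
        {"alpha_a", "alpha_C3", "r_growth", "alpha_m", "k_rp", "beta"},
    ],
}

nested = {case: is_nested(combos) for case, combos in cases.items()}
```

For the third case the check fails because the parameter est_max appears in the four-parameter combination but not in the five-parameter one.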

4.6 Sensitivity experiment to identify the five most important parameters

An important aim of this study is to improve the ability to estimate and predict the carbon cycle in terrestrial ecosystems using the identified sensitive and important parameters, whose errors can be reduced through routine or additional observations. To show the extent to which the simulations improve when the error pattern of the most sensitive and important parameters is reduced, experiments were designed and carried out as follows. As reported in Section 4.2, 252 parameter combinations were computed to obtain the CNOP-Ps. Among those CNOP-Ps and their cost functions, the CNOP-P leading to the maximal cost function is called the CNOP. Apart from this maximal CNOP-P, two other CNOP-Ps were chosen to compare the extent of improvement. Among the 252 CNOP-Ps representing parameter combinations, the group consisting of the top five parameters identified using the CNOP-P approach for individual parameters was named CNOP_single. Another group, consisting of the top five parameters identified using the sensitivity analysis method for single parameters, was simply called the “single” group. Finally, a random group of parameters was also chosen. Following Mu et al. (2009), who employed a formula to show the benefit of reducing CNOP-type initial errors compared with another type of initial error, the extent of improvement in the uncertainty of the numerical simulation is measured as τ:

$$ \tau =\left\Vert {M}_T\left({U}_0,P+p\right)-{M}_T\Big({U}_0,P\Big)\right\Vert -\left\Vert {M}_T\left({U}_0,P+\alpha p\right)-{M}_T\Big({U}_0,P\Big)\right\Vert $$
(4)

The symbols in Eq. (4) are defined as in Eq. (3), and α, a constant less than 1, represents the factor to which the parameter uncertainty is reduced. In this study, the four types of parameter combinations, including the CNOP, were compared in terms of τ for each case. The larger τ is, the greater the improvement. The 252 parameter combinations necessarily include the above four types. The numerical results in Section 4.2 showed that the most sensitive and important parameters were similar in most of the study regions, except for the arid and semi-arid regions in China. Therefore, the sensitivity experiment was implemented for these regions. Figure 3 shows τ computed by Eq. (4) for the four types of parameter combinations and three factors (α = 0.2, 0.4, and 0.6) in the arid and semi-arid regions in China. Averaged over nine cases, the reductions in the uncertainty of the simulated NPP due to the CNOP-P-type parameter error were 248.02, 215.35, and 180.43 g C/m² for α = 0.2, 0.4, and 0.6, respectively. Meanwhile, for the CNOP_single parameter error, the results were 168.98, 122.87, and 95.48 g C/m²; for the “single” parameter error they were 177.36, 141.60, and 108.74 g C/m²; and for the random errors they were 128.23, 107.48, and 92.30 g C/m², respectively. The results illustrate that the gain obtained with the CNOP-P-type parameter combinations was the largest among all the parameter combinations for each case and each factor. The results also suggest that the uncertainties in the simulations of the carbon cycle due to parameter errors can be reduced through CNOP-P-type identification of the most sensitive and important parameter combination.
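The evaluation of τ in Eq. (4) can be sketched as follows. This is only an illustration under stated assumptions: `toy_model` is a hypothetical stand-in for the model output, the norm is taken to be the Euclidean norm, and the reference values and parameter error are arbitrary numbers chosen for the example.

```python
import math

def toy_model(params):
    """Hypothetical stand-in for the model output M_T(U_0, P); not the LPJ model."""
    a, b, c = params
    return [a * b + 0.5 * c, a ** 2 - 0.1 * b]

def output_change(model, reference, perturbation):
    """Norm of the difference between perturbed and reference model output."""
    perturbed = [r + p for r, p in zip(reference, perturbation)]
    diff = [x - y for x, y in zip(model(perturbed), model(reference))]
    return math.sqrt(sum(d * d for d in diff))

def tau(model, reference, perturbation, alpha):
    """Eq. (4): reduction in output uncertainty when the error p shrinks to alpha * p."""
    reduced = [alpha * p for p in perturbation]
    return (output_change(model, reference, perturbation)
            - output_change(model, reference, reduced))

reference = [1.0, 2.0, 0.5]      # hypothetical reference parameter values
parameter_error = [0.2, 0.2, 0.2]  # hypothetical parameter error p
taus = {alpha: tau(toy_model, reference, parameter_error, alpha)
        for alpha in (0.2, 0.4, 0.6)}
```

In this toy setting τ decreases as α increases, since a larger α leaves more of the original parameter error in place; τ vanishes for α = 1.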

Fig. 3

Sensitivity experiments carried out to compute the gain from the five most sensitive parameters using the CNOP-P approach. CNOP represents the reduced extent due to the CNOP-P-type parameter error. CNOP_single represents the reduced extent due to the parameter error obtained by optimizing single parameters. Single represents the reduced extent due to the parameter error obtained by the OAT approach. Random represents the reduced extent due to the random parameter error.

5 Discussion and summary

The uncertainties in the physical parameters of numerical models are a main source of uncertainty in numerical simulations and forecasts, for example in simulations of land surface processes (Kuczera and Parent 1998; Vrugt et al. 2003). One approach to reducing the uncertainty in numerical simulation is to calibrate the model parameters, or combinations of parameters, so that the input-output behavior of the model closely matches the observational data. However, it is necessary to choose which model parameters or parameter combinations to calibrate because there are many parameters in a numerical model; the number of controlling model parameters ranges from about O(10) to O(100) (Li et al. 2013). In addition, to reduce the uncertainty of numerical modeling, the effects of parameter combinations should be considered rather than calibrating the parameters one by one. It is therefore a key issue to determine which group of physical parameters should be calibrated first. Although the model parameters can be calibrated through the above methods, the calibrated parameter values may not match the true values and thus may not be applicable to other numerical simulations or forecasts. Reducing the physical parameter errors through observational data is a feasible alternative. Since observing the physical parameters is difficult and expensive, it is very important to determine which parameters should be observed first to obtain their true values and improve the numerical models.

Physical parameters have been ranked using existing parameter sensitivity analysis methods (Zaehle et al. 2005). For example, with methods based on probability theory, such as Monte Carlo techniques, parameter sensitivity can be assessed from the model output of multiple model simulations. One method is to identify parameter sensitivity by analyzing the contribution of a single parameter to the overall variance of the output (Verbeeck et al. 2006). Another is to determine parameter sensitivity by calculating the ranked partial correlation coefficient between model inputs and outputs (Zaehle et al. 2005). These methods can determine parameter sensitivity and show the ranking of the parameters. For example, Zaehle et al. (2005) determined the five most sensitive parameters for NPP to be \( {\alpha}_{C3} \), \( {\alpha}_a \), \( {\theta}^{*} \), \( {g}_m \), and \( {r}_{\mathrm{growth}} \) using a Monte Carlo-type stratified sampling approach for 81 cases from the class A data set, which may not include regions in China. In our study, the most sensitive group of physical parameters among the many parameters is identified using the new approach based on the CNOP-P approach. The results in Zaehle et al. (2005) were similar to those in the present study using the CNOP-P approach in the northern, northeastern, and southern regions. In the arid and semi-arid regions of China, the parameter sensitivity obtained using the CNOP-P approach was different from that proposed by Zaehle et al. (2005). In the case (36.75°N, 116.25°E), the top five parameters for NPP using the CNOP-P approach were \( {\alpha}_m \), \( {g}_m \), \( {\alpha}_{C3} \), \( {\alpha}_a \), and \( {r}_{\mathrm{growth}} \). Furthermore, we found that the results emphasized water demand in water-limited regions. The sensitivity index was −0.11 for \( {\alpha}_m \) in Zaehle et al. (2005). Their study did note that the evaporation parameter, \( {\alpha}_m \), is important, but \( {\alpha}_m \) ranked lower than 12th, which implied that it was less important and sensitive than at least 12 other parameters. In our study, \( {\alpha}_m \) was the most important and sensitive parameter among the 24 parameters in the arid and semi-arid regions of China.

Owing to the nonlinear interactions among parameters, the most sensitive and important five parameters are not necessarily the top five parameters identified using the above methods. One advantage of the CNOP-P approach is that it can identify which group of parameters is the most sensitive and important compared with other groups of five parameters according to the cost function value. Using this approach, our results showed that for the northern, northeastern, and southern regions of China, the subsets of the five relatively more sensitive and important parameters were similar to the top five parameters proposed by Zaehle et al. (2005). However, for the arid and semi-arid regions, the combination of the five relatively more sensitive and important parameters obtained using the CNOP-P approach was different from the combination proposed by Zaehle et al. (2005). The numerical results also show that the parameters in the arid and semi-arid region need to be carefully examined during model validation.

In the arid and semi-arid region, soil water content is the key factor controlling plant growth and survival. The parameter \( {\alpha}_m \), representing evapotranspiration, plays a key role in vegetation growth in water-limited regions. The combination of parameters identified using the CNOP-P approach reflects these physical characteristics. The parameter \( {g}_m \), representing the maximum canopy conductance analog, showed only weak sensitivity in water-limited regions when the nonlinear interaction among parameters was considered. The numerical results suggest that not all parameters related to vegetation growth are key parameters and illustrate that the nonlinear interactions among parameters are crucial to the uncertainty of numerical simulations in complex regions associated with nonlinear physical processes. As demonstrated in the present study, the CNOP-P approach is able to consider nonlinear physical processes and is recommended for identifying the most significant subset or combination of relatively more sensitive and important parameters. It is important to determine the combination of relatively more sensitive and important parameters within a given number of parameters. The number of parameters in the combination can be chosen according to the practicalities of human and material resources, such as obtaining observations of the physical parameters and executing the optimization method. The relatively more sensitive and important parameter combination is then determined for the chosen number of parameters. Bastidas et al. (1999) employed the MOGSA method to identify which of the parameters were sensitive according to the significance level. They reported that there were nine sensitive parameters at a significance level below 1 % for sensible heat, latent heat, and ground temperature in the Tucson semi-arid region. In contrast, our approach can answer which four or five parameters are the most sensitive among the many parameters.

The new approach established in the present study can determine which subset of parameters is the most sensitive, that is, the most significant combination of relatively more sensitive and important parameters within a given total number of parameters. The core of the new approach is the CNOP-P method, which considers the impact of nonlinear interactions among the parameters on the uncertainty of the numerical simulation. There are two key characteristics of the new approach. Firstly, it is able to explore the response of uncertainty in numerical simulations to nonlinear interactions among parameters. Secondly, because the CNOP-P approach optimizes every parameter, the sensitivity established for every parameter is optimal. In addition, the most important combination of parameters is also identified through the new approach and is therefore optimized as well. According to the current numerical results, the top 10 sensitive parameters identified using the CNOP-P approach are similar for the different reference states. Consequently, the identified five-parameter sensitive combination changes little, especially in the humid regions. In the arid and semi-arid regions, the identified sensitive parameters differ slightly when the reference states differ considerably. Such differences are possible and reasonable when the parameter values differ greatly. The key parameters may differ among states because of climate conditions, complex physical processes, different initial conditions, and so on. Nevertheless, the group of parameters that leads to the maximum uncertainty of the numerical simulation is identified using the new method based on the CNOP-P approach. This group of parameters indicates which parameters should be calibrated first to improve the ability of numerical simulation. On the other hand, because the two-stage sensitive parameter identification method in our study involves two groups of optimization processes, its computational cost is higher than that of other methods, such as the OAT method and the traditional Latin hypercube method (Zaehle et al. 2005). However, current dynamical vegetation models are single-column models and can therefore be run in parallel. Given this feature, the computational cost of using the CNOP-P approach to identify parameter sensitivity can be reduced by running the computations in parallel on many central processing units (CPUs).

The LPJ model was taken as an example to demonstrate how to implement the new approach. The numerical results showed that the most important combination of parameters in the arid and semi-arid regions of China was different than those in northern, northeastern, and southern China, which were similar to each other. The numerical results also indicated that the most important five-parameter combinations in northern, northeastern, and southern China were the same with or without consideration of the nonlinear interactions among the parameters. However, this was not the case in the arid and semi-arid regions, where the most important combinations were different with and without consideration of the nonlinear interactions among the parameters.

A specific example was shown in which we identified the most important combination of parameters. We wanted to determine the most important subset of five relatively more sensitive and important physical parameters among an initial total of 24. Firstly, 14 physical parameters were eliminated using the CNOP-P approach. Secondly, the most significant five-parameter combination from the remaining 10 more sensitive and important parameters was determined using the CNOP-P approach together with an enumeration of the possible combinations. Whether the identified subset of five relatively more sensitive and important parameters depends on the number of initially eliminated parameters will be explored in the future. Moreover, combinations of four or seven parameters will also be identified in future work. In regions with strong interaction between land and atmosphere, such as the arid and semi-arid region, the identified parameter combination may depend on the number of parameters in the combination. However, in regions with weak interaction between land and atmosphere, such as the humid region, the identified parameter combination may not depend on the number of parameters in the combination. These issues will be explored in the future. Moreover, the constraint condition was chosen as δ = 0.2 in the current study. It is necessary to explore how the sensitivity of the parameter combination depends on the constraint condition.
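The two-stage selection summarized above can be sketched as follows. This is a minimal illustration rather than the implementation used in the study: the single-parameter scores, the parameter names `p0` to `p23`, and `placeholder_combo_cost` are hypothetical, and in the actual method each score would be the cost function value of a CNOP-P obtained by optimization. Choosing five parameters from ten gives 252 candidate combinations, which matches the number of combinations reported in Section 4.2.

```python
from itertools import combinations

def two_stage_selection(single_scores, combo_cost, keep=10, size=5):
    """Stage 1: keep the `keep` highest-scoring parameters.
    Stage 2: evaluate every `size`-parameter combination of them and pick the best."""
    retained = sorted(single_scores, key=single_scores.get, reverse=True)[:keep]
    candidates = list(combinations(retained, size))
    best = max(candidates, key=combo_cost)
    return retained, candidates, best

# Hypothetical single-parameter sensitivities for 24 parameters.
single_scores = {f"p{i}": float(24 - i) for i in range(24)}

def placeholder_combo_cost(combo):
    """Hypothetical combination cost: additive part plus one interaction term."""
    total = sum(single_scores[name] for name in combo)
    if "p0" in combo and "p9" in combo:
        total += 20.0  # assumed nonlinear interaction between p0 and p9
    return total

retained, candidates, best = two_stage_selection(single_scores,
                                                 placeholder_combo_cost)
```

With the assumed interaction term, the best combination contains `p9` instead of `p4`, which illustrates how the most sensitive combination can differ from the five individually most sensitive parameters.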

There are two types of predictability problems, distinguished by the factors that lead to them. The first is related to initial errors (Palmer et al. 1998) and the second involves model uncertainties in numerical models performing simulations and forecasts of the Earth system (Williams et al. 2001; Sitch et al. 2003; Jackson et al. 2004; Berthelot et al. 2005; Lin et al. 2011). At present, targeted observations, also called adaptive observations, are used to identify certain special areas (called sensitive areas) that cause large uncertainty in forecast results. Additional observations in these sensitive areas supply more reliable initial states for the model, and thus a more accurate prediction is expected (Palmer et al. 1998; Bishop et al. 2001; Wu et al. 2005; Qin and Mu 2011). Mu (2013) proposed a new idea to address the second predictability problem by generalizing targeted observations from geographical space to phase space for model parameters. The essential point is how to determine the more sensitive and important parameters among all model parameters. The present paper provides a concrete approach to realize this goal in a new way. Whether such types of targeted observations in the phase space can effectively reduce the uncertainty of physical processes and model parameters requires more work in the future.