Theoretical and Applied Genetics, Volume 109, Issue 8, pp 1632–1640

Environment characterisation for the interpretation of environmental effect and genotype × environment interaction

Authors

  • X. Lacaze
    • UMR 1097 Diversité et Génome des Plantes Cultivées : Institut de la Recherche Agronomique INRA-SGAP
  • Pierre Roumet
    • UMR 1097 Diversité et Génome des Plantes Cultivées : Institut de la Recherche Agronomique INRA-SGAP
Original Paper

DOI: 10.1007/s00122-004-1786-6

Cite this article as:
Lacaze, X. & Roumet, P. Theor Appl Genet (2004) 109: 1632. doi:10.1007/s00122-004-1786-6

Abstract

Increasing attention is being paid to environment characterisation as a means of identifying the environmental factors determining grain protein content (GPC) in durum wheat. New insights in crop physiology and agronomy have led to the development of crop simulation models. Those models can reconstruct plant development for past cropping seasons. One major advantage of these models is that they can also indicate the intensity of limiting factors affecting plants during particular developmental stages. The main environmental factors determining GPC in durum wheat can be investigated by introducing the intensity of limiting factors into genotype × environment (G×E) models. In our case, limiting factors corresponding to water deficit and nitrogen availability were calculated for the development period between booting and heading. These variables were then introduced into a clustering model. This model is an extension of factorial regression applied to discrete environment and genotypic variables. This procedure effectively described the environment main effect: around 30.9% of the sum of squares of the environment main effect was accounted for, using less than 33% of the degrees of freedom. It also partially accounted for G×E interaction. Our methodology, coupling the use of crop simulation and G×E analysis models, is of potential value for improving our understanding of the main development stages and identification of environmental limiting factors for the development of GPC.

Introduction

Genotypes grown in multi-environment trials may react differently to a range of climatic conditions, soil characteristics or technical practices. These differential responses of genotypes in different environments are known collectively as the genotype × environment (G×E) interaction.

Formal ANOVAs demonstrate the existence of G×E interaction but do not provide sufficient information for analysis of the differences in response of individual genotypes to each environment. Several approaches and models have been developed for analysis and interpretation of the G×E interaction. Yates and Cochran (1938) and Finlay and Wilkinson (1963) were among the first to propose a regression-based procedure for relating genotypic performance to an environmental index. This approach resulted in the classification of genotypes into three classes: varieties well adapted to favourable environments, varieties adapted to unfavourable environments and genotypes that are non-specific or display intermediate levels of adaptation. One major problem with this type of regression analysis is that the proportion of the interaction accounted for by the heterogeneity of regression is generally low. Furthermore, environments may be considered favourable or unfavourable, but these notions provide no useful information for biological interpretation of the G×E interaction.
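The joint regression idea described above can be sketched in a few lines. This is a minimal illustration, not the procedure used in this study: the environmental index is taken here as the mean of all genotypes in each environment, and the GPC values in `demo` are hypothetical.

```python
from statistics import mean

def joint_regression_slopes(table):
    """Slope of each genotype's performance on the environmental index.

    table maps a genotype name to a list of trait values, one per
    environment, in the same environment order for every genotype.
    """
    n_env = len(next(iter(table.values())))
    # Environmental index: mean of all genotypes in each environment.
    env_index = [mean(row[j] for row in table.values()) for j in range(n_env)]
    index_mean = mean(env_index)
    centred = [e - index_mean for e in env_index]
    sxx = sum(c * c for c in centred)
    slopes = {}
    for name, row in table.items():
        # The centred index sums to zero, so the trait need not be centred.
        sxy = sum(c * y for c, y in zip(centred, row))
        slopes[name] = sxy / sxx
    return slopes

# Hypothetical GPC values (%), three genotypes in four environments.
demo = {
    "G1": [12.0, 13.0, 14.0, 15.0],
    "G2": [12.5, 13.0, 13.5, 14.0],
    "G3": [11.5, 13.0, 14.5, 16.0],
}
slopes = joint_regression_slopes(demo)
```

With this index the slopes average 1 across genotypes; a slope above 1 indicates a genotype that responds more than average to the environmental index, and a slope below 1 a genotype that responds less.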

Multiplicative models provide a more advanced approach to the analysis of G×E interaction. First formalized by Gollob (1968) and Mandel (1969), these models make it possible to quantify the specific adaptation of one genotype to one environment. The interaction term is broken down into a sum of products of genotype and environment parameters. These parameters are estimated by applying a procedure analogous to principal component analysis to the residuals of the additive model, summarising the G×E matrix in two dimensions. The additive main effects and multiplicative interaction (AMMI) procedure (Gauch 1992; Vargas et al. 1999) provides useful information for the analysis of genotypic and environmental stability. As with joint regression, the analysis is performed a posteriori and provides no explanation of the origins of G×E interaction. Interpretation of the results obtained with multiplicative models requires external information matched to predefined groups of genotypes or environments. Environmental variables can then be added to an AMMI biplot (Reynolds et al. 2002).

If information concerning external environmental (or genotypic) variables, such as meteorological data, earliness or time to flowering, is available, these variables may be correlated with the environmental scores or used as regressors in factorial regression (Denis 1989), biadditive factorial regression (Denis 1991) and clustering models (Denis and Vincourt 1982). Factorial regression models are usually linear models accounting for G×E interaction by differential cultivar sensitivity to explicit external environmental variables. The influence of these external variables on G×E interaction can be tested statistically. Main additive effects and/or interaction terms are regressed on environmental characteristics (Brancourt-Hulmel et al. 2001; Foucteau et al. 2001). Factorial regression approaches can be carried out with software such as the INTERA package (Decoux and Denis 1991). The relevance of the regression variables depends on the percentage of the sum of squares of main effects and interaction terms explained by these variables and the number of degrees of freedom involved in the analysis (Brancourt-Hulmel et al. 1997). Biadditive factorial regression uses linear combinations of environmental factors as regressors for G×E interaction (Brancourt-Hulmel and Lecomte 2003). Clustering analysis is based on the same principles as factorial regression, except that the environmental (or genotypic) covariates are not continuous but discrete, corresponding to classes of intensity of these variables. Although clustering models and factorial regression may potentially provide a biological interpretation of the environment main effect and G×E interaction, the efficiency with which these analyses explain the interaction depends on the nature of the covariates (Desclaux 1996).
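As a rough illustration of factorial regression with a single environmental covariate, the sketch below regresses each genotype's interaction residuals on the centred covariate and reports the share of the interaction sum of squares explained. The function name, the residual matrix and the covariate values are hypothetical; this is not the INTERA implementation.

```python
def factorial_regression(theta, z):
    """Regress interaction residuals on one environmental covariate.

    theta[i][j]: interaction residual of genotype i in environment j.
    z[j]: covariate value for environment j (e.g. a stress index).
    Returns (sensitivities, percentage of interaction SS explained).
    """
    z_mean = sum(z) / len(z)
    zc = [v - z_mean for v in z]
    szz = sum(v * v for v in zc)
    # One sensitivity coefficient per genotype.
    rho = [sum(t * v for t, v in zip(row, zc)) / szz for row in theta]
    total_ss = sum(t * t for row in theta for t in row)
    explained_ss = szz * sum(r * r for r in rho)
    return rho, 100.0 * explained_ss / total_ss

# Hypothetical residuals for two genotypes in three environments,
# and a hypothetical covariate (cumulative water deficit, mm).
theta_demo = [[-1.0, 0.0, 1.0],
              [1.0, 0.0, -1.0]]
z_demo = [0.0, 20.0, 40.0]
rho, pct = factorial_regression(theta_demo, z_demo)
```

In this toy case the residuals are exactly proportional to the covariate, so the covariate explains all of the interaction; with real data the percentage explained has to be weighed against the degrees of freedom used (one per genotype, minus one).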

The covariates can be collected on special genotypes chosen for their known reaction to environmental factors. Data collected on well-known genotypes, ‘the probe genotype approach’, avoid the need for direct measurement of environmental data and generate covariates corresponding directly to the factors limiting plant development (Brancourt-Hulmel et al. 2000; Desclaux 1996).

Recent decades have seen a number of new developments in crop physiology and agronomy, and some integrated approaches to crop simulation modelling have been developed. Such approaches make it possible to reconstruct the plant development cycle and to determine whether a stress occurred at any given development stage. This approach can be extended to a number of past cropping seasons. Nevertheless, this promising approach requires the collection of meteorological data and their integration into crop simulation models. There is currently no example of use of information derived from crop simulation models to analyse G×E interaction.

The aim of this study was to develop a methodology, using information derived from crop simulation and G×E models, to account for variation in the grain protein content (GPC) of durum wheat varieties evaluated in a multi-environment network.

We aim to show how information derived from crop simulation models can be used to interpret both environmental effect and the adaptation of a given genotype to specific environments. We illustrate this method using durum wheat GPC data.

Materials and methods

Data collection

The data set was obtained from the French ‘Comité Technique et Permanent de la Sélection’ network; it included a total of 111 site-by-year combinations and 48 genotypes evaluated over 8 years. The study sites were located between 43.3°N and 45.4°N and at longitudes of −0.57° to 5.9°. Each genotype was tested in at least two cropping seasons (i.e. a biennial network) from 1992/1993 to 1998/1999. Only two varieties, Ardente and Néodur, were grown in every environment in all 8 years. In each biennial network, 6–12 varieties were tested over 11–21 environments. If a site was present in a given network in two consecutive years, it was considered as two distinct sites.

Each trial consisted of two replicates, with a plot size of approximately 10 m² for each genotype. Each trial was treated with fungicides. The experimental design was either a split plot or a crisscross design. The traits measured in each trial plot were heading date, grain yield, thousand-kernel weight, kernel number per square metre and GPC. In the field, lines were considered headed when 50% of the spikes had emerged from the flag leaf. GPC was determined by Kjeldahl’s method; N analysis was performed on a 1-g sample taken from the 3 kg of grain harvested in bulk.

The data collected for each environment included climatic data (daily temperature, rainfall), cultural practices (fertilisation and irrigation) and soil characteristics [texture, depth and useful water reserve (UWR)].

Plant nitrogen nutrition and development stage

The booting-to-heading stage has been reported to be one of the most critical phenological phases for grain development. Nitrogen uptake increases dramatically during this period to reach a maximum accumulation rate (Gate 1995). We therefore focused our attention on this period of the plant cycle.

As only heading date was recorded, a thermal index (expressed in degree days) and a vernalophotothermal index (Gate 1995) were used to estimate the booting stage and to calculate the date and duration of the booting–heading period.

The duration of the booting-to-heading stage was simulated for each environment. Sowing and heading dates were available for all trials, whereas the booting stage was estimated according to the crop phenological schedule described by Gate (1995). As the mean heading date for the variety Néodur was close to the mean for the series of varieties (less than 1 day away from the mean, with a standard deviation of 1.1 days), this variety was used to characterise plant phenology at each site. Therefore, growth requirements for the variety Néodur were used for these simulations.
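To make the thermal-time reasoning concrete, the sketch below walks back from an observed heading date until a given number of degree days has accumulated. It is a simplification under stated assumptions: only the thermal index is used (the vernalophotothermal index of Gate 1995 is not reproduced), and the base temperature, the thermal gap of 60 degree days and the temperature series are hypothetical values chosen for illustration.

```python
def estimate_booting_day(daily_mean_temp, heading_day, thermal_gap, base_temp=0.0):
    """Walk back from heading until `thermal_gap` degree days are accumulated.

    daily_mean_temp: daily mean temperatures (deg C), index = day number.
    heading_day: index of the observed heading date.
    thermal_gap: degree days assumed between booting and heading (hypothetical).
    base_temp: base temperature for degree-day accumulation (hypothetical).
    Returns the index of the estimated booting day.
    """
    accumulated = 0.0
    day = heading_day
    while day > 0 and accumulated < thermal_gap:
        day -= 1
        accumulated += max(0.0, daily_mean_temp[day] - base_temp)
    return day

# Hypothetical temperature series; heading observed on day 9.
temps = [10.0, 12.0, 14.0, 15.0, 16.0, 18.0, 17.0, 15.0, 16.0, 20.0]
booting = estimate_booting_day(temps, heading_day=9, thermal_gap=60.0)
duration = 9 - booting  # length of the booting-to-heading period in days
```

The estimated period then defines the window over which the climatic covariates are accumulated for each environment.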

Agronomic diagnosis

Climatic covariates were calculated for the booting-to-heading stage in each environment.

Each location was characterised climatically, using data from the nearest meteorological station (less than 20 km away) for temperature, radiation, rainfall and potential evapotranspiration (PET). The difference between maximum evapotranspiration (MET) and real evapotranspiration (RET) provided a daily estimate of water deficit (WD): WD = MET − RET. In this formula, MET = kc × PET, where kc, the plant development index, was fixed at 1.2 for the booting-to-heading stage. RET was calculated as a function of MET and of the water supply (WS) available to the plant, RET = f(MET, WS). Water supply was estimated from the UWR and precipitation (P): WS(t) = UWR(t − 1) + P(t), where UWR(t − 1) is the UWR on the previous day. Based on UWR values, the amount of available moisture (AM) was calculated and compared with WS. If WS(t) > AM, no WD occurs and RET(t) = MET(t). When WS(t) < AM, RET(t) = MET(t) × WS(t)/AM (Gate 1995). In these calculations, according to Soltner (2000), we assumed that AM is equivalent to 60% of the UWR.
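The daily water balance can be sketched as follows. This is one reading of the description, not the software used in the study. It makes several assumptions: the deficit is counted as the positive shortfall MET − RET; RET is reduced in proportion to WS/AM when supply falls below AM; AM is 60% of the reserve at full capacity; RET cannot exceed the water actually available; and the reserve is updated as WS − RET, capped at capacity. The last two rules are not stated in the text.

```python
KC = 1.2           # plant development index for booting to heading (from the text)
AM_FRACTION = 0.6  # available moisture as a share of UWR (Soltner 2000)

def cumulative_water_deficit(pet, rain, uwr_capacity, uwr_start):
    """Cumulative water deficit (mm) over a run of days.

    pet, rain: daily potential evapotranspiration and precipitation (mm).
    uwr_capacity: useful water reserve of the soil when full (mm).
    uwr_start: reserve on the day before the first day (mm).
    """
    am = AM_FRACTION * uwr_capacity
    uwr = uwr_start
    total_wd = 0.0
    for pet_t, rain_t in zip(pet, rain):
        met = KC * pet_t
        ws = uwr + rain_t             # WS(t) = UWR(t-1) + P(t)
        if ws >= am:
            ret = met                 # supply is not limiting
        else:
            ret = met * ws / am       # assumed proportional reduction
        ret = min(ret, ws)            # assumption: cannot use more than is there
        total_wd += met - ret
        # Assumed reserve update, capped at capacity (not given in the text).
        uwr = min(uwr_capacity, ws - ret)
    return total_wd

# Hypothetical single dry day: PET 5 mm, no rain, reserve at 30 of 100 mm.
wd_dry = cumulative_water_deficit([5.0], [0.0], 100.0, 30.0)
```

Summing the daily deficit over the simulated booting-to-heading period gives the cumulative WD used to classify each site.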

Experimental sites were classified according to the intensity of WD occurring during the booting-to-heading stage. Three classes of WD, representing a gradient of water stress intensity, were defined—locations with no WD during the booting-to-heading stage, locations with a cumulative WD of 0–40 mm and experimental sites with a WD greater than 40 mm.

The availability of nitrogen to the plants was also considered. The application of nitrogen during the booting-to-heading stage was described by the variable SOLUB. SOLUB is 0 if no nitrogen was given during this period or 1 if at least one nitrogen application was performed.

A synthetic variable SOLWD was generated from the variables WD and SOLUB to simultaneously take into account water and nitrogen status during the booting-to-heading stage. From the three WD groups and the two SOLUB groups, we generated six SOLWD groups by subdividing the groups generated for WD and SOLUB.
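The classification into WD, SOLUB and SOLWD classes amounts to a small lookup, sketched below. The class labels and function names are illustrative choices; the thresholds (no deficit, 0–40 mm, more than 40 mm) are those given in the text.

```python
WD_LABELS = ("no WD", "0-40 mm", ">40 mm")

def wd_class(cumulative_wd_mm):
    """Assign a site to one of the three water deficit classes."""
    if cumulative_wd_mm <= 0:
        return "no WD"
    if cumulative_wd_mm <= 40:
        return "0-40 mm"
    return ">40 mm"

def solwd_class(cumulative_wd_mm, n_applied_during_stage):
    """Combine the WD class with SOLUB (1 if N was applied during the stage)."""
    solub = 1 if n_applied_during_stage else 0
    return (wd_class(cumulative_wd_mm), solub)

# The six possible SOLWD classes: three WD classes times two SOLUB levels.
ALL_SOLWD_CLASSES = [(w, s) for w in WD_LABELS for s in (0, 1)]
```

Not every class need occur in a given network, which is why the degrees of freedom used by SOLWD vary from one biennial network to another.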

Statistical methods

We began by using the Néodur and Ardente data set. An ANOVA (GPC = year + region + variety) was performed to compare the magnitude of the factors ‘year’ (eight levels) and ‘region’ (two levels, southeast and southwest). This analysis was carried out using the SAS procedure GLM (SAS Institute 1996).

Secondly, for each biennial data set, we fitted a fixed ANOVA model using INTERA software (Decoux and Denis 1991). All analyses were conducted with two main effects, genotype and environment. The residual term included the G×E interaction, because data for a given variety were not repeated at a given location.
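For a genotype-by-environment table with one value per cell, the additive decomposition can be written out directly. The sketch below is a generic two-way decomposition, assumed here to correspond to the fixed model described above. It is not the INTERA code, and the example table is hypothetical.

```python
def additive_anova(y):
    """Two-way additive decomposition; y[i][j] is genotype i in environment j.

    Returns {source: (sum of squares, degrees of freedom)}. With one value
    per cell, the residual contains both G x E interaction and pure error.
    """
    g, e = len(y), len(y[0])
    grand = sum(map(sum, y)) / (g * e)
    row_means = [sum(row) / e for row in y]
    col_means = [sum(y[i][j] for i in range(g)) / g for j in range(e)]
    ss_gen = e * sum((m - grand) ** 2 for m in row_means)
    ss_env = g * sum((m - grand) ** 2 for m in col_means)
    ss_res = sum((y[i][j] - row_means[i] - col_means[j] + grand) ** 2
                 for i in range(g) for j in range(e))
    return {
        "genotype": (ss_gen, g - 1),
        "environment": (ss_env, e - 1),
        "residual": (ss_res, (g - 1) * (e - 1)),
    }

# Hypothetical GPC table: two genotypes, three environments, purely additive.
demo = [[12.0, 14.0, 16.0],
        [13.0, 15.0, 17.0]]
result = additive_anova(demo)
```

For a network of 12 genotypes and 16 sites, the same formulas give 11, 15 and 165 degrees of freedom for genotype, environment and residual.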

Clustering model

The clustering model (Denis and Vincourt 1982) provides a tool to explain variation in a trait (here GPC) by grouping locations. These groups can be defined on the basis of external information, such as stress indices. The clustering procedure compares the sums of squares generated by grouping together locations or genotypes with respect to main effects and interaction terms. The general equation of the model is as follows:
$$Y_{ij} = \mu + \alpha _s^{\text{b}} + \alpha _i^{\text{w}} + \beta _t^{\text{b}} + \beta _j^{\text{w}} + \theta _{st}^{{\text{bb}}} + \theta _{sj}^{{\text{bw}}} + \theta _{it}^{{\text{wb}}} + \theta _{ij}^{{\text{ww}}} $$
where α is the genotype main effect, β is the environment main effect, and θ is the G×E interaction term.

Within this formula, s and t refer to the group level of genotypes and environments, respectively. Indices b and w stand for ‘between’ and ‘within’ group. Finally, i refers to the ith genotype and j to the jth environment. The additive model is included in this model as
$$\alpha _i = \alpha _s^{\text{b}} + \alpha _i^{\text{w}} \quad \text{and} \quad \beta _j = \beta _t^{\text{b}} + \beta _j^{\text{w}} $$

The between (b) and within (w) components are readily comprehensible. The term α_s^b represents the variation between groups, with levels within a group held constant, whereas α_i^w only accounts for variation within groups. The mean of α_i^w is zero.

Ultimately, to recover the full interaction model we add the term
$$\theta _{ij} = \theta _{st}^{{\text{bb}}} + \theta _{sj}^{{\text{bw}}} + \theta _{it}^{{\text{wb}}} + \theta _{ij}^{{\text{ww}}} $$

where θ^bb is the interaction term between groups of genotypes and groups of environments and θ^ww the residual variation. The θ^bw and θ^wb terms are less readily comprehensible. They correspond to the interaction between groups of genotypes within one group of environments and vice versa. The efficiency of a model is defined as the ratio of the percentage of the sum of squares of main effects and interaction explained by the model to the percentage of degrees of freedom used. This analysis was performed for either the environment main effect or the residual term including both a pure error term and G×E interaction (software INTERA, Decoux and Denis 1991).
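The between-group share of the environment main effect and the efficiency ratio can be illustrated as follows. This is a sketch on environment means, which should match the cell-level sums of squares when every genotype is present in every environment. The function names and the four-site example are hypothetical; the efficiency example uses the SOLWD figure for 1997/1998 (61.3% of the environment sum of squares with 5 of the 15 environment degrees of freedom).

```python
def share_explained_by_groups(env_effects, groups):
    """Percentage of the environment sum of squares captured by group means."""
    n = len(env_effects)
    grand = sum(env_effects) / n
    total_ss = sum((x - grand) ** 2 for x in env_effects)
    members = {}
    for x, g in zip(env_effects, groups):
        members.setdefault(g, []).append(x)
    # Between-group SS: group size times squared deviation of the group mean.
    between_ss = sum(len(v) * (sum(v) / len(v) - grand) ** 2
                     for v in members.values())
    return 100.0 * between_ss / total_ss

def efficiency(pct_ss_explained, df_used, df_total):
    """Ratio of % sum of squares explained to % degrees of freedom used."""
    return pct_ss_explained / (100.0 * df_used / df_total)

# Hypothetical environment means (GPC, %) for four sites in two WD classes.
share = share_explained_by_groups([12.0, 13.0, 14.0, 15.0],
                                  ["no WD", "no WD", "WD", "WD"])
# SOLWD in the 1997/1998 network: 61.3% explained with 5 of 15 df.
eff_solwd_1997 = efficiency(61.3, 5, 15)
```

An efficiency above 1 means the grouping explains more of the variation than its share of the degrees of freedom would suggest by chance alone.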

Multiplicative model

The papers by Gollob (1968) and Mandel (1969) describe the basis of multiplicative models. In this approach, the interaction term (θij) is partitioned as:
$$\theta _{ij} = \sum {l_n u_{ni} v_{nj} + r_{ij} } $$
where ∑ denotes summation over the axes n = 1, 2, ..., N included in the model; l_n is the singular value for axis n; u_ni and v_nj are the elements of the genotype and location eigenvectors, respectively, for axis n; and r_ij is the residual term. The results of multiplicative analysis can be presented graphically in the form of biplots, in which the genotypic and environment scores of the first or second bilinear (multiplicative) terms are plotted against genotypic and environment main effects. A comparison of genotypic and environmental biplots provides an interpretation of the level of adaptation of genotypes to environments, according to whether the multiplicative terms (genotypic and environmental) have the same sign and intensity. Multiplicative analysis was performed with INTERA software (Decoux and Denis 1991).
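The first multiplicative term can be obtained from the interaction residuals without any external library, for instance by power iteration, as sketched below. This is an illustrative substitute for the singular value decomposition performed by dedicated software, not the INTERA algorithm, and the example table is hypothetical.

```python
import math

def interaction_residuals(y):
    """Residuals of the additive model for a table y[genotype][environment]."""
    g, e = len(y), len(y[0])
    grand = sum(map(sum, y)) / (g * e)
    rows = [sum(r) / e for r in y]
    cols = [sum(y[i][j] for i in range(g)) / g for j in range(e)]
    return [[y[i][j] - rows[i] - cols[j] + grand for j in range(e)]
            for i in range(g)]

def first_multiplicative_term(theta, n_iter=200):
    """Leading singular triplet (l, u, v) of theta by power iteration."""
    g, e = len(theta), len(theta[0])
    # Start from the largest row: a constant vector would be annihilated,
    # because interaction residuals sum to zero across environments.
    start = max(theta, key=lambda row: sum(x * x for x in row))
    norm = math.sqrt(sum(x * x for x in start))
    if norm == 0.0:
        return 0.0, [0.0] * g, [0.0] * e
    v = [x / norm for x in start]
    l, u = 0.0, [0.0] * g
    for _ in range(n_iter):
        u = [sum(theta[i][j] * v[j] for j in range(e)) for i in range(g)]
        nu = math.sqrt(sum(x * x for x in u))
        u = [x / nu for x in u]
        w = [sum(theta[i][j] * u[i] for i in range(g)) for j in range(e)]
        l = math.sqrt(sum(x * x for x in w))
        v = [x / l for x in w]
    return l, u, v

# Hypothetical table: additive effects plus a rank-one interaction.
demo = [[11.0, 12.0, 12.0],
        [10.0, 13.0, 15.0],
        [12.0, 14.0, 15.0]]
theta = interaction_residuals(demo)
l1, u1, v1 = first_multiplicative_term(theta)
```

The product l1 × u1[i] × v1[j] is positive when the genotype and environment scores have the same sign, which corresponds to a positive specific adaptation of genotype i to environment j on that axis.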

Results

Environmental covariates

No WD was detected during the booting-to-heading stage at 56 of 111 locations. Intermediate stress intensity (0–40 mm) was observed at 44 locations, whereas severe stress (>40 mm) affected 11 locations during this period of the plant cycle. We plotted, for Néodur, the relationship between thousand-kernel weight and kernel number per square metre for every trial plot (Fig. 1). Each WD group is shown in these diagrams; the more intense the WD, the greater was the decrease in kernel number per m². Therefore, the definition of three WD groups was consistent with the expected effects on yield components.
Fig. 1 Effect of water deficit on yield components for Néodur for every trial plot

Nitrogen was applied to the crop during the booting-to-heading stage (variable SOLUB) in only 22 of the 111 environments. This cultural practice was not observed in the first two biennial networks and was rare in the 1994/1995 network (12% of the environments). In other networks, it occurred at 35–50% of the sites, and the amount of N fertiliser applied was between 30 kg/ha and 60 kg/ha.

GPC variation

The GPC values obtained for the two genotypes grown at each experimental site (Ardente and Néodur) were normally distributed and varied from 10% to 20%, with a mean of 14.5%. In classical ANOVA, only the region and year effects were significant, accounting for 2.5% and 31.2% of the total sum of squares (TSS), respectively (Table 1).
Table 1 ANOVA for grain protein content (GPC) of Ardente and Néodur over 111 site-by-year combinations. TSS Total sum of squares explained by the factor

| Source of variation | df | Mean square | TSS (%) | F-value | P > F |
|---|---|---|---|---|---|
| Year | 7 | 30.1 | 31.2 | 14.3 | < 0.0001 |
| Region | 1 | 17 | 2.5 | 8 | 0.005 |
| Variety | 1 | 0.1 | 0.05 | 0.04 | 0.83 |

Clustering analysis was then used to compare the relevance of environmental descriptors in accounting for variations with that of the traditional ‘year’ and ‘region’ factors (Table 2). We included three environmental covariates (WD, SOLUB and SOLWD) in this analysis. WD accounted for 18.8% of total variation in GPC, whereas the availability of nitrogen (SOLUB) accounted for 0.2%, and the combined variable (SOLWD) accounted for 21.2% of total variation. SOLWD is therefore almost as relevant as ‘year’ for explaining variations in GPC.
Table 2 Clustering analysis for GPC of Ardente and Néodur over 111 site-by-year combinations. SOLUB Nitrogen availability, WD water deficit, SOLWD combined availability of water and nitrogen

| | SOLUB | WD | SOLWD |
|---|---|---|---|
| SS env (%)a | 6.0 | 20.5 | 23.1 |
| df | 1 | 2 | 5 |
| SS tot (%)a | 0.2 | 18.8 | 21.1 |

a SS env percentage of explanation of environment main effect, SS tot percentage of explanation of total variation

Biennial network analysis

GPC variation was similar to that reported previously, ranging from 9.3% to 20%, with a mean of 13.5%. Mean GPC for all varieties was highest in 1997, reaching 15.7%, and lowest in 1994 (12.6%).

Environment had the strongest effect, accounting for 73.6–87.8% of the TSS in ANOVA (Tables 3, 4). The genotype and residual effects were of similar magnitude, accounting for 7.7–15.9% and 4.5–15% of the TSS, respectively.
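Table 3 ANOVA for genotypes evaluated on biennial networks. SS Sum of squares

| Source of variation | 1992/1993 SS | 1992/1993 df | 1993/1994 SS | 1993/1994 df | 1994/1995 SS | 1994/1995 df | 1995/1996 SS | 1995/1996 df | 1996/1997 SS | 1996/1997 df | 1997/1998 SS | 1997/1998 df | 1998/1999 SS | 1998/1999 df |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Genotype | 33.3 | 6 | 28.3 | 5 | 32.7 | 8 | 64.6 | 9 | 55.2 | 7 | 65.2 | 11 | 34.5 | 10 |
| Environment | 109.4 | 10 | 177.5 | 13 | 332.9 | 18 | 299.2 | 20 | 299.0 | 14 | 738.7 | 15 | 238.1 | 14 |
| Residual | 25.1 | 60 | 32.5 | 65 | 52.6 | 144 | 42.5 | 180 | 31.7 | 98 | 37.5 | 165 | 47.8 | 140 |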

Table 4 ANOVA for genotypes evaluated on biennial networks. Results are expressed as a percentage of total variation explained

| Source of variation | 1993/1994 | 1994/1995 | 1995/1996 | 1996/1997 | 1997/1998 | 1998/1999 | Mean |
|---|---|---|---|---|---|---|---|
| Genotype | 11.9 | 7.8 | 15.9 | 14.3 | 7.7 | 10.8 | 12.6 |
| Environment | 74.5 | 79.6 | 73.6 | 77.5 | 87.8 | 74.3 | 76.1 |
| Residual | 13.6 | 12.6 | 10.5 | 8.2 | 4.5 | 14.9 | 11.3 |
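The percentages in Table 4 follow directly from the sums of squares in Table 3. As a worked check, the short sketch below converts the 1997/1998 sums of squares into percentages of the total; the function name is an illustrative choice.

```python
def percent_of_total(ss_by_source):
    """Express each sum of squares as a percentage of their total."""
    total = sum(ss_by_source.values())
    return {k: round(100.0 * v / total, 1) for k, v in ss_by_source.items()}

# Sums of squares for the 1997/1998 network, copied from Table 3.
ss_1997_1998 = {"Genotype": 65.2, "Environment": 738.7, "Residual": 37.5}
shares = percent_of_total(ss_1997_1998)
# shares == {"Genotype": 7.7, "Environment": 87.8, "Residual": 4.5}
```

These values match the 1997/1998 column of Table 4.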

GPC tended to increase with WD, as shown by the mean value of GPC for each environment (Table 5). As expected, the application of nitrogen fertiliser during the booting-to-heading stage seemed to increase GPC. However, in both cases, the difference was not found to be significant, because the standard deviation was too large (Table 5).
Table 5 Effect of environmental factors on GPC for all sites and all genotypes (SD standard deviation)

| | WD: no WD | WD: 0 < WD < 40 mm | WD: WD > 40 mm | Nitrogen application: no | Nitrogen application: yes |
|---|---|---|---|---|---|
| GPC mean | 12.9 | 13.8 | 14.8 | 13.2 | 14.1 |
| GPC SD | 1.3 | 1.2 | 1.8 | 1.4 | 1.4 |

Clustering analysis on environment main effect

Covariate classes were added for clustering analysis on each of the seven biennial networks (Table 6). According to the network data set analysed, WD (three classes) accounted for 0–47.3% of the environment main effect (ESS), with a mean of 26.9%. The efficiency of the clustering to account for ESS was closely related to the occurrence of water deficit. In terms of percentage of ESS explained, the efficiency of clustering procedure increased when the frequency of sites displaying a WD was higher (Fig. 2). For example, if 80% of the locations displayed no water deficit, as in 1993/1994 or 1998/1999, WD accounted for 0% of the ESS. In contrast, the percentage of ESS accounted for was maximal (about 45%) when WD occurred at 40% of the sites (1996/1997).
Table 6 Clustering analysis for GPC of genotypes evaluated in biennial networks. Results are percentages of the sum of squares of the environment main effect explained. Degrees of freedom are given in parentheses. NS number of sites

| | 1992/1993 (NS=11) | 1993/1994 (NS=14) | 1994/1995 (NS=19) | 1995/1996 (NS=21) | 1996/1997 (NS=15) | 1997/1998 (NS=16) | 1998/1999 (NS=15) | Mean | SD |
|---|---|---|---|---|---|---|---|---|---|
| SOLUB | 0 (0) | 0 (0) | 23.6 (1) | 4.3 (1) | 0 (1) | 2.5 (1) | 0 (1) | 4.3 | 8.0 |
| WD | 18.6 (2) | 0 (1) | 47.3 (2) | 32.4 (2) | 45.5 (2) | 44.2 (2) | 0 (2) | 26.9 | 19.3 |
| SOLWD | 18.6 (2) | 0 (1) | 51.3 (3) | 34 (3) | 50.8 (4) | 61.3 (5) | 0 (4) | 30.9 | 23.3 |
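The Mean and SD columns of Table 6 can be recomputed from the seven network values. The sketch below does so with the standard library. The reported SD values appear to match the population standard deviation (divisor n), which is an inference from the numbers rather than something stated in the text.

```python
from statistics import mean, pstdev

# Percentage of the environment sum of squares explained, per biennial
# network (1992/1993 to 1998/1999), copied from Table 6.
ess_explained = {
    "SOLUB": [0, 0, 23.6, 4.3, 0, 2.5, 0],
    "WD": [18.6, 0, 47.3, 32.4, 45.5, 44.2, 0],
    "SOLWD": [18.6, 0, 51.3, 34, 50.8, 61.3, 0],
}
summary = {name: (round(mean(values), 1), round(pstdev(values), 1))
           for name, values in ess_explained.items()}
# summary == {"SOLUB": (4.3, 8.0), "WD": (26.9, 19.3), "SOLWD": (30.9, 23.3)}
```

The large standard deviations relative to the means reflect the two networks in which WD and SOLWD explained none of the environment main effect.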

Fig. 2 Occurrence of WD and explanation of environmental main effect

If nitrogen was applied after booting, as occurred in five biennial networks, it accounted for 0–23.6% of the ESS, with a mean of 4.3%. The percentage of ESS accounted for is influenced by the percentage of trials receiving fertiliser. The efficiency of clustering was maximal when only 15% of the trials received fertiliser, and decreased steadily as the percentage of trials receiving fertiliser increased (Fig. 3). The resulting combined variable, SOLWD (six classes), accounted for 0–61.3% of the ESS, with a mean of 30.9%.
Fig. 3 Occurrence of nitrogen applications (SOLUB) and percentage of environmental main effect explained by SOLUB

Clustering analysis on G×E interaction

WD accounted for up to 26.6% (1997/1998) of residual sources of variation (including G×E interaction) and SOLWD up to 43% (Table 7). The covariates WD and SOLWD accounted for part of the residual variation in three and two biennial networks, respectively. In these cases, such information could be useful for the interpretation of genotypic behaviour in a biplot based on a multiplicative model. We illustrated this by performing multiplicative analysis for the 1997/1998 network (Fig. 4a, b).
Table 7 Clustering analysis for GPC of genotypes evaluated in biennial networks. Results are percentages of the sum of squares of the residual and interaction effect explained; percentages of degrees of freedom absorbed are given in parentheses

| | 1992/1993 | 1993/1994 | 1994/1995 | 1995/1996 | 1996/1997 | 1997/1998 | 1998/1999 |
|---|---|---|---|---|---|---|---|
| WD | 0 | 0 | 26.4 (11%) | 17.4 (10%) | 0 | 26.6 (13.3%) | 0 |
| SOLUB | 0 | 0 | 0 | 0 | 0 | 0 | 14.4 (7.1%) |
| SOLWD | 0 | 0 | 26.4 (16%) | 0 | 43 (28.6%) | 0 | 0 |

Fig. 4 Multiplicative analysis of grain protein content for the biennial network 1997/1998. Multiplicative and main effects are displayed a for environments and b for genotypes. The relative precocity of genotypes is indexed from (1) for the earliest to (12) for the latest

For the 1997/1998 network, the first (MT1) and second (MT2) multiplicative terms accounted for 39.8% and 18.9% of the residual variation, respectively. Environment and genotypic main effects were plotted against the corresponding environmental and genotypic scores for the second multiplicative term, MT2 (Fig. 4a, b), and additional information concerning WD clustering was added. Overall, MT2 differentiated sites under stress from those with no water stress (Fig. 4a), whereas the position of genotypes depended primarily on earliness (Fig. 4b). Comparison of the two biplots suggested that early lines produced higher GPC than later lines under stress.

Discussion

Crop simulation models use agronomic and meteorological data collected over past cropping seasons to provide source elements for the reanalysis of past networks, without additional measurements. To illustrate this point, a large network (8 years, 111 locations, 48 genotypes) was analysed according to environmental conditions during the booting-to-heading stage, which is a key phase in the development of grain number and nitrogen accumulation. The analysis performed on the two genotypes tested in each location indicated that defining an environment in terms of limiting factors at this phenological stage was almost as relevant for explaining GPC as the simple but robust analysis of year and region. Based on the analysis of all genotypes, we identified two environmental parameters as most discriminatory for grain protein content: WD and nitrogen availability during the booting-to-heading stage.

However, stress during the booting stage may be correlated with limiting factors occurring during other developmental stages. Thus, stress at booting may merely be indicative of other stresses occurring later in the life cycle of the plant. Nevertheless, GPC was explained by SOLWD in five of the seven biennial networks tested, demonstrating the importance of this variable. The fluctuations in the percentage of variation accounted for by WD (and hence SOLWD) may be explained by its strong correlation with the number of trials in which WD occurred. This suggests that water deficit, when it occurs, is the most discriminating environmental factor accounting for variation in GPC. WD probably exerts its effects by reducing the number of grains per square metre. These results confirmed the importance of water and nitrogen supply to the development of GPC at the booting stage and are consistent with previous observations (Ottman et al. 2000; Strong 1982; Wuest and Cassman 1992).

These environmental covariates probably also account for G×E interaction. However, we underestimated the relevance of these variables for interpreting G×E, as we had to pool interaction terms and pure residual variation when estimating the sum of squares. Our diagnosis also provided us with a tool for the interpretation of G×E interaction. The multiplicative model gave an evaluation of the specific adaptation of genotypes to environments.

Conclusion

The originality of our work lies in the calculation of covariables corresponding to limiting factors in the plant cycle. These covariables may be introduced into G×E interaction models such as clustering models. By calculating water and nitrogen availability we were able to explain variations of GPC. Thus, our approach has the potential to provide us with an understanding of the environmental bases of phenotypic plasticity and local adaptation. This work focused on a particular stage of development—the booting-to-heading stage. Limiting factors may also influence the development of GPC at other developmental stages. We need to extend our analysis to the entire life cycle of the plant to determine which developmental stages are discriminant for GPC. This could lead to the identification of several developmental stages as being critical for the development of GPC. Further investigations are required to determine the genetic bases of phenotypic plasticity and local adaptation. This should lead to the identification of quantitative trait loci (QTLs) involved in the response to limiting factors. This approach is now possible, by means of factorial regression analysis to account for variation in the additive effects of QTLs on material evaluated in several environments. Our method can also be used to screen evaluation networks for redundant locations. By characterising each site in terms of the frequency of limiting factors occurring during various developmental stages, it should be possible to identify sites at which similar stresses occur.

Acknowledgements

We would like to thank Philippe Gate for his active participation and the Arvalis Institut du Végétal for providing the climatic data and software necessary for the analysis. We also thank Christelle Crespin for providing the experimental data from the CTPS network.

Copyright information

© Springer-Verlag 2004