Genotype by environment interaction (G × E) often represents a significant proportion of the overall genetic variation (Yan and Kang 2003). The benefits of genetic improvement can be fully realised only if improved genotypes are matched to the environments in which they can express their potential. The importance of G × E in forest trees has been emphasised by White et al. (2007). Furthermore, future adaptation of forests depends on the response of genotypes to fast-changing climate conditions (e.g. Schreiber et al. 2011). The main approach to the issue of G × E has been to characterise test environments and identify the main environmental factors driving G × E. Breeding and deployment zones are then defined so that G × E is minimised within them. However, G × E is often related not simply to a single environmental variable, but rather to several climatic or other site characteristics (e.g. Matheson and Cotterill 1990).

Many tree improvement programmes have not resolved the issue of G × E because of the lack of sufficiently genetically connected trials and the difficulty of interpreting G × E. The radiata pine (Pinus radiata D. Don) tree improvement programme in Australia is one of the most advanced in the world. Significant improvement has been made in growth, form and wood quality traits (Wu et al. 2008). However, current radiata pine breeding and deployment is based largely on the National Plantation Inventory regions (Gavran and Parsons 2011), rather than on environmental drivers of G × E. Therefore, current breeding and deployment zones cannot deliver optimal genetic gains across the whole estate.

Important G × E among radiata pine families has been observed in several experiments in Australia. For example, the “Australia-wide Diallel” (AWD) experiment planted on 10 sites revealed significant rank changes between two high-elevation sites in New South Wales (NSW) and other sites in southern Australia (Wu and Matheson 2005; Gapare et al. 2010). Raymond (2011) reported significant G × E at the family level for diameter growth, with elevational differences between sites being a key driver of G × E in NSW. A recent paper by Gapare et al. (2012) confirmed those results and divided NSW sites into high-elevation/high-rainfall and low-elevation/low-rainfall groups. Another study involving eight sites revealed significant G × E for growth and branch size between sites in Tasmania and mainland southern Australia (Baltunis et al. 2010).

This study was based on the most comprehensive, genetically well-connected and well-distributed set of genetic trials in Australia. The study used information from 20 radiata pine progeny trials established in 1996/1997 by the Southern Tree Breeding Association (STBA). Based on data from those trials with common genetic material, estimates of site-site genetic correlations were obtained. The patterns among these correlations were modelled against various environmental characteristics. The study is part of a larger effort to understand G × E and to obtain site classifications that will capture a large proportion of G × E. The general aim is to facilitate optimal deployment of genetic stock to particular environments and development of software tools for deployment by the STBA.

The specific objectives of the study were:

  • To estimate genetic variances (i.e. additive and dominance) and heritability for growth in the set of trials

  • To estimate genetic correlations between trials

  • To examine patterns of G × E between trial locations in southern Australia

  • To identify potential geo-climatic drivers of G × E

Materials and methods

Field trials and measurements

Ten control-pollinated progeny trials were planted in each of 1996 and 1997 by the Southern Tree Breeding Association, in the “BR” trial series. The trials are listed together with their locations and dates of planting in Table 1, and the map of locations is given in Fig. 1. All trials had replication, incomplete block and row-plot design features, and were represented on a row-column grid for spatial analyses (see below). The assessments of diameter at breast height (DBH) at the juvenile stage (i.e. 6–10 years) were used for the analyses.

Table 1 State, region, abbreviation of National Plantation Inventory region (NPI), location coordinates, altitude and planting year for BR1996/1997 trial series
Fig. 1 Location of BR series trials within the Australian National Plantation Inventory (NPI) zones in southern Australia. The NPI zones represented are as follows: 1-Western Australia (WA), 4-Green Triangle (GT), 11-Murray Valley (MV), 12-Central Victoria (CV), 13-Central Gippsland (GL), and 15-Tasmania (TAS)

One of the most important issues in undertaking a G × E analysis is the assessment of the degree of genetic connectivity between trials. The concurrence matrix, based on the number of parents in common between trials, is given in Online Resource 1. The number of parent trees represented in each trial ranged from 41 to 177 and the number of parent trees in common between trials from 31 to 165. There are no clear-cut guidelines that determine an acceptable level of connectivity; however, it is obvious that disconnected or poorly connected data sets will not permit reliable estimation of the covariance.
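As an illustration of how such a concurrence matrix can be tabulated, the following minimal Python sketch counts the parents shared by each pair of trials. The function name, trial labels and parent labels are hypothetical and are not taken from the BR pedigree; the diagonal simply holds the number of parents tested in each trial.

```python
from itertools import combinations_with_replacement


def concurrence_matrix(parents_by_trial):
    """Count the parents shared by every pair of trials.

    parents_by_trial maps a trial ID to the set of parent IDs tested there.
    The diagonal entries are the number of parents in each trial.
    """
    trials = sorted(parents_by_trial)
    counts = {a: {b: 0 for b in trials} for a in trials}
    for a, b in combinations_with_replacement(trials, 2):
        shared = len(parents_by_trial[a] & parents_by_trial[b])
        counts[a][b] = shared
        counts[b][a] = shared
    return counts


# Hypothetical toy data (not the actual BR trial pedigree).
toy_parents = {
    "T1": {"P1", "P2", "P3", "P4"},
    "T2": {"P2", "P3", "P4", "P5"},
    "T3": {"P4", "P5", "P6"},
}
conc = concurrence_matrix(toy_parents)
for trial in sorted(conc):
    print(trial, [conc[trial][other] for other in sorted(conc)])
```

Pairs of trials with small off-diagonal counts are the ones for which between-trial covariances would be expected to be poorly estimated.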

Environmental variables

Basic climate, geological and soil information for the BR1996/97 trial locations is given in Table 2. Daily climate data for selected locations within Australia were extracted from the SILO Climate Database. The daily climate data have been constructed from observations at 4,600 locations across Australia for rainfall, maximum and minimum temperatures, evaporation and solar radiation, using spatial interpolation algorithms. A simple monthly aridity index (AIX) was calculated as the ratio of the monthly mean daily pan evaporation rate to the total monthly rainfall. The aridity index is related to water balance, and the monthly minimum AIX value for the most arid quarter was used to rank the sites in terms of aridity. The climate data sequence from planting to DBH assessment was obtained for each trial.
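The monthly aridity index described above can be sketched in a few lines of Python. This is only an illustration of the stated ratio: the function name and the daily values are hypothetical, and returning infinity for a month without rain is an assumption of the sketch, not something specified in the text.

```python
def monthly_aridity_index(daily_pan_evaporation_mm, daily_rainfall_mm):
    """AIX = mean daily pan evaporation / total rainfall, for one month."""
    mean_daily_evaporation = sum(daily_pan_evaporation_mm) / len(daily_pan_evaporation_mm)
    total_rainfall = sum(daily_rainfall_mm)
    if total_rainfall == 0:
        # Assumption of this sketch: a rainless month is treated as infinitely arid.
        return float("inf")
    return mean_daily_evaporation / total_rainfall


# Hypothetical 30-day month: 6 mm/day pan evaporation and 20 mm of rain in total.
evaporation = [6.0] * 30
rainfall = [0.0] * 28 + [12.0, 8.0]
aix = monthly_aridity_index(evaporation, rainfall)
print(round(aix, 3))
```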

Table 2 Some basic climate, geological, and soil information for locations of BR1996/97 trials: Mean Temperature of Growing Season (Sep-Apr) (TGS), Mean Temperature Driest Month (TDM), Precipitation in Growing Season (Sep-Apr) (PGS), Precipitation Driest Quarter (PDQ), Total Evaporation (TE), Aridity Index (AIX), Parent Rock Code (PRC), Weathering Index (WI), Top soil Bulk Density (BD), Top soil PH

A seamless national coverage of outcrop and surface geology was obtained from Geoscience Australia. Soil types were categorised using a technical classification system into 11 categories of parent rock code. This system was developed to group forest sites according to expected volume productivity (Turner et al. 2001). In addition, a high-resolution weathering intensity index for the Australian continent, based on airborne gamma-ray spectrometry and digital terrain analysis, was obtained (Wilford 2012).

The Australian Soil Resource Information System (ASRIS) database was used to obtain information on soils in a consistent format across southern Australia. Information was obtained on soil depth, water storage, permeability, fertility, carbon and erodibility, with most soil information recorded at five depths. Soil profile data for fully characterised sites were also available from forestry organisations and companies.

Statistical analyses

Spatial analyses

Spatial analyses were performed using a two-dimensional separable autoregressive (AR1) model fitted to column-row grid diameter data of each trial using ASReml® (Gilmour et al. 2009). The spatial method partitioned the residual variance into an independent component and a two-dimensional spatially auto-correlated component. An additional random term with one level for each experimental unit was used so that a second (so called “units”) error term was fitted (Dutkowski et al. 2002).
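The residual structure described above can be written as a spatially correlated part, formed as the Kronecker product of two AR1 correlation matrices, plus an independent "units" part on the diagonal. The following Python sketch builds that matrix for a very small grid; it is an illustration under assumed parameter values (all numbers and function names are hypothetical), not the ASReml implementation.

```python
def ar1_matrix(n, rho):
    """AR1 correlation matrix: corr(i, j) = rho ** |i - j|."""
    return [[rho ** abs(i - j) for j in range(n)] for i in range(n)]


def kronecker(a, b):
    """Kronecker product of two matrices given as lists of lists."""
    rows_b, cols_b = len(b), len(b[0])
    return [
        [a[i // rows_b][j // cols_b] * b[i % rows_b][j % cols_b]
         for j in range(len(a[0]) * cols_b)]
        for i in range(len(a) * rows_b)
    ]


def residual_covariance(n_cols, n_rows, rho_col, rho_row, var_spatial, var_units):
    """var_spatial * (AR1_col kron AR1_row) + var_units * I.

    Trees are ordered rows within columns, so index = col * n_rows + row.
    """
    corr = kronecker(ar1_matrix(n_cols, rho_col), ar1_matrix(n_rows, rho_row))
    n = n_cols * n_rows
    return [
        [var_spatial * corr[i][j] + (var_units if i == j else 0.0) for j in range(n)]
        for i in range(n)
    ]


# Hypothetical 2-column by 3-row grid with assumed parameter values.
R = residual_covariance(n_cols=2, n_rows=3, rho_col=0.5, rho_row=0.8,
                        var_spatial=2.0, var_units=1.0)
for row in R:
    print([round(value, 2) for value in row])
```

In this sketch, neighbouring trees in the same column share a covariance of `var_spatial * rho_row`, while the "units" variance contributes only to the diagonal.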

All terms were fitted in a single model, where the spatial, extraneous (e.g. assessor) and treatment effects are estimated simultaneously. Diagnostic tools, variogram and plots of spatial residuals as recommended by Gilmour et al. (2009) were used to detect extraneous effects. Spatial adjustment to raw data was done for surface sum experimental design (i.e. replicate and plots) and/or spatial variation. Using the adjusted data, each trial was first analysed as univariate (i.e. single-site) in order to estimate the genetic variance components, before proceeding with multi-site analyses.

Statistical model, variance components and genetic parameters

The following reduced (i.e. parental) linear mixed-effects model was used for multi-environment analyses:

$$ y=X\tau +Z{u}_p+Z{u}_f+e $$

where y is a vector of observations formed by stacking the data for each trial j, $y = (y_1^T, y_2^T, \ldots, y_j^T)^T$; $\tau$ is a vector of fixed effects (i.e. site mean); $u_p$ is a vector of random additive genetic effects (i.e. female parent); $u_f$ is a vector of random non-additive effects (i.e. full-sib family); and e is a vector of random residual terms. X and Z are known incidence matrices relating the observations in y to the effects in $\tau$, and in $u_p$ and $u_f$, respectively. The random effects in the model were assumed to follow a multivariate normal distribution with means and variances defined by:

$$ {u}_p\sim N\left(\mathbf{0},{\sigma}_p^2A\right),{u}_f\sim N\left(\mathbf{0},{\sigma}_f^2I\right)\mathrm{and}\;e\sim N\left(\mathbf{0},{\sigma}_e^2I\right) $$

where 0 is a null vector; A is the numerator relationship matrix, which describes the additive genetic relationships among individual genotypes; I is an identity matrix, with order equal to the number of full-sib families or the number of trees; $\sigma_p^2$ is the additive parental variance; $\sigma_f^2$ is the non-additive variance between full-sib families; and $\sigma_e^2$ is the residual variance.

For all models, variance and covariance estimates derived by Restricted Maximum Likelihood (REML) using the ASReml program (Gilmour et al. 2009) were constrained to fall within the theoretically possible range; variance component estimates were constrained to be greater than zero. To test the significance of a variance component, two separate analyses were run: a full model that included the term and a reduced model without it. The difference between the 2 × log-likelihoods of the two models (i.e. the likelihood ratio statistic) was tested against a chi-squared distribution with one degree of freedom to obtain p values.
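The likelihood ratio test described above can be sketched as follows. The sketch uses the one degree of freedom chi-squared reference distribution stated in the text, for which the upper tail probability has the closed form erfc(sqrt(x/2)); the function name and the log-likelihood values are hypothetical.

```python
import math


def likelihood_ratio_test(loglik_full, loglik_reduced):
    """Return the likelihood ratio statistic and its p value (chi-squared, 1 df)."""
    statistic = 2.0 * (loglik_full - loglik_reduced)
    # Guard against tiny negative values caused by convergence tolerance.
    statistic = max(statistic, 0.0)
    # For 1 df, P(X > x) = P(|Z| > sqrt(x)) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, p_value


# Hypothetical REML log-likelihoods for a full and a reduced model.
lr_statistic, lr_p_value = likelihood_ratio_test(-1000.00, -1001.92)
print(round(lr_statistic, 2), round(lr_p_value, 3))
```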

Individual heritability ($h^2$) and proportion of dominance variance ($d^2$) were calculated following Costa e Silva et al. (2004):

$$ {h}^2=\frac{\sigma_a^2}{\sigma_P^2}=\frac{4\times {\sigma}_p^2}{2\times {\sigma}_p^2+{\sigma}_f^2+{\sigma}_e^2} $$
$$ {d}^2=\frac{\sigma_d^2}{\sigma_P^2}=\frac{4\times {\sigma}_f^2}{2\times {\sigma}_p^2+{\sigma}_f^2+{\sigma}_e^2} $$

where $\sigma_a^2$ is the additive genetic variance component (4 × general combining ability variance), $\sigma_d^2$ is the non-additive variance component (4 × specific combining ability variance), $\sigma_P^2$ is the phenotypic variance, and the other parameters are as defined previously. Standard errors of genetic correlations were derived from a Taylor series approximation using the R pin function (White 2013). When regionalisation of breeding in southern Australia is considered, the numerator of heritability becomes $\sigma_a^2 + \sigma_{a\times r}^2$, as opposed to only $\sigma_a^2$ with no regionalisation, where $\sigma_{a\times r}^2$ is the genotype by region variance (Atlin et al. 2000).
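The two ratios above translate directly into code. The following Python sketch evaluates them for one set of variance components; the function name and the numerical values are hypothetical and were chosen only so that the arithmetic is easy to follow.

```python
def heritability_and_dominance(var_parent, var_family, var_error):
    """Return (h2, d2) from parental, full-sib family and residual variances.

    h2 = 4 * var_parent / (2 * var_parent + var_family + var_error)
    d2 = 4 * var_family / (2 * var_parent + var_family + var_error)
    """
    phenotypic = 2.0 * var_parent + var_family + var_error
    h2 = 4.0 * var_parent / phenotypic
    d2 = 4.0 * var_family / phenotypic
    return h2, d2


# Hypothetical variance components (not estimates from the BR trials).
h2, d2 = heritability_and_dominance(var_parent=1.0, var_family=1.0, var_error=13.0)
print(h2, d2)
```

With these assumed inputs the phenotypic variance is 16, so both ratios equal 4/16.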

The site-site genetic correlations were confounded with the age-age correlations, but the assumption was that the between-site genetic correlations were not significantly influenced by the age of assessments, because the age-age correlations for a narrow range of ages are usually very high (e.g. Li and Wu 2005).

Extended factor analyses

The availability of closely related trees planted across a wide range of environments provided genetic links and allowed estimation of across-site variance and covariance components. Flexible (i.e. unstructured) covariance can account for both scale and rank interactions, but with many environments, the estimation of an unstructured covariance matrix is not feasible (Cullis et al. 2014). Mixed model analysis with an extended factor analytic (XFA) variance structure for the G × E effects and separate variance for the errors for each trial was used here (Gilmour et al. 2009). The XFAk form is a sparse formulation that requires an extra k levels to be inserted into the mixed model equations for the k factors.

The reduced (i.e. parental) mixed model with an extended factor analytic structure was used to model efficiently the variance structure of the additive (and non-additive) G × E effects (e.g. Beeck et al. 2010):

$$ var\left({\boldsymbol{u}}_p\right)=\boldsymbol{A}\otimes {\boldsymbol{G}}_{tp} $$
$$ var\left({\boldsymbol{u}}_f\right)=\boldsymbol{I}\otimes {\boldsymbol{G}}_{tf} $$
$$ {\boldsymbol{G}}_{ts}={\boldsymbol{\varLambda}}_{ts}{\boldsymbol{\varLambda}}_{ts}^{\boldsymbol{T}}+{\boldsymbol{\psi}}_{ts},s=p,f $$

where A is the relationship matrix, with 1 + inbreeding coefficient on the diagonal and the coefficient of parentage as off-diagonal elements; $\boldsymbol{G}_{tp}$ is the additive genetic variance-covariance matrix for parental effects across the t trials; $\boldsymbol{G}_{tf}$ is the non-additive genetic variance-covariance matrix for full-sib family effects across the t trials; I is an identity matrix; $\boldsymbol{\varLambda}_{ts}$ is a $t \times k_s$ matrix of trial loadings (t being the number of trials and $k_s$ the number of factors included in the model for effect s); and $\boldsymbol{\psi}_{ts} = \mathrm{diag}(\psi_j)$, where $\psi_j$ is the specific variance for the jth trial.
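To make the factor analytic structure concrete, the sketch below assembles a between-trial covariance matrix as loadings times transposed loadings plus a diagonal matrix of specific variances, converts it to a correlation matrix, and reports the percentage of each trial's genetic variance accounted for by the factors. The loadings and specific variances are hypothetical (three trials, two factors), and the function names are illustrative only.

```python
import math


def factor_analytic_covariance(loadings, specific_variances):
    """G = Lambda * Lambda^T + diag(psi) for t trials and k factors."""
    t = len(loadings)
    return [
        [sum(a * b for a, b in zip(loadings[i], loadings[j]))
         + (specific_variances[i] if i == j else 0.0)
         for j in range(t)]
        for i in range(t)
    ]


def covariance_to_correlation(cov):
    """Scale a covariance matrix to a correlation matrix."""
    sd = [math.sqrt(cov[i][i]) for i in range(len(cov))]
    return [[cov[i][j] / (sd[i] * sd[j]) for j in range(len(cov))]
            for i in range(len(cov))]


def percent_variance_explained(loadings, cov):
    """Share of each trial's genetic variance captured by the factors."""
    return [100.0 * sum(l * l for l in loadings[i]) / cov[i][i]
            for i in range(len(cov))]


# Hypothetical loadings (3 trials x 2 factors) and specific variances.
Lambda = [[1.0, 0.0], [0.8, 0.6], [0.0, 1.0]]
psi = [0.5, 0.0, 1.0]
G = factor_analytic_covariance(Lambda, psi)
corr = covariance_to_correlation(G)
explained = percent_variance_explained(Lambda, G)
print([round(value, 1) for value in explained])
```

A trial with a specific variance of zero, such as the second hypothetical trial here, has all of its genetic variance attributed to the factors.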

Heat map and hierarchical clustering

Cullis et al. (2010) recommended tools for exploring G × E interaction based on a factor analytic model. A heat map representation of the estimated genetic correlation matrix was used, with rows and columns ordered appropriately. A useful ordering is obtained by cluster analysis of trials. The cluster analysis used dissimilarity computed as $1 - \rho_{ij}$ (where $\rho_{ij}$ is the estimated genetic correlation between trials i and j) as a measure of distance between trials, and a hierarchical clustering algorithm (i.e. complete linkage method with Euclidean distance measure), as implemented in the hclust function in R (R Development Core Team 2011). The relationship between site clustering and environment was examined by correlating site loadings for XFA1 and XFA2 with climate and soil variables (Costa e Silva et al. 2006).
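The clustering step can be illustrated with a small, dependency-free Python sketch of complete linkage agglomeration. It uses the dissimilarity one minus the genetic correlation directly (it does not add a separate Euclidean distance step), stops merging once the smallest complete linkage distance exceeds a chosen cut height, and runs on a hypothetical five-trial correlation matrix. It is a sketch of the technique, not a reproduction of the R analysis.

```python
def complete_linkage_clusters(labels, corr, cut_height=0.5):
    """Group trials by complete linkage on d = 1 - rho, cut at cut_height.

    Because complete linkage merge heights never decrease, stopping at the
    first merge above cut_height is equivalent to cutting the dendrogram there.
    """
    clusters = [[i] for i in range(len(labels))]

    def linkage(a, b):
        # Complete linkage: the largest pairwise dissimilarity between groups.
        return max(1.0 - corr[i][j] for i in a for j in b)

    while len(clusters) > 1:
        height, p, q = min(
            (linkage(clusters[p], clusters[q]), p, q)
            for p in range(len(clusters))
            for q in range(p + 1, len(clusters))
        )
        if height > cut_height:
            break
        clusters[p] = clusters[p] + clusters[q]
        del clusters[q]
    return [sorted(labels[i] for i in cluster) for cluster in clusters]


# Hypothetical genetic correlations among five trials A to E.
trial_labels = ["A", "B", "C", "D", "E"]
toy_corr = [
    [1.0, 0.9, 0.2, 0.1, 0.0],
    [0.9, 1.0, 0.3, 0.2, 0.0],
    [0.2, 0.3, 1.0, 0.7, 0.0],
    [0.1, 0.2, 0.7, 1.0, 0.0],
    [0.0, 0.0, 0.0, 0.0, 1.0],
]
groups = complete_linkage_clusters(trial_labels, toy_corr, cut_height=0.5)
print(groups)
```

With complete linkage, every pair of trials within a resulting cluster has a dissimilarity no greater than the cut height, which is why such a cut keeps all within-cluster correlations high.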

Principal component analysis (PCA) ordination was used to plot the variables of the environmental data matrix in two-dimensional representations. The BiplotGUI package provides a graphical user interface for the construction and manipulation of ordination plots in R (La Grange et al. 2009). The trials are represented as points and climate variables as axes, with coordinates on principal component scales.


Results

The heritability ($h^2$) and proportion of dominance variance ($d^2$) for DBH for the 20 trials are given in Table 3. Trials BR9610 (a Spring Needle Cast trial in north-west Tasmania), BR9709 (Central Victoria) and BR9712 (in north-east Tasmania) had no additive variance, and several other trials had low, non-significant dominance variance, presumably due to leaf diseases and/or measurement errors. In trials where the sample size was sufficient to obtain precise estimates (e.g. BR9601, BR9604 and BR9608), $d^2$ was similar in magnitude to $h^2$.

Table 3 Single-site analyses of variance components for DBH, including trial ID, number of parents and progeny trees tested, narrow sense heritability ($h^2$) and proportion of dominance variance ($d^2$), and standard errors (se) of the estimates

The relative magnitudes of the variance components for genotypes, regions and sites within regions are given in Table 4. Overall, the SCA variance was higher in magnitude than the GCA variance, but the SCA by environment interaction variance was similar in magnitude to the GCA by environment variance. The Site(Region) by GCA interaction was roughly 75 % of the additive variance (i.e. GCA) and the Site(Region) by SCA interaction was 70 % of the non-additive variance (i.e. SCA). The region by GCA variance suggests that regions are capturing a significant amount (i.e. ~55 %) of the G × E interaction relative to the site within region by GCA interaction. Heritability increased from 0.09 ± 0.03 across NPI regions to 0.13 ± 0.03 within NPI regions, and further to 0.16 ± 0.03 when regions were defined by hierarchical site clustering of the additive genetic correlation matrix. Therefore, the genetic gain from regionalisation of breeding can be expected to be significant.

Table 4 Combined-site variance component analysis: REML estimates of variance and estimates as percentage of total genetic variance

The trial-trial additive and non-additive correlation matrix for DBH obtained using extended factor analytic analysis with three factors (XFA3) is given in Table 5. The matrices of additive and non-additive correlations are represented by heat maps with dendrograms added to the left side and to the top (Fig. 2). Rows and columns (i.e. trial IDs) were reordered based on row or column means within the restrictions imposed by the dendrogram. A cut-off at a height of 0.5 on the dendrogram divided the trials into clusters so as to maintain high estimated genetic correlations for all pairs of trials within each cluster. At the additive level, this resulted in the formation of 4 clusters with 2, 2, 5 and 10 trials, and 1 singleton (i.e. BR9710). At the non-additive level, this resulted in the formation of 3 clusters with 2, 11, and 6 trials, and 1 singleton (i.e. BR9701). The heat map shows that the pair-wise genetic correlations between trials within clusters were greater than those between clusters.

Table 5 Trial-trial additive correlation matrix for DBH, obtained using XFA3 analysis
Fig. 2 Heat maps and dendrograms of site-site genetic correlations for (a) additive and (b) non-additive effects for DBH

Although the XFA3 model adequately modelled the G × E (i.e. explained 90.5 and 87.0 % of the genetic variance for additive and non-additive effects, respectively), interpretation of the effects may be problematic for sites that exhibited a low or zero proportion of genetic variance (i.e. $h^2 = 0$ or $d^2 = 0$). For example, the sites BR9705, BR9707, BR9709, BR9710 and BR9713, with relatively small non-additive variance, had 100 % of their variance accounted for by the XFA3 model and correlations fixed at $r_g = 1$. The first latent variable (i.e. XFA1 loading) separated the sites with low SCA variance, whereas the second latent variable contrasted TAS (BR9712, BR9615, BR9614) with sites on the Mainland (BR9703, BR9613, BR9604).

Information including the identification and location of the trials in each cluster is given in Table 6. At the additive level, cluster 1 consisted of three TAS trials and one trial representing the MV region (i.e. BR9617). The trial BR9710 from CV(OTW) was a singleton. Cluster 4 consisted of two trials from WA (BR9701 and BR9702), two TAS trials and an inconsistent (i.e. water-logged) trial in GT (BR9604). Finally, cluster 5 consisted mainly of trials in GT, together with two trials in CV and two trials in GL (Table 6).

Table 6 Identification and location of the trials in clusters in the heat maps (Fig. 2a, b) from left to right and from top to bottom

At the non-additive level, cluster 1 consisted again of trial BR9610 in TAS and trial BR9617 in MV. The large cluster 3 included trials with low SCA variance (i.e. BR9705, BR9707, BR9709, BR9710 and BR9713), and some other trials with high correlations with that group. Trial BR9701 from WA was the singleton. Cluster 4 consisted of trials from five different regions. Overall, the pattern of G × E for the non-additive effects was more complex than for the additive effects (Table 6).

To further examine G × E patterns, average additive pair-wise correlations within and between current breeding zones (i.e. NPI regions) are presented in Table 7. Two trials with low heritability (i.e. BR9709, BR9610) and five trials with a low dominance proportion (BR9705, BR9707, BR9709, BR9710 and BR9713) were excluded. It was evident that the correlations between the GT and CV regions and between the CV and GL regions were not lower than the within-region correlations. TAS had low correlations both with trials in other regions and within TAS. The trial BR9617 in the MV region had very low or negative correlations with all the other regions. The trial BR9710 in the OTW region had a low correlation with GT trials and a negative correlation with MV trials.
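The regional averaging used for this kind of summary can be sketched as follows: every pair of trials is assigned to a within-region or between-region cell, and the pair-wise correlations in each cell are averaged. The trial names, region assignments and correlations below are hypothetical, and the function name is illustrative only.

```python
from collections import defaultdict
from itertools import combinations


def mean_regional_correlations(region_of, pair_corr):
    """Average pair-wise trial correlations within and between regions.

    region_of maps trial -> region; pair_corr maps frozenset({a, b}) -> r.
    Keys of the result are alphabetically sorted (region, region) tuples.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for a, b in combinations(sorted(region_of), 2):
        cell = tuple(sorted((region_of[a], region_of[b])))
        sums[cell] += pair_corr[frozenset((a, b))]
        counts[cell] += 1
    return {cell: sums[cell] / counts[cell] for cell in sums}


# Hypothetical trials and correlations (not values from Table 7).
regions = {"T1": "GT", "T2": "GT", "T3": "TAS", "T4": "TAS"}
correlations = {
    frozenset(("T1", "T2")): 0.8,
    frozenset(("T3", "T4")): 0.2,
    frozenset(("T1", "T3")): 0.3,
    frozenset(("T1", "T4")): 0.5,
    frozenset(("T2", "T3")): 0.4,
    frozenset(("T2", "T4")): 0.4,
}
regional_means = mean_regional_correlations(regions, correlations)
for cell, value in sorted(regional_means.items()):
    print(cell, round(value, 2))
```

A region represented by a single trial produces no within-region cell in this sketch, because there is no pair of trials to average.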

Table 7 Average genetic correlation coefficients for diameter at the regional level: Green Triangle (GT), Central Victoria (CV), Gippsland (GL), Tasmania (TAS) and Western Australia (WA), and single trials in the Murray Valley (MV) and Otways (OTW) regions

Climate variables for the 20 sites of the BR 1996/97 progeny trial series were summarised using PCA-based ordination (Fig. 3). The first two principal components of the biplot explained 78 % of the variation. Trials from TAS, CV and GT regions were separated on the axes related to precipitation growing season (PGS), annual precipitation (AP), and relative humidity at high temperature (RHHT). The two trials in WA were separated from the other trials on the axes related to aridity, i.e. high evaporation (TE), solar radiation (MR), and high evaporation to precipitation ratio (AIX).

Fig. 3 PCA ordination based on climate variables and grouping of 20 trial site locations as NPI regions: Western Australia (WA), Green Triangle (GT), Murray Valley (MV), Central Victoria (CV), Otways (OTW), Central Gippsland (GL), and Tasmania (TAS). Axes represent the most distinguishing climate variables: annual precipitation (AP), precipitation growing season (PGS), relative humidity at highest temperature (RHHT), relative humidity low temperature (RHLT), mean solar radiation (MTGS), and aridity index (AIX) (i.e. ratio of evaporation to precipitation)

The correlations between climatic attributes and the latent variates indicated the contributions of the individual climatic variables to differences between trials. At the additive level, XFA1 and XFA2, respectively, were moderately to highly correlated with total evaporation (TE) (r = −0.55, r = −0.55), annual precipitation (AP) (r = 0.62, r = 0.58), PGS (r = 0.48, r = 0.74), mean monthly temperature (MT) (r = −0.31, r = 0.59), mean temperature growing season (MTGS) (r = −0.19, r = −0.59), mean monthly maximum temperature (MAXT) (r = −0.19, r = −0.60), mean monthly minimum temperature (MINT) (r = 0.48, r = −0.50) and vapour pressure (VPP) (r = −0.37, r = −0.63). On the other hand, the correlations with soil characteristics were significant only for soil depth (r = 0.63, r = −0.08), bulk density of layer 3 (r = −0.56, r = 0.05) and available water content in the top layer (r = 0.18, r = 0.50).

At the non-additive level, the factor analytic site loadings on correlation scale for XFA1 and XFA2, respectively, were moderately to highly correlated to PGS (r = −0.52, r = −0.41), AP (r = −0.57, r = −0.32), PDQ (r = −0.41, r = −0.45), aridity index (AIX) (r = −0.07, r = 0.52), MINT (r = 0.42, r = −0.39) and VPP (r = 0.41, r = 0.49). The correlations with soil characteristics were significant only for soil pH (r = 0.40, r = −0.08).
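The correlations reported above are of the ordinary product-moment kind, computed between a column of site loadings and a column of site-level environmental values. A minimal Python sketch is given below; the loadings and rainfall figures are hypothetical and serve only to show the calculation.

```python
import math


def pearson_r(x, y):
    """Product-moment correlation between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical XFA1 loadings and annual precipitation (mm) for four sites.
xfa1_loadings = [0.1, 0.2, 0.3, 0.4]
annual_precipitation = [600.0, 800.0, 700.0, 900.0]
r_loading_rain = pearson_r(xfa1_loadings, annual_precipitation)
print(round(r_loading_rain, 2))
```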


Discussion

For the 20 trials from the STBA BR1996/97 series, heritability ($h^2$) of DBH was within the range observed for radiata pine in Australia (Wu et al. 2008). The proportion of dominance variance ($d^2$) was similar in magnitude to $h^2$. In some previous studies of radiata pine, the magnitude of $d^2$ was less than that of $h^2$ (e.g. Wu and Matheson 2005; Gapare et al. 2010). However, relative magnitudes of $d^2$ and $h^2$ similar to those in the current study have been observed in other pines (e.g. in 26-year-old Scots pine; Waldmann et al. 2008). The result suggests that the non-additive component of genetic variance can be as important as the additive component (Wu and Matheson 2004).

The relative magnitudes of the variance components also suggest that the current deployment regions are capturing a significant amount of the total G × E variance. Nevertheless, crossover interaction of breeding values can be expected between trials within a certain region and between different regions, even for regions in different states. For example, at both the additive and non-additive levels, the correlations between trials within TAS were lower than those between trials in TAS and trials in other regions (except one trial in the MV region). Nevertheless, our analyses of DBH at the regional level confirmed the results of Baltunis et al. (2010) that there is significant G × E between Tasmania and mainland Australia.

This study also confirmed the findings of Wu and Matheson (2005) with regard to G × E between two trials in NSW and those in other parts of southern Australia. A more recent study by Gapare et al. (2012) grouped trials according to rainfall and altitude. High-altitude sites also had higher rainfall and lower temperature. Those trials formed two groups based on additive and provenance site-site correlations. This grouping was also consistent with the results of Raymond (2011), where higher altitude, higher rainfall sites were distinctly different from warmer, drier sites within NSW.

At the additive level, the trials within TAS, one trial in the MV region in NSW and one trial in the OTW region in Victoria had either low or negative correlations with other trials. Nevertheless, G × E was present even between two sites (i.e. BR9604 and BR9608) within the otherwise relatively uniform GT region, and such interaction may be caused by micro-environmental factors (in this case, water-logging). Generally, as site-site correlations within regions increase to the same level as those between regions, the usefulness of regionalisation decreases (Pederick 1990). Genetic response to selection can be calculated as $G = i\sigma h^2$, where i is selection intensity, $\sigma$ is phenotypic standard deviation, and $h^2$ is heritability. As heritability increased significantly for within-region selection, regionalisation seems to be justified.
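The effect of the change in heritability on expected response can be illustrated with the formula above. The sketch below plugs in the three heritability estimates reported in the Results (0.09, 0.13 and 0.16) and holds selection intensity and phenotypic standard deviation constant; both of those values are hypothetical, and holding them equal across scenarios is an assumption of the sketch rather than a result of the study.

```python
def selection_response(intensity, phenotypic_sd, heritability):
    """Expected response to selection, G = i * sigma * h2."""
    return intensity * phenotypic_sd * heritability


# Hypothetical values: the intensity and the phenotypic SD (mm) are assumed.
INTENSITY = 1.755
PHENOTYPIC_SD = 20.0

# Heritability estimates as reported in the Results.
reported_h2 = {
    "across NPI regions": 0.09,
    "within NPI regions": 0.13,
    "within site clusters": 0.16,
}
gains = {
    scenario: selection_response(INTENSITY, PHENOTYPIC_SD, h2)
    for scenario, h2 in reported_h2.items()
}
baseline = gains["across NPI regions"]
relative_gain = {scenario: gain / baseline for scenario, gain in gains.items()}
for scenario in reported_h2:
    print(scenario, round(gains[scenario], 2), round(relative_gain[scenario], 2))
```

Because intensity and standard deviation cancel in the ratio, the relative gains in this sketch depend only on the ratio of the heritabilities.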

Generally, the results presented here confirm the results previously found for additive gene effects. However, at the non-additive level, the pattern of site-site genetic correlations was less clear than at the additive level. Nevertheless, the overall pattern of regional (i.e. average) non-additive genetic correlations was similar to that of the additive correlations. Although non-additive genetic variation can be very important (e.g. Waldmann et al. 2008), there are no examples in the forestry literature comparing environmental causes of both additive and non-additive G × E.

In some cases, certain climate, soil or other variables can be identified, which may then be utilised to delineate specific deployment regions. The BR1996/97 series of trials covered a wide geographic range in southern Mainland Australia and Tasmania, and a wide range of climates and soils. When geo-climatic variables for pairs of trials were used as predictor variables at the large transcontinental scale, XFA1 and XFA2 site loadings were moderately correlated with climate variables, in particular annual rainfall. The results linked G × E to aridity with regard to the two Western Australian trials. Further modelling of genetic correlations based on environmental variables is underway. The modelling involves all genetically connected pairs of STBA trials in addition to the BR trial series analysed here, and the results from those analyses will be used for re-defining breeding and deployment regions in southern Australia.


Conclusions

  1. Based on previous studies and this study, significant G × E for diameter growth can be expected between Tasmanian and Mainland sites, and within Tasmania itself.

  2. There were indications that the sites in the Murray Valley region in NSW and the Otway region in Victoria may exhibit G × E interaction with other regions; however, this is based only on individual trials.

  3. Heritability increased significantly for within-region selection, and regionalisation seems to be justified.

  4. The G × E interaction at the transcontinental scale can be correlated with climate variables, primarily rainfall and temperature. However, the drivers may also be related to smaller scale environmental variation (i.e. soil and terrain variation).

  5. The results presented here can be used as evidence in favour of reconsidering the current breeding and deployment zones. However, further work, using other pairs of genetically connected trials, will give more robust results on which to base G × E regionalisation.