Introduction

Radiata pine (Pinus radiata D. Don) is the predominant species in planted forests in New Zealand, making up 90 % of the country’s 1.72 million hectares of plantation forest (FOA and MPI 2012). Wood from P. radiata is a versatile, fast-growing, medium-density softwood, suitable for a wide range of end-uses (Mead 2013). It is excellent for pulping and for the manufacture of reconstituted board products and plywood, and it is also used as structural wood in house construction or as appearance wood for furniture making (Sutton 1999).

The New Zealand radiata pine breeding programme began in the 1950s and initially focused on growth, form and health traits, later extending to wood quality traits (Dungey et al. 2009; Jayawickrama and Carson 2000). Branch cluster frequency and stem straightness are two important form traits targeted for improvement in radiata pine breeding (Jayawickrama and Carson 2000). Branch cluster frequency refers to the frequency of branch clusters between 1 and 6 m above the ground on the main stem. It affects both branch size and mean internode length, particularly in the first 3–11 m of the tree bole above the ground. Trees with high mean internode length can produce higher yields of valuable clear wood timber grades for appearance (Carson 1988). Stem straightness is an important factor affecting log grade, log length and sawn-timber recovery (Blackburn et al. 2013). Environmental factors affecting the straightness of individual trees include microclimate and physical damage (from wind, snow, insects and animals) (Del Río et al. 2004). Stem straightness and branch cluster frequency both have medium to high heritabilities (Wu et al. 2008), showing considerable response to genetic improvement (Sutton 1999).

Advancements in molecular genetic techniques have made it possible to identify DNA sequence polymorphisms linked to phenotypic variation in both plants and animals. Marker-assisted selection and genomic selection are promising strategies for genetic improvement of economically important traits (Collard and Mackill 2008). To implement marker-assisted selection, genes or DNA polymorphisms associated with specific phenotypes must first be identified. A candidate gene approach was proposed by geneticists as a tool to identify genes associated with wood quality and growth for possible use in genetic improvement programmes (e.g. Dillon et al. 2010). This approach is suitable in conifers because of immense genomes (Murray 1998) and low linkage disequilibrium (LD) (Neale and Savolainen 2004). DNA markers can be applied to achieve earlier selection and more cost-effective selection or to increase selection intensity in forest breeding programmes (Wilcox et al. 2007). Simple sequence repeat (SSR) markers were found to be associated with branch cluster frequency and stem straightness in radiata pine (Kumar et al. 2004). Association studies for branch cluster frequency and stem straightness have been reported in other species as well (Arcade et al. 1996; Cumbie 2010; Lepoittevin et al. 2012; Xiong 2010). Random amplified polymorphic DNA (RAPD) markers linked to quantitative trait loci (QTLs) for height, stem straightness, branch angle and wood specific gravity have been identified in interspecific hybrid families of European and Japanese larches (Arcade et al. 1996). Associated single nucleotide polymorphisms (SNPs) have also been found in loblolly pine (Pinus taeda L.) for stem straightness (Cumbie 2010) and ramicorn branching (Xiong 2010).

The phenomenon whereby genetically identical individuals are superior in one environment but inferior in another is called genotype by environment (G×E) interaction. A G×E interaction can result in changes in ranking and/or changes in performance scale in a population. If G×E interactions result in a change in scale, they are of lesser importance unless consistent performance is needed across environments (Muir et al. 1992). If G×E interactions result in a change in ranking, more complex breeding and deployment strategies are required (Goddard 1998; Howarth et al. 1997).

Most G×E studies in radiata pine have focused on growth and wood quality traits, with high G×E interactions reported for growth (Matheson and Raymond 1984; McDonald and Apiolaza 2009; Pederick 1990; Shelbourne 1972; Wu and Matheson 2005) and minor G×E interactions reported for wood density (Apiolaza 2012). A low level of G×E interaction has also been reported in radiata pine for stem straightness and number of branch clusters in open-pollinated populations (Gapare et al. 2012; Johnston and Burdon 1990; Matheson and Raymond 1984; Pederick 1990), control-pollinated populations (Wu and Matheson 2005) and clonal populations (Baltunis and Brawner 2010).

Marker-assisted selection approaches to improve economically important traits can potentially be complicated by G×E interactions. SNPs that are associated with phenotypes in one environment may exhibit opposing directions of association, or not be associated with the trait at all, in other environments. Such interactions may reduce the prediction accuracy of genomic selection (Resende et al. 2012) and introduce uncertainty about the performance of transgenic variants in different environments (Zeller et al. 2010).

In the current study, radiata pine populations were studied for branch cluster frequency and stem straightness. The objectives of this study were to (1) determine the levels of genetic variation and G×E interactions in branch cluster frequency and stem straightness, (2) identify SNPs associated with these form traits and (3) investigate the impact of G×E interactions on any associations.

Materials and methods

Genetic material

Three radiata pine breeding populations were analysed: a control-pollinated progeny trial series planted in 1995 (POP1), a clonally propagated single-paired mating design trial series planted in 1997 (POP2) and a clonally propagated factorial mating design trial series planted in 1999 (POP3) (Table 1). These trials were planted at various locations in New Zealand with different soil types and levels of productivity.

Table 1 A summary of key features for the three breeding populations used in this study

POP1 trial series

Established in 1995, the POP1 trial series comprised 1849 progeny from 46 parents planted at three sites: Kaingaroa, Tarawera and Glenledi forests. The first two sites are located in the central North Island and the third is in the Otago region of the South Island of New Zealand. The trial design utilized 30 replications of a randomized complete block, single-tree plot and sets in reps. Twenty-six families were formed by crossing 26 female parents and 21 male parents, with some parents used as both male and female parents. Of these families, 19 were crosses between high wood density parents and 7 were crosses between parents with high wood density and parents with fast growth and good form. All female parents were crossed once and some male parents crossed twice.

POP2 trial series

Established in 1997, the POP2 trial series comprised 457 progeny from 63 parents, which were clonally replicated and planted at two sites: Tarawera and Woodhill forests. Woodhill Forest is located on coastal sand dunes in the north-west of the North Island. Two groups of families were contained within this clonal trial: the first group comprised 33 full-sib families from parents selected on the basis of growth and form; the second group comprised 19 families, from 30 parents selected for high wood density. A single-tree-plot, sets-in-reps design with six replicates, ten clones per family and six ramets per clone was established on each site.

POP3 trial series

Established in 1999, the POP3 trial series comprised 524 progeny from 24 parents planted at three sites, Kinleith, Tarawera and Woodhill forests, in an incomplete block design. Kinleith Forest is located in the central North Island of New Zealand. Trials comprised 5 replicates per site, 9 incomplete blocks per replicate, 6 families per incomplete block, 10 clones per family and 15 ramets per clone. The 24 parents used were selected for growth and form.

Trait assessment

Stem straightness was assessed using a nine-point subjective scale where 1 = crooked and 9 = very straight (Carson 1986). Branch cluster frequency was also assessed using a nine-point subjective scale with 1 = uninodal and 9 = extreme multinodal (Carson 1991). Branch cluster frequency and stem straightness were assessed at age 7 in both POP1 and POP3 and at age 8 in POP2. Scores of branch cluster frequency and stem straightness were normally distributed.

Quantitative genetic analysis

Estimated breeding values (EBVs) for branch cluster frequency and stem straightness were obtained in a separate analysis of each population using ASReml v.3 (Gilmour et al. 2009) with an individual-tree linear mixed model:

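$$ \boldsymbol{y}=\boldsymbol{X}\boldsymbol{b}+{\boldsymbol{Z}}_s\boldsymbol{s}+{\boldsymbol{Z}}_a\boldsymbol{a}+{\boldsymbol{Z}}_d\boldsymbol{d}+\boldsymbol{e} $$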

where y is the vector of observations for branch cluster frequency or stem straightness, b is the vector of fixed effects (i.e. population mean, site and replicate), s is a vector of random effects of sets within the replicate (POP1 and POP2) or a vector of random effects of incomplete block within the replicate (POP3), a is the vector of random additive genetic effects of individual genotypes, d is the vector of random non-additive genetic effects among clones in the clonal trials (POP2 and POP3) and e is the vector of random residual terms. The terms X, \( {Z}_s \), \( {Z}_a \) and \( {Z}_d \) are the incidence matrices associating phenotypes with the fixed and random effects b, s, a and d, respectively.

Variance components and EBVs were estimated using a model assuming heterogeneous additive genetic variances and residual variances across sites. For the two clonal trial series, heterogeneous non-additive genetic variances were also fitted. The residual effects were assumed to have a mean of 0 and variance-covariance matrix \( \boldsymbol{R}={\boldsymbol{R}}_0\otimes \boldsymbol{I} \), where \( {\boldsymbol{R}}_0=\left[\begin{array}{ccc}{\sigma}_{e_1}^2 & 0 & 0 \\ 0 & {\sigma}_{e_2}^2 & 0 \\ 0 & 0 & {\sigma}_{e_3}^2\end{array}\right] \); \( {\sigma}_{e_1}^2 \), \( {\sigma}_{e_2}^2 \) and \( {\sigma}_{e_3}^2 \) are the residual variances for site 1, site 2 and site 3, respectively, and I is the identity matrix. The random additive genetic effects were assumed to have a mean of 0 and variance-covariance matrix \( \boldsymbol{G}={\boldsymbol{G}}_0\otimes \boldsymbol{A} \), where \( {\boldsymbol{G}}_0=\left[\begin{array}{ccc}{\sigma}_{a_1}^2 & {\sigma}_{a_{12}} & {\sigma}_{a_{13}} \\ {\sigma}_{a_{12}} & {\sigma}_{a_2}^2 & {\sigma}_{a_{23}} \\ {\sigma}_{a_{13}} & {\sigma}_{a_{23}} & {\sigma}_{a_3}^2\end{array}\right] \); ⊗ denotes the Kronecker product; \( {\sigma}_{a_1}^2 \), \( {\sigma}_{a_2}^2 \) and \( {\sigma}_{a_3}^2 \) are the additive genetic variances for site 1, site 2 and site 3, respectively; the off-diagonal elements are the additive genetic covariances between pairs of sites and A is the numerator relationship matrix. In the clonal trials (POP2 and POP3), effect d had a variance-covariance matrix of \( {\boldsymbol{D}}_0\otimes \boldsymbol{I} \), where \( {\boldsymbol{D}}_0=\left[\begin{array}{ccc}{\sigma}_{d_1}^2 & {\sigma}_{d_{12}} & {\sigma}_{d_{13}} \\ {\sigma}_{d_{12}} & {\sigma}_{d_2}^2 & {\sigma}_{d_{23}} \\ {\sigma}_{d_{13}} & {\sigma}_{d_{23}} & {\sigma}_{d_3}^2\end{array}\right] \); \( {\sigma}_{d_1}^2 \), \( {\sigma}_{d_2}^2 \) and \( {\sigma}_{d_3}^2 \) are the non-additive genetic variances for site 1, site 2 and site 3, respectively, and the off-diagonal elements are the non-additive genetic covariances between pairs of sites.
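As an illustration only (this is not the ASReml implementation used in the study), the Kronecker structure of the additive genetic variance-covariance matrix can be assembled in a few lines of plain Python. The two-site matrix `G0`, the three-individual relationship matrix `A` and the helper name `kronecker` are hypothetical and chosen purely to show how site-level (co)variances are combined with relationships among individuals.

```python
def kronecker(left, right):
    """Return the Kronecker product of two matrices given as lists of rows."""
    return [
        [l_val * r_val for l_val in l_row for r_val in r_row]
        for l_row in left
        for r_row in right
    ]


# Hypothetical additive (co)variances for two sites (not estimates from the study).
G0 = [
    [0.30, 0.12],
    [0.12, 0.20],
]

# Hypothetical relationship matrix: individuals 1 and 2 are full sibs,
# individual 3 is unrelated to both.
A = [
    [1.0, 0.5, 0.0],
    [0.5, 1.0, 0.0],
    [0.0, 0.0, 1.0],
]

# G has one row and column per site-by-individual combination (2 x 3 = 6).
G = kronecker(G0, A)
```

Each 3 × 3 block of `G` is the relationship matrix scaled by the corresponding element of `G0`, so the off-diagonal blocks carry the between-site genetic covariance.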

Narrow-sense heritability for site i was estimated as \( {\widehat{h}}_i^2=\frac{{\widehat{\sigma}}_{a_i}^2}{{\widehat{\sigma}}_{a_i}^2+{\widehat{\sigma}}_{e_i}^2} \) for POP1 and as \( {\widehat{h}}_i^2=\frac{{\widehat{\sigma}}_{a_i}^2}{{\widehat{\sigma}}_{a_i}^2+{\widehat{\sigma}}_{d_i}^2+{\widehat{\sigma}}_{e_i}^2} \) for POP2 and POP3, where \( {\widehat{\sigma}}_{a_i}^2 \), \( {\widehat{\sigma}}_{d_i}^2 \) and \( {\widehat{\sigma}}_{e_i}^2 \) are the additive genetic variance, non-additive genetic variance and residual variance of site i, respectively. The genetic correlation between additive genetic effects at site i and site j was estimated as \( {r}_{g_{ij}}=\frac{{\widehat{\sigma}}_{a_{ij}}}{\sqrt{{\widehat{\sigma}}_{a_i}^2{\widehat{\sigma}}_{a_j}^2}} \), where \( {\widehat{\sigma}}_{a_{ij}} \) is the additive genetic covariance between site i and site j, \( {\widehat{\sigma}}_{a_i}^2 \) is the additive genetic variance of site i and \( {\widehat{\sigma}}_{a_j}^2 \) is the additive genetic variance of site j (Burdon 1977). Genetic correlation between sites was used as an indicator of G×E interaction levels: a higher genetic correlation between sites indicated a low G×E interaction, while a low genetic correlation indicated a high G×E interaction (Burdon 1977; Falconer and Mackay 1996; Via and Hawthorne 2005). In tree breeding, a high G×E interaction is considered to exist if the genetic correlation between sites for the same trait is below 0.7 (Shelbourne 1972).
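The heritability and genetic correlation formulas above can be sketched as follows. The function names and the variance components in the example are hypothetical; the 0.7 threshold is the one quoted from Shelbourne (1972).

```python
import math


def narrow_sense_heritability(var_additive, var_residual, var_nonadditive=0.0):
    """Additive variance divided by the sum of all fitted variance components.

    Leave var_nonadditive at 0.0 for a non-clonal trial such as POP1.
    """
    return var_additive / (var_additive + var_nonadditive + var_residual)


def genetic_correlation(cov_ij, var_i, var_j):
    """Additive genetic correlation between two sites."""
    return cov_ij / math.sqrt(var_i * var_j)


def has_high_gxe(r_g, threshold=0.7):
    """Flag a site pair as showing high GxE when r_g falls below the threshold."""
    return r_g < threshold


# Hypothetical variance components (not estimates from the study).
h2_progeny_trial = narrow_sense_heritability(0.30, 0.70)
h2_clonal_trial = narrow_sense_heritability(0.30, 0.70, var_nonadditive=0.20)
r_g = genetic_correlation(cov_ij=0.12, var_i=0.30, var_j=0.20)
high_gxe = has_high_gxe(r_g)
```

With these made-up inputs the genetic correlation is about 0.49, so the site pair would be classed as showing a high G×E interaction.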

For comparison, EBVs of branch cluster frequency and stem straightness were also estimated in an across-site analysis using a model assuming homogeneous additive genetic variances and heterogeneous residual variances across sites within each trial series. This model allowed a single EBV to be estimated for each individual across multiple sites. Heritabilities of branch cluster frequency and stem straightness from the across-site analysis differed across sites because the residual variances differed. For the two clonal trial populations, homogeneous non-additive genetic variances were also fitted. The additive genetic effect had a variance of \( \boldsymbol{A}{\sigma}_a^2 \), and the non-additive genetic effect had a variance of \( {\boldsymbol{I}}_d{\sigma}_d^2 \), where \( {\boldsymbol{I}}_d \) is an identity matrix. The Akaike information criterion (AIC) (Akaike 1973) was used to compare the fit of the two models: the model assuming heterogeneous genetic and residual variances across sites and the model assuming homogeneous genetic variance and heterogeneous residual variances across sites. A lower AIC indicated a better fit to the data.
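A minimal sketch of an AIC-based comparison is given below, using the standard definition AIC = 2k − 2 ln L, under which the model with the smaller value is preferred. The log-likelihoods and parameter counts are invented for illustration and are not output from the ASReml analyses.

```python
def aic(log_likelihood, n_parameters):
    """Akaike information criterion: 2k - 2 ln(L)."""
    return 2 * n_parameters - 2 * log_likelihood


# Hypothetical log-likelihoods and parameter counts (not values from the study).
candidate_models = {
    "heterogeneous_genetic_variances": {"log_likelihood": -1200.0, "n_parameters": 12},
    "homogeneous_genetic_variances": {"log_likelihood": -1203.0, "n_parameters": 8},
}

aic_by_model = {
    name: aic(spec["log_likelihood"], spec["n_parameters"])
    for name, spec in candidate_models.items()
}

# The preferred model is the one with the smallest AIC.
preferred = min(aic_by_model, key=aic_by_model.get)
```

In this made-up case the simpler model is preferred because its small loss in likelihood is outweighed by the penalty for the four extra parameters.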

Tissue samples and DNA extraction

For all progeny and most of the parents, DNA was extracted from needle tissue. For parents where no needle tissue was available, 12 haploid megagametophyte DNA samples from the same parent were pooled in equimolar proportions to reconstruct the diploid parental genotype. Using this number of megagametophytes, the probability of correctly identifying a heterozygous parental SNP genotype was approximately \( 1-{(0.5)}^{12}=0.9998 \). For the clonally replicated test series (POP2 and POP3), tissue was collected from a single ramet per genotype. A modified version of the Macherey-Nagel NucleoSpin® 96 Plant II kit (Macherey-Nagel, Düren, Germany) was used to extract DNA from needle tissue (Telfer et al. 2013). Megagametophyte tissue was excised from seeds, and DNA was extracted using a CTAB-based method with further purification using the Qiagen® QIAquick 96 PCR Purification Kit (Qiagen, Düsseldorf, Germany).
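The detection probability quoted above can be checked with a short calculation. The function below simply evaluates the expression given in the text, one minus 0.5 raised to the number of pooled megagametophytes; the function name is hypothetical.

```python
def prob_heterozygote_detected(n_megagametophytes):
    """Evaluate 1 - 0.5**n, the expression given in the text."""
    return 1.0 - 0.5 ** n_megagametophytes


# Twelve pooled megagametophytes, as used for parents without needle tissue.
p_twelve = prob_heterozygote_detected(12)

# A smaller pool, for comparison, shows how quickly the probability rises with n.
p_six = prob_heterozygote_detected(6)
```

Rounded to four decimal places, `p_twelve` agrees with the value of 0.9998 reported in the text.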

Candidate gene selection and SNP genotyping

A total of 209 candidate genes (CG) for wood density and growth were originally selected using a range of criteria: (1) expression profile in microarray experiments conducted on differentiating xylem and/or the phenotypic extremes of a wood density QTL mapping population; (2) co-location with QTL for one or more of the following: juvenile wood density, wood density, cell wall thickness, microfibril angle, radial diameter, tangential diameter and ring width and (3) putative gene function. In the absence of an experimentally confirmed function, a putative function was assigned to each candidate gene using the Basic Local Alignment Search Tool (BLASTX) (Altschul et al. 1990) algorithm against the non-redundant protein dataset at NCBI.

Resequencing of partial candidate gene amplicons was performed using Sanger sequencing technology. To identify polymorphic SNP loci, resequence data were assembled using Sequencher® V4.9 sequence analysis software (Gene Codes Corporation, Ann Arbor, MI, USA). The majority of genes were resequenced in a panel of nine genotypes, which were tested as both diploid (foliage) and haploid (megagametophyte) tissue. Six genes were part of a full-length sequencing project which screened between 64 and 77 megagametophyte samples. At least two researchers independently evaluated sequence alignments. Haploid DNA control samples were also resequenced to distinguish true SNPs from amplification of multiple loci. If polymorphisms were detected in individual haploid samples, these candidate gene amplicons were excluded from further analysis.

Prior to the design of genotyping assays, individual SNPs were ranked based on a combination of (1) original candidate gene selection criteria, (2) the number of unrelated genotypes resequenced, (3) allele frequency, (4) low redundancy for within-gene amplicon LD and (5) for some SNPs, evidence for association with either diameter at breast height or wood density in an unstructured population (Wilcox et al., unpubl. data). Where SNPs from the same gene amplicon exhibited putative LD, no more than three were included in the design process. Assays were designed using the Assay Design Suite 1.0 (Sequenom Inc., San Diego, CA, USA). Genotyping was performed using the Sequenom iPLEX® Gold MassARRAY® platform at GenomNZ (Mosgiel, New Zealand, see http://www.genomnz.co.nz/). The distance between two SNPs identified on the same gene ranged from 57 base pairs to 2280 base pairs, depending on the size of the gene sequence that we were able to generate and the suitability of SNPs for design into a Sequenom assay. Table 2 lists the SNPs used in this analysis and the candidate genes to which they are linked.

Table 2 List of SNPs used in this analysis and candidate genes linked

Association analysis

Association analyses were performed separately for each trial series using EBVs of progeny with a multiple linear regression model into which all SNPs were fitted. Sixty-six SNPs were identified using a candidate gene approach based on putative involvement with wood density, growth rate and cell wall metabolism. The numbers of progeny with SNP genotypes were 467 and 524 in the two clonal trial series POP2 and POP3, respectively. The number of progeny with SNP genotypes was 533 in the progeny trial POP1, and all of these progeny were planted at the Tarawera site.

The statistical model for the analyses with n progeny and k SNPs is

$$ \boldsymbol{y}=\boldsymbol{X}\boldsymbol{b}+\boldsymbol{e} $$

where y is an n × 1 vector of deregressed best linear unbiased predictions of branch cluster frequency or stem straightness, b is a (k + 1) × 1 vector containing the intercept and the substitution effects of the k SNPs and e is an n × 1 vector of residual effects. The EBV for an individual was deregressed by using the accuracy of that EBV as well as the EBVs and accuracies of its parents to remove the parent average (Garrick et al. 2009). The incidence matrix X is an n × (k + 1) matrix relating the SNP genotypes (0, 1 or 2) to the response variable y. Only progeny at Tarawera in POP1 were genotyped; therefore, association analysis was only conducted for this site.
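The regression of deregressed EBVs on all SNPs at once can be illustrated with an ordinary least-squares fit. The sketch below is not the software used in the study: it assumes that the extra column of X is an intercept, solves the normal equations directly in plain Python, and uses a small made-up data set in which the response is generated exactly from known effects. The function names are hypothetical, and significance testing of the fitted effects is not shown.

```python
def solve_linear_system(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    # Work on an augmented copy so the inputs are left untouched.
    aug = [list(row) + [value] for row, value in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("X'X is singular; SNP columns may be collinear")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[row][k] -= factor * aug[col][k]
    # Back-substitution on the upper-triangular system.
    solution = [0.0] * n
    for row in range(n - 1, -1, -1):
        tail = sum(aug[row][k] * solution[k] for k in range(row + 1, n))
        solution[row] = (aug[row][n] - tail) / aug[row][row]
    return solution


def fit_snp_effects(genotypes, y):
    """Least-squares fit of y on an intercept plus all SNP genotype codes."""
    X = [[1.0] + [float(g) for g in row] for row in genotypes]
    p = len(X[0])
    xtx = [[sum(r[i] * r[j] for r in X) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(p)]
    return solve_linear_system(xtx, xty)


# Hypothetical genotypes (rows = progeny, columns = two SNPs coded 0, 1 or 2).
genotypes = [[0, 0], [1, 0], [2, 1], [0, 2], [1, 1], [2, 2]]
# Response built as 0.5 + 0.2 * SNP1 - 0.1 * SNP2, so the fit should recover these.
y = [0.5 + 0.2 * g1 - 0.1 * g2 for g1, g2 in genotypes]

intercept, effect_snp1, effect_snp2 = fit_snp_effects(genotypes, y)
```

Because all SNPs enter the model together, each fitted coefficient is the effect of one SNP conditional on the others.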

Results

Estimation of heritability and genetic correlations

Narrow-sense heritabilities of branch cluster frequency and stem straightness at each site of the three trial series were estimated to assess genetic variation in the traits (Table 3). Heritabilities of branch cluster frequency ranged from 0.29 to 0.45 in POP1, from 0.25 to 0.28 in POP2 and from 0.13 to 0.22 in POP3. Heritabilities of stem straightness ranged from 0.18 to 0.24 in POP1, from 0.11 to 0.14 in POP2 and from 0.04 to 0.18 in POP3. Heritabilities from the across-site analysis, in which the genetic model assumed homogeneous genetic variances and heterogeneous residual variances, were lower than those from the model assuming both heterogeneous genetic and residual variances. The former model always led to a lower AIC (Akaike 1973) than did the latter model.

Table 3 Heritabilities of branch cluster frequency and stem straightness at each site for the POP1, POP2 and POP3 trial series from a model assuming heterogeneous additive genetic variances and heterogeneous residual variances and a model assuming homogeneous additive genetic variance and heterogeneous residual variances (across-site analysis)

The genetic correlation between a pair of sites was used to indicate the magnitude of G×E interactions between those sites within each trial series. High genetic correlations between sites indicate low G×E interactions, while low genetic correlations indicate high G×E interactions (Via and Hawthorne 2005). High G×E interactions in branch cluster frequency (i.e. genetic correlation < 0.70) were found between the Tarawera and Glenledi sites in POP1 and between the Tarawera and Woodhill sites in POP2. High G×E interactions in stem straightness were found between the Tarawera and Woodhill sites in both POP2 and POP3 (Table 4).

Table 4 Genetic correlations among sites, number of significant SNPs at each site and number of significant SNPs in common for each site pair in the POP1, POP2 and POP3 trial series in the site-specific analysis

Total number of significant SNPs

After removing SNPs with minor allele frequencies of less than 0.05, 51, 50 and 49 SNPs (Table 2) were used for association analyses for branch cluster frequency and stem straightness in POP1, POP2 and POP3, respectively. The effect and the percentage of genetic variance explained by significantly associated SNPs in POP1, POP2 and POP3 are shown in Supplementary Tables 1, 2 and 3 for branch cluster frequency and in Supplementary Tables 4, 5 and 6 for stem straightness. The total number of SNPs that showed significant associations with branch cluster frequency in any site tested was 16, 11 and 13 in POP1, POP2 and POP3, respectively, and 32 across the three trial series. Some SNPs showed significant associations at only one site, some at two sites and some across all three sites. In POP2, three SNPs (F3H_SNP197, FAM18L_SNP233 and MYBR2_SNP1122) showed significant associations at two sites. In POP3, four SNPs (AGP_SNP1361, BTUB_SNP532, COBRA_SNP130 and SWAP_SNP169) showed significant associations across all three sites and three SNPs (GRP_SNP286, UCL3_SNP110 and UNKS2_SNP302) showed significant associations across two sites. The total number of SNPs that had significant associations with branch cluster frequency was 16 at Tarawera in POP1; 6 and 7 at Tarawera and Woodhill in POP2 and 8, 11 and 6 at Kinleith, Tarawera and Woodhill in POP3, respectively.
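The minor allele frequency filter mentioned at the start of this section could be implemented along the following lines. This is a sketch under stated assumptions: genotypes are coded as the number of copies of one allele (0, 1 or 2), missing calls are represented by `None`, and the SNP names and genotype vectors are invented for illustration.

```python
def minor_allele_frequency(genotypes):
    """Minor allele frequency from genotype codes 0, 1 or 2 (None = missing)."""
    called = [g for g in genotypes if g is not None]
    if not called:
        raise ValueError("no called genotypes")
    # Each diploid individual carries two alleles.
    freq = sum(called) / (2 * len(called))
    return min(freq, 1.0 - freq)


def filter_by_maf(snp_genotypes, threshold=0.05):
    """Keep SNPs whose minor allele frequency is at least the threshold."""
    return {
        name: genotypes
        for name, genotypes in snp_genotypes.items()
        if minor_allele_frequency(genotypes) >= threshold
    }


# Hypothetical SNP names and genotype vectors (not data from the study).
snp_genotypes = {
    "snp_a": [0, 1, 2, 1, 0, 1, None, 2],  # common allele, one missing call
    "snp_b": [0] * 39 + [1],               # rare allele, MAF = 1/80
    "snp_c": [2] * 16 + [1] * 4,           # MAF = 0.10
}

retained = filter_by_maf(snp_genotypes)
```

Here `snp_b` is dropped because only one of its 80 alleles is the minor allele, whereas the other two SNPs pass the 0.05 threshold.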

The total number of SNPs that showed significant associations with stem straightness was 8, 13 and 14 in POP1, POP2 and POP3 respectively and 26 in three trial series. Two SNPs (LA2_SNP63 and RZF_SNP241) showed significant associations with stem straightness across two sites in POP2. Four SNPs (4CL_SNP210, CAD_SNP1200, CAD_SNP1584 and F3H_SNP197) showed significant associations with stem straightness across all three sites and two SNPs (BTUB_SNP1202 and RWP.RK_SNP339) across two sites in POP3. The total number of SNPs that had significant associations with stem straightness was 8 at Tarawera in POP1; 8 and 7 at Tarawera and Woodhill in POP2 and 8, 9 and 8 at Kinleith, Tarawera and Woodhill in POP3, respectively.

Genetic variance explained by SNPs

The percentage of genetic variance explained by the SNPs showing significant associations with branch cluster frequency ranged from 0.48 to 5.10 % in POP1, from 0.47 to 5.76 % in POP2 and from 0.23 to 8.76 % in POP3 (Supplementary Tables 1, 2 and 3). The percentage of genetic variance explained by the SNPs showing significant association with stem straightness ranged from 0.43 to 10.94 % in POP1, from 0.37 to 12.75 % in POP2 and from 0.59 to 6.48 % in POP3 (Supplementary Tables 4, 5 and 6).

Number of common SNPs

The effect of a SNP was also estimated for each site. If the effect of a SNP was significant across two sites within a trial series, this SNP was regarded as ‘common’ for these two sites. The percentage of common SNPs between any two sites was plotted against the genetic correlation between these sites. Traits with a high G×E interaction (low genetic correlation) tended to have a lower percentage of significant SNPs in common (Fig. 1). The percentage of significant SNPs in common ranged from 29 to 63 % between the sites where genetic correlations were below 0.70, while the percentage of significant SNPs in common ranged from 56 to 85 % between the sites where genetic correlations were above 0.70 (Table 4).

Fig. 1 Relationship between genetic correlation and the percentage of common SNPs for sites in two trial series: POP2 (circle) and POP3 (triangle), for the form traits branch cluster frequency (unfilled symbol) and stem straightness (filled symbol)

Magnitude and direction of SNP effects within trial series

All SNPs that showed significant associations with branch cluster frequency or stem straightness at two or more sites within a trial series had the same effect direction (positive or negative effects), but the magnitude of these effects sometimes varied across sites (Supplementary Tables 1, 2, 3, 4, 5 and 6). In general, the difference in SNP effect magnitude decreased with increasing genetic correlation between sites (Fig. 2). When the genetic correlation was below 0.70, differences in effect magnitude between sites ranged from 0.86 to 0.98, with larger differences tending to correspond to a higher G×E interaction (lower genetic correlation). When the genetic correlation was above 0.70, the difference in effect magnitude ranged from 0.33 to 0.67. Figures 3 and 4 show interaction plots for SNP effects across different environments in POP2 and POP3 for branch cluster frequency and stem straightness.

Fig. 2 Relationship between genetic correlation and the absolute difference of SNP effects for sites in two trial series: POP2 (circle) and POP3 (triangle), for the form traits branch cluster frequency (unfilled symbol) and stem straightness (filled symbol). The absolute difference of SNP effects is calculated as \( \frac{\left|{b}_i-{b}_j\right|}{\overline{b}} \), where \( {b}_i \) is the SNP effect for site i, \( {b}_j \) is the SNP effect for site j and \( \overline{b} \) is the average SNP effect within a population across all possible sites
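The scaled difference of SNP effects defined in the caption of Fig. 2 can be sketched as below. The effect values are invented, the function name is hypothetical, and taking the absolute value of the mean effect in the denominator is an assumption made here so that the measure stays non-negative when a SNP has negative effects.

```python
from itertools import combinations


def relative_effect_differences(effects_by_site):
    """Return |b_i - b_j| / |mean(b)| for every pair of sites.

    effects_by_site maps a site name to the estimated effect of one SNP there.
    """
    mean_effect = sum(effects_by_site.values()) / len(effects_by_site)
    # Assumption: scale by the absolute mean so the result is non-negative.
    scale = abs(mean_effect)
    return {
        (site_i, site_j): abs(effects_by_site[site_i] - effects_by_site[site_j]) / scale
        for site_i, site_j in combinations(effects_by_site, 2)
    }


# Hypothetical effects of a single SNP at the three POP3 sites.
effects = {"Kinleith": 0.10, "Tarawera": 0.20, "Woodhill": 0.30}
differences = relative_effect_differences(effects)
```

With these made-up effects the mean is 0.20, so the Kinleith and Woodhill pair has a scaled difference of about 1.0 and the other two pairs about 0.5.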

Fig. 3 Interaction plot of SNP effects for branch cluster frequency across sites in three trial series

Fig. 4 Interaction plot of SNP effects for stem straightness across sites in three trial series

Stability across series

Stability describes how stable the effect of a SNP is across different environments across trial series. The impact of stable SNPs on phenotype was less affected by environments. In total, 32 SNPs were found to have significant associations with branch cluster frequency in three trial series, 25 of which had significant associations in only one trial series and 7 (4CL_SNP210, AGP_SNP1361, CAD_SNP3480, COBRA_SNP130, HCT_SNP1075, LTP_SNP486 and SCPL_SNP590) across two trial series (see Supplementary Tables 1, 2 and 3). No SNPs showed significant associations with branch cluster frequency across all three trial series. Twenty-six SNPs were found to have significant associations with stem straightness in the three trial series, 19 of which had significant associations in only one trial series and 7 (BTUB_SNP1202, F3H_SNP197, FAM18L_SNP233, LTP_SNP486, MYBR1_SNP65, RPK_SNP440 and RZF_SNP127) in two trial series (see Supplementary Tables 4, 5 and 6). One SNP (RZF_SNP241) showed significant associations with stem straightness across all three trial series. The majority of SNPs showing significant associations with branch cluster frequency or stem straightness had the same direction of SNP effects across sites and trial series. LTP_SNP486 had SNP effects in opposite directions in POP1 and POP3 for branch cluster frequency and in POP2 and POP3 for stem straightness. RPK_SNP440 had SNP effects in opposite directions in POP1 and POP3 for stem straightness.

There were 15 SNPs showing significant associations with both branch cluster frequency and stem straightness across the three trial series. The effects of six of these SNPs were in the same direction for both traits, and those of the remaining nine were in opposite directions. Six SNPs had significant associations with both traits in the same environment in the same trial series, and their effects were in the same direction for both traits: F3H_SNP197 and MYBR2_SNP65 at Woodhill in POP2, FAM18L_SNP233 at Tarawera in POP2 and RPK_SNP440, RZF_SNP127 and RZF_SNP241 at Tarawera in POP1.

Discussion and conclusions

This study determined the levels of genetic variation and genotype by environment interaction in branch cluster frequency and stem straightness. Heritable variation in branch cluster frequency and stem straightness was found at all sites of the three trial series, which indicates that these traits can be improved through breeding programmes. The narrow-sense heritabilities of branch cluster frequency (0.13 to 0.45) and stem straightness (0.04 to 0.24) in the current study were within the range of heritabilities reported in the literature. For example, heritability of branch cluster frequency in radiata pine has been estimated as 0.19 in control-pollinated populations (Wu and Matheson 2005) and 0.37 in juvenile clones (Burdon et al. 1992). The heritability of stem straightness in radiata pine has been estimated as 0.11 to 0.17 in control-pollinated populations (Carson 1986, 1991; Wu et al. 2008) and 0.28 for juvenile clones (Burdon et al. 1992).

High G×E interactions were found for branch cluster frequency and stem straightness among some sites within the three trial series, particularly between Tarawera and Woodhill. The volcanic soil at Tarawera leads to high productivity in radiata pine, while the sandy soil at Woodhill is less fertile. This suggests that the best genotypes at Tarawera may not be the best genotypes at Woodhill. Where G×E interactions exist, genotypes need to be selected separately for each site to achieve maximum genetic gain. In contrast, the moderately fertile Kinleith and Kaingaroa sites showed no G×E with any of the other sites, suggesting that the best genotypes at Kinleith or Kaingaroa are also the best genotypes at Kaingaroa or Kinleith or at Tarawera and Woodhill.

Genotype by environment interactions have been extensively studied in radiata pine (Baltunis and Brawner 2010; Baltunis et al. 2010; Codesido and Fernández-López 2009; Matheson and Raymond 1984; Pederick 1990; Raymond 2011; Shelbourne 1972; Wu et al. 2008; Wu and Matheson 2005), with growth rate being the trait most sensitive to environment. Reported levels of genotype by environment interaction in branch cluster frequency and stem straightness vary in the literature. Low G×E interactions for traits such as branch cluster frequency and stem straightness have been reported in some studies (Baltunis et al. 2010; Matheson and Raymond 1984; Pederick 1990; Wu and Matheson 2005). However, Codesido and Fernández-López (2009) found high G×E interaction for stem straightness in Spain with a genetic correlation of 0.2, and Baltunis et al. (2010) also reported evidence of G×E for stem straightness and branch quality score in New Zealand with a genetic correlation between sites ranging from 0.4 to 0.74. Genotype by environment interactions were also found in stem diameter, straightness, branch diameter and angle, number of branch clusters, malformation, height of lowest stem cone and wood density in 29 half-sib open-pollinated families of radiata pine planted at three sites in New Zealand (Shelbourne 1972).

In the current study, when a trait was influenced by G×E interactions between sites, the effects of SNPs tended to differ in magnitude between the sites, or fewer SNPs had significant associations with the trait at some sites, or the effects were in opposite directions. Similar results were reported in a study estimating SNP effects on litter size in pigs, where the 100 largest absolute SNP effects ranged from 0 to 0.0026 across different environments (herd-year-season), 86 of which had effects in opposite directions for different herd-year-season environments (Silva et al. 2014). These differences in SNP effects imply that either (1) a SNP associated with a trait in one environment might not be associated with the trait in another environment or (2) a SNP allele that is beneficial for a trait in one environment might be detrimental in another environment. Differences in SNP effects on a trait between populations have been attributed to different linkage disequilibria between these populations (Falconer and Mackay 1996). Results of the current study demonstrate that G×E interaction, that is, the response of genes to environments, could also be a cause of a SNP having different effects in different environments.

Association study results have been difficult to replicate among independent association tests in human disease studies (see review by Ball (2007a)) and in forest studies (e.g. Dillon et al. (2010)). Lack of repeatability was also reported in previous QTL mapping studies (Beavis and Paterson 1997), including in radiata pine (Wilcox et al. 1997). Various reasons have been postulated for the lack of repeatability, including the following: (1) the effects of genes were small; (2) the populations used in these studies were different or of limited sample size or (3) the analytical methods and criteria for assessing associations were different (Ball 2007b; Tabor et al. 2002). The current study used multiple populations to investigate associations between SNPs and economically important traits within each single trial series. It therefore provides an opportunity to investigate association power between SNPs and traits across different trial series. The results of the current study might give other possible explanations for the difficulty in replicating the independent association tests mentioned above. One SNP was associated with stem straightness across all trial series, and a number of SNPs had significant associations with branch cluster frequency or stem straightness across two of the three trial series. However, the magnitude, direction and significance of SNP effects across trial series and environments were quite different. SNP effects were determined by two factors: the distribution of individuals among the three categories of SNP genotypes and the breeding values of individuals. The former factor was the allele frequency of SNPs, which was quite different between trial series (Supplementary Tables 1, 2 and 3). The latter factor reflected G×E interactions across environments, i.e. the responses of genes to environments. Therefore, differences in allele frequencies and differences in the response of genotypes to environments might be two possible reasons for the lack of repeatability of associations across populations.
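
The two factors named above can be illustrated with a minimal sketch. It assumes, purely for illustration, that a SNP effect is estimated as the least-squares slope of breeding value on allele dosage (0, 1 or 2 copies of one allele); this is not necessarily the model used in the current study, and all genotypes and breeding values below are made-up numbers, not data from the trials.

```python
from statistics import mean


def additive_snp_effect(dosages, breeding_values):
    """Least-squares slope of breeding value on allele dosage (0, 1 or 2).

    Illustrative assumption: a single-SNP linear regression, not the
    association model of the study.
    """
    d_bar = mean(dosages)
    y_bar = mean(breeding_values)
    sxx = sum((d - d_bar) ** 2 for d in dosages)
    if sxx == 0:
        raise ValueError("SNP is monomorphic in this sample")
    sxy = sum((d - d_bar) * (y - y_bar)
              for d, y in zip(dosages, breeding_values))
    return sxy / sxx


def allele_frequency(dosages):
    """Frequency of the counted allele among diploid individuals."""
    return sum(dosages) / (2 * len(dosages))


# Factor 1: genotype distribution (hypothetical data).
# Genotypic values are identical in both populations (0, 1, 1 for the
# three genotypes, i.e. complete dominance), but the genotype classes
# are populated differently, so the regression slope differs.
pop_x = [0, 1, 1, 2]
pop_y = [0, 0, 0, 1, 1, 2]
value = {0: 0.0, 1: 1.0, 2: 1.0}
effect_x = additive_snp_effect(pop_x, [value[d] for d in pop_x])
effect_y = additive_snp_effect(pop_y, [value[d] for d in pop_y])

# Factor 2: G×E (hypothetical data).
# The same genotypes have breeding values that rank in opposite order
# at two sites, so the estimated effect changes sign.
dosages = [0, 0, 1, 1, 2, 2]
bv_site1 = [1.0, 1.2, 2.0, 2.2, 3.0, 3.2]
bv_site2 = [3.0, 3.2, 2.0, 2.2, 1.0, 1.2]
effect_site1 = additive_snp_effect(dosages, bv_site1)
effect_site2 = additive_snp_effect(dosages, bv_site2)

print(allele_frequency(pop_x), effect_x)
print(allele_frequency(pop_y), effect_y)
print(effect_site1, effect_site2)
```

In this toy example, the first comparison changes only the allele frequency and the second changes only the breeding values, so each factor can be seen to alter the estimated SNP effect on its own.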

Falconer and Mackay (1996) considered that quantitative traits, such as branch cluster frequency and stem straightness, are controlled by polygenes that individually contribute small effects. If G×E interactions are high, one or both of the following may be true: (1) different sets of genes are required for high performance relative to a given trait in each environment, or (2) the allelic effects of genes controlling the trait might contribute differently in magnitude to trait variation between environments (Falconer and Mackay 1996; Via and Hawthorne 2005). Therefore, the contribution of a single gene to trait variation when G×E is present could follow one of two possible patterns: only contributing in some environments or contributing differently in magnitude in different environments. Similarly, if SNPs are associated with genes controlling a quantitative trait that exhibits G×E interaction, the association should follow patterns similar to those of the genes: (1) the association only happens in some environments; (2) the association has effects which are different in magnitude among environments or (3) the association has effects different in direction among environments. In the case of no G×E interaction, the contribution of a gene should be consistent across all environments. If SNPs are associated with genes controlling a quantitative trait without G×E interaction, the association should be consistent across all environments. The results of the current study support all of these postulates, in that some SNPs had significant associations in some trial series but not others (e.g. CAD_SNP3480 for branch cluster frequency), or had an effect different in magnitude between sites or trial series (for high G×E interaction situations, e.g. F3H_SNP197 for stem straightness), or had effects different in direction across sites or trial series (for high G×E interaction situations, e.g. RPK_SNP440 for stem straightness).
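
The association patterns listed above can be expressed as a simple decision rule. The following sketch is illustrative only: the function name `classify_association`, the tolerance `rel_tol` and the example effect values are hypothetical and were not used in the current study. It takes the estimated effect of one SNP in each environment, together with a flag indicating whether each association was significant, and assigns one of the patterns.

```python
def classify_association(effects, significant, rel_tol=0.25):
    """Assign a SNP to one of the association patterns across environments.

    effects     -- estimated SNP effect in each environment
    significant -- True/False per environment (association significant?)
    rel_tol     -- hypothetical tolerance: magnitudes differing by more
                   than this fraction of the largest one count as different
    """
    sig_effects = [e for e, s in zip(effects, significant) if s]
    if not sig_effects:
        return "no association"
    # Pattern (3): significant effects in opposite directions.
    if any(e > 0 for e in sig_effects) and any(e < 0 for e in sig_effects):
        return "opposite direction"
    # Pattern (1): association present in some environments only.
    if len(sig_effects) < len(effects):
        return "some environments only"
    # Pattern (2): same direction everywhere, but unequal magnitude.
    magnitudes = [abs(e) for e in sig_effects]
    if max(magnitudes) - min(magnitudes) > rel_tol * max(magnitudes):
        return "different magnitude"
    # Expected when there is no G×E interaction.
    return "consistent"


# Made-up effects for four hypothetical SNPs at two sites.
examples = {
    "snp_a": ([0.30, 0.00], [True, False]),
    "snp_b": ([0.30, 0.10], [True, True]),
    "snp_c": ([0.30, -0.20], [True, True]),
    "snp_d": ([0.30, 0.28], [True, True]),
}
for name, (eff, sig) in examples.items():
    print(name, classify_association(eff, sig))
```

The order of the checks is a design choice: a change in direction is tested first because it has the most serious consequence for selection, and the magnitude comparison is only meaningful once all environments show a significant effect in the same direction.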

The SNPs used in this study were selected using a candidate gene approach based on putative involvement in wood density and growth form and were tested for associations with form traits. Thirty-two of these SNPs were found to be associated with branch cluster frequency and 26 SNPs with stem straightness. Seventeen SNPs had significant associations with both traits at any site across the three trial series, and six of them showed significant associations with both traits in the same environment within the same trial series. These results indicate that the candidate genes linked with the SNPs used in the current study can affect multiple traits. This phenomenon is known as pleiotropy (Falconer and Mackay 1996) and is the main cause of genetic correlation between traits (White et al. 2007). In radiata pine, a positive genetic correlation between stem straightness and branch cluster frequency has previously been reported, ranging from 0.38 to 0.47 (Wu et al. 2008). The genetic correlations between branch cluster frequency and stem straightness in the current study were 0.17, 0.61 and 0.38 in POP1, POP2 and POP3, respectively. SNPs would contribute toward a positive genetic correlation if they had effects on both traits in the same direction and toward a negative genetic correlation if they had effects on the two traits in opposite directions.

Understanding the pattern of SNP associations with the form traits across environments could lead to significant application of our results in tree breeding programmes. If genotypes are already tested in the environments where they are to be deployed, selection of the most suitable genotypes for each environment would maximise genetic gain. In this case, SNPs that are significantly associated in each environment would be used for marker-assisted selection. If genotypes are to be selected for a wide range of environments with diverse characteristics (such as latitude, temperature, soil nutrition, water availability and weather factors), then a possible selection strategy could be to select genotypes which have stable performance across multiple environments. SNPs which consistently show associations across sites (see Supplementary Tables 1, 2, 3, 4, 5 and 6) could be used for this purpose. In the case of genomic selection, where hundreds or thousands of SNPs may show significant associations with a trait in one environment, an environment-specific genomic EBV could be developed to speed up genetic gain in that environment. Similarly, genomic EBVs could be developed using SNPs associated with a trait consistently across environments to produce offspring with stable performance, regardless of target environments.