Background

The worldwide demand for vegetable oils and proteins has significantly increased in recent decades due to population growth and increased standards of living. Therefore, high seed yield and quality are major goals in crop production, while at the same time, there is a need to stabilize seed production under fluctuating environments and to reduce the environmental impacts of agriculture by reducing the inputs. Rapeseed (Brassica napus L.) is a major oleaginous crop that is cultivated worldwide. It is grown for its oil-rich seeds (~40–45 % of the seed dry matter), which are used for food and industrial purposes, as well as for its seed cake containing ~30–35 % protein, which is used to feed livestock. Compared to other crops, rapeseed is highly demanding in terms of input, with particularly high requirements for mineral nitrogen (N) (~150–250 kg N/ha depending on the pedo-climatic growth conditions) for a seed yield of ~3.0–3.5 t/ha in Western Europe [1]. N fertilization is a key factor in the economic balance of rapeseed production, as N fertilizer is the main expense for farmers. In addition, there is serious concern regarding N loss in the field, which can lead to soil and water pollution through nitrate leaching and to air pollution through greenhouse gas emissions [2]. Reducing N input is therefore a current challenge for sustainable rapeseed production, which implies the maintenance of competitive yields at reduced N fertilization levels. This goal may be achieved by improving the nitrogen use efficiency (NUE), which can be defined as the process of converting N into seed yield [3].

Rapeseed is often described as a low-NUE crop, with values ranging from 15 to 20 kg seeds/kg of available N; the NUE of rapeseed is approximately half that of cereals (~35–40 kg seeds/kg N). The high oil accumulation in the seeds is an energy-consuming process requiring high amounts of carbon per unit of dry matter. This partly explains rapeseed relative low NUE [1]. In addition, the significant amounts of plant N that are lost through leaf fall during the crop cycle (approximately 45 kg N/ha) may also explain the low NUE of rapeseed [4].

From an agronomic point of view, the improvement of the NUE can be assessed by the increase of seed yield per unit of N fertilizer [5, 6]. Hence, a prerequisite for increasing NUE is gaining further insight into the genetic control of yield and yield components under contrasting N fertilization conditions. Additionally, the seed N content and the N harvest index are common proxies used to assess the efficiency of N remobilization from the vegetative to the reproductive organs and, more generally, to evaluate the NUE. However, a trade-off exists between the N and oil contents in seed, and this relationship must be uncoupled to increase the NUE while maintaining a high oil content. This uncoupling is one of the main goals of rapeseed breeding.

Yield is a particularly complex trait in rapeseed due to the plant’s capacity to grow and branch after flowering, which leads to compensations between the different yield components (seed number/m2, seed weight, etc.). Several quantitative trait loci (QTL) have already been identified as contributors to seed quality- and seed yield-associated traits in rapeseed [713]. However, only a few studies have reported on the genotypic yield stability under abiotic or nutritional constraints [1416], particularly under sub-optimal N fertilization conditions [1720]. This lack of evidence suggests that there is room to improve the understanding of the genetic control of NUE and yield stability across N nutrient conditions in rapeseed.

Gaining insight into this genetic control requires a better understanding of the genotypic responses to various N stress conditions. These responses are quantified by the genotype × N (G × N) interaction, which deviates from the expected trait level of one genotype under a particular N nutrient condition. The presence of G × N interactions may reflect specific genetic control depending on the N nutrient conditions. However, other biotic and abiotic stresses independent of crop N nutrient levels may occur throughout the crop cycle but are partially manageable with appropriate crop management. The combination of interactions of genotypes with all the stresses and/or constraints that are encountered throughout the crop cycle defines the genotype × environment (G × E) interactions.

Understanding the determinants of the G × E interactions for seed yield-related traits is a key consideration for breeding, and this issue has been extensively studied in crops [21]. Several parameters have been proposed to characterize the G × E interactions and to estimate phenotypic stability in multi-trial analyses; these proposals have been reviewed by Becker et al. [22]. Among them, non-parametric methods rely on genotype ranking between different environments [23]. Additional methods are based on the regression of each genotypic value according to either the means of the environments [24] or the environmental effects [25, 26], with the regression coefficients and the coefficients of determination used as indicators of genotypic stability. Finally, the calculation of ecovalence also provides clues to determine the contribution of each genotype to a G × E interaction [27]. All of these methods have been used to investigate G × E interactions and are likely to be transferable to the study of the G × N interactions. Nevertheless, the genetic determinants of these traits have hardly been studied to date [28].

The aims of the present study were to obtain a comprehensive overview of the genetic control of yield in winter oilseed rape and to assess the impact of N nutrition conditions on yield stability. To achieve these goals, a large variety of rapeseed genotypes were phenotyped in a wide network of trials under optimal versus limited N fertilization conditions calibrated to generate N stress and G × N interactions. We first studied the partition of the genotypic main effects: the G × N and G × trial interactions. We then combined genome-wide association studies (GWAS) and linkage analyses to investigate the genetic architecture of seed yield-related traits and the stability of these traits across environments by calculating ecovalence values. Finally, we assessed the genomic organization of the critical QTL within the B. napus genome.

Methods

Plant material and genotyping data

Populations for GWAS

A population of 92 WOSR accessions (hereafter referred to as the WOSR-92 population) was used for GWAS (Additional file 1: Table S1). The accessions originated from Western Europe, with 50 genotypes of the double-low type (‘00’, low in erucic acid and glucosinolates), 17 of the ‘0+’ type, 1 of the ‘+0’ type and 24 of the ‘++’ type. A subset of 69 individuals (WOSR-69) with homogeneous flowering precocities between accessions and a limited flowering period was chosen within the WOSR-92 set and considered for GWAS (Additional file 1: Table S1). All of the accessions were genotyped using the Brassica 60 K Infinium® SNP array (Illumina, Inc. San Diego, CA) [29], and the data were visualized using Genome Studio software (Illumina). Approximately 30 K SNPs were validated and scored in each of the WOSR populations using thresholds of 5 % for the minor allele frequency (MAF) and 10 % for the frequency of missing values (Additional file 2: Table S2). Up to 83 % of the SNPs were physically anchored to the B. napus genome [30], and most markers had a genetic position on the WOSR map that was obtained via successive projections of the individual maps of the Aviso × Montego, Tenor × Express, Darmor-bzh × Bristol, Aviso × Aburamasari and Darmor-bzh × Yudal crosses, all of which were genotyped using the Brassica 60 K SNP array (C. Falentin and G. Deniot, unpublished results). A pairwise estimate of linkage disequilibrium (LD, r2) was performed using PLINK 1.9 software [31, 32]. LD decay was evaluated using a non-linear regression of the expected r2 as described by Sved et al. [33] using the equation E[r2] = 1/(1 + 4 × Ne × c), where c is the recombination rate in morgans and Ne is the effective population size. E[r2] was plotted against the genetic distance between SNPs (in centimorgans (cM) or in base pairs (bp)) to estimate the extent of LD with the r2 set to 0.2. The LD decay of each WOSR population and of each linkage group is given in Additional file 2: Table S2. The genetic relatedness between individuals was assessed by computing an identity-by-state kinship matrix (K matrix) using the GEMMA package [34] with a set of 56 SSR markers spread uniformly across the genome [35, 36].

Populations for linkage analyses

Two populations of doubled haploid (DH) lines were derived from four WOSR lines with contrasting responses to different N fertilization conditions (unpublished data): Aviso × Montego (AM-DH, 112 individuals) and Tenor × Express (DK-DH, 75 individuals). The AM-DH population was described previously [19]. Both populations were genotyped with the Brassica 60 K SNP array using the same thresholds for SNP calling and validation as described for the WOSR populations. The AM-DH and DK-DH genetic maps contained 968 and 800 unique loci, covering a total length of 1,870 and 1,938 cM at a density of one locus per 1.93 and 2.42 cM, respectively.

Field trials

A summary of the different experimental conditions is shown in Table 1 and Additional file 3: Table S3. The trials (hereafter defined as combinations of location × year) were conducted in France across a set of locations representing a wide variety of pedo-climatic conditions. The WOSR-92 population was evaluated at Le Rheu (48.82163 N, 1.48926E) during the 2008–2009 (LR09) and 2009–2010 (LR10) crop seasons. The WOSR-69 population was evaluated in 2013–2014 at five sites: Châteauroux (Ch14, 46.914158 N, 1.756584E), Dijon (Dij14, 47.230468 N, 5.10036E), Prémesques (Pre14, 50.380000 N, 2.570000E), Selommes (Sel14, 47.44324 N, 1.14943E) and Verpillères (Ver14, 49.68028 N, 2.81528E). The AM-DH population was evaluated at Le Rheu in 2010–2011 (LR11), 2011–2012 (LR12) and 2012–2013 (LR13) as described previously [19]. The AM-DH population was also evaluated at Mondonville in 2010–2011 (Md11, 43.670000 N, 1.280000E), and a subset of 75 individuals was trialed in Dijon in 2012–2013 (Dij13, 47.234781 N, 5.104821E). The DK-DH population was evaluated at Le Rheu and Mondonville during the 2010–2011 crop season (LR11 and Md11, respectively).

Table 1 Experimental trials, crop management strategies and nitrogen nutrition indexes at the bolting stage (BBCH 50)

Plants were grown under two N nutrition conditions (N1: low; N2: optimal) as described in detail below. To limit the amount of mineral N in the soil in the experimental plots, no organic matter was spread on the fields for three years before the trials, and the previous crops were grown under a low-N-input management system. All of the trials were designed as split plots with N as the main plot and genotypes as the sub-plots, except for the Md11 trials, which were designed as alpha plans with N nutrient conditions as the main plots and genotypes as the sub-plots (Additional file 3: Table S3). Seeds were sown in plots of 10 to 18 m2 at a density of 35 plants/m2. In each trial, control plots planted with the Aviso cv. were included in the design to assess the N status of the plants throughout the crop cycle using N nutrition index (NNI) measurements according to Colnenne et al. [38] (see below). The mineral soil content was measured as described previously just before sowing, at the end of winter and just after harvest [19].

N fertilization was calculated using the balance sheet method, which is commonly used in France for the main arable crops [39, 40]. The difference in fertilizer amounts between the two N treatments varied between 60 and 100 kg N/ha, depending on the trial (Table 1, Additional file 3: Table S3). All of the N applications were made using a liquid fertilizer solution containing 39 % N (50 % urea, 25 % nitrate and 25 % ammonium) on two dates (the beginning of stem elongation and during spring elongation), except for Dij13 and Sel14, for which an additional application was made at the very beginning of flowering (Additional file 3: Table S3).

For each trial, the NNI was measured at three time points, including the end of the autumnal period (BBCH 19: date 1), the end of the winter period (BBCH 30: date 2) and during the course of spring elongation (BBCH 50: date 3) (Additional file 3: Table S3). On dates 1 and 2, no N fertilizer was applied so that all of the plants were at the same N nutrition level. Only the NNI values at BBCH 50 are presented in this study. The plants were considered stressed if the NNI values were below 0.90, and the intensity of the stress increased as the NNI value decreased. Intense stress conditions were defined as NNI values below 0.75. The N stresses that were applied to the crops were moderate for five of the trials, including LR13, Md11, Ch14, Dij14 and Ver14; in these trials, the NNI values for the low-N conditions ranged from 0.81 to 0.87. The N stress was intense in the Pre14 trial (NNI_N1 = 0.67), whereas no N stress was detected in the other trials (NNI_N1 > 0.96) (Table 1). However, despite the absence of N stress, differences in NNI values were observed between the two N treatments for the LR09 and LR10 trials (ΔNNI of 0.35 and 0.2, respectively), reflecting differences in plant N nutrition status between N fertilization conditions.

Phenotypic data acquisition and analysis

The measured traits were previously described in detail [19] and were as follows: days to flowering (DTF in days, measured at BBCH 61 [37]), seed yield (SY in t/ha), thousand-seed weight (TSW in g), seed number/m2 (SN = (SY × 100,000)/TSW), seed oil content (O in % of seed dry matter), seed protein content (Pr in % of seed dry matter) and seed oil content/seed protein content ratio (O/Pr). All of the statistical analyses were carried out with R software version 3.2.4 [41].

Characterization of the trials

To characterize the different environments (hereafter defined as combinations of trial × N treatment), a principal component analysis (PCA) was performed on the phenotypic means of each genotype for the AM-DH and WOSR-69 populations. The environments were then grouped via hierarchical clustering based on the coordinates of each genotype on the first five principal components (FactoMineR package; [42]). For the AM-DH population, data from Dij13 were not considered for the clustering analysis because the DTF, SN and TSW traits were not recorded in this trial. The clustering of environments was not performed for the DK-DH population because it was only tested in two trials (LR11 and Md11) that were already addressed in the AM-DH population. Concerning the WOSR-69 population, the DTF trait was not considered because it was not recorded in Ver14, and the data from Dij14 were discarded from the analysis because DTF, SY and SN were not recorded in this trial. The phenotypic and genetic correlations (rg) between the traits averaged over the trials for each population and each N fertilization condition were also calculated.

Mixed linear models

Different mixed models were analyzed using the lme4 [43] and lmerTest [44] packages, and the results are presented below.

First, a mixed linear model was applied to each trait (P) using the REML method, with all of the trials and N fertilization conditions confounded. This multi-environment model (1) was fitted for the two DH populations as well as for the WOSR-69 population tested in seven trials (LR09, LR10, Ch14, Dij14, Pre14, Sel14, and Ver14):

$$ {\boldsymbol{P}}_{\boldsymbol{i}\boldsymbol{jklm}} = \boldsymbol{\mu} +\underline {{\boldsymbol{G}}_{\boldsymbol{i}}} + {\boldsymbol{N}}_{\boldsymbol{j}} + {\underline {\boldsymbol{T}}}_{\boldsymbol{l}}+\underline {{\boldsymbol{T}}_{\boldsymbol{l}}\left({\boldsymbol{R}}_{\boldsymbol{k}}\right)} + \underline {{\boldsymbol{G}}_{\boldsymbol{i}}\times {\boldsymbol{N}}_{\boldsymbol{j}}}+{\underline {{\boldsymbol{G}}_{\boldsymbol{i}} \times \boldsymbol{T}}}_{\boldsymbol{l}} + {\underline {{\boldsymbol{G}}_{\boldsymbol{i}}\times {\boldsymbol{N}}_{\boldsymbol{j}}\times \boldsymbol{T}}}_{\boldsymbol{l}} + \underline {{\boldsymbol{e}}_{\boldsymbol{i}\boldsymbol{jklm}}} $$
(1)

where P ijklm is the phenotypic value, μ is the population mean, G i is the genotype i, N j is the N nutrition condition j, R k is the replicate k, T l is the trial l, and e ijklm is the residual. The underlined terms were considered as random. Based on model (1), broad sense heritability (h2) was then calculated as follows:

$$ {\boldsymbol{h}}^2=\frac{{{\boldsymbol{\sigma}}^2}_{\boldsymbol{G}}}{{{\boldsymbol{\sigma}}^2}_{\boldsymbol{G}} + \frac{{{\boldsymbol{\sigma}}^2}_{\boldsymbol{G}\times \boldsymbol{N}}}{\boldsymbol{n}} + \frac{{{\boldsymbol{\sigma}}^2}_{\boldsymbol{G}\times \boldsymbol{T}}}{\boldsymbol{t}} + \frac{{{\boldsymbol{\sigma}}^2}_{\boldsymbol{e}}}{\boldsymbol{t}\times \boldsymbol{n}\times \boldsymbol{r}}} $$
(2)

where σ 2 G is the genetic variance, σ 2 G×N is the G × N variance, σ 2 e is the residual variance, σ 2 G×T is the G × T variance, n is the number of N fertilization conditions, t is the number of trials, and r is the number of replicates per genotype, per N fertilization condition, and per trial.

A second mixed linear model was applied to each trait (P) in each trial, with all N conditions confounded. Model (3) was adjusted for trials with a split plot design:

$$ {\boldsymbol{P}}_{\boldsymbol{i}\boldsymbol{jkl}} = \boldsymbol{\mu} +\underline {{\boldsymbol{G}}_{\boldsymbol{i}}} + {\boldsymbol{N}}_{\boldsymbol{j}} + \underline {{\boldsymbol{R}}_{\boldsymbol{k}}} + \underline {{\boldsymbol{G}}_{\boldsymbol{i}}\times {\boldsymbol{N}}_{\boldsymbol{j}}} + \underline {{\boldsymbol{e}}_{\boldsymbol{i}\boldsymbol{jkl}}} $$
(3)

Model (4) was fitted for the trials with an alpha plan design (AM-DH and DK-DH in Md11 trials):

$$ {\boldsymbol{P}}_{\boldsymbol{i}\boldsymbol{jklm}} = \boldsymbol{\mu} +\underline {{\boldsymbol{G}}_{\boldsymbol{i}}} + {\boldsymbol{N}}_{\boldsymbol{j}} + \underline {{\boldsymbol{R}}_{\boldsymbol{k}}}+\underline {{\boldsymbol{R}}_{\boldsymbol{k}}\left({\boldsymbol{B}}_{\boldsymbol{l}}\right)}+\underline {{\boldsymbol{G}}_{\boldsymbol{i}}\times {\boldsymbol{N}}_{\boldsymbol{j}}} + \underline {{\boldsymbol{e}}_{\boldsymbol{i}\boldsymbol{jklm}}} $$
(4)

where P ijkl and P ijklm are the phenotypic values, μ is the population mean, G i is the genotype i, N j is the N nutrition condition j, R k is the replicate k, B l is the block l, and e ijkl and e ijklm are residuals. All of the effects were considered random except for the N nutrition effect.

The corresponding heritabilities were assessed as follows:

$$ {\boldsymbol{h}}^2=\frac{{{\boldsymbol{\sigma}}^2}_{\boldsymbol{G}}}{{{\boldsymbol{\sigma}}^2}_{\boldsymbol{G}} + \frac{{{\boldsymbol{\sigma}}^2}_{\boldsymbol{G}\times \boldsymbol{N}}}{\boldsymbol{n}} + \frac{{{\boldsymbol{\sigma}}^2}_{\boldsymbol{e}}}{\boldsymbol{n}\times \boldsymbol{r}}} $$
(5)

Finally, a random linear model was applied to each trait P for each trial and N fertilization condition. This single-environment model (6) was fitted for each population (WOSR-92, WOSR-69, AM-DH and DK-DH):

$$ {\boldsymbol{P}}_{\boldsymbol{i}\boldsymbol{jk}} = \boldsymbol{\mu} +\underline {{\boldsymbol{G}}_{\boldsymbol{i}}} + \underline {{\boldsymbol{R}}_{\boldsymbol{j}}} + \underline {{\boldsymbol{e}}_{\boldsymbol{i}\boldsymbol{jk}}} $$
(6)

where P ijk is the phenotypic value, μ is the population mean, G i is the genotype i, R j is the replicate j, and e ijk is the residual. All of the terms were considered as random. Additionally, h2 was estimated for each N fertilization condition and each trial using the following formula:

$$ {\boldsymbol{h}}^2=\frac{{{\boldsymbol{\sigma}}^2}_{\boldsymbol{G}}}{{{\boldsymbol{\sigma}}^2}_{\boldsymbol{G}} + \frac{{{\boldsymbol{\sigma}}^2}_{\boldsymbol{e}}}{\boldsymbol{r}}} $$
(7)

Stability of the genotypes across environments

The stability of the genotypes from a given population across N fertilization conditions or trials was estimated by calculating the corresponding ecovalence values as described by Wricke (1962) [27]:

$$ {\boldsymbol{W}}_{\boldsymbol{i}} = {\displaystyle {\sum}_{\boldsymbol{i}}}{\left({\boldsymbol{Y}}_{\boldsymbol{i}\boldsymbol{j}}-{\boldsymbol{Y}}_{\boldsymbol{i}.}-{\boldsymbol{Y}}_{.\boldsymbol{j}}+{\boldsymbol{Y}}_{..}\ \right)}^2 $$
(8)

where Y ij is the phenotypic value of genotype i under treatment j (N nutrition condition or trial), Y i. is the mean phenotypic value of genotype i over all of the considered treatments (all N nutrition conditions or trials), Y .j is the mean phenotypic value of treatment j (N nutrition condition or trial), and Y .. is the general mean. The ecovalence calculated over the N fertilization conditions was called the G × N model, and the ecovalence calculated over the trials was called the G × T model.

Genetic analyses

For GWAS, a compressed mixed linear model [45] implemented in the GAPIT R package [46] was used. For each genotype of the WOSR populations, four datasets were considered for the GWAS of a given trait: 1) the adjusted means extracted from the single-environment model (6), 2) the genotypic estimates across trials extracted from the multi-environment model (1), 3) the ecovalence values over the N fertilization conditions extracted from the G × N model (8), and 4) the ecovalence values over the trials extracted from the G × T model (8). A mixed linear model (MLM) in which the K matrix was declared to be random was applied to each of the analyses, and fixed marker effects were included one by one. To correct for multiple analyses, the false discovery rate (FDR) was calculated for each test as previously described [47], and SNPs with a FDR of less than 0.15 were considered significantly associated with a given trait. To define trait-associated genomic regions (GWAS-QTL), confidence intervals were calculated as described by Cormier et al. [48]. Briefly, the trait-associated SNPs were clustered according to their genetic relatedness, and the boundaries of each cluster were extended via the addition of the local LD decay value, calculated with all of the markers covering 5 % of the linkage group length from the middle of the cluster. In addition, the SNP with the lowest FDR within each cluster was considered the most probable position of the GWAS-QTL.

For linkage analyses, a multiple QTL mapping (MQM) model was tested using the R/qtl package [49]. For each genotype of the DH populations, the same four datasets as those described above for GWAS were considered. The QTL mapping models were previously described in detail [19], and a p-value of 0.05 was considered the threshold for significance. The trait-associated genomic regions arising from the linkage analyses were referred to as LA-QTL. GWAS-QTL and LA-QTL were finally projected onto the WOSR map using BioMercator software [50].

Genomic analyses of targeted regions

Trait-associated QTL were analyzed in terms of structural organization within the B. napus genome. For this purpose, we focused on the QTL detected with the multi-environment model (1), which were associated with yield components (DTF, SY, SN, and TSW). The homoeologous relationships between genes from the A and C sub-genomes were extracted from the structural annotation of the Darmor-bzh genome sequence published by Chalhoub et al. [30] and aligned with the physical positions of the QTL found in the present study to find consistent matches. The results were represented graphically using CIRCOS [51].

Results

Yield-related traits were highly heritable

Broad sense heritability values calculated with the multi-environment model (h2, model (2)) were always greater than 0.84 for all traits in all populations, with the exception of the DK-DH population, in which h2 decreased to 0.63 (Table 2). Similar assessments were observed when considering the trait heritability values within each population and trial combination (h2, model (5)). In this case, h2 was high and was always greater than 0.8, except for the Md11 trials (Additional file 4: Table S4). When the traits were considered in each population per trial × N combination, h2 (model (7)) was high for all of the traits, with generally higher h2 for the N2 condition than for the N1 condition (Additional file 4: Table S4).

Table 2 Results of the mixed linear model applied to each population for each trait considering all trials and N conditions as confounded (multi-environment model (1)), \( {\boldsymbol{P}}_{\boldsymbol{i}\boldsymbol{jklm}} = \boldsymbol{\mu} +\underline {{\boldsymbol{G}}_{\boldsymbol{i}}} + {\boldsymbol{N}}_{\boldsymbol{j}} + \underline {{\boldsymbol{T}}_{\boldsymbol{l}}}+\underline {{\boldsymbol{T}}_{\boldsymbol{l}}\left({\boldsymbol{R}}_{\boldsymbol{k}}\right)} + \underline {{\boldsymbol{G}}_{\boldsymbol{i}}\times {\boldsymbol{N}}_{\boldsymbol{j}}}+\underline {{\boldsymbol{G}}_{\boldsymbol{i}} \times {\boldsymbol{T}}_{\boldsymbol{l}}} + {\underline {{\boldsymbol{G}}_{\boldsymbol{i}}\times {\boldsymbol{N}}_{\boldsymbol{j}}\times \boldsymbol{T}}}_{\boldsymbol{l}} + \underline {{\boldsymbol{e}}_{\boldsymbol{i}\boldsymbol{jklm}}} \)

In addition, several of the traits showed strong correlations. For instance, the seed number/m2 was positively correlated with the seed yield (0.62 < rp < 0.93), with strong positive genetic correlations (0.85 < rg < 1.26) for all of the populations and all of the trials studied (Additional file 5: Table S5). As already known from previous studies, oil and protein contents always displayed strong negative correlations (−0.81 < rp < −0.34; −1.14 < rg < −0.54 in our study).

NUE and yield traits were strongly impacted by N, trial and G × trial interaction effects, whereas weak G × N interactions were observed

When considering the multi-environment model (1), significant genotype (G), trial (T) and G × T interaction effects were found for each trait in each of the populations (Table 2). A significant N effect was also detected for each trait × population combination, except for TSW in the DK-DH population. However, no significant G × N effect was detected regardless of the trait and population considered, except for TSW in the WOSR population. Finally, significant G × T × N effects were observed in almost all cases, except for DTF, TSW and seed number/m2 in the DK-DH population.

When considering models (3) and (4) for each population × trial combination, the G effects were always highly significant, and N had an effect on most of the traits (Additional file 4: Table S4). Moreover, some G × N interactions were detected with these models, although they were not highly significant for most of the traits (0.01 < p-value < 0.05). In addition, the G × N interactions were detected in the LR09 and Ch14 trials for the WOSR populations and in the Md11 trial for the DH populations. These results prompted us to obtain further insight into the genetic control of these traits for each population and each trial under N1 and N2 conditions.

Genetic analyses based on single-environment models revealed a high number of stable QTL between N nutrition conditions that were mostly trial-specific

The architecture of the genetic control of the seed yield and the genomic stability across environments was first assessed by analyzing the QTL detected in the single-environment model (6). A total of 946 GWAS-QTL were detected in the WOSR populations (486 and 460 for WOSR-92 and WOSR-69, respectively; Additional file 6: Table S6), and 184 LA-QTL were detected in the DH populations (138 and 46 for AM-DH and DK-DH, respectively; Additional file 7: Table S7). Most of the QTL were specific to a population structure, with only 35 loci in common between the DH and WOSR populations. In particular, one region located in the A5 linkage group was detected in the AM-DH and WOSR populations under both N fertilization conditions in 13 different environments (data not shown). In addition, a striking result was the significant proportion of loci controlling flowering time in the WOSR-92 population (63 and 12 % of the GWAS-QTL were associated with DTF in LR09 and LR10, respectively), as well as in the two DH populations (34 and 43 % in AM-DH and DK-DH, respectively). In contrast, no DTF-associated QTL were detected in the WOSR-69 population due to the lower MAF in this smaller population at the loci identified in the WOSR-92 population (data not shown). The DTF-associated QTL were generally highly stable across environments (data not shown).

The stability of the QTL for yield-related traits between N treatments was consistent with the level of N stress observed in each trial (Table 3). Indeed, for most of the trials in which no N stress was noted (i.e., LR09, LR10, LR11 and LR12), a large proportion of the QTL (37 to 70 %) were independent of the N fertilization level. In contrast, when the N stress was moderate to intense, as for LR13, Md11, Ch14, Dij14 and Pre14, a lower proportion of N-stable QTL (22 to 48 %) was found than when no N stress was present. However, there was one exception: in Dij13, a no-stress trial, only 18 % of the QTL were common between the two N treatments. These results suggest that additional environmental stresses occurred during the trials and that these additional factors interacted with the N nutrition level. We also examined QTL consistency across trials and showed that more than 50 % of the QTL were specific to a single trial (data not shown).

Table 3 Number of significant QTL detected for each trial × N combination and the consistency of the QTL across N nutrient conditions for the WOSR (A) and DH (B) populations

Because the QTL distributions were consistent with the N stress level or were trial-specific, we hypothesized that there was a pattern of QTL based on environmental conditions. We investigated this hypothesis by studying the QTL distribution over clusters of environments.

Three clusters of environments were defined for each population, and the QTL were mostly cluster-specific

The environments associated with the WOSR-69 population were clustered into three groups, with the two N nutrition conditions for each trial consistently grouped in the same cluster (Fig. 1a). The first cluster (cluster 1) grouped the Pre14, Sel14 and Ch14 trials, which represented 36.14, 27.17 and 22.25 % of the cluster, respectively; cluster 2 grouped the Ver14 and Ch14 trials (72.06 and 26.82 %, respectively); and cluster 3 grouped the LR09 and LR10 trials (40.37 and 44.44 %, respectively). Clusters 1 and 2 were characterized by early flowering (up to 15.33 days below the overall mean DTF value), whereas the LR10 trial in cluster 3 showed the latest flowering time (up to +9.2 days). The yields were lower in clusters 1 and 2 but higher in cluster 3 compared to the mean SY over all environments (from −1.38 to +0.13 t/ha for clusters 1 and 2 and from +0.4 to +1.13 for cluster 3). For the yield components, clusters 1 and 2 were characterized by a lower seed number/m2 but higher TSW relative to the mean values. The situation was reversed for cluster 3, with a greater number of smaller seeds. Regarding the seed composition traits, the most striking difference between the three clusters was the low protein content that was obtained for Ver14 in cluster 2 (around −4.5 %). Approximately 67 % of the loci that were detected in the GWAS were specific to one environmental cluster, 23 % were common to two clusters, and 10 % were common to three clusters. No loci were common to all clusters (Fig. 2a).

Fig. 1
figure 1

Clustering of the environments based on phenotypic traits. Clustering analysis was performed on the WOSR-69 (a) and AM-DH (b) populations. The projection of the individuals on the first two principal components is presented in the left chart, with individuals colored according to the environment to which they belong. Clusters were defined via hierarchical clustering analysis. Each cluster is characterized by one or several constituent environments (a, Env.), the proportion of each environment in the cluster (b, Clu./Env., expressed as %), and the differences in trait values between the mean of each environment in the cluster and the general mean (c, values given for the different traits evaluated in this study). The trait abbreviations (DTF, SY, SN, TSW, O, Pr, and O/Pr) and units are described in the Methods section

Fig. 2
figure 2

Consistency of the QTL between clusters of environments for the WOSR (A) and AM-DH (B) populations. The QTL were detected for each population and trait using the single-environment model (6) and were counted for each cluster. The numbers in the intersections of the Venn diagrams indicate the numbers of common QTL between the given two, three or four groups. Dij14 (chart a) and Dij13 (chart b) were considered independent groups because they were not included in the clustering. Stars indicate potentially over-estimated numbers due to overlapping trials between clusters 1 and 2 (chart a) or clusters B and C (chart b)

The environments associated with AM-DH were also clustered into three groups, with the two N nutrition conditions of each trial grouped in the same cluster (Fig. 1b). The first cluster (cluster A) was clearly associated with the Md11 trial, which represented 100 % of the cluster. LR12 was attributed to cluster B (55.68 %) and LR11 to cluster C (81.30 %), whereas LR13 was split between cluster B (43.10 %) and cluster C (18.70 %). Cluster A was characterized by early flowering compared to the mean flowering time over all environments (−5.60 to −6.54 days), a low TSW (−1.21 to −1.28) and a high seed number/m2 (+16,287 to +24,296). The opposite trend was observed for cluster B, which was characterized by a high TSW (+0.52 to +0.61) and a low seed number/m2 (−8,087 to −19,514). Cluster C was characterized by a higher seed oil content and a lower protein content than the two other clusters. Approximately 65 % of the loci detected in the linkage analyses were specific to one environmental cluster, 30 % were common to two clusters, 4 % were common to three clusters, and one locus was common to all clusters (Fig. 2b).

In conclusion, the loci detected previously via GWAS or linkage analyses were mainly specific to one environmental cluster. However, the QTL of a given cluster were generally distinct between the constitutive trials, suggesting that the QTL × trial interactions predominated over the QTL × cluster interactions.

The additive genetic control of yield-related traits was assessed by multi-environment model-based genetic analyses

The genetic analyses of the additive genotypic values as estimated for each population using the multi-environment model (1) produced a clear synthetic overview of the consistent QTL for traits related to seed yield and quality. Fifty-one stable QTL were thus identified; all of these QTL were previously detected using a single-trial genetic model (6), confirming their robustness (Additional file 8: Figure S1). Of these, 32 loci were obtained from the WOSR populations, 11 from AM-DH and 8 from DK-DH. The QTL were scattered in all linkage groups except for A7 and C4 (Table 4, Additional file 8: Figure S1). These regions included 27 QTL for seed yield, 7 for flowering time, 6 for seed oil content, 6 for TSW, 2 for seed number/m2, 2 for oil/protein ratio, and one for seed protein content.

Table 4 Results of the genetic analyses performed on the WOSR (A) or DH (B) populations for [a] the genotypic estimates extracted from the multi-environment model (\( {\boldsymbol{P}}_{\boldsymbol{i}\boldsymbol{jklm}} = \boldsymbol{\mu} +\underline {{\boldsymbol{G}}_{\boldsymbol{i}}} + {\boldsymbol{N}}_{\boldsymbol{j}} + \underline {{\boldsymbol{T}}_{\boldsymbol{l}}}+\underline {{\boldsymbol{T}}_{\boldsymbol{l}}\left({\boldsymbol{R}}_{\boldsymbol{k}}\right)} + \underline {{\boldsymbol{G}}_{\boldsymbol{i}}\times {\boldsymbol{N}}_{\boldsymbol{j}}}+\underline {{\boldsymbol{G}}_{\boldsymbol{i}} \times {\boldsymbol{T}}_{\boldsymbol{l}}} + \underline {{\boldsymbol{G}}_{\boldsymbol{i}}\times {\boldsymbol{N}}_{\boldsymbol{j}}\times {\boldsymbol{T}}_{\boldsymbol{l}}} + \underline {{\boldsymbol{e}}_{\boldsymbol{i}\boldsymbol{jklm}}} \) (model 1)) and [b] the ecovalence values calculated using the G × N or G × T model (\( {\boldsymbol{W}}_{\boldsymbol{i}} = {\displaystyle \sum_{\boldsymbol{i}}}\left({\boldsymbol{Y}}_{\boldsymbol{i}\boldsymbol{j}}-{\boldsymbol{Y}}_{\boldsymbol{i}.}-{\boldsymbol{Y}}_{.\boldsymbol{j}}+{\boldsymbol{Y}}_{..}\ \right)2 \) (model 8))

Nine of the seed yield QTL showed putative colocalizations with loci controlling other traits, including flowering time (A1 and C6), seed weight (A4), seed number/m2 (A5), oil content (A5, A9, C1 and C7), and protein content (C2) (Additional file 8: Figure S1).

Most of the QTL that were detected with model (1) were specific to a given population. Indeed, only three genomic regions were consistent between two different populations, including two loci controlling flowering time in the A2 and C6 linkage groups detected in the AM-DH and DK-DH populations and one locus for seed yield in A5 detected in the WOSR and AM-DH populations.

Exploration of the loci contributing to the G × N and/or G × T interactions according to the ecovalence traits

The genetic analyses performed using the ecovalence values that were calculated with the G × N or G × T models revealed the loci contributing to the interactions of the genotypes with the N nutrition condition or the trial, respectively. These analyses revealed 27 and 40 QTL controlling the G × N and G × T interactions, respectively, for the three populations (Table 4, Additional file 8: Figure S1).

Five, 12 and 10 G × N QTL were detected in the WOSR, AM-DH and DK-DH populations, respectively. Eighteen of these QTL colocalized with QTL that were detected using the multi-environment model (1), suggesting that these loci were involved in both the additive and the G × N components (Additional file 8: Figure S1). Some of the loci were previously found to be environment-specific based on the single-environment model (6). For instance, a QTL for TSW in the A5 LG, which was specific to the N2 condition based on the single-environment model, colocalized with a G × N QTL. On the other hand, five QTL for DTF, seed number/m2 and oil content in A1, A5, C6 and C7, which were detected under both N nutrition conditions using the single-environment model, colocalized with G × N QTL. These colocalizations may therefore reflect a modulation of the effects of the QTL between N nutrition conditions rather than the presence or absence of a relationship between treatments. In addition, nine G × N QTL did not colocalize with any other QTL and formed specific N-interactive loci (Additional file 8: Figure S1). These loci corresponded to N-specific QTL that were detected using the single-environment model (6) (data not shown).

Twenty-seven, 7 and 6 G × T QTL were detected for the WOSR, AM-DH and DK-DH populations, respectively. Fifteen of these QTL colocalized with QTL that had previously been detected using the multi-environment model. In addition, four QTL for seed yield and oil content in the A1, A4 and A5 LG that were previously found to be trial-specific based on the single-environment model colocalized with G × T QTL. Five QTL for seed yield and oil content in the A5, A9, C1 and C9 LG that were consistent across trials according to the single-environment model did not colocalize with any G × T QTL, confirming their stability. On the other hand, six QTL for DTF in the A1, A2, C2 and C6 LG that were detected across at least five trials colocalized with G × T QTL (Additional file 8: Figure S1). Twenty-four of these QTL did not colocalize with any of the fifty-one QTL detected using the multi-environment model and instead represented specific trial-interactive loci (Additional file 8: Figure S1).

In summary, colocalizations of QTL detected using the multi-environment model and genomic regions contributing to the G × N and G × T interactions pinpointed loci with additive effects modulated by the N fertilization condition (11 QTL), the trial (9 QTL) or both (6 QTL). Nine and twenty-five loci were specific to the G × N and G × T interactions, respectively, but no region specific to both interactions was detected.

Structural organization of the key loci controlling seed yield in the B. napus genome

To gain insight into the genomic organization of the main regions controlling seed yield, we first determined the homoeologous relationships between predicted genes within the QTL that were detected using the multi-environment model according to the B. napus Darmor-bzh reference genome [30]. We focused only on the forty-two QTL that were associated with seed yield-related factors per se, including flowering time, seed yield, seed number/m2, and TSW.

Homoeologous relationships were observed between twelve QTL in the A genome and eight QTL in the C genome (Fig. 3). Pairs of homoeologous QTL controlling the same traits were identified for seed yield (A3-C3; A5-C5; A9-C9) and TSW (A5-C5), as well as homoeologous QTL controlling different traits, such as DTF - TSW (A2-C2), seed yield - seed number/m2 (A1-C1), seed yield - TSW (A5-C4; A10-C5), and seed number/m2 - TSW (A5- C4/C5) (Fig. 3).

Fig. 3
figure 3

Circos diagram showing the homoeologous relationships between genes present within the main regions for yield-related traits. The 19 B. napus chromosomes (2n = 4x = 38; AACC) belonging to the A and C sub-genomes are represented in blue and red, respectively. The physical size of each chromosome (in Mbp) is indicated above the chromosome, and a ruler is drawn below, with larger and smaller tick marks every 10 and 2 Mbp, respectively. The forty-two QTL seed yield-related traits detected using model (1) (\( {\boldsymbol{P}}_{\boldsymbol{i}\boldsymbol{jklm}} = \boldsymbol{\mu} +\underline {{\boldsymbol{G}}_{\boldsymbol{i}}} + {\boldsymbol{N}}_{\boldsymbol{j}} + \underline {{\boldsymbol{T}}_{\boldsymbol{l}}}+\underline {{\boldsymbol{T}}_{\boldsymbol{l}}\left({\boldsymbol{R}}_{\boldsymbol{k}}\right)} + \underline {{\boldsymbol{G}}_{\boldsymbol{i}}\times {\boldsymbol{N}}_{\boldsymbol{j}}}+\underline {{\boldsymbol{G}}_{\boldsymbol{i}} \times {\boldsymbol{T}}_{\boldsymbol{l}}} + \underline {{\boldsymbol{G}}_{\boldsymbol{i}}\times {\boldsymbol{N}}_{\boldsymbol{j}}\times {\boldsymbol{T}}_{\boldsymbol{l}}} + \underline {{\boldsymbol{e}}_{\boldsymbol{i}\boldsymbol{jklm}}}\Big) \) are represented by bars on three runways, depending on the population in which the QTL was detected (WOSR-69, AM-DH or DK-DH) according to the following color code: flowering time (DTF) in pink; seed yield (SY) in green; seed number/m2 (SN) in blue; and thousand-seed weight (TSW) in orange. Each pair of homoeologous genes is represented by a line colored according to the following code: in gray, the homoeologous genes between QTL and regions carrying no other QTL; in blue, the homoeologous genes between QTL detected using model (1). The numbers of homoeologous genes present in each pair of homoeologous regions are indicated in the boxes

In addition, for each pair of homoeologous QTL controlling the same trait that were detected in the WOSR-69 population, the frequency of the favorable allele in the most significant positions of the QTL was compared between copies. Although two homoeologous QTL for seed yield in A9 and C9 showed similar frequencies of the favorable allele in the WOSR-69 population (A9: 0.48; C9: 0.49), the frequency of the favorable allele was unbalanced for the QTL pair for seed yield in the A5 and C5 LG (A5: 0.19; C5: 0.71). This result suggests the differential evolution of QTL copies between the two sub-genomes for some of the homoeologous regions.

Finally, each pair of QTL showed 5 to 796 homoeologous genes (Fig. 3) that may be considered candidate genes for the corresponding traits.

Discussion

The N nutrition level may not be the only limiting factor

Although several precautions were taken during cropping to limit the amount of mineral N in the soil in the experimental plots, the N stresses that were applied in the different trials were often moderate to absent based on the NNI values as measured during the bolting stage. This finding suggests that low-N nutrition conditions were not limiting in most trials, probably due to the highly favorable environmental conditions, such as mild climate, and/or soil features, which promote substrate mineralization. Nevertheless, a difference in the N nutrition level was observed in many trials, even for those showing a low stress level, which was reflected in the difference in seed yield between the two N fertilization conditions (approximately 0.50 t/ha).

This observation was confirmed by highly significant N effects on all of the traits in the mixed model analyses. In contrast, low G × N interactions were found in most of the population × trial combinations, and a majority of the QTL were stable between the two N conditions, with fewer QTL detected under limited N fertilization conditions than under the sub-optimal N nutrition conditions. These findings are consistent with previous results of few G × N [52, 53] or QTL × N [17] interactions in rapeseed but contradicts the observations of Nyikako et al. [54], who detected G × N interactions. The identification of G × N interactions primarily depends on the environments tested as well as on the genetic diversity of the populations studied (a high genetic variance may conceal the relative G × N effects).

At least five trials were set up to evaluate each population in our study. Strong trial (T) and G × T effects were found for most of the studied traits, probably due to the specific environmental conditions in each trial. For example, water stress was observed in the Md11 trials at the flowering stage, consistent with the results that the corresponding environments clustered independently and that the heritability values were lower than in the other environments. Thus, dense networks of trials as well as a precise agronomic characterization of the environments are needed for consistent phenotypic evaluation. Indeed, many QTL × T interactions were observed, with more than 50 % of the QTL being specific to a single cluster of environments and even to a single trial. In bread wheat, Kuchel et al. [55] reported that approximately one quarter of the G × E interactions could be explained by interaction of the QTL with climatic cofactors. El-Soda et al. [56] reported that most QTL displayed QTL × E interactions in Arabidopsis, although their effects on the trait values can vary (e.g., opposing effects, differences in intensity). Similarly, QTL × E interactions have also been found in apples [57], cotton [58], rice [59], and wheat [60], whereas a high consistency of QTL between environments was reported in maize [61]. These discrepancies may be explained by the heritability of the traits across the environments or by the environments tested.

Refining the genetic architecture of complex traits

How to increase the precision of QTL detection is a central question in genetic analyses. Two ways are addressed in the present work. First, the genetic analyses of additive effects using multi-environment models and second, the combination of the results from complementary approaches, such as linkage analyses and GWAS.

In the GWAS and linkage analyses that were performed on each environment (model 6), 1,130 loci that controlled one or several of the nine traits under study were detected, and these loci were distributed over the 19 B. napus linkage groups. The complexity of the QTL architecture for seed yield and the difficulty of handling massive QTL sets were also assessed by Shi et al. [62], who detected 870 QTL for seed yield in two rapeseed populations grown in ten environments. This information could be used to assess the impact of the trials and the environmental clusters on the genetic architecture of seed yield traits. However, the large number of trial-specific QTL hindered interpretation of the results. Multi-environment QTL analyses were suggested to overcome the impact of QTL × E interactions because these analyses are based on multiple environmental datasets and thus lead to more robust QTL than single-environment analyses [63]. In our study, the set of QTL detected using the single-environment model could be reduced to fifty-one robust QTL using the multi-environment model (1), and this restricted dataset allowed for a more synthetic overview of the major regions involved in the control of seed yield traits.

Combining linkage analyses and GWAS has been successfully reported for several species, including rapeseed [9, 64]. In our study, we showed that a small proportion of the QTL were consistent between population structures (DH versus WOSR populations), underlying the complementarity of the two approaches to decipher the genetic architecture of complex traits. Another method that has been successfully tested in maize [65] integrates both GWAS and linkage analyses into a single analysis (integrated mapping, or joint mapping), allowing several genetic parameters to be tested simultaneously. The development of new types of mapping populations calibrated to perform joint linkage analyses-GWAS, such as nested association mapping (NAM) [66] or multi-parent advanced generation inter-cross (MAGIC) populations [67], is ongoing in several plant species, including rapeseed, to optimize both the power of QTL detection and the diversity of the tested alleles.

Genetic analyses of the ecovalence values revealed the QTL × N and QTL × T interactions

In most studies in which the genetic architecture of a trait is analyzed under several treatments, direct comparisons of QTL or analyses of the differences in their effects between treatments are the most common methods employed. Genetic analyses with stability parameters, such as ecovalence, are another way to identify genomic regions involved in G × N or G × T interactions. This type of genetic analysis has previously been performed on barley [28], in which the colocalization of ecovalence-related QTL with earliness genes was determined. In our study, we found colocalizations between QTL for seed yield traits detected using the multi-environment model (1) and QTL for G × N or G × T interactions. These results suggest that these regions showed allelic variations conferring adaptation to specific environments. In addition, 9 and 25 QTL specific to G × N and G × T interactions, respectively, were detected, and these QTL represent candidate regions for increasing NUE and for promoting adaptation to different environmental conditions in rapeseed.

Genomic tools provide clues regarding the organization and gene content of seed yield-related loci

The newly released B. napus genomic sequence [30] has provided new directions for QTL analyses in terms of gene content. Therefore, we were able to compare the gene repertoire in the forty-eight yield-related QTL and to assess their homoeologous relationships within the whole rapeseed genome. Indeed, several other studies previously described the occurrence of duplicated QTL for glucosinolate content or stem canker in rapeseed [68], and these findings are consistent with the allopolyploid origin of the B. napus genome [69]. Thus, we aimed to address whether homoeologous regions control seed yield-related traits. Our results showed that twenty of the QTL displayed homoeologous genes within the confidence intervals of at least one QTL controlling a related trait.

The homoeologous genes covered by QTL for the same traits may be considered as promising candidates. However, due to the large confidence intervals obtained in our study, such genes remain numerous (up to 796 genes in our study) and difficult to analyze. One possible way to decipher trends of potential functions underlying these regions would be to use the Gene Ontology (GO) network [70]. Recently, Bargsten et al. [71] proposed a method of candidate gene prioritization for the analysis of 1,591 QTL regions found in rice associated with 231 different traits. They compared the occurrence of the biological functions of the genes found in the regions associated with one trait to the rest of the genome and retained the genes displaying significant over-representation of a given function. This method led to a ten-fold decrease in the number of putative candidate genes. Combining high-throughput gene annotations with QTL data could therefore lead to the elucidation of the underlying functions of the QTL.

Conclusions

We found that under our environmental conditions, the effect of the trial was greater than the effect of N nutrition level on seed yield-related traits. This result paves the way for the development and use of new indicators of plant and environmental status to assess the most limiting factors during sensitive stages of yield production. Nevertheless, in the present study, fifty-one novel main regions involved in the yield of rapeseed were identified and were stable across populations and environments. It appeared that these regions gathered QTL for different traits related to yield and its components, suggesting possible pleiotropic effects. To go further, an analysis of the gene content and the corresponding functions of the related genes may provide evidence indicating the molecular determinants of seed yield and yield-related traits in rapeseed.