Introduction

Seeds of oilseed rape (Brassica napus L.) are a source of high-quality edible oil and of protein-rich meal, which is used for livestock feeding. In terms of nutritional value, the meal of oilseed rape has an excellent, well-balanced composition of essential amino acids (Tan et al. 2011). However, the quality and digestibility of the meal are negatively influenced by the presence of high amounts of fiber in the seeds. According to the detergent system developed by Van Soest et al. (1991), three different fiber fractions are distinguished: neutral detergent fiber (NDF), acid detergent fiber (ADF) and acid detergent lignin (ADL). NDF mainly consists of cellulose, hemicellulose and lignin, ADF of cellulose and lignin, and ADL of lignin only. Through calculation of differences between these fractions, contents of cellulose and hemicellulose can be estimated. Traditional oilseed rape has a brown or black seed color, which is associated with high lignin contents in the seed hull (Qu et al. 2013; Carré et al. 2016). Animal nutritionists suggest that ADL is the most nutritionally relevant fiber fraction. It reduces the digestibility and the energy uptake both in ruminant and in monogastric animals (Kracht et al. 2004). Seed color mutants with a reduced ADL content were either identified within the oilseed rape gene pool or introduced through interspecific hybridization (Rashid et al. 1994; Rahman 2001; Rahman and McVetty 2011 and references therein). The yellow seed character has been reported to be associated not only with lower dietary fiber content but also with higher oil and protein content (Stringham et al. 1974; Simbaya et al. 1995; Rahman et al. 2001). During the past 30 years, developing yellow seeded oilseed rape has become an important breeding objective (Rahman and McVetty 2011).
However, despite intensive breeding efforts and reported higher seed oil and protein contents of yellow seeded oilseed rape, it has not yet been possible to establish competitive yellow seeded oilseed rape cultivars in the market. Correlations of ADL content to seed coat phenolic compounds suggest that low ADL content is associated with reduced seed coat thickness (Wittkop et al. 2012; Behnke et al. 2018). Multiple genes and loci have been reported to be involved in the inheritance of seed coat color and testa thickness not only in Arabidopsis thaliana but also in Brassica species (e.g. Badani et al. 2006; Stein et al. 2013; Yan et al. 2009; Liu et al. 2015; Taylor-Teeples et al. 2015; Wang et al. 2015, 2017; Körber et al. 2016).
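The estimation of cellulose and hemicellulose by difference between the detergent fiber fractions, as described above, can be illustrated with a minimal Python sketch. The function name and the example values are hypothetical and serve only to show the arithmetic; they are not data from this study.

```python
def fiber_components(ndf, adf, adl):
    """Estimate fiber components (% of defatted meal) by difference.

    NDF ~ cellulose + hemicellulose + lignin
    ADF ~ cellulose + lignin
    ADL ~ lignin
    """
    if not (ndf >= adf >= adl >= 0):
        raise ValueError("expected NDF >= ADF >= ADL >= 0")
    return {
        "hemicellulose": ndf - adf,  # NDF minus ADF
        "cellulose": adf - adl,      # ADF minus ADL
        "lignin": adl,               # ADL taken as lignin
    }


# Illustrative values only (assumption, not measured data)
example = fiber_components(ndf=25.0, adf=18.0, adl=8.0)
```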

The seed coat controls seed germination by acting as a barrier to water and/or oxygen, or by providing mechanical resistance to radicle protrusion. In Arabidopsis, inhibited germination was positively correlated with seed coat color, due to the presence of phenolic compounds (Debeaujon et al. 2000). In spite of all reported advantages of a yellow seed coat, the thinner testa in yellow-seeded types also means the seed is more prone to damage by various environmental factors (Zhou et al. 2020). Seeds of oilseed rape are reported to maintain germination capacity for 10 years and longer in the soil. Germination is prevented by induced secondary dormancy until seeds finally germinate under favorable conditions in the soil (Schatzki et al. 2013 and references therein). Seed longevity also declines during dry storage (Rajjou and Debeaujon 2008), and the ageing rate depends on seed moisture content, temperature, initial seed quality and genetic factors (Nagel et al. 2011, 2018). Seed longevity of yellow-seeded types tends to drop more rapidly than that of black-seeded ones (Debeaujon et al. 2003). Since naturally aged seed material is not always available, artificial seed ageing protocols are often used. Exposure of seeds to high temperature and high moisture conditions is a commonly used method for ageing seeds in the laboratory (Hay et al. 2019).

In different populations a large number of QTL on linkage groups A05, A06, A09, C05, C08, C09, for reduced lignin content and yellow seed color have been published (Badani et al. 2006; Stein et al. 2013; Wang et al. 2015, 2017; Behnke et al. 2018; Miao et al. 2019). Despite the identification of numerous loci affecting seed coat color in the allopolyploid genome of oilseed rape, none of the identified QTL have led to the development of competitive true yellow seeded rapeseed cultivars. It has been argued that loci for yellow seed character may have negative pleiotropic effects on seed germination, pre-harvest germination, seed longevity and vigor. The objective of this work was to study the inheritance of the yellow seed character in relation to seed germination, seed longevity, pre-harvest germination and other seed quality traits. For this purpose, two not yet characterized yellow seeded genetic resources were crossed to the same black seeded inbred line and the derived two doubled haploid (DH) populations were characterized following field experiments.

Material and methods

Plant material

Population I consisted of 75 DH lines derived from the cross of genotype 4042 × Express 617 and the two parents. Line 4042 is a winter oilseed rape of the Department of Crop Sciences at the University of Göttingen. Seeds of 4042 have a yellowish-brown appearance and a reduced content of Acid Detergent Lignin (ADL). They are high in glucosinolate content but have low erucic acid content in the seed oil. Express 617 is a black seeded inbred line of the German winter oilseed rape cultivar Express and seeds are of canola quality. Population II consisted of 135 DH lines derived from the cross DH1372 × Express 617. DH1372 is a yellow-seeded canola (‘00’) DH line derived from the cross between the two black seeded Canadian spring cultivars Star and Bolero (Burbulis and Kott 2005). Seed material of DH1372 was provided by Laima Kott (University of Guelph). Both DH populations were developed at the University of Göttingen. Since DH1372 is a spring genotype it could not survive winter in field trials performed in north-western Germany.

Field experiments

Population I was tested in field experiments in five environments, in Göttingen-Reinshof and Einbeck, both located in north-western Germany, in 2014, 2015 and 2016. Population II was tested in Göttingen-Reinshof in the three subsequent years 2015, 2016 and 2017. One hundred seeds were sown in single row observation plots. Fertilizer, fungicides and insecticides were applied following standard schemes at each location. At maturity, seeds were harvested and bulked from 10 main racemes of open pollinated plants from each plot. Seeds were dried at 30 °C to a moisture content of 5 to 8% and stored in a dry place at room temperature for further analysis. From population II, 224 DH lines were sown in 2015. Due to segregation of the spring rapeseed character and unfavorable weather conditions during winter 2015/16, seeds from only 135 DH lines were harvested.

Seed quality analysis

Near Infrared Reflectance Spectroscopy (NIRS) was applied to predict moisture, oil, protein, and glucosinolate content of the seed, and fiber contents of the defatted meal (NDFm, ADFm, and ADLm). NIRS measurements were performed using seed samples of 3 g in small ring cups and the FOSS monochromator model 6500 (NIRSystem Inc., Silverspring, MD, USA). Oil (%), protein (%), and glucosinolate (µmol/g seed) contents were predicted at 91% seed dry matter content using the commercial calibration equations of raps 2012.eqa provided by VDLUFA Qualitätssicherung NIRS GmbH (Am Versuchsfeld 13, D-34128 Kassel, Germany). Total seed oil and protein content (oil + P) was obtained as the sum of oil and protein content. The values of oil and protein content were further used to calculate the protein content of the defatted meal (PDM): PDM (%) = [% protein/(100 − % oil)] × 100. NDFm, ADFm and ADLm contents were determined by the calibration equations fibr2013.eqa (Dimov et al. 2012). All fiber content values are given as percentage of fiber in the defatted meal.
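The two derived traits, the sum of oil and protein and the protein content of the defatted meal, can be written as a short Python sketch. The function names are hypothetical; the example uses the oil and protein contents reported for Express 617 in the Results only to illustrate the arithmetic.

```python
def oil_plus_protein(oil_pct, protein_pct):
    """Sum of seed oil and seed protein content (%)."""
    return oil_pct + protein_pct


def protein_in_defatted_meal(oil_pct, protein_pct):
    """PDM (%) = protein / (100 - oil) * 100, with oil and protein in % of seed."""
    if not 0 <= oil_pct < 100:
        raise ValueError("oil content must be in the range [0, 100)")
    return protein_pct / (100.0 - oil_pct) * 100.0


# Oil and protein content of Express 617 as given in the text
pdm_express = protein_in_defatted_meal(oil_pct=48.8, protein_pct=16.9)
total_express = oil_plus_protein(48.8, 16.9)
```

With these inputs the protein content of the defatted meal is about 33%, which shows how strongly a high oil content raises the protein concentration in the meal once the oil is removed.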

Determination of other seed traits

Thousand seed weight (TSW) was calculated from the weight of 500 seeds. Percentage of pre-harvest germination (PHG) was determined by counting the number of seeds showing pre-harvest germination in a sample of 100 randomly selected seeds. Seed color was visually scored on a scale from (3) for yellow brown, (5) for brown, (7) for dark brown to (9) for black, also using intermediate scores. To avoid effects caused by primary seed dormancy, the germination test was performed with seeds that had been stored for 6 weeks after field harvest. The germination test was carried out in Petri dishes (92 × 16 mm, Sarstedt, reference code 82.1473) using customized filter paper (90 mm in diameter, Macherey-Nagel, GmbH & Co. KG, reference code 400,866,009.1) with 50 indented holes each to hold seeds. Fifty randomly chosen intact seeds per DH line and Petri dish were tested. Mean values for 100 seeds were used in statistical analysis of seed germination. Twelve milliliters of de-ionized water were carefully added. Petri dishes were covered with their lids and placed in large plastic trays covered with cellophane to reduce evaporation. The trays were placed in darkness in a germination chamber at a temperature of 16.5–17.5 °C. After 10 days, seeds were inspected and radicle protrusion (%), germination (%), contamination with bacteria or fungi (%) and hypocotyl length were determined. Germination was scored when the radicle was elongated (> 1 cm) and both cotyledons were outside of the seed coat. Radicle protrusion was scored when the radicle had visibly elongated and protruded from the seed coat, but the cotyledons were not yet swollen and were still embedded within the seed coat. Contaminated seeds were identified by bacterial/fungal growth around seeds on the filter paper. Based on slime or mycelium formation around the seeds, contaminations were considered to be caused by bacteria or fungi, respectively.
Furthermore, average hypocotyl length in cm was measured as a proxy for seedling vigor. Seed germination after artificial ageing was determined following the ‘controlled deterioration’ procedure of Hay et al. (2008). Seeds were equilibrated above 8.7 molar non-saturated lithium chloride (LiCl) solution at 20 °C to reach 47% relative humidity (RH) for 10 days. Then, seeds were placed above 7.1 molar LiCl solution at 45 °C to increase the RH to 60% and were kept for 50 days. Afterwards, two replicates of 50 seeds each were germinated for 10 days in Petri dishes under the same conditions and were inspected for the same traits as described above.

Non-targeted metabolite analysis of segregating bulks of the population 4042 × Express 617

Non-targeted metabolite fingerprinting according to Feussner and Feussner (2019) was applied to detect differences in the metabolite abundance between yellow and black seeded DH lines of population I. Fully mature dry seeds derived from DH lines with contrasting low and high ADLm contents were analyzed. However, no clear differences in relative metabolite abundance between the two groups were found. Therefore, 15 DH lines each with low and with high ADLm content in mature seeds were sown in the greenhouse. At BBCH plant developmental stage 78 to 79 (fully grown siliques with green seeds; three to four weeks after self-pollination; Meier 2001; Hajduch et al. 2006) siliques were harvested and carefully opened with a scalpel. Green seeds were collected and immediately frozen in liquid nitrogen. Frozen seeds of five plants each with low and high ADLm content were bulked to give in total six bulks of seeds (three biological replicates each with low and high ADLm content). In order to check whether identical or different changes in relative metabolite abundance occur in DH1372, green seeds of parental genotype DH1372 were also included with three biological replicates in the analysis. For each replicate bulk, 150 seeds were homogenized using a Mixer Ball Mill MM200 (Retsch GmbH). For metabolite extraction, 80 mg of the frozen seed material were subjected to an adapted two-phase extraction with methyl-tert-butylether (MTBE), methanol and water (Feussner and Feussner 2019). Metabolite fingerprinting was performed with a UHPLC 1290 Infinity (LC, Agilent Technologies) coupled to a high-resolution mass spectrometer (HR-MS, 6540 UHD Accurate-Mass Q-TOF, Agilent Technologies) with Agilent Dual Jet Stream Technology as electrospray ionization (ESI) source (Agilent Technologies).
For chromatographic separation of the samples from the polar as well as the non-polar extraction phase, an ACQUITY HSS T3 column (2.1 × 100 mm, 1.8 μm particle size, Waters Corporation) was used as stationary phase at 40 °C. The solvents A (water with 0.1% (v/v) formic acid) and B (acetonitrile with 0.1% (v/v) formic acid) were used to set-up the following gradient: 0 to 3 min: 1% to 20% B; 3 to 8 min: 20% to 100% B; 8 to 12 min: 100% B; 12 to 15 min: 1% B. The flow rate was set to 500 µl/min. The Q-TOF MS instrument was used in a range from m/z 50 to m/z 1700 with a detection frequency of 4 GHz, a capillary voltage of 3000 V, nozzle and fragmentor voltage of 200 V and 100 V, respectively. The flow of drying gas was set to 8 l/min and sheath gas to 8 l/min, respectively. The sheath gas was set to 300 °C, and drying gas to 250 °C. Data were acquired in positive as well as negative ESI mode with Mass Hunter Acquisition B.03.01 (Agilent Technologies). Profinder B.08.02 (Agilent Technologies) was used to generate data matrixes. The MarVis-Suite toolbox (Kaever et al. 2015, http://marvis.gobics.de/) was used for data processing, statistics, data mining and visualization. First, metabolite features with a false discovery rate (FDR) < 0.005 were selected. Overall, 486 features obtained from positive ESI mode as well as 254 features from negative ESI mode of the polar extraction phase and 225 features from positive ESI mode as well as 104 features from negative ESI mode of the non-polar extraction phase were selected, merged and used for clustering. The accurate mass information of the selected metabolite features was used for metabolite annotation (KEGG, http://www.kegg.jp and BioCyc, http://biocyc.org). Verification of the chemical structure of the annotated metabolite markers was performed by LC-HR-MSMS analyses (Online Resource 1) as described by Feussner and Feussner (2019). 
A principal component analysis (PCA) was performed with the filtered data set of 1,069 metabolite features (FDR < 0.005) using the software MetaboAnalyst (Chong et al. 2019, https://www.metaboanalyst.ca/).

Bulk segregant SNP-marker analysis

Frequency distributions of seed ADLm content of the DH populations 4042 × Express 617 and DH1372 × Express 617 showed a bimodal type of frequency distribution (Online Resource 2) which suggested segregation of one major gene controlling this trait. A bulk segregant analysis was performed using a proprietary Brassica 20 K Illumina Infinium™ SNP array (KWS SAAT SE, www.kws.de) for population I and the Brassica 15 K Illumina Infinium™ SNP array of TraitGenetics GmbH (www.traitgenetics.com) for population II. For population I, four bulks each consisting of 5 DH lines with low ADLm content and four bulks each consisting of 5 DH lines with high ADLm content were analyzed. For population II, three bulks with low and two bulks with high ADLm content, each consisting of 5 DH lines, were analyzed. SNP markers specific for the low and high ADL bulks were identified. Selected SNP markers (cf. Fig. 1) were used to genotype the complete populations applying the KASP marker technology (LGC group; www.lgcgroup.com/genomics). SNP marker sequences were provided by Isobel Parkin (AAFC, Saskatoon, Canada; Clarke et al. 2016) and KASP markers were ordered from the LGC group (www.lgcgroup.com/genomics). KASP marker analysis was performed following the supplier's instructions in 96 well plates using a CFX96 Touch Real-Time PCR Detection System. SNP sequences were aligned against the Darmor-bzh B. napus reference genome v4.1 (http://www.genoscope.cns.fr/brassicanapus/, Chalhoub et al. 2014) and against the ZS11 reference sequence (https://www.ncbi.nlm.nih.gov/assembly/GCF_000686985.2/, Sun et al. 2017). Most likely positions were selected from the BLAST hits considering best matching and highest possible E-value as well as genetic map data information (Fig. 1). SNP marker positions were also compared with the positions listed in the supplementary material of Clarke et al. (2016). Annotation of the B. napus gene sequences was performed by Boas Pucker based on the Araport11 complete reannotation of the Arabidopsis thaliana reference genome (Cheng et al. 2017).
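The selection of bulk specific SNP markers can be pictured with the following minimal Python sketch. It is not the procedure used with the array software; it assumes, for illustration, that each bulk is represented as a mapping from marker name to a genotype call ("AA", "BB", "AB" or None for a failed call), and it keeps only markers that are homozygous and identical within each group but differ between the low and the high ADLm group. Marker names and calls in the example are hypothetical.

```python
def bulk_specific_markers(low_bulks, high_bulks):
    """Return markers that separate the low from the high bulks.

    low_bulks, high_bulks: lists of dicts {marker_name: call}, where call
    is "AA", "BB", "AB" or None (assumed data layout, for illustration).
    """
    all_bulks = low_bulks + high_bulks
    # Only consider markers that were scored in every bulk
    shared = set.intersection(*(set(bulk) for bulk in all_bulks))
    selected = []
    for marker in sorted(shared):
        low_calls = {bulk[marker] for bulk in low_bulks}
        high_calls = {bulk[marker] for bulk in high_bulks}
        # Require one homozygous call per group ...
        if len(low_calls) != 1 or len(high_calls) != 1:
            continue
        if not (low_calls | high_calls) <= {"AA", "BB"}:
            continue
        # ... and different calls between the two groups
        if low_calls != high_calls:
            selected.append(marker)
    return selected


# Hypothetical calls for two low and two high ADLm bulks
low = [
    {"snp_1": "AA", "snp_2": "AA", "snp_3": "AB"},
    {"snp_1": "AA", "snp_2": "BB", "snp_3": "AA"},
]
high = [
    {"snp_1": "BB", "snp_2": "BB", "snp_3": "BB"},
    {"snp_1": "BB", "snp_2": "BB", "snp_3": "BB"},
]
specific = bulk_specific_markers(low, high)
```

In this toy example only snp_1 is retained, because the other two markers are inconsistent among the low bulks.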

Fig. 1 Physical map with SNP markers on chromosome C03 that allows distinguishing bulks of genotypes with low and high ADL content in the two different DH populations. Physical positions of SNP markers and the cinnamate-4-hydroxylase genes (BnaC03g16950D and BnaC03g16960D) are based on the Darmor-bzh reference genome (Chalhoub et al. 2014)

Statistical analysis

Analysis of variance and estimation of heritability values were performed with the PLABSTAT software (Utz 2011). Both environment and genotype factors were considered as random variables. The general model for analysis of variance is as follows:

$$Y_{ij} = \mu + g_{i} + e_{j} + ge_{ij}$$

where Yij is the observation of genotype i in environment j; µ is the general mean; gi and ej are the effects of genotype i and environment j; geij is the interaction of genotype i with environment j. Broad sense heritability (h²) was calculated as follows: h² = σ²G/(σ²G + σ²GE/E), where σ²G is the variance component for genotype, σ²GE is the variance component for the genotype × environment interaction, and E is the number of environments. Least significant differences (LSD5%) and Spearman's rank correlation coefficients between trait means were calculated with the PLABSTAT software (Utz 2011). Newman-Keuls comparisons of contrasting groups were performed with the STATISTICA data analysis software system (version 13.1; StatSoft Inc. 2016) using mean values over all environments.
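A minimal Python sketch of the heritability calculation is given below. It does not reproduce PLABSTAT; it assumes one value per genotype and environment, so that the genotype × environment interaction is confounded with the residual error, and it estimates the variance components from the mean squares of a two-way analysis of variance. Function names and the toy table are hypothetical.

```python
def variance_components(table):
    """Estimate (var_g, var_ge) from a genotype x environment table.

    table: list of rows, one row per genotype, one value per environment.
    Assumes one observation per cell, so var_ge includes residual error.
    """
    n_gen, n_env = len(table), len(table[0])
    grand = sum(sum(row) for row in table) / (n_gen * n_env)
    gen_means = [sum(row) / n_env for row in table]
    env_means = [sum(row[j] for row in table) / n_gen for j in range(n_env)]
    ss_g = n_env * sum((m - grand) ** 2 for m in gen_means)
    ss_ge = sum(
        (table[i][j] - gen_means[i] - env_means[j] + grand) ** 2
        for i in range(n_gen)
        for j in range(n_env)
    )
    ms_g = ss_g / (n_gen - 1)
    ms_ge = ss_ge / ((n_gen - 1) * (n_env - 1))
    var_ge = ms_ge
    # E(MS_G) = var_ge + E * var_g; negative estimates are set to zero
    var_g = max((ms_g - ms_ge) / n_env, 0.0)
    return var_g, var_ge


def broad_sense_heritability(var_g, var_ge, n_env):
    """h2 = var_g / (var_g + var_ge / E)."""
    denominator = var_g + var_ge / n_env
    if denominator == 0:
        raise ValueError("variance components are both zero")
    return var_g / denominator


# Toy table: three genotypes in two environments, purely additive effects
toy = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
vg, vge = variance_components(toy)
h2_toy = broad_sense_heritability(vg, vge, n_env=2)
```

Because the toy table contains no interaction, the estimated interaction variance is zero and the heritability equals one. With more environments the term σ²GE/E shrinks, which is why the number of environments enters the formula.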

Results

DH population of the cross 4042 × Express 617

Significant genotypic variation within this population was detected for all seed quality traits without and with artificial ageing treatment, except for seed contamination without artificial ageing treatment (Online Resource 3). Heritabilities ranged from modest for glucosinolate content (0.26) to high for seed fiber content (0.81 to 0.86) and seed color (0.95). Parental line Express 617 had a significantly higher oil content (48.8%) and lower seed protein content (16.9%) compared to line 4042 (40.7% and 18.9%, respectively; Table 1). Significant and relatively large differences between the parental lines and within the DH population were found for the seed fiber constituents NDFm, ADFm and ADLm and for seed color. Line 4042 had significantly lower fiber contents and a lower seed color score compared to black seeded Express 617, and a comparable range was also detected in the DH population. Notably, seeds of line 4042 showed much higher contamination than those of Express 617. Both parental genotypes had a seed germination above 97%. In the DH population seed germination ranged from 81.5 to 100% with a mean of 98.4%. The artificial ageing treatment widened the germination range to 20.8 to 86.2% and reduced the population mean to 62.8%. This was accompanied by an increase in seeds showing only radicle protrusion (i.e. incomplete germination) and a reduction of hypocotyl length by more than half. Furthermore, the artificial ageing treatment increased the percentage of seeds with contaminations in the population from 3.2 to 11.0%.

Table 1 Descriptive statistics of the DH population 4042 × Express 617 (n = 75; mean values over five environments)

Spearman rank correlation analysis showed that ADLm content was closely and positively correlated with ADFm and NDFm content. The three fiber fractions were slightly positively correlated with seed germination (for ADLm rS = 0.21), and this correlation became significant after artificial seed ageing (rS = 0.45**). In contrast, all fiber fractions were negatively correlated with pre-harvest germination, radicle protrusion and contaminated seeds (Table 2), and similar correlations were obtained following artificial ageing treatment. Notably, the correlation between ADLm content and hypocotyl length became significantly positive only after artificial seed ageing.
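For readers unfamiliar with the statistic, Spearman's rank correlation coefficient is the Pearson correlation of the ranks of two traits. The following self-contained Python sketch illustrates the calculation with average ranks for ties; it is an illustration only and not the PLABSTAT implementation, and the function names are hypothetical.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    return pearson(average_ranks(x), average_ranks(y))
```

Because only ranks enter the calculation, the coefficient is insensitive to the skewed distribution of ADLm content reported for both populations.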

Table 2 Spearman rank correlation coefficients of means over five environments between NDF, ADF and ADL content and other seed quality traits of DH population 4042 × Express 617 (n = 75)

The ADLm content was not normally distributed. Its frequency distribution was skewed to the right (Skewness 0.46, p = 0.10; Online Resource 2) and suggested a bimodal segregation with a possible separation of two groups at 10% ADLm content. Comparing the two groups separated at 10.0% ADLm content confirmed significant differences between the two groups for seeds with contaminations and for all traits recorded following artificial seed ageing (Table 3). Separation of the two groups also showed that a larger number of DH lines with a low ADLm content than with a high ADLm content had regenerated from microspore culture (Table 3). The clear differences between the low and high ADLm groups and the bimodal segregation of ADLm content in the DH population suggested that the difference might be caused by one major QTL. Bulk segregant analysis of DH lines with low and high ADLm content using the 20 K Illumina SNP-chip located bulk specific SNP markers on chromosome C03 (Online Resource 4; Fig. 1). Three SNP-markers were selected and KASP-marker analysis was performed on the complete population of 75 DH lines. Apart from some overlap at 10% ADLm content, the KASP markers unequivocally distinguished DH lines of the low and high ADLm groups (data not shown).
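The skewness statistic and the separation of DH lines at a threshold can be sketched in Python as follows. The skewness definition used by the original software is not stated in the text; the sketch assumes the simple moment coefficient. Lines exactly at the threshold are assigned to the high group, which is also an assumption. Line names and values are hypothetical.

```python
def sample_skewness(values):
    """Moment coefficient of skewness, m3 / m2**1.5 (assumed definition)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


def split_by_threshold(adl_by_line, threshold):
    """Split DH lines into low (< threshold) and high (>= threshold) groups."""
    low = sorted(name for name, v in adl_by_line.items() if v < threshold)
    high = sorted(name for name, v in adl_by_line.items() if v >= threshold)
    return low, high


# Hypothetical ADLm contents (% of defatted meal) of three DH lines
adl = {"DH_a": 8.2, "DH_b": 11.5, "DH_c": 9.9}
low_group, high_group = split_by_threshold(adl, threshold=10.0)
```

A positive skewness value indicates a longer tail towards high values, as reported for ADLm content in both populations.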

Table 3 Means over five environments of two groups of genotypes with low ADL content (< 10%) and high ADL content (> 10%) (DH population 4042 × Express 617; n = 75; Newman-Keuls test)

DH population of the cross DH1372 × Express 617

Seed quality analysis revealed significant effects for the genotype and the environment for most of the traits of the DH1372 × Express 617 DH population (Online Resource 3). Heritabilities ranged from 0.22 for pre-harvest germination to 0.86 for seed color. Compared to the 4042 × Express 617 DH population, relatively low heritabilities were obtained because experiments were performed in only three environments. As for the first DH population, the frequency distribution of the ADLm content of the DH1372 × Express 617 DH population was significantly skewed to the right (Skewness 0.51, p = 0.05; Online Resource 2). Again, the distribution suggested a bimodal segregation with a possible separation of two groups at 9.0% ADLm content. Separating the two groups at 9% ADLm content again showed a much larger group of DH lines with low ADLm content (n = 94) compared to the group with high ADLm content (n = 41). Seeds of the two different groups of DH lines were further analyzed for their germination without and with artificial ageing (Table 4). The low ADLm group had significantly higher contents of oil and of the sum of oil and protein. Furthermore, the low ADLm group had a significantly higher percentage of contaminated seeds both without and with artificial ageing. After the artificial ageing treatment, hypocotyl length after 10 days of germination decreased from 4 cm to less than half. As for population 4042 × Express 617, seeds with low ADLm content had significantly shorter hypocotyls after artificial ageing treatment. However, in contrast to population 4042 × Express 617, the seed ageing treatment for 50 days did not affect germination of DH1372 × Express 617. There was also a lack of correlations between fiber contents and seed germination after artificial ageing (Table 5). Bulk segregant analysis of DH lines with low and high ADLm content using the 15 K Illumina SNP-chip located bulk specific SNP markers on chromosome C03 (Online Resource 4; Fig. 1).
Surprisingly, the identified region overlapped with the region identified in population 4042 × Express 617. Six SNP-markers were selected and KASP-marker analysis was performed on the whole population. Apart from some overlap at 9% ADLm content, the KASP markers unequivocally distinguished DH lines of the low and high ADLm groups (data not shown).

Table 4 Means over three environments of DH population DH1372 × Express 617 (n = 135) of two groups of genotypes with low ADLm content (< 9.0%) and high ADLm content (> 9.0%) (Newman-Keuls test)
Table 5 Spearman rank correlation coefficients of means over two environments between NDFm, ADFm and ADLm content and other seed quality traits of DH population DH1372 × Express 617

Non-targeted metabolite analysis of the DH population 4042 × Express 617

Non-targeted metabolome analysis of green seeds of the contrasting bulks for high and low ADLm content of the DH population 4042 × Express 617 and the low ADLm genotype DH1372 resulted in 1069 significantly different metabolite features (FDR < 0.005). The intensity profiles of these features were clustered and visualized by means of 1D-SOMs (one-dimensional self-organizing maps, feature based clustering) and organized into six clusters (Online Resource 5). Clusters 1 to 3 represent 732 metabolite features which are characterized by higher abundance in genotype DH1372 in comparison to both bulks of the DH population 4042 × Express 617. The 279 features summarized in clusters 4 and 5 show the inverse pattern with higher abundance in both bulks of the DH lines (low and high ADLm content). Cluster 6 contains 58 metabolite features that show high intensities in the high ADLm bulk in contrast to the low ADLm bulk and the low ADLm genotype DH1372. Since LC-ESI-MS analyses tend to form a considerable number of adducts, the 58 features of cluster 6 may represent in total about 15 metabolites. Low ADLm genotype DH1372 displayed a completely different metabolite profile in comparison to the 4042 × Express 617 low and high ADL bulks. This is not surprising, because genotype DH1372 is a Canadian spring DH-line that is genetically distant from European winter oilseed rape material. However, DH1372 showed a similarly low metabolite intensity in cluster 6 as the low ADL bulk of 4042 × Express 617. The accurate mass of the metabolite features represented in cluster 6 was used for an automated data base search (against BioCyc and KEGG) to assign tentative identities to the features. Exclusively metabolites of the procyanidin biosynthesis were identified as enriched in the high ADLm bulk. The identities of leucocyanidin, catechin and the procyanidin-oligomers including two to five procyanidin units were unequivocally confirmed by LC-HR-MSMS analysis (Online Resource 5).
Other precursors of the procyanidin pathway like phenylalanine, cinnamic acid, coumaroyl-CoA, chalcone or flavanones could not be detected among the metabolites in cluster 6. Sample-based clustering of the data set by a PCA showed a clear separation on principal component 1 (PC1) between the Canadian spring genotype DH1372 and the European winter oilseed rape material (4042 × Express 617), while on PC2 all three sample groups (DH1372, the low and the high 4042 × Express 617 bulks) were separated (Online Resource 6).

Identification of candidate genes

Bulk segregant analysis allowed confining the physical region responsible for reduced ADLm content on chromosome C03 in both populations. Using the SNP-marker sequences, the physical region between the first and last marker was characterized. In the Darmor-bzh reference genome the physical position of the first marker Bn-scaff_21778_1-p215559 was at 5,012,004 bp and that of the last marker Bn-scaff_21312_1-p1309658 was at 9,145,606 bp (Fig. 1). In the ZS11 reference genome this region spanned from 10,393,115 to 18,397,417 bp. The physical region between the two flanking markers of the Darmor-bzh (Online Resource 7) and the ZS11 reference genome sequence (Online Resource 8) was inspected for candidate genes known to be involved in the phenylpropanoid and lignin biosynthetic pathway. Candidate genes comprised a number of Myb transcription factors (myb14, myb19, myb82, myb49, myb120, myb36, myb119, myb59) and a cinnamate-4-hydroxylase gene (Fig. 1) that are marked yellow in Online Resources 7 and 8. The most likely candidate gene is the cinnamate-4-hydroxylase gene at position 8,615,302 bp in Darmor-bzh and at position 17,639,366 bp in ZS11. In both reference genomes, this gene is present as a tandem duplicate. Although the two genes in Darmor-bzh (BnaC03g16950D and BnaC03g16960D) are outside flanking marker Bn-scaff_18322_1-P1044275, the distance to this marker is less than 600 kb. In ZS11 the two genes BnC03g0556270.1 and BnC03g0556290.1 are located well within the region delimited by the group specific SNP-markers. BLAST of the ZS11 gene sequences against the Darmor-bzh genome showed that the genes corresponded to each other in the same order. A MATE efflux family protein gene (BnaC03g18950D) was identified in the Darmor-bzh genome slightly outside the group specific marker range at position 9,813,528 bp. In the ZS11 genome the MATE efflux family protein gene BnC03g0554010.1 was identified at 15,788,740 bp.
The transparent testa gene BnaC03g20650D is located a little outside the flanking markers, at 10,964,534 bp in Darmor-bzh and at 19,126,045 bp in ZS11.
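The position of the candidate genes relative to the marker interval can be checked with a few lines of Python. The sketch uses only the physical positions given in the text and compares them with the overall range between the first and last marker of each reference genome. The position of the population specific flanking marker Bn-scaff_18322_1-P1044275 is not given in the text and is therefore not evaluated. The function name is hypothetical.

```python
# First and last marker positions (bp) on chromosome C03 as given in the text
DARMOR_REGION = (5_012_004, 9_145_606)
ZS11_REGION = (10_393_115, 18_397_417)


def locate(position, region):
    """Return ("inside", 0) or the side and distance (bp) to the interval."""
    start, end = region
    if position < start:
        return ("upstream", start - position)
    if position > end:
        return ("downstream", position - end)
    return ("inside", 0)


# Gene positions (bp) taken from the text
c4h_darmor = locate(8_615_302, DARMOR_REGION)
mate_darmor = locate(9_813_528, DARMOR_REGION)
tt_darmor = locate(10_964_534, DARMOR_REGION)
c4h_zs11 = locate(17_639_366, ZS11_REGION)
mate_zs11 = locate(15_788_740, ZS11_REGION)
tt_zs11 = locate(19_126_045, ZS11_REGION)
```

Relative to this overall range, the cinnamate-4-hydroxylase position falls inside the interval in both reference genomes, whereas the transparent testa gene lies beyond the last marker in both.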

Discussion

Development of oilseed rape cultivars with a reduced fiber content and increased oil and protein content is a desirable breeding objective. Genetic variation in oilseed rape may affect seed hull proportion and phenylpropanoid content of the seed hull. Distribution of the fiber components is unbalanced in the seed. Fiber content is especially high in seed hull which contains around 70% of NDF, 80% of ADF and more than 90% of the lignin (ADLm) content of the seed (Carré et al. 2016). Through technical dehulling of seeds, protein content in defatted meal can be increased from about 38% to 48% (Carré et al. 2016).

Despite existing genetic variation for seed fiber content, breeding of competitive yellow seeded oilseed rape cultivars has not yet been successful. Disadvantages of seeds with lower fiber content may include lower germination, increased pre-harvest germination, reduced seed longevity and vigor (Zhou et al. 2020). These traits negatively affect the crops performance in the field. Furthermore, a thinner seed hull may lead to damaged seeds by combined harvesting which in turn may reduce oil quality by increasing free fatty acid content. Instead of looking for major genes causing drastic reduction of seed hull and phenylpropanoid content, breeders may combine different minor loci for reduced fiber content to develop breeding lines, which do not show any of the mentioned disadvantages. Detecting effects of minor loci and combining them for reduced fiber content can be quite cumbersome and time consuming. However, breeding becomes easy, once a number of alleles for reduced fiber content are fixed in breeding material. Surprisingly, no negative pleiotropic effects of the bright yellow seed color of Yellow Sarson (B. rapa) on pre-harvest germination and seed longevity has been reported. Notably, Bagheri et al. (2012) mapped in a RIL population derived from a cross between a black seeded B. rapa line and the yellow seeded B rapa Yellow Sarson a major QTL for yellow seed color on chromosome A09 explaining 54% of the phenotypic variation. Furthermore, Bagheri et al. (2012) also investigated in the same RIL population the inheritance of pre-harvest germination (vivipary) and found no correlation to the yellow seed character in the population. Furthermore, although the largest QTL for pre-harvest germination was identified at the same chromosome A09, it mapped more than 50 cM away. Together, this indicates that at least for this locus there is no negative pleiotropic effect of the yellow seed color alleles on pre-harvest germination. 
Previously, another major QTL for pre-harvest germination was identified on chromosome N11 of B. napus (Feng et al. 2009). In the present work, only a slight negative correlation and no correlation between ADLm content and pre-harvest germination were found for the first and second population, respectively, which supports an inheritance of this trait that is independent of reduced lignin content. Interestingly, Bagheri et al. (2012) discussed that the major QTL for yellow seed color on chromosome A09 of the B. rapa population may be the same as a QTL on A09 causing a reduced ADL content in a B. napus DH population (Liu et al. 2012; Stein et al. 2013). It was suggested that a Cinnamoyl-CoA Reductase (CCR1) locus may be responsible for this effect. In the phenylpropanoid pathway, CCR1 catalyzes the reduction of hydroxycinnamoyl-CoA esters such as 4-coumaroyl-CoA to the corresponding aldehydes, the entry step into monolignol biosynthesis. It would be interesting to test whether resynthesized B. napus lines derived from crosses between Yellow Sarson and black seeded B. oleracea genotypes (cf. Jesske et al. 2013) would show reduced lignin contents in the seed hull.

In both DH populations, a QTL was identified in very similar regions of chromosome C03. It remains unclear whether the two QTL were caused by mutations in the same or in different genes. Since one genotype was derived from Canadian spring canola material (Burbulis and Kott 2005) and the other genotype was a Göttinger resource with high glucosinolate content, it appears unlikely that the same genes were affected in both genotypes. Assuming that the QTL in both genotypes are located at a very similar position, it would require a large segregating F2 population to separate the two loci. Reciprocal crossing of both homozygous genotypes to produce F1 seeds would also not be informative, because the seed hull is maternal tissue. Assuming the genetic causes for low lignin content are different in DH1372 and 4042, one could estimate additive and synergistic effects only if the composition of the seed hull were determined by the genotype of the embryo.
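To give a rough sense of why a large F2 population would be needed, the following minimal sketch computes how many F2 plants are required to observe at least one recombinant gamete between two tightly linked loci. It is an illustration under stated assumptions, not an analysis from the present study: the recombination fraction r between the two loci is hypothetical (it was not estimated here), each F2 plant is assumed to contribute two independent gametes, and the function names are chosen for illustration only.

```python
import math


def prob_at_least_one_recombinant(r: float, n_plants: int) -> float:
    """Probability that at least one of the 2 * n_plants gametes carried by
    an F2 population is recombinant between two loci.

    r is the (hypothetical) recombination fraction between the loci;
    gametes are assumed to be independent.
    """
    if not 0.0 <= r <= 0.5:
        raise ValueError("recombination fraction must lie in [0, 0.5]")
    if n_plants < 0:
        raise ValueError("number of plants must be non-negative")
    return 1.0 - (1.0 - r) ** (2 * n_plants)


def min_f2_size(r: float, confidence: float = 0.95) -> int:
    """Smallest number of F2 plants for which the probability of seeing at
    least one recombinant gamete reaches the requested confidence."""
    if not 0.0 < r <= 0.5:
        raise ValueError("recombination fraction must lie in (0, 0.5]")
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must lie strictly between 0 and 1")
    # Solve 1 - (1 - r)^(2N) >= confidence for N.
    n_gametes = math.log(1.0 - confidence) / math.log(1.0 - r)
    return math.ceil(n_gametes / 2)


if __name__ == "__main__":
    # Hypothetical recombination fractions, for illustration only.
    for r in (0.05, 0.01, 0.001):
        print(f"r = {r}: at least {min_f2_size(r)} F2 plants")
```

The required population size grows roughly in inverse proportion to r, so loci that map to nearly the same position quickly demand thousands of plants. The sketch also understates the practical effort, because recombinants would have to be recognized with flanking markers and, since the seed hull is maternal tissue, phenotyped on seeds harvested from their progeny.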

There was no difference in metabolite content when fully mature seeds of the low and high ADLm groups were analyzed by non-targeted metabolome analysis. However, analysis of immature green seeds revealed that exclusively metabolites of procyanidin biosynthesis were strongly depleted in the low ADLm 4042 × Express 617 group (Online Resource 5). Akhov et al. (2009) and Miao et al. (2019) obtained comparable results when they analyzed transcript abundance by qRT-PCR experiments. They found gene expression differences in early seed developmental stages but not in mature seeds. Despite gross differences in clusters 1 to 5 of yellow seeded DH1372 in comparison to the 4042 × Express 617 groups, DH1372 showed the same reduction of procyanidin metabolites as the low ADLm 4042 × Express 617 group. The similar reduction of the same procyanidin metabolites in the low ADLm group of 4042 × Express 617 and in DH1372 suggests that genes of the procyanidin pathway could be affected in both genotypes. Hong et al. (2017) found a number of biosynthetic and regulatory genes in seed coats whose down-regulation mainly contributed to the reduction of procyanidins in yellow seeds. Similarly, Qu et al. (2020) found in a comparison of different yellow and black seeded genotypes that genes of the procyanidin pathway had lower expression levels in yellow-seeded rapeseed. Based on metabolite profiles, they suggested that the seed coat color could be mainly determined by the levels of epicatechin and its derivatives. In genotypes with mutations in different genes of the procyanidin pathway, the phenotypic effect on the metabolite composition may be different. Considering the large number of possible candidate genes in the QTL region for the observed phenotypic differences, non-targeted metabolite analysis could represent a valuable additional tool to distinguish between genotypes with different mutations (Qu et al. 2020).
Within the genomic region delimited by the flanking SNP markers in the Darmor-bzh and the ZS11 reference genomes, a number of candidate genes were identified. A likely candidate is the cinnamate-4-hydroxylase gene, which occurs in this region as a tandem duplicate in both reference genomes. Cinnamate-4-hydroxylase catalyzes the conversion of cinnamic acid to p-coumaric acid, the second step in the general phenylpropanoid pathway (Liu et al. 2015; Hong et al. 2017). Since the above-mentioned CCR1 also acts in the phenylpropanoid pathway, downstream of p-coumaric acid, both candidate loci would affect the same pathway. An impaired activity of cinnamate-4-hydroxylase could very well lead to reduced contents of downstream metabolites of this pathway. None of the MYB transcription factor genes detected within the flanking markers has so far been reported to be an activator or repressor of the phenylpropanoid pathway (Liu et al. 2015; Wei et al. 2017; Hong et al. 2017). Candidate gene BnaC03g18950D encodes a transporter protein, also known as multidrug and toxin extrusion (MATE) protein. In Arabidopsis, AtTT12 encodes a transporter protein that acts as a proton-dependent antiporter, assisting vacuolar localization of proanthocyanidins in the testa (Debeaujon et al. 2001). Transparent Testa (TT12) transporter proteins have been reported to be involved in the yellow seed character in Arabidopsis and Brassica species (Chai et al. 2009; Hong et al. 2017). Similarly, candidate gene BnaC03g20650D encodes a WRKY family transcription factor protein, which is also known as the transparent testa gene BnTTG2 (Johnson et al. 2002; Qu et al. 2013; Liu et al. 2016; Qu et al. 2020).

Although both DH populations showed similarly large and significant differences in ADLm, ADFm and NDFm content between the low and high ADLm groups, only DH population DH1372 × Express 617 showed a significant increase in oil content and in the sum of oil and protein content. This was not observed in population 4042 × Express 617, although this population was tested more thoroughly in five environments. This illustrates that positive effects of reduced lignin content alleles on oil and protein content may depend either on the genetic background or on the affected lignin type locus (Taylor-Teeples et al. 2015). In both DH populations there were only minor, non-significant differences in traits like pre-harvest germination, thousand seed weight and hypocotyl length. To date, only a few studies have analyzed the effect of reduced seed fiber content on seed germination traits. In both DH populations there were no significant differences in seed germination between the low and high ADLm groups (Tables 3 and 4). The artificial seed ageing treatment reduced seed germination in both populations, but only in DH population 4042 × Express 617 did the low ADLm group show a comparatively stronger reduction in germination after artificial seed ageing. This result is in contrast to the lower germination percentage and lower seed vigor index after artificial ageing reported by Zhang et al. (2006) for near-isogenic yellow-seeded and black-seeded lines. However, in the present study the low ADLm group of both populations had a 5 to 7% higher percentage of seeds showing contaminations (Tables 3 and 4). This difference was also observed after artificial seed ageing. Interestingly, the higher contamination rate of the low ADLm group had little effect on seed germination in both populations. Only seed vigor, i.e. hypocotyl length, was reduced in the low ADLm group of both populations, which may additionally have been influenced by the contamination.
The microbial species causing the seed contaminations were not determined in the present study, but earlier studies have shown that seed-borne Alternaria spores may affect seed quality and germination (Meena et al. 2016; Soomro et al. 2020).

In the present study, a new QTL for reduced lignin content on oilseed rape chromosome C03 has been identified in two different populations. It remains open whether the same gene or different closely linked genes are affected. The plant material can be used in crosses with other, already identified lignin mutant genotypes (e.g. Behnke et al. 2018) that carry mutations on different chromosomes of the rapeseed A- or C-genome to identify additive and epistatic effects on lignin content and seed color. Despite many attempts to develop yellow seeded rapeseed, black seeded cultivars still dominate the market. This is in clear contrast to soybean, where yellow seeded types represent the commodity and dark seeded types are mainly used as vegetables. As a leguminous species, soybean has the advantage that it has a higher protein content and a lower oil and fiber content. The lower total fiber content and especially the lower lignin content of yellow seeded soybean contribute to the excellent quality of the meal and isolated proteins.