Introduction

Brassica oleracea is a diverse and commercially valuable species. Domesticated B. oleracea exists as the economically important vegetable crops cauliflower, broccoli, cabbage, kale, Brussels sprout and kohlrabi. Within each crop type there are subtypes that exist as numerous cultivars; for example, within broccoli (B. oleracea L. var. italica) there are both sprouting and heading morphologies that form the harvested material. The green form of the latter, sometimes referred to as Calabrese, is the most important form commercially and is generally referred to as ‘broccoli’ in many parts of the world and in this paper.

Broccoli production and consumption have risen significantly over the past 10 years, driven by the convenience and ease of preparation and cooking as well as by reported health benefits (Van Poppel et al. 1999; Finley 2003; Lampe and Peterson 2002; Jeffery and Araya 2009; see Walley and Buchanan-Wollaston 2011 for review). In the UK, broccoli is now the most important Brassica vegetable, occupying an area of over 7,000 hectares with a production value of over £50 million per year (DEFRA 2010).

Broccoli breeders aim to improve a wide range of traits including aspects of head morphology and appearance, pest and disease resistance (Darling et al. 2000; Farinhó et al. 2004) and quality attributes such as nutritional quality (Moreno et al. 2006; Jeffery and Araya 2009) and post-harvest shelf life (Wurr et al. 2002; Jeffery et al. 2003). Many of these are quantitative traits and are influenced by environmental interactions (e.g. Wurr et al. 2002), making it difficult and expensive to carry out phenotypic selection, especially for traits with low heritability. Development of tools and resources to allow genotypic selection is therefore of great benefit for broccoli breeding and genetic research.

Several B. oleracea mapping populations and associated linkage maps exist (Sebastian et al. 2000; Gao et al. 2007; Iniguez-Luy et al. 2009). However, these are all derived from crosses between crop types, with resultant segregation in morphology. This limits the utility of these populations for research underpinning broccoli breeding, since many of the traits of interest to broccoli breeders can only be assessed on plants that produce heads with a broccoli morphology. Brown et al. (2007) created a valuable F2:3 mapping population based on a cross between a Calabrese and a landrace (Brocolette Neri E. Cespuglio). The population segregates for harvest date and head weight; however, it contains residual heterozygosity and segregates for non-broccoli-like characteristics from the landrace-type parent, and therefore individuals require several rounds of marker-assisted backcrossing before the material is of direct use as pre-breeding material.

The objectives of the work described in this paper were to (1) develop an intra-crop ‘immortal’ doubled-haploid (DH) mapping population derived from a broccoli × broccoli cross, which would enable the genetic analysis of traits specific to the broccoli head, the harvested part of the plant, (2) construct a framework linkage map based on this population, which could be anchored to the consolidated B. oleracea map and (3) use these resources to identify quantitative trait loci (QTL) for a range of broccoli head morphological traits and traits representing variation in leaf architecture.

Materials and methods

Plant material

The broccoli parental lines used for the production of the broccoli × broccoli DH population were selected based on their relative shelf-life performance. The female parent, GDDH33, is an anther culture-derived DH line from broccoli cv. Green Duke that has been used previously as a parental genotype for the A12DH × GDDH33 mapping population (Bohuon et al. 1996, 1998). The male parent, MarDH34, is a microspore-derived DH line from broccoli cv. Marathon. GDDH33 was used as the female parent because it was known to be highly responsive to microspore culture, whereas MarDH34 shows a poor response. The DH population is hereafter referred to as the MGDH population.

The MGDH mapping population was produced by culture of F1 microspores following a modified protocol developed at the University of Guelph, Canada (Mathas 2004). Leaf samples were taken from the first true leaves of all microspore-derived plantlets for ploidy determination by flow cytometry. Leaf tissue was collected from 154 diploid plantlets for DNA extraction. Only diploid plantlets were retained for subsequent seed production in the glasshouse. In total, 111 MGDH lines have been trialled over 5 years for phenotypic assessment.

DNA extraction

Total DNA was isolated from young true leaves using DNeasy plant maxi-kits (Qiagen Inc., UK) following the manufacturer's guidelines, diluted to 100 ng μl−1 using TE (pH 8.0) and stored at −20°C.

Collection of phenotype data

Broccoli head morphology traits

Members of the MGDH population, the parental lines GDDH33 and MarDH34 and the commercial cv. Marathon were grown in field trials at Wellesbourne, Warwick, UK (Latitude 52º12′) over five growing seasons (2002–2008). In the 2002 trial, 30 MGDH lines were assessed, 39 in 2003, 29 in 2006, 72 in 2007 and 40 in 2008. In each year, the parental lines GDDH33 and MarDH34 were included as controls plus the commercial cv. Marathon as a check treatment. Field trials followed directly optimised, resolvable, row-column incomplete block designs. Typically, seeds were sown into Levingtons M2 compost (Scotts, UK) in Hassy 308 module trays and placed in a randomised design in a glasshouse. Plants were hardened off in a cold frame for 2 weeks and then transplanted by hand into the field plots. Plots within a block contained 24 plants arranged in 6 rows of 4 plants allowing for 8 central sample plants and 16 surrounding guard plants; plants were spaced 0.25 m apart. Six plants were sampled from each plot, giving 12 replicate plants per genotype. Broccoli heads were harvested at a stage equivalent to UK commercial maturity, determined by the first signs of bud cracking (about to open); harvesting was carried out at the same time of day for each harvest date.

Stems were cut and the leaves and bracts removed; the total head length (florets plus stalk) was trimmed to 15 cm. Harvested samples (day 0) were transferred to the field lab and trait measurements recorded: head diameter [mean of 2 measurements taken 90° across the head (mm)], stalk diameter [mean of 2 measurements taken 90° across the stem (mm)]; and head fresh weight (g). Samples were placed at 4°C for 22 h to remove field heat. The next day (day 1) heads were weighed and phenotype quality scores recorded following Wurr et al. (2002); the heads were tagged and transferred to the shelf-life facility at The University of Warwick, School of Life Sciences, Wellesbourne Campus. Heads were maintained at a room temperature of 14°C with a 16-h photoperiod and constant humidity (54 ± 2%).

Phenotype data were recorded daily until day 6; if the heads had reached failure (visible yellowing of buds) by day 6 they were discarded; if yellowing was not observed, data were recorded daily until failure. The rate of weight loss was determined as the slope of a linear regression of weight against time (g day−1). The relative weight loss was calculated as the rate of weight loss divided by the weight at day 0 (day−1).
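The two weight-loss traits can be illustrated with a minimal sketch. It assumes an ordinary least-squares fit of weight on day number; the function names and the example weights are hypothetical and are not taken from the trial data.

```python
def ols_slope(days, weights):
    """Least-squares slope of weight (g) regressed on time (days), in g per day."""
    n = len(days)
    if n < 2 or n != len(weights):
        raise ValueError("need at least two paired observations")
    mean_x = sum(days) / n
    mean_y = sum(weights) / n
    sxx = sum((x - mean_x) ** 2 for x in days)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(days, weights))
    return sxy / sxx


def relative_weight_loss(days, weights):
    """Rate of weight loss divided by the weight recorded at day 0 (per day)."""
    day0_weight = weights[days.index(0)]
    return ols_slope(days, weights) / day0_weight


# Hypothetical head losing exactly 4 g per day from a 100 g start.
days = [0, 1, 2, 3, 4, 5, 6]
weights = [100.0, 96.0, 92.0, 88.0, 84.0, 80.0, 76.0]
rate = ols_slope(days, weights)                # g per day, negative for a loss
relative = relative_weight_loss(days, weights)  # per day
```

Dividing by the day-0 weight puts heads of different sizes on a common scale, which is why the relative trait is reported in day−1 rather than g day−1.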

Leaf traits

Leaf traits were assessed on glasshouse-grown plants in 2007. Seeds were sown into Levingtons M2 compost (Scotts, UK) in Hassy 308 module trays and placed in a randomised design in a glasshouse. Individual plants were transferred into M2 compost in 13 cm pots arranged in a randomised design. Leaf nine was sampled from five individual plants per genotype. The traits measured per leaf are described in Table 1 (see Fig. 1), following the nomenclature used in Sebastian et al. (2002). Each set of five leaves was digitally scanned for future reference.

Table 1 Leaf traits scored in the MGDH mapping population, after Sebastian et al. (2002)
Fig. 1 Outline of a typical broccoli leaf, illustrating the leaf traits measured in the MGDH mapping population; see Sebastian et al. (2002). Traits measured: LL leaf length, LW lamina width, LPL lamina petiole length, BPL bare petiole length, APL auricle petiole length, WPL wing petiole length, MW midrib width, AW auricle width, LS leaf shape, LAS leaf apex shape, LN lobe number, WN number of wings, PA presence of auricles

Statistical analyses

For individual traits, variance components (VC) and predicted means were calculated for all genotypes using restricted maximum likelihood (REML) linear mixed models (Patterson and Thompson 1971). For head weight and diameter on day 0 (the only traits recorded in all 5 years of experiments) the traits were analysed across years (2002, 03, 06, 07, 08). Two analyses were performed: the first treating year, DH line and their interaction as fixed, to obtain per-year means, and the second treating DH line as fixed but year and the interaction as random factors, to obtain overall means of the traits. The estimated VC of the blocking factors in both analyses were allowed to vary with year. Data for head weight at day zero across years, and head weight and diameter, leaf length and leaf width in 2007, were log transformed, whereas the square root of the data for head diameter across years, bare petiole length, auricle petiole length and wing petiole length in 2007 was used to improve the homogeneity of the variance prior to the analysis of VC using REML. The frequency distribution for each trait was examined using histograms; regressions of all traits against each other using predicted means for each trait were used to explore possible relationships between trait pairs. For each trait-pair regression, the Pearson product–moment correlation coefficient (r) was calculated to describe how well the regression line represents the data in each model; the significance of the correlation coefficient was assessed against a t distribution for a two-sided test at the appropriate degrees of freedom. To explore further the relationship between variables, the coefficient of determination (r²) was also calculated. The predicted means for each trait were used as input to QTL mapping. All statistical analyses were carried out using GenStat for Windows (VSN International, UK).
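As an illustration of the trait-pair statistics, the sketch below computes r, the coefficient of determination and the t statistic on n − 2 degrees of freedom. It is written in plain Python rather than GenStat, and the five paired values are hypothetical, chosen only so that the arithmetic is easy to follow.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation coefficient of two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def t_statistic(r, n):
    """t = r * sqrt((n - 2) / (1 - r^2)), referred to a t distribution on n - 2 df."""
    return r * math.sqrt((n - 2) / (1.0 - r * r))


# Hypothetical predicted means for two traits across five lines.
trait_a = [1.0, 2.0, 3.0, 4.0, 5.0]
trait_b = [2.0, 4.0, 5.0, 4.0, 5.0]

r = pearson_r(trait_a, trait_b)
r_squared = r * r
t = t_statistic(r, len(trait_a))
```

The two-sided P value would then be read from a t distribution with n − 2 degrees of freedom; that lookup is omitted here to keep the sketch free of external libraries.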

Genetic analyses

Characterising the MGDH population using molecular markers

AFLP markers

The amplified fragment length polymorphism (AFLP) protocol was based upon that described by Vos et al. (1995), with modifications described for the AFLP Core Reagent Kit (Invitrogen, UK). Genomic DNA (100 ng μl−1) was digested at 37°C overnight using a restriction enzyme mix containing EcoRI/MseI at 1.25 U μl−1. Pre-amplification conditions included 20 cycles of 94°C for 30 s, 56°C for 60 s and 72°C for 60 s, with a final hold at 18°C using a GeneAmp PCR system 9700 (Applied Biosystems, USA). Selective primer combinations were chosen following a screen of the parental genotypes; the selection criteria were based on the number of polymorphic loci detected between the two parental lines and their reproducibility. The primer pairs were labelled E or M plus the selective nucleotides present at the 3′-end of the EcoRI or MseI selective primers, respectively, following the nomenclature of Keygene (http://www.keygene.com): E + AA (E11) with M + AAG (M33), M + CAA (M47), M + CAC (M48), M + CAG (M49), M + CAT (M50), M + CCT (M54), M + CTA (M59) and M + CTT (M62). The EcoRI selective primers were 5′-end labelled with the fluorophore FAM (Applied Biosystems, USA). The selective amplification reactions used a touchdown programme: pre-PCR, 94°C for 5 min, 94°C for 30 s, annealing at 65°C for 30 s and extension at 72°C for 60 s; the annealing temperature was then dropped 1°C per cycle for 11 cycles, then 94°C for 30 s, 56°C for 30 s, 72°C for 60 s for 25 cycles with a final 72°C for 20 min, 4°C for 2 min and held at 18°C using a GeneAmp PCR system 9700 (Applied Biosystems, USA). Fragments were analysed using an ABI Prism 3130xl capillary sequencer with the internal size standard 500 LIZ (Applied Biosystems, USA) and 3130xl genetic analyzer data collection software version 3.0 (Applied Biosystems, USA). Polymorphic alleles present between the parental genotypes were detected using GeneMarker V1.5 (SoftGenetics, USA). The polymorphic alleles detected were then used as a panel to screen the mapping population. Polymorphic loci used as markers were named based on the primer combination and the size of the fragment; for example, marker E11M61_289 = E + AA/M + CTG with a polymorphic locus at 289 bp.
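The marker naming scheme can be made concrete with a short decoding sketch. The lookup tables contain only the primer codes stated in the text (including M61 = CTG from the example); the function name and the returned dictionary layout are hypothetical conveniences, not part of any genotyping software.

```python
import re

# Selective nucleotides for each Keygene primer code mentioned in the text.
ECO_PRIMERS = {"E11": "AA"}
MSE_PRIMERS = {
    "M33": "AAG", "M47": "CAA", "M48": "CAC", "M49": "CAG", "M50": "CAT",
    "M54": "CCT", "M59": "CTA", "M61": "CTG", "M62": "CTT",
}

# Marker names take the form <E code><M code>_<fragment size in bp>.
MARKER_PATTERN = re.compile(r"^(E\d+)(M\d+)_(\d+)$")


def decode_marker(name):
    """Split an AFLP marker name into its primer codes and fragment size."""
    match = MARKER_PATTERN.match(name)
    if match is None:
        raise ValueError(f"not an AFLP marker name: {name!r}")
    eco, mse, size = match.group(1), match.group(2), int(match.group(3))
    return {
        "eco_primer": eco,
        "eco_selective": ECO_PRIMERS.get(eco),  # None if the code is not listed
        "mse_primer": mse,
        "mse_selective": MSE_PRIMERS.get(mse),
        "fragment_bp": size,
    }


example = decode_marker("E11M61_289")
```

Codes that are not in the lookup tables decode to `None` for the selective nucleotides rather than raising, so that names from other primer sets can still be split.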

Microsatellite (SSR) markers

Thirty-five publicly available simple sequence repeat (SSR) markers were selected based on being polymorphic between the parental lines and being present in other B. oleracea genetic maps to enable future linking of maps (Table 2). SSR markers were fluorescently labelled using the fluorophore moieties FAM, NED, VIC, HEX and PET (Applied Biosystems, USA). Individual SSRs were used to amplify genomic DNA in separate reactions in a final volume of 10 μl, with 0.4 units of HotStar Taq polymerase (Qiagen Ltd., UK), 300 nM of each primer, 200 μM dNTPs, 1 μl of 10× PCR buffer containing 15 mM MgCl2 and genomic DNA at 10 ng μl−1. The reaction conditions were denaturation of the DNA for 15 min at 95°C, 35 cycles each with a 94°C denaturation, annealing for 30 s at 56°C and extension for 30 s at 72°C, and a final extension at 72°C for 7 min. Reactions were carried out in a GeneAmp PCR system 9700 (Applied Biosystems, USA). Using fluorescently labelled SSRs enabled multiplexing of amplicons for genotyping (see Table 2). For a multiplex group, a 2-μl aliquot of amplicon from each marker was combined and made up to 15 μl using sterile water. Amplicons were analysed using an ABI Prism 3130xl capillary sequencer with the internal size standard 500 LIZ (Applied Biosystems, USA). Polymorphisms present between parental genotypes were detected using GeneMarker V1.5 (SoftGenetics, USA). The polymorphic alleles detected were then used as a panel to screen the mapping population.

Table 2 SSR primers, and the parental allele score sizes used to score the MGDH population

Framework map construction

The linkage map was constructed using JoinMap v4 (Van Ooijen 2006) specifying DH progeny. The independence logarithm of odds (LOD) grouping significance threshold was set to 5. Multi two-point linkage analyses were used to estimate marker order on each linkage group using a recombination frequency below 0.45 and LOD score greater than 0.5. A ripple function was performed after the addition of each marker, with a jump threshold of 5. Recombination frequencies were converted to map distances using the Haldane mapping function (Haldane 1919). The map data and genotype data were used to make a map file (*.map) and a locus genotype file (*.loc) for QTL mapping. The linkage map was illustrated using MapChart v2.2 (Voorrips 2002).
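For reference, the Haldane mapping function converts a recombination frequency r into a map distance d = −50 ln(1 − 2r) cM. The sketch below shows the conversion and its inverse; it illustrates the formula only and is not JoinMap's implementation, and the function names are hypothetical.

```python
import math


def haldane_cm(recombination_fraction):
    """Haldane map distance in cM for a recombination fraction in [0, 0.5)."""
    if not 0.0 <= recombination_fraction < 0.5:
        raise ValueError("recombination fraction must be in [0, 0.5)")
    return -50.0 * math.log(1.0 - 2.0 * recombination_fraction)


def haldane_recombination(distance_cm):
    """Inverse of the Haldane function: recombination fraction for a distance in cM."""
    return 0.5 * (1.0 - math.exp(-2.0 * distance_cm / 100.0))


distance = haldane_cm(0.10)                   # a little over 11 cM
round_trip = haldane_recombination(distance)  # back to 0.10
```

Because the Haldane function assumes no crossover interference, distances grow faster than the recombination frequency as r approaches 0.5.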

Map quality

Segregation distortion was explored by comparing the frequency of parental allele scores in each line and the mean scores across the population to see if the expected 1:1 Mendelian segregation ratio was present; the significance of deviation from a 1:1 ratio was tested using a χ² test. Apparent double recombinants in the genotype data were rescored and the framework map was reconstructed as described above.

Quantitative trait analyses

Quantitative trait loci were estimated using MapQTL v4.0 (Van Ooijen et al. 2002), and R/qtl (Broman et al. 2003) for comparison. In the first instance, interval mapping was performed using MapQTL to scan for putative QTL. The genome-wide (GW) LOD significance threshold for each trait was determined using permutation tests set to 10,000 iterations. Once significant QTL had been confirmed (α = 0.05), the markers that were most tightly linked to a QTL were used as cofactors in approximate multiple QTL models (MQM) as implemented in MapQTL v4.0. Where appropriate, markers were added as cofactors in a stepwise approach to select the marker combination that best represented the cofactors for the QTL model. Following cofactor selection, MQM were calculated and the linkage groups compared for changes in estimated QTL LOD score and position; empirical GW significance thresholds were recalculated for each model. For significant QTL, the LOD score, its additive effect and the parental allele underlying the QTL were recorded.

To confirm estimated QTL locations, genome scans were performed using R/qtl (Broman et al. 2003; R Development Core Team 2009). Conditional QTL genotype probabilities were estimated (calc.genoprob) with step = 1 cM and error probability = 0.001. The ‘scanone’ function was used for one-QTL model interval mapping with the expectation maximisation “EM” algorithm (Dempster et al. 1977) implemented in R/qtl; this is essentially the same as the interval mapping routine in MapQTL (Broman and Sen 2009). This was followed by the Haley–Knott regression (Haley and Knott 1992) function. Genome-wide significance thresholds (α = 0.05) were determined by permutation test (n.perm = 1,000 iterations). Putative QTL intervals estimated using MapQTL were illustrated relative to the framework linkage map using MapChart 2.2 (Voorrips 2002). QTL were named following the nomenclature recommended for the Brassica QTL database (http://www.brassica.info; http://www.cropstoredb.org): “institution name_trait name_chromosome and QTL number”, for example “whri_HWT_CO3.1”. In this paper, the sample year has been included to identify the QTL.
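The logic of a genome-wide permutation threshold can be sketched as follows. This is a deliberately simplified stand-in: it uses single-marker regression in place of the interval mapping performed by MapQTL and R/qtl, the data are simulated, and all function and variable names are hypothetical.

```python
import math
import random


def marker_lod(genotypes, phenotypes):
    """LOD comparing a two-class (per-genotype mean) model with a single-mean model."""
    n = len(phenotypes)
    grand_mean = sum(phenotypes) / n
    rss_null = sum((y - grand_mean) ** 2 for y in phenotypes)
    by_class = {}
    for g, y in zip(genotypes, phenotypes):
        by_class.setdefault(g, []).append(y)
    rss_full = 0.0
    for values in by_class.values():
        class_mean = sum(values) / len(values)
        rss_full += sum((y - class_mean) ** 2 for y in values)
    if rss_null == 0.0:
        return 0.0
    if rss_full == 0.0:
        return float("inf")
    return (n / 2.0) * math.log10(rss_null / rss_full)


def permutation_threshold(markers, phenotypes, n_perm=1000, alpha=0.05, seed=0):
    """(1 - alpha) quantile of the genome-wide maximum LOD under shuffled phenotypes."""
    rng = random.Random(seed)
    shuffled = list(phenotypes)
    maxima = []
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # breaks any genotype-phenotype association
        maxima.append(max(marker_lod(m, shuffled) for m in markers))
    maxima.sort()
    index = min(n_perm - 1, math.ceil((1.0 - alpha) * n_perm) - 1)
    return maxima[index]


# Simulated DH population: 60 lines, 20 markers, marker 0 carries a real effect.
rng = random.Random(1)
n_lines = 60
markers = [["A" if i % 2 == 0 else "B" for i in range(n_lines)]]
markers += [[rng.choice("AB") for _ in range(n_lines)] for _ in range(19)]
phenotypes = [rng.gauss(50.0, 1.0) + (3.0 if g == "B" else 0.0) for g in markers[0]]

threshold = permutation_threshold(markers, phenotypes, n_perm=200)
observed = marker_lod(markers[0], phenotypes)
```

Taking the maximum LOD across all markers within each permutation is what makes the threshold genome-wide: it controls the chance of any false positive anywhere on the map, not the per-marker error rate.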

Results

Production of the MGDH mapping population

The parental lines GDDH33 and MarDH34 were primarily selected based on the differences in time taken to bud yellowing post-harvest. However, the population also displays a wide range of variation in morphological traits including head size, plant height, and leaf and petal shape. The parental line GDDH33 is responsive to microspore culture (78.5 embryos per bud, from buds 3.0–3.5 mm); however, MarDH34 was unresponsive to microspore culture for all bud lengths sampled. The F1 (GDDH33 × MarDH34) was responsive to microspore culture when shorter bud lengths were used (28.3 embryos per bud, from buds 2.8–3.2 mm). During the production of the MGDH population the recovery of doubled-haploid individuals was 53.8% (220 plants), which is typical for B. oleracea (Mathas 2004; Duijs et al. 1992). Not all individuals survived the culture process, and not all of those that did successfully produced seed; however, even when seed was not produced, DNA samples were extracted for use in constructing the linkage map. Phenotype data have been collected from 111 MGDH lines.

New broccoli × broccoli linkage map

A pre-screen of the parental genotypes GDDH33 and MarDH34 using 177 SSR markers that have previously been incorporated into updated B. oleracea maps identified 99 polymorphic markers between the parental genotypes; 71 were monomorphic, and seven gave no score and were declared as null (Graham Teakle, The University of Warwick, in preparation). Thirty-five of the polymorphic SSR markers and a total of 177 AFLP loci (see Online resource 1) were used to genotype 154 members of the MG population.

Twenty-eight SSR markers were incorporated into the MG linkage map in addition to 106 AFLP markers. The linkage analysis resulted in nine linkage groups. Based on the presence of reference SSR markers, the linkage groups were designated C1–C9 in accordance with the nomenclature used by Parkin et al. (1995) and Sharpe et al. (1995) then orientated relative to their previously published chromosomal assignment (Sebastian et al. 2000; Suwabe et al. 2002; Lowe et al. 2003; Piquemal et al. 2005; Iniguez-Luy et al. 2008).

The framework linkage map (Fig. 2) is 946.729 cM in length with an average between-marker distance of 7.698 cM, a minimum between-marker distance of 0.058 cM and a maximum distance of 34.099 cM. The largest linkage group (LG) was C1 at 138.153 cM, and the smallest LG was C6, measuring 46.411 cM (see Table 3). Linkage group C8 only contained AFLP markers; the other eight LGs contained a mix of SSR and AFLP markers. The B. oleracea L. var. italica genome size has been estimated to be 599–618 Mbp (Arumuganathan and Earle 1991) and 696 Mbp (Johnston et al. 2005). Using the estimate of Johnston et al. (2005), the average physical distance between markers for this map is estimated to be 5.659 Mbp.
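The physical-distance estimate can be retraced from the reported figures. The sketch below assumes that the average is taken over the number of marker intervals implied by the map length and mean spacing; that interval count is inferred here and is not stated in the text.

```python
# Values reported in the text.
map_length_cm = 946.729
mean_spacing_cm = 7.698
genome_size_mbp = 696.0  # Johnston et al. (2005) estimate

# Inferred, not reported: number of between-marker intervals implied by the map.
n_intervals = round(map_length_cm / mean_spacing_cm)

# Average physical distance per interval, and the genome-wide Mbp-per-cM ratio.
mbp_per_interval = genome_size_mbp / n_intervals
mbp_per_cm = genome_size_mbp / map_length_cm
```

Under this assumption the calculation gives 123 intervals and about 5.659 Mbp per interval, matching the reported value; it is a genome-wide average and says nothing about local variation in recombination rate.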

Fig. 2 Brassica oleracea L. var. italica linkage map based on a population of doubled haploid lines (MGDH). Vertical bars represent linkage groups designated C1–C9. Marker locus positions (cM) and names are on the left and right sides of the linkage groups, respectively. Marker loci that show segregation distortion (P ≤ 0.01) are indicated (asterisk denotes distortion towards GDDH33; double asterisks distortion towards MarDH34). QTL for physiological and leaf traits are drawn as vertical bars to the left of linkage groups. QTL are drawn as 1 LOD (bars) and 1.5 LOD (sticks) confidence intervals. Unfilled bars represent an increase in the trait value towards GDDH33 QTL genotypes; filled bars represent an increase in the trait value when the MarDH34 QTL genotype is present. QTL are named based on the institution code, trait, chromosome, QTL number and year: stalk diameter (whri_STDIA), head diameter (whri_HDIA), relative head weight loss (whri_RWLOSS); leaf length (whri_LL), lamina width (whri_LW), lamina petiole length (whri_WPL), wing petiole length (whri_WPL), leaf shape (whri_LS), leaf apex shape (whri_LAS), lobe number (whri_LN), wing number (whri_WN). AFLP loci are coded according to KeyGene nomenclature with the band size included. SSR loci are labelled by the prefixes: BRAS, BRMS, BN, A, MB, Na, Ni, OL, Ra, sN; linkage groups C7–C9 for the MGDH population linkage map

Table 3 MGDH genetic map characteristics

Segregation distortion

Segregation distortion was observed in the MGDH population. The MG linkage map is composed of a total of 134 loci, of which 78 displayed varying degrees of distortion (P ≤ 0.05), with distorted loci grouping into distinct areas (Table 3; Fig. 2). Of the 78 distorted loci, 56 were AFLP markers compared with 22 SSRs. Segregation distortion is a feature common to Brassica DH populations, indicating possible preferential selection of genotypes responsive to microspore culture and/or the ability to produce seed during the regeneration and seed-bulking phases (Ferreira et al. 1994; Takahata and Keller 1991; Sebastian et al. 2000). Overall, although there were slightly more MarDH34 alleles (58.97%) than GDDH33 alleles (41.03%) among the 78 distorted loci (Table 3), this was not a significant departure from a 1:1 Mendelian ratio (χ² = 2.513, 1 df, P = 0.113). Linkage groups C1, C7 and C8 have clusters of markers that are distorted towards MarDH34 alleles. By contrast, LGs C3, C4 and C5 have clusters of distorted markers in favour of GDDH33. Linkage groups C2 and C6 have small clusters for both parental genotypes. No heterozygous loci were scored during the genotyping of molecular markers.
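The test of the overall parental bias among distorted loci can be reproduced with a short goodness-of-fit sketch. The counts of 46 and 32 are inferred here from the reported percentages of 78 loci and are not stated explicitly in the text; the function name is hypothetical.

```python
import math


def chi_square_one_to_one(count_a, count_b):
    """Chi-square goodness-of-fit test of two counts against a 1:1 ratio (1 df)."""
    expected = (count_a + count_b) / 2.0
    chi2 = ((count_a - expected) ** 2 + (count_b - expected) ** 2) / expected
    # Upper-tail probability of a chi-square variate with 1 df.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Inferred counts: 58.97% and 41.03% of the 78 distorted loci.
mardh34_loci, gddh33_loci = 46, 32
chi2, p_value = chi_square_one_to_one(mardh34_loci, gddh33_loci)
```

With these counts the statistic is about 2.513 with P of about 0.113, in agreement with the reported values. The closed-form tail probability via `math.erfc` holds only for 1 degree of freedom.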

Phenotype assessment of MGDH lines

To assess the utility of the population and the framework linkage map for investigating broccoli production-related traits, the population was assessed for head weight at harvest, head diameter, stalk diameter, weight loss and relative weight loss through storage. Overall, the distribution of the trait means appeared normal; however, data for head weight and head diameter in 2007 and head weight over years were log transformed, and the square root of head diameter over years was used to improve the homogeneity of the variance prior to the analysis of VC using REML. Figure 3 illustrates the range in distributions for the traits head weight at harvest and head diameter at harvest scored in the MGDH population across 5 years (see Online Resource 2 for trait data summary and Online Resource 3 for VC).

Fig. 3 Box and whisker plots showing the range of mean values for a head diameter at harvest and b head weight at harvest for the MGDH population sampled in the years 2002, 2003, 2006, 2007, 2008, and the combined analyses across years. Boxes represent the interquartile range bisected by the median of the range. Whiskers represent the range from the lower first quartile to the minimum value and the range from the upper third quartile to the maximum value

Comparisons between the parental means and the progeny means for each trait indicate that there is transgressive segregation for all traits and significant differences between the parental means (see Fig. 4 as an example of the distribution of morphological traits measured in the 2007 experiment). There was a high degree of variability between genotypes for head size and shape across all years. Head weight at harvest reflected this diversity in morphology. Over years, head weight ranged from 12.04 to 104.48 g, a difference of 1610.29% (\( \hat{\mu } \) = 78.22 g; \( \hat{s} \) = 17.95 g). Head diameter over years ranged from 31.4 to 121.3 mm (\( \hat{\mu } \) = 65.4 mm; \( \hat{s} \) = 8.4 mm), see Fig. 3. The stalk diameter of lines in the population measured in 2007 varied from 8.29 to 30.3 mm (\( \hat{\mu } \) = 20.38 mm; \( \hat{s} \) = 4.76 mm). The distribution of values for the rate of weight loss over 6 days in storage had a positive skew, ranging from −6.38 to −1.33 g day−1 (\( \hat{\mu } \) = −3.63 g day−1; \( \hat{s} \) = 1.02 g day−1). The relative weight loss through storage takes into consideration the initial weight of the broccoli head at harvest; this trait had a narrow range from −0.11 to −0.05 day−1 (\( \hat{\mu } \) = −0.07 day−1; \( \hat{s} \) = 0.01 day−1).

Fig. 4 Frequency distribution of head physiological traits scored in the MGDH mapping population in 2007. a Head weight (g), b head diameter (mm), c stalk diameter (mm), d weight loss during storage (g day−1) and e relative weight loss (day−1). The mean scores for the parental genotypes for each trait are indicated as vertical arrows for comparison, and to illustrate the amount of transgressive segregation present for each trait

When the traits head weight at harvest and head diameter at harvest were compared using Pearson correlation coefficients, all trait pairs gave significant correlation coefficients (P < 0.001) for two-sided tests (see Online Resource 4). The coefficient of determination was calculated to assess more clearly the relationships between trait pairs. For the 2007 experiment, trimmed head weight gave strong positive correlations with increasing head and stalk diameters, r² = 0.72 and r² = 0.85, respectively (P ≤ 0.001), see Fig. 5. Broccoli with larger heads tended to have larger stalk diameters, r² = 0.53 (P ≤ 0.001). The data also suggest that the heavier heads tend to lose proportionally more weight through storage, with a strong significant negative correlation, r² = 0.87 (P ≤ 0.001). Since a larger head and stalk diameter contribute towards a heavier head, both of these traits have strong significant correlations with weight loss through storage, r² = 0.73 and 0.73, respectively (P ≤ 0.001). To explore the relationship between broccoli head size and the degree of weight loss, the weight loss was expressed relative to the initial weight at harvest. As head weight at harvest increased, the relative weight loss during storage also increased, giving a significant strong positive correlation, r² = 0.63 (P ≤ 0.001). However, the regression of head diameter with relative weight loss gave a modest (but still significant) correlation, r² = 0.24 (P ≤ 0.005). In contrast, the correlation between stalk diameter and relative weight loss was strong and significant, r² = 0.73 (P ≤ 0.001) (Fig. 5).

Fig. 5 Correlation matrices illustrating the regression lines between the physical traits measured. The Pearson product–moment correlation coefficient (r) is included for comparisons; n = 71; *P < 0.005; **P < 0.001

Leaf traits

The parental lines have contrasting leaf types, for example the apex on GDDH33 leaves is a rounded obtuse shape compared with the acute pointed apex of MarDH34 leaves. On average, MarDH34 has longer leaves (140.85 mm) compared with GDDH33 (97.87 mm); for this trait the population showed transgressive segregation in a positive direction beyond MarDH34 (\( \hat{\mu } \) = 155.66 mm; \( \hat{s} \) = 26.94 mm) (see Fig. 6). The shorter leaves of GDDH33 also had a smaller mean value for lamina width compared with MarDH34, 39.24 and 80.66 mm, respectively (\( \hat{\mu } \) = 75.05 mm; \( \hat{s} \) = 10.20 mm), r² = 0.29 (P < 0.001). Again the population showed transgressive segregation beyond MarDH34, with a maximum lamina width of 97.47 mm. Leaf length had a positive correlation with wing petiole length (r² = 0.26, P < 0.001) and midrib width (r² = 0.25, P < 0.001). Midrib width had a modest correlation with leaf width (r² = 0.22, P < 0.001). GDDH33 had a smaller mean lamina petiole length (63.00 mm) compared with MarDH34 (99.40 mm), with transgressive segregation beyond MarDH34; the population maximum was 148.00 mm (\( \hat{\mu } \) = 148.00 mm; \( \hat{s} \) = 16.24 mm). Lamina petiole length had a strong correlation with leaf length (r² = 0.50, P < 0.001) and modest correlations with lamina width (r² = 0.33, P < 0.001) and midrib width (r² = 0.34, P < 0.001). GDDH33 has a narrower midrib width compared with MarDH34, 7.00 and 12.40 mm, respectively (\( \hat{\mu } \) = 9.67 mm; \( \hat{s} \) = mm). See Online Resource 5 for all leaf trait pair-wise correlation coefficients.

Fig. 6 Examples of the frequency distributions for leaf traits scored within the MGDH population in 2007. a Log leaf length (mm), b lamina width (mm), c lobe number and d the square root of wing petiole length (mm) are illustrated. Arrows indicate the mean values for the parental lines, highlighting transgressive segregation within the population for each trait

To explore relationships between the broccoli head morphological traits and the broccoli leaf traits, all traits were correlated in pairwise comparisons (Online Resource 6). Modest correlations were found between leaf midrib width and broccoli head weight (r = 0.468, P < 0.001), head diameter (r = 0.464, P < 0.001) and stalk diameter (r = 0.414, P < 0.001). However, the coefficient of determination for these comparisons was low in each case, r² = 0.22, 0.22 and 0.17, respectively.

QTL analyses

A total of 58 indicative QTL were identified for the 5 morphological traits scored in the MGDH broccoli mapping population, and 20 of these were significant at the GW significance level (P < 0.05) (see Table 4; Fig. 2). For the other 38 QTL (P > 0.05), see Online Resource 7.

Table 4 Quantitative trait loci identified for broccoli head and leaf traits using multiple-QTL model analysis of the MGDH broccoli mapping population

The trait data recorded across multiple growing seasons plus the combined analyses across all years revealed QTL ‘hot-spots’. For example, QTL for head diameter recorded in 2002 (Whri_HDIA_CO2.1_2002), head diameter from the over years analyses (Whri_HDIA_CO2.1_OverYrs), and head weight over years (Whri_Hwt_CO2.1_OverYrs) all co-locate to a 34-cM interval on LG C2 (Fig. 2). For each QTL, it is GDDH33 that increases the trait value. Interestingly, a QTL for relative head weight loss during storage also mapped to this interval (Whri_RWLOSS_CO2.1_2007); however, the increasing allele in this case is from MarDH34. Six other suggestive QTL (P > 0.05) co-locate to this interval. These represent the related traits head weight recorded in 2002, 2003 and 2007, head diameter recorded in 2003 and stalk diameter recorded in 2007. For each of these QTL intervals it is GDDH33 that is the increasing allele (see Online Resource 7).

There was another QTL ‘hot-spot’ on LG C6; here QTL for head diameter recorded in 2002, 2006 and 2007 and for head diameter across years co-locate within an 18-cM interval; for all four QTL, GDDH33 contributes the increasing allele. A suggestive QTL for head weight loss also co-locates to this interval, with GDDH33 being the beneficial parental genotype.

Two QTL for head diameter, Whri_HDIA_CO7.1_2002 and Whri_HDIA_CO7.2_2008, recorded in 2002 and 2008, respectively, co-located on LG C7; the increasing allele in this case is from MarDH34. Suggestive QTL for head weight loss and relative head weight loss also overlap with this interval. A second QTL for head diameter was also located on LG C7 (Whri_HDIA_CO7.1_2008); the increasing parental genotype at this QTL was GDDH33. Two suggestive QTL, for head weight and head diameter recorded in 2006, co-locate to this QTL interval, and MarDH34 was the increasing parental genotype in each case.

There were several QTL that appeared in one year only. On LG C9, two QTL, one for head weight (Whri_Hwt_CO9.1_2006) and one for head diameter (Whri_HDIA_CO9.1_2006), both recorded in 2006, co-located to a 10-cM interval, with the MarDH34 QTL genotype increasing both weight and diameter. Suggestive QTL for head weight recorded in 2008 and head weight across years overlap this interval; MarDH34 is the increasing parental genotype for both of these suggestive QTL. These suggestive QTL have intervals that overlap a third QTL on LG C9, for relative head weight loss recorded in 2007 (Whri_RHWLOSS_CO9.1_2007); for this QTL, however, GDDH33 contributes the beneficial allele. Three suggestive QTL overlap this interval, for head weight recorded in 2002 and 2007 and for stalk diameter; in all cases MarDH34 contributed the increasing allele.

Other QTL identified in a single year include a stalk diameter QTL on LG C3, Whri_STKDIA_CO3.1_2007, at which the allele from MarDH34 increased stalk diameter. A suggestive QTL for head weight also co-located to this interval, with MarDH34 as the increasing parental allele. On LG C4 a single QTL for head diameter was identified in 2002, with MarDH34 as the increasing allele. In 2007 two head weight QTL were identified on C4, Whri_Hwt_CO4.1_2007 and Whri_Hwt_CO4.2_2007, and both QTL have GDDH33 as the increasing parental genotype; interestingly, suggestive QTL for head diameter and head weight loss co-locate to the same interval as Whri_Hwt_CO4.1_2007 and both have GDDH33 as the beneficial parental genotype. A suggestive QTL for head weight over years overlaps the QTL Whri_Hwt_CO4.2_2007; again, GDDH33 is the increasing parental genotype. A single head weight QTL was mapped to LG C5 in 2007; MarDH34 was the increasing parental genotype. This interval also contained a suggestive QTL for head weight loss, with MarDH34 being the beneficial genotype.

Leaf trait QTL

Sixteen significant QTL (P ≤ 0.05) were detected for the leaf traits measured in 2007 (Table 4), with a further fifteen that were not significant at the GW significance threshold (Online Resource 7).

One significant QTL was detected for leaf length, Whri_LL_CO7.1. This QTL had a LOD score of 7.46 (P ≤ 0.001) and accounted for 31.4% of the phenotypic variance. The allele responsible for increased leaf length was from MarDH34. One significant QTL was mapped for lamina width, Whri_LW_CO4.1 (P = 0.028). This QTL contributed 31.4% of the phenotypic variation for this trait; again, the parental genotype that increased the mean trait value at this QTL was MarDH34.

Two QTL for lamina petiole length were mapped, to LGs C3 and C7. The QTL Whri_LPL_CO3.1 (P ≤ 0.001) was the more significant, accounting for 24.1% of the phenotypic variation compared with 12.5% for Whri_LPL_CO7.1 (P ≤ 0.027). The MarDH34 genotype increases lamina petiole length at both QTL.

Two QTL for wing petiole length were mapped: Whri_WPL_CO5.1 (P = 0.056), which was on the boundary of being significant, and Whri_WPL_CO7.1 (P ≤ 0.04). The parental QTL genotypes that increased wing petiole length differed between the two QTL, with GDDH33 underlying Whri_WPL_CO5.1 and MarDH34 underlying Whri_WPL_CO7.1.

Leaf shape had four significant QTL on LGs C3, C4 and C9 (Table 4). The two QTL on LG C3 both had GDDH33 as the underlying parental genotype, contributing towards leaves that were more twisted and folded. In contrast, the QTL on LGs C4 and C9 both have MarDH34 as the underlying parental genotype; however, these QTL also contributed towards leaves that tend to be more twisted and folded. Leaf apex shape had two significant QTL, on LGs C7 and C6 (P < 0.001). Both QTL had GDDH33 as the genotype contributing towards more rounded leaf apex shapes. The QTL Whri_LAS_CO6.1 accounted for 64.3% of phenotypic variation for this trait.

Three QTL were mapped for leaf lobe number, two on LG C3 and one on LG C9. The QTL Whri_LN_CO3.1 (P < 0.001) accounted for 32.8% of the phenotypic variation for this trait. Both lobe number QTL on LG C3 had GDDH33 as the genotype that increases lobe number, whereas the QTL Whri_LN_CO9.1 (P = 0.001) had MarDH34.

One QTL was mapped for leaf wing number, Whri_WN_CO3.1 (P = 0.023), on LG C3. The GDDH33 genotype at this QTL increases the number of wings present on the leaf.

Discussion

The availability of a broccoli × broccoli DH mapping population and a broccoli-specific linkage map will allow breeders to select the most appropriate genetic markers present within the crop for marker-assisted selection (MAS). The MG population is composed of genetically fixed DH lines, so residual heterozygosity is eliminated; this was confirmed, as no markers were scored as heterozygous during genotyping. This allows the MGDH lines to be replicated between test sites and trialled over years, thus enabling a more accurate estimation of trait VC. This in turn will decrease the standard error of QTL genotype means, allowing a better estimate of trait heritability and increased power to detect QTL (Lander and Botstein 1989; Soller and Beckmann 1990; Knapp and Bridges 1990). The MGDH lines have undergone only one round of meiosis; therefore, the number of recombination events is reduced compared with an F2 or recombinant inbred line (RIL) population. However, the lines produced capture sufficient recombination events to be useful for calculating recombination fractions and thus marker linkage and map distances. In addition, a single round of recombination means that gene combinations are more likely to be conserved, enhancing the genetic analysis of complex quantitative traits in the DH population.
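Because each DH line derives from a single F1 gamete, the recombination fraction between two markers can be estimated as the proportion of lines that carry alleles from different parents at the two loci. The Python sketch below illustrates this counting under stated assumptions: the marker call strings are invented for illustration (A for one parent, B for the other, - for a missing score), and the code is not the mapping software used in this study.

```python
# Allowed genotype calls in a DH population: every scored locus is homozygous
# for one parent or the other, so there is no heterozygous class.
PARENTAL_CALLS = {"A", "B"}


def recombination_fraction(calls_1: str, calls_2: str) -> float:
    """Estimate the recombination fraction between two markers.

    Each string holds one call per DH line, in the same line order.
    Lines with a missing call ('-') at either marker are ignored.
    """
    informative = [
        (a, b)
        for a, b in zip(calls_1, calls_2)
        if a in PARENTAL_CALLS and b in PARENTAL_CALLS
    ]
    if not informative:
        raise ValueError("no lines scored at both markers")
    # A line is recombinant if the two markers trace to different parents.
    recombinants = sum(1 for a, b in informative if a != b)
    return recombinants / len(informative)


# Hypothetical calls for ten DH lines at three markers (illustrative only).
marker_1 = "AABBABABBA"
marker_2 = "AABBABBBAA"
marker_3 = "AAB-ABBBAA"  # same as marker_2 but with one missing score

print(recombination_fraction(marker_1, marker_2))
print(recombination_fraction(marker_1, marker_3))
```

The precision of such an estimate depends on the number of lines scored at both markers, which is why missing calls are excluded from the denominator rather than counted as non-recombinant.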

The framework linkage map for the MG population is 946.729 cM; this is longer than the A × G (888.5 cM), N × G (831.3 cM) and integrated B. oleracea (892.6 cM) maps (Sebastian et al. 2000), the VI-158 × Brocolett map (468 cM) (Brown et al. 2007) and the BolTBDH map (891.8 cM) produced by Iniguez-Luy et al. (2009). This size difference may reflect the relative expansion effect of the Haldane mapping function used to calculate map distances for the MG linkage map compared with the Kosambi mapping function used in the previously published maps, or may simply indicate that a greater proportion of the genome is represented by the MG map. A drawback of using an ‘intra-crop’ cross to generate a mapping population is the reduced number of polymorphic markers that are available to screen the population. The map generated for this population is relatively sparse in markers compared with other B. oleracea maps. The aim was therefore to generate a framework map for this population that included markers in common with the A × G and N × G linkage maps (Teakle and King, unpublished data; Sebastian et al. 2000; Parkin et al. 2005). This enables anchorage to the consolidated B. oleracea map with its wide range of markers and means that a comparative approach can be used to saturate areas relevant to the trait of interest with additional markers. It will also allow comparison of inter-marker distances using common markers across maps and exploration of the possibility of additional genomic coverage and/or map expansion. These markers will help determine syntenic links between the maps, providing a route to syntenic relationships with Arabidopsis (Kaczmarek et al. 2009).
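The expansion effect of the Haldane function relative to the Kosambi function can be illustrated with the standard formulae, d = -50 ln(1 - 2r) cM for Haldane and d = 25 ln[(1 + 2r)/(1 - 2r)] cM for Kosambi, where r is the recombination fraction. The Python sketch below applies both to a few illustrative values of r; these values are assumptions chosen for demonstration and are not intervals from the MG map.

```python
import math


def _check(r: float) -> None:
    """Both functions are defined only for 0 <= r < 0.5."""
    if not 0.0 <= r < 0.5:
        raise ValueError("recombination fraction must be in [0, 0.5)")


def haldane_cm(r: float) -> float:
    """Haldane map distance in cM (assumes no crossover interference)."""
    _check(r)
    return -50.0 * math.log(1.0 - 2.0 * r)


def kosambi_cm(r: float) -> float:
    """Kosambi map distance in cM (allows for partial crossover interference)."""
    _check(r)
    return 25.0 * math.log((1.0 + 2.0 * r) / (1.0 - 2.0 * r))


# Illustrative recombination fractions only, not values from the MG map.
for r in (0.05, 0.10, 0.20, 0.30):
    h, k = haldane_cm(r), kosambi_cm(r)
    print(f"r = {r:.2f}: Haldane = {h:6.2f} cM, Kosambi = {k:6.2f} cM, ratio = {h / k:.3f}")
```

For any r above zero the Haldane distance exceeds the Kosambi distance, and the gap widens as r increases, so a sparse map with wide marker intervals would be expected to show the largest inflation when summed over all intervals.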

The utility of the MG population and associated linkage map has been demonstrated in the current study by the identification of QTL for a range of traits. In total, 20 genome-wide significant QTL (P < 0.05) for broccoli head morphology and 16 QTL for leaf traits were mapped in this population. For example, the identification of four unique significant QTL on linkage groups C2, C4, C6 and C7, detected across multiple environments for head size in this population, is indicative of genetic variation for this trait. Although the head sizes measured were much smaller than would be acceptable in commercial hybrid cultivars, being able to genotype QTL affecting head size and identify tightly linked markers may be of benefit to broccoli breeders when attempting to exploit heterosis due to accumulation of beneficial alleles at different loci, i.e. it may inform the choice of parental lines for hybrid production. However, because we have worked with DH lines, we are unable to determine the effect of heterozygosity at the loci or the role that gene interactions might play; to do this would require additional crosses to produce different combinations of QTL. The results reported here therefore indicate that QTL for head size can be successfully identified, and act as an initial step in determining the genetic control of this commercially important trait.

Using the data for multiple years and across years (over multiple environments) for head weight and diameter at harvest increased the environmental heterogeneity but allowed improved estimates of QTL that may not reach the GW significance threshold in one environment alone (Piepho 2005; Chen et al. 2010; Van Eeuwijk et al. 2010). One must also consider that other alleles that influence head morphology may not be segregating in this population and therefore do not contribute to the genetic variance.

Both head morphology and leaf traits showed clustering of QTL for related traits, suggesting possible pleiotropic effects (modularity) of genes at these QTL intervals (Lande 1980; Conner 2002). For example, QTL Whri_RWLOSS_CO2.1_2007 was significant at the GW threshold (P = 0.028) and contributes to a decrease in relative weight loss when the QTL genotype is MarDH34. Whri_RWLOSS_CO2.1_2007 grouped with QTL for head weight over years, head diameter in 2002 and head diameter over years; however, for these three QTL an increase in the line mean is achieved when GDDH33 is the QTL genotype. This is a curious region, since it seems that GDDH33 alleles are acting to increase head diameter and weight, whereas when MarDH34 is the QTL genotype the benefit is decreased relative weight loss during storage; MarDH34 has a more compact head structure than GDDH33, which may help explain the reduced weight loss. Interestingly, GDDH33 has a smaller head diameter than MarDH34; however, the re-assortment of chromosomal segments due to recombination unmasked the QTL in this region. This is also the case for a small (10 cM) region on LG C6, where four QTL for head diameter co-locate; in each case an increased head diameter is observed when GDDH33 is the QTL genotype. The unmasking of these QTL may go some way towards explaining the transgressive segregation observed for this trait.

On LG C9, QTL for head weight and head diameter measured in 2006 and a QTL for relative head weight loss measured in 2007 sit within a 39.1-cM interval between markers OL10D08 and OL12A04. Brown et al. (2007) mapped QTL for head weight at harvest (HW-3) and harvest date maturity (MAT-2) that sit within this 39.1-cM interval. The head weight QTL may be co-located; it would therefore be useful to fine map this interval to resolve the allele or alleles that act to increase head weight.

Clustering of QTL for leaf traits was observed on LG C3, where a GDDH33 wing number QTL clustered with a GDDH33 QTL for lobe number. Also on C3, QTL for lobe number, leaf shape and lamina petiole length co-locate.

The weak correlations of head weight and head diameter with midrib width may have been due to tight linkage of QTL where blocks of the genome are in linkage disequilibrium, maintaining independent gene combinations responsible for head morphology/leaf architecture traits (Conner and Via 1993; Conner 2002; Juenger et al. 2005). The clustering of QTL for morphological traits indicates the presence of ‘hotspots’ for loci controlling traits of interest. QTL analysis is based on identifying additive effects of QTL; however, complex traits are likely to be influenced by interactions between loci, and breeders aim to select beneficial gene combinations as well as beneficial alleles at individual QTL.

The MG map is anchored to other published maps, which increases its utility: it will allow identification of additional markers in QTL regions, facilitating mapping of traits measured in populations derived from other broccoli cultivars, and the direct mapping of single nucleotide polymorphisms (SNPs) to linkage group bins, thereby increasing the efficiency of MAS for agronomically important traits.

At present, the majority of genetic maps available for broccoli (B. oleracea L. var. italica) are based on intraspecific crosses (Sebastian et al. 2000; Parkin et al. 2005; Kim et al. 2006); however, it is not possible to assess head morphology traits in the populations associated with these maps. Intra-crop crosses reduce the number of available polymorphic loci, but for traits of agronomic importance, the genetic variation captured enables a direct relationship to be established between trait and crop type, reducing the time required for incorporation into elite breeding material. Marker-assisted selection of genomic regions from non-target crop varieties (e.g. when using existing intraspecific mapping populations available for B. oleracea) increases the chance of incorporating deleterious alleles through linkage drag; therefore, markers associated with a specific agronomic trait need to delimit the QTL to a smaller interval than is required for a QTL discovered within the crop type, which is more useful to a breeder.

Species-specific sequence data combined with an accurate genetic map contribute to deciphering syntenic links between Brassica species and with Arabidopsis. Since B. oleracea contains triplicate regions compared with Arabidopsis, syntenic blocks may be rearranged, making it difficult to establish syntenic relationships (Langercrantz and Lydiate 1996; O’Neill and Bancroft 2000; Patterson et al. 2001; Ryder et al. 2001). The MGDH map and population, in combination with other B. oleracea genetic resources, will play a central role in aligning B. oleracea L. var. italica genomic regions to the reference genomic sequences of B. rapa (Choi et al. 2007; Wang 2010) and B. oleracea [B. oleracea sequencing consortium 2011 (http://brassica.jcvi.org/cgi-bin/brassica/consortium.cgi)] when these become available. The opportunity will therefore arise to re-calculate syntenic relationships both between Brassica crops and with Arabidopsis for crop improvement (King 2006).

The increasing accessibility of high-throughput sequencing technologies is enabling SNP discovery between cultivars; however, these SNPs still need to be mapped if they are from unknown regions, as is often the case in crop transcriptome sequencing. Placing these polymorphic loci on within-crop genetic maps will help to establish their respective positions, increasing the resolution of public and proprietary genetic maps. Genetic maps will still have a role in complex trait analysis (QTL linkage mapping), linking phenotype data to the growing genome data to derive usable markers for breeding purposes.

We have presented a new immortal broccoli × broccoli ‘intra-crop’ mapping population, with a framework linkage map. These tools offer the means to further expand the scope for trait dissection within this agronomically important crop, accelerating the incorporation of these traits directly into breeding programmes.

The broccoli leaf trait QTL identified in this work may be applicable to comparative mapping in leafy vegetable Brassica crops such as cabbage and kale. The broccoli head morphological QTL can be used to select for increased saleable head weight with reduced water loss during post-harvest storage.