Introduction

Soft scald (SS) and soggy breakdown (SBD) are economically significant cold storage disorders of apple (Malus × domestica) fruit. Fruits that develop either disorder are unmarketable for fresh eating. These disorders are problematic for growers because they usually develop weeks after producers have invested considerable resources into harvest and cold storage. Incidences of SS and SBD vary among cultivars. Soft scald, also known as ribbon scald, is characterized by distinct brown and often longitudinal lesions on the fruit that frequently extend into the flesh and can lead to secondary infections in the fruit (Brooks and Harley 1934; Meheriuk et al. 1994). Soft scald is said to be physiologically and incidentally distinct from the cold storage disorder superficial scald (Meheriuk et al. 1994; Kupferman 2001). SBD, also referred to as low-temperature breakdown (DeLong et al. 2004; Prange et al. 2011), is an internal disorder where the flesh of the fruit becomes brown, soft, spongy, and often moist (Plagge et al. 1935; DeLong et al. 2004). SBD is physiologically distinct from other forms of internal browning associated with cold storage, including senescent breakdown and CO2 injury (Brooks and Harley 1934; Prange et al. 2011; Watkins and Nock 2012). Soft scald and SBD may be physiologically related (Brooks and Harley 1934; Watkins and Rosenberger 2012; Watkins et al. 2004), though coincidence varies and some cultivars are susceptible to only one of the two disorders (Brooks and Harley 1934; Watkins et al. 2005). Older literature lists several important cultivars as being susceptible to the development of SS and/or SBD, including ‘Jonathan,’ ‘Winter Banana,’ ‘Northwestern Greening,’ ‘Rome Beauty,’ ‘Grimes Golden,’ ‘Golden Delicious,’ ‘Wealthy,’ and ‘Stayman Winesap’ (Harley and Fisher 1931; Brooks and Harley 1934; Plagge and Maney 1937). More recent literature mentions ‘Scifresh’ as being susceptible to SS, but mostly focuses on ‘Honeycrisp,’ which is reported as being particularly susceptible to both SS and SBD (Rosenberger et al. 2001).

‘Honeycrisp’ has been very useful as a parent in apple breeding programs due to the exceptionally crisp texture of its fruit (Mann et al. 2005), which it retains through long periods of cold storage (Tong et al. 1999; Rosenberger et al. 2001; Trujillo et al. 2012), and its reported resistance to apple scab (Clark et al. 2014a). The usefulness of ‘Honeycrisp’ as a breeding parent warrants genetic studies designed to identify stable, large-effect quantitative trait loci (QTL) associated with the disorders, as well as the development of marker-assisted breeding (MAB) targets to select against the development of such storage disorders in future ‘Honeycrisp’ offspring.

Until recently, studies on SS and SBD have generally focused on identifying risk factors for these disorders, correlated variables, and management practices to reduce SS incidence, primarily in ‘Honeycrisp.’ These include studies investigating physiological differences between fruits affected and unaffected by SS (Hopkirk and Wills 1981; Tong et al. 2003; Johnston et al. 2009, 2010; Moran et al. 2009; Ehsani-Moghaddam and DeEll 2013; Leisso et al. 2016) and/or SBD (Leisso et al. 2015), studies investigating differences in growing or harvesting conditions (Robinson and Watkins 2003; Watkins et al. 2005; Henriod et al. 2008; Johnston et al. 2009, 2010; Moran et al. 2009; Prange et al. 2011; Robinson and Lopez 2012), and studies investigating optimum storage conditions and treatments (Harley and Fisher 1931; Fan et al. 1999; Watkins and Rosenberger 2012; DeLong et al. 2004, 2006; Watkins et al. 2004; Johnston et al. 2009, 2010; Moran et al. 2010; Watkins and Nock 2012).

Although the previously cited studies are of great importance to postharvest physiologists and the commercial apple industry, they have provided limited insight for apple breeding efforts. At the time of this study, only two published studies had focused on evaluating the genetic basis of SS and/or SBD incidence. Alspach et al. (2000) reported high narrow-sense heritabilities for SS and low to medium narrow-sense heritabilities for “breakdown” and “chilling injury,” respectively, suggesting genetic potential for limiting at least SS incidence in segregating families. More recently, McClure et al. (2016) reported two QTL for SS incidence on linkage groups (LGs) 2 and 3, along with a single SNP marker at each QTL where the heterozygous state was associated with higher SS incidence. Although these reports have provided useful and promising results, MAB targets for limiting SS and SBD incidence in progeny of ‘Honeycrisp’ based on consistent QTL results are still needed.

MAB targets for SS and SBD incidence will be most useful if they are compatible across a wide array of germplasm relevant to dessert apple breeding programs. Pedigree-based QTL analysis (PBA) (Van de Weg et al. 2003) using DNA markers, pedigree information, and inter-related populations has been proposed as an approach to identify and characterize more QTL and more alleles at identified QTL than are possible in biparental QTL studies. Recent advances have been made in PBA by the development of dedicated software including FlexQTL™ (Bink et al. 2002, 2008, 2014; www.flexqtl.nl), Pedimap (Voorrips et al. 2012), and PediHaplotyper (Voorrips et al. 2016). PBA has been made a more attractive approach in apple through guidelines on the composition of the study germplasm (Peace et al. 2014), high-throughput genome-wide genotyping capabilities through SNP arrays (Chagné et al. 2012; Bianco et al. 2014, 2016), the availability of sets of pedigreed full-sib families (Peace et al. 2014), and standardized phenotyping procedures for some major traits (Evans et al. 2011a). This approach has been increasingly utilized in a variety of Rosaceous fruit species, including strawberry (Whitaker 2011; Roach et al. 2016; Mangandi et al. 2017), peach (Fresnedo-Ramírez et al. 2015, 2016; Mora et al. 2017), cherry (Rosyara et al. 2013; Stegmeir et al. 2014; Sandefur et al. 2016), and apple (Schmitz et al. 2013; Bink et al. 2014; Guan et al. 2015; Allard et al. 2016; Di Guardo et al. 2017; Durand et al. 2017).

The previously described techniques and tools should enable identification and characterization of QTL for SS and SBD incidence that can be used to aid in the development of MAB targets that are compatible across a diversity of germplasm important in apple breeding. With this context, the objectives of this study were to (1) identify QTL for SS and SBD incidence in pedigree-connected germplasm sets, (2) characterize the effects of functional haplotypes at identified QTL that are stable across both families and years, and (3) report on the composition of functional marker haplotypes in important cultivars and founding germplasm in a context appropriate for use in MAB. The ultimate goal of this study was to enable a MAB solution to developing apple populations and seedlings with lower incidences of SS and SBD.

Materials and methods

Plant material

The germplasm evaluated in this study included the University of Minnesota (UMN) portion of the RosBREED crop reference set (Peace et al. 2014), parental selections, and seedling families from the UMN apple breeding program, including four large full-sib families that share ‘Honeycrisp’ as the common parent (Table S1). Several small families were included in the analysis as the assumption was that they would share haplotypes with larger families at identified QTL based on pedigree-relatedness between families.

The non-‘Honeycrisp’ parents in the four large full-sib families were the UMN selection MN1702 and cultivars ‘Jonafree,’ ‘Pitmaston Pineapple,’ and ‘Monark’ as described in McKay et al. (2011). Most individuals from these four families were also represented by a second, replicate tree, which was evaluated when fruit were available. A total of 433 individuals were evaluated across the two years of the study, all grown on ‘Budagovsky 9’ rootstock in Chanhassen, MN, USA, at the UMN Horticultural Research Center.

Phenotypic data

A minimum of 10 and a maximum of 25 fruit, depending on availability, were harvested from each tree when sampled fruits were at an estimated starch-iodine maturity rating of 5 or higher on the 8-point Cornell starch-iodine index (SI) scale (Blanpied and Silsby 1992). Reports that fruits of advanced maturity are more likely to develop SS (Harley and Fisher 1931; Brooks and Harley 1934; Meheriuk et al. 1994), with a reported maturity point of maximum susceptibility (Tomkins 1966; Prange et al. 2011), prompted us to use this intermediate to high SI maturity target rating rather than the SI maturity target rating of 3 that has been previously used in RosBREED phenotyping (Evans et al. 2011a). Fruit was evaluated for SI maturity rating following observations of skin background and over-color changes and the ease with which fruit released from the tree when picking. Fruits were harvested when their exterior was representative of most of the fruit on the tree and of the test fruit that had an SI rating of 5 or higher. In cases where fruit was limited, all fruit were harvested except for fruit that was clearly at a lower maturity level based on skin background and over-color.

Five of the harvested fruit were destructively evaluated at harvest to obtain an estimated average SI rating. All remaining fruit (an average of 16.5 fruit across all individuals and replicated trees) were stored at 1 °C and evaluated for SS and SBD incidence after 20 weeks. Disorder incidence was recorded as the number of fruit with the disorder divided by the total number of fruit evaluated from a tree. Fruit were scored as having SS if they had symptoms consistent with the descriptions in Brooks and Harley (1934) and Meheriuk et al. (1994) of distinct brown, usually sunken, and often longitudinal lesions on the skin of the fruit appearing after a period of cold storage. Fruit were scored as having SBD if they had symptoms consistent with the description in Brooks and Harley (1934) of distinct brown and spongy tissue in the fruit cortex. The description and image of “low-temperature breakdown” provided in Meheriuk et al. (1994) were also used, but the description was modified for this study. Whereas the description in Meheriuk et al. (1994) noted that low-temperature breakdown is “usually… separated from the skin by normal tissue,” fruit observed in this study with clear SBD symptoms in the presence of SS-affected skin were scored as also having SBD. Examples of SS and SBD as defined in this study can be found in Fig. S1. Fruit that were too damaged or rotted to evaluate were excluded from analysis.

Statistical correlations were evaluated between SS and SBD incidence for 2014 and 2015 and between disorder incidence and average SI rating at harvest. Since the distribution of disorder incidence departed greatly from normal, these correlations were calculated using the non-parametric Kendall’s tau-b coefficient by means of the Kendall package (McLeod 2011) in the statistical software R version 3.3.0 (R Core Team 2017). All pair-wise comparisons that were found to be statistically significant in this test were also checked for statistical significance after applying a Bonferroni correction to account for multiple comparisons.
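The correlation step described above can be illustrated with a minimal Python sketch. It is not the Kendall R package used in the study: `kendall_tau_b` and `bonferroni_significant` are hypothetical helper names, the incidence values and p-values are invented for illustration, and only the tau-b coefficient (not its p-value) is computed.

```python
from itertools import combinations
from math import sqrt


def kendall_tau_b(x, y):
    """Kendall's tau-b for two equal-length numeric sequences (tie-corrected)."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    concordant = discordant = tied_x_only = tied_y_only = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        dx, dy = x1 - x2, y1 - y2
        if dx == 0 and dy == 0:
            continue  # tied on both variables: contributes to neither term
        if dx == 0:
            tied_x_only += 1
        elif dy == 0:
            tied_y_only += 1
        elif dx * dy > 0:
            concordant += 1
        else:
            discordant += 1
    untied = concordant + discordant
    denom = sqrt((untied + tied_x_only) * (untied + tied_y_only))
    return (concordant - discordant) / denom if denom else float("nan")


def bonferroni_significant(p_values, alpha=0.05):
    """Flag p-values that remain significant at alpha divided by the number of tests."""
    threshold = alpha / len(p_values)
    return [p <= threshold for p in p_values]


# Hypothetical per-individual incidences, not data from the study.
ss_incidence = [0.0, 0.0, 0.1, 0.4, 0.8]
sbd_incidence = [0.0, 0.1, 0.0, 0.5, 0.9]
tau = kendall_tau_b(ss_incidence, sbd_incidence)

# Hypothetical p-values for three pairwise comparisons.
flags = bonferroni_significant([0.04, 0.001, 0.2])
```

Tau-b is a natural choice for incidence data of this kind because many individuals share an incidence of zero, and the tie correction in the denominator keeps those ties from distorting the coefficient.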

Genetic data

All individuals were genotyped on the International RosBREED SNP Consortium 8K Illumina Infinium® array v1 (Chagné et al. 2012) with the exception of several founding genotypes or intermediate ancestors, whose genetic data were provided by FruitBreedomics (Laurens et al. 2010) and were genotyped on the Illumina Infinium® 20K array (Bianco et al. 2014). The DNA extraction methods used were as in Clark et al. (2014b). Genotype data for the four large full-sib families came from a previous study on the creation of an integrated genetic map (Howard et al. 2017). SNP alleles for the remaining germplasm were called together using automated allele calling in GenomeStudio™. Marker data were evaluated for consistency within pedigrees with the assistance of the marker evaluation functions of FlexQTL™ and Visual FlexQTL™. Missing data were inferred by FlexQTL™ where feasible. Suspected false or questionable recombinations were evaluated, and genotype scores were changed or recoded as missing data if it was evident that an individual had a genotype score that could not have been inherited or passed on to offspring given information from the connected pedigree. All pedigree relationships used in this study have been validated through SNP data using a process similar to that used in Howard et al. (2017) to confirm ancestors of ‘Honeycrisp.’ A total of 3419 SNPs were used in the QTL analyses. The genetic map used in this study is described in Howard et al. (2017). The marker data for seedling individuals evaluated in this study can be found in Table S2.

QTL analyses

FlexQTL™ software (www.flexqtl.nl) was used to conduct the QTL analyses. The Bayesian statistics and PBA methodology used in FlexQTL™, as described in Bink et al. (2002, 2008, 2012), build on approaches and procedures developed in Sillanpää and Arjas (1999). The implementation of FlexQTL™ in QTL analyses has been described in detail in Bink et al. (2014) in a proof of concept paper. Each separate QTL analysis had a Markov chain Monte Carlo simulation length of 2.5 × 10⁵, with every 250th sample stored for a total of 1000 samples for use in posterior QTL inferences. Additional duplicate QTL analyses were performed with two different starting seed numbers, maximQTL settings (5, 10), and priorQTL settings (2, 5) for the Markov chain Monte Carlo simulation for each year-trait combination evaluated in order to ensure that the results for the identified QTL and their positions were reproducible. Data from only one simulation from each year and trait combination are reported in this paper, with the following settings: maximQTL = 5, priorQTL = 2, addGENE = 5 (for QTL effects being additive with a normal prior distribution and a (co)variance matrix with a random, diagonal structure). QTL positions were recorded as QTL intensity estimates via posterior distributions of QTL locations as described in Sillanpää and Arjas (1999). QTL regions were recorded as a series of successive 2-cM bins with 2*ln Bayes factors (Kass and Raftery 1995) that were greater than 5. The most probable QTL positions were recorded as the median cM value from the Markov chain Monte Carlo simulation samples within each QTL region.
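The region and peak definitions above can be sketched in a few lines of Python. This is an illustration only and not FlexQTL™ output processing: the function names are hypothetical, the bins are assumed to start at 0 cM on a linkage group, and the Bayes factors and sampled positions are invented.

```python
from statistics import median


def stored_samples(chain_length, thinning):
    """Number of stored samples when every `thinning`-th iteration is kept."""
    return chain_length // thinning


def qtl_regions(two_ln_bf, bin_width_cm=2.0, threshold=5.0):
    """Return (start_cM, end_cM) for each run of successive bins above threshold."""
    regions, run_start = [], None
    for i, bf in enumerate(two_ln_bf):
        if bf > threshold and run_start is None:
            run_start = i  # a new region opens at this bin
        elif bf <= threshold and run_start is not None:
            regions.append((run_start * bin_width_cm, i * bin_width_cm))
            run_start = None
    if run_start is not None:  # region still open at the end of the linkage group
        regions.append((run_start * bin_width_cm, len(two_ln_bf) * bin_width_cm))
    return regions


def peak_position(sampled_positions_cm, region):
    """Median of the sampled QTL positions that fall inside one region."""
    start, end = region
    inside = [p for p in sampled_positions_cm if start <= p < end]
    return median(inside) if inside else None


# Hypothetical 2lnBF values for five successive 2-cM bins on one linkage group.
regions = qtl_regions([6.1, 7.4, 3.0, 1.2, 5.5])
# Hypothetical sampled QTL positions (cM) from the stored simulation samples.
peak = peak_position([0.5, 1.2, 3.1, 9.0], regions[0])
n_stored = stored_samples(250_000, 250)
```

The thinning arithmetic matches the text: a chain of 2.5 × 10⁵ iterations with every 250th sample stored yields 1000 samples.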

Separate QTL analyses were performed for both SS and SBD incidences for 2014 and 2015. Phenotypic scores for individuals with replicates were averaged for the purposes of QTL analyses. Physical locations for identified QTL that were consistent across years were retrieved by blasting SNP sequences at QTL regions to version 1.1 of the GDDH13 apple reference genome (Daccord et al. 2017, https://iris.angers.inra.fr/gddh13/) with BLASTN 2.2.25+ (Zhang et al. 2000).

Haplotype analysis

QTL that were consistently identified in both 2014 and 2015 with 2lnBF > 5, indicating at least strong evidence for QTL (Bink et al. 2008), were considered for haplotype analysis. Marker phasing was conducted using FlexQTL™. Fifteen successive SNP markers were chosen for marker haplotyping of each QTL region based on their proximity to identified QTL peaks. Haplotype identities were assigned by marker state (Identity by state, IBS), but also traced through known pedigrees (Identity by descent, IBD) to ensure marker data and phasing were consistent across connected pedigrees. Genome-wide phased marker data for parents that did not have SNP-confirmed pedigrees were compared to founder genotypes that had the same haplotype at QTL of interest to determine whether it is likely that important haplotypes may also be IBD via distant, but still unidentified genetic relationships. Haplotypes that were identical by state (IBS) between these parents and founders were considered likely to be IBD if they shared multiple extended IBS haplotypes of more than 10 cM that together totaled more than 10% of the SNP genome from the linkage map used in this study, which is more than 234.4 cM (= 10% × 1172 cM (Howard et al. 2017) × 2 parental haplotypes for diploids). These postulated relationships also had to be temporally and geographically consistent with available historical information about the listed pedigree for the parent under consideration.
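The quantitative part of the IBD criterion above can be written as a short check. The sketch below assumes the shared IBS segments have already been identified and measured; `likely_ibd` and the example segment lengths are hypothetical, and the temporal and geographic consistency requirement is a manual judgment that the code does not capture.

```python
MAP_LENGTH_CM = 1172.0   # linkage map length reported in the text (Howard et al. 2017)
PLOIDY = 2               # two parental haplotypes for diploids
MIN_SEGMENT_CM = 10.0    # only shared segments longer than this count
MIN_FRACTION = 0.10      # shared segments must exceed 10% of the phased map


def threshold_cm():
    """Total shared length that must be exceeded (234.4 cM in the text)."""
    return MIN_FRACTION * MAP_LENGTH_CM * PLOIDY


def likely_ibd(segment_lengths_cm):
    """True if multiple long IBS segments together exceed the threshold."""
    qualifying = [s for s in segment_lengths_cm if s > MIN_SEGMENT_CM]
    return len(qualifying) > 1 and sum(qualifying) > threshold_cm()


# Hypothetical segment lengths (cM) shared between a parent and a founder.
passes = likely_ibd([40.0] * 6)               # 240 cM in six long segments
fails = likely_ibd([40.0] * 5 + [9.0] * 10)   # short segments are ignored
```

Restricting the sum to segments longer than 10 cM means that many short stretches of chance IBS sharing cannot add up to a false signal of relatedness.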

Bootstrapping was used to estimate SS and SBD incidence means and 95% confidence intervals per year for individuals grouped by their haplotype composition at identified QTL, using R (R Core Team 2017), in order to estimate the effects of functional haplotypes. Mean disorder incidence and 95% confidence intervals for each defined haplotype group were generated from 10,000 bootstrap sample means. Each bootstrap mean was computed from N observations of disorder incidence drawn with replacement, where N was equal to the number of genetically distinct individuals in the dataset from which bootstrap statistics were being generated. The replicates for individuals were weighted, meaning both of the disorder incidence observations for an individual with a replicate were individually half as likely to be chosen at random for inclusion into a single bootstrap sample mean compared to an individual that did not have a replicate. Conclusions regarding haplotype effects were based on differences between mean disorder incidences and associated 95% confidence intervals of the defined haplotype groups, without correction for multiple comparisons. This method was used because of the non-normal distribution of the phenotypic data and to incorporate the uneven replication in the populations that were evaluated.
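A minimal Python sketch of this weighted bootstrap follows. The study used R, so this is a re-expression under stated assumptions rather than the authors' code: `weighted_bootstrap` is a hypothetical name, the example data are invented, and the confidence interval is taken as the 2.5th and 97.5th percentiles of the bootstrap means, which the text does not specify.

```python
import random


def weighted_bootstrap(observations, n_boot=10_000, seed=0):
    """Bootstrap mean and 95% interval for one haplotype group.

    observations: dict mapping individual id -> list of incidence values,
    one value per tree (so one or two values per individual).
    """
    rng = random.Random(seed)
    values, weights = [], []
    for trees in observations.values():
        for incidence in trees:
            values.append(incidence)
            # Each replicate of a two-tree individual gets half the weight.
            weights.append(1.0 / len(trees))
    n = len(observations)  # N = number of genetically distinct individuals
    means = sorted(
        sum(rng.choices(values, weights=weights, k=n)) / n
        for _ in range(n_boot)
    )
    mean = sum(means) / n_boot
    # Percentile interval (an assumption; the interval method is not stated).
    lower = means[int(0.025 * n_boot)]
    upper = means[int(0.975 * n_boot) - 1]
    return mean, (lower, upper)


# Hypothetical group: two individuals with replicate trees, two without.
group = {"a": [0.0, 0.2], "b": [0.5], "c": [1.0, 0.8], "d": [0.0]}
mean, (lower, upper) = weighted_bootstrap(group, n_boot=2000, seed=1)
```

Weighting observations by the inverse of the number of trees per individual gives every genotype the same total probability of being drawn, so replicated individuals do not dominate the group mean.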

Results

Phenotypic data

SS and SBD incidences ranged from one extreme to the other, were heavily skewed towards no incidence (Figs. S2 and S3), and showed modest but significant year-to-year correlations (τ-b² = 0.23 and 0.18 for SS and SBD, respectively) (Table S3; Fig. S4). Coincidence of SS and SBD within individuals was very high for both years studied (τ-b² = 0.53 and 0.64 for 2014 and 2015, respectively) (Table S3), though many individuals had separate fruit with only SS or only SBD. Average SI rating at harvest ranged from one extreme to the other on the 8-point Cornell SI scale (Fig. S5) despite the target SI rating of 5. SI was not correlated with either SS or SBD, with the exception of a weak, positive correlation between SI and SS in 2014 (P = 0.047) (Table S3).

QTL discovery

Two QTL were consistently identified in separate analyses of both SS and SBD in both 2014 and 2015 at the top of LG2 and the middle of LG16 (Tables 1 and 2; Fig. 1). The only other genome region with strong evidence (2lnBF > 5) for the presence of a QTL was on LG13 for SS in 2014 (Tables 1 and 2). The LG2 QTL spans the physical range of approximately 0.2 to 1.5 Mbp (SNPs ss475876955 to ss475876990), indicating that the QTL is at the top of chromosome 2. The LG16 QTL spans the physical range from approximately 11.5 to 12.3 Mbp (SNPs ss475881829 to ss475881842). Use of binary data rather than original incidences resulted in the identification of the same QTLs (data not shown).

Table 1 Bayes factors (2*ln) for a 1 QTL versus 0 QTL model per linkage group for soft scald and soggy breakdown incidence for all years evaluated in this study. The 2*ln Bayes factors are interpreted as having very little (0–2), positive (2–5), strong (5–10), and decisive (> 10) evidence for a 1 QTL model versus a 0 QTL model per linkage group. The shade of red corresponds to the height of the observed 2lnBFs ranging from white (2lnBF = − 3) to deep red (2lnBF = 18)
Table 2 QTL regions and peak positions for soft scald and soggy breakdown for QTL with 2*ln Bayes factors greater than 5. QTL regions reported consist of successive 2-cM bins with 2*ln Bayes factors greater than 5. QTL peaks were recorded as the median cM value from the Markov chain Monte Carlo simulation samples within the QTL regions
Fig. 1 Posterior probability for QTL positions from FlexQTL output for soft scald and soggy breakdown incidence in 2014 and 2015. Chromosome numbers are indicated at the top of each graph. Green lines at the bottom of each graph indicate marker positions. Numbers below the series of graphs indicate the cM position at the end of each linkage group

Haplotype analyses at LG2 QTL

At the LG2 QTL, 19 haplotypes were represented in the parents of seedling families using 15 markers between 0 and 2.03 cM (Tables S4 and S5). One haplotype was consistently associated with increased incidence of SS and SBD and will be referred to as the high disorder incidence (HDI) haplotype. ‘Honeycrisp’ is homozygous for the HDI haplotype. In the three larger ‘Honeycrisp’ families where the other parent is heterozygous for the HDI haplotype, homozygous HDI offspring exhibited consistently higher incidences of both SS and SBD than offspring with only one HDI haplotype in both 2014 and 2015 (Fig. 2). None of the larger families in the study were derived from two heterozygous HDI parents, resulting in an absence of offspring that had no copies of the HDI haplotype. However, such progenies did occur in some of the smaller families. The general trend across the entire population for SS and SBD incidence versus the number of HDI haplotypes in individuals was that the effect of a second copy was about 2 to 6 times higher than that of a single copy in 2014 and 2015, respectively (Fig. 3). This trend was consistent for both disorders and implies that the LG2 QTL shows partial negative dominance when examined on its own. Averaged over disorders, years, and LG16 QTL genotypes, incidence decreased from 0.43 to 0.14 and to 0.07 for individuals having two copies, one copy, and no copies of the HDI haplotype, respectively.
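The averaged incidences reported above (0.07, 0.14, and 0.43 for zero, one, and two HDI copies) can be turned into per-copy effects with simple arithmetic. The sketch below assumes that the "effect" of a copy is the increase in mean incidence it adds, and it uses the textbook degree-of-dominance ratio d/a; neither is stated as the authors' calculation, and `copy_effects` is a hypothetical helper name.

```python
def copy_effects(inc0, inc1, inc2):
    """Per-copy effects and degree of dominance from three genotype means.

    inc0, inc1, inc2: mean incidence with zero, one, and two HDI copies.
    """
    first_copy = inc1 - inc0           # gain from the first HDI copy
    second_copy = inc2 - inc1          # additional gain from the second copy
    a = (inc2 - inc0) / 2              # additive effect (half the homozygote difference)
    d = inc1 - (inc0 + inc2) / 2       # heterozygote deviation from the midpoint
    return {
        "first_copy": first_copy,
        "second_copy": second_copy,
        "ratio": second_copy / first_copy,
        "dominance_ratio": d / a,      # 0 = additive, -1 = fully recessive HDI
    }


# Averaged incidences from the text for zero, one, and two HDI copies.
effects = copy_effects(0.07, 0.14, 0.43)
```

With these averages the second copy adds about four times as much incidence as the first (0.29 versus 0.07), which falls within the reported range of 2 to 6 times, and the dominance ratio of roughly -0.6 is consistent with the partial negative dominance described in the text.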

Fig. 2 Mean incidences and their 95% confidence intervals for soft scald and soggy breakdown in 2014 and 2015 for offspring from three families segregating for number of copies of the high disorder incidence (HDI) haplotype grouped by family and diplotype at the LG2 QTL (bootstrapped data). Numbers below bars represent the number of individuals within each category with the number in parentheses representing the number of individuals and their replicates

Fig. 3 Mean incidence and 95% confidence intervals for soft scald and soggy breakdown in 2014 and 2015 for offspring across all families grouped by number of copies of the high disorder incidence (HDI) haplotype at the LG2 QTL (bootstrapped data). Numbers below bars represent the number of individuals within each category with the number in parentheses representing the number of individuals and their replicates

Haplotype analyses at LG16

The LG16 QTL had 16 haplotypes represented in parents of the seedling families using 15 markers between 34.2 and 37.6 cM (Tables S5 and S6). Some haplotypes at the LG16 QTL were associated with greatly increased SS and SBD incidence in three of the large ‘Honeycrisp’ families, though these effects were not as large as that of the HDI haplotype at the LG2 QTL. The LG16-H03 haplotype showed variable performance among families. It was associated with increased incidence in the ‘Honeycrisp’ x ‘Monark’ family but not in the ‘Honeycrisp’ x MN1702 family. In the ‘Jonafree’ x ‘Honeycrisp’ family, haplotype effects were inconsistent between years, but the LG16-H08 haplotype was associated with higher disorder incidence in 2015. No haplotypes from the smaller families were identified that were associated with high disorder incidence, and no differences in disorder incidence were observed between the two ‘Honeycrisp’ haplotypes at this QTL (data not shown).

Interaction between LG2 and LG16 QTL

Considering the compound LG2/LG16 QTL genotypes (Fig. 4), the three larger ‘Honeycrisp’ families showed a consistent pattern across both disorders and across years for the HDI homozygous, but not for the HDI heterozygous offspring. The ‘Honeycrisp’ x MN1702 family was the only family that showed a consistent pattern both across years and HDI dose, whereby the LG16-H03Fireside haplotype was always associated with low disorder incidences. Also, the LG16-H07 haplotype was only associated with higher disorder incidences with two copies of the HDI haplotype at the LG2 QTL (Fig. 4). The latter, if reproducible, would indicate that the LG2 QTL is epistatic to the LG16 QTL in the ‘Honeycrisp’ x MN1702 family, whereby the LG16-H03Fireside haplotype is expressed only in conjunction with two HDI doses at LG2. The ‘Honeycrisp’ x ‘Monark’ family showed a consistent pattern for the HDI-homozygous offspring, whereby the LG16-H03Monark haplotype was associated with high disorder incidence and the LG16-H06 haplotype was associated with a moderate incidence. Low disorder incidence was observed in individuals from this family with a single HDI dose, with the exception of the LG16-H03Monark haplotype in 2014, which was associated with moderate disorder incidence. The ‘Honeycrisp’ x ‘Jonafree’ family was variable in SS and SBD incidences across all QTL genotypes and years. In 2014, the mean disorder incidence of each of the four compound QTL genotypes was similar with large 95% confidence intervals. However, in 2015, the LG16-H08 haplotype was associated with higher disorder incidences compared to the LG16-H04 haplotype when in conjunction with a single HDI dose, and both haplotypes were associated with relatively higher disorder incidences in conjunction with two copies of the HDI haplotype at the LG2 QTL versus only one copy of the HDI haplotype. The variable performance of the LG16 QTL over years and genetic backgrounds prevented its effect from being summarized by a simple, single quantification, as was possible with the LG2 QTL.

Fig. 4 Mean incidence and 95% confidence intervals for soft scald and soggy breakdown in 2014 and 2015 for offspring from three families with ‘Honeycrisp’ as a common parent across all germplasm evaluated, grouped by number of copies of the high disorder incidence (HDI) haplotype at the LG2 QTL and the non-‘Honeycrisp’ haplotype contribution at the LG16 QTL (bootstrapped data). Note that the LG16-H03 haplotype could not be traced to the same ancestral origins between MN1702 and ‘Monark.’ Numbers below bars represent the number of individuals within each category with the number in parentheses representing the number of individuals and their replicates

Origin of LG2 high disorder incidence haplotype

The origins of the HDI haplotypes in ‘Honeycrisp’ were traced to ‘Northern Spy’ through parent ‘Keepsake,’ and to ‘Grimes Golden’ through parent MN1627 (Howard et al. 2017) (Table S4). Multiple parents of families evaluated in this study share one copy of the HDI haplotype by descent (Table S4). ‘Honeygold’ inherited it from its grandparent ‘Grimes Golden’ through ‘Golden Delicious,’ and ‘Sweet 16’ inherited it from its parent ‘Northern Spy.’ MN1702 and ‘Wildung’ both inherited this haplotype from ‘Fireside.’ ‘Jonafree,’ ‘Dayton,’ and ‘Monark’ did not have their direct parents genotyped with the array. The origin of their HDI haplotypes could be deduced by comparing their phased SNP data to the phased SNP data for earlier ancestors or, in the case of ‘Monark,’ to likely founding cultivars, as detailed below.

The pedigree of ‘Jonafree’ has been published (Dayton et al. 1979), and marker data were available for three of its grandparents and both of its great-grandparents through the grandparent that did not have marker data available. ‘Jonafree’ likely inherited its copy of the HDI haplotype through its ungenotyped parent NJ31 from either grandparent ‘Rome Beauty’ or ‘Northern Spy’ as it could not have inherited its other haplotype (LG2-H02) from either of these grandparents. The LG2 homolog in ‘Jonafree’ containing the HDI haplotype is IBS with ‘Rome Beauty’ for the first 49 cM of LG2 (155 markers). After this point, the haplotype matches with ‘Northern Spy’ for the remainder of the chromosome, indicating a meiotic recombination event at 49 cM in the gamete from NJ31 that formed ‘Jonafree.’ For the HDI haplotype to have been inherited through ‘Northern Spy,’ a recombination would have been necessary at 2 cM and again at 49 cM within NJ31 between its ‘Northern Spy’ and ‘Rome Beauty’ LG2 homologs. Either source was considered possible (Table S4).

The published pedigree of ‘Dayton’ (Korban et al. 1988) matched with SSR marker genotypes of ‘Dayton’ and its contributing founders when its assumed great-grandparent ‘Jonathan’ was replaced by ‘Golden Delicious’ (Evans et al. 2011b). Extended SNP haplotypes shared between ‘Dayton’ and ‘Golden Delicious’ for at least 10 cM span 612 cM, or 26%, of their phased marker data and include five regions larger than 30 cM. ‘Rome Beauty’ and its sport ‘Red Rome’ are validated founders for ‘Dayton’ in different areas of the pedigree as a great-grandparent, a great-great-grandparent, and as two great-great-great-grandparents (coefficient of relationship = 0.25). Extended SNP haplotypes shared between ‘Dayton’ and ‘Rome Beauty’ for at least 10 cM span 611 cM, or 26%, of their phased marker data and include five regions larger than 30 cM. ‘Rome Beauty’ is the likely source of the HDI haplotype as the first ~ 12.9 cM (81 markers) of ‘Dayton’ match with ‘Rome Beauty,’ though it is also possible that ‘Golden Delicious’ could have contributed the HDI haplotype instead, with evidence of meiosis being undetected due to having a common extended haplotype with ‘Rome Beauty’ for the first 2.03 cM of the linkage map. Furthermore, the individuals in the published pedigree of ‘Dayton’ (Korban et al. 1988), ‘Wealthy,’ ‘Melba,’ ‘Jonathan,’ ‘Starr,’ and F2 26829-2-2, do not have the HDI haplotype (Table S4).

‘Monark’ (also known as AA-44) was developed by the University of Arkansas and recorded as being the offspring of a cross between NJ674016 and NJ40 (Rom et al. 1998). No pedigree data were available for these selections, but the selection designation indicated they were from Rutgers University as part of the cooperative apple breeding program of Purdue University, Rutgers University, and the University of Illinois, which extensively used both ‘Golden Delicious’ and ‘Rome Beauty’ in crosses (Crosby et al. 1990). Extended SNP haplotypes longer than 10 cM shared between ‘Rome Beauty’ and ‘Monark’ span 19.8% of their phased marker data and include one region of 33.8 cM. Extended SNP haplotypes longer than 10 cM shared between ‘Golden Delicious’ and ‘Monark’ span 28% of their phased marker data and include five regions that are greater than 30 cM. The first 13.1 cM (89 markers) of LG2 containing the HDI haplotype for ‘Monark’ is IBS with ‘Golden Delicious’ and ‘Grimes Golden,’ compared to only 5 cM with ‘Rome Beauty,’ indicating it is likely that ‘Monark’ inherited its copy of the HDI haplotype from ‘Golden Delicious.’

Origin of the LG16 haplotypes for high disorder incidence

The LG16-H07 haplotype in MN1702 associated with increased SS and SBD incidences (Fig. 4) in the ‘Honeycrisp’ x MN1702 family was inherited from its parent ‘Frostbite’ (Table S6). The other haplotype in MN1702 (LG16-H03) was inherited from its other parent ‘Fireside’ (Table S6) and is IBS to the haplotype from ‘Monark’ that was associated with increased SS and SBD incidences in the ‘Honeycrisp’ x ‘Monark’ family (Fig. 4). As the two LG16-H03 haplotypes were associated with distinct disorder incidence levels, we differentiated them by their ancestral origin (H03Fireside and H03Monark), although the sources of the LG16 haplotypes in ‘Monark’ could not be determined.

Discussion

We identified QTL for SS (soft scald) and SBD (soggy breakdown) incidence on LGs 2 and 16 in two years of observation (Tables 1 and 2) across a pedigree-connected germplasm set. SNPs flanking these QTL peaks were located at the top of chromosome 2 and in the middle of LG16. The observed concordance of these two storage disorders and the characteristics of the discovered QTL are discussed below.

Relationships between SS and SBD

SS and SBD incidences were significantly correlated in this study (Table S3). This is consistent with hypotheses from some earlier studies that these disorders might be physiologically related (Brooks and Harley 1934; Watkins and Rosenberger 2012; Watkins et al. 2004). These previous studies evaluated the disorders only within individual cultivars. In contrast, our study identified a correlation between SS and SBD across segregating full-sib families. Additionally, SS and SBD were often observed in separate fruits from the same individual. These observations, in conjunction with the consistent identification of the QTL on both LG2 and LG16 using phenotype data from either disorder, support the hypothesis that these disorders are related physiologically and genetically.

Previously reported QTL for SS and SBD

At the time of our study, only one other QTL study for SS incidence and no QTL studies for SBD incidence had been published. A QTL for SS incidence on LG2 has been reported in a (‘Summerred’ x ‘Discovery’) x (‘Splendor’ x ‘Gala’) family evaluated in Canada (McClure et al. 2016). Its QTL peak was at a genotyping-by-sequencing SNP at the physical position of 3,379,607 bp of version 1.0 of the Golden Delicious reference genome (Velasco et al. 2010; McClure et al. 2016), which is located between SNPs at 4.21 and 5.33 cM in the map used in this study. This places the McClure et al. (2016) LG2 QTL peak between 1.45 and 4.17 cM away from the QTL peaks for SS incidence identified in our study, suggesting that they are the same QTL and that this QTL is expressed and relevant across multiple locations and populations.

SNP array data were not available for the parents in the McClure et al. (2016) cross where the LG2 QTL was identified; however, SNP array data were available for all four grandparents. None of these grandparents carry a copy of the HDI haplotype, indicating that additional haplotypes from different backgrounds may be associated with higher SS incidence at this QTL. The McClure et al. (2016) study also evaluated a ‘Honeycrisp’ x ‘Ambrosia’ cross. This cross segregates for one and two copies of the HDI haplotype, as ‘Ambrosia’ has one copy of the HDI haplotype from ‘Grimes Golden’ through its parent ‘Golden Delicious.’ However, no QTL were identified from this cross; the explanation offered was that small population size and GxE interactions may have been responsible.

Fine mapping of and candidate genes for the LG2 QTL

Despite significant consistency in disorder incidence observed between 2014 and 2015 (Table S3) and the consistently identified position of the LG2 QTL for both disorders (Tables 1 and 2), the quality of the phenotypic data limited our ability to accurately fine map this QTL to an interval of less than 4 cM. This is possibly due to the small effective size of our study population, variable environmental effects on incidences of SS and SBD (as reflected by the observed moderate correlations between years), negative dominance at the LG2 QTL and/or its interaction with the second QTL identified on LG16, and other background genetic effects. However, the identified founding sources of the HDI haplotype at LG2 are identical in state for a set of 21 SNP markers covering the first 2.03 cM of the linkage map. This area contains QTL identified for various alcohols and esters (Dunemann et al. 2009; Souleyre et al. 2014; Farneti et al. 2017) and aromatic compounds (Souleyre et al. 2014; Kumar et al. 2015; Yauk et al. 2015; Farneti et al. 2017) found in fruit. These compounds were thought to be under the regulation of MdAAT1 (MDP0000637737) (Souleyre et al. 2014; Farneti et al. 2017), which codes for an alcohol acyl-transferase that catalyzes the transacylation from acyl-CoA to alcohol (Souleyre et al. 2005) and whose suppression led to increased levels of hexanol and other alcohols (Souleyre et al. 2014). MdAAT1 is within the LG2 QTL identified in this study. A series of publications from the 1970s reported that exposing ‘Jonathan’ fruit to, or injecting it with, hexanol or hexyl acetate resulted in the formation of SS (Wills and Scott 1970; Wills 1972, 1973). Additionally, Leisso et al. (2016) observed that higher levels of hexanol were associated with increased risk of soft scald formation in ‘Honeycrisp’ apples. Taken in the context of the previously cited studies, the results from this study suggest there may be a connection between hexanol, SS, and SBD incidence.

Although a study explicitly investigating these relationships would need to be conducted to confirm any physiological connections, MdAAT1 may be a good target to further study SS and SBD incidence. Markers at or near this gene may be better targets for developing selectable markers for implementing MAB compared to SNPs simply delimiting the HDI haplotype region in this study regardless of their physical locations.

It should be noted that the LG2 QTL for hexanol in the Dunemann et al. (2009) study segregated only in ‘Discovery,’ which does not have the HDI haplotype, and not in ‘Prima,’ which does have the HDI haplotype. Both parents did segregate for iso-pentanol and various esters at this QTL, indicating that independent mutations of MdAAT1 might have occurred that each led to reduced expression of the gene and thereby to increased levels of various alcohols.

Haplotype analyses at LG2 QTL

A single IBS haplotype at the LG2 QTL, named the HDI haplotype, was associated with increased incidences of SS and SBD in three large families that segregate for one and two copies of the HDI haplotype (Fig. 1). A general trend of negative partial dominance was observed for the HDI haplotype at the LG2 QTL, in which each additional copy of the HDI haplotype tended to be associated with higher disorder incidences. This trend was common across all families and across the entire germplasm set (Fig. 4) in both years evaluated. These findings could explain previous reports of moderate SS and/or SBD development in ‘Golden Delicious,’ ‘Grimes Golden,’ and ‘Rome Beauty’ (Harley and Fisher 1931; Brooks and Harley 1934; Plagge and Maney 1937), which all have one copy of the HDI haplotype and were common sources of the HDI haplotype in many of the parents of the families evaluated in this study (Table S4). Indeed, all parents evaluated in this study that had the HDI haplotype inherited this haplotype from ‘Grimes Golden,’ ‘Northern Spy,’ ‘Fireside,’ or ‘Rome Beauty’ (Table S4). Other sources of this haplotype may exist, but since the RosBREED germplasm set was designed to include the most important breeding parents currently represented in the UMN, Washington State University, and Cornell University apple breeding efforts (Peace et al. 2014), the founder genotypes identified in this study should be the most relevant to these US breeding programs. We were unable to connect the pedigrees of these founding sources of the HDI haplotype, but a dose effect was consistently observed regardless of the source of the HDI haplotype, suggesting that the haplotype has a consistent effect on SS and SBD incidence in different genetic backgrounds. These findings, coupled with the identification of other common haplotypes at the LG2 QTL (Table S4), should make implementation of selectable markers for MAB easier and more transferable across different families.
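
The dose effect described above amounts to comparing mean disorder incidence among individuals grouped by their number of HDI haplotype copies. The following is a minimal sketch of that grouping, assuming each individual is summarized as a pair of (HDI copy number, fraction of fruit affected). The function name `mean_incidence_by_dose` and all values are hypothetical and are not data from this study.

```python
from collections import defaultdict
from statistics import mean


def mean_incidence_by_dose(records):
    """records: iterable of (hdi_copies, incidence) pairs, where incidence is
    the fraction of an individual's fruit showing the disorder."""
    groups = defaultdict(list)
    for copies, incidence in records:
        groups[copies].append(incidence)
    # Report groups in order of increasing copy number.
    return {copies: mean(values) for copies, values in sorted(groups.items())}


# Hypothetical toy records (copy number, incidence); not data from this study.
toy_records = [(0, 0.0), (0, 0.0), (1, 0.25), (1, 0.75), (2, 0.5), (2, 1.0)]
dose_means = mean_incidence_by_dose(toy_records)
print(dose_means)
```

A dose effect would appear as group means that increase with copy number, as they do in these invented values. A formal analysis would also need to account for family, year, and the LG16 genotype.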

Haplotype analyses at the LG16 QTL

The variability in disorder incidences between years for some of the compound LG2/LG16 genotypes hampered an unequivocal assessment of haplotype effects. In this study, we propose to classify haplotypes based on their highest observed incidence. For the non-‘Honeycrisp’ haplotypes of the three families that segregated for one or two copies of the LG2-HDI haplotype, LG16-H03Fireside was the only haplotype that consistently had no incidence-increasing effect. LG16-H06 might be associated with an intermediate effect, while the other four haplotypes (H03Monark, H04, H07, and H08) were associated with similar and larger increases.
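
The classification rule proposed above (assigning each haplotype to a class according to its highest observed incidence across years) can be sketched as follows. The cutoffs `low` and `high`, the class labels, the haplotype names, and the incidence values are hypothetical placeholders chosen only for illustration; no numeric thresholds were defined in this study.

```python
def classify_by_peak_incidence(incidence_by_year, low=0.05, high=0.20):
    """Classify each haplotype by the highest mean incidence seen in any year.

    The cutoffs `low` and `high` are hypothetical, not values from this study.
    """
    classes = {}
    for haplotype, yearly in incidence_by_year.items():
        peak = max(yearly.values())  # highest observed incidence across years
        if peak < low:
            classes[haplotype] = "no effect"
        elif peak < high:
            classes[haplotype] = "intermediate"
        else:
            classes[haplotype] = "large"
    return classes


# Hypothetical haplotypes and mean incidences per year.
toy_incidence = {
    "hapA": {2014: 0.01, 2015: 0.02},
    "hapB": {2014: 0.03, 2015: 0.12},
    "hapC": {2014: 0.30, 2015: 0.08},
}
haplotype_classes = classify_by_peak_incidence(toy_incidence)
print(haplotype_classes)
```

Using the maximum rather than the mean across years is a conservative choice: a haplotype that raised incidence in only one year (like the invented "hapC") is still flagged as incidence-increasing.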

The interaction between the LG2 and LG16 QTL

The variability in incidences between years for some of the compound LG2/LG16 genotypes also hampered conclusions on the interplay between the two QTL. The data from the ‘Honeycrisp’ x MN1702 family suggested LG2 to be epistatic over LG16, as LG16-H07 increased disorder incidence only in the presence of a double dose of LG2-HDI. Such an epistatic pattern was also observed for one of the haplotypes of the two other larger families, but it was not reproducible across years. Consequently, it cannot be excluded that the consistent pattern observed in the ‘Honeycrisp’ x MN1702 family was coincidental. Examination of additional years and additional families will be needed for definitive conclusions on the (in)dependent action of these two QTL and the exact effects of haplotypes underlying the LG16 QTL.

The reason for the observed opposite effects of the IBS haplotypes shared by MN1702 and ‘Monark’ is unknown. One explanation is that the uncertainty in the exact position of this QTL (Table 2) could have led to the incorrect position being examined. An analysis of the phased marker data revealed that the IBS haplotypes shared between MN1702 and ‘Monark’ at this QTL extend from 34.21 to 38.68 cM, which does not cover every peak identified for this QTL (Table 2). It is also possible that the haplotypes are not IBD, in which case they might differ by sequence polymorphisms that remained hidden with the current set of SNP markers. Regardless of the position and origin of LG16 haplotypes, effects of haplotypes at the LG2 QTL are much more consistent across years and the germplasm evaluated, making the LG2 QTL a better marker-assisted breeding target.

Another point of consideration regarding the QTL on LG16 is that LG13 is the homoeologous chromosome from the most recent genome duplication that occurred in apple (Velasco et al. 2010), and a QTL for SS was found on LG13 in roughly the same physical position as the LG16 QTL. The LG13 QTL was only present in 2014 and with a lower Bayes factor than the LG2 and LG16 QTLs, but it could be another genetic factor in SS and SBD incidence that could not be fully identified and characterized in this study, possibly due to limitations of the families evaluated, population sizes, and/or the type of phenotyping utilized.

SS and SBD variation due to additional causes

A moderate percentage of individuals that have haplotypes associated with higher disorder incidences at these QTL have low or even no incidence of either disorder. This observation was not unexpected, considering that many previous studies evaluating SS and SBD in ‘Honeycrisp,’ which is considered highly susceptible to both disorders, observed only a small percentage of fruit developing either disorder (Tong et al. 2003; DeLong et al. 2004, 2006; Moran et al. 2009, 2010; DeEll and Ehsani-Moghaddam 2010; Ehsani-Moghaddam and DeEll 2013). Some of the discrepancy between observed and expected disorder incidences when considering individual haplotype composition at the identified QTL can be attributed to experimental variation as reflected by non-systematic differences in disorder incidences between 2014 and 2015. A small number of genetically susceptible individuals expressed high disorder incidence in one year but not the other. Evidence of possible genotype-by-year interactions can also be seen in the difference in mean disorder incidence between years in individuals from the ‘Honeycrisp’ x ‘Monark’ family that have one copy of the HDI haplotype at the LG2 QTL and the LG16-H03 haplotype at the LG16 QTL, as well as in various haplotype combinations in the ‘Honeycrisp’ x ‘Jonafree’ family (Fig. 4). Genotype-by-location effects could also explain some of this discrepancy. This would be similar to previous reports of orchard location and year having an effect on soft scald incidence in ‘Honeycrisp’ fruit (Tong et al. 2003; Moran et al. 2010).

Another explanation could be that some individuals were not harvested at the ideal maturity level for development of either disorder due to difficulties in estimating fruit maturity of all fruit on a tree based on a few or a single fruit destructively sampled at harvest. Fruit harvested while immature might not have developed the disorders despite having the genetic predisposition to do so, as higher incidences of SS have been reported in apple fruit harvested at more advanced maturities (Harley and Fisher 1931; Brooks and Harley 1934; Meheriuk et al. 1994; Prange et al. 2011). Two studies have reported a point of maturity for “maximum susceptibility,” where fruit harvested very late are also less likely to develop SS (Tomkins 1966; Prange et al. 2011). We attempted to harvest fruit at an intermediately advanced maturity based on SI ratings (Blanpied and Silsby 1992) in order to avoid false negative phenotypic values. However, reliably harvesting fruit at the desired SI rating was difficult, resulting in a varied distribution of average SI ratings for fruit at harvest (Fig. S5). Within this distribution, SI rating did not strongly correlate with incidences of SS or SBD in different partitions of the data in this study, including partitions with one versus two copies of the HDI haplotype at the LG2 QTL (Table S3). Additionally, conventional maturity indicators, including SI, have proven to be problematic in estimating maturity of ‘Honeycrisp’ fruit (Watkins et al. 2005), and this finding may extend to its offspring. SI rating may also not be an accurate indicator of fruit maturity for individuals in this study, or perhaps the SI target used in this study was not an appropriate indicator of a harvest maturity that would maximize expression of SS and SBD in individuals that are genetically predisposed to develop these disorders.
This explanation is supported by the observation that, although efforts were generally made to ensure that only fruit of uniform maturity levels were harvested, fruit of lower maturity were occasionally included inadvertently (possibly due to differences observable only after harvesting and cold storage). In sets of fruit from individuals where almost all of the fruit had SS or SBD, the fruit that did not develop SS or SBD were clearly less advanced in maturity based on relative background and over-color. These observations could indicate a connection between maturity and SS and SBD incidence that was not captured in this study. It should be noted that the proportion of these less mature fruit was estimated to be less than 1% of all fruit. Such a low proportion was deemed not to have significantly impacted the identification of major QTL or the effects of haplotypes underlying them.

Some individuals in this study had no copies of the HDI haplotype but developed one or both disorders. Most of these individuals were from two families where one parent was a wild Malus sieversii accession from the USDA Plant Genetic Resources Unit in Geneva, NY. These individuals may have alleles at the LG2 and LG16 QTL that are associated with higher disorder incidences, or different QTL might be segregating in their offspring, but the small sizes of these families may have precluded the characterization of their alleles and the identification of additional QTL in our study.

Conclusion

The LG2 QTL is an attractive MAB target given the economic impact of SS and SBD and considering the demonstrated consistent effect of the HDI haplotype across multiple families and years. The HDI haplotype was present in some major breeding parents, including ‘Golden Delicious’ and ‘Honeycrisp’ and could be traced back to the founding individuals ‘Grimes Golden,’ ‘Northern Spy,’ ‘Rome Beauty,’ and ‘Fireside.’ Future research might widen the set of major breeding parents for which the QTL genotypes and associated marker haplotypes are identified.

Selecting against seedlings based on haplotype at the LG2 QTL should be a valuable strategy to develop populations with lower incidences of SS and SBD, though parental contribution at the LG16 QTL should be at least a point of consideration until a better understanding of alleles at the LG16 QTL and their interactions with the HDI haplotype at the LG2 QTL is gained. The best practice to immediately utilize the results of this study for MAB in seedling selection will be to select against the HDI haplotype at the LG2 QTL, as the effect of the HDI haplotype was the clearest and most consistent trend observed and the highest disorder incidence was consistently observed in individuals that are homozygous for the HDI haplotype.
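
The seedling selection strategy recommended above can be sketched as a simple filter on the number of HDI haplotype copies per seedling. This is an illustrative sketch only: the function name `select_seedlings`, the `max_copies` parameter, and the seedling identifiers and copy numbers are hypothetical, and the HDI copy number is assumed to have already been inferred from marker data.

```python
def select_seedlings(hdi_copies, max_copies=0):
    """Return the IDs of seedlings carrying at most max_copies of the HDI haplotype.

    hdi_copies maps a seedling ID to its HDI copy number (0, 1, or 2) at the LG2 QTL.
    """
    return sorted(sid for sid, copies in hdi_copies.items() if copies <= max_copies)


# Hypothetical seedlings and copy numbers.
seedling_copies = {"S1": 0, "S2": 1, "S3": 2, "S4": 0, "S5": 1}

strict = select_seedlings(seedling_copies)                 # discard all HDI carriers
relaxed = select_seedlings(seedling_copies, max_copies=1)  # discard homozygotes only
print(strict, relaxed)
```

The `max_copies` threshold is a design choice for the breeder. In a family that segregates for one and two copies of the HDI haplotype, a threshold of zero would retain no seedlings, so discarding only the homozygotes (`max_copies=1`) would be the practical option there.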