The number of fire blight resistance genes in the Malus spp. gene pool is quite limited, although there is consensus amongst the research community that breeding fire blight resistant cultivars (Peil et al. 2021a) is the best strategy to mitigate the devastating effects of the disease, caused by the bacterium Erwinia amylovora (Burrill; Winslow et al. 1920). The devastation caused by fire blight is due largely to the pathogen strain-specificity of susceptibility/resistance found in the genus Malus (Norelli and Aldwinkle 1986; Peil et al. 2011; Vogt et al. 2013; Wöhner et al. 2018; Emeriewen et al. 2019), coupled with the fact that most apple cultivars are highly susceptible to the disease (Kostick et al. 2021a). Therefore, the focus of breeding and breeding research has been to identify genomic regions in Malus that are associated with reduced susceptibility and resistance to fire blight, and to combine these factors to achieve durable resistance (Peil et al. 2021b). As a consequence, several quantitative trait loci (QTL) and markers linked to resistance were identified in apple cultivars (Calenge et al. 2005; Le Roux et al. 2010; Desnoues et al. 2018; van de Weg et al. 2018; Kostick et al. 2021b; Peil et al. 2021a). It has been argued, and there is evidence, that the loci of apple cultivars are associated with reduced susceptibility rather than complete resistance to fire blight (Kostick et al. 2021b); nevertheless, they are important in achieving durable resistance (Kostick et al. 2021a). In contrast, a few wild Malus genotypes possessing strong fire blight resistance have been reported, with major QTLs detected in these genotypes (Peil et al. 2007, 2019; Durel et al. 2009; Emeriewen et al. 2014, 2017a).

Furthermore, the fire blight resistance candidate genes proposed so far in Malus are all located within the major resistance QTL regions of wild genotypes. For example, Parravicini et al. (2011) proposed serine/threonine kinase and NBS-LRR genes on the sequences of bacterial artificial chromosome (BAC) clones spanning the QTL region on linkage group (LG) 12 of the ornamental cultivar ‘Evereste’ (Durel et al. 2009). Homologs of the ‘Evereste’ candidate genes (Emeriewen et al. 2021a) were identified in the genome sequence of Malus baccata (Chen et al. 2019). Similarly, Fahrentrapp et al. (2013) identified a CC-NBS-LRR candidate gene (FB_MR5) on the sequence of a BAC clone that spans the M. ×robusta 5 (Mr5) major QTL region on LG3 (Peil et al. 2007), and Wöhner et al. (2016) identified FB_MR5 homologs in accessions of the wild Malus species M. prunifolia and M. baccata.

A major fire blight resistance QTL (Mfu10) was detected on LG10 of M. fusca accession MAL0045 of the JKI genebank (Emeriewen et al. 2014). In a fine mapping approach, the region containing the QTL was narrowed from 15.79 cM (Emeriewen et al. 2014) to 0.33 cM (Emeriewen et al. 2018), as shown in Fig. 1. Following extensive genetic analyses of recombinant individuals and screening of a MAL0045 BAC library, a single BAC clone, 46H22, was found to span the FB_Mfu10 region (Emeriewen et al. 2018). Furthermore, Emeriewen et al. (2018) reported Illumina MiSeq sequencing of BAC 46H22, leading to the assembly of 45 contigs with a total length of 216 Kbp, on which a single serine/threonine kinase candidate gene was predicted. Amplicons of this serine/threonine kinase candidate gene were amplified and sequenced from the donor MAL0045, from DNA of the BAC clone spanning the FB_Mfu10 resistance region (46H22), and from DNA of the BAC clone spanning the susceptible (S) homologous region (94B13). This led to the identification of an 8 bp indel differentiating the candidate resistance allele from its S homolog (Emeriewen et al. 2018, 2021b). Here, we report the resequencing of BAC clone 46H22 and, for the first time, the sequencing of the homologous clone 94B13 as well as two neighbouring clones, 5E10 and 95C21, using MinION sequencing technology. Each clone was successfully assembled into a single contig. This facilitated the prediction of additional protein kinase genes within the sequence of 46H22, which could not be predicted using the initial assembly of 45 contigs.

Fig. 1

MAL0045 fire blight resistance region on linkage group 10 illustrated by a fine genetic map with 11 closely linked markers (data from Emeriewen et al. 2018), and graphical representation of the BAC clones within the region. FB_Mfu10 maps between markers FR24N24RP/FR39G5T7xT7y and FRM7358424/FR46H22 in a 0.33 cM interval. Clone 46H22, which carries alleles of resistance of markers (dotted lines/R) spans FB_Mfu10 resistance region whereas 94B13, which carries alleles of susceptibility of the same markers, spans the susceptible homologous region. Clones 5E10 and 95C21 carry alleles of resistance but do not span FB_Mfu10 region. R and S = alleles of resistance and susceptibility, respectively

BAC clone 46H22 and neighbouring clones 5E10 and 95C21, as well as 94B13 (Fig. 1), were cultured overnight at 37 °C in LB medium containing 12.5 μg/mL chloramphenicol, and plasmid DNA was extracted from clone cultures using the NucleoBond® Xtra MIDI Plasmid DNA Purification Kit (MACHEREY–NAGEL, Düren, Germany) according to the manufacturer’s protocol. Sequencing was conducted on an Oxford Nanopore Technologies (ONT) MinION device with flow cell type R9.4.1 using the ONT standard ligation sequencing kit SQK-LSK109. We used the selective sequencing concept, also called the adaptive sampling strategy (Martin et al. 2022), to avoid sequencing the E. coli genome. Following the sequencing step, bases were called using ONT Guppy software (version 5.1), and reads with an average Phred quality score below 20 were excluded. Assembly was conducted using Flye software version 2.9 (Kolmogorov et al. 2019). Table 1 summarizes the results obtained from sequencing and assembly of the four BAC clones. The lowest total read length was obtained for clone 94B13, whilst the highest was obtained for 95C21. However, clone 46H22 possessed the highest total length of final assembled sequence following the exclusion of reads that did not meet the required threshold (Table 1). A single contig could be assembled for each BAC clone, each covering the whole length of the clone. Assembled sequences of clones 46H22 and 94B13 are deposited at NCBI under GenBank accession numbers ON000501 and ON000502, respectively.
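The read-filtering step can be illustrated with a minimal sketch. It assumes reads in FASTQ format with Phred+33 quality encoding, and it assumes that a read's average quality is derived from its mean per-base error probability, which is one common convention. The function names are hypothetical and this is not the Guppy implementation.

```python
import math


def mean_qscore(quality: str) -> float:
    """Average Phred score of one read, given its Phred+33 quality string.

    Assumption: the average is taken over per-base error probabilities,
    not over the raw Q values, so a few poor bases weigh heavily.
    """
    if not quality:
        raise ValueError("empty quality string")
    # Convert each character to an error probability: p = 10^(-Q/10).
    probs = [10 ** (-(ord(ch) - 33) / 10) for ch in quality]
    return -10 * math.log10(sum(probs) / len(probs))


def passes_filter(quality: str, threshold: float = 20.0) -> bool:
    """True if the read's average quality reaches the threshold."""
    return mean_qscore(quality) >= threshold
```

Under this convention, a read made of equal numbers of Q40 and Q10 bases averages roughly Q13 and would be excluded at a threshold of 20, even though the arithmetic mean of its Q values is 25.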

Table 1 Summary of sequencing and assembly results of BAC clones

Gene prediction analyses with the assembled sequences of the BAC clones were performed with FGENESH software using the Arabidopsis thaliana algorithm, and predicted proteins of open reading frames (ORFs) were analyzed using ExPASy PROSITE (Bairoch 1991; Hulo et al. 2008) and the National Center for Biotechnology Information (NCBI) Blastp program (Altschul et al. 2005) to determine their domains. The number of ORFs predicted on the sequence of each BAC clone, and their predicted domains/profiles, are shown in Table 2 and in the Supplementary file. Of the six disease resistance-associated ORFs predicted on the sequence of 46H22, three possessed only the signature protein kinase domains, whereas the other three possessed the signature protein kinase domains in addition to other domains, for example, a bulb lectin domain, a PAN domain, a zinc finger domain and an integrase domain (Supplementary file). The first candidate gene proposed by Emeriewen et al. (2018), including its upstream and downstream border sequences, could be aligned perfectly on the MinION-obtained contig sequence of 46H22, with a small fragment repeated on this sequence (Fig. 2). Three other ORFs (ORF_H22-22, ORF_H22-24 and ORF_H22-27) predicted on the 46H22 sequence showed strong sequence identity and coverage (> 85%) with the first proposed candidate, suggesting that these sequences are repetitive.
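The identity and coverage figures used to compare ORFs can be made concrete with a short sketch. It assumes the two sequences have already been aligned (gaps written as "-") and uses one common definition: identity is the share of ungapped aligned columns that match, and coverage is the share of query residues placed opposite a subject residue. The function name and the toy sequences in the usage note are hypothetical; BLAST reports comparable but not necessarily identical values.

```python
from typing import Tuple


def identity_and_coverage(query_aln: str, subject_aln: str,
                          query_len: int) -> Tuple[float, float]:
    """Return (percent identity, percent query coverage) for a pairwise alignment.

    query_aln / subject_aln: the two aligned rows, equal length, '-' for gaps.
    query_len: full length of the unaligned query sequence.
    """
    if len(query_aln) != len(subject_aln):
        raise ValueError("aligned rows must have equal length")
    # Keep only columns where both sequences contribute a residue.
    aligned = [(q, s) for q, s in zip(query_aln, subject_aln)
               if q != "-" and s != "-"]
    if not aligned or query_len <= 0:
        raise ValueError("nothing to compare")
    matches = sum(q == s for q, s in aligned)
    identity = 100.0 * matches / len(aligned)
    coverage = 100.0 * len(aligned) / query_len
    return identity, coverage
```

For example, aligning the toy rows "MKT-LLVA" and "MKTALIVA" for a query of 10 residues gives 6 matches over 7 ungapped columns (about 85.7% identity) and 70% coverage.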

Table 2 Number of ORFs predicted on the sequences of the BAC clones
Fig. 2

Positions of candidate genes (ORFs) on the MinION-obtained sequence of clone 46H22, drawn using the gggenes package in R. Candidate_1 is the ORF of the first candidate gene (Emeriewen et al. 2018), whilst candidate_2 is a repeated fragment of the first candidate gene on the sequence of 46H22, indicating some repetitive sequences

Similarly, of the five ORFs with putative disease resistance domains predicted on the S homolog clone 94B13, three showed strong coverage and identity (> 93%) with the first candidate gene homolog, including one that possessed the 8 bp indel unique to fire blight susceptibility (Emeriewen et al. 2018, 2021b). This indicates that the ORFs predicted on clone 94B13, which spans the FB_Mfu10 susceptible region, are homologs of the resistance candidate genes predicted on clone 46H22, which spans the resistance region and carries the alleles of resistance to fire blight.

Furthermore, ORFs predicted on 46H22 that possess known domains for plant disease resistance and that showed zero identity and coverage with the first candidate gene (Emeriewen et al. 2018) were considered as additional resistance candidates. These include ORF_H22-19, which possesses 10 exons in 1141 predicted amino acids (aa), ORF_H22-33 with 31 exons in 2519 predicted aa, and ORF_H22-38 with 15 exons in 663 predicted aa. The positions of these three candidate genes on the MinION-obtained contig of 46H22, relative to the first candidate gene, are shown in Fig. 2. The predicted coding region of ORF_H22-19 was 3426 bp, whilst those of ORF_H22-33 and ORF_H22-38 were 7560 and 1992 bp, respectively. The predicted domains of ORF_H22-33 included zinc finger and integrase domains in addition to bulb lectin and protein kinase domains, whereas ORF_H22-19 and ORF_H22-38 possessed only protein kinase domains. Analyses of the amino acid sequences of the three ORFs using the NCBI Blastp program returned significant alignments to serine/threonine kinases and uncharacterized proteins of different plant species, including members of the Rosaceae family, for ORF_H22-19; to cysteine-rich receptor-like protein kinases and G-type/S-type serine/threonine protein kinases for ORF_H22-38; and to integrase catalytic domain-containing proteins for ORF_H22-33. Due to the uncertainty about the source of the integrase domain in ORF_H22-33, which occupies a large section of this ORF, we did not analyse this candidate gene further in this study. However, ORF_H22-33 remains of interest for future research. Figure 3 shows the amino acid sequences of ORF_H22-19 and ORF_H22-38.
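The reported coding-region and protein lengths can be cross-checked with simple arithmetic. The sketch below assumes that each coding sequence was counted as one codon per amino acid plus one terminal stop codon; this is an assumption about how the lengths were counted, not something stated by the prediction software. The function name is hypothetical.

```python
def cds_length_bp(n_amino_acids: int, include_stop: bool = True) -> int:
    """Length in bp of a coding sequence encoding n_amino_acids residues."""
    codons = n_amino_acids + (1 if include_stop else 0)
    return 3 * codons


# (predicted amino acids, reported coding-region length in bp) from the text
reported = {
    "ORF_H22-19": (1141, 3426),
    "ORF_H22-33": (2519, 7560),
    "ORF_H22-38": (663, 1992),
}

# True for an ORF when the two reported figures agree under the assumption above.
consistent = {name: cds_length_bp(aa) == bp for name, (aa, bp) in reported.items()}
```

All three ORFs are consistent under this assumption, for example 3 × (1141 + 1) = 3426 bp for ORF_H22-19.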

Fig. 3

a Amino acid sequence of ORF_H22-19. Bold = serine/threonine protein kinase active-site signature; bold + underlined = protein kinase ATP-binding region signature. b Amino acid sequence of ORF_H22-38. Bold = protein kinase domain; bold + underlined = ATP profile

Primer pairs named FR-ORF_H22-19 (forward and reverse, respectively) for ORF_H22-19 (5’-GCTGTTGGCGATTCAAATTATGCAAC-3’ and 5’-AGCCGCTGTAGTCATTGCTCGTAAG-3’) and FR-ORF_H22-38 for ORF_H22-38 (5’-GTTCAGGTCAAGGCACAGAGGAGTT-3’ and 5’-TCGTGCCAATCATCGTTCTCATCCT-3’) were developed using the Primer3 program (Kõressaar et al. 2018) to amplify significant parts of the respective transcripts in MAL0045, and for subsequent resequencing of the obtained amplicons to verify the sequences. PCR was performed using the HotStart PCR kit (Thermo Scientific, Berlin, Germany) according to the manufacturer’s protocol, with 20 ng of DNA isolated from leaf material of samples using the DNeasy® Plant Mini Kit (Qiagen, Hilden, Germany). The PCR profile for the ORF_H22-19 primer pair was 95 °C for 5 min, followed by 34 cycles of 95 °C for 60 s, 68 °C for 90 s and 72 °C for 6 min, and a final extension at 72 °C for 10 min. The same PCR profile was applied to ORF_H22-38, but with an annealing temperature of 65 °C. Amplified fragments obtained in MAL0045 and 46H22 (Fig. 4) were purified using the MSB® Spin PCRapace Kit (Invitek GmbH, Berlin, Germany) following the manufacturer’s protocol, and Sanger sequencing of the products was performed by Eurofins MWG Operon (Ebersberg, Germany). Analyses of the obtained amplicon sequences confirmed strong alignment with the original sequence. RNA was extracted from samples using the Invitrap® Spin Plant RNA Mini kit (Stratec, Berlin, Germany) and cleaned using the Invitrogen DNA-free™ kit (Thermo Scientific, Berlin, Germany), and subsequent cDNA synthesis was done using the RevertAid First Strand cDNA Synthesis kit (Thermo Scientific, Berlin, Germany), all according to the manufacturers’ protocols. A different set of primer pairs named AP-ORF_H22-19 for ORF_H22-19 (5’-GAACCACAGGATTCCTTTGCA-3’ and 5’-TCGCTTGGTCGGGGTAATAT-3’) and AP-ORF_H22-38 for ORF_H22-38 (5’-TTTCTGAAACGATTGGCCCTCAC-3’ and 5’-TAGGTCTTCTTGTGGCCCCATTT-3’) were used to amplify fragments of the respective coding regions on cDNA (not shown). Here, PCR was performed using 1 × Dream Taq buffer (ThermoFisher Scientific, Darmstadt, Germany), 0.2 mM dNTPs, 1 μM each of forward and reverse primers, 0.5 U Dream Taq DNA polymerase (ThermoFisher Scientific, Darmstadt, Germany) and 2 μL of cDNA. PCR cycling conditions for both primer pairs were 94 °C for 2 min, followed by 32 cycles of 94 °C for 30 s, 56 °C for 1 min and 72 °C for 1 min, and a final extension at 72 °C for 5 min.

Fig. 4

The CDS including introns of ORF_H22-19 and ORF_H22-38 were amplified on genomic DNA of the fire blight resistance donor MAL0045 and on the BAC clone spanning the FB_Mfu10 region, 46H22. Whilst the ORF_H22-19 fragment failed to amplify on 94B13, the ORF_H22-38 fragment was amplified. Idared is the susceptible apple cultivar crossed with MAL0045 to establish the mapping populations

The assembly of a single contig for clone 46H22 facilitated the identification of additional fire blight candidate resistance genes within the resistance locus of MAL0045 on LG10. Previously, a single serine/threonine kinase candidate gene was proposed following Illumina MiSeq sequencing of 46H22 and the subsequent assembly into 45 contigs (Emeriewen et al. 2018). Although the total length of all 45 contigs was 216 Kbp, the longest contig was only 88 Kbp (Emeriewen et al. 2018), in contrast to the current assembly, which resulted in a single contig of 217 Kbp (Table 1). Although failure to assemble a single contig is not uncommon (Fahrentrapp et al. 2013), our current results show that it is plausible for such a failure to hamper the prediction of candidate genes. Following 454 sequencing, the BAC clones spanning the fire blight resistance and susceptible regions on LG3 of Mr5, clones 16k15 and 72i24, were assembled into 162 Kbp comprising 4 contigs and 251 Kbp comprising 22 contigs, respectively (Fahrentrapp et al. 2013). On the other hand, it is also possible to predict the correct gene on sequences of clones without obtaining a single contig, as FB_MR5 was predicted on clone 16k15 and proven through overexpression studies in transgenic and cisgenic plants to confer fire blight resistance in an otherwise susceptible apple genotype (Broggini et al. 2014; Kost et al. 2015). However, prediction using several assembled contigs of a BAC clone relies largely on luck, as it is plausible that fragments of the gene could be spread across the contigs. Moreover, gene prediction algorithms are notorious for missing actual start and stop codons and correct exon/intron borders of genes, not least because they are based on and/or predetermined by annotated genomes of mostly model organisms (Dimonaco et al. 2022).

Nevertheless, it is expected that, regardless of sequencing method and assembly, sequences obtained from the same BAC clone should be identical. This was the case in the current study. Supplementary Fig. 1 shows the resultant dot plot following alignment of the initial assembly (Emeriewen et al. 2018) and the currently obtained assembly. The largest contig of 88 Kbp (Emeriewen et al. 2018) aligns perfectly on the MinION-obtained contig. Further, the sequences obtained from Illumina MiSeq (Emeriewen et al. 2018) and from the Oxford Nanopore MinION used in the current study resulted in very similar assembled total lengths, 216 and 217 Kbp, respectively, for clone 46H22. Moreover, Sanger sequencing of amplified gene fragments confirmed the sequences of parts of the open reading frames. Consequently, it is unsurprising that gene prediction analyses also led to the prediction of the first candidate gene, as well as the identification of the homolog on the sequence of clone 94B13, with the 8 bp indel characteristic of fire blight susceptible genotypes (Emeriewen et al. 2018). This indel, which was first identified following Sanger sequencing of PCR amplicons of 94B13 and MAL0045 fire blight-susceptible progeny (Emeriewen et al. 2018, 2021b), was confirmed in the assembled sequence of 94B13, and thus justifies the approach of using Sanger sequencing and primer walking to confirm sequences and to obtain putative homolog gene sequences. Furthermore, two findings were interesting to observe: i) the candidate genes identified within the FB_Mfu10 resistance region are predominantly receptor-like kinases and ii) no ORFs with known disease resistance profiles were predicted on the sequences of the neighbouring clones, 5E10 and 95C21. The latter finding confirms that molecular marker data, the chromosome-walking approach, and phenotypic data of recombinant individuals (Emeriewen et al. 2014, 2018) led to the identification of the correct BAC clone spanning the FB_Mfu10 fire blight resistance region. The former finding is evidence that the complexity of fire blight disease resistance in Malus is not restricted to the presence of NLR-like genes, as was found on LG3 of Mr5 (Fahrentrapp et al. 2013; Broggini et al. 2014).
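The dot plot comparison of two assemblies rests on a simple idea: mark every pair of positions at which the two sequences share an identical word of length k. A run of marks along one diagonal then indicates a collinear, matching region. A minimal sketch follows, assuming exact k-mer matching on the forward strand only. The function name, the value of k, and the toy sequences in the usage note are illustrative and do not reflect the software used for Supplementary Fig. 1.

```python
from collections import defaultdict
from typing import List, Tuple


def dotplot_points(seq_a: str, seq_b: str, k: int = 4) -> List[Tuple[int, int]]:
    """Return (i, j) pairs where seq_a[i:i+k] equals seq_b[j:j+k]."""
    # Index every k-mer of seq_b by its start positions.
    index = defaultdict(list)
    for j in range(len(seq_b) - k + 1):
        index[seq_b[j:j + k]].append(j)
    # Look up each k-mer of seq_a and record the matching position pairs.
    points = []
    for i in range(len(seq_a) - k + 1):
        for j in index.get(seq_a[i:i + k], ()):
            points.append((i, j))
    return points
```

For instance, comparing the toy contig "GATTACAGG" with the toy assembly "TTGATTACAGGCC" at k = 4 yields six points that all lie on the diagonal j = i + 2, the pattern expected when a shorter contig aligns perfectly inside a longer one.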

Fire blight resistance in Malus is quantitatively controlled (Korban et al. 1988). Quantitative in this context does not only imply the continuous distribution of susceptible and resistant phenotypes, but also the possible involvement of different loci/genes, which implies polygenic resistance. Nevertheless, there is strong evidence of the involvement of single dominant genes (Peil et al. 2007; Parravicini et al. 2011; Fahrentrapp et al. 2013; Emeriewen et al. 2018), especially considering the differential interactions between Malus hosts and pathogen strains (Vogt et al. 2013; Emeriewen et al. 2019; Wöhner et al. 2018). Although polygenic/quantitative resistance involves different R-genes and resistance loci, we have devised ways to assess phenotypic results of quantitative fire blight resistance data as a single gene effect in mapping studies, leading to the precise localization of putative gene/s within fire blight QTL regions (Parravicini et al. 2011; Fahrentrapp et al. 2013; Emeriewen et al. 2018). Nevertheless, it is not certain that a single dominant gene will underlie fire blight resistance loci. Membrane-localized kinases and NLR-like genes have been identified in fire blight QTL regions of wild Malus spp. Both serine/threonine kinase and NBS-LRR genes were found in the ‘Evereste’ fire blight resistance region (Parravicini et al. 2011); FB_MR5 is a CC-NBS-LRR gene underlying the Mr5 fire blight resistance locus (Fahrentrapp et al. 2013); and several protein kinases have been found in the Malus fusca fire blight resistance region in the current study and as previously reported (Emeriewen et al. 2018).

In general, monogenic resistance is stronger than polygenic/quantitative resistance. However, monogenic resistance is not durable and is usually overcome by pathogen races or strains, whilst a combination of several quantitative resistance loci can confer durable resistance (Pilet-Nayel et al. 2017). In Malus, for example, the fire blight resistance conferred by FB_MR5 is strain-dependent and is overcome by Mr5-virulent strains of E. amylovora (Peil et al. 2011; Vogt et al. 2013; Emeriewen et al. 2019). Although FB_MR5-conferred resistance fits the model of NLR resistance being overcome over time, the fire blight resistance of Mr5 itself could be polygenic, considering that strain-specific minor QTLs were identified in another study (Wöhner et al. 2014). There is also evidence that the fire blight resistance of ‘Evereste’ might be overcome (Wöhner et al. 2018), which may imply that genes within the QTL on LG12 will be overcome too. It is in fact interesting that within the ‘Evereste’ locus, a CC-NBS-LRR gene and a serine/threonine kinase gene were proposed as the best candidates (Parravicini et al. 2011). Conversely, evidence shows that the resistance of MAL0045 is not broken down by any known strain (Emeriewen et al. 2020), including strains that break down the resistance of FB_MR5 (Vogt et al. 2013; Emeriewen et al. 2015, 2017b). It is also interesting that no NLR-like genes were found within the Mfu10 locus, which instead contains predominantly protein kinase genes. Receptor-like kinases (RLKs) could confer resistance against a broad range of a particular group of pathogens (Krattinger and Keller 2016).

In conclusion, it is plausible that the types of genes found to underlie fire blight resistance QTL regions in wild genotypes are entirely different, and they interestingly appear to fit polygenic and monogenic disease resistance models. We postulate that the candidate genes found in the FB_Mfu10 resistance region contribute largely to the fire blight resistance of MAL0045. Whether one of these genes, or a combination of more than one, will be sufficient to provide resistance in a susceptible background can be proven only through further complementation studies.