Introduction

The cultivated apple (Malus domestica Borkh.) belongs to the Rosaceae family and is an economically important horticultural tree crop worldwide. Apple is cross-pollinated and has a self-incompatibility system preventing self-pollination, which results in a high level of heterozygosity and genetic diversity (Aguiar et al. 2015). Many old apple cultivars have arisen as chance seedlings created through open pollination and thus their origins and ancestries are often unknown. Other cultivars have originated as seedlings of selected mother cultivars or from undirected crossing experiments. In contrast, cultivars emerging from modern breeding programs mostly derive from controlled crosses (Sansavini et al. 2004). Since apple cultivars are maintained by clonal propagation and have a long lifespan, old cultivars that are well-adapted to local climates are often used as parents in breeding programs, leading to close genetic relationships between old and modern cultivars (Muranty et al. 2020; Skytte af Sätra et al. 2020).

Apple is a very important crop in the Balkan countries, including Bosnia and Herzegovina (BIH) (Ostojić et al. 2019). No data on cultivars in fruit production in BIH have been collected systematically, but according to sources on nursery production, the most commonly produced apple cultivars are ‘Idared’ and ‘Golden Delicious’ (Davidović Gidas et al. 2017). For centuries, the region was exposed to the influences of different civilizations from the East and the West, which resulted in a large number of apple cultivars of different origins being introduced. These cultivars were often named after the person who introduced them or their place of origin, resulting in many synonyms and homonyms (Đurić et al. 2009; Gaši et al. 2013a, b). Furthermore, BIH is situated at the intersection of southern and south-eastern Europe and covers a broad range of climate conditions that enable cultivation of diverse germplasm for many specific uses. In addition, the population’s strong interest in fruit trees has ensured the survival of local and traditional cultivars in farming systems, along with internationally widespread modern cultivars (Đurić et al. 2009; Gaši et al. 2013a, b; Paunović et al. 1997; Rivera et al. 2018).

The first experimental fruit station in BIH to hold an apple collection was founded in Goražde (File S1) in 1937 and operated until 1967, with a temporary interruption of its activities during World War II (Mićić et al. 2017). Around 100 apple cultivars and the first modern apple rootstocks were introduced from the Experimental Fruit Station in New Jersey, USA, and planted in the cultivar collection at Goražde station in the 1950s (Mićić et al. 2017). An apple accession from Goražde designated as a traditional cultivar was recently discovered to be the old American cultivar ‘Delicious’, presumably introduced in this period (Konjic et al. 2023). After 1947, when fruit production research began to be conducted at the Faculty of Agriculture and Forestry at the University of Sarajevo, the complete fruit collection of Goražde was duplicated in the faculty’s experimental field Betanija (Anonymous 1963). Unfortunately, these collections no longer exist.

New activities and research on fruit genetic resources in BIH started during the 1980s, as part of the project “Gene bank of Yugoslavia”. Prior to the BIH War (1992–1995), the inventory was largely completed and documented using Multi-crop Passport Descriptors (MCPD). Unfortunately, a large proportion of the records were lost during the war. Since the war, BIH as an independent state comprises three administrative parts: Republika Srpska (RS), Federation of Bosnia and Herzegovina (FBIH), and Brčko District. Separate apple collections have been established in RS and FBIH (Đurić et al. 2009; FAO 2008). The first post-war ex situ collection was established in the fruit tree nursery ‘Srebrenik’ (File S1) in north-eastern Bosnia, FBIH, in 2000, after several inventories and collection missions (Drkenda and Zečević 2018; Gaši et al. 2010). The collection efforts were based on traditional knowledge among local farmers (Gaši et al. 2013a), with the aim of conserving the best-known traditional cultivars rather than trying to capture genetic diversity. Consequently, much of the existing diversity was missing in the Srebrenik collection. The discovery of additional gaps through molecular characterization of apple genetic resources in eastern Bosnia (Gaši et al. 2013b) led to the establishment of a new apple collection near the previous experimental fruit station in Goražde (FBIH) (Drkenda and Zečević 2018).

In the RS area, the SeedNet project funded by the Swedish International Development Cooperation Agency (Sida) led to the establishment of the Institute for Genetic Resources (IGR) at the University of Banja Luka. The entire gene bank of RS is held by IGR and includes ex situ field collections of different fruit species in Banja Luka and Aleksandrovac (northwestern BIH) (File S1).

The three BIH collections (Srebrenik, Goražde, and IGR) comprise 51, 40, and 165 apple accessions, respectively. Previous molecular studies carried out in Srebrenik and Goražde have enabled the curators to take the precaution of avoiding redundancies between these two collections. However, the overlap between these two and the IGR collection in Banja Luka is still unknown.

Parts of the Srebrenik and IGR collections have been characterized in terms of morphological and pomological traits (Gaši et al. 2010, 2011; Kecman 2015; Stanivuković et al. 2017), as well as phytosanitary status (Đurić et al. 2015). However, only the collections in Srebrenik and Goražde have been characterized at the molecular level using simple sequence repeat (SSR) markers (Gaši et al. 2010, 2013a, b).

SSR markers have been the system of choice for germplasm characterization in the past two decades (Bakir et al. 2022; Cmejlova et al. 2021; Garkava-Gustavsson et al. 2008, 2013; Gasi et al. 2016; Larsen et al. 2006, 2017; Meland et al. 2022; Pereira-Lorenzo et al. 2008; Urrestarazu et al. 2012, 2016). SSR markers efficiently reveal genotypic duplicates, indicating synonymous and homonymous cultivars and cultivar groups, as reported also for the apple germplasm of BIH (Gaši et al. 2010, 2013a, b). In recent years, single nucleotide polymorphism (SNP) markers have become increasingly popular. Several SNP arrays, such as the Infinium® IRSC 8 K array (Chagné et al. 2012), the Illumina Infinium® 20 K array (Bianco et al. 2014), the Affymetrix Axiom® 480 K SNP array (Bianco et al. 2016), and the 50 K array (Rymenants et al. 2020), have been made available for research and breeding in apple. They have proven to be useful for robust cultivar characterization, evaluation of the genetic structure of germplasm collections, pedigree inferences (Gilpin et al. 2023; Howard et al. 2017; Luby et al. 2022; Muranty et al. 2020; Skytte af Sätra et al. 2020), construction of high-density genetic linkage maps (Di Pierro et al. 2016), genome-wide association studies (Jung et al. 2022; Miller et al. 2022; Urrestarazu et al. 2017), and QTL mapping (Rymenants et al. 2020; Skytte af Sätra et al. 2023; van de Weg et al. 2018). The 480 K SNP array was recently used to study part of the apple germplasm in BIH, maintained in the Srebrenik and Goražde collections (Konjic et al. 2023).

The primary goal of the present study was to evaluate and improve the status of the IGR collection. Specific objectives were: (i) to identify synonyms, homonyms, and mislabeled accessions, (ii) to assess the genetic structure of the collection, and (iii) to identify parent-offspring relationships. A further aim, exploiting the compatibility of the Illumina Infinium and Affymetrix Axiom arrays (Howard et al. 2021b), was to integrate the 20 K data obtained in the study with previously published 480 K genotypic data (Konjic et al. 2023) for the two other apple field collections maintained in BIH.

Materials and methods

Plant material and DNA isolations

A total of 165 apple accessions in the IGR collection, preserved in the botanical garden of the University of Banja Luka and in Aleksandrovac (File S1), were sampled (Table 1 of File S2). The first part of the collection was established in 2010 with material from the nursery ‘Visoko’. The second part was established in 2013 with young plants produced in the nursery at IGR. The accessions in the second part were identified within the SeedNet project (2007 to 2009) and were obtained at various locations, mainly in north-eastern and eastern parts of RS. The trees were propagated in early spring 2012 and planted in spring 2013. The third part of the collection was established in Aleksandrovac (Laktaši) in spring 2017. These accessions were mainly collected from the north-western part of RS, grafted in early spring 2016, and planted in spring 2017. The rootstocks to which the accessions were grafted are indicated in Table 1 of File S2. Passport data are available for all collected accessions except those purchased in 2010. In spring 2020, young leaves from one tree per accession were collected and immediately transferred to the Department of Plant Breeding, Swedish University of Agricultural Sciences, Alnarp, Sweden. The leaves were freeze-dried and stored at -80 °C until use. Genomic DNA was extracted using the DNeasy 96 Plant Kit (Qiagen) as described by the supplier. Quality and concentration of the genomic DNA were assessed by spectroscopy (Nanodrop, Thermo Scientific).

SNP array genotyping and duplicate identification

The IGR accessions were genotyped using the 20 K Infinium® apple SNP array (Illumina Inc.) (Bianco et al. 2014). Samples of 200 ng genomic DNA per accession were analyzed, following the previously described standard protocol (Chagné et al. 2012). Genotype calls were obtained from Genome Studio v.2.0 (Illumina Inc.), using a subset of 10,295 SNPs and cluster definitions from Howard et al. (2021b). Monomorphic SNPs were excluded, leaving 9,992 SNPs for analysis of the IGR collection. Assessment of sample quality and ploidy level and identification of genotypically duplicate samples were performed as described previously (Skytte af Sätra et al. 2020), following parts of the previously described procedure for genotypic data curation (Vanderzande and Howard et al. 2019). Briefly, sample quality was assessed based on the overall call rate, median clustering quality score (p50 GC), and B-allele and signal intensity (R-parameter) histograms. Sample ploidy was determined from plots of B-allele frequency (Chagné et al. 2015). Genotypically duplicate samples were identified from global pairwise estimates of identity by descent, as implemented in PLINK 1.9 (Chang et al. 2015).
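The filtering and duplicate screening described above can be illustrated with a minimal sketch. It is not the pipeline used in the study: Genome Studio metrics and the PLINK 1.9 identity-by-descent estimate are replaced here by a simple call rate and a pairwise genotype concordance, and the sample names, toy genotypes, and the 0.99 threshold are assumptions chosen for illustration only.

```python
from itertools import combinations

# Genotypes coded as B-allele counts per SNP: 0 (AA), 1 (AB), 2 (BB), None (no call).
# Hypothetical toy data, not taken from the study.
genotypes = {
    "acc_1": [0, 1, 2, 1, None, 0],
    "acc_2": [0, 1, 2, 1, 2, 0],
    "acc_3": [2, 1, 0, 1, 2, 0],
}

def call_rate(calls):
    """Fraction of SNPs with a non-missing genotype call."""
    return sum(c is not None for c in calls) / len(calls)

def polymorphic_indices(geno):
    """Indices of SNPs at which both alleles are observed among the called samples."""
    n_snps = len(next(iter(geno.values())))
    keep = []
    for i in range(n_snps):
        called = [calls[i] for calls in geno.values() if calls[i] is not None]
        b_count = sum(called)
        # Monomorphic if only A alleles (count 0) or only B alleles (count 2n) are seen.
        if 0 < b_count < 2 * len(called):
            keep.append(i)
    return keep

def concordance(a, b, snps):
    """Share of identical calls over the SNPs that are called in both samples."""
    pairs = [(a[i], b[i]) for i in snps if a[i] is not None and b[i] is not None]
    if not pairs:
        return 0.0
    return sum(x == y for x, y in pairs) / len(pairs)

def duplicate_pairs(geno, threshold=0.99):
    """Sample pairs whose concordance over polymorphic SNPs reaches the threshold."""
    snps = polymorphic_indices(geno)
    return [
        (s1, s2)
        for s1, s2 in combinations(sorted(geno), 2)
        if concordance(geno[s1], geno[s2], snps) >= threshold
    ]

print(duplicate_pairs(genotypes))
```

A concordance measure of this kind is only a rough stand-in for an identity-by-descent estimate, but it shows why monomorphic SNPs are dropped first: they agree between all samples and would inflate every pairwise comparison.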

Genetic structure and parent-offspring relationships

Principal component analysis (PCA) was performed on unique diploid accessions using PLINK 1.9 (Chang et al. 2015). Population structure was analyzed statistically with the software STRUCTURE v.2.3.4 (Falush et al. 2003) using five replicates evaluating 1–10 subpopulations (K), a burn-in of 15,000, and a run length of 50,000 for each K. The most probable K was identified by the Evanno method, using Structure Harvester (Earl and von Holdt 2012; Evanno et al. 2005). The software CLUMPP v.1.1.2 (Jakobsson and Rosenberg 2007) was used to calculate average membership coefficients, which were visualized using DISTRUCT v.1.1 (Rosenberg 2003). Network and PCA plots were generated using the igraph (Csardi and Nepusz 2006) and ggplot2 (Wickham 2016) packages for R (R Core Team 2020). Possible parent-parent-offspring (PPO, ‘trio’) and parent-offspring (PO, ‘duo’) relationships within the BIH collection were identified based on Mendelian errors as previously described (Vanderzande et al. 2019), allowing up to 30 mismatches for a putative duo and up to 60 for a putative trio. However, none of the reported duos had more than three Mendelian errors in the current dataset (Table 3 of File S2). Groups of individuals that shared a common unknown, ungenotyped parent were identified using Summed Potential Lengths of Shared Haplotype (SPLoSH) information following methods previously described (Howard et al. 2021a).
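The Mendelian-error test behind the duo and trio screening can be sketched as follows. This is an illustration under simple assumptions (biallelic SNPs coded as B-allele counts, diploid samples, missing calls skipped) and not the implementation of Vanderzande et al. (2019); the function names and toy genotypes are hypothetical.

```python
# Genotypes coded as B-allele counts: 0 (AA), 1 (AB), 2 (BB), None (no call).

def gametes(genotype):
    """B-allele counts (0 or 1) that a diploid parent can transmit."""
    return {0: {0}, 1: {0, 1}, 2: {1}}[genotype]

def duo_errors(parent, child):
    """Count SNPs where parent and child are opposite homozygotes (AA vs BB)."""
    return sum(
        1
        for p, c in zip(parent, child)
        if p is not None and c is not None and abs(p - c) == 2
    )

def trio_errors(parent_1, parent_2, child):
    """Count SNPs where the child genotype cannot arise from the two parents."""
    errors = 0
    for a, b, c in zip(parent_1, parent_2, child):
        if a is None or b is None or c is None:
            continue  # skip SNPs with a missing call in any trio member
        possible = {x + y for x in gametes(a) for y in gametes(b)}
        if c not in possible:
            errors += 1
    return errors

def is_candidate(n_errors, max_errors):
    """Keep a putative relationship if its error count is within the tolerance."""
    return n_errors <= max_errors

# Hypothetical toy genotypes over six SNPs.
p1 = [0, 0, 1, 2, 2, 1]
p2 = [2, 0, 1, 2, 0, None]
child = [1, 0, 2, 2, 1, 1]
unrelated = [0, 1, 2, 0, 1, 1]

print(trio_errors(p1, p2, child), trio_errors(p1, p2, unrelated))
print(duo_errors(p1, child), duo_errors(p1, unrelated))
```

A duo test is much less stringent than a trio test because only opposite homozygotes are informative, which is consistent with the tolerance for trios being set at twice that for duos.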

Integration of genotypic data from Srebrenik and Goražde

Genotypic data for 69 accessions in the Srebrenik and Goražde collections were available from the Axiom® Apple 480 K SNP array (Bianco et al. 2016) and analyzed together with genotypic data from the present study. Genotype data for 45 of these 69 accessions were recently published (Konjic et al. 2023), while new genotype data for an additional 24 accessions were obtained in the present study. Among these 69 accessions, 31 traditional apple accessions are maintained in Goražde, while 25 traditional and 13 international reference cultivars are conserved in Srebrenik. We obtained SNP calls for these individuals using the default settings in Axiom Analysis Suite 5.0 (Affymetrix Inc.), first calling diploid and triploid samples jointly for identification of genotypically duplicate samples and then calling only diploid samples for further analysis. This approach was used as the genotype calls for triploids were only intended for identification of genotypic duplicates, not for pedigree reconstruction. We retained SNP calls for SNP loci successfully called in Axiom Analysis Suite, identified as compatible between the Illumina Infinium and Affymetrix Axiom arrays with no adjustment needed (Howard et al. 2021b), and retained in ongoing germplasm work (Skytte af Sätra 2023). Thus, 5,268 SNPs were used for identification of genotypically duplicate samples across ploidy levels, while 5,723 SNPs were used for PCA and identification of parent-offspring relationships among diploid samples as described above. Considering the differences in quality of the genotype calls from the two SNP arrays, up to 60 mismatches for a putative duo and up to 120 for a putative trio were allowed when at least one individual came from the Srebrenik and Goražde collections, based on differences between known synonymous samples in the dataset. However, the highest number of Mendelian errors between pairs of samples genotyped on the two platforms was 34, which was for a duo known from literature. Among pairs of samples analyzed only on the 480 K array, up to 51 Mendelian errors were found, and a trio with the offspring genotyped with the 20 K array and the other individuals genotyped with the 480 K SNP array had 64 Mendelian errors. Mendelian errors were addressed by evaluating cluster plots in Genome Studio (Howard et al. 2021b), which revealed that these issues were all due to problematic clustering.
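The two integration steps described above, restricting the analysis to SNPs that pass all three retention criteria and relaxing the Mendelian-error tolerance for cross-platform comparisons, can be sketched as below. The SNP identifiers, platform labels, and function names are hypothetical; only the tolerance values (30 and 60 within the 20 K data, 60 and 120 when a 480 K sample is involved) come from the text.

```python
def shared_snps(called_on_axiom, cross_platform_compatible, curated_set):
    """SNPs that satisfy all three retention criteria, in sorted order."""
    return sorted(
        set(called_on_axiom) & set(cross_platform_compatible) & set(curated_set)
    )

def max_mendelian_errors(platforms, relationship):
    """Error tolerance for a putative duo or trio.

    platforms: platform label ("20K" or "480K") of each individual involved.
    relationship: "duo" or "trio".
    """
    base = {"duo": 30, "trio": 60}[relationship]
    # The tolerance is doubled as soon as one individual was genotyped on the 480 K array.
    return base * 2 if "480K" in platforms else base

# Hypothetical SNP identifiers, for illustration only.
axiom_called = ["snp_a", "snp_b", "snp_c"]
compatible = ["snp_b", "snp_c", "snp_d"]
curated = ["snp_c", "snp_b", "snp_e"]

print(shared_snps(axiom_called, compatible, curated))
print(max_mendelian_errors(["20K", "480K"], "duo"))
```

Taking the intersection rather than the union keeps only markers whose calls are directly comparable between platforms, at the cost of a smaller marker set.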

Lastly, unique accessions from all three BIH collections were compared with genotypes in an ongoing pedigree reconstruction study (Denancé et al. 2020; Howard et al. 2018), to identify possible international synonyms. In cases where an accession was found to be genetically identical to a genotype already present in the reference database, the name of the foreign synonym was accepted. Regarding genotypic duplicates among local cultivars, the decision on the assigned preferred name (Table 2 of File S2) was made by consulting available pomological literature and, where available, descriptions and pictures of fruit from the trees in the collections.

Results

Sample quality, ploidy level, and duplicates in the IGR collection

All diploid samples in the IGR collection had call rates above 0.88 and p50 GC values between 0.75 and 0.76. All triploid samples had call rates above 0.8 and p50 GC values between 0.71 and 0.75. The B-allele frequency and signal intensity (R-parameter) histograms in Genome Studio indicated good sample quality, without contamination, for all samples. Among the 165 accessions from the IGR collection analyzed, 54 unique diploid and 18 unique triploid genotypic profiles were identified. The two genotypes with the largest number of duplicates were the triploid profiles named ‘Kolačara’ (13 accessions) and synonyms of ‘Brixner Plattling’ (11 accessions). The ‘Kolačara’ genotype was recorded under 10 different names and three accessions were named ‘Kolačara’, which is consequently recommended as the future designation for this genotype. Some of the accessions were obvious cases of mislabeling. For example, four accessions (‘Batulinka’, ‘Eliflana’, ‘Zelenika’, and ‘Senabija’) were genotypically identical to the rootstock ‘MM.106’, to which the accessions had presumably been grafted. In other cases, pairs of genotypically duplicate accessions had similar pomological descriptions, indicating that they might truly be synonymous, e.g., ‘Zelenika’ and ‘Zvečarka’ (preferred name ‘Zelenika’). On the other hand, 32 samples had no duplicates within the collection (Fig. 1A; Table 2 of File S2), including 15 diploids and 5 triploids.

Genetic structure and parent-offspring relationships in the IGR collection

The first two eigenvectors in PCA explained 10.6% and 5.0%, respectively, of the total genetic variance among the diploid samples in the IGR collection. The cultivars clustered into four weakly defined groups, referred to as “American”, “European”, “Eastern”, and “Balkan” (Fig. 1B). Group names were assigned based on prevalence of cultivars with known origin in these regions present in each group, e.g., ‘Golden Delicious’ for “American”, ‘King of the Pippins’ for “European”, ‘Alexander’ for “Eastern”, and unique BIH genotypes for “Balkan”. However, the STRUCTURE analysis did not provide support for any subpopulations within the collection, as the mean likelihood for K was highest for K = 1 (Table 4 of File S2). Considering K = 4, there were no individuals with an average membership coefficient above 0.6 for any subgroup, although a large proportion of the individuals in the ‘Balkan’ cluster had a higher membership coefficient (> 0.4) for one of the STRUCTURE subpopulations (Fig. 1C). A single trio was identified within the collection, with the local cultivar ‘Sijanac jabuka’ (Eng. ‘apple seedling’) being the offspring of ‘Jonathan’ × ‘Yellow Bellflower’, which are two North American cultivars. Additionally, three previously unknown, undirected parent-offspring relationships were identified.
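The choice of K can be illustrated with a short sketch of the ΔK statistic of Evanno et al. (2005). The statistic relies on the likelihoods at K − 1 and K + 1, so it is not defined at the smallest and largest K tested, which is why K = 1 can only be judged from the mean likelihood. The sketch assumes that ΔK is computed from the replicate means; the log-likelihood values are invented for illustration and are not results from this study.

```python
from statistics import mean, stdev

def delta_k(ln_prob):
    """Delta K for each interior K.

    ln_prob maps K to a list of replicate log-likelihoods.
    Delta K = |mean L(K+1) - 2 * mean L(K) + mean L(K-1)| / sd of L(K).
    """
    means = {k: mean(values) for k, values in ln_prob.items()}
    result = {}
    for k in sorted(ln_prob):
        if k - 1 not in means or k + 1 not in means:
            continue  # undefined at the lowest and highest K tested
        sd = stdev(ln_prob[k])
        if sd > 0:
            result[k] = abs(means[k + 1] - 2 * means[k] + means[k - 1]) / sd
    return result

# Hypothetical replicate log-likelihoods for K = 1 to 4.
toy_runs = {
    1: [-1000.0, -1002.0],
    2: [-900.0, -904.0],
    3: [-880.0, -884.0],
    4: [-878.0, -882.0],
}

scores = delta_k(toy_runs)
best_k = max(scores, key=scores.get)
print(scores, best_k)
```

In the toy data the largest ΔK falls at K = 2, but a result of this kind says nothing about whether K = 1 fits better, so the mean likelihoods need to be inspected alongside it.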

Fig. 1

Duplicate samples and genetic structure in the apple germplasm collection at the Institute for Genetic Resources (IGR), University of Banja Luka, BIH. (A) Network plot illustrating genotypically duplicate accessions, where circles indicate diploid samples, squares indicate triploid samples, and lines connect samples with identical genotypic profiles. The two genotypes with the most and second most duplicates are colored red and blue, respectively. Other genotypes represented by more than one sample are colored beige and unique samples in the collection are colored grey. (B) Unique diploid samples plotted according to their coordinates on the first and second eigenvector in principal component analysis (PCA). Samples are colored according to their cluster assignment, with clusters named after the origin of some characteristic cultivars. (C) Graphical display of the proportion of ancestry for four subpopulations, grouped by the PCA clusters in (B).

Integration of genotypic data across the BIH gene bank collections

Among the 69 samples from the Srebrenik and Goražde collections, there were two pairs of genotypic duplicates not reported previously. Nineteen samples were genotypic duplicates with one or more samples from the IGR collection, and 48 accessions had no genotypic duplicates within the integrated BIH dataset (Fig. 2A; Table 2 of File S2). Two of the accessions from Srebrenik and Goražde (‘Ranka’ and ‘Djulabija’) had a parent-offspring relationship to an accession assigned to the posterior Balkan group (‘Srebrenička’ and ‘Senabija’ (IGR)). Overall, first-degree relationships across the three BIH collections were relatively rare among the extant cultivars, with two trios and 26 duos identified across 95 diploid genotypes, mainly among the international cultivars. However, we inferred the existence of two unknown founders with 9 and 3 first-degree relationships with extant diploid cultivars, respectively (Fig. 2B; Table 3 of File S2). The most common founder was ‘Djulabija’, which was involved in five duos. Structure-wise, the germplasm preserved in Srebrenik and Goražde followed the same pattern as that in the IGR collection, with some unique BIH genotypes clustering separately (Fig. 2C).

Fig. 2

Genotypically duplicate samples and genetic structure across the IGR, Srebrenik, and Goražde heirloom apple collections in BIH. (A) Network plot illustrating genotypically duplicate accessions within the IGR collection (purple) amended with the BIH accessions from Srebrenik and Goražde (green). Diploid samples are presented as circles and triploid samples as squares, with genotypic duplicates connected by lines. (B) Network plot illustrating first-degree relationships (lines) between diploid cultivars within the IGR, Srebrenik, and Goražde collections. Genotypes are colored based on their average membership coefficients for K = 4. Genotypes from the IGR collection with an average membership coefficient above 0.4 for the posterior group coinciding with the ‘Balkan’ cluster in principal component analysis (PCA) (see Fig. 1C) are indicated by pink circles. Genotypes not having a membership coefficient above 0.4 for any posterior group are indicated by green circles, genotypes from Srebrenik and Goražde that do not have genotypic duplicates in the IGR collection are indicated by black circles, and inferred individuals are indicated by grey squares. (C) Diploid genotypes plotted by their coordinates on the first and second eigenvector of the PCA plot, with colors as in (B).

Discussion

Apple germplasm collections intended for preservation in BIH have been established based on historical information and phenotypes of the accessions, which usually results in redundancies. In this study, we used the 20 K apple SNP array to evaluate the status of the IGR collection and combined the findings with data from the two other collections in BIH (Srebrenik and Goražde) previously genotyped with the 480 K apple SNP array. Integration of data for all three collections revealed the presence of a number of synonyms and duplicates, but also a large number of unique accessions. The IGR collection had a substantial number of duplicates, in contrast to the Srebrenik and Goražde collections, which have previously been analyzed with SSR markers. However, the IGR collection was found to be the BIH gene bank with the largest number of unique accessions. In contrast to other national apple heirloom collections (Larsen et al. 2018; Skytte af Sätra et al. 2020), the BIH apple collections exhibited a relative lack of key founders.

However, by considering shared haplotype length between individuals (Howard et al. 2021a) two additional parents were inferred. One of these (‘Inferred Parent 1’) appears to have been a key founder of Balkan apples, being a parent of 9.5% of the genotyped unique diploid accessions.

Curation of the IGR collection germplasm

The accession with most redundancies (12), ‘Kolačara’, was mentioned in all available pomological books from the former Yugoslavia as an old cultivar of unknown origin (Anonymous 1963; Bubić 1977; Mišić 1994; Todorović 1899; Vitolović 1949). In addition to the above-mentioned names, the synonyms ‘Haslinger’ and ‘Roter Pogatscher’ are listed (Anonymous 1963) (see File S3 for more information). It was also described as a triploid cultivar, which was confirmed in our study.

The second group of 11 accessions with identical molecular profiles are genotypic duplicates of ‘Brixner Plattling’ from the Fondazione Edmund Mach (Howard et al. 2022). The local names of these duplicates indicate that the fruit is reddish and sour (‘Crvenika 1’, ‘Crveni kiseljak’, ‘Crvenika 2’, ‘Kiseljača’, ‘Crvena ljutika’, ‘Kasna crvenika’). The name of the other BIH accession synonymous with ‘Brixner Plattling’ (‘Šarenika’) indicates that the fruits have reddish stripes. The names of two accessions synonymous with ‘Brixner Plattling’ (‘Slatka Kanada’ and ‘Zećuša’, meaning ‘sweet Canada’ and ‘rabbit’) are not in line with the others, indicating that they are strictly local names or cases of mislabeling. Only one name has no specific meaning, ‘Kanjišak’.

Many local apple names in the Balkans indicate the color of the fruit skin, e.g., ‘Zelenika’ (green), ‘Bjela’, ‘Bjelica’ (very light whitish-yellow), ‘Limunka’ (lemon yellow), ‘Crvenka’ (red), ‘Šarenika’ (striped), etc. A group of five duplicates were named ‘Zelenika’. ‘Zelenika’ is an old cultivar of unknown origin that used to be highly appreciated and widely cultivated (Anonymous 1963), to the extent that it is mentioned in folk songs (Beširević 2009). Nowadays, it can only be found sporadically in the orchards of apple enthusiasts and growers of heirloom fruit (Beširević 2009). We also found ‘Zvečarka’ to be a synonym for ‘Zelenika’, but that name indicates rattling of the mature seeds within the fruit and not skin color (see also File S3).

Furthermore, a number of accessions under different names (‘Batulenka’, ‘Staklara’, ‘Baščovanka’, ‘Zimnjača’, and ‘Šarenika’) were found to be synonymous with ‘Batul Alma’, which may be due to mislabeling or another reason that resulted in an inappropriate name. Beširević (2009) describes this cultivar under the name ‘Staklara’ and the synonym ‘Batulinka’. Lukman (1938) states that ‘Batulenka’ is a Transylvanian (Romania) cultivar, while in Yugoslav pomology ‘Batul’, ‘Batulka’, and ‘Batulnapfel’ are considered synonyms (Anonymous 1963). Further details and discussions about synonymous genotypes are given in File S3.

Status of the BIH germplasm

Integration of the SNP genotyping data of the curated IGR collection with data on the other two BIH apple collections, situated in Srebrenik and Goražde, gave a broad overview of the apple germplasm conserved in the country. This allowed identification of additional cases of synonyms and homonyms to those revealed within the IGR (Table 2 of File S2). In addition to previously identified duplicates (Konjic et al. 2023), additional pairs of duplicate samples were found within the Srebrenik collection (‘Dobrić’ and ‘Masnjača’) and within the Goražde collection (‘Ovčiji nos’ and ‘Zečija glava’; ‘Švabska zelenika’ and ‘Limunka’; ‘Posavka’ and ‘Bjelka’). ‘Ovčiji nos’ (IGR, GO), ‘Zečija glava’ (GO), and ‘Pričevka’ (IGR) were all found to be identical to ‘Kantil Sinap’. This old cultivar, originating from the Black Sea region, is characterized by a specific fruit shape that is reflected in the local BIH names, e.g., ‘Ovčiji nos’, meaning sheep’s nose, and ‘Zečija glava’, meaning rabbit’s head. However, this characteristic is not exclusive to ‘Kantil Sinap’, as there are other apple cultivars with names referring to sheep’s nose, e.g., the German cultivar ‘Gelbe Schafsnase’. In the Yugoslav pomological literature (Vitolović 1949), ‘Ovčiji nos’ and ‘Prinčevka’ are considered different cultivars of the ‘rattle’ group, named so because the seeds rattle when the fruit is shaken. Both cultivars were maintained in the collection at the former fruit station in Goražde (Anonymous 1963) and are still conserved there. Bubić (1977) considers ‘Prinčevka’, ‘Zečja glava’, and ‘Prince’s apple’ (syn. ‘Berliner Hasenkopf’, ‘Pomme melon’, and ‘Prinzenapfel’) to be synonyms of ‘Prinčeva jabuka’. It is described as being widespread in Austria and Slovenia and was introduced to the territory of BIH during the Austro-Hungarian era. It spread to the Goražde area under the name ‘Zečja glava’ (meaning rabbit’s head, same as ‘Hasenkopf’).

Two cases of genotypic duplicates were also discovered between samples of the Srebrenik and Goražde collections only, i.e., ‘Kanada’ (GO) was identical to ‘Kanjiška’ (SR), while ‘Srebrenička’ (SR) was identical to ‘Limunka’ (GO). This is in line with findings in previous studies using SSR markers (Gaši et al. 2013a, b).

The accession ‘Žuja’ from Srebrenik was identical to the accession with the largest number of duplicates within the IGR collection, ‘Kolačara’. Interestingly, both of these names are descriptive, as the word kolač means cake in the local language, while žuja refers to something yellow. ‘Žuja’ was a presumed diploid based on earlier SSR findings (Gaši et al. 2010) and was recently identified as forming a trio with some of the best-known BIH apple cultivars, i.e., ‘Djulabija’ and ‘Senabija’ (Konjic et al. 2023). However, our analysis revealed ‘Žuja’ to be triploid, and its relationship with ‘Djulabija’ and ‘Senabija’ was not confirmed. A recent pedigree reconstruction study, conducted based on SNP data for more than 3500 European apple accessions, revealed that triploidy has been a dead end in historical apple pedigrees (Howard et al. 2022).

Furthermore, the accession ‘Petrovača bijela’ (SR) was genotypically identical to the accessions ‘Žuta petrovača’, ‘Crvenika’, and ‘Bjeličnik’, which are all genotypic duplicates of ‘Yellow Transparent’. The accessions ‘Bobovec’ (IGR) and ‘Ljutika’ (GO) were found to be genotypic duplicates of the cultivar ‘Bohnapfel’, while ‘Bobovec’ (SR) was found to represent a unique genotype in the dataset. This is in contrast to previous findings that ‘Bobovec’ (SR) is a possible synonym of ‘Bohnapfel’ (Gaši et al. 2013a).

Detection of duplicate samples based on high-density genotypic data and comparisons with a large international dataset (Howard et al. 2021a), pomological literature, and available phenotypic data for some but not all accessions (data not shown) guided the efforts to assign a unique name to most of the genetically distinct accessions (Table 2 of File S2). Following this process, only four cases of homonyms remained. The name ‘Petrovača’ was assigned to two different cultivars, while three accessions were designated ‘Šarenika’. As discussed earlier, both names are descriptive, relating to the ripening date and fruit skin color, respectively.

One of the most interesting cases of homonyms is the well-known BIH cultivar ‘Senabija’. The ‘Senabija’ from the IGR collection differs genetically from ‘Senabija’ maintained in Srebrenik. Further, ‘Senabija’ (IGR) has previously been considered a synonym of ‘Budimka’ (Hjalmarsson and Tomić 2012), a name held by another accession from Srebrenik that is a unique triploid accession. The name ‘Budimka’ probably comes from the Budimlja monastery (today Berane, Montenegro), and the cultivar probably originates from Transcaucasia (Mišić 1994; Vitolović 1949). On the other hand, ‘Senabija’ (SR) was found to be one of the progenies of the well-known traditional cultivar ‘Djulabija’ (Konjic et al. 2023), which appears to be the only relatively important founder among BIH apple cultivars. Thus, ‘Senabija’ (SR) can be considered a more viable candidate for this designation than the accession with the same name from the IGR collection, although the IGR accession was a unique genotype in our dataset clustering in the ‘Balkan’ group (Fig. 1B-C).

Comparisons with a wide international dataset confirmed that all but one of the international reference cultivars maintained in the Srebrenik ex situ collections are true-to-type (cultivar ‘Remo’ was mislabeled as ‘Pilot’). The complete list of individuals with duplicate genotypic profiles is presented in Table 2 of File S2.

Structure of BIH germplasm

The current study did not demonstrate any statistically significant genetic structure within the BIH apple germplasm. However, we identified an ancestral group that is overrepresented among the apple cultivars unique to BIH, forming a ‘Balkan’ cluster (Fig. 1).

Other studies using SNP array data (Skytte af Sätra et al. 2020; Vanderzande et al. 2017) also failed to identify significant STRUCTURE clustering, which may in part be due to the ascertainment bias inherent in the genotyping technology and in part due to actual weak genetic structure within the analysed plant material. The cultivars in the ‘Balkan’ cluster represent material that is slightly genetically differentiated from the common European cultivars, which makes them top priorities for future conservation efforts in BIH.

The presence of relatively few first-degree relationships among the extant BIH accessions can be attributed to the specific history of apple germplasm introduction into the country. Previous studies revealed a genetic structure within the BIH germplasm, characterized by a differentiation of apple accessions based on their introduction period and origin (Gaši et al. 2010, 2013a, b; Konjic et al. 2023; Mićić et al. 2017). The germplasm introduction was carried out in waves from different directions: during the Ottoman rule from the east, during the rule of the Austro-Hungarian Empire from the west and north, and during the time of the former Yugoslavia (kingdom and socialist republic) even from other continents. Our results clearly identified old British, German, French, and North American cultivars, as well as cultivars from the east (‘Batul alma’ and ‘Kantil Sinap’) within the BIH apple germplasm. Many of the accessions analyzed do not have any pedigree connection to known Western cultivars, and thus may have originated from other parts of the world. Furthermore, a previous study has revealed an overlap between Anatolian and BIH apples (Burak et al. 2014), as a result of germplasm exchange and gene flow, particularly during the Ottoman rule. Likewise, pronounced genetic differentiation between the Turkish and west European apple germplasm was noted. This introduction pattern has resulted in genetically diverse apple germplasm, but also in a lack of key founding genotypes. The lack of a large single network such as that reported within Danish and Swedish apple germplasm (Larsen et al. 2018; Skytte af Sätra et al. 2020) indicates that the diversity present within BIH apple germplasm is a product of introduction, and not of previous domestic breeding efforts.

Conclusions

Genotyping and curation of the IGR apple gene bank collection, using the 20 K SNP array, and combining the newly obtained data with previously published 480 K data provided valuable insights into the substantial diversity present within BIH apple germplasm. Several synonymous accessions were identified within the IGR collection, allowing removal of redundant trees to keep the focus on preservation and evaluation of unique accessions. Some cultivars believed to have been preserved in the IGR collection were in fact rootstocks onto which they had been grafted, indicating the need for further inventory efforts to find these lost genetic resources. The collection also showed signs of weak genetic structure, whereby a group of local cultivars without international synonyms formed a separate “Balkan” cluster that should be a key target for preservation.

From previous work on compatibility between the Infinium and Axiom arrays, genotypic data for the two other apple collections in BIH (Srebrenik and Goražde), obtained using the 480 K SNP array, were available to this study. Cross-comparisons showed some overlap between the three collections, but each contains valuable unique accessions, including those representing a potential “Balkan” cluster. Together, the findings in this study create a solid basis for more efficient preservation and evaluation of unique accessions and their utilization in production and future apple breeding in Bosnia and Herzegovina.