Introduction

Bananas (Musa spp. Linnaeus) can be divided into edible cultivars and non-edible wild species. The edible bananas include dessert, cooking and beer-making cultivars, which mostly originated in Southeast Asia (Perrier et al. 2009). Their ancestors are Musa acuminata Colla (denoted AA) and Musa balbisiana Colla (BB). Natural hybridization between and within M. acuminata and M. balbisiana resulted in several cultivars with different genomes and ploidy levels (Hippolyte et al. 2012). The possible genomic groups for bananas include AA, BB, AB, AAA, AAB, ABB, AABB, AAAB and ABBB (Creste et al. 2003).

The East African Highland bananas (Musa AAA group), also referred to as EAHB, are an endemic group of bananas found in the Great Lakes region (Uganda National Council for Science and Technology 2007). They are grown at altitudes between 900 and 2000 m above sea level, and are mainly found in Burundi, Kenya, Rwanda, Tanzania and Uganda, as well as in some areas of Cameroon and the Democratic Republic of Congo. The cultivars within the group require an average of 2000–2500 mm of rain evenly distributed throughout the year (Uganda National Council for Science and Technology 2007) as they are very susceptible to drought (Kissel et al. 2015, 2016). The EAHB are placed in the Lujugira–Mutika subgroup, which has been further divided into the five clone sets Mbidde, Musakala, Nakabululu, Nakitembe and Nfuuka (Karamura 1998; Pickersgill and Karamura 1999). Each clone set is composed of a number of cultivars that serve different functions, such as beer making (Mbidde) or consumption as cooked food or dessert (all others), depending on the region where they are grown.

In Uganda, cooking banana cultivars are locally known as ‘matooke’ and serve as a staple food for a large part of the population. Uganda produces over 8 million tons of ‘matooke’ bananas annually, which makes it the second largest banana producer in the world. The daily per capita consumption of ‘matooke’ in Uganda is 0.7 kg (ABSPII 2013), making it the most important food and cash crop for small-scale farmers in this country. Banana production in Uganda has, however, declined over the past two decades due to production constraints such as attacks by black Sigatoka, parasitic nematodes, bacterial wilt and banana weevil, and problems related to soil fertility and inadequate moisture during drought (Swennen et al. 2013). Banana breeding carried out by the International Institute of Tropical Agriculture (IITA) and the National Agricultural Research Organization (NARO) targets constraints related to pests and diseases (Tushemereirwe et al. 2015). Banana crossbreeding starts with the hybridization of EAHB with wild or improved diploids that are resistant to banana diseases and pests, to generate banana clones combining host plant resistance to biotic and abiotic stresses with a short cycle, short height, high yield and good quality (Ortiz and Swennen 2014). To facilitate access to and use of genetic variation in Musa breeding, appropriate conservation, characterization and evaluation of the variation in the matooke banana cultigen pool are required.

Based on the Global Conservation Strategy for Musa spp., the Taxonomy Advisory Group (2010) agreed on a minimum set of 32 descriptors for the characterization and documentation of bananas. These banana descriptors allow discrimination between different cultivars in the field, in addition to monitoring morphological attributes that are highly heritable (Daniells et al. 2001). To standardize data recording, plants at the right developmental stage, i.e., when plants are green ripe or have a bunch rachis 45 cm in length, are selected for description (Channeliére et al. 2011). However, little is known about the stability of the selected descriptors in Musa. In the present study, a sample of EAHB belonging to two clone sets was characterized with the objective of identifying stable descriptors that could be used both for conservation purposes, to distinguish cultivars in germplasm collections, and for breeding purposes, to select breeding materials and to describe new cultivars developed by the breeding program.

Materials and methods

Eleven female fertile East African Highland banana cultivars from two different clone sets (Table 1) were planted at the International Institute of Tropical Agriculture (IITA) in Namulonge/Sendusu, Uganda (00°31′47″N and 32°36′9″E at an elevation of 1167 m above sea level). The climate at this station fluctuates between dry and wet periods with an average temperature of 22 °C and average annual rainfall of 1264 mm (Nsubuga et al. 2011). A minimum of 5 plants and a maximum of 20 plants per cultivar within the same location were evaluated between September and December 2014 with Musa descriptors from the minimum descriptor list (Taxonomic Advisory Group 2010). For uniformity, the evaluation was done on plants having a bunch rachis of at least 60 nodes or a rachis of approximately 45 cm in length (Channeliére et al. 2011).

Table 1 Female fertile East African highland banana ‘matooke’ cultivars used in this study

Thirty-one descriptors were used to record the morpho-taxonomic characters of the 11 ‘matooke’ cultivars. Twenty-eight of the descriptors were qualitative, while three were quantitative. The quantitative descriptors were: fruit length (of the middle fruit of the third hand), number of hands per bunch and number of fruits on the mid hand of the bunch (Tables 2, 3). The qualitative descriptors were: sap colour, edge of petiole margin, colour of cigar leaf dorsal surface, bract behaviour before falling, lobe colour of compound tepal, pseudostem height, predominant underlying colour of pseudostem, blotches at the petiole base, petiole canal leaf III, petiole margins, petiole margins colour, bunch position, bunch shape, rachis position, rachis appearance, male bud shape, bract apex shape, bract imbrication, colour of the bract external face, colour of bract internal face, compound tepal basic colour, anther colour, dominant colour of male flower, fruit shape, fruit apex, remains of flower relicts at fruit apex, fruit pedicel length and fusion of pedicels (Tables 4, 5). Size of male bud at harvest, which is the 32nd descriptor in the minimum descriptor list, was not used in this study because the male buds were removed from plants before harvest to control the spread of banana bacterial wilt (Kubiriba and Tushemereirwe 2014). The descriptors related to colour were examined using standard colour charts developed by the Taxonomy Advisory Group (2010). All qualitative descriptors were recorded as categorical scores ranging from 1 to 10, whereas the three quantitative descriptors were measured and recorded directly.

Table 2 Fruit and bunch quantitative traits (mean ± SD) of eleven East African highland banana ‘matooke’ cultivars
Table 3 One-way analysis of variance for quantitative fruit and bunch traits in eleven East African highland banana ‘matooke’ cultivars
Table 4 Probability for binomial test of 28 categorical descriptors with null hypothesis P = 0.5 versus alternative hypothesis P > 0.5 for 11 banana cultivars
Table 5 Lower bounds of the one-sided confidence interval for 28 descriptors of 11 banana cultivars (The upper bound being 1)

Data were analyzed using R software version 3.2.0 (R Core Team 2015). Categorical variables were first converted to a binary scale by determining the mode of the scores: mode scores were given a value of 0 and non-mode scores a value of 1. The data were then analyzed with a binomial test at the 95% confidence level, the null hypothesis being that “the probability of getting a mode score is equal to the probability of getting a non-mode score (P = 0.5)”, while the alternative hypothesis was “the probability of getting a mode score is greater than 0.5 (P > 0.5)”. The lower bounds of the one-sided confidence intervals for the proportion of mode scores were also calculated (Table 5). The means and standard deviations of the quantitative data were calculated (Table 2), and a one-way analysis of variance was done on the quantitative data (Table 3).
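The binarization and one-sided binomial test described above can be sketched in a few lines. The following Python sketch is illustrative only: the study itself used R, and the scores, the function names and the choice of an exact (Clopper-Pearson) lower bound found by bisection are assumptions made for the example, not details reported in this paper.

```python
from collections import Counter
from math import comb


def binarize_by_mode(scores):
    """Return (mode, flags): flags are 0 for mode scores and 1 otherwise."""
    # Assumption: ties for the mode are resolved by first occurrence.
    mode = Counter(scores).most_common(1)[0][0]
    return mode, [0 if s == mode else 1 for s in scores]


def upper_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def binomial_test_greater(k, n, p0=0.5):
    """Exact one-sided p-value for H0: P = p0 versus H1: P > p0."""
    return upper_tail(k, n, p0)


def lower_bound(k, n, alpha=0.05, tol=1e-10):
    """One-sided exact lower confidence bound for P, found by bisection."""
    if k == 0:
        return 0.0
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        # upper_tail increases with p; the bound is where it equals alpha.
        if upper_tail(k, n, mid) < alpha:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


# Hypothetical scores for one descriptor on 10 plants of one cultivar.
scores = [2, 2, 2, 2, 2, 2, 2, 2, 2, 3]
mode, flags = binarize_by_mode(scores)
k = flags.count(0)  # number of plants showing the mode score
n = len(flags)
p_value = binomial_test_greater(k, n)  # 11/1024, about 0.011
bound = lower_bound(k, n)
```

With these hypothetical scores the null hypothesis is rejected at the 5% level and the lower bound lies above 0.5, so the descriptor would be classed as stable for that cultivar.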

The stable (monomorphic) descriptors identified in this study were used to compare the 11 ‘matooke’ cultivars with banana cultivars from other groups. Seven dessert (AAA) bananas (Table 6), five Asian cooking (ABB) banana cultivars (Table 7) and 15 East African Highland banana cultivars belonging to five clone sets (Table 8) were compared using the identified stable qualitative descriptors. The data for these three additional banana groups were obtained from Musalogue, an international catalogue of Musa germplasm (Daniells et al. 2001). The data were first converted to a binary scale using the mode: mode scores were given a value of 0 and non-mode scores a value of 1. The data were then used to cluster the banana groups using Ward’s hierarchical agglomerative clustering method (Murtagh and Legendre 2014).
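As an illustration of Ward's criterion on binary descriptor profiles, the sketch below repeatedly merges the two clusters whose union gives the smallest increase in within-cluster sum of squares. It is a minimal pure-Python sketch, not the R routine used in the study; the cultivar labels, the four-descriptor profiles and the use of squared Euclidean distance between centroids are assumptions chosen for the example.

```python
def ward_merge_cost(a, b):
    """Increase in within-cluster sum of squares if clusters a and b merge."""
    na, nb = a["size"], b["size"]
    sq = sum((x - y) ** 2 for x, y in zip(a["centroid"], b["centroid"]))
    return na * nb / (na + nb) * sq


def ward_cluster(profiles):
    """Agglomerate labelled binary profiles; return the merge history."""
    clusters = [
        {"members": [name], "size": 1, "centroid": [float(v) for v in vec]}
        for name, vec in profiles.items()
    ]
    history = []
    while len(clusters) > 1:
        # Find the pair whose merge gives the smallest increase.
        pairs = (
            (i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        )
        i, j = min(
            pairs, key=lambda ij: ward_merge_cost(clusters[ij[0]], clusters[ij[1]])
        )
        a, b = clusters[i], clusters[j]
        cost = ward_merge_cost(a, b)
        size = a["size"] + b["size"]
        merged = {
            "members": a["members"] + b["members"],
            "size": size,
            # Size-weighted mean of the two centroids.
            "centroid": [
                (a["size"] * x + b["size"] * y) / size
                for x, y in zip(a["centroid"], b["centroid"])
            ],
        }
        history.append((sorted(a["members"]), sorted(b["members"]), cost))
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return history


# Hypothetical binary profiles (0 = mode score, 1 = non-mode score).
profiles = {
    "EAHB_a": [0, 0, 0, 0],
    "EAHB_b": [0, 0, 0, 0],
    "DB_a": [0, 1, 0, 1],
    "ACB_a": [1, 1, 1, 1],
}
history = ward_cluster(profiles)
for left, right, cost in history:
    print(left, "+", right, "cost", cost)
```

In this toy data set the two identical EAHB profiles merge first at zero cost, which mirrors why cultivars sharing the same mode scores end up close together in a dendrogram.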

Table 6 Dessert bananas (AAA) characterized using 10 monomorphic descriptors.
Table 7 Asian cooking bananas (ABB) characterized using 10 monomorphic descriptors.
Table 8 East African highland banana cultivars belonging to 5 clone sets characterized using 10 monomorphic descriptors.

Results

The variation for fruit length, number of hands per bunch and number of fruits on the mid hand of the bunch among the 11 female fertile East African highland bananas is given in Table 2. One-way analysis of variance indicated that the cultivars were significantly different for these traits (Table 3).
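For readers unfamiliar with the test, the F statistic behind a one-way analysis of variance can be computed as sketched below. The cultivar labels and fruit lengths are hypothetical and are not data from Table 2; the p-value, which requires the F distribution, is not computed here.

```python
from statistics import mean


def one_way_anova(groups):
    """Return (F, df_between, df_within) for a dict of label -> measurements."""
    all_values = [x for values in groups.values() for x in values]
    grand_mean = mean(all_values)
    # Variation of group means around the grand mean.
    ss_between = sum(
        len(values) * (mean(values) - grand_mean) ** 2 for values in groups.values()
    )
    # Variation of observations around their own group mean.
    ss_within = sum(
        (x - mean(values)) ** 2 for values in groups.values() for x in values
    )
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical fruit lengths (cm) for three cultivars.
fruit_length = {
    "cultivar_A": [18.0, 19.0, 20.0],
    "cultivar_B": [22.0, 23.0, 24.0],
    "cultivar_C": [15.0, 16.0, 17.0],
}
f_stat, df_between, df_within = one_way_anova(fruit_length)
```

A large F value relative to its degrees of freedom indicates that differences between cultivar means are large compared with the variation among plants of the same cultivar.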

Within each cultivar, the stability of the qualitative descriptors ranged from highly stable (‘***’) through moderately stable (‘**’) and fairly stable (‘*’) to unstable (NS) (Table 4). Ten descriptors, six of which are related to flowering, were stable across all 11 ‘matooke’ cultivars (Table 4). These descriptors were: sap colour, edge of petiole margin, colour of cigar leaf dorsal surface, bract behaviour before falling, lobe colour of compound tepal, bract imbrication, compound tepal basic colour, anther colour, dominant colour of male flower and fruit shape. The stable descriptors stretched across the two clone sets, and no set of stable descriptors was observed in only one clone set. Only cultivar ‘Tereza’ had characters that distinguished it from all the other cultivars. These characters were colour of the bract external face and colour of bract internal face (Supplementary Figure S1).

The lower bounds of the mode scores for the qualitative descriptors varied from 0.08 (8%) to 0.86 (86%) across all the 11 cultivars (Table 5). All descriptors with P values having ‘***’, ‘**’ and ‘*’ levels of significance (Table 4) had their corresponding lower bounds higher than 0.5 (50%) (Table 5), while the descriptors with P values showing ‘NS’ had their lower bound values less than 0.5 (50%).

The cladogram (Fig. 1) grouped East African highland banana cultivars close to each other. Cultivars in the Nakabululu clone set, the Nfuuka clone set and the Mbidde clone set formed the major cluster while cultivars in the Musakala clone set and the Nakitembe clone set (except cultivar Mbwazirume) formed a minor cluster next to the main cluster for the EAHB. The Asian cooking bananas did not cluster together, neither did the dessert (AAA) cultivars.

Fig. 1

A cladogram showing clustering of 11 female fertile East African highland bananas used in this study in comparison with the 7 dessert (AAA) bananas, 5 Asian cooking bananas (ABB) and 15 East African highland bananas belonging to 5 clone sets, compared using 10 monomorphic descriptors in female fertile East African highland bananas. ACB Asian Cooking Bananas; DB Dessert Bananas; EAHB East African Highland Bananas; tsEAHB this study East African Highland Bananas; Mbi Mbidde; Mus Musakala; Naka Nakabululu; Naki Nakitembe; Nfu Nfuuka

Discussion

A good morphological descriptor should be stable, heritable, distinctly identifiable, easy to distinguish by the human eye, expressed consistently and able to clearly distinguish the individuals of interest. The high variation exhibited by the quantitative descriptors (fruit length, number of hands per bunch and number of fruits on the mid hand of the bunch) indicates that such descriptors are not stable and thus not suitable for describing the EAHB cultivars. Javed et al. (2002) characterized 16 populations of Malaysian wild M. acuminata using 46 morphological characters and also found that the quantitative characters were unstable. However, in their study they found pseudostem colour, petiole sheath colour and rachis position to be useful characters for distinguishing the M. acuminata populations, which is contrary to the findings of this study.

Each cultivar had a set of descriptors that were stable between the individuals of that cultivar, but these descriptors were not useful for distinguishing a particular cultivar because in most cases the same descriptor was shared with two or more other cultivars. The 10 descriptors that were stable across the 11 cultivars had the same mode score across the cultivars. For example, sap colour had a mode of 2 representing milky sap, edge of petiole margin had a mode of 2 representing red–purple colour or brown when dried, colour of cigar leaf dorsal surface had a mode of 3 representing medium green colour, bract behaviour before falling had a mode of 1 representing revolute (rolling), lobe colour of compound tepal had a mode of 2 representing yellow colour, bract imbrication had a mode of 1 representing old bracts overlap at apex of bud (no imbrication), compound tepal basic colour had a mode of 2 representing cream colour, anther colour had a mode of 6 representing pink/pink-purple, dominant colour of male flower had a mode of 2 representing cream, and fruit shape had a mode of 1 representing straight or slightly curved. This implies that these stable descriptors are not suitable for discriminating between the EAHB cultivars. However, they can be used to distinguish the East African highland bananas as a group from other groups of bananas. Therefore, there is a need to revise the available minimum set of Musa morphological descriptors to find suitable ones capable of distinguishing EAHB cultivars. Kitavi et al. (2016) and Christelová et al. (2016) studied the genepool of the triploid East African highland bananas using SSR and AFLP markers. They found that EAHB cultivars were genetically uniform. However, the results of the morphological characterization in our study do not agree with the molecular findings, since the EAHB cultivars used in this study showed stable and consistently similar behaviour in only 10 of the 31 characters, representing a similarity level of only 32%. There is therefore a need to study the genetic basis of the morphological variation in EAHB cultivars using high-density genotyping by sequencing.

The fact that all descriptors with P values having ‘***’, ‘**’ and ‘*’ levels of significance (Table 4) corresponded to lower bound values higher than 0.5 (50%) (Table 5), while the descriptors with P values showing ‘NS’ had lower bound values less than 0.5 (50%), confirms that all the stable descriptors had a mode score in more than 50% of the plants within a cultivar. This is in agreement with the tested hypotheses: the null hypothesis being that “the probability of getting a mode score is equal to the probability of getting a non-mode score (P = 0.5)”, versus the alternative hypothesis “the probability of getting a mode score is greater than 0.5 (P > 0.5)”. Accordingly, a descriptor for which the null hypothesis was not rejected was considered unstable, whereas a descriptor for which it was rejected in favour of the alternative hypothesis was considered stable.

In order to minimize sources of variation during characterization and to have consistency in scoring, the gene bank curator or a specified team should be responsible for measuring and recording the descriptors. The number of individuals sampled also influenced the lower bound: the lower bounds for descriptors were much lower in cultivars with few individuals sampled than in cultivars with more individuals sampled.
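The dependence of the lower bound on sample size can be made concrete for the best case in which every sampled plant shows the mode score. Assuming an exact (Clopper-Pearson) one-sided interval, which is an assumption about the method since the interval type is not stated here, the lower bound then solves p^n = alpha, so it equals alpha^(1/n). The function name below is hypothetical.

```python
def lower_bound_all_mode(n, alpha=0.05):
    """One-sided exact lower bound when all n sampled plants show the mode."""
    # With k = n successes, P(X >= n) = p**n, so p**n = alpha gives the bound.
    return alpha ** (1 / n)


# Sample sizes spanning the 5 to 20 plants per cultivar used in this study.
for n in (5, 10, 20):
    print(n, round(lower_bound_all_mode(n), 2))
# prints:
# 5 0.55
# 10 0.74
# 20 0.86
```

Even a perfectly uniform descriptor therefore reaches a lower bound of only about 0.55 with 5 plants, compared with about 0.86 with 20 plants, the latter being consistent with the maximum lower bound of 0.86 reported in the Results.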

The light green margins with purple stripes on the bract external face and the yellow or green bract internal face that turns gradually to orange-red towards the apex, are characters which can be used to distinguish cultivar ‘Tereza’ from other EAHB (Online Resource 1).

The cladogram (Fig. 1) grouped EAHB close to each other. Cultivars in the Nakabululu clone set, the Nfuuka clone set and the Mbidde clone set formed a major cluster while cultivars in the Musakala clone set and the Nakitembe clone set (except cultivar Mbwazirume) formed a minor cluster next to the main cluster for the EAHB. This is in agreement with the observation by Karamura et al. (2016), who used SSR markers to assess the genetic variation within and between 53 banana groups. They found that the genetic distance was shortest within ilalyi and EAHB. However, within the EAHB, the variation was higher in the Nakitembe and Musakala clone sets. This was attributed to the fact that Nakitembe and Musakala are the clone sets containing most of the commercial cultivars, and the variation may be due to high and long-term selection pressure. The Asian cooking bananas (ABB) did not cluster together, and neither did the dessert (AAA) cultivars. This may be because the set of descriptors used is suitable for grouping neither Asian cooking bananas nor dessert cultivars. Another reason might be that data for some of the selected descriptors were missing for some cultivars in Musalogue. Hence, Musalogue needs to be regularly updated to fill in the missing information about Musa germplasm. Grande Naine, Williams and Red Dacca clustered close to the EAHB cultivars, possibly because they are all triploid AAA cultivars and more closely related to the EAHB.

Molecular markers have been used in assessing the variation and relationships within and among different banana groups. Ortiz and Swennen (2014) indicated that DNA markers can be used as a tool to facilitate taxonomy and assessment of cultivar trueness-to-type. They referred to new microsatellites as being widely used for assessing diversity in bananas, plantains and related crop wild relatives, some of which were derived from expressed sequence tags (EST) or from genomic sequence surveys (GSS). For example, Christelová et al. (2016) used simple sequence repeat (SSR) markers to characterize the global Musa germplasm collection kept at the International Transit Centre (ITC) in Leuven (Belgium). They found that the SSR marker assessment agreed with the previous morphologically based classification for 84% of the ITC accessions analyzed, while for 16% of the accessions it did not. However, Creste et al. (2003), using SSR to analyze 35 polyploid banana cultivars (3x AAA, AAB; 4x AAAA, AAAB) grown in Brazil, concluded that their phenetic analysis based on the Jaccard similarity index highly agreed with the morphological classification. Kitavi et al. (2016) used 100 SSR markers to investigate the genetic diversity of 90 phenotypically diverse EAHB cultivars collected from Kenya and Uganda and compared them with plantain (AAB) and dessert (AAA) cultivars. They found that EAHB cultivars had minimal genetic variation and were largely genetically uniform, irrespective of source of collection. They observed no association between the classification of EAHB germplasm based on SSR markers and the morphologically based classification.

Conclusion

In summary, this research shows that the minimum set of descriptors developed for banana consists of stable (32%) and unstable descriptors and, as shown in a small sample of the ‘matooke’ banana cultigen, is inefficient at differentiating between cultivars. The available set of minimum morphological descriptors in Musa should be revised to include only those that are stable and can efficiently distinguish the East African Highland bananas. Likewise, a minimum set of high-throughput dense DNA markers should be defined for an improved assessment of diversity in Musa germplasm (Nunes de Jesus et al. 2009), which will complement the morphological characterization. Similar research should be initiated on all Musa subgroups, such as the morphologically diverse plantain subgroup, to find out whether the minimum set of descriptors is useful or not (De Langhe et al. 2005).