Introduction

Banana (genus Musa L.), one of the most important staple crops widely cultivated in the tropics and subtropics, is a giant tropical perennial herb belonging to the Musaceae family of the order Zingiberales (Simmonds 1995). Present-day edible bananas originate primarily from the diploid species M. acuminata (AA) and M. balbisiana (BB). Most cultivated diploid and polyploid banana varieties are sterile intra- or inter-specific hybrids of these two species and have been fixed through hundreds of years of human selection. Based on analysis of morphological characters and ploidy level, Simmonds and Shepherd (1955) described five main genetic groups (AA, AB, AAA, AAB, and ABB) for cultivated bananas.

China is a banana-producing country, and South China lies on the northern border of the center of origin of Musa, with rich and diverse germplasm. There are an estimated 11 species in China (Li 1978; Wu and Kress 2000), and two new wild species of Musa in Yunnan were recently reported (Liu et al. 2002; Häkkinen 2009). The banana cultivars in China are primarily classified into four groups: Xiangyajiao (AAA), Dajiao (ABB), Fenjiao (ABB), and Longyajiao (AAB). However, the available genetic resources are not well understood because of breeding limitations and extensive germplasm exchange, synonymous and homonymous local names, and the high occurrence of somatic mutants in some cultivars.

Because they reveal polymorphisms directly at the DNA level, molecular markers have been employed in the characterization and evaluation of genetic diversity in Musa species. Microsatellites have proved to be the best markers for banana typing because they are highly polymorphic, multi-allelic, co-dominant, reproducible, and provide extensive genome coverage (Gupta and Varshney 2000). The genetic diversity of the existing germplasm in China has not been fully and systematically characterized (Ge et al. 2005; Wang et al. 2007, 2008). Furthermore, the development of simple sequence repeat (SSR) primers for banana species has not yet been reported.

Analysis of SSRs has many advantages, but SSR primers are not easy to develop. Researchers have developed SSR primers by many methods, such as classical screening of genomic libraries (Ujino et al. 1998), microsatellite enrichment (Huang et al. 1999), 5′-anchor polymerase chain reaction (PCR) (Fisher et al. 1996), sequence-tagged microsatellite profiling (STMP, Hayden and Sharp 2001a), selectively amplified microsatellite analysis (SAM, Hayden and Sharp 2001b), and database BLAST searches (Ramsay et al. 2000). SAM is an efficient method for developing SSR markers. In this study, we aimed to: (1) develop novel microsatellite markers isolated from M. acuminata cv. Gongjiao using the SAM method; (2) assess molecular variability in related species/subspecies and cultivated germplasm in the Kunming Botanical Garden (KBG) and the National Field Genebank for Banana (NFGB) in China; (3) construct a dendrogram to demonstrate relationships among genotypes; and (4) compare this scheme with shared morphological features of the plants.

Materials and methods

Plant materials and genomic DNA extraction

Fresh leaf samples were collected from 26 cultivated banana varieties, 9 Musa species/subspecies (M. acuminata zebrina, M. a. burmannica, M. balbisiana, M. ornata, M. velutina, M. chiliocarpa, M. aurantiaca, M. yunnanensis, and M. itinerans), and 2 species from other genera of the Musaceae (Musella lasiocarpa and Ensete glaucum) growing in the National Field Genebank for Banana (NFGB), Fruit Tree Research Institute, Guangdong Academy of Agricultural Sciences (Guangzhou) and the Kunming Botanical Garden (KBG), Kunming Institute of Botany, Chinese Academy of Sciences (Kunming) (Table 1). Total DNA was extracted from young leaves using the CTAB protocol (Paterson et al. 1993).

Table 1 Banana materials analyzed

SAM assay

SAM segments were isolated from genomic DNA of M. acuminata cv. Gongjiao, a commercial diploid cultivar in China, using the SAM protocol (Hayden and Sharp 2001b). After recovery, cloning, and sequencing, the fragments were analyzed by the SSRIT software (http://www.gramene.org/gramene/searches/ssrtool) and were used to design appropriate primers with the Primer3 software (http://www.genome.wi.mit.edu/genome_software/other/primer3.html).
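
The SSR search step can be illustrated with a minimal sketch. The function below is not the SSRIT algorithm; it is an assumed regular-expression approach that reports perfect di-, tri-, and tetranucleotide repeats only. The name find_ssrs, the minimum repeat thresholds, and the example sequence are hypothetical and are not taken from this study.

```python
import re

def find_ssrs(seq, min_repeats=None):
    """Return (start, motif, repeat_count) for perfect SSRs in seq.

    min_repeats maps motif length -> minimum number of tandem copies.
    The defaults are illustrative assumptions, not SSRIT settings.
    """
    if min_repeats is None:
        min_repeats = {2: 6, 3: 4, 4: 3}
    seq = seq.upper()
    hits = []
    for size, min_n in sorted(min_repeats.items()):
        # One motif of `size` bases followed by at least min_n - 1 more copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (size, min_n - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # Skip motifs that are themselves repeats of a shorter unit,
            # e.g. AA (homopolymer) or GTGT (already reported as GT).
            if any(motif == motif[:k] * (size // k)
                   for k in range(1, size) if size % k == 0):
                continue
            hits.append((m.start(), motif, len(m.group(0)) // size))
    return sorted(hits)

# Hypothetical sequence: a (GT)8 repeat and a (GAA)5 repeat with flanks.
example = "ATCG" + "GT" * 8 + "CCAT" + "GAA" * 5 + "TTC"
for start, motif, n in find_ssrs(example):
    print(start, f"({motif}){n}")
```

Imperfect and compound microsatellites are not detected by this sketch, so it would undercount relative to a dedicated tool.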

SSR analysis

The primers obtained were initially tested on the commercial diploid cultivar ‘Gongjiao’ by PCR amplification in 20 μl volumes containing 10 mM Tris–HCl (pH 8.8), 50 mM KCl, 0.08% NP-40, 2 mM MgCl2, 0.125 mM of each dNTP, 0.5 μM of each primer, 40 ng genomic DNA, and 0.5 U Taq DNA polymerase (Shanghai Sangon Biological Engineering Technology & Services Co., Ltd, China). Reactions were carried out in a Whatman Biometra T1 Thermocycler (Germany) using the following temperature profile: an initial denaturation of 3 min at 94°C, followed by 30 cycles of denaturation for 30 s at 94°C, annealing for 45 s at the primer-specific annealing temperature, and extension for 1 min at 72°C. Cycling was followed by a final extension step of 5 min at 72°C. PCR products were electrophoresed in 6% polyacrylamide gels and stained with silver nitrate. The primers that produced clear and scorable amplification patterns were selected for further SSR analysis. The PCR analyses were repeated at least twice to ensure reproducibility.
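
As a rough illustration, the temperature profile above can be written down as data, which makes the programmed hold time easy to check. The sketch below is not thermocycler software: all names are hypothetical, ramping time between steps is ignored, and the annealing temperature is left as a placeholder because it is primer-specific.

```python
# Each step is (label, temperature in degrees C or None if primer-specific, seconds).
INITIAL = [("initial denaturation", 94, 180)]
CYCLE = [
    ("denaturation", 94, 30),
    ("annealing", None, 45),   # primer-specific annealing temperature
    ("extension", 72, 60),
]
FINAL = [("final extension", 72, 300)]
N_CYCLES = 30

def programmed_seconds(initial, cycle, final, n_cycles):
    """Sum of programmed hold times; ramp times are not included."""
    one_cycle = sum(sec for _, _, sec in cycle)
    return (sum(sec for _, _, sec in initial)
            + n_cycles * one_cycle
            + sum(sec for _, _, sec in final))

total = programmed_seconds(INITIAL, CYCLE, FINAL, N_CYCLES)
print(f"{total} s = {total / 60:.1f} min")
```

Under these assumptions the programmed holds add up to 4530 s (75.5 min); the actual run time is longer because of ramping.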

Morphological analysis

Morphological characterization of the 26 banana cultivars was scored based on the Banana Plant Descriptor method (IPGRI 1996) in the NFGB (Guangzhou). Observed qualitative characters included the following eight vegetative traits (pseudostem color, pseudostem pigments, predominant underlying color of the pseudostem, pigmentation of the underlying pseudostem, leaf habit, petiole canal leaf III, shape of leaf blade base, and insertion point of leaf blades on petiole), 14 inflorescence traits (rachis position, rachis appearance, male bud shape, male bract shape, bract base shape, bract apex shape, bract imbrication, color of the bract external face, color of the bract internal face, fading of color on bract base, bract scars on rachis, male bract lifting, wax on the bract, and bract behavior before falling), 10 male flower traits (male flower behavior, compound tepal basic color, lobe color of compound tepal, free tepal color, free tepal shape, free tepal appearance, style shape, stigma color, ovary shape, and ovary basic color), and nine fruit traits (fruit position, general fruit shape, fruit shape [longitudinal curvature], fruit apex, transverse section of fruit, immature fruit peel color, pulp color before maturity, mature fruit peel color, and pulp color at maturity).

Data analysis

The SSR gel images were analyzed with Bandscan Software v. 5.0 (http://www.glyko.com) and confirmed manually. The bands were sized and then binary coded with 1 or 0 for their presence or absence in each genotype. The polymorphism information content (PIC) for each primer was calculated based on the formula:

$$ {\text{PIC}} = 1 - \sum\limits_{i = 1}^{m} {p_{i}^{2} } - \sum\limits_{i = 1}^{m - 1} {\sum\limits_{j = i + 1}^{m} {2p_{i}^{2} p_{j}^{2} } }, $$

where p_i and p_j are the relative frequencies of the ith and jth patterns (alleles) of a given SSR marker and m is the number of patterns observed for that marker (Botstein et al. 1980). NTSYS-pc 2.11 software (Exeter Software, Setauket, NY) was used to estimate genetic similarities with the Nei and Li coefficient (Nei and Li 1979). The resulting similarity matrix was analyzed by the unweighted pair-group method with arithmetic average (UPGMA). The clustering was also tested by bootstrap analysis using the WinBoot program (Yap and Nelson 1996) with 1,000 iterations. Morphological traits were analyzed using the same software.
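
The scoring and PIC steps can be sketched as follows. This is a minimal illustration, not the Bandscan or NTSYS-pc procedure: the function names, accession labels, band sizes, and frequencies are hypothetical, and the pattern frequencies are assumed to be supplied directly for one marker.

```python
def binary_code(bands_by_accession):
    """Turn {accession: set of band sizes} into a sorted band list and 0/1 rows."""
    bands = sorted(set().union(*bands_by_accession.values()))
    matrix = {acc: [1 if b in present else 0 for b in bands]
              for acc, present in bands_by_accession.items()}
    return bands, matrix

def pic(freqs):
    """PIC of one marker from its pattern frequencies (Botstein et al. 1980)."""
    if abs(sum(freqs) - 1.0) > 1e-9:
        raise ValueError("frequencies must sum to 1")
    squares = sum(p * p for p in freqs)
    # Double sum over i < j of 2 * p_i^2 * p_j^2, as in the formula above.
    cross = sum(2 * freqs[i] ** 2 * freqs[j] ** 2
                for i in range(len(freqs) - 1)
                for j in range(i + 1, len(freqs)))
    return 1 - squares - cross

# Hypothetical band sizes (bp) scored for three accessions at one locus.
bands, matrix = binary_code({
    "acc1": {150, 160},
    "acc2": {150},
    "acc3": {160, 172},
})
print(bands)           # band sizes used as columns
print(matrix["acc1"])  # presence/absence row for acc1

# Hypothetical frequencies for a marker with three patterns.
print(round(pic([0.5, 0.3, 0.2]), 4))
```

For two equally frequent patterns the function returns 0.375, the maximum PIC for a two-pattern marker, which is a convenient sanity check.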

Results

Microsatellite development

A SAM library from M. acuminata cv. Gongjiao was screened with the following 5′-anchored SSR primers: PGA6, PCT6, PAC6, and PGT6 (Hayden and Sharp 2001b). A total of 118 clones were randomly chosen and sequenced, yielding 100 readable sequences; the other 18 failed to produce readable sequence. Ninety-six SSR motifs were identified in 83 non-redundant sequences; 90 of the 96 SSR motifs were dinucleotide repeats (DNRs). GT/AC was the most common motif, accounting for 67.71% of all SSR motifs, followed by AG/TC (25.00%) and TA/AT (1.04%); GC/CG was not observed. The four trinucleotide repeats had the motifs ATG/CAT, GAA/TTC, and AGG/CCT, and the two tetranucleotide repeats had the motifs AGGG/CCCT and AAGG/CCTT, respectively.
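
Motif classes such as GT/AC pool a repeat with its reverse complement, and the same repeat can also be read in a shifted frame (GT versus TG). One way such classes might be assigned is sketched below; the function name and the list of observed motifs are hypothetical, and the class labels it produces (for example AC for the GT/AC class) are simply the alphabetically first member of each class.

```python
from collections import Counter

COMPLEMENT = str.maketrans("ACGT", "TGCA")

def canonical_motif(motif):
    """Alphabetically smallest rotation of the motif or its reverse complement."""
    motif = motif.upper()
    revcomp = motif.translate(COMPLEMENT)[::-1]
    rotations = [m[i:] + m[:i] for m in (motif, revcomp) for i in range(len(m))]
    return min(rotations)

# Hypothetical motifs as they might be read from individual clones.
observed = ["GT", "TG", "AC", "CA", "AG", "TC", "CT", "AT", "GAA", "TTC"]
counts = Counter(canonical_motif(m) for m in observed)
for motif, n in sorted(counts.items()):
    print(motif, n, f"{100 * n / len(observed):.2f}%")
```

Grouping by a canonical form keeps the class percentages independent of which strand or frame happened to be sequenced.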

Finally, specific primers were designed for 38 microsatellite sequences containing 45 SSR motifs, with melting temperatures ranging between 40 and 65°C and expected amplification fragments of 80–380 bp. These 45 microsatellites were classified as simple or compound, and each class as perfect or imperfect. Forty-two microsatellites were simple and three were compound. Only one compound microsatellite was imperfect; of the 42 simple microsatellites, 33 were perfect (6–12 motifs) and 9 were imperfect (2–21 uninterrupted motifs). NCBI BLAST searches showed no significant similarity for any of the sequences. The microsatellite sequences have been deposited in GenBank.

Marker validation and detection of polymorphism

The 38 selected primers were pre-screened on ‘Gongjiao’: 68.42% (26/38) produced clear repeatable amplification patterns and were used to analyze 26 cultivated accessions and 11 related species/subspecies. Of the 26 tested primers, 80.77% (21/26) detected polymorphisms among 26 banana cultivars, 84.62% (22/26) detected polymorphisms among 11 related species/subspecies, and three of these primers were discarded because they produced a monomorphic pattern among all the accessions studied.

A total of 100 alleles were detected with 23 polymorphic SSRs from 37 banana accessions (mean, 4.55 per locus; range, 2–9). The PIC values of SSR loci ranged from 0.10 to 0.74 with a mean value of 0.48 (Table 2). Within the 26 cultivated accessions, analysis of 21 SSRs detected a total of 79 alleles (mean, 3.76 per locus; range, 2–7). Within the 11 related species/subspecies, 8 of the 22 SSRs (MA01–MA03, MA11, MA15, MA17, MA22, and MA23) did not produce any amplification fragments from the genomic DNA of ‘Xiangtuijiao’. Twenty-two polymorphic SSRs produced scorable amplification fragments in the other 10 related species/subspecies and detected 85 alleles (mean, 3.86 per locus; range, 2–6) (Table 3).

Table 2 Marker name, SSR motif, primer sequences (5′-3′), optimal annealing temperature (AT), and GenBank accession number for the 23 SSR markers described
Table 3 Marker validation and inter-specific/generic transferability of the 23 working SSR markers

Genetic relationships of the banana genotypes

Similarity among the banana accessions in this study ranged from 24.24 to 100% (mean, 61.12%), revealing high genetic variation. The highest genetic similarity coefficient (100%) was found between ‘Tai2’ and ‘Aguajiao’, indicating identical profiles at the SSR loci scored. The lowest genetic similarity coefficient (24.24%) was found between ‘Xiangtuijiao’ (Ensete glaucum) and ‘Akuanjiao’ (Musa itinerans), indicating that they are distantly related.
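
The Nei and Li coefficient behind these similarity values is twice the number of shared bands divided by the total number of bands in the two profiles. A minimal sketch, with hypothetical function names and made-up presence/absence profiles:

```python
from itertools import combinations

def nei_li(a, b):
    """Nei and Li (1979) similarity between two 0/1 band profiles."""
    if len(a) != len(b):
        raise ValueError("profiles must have the same length")
    total = sum(a) + sum(b)
    if total == 0:
        raise ValueError("no bands scored in either profile")
    shared = sum(1 for x, y in zip(a, b) if x == 1 and y == 1)
    return 2 * shared / total

# Hypothetical profiles over five scored bands.
profiles = {
    "acc1": [1, 1, 0, 1, 0],
    "acc2": [1, 1, 0, 1, 0],
    "acc3": [1, 0, 1, 0, 1],
}
for x, y in combinations(profiles, 2):
    print(x, y, f"{100 * nei_li(profiles[x], profiles[y]):.2f}%")
```

A value of 100% only means that the two accessions share every band scored with the markers used; shared absences do not contribute to the coefficient.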

A UPGMA dendrogram of the 37 banana accessions constructed from the SSR data separated them into three distinct clusters at a similarity coefficient of 0.54 (Fig. 1). The wild genotypes ‘Diyongjinlian’ (Musella lasiocarpa) and ‘Xiangtuijiao’ (Ensete glaucum) showed the lowest similarity to the main groups and were placed as an outgroup. Group I included the majority of accessions that have the ‘A’ genome alone, except ‘Huajiao’ (AAA) and FHIA-17 (AAAA). This group can be further divided into three subgroups: the two wild subspecies of M. acuminata were separated from the cultivated accessions, and the cultivated diploid and triploid accessions formed two distinct subgroups that corresponded with ploidy level. All triploid accessions with AAB/ABB genomic composition and three tetraploid hybrids, plus the cultivar ‘Huajiao’ (AAA) and the wild species ‘Akuanjiao’ (M. itinerans), formed an arbitrary group II, which can also be further divided into two subgroups. In group III, six related species (M. balbisiana, M. ornata, M. velutina, M. chiliocarpa, M. aurantiaca, and M. yunnanensis) grouped together. Bootstrap analysis showed high values for most of the branches (>50%).
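
How UPGMA builds such clusters, and how groups are read off at a similarity threshold, can be sketched in a few lines. This is not the NTSYS-pc implementation; the function names, the four labels, and the similarity values are hypothetical, and only the threshold of 0.54 echoes the text.

```python
def upgma(names, sim):
    """Agglomerate by average similarity.

    sim maps frozenset({a, b}) -> similarity. Returns a list of
    (left_members, right_members, similarity) merges in order.
    """
    clusters = [(name,) for name in names]

    def avg(c1, c2):
        # UPGMA: unweighted mean over all between-cluster pairs.
        pairs = [sim[frozenset((a, b))] for a in c1 for b in c2]
        return sum(pairs) / len(pairs)

    merges = []
    while len(clusters) > 1:
        s, i, j = max(
            ((avg(clusters[i], clusters[j]), i, j)
             for i in range(len(clusters))
             for j in range(i + 1, len(clusters))),
            key=lambda t: t[0],
        )
        merges.append((clusters[i], clusters[j], s))
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return merges

def cut(names, merges, threshold):
    """Groups obtained by keeping only merges at or above the threshold."""
    group = {name: {name} for name in names}
    for left, right, s in merges:
        if s >= threshold:
            joined = group[left[0]] | group[right[0]]
            for name in joined:
                group[name] = joined
    return sorted({tuple(sorted(g)) for g in group.values()})

names = ["A", "B", "C", "D"]
pairs = {("A", "B"): 0.9, ("A", "C"): 0.5, ("A", "D"): 0.3,
         ("B", "C"): 0.6, ("B", "D"): 0.2, ("C", "D"): 0.4}
sim = {frozenset(k): v for k, v in pairs.items()}
merges = upgma(names, sim)
print(cut(names, merges, 0.54))
```

With these made-up values, A and B join first, C joins them at an average similarity of 0.55, and D stays separate when the tree is cut at 0.54.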

Fig. 1 Dendrogram of the 37 banana accessions based on 22 SSR primers. Bootstrap values are given at the corresponding node for each cluster

Comparison of SSR-based and morphological analysis

Forty-one morphological descriptors were available for the 26 cultivated accessions used in this study. Scoring of the observed morphological characters based on the standard Banana Plant Descriptors (IPGRI 1996) revealed variation among accessions for each character, with 2 to 7 states per character. Similarity values among the 26 banana accessions ranged from 19.75 to 84.34% (mean, 48.37%), revealing a high level of phenotypic diversity. ‘Tai2’ and ‘Tansangniyaxiangjiao’ were close to each other, with a similarity coefficient of 84.34%. The minimum similarity coefficient (19.75%) was observed between ‘Baiyoushen’ and ‘Dongguanzhongbadajiao’, ‘Rose’, ‘Meidajiao’, and ‘Dongguanzhongbadajiao’.

We compared the clustering based on SSR profiles with the morphological characters of the plants. Cluster analysis of SSR data separated the 26 cultivated accessions into four main groups that corresponded with the genome designation and ploidy status of the plants. Four genetic subgroups (AA-I and AA-II, AAA-I and AAA-II) were recognized within the M. acuminata accessions. Bootstrap analysis of the SSR data showed high values for most of the branches (>50%) (Fig. 2a). In contrast to the SSR analysis, the cluster analysis of morphological characters showed no correlation with genome designation or ploidy status. It divided the 26 cultivated accessions into two groups (Fig. 2b). Twenty banana accessions, including cultivated diploid, triploid, and tetraploid types, formed one large group that excluded all six accessions of Dajiao (M. × paradisiaca) with the ABB genome in China. Unlike the AAA cultivated bananas and tetraploid hybrids, the ABB accessions did not show close relationships with the diploid accessions. Bootstrap analysis produced low values for most of the branches (<50%), except for the nodes joining ‘Tudlo Tumbaga’ and ‘Guifeijiao’ (69.9%), ‘Tai2’ and ‘Tansangniyaxiangjiao’ (71.5%), ‘Cachaco’ and ‘Zhongshanmilundajiao’ (57.6%), and ‘Meidajiao’ and ‘Dongguanzhongbadajiao’ (91.2%).
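
The idea behind bootstrap support for a pair of accessions can be sketched by resampling the scored characters with replacement. The code below is a deliberate simplification and not the WinBoot procedure: instead of rebuilding the full UPGMA tree, it only checks whether two accessions remain each other's most similar partner in each replicate (ties count as support). All names and profiles are hypothetical.

```python
import random

def dice(a, b):
    """Dice (Nei and Li) similarity for 0/1 profiles; 0.0 if no bands at all."""
    total = sum(a) + sum(b)
    if total == 0:
        return 0.0
    return 2 * sum(x & y for x, y in zip(a, b)) / total

def pair_support(profiles, pair, n_boot=1000, seed=0):
    """Fraction of replicates in which the pair are mutual nearest neighbours."""
    rng = random.Random(seed)
    names = list(profiles)
    n_cols = len(profiles[names[0]])
    a, b = pair
    hits = 0
    for _ in range(n_boot):
        # Resample character columns with replacement.
        cols = [rng.randrange(n_cols) for _ in range(n_cols)]
        sample = {n: [profiles[n][c] for c in cols] for n in names}

        def nearest(x):
            others = [n for n in names if n != x]
            best = max(dice(sample[x], sample[o]) for o in others)
            return {o for o in others if dice(sample[x], sample[o]) == best}

        if b in nearest(a) and a in nearest(b):
            hits += 1
    return hits / n_boot

# Hypothetical presence/absence profiles for four accessions.
profiles = {
    "acc1": [1, 1, 0, 1, 0, 1, 0, 1],
    "acc2": [1, 1, 0, 1, 0, 0, 0, 1],
    "acc3": [0, 1, 1, 0, 1, 1, 0, 0],
    "acc4": [0, 0, 1, 0, 1, 1, 1, 0],
}
print(pair_support(profiles, ("acc1", "acc2"), n_boot=200))
```

A low value under this proxy suggests that the pairing depends on a few characters, which is the same qualitative reading given to low bootstrap values on a dendrogram.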

Fig. 2 Dendrogram generated by UPGMA cluster analysis for 26 cultivated bananas based on (a) SSR markers and (b) morphological characters, derived from the Dice coefficient of similarity. Bootstrap values are given at the corresponding node for each cluster

Discussion

Microsatellite development

Microsatellites are traditionally isolated by constructing a genomic library and hybridizing it with radioisotope- or digoxigenin-labeled SSR probes. This process requires considerable labor and expense and yields few positive clones (1–3%) (Ujino et al. 1998; Hayden and Sharp 2001b). Although the proportion of positive clones increases to 50% with microsatellite enrichment, SAM analysis provides a useful alternative to existing techniques for developing SSR markers. It does not require constructing and screening DNA libraries for SSR-containing clones, and it provides good recovery of usable SSR markers. Our sequencing results from 100 DNA fragments obtained with this approach showed that 83 (83%) clones contained SSRs; the remaining 17 yielded no usable SSR sequence. The percentage of positive clones containing SSRs was higher than the values of 1.4 and 17% for M. acuminata cv. Gobusik (Kaemmer et al. 1997), 4.0% for M. balbisiana cv. Tani (Buhariwalla et al. 2005), 78.9% for M. acuminata cv. Ouro (Creste et al. 2006), and 79.2% for M. acuminata subsp. malaccensis (Crouch et al. 1997).

Cross-species/genera amplification

Several studies have shown that SSRs developed for one species can be used in related plant species (Dayanandan et al. 1997). To evaluate cross-species/genera amplification, 26 primers were screened against 11 related species/subspecies representing 10 different species from three genera of the Musaceae family. Of the 26 tested primers, 84.6% (22/26) amplified robust, polymorphic bands in the 9 related species/subspecies from Musa and the 1 species from Musella, but not all of them amplified in Ensete glaucum; 63.6% (14/22) of these primers gave amplification with all the tested wild species. These findings suggest a high level of sequence conservation among the species examined. Hernández et al. (2001) reported a high level of transferability (74.5%) of maize genomic SSRs to sugarcane. High transferability of SSR markers was also reported in peach (Prunus spp.) species by Dirlewanger et al. (2002) and in grass species by Saha et al. (2006).

Genetic relationships of the banana genotypes

Cluster analysis of the SSR data in the present study using the UPGMA method revealed that the wild species/subspecies are genetically distant from the cultivated banana varieties, and grouped the cultivars primarily according to genome designation and geographical origin. Two cultivars from the Cavendish subgroup, ‘Tai2’ from Taiwan and ‘Aqua’ from Brazil, showed 100% similarity based on the microsatellite primers used and could not be distinguished, although they show some morphological differences in bract base shape and male bract lifting. It is likely that they are synonymous, or that the number of SSR primers used in this study was too limited to differentiate them. Four ‘Dajiao’ (ABB) accessions (Fig. 1, in Group II) clustered separately. These cultivars clustered with ‘Akuanjiao’ (M. itinerans), possibly reflecting their parentage. Dajiao (M. × paradisiaca) with the ABB genome in China was different from exotic plantains such as French or Horn plantains. Wang et al. (1995) recognized the Chinese Dajiao cultivars as triploid forms of M. balbisiana according to morphological characters, meiosis, and karyotype analysis.

Morphological descriptions were available for the 26 cultivars used in this study. We compared clustering based on SSR profiles with the morphological characters of the plants. With a few exceptions, DNA clustering patterns were in general agreement with the shared morphological characteristics of the cultivars. One exception is ‘Huajiao’, which was collected from Yunnan Province by NFGB (Guangdong) in 1978. Its fruit is straight with inconspicuous ridges and a round fruit apex. It is called sour banana because of its sour and sweet flavor. The lower leaf surface is pink, which is similar to Longyajiao, and the petiole wing is similar to that of the Cavendish subgroup. Here we have shown that most of the cultivars of the Cavendish subgroup are highly similar. However, the landrace ‘Huajiao’ did not cluster with the other Cavendish cultivars, which indicates a different genetic background; it showed high similarity with four endemic Dajiao (ABB) accessions. Thus, it is possible that ‘Huajiao’, despite its similar morphological characteristics, is unrelated to the Cavendish subgroup.