Introduction

Bananas and plantains, Musa L. (Musaceae Juss.), are fast-growing perennial crops that are cultivated all year round in the tropics and subtropics. They are among the world's favorite fruit crops and are grown in more than 120 countries, with a total production of approximately 106 million tonnes per year (Molina and Kudagamage 2002). In 2012, global production was estimated at about 140 million metric tons (FAOStat 2014). They are regarded as the leading export fruit crops (FAO 2011) and are rated the fourth most important crop in sub-Saharan Africa (SSA) after cassava, maize, and yam (FAO 2009). Bananas and plantains are rich sources of carbohydrates, vitamin C, potassium, and sodium (IBA 2007). The different genotypes were derived from Musa acuminata (AA) and M. balbisiana (BB) and are classified into genomic groups including diploids (AA, AB, and BB), triploids (AAA, AAB, ABB, and BBB), and tetraploids (AAAA, AAAB, AABB, and ABBB) (Pollefeys et al. 2004; INIBAP 2003). East African (mainly dessert) bananas (AA, AAB, AAA, ABB, and AB) and the African plantains (AAB) are grown mainly in central and west Africa, while the East African Highland bananas (AAA) are used for cooking and beer brewing (Karamura et al. 1998).

Production of these vital crops is constrained by pathogens and diverse environmental stresses. Rising global temperatures are expected to have drastic effects, including altered patterns of drought and salinity and the emergence of new pests and diseases, all of which will adversely affect plant growth and yield (Tester and Langridge 2010). For example, drought has emerged as one of the major constraints on banana production in the tropics and subtropics. Bananas are quite sensitive to drought; interestingly, genotypes with the “B” genome (in particular the ABB type) are more tolerant to abiotic stresses than those possessing only the “A” genome. Nevertheless, the combination of varied topography and arid/semi-arid climatic conditions calls for the development of drought-resistant genotypes. This is vital because the world population is growing fast and is expected to exceed 9 billion by the year 2050 (FAO 2015). Feeding a population of this size puts considerable pressure on agricultural crop production (Kastner et al. 2012; Dempewolf et al. 2014; Khoury et al. 2014). To increase the food supply, especially from Musa species, harnessing genetic diversity and novel traits could lead to new genotypes that are capable of withstanding changing environmental factors, since populations with a narrower genetic range may fail to survive climatic extremes.

Breeders need plants that are resistant to abiotic and biotic stressors, but this goal cannot easily be achieved via conventional breeding because of the complicated genetic system of Musa species. It is, however, possible with molecular markers, which are not influenced by changes in environmental factors over time and can target different genes (Martínez et al. 2006). Different molecular marker techniques such as random amplified polymorphic DNA (RAPD) (Kaemmer 1992; Ude et al. 2003; Toral et al. 2009; Lamare and Rao 2015), restriction fragment length polymorphism (RFLP) (Gawel et al. 1992; Bhat et al. 1995; Ning et al. 2007), simple sequence repeat (SSR) (Buhariwalla et al. 2005; Christelova et al. 2011; Hippolyte et al. 2012; de Jesus et al. 2013; Nyine et al. 2017), genotyping by sequencing (GBS) (Elshire et al. 2011), inter-simple sequence repeats (ISSR) (Godwin et al. 1997; Silva et al. 2016), directed amplified minisatellite DNA (DAMD) (Lamare and Rao 2015), and amplified fragment length polymorphism (AFLP) (Bhat et al. 1995; Ude et al. 2002a, b; Wang et al. 2007; Opara et al. 2010) have been used to dissect the genetic diversity, population structure, and genetic constitution of Musa species. Other advanced tools including proteomics (Toledo et al. 2012; Bhuiyan et al. 2020), clustered regularly interspaced short palindromic repeats (CRISPR)/CRISPR-associated protein 9 (Cas9) (Tripathi et al. 2019; Ntui et al. 2020), and gene expression analysis (Yang et al. 2015; Sanchez et al. 2016; Wang et al. 2017) have been used in bananas and plantains to address several factors that hinder improved breeding and productivity. However, there are more informative and cost-effective molecular markers that target conserved domains and can effectively exploit the gene pools of banana and plantain plants, as well as those of their wild relatives, for crop genetic improvement.
It has been reported that genes with presence/absence variation contribute to the diversity of the gene pool (Golicz et al. 2016). Identifying Musa accessions (wild and elite ones) that can be adopted and optimized to perform in diverse environmental conditions is very important, since the optimal development of these accessions depends on their allelic/genetic diversity (Montenegro et al. 2017). To reveal the degree of genetic diversity and population structure inherent in these accessions, informative molecular markers such as conserved DNA-derived polymorphism (CDDP) markers are required to characterize allelic diversity and population structure. CDDP markers, which involve transcription factors and related gene families (MYB, ERF, WRKY, and ABP), are a cost-effective technique that targets conserved sequences of plant functional genes (mainly involved in responses to abiotic and biotic stressors or in plant development) and can produce candidate markers that may be partly or completely associated with known genes (Collard and Mackill 2009). Furthermore, CDDP marker techniques are agarose gel-based, convenient, highly polymorphic, and capable of generating markers that are phenotypically linked to traits (Collard and Mackill 2009). The CDDP markers are similar in principle to resistance gene analog markers, which are designed from conserved regions in plant disease resistance genes (Chen et al. 1998). They cover different putative domains including auxin-binding proteins and transcription factors involved in development, physiology, fruiting and ripening processes, plant disease resistance pathways, secondary metabolism, abiotic and biotic stress responses, and cellular morphogenesis (D'Hont et al. 2012).
It has been shown that, within functional domains of well-characterized plant genes, CDDPs can generate informative banding patterns that are used for mapping, trait association, and germplasm genetic diversity studies (Collard and Mackill 2009; Poczai et al. 2013). Because CDDP can easily and efficiently generate functional markers (FMs) that are associated with given plant phenotypes, it has been used in the improvement of different crops including Rosa rugosa Thunb. ex Murray (Jin et al. 2016; Jiang and Zang 2018), Chrysanthemum L. cultivars (Li et al. 2013), peony (Paeonia L.) cultivars (Li et al. 2014), bittersweet (Solanum dulcamara L.) (Poczai et al. 2011), date palm (Phoenix dactylifera L.) (Mam et al. 2017), chickpea (Cicer arietinum L.) (Hajibarat et al. 2015), rice (Oryza sativa L.) (Collard and Mackill 2009), and wheat (Triticum aestivum L.) (Hamidi et al. 2014; Seyedimoradi et al. 2016). However, to our knowledge, the use of CDDP markers for genetic diversity and population assessment has not yet been reported in bananas and plantains. Therefore, the objective of this study was to assess the genetic diversity/allelic richness and population structure of cultivated Musa accessions of variable genomic constitutions and their wild relatives using CDDP markers.

Materials and Methods

Sample Collection, DNA Extraction, Quantification and Preparation of Working Dilutions

Sixty-six accessions of bananas and plantains, comprising the genomic groups AA, AAA, AAAA, AAB, BB, AB, ABB, AAAB, and AS as well as three wild diploid accessions (Musa beccarii, M. coccinea, and M. textilis), were obtained from the Musa germplasm collection of the Bioversity International Transit Center (ITC), hosted at KU Leuven, Belgium (Ruas et al. 2017) (Table 1). These accessions were mostly derived from hybridization between wild diploid subspecies of M. acuminata and M. balbisiana. Thirty-two of the 66 accessions were obtained as tissue-cultured plantlets, each in five replicates, and were grown and maintained in the screenhouse of the Department of Natural Sciences, Bowie State University, while the remaining 34 were obtained in lyophilized condition from the same ITC. Approximately 100 mg of young fresh leaves or 120 mg of lyophilized leaves was weighed for DNA extraction using the cetyl trimethylammonium bromide (CTAB) method (Abarshi et al. 2010) with minor modifications, namely a 24:1 ratio of chloroform to isoamyl alcohol without phenol.

Table 1 List of accessions of different groups of bananas and plantains used for this study

Polymerase Chain Reaction and Agarose Gel Electrophoresis

Polymerase chain reaction (PCR) amplification was performed in a 25 µl reaction volume consisting of 2.0 µl of DNA (100 ng), 5.0 µl of 5× Green GoTaq Buffer (Promega Corporation, Madison, USA), 2.0 µl of 2.5 mM dNTPs (Bioline, Massachusetts, USA), 0.2 µl of GoTaq DNA polymerase (5 U/µl) (Promega Corporation, Madison, USA), 1.0 µl of 10 µM each of CDDP primer, and 14.80 µl of diethyl pyrocarbonate (DEPC)-treated water (Invitrogen, Carlsbad, CA, USA). The names of the CDDP primers, their functions, sequences, GC contents, annealing temperatures, and sources (Anai et al. 1997; Nagasaki et al. 2001; Stracke et al. 2001; Jiang et al. 2004; Gutterson and Reuber 2004; Xie et al. 2005), including the ones designed in this study, are presented in Table 2. The PCR cycling profile consisted of an initial step at 94 °C for 5 min, followed by 40 cycles of 94 °C for 30 s, 72 °C for 1 min, and a 10-min final extension at 72 °C using a Bio-Rad T100 Thermal Cycler (Bio-Rad Laboratories Inc., Singapore). A 10 µl aliquot of each PCR product was electrophoresed in a 1.5% agarose gel containing 0.5 mg/ml ethidium bromide and photographed using an Aplegen Omega Lum G gel documentation system (Minnesota 55303, USA). Prior to the analysis of all accessions, we selected a few accessions with different genomes and amplified them with all the CDDP primers for optimization; the primers that gave reproducible, scorable bands on repetition were then used to amplify all 66 accessions.
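The reaction recipe above can be checked and scaled with a few lines of arithmetic. The following is a minimal sketch, not the authors' procedure: the per-reaction volumes are taken from the text, while the function name `master_mix`, the 10% pipetting overage, and the choice to add template DNA to each tube separately are assumptions made for illustration.

```python
# Per-reaction volumes in microlitres, taken from the protocol text.
PER_REACTION_UL = {
    "5x Green GoTaq buffer": 5.0,
    "dNTPs (2.5 mM)": 2.0,
    "GoTaq DNA polymerase (5 U/ul)": 0.2,
    "CDDP primer (10 uM)": 1.0,
    "DEPC-treated water": 14.8,
}
# Assumption: template DNA is pipetted into each tube, not into the shared mix.
TEMPLATE_DNA_UL = 2.0


def master_mix(n_reactions, overage=0.10):
    """Scale the shared reagents for n reactions.

    The overage (assumed 10% here) covers pipetting losses; it is a
    hypothetical parameter, not a value reported in the study.
    """
    factor = n_reactions * (1.0 + overage)
    return {name: round(vol * factor, 2) for name, vol in PER_REACTION_UL.items()}


# Sanity check: shared reagents plus template should add up to 25 ul.
total_per_reaction = sum(PER_REACTION_UL.values()) + TEMPLATE_DNA_UL

# Volumes needed to amplify all 66 accessions with one primer.
mix = master_mix(66)
for name, vol in mix.items():
    print(f"{name}: {vol} ul")
```

Summing the components is a quick way to confirm that the listed volumes account for the full 25 µl reaction before scaling up.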

Table 2 List of primers, their sequences, percentage GC contents, and annealing temperatures

Data Analyses

Data matrices of CDDP marker profiles were generated by scoring each allele as present (1) or absent (0). Genetic diversity, allele frequency, and polymorphic information content (PIC) were computed from these matrices using PowerMarker version 3.25 (Liu and Muse 2005). Percentage of polymorphic loci (PPL), effective number of alleles (Ne) (Kimura and Ohta 1973), Nei’s gene diversity (NGD) (Nei 1973), and Shannon’s information index (I) (Lewontin 1972), which are important parameters commonly used to assess genetic diversity regardless of sample or population size, together with the population parameters (total gene diversity, Ht; gene diversity within populations, Hs; coefficient of gene differentiation, GST; and level of gene flow, Nm) were computed using POPGENE software version 1.32 (Yeh and Boyle 1997). A dendrogram was reconstructed by the unweighted pair group method with arithmetic mean (UPGMA) from Jaccard’s coefficients (Igwe et al. 2017) using NTSYSpc software version 2.02 (Rohlf 2000). Principal component analysis (PCA) of the accessions was computed using DARwin software version 6.0.021 (Perrier and Jacquemoud-Collet 2006).
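To make the per-primer summary statistics concrete, the sketch below computes major allele frequency, gene diversity, and PIC from a list of allele counts using the textbook definitions (gene diversity as 1 minus the sum of squared allele frequencies, and PIC as that quantity minus the sum over allele pairs of 2p_i²p_j²). It is an illustration only: PowerMarker may apply sample-size corrections that are omitted here, and the function names and the example counts are hypothetical.

```python
def allele_frequencies(counts):
    """Convert raw allele counts for one primer into frequencies."""
    total = sum(counts)
    return [c / total for c in counts]


def major_allele_frequency(freqs):
    """Frequency of the most common allele."""
    return max(freqs)


def gene_diversity(freqs):
    """Expected heterozygosity: 1 - sum(p_i^2), without sample-size correction."""
    return 1.0 - sum(p * p for p in freqs)


def pic(freqs):
    """PIC = 1 - sum(p_i^2) - sum over i<j of 2 * p_i^2 * p_j^2."""
    squares = [p * p for p in freqs]
    cross = sum(
        2.0 * squares[i] * squares[j]
        for i in range(len(squares))
        for j in range(i + 1, len(squares))
    )
    return 1.0 - sum(squares) - cross


# Hypothetical allele counts for one primer across 66 accessions.
example_counts = [28, 10, 10, 9, 9]
freqs = allele_frequencies(example_counts)
print(round(major_allele_frequency(freqs), 3))
print(round(gene_diversity(freqs), 3))
print(round(pic(freqs), 3))
```

PIC is always at most the gene diversity, and both approach 1 as the number of roughly equally frequent alleles grows.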

Results

Allelic Variation, Gene Diversity, and Polymorphic Information Content

Of the fifteen CDDP primers tested, twelve were reproducible and scorable, as shown in representative gel images (Figs. 1, 2, 3, 4). A total of 421 alleles were generated from the reproducible primers (Table 3). The number of alleles amplified per primer ranged from 20 to 51, with a mean of 35.083. The major allele frequency summed to 2.051 across primers and ranged from 0.046 to 0.454, with a mean of 0.171. Gene diversity, with a total of 11.093 and a mean of 0.924, ranged from 0.782 to 0.976. Polymorphic information content, with a total of 11.019, ranged from 0.768 to 0.975, with a mean of 0.918. The CDDP primers ERF1, ERF2, WRKYMusa1a, KNOX-1, MYB2, WRKY-R1, KNOX-2, KNOX1M1a, MYB1, and WRKY-F1 showed high polymorphism, while ABP1-3 and ABP1-1 were monomorphic. The PIC values of the CDDP primers ranked in descending order as MYB1 > ERF1 > WRKY-F1 > WRKY-R1 > KNOX-1 > KNOX1M1a > MYB2 > ERF2 > KNOX-2 > WRKYMusa1a > ABP1-3 > ABP1-1. Allelic scores, counts, and frequencies obtained from these accessions of Musa species were high. The allelic counts ranged from 1 to 28, while the frequencies ranged from 0.015 to 0.424 (Supplementary file 1: Table S1).

Fig. 1

Amplification profiles of 66 banana and plantain samples using ERF1 primer of CDDP marker gene: a = 1kb step DNA ladder and b = 100bp DNA ladder, Sample order (1-66 from left to right): 1 = “Fougamou 1,” 2 = “Obino I'Ewai,” 3 = “Calcutta 4,” 4 = “Improved Lady Finger,” 5 = “Blue Torres Strait Island,” 6 = “Silk,” 7 = “Truncata,” 8 = “Cardaba,” 9 = “Lidi,” 10 = “Pelipita,” 11 = “Pelipita Manjoncho,” 12 = ”Lai,” 13 = ”Higa,” 14 = “Pisang Keling,” 15 = “Pisang Lawadin,” 16 = “Balonkawe,” 17 = “Gros Michel,” 18 = “Green Red,” 19 = "Plantain no. 3", 20 = ”Pata,” 21 = “Chinese Cavendish,” 22 = “Dwarf Parfitt,” 23 = “Hochuchu,” 24 = “Umalag,” 25 = “Hsein Jen Chiao,” 26 = “Mons Mari” (Pedwell), 27 = “Lady Finger” (Nelson), 28 = “Pisang Rajah” (South Johnstone), 29 = ”Tani,” 30 = “Pisang Lilin,” 31 = “Poteau Geant,” 32 = “Pisang Klutuk Wulung,” 33 = “Garbon 2,” 34 = “Zebrina” (G.F), 35 = “Khae” (Phrae), 36 = “Dole,” 37 = “Wompa,” 38 = “Pisang Palembang,” 39 = “Pisang Awak,” 40 = “Williams” (Bell, South Johnstone), 41 = "Plantain no. 17", 42 = “Kluai Tiparot,” 43 = “Tiau Lagada,” 44 = “Niyarma Yik,” 45 = “Selangor,” 46 = “Long Tavoy,” 47 = “Malaccenesis,” 48 = “Figure Pomme Geante,” 49 = “Highgate,” 50 = “Borneo,” 51 = “Honduras,” 52 = “Pome,” 53 = “Kunnan,” 54 = Musa beccarii, 55 = Musa coccinea, 56 = “JD Yangambi,” 57 = Musa textilis, 58 = “Tomolo,” 59 = “Pisang Berlin,” 60 = FHIA-23, 61 = No.110, 62 = “Dwarf Cavendish,” 63 = SH-3436-6, 64 = “Lal Velchi,” 65 = “Madang” and 66 = FHIA-21 (#68). Yellow coloured arrows indicate unique/polymorphic bands in some accessions

Fig. 2

Amplification profiles of 66 banana and plantain samples using ERF2 primer of CDDP marker gene: a = 1kb step DNA ladder and b = 100bp DNA ladder, Sample order (1-66 from left to right): 1 = “Fougamou 1,” 2 = “Obino I'Ewai,” 3 = “Calcutta 4,” 4 = “Improved Lady Finger,” 5 = “Blue Torres Strait Island,” 6 = “Silk,” 7 = “Truncata,” 8 = “Cardaba,” 9 = “Lidi,” 10 = “Pelipita,” 11 = “Pelipita Manjoncho,” 12 = ”Lai,” 13 = ”Higa,” 14 = “Pisang Keling,” 15 = “Pisang Lawadin,” 16 = “Balonkawe,” 17 = “Gros Michel,” 18 = “Green Red,” 19 = "Plantain no. 3", 20 = ”Pata,” 21 = “Chinese Cavendish,” 22 = “Dwarf Parfitt,” 23 = “Hochuchu,” 24 = “Umalag,” 25 = “Hsein Jen Chiao,” 26 = “Mons Mari” (Pedwell), 27 = “Lady Finger” (Nelson), 28 = “Pisang Rajah” (South Johnstone), 29 = ”Tani,” 30 = “Pisang Lilin,” 31 = “Poteau Geant,” 32 = “Pisang Klutuk Wulung,” 33 = “Garbon 2,” 34 = “Zebrina” (G.F), 35 = “Khae” (Phrae), 36 = “Dole,” 37 = “Wompa,” 38 = “Pisang Palembang,” 39 = “Pisang Awak,” 40 = “Williams” (Bell, South Johnstone), 41 = "Plantain no. 17", 42 = “Kluai Tiparot,” 43 = “Tiau Lagada,” 44 = “Niyarma Yik,” 45 = “Selangor,” 46 = “Long Tavoy,” 47 = “Malaccenesis,” 48 = “Figure Pomme Geante,” 49 = “Highgate,” 50 = “Borneo,” 51 = “Honduras,” 52 = “Pome,” 53 = “Kunnan,” 54 = Musa beccarii, 55 = Musa coccinea, 56 = “JD Yangambi,” 57 = Musa textilis, 58 = “Tomolo,” 59 = “Pisang Berlin,” 60 = FHIA-23, 61 = No.110, 62 = “Dwarf Cavendish,” 63 = SH-3436-6, 64 = “Lal Velchi,” 65 = “Madang” and 66 = FHIA-21 (#68). Yellow coloured arrows indicate unique/polymorphic bands in some accessions

Fig. 3

Amplification profiles of 66 banana and plantain samples using KNOX-1 primer of CDDP marker gene: a = 1kb step DNA ladder and b = 100bp DNA ladder, Sample order (1-66 from left to right): 1 = “Fougamou 1,” 2 = “Obino I'Ewai,” 3 = “Calcutta 4,” 4 = “Improved Lady Finger,” 5 = “Blue Torres Strait Island,” 6 = “Silk,” 7 = “Truncata,” 8 = “Cardaba,” 9 = “Lidi,” 10 = “Pelipita,” 11 = “Pelipita Manjoncho,” 12 = ”Lai,” 13 = ”Higa,” 14 = “Pisang Keling,” 15 = “Pisang Lawadin,” 16 = “Balonkawe,” 17 = “Gros Michel,” 18 = “Green Red,” 19 = "Plantain no. 3", 20 = ”Pata,” 21 = “Chinese Cavendish,” 22 = “Dwarf Parfitt,” 23 = “Hochuchu,” 24 = “Umalag,” 25 = “Hsein Jen Chiao,” 26 = “Mons Mari” (Pedwell), 27 = “Lady Finger” (Nelson), 28 = “Pisang Rajah” (South Johnstone), 29 = ”Tani,” 30 = “Pisang Lilin,” 31 = “Poteau Geant,” 32 = “Pisang Klutuk Wulung,” 33 = “Garbon 2,” 34 = “Zebrina” (G.F), 35 = “Khae” (Phrae), 36 = “Dole,” 37 = “Wompa,” 38 = “Pisang Palembang,” 39 = “Pisang Awak,” 40 = “Williams” (Bell, South Johnstone), 41 = "Plantain no. 17", 42 = “Kluai Tiparot,” 43 = “Tiau Lagada,” 44 = “Niyarma Yik,” 45 = “Selangor,” 46 = “Long Tavoy,” 47 = “Malaccenesis,” 48 = “Figure Pomme Geante,” 49 = “Highgate,” 50 = “Borneo,” 51 = “Honduras,” 52 = “Pome,” 53 = “Kunnan,” 54 = Musa beccarii, 55 = Musa coccinea, 56 = “JD Yangambi,” 57 = Musa textilis, 58 = “Tomolo,” 59 = “Pisang Berlin,” 60 = FHIA-23, 61 = No.110, 62 = “Dwarf Cavendish,” 63 = SH-3436-6, 64 = “Lal Velchi,” 65 = “Madang” and 66 = FHIA-21 (#68). Yellow coloured arrows indicate unique/polymorphic bands in some accessions

Fig. 4

Amplification profiles of 66 banana and plantain samples using MYB2 primer of CDDP marker gene: a = 1kb step DNA ladder and b = 100bp DNA ladder, Sample order (1-66 from left to right): 1 = “Fougamou 1,” 2 = “Obino I'Ewai,” 3 = “Calcutta 4,” 4 = “Improved Lady Finger,” 5 = “Blue Torres Strait Island,” 6 = “Silk,” 7 = “Truncata,” 8 = “Cardaba,” 9 = “Lidi,” 10 = “Pelipita,” 11 = “Pelipita Manjoncho,” 12 = ”Lai,” 13 = ”Higa,” 14 = “Pisang Keling,” 15 = “Pisang Lawadin,” 16 = “Balonkawe,” 17 = “Gros Michel,” 18 = “Green Red,” 19 = "Plantain no. 3", 20 = ”Pata,” 21 = “Chinese Cavendish,” 22 = “Dwarf Parfitt,” 23 = “Hochuchu,” 24 = “Umalag,” 25 = “Hsein Jen Chiao,” 26 = “Mons Mari” (Pedwell), 27 = “Lady Finger” (Nelson), 28 = “Pisang Rajah” (South Johnstone), 29 = ”Tani,” 30 = “Pisang Lilin,” 31 = “Poteau Geant,” 32 = “Pisang Klutuk Wulung,” 33 = “Garbon 2,” 34 = “Zebrina” (G.F), 35 = “Khae” (Phrae), 36 = “Dole,” 37 = “Wompa,” 38 = “Pisang Palembang,” 39 = “Pisang Awak,” 40 = “Williams” (Bell, South Johnstone), 41 = "Plantain no. 17", 42 = “Kluai Tiparot,” 43 = “Tiau Lagada,” 44 = “Niyarma Yik,” 45 = “Selangor,” 46 = “Long Tavoy,” 47 = “Malaccenesis,” 48 = “Figure Pomme Geante,” 49 = “Highgate,” 50 = “Borneo,” 51 = “Honduras,” 52 = “Pome,” 53 = “Kunnan,” 54 = Musa beccarii, 55 = Musa coccinea, 56 = “JD Yangambi,” 57 = Musa textilis, 58 = “Tomolo,” 59 = “Pisang Berlin,” 60 = FHIA-23, 61 = No.110, 62 = “Dwarf Cavendish,” 63 = SH-3436-6, 64 = “Lal Velchi,” 65 = “Madang” and 66 = FHIA-21 (#68). Yellow coloured arrows indicate unique/polymorphic bands in some accessions

Table 3 Major allele frequency, number of alleles, gene diversity, and PIC obtained from Musa species using conserved DNA-derived polymorphism primers

The number of polymorphic loci (NPL) and the percentage of polymorphic loci (PPL) obtained from the 12 reproducible CDDP primers across the 66 accessions ranged from 59 to 66 and from 89.39 to 100, respectively (Table 4). Eight of the 12 primers were 100% polymorphic, while the lowest value, obtained from two primers, was 89.39%. Across the 12 CDDP primers, the effective number of alleles (Ne), Nei’s gene diversity (H), and Shannon’s information index (I), with their standard deviations, ranged from 1.455 ± 0.283 to 1.918 ± 0.152, 0.286 ± 0.145 to 0.482 ± 0.058, and 0.440 ± 0.198 to 0.674 ± 0.062, respectively.
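As an illustration of how PPL, Ne, H, and I follow from 0/1 band scores, the sketch below treats each band as a dominant locus with two alleles and estimates the null-allele frequency as the square root of the band-absence frequency. That Hardy-Weinberg treatment is an assumption made for the example, not a confirmed description of the POPGENE settings used in this study, and the function names are hypothetical.

```python
import math


def is_polymorphic(scores):
    """A locus is polymorphic if the band is present in some accessions and absent in others."""
    return 0 in scores and 1 in scores


def percent_polymorphic(n_polymorphic, n_total):
    """Percentage of polymorphic loci (PPL)."""
    return 100.0 * n_polymorphic / n_total


def locus_indices(scores):
    """Return (Ne, H, I) for one dominant locus scored 0/1 across accessions.

    Assumption: Hardy-Weinberg equilibrium, so the null-allele frequency q
    is the square root of the fraction of accessions lacking the band.
    """
    n = len(scores)
    q = math.sqrt(scores.count(0) / n)
    p = 1.0 - q
    homozygosity = p * p + q * q
    ne = 1.0 / homozygosity            # effective number of alleles
    h = 1.0 - homozygosity             # Nei's gene diversity
    shannon = -sum(f * math.log(f) for f in (p, q) if f > 0.0)
    return ne, h, shannon


# 59 polymorphic loci out of 66 corresponds to the lowest PPL in Table 4.
print(round(percent_polymorphic(59, 66), 2))

# Toy locus: band absent in one of four accessions.
print(locus_indices([1, 1, 1, 0]))
```

Under this two-allele model the indices are bounded by Ne = 2, H = 0.5, and I = ln 2 (about 0.693), which is why the values reported for individual primers and genomic groups sit just below those limits.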

Table 4 Genetic diversity within conserved DNA-derived polymorphism primers used in assessing genetic diversity of different genomic groups of bananas and plantains

Genetic Diversity Based on Different Genomic Groups

Across the 66 accessions of Musa species from the diverse genomic groups assessed with 12 CDDP primers, Ne, H, and I values spanned from 1.437 to 1.989, 0.344 to 0.497, and 0.495 to 0.691, respectively (Table 5). The values of these genetic diversity indicators varied with the genomic constitution of the accessions: AA (Ne = 1.775, H = 0.433, I = 0.624), AAA (Ne = 1.437, H = 0.344, I = 0.495), AAAA (Ne = 1.787, H = 0.436, I = 0.627), AAB (Ne = 1.831, H = 0.453, I = 0.645), BB (Ne = 1.731, H = 0.416, I = 0.617), AB (Ne = 1.539, H = 0.350, I = 0.535), and ABB (Ne = 1.771, H = 0.429, I = 0.619). Among the groups with wild accessions, group AS had Ne, H, and I values of 1.990, 0.497, and 0.691, respectively, while the other diploid accessions with unknown genomic groups had different values: M. beccarii (Ne = 1.747, H = 0.427, I = 0.619), M. coccinea (Ne = 1.800, H = 0.444, I = 0.637), and M. textilis (Ne = 1.719, H = 0.418, I = 0.609).

Table 5 Genetic diversity indices obtained from 66 accessions of Musa species using conserved DNA-derived polymorphism markers

The genetic diversity of the AS group was the highest for all three indices (Ne, H, and I), whereas that of the AAA accessions was the lowest. Based on the polymorphic loci of the selected CDDP primers, the genomic (ploidy) groups ranked from high to low as AS > AAB > AAAA > AA > ABB > wild diploid accessions (M. beccarii, M. coccinea, and M. textilis) with unknown group > BB > AB > AAA. The overall mean values of Ne, H, and I, with their respective standard deviations, across the diverse genomic groups were 1.778 ± 0.158, 0.433 ± 0.061, and 0.622 ± 0.070.

The assessment of genetic variation within and among the populations of the different genomic groups showed that the values of Ht, Hs, GST, and Nm varied with genome or group (Table 6). The values ranged from 0.350 to 0.497 for Ht, 0.345 to 0.451 for Hs, 0.014 to 0.094 for GST, and 4.818 to 35.824 for Nm. Accessions with the AS genome had the highest values of Ht, Hs, GST, and Nm, while the lowest values were associated with the accessions of the AB group. The overall mean values of Ht, Hs, GST, and Nm across the 66 accessions of different genomic groups were 0.433 ± 0.004, 0.404 ± 0.004, 0.066, and 7.113, respectively. The GST value of 0.066 indicates that 6.57% of the total genetic variation occurred among the populations and the remaining 93.43% within the populations.
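The relationship between these four parameters can be sketched with the standard definitions GST = (Ht − Hs)/Ht and Nm = 0.5(1 − GST)/GST. The snippet below is a minimal illustration with hypothetical function names; it is assumed, not confirmed by the text, that these are the exact formulas applied by the software. With the rounded overall means it gives a GST of about 0.067, and with GST = 0.0657 it gives an Nm of about 7.11, both close to the reported 0.066 and 7.113.

```python
def gst(ht, hs):
    """Coefficient of gene differentiation: share of total diversity found among populations."""
    return (ht - hs) / ht


def nm_from_gst(g):
    """Gene flow estimated from GST as Nm = 0.5 * (1 - GST) / GST."""
    return 0.5 * (1.0 - g) / g


# Overall means reported in the text (rounded to three decimals).
overall_gst = gst(0.433, 0.404)
print(round(overall_gst, 3))

# Share of variation among vs. within populations for the reported GST.
reported_gst = 0.0657
print(round(100 * reported_gst, 2), round(100 * (1 - reported_gst), 2))
print(round(nm_from_gst(reported_gst), 2))
```

Because Nm is inversely related to GST, small differences in rounding of GST shift Nm noticeably, which explains the slight mismatch between values recomputed from rounded inputs and those in Table 6.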

Table 6 Genetic differentiation in different genomic groups of 66 accessions of Musa species using conserved DNA-derived polymorphism markers

Dendrogram Analysis of Different Genomic Groups of Musa Species

The UPGMA dendrogram of the 66 accessions resolved nine major groups at a similarity index of 0.814 (Fig. 5). Group I was subdivided into two subgroups, subgroup I (SGI) and subgroup II (SGII). SGI consisted of two accessions, “Fougamou 1” and “Kluai Tiparot,” both of the ABB genomic group, while SGII had four accessions with different genomic groups: “Zebrina” G.F (AA), “Wompa” (AS), "Plantain no. 17" (AAB), and “Pisang Palembang” (AAB). Across both subgroups, the triploid ABB and AAB genomes dominated. Group II also comprised two subgroups. SGI contained “Mons Mari” (Pedwell: AAA), “Highgate” (AAA), and “Honduras” (BB), while SGII had “Lady Finger” Nelson (AAB), “JD Yangambi” (AAA), “Williams” (Bell, South Johnstone: AAA), “Selangor” (AAA), “Pome” (AAB), “Pisang Awak” (ABB), Musa beccarii (wild diploid Musa species), FHIA-23 (AAAA), No.110 (AA), and “Borneo” (AA). Triploid AAA accessions dominated SGI of group II, while in SGII triploids of different genomic groups (AAB, AAA, and ABB) were the most frequent, followed by diploids (AA) and tetraploids (AAAA). Accessions of different ploidy groups including "Calcutta 4" (AA), “Garbon 2” (AAB), “Blue Torres Strait Island” (ABB), “Cardaba” (ABB), “Pelipita” (ABB), “Pelipita Manjoncho” (ABB), “Tani” (BB), and “Pisang Klutuk Wulung” (BB) were found in group III. In this group, ABB genomes were the most frequent, followed by BB. “Pelipita” and “Pelipita Manjoncho,” each with the ABB genome, clustered closely, and the same degree of relatedness was observed between “Tani” and “Pisang Klutuk Wulung,” both of the BB group. The B genome dominated group III, the exception being “Calcutta 4,” which belongs to the AA genomic group. In group IV, “Balonkawe” (ABB), “Poteau Geant” (ABB), “Kunnan” (AB), “Khae” (Phrae: AA), and M. coccinea (wild diploid) were found together. Accessions with the B genome dominated, except for “Khae” (Phrae) and M. coccinea, which had AA and unknown diploid genomes, respectively.

Fig. 5

Dendrogram resolution of 66 accessions of Musa species using conserved DNA-derived polymorphism (CDDP) marker genes. SG=subgroup

Group V also had two subgroups: SGI (“Obino I'Ewai”-AAB; “Long Tavoy”-AA; “Pata”-ABB; "Plantain no. 3"-AAB; “Madang”-AA; “Pisang Lawadin”-AAB; SH-3436-6-AAAA; “Tomolo”-AA; FHIA21-68-AAAB; and “Lal Velchi”-BB) and SGII (“Dwarf Parfitt”-AAA; “Malaccenesis”-AA; “Tiau Lagada”-AA; and “Niyarma Yik”-AA). In SGI of group V, triploids (ABB, AAB) were the most abundant, followed by diploids (AA, BB) and tetraploids (AAAA, AAAB). SGII of group V consisted of diploid AA accessions, except for “Dwarf Parfitt,” which is triploid (AAA). Group VI was further divided into three subgroups, SGI, SGII, and SGIII. SGI of group VI contained “Improved Lady Finger” (AAB), “Higa” (AA), “Pisang Berlin” (AA), and “Umalag” (AAA); the A genome dominated, with equal numbers of diploids (two AA) and triploids (AAB and AAA). SGII consisted of “Silk” (AAB), “Pisang Keling” (AAB), “Gros Michel” (AAA), “Chinese Cavendish” (AAA), “Pisang Rajah” (South Johnstone: AAB), “Figure Pomme Geante” (AAB), “Lidi” (AA), “Lai” (AAA), “Green Red” (AAA), and “Hochuchu” (AAA). In SGII, AAA triploids were the most prominent, followed by AAB triploids and a diploid (AA), while SGIII had “Hsein Jen Chiao” (AAA) and “Pisang Lilin” (AA). Groups VII and VIII each contained a single accession, “Truncata” (AA) and M. textilis (wild diploid), respectively. Two triploid accessions, “Dole” (ABB) and “Dwarf Cavendish” (AAA), made up group IX.
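The clustering behind such a dendrogram can be sketched in a few lines: compute Jaccard similarity between every pair of 0/1 band profiles, then repeatedly merge the two clusters with the highest average pairwise similarity (UPGMA). The following is a simplified, pure-Python illustration and not the NTSYSpc implementation; the accession labels, band profiles, and function names are hypothetical.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity: shared bands divided by bands present in either profile."""
    shared = sum(1 for x, y in zip(a, b) if x == 1 and y == 1)
    either = sum(1 for x, y in zip(a, b) if x == 1 or y == 1)
    return shared / either if either else 1.0


def upgma(profiles):
    """Return the merge history as (cluster_a, cluster_b, similarity) tuples."""
    # Similarity between every pair of original accessions.
    sim = {
        frozenset(pair): jaccard(profiles[pair[0]], profiles[pair[1]])
        for pair in combinations(profiles, 2)
    }

    def average_similarity(c1, c2):
        # Unweighted mean over all member pairs drawn from the two clusters.
        total = sum(sim[frozenset((x, y))] for x in c1 for y in c2)
        return total / (len(c1) * len(c2))

    active = [(name,) for name in profiles]
    merges = []
    while len(active) > 1:
        a, b = max(combinations(active, 2), key=lambda pair: average_similarity(*pair))
        merges.append((a, b, average_similarity(a, b)))
        active.remove(a)
        active.remove(b)
        active.append(a + b)
    return merges


# Hypothetical band profiles for three accessions scored at four loci.
toy_profiles = {
    "acc1": [1, 1, 0, 1],
    "acc2": [1, 1, 0, 0],
    "acc3": [0, 0, 1, 1],
}
for a, b, s in upgma(toy_profiles):
    print(a, b, round(s, 3))
```

Cutting the resulting merge history at a chosen similarity threshold yields the groups; accessions that merge only below the threshold remain in separate groups, as seen for the single-accession groups VII and VIII.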

Principal Component Analysis (PCA) of Different Genomic Groups of Musa Species

Principal component analysis resolved the 66 accessions of bananas and plantains of different genomic groups into distinct coordinates (Supplementary file 2: Figure S1). Accessions "Plantain no. 3", “Pisang Lawadin,” "Plantain no. 17", “Blue Torres Strait Island,” “Obino I'Ewai,” “Fougamou 1,” “Pelipita,” “Lal Velchi,” “Tani,” “Pisang Klutuk Wulung,” “Balonkawe,” and “Pelipita Manjoncho,” among others, were considered plantains because of the dominance of the “B” genome in all of them, and they clustered closely according to their genomic constitutions. For instance, "Plantain no. 3", “Pisang Lawadin,” and "Plantain no. 17", all of the AAB genomic group, were tightly grouped. Similar clustering was noted among “Gros Michel,” “Truncata,” “Long Tavoy,” “Malaccenesis,” “Chinese Cavendish,” “Lidi,” “Lai,” “Hochuchu,” “Hsein Jen Chiao,” “Green Red,” “Tiau Lagada,” “Highgate,” and “Niyarma Yik,” among others, in which the “A” genome was the most frequent, classifying them as bananas. These accessions were either diploid (AA), as in “Lidi,” or triploid (AAA), as in “Chinese Cavendish.” “Cardaba” and “Honduras,” which had ABB and BB groups, respectively, did not cluster with other known ABB and BB accessions.

Discussion

Assessment of genetic diversity, population indices, and polymorphisms among accessions of different genomic groups ranging from diploids to tetraploids is crucial in Musa breeding programs, since most programs aim to establish superior accessions of a given ploidy derived from genotypes with favorable traits such as resistance to abiotic and biotic factors (Crouch et al. 1999). Conserved DNA-derived polymorphisms, which target sequences of gene families that are present in multiple copies within plant genomes, are an efficient and cost-effective molecular technique for assessing polymorphisms (variations) in plant species (Collard and Mackill 2009). It has been shown that, within functional domains of well-characterized plant genes (involved in responses to abiotic and biotic stress or in plant development), CDDPs can generate informative banding patterns that are used for mapping, trait association, and germplasm genetic diversity studies (Poczai et al. 2013; Collard and Mackill 2009). Because CDDP can efficiently and reliably generate functional markers that are associated with given plant phenotypes, it has been applied in the breeding of different crops (Poczai et al. 2011; Li et al. 2013, 2014; Hajibarat et al. 2015; Jin et al. 2016; Mam et al. 2017; Jiang and Zang 2018), but not yet in banana and plantain crops.

In plants, the allelic richness of accessions is an indicator of their genetic diversity, and it is usually harnessed with informative molecular markers that detect populations meant for selection, breeding, and conservation (Patil et al. 2013; Vinceti et al. 2013). In this study, existing CDDP primers were retrieved and new ones designed; together they identified 421 alleles, with an average of 35.083 per primer. The number of alleles per primer ranged from 20 (ABP1) to 51 (MYB1). In a previous report on a different crop, safflower (Carthamus tinctorius L.), 89 alleles were detected with CDDP primers, and the number of alleles per primer ranged from 5 (ERF1) to 11 (WRKYF1) (Talebi et al. 2018). In another investigation, in which 21 CDDP primers were amplified in twelve date palm samples, a total of 192 scorable bands with an average of 9.1 bands per primer were detected (Sami and Atia 2014). The total number of alleles identified here, the range per primer locus, and the average were higher than those detected in previous Musa studies using other molecular markers, namely eighteen SSR markers (alleles = 195, range = 4–18, average = 10.8) (Nyine et al. 2017) and 38 triploid accessions analyzed with 17 microsatellite loci (alleles = 267, range = 8–24, average = 14.00) (Christelova et al. 2011). Lower values than ours (alleles = 292, average = 15.4) were also generated from the analysis of 70 diploid accessions with 19 microsatellite loci (Christelova et al. 2011). The ranges of allelic counts (1–28) and frequencies (0.015–0.424) obtained were high, demonstrating the informative nature of this set of CDDP primers in Musa species. Studies in other crops using different molecular markers have likewise established allelic richness as an indicator of genetic diversity that is mainly used to assess populations meant for conservation and breeding (Patil et al. 2013; Vinceti et al. 2013).
In this study, the additionally designed CDDP primers with less than 60% GC content either failed to amplify or amplified poorly, which supports a higher GC content as a favorable factor for successful amplification of CDDP primers in plants (Collard and Mackill 2009).
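The GC screening described above can be illustrated with a minimal sketch. The function names, the sequences, and the treatment of degenerate bases are assumptions made for illustration; the sequences are hypothetical and are not primers from this study.

```python
def gc_content(primer: str) -> float:
    """Return the percentage of G and C bases in a primer sequence.

    Simplification (assumption): IUPAC degenerate codes such as S or N
    are counted as non-GC, so degenerate primers are underestimated.
    """
    seq = primer.upper()
    if not seq:
        raise ValueError("primer sequence is empty")
    gc = sum(1 for base in seq if base in "GC")
    return 100.0 * gc / len(seq)


def passes_gc_threshold(primer: str, threshold: float = 60.0) -> bool:
    """True when the primer reaches the GC threshold (60% by default)."""
    return gc_content(primer) >= threshold


# Hypothetical sequences for illustration only.
for seq in ("GGCCGGATCGGCAAG", "ATTGCATTAGCATTA"):
    print(seq, round(gc_content(seq), 1), passes_gc_threshold(seq))
```

A screen of this kind only flags candidates; amplification success still has to be confirmed empirically.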

The primers of the CDDP markers showed a high mean PIC of 0.918 (range 0.768–0.975), whereas a mean PIC of 0.870 (range 0.530–0.950) was obtained with SSR markers (Nyine et al. 2017). Also, in a study of 38 triploid accessions analyzed with 19 microsatellite markers, a PIC of 0.850 (0.760–0.942) was obtained (Christelova et al. 2011; Changadeya et al. 2012). A lower PIC than ours, 0.827 (0.625–0.936), was generated from the analysis of 70 diploid accessions with 19 microsatellite loci (Christelova et al. 2011). These comparisons suggest that the CDDP markers are informative, discriminatory, and efficient relative to SSR, ISSR, and RAPD markers. The major allele frequency of 0.220 (0.100–0.450) generated from SSR markers (Nyine et al. 2017) was similar to the value obtained in this study (0.171; 0.046–0.454), which shows the effectiveness of CDDP markers in exploring the allelic richness of this vital crop. The gene diversity of 0.924 (0.782–0.976) was higher than the values previously reported with SSR markers (Poerba and Ahmad 2010; Changadeya et al. 2012; Nyine et al. 2017). The PIC was high enough to contribute to the resolution of even the closest accessions and genomic groups. Furthermore, the MYB1 primer displayed the highest PIC and is therefore regarded as the most informative; it has been implicated in secondary metabolism, abiotic and biotic stress responses, and cellular morphogenesis (Stracke et al. 2001; Jiang et al. 2004). Also, these novel primers generated unique alleles from the different genomic accessions, as earlier reported (Youssef et al. 2011).
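For readers less familiar with these indices, the sketch below shows how PIC, gene diversity, and major allele frequency are commonly computed from the allele frequencies at one locus. It assumes the widely used Botstein-type PIC formula and Nei's gene diversity; the text does not state which software or estimator produced the values above, and the frequencies used here are hypothetical.

```python
from itertools import combinations


def gene_diversity(freqs):
    """Expected heterozygosity: 1 minus the sum of squared frequencies."""
    return 1.0 - sum(p * p for p in freqs)


def pic(freqs):
    """Polymorphism information content (Botstein-type formula, assumed).

    PIC = 1 - sum(p_i^2) - sum over pairs i<j of 2 * p_i^2 * p_j^2
    """
    homozygosity = sum(p * p for p in freqs)
    cross_term = sum(2.0 * (p * p) * (q * q) for p, q in combinations(freqs, 2))
    return 1.0 - homozygosity - cross_term


def major_allele_frequency(freqs):
    """Frequency of the most common allele at the locus."""
    return max(freqs)


# Hypothetical locus with four equally frequent alleles.
freqs = [0.25, 0.25, 0.25, 0.25]
print(gene_diversity(freqs), pic(freqs), major_allele_frequency(freqs))
```

Because the cross term is never negative, PIC cannot exceed gene diversity at the same locus, which matches the ordering of the two values reported above (0.918 and 0.924).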

We obtained a high PPL of 100% (89.39–100%), which reflects the high efficacy of the CDDP markers used. This range of PPL is the highest when compared with those obtained from other marker systems, namely RAPD (44.44–100%), ISSR (66.66–100%), and DAMD (66.66–100%) (Lamare and Rao 2015). High polymorphism detected by molecular markers has been shown to depend on the presence of repeated sequences of AC, CA, AG, and GA (Ghalmi et al. 2010). Of the 12 CDDP markers, KNOX-2 showed the greatest genetic diversity in this crop species based on its values of NPL, PPL, Ne, H, and I, while WRKY-F1 showed the least. KNOX-2 has been reported to be associated with homeobox genes that function as transcription factors with a unique homeodomain (Nagasaki et al. 2001), while WRKY-F1 is linked to a transcription factor with developmental and physiological roles in plants (Xie et al. 2005).
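As an illustration of how NPL and PPL are typically derived from gel scores, the sketch below counts polymorphic loci in a binary band matrix (1 = band present, 0 = band absent). The matrix layout, the function name, and the data are assumptions; a locus is treated as polymorphic whenever the accessions do not all share the same score.

```python
def polymorphic_loci(band_matrix):
    """Return (NPL, PPL) for a binary band matrix.

    Rows are accessions and columns are loci (bands). A locus counts as
    polymorphic when both 0 and 1 occur among the accessions.
    """
    if not band_matrix or not band_matrix[0]:
        raise ValueError("band matrix is empty")
    n_loci = len(band_matrix[0])
    npl = 0
    for j in range(n_loci):
        scores = {row[j] for row in band_matrix}
        if len(scores) > 1:
            npl += 1
    return npl, 100.0 * npl / n_loci


# Hypothetical scores for three accessions at three loci.
matrix = [
    [1, 1, 0],
    [1, 0, 0],
    [1, 1, 1],
]
print(polymorphic_loci(matrix))
```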

Populations with high genetic diversity of neutral markers and alleles could serve as suitable candidates for high adaptive variation, fitness, and conservation (Van et al. 2012; Ilves et al. 2013). Genetic indices including Ne, H, and I are considered crucial in the analysis of genetic diversity in several plants, since they measure the degree of genetic diversity of species (Hamilton 2009; Freeland et al. 2011). Within the populations of different genomic groups of Musa accessions investigated, Ne, H, and I were highest in “Wompa” (AS genomic group), followed by the AAB group, while the least diverse was the AAA population. The narrow genetic base of this A-genome group could be responsible for its susceptibility to different abiotic and biotic stressors. Higher genetic diversity of the kind observed in the wild accession “Wompa” has been reported in invasive species of other crops (Kelager et al. 2013).
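The three indices can be sketched for a single two-allele locus as follows. This assumes the textbook definitions (effective number of alleles, Nei's gene diversity, and Shannon's information index) and takes the allele frequency as given; how frequencies were estimated from the dominant band data in this study is not stated, so that step is left out.

```python
import math


def locus_indices(p):
    """Return (Ne, H, I) for a two-allele locus with allele frequency p.

    Ne = 1 / (p^2 + q^2)            effective number of alleles
    H  = 1 - (p^2 + q^2)            Nei's gene diversity
    I  = -(p ln p + q ln q)         Shannon's information index
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("allele frequency must lie between 0 and 1")
    q = 1.0 - p
    homozygosity = p * p + q * q
    ne = 1.0 / homozygosity
    h = 1.0 - homozygosity
    # Terms with zero frequency contribute nothing to the Shannon index.
    shannon = -sum(x * math.log(x) for x in (p, q) if x > 0.0)
    return ne, h, shannon


# Hypothetical frequencies: a balanced locus and a nearly fixed one.
for p in (0.5, 0.95):
    print(p, locus_indices(p))
```

All three indices peak when the two alleles are equally frequent and fall toward their minimum as one allele approaches fixation, which is why they move together across the genomic groups compared above.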

It is noteworthy that biodiversity conservation efforts focus on selecting crop accessions with a genetic reservoir of potential and proven adaptability, especially under the influence of abiotic and biotic factors (Bilz et al. 2011). Using the CDDP data matrix, the assessed population genetic parameters, including Ht, Hs, GST, and Nm, were found to be high in the accessions studied. Compared with the other accessions, “Wompa” (AS genomic group) had the highest Ht, Hs, and GST (0.497, 0.451, and 0.094, respectively), with an Nm of 4.818, followed by the AAB group (Ht = 0.453, Hs = 0.421, GST = 0.072, and Nm = 6.934). The AB group had the lowest Ht, Hs, and GST (0.350, 0.345, and 0.014, respectively) and, correspondingly, the highest Nm (35.824). Generally, the population genetic structure values (Ht = 0.433 ± 0.004, Hs = 0.404 ± 0.004, GST = 0.066, and Nm = 7.113) identified in this study demonstrate the usefulness of the markers. Knowing how genetic diversity is partitioned within and between populations helps in selecting the populations that account for most of the existing variation. If genetic diversity is found mostly within populations, then fewer populations are required to protect and maintain the overall differences among the accessions. However, if genetic diversity lies mainly between populations, then a higher number of populations should be prioritized for protection and utilization. According to Nei (1978), GST is classified as low when its value is < 0.05, medium when 0.05 < GST < 0.15, and high when GST > 0.15. In this study, the GST of 0.066 falls in the medium class and signifies that 6.57% of the genetic diversity is among populations and 93.43% within populations. A higher percentage of genetic diversity within populations has been demonstrated in other plants (Yang 2009; Qu 2013). The distribution of genetic diversity also plays an important role in species conservation (Barrett and Kohn 1991; Ge et al. 1998; Millar and Libby 1991). The high Nm recorded may be a factor behind the large genetic divergences noted in these accessions, as earlier asserted in another crop (Jin et al. 2016).
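The relationship among these parameters can be sketched as below, assuming Nei's GST = (Ht − Hs) / Ht and the gene-flow estimate Nm = 0.5 (1 − GST) / GST. The Nm formula is an assumption: it is not stated in the text, although it reproduces the reported Nm values closely. The class boundaries follow the thresholds quoted above, and the handling of values exactly at 0.05 or 0.15 is arbitrary.

```python
def gst(ht, hs):
    """Coefficient of gene differentiation: share of total diversity
    (Ht) that lies among populations rather than within them (Hs)."""
    if ht <= 0.0:
        raise ValueError("Ht must be positive")
    return (ht - hs) / ht


def gene_flow(gst_value):
    """Estimated gene flow, Nm = 0.5 * (1 - GST) / GST (assumed formula)."""
    if gst_value <= 0.0:
        raise ValueError("GST must be positive to estimate Nm")
    return 0.5 * (1.0 - gst_value) / gst_value


def classify_gst(gst_value):
    """Low / medium / high classes using the thresholds quoted in the text."""
    if gst_value < 0.05:
        return "low"
    if gst_value <= 0.15:
        return "medium"
    return "high"


# Overall values reported in this study (rounded to three decimals).
overall = gst(0.433, 0.404)
print(round(overall, 3), classify_gst(overall), round(gene_flow(overall), 2))
```

Small differences from the reported GST and Nm are expected here because the inputs are rounded to three decimals. The inverse relationship in the formula also explains why the AB group, with the lowest GST, shows the highest Nm.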

The dendrogram analysis of the studied accessions of different ploidy groups using the CDDP marker system revealed nine principal clusters that exhibited unique topology with some similarities. In a previous study involving different marker systems (SSR, AFLP, and RAPD), five clusters were detected (Sami and Atia 2014); the difference could be attributable to the nature of the markers and the number of accessions used. Some of the genomic groups were correctly resolved, while others, including those with mixed ploidy, clustered together based on the genetic similarity inherited from their progenitors, M. acuminata (A genome) and M. balbisiana (B genome). For instance, “Pelipita” and “Pelipita Manjoncho,” each with the ABB genome, clustered closely, and the same relatedness was found between accessions “Tani” and “Pisang Klutuk Wulung,” which belong to the BB group. The B genome dominates group III, with the exception of “Calcutta 4,” which belongs to the AA genomic group but was placed in this group, possibly because of an ancestral linkage, as previously reported (Brown et al. 2009). It has been reported that the farther apart accessions are from one another, the greater the possibility of wider genetic diversity, which is also reflected in their locations on clusters (Skroch and Nienhuis 1995). Accessions “Truncata” and M. textilis were the most genetically isolated, as evidenced by their respective groups, followed by “Dole” and “Dwarf Cavendish,” which clustered in only one group. Most of the accessions of different genomic groups were located in the major groups, with subgroups that show the level of relatedness among them, as earlier reported using ISSR markers (Silva et al. 2016). “Zebrina” G.F., an M. acuminata accession with the AA genomic group, grouped together with M. schizocarpa with the AS genome, which corroborates a previous report (Christelova et al. 2011). Some wild diploid Musa species, including M. beccarii and M. coccinea, whose genomic constitutions are not yet known, clustered closely with the A genome, suggesting that they belong to the A genomic group. This type of close relationship has been shown between M. acuminata (A genome) and Rhodochlamys (Christelova et al. 2011; Li et al. 2010; Liu et al. 2010). In group II, the diploids, triploids, and tetraploids formed two distinct but closely related subgroups, which supports the hypothesis of production of unreduced triploid (3N) and reduced haploid (N) gametes during meiotic events in the tetraploid progenitors (Ssali et al. 2010). The CDDP markers facilitated discrimination between subgroups and genomic constitutions, although some could not be resolved because of their common ancestral lineage and the narrowed genetic polymorphism caused by vegetative propagation cycles, as earlier reported (Christelova et al. 2011).
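Dendrograms of this kind are built from pairwise similarities between band profiles. The sketch below uses the Jaccard coefficient as one common choice for dominant markers; the coefficient and clustering method actually used for the nine clusters above are not stated in this section, so this is an illustration rather than a reconstruction. The accession labels and scores are hypothetical.

```python
def jaccard_similarity(a, b):
    """Jaccard similarity of two binary band profiles (1 = band present).

    Shared presences divided by loci where at least one profile has a band;
    joint absences are ignored.
    """
    if len(a) != len(b):
        raise ValueError("profiles must have the same number of loci")
    shared = sum(1 for x, y in zip(a, b) if x == 1 and y == 1)
    either = sum(1 for x, y in zip(a, b) if x == 1 or y == 1)
    # Convention (assumption): two profiles with no bands are identical.
    return shared / either if either else 1.0


def distance_matrix(profiles):
    """Pairwise Jaccard distances (1 - similarity) keyed by accession pair."""
    names = list(profiles)
    return {
        (m, n): 1.0 - jaccard_similarity(profiles[m], profiles[n])
        for m in names
        for n in names
    }


# Hypothetical band profiles for three accessions.
profiles = {
    "acc_1": [1, 1, 0, 1],
    "acc_2": [1, 0, 0, 1],
    "acc_3": [0, 0, 1, 0],
}
for pair, dist in distance_matrix(profiles).items():
    print(pair, round(dist, 3))
```

A distance table like this is the usual input to a hierarchical clustering routine, which then produces the tree topology.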

Further analysis of the 66 accessions of bananas and plantains of different genomic groups resolved them into distinct coordinates according to crop type (banana or plantain) and genomic constitution. Accessions “Plantain no. 3,” “Pisang Lawadin,” “Plantain no. 17,” “Blue Strait-Island,” “Obino I’Ewa,” “Fougamou1,” “Pelipita,” “Lal-Velchi,” “Tani,” “Pisang Klutuk Wulung,” “Balonkawe,” and “Pelipita Manjoncho,” among others, are plantains owing to the dominance of the “B” genome in all of them, and they clustered closely according to their genomic constitutions. The association of some “A” genome accessions with this cluster could be attributable to previous misclassification of their ploidy groups and to ancestral lineage. For instance, three plantain accessions (“Plantain no. 3,” “Pisang Lawadin,” and “Plantain no. 17”) were tightly grouped, and all three belong to the AAB genomic group. Similar clustering was noted in banana accessions (“Gros Michel,” “Truncata,” “Long Tavoy,” “Malaccensis,” “Chinese Cavendish,” “Lidi,” “Lai,” “Hochuchu,” “Hsein-Jen Chiao,” “Green Red,” “Tiau Lagada,” “Highgate,” and “Niyarma Yik,” among others) in which the “A” genome dominates. These accessions were either diploid (AA) or triploid (AAA), as in “Lidi” and “Chinese Cavendish,” respectively, and this type of homogenomic grouping has been reported (Brown et al. 2009; Rajamanickam and Rajmohan 2012). “Cardaba” and “Hondura,” which have the AAB and BB groups, respectively, did not cluster with other known AAB and BB accessions.

Conclusion

The set of primers derived from CDDP markers exhibited high resolving potential and discriminatory capability, based on high PIC values, and these primers may be employed in breeding programs to facilitate assessment of the genetic diversity, population structure, and allelic richness of Musa accessions. The CDDP markers were more efficient and informative in assessing genetic diversity and population potential among Musa species than other gel-based molecular markers, including ISSR and RAPD, as demonstrated by the high values of PIC, PPL, Ne, H, I, Ht, Hs, Nm, and other genetic indices obtained. The results suggest that the AS genomic group is the most genetically diverse among the genomic groups. Dendrogram analysis of the accessions with variable genomic constitutions clustered the accessions better than PCA. Unique alleles identified in some of the accessions could be associated with useful phenotypic traits, since CDDP markers are functional, gene-based markers linked to responses to abiotic and biotic stresses. Therefore, these selected CDDP primers could serve as useful tools for selecting good hybrids for improved breeding and germplasm conservation. In addition, the accessions with high genetic indices resulting from variable combination events may be harnessed as suitable training populations in Musa breeding programs.