Background

Commercial banana varieties, derived from intraspecific crosses within Musa acuminata Colla and from interspecific hybridization with Musa balbisiana Colla, are cultivated mostly by smallholder farmers in over 120 countries across tropical and sub-tropical environments. Banana is an inexpensive starch source that is also rich in fibre, minerals and vitamins. Although it is an important food commodity in developing countries, ranking after rice, wheat and maize in terms of production value, genetic improvement has been limited. In wild bananas, sexual recombination results in viable seed. However, the majority of today's commercial cultivars are sterile A and B genome-containing triploids, in which seedless fruit develops via parthenocarpy, partly as a result of translocations [1]. Conventional breeding in Musa diploids and triploids is also hampered by a low number or complete absence of seeds, caused by either a lack of viable pollen or inefficient pollinating insects. As many cultivars evolve asexually through vegetative propagation (micropropagation or suckers), their genetic base is narrow, resulting in crops that lack resistance to pests and diseases. Given the large-scale global consumption of susceptible genotypes such as the sterile triploids of the M. acuminata Cavendish cultivar group, global Musa production is threatened by fungal, bacterial and viral pathogens and by a number of pests. The greatest disease losses today are caused by the fungal pathogens Mycosphaerella fijiensis, the causal organism of black Sigatoka disease, and Fusarium oxysporum f. sp. cubense Tropical Race 4, which causes Fusarium wilt. For these reasons, molecular tools for the development of disease-resistant genotypes are of paramount importance for the Musa industry.

Microsatellites, or simple sequence repeats (SSRs), are highly variable, abundant, randomly dispersed, locus-specific, codominant and multi-allelic markers, composed of core motifs of, for example, di- to penta-nucleotides repeated in tandem. Their application in Musa has included genotyping [2-4], studies of Musa evolution and taxonomy [5], and linkage map saturation [1]. Potential also exists in marker-assisted selection (MAS), upon identification of SSRs for gene loci co-localizing with quantitative trait loci (QTLs) for desirable traits. To date, several hundred SSR markers have been developed from M. acuminata and M. balbisiana material [5, 2, 6-8]. In comparison with other crop species, however, the total number available for genetic analyses remains limited, given that alleles can be absent or monomorphic when markers are applied across cultivars.

We report the development of novel SSR markers from sequenced BAC clones in M. acuminata Calcutta 4. This wild diploid is resistant to numerous fungal and bacterial pathogens, as well as nematodes. Given its potential as a source of exploitable genes, it is widely employed as a donor in banana breeding programs [9]. Polymorphic loci were identified when tested across 21 potential parental diploid M. acuminata individuals contrasting in resistance to Sigatoka diseases caused by the ascomycete fungi M. fijiensis and Mycosphaerella musicola. Such BAC-derived markers are potentially advantageous: polymorphism can be greater than that observed using EST-derived SSRs [10], and subsequent mapping also allows anchoring of individual BAC clones of interest to genetic maps.

Results

The sequences of five Musa BAC clones were subjected to a computational pipeline targeting perfect SSRs with periodicities ranging from two to ten nucleotides and an overall length of at least 12 bases. In total, 41 SSRs were identified, comprising six repeat classes. Di-nucleotide repeats were the most abundant class (46.34%), followed by tri- (29.26%), tetra- (12.19%), penta- (7.31%), hexa- (2.43%) and nona-nucleotide repeats (2.43%). The most abundant di-nucleotide repeat motifs isolated were AG, AT, CT and TA (7.31% each). By contrast, all tri-nucleotide motifs were equal in abundance (7.31% each). Generally, the shorter the core sequence, the greater was the number of repeats observed, with an average of 12.2 repeats for di-nucleotide motifs, 5.8 for tri-, 3.6 for tetra-, 3 for penta-, 3 for hexa- and 3 for nona-nucleotide motifs. A summary of all designed primer sequences, SSR motifs, theoretical annealing temperatures and expected product sizes is provided for the 41 loci identified where primers could be designed [Additional file 1]. Twenty out of 33 tested primer pairs reproducibly amplified polymorphic PCR products across the Musa accessions, with allelic patterns under optimized primer conditions given in Table 1. Di-nucleotide repeats were the most abundant polymorphic group, followed by tri-, penta- and tetra-nucleotide repeats. From a total of 56 scored alleles, the number of alleles per polymorphic locus ranged from two to four, with an average of 2.8 alleles per locus. Heterozygosity values were calculated using GDA [11] and FSTAT [12], with expected values ranging from 0.31 to 0.75. Thirteen loci (MABN 09, MABN 12, MABN 14, MABN 16, MABN 18, MABN 21, MABN 24, MABN 31, MABN 33, MABN 37, MABN 38, MABN 39 and MABN 40) were monomorphic in the M. acuminata accessions. Twelve loci departed from Hardy-Weinberg expectations (P < 0.05, Fisher's exact test based on 2000 shufflings), possibly as a result of sampling, chromosomal inversions or null alleles. Phenomena potentially responsible for null alleles include point mutations and sequence divergence in primer annealing sites, or preferential allele amplification during PCR. In testing for linkage disequilibrium (LD) (FSTAT, P < 0.01 with Bonferroni correction), no disequilibrium was detected among pairwise combinations of loci. PIC values for allelic diversity ranged from 0.258 to 0.681.
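For readers less familiar with the summary statistics reported here, the following Python sketch shows how expected heterozygosity and the polymorphism information content (PIC) can be computed from allele frequencies at a single locus. It uses the textbook formulas (He = 1 - sum of squared allele frequencies; PIC as He minus the sum over allele pairs of 2 p_i^2 p_j^2) and is not taken from GDA or FSTAT, which may apply sample-size corrections, so values need not match the reported ones exactly. The function names and the example genotypes are hypothetical and are not data from this study.

```python
from collections import Counter
from itertools import combinations


def allele_frequencies(genotypes):
    """Allele frequencies from diploid genotypes given as (allele, allele) pairs."""
    counts = Counter(allele for genotype in genotypes for allele in genotype)
    total = sum(counts.values())
    return {allele: n / total for allele, n in counts.items()}


def expected_heterozygosity(freqs):
    """Uncorrected expected heterozygosity: 1 - sum(p_i^2)."""
    return 1.0 - sum(p * p for p in freqs)


def pic(freqs):
    """Polymorphism information content:
    1 - sum(p_i^2) - sum over pairs i<j of 2 * p_i^2 * p_j^2."""
    homozygosity = sum(p * p for p in freqs)
    cross_term = sum(2.0 * (p * q) ** 2 for p, q in combinations(freqs, 2))
    return 1.0 - homozygosity - cross_term


# Hypothetical genotypes (allele sizes in bp) for one locus; NOT study data.
example_genotypes = [(180, 184), (180, 180), (184, 188), (180, 184)]
example_freqs = list(allele_frequencies(example_genotypes).values())
example_he = expected_heterozygosity(example_freqs)
example_pic = pic(example_freqs)
```

For two equally frequent alleles, these functions give He = 0.5 and PIC = 0.375, which illustrates why PIC is always somewhat lower than expected heterozygosity at the same locus.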

Table 1 Characteristics of microsatellite loci isolated from M. acuminata Calcutta 4 and polymorphic across 21 M. acuminata accessions.

Discussion

This is the first report identifying polymorphic microsatellite markers from M. acuminata Calcutta 4 across accessions contrasting in resistance to Sigatoka diseases. The availability of these molecular tools will contribute towards development of genetic maps with high marker density, derived from segregant populations for agronomically important traits, and offering potential for downstream application in MAS. Concerted efforts are currently underway by a number of Musa breeding groups for development of segregant mapping populations [13, 14].

Also, given the difficulty of developing Musa populations with sufficient numbers of individuals for high-resolution mapping, LD mapping has been proposed as an alternative route for identifying genes for traits of interest in Musa [15]. As such an approach requires both hundreds of plant accessions and thousands of markers, the new microsatellite markers characterized in this study can serve as candidates for such work. Our markers are also a resource for characterizing diversity in wild species, cultivars and landraces deposited in genebanks, and for inferring phylogenetic relationships in Musa.

Finally, considering the increasing availability of genomic resources for M. acuminata Calcutta 4, such as BAC libraries [16], EST data sets [17] and candidate disease resistance gene sequences [18], in the context of available next generation sequencing technologies, identification of genes and markers for desirable traits such as resistance to biotic stress will no doubt accelerate considerably in the near future.

Conclusion

In this study, 41 new microsatellite markers were developed for M. acuminata, of which 20 were polymorphic when screened across 21 diploid individuals contrasting in resistance to Sigatoka diseases. Polymorphic markers detected an average of 2.8 alleles per locus, with PIC values ranging from 0.258 to 0.681. The results also provided information on the nature and abundance of repeat classes.

Methods

Data for SSR identification were derived from genomic sequence (shotgun-sequenced BAC clones from an M. acuminata Calcutta 4 BAC library) [16, 19]. A computational search over five BAC consensus sequences [GenBank:AC186748, AC186749, AC186954, AC186747 and AC186750] was performed to locate SSRs with at least two repeating units spanning more than 10 bases, using the program Mreps [20]. Primers flanking microsatellite loci were designed using the program PRIMER3 [21].
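The kind of search described above can be illustrated with a short Python sketch that scans a sequence for perfect tandem repeats using regular-expression backreferences. This is a simplified stand-in and not the algorithm implemented in Mreps. The function name and its parameters are hypothetical, and the default thresholds (periods of 2 to 10 nucleotides, at least two units, at least 12 bases) are one reading of the criteria stated in the text.

```python
import re


def find_perfect_ssrs(seq, min_period=2, max_period=10, min_units=2, min_length=12):
    """Return sorted (start, motif, n_units) tuples for perfect tandem repeats.

    Simplified illustration only; thresholds are assumptions, not Mreps settings.
    """
    seq = seq.upper()
    hits = []
    for period in range(min_period, max_period + 1):
        # Units required so the tract has min_units copies and reaches min_length.
        units = max(min_units, -(-min_length // period))  # ceiling division
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (period, units - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            # Skip motifs that are themselves repeats of a shorter unit
            # (for example ATAT is reported as AT, and AA is a mononucleotide run).
            if any(period % k == 0 and motif == motif[:k] * (period // k)
                   for k in range(1, period)):
                continue
            hits.append((match.start(), motif, len(match.group(0)) // period))
    return sorted(hits)


# Hypothetical test sequence, not taken from the BAC clones.
demo_sequence = "GG" + "AT" * 6 + "CC" + "GAA" * 4 + "T"
demo_hits = find_perfect_ssrs(demo_sequence)  # [(2, 'AT', 6), (16, 'GAA', 4)]
```

Each hit gives the zero-based start position, the core motif and the number of tandem copies. Coordinates of this kind, plus flanking sequence, are what a primer design step would take as input.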

From the 41 loci identified where primers could be designed, 33 primer pairs were tested for polymorphism. Twenty-one diploid (AA) M. acuminata accessions, contrasting in resistance to Sigatoka diseases and representing potential parents for genetic map construction, were used to characterize microsatellite loci. Genomic DNA was extracted from the Black Sigatoka-resistant M. acuminata accessions Calcutta 4, Lidi, 0323-03, SH32-63, 1304-06 and 0116-01; Black Sigatoka-susceptible accessions Pisang Berlin and Niyarma Yik; Yellow Sigatoka-resistant accessions Calcutta 4, Burmanica, Microcarpa, Lidi, 0323-03, 1304-06, 1741-01, 9179-03, 0116-01, 1318-01 and 4279-06; and Yellow Sigatoka-susceptible accessions Raja Uter, Tjau Lagada, F2P2, Khai Nai On, Pisang Berlin, Niyarma Yik, Sowmuk, Jaribuaya and SH32-63. Each PCR was carried out in a 13 μl volume containing 3 ng of template genomic DNA, 2.5 mM MgCl2, 0.2 mM dNTPs, 0.5 μM of each primer, 1.25 U of Taq polymerase and 1 × PCR buffer (Invitrogen). Amplifications were conducted on a PTC-100 thermocycler (MJ Research), with temperature cycling as follows: initial denaturation at 94°C for 5 min; 29 cycles of 94°C for 1 min, the primer-specific annealing temperature for 1 min, and extension at 72°C for 1 min; plus a final elongation period of 7 min at 72°C. Following amplification, PCR products were initially electrophoresed in 3.5% agarose gels run in 1 × TBE buffer, in order to check amplicon size and PCR specificity. Allele sizes were estimated against 10-bp ladder molecular size standards (Invitrogen) on denaturing 6% polyacrylamide gels containing 7 M urea, with PCR products visualized by silver staining according to standard protocols. The degree of polymorphism per locus was calculated using GDA software, version 1.2 [11].
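As a worked illustration of the reaction composition above, the following Python sketch converts the stated final concentrations into pipetting volumes for a 13 μl reaction using the dilution relation C1·V1 = C2·V2. The final concentrations, enzyme units, template amount and reaction volume come from the text. All stock concentrations are assumptions chosen for illustration only, since the source does not report them, and the function name is hypothetical.

```python
REACTION_VOLUME_UL = 13.0  # from the text

# name: (final concentration from the text, ASSUMED stock concentration),
# both values of a pair in the same unit.
COMPONENTS = {
    "PCR buffer (x)": (1.0, 10.0),       # assumed 10x stock
    "MgCl2 (mM)": (2.5, 50.0),           # assumed 50 mM stock
    "dNTPs (mM)": (0.2, 10.0),           # assumed 10 mM stock
    "forward primer (uM)": (0.5, 10.0),  # assumed 10 uM stock
    "reverse primer (uM)": (0.5, 10.0),  # assumed 10 uM stock
}
TAQ_UNITS = 1.25                # from the text
TAQ_STOCK_U_PER_UL = 5.0        # ASSUMED enzyme stock
TEMPLATE_NG = 3.0               # from the text
TEMPLATE_STOCK_NG_PER_UL = 3.0  # ASSUMED DNA working dilution


def reaction_volumes(n_reactions=1, overage=0.0):
    """Volumes in microlitres per component, scaled to n_reactions plus overage."""
    scale = n_reactions * (1.0 + overage)
    volumes = {
        name: REACTION_VOLUME_UL * final / stock * scale
        for name, (final, stock) in COMPONENTS.items()
    }
    volumes["Taq polymerase"] = TAQ_UNITS / TAQ_STOCK_U_PER_UL * scale
    volumes["template DNA"] = TEMPLATE_NG / TEMPLATE_STOCK_NG_PER_UL * scale
    # Water makes up the remainder of the reaction volume.
    volumes["water"] = REACTION_VOLUME_UL * scale - sum(volumes.values())
    return volumes


single_reaction = reaction_volumes()
```

With the assumed stocks, a single reaction needs 0.65 μl of MgCl2 and 8.24 μl of water. In practice the template would normally be added to each tube separately rather than included in a shared master mix.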