Background

Increasing concern about the effects of global climate change and its likely impact on agriculture has stimulated scientists to search for crops that can withstand extreme environmental conditions. Among legumes, pigeonpea {Cajanus cajan (L.) Millspaugh} (2n = 22) has attracted attention as being both drought-tolerant [1] and highly nutritious [2]. Extensive morphological variation within the genus Cajanus as a whole, and in the cultivated species in particular, has long led to the assumption that abundant genetic diversity exists within the cultivated species. On the contrary, molecular studies have reported extremely low levels of polymorphism within the cultivated species compared to its wild relatives [3, 4]. Such findings suggest that efforts towards the development of a linkage map of pigeonpea should focus on the use of an interspecific cross and on the development of a substantially larger number of markers. We report the development of 36 new polymorphic simple sequence repeat (SSR) markers that will be an asset in characterising and understanding the nature of diversity within Cajanus species.

Results

A total of 641 non-redundant contigs were generated from 2,131 sequenced clones, reflecting an overall redundancy level of 70%. Of the 641 contigs, 117 sequences (20%) contained a microsatellite. The average contig size was 500 bp, so the library covered an estimated 320,500 bp of the pigeonpea genome. On average, di-nucleotide repeats were the longest (27 bp) and also the most abundant, followed closely by tri-nucleotide repeats (average 25 bp). The longest single motif was a 258 bp perfect hexa-nucleotide (AAACCC) repeat, while a GT motif had the longest uninterrupted repeat of 74. A list of all designed pigeonpea primer sequences, SSR motifs and PCR programmes used for amplification is provided [see Additional file 1].
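The redundancy and coverage figures quoted above follow from simple arithmetic, sketched here in Python. The variable names are illustrative, and redundancy is assumed to mean the fraction of sequenced clones that did not contribute a distinct contig; the text does not define it explicitly.

```python
# Figures taken from the Results text.
sequenced_clones = 2131
nonredundant_contigs = 641
mean_contig_bp = 500

# Assumed definition: share of clones that did not add a new contig.
redundancy = 1 - nonredundant_contigs / sequenced_clones

# Estimated coverage: number of contigs times the average contig size.
coverage_bp = nonredundant_contigs * mean_contig_bp

print(f"redundancy: {redundancy:.0%}")
print(f"coverage: {coverage_bp:,} bp")
```

Under that assumed definition, the sketch reproduces the 70% redundancy and the 320,500 bp coverage estimate reported in the text.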

Table 1 gives a detailed characterisation of the 35 pigeonpea primers that were polymorphic among the 24 diverse genotypes. Di-nucleotide repeats formed the highest proportion of polymorphic markers, followed by tri-nucleotide repeats. The number of alleles detected ranged from 2 to 6 at each of the 35 polymorphic loci, with a total of 110 alleles and an average of 3.1 alleles per locus. Gene diversity values ranged from 0.07 to 0.76, with an average of 0.41. While the TG class of repeats formed the highest proportion (40%) of all the polymorphic loci, the highest number of alleles (6) was observed for a perfect tri-nucleotide repeat (CCttc019). The most informative marker, with a polymorphism information content (PIC) of 0.76, was a tri-nucleotide compound repeat (CCttc005), which was also the longest motif.

Table 1 PCR conditions, allele sizes and core motifs of new microsatellite loci that amplified in 24 different genotypes of pigeonpea

Optimum primer conditions were established for 39 (Table 2) of the 220 (17.7%) soybean primers tested. One of the amplifying soybean primers revealed polymorphism among the 24 diverse cultivated pigeonpea accessions. The polymorphic microsatellite was an ATT repeat (ATT18) and showed 4 alleles ranging from 290 to 335 bp. Twenty-four of the amplifying soybean SSRs were genomic, while the rest were EST-SSRs (Table 2). This means that a higher proportion (25.4%) of the EST-SSRs tested amplified pigeonpea DNA, compared to 9.2% of the genomic SSRs tested. The EST-SSRs had relatively shorter motif lengths than the genomic ones. The common problems with amplification products for soybean SSRs were excess bands, smears and amplification failure.

Table 2 Characteristics of soybean microsatellite loci that amplified in pigeonpea.

Discussion

This study has resulted in the development of 112 new pigeonpea microsatellites: 73 de novo and 39 by transfer from soybean. The addition of these markers brings the number of microsatellite loci immediately available for testing in an interspecific cross to a total of 142. Because of the low genetic diversity within the cultivated species, use of an interspecific cross would be the best strategy for developing a linkage map in pigeonpea. Even though wild relatives were not included in the current investigation, a past study [4] using microsatellites gives a good indication that these markers would be more informative among wild relatives than reported here for the cultivated species alone. Primers transferred from soybean still have to be tested further to establish how informative they are for other species of the genus Cajanus. Such studies would be necessary to justify any future efforts towards the use of soybean primers in pigeonpea.

The average number of alleles per locus in this study was 3.14. Previous diversity analyses of cultivated pigeonpea reported averages of 3.10 for 10 polymorphic loci [5] and 3.4 for 9 polymorphic loci [4], which are similar to the present results. This level of diversity is lower than the 4.8 alleles per locus reported when wild relatives are included in a similar analysis [4]. Low diversity within cultivated pigeonpea has also been reported using other marker types [3]. Future breeding strategies in pigeonpea must focus on broadening the genetic base by accumulating favourable alleles from other landraces and wild populations in order to maximise gains from selection. In tomato (Lycopersicon esculentum), for example, the development of an interspecific mapping population [6] greatly enhanced molecular breeding studies despite the low genetic diversity reported [7].

Conclusion

The pigeonpea germplasm collection at ICRISAT is already benefiting from the current study, as the markers developed are being used to characterise a representative collection. These markers are expected to be beneficial in future for interspecific crosses and for comparative genome analysis between the different Cajanus species, allowing more efficient exploitation of their desirable characteristics. A larger number of markers would still be required in future to enable marker-assisted selection (MAS) in pigeonpea. With the current efforts to make DArT technology [8] available in pigeonpea [3] and the falling prices of DNA sequencing and SNP assays [9], superior markers will undoubtedly be incorporated to complement the current efforts and enhance molecular marker technology in pigeonpea.

Methods

Development of an enriched genomic library

Genomic DNA was extracted from accession ICP 2376 and purified as described by Oberhagemann and colleagues [10] for the development of a small enriched genomic library. Ten μg of DNA was digested with 50 units of Sau3AI in a volume of 100 μl overnight at 37°C. The size of the DNA fragments was monitored by separating a 10 μl aliquot of the restricted DNA on a 1.5% agarose gel. The rest of the DNA fragments were ligated by T4 DNA ligase onto MluI self-complementary adaptors RSA 21 and phosphorylated RSA 25 according to Billotte and co-workers [11]. Twenty-five ng of the adaptor-ligated DNA was amplified by PCR in a 25 μl reaction mixture that contained 1 μM RSA 21, 2 μl of 2.5 mM dNTP, 1× PCR buffer (50 mM KCl, 10 mM Tris-HCl, pH 8.3), 1.5 mM MgCl2 and 2.5 units of Taq polymerase. The PCR programme involved initial denaturation at 95°C for 1 min followed by 28 cycles of 94°C for 40 sec, 60°C for 1 min and 72°C for 2 min, and a final extension at 72°C for 5 min. Ten μl of each PCR product was viewed on a 1.2% agarose gel to confirm amplification. The rest of the PCR product was purified using QIAquick spin columns (Qiagen, Hilden, Germany) according to the manufacturer's instructions and resuspended in 100 μl of water. Microsatellite sequences were selected using biotinylated GA8, GT8, GAA8 and TAA8 oligonucleotide probes and streptavidin-coated magnetic beads, following the hybridisation-based capture methodology adapted from Billotte and co-workers [11].
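As a quick check on the reaction set-up, the cycling programme and the dNTP dilution described above can be written out as a short Python sketch. The function and variable names are illustrative and not part of the published protocol; the time computed is the sum of programmed holds only and ignores ramping between temperatures, which depends on the thermocycler.

```python
# Cycling programme from the text: (temperature in degrees C, hold in seconds).
initial_denaturation = (95, 60)
cycle_steps = [(94, 40), (60, 60), (72, 120)]
n_cycles = 28
final_extension = (72, 300)


def programmed_hold_seconds(initial, steps, cycles, final):
    """Sum of programmed hold times; ramp times are not included."""
    per_cycle = sum(seconds for _, seconds in steps)
    return initial[1] + cycles * per_cycle + final[1]


def final_concentration(stock_conc, stock_volume, total_volume):
    """Dilution rule C1 * V1 = C2 * V2, solved for C2 (units follow the inputs)."""
    return stock_conc * stock_volume / total_volume


total_seconds = programmed_hold_seconds(
    initial_denaturation, cycle_steps, n_cycles, final_extension
)
# 2 ul of 2.5 mM dNTP stock in a 25 ul reaction.
dntp_final_mM = final_concentration(2.5, 2.0, 25.0)

print(f"programmed hold time: {total_seconds / 60:.1f} min")
print(f"final dNTP concentration: {dntp_final_mM:.2f} mM")
```

The dilution helper applies equally to other reaction components, provided stock and final concentrations are expressed in the same units.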

The selected fragments were once more amplified by PCR (20 cycles), purified using the QIAquick columns (Qiagen) and adjusted to a concentration of approximately 25 ng/μl before cloning into the pGEM-T vector (Promega, Madison, USA) according to the supplier's instructions. Positive colonies were handpicked and subjected to colony PCR using T7 and SP6 primers (Metabion, Martinsried, Germany). All products with an insert size in the range of 300 to 1,100 bp were purified for sequencing using EXOSAP (Amersham Biosciences, Freiburg, Germany). Sequencing, sequence analysis and primer design were done as described elsewhere [4]. Primers were designed for 113 loci, of which 73 [see Additional file 1] amplified single discrete bands.
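The sequence analysis itself followed the procedure cited in [4] and is not reproduced here. Purely as an illustration of what screening clone sequences for perfect microsatellites involves, the following is a minimal Python sketch based on regular expressions. The function names and the minimum repeat thresholds are hypothetical and are not the criteria used in this study.

```python
import re

# Hypothetical minimum repeat counts per motif length (not the study's criteria).
DEFAULT_MIN_REPEATS = {2: 6, 3: 5, 4: 4, 5: 4, 6: 4}


def is_primitive(motif):
    """True if the motif is not itself a repeat of a shorter unit (e.g. 'GTGT')."""
    size = len(motif)
    return all(
        motif != motif[:d] * (size // d)
        for d in range(1, size)
        if size % d == 0
    )


def find_perfect_ssrs(seq, min_repeats=None):
    """Return (start, motif, repeat_count) for perfect tandem repeats in seq."""
    thresholds = min_repeats or DEFAULT_MIN_REPEATS
    seq = seq.upper()
    hits = []
    for unit_len, min_rep in thresholds.items():
        # One unit captured, then at least (min_rep - 1) further copies of it.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_rep - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            if is_primitive(motif):
                hits.append((match.start(), motif, len(match.group(0)) // unit_len))
    return sorted(hits)


# Toy sequence for demonstration only.
example = "CCGG" + "GT" * 8 + "ACCT" + "GAA" * 5 + "TT"
print(find_perfect_ssrs(example))
```

This sketch reports only perfect repeats and does not merge adjacent motifs into compound repeats or treat a motif and its rotations (GT versus TG) as one class; a dedicated SSR search tool would normally handle those cases.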

Testing of soybean (Glycine max L.) microsatellites for amplification in pigeonpea

Additionally, 161 genomic soybean SSRs (100 of which were kindly provided by Dr. Perry Cregan of the United States Department of Agriculture) that were previously mapped in soybean [12] and 59 soybean EST (expressed sequence tag)-SSRs were tested for amplification in pigeonpea, using DNA of the soybean cultivar Williams as a control. More details on the primer sequences and GenBank accession numbers of the EST sequences are also available [see Additional file 2].

Testing amplifying loci for polymorphism in pigeonpea

The amplifying loci of both soybean and pigeonpea primers were tested for polymorphism among 24 diverse pigeonpea accessions (ICP 13575, ICP 15145, ICP 9266, ICP 4167, ICP 14576, ICP 12058, ICP 14352, ICP 1514, ICP 7543, ICP 7852, ICPL 87091, ICP 7035, ICPL 151, Kat 60/8, HPL 24, LD Dwarf, ICPL 99066, MN 5, ICPL 332, ICPA 2068, ICPA 2032, ICP 13092, ICPL 87119, ICP 2376). All seeds were obtained from ICRISAT, India. PCR reactions were prepared in 10 μl volumes, and amplification was confirmed on 1.2% agarose gels by loading 5 μl of PCR product from randomly selected samples. Amplification products were thereafter visualised on non-denaturing 6% 29:1 (w/w) acrylamide/bisacrylamide gels followed by silver staining. The bands visualised on the gels were scored using a binary code for presence ("1") or absence ("0") of bands (alleles) for every SSR locus. Respective allele sizes were estimated using various commercial DNA ladders (Bioline, UK). The polymorphism information content (PIC) was calculated as described by Botstein and co-workers [13] using the formula below:

\[
PIC = 1 - \left[ \sum_{i=1}^{n} p_i^{2} \right] - \left[ \sum_{i=1}^{n-1} \sum_{j=i+1}^{n} 2\, p_i^{2}\, p_j^{2} \right]
\]

where p_i is the frequency of the i-th allele, p_j is the frequency of the j-th allele (with j running from i + 1 to n) and n is the number of alleles at the locus. Only data from polymorphic SSR loci were used for this analysis.
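The PIC calculation of Botstein and co-workers [13] can be sketched in Python as follows. This is an illustrative implementation, not the script used in the study. The helper that derives allele frequencies assumes one allele call per accession, a simplification introduced here for the example, and the allele labels in the usage lines are hypothetical.

```python
from collections import Counter


def allele_frequencies(allele_calls):
    """Relative frequency of each distinct allele in a list of calls.

    Assumes one allele call per accession (a simplification for this sketch).
    """
    counts = Counter(allele_calls)
    total = sum(counts.values())
    return [count / total for count in counts.values()]


def pic(freqs):
    """PIC = 1 - sum(p_i^2) - sum over i < j of 2 * p_i^2 * p_j^2."""
    if abs(sum(freqs) - 1.0) > 1e-9:
        raise ValueError("allele frequencies must sum to 1")
    n = len(freqs)
    sum_squares = sum(p * p for p in freqs)
    cross_term = sum(
        2 * freqs[i] ** 2 * freqs[j] ** 2
        for i in range(n - 1)
        for j in range(i + 1, n)
    )
    return 1 - sum_squares - cross_term


# Hypothetical allele calls (sizes in bp) for a single locus.
calls = ["290", "290", "305", "335"]
print(round(pic(allele_frequencies(calls)), 3))
```

For a locus with two equally frequent alleles the function returns 0.375, the known upper bound of PIC for a biallelic marker, which is a convenient sanity check.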