Introduction

Pigeonpea [Cajanus cajan (L.) Millspaugh] is one of the most suitable crops for the subsistence farming systems prevalent in the sub-tropical and semi-arid tropical regions of the world. Globally, pigeonpea is cultivated on 5.32 Mha with a total production of 4.32 Mt in Asia, Latin America and southern and eastern Africa (FAO 2013). It is an often cross-pollinated (20–70 %) diploid species (2n = 2X = 22) with a genome size of 833.07 Mbp (Varshney et al. 2012). During the last six decades, continuous efforts have been made to increase pigeonpea yield; however, only limited success has been achieved. In this scenario, hybrid technology has been found to be a promising option for increasing yield. In general, hybrids are developed by crossing two inbred lines, which requires manual emasculation. Since manual emasculation is time-consuming and not cost-effective for developing hybrids, cytoplasmic male sterile (CMS) systems were developed. The CMS system involves a CMS line (A line) and a maintainer line (B line). Furthermore, to realize hybrid vigour, a restorer line (R line) is crossed with the A line, so that only fertile F1 plants are obtained. This complete hybrid system, comprising an A line, a B line and an R line, has been termed the cytoplasmic genic male sterility (CGMS) system.

So far, a total of eight CMS systems (A1–A8) have been developed in pigeonpea (Saxena 2013; Saxena et al. 2010a, b). These CMS systems are derived from wild Cajanus species, namely, A1 CMS system from C. sericeus (Benth. ex Bak.) van der Maesen comb. nov. (Saxena et al. 1997), A2 CMS system from C. scarabaeoides (L.) Thou. (Saxena and Kumar 2003), A3 CMS system from C. volubilis (Blanco) Blanco (Wanjari et al. 1999), A4 CMS system from C. cajanifolius (Haines) van der Maesen comb. nov. (Saxena et al. 2005), A5 CMS system from C. acutifolius (F.V. Muell.) van der Maesen comb. nov. (Mallikarjuna and Saxena 2005), A6 CMS system from C. lineatus (Wight & Arn.) van der Maesen (Saxena et al. 2010a, b), A7 CMS system from C. platycarpus (Benth.) van der Maesen (Mallikarjuna et al. 2006) and most recently A8 CMS system from C. reticulatus (Dryander) F. V. Muell. (Saxena 2013). The CMS systems derived from A2 and A4 cytoplasms have been successfully used in developing commercial hybrids. It is noteworthy that the first A4 cytoplasm-based pigeonpea hybrid, ICPH 2671, showing a yield advantage of up to 45 % over the control (Saxena et al. 2010a, b), was released for commercial cultivation in Central India (Saxena et al. 2013), and many promising hybrids such as ICPH 3762 and ICPH 2740 are in the pipeline for release (personal communication). However, a number of constraints such as unstable male sterility, poor fertility restoration and lack of proper and complete restorers have been observed while utilizing other CMS systems in hybrid development.

It is now widely known that CMS in plants arises due to aberrations in the mitochondrial (mt) genome (Tuteja et al. 2013). Therefore, understanding the genetic relationships of nuclear and mitochondrial interactions at the genome level in the CGMS system is critical to hybrid pigeonpea breeding. Recently, a mitochondrial genome of 545.7 kb has been assembled for a pigeonpea A line, ICPA 2039, derived from A4 cytoplasm (Tuteja et al. 2013). However, little is known about the nature and organization of simple sequence repeats (SSRs) in the pigeonpea mitochondrial genome. Microsatellite or SSR markers are known to be abundant, hypervariable and ubiquitous across prokaryotic and eukaryotic genomes. Their high level of polymorphism, combined with their high reproducibility and co-dominant nature, has established them as the markers of choice for genetic studies and breeding applications in crops (Gupta and Varshney 2000). In the past, SSRs were developed by using a range of methods such as SSR-enriched libraries, BAC-end sequencing and mining of sequence data (Varshney et al. 2005a, b). However, in silico mining of sequence data from the nuclear or mitochondrial genome is the most cost-effective and fastest method (Rajendrakumar et al. 2007). Therefore, SSRs have been developed from the mitochondrial genomes in a number of crops such as sorghum (Nishikawa et al. 2002), rice (Nishikawa et al. 2005; Rajendrakumar et al. 2007) and cotton (Zhang et al. 2012).

In view of the importance of CMS for hybrid breeding, and the possible use of mt-genome-derived SSR markers, the present study reports identification, categorization and development of mt-genome derived SSR markers in pigeonpea. The mt-SSR markers so developed were used to understand the genetic relationships among A lines, B lines and wild Cajanus species accessions from six different CMS systems.

Materials and methods

Plant material and DNA isolation

A total of 22 genotypes including 8 A lines, 8 B lines and 6 wild species accessions were used in the present study. These 22 genotypes represented the following six different CMS systems in pigeonpea, viz., A1 (C. sericeus), A2 (C. scarabaeoides), A4 (C. cajanifolius), A5 (C. acutifolius), A6 (C. lineatus) and A8 (C. reticulatus) (Table 1).

Table 1 A list of 22 genotypes including 8 A lines, 8 B lines and 6 wild species accessions representing 6 CMS systems used in the present study

Genomic DNA was isolated from freshly harvested young leaves of 2-week-old seedlings using the standard DNA isolation protocol described in Cuc et al. (2008). The quantity and quality of the DNA samples were assessed on a 0.8 % agarose gel and the DNA was then diluted to 5 ng μL−1 for genotyping.

SSR identification and primer designing

A high-quality DNA sequence of pigeonpea mitochondrial genome ICPA 2039 (SRA053693) from GenBank (http://www.ncbi.nlm.nih.gov; Tuteja et al. 2013) was used for mining SSRs using MISA software (http://pgrc.ipk-gatersleben.de/misa/; Thiel et al. 2003). The search criteria used were mono (N) ≥ 10, di (NN) ≥ 6, tri (NNN) ≥ 5, tetra (NNNN) ≥ 5, penta (NNNNN) ≥ 5 and hexanucleotides (NNNNNN) ≥ 5 repeats as well as compound SSRs (which were interrupted by few bases).
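The search criteria above can be illustrated with a minimal Python sketch. This is not the MISA implementation; it is an assumed, simplified scanner that reports only perfect repeats at the stated thresholds and does not merge neighbouring repeats into compound SSRs. The function names and the toy sequence are hypothetical.

```python
# Minimum repeat counts per motif length, taken from the criteria in the text:
# mono >= 10, di >= 6, tri/tetra/penta/hexa >= 5.
MIN_REPEATS = {1: 10, 2: 6, 3: 5, 4: 5, 5: 5, 6: 5}


def is_primitive(motif):
    """True if the motif is not a repetition of a shorter unit (e.g. 'ATAT')."""
    n = len(motif)
    return not any(n % k == 0 and motif == motif[:k] * (n // k)
                   for k in range(1, n))


def find_ssrs(sequence, min_repeats=MIN_REPEATS):
    """Return perfect SSRs as (start, end, motif, repeat_count) tuples.

    Coordinates are 0-based and end-exclusive. Compound (interrupted)
    SSRs are NOT detected by this simplified sketch.
    """
    seq = sequence.upper()
    hits = []
    for unit_len, min_count in sorted(min_repeats.items()):
        i = 0
        while i + unit_len <= len(seq):
            motif = seq[i:i + unit_len]
            if set(motif) <= set("ACGT") and is_primitive(motif):
                # Count consecutive copies of the motif starting at i.
                count = 1
                while seq[i + count * unit_len:i + (count + 1) * unit_len] == motif:
                    count += 1
                if count >= min_count:
                    hits.append((i, i + count * unit_len, motif, count))
                    i += count * unit_len  # jump past the reported repeat
                    continue
            i += 1
    return sorted(hits)


# Hypothetical toy sequence, not taken from the pigeonpea mt genome.
toy = "GC" + "A" * 12 + "GCGC" + "TA" * 7 + "CG" + "TAA" * 5 + "GG"
for start, end, motif, count in find_ssrs(toy):
    print(f"({motif}){count} at {start}-{end}")
```

Skipping non-primitive motifs keeps a poly "A" run from being reported a second time as an "AA" dinucleotide repeat.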

Identified sequences containing SSRs were used for designing primer pairs using Primer3 software (Untergasser et al. 2012). The following criteria were used for designing primer pairs: amplicon length in the range of 100–300 bp, optimal melting temperature of 60 °C and optimal primer size of 20 bp. The remaining options were left at the default values of Primer3.

Polymerase chain reaction

Polymerase chain reaction (PCR) was carried out in a 10 µl reaction volume containing 1.0 µl of 10× PCR buffer, 1.0 µl of 2 mM dNTPs, 1.0 µl of 2 pM primer (Eurofins, Bangalore, India), 0.4 µl of 25 mM MgCl2, 0.06 U of Taq polymerase (Kappa Biosystems, Woburn, MA, USA), 1.5 µl (5 ng) of template DNA, 1.0 µl of fluorescent dyes (FAM, NED, VIC or PET) and 4.04 µl of H2O in a 96-well microtitre plate (Axygen Inc., Union City, CA, USA). The DNA fragments were then amplified using a touch-down PCR programme consisting of an initial denaturation step at 95 °C for 3 min, followed by 5 cycles, each cycle involving denaturation for 25 s at 94 °C, annealing for 20 s at 60 °C (the annealing temperature being reduced by 1 °C per cycle) and extension for 30 s at 72 °C. The touch-down phase was followed by 40 cycles, each with denaturation for 20 s at 94 °C, annealing for 20 s at 55 °C and extension for 30 s at 72 °C, and then by a final extension step at 72 °C for 20 min, using a Gene-Amp PCR System 9700 thermal cycler (Applied Biosystems, Foster City, California, USA).
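The annealing schedule implied by this touch-down programme can be written out as a short Python sketch. It assumes that the first touch-down cycle anneals at 60 °C and that the 1 °C reduction is applied from the second cycle onward; the function name and parameter names are hypothetical.

```python
def touchdown_profile(start_anneal_c=60.0, step_c=1.0, touchdown_cycles=5,
                      final_anneal_c=55.0, main_cycles=40):
    """Return the annealing temperature (deg C) of every cycle, in order.

    Assumption: the first touch-down cycle uses start_anneal_c and each
    later touch-down cycle is step_c lower than the one before it.
    """
    touchdown = [start_anneal_c - step_c * i for i in range(touchdown_cycles)]
    return touchdown + [final_anneal_c] * main_cycles


profile = touchdown_profile()
print(profile[:6])   # five touch-down cycles followed by the first main cycle
print(len(profile))  # total number of amplification cycles
```

Under these assumptions the touch-down phase ends at 56 °C, one step above the 55 °C used for the remaining 40 cycles.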

SSR fragment analysis

SSR fragment analysis was performed through capillary electrophoresis using 1 µl amplified PCR product, 7.5 µl of Hi-Di formamide, 0.05 µl of internal lane standard GeneScan 500 (Applied Biosystems) labelled with LIZ after denaturation at 95 °C for 5 min. LIZ 500 internal lane standard and the GeneScan Filter Set D were used for size fractionation of amplicons labelled with different fluorescent dyes. Capillary electrophoresis was done using ABI 3700 Genetic Analyzer (Applied Biosystems, Foster City, CA, USA). Allele sizing and scoring of the electrophoresis data was carried out using the GeneMapper 4.0 software (Applied Biosystems, Foster City, CA, USA).

Analysis of genotyping data

Allelic data recorded for each marker was subjected to Allelobin software (Prasanth et al. 2006) in order to get allele calls based on the repeat motif of each SSR. Several features of the mtSSR markers such as polymorphism information content (PIC) value, major allele frequency, and allele numbers were calculated using PowerMarker version 3.25 software (Liu and Muse 2005). DARwin version 5.0.158 (Perrier and Jacquemoud-Collet 2006) was used for principal coordinate analysis (PCoA) and construction of neighbour-joining tree.
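For readers unfamiliar with these summary statistics, the sketch below computes the number of alleles, the major allele frequency and PIC for a single marker. It assumes the commonly used PIC definition of Botstein et al. (1980), PIC = 1 − Σ p_i² − Σ_{i<j} 2 p_i² p_j², and one allele call per genotype (as expected for a haploid organellar marker). The exact estimator applied by the software is not stated in the text, and the allele sizes in the example are hypothetical.

```python
from collections import Counter
from itertools import combinations


def allele_frequencies(alleles):
    """Relative frequency of each allele; missing calls (None) are ignored."""
    calls = [a for a in alleles if a is not None]
    counts = Counter(calls)
    return {allele: n / len(calls) for allele, n in counts.items()}


def major_allele_frequency(alleles):
    """Frequency of the most common allele at the marker."""
    return max(allele_frequencies(alleles).values())


def pic(alleles):
    """Polymorphism information content (Botstein et al. 1980 definition)."""
    freqs = list(allele_frequencies(alleles).values())
    sum_sq = sum(p * p for p in freqs)
    cross = sum(2 * (p * q) ** 2 for p, q in combinations(freqs, 2))
    return 1.0 - sum_sq - cross


# Hypothetical allele sizes (bp) for one marker scored on 22 genotypes.
marker = [150] * 10 + [152] * 8 + [154] * 3 + [None]
print(len(allele_frequencies(marker)))            # number of alleles
print(round(major_allele_frequency(marker), 2))   # major allele frequency
print(round(pic(marker), 2))                      # PIC
```

A monomorphic marker has a major allele frequency of 1 and a PIC of 0, and PIC rises as alleles become more numerous and more evenly distributed.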

Results

Distribution of SSRs

The pigeonpea mitochondrial genome of ICPA 2039 (Tuteja et al. 2013) was used for mining SSRs using MIcroSAtellite (MISA) software (Thiel et al. 2003). A total of 25 SSRs were identified, with a frequency of 0.046 SSR per kb of the mitochondrial genome. The identified SSRs were designated as Cajanus cajan Mitochondrial (CcMt) SSRs. Among the CcMt SSR classes, mononucleotides constituted the major proportion at 60 % of the total SSRs identified (15 out of 25). Only two mononucleotide SSR motifs, namely poly “A” and poly “T”, were present; poly “A” was the more abundant, occurring in 9 mononucleotide SSRs, while poly “T” occurred in only 6. Dinucleotide SSR motifs were the second most abundant repeats, constituting 36 % of the total SSRs identified (9 out of 25), with the TA/AT repeat motif in six SSRs followed by TC/CT in two SSRs and AG in a solitary SSR. Only one trinucleotide SSR, with a “TAA” motif, was identified.
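The frequency and class proportions reported above follow directly from the SSR counts and the 545.7 kb assembly size given in the Introduction. A small Python check, with hypothetical variable names:

```python
GENOME_SIZE_KB = 545.7  # mitochondrial genome size of ICPA 2039 (Tuteja et al. 2013)
SSR_COUNTS = {"mononucleotide": 15, "dinucleotide": 9, "trinucleotide": 1}

total_ssrs = sum(SSR_COUNTS.values())
density_per_kb = total_ssrs / GENOME_SIZE_KB
class_percent = {name: 100 * n / total_ssrs for name, n in SSR_COUNTS.items()}

print(total_ssrs)                # 25
print(round(density_per_kb, 3))  # 0.046
print(class_percent)             # 60 %, 36 % and 4 % of all SSRs
```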

Development of SSR markers

SSR-containing sequences were used for primer pair design, and primer pairs could be designed for 24 of the 25 SSRs. Amplification conditions for all 24 primer pairs were optimized initially on two pigeonpea genotypes, viz., ICPA 2039 (A line) and ICPB 2039 (B line). All primer pairs amplified in both genotypes. Subsequently, the primer pairs for all 24 SSRs were used for assessing polymorphism in pigeonpea germplasm (Table 2).

Table 2 Details of mitochondrial SSR markers in pigeonpea

Polymorphism assessment in different CMS sources

A total of 22 genotypes consisting of A lines, B lines and wild Cajanus species representing the following six different CMS systems (Table 1) were used for assessment of polymorphism: A1 (Cajanus sericeus), A2 (C. scarabaeoides), A4 (C. cajanifolius), A5 (C. acutifolius), A6 (C. lineatus) and A8 (C. reticulatus). Out of the 24 CcMt SSR markers screened, a solitary marker (CcMt13) was found to be monomorphic. The remaining 23 SSRs showed polymorphism and generated a total of 107 alleles, with an average of 4.65 alleles per SSR marker. The number of alleles identified by the polymorphic SSR markers ranged from 2 for 5 markers (CcMt07, CcMt12, CcMt18, CcMt22 and CcMt24) to 10 for one marker (CcMt19). The PIC values for the polymorphic markers varied from 0.09 (CcMt12) to 0.84 (CcMt03), with an average of 0.52 per marker. Furthermore, the major allele frequency for these SSRs ranged from 0.21 (CcMt03) to 0.95 (CcMt12), with a mean of 0.54 (Table 3).

Table 3 Polymorphism features of pigeonpea mitochondrial SSR markers

Genetic relationships among A line, B line and wild species accessions

The allelic data generated with the 23 polymorphic CcMt SSR markers on the 22 genotypes could separate all individual genotypes and were used to calculate a genetic dissimilarity matrix, construct a dendrogram (Fig. 1) and carry out factorial analysis (Supplementary Fig. 1) using the DARwin software. Five clusters (Cl) were identified based on the dendrogram (Fig. 1). Cl I contained 8 genotypes, Cl II 4 genotypes, Cl III 2 genotypes, Cl IV 4 genotypes and Cl V 3 genotypes. However, one genotype, ICPB 2052, was not part of any cluster and formed an outgroup in the dendrogram. Cl I by and large represents genotypes related to the A1 CMS system (3 A lines, 3 B lines and the wild representative ICPW 162, which is the donor of A1 cytoplasm), except for ICPA 2052 (A2 cytoplasm). Cl II had a mixture of genotypes, i.e. ICPA 2039 (A4 cytoplasm), ICPW 29 (wild species donor of A4 cytoplasm), ICPW 89 (wild species donor of A2 cytoplasm) and ICPB 2209 (B line of A6 cytoplasm). Cl III had two genotypes, one A line and one wild species accession, both having A8 cytoplasm. Cl IV had 4 genotypes from different cytoplasms: an A line with A5 cytoplasm, two genotypes with A6 cytoplasm (ICPW 42 and ICPA 2209) and a B line of A8 cytoplasm. Cl V had 3 genotypes: the wild genotype ICPW 2 (wild species of A5 cytoplasm), a B line of A5 cytoplasm and ICPB 2039 (B line for A4 cytoplasm).
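As an illustration of how a dissimilarity matrix can be derived from such allelic data, the sketch below uses a simple matching index, i.e. the proportion of markers at which two genotypes carry different alleles. The text does not state which dissimilarity index was selected in DARwin, so the choice of index is an assumption, and the genotype names and allele sizes are hypothetical.

```python
def simple_matching_dissimilarity(profile_a, profile_b):
    """Proportion of markers with different alleles in the two genotypes.

    Markers with a missing call (None) in either genotype are skipped.
    """
    compared = [(a, b) for a, b in zip(profile_a, profile_b)
                if a is not None and b is not None]
    if not compared:
        raise ValueError("no marker scored in both genotypes")
    return sum(a != b for a, b in compared) / len(compared)


def dissimilarity_matrix(profiles):
    """Pairwise dissimilarities as a dict keyed by (name_1, name_2)."""
    names = list(profiles)
    return {(x, y): simple_matching_dissimilarity(profiles[x], profiles[y])
            for x in names for y in names}


# Hypothetical allele sizes (bp) at three markers for three genotypes.
profiles = {
    "G1": [150, 200, 310],
    "G2": [150, 202, 310],
    "G3": [152, 202, None],
}
matrix = dissimilarity_matrix(profiles)
print(matrix[("G1", "G2")], matrix[("G1", "G3")], matrix[("G2", "G3")])
```

A matrix of this kind is the usual input for neighbour-joining tree construction and principal coordinate analysis.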

Fig. 1 Clustering pattern of pigeonpea genotypes (8 A lines, 8 B lines and 6 wild species accessions) obtained from survey of 23 mitochondrial SSR markers. Genotypes with blue, red and green colours indicate A lines, B lines and wild species accessions, respectively

Discussion

Simple sequence repeats are hypervariable and available in abundance in a range of crop genomes (Gupta and Varshney 2000). The polymorphism in SSRs is believed to be the outcome of replication slippage (Moxon and Wills 1999), which commonly occurs at a higher rate than mutation in the non-repetitive DNA (Wierdl et al. 1997). Owing to their high polymorphism, co-dominance and reproducible nature, they have consequently become the marker of choice for genetic analyses such as molecular mapping, diversity studies and breeding applications in crops (Gupta and Varshney 2000). In recent years, the availability of genome sequence information has eliminated technical limitations and enabled researchers to accelerate the process of SSR development (Varshney et al. 2005b).

In the case of pigeonpea, until 2010 efforts towards SSR identification were limited to the development of SSR markers from SSR-enriched libraries (Burns et al. 2001; Odeny et al. 2007; Saxena et al. 2010a, b). However, with the advent of sequencing technologies, these efforts increased, leading to the initial identification of 3583 genic SSRs through transcriptome sequencing (Raju et al. 2010), 3072 SSRs through BAC-end sequencing (Bohra et al. 2011) and later a collection of 309,052 SSRs through draft genome sequencing (Varshney et al. 2012). The present study, however, is the first report of mt-SSRs in pigeonpea.

The present study reports a set of 25 novel mitochondrial SSRs identified through in silico analysis of the mitochondrial genome. Mononucleotide SSR motifs emerged as the major class of SSRs, followed by dinucleotide and trinucleotide motifs. In an earlier comparison of the organellar genomes of major cereals including rice, wheat, maize and sorghum, mononucleotides were also the most frequent repeats, the commonest being poly “A” and poly “T” (Rajendrakumar et al. 2008). These mononucleotide repeats were also relatively more abundant in mitochondrial genomes of algae and angiosperms (Kuntal and Sharma 2011). The frequency of SSRs in the pigeonpea mitochondrial genome observed during the present study was relatively low when compared with that in mitochondrial genomes of cereals (Rajendrakumar et al. 2007, 2008). In the case of legumes other than pigeonpea, the mitochondrial genomes of mung bean (Vigna radiata) (Alverson et al. 2011), faba bean (Negruk 2013) and soybean (Chang et al. 2013) have been sequenced. However, SSRs in the mitochondrial genomes of these legumes have yet to be reported. The low frequency of SSRs in legumes may be attributed to a low proportion of repetitive DNA in legumes relative to that in cereals and eudicots (Alverson et al. 2011; Tuteja et al. 2013).

All mtSSRs were considered for designing primer pairs. The success rate of primer design was high at 96 % (primer pairs were designed for 24 out of 25 SSRs). In the polymorphism survey using these 24 novel markers on 22 genotypes (which represent six different cytoplasms in Cajanus), 96 % of the markers (23 out of 24) showed polymorphism. This polymorphism rate is comparable to that of SSRs from the nuclear genome, for which 95 % polymorphism was observed (Odeny et al. 2007). In other studies involving nuclear SSRs, polymorphism was lower, with reports of 81.3 % (Saxena et al. 2010a, b) and 50 % (Burns et al. 2001). The high polymorphism rate observed here may be attributed to the use of wild species from the secondary gene pool of pigeonpea. The average PIC value and number of alleles were 0.52 and 4.65, respectively, which were also comparable to the values of 0.60 and 4.8 reported by Odeny et al. (2007). In another study, the average number of alleles and PIC value were 3.4 and 0.32, respectively, which are relatively low (Saxena et al. 2010a, b).

In the present study, the average PIC values of mononucleotide and dinucleotide SSRs were almost the same. However, a higher level of polymorphism for dinucleotides was reported in the past (Ashworth et al. 2004; Saxena et al. 2010a, b). Furthermore, owing to their longer repeat length, dinucleotide repeats have shown a better level of polymorphism and are in general considered better SSR markers than mononucleotide repeats (Temnykh et al. 2001). Since it is difficult to resolve polymorphism in mononucleotide SSRs on agarose gels, capillary electrophoresis was used to resolve the polymorphism for the SSR markers identified during the present study. It was interesting to note that all the 14 mononucleotide SSR markers showed polymorphism. In fact, CcMt03 (a mononucleotide SSR) showed the highest PIC of 0.84. Similar results were obtained by Saxena et al. (2010a, b), where two mononucleotide SSR markers showed a relatively high level of polymorphism. These observations suggest that in pigeonpea it may be desirable to develop and deploy mononucleotide SSRs as well.

Genotyping data were used for elucidating genetic dissimilarity among the 22 genotypes. The hypothesis used for deciphering the genetic relationships was that CMS occurs due to mitochondrial aberrations, which are maternally inherited, and that, since all the CMS systems in pigeonpea are derived from wild species, the mitochondrial variation between a CMS line and its wild progenitor would be minimal. Chimeric open reading frames (ORFs) caused by mitochondrial genome rearrangement are the leading cause of the CMS trait in plants (Hanson and Bentolila 2004). In the case of pigeonpea, a total of 13 potential chimeric ORFs were identified in the male-sterile line ICPA 2039. Out of these 13 candidate ORFs, eight were within other mitochondrial genes, while five contained parts of different mitochondrial genes (Tuteja et al. 2013).

Phylogenetic analysis grouped the 22 genotypes into 5 clusters. Clear clusters were observed for genotypes related to the CMS systems of A1 and A8 cytoplasm. This suggests that these markers could by and large distinguish differences at the level of the CMS system rather than at the genotype level. Genotypes related to the A2 and A4 CMS systems grouped together into a single cluster, which suggests that the A2 and A4 CMS systems have some homology. As mentioned earlier, in pigeonpea only these two CMS systems have been successfully utilized to develop hybrids, with ICPH 2671 based on A4 cytoplasm (Saxena et al. 2013) and IPH 09-5 based on A2 cytoplasm (Pulses Newsletter IIPR 2013). Genotypes with A5 and A6 cytoplasm were grouped together into another cluster, suggesting that these two CMS systems might have common ancestry.

In summary, the present study adds a novel set of 24 mitochondrial SSR markers to the marker repository in pigeonpea. These markers would be useful not only in differentiating A lines from their corresponding B lines, but also for the study of the origin and phylogeny of pigeonpea.