The first SSR-based genetic linkage map for cultivated groundnut (Arachis hypogaea L.)
- Cite this article as:
- Varshney, R.K., Bertioli, D.J., Moretzsohn, M.C. et al. Theor Appl Genet (2009) 118: 729. doi:10.1007/s00122-008-0933-x
Molecular markers and genetic linkage maps are prerequisites for molecular breeding in any crop species. In the case of peanut or groundnut (Arachis hypogaea L.), an amphidiploid (4X) species, however, no genetic map based on a mapping population derived from cultivated genotypes is available. To develop a genetic linkage map for tetraploid cultivated groundnut, a total of 1,145 microsatellite or simple sequence repeat (SSR) markers, available in the public domain as well as unpublished markers from several sources, were screened on two genotypes, TAG 24 and ICGV 86031, that are the parents of a recombinant inbred line mapping population. As a result, 144 (12.6%) polymorphic markers were identified and these amplified a total of 150 loci. A total of 135 SSR loci could be mapped into 22 linkage groups (LGs). While six LGs had only two SSR loci, the other LGs contained 3 (LG_AhXV) to 15 (LG_AhVIII) loci. As the mapping population used for developing the genetic map segregates for drought tolerance traits, phenotyping data obtained for transpiration, transpiration efficiency, specific leaf area and SPAD chlorophyll meter reading (SCMR) over 2 years were analyzed together with the genotyping data. Although 2–5 QTLs were identified for each of these traits, the phenotypic variation explained by these QTLs was only in the range of 3.5–14.1%. In addition, two linkage groups (LG_AhIII and LG_AhVI) of the developed genetic map were aligned with available genetic maps of the AA diploid genome of groundnut and of Lotus and Medicago. The present study reports the construction of the first genetic map for cultivated groundnut and demonstrates its utility for molecular mapping of QTLs controlling drought tolerance related traits as well as for establishing relationships with the diploid AA genome of groundnut and model legume genome species. Therefore, the map should be useful to the community for a variety of applications.
Groundnut or peanut (Arachis hypogaea L.) is an important food and cash crop for resource-poor farmers in Asia and Africa. It is primarily grown for edible oil (48–50%) as well as for direct consumption by people. In addition, groundnut haulms and groundnut cake (after oil extraction) are excellent animal feed. For the subsistence farmers, groundnut contributes significantly to household food security and cash income through the sale of groundnut products. Groundnut productivity in Western and Central Africa (WCA) and Eastern and Southern Africa (ESA) is below the world average yield of 1.55 tons/ha. Although groundnut productivity in Asia (1.8 tons/ha) exceeds the world average, it is still lower than the yields in developed countries (3 tons/ha). One of the main reasons for low productivity of this crop in these regions is the exposure of the crop to severe abiotic and biotic stresses. For instance, groundnut producing regions in WCA, ESA and Asia represent typically the semi-arid tropics (SAT) environment which is characterized by short and erratic rainfall and then long periods with virtually no rain. Water deficit is one of the most severe stresses that threaten sustainable crop production in SAT regions as the yield losses each year due to drought alone are estimated to be around US$520 million (Johansen and Nigam 1994).
Water capture by roots and water-use efficiency are two major components of the yield architecture, as defined by Passioura (1977), that are important for crops growing under water-limited environments. Water use efficiency can be considered as a drought avoidance trait, which deals with using soil water more efficiently for biomass production, therefore to “avoid” drought. Drought avoidance is considered to be the major trait of interest for expanding production to presently uncropped areas and the post-rainy fallows in SAT regions. Crop productivity per unit of water has become an important consideration in breeding programs dealing with drought. Higher water use efficiency or transpiration efficiency (TE) is therefore a major component for improving yield under water deficit. Several groundnut genotypes with higher transpiration efficiency (TE, in g of biomass per kg of water transpired) have been identified at ICRISAT. A recombinant inbred line (RIL) mapping population has been developed by crossing ICGV 86031 and TAG 24 (respectively high and low TE under the conditions in which they were tested) that segregates for TE as well as several of its surrogate traits such as specific leaf area (SLA) and SPAD chlorophyll meter reading (SCMR).
Groundnut breeders and physiologists across the world have been working to improve the yield of the crop under water deficit conditions, but the complexity of drought and the difficulty of accurately measuring plant response to it call for modern methods that can reliably identify genotypes with superior performance under stress. Recent advances in the area of crop genomics have offered tools to assist breeding (Varshney et al. 2005, 2006). Molecular markers and molecular genetic linkage maps are the prerequisites for undertaking molecular breeding activities in any crop. Such tools can speed up the introgression of beneficial traits into preferred varieties, especially for complex traits such as drought tolerance. However, for groundnut, although several hundred microsatellite markers have been developed (see Varshney et al. 2007), no molecular genetic map based on a cultivated × cultivated cross has been published to date. The main reason for this is the low level of genetic diversity present in cultivated germplasm, or at least the low level that can be detected with the tools currently available. A genetic map based on a cross of a synthetic amphidiploid (TxAG-6) and a US variety (Florunner) was developed earlier using restriction fragment length polymorphism (RFLP) loci (Burow et al. 2001). However, RFLP analysis is labor intensive and not very suitable for use in breeding programs. Therefore, several research groups have developed microsatellite or simple sequence repeat (SSR) markers (see Varshney et al. 2007), but to date these SSR markers have only been integrated into a diploid Arachis AA genome map (Moretzsohn et al. 2005). This map is based on a cross of A. duranensis, the most probable AA genome donor to cultivated groundnut (Kochert et al. 1996; Seijo et al. 2007), with a closely related species.
The aim of developing this map was to provide a reference map based on a highly polymorphic population. This high polymorphism means that a very high percentage of candidate DNA markers are informative, thus permitting their inter leveraging on an integrated map with cultivated × cultivated, cultivated × synthetic amphidiploid and even other legumes (see below). With this in mind, since its publication, the AA genome map has been enriched, at UCB/Embrapa (Brazil), with other markers, including candidate genes and Universal Legume Anchor Markers (Leg markers). Leg markers are based on PCR primers that bind conserved sequences flanking introns in legume homologues of genes present in only a single copy in the Arabidopsis genome. As such they work in a wide range of legume species and allow the alignment and integration of different genetic maps (Hougaard et al. 2008; unpublished data).
The present study was initiated to develop a molecular genetic map of groundnut based on a cultivated × cultivated mapping population and SSR markers. Furthermore, the application of this genetic map was demonstrated by mapping water use efficiency (WUE) and related surrogate traits in groundnut. In addition, it was possible to align some of the LGs of this map with the reference AA genome map and consequently with the genome sequences of the model legumes.
Material and methods
Plant material and DNA isolation
A RIL mapping population comprising 318 F8/F9 lines, developed from the cross ICGV 86031 × TAG 24, was used. DNA was extracted from the parental genotypes and the RILs according to a modified CTAB-based procedure, as described in Cuc et al. (2008).
The complete set of 318 F8/F9 lines was phenotyped for the following drought related traits in two consecutive years, 2004 and 2005: (i) transpiration (T), (ii) TE, (iii) SLA, and (iv) SCMR. The methodology for measuring these traits is given in a separate study (Krishnamurthy et al. 2007).
Marker polymorphism and analysis
Table 1 Summary of marker polymorphism between ICGV 86031 and TAG 24
Columns: number of markers screened; number of polymorphic markers; number of amplified loci
Marker sources and series listed: Ferguson et al. (2004); Mace et al. (2007); Chaet, Dal, Lup, Stylo, Ades, Amor; Cuc et al. (2008); Ah, gi, RN, ML, RI, TC, AC; Proite et al. (2007); He et al. (2003), unpublished; unpublished (S J Knapp); Hopkins et al. (1999); Nelson et al. (2006) (COS markers); Gimenes et al. (2007)
PCR reactions for all SSR markers were performed in a 10 μl reaction volume in an ABI 9700 thermal cycler (Applied Biosystems, USA), in 384-well PCR plates (Applied Biosystems, USA). The reaction mix consisted of 2 pmoles of primer, 1.5 mM MgCl2, 2 mM dNTPs, 0.1 U of Taq DNA polymerase (Qiagen, Germany) and 1X PCR buffer (Qiagen, Germany). A touchdown PCR amplification profile was used: 3 min of initial denaturation, followed by five cycles of 94°C for 20 s, 60°C for 20 s and 72°C for 30 s, with a 1°C decrease in annealing temperature per cycle, then 30 cycles of 94°C for 20 s with a constant annealing temperature (56°C) and 72°C for 30 s, followed by a final extension of 20 min at 72°C. Amplification was checked by running the PCR products on a 1.2% agarose gel.
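The annealing temperatures implied by this touchdown profile can be written out cycle by cycle. The following is a minimal sketch, assuming that the first touchdown cycle anneals at 60°C and that the 1°C decrease applies from the second cycle onward; the function name and its parameters are illustrative and do not come from the thermal cycler software.

```python
def touchdown_annealing_schedule(start_c=60.0, step_c=1.0, touchdown_cycles=5,
                                 constant_c=56.0, constant_cycles=30):
    """Return the annealing temperature (in deg C) of every PCR cycle.

    Assumption: cycle 1 anneals at start_c and each later touchdown
    cycle is step_c cooler than the one before it.
    """
    schedule = [start_c - i * step_c for i in range(touchdown_cycles)]
    # After the touchdown phase, annealing stays at a constant temperature.
    schedule.extend([constant_c] * constant_cycles)
    return schedule


schedule = touchdown_annealing_schedule()
touchdown_phase = schedule[:5]   # the five touchdown cycles
total_cycles = len(schedule)     # touchdown cycles plus constant cycles
```

Under these assumptions the last touchdown cycle already anneals at 56°C, which matches the constant annealing temperature of the following 30 cycles.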
Amplified products for the majority of SSR markers were separated by electrophoresis on 6% polyacrylamide gels and visualized by silver staining (Tegelstrom 1992). In some cases, where resolving polymorphism was difficult, PCR was done using a forward primer labeled with one of four fluorescent dyes, 6-FAM, VIC, NED, or PET (Applied Biosystems, USA). Such PCR amplicons were size-fractionated by capillary electrophoresis on an ABI 3700 automatic DNA sequencer (Applied Biosystems, USA). Allele sizing of the electrophoretic data thus obtained was done using Genescan 3.1 software (Applied Biosystems, USA) and Genotyper 3.1 (Applied Biosystems, USA).
Genotyping for the identified polymorphic markers was done on 318 F8 RILs. Marker segregation was subjected to the χ2 test to examine distortion from the expected 1:1 segregation. Linkage analysis was performed using Mapmaker Macintosh version 2.0 (Lander et al. 1987). LGs were established using a minimum LOD score of 6.0 and a maximum recombination fraction (θ) of 0.35. The most likely marker order within each LG was estimated by comparing the log-likelihood of the possible orders using multipoint analysis (“compare” command) or, for groups containing more than six markers, by the matrix correlation method using the “first order” command. The LOD score was then decreased to 3.0 in order to include new markers in the groups by two-point analysis (“group” command). The exact position of the newly included markers within each group was determined using the “try” command, which compares the maximum likelihood of each marker order after placing the markers, one by one, into every interval of the established order. The new marker orders were confirmed by permuting all adjacent triple orders (“ripple” command). Recombination fractions were converted into map distances in centimorgans (cM) using Kosambi’s mapping function.
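For reference, Kosambi’s mapping function converts a recombination fraction r into a map distance d = 25 ln[(1 + 2r)/(1 − 2r)] cM. The short sketch below only illustrates this formula; the function name is illustrative and the code is not part of Mapmaker.

```python
import math


def kosambi_cm(recombination_fraction):
    """Convert a recombination fraction (0 <= r < 0.5) to centimorgans
    with Kosambi's mapping function: d = 25 * ln((1 + 2r) / (1 - 2r))."""
    r = recombination_fraction
    if not 0.0 <= r < 0.5:
        raise ValueError("recombination fraction must be in [0, 0.5)")
    return 25.0 * math.log((1.0 + 2.0 * r) / (1.0 - 2.0 * r))


# Distance corresponding to the maximum recombination fraction used
# for grouping (theta = 0.35) and to a tightly linked pair (r = 0.01).
distance_at_grouping_limit = kosambi_cm(0.35)
distance_tight_linkage = kosambi_cm(0.01)
```

For small r the function is close to d ≈ 100r, so a recombination fraction of 0.01 corresponds to about 1 cM, whereas the distance grows faster than 100r as r approaches 0.5.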
Quantitative trait locus (QTL) analysis
Genotyping data and phenotyping data obtained for T, TE, SLA and SCMR were analyzed for QTL mapping by composite interval mapping (CIM), proposed by Zeng (1993, 1994), in WinQTL Cartographer, version 2.5 (Wang et al. 2007). CIM analysis was performed using Model 6, scanning the genetic map and estimating the likelihood of a QTL and its corresponding effects at every 1 cM, while using significant marker cofactors to adjust for the phenotypic effects associated with other positions in the genetic map. The number of marker cofactors for the background control was set by forward–backward stepwise regression. A window size of 10 cM was used, and therefore cofactors within 10 cM on either side of the QTL test site were not included in the QTL model. Thresholds were determined by permutation tests (Churchill and Doerge 1994; Doerge and Churchill 1996), using 1,000 permutations and a significance level of 0.05. Graphic presentations of the LGs and the significant QTLs were prepared using MapChart, version 2.1 (Voorrips 2002).
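The logic of a permutation threshold can be illustrated with a much simpler model than CIM. The sketch below is not the WinQTL Cartographer procedure: it assumes a single-marker regression LOD instead of CIM, and it uses simulated genotypes and phenotypes (60 hypothetical lines, 10 unlinked markers, one simulated effect) purely for illustration. Phenotypes are shuffled relative to genotypes, the genome-wide maximum LOD is recorded for each shuffle, and the threshold is taken near the 95th percentile of these maxima.

```python
import math
import random


def marker_lod(genotypes, phenotypes):
    """Single-marker LOD: (n/2) * log10(RSS_null / RSS_marker)."""
    n = len(phenotypes)
    mean_all = sum(phenotypes) / n
    rss_null = sum((y - mean_all) ** 2 for y in phenotypes)
    rss_marker = 0.0
    for allele in (0, 1):  # two homozygous classes in a RIL population
        group = [y for g, y in zip(genotypes, phenotypes) if g == allele]
        if group:
            mean_group = sum(group) / len(group)
            rss_marker += sum((y - mean_group) ** 2 for y in group)
    if rss_marker == 0.0:
        return float("inf")
    return (n / 2.0) * math.log10(rss_null / rss_marker)


def permutation_threshold(markers, phenotypes, n_perm=1000, alpha=0.05, seed=0):
    """Approximate (1 - alpha) quantile of the genome-wide maximum LOD
    obtained after shuffling phenotypes relative to genotypes."""
    rng = random.Random(seed)
    shuffled = list(phenotypes)
    null_maxima = []
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        null_maxima.append(max(marker_lod(m, shuffled) for m in markers))
    null_maxima.sort()
    index = min(n_perm - 1, int((1.0 - alpha) * n_perm))
    return null_maxima[index]


# Simulated example data (hypothetical, not the groundnut data set).
rng = random.Random(42)
n_lines, n_markers = 60, 10
markers = [[rng.randint(0, 1) for _ in range(n_lines)] for _ in range(n_markers)]
phenotypes = [rng.gauss(0.0, 1.0) + 0.8 * markers[3][i] for i in range(n_lines)]

threshold = permutation_threshold(markers, phenotypes)
observed = [marker_lod(m, phenotypes) for m in markers]
```

Because the threshold is derived from the maximum over all markers in each permutation, it controls the genome-wide rather than the per-marker error rate, which is the reason for using permutations instead of a fixed LOD cut-off.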
Some SSR markers that were mapped onto the genetic map in this study and in the diploid AA genome map (Moretzsohn et al. 2005) were the same. Selected markers mapped in this study that had not been screened earlier for polymorphisms in the AA genome parentals were screened and, wherever possible, genotyped and mapped onto AA genetic map using the same methodology as described above.
Identification of ESTs from multiple legume species, usually Lotus, soya and Medicago, with single strong BLAST hits against all predicted Arabidopsis proteins and the alignment of these ESTs.
Alignment of ESTs to a corresponding genomic region from Lotus or Medicago and inference of intron positions.
Identification of conserved intron-flanking sequences, and design of primers to bind these conserved sequences.
Markers to unique sequences within a genome facilitate the comparison of genetic maps, and genes that are single copy in Arabidopsis have a high probability of being single copy in legume genomes.
Introns are more variable than coding regions, and therefore they are better for marker development.
Primers that bind to sequences that are conserved are more likely to be transferable to other species.
The primers were used in PCR with the progenitors of the Arachis mapping population. Polymorphisms were identified by size- or sequence variation. In the latter case, most markers developed were cleaved amplified polymorphic sequences (CAPS) or dCAPS (Neff et al. 2002; Hougaard et al. 2008; unpublished data).
The methodology for determining synteny of the AA genome map with Lotus and Medicago will be described in detail elsewhere. Briefly, all legume anchor markers (Leg markers; Fredslund et al. 2005, 2006a, b) and most other markers mapped in the AA genome were sequence characterized. These sequences were used in BLAST as queries against the Lotus database from Kazusa DNA Research Institute (Japan), and against the pseudomolecules of Medicago using CViT blast (Chromosome Visualization Tool, http://www.medicago.org/genome/cvit_blast.php). For Lotus, genetic positions were available for most transformation-competent artificial chromosome (TAC)/bacterial artificial chromosome (BAC) clones (Sato et al. 2008), but where necessary, TAC/BACs were sequenced and microsatellite markers were developed for genotyping and mapping in Gifu × MG-20 and/or in L. filicaulis × L. japonicus Gifu (Sandal et al. 2006). All map positions are given with respect to the former map.
Results and discussion
A total of 1,145 SSR markers, available in the public domain as well as unpublished markers, were screened on ICGV 86031 and TAG 24 (Table 1), and 144 markers showed polymorphism between these genotypes. The very low level of polymorphism (12.6%) observed in the present study is not unexpected, as similar levels of polymorphism have been observed in several other studies (see Varshney et al. 2007). The low level of genetic polymorphism in cultivated groundnut has been attributed to its origin from a single polyploidization event that occurred relatively recently on an evolutionary time scale (Young et al. 1996). However, the marker techniques used to date may also have contributed to the low levels of molecular polymorphism observed. This emphasizes the urgent need to develop a critical mass of highly polymorphic molecular markers in groundnut. Indeed, development of SSR markers from longer SSR-enriched libraries and BAC-end sequences, and of SNP (single nucleotide polymorphism) markers using next-generation sequencing technologies, is underway in several laboratories, including the Embrapa Recursos Genéticos e Biotecnologia/Universidade Católica de Brasília (UCB) (Brazil), University of Georgia (USA) and University of California-Davis (USA).
All identified 144 polymorphic markers were used for genotyping 318 RILs of the mapping population. While genotyping the mapping population, segregation data were scored for one locus for 139 markers, for two loci for four markers (pPGSseq9H8, IPAHM108, PM733, GM635) and for three loci for one marker (TC3G01). As a result, segregation data were obtained for a total of 150 SSR loci. Amplification of more than one fragment by primer pair/marker in groundnut has been reported in earlier studies (Hopkins et al. 1999; Krishna et al. 2004; Kottapalli et al. 2007). In these studies observation of more than one fragment per marker has been attributed to either amplification of duplicated loci or different loci, because of the tetraploid genome.
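The counts reported above are internally consistent, as the following short check shows. It uses only the figures given in the text; the variable names are illustrative.

```python
# Figures taken from the text: 1,145 markers screened, 144 polymorphic.
markers_screened = 1145
polymorphic_markers = 144
polymorphism_pct = 100.0 * polymorphic_markers / markers_screened

# Number of markers scored for one, two and three loci, respectively.
markers_by_loci_scored = {1: 139, 2: 4, 3: 1}
total_markers = sum(markers_by_loci_scored.values())
total_loci = sum(n_loci * n_markers
                 for n_loci, n_markers in markers_by_loci_scored.items())
```

Here `total_markers` recovers the 144 polymorphic markers, `total_loci` the 150 SSR loci, and `polymorphism_pct` rounds to the reported 12.6%.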
Genetic map for cultivated groundnut
Genotyping data obtained for all 150 loci were checked for segregation ratio using the χ2 test. A total of 93 loci showed the expected 1:1 segregation ratio (P < 0.05) and were initially used to establish the LGs. Using a minimum LOD score of 6.0 and a maximum recombination fraction (θ) of 0.35, 84 marker loci were mapped into 20 LGs. The LOD score was then decreased to 3.0 in order to include other SSR loci (mainly markers that showed segregation distortion) by two-point analysis. As a result, an additional 51 SSR loci could be integrated and two new LGs were formed: LG_AhXI, in which five of the six mapped loci were distorted, and LG_AhXVII, composed of two SSR loci, one of which was distorted. Thus, in total, 135 loci were integrated into 22 LGs, covering 1,270.5 cM of total map distance.
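A χ2 test for 1:1 segregation at a single locus can be sketched as follows. The allele counts in the example are hypothetical and are not taken from the data set; the sketch also assumes the usual convention that a locus is treated as distorted when the P value falls below the chosen significance level. With one degree of freedom, the P value can be computed from the complementary error function in the standard library.

```python
import math


def segregation_chi2(count_a, count_b):
    """Chi-square test (1 d.f.) of two genotype classes against 1:1.

    Returns the chi-square statistic and its P value. For 1 d.f. the
    upper-tail probability equals erfc(sqrt(chi2 / 2)).
    """
    total = count_a + count_b
    expected = total / 2.0
    chi2 = ((count_a - expected) ** 2 + (count_b - expected) ** 2) / expected
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical counts of lines carrying each parental allele (318 RILs).
chi2_balanced, p_balanced = segregation_chi2(170, 148)
chi2_skewed, p_skewed = segregation_chi2(200, 118)

alpha = 0.05  # assumed significance level
balanced_is_distorted = p_balanced < alpha
skewed_is_distorted = p_skewed < alpha
```

In practice missing genotype calls mean that the two counts at a locus need not sum to the full population size, which the function handles because the expected count is derived from the observed total.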
To the best of our knowledge, this is the first genetic map of groundnut based only on cultivated genotypes. Although a genetic map for the tetraploid groundnut genome was developed earlier (Burow et al. 2001), it was based on RFLP markers scored on 78 BC1F1 lines derived from a cross of TxAG-6, a synthetic amphidiploid (Simpson et al. 1993), with Florunner. Because different marker types were used by Burow et al. (2001) and in this study, a direct comparison cannot be made between these two maps.
In contrast, the map of the AA genome of Arachis (Moretzsohn et al. 2005 and unpublished data) was developed using SSR markers. Mapping of common markers thus allowed the alignment of these two maps in some regions (see later). As SSR markers are the markers of choice for plant geneticists and breeders (Gupta and Varshney 2000) and a larger number of SSR markers are available for groundnut (see Varshney et al. 2007; Table 1), it is anticipated that future groundnut genetic maps will involve SSR markers. Therefore, the developed SSR-genetic map of cultivated groundnut should be very useful to the community to compare the future genetic maps of groundnut with the map developed in the present study.
Trait phenotyping and QTL analysis
The parental genotypes of the mapping population, ICGV 86031 and TAG 24, were found to show variation in transpiration (T), TE and its surrogate traits, SLA and SCMR (Serraj et al. 2004; Nautiyal et al. 2002). Therefore, all 318 RILs were phenotyped for these traits in two consecutive years at ICRISAT, Patancheru. Phenotyping of RILs for T, TE and the surrogate traits over 2 years showed, overall, fairly good consistency across seasons/years/watering regimes, even though the range of variation among RILs was much lower in one season and the evaporative demand differed between the two seasons. Details about the phenotyping data and the reasons for the range of variation in different seasons are given elsewhere (Krishnamurthy et al. 2007).
Table 2 Trait phenotyping data on ICGV 86031 and TAG 24 and their mapping population
Column recovered: variation in RILs
Traits listed: transpiration (T, kg); transpiration efficiency (TE, g kg−1); specific leaf area (SLA, cm2 g−1) at start of stress; SLA (cm2 g−1) at harvest; SCMR at start of treatment; SCMR after 5 days of treatment; SCMR after 1 week of treatment; SCMR after 10 days of treatment; SCMR after 15 days of treatment; SCMR at harvest
Table 3 Quantitative trait loci for drought tolerance-related traits identified by the composite interval mapping (CIM) method
Columns recovered: highest LOD score (threshold); phenotypic variation (r2 %)
Traits listed: transpiration efficiency (TE); specific leaf area (SLA); SPAD chlorophyll meter reading (SCMR); SPAD at stage of harvest
SLA was measured at the start of drought stress imposition as well as at the time of harvest and showed variation with moderate levels of heritability in both years (Table 2). QTL analysis of SLA at the start of drought stress imposition identified five QTLs in 2004 and four QTLs in 2005. For SLA measured at the time of harvest, however, two QTLs were identified in 2004 and three in 2005. The phenotypic variation contributed by these QTLs ranged from 3.5 to 17.6%. As Krishnamurthy et al. (2007) did not find any relation of SLA with TE, the QTLs for SLA are not of much importance.
SCMR at the start of stress imposition in both seasons, at 7 and 10 days after imposing the stress in 2004, and at 5, 10 and 15 days after imposing the stress in 2005 showed large and significant variation among RILs (Table 2). Indeed, the heritability values observed for SCMR were the highest among all the traits studied, particularly during 2005. For each season's data, eight QTLs were identified for SCMR measured at different time points. However, as for the other traits mentioned above, the phenotypic variation explained by these QTLs was low, in the range of 2.9–11.0% (Table 3).
Alleles with moderate additive effects were identified for most of the evaluated traits. These alleles, which should confer more tolerance to drought, were derived from both the tolerant (positive additive effect) and the susceptible (negative effect) parents (Table 3). Trait-improving alleles derived from agronomically inferior parents have been identified in several plant species (Xiao et al. 1998; Frary et al. 2004; Wang et al. 2004; Yoon et al. 2006).
To the best of our knowledge, this is the first report on identification of QTLs for drought related traits in groundnut. As a result, the QTLs identified in this study cannot be compared with those from other studies in groundnut. It is, however, important to mention that although several QTLs were identified for each trait in both seasons, none of them explained enough phenotypic variation to be used for marker assisted breeding. Given the highly polygenic nature of the traits analyzed (Krishnamurthy et al. 2007) and the relatively high number of progenies, it is not surprising to obtain QTLs explaining low phenotypic variation (R2 values). Based on QTL mapping studies in other species, it can be generalized that high phenotypic variation for the given trait in the mapping population and genotyping data of high or reasonable marker density are prerequisites for identifying major QTLs that explain higher phenotypic variation. In the present study, however, the range of variation for the targeted traits was not very high in the RILs, and the marker density on the developed genetic map is also not very satisfactory. For instance, the range of TE values was only between 2.60 and 3.55 g kg−1 water transpired in 2004 and between 1.92 and 2.36 g kg−1 water transpired in 2005. The marker density on this genetic map will be improved further by integrating more polymorphic markers. It may then be possible to identify more (and major) QTLs for different traits that explain higher phenotypic variation. Work is also in progress on methods to capture larger variation for some of the phenotypic traits.
Comparative maps with Arachis and model legume genomes
The present study reports the development of the first genetic map for cultivated groundnut after screening a large number of SSR markers, available in the public domain as well as unpublished ones. The low level of polymorphism observed in the present study, as in earlier studies, emphasises the need to develop a critical mass of polymorphic (SSR and SNP) markers, so that cultivated groundnut genetic maps with reasonable marker density can be developed in future. The present study also demonstrates the application of the developed genetic map for identification of QTLs for drought tolerance related traits and for comparative mapping. In summary, the developed genetic map should be useful for the groundnut community to align future genetic maps with it, and to transfer sequence information from model legume species like Lotus and Medicago, both for enhancing the knowledge of comparative genome evolution of legumes and for groundnut improvement.
The authors are thankful to Mr A Gafoor and Mr G Somaraju for their help in conducting some experiments and collecting data. Financial support for this study from the National Fund of the Indian Council of Agricultural Research (NBFSRA), New Delhi, India and the Generation Challenge Programme (http://www.generationcp.org) of CGIAR is gratefully acknowledged.
This article is distributed under the terms of the Creative Commons Attribution Noncommercial License which permits any noncommercial use, distribution, and reproduction in any medium, provided the original author(s) and source are credited.