Construction of an integrated consensus map of the apple genome based on four mapping populations
- Cite this article as:
- N’Diaye, A., Van de Weg, W.E., Kodde, L.P. et al. Tree Genetics & Genomes (2008) 4: 727. doi:10.1007/s11295-008-0146-0
An integrated consensus genetic map for apple was constructed on the basis of segregation data from four genetically connected crosses (C1 = Discovery × TN10-8, C2 = Fiesta × Discovery, C3 = Discovery × Prima, C4 = Durello di Forli × Fiesta) with a total of 676 individuals using CarthaGene® software. First, integrated female–male maps were built for each population using common female–male simple sequence repeat markers (SSRs). Then, common SSRs over populations were used for the consensus map integration. The integrated consensus map consists of 1,046 markers, of which 159 are SSR markers, distributed over 17 linkage groups reflecting the basic chromosome number of apple. The total length of the integrated consensus map was 1,032 cM with a mean distance between adjacent loci of 1.1 cM. Markers were proportionally distributed over the 17 linkage groups (χ2 = 16.53, df = 16, p = 0.41). A non-uniform marker distribution was observed within all of the linkage groups (LGs). Clustering of markers at the same position (within a 1-cM window) was observed throughout LGs and consisted predominantly of only two to three linked markers. The four integrated female–male maps showed a very good colinearity in marker order for their common markers, except for only two (CH01h01, CH05g03) and three (CH05a02z, NZ02b01, Lap-1) markers on LG17 and LG15, respectively. This integrated consensus map provides a framework for performing quantitative trait locus (QTL) detection in a multi-population design and evaluating the genetic background effect on QTL expression.
Keywords: Malus × domestica Borkh.; Integrated consensus genetic map; Female–male map; SSR marker
Several complete or partial genetic linkage maps of apple have already been published. They were used to study sex-related differences in recombination rates (Hemmat et al. 1994) and to locate genes of interest such as scab resistance major genes (Gardiner et al. 1996; Gianfranceschi et al. 1996; Maliepaard et al. 1998; Vinatzer et al. 2001; Xu and Korban 2002; Hemmat et al. 2003; Belfanti et al. 2004; Gygax et al. 2004; Bus et al. 2005) and quantitative trait loci (QTLs; Durel et al. 2003; Liebhard et al. 2003b; Calenge et al. 2004), the rosy leaf curling aphid resistance gene (Roche et al. 1997), the self-incompatibility locus (Maliepaard et al. 1997), QTLs for growth and development (Conner et al. 1998; Kenis and Keulemans 2007; Segura et al. 2007), QTLs for fruit skin color (Cheng et al. 1996), fruit quality (King et al. 2000; Costa et al. 2005), and vitamin C content (Davey et al. 2006). Liebhard et al. (2003a) constructed a saturated map for the apple genome, which consisted of different types of markers: amplified fragment length polymorphism (AFLP), random amplification of polymorphic DNA (RAPD), simple sequence repeat (SSR), and sequence-characterized amplified region (SCAR) markers. Silfverberg-Dilworth et al. (2006) constructed a map, which is currently used as the reference map, by adding a new set of SSR markers to the previous linkage map. A new map including several major genes and summarizing the positions of 247 SSRs was also recently published by Fernandez-Fernandez et al. (2008).
Each of these maps was based on a single population. In particular, QTLs for scab resistance have been identified in the following three single apple progenies: Prima × Fiesta (Durel et al. 2003), Fiesta × Discovery (Liebhard et al. 2003b), and Discovery × TN10-8 (Calenge et al. 2004). Similar QTL mapping studies have been undertaken in three other genetically related progenies (unpublished results) within the framework of the European Durable Apple Resistance in Europe (DARE) project (Lespinasse et al. 2000a). The map intervals of these QTLs are still wide, whereas efficient marker-assisted selection requires more precise positions. Integrated analysis of all progenies may narrow down the QTLs’ confidence intervals. To carry out such an integrated analysis, one single integrated consensus map is required.
Integrated consensus maps have been constructed for many cultivated plant species like loblolly pine (Sewell et al. 1999), soybean (Cregan et al. 1999; Song et al. 2004), rapeseed (Lombard and Delourme 2001), wheat (Daryl et al. 2004), grapevine (Doligez et al. 2006), and lettuce (Truco et al. 2007). Composite maps derived from multiple populations provide several advantages. In particular, they increase marker density and genome coverage, and the order of common markers is more precise. Integrated consensus maps were used in comparative studies among related species in assigning linkage groups to chromosomes (Beavis and Grant 1991) and have provided evidence of chromosomal rearrangements between homologous linkage groups (Dirlewanger et al. 2004; Pelgas et al. 2006; Nicolas et al. 2007). Integrated consensus maps were also used for QTL detection in multiple populations (Symonds et al. 2005; Blanc et al. 2006).
So far, an integrated consensus map from different mapping populations has not been constructed for apple. However, such a map could potentially lead to more precise estimates of the effects and location of a common QTL and could be used to examine differences in QTL effects in different populations (Walling et al. 2000; Li et al. 2005), whereas single population-based QTL analyses do not provide direct comparisons of a QTL among different populations. In addition, pooled analysis on the basis of integrated consensus map is expected to increase the power of QTL detection (Rebai and Goffinet 1993; Walling et al. 2000; Blanc et al. 2006). The integration of mapping data from several crosses on a single integrated map would be useful to determine the relative positions of transferable markers. Finally, the merging together of a large number of markers belonging to different maps into a single high density map could facilitate the development of markers that are tightly linked to any QTL or major genes of interest.
Here, we report on the integration of the genetic maps from four separated, but genetically connected, populations. The aim was to get a single framework map in order to further perform integrated QTL analyses in the multi-population design.
Materials and methods
Four F1 populations, with a total number of 676 individuals, were used to construct an integrated consensus map of the apple genome. F1 populations were considered for mapping since apple is an outbreeding plant species (Maliepaard et al. 1997). The first population (C1) was derived from the cross “Discovery × TN10-8” and consisted of 149 genotypes. The second population (C2) was derived from “Fiesta × Discovery” and consisted of 204 genotypes. The third population (C3) was derived from “Discovery × Prima” and consisted of 149 genotypes. The fourth population (C4) was derived from “Durello di Forli × Fiesta” and consisted of 174 genotypes. The C1, C2, C3, and C4 populations were obtained at INRA-Angers (France), PRI-Wageningen (The Netherlands), BAZ-Dresden (Germany), and UNIBO-Bologna (Italy), respectively. They belonged to a half-diallel design involving the five parents: Discovery, Durello di Forli, Fiesta, Prima, and TN10-8.
Available linkage mapping data
Marker data were generated and initially mapped by different partners during the European DARE project (Lespinasse et al. 2000a) for the four populations, except the C3 population, for which most of the SSR markers were generated after DARE. The segregation datasets consisted predominantly of AFLP and SSR markers, but a few isozyme, RAPD, and nucleotide binding site-leucine-rich repeat (NBS-LRR) loci were also available, depending on the population.
AFLP reactions were performed in each population as described in Vos et al. (1995), with minor modifications according to the partners. For the C3 population, AFLP amplification was adapted to a dual-laser automated DNA sequencer (LI-COR 4200). A multiplexed PCR reaction with infrared-dye-labeled primers of two different wavelengths as described by Myburg et al. (2001) was performed using a total amplification volume of 10 μl. At least ten common AFLP primer pairs (E32M48, E32M50, E32M58, E32M59, E33M47, E33M50, E34M57, E34M58, E34M60, and E35M47) defined by Keygene (http://www.keygene.com) were surveyed in each individual population. This common set was the series of ten primer pairs out of 37 that gave the best overall genome coverage and the least clustering in the C2 population. Amplicons were visualized by means of radioactive labeling (C2 population), silver staining (C1 and C4 populations), or infrared-dye labeling (C3 population). AFLP markers were named according to primer name and amplicon size (e.g., E34M53-282), but when the amplicon size was not available, the number of the AFLP band on the gel was added to the primer name (e.g., E39M62-n16). Sizes were estimated by means of external ladders, depending on the partner: 100-bp ladder Sigma Aldrich™ for C1, 10-bp ladder SequaMark™ (Research Genetics Inc) for C2, and 100-bp ladder Invitrogen™ for C4. In some cases, notably in the C1 population, the EcoRI primer was extended by a fourth nucleotide in order to increase selective amplification (e.g., E32AM59-183, A being the additional nucleotide).
SSR reactions were performed as described in Liebhard et al. (2002), with minor modifications according to the partner. In the framework of the current survey, additional SSR markers were tested to increase the number of bridges between homologous linkage groups (LGs) of the four populations. Thus, all LGs from individual parental maps contained at least two common markers, allowing correct orientation of all LGs relative to the other homologous LGs.
Identification of Malus NBS-LRR resistance gene analogs (RGAs) and their mapping in the C3 population were described by Thiermann (2002). Briefly, multiple combinations of degenerate primers designed from motifs in the NBS of group 1- (Toll/interleukin receptor-like region) and group 2- (coiled coil domain) type NBS-LRR genes (Leister et al. 1996; Pan et al. 2000) were used to amplify specific RGA sequences. About 40 NBS-LRR-RGAs were finally mapped in the C3 map by using the SSCP technique.
Construction of the genetic maps for each individual population
Linkage maps were reconstructed using the software CarthaGene version 0.999R (De Givry et al. 2005; http://www.inra.fr/mia/T/CarthaGene/). CarthaGene permits the ordering of a large set of markers, derived from different populations, using a true multipoint maximum-likelihood criterion. First, framework individual parental maps were built independently within each cross using a double pseudo-testcross strategy (Grattapaglia and Sederoff 1994). AFLP markers that were not informative enough (scored on less than 50% of individuals within a single progeny, or <ab × ab> dominant segregation type) were discarded. When AFLP markers clustered exactly at the same position, only the one scored on the greatest number of individuals was maintained for analysis. In addition, because the goal was not to get an ultra-dense integrated consensus map but to generate a framework map for further integrated QTL detections, AFLP markers separated by less than 5 cM were also dropped. For each individual parental map, the phase was inferred by duplicating all markers, converting all duplicated markers into “mirror” markers (i.e., inverting the coding of all duplicated markers: “0” into “1” and “1” into “0”), building the linkage groups from all available markers (i.e., original and mirror markers), and finally choosing one out of the two mirror linkage groups for each pair of the 17 so-built linkage groups. This strategy simply takes advantage of recombination frequencies among all pairs of markers (original and inverted ones) to infer the phase of the significantly linked markers. Then, an integrated female–male map was constructed for each population using common SSR markers as “bridge markers”.
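The mirror-marker strategy for phase inference described above can be illustrated with a minimal Python sketch. It is not CarthaGene code: the function names, the 0/1/None coding of pseudo-testcross genotypes, the `_m` suffix for mirror markers, and the two-point threshold `max_r = 0.25` are all assumptions made for illustration. Markers are linked when their pairwise recombination fraction is low, so each linkage group appears twice, once in each mirror orientation, and one of the two is retained.

```python
from itertools import combinations


def mirror(genotypes):
    """Invert a 0/1 coding ("0" into "1", "1" into "0"); None is a missing score."""
    return [None if g is None else 1 - g for g in genotypes]


def recomb_fraction(a, b):
    """Share of individuals scored at both markers whose codes differ."""
    pairs = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    if not pairs:
        return None
    return sum(x != y for x, y in pairs) / len(pairs)


def linked_groups(markers, max_r=0.25):
    """Group original and mirror markers by pairwise linkage (union-find).

    markers: dict mapping marker name -> list of 0/1/None scores.
    max_r is a hypothetical threshold, not a value from the paper.
    """
    expanded = dict(markers)
    expanded.update({name + "_m": mirror(g) for name, g in markers.items()})
    parent = {name: name for name in expanded}

    def find(name):
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    for a, b in combinations(expanded, 2):
        r = recomb_fraction(expanded[a], expanded[b])
        if r is not None and r <= max_r:
            parent[find(a)] = find(b)

    groups = {}
    for name in expanded:
        groups.setdefault(find(name), set()).add(name)
    return list(groups.values())


# Two markers linked in repulsion: M2 joins M1 only through its mirror copy.
example = {
    "M1": [0, 0, 1, 1, 0, 1, 0, 1],
    "M2": [1, 1, 0, 0, 1, 0, 1, 1],
}
groups = linked_groups(example)
```

In this toy example the two resulting groups are mirror images of each other ({M1, M2_m} and {M1_m, M2}); keeping either one fixes the relative phase of M1 and M2.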
For individual parental and integrated female–male maps, LGs were built with a minimum logarithm of odds (LOD) score of 3.0 and a maximum distance of 30 cM. Map distances were calculated using the Kosambi function (Kosambi 1944). As described by Doligez et al. (2006), a raw marker order within LG was first determined using a heuristic procedure that incrementally includes each marker by determining its insertion point as the one that yielded the highest loglikelihood (“build 5” command). This raw marker order was then improved using an optimization algorithm called “taboo search” with the “greedy 3 1 1 15” command. This “greedy” command tries to improve the loglikelihood of the best-known order for the markers using a dedicated search algorithm. Here, the search was repeated three times, with minimum and maximum length of the taboo list varying stochastically during the search from 1% to 15% of the neighborhood marker list (http://www.inra.fr/mia/T/CarthaGene/). Finally, local marker order was refined by testing all possible marker orders within a sliding window of size 5, using the “flips 5 2 1” command. Once the two individual parental maps were built for a given population, the integrated female–male genetic map was constructed by using the “dsmergen” command of CarthaGene as described below.
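As a worked illustration of the grouping thresholds above (LOD ≥ 3.0, distance ≤ 30 cM, Kosambi distances), the following Python sketch computes a Kosambi map distance and a two-point LOD score from a count of recombinants. It assumes phase-known, fully informative meioses and uses hypothetical function names; CarthaGene's multipoint likelihood computations are not reproduced here.

```python
import math


def kosambi_cm(r):
    """Kosambi map distance in cM for a recombination fraction 0 <= r < 0.5."""
    if not 0 <= r < 0.5:
        raise ValueError("r must satisfy 0 <= r < 0.5")
    return 25.0 * math.log((1 + 2 * r) / (1 - 2 * r))


def two_point_lod(recombinants, n):
    """LOD of linkage at the estimate r = recombinants / n versus r = 0.5."""
    r = recombinants / n
    if r >= 0.5:
        return 0.0
    loglik = 0.0
    if recombinants > 0:
        loglik += recombinants * math.log10(r)
    if n - recombinants > 0:
        loglik += (n - recombinants) * math.log10(1 - r)
    return loglik - n * math.log10(0.5)


def passes_thresholds(recombinants, n, min_lod=3.0, max_cm=30.0):
    """Apply the two grouping thresholds quoted in the text to one marker pair."""
    r = recombinants / n
    if r >= 0.5:
        return False
    return two_point_lod(recombinants, n) >= min_lod and kosambi_cm(r) <= max_cm


# 10 recombinants among 100 meioses: r = 0.10, about 10.1 cM (Kosambi).
distance = kosambi_cm(10 / 100)
lod = two_point_lod(10, 100)
```

With 10 recombinants out of 100 meioses the pair clears both thresholds, whereas 0 recombinants out of only 5 meioses gives a LOD of about 1.5 and would not.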
We discarded loci showing inconsistent positions between parental and integrated female–male maps within each progeny, i.e., loci mapped in the middle of an LG in one parent but at the end of the LG in the female–male map (which was assumed to be due to genotyping errors). We also removed all loci mapping at the end of any LG (more than 10 cM away from any other marker) and all loci that caused large increases in the distance between flanking loci.
Comparison of individual parental maps
A chi-square homogeneity test was performed to compare the recombination rates between individual parental maps. Analyses were performed only for Discovery and Fiesta that are of great interest in apple breeding programs and involved in three and two crosses, respectively. For the sake of simplicity, only pairs of adjacent SSR markers that were simultaneously common to the homologous LGs and linked at a LOD ≥ 2 were analyzed. Three LGs (LG11, LG12, and LG15) randomly chosen were surveyed. Because of missing data, the number of individuals involved in each recombination rate was not constant within a mapping population. This could lead to discrepancies in chi-square-estimated values from one marker pair to another.
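A minimal sketch of such a homogeneity test for one marker pair is given below, assuming that each population contributes a count of recombinant and non-recombinant individuals. The function name and the input layout are illustrative assumptions; the degrees of freedom equal the number of populations minus one, so df = 2 for the three Discovery crosses and df = 1 for the two Fiesta crosses.

```python
# Critical chi-square values at alpha = 5% for the two cases in the text.
CHI2_CRITICAL_5PCT = {1: 3.841, 2: 5.991}


def recombination_homogeneity_chi2(counts):
    """Chi-square homogeneity test on a 2 x k table of recombinant counts.

    counts: list of (recombinants, individuals_scored), one tuple per population.
    Returns (chi2, df).
    """
    total_rec = sum(k for k, _ in counts)
    total_n = sum(n for _, n in counts)
    df = len(counts) - 1
    pooled = total_rec / total_n
    if pooled in (0.0, 1.0):
        return 0.0, df  # no variation to test
    chi2 = 0.0
    for k, n in counts:
        for observed, p in ((k, pooled), (n - k, 1 - pooled)):
            expected = n * p
            chi2 += (observed - expected) ** 2 / expected
    return chi2, df


# Hypothetical counts for one marker pair in two populations.
chi2, df = recombination_homogeneity_chi2([(10, 100), (30, 100)])
heterogeneous = chi2 > CHI2_CRITICAL_5PCT[df]
```

Because the number of scored individuals enters each expected count, missing data change the denominator from one marker pair to another, which is the source of the discrepancies mentioned above.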
The colinearity of parental maps was evaluated on the basis of their common SSR and isozyme markers.
Construction of the integrated consensus genetic map
Before building the integrated consensus map, we aligned the integrated female–male maps to visualize “bridging” markers and unified their names since the nomenclature of common markers should be strictly identical from one population to another. The nomenclature of SSR markers followed the HiDRAS SSR database (http://www.hidras.unimi.it). Next, we constructed an integrated consensus genetic map, merging all four progeny datasets, using the “dsmergen” command of CarthaGene. This merging method uses all available information in the loaded datasets and estimates a single recombination rate for each given marker pair based on all available meioses, irrespective of which crosses the genotypic data have been derived from. Firstly, independent maximum-likelihood multipoint parameter estimations are performed on each dataset. Then, an overall maximum-likelihood marker ordering is sought (De Givry et al. 2005). The “build 5”, “greedy 1 0 1 20”, and “flips 5 2 1” commands were used for marker ordering within each LG. Graphical representations of the maps were drawn using the MapChart software (Voorrips 2002). For the integrated consensus map, positions of markers along the linkage groups were given only in whole centimorgans, since decimals would not be meaningful given the very small likelihood differences between the maximum-likelihood marker order represented here and alternative marker orders with slightly smaller likelihood values.
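The idea of a single recombination rate per marker pair, estimated from all available meioses regardless of cross, can be sketched in a simplified two-point form. This is only an illustration under stated assumptions: CarthaGene works with multipoint likelihoods, and the function name and the input layout (recombinant count and number of informative meioses per cross, or None where the pair was not scored) are hypothetical.

```python
def pooled_recombination(per_cross):
    """Two-point pooled estimate: total recombinants over total informative meioses.

    per_cross: dict mapping cross name -> (recombinants, informative_meioses),
    or None when the marker pair is not available in that cross.
    """
    scored = [v for v in per_cross.values() if v is not None]
    total_rec = sum(k for k, _ in scored)
    total_n = sum(n for _, n in scored)
    return total_rec / total_n if total_n else None


# Hypothetical counts for one bridge-marker pair scored in two of three crosses.
r_pooled = pooled_recombination({"C1": (5, 100), "C2": None, "C3": (15, 100)})
```

A pair scored in a single cross simply keeps that cross's estimate, which is how non-common markers can still be placed relative to the bridge markers.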
Marker distribution along each LG of the integrated consensus map was evaluated by comparing the difference between the expected positions of markers under a uniform distribution and the observed ones with the critical D value of the Kolmogorov–Smirnov statistic (α = 5%). The observed distribution of each LG was built as follows: let Li and Ni be the length and the number of markers of LGi, respectively; Pij is the position (in cM) of marker j along LGi; the observed cumulative distribution is made up of each j/Ni value, whereas the expected cumulative value under a uniform distribution is Pij/Li. We also tested homogeneity of marker distribution over the linkage groups (with regard to their length) by a chi-square test.
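The Kolmogorov–Smirnov comparison described above can be sketched as follows, taking j/Ni as the observed cumulative value and Pij/Li as the expected value under a uniform distribution. The function names are hypothetical, and the critical value uses the common large-sample approximation 1.36/√N for α = 5%, which is an assumption here; exact tabulated values would be preferable for linkage groups with few markers.

```python
import math


def ks_uniform_d(positions, length):
    """Largest absolute gap between j/N (observed) and P_j/L (expected, uniform)."""
    ordered = sorted(positions)
    n = len(ordered)
    return max(abs(j / n - p / length) for j, p in enumerate(ordered, start=1))


def ks_critical_5pct(n):
    """Large-sample approximation of the critical D value at alpha = 5%."""
    return 1.36 / math.sqrt(n)


def is_non_uniform(positions, length):
    """True when the observed D exceeds the approximate 5% critical value."""
    return ks_uniform_d(positions, length) > ks_critical_5pct(len(positions))


# Hypothetical 100-cM linkage group with five markers clustered at one end.
clustered = [1, 2, 3, 4, 5]
d_clustered = ks_uniform_d(clustered, 100)
```

For the clustered toy example, D is 0.95, well above the approximate critical value of about 0.61 for five markers, so uniformity would be rejected.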
Results and discussion
The main goal of this study was to construct an integrated consensus map of apple, combining segregation datasets from four mapping populations. Common SSR markers between homologous LGs were used as bridges to merge the four populations’ datasets.
Comparison of parental and integrated female–male maps
The individual parental maps consisted of different types of markers (AFLP, isozymes, NBS-LRR-RGA, SSR, and RAPD). They were enriched with additional SSR markers in order to generate more bridges between homologous LGs and to flank genomic regions where scab resistance QTLs (Durel et al. 2003; Liebhard et al. 2003a, b; Calenge et al. 2004) had previously been detected. Because of the unexpectedly low number of common AFLP markers, they were not used as bridges in the integration. Adding a few such markers was thought to give little additional value to the quality of the map, whereas they would obscure evaluations on the use of the much more balanced SSRs. Therefore, common AFLP markers were entered under their initial names as given by each partner on the basis of their single population. As AFLPs were tested on different platforms, and sizes were assessed in different ways, initial names of actually common markers were different due to slight differences in size estimates. In addition, sequence data are needed to prove that AFLP markers of the same size represent the same locus (Black 1993; Waugh et al. 1997; Cervera et al. 2001). However, when AFLP markers with similar size come from the same parent and map at the same position, these AFLP markers could reasonably be considered as identical.
Table: Main features (number of markers, map length, and mean distance between adjacent loci) of the integrated female–male maps. Columns: number of markers; map length (cM).
Table: Recombination rates between locus pairs present simultaneously in the three populations where Discovery is involved. Columns: Discovery (C1, N = 149); Discovery (C2, N = 204); Discovery (C3, N = 149); chi-square (df = 2).
Table: Recombination rates between locus pairs present simultaneously in the two populations where Fiesta is involved. Columns: Fiesta (C2, N = 204); Fiesta (C4, N = 149); chi-square (df = 1).
The distribution of markers was not compared between individual parental maps because we had previously “arbitrarily” discarded markers of low informativeness (scored on less than 50% of individuals, <ab × ab> segregation type, clustering at exactly the same position, or separated by less than 5 cM) within each individual population. No evidence of a marked clustering tendency in any particular parental map was observed. These markers were discarded because they would not bring any additional information to the subsequent QTL analysis. Furthermore, this strategy reduced computing time.
Features and usefulness of the integrated consensus map
Table: Main characteristics of the integrated consensus map. Columns: number of total markers; number of SSR markers; map length (cM); marker density (cM/marker).
Markers were proportionally distributed over the 17 linkage groups (χ2 = 16.53, df = 16, p = 0.41) with regard to their length, suggesting a balanced distribution. However, a non-uniform marker distribution was observed within all of the LGs. This could be due to non-random sampling of the genome by the primers used, to uneven distribution of recombination along the LGs (Tanksley et al. 1992), or to clustering of some markers (Radhika et al. 2007). Clustering of markers at the same position (within a 1-cM window) was observed throughout LGs and consisted predominantly of only two to three linked markers. Of the 159 mapped SSR markers, 90 were common to at least two populations, providing a large number of anchor points between populations.
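The chi-square test of proportional distribution over linkage groups can be sketched as below, assuming that the expected number of markers on each LG is the total marker count multiplied by that LG's share of the total map length. The function name and the toy numbers are hypothetical; with 17 linkage groups the degrees of freedom are 16, as reported above.

```python
def marker_distribution_chi2(marker_counts, lg_lengths):
    """Chi-square of observed marker counts against counts proportional to LG length.

    marker_counts: number of markers on each linkage group.
    lg_lengths: length (cM) of each linkage group, in the same order.
    Returns (chi2, df) with df = number of linkage groups - 1.
    """
    if len(marker_counts) != len(lg_lengths):
        raise ValueError("one count and one length are needed per linkage group")
    total_markers = sum(marker_counts)
    total_length = sum(lg_lengths)
    chi2 = 0.0
    for observed, length in zip(marker_counts, lg_lengths):
        expected = total_markers * length / total_length
        chi2 += (observed - expected) ** 2 / expected
    return chi2, len(marker_counts) - 1


# Hypothetical example: two equally long LGs carrying 20 and 10 markers.
chi2_lg, df_lg = marker_distribution_chi2([20, 10], [50.0, 50.0])
```

This test addresses only the balance of markers among linkage groups; the distribution of markers within each group is a separate question, handled by the Kolmogorov–Smirnov comparison.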
The validity of the construction of integrated consensus map from individual populations where a difference occurs in recombination frequency was questioned by Beavis and Grant (1991). However, if the marker order between individual maps and the integrated consensus map is conserved, the composite map should remain valuable (Lespinasse et al. 2000b). In our study, marker order was well conserved between the integrated consensus map and the integrated female–male maps of the four populations. Similar results were obtained by Doligez et al. (2006) for the integrated map of grapevine. “Small” inconsistencies of marker order can appear when differences in recombination rates are observed among populations (Beavis and Grant 1991; Lombard and Delourme 2001; Loridon et al. 2005). However, the composite map is still a valuable tool for selecting markers and comparing the relative location of traits in different genetic backgrounds.
In conclusion, the map constructed in this study is the first integrated consensus map for apple that merges individual maps from independent experiments. Notably, AFLP markers belonging to different maps were brought together into a single map. This map will be a reliable tool to perform QTL detection in a multi-population design and evaluate the genetic background effect on QTL expression. It will first be devoted to scab and powdery mildew resistance QTL detection as phenotypic (resistance) data are available for each of the four populations tested in common environments. The construction of the integrated consensus map on the basis of four distinct genetic maps made it possible to determine the relative positions of transferable markers and to obtain a statistically based estimate of the order of the non-common markers.
This publication was partly carried out with the financial support from the Commission of the European Communities (European project “HiDRAS”: High-quality Disease Resistant Apples for a Sustainable agriculture; contract no. QLK5-CT-2002-01492), Directorate-General Research—Quality of Life and Management of Living Resources Programme. It does not necessarily reflect the Commission's views and in no way anticipates its future policy in this area. Its content is the sole responsibility of the publishers. The authors want to acknowledge E. Chevreau and M. Gallet (deceased) of INRA, France and S. Manganaris of NAGREF, Greece for providing the isozyme marker data. They also want to acknowledge technical assistance of A. Faure and L. Dondini.