High-density genetic linkage maps with over 2,400 sequence-anchored DArT markers for genetic dissection in an F2 pseudo-backcross of Eucalyptus grandis × E. urophylla
- Cite this article as:
- Kullan, A.R.K., van Dyk, M.M., Jones, N. et al. Tree Genetics & Genomes (2012) 8: 163. doi:10.1007/s11295-011-0430-2
Traits that differentiate cross-fertile plant species can be dissected by genetic linkage analysis in interspecific hybrids. Such studies have been greatly facilitated in Eucalyptus tree species by the recent development of Diversity Arrays Technology (DArT) markers. DArT is an affordable, high-throughput marker technology for the construction of high-density genetic linkage maps. Eucalyptus grandis and Eucalyptus urophylla are commonly used to produce fast-growing, disease tolerant hybrids for clonal eucalypt plantations in tropical and subtropical regions. We analysed 7,680 DArT markers in an F2 pseudo-backcross mapping pedigree based on an F1 hybrid clone of E. grandis and E. urophylla. A total of 2,440 markers (31.7%) were polymorphic and could be placed in linkage maps of the F1 hybrid and two pure-species backcross parents. An integrated genetic linkage map was constructed for the pedigree resulting in 11 linkage groups (n = 11) with 2,290 high-confidence (LOD ≥ 3.0) markers and a total map length of 1,107.6 cM. DNA sequence analysis of the mapped DArT marker fragments revealed that 43% were located in protein coding regions and 90% could be placed in the recently completed draft genome assembly of E. grandis. Together with the anchored genomic sequence information, this linkage map will allow detailed genetic dissection of quantitative traits and hybrid fitness characters segregating in the F2 progeny and will facilitate the development of markers for molecular breeding in Eucalyptus.
Keywords: Molecular marker; Consensus genetic linkage map; Comparative mapping
Eucalyptus tree species and their hybrids form the basis of the largest hardwood plantation crop in the world, occupying approximately 19.6 million hectares (www.git-forestry.com). Interspecific hybridization is important for the improvement of eucalypt plantations (Griffin et al. 1988; Eldridge et al. 1993; Khurana and Khosla 1998; Potts and Dungey 2004), yielding highly productive genotypes that are deployed in clonal eucalypt plantations in tropical and subtropical regions (Wright 1997; Campinhos and Ikemori 1989; Bison et al. 2006). Eucalyptus grandis, a subtropical eucalypt in the section Latoangulatae, has been extensively used for the production of pulp due to its rapid growth, good form and easy vegetative propagation. The species, however, has a low survival rate in humid and tropical areas due to susceptibility to fungal diseases (Wingfield et al. 1989). Eucalyptus urophylla, a tropical eucalypt native to islands of Indonesia and also a member of the section Latoangulatae, is more tolerant to fungal diseases than E. grandis. Interspecific hybrids of E. grandis and E. urophylla combine the fast growth and better rooting ability of E. grandis with the disease tolerance, adaptability and greater coppicing capability of E. urophylla (Vigneron and Bouvet 2000; Campinhos and Ikemori 1989). Hybrids of E. grandis and E. urophylla are mainly grown in Brazil (Campinhos and Ikemori 1989; Bison et al. 2006), the Congo (Vigneron and Bouvet 2000) and South Africa (Darrow 1995; Wright 1997). E. grandis × E. urophylla hybrids often exhibit superior growth and quality compared to the pure species, but the genetic architecture of hybrid superiority (Verhaegen et al. 1997; Grattapaglia et al. 1996) remains to be fully characterized in this hybrid combination.
Genetic linkage maps are useful for studying genome-wide patterns of inheritance of qualitative and quantitative traits, developing markers for molecular breeding, map-based cloning and comparative genomic studies. In the past two decades, important advances have been made in the construction of genetic maps for Eucalyptus species. The first generation of Eucalyptus genetic maps were constructed with restriction fragment length polymorphism (RFLP) markers (Byrne et al. 1995; Thamarus et al. 2002), random amplified polymorphic DNA (RAPD) markers (Grattapaglia and Sederoff 1994; Vaillancourt et al. 1994; Verhaegen and Plomion 1996; Bundock et al. 2000; Gan et al. 2003) and amplified fragment length polymorphism (AFLP) markers (Marques et al. 1998; Myburg et al. 2003). However, the relatively low throughput of these techniques (e.g. RFLP) and low proportion of polymorphisms shared among different outbred pedigrees (e.g. RAPD and AFLP) have hampered the integration of information from different maps, except where shared parents were used in mapping pedigrees (Myburg et al. 2003). More recently, several Eucalyptus genetic maps have been constructed using co-dominant microsatellite markers (Byrne et al. 1996; Brondani et al. 1998; Bundock et al. 2000; Thamarus et al. 2002; Brondani et al. 2002; Brondani et al. 2006; Freeman et al. 2006; Thumma et al. 2010), which proved informative for genetic analysis in outbred eucalypts, but still limited in throughput for rapid genome-wide genetic dissection. Although almost 300 microsatellite markers have already been mapped in eucalypts (Bundock et al. 2000; Thamarus et al. 2002; Brondani et al. 2006), the genus will still benefit from the availability of high-density genetic linkage maps with thousands of DNA markers anchored to a reference genome sequence. This will facilitate the identification of positional candidate genes and the identification of tightly linked QTL markers for molecular breeding.
Diversity Arrays Technology (DArT; Jaccoud et al. 2001) offers a rapid and affordable methodology for high-throughput DNA marker analysis. As DArT assays are performed in a highly parallel and automated fashion, the cost per data point is reduced by at least an order of magnitude compared to gel-based marker technologies, which makes it attractive to plant breeders aiming to track genome-wide segregation in large pedigrees. The technology was originally developed for rice (Jaccoud et al. 2001) and later validated in barley (Wenzl et al. 2006) and Arabidopsis (Wittenberg et al. 2005). DArT markers are currently being used in more than 55 species (http://www.diversityarrays.com/). A dedicated DArT genotyping array was recently produced for Eucalyptus tree species (Sansaloni et al. 2010). This array of 7,680 markers was enriched for informative, polymorphic DArT markers by generating genomic representations from diverse Eucalyptus species and performing segregation analyses of more than 20,000 DArT polymorphisms in Eucalyptus mapping populations.
The aim of this study was to generate high-density genetic linkage maps for E. grandis, E. urophylla and an F1 hybrid of these species. We describe the use of a pseudo-backcross mapping pedigree to construct linkage maps of the parental genomes using DArT and microsatellite markers. The maps provide a high-resolution framework for future quantitative analysis of traits that differentiate the two species, as well as hybrid fitness traits that segregate in the F2 progeny.
Materials and methods
Plant material and DNA extraction
A commercially grown F1 hybrid (E. grandis × E. urophylla) clone (GUSAP1, Sappi, South Africa) was selected for backcrossing to individuals of the parental species. Two F2 backcross (BC) mapping families were established using the F1 hybrid as the pollen parent, with unrelated E. grandis (GSAP2) and E. urophylla (USAP1) individuals as the respective seed parents. Unrelated backcross parents were used to avoid potential inbreeding depression. The mapping pedigree consisted of 367 individuals from the E. urophylla BC family and 180 individuals from the E. grandis BC family. DNA was isolated from all of the backcross individuals, the F1 hybrid, the two backcross parents and the original E. grandis (GSAP1) seed parent of the F1 hybrid using a BIO101/Savant FastPrep FP120 (MP Biomedicals, Solon, OH) instrument in conjunction with DNeasy 96 Plant kits (QIAGEN, Valencia, CA).
A total of 71 previously published microsatellite markers were screened for polymorphism in the two backcross families (Table S1). Markers with the prefix “EMBRA” were previously developed from E. urophylla and E. grandis (Brondani et al. 1998; Brondani et al. 2006), “Eg” from Eucalyptus globulus (Thamarus et al. 2002), “En” from Eucalyptus nitens (Byrne et al. 1996) and “Es” from Eucalyptus sieberi (Glaubitz et al. 2001). Two microsatellites (CesA1-MS1, CesA3-MS2) located in the promoters of cellulose synthase genes, EgCesA1 and EgCesA3 (Creux et al. 2009) were also used.
Multiplexed PCR amplification of the microsatellite markers was performed using the QIAGEN Multiplex PCR kit. The reactions were performed in a total volume of 10 μl containing 12 ng of template DNA, 0.2 μM of 10× primer mix (0.2 μM of each primer in mixes of up to 12 primer pairs each), and 1× QIAGEN Multiplex PCR master mix. PCR amplification was performed in an iCycler thermocycler (Bio-Rad Laboratories, Hercules, CA) with the following cycling conditions: initial denaturing and activation of the enzyme for 15 min at 94°C, followed by 35 cycles of denaturing at 94°C for 30 s, annealing at 50–60°C for 45 s, and extension at 72°C for 1 min, followed by final extension of 30 min at 60°C. Microsatellite primers were labelled with phosphoramidite fluorescent labels (6-FAM™, HEX™ or VIC™) for automated fragment analysis on an ABI PRISM® 3100 Genetic Analyzer (Applied Biosystems, Life Technologies, Foster City, CA) using ROX™ (Genescan™ 500 ROX™; Applied Biosystems) as internal standard. Electropherograms were analysed using GeneMapper® 3.0 software (Applied Biosystems).
DArT marker assays were performed by Diversity Arrays Technology Pty Ltd (DArT P/L, Canberra, Australia) as described previously (Sansaloni et al. 2010).
Linkage analysis and parental map construction
Genetic linkage maps were constructed using JoinMap® 4 (Van Ooijen 2006) in combination with a two-way pseudo-testcross mapping strategy (Grattapaglia and Sederoff 1994). DArT and microsatellite markers were separated into three types: testcross markers segregating only in the hybrid parent (expected segregation ratio 1:1), testcross markers segregating only in the backcross parents (1:1) and intercross microsatellite (1:3, 1:2:1 or 1:1:1:1) and DArT (3:1) markers, segregating in both parents of the particular backcross. Four parental linkage maps were constructed: a maternal map of the E. grandis (GSAP2) backcross parent, a maternal map of the E. urophylla (USAP1) backcross parent, and two separate paternal maps of the F1 hybrid (GUSAP1), one from each backcross family. Segregation ratios were evaluated using the χ² test included in JoinMap® 4. For all four maps, linkage groups (LGs) were defined at a logarithm-of-the-odds (LOD) score of 8.0 or above. The marker order in each LG was subsequently determined by calculating the goodness-of-fit criterion and simultaneously calculating the map position corresponding to that order (Stam 1993) with the parameter settings Rec = 0.40, LOD = 3 and Jump = 5. The overall marker order of the linkage group was improved in each round by sequentially removing markers based on high mean chi-square values, nearest neighbour fit and the genotype probability function as implemented in JoinMap® 4 (Van Ooijen 2006) and then reordering the remaining markers in the linkage group. Recombination fractions were converted to additive map distances in centiMorgans (cM) using the Kosambi (1944) mapping function. Linkage maps were drawn using MapChart© 2.2 (Voorrips 2002) and numbered according to the convention established by Grattapaglia and Sederoff (1994) and Brondani et al. (2006). Total genome length and genome coverage were calculated using the method of Lange and Boehnke (1982).
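The conversion of recombination fractions to map distances mentioned above can be illustrated with the standard Kosambi mapping function. The sketch below is a generic illustration, not the JoinMap® implementation; the function names are hypothetical.

```python
import math


def kosambi_cm(r: float) -> float:
    """Convert a recombination fraction r (0 <= r < 0.5) to a map
    distance in centiMorgans with the Kosambi mapping function."""
    if not 0.0 <= r < 0.5:
        raise ValueError("recombination fraction must be in [0, 0.5)")
    # Kosambi: d (Morgans) = 1/4 * ln((1 + 2r) / (1 - 2r)); times 100 for cM
    return 25.0 * math.log((1.0 + 2.0 * r) / (1.0 - 2.0 * r))


def kosambi_r(d_cm: float) -> float:
    """Inverse of kosambi_cm: map distance in cM back to a recombination fraction."""
    return 0.5 * math.tanh(d_cm / 50.0)


if __name__ == "__main__":
    for r in (0.01, 0.10, 0.25, 0.40):
        print(f"r = {r:.2f} -> {kosambi_cm(r):.2f} cM")
```

For small recombination fractions the map distance in cM is close to 100 × r, and the two diverge as r approaches 0.5.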
The parental origin of the testcross markers in the map of the F1 hybrid was inferred from genotypes obtained for the E. grandis (GSAP1) seed parent of the F1 hybrid (GUSAP1) since the two linkage phases in the maps of the F1 hybrid represent the markers amplified from either the E. grandis or the E. urophylla chromosome of each homologous pair.
The two maps of the F1 hybrid were aligned using shared testcross DArT (1:1) and shared microsatellite markers. Intercross DArT (3:1) and shared microsatellite markers were then used to align the backcross parent maps to that of the F1 hybrid. The parental maps were aligned using MapChart© 2.2 (Voorrips 2002). Where marker order differed between individual maps, markers were classified as non-colinear only when the difference in order involved markers that were spaced more than 1 cM apart.
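The colinearity rule described above can be sketched as a pairwise comparison of shared markers. This is a minimal illustration under stated assumptions: the marker names are hypothetical, and the 1 cM spacing condition is assumed to apply in both maps, which the text does not specify.

```python
from itertools import combinations


def non_colinear_markers(map_a, map_b, min_gap_cm=1.0):
    """Return the set of shared markers involved in an order conflict.

    map_a and map_b map marker name -> position (cM) on the same
    linkage group. A pair of markers is flagged only when its relative
    order is reversed between the maps AND the two markers are more
    than min_gap_cm apart in both maps (assumption).
    """
    shared = sorted(set(map_a) & set(map_b))
    flagged = set()
    for m1, m2 in combinations(shared, 2):
        diff_a = map_a[m1] - map_a[m2]
        diff_b = map_b[m1] - map_b[m2]
        reversed_order = diff_a * diff_b < 0
        well_spaced = abs(diff_a) > min_gap_cm and abs(diff_b) > min_gap_cm
        if reversed_order and well_spaced:
            flagged.update((m1, m2))
    return flagged


if __name__ == "__main__":
    # Hypothetical positions: m1/m2 swap within 1 cM (ignored),
    # m3/m4 swap across several cM (flagged).
    hybrid_map = {"m1": 0.0, "m2": 0.4, "m3": 5.0, "m4": 9.0}
    parent_map = {"m1": 0.3, "m2": 0.0, "m3": 8.0, "m4": 4.5}
    print(sorted(non_colinear_markers(hybrid_map, parent_map)))
```

Ignoring order changes among markers less than 1 cM apart avoids counting rearrangements that fall within the resolution of the mapping population.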
Consensus map construction
An integrated (consensus) map for the entire pedigree was constructed using the 'combine groups for map integration' module in JoinMap® 4. The locus order was calculated using the regression mapping module and the following parameters: LOD ≥ 3.0, REC frequency ≤ 0.4, goodness-of-fit Jump threshold for the removal of loci = 5.0, the number of added loci after which to perform a ripple = 1, and third round = Yes. The heterogeneity test in JoinMap was used to exclude pairs of markers with significantly different recombination fractions in individual datasets. The overall marker order was improved iteratively as described earlier for parental map construction.
DNA sequence analysis of cloned DArT fragments
All of the cloned DArT fragments printed on the array were re-arrayed from plasmid stocks and Sanger sequenced in both directions (GenBank accessions HR865291-HR872186). To identify potential protein-coding regions mapped in the present study, the DArT fragment sequences were compared with all non-redundant GenBank CDS translations, RefSeq proteins, PDB, SwissProt, PIR, and PRF (http://www.ncbi.nlm.nih.gov) using BLASTX at a threshold of 1e−10. Customized scripts (Coetzer et al. 2010) were used to group redundant DArT fragments and assign functional annotations derived from BLASTX and BLAST2GO to each group. The DArT fragment sequences were also compared to the 8× draft assembly of the E. grandis genome sequence (DOE-JGI) using BLAST (http://eucalyptusdb.bi.up.ac.za/blast) at a threshold of 1e−10. Marker sequences with more than 90% identity to the draft genome sequence were used to align the consensus linkage map with the corresponding superscaffolds in the V1.0 assembly of the E. grandis genome (DOE-JGI; www.phytozome.net).
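The two filtering thresholds described above (E-value of 1e−10 and more than 90% identity) can be sketched as a filter over tabular BLAST output. This is an illustration only: it assumes the standard 12-column BLAST+ tabular layout (`-outfmt 6`), which the text does not state was used, it treats the reported alignment identity as the identity criterion, and the function name and sample records are hypothetical.

```python
def filter_blast_hits(lines, max_evalue=1e-10, min_identity=90.0):
    """Keep the best-scoring hit per query that passes both thresholds.

    Each line is assumed to follow the BLAST+ tabular layout:
    qseqid sseqid pident length mismatch gapopen qstart qend
    sstart send evalue bitscore
    """
    best = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment headers
        fields = line.split("\t")
        query, subject = fields[0], fields[1]
        pident = float(fields[2])
        evalue = float(fields[10])
        bitscore = float(fields[11])
        if evalue > max_evalue or pident <= min_identity:
            continue
        if query not in best or bitscore > best[query][3]:
            best[query] = (subject, pident, evalue, bitscore)
    return best


if __name__ == "__main__":
    # Hypothetical hits for three marker sequences.
    sample = [
        "mk1\tscaffold_1\t98.5\t400\t6\t0\t1\t400\t1000\t1399\t1e-150\t700",
        "mk1\tscaffold_9\t91.0\t200\t18\t0\t1\t200\t50\t249\t1e-40\t250",
        "mk2\tscaffold_2\t85.0\t300\t45\t0\t1\t300\t10\t309\t1e-60\t300",
        "mk3\tscaffold_3\t99.0\t50\t0\t0\t1\t50\t5\t54\t1e-5\t60",
    ]
    for marker, hit in filter_blast_hits(sample).items():
        print(marker, hit)
```

In this sketch a marker with several qualifying hits is assigned to the subject with the highest bit score; how ambiguous placements were actually resolved is not described in the text.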
Genome-wide distribution of genetic recombination
To investigate the genome-wide correlation of physical and recombination distances (bp vs cM), 153 genomic regions each corresponding to an approximately 1 cM interval were selected throughout the 11 linkage groups where both flanking markers were located on the same de novo assembled scaffold of the E. grandis 8× genome assembly (http://eucalyptusdb.bi.up.ac.za).
Results
A total of 68 (96%) microsatellite markers (Table S1), primarily from the EMBRA (Brondani et al. 2006) and CSIRO (Thamarus et al. 2002) sets, were found to be polymorphic in at least one of the backcross families and were used for linkage mapping. Of the 63 markers polymorphic in the E. grandis backcross, 35 (55%) were informative in both parents and segregated with three to four alleles, 22 (35%) were only informative in the F1 hybrid (GUSAP1) and 6 (9.5%) were only informative in the E. grandis BC parent (GSAP2). Of the 64 markers in the E. urophylla backcross, 46 (72%) were informative in both parents, 14 (22%) were only informative in the F1 hybrid (GUSAP1) and four (6%) were only informative in the E. urophylla BC parent (USAP1). As expected, a higher proportion of microsatellite markers were polymorphic and segregated from the F1 hybrid than from the backcross parent in each backcross family (90.4% vs 65.0% and 93.8% vs 78.1%, respectively).
Table: Summary of the 2,617 DArT markers that segregated and were used for linkage analysis in the F2 backcross pedigree. Columns: E. grandis BC family; E. urophylla BC family. Rows: testcross markers (1:1); testcross markers (1:1); intercross markers (3:1). Table values were not preserved.
Linkage analysis and parental linkage maps
Table: Summary of DArT and microsatellite (SSR) markers mapped in each linkage group of the two backcross families. Column groups: E. grandis BC parent; F1 hybrid (E. grandis BC); F1 hybrid (E. urophylla BC); E. urophylla BC parent. Columns within each group: No. of DArT markers; No. of SSR markers; Size in cM; Mean distance between markers. Table values were not preserved.
The genotypic ratios of a relatively large proportion of testcross and intercross markers deviated significantly from the expected Mendelian ratios in both backcross families (Table S2). Distorted markers were not excluded from the mapping analysis, because segregation distortion is expected to be prevalent in interspecific crosses and omitting such markers would result in low coverage in many regions of the genetic map (Myburg et al. 2003; Brondani et al. 2006). Chi-square testing revealed that 31.1% and 35.7% of the DArT markers showed significant (α = 0.05) segregation distortion in the E. grandis and E. urophylla BC families, respectively (Table S2). Similar proportions of markers were distorted in the backcross parent maps and the two F1 hybrid maps (27.5% and 36.3% vs 32.1% and 32.3%, Table S2). Clusters of distorted markers that were observed throughout the four parental maps most likely represent true cases of genomic segregation distortion linked to postzygotic isolation barriers segregating in the F2 backcross progeny (Myburg et al. 2004). Some chromosomal regions exhibited segregation distortion in all four parental maps, e.g. almost the entire length of LG5 and the distal end of LG7.
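A test for segregation distortion of the kind reported above can be sketched as a Pearson chi-square goodness-of-fit test against the expected 1:1 or 3:1 ratio. This is a generic illustration rather than the JoinMap® procedure; the genotype counts are hypothetical, and the p-value helper is valid only for two genotype classes (one degree of freedom).

```python
import math


def segregation_chi2(observed, expected_ratio):
    """Pearson chi-square statistic for observed genotype counts
    against an expected segregation ratio such as (1, 1) or (3, 1)."""
    n = sum(observed)
    ratio_total = sum(expected_ratio)
    expected = [n * part / ratio_total for part in expected_ratio]
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def p_value_df1(chi2):
    """Upper-tail probability of a chi-square statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2.0))


def is_distorted(observed, expected_ratio, alpha=0.05):
    """True when a two-class marker deviates significantly from expectation."""
    if len(observed) != 2 or len(expected_ratio) != 2:
        raise ValueError("this sketch handles two genotype classes only")
    return p_value_df1(segregation_chi2(observed, expected_ratio)) < alpha


if __name__ == "__main__":
    # Hypothetical counts for a 1:1 testcross marker and a 3:1 intercross marker.
    print(is_distorted((110, 70), (1, 1)))
    print(is_distorted((135, 45), (3, 1)))
```

Microsatellite markers segregating 1:2:1 or 1:1:1:1 have more than two classes and would need the corresponding degrees of freedom, which this sketch does not cover.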
The large number of markers mapped resulted in high map coverage. On average, 80-91% of the loci in the BC parent and F1 hybrid maps were within 1 cM of a marker and 99.9% of loci in the four parental maps were within 5 cM of a marker.
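Coverage figures of this kind are commonly derived with the expression attributed to Lange and Boehnke (1982), c = 1 − exp(−2dn/L), where n is the number of markers, L the estimated genome length in cM and d the distance in cM within which a locus must lie from a marker. The sketch below assumes this form of the expression and random marker placement; the function name is hypothetical, and whether the authors applied it to the estimated genome length or the observed map length is not stated.

```python
import math


def expected_coverage(d_cm: float, n_markers: int, genome_length_cm: float) -> float:
    """Expected proportion of the genome within d_cm of at least one
    marker, assuming n_markers placed at random on a genome of
    genome_length_cm (Lange and Boehnke-style approximation)."""
    if genome_length_cm <= 0:
        raise ValueError("genome length must be positive")
    return 1.0 - math.exp(-2.0 * d_cm * n_markers / genome_length_cm)


if __name__ == "__main__":
    # Consensus-map values from the abstract, used purely as an illustration.
    for d in (1.0, 5.0):
        print(d, expected_coverage(d, 2290, 1107.6))
```

Because the expression assumes randomly placed markers, clustered or co-segregating markers would make realized coverage lower than the value it returns.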
Comparative and consensus maps
Table: Summary of markers integrated into the consensus map for the interspecific F2 backcross pedigree of E. grandis × E. urophylla. Columns: Consensus linkage group; No. of DArT markers; No. of microsatellite markers; Map length (cM); Mean marker spacing (cM). Table values were not preserved.
DNA sequence analysis of DArT fragments and alignment to the E. grandis genome sequence
DNA sequences were obtained for 6,895 of the 7,680 cloned DArT fragments on the array (Genbank accessions HR865291-HR872186). Of the sequenced markers, 2,030 were polymorphic and could be mapped in this study (Table S3). Consistent with the previously reported enrichment of DArT markers in single copy DNA (Tinker et al. 2009), a comparison of the DArT fragment sequences to the non-redundant protein database using BLASTX (<1e−10) revealed that 865 (42.6%, Table S3) of the marker fragments potentially contained protein coding sequences. Annotation of the putative protein coding sequences revealed a broad range of functional categories. Sequence analysis also revealed that 477 marker fragments (mapped to 305 loci) exhibited similarity to the same or similar protein sequences. Those mapping to different loci may represent duplicated gene loci or different gene family members in Eucalyptus, while those mapping to the same locus could be cloned copies of the same amplified DArT fragment (marker redundancy).
Mapping of the DArT marker sequences to the draft E. grandis genome sequence assembly (V1.0, DOE-JGI, http://eucalyptusdb.bi.up.ac.za/) identified 1,836 (90.3%) marker sequences that could be placed in the genome (at an identity greater than 90% over the length of the sequence). The DArT markers placed in the genome cover approximately 600 Mbp (87%) of the sequenced genome space (690 Mbp) in the V1.0 E. grandis genome assembly (www.phytozome.net). The remaining 9.7% of the markers that could not be placed in the genome could have originated from unassembled parts of the E. grandis genome (gaps), or they may represent allelic variants of E. grandis or sequences from other Eucalyptus species, since the DArT array was constructed with DNA from a variety of species, mainly E. grandis, E. urophylla, E. globulus and E. nitens, some of which are very distantly related to E. grandis (Sansaloni et al. 2010; Steane et al. 2011). The overall marker order was highly conserved between the consensus map and the Eucalyptus genome scaffolds in the draft 8× (V1.0) assembly of the E. grandis genome (Fig. S3).
Comparison of marker intervals on the consensus genetic map to marker positions on de novo assembled scaffolds of the E. grandis genome (http://eucalyptusdb.bi.up.ac.za) enabled us to compare genetic distance and physical distance in the Eucalyptus genome, an important property for future map-based cloning efforts. Due to the early stage of the DOE-JGI E. grandis genome assembly, we expected the sequence to contain many gaps and some errors in assembly. We therefore selected 153 genomic intervals throughout the 11 linkage groups, each corresponding to an approximately 1 cM interval in the genetic map with both flanking markers placed in the same de novo assembled genomic scaffold. The average physical distance per centiMorgan in the 153 intervals was 633 kb with a range of 100 kb to 2.4 Mbp (Fig. S4, Table S4).
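The comparison of physical and genetic distance described above amounts to dividing the base-pair distance between two flanking markers by their map distance, restricted to marker pairs on the same scaffold. A minimal sketch follows; the record layout, field names and example coordinates are hypothetical.

```python
from statistics import mean


def kb_per_cm(intervals):
    """Return the physical-to-genetic distance ratio (kb per cM) for
    each interval whose flanking markers lie on the same scaffold."""
    ratios = []
    for iv in intervals:
        if iv["scaffold_left"] != iv["scaffold_right"]:
            continue  # markers on different scaffolds cannot be compared
        cm = abs(iv["cm_right"] - iv["cm_left"])
        if cm == 0:
            continue  # co-segregating markers give no genetic distance
        kb = abs(iv["bp_right"] - iv["bp_left"]) / 1000.0
        ratios.append(kb / cm)
    return ratios


if __name__ == "__main__":
    # Hypothetical intervals of about 1 cM each.
    example = [
        {"scaffold_left": "s1", "scaffold_right": "s1",
         "bp_left": 100_000, "bp_right": 733_000, "cm_left": 10.0, "cm_right": 11.0},
        {"scaffold_left": "s2", "scaffold_right": "s2",
         "bp_left": 0, "bp_right": 200_000, "cm_left": 5.0, "cm_right": 6.0},
        {"scaffold_left": "s3", "scaffold_right": "s4",
         "bp_left": 0, "bp_right": 900_000, "cm_left": 1.0, "cm_right": 2.0},
    ]
    ratios = kb_per_cm(example)
    print(ratios, mean(ratios), min(ratios), max(ratios))
```

Restricting the calculation to markers on a single de novo assembled scaffold, as in the text, avoids distances that depend on how scaffolds were ordered and joined in an early assembly.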
Discussion
Dense genetic linkage maps are useful for genome-wide identification of molecular markers closely linked to genes or QTLs, the isolation of genes via map-based cloning, detailed comparative mapping, and genome evolution studies (Varshney and Tuberosa 2007). To develop resources for such investigations, we used DArT and microsatellite markers to construct high-density genetic linkage maps of E. grandis, E. urophylla and the fast-growing interspecific F1 hybrid of these two species. This is the first genetic linkage map of the F1 hybrid genome representing one of the most widely used hybrid combinations in commercial plantation forestry in tropical and subtropical areas. The consensus map of the pedigree provides a valuable resource for genetic analysis in Eucalyptus based on 2,229 DArT and 61 microsatellite loci with excellent genome coverage for targeted marker saturation of economically important traits and new anchor points for evaluation of genome colinearity among Eucalyptus species.
Genetic maps previously reported for Eucalyptus species ranged from 919 to 1,814 cM in length (Brondani et al. 2006). The parental maps constructed here ranged from 924.7 cM (E. grandis BC parent) to 1,107.3 cM (E. urophylla BC parent), and the consensus map was 1,107.6 cM in length. Despite high map coverage, the E. grandis BC parent map (924.7 cM) was substantially shorter than maps reported earlier for this species (1,552 cM, Grattapaglia and Sederoff 1994; 1,415 cM, Verhaegen and Plomion 1996; 1,335 cM, Myburg et al. 2003; 1,814 cM, Brondani et al. 2006). Similarly, the E. urophylla BC parent map (1,107 cM) was shorter than previously reported for the species (1,331 cM, Verhaegen and Plomion 1996; 1,505 cM, Gan et al. 2003), except for the map reported by Brondani et al. (2006; 1,133 cM). The difference in map lengths could be explained by the different mapping software used for linkage analysis. The maps reported previously were mostly constructed using MAPMAKER® (MM; Lander et al. 1987), whereas JoinMap® (v 4.0, Van Ooijen 2006) was used in this study. The multilocus likelihood method used by MM assumes the absence of crossover interference, while JoinMap accounts for a level of interference, even though both programmes use the Kosambi (1944) mapping function. This difference was also observed in other crop plants (Vuylsteke et al. 1999; Liebhard et al. 2003; Hong et al. 2008). Due to these differences in estimation, JoinMap produces shorter maps than MM (Stam 1993; Vuylsteke et al. 1999; Liebhard et al. 2003; Hong et al. 2008), especially when large numbers of markers are mapped. The E. urophylla parental linkage map reported by Brondani et al. (2006) was constructed with MM, but had low genome coverage, which explains the smaller map length. The two F1 hybrid maps (1,021 and 1,067 cM) were intermediate in size compared to the pure-species maps, despite higher numbers of segregating markers. This suggests that (paternal) recombination rates were overall very similar in the F1 hybrid and the pure-species parents, although local differences in recombination rates were apparent in the comparative maps of the F1 hybrid and the backcross parents (Fig. S1).
For a comparison of genome coverage achieved in different studies, marker density and distribution should be considered. Past DArT mapping studies in plants (Wenzl et al. 2006; Tinker et al. 2009) suggested that DArT markers have a reasonably uniform genomic distribution. We observed apparent clustering of DArT markers in several linkage groups of the parental maps (Fig. S1) and the consensus map (Fig. 2). In addition, more than 25% of the DArT markers in the consensus map co-segregated perfectly with one or more other markers. This may simply be a feature of the large number of markers mapped in this study, which would by chance lead to higher marker density in some regions of the map. However, some genomic regions may indeed be more polymorphic than others, especially in the F1 hybrid genome where regions that are rapidly diverging between the parental species could give rise to higher marker density in the F1 hybrid maps than the pure-species maps. Clustering of DArT markers has also been reported in mapping studies in wheat (Akbari et al. 2006; Semagn et al. 2006), barley (Wenzl et al. 2006) and oat (Tinker et al. 2009) and may be the result of reduced recombination in regions such as centromeres or regions with an excess of repeats (Vuylsteke et al. 1999; Young et al. 1999; Van Os et al. 2006). Despite the apparent clustering and redundancy of many DArT markers, the average marker interval (Table 1) in our maps was smaller than that of previous Eucalyptus genetic maps (Marques et al. 1998; Myburg et al. 2003; Brondani et al. 2006). Only four map intervals greater than 10 cM were observed for the E. grandis and E. urophylla BC parent maps. The consensus map had no intervals larger than 10 cM and only ten intervals ranging 5 to 10 cM, with the largest gap (9.6 cM) on the distal end of LG5 (Fig. 2). 
It is known that DArT genomic representations obtained with PstI reflect the methylation status of the genomic DNA and produce markers preferentially situated in hypomethylated, gene rich regions (van Os et al. 2006). Therefore, regions with lower marker density may be heterochromatin rich, or simply regions with lower genetic variability. Nevertheless, the high genome coverage achieved (c > 99.9% at 5 cM) makes these maps particularly useful for genome-assisted breeding.
In Eucalyptus, segregation distortion is normally higher in interspecific crosses (Grattapaglia and Sederoff 1994; Verhaegen and Plomion 1996; Marques et al. 1998; Myburg et al. 2003) than in intraspecific crosses (Byrne et al. 1995; Thamarus et al. 2002). The observed segregation distortion in eucalypts is most likely caused by linkage between genetic markers and genes with recessive deleterious alleles or by hybrid incompatibility (Potts and Wiltshire 1997). Markers with significant deviation from the expected Mendelian ratios occurred throughout the F1 hybrid and BC parent maps (Table S2), suggesting the presence of multiple segregation distorting loci as previously reported for Eucalyptus (Myburg et al. 2004). Approximately the same proportion of DArT markers was distorted in the two backcross parents as in the F1 hybrid, which suggests that genetic factors affecting hybrid fitness may also be segregating in the two pure-species parents. This may be a feature of F2 pseudo-backcrosses where the two alleles segregating from the backcross parent can exhibit different (positive or negative) heterospecific interactions with the alleles segregating from the F1 hybrid (Myburg et al. 2004). The distorted markers often occurred as clusters (>10 markers/5 cM) or, in some cases, spanned the entire chromosome in the parental and hybrid maps (LG5). Clustering of loci showing segregation distortion has been reported before in Eucalyptus (Byrne et al. 1995; Verhaegen and Plomion 1996; Marques et al. 1998; Bundock et al. 2000; Brondani et al. 2006). These regions may contain genetic factors influencing the viability of F1 gametes, or fitness of F2 progeny (Lorieux et al. 2000; Cervera et al. 2001; Myburg et al. 2004; Liebhard et al. 2003; Bundock et al. 2000).
The reliability of consensus mapping was questioned by Beavis and Grant (1991), who cited the variability of recombination frequency in different populations or crosses. However, where marker order is conserved among individual maps, consensus mapping is a robust approach (Lespinasse et al. 2000). Only a small number of markers exhibited a change in order in the consensus map relative to the parental maps, specifically in LG1 and LG7 of the E. grandis BC parent (Fig. S1, Fig. S2). Changes in marker order during map integration have been reported in Eucalyptus (Brondani et al. 2006) and other species (Doligez et al. 2006; Lombard and Delourme 2001; Mace et al. 2009) and could be caused by heterogeneity in recombination, incorrect ordering in individual parental maps and missing or poor quality marker data (Lombard and Delourme 2001). Although the markers in the parental maps were ordered with high statistical support and the order of markers in the consensus map was highly similar to that in the E. grandis genome scaffolds (Fig. S3), users of this map should be aware of these limitations of consensus mapping when interpreting consensus marker order, as well as total map length and spacing (Table 3).
The high marker density of the consensus map allowed selection of more than 150 pairs of markers that are both located on the same de novo assembled E. grandis genome scaffold. The ratio of physical to genetic distance (Fig. S4) will determine the feasibility of future map-based cloning efforts in Eucalyptus. The average physical distance observed per centiMorgan (633 kb/cM) was substantially larger than that reported previously in Populus (200 kb/cM; Yin et al. 2004) and rice (244 kb/cM; Chen et al. 2002). The first JGI annotation of the E. grandis genome (V1.0; www.phytozome.net) predicted a total of 41,204 protein-coding loci in the 11 chromosome assemblies, which correspond to the 11 linkage groups in our map (Fig. S3). Based on the cumulative size of the 11 chromosome assemblies (605.8 Mbp), the average gene density in the E. grandis genome is predicted to be 68 per Mbp. This is lower than the gene density in Arabidopsis (218 per Mbp, www.phytozome.net) and Populus (100 per Mbp, www.phytozome.net). However, considering genetic distance, the gene density in Eucalyptus, 43 per cM (633 kb), is predicted to be the same as in Populus (43.6 per cM, 200 kb). This means that a QTL interval of 20 cM would on average contain approximately 860 genes. In this context, genetical genomics (eQTL mapping) approaches (e.g. Kirst et al. 2004) would be valuable to further dissect candidate genes underlying trait QTLs. The high density of the genetic maps that can be achieved with the Eucalyptus DArT array (up to an average spacing of 0.48 cM, Table 3) will ensure many (~40) sequence-anchored marker loci per QTL (assuming a confidence interval of 20 cM), which will increase the accuracy of QTL tagging. A total of 1,836 DArT markers were placed in the genome sequence assembly (Fig. S3). These markers and additional markers developed from the genome sequence in tagged QTL intervals will support fine-scale mapping of QTL regions of interest.
Most QTLs underlying economically important traits in Eucalyptus have not been characterized at this scale. We expect that the sequence-anchored genetic maps reported here and others to follow will accelerate the tagging of QTLs and cloning of positional candidate genes, and enhance Eucalyptus breeding through marker-assisted selection.
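The gene-density estimates for Eucalyptus quoted in this section follow from a few lines of arithmetic on the figures given in the text. The sketch below reproduces that arithmetic; the function and key names are hypothetical, and all input values are taken from the text.

```python
def qtl_interval_estimates(n_genes=41_204, assembly_mbp=605.8,
                           kb_per_cm=633.0, qtl_cm=20.0, marker_spacing_cm=0.48):
    """Derive gene and marker counts per QTL interval from the
    genome-wide figures quoted in the text."""
    genes_per_mbp = n_genes / assembly_mbp
    # One cM spans kb_per_cm kilobases, i.e. kb_per_cm / 1000 Mbp.
    genes_per_cm = genes_per_mbp * kb_per_cm / 1000.0
    return {
        "genes_per_mbp": genes_per_mbp,
        "genes_per_cm": genes_per_cm,
        "genes_per_qtl": genes_per_cm * qtl_cm,
        "markers_per_qtl": qtl_cm / marker_spacing_cm,
    }


if __name__ == "__main__":
    for name, value in qtl_interval_estimates().items():
        print(f"{name}: {value:.1f}")
```

The unrounded product is slightly above the approximately 860 genes per 20 cM quoted in the text, which is obtained by multiplying the rounded value of 43 genes per cM by 20.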
The authors are grateful to Sappi (South Africa) for making the crosses, maintaining the mapping pedigree and providing plant materials for DNA isolation and, in particular, wish to acknowledge Geoff Galloway (Sappi) for assistance in this regard. Minique de Castro (University of Pretoria) provided technical assistance with microsatellite marker analysis and Nanette Coetzer (University of Pretoria) assisted with bioinformatics and statistical analyses. Diversity Arrays Technology Pty Ltd (DArT P/L, Canberra, Australia) is acknowledged for excellent technical assistance and service. Jeremy Schmutz and Jerry Jenkins (HudsonAlpha Genome Sequencing Center, Huntsville, AL) kindly provided genomic scaffold positions for the sequenced DArT markers in the draft E. grandis genome sequence (DOE-JGI). This work was funded by Sappi through the Forest Molecular Genetics Programme and by the Technology and Human Resources for Industry Program (THRIP), the National Research Foundation (NRF) and the Department of Science and Technology (DST) of South Africa.
This article is distributed under the terms of the Creative Commons Attribution Noncommercial License which permits any noncommercial use, distribution, and reproduction in any medium, provided the original author(s) and source are credited.