Background

Drought resistance is crucial for growth and survival of species living in water scarce environments [1]. Unraveling the molecular mechanisms that control functional traits, such as photosynthesis and water use efficiency in response to drought, is especially relevant in view of its implication in survival, growth and biomass production. However, carbon uptake in response to drought is a complex process with many mechanisms acting in coordination in final CO2 fixation [2]. From stomatal and mesophyll resistances to diffusion of CO2 to biochemical processes within chloroplast, complex mechanisms are involved in net carbon fixation [25]. The functional bases that control carbon uptake under water stress have been largely studied [6, 7], but less information is available about its genetic regulation.

Complex functional trait dissection can be achieved through two approaches: association studies and QTL (Quantitative Trait Loci) mapping [8]. The resolution power of association studies is higher than QTL mapping [9]. However, the rapid decay in linkage disequilibrium of conifers [10] makes the development of genome wide association studies in these species laborious and advocates in favor of candidate gene approaches [11].

In the past, identification of candidate genes underlying QTL was difficult due to the use of anonymous markers and limited sequences with functional information, and thus restricted the approach to model plant species [12]. Nowadays, gene-based markers are easily developed and much more functional information is available for a wide range of organisms [1318], allowing to integrate functional annotation with QTL studies [19]. Moreover, the development and application of high throughput genotyping technologies have allowed the construction of dense genetic maps [2027], http://dendrome.ucdavis.edu/cmap/]. The use of highly saturated genetic maps allows to narrow down the position of loci involved in the genetic control of the targeted trait and the combination of high density gene based maps with functional annotation allows to identify positional candidate genes for these QTL [19, 28]. Suggested candidate genes are suitable for association studies that can validate marker-trait associations [29]. Therefore, identification of positional candidate genes within QTL confidence intervals, some of them with known function in other species, could be considered as a preliminary step that contributes to the detection of genes underlying traits of interest [30]. Additionally, QTL mapping allows the evaluation of the genetic basis for potential adaptation in natural populations [31, 32] and to extend the understanding of relationships between different morpho-functional traits [33]. The identification of the main QTL involved in drought response could be a first step to develop marker assisted selection (MAS) strategies for these traits [11].

Consequently, the detection of QTL involved in photosynthesis and water use efficiency in the context of drought response is a first attempt to understand the genetic basis regulating the expression of these traits. QTL studies on functional drought response have been largely implemented for non-forest model species [3439]. Some of these QTL studies in crop species have recently identified genomic regions controlling photochemistry of carbon uptake [40, 41]. Breeding programs implemented in crops have reported yield improvement associated with increased photosynthesis [42]. However, fewer QTL analyses on functional drought response of forest tree species have been performed [11, 4346] and to our knowledge none of them has focused on the photochemical machinery.

QTL studies involve development of a segregant progeny for target traits, phenotypic and molecular characterization of the progeny and construction of genetic maps [47]. The power to resolve the location of a QTL is related to the size of the studied population and the mapping coverage [48]. Additionally, forest tree species are characterized by long generation times which hinder development of backcross or three-generation pedigrees by controlled crosses. On the other hand, replication of each genotype is needed for a reliable phenotypic evaluation [49], especially when working with physiological parameters that are extremely sensible to environmental conditions [41, 50, 51].

Mediterranean species are particularly threatened by drought [5254], especially in the context of climatic change predictions [55]. Pinus pinaster Ait. is an important conifer in Mediterranean region with a high ecological and socio-economical value [5658]. Although P. pinaster shows evidence of drought adaptation [59, 60], recurrent or severe drought periods can limit its growth [61, 62]. Understanding the molecular basis of drought tolerance is of high importance for a suitable management of the available genetic resources of P. pinaster in conservation, afforestation or breeding programs. QTL and association studies of drought tolerance traits have been developed in several tree species, such as P. taeda[23, 63, 64], Populus sp. [19, 65, 66] or Quercus robur[43, 67]. Several QTL and association studies in P. pinaster have analyzed the molecular basis of different processes related to growth or wood quality traits [6873], terpenes [74] and serotiny [75]. However, to date only association studies based on a few potential candidate genes [59, 60] and one QTL study have analyzed the molecular basis of drought tolerance in P. pinaster[44].

The main objective of this work was to unravel the genetic basis of different functional parameters related to carbon uptake and water use efficiency in response to drought for P. pinaster. For this purpose a QTL analysis using vegetatively propagated genotypes in order to improve the reliability of phenotypic estimates was designed. Several specific objectives were outlined: 1) construction of dense gene-based linkage maps with functional information; 2) identification of genomic regions underlying photosynthesis and water use efficiency in response to drought through QTL analysis; and 3) identification of a set of promising candidate genes in targeted genomic regions that may be involved in the genetic regulation of photosynthesis and water use efficiency in response to drought.

Methods

Plant material, experimental setup and phenotypic evaluation

Plant material, experimental setup and phenotypic evaluation are explained in detail in de Miguel et al. [76]. Briefly, 162 seedlings from a F1 full-sib family of P. pinaster obtained from a controlled cross between a male parent (Oria6) from Oria, a natural population from South-East Spain (37° 31 ’N 2° 21 ’W) and a female parent (Gal1056) from a breeding program established in Pontevedra, North-West Spain (42° 10 ’N 8° 30 ’W), were vegetatively replicated and established in an incomplete block design in a greenhouse at Instituto Nacional de Investigación y Tecnología Agraria y Alimentaria (INIA). Phenotypic evaluations were conducted on the 103 clones for which at least three ramets were obtained. For phenotypic characterization three time-points of measurement were carried out starting in October 2009. During the 1st time-point, plants were watered close to full holding capacity. Then, watering was withdrawn and during the 2nd time-point of measurement plants were left 7 days without watering. The final third batch of measurements was carried out after plants have been 14 days without watering. Net photosynthetic rate (An, μmol CO2m-2 s-1), stomatal conductance to water vapour (gsw, molH2Om-2 s-1), intrinsic water use efficiency (WUEi, μmol CO2 molH2O-1), specific leaf area (SLA, m2Kg-1), maximum efficiency of photosystem II under light conditions (Fv’Fm’) and quantum yield (ΦPSII) were measured for all plants. Chlorophyll fluorescence parameters were measured following the procedure described in Cano et al. [6].

In the 1st time-point of measurement four adult needles were collected for each plant, dried and ground into a fine homogeneous powder. Carbon isotope composition was measured with a PDZ Europa ANCA-GSL elemental analyzer interfaced to a PDZ Europa 20–20 continuous flow isotope ratio mass spectrometer (Sercon Ltd., Cheshire, UK) at Stable Isotope Facility UC Davis, California, USA. The isotopic composition of 13C (‰) was expressed as [77]:

δ 13 C = R s - R b R b * 1000

Where Rs and Rb refer to the 13C/12C ratio in the sample and in the Pee Dee Belemnite standard, respectively.

Broad-sense heritability estimates and genetic correlations were calculated for the analyzed traits according to de Miguel et al. [76].

DNA extraction and marker genotyping

The mapping progeny was genotyped with nuclear microsatellites (single sequence repeats, nSSR), selective amplification of microsatellite polymorphic loci (SAMPL) and single nucleotide polymorphism (SNP) markers. Different DNA extraction methods were used in needles: a modified protocol from Dellaporta et al. [78] for nSSRs, SAMPLs and SNP array D (detailed below); the commercial kit Invisorb DNA plants HTS 96 kit (Invitek GmbH, Berlin, Germany) for SNP arrays A and C (detailed below) and the commercial kit DNeasy Plant mini kit (Qiagen, Düsseldorf, Germany) for SNP array B (detailed below).

A total of twenty nine primer pairs designed for amplification of nSSR loci in P. pinaster and P. taeda[79, 80] were tested for their segregations in the mapping population and both progenitors and six progeny individuals were genotyped. The whole mapping progeny was then genotyped only for polymorphic microsatellite loci. PCRs were performed in 10 μl containing 10 ng of DNA, 1x PCR reaction buffer (Invitrogen, Grand Island, NY, USA), 250 μM of each dNTP (Invitrogen, Grand Island, NY, USA), 0.25 U Taq polymerase (Invitrogen, Grand Island, NY, USA), 4 mM MgCl2 (Invitrogen, Grand Island, NY, USA) except for A6D12 where 2 mM MgCl2 was used, 0.2 μM of forward primer and 0.2 μM of reverse primer labeled on its 5’ end with IRD800. The PCR profile used was 94°C 4 min, 2 cycles of 94°C 45 s, 60°C 45 s, 72°C 45 s, 18 touchdown cycles of 94°C 45 s, 59.5°C 45 s (-0.5°C/cycle), 72°C 45 s, 20 cycles 94°C 30 s, 50°C 30 s, 72°C 45 s and final extension at 72°C 5 min. PCR reactions were carried out with a Perkin-Elmer GenAmp 9700 thermal cycler (Perkin Elmer Inc., Waltham, Massachusetts, USA). Amplified products were separated on denaturing gels containing 6% (w/v) acrylamide/bisacrylamide (19:3), 7 M urea and 1 x TBE and visualized in a 4300 DNA Analyzer (LI-COR Biosciences, Lincoln, NE, USA). Fragments were scored visually as codominant markers.

SAMPL genotyping was performed as in de Miguel et al. [81] with several modifications. Preamplification was carried out using EcoRI + A / MseI + C primer combination. In order to identify the most informative selective primer combinations (those with a higher number of informative polymorphic fragments) different primer combinations were tested using DNA from the progenitors and 6 offspring. A total of five CATA/EcoRI and three GATA/EcoRI primer combinations were used for the selective amplification. The whole mapping progeny was then genotyped for the eight selected SAMPL primer combinations. Primers CATA and GATA were IRDye 700 and IRDye 800 5’end labeled, respectively. Samples were loaded into denaturing gels containing 16% (w/v) Long Ranger® 50% (w/v) Gel Solution (Lonza, Basel, Switzerland), 7 M urea and 1 x TBE. Fragment detection was carried out on a 4300 DNA Analyzer (LI-COR Biosciences, Lincoln, NE, USA). Each gel was visually scored twice independently by two different people.

In this study, four SNP genotyping assays were used, three of which were Golden Gate assays (Illumina Inc., San Diego, CA, USA): SNP arrays A and C, which were two different 1,536 BeadArray™ experiments; and SNP array B, which was a 384 BeadXpress®. The SNP array D was a 12 K Infinium assay (Illumina Inc., San Diego, CA, USA). SNP arrays B and C were used to genotype the whole mapping progeny, whereas A and D could be used only on 83 and 70 progeny individuals, respectively. SNP array B was developed including many of the SNPs targeted in array A [82] and 14 additional SNPs from candidate genes for drought resistance [59] in order to complete the information for a set of genes of special interest (see de Miguel et al. [81] for further details). When the same SNP was successfully genotyped in both assays only the data of SNP array B was used because of the higher number of individuals genotyped in this assay. SNP array C was designed using a P. pinaster gene catalog obtained from 454 sequencing of cDNA libraries constructed with different tissues from 9 siblings of the mapping progeny submitted to different growing conditions (i.e. drought stress versus control plants; E Sáez-Laguna et al., unpublished). The genotyping of SNP array C was developed at CNIO, Madrid, Spain. Finally, SNP array D contained 10,593 SNPs identified from unigene set “PineContig_v2” of P. pinaster[20]. The four genotyping assays were carried out according to the manufacturer’s instructions (Illumina Inc., San Diego, CA, USA) and SNPs clusters revised manually with Illumina Genome Studio v.1.9.4 software with a GenCall score cutoff of 0.15. SNP clusters were modified manually to refine cluster positions when necessary. For the SNP array D (12 K Infinium) SNPs with Gen-Train values lower than 0.25 were discarded, with values between 0.25 and 0.5 were manually scored and with values higher than 0.5 were automatically scored.

Construction of dense linkage maps

For the construction of two genetic maps, one for each progenitor (Gal1056 and Oria6), the “two-way-pseudo-testcross” mapping strategy was applied [83]. The consensus map for the cross, combining markers informative for both parents, was also developed (GxO). Linkage analyses and map estimations were performed using the regression mapping algorithm implemented in the software JoinMap® v4.1 [84] with the CP population type and using a recombination fraction < 0.35 and a LOD > 3 as mapping parameters. Map distances were calculated using Kosambi mapping function [85]. For map building a goodness-of-fit jump threshold of 5 was established. JoinMap suggests three genetic maps with increasing number of markers (map1, map2 and map3). In map2, new markers were added because more pair wise data were available but statistical support is the same as in map1. In map3, the remaining loci were added by increasing the goodness-of fit jump threshold. In these cases map2 was kept for further analyses. Mean χ2 contribution to the goodness of fit and number of double recombinants were inspected in order to remove not reliably positioned markers from the estimated maps. When a pair of markers was considered identical based on the lack of recombination between them, only one of the markers was selected for mapping (see Additional file 1). Segregation ratios were tested using χ2 test (p ≤ 0.01) after Bonferroni correction. Framework maps for Gal1056, Oria6 and GxO were also built. For this purpose, only the most informative markers with very reliable positions and inter marker distance of circa 10 cM were kept. Total genome length was calculated as the sum of all mapped marker intervals. Estimated genome length was determined from the partial linkage data according to Hulbert et al. [86] modified by Chakravarti et al. [87] (Method 3). To estimate genome length using framework maps, a minimum LOD score of three was chosen. Observed map coverage was calculated as the ratio of total genome length to estimated genome length. To estimate the number of different mapped genes a BlastN was performed between gene sequences contained in the different SNP genotyping arrays. Sequences with a percentage of identity higher than 98% were considered the same gene. To test whether the mapped genes were evenly distributed between linkage groups χ2 tests (p < 0.05) were performed by comparing observed and estimated numbers of genes per linkage group (LG). The expected number of genes for each LG was obtained by multiplying the ratio size of LG to total genome length by the total number of mapped genes. Linkage maps were compared with previously developed P. pinaster maps [20, 81, 82] based on common SNPs and SSRs.

QTL mapping

In order to avoid errors in marker order that may have some impact on the precision and accuracy of QTL placement, QTL analyses were performed using the framework linkage maps. QTL detection was carried out using the regression algorithm implemented in the software MapQTL® v6.0 [88]. Interval mapping was applied followed by multiple QTL mapping (MQM) when more than one QTL was found for a trait. Analyses were performed using a mapping step size of one. The thresholds (95% and 99% confidence) for QTL significance were determined using a chromosome and genome wide permutation test with 10,000 iterations. Support intervals for the detected QTL were estimated based on the observed decrease of LOD value in one and two units. QTL identified with only the 95% significance at chromosome level were considered as suggestive of putative QTL. Each detected QTL received an identification name indicating the measured trait, the time-point of measurement, the linkage group (LG) and the map (“f” and “m” for female and male parents respectively and “i” for consensus map) where the QTL was detected.

Candidate genes search

Functional annotation for gene based markers of SNP arrays A, B and D were described by Chancerel et al. [20, 82]. Functional annotation for SNP array C was obtained with Blast2GO software [89]. For high-scoring segment pair (HSP) a restrictive E-20 e-value was chosen finding in a Blastx search against a set of 88,516 reference proteins from UniProt (http://www.uniprot.org/). In order to update annotation information for the sequences of the four SNP arrays that mapped to the QTL, a second round of annotation was performed using Blast2GO software with a restrictive E-25 e-value for HSP, and annotation was completed with InterPro (http://www.ebi.ac.uk/interpro/) and Kyoto Encyclopedia of Genes and Genomes (KEGG, http://www.genome.jp/kegg/) searches. For those genes within the significant QTL confidence intervals (±2 LOD), functional annotations were queried to identify functional relationship between the positional candidate genes and each analyzed trait. In order to cover all mapped genes, QTL confidence intervals in framework maps were extrapolated to maps with all the mapped markers.

Association between phenotypes and alleles at candidate loci was further studied by ANOVA using the traits as dependent variables and the SNP genotypes as factors. Thereof nineteen traits (seven different traits measured at three water irrigation regimes, except δ13C measured only at 1st time-point) and 73 SNPs located in 58 identified candidate genes were inspected. False discovery rate (FDR) was calculated using the package qvalue. Association analyses were carried out in R version 2.15.2 (R Development Core Team, 2012).

Results

Phenotypic evaluation

Descriptive statistics of all analyzed traits are shown in Table 1. Almost all traits showed a close to normal distribution with low levels of skewness and kurtosis. Although normal distribution is an assumption in interval mapping, this method and MQM are quiet robust against deviations from normality [88]. Water stress produced a decrease in mean values for almost all variables except for WUEi and SLA that showed higher and very similar mean values, respectively, for the three time-points of measurement. Coefficients of variation were progressively higher with the imposition of drought stress being gsw the trait that showed the higher coefficient of variation in the 3rd time-point of measurement.

Table 1 Descriptive statistics of measured traits in the F 1 full sib family Gal1056xOria6 (n = 103)

Phenotypic correlations between the studied traits are presented in Table 2. An was correlated with gsw and with chlorophyll fluorescence parameters (Fv’Fm’ and ΦPSII). The magnitude of the correlation coefficients was very similar for the 1st time-point of measurement. However, under drought stress An showed a higher correlation coefficient with chlorophyll fluorescence parameters than with gsw. Besides, An and chlorophyll fluorescence parameters showed a tight genetic correlation (see Additional file 2). For WUEi and δ13C, a significant phenotypic (Table 2) and broad sense genetic correlation (see Additional file 2) was found. Both traits had higher phenotypic correlation coefficients with gsw than with An. SLA was moderately correlated with An, gsw, WUEi and δ13C for the 1st and 2nd time-points of measurement (Table 2). Broad sense heritability estimates for the analyzed traits are presented in Additional file 3. All of them presented moderate to low values of heritability being the higher estimates for gsw, WUEi and δ13C.

Table 2 Pearson correlation coefficients and statistical significance for measured traits in the F 1 full sib family Gal1056xOria6 (n = 103)

Highly saturated linkage maps

For Gal1056, Oria6 and consensus map, 17, 16 and 13 linkage groups (LG) were obtained, respectively (Table 3). The three constructed genetic linkage maps had in total 2,107 markers representing 1,314 mapped genes (Table 3). Genes were evenly distributed between linkage groups (χ2 test p > 0.05 for the three linkage maps). Map coverage was 65–100% and average distance between two adjacent markers was smaller than 2 cM (Table 3). The vast majority of markers with distorted segregations were discarded because of insufficient linkage information to be mapped (Table 3). Out of the six distorted markers, five mapped in the first 10 and 20 cM of LG 5 in Oria6 and consensus maps, respectively (see Additional files 4 and 5).

Table 3 Mapping features of the two parental linkage maps (Gal1056 and Oria6) and consensus map for the cross (GxO)

Through comparisons between both parental maps, as well as with previously developed maps for P. pinaster[20, 81, 82] based on 654 common markers, 12 groups could be identified for the three maps, which is in agreement with the haploid number of chromosomes for the species. Common markers among the different genetic maps compared mapped always in the same homologous LG excepting three markers (see Additional file 6): contigs FN696780 and AL749831 that mapped in LG 9 and LG 4 in Chancerel et al. [20] and in LG 7 and LG 9 in this study, respectively (see Additional files 4 and 5); contig CT577280 that mapped in LG 7 and LG 4 in the two different maps obtained in Chancerel et al. [20] while in Gal1056 and the consensus map it was mapped in LG 4.

For the 82% and 86% of contigs with more than one mapped SNP, they mapped at less than 1 cM in Gal1056 and Oria6 respectively. There was a significant exception for contig BX249015 that had one SNP mapped in LG 5 (BX249015-204) in Gal1056, Oria6 and the consensus map and the other SNP mapped in LG 8 (BX249015-289) in Oria6 and the consensus map (see Additional files 4 and 5), whereas this contig was mapped in LG 5 in Chancerel et al. [20].

The consensus linkage map is available at Dendrome (http://dendrome.ucdavis.edu/cmap/).

QTL detection

Of the 55 detected QTL (Table 4, Figures 1 and 2), 28 were highly significant QTL, whereas the remaining 27 could be considered as suggestive or putative QTL. QTL were detected for all traits but the higher number of QTL were detected for Fv’Fm’ and ΦPSII (Table 4, Figures 1 and 2). The total phenotypic variance explained for a single QTL ranged from 4.6% (WUEi) to 20.9% (Fv’Fm’). The higher percentage of total phenotypic variance explained by all the QTL detected for a trait in a time-point of measurement was 44% (Fv’Fm’).

Table 4 Identified QTL in Gal1056, Oria6 and GxO maps
Figure 1
figure 1

Graphical representation of the QTL identified on the parental (Gal1056 and Oria6) and consensus (GxO) framework linkage maps (LGs 1 to 6). Only linkage groups for which QTL have been detected are presented. The one and two LOD confidence intervals are indicated by squares and lines, respectively. Colored QTL are the significant QTL (significant at 99% at chromosome level or 95% at genome wide level), with each color representing a different trait, and black QTL are the suggestive QTL (significant at 95% confidence at chromosome level). An = net photosynthetic rate (μmol CO2m-2 s-1); gsw = stomatal conductance to water vapour (molH2Om-2 s-1); WUEi = intrinsic water use efficiency (μmol CO2 molH2O-1); δ13C = isotopic composition of 13C (‰); SLA = specific leaf area (m2Kg-1); Fv’Fm’ = maximum efficiency of PSII under light conditions; ΦPSII = quantum yield. S1, S2 and S3 stand for 1st time-point of measurement (well watered plants), 2nd time-point of measurement (seven days without watering) and 3rd time-point of measurement (14 days without watering) respectively.

Figure 2
figure 2

Graphical representation of the QTL identified on the parental (Gal1056 and Oria6) and consensus (GxO) framework linkage maps (LGs 7 to 12). Only linkage groups for which QTL have been detected are presented. The one and two LOD confidence intervals are indicated by squares and lines, respectively. Colored QTL are the significant QTL (significant at 99% at chromosome level or 95% at genome wide level), with each color representing a different trait, and black QTL are the suggestive QTL (significant at 95% confidence at chromosome level). An = net photosynthetic rate (μmol CO2m-2 s-1); gsw = stomatal conductance to water vapour (molH2Om-2 s-1); WUEi = intrinsic water use efficiency (μmol CO2 molH2O-1); δ13C = isotopic composition of 13C (‰); SLA = specific leaf area (m2Kg-1); Fv’Fm’ = maximum efficiency of PSII under light conditions; ΦPSII = quantum yield. S1, S2 and S3 stand for 1st time-point of measurement (well watered plants), 2nd time-point of measurement (seven days without watering) and 3rd time-point of measurement (14 days without watering) respectively.

Consequently, four QTL hotspots could be identified in LG 5, LG 6, LG 7 and LG 12 (Table 4, Figures 1 and 2) due to the co-localization of QTL for different traits (Figures 1 and 2): SLA co-localized with gsw in LG 5 and LG 7; Fv’Fm’ co-localized with WUEi in LG 5 and LG 12 with δ13C in LG 6, with An in LG 6 and LG 12 and with ΦPSII in LG 7 and LG 12. QTL for SLA, Fv’Fm’ and ΦPSII were detected for the three time-points of measurement. Some of them co-localized in the same region, such as the identified for SLA in LG 5, LG 7 and LG 12 and for Fv’Fm’ in LG 6 and LG 7. Co-localization of QTL for the same traits at different levels of water stress highlights the stability of QTL with the imposition of drought stress. QTL for An could only be detected for the 2nd and 3rd time-point of measurement while QTL for gsw and WUEi were only detected in the 1st and 2nd time-point of measurement (Table 4).

Candidate gene identification

The 74% of the mapped sequences (991 out of 1,348) were annotated. Gene annotations and co-localization with the detected QTL lead to the identification of 58 positional candidate genes that could be involved in the expression of the targeted traits (see Additional file 7). Genes related with oxidative stress, ATPase family proteins or proteins of the light harvesting centers were found in the confidence intervals of QTL for net photosynthesis or chlorophyll fluorescence traits. Genes related with stomatal regulation, ABA signaling pathways or cell wall composition were found in QTL for gsw and WUEi. Genes expressed under drought conditions co-localized with QTL identified in the 2nd or 3rd time-point of measurements but not in the first one, which could be pointing out the induced drought functional response of P. pinaster. Other remarkable co-localizations were found for two QTL for SLA with an enzyme involved in auxin biosynthesis, or between a QTL for δ13C and a member of the aquaporin family (see Additional file 7 for a detailed list of candidate genes). ANOVA test developed for the 73 tested SNPs in candidate genes resulted in 43 significant associations with at least one of the analyzed traits (data not shown). After corrections using the false discovery rate estimated, only three SNPs of the gene MYB1 (m746, m747 and m751) remained significantly associated with Fv’Fm’ measured in the 1st time-point (well watered plants) and SLA measured in the 3rd time-point (14 days without watering; Figure 3). SNP m746 was located in an intron but m747 and m751 were located in exon regions. The base substitution in SNP m747 was a non-synonymous change between a threonine (when a cytosine is present) and an isoleucine (when a thymine is present) while in m751 was a synonymous change. SNPs m747 and m751 explained 14.4% and 12.6% of the phenotypic variance for Fv’Fm’ measured in the 1st time-point and SLA measured in the 3rd time-point respectively. SNP m746 explained 9% of the total phenotypic variance for Fv’Fm’ measured in the 1st time-point.

Figure 3
figure 3

Boxplots for SNPs in candidate gene MYB 1 significantly associated with traits. For each one of the three SNPs (gene name and SNP position between brackets) is shown the p-value of ANOVA and false discovery rate (q-value) estimated for Fv’Fm’ (maximum efficiency of PSII under light conditions) and SLA (specific leaf area). S1, S2 and S3 stand 1st, 2nd and 3rd time-point of measurements respectively.

Discussion

Highly saturated linkage maps

Combining different types of markers three highly dense linkage maps were constructed. They include more than 1,000 genes scattered throughout the genome of P. pinaster and distributed in 12 groups that match the chromosome number of the species. The aforementioned highly saturated maps, with less than 2 cM mean distance between markers, are in the range of recently published linkage maps for other conifer species [20, 22, 24, 27, 51]. Estimated map length was higher in the female than in the male parent. Differences in genome length between parental maps are usually found in conifer species [9093] and it may be a consequence of differences in the recombination rate between parental trees [94, 95].

The accuracy of the SNP genotyping assays previously proved [20, 82] has been confirmed in this study by genes with more than one SNP that mapped in almost all cases within a distance lower than 1 cM. The single exception of contig BX249015 could be attributed to the existence of two paralogous genes for this sequence placed in different LGs. Indeed, high levels of synteny and colinearity were observed between female and male parental maps. The fact that four out of the five distorted markers mapped in the same region suggests that segregation distortion could be due to pre or post-zygotic selection rather than to genotyping errors.

The construction of dense genetic maps for different conifers provides additional tools for studying conifer genomes organization and evolution at a finer scale [27]. In addition, high density linkage maps can be used to position scaffolds along linkage groups contributing to the assembly of a reference genome sequence [24, 96]. P. pinaster genome sequencing is currently in progress, and it should be noted that Oria6, the male progenitor of the mapping family, is the genotype from which the haploid line was selected and its DNA used as template [97].

Additionally, development of dense genetic maps from individuals belonging to two Spanish natural populations (from Northwest coast and Southeast mountains) that show high levels of genetic divergence with the Spanish (from the Castilian Plateau; [81]) and French populations (from Landes and Corsica; [20]), from which segregating progenies have been previously mapped, is important to explore the genetic organization and evolution of the species. Synteny and colinearity were highly conserved when compared with 654 common markers with previous studies [20, 81, 82]. Only three discrepancies were found that supposed just a 0.46% over all the common markers analyzed: Contig CT577280 was mapped in LG 7 and LG 4 in two of three obtained maps for P. pinaster in Chancerel et al. [20] and it was suggested the existence of two paralogous genes for this sequence. In this study the position of CT577280 in LG 4 was confirmed. Contigs FN696780 and AL749831 mapped in LG 9 and LG 4 in Chancerel et al. [20] and in LG 7 and LG 9 in this study respectively, which suggest also the existence of two paralogous genes for these sequences.

The high level of synteny and colinearity observed between the genetic maps developed for individuals that belong to populations with very different genetic backgrounds [98] points out the high reliability in the marker order obtained. Thus, it is possible the development of a composite genetic map for the species by integrating the genetic maps developed by de Miguel et al. [81], Chancerel et al. [20] and those obtained in this work, which is currently in progress. Parental maps are the most accurate regarding both, marker order and marker distances; since they have been constructed through separated information of the meiosis occurred in each progenitor. Accuracy is also related with the presence of genotyping errors, missing values and segregation distortion in the molecular marker data used for the construction of linkage maps [99]. In this study it was achieved by the thorough genotypic data integrity obtained by using highly stringent thresholds to consider SNPs for mapping. Also, the position of the SNPs genotyped in fewer individuals (SNP array D) was validated through the comparison with previously developed P. pinaster maps [20]. In addition, only a few distorted markers have been mapped and almost all located in a narrow region of a single linkage group (Oria6 and GxO LG 5), which points towards a probable biological origin. Even so, QTL analyses have been developed using framework linkage maps to minimize the problems that possible errors in marker order could cause.

QTL detection

One of the main goals of this work was to identify QTL for leaf functional traits related to photosynthesis and water use efficiency in response to drought. QTL analysis in forest tree species is challenging by its long generation times which hinder the development of classical mapping populations like backcross, F2 or recombinant inbred lines. In order to overcome this shortcoming alternative strategies are usually developed for QTL detection in trees, such as the two-way pseudo-testcross [83] used in this study. In this work, two parents from contrasting populations in their drought response were selected to maximize the variability of the F1 obtained progeny, at molecular and functional levels. Although some recent QTL studies in trees worked with larger progenies [51, 100, 101], the 162 obtained siblings in this study are in the range or higher than other QTL analysis in trees [31, 46, 69, 70, 102]. On the other side, gas exchange parameters are extremely sensible to variations in the environmental conditions. To cope with the problem of environmental noise in phenotypic evaluation, different strategies have been used for QTL analysis in the literature. For example the implementation of statistical and physiological models to adjust phenotypic values for microclimatic differences [41, 50] or the development of inbred line populations for annual species [103, 104]. In this study, four vegetative copies of each genotype were established in a completely randomized block design in a greenhouse. All these efforts allowed the identification of significant and suggestive QTL for important traits related to water use efficiency and carbon uptake in P. pinaster.

For all the analyzed traits several QTL were identified with moderate effects rather than a single or few QTL with large effect, as expected for complex functional traits. The only exception was δ13C for which only one QTL could be identified. For gas exchange parameters, the percentage of observed phenotypic variance explained when taking into account all significant QTL detected in a single trait in each time-point of measurements was in average 20%. In addition, some suggestive QTL for water use efficiency estimated from different approaches have been reported, although their existence should be further tested using larger population sizes or analyzing their stability in different genetic backgrounds [105]. For chlorophyll fluorescence parameters, a large number of QTL were identified. The significant QTL detected in each time-point of measurement for Fv’Fm’ explained together up to 44% of the observed phenotypic variance. The results achieved point towards a tight genetic control of photochemical traits, as previously described in crop species like maize [103, 106], barley [104] or soybean [40].

For δ13C only one QTL was detected in LG 6, in agreement with Brendel et al. [44] that found a QTL in the same region of LG 6 for P. pinaster. No co-localization of QTL for WUEi and δ13C was found in spite of the significant phenotypic and genetic correlation between both traits. In this study δ13C was measured only in the 1st time-point of measurement, thus its value probably reflected the water use efficiency in well watered conditions. As the genotypes analyzed have showed high genetic variability in water use efficiency in response to drought [76], it could be expected to observe higher variation in δ13C in needles developed under water limiting conditions maintained in a long-lasting water stress period [107109]. Higher variability on δ13C would enhance the detection of QTL for this trait and it might be possible to find other QTL as reported by Brendel et al. [44], who found four significant and four suggestive QTL. Differences in the number of detected QTL for δ13C between both studies could also be explained because Brendel et al. [44] measured δ13C in tree growth rings from 15 years-old trees while in this study δ13C was measured in needles of one year-old seedlings. Nevertheless, the co-localization of this QTL for δ13C between two genetically unrelated progenies from wide geographic origins (Landes x Landes versus Galicia x Oria) and growing under different environmental conditions supports its stability.

Interaction between QTL and environmental conditions was tested performing the QTL analyses using three different time-points of measurements corresponding to different water irrigation regimes. In general, most of the detected QTL were environment-specific, suggesting that genes are differentially activated during maritime pine drought response [110]. Nevertheless, several QTL for Fv’Fm’, ΦPSII and SLA were less sensitive to environmental conditions and maintained the same location with drought imposition, confirming the stability of these QTL across different levels of water stress endured by plants.

This way, four clusters of QTL were identified in LG 5, LG 6, LG 7 and LG 12. Clustering of QTL could be related with the pleiotropic effect of one or a few genes affecting different traits rather to the existence of rich gene regions, as genes were homogeneously distributed between LGs. Chancerel et al. [20] detected higher number of genes in LG 6 and LG 12 than in the other linkage groups, however the maps developed in this study could not confirm these results.

QTL for photosynthesis measured through gas exchange and chlorophyll fluorescence parameters co-localized in LG 6 and LG 12, accordingly with the high broad-sense genetic correlation found between both traits. However, additional and no co-localizing QTL were identified for these traits in other LGs, suggesting that CO2 fixation and electron transport were not entirely coupled, in agreement with Gu et al. [41]. Uncoupling of these two processes may be due to drought effects on stomatal conductance, biochemical alterations of carbon fixation enzymes, or photoinhibition affecting electron transport rate [7, 8].

Under drought stress An and gsw showed a lower level of phenotypic correlation while the correlation coefficient between An and Fv’Fm’ or ΦPSII increased with water stress, which suggests that under stomatal closure the differences that can be observed between genotypes in carbon fixation could be due to differences in electron transport through PSII rather than to differences in stomatal conductance, as previously observed in other species [106].

SLA showed a significant phenotypic and genetic correlation with WUEi and Fv’Fm’. The identification of relationships between two traits using phenotypic correlations may not distinguish whether the traits could be causally related or simply varying in association. However, the coincidence of QTL for two traits is strong evidence that they could be functionally related [36]. QTL co-localization of SLA with WUEi and Fv’Fm’ was found in LG 5 and LG 7 pointing towards a strong inter-relationship between SLA, WUEi and Fv’Fm’. The aforementioned co-localization could indicate that plants with lower SLA are more efficient in water use but had a lower efficiency of electron transport through photosystem II that could be explained because of the higher importance of gsw over An in determining WUEi in this species [76, 111113].

Most of the detected QTL were found only in one of the two progenitors. The parental trees were selected from two distant populations, showing high level of genetic differentiation, and with a different degree of drought tolerance. Oria6 came from the southeast of Spain governed by a Mediterranean climate with long, hard and frequent summer dry periods, while Gal1056 came from the northwest of Spain where Atlantic climate is present. Consequently, a higher degree of drought adaptation is expected in Oria6 than in Gal1056. Controlled crosses performed with so different parental trees in their response to drought are very useful to compare QTL identified in individuals with different genetic backgrounds.

Candidate genes within QTL

The identification of the gene or genes underlying a trait has been described as one of the greatest challenge for geneticists during this century [114]. The development of high density linkage maps using gene-based markers selected, in some cases, for their known implication in drought response allowed the identification of potential candidate genes for the quantitative multigenic traits analyzed in this study. Due to the lack of sequence annotation, a considerable number of mapped sequences showing high homology with cDNA sequences from other conifers could not be functionally inspected. Thus, some QTL with large effect had no obvious candidate genes but hold great promise to identify unknown genes underlying the corresponding processes in the future. For other QTL, positional candidate genes with known function in other species that were selected according to their functional similarity with genes involved in processes related with the studied trait were identified. A MIXTA-LIKE TRANSCRIPTION FACTOR (MYB) and a HISTONE CHAPERONE were found at 25 and 12 cM from the LOD peak of one of the four most clearly detected QTL, Fv’Fm’S2LG7i. MYB transcription factors are a wide group related with multiple physiological processes such as photosynthesis signaling [115]. The HISTONE CHAPERONE acts as a heat protection protein [116]. The increase of leaf temperature could be an important consequence under drought stress conditions due to reduced transpiration caused by stomatal closure. In this sense, the gene encoding the MYB transcription factor and the HISTONE CHAPERONE also co-localized with Fv’Fm’S3LG7i, both QTL measured under water stress. Another gene of the MYB family encoding the MYB 1 transcription factor, co-localized with several QTL for Fv’ Fm’ and ΦPSII measured in well watered conditions (Fv’Fm’S1LG7i, Fv’Fm’S1LG7_2f, ΦPSII S1LG7_2f) and SLA measured under water stress (SLAS3LG7i). In this sense, three SNPs positioned in MYB 1 gene resulted in a significant association with Fv’Fm’ measured in the 1st time-point of measurements (well watered plants) and SLA measured in the 3rd time-point of measurements (14 days without watering). Lepoittevin et al. [117] found that the gene MYB 1 showed complete linkage disequilibrium in P. pinaster over a distance of 1,304 bp. Together with their intron/exon location and base substitution types, this points towards association of SNPs m751 and m746 with target traits could be the consequence of genetic linkage with m747, that had higher chance to influence Fv’Fm’ and SLA. The expression of MYB 1 regulates genes of the phenylalanine pathway in white spruce [118] and maritime pine [119]. Increase of isoprenoid related compounds has been described to be related with photoprotection mechanisms triggering under abiotic stresses [120]. In this respect, some of the SNPs observed for MYB 1 at present could be related with enhancing maintenance of photochemistry function as higher Fv’Fm’ during drought. These associations should be further validated analyzing, i.e. specific nucleotide variants in a panel of unrelated genotypes [121].

Several genes related with oxidative stress co-localized with QTL for photosynthesis under water stress conditions inferred both by gas exchange or chlorophyll fluorescence. For example, 5 -ADENYLSULFATE REDUCTASE-LIKE 4-LIKE that was implicated in the cell redox homeostasis [122], co-localized with QTL ΦPSIIS1LG8_2m; PROLYL 4-HYDROXYLASE ALPHA SUBUNIT-LIKE PROTEIN that has oxidoreductase activity [123], co-localized with QTL AnS2LG9m; or CINNAMOYL- REDUCTASE 1-LIKE and PEROXIREDOXIN- CHLOROPLASTIC-LIKE that are enzymes from the flavonoid and phenylpropanoid biosynthesis pathways, respectively [124], were on the confidence interval of QTL Fv’Fm’S2LG1_2m. Overall, gene annotation seems to point out to an important role of maintenance photochemical integrity machinery in the drought response of P. pinaster.

Several genes that have been described to be related with regulation of stomatal aperture were found in the range of QTL for gsw and WUEi. For example, MALATE DEHYDROGENASE catalyzes the reaction which converts malate to oxalacetate and a reduction in malate before stomatal closure was observed [125, 126]. Also, PHOSPHOLIPASE C 3-LIKE is required for the control of stomatal aperture by ABA [127, 128]. Genes encoding these enzymes co-localized with QTL gswS2LG5f, WUEiS2LG5f, WUEiS2LG5i. AQUAPORIN NIP1-2-LIKE co-localized with δ13CS1LG6i, which was found interesting because of the importance of aquaporins in determining the leaf water status [129] and the proved stability of this QTL.

Conclusions

The in-depth analysis of genetic control of the CO2 fixation process in response to drought was possible after measuring different functional parameters using complementary techniques, such as gas exchange and chlorophyll fluorescence, that measure final carbon capacity uptake. The use of maritime pine replicated genotypes and a suitable experimental design have made possible to identify genetic control for functional and morphological leaf traits, measured under three water irrigation regimes as they are highly dependent on environmental conditions. Several genomic regions implicated in the genetic control of drought resistance traits have been identified. The identification of potential candidate genes leads this project a step beyond the simple detection of QTL. Nonetheless, further association studies with proposed candidate genes are needed in order to validate detected SNP marker-trait associations.