Background

Grapevine is one of the most important fruit crops grown worldwide. It provides berries to be used as fresh fruit, raisins or for wine making, a key socio-economic sector for many countries.

Grapevine’s developmental cycle is described by three main phenological stages. Budbreak represents the onset of vegetative growth. Flowering starts the vine’s reproductive growth, leading, when fertilisation takes place, to berry formation. Veraison, the onset of the berry ripening process, is the stage when major changes occur in berries. Indeed, while organic acids and a few other compounds already accumulate in berries before this stage, from veraison onwards berries switch from being small, hard and acidic to being larger, softer and coloured, accumulating sugars and flavour or aromatic compounds while their organic acid content decreases. All these events largely determine wine quality [1]. Finally, when berries reach the required sugar and acidity content they are harvested, even though maturity, also called ripening, cannot be considered a true phenological stage, due to the difficulties in establishing uniform criteria for different varieties [2].

Grapevine phenology is driven by temperature and is also under genetic control, so that varieties differ in their phenological timing owing to their morphological and physiological characteristics [3]. Accordingly, the suitability of each variety to a given area has been defined by climatic factors that limit their geographic distribution, with the finest wines associated with geographically distinct viticulture regions [4]. Veraison date determines the climatic conditions during ripening. Excessively high temperatures have negative effects, which include a reduction of final anthocyanin content in berries [5], with consequences for the visual aspect of the fruit and of red wines, the decoupling of ripening parameters (i.e., excess sugar content but low acidity, [6, 7]), an inadequate pool of polyphenolic compounds and incomplete development of flavours. Long-term studies on climatology and grapevine phenology demonstrated that global warming has already affected, in several areas, the onset and duration of phenological events, with an acceleration in their timing [4, 8,9,10,11,12,13,14]. Further changes and their impacts on quality have also been modelled, either globally or for the most important grape growing areas worldwide, highlighting that the impact of climate change on viticulture suitability is expected to become substantial, at least for some regions [2, 15,16,17,18,19]. Adaptation through agronomic practices or migration of vineyards is unlikely, and the incorporation through cross-breeding of traits for the control of phenology, besides temperature resilience, is recommended in the long term [2, 15, 20, 21].

With the final aim of breeding varieties better adapted to future climatic conditions, many teams have attempted to elucidate the genetic determinism of phenology, and in particular veraison time, through QTL studies, finding regions in the grapevine genome that are linked to the observed variation and include a large number of genes [22,23,24,25,26]. QTL meta-analysis offers an opportunity to summarize available QTL information and refine QTL genomic locations, by comparing individual experiments and narrowing down the original intervals [27, 28]. Indeed, QTLs detected in different experiments and located in a given region of a chromosome could represent several estimations of one single shared QTL. This hypothesis can be tested by appropriate statistical tools that indicate the most likely number of ‘real’ QTLs underlying co-located QTLs. The resulting meta-QTLs are expected to better refine the boundaries of the causative genomic intervals by integrating information from different studies. This approach was first applied to study flowering time in maize [29]. Subsequent positional cloning and association mapping analysis revealed two genes in meta-QTL intervals that are actually involved in modulating flowering time [30,31,32], confirming meta-analysis as a useful tool for predicting candidates and developing markers for breeding. Since then QTL meta-analyses have become popular in the literature for scoring QTLs of high breeding potential and for QTL validation and/or prioritization of candidates. Lately, meta-analysis has been successfully applied in many crops such as rice [33], cotton [34], potato [35], soybean [36], bean [37] and many others. However, this approach has so far not been applied in grapevine.

Technological advances and the availability of a high-quality draft of the grapevine genome sequence [38] have encouraged characterizations of berry development at the transcriptomic level [39,40,41,42,43,44,45,46,47,48]. Besides identifying specific transcripts modulated during berry development, these studies revealed that a major transcriptomic shift is associated with the veraison transition [44, 49]. Integrated network analysis of expression data allowed genes to be classified according to their correlation with interaction partners, and “switch genes,” likely playing a key role in this major transition, to be defined [44, 46]. Lately, a detailed analysis of expression profiles in two different varieties defined two rapid and successive transitions in the molecular reprogramming associated with veraison, including positive and negative molecular “biomarkers” [48].

The number of candidate genes putatively involved in the genetic control of the veraison transition either underlying veraison QTLs, or emerging from transcriptomic studies, is huge. With the final aim of defining a prioritizing strategy we developed a consensus genetic map from 39 independent maps and, following QTLs projection, performed a meta-analysis of co-located veraison QTLs or of veraison QTLs and other phenology QTLs. Then, by anchoring to the grapevine genome assembly and integrating information from transcriptomic studies, we selected a set of putative key regulators for the grapevine veraison transition.

Results

Selection of grape QTLs for integration and scoring of phenology related ones

All published grapevine QTL studies up to October 2018 were collected from the literature to retrieve those including suitable information for integration of data from independent studies/populations. This resulted in the selection of 42 publications, reporting 47 different QTL maps from more than 80 available (Additional file 1) [22,23,24,25,26, 50,51,52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,67,68,69,70,71,72,73,74,75,76,77,78,79,80,81,82,83,84,85,86]. These QTL studies exploited a total of 24 different cross populations, constituted on average by 157 offspring (number of offspring ranging from 74 to 265). Cross populations were mainly F1, with the only exception of two populations obtained by self-pollination and one obtained by selfing an F1 [56, 67, 68]. A large number of cross populations (14) were derived by crossing Vitis vinifera with hybrids or other Vitis species, but a number of intra-vinifera crosses was also reported.

These selected QTL studies included 2093 QTLs for 354 different phenotypes. A detailed list of all scored phenotypes, grouped according to the study and including the QTL short name used in the relative reference as well as a short description, is provided in Additional file 2. Each measured phenotype/QTL was manually attributed to its most related trait, for which the phenotype was considered to be a descriptor, and traits were arbitrarily grouped in 8 main trait categories. An overview of the grapevine traits currently most characterized by these studies, grouped according to the eight trait categories, is given in Additional file 3. The number of QTLs for each trait and the number of studies considering each trait are shown. The trait for which the highest number of QTLs is currently available in the literature is berry metabolites (Additional file 3a). This is expected since high throughput metabolomic approaches can easily produce a large amount of data. However, the overall most scored trait across QTL studies was berry weight (scored in 12 independent studies), while the trait categories most addressed by independent studies so far have been phenology and pathogen resistance (Additional file 3b, number of independent studies per category indicated in brackets).

Interestingly, 184 of the QTLs included in the selected studies belonged to the trait category phenology and were derived from 12 publications. This category includes 5 main traits, namely bud burst, flowering, veraison and ripening times, and the length of the intervals between stages. Table 1 provides details about these phenology QTLs, including the reference in which they were found, the short name attributed to the originally scored phenotype in each of these publications and their distribution across the different trait types. Among these, 54 QTLs derived from 6 studies were related to the main trait veraison.

Table 1 Published grapevine phenology related QTLs included in the analysis

Building of a grapevine consensus genetic map

All 35 genetic maps used in the selected QTL studies (see genetic map references in Additional file 1) were used as input for the construction of a consensus map in BioMercator v4.2 software [87]. A grapevine reference map, developed from the integration of 5 different genetic maps [88], was also included, as well as 3 further available grapevine genetic maps [22, 89, 90].

Common markers made the construction of a consensus possible for each of the 19 grapevine chromosomes with no residual conflicts (Additional files 4 and 5). The consensus map consisted of 19 linkage groups, corresponding to the 19 grapevine chromosomes, including 3130 markers with a total length of 1922 cM and an average number of markers and length per linkage group of 164 and 101 centiMorgan (cM) respectively. The number of markers shared by at least two maps was 1209, corresponding to 38.63% of the total markers, with an average of 63 shared markers per linkage group (Table 2). The number of maps used for the construction of each linkage group varied from 26 (LG 11) to 39 (LGs 1, 2, 4, 5, 10, 12, 17, 18), due to the different number of markers shared among maps (Table 2, Additional file 6).

Table 2 Consensus genetic map features

Marker density was not evenly distributed along the consensus map, with peaks in putative centromeric positions similar to those found in the original maps. However, comparison of marker order between the single component maps and the consensus map revealed a high level of correlation (Additional file 7). Spearman’s rank correlation values of pairwise comparisons were significantly high for all maps but two, possibly due to the low number of shared markers.
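The comparison of marker order can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: the marker names and positions are hypothetical, and Spearman's coefficient is computed directly as the Pearson correlation of the ranks rather than with the software used in the study.

```python
def ranks(values):
    """Return 1-based ranks, giving tied values their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks."""
    rx, ry = ranks(x), ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical marker positions (cM) on one linkage group.
component_map = {"mkA": 0.0, "mkB": 12.5, "mkC": 20.1, "mkD": 33.0, "mkE": 47.8}
consensus_map = {"mkA": 0.0, "mkB": 10.2, "mkC": 24.9, "mkD": 22.0,
                 "mkE": 51.3, "mkF": 60.0}

# Only markers present in both maps can be compared.
shared = sorted(set(component_map) & set(consensus_map))
rho = spearman([component_map[m] for m in shared],
               [consensus_map[m] for m in shared])
print(f"{len(shared)} shared markers, rho = {rho:.2f}")
```

In this toy example a single inversion of two adjacent markers among five shared ones leaves the coefficient high, whereas with very few shared markers one inversion would weigh much more heavily.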

The consensus genetic map was connected to the genome annotation through an anchor file including the markers’ physical positions, recovered as explained in the methods section. Upon removal of markers showing incongruent or non-unique physical positions, 713 markers (on average 38 markers per LG) were physically mapped on the 12X.v2 assembly of the grapevine genome [91]. Their physical coordinates are also included in the map file (Additional file 4). Among these markers, 480 (67%) were shared by at least two original maps, and the majority (513, 72%) were microsatellite markers.

Distribution of grapevine QTLs on the consensus genetic map

All QTLs from the 47 selected QTL studies (Additional file 1) were projected onto the consensus map to build QTL consensus maps for each trait (Additional file 8). In total 1899 QTLs (91%) could be successfully plotted while 194 could not be projected, due to lack of anchoring markers. The percentage of plotted QTLs was comparable across the different traits, ranging from 82 to 100%. Only for the traits ripening time, fertility and black rot resistance the number of plotted QTLs was lower (75, 65 and 65% respectively).
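The projection step can be pictured with a small Python sketch. It is not the BioMercator procedure. It assumes, for illustration only, that a position is carried over by linear interpolation between the two shared markers flanking it, and all marker names and positions are hypothetical. It also shows why a QTL lacking flanking anchor markers cannot be projected.

```python
def project_position(pos, original, consensus):
    """Map a cM position from an original map onto the consensus map.

    Interpolates linearly between the consensus positions of the two shared
    markers that flank `pos` on the original map. Returns None when `pos`
    is not flanked by two shared markers.
    """
    shared = sorted((original[m], consensus[m]) for m in original if m in consensus)
    for (o_left, c_left), (o_right, c_right) in zip(shared, shared[1:]):
        if o_left <= pos <= o_right and o_right > o_left:
            fraction = (pos - o_left) / (o_right - o_left)
            return c_left + fraction * (c_right - c_left)
    return None


def project_qtl(qtl, original, consensus):
    """Project the start, peak and end of a QTL; None if any of them fails."""
    projected = {key: project_position(qtl[key], original, consensus)
                 for key in ("start", "peak", "end")}
    if any(value is None for value in projected.values()):
        return None
    return projected


# Hypothetical maps (marker name -> position in cM) and QTLs.
original_map = {"m1": 0.0, "m2": 10.0, "m3": 30.0}
consensus_map = {"m1": 2.0, "m2": 14.0, "m3": 30.0, "m4": 45.0}

anchored_qtl = {"start": 5.0, "peak": 10.0, "end": 20.0}
unanchored_qtl = {"start": 28.0, "peak": 33.0, "end": 38.0}  # beyond the last shared marker

print(project_qtl(anchored_qtl, original_map, consensus_map))
print(project_qtl(unanchored_qtl, original_map, consensus_map))

# Share of QTLs projected, from the counts reported in the text.
percent_projected = 100 * 1899 / 2093
print(f"{percent_projected:.0f}% projected")
```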

To help spot QTLs emerging independently in several studies, circular plots of consensus QTL maps were prepared for each trait, grouped by trait category. Plots for all trait categories except phenology are provided in Additional file 9, while the phenology category plot is provided in Fig. 1. Careful inspection revealed that the trait with the highest number of co-located QTLs from independent studies, highlighted by bars on the outer side of chromosomes, was downy mildew resistance (Additional file 9e). QTLs located on LG 1, 4, 5, 6, 7, 8, 10, 12, 14, 17 and 18 were all confirmed by more than one study, with up to 5 studies mapping QTLs to an overlapping region on LG 18. However, for all other pathogen resistance traits only one QTL was discovered by different studies, namely the QTL on LG 15 for powdery mildew resistance found in three independent studies. In a similar way, several QTLs for the trait anthocyanin, on LG 1, 2, 4, 6, 7, 10, 12 and 17, were all confirmed by independent studies, with the most consistently found QTL mapping on LG 2. However, for this category too no other trait revealed confirmed QTLs, at least considering these studies, since the overlapping QTLs shown in Additional file 9d for other traits in this category all come from one study. For the abiotic stress and cluster related trait categories, overlaps among QTLs from independent studies were also few and involved just one trait inside the category (Additional file 9a and b). Instead, for the categories berry morphology, seeds related traits and vegetative traits more than one trait had overlapping QTLs mapped in independent studies (Additional file 9c, f and g). Importantly, phenology was overall the category with the highest number of QTL types confirmed by independent studies. Indeed, in this category four ripening QTLs were found at similar genetic loci by independent studies, i.e. on LG 1, 2, 3 and 18, six flowering QTLs were consistently mapped to LG 1, 2, 7, 14, 17 and 19, while for veraison and flowering-veraison interval 3 and 1 confirmed QTLs were mapped on LG 1, 2 and 16 and on LG 16, respectively (Fig. 1). However, for each of these QTLs the confirmation was based on only a few studies, ranging from 2 to a maximum of 3 (Additional file 10).

Fig. 1

Phenology consensus QTL map with meta-QTLs positions. Consensus QTL map for phenology related traits built by plotting of original QTLs on the consensus map (outer plot). QTLs positions are indicated on the internal side of each chromosome. Genetic regions spanned by QTLs confirmed in independent populations are highlighted by a bar on the outer side of each chromosome. Meta-QTLs (inner plot) were calculated from overlapping veraison QTLs and from veraison QTL overlapping to other phenology QTLs. Colour code for each trait is given in the legend tables

Narrowing down of candidates for veraison time by meta-QTL analyses

To perform a meta-analysis on co-located veraison QTLs independently mapped by several studies, and thus reduce their genetic intervals, the list of veraison QTLs projected onto the consensus map (Additional file 8) was manually curated, as explained in the material and methods section, to avoid over-representation of any trait. We selected 35 veraison QTLs from 6 studies [22, 24,25,26, 53, 58]. Meta-analysis was applied to LG 1 and 2, where overlapping QTLs were retained after pruning. Our meta-analysis resulted in the identification of 4 veraison meta-QTLs (ver), one located on LG 1 (ver_1.1) and three on LG 2 (ver_2.1, ver_2.2 and ver_2.3) (Table 3, Fig. 1). In more detail, the veraison meta-QTL on LG 1 was derived from 2 original co-located QTLs, while on LG 2 each resulted from the integration of 5 to 7 original co-located QTLs. The average confidence interval (CI) was 3.5 cM, ranging from 1.2 cM for ver_2.3 to 5.1 cM for ver_2.1, which was the largest one. On LG 1 the original CI covered by QTLs was reduced from 23.9 cM to 4.3 cM (5.6 times) by the meta-analysis. On LG 2 the reduction of CI was 5 times overall, with the strongest effect on the ver_2.3 meta-QTL. R2 values of meta-QTLs were all higher than 10%. In particular, ver_2.2 was the most relevant, explaining up to 34% of total variance.
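Why pooling co-located QTLs shrinks the confidence interval can be seen in a deliberately simplified Python sketch. It assumes a single shared QTL, normally distributed position estimates and 95% CIs equal to 2 × 1.96 standard deviations, and combines the estimates by inverse-variance weighting. The input QTLs are hypothetical, and the sketch does not reproduce the model selection on the number of meta-QTLs performed by the meta-analysis software.

```python
Z95 = 1.96  # normal quantile for a two-sided 95% interval


def meta_qtl(qtls):
    """Combine (peak_cM, ci_width_cM) pairs under a single shared QTL model.

    Each estimate is weighted by the inverse of its variance, where the
    standard deviation is derived from the width of its 95% CI.
    """
    total_weight = 0.0
    weighted_sum = 0.0
    for peak, ci_width in qtls:
        sigma = ci_width / (2 * Z95)  # CI width -> standard deviation
        weight = 1.0 / sigma ** 2
        total_weight += weight
        weighted_sum += weight * peak
    position = weighted_sum / total_weight
    pooled_ci = 2 * Z95 * (1.0 / total_weight) ** 0.5
    return position, pooled_ci


# Two hypothetical co-located QTLs: (peak position, CI width), both in cM.
position, pooled_ci = meta_qtl([(20.0, 10.0), (24.0, 10.0)])
print(f"meta-QTL at {position:.1f} cM, CI {pooled_ci:.2f} cM")

# Reduction factor for LG 1, from the interval widths reported in the text.
reduction_lg1 = 23.9 / 4.3
print(f"CI reduction on LG 1: {reduction_lg1:.1f} times")
```

With equally precise inputs the pooled interval narrows by the square root of the number of QTLs combined, so the larger reductions reported above also reflect how the original intervals overlap and how the software models them.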

Table 3 Meta-QTLs calculated from overlapping veraison QTLs

Inspection of the consensus QTL map for the whole phenology category (Fig. 1) revealed extensive co-localization also across the different traits (i.e. co-location of veraison and ripening QTLs, etc.). Co-location of veraison QTLs with other phenology QTLs was indeed highly significant compared to a random distribution (χ2-test, p < 0.01) [92]. Overlapping phenology QTLs could represent several estimates of a single QTL affecting more developmental stages, which would justify the attempt to identify consensus QTLs across different phenology traits. In agreement with such an option, a meta-analysis for veraison QTLs and overlapping QTLs for other phenology traits was applied by considering 141 phenology QTLs retained after plotting and pruning of those in the consensus map (Additional file 8), similarly to what was previously described. When applied to veraison QTLs mapping on LG 1 and LG 2, this approach identified meta-QTLs largely overlapping with previously reported veraison meta-QTLs (Additional file 11). Therefore, with the final aim of reducing the number of genes underlying veraison QTLs, the same strategy was applied to veraison QTLs on other LGs, and this identified 13 further indicative meta-QTL regions potentially relevant for the genetic control of veraison (Table 4, Fig. 1). We named these additional meta-QTLs as ver/ph to clarify that they were obtained from veraison QTLs overlapping with other phenology related QTLs. Among these, two meta-QTLs on LG 16 were particularly relevant, explaining up to 35 and 38% of total phenotypic variance.

Table 4 Meta-QTLs calculated from veraison QTLs overlapping with other phenology QTLs

In conclusion, after anchoring to the genome, the number of genes underlying original veraison QTLs was narrowed down by applying the meta-QTL analysis, by a factor of 3.7. By including alternative phenology related traits, we also reduced the number of positional candidates at further locations, even though to a lesser extent (2.2 times) (Fig. 2). However, we should consider that this last approach relies on the assumption of shared genetic control, which could lead to skipping relevant candidates, if not verified. Lists of candidate genes in ver meta-QTLs and ver/ph meta-QTLs intervals, with the corresponding CRIBIv1 annotation (http://genomes.cribi.unipd.it/gb2/gbrowse/public/vitis_vinifera_v2/), are given in Additional files 12 and 13 respectively.

Fig. 2

Reduction in number of candidate genes for the genetic control of veraison time by the integrated approach. Number of candidate genes for veraison time control in each chromosome selected by QTLs studies, meta-QTL analysis, transcriptomic analysis or by the integrated approach is shown

To validate our meta-QTL procedure a similar analysis was applied to QTLs projected on the consensus for the anthocyanins trait. We focused on overlapping QTLs mapping on LG 2. Indeed, berry colour genetic control has already been elucidated and linked to two adjacent regulatory genes, the VvMYBA1 and VvMYBA2 genes, located on chromosome (Chr) 2 [93, 94]. The meta-QTL analysis on 28 overlapping QTLs derived from 5 independent studies identified 7 meta-QTLs (Additional file 14). The VvMYBA1 and VvMYBA2 genes were both included in the list of the 125 genes underlying these meta-QTLs (Additional file 15).

Selection of meta-QTL candidate genes differentially expressed across veraison

As a further way to reduce the number of candidates and prioritize them, we integrated positional information derived from the meta-QTL approach with molecular information obtained in previous transcriptomic studies.

Firstly, the 2195 positional candidates underlying veraison meta-QTLs ver or ver/ph (Additional files 12 and 13) were explored for their expression profiles in different organs, according to the grapevine expression atlas [49]. Four hundred and thirteen genes were never expressed either in berry, rachis or seed and were thus excluded.

Transcriptomic changes in berries during development, and in particular across veraison, have been widely explored. Taking advantage of previous transcriptomic studies, we compiled a list of molecular candidates putatively involved in veraison time control and compared them with our remaining positional candidates. From a transcriptomic dataset including berries collected at four time points of development (pea-size, beginning to touch, softening and full ripe) in 10 different grapevine genotypes, a massive transcriptomic change was found to be associated with the veraison transition, and 1478 genes commonly differentially expressed across veraison in all genotypes were identified [44, 46]. Moreover, a recent transcriptomic map of berry development analysing weekly gene expression in Pinot Noir over 3 years defined two rapid and successive transitions in the molecular reprogramming of berry development associated with veraison, and identified positive and negative molecular “biomarkers” of these transitions [48]. This RNA-Seq dataset was further inspected in this study, especially at early time points before veraison, searching for the first molecular events associated with veraison by looking for the transition across which the highest number of such “biomarkers” was differentially expressed in each of the 3 years (Additional file 16). This transition represents an early stage before veraison when the transcriptomic rearrangement associated with veraison starts to occur. The 1749 genes whose profiles were mainly modulated across this transition in at least two of the 3 years were then selected. By combining these genes with the 1478 genes identified in the 10 genotypes, a final list of 2850 genes, representing transcriptomic candidates for veraison time control, was created (Additional file 17).

We found that among the 1782 positional candidates located under meta-QTLs and expressed in berry, rachis or seed, 272 genes are also transcriptomic candidates (Additional file 18). In detail, 61 lie under ver meta-QTLs and include 16 genes encoding proteins involved in regulation of gene expression, signalling or development (Table 5). Besides these, genes belonging to other functional classes, such as transport (7 genes) or carbohydrate metabolism (5 genes, for example a vacuolar invertase), were found, which could also be involved in the genetic control of veraison time mapped at these locations. The other 211 genes co-localized instead with ver/ph meta-QTLs. Among these, 62 were involved in regulation of gene expression, signal transduction or development according to their gene ontology (GO) annotation (Table 5). Moreover, representatives of other relevant functional classes, mainly enzymes involved in carbohydrate metabolism or transporters for sugar related compounds, were also among the candidates found at these locations.
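The filtering logic of this integrated selection amounts to two set intersections, sketched below in Python. The gene identifiers are hypothetical placeholders, and the final lines only check the arithmetic of the counts reported in the text. The overlap between the two transcriptomic lists is inferred here on the assumption that the final list of 2850 genes is their plain union.

```python
def prioritise(positional, expressed, transcriptomic):
    """Return (expressed positional candidates, those also transcriptomic candidates)."""
    expressed_positional = positional & expressed
    integrated = expressed_positional & transcriptomic
    return expressed_positional, integrated


# Hypothetical gene identifiers, for illustration only.
positional = {"gene_a", "gene_b", "gene_c", "gene_d"}   # under meta-QTLs
expressed = {"gene_a", "gene_b", "gene_c", "gene_x"}    # expressed in berry, rachis or seed
transcriptomic = {"gene_b", "gene_c", "gene_y"}         # modulated across veraison

expressed_positional, integrated = prioritise(positional, expressed, transcriptomic)
print(sorted(expressed_positional))
print(sorted(integrated))

# Counts reported in the text.
remaining_positional = 2195 - 413      # positional candidates left after the expression filter
integrated_total = 61 + 211            # candidates under ver and ver/ph meta-QTLs
implied_overlap = 1478 + 1749 - 2850   # genes shared by the two lists, if 2850 is their union
print(remaining_positional, integrated_total, implied_overlap)
```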

Table 5 Transcriptomic candidates underlying ver meta-QTLs or ver/ph meta-QTLs selected by GO annotation for gene expression regulation, signalling or development

As expected, some of the proposed candidates had previously been pointed out by QTL studies or by analysis of transcriptomic profiles. However, integrating available QTL genetic data, through meta-QTL analysis, with transcriptomic data has allowed prioritization of the huge number of candidates, reducing by about 20 and 10 times the genes proposed so far by genetic or transcriptomic approaches alone, respectively (Fig. 2). According to all available molecular information, we expect the genes controlling the grapevine veraison transition to be among these.

Discussion

QTL studies have been a classical way to dissect the genetic determinism of grape phenology [22,23,24,25,26, 53, 58]. However, QTL mapping often provides inconsistent results among studies, and large genomic intervals. A big advantage can derive from meta-analysis, which offers stronger evidence than individual studies by revealing regions robustly associated with traits in multiple environments and genetic backgrounds [29, 95]. This approach has already been successfully exploited to improve and validate QTLs in several species, allowing insights into the genetic architecture of complex traits and paving the way for fine mapping and gene cloning [32, 34,35,36,37]. With this aim a genetic consensus map was built from 39 available simple sequence repeat (SSR)-based maps, including 3130 markers. Looking at marker distribution, we observed that markers were not regularly spread along the chromosomes, but tended to concentrate in the middle regions, even though a good correlation was found with the original maps. This is not surprising, as it reflects a similar trend in the original maps, due to suppression of recombination in centromeric regions. Other consensus maps have already reported this drawback [35, 96]. Moreover, genetic positions of markers on the consensus were derived from the positions of shared markers according to the BioMercator software procedure [97], and were not based on recombination, since the original genetic data are unfortunately not available for the original maps. We fully agree that QTL meta-analysis would gain power and precision if raw genotypic and phenotypic data were made available. Recent advances in marker technology, in particular the development of next generation sequencing-based genotyping by sequencing (GBS), have given a strong impulse to plant genotyping, and QTL studies now rely more on dense single nucleotide polymorphism (SNP) maps.
However, unshared markers do not allow for a direct genetic comparison of mapped QTLs, but require an indirect comparison through anchoring to the genome assembly. The distribution pattern of QTLs on chromosomes differs strongly between genetic and physical maps [96]. Therefore, integration directly at the genetic level could help improve QTL location through co-location and meta-analysis, when feasible. Further comparisons can then be undertaken with newly generated QTLs relying on high throughput SNP maps, following anchoring to the genome assembly. Taking all this into account, we concluded that the consensus map we built constitutes a valuable reference, especially for integrating available genetic information from related QTL studies. Moreover, it will also provide a valuable instrument to investigate co-location with newly generated QTLs relying on dense SNP maps.

Taking advantage of this tool we have provided a compendium of all available QTL information that can be integrated at genetic level. Interestingly QTLs plotting revealed extensive co-locations across studies for each of the phenology related traits, besides downy mildew resistance, powdery mildew resistance, anthocyanin, drought stress, fertility, water use efficiency and growth, as well as for some berry and seeds related traits. However, studies addressing phenology are still few, negatively affecting the number of studies supporting each of the co-located QTLs. R2 values of plotted QTLs, beside their distribution, suggest a highly polygenic nature for phenology related traits, with several QTLs involved, each of small effect, differently from other traits like pathogen resistance, seeds related traits and colour, all showing a more oligogenic architecture. More in detail, concerning veraison time four main regions located on LG 1 and 2 have so far emerged consistently. Interestingly, plotting on a unique consensus map of QTLs also allows inspection of co-location across traits and categories, which is especially relevant for complex traits. In this way QTL meta-analysis also allows genetic correlation among traits to be investigated [35, 92, 98, 99]. In a previous work a second round meta-QTL analysis was proposed for seed yield QTLs and co-located yield associated QTLs in rapeseed, which allowed “indicator” meta-QTLs contributing to the complex trait crop yield to be defined [100]. Indeed, QTL co-localization can be due to tight-linkage of QTL/genes playing different functions, but could also arise from pleiotropism. When pleiotropy is likely, it would also justify meta-analysis across traits, to further reduce the number of candidates [100]. Veraison time is expected to be strictly related to other phenological stages [9]. 
Accordingly, tests on the previously mentioned regions on LG 1 and LG 2 confirmed that, at least in some cases, comparable results are achieved when only veraison or all co-located phenology related traits are considered for the meta-analysis (see ver_2.1 and ver/ph_2.1 as an example). We therefore also attempted a similar approach for veraison QTLs co-located with other phenology QTLs, finally identifying a number of regions, of which the most relevant were those located on LGs 14, 16 and 18. However, we are aware that these rely on the pleiotropic assumption, which may not always be satisfied. A recent QTL study based on a GBS SNP map also addressed the mapping of veraison time [101]. That study mainly aimed to discover and map stable QTLs across environments. A veraison QTL mapping on LG 16 between 5 and 24 cM, which corresponded to the region between 2 and 16 Mbp, was found, but was not consistent across environments. Interestingly, it partially overlapped the ver/ph_16.2 meta-QTL we derived here starting from a veraison QTL and its co-location with a flowering-veraison interval QTL. Besides the detailed analysis of phenology traits we have undertaken, our compendium now provides a useful tool for the inspection of co-location and meta-analysis for further traits in a similar way.

Transcriptomic studies have also been widely applied to characterize molecular changes associated with the onset of ripening, revealing, first of all, a massive transcriptomic rearrangement at veraison time [44, 49]. Among others, genes triggering such a transition are expected to modulate their expression at this stage, although alternative regulatory mechanisms cannot be excluded. We thus mined available transcriptomic profiles to i) identify the timing of this massive change, and ii) select genes differentially expressed during this time in several varieties. Then, besides inspecting positional candidates underlying meta-QTLs, we propose to also integrate information about differential expression at veraison time, in order to prioritize candidates.

On LG 1 a veraison time QTL was previously mapped [25]. A more recent study [26] also mapped a QTL for veraison at this location, which allowed us to define the ver_1.1 meta-QTL. Flowering QTLs consistently overlapped at the same location [22, 58], suggesting a possible control of veraison time through regulation of flowering time. Accordingly, candidates for the control of flowering time mapped under this meta-QTL, like the PFT1 (phytochrome and flowering time 1) gene or a CONSTANS-like gene, both controlling the photoperiodic flowering pathway in A. thaliana [102, 103]. Even though a possible impact of the genetic control of flowering on veraison time would reduce the relevance of candidates found by our transcriptomic approach, integration of transcriptomic data allowed 14 candidates to be pinpointed, among which was the VvRAV1 transcription factor, belonging to the plant-specific RAV (RELATED TO ABI3 AND VP1) family. In Arabidopsis, RAV1 was shown to act as a negative regulator of both development and flowering, probably in complexes with other co-repressors [104,105,106]. Interestingly, some members of this gene family were shown to modulate developmental transitions in response to temperature [107]. Moreover, RAV1 was also shown to be negatively regulated by brassinosteroid and abscisic acid [104, 108], both hormones modulated at the onset of veraison time [1].

On LG 2, meta-QTL analysis of overlapping veraison QTLs identified three main regions. In the first of these regions flowering QTLs were also mapped [22], again supporting a possible regulation of veraison time through flowering, even though no genes controlling flowering time were found under this locus. Interestingly, the orthologue of the Arabidopsis YABBY1/FIL transcription factor, which directly activates AtMYB75, a key regulator of anthocyanin biosynthesis [109], was found among the candidates selected by the integration of expression data. Moreover, looking at other functional categories possibly related to veraison time, several genes were found to be differentially expressed: a gene encoding a vacuolar invertase 2, a key enzyme of sugar metabolism in fruits during ripening [110]; a stay-green protein 1 gene, related to a gene shown to be involved in ripening in tomato [111]; and two pectin methylesterase inhibitor (PMEI) genes. The latter belong to a gene family previously characterized in grape [112]. They are thought to inhibit pectin methylesterase activity in pectin degradation, and may play a role in the beginning of ripening by regulating initial events such as softening and loss of turgor [113]. Interestingly, network analysis of gene expression profiles during berry ripening revealed PMEI among the genes likely involved in triggering the major transcriptome reprogramming that occurs at veraison [44]. Within the ver_2.2 meta-QTL, the most notable candidate considering both positional and expression data was the VvNAC13 transcription factor. This gene belongs to a large family of transcription factors in grapevine [114]. Interestingly, members of this family in tomato are involved in ethylene biosynthesis, perception and signalling during ripening [115]. Moreover, they were already suggested to play a crucial role in the berry transcriptome modulation associated with veraison, according to network analysis of berry expression profiles [44]. However, in the same region, a gene encoding an atypical pseudo-response regulator (APRR2), involved in the circadian clock mechanism and contributing to fruit pigmentation and ripening in tomato [116], as well as two 1-aminocyclopropane-1-carboxylate oxidases, taking part in ethylene biosynthesis and ripening, were also selected by our approach and represent promising candidates. Lastly, a cluster of Myb genes is located within the ver_2.3 meta-QTL interval. These genes have previously been extensively characterized for their involvement in the transition to berry ripening, by regulating the accumulation of anthocyanins in the berry skin [93, 94]. This finding thus supports our approach, even though these genes are unlikely to be themselves the early triggers of ripening onset.

Other genomic regions were also proposed by previous studies for the genetic control of veraison time [22,23,24,25,26], the most relevant of which mapped on LG 14, 16 and 18. By considering overlap with other phenology-related QTLs, followed by integration of transcriptomic data, we also selected candidates for these regions. The ver/ph_14.3 meta-QTL was computed from overlapping veraison and flowering QTLs [23, 25, 58], and was accordingly highly enriched in candidates playing a role in the control of the flowering transition or in fruit ripening, the most notable being Constans 2 (COL2), the feronia receptor-like kinase, a gene encoding a Brassinosteroid-6 oxidase, a gene encoding a COBRA protein and the putative MADS-box FRUITFULL 2. Interestingly, this last gene was recently shown to also contribute to modulating the onset of ripening in tomato at early fruit development, besides its involvement at later ripening stages [117]. A QTL previously mapped on LG 16 and explaining a large part of the genetic variance in the corresponding mapping population [26] partially co-localized with QTLs for the derived trait flowering-veraison interval [22, 23] and with the genomic region involved in veraison recently identified by the GBS-SNP map previously discussed [101]. According to our strategy, the original interval was reduced to two regions of about 3.3 Mbp overall, including 15 transcriptomic candidates. Interestingly, more recently, the SSR marker UDV052, mapping under the ver/ph_16.3 meta-QTL close to two candidates, an ABC transporter and an ERF transcription factor (19.1 kbp and 56.9 kbp away, respectively), was shown to be significantly associated with the early phenotype in a collection of different varieties, thus supporting our approach [118]. Lastly, three different veraison QTLs were mapped on LG 18 [26]. Two of them partially co-located with flowering QTLs from an independent study, and one of them also overlapped with a QTL for the flowering-veraison interval [23, 58]. Under the derived meta-QTLs, ver/ph_18.1 and ver/ph_18.2, which still span a large region, we selected 74 transcriptomic candidates, 19 of which encode proteins involved in regulation of gene expression, signalling or development. Candidates involved in carbohydrate metabolism, including in particular a hexose transporter (HT2) and a sucrose transporter (SUT2-2), putatively modulating sucrose signalling, and candidates encoding cell wall degradation enzymes (such as a glucanase and a galactosidase) were also among those selected.

Conclusions

By building a grape consensus genetic map anchored to the genome assembly, we provide a comprehensive overview of the genomic distribution of QTLs from published studies and of their co-location both within traits and across related traits. Extensive co-localization was evident especially for phenology-related traits. Four veraison meta-QTLs located on LG 1 and 2 were found. Moreover, several additional meta-QTLs were derived from the co-localization of veraison QTLs with other phenology-related QTLs, the most relevant being on LG 14, 16 and 18. Integration of meta-QTLs with expression data from prior transcriptomic studies allowed us to select a set of 272 candidate genes for the genetic control of the veraison transition, reducing by about 20 and 10 times the number of genes proposed so far by purely genetic or purely transcriptomic approaches, respectively. Among these candidates, 78 genes were involved in regulation of gene expression, signal transduction or development. Specific relevant candidates have been discussed according to their annotation. Further studies can now test and possibly validate the putative involvement of these candidates in the genetic control of the veraison transition during berry development.

Methods

Collection of QTL studies and QTLs data

All published QTL experiments on grapevine were collected, mainly by searching the public database PubMed (https://www.ncbi.nlm.nih.gov/pubmed/) for “grape” and “QTL”. QTL experiments were selected if they relied on genetic maps including shared SSR markers and if all information required for further analysis was available (Additional file 1). Individual genetic linkage maps, including marker names and positions in cM, were transcribed from all experiments. The consensus map was selected when provided; parental maps were included only if consensus maps were not available (see Additional file 1 for more details). Data about mapped QTLs were also transcribed, in particular the start and end position of the QTL, the confidence interval in cM, the peak of the QTL in cM, the variance explained by the QTL (R2) and the size and type of the population used for mapping the QTL. All QTLs were included, independently of their phenotypic scoring system, score thresholds, LOD/variance values or years of observation. Details on the original QTL short names, as well as a short description, are given in Additional file 2. Only for the veraison trait, on which this work is mainly focused, were further details collected about the phenotypic scoring systems used in the original publications (Additional file 19). For all other QTLs we refer to the original publications for more details about phenotypic scoring. QTLs were assigned to eight main categories of related traits, to aid storage and further studies. All marker and QTL information was formatted for import into BioMercator v4.2 software [87].

Building of a grapevine consensus map

Marker names in each map were manually curated in order to correct misspellings and find synonyms, since marker names need to be consistent in order to combine the individual maps into a consensus map. Each map file was imported into BioMercator v4.2 [87] and each linkage group was oriented according to the reference map published by Doligez et al., 2006 [88]. Linkage groups that did not share at least two markers with others were removed from the analysis, since they could not be properly oriented. As a result, the number of input maps differed between linkage groups. The InfoMap command in the software was used to evaluate marker order and consistency between each pair of maps. In case of inversions, the occurrence of the inverted markers in all the maps was evaluated and the less represented marker across all maps was removed, so as to retain the most frequent common marker. When no more inversions were left, the ConsMap command was used to build the consensus map in a single step, chromosome by chromosome, without providing any reference.
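The inversion check described above can be illustrated with a minimal Python sketch. It is not the InfoMap implementation: the function names, the marker names (M1, M2, M3) and the positions are hypothetical, and each linkage group is assumed to be a simple mapping from marker name to position in cM.

```python
from itertools import combinations


def inverted_pairs(map_a, map_b):
    """Return pairs of shared markers whose relative order differs between two maps.

    Each map is a dict {marker_name: position_cM} for one linkage group.
    """
    common = sorted(set(map_a) & set(map_b), key=lambda m: (map_a[m], m))
    pairs = []
    for m1, m2 in combinations(common, 2):
        # Opposite signs of the position differences mean the order is swapped.
        if (map_a[m1] - map_a[m2]) * (map_b[m1] - map_b[m2]) < 0:
            pairs.append((m1, m2))
    return pairs


def least_represented(markers, all_maps):
    """Among conflicting markers, pick the one present in the fewest maps."""
    return min(markers, key=lambda m: (sum(m in mp for mp in all_maps), m))


# Hypothetical marker names and positions, for illustration only.
map_a = {"M1": 0.0, "M2": 10.0, "M3": 25.0}
map_b = {"M1": 0.0, "M3": 8.0, "M2": 12.0}
map_c = {"M1": 2.0, "M2": 9.0}

conflicts = inverted_pairs(map_a, map_b)
to_drop = least_represented(conflicts[0], [map_a, map_b, map_c])
```

In this toy case M2 and M3 are inverted between the first two maps, and M3 is the candidate for removal because it occurs in fewer maps than M2.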

Anchoring to the grapevine genome by in silico mapping of GCM markers

Sequences of the Grape Consensus Map (GCM) markers were downloaded from the original publications and aligned by BLAST against the 12X.v2 assembly of the grapevine genome using the website https://urgi.versailles.inra.fr/blast/. An anchor map was created including all uniquely mapping GCM markers with the corresponding base pair positions. The anchor map was uploaded to BioMercator v4.2 and the option “New genome version” was used to anchor the GCM to the grapevine genome from the .gff3 file (https://urgi.versailles.inra.fr/Species/Vitis/Annotations). This allows physical intervals to be recovered for any feature (such as QTLs or meta-QTLs) through BioMercator, using a formula internal to the software (Yannick De Oliveira, personal communication).
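Since the formula used by BioMercator is internal to the software, the following Python sketch only illustrates one plausible way of converting a genetic position into a physical one, namely linear interpolation between the two flanking anchor markers. This is an assumption, not the documented method; the function names and anchor values are hypothetical.

```python
from bisect import bisect_right


def cm_to_bp(pos_cm, anchors):
    """Convert a genetic position (cM) to a physical position (bp).

    anchors: list of (cM, bp) tuples sorted by strictly increasing cM,
    with at least two entries. Positions outside the anchored range are
    extrapolated from the first or last segment (an assumption).
    """
    cms = [a[0] for a in anchors]
    i = bisect_right(cms, pos_cm)
    i = min(max(i, 1), len(anchors) - 1)  # clamp to a valid segment
    (c0, b0), (c1, b1) = anchors[i - 1], anchors[i]
    return round(b0 + (pos_cm - c0) * (b1 - b0) / (c1 - c0))


def physical_interval(start_cm, end_cm, anchors):
    """Physical interval (bp) of a feature given its genetic interval (cM)."""
    return cm_to_bp(start_cm, anchors), cm_to_bp(end_cm, anchors)


# Hypothetical anchor markers: (position in cM, position in bp).
anchors = [(0.0, 1_000_000), (10.0, 3_000_000), (30.0, 5_000_000)]
interval = physical_interval(5.0, 20.0, anchors)
```

With these made-up anchors, a feature spanning 5 to 20 cM would map to the interval from 2,000,000 to 4,000,000 bp.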

QTL projection

Each QTL was associated with the genetic map where it was originally mapped. The QTLProj command in the BioMercator v4.2 software was applied to project the QTLs of the component maps onto the consensus map. The command performs a homothetic projection of the original QTL onto the consensus map based on flanking markers and using a scaling rule. This is applied only when flanking markers are found where the ratio of the distance of these markers to the confidence interval of the QTL that is being projected is not reduced by a factor greater than 0.25. Default options were kept for the analysis. Consensus QTL maps were extracted for each trait and manually inspected for QTL co-location across populations. All regions spanned by QTLs for the same trait mapped in different mapping populations were recorded. The significance of QTL co-localization was calculated as described in [92].
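The principle of a homothetic projection can be sketched in Python as follows. This is a simplified illustration, not the QTLProj implementation: it assumes a single pair of flanking markers shared by the original and the consensus map, it does not reproduce the 0.25 threshold mentioned above, and all names and values are hypothetical.

```python
def project_position(pos, orig_flank, cons_flank):
    """Project one position (cM) from the original map onto the consensus map.

    orig_flank and cons_flank are the (left, right) positions in cM of the
    same two flanking markers on the original and consensus maps.
    """
    (a_o, b_o), (a_c, b_c) = orig_flank, cons_flank
    scale = (b_c - a_c) / (b_o - a_o)  # ratio of the marker interval lengths
    return a_c + (pos - a_o) * scale


def project_qtl(qtl, orig_flank, cons_flank):
    """Project the start, peak and end of a QTL with the same scaling."""
    return {k: project_position(v, orig_flank, cons_flank) for k, v in qtl.items()}


# Hypothetical QTL and flanking marker positions (cM).
qtl = {"start": 12.0, "peak": 15.0, "end": 20.0}
projected = project_qtl(qtl, orig_flank=(10.0, 30.0), cons_flank=(20.0, 30.0))
```

Here the flanking interval is half as long on the consensus map as on the original one, so the projected confidence interval shrinks by the same factor while keeping its relative position between the markers.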

QTL meta-analysis

The meta-QTL analysis was performed using the QTLClust command in BioMercator v4.2 when at least two overlapping QTLs belonging to the same trait were found. Redundant QTLs, that is, QTLs at the same position from the same study, which could lead to overestimating the effect of that QTL in the analysis, were pruned prior to the analysis, retaining the one with the highest R2 [35]. The meta-analysis was executed selecting the Veyrieras algorithm [28]. The optimal number of meta-QTLs explaining the overlapping QTLs was determined statistically by choosing the most likely model, as computed by the software according to five different tests. The software clusters the input overlapping QTLs for a trait and calculates models with up to as many meta-QTLs as input QTLs, providing for each model the values of five criteria: the AIC (Akaike information criterion), the AICc, the AIC3, the BIC (Bayesian information criterion) and the AWE (average weight of evidence). The best model was selected as the one minimizing the values of the highest number of criteria, and represents the optimal number of clusters that best explains the observed QTL distribution. The MQTLView command allows the meta-QTLs to be represented graphically according to the selected model. A second round of meta-QTL analysis was performed as described in [35] by merging veraison QTLs with QTLs for other overlapping phenology-related traits.
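The pruning and model selection steps can be sketched in Python as below. The sketch does not compute the criteria themselves, which are produced by BioMercator; it only shows the selection rule applied to their values. The record fields, the interpretation of "same position" as an identical peak, the tie-breaking rule (fewest meta-QTLs) and all numbers are assumptions made for illustration.

```python
from collections import Counter


def prune_redundant(qtls):
    """Keep, for each (study, linkage group, peak), the QTL with the highest R2."""
    best = {}
    for q in qtls:
        key = (q["study"], q["lg"], q["peak_cm"])
        if key not in best or q["r2"] > best[key]["r2"]:
            best[key] = q
    return list(best.values())


def best_model(criteria):
    """Pick the number of meta-QTLs that minimizes the most criteria.

    criteria: {criterion_name: {n_meta_qtls: value}}.
    Ties are broken in favour of fewer meta-QTLs (an assumption).
    """
    votes = Counter(min(vals, key=vals.get) for vals in criteria.values())
    top = max(votes.values())
    return min(k for k, v in votes.items() if v == top)


# Hypothetical QTL records and criterion values.
qtls = [
    {"study": "A", "lg": 2, "peak_cm": 31.0, "r2": 0.12},
    {"study": "A", "lg": 2, "peak_cm": 31.0, "r2": 0.20},
    {"study": "B", "lg": 2, "peak_cm": 33.5, "r2": 0.09},
]
criteria = {
    "AIC": {1: 120.0, 2: 101.5, 3: 103.0},
    "AICc": {1: 120.4, 2: 102.6, 3: 105.9},
    "AIC3": {1: 121.0, 2: 103.5, 3: 106.0},
    "BIC": {1: 121.2, 2: 104.0, 3: 106.8},
    "AWE": {1: 125.0, 2: 126.1, 3: 131.0},
}
kept = prune_redundant(qtls)
n_meta = best_model(criteria)
```

In this toy example four of the five criteria are minimized by the two-cluster model, so two meta-QTLs would be retained.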

Physical intervals for each meta-QTL were computed, as previously explained, through anchoring to the grape genome assembly 12X.v2 in the BioMercator v4.2 software, and the underlying candidate genes were retrieved together with their functional annotation according to the CRIBIv1 annotation (http://genomes.cribi.unipd.it/gb2/gbrowse/public/vitis_vinifera_v2/). Gene ontology annotations were retrieved by using the getBM function of the Bioconductor biomaRt (2.38.0) package. The Vitis vinifera Ensembl database was used and candidate genes were annotated with GO slim accessions.

Transcriptomic data integration

The grapevine expression atlas [49] was used to retrieve expression in different grape organs and to exclude candidate genes never expressed in berry, rachis or seeds. RNA-Seq gene expression data along berry development were retrieved from three studies [44, 46, 48]. Ninety-nine berry RNA-Seq profiles for the cultivar Pinot Noir, collected in triplicate in the years 2012, 2013 and 2014 around the time of veraison [48], were retrieved. Genes with FPKM (fragments per kilobase of exon model per million reads mapped) values lower than 1 in at least 2 replicates at all time-points were considered as never expressed and removed from the dataset. Expression of early “biomarkers” of the veraison transition [48] was inspected at early time points during berry development, before veraison, to identify the interval in each year when the transcriptomic rearrangement associated with veraison first occurs. Genes showing the highest modulation of their expression across this interval in at least 2 years were selected by inspecting FPKM values and considered as transcriptomic candidates, to be joined with the genes differentially expressed across veraison in all varieties as defined in [44, 46]. Finally, transcriptomic candidates positioned under meta-QTLs were selected.
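The expression filter described above (FPKM lower than 1 in at least 2 replicates at all time-points) can be written as a short Python sketch. The data layout, the function names, the gene identifiers and the FPKM values are hypothetical; only the threshold and the replicate rule come from the text.

```python
def never_expressed(profile, threshold=1.0, min_low_reps=2):
    """True if, at every time-point, at least `min_low_reps` replicates are below threshold.

    profile: {time_point: [FPKM of each replicate]}.
    """
    return all(
        sum(v < threshold for v in reps) >= min_low_reps
        for reps in profile.values()
    )


def filter_expressed(dataset):
    """Remove genes considered as never expressed from {gene: profile}."""
    return {g: p for g, p in dataset.items() if not never_expressed(p)}


# Hypothetical gene identifiers and FPKM values (triplicates, two time-points).
dataset = {
    "geneA": {"t1": [0.2, 0.5, 1.4], "t2": [0.0, 0.3, 0.9]},
    "geneB": {"t1": [0.2, 0.5, 1.4], "t2": [3.1, 2.7, 0.8]},
}
expressed = filter_expressed(dataset)
```

In this example geneA is removed because at both time-points at least two replicates fall below 1 FPKM, whereas geneB is kept because only one replicate is below the threshold at the second time-point.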