The CBF gene family in hexaploid wheat and its relationship to the phylogenetic complexity of cereal CBFs
- Cite this article as:
- Badawi, M., Danyluk, J., Boucho, B. et al. Mol Genet Genomics (2007) 277: 533. doi:10.1007/s00438-006-0206-9
Most temperate plants tolerate both chilling and freezing temperatures whereas many species from tropical regions suffer chilling injury when exposed to temperatures slightly above freezing. Cold acclimation induces the expression of cold-regulated genes needed to protect plants against freezing stress. This induction is mediated, in part, by the CBF transcription factor family. To understand the evolution and function of this family in cereals, we identified and characterized 15 different CBF genes from hexaploid wheat. Our analyses reveal that wheat species, T. aestivum and T. monococcum, may contain up to 25 different CBF genes, and that Poaceae CBFs can be classified into 10 groups that share a common phylogenetic origin and similar structural characteristics. Six of these groups (IIIc, IIId, IVa, IVb, IVc and IVd) are found only in the Pooideae suggesting they represent the CBF response machinery that evolved recently during colonization of temperate habitats. Expression studies reveal that five of the Pooideae-specific groups display higher constitutive and low temperature inducible expression in the winter cultivar, and a diurnal regulation pattern during growth at warm temperature. The higher constitutive and inducible expression within these CBF groups is an inherited trait that may play a predominant role in the superior low temperature tolerance capacity of winter cultivars and possibly be a basis of genetic variability in freezing tolerance within the Pooideae subfamily.
The ability of plants to respond and adapt to low temperature (LT) stresses varies greatly with species (Sakai and Larcher 1987). Most temperate plants, such as wheat, canola and Arabidopsis, are able to tolerate both chilling and freezing temperatures. In contrast, species from tropical regions, such as tomato, maize and rice are unable to tolerate freezing temperatures and even suffer chilling injury when exposed to temperatures in the range of 0–12°C. The cold acclimation process induces the expression of cold-regulated (COR) genes, whose products are thought to be necessary for protection against freezing stress (Thomashow 1999). Many of these COR genes contain copies of the C-repeat/dehydration-responsive element (CRT/DRE) in their promoters, which has the core motif CCGAC and is responsible for the LT-responsiveness of these genes. The factors that bind the CRT/DRE were first identified in Arabidopsis and designated CRT-Binding Factors/DRE-binding proteins 1 (CBF/DREB1) genes (Stockinger et al. 1997; Liu et al. 1998). These genes encode LT-induced transcription factors that when constitutively overexpressed in Arabidopsis mimic cold acclimation by inducing COR gene expression and freezing tolerance (FT) (Jaglo-Ottosen et al. 1998; Liu et al. 1998). CBF-like proteins have been isolated from a wide range of plants that include species capable and incapable of cold acclimation, suggesting that the CBF cold response pathway is broadly conserved in plants (Jaglo et al. 2001).
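As a small illustration of the CRT/DRE core motif mentioned above, the following sketch scans a promoter sequence for CCGAC on both strands. The function name and the example sequence are hypothetical and are not taken from the study; real promoter analyses would normally also consider flanking bases and motif context.

```python
def find_crt_dre(promoter):
    """Return sorted (position, strand) pairs for each CCGAC core motif.

    Positions are 0-based and refer to the forward strand. A hit on the
    reverse strand appears as GTCGG (the reverse complement of CCGAC).
    """
    seq = promoter.upper()
    hits = []
    for motif, strand in (("CCGAC", "+"), ("GTCGG", "-")):
        start = seq.find(motif)
        while start != -1:
            hits.append((start, strand))
            # Restart one base further on so overlapping hits are not missed.
            start = seq.find(motif, start + 1)
    return sorted(hits)


# Hypothetical promoter fragment, for illustration only.
print(find_crt_dre("AACCGACTTGTCGGA"))
```

Counting such hits per promoter gives a rough first indication of whether a gene is a candidate CBF target, but it does not replace binding or expression data.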
The CBF/DREB1 proteins belong to the AP2/ERF superfamily of DNA-binding proteins. This protein family has in common the AP2 DNA-binding motif. The 122 Arabidopsis members of the ERF family containing one AP2 DNA-binding motif were divided into 12 groups, with several of the groups being further divided into subgroups (Nakano et al. 2006). The six CBF/DREB1 proteins of Arabidopsis were included in subgroup IIIc, one of the five group III subgroups. The CBF proteins are distinguished from other group III members by a conserved set of amino acid sequences (motif CMIII-3) flanking the AP2 DNA-binding domain. Other motifs found in CBF proteins (CMIII-1, CMIII-2 and CMIII-4) are also present in one or more of the group III subgroups, suggesting related molecular functions may be conserved between subgroups. These features were previously noted as conserved features of the C-terminal activation domain between aligned CBFs (Wang et al. 2005; Skinner et al. 2005). Comparison of group III proteins between the eudicot Arabidopsis and the monocot rice reveals they share four common subgroups (Nakano et al. 2006) suggesting a functional diversification of group III proteins had occurred before the divergence of these two species. The 4 ancestral genes have amplified to 23 and 26 genes in Arabidopsis and rice, respectively. Arabidopsis CBF studies have demonstrated that CBF1, 2, and 3 function in the cold-acclimation pathway with redundant and some possibly specific functions (Gilmour et al. 2004; Novillo et al. 2004; Van Buskirk and Thomashow 2006), CBF4 is involved in drought adaptation (Haake et al. 2002), and DDF1 and DDF2 are involved in gibberellin biosynthesis and salt stress tolerance (Magome et al. 2004). Based on these findings, the functional divergence of group III ERF proteins in monocots may differ from eudicots and therefore warrant a characterization of monocot genes.
Members of the Poaceae have been targeted for study since they include the major cereal crops wheat, maize and rice, which provide >60% of the calories and proteins in the human diet. To meet the needs of the projected human population by 2050, cereal grain production must increase at an annual rate of 2% on an area of land that will not increase much beyond the present level (Gill et al. 2004). Therefore, significant advances in our understanding of the CBF family in cereals are essential to develop needed strategies to protect crops from losses caused by abiotic stress. The Poaceae represent an excellent model system to study the roles of the CBF family in the evolution of LT tolerance. The Poaceae radiated some 55–70 million years ago (MYA) into several subfamilies (Kellogg 2001). The subfamilies Oryzaceae (rice) and Panicoideae (maize) have a more tropical geographical distribution compared to members of the Pooideae, which contain the temperate cereals wheat, barley and oat. LT tolerance within the Pooideae subfamily ranges from low in oat (a Poeae tribe representative), to intermediate in barley and wheat (Triticeae tribe), to high in rye (Triticeae tribe). The estimated divergence time between the Triticeae and Poeae is around 35 MYA, and within the Triticeae, barley and rye diverged from wheat around 11 and 7 MYA, respectively (Huang et al. 2002a). The more recent evolutionary history of bread wheat started with an adaptive radiation of the diploid progenitors around 2.2–4.5 MYA, followed by successive hybridizations less than 0.5 MYA and about 8,000 years ago to produce hexaploid bread wheat (Huang et al. 2002b). Wheat is an interesting model since the comparative analysis of CBF gene function among closer and more distantly related species may shed light on important evolutionary trends that have sculpted CBF function. Furthermore, the three genomes of hexaploid wheat are known to contain differences for many agronomically important genes (Gill et al. 2004), and recently, an LT tolerance QTL on chromosome 5 of T. monococcum (Miller et al. 2006), barley (Skinner et al. 2006) and hexaploid wheat (Båga et al. 2007) was found to coincide with the location of 11, 12 and 2 CBF genes, respectively. The exact molecular explanation for this LT tolerance QTL is not known, but the co-location raises the possibility that CBF genes underlie this important trait.
Many CBF genes have been identified from Poaceae species such as rye (Jaglo et al. 2001), rice (Dubouzet et al. 2003; Skinner et al. 2005), barley (Choi et al. 2002; Xue 2002; Francia et al. 2004; Skinner et al. 2005), wheat (Jaglo et al. 2001; Kume et al. 2005; Skinner et al. 2005; Vágújfalvi et al. 2005; Miller et al. 2006), and Festuca arundinacea (Tang et al. 2005). These studies, in particular those of Skinner et al. (2005) and Miller et al. (2006), have revealed that the cereal CBF family is large and complex. To better understand the functions of this gene family during cold stress and its evolution in the Poaceae, we initiated a study to identify and characterize CBF genes from hexaploid wheat. Here, we show that hexaploid wheat contains at least 15 different CBF genes, and that Poaceae CBFs can be subdivided into groups with specific characteristics. These findings expand our understanding on the functional categories of cereal CBF genes and provide a starting point for future studies.
Materials and methods
Preparation of the cDNA libraries
Five different cDNA libraries prepared from Triticum aestivum L. cv Norstar were used to identify expressed wheat CBF genes in this study (Table S1). Plant growth conditions, RNA purification and cDNA library construction were described in detail elsewhere (Houde et al. 2006). Briefly, the five libraries (L2–L6) were prepared from the following pooled mRNA populations: (L2) aerial parts (leaf and crown) from control and long-term cold acclimated wheat (1–53 days); (L3) root tissue from control, cold-acclimated and salt stressed wheat; (L4) aerial parts of dehydration stressed wheat; (L5) crown tissue during vernalization and different developmental stages of spike and seed formation in wheat; (L6) crown and leaf tissues from wheat after short exposures to LT in the light and in the dark. All cDNAs synthesized were directionally cloned into the pCMV.SPORT6 vector (Invitrogen) with the SalI adaptor (GTCGACCCACGCGTCCG) and NotI primer adaptor (GCGGCCGCCC(T)15). For the last four libraries, the first strand cDNA reaction mix contained methylated dCTP to prevent cDNAs from internal cleavage by the NotI restriction enzyme used for directional cloning. For the last library, the ‘GeneRacer’ kit (Invitrogen) was used prior to first strand synthesis to produce a library containing a high proportion (95%) of full-length cDNAs. For each library, six million primary transformants were obtained, amplified and frozen as glycerol stocks.
To prepare plasmids for PCR experiments, 40 μl of a bacterial library stock (>40 × 10⁶ clones) were inoculated into LB media (100 ml) supplemented with ampicillin and grown at 30°C for 6–8 h. Plasmids were isolated using the QIAprep Miniprep system (QIAGEN) and the quantity was evaluated on a gel.
Identification of wheat CBF genes
Table 1 Nomenclature and characteristics of wheat CBF genes isolated from the cultivar Norstar (recovered column headings: open reading frame (CDS); total amino acids)
Expression profiling using northern analysis
The spring wheat cultivar Triticum aestivum L. cv Quantum and the winter cultivar Norstar were germinated in moist sterilized vermiculite in a growth chamber (Model-E15, Conviron) for 7 days at 20°C and 70% relative humidity under an irradiance of 250 μmol m⁻² s⁻¹ and a 16-h photoperiod. At the end of this period, 5 g of control leaves were sampled and individually frozen. A cold treatment (4°C) was initiated by changing the temperature in the growth chamber 4 h into the day phase and continued under the same irradiance and photoperiod conditions for the times indicated in Fig. 4. Total RNA was extracted from wheat leaves as described (Danyluk and Sarhan 1990), and equal amounts (10 μg) were separated on formaldehyde–agarose gels. Transfers to positively charged nylon membranes and hybridizations with ³²P-labeled probes were performed using standard molecular biology techniques. Probes for the different TaCBFs were designed outside of the AP2 domain to avoid cross-hybridization with known CBF transcripts (Table S3). All filters were washed at high stringency (0.1× SSC, 0.1% SDS; 65°C) for 30 min. Membranes were exposed to BioMax-MS films (Kodak) and Molecular Imager FX screens (Bio-Rad).
Expression profiling by quantitative real-time PCR
Plant growth conditions and cDNA synthesis
Spring cultivar Manitou and winter cultivar Norstar were germinated in a mixture of 50% black earth and 50% Pro-Mix (Premier) for 8 days at 20°C under a 16-h photoperiod with a light intensity of 250 μmol m⁻² s⁻¹. In this experiment, Manitou was used as the spring cultivar so that its response could be compared with that of the Quantum spring cultivar. At the end of this period, 8 control seedlings were sampled every 2 h on dry ice, and immediately frozen at −70°C. Cold treatment (4°C) was initiated by changing the temperature in the growth chamber 4 h before the night cycle, and samples were taken as indicated in each figure. Sampling of control and cold-treated plants every 2 h was chosen so that closely spaced time points act as closely related biological replicates in addition to providing a better resolution of the expression characteristics of a gene. Night samples were collected by opening the growth chamber in the dark, removing plants to another room and harvesting rapidly. Total RNA was isolated using the RNeasy Plant Mini Kit (QIAGEN) with the optional on-column DNase digestion. Purified RNA (2.8 μg) was reverse transcribed in a 20 μl reaction volume using the Superscript II first strand cDNA synthesis system for RT-PCR (Invitrogen). Parallel reactions for each RNA sample were run in the absence of Superscript II (no RT control) to assess genomic DNA contamination. The reaction was terminated by heat inactivation; the cDNA product was treated with RNase H, and diluted in water (20 ng/μl) for storage (−20°C).
Design of gene-specific primers
Hexaploid wheat contains three genomes inherited from three diploid ancestors. The 37 TaCBF gene sequences identified in this study were aligned with ClustalW (http://www.align.genome.jp/) and characterized phylogenetically. This analysis revealed 15 groups of genes, each containing one to three homeologous copies. Primers were designed to monitor the expression of only one representative gene per group, and this representative was chosen randomly. Fluorescent TaqMan-MGB probes as well as the non-fluorescent primers (Table S4) were designed using a combination of Primer Express Software Version 2.0 (Applied Biosystems) and Primer3. BLASTN searches against EST and NR databases were performed to confirm the gene specificity of the primers. Non-fluorescent primers were synthesized by Invitrogen and TaqMan-MGB probes by Applied Biosystems.
PCR amplification and data analysis
Quantitative real-time PCR assays for each gene target were performed in quadruplicate on an ABI Prism 7000 sequence detection system (Applied Biosystems) using the eukaryotic 18S rRNA as the endogenous control (Applied Biosystems #4319413E). From the diluted cDNA, 2 μl (40 ng) was used as a template in a 25-μl PCR reaction containing 1× TaqMan universal PCR master mix (Invitrogen), 0.9 μM of non-fluorescent primers, and 0.25 μM of TaqMan-MGB fluorescent probe. The PCR thermal cycling parameters were 50°C for 2 min, 95°C for 10 min followed by 50 cycles of 95°C for 15 s and 60°C for 1 min.
All calculations and statistical analyses were performed by the SDS RQ Manager 1.1 software using the 2^(−ΔΔCt) method with a relative quantification RQmin/RQmax confidence set at 95% (Livak and Schmittgen 2001). The error bars display the calculated maximum (RQmax) and minimum (RQmin) expression levels that represent the SE of the mean expression level (RQ value). Collectively, the upper and lower limits define the region of expression within which the true expression level value is likely to occur (SDS RQ Manager 1.1 software user manual; Applied Biosystems). Amplification efficiency (90–100%) for the 15 primer sets was determined by amplification of cDNA dilution series using 80, 20, 10, 5, 2.5, and 1.25 ng per reaction (data not shown). Specificity of the RT-PCR products was assessed by gel electrophoresis.
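The two calculations named above can be sketched in a few lines. This is a minimal illustration, not the SDS RQ Manager implementation: the function names are hypothetical, the relative quantity follows the standard 2^(−ΔΔCt) formula with 18S rRNA as the reference, and the efficiency is estimated with the commonly used relation E = 10^(−1/slope) − 1 from a linear fit of Ct against log10(input). The Ct values in the usage lines are invented for illustration.

```python
import math


def relative_quantity(ct_target, ct_reference, ct_target_cal, ct_reference_cal):
    """Relative expression by the 2^(-ddCt) method.

    ct_target / ct_reference: Ct of the gene of interest and of the
    endogenous control (e.g. 18S rRNA) in the treated sample.
    ct_target_cal / ct_reference_cal: the same two values in the calibrator.
    """
    delta_sample = ct_target - ct_reference
    delta_calibrator = ct_target_cal - ct_reference_cal
    return 2.0 ** -(delta_sample - delta_calibrator)


def amplification_efficiency(input_ng, ct_values):
    """Efficiency from a dilution series; 1.0 means perfect doubling per cycle."""
    xs = [math.log10(ng) for ng in input_ng]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ct_values) / n
    # Ordinary least-squares slope of Ct against log10(template amount).
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ct_values))
             / sum((x - mean_x) ** 2 for x in xs))
    return 10.0 ** (-1.0 / slope) - 1.0


# Invented Ct values: the target comes up 3 cycles earlier after cold treatment.
print(relative_quantity(24.0, 10.0, 27.0, 10.0))

# Dilution series from the text, with idealized Ct values (perfect doubling).
series = [80, 20, 10, 5, 2.5, 1.25]
print(amplification_efficiency(series, [30 - math.log2(ng) for ng in series]))
```

An efficiency of 0.9 to 1.0 from the second function corresponds to the 90–100% range reported for the primer sets.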
Chromosome localization of TaCBF genes
Genomic DNA was extracted and quantified (Limin et al. 1997) from several stocks of the wheat cultivar Chinese Spring: ditelocentric series provided by the USDA from E. R. Sears collection (all chromosomes are present but in each line one chromosome pair is represented by only the telocentric chromosomes of one arm); chromosome 5 nullisomic–tetrasomic lines (a pair of chromosomes is removed and replaced by another pair of homoeologous chromosomes); and deletion lines for homoeologous group 5AL and 5DL chromosomes (Endo 1988; Endo and Gill 1996) generated using the gametocidal chromosome of Aegilops cylindrica. From the diluted genomic stocks, 2 μl (20 ng) was used as a template in a 25-μl PCR reaction containing 1× TaqMan universal PCR master mix (Invitrogen), 0.9 μM non-fluorescent primers, and 0.25 μM TaqMan-MGB fluorescent probe. The PCR thermal cycling parameters were 50°C for 2 min, 95°C for 10 min followed by 50 cycles of 95°C for 15 s and 60°C for 1 min. At the end of the run, the Ct values were compared, and genetic stocks that showed a delayed or undetectable amplification were identified as the location of the assayed TaCBF gene.
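The comparison of Ct values across genetic stocks can be illustrated with a short sketch. The text does not state a numerical criterion for a "delayed" amplification, so the 3-cycle threshold, the use of the median as baseline, the function name and the stock labels below are all assumptions made for illustration.

```python
from statistics import median


def stocks_lacking_gene(ct_by_stock, delay_cycles=3.0):
    """Return stocks whose amplification is undetectable or clearly delayed.

    ct_by_stock maps a stock label to its Ct value, or to None when no
    amplification was detected within the run. delay_cycles is an assumed
    threshold, not a value taken from the study.
    """
    detected = [ct for ct in ct_by_stock.values() if ct is not None]
    if not detected:
        raise ValueError("no stock produced a detectable signal")
    baseline = median(detected)
    return sorted(
        stock for stock, ct in ct_by_stock.items()
        if ct is None or ct - baseline >= delay_cycles
    )


# Hypothetical labels and Ct values for three nullisomic-tetrasomic lines.
example = {"N5A-T5B": 24.1, "N5B-T5D": 24.3, "N5D-T5A": None}
print(stocks_lacking_gene(example))
```

In this invented example the line lacking chromosome 5D gives no signal, which would place the assayed gene on 5D.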
Phylogenetic and other bioinformatic analysis
Table 2 List of monocotyledon CBF genes and their proposed nomenclature (recovered column headings: proposed gene name; type of sequence; the accession entries that survive are listed below)
AP008215 (CDS: 20099705-20098965)
DV482761, DV481106, DV485162
DV485858, DV489297, DV487222, DV478843, DV489355, DV484024, DV478031
Clone 79 CBF
Clone 80 CBF
Clone 81 CBF
AY951944 (CDS: 181665-182366)
AY951945 (CDS: 91223-90357)
AY951945 (CDS: 22116-22838)
AY951945 (CDS: 35360-34722)
DN143696, DN145297, DN143313
DN144355, DN143145, DN143526
CA212714, CA156028, CA080013, BQ534420
CA162524, CA161788, CA089833, BQ533805
BU103690, CA090085, CA109731, CA105526
DR964417, DV523865, DR790548, DV530932
EB674710, DV539998, DV510250, DR821363, EB674709, EB400670, DR795602
DY344903, DY345420, DY380161, DY381223
Hydrophobic cluster analysis (HCA) (Gaboriaud et al. 1987; Callebaut et al. 1997) was conducted using the web-based interface at: (http://www.bioserv.rpbs.jussieu.fr/RPBS/cgi-bin/Ressource.cgi?chzn_lg=an&chzn_rsrc=HCA). Briefly, the protein sequences are displayed on a duplicated α-helical net in which hydrophobic amino acids (V, I, L, F, M, Y, W) are contoured. Hydrophobic residues separated by four or more nonhydrophobic residues, or by a proline, are placed into distinct clusters. The defined hydrophobic clusters were shown to mainly correspond to the internal faces of regular secondary structures (α-helices or β-strands). Two other amino acids were chosen to be highlighted in this study: proline, which confers the greatest constraint to the polypeptide chain, and glycine, which confers the largest freedom to the chain. This secondary structure information was highlighted on a ClustalW alignment of group-related wheat CBFs.
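The cluster-separation rule can be illustrated with a deliberately simplified sketch. Real HCA works on a two-dimensional α-helical net; the code below applies only the stated separation rule (four or more nonhydrophobic residues, or a proline) along the linear sequence, so it is an approximation for illustration and not the algorithm used by the web server. The function name and test peptide are hypothetical.

```python
HYDROPHOBIC = set("VILFMYW")


def hydrophobic_clusters(sequence, min_gap=4):
    """Return (start, end) index pairs (0-based, inclusive) of linear clusters.

    A cluster is closed when a proline is met, or when min_gap or more
    nonhydrophobic residues separate two hydrophobic residues.
    """
    clusters = []
    start = last = None
    for i, residue in enumerate(sequence.upper()):
        if residue == "P":
            # Proline always acts as a cluster breaker.
            if start is not None:
                clusters.append((start, last))
                start = last = None
        elif residue in HYDROPHOBIC:
            if start is not None and i - last - 1 >= min_gap:
                # Too many nonhydrophobic residues since the last hit.
                clusters.append((start, last))
                start = None
            if start is None:
                start = i
            last = i
    if start is not None:
        clusters.append((start, last))
    return clusters


# Hypothetical peptide: the proline at index 5 splits it into two clusters.
print(hydrophobic_clusters("VLAAIPFW"))
```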
Identification of wheat CBF genes
CBF family members are important regulators of FT in plants. Data mining and analyses of cereal CBF sequences present in GenBank suggest that different species contain diverse and complex CBF families. Based on preliminary studies conducted in a number of varieties, hexaploid wheat contains at least seven CBF genes (Jaglo et al. 2001; Kume et al. 2005; Skinner et al. 2005). To maximize our chance of discovering CBF genes involved in the development of wheat FT, several cDNA libraries were constructed (Table S1) and screened to identify CBF genes expressed under various cold acclimation time points and conditions in the freezing tolerant cultivar Norstar. A combination of EST sequencing, cDNA library screening and PCR amplification allowed the identification of 37 expressed TaCBF genes from hexaploid wheat (Table 1). To be consistent with the established H. vulgare and T. monococcum nomenclature (Skinner et al. 2005; Miller et al. 2006), we assigned identical gene numbers to orthologs of hexaploid wheat (2–15) and new consecutive numbers (19 to 22) to novel genes identified following homology comparison with published and identified wheat sequences (e.g. TaCBFIa-A11 shows the highest homology with its ortholog HvCBF11 identified previously (Skinner et al. 2005)). These analyses revealed that the 37 genes identified from hexaploid wheat can be classified into at least 15 different orthologous gene groups with 1–3 homeologous copies in each. Phylogenetic analysis of T. aestivum and T. monococcum genes reveals that wheat CBF genes can be divided into 10 monophyletic groups. Therefore, we included in the proposed CBF nomenclature CBF subgroup information (e.g. CBFIa, II, IIIa, IIIb, IIIc, IIId, IVa, IVb, IVc and IVd) which should facilitate future comparison of monocot CBF properties and functions.
In the cases where homeologous copies were mapped to one of the three genomes of hexaploid wheat, a letter designating its location precedes the CBF gene number (e.g. TaCBFIIIa-D6). On the other hand, when the genomic localization of a homeologous copy has not yet been determined, the CBF gene number is followed by a temporary designation of .1, .2 or .3 (e.g. TaCBFIIIa-6.1).
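The naming scheme described above is regular enough to be parsed mechanically. The sketch below is one possible reading of it for wheat names such as TaCBFIIIa-D6 or TaCBFIIIa-6.1; the regular expression, the function name and the field names are assumptions, and barley names carrying a trailing paralog letter (e.g. HvCBFIIIc-8B) are deliberately not handled.

```python
import re

# species prefix, "CBF", group (I to IV with optional subgroup letter),
# optional genome letter, gene number, optional temporary copy suffix.
NAME_RE = re.compile(
    r"^(?P<species>[A-Z][a-z])CBF"
    r"(?P<group>(?:IV|I{1,3})[a-d]?)-"
    r"(?P<genome>[ABD])?"
    r"(?P<number>\d+)"
    r"(?:\.(?P<copy>[123]))?$"
)


def parse_cbf_name(name):
    """Split a wheat CBF gene name into its nomenclature fields."""
    match = NAME_RE.match(name)
    if match is None:
        raise ValueError(f"not a recognised CBF name: {name!r}")
    return {
        "species": match["species"],
        "group": match["group"],
        "genome": match["genome"],          # None when not yet mapped
        "number": int(match["number"]),
        "copy": int(match["copy"]) if match["copy"] else None,
    }


print(parse_cbf_name("TaCBFIIIa-D6"))
print(parse_cbf_name("TaCBFIIIa-6.1"))
```

Keeping the genome letter and the temporary copy suffix as separate fields makes it straightforward to update a record once a homeologous copy has been mapped.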
At least three gene groups (TaCBFIIId-19, TaCBFIVb-20 and TaCBFIVd-22) represent true orthologous series in hexaploid wheat since the homeologous copies (A, B and D) were identified and mapped to each genome equivalent in our study (Table 1). One gene group (TaCBFII-5) was not mapped in our study (Table 1). However, the T. monococcum ortholog was mapped to chromosome 7A (Miller et al. 2006) and the barley ortholog was mapped to the short arm of 7H (Skinner et al. 2006) suggesting that this gene group will be located on chromosome 7. The vast majority of gene groups (13 out of 15) were mapped to chromosome 5 (Table 1), and at least 9 of these 13 have so far been mapped more precisely to a region, between two deletion breakpoints, associated with a cold tolerance QTL in several Triticeae species (Francia et al. 2004; Båga et al. 2007; Miller et al. 2006; Skinner et al. 2006). The present study allowed the localization of 4 new gene groups (TaCBFIIId-19, TaCBFIVb-20, TaCBFIVb-21 and TaCBFIVd-22) (Table 1) to this QTL region containing the 11 tandem CBF genes in T. monococcum (Miller et al. 2006) and 12 tandem CBF genes in barley (Skinner et al. 2006).
A cut-off of 95% was chosen to differentiate between homeologous copies and possible recently duplicated genes. Because homeologous copies of different CBF gene groups would have started diverging around the same time, one would expect them to show similarities in the same range. The comparison of T. monococcum (Miller et al. 2006) and T. aestivum CBF genes (this study) does reveal five orthologous genes with high levels of identity (more than 98%). Of the 13 T. aestivum gene groups that contain homeologous copies, only 5 show identities below 95% between the homeologous ORFs (bases involved in transitions, transversions and gaps are counted as non-identical). These include TaCBFII-5, TaCBFIIIa-6, TaCBFIIIc-3, TaCBFIVd-9 and TaCBFIVd-22, which show identities of 85.9, 91.7, 94.5, 94.8 and 89.9%, respectively. When the identity comparison is repeated with gap regions not included, only TaCBFII-5 and TaCBFIIIa-6 show lower similarities of 93.4 and 93.5%, respectively. Therefore, these results suggest that the TaCBFII-5 and TaCBFIIIa-6 homeologous groups may contain closely related paralogs and/or have diverged at a different rate than other groups.
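The two identity calculations compared above (gaps counted as differences, and gap regions excluded) can be sketched as follows. This is a minimal illustration on pre-aligned sequences; the function names and the toy alignment are hypothetical, and the software actually used to compute the reported percentages is not stated in the text.

```python
def percent_identity(aligned_a, aligned_b, include_gaps=True):
    """Percent identity between two rows of the same alignment.

    With include_gaps=True a column with a gap in one row counts as a
    mismatch; with include_gaps=False such columns are skipped entirely.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must come from the same alignment")
    matches = compared = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" and b == "-":
            continue  # column empty in both rows carries no information
        has_gap = a == "-" or b == "-"
        if has_gap and not include_gaps:
            continue
        compared += 1
        if not has_gap and a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared


def within_homeolog_range(identity, cutoff=95.0):
    """Apply the 95% cut-off used in the text to one identity value."""
    return identity >= cutoff


# Toy alignment, for illustration only.
a, b = "ATGC--AT", "ATGCGGAA"
print(percent_identity(a, b), percent_identity(a, b, include_gaps=False))
print(within_homeolog_range(94.8), within_homeolog_range(98.0))
```

As in the toy alignment, excluding gap columns can only raise or preserve the identity, which is why fewer groups fall below the cut-off in the second comparison.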
Phylogeny of monocot CBF genes
Monocot and eudicot CBF sequences are separated on a phylogenetic tree (Dubouzet et al. 2003; Qin et al. 2004; Bräutigam et al. 2005; Xiong and Fei 2006) suggesting that at least some CBF gene function specialization has evolved recently in plants. To understand the relationship between TaCBFs and other monocot CBFs, we searched NR and EST databases and compiled a set of sequences for analysis (Table 2). During our BLAST comparisons of CBFs, we found that it would be difficult to align the entire ORF of distant members with complete confidence. Therefore, the nucleotide sequence of the AP2 DNA binding domain and adjacent CBF signatures were chosen for alignment and phylogeny analysis since these domains are extremely well conserved in the CBF family. To simplify the future comparison of CBF gene functional studies with different monocotyledon plants, we propose a nomenclature that reflects the evolutionary relationship of CBF genes. This will help to distinguish their specific functional roles which may have appeared during their evolution.
To establish the relationship between the different CBF genes in wheat, only one representative of each of the 15 TaCBF gene groups (preferentially the homeologous A copy) was included in this analysis. In addition, 8 (Table 2) of the 13 identified T. monococcum CBF genes (Miller et al. 2006) were included in the analysis since they showed less than 95% identity (gap regions not included in the analysis) with any of the 15 TaCBF gene groups. These lower identities of TmCBFII-5 (86.5%), TmCBFIIIb-18 (76%), TmCBFIIIc-10 (94%), TmCBFIIIc-13 (86.7%), TmCBFIIId-16 (82.1%), TmCBFIIId-17 (78.8%), TmCBFIVa-2 (92.4%) and TmCBFIVd-4 (87.1%) with their closest homologues suggest that they may represent additional wheat CBF genes. To compare wheat CBFs with the closely related Triticeae species barley, we included 13 of the 19 HvCBF genes reported (Table 2) (Skinner et al. 2005). The two additional pseudogenes, HvCBFIIIc-8B and HvCBFIIIc-8C, and the remaining four genes HvCBFIIIc-10B (97.2%), HvCBFIVa-2B (98.8%), HvCBFIVd-4B (99.8%) and HvCBFIVd-4D (96.4%) were not included since their high homologies with their closest related paralog suggested that these duplications happened following the divergence of wheat and barley. Several other CBF sequences from families of the monocotyledon order Poales were included in the analysis (Table 2). In addition, CBF representatives of two other monocotyledon orders Arecales and Zingiberales were also included in the analysis (Table 2). HvCBF7 and TmCBF7 (Skinner et al. 2005; Miller et al. 2006) were not included in this phylogenetic analysis since their protein sequence contains a less conserved CBF signature and the presence of motifs found in subgroup IIId of ERF proteins (Nakano et al. 2006).
The second group (named CBFII) also contains CBF genes from the Oryzaceae, Panicoideae and Pooideae subfamilies but the presence of only one gene in rice suggests that it was less complex before divergence. The possibility that additional genes may exist in wheat based on the lower homology of TmCBFII-5 (86.5%) with T. aestivum genes will require additional sequence identification and characterization to be confirmed. Although these genes were initially presented as part of group I (Skinner et al. 2005), they are classified separately here as a group II based on their evolutionary distance from the first group, their specific occurrence in all Poales/Poaceae subfamilies examined, and structural differences (see next section).
The results presented in Fig. 1 show that the 11 CBF genes from group III are clustered in several distinct subgroups. These subgroups were named CBFIIIa, CBFIIIb, CBFIIIc and CBFIIId to reflect their monophyletic origins. Groups IIIa and IIIb are the only ones that contain CBF genes from the three Poaceae subfamilies suggesting that the ancestral CBFIIIa and CBFIIIb genes were already present before divergence of these subfamilies. However, group CBFIIIb did not always form a monophyletic clade with other substitution models suggesting a less certain orthologous relationship. With these alternative models, TmCBFIIIb-18 was found to cluster alone or in the vicinity of the CBFIIId group (results not shown). Sequencing of additional CBFIIIb genes will help to resolve this ambiguity. Interestingly, only CBF genes from tribes of the Pooideae subfamily were found clustered in groups CBFIIIc and CBFIIId suggesting that these groups evolved following the appearance of the Pooideae. Based on the available data, group IIIc contains at least three common genes (CBFIIIc-3, CBFIIIc-10 and CBFIIIc-13) before wheat–barley speciation. No report has yet shown the existence of the HvCBFIIIc-8 type pseudogenes in wheat. In addition, this group contains two of the genes (HvCBFIIIc-8 and HvCBFIIIc-10) that have duplicated in barley following divergence from wheat. Excluding the pseudogenes, this group is presently composed of four genes in barley and four genes in wheat. In the case of group IIId, one barley gene has been identified (HvCBFIIId-12) compared to five in wheat (TaCBFIIId-12, TaCBFIIId-15, TmCBFIIId-16, TmCBFIIId-17, TaCBFIIId-19). However, it is probable that barley has at least an additional CBFIIId gene since multiple homologs were identified in the more distantly related Avena sativa. These results suggest that differences may exist between closely related species in the exact number of CBF genes present within a group. 
The group III rice genes OsCBFIII-1D, OsCBFIII-1I and OsCBFIII-1J do not cluster within any of the above described subgroups. Therefore, no subgroup classification is proposed since these genes may have evolved specifically in Oryzaceae.
The analysis of the nine wheat genes from group IV (Fig. 1) reveals a compact clustering compared to group III genes, suggesting a more recent diversification. However, based on the origin of the main branches, it is possible to classify the wheat genes into four groups: CBFIVa (TaCBFIVa-2 and TmCBFIVa-2), CBFIVb (TaCBFIVb-20 and TaCBFIVb-21), CBFIVc (TaCBFIVc-14), and CBFIVd (TaCBFIVd-4, TmCBFIVd-4, TaCBFIVd-9 and TaCBFIVd-22). The group CBFIVb is the only one that does not contain a barley representative at the moment. Within group CBFIVd, the identity between the complete ORF of wheat–barley orthologs (91.5% for CBFIVd-4 and 94.2% for CBFIVd-9) and paralogs (90.8% for wheat–wheat and 90.3% for barley–barley) are very similar indicating that this group amplified just prior to the divergence of the wheat–barley lineage. At present, only two genes (AsCBFIVa and FaCBFIVa-2) from the closely related Pooideae tribes Aveneae and Poeae were found to cluster with the CBFIVa group indicating that at least this group had appeared prior to the radiation of these tribes. These results suggest that the amplification of group CBFIVd and possibly the emergence of groups IVb and IVc may represent a specific characteristic of the Triticeae tribe. In addition, groups CBFIVa and CBFIVd contain the two genes (HvCBFIVa-2 and HvCBFIVd-4) that have duplicated in barley following divergence from wheat. The rice (OsCBFIV-1B.1) and sugarcane (SoCBFIV) representatives were found to be distantly related to the core group IV genes suggesting that the ancestral group IV gene continued to evolve following the divergence of the corresponding plant subfamilies. A similar observation was noted previously by Skinner et al. (2005). Based on present data, barley has at least seven genes in group IV (four original plus three specifically amplified in barley) compared to nine possible genes in wheat.
In conclusion, these analyses reveal that hexaploid wheat contains at least 15 CBF genes, and could contain up to 23–25 CBF genes. The latter estimates assume hexaploid wheat has retained orthologs of all T. monococcum CBF genes (the 15 identified groups plus the 8 divergent T. monococcum genes give 23) and of HvCBFIa-1 and OsCBFI-1F (giving 25). The Poaceae CBF genes can be divided into ten groups with six of these (CBFIIIc, IIId, IVa, IVb, IVc and IVd) having evolved only in the Pooideae.
Bioinformatic analysis of wheat CBF proteins
The TaCBF genes identified in this study encode proteins ranging from 202 to 290 amino acids that share homologies with other CBF proteins. Analysis of these protein sequences reveals that they contain, to different degrees, the characteristic motifs found in this family (Wang et al. 2005; Skinner et al. 2005; Nakano et al. 2006). From the N- to C-terminus, we can identify the AP2 DNA-binding domain flanked by the CMIII-3 motif, and then the CMIII-1 motif, the CMIII-2 motif, and the conserved C-terminus LWSY motif (or CMIII-4). Analysis of the ORF of homeologous copies reveals that four groups have a member with a sequence difference at the C-terminus. The TaCBFIVa-2.2 protein contains an extension of 30 amino acids past the last motif because of a T→C transition that destroys a termination codon. In the TaCBFIIIc-3.1 gene, a transversion in the last codon creates a premature termination codon and truncates the motif to LWS. The TaCBFIVb-A20 protein contains an extension of five amino acids compared with other members of this group because of a deletion of the termination codon. In the TaCBFIVd-B22 gene, a 38 base insertion downstream of the region encoding the CMIII-1 motif changes the reading frame in the remainder of the sequence, thus destroying the motifs CMIII-2 and -4. How these changes impact the activity of these proteins remains to be established.
Previously, Wang et al. (2005) had noted that the occurrence of hydrophobic clusters (HCs) in the activation domain of CBF proteins was evolutionarily conserved and demonstrated their functional importance and redundant nature. To identify some of the structural differences associated with the ten CBF groups classified by phylogenetic analysis, HC analysis was performed to highlight changes to internal faces of regular secondary structures (α-helices or β-strands) that may impact their functional activity. In groups that contain only one or two members, rice and barley orthologs were included in the analysis to examine the extent of structure conservation over evolutionary time. Two regions were analyzed: the AP2 DNA-binding domain with the motifs CMIII-3 and CMIII-1 flanking it, and the C-terminal activation domain region containing motifs CMIII-2 and CMIII-4. For comparative purposes, we included RhCBFI-1 and ZoCBFI-1 from the monocotyledon orders Arecales and Zingiberales, respectively, the lone rice group I protein OsCBFI-1F, and AtCBF3 from the eudicot Arabidopsis thaliana.
Analysis of the regions surrounding the AP2 DNA binding domain reveals significant differences between different CBF groups (Fig. 2). In the CMIII-1 motif, the presence and length of HC and the P/G pattern are capable of differentiating between the CBFIa and CBFII groups, between the CBFIIIa and the remaining CBFIII groups, and between the four CBFIV groups. In the CMIII-3 region, the absence of a P is specific for the CBFIa and CBFIVa groups, and a larger HC defines groups CBFIa, CBFII and CBFIVa. These results show that all CBF groups besides CBFIIIb, CBFIIIc and CBFIIId may have slightly different folding of their secondary and/or tertiary structures of their DNA binding domains.
The CBFII and CBFIV groups contain between 4 and 6 HCs in their C-terminal activation region, which is similar to the number identified in Arabidopsis (Wang et al. 2005). However, the structural patterns are different. In fact, the lone protein OsCBFI-1F is the one that shows the highest structural similarity with AtCBF3, suggesting that it may represent a closer version of the ancestral type CBF in monocots. The CBFII group contains five HCs and is unique among the groups analyzed since it does not contain P at the end of motif CMIII-2. The CBFIVa and CBFIVd groups share a similar structure with three HCs in motif CMIII-2 and up to five common HCs in total. Group CBFIVd can be differentiated by the presence of the first HC upstream of motif CMIII-2 in all members and a P in motif CMIII-4. Groups CBFIVb and CBFIVc contain only two HCs in motif CMIII-2, and these display variable lengths. In addition, group CBFIVc contains an additional HC downstream of CMIII-2 which is not found in any other group IV CBF. In summary, these results show that most groups described in this study display some structural differences in the AP2 DNA binding region while all groups show differences in their C-terminal activation regions. An important observation from these results is that rice proteins (OsCBFIII-1D, OsCBFIII-1I, OsCBFIII-1J and OsCBFIV-1B.1) do not show clear structure conservation with any of the described groups, corroborating their non-orthologous relationship. This is in contrast to the rice members of groups CBFIa, CBFII and CBFIIIa and partly ZoCBFI-1 and RhCBFI-1 that do show structures that are orthologous in nature, and have been conserved since the divergence of the respective branches.
Expression of wheat CBF genes
Results of Fig. 5 show that in almost all cases the maximum accumulation of TaCBFs occurs after 2 h of LT treatment. These results contrast with those of 4 and 6 h observed in Fig. 4 and suggest that the pattern of induction is influenced by the period of the day when the LT treatment was initiated. Several other observations can be noted from the results presented in Fig. 5. TaCBFIIIc-D3 displays an extreme transient expression profile showing low expression in all points examined except the 2 h LT treatment. A similar expression profile was observed for the barley ortholog (Choi et al. 2002), suggesting evolutionary conservation of the regulation pattern. The fact that no detectable expression of TaCBFIa-A11 and TaCBFIIIc-B10 was observed in this study suggests that CBFIa and CBFIIIc genes may be expressed under specific conditions and/or at extremely low levels. Other observations include: a near constitutive expression for TaCBFII-5.2, a low and transient induction for CBFIIIa and CBFIVb genes, and a high and more sustained expression for most CBFIIId, CBFIVc and CBFIVd genes. A comparison of winter and spring LT induction profiles reveals that they are qualitatively very similar. The quantitative comparison of the 2 h LT time points reveals that CBFII, CBFIIIa and CBFIIIc genes are expressed to similar levels (within a threefold factor) in both cultivars. On the other hand, CBFIIId, CBFIVa, CBFIVb, CBFIVc and CBFIVd genes (except TaCBFIVd-D9) show increased LT expression (4.7-fold and more) in the winter compared to the spring cultivar. In addition, these five groups (except for TaCBFIIId-B12) also show a higher basal expression in the winter cultivar (results not shown). For some members, this can be seen by comparing the induction profiles in Fig. 5 while keeping in mind that the LT expression of these genes is at least 4.7-fold higher in the winter than in the spring cultivar, as indicated above.
These results indicate that the five CBF groups are associated with both the superior inherited and LT inducible capacities of the winter cultivar to develop FT.
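The winter versus spring comparison above rests on two thresholds taken from the text: genes within a threefold factor are treated as similar, and genes at 4.7-fold or more are treated as more highly expressed in the winter cultivar. A minimal sketch of that classification follows; the gene labels and expression values are hypothetical placeholders, not measurements from this study, and the "intermediate" category is an assumption added to cover ratios between the two thresholds.

```python
def classify_winter_vs_spring(winter: float, spring: float,
                              similar_max: float = 3.0,
                              enriched_min: float = 4.7) -> tuple[float, str]:
    """Return the winter/spring expression ratio and a coarse label.

    Thresholds follow the values quoted in the text; the 'intermediate'
    label is an assumption for ratios that meet neither criterion.
    """
    if winter <= 0 or spring <= 0:
        raise ValueError("expression values must be positive")
    ratio = winter / spring
    if ratio >= enriched_min:
        return ratio, "higher in winter"
    if 1.0 / similar_max <= ratio <= similar_max:
        return ratio, "similar"
    return ratio, "intermediate"


# Hypothetical relative expression values at the 2 h LT time point.
toy_measurements = {
    "geneA": (10.0, 2.0),  # ratio 5.0
    "geneB": (3.0, 2.0),   # ratio 1.5
    "geneC": (8.0, 2.0),   # ratio 4.0
}

for gene, (winter, spring) in toy_measurements.items():
    ratio, label = classify_winter_vs_spring(winter, spring)
    print(f"{gene}: {ratio:.1f}-fold, {label}")
```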
Wheat is a good model species for studying FT since its tolerance lies between that of freezing sensitive plants (rice, maize and oat) and the extremely tolerant species rye. To decipher the genetic basis underlying the different capacities of temperate cereal species in developing FT, we have initiated the identification of genes that have the potential to influence FT. Since CBF genes have been widely implicated in cold acclimation in many species, we identified and characterized 15 TaCBF gene groups in hexaploid wheat. Our analyses reveal that wheat species, T. aestivum and T. monococcum, may have a large and complex CBF family with up to 25 different CBF genes. The large number of CBF genes in wheat is comparable to the number found in barley (20 or more) (Skinner et al. 2005) but contrasts with the ten genes present in rice and six in Arabidopsis. It is not known why freezing tolerant cereals have evolved and maintained so many CBF genes. Since the amplification of the CBF gene family has evolved independently after the monocot-eudicot divergence (Qin et al. 2004; Bräutigam et al. 2005; Xiong and Fei 2006), a thorough characterization of a large number of CBF genes in wheat seemed a daunting task. At present, it is difficult to assess if all TaCBFs have specific functions, obtained by subfunctionalization and neofunctionalization, or if they all have redundant functions. Therefore, an important objective of this study was to classify CBF genes into functional categories that would help orient functional studies. Towards this goal, we studied the evolution of the CBF family in monocotyledons and determined their structural characteristics using HCA to display conservation/changes that could affect protein secondary structure.
This study indicates that the CBF amplification seen in wheat has occurred quite recently and following the emergence of the Oryzaceae, Panicoideae and Pooideae lineages. From the phylogenetic analysis, it is easily observable that these subfamilies had already evolved representatives of groups Ia, II, IIIa and possibly IIIb, suggesting that orthologs within these groups should have common functions. The HCA analyses confirm that rice and wheat orthologs have similar structural characteristics that have been conserved over evolution. On the other hand, groups IIIc, IIId, IVa, IVb, IVc and IVd, representing 18 genes in wheat species, arose following the emergence of the Pooideae since no rice or maize CBFs clustered within these groups. After the emergence of Oryzaceae, rice only gained four genes. However, these rice genes are distantly related to the groups containing wheat genes and their proteins do not display similar structural features. Therefore, the 18 wheat genes in groups IIIc, IIId, IVa, IVb, IVc and IVd and the 4 unclassified group III and IV rice genes may represent the CBF response machinery that evolved in Pooideae and Oryzaceae, respectively, as they radiated into specific habitats. Since these arose after the subfamilies split, it is not surprising to see that structural patterns are not conserved in the respective CBF proteins. There are some indications that the CBF family is still (or has been recently) evolving under some selective pressure. The first comes from the observation that, although the evolutionary distance between groups IVa, IVb, IVc and IVd is small, their proteins display notable structural differences in their AP2 domain and C-terminal region. This is evident between groups but can also be seen between members of group IVa (Fig. 2) and IVb (Fig. 3). Finally, the observation that barley contains several duplicated genes with high similarities (Skinner et al. 2005) suggests that they arose after the wheat–barley divergence with some (those with identities above 95%) having arisen in the last 4 MY. As more CBF sequence information becomes available, it will allow a detailed evaluation of the number and nature of CBF gene groups found in different subfamilies of the Poaceae and even in other monocot orders. In a general sense, this will lead to a better understanding of the evolution of CBF-mediated tolerance to abiotic stresses, and in a more practical way, it may allow associating the emergence (or loss) of certain CBF genes (or groups) with maximum species freezing tolerance.
An interesting conclusion that can be drawn from the phylogenetic study is that the evolution of the Pooideae in a specific ecosystem has impacted the CBF signaling machinery by increasing the total number of CBF genes and the number of functional categories as also corroborated by the HCA analyses. This complexity is found in the Triticeae tribe and suggests that these plants have faced a strong selection pressure to maintain genes that will help them to perform well under a variety of environmental conditions. The selection pressure may not be constant but could intensify during generations that experience an unusually severe winter.
Studies with monocot CBFs have shown that they share the conserved domains present in this protein family, and that they are capable of binding a DRE-related cis element and inducing COR gene expression in Arabidopsis, although with a lesser efficiency or an incomplete response (Dubouzet et al. 2003; Qin et al. 2004; Skinner et al. 2005) compared to overexpression of endogenous Arabidopsis CBFs (Jaglo-Ottosen et al. 1998; Liu et al. 1998). This can be partly explained by structural differences which make monocot CBFs less efficient in replacing the endogenous Arabidopsis protein function. The independent evolution of CBF genes in plants will certainly make it harder to elucidate the exact roles of cereal genes in model species like Arabidopsis that are more evolutionarily distant. Therefore, to understand the exact functions of Pooideae- and possibly Triticeae-specific groups/genes, it will be essential to study these CBF genes in species such as wheat and barley. In addition, determining the exact contributions of members of the CBF family may be even more complex than anticipated since members of other subgroups of group III ERF proteins, such as HvCBF7 from barley (Skinner et al. 2005) and TINY2 from Arabidopsis (Wei et al. 2005), have been shown to be cold regulated and capable of binding a DRE cis element. Their contribution to the regulation of COR gene expression in species with large CBF families or their possible compensation in species with small families needs to be explored.
The HCA analysis of the AP2 DNA binding domain was capable of differentiating seven out of ten groups, suggesting that protein structure and binding properties may be affected. Results of Xue (2002) demonstrated that HvCBFIa-1 preferred TTGCCGACAT as a binding site while HvCBFIVa-2 (Xue 2003) preferred YYGTCGACAT. In addition, HvCBFIVa-2 and HvCBFIVd-4A showed a LT dependence for maximal binding activity (Xue 2003; Skinner et al. 2005). These results corroborate that functional differences can be visualized through HCA analyses and allow us to predict that up to seven groups may show some differences in DNA binding properties while group members will show a redundancy in binding. Determining such properties will be important for understanding the possible differences/overlap in regulons controlled by CBF groups. It was recently suggested that small variation in transcription factor binding consensus could have important consequences for bioactivity (Benedict et al. 2006) and some of the barley CBFs have been demonstrated to have differential affinity for specific CRT/DRE motifs (Xue 2002, 2003; Skinner et al. 2005). On the other hand, the HCA analysis of the C-terminal activation domain revealed substantial differences between all ten groups. The structural patterns were relatively well conserved between group members even for those corresponding to distantly related species. The varied patterns detected may be at the base of specific functional differences which could impact protein folding, recognition of specific interaction partners and the transactivation potential of a specific set of CBF proteins. The molecular dissection of Arabidopsis CBF1 (Wang et al. 2005) showed that certain hydrophobic clusters and other structural determinants in this region were important in regulating transactivation potential of this region. 
In Brassica napus, two closely related CBF proteins were shown to have substantially different transactivation potentials in yeast and tobacco (Zhao et al. 2006). The major differences between these proteins lie in the C-terminal activation domain within the hydrophobic clusters. Such behaviour has not been reported for the CBF1, 2, and 3 genes of Arabidopsis (Gilmour et al. 2004) which diverged from Brassica some 24 MYA (Koch et al. 2000) suggesting that even related species may have evolved some specific differences in CBF properties. These examples illustrate that the C-terminal activation domain of different groups could have distinct properties that play important and specific roles. In addition, this last observation suggests that some CBF properties in group IVa (Fig. 2) and IVb (Fig. 3) may even differ between the closely related species barley and wheat. The three structurally different groups IIIc, IIId and IVd share the characteristic of having several members with similar structural patterns which contrast with the lower complexity present in other groups. An explanation for the selection and conservation of duplicated genes that seem to accomplish redundant functions comes from work in yeast and suggests a selection for a higher flux through the pathways controlled by these genes (Papp et al. 2004). Therefore, the wheat genes from these groups may be necessary in certain circumstances where maximal induction is needed to achieve very high levels of LT tolerance and/or activation of a large number of genes in a regulon.
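The two barley binding preferences cited in this discussion, TTGCCGACAT for HvCBFIa-1 and YYGTCGACAT for HvCBFIVa-2, differ at only a few positions, and the second uses the IUPAC code Y (C or T). As a rough illustration of how such consensus sequences could be located in a promoter, the sketch below converts an IUPAC string to a regular expression and scans the forward strand only. The promoter fragment is invented, and an exact-match scan is a simplification: it says nothing about binding affinity.

```python
import re

# IUPAC nucleotide codes mapped to regular-expression character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}


def find_sites(sequence: str, consensus: str) -> list[tuple[int, str]]:
    """Return (0-based position, matched text) for every forward-strand match.

    A lookahead is used so that overlapping matches are also reported.
    """
    pattern = "".join(IUPAC[base] for base in consensus.upper())
    return [
        (match.start(), match.group(1))
        for match in re.finditer(f"(?=({pattern}))", sequence.upper())
    ]


# Hypothetical promoter fragment containing one instance of each site.
toy_promoter = "AAACTGTCGACATGGTTGCCGACATCC"

print(find_sites(toy_promoter, "TTGCCGACAT"))  # site reported for HvCBFIa-1
print(find_sites(toy_promoter, "YYGTCGACAT"))  # site reported for HvCBFIVa-2
```

A fuller analysis would also scan the reverse complement and would typically score partial matches rather than require exact ones.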
Quantitative RT-PCR analyses revealed that the expression level of groups IIId, IVa, IVb, IVc and IVd was more pronounced in the winter cultivar (fourfold and more) compared to the remaining groups assayed (threefold and less). This was also demonstrated in barley and wheat for members of the CBFIV groups (Kume et al. 2005; Skinner et al. 2005). It is probably not a coincidence that five of the six groups that evolved in the Pooideae show expression levels that are correlated with the winter cultivar’s capacity to develop LT tolerance. Previous studies had already noted that the majority of the expansion of CBF genes has occurred from an ancestral cluster/locus (Skinner et al. 2005, 2006; Miller et al. 2006). In rice, three CBF genes (OsCBFIIIa-1A, OsCBFIIIb-1H and OsCBFIV-1B.1) are present as a tandem cluster on a region on rice chromosome 9 that is collinear with the chromosome 5 region of the Triticeae where the CBFs occur. Therefore in the Triticeae (T. monococcum, barley and hexaploid wheat), there has been amplification of the genes in this region to give rise to the six groups present specifically in this tribe. These observations suggest that as FT-associated CBF genes were amplified, selective pressure has maintained their role in FT. The genetic capacity of the winter cultivar to induce a higher level of expression during the LT response is also reflected in constitutive levels measured under control growth conditions. Since LT is not involved in this regulation, the higher level in winter versus spring wheat comes from different inheritable capacities to express CBF genes. A similar association was observed between the inheritable level of CBF expression and the LT tolerance of different Arabidopsis lines collected at different latitudes (Hannah et al. 2006). These observations are not surprising since a study had already shown an association between higher levels of the WCS120 protein family and cultivar capacity to develop FT (Houde et al. 1992).
The expression patterns also revealed that groups IIId, IVa, IVb, IVc and IVd displayed a diurnal fluctuation that peaked 8–14 h after dawn. This natural rhythm influenced the time course of LT induction with faster inductions during evenings (maximum at 2 h) and slower induction during mornings (maximum at 4–6 h). Although the peak period of induction does not coincide with the coolest period of the day, it does coincide with the daily decrease of temperature during sunset, suggesting the rhythm anticipates the event. Circadian clock regulation of CBF expression has been described in Arabidopsis (Fowler et al. 2005), and therefore, may be common in plants capable of developing FT since it would confer a selective advantage during sudden drops in temperature. In temperate cereals, group IV proteins (HvCBFIVa-2 and HvCBFIVd-4A) were shown to bind cis elements in a LT-dependent manner. If this property is present in all group IV members of temperate cereals, the daily accumulation of these proteins during normal growth conditions would not cause profound changes in COR gene expression and thus prevent wasting cellular resources since their DNA binding activity is relatively low at this temperature. Once exposed to a sudden drop in temperature, the DNA binding activity of these factors would increase, and this would immediately impact COR gene expression. The accumulation of partially inactive factors during warm growth conditions would also prevent the deleterious symptoms observed in transgenic plants constitutively overexpressing complete CBF genes or portions of them (Liu et al. 1998; Wang et al. 2005; Ito et al. 2006). Therefore, group IV proteins from temperate cereals may ultimately represent a uniquely engineered protein group that functions as a first line of defence against sudden drops in temperature. 
Functional studies of group IV CBFs are thus essential to understand the specific contribution of these groups and their impact on the range of LT tolerance capacities observed in temperate cereals. In addition, these studies may identify unique properties that could be incorporated in CBF proteins from other species.
In conclusion, this study reveals that wheat species, T. aestivum and T. monococcum, may contain up to 25 CBFs. These genes can be divided into at least ten groups that share a common phylogenetic origin and similar structural characteristics. Six of these groups (CBFIIIc, IIId, IVa, IVb, IVc and IVd) are found only in the Pooideae, suggesting that they evolved recently during the colonization of temperate habitats. Expression studies revealed that five groups (CBFIIId, IVa, IVb, IVc and IVd) display higher constitutive and LT inducible expressions in the winter cultivar. The higher inherited and inducible CBF expression suggests that these groups may be major components that regulate the capacities of Pooideae species to develop LT tolerance.
We wish to thank F. Allard, M. Champoux, N. Toudji and K. Tremblay for their technical assistance. The excellent editing work of one of the anonymous reviewers is also appreciated. This study was supported by Genome Canada and Génome Québec grants (F.S. and M.H.), and by Natural Sciences and Engineering Research Council of Canada discovery grant (F.S. and J.D.). M.B. was supported by a scholarship from the Ministry of State for Higher Education and Scientific Research, Egypt.