Background

Wheat is a widely cultivated gramineous plant and one of the three most important cereals in the world [1]. It is an allohexaploid derived from three closely related ancestors through two rounds of natural hybridization. As a result, its large and complex genome (17 Gb) poses a significant challenge for wheat genome research [2, 3]. The completion of a whole genome sequence for wheat based on single chromosome sequencing has laid the foundation for wheat genomics research and wheat gene family identification.

Valine-glutamine (VQ) proteins are a class of plant-specific proteins with five highly conserved amino acids in the core FxxxVQxLTG sequence of the VQ motif [4], in which x represents any amino acid (aa) and VQ is a highly conserved pair of aa residues. Research on the VQ proteins has shown that the last three amino acids in almost all species are LTG, although some species have other variants, including FTG, ITG, LTA, and VTG [5]. In some VQ proteins of Gramineae species such as rice, maize, and Moso bamboo, VQ has mutated to VH in the conserved domain [5, 6]. VQ proteins are generally less than 300 aa in length and contain no or few introns [7]. To date, 34, 40, 61, 18, and 74 VQ proteins have been identified in Arabidopsis, rice, maize, grape, and soybean, respectively [6, 8,9,10,11]. According to bioinformatics predictions and experimental verification, some Arabidopsis VQ proteins are located in the nucleus, some in the plastid, and a few partly in the mitochondria [12].

VQ proteins play important roles in the regulation of plant growth and development and the response to abiotic and biotic stress [5,6,7, 13,14,15,16,17]. For instance, AtCaMBP25 (AtVQ15) negatively regulates osmotic stress response during the early stages of seed germination and growth in Arabidopsis [13]. Likewise, AtVQ9 expression responds strongly to NaCl treatment, and its mutation enhances salt stress tolerance in Arabidopsis [16]. VQ54 and VQ19 in maize, as well as VQ2, VQ16, and VQ20 in rice, are highly expressed under drought induction [6, 11]. Soybean VQ6 and VQ53 are highly expressed in roots and stems under low nitrogen conditions [10]. SIB1 (Sigma factor binding protein 1, also known as AtVQ23) was the first VQ motif protein discovered in Arabidopsis and participates in plant disease resistance signaling pathways [15]. AtVQ21 (MSK1) transgenic plants show enhanced resistance to the pathogen Pseudomonas syringae but reduced resistance to Botrytis cinerea [13, 18]. AtVQ22 negatively regulates JA-mediated disease resistance signaling pathways [19], and rice VQ22 shows high expression levels after rice blast infection [5]. AtVQ14 (IKU1) participates in the regulation of endosperm development, thereby affecting the size of Arabidopsis seeds [20]. AtVQ29 is involved in the photomorphogenesis of Arabidopsis seedlings and flowering time regulation [9]. In addition, the growth of VQ17, VQ18, VQ8, and VQ22 transgenic Arabidopsis plants is inhibited, indicating that these genes play crucial roles in plant growth and development [8].

VQ proteins came to the attention of researchers because of their interactions with WRKY transcription factors, which are involved in regulating the plant’s defense response system [13, 15]. WRKY transcription factors belong to a large gene family and are ubiquitous in plants. Studies have shown that WRKY transcription factors are widely involved in plant growth and development and in resistance to adverse conditions [21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38]. For example, AtVQ14 and AtWRKY10 interact to form a protein complex that affects seed size in Arabidopsis [20]. AtVQ15 and AtWRKY25 interact and participate in high salt and osmotic stress response [13]. The interaction between AtVQ22 and AtWRKY28 negatively regulates JA-mediated disease resistance signaling pathways [19]. AtVQ23 (SIB1) and AtVQ16 (SIB2) interact with WRKY33 to enhance the binding capacity of WRKY33 to the W-box, thereby regulating plant disease resistance [6, 15]. AtVQ21 can form ternary complexes with AtWRKY33 and AtMPK4 to regulate plant growth and disease resistance [4, 39]. In brief, VQ proteins are transcriptional regulatory cofactors that participate in growth and developmental processes and stress resistance through their interactions with transcription factors. However, until now, the VQ gene family has not been characterized in common wheat.

Pre-harvest sprouting (PHS) refers to the germination of wheat seeds within the spike of the mother plant that occurs in rainy or high moisture conditions before harvest. A series of physiological and biochemical reactions take place in wheat grains when PHS occurs. The activity of hydrolases such as amylase and proteolytic enzymes is enhanced, leading to starch and protein degradation and seriously affecting wheat processing quality and utilization value. In the international wheat market, when the germination rate of commercial wheat reaches 5%, it is regarded as feed wheat and its price is reduced, causing serious economic losses to producers [40, 41]. Seed dormancy and germination traits determine wheat PHS resistance: wheat varieties with higher levels of dormancy or lower germination percentages show higher resistance to PHS. Therefore, the identification of candidate genes that control seed dormancy and germination may help to reduce yield and quality losses caused by PHS. Previously, in Arabidopsis, VQ18 and VQ26 were found to be involved in seed germination via the ABA signaling pathway [42]. However, the functions of VQ genes in common wheat are largely unknown.

The objectives of this study were to identify TaVQ genes and to perform bioinformatics analysis, including phylogenetic tree construction and characterization of gene structures, conserved domains, chromosome positions, expression patterns, and promoter elements. In addition, we measured the expression levels of TaVQ genes in wheat varieties with contrasting seed dormancy and germination phenotypes by qRT-PCR to identify TaVQ gene family members that were potentially involved in seed dormancy and germination.

Results

Identification and attribute analysis of VQ candidate genes in wheat

A total of 65 TaVQ genes were identified, mapped to wheat chromosomes, and named TaVQ1–TaVQ65. The length of the encoded proteins ranged from 127 to 723 aa, with an average length of 220 aa. Their predicted molecular weights (MWs) ranged from 13,377.16 Da (TaVQ52) to 61,926.76 Da (TaVQ1). Information on chromosome positions, ORF lengths, and exon numbers is provided in Table 1. The majority of genes included one exon, and only four genes (TaVQ13/-17/-18/-53) contained two exons.

Table 1 Detailed information about the predicted TaVQ genes

Phylogenetic trees of VQ proteins from wheat, maize, poplar, rice, and Arabidopsis

To explore the evolutionary relationships among VQ genes from wheat, Arabidopsis, rice, poplar, and maize, we downloaded published VQ protein sequences from these species (Table S1) [8,9,10] and constructed a phylogenetic tree (Fig. 1). Based on the original division and naming of VQ subfamilies in Arabidopsis and rice, we divided the 251 VQ genes (34 AtVQ genes, 40 OsVQ genes, 51 PtVQ genes, 61 ZmVQ genes, and 65 TaVQ genes) into seven subfamilies (VQI, VQII, VQIII, VQIV, VQV, VQVI, and VQVII). The VQII subfamily contained the largest number of genes (28) (Fig. 1). Members of the VQ family from rice, maize, and wheat, which belong to the Gramineae, were interspersed, whereas those of Arabidopsis, a model plant from the Cruciferae, formed separate clades, probably due to the relatively distant relationship between monocots and dicots.

Fig. 1

Phylogeny of VQ proteins from wheat, rice, poplar, maize, and Arabidopsis. The 65 TaVQ proteins, 40 OsVQ proteins, 51 PtVQ proteins, 61 ZmVQ proteins, and 34 AtVQ proteins are clustered into seven subfamilies. Details of VQ genes from Arabidopsis, maize, poplar, and rice are listed in Table S1. The red circle, black square, yellow triangle, purple diamond, and green inverted triangle represent the VQ gene families of wheat, rice, poplar, maize, and Arabidopsis, respectively. Different colors of the inner ring and outer ring represent different subfamilies.

Structural analysis of the VQ gene family

We constructed a wheat VQ phylogenetic tree and a gene structure diagram (Fig. 2a). The structure of each VQ gene contained one to three parts: the untranslated region (yellow rectangle), the exon region (green rectangle), and the intron region (solid gray line). Among the 48 TaVQ paralogous pairs (Table 2), only two pairs (TaVQ10/-17 and TaVQ53/-58) differed in intron number, having lost or gained one intron (Fig. 2a and Table S2). Further analysis revealed that 94% of the TaVQ genes had no introns, and only four genes (TaVQ13/-17/-18/-53) contained one intron. This result is consistent with previous studies in other species: 78%, 88%, 89%, and 93% of the VQ genes in poplar, Arabidopsis, maize, and rice, respectively, have no introns, whereas only 28% of moss VQ genes have no introns (Fig. 2b). Based on comparisons of many species, including angiosperms (rice, poplar, soybean, Chinese cabbage, etc.) and bryophytes (moss), we speculate that most VQ genes tend to lose introns during long-term evolution [6, 8, 9, 43, 44].

Fig. 2

Gene structure analysis of TaVQ genes. a Phylogenetic relationships and gene structures of TaVQ genes. Exons, introns, and untranslated regions (UTRs) are indicated by green rectangles, gray lines, and yellow rectangles, respectively. Coloured boxes indicate the subfamily based on the phylogenetic analysis. b The numbers of VQ genes and the numbers of VQ genes without introns in different species. Pp: Moss, At: Arabidopsis, Pt: Poplar, Br: Chinese Cabbage, Gm: Soybean, Os: Rice, Zm: Maize, Vv: Grape, Pe: Moso bamboo, Ta: Wheat

Table 2 Paralogous (Ta/Ta) and orthologous (Ta/Os and Ta/Zm) gene pairs

Using published information on characteristics of the VQ domain as a reference, we aligned the protein sequences of wheat and analyzed their VQ domains. The 65 TaVQ proteins all contained conserved VQ domains, but they differed slightly and could be grouped into three types: FxxxVQxLTG (52/65), FxxxVQxFTG (10/65), and FxxxVQxITG (3/65) (Fig. 3a). We further analyzed the VQ domains from multiple species (Fig. 3b) and found that the FxxxVQxLTG sequence was most prevalent and that two additional VQ domain types (LTG/FTG) were also common. There were differences in VQ domain sequence between monocots and dicots. In addition to the more common domain sequences, monocot VQ domains also included ITG, ATG, and LTA, and dicot VQ domains included LTS, LTD, YTG, LTR, and LTV (Table S3).
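The grouping of VQ domains by their final three residues can be illustrated with a short pattern-matching sketch. This is not the alignment-based procedure used in the study; it is a minimal example that assumes the core motif can be captured by a regular expression of the form F-x-x-x-V-Q-x followed by a three-residue tail. The function names and the toy sequences are hypothetical, and the pattern also accepts VH in place of VQ, as reported for some Gramineae proteins.

```python
import re
from collections import Counter

# Core VQ motif: F, three arbitrary residues, V, Q (or H), one arbitrary
# residue, then a three-residue tail such as LTG, FTG, or ITG.
VQ_MOTIF = re.compile(r"F[A-Z]{3}V[QH][A-Z]([A-Z]{3})")


def vq_tail(protein_seq):
    """Return the three-residue tail of the first VQ-like motif, or None."""
    match = VQ_MOTIF.search(protein_seq.upper())
    return match.group(1) if match else None


def count_tails(sequences):
    """Count motif tails across a dict of {name: protein sequence}."""
    return Counter(vq_tail(seq) for seq in sequences.values())


# Hypothetical toy sequences for illustration only (not real TaVQ proteins).
toy = {
    "toyA": "MSSAFRALVQRLTGKPP",
    "toyB": "MDDFKSVVQELTGAA",
    "toyC": "MAAFQDLVQKFTGSS",
    "toyD": "MKKAAAAA",  # no VQ-like motif
}

for tail, n in count_tails(toy).items():
    print(tail, n)
```

Applied to the real protein set, a tally of this kind would be expected to give the three wheat groups reported above (LTG, FTG, and ITG tails), although alignment-based inspection remains the more reliable way to handle atypical motifs.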

Fig. 3

Multiple sequence alignment of TaVQ proteins and domain type analysis in different species. a Multiple sequence alignment of VQ proteins in wheat. b VQ domain type in different species. Pp: Moss, At: Arabidopsis, Pt: Poplar, Br: Chinese Cabbage, Gm: Soybean, Os: Rice, Zm: Maize, Vv: Grape, Pe: Moso bamboo, Ta: Wheat. c Schematic representation of 20 conserved motifs in the TaVQ genes. Different colored boxes represent different motifs. Box lengths are not proportional to actual motif size

A total of 20 conserved motifs were identified in the TaVQ gene family (Table S4). All 65 TaVQ proteins shared one conserved motif (core motif 1, Motif 1) (Fig. 3c). The MEME diagram showed that wheat VQ genes from the same subfamily tended to share the same conserved motifs. Only 7 of the 48 TaVQ paralogous pairs (TaVQ12/-15, TaVQ32/-36, TaVQ32/-41, TaVQ44/-49, TaVQ47/-49, TaVQ55/-60, and TaVQ60/-65) differed in their motifs.

Evolution and divergence of the VQ gene family in wheat, rice, and maize

In total, 48 homologous pairs were identified in wheat, 22 in wheat and rice, and 14 in wheat and maize (Table 2). The Ks (number of synonymous substitutions per synonymous site) values of the wheat paralogous pairs ranged from 0.0163 to 1.5197, indicating that duplication events occurred in this species approximately 1.2538 to 116.8985 million years ago (MYA). The Ks values of orthologous pairs from wheat and rice ranged from 0.5821 to 1.6479, indicating that duplication events occurred approximately 44.78 to 126.7631 MYA. The Ks values of orthologous pairs from wheat and maize ranged from 0.6453 to 1.0257, indicating that the duplication events occurred approximately 49.6415 to 78.8962 MYA (Table 3).
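The conversion from Ks to an approximate date can be checked with a few lines of arithmetic. The sketch below assumes the relation T = Ks/(2λ) and a synonymous substitution rate of λ = 6.5 × 10^-9 substitutions per site per year. The rate is not stated in this section; this value is assumed here because it is commonly used for grasses and reproduces the dates reported above. The function name is hypothetical.

```python
def divergence_time_mya(ks, rate=6.5e-9):
    """Estimate divergence time in million years ago (MYA).

    ks   : synonymous substitutions per synonymous site
    rate : assumed substitutions per site per year (lambda); the default
           is an assumption, not a value taken from the text
    """
    years = ks / (2 * rate)
    return years * 1e-6  # convert years to million years


# Ks extremes reported for the wheat paralogous pairs.
for ks in (0.0163, 1.5197):
    print(f"Ks = {ks}: about {divergence_time_mya(ks):.2f} MYA")
```

Small differences from the published figures in the last decimal places are expected, because the Ks values quoted in the text are rounded.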

Table 3 Ka, Ks and Ka/Ks ratios of paralogous and orthologous pairs

To investigate the role of natural selection in the evolution of the VQ gene family in Gramineae, we analyzed the Ka (number of non-synonymous substitutions per non-synonymous site)/Ks ratios of all homologous pairs and generated sliding window graphs (Figure S1 and Table 3). Among the 48 paralogous pairs, 11 had Ka/Ks ratios less than one, and 37 pairs had Ka/Ks ratios greater than one, indicating that wheat VQ genes were mainly under positive selection during the evolutionary process. The Ka/Ks ratio of all orthologous pairs was greater than one, indicating that the VQ gene family in wheat, rice, and maize had primarily undergone positive selection.

Expression pattern analysis of the TaVQ gene family

Transcriptome data (FPKM values) were obtained for all TaVQ genes with the exception of TaVQ13/-18/-45 (Fig. 4a and Table S5). The expression patterns of VQ genes differed among varieties and within time periods in the same variety. Most TaVQ genes were highly expressed in J411, especially at 4 h after seed imbibition, and only four genes (TaVQ4/-7/-8/-20) were expressed at a low level. It is worth noting that TaVQ8 and TaVQ20 were both highly expressed in HMC21 and expressed at a low level in J411. We further analyzed the expression of 48 paralogous gene pairs. Only one pair showed a similar expression pattern, whereas the rest were differentially expressed among varieties and within different time periods of the same variety.

Fig. 4

Expression profiles of TaVQ genes in different tissues and at different developmental stages. a Heatmap shows hierarchical clustering of the 62 TaVQ genes based on transcriptome results. b Heatmap shows hierarchical clustering of 11 TaVQ genes from different tissues. Abbreviations represent specific developmental stages. GSC, germinating seed coleoptile; GSR, germinating seed root; GSE, germinating seed, embryo; SR, seedling root; SC, seedling crown; SL, seedling leaf; II, immature inflorescence; Fba, floral bracts before anthesis; Pba, pistil before anthesis; Aba, anthers before anthesis; 3–5 DAP C, 3–5 DAP caryopsis; 22 DAP EM, 22 DAP embryo; 22 DAP EN, 22 DAP endosperm

Microarray data were obtained for 11 TaVQ genes to further investigate wheat VQ gene expression (Fig. 4b and Table S6). TaVQ16, TaVQ31, and TaVQ35 were highly expressed in Aba and in 22 DAP EM (embryo at 22 days after pollination), but they showed little expression elsewhere. Further analysis of paralogous pairs showed that three pairs (TaVQ55/-60, TaVQ55/-65, and TaVQ60/-65) had similar expression patterns in different tissues.

Promoter analysis and gene ontology annotation of the TaVQ gene family

Two categories of response element were analyzed in the promoter regions of the TaVQ genes (Fig. 5a and Table S7). The first category comprised phytohormone-responsive elements associated with biotic stress, such as ABRE, CGTCA motif, TGACG motif, TGA element, AuxRR core, TCA element, GARE motif, and P-box. The second category included elements associated with abiotic stress, such as MBS, LTR, and TC-rich repeats. The most common elements in the first category were associated with methyl jasmonate (CGTCA motif and TGACG motif) (42.77%) and ABA (ABRE) (41.85%) (Fig. 5b). The drought-associated MBS element (4.31%) was the most common abiotic stress response element (Fig. 5c).

Fig. 5

Cis-acting element analysis of the promoter regions of TaVQ genes. Based on functional annotation data, cis-acting elements were classified into two major classes: phytohormone responsive elements (i.e. those responsive to ABA, auxin, GA, MeJA, and/or SA) and abiotic stress responsive elements (e.g. those involved in plant defense, drought stress response, and/or low temperature stress response). a Percentage of total cis-acting elements in the promoter region of the TaVQ gene. b and c The percentage of cis-acting elements in different categories

The 65 TaVQ genes were annotated with 15 GO terms (Fig. 6 and Table S8): three, three, and nine terms in the molecular function, cellular component, and biological process categories, respectively. Among these terms, GO:0005634 (cellular component), GO:0003674 (molecular function), and GO:0008150 (biological process) were most common and were assigned to 28, 21, and 13 genes, respectively.

Fig. 6

Gene Ontology (GO) annotations of TaVQ proteins. Red represents molecular_function, blue represents biological_process, black represents cellular_component

Chromosome locations and subcellular localization predictions for the TaVQ gene family

The TaVQ genes were unevenly distributed on wheat chromosomes 1–7, and no TaVQ genes were present on chromosomes 1B and 1D (Figure S2). One gene was located on chromosome 1A, five each were located on chromosomes 5B, 7A, 7B, and 7D, and two to four were located on each of the other chromosomes. We defined a single gene cluster as a chromosomal region of less than 200 kb that contained two or more TaVQ genes [45]. Two gene clusters containing six genes were identified on chromosomes 4A and 4B (Figure S2).
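The cluster definition lends itself to a simple scan along each chromosome. The sketch below is one possible reading of that definition, not the procedure used in the study: it assumes each gene is represented by a single start coordinate and grows a cluster for as long as the next gene starts less than 200 kb from the first gene in the cluster. The function name, gene names, and coordinates are hypothetical.

```python
from collections import defaultdict


def find_clusters(genes, max_span=200_000):
    """Group genes into clusters spanning less than max_span bp.

    genes: iterable of (name, chromosome, start_bp) tuples.
    Returns a list of clusters, each a list of gene names in genomic order.
    """
    by_chrom = defaultdict(list)
    for name, chrom, start in genes:
        by_chrom[chrom].append((start, name))

    clusters = []
    for chrom in sorted(by_chrom):
        current = []  # genes in the window currently being extended
        for start, name in sorted(by_chrom[chrom]):
            # Close the window once the next gene lies too far from its start.
            if current and start - current[0][0] >= max_span:
                if len(current) >= 2:
                    clusters.append([n for _, n in current])
                current = []
            current.append((start, name))
        if len(current) >= 2:
            clusters.append([n for _, n in current])
    return clusters


# Hypothetical coordinates for illustration only.
toy_genes = [
    ("g1", "4A", 1_000_000),
    ("g2", "4A", 1_050_000),
    ("g3", "4A", 1_150_000),
    ("g4", "4A", 5_000_000),
    ("g5", "4B", 2_000_000),
    ("g6", "4B", 2_300_000),
]
print(find_clusters(toy_genes))
```

In this toy input only g1, g2, and g3 fall within one 200-kb window; g5 and g6 are 300 kb apart and are therefore not reported as a cluster.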

Subcellular localization prediction indicated that the TaVQ proteins were present in three locations. Most were predicted to be located in the periplasmic region (47, 72.3%), some in the extracellular region (15, 23.1%), and the rest in the cytoplasm (3, 4.6%) (Table S9).

Responses of TaVQ genes to water imbibition

We investigated the responses of 65 TaVQ genes (Table 1 and Table S10) in six wheat varieties with different seed dormancy and germination phenotypes after water imbibition for 0, 6, and 10 h. Seeds from three highly dormant varieties (HMC21, YXM, and SNTT) showed no germination, whereas some seeds from three low-dormancy varieties (J411, ZY9507, and ZM895) germinated after 10 h of imbibition, with average germination index (GI) values of 0.33, 0.31, and 0.41, respectively (Table S11). We found that the TaVQ genes were differentially expressed in the six wheat varieties. The expression levels of 13 genes (TaVQ8/-9/-13/-17/-25/-32/-34/-43/-48/-49/-53/-59/-62) were higher in the low-dormancy varieties than in the high-dormancy varieties. Eight genes showed the opposite expression trend (TaVQ4/-16/-20/-35/-38/-42/-51/-56) (Fig. 7).

Fig. 7

Relative expression (mean ± SE) of 65 TaVQ genes during seed imbibition in six wheat varieties

Discussion

The plant-specific VQ proteins initially attracted attention due to their interactions with WRKY transcription factors [15]. Additional in-depth studies showed that the VQ gene family not only participated in responses to biotic and abiotic stress, but was also involved in the regulation of plant growth and development [5,6,7, 13,14,15,16,17]. Genome-wide surveys of VQ proteins have now been performed in a number of species, although functional research has remained focused on Arabidopsis. VQ genes have not previously been characterized in wheat, and we therefore performed basic bioinformatics analyses to better understand the VQ gene family in wheat.

We identified 65 VQ genes from wheat and classified them into seven subfamilies. The VQ genes of five species (wheat, rice, maize, poplar, and Arabidopsis) were distributed in each subfamily, but the number of subfamily members differed among species, indicating that the VQ genes have developed in multiple directions over the course of evolution. The VQ genes of monocots (rice, maize, and wheat) were interspersed and clustered together, whereas the VQ genes of Arabidopsis and poplar were clustered into separate clades, indicating that proteins encoded by wheat VQ genes were highly similar to those of rice and maize [8, 9]. These results highlight the evolutionary conservation of the VQ gene family.

Phylogenetic trees represent the genetic relationships among gene families from different species and reflect the similarity of protein-coding genes. From structural analysis of the VQ gene family, we found that most VQ genes were intron-free [6, 8,9,10,11, 43, 44]. Based on comparisons of several species, we speculate that this gene family tends to lose introns during evolution. Amino acid sequence alignment and motif analysis indicated that the sequences of most VQ domains from different species were similar, although a small number of variants existed. In general, members of the same subfamily had similar types and numbers of conserved motifs, but there were also cases in which members of the same family had different types and numbers of conserved motifs. In addition, VQ had mutated to VH in the VQ domain of several Gramineae species [5, 6]. Taken together, these results indicate that the VQ gene family is highly conserved and diverse, reflecting the functional diversity of the gene family members.

With the development of next generation sequencing technology, the genomes of Arabidopsis, rice, maize, and wheat have recently been sequenced [2, 3, 8,9,10]. Their genome sizes are 164 Mb, 389 Mb, 2500 Mb, and 17 Gb, respectively. Based on genome size and chromosome number, the number of VQ genes among the four species is expected to be the highest in wheat, followed by maize, rice, and Arabidopsis. The numbers of maize, rice, and Arabidopsis VQ genes are 61, 40, and 34, and the number of wheat VQ genes in this study was 65, consistent with predictions based on genome size. By calculating the Ks value of homologous pairs to estimate the time of duplication events, the time range for whole genome duplication events in wheat was approximately 1.2538 to 116.8985 MYA. The Ka/Ks ratio of most paralogous pairs (37, 77%) was greater than one, indicating that the TaVQ gene family had undergone positive selection. A sliding window graph demonstrated that the Ka/Ks ratio of homologous pairs differed among different coding segments: some had Ka/Ks ratios greater than one, and some had Ka/Ks ratios less than one, indicating that the homologous pairs had undergone different evolutionary selection pressures. These results show that natural selection has played an important role in the evolution and differentiation of the VQ gene family.

TaVQ55/-60/-65 were highly expressed in GSC, GSR, GSE, SR, SL, Fba, and Pba; TaVQ2/-5/-8 were highly expressed in SL, Aba, 3–5 DAP C (caryopsis at 3–5 days after pollination), and 22 DAP EM; and TaVQ16/-31/-35 were highly expressed in SL, Aba, and 22 DAP EM. These results indicate that the VQ gene family is active during multiple plant growth and developmental stages. Previous studies on Arabidopsis have shown that IKU1 (AtVQ14, At2g35230) regulates endosperm development and seed size [20]. In this study, TaVQ48, which belongs to the same subfamily as AtVQ14/-29, was also expressed in floral bracts before anthesis, in 22 DAP EN, and in 22 DAP EM. In addition, TaVQ48 was strongly expressed in germinating seeds, roots, seedling roots, and seedling leaves. These results will guide further exploration of the functions of the TaVQ gene family.

GO annotations of 62 TaVQ genes were extracted from transcriptome data. The most common GO terms were from the biological process category (43, 38.1%), especially GO:0006952 (13 genes) and GO:0008150 (15 genes). GO:0006952 is related to defense response, and combined with promoter analysis, we found that 5 TaVQ genes (TaVQ1/-2/-32/-41/-51) had this function in both analyses. GO:0010337 is related to the regulation of salicylic acid (SA) metabolism, but only TaVQ14 was assigned this annotation. TaVQ14 also had an SA cis-acting element in the promoter analysis. These results suggest that promoter structure is linked to gene function and that the diversity of structure reflects the diversity of function.

In the present study, we measured the expression of the 65 TaVQ genes during seed imbibition of six wheat varieties (HMC21, YXM, SNTT, J411, ZY9507, and ZM895). The expression of thirteen TaVQ genes (TaVQ8/-9/-13/-17/-25/-32/-34/-43/-48/-49/-53/-59/-62) was consistently higher in low-dormancy varieties than in high-dormancy varieties. By contrast, the expression levels of 8 TaVQ genes (TaVQ4/-16/-20/-35/-38/-42/-51/-56) were consistently higher in high-dormancy varieties. These 21 TaVQ genes may therefore participate in the regulation of seed dormancy and germination. According to phylogenetic analysis, three of these 21 genes (TaVQ8, TaVQ13, and TaVQ59) are members of the VQV subfamily. Interestingly, Arabidopsis AtVQ18 and AtVQ26 involved in seed germination also belong to the VQV subfamily. These results suggest that TaVQ8/-13/-59 may have similar functions in the regulation of seed dormancy and germination, a hypothesis that requires future validation.

Conclusions

We investigated the phylogeny and diversification of VQ genes in wheat by multiple methods, including phylogenetic tree construction and characterization of gene structures, conserved domains, chromosome positions, expression patterns, and promoter elements. In addition, we measured the expression levels of TaVQ genes in wheat varieties with contrasting seed dormancy and germination phenotypes by qRT-PCR to identify genes that were potentially involved in seed dormancy and germination. Sixty-five TaVQ proteins were identified for the first time in common wheat, and qRT-PCR data showed that 21 were potentially involved in seed dormancy and germination. These findings provide valuable information for further cloning and functional analysis of TaVQ genes, as well as useful candidate genes for improvement of PHS resistance in wheat.

Methods

Plant materials

We measured TaVQ gene expression in six wheat varieties with extreme dormancy levels [46]: J411 (Jing 411, average germination index [GI] = 0.89, average germination rate [GR] = 98.00%), HMC21 (Hongmangchun 21, average GI = 0.04, average GR = 10.00%), SNTT (Suiningtuotuo, average GI = 0.06, average GR = 16.00%), ZM895 (Zhongmai 895, average GI = 0.81, average GR = 96.00%), ZY9507 (Zhongyou 9507, average GI = 0.90, average GR = 98.00%), and YXM (Yangxiaomai, average GI = 0.03, average GR = 9.00%) (Tables S11 and S12). J411 and HMC21 were provided by Shihe Xiao from the Chinese Academy of Agricultural Sciences, and ZM895, ZY9507, YXM, and SNTT were provided by Xianchun Xia from the Chinese Academy of Agricultural Sciences.

Germination index and germination rate assays

Freshly harvested seeds were used to measure the GI as described in our previous study [46]. Fifty seeds from each genotype were placed in Φ 90 Petri dishes on filter paper with 9 ml distilled water, then grown in a 20 °C greenhouse with a 14 h day/10 h night photoperiod at 80% humidity. The number of germinated seeds in each culture dish was counted at the same time every day, and germinated seeds were removed. The GI value was calculated after 3 days as GI = (3 × n1 + 2 × n2 + 1 × n3)/(3 × N). The GR was also calculated after 3 days of seed imbibition as GR = [(n1 + n2 + n3)/N] × 100%. In these equations, n1, n2, and n3 are the numbers of seeds germinated on the first, second, and third days, and N is the total number of seeds. Each genotype was replicated three times, and germination was defined as visible rupture of the pericarp and testa.
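The two germination measures can be written as small functions. This is a minimal sketch that assumes the conventional 3-day weighting, in which seeds germinating on days 1, 2, and 3 receive weights of 3, 2, and 1; the function names and the example counts are hypothetical.

```python
def germination_index(n1, n2, n3, total):
    """Weighted germination index over three days of imbibition.

    n1, n2, n3 : seeds germinated on day 1, day 2, and day 3
    total      : total number of seeds in the dish (N)
    """
    return (3 * n1 + 2 * n2 + 1 * n3) / (3 * total)


def germination_rate(n1, n2, n3, total):
    """Percentage of seeds germinated within three days."""
    return (n1 + n2 + n3) * 100 / total


# Hypothetical dish: 50 seeds, 40 germinate on day 1, 6 on day 2, 3 on day 3.
gi = germination_index(40, 6, 3, 50)
gr = germination_rate(40, 6, 3, 50)
print(f"GI = {gi:.2f}, GR = {gr:.2f}%")
```

Because early germination is weighted more heavily, two genotypes with the same GR can differ in GI: a dish in which all seeds germinate on day 3 has a GR of 100% but a GI of only about 0.33.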

Identification of wheat VQ genes

To determine the number of VQ genes in common wheat, we used sequences obtained from the Ensembl database to build a local wheat database [46]. The VQ domain hidden Markov model (PF05678) was used to identify candidate genes by BLAST in the established local wheat database. To ensure the accuracy of the results, all candidate genes were inspected, repetitive sequences were removed, and Pfam, SMART, and NCBI online tools were used to verify the existence of the conserved VQ domain in all candidate genes [46, 47]. The ExPASy online tool was used to predict the isoelectric point (pI), protein molecular weight (MW), open reading frame (ORF), and other attributes of the VQ proteins.

Phylogenetic tree and multiple sequence alignment

FASTA sequence files were opened in ClustalX2.11 software [48,49,50,51] and used to generate a multiple sequence alignment from which a phylogenetic tree was constructed using the neighbor-joining method with 1000 bootstrap replicates in MEGA7.0 [43, 52,53,54]. The same method was used to build a composite phylogenetic tree of VQ protein sequences from maize, rice, poplar, Arabidopsis, and wheat.

Intron/exon structure and conserved motif analysis

The distribution and structure of exons and introns were determined by uploading CDS and genomic sequences to the Gene Structure Display Server (http://gsds.cbi.pku.edu.cn/) for plotting and analysis [7, 54, 55].

To predict structural differences among the TaVQ proteins, all candidate protein sequences were uploaded to the MEME online tool (http://memesuite.org/tools/meme) for conserved motif analysis using standard operating parameters [54, 56].

Identification of homologous pairs and calculation of Ka/Ks values

Using previously reported methods for the identification of homologous gene pairs (paralogs and orthologs), the nucleotide sequences of VQ genes from wheat and other species were compared using BLASTN [57, 58].

Wheat homologous gene pairs were compared and aligned in ClustalX 2.11, and the aligned sequences were analyzed in MEGA7.0 [59]. The results were uploaded to DnaSP v5.10.1 [60] to calculate the values of Ka (number of non-synonymous substitutions per non-synonymous site) and Ks (number of synonymous substitutions per synonymous site) for all homologous pairs. The formula T = Ks/(2λ) × 10^-6 was used to estimate the approximate dates of divergence events in million years, where λ is the synonymous substitution rate per site per year. To further analyze Ka/Ks values, we used GraphPad Prism 5 software to generate a sliding window graph [7, 61]. A Ka/Ks ratio less than 1 indicates purifying selection, in which non-synonymous mutations tend to be harmful and are removed; a Ka/Ks ratio greater than 1 indicates positive selection, in which non-synonymous mutations tend to be beneficial and are retained; and a Ka/Ks ratio of 1 indicates neutral evolution [62].
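The interpretation rule for Ka/Ks can be expressed as a short classifier. The sketch below is illustrative only: the function names, the numerical tolerance used to treat a ratio as equal to 1, and the example Ka and Ks values are assumptions and are not taken from Table 3.

```python
from collections import Counter


def selection_type(ka, ks, tol=1e-9):
    """Classify a homologous pair by its Ka/Ks ratio.

    Returns "purifying" (ratio < 1), "positive" (ratio > 1),
    or "neutral" (ratio equal to 1 within the tolerance).
    """
    if ks <= 0:
        raise ValueError("Ks must be positive to compute Ka/Ks")
    ratio = ka / ks
    if abs(ratio - 1.0) <= tol:
        return "neutral"
    return "positive" if ratio > 1.0 else "purifying"


def summarize(pairs):
    """Count selection types over a dict of {pair_name: (ka, ks)}."""
    return Counter(selection_type(ka, ks) for ka, ks in pairs.values())


# Hypothetical Ka and Ks values for illustration only.
toy_pairs = {
    "pairA": (0.20, 0.40),
    "pairB": (0.60, 0.30),
    "pairC": (0.50, 0.50),
}
print(dict(summarize(toy_pairs)))
```

A sliding-window analysis applies the same rule to successive segments of the coding sequence rather than to the whole gene, which is why a single pair can show ratios above and below 1 in different regions.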

Chromosome location and gene ontology annotation

The chromosome locations of TaVQ genes were downloaded from the Ensembl database, and chromosome maps were built using MapGene2Chromosome v2.0 [63]. Gene ontology (GO) annotations in the biological process, cellular component, and molecular function categories were assigned based on our transcriptome data (http://amigo.geneontology.org/amigo) [54].

Promoter analysis and subcellular location prediction

The 1500-bp sequence upstream of the transcription start site of each VQ gene was downloaded from the Ensembl website, and cis-acting elements in the promoter region were identified using the PlantCARE online tool [64]. WoLF PSORT was used to predict the subcellular localization of the TaVQ proteins [65].

Tissue expression pattern analysis

We collected three replicate seed tissue samples from HMC21 and J411 at 4, 6, and 10 h after seed imbibition for transcriptome sequencing. In addition, we obtained microarray data for 13 different tissues (three biological replicates each) from the Gene Expression Omnibus database (accession number GSE12508) [66, 67]. Mapper Plus was used to generate an expression heat map [45, 68].

RNA extraction and RT-qPCR analysis

Total RNA was extracted from seeds using the TaKaRa MiniBEST Universal RNA Extraction Kit. Primer Premier 5.0 was used to design 65 TaVQ gene-specific primers (Table S10), and TaActin was used as the reference gene [69]. The total PCR volume was 10 μl. The reaction process was 94 °C denaturation for 30 s, followed by 40–45 cycles of 94 °C for 5 s, 50–60 °C for 15 s, and 72 °C for 10 s. We performed three biological replicates for each sample. Finally, we processed the data and created the corresponding figure in GraphPad version 5 [70].
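The text does not state how relative expression was computed from the qPCR cycle threshold (Ct) values. One widely used option is the 2^-ΔΔCt method, sketched below under that assumption, with the reference gene playing the role of TaActin and a calibrator sample chosen by the analyst. The function names and Ct values are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def relative_expression(ct_target, ct_ref, ct_target_cal, ct_ref_cal):
    """Relative expression by the 2^-ddCt method (an assumed method).

    ct_target, ct_ref         : Ct of target and reference gene in the sample
    ct_target_cal, ct_ref_cal : the same Ct values in the calibrator sample
    """
    delta_ct_sample = ct_target - ct_ref
    delta_ct_calibrator = ct_target_cal - ct_ref_cal
    return 2 ** -(delta_ct_sample - delta_ct_calibrator)


def mean_se(values):
    """Mean and standard error of the mean across biological replicates."""
    return mean(values), stdev(values) / sqrt(len(values))


# Hypothetical Ct values: three biological replicates of one sample,
# each given as (target Ct, reference Ct), and one calibrator.
calibrator = (26.0, 18.0)
replicates = [(24.0, 18.0), (24.5, 18.2), (23.8, 17.9)]
fold_changes = [relative_expression(t, r, *calibrator) for t, r in replicates]
m, se = mean_se(fold_changes)
print(f"relative expression = {m:.2f} ± {se:.2f}")
```

The method assumes near-equal amplification efficiency for the target and reference primers; where efficiencies differ noticeably, an efficiency-corrected model would be more appropriate.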