Introduction

Rice (Oryza sativa L.) is one of the most important cereal crops. More than half of the global population use rice as the main food. Development of superior rice varieties is essential to ensure food security. Genetic variation is the basic resource for crop improvement, which has been greatly reduced during domestication and artificial selection (Huang et al. 2012; Wang et al. 2014). Low level of genetic diversity has become a major bottleneck for rice improvement. To enhance genetic diversity of modern rice varieties, considerable efforts have been made in two ways, i.e., introducing allelic variations from wild rice and creating novel alleles by artificial mutagenesis. Introgression of favorable wild alleles into rice varieties has been successful for insect and disease resistance (Hajjar and Hodgkin 2007; Mammadov et al. 2018), but not for grain yield. Moreover, wild resources have become less and less available because the diminishing of their natural habitats (Akimoto et al. 1999; Song et al. 2005; Xie et al. 2010). Broadening genetic variations by means of artificial mutagenesis has been widely used. According to the International Atomic Energy Agency, 833 mutant rice varieties have been officially registered (https://mvd.iaea.org). Nevertheless, mutations induced by conventional methods are random, requiring great efforts to identify favorable mutants. Over the past decade, a new genome editing technology called CRISPR/Cas9 has emerged, providing a simple and more accurate tool for creating targeted mutations (Mussolino and Cathomen 2013; Chen et al. 2019; Manghwar et al. 2019). This system has been applied to create new alleles for rice improvement (Shen et al. 2017; Huang et al. 2020; Li et al. 2020b; Zeng et al. 2020).

Available target gene is a prerequisite for applying CRISPR/Cas9. Tremendous progress has been made in rice gene cloning, but only a small proportion of casual genes for quantitative trait loci (QTLs) have been identified. Among key traits determining grain yield in rice, grain weight and size is the trait having the largest number of QTLs identified in primary mapping. More than 500 QTLs were documented in Gramene (https://www.gramene.org), distributing throughout the 12 chromosomes of rice. To date, 20 causal genes of QTLs for grain weight and size have been validated (Li et al. 20192020a; Ma et al. 2019; Wang et al. 2019a; Dong et al. 2020; Ruan et al. 2020; Shi et al. 2020), which is also higher than other yield traits. These genes were sparsely distributed on eight of the 12 rice chromosomes. Lack of validated genes underlying QTLs for yield traits has severely hindered the application of CRISPR/Cas9.

Complex traits are generally controlled by a small number of genes having large effects and a large number of genes having small effects. In a population segregating a minor-effect QTL, the respective parental alleles may each possess partial functions, thereby limiting the magnitude of observable phenotypic contrast. This makes it difficult to validate their casual genes by complementation test. Producing target gene knockout mutants using CRISPR/Cas9 and validating the gene effect by mutational analysis might be a promising approach (Zhang et al. 2016). In our previous studies, three minor QTLs for grain weight were resolved in a 4.5 Mb region on the long arm of chromosome 1, using near isogenic lines (NILs) derived from a cross between indica rice cultivars Zhenshan 97 (ZS97) and Milyang 46 (MY46) (Wang et al. 2015). One of them, qTGW1.2b, was delimited into a 418.8 kb region. In the present study, this QTL was fine-mapped into a 44.0 kb region and its casual gene was validated using CRISPR/Cas9. Our study provides and effective strategy for cloning minor QTLs in rice.

Materials and methods

Development of NIL populations

Seven NIL populations were used for QTL analysis in this study, including four populations in BC2F11:12 and three populations in BC2F14:15 (Table S1). They were derived from a BC2F9 plant of the rice cross ZS973/MY46 as described below and illustrated in Fig. 1.

Fig. 1
figure 1

Development of the near isogenic lines used in this study

The BC2F9 plant was selfed to develop a BC2F10 population. Four BC2F10 plants with sequential heterozygous segments extending from Wn33252 to RM11787, were selected and selfed. In the resultant four BC2F11 populations, plants carrying homozygous alleles of a single parental type throughout the segregating region, either ZS97 or MY46 type, were identified. Selfing seeds of these plants resulted in the development of four NIL populations in BC2F11:12. The numbers of ZS97 and MY46 homozygous lines were 16 and 20 in L1, 19 and 20 in L2, 20 and 18 in L3, and 42 and 42 in L4, respectively. These populations were used for QTL mapping, and the region segregating qTGW1.2b was narrowed down to Wn34286–RM11787.

Then, three plants were selected from the BC2F11 population that was used to develop L4. They carried sequential heterozygous segments extending from Wn34286 to RM11787, These plants were selfed for three generations. In the resultant three BC2F14 populations, ZS97 homozygotes and MY46 homozygotes were identified and selfed. Three NIL populations in BC2F14:15 were developed. The numbers of ZS97 and MY46 homozygous lines were 36 and 38 in W1, 40 and 39 in W2, and 40 and 40 in W3, respectively.

DNA marker analysis for population development and QTL mapping

Total DNA was extracted using 2 cm-long leaves following the method of Zheng et al. (1995). PCR procedure was conducted according to Chen et al. (1997) and the products were separated on 6% non-denaturing polyacrylamide gels and visualized using silver staining. A total of 17 polymorphic DNA markers were used for QTL mapping (Table S2). Six simple sequence repeat (SSR) markers were selected from the Gramene database (https://www.gramene.org), and 11 InDel markers were designed according to the sequences differences between ZS97 and MY46 as detected by whole genome re-sequencing and targeted next-generation sequencing.

Sequence analysis of candidate genes

Sequence analysis was performed for two annotated genes located in the qTGW1.2b region, including OsCDPK2 (LOC_Os01g59360) and OsVQ4 (LOC_Os01g59410). DNA was extracted using DNeasy Plant Mini Kit (QIAGEN, Hilden, German) according to the manufacturer’s instruction. Primers 360s1 and 360s2 for OsCDPK2 and 410s for OsVQ4 (Table S2) were designed according to Nipponbare and ZS97 genomic sequences (https://rice.plantbiology.msu.edu and https://rice.hzau.edu.cn/rice). Products amplified from the genomic DNA of ZS97 and MY46 were sequenced by the Sanger method. Nucleotide sequence and the predicted amino acid sequence between ZS97 and MY46 were compared.

Generation of knockout mutants using CRISPR/Cas9 system

The CRISPR/Cas9 system was used to generate knockout mutants for one of the annotated genes, OsVQ4. Two targets, located at + 30 to + 49 and + 34 to + 53 in the coding region, respectively, were selected using the web-based tool CRISPR-GE (https://skl.scau.edu.cn). The oligonucleotides 410cri-1 and 410cri-2 (Table S2) were designed and independently ligated into BGK03 vector (BIOGLE Co., Ltd, Hangzhou, China) according to the manufacturer’s instruction. The original BGK03 vector contains a rice U6 promoter for activating the target sequence, a Cas9 gene driven by the maize ubiquitin promoter, and a hygromycin marker gene driven by the Cauliflower mosaic virus 35S promoter.

According to the rice information gateway database (https://rice.hzau.edu.cn/rice), coding sequences of the six annotated genes in the qTGW1.2b region are identical between Nipponbare and ZS97, so is the promoter sequence of OsVQ4. Thus, Nipponbare was used as the recipient for transformation. The two constructs were separately introduced into Nipponbare using Agrobacterium tumefaciens-mediated transformation, which was performed by BioRun Co., Ltd (Wuhan, China). Genomic DNA of T0 plants was extracted using the DNeasy Plant Mini Kit. They were assayed with Hyg marker for hygromycin gene (Table S2). The OsVQ4 gene fragment was amplified from each Hyg-positive plant using the sequencing primer 410s. The product was directly sequenced by the Sanger method and decoded using the web-based tool DSDecodeM (https://skl.scau.edu.cn/dsdecode). For cri-1, 1 bp insertion was found at 3 bp upstream from its PAM sequence. Of the 11 independent T0 plants tested, one showed no mutation, four were homozygous mutants, two were heterozygous mutants, and the other four were not decoded. For cri-2, 1 bp insertion was found 3 bp upstream from its PAM sequence. Of the four independent T0 plants tested, two were homozygous mutants and the other two were not decoded. Five of the T0 plants were selected and selfed, including the wild-type (WT) plant, two homozygous mutants for cri-1, and two homozygous mutants for cri-2. In each resultant T1 family, 16 plants were assayed with the 410m marker for detecting the 1-bp insertion (Table S2). Genotypes of the five families were confirmed.

Field experiments and phenotyping

The rice materials were tested at the China National Rice Research Institute in Hangzhou, Zhejiang province, China. The four BC2F11:12 populations were tested in 2014, the three BC2F14:15 populations were tested in 2016, and one of the BC2F14:15 populations (W3) was further tested in 2018 and 2019. The five transgenic lines and the recipient Nipponbare were tested in 2019. A planting density of 16.7 cm × 26.7 cm was used. Field management was performed following the normal agricultural practice.

For the NIL populations, a randomized complete block design with two replications was applied. In each replication, one line was grown in a single row of eight plants. Seven traits were tested, including 1000-grain weight (TGW, g), grain length (GL, mm), grain width (GW, mm), number of spikelets per panicle (NSP), number of grains per panicle (NGP), spikelet fertility (SF), and heading date (HD, d). Of them, TGW, GL and GW were measured for all populations in all experiments, and the other four were only scored for the four populations tested in 2014. HD was recorded for individual plants and averaged for each replication. At maturity, five of middle six plants in each row were bulk-harvested and measured for the six yield traits. Of these, TGW, GL and GW were evaluated using fully filled grains following the procedure reported by Zhang et al. (2016). In short, the grains were soaked in 3.5 mol/L NaCl solution and floating grains were removed. Approximately 18 g fully filled grains were obtained and dried. They were divided into two halves and measured for TGW, GL and GW using an automatic seed counting and analyzing instrument (Model SC-G, Wanshen Ltd, Hangzhou, China).

For the five transgenic lines, each line was grown in 10 rows of 12 plants, and HD was recorded for the middle 100 plants. For Nipponbare, 20 rows of 12 plants were grown, and HD was recorded for the middle 200 plants. At maturity, 20 plants of each transgenic line and 40 plants of Nipponbare were individually harvested. They were measured for eight traits, including grain yield per plant (GY, g), number of panicles per plant (NP), NSP, NGP, SF, TGW, GL and GW. For TGW, GL and GW, approximately 6 g fully filled grains were divided into two halves and tested.

Data analysis

For each of the NIL populations, phenotypic differences between the two genotypic groups were tested using two-way analysis of variance (ANOVA). The analysis was performed using the SAS procedure GLM (SAS Institute 1999) as described previously (Dai et al. 2008). Given the detection of a significant difference (P < 0.05), the same data were used to estimate the genetic effects of the QTL, including additive effect and the proportion of phenotypic variance explained (R2). For the transgenic materials and Nipponbare, Duncan’s Multiple Range Test was used to examine the phenotypic differences among them.

Results

Validation and delimitation of qTGW1.2b into a 108.6 kb region

Four NIL populations in BC2F11:12 were used in the first experiment that was conducted in 2014. Distributions of TGW, GL and GW in these populations are shown in Fig. S1a. Differences between ZS97 and MY46 homozygous genotypes were observed for TGW and GL in three populations (L2, L3, L4), especially in L2. Difference between the two genotypic groups was also observed for GL in L1. In all cases, the MY46 homozygous lines were clustered toward the area of the higher values, and ZS97 homozygous lines toward the lower-value area.

Results of two-way ANOVA for phenotypic differences between the two genotypic groups in each population are presented in Table 1. Highly significant (P < 0.0001) genotypic effects on TGW and GL were detected in L2, L3 and L4, and the enhancing alleles were all derived from MY46. In L2, the additive effects were 0.30 g for TGW and 0.073 mm for GL, having R2 values of 29.76 and 55.55%, respectively. In L3, the additive effects were 0.27 g for TGW and 0.048 mm for GL, having R2 values of 30.00 and 41.29%, respectively. In L4, the additive effects were 0.16 g for TGW and 0.025 mm for GL, having R2 values of 11.15 and 13.12%, respectively. These results indicate that a QTL for TGW and GL, i.e., qTGW1.2b reported by Wang et al. (2015), was located in the common segregating region of L2, L3 and L4. As shown in Fig. 2a, this region is the interval Wn34232–RM11800, having a physical distance of 341.9 kb in the Nipponbare genome. This QTL also had small effects on NSP and NGP in L4, with the enhancing allele derived from ZS97.

Table 1 QTL effects detected in the four NIL populations in BC2F11:12
Fig. 2
figure 2

Genotypic composition of the near isogenic lines (NILs) in the target region. a Composition of four NIL populations in BC2F11:12. b Composition of the NIL population L4 updated with more polymorphic markers. c Composition of three NIL populations in BC2F14:15

For the remaining population tested in 2014, L1, the segregating region did not include the qTGW1.2b region RM11781–RM11800 reported by Wang et al. (2015) (Fig. 2a). This population was initially used as a negative control. No significant genotypic effect was detected except for GL. The additive effect was 0.026 mm and R2 was 23.51%. These results suggest that a QTL responsible for GL but not for TGW was linked to qTGW1.2b.

In the updated region of qTGW12.b, Wn34232–RM11800, the two cross-over regions occupied the major portion (Fig. 2a). Therefore, we developed InDel markers in the cross-over regions according to the sequence differences between ZS97 and MY46. Six polymorphic markers were selected to assay the individual lines of L4. One marker (Wn34286) was heterozygous and other five markers (Wn34259, Wn34367, Wn34384, Wn34458 and Wn34526) were homozygous (Fig. 2b). Consequently, qTGW1.2b was narrowed down to a 108.6 kb region flanked by Wn34259 and Wn34367.

Fine-mapping of qTGW1.2b

Fine-mapping was continued using three NIL populations segregating within the updated qTGW1.2b region, including W1, W2 and W3 (Fig. 2c). Differences between ZS97 and MY46 homozygous genotypes were observed for TGW and GL in all the three populations (Fig. S1b). The MY46 homozygous lines were clustered toward the area of higher values, while ZS97 homozygous lines toward the lower-value area. In all the three populations, significant effects were detected on TGW and GL, but not on GW (Table 2). The additive effects ranged from 0.13 to 0.20 g for TGW and from 0.021 to 0.038 mm for GL, with R2 ranging as 5.05−9.26% and 14.14−29.68%, respectively. The enhancing alleles were all derived from MY46. Obviously, qTGW1.2b was located within the common segregating region of the three populations. As shown in Fig. 2c, this is an interval flanked by Wn34323 and Wn34367, corresponding to a 44.0 kb region in the Nipponbare genome.

Table 2 QTL effects detected in the three NIL populations in BC2F14:15

The population carrying the smallest segregating region, W3, was grown in two more years. Consistent with the previous observations, the QTL effects were significant on TGW and GL, but not on GW (Table 2). In the two years, the additive effects for TGW were 0.19 and 0.13 g, with R2 of 8.90 and 12.18%; and the additive effects for GL were 0.030 and 0.020 mm, with R2 of and 19.36 and 20.20%. Again, the enhancing alleles were all derived from MY46. These results indicate that qTGW1.2b has a stable effect on grain weight and size.

Candidate gene analysis of qTGW1.2b

Based on the MSU Rice Genome Annotation Project Database and Resource (https://rice.plantbiology.msu.edu) and the rice genomic variation and functional annotation database of Huazhong Agricultural University (https://ricevarmap.ncpgr.cn/v2/), six annotated genes are predicted in the 44.0 kb region for qTGW1.2b (Table S3). Four of them encode expressed proteins with unknown function, including LOC_Os01g59370, LOC_Os01g59390, LOC_Os01g59400 and LOC_Os01g59420. In addition, all of the four genes had no homolog in other plant species (https://blast.ncbi.nlm.nih.gov/blast.cgi). Of the remaining two annotated genes, LOC_Os01g59360 encodes calcium dependent protein kinase 2 (OsCDPK2), and LOC_Os01g59410 encodes VQ motif-containing protein 4 (OsVQ4). The calcium dependent protein kinases and VQ proteins are involved in various biological processes in plants, including growth, development, and abiotic and biotic stress responses (Wang et al. 2010; Li et al. 2014; Jing and Lin 2015; Shi et al. 2018; Zhong et al. 2018). Moreover, two VQ proteins genes, AtVQ14 and OsVQ13, have been found to control seed size in plant (Garcia et al. 2003; Wang et al. 2010; Uji et al. 2019).

Sequence comparisons of OsCDPK2 and OsVQ4 were conducted between full-length genomic DNA of ZS97 and MY46. For OsCDPK2, six single nucleotide polymorphisms (SNPs) and two 1 bp InDels were found, all of which were located in introns (Fig. S2). For OsVQ4 that consists of only one exon, a 3 bp InDel and three SNPs were identified (Fig. S3). The InDel resulted in an additional aspartic acid at residue 157 in MY46 compared with ZS97 (Fig. S4). One SNP (G259A) resulted in the substitution of glycine in ZS97 to serine in MY46 at residue 86. The remaining two SNPs were synonymous mutations. These results suggest that OsVQ4 was likely to be a candidate gene for qTGW1.2b.

Knockout of OsVQ4 using the CRISPR/Cas9 system

The CRISPR/Cas9 system was used to generate knockout mutants for OsVQ4 (Fig. 3a). Four independent mutational lines were selected to investigate the effects of OsVQ4, including Tb1 and Tb2 for cri-1, and Tb3 and Tb4 for cri-2 (Fig. 3b). The recipient Nipponbare and one WT-type transgenic line were used as the controls. Nine traits were measured, including TGW, GL, GW, NP, NSP, NGP, SF, GY and HD. Duncan’s Multiple Range Test was performed to determine the phenotypic differences among the six lines.

For TGW and GL, no significant difference was found between Nipponbare and the WT-type transgenic line, but significant reductions were observed in all the four mutational lines (Table 3). Compared with Nipponbare, the mutational lines were smaller by 4.9–9.8% for TGW and 1.5–3.5% for GL. Compared with the WT line, the mutational lines were smaller by 2.8–7.9% for TGW and 1.5–3.5% for GL. For GW, the effects were inconsistent. Compared with Nipponbare, the WT line showed significant increase, three mutational lines showed significant reductions, and one mutational line showed no significant difference. These results were in agreement with the effect detected for qTGW1.2b, showing that qTGW1.2b/OsVQ4 were responsible for controlling grain weight and grain length.

Table 3 Grain size traits in Nipponbare (NPB), the wild-type transgenic line (WT) and the four mutational lines of OsVQ4

Among the other five yield traits, significant differences were detected for NSP, NGP and SF, but not for NP and GY (Table S4). For NSP, significant increase was observed in all the four mutational lines, larger by 13.4–19.2% than Nipponbare, and by 15.1–21.0% than WT. For NGP, significant increase was observed in two mutational lines, Tb2 and Tb4, larger by 12.3 and 15.3% than Nipponbare, and by 13.7 and 16.8% than WT. For SF, significant reduction was observed in one mutational line, Tb3, lower by 13.7% than Nipponbare and 16.8% than WT. For HD, significant reductions were observed in the four mutational lines than WT, although the differences were low as 1.0–2.3 d or 1.5–2.9%. No obvious change was observed for plant stature and health in the mutational lines (Fig. S5).

Discussion

Most important agronomic traits are controlled by a few QTLs having large effects and many QTLs having small effects. Among rice QTLs for which the casual genes have been validated, major QTLs accounted for the vast majority (Ashikari et al. 2005; Hori et al. 2016; Huo et al. 2017; Li et al. 2019; Ma et al. 2019; Wang et al. 2019a; Dong et al. 2020; Ruan et al. 2020; Shi et al. 2020). Only a small number of genes underlying minor QTLs for HD were validated to date (Hori et al. 2016; Chen et al. 2018). In the present study, a minor QTL for grain weight, qTGW1.2b, was fine-mapped using NIL populations derived from sequential residual heterozygotes. Its casual gene OsVQ4 was validated using CRISPR/Cas9-targeted mutagenesis. In the NIL populations, TGW decreased by 0.9–2.0% in ZS97 homozygous lines compared with MY46 homozygous lines. In the knockout lines, TGW decreased by 2.8–9.8% compared with the wild-type transgenic line and the recipient. Our study establishes a strategy for cloning minor QTLs, which could also be used to identify a large number of potential target genes for the application of CRISPR/Cas system.

One inspiration from the present study was that minor QTLs could be used as new targets for enhancing genetic diversity of rice varieties. This may be done by searching for rare alleles presented in germplasm resources or by creating new alleles using artificial mutagenesis. Based on the data documented in the rice genomic variation and functional annotation database of Huazhong Agricultural University (https://ricevarmap.ncpgr.cn/v2/), two non-synonymous SNPs were found in coding region of qTGW1.2b/OsVQ4 in 1782 varieties. One was G4A, where both ZS97 and MY46 carried G. Another was G259A, which is identical to the variation detected between ZS97 and MY46. Three haplotypes could be classified based on the two SNPs. Among them, G4G259 of the ZS97-type accounted for the highest proportion (67.7%), followed by G4A259 of the MY46-type (31.4%). The remaining A4A259 type only accounted for 0.9%. Genetic effects of the rare allele A4A259 could be tested and its value for breeding application could be evaluated. Artificial insertional mutants including an activation tagging line for OsVQ4 were also found in the Rice Functional Genomic Express Database (https://signal.salk.edu/cgi-bin/RiceGE). These insertional mutants could also be tested and evaluated.

Loss-of-function allele of qTGW1.2b was produced using CRISPR/Cas9 in the present study. Although grain weight was reduced in these mutational lines (Table 3), grain yield remained stable due to compensation in grain number (Table S4). Moreover, a slight reduction was observed in heading date. Thus, the loss-of-function allele of qTGW1.2b/OsVQ4 could be used to shorten growth period without yield penalty. Generally, the agronomic traits are determined by both positive and negative regulators. Among the 20 validated genes underlying QTLs for grain weight, eight were negative regulators, including GW2, TGW2, GS3, SG3, GL3.1/qGL3, TGW3/qTGW3/GL3.3, GSE5 and TGW6 (Li et al. 20192020a; Ruan et al. 2020). Loss function of these genes results in increase of grain weight and grain yield. In our previous studies, eight minor QTLs for grain weight and size were fine-mapped (Zhang et al. 20162020; Dong et al. 2018; Wang et al. 2019b; Zhu et al. 2019). There is a high probability that negative regulators could be identified from these minor QTLs. Applying CRISPR/Cas9 to create their loss-of-function alleles may achieve improvement in grain yield.

The VQ protein are a class of plant-specific proteins, which were found to regulate diverse plant growth and developmental processes (Wang et al. 2010; Li et al. 2014; Jing and Lin 2015; Ye et al. 2016; Lei et al. 2017). The characteristic structure of VQ proteins is the conserved VQ motif, which possesses the core sequence FxxhVQxhTG (x refers to any amino acids and h refers to hydrophobic amino acids) (Jing and Lin 2015). The primary structure of VQ proteins is highly diverse in regions other than the VQ motif (Jing and Lin 2015). Two VQ proteins, AtVQ14 and OsVQ13 were reported to control seed development in plant (Wang et al. 2010; Uji et al. 2019). In the present study, we found OsVQ4 was also responsible for grain size. Similarity of these three proteins is weak. The OsVQ4 had only 20.1% amino acid residuals identity to AtVQ14 (37/184) and 25.0% identity to OsVQ13 (46/184) (data not shown). They were classified into three different groups based on the VQ motif (Kim et al. 2013). Both AtVQ14 and OsVQ13 were found to function as a positive regulator for seed size. Mutation of AtVQ14 exhibited small seed compared with wild type (Wang et al. 2010), while over-expression of OsVQ13 produced larger grain compared with wild type (Uji et al. 2019). Our results showed that knockout of OsVQ4 also resulted in small grain (Fig. 4). These results suggested that VQ proteins may have conserved roles in seed size, though they had highly diverse protein sequence.

Fig. 3
figure 3

Knockout of OsVQ4 using CRISPR/Cas9 system. a Schematic of OsVQ4 gene structure and the CRISPR/Cas9 target site. OsVQ4 contains a single exon indicated by orange rectangles. The translation initiation codon (ATG) and termination codon (TAG) are shown. The target site nucleotides are shown in blue letters. The protospacer adjacent motif (PAM) site is marked in green and underlined. b Sequence of Nipponbare (NPB), the wild-type transgenic line (WT) and the four mutational lines of OsVQ4 in the target region. Insertions are indicated by red letters

Fig. 4
figure 4

Grains of near isogenic lines (NILs), Nipponbare (NPB), the wild-type transgenic line (WT) and the four mutational lines of OsVQ4. ZS97: Zhenshan 97; MY46: Milyang 46. Bar = 10 mm

Site-directed mutagenesis revealed that the function of VQ protein largely depended on its VQ-motif. Transgenes harboring mutations in the VQ motif of AtVQ14 resulted in small seed phenotype, but transgenes harboring mutations in other regions did not had such an effect (Wang et al. 2010). This suggests that mutations in other regions had limited influence on it molecular function. The non-synonymous variations between ZS97 and MY46 were located at residues 86 and 157 in OsVQ4 protein, which were outside the VQ motif (Fig. S4). Twenty-seven homologs of OsVQ4 were found in gramineae using BLASTP program from National Center for Biotechnology Information with a E value < 10–5 (https://blast.ncbi.nlm.nih.gov/blast.cgi). At residue 86, seven homologs carried glycine, seven carried serine, and the other 13 had deletion. At the region harboring residue 157, substitutions, insertions and deletions were observed. The variation between ZS97 and MY46 occurred in non-conserved regions outside the VQ motif, which may be the reason why qTGW1.2b exhibited a small effect. The VQ protein functions as transcriptional regulators in plant, which usually work in combination with the WRKY transcription factors (Lai et al. 2011; Chi et al. 2013; Jing and Lin 2015; Ye et al. 2016; Lei et al. 2017). The regulation of seed size by AtVQ14 was achieved by forming a complex with a WRKY transcription factor MINI3 (Wang et al. 2010). Hence, WRKY transcription factors would be a good starting point to investigate the molecular mechanisms underlying the role of qTGW1.2b/OsVQ4 in regulating grain size in rice.

Grain weight and grain number are two key traits determining grain yield in rice, however, the negative correlation between these two traits were frequently observed. In the present study, qTGW1.2b was found to have opposite effects on grain weight and number in the NIL populations (Table 1). This was also confirmed in the knockout lines, which showed smaller but more grains compared with WT (Table 3). These results suggest that qTGW1.2b may coordinate the trade-off between grain weight and number in rice. More recently, a mitogen-activated protein kinase (MAPK) cascade, consisting of OsMKKK10, OsMKK4 and OsMPK6, was found to contribute the trade-off between grain weight and number in rice, by integrating localized cell differentiation and proliferation (Guo et al. 2018, 2020; Xu et al. 2018). Knockout of OsMKKK10 or OsMKK4, and knockdown of OsMPK6, all resulted in smaller but more grains (Guo et al. 2018; Xu et al. 2018). Members of the MAPK cascade could interact with VQ proteins to form a trimeric complex with WRKY transcription factors (Andreasson et al. 2005; Pecher et al., 2014). Investigating whether OsVQ4 interacts with the MAPK cascade will help to reveal the molecular mechanism coordinating the trade-off between grain weight and number in rice.