Background

Wheat (Triticum aestivum L.) is one of the most important cereal crops worldwide, providing nearly 20% of the calories for the world population [1]. It is estimated that wheat yield needs to be increased by 70% to meet the food demand associated with the growth of the world population [2]. In this context, improving wheat yield is critical to ensuring food security in the future. Wheat yield is significantly influenced by thousand kernel weight (TKW), kernel number per spike (KNS), and spike number per unit area (SN) [3, 4]. Of these, TKW has been selected as an essential trait in wheat breeding programs, due to its high heritability [5]. Kernel size-related traits, as one of the critical factors determining the formation of kernel weight, are mainly composed of kernel length (KL), kernel width (KW), kernel diameter ratio (KDR) and kernel thickness (KT) [6]. Larger kernels positively influence wheat seedling growth and significantly contribute to high-yield improvement [4, 7, 8]. Therefore, deciphering the genetic basis and finding functional genes for kernel size are critical for the enhancement of grain yield traits in wheat.

Grain-size related traits have attracted considerable attention in wheat breeding. Yield-related traits are complex quantitative traits controlled by polygenes [9,10,11], which are strongly influenced by genotypic and environmental factors [12]. In the last two decades, a large number of QTLs underlying wheat kernel size-related traits have been successfully identified by traditional bi-parental linkage mapping [7, 9,10,11, 13,14,15,16,17] and genome-wide association studies (GWAS) [18,19,20,21,22,23]. However, due to the large and highly repetitive nature of the wheat genome, identifying stable and robust QTLs for kernel size-related and yield traits remains challenging in wheat breeding [24, 25].

Previous studies reported that QTLs for grain size were generally mapped in large confidence interval (CI) with minor effects and are significantly influenced by different genetic backgrounds and environments, which limits the usefulness of these QTLs in wheat breeding programs [26]. The meta-QTL analysis is a robust method for the genetic analysis of complex traits by integrating QTLs from different studies to obtain stable genetic regions controlling a quantitative trait [27]. Compared to QTLs identified in a single study, MQTLs have the advantage of a smaller CI and a higher consistency under different genetic backgrounds. The meta-QTL analysis also facilitates the identification of candidate genes in a genome as complex as wheat.

MQTL analysis has been successfully applied in various crops, including maize [28,29,30,31], rice [26, 32, 33] and soybean [34]. MQTL analysis in wheat has also been effectively used to establish the consensus map of QTLs for many agronomic traits [35,36,37]. Previous studies integrated QTLs for yield and yield-related traits from published articles. They identified 12 significant MQTLs on chromosomes 1A, 1B, 2A, 2D, 3B, 4A, 4B, 4D and 5A, including two critical underlying genes, Rht and Vrn [38]. Tyagi et al. (2015) performed a meta-analysis of QTLs associated with kernel morphological traits and mapped 17 MQTLs on seven chromosomes in wheat [39]. In a previous study, a total of 2230 QTLs for yield and yield-related traits were used for meta-QTL analysis and 145 MQTLs were identified, of which 85 were verified by GWAS using different natural populations. Within 76 MQTL core intervals, 237 candidate genes involved in photoperiod response, kernel development, multiple plant growth regulatory pathways, carbon and nitrogen metabolism and ear and flower organ development were identified through searching for sequence homology and expression analysis [37]. Meanwhile, Liu et al. (2020) performed a meta-analysis with 381 QTL related to yield and identified 86 MQTL and 210 candidate genes in wheat [40]. In addition to yield-related traits, MQTL analysis was also used to discover consistent QTLs and identify candidate genes for various quantitative traits such as leaf rust [41], drought and heat tolerance [42,43,44], salt tolerance [45] and disease resistance [46,47,48].

The present study used the inclusive composite interval mapping (ICIM) method to identify the QTLs controlling kernel size-related traits across seven environments. We performed a meta-analysis by combining the QTLs detected in our study with the 1071 QTLs from previous studies. Our main objectives were to: (1) identify stable QTL for traits related to kernel size in seven environments; (2) discover and map MQTLs from numerous reported QTL and current studies; and (3) identify candidate genes related to kernel size associated with MQTL intervals.

Results

Phenotypic and correlation analyses

In the field trials conducted in seven environments (E1-E7), the parental line Q9086 had a significantly longer and wider kernel than Longjian19 (Table S1). In KT, the parental line Longjian19 had an advantage over Q9086. In the RILs population, all traits varied widely and had an approximately normal distribution with significantly transgressive segregation (Fig. 1). The coefficients of variation for KL, KW, KDR and KT ranged from 3.47% to 5.71%, 2.47% to 6.27%, 3.24% to 8.57% and 3.88% to 5.45%, respectively. The ANOVA of four kernel size-related traits revealed significant differences (P < 0.01) in the variation factors of environment, genotype, and genotype × environment interaction. Among the kernel size-related traits, KL (h2 = 0.89) and KDR (h2 = 0.70) were highly heritable, followed by KW (h2 = 0.67) and KT (h2 = 0.61) (Table S2).

Fig. 1
figure 1

The frequency distribution of kernel size-related traits

Significant correlations were found among KL, KW, KDR and KT (Fig. 2). KL showed a positive correlation with KW (r = 0.45, P < 0.01) and KDR (r = 0.71, P < 0.01), whereas there was a negative correlation with KT (r = -0.03, P < 0.05). KW showed a significant positive correlation with KT (r = 0.41, P < 0.01) and a negative correlation with KDR (r = -0.30, P < 0.05). In addition, a negative correlation was observed between KT and KDR (r = -0.42, P < 0.01).

Fig. 2
figure 2

Correlation coefficient among four kernel size-related traits in the Q9086/Longjian19 RILs population. * and ** indicate significant level at P < 0.05 and P < 0.01, respectively

QTLs controlling kernel size-related traits

QTL mapping detected 32 QTLs for kernel size-related traits with the PVE ranging from 3.06% to 14.2% in different environments (Table S3, Fig. 3). These loci were mapped on 17 chromosomes, except for chromosomes 2B, 4B, 5A and 5D. Eleven stable QTLs, namely QKL.acs-1A, QKW.acs-1A, QKDR.acs-2A, QKL.acs-2D, QKW.acs-3A, QKDR.acs-4A, QKDR.acs-5B.2, QKL.acs-6A, QKL.acs-6B, QKW.acs-7B.1 and QKW.acs-7B.2, were detected in more than three environments, with PVE ranging from 3.07% to 9.85%.

Fig. 3
figure 3

Chromosomal locations of QTLs detected for kernel size-related traits. The vertical bars with different colors represent the interval of QTLs for kernel length (black), kernel width (red), kernel diameter ratio (green), and kernel thickness (blue)

Ten QTLs associated with KL were identified on chromosomes 1A, 1B, 2D, 3D, 4A, 6A, 6B, 7A, 7B and 7D, with PVE ranging from 3.40% to 8.26% (Table S3, Fig. 3). Of these, four stable QTLs were identified for KL on chromosomes 1A, 2D, 6A and 6B, including QKL.acs-1A identified in E3, E4 and E5, QKL.acs-2D identified in E3, E4 and E7, QKL.acs-6A identified in E3, E5 and E7, QKL.acs-6B identified in E1, E2, E3, E6 and E7, respectively. Notably, QKL.acs-6B, with 4.07%-8.26% of the PVE, was detected in five environments (E1, E2, E3, E6 and E7). Except for the QTL QKL.acs-6B, the additive effect of the other three stable QTLs contributed to decreasing KL.

Among the seven QTLs associated with KW on six chromosomes (1A, 2D, 3A, 4A, 4D and 7B), with PVEs ranged from 3.35% to 9.85% (Table S3, Fig. 3). Four stable QTLs, QKW.acs-1A identified in E4, E6 and E7, QKW.acs-3A identified in E2, E3 and E6, QKW.acs-7B.1 identified in E4, E5 and E6, and QKW.acs-7B.2 identified in E1, E5 and E6, which were mapped on chromosomes 1A, 3A and 7B, respectively. QKW.acs-1A and QKW.acs-7B.1 had a negative additive effect on KW, while QKW.acs-3A and QKW.acs-7B.2 showed a positive additive effect for increasing KW. The QTLs QKW.acs-7B.1 and QKW.acs-7B.2 were detected on the same chromosomes with opposite additive effects.

Nine QTLs for KDR were mapped on chromosomes 1A, 1D, 2A, 2D, 3B, 3D, 4A and 5B, with individual PVE ranging from 3.06% to 14.2% (Table S3, Fig. 3). Three stable QTLs, QKDR.acs-2A identified in E1, E5 and E7, QKDR.acs-4A identified in E2, E3, E6 and E7, and QKDR.acs-5B.2 identified in E1, E3 and E5, were also detected in at least three environments with a range of PVE from 3.06% to 6.9%. A major QTL (QKDR.acs-2D) was identified and explained 14.2% of phenotypic variance. In addition, a stable QTL QKDR.acs-4A was detected in four environments (E2, E3, E6 and E7) and accounted for 3.06–6.9% of the PVE.

On chromosomes 2A, 3B, 4D, 6A, 6B and 6D, six QTLs associated with KT were identified, each accounting for 4.6%-10% of PVE (Table S3, Fig. 3). They were all detected in less than two environments. Of those, QTL QKT.acs-3B.1 owned the highest PVE of 10%.

QTLs identified under different water environments

In the present study, 23 QTLs for kernel size-related traits were detected under DS and WW environments (Table S3, Fig. 3). Under DS conditions, 14 QTLs were located on chromosomes 1A, 1B, 1D, 2D, 3B, 3D, 4A, 5B, 6A, 6B and 7D with PVE ranging from 3.4% to 14.2%. Two stable QTLs, QKL.acs-2D and QKW.acs-1A, were identified under DS conditions. Under WW environments, nine QTLs for kernel size-related traits were located on chromosomes 1A, 2A, 3D, 4A, 4D, 6D, 7A and 7B with PVE ranging from 4.23% to 9.53%. Importantly, nine stable QTLs, including QKL.acs-1A, QKDR.acs-2A, QKW.acs-3A, QKDR.acs-4A, QKDR.acs-5B.2, QKL.acs-6A, QKL.acs-6B, QKW.acs-7B.1 and QKW.acs-7B.2, were identified under both WW and DS environments.

Initial QTLs collection for wheat kernel size-related traits

By integrating 1071 initial QTLs from 34 QTL studies published between 2007 and 2020 (Table S4) and 32 QTLs identified in this study, a total of 1103 initial QTLs for kernel size-related traits were used for MQTL analysis (Fig. 4a). The distribution of initial QTLs significantly differed from homoeologous groups, sub-genomes and individual chromosomes. For example, the number of identified QTLs ranged from 101 on homoeologous group VII to 241 on group II, and from 15 on chromosome 4D to 117 on chromosome 2D (Fig. 4b). Of the 1103 initial QTLs, 399, 433 and 271 QTLs were distributed among sub-genomes A, B and D, respectively (Fig. 4d). The CI ranged from 0.14 cM to 190 cM, with an average of 14.52 cM (Fig. 4c). The proportion of phenotypic variance explained by individual QTL ranged from 1.00% to 86.31%, with an average of 9.98% (Fig. 4c).

Fig. 4
figure 4

Number of QTLs collected (a) by trait category, KL (kernel length), KW (kernel width), KT (kernel thickness), and KDR (kernel diameter ratio) and (b) in 21 wheat chromosomes (c) frequencies of QTLs with different PVE (%) and CI values and (d) proportion of QTL numbers in wheat sub-genomes A, B, and D

MQTL analysis for wheat kernel size-related traits

A total of 346 initial QTLs were projected on the consensus map, while the remaining QTLs were eliminated due to the lack of common markers with the consensus map (Fig. 5). After meta-analysis, 58 MQTLs were detected on chromosomes 1B, 1D, 2A, 3D, 4A, 5B, 5D, 6B, 7A, 7B and 7D (Table S5). Each chromosome harbored two (3D) to seven MQTLs (1B, 4A and 7B) (Fig. 6a). The projected initial QTLs on the chromosomes varied from 20 (5D) to 80 (5B) (Fig. 6b). Most of the MQTL regions were co-localized for more than two kernel size-related traits (Fig. 5). The number of individual QTL per MQTL ranged from 1 (MQTL6B.2 and MQTL6B.3) to 18 (MQTL2A.5) (Table S5). MQTL intervals ranged from 0.21 cM (MQTL5B.6) to 72.64 cM (MQTL7A.6) with an average of 4.46 cM, indicating a reduction in CI of 3.26 fold compared to the initial QTLs (14.54 cM) (Table S5, Fig. 6c). The PVE ranged from 5% (MQTL1B.7) to 56% (MQTL5D.2) with an average PVE of 17.12%, which was increased 1.72 fold (Table S5, Fig. 6d). Based on the comparison of the flanking marker sequences, the MQTLs had unique physical positions in the reference sequence of the Chinese Spring wheat genome. The physical interval of these 58 MQTLs ranged from 1.54 Kb to 580.66 Mb (Table S5). Of these, 12 MQTLs with a physical interval less than 20 Mb were selected as core MQTLs.

Fig. 5
figure 5

The chromosome distribution of the 58 MQTL for kernel size-related traits on 11 chromosomes. The circles from inside to outside represent the high-density consensus genetic map, the number of initial QTLs mapped on the MQTL interval, values of the confidence interval, values of the phenotypic variation explained, and the physical map, respectively

Fig. 6
figure 6

Number of MQTLs on different wheat chromosomes (a); comparison of the original QTLs and the projected QTLs located in the MQTL intervals on different chromosomes (b); comparison between the mean CI of the original QTLs and the MQTLs (c); comparison between the mean PVE of the MQTLs and the original QTLs (d). The numbers above the bars show the rate of change for the mean CI and PVE between the MQTLs and the original QTLs in (c) and (d), respectively

Candidate genes mining and expression analysis

We identified 1864 potential candidate genes in 12 core MQTL intervals, with the lowest (1) and highest (487) number of potential candidate genes in the MQTL7B.4 and MQTL2A.2 intervals, respectively. The potential candidate genes within the regions of 12 MQTLs were screened and annotated based on IWGSC RefSeq v1.1 from the Chinese Spring wheat reference genome (Table S6).

The GO terms associated with biological processes belonged to metabolic and cellular (229 and 210 potential candidate genes, respectively) pathways (Fig. 7). GO terms associated with molecular function were related to binding and catalytic activity (380 and 260 potential candidate genes, respectively). Regarding the cellular component, potential candidate genes were mainly related to the cell and cell part, with 130 and 128 potential candidate genes, respectively. KEGG analysis for potential candidate genes revealed that ubiquitin-mediated proteolysis and plant hormone signaling are the two most important pathways involved in the metabolic process (Fig. 8).

Fig. 7
figure 7

Gene ontology (GO) terms for 1863 candidate genes underlying MQTLs interval for kernel size-related traits

Fig. 8
figure 8

KEGG pathway enrichment of 1863 candidate genes

The potential candidate genes were subjected to in silico expression analysis using RNAseq data [35,36,37]. Only 70 candidate genes predicted within the regions of nine MQTLs (except MQTL7B.4, MQTL7B.5 and MQTL7D.2) were differentially expressed in spike and grain (Table 1, Fig. 9). These candidate genes are involved in various metabolic pathways, such as carbon fixation in photosynthetic organisms (4 genes), carbon metabolism (6 genes), mRNA surveillance pathway (4 genes), RNA transport (4 genes) and biosynthesis of secondary metabolites (18 genes).

Table 1 Identification of 70 candidate genes located in the nine core MQTL intervals
Fig. 9
figure 9

Heatmap showing the differential expression level of 70 candidate genes underlying MQTL intervals

Discussions

Grain yield is influenced by the combination of kernel weight and number per spike [14, 49]. TKW is not only one of the critical components of grain yield, but also is commonly used as a common factor for determining commercial value in wheat. Kernel size and shape, including KL, KW and KT, are strongly and positively correlated with TKW [50,51,52]. A bigger kernel positively affects wheat kernel weight, yield and commercial value [40, 53]. Kernel size-related traits influence wheat yield by regulating TKW, and both are associated with high heritability [16, 54,55,56,57,58,59].

We observed significant and positive correlations between KL, KW and KDR (r = 0.45, P < 0.01 and r = 0.71, P < 0.01, respectively), KW and KT (r = 0.41, P < 0.01). Meanwhile, a negative and significant correlation was also observed between KT and KDR (r = -0.42, P < 0.01) (Fig. 2), which is consistent with previous studies [13, 50, 60, 61]. It is known that KL reached its maximum value 15 days after anthesis, while KW and KT reached their maximum value four weeks after anthesis [62, 63]. KL showed the highest heritability (0.89) in this study, followed by KDR (0.70), KW (0.67) and KT (0.61) (Table S2), which is consistent with previous studies [7, 62,63,64]. Therefore, increasing KL and KW through genetic improvement has a positive effect on the grain weight and yield of wheat.

Many QTLs and genes for kernel size have been identified on 21 chromosomes in wheat [16, 61, 64,65,66,67, 69, 70]. In this study, 32 QTLs for KL, KW, KDR and KT were found on 17 chromosomes (Table S3, Fig. 3). Of these, a stable QTL, QKW.acs-1A, identified in E4, E6 and E7, is mapped in the Xcfa2219-Xgwm99 interval on chromosome 1A only under DS environments. Li et al. (2012) identified a major QTL with a PVE of 40.79% that shares the same flanking marker Xgwm99 with QKW.acs-1A [71]. In addition, the stable QTLs QKL.acs-1A and QKW.acs-1A share the same flanking marker Xwmc99 with the QTLs QGw.ccsu-1A.3 reported by Mir et al. (2012) [72]. This suggests that the marker Xgwm99 can be used for marker-assisted selection in wheat breeding programs. QKL.acs-2D, located in the interval of Xgwm157-Xwmc41, shared a common flanking marker (Xwmc41) with QTKW.ncl-2D.2 [54]. QKL.acs-2D, located in the Xgwm157-Xwmc41 interval on chromosome 2D, strongly overlapped with the different environmental QTLs for KDR (QKDR.acs-2D) and KW (QKW.acs-2D). In addition, QKL.acs-6B was identified in E1, E2, E3, E6 and E7, with a PVE ranging from 4.07% to 8.26%. This indicates that kernel size-related traits are closely linked and represent one of the crucial elements in the regulation of kernel weight.

MQTL analysis is a powerful strategy for validating consistent QTLs by integrating independent QTLs from different trials on a consensus or reference map [27, 73]. In the present study, a total of 1103 initial QTLs from previous mapping studies and identified in this study were performed MQTL analysis to identify key genomic regions linked to kernel size-related traits in wheat (Table S4, Fig. 4). As a result, 346 initial QTLs were finally refined into 58 MQTLs on chromosomes 1B, 1D, 2A, 3D, 4A, 5B, 5D, 6B, 7A, 7B and 7D (Table S5, Fig. 5). The average 95% CI of MQTLs (4.46 cM) was 3.26-fold less than that of initial QTLs (14.54 cM). The result was similar to previous MQTL analysis for grain yield and yield-related traits, where the average CI of MQTLs was 2.9-fold lower than that of the initial QTLs [37]. Most of the MQTLs in the present study controlled more than one trait, likely indicating either a tight linkage of genes or the presence of pleiotropic genes for controlling kernel size-related traits [37, 43, 48, 73]. By the peak marker sequences compared with the wheat genome reference sequence of Chinese Spring, 58 MQTLs had definite physical positions and the physical intervals ranged from 1.54 Kb to 580.66 Mb (Table S5). Of these, six MQTLs, such as MQTL1D.2, MQTL4A.2, MQTL7B.1, MQTL7B.4, MQTL7B.5 and MQTL7B.6, showed narrower physical intervals (< 5 Mb), shorter genetic distance (< 10 cM) and more initial QTLs (> 2) (Table S5). These MQTLs are promised to be used in future marker-assisted selection for improving kernel size, and for isolating key genes by the map-based cloning approach in wheat.

Candidate genes related to important agronomic traits in wheat have been identified by MQTL analysis [37, 40, 43, 74, 75]. Nadolska-Orczyk et al. (2017) classified the genes controlling kernel yield into five categories: transcription factors, growth regulator signaling, carbohydrate metabolism, cell division and proliferation and flowering regulators [76]. Understanding the genetic and physiological pathways involved in grain development is of great help for investigating traits related to kernel size. In this study, we detected 1864 potential candidate genes in 12 core MQTL intervals with a physical interval of less than 20 Mb using the wheat genome reference sequence of Chinese Spring. Among 1864 potential candidate genes, 70 candidate genes were mainly expressed in the spike and grain at different developmental stages (Table 1, Fig. 9), consistent with those previously reported by Yang et al. (2021) [37]. In recent years, the analysis of homology relationships between wheat and rice facilitates the cloning of several yield-related genes such as TaFlo-A1 [77], TaCKX6-D1 [78] and TaTGW6-A1 [79]. In the present study, 17 out of 70 candidate genes homologous to rice genes were found within nine core MQTL intervals (Table 1). Of these, a key gene TraesCS3D02G024700 in the MQTL3D.1 interval was homologous to the gene OsCYP709C5 involved in regulating cytochrome P450 in rice [80]. Guo et al. (2021) also showed that constitutive overexpression of TaCYP78A5 significantly increased seed size and weight [81]. The ubiquitin–proteasome pathway has been associated with seed size development in wheat and rice. The corresponding genes, e.g., TaGW2-6A/6B [82, 83] and OsUBC [84] have been cloned in wheat and rice, respectively. According to a previous study, carbohydrate metabolism is essential to yield and yield-related traits [76]. The gene TraesCS7D02G149000 identified in the MQTL7D.2 region was homologous to the genes of OsSWEET15 in rice [85] and TaSWEETs in wheat [86, 87], which were identified as the key gene involved in the sucrose transport pathway in rice [85] and floral development in wheat [76]. The gene of TraesCS7D02G149500 (MQTL7D.2) was identified as an orthologous gene of DPL1/2, involved in pollen hybrid incompatibility in rice [88]. In this study, the orthologous genes of DEP2, EP2 and SRS1 were found in the MQTL7B.1 region as TraesCS7B02G002900 and TraesCS7B02G003000, which was involved in regulating kernel size and yield [89, 90]. In addition, the remaining 53 candidate genes were involved in various signaling pathways, such as zinc finger protein [91], transcription factors [17] and glycosyltransferase [92], which are also involved in the regulation of yield and yield-related traits.

Conclusions

In this study, we found that kernel size-related traits in wheat are predominantly regulated by genetic factors with moderate and high heritability. Most of stable QTLs were detected under both well-watered and drought-stressed conditions. Potential candidate genes expressed in spike and grain were identified through meta-QTL and in-silico expression analysis. The markers closely linked to stable QTLs had great potential in the marker-assisted breeding program and the identification of candidate genes advanced the understanding of the genetic basis governing kernel size in wheat.

Methods

Plant materials and field trials

A RILs population consisting of 120 lines derived from the cross between two winter wheat cultivars, Longjian19 and Q9086 [93]. The male parent, Longjian19, released by the Gansu Academy of Agricultural Sciences, Lanzhou, Gansu, is an elite drought-tolerant variety widely grown in rainfed areas (300–500 mm annual rainfall) in northwest China. The female parent Q9086, is a high-yielding cultivar developed by Northwest Agriculture and Forestry University, Yangling, Shanxi, China. It is suitable for cultivation under conditions with sufficient water and high fertility. The two parents differ significantly from several physiological and agronomical traits, especially under rainfed environments [93,94,95].

Field trials were conducted at Yuzhong farm station, Gansu, China (35°48'N, 104°18'E, altitude 1860 m) during the growing seasons in 2015–2016 under drought-stressed (DS, designated E1) and well-watered conditions (WW, designated E2), while in 2016–2017 only under drought-stressed conditions (designated E3). Field trials were also conducted at Tongwei farm station, Gansu, China (35°11′N, 105°19′E, altitude 1750 m) during the 2017–2018, 2018–2019, and 2019–2020 growing seasons. Planting in 2017–2018 was conducted under drought-stressed (designated E4) and well-watered conditions (designated E5), while the 2018–2019 and 2019–2020 cropping seasons were conducted under drought-stressed conditions only (designated E6 and E7, respectively). The two cropping sites are characterized by a typical dry inland environmental condition in Northwest China, where the annual average temperature is about 7.0 °C, the annual rainfall is less than 400 mm with approximately 60% falling from July to September, but the annual evapotranspiration capacity is more than 1500 mm. The two water treatments in different locations and years were conducted in field conditions without any rainout shelter. The DS treatments were equivalent to the rainfed condition in each growing season, whereas the WW treatments were irrigated with a water supply of 75 mm at the spike emergence (Zadoks 55) and grain filling (Zadoks 71) stages, respectively. Here, the decimal codes for the growth stages of wheat are described by Zadoks et al. (1974) [96]. In this case, the rainfall of the DS plots in each field environment was 164.3 mm (E1) to 296.5 mm (E7) (Fig. S1). All progenies and parents were sown in late September and harvested in early July of the following year. A randomized complete block design (RCBD) was conducted with three replications for each line and parent. Each plot consisted of six 1 m rows, 0.2 m spacing, with a sowing rate of 60 seeds per row. Field management followed local wheat cultivation practices.

After harvesting, two hundred seeds for each line were used to measure kernel length (KL), kernel width (KW) and kernel diameter ratio (KDR) with the SC-G wheat grain appearance quality image analysis system (Hangzhou WSeen Detection Technology Co., Ltd, Hangzhou, China). The kernel thickness (KT) was determined with a vernier caliper. All measurements were conducted with three biological replicates. The average values of the traits were used for QTL analysis.

Statistical analysis

Statistical analysis and analysis of variance (ANOVA) were performed using SPSS 22.0 (IBM Corporation, Armonk, NY, United States). According to the method described by Toker et al. (2004) [97], the broad-sense heritability (h2) was estimated across environments using the formula:

$${h}^{2}={\sigma }_{g}^{2}/\left({\sigma }_{g}^{2}+{\sigma }_{ge}^{2}/r+{\sigma }_{e}^{2}/re\right)$$

where \({\sigma }_{g}^{2}\), \({\sigma }_{ge}^{2}\) and \({\sigma }_{e}^{2}\) estimate genotype, genotype × environment interaction and residual error variances, respectively, and e and r are the numbers of environments and replicates per environment, respectively. The correlation among KL, KW, KDR and KT in the RILs population was also assessed.

Construction of linkage map and QTL analysis

For QTL mapping, a genetic map consisting of 524 SSR markers, described in a previous study was used [98]. These markers were distributed among 21 linkage groups and covered a total genetic distance of 2266.72 cM with an average distance of 4.33 cM between adjacent markers.

The inclusive composite interval mapping (ICIM) method was performed using the QTL software IciMapping V4.1 to determine the positions and effects of QTLs [99]. QTL with LOD value ≥ 2.5, as determined by 1000 permutation tests at P ≤ 0.05, were declared for the presence of significant QTL. QTLs were named based on the International Rules for Genetic Nomenclature (http://wheat.pw.usda.gov/ggpages/wgc/98/intro.htm). QTLs detected in at least three of seven environments were considered stable QTLs. QTLs for a trait identified with common flanking markers or overlapping CIs were treated as one QTL, with the CI reassigned by overlapping genetic positions.

Initial QTL collections used for MQTL analysis

A total of 1071 QTLs for KL, KW, KDR and KT traits derived from 36 bi-parental populations were retrieved from 34 published studies from 2007 to 2020 (Table S4). The size of the mapping populations varied from 99 to 547 lines of different types, including three double haploid (DH), seven F2 and 26 RILs populations evaluated in different years and locations. The population information, including target traits, population parents, population types, and the number of markers used in the genetic map, was listed in Table S4.

QTLs localization on the reference map

A high-density map containing 7352 markers, including SSR, DArT, SNP and other types of markers, was used as a reference map in this study [75]. The total length of the reference map is 4994.0 cM with an average distance of 0.68 cM. The original QTL data and associated individual genetic maps from previous studies, and the reference map, were used as input files to create a consensus map (Fig. S2) and perform MQTL analysis with BioMercator V4.2.3 [100].

The position, chromosome groups, proportion of phenotypic variance explained (PVE or R2), and logarithm of odds ratio (LOD score) were recorded for each of the QTLs in the 36 studies. The formula CI = 530/(N × R2) for BC and F2 lines, CI = 287/(N × R2) for DH lines, and CI = 163/(N × R2) for RILs lines was applied to calculate the 95% CIs of QTLs, where N is the population size and R2 is the proportion of phenotypic variance explained of the QTL [101]. For QTLs without well-defined LOD scores and R2, these criteria were arbitrarily set at 3 and 10%, respectively. All collected QTLs with appropriate information were projected onto the reference map using BioMercator V4.2.3 [100]. The approach proposed by Goffinet and Gerber (2000) [27] was used when the number of QTLs per chromosome was ten or less, while the two-step algorithm was used when the number of QTLs per chromosome was higher than ten [102]. The Akaike Information Criterion (AIC) statistics were used to determine the best model for defining the number of MQTLs or "true" QTLs that best represent the original QTLs. The algorithms and statistical procedures implemented in this software are well described in previous studies [100, 102, 103].

Identification of candidate genes

To identify candidate genes, initially, the marker or its related primer sequences on both sides of the MQTL confidence intervals were manually searched using URGI Wheat (https://wheat-urgi.versailles.inra.fr), GrainGenes (https://wheat.pw.usda.gov/GG3/), DArT (https://www.diversityarrays.com) and the Illumina company (https://www.illumina.com) databases. The obtained sequences were then aligned to IWGSC RefSeq v1.1 (https://wheat-urgi.versailles.inra.fr/) to find the physical location of each marker. Candidate genes for this MQTL with a physical interval of less than 20 Mb were identified, and their associated functions were compared to choose the best possible candidates. The candidate genes were also investigated using Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) enrichment analyses using Omicshare online tools (https://www.omicshare.com/).

In-silico expression analysis of candidate genes

The transcriptomic data of several wheat tissues deposited in the Expression Visualization and Integration Platform (expVIP, https://www.wheat-expression.com/) were downloaded to study the in-silico tissue expression of candidate genes [104]. This included 18 tissues throughout the wheat growth period [105]. The expression levels of candidate genes were assessed by transcripts per million (TPM) and visualized using the heatmap of TBtools software (https://github.com/CJChen/TBtools/releases).