Trait variation and correlations
ANOVA showed highly significant differences among genotypes and among environments for all measured traits (Table S1). Genotype (G) explained 51.8 ± 6.9% of the phenotypic variance on average, ranging from 39.4% for GW to 62.6% for GL. Environment (E) accounted for 12.9 ± 5.1% of the phenotypic variance on average, ranging from 6.0% for GL to 21.8% for GW. The G × E interaction was also significant for all measured traits and accounted for 27.9 ± 2.0% of the phenotypic variance on average, ranging from 25.3% for GP to 30.8% for GR.
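As an illustration of how such percentages can be obtained, the sketch below partitions the total sum of squares of a balanced two-way layout into G, E, G × E, and residual shares. It assumes that "variance explained" means the ratio of each sum of squares to the total, which the text does not state; the function name `variance_shares` and the toy values are hypothetical and are not data from this study.

```python
from statistics import mean

def variance_shares(data):
    """Percent of total sum of squares due to G, E, GxE and residual.

    `data` maps (genotype, environment) -> list of replicate values.
    Assumes a balanced design (same number of replicates in every cell).
    """
    genos = sorted({g for g, _ in data})
    envs = sorted({e for _, e in data})
    n_rep = len(next(iter(data.values())))

    grand = mean(v for vals in data.values() for v in vals)
    g_mean = {g: mean(v for e in envs for v in data[(g, e)]) for g in genos}
    e_mean = {e: mean(v for g in genos for v in data[(g, e)]) for e in envs}
    cell = {key: mean(vals) for key, vals in data.items()}

    ss_g = n_rep * len(envs) * sum((g_mean[g] - grand) ** 2 for g in genos)
    ss_e = n_rep * len(genos) * sum((e_mean[e] - grand) ** 2 for e in envs)
    ss_ge = n_rep * sum(
        (cell[(g, e)] - g_mean[g] - e_mean[e] + grand) ** 2
        for g in genos for e in envs
    )
    ss_res = sum((v - cell[key]) ** 2 for key, vals in data.items() for v in vals)

    total = ss_g + ss_e + ss_ge + ss_res
    parts = {"G": ss_g, "E": ss_e, "GxE": ss_ge, "residual": ss_res}
    return {name: 100 * ss / total for name, ss in parts.items()}

# Hypothetical toy data: two genotypes, two environments, two replicates.
toy = {
    ("g1", "e1"): [10.0, 11.0], ("g1", "e2"): [12.0, 13.0],
    ("g2", "e1"): [14.0, 15.0], ("g2", "e2"): [19.0, 20.0],
}
shares = variance_shares(toy)
for name, pct in shares.items():
    print(f"{name}: {pct:.1f}%")
```

A mixed-model estimate of variance components would be an equally plausible reading of the reported values; the sum-of-squares version is shown only because it is the simplest to follow.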
The BLUP values for each line were used to draw the boxplots and to conduct the phenotypic correlation analysis. The five parents differed greatly in grain morphology (Fig. 1). The common parent YZ had a moderate grain size, whereas CY had the largest GW, GD, GA, and GP, and YT had the largest GL compared with the other three parents. Accordingly, TGW was highest in CY (52.70 g) and YT (49.05 g), followed by YZ (38.24 g), YN (31.85 g), and HR (29.58 g) (Fig. 1). Wide variation was observed for all measured traits in the four RIL populations. Moreover, most of the traits in each RIL population appeared to be normally distributed, and strong transgressive segregation in both directions was observed. The trend of differences in mean values among the populations was consistent with the differences among their male parents (Fig. 1).
The pairwise phenotypic correlations between the eight measured traits are illustrated in Fig. 2. Correlations between the traits were significant at the level of p = 0.01. GA, GP, GL, GW, GS, and GD were positively correlated with one another (except for GS with GW), and the correlation coefficients ranged from 0.26 (GS with GD) to 1.00 (GA with GD). GR was negatively correlated with all other grain morphological traits except GW, with correlation coefficients ranging from −0.29 (with GD) to −0.99 (with GS). TGW was positively correlated with all the grain morphological traits except GR, especially with GA and GD, for which the correlation coefficient was 0.95 in both cases.
To investigate the environmental stability of the measured traits, the correlations between the seven environments were calculated for each trait (Fig. S1). For all the traits, the positive correlations between environments were significant at the level of p = 0.01, and most of the correlation coefficients were larger than 0.50. GL showed the best environmental stability, with correlation coefficients ranging from 0.70 to 0.92, whereas GW was the least stable, with correlation coefficients ranging from 0.25 to 0.79.
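The between-environment comparison can be sketched as a set of pairwise Pearson correlations over per-line trait values. The example assumes Pearson's r, which the text does not name explicitly; the helper names and the toy GL values for four lines in three environments are hypothetical.

```python
from itertools import combinations
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def pairwise_correlations(columns):
    """Correlation for every unordered pair of named columns."""
    return {
        (a, b): pearson(columns[a], columns[b])
        for a, b in combinations(columns, 2)
    }

# Hypothetical trait values of the same four lines in three environments.
toy_env = {
    "E1": [6.1, 6.4, 6.8, 7.0],
    "E2": [6.0, 6.5, 6.7, 7.2],
    "E3": [6.3, 6.2, 6.9, 6.8],
}
env_r = pairwise_correlations(toy_env)
for pair, r in env_r.items():
    print(pair, round(r, 2))
```

A trait whose pairwise coefficients stay uniformly high, as reported for GL, ranks the lines similarly in every environment; a wide spread of coefficients, as reported for GW, indicates rank changes between environments.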
Basic statistics of markers
Of the 14,643 high-quality SNP markers, 5881, 6483, and 2279 were located on sub-genomes A, B, and D, respectively. The number of markers per chromosome ranged from 112 on chromosome 4D to 1292 on chromosome 2B (Table S2 and Fig. 3). The coverage rate of the 14,643 SNP markers was 82.70% on average, ranging from 66.28% for chromosome 6D to 90.48% for chromosome 3A. The average marker spacing was 1.28 Mb, ranging from 0.60 Mb for chromosome 1A to 4.54 Mb for chromosome 4D. On average, the marker spacing for sub-genomes A, B, and D was 0.87 Mb, 0.85 Mb, and 2.14 Mb, respectively (Table S2).
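The marker counts above can be checked with a few lines of arithmetic, and the notion of average marker spacing can be illustrated on a toy chromosome. The sub-genome counts are taken from the text; the function `mean_spacing_mb` and the toy positions are hypothetical, and spacing is assumed here to be the mean gap between adjacent markers, which may differ from the definition used for Table S2.

```python
# Sub-genome marker counts as reported in the text.
markers = {"A": 5881, "B": 6483, "D": 2279}
total_markers = sum(markers.values())
share = {sg: round(100 * n / total_markers, 1) for sg, n in markers.items()}
print(total_markers, share)

def mean_spacing_mb(positions_bp):
    """Mean gap between adjacent markers, in Mb (assumed definition)."""
    pos = sorted(positions_bp)
    gaps = [b - a for a, b in zip(pos, pos[1:])]
    return sum(gaps) / len(gaps) / 1e6

# Hypothetical marker positions (bp) on one chromosome.
toy_positions = [1_000_000, 1_500_000, 3_000_000, 4_000_000]
print(mean_spacing_mb(toy_positions))
```

The three counts sum to the reported 14,643 markers, with sub-genome D carrying the smallest share, which is consistent with its larger average spacing.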
Population genetic structure
The entire nested association population comprised four RIL populations. Based on the LnP(D) values generated by STRUCTURE and the method of Evanno et al. (2005), k = 3 was recommended as the true number of groups, because the increase in LnP(D) became gradual beyond this value. A kinship analysis also suggested three distinct subgroups (Fig. S2). Group I consisted of 75 lines from the CY-RIL population and 90 lines from the YT-RIL population. Group II consisted of 116 lines, most of which were from the HR-RIL population. Group III comprised 88 lines, all of which were from the YN-RIL population. In the entire nested association population, 80.2% (296/369) of the lines did not show any admixture, and 6.0% (22/369) showed less than 20% admixture (Fig. S2B).
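The ΔK criterion of Evanno et al. (2005) can be sketched as follows. This version uses the mean LnP(D) over replicate runs at each k, that is, ΔK = |L(k+1) − 2L(k) + L(k−1)| / sd[L(k)]; the function name `delta_k` and the LnP(D) values are hypothetical and are not STRUCTURE output from this study.

```python
from statistics import mean, stdev

def delta_k(lnp):
    """Evanno-style delta K from replicate LnP(D) values.

    `lnp` maps k -> list of LnP(D) values from replicate runs.
    Delta K is defined only for k values that have both neighbours.
    """
    result = {}
    for k in sorted(lnp):
        if k - 1 in lnp and k + 1 in lnp:
            second_diff = abs(
                mean(lnp[k + 1]) - 2 * mean(lnp[k]) + mean(lnp[k - 1])
            )
            result[k] = second_diff / stdev(lnp[k])
    return result

# Hypothetical LnP(D) values: steep rise up to k = 3, gradual afterwards.
toy_lnp = {
    1: [-5000, -5010, -4990],
    2: [-4500, -4505, -4495],
    3: [-4000, -4002, -3998],
    4: [-3950, -3960, -3940],
    5: [-3930, -3900, -3960],
}
dk = delta_k(toy_lnp)
best_k = max(dk, key=dk.get)
print(dk, best_k)
```

In the toy values LnP(D) climbs steeply up to k = 3 and only gradually afterwards, so ΔK peaks at k = 3, the same pattern the text describes.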
QTL mapping
A total of 88 QTLs were identified for the eight grain morphological traits by combined analysis of the seven environments; they were located on all 21 wheat chromosomes except 2A, 3D, 4A, 4D, and 7D (Table S3). These QTLs were also detected in one to five environments by individual-environment analysis (Tables 1 and S4). For 64 of the 88 QTLs (72.7%), the most favorable alleles (i.e., those with the largest absolute additive effects in increasing grain weight or decreasing grain shape) were donated by the semi-wild cultivars (Tables S3 and S5): CY, YN, and YT contributed the most favorable alleles for 43, 0, and 21 QTLs, respectively. The exotic germplasm HR donated the most favorable alleles for 19 QTLs (21.6%). For the remaining five QTLs (5.7%), the most favorable alleles were from the common parent YZ.
Table 1 QTL clusters identified for grain weight or shape in wheat by combined analysis of a nested association population comprising four RIL populations evaluated in seven environments
For TGW, 14 QTLs were detected on chromosomes 1A, 1B, 2B, 3A, 3B, 6A, 6B, 7A, and 7B. For seven QTLs (qTGW-1A.1, qTGW-1A.2, qTGW-2B, qTGW-3A, qTGW-3B, qTGW-7A, and qTGW-7B.2), the most favorable alleles were donated by CY. The most favorable alleles of qTGW-1B.1, qTGW-1B.2, and qTGW-7B.1 were contributed by YT. For qTGW-1B.3, qTGW-6A, and qTGW-6B, the most favorable alleles were contributed by HR. The most favorable allele of qTGW-7B.3 was from the common parent YZ.
For GA, 13 QTLs were detected on chromosomes 1A, 1B, 2B, 3A, 5A, 7A, and 7B. The most favorable alleles of seven QTLs (qGA-1A.1, qGA-1A.2, qGA-1A.3, qGA-2B, qGA-3A, qGA-5A, and qGA-7A), three QTLs (qGA-1B.2, qGA-1B.1, and qGA-7B.1), two QTLs (qGA-1A.4 and qGA-1B.3), and one QTL (qGA-7B.2) were contributed by CY, YT, HR, and YZ, respectively.
For GP, 14 QTLs were identified on chromosomes 1A, 1B, 1D, 2B, 5A, 7A, and 7B. The most favorable alleles of seven QTLs (qGP-1A.1, qGP-1A.2, qGP-2B, qGP-5A, qGP-7A.1, qGP-7A.2, and qGP-7B.2), four QTLs (qGP-1B.1, qGP-1B.2, qGP-1D, and qGP-7B.1), two QTLs (qGP-1A.3 and qGP-1B.3), and one QTL (qGP-7B.3) were donated by CY, YT, HR, and YZ, respectively.
Eight QTLs for GS were identified on chromosomes 2D, 4B, 5A, 5B, 5D, and 7A. The alleles of CY, YT, and HR decreased GS for two QTLs (qGS-5B and qGS-5D), three QTLs (qGS-4B, qGS-7A.1, and qGS-7A.2), and three QTLs (qGS-2D, qGS-5A.1, and qGS-5A.2), respectively.
Nine QTLs for GL were identified on chromosomes 1A, 1B, 1D, 2B, 5A, 7A, and 7B. The most favorable alleles of six QTLs (qGL-1A.1, qGL-2B, qGL-5A, qGL-7A.1, qGL-7A.2, and qGL-7B), two QTLs (qGL-1B and qGL-1D) and one QTL (qGL-1A.2) were contributed by CY, YT, and HR, respectively.
Nine QTLs for GW were detected on chromosomes 1A, 3B, 5A, 6A, 6B, 6D, and 7B. The most favorable alleles of four QTLs (qGW-1A.1, qGW-1A.2, qGW-3B, and qGW-7B), four QTLs (qGW-6A.1, qGW-6A.2, qGW-6B, and qGW-6D), and one QTL (qGW-5A) were donated by CY, HR, and YZ, respectively.
In total, 14 QTLs for GD were detected on chromosomes 1A, 1B, 2B, 3A, 3B, 5A, 7A, and 7B. The most favorable alleles of eight QTLs (qGD-1A.1, qGD-1A.2, qGD-1A.3, qGD-2B, qGD-3A, qGD-3B, qGD-5A, and qGD-7A), three QTLs (qGD-1B.1, qGD-1B.2, and qGD-7B.1), two QTLs (qGD-1A.4 and qGD-1B.3), and one QTL (qGD-7B.2) were contributed by CY, YT, HR, and YZ, respectively.
For GR, seven QTLs were detected on chromosomes 2D, 4B, 5A, 5B, 5D, and 7A. The alleles of CY, HR, and YT increased GR for two QTLs (qGR-5B and qGR-5D), two QTLs (qGR-2D and qGR-5A), and three QTLs (qGR-4B, qGR-7A.1, and qGR-7A.2), respectively.
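The per-trait donor counts listed above can be cross-checked against the overall totals reported for the 88 QTLs. The counts in the sketch are transcribed from the text; only the variable names are introduced here.

```python
from collections import Counter

# Number of QTLs per trait for which each parent donated the most
# favorable allele, as listed in the text.
donors_by_trait = {
    "TGW": {"CY": 7, "YT": 3, "HR": 3, "YZ": 1},
    "GA":  {"CY": 7, "YT": 3, "HR": 2, "YZ": 1},
    "GP":  {"CY": 7, "YT": 4, "HR": 2, "YZ": 1},
    "GS":  {"CY": 2, "YT": 3, "HR": 3},
    "GL":  {"CY": 6, "YT": 2, "HR": 1},
    "GW":  {"CY": 4, "HR": 4, "YZ": 1},
    "GD":  {"CY": 8, "YT": 3, "HR": 2, "YZ": 1},
    "GR":  {"CY": 2, "HR": 2, "YT": 3},
}

per_trait = {trait: sum(c.values()) for trait, c in donors_by_trait.items()}
donor_totals = Counter()
for counts in donors_by_trait.values():
    donor_totals.update(counts)

n_qtl = sum(per_trait.values())
donor_pct = {p: round(100 * n / n_qtl, 1) for p, n in donor_totals.items()}
print(per_trait, dict(donor_totals), donor_pct)
```

The tallies reproduce the reported totals: 88 QTLs overall, with 43, 21, 19, and 5 most favorable alleles from CY, YT, HR, and YZ, respectively, and none from YN.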
QTL clusters
The 14 QTLs governing TGW were always co-located with QTLs for grain size traits, such as GA, GP, GD, GW, and GL, and were merged into 14 QTL clusters located on chromosomes 1A, 1B, 2B, 3A, 3B, 6A, 6B, 7A, and 7B (Table 1). The semi-wild relatives CY and YT contributed the most favorable alleles of seven and three QTL clusters, respectively, the exotic line HR donated the most favorable alleles of three clusters, and the most favorable alleles of the remaining cluster were from the common parent YZ. QW-1A.1 consisted of five QTLs for GA, GD, GP, GW, and TGW, and the most favorable alleles were from CY. The cluster QW-1A.2 comprised QTLs for GA, GD, GW, and TGW, and the most favorable alleles were also contributed by CY. Four QTL clusters, QW-1B.3, QW-1B.2, QW-7B.1, and QW-7B.3, affected GA, GD, GP, and TGW, and the most favorable alleles were donated by HR, YT, YT, and YZ, respectively. QW-1B.1 and QW-2B consisted of QTLs for GA, GD, GP, GL, and TGW, and the most favorable alleles came from YT and CY, respectively. The QTL cluster QW-3B harbored QTLs for GD, GW, and TGW, with the most favorable alleles contributed by CY. Three QTL clusters for GW and TGW, QW-6A, QW-6B, and QW-7B.2, were identified, and the most favorable alleles were from HR, HR, and CY, respectively. Two QTL clusters for GA, GD, and TGW, QW-7A and QW-3A.1, were detected, and the most favorable alleles of both were contributed by CY.
The QTLs affecting GS were co-located with QTLs for GR or GW and were merged into eight QTL clusters on chromosomes 2D, 4B, 5A, 5B, 5D, and 7A (Table 1). The alleles of HR decreased GS for QS-2D, QS-5A.1, and QS-5A.2. For QS-5B and QS-5D, the alleles decreasing GS were contributed by CY. The alleles of YT decreased GS for QS-4B, QS-7A.1, and QS-7A.2.
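One simple way to merge co-located QTLs into clusters is to group QTLs on the same chromosome whose physical intervals overlap. The sketch below illustrates this idea only; the text does not state the criterion actually used to define the clusters, and the function name, QTL names, and coordinates are all hypothetical.

```python
def cluster_qtls(qtls):
    """Group QTLs whose intervals overlap on the same chromosome.

    `qtls` is a list of (name, chromosome, start, end) tuples.
    Returns a list of clusters, each a list of QTL names.
    """
    clusters = []
    current, cur_chrom, cur_end = [], None, None
    for name, chrom, start, end in sorted(qtls, key=lambda q: (q[1], q[2])):
        if current and chrom == cur_chrom and start <= cur_end:
            # Overlaps the running cluster: extend it.
            current.append(name)
            cur_end = max(cur_end, end)
        else:
            # New chromosome or a gap: close the running cluster.
            if current:
                clusters.append(current)
            current, cur_chrom, cur_end = [name], chrom, end
    if current:
        clusters.append(current)
    return clusters

# Hypothetical QTL intervals (positions in Mb).
toy_qtls = [
    ("qTGW-x", "1A", 10, 30),
    ("qGA-x", "1A", 25, 40),
    ("qGW-y", "1A", 100, 120),
    ("qGS-z", "5B", 15, 35),
]
toy_clusters = cluster_qtls(toy_qtls)
print(toy_clusters)
```

Under an overlap rule like this, a TGW QTL and a grain size QTL sharing part of their interval fall into one cluster, while a distant QTL on the same chromosome forms its own.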
Validation and haplotype analysis for important QTLs
Haplotype analysis was conducted in a natural wheat population of 574 cultivars or lines to validate three important QTLs: qTGW-1B.1, qTGW-1B.2, and qTGW-1A.1 (Table S7). qTGW-1B.1, which affected GA, GD, GP, GL, and TGW, was identified in individual environments in the region 49,926,288–53,251,268 bp (~3.3 Mb) on chromosome 1B, which harbors 30 annotated genes (Tables 1 and S6). This QTL was consistently detected in four individual environments and showed excellent environmental stability (Table S4). Seven KASP markers were developed in the target region of qTGW-1B.1 for haplotype analysis in the natural wheat population (Tables S7 and S8). The eight haplotypes differed significantly at the significance level of p = 0.001 (Fig. 4), and haplotypes H7 and H8 showed significantly higher TGW than the other haplotypes. qTGW-1B.2, which affected GA, GD, GP, and TGW, was detected in three individual environments in the region 368,543,950–376,616,816 bp (~8.07 Mb) on chromosome 1B, which harbors 38 annotated genes (Tables 1 and S6). Four KASP markers were developed in the target region of qTGW-1B.2 for haplotype analysis, and the four haplotypes differed significantly at the significance level of p = 0.001 (Tables S7 and S8, Fig. 4). The TGW of haplotypes H3 and H4 was significantly higher than that of haplotypes H1 and H2. For qTGW-1A.1, which affected GA, GD, GP, GW, and TGW, one KASP marker was developed in the target region 1,339,530–3,556,253 bp (~2.2 Mb) on chromosome 1A, which harbors 39 annotated genes (Tables 1 and S6). Haplotype analysis revealed that the two haplotypes differed significantly at the significance level of p = 0.01 (Tables S7 and S8, Fig. 4), with haplotype H2 showing a significantly higher TGW than haplotype H1.
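The grouping step behind such a haplotype analysis can be sketched as follows: the alleles at the KASP markers of a region are concatenated into a haplotype label for each accession, and TGW is then summarized per haplotype. The function name, allele calls, and TGW values are hypothetical; the statistical test used to compare haplotypes is not specified in the text and is not reproduced here.

```python
from collections import defaultdict
from statistics import mean

def haplotype_means(samples):
    """Summarize a trait by marker haplotype.

    `samples` is a list of (alleles, trait_value) pairs, where `alleles`
    is a tuple with one allele call per marker in the target region.
    Returns {haplotype_label: (n_accessions, mean_trait_value)}.
    """
    groups = defaultdict(list)
    for alleles, value in samples:
        groups["".join(alleles)].append(value)
    return {hap: (len(vals), mean(vals)) for hap, vals in groups.items()}

# Hypothetical accessions genotyped at two markers, with TGW in g.
toy_samples = [
    (("A", "G"), 40.0),
    (("A", "G"), 42.0),
    (("T", "C"), 48.0),
    (("T", "C"), 50.0),
    (("A", "C"), 44.0),
]
hap_summary = haplotype_means(toy_samples)
for hap, (n, avg) in sorted(hap_summary.items()):
    print(hap, n, avg)
```

With a single marker this reduces to comparing two allele classes, as for qTGW-1A.1; with several markers, only the allele combinations actually present in the population appear as haplotypes.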