Introduction

Sesame (Sesamum indicum L.) is an ancient and worldwide oilseed crop with high oil quality (Bedigian et al. 1985; Bedigian and Harlan 1986; Ashri 1998; Zhang et al. 2019); it is widely grown in tropical and subtropical regions due to its high tolerance to high temperatures and barren soil. Sesame seeds are abundant in unsaturated fatty acids (oil content, ~ 55%), proteins (~ 20%), and antioxidant lignans (~ 0.5%), such as sesamin (Anilakumar et al. 2010; Zhang et al. 2019). Therefore, sesame seeds are widely used in food and medicines (Kanu et al. 2010; Zhang et al. 2019). Nevertheless, sesame is a crop with a low seed yield and relatively low economic efficiency (~ 0.2) (Zhang et al. 2019). Therefore, it is necessary to better understand the genetic and molecular bases underlying seed yield and its component traits to efficiently improve sesame seed yield.

Seed yield is one of the most important and complex quantitative traits of crops. Research has shown that seed yield is controlled by multiple loci in sesame (Zhang et al. 2019). In addition to population density, a dozen agronomic traits contribute to capsule and seed formation, such as capsule number per plant (CNP), seed number per capsule (SNC), seed size or weight, plant type (uniculm or branching), plant growth and development rhythm (e.g. budding stage), initial flowering stage (IFS), and final flowering stage (FFS). The ability of sesame to adapt to environments, such as resistance to diseases and tolerance to waterlogging and other abiotic stresses, also affects the final seed yield per plant (PSY) (Biabani and Pakniyat 2008; Muhamman et al. 2010; Akbar et al. 2011; Tabatabaei et al. 2011; Daniya et al. 2013; Jatothu et al. 2013; Zhang et al. 2019).

Previous research indicated that 1000-seed weight (TSW), capsule stem length (CSL), CNP, and plant height (PH) are the key seed yield component traits in sesame because of their direct effects on seed yield in a field unit (Solanki and Gupta 2000; Yol et al. 2010). Goudappagoudra et al. (2011) found that CNP was positively correlated with seed yield (r = 0.732), followed by seed number per plant (SNP), branch number per plant (BNP), PH, and TSW. Akbar et al. (2011) showed that the CNP, PH, capsule length (CL), and TSW significantly and positively contributed to seed yield. The BNP, seed weight per plant (SWP), and height to the first capsule (FCH) indirectly contributed to the final seed yield, and thus, were regarded as seed yield-related traits (Sarwar et al. 2005; Daniya et al. 2013). Miao et al. (2020) created a dwarf mutant ‘dw607’ (dwf1) with a significantly shorter internode length (decreased from 8.0 to 4.2 cm) by EMS-induced mutagenesis. Interestingly, the dwarf cultivar, Yuzhi Dw607, selected from the ‘dw607’ mutant, had a capsule node number of 62 and a seed yield of 3450 kg per hectare in Xinjiang province, China (Zhang et al. 2021). The decreased PH and reduced internode length resulted in higher seed yield and tolerance to lodging (Miao et al. 2020), suggesting that PH, internode length, and lodging tolerance are also seed yield-related traits that contribute to the final sesame seed harvest.

Genetic and molecular research on seed yield and its related traits provides the information necessary for understanding the relationships between seed yield and its related traits, as well as their genetic and molecular bases. The genetic map of sesame provides a good foundation and platform for sesame molecular breeding (Wei et al. 2009; Zhang et al. 2013). Recently, with the aid of the newly developed sesame reference genome and resequencing data, intensive quantitative trait locus (QTL) mapping and genome-wide association study (GWAS) research have been conducted on seed yield-related traits in sesame (Zhang et al. 2019). Several candidate genes for these traits have been identified (Wu et al. 2014; Wei et al. 2015, 2019; Zhou et al. 2018; Zhang et al. 2019; Miao et al. 2020; Mei et al. 2021). Wu et al. (2014) constructed a high-density genetic map using a recombinant inbred line (RIL) population consisting of 224 RILs based on restriction-site associated DNA sequencing (RAD-seq) data and identified 13 QTLs for seven seed yield-related traits. Of the three QTLs with an R2 value > 10.0%, Qtgw-11 (R2 = 7.7–12.3%) was responsible for TSW, Qgn-6 (R2 = 8.0–18.3%) for SNC, and Qcl-12 (R2 = 52.2–75.6%) for CL (Wu et al. 2014). Wei et al. (2015) performed a GWAS for 56 agronomic traits, including 14 yield-related traits, with a panel of 705 sesame accessions and identified 15 SNP/InDel markers (P < 10⁻⁶) associated with seed yield traits. Zhou et al. (2018) used GWAS on 39 seed yield-related traits with the same sesame accession panel across three environments, from which 6 environment-consistent QTLs and 76 QTLs, each associated with 2–5 seed yield-related traits and 48 candidate genes, were identified. Miao et al. (2020) cloned the sesame dwarf gene Sidwf1 using an RIL population derived from the dw607 (dwf1) mutant and the wild-type accession 15N41 (wt type) with the aid of a new method designated as cross-population association mapping. Similar to the green revolution genes Rht-B1 and Rht-D1 in wheat (Peng et al. 1999), SiDWF1 encodes a gibberellin receptor GID1B-like protein that controls internode length and PH in sesame. An SNP mutation in Sidwf1 resulted in a decrease in internode length and PH, thereby enhancing the seed yield of the mutant (Miao et al. 2020).

However, the basic characteristics of an ideal sesame cultivar with high seed yield remain unclear. Comprehensive knowledge of the relationship between seed yield and yield-related traits can help facilitate the breeding of high seed-yielding sesame cultivars. Thus, we systematically surveyed the phenotypic characteristics that could potentially contribute to seed yield and examined their relationships with plant seed yield (PSY) using a natural population of 369 sesame core accessions grown in five environments. We then conducted a GWAS on seed yield and its component traits in the same panel and screened candidate genes for seed yield-related traits. The findings of this study provide the genetic and molecular bases and knowledge necessary for breeding high-yielding sesame plants.

Materials and methods

Plant materials

A panel of 369 sesame core accessions collected from 16 countries (Table S1) was used in this study. Of these accessions, 318 were collected from 19 provinces in China and 51 from 15 other countries. These accessions are available at the Sesame Germplasm Reservoir of the Henan Sesame Research Center, Henan Academy of Agricultural Sciences, China (HSRC, HAAS, China).

Phenotypic data collection of seed yield and yield-related traits

Each of the 369 accessions was self-pollinated for six or more generations before phenotyping. The 369 accessions were grown in five environments during the 2011 and 2012 seasons in Henan Province: Pingyu (114.62°E, 32.97°N) (2011 and 2012), Xinyang (114.08°E, 32.13°N) (2012), and Yuanyang (113.96°E, 35.05°N) (2011 and 2012). A randomized block design with three replicates was applied for the field experiments (Edmondson 2005; Verdooren 2020). Each entry plot contained two rows, 5 m long, with a 40 cm distance between the rows. Field management followed the routine practices used in sesame field trials and breeding.

Plant seed yield (PSY) and its nine related traits, including PH, FCH, tip length (TL), CSL, CNP, SNC, TSW, initial flowering stage (IFS), and FFS, were examined as described by Zhang and Feng (2006) and Langham (2017a, b, 2018). Accordingly, plants with a branch number ≤ 2 were classified as “uniculm type”, and those with a branch number ≥ 3 as “branched type”. PH was measured from the ground to the stem tip for plants of uniculm accessions and to the main stem tip for plants of branching accessions. The CSL was the capsule-bearing stem length of a uniculm accession or the capsule-bearing main stem length of a branching accession. TSW was measured using the SC-G machine (WSeen Detection Technology Company, Hangzhou, China).

Phenotypic data statistics and heritability analysis

Phenotypic data for seed yield and related traits were analyzed using SAS 8.1 software (https://www.sas.com/zhcn/home.html). The effects of genotype, environment, and genotype-environment interaction (G × E) were modeled using PROC GLM as y = g + e + ge, where g is the genotype effect, e is the environment effect, and ge is the effect of the G × E interaction (SAS User’s Guide Version 8 1999; Piepho 2018; Cai et al. 2014). Broad-sense heritability (H2) was calculated as \({H}^{2}={\delta }_{\mathrm{g}}^{2}/\left({\delta }_{\mathrm{g}}^{2}+\frac{{\delta }_{\mathrm{ge}}^{2}}{n}+\frac{{\delta }_{\mathrm{e}}^{2}}{nr}\right)\), where \({\delta }_{\mathrm{g}}^{2}\) is the genetic variance; \({\delta }_{\mathrm{ge}}^{2}\) is the variance of the G × E interaction; \({\delta }_{\mathrm{e}}^{2}\) is the environment variance; n is the number of environments; and r is the number of replicates within each environment. \({\delta }_{\mathrm{g}}^{2}\), \({\delta }_{\mathrm{ge}}^{2}\) and \({\delta }_{\mathrm{e}}^{2}\) were estimated by analysis of variance (ANOVA), treating the environment as a random effect. Distributions of the yield and yield-related traits were tested for normality using the Shapiro–Wilk test at a significance level of P < 0.05 (SAS User’s Guide Version 8 1999). Multiple comparisons were conducted using the LSD method at the 0.05 and 0.01 levels, which are marked by * and **, respectively.
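As an illustration of the heritability formula above, the following minimal Python sketch evaluates H2 from variance components. The function name and the numerical variance components are hypothetical and are not taken from this study; only n = 5 environments and r = 3 replicates follow the field design described above.

```python
def broad_sense_heritability(var_g, var_ge, var_e, n_env, n_rep):
    """Return H2 = var_g / (var_g + var_ge / n + var_e / (n * r)).

    var_g  : genetic variance
    var_ge : genotype-by-environment interaction variance
    var_e  : third variance term of the formula (divided by n * r)
    n_env  : number of environments (n)
    n_rep  : number of replicates within each environment (r)
    """
    if n_env <= 0 or n_rep <= 0:
        raise ValueError("n_env and n_rep must be positive")
    denominator = var_g + var_ge / n_env + var_e / (n_env * n_rep)
    if denominator <= 0:
        raise ValueError("variance components must give a positive denominator")
    return var_g / denominator


# Hypothetical variance components, chosen only to make the arithmetic easy:
# 4 / (4 + 5/5 + 15/15) = 4 / 6
h2 = broad_sense_heritability(var_g=4.0, var_ge=5.0, var_e=15.0, n_env=5, n_rep=3)
print(f"H2 = {h2:.3f}")
```

With these assumed inputs, the interaction and residual terms each contribute one unit to the denominator, so two thirds of the phenotypic variance of the entry means is attributed to genotype.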

Correlation relationship analysis and path analysis of seed yield and yield-related traits

Pearson correlation coefficients were calculated and visualized using the ‘ggcorrplot’ package in R (https://www.r-project.org/). Path coefficient analysis was carried out as described by Dewey and Lu (1959) using SPSS 20.0 (https://www.ibm.com/cn-zh/analytics/spss-statistics-software).
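The path coefficient approach of Dewey and Lu (1959) partitions each trait-yield correlation into a direct effect and indirect effects through the other traits by solving the linear system R p = r, where R is the correlation matrix among the component traits and r is the vector of their correlations with yield. The sketch below shows this partitioning in plain Python; it is an illustration only, not the SPSS procedure used in this study, and the function names and the two-trait correlation values are hypothetical.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("correlation matrix is singular")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def path_analysis(r_xx, r_xy):
    """Return (direct, indirect) effects of component traits on yield.

    direct[i]      : path coefficient of trait i
    indirect[i][j] : effect of trait i on yield through trait j (0 when i == j)
    """
    direct = solve_linear(r_xx, r_xy)
    n = len(direct)
    indirect = [[r_xx[i][j] * direct[j] if i != j else 0.0 for j in range(n)]
                for i in range(n)]
    return direct, indirect


# Hypothetical example with two traits (values are NOT from this study).
r_xx = [[1.0, 0.5],
        [0.5, 1.0]]          # correlation between the two traits
r_xy = [0.7, 0.5]            # correlation of each trait with yield
direct, indirect = path_analysis(r_xx, r_xy)
for i, name in enumerate(["trait_1", "trait_2"]):
    total = direct[i] + sum(indirect[i])
    print(name, round(direct[i], 3), [round(v, 3) for v in indirect[i]], round(total, 3))
```

By construction, the direct effect of a trait plus its indirect effects sums back to its correlation with yield, which is a convenient check on any path table.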

Simple sequence repeat genotyping, population structure and principal component analysis

One hundred and twelve polymorphic simple sequence repeat (SSR) markers were chosen from the SSR marker dataset based on our previous research results (Yue et al. 2012; Zhang et al. 2012). Genomic DNA was extracted and SSR genotyping was performed as described by Li et al. (2014). Polymorphisms in SSR markers were analyzed using polymerase chain reaction (PCR), as described by Zhang et al. (2012). The polymorphic alleles of the 112 SSR markers among the 369 core accessions, their genotypic data, and the population structure results determined by Li et al. (2014) were used in this study (Fig. S6).

The population structure (Q = 2) calculated by Li et al. (2014) was used in the present study. Principal component analysis (PCA) of the 369 accessions was performed on the 112 SSR markers using the ‘Poppr’ package in R. The 369 accessions were grouped and labeled according to plant type and population structure.

Association mapping of seed yield and yield-related traits

TASSEL software (version 3.0) with a mixed linear model (MLM) (Q + K) (Laurentin and Karlovsky 2006; Bradbury et al. 2007) was used for association mapping. Population structure (Q) and kinship (K) of the accessions were calculated as previously described (Li et al. 2014). The global significance threshold was set at − log10 (P value) ≥ 3.0 (Hou et al. 2018). Manhattan and Q–Q plots were generated using the ‘qqman’ package in R software (Turner 2014). Phenotypic data were analyzed according to the genotypes of the screened candidate SSR markers. We used the reference genome of the sesame cultivar Yuzhi 11 to locate genes close to the associated markers.
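The significance threshold above, − log10 (P) ≥ 3.0, is equivalent to P ≤ 0.001. A minimal sketch of applying such a threshold to marker-level P values is given below; it is not part of TASSEL, and the function name, marker names, and P values are hypothetical.

```python
import math


def significant_markers(p_values, min_neg_log10=3.0):
    """Return {marker: -log10(P)} for markers that reach the threshold."""
    hits = {}
    for marker, p in p_values.items():
        if not 0.0 < p <= 1.0:
            raise ValueError(f"invalid P value for {marker}: {p}")
        score = -math.log10(p)
        if score >= min_neg_log10:
            hits[marker] = score
    return hits


# Hypothetical marker-trait P values (NOT results from this study).
p_values = {"marker_A": 0.0004, "marker_B": 0.02, "marker_C": 0.0009}
hits = significant_markers(p_values)
for marker, score in sorted(hits.items()):
    print(f"{marker}: -log10(P) = {score:.2f}")
```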

Results

Phenotypic variation of seed yield and yield-related traits in sesame

We phenotyped 369 sesame core accessions for PSY and nine yield-related traits (PH, FCH, TL, CSL, CNP, TSW, SNC, IFS, and FFS) in five environments in 2011 and 2012 to determine the characteristics of PSY and its relationships with seed yield-related traits (Table 1; Fig. S1). Each of the traits varied substantially and exhibited high diversity among the 369 accessions. PSY varied between 1.79 and 8.63 g, with an average of 5.32 g. CNP ranged from 28.8 to 94.02, with an average of 57.15. The Shapiro–Wilk normality test indicated that PSY, TL, FCH, CSL, CNP, and TSW were normally distributed (P = 0.86), whereas the other four traits did not fit a normal distribution (P < 0.01) (Fig. S1; Table 1).

Table 1 Phenotypic variation and broad-sense heritability of seed yield and nine related traits in the 369 sesame accessions

ANOVA showed that genotype (G) and environment (E) had significant effects on most traits (P < 0.01) (Table 1), suggesting their sensitivity to the environment. PH, FCH, and SNC showed medium-to-high broad-sense heritabilities (H2) of 62%, 72%, and 72%, respectively. PSY, TSW, and TL had lower H2 values, ranging from 14 to 23%, suggesting that they were not stably inherited under the influence of the environment.

Relationships of seed yield and yield-related traits in accessions with different plant types

For the 369 accessions, we correlated PSY with its related traits (Fig. S2). PSY was positively and significantly correlated with PH, CSL, CNP, FFS, and TSW (r = 0.173–0.669, P < 0.01), with the strongest correlation between PSY and CNP (0.669). In contrast, PSY was significantly negatively correlated with IFS (− 0.160) and FCH (− 0.141). No significant correlations were found between PSY and the two remaining traits, TL and SNC.

To check whether plant type (i.e., uniculm or branching type) affects the growth habit and final seed yield in sesame, we further classified the 369 core accessions into two groups according to plant type and carried out a correlation analysis between PSY and its related traits in each group (Fig. 1). For the 210 uniculm accessions, PSY was positively and significantly correlated with CNP, CSL, PH, FFS, and TSW, with correlation coefficients of 0.754, 0.594, 0.272, 0.214, and 0.206, respectively (P < 0.01). For the 159 branching accessions, PSY was shown to significantly correlate with CNP, CSL, TSW, and IFS with correlation coefficients of 0.581, 0.549, 0.335, and − 0.205, respectively (P < 0.01) (Fig. 1). These results confirmed that CNP, CSL, and TSW were key traits affecting seed yield in both uniculm and branching accessions.

Fig. 1 Correlation analysis of seed yield and nine yield-related traits for uniculm and branching accessions separately. a Uniculm group. b Branching group. Correlation coefficients vary from 1.0 (green) to − 1.0 (brown). * and ** indicate significance at the 0.05 and 0.01 levels, respectively. PH, Plant height. FCH, Height to the first capsule. TL, Tip length. CSL, Capsule stem length. CNP, Capsule number per plant. TSW, Thousand seed weight. SNC, Seed number per capsule. IFS, Initial flowering stage. FFS, Final flowering stage. PSY, Plant seed yield. (Color figure online)

Path coefficient analysis showed that CNP, CSL, TSW, SNC, IFS, and FFS had direct effects on PSY (Table 2). CNP had a strong positive direct effect on PSY (0.703); its indirect effects through CSL and FFS were positive but very small. SNC had a positive and high direct effect on PSY (0.305) but also a negative indirect effect through CNP (− 0.260). CSL had a positive direct effect on PSY and a strong positive indirect effect via CNP (0.325). IFS showed a negative direct effect (− 0.198) on PSY, and its indirect effects through CNP, SNC, and FFS were positive. When the plant types were considered separately, in the uniculm group CNP had the highest positive direct effect (0.755), followed by CSL (0.281), SNC (0.254), TSW (0.202), and FFS (0.109) (Table S2). PH had a negative direct effect on PSY, whereas its indirect effects through CNP, CSL, and FFS were positive. In the branching group, five traits (CNP, TSW, SNC, IFS, and PH) showed high direct effects on PSY. CNP again had the greatest direct effect on PSY (path coefficient = 0.726), whereas IFS had the highest negative direct effect (− 0.394) (Table S3). PH had a positive direct effect on PSY, and its indirect effects through SNC and IFS were negative.

Table 2 Path effect analysis of yield component traits on seed yield in 369 sesame accessions

The traits with highly significant positive associations with PSY and direct positive effects on PSY were CNP, CSL, TSW, PH, and FFS. Of these, CNP had the highest direct influence, whereas CSL had the highest indirect influence.

Phenotypic variation of seed yield and yield-related traits in population structure

Li et al. (2014) grouped the 369 sesame core accessions into two subgroups: one (Q1) consisting of 243 accessions and the other (Q2) consisting of 126 accessions. Differences in the traits between the two subgroups were estimated. Phenotypic values of PH, FCH, TL, CNP, IFS, and FFS were higher (P < 0.01) and the value of TSW was lower (P < 0.01) in the Q1 subgroup than in the Q2 subgroup. For example, the mean PH of group Q1 (144.45 cm) was greater than that of group Q2 (130.68 cm), and the final flowering stage of group Q1 (77.56 d) was later than that of group Q2 (73.68 d). The CSL and SNC phenotypes were not significantly different between the two subgroups (Fig. S3).

PCA was performed to explore the relationships among the 369 accessions based on 112 SSR markers (Fig. 2). The results showed no significant correlation between geographic origin and the subgroups. We then labeled the 369 accessions in the PCA according to plant type and found that they fell into two groups, with uniculm accessions grouped together (blue) and branching accessions grouped together (orange) (Fig. 2, 3). To determine whether plant type affected the phenotypic variation in PSY and its related traits, we compared the uniculm and branching accessions (Fig. 4). FCH, TL, and CSL were higher in the uniculm accessions than in the branching accessions (P < 0.01), whereas FFS was higher in the branching accessions than in the uniculm accessions.

Fig. 2 PCA plots of 369 accessions. Dots in blue and red represent uniculm and branching accessions, circle and triangle represent Q1 and Q2 (population structure), respectively. (Color figure online)

Fig. 3 Distribution of the SSR markers significantly associated with seed yield and yield-related traits in the sesame chromosomes. Of the 112 SSR markers used in this study, 110 are located on the 13 sesame chromosomes. Right bar on chromosome indicates SSR marker. Left bar on chromosome indicates the physical location of each SSR marker in the sesame genome. Thirteen SSR markers significantly associated with the yield-related traits are listed on chromosomes in red. (Color figure online)

Fig. 4 Box plot comparison of plant seed yield and yield-related traits between plant types. a Plant height (PH); b Height to the first capsule (FCH); c Tip length (TL); d Capsule stem length (CSL); e Capsule number per plant (CNP); f Thousand seed weight (TSW); g Seed number per capsule (SNC); h Initial flowering stage (IFS); i Final flowering stage (FFS); j Plant seed yield (PSY). U, Uniculm accessions; B, Branching accessions. The significance of the difference in each trait between the two subgroups was estimated by one-way analysis of variance (P < 0.01). The horizontal bar in each box indicates the median. Bars above and below the box indicate the maximum and minimum values, respectively. * and ** indicate significance at the 0.05 and 0.01 levels, respectively

Association mapping and marker effect of seed yield and yield-related traits

We performed association mapping for PSY and its related traits using the 369 core accessions and 112 SSR markers. The adjusted significance threshold (− log10 (P)) was set at 3.0. In total, 27 genomic loci were identified as associated with PSY and its related traits (P < 0.001) (Table S4). Among them, two SSR loci (Hs4209 and Hs345) were significantly associated with PSY but explained little of the phenotypic variance (R2 = 6.54% and 5.97%, respectively). For PH, Hs4089 and Hs1775 were detected and explained 12.50% and 8.16% of the phenotypic variance, respectively (Table S4). For FCH, three SSR markers were detected, explaining 4.61%, 6.94%, and 7.43% of the phenotypic variance, respectively. For TL, six SSR markers were detected, with explained phenotypic variance ranging from 4.96 to 6.68%. For CSL, only one locus (Hs4089) was detected, explaining 9.44% of the phenotypic variance. For CNP, six associated loci were detected, explaining between 4.79 and 11.73% of the phenotypic variance; Hs345 explained the largest portion of the variation in CNP (R2 = 11.73%) in the 2011 Pingyu environment (Fig. S4). For SNC, six associated loci were detected, and the explained phenotypic variance ranged from 3.90 to 8.66%. Hs635 was detected in two of the five environments (Fig. S5). For TSW, Hs395 and Hs4325 were detected, explaining 3.85% and 9.20% of the phenotypic variance, respectively. For IFS, five markers were found, and the explained phenotypic variance ranged from 5.43 to 8.81%. For FFS, only one marker (Hs270) was detected, explaining 6.33% of the phenotypic variance. Interestingly, six SSR loci were each associated with several traits. For example, Hs4082 was associated with CNP, TL, and SNC; Hs233 was associated with PH and CSL; and Hs270 was associated with IFS and FFS.

To reveal whether plant type affected the association mapping results, we conducted association mapping of PSY and its related traits using the uniculm and the branching accessions separately (Table S5 and S6). In the uniculm accessions, 20 unique SSR loci were associated with PH (2), FCH (2), CNP (2), TL (1), CSL (2), SNC (4), TSW (1), IFS (2), and FFS (2) (Table S5), explaining 6.96–19.08% of the phenotypic variance. Hs4089 explained the largest portion of the phenotypic variation in PH. Two markers (Hs4089 and Hs464) were associated with both PH and CSL, Hs485 was associated with IFS and TL, and Hs618 was associated with IFS in two environments.

When the branching accessions were analyzed by GWAS, 14 unique SSR loci were associated with FCH (4), CNP (2), CSL (1), SNC (3), TL (2), and TSW (2) (Table S6), explaining 9.22–24.91% of the phenotypic variance. The Hs345 and Hs1385 loci explained the largest proportion of the variation in CNP (R2 = 24.91%) and FCH (R2 = 20.34%), respectively, whereas the Hs425 locus had the least effect on the variation in FCH (R2 = 9.22%). Hs376 was associated with both FCH and SNC.

Two SSR loci were detected in both plant types. Three SSR loci were detected in both the uniculm accessions and the full panel of 369 accessions: in both groups, Hs4089 was associated with PH and CSL, whereas Hs4082 and Hs376 were associated with SNC and FCH, respectively. Four SSR loci were detected in both the branching accessions and the full panel of 369 accessions: Hs345 was associated with CNP; Hs233 and Hs376 with FCH; and Hs250 with SNC.

To further characterize the above markers, we analyzed the allelic effects of the six markers associated with yield and yield-related traits in the five environments (Table 3). Hs4082 had the largest effect on the variation of SNC. Accessions carrying Hs4082-1:1 (55.05 ± 9.31), Hs4082-1:2 (52.08 ± 6.52), and Hs4082-2:2 (50.84 ± 7.41) differed in SNC across the 369 accessions (Fig. 5d). Likewise, accessions carrying Hs345-1:1 (58.61 ± 11.11), Hs345-2:2 (54.13 ± 11.27), and Hs345-1:2 (61.32 ± 11.33) differed in CNP across the 369 accessions (Fig. 5b). Thus, specific alleles had a negative or positive effect on yield and yield-related traits.
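Allelic effects of the kind reported above (mean ± standard deviation of a trait for each marker genotype) can be summarized by grouping accessions by genotype. The following sketch shows one way to do this in Python; the function name, genotype calls, and trait values are hypothetical and do not reproduce the data of this study.

```python
from collections import defaultdict
from statistics import mean, stdev


def allele_effects(genotypes, phenotypes):
    """Return {genotype: (mean, sd, n)} of a trait for each marker genotype.

    Accessions with a missing genotype or phenotype (None) are skipped.
    The sample standard deviation is reported as 0.0 when n == 1.
    """
    groups = defaultdict(list)
    for genotype, value in zip(genotypes, phenotypes):
        if genotype is None or value is None:
            continue
        groups[genotype].append(value)
    return {
        g: (mean(v), stdev(v) if len(v) > 1 else 0.0, len(v))
        for g, v in groups.items()
    }


# Hypothetical genotype calls and trait values (NOT data from this study).
genotypes = ["1:1", "1:1", "2:2", "2:2", "1:2", None]
phenotypes = [50.0, 54.0, 60.0, 62.0, 55.0, 58.0]
summary = allele_effects(genotypes, phenotypes)
for genotype in sorted(summary):
    m, sd, n = summary[genotype]
    print(f"{genotype}: {m:.2f} ± {sd:.2f} (n = {n})")
```

Comparing the group means in such a summary indicates which allele is associated with higher or lower trait values; a formal comparison would still require a significance test such as the one-way ANOVA used for Fig. 4.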

Table 3 Allele effects of six markers associated with yield and yield-related traits under five environments and in different plant types

Fig. 5 Phenotypic variation of alleles of the four markers associated with yield and yield-related traits under five environments. a Phenotypic variation of CNP among marker Hs1703 alleles; b phenotypic variation of CNP among marker Hs345 alleles; c phenotypic variation of SNC among marker Hs635 alleles; d phenotypic variation of SNC among marker Hs4082 alleles

Comparative genome analysis

Finally, we aligned the 13 non-redundant SSR markers associated with PSY and its related traits to the sesame var. Yuzhi 11 reference genome (version 2.0) (Zhang et al. 2016; Zhao et al. 2018) (Fig. 3). The 13 markers were located on 8 sesame chromosomes (Table 4); four of these markers were located in gene regions and nine in intergenic regions. The Hs1703 locus is close to SiACS8, which is associated with CNP (Wei et al. 2015; Zhou et al. 2018).

Table 4 Thirteen candidate markers identified among the SSR marker loci associated with seed yield and related traits in sesame

Discussion

Key component traits to seed yield

Seed yield results from interactions among numerous traits, genotypes, and the environment (Pathirana 1995; Haruna et al. 2012; Agrawal et al. 2017). In the present study, PH, FCH, CSL, CNP, TSW, IFS, and FFS were likely the key traits for seed yield in the sesame population. The results of the correlation analysis suggest that an increase in PH, CNP, CSL, TSW, and FFS or a decrease in FCH and IFS is associated with an improvement in PSY. In previous studies, PH, CSL, CNP, and TSW were regarded as the key components of seed yield (Liu et al. 1980; Sumathi et al. 2007; Gnanasekaran et al. 2008; Banerjee and Kole 2009; Yol et al. 2010; Sumathi and Muralidharan 2010; Gangadhara et al. 2012; Ibrahim and Khidir 2012). IFS showed a negative and significant correlation with PSY, which is consistent with the study by Akbar et al. (2011). In contrast, Ibrahim and Khidir (2012) found a significant positive correlation between IFS and PSY in 220 F5 families derived from ten sesame. This discrepancy may be due to differences in germplasm resources and populations. FFS showed a positive and significant correlation with PSY, suggesting that a longer flowering period may increase PSY.

Path coefficient analysis provides a more realistic picture of the relationship, as it considers the direct and indirect effects of the variables by partitioning the correlation coefficients. The results showed that CNP, CSL, TSW, SNC, IFS, and FFS were directly associated with PSY (Table 2). CNP had the highest positive direct effect on PSY, followed by SNC, whereas CSL and FFS had a small direct effect but a higher indirect effect through CNP on PSY. In previous research, CNP, TSW, and SNC have been shown to have strong and direct effects on PSY (Shim et al. 2001; Azeez and Morakinyo 2011; Navaneetha et al. 2019). Therefore, CNP, CSL, TSW, and FFS were the main traits affecting sesame yield, and these traits can be considered as selection criteria for improving sesame yield.

Effects of plant type on seed yield and component traits

Plant type is one of the most important traits in sesame, because it determines plant architecture and affects seed yield and cultivation practices (Kobayashi 1986; Van Zanten 2001; Mei et al. 2017). Brar and Ahuja (1979) reported that the branching habit (plant type) is monogenically controlled and that the monostem (unbranched) is recessive. In China, most cultivars are of the uniculm type, which allows for higher plant density under high input conditions and results in a high seed yield (Van Zanten 2001). In Africa and Latin America, most cultivars are branched, have soft stems and reduced plant height, and are thus easier to cut (Beech and Imrie 2001). Moreover, branching cultivars are highly adapted to input conditions and space. The present study is the first to investigate the interactions of plant type with seed yield and related traits using the uniculm and branching accession groups separately (Fig. 1; Fig. S3). CNP, CSL, and TSW were shown to be the three common key component traits of seed yield for both uniculm and branching accessions, and therefore, are the most important component traits for sesame seed yield. PH and FFS were also the key traits for uniculm accessions, whereas IFS was also the key component trait for branching accessions. Path analysis of the uniculm group showed similar results (Table 2; Table S2 and S3). Associations between the ten traits and plant type may be caused not only by genetic differentiation, but also by the adaptation of plant type to distinct environments. Thus, plant type should be considered to improve seed yield and related traits during sesame breeding (Daniya et al. 2013).

For ideal sesame cultivars with high yield and adaptation to mechanization, some sesame breeders and scientists suggest that the plant height of a sesame cultivar should be 120–130 cm, with 82 cm of reproductive stem and 28 node pairs of capsules (Zhang et al. 2021). The above results suggest that it would be efficient to breed ideal sesame cultivars with uniculm plant types and high seed yields by increasing CSL, CNP, TSW, and/or FFS. To breed high-yield cultivars with branching plant types, it may be efficient to increase CSL, CNP, and/or TSW, as well as to decrease IFS. Interestingly, Miao et al. (2020) reported a high-seed-yielding dwarf sesame cultivar, Yuzhi Dw607, with shorter internode length and shorter PH. Compared to the wild type, the plant height of this cultivar was decreased to 1.1–1.6 m, while its CNP and seed yield were greatly increased under sufficient nutrition and watering (Zhang et al. 2019; Miao et al. 2020). These results provide new ideas for breeding sesame with desirable plant types and high seed-yield potential.

Key SSR markers for seed yield and yield-related component traits

The genetic bases of yield-related traits have been thoroughly dissected in many crops by QTL mapping (Xing and Zhang 2010; Li et al. 2011). Several QTLs with major effects have been identified using map-based cloning in sesame (Miao et al. 2020; Wei et al. 2019). In this study, the genetic bases of ten yield-related traits were analyzed using association mapping based on phenotypic data collected in five environments over two years (2011 and 2012). Thirteen markers were repeatedly detected (P < 0.001; Table 3; Table S5 and S6), suggesting that the QTLs associated with these markers were insensitive to the growing environment (Zorić et al. 2012). Nine markers colocalized with QTLs identified in previous research (Table 4; Li et al. 2014; Wei et al. 2015; Zhou et al. 2018). However, owing to a lack of common markers between the genetic maps, the other marker-trait associations could not be aligned with previous QTLs. Loci that were detected in more than two environments or were consistent with QTLs identified in previous research would be very useful for marker-assisted selection of yield-related traits. Fine mapping of such chromosomal regions would help identify candidate genes responsible for the natural variation of these yield-related traits.

One SSR marker, Hs635, linked to SNC, was located on chromosome SiChr.3. The SNC of accessions with the 1:1 genotype (50.63 ± 6.77) was much lower than that of accessions with the 2:2 genotype (56.27 ± 10) (Fig. 5c). Hs233, which is linked to four yield-related traits including FCH and IFS, is located on chromosome SiChr.10. The accessions with Hs233-3:3 had low FCH (49.54 ± 11.31 cm). Hs1703 was associated with CNP, and accessions carrying the 1:1 (61.15 ± 10.63), 1:2 (57.92 ± 11.97), and 2:2 (56.53 ± 11.63) genotypes differed in CNP (Fig. 5a). The Hs1703 locus on chromosome SiChr.11 is close to the gene SiACS8, which is involved in CNP (Wei et al. 2015; Zhou et al. 2018). These results provide an effective way to improve yield-related traits in sesame cultivars and indicate their potential for application in molecular breeding.

Conclusion

We systematically explored the association between yield and nine yield-related traits in 369 sesame accessions collected worldwide using 112 SSR markers. CNP, CSL, and TSW were significantly and positively correlated with PSY and were the key component traits in both plant types. Path analysis showed similar direct effects. Thirteen SSR markers were significantly associated with nine seed yield-related traits using the MLM method, and three SSR markers were repeatedly detected in two environments. These results provide a basis for understanding the traits that influence PSY and for marker-assisted selection (MAS) breeding in sesame. In the future, we plan to develop or screen additional SSR markers or genes to support the molecular breeding of high-yield sesame cultivars.