High-density linkage map reveals QTL for Type-I seed coat cracking in RIL population of soybean [Glycine max (L.) Merr.]

Seed coat cracking (SCC), particularly Type-I irregular cracking, is critical in determining the appearance quality and commercial value of soybean seeds. The objective of this study was to identify quantitative trait loci (QTLs) for SCC using a high-density genetic map. One hundred sixty-seven recombinant inbred lines (RILs) developed from a cross between Uram (SCC-resistant) and Chamol (SCC-susceptible) were evaluated for SCC over 2 years (2016–2017). The QTL analysis identified 12 QTLs located on chromosomes 2 (D1b), 6 (C2), 8 (A2), 9 (K), 10 (O), 12 (H), 19 (L), and 20 (I). Of the 12 QTLs, qSC2-1, qSC9, qSC10-1, qSC10-2, and qSC12 were novel, and the other seven (qSC2-2, qSC2-3, qSC6, qSC8, qSC19-1, qSC19-2, and qSC20) co-localized with previously identified QTLs. The mean SCC of the RILs in the early maturity group was significantly higher than that of the late maturity group, suggesting an association between SCC and maturity loci. In addition, although 10 QTLs were located far from the maturity loci (E1, E3, E4, E7, and E10), qSC10-1 and qSC10-2 co-localized with the maturity locus E2. The results of this study provide useful genetic information on SCC that could be used in breeding programs for SCC resistance.


Introduction
Soybean [Glycine max (L.) Merr.] is one of the major field crops cultivated globally. Because soybean seed is rich in protein and oil, it is used for diverse purposes such as food, feed, fuel, and other industrial applications (Masuda and Goldsmith 2009). In a few Asian countries, including Korea, several whole-seed-based soybean dishes are popular and are part of the traditional cuisine. Therefore, the appearance quality of soybean seed is considered an important factor in its commercial value.
Seed coat cracking (SCC) is one of the critical traits in determining the visual quality of seed. SCC can increase the likelihood of seed splitting, damage, and pathogen infection. SCC also decreases seed germination and emergence when seeds are planted (Yaklich and Barla-Szabo 1993). SCC can be classified into two types: Type-I is irregular cracking of the seed coat, whereas Type-II is net-like cracking of the seed coat (Liu 1949). Type-II seeds are sometimes produced and sold in local markets because of their unique seed coat patterns, while Type-I seeds have a significantly decreased commercial value due to the irregular cracking.
Type-I cracking results from the separation of the epidermal (palisade cells) and hypodermal (hourglass cells) tissues, which exposes the underlying parenchyma tissue (Yaklich and Barla-Szabo 1993). SCC may be induced by exposure to chilling temperatures (10–18°C) at the flowering stage (Takahashi 1997). In previous studies, the I locus (responsible for the distribution of seed coat color), the T locus (responsible for pubescence and seed coat color), and the E1 and E5 loci (responsible for flowering and maturity) were found to suppress SCC at low temperatures (Takahashi 1997; Takahashi and Abe 1999), whereas the E2 and T loci were found to induce SCC under pod-removal treatment.
To evaluate the SCC of different soybean genotypes, SCC is promoted by artificial methods such as pod removal, drying of imbibed seeds, and application of ethychlozate (an ethylene-generating reagent). The conventional approaches for screening SCC-resistant lines are time-consuming and labor-intensive because of the multiple steps involved in the evaluation, the complicated genetic backgrounds, and the interaction between genetic and environmental effects (Ha et al. 2012). Recent advances in sequencing and genotyping technologies have facilitated genetic studies of many complex traits in soybean, such as seed fat, protein, seed size, and seed starch content (Ha et al. 2014; Asekova et al. 2016; Dhungana et al. 2017; Kulkarni et al. 2016, 2018). For SCC, Oyoo et al. (2010) identified two QTLs, cr1 on chromosome 2 (D1b) and cr2 on chromosome 7 (M), using a mapping population of 95 recombinant inbred lines (RILs) genotyped with 1015 simple sequence repeat (SSR) markers. In another study, Ha et al. (2012) studied QTLs, epistatic effects, and QTL-by-environment interactions for SCC in a population of 117 RILs genotyped with 138 SSR markers, and identified 10 QTLs. Of the 10 QTLs, three (qSCC2-1, qSCC9, and qSCC20) were identified in more than two environments. Saruta et al. (2019) identified the QTL qScr20-1 on chromosome 20 (I) using 172 RILs genotyped with 264 SSR markers.
For a comprehensive understanding of the genetic basis of SCC in soybean, it is necessary to identify QTLs using different genetic backgrounds across various environments. In the present study, we evaluated a mapping population of 167 RILs across two environments and identified QTLs associated with SCC using a high-density linkage map constructed with 5179 SNP markers (Kang et al., unpublished). The investigation of QTLs and phenotypic variation can expand knowledge of SCC, specifically Type-I irregular cracking, in soybean.

Plant materials and growing conditions
A mapping population of 167 RILs, derived from a cross between SCC-resistant Uram (Ko et al. 2016) and SCC-susceptible Chamol (Ko et al. 2018), was developed from 2012 to 2017. Figure 1 shows the appearance of irregular cracking and normal seed coat of Chamol and Uram. Uram is a late-maturing cultivar, whereas Chamol is an early-maturing cultivar. Uram grows taller and has a higher-positioned first pod than Chamol. However, both parental cultivars have white pubescence. In 2012, the female parent Uram was crossed with the male parent Chamol. The F1 seeds were planted at the Daegu Experiment Station, NICS, RDA (35°90′N, 128°44′E, Korea) in 2013. In the subsequent year (2014), the F2 population was planted at the same location. One hundred sixty-seven plants derived from the F2 population were advanced from F3 to F5 by the single seed descent method at the Hung Loc Agricultural Center (10°56′N, 107°04′E, Vietnam) in 2015. The F5:6 RILs were planted at the Daegu Experiment Station over 2 years (2016 and 2017) in a randomized block design with two blocks. Planting dates were June 28 in 2016 and June 29 in 2017. The RILs were grown in black vinyl-mulched rows 2 m long and spaced 60 cm apart. Seeds were sown manually with 15 cm between hills, and plants were thinned to one seedling per hill. Compost (10 ton ha⁻¹) and chemical fertilizers (N-P-K: 30-30-34 kg ha⁻¹) were applied during field preparation.

Evaluation of seed coat cracking
The 167 RILs and parents planted in 2016 and 2017 were harvested at maturity and evaluated for SCC. One hundred seeds were randomly collected in triplicate from each plot, and the number of irregularly cracked (Type-I) seeds was counted and expressed as the percentage of cracked seeds.
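As an illustration of this scoring, the sketch below converts triplicate counts of cracked seeds per 100-seed sample into a plot-level percentage. It assumes that the three samples are simply averaged; the function name and the example counts are hypothetical and are not data from this study.

```python
from statistics import mean

def percent_cracked(cracked_counts, sample_size=100):
    """Mean percentage of Type-I cracked seeds across replicate samples.

    Assumption: replicate samples from one plot are averaged with equal weight.
    """
    if not cracked_counts:
        raise ValueError("at least one replicate count is required")
    for count in cracked_counts:
        if not 0 <= count <= sample_size:
            raise ValueError("count must lie between 0 and the sample size")
    return mean(100.0 * count / sample_size for count in cracked_counts)

# Hypothetical triplicate counts for one plot (not data from the study).
plot_score = percent_cracked([12, 9, 15])
```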

Statistical analysis
Analysis of variance (ANOVA) was conducted and frequency distributions were obtained using R Studio (Ver 1.1.419). The descriptive statistical parameters (mean, minimum, maximum, median, standard deviation (SD), variance (VAR), coefficient of variation (CV), kurtosis, and skewness) were generated using Microsoft Excel 2016. The environment, genotype, and their interaction were considered fixed effects, and the broad-sense heritability (h²) was estimated from ANOVA using the following formula: where 'y', 'g', and 'r' are the numbers of years, genotypes, and replications, respectively; σ²g, σ²gy, and σ²e are the variance components for genotype, genotype-by-environment interaction, and error, respectively (Toker 2004; Kulkarni et al. 2017).
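A commonly used ANOVA-based form of this estimator, written with the variance components for genotype, genotype-by-year interaction, and error, is sketched below. It is given as an assumption and may differ in detail from the expression the authors used.

```latex
h^2 = \frac{\sigma_g^2}{\sigma_g^2 + \dfrac{\sigma_{gy}^2}{y} + \dfrac{\sigma_e^2}{r\,y}}
```

In this form, increasing the number of years (y) or replications (r) shrinks the interaction and error terms in the denominator, so the estimate refers to genotype means over the whole trial rather than to single plots.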

Linkage mapping and QTL analysis
Young trifoliate leaves from a single plant of each F6 line derived from an F5 plant were collected and used for DNA extraction. The DNA was extracted using the QIAGEN DNeasy® Plant Mini Kit (Qiagen Sciences Inc., Germantown, MD, USA). The extracted DNA was genotyped with the 180K Axiom® SoyaSNP array (Lee et al. 2015) and scanned with a GeneTitan® Scanner (Affymetrix, Santa Clara, CA, USA).
The genetic linkage map was constructed from 180,375 genome-assigned SNPs, after excluding 586 SNPs in scaffold regions from the total of 180,961 SNPs. A total of 20,046 SNPs (about 11% of the genome-assigned SNPs) showed polymorphism between the parental cultivars. The low polymorphism between the parental lines might be due to the reduced genetic diversity among soybean cultivars that resulted from domestication and the development of commercial varieties (Li et al. 2013; Achard et al. 2020). The genetic map construction and QTL analysis were performed using the polymorphic markers in QTL IciMapping Ver. 4.1 (Meng et al. 2015; Wang et al. 2016). The polymorphic markers were subjected to the Binning function of IciMapping with thresholds for missing rate (5%) and segregation distortion (P < 0.001). The mapping options were set as follows: grouping at a logarithm of odds (LOD) of 3.0, 'nnTwoOpt' ordering, and a window size of five for the sum of adjacent recombination frequencies (SARF). Kosambi's mapping function was used to transform recombination frequencies into centimorgan (cM) distances (Kosambi 1943).
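Kosambi's mapping function can be written out directly. The following is a minimal sketch of the transformation and its inverse; the function names are hypothetical, and the code illustrates the formula only, not the internal implementation of QTL IciMapping.

```python
import math

def kosambi_cm(r):
    """Map distance in cM for a recombination frequency r (0 <= r < 0.5).

    Kosambi (1943): d = 25 * ln((1 + 2r) / (1 - 2r)) centimorgans.
    """
    if not 0.0 <= r < 0.5:
        raise ValueError("recombination frequency must be in [0, 0.5)")
    return 25.0 * math.log((1.0 + 2.0 * r) / (1.0 - 2.0 * r))

def kosambi_r(d_cm):
    """Inverse transformation: recombination frequency for a distance in cM."""
    if d_cm < 0.0:
        raise ValueError("map distance must be non-negative")
    return 0.5 * math.tanh(2.0 * d_cm / 100.0)

# For small r the distance is close to 100 * r; interference matters as r grows.
for r in (0.01, 0.10, 0.30):
    print(f"r = {r:.2f} -> {kosambi_cm(r):.2f} cM")
```

For closely spaced markers, as in a high-density map, the transformed distance is nearly proportional to the recombination frequency.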
The QTLs were detected using inclusive composite interval mapping of additive QTLs (ICIM-ADD) with a scanning step of 1.0 cM and a LOD threshold determined by 1,000 permutations at P ≤ 0.05 (Li et al. 2007). The figure of linkage maps showing QTL positions was constructed using MapChart 2.32 (Voorrips 2002).
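The idea behind a permutation-based LOD threshold can be sketched as follows. This is not ICIM: for brevity the sketch swaps in a simple single-marker scan, and all function names, parameters, and the toy data are hypothetical. Phenotypes are shuffled relative to genotypes, the genome-wide maximum LOD is recorded for each shuffle, and the (1 - alpha) quantile of these maxima serves as the threshold.

```python
import math
import random

def marker_lod(genotypes, phenotypes):
    """LOD for one marker from a comparison of genotype-class means."""
    n = len(phenotypes)
    grand_mean = sum(phenotypes) / n
    rss_null = sum((p - grand_mean) ** 2 for p in phenotypes)
    rss_full = 0.0
    for allele in set(genotypes):
        group = [p for g, p in zip(genotypes, phenotypes) if g == allele]
        group_mean = sum(group) / len(group)
        rss_full += sum((p - group_mean) ** 2 for p in group)
    if rss_null == 0.0:
        return 0.0          # no phenotypic variation at all
    if rss_full == 0.0:
        return math.inf     # marker explains all variation
    return max(0.0, (n / 2.0) * math.log10(rss_null / rss_full))

def permutation_threshold(markers, phenotypes, n_perm=1000, alpha=0.05, seed=0):
    """Genome-wide LOD threshold from the maxima of permuted scans."""
    rng = random.Random(seed)
    shuffled = list(phenotypes)
    maxima = []
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # break any genotype-phenotype association
        maxima.append(max(marker_lod(m, shuffled) for m in markers))
    maxima.sort()
    index = min(n_perm - 1, math.ceil((1.0 - alpha) * n_perm) - 1)
    return maxima[index]

# Toy data (hypothetical): 3 markers scored as 0/1 in 20 lines.
toy_markers = [[0, 1] * 10, [0, 0, 1, 1] * 5, [0] * 10 + [1] * 10]
toy_phenotypes = [float(i % 7) for i in range(20)]
threshold = permutation_threshold(toy_markers, toy_phenotypes, n_perm=200)
```

Taking the maximum over all markers in each permutation is what makes the threshold genome-wide rather than per-marker.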
The QTLs for SCC identified in this study were named by combining letters and numbers: q stands for quantitative trait locus and SC for seed coat cracking; the number following the letters indicates the chromosome harboring the QTL, and a hyphenated suffix gives the order of the QTL on that chromosome. Thus, qSC2-1 and qSC6 denote the first QTL on chromosome 2 and the single QTL on chromosome 6, respectively.
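The naming convention is regular enough to be parsed mechanically. The sketch below is one hypothetical way to split a QTL name into its parts; the class name, function name, and regular expression are illustrative assumptions and not part of the study.

```python
import re
from typing import NamedTuple, Optional

class QtlName(NamedTuple):
    trait: str             # trait code, e.g. "SC" for seed coat cracking
    chromosome: int        # chromosome harboring the QTL
    index: Optional[int]   # order on the chromosome, None for a single QTL

# "q" + trait letters + chromosome number + optional "-" + order number
_QTL_PATTERN = re.compile(r"^q([A-Za-z]+)(\d+)(?:-(\d+))?$")

def parse_qtl_name(name):
    """Split a name such as 'qSC2-1' into trait, chromosome and order."""
    match = _QTL_PATTERN.match(name)
    if match is None:
        raise ValueError(f"not a recognised QTL name: {name!r}")
    trait, chromosome, index = match.groups()
    return QtlName(trait, int(chromosome), int(index) if index else None)

for name in ("qSC2-1", "qSC6", "qSC10-2"):
    print(name, parse_qtl_name(name))
```

The same pattern also accepts names from earlier studies that follow this scheme, such as qSCC20, in which case the trait code is returned as "SCC".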

ANOVA and phenotypic analysis
The SCC of the parental cultivars and the RIL population was measured in two yearly environments, and ANOVA was used to analyze the effects of genotype, environment, and genotype-by-environment interaction (G × E) on the SCC variation (Table 1). Genotype, environment, and G × E effects were significant for SCC (P < 0.001). The estimated broad-sense heritability of SCC was 81.5%, which suggested that a larger proportion of the variation in SCC was due to genetic effects than to environmental effects.
The skewness values of the RILs in 2016, 2017, and the combined years were greater than 0 (Table 2), and the phenotypic distribution of SCC in the RILs was right-skewed (right-tailed) in all the environments (Fig. 2).
The kurtosis values of the RILs in 2016 and the combined years were less than 3, but the value in 2017 was greater than 3 (Table 2). This indicated that the phenotypic distribution of the RILs was less peaked than a normal distribution in 2016 and the combined years, but more peaked in 2017 (Fig. 2).
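The two shape statistics can be illustrated with the moment-based definitions, under which a normal distribution has skewness 0 and kurtosis 3, matching the reference values used above. This is a minimal sketch with hypothetical function names and made-up example values; spreadsheet functions may apply small-sample corrections or report excess kurtosis (kurtosis minus 3), so their output can differ from these formulas.

```python
from statistics import fmean

def _central_moment(values, order):
    """Population central moment of the given order."""
    centre = fmean(values)
    return sum((v - centre) ** order for v in values) / len(values)

def moment_skewness(values):
    """m3 / m2**1.5; positive values indicate a longer right tail."""
    m2 = _central_moment(values, 2)
    if m2 == 0.0:
        raise ValueError("skewness is undefined for constant data")
    return _central_moment(values, 3) / m2 ** 1.5

def moment_kurtosis(values):
    """m4 / m2**2; equals 3 for a normal distribution."""
    m2 = _central_moment(values, 2)
    if m2 == 0.0:
        raise ValueError("kurtosis is undefined for constant data")
    return _central_moment(values, 4) / m2 ** 2

# Made-up SCC percentages with a few high values (not data from the study).
example = [1.0, 2.0, 2.5, 3.0, 4.0, 6.0, 9.0, 25.0]
print(moment_skewness(example) > 0)  # right-skewed sample
```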
The SCC variation in the combined years differed among the groups of maturity days (MD, from seeding to maturity) (

Discussion
The SCC of soybean, especially Type-I irregular cracking, is an important phenotype in determining the commercial value of seeds. The main purpose of this study was to identify QTLs for SCC using a biparental mapping population. The results of the phenotypic evaluation indicated that SCC was significantly influenced by genotype, environment, and their interaction. The high broad-sense heritability showed that the genotypic factor was more influential than the environmental factor in determining the SCC variation. When the heritability is higher than 50%, the target quantitative phenotype can be considered as a selection marker for subsequent generations, because the trait variation is mainly based on genetic inheritance. The SCC of the RIL population showed transgressive segregation, especially beyond the susceptible parent, because the resistant parent showed a small variance (2.33), whereas the susceptible parent showed a large variance (49.92). Similar right-skewed distributions were also found in previous studies (Oyoo et al. 2010; Ha et al. 2012; Saruta et al. 2019).
The average distance between SNP markers in this study was 0.7 cM, a higher marker density than in previous QTL studies for SCC (Oyoo et al. 2010; Ha et al. 2012; Saruta et al. 2019). Construction of a high-density linkage map is important for precise mapping of QTLs and their potential application in breeding programs.
Previous studies on SCC suggested that the maturity loci (E1, E2, and E5) and pigment loci (T and I) were associated with SCC variation in specific environments (Takahashi 1997; Takahashi and Abe 1999; Yang et al. 2002). In the present as well as previous QTL studies, most of the QTLs for SCC were located in the same linkage groups as the maturity loci. Therefore, we investigated the SCC variation in the RILs considering their maturity period (early, normal, and late), and found significant differences among the groups (Table 3). Maturity can thus be an important factor affecting SCC variation. We also compared the physical locations of the QTLs identified in the present study with those of the maturity loci and previously detected QTLs, based on the information obtained from SoyBase (https://www.soybase.org/, accessed February 2020). QTLs qSC2-2 and qSC2-3 co-localized with qSCC2-1 (Ha et al. 2012) and cr1 (Oyoo et al. 2010). The physical location of qSC6 overlapped that of qSCC6, which is located about 30 cM from three clustered loci, E1, E7, and T (Molnar et al. 2003; Ha et al. 2012). E1 and T are known to suppress SCC at low temperatures and possibly have roles in controlling SCC variation (Takahashi 1997; Takahashi and Abe 1999). However, the physical position of the markers for qSC6 was 14.3 Mb, 2.51 Mb, and 1.24 Mb away from the loci E1, E7, and T, respectively (Toda et al. 2002; Molnar et al. 2003; Dissanayaka et al. 2016). Also, the pubescence color, which is related to the T locus, did not differ between the parental cultivars or among the RILs; all the parents and RILs had white pubescence. Therefore, the SCC variation found in this study might not be related to loci E1, E7, and T. qSC8 co-localized with qSCC8 (Ha et al. 2012) and was located 2.8 Mb from the E10 locus (Samanfar et al. 2017). qSC19-1 and qSC19-2 covered the physical location of qSCC19 (Ha et al. 2012) and were located 2 Mb away from the E3 locus (Mao et al. 2017). Similarly, qSC20 co-localized with qSCC20 (Ha et al. 2012; Saruta et al. 2019). These results showed that the SCC variation found in the RIL population was not directly related to the maturity loci E3 and E10, and the association of SCC with E4 was also not clear (Molnar et al. 2003).
Table 4 The QTLs identified for Type-I seed coat cracking evaluated in 2016, 2017, and combined years with the RIL population developed from the cross between cultivars
qSC2-1, qSC9, qSC10-1, qSC10-2, and qSC12 were the novel QTLs for SCC detected in this study. Of the four chromosomes that harbored the novel QTLs, only chromosome 10 was found to contain a maturity locus. The E2 locus on chromosome 10 (O) has been reported to induce SCC in one of the treatment groups of pod-removal experiments in soybean. The chromosomal region between qSC10-1 and qSC10-2 covered a GIGANTEA ortholog, the GmGIa gene (Glyma.10g221500), which was identified as the E2 locus in the soybean genome (Watanabe et al. 2011). On the other three chromosomes (2, 9, and 12), several candidate genes associated with flowering and seed maturation were found in the intervals of the novel QTLs qSC2-1, qSC9, and qSC12 (Supplementary Table S2). The marker interval of qSC2-1 includes four genes, of which Glyma.02g008300 and Glyma.02g008400 are related to pectinesterase, which affects the accumulation of methanol in maturing soybean seeds (Markovic and Obendorf 2008), and Glyma.02g008500 is related to a protein kinase domain that plays important roles in seed maturation of rice and sandalwood (Kawasaki et al. 1993; Anil et al. 2000). A protein kinase domain is also associated with the activity of oil bodies in several plant species, including soybean seed (Anil et al. 2003). The physical location of qSC9 overlapped the marker Gm09_43508261, which is associated with flowering time in soybean (Mao et al. 2017). The marker interval of qSC12 includes seven genes, of which Glyma.12g095700 is related to the seed maturation protein PM37 according to the NCBI database (https://ncbi.nlm.nih.gov, accessed February 2020).
A few studies suggested that SCC was related to several maturity loci (Takahashi 1997; Takahashi and Abe 1999; Yang et al. 2002), which was also observed in the RIL population as a higher SCC in the early-maturing soybean lines. During soybean seed coat development, various cells and tissues undergo several changes after fertilization until maturation (Shibles et al. 2004). The flowering time and subsequent seed development may vary with genotype, and are also influenced by growing conditions such as temperature. Temperature affects cell division (Francis and Barlow 1988) and can induce variation in the physical appearance of the soybean seed coat. The scatter plot indicates that the late-maturing group has lower SCC than the early-maturing group, even though a few early-maturing lines have a low level of SCC (Fig. 4). Thus, SCC varied among the RILs of different maturity groups, suggesting an effect of maturity loci, especially the E2 locus, on the variation in SCC along with maturity in this study. Further research using a population derived from a cross between SCC-resistant and SCC-susceptible lines that do not differ in flowering and maturing time could be useful to investigate the relationship between maturity and SCC. To precisely determine the genetic regions affecting SCC and develop useful markers, whole-genome resequencing data of both parents would be required to identify the sequence variations within candidate genes (Asekova et al. 2016). The QTLs for SCC and the potential relation between SCC and maturity identified in this study could provide useful information on the genetic control of SCC in soybean. This information can be of great significance for soybean breeding and the development of SCC-resistant cultivars through marker-assisted selection.
Fig. 4 The scatter plot between seed coat cracking and maturity days of RILs in combined years (2016–2017), clustered by maturity group. MD indicates maturity days