Male sterility and hybrid breeding in soybean

Hybrid breeding can help us to meet the challenge of feeding a growing world population with limited agricultural land. The demand for soybean is expected to grow; however, the hybrid soybean is still in the process of commercialization even though considerable progress has been made in soybean genome and genetic studies in recent years. Here, we summarize recent advances in male sterility-based breeding programs and the current status of hybrid soybean breeding. A number of male-sterile lines with cytoplasmic male sterility (CMS), genic-controlled photoperiod/thermo-sensitive male sterility, and stable nuclear male sterility (GMS) have been identified in soybean. More than 40 hybrid soybean varieties have been bred using the CMS three-line hybrid system and the cultivation of hybrid soybean is still under way. The key to accelerating hybrid soybean breeding is to increase the out-crossing rate in an economical way. This review outlines current problems with the hybrid soybean breeding systems and explores the current efforts to make the hybrid soybean a commercial success.


Introduction
The area of soybean (Glycine max) cultivation in the world has expanded by more than 900% since the 1960s in North and South America, owing to its significant roles in animal feed and human nutrition (http://www.fao.org/faostat/en/#home). However, soybean yield per unit area has not changed significantly compared with rice, wheat, and maize, suggesting the lack of a true Green Revolution in soybean breeding (Liu, et al. 2020a). Soybean yield is determined by both the total number of nodes and the number of pods per node; therefore, a yield increase cannot be achieved in soybean simply by adopting shorter varieties. Of the several options for increasing soybean yields, hybrid breeding holds the greatest potential.
Soybean is an autogamous legume species, and a male-sterile line is a prerequisite for commercial hybrid breeding and large-scale seed production. An earlier heterosis test demonstrated that significant yield increases could be achieved in soybean: of the 1123 combinations tested, the heterozygous F1 plants of 248 combinations yielded 20% more than their parental lines (Sun, et al. 1999; Palmer, et al. 2001). However, hybrid breeding in soybean has received limited attention in contrast to maize and rice. Male-sterile female lines with cytoplasmic male sterility (CMS) or genic-controlled photoperiod/thermo-sensitive male sterility (P/TGMS) have been used extensively for many years in maize and rice (Chen and Liu, 2014). Hybrid rice in China covers 50-60% of the total rice cultivation area, which has contributed greatly to rice yield and food security (Kim and Zhang, 2018; Liao, et al. 2021).
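The heterosis figures above can be made concrete with a short calculation. The sketch below uses the standard definitions of mid-parent and better-parent heterosis; the yield values are hypothetical and chosen only for illustration, while the counts 248 and 1123 are those cited above. The text does not state which parental value the 20% gain was measured against, so both measures are shown.

```python
def mid_parent_heterosis(f1, p1, p2):
    """Percent gain of the F1 over the mean of its two parents."""
    mid = (p1 + p2) / 2
    return 100 * (f1 - mid) / mid


def better_parent_heterosis(f1, p1, p2):
    """Percent gain of the F1 over the higher-yielding parent."""
    best = max(p1, p2)
    return 100 * (f1 - best) / best


# Hypothetical yields in t/ha (illustrative values, not taken from the cited studies).
p1, p2, f1 = 2.8, 3.0, 3.6
mph = mid_parent_heterosis(f1, p1, p2)
bph = better_parent_heterosis(f1, p1, p2)

# Share of tested combinations that showed the reported yield gain.
share = 100 * 248 / 1123

print(f"mid-parent heterosis:    {mph:.1f}%")
print(f"better-parent heterosis: {bph:.1f}%")
print(f"combinations with gain:  {share:.1f}% of those tested")
```

With these assumed yields, the same F1 scores differently on the two measures, which is why the reference parent should be stated whenever heterosis is reported.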
Male-sterile lines are available in soybean. The first soybean CMS line was reported in US patent No. 4545146 (Davis 1985), but no further information on this line has been reported since. Considerable research on soybean CMS lines has been conducted in China since the early 1980s (Sun, et al. 1994a; Palmer, et al. 2001). To date, after more than 40 years of effort by several generations of researchers, more than 40 hybrid soybean varieties have been bred and approved in China, and more than 30 invention patents and technical standards have been authorized for new technologies and methods in soybean hybrid breeding. However, the CMS genes and the underlying molecular mechanisms are still unknown in soybean, which has restricted the development of commercial varieties.
With the explosion in genomic resources and the rapid development of molecular biology and technology, biotechnology-based male-sterility (BMS) systems for hybrid breeding have been established in maize, rice, and other crops and vegetables (Chang, et al. 2016; Singh, et al. 2019). BMS systems use nuclear male sterility to propagate pure nuclear male-sterile seeds on a large scale, which not only removes the threat that climate change poses to pure hybrid seed production but also unlocks the potential for breeding superior hybrids by expanding the parental germplasm pool. In soybean, the nuclear male-sterile mutants ms4, ms1, ms6, and ms3 have been cloned in recent years (Thu, et al. 2019; Fang, et al. 2021; Jiang, et al. 2021; Nadeem, et al. 2021; Hou, et al. 2022). To speed up the large-scale commercial cultivation of hybrid soybean, it is time to consider where to invest: should we continue to count on the three-line hybrid system and keep looking for ideal maintainer and restorer lines, or can we rely on BMS systems to commercialize hybrid soybean? In this review, we cover recent advances in cytoplasmic-nuclear and nuclear male sterility systems in soybean to assess whether these technological breakthroughs can lead to success in hybrid soybean production.

Male sterility in plants
Plant male sterility (MS) refers to the phenomenon in which the stamen develops abnormally and loses the ability to produce functional male gametes for fertilization. According to phenotypic characteristics, Kaul (1988) divided MS into three categories: structural, sporogenous, and functional. In structural MS, the stamen is either completely absent or abnormally formed, which results in the absence of pollen. In sporogenous MS, the stamen is essentially morphologically normal but fails to produce functional microspores or pollen because of failures in early microsporogenesis and late microgametogenesis. In functional MS, viable pollen is produced but either cannot be released from the anther owing to the absence of dehiscence or is unable to germinate on the stigma and initiate fertilization.
Under another scheme, based on the mode of inheritance, two types of MS are distinguished: cytoplasmic male sterility (CMS) and nuclear or genic male sterility (GMS). CMS is co-controlled by nuclear and cytoplasmic genes, while GMS is controlled by nuclear genes alone. CMS is widespread in higher plants, and it has been reported in more than 300 species to date (Liu, et al. 2001). CMS results from incompatibility between nuclear and mitochondrial gene products, and it can be generated in several ways, including wide/inter-specific hybridization, protoplast fusion, induced mutation, and genetic engineering (Bohra, et al. 2016). GMS derives from changes in the structure and function of nuclear genes, most of which are caused by natural variation but which can also be induced by physical and chemical mutagenesis. In most cases, the fertility of a GMS line is controlled by a recessive gene, and only rarely by a dominant gene.

Cytoplasmic male sterility (CMS) system
The CMS/Rf (restorer-of-fertility) system, also known as the three-line hybrid system, comprises a cytoplasmic male-sterile line, a maintainer line, and a restorer line. The sterile line contains a cytoplasmic male-sterility gene but lacks a nuclear restorer gene (Schnable and Wise, 1998); it produces sterile pollen and cannot produce progeny by selfing. The maintainer line also lacks the nuclear restorer gene but carries the fertile cytoplasmic gene (Chen and Liu, 2014). The restorer line, in contrast, carries a functional nuclear restorer gene, with or without a fertile cytoplasmic gene (Chen and Liu, 2014). The pollen of both the maintainer and the restorer lines is fertile, so both can be propagated by self-pollination. When the sterile line is used as the female parent, it can receive pollen from either the maintainer or the restorer line and produce hybrid progeny. The maintainer line is crossed with the male-sterile line to reproduce the male-sterile line, while the restorer line is crossed with the male-sterile line to produce hybrid progeny whose heterosis delivers the yield increase.
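The logic of the three crosses described above can be sketched in a few lines of Python. This is a deliberately simplified model, not a description of any particular soybean CMS type: it assumes a single restorer locus with a dominant allele (Rf) that acts sporophytically, strictly maternal inheritance of the cytoplasm, and a restorer line with normal cytoplasm. The class and function names are hypothetical.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class Line:
    name: str
    cytoplasm: str  # "S" = sterility-inducing, "N" = normal (fertile)
    nuclear: tuple  # two alleles at the assumed restorer locus: "Rf" or "rf"


def is_male_fertile(cytoplasm, nuclear):
    # Assumption: pollen is sterile only with S cytoplasm and no Rf allele.
    return cytoplasm == "N" or "Rf" in nuclear


def cross(female, male):
    """Return the progeny classes as (cytoplasm, genotype, male_fertile)."""
    if not is_male_fertile(male.cytoplasm, male.nuclear):
        raise ValueError(f"{male.name} sheds no fertile pollen")
    classes = set()
    for egg, pollen in product(female.nuclear, male.nuclear):
        genotype = tuple(sorted((egg, pollen)))
        # Cytoplasm is inherited from the female parent only.
        classes.add((female.cytoplasm, genotype))
    return [(c, g, is_male_fertile(c, g)) for c, g in sorted(classes)]


sterile = Line("A (CMS line)", "S", ("rf", "rf"))
maintainer = Line("B (maintainer)", "N", ("rf", "rf"))
restorer = Line("R (restorer)", "N", ("Rf", "Rf"))

for female, male in [(sterile, maintainer), (sterile, restorer), (maintainer, maintainer)]:
    print(female.name, "x", male.name, "->", cross(female, male))
```

Under these assumptions, A x B reproduces the sterile line, A x R gives a uniformly fertile F1 hybrid, and B selfs normally, which matches the roles of the three lines described above.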
The CMS/Rf system has been exploited for hybrid seed production in many crops, such as maize, rice, wheat, rape, soybean, sorghum, carrot, sugar beet, sunflower, cotton, pepper, and petunia (Garcia, et al. 2019). Although this system has been applied successfully to soybean, the yield increase is still far below that achieved in rice and maize. One of the main reasons is the limited number of identified CMS lines, which heavily restricts the use of the three-line system in soybean hybrid seed production. To address this issue, the various cytoplasmic genes that produce MS phenotypes, along with their corresponding nuclear-encoded restorer-of-fertility genes, urgently need to be identified.

Genic male sterility (GMS) system
GMS is controlled by nuclear genes alone, without the influence of the cytoplasmic genome. GMS lines are either insensitive or sensitive to environmental conditions and are called genetically stable GMS and environment-sensitive genic male sterility (EGMS), respectively. In EGMS, male fertility responds to environmental conditions, including photoperiod (PGMS), temperature (TGMS), photoperiod and temperature (PTGMS), and humidity (HGMS) (Chen and Liu, 2014; Xue, et al. 2018; Abbas, et al. 2021). EGMS is regarded as an efficient genetic tool for developing two-line hybrids, since the need for a maintainer line is eliminated and the male-sterile line can be propagated by self-pollination under specific conditions (Garcia, et al. 2019). In this system, almost every conventional inbred line is able to restore the fertility of the male-sterile line, and no negative effects related to sterility-inducing cytoplasm have been observed. Furthermore, genes of this system can be easily transferred to other genetic backgrounds (Yu, et al. 2015). EGMS has long been exploited to efficiently produce hybrid rice seeds within a two-line system consisting of a male-sterile line and a restorer line. For instance, the first PGMS line, Nongken 58S, discovered in japonica rice in 1973, is completely male sterile when grown under long-day conditions but male fertile when grown under short-day conditions (Shi 1985). Similarly, the TGMS line Annong S-1, discovered in indica rice in 1988, is completely male sterile when grown at high temperatures but male fertile at low temperatures (Deng, et al. 1999).
Similar to the CMS system, the use of genetically stable GMS in plant breeding and hybrid seed production also involves three lines: a male-sterile line, a maintainer line, and a restorer line. However, propagating the male-sterile line in the GMS system is difficult, because the progeny of the cross with the maintainer line segregate, and the 50% that are heterozygous must be identified and removed before hybrid seed production. Fortunately, BMS systems, which include but are not limited to seed production technology (SPT) and genome-editing technology (Song, et al. 2021), have overcome this drawback by omitting the additional cross and identification step. The SPT platform consists of a transgenic maintainer line capable of maintaining and propagating non-transgenic nuclear male-sterile female lines for hybrid seed production (Qi, et al. 2020). The transgenic maintainer line includes three important elements: (i) the fertility restorer element, which complements the sterility phenotype of the homozygous mutant; its most important component is the functional GMS gene corresponding to the relevant mutant, often driven by a CaMV 35S or native promoter; (ii) the gametophyte-devitalizing element, which destroys the pollen and causes gametophytic sterility; an α-amylase gene driven by an anther-specific promoter is usually chosen; and (iii) the seed screening element, which allows seeds to be sorted by eye or with specific apparatus; a seed-specific promoter drives the expression of a fluorescent protein gene, such as the mCherry gene (Chang, et al. 2016). The recessive genic male-sterile mutant ms45, defective in the formation of the outer pollen wall, was the first used to establish the SPT system in maize. To date, four fertility-related genes, namely rice Oryza sativa No Pollen 1 (OsNP1) (Chang, et al. 2016) and maize MS44 (Fox, et al. 2017), MS7 (Zhang, et al. 2018b), and MS30, have each been used successfully to develop SPT systems.
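The difference between a conventional GMS maintainer and an SPT maintainer can be illustrated with a small segregation sketch. The model below is an assumption-laden simplification: a single recessive sterility allele (ms), an SPT cassette carried in a hemizygous state by an ms/ms maintainer, and complete elimination of pollen grains that carry the cassette. Gametes are written as (allele, carries_cassette) pairs, and all names are hypothetical.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def progeny(eggs, pollen):
    """Genotype frequencies from equally frequent eggs and pollen grains."""
    # Assumption: pollen carrying the SPT cassette is nonfunctional.
    functional = [p for p in pollen if not p[1]]
    counts = Counter()
    for egg, grain in product(eggs, functional):
        genotype = "/".join(sorted((egg[0], grain[0])))
        has_cassette = egg[1] or grain[1]
        counts[(genotype, has_cassette)] += 1
    total = sum(counts.values())
    return {key: Fraction(n, total) for key, n in counts.items()}


# Gametes produced by each parent: (allele at the ms locus, carries cassette).
sterile = [("ms", False), ("ms", False)]          # ms/ms, non-transgenic
conv_maintainer = [("Ms", False), ("ms", False)]  # Ms/ms, conventional
spt_maintainer = [("ms", True), ("ms", False)]    # ms/ms + hemizygous cassette

conventional = progeny(sterile, conv_maintainer)
spt_cross = progeny(sterile, spt_maintainer)
spt_self = progeny(spt_maintainer, spt_maintainer)

print("sterile x conventional maintainer:", conventional)
print("sterile x SPT maintainer:         ", spt_cross)
print("SPT maintainer selfed:            ", spt_self)
```

In this sketch, the conventional cross yields half fertile heterozygotes that must be rogued, the SPT cross yields only non-transgenic ms/ms seed, and selfing the SPT maintainer yields half maintainer seed (cassette present, hence sortable by the seed marker) and half sterile seed.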
With the development of genome-editing technologies and growing knowledge of the molecular genetics of male fertility genes, male-sterile lines can now be created using CRISPR/Cas9 genome editing and are no longer limited to naturally occurring stable GMS mutants. Zhou et al. (2016) developed new "transgene clean" commercial TGMS lines in rice by knocking out TMS5 via CRISPR/Cas9. Subsequently, Li et al. (2017) produced TGMS maize by targeted mutation of the maize homolog of rice TMS5 (called ZmTMS5) using the CRISPR/Cas9 editing system. In addition, two rice reverse PGMS lines in the japonica cultivars 9522 and JY5B were generated by editing the Carbon Starved Anther (CSA) gene using CRISPR/Cas9. Furthermore, Qi et al. (2020) integrated the CRISPR/Cas9 and SPT systems to generate a Zmms26 male-sterile line in maize by co-transforming two independent vectors. In soybean, Chen et al. (2021) were the first to generate stable male-sterile lines by targeted editing of ABORTED MICROSPORES (AMS) homologs with CRISPR/Cas9. Since GmAMS1 functions not only in the formation of the pollen wall but also in controlling the degradation of the soybean tapetum, targeted editing of GmAMS1 resulted in a male-sterile phenotype. Recent advances in genomics and the emergence of multiple biotechnological approaches have revolutionized the field of crop breeding. However, identifying more male-sterility genes and elucidating the molecular mechanisms of male sterility in crops remain prerequisites for hybrid crop breeding and heterosis exploitation.

CMS system and hybrid breeding in soybean
Studies on soybean CMS started relatively late. The phenomenon of CMS was first observed by Davis (1985), but no further information about this mutant is available. Similar work to identify a CMS line through a wide cross between RNTED (Ru Nan Tian E Dan, Glycine max) and 5090035 (Glycine soja) has been carried out in China since 1983 (Sun et al., 1994a). It was not until 1985 that they found a high pollen abortion rate in the (RNTED × 5090035) F1 at different locations. In addition, they conducted reciprocal crosses in 1987 and found that the pollen abortion rate of (RNTED × 5090035) was much higher than that of (5090035 × RNTED). Together, these results confirmed that they had found a genuine CMS line, which was named CMS-RN (Sun et al., 1994a).
Subsequently, another five CMS types were developed and reported in soybean (Table 1). Peng et al. (1994) found that ZD8319, also known as Zhongdou 19, possesses CMS features; the CMS lines derived from ZD8319 have since been classified as the CMS-ZD type, including Zhongyou89B, M, ZA, W931A, FuCMS5A, SXCMS1A, SXCMS5A, and JLCMSZ9A (Zhao, et al. 1998; Zhang, et al. 1999a; Dong, et al. 2012; Dai, et al. 2017; Bai, et al. 2022). In addition, Zhao et al. (1998) confirmed that a Glycine max accession named XXT contained a cytoplasmic male-sterility gene, so this type of CMS was designated CMS-XX. However, no further studies, such as cytological and functional analyses, have been performed for the CMS-XX line. Gai et al. (1995) discovered cytoplasmic-nuclear male sterility in the F1 of N8855 × N2899 (two Glycine max cultivars). Ding et al. (2002) further developed a CMS line, NJCMS1A (maintainer line NJCMS1B), by backcrossing with the recurrent parent N2899 for another four successive generations. Since N8855 contributed the cytoplasmic gene, this type is called CMS-N8855. Subsequently, NJCMS3A (maintainer line NJCMS3B), derived from the cross between N21566 and N21249, exhibited different cytological characteristics from NJCMS1A and NJCMS2A (N8855 × N1628, CMS-N8855); it was therefore considered a new CMS type and named CMS-N21566 (Zhao and Gai, 2006). Furthermore, Nie et al. (2017) developed a CMS line, NJCMS4A, from the cross N23661 × N23658. Analyses of mitochondrial markers and genome sequences suggested that the N23661 male-sterile cytoplasm is distinct from those of the CMS-RN, CMS-ZD, CMS-N8855, and CMS-N21566 types, so CMS-N23661 was named as a new CMS type.
To date, only the CMS system has been applied to the development of hybrid soybean. CMS-RN is the earliest CMS type discovered and was the first used in the three-line system for hybrid breeding in soybean. Because the nuclear gene was replaced by that of the wild soybean 5090035, the CMS line OA displayed sprawling and shattering characteristics, which restricted its use in breeding (Zhang, et al. 1999a). Sun et al. (2001) then used OA BC3 as the female parent in a cross with Glycine max to create a new CMS line, YA (maintainer line YB), in 1995, which made it possible to establish the three-line hybrid breeding system in soybean. The world's first commercially approved soybean hybrid, HybSoy 1, was bred in China in 2002 through successive backcrossing (JLCMS9A × Jihui 1) (Zhao, et al. 2004). Taking advantage of this type of cytoplasm, more than five hundred pairs of CMS-RN lines and their corresponding maintainers have been bred, among which are more than forty CMS lines with high sterility rate, high combining ability, and high outcrossing rate (called three-high CMS lines). In total, 42 commercialized soybean hybrid varieties have been developed in China (Table 2).

Table 1 CMS types reported in soybean (entries recoverable from the flattened table; ... marks lost cells)
CMS type | Cytoplasm source | CMS lines | Rf gene or marker | References
CMS-RN | ... | ... | ... Glyma.09G171200 | Sun et al., 1994a; Sun et al., 1994b; Zhang, et al. 2018a; Zhang, et al. 2018c; Wang, et al. 2010; Lin, et al. 2014; Guo, et al. 2022; Yang, et al. 2023
CMS-ZD | ... | ... | ... GmSSR1602 and GmSSR1610 | Zhao, et al. 1998; Wang, et al. 2010; Dong, et al. 2012; Dai, et al. 2017; Bai, et al. 2022; Lin, et al. 2020; Zhang, et al. 1999a
CMS-XX | XXT | JLCMSPI9A | - | Zhao, et al. 1998; Lin, et al. 2020
CMS-N8855 | N8855 | NJCMS1A, NJCMS2A | Rf, Chr. 16, GmPPR576 (Glyma.16G161900) | Ding, et al. 2002
CMS-N21566 | N21566 | NJCMS3A | - | Zhao and Gai, 2006
CMS-N23661 | N23661 | NJCMS4A | - | ...
Among the six CMS types, CMS-RN has been the most widely used, and a total of 34 varieties have been generated for the spring soybean region in northern China (Table 2). The CMS-ZD type is often used in the Huanghuai summer soybean region of China, where more than 50 pairs of CMS lines and corresponding maintainer lines have been bred, although no three-high CMS line has been obtained yet. The hybrid soybean varieties HybSoy 1 (Zhao, et al. 2004) and Zayoudou No.1 (Zayoudou 1) increased soybean grain yield by up to nearly 20% (Table 2). However, cytological observations and functional analyses have lagged behind the applied research. Pollen abortion in male-sterile lines may occur throughout the reproductive process, and the patterns of abortion vary greatly. A few cytological studies of soybean CMS lines have been performed, focusing mainly on the CMS-N8855, CMS-N21566, and CMS-ZD types. Ding et al. (2001) reported that pollen abortion in NJCMS1A (CMS-N8855) occurred at the binucleate pollen stage. In addition, Fan (2003) observed that pollen abortion in NJCMS1A occurred at the stages of microspore mother cells (MMCs), tetrads, uninucleate microspores, and binucleate pollen, but mostly at the early binucleate pollen stage. In contrast, microspore abortion in another CMS-N8855 line, NJCMS2A, occurred mainly at the late uninucleate stage (Fan 2003). Zhao and Gai (2006) found that microspore abortion in NJCMS3A (CMS-N21566) occurred mainly at the middle uninucleate stage, consistent with the observation of Fan (2003). Furthermore, Ren (2005) found that the pollen morphology of the CMS line W931A (Zhongyou 89B × W206, CMS-ZD) differed from that of its maintainer line W931B. The normal pollen grains of W931B were full, their surface was clearly visible, and the pollen apertures were easy to identify. In contrast, the pollen grains of the CMS line W931A were shriveled with a blurred surface, their apertures could not be distinguished, and they were smaller than those of W931B.
The three-line system depends on the ability of the fertility restorer gene (Rf) in the restorer line to restore the fertility of the CMS line. With the development of next-generation sequencing and CRISPR/Cas9 technologies, multiple Rf genes are expected to be identified in soybean, which will speed up the development of the three-line system. Several studies have sought to identify the Rf genes of CMS lines, especially for the CMS-RN type (Guo, et al. 2022). To narrow down the candidate region of the Rf1 gene for CMS-RN, Guo et al. (2022) constructed an F2 population by crossing JLCMS204A with JLR230 (restorer line) and located the gene between the markers dCAPS-1 and BARCSOYSSR_16_1076. A recent study identified Glyma.16G161900 as the candidate gene for Rf1 (Yang, et al. 2023). In addition, Glyma.09G171200, encoding a pentatricopeptide repeat (PPR) protein, was confirmed as the candidate gene for another restorer gene of CMS-RN, Rf3. The Rf gene of the CMS-ZD type was located between the markers BARCSOYSSR_16_1064 and BARCSOYSSR_16_1082 on Chr. 16 (Dong, et al. 2012). Another Rf gene of CMS-ZD, Rf-m, also on Chr. 16, was mapped between GmSSR1602 and GmSSR1610. Furthermore, another PPR gene (GmPPR576, Glyma.16G161900) was identified as the candidate Rf gene of the CMS-N8855 type, which is the same gene model as the Rf1 candidate (Yang, et al. 2023). Four Rf genes for CMS-RN, CMS-ZD, and CMS-N8855 are thus located within a narrow region on Chr. 16 (Dong, et al. 2012; Guo, et al. 2022); whether they correspond to the same gene needs to be verified.
In addition, because systematic cytological observations are lacking and the cytological phenotypes are inconsistent even within the same CMS type, for example among CMS-N8855 lines, whether the different CMS types are truly distinct from each other remains to be confirmed (Ding, et al. 2001; Fan 2003). Furthermore, the same maintainer and restorer lines can, unusually, maintain and restore different CMS types, viz. YA (CMS-RN) and ZA (CMS-ZD) (Zhao, et al. 1998). Given these contradictions, we cannot rule out the possibility that the six classified CMS types are not completely different from each other.

GMS system in soybean
The first report of a GMS line in soybean was published in 1928: the mutant st1, controlled by a single recessive gene, was both male and female sterile owing to abnormal chromosome association (Owen 1928). To date, approximately 30 GMS lines have been identified in soybean (Table 3). According to their phenotypic characteristics, fs1fs2 (Johns and Palmer, 1982) and ft (transformed flower) (Singh and Jha, 1978) belong to structural MS, the others belong to sporogenous MS, and no functional MS has been reported in soybean. Two PGMS lines, ms3 (Chaudhari and Davis, 1977) and 88-428-BY (Wei 1991), and three TGMS lines, ms8 (Palmer 2000), ms9 (Palmer 2000), and msp (Stelly and Palmer, 1980), have been reported. In addition, the st1-st8, NJS-1H, D8804-7, and fs1fs2 mutants are both male and female sterile (Owen 1928; Hadley and Starnes, 1964; Palmer 1974; Johns and Palmer, 1982; Palmer and Kaul, 1983; Skorupska and Palmer, 1990; Zhao, et al. 1995; Ilarsian, et al. 1997; Palmer and Horner, 2000; Kato and Palmer, 2003; Li, et al. 2010; Speth, et al. 2015). The ms1 mutant was the first GMS line in soybean to show a male-sterile, female-fertile phenotype (Brim and Young, 1971). In addition, ms2, ms4-ms7, ms12, msMOS, msNJ, N7241S, Wh921, and ft also belong to the male-sterile, female-fertile category (Singh and Jha, 1978; Palmer 1979; Buss 1983; Graybosch, et al. 1984; Graybosch and Palmer, 1985; Skorupska and Palmer, 1989; Ma, et al. 1993; Jin, et al. 1997; Zhang, et al. 1999b; Palmer 2000; Zhao, et al. 2005). Although five PGMS and TGMS lines have been identified, only MS3 (Glyma.02G107600), encoding a plant homeodomain (PHD) protein, has been cloned so far (Hou, et al. 2022). The fertility of the ms3 mutant can be restored under long-day conditions; thus, the mutant could be used to create a new, more stable photoperiod-sensitive genic male-sterile line for two-line hybrid seed production in soybean.
With the rapid development of BMS systems in rice and maize, more and more attempts have been made in soybean. The 13 GMS lines displaying male-sterile and female-fertile phenotypes (including ms1, ms2, ms12, msMOS, msNJ, N7241S, and Wh921) are suitable for exploiting this new technology in soybean. In order to make this design [...]

Challenges and prospects in the commercialization of hybrid soybean
Although more than 40 hybrid soybean varieties have been generated with the three-line hybrid system (cytoplasmic male sterility), the unstable sterility of MS lines and the high cost of hybrid seed production have constrained the large-scale application of heterosis in soybean, so the cultivation of hybrid soybean still has a long way to go. We believe that three components are key to making hybrid soybean a commercial success.

Identify the male sterile lines with high out-crossing rate
The out-crossing rate is the key determinant of hybrid seed production. So far, seed production for hybrid soybean has been neither efficient nor cost-effective. The main reason is that mutations causing male sterility very often have pleiotropic effects that impair female function, which leaves the male-sterile lines with low seed set. Identification of the ms1 locus revealed that the gene is highly expressed in the style and ovary and may also function in megagametogenesis or embryo development in soybean (Fang, et al. 2021). Studies have focused on the outcrossing rate on male-sterile plants; the most promising record is for the ms2 mutant, in which the outcrossing rate on male-sterile plants was 74% of that of self-pollinated plants (Carter, et al. 1986; Perez, et al. 2009). A feasible solution is to speed up the cloning of the causal genes of male-sterile mutants with good recorded seed set, and simultaneously to generate new male-sterile lines by genome editing so that the mutation affects only male fertility, without any effect on female productivity or other growth habits.
Besides finding ideal male-sterile lines from the genetic perspective, structural changes in the flower and reproductive organs, for example a stigma protruding beyond the anthers, more pollen grains, and nectaries that produce more fluids and/or volatiles, could increase the opportunity for cross-pollination (Palmer, et al. 2001). Soybean pollen grains are heavy and sticky, so insect-mediated pollination is still indispensable even when the soybean flower is open. Improving the techniques for hybrid seed production is equally important for the commercialization of hybrid soybean, including the management of insect pollinators for cross-pollination and the provision of an environment suitable for both pollinators and soybeans (Palmer, et al. 2001; Garibaldi, et al. 2021).

Incorporate genomic selection to precisely guide hybridization combinations
Breeding 4.0 has been considered the next revolution of maize breeding (Wallace, et al. 2018). Even though soybean breeding programs are still at the Breeding 2.0 to 3.0 stages, in which molecular markers and genomic data complement phenotypic data, the high-quality graph-based soybean pan-genome and the low cost of genome sequencing will turn promise into practice (Liu, et al. 2020b). The genotypes of soybean germplasm lines will be collected using high-throughput genotyping approaches such as next-generation sequencing (NGS) and SNP array platforms. Knowledge of the genetic variation among soybean germplasm of different origins/sources will make the selection of superior hybrids cost-effective.

Good understanding of the molecular mechanisms of anther development in soybean
Little is known about the biological processes and genes that regulate anther and pollen development in soybean. Like most (70%) angiosperms, soybean produces bicellular pollen. By contrast, rice and Arabidopsis both produce tricellular pollen, and the biological significance of the evolution of these two types of pollen grains is still unclear (Williams, et al. 2014). Bicellular pollen undergoes mitotic division to form two sperm cells after germination, whereas tricellular pollen forms, prior to anthesis, a male germ unit (MGU) that develops rapidly, which may make tricellular pollen favored in angiosperms that demand rapid reproduction (Hackenberg and Twell, 2019). Therefore, the knowledge gained from extensive studies of anther and pollen formation in Arabidopsis and rice should not simply be transferred to soybean. Through comparative transcriptome analysis, the discovery of anther-specific genes, genetic networks, and hub genes in soybean anther development will provide important insights into the molecular events underlying soybean reproductive development, as well as valuable resources for the plant reproductive biology community in the areas of pollen evolution, pollination/fertilization, and hybrid breeding.
In summary, ongoing and future research should focus on enhancing the efficiency of hybrid seed production; with long-term investment and commitment, the commercialization of hybrid soybean can become a reality.

Data availability
The tables are included in this article.

Declarations
Ethics approval All authors approved the submission.

Consent to participate Not applicable.
Consent for publication Yes.

Conflict of interest The authors declare no competing interests.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.