Analysis of differentially expressed genes in soybean leaf tissue of tolerant and susceptible cultivars under flooding stress revealed by RNA sequencing

Flooding stress causes severe yield reduction in soybean worldwide. The development of stress-tolerant cultivars could be an effective measure to reduce the negative effects of flooding stress. Molecular information on the gene expression pattern of tolerant and susceptible genotypes under flooding stress could be valuable to improve the flooding tolerance in soybean. The objective of this study was to analyze the differentially expressed genes (DEGs) revealed by RNA sequencing in the soybean leaf tissues of tolerant (‘Paldalkong’ and ‘Danbaekkong’) and susceptible (‘NTS1116’) cultivars under flooding stress. Seedlings were grown in a well-watered condition up to the V1–V2 stage and flood-stressed by inundating ~ 10-cm water for 14 days. A total of 22,468 genes were differentially expressed in flood-stressed condition compared to the well-watered control condition, out of which 13,729, 13,405, and 13,160 were differentially expressed in ‘Paldalkong’, ‘Danbaekkong’, and ‘NTS1116’, respectively. A higher number of some of the flooding tolerance-related genes such as lipoxygenase, expansin, glutathione S-transferase, and sugar efflux transporter were up-regulated in the tolerant cultivars than in the susceptible cultivar. The number of some abscisic acid-related transcription factors of basic leucine zipper domain and myeloblastosis families was also higher in the tolerant cultivars than in the susceptible cultivar. The molecular information about the DEGs of tolerant and susceptible cultivars obtained in the present study could be valuable to improve the flooding tolerance in soybeans.


Introduction
Considering the multiple uses of soybean as sources of food, feed, biodiesel, and other industrial products, extensive efforts have been made to increase its yield worldwide. The rise in commercial value of and demand for soybean in national and international markets has increased the soybean cultivation area. Although soybean is an upland crop, the cultivation of soybean in converted paddy fields has been increased for a few years owing to comparative economic benefit, nutrient management, and government policy (Nishida et al. 2013;Singh 2010). Soybean crop is generally sensitive to flooding stress which hampers crop growth and causes significant yield reduction (Ahmed et al. 2013;Komatsu et al. 2009). Flooding stress results from heavy rainfall and/or excessive irrigation coupled with poor drainage. This situation is further worsened under paddy soils which are specially developed to hold water for longer durations.
Plants, under flooding stress, make efforts to overcome the stress by adopting various mechanisms, including flooding avoidance, flooding tolerance, and flooding adaptation (Bailey-Serres et al. 2012;Mustroph 2018;Yamauchi et al. 2018). Various adaptive mechanisms, as well as physiological, biochemical, and molecular responses under drought and flooding stresses, help to survive and retain normal growth in plants ( Dat et al. 2004). Plants perceive environmental signals that are subsequently transmitted to molecular signaling as the initial step in a stress response. Among various signal compounds in several pathways of stress responses, abscisic acid (ABA) dependent and independent responses include transcription factors (TFs). Regular transcription and its regulation rely on specific protein factors, TFs, which bind to certain DNA sequences in gene regulatory regions and control their transcription (Latchman 1993). A number of TFs of diverse families account for approximately 10% of the genes in soybean that are participated in many biotic and abiotic stress responses (http://plant tfdb.cbi.pku.edu.cn/ index .php?sp=Gma).
A number of efforts have been made to unravel the complex molecular mechanisms underlying various biotic and abiotic stresses in soybean. Molecular techniques such as microarray profiling have been conducted to produce gene expression data of soybeans and stored in a public database (https ://www.ncbi.nlm.nih.gov/geo/). The microarray platforms have several drawbacks, such as limited sensitivity, cross-and non-specific hybridization, and limited usefulness under scarce gene information, especially for soybean in which gene models have not been well characterized. RNA sequencing (RNA-Seq), a sequencing-based recently advanced techniques, can overcome the limitations of microarray techniques (Trapnell et al. 2013). Several RNA-Seq studies have been conducted to investigate the gene expression in numerous tissues of soybean under biotic and abiotic stresses, such as drought (Vidal et al. 2012), drought and flooding (Chen et al. 2016), common cutworm attack (Du et al. 2019), salt (Zeng et al. 2019), and seed-flooding (Sharmin et al. 2020). The objective of this study was to analyze the differentially expressed genes (DEGs) revealed by RNA-Seq in soybean leaf tissue of tolerant and susceptible cultivars under flooding stress.

Plant material and growing condition
Two flood-tolerant ('Paldalkong' and 'Danbaekkong') and one flood-susceptible ('NTS1116') soybean cultivars were selected on the basis of a previous study (Koo et al. 2014).
Plants were grown in round-bottomed pots of 20 × 30 cm dimension (top and bottom diameters × height), kept in the greenhouse of Department of Southern Area Crop Science, National Institute of Crop Science, Miryang, Republic of Korea in April of 2019. The pots were filled with the soils prepared by mixing upland soil, compost, and nursery soil at 1:1:0.75 ratio. Five seeds were sown in three replicates for each cultivar and treatment condition. Seedlings were thinned by the first trifoliate (V1) stage to keep three plants in each pot. The plants were grown in a well-watered condition up to the V1-V2 stage and flood-stressed by inundating ~ 10-cm water for 14 days. The control-designated pots were grown in the well-watered condition during the period. Leaf samples were collected at 14 days after flooding (DAF) in 1.5-mL tubes, immediately kept into liquid nitrogen, and stored at -80 °C until RNA extraction.

Measurement of flooding tolerance
Leaf chlorophyll content (CC) was measured on the second trifoliate leaves of the control and flooded plants at 2-3-day intervals using a chlorophyll meter (SPAD-502Plus, Minolta Camera Co., Osaka, Japan). Leaf chlorophyll index (CCI) was calculated as the ratio of mean CC values obtained under flooded to control conditions (Dhungana et al. 2019). Three replications were maintained for each cultivar and treatment condition and the mean values of three replicates were reported.

RNA extraction and library preparation
Total RNA was extracted using an RNA extraction kit (RNeasy Plant Mini Kit, Qiagen, Hilden, Germany) following the manufacturer's instructions. The concentration of RNA was measured with a NanoDrop 2000 spectrophotometer (Thermo Scientific, Waltham, MA, USA). Pooled samples of three biological replicates were prepared for individual cultivar and control/flooding treatment and thus six RNA samples were applied for RNA-Seq and DEGs analysis. The possible contaminated DNAs in the RNA samples were removed using DNase. Total RNA was further purified using a ribo-zero rRNA removal kit. The cDNA libraries were constructed using a TruSeq RNA Sample Preparation v2 Guide, Part # 15026495 Rev. F.

Quality filtering and mapping of RNA-Seq reads
The paired-end RNA-Seq reads were generated on the Illumina Genome Analyzer platform and the quality of the trimmed reads was evaluated with FastQC (Bolger et al. 2014). The RNA-Seq reads were aligned to the Glycine max reference genome (Gmax2.1version) using Bowtie2 aligner.
The read mapping was performed using the HISAT2 program (Kim et al. 2015).

Sequence assembly and differential counting
After read mapping, transcript assembly was performed through the StringTie program (Pertea et al. 2015). The estimated gene abundance of the samples was determined in terms of the fragment per kilobase of transcript per million mapped reads (FPKM). The DEGs between the floodstressed and control samples of three cultivars were identified using the StringTie-e option. The significantly up-and down-regulated genes of each cultivar were obtained for the flooding stress. The genes with a log 2 fold change ≥ + 1.5 and ≤ − 1.5, and a false discovery rate adjusted P ≤ 0.05 were identified as DEGs.

Functional annotation and gene ontology enrichment
The DEGs were annotated for gene ontology (GO) terms and grouped into three categories: molecular function (MF), cellular component (CC), and biological process (BP). The GO enrichment analysis was done using gProfileR (Raudvere et al. 2019). The DEGs were also mapped to the Kyoto Encyclopedia of Genes and Genomes (KEGG) database of soybean to find the genes associated with different KEGG pathways. The TFs and transcription-related genes were identified using the plant transcription factor database (http://plant tfdb.cbi.pku.edu.cn/index .php?sp=Gma) to analyze their potential roles in flooding tolerance.

Screening of DEGs using the QTL results for flooding tolerance
To identify the DEGs for their functions related to flooding stress response, the physical location of some DEGs in the genome was compared with the former mapping results of quantitative trait loci (QTL) associated with flooding tolerance in soybean. The recombinant inbred line (RIL) populations for QTL were developed using the tolerant ('Paldalkong' and 'Danbaekkong') and susceptible ('NTS1116') cultivars as parents which were also used for RNA-Seq in this study. The QTL result of one of the RIL populations was published (Dhungana et al. 2020). QTL map-based identification of DEGs may allow us to find the putative genes related to flooding tolerance and to verify the results of RNA-Seq study.

Mapping and analysis of DEGs under flooding stress
The average number of reads was 65.3, 67.5, 69.5, 60.1, 58.5, and 69.5 million for 'Paldalkong' control, 'Paldalkong' flooding, 'Danbaekkong' control, 'Danbaekkong' flooding, 'NTS1116' control, and 'NTS1116' flooding, respectively. Over 98.8% of the raw reads from each library retained after quality control and approximately 97% of reads in each library were mapped (Table 2). A summary of the workflow for RNA-Seq analysis is shown in Supplementary Fig. S1.
In total, 22,468 genes were differentially expressed under flooding stress as compared to the control condition (Supplementary Table S1). The core set of DEGs was analyzed in three combinations, namely 'Paldalkong' flooding vs. 'Paldalkong' control, 'Danbaekkong' flooding vs. 'Danbaekkong' control, and 'NTS1116' flooding vs. 'NTS1116' control. The DEGs specific to and common between the three combinations were also determined ( Fig. 1; Supplementary Table S2).
Some of the stress-related DEGs common to both tolerant cultivars 'Paldalkong' and 'Danbaekkong' such as lipoxygenase (LOC100785480) were significantly upregulated only in the tolerant cultivars but not in the susceptible cultivar, 'NTS1116'. Two phosphofructokinases (LOC100795344 and LOC100798766) were up-regulated and other two (LOC100782166 and LOC100820495) were down-regulated in the tolerant cultivars but were not significantly regulated in the susceptible cultivar. Four glutathione S-transferase (GST) (LOC100306119, LOC100796296, LOC106796199, and LOC100793025) were up-regulated and one LOC100775492 was down-regulated in the tolerant cultivars; however, none of these five genes were significantly regulated in the susceptible cultivar. Similarly, two sugar efflux transporter (SWEET) genes (GMSWEET30 and GMSWEET40) were up-regulated in the tolerant cultivars but not in the susceptible cultivar. The analysis of stress-related DEGs common to tolerant and susceptible cultivars showed mixed results, that is some of the DEGs were significantly up-regulated and others were down-regulated in the tolerant as well as in the susceptible cultivars. Lipoxygenase LOC100811820 was up-regulated in 'Paldalkong' and down-regulated in 'NTS1116'; LOC100793614 was down-regulated in 'Danbaekkong' but up-regulated in 'NTS1116'; whereas, LOC100802887 and LOC100800451 were down-regulated in 'Paldalkong' and 'NTS1116'. Expansin-coding gene LOC100799370 was up-regulated in 'Paldalkong' but down-regulated in 'NTS1116'; LOC100807807 was up-regulated in 'Danbaekkong' but down-regulated in 'Paldalkong' and 'NTS1116'; whereas LOC100807183, EXP2, and EXPB8 were downregulated in 'Paldalkong' and 'NTS1116'. Phosphofructokinase LOC100815278 was up-regulated in 'Paldalkong' but down-regulated in 'NTS1116'; LOC100818755 and LOC100803139 were down-regulated in 'Paldalkong' but up-regulated in 'NTS1116'; whereas, LOC100780164 was up-regulated in 'Danbaekkong' and 'NTS1116'. Three GST were up-regulated in 'Paldalkong', out of which only one GSTU58 was up-regulated and the other two LOC100805827 and LOC548030 were down-regulated in 'NTS1116'; whereas, EF1BGAMMA3 and GSTZ3 were down-regulated in the tolerant as well as susceptible cultivars. Three SWEET genes GMSWEET21, GMSWEET29, and GMSWEET42 were down-regulated in 'Danbaekkong' and 'NTS1116'.

Functional annotation and gene ontology (GO) enrichment
The GO analysis was performed to observe the gene expression pattern based on three categories: BP, MF, and CC.  The DEGs were functionally annotated to investigate their GO ( Supplementary Fig. S2) and were also mapped to KEGG pathways to analyze their functional enrichment. The GO functional annotations indicated that the DEGs were involved in a series of biological processes, such as photosynthesis, light reaction, biosynthetic process, metabolic process, and cell wall biogenesis. The DEGs were also related to the following molecular functions: binding, transporter, and signal receptor. In addition, the DEGs were components of photosystem, membranes, cells, and organelles. The KEGG pathway analysis revealed that some of the major pathways were biosynthesis of secondary metabolites, carbon metabolism, biosynthesis of amino acids, glycolysis/ gluconeoprotein, starch and sucrose metabolism, and oxidative phosphorylation (Supplementary Fig. S3). A total of 462, 464, and 437 differentially expressed TFs were identified in 'Paldalkong', 'Danbaekkong', and 'NTS1116', respectively. The RNA-Seq analysis showed that 308 and 154; 244 and 220; and 286 and 151 TFs were up-and down-regulated in 'Paldalkong', 'Danbaekkong', and 'NTS1116', respectively (Fig. 3). The number of the up-regulated TFs was higher than that of the down-regulated TFs in all cultivars. The bHLH, bZIP, ERF, MYB, and WRKY families, some of the TF families related to stress tolerance, represented the majority of the differentially expressed TFs in all cultivars. The number of up-regulated TFs in bHLH, ERF, and WRKY was higher in the susceptible cultivar; whereas that of bZIP and MYB was higher in the tolerant cultivars. Another stress-related TF family G2-like also consisted of a higher number of up-regulated TF in the tolerant cultivars than in the susceptible cultivar.

Differentially expressed genes in the QTL region
Several abiotic stress-related DEGs including GST (GLYMA_11G216400), xyloglucan endotransglucosylase/hydrolase (LOC100799976), polygalacturonase (GLYMA_07G066900), calmodulin-binding p rot e i n ( G LY M A _ 0 7 G 0 0 8 4 0 0 ) , c yst e i n e p rotease (GLYMA_12G039400), soybean NAC gene (GLYMA_07G050600), DEAD-box RNA helicase gene (GLYMA_07G056600), and dehydration-responsive element-binding (GLYMA_07G017300) reside in the QTL associated with flooding tolerance in soybean. Moreover, all of these genes (except GST) were found in the QTL hotspots which were the regions containing relatively more stable QTL with higher phenotypic variation. Some of these DEGs, which were found in the QTL and revealed by the RNA-Seq analysis, were up-regulated in the tolerant and down-regulated in the susceptible cultivar (Supplementary Table S1).

Discussion
The results indicated that the flooding tolerance level of susceptible cultivar reduces as the flooding period extends. Several genes are found to be differentially expressed in various abiotic stresses including flooding, drought, salinity, and low temperature (Deshmukh et al. 2014;Patil et al. 2016). Although several physiological and molecular mechanisms are involved in specific stress responses, the majority of genes and pathways are common across various stress conditions (Deshmukh et al. 2014). The RNA-Seq expression analysis in this study revealed the DEGs in flood-tolerant and flood-susceptible cultivars under flooding stress conditions. The results of the up-and down-regulated genes in the tolerant and susceptible cultivars could provide useful information for the comparative transcriptional analysis of cultivars with differential stress tolerance. Such transcriptional analyses are useful in collecting the information about candidate genes associated with flooding tolerance that would help develop flooding tolerant cultivars.
Plants undergo various adaptive mechanisms, including molecular responses by regulating several genes, to survive and retain normal growth under stress conditions. In this study, several genes were also found to be differentially expressed in tolerant and susceptible cultivars under flooding stress. The analysis of DEGs was basically focused on the genes that were related to stress tolerance mechanism. Relative expressions of genes such as lipoxygenase, expansin, phosphofructokinase, GST, and SWEET between the tolerant and susceptible cultivars were analyzed. Although some of the stress-related genes were differently expressed across cultivars, the overall analysis showed that a relatively higher number of genes were up-regulated in the tolerant than in the susceptible cultivars. Lipoxygenase was reported to play roles in the adaption of radish to biotic and abiotic stresses . Results with higher numbers of up-regulated expansin-coding genes were also found in a seed-flooding tolerant soybean (Sharmin et al. 2020). The roles of expansin on germination, root length, and the number of lateral roots under abiotic stresses have been reported (Lü et al. 2013;Marowa et al. 2016). Increased expression of phosphofructokinase, a glycolytic enzyme involved in the glycolytic pathway, is associated with adaptation to flooding and drought stresses in soybean (Oh and Komatsu 2015). The GST is believed to play a protective role against flooding and drought stresses in soybean (Chen et al. 2016;Oh and Komatsu 2015). The GSTs scavenge the reactive oxygen species, which may be produced in high concentration due to stresses (Klok et al. 2002) leading to cell death (Clement et al. 2008;Wrzaczek et al. 2011), and protect plants from oxidative damage. Up-regulations of several SWEET genes were also observed in tolerant soybean lines under drought and flooding stresses . It is implied that the regulation of SWEET genes affects sugar balancing under flooding stress condition . The role of SWEET gene on plant development and stress tolerance has also been reported in Arabidopsis (Klemens et al. 2013).
As in a previous study (Sharmin et al. 2020), some cell wall-related genes such as GLYMA_08G249800 and GLYMA_18G272100 associated with COBRA-like protein were significantly up-and down-regulated in tolerant and susceptible cultivars, respectively. However, a higher number of the genes related to proline-rich cell wall protein, which were up-regulated in tolerant genotype (Sharmin et al. 2020), were down-regulated in the present study. Sharmin et al. (2020) reported that these cell wall-related genes had an important function in retaining the cell wall flexibility under flooding stress condition.
Abscisic acid (ABA) and other upstream signals may regulate the downstream pathways in abiotic stress responses (Tuteja 2007). A few number of the ABA-related TFs such as AP2, bZIP, and MYB were up-regulated in 'Paldalkong' and/or 'Danbaekkong' than in 'NTS1116'. Soybean AP2 TFs are believed to play a vital role in enhancing abiotic stress tolerance in Arabidopsis and soybean (Zhao et al. 2019). The TF families of bZIP and MYB had been also reported to have an important role in abiotic stress response in soybean, tobacco, and Arabidopsis (Cai et al. 2015;Shukla et al. 2015;Xu et al. 2016). The bZIP TFs may adjust regulation of a few stress-related genes that result into accumulation of proline (Hoang et al. 2017). Proline is supposed to contribute in stabilizing sub-cellular structures, scavenging free radicals, and exhibiting signals for cross tolerance to many stresses (Kaur and Asthir 2015). The numbers of up-regulated G2-like TF were higher in 'Paldalkong' (16) and 'Danbaekkong' (12) than in 'NTS1116' (7). Tahmasebi et al. (2019) implied that G2-like TFs contributed to enhance abiotic tolerance in tobacco since they participated in the formation and development of chloroplast (Liu et al. 2016).
The results of some stress-related DEGs by RNA-Seq in this study were comparatively analyzed with putative genes in the QTL associated with flooding tolerance (Dhungana et al. 2020). Several genes located in the QTL regions were also found by RNA-Seq to contribute to flooding and other abiotic tolerance. For example, the cell wall remodeling enzymes such as xyloglucan endotransglucosylase/hydrolase maintains cell wall stiffness and helps overcome stress (Tenhaken 2015). Polygalacturonase has been found responsive to flooding stress in soybean (Nanjo et al. 2013). The contribution of calmodulin-binding protein in biotic and abiotic stresses tolerance has also been reported . Cysteine protease shows a positive impact on abiotic stress tolerance in soybean (Mangena 2020). Overexpression of the soybean NAC gene enhanced root formation and abiotic stress tolerance in transgenic Arabidopsis (Yang et al. 2019). Tomato plant with the overexpressed DEAD-box RNA helicase gene showed drought and salt tolerance . The dehydration-responsive element-binding gene improved heat and drought stress tolerance in Arabidopsis (Mizoi et al. 2013). These reports further support the contribution of DEGs to flooding tolerance and increase the usefulness of their genetic information for soybean breeding programs.

Conclusion
Flood-tolerant and -susceptible soybean cultivars were grown under flooding stress and control conditions to investigate the gene expression patterns in leaf tissues using RNA-Seq. The flooding tolerance level, calculated as the leaf chlorophyll index, of susceptible cultivar reduced as flooding period prolonged. The number of DEGs in the tolerant cultivars was higher than that in the susceptible cultivar. A higher number of various genes and TFs related to stresses, including flooding, were up-regulated in the tolerant cultivars than in the susceptible cultivar. The results of QTL associated with flooding tolerance that were obtained with the same genetic background of tolerant parents, 'Paldalkong' and 'Danbaekkong' could support the findings of putative genes by RNA-Seq in this study. The variation in expression of some genes in the tolerant and susceptible cultivars could provide useful information about candidate genes associated with flooding tolerance that may be valuable to develop flooding tolerant soybean cultivars.