Background

Rising global temperature, drought stress and increased exposure to air pollutants have contributed to decreased regional and global crop production [1,2,3] and represent a challenge for future agriculture [4, 5]. If fossil fuel emissions continue at their current pace, global land surface temperatures are projected to increase by 5–9 °C by the end of the century [6]. Increased demand for soil moisture imposed by higher temperatures, coupled with climate projections of more variable precipitation patterns in the future, will result in increased drought stress [7, 8]. Higher temperatures will also favor formation of atmospheric pollutants, including ozone (O3), which significantly decreases plant productivity [9].

Genomic approaches have proven powerful tools to understand the complex relationship among genes, proteins and metabolites involved in plant responses to abiotic stress and future climate change [10,11,12,13,14]. In addition, there is growing awareness of the importance of investigating the mechanisms of crop response to environmental change in the dynamic field environment where multiple variables interact [15,16,17,18,19,20,21,22,23]. Coupling integrated analyses at the molecular, biochemical, physiological and agronomic level of crop responses to global climate change within a production environment has led to better understanding of the underlying transcriptomic responses responsible for complex phenotypes observed across plant species and stressors [24,25,26].

Soybean (Glycine max L. Merr.) is the most widely grown legume worldwide and provides an important global source of oil and protein for food and feed [27]. Soybean seeds provide the economic value for the commodity and have been used as a model system for identifying genes and gene networks required for seed development [28, 29]. The soybean seed coat is a critical tissue that serves as a conduit for water and nutrients [30, 31], coordinates embryo and endosperm growth [32], and encapsulates and protects the embryo at maturity [33]. As the link between maternal and filial tissue, the seed coat plays a critical role in the metabolic control of seed development, and in turn, successful seed production [34]. This is achieved, in part, by the activities of acid invertases (vacuolar and cell wall) and sucrose synthase, which facilitate sucrose transport by generating a strong sucrose-to-hexose gradient across the apoplastic space between the seed coat and the developing seed [30, 35,36,37]. It has been demonstrated that high temperature or drought stress imposed during soybean seed development can cause changes in seed coat morphology leading to negative effects on seed quality, seed germination rate, and seedling vigor [38,39,40].

Global transcriptional profiling studies have described the genetic events involved in soybean seed development [28, 29, 41,42,43] and seed coat pigment color [44], and have identified a complex seed coat-specific transcriptome [31]. However, the effects of abiotic stress on gene expression patterns in the seed coat have not been explored. Here, we investigated the transcriptional response of soybean seed coat tissue to abiotic stresses including drought, elevated O3 concentration ([O3]), or elevated temperature throughout the growing season in a field setting. Due to the critical role the seed coat plays in supplying nutrients to developing seeds, we investigated abiotic stress-mediated transcriptional changes during the pod filling stage and coupled this with physiological and biochemical activity to identify genes involved in abiotic stress response in the seed coat.

Results

Photosynthesis, but not photoassimilate transport to the seed coat, is altered by abiotic stress in soybean

Leaf-level photosynthetic and biochemical measurements were taken to characterize the effects of abiotic stress on field-grown soybean. Soybeans exposed to drought, elevated [O3], or elevated temperature had significantly lower rates of photosynthesis (A) and stomatal conductance (gs; Fig. 1a-f; p-value <0.10). A was reduced by 20–40% with exposure to abiotic stress, while gs was reduced by 35–56% (Fig. 1a-f). Leaf total nonstructural carbohydrate (TNC) content was also significantly decreased in the elevated temperature (Fig. 1i, p-value <0.10) and drought (Fig. 1g, p-value <0.10) treatments, but not in elevated [O3] (Fig. 1h, p-value >0.10). Average leaf TNC values for elevated temperature and drought were 1131.9 and 2166.0 μmol g DW−1, respectively, representing a 23.1 and 21.2% decrease from control conditions.

Fig. 1

The effects of abiotic stress on primary metabolism. (a-c) Carbon assimilation rate (A), (d-f) stomatal conductance (gs) and (g-i) foliar total non-structural carbohydrate content (TNC) of soybeans grown in season-long drought (a, d, g), elevated [O3] (b, e, h) and elevated temperature (c, f, i) conditions. A and gs measurements were taken on different days for the ozone and temperature treatments and in a different year for the drought treatment in order to capture the proper developmental stage. The light level for A and gs measurements was 1900 μmol m−2 s−1 for the drought treatment, 1650 μmol m−2 s−1 for the ozone treatment, and 450 μmol m−2 s−1 for the temperature treatment (due to cloudy day conditions). Asterisks represent significance (* p < 0.05, ** p < 0.01, *** p < 0.001)

The abundance of TNCs was also quantified at three positions along the petiole based on the proximity to source and sink tissues (Fig. 2). There was a significant difference between petiole TNC at the leaf position (source) and stem position (sink) for all treatments (Fig. 2a-c, p-value <0.10). These differences were likely driven by changes in starch content, as sucrose content did not differ in different positions of the petiole (data not shown). No significant effect of drought, elevated [O3], or elevated temperature on petiole TNC was observed at any position (Fig. 2a-c) (p-value >0.10), with the exception of the middle position in the drought treatment, which saw increased petiole TNC in drought conditions (Fig. 2a; p-value <0.10). Additionally, a similar high ratio of seed coat sucrose to cotyledon hexose was maintained in the seeds of plants grown at elevated [O3] and elevated temperature (Fig. 3) suggesting that sink strength at the seed coat was not altered by those stresses [35]. The average ratios of seed coat sucrose to cotyledon hexose concentration in elevated [O3] and elevated temperature conditions were 5.09 and 4.02, which were not significantly different from the ratios in ambient conditions (Fig. 3; p-value >0.10).

Fig. 2

Carbohydrate gradient along the leaf petiole. TNC content of the petiole lengths at three positions along the petiole (leaf adjacent, middle and stem adjacent) for (a) drought, (b) elevated [O3] and (c) elevated temperature. Different letters represent significant differences at p < 0.10

Fig. 3

Maintenance of sink strength during pod-fill. Hexose and sucrose content in the seed coat and cotyledon tissues for (a) ambient and elevated [O3], and (b) ambient and elevated temperature. Sucrose:hexose was calculated as the ratio between seed coat sucrose and cotyledon hexose content and is not significantly different for either stress treatment. Con Hex: control hexose concentration; Trt Hex: treatment hexose concentration; Con Suc: control sucrose concentration; Trt Suc: treatment sucrose concentration

At maturity, the number of pods per node was measured in all experimental plots and total seed yield was measured in the drought and elevated temperature plots (Table 1). The total number of pods per node decreased for all abiotic stress treatments. There were 3.3–3.4 pods per node in ambient conditions, and growth under abiotic stress conditions reduced pod number to 2.7–3.0 pods per node (Table 1). Seed yield was also reduced by 25.4% in drought and 20.5% in elevated temperature (Table 1).

Table 1 Whole plot seed yield and number of pods per node (PPN) in response to three abiotic stress treatments and their respective control

Analysis of global changes in expression abundance across multiple abiotic stress conditions in the soybean seed coat

The quality of all RNA-Sequencing (RNA-Seq) libraries was assessed based on mapping alignment statistics (Additional file 1: Table S1). All libraries except two aligned at >80% to the reference genome. Analysis of the low mapping libraries showed some contamination with bean pod mottle virus (drought treatment replicate 1) and various non-plant contaminations (drought treatment replicate 2). Pearson correlation coefficients were calculated between all biological replicates per treatment. A high correlation was expected between replicates (per treatment), and replicates with low R2 values <0.60 were removed from the analysis (Fig. 4) [45]. Hierarchical clustering of expression data demonstrated stronger clustering of the temperature treatment samples, compared to drought and elevated [O3] treatment samples (Fig. 4).
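The replicate screen described above can be illustrated with a minimal Python sketch. It assumes that correlations are computed on per-gene expression values and that pairs are compared against the 0.60 R2 cutoff; the function names and the example values are hypothetical and are not taken from the study's data or scripts.

```python
from itertools import combinations
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


def low_correlation_pairs(replicates, min_r2=0.60):
    """Return replicate pairs whose squared correlation is below min_r2."""
    flagged = []
    for (name_a, expr_a), (name_b, expr_b) in combinations(replicates.items(), 2):
        r2 = pearson_r(expr_a, expr_b) ** 2
        if r2 < min_r2:
            flagged.append((name_a, name_b, round(r2, 3)))
    return flagged


# Hypothetical expression values for five genes in three replicates
# of one treatment (illustration only, not data from this study).
example = {
    "rep1": [10.0, 20.0, 30.0, 40.0, 50.0],
    "rep2": [12.0, 19.0, 33.0, 41.0, 48.0],
    "rep3": [50.0, 10.0, 45.0, 5.0, 30.0],
}
flagged_pairs = low_correlation_pairs(example)
```

In this toy example, rep3 appears in every flagged pair, which is the kind of pattern that would mark a single replicate for removal.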

Fig. 4

Heat map of Pearson correlation coefficient values across all biological replicates and treatments. ConD: drought control; ConO3: ozone control; ConT: temperature control; Dri: drought; O3: elevated ozone, Tmp: elevated temperature

Differential gene expression analysis revealed relatively few differentially expressed genes in the soybean seed coat in response to drought and elevated [O3], as may have been predicted from the clustering analysis. Only 49 differentially expressed genes in the soybean seed coat were detected in response to drought, and 148 differentially expressed genes in response to elevated [O3] (FDR p-value <0.10) (Additional file 2: Table S2). In contrast, 1576 differentially expressed genes were detected in the soybean seed coat in response to elevated temperature (Additional file 2: Table S2). However, as we only sampled a single time point during soybean pod fill (R5), we cannot rule out that elevated [O3] and drought elicit more substantive transcriptional changes in other tissues, or at different developmental time points. Gene ontology (GO) term analysis did not reveal any enriched terms in the drought and elevated [O3] treatments; therefore, we focused our transcriptional analysis on the elevated temperature treatment.

Table 2 Minichromosome maintenance (MCM) genes with increased expression under elevated temperature

Genes that showed the largest increase in expression (log fold-change) in the elevated temperature treatment included peroxidase proteins, sugar transporter proteins, MYB-domain and leucine-rich repeat domain proteins, and long-chain-alcohol oxidase proteins (Additional file 2: Table S2). Additionally, genes with the highest overall expression in the elevated temperature treatment, represented as the normalized fragments per kilobase of transcript per million mapped reads (FPKM), included BURP domain-containing proteins, peroxidase family proteins, extensin-like proteins, senescence-associated genes, and seed storage albumin superfamily proteins (Additional file 3: Table S3).

Differentially expressed genes in elevated temperature were related to chlorophyll biosynthetic processes, DNA replication, and nucleosome assembly, based on GO analysis (Additional file 4: Figure S1). Functional analysis of genes with GO terms related to DNA replication identified twelve minichromosome maintenance (MCM) family protein genes (Table 2). MCM proteins are licensing factors and part of the pre-replicative complex (pre-RC) in eukaryotes, playing an essential role in cell division [46]. MCM2 to MCM7 encode subunits of the MCM(2–7) hexamer helicase that is recruited to replication origins [47]. In our study, two genes from each of the MCM families 2 through 7 were significantly increased by growth at elevated temperature. To further understand the role of MCM genes in response to elevated temperature in the soybean seed coat, additional analyses were completed.

MCM genes may play a role in maintaining proper DNA replication in the soybean seed coat under elevated temperature stress

Multiple sequence alignment of all twelve differentially expressed MCM genes in the soybean seed coat was completed with the sequences of known MCM genes in Arabidopsis (AtMCM2–7), maize (ZmMCM2–7), pea (PsMCM2–7), and two Brassica species (B. oleracea BoMCM2–7, B. rapa BrMCM2–7) (Additional file 5: Figure S2). Soybean MCM2 and MCM7 genes had between ~76–84% amino acid identity with both B. oleracea and B. rapa MCM genes, while soybean MCM4, 5 and 6 genes showed higher amino acid sequence similarity with B. rapa MCM genes (Additional file 6: Table S4), ranging from ~73–78% identity. Soybean MCM3 (GmMCM3) genes Glyma05g25980 and Glyma08g08920 had 76–77% identity with BoMCM3 and BrMCM3_2, but 32% identity with BrMCM3_1. Soybean MCM6 (GmMCM6) genes Glyma09g05240 and Glyma15g16570 had 75.0 and 74.7% amino acid identity with ZmMCM6, and 84.1 and 84.7% amino acid identity with PsMCM6 (Additional file 6: Table S4). Both GmMCM2 genes Glyma07g36680 and Glyma17g03920 had ~76% amino acid identity with AtMCM2 (Additional file 6: Table S4). The high sequence similarity with known MCM genes in Arabidopsis, maize, pea and Brassica species may suggest a similar role of soybean MCM genes in proper development under abiotic stress. Additionally, analysis of expression of these twelve MCM genes in the soybean expression atlas (https://soybase.org) found high expression in young leaf tissue (Additional file 7: Table S5). This indicates that the MCM expression in the soybean seed coat is not unique to that tissue, most likely due to their general role in DNA replication.

Discussion

Reductions in primary metabolism and yield are common responses to abiotic stress [21, 22, 48,49,50,51], and our study confirmed that drought, elevated [O3] and elevated temperature reduce photosynthesis and seed yield. We also predicted that photoassimilate available for translocation to developing reproductive tissues would be diminished. However, despite decreased leaf-level CO2 assimilation in soybeans grown under drought, elevated [O3], and elevated temperature treatments (Fig. 1), translocation of photoassimilate was not altered. Additionally, a high sucrose:hexose ratio between the seed coat and cotyledon was maintained (Fig. 3), which is one possible indicator that the sink strength of individual seeds was not affected by abiotic stress treatments [35], despite a net reduction in sink strength at the whole plant level. To further understand the effects of abiotic stresses on the soybean seed coat, we performed transcriptomic analysis.

Differential gene expression analysis revealed far fewer genes differentially expressed in the drought and elevated [O3] treatments compared to the elevated temperature treatment (Additional file 2: Table S2). This suggests that different abiotic stresses do not elicit common transcriptional responses in the soybean seed coat, which is consistent with other transcriptional and metabolomic experiments done with Arabidopsis [52]. However, our analysis was limited to a single time point during the pod filling stage (R5), and it cannot be ruled out that drought and elevated [O3] induce more substantive transcriptional responses at other developmental time points, or in tissues other than the seed coat.

Genes involved with DNA replication showed increased expression in seed coats of soybeans exposed to elevated temperature. In particular, twelve MCM family genes showed greater expression at elevated temperature. MCM proteins form a heterohexameric complex (MCM2–7) that is a key part of the initiation and elongation steps in eukaryotic DNA replication [46, 47]. MCM proteins also ensure DNA replication occurs only once during the S phase of the cell cycle [53]. How MCM genes control DNA replication, however, is less well understood in plants [46, 47, 54,55,56]. Previous work in Arabidopsis and maize has found MCM genes are preferentially expressed in young tissues with large numbers of replicating cells [57,58,59,60,61,62,63], and in Arabidopsis MCM subunits are coordinately expressed across tissue types and development [54]. MCM proteins are also critical components of plant reproductive development. Work done in Arabidopsis has identified MCM genes essential for embryo development [64], and MCM proteins required for proper cytokinesis during seed development [57, 58, 61]. Work done in maize found ZmMCM6 is an essential protein for both vegetative and reproductive growth [63], and transgenic maize plants with minor antisense transcript amounts of ZmMCM6 had an overall reduced size and were unable to develop cobs to maturity [63]. We found high sequence similarity with known MCM genes in Arabidopsis, maize, pea and two Brassica species (Additional file 6: Table S4), which may indicate a similar functional role of MCM genes in the soybean seed coat.

There is growing evidence that MCM proteins also play a role in plant response to abiotic stress. For example, work in pea (P. sativum) has shown MCM6 is associated with salinity tolerance [46]. Furthermore, constitutive expression of PsMCM6 in tobacco seedlings increased salinity tolerance. Findings from Dang et al. [46] indicate that MCM proteins may interact with proteins that are related to stress tolerance, and/or are involved in transcriptional regulation of stress response genes through their function as helicases. Recently, it was demonstrated that different MCM family genes were up-regulated in B. oleracea and B. rapa in response to cold and salt stress, suggesting some degree of species-specific response [65]. These previous analyses have suggested that subunits of the MCM complex are not changing in concert in response to stress, and that perhaps different subunits of the MCM complex can respond independently. In our study of soybean seed coats, two transcripts from each of the MCM families 2 through 7 increased expression in elevated temperature stress (Additional file 2: Table S2). We hypothesize that the coordinated increase in expression allowed for greater DNA replication and cell cycle activity mediated by the MCM(2–7) helicase in the soybean seed coat under high temperature stress, possibly due to acceleration of seed development in the temperature stress conditions.

Conclusions

This study investigated the transcriptomic response of the soybean seed coat to multiple climate change factors in a field environment. Soybean plants exposed to drought, elevated [O3], and elevated temperature showed decreased carbon assimilation and stomatal conductance, leading to decreased leaf TNC in drought and elevated temperature treatments. At maturity, soybean yield was also decreased in drought and elevated temperature. While decreased carbon assimilation was observed, there was no observed decrease in photoassimilate transport from source to sink tissue, as measured by petiole TNC abundance at three positions along the petiole. Additionally, sink strength was maintained in the soybean seed coat: a high seed coat sucrose-to-cotyledon hexose ratio was maintained in seeds of plants exposed to elevated [O3] and elevated temperature. Transcriptomic analysis found elevated temperature caused increased expression of genes related to DNA replication, cell cycle and microtubule motor family proteins, in particular MCM genes. This indicates greater cell cycle and DNA replication activity in seeds exposed to elevated temperature, and represents a possible acceleration of the completion of seed development due to elevated temperature stress.

Methods

Experimental site and plant growth conditions

Soybean (Glycine max cv. Pioneer 93B15) was grown in drought conditions (n = 3) at the 32-hectare Soybean Free Air Concentration Enrichment (SoyFACE; https://soyface.illinois.edu) experimental field site in the summer of 2011 and in elevated [O3] (n = 4) and elevated temperature conditions (n = 4) in the summer of 2012. Soybeans were planted on 8 June 2011 and 15 May 2012, at 0.38 m row spacing. Soybean and maize (Zea mays) are rotated each year at the experimental facility, and the soybean crop was not fertilized or irrigated. For each stress, soybean plants were grown in control and treatment plots nested within the 32 ha field. Each ozone plot was 21 m in diameter, with control and treatment plots separated by a minimum of 100 m. The elevated [O3] fumigation system described in [66] increased [O3] to 100 nL L−1 from ~10:00 to 17:00, except when leaves were wet. In 2012, the season-long 8-h average ambient [O3] was 50.6 nL L−1, and the 8-h season-long elevated [O3] was 69.7 ± 1.3 nL L−1. Drought was established by employing modified Solair motorized retractable fabric awnings (Glen Raven, Inc., Glen Raven, NC, http://www.glenraven.com) mounted 25–50 cm above the plant canopy to intercept nighttime rainfall (described in [67]), resulting in a 35% reduction in total growing season precipitation (control precipitation, 274 mm; reduced precipitation, 179 mm). The drought plots were 8 m long and 4 m wide. The elevated temperature treatment was produced using infrared heaters (Salamander Aluminum Extrusion Reflector Assembly Housing for Ceramic Infrared Heaters; Mor Electric Heating Assoc., http://www.morelectricheating.com) fitted with four heating elements (Mor-FTE 1000-W, 240-V heaters; Mor Electric Heating Assoc., http://www.morelectricheating.com) mounted 1.2 m above the plant canopy (described in [21]). The growing season mean increase in temperature was 2.71 °C ± 0.4 °C in the temperature plots.

Photosynthetic gas exchange, tissue sampling, biochemical analyses and harvest

Gas exchange measurements were taken at mid-day using the middle trifoliate of fully expanded leaves at the 5th node down from the shoot apex during the pod filling stage (R5). This stage is characterized by nutrient accumulation and synthesis of storage proteins [42]. A portable infrared gas analyzer (LI-6400; Licor Biosciences, Inc., Lincoln, NE, http://www.licor.com) was used to take measurements of leaf photosynthesis (A) and stomatal conductance (gs) by setting the chamber conditions to reflect the ambient light intensity, temperature and relative humidity in the field. Three leaves from different plants were measured for each treatment and control plot.

Following gas exchange measurements, tissue was collected from the 5th node at dusk (approximately 18:00–20:00) for carbohydrate and gene expression measurements. Leaf discs (1.34 cm2) were excised from fully expanded leaves, flash-frozen in liquid N, and then stored at −80 °C. Leaf discs were also collected and dried at 55 °C for one week to assess specific leaf weight. Petioles were removed and sectioned into 2.0 cm lengths based on proximity to the leaf, the stem and at a distance mid-way between the leaf and stem. The seed coat was harvested from detached pods by making a small incision into the seed coat of the seed with a scalpel and separating the seed coat from the cotyledons. Seed coat tissue was collected from several pods per plant in order to fill a 2 mL tube. Seed coats were flash-frozen in liquid N and stored at −80 °C. Seed coat tissue was collected from ~10–20 plants per plot and pooled in order to obtain sufficient tissue for subsequent analyses.

Total non-structural carbohydrate content was calculated from sequential determination of glucose, fructose and sucrose content using the methods of [68]. The pellets remaining after the ethanol extraction were then solubilized by heating to 95 °C in 0.1 M NaOH for subsequent determination of starch content. The NaOH solution was acidified to pH 4.9 and starch content was determined from glucose equivalents [69].

Seed yield from whole-plots was determined at maturity (R8) in the drought and elevated temperature treatments. Seed yield for ambient and treatment plots was measured as total seed weight per area (g m−2). At maturity (R8), the number of pods per node was counted at the same node where physiological measurements were made. In 2011 the drought yield was obtained by harvesting all plants within 4, 1 m rows per plot, giving a sampling area of 1.542 m2. In 2012, the elevated temperature yield was obtained by harvesting all plants inside a single 1 m row per plot, giving a sampling area of 0.762 m2. Final yield for the elevated [O3] plots was not measured in 2012.
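As a worked illustration of the yield arithmetic above, the following Python sketch converts a harvested seed mass to yield per unit area and expresses a treatment effect as a percent change from control. The function names and the seed masses are hypothetical; only the 1.542 m2 sampling area is taken from the text.

```python
def yield_per_area(seed_mass_g, sampling_area_m2):
    """Seed yield in g m-2 from harvested seed mass and sampling area."""
    if sampling_area_m2 <= 0:
        raise ValueError("sampling area must be positive")
    return seed_mass_g / sampling_area_m2


def percent_change(control, treatment):
    """Percent change of a treatment value relative to its control."""
    if control == 0:
        raise ValueError("control value must be non-zero")
    return (treatment - control) / control * 100.0


# Hypothetical harvested seed masses (g) for one control and one
# treatment plot, each sampled over 1.542 m2 as in the 2011 harvest.
control_yield = yield_per_area(617.0, 1.542)
drought_yield = yield_per_area(463.0, 1.542)
change = percent_change(control_yield, drought_yield)
```

A negative value of `change` indicates a yield reduction relative to the control plot.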

Statistical analysis of physiological and biochemical data

All model assumptions of normality and homogeneous error variance, NID(0, σ2), were examined for each parameter (A, gs, TNC, and hexose/sucrose concentrations). Assumptions of normality were tested using the Shapiro-Wilk test, and the assumption of homogeneous variance was examined by plotting the residual versus the predicted value for each variable. A linear mixed model was used to assess the impact of the fixed effect of treatment (drought, elevated [O3], or elevated temperature) compared to the control with block as a random factor in the model. The dependent variables leaf TNC, and seed hexose and sucrose concentrations were fit separately. A repeated measures analysis was used for petiole TNC data, due to the correlation in space. For yield data analysis, optimal α values for the pod per node data were determined according to [70]. This method minimizes the average of Type I and Type II errors, therefore minimizing the overall error rate. This method avoids unnecessarily high rates of Type II error and is appropriate in studies where Type I and Type II errors are considered to have equal importance [70]. Degrees of freedom taken from each data set (drought, elevated [O3] and elevated temperature) and Cohen’s ƒ2 were input into R (ver. 3.0.2; www.r-project.org) using code provided by [70]. Cohen’s ƒ2 of 0.35 was chosen a priori based on previous literature [71, 72]. Based on the degrees of freedom from each data set, the optimal α value to analyze the pod per node data was higher than the traditional α = 0.05. All analyses were conducted in SAS (SAS Institute, Version 9.3, Cary, NC, http://www.sas.com/).

RNA extraction and library preparation

Total RNA was isolated from each biological replicate of frozen seed coat tissue (pooled from multiple plants) following standard protocols. Briefly, seed coats were ground to a fine powder in liquid N using a mortar and pestle. RNA was extracted using the PureLink Plant RNA Reagent (Ambion, by Life Technologies Corp., Grand Island, NY, USA, http://www.lifetechnologies.com) and genomic DNA contamination was removed from RNA samples using TurboDNase treatment (Applied Biosystems by Life Technologies, Austin, TX, USA, http://www.lifetechnologies.com) according to the manufacturer’s protocols. RNA quantity was determined with a spectrophotometer (Nanodrop 1000, Thermo Fisher Scientific, Waltham, MA, USA, http://www.thermofisher.com) and RNA quality was assessed using the Agilent 2100 bioanalyzer (Agilent Technologies, Santa Clara, CA, USA, http://www.alliedelec.com/). cDNA libraries were prepared using the Illumina TruSeq Sample Prep kit (Illumina Inc. San Diego, CA, USA, http://www.illumina.com) according to the manufacturer’s protocol. Library fragments were barcoded and multiplexed for sequencing to obtain 100 nt single-end reads. Library preparation and sequencing were performed at the Roy J. Carver Biotechnology Center using the Illumina Genome HiSeq 2000 (Illumina Inc. San Diego, CA, USA, http://www.illumina.com) and CASAVA pipeline 1.8. FASTQ files from all sequencing runs are located on the Sequence Read Archive (http://www.ncbi.nlm.nih.gov/sra), SRA089043, BioProject number PRJNA207354.

Transcriptome analyses

Sequencing adapters were removed from the raw FASTQ files using Cutadapt (ver. 1.8) [73]. A quality cutoff of 20 was used to trim low-quality bases. Only reads with a minimum length of 36 nt after trimming were retained. Trimmed RNA-Seq reads were aligned to the soybean reference genome (ver 1.1) using TopHat (ver. 1.4.1) [74]. A minimum intron length of 5 bp and a maximum intron length of 60,000 bp were used. Fragments Per Kilobase of Exon Model per Million mapped reads (FPKM) values were determined using Cufflinks (ver. 1.3.0) [75]. Differentially expressed genes were determined using edgeR [76] from count data generated from HTSeq [77]. Due to underlying heterogeneity among all plots across the entire field experiment, genes were considered significantly differentially expressed when they had an FDR-adjusted p-value less than 0.1. The soybean genome (ver 1.1) functional annotation was used for all gene annotations (https://phytozome.jgi.doe.gov/pz/portal.html). Gene ontology (GO) enrichment was performed using single enrichment analysis from agriGO (http://bioinfo.cau.edu.cn/agriGO/) with Glycine max (ver. 1.1) reference.
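Two quantities in this workflow can be illustrated with a short Python sketch: the FPKM normalization and an FDR adjustment followed by the 0.1 cutoff. This is not the Cufflinks or edgeR implementation, both of which fit statistical models not reproduced here. It assumes the textbook FPKM formula and the Benjamini-Hochberg procedure, and all function names and input values are hypothetical.

```python
def fpkm(fragment_count, exon_length_bp, total_mapped_fragments):
    """Fragments per kilobase of exon model per million mapped fragments."""
    if exon_length_bp <= 0 or total_mapped_fragments <= 0:
        raise ValueError("length and library size must be positive")
    return fragment_count * 1e9 / (exon_length_bp * total_mapped_fragments)


def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def significant_genes(gene_ids, p_values, fdr_cutoff=0.1):
    """Gene IDs whose adjusted p-value is below the FDR cutoff."""
    adjusted = benjamini_hochberg(p_values)
    return [g for g, q in zip(gene_ids, adjusted) if q < fdr_cutoff]


# Hypothetical gene labels and raw p-values (illustration only).
genes = ["geneA", "geneB", "geneC", "geneD"]
raw_p = [0.01, 0.04, 0.03, 0.20]
hits = significant_genes(genes, raw_p)
```

With these toy inputs, the first three genes pass the 0.1 cutoff after adjustment and the fourth does not.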

Multiple sequence alignment

Multiple sequence alignment was performed using Clustal Omega (http://www.ebi.ac.uk/Tools/msa/clustalo/) [78]. Peptide sequences for the analysis were downloaded from the following public databases and can be found in Additional file 8: Table S6.
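The percent identity values reported from such alignments can be illustrated with a minimal Python sketch. It assumes identity is counted over aligned columns in which neither sequence has a gap; alignment tools differ in the denominator they use, so this convention is an assumption rather than a description of Clustal Omega's output. The function name and the toy peptide strings are hypothetical and are not real MCM sequences.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over columns where neither aligned sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for res_a, res_b in zip(aligned_a.upper(), aligned_b.upper()):
        if res_a == gap or res_b == gap:
            continue  # skip columns containing a gap in either sequence
        compared += 1
        if res_a == res_b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Toy aligned peptide fragments (illustration only).
identity = percent_identity("MKT-LLV", "MRTALLV")
```

Here five of the six ungapped columns match, so `identity` is about 83.3%.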