Introduction

Plant–insect interactions have been widely studied (Ferry et al. 2004; Gatehouse 2002; Kessler and Baldwin 2002; Van Poecke 2007; Wu and Baldwin 2009). Plant resistance to herbivores can be divided into two types, direct and indirect (Kessler and Baldwin 2002). Direct defenses are conferred by secondary metabolites, proteinase inhibitors (PIs) and physical barriers, which have toxic, repellent or anti-digestive effects on insects (Wu and Baldwin 2009). In indirect defense, plants gain protection by emitting volatile compounds that attract the natural enemies of herbivores (Heil 2008; Kessler and Baldwin 2002). Plant defenses against herbivorous insects can also be categorized as constitutive or induced, based on the timing of defense activation (Gatehouse 2002). Whatever their type, defenses impose costs on plant growth or reproduction (Zavala and Baldwin 2004). Plants therefore need a sophisticated regulatory network to maintain a balance between growth and defense responses.

Three phytohormones, jasmonic acid (JA), ethylene (ET) and salicylic acid (SA), play important roles in the response to herbivore attack (Wu and Baldwin 2009). JA positively regulates a series of genes conferring resistance to herbivorous insects, whereas ET and SA mainly function as negative regulators with antagonistic effects on JA-dependent responses (Wu and Baldwin 2009). Other phytohormones, such as abscisic acid (ABA), auxin, cytokinin and gibberellins, are also involved in herbivore-induced responses (Gatehouse 2002; Stam et al. 2014). In addition to phytohormones, anti-herbivore secondary metabolites such as phenylpropanoids, glucosinolates, alkaloids and phenolics also have important functions in conferring resistance to herbivores (Wu and Baldwin 2009). Phenylpropanoids are a diverse group of secondary metabolites, including flavonoids, coumarins, monolignols and lignans, that are derived from phenylalanine, an end product of the shikimate pathway (Fraser and Chapple 2011). The phenylpropanoid pathway is indispensable to plants and is involved in plant defense, structural support and survival (Vogt 2010). Overexpression in Arabidopsis of PAP1, a transcription factor gene that regulates phenylpropanoid biosynthesis, reduced the feeding rate of the generalist Spodoptera frugiperda (Johnson and Dowd 2004).

Soybean (Glycine max) is one of the most important oil and protein crops in the world. Cotton bollworm (Helicoverpa armigera), beet armyworm (Spodoptera exigua Hübner), common leafworm (Spodoptera litura), bean leaf beetle (Cerotoma trifurcata) and green cloverworm (Heliothis dipsacea) are important defoliating insects that seriously damage soybean yield and quality. Although Bt genes and chemical insecticides have been widely used against these insects (Stewart et al. 1996), the use of native soybean resistance genes, through either traditional breeding or genetic engineering, could broaden the options for insect management and would be economical and environmentally friendly (Wang et al. 2014, 2015a, b; Chen et al. 2015; Gunadi et al. 2016; Zhai et al. 2017). Thus, identifying native resistance genes and elucidating the molecular mechanism of resistance to defoliating insects are of great significance for soybean improvement. As the ancestor of cultivated soybean, wild soybean (Glycine soja) has retained more genetic diversity and carries more useful stress-resistance genes, some of which might have been lost from cultivated soybean during domestication (Guan et al. 2014; Li et al. 2014; Qi et al. 2014; Winter et al. 2007). Mining stress-resistance genes from wild soybean has therefore become a focus of soybean research. In a previous evaluation of soybean germplasm for resistance to herbivorous insects, we found that a wild soybean, ED059, had the highest level of resistance to H. armigera (Wang et al. 2015a, b). To investigate the resistance mechanism in ED059, we conducted comparative analyses of the transcriptome and proteome in response to H. armigera between ED059 and a susceptible control, cultivar (cv.) Tianlong 2. The results suggested a potentially important role for metabolic processes in conferring resistance to H. armigera in ED059. One candidate resistance gene was functionally confirmed by overexpressing it in Arabidopsis. Our study provides insights into the mechanism of resistance to H. armigera in wild soybean ED059, which could contribute to the understanding of defense mechanisms against herbivorous insects in plants.

Results

Overview of RNA-seq data

Our previous studies showed that ED059 had a high level of resistance to H. armigera in both choice and no-choice tests (Wang et al. 2015a, b). To investigate the resistance mechanism, we conducted RNA-seq transcriptome analyses of leaves from ED059 and a susceptible cultivar (cv.), Tianlong 2, at 24 h with or without H. armigera treatment. Four libraries were constructed: 1L, Tianlong 2 at 24 h without treatment; 1-L, Tianlong 2 at 24 h with treatment of H. armigera; 2L, ED059 at 24 h without treatment; and 2-L, ED059 at 24 h with treatment of H. armigera (Table 1). The number of total raw tags for each library ranged from 5.82 to 6.23 million (Table 1). After filtering to remove 3′ adaptor fragments, low-quality tags and several types of impurities, the clean tags for each library accounted for 97% of raw tags (Table 1). More than 2.5 million tags per library, accounting for more than 45% of clean tags, were unambiguously mapped to the reference genome (http://www.soybase.org/) (Table 1). Sequencing saturation analysis showed that the number of detected genes approached saturation when the number of tags reached two million (Additional file 1), which indicated that the RNA-seq data for each library contained sufficient information on soybean genes for further studies of gene expression. The number of unambiguous clean tags for each gene was normalized to TPM (number of transcripts per million clean tags) to represent the expression level of each gene (Morrissy et al. 2009).
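The TPM normalization described above can be illustrated with a minimal sketch. It assumes that TPM is the number of unambiguous clean tags for a gene divided by the total number of clean tags in the library, scaled to one million; the function name, gene names and counts are hypothetical and are not taken from the data set.

```python
def tags_per_million(tag_counts, total_clean_tags):
    """Scale per-gene unambiguous clean-tag counts to tags per million (TPM).

    tag_counts: dict mapping gene ID -> number of unambiguous clean tags.
    total_clean_tags: total number of clean tags in the library (assumed
    to be the normalization denominator).
    """
    if total_clean_tags <= 0:
        raise ValueError("total_clean_tags must be positive")
    return {
        gene: count * 1_000_000 / total_clean_tags
        for gene, count in tag_counts.items()
    }


# Hypothetical library with 6 million clean tags.
example = tags_per_million({"geneA": 30, "geneB": 6}, 6_000_000)
print(example)
```

Because each library is scaled by its own clean-tag total, TPM values from libraries of different depth can be compared directly.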

Table 1 Summary of RNA-seq data for four samples, 1L, 1-L, 2L and 2-L. 1L, Tianlong 2 at 24 h without treatment of H. armigera; 1-L, Tianlong 2 at 24 h with treatment of H. armigera; 2L, ED059 at 24 h without treatment; and 2-L, ED059 at 24 h with treatment of H. armigera

Specifically expressed genes in response to H. armigera between ED059 and cv. Tianlong 2

To investigate the differences in defense mechanisms between ED059 and cv. Tianlong 2, we compared their gene expression in response to H. armigera. Genes with at least a twofold change and FDR (false discovery rate) ≤0.001 were defined as significantly differentially expressed genes (DEGs). Hierarchical clustering of the DEGs in 1-L versus (vs.) 1L and 2-L vs. 2L suggested clearly different defense responses against H. armigera in ED059 and Tianlong 2 (Fig. 1a). In summary, 2213 specific DEGs, 417 up-regulated and 1796 down-regulated, were identified in 2-L vs. 2L (Fig. 1b; Additional file 2). Interestingly, some genes involved in the JA pathway were significantly down-regulated in ED059, including OPR3 (Glyma17g38160.1), JAR1 (Glyma19g44310.1) and MYC2 (Glyma08g36720.2, Glyma01g15930.1, Glyma01g02251.1, Glyma06g35330.1, and Glyma08g28010.1). In comparison, 2179 specific DEGs, 1374 up-regulated and 805 down-regulated, were identified in 1-L vs. 1L (Fig. 1b; Additional file 3). A total of 148 up-regulated and 252 down-regulated genes were identified in both comparisons (Fig. 1b; Additional file 4).
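The DEG criteria and the split into specific and common genes can be sketched as follows. This is an illustration only: it assumes that a twofold change corresponds to |log2 ratio| ≥ 1 and that a gene is "specific" when it is a DEG in one comparison but not in the other. How genes with opposite directions in the two comparisons were assigned is not stated in the text, so this sketch simply places them in the common set. All names and values are hypothetical.

```python
def call_degs(results, min_abs_log2=1.0, max_fdr=0.001):
    """Return {gene: 'up' or 'down'} for genes passing both thresholds.

    results: dict mapping gene ID -> (log2 ratio treated/control, FDR).
    """
    degs = {}
    for gene, (log2_ratio, fdr) in results.items():
        if abs(log2_ratio) >= min_abs_log2 and fdr <= max_fdr:
            degs[gene] = "up" if log2_ratio > 0 else "down"
    return degs


def split_specific_and_common(degs_a, degs_b):
    """Return (specific to a, specific to b, common) gene sets."""
    genes_a, genes_b = set(degs_a), set(degs_b)
    return genes_a - genes_b, genes_b - genes_a, genes_a & genes_b


# Hypothetical results for the two comparisons.
ed059 = call_degs({"g1": (1.5, 0.0001), "g2": (-2.0, 0.0005),
                   "g3": (0.5, 0.0001), "g4": (3.0, 0.01)})
tianlong2 = call_degs({"g2": (-1.2, 0.0002), "g5": (2.0, 0.0)})
print(split_specific_and_common(ed059, tianlong2))
```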

Fig. 1

Analyses of differentially expressed genes between ED059 and cv. Tianlong 2. Hierarchical clustering analysis (a) and Venn diagram (b) of the DEGs in 1-L vs. 1L and 2-L vs. 2L. Red and green indicate high and low expression levels, respectively. Upward arrows represent up-regulation of gene expression; downward arrows represent down-regulation. 1L, Tianlong 2 at 24 h without treatment; 1-L, Tianlong 2 at 24 h with treatment of H. armigera; 2L, ED059 at 24 h without treatment; and 2-L, ED059 at 24 h with treatment of H. armigera

GO annotation by singular enrichment analysis (SEA) was conducted for three categories of genes (Du et al. 2010): (I) specific differentially expressed genes in 2-L vs. 2L; (II) specific differentially expressed genes in 1-L vs. 1L; and (III) differentially expressed genes common to the two comparisons. For the genes in category I, 84 significant GO terms in the biological process ontology were detected (Additional file 5). Of those, a large number were related to the regulation of metabolic processes (GO: 0009893, GO: 0009892, GO: 0031325, GO: 0031324, GO: 0010604, GO: 0010605, GO: 0051247 and GO: 0051248) (Additional file 5). For the genes in category II, 52 significant GO terms in the biological process ontology were detected, which were mainly related to response to stimulus (GO: 0050896), secondary metabolic process (GO: 0019748), cellular amino acid and derivative metabolic process (GO: 0006519), and cell wall macromolecule metabolic process (GO: 0044036) (Additional file 6). For the genes in category III, the 12 significant GO terms were all related to response to stimulus (GO: 0050896) (Additional file 7).

Taking expression patterns into account, we further used parametric analysis of gene set enrichment (PAGE) to investigate the biological processes for the genes in categories I and II (Du et al. 2010). With this method, the number of significant biological processes in categories I and II was greatly reduced relative to the SEA results (Table 2). For category I, the two significant biological processes were toxin metabolic process (GO: 0009404) and toxin catabolic process (GO: 0009407), each of which contained nine genes (Table 2; Additional file 8). For category II, the two significant biological processes, both down-regulated, were response to light intensity (GO: 0009642) and response to high light intensity (GO: 0009644), which contained 24 and 16 genes, respectively (Table 2; Additional file 8). The biological processes identified for categories I and II differed markedly, which indicated that ED059 and cv. Tianlong 2 might have distinct defense mechanisms against H. armigera.
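For readers unfamiliar with PAGE, the statistic is commonly formulated as a Z score that compares the mean expression change of a gene set with that of all genes: Z = (Sm − μ) × √m / σ, where Sm is the mean log2 ratio of the m genes in the set and μ and σ are the mean and standard deviation over all genes. The sketch below assumes this common formulation and a population standard deviation; the exact settings used in the analysis are not given in the text, and the values are hypothetical.

```python
import math
from statistics import mean, pstdev


def page_z_score(all_log2_ratios, gene_set_log2_ratios):
    """Z score of a gene set under the commonly used PAGE formulation.

    all_log2_ratios: log2 ratios of every gene in the comparison.
    gene_set_log2_ratios: log2 ratios of the genes annotated to one GO term.
    """
    m = len(gene_set_log2_ratios)
    if m == 0:
        raise ValueError("gene set is empty")
    mu = mean(all_log2_ratios)
    sigma = pstdev(all_log2_ratios)  # population SD is an assumption here
    if sigma == 0:
        raise ValueError("no variation in log2 ratios")
    return (mean(gene_set_log2_ratios) - mu) * math.sqrt(m) / sigma


# Hypothetical data: a two-gene set shifted towards up-regulation.
print(page_z_score([-2, -1, 0, 1, 2], [1, 2]))
```

A positive Z score indicates that the gene set is, on average, up-regulated relative to the background, and a negative score indicates down-regulation.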

Table 2 The biological processes identified by use of the method of parametric analysis of gene set enrichment (PAGE) for the genes in category I and II

The genes that were up-regulated in ED059 but down-regulated or not changed in cv. Tianlong 2 in response to H. armigera

To further investigate the differences in defense mechanisms, we analyzed the genes with different expression patterns in response to H. armigera between ED059 and cv. Tianlong 2. These genes were grouped into six clusters (Fig. 2): (I) 50 genes significantly up-regulated in 2-L vs. 2L but down-regulated in 1-L vs. 1L (Additional file 9); (II) 354 genes significantly up-regulated in 2-L vs. 2L but not changed in 1-L vs. 1L (Additional file 10); (III) 67 genes significantly down-regulated in 2-L vs. 2L but up-regulated in 1-L vs. 1L (Additional file 11); (IV) 1647 genes significantly down-regulated in 2-L vs. 2L but not changed in 1-L vs. 1L (Additional file 12); (V) 878 genes not changed in 2-L vs. 2L but significantly up-regulated in 1-L vs. 1L (Additional file 13); and (VI) 779 genes not changed in 2-L vs. 2L but significantly down-regulated in 1-L vs. 1L (Additional file 14).
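The six-cluster grouping amounts to a lookup on the pair of expression changes observed in the two comparisons. The sketch below is illustrative; the labels "up", "down" and "unchanged" and the function name are hypothetical conventions, and genes that change in the same direction in both comparisons (or in neither) fall outside the six clusters.

```python
# Keys are (change in 2-L vs. 2L, change in 1-L vs. 1L).
CLUSTERS = {
    ("up", "down"): "I",
    ("up", "unchanged"): "II",
    ("down", "up"): "III",
    ("down", "unchanged"): "IV",
    ("unchanged", "up"): "V",
    ("unchanged", "down"): "VI",
}


def assign_cluster(ed059_change, tianlong2_change):
    """Return the cluster label, or None if the pattern is not divergent."""
    return CLUSTERS.get((ed059_change, tianlong2_change))


print(assign_cluster("up", "down"))       # cluster I
print(assign_cluster("down", "unchanged"))  # cluster IV
print(assign_cluster("up", "up"))         # None: same direction in both
```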

Fig. 2

Genes with different expression patterns between ED059 and cv. Tianlong 2. The upward arrow represents the up-regulations of gene expression. The downward arrow represents the down-regulations of gene expression. The horizontal line represents no changes of gene expression. 1L, Tianlong 2 at 24 h without treatment; 1-L, Tianlong 2 at 24 h with treatment of H. armigera; 2L, ED059 at 24 h without treatment; and 2-L, ED059 at 24 h with treatment of H. armigera

The genes in clusters I and II were speculated to be involved in pathways that were activated in ED059 but suppressed or unchanged in cv. Tianlong 2. Most of the 50 genes in cluster I were associated with metabolic processes such as flavonoid biosynthesis (Glyma06g03410.1, Glyma18g03020.1, Glyma18g06510.4 and Glyma18g12210.1), phytohormone biosynthesis (Glyma04g03740.1 and Glyma01g29930.1) and lipid metabolism (Glyma18g13540.2, Glyma02g43440.1, Glyma09g37430.1, Glyma12g30870.1, and Glyma06g23560.1) (Additional file 9). The 354 genes in cluster II included genes encoding transcription factors (e.g. Glyma18g26600.1, Glyma06g24730.1, Glyma11g01850.1 and Glyma07g05351.1), phytohormone-related genes (e.g. Glyma15g08540.2, Glyma15g13320.1, Glyma19g28250.1, Glyma02g41280.1, Glyma17g35610.1, Glyma03g30460.1 and Glyma11g21650.1) and secondary metabolism-associated genes (e.g. Glyma15g06730.1, Glyma09g03270.1 and Glyma05g25460.1) (Additional file 10). The pathways involving these genes might play important roles in conferring resistance to H. armigera in ED059.

qRT-PCR confirmation of selected gene expressions

To validate the transcript profiles produced in this study, 15 genes were randomly selected for qRT-PCR analysis (Fig. 3; Additional file 15). Three housekeeping genes, HDC, EF1b, and UKN2, were used as reference genes to normalize the gene expression levels (Wang et al. 2015a, b). The expression patterns detected by qRT-PCR for these 15 genes were consistent with those in the RNA-seq profiles, which indicated that our RNA-seq data were reliable (Fig. 3).

Fig. 3

Real-time PCR validation of selected genes. The abscissa represents the name of selected genes; the left ordinate represents the relative expression levels of the selected genes, which were quantified by the 2^−ΔΔCt method; the right ordinate represents the relative expression levels of the selected genes (DEGs), which were calculated by the absolute value of log2 ratio

Analysis of proteome in response to H. armigera between ED059 and cv. Tianlong 2

To investigate the changes in protein expression in response to H. armigera in ED059 and cv. Tianlong 2, we analyzed the proteomes of the samples 2-L, 2L, 1-L and 1L using isobaric tags for relative and absolute quantitation (iTRAQ) technology. Two biological replicates were performed for each sample, and a high level of reproducibility in the iTRAQ analysis was observed between the two biological replicates of each sample (Additional file 16).

In summary, a total of 116 specifically expressed proteins, 77 up-regulated and 39 down-regulated, were identified as significant in the comparison of 2-L vs. 2L (Fig. 4a; Additional file 17). In the comparison of 1-L vs. 1L, 9 specific proteins were up-regulated, and no down-regulated proteins were detected (Fig. 4a; Additional file 18). Three proteins were found in both comparisons, all of which were up-regulated (Fig. 4a; Additional file 19). GO analysis showed that the specifically expressed proteins in 2-L vs. 2L were mainly enriched in 18 significant GO terms, of which response to stress (GO: 0006950) contained the largest number of proteins (Fig. 4b; Additional file 20). For the specifically expressed proteins in 1-L vs. 1L, no significant GO term was identified.

Fig. 4

The differentially expressed proteins in response to H. armigera between ED059 and cv. Tianlong 2. a Venn diagram for the differentially expressed proteins between 2-L vs. 2L and 1-L vs. 1L; b Histogram of gene ontology classifications for the differentially expressed proteins. 1L, Tianlong 2 at 24 h without treatment; 1-L, Tianlong 2 at 24 h with treatment of H. armigera; 2L, ED059 at 24 h without treatment; and 2-L, ED059 at 24 h with treatment of H. armigera

Functional confirmation of a potential resistance gene

Because of our interest in the roles of phenylpropanoids in plant resistance to chewing insects, we selected Glyma06g03410, which was predicted to be involved in the phenylpropanoid pathway, for functional analysis. Glyma06g03410 encodes a putative phenylcoumaran benzylic ether reductase (PCBER), an enzyme responsible for the synthesis of lignans in the phenylpropanoid pathway (Karamloo et al. 2001; Nuoendagula et al. 2016; Vander et al. 2000). In response to H. armigera, the expression of Glyma06g03410 was up-regulated 3.0-fold in ED059 but down-regulated 3.0-fold in cv. Tianlong 2 (Additional file 9). Whole-genome resequencing showed that, in comparison with Tianlong 2, ED059 carried one nucleotide deletion in the promoter region of Glyma06g03410 and six point mutations in the gene region, of which three were located in introns and three in exons (Additional file 21). The point mutations in the exons did not cause amino acid changes. No copy number variation was found for Glyma06g03410 between ED059 and cv. Tianlong 2. We overexpressed Glyma06g03410 in Arabidopsis and used two independent transgenic lines to evaluate gene function (Fig. 5a). Newly hatched H. armigera larvae were placed on the transgenic plants and controls, and larval weights were determined at 6 days after treatment. The H. armigera larvae feeding on the transgenic plants gained less weight than those feeding on the controls (Fig. 5b), suggesting that Glyma06g03410 acts as a resistance gene. This functional confirmation showed that comparative transcriptome analysis of resistant and susceptible responses is a feasible strategy for mining herbivore-resistance genes in ED059.

Fig. 5

Functional confirmation of Glyma06g03410 in Arabidopsis. a Mean ± SE of the relative transcript levels of Glyma06g03410 in the transgenic plants (n = 3; three independent biological replicates were used per genotype). The transcript levels of the selected genes in related transgenic plants were calculated by the 2^−ΔΔCt method. Data are the means ± SE of two or three strains. b H. armigera larval performance on A. thaliana plants overexpressing Glyma06g03410. Mean ± SE of H. armigera larval mass after 6 days of feeding on wild-type (WT) plants and transgenic plants with the empty vector (EV) and Glyma06g03410 (OV-1 and OV-2) (mean ± SEM; n = 5; statistics by t test; **p < 0.01)

Discussion

The possible mechanism of resistance to H. armigera in ED059

A wild soybean line, ED059, was identified as having the highest level of resistance to chewing insects among more than one thousand soybean germplasm accessions in the field. To mine useful genes and elucidate the resistance mechanism, we compared the transcriptome and proteome responses to H. armigera between ED059 and the susceptible cv. Tianlong 2. Hierarchical clustering of gene expression showed clearly different molecular responses to H. armigera in ED059 and cv. Tianlong 2 (Fig. 1). In ED059, a large number of specifically expressed genes were significantly enriched in metabolic processes, whereas the specifically expressed genes in the susceptible cv. Tianlong 2 were enriched in response to stimulus (Additional files 5 and 6). Since the commonly regulated genes were significantly enriched in stress responses in both ED059 and cv. Tianlong 2, we speculated that the divergence of downstream metabolic pathways might be responsible for the phenotypic differences between these two soybean accessions. The 404 genes that were up-regulated in response to H. armigera in ED059 but down-regulated or unchanged in cv. Tianlong 2 were speculated to play important roles in conferring resistance to H. armigera in ED059. These genes might provide useful resources for the identification of resistance genes, as suggested by the functional analysis of Glyma06g03410 in transgenic Arabidopsis (Fig. 5). Future studies should verify the enzyme activity of Glyma06g03410 as a PCBER and investigate the roles of its metabolic products in defense against herbivores. Other metabolism-related genes with altered expression also merit further study in order to elucidate the roles of metabolic pathways in resistance to H. armigera in ED059.

Genes related to phytohormones between resistant and susceptible responses

JA is known to be an important phytohormone that mediates herbivore defense in plants (Gilardoni et al. 2011; Van Poecke 2007). However, in this study, we found that some genes in the JA pathway, such as OPR3 (Glyma17g38160.1), JAR1 (Glyma19g44310.1) and MYC2 (Glyma08g36720.2, Glyma01g15930.1, Glyma01g02251.1, Glyma06g35330.1, and Glyma08g28010.1), were significantly down-regulated in response to H. armigera in the resistant ED059 but not in the susceptible cv. Tianlong 2 (Additional file 2). These results indicated that the JA-regulated defense against H. armigera might be suppressed in ED059. Because only one time point, 24 h after treatment, was sampled, we have no information about earlier gene expression in response to H. armigera in these two soybean accessions. Thus, we cannot exclude the possibility that the JA pathway is activated earlier in ED059 than in cv. Tianlong 2, resulting in a quicker defense response to H. armigera in ED059. The exact role of the JA pathway in the defense responses of ED059 and cv. Tianlong 2 needs to be investigated further.

SA is an important signal that is involved in the local and systemic induced defense response against pathogens (Glazebrook 2005). However, the role of SA in defense against chewing insects is still unclear. Previous studies found that SA could inhibit the expression of some anti-herbivore genes in potato (Sivasankar et al. 2000), and that the decrease of SA content in response to Manduca sexta in Nicotiana attenuata was correlated with the induction of defense responses (Gilardoni et al. 2011). Interestingly, a minor increase of endogenous SA content in response to M. sexta in N. attenuata was also reported, suggesting a complicated role of SA in herbivore defense responses (Wu et al. 2007). In our study, the SA-responsive gene GmPR1 (Glyma15g06780.1) was induced in response to H. armigera in both cv. Tianlong 2 and ED059 (Additional file 4), which indicated that the SA pathway was activated in both accessions and might not be responsible for the resistance in ED059.

Genes related to metabolism between resistant and susceptible responses

Polyphenol oxidase (PPO) can alkylate essential amino acids and reduce the nutritional value of plant proteins (Van Poecke 2007). Overproduction of PPO could enhance the ability of transgenic plants to defend themselves against herbivores (Wang and Constabel 2004). In plants, lipoxygenases (LOXs) not only participate in the JA-dependent defense pathway but also act directly against herbivores by producing peroxides and oxylipin products (Zhu-Salzman et al. 2008). In ED059, three PI-encoding genes (Glyma18g46580.1, Glyma14g04250.1 and Glyma03g37260.1), two PPO-encoding genes (Glyma04g14361.1 and Glyma13g25181.1) and two LOX-encoding genes (Glyma07g03920.2 and Glyma07g00900.1) were specifically up-regulated in response to H. armigera (Additional file 2); these genes might play important roles in conferring resistance to H. armigera.

Glutathione S-transferases (GSTs) are ubiquitous in plant organs and are involved in the detoxification and processing not only of various xenobiotics but also of endogenous toxic and non-toxic metabolites (Dixon et al. 2002). The expression of GSTs can be induced by a range of abiotic and biotic stresses such as drought, salt, cold, heavy metals, pathogens, and elicitors (Dixon et al. 2002). PAGE analysis showed that two significant biological processes, toxin metabolic process (GO: 0009404) and toxin catabolic process (GO: 0009407), were specifically detected in 2-L vs. 2L, in which 5 GSTs, 3 up-regulated and 2 down-regulated, were identified (Additional file 8). The targets of these GSTs merit investigation in further studies.

Comparison of transcriptome and proteome

In this study, we used RNA-seq and iTRAQ technologies to analyze the transcriptome and proteome, respectively. The number of differentially expressed proteins detected by iTRAQ was much smaller than the number of DEGs detected by RNA-seq (Figs. 1b, 4a). In addition, some genes showed opposite expression patterns at the transcript and protein levels (Additional file 22). For example, photosystem I psaG/psaK (Glyma5g22780.2) and thioredoxin (Glyma12g35190.1) were down-regulated at the transcript level but up-regulated at the protein level in 2-L vs. 2L (Additional files 2 and 17). Such discrepancies between transcript and protein levels have also been reported in other studies (Botella et al. 1996; Liang et al. 2016; Zhao et al. 1996), and might be attributed to the limited capability of iTRAQ technology to detect small differences in protein abundance, or to posttranscriptional or posttranslational regulation.

In this paper, we investigated the mechanism of resistance to H. armigera in a wild soybean line, ED059, through comparative analyses of the transcriptome and proteome of ED059 and a susceptible cv. Tianlong 2. The results showed that altered metabolism might be responsible for the resistance to H. armigera in ED059. Our study provides useful resources for identifying genes conferring resistance to H. armigera in soybean and improves our understanding of plant defense mechanisms against insects.

Methods

Sample preparation

The soybean cultivar (cv.), Tianlong 2, and the wild soybean, ED059, were obtained from the Oil Crops Research Institute, Chinese Academy of Agricultural Sciences, Wuhan, China. Seeds were pre-germinated on moistened filter paper in a plant growth chamber at 27 °C, 85% ambient humidity and a 16:8 (light:dark) photoperiod for 3–4 days. The seedlings were then transferred into 18-cm × 18-cm individual plastic pots with nutrient-rich soil (Pindstrup Substrate, Denmark) and vermiculite at a ratio of 2:3 at 27 °C under 16 h of light. All plants used in the experiments had three fully expanded trifoliate leaves. For H. armigera treatment, the second fully expanded leaves were infested with three third-instar larvae of H. armigera. After 24 h, the wounded local leaves were harvested, flash frozen in liquid nitrogen, and stored at −80 °C until RNA isolation. Samples from untreated plants were used as controls. For each treatment, we used five leaves, one from each of five independent plants. For biological replicates, all treatments and samples were set up and collected independently following the same method described above.

Solexa/Illumina sequencing and iTRAQ-based proteome analysis

At least 10 µg of total RNA (≥300 ng/µL) was extracted from each sample for RNA-seq. Sample preparation, library construction, sequencing and bioinformatics analysis were performed as described previously (Hao et al. 2011). The samples were sequenced on an Illumina HiSeq2000 platform. Raw image data from sequencing were transformed by base calling into raw sequences. The raw sequences were transformed into clean tags by removing the 3′ adaptor sequences and low-quality tags. A virtual library containing all possible CATG+17-base sequences was generated using the SoyBase soybean genomics database. All clean tags were mapped to these reference sequences, and only tags with no more than one mismatch were retained.

The plant materials used in DGEseq were also used for proteome quantitation analysis, and one additional biological replicate was set up. Thus, samples from two biological replicates were used for proteome analyses. Total protein (100 μg) was extracted from each sample solution, and the proteins were digested with Trypsin Gold (Promega, Madison, WI, USA) at a protein:trypsin ratio of 30:1 at 37 °C for 16 h, according to a previously reported method (Lan et al. 2012). After trypsin digestion, peptides were labeled with the iTRAQ tags according to the manufacturer’s protocol for an 8-plex iTRAQ reagent (Applied Biosystems, Foster City, CA, USA). Subsequently, the labeled peptide mixtures were pooled and pre-separated by strong cation exchange chromatography with the LC-20AB HPLC pump system (Shimadzu, Kyoto, Japan). iTRAQ analysis was performed on a TripleTOF 5600 system (AB SCIEX, Concord, ON, Canada) combined with a Famos autosampler (LC Packings) and LC20-AD Nano HPLC (Shimadzu, Kyoto, Japan), according to a previously reported study (Du et al. 2014). Proteome Discoverer software (ver.1.2.0.339; Thermo Fisher Scientific, San Jose, CA, USA) was used to transform raw data files into MGF files. Protein identification and quantitation for iTRAQ were performed using the Mascot search engine (ver. 2.3.0; Matrix Science, London, U.K.) against data in the SoyBase database.

qRT-PCR

Total RNA was extracted using Trizol reagent (Sigma). Total RNA (2 µg) was reverse transcribed to cDNA using Superscript II reverse transcriptase (Invitrogen). qRT-PCR analyses were performed with the SuperReal PreMix (SYBR Green, Tiangen) using a Rotor-Gene Q (Qiagen) real-time PCR system. The amplification program was initiated with a denaturation step at 95 °C for 15 min, followed by 40 cycles of 95 °C for 10 s, 60 °C for 20 s, and 72 °C for 30 s. The samples for qRT-PCR were from three additional biological replicates, which were different from the ones used for RNA-seq. Relative quantification of gene expression levels was performed by the 2^−ΔΔCT method (Livak and Schmittgen 2001). Three soybean genes, namely HDC, EF1b, and UKN2, were used as the internal controls to normalize the expression levels of the selected genes (Wang et al. 2015a, b).
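The 2^−ΔΔCT calculation can be illustrated with a minimal sketch. How the three reference genes were combined is not stated here, so the sketch assumes that their Ct values are averaged arithmetically within each sample (equivalent to a geometric mean of their relative quantities) and that amplification efficiency is 100%. The function name and Ct values are hypothetical.

```python
from statistics import mean


def relative_expression(target_ct_treated, ref_cts_treated,
                        target_ct_control, ref_cts_control):
    """Fold change of a target gene by the 2^-ddCt method.

    ref_cts_*: Ct values of the reference genes (e.g. HDC, EF1b, UKN2)
    measured in the same sample as the target gene.
    """
    delta_treated = target_ct_treated - mean(ref_cts_treated)
    delta_control = target_ct_control - mean(ref_cts_control)
    delta_delta_ct = delta_treated - delta_control
    return 2 ** (-delta_delta_ct)


# Hypothetical Ct values: the target amplifies two cycles earlier
# after treatment, with stable reference genes.
fold = relative_expression(22.0, [20.0, 20.0, 20.0],
                           24.0, [20.0, 20.0, 20.0])
print(fold)
```

A value above 1 indicates induction by the treatment and a value below 1 indicates repression.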

Whole-genome sequencing and copy number variation analysis

Whole-genome sequencing of ED059 and cv. Tianlong 2 was conducted using Illumina technology at the BIOMARKER company, Beijing, China, following a described protocol (Jiao et al. 2015). The sequencing depth for each sample was about 50×. The CNV analysis was conducted using CNV-seq software (Xie and Tammi 2009).

Plasmid construction and transgenic confirmation

The PCR products of Glyma06g03410 were digested with XcmI and inserted into plasmid PCX-DG, as previously described (Chen et al. 2009). Transgenic Arabidopsis plants were obtained by the Agrobacterium-mediated floral dip method (Bent 2006). The empty vector (EV) was transformed into plants as a control. T2 plants with a segregation ratio of 3:1 (number of positive plants:number of negative plants) were propagated, and the homozygous lines were used for further studies. The evaluation of transgenic plants for resistance to H. armigera was performed in a growth chamber at 27 °C, 85% ambient humidity, and a 16:8 (L:D) photoperiod. The eggs of H. armigera were hatched at 27 °C. One freshly hatched larva of H. armigera was placed in a Petri dish (100 × 25 mm) and starved for 12 h before being used in the test. Five mature rosette leaves from 4-week-old plants were collected and placed in each dish. The leaves were changed every 2 days, and the test was stopped at 6 days after treatment. The larval weight from each Petri dish was measured and recorded at 2, 4, and 6 days after treatment. The dish experiment was conducted in a randomized complete block design with 10 replications.
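One common way to check whether a T2 family fits the 3:1 segregation ratio is a chi-square goodness-of-fit test with one degree of freedom. The text does not state which test, if any, was applied, so the sketch below is an illustrative assumption; the function names and plant counts are hypothetical.

```python
# Critical chi-square value for 1 degree of freedom at alpha = 0.05.
CRITICAL_VALUE_DF1_ALPHA_005 = 3.841


def chi_square_3_to_1(positive, negative):
    """Chi-square statistic for observed counts against a 3:1 expectation."""
    total = positive + negative
    if total == 0:
        raise ValueError("no plants were scored")
    expected_positive = 0.75 * total
    expected_negative = 0.25 * total
    return ((positive - expected_positive) ** 2 / expected_positive
            + (negative - expected_negative) ** 2 / expected_negative)


def fits_3_to_1(positive, negative):
    """True if the counts do not deviate significantly from 3:1."""
    return chi_square_3_to_1(positive, negative) < CRITICAL_VALUE_DF1_ALPHA_005


# Hypothetical T2 families of 100 plants each.
print(fits_3_to_1(75, 25))  # exact 3:1 ratio
print(fits_3_to_1(90, 10))  # strong excess of positive plants
```

Families that fit the 3:1 ratio are consistent with a single insertion locus, which is why such lines are usually carried forward to select homozygotes.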