Background

The traditional medicinal mushroom Inonotus obliquus (I. obliquus), commonly known as chaga, is a plant-parasitic white-rot/brown-rot fungus belonging to the family Hymenochaetaceae of the phylum Basidiomycota [1, 2]. This medicinal fungus is widely distributed in North America, Asia, and Northern Europe, and has been used for more than four centuries as a folk medicine in Northern Europe for the treatment of stomach diseases, intestinal worms, and liver and heart ailments [2,3,4,5,6,7]. Pharmacological studies of the bioactive substances of the fungus in cancer research have demonstrated its excellent medicinal value in the treatment of various human tumors [5, 8,9,10,11,12,13], and its powerful effect in the treatment of several human diseases without any unacceptable toxic side effects has also been demonstrated [14, 15].

Previous chemical investigations of extracts from the fruit body and submerged cultures of I. obliquus have demonstrated that the fungus produces multiple types of bioactive components, including polysaccharides, organic acids, phenolic compounds, terpenoids, lignins and melanins [2, 4, 5, 8, 9, 16,17,18,19]. More than 100 metabolites have been identified, most of which are involved in antioxidant, antitumor and immunomodulatory activities [5, 20]. Nevertheless, most studies have focused on metabolites in the fruit body of I. obliquus [5, 21, 22]. Because I. obliquus grows slowly in its natural habitat [2, 5], submerged cultures of this fungus have been used for the identification of bioactive secondary metabolites [20, 23,24,25,26], and 26 phenolic compounds have been identified in the mycelia and categorized into small phenolics, glycosylated flavonoids (GF), flavonoid aglycones (FAG) and polyphenols [20]. Furthermore, genomic sequencing of the fungus [27] and differential transcriptomic analysis of submerged cultures under different culture conditions have been applied to identify and understand the regulation of genes involved in secondary metabolite biosynthesis, especially terpenoid biosynthesis [28].

In previous studies, high-performance liquid chromatography (HPLC) and nuclear magnetic resonance (NMR) spectroscopy have been used to advance studies of the chemical composition and quality evaluation of metabolites of the fungus [5, 12]. Recently, HPLC-MS (mass spectrometry)-based metabolomics has been increasingly applied in discovery studies for comprehensive quality assessment of multiple metabolites [29, 30]. In addition, widely targeted metabolomics is a promising novel technique for large-scale, ultrasensitive qualitative and quantitative analysis of targeted metabolites of interest [31], and it facilitates the understanding of metabolic pathways that contribute to the modulation of metabolites in fungi [29, 32,33,34,35], plants [36, 37] and animals [38].

In this study, we present the genome of I. obliquus (strain CFCC 83414) and an integrated comparative omic analysis of the fungus under different submerged culture conditions. Our results identify multiple secondary metabolites and clarify the regulation of metabolite production at the transcriptomic, proteomic and metabolomic levels.

Results

Genome assembly and inference of chromosomes

We sequenced genomic DNA of I. obliquus using the PacBio Sequel II and Illumina platforms, yielding high-quality data of ~ 64× coverage (2.32 Gb, PacBio platform) and ~ 244× coverage (8.82 Gb, Illumina platform), respectively (Tables S1 and S2). The assembled genome of I. obliquus was 36.13 Mb in size, comprising 32 contigs with an N90 of 1.94 Mb and a GC content of 47.39% (Table 1). The heterozygosity ratio was estimated to be 0.82%. The completeness of the predicted protein-coding gene set was estimated to be 95.2% based on the fungal BUSCO families (Table S3).
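The coverage figures and the N90 statistic quoted above follow from simple arithmetic. The sketch below is illustrative only: the data volumes and genome size are taken from the text, while the contig lengths in `toy_contigs` are hypothetical values chosen to show how an Nx value is derived, not the real contig lengths from Table S4.

```python
def sequencing_depth(total_bases, genome_size):
    """Average depth = total sequenced bases / assembled genome size."""
    return total_bases / genome_size


def nx(contig_lengths, x=90):
    """Return the Nx length: walking contigs from longest to shortest,
    the length of the contig at which the running total first reaches
    x% of the assembly size."""
    threshold = sum(contig_lengths) * x / 100
    running = 0
    for length in sorted(contig_lengths, reverse=True):
        running += length
        if running >= threshold:
            return length
    raise ValueError("empty contig list")


GENOME_SIZE = 36.13e6  # assembled genome size from the text (bp)
pacbio_depth = sequencing_depth(2.32e9, GENOME_SIZE)    # PacBio data volume
illumina_depth = sequencing_depth(8.82e9, GENOME_SIZE)  # Illumina data volume

# Hypothetical contig lengths (bp), for illustration only.
toy_contigs = [400, 300, 200, 50, 30, 20]
toy_n90 = nx(toy_contigs, 90)
toy_n50 = nx(toy_contigs, 50)

print(round(pacbio_depth), round(illumina_depth), toy_n90, toy_n50)
```

With the reported volumes, the two depths round to the ~ 64× and ~ 244× stated in the text.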

Table 1 Characteristics of the assembled contigs and genome of Inonotus obliquus CFCC 83414

Among the 32 contigs, 13 major contigs (approximately 1.58–4.36 Mb in length) comprised 98.63% of the entire genome, while the remaining 19 minor contigs (approximately 15.40–55.10 kb in length) (Table S4) comprised 1.37% of the genome. The repeat sequences proximal to both termini of the 13 major contigs were collected, and approximately 18–24 repeat units of the 6-base sequence TTAGGG(C) were found at the 5’-end of 10 contigs (all except contigs 4, 6 and 7) and at the 3’-end of all 13 contigs (Fig. 1b). The sequence TTAGGG(C) is highly similar to the telomere tandem repeat sequence frequently found in various species of invertebrates [39,40,41], plants [42], fungi [41, 43] and other organisms [41, 44] (see details in http://telomerase.asu.edu/sequences_telomere.html), and was considered to be the telomeric tandem repeat sequence of I. obliquus in this study. The karyotype of I. obliquus therefore appears to consist of 12–13 chromosome pairs. Detailed characteristics of the 13 large contigs are shown in Fig. 1c.
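A scan for telomeric repeats at contig ends, as described above, could be sketched as follows. This is a minimal illustration and not the procedure used in this study (which relied on Tandem Repeat Finder and MAFFT). The window size, the minimum of 10 consecutive units, the reading of "(C)" as an optional trailing C, and the inclusion of the reverse-complement motif CCCTAA are all assumptions, and the toy sequences are hypothetical.

```python
import re


def build_telomere_pattern(min_units=10):
    """Regex for a run of at least `min_units` telomere units.
    TTAGGG on one strand reads CCCTAA on the other, so both are allowed;
    the optional C/G models the '(C)' in TTAGGG(C) (an assumption)."""
    return re.compile(
        r"(?:TTAGGGC?){%d,}|(?:G?CCCTAA){%d,}" % (min_units, min_units)
    )


def telomere_hits(sequence, window=500, min_units=10):
    """Return (run_at_5prime_end, run_at_3prime_end) for one contig."""
    pattern = build_telomere_pattern(min_units)
    seq = sequence.upper()
    head, tail = seq[:window], seq[-window:]
    return (pattern.search(head) is not None,
            pattern.search(tail) is not None)


# Hypothetical toy contigs: a spacer flanked by telomere-like runs.
spacer = "ACGT" * 300
both_ends = "CCCTAA" * 20 + spacer + "TTAGGG" * 20
three_prime_only = spacer + "TTAGGG" * 20

print(telomere_hits(both_ends))         # run found at both ends
print(telomere_hits(three_prime_only))  # run found at the 3' end only
```

A contig with runs at both ends is a candidate complete chromosome, whereas a contig with a run at only one end (as reported for contigs 4, 6 and 7) is likely incomplete at the other terminus.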

Fig. 1

Assembly and annotation of the genome of I. obliquus. (a) Experimental schema of multiple omic analysis. (b) Alignment of the predicted telomeric tandem repeat sequences at terminals of 13 major contigs. The start-end sites of telomere tandem repeat sequences were indicated following the contig names; (c) Characterization of 13 major contigs of the genome. From the inner to the outer ring in order is the GC content in each contig, repeat sequence density in each contig, transposable element density in each contig, and gene density in each contig; (d) The KEGG functional annotation of genes. The gene number was noted at the right of the bar corresponding to relevant class, and the bar length indicated the gene number

Genome annotation of the Inonotus obliquus

Genome annotation was performed using ab initio prediction, homology-based searches and cDNA-based evidence from the transcriptional data of this study. A total of 8352 protein-coding genes were predicted, with an average length of 2271 bp and 7.08 exons per gene, of which 8347 genes (≥ 99.9%) were located on the 13 major contigs. Further, 7915 genes (94.8%) were annotated (Table S5) and 3885 genes (46.5%) were functionally annotated according to the KEGG annotation (Fig. 1d). In addition, repetitive sequences and non-coding RNAs accounted for 20.31% and 0.4% of the genome, respectively.

Further, a total of 365 CAZyme-coding genes, including 72 auxiliary activities (AAs), 3 carbohydrate-binding modules (CBMs), 21 carbohydrate esterases (CEs), 187 glycoside hydrolases (GHs), 69 glycosyltransferases (GTs) and 13 polysaccharide lyases (PLs) (Table S6), were identified in the genome. Among these superfamilies, GHs, GTs and AAs were the largest, and GHs were markedly more numerous than any other category, consistent with a saprophytic lifestyle based on lignocellulose decomposition. Biosynthetic genes for secondary metabolites are usually clustered [45]. Because secondary metabolites have noteworthy pharmaceutical potential and an absence of risks to human and animal health, genomic analyses have provided useful information for elucidating the biosynthesis of secondary metabolites such as terpenoids, polyketides, and nonribosomal peptides [46,47,48,49,50]. In the genome, a total of 19 BGCs were predicted (Table S7), including 2 hybrid (mixed NRPS-like and type 1 PKS) BGCs, 1 NRPS BGC, 1 type 1 PKS BGC, 3 NRPS-like BGCs, and 12 BGCs encoding terpenes, including the antitumor compound clavaric acid [51, 52].

Overview of the Inonotus obliquus transcriptome

For the triplicate seed culture (SEC) and fermentation culture (FEC) samples, a total of 42.18 G of clean data were acquired from transcriptomic sequencing, with a range of 6.5 ~ 7.38 G for each sample and an average GC content of 50.79% (Table S8). Approximately 96.93 ~ 97.07% of the expressed sequence tag (EST) sequences generated from the RNA-Seq data were mapped to the reference genome, indicating almost complete coverage of the protein-coding gene regions (Table S9). Further analysis showed that 8149 and 8117 of the 8352 genes in the genome were covered under the SEC and FEC conditions, respectively (Table S10). Candidate differential genes involved in FEC adaptation were chosen from 8049 genes quantified using uniquely mapped reads in both conditions (Table S10). The transcripts of 1283 genes were associated with FEC adaptation, including 399 up-regulated genes and 884 down-regulated genes (Fig. 2a). The differentially expressed genes (DEGs) showed distinct transcriptional patterns under the two culture conditions, as depicted by the hierarchical clustering of DEG transcription levels in Fig. 2b. Across all up- and down-regulated genes, 69.8% of DEGs were altered 2–4-fold at the mRNA level, while fewer DEGs (18.5%) were altered 4–8-fold, and only 11.7% of DEGs were altered more than 8-fold. Many of these DEGs were related to fungal metabolic functions, such as aldo/keto reductase (76.39-fold), terpenoid synthase (22.96-fold), and cyclopropane fatty acid synthase (12.12-fold). The functional classification of DEGs was performed by mapping to KEGG (Kyoto Encyclopedia of Genes and Genomes) pathways, demonstrating that the DEGs were mainly associated with the biosynthesis and metabolism of various metabolites including tryptophan, cyanoamino acid, sucrose and purine (Fig. 2c), and also with some transport and catabolism, signal transduction and membrane transport pathways. In general, the results imply that a considerable portion of genes involved in various cellular and physiological processes were strongly affected at the transcriptional level following adaptation to the FEC condition.

Fig. 2

Transcriptome and proteome analysis of I. obliquus under the seed culture (SEC) and fermentation culture (FEC) conditions. (a) Volcano plot showing differential transcription of genes; (b) Transcription heatmap; (c) Pathway enrichment in FEC vs. SEC; (d) Volcano plot showing differential expression of proteins; (e) Expression heatmap; (f) Correlation between the mRNA and protein levels of 1648 differential genes; (g) Correlation between the mRNA and protein levels of 157 differential proteins of group I

TMT-based quantitative proteome and its correlation with transcriptome

Proteins were extracted from the triplicate SEC and FEC samples and subjected to TMT (tandem mass tag)-based proteomic analysis. A total of 4962 proteins were identified, among which 4729 proteins were quantified with at least one peptide and 4265 were quantified with at least two peptides (Table S11). A total of 264 differentially expressed proteins (DEPs) were associated with FEC adaptation, including 105 up-regulated proteins and 159 down-regulated proteins (Fig. 2d), as depicted by the hierarchical clustering of DEP expression levels in Fig. 2e. Across all up- and down-regulated proteins, 75.4% of DEPs were altered 1.5–2-fold at the protein level, whereas fewer DEPs (20.8%) were altered 2–3-fold, and only 3.8% of DEPs were altered more than 3-fold, all of which were still below 6-fold. Fold changes of differential genes were therefore lower at the expression (protein) level than at the transcription level.

To examine whether a strong correlation exists between the mRNA and protein level changes after adaptation to the fermentation condition, the transcriptomic and proteomic analyses were combined using linear regression based on the log2-transformed fold changes, with pair-wise comparison of the TMT-based quantitative proteomic (Table S11) and RNA-Seq results (Table S10). For all proteins identified in the TMT-based quantitative proteomic analysis, the correlation coefficient between the mRNA and protein level changes was only 0.31, whereas that for the 1608 differential genes (including all 264 DEPs in the quantitative proteome) with p value ≤ 0.05 at the transcription level and p value ≤ 0.05 at the expression level was 0.51 (Fig. 2f). There was a positive correlation (0.61) between mRNA and protein level changes for the 264 DEPs, showing similar trends at both the mRNA and protein levels after adaptation to the FEC condition. Further, the 264 DEPs of the TMT-based quantitative proteome were categorized into three groups based on the following patterns: group I (157 DEPs) (Table S12), in which the mRNA and protein levels showed the same changes; group II (104 DEPs), in which mRNAs were basically unchanged while protein levels were up- or down-regulated; and group III (3 DEPs), in which the directions of mRNA and protein changes were opposite. For the 157 DEPs of group I, the correlation coefficient between the changes at the mRNA and protein levels was 0.83 (Fig. 2g), indicating a strong positive correlation between protein and mRNA expression patterns. These proteins were involved predominantly in metabolism-related physiological processes, including polysaccharide, carbohydrate, amino acid, lipid and purine metabolism and transmembrane transport, and also in protein phosphorylation and signaling molecules and interaction (Table S12).
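The three-way grouping of DEPs can be expressed as a small decision rule. The sketch below is one possible reading of the group definitions above, not the authors' script: treating "basically unchanged" mRNA as "not a DEG" is an assumption, and the gene identifiers and values in `toy_deps` are hypothetical.

```python
from collections import Counter


def classify_dep(mrna_log2fc, mrna_is_deg, protein_log2fc):
    """Assign a differentially expressed protein (DEP) to a group.

    I   : mRNA is differential and changes in the same direction as the protein
    II  : mRNA is not differential (assumed meaning of 'basically unchanged')
    III : mRNA is differential but changes in the opposite direction
    """
    if not mrna_is_deg:
        return "II"
    same_direction = (mrna_log2fc > 0) == (protein_log2fc > 0)
    return "I" if same_direction else "III"


# Hypothetical DEPs: (id, mRNA log2FC, mRNA is a DEG?, protein log2FC)
toy_deps = [
    ("geneA", 2.4, True, 1.1),     # both up           -> group I
    ("geneB", -1.8, True, -0.9),   # both down         -> group I
    ("geneC", 0.2, False, 0.8),    # protein only      -> group II
    ("geneD", 1.5, True, -0.7),    # opposite change   -> group III
]

groups = {gid: classify_dep(m, deg, p) for gid, m, deg, p in toy_deps}
tally = Counter(groups.values())
print(groups, dict(tally))
```

Applied to the real data, a tally of this kind would correspond to the 157, 104 and 3 DEPs reported for groups I, II and III.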

Identification and quantification of secondary metabolites

We further focused on the production of secondary metabolites using widely targeted metabolomics. The metabolomic profiling of I. obliquus under SEC and FEC conditions was performed using HPLC-MS/MS and led to the identification and quantification of 307 metabolites, including 68 amino acids and derivatives, 60 lipids, 44 nucleotides and derivatives, 36 organic acids, 35 phenolic acids, 29 saccharides and alcohols, 7 vitamins and 26 flavonoids, and a few lignans, coumarins and tannins (Fig. 3a and Table S13). Significantly regulated metabolites between the two conditions were determined with VIP value ≥ 1 and log2 FC value ≥ 1 or ≤ -1 (Fig. 3b), and hierarchical cluster analysis (HCA) was performed (Fig. 3c). A total of 137 differential metabolites were selected, among which 59 metabolites were more abundant and 43 metabolites were less abundant under the FEC condition than under the SEC condition (Fig. 3d). Regarding the fold changes of differentially generated metabolites (DGMs), 100 (73%) of the DGMs were altered 2–8-fold at the metabolite level, whereas 37 (27%) were altered more than 8-fold, among which 17 metabolites, including amino acids and derivatives, phenolic acids, flavanols and organic acids, were altered more than 100-fold, demonstrating a remarkable change at the metabolite level.
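The selection rule for differential metabolites (VIP ≥ 1 together with log2 FC ≥ 1 or ≤ -1) can be illustrated as follows. This is a minimal sketch: VIP scores come from a multivariate model that is not reproduced here, so they are supplied as inputs, and the metabolite names and intensities in `toy_metabolites` are hypothetical.

```python
import math


def classify_metabolite(vip, mean_fec, mean_sec, vip_min=1.0, log2fc_min=1.0):
    """Return 'up', 'down' or None for one metabolite (FEC relative to SEC)."""
    log2fc = math.log2(mean_fec / mean_sec)
    if vip < vip_min or abs(log2fc) < log2fc_min:
        return None  # not significantly regulated under this rule
    return "up" if log2fc > 0 else "down"


# Hypothetical records: (name, VIP, mean intensity in FEC, mean intensity in SEC)
toy_metabolites = [
    ("metabolite_1", 1.3, 400.0, 100.0),  # log2FC = 2, VIP ok    -> up
    ("metabolite_2", 1.1, 100.0, 400.0),  # log2FC = -2, VIP ok   -> down
    ("metabolite_3", 1.5, 120.0, 100.0),  # fold change too small -> None
    ("metabolite_4", 0.6, 800.0, 100.0),  # VIP too low           -> None
]

calls = {
    name: classify_metabolite(vip, fec, sec)
    for name, vip, fec, sec in toy_metabolites
}
print(calls)
```

Both conditions must hold at once, so a metabolite with a large fold change but a low VIP score is not counted as differential.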

Fig. 3

Widely targeted metabolomic analysis of I. obliquus under the seed culture (SEC) and fermentation culture (FEC) conditions. (a) Classification of the 307 identified secondary metabolites; (b) Volcano plot showing differential production of metabolites; (c) Production heatmap of 157 differentially produced metabolites; (d) Classification of 157 differentially produced metabolites. Red: up-regulated; Blue: down-regulated

Discussion

This study described the assembly and annotation of the genome of I. obliquus (strain CFCC 83414), and provided a comprehensive view of the response of the fungus to changes in submerged culture conditions at the transcriptomic, proteomic and metabolomic levels through multi-omic analysis based on the sequenced genome. Our results demonstrated differences in metabolism-related pathways between seed culture and fermentation culture conditions and shed light on the regulation of secondary metabolite production.

For basidiomycetes, the karyotype has been found to be conserved, with known karyotypes mainly ranging from 11 to 14 chromosome pairs [53,54,55]. The genome of I. obliquus presented in this study was sequenced using the PacBio Sequel II HiFi sequencing technology with high coverage and accuracy [56]. Considering the analysis of the telomeric tandem repeat sequences, the genome appears to contain 10 complete chromosome-length contigs among the 13 major contigs, plus the remaining 19 minor contigs (total length < 0.5 Mb), and the karyotype of I. obliquus would be 12–13 chromosome pairs. The higher BUSCO completeness of 95.2%, compared with about 90% for the previously reported genome of I. obliquus strain CT5, which was sequenced using the Oxford Nanopore PromethION platform [27], also suggests the high quality of the mononuclear genome assembled in this study. Further construction of a chromosome-level genome map using the Hi-C protocol [57, 58] would help to clarify the detailed genome organization and support future comparative genomic analysis of more I. obliquus strains and other species.

After the transfer of the mycelia from the seed culture into the fermentation culture, various biological processes were differentially regulated, resulting in differential production of secondary metabolites. The RNA-seq-based comparative transcriptome yielded much more comprehensive and detailed profiling information (8049 quantified genes) than the comparative proteome (4729 proteins quantified), which showed lower fold changes of differential genes at the expression level than at the transcription level. To explore the consistency between mRNA and protein levels, correlation analysis was performed for all genes quantified at both the transcriptional and translational levels, and the results suggested a positive but rather weak association (0.51). Nevertheless, a strong positive correlation (0.83) was found between changes at the mRNA and protein levels for the 157 DEPs of group I among the 264 DEPs, which gives insight into how reliably the transcriptional profile reflects the translational profile. Further studies will be needed to explore the dynamic changes over multiple time points [20] in the production pathways of secondary metabolites under the fermentation condition, rather than relying on a single-time-point experimental design.

The widely targeted metabolomics method based on LC-MS/MS technology is a very sensitive and accurate method for the measurement of targeted metabolites, and is facilitated by the construction of MS2 spectral tag (MS2T) libraries [31]. Taking into consideration the highly sensitive detection capability of widely targeted metabolomics and the ambiguity that metabolites from the sclerotium may be either self-synthesized or acquired from hosts or culture media of undefined composition, the fungus grown under the FEC condition with a chemically defined medium was selected, together with that grown under the SEC condition, for the identification and quantification of secondary metabolites synthesized by I. obliquus. The identification and quantification of 307 secondary metabolites supplied comprehensive information on fungal secondary metabolites under fermentation conditions, significantly updating the list of potential bioactive metabolites [5]. Owing to the close relationship of their biosynthetic pathways [59], many bioactive polyphenols including flavonoids and phenolic acids, such as caffeic acid, vanillic acid, isorhamnetin-3-O-arabinoside and 3’,4’,7-trihydroxyflavone, were found to be highly accumulated in fermentation culture, which is consistent with previous studies [20, 23, 24]. The accumulation of multiple bioactive polyphenols may suggest that some proteins, including phenylalanine ammonia lyase (PAL) and cinnamate 4-hydroxylase (C4H, a cytochrome P450 monooxygenase), which are located upstream of the two biosynthetic pathways for the basic structural skeletons of flavonoid and phenolic acid compounds, are upregulated. It has been demonstrated that increased expression of PAL can lead to the accumulation of flavonoids [60]. Furthermore, among the 157 differentially expressed genes/proteins in group I, we found that gene001065, encoding a cytochrome P450 monooxygenase protein with a C4H domain (E value of 1.54e-46) (ID in NCBI: PLN02394), was upregulated with a 5.12-fold change at the transcription level and a 2.17-fold change at the expression level, suggesting its correlation with the accumulation of flavonoid and phenolic acid compounds. The biosynthetic pathway of flavonoid compounds seems to be present in the fungus, although chalcone synthase (CHS), which is responsible for producing the key precursor of flavonoid compounds, has still not been found in the genomes of this fungus and other mushrooms [61]. These findings may facilitate the elucidation of the regulation of bioactive polyphenol production and promote further research on the identification of novel CHS-like genes with homology or functional similarity.

Nevertheless, although terpene synthases were found to be up-regulated at both the transcriptional and translational levels, terpenes were not detected here. Aqueous alcohols can capture a broad spectrum of metabolites, but mostly on the polar side, so some constituents of intermediate polarity and non-polar metabolites may be under-represented in the LC-MS/MS data. Hence, the variation between the two types of culture in terms of non-polar metabolites is left out of this study. For further comprehensive identification of metabolites, especially those from the sclerotium, we suggest using a combination of different solvents for metabolite extraction to improve the identification coverage, as well as additional corresponding MS-based detection, such as GC (gas chromatography)-MS/MS. Notably, owing to the absence of environmental stimuli, fewer secondary metabolites accumulated under fermentation conditions than in fungi grown in natural habitats, highlighting that a large proportion of potentially bioactive compounds remains to be identified. Further efforts will be needed to use a variety of stimuli to promote the production of secondary metabolites and to use I. obliquus as a reliable source for pharmaceutical purposes [24, 25, 62,63,64,65,66,67]. Furthermore, in combination with this study, further transcriptomic analysis and comprehensive identification of additional metabolites of the fungal sclerotium will help to explore the link between sequenced fungal genomes and bioactive fungal secondary metabolites [47]. In combination with bioinformatic tools, the construction of genetically engineered fungi through knock-out of target genes [47, 68, 69] will be conducted for the identification of genes involved in synthesis pathways and the discovery of novel fungal secondary metabolites [70].

In summary, genome sequencing, an integrated comparative omic approach and metabolomic profiling revealed the genetic basis of the metabolic profile of I. obliquus under SEC and FEC conditions. The large number of secondary metabolites self-synthesized by the fungus, identified and quantified by widely targeted profiling, supplied fundamental information for further screening of promising target metabolites under fermentation conditions. In the future, this multi-omic resource should assist the scientific community in genetic manipulation and metabolic engineering for the production of potential pharmaceutical metabolites based on the high-accuracy genome presented here.

Methods

Preparation of fungal materials for multiple omics

The culture of I. obliquus (strain CFCC 83414) was purchased from the China Forestry Culture Collection Center (CFCC) and maintained on potato dextrose agar (PDA) culture medium (Solarbio, China) at 26℃ in darkness. The mycelia were transferred to PDB culture medium (Solarbio, China) and incubated at 26℃ and 150 rpm for 72 h as previously described [20]. The mycelia were then homogenized and inoculated aseptically [20] into 500 ml conical flasks containing 150 ml of either PDB medium (used as seed culture in this study and named SEC) or a medium (used as fermentation culture in this study and named FEC) consisting of 2% glucose, 0.35% peptone, 0.01% KH2PO4 and 0.05% MgSO4·7H2O. Both cultures were incubated at 26℃ and 150 rpm for 7 days and used for differential analysis by transcriptomics, proteomics and widely targeted metabolomics; three biological replicates were performed for each condition (Fig. 1a).

Genome sequencing, assembly and annotation

The genomic DNA of the mycelia was prepared as previously described [71,72,73]. The genomic DNA was sequenced on the Illumina HiSeq X Ten platform and on the PacBio Sequel II platform in Circular Consensus Sequencing (CCS) mode [74] by Biomarker Technologies Co., Ltd (Beijing, China), and the obtained data were further processed according to the conventional Biomarker pipeline. Briefly, the initial PacBio Sequel II sequence data were processed with SMRT LINK (https://www.pacb.com/support/software-downloads/) (v10.0) to obtain consensus reads (high fidelity reads, HiFi reads). The HiFi reads, with an N50 of > 11 kb and accuracy of > 99%, were assembled using Hifiasm [75] (v0.15.5), and the primary assembly was further polished using Pilon [76] (v1.23) with the Illumina-derived short reads generated above [72] to correct any remaining errors.

Gene structure was predicted through a combination of homology-based, transcriptome-based and de novo prediction methods as follows. Firstly, the protein sequences of four Hymenochaetaceae genomes (Fomitiporia mediterranea, Phellinidium pouzarii, Pyrrhoderma noxium and Sanghuangporus baumii) were aligned to the assembly using TblastN [77], and the gene structure of the corresponding genomic region for each BLAST hit was predicted using GeneWise [78] (v2.4.1). Secondly, gene structure was predicted with TransDecoder [79] (v3.01) based on transcripts assembled from the differential transcriptome analysis using Trinity [79, 80] (v2.3.2), while Cufflinks [81] (v2.2.0) was used to assemble the transcripts into gene models, which were also used for the subsequent ab initio prediction. Thirdly, ab initio gene prediction was performed on the repeat-masked genome using Augustus [82] (v3.3.3) and GeneMark [83] (v4.33) with default parameters. All genes predicted by the above methods were combined into a non-redundant set of gene structures using EVidenceModeler [84] (EVM, v1.1.1).

For the prediction of noncoding RNAs (ncRNAs), the tRNAs and rRNA were predicted using tRNAscan-SE(v1.3.1) [85] and RNAmmer [86] (v1.2), respectively, while small nuclear RNA (snRNA) and small nucleolar RNA (snoRNA) sequences were identified using Infernal [87] (v1.1.2) against the Rfam [88] (v14.0) database. For the analysis of repetitive sequences and transposable elements (TEs), homology-based prediction was performed using the RepeatMasker [89] (v4.0.7) against the repeated sequence database RepBase [90]. The ab initio prediction was performed using the RepeatModeler (http://www.repeatmasker.org/RepeatModeler/) for the establishment of de novo repeat sequence library and then the RepeatMasker [89] against the de novo repeat sequence database generated from the RepeatModeler’s prediction above. Furthermore, the tandem repeat sequences were searched in the genomic sequence using Tandem Repeat Finder (TRF) [91] (v4.09). In addition, for identification of telomere sequences, repeat sequences were aligned using MAFFT [92, 93] (v7.310) with the G-INS-i option, and the sequence logo of alignment sequences was generated using WebLogo [94].

Gene functions were inferred according to the best match of alignments using BLAST (e < 1e-5) against functional databases including GO [95], KEGG [96], COG/KOG [97], NR (https://ftp.ncbi.nlm.nih.gov/blast/db/), Swissprot and TrEMBL [98] and search of the database Pfam [99] (v35) using HMMER [100]. Annotation completeness analysis was performed using BUSCO (v3.1.0) [101] search against the dataset of fungi_odb9 (https://busco-archive.ezlab.org/v3/frame_fungi.html).

Identification of CAZymes and secondary metabolite BGCs

The CAZymes in the genome of I. obliquus were identified using the dbCAN2 meta server [102] with the HMMER [100], DIAMOND [103] and eCAMI [104] search tools with default parameters, following the server's guidelines and recommendations (https://bcb.unl.edu/dbCAN2/help.php). Proteins were identified, grouped into the CAZyme families defined in the CAZy database [105] and selected according to the recommendation of the dbCAN2 server. The biosynthetic gene clusters (BGCs) of secondary metabolites were predicted using AntiSMASH 6.0.1 [106] with all extra features selected.

Transcriptomic analysis and identification of DEGs

Total RNA was extracted from each sample (triplicate SEC and FEC samples) using the mirVana miRNA Isolation Kit (Thermo, USA), and 4 µg of total RNA from each sample was used for construction of a cDNA library using the TruSeq Stranded mRNA LT Sample Prep Kit (Illumina, USA), which was then sequenced on the Illumina HiSeq X Ten system. All experiments were conducted following the manufacturers’ protocols.

The acquired raw reads in fastq format were first processed using Trimmomatic [107] (v0.36), and low-quality reads were removed to obtain clean reads. The retained clean reads were then mapped to the reference genome of I. obliquus using HISAT2 [108] (v2.2.1.0). The read counts of each gene were obtained using HTSeq [109] (v 0.6.0), and the gene FPKM (fragments per kilobase of transcript per million fragments mapped) [110] expression values were calculated using Cufflinks [81] (v2.2.0). To calculate fold changes, the number of reads for each gene in each library was normalized by the total number of mapped reads for the library using the DESeq2 [111] (v1.20.0) R package functions estimateSizeFactors and nbinomTest. The p value (negative binomial test) and FC (fold change) value were further calculated using DESeq2. Genes with a significant p value (≤ 0.05) and FC value (≥ 2 or ≤ 0.5) were considered differentially expressed genes (DEGs). KEGG [96] pathway enrichment analysis of the DEGs was performed based on the hypergeometric distribution.
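The DEG thresholds and the hypergeometric enrichment test described above can be written out explicitly. The following is a minimal sketch in Python rather than the R workflow used in the study; the function names are hypothetical, and the pathway counts in the example (40 pathway genes, 15 of them DEGs) are invented for illustration, while the background of 8049 quantified genes and 1283 DEGs comes from the Results.

```python
from math import comb


def is_deg(p_value, fold_change, p_max=0.05, fc_up=2.0, fc_down=0.5):
    """DEG rule from the text: p <= 0.05 and FC >= 2 or FC <= 0.5."""
    return p_value <= p_max and (fold_change >= fc_up or fold_change <= fc_down)


def hypergeom_enrichment_p(n_background, n_pathway, n_degs, n_hits):
    """Upper-tail hypergeometric p value: the probability of observing at
    least n_hits pathway genes among n_degs genes drawn without replacement
    from n_background genes, n_pathway of which belong to the pathway."""
    upper = min(n_pathway, n_degs)
    favourable = sum(
        comb(n_pathway, i) * comb(n_background - n_pathway, n_degs - i)
        for i in range(n_hits, upper + 1)
    )
    return favourable / comb(n_background, n_degs)


# Small worked case: 10 genes, 4 in the pathway, 3 DEGs, at least 2 hits.
# (C(4,2)*C(6,1) + C(4,3)*C(6,0)) / C(10,3) = 40 / 120
p_small = hypergeom_enrichment_p(10, 4, 3, 2)

# Hypothetical pathway against the background sizes reported in the Results.
p_example = hypergeom_enrichment_p(8049, 40, 1283, 15)

print(is_deg(0.01, 2.5), is_deg(0.01, 1.4), is_deg(0.2, 8.0))
print(p_small, p_example)
```

In practice the resulting p values would normally be adjusted for multiple testing across pathways; the text does not state which correction, if any, was applied.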

TMT-based quantitative proteomic analysis

The mycelia samples (SEC and FEC samples in triplicate) were frozen in liquid nitrogen and ground into a fine powder. Approximately 30 mg of powder from each sample was mixed with 5 volumes (w/v) of TCA (trichloroacetic acid)/acetone (1:9) (Sigma, USA), vortexed, and then incubated at -20℃ for 4 h. The precipitates were centrifuged at 6000 g for 40 min at 4℃ and washed three times with pre-cooled acetone. After air-drying at room temperature, the pellets were solubilized with 30 volumes (w/v) of SDT lysis buffer (4% SDS, 100 mM Tris-HCl, pH 7.6) [112]. The resuspended lysates were incubated at 95℃ for 5 min, sonicated on ice, and then incubated at 95℃ for 15 min. The supernatants were collected by centrifugation at 14,000 g for 15 min and filtered using 0.22 μm filters (Millipore, USA). The protein concentrations of the resultant filtrates were determined using the bicinchoninic acid (BCA) assay kit (Beyotime, China). Approximately 100 µg of protein from each sample was reduced with dithiothreitol (DTT) at a final concentration of 100 mM at 95℃ for 5 min, and subjected to trypsin digestion following the FASP protocol as previously described [112, 113] using a 30 kDa ultrafiltration unit (Sartorius, Germany). The desalted peptides were resuspended in 40 µL of 0.1% formic acid (Thermo Fisher Scientific, USA) and measured at OD280 using a NanoDrop 3000 spectrometer (Thermo Fisher Scientific, USA).

The TMT (tandem mass tag) labeling was performed by Shanghai Genechem Co., Ltd (China) following the product instructions of the TMTsixplex Isobaric Label Reagent Set (Thermo Fisher Scientific, USA) and as previously described [114]. All six samples (SEC and FEC in triplicate), each containing 100 µg of protein digest, were mixed with the TMT label dissolved in 41 µL of anhydrous acetonitrile and incubated for 2 h at room temperature. The TMT labels 126, 127 and 128 were used for the triplicate SEC samples, while the labels 129, 130 and 131 were used for the triplicate FEC samples. The reaction was quenched by adding 8 µL of 5% hydroxylamine, and then all the labeled samples were combined and further separated on a 1260 Infinity II HPLC System (Agilent, USA) equipped with an XBridge Peptide BEH C18 Column (130Å, 5 μm, 4.6 mm × 100 mm, Waters). The mobile phase consisted of two components, with component A being 5% acetonitrile (ACN) with 0.1% ammonium formate and component B being 85% ACN with 0.1% ammonium formate. The 85 min solvent gradient at a flow rate of 1 mL/min was set as follows: 0% B within 25 min, 0 − 7% B in 5 min, 7 − 40% B in 25 min, 40 − 100% B in 5 min and 100% B for 15 min. Fractions were collected every minute for a total of 40 fractions. All fractions were dried by vacuum centrifugation (Huamei, China), then reconstituted with 10% formic acid, and further combined into ten samples prior to LC − MS/MS analysis.

The MS data were acquired with an Easy nLC system (Thermo, USA) coupled to an Orbitrap Q-Exactive™ Plus mass spectrometer (Thermo, USA). Peptides were trapped (Acclaim PepMap RSLC 50 µm × 15 cm, nano viper, Thermo, USA) before being separated on the Easy nLC system. The mobile phase consisted of two components, with component A being 0.1% formic acid in water and component B being 0.1% formic acid in 80% ACN. The 90 min solvent gradient at a flow rate of 300 nL/min was set as follows: 6% B within 5 min, 6–38% B in 70 min, 38–100% B in 10 min, and 100% B for 5 min. Full scan MS spectra from m/z 350–1800 were acquired at a resolution of 70,000 with automatic gain control (AGC) set to 3e6 and a maximum injection time (IT) set to 50 ms, followed by 10 MS2 scans of precursors selected for fragmentation by higher-energy collision dissociation (HCD) with normalized collision energy set to 30 eV. All MS2 spectra were acquired at a resolution of 35,000 with a maximum injection time (IT) set to 45 ms.
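As an illustration only, the nano-LC gradient described above can be written as a list of linear segments, which makes it easy to confirm the total run time and to look up the nominal %B at any time point. This is a minimal sketch, not the instrument method file: the names `SEGMENTS`, `total_minutes` and `percent_b` are hypothetical, and "6% B within 5 min" is assumed to mean an isocratic hold at 6% B.

```python
# Each segment: (duration in min, %B at segment start, %B at segment end).
# Assumption: the first step is read as a 5 min hold at 6% B.
SEGMENTS = [
    (5.0, 6.0, 6.0),
    (70.0, 6.0, 38.0),
    (10.0, 38.0, 100.0),
    (5.0, 100.0, 100.0),
]


def total_minutes(segments=SEGMENTS):
    """Total gradient length in minutes."""
    return sum(duration for duration, _, _ in segments)


def percent_b(t, segments=SEGMENTS):
    """Nominal %B at time t (min), by linear interpolation within a segment."""
    if t < 0 or t > total_minutes(segments):
        raise ValueError("time is outside the gradient")
    elapsed = 0.0
    for duration, start, end in segments:
        if t <= elapsed + duration:
            return start + (end - start) * (t - elapsed) / duration
        elapsed += duration
    return segments[-1][2]
```

Under these assumptions the segments sum to the stated 90 min, and the midpoint of the main 6–38% B ramp (t = 40 min) corresponds to 22% B.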

Raw files were converted to mgf format using Proteome Discoverer (v2.2, Thermo, USA) and submitted to a MASCOT (v2.6) server for searching against the newly annotated protein database of I. obliquus. The search parameters were as previously described [112], and identified peptides were filtered to a 1% false discovery rate (FDR). Proteins that showed more than a 1.5-fold change (FC ≥ 1.5 or ≤ 0.67) with a p value ≤ 0.05 [115] were considered to show significant differential expression. For pair-wise comparison, linear regression analysis was performed on the log2-transformed fold changes from the TMT-based quantitative proteomic and RNA-Seq results. Correlation was regarded as strong when R² > 0.81 [116, 117].
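The selection and correlation criteria above can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not the pipeline used in the study: the function names are hypothetical, fold changes and p values are assumed to be already computed per protein, and R² is taken as the coefficient of determination of a simple least-squares line.

```python
from statistics import mean

# Thresholds quoted in the text.
FC_UP, FC_DOWN, P_MAX = 1.5, 0.67, 0.05
R2_STRONG = 0.81


def classify(fc, p):
    """Label one protein as 'up', 'down' or 'ns' (not significant)."""
    if p <= P_MAX and fc >= FC_UP:
        return "up"
    if p <= P_MAX and fc <= FC_DOWN:
        return "down"
    return "ns"


def r_squared(x, y):
    """R^2 of the least-squares line through paired log2 fold changes."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    return sxy ** 2 / (sxx * syy)


def is_strong(x, y):
    """True when the correlation meets the R^2 > 0.81 criterion."""
    return r_squared(x, y) > R2_STRONG
```

Here `x` and `y` would hold the log2 fold changes of the same genes from the proteomic and RNA-Seq data sets, in matching order.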

Extraction and quantification of metabolites

The mycelia from each sample (triplicate SEC and FEC) were freeze-dried and then extracted with 70% aqueous methanol as previously described [31]. The freeze-dried mycelia were crushed using a mixer mill (MM 400, Retsch, Germany) with a zirconia bead for 1.5 min at 30 Hz. A 100 mg portion of powder was incubated with 0.6 mL of 70% aqueous methanol overnight at 4 °C. The extracts were collected after centrifugation at 10,000 g for 10 min and then filtered through a 0.22 μm filter (SCAA-104, ANPEL, China). The MS data were acquired with a UPLC (Shim-pack UFLC SHIMADZU CBM30A system, SHIMADZU, Japan) coupled to a 4500 Q-TRAP tandem MS system (Applied Biosystems, USA). Metabolites in 4 µL of each extract were separated on the UPLC equipped with an SB-C18 column (1.8 μm, 2.1 mm × 100 mm, Agilent, USA). The mobile phase consisted of two components, with component A being 0.1% ammonium formate and component B being 100% ACN. The 90 min solvent gradient, at a flow rate of 0.35 mL/min, started at 95% A and 5% B. Within 9 min, a linear gradient to 5% A, 95% B was programmed, and a composition of 5% A, 95% B was kept for 1 min. Subsequently, the composition was adjusted to 95% A, 5% B within 1.10 min and kept for 2.9 min. The column oven was set to 40 °C. The effluent was alternatively connected to an ESI (electrospray ionization)-triple quadrupole-linear ion trap spectrometer (Q-TRAP). The ESI source operation parameters were set as previously described [31], whereas ion source gas I (GSI), gas II (GSII) and curtain gas (CUR) were set at 50, 60, and 30 psi, respectively, and QQQ (triple quadrupole) scans were acquired as MRM (multiple reaction monitoring) experiments with the collision gas (nitrogen) set to 5 psi [31]. Quality control (QC) samples were injected into the mass spectrometer to monitor the repeatability of the analysis process.

The MS/MS data were processed using Analyst (v1.6.3, AB SCIEX, USA), and metabolites were then identified by matching mass spectra to the reference library of the local MetWare database (MWDB), which is based on standard compounds, or to public databases including METLIN [118]. The secondary spectrum and retention time (RT) of the metabolites in the project samples were compared with those in MWDB; specifically, the MS and MS2 tolerances were set to 20 ppm, and the RT offset did not exceed 0.2 min. Quantification of metabolites was conducted using multiple reaction monitoring (MRM), and the chromatographic peak areas of the metabolites were log2-transformed and mean-centered for orthogonal partial least squares-discriminant analysis (OPLS-DA) using the “OPLSR.Anal” function of the R package MetaboAnalystR [119]. To avoid overfitting, a permutation test (200 permutations) was performed for the OPLS-DA model. Significantly regulated metabolites between groups were determined by a VIP (variable importance in projection) value ≥ 1 and a log2 FC value ≥ 1 or ≤ -1.
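The matching tolerances, pre-processing and significance filter above can be illustrated with a short Python sketch. It is not the MetWare or MetaboAnalystR implementation: all function names are hypothetical, mean centering is assumed to be applied per metabolite across samples, and VIP values are assumed to come from a separately fitted OPLS-DA model.

```python
import math


def ppm_error(observed_mz, reference_mz):
    """Absolute mass error in parts per million."""
    return abs(observed_mz - reference_mz) / reference_mz * 1e6


def matches_reference(observed_mz, reference_mz, observed_rt, reference_rt,
                      ppm_tol=20.0, rt_tol=0.2):
    """True when m/z is within 20 ppm and RT is within 0.2 min."""
    return (ppm_error(observed_mz, reference_mz) <= ppm_tol
            and abs(observed_rt - reference_rt) <= rt_tol)


def log2_mean_center(peak_areas):
    """Log2-transform one metabolite's peak areas, then subtract their mean."""
    logged = [math.log2(area) for area in peak_areas]
    centre = sum(logged) / len(logged)
    return [value - centre for value in logged]


def is_significant(vip, log2_fc):
    """Apply the VIP >= 1 and |log2 FC| >= 1 criteria."""
    return vip >= 1.0 and abs(log2_fc) >= 1.0
```

For example, a feature observed at m/z 300.003 against a reference of 300.000 has an error of about 10 ppm and would pass the mass tolerance, whereas one observed at 300.009 (about 30 ppm) would not.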