Expression and analysis of zinc finger family gene in Lenzites gibbosa

Zinc finger transcription factors play significant roles in the growth and development of plants and animals, but their function remains obscure in fungi. Mycelia of Lenzites gibbosa were grown on sawdust, which served as a nutrient matrix supporting mycelial growth, harvested at different times and subjected to transcriptome sequencing. The databases used for annotation were the Kyoto Encyclopedia of Genes and Genomes (KEGG), the Cluster of Orthologous Groups of proteins (COG) and Gene Ontology (GO). Zinc finger genes related to the growth and development of L. gibbosa were screened, and GO annotation and enrichment analysis of differentially expressed genes were carried out. A total of 114.55 Gb of clean data were obtained from the L. gibbosa transcriptome. The average clean data per sample was 6.16 Gb. The mapping efficiency of reads between each sample and the reference genome was 88.5% to 91.4%. The COG analysis showed that most zinc finger protein genes were related to replication, recombination and repair. GO enrichment analysis showed that the expressed genes were involved in cellular process, cell part and binding. Using the DESeq2 data analysis software, we identified 72 differentially expressed genes, comprising seven up-regulated and 65 down-regulated genes. By comparing the differentially expressed genes with the KEGG database, 66 annotated sequences were obtained and 35 primary metabolic pathways were annotated. Pathway enrichment analysis showed that differentially expressed genes were significantly enriched in the protein processing in endoplasmic reticulum and ubiquitin-mediated proteolysis pathways. Gene_11750 and gene_5266 are highly correlated with the growth and development of L. gibbosa and are closely related to these two pathways. According to gene functional analysis, seven important differentially expressed genes related to the growth and development of L. gibbosa were identified.


Introduction
Zinc is a significant trace element in eukaryotic organisms, and more than 300 zinc-containing enzymes have been found. In some enzymes, zinc interacts directly with the transforming substrate molecules and participates in hydrogen peroxide reaction, while in others, zinc plays a structural role (Berg and Shi 1996). Zinc-structured protein transcription factors (TF III A) were originally discovered in African clawed frog oocytes in 1983 (Miller et al. 1985; Lander et al. 2001). These proteins are widely distributed in eukaryotic genomes. A zinc finger protein is a type of transcription factor with a "finger" domain responsible for regulating gene expression. The common feature of zinc finger proteins is that they bind zinc ions to stabilize a very short, self-folding, finger-like polypeptide conformation. Zinc finger proteins have been reported in petunia (Takatsuji et al. 1992), wheat (Sakamoto et al. 1993), thale cress (Arabidopsis thaliana (L.) Heynh.) (Sakamoto et al. 2000), soybean (Kim et al. 2001), cotton and rice (Huang et al. 2002). According to the number and position of cysteine (C) and histidine (H) residues, zinc finger proteins can be classified into C2H2, C2HC5, C2HC, C3HC4, C3H, C4, C4HC3, C6 and C8 types (Berg and Shi 1996). Most zinc finger proteins belong to the C2H2 type. Eukaryotic zinc finger proteins are not only involved in cell proliferation, differentiation and apoptosis or cell death (Bowman et al. 1992; Takatsuji et al. 1992; Sakai et al. 1995; Kobayashi et al. 1998; Kapoor et al. 2002; Yun et al. 2002), but are also related to biological stress (Lippuner et al. 1996; Sakamoto et al. 2000). The zinc finger is a universal transcription factor structure for recognizing specific base sequences. However, the function of most zinc finger proteins in Lenzites gibbosa (Pers.) Hemmi remains obscure. Exploring how zinc finger genes regulate the mycelial growth of L. gibbosa in a sawdust environment is therefore of significance for understanding wood lignin degradation.
Owing to the development of novel high-throughput assay technologies, transcriptome sequencing has become a vital means of transcriptome research (Qi et al. 2011; Zhou et al. 2012; Xu et al. 2014; Jia et al. 2015; Shi 2016). In recent years, transcriptome techniques have been applied to fungi. Tang et al. (2013) used Illumina high-throughput RNA sequencing to sequence the mycelium of Lentinus edodes after 30 days of darkness, after 80 days of darkness, and after 30 days of darkness combined with 50 days of light, in order to identify the genes related to the formation of light-induced brown film. Yi (2015) sequenced the fruiting bodies of L. edodes under softening and non-softening conditions on the Illumina HiSeq 2000 and identified fifteen genes associated with cell wall metabolism. Wu et al. (2017) sequenced the transcriptome of mushrooms at different developmental stages and screened out eight genes related to development. Peng et al. (2019) sequenced the transcriptome of Agaricus bisporus mushrooms stored for one day, one week and two weeks after harvest, and screened out five genes related to polyphenol oxidase.
In this study, the transcriptome of L. gibbosa mycelium samples grown on sawdust for different periods of time was sequenced by high-throughput sequencing technology. The transcriptome data were analyzed for differential gene expression, COG (cluster of orthologous groups of proteins) functional annotation, GO (gene ontology) functional annotation and KEGG (Kyoto encyclopedia of genes and genomes) enrichment. Genes associated with growth and development were also screened, providing a basis for future study of the mechanism by which L. gibbosa degrades wood.

Experimental materials and sampling site overview
The CB1 strain of L. gibbosa was isolated from the fruiting bodies of plants on Changbai Mountain, northeast China (41°42′N, 127°41′E), and stored on PDA (potato dextrose agar) slant culture medium at 4 °C at the Forest Disease and Insect Pathology Laboratory of the Forestry College, Northeast Forestry University.
The wood used in the test was hybrid black poplar (Populus simonii Carr × P. nigra L.). The bark of the wood segment was removed, and the wood was made into sawdust and sterilized at high temperature for later use.
The medium for enzyme production was low nitrogen asparagine-succinic acid (LANS) medium, prepared according to Yan (2009).

Mycelium culture and sample collection
After 24 h at room temperature, the strain was transferred to a new PDA plate (9 cm in diameter) and cultured at 28 °C for seven days until the hyphae covered the entire plate. Mycelial plugs were cut from the edge of the colony with a 5 mm diameter perforator and placed into 250 mL conical flasks containing 70 mL LANS medium and 5 mL filtered 15% glucose (V/V). Five plugs were added to each flask and incubated for 10 days at 25 °C. Fifteen flasks were randomly selected as the control group; five flasks were pooled into one sample, giving three control samples as biological replicates. These control mycelium samples without sawdust were recorded as CK1, CK2 and CK3. For the treatment group (MX), 2 g of sterile sawdust was added to each flask of culture after the 10 days of incubation. Mycelia were harvested 3, 5, 7 and 11 days after the sawdust was added. At each time point, five flasks were pooled into one sample and three samples were taken as biological replicates, recorded in chronological order as MX11, MX12, MX13, MX21, MX22, MX23, MX31, MX32, MX33, MX41, MX42 and MX43. CK, MX1, MX2, MX3 and MX4 denote the mean values of the biological replicates at each treatment time. The pooled hyphae were filtered through eight layers of gauze, washed with sterile water to remove residual culture medium, and wrung dry to yield not less than 0.6 g of hyphae. The hyphae were transferred into freezing tubes, sealed, quickly frozen in liquid nitrogen and stored at −80 °C.
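The sampling design described above can be summarized in a short sketch. The sample labels and sampling days are taken from the text; the function name `build_sample_table` and the dictionary layout are illustrative assumptions, not part of the original protocol.

```python
# Sketch of the sampling design: one control group (CK) without sawdust and
# four treatment time points (MX1-MX4), each with three biological replicates.
# Labels and days come from the text; the data layout is an assumption.

SAWDUST_DAYS = [3, 5, 7, 11]   # days after sawdust was added
REPLICATES = 3                 # biological replicates per group
CULTURES_PER_SAMPLE = 5        # culture vessels pooled into one sample


def build_sample_table():
    """Return a dict mapping each sample label to its group and sampling day."""
    samples = {}
    for rep in range(1, REPLICATES + 1):
        samples[f"CK{rep}"] = {"group": "CK", "days_on_sawdust": None}
    for index, day in enumerate(SAWDUST_DAYS, start=1):
        for rep in range(1, REPLICATES + 1):
            samples[f"MX{index}{rep}"] = {
                "group": f"MX{index}",
                "days_on_sawdust": day,
            }
    return samples


samples = build_sample_table()
print(len(samples))                        # number of pooled samples
print(len(samples) * CULTURES_PER_SAMPLE)  # culture vessels required in total
```

The resulting set of labels (three CK samples plus twelve MX samples) corresponds to the samples that were sent for sequencing.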

RNA extraction and transcriptome cDNA library sequencing
The samples were sent to Beijing Biomarker Biotechnology Co., Ltd, Beijing, China. Total mycelial RNA was extracted using the spin-column RNAprep Pure Plant Kit (TIANGEN BIOTECH, Beijing, China). RNA concentration was measured by ultraviolet spectrophotometry (NanoDrop 2000, Thermo Fisher Scientific, Beijing, China), and RNA integrity was evaluated with the RNA Nano 6000 assay kit on the Agilent Bioanalyzer 2100 system. mRNA was enriched from total RNA with oligo(dT) magnetic beads and randomly broken into short fragments. First-strand cDNA was synthesized by reverse transcription using the fragmented RNA as template, followed by second-strand synthesis. The cDNA library was then enriched by PCR amplification and evaluated using the Agilent Bioanalyzer 2100 system. After quality inspection, the library was subjected to high-throughput paired-end sequencing on the Illumina HiSeq X Ten platform.

Selection of transcriptome zinc finger genes
Raw sequencing data were filtered by quality control to obtain clean reads. Using TopHat and Bowtie, the clean reads were aligned to the JGI Trametes gibbosa reference genome (https://genome.jgi.doe.gov/Tragib1/Tragib1.home.html). Cufflinks was then used with the mapped reads and the GFF file of the reference genome to assemble transcripts, which were merged with Cuffmerge to obtain a set of transcript information and the expression value of each gene. BLAST was used to annotate differentially expressed genes against the COG, GO and KEGG functional annotation databases. Zinc finger genes and their expression levels were screened according to the functional annotations from these databases. The sequence data reported here have been submitted to the GenBank database under the accession numbers displayed in Table S2.

Screening of differentially expressed genes of zinc finger protein
Differentially expressed genes (DEGs) between the control group and the treatment group were identified using DESeq2 software, with a False Discovery Rate (FDR) < 0.01 and a Fold Change (FC) ≥ 2 as the screening criteria.
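The screening criteria can be illustrated with a minimal sketch. It assumes that fold change is expressed as treatment over control and that the threshold is applied symmetrically (FC ≥ 2 or FC ≤ 0.5, that is, |log2 FC| ≥ 1); this is a common convention, but the source does not spell it out. The function name `classify_gene` and the example values are hypothetical, and this is not the DESeq2 procedure itself, only the filtering step applied to its output.

```python
import math

FDR_CUTOFF = 0.01   # FDR threshold stated in the text
FC_CUTOFF = 2.0     # fold-change threshold stated in the text


def classify_gene(fold_change, fdr):
    """Label a gene 'up', 'down' or 'normal' from its fold change and FDR."""
    log2_fc = math.log2(fold_change)
    # A gene is kept only if it passes both the FDR and the fold-change filter.
    if fdr >= FDR_CUTOFF or abs(log2_fc) < math.log2(FC_CUTOFF):
        return "normal"
    return "up" if log2_fc > 0 else "down"


# Hypothetical (fold change, FDR) values for illustration; not study data.
example = {
    "gene_a": (4.0, 0.001),    # strong increase, significant
    "gene_b": (0.2, 0.005),    # strong decrease, significant
    "gene_c": (3.0, 0.2),      # large change but not significant
    "gene_d": (1.3, 0.0001),   # significant but change too small
}
labels = {gene: classify_gene(fc, fdr) for gene, (fc, fdr) in example.items()}
print(labels)
```

Applying both filters together means that a gene with a large but statistically unsupported change, or a well-supported but small change, is treated as normally expressed.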

Transcriptome sequencing data and assembly results
The transcriptomes of 15 samples were analyzed (Table 1) and 114.55 Gb of valid data were obtained. The sequence of each sample was 6.16 Gb, and the base ratio of Q30

Zinc finger family gene structure
Based on the functional annotation results from the KEGG, COG and GO databases (Ashburner et al. 2000; Tatusov et al. 2000; Koonin et al. 2004; Kanehisa et al. 2004; Finn et al. 2009), 194 zinc finger protein genes were identified among all transcriptome genes. According to the conserved motifs of the zinc finger proteins, the gene structures were divided into 24 types (Table S1). Some individual genes, such as gene_5515, gene_10106 and gene_8201, were found to contain two conserved motifs.
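To make the idea of motif-based classification concrete, the sketch below scans a protein sequence for a simplified C2H2-like spacing pattern (C-x(2,4)-C-x(12)-H-x(3,5)-H). The pattern is a textbook simplification chosen for illustration; the source does not state how conserved motifs were detected, and the function name and test sequence are hypothetical.

```python
import re

# Simplified C2H2-like consensus: two cysteines 2-4 residues apart, a
# 12-residue spacer, then two histidines 3-5 residues apart.
# This is an illustrative assumption, not the method used in the study.
C2H2_LIKE = re.compile(r"C.{2,4}C.{12}H.{3,5}H")


def find_c2h2_like(sequence):
    """Return (start, matched_text) for each non-overlapping C2H2-like motif."""
    return [(m.start(), m.group()) for m in C2H2_LIKE.finditer(sequence.upper())]


# Hypothetical sequence containing one finger-like stretch.
demo = "MK" + "CPECGKAFSRSDELTRHIRIH" + "TG"
hits = find_c2h2_like(demo)
print(hits)
print(find_c2h2_like("MKAAAAGGSS"))   # sequence without the pattern
```

Other zinc finger classes (for example C3HC4 or C3H) would need their own spacing patterns, and a gene carrying two different motifs would be reported under both.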

Analysis of COG and GO functional annotation of zinc finger family genes
The COG database provides an orthology-based classification of gene products and was an early resource for identifying orthologous genes. It is based on large-scale comparisons of protein sequences from various organisms. The functions of gene products are classified into 25 categories labelled with letters from A to Z. In this research, 32 genes were annotated in the COG database and distributed among different functional categories (Fig. 1). In the COG classification, nine functions were annotated: (1) four genes were involved in translation, ribosomal structure and biogenesis, accounting for 4.0% to 12.5% of the total transcriptome genes; (2) five were involved in transcription, accounting for 5.0% to 15.6%; (3) nine were involved in replication, recombination and repair, accounting for 9.0% to 28.1%; (4) one was involved in cell cycle control, cell division and chromosome partitioning, accounting for 1.0% to 3.1%; (5) three were involved in posttranslational modification, protein turnover and chaperones, accounting for 3.0% to 9.4%; (6) one was involved in energy production and conversion, accounting for 1.0% to 3.1%; (7) one was involved in nucleotide transport and metabolism, accounting for 1.0% to 3.1%; (8) seven were assigned to general function prediction only, accounting for 7.0% to 21.9%; (9) one gene had an unknown function, accounting for 1.0% to 3.1% of the total transcriptome genes. Based on these results, it can be inferred that, apart from the general function prediction ("R") and function unknown ("S") genes, the remaining gene products are orthologous proteins related to the primary metabolism of L. gibbosa on sawdust. The growth of L. gibbosa mycelium is primary metabolism. Most zinc finger protein genes with a COG functional annotation are related to replication, recombination and repair, and these genes are meaningful differential genes.
Fig. 1 The COG annotation classification of zinc finger family genes
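The upper value of each range quoted above equals the category's gene count divided by the 32 COG-annotated genes, which can be checked with a few lines of Python. The counts are taken from the text; the variable names are illustrative.

```python
# Gene counts per COG category as reported in the text.
cog_counts = {
    "translation, ribosomal structure and biogenesis": 4,
    "transcription": 5,
    "replication, recombination and repair": 9,
    "cell cycle control, cell division": 1,
    "posttranslational modification, protein turnover, chaperones": 3,
    "energy production and conversion": 1,
    "nucleotide transport and metabolism": 1,
    "general function prediction only": 7,
    "function unknown": 1,
}

total = sum(cog_counts.values())   # number of COG-annotated genes

# Share of each category among the COG-annotated genes, in percent.
shares = {name: round(100 * n / total, 1) for name, n in cog_counts.items()}

for name, share in sorted(shares.items(), key=lambda item: -item[1]):
    print(f"{share:5.1f}%  {name}")
```

Sorting by share puts replication, recombination and repair first, in line with the statement that most COG-annotated zinc finger genes fall into this category.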
GO is a functional classification system for describing the properties of genes and their products in organisms. In this research, the 194 zinc finger protein genes were subjected to GO annotation and enrichment analysis (Fig. 2). The genes were mainly distributed among 24 secondary functional terms (9 + 10 + 5) in the three major categories of biological process, cellular component and molecular function. Of the 194 genes, 136 were annotated with a biological process function, 35 of which fell under the secondary term "cellular process (GO: 0009987)". One hundred and thirty-six were annotated with a cellular component function, among which the secondary terms "cell part (GO: 0044464)" and "cell (GO: 0005623)" contained the most genes. A total of 159 were annotated with a molecular function, among which the secondary term "binding (GO: 0005488)" was the most numerous, with 113 genes.
Fig. 2 The GO annotation classification of zinc finger family genes
GO enrichment analysis was performed on all 194 genes, and secondary terms with a corrected P value < 0.01 or < 0.05 in the three functional classifications were screened. Table 2 lists the GO terms significantly enriched among the DEGs at a corrected P value < 0.002. The most significantly enriched GO terms comprised six functional annotation categories with 160 genes: "protein targeting (GO: 0006605)", "establishment of protein localization (GO: 0072594)", "mitotic cell cycle DNA replication (GO: 1902969)", "COPII vesicle coat (GO: 0030127)", "zinc ion binding (GO: 0008270)" and "metal ion binding (GO: 0046872)". These are the differentially expressed genes (DEGs) that deserve attention. A corrected P value < 0.002 indicates remarkable enrichment, and the more significant the enrichment, the more important the term. The terms can be classified into two categories: one involved in mitotic DNA replication in the cell cycle, the other in the binding of zinc or other metal ions. These functional genes play an essential role in the growth and development of L. gibbosa mycelium.

Screening of differentially expressed genes
The sequenced data were used for zinc finger gene screening and expression analysis. Differential genes at the different treatment times were identified with the DESeq2 data analysis software (FC ≥ 2, FDR < 0.01). Most genes showed no change in expression (specific data not provided); some were down-regulated and others up-regulated, and the changes followed a regular pattern (Fig. 3 and Table 3). There were 18 differential genes in CK versus MX1, two of which were up-regulated; 14 in CK versus MX2, one of which was up-regulated; 19 in CK versus MX3, three of which were up-regulated; and 17 in CK versus MX4, one of which was up-regulated. The differential genes in MX1 versus MX3, MX1 versus MX4 and MX2 versus MX4 were all down-regulated, whereas no differentially expressed genes were found in MX1 versus MX2, MX2 versus MX3 or MX3 versus MX4. After three days on sawdust, the expression levels of gene_8705 and gene_7835 were up-regulated; their expression levels were normal at 5, 7 and 11 days on sawdust. After five days on sawdust, the expression level of gene_11332 was up-regulated. After seven days, the expression levels of gene_11332, gene_11087 and gene_3897 were up-regulated. After 11 days, the expression levels of gene_11332 and gene_3897 were normal, while gene_11087 was still up-regulated.
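The counts for the four control-versus-treatment comparisons can be tallied as follows. The numbers are those reported in the text; the dictionary layout and variable names are illustrative assumptions.

```python
# (total DEGs, up-regulated DEGs) for each control-versus-treatment
# comparison, as reported in the text.
comparisons = {
    "CK vs MX1": (18, 2),
    "CK vs MX2": (14, 1),
    "CK vs MX3": (19, 3),
    "CK vs MX4": (17, 1),
}

# Down-regulated genes are the remainder in each comparison.
down_per_comparison = {
    name: total - up for name, (total, up) in comparisons.items()
}

total_degs = sum(total for total, _ in comparisons.values())
total_up = sum(up for _, up in comparisons.values())
total_down = sum(down_per_comparison.values())

print(down_per_comparison)
print(total_degs, total_up, total_down)
```

The seven up-regulated genes match the total given in the abstract. The abstract's overall figures (72 DEGs, 65 of them down-regulated) presumably also include the down-regulated genes from the comparisons between treatment time points, although the text does not break these down.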

KEGG pathway analysis of differentially expressed genes
KEGG integrates biological characteristics and the relationships between macromolecular function and metabolic pathways through functional hierarchies and network pathway diagrams, and is an important database in systems biology. The differential genes were mainly distributed among 20 secondary functional terms (5 + 2 + 10 + 3) in the four major categories of cellular process, environmental information processing, genetic information processing and metabolism (Fig. 4). According to the KEGG pathway annotation, 66 annotated sequences were obtained by comparing the differentially expressed genes with the KEGG database, and 35 primary metabolic pathways were annotated. To control the false positive rate, a corrected P-value (p_FDR) threshold of 0.05 was applied, and pathways meeting this threshold were defined as KEGG pathways significantly enriched in the differentially expressed genes (Table 4). In this research, significantly enriched genes were mainly concentrated in protein processing in endoplasmic reticulum and ubiquitin-mediated proteolysis. The main enriched genes were gene_11750, gene_5266, gene_4275, gene_8575, gene_9670, gene_6191 and gene_6400.
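The role of the corrected P-value can be illustrated with a minimal sketch of the Benjamini-Hochberg procedure, a widely used FDR correction. The source does not name the correction method behind p_FDR, so the choice of Benjamini-Hochberg is an assumption, and the pathway names and raw P-values below are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    n = len(p_values)
    order = sorted(range(n), key=lambda i: p_values[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(n, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * n / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical raw enrichment p-values for four pathways; not study data.
raw = {
    "pathway_a": 0.001,
    "pathway_b": 0.04,
    "pathway_c": 0.03,
    "pathway_d": 0.2,
}
adj = dict(zip(raw, benjamini_hochberg(list(raw.values()))))

# Pathways whose corrected p-value falls below the 0.05 threshold.
significant = [name for name, p in adj.items() if p < 0.05]
print(adj)
print(significant)
```

In this toy example two pathways have raw P-values below 0.05 but lose significance after correction, which is the kind of false positive the threshold on p_FDR is meant to guard against.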

Discussion
In this study, L. gibbosa was cultured on low nitrogen asparagine-succinic acid medium for 10 days before being grown on sawdust and was therefore nutrient deficient. Sawdust was then supplied as a source of carbon and nitrogen to support growth. Transcriptome sequencing was carried out on mycelial samples treated with sawdust on the first day and on days 3, 5, 7 and 11. The results showed that the expression of most genes was normal, some were down-regulated and others up-regulated, but all showed a certain regularity, which was consistent with the results of Zhao et al. (2019), who screened 11 genes related to lignin degradation by L. gibbosa. When the nutrient-deficient L. gibbosa culture was provided with sawdust, the genes related to lignin degradation and to growth and development were mostly down-regulated, with very few up-regulated, across the different culture periods; this needs further study. According to the functional annotations of COG, GO and KEGG, we searched all transcriptome genes and identified 194 zinc finger genes. Most zinc finger genes were related to growth and development and to cell division. We obtained 66 annotated sequences and annotated 35 metabolic pathways. The significantly enriched pathways were protein processing in endoplasmic reticulum and ubiquitin-mediated proteolysis. Kirk and Jeffries (1996) found that the lignin-degrading enzymes of L. gibbosa were produced during the secondary metabolic period of the mycelium, which indicated that the secondary metabolism of microorganisms was closely related to primary metabolism. Furthermore, the differentially expressed genes of the zinc finger family were fewer in number and amount than those related to lignin degradation (Zhao et al. 2019). This indirectly indicates that L. gibbosa, in the sawdust matrix, first carried out primary metabolism, obtaining carbon and nitrogen sources for its growth by degrading cellulose, and then carried out secondary metabolism to degrade lignin. There is, therefore, a close relationship between the primary and secondary metabolic processes of L. gibbosa, which needs further study.
Different conserved motifs of zinc finger protein genes have various functions (Berg and Shi 1996). Among the 194 zinc finger protein genes, the C2H2 type accounted for 15.7%, C3HC4 for 29.9%, C3H for 10.7%, C4 for 4.1%, other zinc finger protein genes for 22.8%, and unknown genes for 16.8%. C2H2 has the most typical zinc finger motif structure. According to the characteristics of the conserved zinc finger motifs, the proteins can be divided into the preceding types. However, the largest group of zinc finger protein genes in this family was the C3HC4 type, which is inconsistent with the results of Berg and Shi (1996). Two conserved motifs were found in individual genes of the C2H2 type, which indicated that specific zinc finger protein genes had multiple functions. According to KEGG pathway analysis of the differentially expressed genes, the significantly enriched pathways in this study were protein processing in endoplasmic reticulum and ubiquitin-mediated proteolysis. The main enriched genes were gene_11750, gene_5266, gene_4275, gene_8575, gene_9670, gene_6191 and gene_6400. The basic structure of gene_11750 belongs to the C3HC4 type. The structure of the remaining six genes is unknown; however, their specific functions were inferred from the functional annotations of COG, GO and KEGG. To further study the functional genes related to the growth and development of L. gibbosa, it is necessary to analyze the unknown conserved motif structures and to address the gaps in structural and functional research on zinc finger proteins.

Conclusions
This study preliminarily analyzed the primary metabolic pathways related to L. gibbosa growth and development at the transcriptome level. The results obtained have important reference value for mycelial growth research, and lay a foundation for further research on secondary metabolism.
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.