Study on potential differentially expressed genes in stroke by bioinformatics analysis

Stroke is a sudden cerebrovascular circulatory disorder with high morbidity, disability, mortality, and recurrence rates, but its pathogenesis and key genes remain unclear. In this study, bioinformatics methods were used to analyze the pathogenesis of ischemic stroke (IS) and its key genes, with the aim of providing guidance for clinical treatment. Gene expression profiles of GSE58294 and GSE16561 were obtained from the Gene Expression Omnibus (GEO), and the differentially expressed genes (DEGs) between the IS and normal control groups were screened with the GEO2R online tool. Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analyses of the DEGs were performed using the Database for Annotation, Visualization and Integrated Discovery (DAVID) and gene set enrichment analysis (GSEA). Then, a protein–protein interaction (PPI) network was constructed via the Search Tool for the Retrieval of Interacting Genes (STRING) database. Cytoscape with the CytoHubba plug-in was used to identify the hub genes. Finally, NetworkAnalyst was used to predict the microRNAs (miRNAs) targeting the hub genes. A total of 85 DEGs were identified, including 65 upregulated and 20 downregulated genes. In addition, 3 KEGG pathways (cytokine-cytokine receptor interaction, hematopoietic cell lineage, and B cell receptor signaling pathway) were significantly enriched according to DAVID. Combining the results of the PPI network and CytoHubba, 10 hub genes were selected: CEACAM8, CD19, MMP9, ARG1, CKAP4, CCR7, MGAM, CD79A, CD79B, and CLEC4D. Based on the visualization of the DEG-miRNA network, 5 miRNAs, including hsa-mir-146a-5p, hsa-mir-7-5p, hsa-mir-335-5p, and hsa-mir-27a-3p, were predicted as possible key miRNAs.
Our findings may contribute to the identification of potential biomarkers and novel therapeutic strategies for ischemic stroke.

thrombolysis by using recombinant tissue plasminogen activator (rTPA). However, the main drawback of this treatment is that its time window is only 3 h [5]. Therefore, it is particularly important to identify molecular targets for the effective treatment of stroke and to clarify the mechanism of brain injury.
IS is a disease mediated by many mechanisms and pathways. Studies have shown that abnormal gene expression caused by exogenous injury plays an important role in the occurrence and development of IS [6,7]. However, the biological processes in which IS injury-related genes participate and the specific mechanism of IS injury are still unclear. Bioinformatics analysis draws on high-throughput sequencing data to analyze the genome, transcriptome, and proteome information of organisms, and can reveal the mechanisms of disease occurrence and development at multiple molecular levels, providing direction for laboratory and clinical research [8]. In this study, differentially expressed genes (DEGs) were screened with bioinformatics methods, and their functions and signaling pathways were analyzed. A gene network was then constructed and key gene targets were screened. Analyzing the biological functions related to IS may lay a foundation for its clinical diagnosis and treatment.
Two original datasets were selected to screen the DEGs between the IS samples and the normal control group. To evaluate the potential molecular mechanisms underlying IS, the DEGs were further analyzed by Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis based on the Database for Annotation, Visualization and Integrated Discovery (DAVID) and gene set enrichment analysis (GSEA). A PPI network was constructed using the Search Tool for the Retrieval of Interacting Genes (STRING) database and Cytoscape software, a key module was screened out from the whole network, and the hub genes were identified based on the key module. This study identified several potentially critical stroke-associated biomarkers involved in the progression of IS, which may provide novel insights into the pathogenesis of IS and lay a foundation for its clinical treatment.

Microarray data
The microarray datasets (GSE58294, GSE16561) of IS and control samples were collected from the GEO database (https://www.ncbi.nlm.nih.gov/geo/). The author, year, platform, and the proportions of IS and control samples in each dataset were extracted and evaluated. Table 1 gives the details of the expression profile datasets.

Differentially expressed gene identification
GEO2R (http://www.ncbi.nlm.nih.gov/geo/geo2r) was applied to perform DEG analysis between serum samples from the IS and control groups, and the adjusted P-value and |log2 FC| were calculated for each gene. Genes with an adjusted P-value < 0.05 and |log2 FC| ≥ 0.5 were deemed DEGs. The FunRich software (http://funrich.org/) was used to generate the Venn diagram of DEGs. DEGs with log2 FC > 0 were considered upregulated genes, while those with log2 FC < 0 were classified as downregulated genes.
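The thresholding rule described above can be illustrated with a minimal Python sketch. It is not the GEO2R implementation; it only assumes that a table of per-gene log2 fold changes and adjusted P-values has already been exported. The record type `GeneStat`, the function `classify_degs`, and the toy gene names and values are hypothetical.

```python
from typing import NamedTuple


class GeneStat(NamedTuple):
    """One row of a differential expression table (hypothetical layout)."""
    gene: str
    log2_fc: float
    adj_p: float


def classify_degs(rows, p_cutoff=0.05, lfc_cutoff=0.5):
    """Return (upregulated, downregulated) gene sets.

    A gene is a DEG when adj_p < p_cutoff and |log2 FC| >= lfc_cutoff;
    the sign of log2 FC decides the direction.
    """
    up, down = set(), set()
    for row in rows:
        if row.adj_p < p_cutoff and abs(row.log2_fc) >= lfc_cutoff:
            (up if row.log2_fc > 0 else down).add(row.gene)
    return up, down


# Toy rows with made-up values, for illustration only.
rows = [
    GeneStat("GENE_A", 1.20, 0.001),   # upregulated DEG
    GeneStat("GENE_B", -0.80, 0.010),  # downregulated DEG
    GeneStat("GENE_C", 0.30, 0.001),   # fold change too small
    GeneStat("GENE_D", 0.90, 0.200),   # adjusted P-value too large
    GeneStat("GENE_E", -0.50, 0.049),  # boundary case, still a DEG
]
up, down = classify_degs(rows)
print(sorted(up), sorted(down))
```

Because the fold-change criterion is inclusive (≥ 0.5), a gene sitting exactly on the boundary is retained, as with the toy entry GENE_E.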

Functional enrichment analysis
Based on the DAVID database (Version 6.8), we carried out GO and KEGG pathway enrichment analyses for the DEGs [9,10]. The GO analysis included the following domains: biological process (BP), cellular component (CC), and molecular function (MF). A P-value < 0.05 was considered statistically significant.
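As a rough illustration of what an enrichment P-value measures, the sketch below computes a plain one-sided hypergeometric tail probability for a single annotation term. This is an assumption-laden simplification: DAVID reports an EASE score, which is a modified Fisher exact P-value, so the numbers produced here would not match DAVID output exactly. The function name `hypergeom_enrichment_p` and the toy counts are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(k, n, K, N):
    """One-sided enrichment P-value, P(X >= k).

    N: genes in the background
    K: background genes annotated with the term
    n: genes in the DEG list
    k: DEGs annotated with the term
    """
    upper = min(n, K)
    total = comb(N, n)
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / total


# Toy counts (made up): background of 10 genes, 4 annotated,
# 3 genes in the list, 2 of them annotated.
p_toy = hypergeom_enrichment_p(k=2, n=3, K=4, N=10)
print(round(p_toy, 4))
```

A term would be called enriched under the criterion in the text when this kind of P-value falls below 0.05.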

PPI network construction and module analysis
The PPI network of the DEGs was constructed with the STRING v10 online tool (Search Tool for the Retrieval of Interacting Genes; https://string-db.org/). The CytoHubba plug-in in Cytoscape was used to calculate the topological properties of the PPI network. The key nodes in the PPI network were determined according to their centrality scores, and the key pathogenic genes were then inferred from these nodes.
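The idea of ranking nodes by a centrality score can be sketched in Python using degree, the simplest such measure. This is only an illustration, assuming the PPI network has been exported as an undirected edge list; it does not reproduce CytoHubba, which offers several scoring methods. The function `degree_ranking` and the node names P1 to P5 are hypothetical.

```python
from collections import defaultdict


def degree_ranking(edges, top_k=10):
    """Rank nodes of an undirected network by degree (number of distinct neighbors).

    Self-loops are skipped and duplicate edges are counted once.
    Ties are broken alphabetically so the output is deterministic.
    """
    neighbors = defaultdict(set)
    for a, b in edges:
        if a == b:
            continue
        neighbors[a].add(b)
        neighbors[b].add(a)
    ranked = sorted(neighbors, key=lambda node: (-len(neighbors[node]), node))
    return [(node, len(neighbors[node])) for node in ranked[:top_k]]


# Toy edge list with hypothetical protein names.
toy_edges = [
    ("P1", "P2"), ("P1", "P3"), ("P1", "P4"),
    ("P2", "P3"), ("P4", "P5"),
    ("P2", "P1"),  # duplicate of P1-P2, counted once
]
top3 = degree_ranking(toy_edges, top_k=3)
print(top3)
```

Nodes at the top of such a ranking are the candidates that the text refers to as hub genes.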

MiRNAs associated with hub genes
Using NetworkAnalyst 3.0 (https://www.networkanalyst.ca/), a visual online platform for discovering miRNA-gene interactions in gene regulatory networks, the top 10 hub genes were mapped to their corresponding miRNAs, and the network of hub genes and miRNAs was plotted with Cytoscape 3.7.2.

IS DEGs identification
According to the selection criteria, the qualified microarray datasets in the GEO database were GSE58294 and GSE16561, both of which include control and IS samples. GSE58294 was generated on the GPL570 platform and includes 23 control samples and 69 IS samples, while GSE16561 was generated on the GPL6882 platform and includes 24 control samples and 39 IS samples. The expression data were obtained from human peripheral blood mononuclear cells. According to the criteria of P < 0.05 and |log2 FC| ≥ 0.5, 4898 DEGs in total were obtained from GSE58294, comprising 3102 upregulated genes and 1796 downregulated genes. In the GSE16561 dataset, 169 DEGs were obtained, 97 of which were upregulated and 72 of which were downregulated. The expression profiles of the DEGs in the two datasets are shown in Figs. 1 and 2. These genes were further filtered and then compared in a Venn diagram. As shown in Fig. 3, 85 genes were differentially expressed in both datasets, of which 65 genes were upregulated and 20 genes were downregulated.
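The Venn-diagram step amounts to intersecting the DEG lists of the two datasets. A minimal sketch follows, assuming each dataset's DEGs are available as a mapping from gene symbol to direction; the function `overlap_degs` and the toy genes G1 to G6 are hypothetical, and the handling of genes with opposite directions in the two datasets is an assumption, since the text does not describe it.

```python
def overlap_degs(degs_a, degs_b):
    """Intersect two DEG lists given as {gene: 'up' | 'down'} mappings.

    Returns (common_up, common_down, discordant), where discordant holds
    genes that are DEGs in both datasets but with opposite directions.
    """
    common_up, common_down, discordant = set(), set(), set()
    for gene in degs_a.keys() & degs_b.keys():
        if degs_a[gene] != degs_b[gene]:
            discordant.add(gene)
        elif degs_a[gene] == "up":
            common_up.add(gene)
        else:
            common_down.add(gene)
    return common_up, common_down, discordant


# Toy DEG lists (made up) standing in for the two datasets.
toy_a = {"G1": "up", "G2": "up", "G3": "down", "G4": "down", "G5": "up"}
toy_b = {"G1": "up", "G3": "down", "G4": "up", "G6": "down"}
common_up, common_down, discordant = overlap_degs(toy_a, toy_b)
print(sorted(common_up), sorted(common_down), sorted(discordant))
```

Applied to the real lists, the sizes of the first two sets would correspond to the upregulated and downregulated counts reported for the shared DEGs.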

GO functional enrichment analysis and KEGG enrichment pathway
GO functional enrichment analysis was carried out on the 85 DEGs with DAVID, and 56 significantly enriched GO items were obtained, including 18 CC items, 27 BP items, and 11 MF items. The bubble chart shows the top 10 items.
As the chart shows, the BP items enriched among the DEGs were mainly concentrated in signal transduction, innate immune response, and protein phosphorylation. The CC items were mainly concentrated in plasma membrane, cytoplasm, and extracellular exosomes. Among the MF items, calcium ion binding, receptor binding, and kinase activity were the most important. See Fig. 4 and Tables 2, 3, and 4 for detailed results.
The KEGG pathway enrichment analysis showed that the DEGs were mainly involved in cytokine-cytokine receptor interaction, hematopoietic cell lineage, and the B cell receptor signaling pathway. See Fig. 4 and Table 5 for the results.

Construction of PPI network and identification of hub genes
A PPI network of the DEGs was constructed with STRING v10 and visualized with Cytoscape; the result is shown in Fig. 5. The CytoHubba plug-in of Cytoscape was used to determine the key nodes in the PPI network, and ten key genes were obtained, namely CEACAM8, CD19, MMP9, ARG1, CKAP4, CCR7, MGAM, CD79A, CD79B, and CLEC4D. The potential hub genes were determined according to the node degree score generated by Cytoscape (see Fig. 5). The results showed that CEA cell adhesion molecule 8 (CEACAM8, score 11) and CD19 molecule (CD19, score 11) were the most significant genes. The rest were matrix metalloproteinase 9 (MMP9, score 9), arginase 1 (ARG1, score 8), C-C chemokine receptor 7 (CCR7, score 8), maltase-glucoamylase (score

Integrated miRNA/gene regulatory networks
The miRNA-gene regulatory network of the significant DEGs was constructed with Cytoscape software, and the target miRNAs were predicted according to the NetworkAnalyst database. The top 10 DEGs and their corresponding regulatory miRNAs are shown in Fig. 7. Among the 10 DEGs, for example, MMP9, CD79B, MGAM, and CD79A were predicted as common targets of hsa-mir-146a-5p. MMP9, CKAP4, and ARG1 were predicted as common targets of hsa-mir-7-5p. The common targets of hsa-mir-335-5p were CD79A, CEACAM8, and CCR7. The common targets of hsa-mir-27a-3p were ARG1, CKAP4, CEACAM8, CCR7, and CLEC4D. However, these findings need to be verified in future studies.
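The miRNA-target relationships listed above can be organized as a small bipartite mapping, which makes it easy to see which miRNAs cover the most hub genes and which hub genes share miRNAs. The sketch below uses only the pairs stated in this section; the variable and function names are hypothetical, and the ranking by number of hub-gene targets is an illustrative choice rather than the scoring used by NetworkAnalyst.

```python
from collections import defaultdict

# miRNA -> predicted hub-gene targets, as listed in the text.
mirna_targets = {
    "hsa-mir-146a-5p": {"MMP9", "CD79B", "MGAM", "CD79A"},
    "hsa-mir-7-5p": {"MMP9", "CKAP4", "ARG1"},
    "hsa-mir-335-5p": {"CD79A", "CEACAM8", "CCR7"},
    "hsa-mir-27a-3p": {"ARG1", "CKAP4", "CEACAM8", "CCR7", "CLEC4D"},
}


def invert(mapping):
    """Turn a miRNA -> genes mapping into a gene -> miRNAs mapping."""
    inverted = defaultdict(set)
    for mirna, genes in mapping.items():
        for gene in genes:
            inverted[gene].add(mirna)
    return dict(inverted)


gene_to_mirnas = invert(mirna_targets)

# Rank miRNAs by how many hub genes they are predicted to target
# (ties broken alphabetically).
ranked_mirnas = sorted(
    ((mirna, len(genes)) for mirna, genes in mirna_targets.items()),
    key=lambda item: (-item[1], item[0]),
)
print(ranked_mirnas)
print(sorted(gene_to_mirnas["MMP9"]))
```

In this listing, hsa-mir-27a-3p has the most predicted hub-gene targets (five), and CD19 does not appear among the targets of the four miRNAs named here.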

Discussion
IS arises from a combination of many factors, such as environment and heredity. In this study, microarray data from peripheral blood mononuclear cells of IS patients and healthy controls were obtained from the GEO database, and 85 DEGs, including 65 upregulated and 20 downregulated genes, were screened with GEO2R. GO functional enrichment analysis and KEGG pathway enrichment analysis were then carried out on these DEGs. The GO functional classification showed that the DEGs were mainly concentrated in signal transduction, immune response, inflammatory response, and receptor binding. In the PPI network of DEGs, CEACAM8, CD19, and MMP9 had the highest scores. The KEGG pathway enrichment analysis showed that the DEGs were mainly involved in cytokine-cytokine receptor interaction, hematopoietic cell lineage, and the B cell receptor signaling pathway.
Adhesion molecules are membrane surface glycoproteins that mediate cell-cell and cell-extracellular matrix adhesion. They are mainly expressed in leukocytes, platelets, and endothelial cells, and there are many kinds, including platelet membrane glycoproteins [11]. The view that inflammation and immune response play a key role in cerebrovascular diseases is now widely accepted. Inflammatory cells can lead to the early pathological changes of cerebrovascular diseases, inflammatory effector molecules can drive the progression of these changes, and inflammatory activation can trigger acute ischemic cerebrovascular events. The adhesion of circulating leukocytes to endothelial cells and their migration into arterial walls is an early step in the formation of atherosclerosis, and it is mediated by cell adhesion molecules expressed on the surface of vascular endothelial cells. Adhesion molecules allow monocytes and lymphocytes to roll, adhere tightly, and migrate across the endothelium.
Adhesion molecules play an important role in the occurrence of stroke and in cerebral ischemia-reperfusion (I/R) injury. The adhesion of leukocytes to vascular endothelial cells and their transmembrane migration are mediated by adhesion molecules, which makes adhesion molecules important inflammatory mediators. Platelet activation is also related to adhesion molecules. Cerebral I/R injury is currently considered to be essentially an inflammatory process, in which white blood cells infiltrate brain tissue through interaction with adhesion molecules. Therefore, treatments targeting adhesion molecules are expected to become a new field in the treatment of cerebrovascular diseases [12,13]. In addition, adhesion molecules can reflect whether vascular endothelial cells are damaged, as well as the activation state of leukocytes and platelets, all of which play a role in the occurrence and development of cerebrovascular diseases. In theory, therefore, adhesion molecules can be used as predictors of stroke. The correlation between adhesion molecules and cerebrovascular diseases has been preliminarily confirmed, and the measurement of adhesion molecules may have potential value in preventing and monitoring cardiovascular and cerebrovascular diseases and in guiding clinical treatment.
CD19 is a member of the B-cell-specific immunoglobulin superfamily and is expressed by B cells from heavy chain rearrangement in early B cells through to plasma cell differentiation. CD19 has been reported to enhance the activity of Src family protein tyrosine kinases and mitogen-activated protein kinases, promote cell proliferation, and positively regulate the function of B cells. Its functional diversity and its role in signal transduction make it important for B cell biology, and its abnormal expression can lead to B cell-related diseases [14]. In addition, it can mediate the pathophysiological process of diseases by affecting immune function.
The matrix metalloproteinase (MMP) family is closely related to cerebrovascular diseases. Some studies have shown [15] that there is a relationship between acute ischemic stroke and MMPs: in patients with acute ischemic stroke, the serum level of MMPs is clearly increased. Elevated MMPs can degrade the extracellular matrix, aggravate vasogenic brain edema, and damage neurons. Experiments have found [16] that knocking out MMP genes or adding MMP inhibitors can clearly reduce stroke injury. MMP9 is one of the MMP family members most closely related to cerebrovascular diseases. Once activated, it can degrade the components of the cerebrovascular basement membrane, increase the permeability of the blood-brain barrier, and aggravate cerebral edema. Research has also confirmed that high expression of MMP9 can aggravate the destruction of the blood-brain barrier and promote brain edema, and that brain edema can be relieved by knocking out MMP9 in mice or by using MMP inhibitors [17]. In central nervous system diseases such as inflammation, MMP9 can lead to cerebral hemorrhage and brain edema by disrupting the blood-brain barrier, and it has toxic effects on neurons to varying degrees [18]. Clinical research on patients with cerebral ischemia suggested that the expression level of MMP9 was higher in these patients than in healthy people [19], and that the expression level of MMP9 was also closely related to their prognosis.
Fig. 5 Protein-protein interaction network of the 65 upregulated and 20 downregulated genes, analyzed using Cytoscape software. The edges between 2 nodes represent the gene-gene interactions. The size and color of the node corresponding to each gene were determined according to the degree of interaction. The closer the color of a node is to blue, the higher its connectivity
The expression level of MMP9 is related to hemorrhagic transformation after cerebral ischemia and can predict the likelihood of such transformation. Later observations found that the plasma level of MMP9 was higher in patients with hemorrhagic transformation after cerebral ischemia. After cerebral ischemia, the degranulation of neutrophils releases stored MMP9, which increases the blood MMP9 level. Therefore, MMP9 can be used as a blood marker for predicting hemorrhagic transformation, and a high expression level of MMP9 after cerebral ischemia makes hemorrhagic transformation more likely [20]. Dysregulation of MMP9 gene expression may participate in the pathophysiological process of stroke, and MMP9 may become a new molecular target for the diagnosis and prognosis of stroke patients.
The GO functional enrichment analysis showed that the biological processes involving the DEGs were mainly concentrated in signal transduction, immune response, inflammatory response, and cytokine-receptor interaction, suggesting that the inflammatory response plays an important role in the occurrence and development of IS. Pathological inflammation can cause permanent damage to functional cells and vegetative cells in the brain, and the damage caused by IS can be alleviated by inhibiting inflammation [21]. In a rat model of middle cerebral artery occlusion, electroacupuncture treatment inhibited the pathological inflammatory reaction mediated by the NLRP3 inflammasome through a pathway mediated by the α7 nicotinic acetylcholine receptor, thus alleviating the brain injury induced by IS [22]. The result of the GO functional enrichment analysis was consistent with the results of many clinical studies, which indicates that bioinformatics analysis can provide new ideas for the prevention and clinical treatment of IS [23]. Many animal experiments and clinical studies have shown that the inflammatory response and immune response play an important role in the early stage after stroke and affect the prognosis and treatment of stroke [24-27].
Ten key genes in the pathogenesis of IS were identified using STRING and the CytoHubba plug-in of Cytoscape. Most of these key genes were related to the inflammatory response, which is the core link of IS injury. Studies have shown [28] that abnormal expression of inflammation-related genes is the genetic basis of IS, and targeted therapy against these genes may provide a new direction for the prevention and treatment of IS. The occurrence and development of IS is mediated by multiple genes [29]. The KEGG pathway analysis showed that multiple DEGs participated in the same signaling pathway, the PPI network showed that multiple DEGs were closely related, and the miRNA analysis showed that multiple key genes were predicted targets of the same miRNAs. Recent studies have found that miRNAs are involved in the pathological and physiological processes of many conditions, including tumors, immune system disorders, cell proliferation, cardiovascular diseases, and nervous system diseases [30]. Stroke affects the level of miRNAs in the brain and circulation. MiRNAs play a key role in the pathogenesis of stroke and its complications, and participate in the regulation of key metabolic, inflammatory, and angiogenic processes [31]. Studies have shown that miRNAs play a key role in regulating cell growth, differentiation, progression, and apoptosis, as well as neuron development, hematopoiesis, and the repair and remodeling of injured tissues [32]. MiRNA-146a is a strong pro-apoptotic factor [33]. As an important inflammatory microRNA, miR-146a can be found in both immune cells and circulating cells [34]. Through translational inhibition, miR-146a can act as a negative regulator of inflammation by acting on IRAK1 in the IL-1 signaling pathway [35].
Fig. 6 Protein-protein interaction network for the top 10 hub genes. Node color indicates the number of degrees. The top 10 ranked hub genes are depicted using a pseudocolor scale. Red color stands for the highest degree, and yellow color represents the lowest degree
In addition, a study in a mouse I/R model explored the relationship between miR-146a and myocardial I/R injury by upregulating the expression of miR-146a, and confirmed that miR-146a can protect the myocardium from I/R injury [36]. MiRNAs can mediate the pathological and physiological processes of stroke and its complications, and studies have shown that miRNAs are promising as a treatment for stroke and its complications [37]. Samaraweera et al. [38] found that miR-335 can downregulate the expression level of HAND1 and JAG1, block the growth and differentiation of neurons, and participate in the regulation of neuronal development. Tom et al. [39] found that overexpression of miR-335 can downregulate the expression of AP-1, which is related to cell proliferation, differentiation, and apoptosis. These studies suggest that miR-335 may be an important neuronal regulatory factor, and that the abnormal expression of miR-335 may be closely related to nervous system diseases. Our current results indicate that some miRNAs, including mir-146a, mir-335, mir-27a, and mir-7, may play a key role in the occurrence and development of IS, and the role of these miRNAs in IS needs further investigation. In addition, research on stroke genes and miRNAs is still limited.
Fig. 7 Integrated miRNA-DEGs networks for the top 10 hub genes. Green hexagons represent the 10 hub genes. Red circles represent miRNAs with high connectivity with hub genes, yellow circles represent miRNAs with moderate connectivity with hub genes, and purple circles represent miRNAs with low connectivity with hub genes
Gene regulatory networks play an important role in the pathophysiological process of stroke. The findings of this study may help to better understand the pathogenesis of stroke and to develop effective and novel treatment strategies. However, our research also has some limitations: (1) it only involves the top 10 hub genes; (2) the specific molecular mechanisms by which the hub genes and miRNAs regulate stroke have not been sufficiently characterized; (3) the functions of the hub genes and miRNAs in the constructed network have not been investigated.

Conclusion
Our findings suggest that, compared with the healthy control group, the expression of CEACAM8, CD19, MMP9, ARG1, CKAP4, CCR7, MGAM, CD79B, and CLEC4D in patients with IS was significantly upregulated, which may have an important influence on the pathophysiological mechanism of ischemic stroke. Some potential target miRNAs, such as hsa-mir-146a-5p, hsa-mir-7-5p, hsa-mir-335-5p, and hsa-mir-27a-3p, were also predicted. Identification of these genes and miRNAs may contribute to the development of early diagnostic strategies, prognostic markers, and therapeutic targets for IS. However, experimental research is still necessary to validate the functions of these molecules in IS.