Background

Stroke is the third leading cause of mortality and the primary cause of permanent disability worldwide; 87% of all strokes are ischemic [1]. Ischemic strokes are classified by etiology into cardioembolic, large-vessel, small-vessel lacunar, cryptogenic, and other causes. Cardiogenic embolisms account for ~20% of ischemic strokes each year [2]. Cardioembolic strokes are largely preventable through primary prevention that targets major risk factors for cardioembolic sources, e.g., high blood pressure and hyperlipidemia. Once a cardioembolic stroke occurs, the likelihood of recurrence is relatively high; therefore, secondary prevention is also important. When the cause of a stroke is identified, etiologic classification can guide treatment. Not knowing the etiology of a stroke restricts implementation of optimal therapy and limits stroke research [3]. Several studies have offered evidence of significant genetic involvement in ischemic stroke [4]. We examined whether gene expression features in the blood can distinguish the causes of stroke, and whether these gene expression profiles can predict stroke etiology and outcomes.

Although there are no validated clinical criteria for diagnosing cardioembolic stroke, a diagnosis of cardioembolism can be based on the triad of (1) identification of a potential source of cardiogenic embolisms, (2) exclusion of other potential sources of cerebral ischemia, and (3) consideration of clinical neurologic features. Cardioembolism can be predicted on clinical grounds but is difficult to document [5]. Magnetic resonance imaging (MRI), echocardiography, Holter monitoring, transcranial Doppler, and electrophysiological studies increase the ability to identify the origin of cardioembolisms. In general, cardioembolic strokes have a much worse prognosis and produce more severe and more disabling symptoms than other stroke subtypes. A recurrent embolism occurs in 30%~60% of patients with a history of a previous embolic event [6]. Cardioembolic stroke is a heterogeneous, complex disease resulting from interactions between genetic and environmental risk factors [7]. To understand the contributions of various genetic risk factors to the etiology of stroke, these risk factors must be analyzed and integrated in terms of biological functions and pathways [8]. With advances in affordable, high-throughput technologies, a systems biology study of cardioembolic stroke can shed light on its diagnosis, prognosis, and therapy.

In this study, we compared molecular interaction networks at 3 stages of cardioembolic stroke to reveal its underlying cellular mechanisms. Given the different etiologies and heterogeneous genomic alterations of cardioembolic stroke, a systems biology methodology integrated with omics data is suitable for developing accurate diagnoses, novel therapeutic targets, and efficient targeted therapies. Microarray data were used to build protein-protein interaction (PPI) networks (PPINs) for the 3 stages of cardioembolic stroke. Network structures and protein association abilities at the different stages were compared to obtain a set of significant proteins that can serve as network biomarkers of the progression of cardioembolic stroke. In the future, significant proteins including UBC, CUL3, APP, NEDD8, JUP, SIRT7, etc., may be potent drug targets for first aid and emergency treatment within 24 h post-stroke. The complex behaviors of strokes differ from those of cancer and other complex diseases. We hope that this work can help scientists reveal more hidden cellular mechanisms of stroke etiology and repair processes.

Materials and methods

Overview of the construction process of stroke network marker

We previously used our methods to find the core and specific network markers of 4 different cancers and the evolution of network markers from the early to late stages of bladder cancer [9, 10]. A similar theoretical framework was employed in this study to find the evolution of network biomarkers of stroke at 3 time points, which represent 3 important stages after a stroke has occurred. The systematic method in this paper was developed from a previous study. Figure 1 shows the flowchart for identifying network biomarkers of stroke at the 3 time points. Because this framework has already been applied to various cancers and published elsewhere, we do not repeat it in detail in the main text. We highlight only the key points here and give the detailed description in Additional file 1.

Figure 1

Flowchart of constructing the network marker at 3 time points post-stroke. We integrated microarray data, a gene ontology database, and protein-protein interaction (PPI) information to construct PPI networks (PPINs). These data were used to select the differential protein pool, and the selected proteins and their corresponding microarray data were then used to construct the PPINs by maximum-likelihood estimation and model order detection, resulting in a stroke PPIN (SPPIN) for each of the 3 stages (3, 5, and 24 h post-stroke) and a normal PPIN (NPPIN). The constructed PPINs were used to determine critical proteins of stroke from the difference between the SPPIN and NPPIN matrices. From the difference between these two networks, the stroke relevance value (SRV) was computed for each protein, and significant proteins in the stroke recovery process were determined based on p values of the SRVs. The significant proteins with top SRVs were taken as network markers for the 3 stages of stroke.

First, two data sources were combined to build the networks: microarray gene expression data and protein-protein interaction data. We used them to construct the stroke PPINs (SPPINs) and the normal PPIN (NPPIN). We calculated the stroke relevance value (SRV) for each protein in the network and chose the proteins with the most significant SRVs as network biomarkers. For details, please refer to Additional file 1.

Data sets selection and pre-processing

The stroke microarray dataset GSE58294 [11] and its corresponding platform, GPL570, were obtained from the NCBI GEO [12]. It contains gene expression data following cardioembolic stroke. The dataset contains samples from 23 stroke patients at 3 time points and 23 control samples from non-disease subjects (23 × 4 = 92 samples in total) (Table 1). We built 3 SPPINs, for 3, 5, and 24 h post-stroke, and the NPPIN. We extracted PPI data for Homo sapiens from the Biological General Repository for Interaction Datasets (BioGRID), an online interaction repository compiled through comprehensive curation efforts. These data were used in pruning false-positive PPIs from the PPINs. The PPINs for 3, 5, and 24 h post-stroke (3 SPPINs) and for the normal stage (NPPIN) were then compared mathematically to obtain SRVs and the corresponding network markers (top SRVs). For details, please refer to Additional file 1 [13-15].
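The step of combining the protein pool with the PPI repository can be sketched as follows. This is a minimal illustration, not the authors' pipeline: it assumes the interaction data have already been parsed into pairs of gene symbols, treats interactions as undirected, and uses hypothetical protein names; the function name `candidate_partners` is also hypothetical.

```python
def candidate_partners(protein_pool, ppi_edges):
    """Return, for each pooled protein, its candidate interaction partners.

    protein_pool: iterable of protein names kept after differential screening.
    ppi_edges: iterable of (protein_a, protein_b) pairs from a PPI repository.
    Edges touching proteins outside the pool, and self-loops, are ignored.
    Interactions are treated as undirected (an assumption of this sketch).
    """
    pool = set(protein_pool)
    partners = {protein: set() for protein in pool}
    for a, b in ppi_edges:
        if a in pool and b in pool and a != b:
            partners[a].add(b)
            partners[b].add(a)
    return {protein: sorted(found) for protein, found in partners.items()}


# Hypothetical toy input: three pooled proteins and four repository edges.
pool = ["A", "B", "C"]
edges = [("A", "B"), ("B", "C"), ("A", "D"), ("C", "C")]
print(candidate_partners(pool, edges))
```

The resulting partner lists play the role of the candidate interactions that are later estimated and pruned for each target protein.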

Table 1 Descriptive information on datasets extracted from the GEO database used in this study.

Protein pool selection and the PPINs identification for stroke and normal samples

We collected a pool of differentially expressed proteins to construct the corresponding SPPINs and NPPIN. A one-way analysis of variance (ANOVA) was used to screen for the differential proteins. We used the following protein association model to describe the PPI relationship:

$$x_i(n) = \sum_{j=1}^{M_i} \alpha_{ij}\, x_j(n) + \omega_i(n) \tag{1}$$

where $x_i(n)$ is the expression level of target protein $i$ in sample $n$ (stroke or normal); $x_j(n)$ is the expression level of the $j$-th protein interacting with target protein $i$ in sample $n$; $\alpha_{ij}$ denotes the association ability (combination strength) between the $i$-th target protein and its $j$-th interacting protein; $M_i$ is the number of proteins interacting with the $i$-th target protein; and $\omega_i(n)$ is stochastic noise caused by other factors in the biological system or by uncertainty of our model.

The second step is to use the maximum-likelihood (ML) estimation method [16] to determine associated parameters (combination strength) in (1) by the microarray expression data as follows (see Additional file 2):

$$x_i(n) = \sum_{j=1}^{M_i} \hat{\alpha}_{ij}\, x_j(n) \tag{2}$$

where $\hat{\alpha}_{ij}$ was determined from the microarray expression data by the ML estimation method.
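As an illustration of how the association parameters in (1) could be estimated, the sketch below assumes that the noise term is independent and Gaussian, in which case the ML estimate coincides with the ordinary least-squares solution of the normal equations. This is an assumption made for the example and not necessarily the derivation given in Additional file 2; the function names and the toy expression values are hypothetical.

```python
def solve_linear_system(matrix, rhs):
    """Solve matrix * x = rhs by Gaussian elimination with partial pivoting."""
    size = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(size):
        pivot = max(range(col, size), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("normal equations are singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, size):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, size + 1):
                aug[r][c] -= factor * aug[col][c]
    solution = [0.0] * size
    for r in range(size - 1, -1, -1):
        tail = sum(aug[r][c] * solution[c] for c in range(r + 1, size))
        solution[r] = (aug[r][size] - tail) / aug[r][r]
    return solution


def estimate_association(target, partners):
    """Least-squares estimate of the alpha_ij for one target protein.

    target: expression levels x_i(n) of the target protein over N samples.
    partners: one list per interacting protein, each holding x_j(n) over
              the same N samples.
    """
    n_samples = len(target)
    n_partners = len(partners)
    # Normal equations: (X^T X) alpha = X^T x_i
    gram = [[sum(partners[j][s] * partners[k][s] for s in range(n_samples))
             for k in range(n_partners)] for j in range(n_partners)]
    moment = [sum(partners[j][s] * target[s] for s in range(n_samples))
              for j in range(n_partners)]
    return solve_linear_system(gram, moment)


# Hypothetical noise-free toy data generated with alpha = (2.0, -0.5).
partner_levels = [[1.0, 0.0, 2.0, 1.0], [0.0, 1.0, 1.0, 3.0]]
target_levels = [2.0, -0.5, 3.5, 0.5]
print(estimate_association(target_levels, partner_levels))
```

With real microarray data, the number of samples per group (23 here) limits how many partners can be estimated reliably for one target, which is one motivation for the model order selection step.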

Finally, for model order selection and to determine the significant protein interactions among the $\hat{\alpha}_{ij}$, we used the Akaike information criterion (AIC) [16] and Student's t-test [17] (see Additional file 3). Please refer to Additional file 1 for details.
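The role of the AIC in choosing how many interactions to keep can be sketched as below. The sketch uses the common least-squares form AIC = N ln(RSS/N) + 2k, which is an assumption here (the exact form used by the authors is given in the additional files), and the residual sums of squares are hypothetical numbers.

```python
import math


def aic(rss, n_samples, n_params):
    """AIC for a least-squares fit: N * ln(RSS / N) + 2 * k."""
    return n_samples * math.log(rss / n_samples) + 2 * n_params


def best_order(rss_by_order, n_samples):
    """Return the model order with the smallest AIC and all AIC scores.

    rss_by_order: dict mapping the number of retained interactions k to
                  the residual sum of squares of the fitted model.
    """
    scores = {k: aic(rss, n_samples, k) for k, rss in rss_by_order.items()}
    return min(scores, key=scores.get), scores


# Hypothetical fits for one target protein over 23 samples.
order, scores = best_order({1: 40.0, 2: 10.0, 3: 9.8}, n_samples=23)
print(order, scores)
```

In this toy case the third interaction barely lowers the residual error, so the penalty term makes the two-interaction model preferable; interactions that survive this step can then be tested individually, e.g., with a t-test on each coefficient.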

Determination of the network structures and their corresponding significant proteins at 3, 5, and 24 h post-stroke and normal stage

After pruning away the spurious false-positive PPIs, only significant PPIs remain:

$$x_i(n) = \sum_{j=1}^{M_i'} \hat{\alpha}_{ij}\, x_j(n), \quad i = 1, 2, \ldots, M \tag{3}$$

where $M_i' \le M_i$ is the number of significant PPIs with the $i$-th target protein in the total PPIN. The refined PPIN is:

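$$X(n) = A\,X(n) + w(n) \tag{4}$$

where

$$X(n) = \begin{bmatrix} x_1(n) \\ x_2(n) \\ \vdots \\ x_M(n) \end{bmatrix}, \quad A = \begin{bmatrix} \hat{\alpha}_{11} & \cdots & \hat{\alpha}_{1M} \\ \vdots & \ddots & \vdots \\ \hat{\alpha}_{M1} & \cdots & \hat{\alpha}_{MM} \end{bmatrix}, \quad \text{and} \quad w(n) = \begin{bmatrix} w_1'(n) \\ w_2'(n) \\ \vdots \\ w_M'(n) \end{bmatrix}$$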
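How the per-protein estimates fill the interaction matrix A can be sketched as follows. The sketch assumes that pruned (non-significant) interactions are simply stored as zeros; the protein names, the coefficient values, and the function name `build_interaction_matrix` are hypothetical.

```python
def build_interaction_matrix(proteins, estimates):
    """Assemble the M x M matrix A from per-target association estimates.

    proteins: ordered list of the M protein names in the refined PPIN.
    estimates: dict mapping each target protein to a dict of
               {partner: alpha_hat} for its significant interactions.
    Entry (i, j) holds alpha_hat_ij; pruned interactions stay at 0.0.
    """
    index = {name: position for position, name in enumerate(proteins)}
    size = len(proteins)
    matrix = [[0.0] * size for _ in range(size)]
    for target, partners in estimates.items():
        for partner, alpha_hat in partners.items():
            matrix[index[target]][index[partner]] = alpha_hat
    return matrix


# Hypothetical estimates for a three-protein network.
names = ["P1", "P2", "P3"]
kept = {"P1": {"P2": 0.8}, "P3": {"P1": -0.2, "P2": 0.5}}
for row in build_interaction_matrix(names, kept):
    print(row)
```

Using one fixed protein ordering for every network keeps the stroke and normal matrices aligned entry by entry, which is what allows them to be compared directly.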

The interaction matrix A of refined PPINs in equation (4) for 3, 5, and 24 h post-stroke and normal cells was constructed, respectively, as follows:

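$$A_{S_k} = \begin{bmatrix} \hat{\alpha}_{11,S_k} & \cdots & \hat{\alpha}_{1M,S_k} \\ \vdots & \ddots & \vdots \\ \hat{\alpha}_{M1,S_k} & \cdots & \hat{\alpha}_{MM,S_k} \end{bmatrix}, \quad \text{and} \quad A_N = \begin{bmatrix} \hat{\alpha}_{11,N} & \cdots & \hat{\alpha}_{1M,N} \\ \vdots & \ddots & \vdots \\ \hat{\alpha}_{M1,N} & \cdots & \hat{\alpha}_{MM,N} \end{bmatrix} \tag{5}$$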

where k = 3, 5, and 24 h post-stroke; $A_{S_k}$ and $A_N$ are the interaction matrices of the refined SPPIN at k h post-stroke and of the refined NPPIN, respectively; and M denotes the number of proteins in the refined PPIN. The protein association (combination strength) models for the SPPINs at 3, 5, and 24 h post-stroke and for the NPPIN are:

$$x_{S_k}(n) = A_{S_k}\, x_{S_k}(n), \qquad x_N(n) = A_N\, x_N(n) \tag{6}$$

where k = 3, 5, and 24 h post-stroke, and $x_{S_k}(n) = [x_{1S_k}\; x_{2S_k}\; \cdots\; x_{MS_k}]^T$ and $x_N(n) = [x_{1N}\; x_{2N}\; \cdots\; x_{MN}]^T$ are vectors of protein expression levels.

We defined the difference matrix $D^k = A_{S_k} - A_N$ of the differential PPIN (DPPIN) between each SPPIN and the NPPIN as follows:

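$$D^k = \begin{bmatrix} d_{11}^k & \cdots & d_{1M}^k \\ \vdots & \ddots & \vdots \\ d_{M1}^k & \cdots & d_{MM}^k \end{bmatrix} = \begin{bmatrix} \hat{\alpha}_{11,S_k} - \hat{\alpha}_{11,N} & \cdots & \hat{\alpha}_{1M,S_k} - \hat{\alpha}_{1M,N} \\ \vdots & \ddots & \vdots \\ \hat{\alpha}_{M1,S_k} - \hat{\alpha}_{M1,N} & \cdots & \hat{\alpha}_{MM,S_k} - \hat{\alpha}_{MM,N} \end{bmatrix} \tag{7}$$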

where k = 3, 5, and 24 h post-stroke; $d_{ij}^k$ is the difference in protein association ability (combination strength) between the SPPIN at k h post-stroke and the NPPIN; and the matrix $D^k$ is the difference in network structure between the SPPIN at k h post-stroke and the NPPIN.

Then we defined a stroke relevance value (SRV) to show the difference summation of SPPIN and NPPIN as follows [13]:

$$\mathrm{SRV}^k = \begin{bmatrix} \mathrm{SRV}_1^k \\ \vdots \\ \mathrm{SRV}_i^k \\ \vdots \\ \mathrm{SRV}_M^k \end{bmatrix} \tag{8}$$

where $\mathrm{SRV}_i^k = \sum_{j=1}^{M} d_{ij}^k$, and k = 3, 5, and 24 h post-stroke. For details, please refer to Additional file 1.
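The difference matrix and the SRV in (7) and (8) can be sketched in a few lines. The matrices below are hypothetical three-protein examples. The default row sum follows the formula as printed in this text; the `use_abs` option, which sums absolute differences so that positive and negative changes do not cancel, is an assumption added for illustration and is not taken from the text.

```python
def difference_matrix(a_stroke, a_normal):
    """Entry-wise difference D = A_S - A_N of two aligned M x M matrices."""
    if len(a_stroke) != len(a_normal):
        raise ValueError("matrices must have the same number of rows")
    return [[s - n for s, n in zip(row_s, row_n)]
            for row_s, row_n in zip(a_stroke, a_normal)]


def stroke_relevance_values(diff, use_abs=False):
    """Row-wise summary of D: one relevance value per protein.

    use_abs=False sums d_ij as written; use_abs=True sums |d_ij|
    (an assumed variant, labeled as such).
    """
    if use_abs:
        return [sum(abs(d) for d in row) for row in diff]
    return [sum(row) for row in diff]


# Hypothetical interaction matrices for one time point and for normal samples.
a_s = [[0.0, 0.8, -0.2], [0.8, 0.0, 0.5], [-0.2, 0.5, 0.0]]
a_n = [[0.0, 0.3, 0.1], [0.3, 0.0, 0.5], [0.1, 0.5, 0.0]]
d = difference_matrix(a_s, a_n)
print(stroke_relevance_values(d))
print(stroke_relevance_values(d, use_abs=True))
```

Running the same two functions once per time point, against the same normal matrix, gives one SRV vector for each of 3, 5, and 24 h.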

Pathway analysis by many on-line freeware and powerful commercial software

We mapped the identified network biomarkers to several free online pathway analysis tools, such as KEGG (Kyoto Encyclopedia of Genes and Genomes) [18], NOA (network ontology analysis) [19, 20], and the DAVID bioinformatics database [21, 22]. These tools help to investigate critical pathways related to the network markers and to explore the relationships between these pathways and stroke. They also describe the associated biological processes, cellular components, and molecular functions, and help interpret the pathways involved in stroke etiology and repair processes. To complement these results, we used the well-known commercial software packages Ingenuity® Pathway Analysis (IPA) and Metacore for multiple functional and pathway analyses. IPA® is from QIAGEN (Redwood City, CA, http://www.qiagen.com/ingenuity). MetaCore™ is an integrated software suite from GeneGo for functional analysis of microarray, metabolic, SAGE, proteomics, siRNA, microRNA, and screening data. Please refer to Additional file 1 for details.

Results and discussion

Evolution of network biomarkers at 3 post-stroke time points

We built DPPINs for the 3 post-stroke time points (3, 5, and 24 h) (Figure 2). The SRVs of each protein in the 3 PPINs were calculated. One can find more information than SRVs in this figure, such as the edges and nodes of these PPINs. Screened by the p value of the SRV, we found significant proteins of network markers for these 3 stroke stages. Similar to our previous experience with bladder cancer [10], we wanted to reveal the repair mechanism of stroke at these 3 time points.

Figure 2

The constructed differential protein-protein interaction (PPI) networks (PPINs; DPPINs) for 3 time points post-stroke. The DPPINs are shown with edge and node information for 3 time points after a stroke occurred. Each DPPIN is the difference between the stroke PPIN (SPPIN) and the normal PPIN (NPPIN). The node size represents the stroke relevance value (SRV) of each protein, and the edge width is proportional to the link ability between the 2 proteins. Red and blue edges respectively indicate positive and negative values of d ij in (7). Besides UBC, we see at 3 h that CUL3, ATXN2L, TTN, and NRF2 dominate the network. At 5 h, APP, CUL3, NEDD8, EVAL1, TCO, PAN, and JUP dominate the network. At 24 h, CUL3 and APP dominate the network. We suggest that readers examine these figures together with Table 2. The SRV and PPI information is important for developing new therapeutic methods for stroke recovery. The figures were created using Cytoscape.

Network markers at the 3 time points

After p value (≤0.01) screening, we found that there were 5, 9, and 4 significant proteins at 3, 5, and 24 h post-stroke, respectively (Table 2). Their corresponding SRVs ranged 1.7~6.1, 2.1~11.7, and 1.7~26, respectively. These significant top-SRV proteins and their corresponding PPIs were used to construct network markers at the 3 post-stroke time points. We found that the SRVs of stroke were much smaller than the SRVs in our previous cancer results [9, 10], and that the cancer networks were much more complex than the stroke networks. To compare against the overall stroke process, we also combined the samples at the 3 time points into one set (69 samples) and used it with the normal data to build a total DPPIN. This is not the main topic of this research, so we only include the total DPPIN in the Metacore results. We do not discuss UBC in this paper, because it is a separate, complex problem: it is a housekeeping gene involved in many different kinds of diseases. We will extend our research to this target in the future.
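The screening step described above (keep proteins whose SRV has p ≤ 0.01, then rank by SRV) can be sketched as follows. How the p values themselves are obtained is described in the additional files and is not reproduced here; the protein names, SRVs, and p values in this example are hypothetical, as is the function name.

```python
def select_network_markers(srv, p_values, alpha=0.01):
    """Keep proteins with p <= alpha and rank them by descending SRV.

    srv: dict mapping protein name to its stroke relevance value.
    p_values: dict mapping protein name to the p value of that SRV.
    Returns a list of (protein, SRV) pairs, largest SRV first.
    """
    selected = [(name, value) for name, value in srv.items()
                if p_values[name] <= alpha]
    return sorted(selected, key=lambda item: item[1], reverse=True)


# Hypothetical SRVs and p values for three proteins at one time point.
srv_example = {"P1": 6.1, "P2": 1.7, "P3": 3.0}
p_example = {"P1": 0.001, "P2": 0.01, "P3": 0.2}
print(select_network_markers(srv_example, p_example))
```

Note that a protein with a moderately large SRV can still be excluded when its p value exceeds the threshold, as P3 is in this toy input.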

Table 2 Top proteins at 3 time points post-stroke.

Pathway analysis of network biomarkers at 3 h post-stroke

After SRV screening with our systems biology approach, the complete and complex functional and pathway analyses fundamentally revealed the evolutionary process of repair mechanisms of stroke. Because the number of significant proteins was very small compared to results for cancers, the KEGG results could not give us as much information as in cancer cases.

The IPA gave us the clearest information on the disease, so we show the IPA results first (Table 3), followed by additional information from NOA (Table 4). From Figure 3, one can see that the 2 key modules, Tx_Cardiac-Hypertrophy and ML_Cardiovascular-Disease, were related to our significant proteins (Figure 2(A)). We found that CUL3 appeared at all 3 stages, which implies that this time-stationary network marker may be a significant target for therapy. CUL3 is a key hub of the network, and its functions and behaviors are very complex. Salinas et al. discussed how actinfilin acts as a CUL3 substrate adaptor, linking GluR6 kainate receptor subunits to the ubiquitin-proteasome pathway, and noted that kainate receptors are implicated in excitotoxic neuronal death induced by stroke [23]. We list the disease functional analyses in Additional file 3. Results of Metacore are shown in Figures 7 to 14 for 3, 5, and 24 h and for the total (the sum of all samples).

Table 3 Functional analyses of the network biomarker at 3 h post-stroke.
Table 4 Pathway analysis and gene set enrichment analysis of 5 proteins at 3 h post-stroke on (1) biological processes, (2) cellular components and (3) molecular functions by NOA.
Figure 3

IPA results at 3 h post-stroke. Please refer to the legends of Figures 5 and 6.

Pathway analysis of network biomarkers at 5 h post-stroke

IPA results (Figure 4) show that 5 modules, ML_Cardiovascular-Disease, ML_Cell-Death-Brain, Tx_Increases-Heart-Failure, Tx_Cardiac Necrosis/Cell Death, and BM_Unspecified-Application/Acute-Coronary Syndrome, were related to our significant proteins (Figure 2(B)). We found that caspase was related to 4 modules. Aries et al. discussed caspase-1 cleavage of transcription factor GATA4 and regulation of cardiac cell fate. They showed that GATA4 is cleaved by caspase-1 in cardiomyocytes, and their data identified a target for caspase-1 in nuclei and a pathway to explain its related cardiac actions [24]. The amyloid precursor protein (APP) is part of a binding-protein-dependent transport system. It is probably responsible for translocation of substrate across membranes, and it belongs to the permease family of the binding-protein-dependent transport system. It is also known as the β-amyloid (Aβ) precursor protein. From [25], we know that APP is a key gene related to Alzheimer disease (AD), which suggests a relationship between neurodegenerative diseases and stroke. Many studies have discussed this gene [26-29]. It could be an efficient therapy target at this time point. We list the disease functional analyses in Additional file 3. IPA results are shown in Table 5. NOA results are shown in Table 6.

Figure 4

IPA results at 5 h post-stroke. Please refer to the legends of Figures 5 and 6.

Table 5 Functional analyses of the network biomarker at 5 h post-stroke.
Table 6 Pathway analysis and gene set enrichment analysis of 9 proteins at 5 h post-stroke on (1) biological processes, (2) cellular components and (3) molecular functions by NOA.

Pathway analysis of network biomarkers at 24 h post-stroke

IPA results (Figure 5; Figure 6 shows the detailed legend of IPA for Figures 3, 4, and 5) show that 6 modules, ML_Cell-Cycle-Brain, ML_Cell-Death-Brain, Tx_Cardiac-Necrosis/Cell Death, Tx_Cardiac-Fibrosis, Tx_Cardiac-Hypertrophy, and ML_Cardiovascular-Disease, were related to our 4 significant proteins (Figure 2(C)). Another key protein, SIRT7, was found at this time point. We found that SIRT7 was related to 4 modules. Vakhrusheva et al. discussed how "SIRT7 increases stress resistance of cardiomyocytes and prevents apoptosis and inflammatory cardiomyopathy in mice." SIRT7 is a member of the mammalian sirtuin family, which consists of 7 genes, SIRT1~7. Its deficiency can cause the development of heart hypertrophy and inflammatory cardiomyopathy [30]. SIRT7 was found to be highly associated with ischemic stroke in our analytical results. Previous studies showed the roles of sirtuins in cell death, and increasing evidence suggests that sirtuins play fundamental roles in a variety of biological processes, including cell death, inflammation, and energy metabolism. We list the disease functional analyses in Additional file 3. IPA results are shown in Table 7. NOA results are shown in Table 8. Results of Metacore are shown in Figures 7 to 14 for 3, 5, and 24 h and for the total (the sum of all samples).

Figure 5

IPA results at 24 h post-stroke. By the IPA analysis, one can see that the 3 network markers are related to different modules at 3 different time points (3, 5, and 24 h) post-stroke. It is easy to see the evolutionary process of network biomarkers. From the detailed legend in Figure 6, one can see different regulatory mechanisms at these 3 time points of stroke. This abundant information can offer experts various novel strategies to develop stroke therapies or recovery methods. The experts can decide to inhibit or activate key proteins in these networks. And experts can refer to a patient's medical history to decide the therapeutic strategy. We analyzed the stroke relevance value (SRV) results by IPA software, and it gave us more clues to uncover hidden mechanisms of stroke. We consider this inspired pioneering work, and in the future, experts need to design new therapies or recovery strategies for validation.

Figure 6

The detailed legend of IPA in Figures 3 to 5.

Figure 7

Pathway maps of Metacore. Sorting is done for the 'Statistically significant Maps'. Canonical pathway maps represent a set of signaling and metabolic maps covering human in a comprehensive way. All maps are created by Thomson Reuters scientists by a high-quality manual curation process based on published peer-reviewed literature. [The above paragraph is directly cited from the Metacore results.] Figures 7-14 are serial maps generated by Metacore that should give experts more choices and strategies to attack the core network post-stroke. Figures 7-12 show pathway maps for the 3 time points of stroke. Figure 13 shows the process networks; cell cycle G2-M, G1-S, and meiosis are the top 3 process networks. They give experts actual targets for developing novel strategies. Figure 14 shows our network markers related to statistically significant diseases.

Table 7 Functional analyses of the network biomarker at 24 h post-stroke.
Table 8 The pathway analysis and gene set enrichment analysis of 4 proteins at 24 h post-stroke on (1) biological processes, (2) cellular components and (3) molecular functions by NOA.

Network biomarkers and the evolution of network biomarkers of stroke etiology and repair processes

Our stroke PPI model was constructed from differential expressions of stroke and normal microarray data and data mining of PPI information from the BioGRID database. So, the 3 SPPINs and NPPIN were the results of our systems biology model using the original microarray data and PPI databases. There are 3 key factors which affected the final results.

(i) The effect of different microarray data: We know that microarray data have the drawback of being irreproducible. That means even in the same case, microarray data might not produce the same results as previous ones. Also, for the same diseases, patients of different ethnicities, different ages, or different genders will produce different microarray data. This is the first factor that affected the final results.

(ii) The effect of different original PPI databases: We know that PPI databases, such as BioGRID and MIPS, are constructed from putative information and then validated by wet-lab experiments. Due to advances in many high-throughput experimental skills, the original PPI databases have evolved with time. Newly updated original PPI databases were the second factor that affected the final results.

(iii) The effect of the systems biology model: Our mathematical model, combined with many biological databases, is a novel one that we have successfully applied in research on various cancers [9, 10]. We used the AIC and Student's t-test to construct the DPPINs between the SPPINs and the NPPIN and to obtain the SRVs for the three time points post-stroke. For the significance and novelty of our model, please refer to our previous work [9]. In this research, we validated our results through a literature survey. In the future, our results should be validated by other researchers' wet-lab experiments, and we will continue to refine our mathematical model. This is the third key factor that affected the results. Although not directly, it also had an influence on the protein interaction networks.

We also know that bio-systems evolve with time. It is obvious that different-stage patients have very different symptoms; these are key features for us to classify stroke stages. Since patients of different stages have greatly different symptoms, there is no doubt that the microarray data of these stage patients will be quite different. As described above, protein expressions from microarray data are one of the key factors of our systems biology model used to produce the final SPPINs and NPPIN. And the SPPINs and NPPIN yielded the final network biomarkers from our systems biology method. So, the most important thing for the evolution of network biomarkers is the evolution of microarray data at different stroke stages, which is inherent in the exhibition of stroke-related genes due to DNA mutations in the stroke process. The main purpose of this research was to discuss the network evolutionary process of stroke at 3 time points, and we hope it can provide clues for therapy and medical recovery processes. We found that CUL3 appeared at all 3 time points, and may be a target we should pay more attention to. At the second time point of 5 h, we found that the APP and caspase both played significant roles. At the last time point of 24 h, we found another important one, SIRT7. A lot of research has discussed these key proteins (Table 2).

Results in Figure 13 show that stroke-associated biomarker genes at the different time points were significantly involved in cell cycle processes, including G2-M, G1-S, and meiosis. Both in vitro and in vivo evidence for the involvement of cell cycle elements in stroke was reported in a previous study [31]. The activity levels of key regulators of the cell cycle are downregulated in differentiated neurons, and there is increasing evidence that activation of cell cycle machinery leads to death of neurons following stroke insults [32, 33]. Our findings also show the involvement of multiple cell cycle-regulatory signals in ischemic injury, and this may contribute to our current understanding of the etiology of stroke [34].

Figure 13

Process Networks. Sorting is done for the 'Statistically significant Networks'. The content of these cellular and molecular processes is defined and annotated by Thomson Reuters scientists. Each process represents a pre-set network of protein interactions characteristic for the process. [The above paragraph is directly cited from the Metacore results.]

Figure 14

Diseases (by Biomarkers). Sorting is done for the 'Statistically significant Diseases'. Disease folders are organized into a hierarchical tree. Gene content may vary greatly between such complex diseases as cancers and some Mendelian diseases. Also, coverage of different diseases in literature is skewed. These two factors may affect p-value prioritization for diseases. [The above paragraph is directly cited from the Metacore results.]

Comparison with our previous results of traumatic brain injury in Danio rerio

We compared the results with our previous study, "On the Crucial Cerebellar Wound Healing-Related Pathways and Their Cross-Talks after Traumatic Brain Injury in Danio rerio [35]". We found that there were no intersections between these 2 results. To discuss core and specific network biomarkers of cardiac and brain injury between humans and other species is important work, and we will extend this work in the future. It is difficult to obtain datasets for stroke patients. The original reason we wanted to compare the results with traumatic brain injury in D. rerio was to determine if any intersection existed between these 2 results. Then maybe it would be possible to use D. rerio as a model organism to model human stroke. However, we found nothing at this stage, and we will try to develop other methods to model human stroke.

Summary of results and discussion

With the help of high-throughput data and our systems biology model, we determined totally different network structures and biomarkers at 3 significant time points. Besides the original results of our model (the SRVs and network structures), we offer an extensive pathway analysis using various commercial software packages and free web servers. The entire work should be valuable for experts (doctors and researchers) in developing novel strategies of recovery, therapy, and prevention for stroke patients. For example, if you are only interested in SRVs, you can refer to Table 2 to choose the top-SRV proteins as drug targets. If you want to separate the PPIN by multiple drug targets, you can refer to Figure 1 to focus on elements of the network and select some of them to be drug targets. If you want to break down the network by destroying the regulatory relationships, you can refer to Figures 3 to 5, the IPA results, to choose some regulatory elements as drug targets. If you want to break down the network by the complex modules given by Metacore, you can refer to Figures 7 to 14. You can combine your medical knowledge with the complex modules to develop novel strategies. Additionally, the diseases and functional annotations given by IPA are shown in Additional file 4. We also extended our research to examine relationships between the significant genes determined by our models and many other diseases. This can give clues for new clinical applications of existing drugs.

Figure 8

Development Hedgehog Signaling, which is the top-scored pathway map in the MetaCore enrichment analysis results. The family of proteins called Hedgehog controls and patterns various aspects of the vertebrate body plan, such as survival and cell growth. Ubiquitin was down-regulated while Cullin 3 and the Cul3/SPOP/Rbx 1 E3 ligase complex were up-regulated in stroke samples at 3, 5, and 24 hours and in overall stroke samples as compared with control. ITCH was up-regulated in overall stroke samples. Figure 8-12: *Experimental data from all files is linked to and visualized on the maps as thermometer-like figures. Up-ward thermometers have red color and indicate up-regulated signals and down-ward (blue) ones indicate down-regulated expression levels of the genes. [The above paragraph is directly cited from the Metacore results.]

Figure 9

Development WNT signaling pathway Part 1. Degradation of beta catenin, which is the second-scored pathway map in the MetaCore enrichment analysis results. Ubiquitin was down-regulated in stroke samples at 3, 5, and 24 hours and in overall stroke samples as compared with control. HDAC1 was down-regulated in overall stroke samples.

Figure 10

Cell cycle Role of SCF complex in cell cycle regulation, which is the third-scored pathway map in the MetaCore enrichment analysis results. The Skp, Cullin, F-box containing complex (SCF complex) plays critical roles in the ubiquitination of proteins involved in cell cycle regulation. Ubiquitin was down-regulated in stroke samples at 3, 5, and 24 hours and in overall stroke samples as compared with control. NEDD8 was up-regulated in stroke samples at 5 hours and in overall stroke samples.

Figure 11

Apoptosis and survival NGF activation of NF-kB, which is the fourth-scored pathway map in the MetaCore enrichment analysis results. Nerve growth factor (NGF) is involved in neuron survival and differentiation, and the NF-kB signal generated by the tyrosine kinase receptor (TrkA) and the tumor necrosis factor receptor (NGFR) exerts neuroprotective effects. Ubiquitin was down-regulated in stroke samples at 3, 5, and 24 hours and in overall stroke samples as compared with control. GAB1 was up-regulated in overall stroke samples.

Figure 12

LRRK2 in neurons in Parkinson's disease, which is the fifth-scored pathway map in the MetaCore enrichment analysis results. Mutations in LRRK2 (R1441C, R1441G, R1441H, Y1699C, I2020T and G2019S) are the most common genetic cause of Parkinson's disease, and LRRK2 stimulates various pathways leading to progression of Parkinson's disease. Ubiquitin was down-regulated in stroke samples at 3, 5, and 24 hours and in overall stroke samples as compared with control. LRRK2 was up-regulated in overall stroke samples.

Conclusions

Stroke is a complex disease, and its complex cellular behaviors differ from those of cancers. We found much research that focused on cancer systems biology, but far less on stroke systems biology. Our systems biology method, applied to cancers, helped us successfully identify network biomarkers. This is our first attempt to apply a similar systematic framework to the stroke process. We focused on a systematic analysis of 3 key post-stroke time points, and our findings showed that stroke-associated biomarker genes at the different time points were significantly involved in cell cycle processes, including G2-M, G1-S, and meiosis, which contributes to our current understanding of the etiology of stroke. We identified a significant PPIN and the corresponding network biomarkers for each of the 3 time points. We hope this work helps scientists reveal more hidden cellular mechanisms of stroke etiology and recovery processes. In future work, we will try to integrate more samples and more critical time points, and to design new model-organism methods, in order to probe these mechanisms and processes more deeply.