Atrial fibrillation (AF) is one of the most prevalent sustained arrhythmias, with an age-adjusted hospitalization incidence of 1–4% in the general population and a prevalence rising to > 13% in those older than 80 years of age [1, 2]. However, epidemiological data may understate its actual prevalence, because 40% of patients are asymptomatic and remain undiagnosed with subclinical AF [3]. There is also evidence that patients with AF have significantly increased cardiovascular morbidity, given the association of AF with atrial and ventricular mechanical or electrical failure, structural and hemodynamic alterations, and thromboembolic events [3].

Stroke is a leading cause of disability and death, with an estimated incidence of 3.73 (95% CI 3.51–3.96) per 1000 person-years among black and white adults in the Atherosclerosis Risk in Communities (ARIC) cohort [4]. Furthermore, stroke prevalence and stroke-related disability and mortality are expected to increase globally as populations age [5, 6]. The true burden of stroke may also be underestimated, because brain imaging has limited ability to identify small (< 10 mm) hypointense areas and silent infarctions, which are reported in 28% of patients older than 65 years of age [7]. AF is commonly classified as paroxysmal, persistent, permanent, or new-onset according to episode duration: paroxysmal AF self-terminates within 7 days, whereas persistent AF lasts longer than 7 days or requires cardioversion, and has usually lasted for 3 months [8]. AF is considered a major cause of ischemic stroke owing to irregular heart rhythm, coexisting chronic vascular inflammation, renal insufficiency, and blood stasis. In an analysis of the Rivaroxaban Once Daily Oral Direct Factor Xa Inhibition Compared With Vitamin K Antagonism for Prevention of Stroke and Embolism Trial in Atrial Fibrillation (ROCKET-AF), Steinberg et al. [9] reported that, after adjustment, patients with paroxysmal AF had lower rates of stroke or systemic embolism (adjusted HR: 0.78, 95% CI 0.61–0.99, P = 0.045), all-cause mortality (adjusted HR: 0.79, 95% CI 0.67–0.94, P = 0.006), and the composite of stroke or systemic embolism or death (adjusted HR: 0.82, 95% CI 0.71–0.94, P = 0.005) than patients with persistent AF. According to the Oxford Vascular Study (OXVASC), 43.9% of ischemic strokes were associated with AF among patients 80 years of age or older, in whom AF had increased threefold over the past 3 decades [10].
However, this assumption has been challenged by the Atrial Fibrillation Reduction Atrial Pacing Trial (ASSERT), which examined the temporal association between subclinical AF and stroke risk among patients with implantable pacemakers and defibrillators. Subclinical AF was detected before and after stroke or systemic embolism in only 8% and 16% of patients, respectively, within months of the event [11]. Of note, AF is often intermittent and asymptomatic, and can present as atrial electromechanical dissociation. Clinically, current stroke risk scores and conventional electrocardiographic diagnosis are practical, but they predict stroke risk in individual AF patients with limited accuracy, especially in persistent AF, which carries a higher risk of stroke or systemic embolism and all-cause mortality [12]. In this study, we identified co-expressed differentially expressed genes (co-DEGs) of persistent AF and stroke and examined the molecular mechanisms and pathology of AF-related DEGs (AF-DEGs) and stroke-related DEGs (stroke-DEGs). Finally, we provide a bioinformatic analysis of DEGs and predicted microRNAs (miRNAs) for AF patients prone to stroke.


Materials and methods

The GSE79768 and GSE58294 datasets were downloaded from GEO [13]; expression profiles were generated on the GPL570 (HG-U133_Plus_2) Affymetrix Human Genome U133 Plus 2.0 Array (Affymetrix, Santa Clara, CA). The GSE79768 dataset comprises 26 specimens of paired left atrial (LA) and right atrial (RA) tissue obtained from 13 patients. It was used to identify differential LA-to-RA gene expression and molecular mechanisms in patients with persistent AF or sinus rhythm (SR), and we describe potential mechanisms of AF-related remodeling in the LA and the relationship between LA arrhythmogenesis and thrombogenesis. In this dataset, persistent AF had lasted continuously for > 6 months, whereas SR patients had no clinical evidence of AF and no history of anti-arrhythmic drug use. Blood samples in GSE58294 were collected from cardioembolic stroke patients (N = 69) and control patients (N = 23) at < 3, 5, and 24 h.

Data processing

The R packages “affy”, “affyPLM”, and “limma”, provided by the Bioconductor project [14], were applied to assess the GSE79768 and GSE58294 RAW datasets. Preprocessing comprised background correction, quantile normalization, probe summarization, and log2 transformation, using the robust multi-array average (RMA) method and log-transformed perfect match (PM) and mismatch (MM) probe methods. The Benjamini-Hochberg false discovery rate (FDR) procedure was used to adjust the original p-values, and fold changes (FC) were calculated for each gene. Genes with |log2 FC| > 1 and adjusted p < 0.05 were retained as AF-DEGs. A stricter threshold of |log2 FC| > 1.5 and adjusted p < 0.05 was used to identify stroke-DEGs, because the blood samples yielded many candidate genes. Additionally, Venn diagrams were drawn to identify the co-DEGs shared by AF- and stroke-DEGs.
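The filtering step described above can be sketched in a few lines. This is a minimal Python illustration, not the R/limma pipeline used in the study: it assumes per-gene log2 fold changes and raw p-values are already available, and the function names (`benjamini_hochberg`, `filter_degs`) and the toy values are hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])  # indices by ascending p
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest so that adjusted
    # values never increase as the raw p-value decreases.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def filter_degs(genes, log2fc, pvalues, fc_cutoff, alpha=0.05):
    """Keep genes with |log2 FC| > fc_cutoff and adjusted p < alpha."""
    adj = benjamini_hochberg(pvalues)
    return {g for g, fc, q in zip(genes, log2fc, adj)
            if abs(fc) > fc_cutoff and q < alpha}


# Toy, made-up values for illustration only (not data from the study).
genes = ["G1", "G2", "G3", "G4"]
log2fc = [1.8, -1.2, 0.4, -2.1]
pvals = [0.001, 0.02, 0.03, 0.5]

loose = filter_degs(genes, log2fc, pvals, fc_cutoff=1.0)   # AF-style threshold
strict = filter_degs(genes, log2fc, pvals, fc_cutoff=1.5)  # stroke-style threshold
print(sorted(loose))           # ['G1', 'G2']
print(sorted(strict))          # ['G1']
print(sorted(loose & strict))  # shared genes via set intersection: ['G1']
```

Set intersection is the same operation a Venn diagram visualizes; in practice the two gene sets would come from the two separate datasets rather than from one toy table.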

Finally, we used the online prediction tools microRNA Data Integration Portal (mirDIP) [15], miRDB [16], TargetScan (v7.1) [17], and DIANA Tools [18] to predict which miRNAs could target the co-DEGs. For each co-DEG, we selected the top 5 candidate miRNAs based on higher predicted scores in ≥ 3 prediction tools.
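One way to read the selection rule above is as a consensus filter followed by a ranking. The sketch below is an assumption-laden illustration: the text does not state how scores from different tools were combined, so it assumes each tool's scores have been rescaled to 0–1 and ranks by the mean score. The function name `top_mirnas` and all miRNA names and scores are hypothetical placeholders.

```python
from statistics import mean


def top_mirnas(predictions, min_tools=3, top_n=5):
    """Rank miRNAs predicted for ONE target gene.

    predictions: {tool_name: {mirna_name: score}} with scores assumed
    to be rescaled to a common 0-1 range (an assumption of this sketch).
    """
    scores = {}
    for tool_scores in predictions.values():
        for mirna, score in tool_scores.items():
            scores.setdefault(mirna, []).append(score)
    # Keep only miRNAs supported by at least `min_tools` tools.
    supported = {m: s for m, s in scores.items() if len(s) >= min_tools}
    # Highest mean score first; ties broken alphabetically for determinism.
    ranked = sorted(supported, key=lambda m: (-mean(supported[m]), m))
    return ranked[:top_n]


# Made-up scores for illustration only.
example = {
    "mirDIP": {"miR-A": 0.9, "miR-B": 0.8, "miR-C": 0.7, "miR-D": 0.99},
    "miRDB": {"miR-A": 0.8, "miR-B": 0.9, "miR-D": 0.99},
    "TargetScan": {"miR-A": 0.7, "miR-B": 0.6, "miR-C": 0.9},
    "DIANA": {"miR-B": 0.5, "miR-C": 0.95},
}
print(top_mirnas(example))  # ['miR-C', 'miR-A', 'miR-B']
```

In this toy input miR-D has the highest individual scores but is dropped because only two tools predict it, which is the behavior the ≥ 3 tool requirement is meant to enforce.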

Identification of protein–protein interaction (PPI) networks of DEGs

PPI networks of AF- and stroke-DEGs were analyzed using the Search Tool for the Retrieval of Interacting Genes (STRING database, V10.5), which predicts protein functional associations and protein–protein interactions. Subsequently, the analytic results with a confidence score > 0.4 were downloaded from the STRING database, and Cytoscape software (V3.5.1) was applied to visualize and analyze the biological networks and node degrees [19].
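The node degree used to rank hub genes is simply the number of retained interaction partners. The following minimal sketch assumes an edge list of (protein, protein, confidence) tuples with confidence on a 0–1 scale; the function name `node_degrees`, the placeholder protein names, and the scores are hypothetical and do not reproduce STRING or Cytoscape output.

```python
from collections import Counter


def node_degrees(edges, min_score=0.4):
    """Count interaction partners per node for edges above the cutoff."""
    degree = Counter()
    seen = set()
    for a, b, score in edges:
        pair = frozenset((a, b))
        # Skip weak edges, self-loops, and duplicates such as (A, B) / (B, A).
        if score <= min_score or a == b or pair in seen:
            continue
        seen.add(pair)
        degree[a] += 1
        degree[b] += 1
    return degree


# Made-up edges for illustration only.
edges = [
    ("A", "B", 0.90),
    ("A", "C", 0.70),
    ("B", "C", 0.95),
    ("A", "D", 0.30),  # below the 0.4 cutoff, dropped
    ("B", "A", 0.90),  # duplicate of the first edge, ignored
]
degrees = node_degrees(edges)
print(degrees["A"], degrees["D"])  # 2 0
```

Nodes with the largest counts in `degrees` would be the hub candidates; a node whose edges all fall below the cutoff has degree 0 and does not appear in the network.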

Functional enrichment analysis

Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analyses of AF- and stroke-DEGs were carried out using the Database for Annotation, Visualization and Integrated Discovery bioinformatics resources (DAVID Gene Functional Classification Tool) [20] and the REACTOME database (v62) [21]. GO terms and KEGG maps of biological functions with p < 0.05 were considered significantly enriched. In addition, we present the different biofunctions of AF- and stroke-DEGs in terms of biological processes, molecular functions, and cellular components from the DAVID and REACTOME databases.
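The two quantities reported for each enriched term, fold enrichment and a p-value, can be illustrated with a small calculation. This sketch uses the plain hypergeometric upper tail as a stand-in; DAVID reports a modified Fisher exact p-value (the EASE score), so values from this code should not be expected to match DAVID output. The function names and the counts in the example are hypothetical.

```python
from math import comb


def fold_enrichment(k, n, K, N):
    """(k/n) / (K/N): share of the gene list in the term vs. the background.

    k: list genes annotated to the term    n: genes in the list
    K: background genes in the term        N: genes in the background
    """
    return (k / n) / (K / N)


def hypergeom_pvalue(k, n, K, N):
    """P(X >= k) when n genes are drawn at random from N, K of them annotated."""
    upper = min(n, K)
    total = comb(N, n)
    # comb() returns 0 for impossible draws, so no extra bounds checks are needed.
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / total


# Made-up counts: 3 of 10 list genes hit a term that covers 5 of 100 background genes.
fe = fold_enrichment(3, 10, 5, 100)
p = hypergeom_pvalue(3, 10, 5, 100)
print(round(fe, 2), p < 0.05)  # 6.0 True
```

A large fold enrichment with few annotated genes can still carry a modest p-value, which is why both numbers are usually read together.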

Subsequently, the AmiGO database (v2.0) was used to query the GO consortium annotations for the selected co-DEGs, to verify the accuracy of the identified co-DEGs and annotate their biofunctions [22]. For microRNA target prediction, the online tools of DIANA-miRPath (v3.0) [18] were applied to evaluate interactions between the miRNAs previously identified using the prediction tools and the co-DEGs involved in AF and stroke.

Identification of co-DEGs associated with nervous or cardiovascular diseases

The Comparative Toxicogenomics Database (CTD) integrates chemical-gene, chemical-disease, and gene-disease interactions to generate expanded networks and predict novel associations [23]. We used these data to analyze relationships between gene products and nervous system or cardiovascular diseases. Relationships between co-DEGs and diseases were identified as either a direct association or an implied association.


Identification of DEGs

A total of 54,674 probes corresponding to 20,484 genes were identified in the GSE79768 and GSE58294 datasets, from which AF- and stroke-DEGs were determined. We found 489 DEGs in LA specimens of AF patients compared with SR patients, including 428 down-regulated genes and 61 up-regulated genes. A total of 265, 518, and 592 DEGs were identified at < 3, 5, and 24 h after stroke, respectively. The 210 DEGs shared by all three time points were defined as the stroke-DEGs. Heatmaps of AF-DEGs related to inflammatory and immune responses, ion channels, and cell signaling are shown in Fig. 1 and Additional file 1: S1. Similarly, Fig. 2 and Additional file 2: S2 show the expression values of stroke-DEGs related to inflammatory response, energy metabolism, ion channels and transport, and neuronal regulation.

Fig. 1

Hierarchical clustering analysis of AF-related differentially expressed genes: a–d results of hierarchical clustering analysis for DEG expression in relation to cellular signaling, ion channels, and inflammatory and immune responses. Red, greater expression. Blue, less expression

Fig. 2

Hierarchical clustering analysis of stroke-related DEGs: a–c results of hierarchical clustering analysis for DEG expression in relation to energy metabolism, ion channels, inflammatory response, and neuronal regulation. Red, greater expression. Blue, less expression

Functional enrichment in Co-DEGs

Figure 3c shows the overlap between AF- and stroke-DEGs. Four co-DEGs were observed: zinc finger protein 566 (ZNF566), PDZK1 interacting protein 1 (PDZK1IP1), zinc finger homeobox 3 (ZFHX3), and paired-like homeodomain 2 (PITX2). The AmiGO database was used to confirm GO term enrichment related to biological processes, molecular functions, and cellular components, and the co-DEGs were associated with various processes, as indicated in Table 1.

Fig. 3

PPI network and Venn diagrams: a, b PPI networks of AF-related and stroke-related DEGs, respectively, constructed using the STRING database (threshold > 0.4). Red, greater degree. Yellow, lesser degree. c Venn diagrams of DEGs related to AF and to < 3, 5, and 24 h after stroke. The co-expressed genes ZNF566, PDZK1IP1, ZFHX3, and PITX2 are identified

Table 1 Gene Ontology (GO) term enrichment for the co-expressed genes in AF-related stroke

PPI network analysis and functional GO terms and pathway enrichment analyses

We identified 256 and 43 nodes in the PPI networks of AF- and stroke-DEGs, respectively (Fig. 3). Among AF-DEGs, leucine-rich repeat kinase 2 (LRRK2; degree = 38), calmodulin 1 (CALM1; degree = 25), chemokine (C-X-C motif) receptor 4 (CXCR4; degree = 25), toll-like receptor 4 (TLR4; degree = 21), catenin (cadherin-associated protein), beta 1 (CTNNB1; degree = 21), and chemokine (C-X-C motif) receptor 2 (CXCR2; degree = 21) were considered hub genes related to AF maintenance. Among stroke-DEGs, the hub genes with relatively higher degrees were CD19 (degree = 5), fibroblast growth factor 9 (FGF9; degree = 5), SRY (sex determining region Y)-box 9 (SOX9; degree = 5), guanine nucleotide binding protein (G protein), gamma transducing activity polypeptide 1 (GNGT1; degree = 4), and noggin (NOG; degree = 4).

Using the DAVID database, the top 5 GO terms related to biological processes among AF-DEGs were primarily associated with inflammatory response (Fold Enrichment: 4.08; p-value: 1.11E−07), immune response (Fold Enrichment: 3.50; p-value: 2.49E−06), regulation of MAP kinase activity (Fold Enrichment: 9.53; p-value: 1.92E−05), and regulation of NF-kappa B activity (Fold Enrichment: 16.73; p-value: 1.95E−04). For cellular components, significant enrichment was found in plasma membrane (Fold Enrichment: 1.60; p-value: 1.37E−06), extracellular region (Fold Enrichment: 1.75; p-value: 8.88E−04), and MHC class II protein complex (Fold Enrichment: 13.47; p-value: 0.003). In addition, the terms related to molecular functions mainly involved ion channel binding (Fold Enrichment: 5.06; p-value: 0.001), neuropeptide Y receptor activity (Fold Enrichment: 23.84; p-value: 0.007), and transmembrane receptor protein tyrosine kinase adaptor activity (Fold Enrichment: 21.46; p-value: 0.008). With respect to stroke-DEGs, the biological process terms regulation of myoblast differentiation (Fold Enrichment: 25.39; p-value: 4.89E−04), endocardial cushion morphogenesis (Fold Enrichment: 27.38; p-value: 0.005), positive regulation of epithelial cell proliferation (Fold Enrichment: 9.73; p-value: 0.008), and fibroblast growth factor receptor signaling pathway (Fold Enrichment: 7.12; p-value: 0.018) were significantly enriched. Similarly, the molecular function terms RNA polymerase II transcription factor activity, sequence-specific DNA binding (Fold Enrichment: 5.15; p-value: 0.006), ISG15-specific protease activity (Fold Enrichment: 146.79; p-value: 0.013), and nucleic acid binding (Fold Enrichment: 2.09; p-value: 0.015) were primarily enriched (Fig. 4 and Additional file 3: S3).

Fig. 4

GO terms and KEGG pathway enrichment: a, b AF- and stroke-related GO term enrichment for DEGs, respectively. c KEGG pathways of AF- and stroke-related DEGs. d Functional and pathway enrichment of AF- and stroke-related DEGs from the REACTOME database. Dot sizes represent counts of enriched DEGs, and dot colors represent −log10 p-values

KEGG pathway analysis data appear in Fig. 4c. The results suggest that the AF-DEGs were mainly enriched in the pathways of cytokine–cytokine receptor interaction (p-value: 1.02E−04), cGMP-PKG signaling pathway (p-value: 0.025), antigen processing and presentation (p-value: 0.022), and NF-kappa B signaling pathway (p-value: 0.037). In contrast, the KEGG terms PI3K-Akt signaling pathway (p-value: 0.017) and B cell receptor signaling pathway (p-value: 0.045) were enriched in stroke-DEGs (Fig. 4c and Additional file 4: S4). Enrichment analysis using the REACTOME database identified additional associations, which appear in Fig. 4d. The CTD database showed that the co-DEGs were associated with several nervous system and cardiovascular diseases; these data appear in Fig. 5 and Additional file 5: S5.

Fig. 5

Relationships between co-expressed genes and nervous system and cardiovascular diseases based on the CTD database. *Direct evidence of marker or mechanism in this disease

Identification of functional and pathway enrichment among predicted miRNAs and Co-DEGs

Prediction analysis using the mirDIP, miRDB, TargetScan, and DIANA bioinformatic tools identified the top 5 miRNAs targeting each co-DEG involved in AF-related stroke (Table 2). These data help us to understand how the predicted miRNAs are related to the progression of AF-related stroke.

Table 2 Gene Ontology (GO) term and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment among predicted miRNAs and Co-DEGs


Predicting AF is needed for stroke prevention, but 30% of patients show no signs of AF despite months of continuous cardiac rhythm monitoring. Thus, malignant cardiovascular events may be related to irregular and infrequent cardiovascular incidents, as well as to the limitations of the electromechanical indices that are expected to predict problems with atrial contractility [7, 11, 12]. Markers of, and associations between, atrial dysfunction and embolic stroke are therefore of interest and may provide novel therapeutic targets for primary care. Inflammatory and immune responses and ion channels and transport are significantly associated with AF recurrence and maintenance, as well as with stroke occurrence. Several hub genes that directly or indirectly regulate the nervous system were found among AF-DEGs. Visanji’s group compared resting electrocardiograms of LRRK2-associated Parkinson’s disease (PD) patients, nonmanifesting carriers, noncarriers, and idiopathic PD patients to investigate heart rate variability in LRRK2-associated PD [24]. There is evidence that LRRK2 may act as a biofunctional mediator linking heart rate variability and PD [24]. In a molecular mechanistic study, a neuroprotective role in regulating mitochondrial complex I function and oxidative stress during ischemia/reperfusion was identified [25, 26]. According to a gene–gene interaction analysis, Timasheva’s group showed that the CXCR2 locus is significantly associated with stroke development in patients with hypertension [26]. In addition, CXCR2 antagonism attenuated neurological deficits and infarct volumes via decreased cerebral neutrophil infiltration and peripheral neutrophilia in a hyperlipidemic ApoE−/− mouse stroke model [27]. CALM1 is recognized as a major regulator of cardiac ion-current expression and calcium handling, and a key determinant of cardiac electrical function [28].
Also, specific risk alleles of CALM1 were identified as being associated with increased risk of stroke in studies of coronary heart disease [29]. Thus, there may be a relationship between cardiovascular and nervous system diseases, which may arise from locus mutations or gene variants.

Additionally, PITX2, a member of the pituitary homeobox (Pitx) family, has a critical role in organ morphogenesis and in AF maintenance, which is related to short stature homeobox 2 (Shox2) [30]. Pitx2 is expressed in the LA and the pulmonary vein, which are considered the substrate and the trigger for AF maintenance, respectively. However, several experimental data indicate a trend for PITX2 gene expression to be silenced during aging in LA samples, providing genetic evidence that gene silencing increases AF susceptibility [30, 31]. Furthermore, miRNA functional analysis and a genomic approach showed that miR-17-92 and miR-106b-25 were associated with the regulation of Pitx2 expression and are implicated in human AF susceptibility [31]. To reveal relationships between genetic variants and the risk of ischemic stroke, Malik’s group studied the PITX2 and ZFHX3 genes and found a significant association with cardioembolic stroke (CE) in a meta-analysis [31, 32]. Similarly, in a genome-wide association study using clinical samples from paroxysmal or persistent AF patients, ZFHX3 was significantly associated with LA enlargement and persistent AF and subsequently with ablation outcomes [33]. Correspondingly, Choi’s group found a significant association between top susceptibility loci (chromosomes 4q25 [PITX2], 16q22 [ZFHX3]) and AF recurrence after ablation in a Korean population, although no top single nucleotide polymorphisms (SNPs) predicted clinical recurrence after catheter ablation [34]. A regulatory role for PDZK1IP1 (MAP17) in reactive oxygen species production has been confirmed; it is considered a marker of increased oxidative stress and may be a new therapeutic target [35]. Recent research also suggests a potential role in ion channel regulation, linked to the Na+/H+ exchanger 3 and the A-kinase anchor protein 2/protein kinase A pathway [36]. In addition, ZNF566 plays a central role in heart regeneration and repair, and in endocardial and epicardial epithelial-to-mesenchymal transitions [37, 38].

Research suggests potential beneficial effects of miRNA transformation therapy vectored by adenovirus, plasmid, and lentivirus for AF therapy [39]. We found that miR-27a-3p, miR-27b-3p, and miR-494-3p were predicted to target co-DEGs and may be potential biomarkers of AF-related stroke. Interestingly, Vegter’s group compared heart failure-specific circulating miRNAs in 114 heart failure patients with or without different manifestations of atherosclerotic disease, and reported that the abundance of miR-18a-5p, miR-27a-3p, miR-199a-3p, miR-223-3p, and miR-652-3p was associated with atherosclerosis and cardiovascular-related rehospitalizations [40]. Similarly, Marques and colleagues found that several miRNAs, including let-7b-5p, let-7c-5p, let-7e-5p, miR-122-5p, and miR-21-5p, as well as absorbed miR-16-5p, miR-17-5p, miR-27a-3p, and miR-27b-3p, target pathways related to heart failure and may be considered potential biomarkers [41]. In contrast, expression of miR-27b-3p is significantly related to embryonic myogenesis and protein synthesis, whereas miR-494-3p expression is associated with cerebral blood supply and functional recovery in a rat stroke model according to cerebral cortical miRNA profile changes [42, 43].


The hub genes LRRK2, CALM1, CXCR4, TLR4, CTNNB1, CXCR2, KIT, and IL1B may be associated with AF recurrence and maintenance, and CD19, FGF9, SOX9, GNGT1, and NOG may be associated with stroke. Additionally, the co-DEGs ZNF566, PDZK1IP1, ZFHX3, and PITX2 link AF and stroke. Finally, the top 5 miRNAs for each co-DEG may be potential biomarkers or therapeutic targets for AF-related stroke, especially miR-27a-3p, miR-27b-3p, and miR-494-3p. Thus, there is an association between AF and stroke, and expression of the ZNF566, PDZK1IP1, ZFHX3, and PITX2 genes may favor AF-related stroke.


Several limitations of our study should be noted. First, this study is a microarray analysis, and all results are based on gene expression values. Because gene expression may not correspond directly to protein expression, the biomarkers identified here should be considered at the gene level rather than the protein level. In application, PCR and microarray chip assays may be better suited for assessing the risk of AF-related stroke. Second, validation should be carried out in vitro, in vivo, and in clinical trials. However, in vivo and in vitro models of AF and stroke are currently immature, so larger, prospective clinical studies may be better suited to validate our results to some extent.