Introduction

Alzheimer’s disease (AD), a progressive and irreversible neurodegenerative disorder affecting the elderly, is characterized by dementia and disruption of cognitive functioning. It represents one of the highest unmet medical needs worldwide. The World Alzheimer Report 2018, published by Alzheimer’s Disease International, noted a global prevalence of 50 million in 2018 and forecast a nearly threefold rise in AD cases to 139 million globally by 2050 (International D World Alzheimer Report 2018). In the United States, around 121,000 deaths due to Alzheimer’s dementia were reported in 2019. During the coronavirus disease 2019 (COVID-19) pandemic, fatality rates among AD patients increased by 145% (Alzheimer’s disease facts and figures 2021). The Alzheimer’s and Related Disorders Society of India (ARDSI) forecasts a burden of 6.35 million AD cases across India by 2025 (Kumar et al. 2020).

To date, the US Food and Drug Administration (US-FDA) has approved only four anti-AD drugs, which fall into two categories: (i) cholinesterase inhibitors (donepezil, rivastigmine and galantamine) and (ii) an N-methyl-D-aspartate receptor antagonist (memantine) (Alzheimer’s Association 2017). These treatments are oriented towards nominal symptomatic relief and offer only a modest clinical effect.

Neuropathological evidence shows that AD is characterized by the presence of amyloid beta (Aβ) plaques and neurofibrillary tangles (NFT) in the hippocampal and cortical regions. Although various complex pathophysiological theories explain the role of numerous genes and proteins in AD progression, a major role is attributed to the presenilin 1 (PSEN1), beta-secretase 1 (BACE1), amyloid precursor protein (APP) and microtubule-associated protein tau (MAPT) proteins (Chouraki and Seshadri 2014). Disruption of regulatory activities such as phosphorylation and dephosphorylation of these proteins results in AD progression. Despite numerous genetic evaluations, inconsistencies among ethnicities leave a gap in identifying crucial disease-specific targets. This study aimed to explore the major genetic alterations across microarray datasets and to retrieve differentially expressed genes (DEGs) common to various ethnicities, with the hypothesis that DEGs overlapping across ethnicities might play a definitive role in AD pathogenesis.

Methodology

Selection of Datasets

Microarray datasets pertaining to Alzheimer’s disease were retrieved from the Gene Expression Omnibus (GEO) database (Barrett et al. 2013) using the keywords “Alzheimer’s disease”, “Familial Alzheimer’s disease”, “Sporadic Alzheimer’s disease”, “Early onset Alzheimer’s disease” and “Late onset Alzheimer’s disease”. The datasets retrieved with these search terms were screened against a set of inclusion and exclusion criteria.

Inclusion Criteria

Datasets satisfying all the following criteria were selected:

  • Datasets with controls and AD

  • Datasets with expressional arrays

  • Datasets describing the diagnostic criteria of AD

  • Datasets studied in Homo sapiens

  • Datasets with a minimum of two samples in each category, i.e., control and AD

  • Datasets with blood/brain samples

Exclusion Criteria

Datasets meeting any of the following criteria were excluded:

  • Drug-treated datasets

  • Methylation studies

  • Datasets with no diagnostic criteria

  • Cell line studies

  • Datasets from other organisms

  • Datasets with no details about controls

  • Mutation studies
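The screening rules above can be sketched as a single predicate. This is only an illustration: the `Dataset` fields and the `is_eligible` helper are hypothetical names, and GEO does not expose these properties as ready-made flags, so in practice each field would have to be curated by hand from the dataset description.

```python
from dataclasses import dataclass


@dataclass
class Dataset:
    # Hypothetical, hand-curated metadata fields (not a GEO schema).
    accession: str
    organism: str
    has_controls: bool
    has_ad_cases: bool
    is_expression_array: bool
    diagnostic_criteria_stated: bool
    n_control: int
    n_ad: int
    tissue: str  # e.g. "blood", "brain", "cell line"
    drug_treated: bool = False
    is_methylation: bool = False
    is_mutation_study: bool = False


def is_eligible(d: Dataset) -> bool:
    """True only if every inclusion rule holds and no exclusion rule fires."""
    included = (
        d.has_controls and d.has_ad_cases
        and d.is_expression_array
        and d.diagnostic_criteria_stated
        and d.organism == "Homo sapiens"
        and d.n_control >= 2 and d.n_ad >= 2
        and d.tissue in {"blood", "brain"}
    )
    # Cell lines, other organisms, missing controls and missing diagnostic
    # criteria are already rejected by the inclusion checks above.
    excluded = d.drug_treated or d.is_methylation or d.is_mutation_study
    return included and not excluded


# Placeholder accession, for illustration only.
example = Dataset("GSE_EXAMPLE", "Homo sapiens", True, True, True, True, 5, 5, "brain")
print(is_eligible(example))
```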

Gene Expression Analysis

The selected datasets were preprocessed, curated and analyzed individually using Bioconductor packages to retrieve differentially expressed genes (DEGs), both upregulated and downregulated. Datasets that yielded DEGs with a false discovery rate (FDR) p-value (p-value adjusted according to the Benjamini–Hochberg method) < 0.05 were selected. These datasets were then subjected to four sets of filtering criteria based on FDR and log fold change (FC): (i) FDR p-value < 0.05 and log FC > 2, (ii) FDR p-value < 0.05 and log FC > 1.5, (iii) FDR p-value < 0.05 and log FC > 1 and (iv) FDR p-value < 0.01 and log FC > 1. Based on these stringent filtering criteria, datasets with the following characteristics were included: (a) datasets satisfying one of the four criteria, (b) datasets containing both upregulated and downregulated DEGs and (c) common DEGs present in at least 60% of the datasets showing characteristics (a) and (b).
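As a minimal sketch, the four threshold pairs can be applied to a table of per-gene results as follows. The rows are made-up values, `split_degs` is a hypothetical helper, and the code assumes that the log FC cut-off applies to the absolute value, so that downregulated genes are those below the negative threshold; the text does not state this explicitly.

```python
# Four threshold pairs from the text: (label, max adjusted p-value, min log FC).
CRITERIA = [
    ("i", 0.05, 2.0),
    ("ii", 0.05, 1.5),
    ("iii", 0.05, 1.0),
    ("iv", 0.01, 1.0),
]


def split_degs(rows, max_fdr, min_logfc):
    """Split (gene, adj_p, logfc) rows into up- and downregulated gene sets.

    Assumption: the cut-off is on |log FC|, so a gene counts as
    downregulated when logfc < -min_logfc.
    """
    up = {g for g, p, fc in rows if p < max_fdr and fc > min_logfc}
    down = {g for g, p, fc in rows if p < max_fdr and fc < -min_logfc}
    return up, down


# Made-up rows, for illustration only.
rows = [
    ("GENE_A", 0.001, 2.4),
    ("GENE_B", 0.030, 1.2),
    ("GENE_C", 0.004, -1.1),
    ("GENE_D", 0.200, 3.0),
]

results = {label: split_degs(rows, fdr, fc) for label, fdr, fc in CRITERIA}
for label, (up, down) in results.items():
    print(label, sorted(up), sorted(down))
```

A dataset would satisfy characteristic (b) under a given criterion only when both returned sets are non-empty.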

Protein–Protein Interaction (PPI) Analysis

The common DEGs retrieved from the above step were subjected to PPI analysis against literature-derived genes (LDGs) pertinent to AD progression, which were gathered from the National Center for Biotechnology Information (NCBI) (Brown et al. 2015). The analysis was performed with the Search Tool for the Retrieval of Interacting Genes/Proteins (STRING) database (von Mering et al. 2003). The PPI network was visualized in Cytoscape with proteins as nodes and interactions as edges. The proteins exhibiting significant interactions with LDGs (confidence score of 70%) were shortlisted, and the nodes exhibiting node degree > 2 were selected as AD targets.
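The shortlisting step can be illustrated with a small sketch that drops low-confidence edges and counts the remaining neighbours of each node. The edge list, the scores and the `DEG_`/`LDG_` labels are placeholders, and the 70% confidence score is assumed to correspond to a score of 0.7 on a 0 to 1 scale.

```python
from collections import defaultdict


def node_degrees(edges, min_score=0.7):
    """Count distinct neighbours per node, keeping edges with score >= min_score."""
    neighbours = defaultdict(set)
    for a, b, score in edges:
        if score >= min_score and a != b:
            neighbours[a].add(b)
            neighbours[b].add(a)
    return {node: len(ns) for node, ns in neighbours.items()}


# Hypothetical edges and scores, for illustration only.
edges = [
    ("DEG_1", "LDG_A", 0.92),
    ("DEG_1", "LDG_B", 0.81),
    ("DEG_1", "LDG_C", 0.75),
    ("DEG_2", "LDG_A", 0.65),  # below the cut-off, dropped
    ("DEG_2", "LDG_B", 0.71),
]

degrees = node_degrees(edges)
# Keep DEG nodes whose degree exceeds 2, as stated in the text.
candidates = sorted(n for n, d in degrees.items() if n.startswith("DEG_") and d > 2)
print(degrees)
print(candidates)
```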

Functional Enrichment Analysis

The common DEGs retrieved were subjected to functional enrichment analysis to explore their involvement in signaling pathways and physiological functions associated with AD pathogenesis through ClueGO (Bindea et al. 2009) in Cytoscape.
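Enrichment tools of this kind commonly score each term with a hypergeometric test; how ClueGO was configured in this study is not stated, so the following is only a sketch of that general idea. The function name and all numbers are illustrative assumptions.

```python
from math import comb


def hypergeom_enrichment_p(population, annotated, sample, hits):
    """P(X >= hits) when `sample` genes are drawn from `population` genes,
    of which `annotated` carry the term of interest."""
    upper = min(annotated, sample)
    total = comb(population, sample)
    tail = sum(
        comb(annotated, k) * comb(population - annotated, sample - k)
        for k in range(hits, upper + 1)
    )
    return tail / total


# Illustrative numbers only: 20,000 background genes, 200 annotated to a term,
# 9 query genes, 3 of which carry the annotation.
p = hypergeom_enrichment_p(20000, 200, 9, 3)
print(f"{p:.3e}")
```

In a real analysis the resulting p-values would also need correction for the number of terms tested.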

Results

Selection of Datasets

A total of 134 GEO datasets derived from studies performed on Homo sapiens were retrieved from NCBI, of which 32 datasets were found to satisfy the initial inclusion criteria. Details pertaining to the 32 datasets are presented in Table 1.

Table 1 List of GEO datasets selected for the study

Gene Expression Analysis

The datasets were analyzed individually with the GEO2R tool, which uses Bioconductor packages in R (Barrett et al. 2013). Among the 32 datasets, 16 were rejected because they did not exhibit significant FDR p-values. The remaining 16 datasets were analyzed based on the four filtering criteria and three characteristics mentioned in the methodology section (Fig. 1).
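For readers unfamiliar with the FDR screen, the Benjamini–Hochberg adjustment can be sketched in a few lines. This is a plain re-implementation for illustration, not the GEO2R code, and the raw p-values are invented.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so the running minimum
    # keeps the adjusted values monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Invented raw p-values, for illustration only.
raw = [0.001, 0.008, 0.039, 0.041, 0.20]
adj = benjamini_hochberg(raw)
# A dataset passes the screen if at least one gene has adjusted p < 0.05.
has_significant = any(p < 0.05 for p in adj)
print(adj, has_significant)
```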

Fig. 1 CONSORT diagram explaining the selection and screening of datasets

  (i) FDR p-value < 0.05 and log FC > 2:

    Of the 16 qualified datasets, five with upregulated DEGs and four with downregulated DEGs (Fig. 2) satisfied this criterion (Tables 2 and 3). However, only two of the five datasets shared upregulated DEGs, and the downregulated DEGs of the shortlisted datasets had no genes in common. Therefore, this criterion was rejected.

    Fig. 2 Venn diagram exhibiting the common upregulated (a) and downregulated (b) DEGs

    Table 2 Number of DEGs obtained through filtering criteria
    Table 3 List of common DEGs obtained through filtering criteria
  (ii) FDR p-value < 0.05 and log FC > 1.5:

    Among the 16 datasets, only six met this criterion (Tables 2 and 3). Common DEGs were found in only 50% of these datasets, which fell short of characteristic (c) mentioned in the methodology section (Fig. 3). Thus, this criterion was also rejected.

    Fig. 3 Venn diagram exhibiting the common upregulated (a) and downregulated (b) DEGs

  (iii) FDR p-value < 0.05 and log FC > 1:

    Among the 16 datasets, this criterion was met by nine datasets with upregulated DEGs and eight datasets with downregulated DEGs (Tables 2 and 3). The numbers of datasets with upregulated and downregulated DEGs were therefore unequal, and common DEGs were not found in 60% of the datasets. Therefore, this criterion was rejected.

  (iv) FDR p-value < 0.01 and log FC > 1:

    Among the 16 datasets, this criterion was met by six datasets containing both upregulated and downregulated DEGs (Tables 2 and 3). Common upregulated and downregulated DEGs were found in four of these six datasets, which accounted for more than 60%. Hence, this criterion was selected to retrieve the DEGs for PPI and functional enrichment analysis. Among upregulated DEGs, solute carrier family 5 member 3 (SLC5A3) and serpin family A member 3 (SERPINA3) were common to four datasets. Among downregulated DEGs, somatostatin (SST), regulator of G protein signaling 4 (RGS4), crystallin mu (CRYM), neuronal pentraxin 2 (NPTX2), reticulon 3 (RTN3), brain-derived neurotrophic factor (BDNF) and ectodermal-neural cortex 1 (ENC1) were common to four datasets (Fig. 4). These genes were selected for further PPI analysis with LDGs.

    Fig. 4 Venn diagram exhibiting the common upregulated (a) and downregulated (b) DEGs
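The overlap check behind characteristic (c) can be sketched as counting, for each gene, the number of datasets in which it appears and keeping those seen in at least 60% of them. The dataset labels, gene names and the `shared_genes` helper are placeholders, not the study's data.

```python
from collections import Counter


def shared_genes(deg_sets, min_fraction=0.6):
    """Return genes present in at least min_fraction of the datasets."""
    counts = Counter(g for genes in deg_sets.values() for g in set(genes))
    needed = min_fraction * len(deg_sets)
    return {g for g, c in counts.items() if c >= needed}


# Placeholder datasets and genes, for illustration only.
deg_sets = {
    "dataset_1": {"GENE_A", "GENE_B"},
    "dataset_2": {"GENE_A", "GENE_B"},
    "dataset_3": {"GENE_A", "GENE_B"},
    "dataset_4": {"GENE_A", "GENE_C"},
    "dataset_5": {"GENE_D"},
    "dataset_6": {"GENE_E"},
}

common = shared_genes(deg_sets)
print(sorted(common))
```

With six datasets, a gene present in four of them (about 67%) passes the threshold, whereas one present in three (50%) does not.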

PPI Analysis

Eighteen LDGs were selected from the NCBI portal (Table 4) and subjected to PPI analysis with the DEGs shortlisted in the above step. PPI analysis (Fig. 5) revealed that BDNF exhibited the highest node degree (16), followed by SST (7), AACT (SERPINA3) (4), RGS4 (3), RTN3 (2), NPTX2 (1) and CRYM (1). BDNF exhibited high connectivity with AD-specific proteins including glutamate ionotropic receptor NMDA type subunit 2B (GRIN2B), BACE1, MAPT, PSEN1, TP53, BCHE, SNCA, COMT, INS, APP, APOE and ACHE. SST exhibited PPI with IDE, MME, IGF, APP, INS and ACHE. SERPINA3/AACT exhibited interactions with APOA1, APOE and APP proteins. RTN3 interacted with BACE1 and APP. RGS4 interacted with COMT alone. NPTX2 and CRYM did not exhibit interactions with any of the LDGs (Fig. 5, Tables 5 and 6).

Table 4 List of LDGs retrieved from NCBI
Fig. 5 PPI network of DEGs exhibiting significant interactions with LDGs. Yellow nodes represent common genes retrieved from GEO datasets. Pink nodes represent LDGs

Table 5 Significant PPI of identified DEGs with LDGs
Table 6 Characteristics of the PPI network

Functional Enrichment Analysis

The common DEGs retrieved were subjected to functional enrichment analysis to explore their involvement in Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways.

GO analysis revealed that SLC5A3 was involved in the transport of potassium ions across plasma membranes (GO:0098739) and peripheral nervous system development (GO:0007422), whereas BDNF, RGS4, NPTX2 and SST were involved in cognitive ability (GO:0050890), trans-synaptic signaling (GO:0099157), striated muscle cell differentiation (GO:0051154), anterograde trans-synaptic transmission (GO:0098916) and regulation of nervous system processes (GO:0031644). BDNF, SST and ENC1 were involved in receptor ligand activity (GO:0048018), cytokine receptor binding (GO:0005126), positive regulation of cell projection organization (GO:0031346) and receptor regulator activity (GO:0030545). ENC1 and RTN3 were found to be involved in negative regulation of cellular amide metabolic process (GO:0034249). SERPINA3 in combination with SST was known to be involved in digestion (GO:0007586) (Fig. 6).

Fig. 6 Gene Ontology categories of common DEGs describing their physiological roles

KEGG analysis revealed that BDNF was involved in triggering the phosphoinositide 3-kinase (PI3K) pathway (hsa04213), rat sarcoma (RAS) signaling (hsa05212), RAC1 signaling (hsa04510), FYN signaling (hsa04380), cyclin-dependent kinase 5 (CDK5) phosphorylation, FYN-mediated GRIN2B activation and transcriptional signaling. BDNF and SST were involved in transcription regulation by methyl-CpG-binding protein 2 (MECP2), gastric acid secretion (hsa04971) and somatostatin gene expression. RGS4 was found to mediate G alpha (i) auto-inactivation and G alpha (q) inactivation by hydrolysis of guanosine triphosphate (GTP) to guanosine diphosphate (GDP). CRYM was involved in lysine catabolism and autosomal-dominant deafness, whereas RTN3 was involved in protein–protein interactions at synapses, binding of synaptic adhesion-like molecule 1–4 (SALM1–4) to reticulons and synaptic adhesion-like molecules. SERPINA3 was involved in exocytosis of platelet alpha granules and azurophil granule lumen proteins (Fig. 7).

Fig. 7 Significant KEGG pathways of common DEGs

Discussion

This study aimed to retrieve significant DEGs associated with AD by analyzing the gene expression data available in the GEO database. Initially, the GEO datasets were selected based on the inclusion and exclusion criteria, which resulted in 32 datasets. The raw data for each dataset were analyzed individually using the Bioconductor package in R, and DEGs with FDR p-value < 0.05 were retrieved and segregated into upregulated and downregulated DEGs. Although 32 datasets were found to be eligible, only 16 satisfied the initial criterion of FDR p-value < 0.05. These DEGs were screened using different filtering norms, which yielded six datasets with both upregulated and downregulated DEGs. The overlapping DEGs were found in more than 60% of these six datasets. SLC5A3 and SERPINA3 were common among upregulated DEGs, whereas SST, BDNF, RGS4, CRYM, NPTX2, RTN3 and ENC1 were common among downregulated DEGs. These DEGs were further subjected to PPI analysis with 18 LDGs known to play a strong role in AD pathogenesis. Among the above nine DEGs, BDNF, SST, SERPINA3 (AACT), RTN3 and RGS4 exhibited significant interactions.

BDNF exhibited interaction with crucial targets including GRIN2B, BACE1, APP, MAPT, SNCA, ACHE, APOE, PSEN1 and COMT. Functional enrichment analysis revealed a normal physiological role of BDNF in cytokine signaling, receptor ligand activity and regulation, trans-synaptic signaling, cognitive function, chemical synaptic transmission, cell differentiation, cell growth and regulation. This suggests its crucial involvement in neuronal growth, development and transmission, which are abnormal in AD. KEGG pathway analysis revealed the detailed mechanistic action of BDNF. BDNF initiates its response by binding to the tropomyosin receptor kinase B (TRKβ) receptor; after binding, the receptor dimerizes and undergoes autophosphorylation. The phosphorylated TRKβ triggers various signaling mechanisms such as PI3K, RAS, CDK5, RAC1 GTPase, Src homology 2 domain-containing 1 (SHC1), FYN kinase, fibroblast growth factor receptor substrate 2 (FRS2), T-lymphoma invasion and metastasis-inducing protein 1 (TIAM1) and phospholipase C gamma 1 (PLCG1). These were in turn found to trigger secondary signaling pathways through GRIN2B, which is associated with cocaine addiction, congenital central hypoventilation syndrome and eating disorders. A number of research studies have reported downregulation of BDNF expression, which is in line with our findings (Kang et al. 2020; Akhtar et al. 2020).

The PPI analysis of SST revealed its interaction with primary AD targets including IDE, MME, IGF, APP, INS and ACHE. Like BDNF, SST also exhibited a physiological role in trans-synaptic signaling, cognitive function, anterograde trans-synaptic signaling, receptor ligand activity, cytokine receptor binding and receptor regulator activity. KEGG pathway analysis revealed the association of SST with MECP2 and cAMP responsive element-binding protein 1 (CREB1). It is reported that MECP2 together with CREB1 enhances the expression of SST by binding to the promoter region (Chahrour et al. 2008). There are five subtypes of SST receptors, of which three, i.e., SSTR2, SSTR4 and SSTR5, were observed to display marked downregulation and reduced sensitivity in AD. This interferes with their inhibitory control over the adenylyl cyclase (AC) pathway. Decreased SSTR2 results in decreased activity of neprilysin, an enzyme involved in the degradation of Aβ peptides (Burgos-Ramos et al. 2008; Aguado-Llera et al. 2018; Sandoval et al. 2019). In addition, decreased levels of SST receptors in postmortem AD brains were correlated with a higher degree of amnesia and cognitive dysfunction (Saiz-Sanchez et al. 2010; Beal et al. 1985). In concordance with the above studies, our analysis found downregulation of SST.

SERPINA3, or AACT, is a 55–68 kDa serine protease inhibitor secreted by ependymal cells of the choroid plexus (Zhang and Janciauskiene 2002). Our PPI analysis identified its interaction with APP, APOE and APOA1. Functional enrichment analysis revealed its role in digestion and exocytosis. In AD, it was reported to be colocalized with amyloid plaques. The hydrophobic domain at the C-terminus of this protein interacts and forms a complex with amyloid fibrils. These complexes are known to upregulate SERPINA3, resulting in disruption of cognitive function (Abraham and Potter 1989; Eriksson et al. 1995). Apart from interacting with Aβ fibrils, it is also known to promote tau phosphorylation at Ser202, Thr231, Ser396 and Thr404 by augmenting extracellular signal-regulated kinase (ERK), glycogen synthase kinase-3β (GSK-3β) and c-Jun N-terminal kinase (JNK), leading to inflammatory responses that promote neuronal death and degeneration (Tyagi et al. 2013; Padmanabhan et al. 2006).

RTN3, a transmembrane endoplasmic reticulum (ER) protein, belongs to the reticulon family. This family comprises four mammalian paralogs, i.e., RTN1, RTN2, RTN3 and RTN4, of which RTN3 and RTN4 are neuronal-specific. The members of the reticulon family possess a conserved QID triplet region, known as a reticulon homology domain (RHD), in their C-terminal region. The RHD was found to interact with the C-terminal domain of BACE1, which is involved in the formation of Aβ peptides (Kume et al. 2009; He et al. 2006, 2007). The BACE1-RTN3 complex is reported to halt the axonal transport and enzymatic activity of BACE1 on APP, thereby terminating the amyloidogenic pathway. It was also reported that BACE1 specifically interacts with monomeric RTN3 rather than dimeric forms (Sharoar and Yan 2017; He et al. 2006). The formation of RTN3 aggregates was found to be regulated by B-cell receptor-associated protein 31 (BAP31), an integral ER membrane protein. Silencing of the BAP31 gene leads to formation of RTN3 aggregates, which reduces the interaction of RTN3 with BACE1 and thereby promotes Aβ formation (He et al. 2004; Wang et al. 2019). Our functional enrichment analysis revealed the interactions of RTN3 with synaptic proteins, and gene expression analysis demonstrated downregulation of this gene.

RGS4, a member of the RGS family, modulates G protein signaling activity by inhibiting AC and phospholipase C (PLC) activity. RGS4 inhibits G protein-coupled receptor (GPCR)-mediated APP cleavage, while downregulation of RGS4 enhances APP cleavage (Emilsson 2005). Functional enrichment analysis revealed that RGS4 was involved in various regulatory functions including modulation of chemical synaptic transmission, regulation of trans-synaptic signaling, nervous processes, striated muscle cell differentiation and regulation of cell growth. KEGG analysis revealed that active G alpha (i), (q) and (z) are binding partners of RGS4. Our gene expression analysis revealed downregulation of RGS4 in AD cases.

In summary, the analysis indicated that BDNF, SST, SERPINA3, RTN3 and RGS4 are crucially involved in AD pathogenesis. BDNF and SST trigger various signaling mechanisms including PKA, PI3K and AKT, which in turn inhibit GSK3β and BAD activity. This process results in the inhibition of apoptosis and promotion of neuronal growth. On the other hand, downregulation of BDNF and SST enables Aβ fibrils to inhibit the aforementioned signaling mechanisms, thereby resulting in enhanced apoptosis and neuronal cell death. RTN3 interacts with BACE1 directly and impedes its access to APP, thereby promoting the non-amyloidogenic pathway. RGS4 acts in a similar fashion to SST by hindering GTP hydrolysis (Fig. 8). The presence of Aβ fibrils leads to AD progression; however, the aforesaid targets are believed to have substantial potential to counteract Aβ toxicity.

Fig. 8 Signaling mechanisms and cross-talk pathways underlying AD progression

Blue arrows represent signaling mechanisms in the absence of Aβ fibrils, and red arrows represent signaling responses in the presence of Aβ fibrils. BDNF: brain-derived neurotrophic factor, TRKβ: tropomyosin receptor kinase B, SST: somatostatin, SSTR: somatostatin receptor, APP: amyloid precursor protein, AC: adenylyl cyclase, BACE1: beta-secretase 1, ER: endoplasmic reticulum, RTN3: reticulon 3, GTP: guanosine triphosphate, GDP: guanosine diphosphate, RGS4: regulator of G protein signaling 4, cAMP: cyclic adenosine monophosphate, CDK5: cyclin-dependent kinase 5, TIAM1: T-lymphoma invasion and metastasis-inducing protein 1, FYN: Fyn kinase, IRS: insulin receptor substrate, SHC: src homology and collagen, DOCK3: dedicator of cytokinesis 3, GRIN2B: glutamate ionotropic receptor NMDA type subunit 2B, RAC1: Rac family small GTPase 1, PI3K: phosphatidylinositol-4,5-bisphosphate 3-kinase, AKT: AKT serine/threonine kinase, GSK3β: glycogen synthase kinase 3β, BAD: BCL2-associated agonist of cell death, GRB2: growth factor receptor-bound protein 2, RAS: KRAS proto-oncogene, GTPase, MEK: mitogen-activated protein kinase kinase, ERK: extracellular signal-regulated kinase, CREB: cAMP responsive element binding protein 1, PHF: paired helical filaments, EPAC: Rap guanine nucleotide exchange factor 3, RAP1: member of Ras oncogene family, PKA: protein kinase A, BCL2: BCL2 apoptosis regulator.

Conclusion

Systematic analysis of AD-related gene expression datasets with a defined set of filtering criteria improved the precision of the results. Through this analysis, SLC5A3, BDNF, SST, SERPINA3, RTN3, RGS4, NPTX2, ENC1 and CRYM were identified as potential genes involved in AD pathogenesis. Among the identified genes, BDNF, SST, SERPINA3, RTN3 and RGS4 exhibited significant interactions with LDGs, and thus they were considered to play a major role in AD progression.