Abstract
Alzheimer’s disease is a neurological disorder that affects an individual’s memory, motor functions, behaviour, and thought process. The hippocampus has been observed to be the first region affected by Alzheimer’s disease, so studying it can help identify the genes responsible for the early stage of the disease. The t-test and correlation are most often used to identify significant genes at the initial level. Because the selected genes are differentially expressed, their classification power is generally high. However, genes that appear significant may have a low degree of specificity towards the disease, which leads to misleading interpretations. Similarly, many false correlations between genes can affect the identification of relevant genes. This paper introduces a new framework to reduce false correlations and find potential biomarkers for the disease. The framework combines the t-test, correlation, Gene Ontology (GO) categories, and machine learning techniques to find potential genes. It detects Alzheimer’s-related genes and achieves more than 95% classification accuracy in every dataset considered. Some of the identified genes that are directly involved in Alzheimer’s disease are APP, GRIN2B, and APLP2. The framework also identifies genes such as ZNF621, RTF1, DCH1, and ERBB4, which may play an important role in Alzheimer’s disease. Gene set enrichment analysis (GSEA) is also carried out to determine the major down-regulated and up-regulated GO categories.
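The initial screening step mentioned in the abstract (a t-test followed by correlation) can be illustrated with a minimal sketch. This is not the authors' implementation: the gene names (`GENE_A`, `GENE_B`, `GENE_C`), the expression values, the cut-offs `t_cut` and `r_cut`, and the function `screen_genes` are all hypothetical, chosen only for illustration. The GO-based filtering and the machine learning stage of the framework are not reproduced here.

```python
import math
from statistics import mean, variance


def welch_t(a, b):
    """Welch's t statistic for two independent samples."""
    return (mean(a) - mean(b)) / math.sqrt(variance(a) / len(a) + variance(b) / len(b))


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sx = math.sqrt(sum((xi - mx) ** 2 for xi in x))
    sy = math.sqrt(sum((yi - my) ** 2 for yi in y))
    return cov / (sx * sy)


def screen_genes(expr, labels, t_cut=3.0, r_cut=0.8):
    """Return (differentially expressed genes, co-expressed pairs among them).

    expr   : dict mapping gene name -> expression values, one per sample
    labels : 1 for a disease sample, 0 for a control sample
    t_cut, r_cut : illustrative thresholds (assumptions, not from the paper)
    """
    case = [i for i, lab in enumerate(labels) if lab == 1]
    ctrl = [i for i, lab in enumerate(labels) if lab == 0]

    # Step 1: keep genes whose expression differs between the two groups.
    de = [
        g for g, v in expr.items()
        if abs(welch_t([v[i] for i in case], [v[i] for i in ctrl])) >= t_cut
    ]

    # Step 2: among those genes, keep pairs with strong correlation.
    pairs = [
        (g, h)
        for k, g in enumerate(de)
        for h in de[k + 1:]
        if abs(pearson(expr[g], expr[h])) >= r_cut
    ]
    return de, pairs


# Hypothetical toy data: five disease samples followed by five controls.
labels = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
gene_a = [8.1, 7.9, 8.3, 8.0, 8.2, 5.0, 5.2, 4.9, 5.1, 4.8]
expr = {
    "GENE_A": gene_a,
    "GENE_B": [v + 1.0 for v in gene_a],  # shifted copy, so co-expressed with GENE_A
    "GENE_C": [6.0, 6.2, 5.9, 6.1, 5.8, 6.1, 5.9, 6.0, 6.2, 5.8],  # no group difference
}

de, pairs = screen_genes(expr, labels)
print(de, pairs)
```

In this toy setting, a pair such as `GENE_A` and `GENE_B` would pass both filters on statistics alone. That is the situation the abstract describes as prone to false correlations, and it is where the framework brings in GO categories as additional evidence.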
Cite this article
Sahu, S., Dholaniya, P.S. & Rani, T.S. Identifying the candidate genes using co-expression, GO, and machine learning techniques for Alzheimer’s disease. Netw Model Anal Health Inform Bioinforma 11, 10 (2022). https://doi.org/10.1007/s13721-021-00349-9
DOI: https://doi.org/10.1007/s13721-021-00349-9