Abstract
Alzheimer’s disease (AD) is a serious neurodegenerative disorder whose cause remains largely elusive. In recent years, genome-wide association (GWA) studies have provided an effective means for AD research. However, the univariate methods commonly used in GWA studies cannot effectively detect the biological mechanisms associated with the disease. In this study, we propose a new strategy for the GWA analysis of AD that combines random forests with enrichment analysis. First, backward feature selection using random forests was performed on a GWA dataset of AD patients carrying the apolipoprotein E ɛ4 allele (APOEɛ4), and 1058 susceptible single nucleotide polymorphisms (SNPs) were detected, including several known AD-associated SNPs. Next, the susceptible SNPs were investigated by enrichment analysis, and significantly associated gene functional annotations, such as ‘alternative splicing’, ‘glycoprotein’, and ‘neuron development’, were discovered, indicating that these biological mechanisms play important roles in the development of AD in APOEɛ4 carriers. These findings may provide insights into the pathogenesis of AD and helpful guidance for further studies. Furthermore, the strategy can easily be modified and applied to GWA studies of other complex diseases.
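To make the enrichment step concrete, the following minimal sketch computes an over-representation p-value for a single functional annotation using the upper tail of the hypergeometric distribution. This is a common choice for enrichment analysis, but the abstract does not state which test was used, so it should be read as an illustrative assumption rather than the authors' procedure. The function name, its parameters, and the toy counts are hypothetical.

```python
from math import comb


def enrichment_p_value(population_size, annotated_in_population,
                       sample_size, annotated_in_sample):
    """Upper-tail hypergeometric probability P(X >= annotated_in_sample).

    population_size         -- number of background genes (assumed input)
    annotated_in_population -- background genes carrying the annotation
    sample_size             -- genes mapped from the selected SNPs
    annotated_in_sample     -- selected genes carrying the annotation
    """
    # X cannot exceed either the sample size or the number of annotated genes.
    upper = min(annotated_in_population, sample_size)
    total = comb(population_size, sample_size)
    tail = sum(
        comb(annotated_in_population, k)
        * comb(population_size - annotated_in_population, sample_size - k)
        for k in range(annotated_in_sample, upper + 1)
    )
    return tail / total


# Toy numbers (not from the study): 20 background genes, 5 of them annotated;
# 5 genes selected, 3 of which carry the annotation.
p = enrichment_p_value(20, 5, 5, 3)
print(f"enrichment p-value: {p:.4f}")
```

In practice the test would be repeated for every annotation term and the resulting p-values corrected for multiple testing; that correction is omitted here to keep the sketch short.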
Additional information
This article is published with open access at Springerlink.com
Rights and permissions
Open Access This article is distributed under the terms of the Creative Commons Attribution 2.0 International License (https://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
About this article
Cite this article
Zou, L., Huang, Q., Li, A. et al. A genome-wide association study of Alzheimer’s disease using random forests and enrichment analysis. Sci. China Life Sci. 55, 618–625 (2012). https://doi.org/10.1007/s11427-012-4343-6
DOI: https://doi.org/10.1007/s11427-012-4343-6