Introduction

Alzheimer Disease

Alzheimer disease (AD) is the most common form of dementia in the elderly, accounting for up to 75 % of all dementia patients. Currently, there are more than 25 million people worldwide with AD, and their numbers are anticipated to double every 20 years because of the expected demographic shift toward higher age. Disease onset of AD is usually over 70 years of age. However, age-specific prevalence of about 4 % at age 65 increases exponentially with age and exceeds 20 % at age 90 [1]. AD is clinically characterized by a slow but progressive impairment in memory, executive function, language and other areas of cognition leading to a loss of social and occupational functions [2]. The pathological hallmarks of AD are extracellular amyloid β (Aβ) plaques and intracellular neurofibrillary tangles (NFT) that accumulate over time in the aging brain. According to the predominant amyloid hypothesis of AD, aggregation of Aβ initiates a pathogenic cascade that eventually leads to inflammation, loss of neurons and synapses, and brain atrophy [3, 4].

Genetics of Alzheimer Disease

The heritability of late-onset AD (LOAD) has been estimated to be between 58 and 79 % and can be regarded as the proportion of disease vulnerability explained by heritable genetic factors. The remaining risk for LOAD has been attributed to environmental factors, for instance, exposure to vascular risk (e.g., obesity) and possibly beneficial psychosocial factors such as high education and physical exercise [1, 5]. Besides LOAD, there are early onset forms of AD (EOAD) with disease onset prior to age 60 that represent up to 1 % of all AD cases at an average prevalence of about 65 EOAD cases per 100,000 individuals [6, 7]. Approximately 60 % of EOAD patients have several relatives also affected by AD, and 13 % of EOAD cases occur in families that are in concordance with autosomal-dominant inheritance of EOAD over at least three generations (ADEOAD) [6].

In the 1990s, genetic linkage and subsequent positional cloning studies in large, multi-generational ADEOAD families led to the discovery of the first causative AD mutations in the genes encoding the amyloid precursor protein (APP) [8, 9], presenilin 1 (PSEN1) [10, 11] and presenilin 2 (PSEN2) [12, 13]. Meanwhile, around 24 mutations in APP, 185 in PSEN1 and 13 in PSEN2 have been reported to be pathogenic for AD [14••] (see [15••] for review on clinical aspects of genetic variants in AD). Most of these extremely rare, familial variants are single amino acid substitutions and show dominant, fully penetrant co-segregation with ADEOAD. The majority of pathogenic variants in full-length APP are located near the N-terminal β-secretase and the C-terminal γ-secretase proteolytic cleavage site of amyloidogenic Aβ fragments of APP that are endogenously produced by cells [15••]. The presenilin proteins are the catalytic units of γ-secretase complexes that are involved in the proteolytic cleavage of APP to produce Aβ fragments. Mutations in the presenilins most often alter the proteins’ catalytic properties in a way that increases the absolute or relative production of most amyloidogenic Aβ42 fragments [16]. Thus, genetic and biochemical evidence for the predominant amyloid hypothesis of AD is convincing [17].

Early genetic linkage and genetic association studies also led to the identification of the two most prominent genetic risk variants for LOAD in exon 4 of the apolipoprotein E (APOE) gene, in terms of both their high population frequencies and large effect on LOAD risk [18, 19]. Depending on the population, the APOE ε4 LOAD risk allele typically occurs in about 15–20 % of individuals. Heterozygous carriers (ε3/ε4) for the APOE ε4 risk allele are threefold, and homozygous carriers (ε4/ε4) are up to 15-fold more likely to suffer from LOAD compared to individuals with the predominant APOE ε3/ε3 genotype. Homozygous ε4 carriers reach close to complete penetrance at age 90 and older. Moreover, the APOE ε4 risk allele is also associated with EOAD, an earlier disease onset of AD, and the rarer APOE ε2 allele is protective for AD [20, 21]. Multiple functions of the APOE protein have been suggested in the pathogenesis of AD. There is strong evidence for differential effects of APOE isoforms on Aβ aggregation and clearance. Additionally, proposed mechanisms of APOE with regard to AD involve neurotoxicity, tau phosphorylation, synaptic plasticity and neuroinflammation [22].

The advent of high-content genotyping array technology enabled genome-wide association studies (GWAS) in large cohorts of unrelated LOAD cases and unaffected aged controls. To date, large GWAS consortia in LOAD have identified >20 non-APOE loci that show association with LOAD [23, 24••]. Unlike APOE, these loci confer low individual but reproducible risk to LOAD with odds ratios (OR) between 1.08 and 1.30. Several of these LOAD susceptibility genes can be functionally linked to the pathways of APP and protein tau, are enriched for immune response and inflammation and involve cell migration, lipid transport, endocytosis, hippocampal synaptic function, cytoskeletal function, axonal transport, regulation of gene expression and post-translational modification of proteins, and microglial and myeloid cell function. The population-attributable fraction (PAF) of APOE on risk for LOAD is about 30 %, while each single GWAS-identified locus contributes an individual PAF between 1.0 and 8.0 % to the risk of or protection from LOAD [24••].
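To make the relation between effect size, exposure frequency and PAF concrete, the following is a minimal sketch based on Levin's formula, PAF = p(RR - 1) / (1 + p(RR - 1)). It treats the odds ratio as an approximation of the relative risk, which is an assumption, and the function name and the 30 % carrier frequency are illustrative choices, not values taken from [24••].

```python
def levin_paf(exposure_freq: float, relative_risk: float) -> float:
    """Population-attributable fraction by Levin's formula.

    exposure_freq: fraction of the population carrying the risk factor (0..1)
    relative_risk: risk in carriers relative to non-carriers; using an odds
                   ratio here is only an approximation (assumption).
    """
    if not 0.0 <= exposure_freq <= 1.0:
        raise ValueError("exposure_freq must be between 0 and 1")
    if relative_risk <= 0.0:
        raise ValueError("relative_risk must be positive")
    excess = exposure_freq * (relative_risk - 1.0)
    return excess / (1.0 + excess)


# Hypothetical carrier frequency of 30 %, combined with the lower and upper
# bounds of the GWAS odds ratios quoted in the text (1.08 and 1.30).
for odds_ratio in (1.08, 1.30):
    print(f"OR {odds_ratio:.2f}: PAF = {levin_paf(0.30, odds_ratio):.3f}")
```

With these illustrative inputs the PAF comes out at roughly 2 % and 8 %, the same order of magnitude as the per-locus values quoted above. A protective variant (relative risk below 1) yields a negative value under this formula.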

The genetics of sporadic LOAD are consistent with the amyloid hypothesis of AD, but seem to go far beyond the immediate APP pathway and are therefore complex. Moreover, the so far established susceptibility loci for LOAD do not completely explain the overall heritability of LOAD.

Rare Variants and Insights from Transcriptomics

GWAS successfully identified many susceptibility loci in LOAD, which is important for a better understanding of LOAD etiology. However, with the exception of APOE, GWAS-identified associations confer small risk to LOAD, and the SNPs that define GWAS loci most often merely represent genetic markers correlated with nearby functional risk variants that remain to be revealed. The discovery of rare disease susceptibility variants is challenging because their low abundance results in low statistical power to detect an association with disease in a genome-wide screening approach. Most LOAD GWAS so far have therefore only considered common variants with minor allele frequencies (MAFs) above 5 % in the general population. Missing heritability in AD could thus be explained by low-frequency (5 % > MAF > 1 %) and rare variants (MAF < 1 %) [25]. The first rare susceptibility variants for LOAD have been identified in the last 2 years, and they show intermediate to large effects, some even comparable to APOE [26••]. In the first part of this review, we focus on rare variant associations with LOAD that reached study-wide significance with independent replication, in chronological order of publication. In contrast to GWAS-identified common risk variants, pathogenic ADEOAD mutations and emerging rare susceptibility variants in LOAD are most often protein coding and therefore the actual functional variants. This opens an avenue of functional studies that can assess the molecular consequences of disease-associated variants in AD. The second part of this review concentrates on whole-transcriptomics studies from human post-mortem brain and peripheral blood cells (PBCs). We compare biological processes with differential gene expression in AD to pathways revealed by genetic studies and summarize the first studies on transcriptional profiles typical of AD risk allele carriers.
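The frequency classes used above, and the reason rare variants are hard to detect, can be illustrated with a short sketch. The function names are hypothetical, and the handling of frequencies of exactly 1 % or 5 % is an assumption, since the text does not define these boundaries. The example MAFs are the values quoted in this review for APP p.A673T, TREM2 p.R47H and PLD3 p.V232M.

```python
def classify_maf(maf: float) -> str:
    """Bin a minor allele frequency into the classes used in the text."""
    if not 0.0 <= maf <= 0.5:
        raise ValueError("a minor allele frequency lies between 0 and 0.5")
    if maf > 0.05:
        return "common"
    if maf > 0.01:
        return "low-frequency"
    # Boundary handling (exactly 1 % counted as rare) is an assumption.
    return "rare"


def expected_minor_alleles(n_individuals: int, maf: float) -> float:
    """Expected number of minor allele copies in a diploid sample."""
    return 2 * n_individuals * maf


# MAFs as quoted in this review (fractions, not percent).
variants = {
    "APP p.A673T": 0.004,
    "TREM2 p.R47H": 0.003,
    "PLD3 p.V232M": 0.004,
}
for name, maf in variants.items():
    copies = expected_minor_alleles(1000, maf)
    print(f"{name}: {classify_maf(maf)}, ~{copies:.0f} copies per 1,000 people")
```

A sample of 1,000 individuals is expected to carry only about six copies of an allele with a MAF of 0.3 %, which illustrates why single-variant tests for rare alleles require very large cohorts.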

Genetics of Rare Variants in Alzheimer Disease

Rare Variants in Previously Identified AD Genes

The Rare APP p.A673T Variant is Protective against LOAD

Next generation sequencing methods now allow for rapid and deep sequencing of many human genomes. Cruchaga et al. [27••] sequenced the coding region of ADEOAD genes in 440 probands of LOAD families and showed an overall overrepresentation of rare protein sequence changing variants in the genes APP, PSEN1 and 2 when compared to 12,500 population controls, while describing known EOAD mutations and novel, likely pathogenic variants in APP (Table 1).

Table 1 Overview of rare variants associated with late-onset Alzheimer disease (LOAD)

Jonsson et al. [28••] took advantage of whole-genome sequence data from 1,795 Icelanders to search for low-frequency variants in the APP gene. The genotypes of variants found at least twice were then imputed (in silico genotyped) into 71,700 Icelanders for whom high-density SNP array genotypes were available in order to test for association with LOAD. The rare APP p.A673T substitution (rs63750847) was found to confer a large protective effect from LOAD (OR = 4.2–7.5, depending on age and cognitive status of control groups) and was also associated with reduced cognitive decline among elderly subjects without a diagnosis of AD. The population frequency of APP p.A673T in northern Europe is around 0.4 % (MAF), whereas the protective allele seems to be even rarer in the US (MAF < 0.01 %). Functionally, p.A673T is located adjacent to the aspartyl protease β-site in APP and results in a reduced β-cleavage efficiency of the aspartyl protease β-site cleaving enzyme 1 (BACE1) and thereby in an about 40 % reduction of Aβ fragments, as the authors showed in vitro. Of importance, the strong protective effect of APP p.A673T serves as a proof of principle that reducing β-cleavage of APP may protect from AD.

Discovery of New LOAD Genes Through Rare Variants

Rare Variants in TREM2, e.g., p.R47H, Confer Risk to LOAD

Back to back, two independent groups reported a highly significant association with LOAD for the rare substitution p.R47H (rs75932628, MAF = 0.3 %) in the gene encoding the triggering receptor expressed on myeloid cells 2 (TREM2) [26••, 29]. This variant already reached genome-wide significant association (p < 5e-08) with LOAD in the large Icelandic study samples of Jonsson et al. [26••] described above, with subsequent replication in several LOAD cohorts of European descent. Interestingly, the rare allele of p.R47H confers a risk to LOAD (overall OR = 2.90) similar to that of the common APOE ε4 allele and likewise reduces the age of LOAD onset by about 3 years per risk allele copy. Reminiscent of Jonsson et al.’s finding that the protective APP p.A673T substitution reduces cognitive decline in elderly controls, TREM2 p.R47H accelerates cognitive decline in aged controls. Guerreiro et al. [29] independently reported highly significant association for p.R47H in large European LOAD GWA study samples, while also showing an overall significant accumulation of rare variants in exon 2 of the TREM2 gene in LOAD cases versus controls including additional variants such as p.D87N. Interestingly, they found that three variants (Q33X, Y38C and T66M) in the recessive state cause the rare Nasu-Hakola disease and related forms of early onset dementia distinct from AD [29]. TREM2 is a membrane protein that forms a receptor signaling complex with the TYRO protein tyrosine kinase-binding protein (TYROBP) and is involved in macrophage activation and inflammation. In the brain, TREM2 is mainly expressed in microglia of white matter. TREM2 expression and microglial phagocytosis of cell debris and amyloid concomitantly increase with the accumulation of Aβ plaques in transgenic mouse models of AD that carry pathogenic ADEOAD mutations in a human copy of the APP gene [26••, 29]. Thus, loss-of-function variants in TREM2 may interfere with Aβ clearance and anti-inflammatory responses, thereby increasing the risk for AD.
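The genome-wide significance threshold of p < 5e-08 quoted above is conventionally motivated as a Bonferroni correction of a 5 % error rate for roughly one million independent tests. The following minimal sketch shows that arithmetic; the function names and the example p-values are illustrative assumptions and are not taken from the cited studies.

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold under Bonferroni correction."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return alpha / n_tests


# Conventional assumption: about one million independent common-variant tests.
GENOME_WIDE = bonferroni_threshold(0.05, 1_000_000)


def is_genome_wide_significant(p_value: float) -> bool:
    """True if the p-value passes the conventional genome-wide threshold."""
    return p_value < GENOME_WIDE


# Hypothetical p-values, for illustration only.
for p in (1e-6, 3e-9):
    print(f"p = {p:.0e}: genome-wide significant = {is_genome_wide_significant(p)}")
```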

Rare Variants in PLD3, e.g., p.V232M, Confer Risk to LOAD

Cruchaga et al. [30] applied whole-exome sequencing in multiple multiplex families with a high burden of LOAD and identified the rare missense variant p.V232M (rs145999145, MAF = 0.4 %) in the phospholipase D3 (PLD3) gene that co-segregated with disease in two independent families. Subsequent genotyping of p.V232M in several large LOAD case-control cohorts of European descent (>11,000 individuals) confirmed the LOAD risk-conferring character of the variant allele with high significance and intermediate effect size (OR = 2.1). The effect was larger when only familial LOAD cases were compared against controls (OR = 3.4). Moreover, the risk allele also showed an association with an earlier onset of LOAD. The authors then sequenced the coding region of PLD3 in more than 2,000 LOAD cases and as many controls of European descent and found a genome-wide significant gene-based association between LOAD and several rare variants in the PLD3 gene (OR = 2.6). Nominally significant single-variant associations with intermediate to large effect could be shown for PLD3 p.M6R and the synonymous splice site variant p.A442A. The gene-based association of rare PLD3 variants being overrepresented in LOAD cases was replicated in an African American case-control cohort, as was the single-variant association for p.A442A. PLD3 expression is high in several AD-relevant brain regions in healthy controls, but reduced in neurons of LOAD patients. PLD3 overexpression and knockdown experiments in cell cultures revealed that high PLD3 expression correlates with lower extracellular Aβ levels and that PLD3 protein can be co-immunoprecipitated with APP. Thus, PLD3 protein is likely protective against AD through its role in APP trafficking [30].
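A gene-based association of this kind is often run as a collapsing (burden) test: each individual is scored as carrier or non-carrier of any qualifying rare variant in the gene, and carrier counts are compared between cases and controls. The sketch below is one simple form of that idea, using an odds ratio and a one-sided Fisher's exact test built from the standard library. It is not necessarily the method used in [30], and the carrier counts are invented for illustration.

```python
from math import comb


def odds_ratio(case_carriers, case_total, ctrl_carriers, ctrl_total):
    """Odds ratio of carrying a qualifying variant, cases vs. controls."""
    a = case_carriers
    b = case_total - case_carriers
    c = ctrl_carriers
    d = ctrl_total - ctrl_carriers
    if b == 0 or c == 0:
        raise ZeroDivisionError("odds ratio undefined for these counts")
    return (a * d) / (b * c)


def fisher_enrichment_p(case_carriers, case_total, ctrl_carriers, ctrl_total):
    """One-sided Fisher's exact p-value for carrier enrichment in cases.

    Sums the hypergeometric probabilities of observing at least
    `case_carriers` carriers among the cases, given fixed table margins.
    """
    carriers = case_carriers + ctrl_carriers
    total = case_total + ctrl_total
    denom = comb(total, case_total)
    numer = 0
    for k in range(case_carriers, min(carriers, case_total) + 1):
        numer += comb(carriers, k) * comb(total - carriers, case_total - k)
    return numer / denom


# Hypothetical counts: carriers of any rare variant in one gene.
cases = (40, 2000)     # (carriers, total sequenced cases)
controls = (16, 2000)  # (carriers, total sequenced controls)
print(f"OR = {odds_ratio(*cases, *controls):.2f}")
print(f"one-sided p = {fisher_enrichment_p(*cases, *controls):.2e}")
```

Collapsing carriers per gene trades single-variant resolution for power, which is why a gene can reach significance even when its individual rare variants are only nominally significant.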

Transcriptomics in Alzheimer Disease

EOAD and LOAD have similar clinical manifestations and pathological features [31]. This suggests that similar cellular and biological processes are disrupted in both forms of AD. This notion is also supported by genetics of AD, most recently complemented by rare variant associations as mentioned above. Genome-wide gene transcription is a measurable intermediate proxy of how the genomic sequence gives rise to the altered protein formation that eventually triggers disease and reflects disease progression and cellular coping mechanisms to pathological changes. There are two primary means of measuring genome-wide transcription: Microarray technology and massively parallel RNA sequencing (RNA-seq). While microarrays use hybridization to measure levels of known transcripts, the recent advent of RNA-seq allows for measurement of known and novel transcripts, including alternatively spliced transcripts. Both of these technologies have been used to shed light on gene transcriptional changes related to AD pathology.

Profiling in Post-Mortem Brain Tissue

Since 2005, at least 25 studies have been published examining post-mortem human brain tissue. These studies primarily focused their efforts on tissue from the frontal cortex, hippocampus and temporal lobe because these regions are most affected by AD pathology, essentially Aβ plaques and NFTs [32–50••]. These studies differ in their findings on which individual genes are significantly altered between AD patients and healthy controls; however, several biological processes are consistently indicated in AD. The most recent and largest studies describe several hundreds of differentially expressed genes in AD after correction for multiple testing [50••, 51••]. Although post-mortem studies might also reflect pathological changes that are a consequence rather than a cause of the disease, the identified biological processes do overlap with recent genetic findings (Table 2).
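Correction for multiple testing matters here because each transcriptome study tests thousands of transcripts at once. As a minimal sketch, the Benjamini-Hochberg procedure below adjusts a list of per-gene p-values to control the false discovery rate. The cited studies may have used other corrections, and the gene labels and p-values are invented for illustration.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical per-gene p-values from a differential expression test.
genes = ["gene_A", "gene_B", "gene_C", "gene_D"]
raw_p = [0.01, 0.04, 0.03, 0.005]
for gene, p, q in zip(genes, raw_p, benjamini_hochberg(raw_p)):
    print(f"{gene}: p = {p:.3f}, adjusted = {q:.3f}")
```

Genes whose adjusted value falls below the chosen false discovery rate, for instance 5 %, would be reported as differentially expressed.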

Table 2 Overview of biological processes altered in Alzheimer disease (AD) from transcriptome studies since 2005

Processes with Increased Expression

Overall gene expression in post-mortem brain tissue of AD patients is generally lowered compared to controls. However, gene expression in the following biological processes is upregulated in brains of AD subjects. Inflammation has been associated with AD since the 1980s. Several studies demonstrated that genes related to inflammation have increased expression in AD across several brain regions [32, 43, 46, 49, 50••, 52, 53]. It is still unknown whether inflammation is the culprit, the result or a secondary response of AD; however, it is important to note that five (CR1, CD33, HLA-DRB5–DRB1, INPP5D, MEF2C) of the 20 LOAD GWAS-identified genes are involved in inflammation [24••] (Table 2). Increased calcium signaling was also observed across many transcriptome studies [36, 48, 49, 54]. Studies in neurons and mice expressing human APP and presenilin genes harboring ADEOAD mutations also show altered calcium signaling and calcium storage in the endoplasmic reticulum (ER) as well as synaptic dysfunction and loss of dendritic spines [54]. Further, cellular processes involved in mitochondrial and metabolic functions were shown to have increased gene expression in several transcriptome studies [42, 44, 49]. Mitochondrial function is impaired by APP, Aβ and presenilins [55]. Moreover, PET scans report a decrease in resting-state brain glucose metabolism and metabolic failure in AD brains [56]. The expression of genes related to cytoskeletal architecture is also increased in AD; this is consistent with the tau hypothesis of AD [57]. Microtubules are a major component of the cytoskeleton, and the formation of NFTs from hyper-phosphorylated and misfolded tau increasingly depletes microtubules in AD [58]. Another cytoskeletal process increased in AD patients is the formation of cofilin-actin rods along axons and dendrites, which results in cellular disruption. Cofilin-actin rods are known to form in response to heat shock, osmotic pressure and ATP rundown within the hippocampus and frontal cortex of AD patients [59].

Processes with Decreased Expression

Synaptic-related processes are decreased in AD [36, 48, 49, 54]. While Aβ plaques and oligomers indirectly destroy synapses, aggregation of NFTs in neurons results in apoptosis and inadvertently destroys synapses [60, 61]. Normal neurons remain in the G0 phase; however, most AD neurons re-enter the cell cycle into the G1 phase. This departure from the normal cell cycle in neurons results in axonal defects. These findings are congruent with the decrease in synaptic-related processes and the finding that negative regulation of cell cycle processes in AD is decreased [50••, 62]. Moreover, signal transduction is also decreased in AD [63], particularly insulin signaling [64–66]. Lastly, genes involved in myelination are also decreased in LOAD [50••, 67]. This finding corresponds to studies demonstrating that brain regions with the most myelination are the most vulnerable to AD pathology and that Aβ plaques form retroactively to the developmental progression of myelination in the brain [68].

Profiling in Peripheral Tissue

Studies have attempted to find expression profiles specific to AD in peripheral blood leukocytes to serve as biomarkers. Apoptosis is increased in PBCs of patients with AD [69]. Chemokine and cytokine signaling processes, which are both heavily involved in inflammation, also show increased expression in AD [53, 70, 71]. This is in line with a general increase in inflammation in response to apoptosis [72, 73]. Moreover, increased expression of inflammatory genes in PBCs has been associated with dementia and is thought to be triggered by progressing AD pathology [74]. Another response to inflammation observed in AD is increased expression of TGF-β [75–77]. Decreasing the expression of TGF-β within innate immune cells mitigates AD symptoms, as reflected by improved spatial memory and increased Aβ phagocytosis [78].

Most profiles that examined peripheral blood leukocytes in AD observed overall decreases in expression. In contrast to transcriptional profiles in brain tissue, peripheral blood leukocyte profiles show decreased expression of genes involved in cell structure-related processes in AD [75, 79]. This finding has been explained by increased apoptosis observed in the peripheral blood of AD patients. Similar to brain transcriptome studies, cellular signaling, lipid rafts and cholesterol-related processes are decreased in AD [77]. Two proteins responsible for lipid transport, APOE and APOJ (alias CLU) are genetically associated with AD, and the LOAD associated alleles of both APOE and CLU result in a decrease of lipid transportation [24••, 51••]. Moreover, lipid transport is reduced in patients with AD [76]. Additionally, AD patients had decreased expression of ATP-binding cassette transporters (ABC transporters), transporters that utilize ATP and carry out different processes within the cell. Seven ABC transporters have been directly linked to AD through functional studies or GWAS, including ABCA7 [80]. The cellular processes altered in the blood parallel those disrupted in the brain. PBC gene expression profiling in AD points to processes that are disrupted across the body and can potentially serve as a biomarker for AD.

Single-Cell Profiling

Aβ plaques are extracellular, whereas NFTs are intracellular deposits typical of AD pathology. According to the tau hypothesis of AD, abnormal hyper-phosphorylation of the microtubule-associated protein tau (MAPT) leads to neurotoxic aggregates of tau, the formation of intracellular NFTs, the disintegration of microtubules, the collapse of neuronal transport and finally cell death [57]. To understand the transcriptional responses of neurons affected by intracellular NFTs, several studies applied single-cell transcriptomic profiling. Similar to post-mortem brain transcriptomics, there was an overall decrease of gene expression within neurons of AD patients with versus without NFTs [35, 62, 81]. Genes involved in cell cycle, cell signaling, cytoskeleton, mitochondria and metabolism were decreased in NFT-positive AD neurons [35, 62, 81]. In addition, most cellular processes with increased gene expression in post-mortem brain studies also show increased expression within neurons affected by NFTs, such as inflammation and mitochondrial dysfunction [35, 62, 81]. Intriguingly, an increase in vesicle-mediated transport in single neurons was observed prior to NFT development [62]. Defects in axonal transport were also observed in neurons prior to NFT formation. This increase in vesicle-mediated transport might be explained by tau oligomer toxicity prior to NFT formation, but might also be mediated by concomitant Aβ toxicity or neuroinflammation.

Impact of AD-Related Variants on Transcriptional Profiles

To understand how human genetic disease variants impact AD pathogenesis, transcriptomics has been applied either to humanized transgenic cell and animal models or to AD patient-derived cells with and without variant allele status. Nagasaka et al. [82] demonstrated that single causative ADEOAD mutations (APP p.K595N/M596L, p.E693G and PSEN1 p.H163Y) significantly impact transcriptional profiles. The authors compared transcriptomic profiles of cultured fibroblasts from AD patients carrying an ADEOAD mutation with profiles from unaffected siblings who were non-carriers. While the levels of APP and PSEN1 were comparable between fibroblasts of mutation carriers and non-carriers, up to 200 genes were differentially expressed between the groups, but showed similar profiles among AD-affected mutation carriers.

Transcriptional profiles of APOE ε4 AD risk allele carriers differed greatly when compared to non-risk allele carriers [49, 52]. Xu et al. [49] compared hippocampal gene expression of AD patients with the APOE ε4/ε4 genotype versus patients with the APOE ε3/ε3 genotype and found increased gene expression in processes such as cell growth, protein modification and RNA binding/editing, whereas gene expression was lowered in stress response, ER-Golgi transport and mitochondrial oxidative phosphorylation. Expression differences were also observed between APOE ε4 carriers and non-APOE ε4 carriers when examining expression profiles in subjects with mild cognitive impairment (MCI) [52]. Similar to the Xu et al. study, genes involved in MHC class II protein complex, cell-matrix adhesion and cell growth had increased expression in APOE ε4 carriers, whereas genes involved in processes such as mitochondrial electron transport, microtubule, synaptic and nucleosome assembly were downregulated [52].

Using post-mortem brain tissue, Rhinn et al. [51••] constructed gene expression networks to examine how APOE alleles influence gene network interactions in AD patients and healthy controls with different APOE risk genotypes. Interestingly, transcriptional profiles of non-demented APOE ε4 AD risk allele carriers already most resembled those of subjects with a diagnosis of LOAD when compared to patients with neurological diseases other than AD. Transfection of N2a-APP cells with human APOE ε4 alleles increased Aβ40 and Aβ42 levels, whereas transfection with APOE ε3 or ε2 alleles did not. Moreover, their analyses identified six genes (RNF219, SV2A, HDLBP, ROGDI, CALU and PTK2B) that exclusively interacted with the APOE ε4 allele, but not with the other APOE alleles. Importantly, knockdown of these genes in APOE ε4 allele-transfected cells resulted in decreased Aβ40 and Aβ42 levels, but had no effect on Aβ levels in cells transfected with alternative APOE alleles.

Conclusions

Three different study approaches have so far led to the successful identification of rare variants in LOAD: (1) large-scale sequencing of autosomal-dominant early onset Alzheimer disease (ADEOAD) genes in a case-control design with subsequent association testing in several even larger case-control cohorts followed by functional studies, (2) an analogous unbiased, genome-wide sequencing approach, and (3) a combined approach of sequencing and co-segregation analyses in several families enriched for LOAD, also in conjunction with subsequent large association and molecular studies. Importantly and for the first time, strategy 1 showed genetic association between LOAD and a variant in the ADEOAD gene APP. Notably, the strong protective effect of the newly discovered rare APP p.A673T variant serves as a proof of principle that reducing β-cleavage of APP may protect from AD. Strategies 2 and 3 led to the identification of completely novel LOAD susceptibility genes (TREM2, PLD3) that were not implicated by common GWAS variants. Given the complex nature of LOAD, it is likely that additional, yet unknown rare variant associations, also with intermediate to large risk to or protection from LOAD, will be revealed in the near future. Furthermore, biological processes defined by genes with causative mutations, rare and common susceptibility variants in AD overlap with processes indicated by human whole-transcriptomic studies in AD examining post-mortem brain, PBC and single neurons. Genes commonly upregulated in AD involve processes such as mitochondrial function, inflammation, calcium signaling and cytoskeletal organization, whereas gene expression in synaptic functions and signal transduction is reduced in AD. PBC transcriptional profiles reflect those obtained from brain tissues of AD patients. Thus, PBC profiling could become a practical biomarker for AD diagnosis and disease progression monitoring [83]. Moreover, transcriptome profiles of pre-symptomatic and AD-affected pathogenic AD mutation or APOE risk allele carriers both reflect transcriptional changes reminiscent of those of LOAD patients. As more rare variants are related to AD, transcriptional profiling in AD variant carriers will likely give variant-specific insight into molecular mechanisms involved in LOAD pathogenesis, which might provide an avenue for personalized medicine.