Introduction

Parkinson’s disease (PD) is a neurodegenerative disease characterized by rest tremor, rigidity, bradykinesia, and postural instability [57]. It is the most common neurodegenerative movement disorder, affecting more than six million people worldwide, with its prevalence projected to double in the next several decades [29]. Aggregated and phosphorylated alpha-synuclein (α-Syn) is the main protein component of Lewy bodies (LB) and neurites, the pathological hallmark of Lewy body diseases. The gene dosage effect of the SNCA gene, which encodes α-Syn, correlates with cerebrospinal fluid (CSF) α-Syn levels and a more severe PD phenotype [30, 55]. Common variants in the SNCA promoter are among the top genome-wide association studies (GWAS) signals for PD [54], suggesting that genetic control of CSF α-Syn level plays a role in PD phenotype variability. A modest but significant decrease (~ 10% to 15%) in CSF α-Syn levels has been reported in PD cases compared to controls [33] and is correlated with disease progression [1, 13, 35]. CSF α-Syn is not currently used as a clinical biomarker [35, 53], but is a proxy for pathological brain α-Syn accumulation [64]. Therefore, identifying genetic modifiers of CSF α-Syn levels could provide insight into PD pathogenesis. To date, genetic modifiers of CSF α-Syn remain unknown.

The α-Syn accumulation in specific brain regions defines different subtypes of Lewy body diseases (LBD). However, pure α-Syn pathology is only found in 45% (brainstem), 32% (limbic) and 19% (neocortical) of LBD. The concomitant presence of amyloid beta (Aβ), tau, and TDP-43 is a common finding in LBD: Aβ and tau pathology are present in up to 80% and 53% of neocortical LBD cases, respectively [58]. LBD patients with concomitant Alzheimer’s disease (AD) pathology exhibit a faster cognitive decline [44]. CSF levels of amyloid beta1–42 (Aβ42), total tau (t-tau), and phosphorylated tau181 (p-tau181) are commonly used as proxies of Aβ and tau pathology in the brain [8]. A correlation between lower Aβ42 CSF levels and higher Braak stage scores of AD neuropathology was found in neuropathologically confirmed LBD cases [8]. In cross-sectional studies, PD cases exhibit lower CSF levels of Aβ42 compared to age- and gender-matched healthy controls [14, 52]. CSF levels of Aβ42 and t-tau are also associated with the progression of cognitive decline [52]. Decreased CSF Aβ42 levels predict the development of dementia in PD patients [47, 63]. These results suggest that dementia-associated CSF biomarker profiles could be informative of brain pathology in PD patients. GWAS using CSF Aβ42, t-tau, and p-tau181 levels as quantitative traits have identified genes involved in AD pathogenesis [24]. However, the role of genetic modifiers of dementia CSF biomarkers in PD has not yet been systematically evaluated.

This study aimed to uncover genetic modifiers of α-Syn, Aβ42, t-tau, and p-tau181 CSF levels in PD patients by performing a large (N = 1960) GWAS meta-analysis of CSF biomarkers in PD cohorts. Polygenic risk scores (PRS) and Mendelian randomization (MR) analyses were integrated with the latest PD risk meta-analysis (META-PD) and CSF biomarker summary statistics to examine the causal relationship between CSF biomarkers and PD risk. This is the first comprehensive analysis of CSF biomarkers using GWAS, PRS, and MR in PD.

Materials and methods

Study design

The goal of this study was to identify common genetic variants and genes associated with CSF α-Syn, Aβ42, t-tau, and p-tau181 in PD. A three-stage GWAS was used: discovery, replication, and meta-analysis. The discovery phase included 729 individuals from the Protein and Imaging Biomarkers in Parkinson’s disease study (PIB-PD) at the Washington University Movement Disorder Center [9] (n = 103) and the Knight ADRC [24] (n = 626). The replication phase included 1231 independent CSF samples obtained from PD cases and healthy elderly individuals from three additional studies [the Parkinson’s Progression Markers Initiative (PPMI), Alzheimer’s Disease Neuroimaging Initiative (ADNI), and Spain]. Meta-analyses were performed using a fixed-effects model. Genetic loci that passed the multiple test correction for GWAS (p < 5.0 × 10−8) were functionally annotated using bioinformatics tools to identify variants and genes driving the GWAS signal. PRS were used to test the correlation between CSF biomarkers and PD genetic architecture. Instrumental variables were selected from the summary statistics of CSF biomarkers, and MR methods were applied to test causality.
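The fixed-effects meta-analysis step can be illustrated with a minimal sketch. It assumes the inverse-variance weighting scheme, in which each cohort contributes its per-SNP effect estimate weighted by the inverse of its squared standard error; the text does not state which weighting scheme was configured, so this is an illustration rather than the procedure actually run. The function name and the input values in the usage note are hypothetical.

```python
import math

def fixed_effects_meta(betas, ses):
    """Inverse-variance-weighted fixed-effects estimate for one SNP.

    betas, ses: per-cohort effect estimates and standard errors
    (hypothetical inputs; the weighting scheme is an assumption).
    """
    weights = [1.0 / se ** 2 for se in ses]
    total_weight = sum(weights)
    beta = sum(w * b for w, b in zip(weights, betas)) / total_weight
    se = math.sqrt(1.0 / total_weight)
    z = beta / se
    # Two-sided p-value under the standard normal distribution
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return beta, se, z, p
```

For two hypothetical cohorts with effects 0.2 and 0.4 and equal standard errors of 0.1, the pooled effect is 0.3 and the pooled standard error is smaller than either cohort's alone.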

Cohorts/datasets

This cross‐sectional multicenter study was performed using 1960 samples from non-Hispanic white (NHW) individuals from four cohorts: Washington University in Saint Louis (WUSTL) (N = 729), PPMI (N = 785), the University Hospital Mutua Terrassa (Spain, N = 130) and ADNI (N = 316). Cohorts included 700 clinically diagnosed PD cases, 564 controls, and 386 clinically diagnosed AD cases. The remaining N = 310 individuals did not exhibit symptoms of neurodegenerative disease (Table 1 and Additional file 2: Table S1). PD clinical diagnoses were based on the UK Brain Bank criteria [39]. Clinical, biomarker, and genetic data from the PPMI and ADNI were obtained from the corresponding data repositories (www.ppmi-info.org and http://adni.loni.usc.edu/), accessed most recently in April 2019. The demographic characteristics of some of those cohorts have been published previously [4, 21, 28]. PPMI is a prospective study with ongoing recruitment. CSF samples were obtained at baseline (N = 510), 6 months (N = 385), and yearly after enrolment (N1stYear = 428, N2ndYear = 404, and N3rdYear = 320). CSF α-Syn, Aβ42, t-tau, and p-tau181 were available for all the mentioned time points in the PPMI cohort.

Table 1 Summary demographics for the individuals with CSF measurements available

Biomarker measurements

α-Syn in CSF was measured in 107 samples from the WUSTL cohort [9] and the entire PPMI cohort, using a commercial ELISA kit (Covance, Dedham, MA) [45]. The additional samples (N = 622) from WUSTL were quantified using the SOMAScan platform (See below). Aβ42, t-tau, and p-tau181 were quantified using the INNOTEST assay (WUSTL) and xMAP-Luminex with INNOBIA AlzBio3 (PPMI). The immunoassay platform from Roche Elecsys cobas e 601 was used in the ADNI cohort to quantify all four biomarkers. ELISA assays from Euroimmun (Germany) were used in the Spanish cohort to measure the CSF levels of α-Syn, Aβ42, t-tau, and p-tau181. The α-Syn levels were normalized by log10 transformation. Aβ42, t-tau, and p-tau181 values were normalized and standardized by the Z score transformation. Individuals with biomarker levels outside three standard deviations of the mean were removed from the analysis (Table 1).
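The normalization steps described above can be sketched as follows. This is a minimal illustration under stated assumptions: the order of operations (outlier removal before standardization) and the use of the sample standard deviation are not specified in the text, and the function names are hypothetical.

```python
import math
import statistics

def log10_transform(values):
    """log10 transformation, as applied to alpha-Syn levels."""
    return [math.log10(v) for v in values]

def drop_outliers(values, n_sd=3.0):
    """Keep values within n_sd sample standard deviations of the mean."""
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)
    return [v for v in values if abs(v - mean) <= n_sd * sd]

def zscore(values):
    """Z score standardization, as applied to Abeta42, t-tau, and p-tau181."""
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)
    return [(v - mean) / sd for v in values]
```

In practice these transformations would be applied within each cohort and platform, since raw values from different assays are not on a common scale.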

Amyloid beta imaging

[11C]-Pittsburgh Compound B (PIB) acquisition and analysis were performed according to published methods [34]. Briefly, 10–15 mCi of the radiotracer was injected via an antecubital vein, and a 60-min three-dimensional dynamic PET scan was collected in 53 frames. Emission data were corrected for scattering, randoms, attenuation, and dead time. Image reconstruction produced images with a final resolution of 6 mm full-width half-maximum at the center of the field of view. Frame alignment was corrected for head motion and co-registered to each person’s T1-weighted magnetization-prepared rapid gradient echo magnetic resonance scan [61]. For quantitative analyses, three-dimensional regions of interest (prefrontal cortex, gyrus rectus, lateral temporal cortex, precuneus, occipital lobe, caudate nucleus, brainstem, and cerebellum) were created by a blinded observer for each subject based on the individual’s MRI scans, with boundaries defined as previously described [50]. Binding potentials (BPND) were calculated using Logan graphical analysis, with the cerebellum as the reference tissue input function [49, 50]. Mean cortical binding potentials (MCBP) were calculated for each subject as the average of all cortical regions except the occipital lobe.

Neuropathologic analysis

The neuropathological analysis was done at WUSTL, as previously reported [47]. Briefly, brains were fixed in 10% neutral buffered formalin for 2 weeks. Paraffin-embedded sections were cut at 6 μm. Blocks were taken from the frontal, temporal, parietal, and occipital lobes (thalamus, striatum, including the nucleus basalis of Meynert, amygdala, hippocampus, midbrain, pons, medulla oblongata) and the cervical spinal cord. Histologic stains included hematoxylin–eosin and a modified Bielschowsky silver impregnation. The Alzheimer’s disease pathologic changes were rated using an amyloid plaque stage (range, 0 to A–C) [7] and diffuse and neuritic plaques were also assessed. Cases were classified according to the neuropathologic criteria of Khachaturian [46], the Consortium to Establish a Registry for Alzheimer Disease (CERAD) [51] and NIA-Reagan [18].

Genotyping

All cohorts, except PPMI, were genotyped using the Global Screening Array (GSA) Illumina platform. Genotyping quality control and imputation were performed using SHAPEIT [23] and IMPUTE2 [38] with the 1000 Genomes as a reference panel. Single nucleotide polymorphisms (SNPs) with a call rate lower than 98% and autosomal SNPs that were not in Hardy–Weinberg equilibrium (p < 1.0 × 10−06) were excluded from downstream analyses. The X chromosome SNPs were used to determine sex based on heterozygosity rates. Samples in which the genetically inferred sex was discordant with the reported sex were removed. Whole-genome sequence data from the PPMI cohort were merged with imputed genotyped data; only variants present in both files were included in further analyses. Pairwise genome-wide estimates of proportion identity-by-descent were used to test for the presence of unexpected duplicates and cryptically related samples (Pihat > 0.50). Unexpected duplicates were removed; for cryptically related samples, the sample with the higher genotyping rate in the merged file was kept. Finally, principal components were calculated using HapMap as an anchor. Only samples of European descent with an overall call rate higher than 95%, and variants with minor allele frequency (MAF) greater than 5%, were included in the analyses.
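Two of the variant-level filters above (call rate and MAF) can be sketched in a few lines. The sketch assumes genotypes coded as alternate-allele counts (0, 1, or 2) with None for missing calls; the function names are hypothetical, and the Hardy–Weinberg test is deliberately omitted.

```python
def call_rate(genotypes):
    """Fraction of samples with a non-missing genotype call."""
    called = [g for g in genotypes if g is not None]
    return len(called) / len(genotypes)

def minor_allele_frequency(genotypes):
    """MAF from alternate-allele counts (0, 1, 2), ignoring missing calls."""
    called = [g for g in genotypes if g is not None]
    alt_freq = sum(called) / (2 * len(called))
    return min(alt_freq, 1.0 - alt_freq)

def passes_snp_qc(genotypes, min_call_rate=0.98, min_maf=0.05):
    """Apply the call-rate and MAF thresholds stated in the text.

    The call rate is checked first, so a variant with no calls at all
    is rejected before the MAF is computed.
    """
    return (call_rate(genotypes) >= min_call_rate
            and minor_allele_frequency(genotypes) > min_maf)
```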

Single variant analysis

The three-stage single variant analysis was performed due to differences in time and platform for biomarker quantification. PLINK1.9 [16, 56] was used to perform the analysis of each cohort independently. A linear model using the normalized and standardized CSF levels and corrected by sex, age, and the first two principal components, was used. Disease status was not included in the model [25]. Then, the results for each protein were meta-analyzed using METAL [69]. For the α-Syn analyses, the WUSTL cohort was divided into two subsets based on the quantification method (ELISA or SOMAscan).
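The per-variant linear model can be sketched as an ordinary least squares fit of the biomarker level on the allele dosage plus the covariates named above (sex, age, and two principal components). This is a self-contained illustration written with the standard library only; it is not the PLINK implementation, and all function names are hypothetical.

```python
import math

def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x

def ols(design, y):
    """Ordinary least squares; returns coefficients and standard errors."""
    n, k = len(design), len(design[0])
    xtx = [[sum(row[i] * row[j] for row in design) for j in range(k)]
           for i in range(k)]
    xty = [sum(row[i] * yi for row, yi in zip(design, y)) for i in range(k)]
    beta = solve_linear_system(xtx, xty)
    resid = [yi - sum(b * x for b, x in zip(beta, row))
             for row, yi in zip(design, y)]
    sigma2 = sum(r * r for r in resid) / (n - k)
    ses = []
    for i in range(k):
        # i-th diagonal element of (X'X)^-1, from solving against a unit vector
        unit = [1.0 if j == i else 0.0 for j in range(k)]
        ses.append(math.sqrt(sigma2 * solve_linear_system(xtx, unit)[i]))
    return beta, ses

def snp_association(dosage, level, sex, age, pc1, pc2):
    """Additive-model effect of one SNP on a normalized CSF level."""
    design = [[1.0, g, s, a, p1, p2]
              for g, s, a, p1, p2 in zip(dosage, sex, age, pc1, pc2)]
    beta, ses = ols(design, level)
    return beta[1], ses[1]  # effect and standard error for the SNP term
```

The returned effect and standard error for each cohort are the quantities that a subsequent meta-analysis would combine.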

Analysis of variance

The genome-wide complex traits analysis (GCTA) software [71] was used to calculate the amount of variance explained by the APOE locus. GCTA estimates the phenotypic variance explained by genetic variants for a complex trait by fitting the effect of these SNPs as random effects in a linear mixed model.

Multi-tissue analysis

The levels of α-Syn were measured in CSF, plasma, and brain (parietal cortex) using an aptamer-based approach (SOMAScan platform) [70]. After stringent quality control, CSF (n = 835), plasma (n = 529), and brain (n = 380) samples were included in the downstream analyses (Additional file 2: Table S2). The protein level was log10-transformed to approximate the normal distribution and used as the phenotype for the subsequent GWAS. The single variant analysis was performed in each tissue independently using PLINK1.9 [56]. A multi-tissue analysis using the multi-trait analysis of GWAS (MTAG) [66] was applied to increase the power to detect non-tissue-specific protein quantitative trait loci for α-Syn. MTAG calculates the trait-specific effect estimate for each tissue separately and then performs a meta-analysis while accounting for sample overlap. Measurements of Aβ42, t-tau, and p-tau181 were not available in different tissues.

Polygenic risk score

PRS is constructed by summing all trait-associated alleles in a target sample (META-PD and CSF biomarkers separately), weighted by the effect size of each allele in a base dataset using different p-value thresholds. SNPs in linkage disequilibrium (LD) are grouped together to avoid giving extra weight to a single marker. The optimal threshold is considered the one that explains the maximum variance in the target sample. The association was tested using the default parameters and nine p-value cutoffs. The PRSice2 software [17] was used to calculate the PRS. Longitudinal measures of CSF α-Syn, Aβ42, t-tau, and p-tau181 were available for the PPMI cohort. A simplified PRS (detailed below) was used to test if the genetic architecture of PD was predictive of biomarker level progression. The PD PRS using sentinel SNPs from the META-PD [54] was modeled using the method previously described [19, 41, 42, 68]. Briefly, only genetic variants corresponding to the top hit on each GWAS locus (also known as sentinel SNP) available in the dataset with a minimum call rate of 85% were included in the PRS. If the sentinel SNP was not available, a proxy with R2 > 0.90 was used. The weight of each variant was calculated using the binary logarithm transformation of the reported odds ratios. The final PRS is the sum of the weighted values for the alternate allele of all the sentinel SNPs.
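The simplified sentinel-SNP score can be written as a weighted sum of alternate-allele dosages, with the binary logarithm of the reported odds ratio as the weight. The sketch below is illustrative only: the function name and SNP identifiers are hypothetical, and proxy substitution and call-rate filtering are assumed to have been done beforehand.

```python
import math

def sentinel_prs(dosages, odds_ratios):
    """Weighted sum of alternate-allele dosages over sentinel SNPs.

    dosages: {snp_id: alternate-allele count (0, 1, or 2)} for one person.
    odds_ratios: {snp_id: reported odds ratio for the alternate allele}.
    SNPs absent from the genotype data are skipped.
    """
    score = 0.0
    for snp, odds_ratio in odds_ratios.items():
        if snp in dosages:
            score += dosages[snp] * math.log2(odds_ratio)
    return score
```

For example, a person carrying two copies of an allele with an odds ratio of 2.0 and one copy of an allele with an odds ratio of 0.5 would score 2 × 1 + 1 × (−1) = 1.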

Mendelian randomization

MR requires that the genetic instruments are associated with the modifiable exposure of interest (GWAS of CSF biomarkers), and any association between the instruments and the outcome (PD risk) is mediated by the exposure [11]. A two-sample MR was used to estimate causal effects using the Wald ratio for single variants along with an inverse-variance–weighted (IVW) fixed-effects meta-analysis for an overall estimate [36]. The IVW estimate is the inverse variance weighted mean of ratio estimates from 2 or more instruments. Two-sample MR provides an estimate of the causal effect of an exposure on an outcome, using independent samples to obtain the gene-exposure and gene-outcome associations, provided three key assumptions: (i) genetic variants are robustly associated with the exposure of interest (i.e. replicate in independent samples), (ii) genetic variants are not associated with potential confounders of the association between the exposure and the outcome and (iii) there are no effects of the genetic variants on the outcome, independent of the exposure (i.e. no horizontal pleiotropy). To account for potential violations of the assumptions underlying the IVW analysis, a sensitivity analysis using MR-Egger regression and the weighted median estimator was performed [36]. MR Egger regression consists of a weighted linear regression of SNP META-PD against SNP biomarker effect estimates. Assuming that horizontal pleiotropic effects and SNP exposure associations are uncorrelated (i.e., the instrument strength independent of direct effects assumption), MR Egger regression provides a valid effect estimate even if all SNPs are invalid instruments. Moreover, the MR Egger intercept can be interpreted as a test of overall unbalanced horizontal pleiotropy because one would expect a null y-intercept (i.e., the mean value of the SNP META-PD associations when the SNP biomarker association is zero) if there are no horizontal pleiotropic effects. 
Robust regression, which downweights the contribution of instrumental variables with heterogeneous ratio estimates to the causal estimate, was also performed [10, 12]. Heterogeneity (i.e., instrument strength) was tested using the I2 statistic. The I2 statistic, rather than the F statistic, is a better indicator of instrument strength for the two-sample summary data approach [6]. The R package “MendelianRandomization” [72] (version 0.4.1) was used for the MR analyses.
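The three estimators named above can be sketched from summary statistics alone. This is a minimal illustration and not the “MendelianRandomization” implementation: the IVW estimate uses first-order weights (the inverse variance of the SNP-outcome estimates), the MR-Egger fit is an ordinary weighted least squares regression rather than the robust variant, and the function names are hypothetical.

```python
import math

def wald_ratio(beta_exposure, beta_outcome):
    """Causal estimate from a single instrument."""
    return beta_outcome / beta_exposure

def ivw_estimate(beta_exposure, beta_outcome, se_outcome):
    """Fixed-effects IVW estimate and standard error.

    Equivalent to a weighted regression of SNP-outcome effects on
    SNP-exposure effects through the origin, with weights 1 / se_outcome^2.
    """
    w = [1.0 / s ** 2 for s in se_outcome]
    num = sum(wi * bx * by
              for wi, bx, by in zip(w, beta_exposure, beta_outcome))
    den = sum(wi * bx * bx for wi, bx in zip(w, beta_exposure))
    return num / den, math.sqrt(1.0 / den)

def mr_egger(beta_exposure, beta_outcome, se_outcome):
    """Weighted regression with an intercept; returns (intercept, slope).

    Variants are first oriented so that every SNP-exposure effect is
    positive. A non-zero intercept suggests unbalanced horizontal pleiotropy.
    """
    bx = [abs(x) for x in beta_exposure]
    by = [y if x >= 0 else -y for x, y in zip(beta_exposure, beta_outcome)]
    w = [1.0 / s ** 2 for s in se_outcome]
    sw = sum(w)
    swx = sum(wi * x for wi, x in zip(w, bx))
    swy = sum(wi * y for wi, y in zip(w, by))
    swxx = sum(wi * x * x for wi, x in zip(w, bx))
    swxy = sum(wi * x * y for wi, x, y in zip(w, bx, by))
    slope = (sw * swxy - swx * swy) / (sw * swxx - swx ** 2)
    intercept = (swy - slope * swx) / sw
    return intercept, slope
```

When every instrument has the same ratio estimate and no pleiotropy is present, the IVW estimate and the MR-Egger slope coincide and the intercept is zero; a constant pleiotropic shift biases the IVW estimate but is absorbed by the MR-Egger intercept.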

The latest and largest meta-analysis for PD genetic risk was used to perform the MR analyses [54]. Summary statistics from the largest GWAS of CSF Aβ42, t-tau, and p-tau181 were also used [25]. Deming et al. performed a one-stage GWAS for 3146 NHW individuals across nine independent studies [25]. None of these cohorts included PD-affected individuals for any biomarker (Aβ42, t-tau, and p-tau181). Finally, the summary statistics of the GWAS for α-Syn CSF levels generated in the current study were used. There was no overlap between CSF biomarker datasets and PD risk datasets. Instrumental variables for each GWAS were obtained by clumping each GWAS summary statistics based on the LD structure of the exposure (CSF biomarker levels) and a significance threshold of 1.0 × 10−5 using PLINK1.9 [56]. Instrumental variables were restricted to those that are uncorrelated (in linkage equilibrium) by setting the --clump-r2 flag to 0.0 and the --clump-kb flag to 1000 (1 Mb).

Results

Association of CSF biomarkers with disease status

A generalized linear model (CSF biomarker levels ~ Age + Sex + Status) including PD cases (N = 700) and controls (N = 189) from two independent datasets (WUSTL and PPMI—Additional file 2: Table S1) in which α-Syn levels were measured with the same platform revealed that all CSF biomarker levels were significantly lower in PD cases compared to controls (α-Syn: betaPD = − 0.05, p = 2.10 × 10−04; Aβ42: betaPD = − 0.34, p = 4.38 × 10−05; t-tau: betaPD = − 0.23, p = 4.58 × 10−03; and p-tau181: betaPD = − 0.25, p = 2.46 × 10−03—Fig. 1). All associations passed multiple test correction (p < 0.013). Using a longitudinal model adjusted by age at lumbar puncture, sex, and the first two principal components, we found significant changes over time for CSF Aβ42 (p = 0.01) but not for α-Syn, t-tau, or p-tau181 in the PPMI cohort (N = 785). These results suggest that CSF dementia biomarkers are associated with PD status.

Fig. 1

CSF α-Syn, Aβ42, t-tau, and p-tau181 levels are lower in Parkinson’s disease than in controls. Box plot of the normalized CSF levels of a α-Syn. b total tau. c phosphorylated tau and d Aβ42 in controls (gray) and Parkinson’s disease cases (orange). Parkinson’s disease cases (N = 700) and controls (N = 189) from two independent datasets (WUSTL and PPMI). The means for each group are represented by a horizontal line. A generalized linear model (CSF biomarker levels ~ Age + Sex + Status) was used to calculate the statistical differences between the CSF protein levels in Parkinson’s disease cases and controls

No significant loci were identified for CSF α-Syn, t-tau or p-tau181 in Parkinson’s disease cohorts

Within each cohort, a linear regression testing the additive genetic model of each SNP for association with CSF protein levels using age, gender, and two principal component factors for population stratification as covariates did not reveal any genome-wide significant loci associated with CSF α-Syn. Although several suggestive loci (p < 10−6 to 10−8) were identified in these analyses (Additional file 1: Fig. S1 and Additional file 2: Table S3), none of them passed multiple test correction threshold when cohorts were combined in the meta-analysis (Fig. 2a and Additional file 2: Table S3).

Fig. 2

Association plot of single variant analyses of CSF α-Syn, t-tau, p-tau181 and Aβ42 levels. Manhattan plot shows negative log10-transformed p-values from the meta-analysis of a α-Syn. b total tau. c phosphorylated tau and d Aβ42 CSF levels. The lowest p-value on chr19 (APOE locus) was p = 4.5 × 10−43. The horizontal lines represent the genome-wide significance threshold, p = 5×10−8 (red) and suggestive threshold, p = 1×10−5 (blue). e, f Regional association plots of loci are shown for SNPs associated with CSF Aβ42 levels near HLA (e) and near APOE locus (f). The SNPs labeled on each regional plot had the lowest p-value at each locus and are represented by a purple diamond. Each dot represents an SNP, and dot colors indicate linkage disequilibrium with the labeled SNP. Blue vertical lines show the recombination rate marked on the right-hand y-axis of each regional plot. Suggestive SNPs for α-Syn, t-tau, p-tau181 can be found in Additional file 2: Tables S3 to S6

Joint analyses of CSF α-Syn levels stratified by PD cases (N = 700), PD cases and controls (N = 889), AD cases only (N = 386), AD cases and controls (N = 575) and controls only (N = 189) were also performed. None of these analyses revealed any genome-wide significant locus, suggesting that these sample sizes might be underpowered to uncover the genetic modifiers of CSF α-Syn.

For t-tau, individual cohort analyses revealed four genome-wide significant loci (Additional file 1: Fig. S3 and Additional file 2: Table S5). However, none of them remained significant in the meta-analyses (Fig. 2b and Additional file 2: Table S5). For p-tau181, individual cohort analyses revealed three genome-wide significant loci (Additional file 1: Fig. S4 and Additional file 2: Table S6). However, none achieved significance in the meta-analyses (Fig. 2c and Additional file 2: Table S6).

Genetic analyses of multi-tissue α-Syn levels

In a subgroup of samples, α-Syn levels were measured in plasma (N = 529), brain (N = 380), and CSF (N = 835) using the SOMAScan platform (Additional file 2: Table S2). Single variant analysis was performed in each tissue separately (Additional file 1: Fig. S2A to 2C). Multi-tissue analysis was performed using MTAG [66]. Although two suggestive loci were observed in chromosomes 3 and 13 (Additional file 1: Fig. S2D and Additional file 2: Table S4) within genomic regions enriched with long intergenic non-protein coding (LINC) genes (Additional file 1: Fig. S2E and F), no genome-wide significant locus was identified. These results suggest that the power boost of using MTAG is not enough to unveil the genetic architecture of α-Syn.

APOE locus is associated with Aβ42 CSF levels in Parkinson’s disease cohorts

A proxy SNP for APOE ε4, rs769449, was associated with CSF levels of Aβ42 in the WUSTL (effect = − 0.56, p = 4.15 × 10−19), and ADNI cohorts (effect = − 0.73, p = 1.25 × 10−15). This association did not pass the genome-wide multiple test correction threshold in the PPMI cohort (effect = − 0.43, p = 3.09 × 10−07) and was not significant in the Spanish cohort (Additional file 1: Fig. S5 and Additional file 2: Table S7). The APOE locus (effect = − 0.57, p = 4.46 × 10−43) and a locus in the HLA region (effect = 0.23, p = 2.88 × 10−08) remained significant in the meta-analysis (Fig. 2d–f). When the cohorts containing only PD cases and controls were analyzed jointly (WUSTL and PPMI – N = 700 cases and 189 controls), the APOE locus was GWAS significant (effect = − 0.50, p = 9.25 × 10−19) but not the HLA region (effect = 0.22, p = 3.58 × 10−04). In the combined analysis of all cohorts (N = 1960), the APOE locus accounted for 36.2% of the CSF Aβ42 levels variance (p = 2.35 × 10−03). Overall, these results revealed a strong and highly significant association between APOE locus and lower CSF Aβ42 levels in PD cohorts.

Significant correlation of genomic architecture of Parkinson’s disease risk and CSF Aβ42

PRS at different p-value thresholds were used to test if the genetic variants associated with dementia biomarkers were associated with the genomic architecture of PD. PRS calculated using the META-PD [54] were associated with PD status in the WUSTL cohort (N = 108; p = 0.035). The PPMI cohort was excluded from this analysis due to overlap with META-PD. No correlation was observed between the genetic architecture of PD and that of CSF α-Syn, t-tau, or p-tau181 levels (Fig. 3). In contrast, the genetic architecture of CSF Aβ42 was correlated with PD, with the best fit when collapsing independent SNPs with p-value < 0.01 (p = 2.50 × 10−11) with a correlation coefficient (R2) of 2.29%. In PD cases and controls only, the correlation remained significant (p = 4.78 × 10−08), with an R2 of 2.36%. In PD patients with both GWAS and CSF biomarker data, the CSF levels of each biomarker were analyzed by quartiles of the PRS calculated from META-PD risk. A significant difference (p = 7.30 × 10−04) was found among the top and the bottom quartiles; higher PRS values exhibit lower levels of CSF Aβ42 (Additional file 1: Fig. S6). No association between PD PRS and longitudinal changes of α-Syn, Aβ42, t-tau, and p-tau181 levels was found in the PPMI dataset. These results indicate that PD and Aβ42 CSF levels have a shared genomic architecture.

Fig. 3

Genetic architecture correlations of Parkinson’s disease risk with CSF α-Syn, t-tau, p-tau181 and Aβ42 levels. PRSice bar plots for Parkinson’s disease risk and CSF biomarkers. Nagelkerke pseudo-R-squared fit for the model of a CSF α-Syn levels PRS and Parkinson’s disease risk. b CSF t-tau levels PRS and Parkinson’s disease risk. c CSF p-tau181 PRS and Parkinson’s disease risk. d CSF Aβ42 levels PRS and Parkinson’s disease risk. Total variance explained by the PRS for multiple p-value thresholds for the inclusion of SNPs, with the red bar indicating the optimal p-value threshold (PT), explaining the maximum amount of variance (R2) in Parkinson’s disease risk in the target sample

Mendelian randomization suggests a causal link between CSF Aβ42 and Parkinson’s disease

Robust regression with the MR-Egger method found no association for t-tau or p-tau181 levels but revealed a trend for CSF α-Syn levels (effect = − 1.40; p = 0.06), and a significant causal effect for CSF Aβ42 on PD (effect = 0.43; p = 1.44 × 10−05) (Fig. 4a–c and Additional file 2: Table S7; Table 2 and Additional file 2: Table S8). When each cohort included in the META-PD was tested separately, CSF Aβ42 showed a causal effect in Nalls et al., 2014 and 2019 (p = 1.54 × 10−07 and 8.74 × 10−05, respectively), but not in Chang et al., 2017 (Table 2 and Additional file 2: Table S8). Additionally, a significant causal effect for CSF Aβ42 on PD age-at-onset was found using the data from Blauwendraat et al., 2019 (effect = 7.75; p = 7.65 × 10−06; Table 2 and Additional file 2: Table S8). A leave-one-out sensitivity analysis on CSF Aβ42 revealed that the proxy SNP for APOE ε4, rs769449, is the strongest instrumental variable in this analysis (I2 is greater than 90% except when this variant is removed) and the main driver of the causal effect of CSF Aβ42 on PD. Other SNPs contribute a smaller proportion of the causal effect (Fig. 4d). Altogether, these results suggest a causal role of SNPs in the APOE locus and of CSF Aβ42 in PD.
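The logic of a leave-one-out sensitivity analysis can be sketched as follows: the causal estimate is recomputed once per instrument with that instrument removed, and a large shift identifies a variant that drives the overall result. The sketch uses a plain IVW estimate for simplicity, which is an assumption and not necessarily the estimator used for Fig. 4d; the function names and SNP labels are hypothetical.

```python
def ivw(beta_exposure, beta_outcome, se_outcome):
    """IVW causal estimate with weights 1 / se_outcome^2."""
    num = sum(x * y / s ** 2
              for x, y, s in zip(beta_exposure, beta_outcome, se_outcome))
    den = sum(x * x / s ** 2 for x, s in zip(beta_exposure, se_outcome))
    return num / den

def leave_one_out(snps, beta_exposure, beta_outcome, se_outcome):
    """Map each SNP to the IVW estimate obtained without that SNP."""
    results = {}
    for i, snp in enumerate(snps):
        keep = [j for j in range(len(snps)) if j != i]
        results[snp] = ivw([beta_exposure[j] for j in keep],
                           [beta_outcome[j] for j in keep],
                           [se_outcome[j] for j in keep])
    return results
```

If a single strong instrument carries the signal, the estimate obtained without it falls toward zero while removing any other variant changes the estimate only slightly.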

Fig. 4

MR regressions on Parkinson’s disease risk genetic architecture and CSF α-Syn and Aβ42 levels. a Association between META-PD risk and CSF α-Syn levels (four variants). Robust regression MR-Egger method effect = − 1.40 and p = 0.06, which is not consistent with causality. b Association between Parkinson’s disease risk and CSF Aβ42 levels (twelve variants). Robust regression with MR-Egger method effect = 0.43 and p = 1.44 × 10−05, which is consistent with causality. Each dot corresponds to one genetic variant, with a 95% confidence interval (CI) of its genetic association with the exposure (α-Syn and Aβ42 levels) and the outcome (Parkinson’s disease risk). Regression lines correspond to the robust MR-Egger method regression; numerical results are given for all tested methods in Additional file 2: Table S8. c CSF Aβ42 regression using multiple MR methods. Each dot is one of the twelve variants included in this test; the effect on CSF Aβ42 levels is shown on the x-axis and the effect on Parkinson’s disease risk on the y-axis. Each line represents the regression of Parkinson’s disease risk on CSF Aβ42 levels with one MR method. Additional details on the data sources and analysis methods to generate these figures are provided in Additional file 2: Table S8. d The forest plot illustrates the leave-one-out sensitivity analysis between CSF Aβ42 and META-PD risk. MR analysis without rs769449 decreased the I2 statistic (I2 = 0.0%) and increased the p-value to non-significant levels, suggesting that the association is mainly driven by this variant

Table 2 Mendelian randomization results for the causal role of α-Syn, Aβ42, t-tau, and p-tau181 in Parkinson’s disease using the MR-Egger method with robust regression

APOE ε4 is associated with Aβ deposition in brains of Parkinson’s disease individuals

CSF Aβ42 and APOE genotype data were available for 134 participants (NControls = 26 and NCases = 108). No difference in the APOE ε4 frequency was found between cases (0.14%) and controls (0.11%). However, the CSF Aβ42 levels were significantly different between controls (p = 3.00 × 10−02) and cases (p = 3.80 × 10−06) when stratifying by the presence of APOE ε4 allele (Fig. 5a) [9].

Fig. 5

APOE ε4 is associated with Aβ42 deposition in the brains of Parkinson’s disease individuals. a Comparison of the levels of CSF Aβ42 in control (N = 26) and PD (N = 108) participants stratified by the presence (ε4 + ; green) or absence (ε4-; blue) of the APOE ε4 allele. b Effect of APOE ε4 allele on the levels of mean cortical binding potentials (MCBP) in controls (N = 44) and Parkinson’s disease (N = 156). c PD patients carrying the APOE ε4 allele exhibit a higher Braak Aβ score than non-carriers (N = 92). Differences between APOE ε4 carriers and non-carriers were statistically significant by the Mann–Whitney U test

PET PiB analysis (N = 108) revealed that MCBP increased with age-at-onset (r = 0.20, p = 3.00 × 10−02) and number of APOE ε4 alleles (r = 0.22, p = 8.00 × 10−03) (Fig. 5a), but decreased with increasing CSF Aβ42 (r = − 0.55, p = 3.33 × 10−12). A linear regression model indicated that CSF Aβ42 and APOE ε4 explain 48% of the variance of MCBP. APOE ε4 is also significantly associated with MCBP (β = 0.14, p = 1.40 × 10−06) in an analysis of 200 participants that included sex and age as covariates. APOE ε4 and age at onset explain 20% of the MCBP variance in this larger cohort. The presence of APOE ε4 did not affect the MCBP in controls (p = 0.19). However, PD patients carrying APOE ε4 exhibit significantly (p = 5.80 × 10−08) higher levels of MCBP than non-carriers (Fig. 5b).

Neuropathological data and APOE genotype were available from 92 PD cases. Individuals carrying an APOE ε4 allele had significantly (p = 4.40 × 10−04) higher Braak Aβ stage (Fig. 5c). APOE ε4 correlated with Braak Aβ stage (r = 0.33, p = 1.00 × 10−03) and diffuse plaques (r = 0.42, p = 5.00 × 10−03), but not with neuritic plaques (r = 0.42, p = 0.12). The best multiple linear regression model for the Braak Aβ stage, which included age at onset and APOE ε4, explained 42% of the variance of the Braak Aβ stage. Altogether, these results suggest that APOE ε4 drives the Aβ deposition in PD participants.

Discussion

CSF α-Syn, Aβ42, t-tau, and p-tau181 levels were significantly lower in PD cases compared with controls, as we previously reported with a smaller sample size [9]. GWAS were performed using CSF biomarker levels as quantitative traits in a large cohort (N = 1,960). With the current sample size, no signal was below the GWAS significant threshold for CSF α-Syn, t-tau, or p-tau181. A SNP proxy for APOE ε4 was genome-wide associated with CSF Aβ42 levels. The PRS calculated using META-PD was associated with PD status and correlated with the genomic architecture of CSF Aβ42; in fact, individuals with higher PRS scores exhibit lower CSF Aβ42 levels. Two-sample MR analysis revealed that CSF Aβ42 probably plays a role in PD and PD age-at-onset, an effect mainly mediated by variants in the APOE locus. Using a subset of participants from the WUSTL cohort with additional clinical and neuropathological data, we found that the APOE ε4 allele was associated with lower levels of CSF Aβ42, higher cortical binding of PiB PET and higher Braak Aβ score.

This is the first comprehensive analysis of CSF α-Syn and AD biomarkers using GWAS, PRS, and MR in PD. We found lower levels of CSF α-Syn in PD cases compared to controls in a cross-sectional analysis but no significant differences in the longitudinal study (PPMI). CSF α-Syn, as measured with ELISA-based assays, is not a clinically useful diagnostic marker for PD, and utility as an outcome measure for clinical trials or progression is still controversial [35, 53]. CSF biomarkers in AD used as quantitative endophenotypes have provided insights into AD pathophysiology [24]. Here, we used a large CSF α-Syn cohort (N = 1920) to identify its genetic modifiers. However, we did not find any locus associated with CSF α-Syn levels. Recently, a GWAS on CSF α-Syn using the ADNI cohort (N = 209) reported a genome-wide significant locus [73] (rs7072338). In the present meta-analyses (N = 1960), the p-value for rs7072338 was not significant (0.99). In the ADNI cohort, we found a nominal association for this SNP (p = 0.50 × 10−3). No correlation was found between the genetic architecture of PD with cross-sectional or longitudinal CSF α-Syn levels, consistent with what we have previously reported [41]. Using MR methods, we found a trend for the association between the CSF α-Syn levels and the risk of developing PD. However, sensitivity analyses showed limited power due to the small number of variants included in the analyses.

MR analyses suggest that Aβ42 could play a causal role in PD. Our MR results consistently identified a causal association among the APOE locus, CSF Aβ levels, and PD. MR tests whether genetic variation associated with a trait has a causal relationship with a health outcome [20]. Unlike observational studies, MR is robust to confounding and reverse causation, provided that several assumptions hold [20]. The first assumption is relevance: the instrumental variables (SNPs) must be associated with the exposure. Here, SNPs relevant to CSF Aβ were previously and consistently identified [24]. The second assumption is independence: SNPs associated with the trait (e.g. the APOE locus with CSF Aβ) should not be associated with confounders of the relationship between the trait and the outcome (PD risk). The third assumption is the exclusion restriction: the SNPs affect PD risk only through CSF Aβ levels. The two-sample MR used here requires two additional criteria: both cohorts must have a similar genetic background and must not overlap with each other. The summary statistics used for the MR analysis, from Deming, et al. [24] and Nalls, et al. [54], met both criteria. We could not rule out horizontal pleiotropic effects on PD for all the SNPs associated with CSF Aβ, but our study is powered to detect the causal association with the APOE locus. Thus, we inferred that the lifetime effect of the APOE locus is causal in relation to PD.
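
As an illustration of how a two-sample MR estimate is formed from summary statistics, the following minimal Python sketch computes per-SNP Wald ratios and a fixed-effect inverse-variance-weighted (IVW) estimate. It is not the pipeline used in this study: the function names and the toy effect sizes are hypothetical, and the standard errors use a first-order approximation that ignores uncertainty in the SNP-exposure effects.

```python
import math


def wald_ratio(beta_exposure, beta_outcome, se_outcome):
    """Per-SNP causal estimate: SNP-outcome effect divided by SNP-exposure effect.

    The standard error is a first-order approximation that treats the
    SNP-exposure effect as known without error.
    """
    estimate = beta_outcome / beta_exposure
    se = se_outcome / abs(beta_exposure)
    return estimate, se


def ivw_estimate(betas_exposure, betas_outcome, ses_outcome):
    """Fixed-effect IVW estimate: weighted mean of the per-SNP Wald ratios,
    each weighted by the inverse of its approximate variance."""
    weights = []
    ratios = []
    for bx, by, so in zip(betas_exposure, betas_outcome, ses_outcome):
        ratio, ratio_se = wald_ratio(bx, by, so)
        ratios.append(ratio)
        weights.append(1.0 / ratio_se ** 2)
    total_weight = sum(weights)
    estimate = sum(w * r for w, r in zip(weights, ratios)) / total_weight
    se = math.sqrt(1.0 / total_weight)
    return estimate, se


# Toy summary statistics (hypothetical values, not taken from any cohort).
toy_beta_exposure = [0.2, 0.4]    # SNP effect on the exposure (e.g. a CSF biomarker)
toy_beta_outcome = [0.1, 0.2]     # SNP effect on the outcome (e.g. log-odds of disease)
toy_se_outcome = [0.05, 0.05]     # standard errors of the SNP-outcome effects

causal_effect, causal_se = ivw_estimate(
    toy_beta_exposure, toy_beta_outcome, toy_se_outcome
)
print(f"IVW estimate = {causal_effect:.3f} (SE = {causal_se:.3f})")
```

In this sketch each SNP contributes in proportion to the precision of its ratio, so a strong instrument such as a large-effect locus can dominate the combined estimate. Sensitivity analyses that address pleiotropy are not covered here.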

The association between the APOE locus and CSF Aβ42 levels reached genome-wide significance in the meta-analysis. This association has been previously reported in AD [48] but not in PD cohorts. Interestingly, the direction of the effect was the same as that reported in AD but with a larger effect size (−0.57 in PD compared to −0.10 in AD) [25]. The APOE locus is the most significant locus associated with sporadic LBD [59, 60] and with cognitive decline in PD [40], but not with PD risk [54]. Here, we also found for the first time that patients with a higher PRS for PD risk exhibit lower levels of CSF Aβ42, suggesting that similar genes or pathways predispose individuals both to accumulate Aβ in the brain and to develop PD. This is in agreement with a recent report suggesting that the PD genes from the PRS analysis are enriched for AD genes [2].
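
To make the PRS relationship concrete, the sketch below shows the general form of the calculation: a PRS is a weighted sum of risk-allele dosages, and its relationship with a quantitative biomarker can be summarized with a correlation coefficient. This is a simplified illustration under stated assumptions, not the analysis performed here; all weights, dosages, and biomarker values are hypothetical, and covariate adjustment is omitted.

```python
import math


def polygenic_risk_score(dosages, weights):
    """Weighted sum of risk-allele dosages (0, 1, or 2 copies per SNP)."""
    if len(dosages) != len(weights):
        raise ValueError("dosages and weights must have the same length")
    return sum(d * w for d, w in zip(dosages, weights))


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical per-SNP weights (e.g. log odds ratios from a risk GWAS).
toy_weights = [0.10, 0.25, 0.05]

# Hypothetical genotype dosages for four individuals.
toy_dosages = [
    [0, 0, 1],
    [1, 0, 2],
    [2, 1, 0],
    [2, 2, 1],
]

# Hypothetical biomarker levels for the same four individuals.
toy_biomarker = [900.0, 820.0, 700.0, 610.0]

scores = [polygenic_risk_score(d, toy_weights) for d in toy_dosages]
r = pearson_r(scores, toy_biomarker)
print("PRS per individual:", [round(s, 2) for s in scores])
print("Correlation with biomarker:", round(r, 3))
```

A negative correlation in such a calculation corresponds to the pattern described above, in which a higher PRS accompanies lower biomarker levels.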

The results from unbiased analyses such as GWAS, PRS, and MR demonstrated a link between PD genetic risk, CSF Aβ42 levels, and the APOE locus. Here, we also provided further evidence by showing that PD patients carrying the APOE ε4 allele presented with lower levels of CSF Aβ42 (p = 3.8 × 10−06), higher MCBP (p = 5.80 × 10−08), and higher Braak Aβ scores (p = 4.40 × 10−04). These results are consistent with the synergistic relationship between α-Syn and Aβ pathology in AD, PD, and LBD brains [43], and with the finding that Aβ plaques exacerbate the propagation of α-Syn pathology in mouse models [3]. APOE ε4 is known to drive the production of Aβ and the accumulation of Aβ fibrils in AD patients [37], to exacerbate tau-mediated neurodegeneration in a mouse model of tauopathy [62], and to affect CSF α-Syn levels in the prodromal phase of sporadic and familial AD [67]. However, the role of APOE in human synucleinopathies is probably more complex. In LBD patients, the APOE ε4 effect on α-Syn pathology could depend on concurrent Aβ and/or tau pathology [58]; however, APOE ε4 also promotes α-Syn pathology independently [27, 65] and affects CSF α-Syn levels [67]. We recently showed that APOE ε4 increased α-Syn phosphorylation, worsened motor impairment, and increased neuroinflammation and neurodegeneration in different mouse models [22].

This is the largest sample size used to date for discovering genetic modifiers of CSF α-Syn, and yet no genome-wide significant locus was found. It is possible that the complexity of the α-Syn genetic architecture leaves the current sample size underpowered to detect signals with smaller effects. Here, we found lower levels of CSF α-Syn in PD patients, which aligns with previous reports. However, neither PRS nor MR analysis revealed evidence of a causal link between CSF α-Syn and PD risk. In fact, it has been reported that α-Syn aggregation is neither necessary nor sufficient for neurodegeneration or clinical parkinsonism [31, 32]. The cohorts used in this study rely on clinical diagnosis rather than neuropathological confirmation, which precludes analysis of the correlation between CSF α-Syn levels and pathological accumulation of α-Syn in the brain. Factors that may have contributed to the lack of power to detect genetic modifiers of CSF α-Syn include participant characteristics (PD subtypes, misdiagnosis, comorbidities, medications, disease duration), preanalytical factors (blood contamination at lumbar puncture), and differences in assays (measuring various pathological or normal forms of α-Syn) [26].

PD is a heterogeneous disorder with different identifiable clinical-pathological subtypes based on symptom severity and predominance [15]. It is conceivable that more homogeneous PD subtypes could be defined using biomarker-driven, clinical-molecular phenotyping approaches. This study, with 1960 samples with CSF α-Syn levels, showed that the genomic architecture of α-Syn is complex and not correlated with the genomic landscape of PD. Additional studies with larger sample sizes and standardized methods to quantify α-Syn in both CSF and brain are needed to uncover genetic modifiers of α-Syn levels. Our results using high-throughput, hypothesis-free approaches demonstrated a link between PD genetic risk, CSF Aβ42 levels, and the APOE locus. These findings were further supported by strong associations of APOE ε4 with Aβ deposition in cortical regions of living and postmortem PD patients.