For our analysis, we used genome-wide DNA methylation (DNAm) data from two publicly available Parkinson’s disease (PD) studies: SGPD (Gene Expression Omnibus (GEO) accession number GSE145361) and PEG (GEO accession numbers GSE72774 and GSE72776). DNAm was measured in whole blood samples with the Human Methylation 450k BeadChip.
The System Genomics of Parkinson’s Disease (SGPD) is a consortium of three studies from across Australia and New Zealand (11). The data available on GEO consist of 1889 samples (959 PD patients and 930 controls) of European ancestry. Prevalent PD patients were recruited (PD duration 2 to 40+ years), and controls consisted primarily of community-based, age-matched volunteers from the same communities, as well as some patients’ spouses and siblings. Cohort details, DNA extraction methods, quality control procedures, and normalization methods (quantile normalization and adjustment for batch, slide, cohort, Sentrix row/column, sex, and age) have been described (11).
The Parkinson’s Environment and Genes (PEG) study is a population-based study from three agricultural counties of Central California (12). GEO data are available for 807 samples (569 PD patients and 238 controls) of European and Hispanic ancestry. Patients early in disease (mean PD duration 2.9 years, SD 2.3) were diagnosed in person by UCLA Movement Disorder Specialists (J.B.). Population-based controls from the same communities were randomly sampled from Medicare lists and residential tax assessor's records. Cohort details, DNA extraction methods, quality control procedures, and normalization methods have been previously described (13).
We generated two epigenetic biomarkers for cumulative lead exposure (tibia and patella), both developed in the Normative Aging Study (NAS). The epigenetic biosensors of patella and tibia lead are linear combinations of 59 and 138 CpGs, respectively. The CpGs were identified with site-by-site analysis and combined via machine-learning algorithms trained on K x-ray fluorescence (KXRF) in vivo measures of bone lead (10). To determine lead biomarker levels in SGPD and PEG, we extracted the published regression coefficients from NAS (10) and applied them to the corresponding DNAm beta matrices. At the time of publication, the specificity of the DNAm biosensors had not been validated beyond the initial study.
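As a minimal sketch of what applying a set of published coefficients to a beta matrix involves, the snippet below computes a weighted sum of CpG beta values for a single sample. The function name `dnam_lead_score`, the CpG identifiers, the weights, and the optional intercept are all hypothetical placeholders for illustration; they are not the NAS coefficients, and the handling of missing CpGs (raising an error) is an assumption rather than a description of the procedure used here.

```python
def dnam_lead_score(betas, coefficients, intercept=0.0):
    """Return the weighted sum of CpG beta values for one sample.

    betas: dict mapping CpG ID -> methylation beta value (between 0 and 1).
    coefficients: dict mapping CpG ID -> regression weight.
    intercept: optional constant term (assumed 0.0 unless supplied).
    Raises KeyError listing any CpG in `coefficients` absent from `betas`.
    """
    missing = sorted(cpg for cpg in coefficients if cpg not in betas)
    if missing:
        raise KeyError(f"CpGs missing from beta matrix: {missing}")
    return intercept + sum(w * betas[cpg] for cpg, w in coefficients.items())


# Hypothetical CpG IDs and weights, for illustration only.
toy_coefficients = {"cg_a": 2.0, "cg_b": -1.0, "cg_c": 0.5}
# One sample's beta values; CpGs not in the model (cg_d) are ignored.
toy_sample = {"cg_a": 0.50, "cg_b": 0.25, "cg_c": 0.80, "cg_d": 0.10}
toy_score = dnam_lead_score(toy_sample, toy_coefficients)
```

Repeating the call for every sample column of a beta matrix, once with the tibia weights and once with the patella weights, would yield the two biomarker values per sample.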
To quantify the relationship between the DNAm lead biomarkers and PD, we used logistic regression to estimate odds ratios (ORs) and 95% confidence intervals (CIs) for PD. The OR is a measure of association that compares the odds of PD by exposure level, with an OR greater than 1 indicating that the odds of PD increase with exposure. Regression diagnostics showed that model assumptions were met. Based on data availability from GEO, we controlled for age (estimated with the Horvath DNAmAge in SGPD (14)), sex, ancestry (PEG only), blood cell composition, smoking history (PEG only), and per-sample mean methylation to account for global methylation. We imputed blood cell proportions using the Houseman method (15). We assessed between-study heterogeneity for SGPD and PEG results with Cochran’s Q, and calculated a meta OR using a fixed-effects model with weights based on precision.
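The pooling step can be illustrated with a short sketch, assuming each study contributes a log OR (the logistic regression coefficient) and its standard error, and that "weights based on precision" means inverse-variance weights. The function names and the input values below are hypothetical and are not results from SGPD or PEG. The p-value helper covers only the two-study case, where Cochran's Q has one degree of freedom.

```python
import math

Z_95 = 1.959963984540054  # standard normal quantile for a two-sided 95% CI


def odds_ratio(log_or, se):
    """Convert a log OR and its SE to (OR, lower 95% limit, upper 95% limit)."""
    return (
        math.exp(log_or),
        math.exp(log_or - Z_95 * se),
        math.exp(log_or + Z_95 * se),
    )


def fixed_effect_meta(estimates):
    """Pool (log_or, se) pairs with inverse-variance weights.

    Returns the pooled log OR, its SE, Cochran's Q, and Q's degrees of freedom.
    """
    weights = [1.0 / se ** 2 for _, se in estimates]
    total = sum(weights)
    pooled = sum(w * b for w, (b, _) in zip(weights, estimates)) / total
    pooled_se = math.sqrt(1.0 / total)
    q = sum(w * (b - pooled) ** 2 for w, (b, _) in zip(weights, estimates))
    return pooled, pooled_se, q, len(estimates) - 1


def q_p_value_two_studies(q):
    """P-value of Cochran's Q when df = 1 (exactly two studies)."""
    return math.erfc(math.sqrt(q / 2.0))


# Hypothetical per-study (log OR, SE) pairs, for illustration only.
toy_estimates = [(0.2, 0.1), (0.4, 0.2)]
pooled, pooled_se, q, df = fixed_effect_meta(toy_estimates)
meta_or, ci_low, ci_high = odds_ratio(pooled, pooled_se)
q_p = q_p_value_two_studies(q)
```

With these toy inputs the more precise study receives four times the weight of the other, so the pooled log OR sits closer to its estimate.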