Introduction

Parkinson’s disease (PD) is the second most common neurodegenerative disorder, impacting millions of people worldwide, contributing to significant disability and death in people diagnosed with the condition1,2. PD is characterized by motor symptoms of bradykinesia, tremor at rest or postural rigidity3. Non-motor symptoms include neuropsychiatric and cognitive impairment, disruptions of sleep-wake cycle regulation, autonomic dysfunction, sensory disturbances, and pain4. Our recently published study mapped cortical gray matter thinning and subcortical volume deficits across PD disease progression stages, revealing a pattern of gray matter change closely tracking Braak stages of Lewy body pathology in PD5. Diffusion MRI (dMRI) is primarily used to investigate pathophysiological processes that impact white matter (WM) in the brain, enabling an investigation of the brain circuits that are impacted in the condition. The most common dMRI model used to study WM alterations in PD is diffusion tensor imaging (DTI) which characterizes the three dimensional diffusion of water as a function of spatial location: the diffusion tensor describes the magnitude, orientation, and degree of diffusion anisotropy6. The current study investigates the two most commonly used DTI parameters of fractional anisotropy (FA) and mean diffusivity (MD). Typically, high FA and low MD are associated with structurally intact WM pathways, while the opposite is found in damaged WM.

Reviews and literature-based meta-analyses of individual studies report a heterogeneous, multifocal pattern of DTI alterations in PD, with higher FA and/or lower MD found in motor pathways soon after the disease is diagnosed, but lower FA and/or higher MD in frontal, temporal and callosal regions at advanced disease stages7,8,9,10,11. These findings of increased FA and reduced MD, often linked to compensatory changes in PD, are bolstered by individual studies showing increases in WM fiber density early in the disorder12. Conflicting results still exist, however, with some work showing no significant DTI alterations in PD13. A number of studies have also tracked WM changes longitudinally, demonstrating greater declines in WM integrity in PD participants over time13,14,15,16. However, individual studies often have smaller sample sizes composed of individuals from similar demographics, and can include participants from various disease stages, making it difficult to assess whether the findings generalize to other populations. Here we aim to overcome these limitations by conducting a coordinated analysis of dMRI across many international studies, using standardized and validated image analysis and quality control protocols. Using the largest sample size generated in this field of research, we set out to investigate how microstructural brain measures are impacted at different stages along the PD disease spectrum, and how WM alterations are associated with clinical symptoms. We hypothesized that participants with PD would show more pronounced DTI WM alterations at advanced disease stages. At the onset of PD, male patients typically show more bradykinesia and rigidity, while female patients tend to display greater signs of tremor17. Due to the slower progression of PD in females18, along with research suggesting that males demonstrate greater local WM disruptions19, we also hypothesized that female PD participants would have less severe alterations in DTI measures than males. Finally, we hypothesized that lower microstructural integrity would be associated with poorer motor and cognitive function.

Results

Study participants

To harmonize Hoehn and Yahr (HY)20 staging across sites, HY1.5 and HY2.5 increments were reclassified as HY2. HY5 scores were grouped with HY4 to maintain a sufficiently large sample for the analysis. These groupings align with our consortium’s published international study using standard anatomical MRI5.
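
The regrouping rule described above can be sketched as a simple lookup. This is an illustrative sketch only: the function name `harmonize_hy` and the group labels are hypothetical and are not taken from the study's analysis code, and the set of raw scores is assumed to be 1, 1.5, 2, 2.5, 3, 4 and 5.

```python
def harmonize_hy(stage):
    """Map a raw Hoehn and Yahr score to the harmonized analysis group.

    Assumption: raw scores are one of 1, 1.5, 2, 2.5, 3, 4, 5.
    Group labels are hypothetical names chosen for this sketch.
    """
    mapping = {
        1.0: "HY1",
        1.5: "HY2",    # half-step increments are reclassified as HY2
        2.0: "HY2",
        2.5: "HY2",
        3.0: "HY3",
        4.0: "HY4/5",
        5.0: "HY4/5",  # HY5 is grouped with HY4
    }
    key = float(stage)
    if key not in mapping:
        raise ValueError(f"unexpected HY stage: {stage!r}")
    return mapping[key]
```

For example, `harmonize_hy(2.5)` returns `"HY2"` and `harmonize_hy(5)` returns `"HY4/5"`.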

There were significant differences in age, Mini-Mental State Examination (MMSE)21 and Montreal Cognitive Assessment (MoCA)22 scores between PD subgroups and controls, and in the proportions of males and females across PD HY subgroups and controls. There were also significant differences in these measures when comparing only the PD HY subgroup participants (Supplementary Results 1.1). Control participants were significantly younger than the entire PD cohort by 2.2 years. The Rome cohort collected data from controls who were significantly younger than PD participants (p < 0.001). As expected, the entire PD cohort had lower scores on the MMSE and MoCA tests than controls. There was also a significant difference in the proportions of males and females across the two groups (Supplementary Results 1.2). Two cohorts, Charlottesville and Radboud, did not collect control data. Because of these age differences, below we report results comparing PD patients to all controls, as well as to only age- and sex-matched controls. Participants were assessed with either the original UPDRS Part-III20 or the revised Movement Disorder Society UPDRS Part-III23. We therefore used a validated formula to convert original UPDRS-III scores to predicted MDS-UPDRS-III scores (Supplementary Results 1.3). Counter-intuitively, MDS-UPDRS-III scores were lower in the OFF-medication state than in the ON-medication state. One possible explanation is that the OFF-medication subjects had a shorter disease duration (4.9 years) than those ON-medication (6.5 years). ON-medication subjects also had lower average MoCA scores (24) than the OFF-medication subjects (27), suggesting that the OFF-medication subjects might be earlier in their disease course.

Between-group differences in white matter microstructural measures

Pronounced patterns of FA differences emerged when stratifying PD participants according to HY stage. HY1 participants (n = 275) had higher FA across the entire WM skeleton (d = 0.30) as well as 4 out of 21 regions of interest (ROI) compared to controls (n = 885), with effect sizes ranging from d = 0.18 to 0.19 (all p < 0.05, false discovery rate corrected24). Implicated regions included the anterior corona radiata, anterior and retrolenticular parts of the internal capsule, and the genu of the corpus callosum. Higher FA in HY1 participants was in the 1.0–1.4% range across significant ROIs. HY2 PD participants (n = 742) had lower FA at the fornix (d = −0.26) relative to controls, a reduction in FA of 5.4%. HY3 PD participants (n = 220) had lower FA across the entire WM skeleton (d = −0.24), as well as 9 out of 21 ROIs, with the largest effect at the fornix (d = −0.33). Implicated regions include the anterior and posterior corona radiata, posterior thalamic radiation, genu of the corpus callosum, external capsule, fornix/stria terminalis, superior longitudinal fasciculus and the sagittal stratum. Lower FA in HY3 participants was in the 1.3–7.3% range across significant ROIs. HY4/5 PD participants (n = 75) had lower FA across the entire WM skeleton (d = −0.74), as well as 20 out of 21 ROIs relative to controls, with the largest effect in the fornix (d = −1.01) and remaining values increasing in magnitude from d = −0.26, representing between 1.4% and 18.9% lower FA. The only ROI not implicated in HY4/5 participants was the cingulum (hippocampal portion) (Fig. 1A; Supplementary Results 1.4.1).

HY1 PD participants displayed higher FA across the entire WM skeleton (d = 0.27) as well as 3 out of 21 ROIs, relative to matched controls (n = 275), with effect sizes ranging from d = 0.23 to 0.24. Implicated regions included the anterior corona radiata and the anterior and posterior limb of the internal capsule, representing between 0.9% and 1.5% higher FA. HY2 PD participants demonstrated significantly lower FA in the fornix (d = −0.27), relative to matched controls (n = 742), a reduction in FA of 5.6%. HY3 PD participants displayed lower FA in the fornix (d = −0.31) and the sagittal stratum (d = −0.29), relative to matched controls (n = 220), a reduction in FA of 2.1% and 6.7%, respectively. HY4/5 PD participants displayed widespread significant differences in FA across the entire WM skeleton (d = −0.56) as well as 18 out of 21 ROIs, relative to matched controls (n = 75), with effect sizes ranging from d = −0.38 to −1.09. Lower FA in HY4/5 participants was in the 3.0% to 19.0% range across significant ROIs. ROIs not implicated include the retrolenticular and posterior limb of the internal capsule and the corticospinal tract (Fig. 1B; Supplementary Results 1.4.2).

Fig. 1: Microstructural differences in FA. Between-group analyses comparing FA across PD HY subgroups and controls. (A) Results when compared to all controls, and (B) when compared to only age- and sex-matched controls. A, anterior; L, left; R, right.

HY1 PD participants displayed lower MD across the entire WM skeleton (d = −0.19) as well as 5 out of 21 ROIs, relative to controls, with effect sizes ranging from d = −0.18 to d = −0.27. Implicated regions include the anterior and retrolenticular limbs of the internal capsule, the fornix/stria terminalis, cingulum (hippocampal portion) and the sagittal stratum. Lower MD in HY1 participants was in the 1.3–2.1% range across significant ROIs. HY2 PD participants displayed lower MD at the fornix/stria terminalis (d = −0.22), as well as 2 out of 21 ROIs relative to controls, with effect sizes of −0.19 and −0.18 at the retrolenticular limb of the internal capsule and the cingulum (hippocampal portion), respectively. Lower MD in HY2 participants was in the 1.2% to 1.9% range across the significant ROIs. HY2 participants also displayed higher MD at the fornix (d = 0.15), an increase in MD of 3.2%. There were no significant MD differences in HY3 PD participants relative to controls. HY4/5 PD participants displayed higher MD in the fornix (d = 0.69) as well as 6 out of 21 ROIs, relative to controls, with effect sizes ranging from d = 0.33 to 0.46. Implicated regions include the anterior and superior corona radiata, external capsule, genu and body of the corpus callosum and the superior fronto-occipital fasciculus. Higher MD in HY4/5 participants was in the 1.9% to 13.1% range across the significant ROIs. HY4/5 PD participants also displayed lower MD in the cingulum (hippocampal portion) (d = −0.32) relative to controls, a reduction in MD of 3.2% (Fig. 2A; Supplementary Results 1.4.3).

There were no significant differences in MD in HY1 PD participants relative to matched controls. HY2 PD participants demonstrated significantly lower MD at the fornix/stria terminalis (d = −0.22), as well as 3 out of 21 ROIs, relative to matched controls (n = 742), with effect sizes ranging from d = −0.14 to −0.19. Implicated regions include the posterior limb and the retrolenticular parts of the internal capsule as well as the cingulum (hippocampal portion). Lower MD in HY2 participants was in the 0.8% to 2.0% range across the significant ROIs. These participants also showed higher MD at the fornix (d = 0.15), an increase in MD of 3.3%. There were no significant MD differences in HY3 PD participants relative to matched controls, while HY4/5 PD participants displayed significantly higher MD at the fornix (d = 0.72) relative to matched controls, an increase in MD of 13.0% (Fig. 2B; Supplementary Results 1.4.4).

Fig. 2: Microstructural differences in MD. Between-group analyses comparing MD across PD HY subgroups and controls. (A) Results when compared to all controls, and (B) when compared to only age- and sex-matched controls. A, anterior; L, left; R, right.

Associations between effect sizes

A number of results found when performing HY stage analyses did not remain significant when performing the same analyses with age- and sex-matched controls. To investigate this incongruence, we examined the correlation between the effect sizes generated at the two levels of analysis. Effect sizes generated for each respective ROI across analyses were highly correlated, with correlation coefficients greater than 0.8 found for each ROI (Supplementary Results 1.5; Supplementary Figs. 1–4).

Validation analyses using ComBat-harmonized data

To ensure results generated via our linear mixed effects models were not driven by unwanted inter-site variability, we used the batch-effect correction tool, ComBat, to harmonize between-site variation across the diffusion metrics25 for use in two validation analyses. Harmonized data was then analyzed using the same linear models used when analyzing DTI metrics stratified by HY stage, with ‘site’ removed as a random effect, and only the fixed effect covariates included in the models. Results of our validation analyses, (1) when run at each HY stage when compared to all controls, and (2) when run at each HY stage when compared to age- and sex-matched controls, were highly similar to those generated with linear mixed effect models, suggesting that ‘site’ effects were not driving the significant results. Results of these validation analyses are presented in Supplementary Results 1.6 and 1.7, Supplementary Figs. 5–12, as well as Supplementary Table 1.

Overall aggregation of all PD participants and controls: FA and MD

When participants from all HY stages were combined, the overall PD group (n = 1,654) showed significantly lower FA (d = −0.24) and higher MD (d = 0.17) at the fornix, relative to controls (n = 885). The overall PD group also showed lower MD at the cingulum (hippocampal portion) (d = −0.16) relative to controls (Supplementary Results 1.8).

Random-effects meta-analysis: PD and Controls

Random-effects meta-analysis revealed no significant differences in FA or MD between the entire PD cohort and controls across the individual sites (Supplementary Results 1.9; Supplementary Figs. 13–34).
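
For readers unfamiliar with the approach, the pooling step of a random-effects meta-analysis can be sketched as follows. The estimator used in this study is not specified in the main text, so the DerSimonian-Laird estimator shown here is an assumption, and the function name `random_effects_meta` and its inputs (per-site effect sizes and their sampling variances) are hypothetical.

```python
import math

def random_effects_meta(effects, variances):
    """DerSimonian-Laird random-effects pooling of per-site effect sizes.

    Assumption: this estimator is used for illustration only; the study's
    own meta-analytic method is described in its supplement.
    Returns (pooled_effect, standard_error, tau_squared).
    """
    if len(effects) != len(variances) or len(effects) < 2:
        raise ValueError("need matching lists with at least two sites")
    k = len(effects)
    w = [1.0 / v for v in variances]                     # inverse-variance weights
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum(w)
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))  # Cochran's Q
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - (k - 1)) / c)                   # between-site variance
    w_star = [1.0 / (v + tau2) for v in variances]       # random-effects weights
    pooled = sum(wi * yi for wi, yi in zip(w_star, effects)) / sum(w_star)
    se = math.sqrt(1.0 / sum(w_star))
    return pooled, se, tau2
```

When site-level effects point in different directions, the between-site variance term widens the pooled standard error, which is one way an overall case-control effect can fail to reach significance.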

Interaction effects between sex and diagnosis

We found no significant interactions between sex and diagnosis when comparing FA and MD across each PD HY stage and matched controls.

Associations between diffusion measures and clinical variables

In PD participants with MoCA data (n = 907), we found significant positive associations between these scores and FA across the entire WM skeleton (r = 0.11), as well as 5 out of 21 ROIs, with effect sizes ranging from r = 0.09 to 0.13. Implicated regions include the anterior limb of the internal capsule, external capsule, fornix, posterior thalamic radiation and the sagittal stratum. Significant negative associations with MD were found across the entire WM skeleton (r = −0.13), as well as 12 out of 21 ROIs, with effect sizes ranging from r = −0.07 to −0.17. Implicated regions include the anterior and superior corona radiata, anterior limb of the internal capsule, external capsule, fornix, genu and body of the corpus callosum, cingulate gyrus, superior fronto‐occipital and longitudinal fasciculi, sagittal stratum and the uncinate fasciculus (Fig. 3; Supplementary Results 1.10.1).

In PD participants with MDS-UPDRS-III (OFF) scores (n = 540), we found significant negative associations between these scores and FA across the entire WM skeleton as well as 14 out of 21 ROIs, with effect sizes ranging from r = −0.10 to −0.15. Implicated regions include the anterior, superior and posterior corona radiata, retrolenticular part of the internal capsule, external capsule, fornix, posterior thalamic radiation, genu, body and splenium of the corpus callosum, cingulate gyrus, sagittal stratum, tapetum and the uncinate fasciculus. No significant associations were detected between MD and MDS-UPDRS-III (OFF) scores (Fig. 3; Supplementary Results 1.10.2).

Fig. 3: Associations between DTI measures and MoCA and MDS-UPDRS-III (OFF) scores. A, anterior; L, left; R, right; MoCA, Montreal Cognitive Assessment; MDS-UPDRS-III, Movement Disorders Society Unified Parkinson’s Disease Rating Scale part-III.

Discussion

In this worldwide study of WM microstructure in PD, we compared brain dMRI data from 1654 participants with PD to 885 control participants from 17 cohorts across Africa, Asia, Europe, North and South America, and Oceania. With uniquely large sample sizes at each HY stage, we found widespread WM microstructural differences in people with PD that appear as higher FA and lower MD at early disease stages, with a reversal of this pattern at advanced disease stages. We also found significant relationships between FA and MD and cognitive and motor function, with poorer clinical performance associated with lower FA and higher MD. After stratifying participants by HY stage, a pattern of DTI differences emerged that was linked to disease progression. The direction of effects earlier in the disease appeared opposite to those later in the disease. These stage-dependent effects were generally obscured when all PD participants were aggregated.

We found significantly higher FA in HY1 participants across the entire WM skeleton, lower FA of the fornix in HY2 participants, and lower FA across the vast majority of the WM skeleton in participants at HY stages 3 and 4/5. While the study is cross-sectional, we can see evidence of a progression of effect sizes, where results from early disease stages present as very small effects, while those at advanced disease stages show medium and large effect sizes. In the HY1 PD participants, the largest effect sizes were found for the higher FA in the internal capsule and anterior corona radiata. These structures are key components of the cerebello-thalamo-cortical and basal ganglia-cortical loops and likely play an important role in the pathophysiology of tremor in PD. It has been suggested that tremor signals originating in the basal ganglia ascend via the ventrolateral nucleus of the thalamus through the internal capsule to the corona radiata where they radiate out to the cortex26. One study found that increased FA is associated with tremor-dominant PD, relative to postural instability-gait disturbance PD, impacting multiple brain regions including the corpus callosum, forceps minor, bilateral thalamic radiation, bilateral superior and inferior longitudinal fasciculi and the left sagittal stratum, suggesting an important relationship between this motor phenotype and DTI abnormalities in PD27. The fornix also shows among the largest effect sizes in PD; FA was significantly reduced here in HY2, HY3 and HY4/5 participants. The fornix is a major hippocampal output structure that traverses longitudinally from the mesial temporal lobes to the diencephalon and basal forebrain, and it plays a key role in cognition and episodic memory recall28. Here we show that this structure is greatly impacted in PD, with a significant correlation between FA and cognitive performance.

We identified significantly lower MD across the entire WM skeleton at the HY1 stage, while in HY2, this result was confined to the internal capsule, fornix/stria terminalis and the cingulum (hippocampal portion). In HY3, there were no significant differences in MD, while HY4/5 participants showed significantly higher MD at widespread regions of interest. Some of the significant MD findings were not observed when comparing PD cohorts to age- and sex-matched controls, with only the most robust effect sizes remaining significant. In our correlational analyses, the effect sizes generated at the matched and unmatched levels of analysis had the same direction and were highly correlated, suggesting that the lack of significant results when comparing MD in HY3 and HY4/5 groups with age- and sex-matched controls are likely due to differences in sample size and statistical power, as well as removing confounding effects. Similar to our results for the FA analyses, in HY1 and HY2 participants large effect sizes were found for regions of interest in the internal capsule, further highlighting this region as an important neuroanatomical structure in PD. In HY4/5 participants, we also found significantly higher MD in the fornix, and MD of the fornix was inversely correlated with performance on the MoCA test.

When all PD participants were combined, irrespective of HY disease stage, the full PD cohort showed lower FA and higher MD at the fornix, and lower MD at the cingulum (hippocampal portion) compared to controls, with no other significant effects. The contrast in findings between (1) comparing all cases to controls and (2) performing stage-stratified analyses, may explain why a number of prior TBSS studies in PD have found few, if any, significant differences in FA or MD29,30,31,32,33,34. A number of studies have also found higher FA or lower MD in PD35,36,37,38,39,40,41. A significant strength of our work is that we have highly powered sample sizes at all HY stages, and stage-dependent effects may account for opposing findings in the literature, resolved by the current study design using the large ENIGMA-PD sample.

Diffusion anisotropy in the WM, including the DTI-based FA metric, is influenced by differences in fiber density, degree of myelination, diameter of fibers, presence of ‘crossing’ or ‘kissing’ fibers, and the density of neuroglial cells42. Several PD-specific processes have been linked to changes to anisotropy and diffusivity43. Higher FA may indicate a compensatory response to altered activity arising from dopaminergic depletion of the substantia nigra and connected neuronal pathways38,44. Such a process could promote activity-driven myelination whereby neuronal reserve is recruited and strengthened in response to the increased demand45. At the onset of PD symptoms, over two thirds of dopaminergic neurons in the lateral ventral substantia nigra may already be lost46. Still, many participants in the early stages of PD can maintain normal levels of functional performance in a range of domains, potentially through the recruitment of compensatory mechanisms45,47,48. This theory is supported by recent work demonstrating that lower clinical severity in PD is associated with stronger upregulation of activity in parietal and premotor cortices during movements made under mild cognitive load49. Further supporting this compensatory hypothesis, experimental animal models of PD have found higher anisotropy and lower diffusivity in the degenerating nigrostriatal pathways after injection of the 6-OHDA neurotoxin50. Other compensatory mechanisms that may contribute to higher anisotropy or lower diffusivity could include increases in axonal density, potentially due to axonal sprouting32,51. Supporting this, a PD study using advanced dMRI data found greater apparent fiber and WM tract-density metrics early in PD compared to controls36. Higher anisotropy may also reflect neuroinflammation, whereby the infiltration of gliotic cells to certain brain areas may contribute to a more anisotropic diffusion of water molecules50. This may partly explain our findings as neuroinflammation plays a crucial role in the etiology of PD52. Another possibility might be that genetic risk factors for PD may contribute to these DTI alterations. Members of our team recently reported that polygenic risk for PD is associated with greater cortical surface area, potentially relating to increases in neural progenitor cells53. Subcortical WM contains populations of glial progenitors54, and genetic polymorphisms associated with PD may impact the development of these cells in a way that increases the number or density of WM cells in PD.

In contrast, several studies have reported lower FA30,31,55,56,57,58,59,60, and higher MD in PD compared to controls57,61,62,63, as we found at later disease stages. Longitudinal studies have also shown that PD is associated with decreases in WM integrity as the disease progresses, supporting the general direction of WM changes our study indicates13,14,15,16,64. These findings are more typically associated with neurodegeneration, where brain tissue is constraining water diffusion to a lesser degree, due to neurite loss or diminished membrane integrity causing an increase in the extracellular space65,66. Considering the neuropathology of PD in particular, the accumulation of Lewy neurites in neuronal axons impacts the functional integrity and survival of neurons by inhibiting axonal transport67,68. Further, alpha-synuclein, a key component of Lewy neurites, inhibits neurite outgrowth and branching69. Comorbid small-vessel ischemia and WM hyperintensities may also contribute to lower anisotropy and higher diffusivity in WM tracts in PD70. It has been shown that neurodegeneration, in the form of deficits in subcortical and cortical gray matter tissue, occurs at greater levels at advanced disease stages5,71, and it does so in a way that parallels the topological spread of Lewy pathology across the brain which becomes more severe as the disease advances72.

Females have a lower incidence of PD, generally slower disease progression, and a milder motor phenotype, likely reflecting hormonal, environmental and genetic load factors as well as gene-environment interactions17,18,73. Given these factors, we expected to find a significant interaction effect between sex and PD diagnosis in this work; however, we found no evidence to support this hypothesis.

We found widespread positive associations between MoCA scores and FA and negative associations between MoCA scores and MD across the brain; however, these associations were very weak. The relationship between lower MoCA scores (more impairment) and higher MD is consistent with prior reports in PD linking DTI alterations to level of cognitive function62,74,75,76,77. While DTI-derived diffusivity alterations have been associated with cognitive impairment in PD11, here we also show anisotropy effects related to cognitive impairment. The large sample size of our study provided the power to uncover these small but significant effects in FA, with effect sizes ranging from r = 0.09 to 0.11, that would not necessarily be detectable in studies with small sample sizes. We found negative associations between MDS-UPDRS-III (OFF) scores and FA across widespread regions of interest in the brain, with no significant associations found with MD at any regions of interest. Our results contrast with prior reports in smaller samples that were unable to detect associations between FA and measures of motor impairment31,62,63. Our results provide evidence that microstructural differences in WM may underlie the cognitive impairment and motor disturbances in PD, suggesting that cumulative effects of PD may result in greater microstructural damage to WM at later disease stages, characteristic of the changes associated with neurodegeneration65,66. Higher scores on the MDS-UPDRS-III (OFF) are indicative of greater motor disability in people with PD, and our results suggest that greater motor disability is mainly associated with lower FA, more so than changes in diffusivity.

A limitation of the current work is that we can only speculate about the potential compensatory mechanisms underlying early DTI alterations in PD, and given the cross-sectional nature of the study, we could not map disease progression on an individual level or in the same subjects. Longitudinal studies that include PD participants from across the disease spectrum (including prodromal PD participants) are needed to better understand how DTI metrics change with the progression of the disorder. Another limitation is that datasets from each site had already been collected, using different clinical and dMRI acquisition protocols. This limited our ability to investigate how DTI differences relate to many common symptoms of the disorder. This is also a strength of our work as it means our findings have been derived from diverse populations from around the world. Using data compiled from varied dMRI acquisition protocols potentially biases our results. However, experimental research using traveling participants scanned on varied scanner vendors and software has found high concordance of DTI metrics across acquisition protocols supporting the feasibility of multicentered DTI studies78,79. Given the importance of several other non-motor features in PD, such as sleep, as well as olfactory and autonomic dysfunction, a key future goal will also be examining how microstructural brain differences relate to these non-motor features. A final limitation of this work is that we assess large WM tracts as a whole and do not evaluate the effects of cerebrovascular factors that may impact local WM regions. Tractography-based analyses incorporating WM hyperintensities will be performed in the future, as well as mapping how these factors contribute to cognitive impairment.

In conclusion, our research demonstrates that people with Parkinson’s disease present with a pattern of DTI alterations involving higher anisotropy and lower diffusivity at early disease stages. By contrast, later-stage participants exhibit differences more typically associated with neurodegeneration, including lower anisotropy and higher diffusivity.

Methods

Study participants

To enhance statistical power and generalizability of the findings in this field of research, we coordinated a worldwide, pooled, multisite analysis of data from 17 cohorts from Africa, Asia, Europe, Oceania, and both North and South America (Supplementary Fig. 35). In total, dMRI scans from 1654 participants with PD and 885 controls were analyzed (age range: 20–89 years; 38% female).

Information on how each cohort diagnosed PD, as well as inclusion and exclusion criteria, is available in Supplementary Methods 2.1. Fourteen of the seventeen cohorts collected scans and clinical information from age-matched control participants. Demographic and clinical characteristics of each cohort are summarized below in Table 1; the distribution of age across sites is shown in Fig. 4.

Table 1 ENIGMA-PD sample characteristics by site

Fig. 4: Age distribution for the 17 datasets analyzed. The number above each boxplot is the median age for that sample; the number to the right is the total number of subjects.

Clinical assessments included the HY stage, MMSE, MoCA, and the original UPDRS Part-III or revised Movement Disorder Society UPDRS Part III scores. All participants provided written informed consent; each local study was approved by local institutional review boards (institutional approval details supplied in Supplementary Methods 2.2 Study Participants), and all research was performed in accordance with the World Medical Association’s Declaration of Helsinki.

Demographic and clinical characteristics of each HY disease stage group are shown in Table 2.

Table 2 Characteristics of the PD HY Cohorts, Stratified by HY Disease Stage

MRI acquisition

All cohorts’ scanner descriptions and acquisition protocols are provided in Supplementary Methods Table 3. Prior work by the ENIGMA Worldwide Consortium in epilepsy, brain trauma, PTSD, major depression, bipolar disorder, obsessive-compulsive disorder, and schizophrenia80,81,82,83,84,85 and experimental work using traveling subjects78,79 suggest that DTI measures can be merged and analyzed across scanning protocols, provided that standardized processing procedures are used as detailed below alongside statistical adjustments.

Image processing

Each cohort conducted dMRI processing locally or sent unprocessed imaging data to the central site for analysis after appropriate data transfer agreements were approved by the relevant institutions. Preprocessing steps included image denoising, Gibbs deringing correction, eddy current correction, echo-planar imaging induced distortion correction, and B1 field inhomogeneity correction. All data was then processed using the standardized ENIGMA-DTI pipeline86. For each site’s specific preprocessing protocol, see Supplementary Methods Table 4. A tensor estimation step was performed for each participant, creating FA, MD, axial diffusivity (AD), and radial diffusivity (RD) maps. FA images were visually checked to ensure that the principal eigenvector direction was anatomically plausible. DTI FA maps were non-linearly registered to the ENIGMA-DTI template and alignment was visually checked. Tract-based spatial-statistics (TBSS)87 was then used to skeletonize FA, MD, AD, and RD maps. For the registration of DTI maps to the ENIGMA-DTI template we used the default nonlinear registration approach in TBSS. Other image alignment methods exist that have been shown to improve registration and yield fewer false positives88; however, to maintain consistency across sites, we chose to follow the standard TBSS registration approach. Mean DTI measures were extracted from 21 WM tracts based on the Johns Hopkins University atlas of WM labels89. An overall average value was also derived for the whole brain WM skeleton (entire WM) assessing the mean of left and right DTI measures for ROIs. The final step in the ENIGMA-DTI pipeline produces ROI data for each participant; this was then sent to the central analysis site for data curation and statistical analyses. The main text of this paper presents results from FA and MD only, the two most widely reported DTI measures in the literature. AD and RD results are presented in the Supplementary Results.
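
The four scalar maps produced by the tensor estimation step follow from the three eigenvalues of the diffusion tensor at each voxel. The sketch below uses the standard textbook definitions; it is not the ENIGMA-DTI pipeline code, and the function name `dti_scalars` is hypothetical.

```python
import math

def dti_scalars(l1, l2, l3):
    """Return FA, MD, AD and RD from the three diffusion tensor eigenvalues.

    Assumes l1 >= l2 >= l3 >= 0, in any consistent diffusivity unit.
    """
    md = (l1 + l2 + l3) / 3.0            # mean diffusivity
    ad = l1                              # axial diffusivity (principal eigenvalue)
    rd = (l2 + l3) / 2.0                 # radial diffusivity
    denom = l1 ** 2 + l2 ** 2 + l3 ** 2
    if denom == 0:
        fa = 0.0                         # no diffusion: define FA as 0
    else:
        num = (l1 - md) ** 2 + (l2 - md) ** 2 + (l3 - md) ** 2
        fa = math.sqrt(1.5 * num / denom)  # fractional anisotropy, in [0, 1]
    return {"FA": fa, "MD": md, "AD": ad, "RD": rd}
```

With equal eigenvalues (isotropic diffusion) FA is 0, and when diffusion occurs along a single axis FA approaches 1, which is why FA is read as an index of directional coherence in WM.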

Between-group differences in white matter microstructural measures

To investigate DTI alterations at different stages of PD, mixed effects linear regression models were fitted to evaluate regional DTI differences between each HY subgroup and controls. Fixed effect covariates included age, sex, the square of the mean-centered age, and the mean-centered interaction of age with sex. To account for differences in dMRI acquisitions across contributing groups, ‘site’ was used as a random-effect covariate, as in prior multi-site morphometric ENIGMA studies90. Effect sizes were estimated as Cohen’s d values as described in Supplementary Methods 2.4. To adjust for multiple comparisons, we used the false discovery rate (FDR) procedure to correct raw p values, at q = 0.05. Statistical analyses were run using R software (v4.0.3) with lme4 (v1.1.30) and MASS (v7.3.53) packages. To control the experiment-wise false positive rate and limit false positive inferences across the whole study, we considered the FA and MD analyses as primary, while the clinical association tests were treated as secondary post hoc analyses.
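The multiple-comparison step can be illustrated with a short sketch. The text above does not name the specific FDR variant; the code below assumes the Benjamini-Hochberg step-up procedure, which is a common choice, and the function name `bh_adjust` and the example p values are hypothetical.

```python
def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p values (assumed FDR variant, for illustration)."""
    m = len(pvalues)
    # Indices of the p values sorted from smallest to largest
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value down, enforcing monotonicity of the adjusted values
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

raw_p = [0.01, 0.04, 0.03, 0.005]   # hypothetical raw p values, one per ROI
adj_p = bh_adjust(raw_p)
significant = [p <= 0.05 for p in adj_p]   # FDR threshold q = 0.05
```

An ROI is declared significant when its adjusted p value is at or below the chosen q, here 0.05.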

To ensure that findings were not driven by age or sex differences in case-control sample comparisons, all HY PD subgroup analyses were repeated using subsamples of age- and sex-matched controls, generated using 1:1 nearest-neighbor matching with the MatchIt package (v4.5.0) in R.
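The idea behind 1:1 nearest-neighbor matching can be sketched as follows. This is a deliberately simplified stand-in, not the MatchIt algorithm as configured in this study: it assumes exact matching on sex and greedy matching on absolute age difference without replacement, whereas MatchIt typically matches on an estimated distance measure such as a propensity score. All names and records below are hypothetical.

```python
def match_controls(cases, controls):
    """Greedy 1:1 matching without replacement (simplified illustration).

    cases, controls: lists of dicts with keys 'id', 'age', 'sex'.
    Returns a dict mapping each matched case id to one control id.
    """
    available = list(controls)          # controls not yet used
    pairs = {}
    for case in cases:
        # Assumption: require the same sex, then take the closest age
        candidates = [c for c in available if c["sex"] == case["sex"]]
        if not candidates:
            continue                    # leave the case unmatched
        best = min(candidates, key=lambda c: abs(c["age"] - case["age"]))
        pairs[case["id"]] = best["id"]
        available.remove(best)          # each control is used at most once
    return pairs

# Hypothetical participants
cases = [{"id": "p1", "age": 65, "sex": "M"}, {"id": "p2", "age": 70, "sex": "F"}]
controls = [
    {"id": "c1", "age": 64, "sex": "M"},
    {"id": "c2", "age": 80, "sex": "F"},
    {"id": "c3", "age": 66, "sex": "F"},
]
pairs = match_controls(cases, controls)
```

Because the matching is greedy, the result can depend on the order in which cases are processed.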

Associations between effect sizes

To evaluate the similarity between results generated using ‘Stratification by HY Stage’ and ‘Matched Control Samples’ designs, we correlated the Cohen’s d values generated for each ROI across these two analyses.

Validation analyses using ComBat-harmonized data

To ensure that results generated via our linear mixed effects models were not driven by unwanted inter-site variability, we used the batch-effect correction tool ComBat to harmonize between-site variation across the diffusion metrics25 for use in two validation analyses. With ComBat, the effect of a given batch variable (in our case, “site”) is modeled using empirical Bayesian estimates of the batch-specific shift in the mean and scale of the model residuals, for each DTI metric and ROI, while preserving the expected biological variation in the data associated with variables of interest, here age and sex. ComBat was run on all sites that had control subjects, with the residuals then applied to PD participant data from the corresponding sites. These data were then analyzed using the same linear models as in the analyses of DTI metrics stratified by HY stage, except that “site” was not included as a random effect, so that only the fixed-effect covariates were included in the models.

Overall aggregation of all PD participants and controls: FA and MD

Mixed effects linear regression models were also fitted to evaluate regional DTI differences between the overall PD cohort and controls.

Random-effects meta-analysis: PD and controls

Differences in DTI measures between PD participants and controls were also assessed using random-effects meta-analysis (RE-Meta), to ensure that pooled analysis findings were not driven by any single site (Supplementary Results 1.9). Within each of the 15 (of 17) cohorts that had both PD participants and controls, linear regressions were used, adjusting for age, sex, the square of the mean-centered age, and the mean-centered interaction of age with sex. Cohen’s d effect sizes and standard errors were then analyzed for each ROI across sites via meta-analysis using the metafor package (v3.8.1) in R.
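To make the pooling step concrete, the sketch below combines per-site Cohen’s d values and standard errors for one ROI under a random-effects model. It uses the DerSimonian-Laird estimator of between-site variance because it has a simple closed form; this is a swapped-in simplification, since the estimator used in the study is not stated here and metafor offers several (including restricted maximum likelihood). The function name and input values are hypothetical.

```python
import math

def dersimonian_laird(effects, std_errors):
    """Random-effects pooled estimate via DerSimonian-Laird (illustrative sketch).

    Returns (pooled effect, its standard error, between-site variance tau^2).
    """
    variances = [se ** 2 for se in std_errors]
    w = [1.0 / v for v in variances]                  # inverse-variance weights
    sw = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sw
    # Cochran's Q: weighted squared deviations from the fixed-effect mean
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    k = len(effects)
    c = sw - sum(wi ** 2 for wi in w) / sw
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0
    # Random-effects weights add tau^2 to each site's sampling variance
    w_star = [1.0 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_star, effects)) / sum(w_star)
    pooled_se = math.sqrt(1.0 / sum(w_star))
    return pooled, pooled_se, tau2

# Hypothetical per-site Cohen's d values and standard errors for a single ROI
pooled_d, pooled_se, tau2 = dersimonian_laird([0.0, 1.0], [0.1, 0.1])
```

When sites disagree strongly, tau^2 grows and the pooled standard error widens, so that no single site dominates the inference.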

Interaction effects between sex and diagnosis

To examine sex differences in the effect of PD on DTI measures, sex-by-diagnosis interactions were also investigated, in groups stratified by HY PD stage. These analyses used mixed-effects linear regression models, with fixed-effect covariates including age, sex, the square of the mean-centered age, and the mean-centered interaction of age with sex, with ‘site’ modeled as a random effect.

Associations between diffusion measures and clinical variables

We tested for associations between regional DTI measures and (1) MoCA scores, and (2) MDS-UPDRS-III scores in the entire PD cohort (MDS-UPDRS-III scores were only considered when the participant was in the functional OFF-state for their Parkinsonian medication at the time of clinical assessment). Fixed-effect covariates included disease duration, age, sex, the square of the mean-centered age, and the mean-centered interaction of age with sex, with ‘site’ used as the random-effect covariate. The degree of association was estimated using partial correlation coefficient r values (Supplementary Methods 2.4).
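One common way to obtain a partial correlation r from a regression model is to convert the t statistic of the predictor of interest using the residual degrees of freedom. The sketch below shows that conversion; whether it matches the exact formula in Supplementary Methods 2.4 is an assumption, and the function name and example values are hypothetical.

```python
import math

def partial_r_from_t(t_stat, df):
    """Partial correlation from a regression t statistic and residual degrees of freedom.

    Uses r = t / sqrt(t^2 + df); assumed here for illustration only.
    """
    return t_stat / math.sqrt(t_stat ** 2 + df)

# Hypothetical example: t = 3 for a clinical score with 16 residual degrees of freedom
r = partial_r_from_t(3.0, 16)
```

The sign of r follows the sign of t, and its magnitude shrinks toward zero as the degrees of freedom grow for a fixed t.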