Complemental Value of Microstructural and Macrostructural MRI in the Discrimination of Neurodegenerative Parkinson Syndromes

Purpose Various MRI-based techniques were tested for the differentiation of neurodegenerative Parkinson syndromes (NPS); the value of these techniques in direct comparison and combination is uncertain. We thus compared the diagnostic performance of macrostructural, single compartmental, and multicompartmental MRI in the differentiation of NPS. Methods We retrospectively included patients with NPS, including 136 Parkinson’s disease (PD), 41 multiple system atrophy (MSA) and 32 progressive supranuclear palsy (PSP) and 27 healthy controls (HC). Macrostructural tissue probability values (TPV) were obtained by CAT12. The microstructure was assessed using a mesoscopic approach by diffusion tensor imaging (DTI), neurite orientation dispersion and density imaging (NODDI), and diffusion microstructure imaging (DMI). After an atlas-based read-out, a linear support vector machine (SVM) was trained on a training set (n = 196) and validated in an independent test cohort (n = 40). The diagnostic performance of the SVM was compared for different inputs individually and in combination. Results Regarding the inputs separately, we observed the best diagnostic performance for DMI. Overall, the combination of DMI and TPV performed best and correctly classified 88% of the patients. The corresponding area under the receiver operating characteristic curve was 0.87 for HC, 0.97 for PD, 1.0 for MSA, and 0.99 for PSP. Conclusion We were able to demonstrate that (1) MRI parameters that approximate the microstructure provided substantial added value over conventional macrostructural imaging, (2) multicompartmental biophysically motivated models performed better than the single compartmental DTI and (3) combining macrostructural and microstructural information classified NPS and HC with satisfactory performance, thus suggesting a complementary value of both approaches. 
Supplementary Information The online version of this article (10.1007/s00062-023-01377-w) contains supplementary material, which is available to authorized users.


Introduction
Parkinson's disease (PD) is the fastest growing neurological disorder and a leading source of disability [1]. Atypical Parkinson syndromes comprise multiple system atrophy (MSA), progressive supranuclear palsy (PSP) and corticobasal degeneration. As the clinical symptoms of these diseases can be difficult to distinguish, especially in the early stages, correct diagnosis is highly dependent on the experience of the clinician [2]. Thus, up to 20-25% of patients with PD and approximately 30% of patients with MSA and PSP are initially misdiagnosed based only on clinical characteristics [3].
While 18F-fluorodeoxyglucose positron emission tomography-computed tomography (FDG PET-CT) has high diagnostic accuracy in the differential diagnosis of neurodegenerative Parkinson syndromes (NPS), it is currently only available at centers specialized in neuroimaging and neurodegenerative diseases [3]. 123I-ioflupane single-photon emission computed tomography (SPECT) primarily distinguishes neurodegenerative from non-neurodegenerative Parkinson syndromes but does not distinguish between different neurodegenerative Parkinson syndromes [4]. Currently, a widely accessible serum marker is not available [5]. Because of its broad use, noninvasiveness, and comparatively low cost, MRI represents a promising technique to fill the diagnostic gap.
In suspected NPS, MRI is typically employed to exclude structural abnormalities such as malignancies, small vessel disease, strategic infarcts, or hydrocephalus [6]. Nevertheless, MRI can contribute to delineating Parkinson syndromes as the various NPS each exhibit pathognomonic patterns of neurodegeneration [7]. As microstructural changes precede macrostructural cerebral changes and thus offer a potentially earlier and possibly more accurate diagnosis [8][9][10], several different methods have been evaluated in the differential diagnosis of NPS in recent years. Mesoscopic approaches that approximate the brain's microstructure in vivo were employed to improve the diagnostic value of MRI. Diffusion tensor imaging (DTI) parameters were applied to identify NPS-related changes [11][12][13][14]. For example, an elevated nigral mean diffusivity and reduced fractional anisotropy were observed compared to healthy subjects; however, the interpretability of DTI metrics is constrained as they are limited to information on the orientation and isotropy of diffusivity [15]. Compared to DTI, biophysically motivated models, such as neurite orientation dispersion and density imaging (NODDI) and diffusion microstructure imaging (DMI), provide a more specific and interpretable approximation of the microstructure [16][17][18]; for example, an increased nigral free water fraction has been observed in neurodegenerative Parkinson syndromes [19,20]. In contrast to NODDI, DMI is not restricted to hard a priori assumptions and is therefore more suitable for assessing pathologically altered microstructures [17]; it has already been successfully applied in clinical research [14,20][21][22][23].
In this retrospective study, we analyzed and compared the diagnostic value of macrostructural and microstructural MRI metrics to differentiate patients with PD, MSA, and PSP and healthy controls (HC) and developed a fully automatic algorithm for assessing NPS. We hypothesized that (1) MRI parameters that approximate the cerebral microstructure using a mesoscopic approach perform significantly better in delineating NPS and HC than those derived from macrostructural imaging, (2) MRI parameters based on multicompartmental techniques perform better than single compartmental DTI and (3) the combination of these approaches performs best.

Participants
This retrospective single-center cross-sectional study included consecutive patients with NPS according to current consensus diagnostic criteria [24][25][26] who underwent MRI for the differential diagnosis of NPS between January 2018 and December 2021. Disease was staged according to Hoehn and Yahr [27]. Patients with suspected corticobasal degeneration and dementia with Lewy bodies were excluded from this study due to the small number of cases and etiological heterogeneity. We included age-matched and sex-matched HC. The inclusion workflow is shown in Fig. 1.

Fig. 1 Flowchart for patient inclusion and exclusion
For training and subsequent validation of our SVM algorithm, we randomly split our cohort into a training and test subset (4:1) matched in terms of age, sex, and diagnoses.
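The random, diagnosis-matched split described above can be sketched as a stratified split. This is a minimal illustration, not the authors' procedure: the function name `stratified_split`, the seed, and the exact 0.2 test fraction are assumptions, matching on age and sex is not reproduced, and the resulting subset sizes differ slightly from the reported n = 196 and n = 40.

```python
import random
from collections import defaultdict


def stratified_split(labels, test_fraction=0.2, seed=0):
    """Split indices so that each diagnosis contributes ~test_fraction to the test set."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    train_idx, test_idx = [], []
    for label in sorted(by_class):
        members = by_class[label]
        rng.shuffle(members)  # random assignment within each diagnosis
        n_test = round(len(members) * test_fraction)
        test_idx.extend(members[:n_test])
        train_idx.extend(members[n_test:])
    return sorted(train_idx), sorted(test_idx)


# Group sizes as reported in the abstract (whole cohort, n = 236).
labels = ["PD"] * 136 + ["MSA"] * 41 + ["PSP"] * 32 + ["HC"] * 27
train_idx, test_idx = stratified_split(labels)
print(len(train_idx), len(test_idx))
```

Splitting within each diagnosis keeps the class proportions of the test subset close to those of the training subset, which matters here because the groups differ considerably in size.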
The study was approved by the Institutional Review Board (EK22/20) and carried out in accordance with the Declaration of Helsinki and its later amendments. The need for written informed consent was waived.

Spatial Normalization and Calculation of Tissue Probability Values
All data processing was carried out on a local instance of the NORA platform (www.nora-imaging.org). To obtain a macrostructural measure of atrophy, T1w-imaging datasets were automatically segmented into white and gray matter using CAT12 (http://www.neuro.uni-jena.de/cat/). Diffusion-weighted images were rigidly coregistered to the T1w images using the SPM toolbox.

Calculation of Diffusion MRI Parameters
Preprocessing of diffusion-weighted images included a denoising step [28], followed by the correction of Gibbs ringing artifacts [29] and final upsampling to an isotropic resolution of 1.5 mm³.
DMI metrics based on a three-compartment diffusion model (i.e., V-CSF, V-intra, and V-extra) were estimated using a Bayesian approach and machine learning techniques [17]. NODDI and DMI are based on a very similar model. The NODDI approach, however, relies on fixed values for certain model parameters (the intra-axonal and extra-axonal diffusion coefficients), whereas DMI relaxes these hard constraints by using broad prior distributions.

Extraction of Macrostructural and Microstructural Imaging Features
The parameter maps of DTI, NODDI, and DMI were separated into gray and white matter using a CAT12-derived tissue probability value (TPV) threshold of 0.4. Here, the TPV gives the probability that a voxel belongs to gray or white matter. Only the gray matter compartment was read out for the AAL3 atlas and only the white matter compartment for the JHU WMPM III atlas (Fig. 2; [31,32]). In addition, we extracted the MRI metrics from the Human Motor Thalamus atlas [33].
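The tissue-restricted atlas read-out can be illustrated with a small NumPy sketch. Several details are assumptions not stated in the text: the regional summary statistic (here the mean), the strict `>` comparison against the 0.4 threshold, the convention that atlas label 0 is background, and the hypothetical function name `regional_means`. The arrays are toy data, not study data.

```python
import numpy as np


def regional_means(param_map, tissue_prob, atlas, threshold=0.4):
    """Mean parameter value per atlas region, using only voxels above the TPV threshold."""
    tissue_mask = tissue_prob > threshold
    means = {}
    for label in np.unique(atlas):
        if label == 0:  # assumption: 0 encodes background
            continue
        voxels = param_map[(atlas == label) & tissue_mask]
        means[int(label)] = float(voxels.mean()) if voxels.size else float("nan")
    return means


# Toy 2 x 2 "images": a diffusion parameter map, a gray matter TPV map, and an atlas.
param_map = np.array([[1.0, 2.0], [3.0, 4.0]])
gm_prob = np.array([[0.9, 0.1], [0.5, 0.8]])
atlas = np.array([[1, 1], [2, 2]])

features = regional_means(param_map, gm_prob, atlas)
print(features)  # {1: 1.0, 2: 3.5}
```

In region 1 only one voxel passes the gray matter threshold, so the regional value is taken from that voxel alone; the same function would be called with the white matter TPV map for a white matter atlas.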

SVM Training
A linear support vector machine (SVM) was trained and optimized with respect to the area under the receiver operating characteristic curve (AUC-ROC) as a one-vs.-rest (OVR) classifier for PD, MSA, PSP, and HC. The diagnostic performance of the SVM was compared with respect to different inputs, i.e., TPV, DTI, NODDI, and DMI individually and in combination.
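As a reminder of what a one-vs.-rest AUC measures, the following sketch computes it from per-subject decision scores using the rank (Mann-Whitney) formulation: the probability that a randomly chosen subject of the target class receives a higher score than a randomly chosen subject of any other class. The function name `ovr_auc` and the scores are hypothetical; this is not the evaluation code used in the study.

```python
def ovr_auc(scores, labels, positive):
    """AUC for `positive` vs. all other classes; ties count as half a win."""
    pos = [s for s, l in zip(scores, labels) if l == positive]
    neg = [s for s, l in zip(scores, labels) if l != positive]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical decision scores of the "MSA vs. rest" classifier for five subjects.
labels = ["MSA", "PD", "MSA", "HC", "PSP"]
scores = [0.9, 0.2, 0.4, 0.5, 0.1]

auc_msa = ovr_auc(scores, labels, positive="MSA")
print(round(auc_msa, 3))  # 0.833
```

Five of the six MSA/non-MSA pairs are ordered correctly, giving an AUC of 5/6. Repeating the calculation with each diagnosis as the positive class yields one AUC per class, as reported in the Results.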
The SVM was developed in the Python (version 3.8.5) package "Scikit-learn" (version 0.23.2). The atlas-derived microstructural and macrostructural parameters served as input for the linear SVM. We investigated the diagnostic value of 638 TPV-derived features, 2124 features obtained by DTI, 1584 by NODDI, and 1593 by DMI, resulting in a total of 5939 features with data available in all participants. To address the considerably different group sizes, class_weight was set to 'balanced'. The class_weight argument is a model hyperparameter that assigns each class label a weight that is applied to the C value in the calculation of the soft margin. Different combinations of input parameters for the training subset were tested. Using the Scikit-learn StandardScaler, the input parameters were normalized to zero mean and unit variance.
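The effect of the 'balanced' setting can be made concrete with a short calculation. Scikit-learn documents the balanced weight of a class as n_samples / (n_classes * class count); the sketch below applies that formula in plain Python. The group sizes are those of the whole cohort and serve only as an illustration, since the weights actually used would depend on the composition of the training subset. The helper name `balanced_class_weights` and the base value C = 1.0 are assumptions.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Weights inversely proportional to class frequency: n_samples / (n_classes * count)."""
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {cls: n_samples / (n_classes * n) for cls, n in counts.items()}


# Whole-cohort group sizes from the abstract (illustration only).
labels = ["PD"] * 136 + ["MSA"] * 41 + ["PSP"] * 32 + ["HC"] * 27
weights = balanced_class_weights(labels)

base_C = 1.0  # hypothetical regularization parameter
effective_C = {cls: base_C * w for cls, w in weights.items()}
for cls, c in effective_C.items():
    print(f"{cls}: effective C = {c:.3f}")
```

Misclassifying a subject from a small group (e.g., HC) is thereby penalized roughly five times more heavily than misclassifying a subject from the large PD group.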
Prior to training, the input parameters were sorted with respect to maximum marginal diversity [34]. We chose this approach because of the relatively small subgroup sizes in relation to the maximum number of input features. For this, the normalized values of a given input feature are distributed into 20 equally spaced bins separately for each group. Subsequently, the Kullback-Leibler divergence of the resulting group-wise histograms is calculated as a measure of the difference between the distributions. To obtain the best combination of the linear SVM parameter C and the number of input parameters, different linear SVMs were trained in a grid search approach with C varying between 0.01 and 100 and the number of input parameters varying between the top 5% and the top 40%. The models were fivefold cross-validated and compared with respect to the area under the curve (AUC). The maximum number of input parameters was capped at 40% to reduce noise and prevent overfitting.
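The histogram-based feature ranking can be sketched as follows. The text does not specify how the group-wise divergences are aggregated; this sketch assumes the common definition of marginal diversity as the prior-weighted Kullback-Leibler divergence between each group's histogram and the pooled histogram. The function names and the synthetic data are hypothetical.

```python
import numpy as np


def marginal_diversity(x, y, n_bins=20):
    """Prior-weighted KL divergence between group-wise and pooled histograms of one feature."""
    edges = np.linspace(x.min(), x.max(), n_bins + 1)  # equally spaced bins
    pooled, _ = np.histogram(x, bins=edges)
    pooled = pooled / pooled.sum()
    md = 0.0
    for group in np.unique(y):
        hist, _ = np.histogram(x[y == group], bins=edges)
        p = hist / hist.sum()
        nonzero = p > 0  # empty bins contribute nothing to the KL sum
        kl = np.sum(p[nonzero] * np.log(p[nonzero] / pooled[nonzero]))
        md += np.mean(y == group) * kl
    return float(md)


def rank_features(X, y, keep_fraction=0.4):
    """Return indices of the top `keep_fraction` features, sorted by marginal diversity."""
    scores = np.array([marginal_diversity(X[:, j], y) for j in range(X.shape[1])])
    order = np.argsort(scores)[::-1]
    n_keep = max(1, int(round(keep_fraction * X.shape[1])))
    return order[:n_keep], scores


# Synthetic data: 5 features, of which only feature 2 differs between the two groups.
rng = np.random.default_rng(0)
y = np.repeat([0, 1], 50)
X = rng.normal(size=(100, 5))
X[:, 2] += 3 * y

kept, scores = rank_features(X, y, keep_fraction=0.4)
print(kept, np.round(scores, 2))
```

In a grid search such as the one described above, `keep_fraction` would be varied between 0.05 and 0.4 together with C, and the ranking would be computed on the training folds only so that the test data do not influence feature selection.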

Statistical Analysis
Statistical analysis was performed using R (version 4.1.0, https://www.R-project.org/). Data are presented as the mean and standard deviation for continuous variables and as absolute frequencies and percentages for categorical variables. We compared demographic and clinical characteristics of patient and control groups with analysis of variance (ANOVA) followed by Tukey's honest significance test, or with the Kruskal-Wallis test followed by post hoc pairwise Wilcoxon tests with Bonferroni-Holm correction. The AUC-ROC and DeLong's test were employed to investigate and compare the discriminative power of the different SVM inputs. The significance threshold was set to p < 0.05.
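For readers unfamiliar with the Bonferroni-Holm procedure, the step-down adjustment can be written in a few lines. The study used R; the Python sketch below only illustrates the arithmetic, and the function name `holm_adjust` and the p-values are hypothetical.

```python
def holm_adjust(p_values):
    """Bonferroni-Holm adjusted p-values, returned in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, i in enumerate(order):
        # The k-th smallest p-value is multiplied by (m - k + 1), capped at 1,
        # and adjusted values are forced to be non-decreasing.
        running_max = max(running_max, min(1.0, (m - rank) * p_values[i]))
        adjusted[i] = running_max
    return adjusted


# Hypothetical raw p-values from three pairwise comparisons.
raw = [0.01, 0.04, 0.03]
adjusted = holm_adjust(raw)
print([round(p, 2) for p in adjusted])  # [0.03, 0.06, 0.06]
```

With a threshold of p < 0.05, only the first comparison would remain significant after correction in this toy example.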

Comparison of Different SVM
In general, TPV alone performed worse than each dMRI-derived metric alone in both the training and test cohorts. Using DeLong's test for the training cohort, we found a significant superiority of the dMRI-derived metrics alone compared with TPV in the delineation of patients with PD (TPV vs. DTI, p = 0.027; TPV vs. NODDI, p = 0.004; TPV vs. DMI, p = 0.047), while none of the dMRI inputs performed better than another, as given in Supplementary Table 1. This finding, however, did not reach significance in the test cohort (Supplementary Table 2).
Fig. 3 Receiver operating characteristic curves for the combination of tissue probability values and diffusion microstructure imaging as the best performing combination to classify neurodegenerative Parkinson syndromes and healthy controls in the test cohort. HC healthy controls, PD Parkinson's disease, MSA multiple system atrophy, PSP progressive supranuclear palsy, AUC area under the curve
Fig. 4 Receiver operating characteristic curves for each investigated input separately and diagnostic performance in the test cohort as given by the area under the curve (AUC) to classify neurodegenerative Parkinson syndromes and healthy controls. Top left depicts the performance for the tissue probability values as input, top right for diffusion tensor imaging, bottom left for neurite orientation dispersion and density imaging, and bottom right for diffusion microstructure imaging. HC healthy controls, PD Parkinson's disease, MSA multiple system atrophy, PSP progressive supranuclear palsy
The overall best diagnostic performance in the training cohort was observed for the combination of TPV, DTI, and NODDI (see Supplementary Table 3). The corresponding OVR-ROC-AUC was 0.95 for HC, 0.94 for PD, 0.99 for MSA, and 0.96 for PSP. In the test cohort, the combination of DMI and TPV performed best and correctly classified 85% of the participants (i.e., the patients and healthy controls) and 88% of the patients (Fig. 3). The corresponding OVR-ROC-AUC was 0.88 for HC, 0.98 for PD, 1.0 for MSA, and 0.97 for PSP with sensitivities of 67% for HC, 95% for PD, 67% for MSA, and 86% for PSP. Respective specificities were 94% (HC), 95% (PD), 94% (MSA), and 97% (PSP). Upon inspection of the performance of the dMRI metrics alone, we found better diagnostic performance for DMI compared with DTI or NODDI, although this did not reach significance in DeLong's test (AUCs of DMI: 0.91 for HC, 0.94 for PD, 0.96 for MSA, 0.99 for PSP vs. DTI: 0.84 for HC, 0.96 for PD, 0.95 for MSA, 0.97 for PSP vs. NODDI: 0.86 for HC, 0.91 for PD, 1.00 for MSA, 0.95 for PSP). See Fig. 4, Table 2, and Fig. 5 for more details.
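Class-wise sensitivities and specificities such as those reported above follow from a multiclass confusion matrix by treating each class in turn as "positive" and all others as "negative". The sketch below shows this calculation for a hypothetical three-class matrix; the counts are invented for illustration and are not the study's confusion matrix, and the function name is an assumption.

```python
import numpy as np


def ovr_sensitivity_specificity(cm):
    """Per-class sensitivity and specificity from a confusion matrix (rows: true, columns: predicted)."""
    cm = np.asarray(cm, dtype=float)
    tp = np.diag(cm)
    fn = cm.sum(axis=1) - tp  # true members of the class assigned elsewhere
    fp = cm.sum(axis=0) - tp  # other classes assigned to this class
    tn = cm.sum() - tp - fn - fp
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical confusion matrix for three classes A, B, C (10 subjects each).
cm = [[8, 2, 0],
      [1, 9, 0],
      [0, 1, 9]]

sensitivity, specificity = ovr_sensitivity_specificity(cm)
accuracy = np.trace(np.asarray(cm)) / np.sum(cm)
print(sensitivity, specificity, round(float(accuracy), 3))
```

Note that the overall accuracy (here 26/30) is a single number, whereas sensitivity and specificity are reported per class, so a class with few subjects can have low sensitivity even when accuracy is high.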

Most Discriminative Regions
For the combination of DMI and TPV, the maximum marginal diversity algorithm revealed a high diversity of the group-specific marginal distribution especially for the putamen, the cerebellar peduncles, the pons, the pontine crossing tracts, the pallidum, various cerebellar regions, and the frontal white matter as given in Supplementary Table 4. In contrast, the SVM attributed high coefficient weight to widespread gray and white matter regions exceeding the areas with high maximum marginal diversity weight. In detail, cortical regions including the cingulum, temporal, and frontal cortex had a high discriminative value (Supplementary Table 4). Information on the best-performing combination in the training cohort (i.e., TPV, DTI, and NODDI) is provided in Supplementary Table 5.

Discussion
In the differential diagnosis of NPS, the microstructural parameters obtained in a mesoscopic approach using diffusion MRI data outperformed the macrostructural data derived from conventional structural T1w imaging. Of note, the combination of microstructural DMI and macrostructural TPV performed best.
In a head-to-head comparison, we found higher accuracy for the mesoscopic approaches than for T1w-derived TPV. Furthermore, the multicompartmental approaches of NODDI and DMI performed better than single compartmental DTI. However, despite the better AUC in the ROC analysis, this effect did not reach significance. We attribute this primarily to the fact that the differences in the already high diagnostic values are small and thus would require a much larger sample size for statistical significance. The added value of dMRI metrics is further indicated by the input combination of DMI and TPV, for which we observed high variance in regional TPV metrics in the maximum marginal diversity ranking, whereas the SVM relied more on DMI metrics. The value of advanced multicompartment diffusion MRI approaches compared to DTI has been demonstrated for NPS in previous studies. For example, Ogawa et al. used NODDI to more specifically interpret the alterations observed in DTI in patients with MSA [11]. A study using DMI was able to demonstrate the added diagnostic value over DTI in identifying patients with PSP versus a comparison cohort [35]. Using NODDI or DMI, evidence of neurodegeneration in PD was found in the substantia nigra [20,36] as well as in the nigrostriatal pathway [37]. Despite the hypothesis that DMI is more suited than NODDI to capture pathological conditions in the calculation of metrics [17,22], and although the highest diagnostic value of the SVM was found for the input combination DMI + TPV, we did not detect a significant superiority of DMI over NODDI in DeLong's test. Again, we attribute this primarily to the fact that the differences in the already high diagnostic values are small and would require a much larger sample size for statistical significance. The same reasoning holds for the fact that the test and training cohorts differ with respect to the best-performing combination of input parameters.
Fig. 5 Confusion matrix for the combination of the tissue probability values and diffusion microstructure imaging metrics as the best performing input in the test cohort. SVM support vector machine, HC healthy controls, PD Parkinson's disease, MSA multiple system atrophy, PSP progressive supranuclear palsy

SVM-Prediction
The validity of the different macrostructural and microstructural inputs is supported by the fact that the maximum marginal diversity algorithm identified regions mainly typical for MSA and PSP [7] (i.e., middle and superior cerebellar peduncle, midbrain, and pons). That the SVM also assigned high coefficients to numerous supratentorial regions can be explained by the fact that atrophy in PD is more subtle and less localized, thus expressing lower variance in the overall cohort while nevertheless having diagnostic value for the SVM. In line with this, the observed regions have been previously identified as pathologically altered in PD [7,38].
When assessing the performance of the SVM, a particular challenge for all approaches was the differentiation between HC and PD, as is also the case in conventional radiological reporting [39]. In the test cohort, one third of the HC subjects were erroneously assigned to the PD group. We attribute this mainly to the different group sizes, with the large PD group leading to a potential bias in favor of PD; however, as the overall distribution of entities in our cohort reflects the expected distribution of diagnoses in clinical practice, the SVM's performance is satisfactory. Future research should evaluate this finding by including more HC.
Techniques based on artificial intelligence have previously been investigated to classify NPS using diffusion MRI. In line with our results, a study of 45 patients with PD, 20 with PSP, and 38 HC showed that the combination of DTI and macrostructural parameters performed best [40]. Of particular note is a multicenter study that demonstrated high diagnostic value in a dataset of 1002 patients with NPS and HC using free-water and free-water corrected FA with an AUC in PD vs. atypical parkinsonism of 0.96 and MSA vs. PSP of 0.93, although lacking a multiclass approach [13]. Compared to these approaches, we were able to demonstrate excellent AUCs in the test cohort; however, our accuracy in the test cohort was 83%, which is particularly driven by the HCs incorrectly assessed as PD. Potential approaches to overcome the diagnostic challenge of delineating HCs and PD might include the additional use of neuromelanin-sensitive imaging parameters [41].
A potential limitation is the accuracy of the clinical gold standard diagnosis [2] and, consequently, the lack of pathological diagnoses. To mitigate this issue, diagnoses were made by two neurologists with long-standing experience in the diagnosis of movement disorders. Additionally, [18F]-FDG-PET was available in a large proportion of patients, strengthening the clinical diagnoses [3]. To determine the most suitable parameters for differentiating neurodegenerative Parkinson syndromes, larger studies with greater statistical power are essential. This will contribute to further refining these methodologies. Our single-center findings require validation in an external cohort in further studies. Strengths of our study are that we did not constrain our analyses to single atlas regions as input, thereby avoiding a priori assumptions, and that the algorithm is capable of multiclass classification of NPS and can therefore also validly identify HC in this process.
In conclusion, we were able to demonstrate that (1) MRI parameters that approximate microstructure using a mesoscopic approach provided substantial added value over conventional macrostructural imaging, (2) multicompartmental/biophysically motivated models performed better than the single compartmental DTI, although not reaching statistical significance and (3) NPS and HC were classified with satisfactory performance using an SVM, combining macrostructural and microstructural information, thus suggesting a complementary value of both approaches.
Data Sharing Data are available from the authors upon reasonable request and approval of the ethics committee. The code is available in a public repository.

Declarations
Conflict of interest N. Schröter received grants from Berta-Ottenstein-Programme for Clinician Scientists, Faculty of Medicine, University of Freiburg, and honoraria from AbbVie (presentations). M. Reisert received honoraria from AbbVie (presentations). E. Kellner is a shareholder of and receives fees from VEObrain GmbH, Freiburg, Germany. H. Urbach is co-editor of Clinical Neuroradiology, member of the Advisory Board of Biogen and received honoraria for lectures from Biogen, Eisai, and mbits. A. Rau received grants from Berta-Ottenstein-Programme for Clinician Scientists, Faculty of Medicine, University of Freiburg. All declared interests are independent of the submitted work. P.G. Arnold, J.A. Hosp, M. Rijntjes, W.H. Jost and C. Weiller declare that they have no competing interests.

Ethical standards
For this article no studies with human participants or animals were performed by any of the authors. All studies mentioned were in accordance with the ethical standards indicated in each case.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

Fig. 2 Schematic of data processing and input into the support vector machine (SVM).GM gray matter tissue probability value, WM white matter tissue probability value, NODDI neurite orientation dispersion and density imaging, DTI diffusion tensor imaging, DMI diffusion microstructure imaging, TPV tissue probability value

Table 1
Demographic and clinical characteristics of patient groups
Hoehn and Yahr stage at MRI was missing for eight patients with MSA and one patient with PSP. UPDRS III in OFF-State was missing in 11 patients with PD, 6 patients with MSA and 11 patients with PSP
HC healthy controls, PD Parkinson's disease, MSA multiple system atrophy, PSP progressive supranuclear palsy, SD standard deviation, UPDRS III in OFF-State unified Parkinson's disease rating scale motor part 3 (off medication)
a-e Significance of pairwise comparisons: a, b, c, e, f p < 0.001; d, h p < 0.01; g p < 0.05

Table 2
Areas under the curve of the receiver operating characteristics in the test cohort
DMI diffusion microstructure imaging, DTI diffusion tensor imaging, NODDI neurite orientation dispersion and density imaging, TPV tissue probability value