Abstract
Background
Differentiating progressive supranuclear palsy-Richardson’s syndrome (PSP-RS) from PSP-parkinsonism (PSP-P) may be extremely challenging. In this study, we aimed to distinguish these two PSP phenotypes using structural MRI data.
Methods
Sixty-two PSP-RS, 40 PSP-P patients and 33 control subjects were enrolled. All patients underwent brain 3 T-MRI; cortical thickness and cortical/subcortical volumes were extracted using Freesurfer on T1-weighted images. We calculated the automated MR Parkinsonism Index (MRPI) and its second version, which also includes the third ventricle width (MRPI 2.0), and tested their classification performance. We also employed a machine learning (ML) classification approach using two decision tree-based algorithms (eXtreme Gradient Boosting [XGBoost] and Random Forest) with different combinations of structural MRI data to differentiate between PSP phenotypes.
Results
MRPI and MRPI 2.0 had AUC of 0.88 and 0.81, respectively, in differentiating PSP-RS from PSP-P. ML models demonstrated that the combination of MRPI and volumetric/thickness data was more powerful than each feature alone. The two ML algorithms showed comparable results, and the best ML model in differentiating between PSP phenotypes used XGBoost with a combination of MRPI, cortical thickness and subcortical volumes (AUC 0.93 ± 0.04). Similar performance (AUC 0.93 ± 0.06) was also obtained in a sub-cohort of 59 early PSP patients.
Conclusion
The combined use of MRPI and volumetric/thickness data was more accurate than each MRI feature alone in differentiating between PSP-RS and PSP-P. Our study supports the use of structural MRI to improve the early differential diagnosis between common PSP phenotypes, which may be relevant for prognostic implications and patient inclusion in clinical trials.
Introduction
Over the last decade, many studies have reported the existence of distinct progressive supranuclear palsy (PSP) phenotypes characterized by different initial clinical presentation and progression, with PSP-Richardson’s syndrome (PSP-RS) and PSP-parkinsonism (PSP-P) as the most frequent phenotypes [1,2,3,4,5,6]. PSP-RS patients usually show a more severe disease course and overall earlier appearance of typical PSP symptoms, but the clinical differential diagnosis between PSP phenotypes is challenging even for movement disorder specialists [1, 2, 7,8,9,10,11]. The diagnosis is based on the clinical presentation at the beginning of the disease, and the main difference between PSP-RS and PSP-P relies on postural instability (PI), which must be present within the first 3 years of the disease for a PSP-RS diagnosis, whereas it usually appears late in PSP-P [1, 2, 8, 12]. The first logical implication is that a PSP-P diagnosis requires a disease duration of at least three years to rule out the appearance of early falls, which entails a significant diagnostic delay [1]. In addition, establishing the presence of PI can be difficult in the early stage, since the pull test is not objective: it is affected by variability in pull strength and patient conditioning, as well as by the patient’s attention/cognition, age and comorbidities [13, 14]. On the other hand, in patients with advanced disease, establishing the exact time of appearance of PI is difficult, since falls may have different causes including freezing of gait, impaired balance, cognitive decline and environmental factors [15, 16]. For these reasons, objective imaging biomarkers to support the differential diagnosis between common PSP phenotypes are urgently needed.
Most studies so far have focused on the differential diagnosis between PSP and other parkinsonian syndromes, and several imaging biomarkers have been reported to distinguish PSP-RS from Parkinson’s disease (PD) and multiple system atrophy, including planimetric MRI measures (manual or automated) [17,18,19,20], brain volumetry [21, 22], diffusion tensor imaging metrics [23,24,25], and PET imaging with 18FDG [26] or tau tracers [27]. Among the MR planimetric measures, most studies evaluated the midbrain/pons area ratio and the Magnetic Resonance Parkinsonism Index (MRPI). The latter is an MR planimetric biomarker combining the midbrain area and the superior cerebellar peduncle width (normalized by pons area and middle cerebellar peduncle width, respectively, as reference structures); it is calculated by multiplying the pons/midbrain area ratio by the ratio between the middle cerebellar peduncle width and the superior cerebellar peduncle width [19]. A few imaging biomarkers, such as the MRPI 2.0 (a second version of the MRPI, obtained by multiplying the MRPI value by the third ventricle width normalized by the frontal horns width) [28, 29] and FDG-PET [26], also showed good performance in distinguishing PSP-P from PD patients. Accurate biomarkers to distinguish between PSP-RS and PSP-P, however, are not currently available.
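The two index definitions given above can be written out as a short worked example. This is only a sketch of the arithmetic, not the automated measurement algorithm [29]; the function names and the input measurements are hypothetical and chosen purely for illustration.

```python
def mrpi(pons_area, midbrain_area, mcp_width, scp_width):
    """MRPI = (pons area / midbrain area) * (MCP width / SCP width)."""
    return (pons_area / midbrain_area) * (mcp_width / scp_width)


def mrpi_2(mrpi_value, third_ventricle_width, frontal_horns_width):
    """MRPI 2.0 = MRPI * (third ventricle width / frontal horns width)."""
    return mrpi_value * (third_ventricle_width / frontal_horns_width)


# Hypothetical measurements (areas in mm^2, widths in mm), not patient data.
value = mrpi(pons_area=500.0, midbrain_area=100.0, mcp_width=9.0, scp_width=3.0)
value_2 = mrpi_2(value, third_ventricle_width=8.0, frontal_horns_width=32.0)
print(value, value_2)
```

Because the midbrain area and the superior cerebellar peduncle width sit in the denominators, atrophy of either structure raises the index, which is why higher values point towards PSP.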
Advancements in machine learning (ML) have permeated various domains of medicine, through the development of accurate classification or prediction models which may assist physicians in clinical decision making [30, 31]. Several machine learning algorithms have been successfully applied to structural MRI data in the differential diagnosis of neurological diseases [21, 22, 32, 33]. Random Forest (RF) and XGBoost are widely used classification algorithms with a decision tree-based approach: RF is an algorithm based on classification and regression trees (CART) introduced by Breiman [34], which constructs trees in parallel and makes predictions through majority voting; the XGBoost algorithm uses eXtreme Gradient Boosting to maximize classification performance, generating trees sequentially so that each new tree corrects the errors of the previous ones [35].
In the current study, we investigated whether the MRPI and MRPI 2.0, alone or included in decision tree-based machine learning models (XGBoost and RF) in combination with other structural MRI data, could differentiate between PSP-RS and PSP-P.
Materials and methods
Participants
One hundred and nine PSP patients (65 probable PSP-RS and 44 probable PSP-P) were consecutively recruited at the Movement Disorder Center of Magna Graecia University, between 2012 and 2020.
The clinical diagnoses of PSP-RS and PSP-P were made by movement disorder specialists according to international diagnostic criteria [1]. PSP patients enrolled before 2017 were diagnosed according to previous diagnostic criteria [36] and expert guidelines [37] and were retrospectively reclassified according to recent MDS diagnostic criteria for probable PSP-RS (vertical ocular dysfunction associated with early postural instability) and PSP-P (vertical ocular dysfunction associated with parkinsonism as predominant clinical features, in the absence of early postural instability) [1]. PSP-P patients with disease duration shorter than 3 years underwent clinical follow-up to rule out the appearance of early falls. Exclusion criteria were the presence of clinical features suggestive of other diseases, normal striatal uptake on 123I-FP-CIT-SPECT, and MRI abnormalities such as lacunar infarctions in the basal ganglia, diffuse subcortical vascular lesions, or imaging signs suggestive of normal pressure hydrocephalus [38]. Most PSP-P patients included in the current cohort have been reported in a recent study validating the automated MRPI 2.0 [29], but no comparison with PSP-RS patients was performed in that study. All patients underwent a neurological examination including the MDS-sponsored revision of the Unified Parkinson’s Disease Rating Scale part III (MDS-UPDRS-III) [39] in off-state, the Hoehn and Yahr (H–Y) rating scale [40] and the Mini Mental State Examination (MMSE) [41]. Written informed consent according to the Declaration of Helsinki for the use of their medical records for research purposes was obtained from all individuals participating in the study. All study procedures and ethical aspects were approved by the institutional review board (Magna Graecia University review board, Catanzaro, Italy).
MRI acquisition and processing
All study participants underwent a brain MRI with a 3 T-MR750 General Electric scanner and an 8-channel head coil, with a recently described MRI protocol including a 3D T1-weighted MR image [42]. Freesurfer 7 was employed with the standard pipeline recon-all to automatically extract thickness and volume of 34 cortical regions for each hemisphere, and the volume of the subcortical regions caudate, putamen, globus pallidus, thalamus and cerebellum, divided into white and gray matter (WM, GM) [43]. All the segmentations performed by Freesurfer were visually inspected by a neuroradiologist, and images with inaccurate segmentations due to prominent movement artefacts (3 PSP-RS and 5 PSP-P patients) were excluded. The automated MRPI and MRPI 2.0 were calculated on 3D T1-weighted MR images using the previously described algorithm [29]. In 4 PSP-RS patients the algorithm failed and the MRPI and MRPI 2.0 were measured manually by an expert rater.
Statistical analysis
Differences in gender distribution were assessed with Fisher’s exact test. Normality of data was tested using the Shapiro–Wilk test. Analysis of variance (ANOVA) or the Kruskal–Wallis test was employed to compare age at examination and education level among the three groups (PSP-RS, PSP-P and control subjects). Age at disease onset, disease duration and clinical scores were compared between PSP-RS and PSP-P patients using the t-test or the Wilcoxon rank sum test. ANCOVA with age and education level as covariates was applied to assess differences in MMSE. ANCOVA with age and gender as covariates was used to compare cortical thickness and cortical and subcortical volumes among groups. Additional covariates were education level for cortical thickness, education level and intracranial volume (ICV) for cortical volumes, and ICV for subcortical volumes. All ANCOVA tests were repeated to assess differences between PSP-RS and PSP-P, also including disease duration as a covariate. All tests were two-tailed, and the α level was set at 0.05. All p values were Bonferroni-corrected. Statistical analysis was conducted with R version 4.1.2.
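As a brief illustration of the Bonferroni correction mentioned above, the following sketch multiplies each raw p value by the number of tests and caps the result at 1. The function name and the p values are hypothetical; how tests were grouped into families for correction in this study is not specified here, so the example simply treats one list as one family.

```python
def bonferroni(p_values):
    """Return Bonferroni-adjusted p values: min(1, p * number_of_tests)."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


# Hypothetical raw p values from three comparisons, for illustration only.
raw = [0.001, 0.02, 0.04]
adjusted = bonferroni(raw)

# Compare each adjusted p value with the alpha level of 0.05.
significant = [p < 0.05 for p in adjusted]
print(adjusted, significant)
```

With three tests, only the first comparison remains below 0.05 after adjustment, although all three raw p values were below that threshold.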
Receiver operating characteristic (ROC) analysis
We first assessed the diagnostic performance of the automated MRPI and MRPI 2.0 in differentiating between PSP-RS and PSP-P patients, and between patients and controls. In addition, we also tested these biomarkers in a sub-cohort of early PSP patients (38 PSP-RS and 21 PSP-P) with disease duration up to 4 years (early stage), selected from the whole cohort. Optimal cut-offs, defined as the values with the highest sum of sensitivity and specificity on the Receiver Operating Characteristic (ROC) curves, and 95% confidence intervals (CI), were calculated using pROC software package with bootstrapping (n = 2000 iterations) [44].
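The cut-off rule described above (the value with the highest sum of sensitivity and specificity) can be sketched as follows. This is a simplified illustration in plain Python rather than the pROC procedure used in the study: it evaluates only the observed values as candidate cut-offs, breaks ties by keeping the lowest cut-off, and omits the bootstrap confidence intervals. The function name and the example values are hypothetical.

```python
def youden_cutoff(scores, labels):
    """Return (cutoff, sensitivity, specificity) maximizing sens + spec.

    A case is classified as positive when its score is >= cutoff.
    labels: 1 for the positive class, 0 for the negative class.
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    best = None
    for cutoff in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= cutoff)
        tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < cutoff)
        sens, spec = tp / n_pos, tn / n_neg
        # Strict '>' keeps the lowest cut-off when several are tied.
        if best is None or sens + spec > best[1] + best[2]:
            best = (cutoff, sens, spec)
    return best


# Hypothetical index values; label 1 = PSP-RS, label 0 = PSP-P.
scores = [12.0, 13.5, 14.5, 18.0, 16.0, 17.5, 18.5, 20.0]
labels = [0, 0, 0, 0, 1, 1, 1, 1]
print(youden_cutoff(scores, labels))
```

In this toy example the selected cut-off classifies all positive cases correctly and misclassifies one of the four negative cases.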
Machine learning models
Subsequently, we investigated the performance of Machine Learning (ML) models based on structural MR imaging data in distinguishing between PSP-RS and PSP-P patients, and between patients and controls, both in the whole cohort and in the above-mentioned early cohort. ML models used two different tree-based algorithms (Random Forest [RF] and XGBoost [XGB]) [34, 35] with all combinations of five different imaging variable groups: cortical thickness (34 regions for each hemisphere), cortical volumes (34 regions for each hemisphere), subcortical volumes (bilateral caudate, putamen, pallidum, thalamus, cerebellar grey and white matter), MRPI and MRPI 2.0 values. Age, gender, education level and intracranial volume were also included in all ML models, but the feature importance (both in RF and XGB) showed that these variables were not relevant for classification, and the feature selection procedure excluded them from the final models. The hyperparameters of the two ML algorithms were tuned through five-fold cross-validation (fivefold cv) with randomized search (ten iterations) to maximize the accuracy [45, 46]. In detail, the dataset was split into K subsets (folds) and the model was iteratively fitted K times, each time training it on K-1 folds and validating it on the remaining fold not used for training. The hyperparameters tuned for RF were: number of trees, number of features considered for splitting a node, number of levels in each decision tree, minimum number of data points placed in a node before the node is split, and minimum number of data points allowed in a leaf node. The hyperparameters tuned for XGB were: learning rate, maximum depth, minimum child weight, gamma and fraction of features to use. Further details on hyperparameter tuning are provided in the supplementary materials. The permutation feature importance (Mean Decrease in Accuracy, MDA) [47] was then evaluated, using 50 repetitions to ensure the reliability of the feature ranking, which might otherwise be biased by the multicollinearity among the training features.
Feature selection was then applied by iteratively training the models on the variables ordered according to the permutation importance. Finally, the performance of the RF and XGB models trained on the most important features was evaluated using fivefold cv with 5 repetitions, and the mean and standard deviation of area under the curve (AUC), accuracy, sensitivity and specificity were calculated. A model was considered able to distinguish between groups when the mean AUC in the validation folds was > 0.85. The analyses were conducted with Python 3.9 and the scikit-learn package v1.0.2.
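The steps described above (randomized hyperparameter search with five-fold cross-validation, permutation importance with 50 repetitions, and repeated five-fold evaluation) can be sketched with scikit-learn as follows. This is an illustrative outline under several assumptions, not the study code: it runs on synthetic data, uses only Random Forest (an XGBoost classifier would require the separate xgboost package), the search space is hypothetical, and the number of retained features is fixed at five rather than chosen iteratively.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import (
    RandomizedSearchCV,
    RepeatedStratifiedKFold,
    cross_val_score,
)

# Synthetic stand-in for the imaging features (NOT the study data).
X, y = make_classification(
    n_samples=102, n_features=20, n_informative=5, random_state=0
)

# 1) Randomized search: ten iterations, five-fold CV, accuracy as the target.
param_distributions = {  # hypothetical search space
    "n_estimators": [50, 100],
    "max_features": ["sqrt", "log2"],
    "max_depth": [3, 5, None],
    "min_samples_split": [2, 5, 10],
    "min_samples_leaf": [1, 2, 4],
}
search = RandomizedSearchCV(
    RandomForestClassifier(random_state=0),
    param_distributions,
    n_iter=10,
    cv=5,
    scoring="accuracy",
    random_state=0,
)
search.fit(X, y)
model = search.best_estimator_

# 2) Permutation importance (mean decrease in accuracy), 50 repetitions.
result = permutation_importance(
    model, X, y, n_repeats=50, scoring="accuracy", random_state=0
)
ranking = np.argsort(result.importances_mean)[::-1]

# 3) Keep the top-ranked features (fixed at 5 here for brevity).
top_k = ranking[:5]

# 4) Five-fold CV with 5 repetitions; report mean and SD of the AUC.
cv = RepeatedStratifiedKFold(n_splits=5, n_repeats=5, random_state=0)
auc = cross_val_score(model, X[:, top_k], y, cv=cv, scoring="roc_auc")
print(f"AUC {auc.mean():.2f} ± {auc.std():.2f}")
```

Note that in this sketch the features are ranked on the full dataset before cross-validation, which can make the reported AUC optimistic; nesting the ranking inside each training fold would avoid that, at the cost of a longer example.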
Results
The demographic, clinical and imaging data of PSP-RS and PSP-P patients are summarized in Table 1. The two patient groups had similar age at examination and gender distribution. PSP-RS patients showed higher clinical severity than PSP-P. Education level and MMSE scores were lower in PSP patients than in control subjects, but similar between the two PSP phenotypes (Table 1). The whole cohort was then split into early and late sub-cohorts; early PSP patients (38 PSP-RS and 21 PSP-P) had disease duration up to 4 years (range 1–4 years), while late PSP patients (24 PSP-RS and 18 PSP-P) had disease duration > 4 years (range 5–14 years). Demographic and clinical data of early and late sub-cohorts are shown in Table 2.
Structural MRI data
Both PSP phenotypes had higher MRPI and MRPI 2.0 values than control subjects, and PSP-RS patients had significantly higher values than PSP-P patients (Tables 1 and 2). Both PSP groups also showed reduced thickness and volume in frontal lobe regions, but PSP-P had more widespread cortical thinning, also involving the temporal and parietal lobes (Tables S1 and S2). This finding was confirmed by the direct comparison between the two PSP phenotypes, which showed cortical thinning in PSP-P patients compared to PSP-RS patients in several brain regions (Table S1). In contrast, PSP-RS patients had more severe atrophy of subcortical structures, including thalamus, pallidum and cerebellum (Table S3). Similar results were obtained in the early sub-cohort (Tables S4 and S5). The main differences with respect to the whole cohort were that cortical involvement was detected only by thickness, while cortical volumes in the early sub-cohort were not different among the three groups, and that the cortical thinning in early PSP-P patients involved the frontal and parietal regions, sparing the temporal lobes (Table S4).
Classification performance of MRPI and MRPI 2.0 in distinguishing between PSP phenotypes using ROC analysis
The MRPI had acceptable performance (AUC 0.88) and was superior to the MRPI 2.0 (AUC 0.81) in distinguishing between the two PSP phenotypes (Fig. 1 and Table S6). Similar performances were obtained in the early sub-cohort (Fig. 1 and Table S6). The ROC analysis identified optimal cut-off values of 16.25 for MRPI and 3.82 for MRPI 2.0 in distinguishing between PSP-RS and PSP-P (Table S6). The classification performances of MRPI and MRPI 2.0 in distinguishing PSP-RS and PSP-P from control subjects are described in supplementary materials and Table S6.
Fig. 1 Receiver operating characteristic (ROC) curves for assessing the classification performance of automated MRPI (A) and automated MRPI 2.0 (B) in differentiating between PSP-RS and PSP-P patients, in the whole cohort (red) and in the sub-cohort of early-stage PSP patients (blue). MRPI Magnetic Resonance Parkinsonism Index, PSP-RS progressive supranuclear palsy-Richardson’s syndrome, PSP-P progressive supranuclear palsy-parkinsonism, AUC area under the ROC curve
Classification performance of ML models in distinguishing between PSP phenotypes
ML models with the MRPI and MRPI 2.0 used alone showed acceptable performance (AUC 0.86 and 0.79, respectively) in differentiating between PSP-RS and PSP-P patients, in line with ROC results. Lower performances were obtained by ML models using only cortical thickness (AUC 0.82), cortical volumes (AUC 0.78) or subcortical volumes (AUC 0.82), as shown in Tables 3 and 4 and Fig. 2. In most cases the performances were slightly higher using XGBoost than using Random Forest. ML models combining volumetric/cortical thickness data together with planimetric biomarkers (MRPI or MRPI 2.0) showed the highest classification performance in distinguishing the two PSP phenotypes, reaching mean AUC in the validation folds of 0.94 ± 0.04 using XGBoost and 0.91 ± 0.06 using RF (Table 5 and Figs. 2 and 3). In all these models, the MRPI was the selected feature with the highest importance score (Figs. 3 and 4). The Receiver Operating Characteristic (ROC) curve and the feature importance list of the best XGBoost and RF models are shown in Fig. 3. Of importance, similar results were obtained also in the differentiation between PSP-RS and PSP-P patients in the first years after disease onset (Figs. 2 and 4, Tables 5, S7 and S8), which is clinically more challenging. Classification performances of ML models in distinguishing PSP-RS and PSP-P from controls are described in supplementary materials and Tables 3, 4 and 5. All the hyperparameters of the best models are shown in supplementary materials and Table S9.
Fig. 2 Machine learning models in differentiating between PSP-RS and PSP-P patients in the whole cohort (A) and in the sub-cohort of early-stage PSP patients (B). The XGBoost “combined model” in the whole cohort was trained on MRPI values, cortical thickness and subcortical volumes. The XGBoost “combined model” in the sub-cohort of early-stage patients was trained on MRPI values, cortical thickness, cortical volumes and subcortical volumes. MRPI Magnetic Resonance Parkinsonism Index, AUC area under the curve
Fig. 3 Machine learning models in differentiating between PSP-RS and PSP-P patients in the whole cohort. On the left side, classification performances of the best XGBoost (top line) and Random Forest (bottom line) models in distinguishing between the two PSP phenotypes. The XGBoost model was trained on MRPI values, cortical thickness and subcortical volumes. The Random Forest model was trained on MRPI values, cortical thickness and cortical volumes. On the right side, the feature importance assessed via permutation methods in distinguishing between the two groups; data are shown in descending order from the most to the least important feature. MRPI Magnetic Resonance Parkinsonism Index, WM white matter, Rh right, Lh left, AUC area under the curve
Fig. 4 Machine learning models in differentiating between PSP-RS and PSP-P patients in the sub-cohort of PSP patients with short disease duration (early cohort). On the left side, classification performances of the best XGBoost (top line) and Random Forest (bottom line) models in distinguishing between the two PSP phenotypes. The XGBoost model was trained on MRPI values, cortical thickness, cortical volumes and subcortical volumes. The Random Forest model was trained on MRPI 2.0 values, cortical thickness and subcortical volumes. On the right side, the feature importance assessed via permutation methods in distinguishing between the two groups; data are shown in descending order from the most to the least important feature. MRPI Magnetic Resonance Parkinsonism Index, WM white matter, Rh right, Lh left, AUC area under the curve
Classification performance of ML models in distinguishing between early and late PSP patients
Finally, we investigated the performance of each structural MRI metric in distinguishing between early and late patients, separately for PSP-RS and PSP-P cohorts. As shown in Table S10, both classifiers (XGB and RF) showed that the cortical metrics were superior to the brainstem measurements (MRPI and MRPI 2.0) in distinguishing between early and late patients, with cortical thickness as the best feature in both PSP-RS and PSP-P cohorts. The main difference between the two PSP phenotypes was the higher performance of subcortical volumes in distinguishing between early and late patients in PSP-P cohort than in PSP-RS cohort.
Discussion
In this study, we investigated the role of several structural MRI features, including both planimetric (MRPI and MRPI 2.0) and volumetric data (cortical thickness, cortical volumes and subcortical volumes), in differentiating between PSP-RS and PSP-P patients. Machine Learning models using a combination of MRPI and volumetric/thickness data showed the best classification performance in distinguishing between these two PSP phenotypes.
Differentiating between PSP-RS and PSP-P may be challenging in clinical practice [7,8,9,10,11], suggesting the need for objective imaging biomarkers to support the differential diagnosis between these two phenotypes. Previous MR studies found smaller volume of midbrain, superior cerebellar peduncles (SCPs), subthalamic nucleus and cerebellum, and more widespread white matter (WM) involvement in PSP-RS than in PSP-P at the group level [48,49,50,51]. Pilot studies in small PSP cohorts reported excellent performances in differentiating between PSP-RS and PSP-P using DTI metrics in the dentatorubrothalamic tract [23, 50], but these findings were not confirmed by other authors [52, 53], making further studies necessary to explore the potential of DTI in the differential diagnosis between PSP phenotypes. Taken together, these findings suggest that no robust imaging biomarker to accurately differentiate between PSP-RS and PSP-P at the individual level is currently available.
The MRPI and MRPI 2.0 (a second version of this biomarker also including the measurement of the third ventricle width) are two well-known automated biomarkers to distinguish PSP-RS and PSP-P from other parkinsonian syndromes [17, 28]. Here, we investigated the performance of these biomarkers in distinguishing between these two PSP phenotypes. In our cohort, PSP-RS patients had higher MRPI and MRPI 2.0 values than PSP-P, and these biomarkers showed acceptable performances (AUC 0.88 and 0.81, respectively) using ROC analysis in differentiating between these two diseases. Similar results were obtained in the early PSP cohorts where MRPI and MRPI 2.0 showed AUC of 0.87 and 0.79, respectively in differentiating PSP-RS from PSP-P. Our results are in line with some previous reports [51, 54] and slightly better than others [4, 55] showing suboptimal performances of these MR biomarkers in distinguishing between PSP phenotypes. Previous evidence demonstrated that the MRPI 2.0 was more powerful than the MRPI in distinguishing patients with PSP-P from those with Parkinson’s disease (PD) [28, 29, 56]. In our study, however, the MRPI 2.0 was not superior to the MRPI in distinguishing between PSP-RS and PSP-P, likely due to the similar degree of third ventricle enlargement usually observed in these two PSP phenotypes [28].
In the current study, we compared the performances of MRPI and MRPI 2.0 with those of cortical thickness, cortical volumes and subcortical volumes in differentiating between PSP-RS and PSP-P employing two of the most used decision tree-based approaches for ML classification (Random Forest and XGBoost). These ML models showed that cortical thickness, cortical volumes and subcortical volumes, used separately, were not able to accurately distinguish between PSP-RS and PSP-P patients, and that these features were less powerful than MRPI in differentiating between these two PSP phenotypes. This result may be surprising since PSP-RS and PSP-P showed significant differences in volumetric/cortical thickness atrophy of the brain. Indeed, in agreement with previous imaging and pathological data [9, 57,58,59] a reduced volume in the thalamus, globus pallidus and cerebellum was found in PSP-RS compared to PSP-P patients. On the other hand, PSP-P patients showed more widespread cortical thinning than PSP-RS, involving also some temporal and parietal regions in addition to the frontal lobes, which were affected in both diseases. These between-group differences, however, were not large enough to allow these features to accurately classify PSP phenotypes.
In an effort to improve the classification accuracy of the automated MRPI biomarkers in the differential diagnosis between PSP phenotypes, in the current study, we combined MRPI and MRPI 2.0 with other structural MRI data (cortical thickness, cortical volumes and subcortical volumes) into ML models. This new approach yielded a very good performance (AUC 0.94) when MRPI, cortical thickness and subcortical volumes were combined for differentiation between PSP-RS and PSP-P, outperforming these features used alone, and the performance improvement was even higher in the early cohort. The ML model with the best performance used XGBoost, with MRPI selected as the most important feature, both in the whole and in the early cohorts. This higher classification performance obtained with the ML approach may be the result of combining the larger subcortical atrophy observed in PSP-RS patients (detected by MRPI and subcortical volumes) and the higher cortical involvement in PSP-P (detected by cortical thickness and volumes). These results on the combination of cortical and subcortical data are in line with very recent structural MRI studies in PSP. A recent large study [60] demonstrated that the MRPI performed well in distinguishing pathologically-proven PSP-RS patients from cortico-basal degeneration (CBD) and from other neurodegenerative diseases including fronto-temporal lobe degeneration and Alzheimer’s disease, but the addition of cortical thickness data to the MRPI further increased classification performance, due to the lower cortical atrophy in PSP-RS patients than in the other neurodegenerative conditions considered.
Finally, we investigated the performance of each structural MR metric in distinguishing between early and late patients, separately for PSP-RS and PSP-P, which may provide insights on the brain atrophy progression in these common PSP phenotypes. In our cohort, the cortical thickness was the best structural metric in distinguishing between early and late patients, both in PSP-RS and PSP-P cohorts. These results are in line with pathological and imaging studies showing that the neurodegenerative process usually starts in the brainstem regions and basal ganglia, and later spreads to cortical regions [59, 61]. This time sequence thus makes brainstem atrophy more useful for the early differential diagnosis and cortical atrophy more suitable for distinguishing between early and late stages of the disease.
Overall, the two ML algorithms used in this study showed very similar results in most comparisons, with XGB showing slightly better performance than RF in a few cases. Although these two tree-based ML algorithms share several rules for tree growing, they differ in how the ensemble of trees is created. RF uses bagging to build trees in parallel, and the prediction is then made by majority voting [34]. In contrast, XGB builds a sequential ensemble of trees, with each tree aiming to improve on the previous one by correcting its errors [35]. Broadly speaking, XGB may thus be slightly more powerful than RF because of its ability to learn from its wrong predictions, which are corrected by giving more weight to the misclassified instances, and because of its greater ability to deal with imbalanced datasets [35, 62]. The main advantage of RF is that its performance may be less influenced by slight modifications in hyperparameter tuning compared with XGB [62], and the very similar results obtained using RF in the present work (compared to XGB) increase the reliability of the findings.
The importance of the current study, which demonstrates a role for structural MRI in the differential diagnosis between common PSP phenotypes, is linked to the large clinical overlap between PSP-RS and PSP-P, which can make the clinical differential diagnosis difficult. Distinguishing between these two PSP phenotypes, however, is highly relevant in clinical practice for prognostic purposes, since PSP-P is characterized by significantly slower disease progression than PSP-RS. Indeed, while PSP-RS is a rapidly progressive PSP phenotype, with death occurring after 6–8 years, PSP-P patients have a more benign disease course and longer survival [63,64,65]. These discrepancies among PSP phenotypes may also significantly affect the results of clinical trials of potential disease-modifying therapies in PSP patients. In fact, to avoid bias and optimize statistical power, it is crucial to include in these trials homogeneous populations with a similar rate of progression over time, rather than lumping together PSP patients with different phenotypes [7, 65]. The current study provides evidence that ML models using combined structural MRI data can accurately differentiate between PSP-RS and PSP-P even in the early stage of the disease, when patients are more suitable for enrollment in trials; thus, if further validated in independent cohorts, these automated imaging biomarkers to support PSP phenotype classification may significantly improve future clinical trial design in PSP. A limitation to the immediate widespread use of such biomarkers is the complexity of ML approaches, which require high-level technology and expertise not yet available in clinical routine; however, there is growing interest in the use of ML for diagnostic purposes in medicine, and such approaches will likely become available in clinical practice soon.
This study has several strengths. First, we enrolled a large cohort of around 100 probable PSP patients, including 40 PSP-P patients classified according to recent international diagnostic criteria. Second, all imaging data (thickness, volumes, MRPI and MRPI 2.0 values) were obtained using fully automated validated procedures. Third, two distinct decision tree-based ML models were compared, and the performances of the ML models were assessed using fivefold cross-validation with 5 repetitions to increase the reliability of the findings. Some limitations can be identified in the current study. First, PSP patients did not undergo autopsy, thus it is possible that in some cases the clinical diagnosis might be incorrect. However, clinical evaluations were performed according to the MDS diagnostic criteria for PSP-RS and PSP-P [1] and the recent MAX rules [8], by movement disorder specialists with more than 10 years of experience. Second, our study focused on PSP-RS and PSP-P only, while other PSP variants were not included due to low sample size. Third, an independent validation cohort is missing. In this study, two different ML algorithms showed similar classification performances, increasing the robustness of the findings; however, future studies to validate the performances of these models based on structural MR data in independent patient cohorts are warranted. Fourth, in this study we used only structural MRI data without exploring the potential of combining structural features with Quantitative Susceptibility Mapping or DTI data. However, structural data obtained from T1-weighted images have the advantage of wider availability and lower variability in the MR acquisition protocols, hopefully allowing a broader use of these biomarkers.
In conclusion, this study demonstrates that ML models combining MRPI values with cortical thickness and volumetric data achieve high classification performance in distinguishing PSP-RS from PSP-P patients, even in the early stage of the disease, and can thus assist the differential diagnosis between these common PSP phenotypes in vivo.
Availability of data and materials
The data that support the results of this study are available from the corresponding author upon reasonable request.
Code availability
Not applicable.
References
Höglinger GU, Respondek G, Stamelou M et al (2017) Clinical diagnosis of progressive supranuclear palsy: the movement disorder society criteria. Mov Disord 32:853–864. https://doi.org/10.1002/mds.26987
Boxer AL, Yu JT, Golbe LI et al (2017) Advances in progressive supranuclear palsy: new diagnostic criteria, biomarkers, and therapeutic approaches. Lancet Neurol 16:552–563. https://doi.org/10.1016/S1474-4422(17)30157-6
Respondek G, Stamelou M, Kurz C et al (2014) The phenotypic spectrum of progressive supranuclear palsy: a retrospective multicenter study of 100 definite cases. Mov Disord 29:1758–1766. https://doi.org/10.1002/mds.26054
Campagnolo M, Weis L, Fogliano C et al (2023) Clinical, cognitive, and morphometric profiles of progressive supranuclear palsy phenotypes. J Neural Transm (Vienna) 130:97–109. https://doi.org/10.1007/s00702-023-02591-z
Picillo M, Cuoco S, Tepedino MF et al (2019) Motor, cognitive and behavioral differences in MDS PSP phenotypes. J Neurol 266:1727–1735. https://doi.org/10.1007/s00415-019-09324-x
Jabbari E, Holland N, Chelban V et al (2020) Diagnosis across the spectrum of progressive supranuclear palsy and corticobasal syndrome. JAMA Neurol 77:377–387. https://doi.org/10.1001/jamaneurol.2019.4347
Shoeibi A, Litvan I, Juncos JL et al (2019) Are the International Parkinson disease and Movement Disorder Society progressive supranuclear palsy (IPMDS-PSP) diagnostic criteria accurate enough to differentiate common PSP phenotypes? Parkinsonism Relat Disord 69:34–39. https://doi.org/10.1016/j.parkreldis.2019.10.012
Grimm MJ, Respondek G, Stamelou M et al (2019) How to apply the movement disorder society criteria for diagnosis of progressive supranuclear palsy. Mov Disord 34:1228–1232. https://doi.org/10.1002/mds.27666
de Gordoa JJS-R, Zelaya V, Tellechea-Aramburo P et al (2022) Is the phenotype designation by PSP-MDS criteria stable throughout the disease course and consistent with tau distribution? Front Neurol 13:827338. https://doi.org/10.3389/fneur.2022.827338
Alster P, Madetko N, Koziorowski D, Friedman A (2020) Progressive supranuclear palsy-parkinsonism predominant (PSP-P)—a clinical challenge at the boundaries of PSP and Parkinson’s disease (PD). Front Neurol 11:180. https://doi.org/10.3389/fneur.2020.00180
Srulijes K, Mallien G, Bauer S et al (2011) In vivo comparison of Richardson’s syndrome and progressive supranuclear palsy-parkinsonism. J Neural Transm (Vienna) 118:1191–1197. https://doi.org/10.1007/s00702-010-0563-8
Quattrone A, Caligiuri ME, Morelli M et al (2019) Imaging counterpart of postural instability and vertical ocular dysfunction in patients with PSP: a multimodal MRI study. Parkinsonism Relat Disord 63:124–130. https://doi.org/10.1016/j.parkreldis.2019.02.022
Bloem BR, Marinus J, Almeida Q et al (2016) Measurement instruments to assess posture, gait, and balance in Parkinson’s disease: critique and recommendations. Mov Disord 31:1342–1355. https://doi.org/10.1002/mds.26572
Hunt AL, Sethi KD (2006) The pull test: a history. Mov Disord 21:894–899. https://doi.org/10.1002/mds.20925
Bluett B, Litvan I, Cheng S et al (2017) Understanding falls in progressive supranuclear palsy. Parkinsonism Relat Disord 35:75–81. https://doi.org/10.1016/j.parkreldis.2016.12.009
Amboni M, Barone P, Hausdorff JM (2013) Cognitive contributions to gait and falls: evidence and implications. Mov Disord 28:1520–1533. https://doi.org/10.1002/mds.25674
Nigro S, Antonini A, Vaillancourt DE et al (2020) Automated MRI classification in progressive supranuclear palsy: a large international cohort study. Mov Disord 35:976–983. https://doi.org/10.1002/mds.28007
Zhang K, Liang Z, Wang C et al (2019) Diagnostic validity of magnetic resonance parkinsonism index in differentiating patients with progressive supranuclear palsy from patients with Parkinson’s disease. Parkinsonism Relat Disord 66:176–181. https://doi.org/10.1016/j.parkreldis.2019.08.007
Quattrone A, Morelli M, Bianco MG et al (2022) Magnetic resonance planimetry in the differential diagnosis between Parkinson’s disease and progressive supranuclear palsy. Brain Sci 12:949. https://doi.org/10.3390/brainsci12070949
Archer DB, Mitchell T, Burciu RG et al (2020) Magnetic resonance imaging and neurofilament light in the differentiation of Parkinsonism. Mov Disord 35:1388–1395. https://doi.org/10.1002/mds.28060
Chougar L, Faouzi J, Pyatigorskaya N et al (2021) Automated categorization of parkinsonian syndromes using magnetic resonance imaging in a clinical setting. Mov Disord 36:460–470. https://doi.org/10.1002/mds.28348
Huppertz HJ, Möller L, Südmeyer M et al (2016) Differentiation of neurodegenerative parkinsonian syndromes by volumetric magnetic resonance imaging analysis and support vector machine classification. Mov Disord 31:1506–1517. https://doi.org/10.1002/mds.26715
Seki M, Seppi K, Mueller C et al (2018) Diagnostic potential of dentatorubrothalamic tract analysis in progressive supranuclear palsy. Parkinsonism Relat Disord 49:81–87. https://doi.org/10.1016/j.parkreldis.2018.02.004
Nicoletti G, Tonon C, Lodi R et al (2008) Apparent diffusion coefficient of the superior cerebellar peduncle differentiates progressive supranuclear palsy from Parkinson’s disease. Mov Disord 23:2370–2376. https://doi.org/10.1002/mds.22279
Spotorno N, Hall S, Irwin DJ et al (2019) Diffusion tensor MRI to distinguish progressive supranuclear palsy from α-synucleinopathies. Radiology 293:646–653. https://doi.org/10.1148/radiol.2019190406
Martí-Andrés G, van Bommel L, Meles SK et al (2020) Multicenter validation of metabolic abnormalities related to PSP according to the MDS-PSP criteria. Mov Disord 35:2009–2018. https://doi.org/10.1002/mds.28217
Jin J, Su D, Zhang J et al (2023) Tau PET imaging in progressive supranuclear palsy: a systematic review and meta-analysis. J Neurol. https://doi.org/10.1007/s00415-022-11556-3
Quattrone A, Morelli M, Nigro S et al (2018) A new MR imaging index for differentiation of progressive supranuclear palsy-parkinsonism from Parkinson’s disease. Parkinsonism Relat Disord 54:3–8. https://doi.org/10.1016/j.parkreldis.2018.07.016
Quattrone A, Bianco MG, Antonini A et al (2022) Development and validation of automated magnetic resonance Parkinsonism index 2.0 to distinguish progressive supranuclear Palsy-Parkinsonism from Parkinson’s disease. Mov Disord 37:1272–1281. https://doi.org/10.1002/mds.28992
Obermeyer Z, Emanuel EJ (2016) Predicting the future—big data, machine learning, and clinical medicine. N Engl J Med 375:1216–1219. https://doi.org/10.1056/NEJMp1606181
Singh NM, Harrod JB, Subramanian S et al (2022) How machine learning is powering neuroimaging to improve brain health. Neuroinformatics 20:943–964. https://doi.org/10.1007/s12021-022-09572-9
Benito-León J, Louis ED, Mato-Abad V et al (2019) A data mining approach for classification of orthostatic and essential tremor based on MRI-derived brain volume and cortical thickness. Ann Clin Transl Neurol 6:2531–2543. https://doi.org/10.1002/acn3.50947
Bianco MG, Quattrone A, Sarica A et al (2023) Cortical involvement in essential tremor with and without rest tremor: a machine learning study. J Neurol 270:4004–4012. https://doi.org/10.1007/s00415-023-11747-6
Breiman L (2001) Random forests. Mach Learn 45:5–32. https://doi.org/10.1023/A:1010933404324
Chen T, Guestrin C (2016) XGBoost: a scalable tree boosting system. In: Proceedings of the 22nd ACM SIGKDD international conference on knowledge discovery and data mining. Association for Computing Machinery, New York, pp 785–794
Litvan I, Agid Y, Calne D et al (1996) Clinical research criteria for the diagnosis of progressive supranuclear palsy (Steele-Richardson-Olszewski syndrome): report of the NINDS-SPSP international workshop. Neurology 47:1–9. https://doi.org/10.1212/wnl.47.1.1
Williams DR, de Silva R, Paviour DC et al (2005) Characteristics of two distinct clinical phenotypes in pathologically proven progressive supranuclear palsy: Richardson’s syndrome and PSP-parkinsonism. Brain 128:1247–1258. https://doi.org/10.1093/brain/awh488
Miskin N, Patel H, Franceschi AM et al (2017) Diagnosis of normal-pressure hydrocephalus: use of traditional measures in the era of volumetric MR imaging. Radiology 285:197–205. https://doi.org/10.1148/radiol.2017161216
Goetz CG, Tilley BC, Shaftman SR et al (2008) Movement disorder society-sponsored revision of the unified Parkinson’s disease rating scale (MDS-UPDRS): scale presentation and clinimetric testing results. Mov Disord 23:2129–2170. https://doi.org/10.1002/mds.22340
Hoehn MM, Yahr MD (1967) Parkinsonism: onset, progression, and mortality. Neurology 17:427–442
Folstein MF, Folstein SE, McHugh PR (1975) “Mini-mental state:” a practical method for grading the cognitive state of patients for the clinician. J Psychiatr Res 12:189–198
Salsone M, Caligiuri ME, Vescio V et al (2019) Microstructural changes of normal-appearing white matter in vascular Parkinsonism. Parkinsonism Relat Disord 63:60–65. https://doi.org/10.1016/j.parkreldis.2019.02.046
Dale AM, Fischl B, Sereno MI (1999) Cortical surface-based analysis. I. Segmentation and surface reconstruction. Neuroimage 9:179–194. https://doi.org/10.1006/nimg.1998.0395
Robin X, Turck N, Hainard A et al (2011) pROC: an open-source package for R and S+ to analyze and compare ROC curves. BMC Bioinformatics 12:77. https://doi.org/10.1186/1471-2105-12-77
Bianco MG, Quattrone A, Sarica A et al (2022) Cortical atrophy distinguishes idiopathic normal-pressure hydrocephalus from progressive supranuclear palsy: a machine learning approach. Parkinsonism Relat Disord 103:7–14. https://doi.org/10.1016/j.parkreldis.2022.08.007
Vaccaro MG, Sarica A, Quattrone A et al (2021) Neuropsychological assessment could distinguish among different clinical phenotypes of progressive supranuclear palsy: a Machine Learning approach. J Neuropsychol 15:301–318. https://doi.org/10.1111/jnp.12232
Strobl C, Boulesteix AL, Zeileis A et al (2007) Bias in random forest variable importance measures: illustrations, sources and a solution. BMC Bioinformatics 8:25. https://doi.org/10.1186/1471-2105-8-25
Whitwell JL, Tosakulwong N, Botha H et al (2020) Brain volume and flortaucipir analysis of progressive supranuclear palsy clinical variants. Neuroimage Clin 25:102152. https://doi.org/10.1016/j.nicl.2019.102152
Agosta F, Kostić VS, Galantucci S et al (2010) The in vivo distribution of brain tissue loss in Richardson’s syndrome and PSP-parkinsonism: a VBM-DARTEL study. Eur J Neurosci 32:640–647. https://doi.org/10.1111/j.1460-9568.2010.07304.x
Potrusil T, Krismer F, Beliveau V et al (2020) Diagnostic potential of automated tractography in progressive supranuclear palsy variants. Parkinsonism Relat Disord 72:65–71. https://doi.org/10.1016/j.parkreldis.2020.02.007
Agosta F, Pievani M, Svetel M et al (2012) Diffusion tensor MRI contributes to differentiate Richardson’s syndrome from PSP-parkinsonism. Neurobiol Aging 33:2817–2826. https://doi.org/10.1016/j.neurobiolaging.2012.02.002
Whitwell JL, Tosakulwong N, Clark HM et al (2021) Diffusion tensor imaging analysis in three progressive supranuclear palsy variants. J Neurol 268:3409–3420. https://doi.org/10.1007/s00415-020-10360-1
Nicoletti G, Caligiuri ME, Cherubini A et al (2017) A fully automated, atlas-based approach for superior cerebellar peduncle evaluation in progressive supranuclear palsy phenotypes. AJNR Am J Neuroradiol 38:523–530. https://doi.org/10.3174/ajnr.A5048
Longoni G, Agosta F, Kostić VS et al (2011) MRI measurements of brainstem structures in patients with Richardson’s syndrome, progressive supranuclear palsy-parkinsonism, and Parkinson’s disease. Mov Disord 26:247–255. https://doi.org/10.1002/mds.23293
Picillo M, Tepedino MF, Abate F et al (2020) Midbrain MRI assessments in progressive supranuclear palsy subtypes. J Neurol Neurosurg Psychiatry 91:98–103. https://doi.org/10.1136/jnnp-2019-321354
Heim B, Mangesius S, Krismer F et al (2021) Diagnostic accuracy of MR planimetry in clinically unclassifiable parkinsonism. Parkinsonism Relat Disord 82:87–91. https://doi.org/10.1016/j.parkreldis.2020.11.019
Schofield EC, Hodges JR, Macdonald V et al (2011) Cortical atrophy differentiates Richardson’s syndrome from the parkinsonian form of progressive supranuclear palsy. Mov Disord 26:256–263. https://doi.org/10.1002/mds.23295
Williams DR, Holton JL, Strand C et al (2007) Pathological tau burden and distribution distinguishes progressive supranuclear palsy-parkinsonism from Richardson’s syndrome. Brain 130:1566–1576. https://doi.org/10.1093/brain/awm104
Kovacs GG, Lukic MJ, Irwin DJ et al (2020) Distribution patterns of tau pathology in progressive supranuclear palsy. Acta Neuropathol 140:99–119. https://doi.org/10.1007/s00401-020-02158-2
Illán-Gala I, Nigro S, VandeVrede L et al (2022) Diagnostic accuracy of magnetic resonance imaging measures of brain atrophy across the spectrum of progressive supranuclear palsy and corticobasal degeneration. JAMA Netw Open 5:e229588. https://doi.org/10.1001/jamanetworkopen.2022.9588
Scotton WJ, Bocchetta M, Todd E et al (2022) A data-driven model of brain volume changes in progressive supranuclear palsy. Brain Commun 4:fcac098. https://doi.org/10.1093/braincomms/fcac098
Bentéjac C, Csörgő A, Martínez-Muñoz G (2021) A comparative analysis of gradient boosting algorithms. Artif Intell Rev 54:1937–1967
Jecmenica-Lukic M, Petrovic IN, Pekmezovic T et al (2014) Clinical outcomes of two main variants of progressive supranuclear palsy and multiple system atrophy: a prospective natural history study. J Neurol 261:1575–1583. https://doi.org/10.1007/s00415-014-7384-x
Guasp M, Molina-Porcel L, Painous C et al (2021) Association of PSP phenotypes with survival: a brain-bank study. Parkinsonism Relat Disord 84:77–81. https://doi.org/10.1016/j.parkreldis.2021.01.015
Street D, Malpetti M, Rittman T et al (2021) Clinical progression of progressive supranuclear palsy: impact of trials bias and phenotype variants. Brain Commun 3:fcab206. https://doi.org/10.1093/braincomms/fcab206
Funding
Open access funding provided by Università degli studi "Magna Graecia" di Catanzaro within the CRUI-CARE Agreement. No funding to declare.
Author information
Contributions
AQ, AS and AQ contributed to the study conception and design. Data collection was performed by JB, MM and MGV. Statistical analysis was performed by AS, MGB, FA, MDM, CC and BV. The first draft of the manuscript was written by AS and AQ. All authors read and approved the final manuscript.
Ethics declarations
Conflicts of interest
The authors declare that they have no conflict of interest.
Ethics approval
All patients in our study gave their informed consent prior to their inclusion in the study. Approval of our study was obtained from the ethics committee of Magna Graecia University review board, Catanzaro, Italy. The procedures used in this study adhere to the tenets of the Declaration of Helsinki.
Consent to participate
Informed consent was obtained from all individual participants included in the study.
Consent to publication
All patients signed informed consent regarding publishing their data.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Quattrone, A., Sarica, A., Buonocore, J. et al. Differentiating between common PSP phenotypes using structural MRI: a machine learning study. J Neurol 270, 5502–5515 (2023). https://doi.org/10.1007/s00415-023-11892-y
DOI: https://doi.org/10.1007/s00415-023-11892-y