Abnormal pattern of brain glucose metabolism in Parkinson’s disease: replication in three European cohorts

Rationale In Parkinson’s disease (PD), spatial covariance analysis of 18F-FDG PET data has consistently revealed a characteristic PD-related brain pattern (PDRP). By quantifying PDRP expression on a scan-by-scan basis, this technique allows objective assessment of disease activity in individual subjects. We provide a further validation of the PDRP by applying spatial covariance analysis to PD cohorts from the Netherlands (NL), Italy (IT), and Spain (SP). Methods The PDRPNL was previously identified (17 controls, 19 PD) and its expression was determined in 19 healthy controls and 20 PD patients from the Netherlands. The PDRPIT was identified in 20 controls and 20 “de-novo” PD patients from an Italian cohort. A further 24 controls and 18 “de-novo” Italian patients were used for validation. The PDRPSP was identified in 19 controls and 19 PD patients from a Spanish cohort with late-stage PD. Thirty Spanish PD patients were used for validation. Patterns of the three centers were visually compared and then cross-validated. Furthermore, PDRP expression was determined in 8 patients with multiple system atrophy. Results A PDRP could be identified in each cohort. Each PDRP was characterized by relative hypermetabolism in the thalamus, putamen/pallidum, pons, cerebellum, and motor cortex. These changes co-varied with variable degrees of hypometabolism in posterior parietal, occipital, and frontal cortices. Frontal hypometabolism was less pronounced in “de-novo” PD subjects (Italian cohort). Occipital hypometabolism was more pronounced in late-stage PD subjects (Spanish cohort). PDRPIT, PDRPNL, and PDRPSP were significantly expressed in PD patients compared with controls in validation cohorts from the same center (P < 0.0001), and maintained significance on cross-validation (P < 0.005). PDRP expression was absent in MSA. Conclusion The PDRP is a reproducible disease characteristic across PD populations and scanning platforms globally. Further study is needed to identify the topography of specific PD subtypes, and to identify and correct for center-specific effects. Electronic supplementary material The online version of this article (10.1007/s00259-019-04570-7) contains supplementary material, which is available to authorized users.


Introduction
Parkinson's disease (PD) is a common neurodegenerative disorder, for which only symptomatic therapies are available. Efforts to develop neuroprotective or preventive treatments will benefit from a reliable biomarker. Ideally, such a biomarker can identify PD in its early stages, differentiate between PD and other neurodegenerative parkinsonian disorders, track disease progression, and quantify treatment effects.
In PD, abnormal accumulation of α-synuclein in neurons impairs synaptic signaling, causing disintegration of specific neural networks [1]. Neuro-imaging with [ 1 8 F]fluorodeoxyglucose positron emission tomography ( 18 F-FDG PET) can capture synaptic dysfunction in vivo. The radiotracer 18 F-FDG provides an index for the cerebral metabolic rate of glucose, which is strongly associated with neuronal activity and synaptic integrity [2].
Brain regions with altered 18 F-FDG uptake in PD have been identified with univariate group comparisons using Statistical Parametric Mapping (SPM) [3][4][5][6][7]. However, because metabolic activity is correlated in functionally interconnected brain regions, analysis of covariance is more suitable to assess whole-brain networks. Multivariate disease-related patterns can be identified with the Scaled Subprofile Model and Principal Component Analysis (SSM PCA). Subsequently, a disease-related pattern can be used to quantify the 18 F-FDG PET scans of new subjects [8][9][10]. In this procedure, an individual's scan is projected onto the pattern, resulting in a subject score. This is a single numeric value which reflects the degree of pattern expression in that individual's scan.
The PD-related pattern (PDRP) was initially identified by Eidelberg et al. with SSM PCA in 33 healthy controls and 33 PD patients from the USA [11]. This PDRP (PDRP USA ) has served as a reference in many consecutive studies [12]. The PDRP USA is characterized by relatively increased metabolism of the thalamus, globus pallidus/putamen, cerebellum and pons, and by relative hypometabolism of the occipital, temporal, parietal, and frontal cortices. PDRP USA subject scores were significantly correlated with motor symptoms and presynaptic dopaminergic deficits in the posterior striatum [13], increased with disease progression [14], and were shown to decrease after effective treatment [15,16]. PDRP USA was significantly expressed in patients with idiopathic REM sleep behavior disorder (iRBD), a well-known prodromal disease stage of PD [17], and could discriminate between healthy controls, PD, and patients with multiple system atrophy (MSA) [18,19].
Because of these properties, PDRP USA is considered a neuro-imaging biomarker for PD [12]. It is essential that the PDRP is thoroughly validated. In collaboration with Eidelberg et al., PDRPs were identified in independent American, Indian, Chinese, and Slovenian populations [11,15,20,21]. Independently from these authors, the PDRP was recently derived in an Israeli population [22]. These PDRPs were highly similar to the PDRP USA , although minor deviations in PDRP regional topography were observed in several of these studies, which may be caused by differences in demographics or clinical characteristics of the cohorts.
We previously identified a PDRP in a retrospective cohort of PD patients scanned on dopaminergic medication [23], and subsequently in an independent cohort of prospectively included PD patients who were in the off-state (PDRP NL ) [24]. We used code written in-house, and obtained similar results compared with other PDRP studies. Recently, we demonstrated significant expression of the PDRP NL in idiopathic REM sleep behavior disorder (a prodromal stage of PD), PD, and dementia with Lewy bodies [25]. However, the PDRP NL has not been validated in a larger cohort, and correlations with PDRP USA were not explored.
The aim of the current study was to validate the PDRP NL in several independent cohorts. We were able to test the PDRP NL in 19 controls and 20 PD patients from our own clinic in the Netherlands, in 44 healthy controls and 38 "de-novo" PD patients from Italy, and 19 healthy controls and 49 late-stage PD patients from Spain. In addition, we newly identified a PDRP in Italian and Spanish datasets and performed a crossvalidation between these populations. We compared the three PDRPs to the reference pattern (PDRP USA ).

F-FDG PET data from the Netherlands
The PDRP NL was previously identified in 18 F-FDG PET scans from 17 healthy controls and 19 PD patients (NL1; Table 1) [24]. In these subjects, antiparkinsonian medication was withheld for at least 12 h before PET scanning.
In a previous study, we demonstrated that the PDRP NL was significantly expressed in an independent dataset of 20 PD patients compared with 19 controls (NL2; Table 1) [25]. For the current study, we added scans of 8 patients with the parkinsonian variant of MSA (MSA-P). Patients were diagnosed with probable PD or MSA-P by a movement disorder specialist [26]. 18 F-FDG-PET was performed in our clinic as part of routine diagnostic workup. These patients were scanned with the same camera as NL1. However, since the PDRP NL derivation study [24], reconstruction algorithms were updated (Table 1). Antiparkinsonian medication was not routinely withheld in NL2 PD patients. 18

F-FDG PET data from Italy
The IT dataset consisted of 18 F-FDG PET scans from 44 healthy controls and 38 consecutive outpatients with "denovo," drug-naïve PD [27] (Table 2). 123 I-FP-CIT Single Photon Emission Computed Tomography (DAT SPECT) was abnormal in all Italian PD patients. Disease-related patterns are typically determined on approximately 20 patients and 20 controls. Therefore, 20 controls and 20 patients were randomly selected from the IT dataset for PDRP IT derivation. The remaining 24 controls and 18 patients were used for validation. 18 F-FDG PET data from Spain 18 F-FDG PET scans from 49 PD patients and 19 controls from Spain (SP) were included from a previous study (Table 3) [28]. Patients in this cohort had long disease durations and were studied in the on state (i.e., antiparkinsonian medication was continued). From this dataset, 19 PD patients were randomly selected for PDRP SP identification. The remaining 30 patients were used for validation.
Identification of PDRP NL , PDRP IT , and PDRP SP All images were spatially normalized onto an 18 F-FDG PET template in Montreal Neurological Institute brain space [29] using SPM12 software (Wellcome Department of Imaging Neuroscience, Institute of Neurology, London, UK).
Identification of the PDRP NL was described previously [24]. For identification of the PDRP IT and PDRP SP , we applied an automated algorithm written in-house, based on the SSM PCA method of Spetsieris and Eidelberg [10], implemented in MATLAB (version 2017b; MathWorks, Natick, MA). Images were masked to remove out-of-brain voxels, log-transformed, and subject and group means were removed. Principal component analysis (PCA) was applied to the residual profiles in voxel space, and the components explaining the top 50% of the total variance were selected for further analysis. For each subject, a score was calculated on each selected principal component (PC). These scores were entered into a forward stepwise logistic regression analysis. The components that could best discriminate between controls and patients [30] were linearly combined to form the PDRP. In this linear combination, each component was weighted by the coefficient resulting To investigate which regions in each PDRP were stable, a bootstrap resampling was performed within each derivation set (1000 repetitions) [31]. Voxels that survived a one-sided confidence interval (CI) threshold of 90% (percentile method) after bootstrapping were overlaid on a T1 MRI template. The stable regions in the three PDRPs were visually compared.

Validation of PDRP NL , PDRP IT , and PDRP SP
For validation, subject scores for PDRP NL , PDRP IT , and PDRP SP were calculated in patients and controls from the same population. First, images were log-transformed and the subject mean and group mean (originating from the PDRP identification cohort) were removed, resulting in a residual profile for each subject. The subject score is calculated by projecting the subject residual profile on the pattern. To account for differences in data-acquisition, subject scores were always z-transformed to the subject scores of healthy controls that were scanned on the same camera, with identical reconstruction algorithms. If subject scores in validation PD subjects were significantly higher compared with subject scores in controls, the pattern was considered valid.
In this manner, PDRP NL subject scores were calculated in the derivation cohort (NL1) and in the validation cohort (NL2). However, data acquisition was not identical for NL1 and NL2 data. This resulted in a significant difference in PDRP NL subject scores between the NL1 and NL2 healthy control groups (supplementary Fig 1). To correct for these differences, subject scores in NL1 were z-transformed to NL1 healthy controls, such that NL1 control mean was 0 with a standard deviation of 1. Similarly, subject scores in NL2 were z-transformed to NL2 controls.
Subject scores for PDRP IT were calculated in the IT derivation cohort (controls n = 20; PD n = 20) and the IT validation cohort (controls n = 24; PD n = 18). Because all IT scans were acquired with identical protocols, subject scores could be z-transformed to the IT healthy controls from the derivation sample (n = 20).
Subject scores for the PDRP SP were calculated in the SP derivation cohort (controls n = 19; PD n = 19) and the SP validation cohort (PD n = 30). PDRP SP subject scores were z-transformed to the SP controls from the derivation sample (n = 19). As a second SP healthy control cohort Cross-validation of PDRP NL , PDRP IT , and PDRP SP Subsequently, PDRP NL subject scores were determined in the IT and SP datasets, PDRP IT subject scores were determined in the NL and SP datasets, and PDRP SP subject scores were determined in the NL and IT datasets. In addition, subject scores for the PDRP USA were calculated in each dataset in the same manner. Each subject score was then transformed into a z-score with respect to controls from the same camera, such that control mean was 0 with a standard deviation of 1. To determine the performance of each pattern in discriminating between controls and patients, a receiver operating curve was plotted (for each pattern in each dataset) and the area under the curve (AUC) was obtained. The similarity of the three PDRPs to each other and to the PDRP USA was tested in two ways. First, in each dataset, the z-scores for each PDRP were correlated with Pearson's r correlation coefficient. Second, the voxelwise topographies of the different patterns were compared by using volume-of-interest (VOI) correlations over the whole brain. A set of 30 standardized VOIs were selected from a previous study [21,32], reflecting key nodes of the reference PDRP. Within each VOI, region weights were

PDRP expression in MSA-p subjects
Subject scores for each PDRP were calculated in 8 MSA-p patients. Subject scores for each PDRP were z-transformed to corresponding subject scores in NL2 controls.
Principal component 1 PDRP USA [11], as well as the PDRP determined in Chinese [20] and Slovenian [21] populations, consisted of PC1 in isolation. Combinations of components were not considered. There are several methods to decide which components are disease-related and should be included in the final diseaserelated pattern [10]. In the current study, this decision was based on a forward stepwise logistic regression model, using the Akaike information criterion (AIC) as model selection criterion [30], in order to combine the least possible number of components to obtain the optimum discrimination between controls and patients. It is possible that the optimal model selects one component. If the PDRPs identified in the current study were not based on PC1 in isolation, we repeated all analyses using PC1 alone for each cohort. In that case, we additionally identified PDRP NL -PC1, PDP IT -PC1, and PDRP SP -PC1, and repeated the cross-validation.

Statistical procedures
Between-group differences in PDRP z-scores were assessed using a Student's t test. Correlations between PDRP and age, disease duration, and UPDRS were examined with Pearson's r correlation coefficient. Analyses were performed using SPSS software version 20 (SPSS Inc., Chicago, IL) and considered significant for P < 0.05 (uncorrected).

PDRP NL
The first six principal components explained 50% of the total variance. The PDRP NL was formed by a weighted linear combination of principal components 1 and 2 (variance explained 17% and 10%, respectively; Figs. 1a and 2a). PDRP NL zscores were significantly different between healthy controls and PD patients in both derivation (NL1) and validation (NL2) cohorts (P < 0.0001; Fig. 3a).

PDRP IT
The first six principal components explained 51% of the total variance. A weighted linear combination of principal components 1 and 2 (variance explained 19% and 8% respectively) could best discriminate between controls and patients in the logistic regression model, and was termed the PDRP IT (Figs.  1b and 2b). PDRP IT subject scores were significantly different between healthy controls (n = 24) and patients (n = 18) in the validation cohort (P < 0.0001; Fig. 3b).

PDRP SP
The first five principal components explained 51% of the total variance. The PDRP SP was formed by a weighted linear combination of principal components 1, 2, and 3 (variance explained 17%, 14%, and 5%, respectively; Figs. 1c and 2c). PDRP SP was significantly expressed in PD patients from the validation set (P < 0.0001, Fig. 3c).

Cross-validation
Each of the PDRPs (including the PDRP USA ) was significantly expressed in PD patients compared with controls, in each of the datasets (Figs. 4a-c and 5). Corresponding ROC-AUCs are reported in Table 4.
Correlations to UPDRS and disease duration were inconsistent (Table 5). Within each dataset, z-scores of any two PDRPs were significantly correlated. Subject scores on all three patterns were also significantly correlated to subject scores on PDRP USA (Table 5). Especially, the PDRP NL showed consistent high correlations to PDRP USA . In addition, a comparison between spatial topographies of the original PDRP USA versus the PDRP IT , PDRP NL , and PDRP SP showed significant correlations in region weights (Table 6).

PDRP subject scores in MSA-p patients
Subject scores for each PDRP were calculated in MSA patients. Subject z-scores on each PDRP were not significantly different between controls and MSA patients (Fig. 6).

Principal component 1
As stated above, PDRP NL and PDRP IT were identified as linear combinations of multiple PCs. All analyses were repeated for PDRP NL -PC1, PDP IT -PC1, and PDRP SP -PC1. The PDRPs that were based on combinations of PCs yielded higher diagnostic accuracy (Table 4) compared with patterns based on PC1 alone (Table 7). However, subject scores on PDP IT -PC1, PDRP NL -PC1, and PDRP SP -PC1 did show much higher correlations to subject scores on PDRP USA (all r > 0.98, P < 0.0001).

Discussion
In this study, we cross-validated the previously published PDRP NL [24], and additionally identified a PDRP in an Italian (PDRP IT ) and Spanish (PDRP SP ) sample. The three patterns were akin to PDRP USA , and also to the PDRP described in other populations [20,21]. Topographical similarity to PDRP USA was confirmed for each of the three PDRPs by a significant correlation of region weights, and a significant correlation in subject scores. Furthermore, PDRP NL , PDRP IT , and PDRP SP were significantly expressed in PD patients compared with controls in both identification and validation cohorts, but were not significantly expressed in MSA-p patients.
The high degree of similarity in PDRP topography across samples is striking considering differences in demographics, clinical characteristics, scanning methods, and reconstruction algorithms. Especially the PDRP NL was highly similar to the reference pattern (PDRP USA ). These two patterns showed the highest subject score correlation and region weight correlation. Furthermore, the PDRP NL achieved the highest AUC in the other cohorts. Like PDRP USA , PDRP NL was derived in a cohort of off-state patients with a wide range of disease durations (duration 4.4 ± 3.2 years; range 1.5-11.5 years) and severities.
Deviations from the typical PDRP topography are worth exploring further in relation to clinical characteristics. The PDRP IT is unique in that it is, to our knowledge, the first time the PDRP has been derived in "de-novo," treatment-naïve PD patients. It is likely that these very early-stage patients have a less severe nigrostriatal dopaminergic deficit compared with the more advanced PD patients in PDRP USA , PDRP NL , and PDRP SP derivation cohorts. This may be reflected by less severe involvement of the frontal cortex in PDRP IT , as nigrostriatal denervation is known to be positively correlated with hypometabolism in the frontal cortex [13,39].
The PDRP SP was derived in PD patients who were scanned while being on dopaminergic medication. Levodopa is known to decrease metabolism in the cerebellar vermis, putamen/ pallidum, and sensorimotor cortex. Levodopa therapy can reduce PDRP expression, but does not completely correct the underlying network abnormalities [16]. It can be assumed that the effect of dopaminergic therapy on PDRP expression is modest in comparison with the effect of disease progression [40]. Indeed, the typical PDRP topography could still be identified in these on-state patients. However, the PDRP SP did not correlate as well to the other patterns, both in terms of subject scores and region weights. It is not clear whether this is related to the advanced disease stage or the effect of treatment. The PDRP SP was characterized by more stable involvement of the occipital cortex, possibly related to the presence of mild cognitive impairment and visual hallucinations, which often occur in advanced PD [41].
Following from the above, it can be concluded that the typical PDRP topography is highly reproducible. Similar topographies have also been identified in studies comparing 18 F-FDG-PET scans of healthy controls and PD patients with SPM [3][4][5][6][7]. Such analyses can be supportive in the visual assessment of an 18 F-FDG-PET scan in clinical practice. Several studies have evaluated the diagnostic value of observer-dependent visual reads supported by SPM-based comparisons to healthy controls [3,4,[42][43][44]. A recent meta-analysis (PD versus "atypical" parkinsonism) estimated a pooled sensitivity of 91.4% and specificity of 94.7% for this semi-quantitative approach [45].
The merit of SSM PCA over mass-univariate approaches lies in its ability to objectively quantify 18 F-FDG PET scans of patients using the pre-defined patterns. Pattern expression scores were shown useful in differential diagnosis, tracking disease progression, and estimating treatment effects [46]. Although in the current study PDRP z-scores were significantly higher in PD patients compared with healthy controls, there was a considerable overlap in PDRP z-scores between patients  All subject scores were z-transformed to the 20 HC from the derivation sample. c PDRP SP was identified in 19 HC and 19 PD, and validated in 30 PD. Additional HC for validation were not available. All subject scores were z-transformed to the 19 HC from the derivation sample. Subject zscores are compared between groups with a Student's t test. Bars indicate mean and standard deviation Fig. 4 Subject scores for each PDRP in the other cohorts (crossvalidation). a PDRP NL subject scores are plotted for the Italian (IT) and Spanish (SP) data. b PDRP IT subject scores are plotted for the two Dutch samples (NL1 and NL2) and in SP data. c PDRP SP subject scores are plotted for NL1, NL2, and IT data. Subject scores are z-transformed to healthy control values from the same camera, and compared between groups with a Student's t test. Bars indicate mean and standard deviation and controls in almost every cohort. This overlap is not unique to the current data, and is also apparent in other studies [12].
Some healthy controls appear to express the PDRP. Since we found significant correlations between PDRP z-scores and age in healthy controls, it could be suggested that ageing and PD share certain metabolic features. Metabolic decreases have been reported in the parietal cortex in normal aging [47,48]. This may cause some overlap with the PDRP. However, the correlation with age in our study was not consistent across all datasets and patterns (Table 5). Furthermore, expression of an age-related spatial covariance pattern was shown to be independent from PDRP expression [49,50]. Alternatively, a high PDRP z-score in a healthy subject could signal a prodromal stage of neurodegeneration. For instance, subjects with idiopathic REM sleep behavior disorder (a prodromal stage of PD) were shown to express the PDRP years before onset of clinical parkinsonism [17,25].
Low PDRP z-scores in PD patients could indicate inaccurate clinical diagnosis. A recent meta-analysis of clinicopathologic studies suggests that the clinical diagnosis of PD by an expert, after an adequate follow-up, has a sensitivity of 81.3% and a specificity of 83.5% [51]. Thus, even under ideal circumstances, the diagnosis is inaccurate in a number of patients.
Overlap in pattern expression scores is not only apparent between controls and PD patients, but also between patients with different parkinsonian disorders. For instance, the PDRP may also be expressed in patients with progressive supranuclear palsy (PSP) [52]. This means that the expression score for a single disease-related pattern is inadequate to differentiate between multiple disorders. However, this does not hamper application in differential diagnosis. Previous studies have shown that an algorithm combining multiple disease- Fig. 5 Subject z-scores for the reference pattern PDRP USA [11] in each of the datasets. Subject scores are z-transformed to healthy control values from the same camera, and compared between groups with a Student's t test. Bars indicate mean and standard deviation  [18]. Highly similar results were obtained in an independent sample (n = 129) [19].
In this study, we compared data from different centers. It is well-known that variations in PET scanners and image reconstruction algorithms influence disease-related pattern scores [53][54][55] (supplementary Fig 1). In support of this, we recently identified clear center-specific features in the current data using machine-learning algorithms [56]. Therefore, PDRP subject scores cannot be compared readily between subjects from different centers. In all PDRP studies, this is solved with a z-transformation using the mean and standard deviation of a small healthy control group. This potentially introduces a bias, depending on which controls are selected. However, this issue is not relevant for within subject studies. Therefore, PDRP subject scores may be especially useful in tracking disease progression [14], or treatment effects [16,[35][36][37][38].
This study is methodologically different from previous PDRP studies. The PDRPs identified in this study were formed by a combination of principal components (PCs). These combinations were determined based on a forward stepwise logistic regression model [30]. There are different methods to decide which components are included in the disease-related pattern [10]. Previous studies have always identified the PDRP as PC1 in isolation [11,20,21]. The process of component selection is not always described in detail. Automatically choosing PC1 as the disease-related pattern, and disregarding consecutive, smaller PCs, increases the risk information loss. On the other hand, a pattern that combines multiple PCs may give a better fit of the initial sample, but may be limited in its relevance or generality across new  datasets. Therefore, we re-evaluated the data and included only PC1 for PDRP IT , PDRP NL , and PDRP SP . Indeed, the PC1 patterns correlated better to the reference pattern (PDRP USA ). However, the patterns that included multiple PCs yielded higher diagnostic accuracy . Apart from component selection, several other decisions and cutoffs may influence pattern identification [10]. More advanced machinelearning algorithms may be of use in determining optimal patterns without the use of arbitrary thresholds and associated loss of potentially useful information [55][56][57][58].
There is increasing interest to apply the PDRP in clinical practice and in therapeutic trials [12]. However, rigorous validation by independent research groups is necessary before widespread application. The current study has contributed to the finding that the PDRP is a universal feature of PD, and it is striking that such similar patterns could be identified in a limited number of 18 F-FDG PET scans from three populations, despite overt clinical and methodological heterogeneity. However, our results also show considerable overlap in PDRP subject scores between control and PD groups. Further study is needed to overcome this issue, perhaps by addressing potential center-specific effects in the data or by employing more advanced machine-learning algorithms.

Compliance with Ethical Standards
All procedures performed in studies involving human participants were in accordance with the ethical standards of the institutional and/or national research committee and with the 1964 Helsinki declaration and its later amendments or comparable ethical standards. Ethical permission for the procedures was obtained from the local ethics committees at the University Medical Center Groningen (Groningen, The Netherlands), the University of Genoa (Genoa, Italy), and from the Ethics Committee for Medical Research of the University of Navarra (Navarra, Spain). All Fig. 6 Subject scores for each PDRP in eight cases of MSA-p. Subject scores are z-transformed to NL2 controls and compared between groups with a Student's t test. Bars indicate mean and standard deviation