Parkinson’s disease: deep learning with a parameter-weighted structural connectome matrix for diagnosis and neural circuit disorder investigation

Purpose To investigate whether Parkinson’s disease (PD) can be differentiated from healthy controls and to identify neural circuit disorders in PD by applying a deep learning technique to parameter-weighted and number of streamlines (NOS)–based structural connectome matrices calculated from diffusion-weighted MRI. Methods In this prospective study, 115 PD patients and 115 healthy controls were enrolled. NOS-based and parameter-weighted connectome matrices were calculated from MRI images obtained with a 3-T MRI unit. With 5-fold cross-validation, diagnostic performance of convolutional neural network (CNN) models using those connectome matrices in differentiating patients with PD from healthy controls was evaluated. To identify the important brain connections for diagnosing PD, gradient-weighted class activation mapping (Grad-CAM) was applied to the trained CNN models. Results CNN models based on some parameter-weighted structural matrices (diffusion kurtosis imaging (DKI)–weighted, neurite orientation dispersion and density imaging (NODDI)–weighted, and g-ratio-weighted connectome matrices) showed moderate performance (areas under the receiver operating characteristic curve (AUCs) = 0.895, 0.801, and 0.836, respectively) in discriminating PD patients from healthy controls. The DKI-weighted connectome matrix performed significantly better than the conventional NOS-based matrix (AUC = 0.761) (DeLong’s test, p < 0.0001). Alterations of neural connections between the basal ganglia and cerebellum were indicated by applying Grad-CAM to the NODDI- and g-ratio-weighted matrices. Conclusion Patients with PD can be differentiated from healthy controls by applying the deep learning technique to the parameter-weighted connectome matrices, and neural circuit disorders including those between the basal ganglia on one side and the cerebellum on the contralateral side were visualized.


Introduction
Parkinson's disease (PD) is a progressive disease that is characterized by degeneration of mesencephalic dopamine neurons. The prevalence of PD is increasing. The number of patients with PD has more than doubled globally from 1990 to 2015 [1], and worldwide in 2016, 6.1 million individuals had PD [2]. Although MRI is used for the diagnosis of many neurological disorders, PD does not show abnormal findings on conventional MRI. Patients with PD show extrapyramidal symptoms such as resting tremor, bradykinesia, rigidity, and postural reflex disturbance, and diagnosis of PD is made based on the clinical time course, symptoms, and physical findings.
Advancements in diffusion MRI techniques have allowed researchers to capture imaging findings of PD [3] by using diffusion tensor imaging (DTI) [4], diffusion kurtosis imaging (DKI) [5], and neurite orientation dispersion and density imaging (NODDI) [6]. Parameters derived from g-ratio analysis [7] may also capture microstructural changes in PD [8]. The structural connectome matrix, or number of streamlines (NOS)-based connectome matrix, is calculated from diffusion-weighted MRI, represents neural connections of the whole brain [9], and also has potential to reveal abnormalities in neural connections in some neurological and psychological disorders [10] including PD [11]. While NOS-based matrices represent only microstructural fiber counts, elements of the connectome matrix can also be weighted using parameters derived from advanced diffusion MRI techniques. Such parameter-weighted matrices have a potential to represent tissue properties and to have higher sensitivity to capture the characteristics of neurological diseases. For example, the usefulness of the g-ratio-weighted connectome matrix over the NOS-based matrix was reported in the diagnosis of multiple sclerosis [12]. Use of a parameter-weighted connectome matrix from DTI, DKI, NODDI, and g-ratio analyses has the potential to diagnose PD better than the conventional NOSbased matrix; however, to the best of our knowledge, no investigations have used such parameter-weighted connectome matrices to evaluate PD.
Deep learning, which is an artificial intelligence approach, allows automated analyses of large-volume data such as images without explicitly teaching knowledge to computers [13]. Convolutional neural network (CNN), a deep learning strategy, has shown high performance in image recognition tasks [14]. Therefore, application of CNN to radiological imaging diagnosis has been gaining wide attention [15][16][17], and researchers have successfully applied this technique to several imaging diagnosis tasks with radiography [18,19], CT [20,21], and MRI [22,23]. Deep learning would also have a potential to handle connectome matrix which has a large data volume comprising thousands of independent elements. One of the weaknesses of deep learning is how it makes decisions has been difficult to interpret for us [15]. However, visualization of regions of images where deep learning models focus has become possible with gradient-weighted class activation mapping (Grad-CAM) [24]. This technique may allow visualizing alterations of neural circuits in PD patients when it is applied to CNNs which is trained with connectome matrices to diagnose PD.
The aim of this study was to investigate whether patients with PD can be differentiated from healthy controls by applying a deep learning technique to a parameter-weighted connectome matrix with higher performance than a NOSbased structural connectome matrix calculated from MRI obtained with a 3-Tesla MRI unit and to find neural circuit disorders associated with PD by using Grad-CAM techniques.

Methods
In this prospective study, which was approved by our institutional review board, MRI data of patients with PD and healthy controls were analyzed. There is subject overlap (64 out of 230) with previous studies [25,26], which used different analyzing methods (tract-of-interest, tract-based spatial statistics, and gray matter-based spatial statistics analyses) from the current study. Those studies did not use deep learning technique or connectome matrices in the evaluations of PD. Written informed consent was obtained from each participant.

Participants
Patients with PD and healthy controls who visited Juntendo University Hospital (blinded for the review process) between February 2017 and October 2018 were enrolled in this study. Specialized neurologists made the diagnosis of PD according to clinical diagnostic criteria for PD of the Movement Disorder Society [27]. Patients showed parkinsonism and responded to antiparkinsonian therapy. Patients with apparent signs of other diseases which shows parkinsonism were not included. Disease duration from onset to MRI examination, disease severity, and dominant side are shown in Table 1. Healthy controls had no history of neurological diseases. Participants with a brain infarction or incomplete MRI examination were excluded. Age-and gender-matched participants in the PD and healthy control groups were selected before the deep learning and main analyses.

MRI data acquisition
MRI examinations were performed with a 3-T MRI unit (MAGNETOM Prisma; Siemens Healthcare) using a 64channel head coil. Multi-shell diffusion-weighted MRI, magnetization transfer saturation images, and T1-weighted images were obtained.
Multi-shell diffusion-weighted MRIs were obtained with b-values of 0, 1000, and 2000 s/mm 2 along 64 uniformly distributed directions for each shell with spin-echo echo-planar imaging. We also acquired standard and reverse phaseencoded blipped images with no diffusion weighting (blip up and blip down) to correct for the magnetic susceptibilityinduced distortions related to the echo-planar imaging acquisitions [28]. The data were corrected for eddy currents, susceptibility-induced geometric distortions, and inter-volume motion using the EDDY and TOPUP toolboxes [28]. To calculate the magnetization transfer (MT) saturation index, dual excitation three-dimensional multi-echo fast lowangle shot sequences were performed with predominant T1, proton density, and MT weighting. The excitation of MTweighted images was preceded by an off-resonance Gaussian-shaped radiofrequency pulse under the following conditions: frequency offset from water resonance, 1.2 kHz; pulse duration, 9.984 ms; and nominal flip angle, 500°. Other acquisition parameters for MT saturation imaging were as follows: MT-off and MT-on scanning (TR/TE = 24/2.53 ms, flip angle = 5°) and T1-weighted imaging (TR/TE = 10/ 2.53 ms, flip angle = 13°); parallel imaging using a generalized auto-calibrating partially parallel acquisition factor in the phase-encoding direction, 2; 7/8 partial Fourier acquisition in the partition direction; bandwidth, 260 Hz/pixel; slice thickness, 1.8 mm; number of slices, 104; FOV, 224 × 224 mm, and matrix, 128 × 128. The total acquisition time was 6.25 min.
Diffusion analyses (DTI, DKI, NODDI, and g-ratio analyses) An ordinary least square was applied to the diffusionweighted MRI data with b = 0 and 1000 s/mm 2 to produce FA, mean diffusivity, axial diffusivity, and radial diffusivity based on standard formulae [4].
The diffusional kurtosis estimator [29] was implemented in MATLAB (Math-Works, Natick, MA, USA) to generate AK, MK, and RK maps.
Using an in-house MATLAB script, MT saturation data were analyzed to calculate the myelin volume fraction (MVF). To estimate MVF maps, a calibration factor of 0.1 was subsequently used to obtain a g-ratio of 0.7 in the corpus callosum as previously suggested [31]. The AVF was calculated based on parameters of the NODDI model (ICVF and isotropic volume fraction) and MT saturation (myelin volume fraction), as follows [7]: where ISOVF denotes isotropic volume fraction. Finally, the g-ratio was calculated using the myelin volume fraction and AVF for each voxel using the following equation [7]:

Connectome pre-processing and construction
By using the functional MRI of the Brain Software Library, Version 5.0.9 [32], pre-processing (registration, removing non-brain tissue from three-dimensional T1-weighted images, segmentation [estimation of partial volume fractions of white matter/cortical gray matter/deep gray matter/cerebrospinal fluid], and obtaining gray matter-white matter interface mask) was performed. Then, nodes were obtained according to Desikan-Killiany cortical atlas segmentation [33]. With the MRtrix software package (Brain Research Institute, http:// www.brain.org.au/software/), whole-brain tractograms were generated using probabilistic multi-shell, multi-tissue constrained spherical deconvolution tracking from multishell diffusion-weighted MRI data. For fiber tracking, data with b-values of 1000 and 2000 s/mm 2 were used, and 5 × 10 7 streamlines were seeded from the white matter fiber orientation distribution function.
To map the NOS-based connectome for each individual, the total NOS interconnecting each pair of regions was enumerated and stored in a connectivity matrix. Then, the NOS interconnecting each pair of nodes was enumerated, resulting in a connectome matrix with 84 × 84 components (Fig. 1a). Furthermore, DTI-, DKI-, NODDI-, and g-ratio-weighted connectome matrices were obtained by multiplying the NOS by the tract-averaged values of each parameter [12]. Therefore, from each patient, a total of 14 connectome matrices (one NOS-based, four DTI parameter-weighted [fractional anisotropy (FA), mean diffusivity, axial diffusivity, and radial diffusivity], three DKI parameter-weighted [mean kurtosis (MK), axial kurtosis (AK), and radial kurtosis (RK)], three NODDI parameter-weighted [intracellular volume fraction (ICVF), orientation dispersion index, isotropic volume fraction], and three g-ratio parameter-weighted [axon volume fraction (AVF), myelin volume fraction, and g-ratio] matrices) were generated.

Training of the CNN
Deep learning was performed with a computer equipped with the graphical processing unit of Quadro P5000 (NVIDIA), 64-GB random access memory, and a Core i9-9900K (Intel). To perform deep learning, the programming language, Python 3.6.4, and the deep learning framework, Chainer 4.0.0, were used.
As teaching data, two-element vector data (PD [0, 1] vs. healthy control [1, 0]) were used. The structure of the CNN is illustrated in Fig. 2. Errors between output data and teacher data (PD vs. healthy control) were calculated with softmax cross-entropy. The CNN was updated with a minibatch size of 15 and with an optimizer of AdaGrad [34] so that the error became small. The number of epochs for the training was 20. In the training phase, data augmentation was performed by adding Gaussian noise with a mean of 0 and a standard deviation of 0.002, 0.004, 0.006, 0.008, and 0.010 to the input data. Therefore, data were augmented 6-fold in the training phase of the CNN. The hyperparameters were determined before the main analyses by using NOS-based matrices of PD patients and healthy controls who were excluded from the study due to age and gender matching.

Evaluating performance of the trained models
To evaluate the performance of the model, 5-fold cross-validation was performed. Data of patients with PD and healthy controls were randomly divided into five groups. Training and validation were repeated 10 times (trials) in each fold.
The performance of the model that showed the best among the 10 trials was recorded for each fold. Then, by pooling the data across 5-fold, the performance of the model was evaluated.

Grad-CAM
To visualize the regions where the developed CNN focused in diagnosing PD, the Grad-CAM technique [24] was applied to the connectome matrices of patients with PD who were diagnosed correctly in the validation. The target layer was set at the 4th convolutional layer. The data for each patient with PD were normalized and then pooled. To assess whether the relations between the dominant side and the neural circuit alteration, subanalyses were also performed by separately pooling the data for patients according to the dominant side (left or right).

Statistics
Statistical analyses were performed with EZR version 1.37 (http://www.jichi.ac.jp/saitama-sct/SaitamaHP.files/ statmedEN.html), which is a graphical user interface of R 2.4. 0 (https://www.r-project.org/). Continuous and binominal data were compared with the Student t test and chi-square test, respectively. The performance of the CNN model for differentiating patients with PD from healthy controls was evaluated with receiver operating characteristic analyses. Areas under the receiver operating characteristic curve (AUCs) of the model that showed the best performance among each evaluation category (NOS, DTI, DKI, NODDI, g-ratio) were compared across categories with DeLong's test. Because of multiple comparisons across five categories, a p value less than 0.005 (= 0.05/10, Bonferroni correction) was considered to indicate statistical significance. Sensitivity, specificity, and accuracy in diagnosing PD were calculated using CNN's output value of 0.5 as a cut-off.

Background of study participants
In this study, 115 patients with PD (mean age with standard deviation, 68.9 ± 6.9 years, 52 men and 63 women) and 115 healthy controls (mean age with standard deviation, 69.8 ± 3.0 years, 61 men and 54 women) were enrolled. Other background information is summarized in Table 1.
Sensitivity, specificity, and accuracy of models that showed the best performance in each category are shown in Table 4.

Grad-CAM results
Regions in which the trained CNN focused, which were revealed with Grad-CAM, are shown in Fig. 4. Generally, Grad-CAM result images are blurred due to the use of kernel size of more than 1 in convolutional layers and due to the use of maxpooling layers. In this study, the customized structure of our CNN (Fig. 2) allowed pin-point visualization in Grad-CAM analysis. The trained CNN model focused on many Fig. 2 The CNN comprised four convolutional layers with channels of 8, 16, 16, and 32. Because the distance between two elements within the connectome matrix does not represent spatial distance as in general images, we customized the structure of the CNN: a a filter size of 1 was used for all convolutional layers and b max-pooling layers were omitted from the CNN. The processed data were further processed at three fully connected layers with the number of units of 4096, 4096, and 2. Conv, convolutional layer; FC, fully connected layer; ker, numbers of kernels of the convolutional layer; out, numbers of output values of the fully connected layer connections on the RK-weighted connectome matrix (Fig. 4c) and no specific trend in areas of focus for neural connections was found. On the contrary, ICVF-weighted and AVFweighted connectome matrices focused on some specific neural connections, which are indicated as reddish color dots in Fig. 4d, e. These neural connections are summarized in Fig. 5 and Table 5. For these two matrices, connections between the basal ganglia and the cerebellum played important roles in discriminating patients with PD from healthy controls.

Discussion
In this study, by applying the deep learning technique to DKI-, NODDI-, and g-ratio-weighted structural connectome matrices obtained with a 3-T MRI unit, we found that PD could be diagnosed with moderate performance. Deep learning models that were trained with the DKI-weighted connectome matrix performed significantly better than those trained with the NOS-based connectome matrix. By using the Grad-CAM technique, we found that the trained CNN focused on connections between the basal ganglia and the cerebellum for ICVFand AVF-weighted connectome matrices, whereas no specific trend was found in the areas of focus for neural connections for the RK-weighted connectome matrix for differentiating patients with PD from healthy controls.
Parameter-weighted connectome matrix has been gaining attention recently in the evaluations of neurological diseases [12]. While NOS-based connectome matrix represents only the strength of the connectivity between brain regions, the parameter-weighted matrix can represent other tissue properties. In the current study, DKI parameters-weighted, ICVFweighted, and AVF-weighted matrices were found to show AUCs of over 0.800 in diagnosing PD, while NOS-based  matrix showed AUC of 0.761. The RK-weighted connectome matrix (best among DKI) was significantly superior to the NOS-based connectome matrix. This means that parameterweighted connectome matrix would be superior to conventional NOS-based matrix in diagnosing PD. The RK-weighted connectome matrix (best among DKI) was also significantly superior to the FA-weighted (best among DTI) connectome matrices. This finding matches a previous study that reported that parameters derived from DKI analyses were significantly better than those from DTI analyses in diagnosing PD [35]. NODDI parameters represent the density and direction of neurites (axons and dendrites),  Note: Sensitivity, specificity, and accuracy were calculated using a cut-off value of 0.5 for output from the convolutional neural network AVF, axon volume fraction; FA, fractional anisotropy; ICVF, intracellular volume fraction; NOS, number of streamlines; RK, radial kurtosis and they are more specific markers of brain tissue microstructure [6]. There is a previous study which reported that both the DKI and NODDI parameters were useful for diagnosing PD [36]. However, the DKI parameter-weighted matrix performed significantly better than the NODDI parameterweighted matrix in our study. The difference might have come from the difference in the evaluating methods; in that previous study regions of interest were placed on gray matter in measuring these values. In our study, the trained CNN based on RK-weighted matrix took into consideration many of the neural circuits comprehensively, which was indicated by the Grad-CAM result. According to Braak et al., as the disease progresses from stage 1 to stage 6 in PD patients, the deposition of Lewy bodies is thought to be spread to the upper brainstem and forebrain [37]. Because the PD patients included in this study had relatively long clinical course, we assume that such pathological conditions were present widely in the brain in many PD patients. DKI model, which quantifies non-Gaussian water diffusion [5], might be better to capture such pathological changes of PD than NODDI model when parameters derived from these models are used to weigh connectome matrix.
Little is known regarding g-ratio analysis for diagnosing PD. With g-ratio analysis, AVF, myelin volume fraction, and the ratio of the inner to outer myelinated axon diameter (g-ratio) can be derived [7]. AVF is an indicator of altered axons like ICVF in NODDI model [6,7]. In this study, only ICVF and AVF showed performance with AUC of over 0.80 among the NODDI and g-ratio analysis, respectively. This does not conflict with a previous study [25], which reported reduction of ICVF in white matter indicating axonal degeneration in PD.
Traditionally, PD is characterized by degeneration of mesencephalic dopamine neurons. However, it has become known that several other brain regions and connections are known to be affected in PD [3]. A connection between basal ganglia and cerebellum is one of these connections in PD patients [38]. In previous studies, by using functional MRI in patients with PD, alterations in connectivity strength between basal ganglia and cerebellum have been suggested [39,40]. In the current study, we found that neural connections between the basal ganglia and the cerebellum were focused on by ICVF and AVF-weighted connectome matrices with the Grad-CAM technique. The alterations of axons along those neural circuits might possibly be associated with alterations of function.
In this study, we customized the structure of the CNN, because we thought that the connectome matrix is different from general images in that distances between elements of matrices do not represent spatial distances. We used a filter size of 1 in all the convolutional layers. In addition, we avoided implementing max-pooling layers, which extract max values within H × H pixels (H = 2 or more). The customized CNN was found to be applicable to the imaging diagnosis with connectome matrices. While Grad-CAM result images are blurred generally [24], the customized structure of the CNN enabled pin-point visualization in Grad-CAM as shown in Fig. 4. The developed method would have a potential to identify neural circuit disorders in other neurological and psychiatric diseases.
This study has some limitations. First, although the Grad-CAM results indicated abnormalities in some of the neural circuits, the abnormalities were not proven histopathologically. Future analyses are required to consolidate the Grad-CAM findings obtained in this study. Second, because data from a large number of patients with PD were required to manage the overfitting problem associated with deep learning, patients with PD with variable disease durations (mean disease duration was over 10 years) were included in this study. Therefore, early diagnosis of PD is not possible with our models. It would also be possible that midbrain atrophy, which can be seen in some late-stage cases of PD, might have affected the performance of the model. Third, the model's performance was not externally validated. Our study indicates that once the model is developed, it can be used to diagnose PD in the same institution. However, the model's performance may not be as high as we reported when the method is applied to data obtained in other  [41].
In conclusion, PD can be diagnosed by applying deep learning to DKI-, NODDI-, and g-ratio-weighted connectome matrices with moderate performance. Deep learning models trained with the DKI-weighted connectome matrix performed significantly better than those trained with the NOS-based connectome matrix. Alterations in neural connections between the basal ganglia on one side and the cerebellum on Fig. 4 Results of Grad-CAM, which was reorganized to 84 × 84, for a NOS-based, b FAweighted, c RK-weighted, d ICVF-weighted, and e AVFweighted connectome matrices are shown. Left/top and right/ bottom halves of the matrix indicate left and right brain regions, respectively. Red and blue colors indicate higher and lower values, respectively, obtained from Grad-CAM analysis the contralateral side in PD patients were indicated by applying Grad-CAM technique to g-ratio-weighted and NODDIweighted matrices.
Funding This study was funded by the Brain/MINDS Beyond program from Japan Agency for Medical Research and Development (JP19dm0307024 and JP19dm0307101) and the Japan Society for the Promotion of Science KAKENHI (19K17244).

Compliance with ethical standards
Conflict of interest The authors declare that they have no conflict of interest.
Ethical approval All procedures performed in reports involving human participants were in accordance with the ethical standards of the institutional and/or national research committee and with the 1964 Helsinki declaration and its later amendments or comparable ethical standards.
Informed consent Informed consent was obtained from all individual participants included in the study.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.