Classifying early and late mild cognitive impairment stages of Alzheimer’s disease by fusing default mode networks extracted with multiple seeds
The default mode network (DMN) in resting state has been increasingly used in disease diagnosis since its discovery in 2001. Prior work has mainly focused on extracting a single DMN with various techniques. However, by using seeding-based analysis with more than one desirable seed, we can obtain multiple DMNs, which are likely to carry complementary information and are thus more promising for disease diagnosis. In this study, we used 18 early mild cognitive impairment (EMCI) participants and 18 late mild cognitive impairment (LMCI) participants of Alzheimer’s disease (AD). First, we used seeding-based analysis with four seeds to extract four DMNs for each subject. Then, we conducted fusion analysis for all different combinations of the four DMNs. Finally, we carried out nonlinear support vector machine classification based on the mixing coefficients from the fusion analysis.
We found that (1) the four DMNs corresponding to the four different seeds indeed capture different functional regions of each subject; (2) maps of the four DMNs in the most different joint source from fusion analysis are centered at the regions of the corresponding seeds; (3) classification results reveal the effectiveness of using multiple seeds to extract DMNs. When using a single seed, the regions of posterior cingulate cortex (PCC) extractions of EMCI and LMCI show the largest difference. For multiple-seed cases, the regions of PCC extraction and right lateral parietal cortex (RLP) extraction provide complementary information for each other in fusion, which improves the classification accuracy. Furthermore, the regions of left lateral parietal cortex (LLP) extraction and RLP extraction also have a complementary effect in fusion. In summary, AD diagnosis can be improved by exploiting complementary information of DMNs extracted with multiple seeds.
In this study, we applied fusion analysis to the DMNs extracted by using different seeds for exploiting the complementary information hidden among the separately extracted DMNs, and the results supported our expectation that using the complementary information can improve classification accuracy.
Keywords: Default mode network; Seeding-based analysis; Joint independent component analysis; Classification; Alzheimer’s disease
ADNI: Alzheimer’s Disease Neuroimaging Initiative
AUC: Area under curve
BOLD: Blood oxygenation level dependent
DMN: Default mode network
EMCI: Early mild cognitive impairment
fMRI: Functional magnetic resonance imaging
ICA: Independent component analysis
LLP: Left lateral parietal cortex
LMCI: Late mild cognitive impairment
MMSE: Mini mental state examination
MPFC: Medial prefrontal cortex
PCC: Posterior cingulate cortex
RBF: Radial basis function
RLP: Right lateral parietal cortex
ROC: Receiver operating characteristic
RSN: Resting state network
SRA: Sparse representation analysis
SVM: Support vector machine
Functional magnetic resonance imaging (fMRI) [1, 2] is noninvasive and offers high resolution in both space and time, providing a novel perspective for the study of brain function. Unlike task-based fMRI [3, 4], which studies how the brain reacts to stimuli, resting state fMRI [5, 6] studies the intrinsic functional connectivity of the brain and can capture changes in its spontaneous activity. In the resting state, several functions are in operation, and the regions involved in each function constitute a functional network, i.e., a resting state network (RSN).
The approaches to extracting RSNs from resting state fMRI data mainly fall into two types: data-based [7, 8] and model-based [9, 10]. Each type has its own merits and demerits. Data-based methods are data driven. For example, independent component analysis (ICA) [11, 12] assumes the independence of the brain patterns, and sparse representation analysis (SRA) [13, 14] assumes the spatial sparsity of brain patterns. However, their performance is limited by the fully data-driven process, because a brain pattern can sometimes be decomposed into more than one subpattern, which makes RSNs difficult to recognize. Model-based methods, in contrast, manually select a representative signal as reference. For example, seeding-based analysis [15, 16, 17] takes some representative regions as seeds and detects the temporal correlation between the selected seeds and the other regions. Its performance depends on the selected seeds, but the results are unique. In this paper, we consider seeding-based analysis.
Prior works on resting state fMRI mainly use a single extraction of an RSN to explore biomarkers or perform classification. In task-based fMRI, however, several works conduct multi-task analysis to improve performance. Calhoun et al. described a two-task fusion of auditory oddball and Sternberg working memory for schizophrenia, which revealed two additional findings compared to the traditional separate analysis. Remezani et al. reported the fusion of three levels of auditory tasks and showed that the information across multiple tasks can be usefully combined. Remezani et al. compared SRA and ICA for multi-task analysis and showed the effectiveness of multi-task analysis, although the fusion techniques need to be further improved. The effectiveness of multi-task analysis lies in the fact that each subtraction related to a task can provide complementary information for the others, even though multi-task fMRI data are acquired from the same subject but not necessarily at the same time. Since resting state fMRI data can provide RSNs that exist at the same time and in the same subject, combining multiple RSNs is even more likely to improve performance.
In our study, we found that four seeds can be used to extract the default mode network (DMN) [21, 22, 23, 24] (the main RSN in resting state) by seeding-based analysis: the medial prefrontal cortex (MPFC), the posterior cingulate cortex (PCC), the left lateral parietal cortex (LLP) and the right lateral parietal cortex (RLP). One idea is to average the signals of the four seeds and then take the average signal as the reference for extraction. In this paper, we instead extract four DMNs with the four seeds separately, which preserves their specificities, and then conduct fusion analysis to combine them. We think that this process can help us gain insight into the relationship among the signals of the four seeds. We test this idea on a data set consisting of early mild cognitive impairment (EMCI) participants and late mild cognitive impairment (LMCI) participants of Alzheimer’s disease [26, 27, 28]. Alzheimer’s disease is a progressive neurological disease that strongly affects the lives of older people and has gained increasing attention in recent years. As the transition from EMCI to LMCI is irreversible and marks a significant change in the state of a patient, we address the classification of these two stages in this paper.
Our study consists of three major steps. First, seeding-based analysis is used to extract DMNs for EMCIs and LMCIs with each of the four seeds. Second, joint ICA [29, 30, 31] is adopted to fuse all the nonempty combinations of the four DMNs. Third, the mixing coefficients from joint ICA are taken as features for classification, which is based on a nonlinear support vector machine (SVM) [32, 33]. Our findings are as follows: 1) The four DMNs extracted with different seeds are different for each subject, and the maps corresponding to different seeds in the most different joint source from fusion capture different functional regions. These observations lay the foundation for conducting fusion analysis to integrate DMNs extracted with different seeds. 2) The regions of posterior cingulate cortex (PCC) extractions for EMCI and LMCI show the largest difference. 3) The regions of PCC extraction and right lateral parietal cortex (RLP) extraction can provide complementary information for each other in fusion. In addition, the regions of left lateral parietal cortex (LLP) extraction and RLP extraction also have complementary information for each other. 4) We can improve AD diagnosis by exploiting complementary information of DMNs extracted with multiple seeds.
Participants and fMRI data preprocessing
Participants at two stages of Alzheimer’s disease are used and split into two groups: early mild cognitive impairment (EMCI) participants and late mild cognitive impairment (LMCI) participants. The transition from EMCI to LMCI means an irreversible change of AD. In total, the EMCI group contains 18 subjects (11 females and 7 males; age mean = 72.1667 years and standard deviation (s.d.) = 5.0904 years; Mini Mental State Examination (MMSE) score mean = 27.3333 and s.d. = 1.7150). The LMCI group also has 18 subjects (5 females and 13 males; age mean = 72.2778 years and s.d. = 8.3582 years; MMSE score mean = 26.5000 and s.d. = 2.4313). All data were obtained from the Alzheimer’s Disease Neuroimaging Initiative (ADNI) database (http://adni.loni.usc.edu/), which has connected researchers with data to study the progression of Alzheimer’s disease since 2004. The fMRI data are relatively new and have been added and updated since 2009.
The magnetic resonance image data were acquired using a 3.0 T Philips Medical Systems scanner. During the acquisition of functional images, subjects kept their eyes open. Each functional volume consists of 48 contiguous slices, and each slice has a grid of 64×64 (TR = 3000 ms, TE = 30 ms, flip angle = 80°, voxel size = 3.313×3.313×3.313 mm³). For each subject, a high-resolution, T1-weighted, sagittal MPRAGE 3D structural image was also captured, which consists of 170 contiguous slices, each of which has a grid of 256×256 (TR = 6.78 ms, TE = 3.157 ms, flip angle = 9.0°, voxel size = 1×1×1.2 mm³).
All subject data were preprocessed using Statistical Parametric Mapping 8 (SPM8) and the Functional Connectivity Toolbox (Conn) on Matlab 2015a. First, by using SPM8, the acquired DICOM images were converted to NIFTI format, giving 140 3D functional images and one 3D structural image for each subject. Then, the first 10 functional images of each subject were discarded to equilibrate the T1 effect. Finally, the remaining preprocessing was done using Conn. The pipeline includes functional realignment and unwarping, centering of the functional images to (0,0,0) coordinates, functional slice-timing correction, centering of the structural image to (0,0,0) coordinates, structural segmentation and normalization, functional normalization, functional outlier detection, and functional smoothing. It should be noted that the selected standard brain is the MNI-space template, the slice order in slice timing is interleaved from top to bottom, and smoothing is done with an 8-mm Gaussian kernel. After preprocessing, the size of the 3D functional images is 91×109×91.
Seeding-based analysis is a model-based method because a seed is selected as the model assumption of RSN. The core idea is to evaluate the temporal correlation between the seed and all other regions in brain, and then constitute the corresponding functional connectivity network. In this paper, we perform seeding-based analysis with 4 seeds, MPFC, PCC, LLP and RLP to extract DMNs for EMCI and LMCI participants by using the Conn software, available online at http://www.nitrc.org/projects/conn/. After preprocessing, we do seed-to-voxel analysis, which applies a weighted general linear model to the weighted correlation measures of the condition-specific association between the seed BOLD time series and each voxel BOLD time series. As a result, four DMNs corresponding to the four seeds were extracted for each subject, which are regarded as features for fusion analysis.
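The core idea of seed-to-voxel analysis can be illustrated with a minimal sketch. The code below computes a plain, unweighted Pearson correlation between one seed time series and every voxel time series; it is not the weighted general linear model used by Conn, and the function name, array shapes and toy data are assumptions made for illustration only.

```python
import numpy as np

def seed_to_voxel_correlation(seed_ts, voxel_ts):
    """Pearson correlation between one seed time series and every voxel.

    seed_ts  : array of shape (T,), mean BOLD signal inside the seed region
    voxel_ts : array of shape (T, V), BOLD signals of V voxels
    returns  : array of shape (V,), one correlation value per voxel
    Note: a voxel with constant signal would give a division by zero.
    """
    seed = seed_ts - seed_ts.mean()
    vox = voxel_ts - voxel_ts.mean(axis=0)
    numerator = seed @ vox
    denominator = np.linalg.norm(seed) * np.linalg.norm(vox, axis=0)
    return numerator / denominator

# Toy data (hypothetical): 130 time points, i.e. 140 volumes minus the 10 discarded
rng = np.random.default_rng(0)
T, V = 130, 5
seed_ts = rng.standard_normal(T)
voxel_ts = rng.standard_normal((T, V))
voxel_ts[:, 0] = seed_ts                # voxel that follows the seed exactly
voxel_ts[:, 1] = -2.0 * seed_ts + 1.0   # voxel that is perfectly anticorrelated

corr_map = seed_to_voxel_correlation(seed_ts, voxel_ts)
```

Repeating this computation with the time series of each of the four seeds would give four correlation maps per subject, which play the role of the four DMNs in this simplified picture.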
In fusion analysis, the extracted features (DMNs) of each subject are concatenated together, so a joint feature is created. Then, a matrix decomposition method (i.e., independent component analysis) is used to represent the joint feature as a linear combination of a set of joint independent sources. The maps of different DMNs in a joint source share a common mixing coefficient. If the fused features are complementary, then the common coefficient could be more discriminative. This constitutes the foundation of the fusion analysis in this paper.
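The joint model described above can be written compactly as follows. The notation is ours and is introduced only for illustration: N is the number of subjects, V the number of voxels per DMN, K the number of fused DMNs, and C the number of joint sources.

```latex
\[
X = \bigl[\, X^{(1)} \;\; X^{(2)} \;\; \cdots \;\; X^{(K)} \,\bigr]
  = A \,\bigl[\, S^{(1)} \;\; S^{(2)} \;\; \cdots \;\; S^{(K)} \,\bigr],
\qquad
X^{(k)} = A\, S^{(k)}, \quad k = 1, \dots, K,
\]
\[
X^{(k)} \in \mathbb{R}^{N \times V}, \qquad
A \in \mathbb{R}^{N \times C}, \qquad
S^{(k)} \in \mathbb{R}^{C \times V}.
\]
```

Row i of A holds the mixing coefficients of subject i. Because the same A multiplies every block S^{(k)}, the maps of the different DMNs within one joint source share a single coefficient per subject.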
To examine whether the fusion of DMNs extracted by four different seeds can integrate complementary information and enhance shared information, we consider all the combinations of the four DMNs, 15 settings in total: 1 combining all 4 DMNs, 4 combining 3 of the 4 DMNs, 6 combining 2 of the 4 DMNs, and 4 using only one of the 4 DMNs.
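The count of 15 settings is the number of nonempty subsets of four seeds, 2^4 - 1. A short sketch that enumerates them is given below; the function and variable names are hypothetical.

```python
from itertools import combinations

SEEDS = ("MPFC", "PCC", "LLP", "RLP")

def seed_combinations(seeds):
    """Return every nonempty subset of seeds, largest subsets first."""
    settings = []
    for k in range(len(seeds), 0, -1):
        settings.extend(combinations(seeds, k))
    return settings

settings = seed_combinations(SEEDS)

# Number of settings for each subset size (1, 2, 3 or 4 fused DMNs)
counts = {k: sum(1 for s in settings if len(s) == k) for k in range(1, 5)}
```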
Classification with nonlinear support vector machine
As a result of fusion analysis, each subject is transformed to an 8-dimensional feature vector. If the DMNs extracted by different seeds can be effectively fused, the resulting feature vectors can be used to effectively classify EMCI and LMCI. Here, classification is performed by a nonlinear SVM, which is suitable for a small number of samples. The radial basis function (RBF) is used as the kernel function, and the two parameters, i.e., the penalty parameter C and the radius of the kernel function g, are determined by grid search with a step size of 0.5. In addition, we use the ROC curve to measure the classification performance. As the output of SVM is not a probability, we train a sigmoid function following the SVM to generate probability output for the test set. The nonlinear SVM is implemented by using the Statistical Pattern Recognition Toolbox software, available at http://cmp.felk.cvut.cz/cmp/software/stprtool/.
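For reference, one common way to write the RBF kernel and the sigmoid that maps an SVM decision value f to a probability is sketched below. The parameterization of the kernel by a factor \gamma is an assumption; the toolbox may instead parameterize the kernel by its width, and the text does not state how g enters the kernel. The sigmoid parameters \alpha and \beta are fitted on training data.

```latex
\[
K(\mathbf{x}_i, \mathbf{x}_j)
  = \exp\!\bigl(-\gamma \,\lVert \mathbf{x}_i - \mathbf{x}_j \rVert^{2}\bigr),
\qquad
P(y = 1 \mid f) = \frac{1}{1 + \exp(\alpha f + \beta)} .
\]
```

Thresholding P(y = 1 | f) at different levels on the test set yields the points of the ROC curve.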
The 18 EMCI subjects and the 18 LMCI subjects are each randomly split into two groups: 13 subjects for training and 5 subjects for testing. This splitting is repeated 100 times, and the final performance is obtained by averaging the results of the 100 test runs. Note that the splitting is done on the DMN data, while each input of the SVM is an 8-dimensional feature vector obtained by fusion analysis. In training, the 8-dimensional feature vectors are obtained by joint ICA. In testing, we take the joint sources from joint ICA as basis sources, and the 8-dimensional feature vectors are obtained by mapping the joint DMN data onto these bases, which is the solution of a least-squares problem.
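The split and the least-squares mapping of test data onto fixed joint sources can be sketched as follows. This is a minimal illustration under stated assumptions: the function names, the toy dimensions and the random stand-in for the joint sources are hypothetical, and the joint ICA step itself is not shown.

```python
import numpy as np

def split_group(n_subjects, n_train, rng):
    """Randomly split indices 0..n_subjects-1 into train and test parts."""
    perm = rng.permutation(n_subjects)
    return np.sort(perm[:n_train]), np.sort(perm[n_train:])

def project_onto_sources(x_test, sources):
    """Least-squares mixing coefficients of test rows on fixed sources.

    x_test  : (n_test, D) joint DMN data of the test subjects
    sources : (C, D) joint sources estimated on the training subjects
    returns : (n_test, C) matrix A minimising ||x_test - A @ sources||
    """
    coef_t, *_ = np.linalg.lstsq(sources.T, x_test.T, rcond=None)
    return coef_t.T

rng = np.random.default_rng(0)

# One of the 100 random splits for a group of 18 subjects: 13 train, 5 test
train_idx, test_idx = split_group(18, 13, rng)

# Toy check (hypothetical sizes): 8 sources over 200 joint voxels
C, D = 8, 200
sources = rng.standard_normal((C, D))    # stand-in for joint ICA sources
true_coef = rng.standard_normal((5, C))  # coefficients used to build the toy data
x_test = true_coef @ sources             # noise-free joint data of 5 test subjects
features = project_onto_sources(x_test, sources)
```

In this noise-free toy case the recovered features equal the coefficients used to generate the data; with real data the projection gives the best fit in the least-squares sense.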
All the 15 combinations of the four seeds of DMNs are tested. Performance comparison is done to check the complementary effect of the seeds for Alzheimer’s disease diagnosis.
DMNs extracted by seeding-based analysis
The most significant joint source from joint ICA
Discussion and conclusion
In this paper, we first applied seeding-based analysis with four seeds to extract DMNs from resting state fMRI data for two groups of AD subjects (EMCI and LMCI), then performed joint ICA on them, and finally trained a nonlinear SVM to classify these two groups of AD subjects with the fused information. The results support our expectation that using complementary information among separately extracted DMNs can improve classification accuracy.
In the classification based on a single seed (without fusion), PCC extraction obtains the best performance. This suggests that if we want to discriminate patients of EMCI and LMCI using one DMN, PCC extraction is the best choice. In our fusion analysis, we found that the maps of LLP and RLP extractions in the joint source show high similarity, and fusing the two extractions can improve classification accuracy by about 5%. This indicates the symmetry and complementarity of the left and right brain functions.
In practice, we often need a unique DMN for each subject to do follow-up research. Now we know PCC and RLP extractions can complement each other, so we can use the RLP extraction to supplement the PCC extraction, which can generate a unique DMN with more complete information.
Our study about the relationship among different seeds is a second-level approach, which is based on the fusion of extracted DMNs instead of straightforwardly analyzing the extracted DMNs. Whether or not the extracted DMNs are complementary is implied in the mixing coefficients, which are short features (rather than thousands of voxels).
For a subject, joint ICA provides an 8-dimensional feature vector that is favorable for classification, and the results support our expectation. However, it is worth pointing out that the feature selection is limited by the hypothesis of independence. We believe that with more advanced feature selection methods (e.g. deep neural networks [40, 41]), the accuracy of EMCI and LMCI classification can be further improved.
The experimental results in this paper show that the fusion of DMNs obtained with different seeds is effective. Moreover, the idea of this work can be extended to multiple RSNs, and different RSNs can also be fused to improve disease diagnosis. The key is to exploit the complementary information among the RSNs.
In summary, we used seeding-based analysis, joint ICA and SVM to improve classification accuracy by combining DMNs extracted with different seeds for two groups of subjects at different stages of AD, and found that the PCC extraction shows the largest difference between EMCI and LMCI. Meanwhile, PCC and RLP extractions as well as LLP and RLP extractions can complement each other in fusion. Our future work will focus on more advanced feature selection methods to improve classification accuracy under the fusion analysis framework.
All work and publication costs are funded by National Natural Science Foundation of China under grant No. 61772367 and National Key Research and Development Program of China under grant No. 2016YFC0901704.
Availability of data and materials
The data and materials are available at http://dmb.tongji.edu.cn/supplementary-information/fuse-dmns-with-multiple-seeds.
About this supplement
This article has been published as part of BMC Bioinformatics Volume 19 Supplement 19, 2018: Proceedings of the 29th International Conference on Genome Informatics (GIW 2018): bioinformatics. The full contents of the supplement are available online at https://bmcbioinformatics.biomedcentral.com/articles/supplements/volume-19-supplement-19.
SP did the work and drafted the paper; JG devised and supervised this work; SZ supervised this research and revised the paper. All authors read and approved the final manuscript.
Ethics approval and consent to participate
Consent for publication
The authors declare that they have no competing interests.
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
- 3. Roge RE, Madsen KH, Schmidt MN, Morup M. Unsupervised segmentation of task activated regions in fMRI. In: IEEE International Workshop on Machine Learning for Signal Processing. Boston: IEEE: 2015. p. 1–6.
- 12. Du W, Li H, Li XL, Calhoun VD, Adali T. ICA of fMRI data: performance of three ICA algorithms and the importance of taking correlation information into account. In: IEEE International Symposium on Biomedical Imaging. Chicago: IEEE: 2011. p. 1573–6.
- 22. Ren Y, Hu X, Lv J, Quo L, Han J, Liu T. Identifying autism biomarkers in default mode network using sparse representation of resting-state fMRI data. In: IEEE International Symposium on Biomedical Imaging. Prague: IEEE: 2016. p. 1278–81.
- 39. Platt J. Probabilistic outputs for support vector machines and comparisons to regularized likelihood methods. Adv Large Margin Classifiers. 1999; 10(3):61–74.
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.