## Abstract

A hyper-connectivity network is a network in which each edge can connect more than two nodes, and it can be naturally represented by a hyper-graph. Hyper-connectivity brain networks, based on either structural or functional interactions among brain regions, have been used for brain disease diagnosis. However, the conventional hyper-connectivity network is constructed from single-modality data only, ignoring potentially complementary information conveyed by other modalities. Integrating complementary information from multiple modalities has been shown to provide a more comprehensive representation of brain disruptions. In this paper, we propose a novel multimodal hyper-network modelling method for improving the diagnostic accuracy of mild cognitive impairment (MCI). Specifically, we first constructed a multimodal hyper-connectivity network by simultaneously considering information from diffusion tensor imaging and resting-state functional magnetic resonance imaging data. We then extracted different types of network features from the hyper-connectivity network and used a manifold regularized multi-task feature selection method to jointly select the most discriminative features. The proposed multimodal hyper-connectivity network achieved better MCI classification performance than conventional single-modality hyper-connectivity networks.

## 1 Introduction

A hyper-connectivity brain network is a network in which each edge can connect more than two brain regions, and it can be naturally represented by a hyper-graph. Hyper-connectivity networks, based on either structural or functional interactions among brain regions, have been used for brain disease diagnosis [1]. Functional and structural interactions can be extracted from functional magnetic resonance imaging (fMRI) and diffusion tensor imaging (DTI), respectively [2]. However, the conventional hyper-network, which is constructed from single-modality data only, ignores potentially complementary information conveyed by other modalities. Integrating complementary information from different modalities has been shown to provide a more comprehensive representation of the brain's structural and functional organization [3, 4]. Inspired by this observation, a classification framework based on multimodal brain networks constructed from resting-state fMRI (rs-fMRI) and DTI has been proposed to enhance the classification performance for mild cognitive impairment (MCI) [5].

In this paper, we proposed the first multimodal hyper-connectivity network modelling method that simultaneously considers the information from rs-fMRI and DTI data during the network construction. Specifically, the multimodal hyper-connectivity network was constructed using a star expansion method [6] based on the anatomically weighted functional distance between pairs of brain regions. The anatomically weighted functional distance, which is defined as the strength of the anatomically weighted functional connectivity (awFC) [7], was computed using the complementary information conveyed by the rs-fMRI and DTI data. We then extracted network features from the constructed hyper-connectivity network, and selected the most discriminative features using a manifold regularized multi-task feature selection method (M2TFS) [1]. Finally, we applied a support vector machine (SVM) on the selected features for MCI classification. Promising classification results demonstrated the superiority of the proposed multimodal hyper-connectivity network over the single-modal hyper-connectivity networks which were constructed either from rs-fMRI or DTI data.

## 2 Materials and Methodology

### 2.1 Dataset

Ten MCI patients (5M/5F) and 17 normal controls (8M/9F) were included in this study. Informed consent was obtained from all participants, and the experimental protocols were approved by the institutional ethics board. The mean ages of the MCI and control groups were 74.2 ± 8.6 and 72.1 ± 8.2 years, respectively. All subjects were scanned using a 3.0-Tesla scanner to acquire the rs-fMRI and DTI data. The acquisition parameters for rs-fMRI were as follows: repetition time (TR) = 2000 ms, echo time (TE) = 32 ms, flip angle = 77°, acquisition matrix = 64 × 64, voxel size = 4 mm. One hundred fifty fMRI volumes were acquired. During the scan, which lasted 5 min, all subjects were instructed to keep their eyes open and stare at a fixation cross in the middle of the screen. The acquisition parameters for DTI were as follows: b = 0 and 1000 s/mm^{2}, flip angle = 90°, TR/TE = 17000/78 ms, imaging matrix = 128 × 128, FOV = 256 × 256 mm^{2}, voxel thickness = 2 mm, and 72 continuous slices.

### 2.2 Data Preprocessing

Resting-state fMRI images were preprocessed using the Statistical Parametric Mapping software package (SPM8). Specifically, the first 10 fMRI volumes were removed before parcellating the brain space into 116 regions-of-interest (ROIs) based on the automated anatomical labeling (AAL) [8] template. We averaged the fMRI time series over all voxels in each ROI to compute the mean fMRI time series. Prior to constructing the hyper-connectivity network, temporal band-pass filtering with frequency interval (\( 0.025 \le f \le 0.100\,\text{Hz} \)) was applied to the mean time series of each individual ROI to reduce the effects of physiological and measurement noise. Following a previous study, global signal regression was not performed because its use in rs-fMRI preprocessing remains controversial [9].
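The band-pass step can be illustrated with a short sketch. The filter design is not stated above, so an ideal (brick-wall) filter applied through a direct discrete Fourier transform is assumed here; the function name `bandpass` and the toy signal are hypothetical. With 140 retained volumes at TR = 2 s, the frequency resolution is 1/280 Hz.

```python
import cmath
import math


def bandpass(x, tr=2.0, f_lo=0.025, f_hi=0.100):
    """Ideal band-pass of a real series x sampled every `tr` seconds.

    Assumption: a brick-wall mask in the DFT domain; the filter design
    actually used for the study is not specified.
    """
    n = len(x)
    # Forward DFT (O(n^2), which is fine for ~140 volumes).
    spec = [
        sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]
    for k in range(n):
        # Frequency in Hz of bin k; the upper half mirrors negative frequencies.
        f = min(k, n - k) / (n * tr)
        if not (f_lo <= f <= f_hi):
            spec[k] = 0.0
    # Inverse DFT; the output is real because the mask is symmetric.
    return [
        (sum(spec[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n).real
        for t in range(n)
    ]


# Toy ROI-mean series: 140 volumes (150 acquired minus the first 10), TR = 2 s.
n_vol, tr = 140, 2.0
slow = [math.sin(2 * math.pi * 0.05 * t * tr) for t in range(n_vol)]  # inside the band
fast = [math.sin(2 * math.pi * 0.20 * t * tr) for t in range(n_vol)]  # outside the band
raw = [5.0 + s + f for s, f in zip(slow, fast)]
filtered = bandpass(raw, tr=tr)
```

In this toy case the constant offset and the 0.20 Hz component are removed, and only the 0.05 Hz component remains. A practical pipeline would more likely use a tapered filter to limit ringing.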

Similar to the fMRI preprocessing, DTI images were aligned to the AAL template space using a deformable DTI registration algorithm (F-TIMER) [10] before parcellating the brain space into 116 ROIs. Whole-brain streamline fiber tractography was then applied to each image using ExploreDTI [11] with a minimal seed-point fractional anisotropy (FA) of 0.45, a stopping FA of 0.25, a minimal fiber length of 20 mm, and a maximal fiber length of 400 mm.

### 2.3 Methods

### Anatomically Weighted Functional Distance.

We proposed a novel multimodal hyper-connectivity network modelling method that simultaneously utilizes the information from rs-fMRI and DTI data. Our method is based on the anatomically weighted functional distance, which uses the structural evidence in the DTI data to supplement the fMRI data, as defined below [7]

where \( \pi_{ij} \in [0, 1) \) is the strength of the DTI-based structural connectivity between brain regions \( i \) and \( j \), \( \lambda \in \Omega = [1, \infty) \) is an unknown parameter that potentially attenuates the anatomical weighting, and \( FD_{ij} \) is the functional distance between the fMRI profiles. Equation (1) explicitly incorporates brain anatomy to guide a more accurate inference of the functional connectivity between two brain regions. Following the premise that a structural connection is neither a sufficient nor a necessary condition for a functional connection [7], the parameter \( \lambda \) was introduced in Eq. (1) to regulate the contribution of the structural connection, especially where no fibers connect two regions. The functional distance between the fMRI profiles of ROIs \( i \) and \( j \) at lag \( o \) is defined as [7]

where \( x_{i}(t) \) denotes the fMRI time series of ROI \( i \) at time \( t \), \( T \) is the total number of rs-fMRI volumes, \( \hat{\sigma}_{i} \) and \( \hat{\sigma}_{j} \) denote the sample standard deviations of \( x_{i} \) and \( x_{j} \), and \( \bar{x}_{i} \) and \( \bar{x}_{j} \) denote their sample means. For ease of explanation, we considered only positive correlations. In view of the potential differences in the hemodynamic responses of resting-state neuronal activity between different brain regions, we estimated the functional distance at several lags \( o \in O = [-3, 3] \) and took the minimum distance over these lags [7].
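The lag search can be sketched as follows. The exact distance formula is not reproduced here, so this sketch assumes \( FD(o) = 1 - \max(0, r(o)) \), where \( r(o) \) is the Pearson correlation between \( x_{i}(t) \) and \( x_{j}(t + o) \) computed over the overlapping samples. That form, the choice of normalizing on the overlap, and the function names are all assumptions made for illustration.

```python
import math


def pearson(a, b):
    """Sample Pearson correlation of two equal-length sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((p - ma) * (q - mb) for p, q in zip(a, b))
    va = sum((p - ma) ** 2 for p in a)
    vb = sum((q - mb) ** 2 for q in b)
    return cov / math.sqrt(va * vb)


def functional_distance(xi, xj, max_lag=3):
    """Minimum lagged distance between two ROI time series.

    Assumed form: FD(o) = 1 - max(0, r(o)), with r(o) the correlation of
    x_i(t) and x_j(t + o); only positive correlations reduce the distance.
    """
    best = float("inf")
    for o in range(-max_lag, max_lag + 1):
        if o >= 0:
            a, b = xi[: len(xi) - o], xj[o:]
        else:
            a, b = xi[-o:], xj[: len(xj) + o]
        r = min(1.0, max(0.0, pearson(a, b)))  # clamp guards against rounding
        best = min(best, 1.0 - r)
    return best


# Toy series: xj is xi delayed by two volumes, so lag o = 2 aligns them.
xi = [math.sin(0.3 * t) for t in range(140)]
xj = [math.sin(0.3 * (t - 2)) for t in range(140)]
fd_lagged = functional_distance(xi, xj)
fd_anti = functional_distance(xi, [-v for v in xi])
```

Under these assumptions the delayed copy has a distance close to zero, while a sign-flipped copy keeps the maximum distance of 1 because negative correlations are ignored.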

The structural distance, which represents the strength of the DTI-based structural connectivity between pairs of ROIs, is defined as [7]

where \( \pi_{ij} \), the average on-fiber FA, denotes the strength of the structural connection between ROIs \( i \) and \( j \), and \( \lambda \) denotes an unknown parameter that potentially reduces the effect of the structural data. Indirect structural connections were allowed by defining \( \pi_{ij} = \max[\pi_{ij}, \max_{l}(\pi_{il}, \pi_{lj})] \) [7]. The optimal \( \lambda \) was determined empirically by minimizing the impact of false-positive structural connectivity [7].
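To make the roles of \( \pi_{ij} \) and \( \lambda \) concrete, the sketch below assumes a multiplicative weighting, \( d_{ij} = (1 - \pi_{ij}/\lambda)\, FD_{ij} \). This form is an assumption chosen to agree with the stated ranges \( \pi_{ij} \in [0, 1) \) and \( \lambda \in [1, \infty) \); it is not taken from the equations of this section, and the function name is hypothetical.

```python
def aw_functional_distance(fd, pi, lam):
    """Anatomically weighted distance under the ASSUMED form (1 - pi/lam) * fd.

    fd  : functional distance between two ROIs
    pi  : structural connectivity strength, 0 <= pi < 1
    lam : attenuation parameter, lam >= 1
    """
    if not 0.0 <= pi < 1.0:
        raise ValueError("pi must lie in [0, 1)")
    if lam < 1.0:
        raise ValueError("lam must be at least 1")
    return (1.0 - pi / lam) * fd


no_fibers = aw_functional_distance(0.8, pi=0.0, lam=1.0)       # no structural evidence
strong_link = aw_functional_distance(0.8, pi=0.5, lam=1.0)     # full anatomical weighting
attenuated = aw_functional_distance(0.8, pi=0.5, lam=1000.0)   # weighting almost switched off
```

With this form, a pair without connecting fibers keeps its functional distance, a structurally connected pair is pulled closer, and a large \( \lambda \) makes the distance approach the purely functional one.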

### Hyper-graph Construction.

We constructed the multimodal hyper-graph from the anatomically weighted functional distance. Let \( V \) be the vertex set and \( E \) the hyper-edge set of a hyper-graph \( G \). For the \( n \)-th subject with \( P \) ROIs, a hyper-graph \( G_{n} = (V_{n}, E_{n}) \) with \( P \) vertices can be constructed, with each vertex representing an ROI. We employed a star expansion method [6] to generate hyper-edges among vertices. Specifically, for each distance matrix, a vertex was first selected as the centroid vertex, and a hyper-edge was then constructed by linking the centroid vertex to its nearest neighbors within a distance of \( \varphi\bar{d} \) [6]. Here, \( \bar{d} \) is the average anatomically weighted distance between regions, and \( \varphi \), which was set to 0.78 via grid search on training data, is a hyper-parameter controlling the sparsity of the hyper-network. Note that the constructed hyper-edges were unweighted.
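The star expansion step can be sketched as follows. This is a minimal illustration, assuming that \( \bar{d} \) is the mean over all off-diagonal entries of the distance matrix and that "within" means less than or equal to the radius; the function name and the toy matrix are hypothetical.

```python
def build_hyperedges(dist, phi):
    """Return one hyper-edge per centroid vertex.

    Each hyper-edge holds the centroid plus every vertex whose distance
    to it is at most phi * d_bar.
    """
    p = len(dist)
    off_diag = [dist[i][j] for i in range(p) for j in range(p) if i != j]
    d_bar = sum(off_diag) / len(off_diag)  # assumed: global mean distance
    radius = phi * d_bar
    edges = []
    for c in range(p):
        members = {c} | {j for j in range(p) if j != c and dist[c][j] <= radius}
        edges.append(frozenset(members))
    return edges


# Toy symmetric distance matrix for four ROIs.
dist = [
    [0.0, 0.2, 0.9, 0.8],
    [0.2, 0.0, 0.3, 0.9],
    [0.9, 0.3, 0.0, 0.7],
    [0.8, 0.9, 0.7, 0.0],
]
edges = build_hyperedges(dist, phi=0.78)
```

In the toy matrix, ROI 3 is far from all others, so its hyper-edge contains only itself. A smaller \( \varphi \) shrinks every hyper-edge and yields a sparser hyper-network.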

### Feature Extraction and Selection.

Topological properties derived from a hyper-connectivity network provide quantitative measures for studying differences in brain organization between MCI subjects and normal controls (NC). In this study, we extracted three different types of clustering coefficients from the constructed multimodal hyper-connectivity network. Given a multimodal hyper-network \( G = (V, E) \), let \( M(v) \) be the hyper-edges adjacent to the vertex \( v \), i.e., \( M(v) = \{ e \in E : v \in e \} \), and \( N(v) \) the neighboring vertices of \( v \), i.e., \( N(v) = \{ u \in V : \exists e \in E,\ u, v \in e \} \). Then, three different types of clustering coefficients [1] can be computed on the vertex \( v \) as

where \( u, q, v \in V \) and \( e \in E \). \( I(u, q, \neg v) = 1 \) if there exists \( e \in E \) such that \( u, q \in e \) but \( v \notin e \), and 0 otherwise. \( I'(u, q, v) = 1 \) if there exists \( e \in E \) such that \( u, q, v \in e \), and 0 otherwise. The three types of clustering coefficient features represent the topological properties of the multimodal hyper-connectivity network from three different perspectives. Specifically, \( \text{HCC}^{1} \) denotes the number of neighboring nodes that have connections not facilitated by node \( v \). In contrast, \( \text{HCC}^{2} \) denotes the number of neighboring nodes with connections facilitated by node \( v \), given that these nodes may share some brain functions with each other and with node \( v \). \( \text{HCC}^{3} \) denotes the amount of overlap among the adjacent hyper-edges of node \( v \). We jointly selected features from these three types of clustering coefficients using a manifold regularized multi-task feature selection method (M2TFS) defined as [1]
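The two indicator functions can be turned into a small sketch for the first two coefficients. The normalization is an assumption: each coefficient is computed as the fraction of unordered neighbor pairs for which the indicator equals 1, and \( v \) itself is excluded from \( N(v) \). \( \text{HCC}^{3} \) is left out because its formula cannot be inferred from the description alone. Function names and the toy hyper-graph are hypothetical.

```python
from itertools import combinations


def neighbors(v, edges):
    """N(v): vertices sharing at least one hyper-edge with v (v excluded)."""
    return set().union(*[e for e in edges if v in e]) - {v}


def hcc1_hcc2(v, edges):
    """Return (HCC1, HCC2) for vertex v as fractions of neighbor pairs.

    HCC1 counts pairs joined by a hyper-edge that does NOT contain v;
    HCC2 counts pairs joined by a hyper-edge that DOES contain v.
    """
    pairs = list(combinations(sorted(neighbors(v, edges)), 2))
    if not pairs:
        return 0.0, 0.0
    without_v = sum(
        any(u in e and q in e and v not in e for e in edges) for u, q in pairs
    )
    with_v = sum(
        any(u in e and q in e and v in e for e in edges) for u, q in pairs
    )
    return without_v / len(pairs), with_v / len(pairs)


# Toy hyper-graph with three hyper-edges over four vertices.
edges = [frozenset({0, 1, 2}), frozenset({1, 2, 3}), frozenset({0, 3})]
h1, h2 = hcc1_hcc2(0, edges)
```

For vertex 0 the neighbors are 1, 2 and 3. All three neighbor pairs share the hyper-edge {1, 2, 3}, which excludes vertex 0, while only the pair (1, 2) shares a hyper-edge that also contains vertex 0.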

where \( Z^{c} = [z_{1}^{c}, \cdots, z_{n}^{c}, \cdots, z_{N}^{c}]^{T} \in R^{N \times P} \) denotes the set of features from a total of \( N \) training subjects, each with \( P \) regions, and \( z_{n}^{c} = [\text{HCC}^{c}(v_{i})]_{i = 1:P} \in R^{P} \) is the vector of clustering coefficients of the \( n \)-th training subject on task \( c \) (in our case, a task corresponds to selecting features from one type of clustering coefficient). \( Y = [y_{1}, \cdots, y_{n}, \cdots, y_{N}]^{T} \in R^{N} \) is the response vector for the \( N \) training subjects, where \( y_{n} \) is the class label of the \( n \)-th training subject. \( L^{c} = D^{c} - S^{c} \) is the combinatorial Laplacian matrix on task \( c \), where \( S^{c} \) is a matrix that describes the similarity across training subjects on the \( c \)-th task and \( D^{c} \) is a diagonal matrix defined as \( D^{c}(n, n) = \sum_{m = 1}^{N} S^{c}(n, m) \). \( W = [w^{1}, w^{2}, \cdots, w^{C}] \in R^{P \times C} \) is a weight matrix with \( C \) being the total number of tasks (i.e., \( C = 3 \)), and \( \| W \|_{2,1} = \sum_{i = 1}^{P} \| w_{i} \|_{2} \) is the group sparsity regularizer that encourages features from different tasks to be jointly selected. Here, \( w_{i} \) is the \( i \)-th row vector of \( W \), \( \beta \) and \( \gamma \) are the corresponding regularization coefficients, and \( h \) is a free parameter to be tuned empirically. The values of \( h \), \( \beta \) and \( \gamma \) can be determined via inner cross-validation on the training subjects.
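Two building blocks defined above, the combinatorial Laplacian \( L^{c} = D^{c} - S^{c} \) and the \( \ell_{2,1} \) norm, can be sketched directly from their definitions. The quadratic form `manifold_penalty` is an added assumption about how a Laplacian of this kind usually enters a manifold regularizer; the full objective is not reproduced here, and all function names and toy values are hypothetical.

```python
import math


def laplacian(s):
    """L = D - S, with D(n, n) equal to the n-th row sum of S."""
    n = len(s)
    return [
        [(sum(s[i]) if i == j else 0.0) - s[i][j] for j in range(n)]
        for i in range(n)
    ]


def l21_norm(w):
    """Sum of the Euclidean norms of the rows of W (one row per feature)."""
    return sum(math.sqrt(sum(x * x for x in row)) for row in w)


def manifold_penalty(p, lap):
    """Quadratic form p^T L p for a vector p of per-subject projections."""
    n = len(p)
    return sum(p[i] * lap[i][j] * p[j] for i in range(n) for j in range(n))


# Toy similarity between three subjects: the first two are similar.
S = [[1.0, 0.5, 0.0], [0.5, 1.0, 0.0], [0.0, 0.0, 1.0]]
L = laplacian(S)
penalty = manifold_penalty([1.0, 3.0, 10.0], L)

# Toy weight matrix: 3 features (rows) by 3 tasks (columns).
W = [[3.0, 4.0, 0.0], [0.0, 0.0, 0.0], [1.0, 0.0, 0.0]]
sparsity = l21_norm(W)
```

For a symmetric \( S \), the quadratic form equals \( \tfrac{1}{2} \sum_{n,m} S(n, m)(p_{n} - p_{m})^{2} \), so it penalizes projections that separate similar subjects. The row-wise norm is zero only when a feature is unused in every task, which is what drives joint selection.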

### Classification.

We employed a multi-kernel SVM to fuse the three types of clustering coefficient features for MCI classification. Specifically, let \( f_{n}^{c} \) be the selected features from the \( c \)-th task of the \( n \)-th subject. We computed a linear kernel based on the features selected by the M2TFS method for each type of clustering coefficient and then fused the kernels via a multi-kernel technique given as follows:

\[ k\left( f_{n}, f_{m} \right) = \sum_{c = 1}^{C} \mu^{c}\, k^{c}\left( f_{n}^{c}, f_{m}^{c} \right) \]

where \( k^{c}(f_{n}^{c}, f_{m}^{c}) \) denotes the linear kernel function between the \( n \)-th and \( m \)-th subjects for the \( c \)-th set of selected clustering coefficients, and \( \mu^{c} \) is a non-negative weight coefficient with \( \sum_{c = 1}^{C} \mu^{c} = 1 \). A coarse-grid search was used to optimize \( \mu^{c} \) through a nested cross-validation on the training subjects.
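The kernel fusion and the weight grid can be sketched as follows. The fused kernel is taken to be the weighted sum of the per-task linear kernels, and the grid uses the step size of 0.1 reported in Sect. 3. Function names and toy feature vectors are hypothetical.

```python
def linear_kernel(a, b):
    """Inner product of two feature vectors."""
    return sum(x * y for x, y in zip(a, b))


def fused_kernel(feats_n, feats_m, mu):
    """Weighted sum of per-task linear kernels.

    feats_n, feats_m : one feature vector per task for each subject
    mu               : non-negative task weights summing to 1
    """
    return sum(w * linear_kernel(a, b) for w, a, b in zip(mu, feats_n, feats_m))


def weight_grid(n_tasks=3, steps=10):
    """All non-negative weight tuples on a 1/steps grid that sum to 1."""
    def rec(remaining, left):
        if left == 1:
            yield (remaining,)
            return
        for k in range(remaining + 1):
            for tail in rec(remaining - k, left - 1):
                yield (k,) + tail

    return [tuple(k / steps for k in combo) for combo in rec(steps, n_tasks)]


# Two toy subjects with selected features for three clustering coefficient types.
subject_n = [[1.0, 2.0], [3.0], [0.0, 1.0]]
subject_m = [[2.0, 0.0], [1.0], [4.0, 5.0]]
value = fused_kernel(subject_n, subject_m, mu=(0.5, 0.3, 0.2))
grid = weight_grid()
```

Enumerating integer counts and dividing once avoids accumulating floating-point error along the grid. Each candidate tuple would then be scored on the inner cross-validation folds, and the resulting Gram matrix passed to an SVM that accepts precomputed kernels.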

## 3 Experiment Results

Due to the limited sample size, we employed a nested leave-one-out cross-validation (LOOCV) scheme to evaluate the performance and generalization power of our proposed method. In the inner LOOCV loop, the training data was used to optimize the parameters \( h \), \( \beta \) and \( \gamma \) that identify a set of the most discriminative features for classification. To determine the weights \( \mu^{c} \) for integrating multiple kernels, we used a grid search with the range [0, 1] at a step size of 0.1.
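The nested leave-one-out scheme can be sketched as index bookkeeping. This is only an illustration of how the folds nest; model fitting and parameter selection are omitted, and the function name is hypothetical.

```python
def nested_loocv(n_subjects):
    """Yield (test, outer_train, inner_folds) for each outer fold.

    inner_folds is a list of (inner_train, validation) pairs built only
    from outer_train, so the held-out test subject never influences
    parameter selection.
    """
    subjects = list(range(n_subjects))
    for test in subjects:
        outer_train = [s for s in subjects if s != test]
        inner_folds = [
            ([s for s in outer_train if s != val], val) for val in outer_train
        ]
        yield test, outer_train, inner_folds


# 27 subjects in total (10 MCI and 17 NC).
folds = list(nested_loocv(27))
```

With 27 subjects there are 27 outer folds, each containing 26 inner folds of 25 training subjects. Parameters such as \( h \), \( \beta \), \( \gamma \) and \( \mu^{c} \) would be chosen on the inner folds before the outer test subject is classified.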

The proposed method was compared with three single-modal models: hyper-networks derived from fMRI data alone, hyper-networks derived from DTI data alone, and hyper-networks constructed from fMRI using sparse representation (fMRI-SR) [1]. For the fMRI-SR model, the regularization parameter that determines the sparsity level of the hyper-networks was set to multiple values, [0.1, 0.2, …, 0.9]. As shown in Table 1, the proposed method yielded an accuracy of 96.3%, which is 7.4 percentage points higher than that of the second-best model, the DTI-based hyper-network. The fMRI-based hyper-network model performed the worst, with an accuracy of 74.1%. The area under the receiver operating characteristic curve (AUC) was used to evaluate generalization performance, and the proposed method achieved an AUC of 0.98, indicating excellent generalization performance.
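Because each of the 27 subjects is tested exactly once under LOOCV, every accuracy is a multiple of 1/27. The short check below maps the reported percentages back to counts of correctly classified subjects. These counts are inferred from the percentages and are not reported values; the helper name is hypothetical.

```python
N_SUBJECTS = 27  # 10 MCI + 17 NC


def loocv_accuracy(n_correct, n_total=N_SUBJECTS):
    """Accuracy in percent, rounded to one decimal place."""
    return round(100.0 * n_correct / n_total, 1)


proposed = loocv_accuracy(26)    # inferred count for the multimodal model
dti_based = loocv_accuracy(24)   # inferred count for the DTI-based model
fmri_based = loocv_accuracy(20)  # inferred count for the fMRI-based model
gap_points = round(proposed - dti_based, 1)
```

Under this reading, the gap between the proposed and DTI-based models corresponds to two subjects, which also shows how coarse the accuracy scale is at this sample size.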

As shown in Table 2, 11 features were the most discriminative, being selected in every LOOCV fold. The corresponding brain regions included regions located in the frontal lobes (e.g., left inferior frontal gyrus (triangular) [12] and left rectus gyrus [13]), the temporal lobes (e.g., left temporal pole and middle temporal gyrus [14]), the cerebellum, and other regions including the hippocampus [14] and occipital gyrus [14]. Our findings are consistent with previous findings that (1) atrophy of regions in the temporal and frontal lobes was found in early AD [15], and (2) glial accumulation of redox-active iron in the cerebellum was found to be significant in preclinical Alzheimer’s disease patients [16]. Figure 1 graphically illustrates the significant differences in hypergraph structure between MCI and NC [1]. For example, in Fig. 1(b), the right hippocampus (HIP.R) was connected to the left hippocampus (HIP.L), left thalamus (THA.L), right thalamus (THA.R), right parahippocampal gyrus (PHG.R), right lenticular nucleus (pallidum) (PAL.R) and right cerebellum 3 (CRBL3.R) in MCI, while it was connected to the left hippocampus (HIP.L), left thalamus (THA.L), right thalamus (THA.R), right parahippocampal gyrus (PHG.R), right temporal pole (superior) (TPOsup.R) and right cerebellum 6 (CRBL6.R) in NC. As the hippocampus is highly associated with memory performance, this pattern of alteration in connectivity involving the hippocampus may provide clues to the underpinnings of cognitive deficits in MCI.

## 4 Conclusion

In this paper, we proposed a novel multimodal hyper-network modelling method for improving the diagnostic accuracy of MCI. The proposed hyper-connectivity network encodes complementary information from multiple modalities to provide a more comprehensive representation of the brain's structural and functional organization. We demonstrated the superiority of the proposed method via MCI classification. Compared with the single-modal methods, the proposed method achieved higher classification accuracy and better generalization performance. In the future, we will evaluate the performance of the proposed method on larger datasets.

## References

1. Jie, B., Wee, C.-Y., Shen, D., Zhang, D.: Hyper-connectivity of functional networks for brain disease diagnosis. Med. Image Anal. **32**, 84–100 (2016)
2. Zhu, D., Zhang, T., Jiang, X., Hu, X., Chen, H., Yang, N., Lv, J., Han, J., Guo, L., Liu, T.: Fusing DTI and fMRI data: a survey of methods and applications. Neuroimage **102**, 184–191 (2014)
3. Greicius, M.D., Supekar, K., Menon, V., Dougherty, R.F.: Resting-state functional connectivity reflects structural connectivity in the default mode network. Cereb. Cortex **19**, 72–78 (2009)
4. Van den Heuvel, M.P., Mandl, R.C., Kahn, R.S., Pol, H., Hilleke, E.: Functionally linked resting-state networks reflect the underlying structural connectivity architecture of the human brain. Hum. Brain Mapp. **30**, 3127–3141 (2009)
5. Wee, C.-Y., Yap, P.-T., Zhang, D., Denny, K., Browndyke, J.N., Potter, G.G., Welsh-Bohmer, K.A., Wang, L., Shen, D.: Identification of MCI individuals using structural and functional connectivity networks. Neuroimage **59**, 2045–2056 (2012)
6. Gao, Y., Wee, C.-Y., Kim, M., Giannakopoulos, P., Montandon, M.-L., Haller, S., Shen, D.: MCI identification by joint learning on multiple MRI Data. In: Navab, N., Hornegger, J., Wells, W.M., Frangi, A.F. (eds.) MICCAI 2015. LNCS, vol. 9350, pp. 78–85. Springer, Cham (2015). doi:10.1007/978-3-319-24571-3_10
7. Bowman, F.D., Zhang, L., Derado, G., Chen, S.: Determining functional connectivity using fMRI data with diffusion-based anatomical weighting. Neuroimage **62**, 1769–1779 (2012)
8. Tzourio-Mazoyer, N., Landeau, B., Papathanassiou, D., Crivello, F., Etard, O., Delcroix, N., Mazoyer, B., Joliot, M.: Automated anatomical labeling of activations in SPM using a macroscopic anatomical parcellation of the MNI MRI single-subject brain. Neuroimage **15**, 273–289 (2002)
9. Murphy, K., Birn, R.M., Handwerker, D.A., Jones, T.B., Bandettini, P.A.: The impact of global signal regression on resting state correlations: are anti-correlated networks introduced? Neuroimage **44**, 893–905 (2009)
10. Yap, P.-T., Wu, G., Zhu, H., Lin, W., Shen, D.: F-TIMER: fast tensor image morphing for elastic registration. IEEE Trans. Med. Imaging **29**, 1192–1203 (2010)
11. Leemans, A., Jeurissen, B., Sijbers, J., Jones, D.: ExploreDTI: a graphical toolbox for processing, analyzing, and visualizing diffusion MR data. In: ISMRM, p. 3537 (2009)
12. Bell-McGinty, S., Lopez, O.L., Meltzer, C.C., Scanlon, J.M., Whyte, E.M., DeKosky, S.T., Becker, J.T.: Differential cortical atrophy in subgroups of mild cognitive impairment. Arch. Neurol. **62**, 1393–1397 (2005)
13. Fleisher, A.S., Sherzai, A., Taylor, C., Langbaum, J.B., Chen, K., Buxton, R.B.: Resting-state BOLD networks versus task-associated functional MRI for distinguishing Alzheimer’s disease risk groups. Neuroimage **47**, 1678–1690 (2009)
14. Salvatore, C., Cerasa, A., Battista, P., Gilardi, M.C., Quattrone, A., Castiglioni, I.: Magnetic resonance imaging biomarkers for the early diagnosis of Alzheimer’s disease: a machine learning approach. Front. Neurosci. **9**, 307 (2015)
15. Möller, C., Vrenken, H., Jiskoot, L., Versteeg, A., Barkhof, F., Scheltens, P., van der Flier, W.M.: Different patterns of gray matter atrophy in early-and late-onset Alzheimer’s disease. Neurobiol. Aging **34**, 2014–2022 (2013)
16. Smith, M.A., Zhu, X., Tabaton, M., Liu, G., McKeel Jr., D.W., Cohen, M.L., Wang, X., Siedlak, S.L., Dwyer, B.E., Hayashi, T.: Increased iron and free radical generation in preclinical Alzheimer disease and mild cognitive impairment. J. Alzheimers Dis. **19**, 363–372 (2010)

## Copyright information

© 2017 Springer International Publishing AG

## About this paper

### Cite this paper

Li, Y. *et al.* (2017). Multimodal Hyper-connectivity Networks for MCI Classification. In: Descoteaux, M., Maier-Hein, L., Franz, A., Jannin, P., Collins, D., Duchesne, S. (eds) Medical Image Computing and Computer Assisted Intervention − MICCAI 2017. Lecture Notes in Computer Science, vol 10433. Springer, Cham. https://doi.org/10.1007/978-3-319-66182-7_50

DOI: https://doi.org/10.1007/978-3-319-66182-7_50

Publisher Name: Springer, Cham

Print ISBN: 978-3-319-66181-0

Online ISBN: 978-3-319-66182-7

eBook Packages: Computer Science, Computer Science (R0)