Hierarchical graph learning with convolutional network for brain disease prediction

Liu, Tong; Liu, Fangqi; Wan, Yingying; Hu, Rongyao; Zhu, Yongxin; Li, Li

doi:10.1007/s11042-023-17187-8


Open access
Published: 23 October 2023

Volume 83, pages 46161–46179, (2024)

Multimedia Tools and Applications


Abstract

In computer-aided diagnostic systems, the functional connectome approach has become a common method for detecting neurological disorders. However, the existing methods either ignore the uniqueness of different subjects across the functional connectivities or neglect the commonality of the same disease in the functional connectivity of each subject, and therefore fail to capture a comprehensive functional model. To solve these issues, we develop a hierarchical graph learning with convolutional network that considers not only the unique information of each subject, but also the common information across subjects. Specifically, the proposed method consists of two structures. One is the individual graph model, which selects the representative brain regions by combining the features of each subject with its brain region-based graph. The other is the population graph model, which performs classification directly by updating the information of each subject using both the subject itself and its nearest neighbours. Experimental results on four real datasets indicate that the proposed method outperforms the state-of-the-art approaches.


1 Introduction

Medical imaging technologies have already been applied to the diagnosis of brain diseases such as Alzheimer’s disease (AD) and Fronto-Temporal Dementia (FTD), since they exploit physical phenomena to create visual images of both the internal tissues and the external structure of the human body in a noninvasive manner [4, 5].

Among the many clinical imaging methods, resting-state functional magnetic resonance imaging (rs-fMRI) provides rapid identification of functional areas for different patient groups [26]. It is therefore an important tool for studying spontaneous functional brain activity in the resting state [27, 28]. Specifically, brain Functional Connectivity Networks (FCNs) have become a powerful means of measuring and mapping brain activity. They are first constructed from Blood-Oxygen-Level Dependent (BOLD) signals, and then diverse machine learning methods characterize the patterns of functional brain activity and feed them to classifiers [17, 22]. Thus, the existing methods on rs-fMRI data comprise three processes, i.e., FCN creation, feature learning and disorder diagnosis.

FCN creation uses different kinds of relationships between brain regions to describe the statistics of neural activity or to represent the degree of correlation between two brain regions [32]. Common methods of FCN creation include linear ones (e.g., partial coherence/correlation or Pearson correlation) and non-linear ones (e.g., mutual information) [44]. Feature learning searches for informative features, i.e., deep representations and semantic information, in the FCNs of each subject [33]. Disorder diagnosis employs classifiers, such as the Support Vector Machine (SVM) and the decision tree, to evaluate the learned subset of features [16, 18]. However, traditional machine learning methods conduct feature learning and disorder diagnosis separately. As a result, the selected features might not yield the best classification performance, and the subset of features that yields the best classification result might not be the optimal one, so each of the two parts leaves the other sub-optimal [51].
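As an illustration of the linear route to FCN creation described above, the following minimal sketch builds one Pearson-correlation FCN from a BOLD time-series matrix. The function name `pearson_fcn`, the (time points × regions) layout and the synthetic data are assumptions made for this example and are not taken from the original pipeline.

```python
import numpy as np

def pearson_fcn(bold):
    """Build a functional connectivity network from BOLD signals.

    bold: array of shape (timepoints, regions), one column per brain region.
    Returns a (regions, regions) matrix of Pearson correlations.
    """
    fcn = np.corrcoef(bold, rowvar=False)  # columns are treated as variables
    np.fill_diagonal(fcn, 0.0)             # drop trivial self-connections
    return fcn

# Synthetic stand-in for one subject: 120 time points, 6 regions (illustrative only).
rng = np.random.default_rng(0)
bold = rng.standard_normal((120, 6))
fcn = pearson_fcn(bold)
```

A non-linear measure such as mutual information could be substituted for `np.corrcoef` without changing the rest of the pipeline.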

Recently, deep learning methods have been widely used for disorder diagnosis because they can handle high-level features [40]. In general, convolutional neural networks (CNNs) can only handle data with a grid structure, whereas non-grid data are the mainstream representation in the real world. Moreover, CNNs do not consider the correlations between samples (i.e., brain regions or subjects), although these are also beneficial to feature learning. Thus, graph neural networks (GNNs) have become an effective alternative, since the representative features are learned from both the original data and the neighbouring nodes [24, 31]. Additionally, GNNs can combine feature information and structural information in feature learning to effectively improve model performance [1, 25]. For rs-fMRI data classification, GNN methods can be divided into two groups: the individual graph model, which considers the unique information of each subject, and the population graph model, which exploits the common information across subjects [17]. However, neither of them jointly considers local information and global information. Hence, a joint framework that combines the individual graph model with the population graph model has become indispensable [50]. For example, Zhou et al. proposed an individual graph model to search for the important features in the functional connectivity network of each subject, and a population graph model to exploit the representative features through nodes (i.e., subjects) and edges (i.e., phenotypic information) [49]. Although the graph guides the convolution operations in the GNN model and plays an important role in feature extraction, two issues remain. First, the graph construction is independent of the classification task, and a fixed graph is applied to the entire network, so it does not capture the underlying structure of the node features in different layers very well, which reduces model performance. Second, such a framework uses local features from each subject to reconstruct global features across subjects, which is a sub-optimal way of using local features for global feature learning. Meanwhile, the initial global features across subjects also provide important information that is not considered.

To solve the aforementioned issues, we develop a hierarchical graph learning with convolutional network framework that considers both local regions of the brain and global information about subjects. Specifically, one graph model, called the individual graph, is built on individual brain function networks, and the other, called the population graph, is built on the entire population network. The former learns a node representation of each brain region via the GCN and uses a graph to obtain the graph-level features of each subject. The latter then updates the embedding of each node by aggregating the representations of its neighbours and itself, and is used to process correlations between subjects. The main contributions of this paper are summarized as follows.

We develop a unified framework that processes both local regions of the brain and the global information of subjects, and that effectively learns high-level embeddings of brain network representations at both node level and graph level.
The graph in this paper is dynamically updated in the GNN for each subject, enabling better graph representations to be obtained. Also, the optimal graph representation is obtained for the population graph model.
Extensive experiments have been conducted on four real-life clinical datasets, and the results indicate that learning network embeddings from correlations between population networks and individual brain networks can improve predictive performance.

The rest of the paper is organized as follows. Related work is reviewed in Section 2. The details of the proposed method are discussed in Section 3. The experiments and the ablation study are described in Sections 4 and 5, respectively. Lastly, a conclusion is given in Section 6.

2 Related work

2.1 Brain connectivity analysis

The three main types of analysis for brain connectivity are seed point-based analysis (namely SPBA), independent component analysis (namely ICA), and graph theory.

2.1.1 Seed point-based analysis (SPBA)

Seed point-based analysis is essentially a model-based approach where a seed point or region of interest (ROI) is selected and the linear correlation of that seed point region with all other voxels throughout the brain is found, resulting in a seed-based Functional Connectivity map [34]. The straightforwardness and interpretability of the technique make it an adequate method for studying rs-fMRI FC. However, as the technique is fully dependent on user-specified ROIs, it is not easy to detect functional connectivity throughout the brain using this method [35].

2.1.2 Independent component analysis (ICA)

The human brain consists of an extensive network of neurons that produce both low-frequency and high-frequency fluctuations [28]. rs-fMRI relies on spontaneous low-frequency fluctuations (less than 0.1 Hz) from anatomical regions of a network that are spatially separated from each other, yet functionally connected and in constant communication [37]. The rs-fMRI signals extracted from subjects are composite signals containing the signals of interest together with artefacts. ICA uses mathematical algorithms to decompose the signals from all brain voxels into temporally and spatially independent components, which helps to extract different rs-fMRI networks efficiently [13].

2.1.3 Graph theory

Graph theory in human neuroscience builds mathematical models of the function of complicated networks in the human brain [9]. The neural networks have associations between various regions and sub-regions of the brain, and the dynamic connections between the networks form a larger single neural network [12]. A graph theory method focuses on the relationship between nodes and their edges, which can be expressed as $\mathcal {G} = ({\textbf{V}}, {\textbf{E}})$, where ${\textbf{V}}$ is the set of nodes and ${\textbf{E}}$ is the set of edges that connect those nodes. In brain FC analysis, different graph-theoretic metrics characterise different aspects of connectivity [14]. These include the average path length, the clustering coefficient, the degree of a node, the centrality measures, and the level of modularity.
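Two of the metrics listed above, node degree and the local clustering coefficient, can be computed directly from a binary adjacency matrix. The sketch below is only an illustration under the assumption of an undirected, unweighted graph without self-loops; the function names and the toy graph are hypothetical.

```python
import numpy as np

def node_degree(adj):
    """Number of edges attached to each node."""
    return adj.sum(axis=1)

def clustering_coefficient(adj):
    """Local clustering coefficient of each node.

    For node i this is (triangles through i) / (pairs of neighbours of i).
    Nodes with fewer than two neighbours get a coefficient of 0.
    """
    deg = adj.sum(axis=1)
    # diag(A^3) counts closed walks of length 3, i.e. twice the triangles per node
    triangles = np.diag(adj @ adj @ adj) / 2.0
    possible = deg * (deg - 1) / 2.0
    coeff = np.zeros_like(possible, dtype=float)
    mask = possible > 0
    coeff[mask] = triangles[mask] / possible[mask]
    return coeff

# Toy graph: a triangle 0-1-2 with a pendant node 3 attached to node 2.
adj = np.array([[0, 1, 1, 0],
                [1, 0, 1, 0],
                [1, 1, 0, 1],
                [0, 0, 1, 0]], dtype=float)
degrees = node_degree(adj)
clustering = clustering_coefficient(adj)
```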

Table 1 summarises brain connectivity analysis with SPBA, ICA, and graph theory, together with recent and informative literature.

Table 1 The summary of brain connectivity analysis


Seed-based analysis finds a single correlation between the seed region and the entire brain, and independent component analysis searches for voxel-to-voxel interactions across several different networks in the brain [36]. In contrast, graph theory focuses on the topological properties of the seed points in the brain or in the neural network associated with a particular function. Segregation and integration are the two ways in which neural networks are represented, because the brain operates in this way. Functional integration views the brain as a large network of interactions in which different neural networks collaborate on specific functions, whereas segregation refers to the connections within the various networks of the brain. Therefore, graph theory is a useful technique for inspecting the integration and segregation of brain neural networks.

2.2 Graph neural networks

By combining graph propagation operations with deep learning algorithms, graph neural networks allow both structural and vertex attribute information to be involved in learning. They have shown good results and interpretability in applications such as vertex classification, graph classification and disease prediction, and have become a widely used method for graph analysis.

2.2.1 Individual graph model on GNN

Given a set of graphs $\mathcal {G}_i = (\textbf{X}_i, \textbf{A}_i)$, individual graph models usually input all $\mathcal {G}_i$ into the same GNN model in sequence [31]. For example, Jiang et al. [22] proposed a hierarchical graph convolutional network to learn feature embeddings of graphs while considering network topological information and associations between subjects. Zhou et al. [49] proposed a graph convolutional network that processes both subject-level information and region-level information in the brain. It then learns the individual features of each subject and classifies each subject with a classifier. However, two issues remain: (i) the same element (i.e., the correlation between two brain regions) is correlated across subjects that share the same label (e.g., patient or healthy control), which these models do not exploit; (ii) the features across subjects, which are also important to the final classification task, are not considered.

2.2.2 Population graph model on GNN

Given a set of BOLD signals, the population graph model calculates the FCNs of all subjects and extracts the upper triangular part of each FCN to obtain a feature matrix. Following this, the traditional methods apply feature extraction or feature selection models to search for the important features for the classification task, or design variant CNN models to diagnose disorders on brain region images [17, 39, 46]. Farouk et al. [10] argued that methods based on deep neural networks could produce more accurate feature representations than traditional methods using shallow learning. Zhang et al. proposed a residual CNN model to diagnose Alzheimer’s disease in an end-to-end manner by considering the global, the local and the spatial features [46]. However, the existing node classification methods conduct feature selection and classification separately, and therefore may not consider the correlations between the brain regions, which are important to model construction and brain region selection.
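The step of turning each subject's FCN into one row of a population feature matrix can be sketched as follows. The function name `fcn_to_features` and the array layout are assumptions for illustration; the diagonal is excluded here, which is one common choice and not necessarily the one used by the methods cited above.

```python
import numpy as np

def fcn_to_features(fcns):
    """Vectorise the strict upper triangle of each subject's FCN.

    fcns: array of shape (subjects, regions, regions).
    Returns an array of shape (subjects, regions * (regions - 1) / 2).
    """
    regions = fcns.shape[1]
    rows, cols = np.triu_indices(regions, k=1)  # k=1 skips the diagonal
    return fcns[:, rows, cols]

# Two toy "subjects" with 4 regions each, filled with running integers.
fcns = np.arange(2 * 4 * 4, dtype=float).reshape(2, 4, 4)
features = fcn_to_features(fcns)
```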

3 Proposed method

In this section, we first review the graph convolutional network and then introduce our proposed method in detail.

3.1 Graph convolutional network

The most important aspect of deep learning is feature learning, which can automatically discover potential high-level information from high-dimensional neuroimaging data. It aims to obtain hierarchical feature information in a hierarchical network, removing the need to design features manually [6]. Although CNNs are used in many tasks, they can only operate on regular Euclidean data (e.g., images and text) [11]. In reality, much data comes from non-Euclidean domains, such as social networks, traffic networks and chemical molecules, and needs to be analysed effectively. Such data can be better represented by graphs [31]. Therefore, we adopt the spectral approach, which defines graph convolution through a well-defined localized operator on graphs.

The spectral method treats the attributes of the nodes as signals on the graph and performs the convolution directly on the spectrum of the graph (i.e., the eigenvalues of the graph Laplacian). The convolution of a signal with a spectral filter $g_\theta = \mathrm{diag}\left( \theta \right)$ in the Fourier domain is as follows:

$$\begin{aligned} \begin{array}{l} g_\theta *{\textbf{x}} = {\textbf{U}}g_\theta \left( \varvec{\Lambda } \right) {{\textbf{U}}^T}{\textbf{x}}, \end{array} \end{aligned}$$

(1)

where ${\textbf{x}} \in {{\mathbb {R}}^N}$ is the signal on the graph, with one entry for each vertex, $\theta \in {{\mathbb {R}}^N}$ are the filter parameters, and $*$ denotes the graph convolution operation. ${\textbf{U}}$ and $\varvec{\Lambda }$ represent the eigenvectors and eigenvalues of the normalized graph Laplacian ${\textbf{L}} = {{\textbf{D}}^{ - 1/2}}\left( {{\textbf{D}} - {\textbf{A}}} \right) {{\textbf{D}}^{ - 1/2}}$, respectively, where ${\textbf{D}}$ is the diagonal degree matrix and ${\textbf{A}}$ is the adjacency matrix.
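Equation (1) can be checked numerically on a small graph. The sketch below is a minimal illustration, assuming a symmetric adjacency matrix and a filter given as a vector `theta` with one entry per eigenvalue; the function names and the toy path graph are hypothetical.

```python
import numpy as np

def normalized_laplacian(adj):
    """L = D^{-1/2} (D - A) D^{-1/2} for a symmetric adjacency matrix."""
    deg = adj.sum(axis=1)
    d_inv_sqrt = np.zeros_like(deg, dtype=float)
    nonzero = deg > 0
    d_inv_sqrt[nonzero] = deg[nonzero] ** -0.5  # isolated nodes keep a zero entry
    return d_inv_sqrt[:, None] * (np.diag(deg) - adj) * d_inv_sqrt[None, :]

def spectral_conv(adj, x, theta):
    """Apply the filter diag(theta) in the graph Fourier domain, as in Eq. (1)."""
    lap = normalized_laplacian(adj)
    _, eigvecs = np.linalg.eigh(lap)   # eigh: the Laplacian is symmetric
    x_hat = eigvecs.T @ x              # graph Fourier transform of the signal
    return eigvecs @ (theta * x_hat)   # filter each frequency, transform back

# Toy example: a path graph on 4 nodes with a scalar signal per node.
adj = np.array([[0, 1, 0, 0],
                [1, 0, 1, 0],
                [0, 1, 0, 1],
                [0, 0, 1, 0]], dtype=float)
x = np.array([1.0, 2.0, 3.0, 4.0])
identity_filtered = spectral_conv(adj, x, np.ones(4))  # all-ones filter leaves x unchanged
```

Setting `theta` to the eigenvalues themselves reproduces multiplication by the Laplacian, which is a convenient sanity check for the decomposition.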

However, the computational complexity of the eigendecomposition is too high for large-scale data. Thus, Defferrard et al. [6] proposed approximating the spectral filter with Chebyshev polynomials as follows:

$$\begin{aligned} \begin{array}{l} g_\theta *\textbf{x}=\sum \nolimits _{p=0}^{P}{\theta '_{p}{{T}_{p}}\left( \textbf{L} \right) \textbf{x}}, \end{array} \end{aligned}$$

(2)

where ${T_p}$ and ${\theta '_p}$ are the Chebyshev polynomials and their coefficients, respectively. Kipf et al. [25] further simplified the Chebyshev graph convolution as:

$$\begin{aligned} \begin{array}{l} g_\theta *{\textbf{x}} = \theta \left( {{\textbf{I}} + {{\textbf{D}}^{ - 1/2}}{\textbf{A}}{{\textbf{D}}^{ - 1/2}}} \right) {\textbf{x}} \end{array} \end{aligned}$$

(3)

This simplification restricts the expansion to the first-order Chebyshev polynomial and sets the maximum eigenvalue of ${\textbf{L}}$ to two, where ${\textbf{I}}$ denotes an identity matrix. Moreover, define ${\tilde{\textbf{A}}} = {\textbf{A}} + {\textbf{I}}$ and let ${\tilde{\textbf{D}}}$ be the diagonal degree matrix of $\tilde{\textbf{A}}$, whose diagonal elements are the column sums of $\tilde{\textbf{A}}$. For a signal with $C$ input channels and $F$ spectral filters, the convolution is given by:

$$\begin{aligned} \begin{array}{l} {{\textbf{H}}^{\left( {l + 1} \right) }} = \mathrm{{ReLU}}\left( {{{{\tilde{\textbf{D}}}}^{ - 1/2}}{\tilde{\textbf{A}}}{{{\tilde{\textbf{D}}}}^{ - 1/2}}{{\textbf{H}}^{\left( l \right) }}{\varvec{\Theta }}} \right) , \end{array} \end{aligned}$$

(4)

where ${\textbf{X}} \in {{\mathbb {R}}^{N \times C}}$ is the input signal, ${\varvec{\Theta }} \in {{\mathbb {R}}^{C \times F}}$ is a filter parameter matrix, ${\textbf{H}}$ is the feature matrix of each layer with ${{\textbf{H}}^{\left( 0 \right) }} = {\textbf{X}}$, and ${\text {ReLU}}\left( \cdot \right)$ is a non-linear activation function. The final output layer ${\textbf{Z}}$ is defined as follows:

$$\begin{aligned} \begin{array}{l} {\textbf{Z}} = {\text {softmax}}\left( \overset{\wedge }{\textbf{A}} \mathrm{{ReLU}}\left( \overset{\wedge }{\textbf{A}}{\textbf{X}}{{\varvec{\Theta }}^{\left( 0 \right) }} \right) {{\varvec{\Theta }}^{\left( 1 \right) }} \right) , \end{array} \end{aligned}$$

(5)

where ${\overset{\wedge }{\textbf{A}} = {{\tilde{\textbf{D}}}^{ - 1/2}}{\tilde{\textbf{A}}}{{\tilde{\textbf{D}}}^{ - 1/2}}}$. There are also many graph convolution network models based on spectral methods, for example, Defferrard et al. [6] proposed Chebyshev graph convolution by fitting a convolution kernel with Chebyshev polynomials. Bruna et al. [2] proposed a method to define the graph convolution based on the theory of eigenvalue decomposition of graph Laplacian matrices in the Fourier domain.
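The two-layer forward pass of Eq. (5) can be written in a few lines. This is a minimal NumPy sketch with randomly initialised weights, intended only to make the shapes and the renormalised adjacency concrete; the function names, the layer sizes and the toy graph are assumptions, and no training is shown.

```python
import numpy as np

def normalize_adjacency(adj):
    """Return D~^{-1/2} (A + I) D~^{-1/2}, the renormalised adjacency."""
    a_tilde = adj + np.eye(adj.shape[0])       # add self-loops
    d_inv_sqrt = a_tilde.sum(axis=1) ** -0.5   # degrees are >= 1 after self-loops
    return d_inv_sqrt[:, None] * a_tilde * d_inv_sqrt[None, :]

def relu(z):
    return np.maximum(z, 0.0)

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)       # subtract row max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def gcn_forward(adj, x, theta0, theta1):
    """Two-layer GCN: softmax(A_hat ReLU(A_hat X Theta0) Theta1)."""
    a_hat = normalize_adjacency(adj)
    hidden = relu(a_hat @ x @ theta0)
    return softmax(a_hat @ hidden @ theta1)

# Toy setup: 4 nodes, 3 input channels, 5 hidden units, 2 classes.
rng = np.random.default_rng(0)
adj = np.array([[0, 1, 0, 0],
                [1, 0, 1, 0],
                [0, 1, 0, 1],
                [0, 0, 1, 0]], dtype=float)
x = rng.standard_normal((4, 3))
theta0 = rng.standard_normal((3, 5))
theta1 = rng.standard_normal((5, 2))
z = gcn_forward(adj, x, theta0, theta1)
```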

The diagnosis of functional brain networks is a classic graph classification problem: it takes the brain network as input and predicts the corresponding label (i.e., clinical state). Therefore, a new graph neural network framework is proposed to process both local regions of the brain and global information of subjects. The framework consists of two parts. One is a graph for modelling individual brain function networks, called the individual graph. The other is a graph applied to the whole population network, called the population graph. Figure 1 presents the overall framework of the proposed method.

3.2 The individual graph model

In the individual graph model, multiple layers of the graph convolutional network are stacked. After the convolution layer, the average pooling operator generates a coarsened graph globally, which summarises sub-graph information while exploiting the sub-graph structure [22]. In addition, the pooling layer enables the graph convolutional network to reduce the total number of parameters by reducing the size of the representation, thus avoiding overfitting. The output layer resolves the node representation of each graph to a single graph representation.

For the i-th subject, define a graph ${G_i} = \left\{ {{{\textbf{X}}_i},\mathrm{{ }}{{\textbf{A}}_i}} \right\}$, where ${{\textbf{X}}_i} = \left\{ {d_i^1,...,d_i^n} \right\}$ collects the node features, n is the number of ROIs, and ${{\textbf{A}}_i} \in {{\mathbb {R}}^{n \times n}}$ is the adjacency matrix that represents the network connectivity of the i-th subject. Each graph is constructed with the K-nearest-neighbour method. Each node embedding in ${{\textbf{X}}_i}$ is learned in the GCN training phase. Thus, in the proposed individual graph model, each brain region takes into account information from neighbouring brain regions of the same subject [49].
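A K-nearest-neighbour graph over brain regions can be built as sketched below. The text does not specify the distance measure or whether the graph is symmetrised, so Euclidean distance and symmetrisation by taking the union of neighbour relations are assumptions of this example, as are the function name and the toy features.

```python
import numpy as np

def knn_adjacency(features, k):
    """Symmetric binary kNN graph over nodes.

    features: array of shape (nodes, dims), one row per brain region.
    k: number of nearest neighbours kept for each node.
    Distance is Euclidean (an assumption of this sketch).
    """
    diff = features[:, None, :] - features[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    np.fill_diagonal(dist, np.inf)                  # a node is not its own neighbour
    nearest = np.argsort(dist, axis=1)[:, :k]       # indices of the k closest nodes
    adj = np.zeros(dist.shape)
    rows = np.repeat(np.arange(features.shape[0]), k)
    adj[rows, nearest.ravel()] = 1.0
    return np.maximum(adj, adj.T)                   # keep an edge if either side chose it

# Toy features: two well-separated pairs of nodes on a line.
features = np.array([[0.0], [1.0], [10.0], [11.0]])
adj_knn = knn_adjacency(features, k=1)
```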

We use a two-layer GCN and the individual graph model is defined as:

$$\begin{aligned} \begin{array}{l} {{\textbf{H}}_n} = \sigma \left( {{{\textbf{A}}_n}\sigma \left( {{{\textbf{T}}_n}({{\textbf{A}}_n}{{\textbf{X}}_n}{{\textbf{W}}_1})} \right) {{\textbf{W}}_2}} \right) , \\ \textbf{A}_n(i,j) = \frac{exp(ReLU(\textbf{p}^T [\textbf{X}_n(i)\textbf{W}, \textbf{X}_n(j)\textbf{W}]))}{\sum \nolimits _{k \in \mathcal {N}(i)} exp(ReLU(\textbf{p}^T [\textbf{X}_n(i) \textbf{W}, \textbf{X}_n(k) \textbf{W}]))}, \end{array} \end{aligned}$$

(6)

where ${\textbf{W}_1}$ and ${\textbf{W}_2}$ are the weight matrices of the two convolutional layers, $\sigma$ denotes the activation function, and ${\textbf{H}}_n$ is the learned feature matrix. In Eq. (6), the subscript n indexes the subject, while i, j and k index brain regions (nodes). ${\textbf{T}}_n$ dynamically selects useful brain regions for each subject. $\textbf{A}_n(i,j)$ denotes an element of $\textbf{A}_n$, $\textbf{p}$ is a learned weight vector, and $\mathcal {N}(i)$ denotes the set of nearest neighbours of node i.
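The second line of Eq. (6) is a softmax over each node's neighbourhood. The sketch below shows one way to evaluate it, under two assumptions that the text does not state explicitly: the bracket $[\cdot ,\cdot ]$ is read as concatenation, and $\textbf{W}$ is a shared projection matrix. The function name, shapes and random values are hypothetical, and every node is assumed to have at least one neighbour.

```python
import numpy as np

def attention_adjacency(x, w, p, neighbours):
    """Neighbourhood-normalised attention weights in the spirit of Eq. (6).

    x: (nodes, dims) node features of one subject.
    w: (dims, hidden) projection matrix (assumed shared across nodes).
    p: (2 * hidden,) weight vector applied to the concatenation [x_i W, x_j W].
    neighbours: boolean (nodes, nodes) mask, True where j is in N(i);
                each row must contain at least one True entry.
    """
    h = x @ w
    hidden = h.shape[1]
    # p^T [h_i, h_j] splits into a term for node i plus a term for node j
    scores = (h @ p[:hidden])[:, None] + (h @ p[hidden:])[None, :]
    scores = np.maximum(scores, 0.0)                     # ReLU
    scores = np.where(neighbours, scores, -np.inf)       # restrict to N(i)
    scores = scores - scores.max(axis=1, keepdims=True)  # stabilise the exponentials
    weights = np.exp(scores)                             # exp(-inf) gives exactly 0
    return weights / weights.sum(axis=1, keepdims=True)

# Toy setup: 4 nodes on a ring, 3 input features, 2 hidden units.
rng = np.random.default_rng(0)
neighbours = np.array([[0, 1, 0, 1],
                       [1, 0, 1, 0],
                       [0, 1, 0, 1],
                       [1, 0, 1, 0]], dtype=bool)
x_nodes = rng.standard_normal((4, 3))
w_proj = rng.standard_normal((3, 2))
p_vec = rng.standard_normal(4)
a_learned = attention_adjacency(x_nodes, w_proj, p_vec, neighbours)
```

Because the weights depend on the learned $\textbf{W}$ and $\textbf{p}$, the resulting graph changes during training, which is what makes the graph representation dynamic.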

3.3 The population graph model

The purpose of the population graph model is to search for representative features in both the graph matrix and the feature matrix. Once we obtain the set of feature matrices {$\textbf{H}_1$,...,$\textbf{H}_n$,...,$\textbf{H}_N$} and graph representations {$\textbf{A}_1$,...,$\textbf{A}_n$,...,$\textbf{A}_N$} from the individual graph model, we exploit the informative features in them. Hence, we have

$$\begin{aligned} \begin{array}{l} \textbf{H} = MLP(\textbf{H}_1,...,\textbf{H}_N),\\ \textbf{S} = Mean(\textbf{A}_1,...,\textbf{A}_N), \end{array} \end{aligned}$$

(7)

where the fused feature matrix $\textbf{H}$ and the fused graph matrix $\textbf{S}$ consider, respectively, the neighbouring information across subjects and the correlation information across brain regions of different subjects.

Then, we employ a two-layer GCN to search important features, and we obtain

$$\begin{aligned} \begin{array}{l} {\textbf{F}} = \sigma \left( {{\textbf{S}}\sigma \left( {{{\textbf {SH}}}{{\varvec{\Theta }}_1}} \right) {{\varvec{\Theta }}_2}} \right) , \end{array} \end{aligned}$$

(8)

where ${\varvec{\Theta }}_1$ and ${\varvec{\Theta }}_2$ are the weight matrices, and $\textbf{F}$ contains the features learned across brain regions and across subjects at the same time.

3.4 The unified model

After obtaining the features ${\textbf{F}}$, the final diagnostic features can be obtained by

$$\begin{aligned} \begin{array}{l} {{\textbf{F}}^*} = \sigma \left( {\mathrm{{MLP}}\left\{ {{\textbf{F}}} \right\} } \right) , \end{array} \end{aligned}$$

(9)

where $\sigma (\cdot )$ denotes the activation function which is Softmax for multi-class classification or Sigmoid for binary classification.

Moreover, the cross-entropy loss is employed for the final classification task as follows:

$$\begin{aligned} {\text {Loss}} = - \sum \limits _{i \in {Y}} {\sum \limits _{j = 1}^c {{\textbf{Y}}_{ij}\ln {\textbf{F}}^*_{ij}} }, \end{aligned}$$

(10)

where Y is the set of labelled nodes, c is the number of classes, and ${\textbf{Y}}$ represents the ground-truth label matrix.
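The loss in Eq. (10) sums only over the labelled nodes. A minimal sketch, assuming one-hot labels and predicted class probabilities stored row-wise, is given below; the function name, the small constant `eps` added for numerical safety and the toy values are assumptions of this example.

```python
import numpy as np

def masked_cross_entropy(probs, labels_onehot, labelled_idx, eps=1e-12):
    """Cross-entropy over labelled nodes only, as in Eq. (10).

    probs: (nodes, classes) predicted class probabilities (the matrix F*).
    labels_onehot: (nodes, classes) one-hot ground-truth labels.
    labelled_idx: indices of the nodes whose labels are available.
    eps: small constant guarding against log(0); not part of Eq. (10).
    """
    p = probs[labelled_idx]
    y = labels_onehot[labelled_idx]
    return -np.sum(y * np.log(p + eps))

# Three nodes, two classes; only nodes 0 and 2 are labelled.
probs = np.array([[0.5, 0.5],
                  [0.9, 0.1],
                  [0.2, 0.8]])
labels_onehot = np.array([[1.0, 0.0],
                          [1.0, 0.0],
                          [0.0, 1.0]])
loss = masked_cross_entropy(probs, labels_onehot, np.array([0, 2]))
```

Here the unlabelled node 1 contributes nothing, so the loss equals $-(\ln 0.5 + \ln 0.8)$ up to the effect of `eps`.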

For the individual graph model and the population graph model, the computational complexity is O(n)+O(2n) and O(N), respectively. The total computational complexity is therefore T*(N*O(3n)+O(N)), where T is the number of epochs, N is the number of subjects, and n is the number of brain regions.

4 Experiments

4.1 Dataset

To assess the validity of the proposed model, we conducted experiments on four datasets, namely FrontoTemporal Dementia (FTD), Obsessive-Compulsive Disorder (OCD), the Alzheimer’s Disease Neuroimaging Initiative (ADNI), and Autism Brain Imaging Data Exchange (ABIDE).

The FTD dataset contains 95 FTD subjects and 86 age-matched healthy control subjects from the source (see Footnote 1).
The OCD dataset has 20 healthy control subjects and 62 OCD subjects from the hospital [7].
The ADNI dataset includes 59 Alzheimer’s disease subjects and 48 healthy control subjects (see Footnote 2).
The ABIDE dataset contains 1029 subjects with functional magnetic resonance imaging data from the ABIDE-I and ABIDE-II datasets, including 485 ASD patients and 544 healthy control subjects (see Footnote 3).

For a clear demonstration, the datasets are summarised in Table 2. For details of the data acquisition process, please refer to [29] and [33].

Table 2 The summary of all datasets


4.2 Setting

We ran all experiments on a server with 8 NVIDIA GeForce 3090 GPUs and implemented them in PyTorch. We obtained the author-provided code for all comparison methods and followed the parameter settings recommended in the associated literature, so that each comparison method performs as well as possible on each dataset. In addition, all methods, including the comparison methods and our proposed method, use the same initial graph, training/test partitioning, network dimensions and training procedures.

In all experiments, the training/testing data are split by 5-fold cross-validation and the experiments are repeated 5 times with random seeds. The average results with the corresponding standard deviation (std) are reported for all methods. We randomly selected 30% of the entire dataset as the labelled samples in the training set. For training with the Adam optimizer, the maximum number of epochs is set to 500, and the initial learning rate and weight decay are set to 0.01 and 0.0005, respectively [19]. Four metrics were used to evaluate the diagnostic results of all methods, namely accuracy (ACC), specificity (SPE), sensitivity (SEN) and area under the receiver operating characteristic curve (AUC).
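For reference, three of the four metrics follow directly from the confusion matrix of a binary classifier. The sketch below assumes labels coded as 1 for patients and 0 for healthy controls, and that both classes are present; the function name and the toy predictions are hypothetical, and AUC is omitted because it needs continuous scores rather than hard predictions.

```python
import numpy as np

def binary_metrics(y_true, y_pred):
    """Accuracy, sensitivity and specificity for binary labels (1 = patient)."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    tp = np.sum((y_true == 1) & (y_pred == 1))  # patients correctly detected
    tn = np.sum((y_true == 0) & (y_pred == 0))  # controls correctly rejected
    fp = np.sum((y_true == 0) & (y_pred == 1))
    fn = np.sum((y_true == 1) & (y_pred == 0))
    return {
        "ACC": (tp + tn) / (tp + tn + fp + fn),
        "SEN": tp / (tp + fn),  # true positive rate
        "SPE": tn / (tn + fp),  # true negative rate
    }

# Toy fold: three patients and two controls.
metrics = binary_metrics([1, 1, 1, 0, 0], [1, 1, 0, 0, 1])
```

In a 5-fold protocol these values would be computed on each held-out fold and then averaged, with the standard deviation reported alongside.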

4.3 Comparison methods

We employed seven comparison methods, which are detailed as follows:

High-order Functional Connectivity (HFC) uses higher-level dynamic interactions between brain regions for the diagnosis of early mild cognitive impairment [47].
Strength and Similarity Group Sparse Representation (SSGSR) integrates low-level and high-level functional connectivity to accurately guide the modelling of the brain network function [48].
Graph Convolutional Networks (GCN) generates a new node representation through aggregated node information by utilizing the edge information of connected nodes [25].
Deep Iterative and Adaptive Learning (DIAL) is an end-to-end graph learning framework for learning graph structure and graph embedding simultaneously. It iteratively updates the new graph [3].
Simplifying Graph Convolutional Networks (SGC) successively removes the non-linearities between GCN layers and collapses the resulting function into a single linear transformation to reduce the complexity of the network [42].
Interpretable Brain Graph Neural Network (BrainGNN) is a graph neural network framework to analyse the functional magnetic resonance images and discover the neurological biomarkers [30].
Hierarchical Graph Convolution Network (HiGCN) designs an intra-subject GCN to explore the informative feature vector and an inter-subject GCN to obtain the important feature for disease diagnosis [20].

The comparison methods fall into four categories. The shallow learning-based methods (i.e., HFC and SSGSR) use shallow features for disease diagnosis. The single population graph models (i.e., GCN, DIAL, and SGC) use the feature matrix and the graph representation across subjects to learn important features for the classification task. The single individual graph approach (i.e., BrainGNN) considers the local information of each subject and combines all local features for the final diagnosis task. The unified model (i.e., HiGCN) first applies an individual graph module to obtain local information, and then uses the combined features and a fixed graph representation to obtain important features for the classification task.

Table 3 All evaluation metrics (%) of different methods on the dataset FTD


Table 4 All evaluation metrics (%) of different methods on the dataset OCD


4.4 Experimental results

Tables 3, 4, 5 and 6 show the disease diagnostic performance of the different methods on four real neurological disease datasets. Across all tables, the proposed method outperforms the comparison methods, followed by BrainGNN, HiGCN, DIAL, SSGSR, GCN, SGC, and HFC in terms of the four evaluation metrics. For example, averaged over all evaluation metrics, the proposed method improved by 1.22%, 1.89%, 1.47%, and 0.89% over the best comparison method (i.e., BrainGNN) on the datasets FTD, OCD, ADNI, and ABIDE, respectively. Likewise, the proposed method on average improved by 1.77%, 1.86%, 2.05%, and 0.98% over the best shallow learning method (i.e., SSGSR) with regard to ACC, SEN, SPE, and AUC, respectively, on the four datasets. The main reason is that the proposed method searches both the local information of each subject and the global information across subjects, and learns the common graph representation across subjects at the same time. Moreover, the proposed method exploits a deep learning-based feature representation, which performs better than shallow learning-based features and yields better classification results.

Table 5 All evaluation metrics (%) of different methods on the dataset ADNI


Table 6 All evaluation metrics (%) of different methods on the dataset ABIDE


Table 7 Ablation analysis of our method on four datasets (FTD, OCD, ADNI, ABIDE)


5 Ablation study

The ablation study was conducted to demonstrate the necessity and the effectiveness of the components of the proposed method. It covers four aspects, i.e., the population graph model, the dynamic graph representation, the individual graph model, and the brain regions selected by learnable weights.

5.1 Effectiveness of the population graph model

We first remove the population graph model (namely w/o PopulationGM) from our framework to demonstrate its effectiveness. In Table 7, compared to our method, w/o PopulationGM performs worse on all datasets at different label ratios. The reason could be that the population graph model is essential for learning individual features for each subject and improving the performance of personalized diagnosis.

5.2 Effectiveness of the dynamic graph representation

We remove the dynamic graph representation (namely w/o DynamicGM) from our framework to demonstrate its effectiveness. As shown in Table 7, compared to w/o DynamicGM, our method improved by 12.14%, 16.68%, 15.55%, and 4.68% across all label rates on the datasets FTD, OCD, ADNI, and ABIDE, respectively. The reason could be that the updated graph representation of each subject lets the population graph model exploit the important brain regions, which is essential for accessing the graph structure dynamically.

5.3 Effectiveness of the individual graph model

Table 7 shows the classification performance of our proposed method and its ablated variants at different label rates (i.e., 10%, 20% and 30%). Comparing against the variant without the individual graph model (denoted w/o IndividualGM), we can see that the proposed method is superior, improving the results by 4.89% across all label rates on all datasets. A possible reason is that the individual graph model can extract relationships among subjects and successfully capture important information for brain disease diagnosis.

5.4 Effectiveness of selected specific brain regions

To further verify the importance of selecting specific brain regions $\mathbf{T}$ (SelectSBA), we conducted experiments on all datasets in which this component was removed from our framework (denoted w/o SelectSBA). As shown in Table 7, the results of our proposed method are better than those of w/o SelectSBA on the disease classification task, which indicates that selecting specific brain regions $\mathbf{T}$ plays a vital role in improving classification effectiveness.

The top brain regions selected by the proposed method on all datasets are visualized in Fig. 2. The proposed method selected 12, 13, 14, and 25 brain regions on the datasets FTD, OCD, ADNI, and ABIDE, respectively.
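As a rough illustration of region selection by a learned weight, the sketch below ranks brain regions by a per-region score and keeps the top ones. It is only an assumption-laden example: the paper's actual selection mechanism for $\mathbf{T}$ is not reproduced here, and the ranking by absolute value, the function names `select_top_regions` and `keep_regions`, and the toy weights are hypothetical.

```python
def select_top_regions(weights, k):
    """Return the indices of the k regions with the largest |weight|.

    weights : list of per-region scores (assumed to be learned elsewhere)
    k       : number of regions to keep (e.g. 12 for a dataset such as FTD)
    """
    ranked = sorted(range(len(weights)),
                    key=lambda i: abs(weights[i]),
                    reverse=True)
    return sorted(ranked[:k])  # report indices in anatomical (index) order


def keep_regions(region_features, selected):
    """Restrict a subject's per-region feature rows to the selected regions."""
    return [region_features[i] for i in selected]


# Toy example with four regions and made-up weights (not values from the paper).
weights = [0.1, -0.9, 0.5, 0.3]
selected = select_top_regions(weights, k=2)
reduced = keep_regions([[1.0], [2.0], [3.0], [4.0]], selected)
```

A hard top-k cut such as this one is not differentiable, so a trainable model would typically rely on a relaxation during optimisation and apply the hard selection only when reporting the chosen regions.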

6 Conclusion

In this article, we have proposed a new hierarchical graph learning with convolutional network framework for brain disease diagnosis, which processes local brain-region information and subject-level information jointly. The framework consists of two parts: the individual graph model, which learns the node representation of each brain region through a GCN and uses a graph to capture the graph-level features of each subject, and the population graph model, which further updates the graph data by embedding each node through the aggregation of the representations of its neighbours and itself. Experimental results show that our proposed method surpasses the comparison algorithms on four real datasets. In future work, we will improve our model in three directions. First, regarding the data, the characteristics of rs-fMRI may not yet be considered comprehensively, so the time series information and the correlations between time series will be incorporated into the future model. Second, regarding the model structure, each subject has private information and common information related to different diseases, so we will optimize the model structure to search for the private and common information across subjects and across similar diseases. Third, we will try different kinds of graph representations and features to improve the accuracy of brain disease prediction, while also focusing on interpretability at both the brain-region level and the subject level.

Data availability statement

We have provided hyperlinks for the FTD, ADNI, and ABIDE datasets in Section 4.1, and the source of the OCD dataset is given in ref. [7]. In addition, details of the data acquisition process for all datasets are given in refs. [29] and [33].

References

Atwood J, Towsley D (2016) Diffusion-convolutional neural networks. In: NIPS, pp 1993–2001
Bruna J, Zaremba W, Szlam A, LeCun Y (2013) Spectral networks and locally connected networks on graphs. arXiv:1312.6203
Chen Y, Wu L, Zaki MJ (2019) Deep iterative and adaptive learning for graph neural networks. arXiv:1912.07832
Chou C-C, Zhang Y, Umoh ME, Vaughan SW, Lorenzini I, Liu F, Sayegh M, Donlin-Asp PG, Chen YH, Duong DM et al (2018) Tdp-43 pathology disrupts nuclear pore complexes and nucleocytoplasmic transport in als/ftd. Nat Neurosci 21(2):228–239
Cummings J, Lee G, Ritter A, Sabbagh M, Zhong K (2019) Alzheimer’s disease drug development pipeline: 2019. Alzheimer’s & Dementia: Transl Res Clin Interv 5:272–293
Defferrard M, Bresson X, Vandergheynst P (2016) Convolutional neural networks on graphs with fast localized spectral filtering. In: NIPS, pp 3844–3852
Chenjie D, Qiong Y, Jingjing L, Seger CA, Han H, Ning Y, Chen Q, Peng ZW (2020) Impairment in the goal-directed corticostriatal learning system as a biomarker for obsessive-compulsive disorder. Psychol Med 50(9):1490–1500
Du Y, Fu Z, Jing S, Shuang G, Ying X, Dongdong L, Mustafa S, Abrol A, Rahaman MA, Jiayu C et al (2020) Neuromark: an automated and adaptive ica based pipeline to identify reproducible fmri markers of brain disorders. NeuroImage: Clin 28:102375
Farahani FV, Karwowski W, Lighthall NR (2019) Application of graph theory for identifying connectivity patterns in human brain networks: a systematic review. Front Neurosci 13:585
Farooq A, Anwar SM, Awais M, Rehman S (2017) A deep cnn based multi-class classification of alzheimer’s disease using mri. In: IST, pp 1–6
Fesseha A, Xiong S, Emiru ED, Diallo M, Abdelghani D (2021) Text classification based on convolutional neural networks and word embedding for low-resource languages: Tigrinya. Information 12(2):52
Gao W, Wu H, Siddiqui MK, Baig AQ (2018) Study of biological networks using graph theory. Saudi J Biol Sci 25(6):1212–1219
Garcia-Bracamonte JE, Ramirez-Cortes JM, Rangel-Magdaleno JJ, Gomez-Gil P, Peregrina-Barreto H, Alarcon-Aquino V (2019) An approach on mcsa-based fault detection using independent component analysis and neural networks. IEEE Trans Instrum Meas 68(5):1353–1361
Hallquist MN, Hillary FG (2018) Graph theory approaches to functional network organization in brain disorders: a critique for a brave new small-world. Netw Neurosci 3(1):1–26
Hejazi S, Karwowski W, Farahani FV, Marek T, Hancock PA (2023) Graph-based analysis of brain connectivity in multiple sclerosis using functional mri: a systematic review. Brain Sci 13(2):246–268
Hu R, Jiangzhang G, Xiaofeng Z, Tong L, Xiaoshuang S (2022) Multi-task multi-modality svm for early covid-19 diagnosis using chest ct data. Inf Process Manag 59(1):102782
Hu R, Ziwen P, Xiaofeng Z, Jiangzhang G, Yonghua Z, Junbo M, Wu G (2021) Multi-band brain network analysis for functional neuroimaging biomarker identification. IEEE Trans Med Imaging 40(12):3843–3855
Hu R, Xiaofeng Z, Yonghua Z, Jiangzhang G (2020) Robust svm with adaptive graph learning. World Wide Web 23(3):1945–1968
Jang E, Gu S, Poole B (2016) Categorical reparameterization with gumbel-softmax. arXiv:1611.01144
Ji J, Xing X, Yao Y, Li J, Zhang X (2021) Convolutional kernels with an element-wise weighting mechanism for identifying abnormal brain connectivity patterns. Pattern Recogn 109:107570
Ji L, Hendrix CL, Thomason ME (2022) Empirical evaluation of human fetal fmri preprocessing steps. Netw Neurosci 6(3):702–721
Jiang H, Cao P, MingYi X, Yang J, Zaiane O (2020) Hi-gcn: a hierarchical graph convolution network for graph embedding learning of brain network and brain disorders prediction. Comput Biol Med 127:104096
Jiang Y, Wang P, Wen J, Wang J, Li H, Biswal BB (2022) Hippocampus-based static functional connectivity mapping within white matter in mild cognitive impairment. Brain Struct Funct 227(7):2285–2297
Kim P (2017) Convolutional neural network. In: MATLAB deep learning, pp 121–147
Kipf T, Welling M (2017) Semi-supervised classification with graph convolutional networks. In: ICLR
Kornfeld S, Yuan R, Biswal BB, Grunt S, Kamal S, Rodríguez JAD, Regényi M, Wiest R, Weisstanner C, Kiefer C et al (2018) Resting-state connectivity and executive functions after pediatric arterial ischemic stroke. NeuroImage: Clin 17:359–367
Ktena SI, Parisot S, Ferrante E, Rajchl M, Lee M, Glocker B, Rueckert D (2018) Metric learning with spectral graph convolutions on brain connectivity networks. NeuroImage 169:431–442
Lang S, Duncan N, Northoff G (2014) Resting-state functional magnetic resonance imaging: review of neurosurgical applications. Neurosurgery 74(5):453–465
Li H, Shi X, Zhu X, Wang S, Zhang Z (2022) Fsnet: dual interpretable graph convolutional network for alzheimer’s disease analysis. IEEE Trans Emerg Top Comput Intell 7(1):1–11
Li X, Zhou Y, Dvornek N, Zhang M, Gao S, Zhuang J, Scheinost D, Staib LH, Ventola P, Duncan JS (2021) Braingnn: interpretable brain graph neural network for fmri analysis. Medical Image Analysis 74:102233
Li Y, Hao ZB, Lei H (2016) Survey of convolutional neural network. Journal of Computer Applications 36(9):2508–2515
Liu J, Pan Y, Fang-Xiang W, Wang J (2020) Enhancing the feature representation of multi-modal mri data by combining multi-view information for mci classification. Neurocomputing 400:322–332
Peng L, Wang N, Dvornek N, Zhu X, Li X (2022) Fedni: federated graph learning with network inpainting for population-based disease prediction. IEEE Trans Med Imaging:1–12
Pijpker PAJ, Oosterhuis TS, Witjes MJH, Faber C, van Ooijen PMA, Kosinka J, Kuijlen JMA, Groen RJM, Kraeima J (2021) A semi-automatic seed point-based method for separation of individual vertebrae in 3d surface meshes: a proof of principle study. Int J Comput Assist Radiol Surg:1–11
Qi CR, Litany O, He K, Guibas LJ (2019) Deep hough voting for 3d object detection in point clouds. In: ICCV, pp 9277–9286
Seewoo BJ, Joos AC, Feindel KW (2021) An analytical workflow for seed-based correlation and independent component analysis in interventional resting-state fmri studies. Neurosci Res 165:26–37
Sompairac N, Nazarov PV, Czerwinska U, Cantini L, Biton A, Molkenov A, Zhumadilov Z, Barillot E, Radvanyi F, Gorban A et al (2019) Independent component analysis for unraveling the complexity of cancer omics datasets. International Journal of Molecular Sciences 20(18):4414
Tjerkaski J, Thompson WH, Bellander B-M, Thelin EP, Fransson P (2022) Meso-scale network analysis of resting state-fmri brain network connectivity performs poorly as a prognostic tool in critically ill traumatic brain injury patients. Neuroimage: Rep 2(1):100079
Wang L, Li K, Hu XP (2021) Graph convolutional network for fmri analysis based on connectivity neighborhood. Netw Neurosci 5(1):83–95
Wang X, Li J, Wang M, Yuan Y, Zhu L, Shen Y, Zhang H, Zhang K (2018) Alterations of the amplitude of low-frequency fluctuations in anxiety in parkinson’s disease. Neurosci Lett 668:19–23
Wei P, Bao R, Fan Y (2022) Comparing the reliability of different ica algorithms for fmri analysis. PLoS ONE 17(6):e0270556
Wu F, Souza A, Zhang T, Fifty C, Yu T, Weinberger K (2019) Simplifying graph convolutional networks. In: ICML, pp 6861–6871
Wu L, Xuan W, Qian L, Lijun C, Sheng T, Wu W (2023) A study on alterations in functional activity in migraineurs during the interictal period. Heliyon 9(1):e12372
Xiong Z, Xiong Y, Liu H, Li C, Li X (2020) Identification of purity and prognosis-related gene signature by network analysis and survival analysis in brain lower grade glioma. J Cell Mol Med 24(19):11607–11612
Zamani J, Sadr A, Javadi A-H (2022) Classification of early-mci patients from healthy controls using evolutionary optimization of graph measures of resting-state fmri, for the alzheimer’s disease neuroimaging initiative. PLoS ONE 17(6):e0267608
Zhang X, Han L, Zhu W, Sun L, Zhang D (2021) An explainable 3d residual self-attention deep neural network for joint atrophy localization and alzheimer’s disease diagnosis using structural mri. IEEE J Biomed Health Inform 4(8):1–10
Zhang Y, Zhang H, Chen X, Lee S-W, Shen D (2017) Hybrid high-order functional connectivity networks using resting-state functional mri for mild cognitive impairment diagnosis. Sci Rep 7(1):1–15
Zhang Y, Zhang H, Chen X, Liu M, Zhu X, Lee S-W, Shen D (2019) Strength and similarity guided group-level brain functional network construction for mci diagnosis. Pattern Recogn 88:421–430
Zhou H, Zhang D (2021) Graph-in-graph convolutional networks for brain disease diagnosis. In: ICIP, pp 111–115
Zhou SK, Greenspan H, Davatzikos C, Duncan JS, Ginneken BV, Madabhushi A, Prince JL, Rueckert D, Summers RM (2021) A review of deep learning in medical imaging: Imaging traits, technology trends, case studies with progress highlights, and future promises. Proceedings of the IEEE
Zhu X, Zhu Y, Zheng W (2020) Spectral rotation for deep one-step clustering. Pattern Recogn 105:107175

Funding

Open Access funding enabled and organized by CAUL and its Member Institutions

Author information

Authors and Affiliations

Massey University Albany Campus, 0745, Auckland, New Zealand
Tong Liu & Rongyao Hu
University of Electronic Science and Technology of China, 611731, Chengdu, China
Fangqi Liu & Yingying Wan
Jinan Laboratory of Applied Nuclear Science, The Institute of High Energy Physics of the Chinese Academy of Sciences, 100039, Beijing, China
Yongxin Zhu
Computer School, Beijing Information Science and Technology University, 100101, Beijing, China
Li Li

Corresponding authors

Correspondence to Rongyao Hu or Li Li.

Ethics declarations

Conflict of Interests

The authors declare that neither associations nor perceived conflicts of interest nor competing financial interests exist in this paper.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This work is partially supported by the Massey University College of Sciences’ REaDI funds.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Liu, T., Liu, F., Wan, Y. et al. Hierarchical graph learning with convolutional network for brain disease prediction. Multimed Tools Appl 83, 46161–46179 (2024). https://doi.org/10.1007/s11042-023-17187-8

Received: 01 June 2022
Revised: 13 September 2023
Accepted: 19 September 2023
Published: 23 October 2023
Issue Date: May 2024
DOI: https://doi.org/10.1007/s11042-023-17187-8
