1 Introduction

Studies of resting-state functional magnetic resonance imaging (rs-fMRI) have shown that the brain functional connection network (BFCN) exhibits disease-related changes [1]. A number of BFCN strategies have been proposed for the diagnosis and analysis of mental illness. Most BFCN-based disease diagnosis strategies first construct a brain network based on the correlations between pairwise brain areas, and then apply feature selection and classification algorithms.

Currently, most BFCN construction methods only consider the association between pairs of brain areas or voxels, i.e., first-order BFCN construction methods, such as the method based on the Pearson correlation (PC) coefficient [2]. The first-order BFCN is robust but not sensitive to small signal changes. Other BFCN construction methods consider the association among multiple brain areas, namely high-order BFCN construction methods. For example, Guo et al. [3] argue that the correlation between a pair of brain areas may be affected by a third brain area, and propose a method to eliminate this effect through partial correlation. High-order BFCN construction methods can capture subtle changes between brain networks, but they lack robustness. Zhu et al. [4] proposed a hybrid first-order/second-order BFCN construction method that performs a simple weighted combination of first-order and second-order brain networks. However, before the classification step, no machine learning technique such as feature selection is applied to process the brain network data.

In the field of machine learning, researchers have proposed various methods for processing multi-modality data. These methods may provide a theoretical basis for combining first-order and second-order BFCN information. In fact, although there appears to be no multi-modality work that integrates first-order and second-order BFCNs, several multi-modality methods have been proposed and applied to the diagnosis of brain diseases. For example, Huang et al. [5] proposed a sparse composite linear discrimination analysis model to jointly identify disease-related brain features from multi-modality data. Zhang et al. [6] proposed a multi-modality multi-task method for joint feature selection and classification of Alzheimer's disease data. However, these methods ignore the structural information of multi-modality data and do not consider the problem of data noise.

In view of the above problems, in this paper we propose a multi-modality low-rank representation learning framework to fuse first-order and second-order BFCN information, and apply it to the diagnosis of schizophrenia. Our contributions mainly include the following two points:

  1. We extract intrinsic structural information through a low-rank constraint, embed the correlation of multi-modality data into the learning model, and encourage cooperation between the first-order and second-order BFCNs by incorporating an ideal representation term, thereby obtaining better diagnostic performance.

  2. To the best of our knowledge, this is the first work to combine first-order and second-order BFCN information through a multi-modality learning strategy.

2 Background

2.1 First-Order and Second-Order Brain Functional Connection Network

At present, BFCN analysis has been widely used in the diagnosis of mental diseases, and constructing the brain functional network is the core step of BFCN analysis. BFCN construction methods can be divided into low-order and high-order methods: low-order methods are highly robust, while high-order methods are usually more sensitive to subtle changes in signals.

The most common low-order method is the BFCN construction method based on the PC coefficient, which reveals the first-order relationships between brain areas by calculating the correlation coefficient of each pair of brain areas. Let \( x_{i} \) and \( x_{j} \) represent the time series of a pair of brain areas; the PC coefficient can be calculated by the following formula:

$$ C_{ij}^{1} = \frac{{Cov\left( {x_{i} , x_{j} } \right)}}{{\sqrt {Var\left( {x_{i} } \right)Var\left( {x_{j} } \right)} }} $$
(1)

where \( Cov\left( { \cdot , \cdot } \right) \) is a function for calculating the covariance, and \( Var\left( \cdot \right) \) is a function for calculating the variance.
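As a concrete illustration, the following Python sketch computes the PC-based first-order BFCN of Eq. (1) from an ROI-by-time matrix of BOLD signals (the function and variable names are ours, not from the original work):

```python
import numpy as np

def first_order_bfcn(ts):
    """PC-based first-order BFCN of Eq. (1).

    ts: array of shape (n_rois, n_timepoints), one BOLD time series per brain area.
    Returns the (n_rois, n_rois) matrix with entries C1[i, j] = PC(x_i, x_j).
    """
    # np.corrcoef treats each row as a variable, which is exactly
    # Cov(x_i, x_j) / sqrt(Var(x_i) * Var(x_j)).
    return np.corrcoef(ts)

# Example: 90 brain areas (e.g., an AAL-style parcellation) and 200 time points.
rng = np.random.default_rng(0)
C1 = first_order_bfcn(rng.standard_normal((90, 200)))
```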

In our previous work, we proposed a triplet-based second-order BFCN strategy [8] for extracting high-order information from brain areas. Specifically, let the triplet \( \left( {x_{i} ,x_{u} ,x_{v} } \right) \) consist of \( x_{i} \) and its neighbors \( x_{u} \) and \( x_{v} \). \( S_{uv}^{i} \) defines the distance between \( x_{i} \) and \( x_{v} \) relative to \( x_{u} \):

$$ S_{uv}^{i} = dist\left( {x_{i} ,x_{v} } \right) - dist\left( {x_{i} ,x_{u} } \right) $$
(2)

where \( dist\left( { \cdot , \cdot } \right) \) calculates the squared Euclidean distance.

The triplet-based second-order BFCN strategy takes into account that a brain area usually interacts with its neighbors rather than with distant brain areas, so it considers second-order information only among neighbors. Let \( N_{i} \) be the index set of the \( k \) nearest neighbors of \( x_{i} \); then, relative to all \( k \) nearest neighbors of \( x_{i} \), the distance between \( x_{i} \) and \( x_{j} \) can be expressed as \( \sum\limits_{{u \in N_{i} }} {S_{uj}^{i} } \) (i.e., setting \( v = j \) in Eq. (2)). Thus, the triplet-based second-order BFCN can be expressed as:

$$ C_{ij}^{2} = \left\{ {\begin{array}{*{20}l} {norm\left( { - \mathop \sum \limits_{{u \in N_{i} }} S_{uj}^{i} } \right),\left( {j \in N_{i} } \right)} \hfill \\ {0,\left( {j \notin N_{i} } \right)} \hfill \\ \end{array} } \right. $$
(3)

where \( norm\left( \cdot \right) \) is a function that normalizes the data.
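A compact sketch of Eqs. (2)–(3) is given below. The choice of \( norm\left( \cdot \right) \) is not specified above, so the min–max scaling used here is our assumption, as is the use of ROI time series for the distance computation:

```python
import numpy as np

def second_order_bfcn(ts, k=10):
    """Triplet-based second-order BFCN of Eqs. (2)-(3) (illustrative sketch).

    ts: array of shape (n_rois, n_timepoints).
    """
    n = ts.shape[0]
    # dist(x_i, x_j): squared Euclidean distance between ROI time series (Eq. 2).
    diff = ts[:, None, :] - ts[None, :, :]
    d = np.sum(diff ** 2, axis=2)

    C2 = np.zeros((n, n))
    for i in range(n):
        Ni = np.argsort(d[i])[1:k + 1]   # k nearest neighbors of x_i, excluding itself
        # sum_{u in N_i} S_{uj}^i = k * dist(i, j) - sum_{u in N_i} dist(i, u)
        s = k * d[i, Ni] - d[i, Ni].sum()
        w = -s                           # the minus sign in Eq. (3)
        span = w.max() - w.min()
        C2[i, Ni] = (w - w.min()) / span if span > 0 else 0.0  # assumed min-max norm(.)
    return C2
```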

2.2 Low-Rank Representation

Recently, low-rank representation (LRR) has performed very well in feature extraction and subspace learning [7]. For a given set of samples, LRR seeks the low-rank components of all samples with respect to the bases of a dictionary, so that the data can be represented as a linear combination of those bases [9].

Let \( \varvec{X} = \left[ {\varvec{x}_{1} , \varvec{x}_{2} , \ldots , \varvec{x}_{n} } \right] \in R^{d \times n} \) denote a set of data vectors, each column of which can be represented by a linear combination of the bases in the dictionary \( \varvec{A} = \left[ {\varvec{a}_{1} , \varvec{a}_{2} , \ldots , \varvec{a}_{m} } \right] \):

$$ \varvec{X} = \varvec{AZ} $$
(4)

where \( \varvec{Z} = \left[ {\varvec{z}_{1} , \varvec{z}_{2} , \ldots , \varvec{z}_{n} } \right] \) is a coefficient matrix, and \( \varvec{z}_{i} \) is the representation coefficient vector of \( \varvec{x}_{i} \).

Considering that the dictionary \( \varvec{A} \) is usually over-complete, so that Eq. (4) admits multiple solutions, LRR seeks a low-rank solution by solving the following problem [10]:

$$ \mathop {min }\limits_{\varvec{Z}} \left\| { \varvec{Z}} \right\|_{ *} \;\;\;s.t. \varvec{X} = \varvec{AZ} $$
(5)

where \( \left\| \cdot \right\|_{ *} \) represents the nuclear norm of the matrix [12].
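For small problems, Eq. (5) can be solved directly with an off-the-shelf convex solver, which is a convenient way to see the low-rank effect. The sketch below assumes cvxpy with its bundled SCS solver; it is a sanity check, not the optimization used in Sect. 3:

```python
import cvxpy as cp
import numpy as np

rng = np.random.default_rng(0)
d, m, n = 20, 15, 30
A = rng.standard_normal((d, m))           # dictionary A
X = A @ rng.standard_normal((m, n))       # data constructed to lie in span(A)

Z = cp.Variable((m, n))
prob = cp.Problem(cp.Minimize(cp.normNuc(Z)), [X == A @ Z])  # problem (5)
prob.solve(solver=cp.SCS)
print("rank of the recovered Z:", np.linalg.matrix_rank(Z.value, tol=1e-4))
```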

2.3 Materials

In this study, three schizophrenia rs-fMRI datasets were used: the Center for Biomedical Research Excellence (COBRE) dataset (53 patients and 67 normal controls), the Nottingham dataset (32 patients and 36 normal controls), and the Xiangya dataset (83 patients and 60 normal controls). For detailed information on the subjects, image acquisition, data preprocessing, and anatomical parcellation, please refer to [4].

3 Method

3.1 Proposed Method

Let \( \varvec{X}_{k} = \left[ {\varvec{x}_{k,1} ,\varvec{x}_{k,2} , \ldots ,\varvec{x}_{k,n} } \right] \in R^{m \times n} \) denote the k-th modality of the training data. Assuming that \( \varvec{X}_{k} \) contains \( c \) classes, \( \varvec{X}_{k} \) can be divided into \( c \) subsets, expressed as \( \varvec{X}_{k} = \left\{ {\varvec{X}_{k}^{1} ,\varvec{X}_{k}^{2} , \ldots ,\varvec{X}_{k}^{c} } \right\} \). In the context of this study, \( \varvec{X}_{1} \) represents the first-order BFCN, and \( \varvec{X}_{2} \) represents the second-order BFCN. Taking \( \varvec{X}_{k} \) itself as the dictionary, \( \varvec{X}_{k} \) can be re-represented as \( \varvec{X}_{k} = \varvec{X}_{k} \varvec{Z}_{k} + \varvec{E}_{k} \), where \( \varvec{Z}_{k} = \left[ {\varvec{z}_{k,1} ,\varvec{z}_{k,2} , \ldots ,\varvec{z}_{k,n} } \right] \in R^{n \times n} \) is the LRR of \( \varvec{X}_{k} \) and \( \varvec{E}_{k} \) denotes the sparse noise matrix. To include the block-diagonal structure information in the learning process and enhance the cooperation between the two modalities, we introduce the ideal representation regularization term \( \left\| {\varvec{Z}_{1} \varvec{Z}_{2}^{T} - \varvec{P}} \right\|_{F}^{2} \), where \( \varvec{P} = block\left( {\varvec{p}_{1} ,\varvec{p}_{2} , \ldots ,\varvec{p}_{c} } \right) \), \( \varvec{p}_{i} = ones\left( {n_{i} ,n_{i} } \right) \) is the code for \( \varvec{X}^{i} \), and \( n_{i} \) is the number of samples in \( \varvec{X}^{i} \). That is, if \( \varvec{x}_{k,j} \) belongs to class \( i \), then the coefficients in \( \varvec{p}_{i} \) are all 1s, while those elsewhere are all 0s. Thus, we have the following objective function:

$$ \mathop {\hbox{min} }\nolimits_{\varvec{Z}} \sum\nolimits_{k = 1}^{2} {\left( {\left\| {\varvec{Z}_{k} } \right\|_{*} + \beta \left\| {\varvec{Z}_{k} } \right\|_{1} + \gamma \left\| {\varvec{E}_{k} } \right\|_{2, 1} } \right)} + \lambda \left\| {\varvec{Z}_{1} \varvec{Z}_{2}^{T} - \varvec{P}} \right\|_{F}^{2} \,\;\;\;\;\;{\text{s}}.{\text{t}}. \;\varvec{X}_{k} = \varvec{X}_{k} \varvec{Z}_{k} + \varvec{E}_{k} $$
(6)

where \( \left\| \cdot \right\|_{1} \) denotes the l1-norm, \( \left\| \cdot \right\|_{2, 1} \) denotes the l2,1-norm, \( \left\| \cdot \right\|_{F} \) denotes the Frobenius norm, and \( \beta \), \( \gamma \), \( \lambda \) are hyperparameters used to balance the different parts of the function.
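The target matrix \( \varvec{P} \) is easy to construct once the training samples are grouped class by class; a minimal sketch (the helper name is ours):

```python
import numpy as np
from scipy.linalg import block_diag

def ideal_representation(n_per_class):
    """Block-diagonal ideal representation P of Eq. (6).

    n_per_class: number of training samples in each class, assuming the
    columns of X_k are ordered class by class.
    """
    return block_diag(*[np.ones((ni, ni)) for ni in n_per_class])

P = ideal_representation([5, 4])   # e.g., 5 patients followed by 4 controls
```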

Combining Eq. (6) with the multi-modality supervised feature selection framework [6, 13], the following objective function can be obtained:

$$ \mathop {\hbox{min} }\nolimits_{{\varvec{W},\varvec{Z}}} \sum\nolimits_{k = 1}^{2} {\left( {\left\| {\varvec{H} - \varvec{Z}_{k}^{T} \varvec{W}_{k} } \right\|_{F}^{2} + \alpha \left\| {\varvec{Z}_{k} } \right\|_{*} + \beta \left\| {\varvec{Z}_{k} } \right\|_{1} + \gamma \left\| {\varvec{E}_{k} } \right\|_{2, 1} +\upxi\left\| {\varvec{W}_{k} } \right\|_{F}^{2} } \right)} + \lambda \left\| {\varvec{Z}_{1} \varvec{Z}_{2}^{T} - \varvec{P}} \right\|_{F}^{2} \;\;\;\;\;{\text{s}}.{\text{t}}. \;\varvec{X}_{k} = \varvec{X}_{k} \varvec{Z}_{k} + \varvec{E}_{k} $$
(7)

where \( \varvec{H} \) denotes the label (ground-truth) matrix of \( \varvec{X} \), \( \varvec{W} = \left[ {\varvec{w}_{1} , \varvec{w}_{2} , \ldots ,\varvec{w}_{m} } \right] \in R^{c \times m} \) is the weight matrix, and \( \alpha \), \( \upxi \) denote the additional hyperparameters.

3.2 Optimization and Solution

Since problem (7) is non-convex, we first introduce the auxiliary variables \( \varvec{Z}_{k}^{{\prime }} \) and \( \varvec{Z}_{k}^{{\prime \prime }} \) to make the problem separable:

$$ \mathop {\hbox{min} }\nolimits_{\varvec{W}} \sum\nolimits_{k = 1}^{2} {\left( {\left\| {\varvec{H} - \varvec{Z}_{k}^{T} \varvec{W}_{k} } \right\|_{F}^{2} + \alpha \left\| {\varvec{Z}_{k}^{{\prime }} } \right\|_{*} + \beta \left\| {\varvec{Z}_{k}^{{\prime \prime }} } \right\|_{1} + \gamma \left\| {\varvec{E}_{k} } \right\|_{2, 1} +\upxi\left\| {\varvec{W}_{k} } \right\|_{F}^{2} } \right)} + \lambda \left\| {\varvec{Z}_{1} \varvec{Z}_{2}^{T} - \varvec{P}} \right\|_{F}^{2} \;\;\;\;{\text{s}}.{\text{t}}. \varvec{X}_{k} = \varvec{X}_{k} \varvec{Z}_{k} + \varvec{E}_{k} , \varvec{Z}_{k}^{{\prime }} = \varvec{Z}_{k} , \varvec{Z}_{k}^{{\prime \prime }} = \varvec{Z}_{k} $$
(8)

Problem (8) can be solved by minimizing the following augmented Lagrangian function L, using the augmented Lagrange multiplier (ALM) method:

$$ \begin{aligned} L & = \left\| {\varvec{H} - \varvec{Z}_{k}^{T} \varvec{W}_{k} } \right\|_{F}^{2} + \alpha \left\| {\varvec{Z}_{k}^{{\prime }} } \right\|_{*} + \beta \left\| {\varvec{Z}_{k}^{{\prime \prime }} } \right\|_{1} + \gamma \left\| {\varvec{E}_{k} } \right\|_{2, 1} + \lambda \left\| {\varvec{Z}_{1} \varvec{Z}_{2}^{T} - \varvec{P}} \right\|_{F}^{2} +\upxi\left\| {\varvec{W}_{k} } \right\|_{F}^{2} \\ & \quad + \left\langle {\varvec{Y}_{1, k} , \varvec{X}_{k} - \varvec{X}_{k} \varvec{Z}_{k} - \varvec{E}_{k} } \right\rangle + \left\langle {\varvec{Y}_{2, k} , \varvec{Z}_{k} - \varvec{Z}_{k}^{{\prime }} } \right\rangle + \left\langle {\varvec{Y}_{3, k} , \varvec{Z}_{k} - \varvec{Z}_{k}^{{\prime \prime }} } \right\rangle \\ & \quad + \frac{\mu }{2}\left( {\left\| {\varvec{X}_{k} - \varvec{X}_{k} \varvec{Z}_{k} - \varvec{E}_{k} } \right\|_{F}^{2} + \left\| {\varvec{Z}_{k} - \varvec{Z}_{k}^{{\prime }} } \right\|_{F}^{2} + \left\| {\varvec{Z}_{k} - \varvec{Z}_{k}^{{\prime \prime }} } \right\|_{F}^{2} } \right) \;\;\;k = 1, 2 \\ \end{aligned} $$
(9)

where \( \left\langle {\varvec{A}, \varvec{B}} \right\rangle = tr\left( {\varvec{A}^{T} \varvec{B}} \right) \), \( \varvec{Y}_{1, k} \), \( \varvec{Y}_{2, k} \) and \( \varvec{Y}_{3, k} \) are Lagrange multipliers and \( \mu \) is a balance parameter.

Further, the augmented Lagrangian function (9) can be rearranged into:

$$ \begin{aligned} L = & \left\| {\varvec{H} - \varvec{Z}_{k}^{T} \varvec{W}_{k} } \right\|_{F}^{2} + \alpha \left\| {\varvec{Z}_{k}^{{\prime }} } \right\|_{*} + \beta \left\| {\varvec{Z}_{k}^{''} } \right\|_{1} + \gamma \left\| {\varvec{E}_{k} } \right\|_{2, 1} + \lambda \left\| {\varvec{Z}_{1} \varvec{Z}_{2}^{T} - \varvec{P}} \right\|_{F}^{2} +\upxi\left\| {\varvec{W}_{k} } \right\|_{F}^{2} \\ & + \frac{\mu }{2}\left( {\left\| {\varvec{X}_{k} - \varvec{X}_{k} \varvec{Z}_{k} - \varvec{E}_{k} + \frac{{\varvec{Y}_{1, k} }}{\mu }} \right\|_{F}^{2} + \left\| {\varvec{Z}_{k} - \varvec{Z}_{k}^{{\prime }} + \frac{{\varvec{Y}_{2, k} }}{\mu }} \right\|_{F}^{2} + \left\| {\varvec{Z}_{k} - \varvec{Z}_{k}^{{\prime \prime }} + \frac{{\varvec{Y}_{3, k} }}{\mu }} \right\|_{F}^{2} } \right) - \frac{1}{2\mu }\left( {\left\| {\varvec{Y}_{1, k} } \right\|_{F}^{2} + \left\| {\varvec{Y}_{2, k} } \right\|_{F}^{2} + \left\| {\varvec{Y}_{3, k} } \right\|_{F}^{2} } \right) \,\,\,k = 1, 2 \\ \end{aligned} $$
(10)

The above problem can be solved by the inexact ALM (IALM) algorithm, an iterative method that updates each variable alternately while keeping the others fixed [14,15,16,17,18]. The stopping criteria are \( \left\| {\varvec{X}_{k} - \varvec{X}_{k} \varvec{Z}_{k} - \varvec{E}_{k} } \right\|_{\infty } < \varepsilon \), \( \left\| {\varvec{Z}_{k} - \varvec{Z}_{k}^{{\prime }} } \right\|_{\infty } < \varepsilon \) and \( \left\| {\varvec{Z}_{k} - \varvec{Z}_{k}^{{\prime \prime }} } \right\|_{\infty } < \varepsilon \), where \( \left\| \cdot \right\|_{\infty } \) denotes the \( l_{\infty } \)-norm.

The solution process is as follows, and each step has a closed-form solution:

Step 1 (Update \( \varvec{Z}_{k}^{{\prime }} \) ):

$$ \varvec{Z}_{k}^{{\prime }} = \mathop {argmin }\limits_{{\varvec{Z}_{k}^{{\prime }} }} \alpha \left\| {\varvec{Z}_{k}^{{\prime }} } \right\|_{*} + \frac{\mu }{2}\left\| {\varvec{Z}_{k} - \varvec{Z}_{k}^{{\prime }} + \frac{{\varvec{Y}_{2, k} }}{\mu }} \right\|_{F}^{2} $$
(11)

Problem (11) can be solved by the singular value thresholding (SVT) operator [11]:

$$ \varvec{Z}_{k}^{{\prime }} = \varvec{US}_{\theta } \left[ \varvec{S} \right]\varvec{V}^{T} $$
(12)

where \( \theta = \frac{\alpha }{\mu } \), \( \varvec{USV}^{T} \) is the singular value decomposition (SVD) of \( \varvec{Z}_{k} + \frac{{\varvec{Y}_{2, k} }}{\mu } \), and \( \varvec{S}_{\theta } \left[ x \right] \) is the soft-thresholding (shrinkage) operator [17], which is defined as follows:

$$ \varvec{S}_{\theta } \left[ x \right] = \left\{ {\begin{array}{*{20}l} {x - \theta ,\;\;} \hfill & {if\;\;x > \theta } \hfill \\ {x + \theta ,} \hfill & {if\;\;x < - \theta } \hfill \\ {0,} \hfill & {otherwise} \hfill \\ \end{array} } \right. $$
(13)

Step 2 (Update \( \varvec{Z}_{k}^{{\prime \prime }} \) ):

$$ \varvec{Z}_{k}^{{\prime \prime }} = \mathop {argmin }\limits_{{\varvec{Z}_{k}^{''} }} \beta \left\| {\varvec{Z}_{k}^{{\prime \prime }} } \right\|_{1} + \frac{\mu }{2}\left\| { \varvec{Z}_{k} - \varvec{Z}_{k}^{{\prime \prime }} + \frac{{\varvec{Y}_{3, k} }}{\mu }} \right\|_{F}^{2} $$
(14)

According to [14], the above problem has the following closed-form solution:

$$ \varvec{Z}_{k}^{{\prime \prime }} = shrink\left( { \varvec{Z}_{k} + \frac{{\varvec{Y}_{3, k} }}{\mu },\frac{\beta }{\mu } } \right) $$
(15)
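For concreteness, Steps 1 and 2 reduce to the two shrinkage operators below (a numpy sketch with our own helper names):

```python
import numpy as np

def soft_threshold(x, theta):
    # Elementwise shrinkage operator S_theta[x] of Eq. (13).
    return np.sign(x) * np.maximum(np.abs(x) - theta, 0.0)

def svt(M, theta):
    # Singular value thresholding (Eq. 12): shrink the singular values of M.
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    return U @ np.diag(soft_threshold(s, theta)) @ Vt

# Step 1: Z'_k  = svt(Z_k + Y2_k / mu, alpha / mu)              (Eqs. 11-12)
# Step 2: Z''_k = soft_threshold(Z_k + Y3_k / mu, beta / mu)    (Eqs. 14-15)
```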

Step 3 (Update \( \varvec{Z}_{k} \) ):

\( \varvec{Z}_{k} \) is updated by solving optimization problem (16):

$$ \begin{aligned} \varvec{Z}_{k} = & \mathop {argmin }\limits_{{\varvec{Z}_{k} }} \left\| {\varvec{H} - \varvec{Z}_{k}^{T} \varvec{W}_{k} } \right\|_{F}^{2} + \lambda \left\| {\varvec{Z}_{1} \varvec{Z}_{2}^{T} - \varvec{P}} \right\|_{F}^{2} + \\ & \frac{\mu }{2}\left( {\left\| {\varvec{X}_{k} - \varvec{X}_{k} \varvec{Z}_{k} - \varvec{E}_{k} + \frac{{\varvec{Y}_{1, k} }}{\mu }} \right\|_{F}^{2} + \left\| {\varvec{Z}_{k} - \varvec{Z}_{k}^{{\prime }} + \frac{{\varvec{Y}_{2, k} }}{\mu }} \right\|_{F}^{2} + \left\| {\varvec{Z}_{k} - \varvec{Z}_{k}^{{\prime \prime }} + \frac{{\varvec{Y}_{3, k} }}{\mu }} \right\|_{F}^{2} } \right) \\ \end{aligned} $$
(16)

It is easy to derive the closed-form solution for \( \varvec{Z}_{k} \):

$$ \begin{aligned} \varvec{Z}_{k} = & \left( {2\left( {1 + \lambda + \mu } \right)\varvec{I} + \mu \varvec{X}_{k}^{T} \varvec{X}_{k} } \right)^{ - 1} \left( {2\varvec{HW}_{k}^{T} + 2\lambda \varvec{PZ}_{l} + \mu \varvec{X}_{k}^{T} \varvec{X}_{k} + \mu \varvec{X}_{k}^{T} \varvec{E}_{k} - \varvec{X}_{k}^{T} \varvec{Y}_{1, k} + \mu \varvec{Z}_{k}^{'} - \varvec{Y}_{2, k} + \mu \varvec{Z}_{k}^{''} - \varvec{Y}_{3, k} } \right) \\ & \left( {\varvec{W}_{k} \varvec{W}_{k}^{T} + \varvec{Z}_{l}^{T} \varvec{Z}_{l} + 3\varvec{I}} \right)^{ - 1} \\ \end{aligned} $$
(17)

where \( l \ne k \), \( \varvec{Z}_{k}^{T} = \varvec{Z}_{k} \) and \( \varvec{P}^{T} = \varvec{P} \).

Step 4 (Update \( \varvec{W}_{k} \) ):

\( \varvec{W}_{k} \) can be updated by solving optimization problem (18):

$$ \varvec{W}_{k} = \mathop {argmin }\limits_{{\varvec{W}_{k} }} \left\| {\varvec{H} - \varvec{Z}_{k}^{T} \varvec{W}_{k} } \right\|_{F}^{2} +\upxi\left\| {\varvec{W}_{k} } \right\|_{F}^{2} $$
(18)

Similar to the previous step, it is easy to get the solution:

$$ \varvec{W}_{k} = \left( {\varvec{Z}_{k} \varvec{Z}_{k}^{T} +\upxi\varvec{I}} \right)^{ - 1} \varvec{Z}_{k} \varvec{H} $$
(19)
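Equation (19) is a ridge-regression-type solution; assuming, consistent with the data-fitting term in Eq. (7), that \( \varvec{Z}_{k} \in R^{n \times n} \) and \( \varvec{H} \in R^{n \times c} \), it can be sketched as:

```python
import numpy as np

def update_W(Z, H, xi):
    # Closed form of Eq. (19): W_k = (Z_k Z_k^T + xi I)^{-1} Z_k H.
    n = Z.shape[0]
    return np.linalg.solve(Z @ Z.T + xi * np.eye(n), Z @ H)
```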

Step 5 (Update \( \varvec{E}_{k} \) ):

$$ \varvec{E}_{k} = \mathop {argmin }\limits_{{\varvec{E}_{k} }} \gamma \left\| {\varvec{E}_{k} } \right\|_{2, 1} + \frac{\mu }{2}\left\| {\varvec{X}_{k} - \varvec{X}_{k} \varvec{Z}_{k} - \varvec{E}_{k} + \frac{{\varvec{Y}_{1, k} }}{\mu }} \right\|_{F}^{2} $$
(20)

To solve problem (20), the following lemma is required:

Lemma 1:

Let \( \varvec{Q} \) be a given matrix. If the optimal solution to

$$ \mathop {min }\limits_{\varvec{B}} \alpha \left\| \varvec{B} \right\|_{2,1} + \frac{1}{2}\left\| {\varvec{B} - \varvec{Q}} \right\|_{F}^{2} $$
(21)

is \( \varvec{B}^{ *} \), then the i-th column of \( \varvec{B}^{ *} \) is given as follows [19]:

$$ \left[ {\varvec{B}^{ *} } \right]_{:,i} = \left\{ {\begin{array}{*{20}l} {\frac{{\left\| {\varvec{Q}_{:,i} } \right\|_{2} - \alpha }}{{\left\| {\varvec{Q}_{:,i} } \right\|_{2} }}\varvec{Q}_{:,i} ,} \hfill & {if\left\| {\varvec{Q}_{:,i} } \right\|_{2} > \alpha } \hfill \\ {0, } \hfill & {otherwise} \hfill \\ \end{array} } \right. $$
(22)

Therefore, it is clear that the optimal solution of problem (20) is:

$$ \left[ \varvec{E} \right]_{:,i} = \left\{ {\begin{array}{*{20}l} {\frac{{\left\| {\varvec{J}_{:,i} } \right\|_{2} - \frac{\gamma }{\mu }}}{{\left\| {\varvec{J}_{:,i} } \right\|_{2} }}\varvec{J}_{:,i} , } \hfill & {if\left\| {\varvec{J}_{:,i} } \right\|_{2} > \frac{\gamma }{\mu }} \hfill \\ {0, } \hfill & {otherwise} \hfill \\ \end{array} } \right. $$
(23)

where \( \varvec{J} = \varvec{X}_{k} - \varvec{X}_{k} \varvec{Z}_{k} + \frac{{\varvec{Y}_{1, k} }}{\mu } \).
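In code, Eq. (23) is a columnwise shrinkage of \( \varvec{J} \) (a sketch with our own names; the small constant guards against division by zero):

```python
import numpy as np

def update_E(J, tau):
    # Columnwise l2,1 shrinkage of Eq. (23), with J = X_k - X_k Z_k + Y1_k / mu
    # and tau = gamma / mu; columns whose l2-norm is <= tau are set to zero.
    norms = np.linalg.norm(J, axis=0)
    scale = np.where(norms > tau, (norms - tau) / np.maximum(norms, 1e-12), 0.0)
    return J * scale
```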

Step 6 (Update multipliers):

The multipliers \( \varvec{Y}_{1, k} \), \( \varvec{Y}_{2, k} \), \( \varvec{Y}_{3, k} \) and the penalty parameter \( \mu \) (with iteration step size \( \rho > 1 \)) are updated by Eq. (24):

$$ \left\{ {\begin{array}{*{20}l} {\varvec{Y}_{1, k} = \varvec{Y}_{1, k} + \mu \left( {\varvec{X}_{k} - \varvec{X}_{k} \varvec{Z}_{k} - \varvec{E}_{k} } \right)} \hfill \\ {\varvec{Y}_{2, k} = \varvec{Y}_{2, k} + \mu \left( {\varvec{Z}_{k} - \varvec{Z}_{k}^{'} } \right)} \hfill \\ {\varvec{Y}_{3, k} = \varvec{Y}_{3, k} + \mu \left( {\varvec{Z}_{k} - \varvec{Z}_{k}^{''} } \right)} \hfill \\ {\mu = min\left( {\rho \mu ,\mu_{max} } \right)} \hfill \\ \end{array} } \right. $$
(24)
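Putting the six steps together, a minimal skeleton of the IALM iteration might look as follows. This is only an illustrative sketch under our own naming (modalities indexed 0 and 1 rather than 1 and 2): it reuses the `svt`, `soft_threshold`, `update_W`, and `update_E` helpers sketched above, and `update_Z` transcribes Eq. (17) as printed; it is not the authors' reference implementation:

```python
import numpy as np

def update_Z(Xk, H, Wk, P, Zl, Ek, Zpk, Zppk, Y1k, Y2k, Y3k, lam, mu):
    # Transcription of Eq. (17); Zl is the representation of the other modality.
    n = Xk.shape[1]
    left = 2 * (1 + lam + mu) * np.eye(n) + mu * Xk.T @ Xk
    mid = (2 * H @ Wk.T + 2 * lam * P @ Zl + mu * Xk.T @ Xk + mu * Xk.T @ Ek
           - Xk.T @ Y1k + mu * Zpk - Y2k + mu * Zppk - Y3k)
    right = Wk @ Wk.T + Zl.T @ Zl + 3 * np.eye(n)
    return np.linalg.solve(left, mid) @ np.linalg.inv(right)

def ialm(X, H, P, alpha, beta, gamma, xi, lam,
         mu=1e-2, rho=1.1, mu_max=1e6, eps=1e-6, max_iter=300):
    n, c = X[0].shape[1], H.shape[1]
    Z = [np.zeros((n, n)) for _ in range(2)]
    Zp = [np.zeros((n, n)) for _ in range(2)]    # Z'_k
    Zpp = [np.zeros((n, n)) for _ in range(2)]   # Z''_k
    W = [np.zeros((n, c)) for _ in range(2)]
    E = [np.zeros_like(Xk) for Xk in X]
    Y1 = [np.zeros_like(Xk) for Xk in X]
    Y2 = [np.zeros((n, n)) for _ in range(2)]
    Y3 = [np.zeros((n, n)) for _ in range(2)]
    for _ in range(max_iter):
        for k in (0, 1):
            l = 1 - k
            Zp[k] = svt(Z[k] + Y2[k] / mu, alpha / mu)                     # Step 1
            Zpp[k] = soft_threshold(Z[k] + Y3[k] / mu, beta / mu)          # Step 2
            Z[k] = update_Z(X[k], H, W[k], P, Z[l], E[k], Zp[k], Zpp[k],
                            Y1[k], Y2[k], Y3[k], lam, mu)                  # Step 3
            W[k] = update_W(Z[k], H, xi)                                   # Step 4
            E[k] = update_E(X[k] - X[k] @ Z[k] + Y1[k] / mu, gamma / mu)   # Step 5
            Y1[k] += mu * (X[k] - X[k] @ Z[k] - E[k])                      # Step 6
            Y2[k] += mu * (Z[k] - Zp[k])
            Y3[k] += mu * (Z[k] - Zpp[k])
        mu = min(rho * mu, mu_max)
        if all(np.abs(X[k] - X[k] @ Z[k] - E[k]).max() < eps and
               np.abs(Z[k] - Zp[k]).max() < eps and
               np.abs(Z[k] - Zpp[k]).max() < eps for k in (0, 1)):
            break
    return Z, W, E
```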

In short, we summarize the process of solving objective function (7) in Algorithm 1.

Algorithm 1. Solving objective function (7) via the IALM algorithm.

4 Experiments and Discussion

4.1 Comparison of Our Method and Baseline Classification Methods

We first compare our approach with several baseline feature selection and classification methods: the nearest neighbor (NN) classifier without feature selection, linear discriminant analysis (LDA) [20], the support vector machine (SVM) [21], and kernel discriminant analysis (KDA) [22].

We used a 10-fold cross-validation strategy in this experiment. Specifically, each dataset is divided equally into 10 subsets, and each subset is used as the test set in turn, with the remaining subsets used as the training set. This process is repeated 20 times to avoid deviations caused by the random partitioning of samples. Accuracy (ACC), sensitivity (SEN), specificity (SPE), the area under the receiver operating characteristic (ROC) curve (AUC), and their respective standard deviations (STD) are used to measure classification performance. For their calculation, please refer to [4].

For a fair comparison, the first-order and second-order BFCN construction steps are the same for all methods, i.e., the method in [4]. We apply thresholds from \( \left\{ {0,0.05, 0.1, \cdots ,0.4} \right\} \) to the first-order and second-order BFCNs on all datasets. For NN, LDA, SVM, and KDA, we follow [4] and combine the first-order and second-order BFCN features for the feature selection and classification steps. For SVM, we use a linear kernel; for KDA, we use a Gaussian kernel. For the proposed method, for simplicity, we use a greedy strategy to select \( \alpha \) and \( \upxi \) from \( \left\{ {10^{ - 3} , 10^{ - 2} , \cdots , 10^{3} } \right\} \) and set the other hyperparameters to 1.

Table 1 compares our method with the baseline feature selection and classification methods on the three schizophrenia datasets. Our proposed method achieves the best ACC, SEN, and AUC on all three datasets. As can be seen from Table 1, the SEN and SPE of the baseline methods are unbalanced: one of the two may be very high while the AUC results remain unsatisfactory. In contrast, the proposed method shows consistently good performance, probably because it uses a multi-modality learning strategy to better combine the first-order and second-order BFCN information.

Table 1. The classification results (ACC/SEN/SPE/AUC ± STD%) of our method and several comparison baseline classification algorithms. The best results are shown in bold.

4.2 Comparison of Our Method with State-of-the-Art Multi-modality Based Methods

In this experiment, we compare the proposed method with several state-of-the-art multi-modality based methods: the multi-kernel learning (MKL) SVM method [23], the multi-task feature selection (MTFS) model [6], the manifold regularized MTFS (M2TFS) model [13], and the multi-modal structured low-rank dictionary learning (MM-SLDL) method [24].

The experimental setup and the metrics for measuring classification performance are the same as in the previous experiment. For a fair comparison, the MKL parameters are chosen from \( \left\{ {0, 0.1, 0.2, \cdots , 1} \right\} \) using a greedy strategy. The parameter selection method for MTFS and M2TFS is the same as in [13], and the parameter selection range for MM-SLDL is the same as in [24].

The experimental results are reported in Table 2, and the corresponding ROC curves are plotted in Fig. 1. The proposed method performs best on all metrics, and the ROC curves of the comparison methods lie almost entirely to the lower right of ours. This may be because the multi-modality low-rank representation strategy can better learn and fuse first-order and second-order BFCN information.

Table 2. The classification results (ACC/SEN/SPE/AUC ± STD%) of our method and state-of-the-art multi-modality based algorithms. The best results are shown in bold.
Fig. 1. ROC curves for our proposed method and all state-of-the-art multi-modality based methods on all datasets.

4.3 Analysis of Convergence and Parameter Sensitivity

In this section, we first examine the convergence of the proposed method. Figure 2 shows the change in the value of the objective function as the number of iterations increases. It can be seen that our method essentially converges within 300 iterations.

Fig. 2. The convergence property of the proposed algorithm on different datasets.

We then evaluated the effect of the two hyperparameters \( \alpha \) and \( \upxi \) on the performance of our method, mainly in terms of ACC and AUC; the results are shown in Fig. 3. Both \( \alpha \) and \( \upxi \) range over \( \left\{ {10^{ - 3} , 10^{ - 2} , \cdots , 10^{3} } \right\} \). For simplicity, we only show the results on the Xiangya dataset. Figure 3 shows that our method reaches a steady state when \( \upxi \) takes a value of \( 10 \), \( 10^{2} \), or \( 10^{3} \); similar results were obtained on the other two datasets. In addition, parameter \( \upxi \) has a more critical impact on the results than parameter \( \alpha \).

Fig. 3. The effect of parameters \( \alpha \) and \( \upxi \) on our method. The left subgraph shows the ACC results and the right subgraph shows the AUC results.

5 Conclusion

In summary, we propose a multi-modality feature learning method based on low-rank representation to fuse first-order and second-order BFCN information and apply it to the classification of schizophrenia. Specifically, we combine the low-rank representation method with the multi-modality learning framework and add an ideal representation term to effectively learn the first-order and second-order BFCN feature information. Experiments on three schizophrenia datasets show that our approach is superior to existing multimodal feature learning methods.