1 Introduction

Satellite communication technology now plays an important role in satellite TV broadcasting, Internet access, distance education, military applications and mobile communications. Because of their long transmission range and their insensitivity to time, region and airspace, satellites have become indispensable for seamless global personal communication and high-speed Internet access in the air [1]. As service demands increase, frequency resources become ever more precious, and the unreasonable allocation and low utilization of spectrum in satellite communication systems aggravate the spectrum shortage. There are many ways to improve the utilization of a unit of spectrum, such as multi-carrier frequency multiplexing or multi-antenna transmission [2]. However, these techniques start from a fixed spectrum allocation strategy and therefore cannot fundamentally resolve the unbalanced allocation of spectrum [3]. To resolve this contradiction and improve spectrum utilization, researchers have applied cognitive radio theory to satellite communication systems, which led to the concept of the satellite cognitive radio network [4].

In recent years, many scholars have studied the application scenarios, network architecture and key technologies of satellite cognitive wireless networks. Traditional fast spectrum sensing algorithms, such as energy detection [5], cannot provide good sensing performance under low SNR conditions. Other algorithms with good detection performance, such as cyclostationary feature detection [6], have high complexity and long detection time. These algorithms are mainly applicable to a single sensing node and are not suited to scenarios in which multiple sensing nodes detect jointly. In 2013, European scientists launched the “Cognitive Radio for Satellite Communications” (CoRaSat) project, which aims to research, develop and build cognitive radio technology that can be applied to satellite communication systems [7]. Reference [8] analyzes the specific application scenarios, operating frequency bands and challenges of the satellite cognitive wireless network in the CoRaSat project. Reference [9] comprehensively analyzes the key technologies of the satellite cognitive wireless network, including spectrum sensing, interleaving and database technology, and matches the existing satellite multibeam pattern with the actual map of Europe to assess the feasibility of the network. The Institute of Microsystems, Chinese Academy of Sciences has studied the feasibility of applying cognitive radio technology to satellite communication systems; the results show that cognitive radio technology can address low spectrum utilization and poor anti-interference capability. Reference [10] analyzes the application prospects of cognitive radio technologies in low earth orbit satellite communication systems. Although many scholars have carried out theoretical research on the related technologies, there is still no mature application scheme for cognitive radio technology in satellite communication systems.

Because of the particular characteristics of the satellite communication channel, traditional spectrum sensing algorithms that rely on a single sensing node are easily disturbed by interference, which degrades their performance [11]. Therefore, a centralized multi-node spectrum sensing method is adopted to improve the sensing performance and to sense authorized signals rapidly and accurately. Building on the traditional spectrum sensing algorithms for large-scale cognitive radio networks, an eigenvalue exponential mean algorithm based on big data analysis theory is proposed.

The paper is organized as follows. In Sect. 2, we introduce the system model of satellite cognitive radio networks. In Sect. 3, the spectrum sensing algorithms based on big data analysis theory for large-scale cognitive networks are investigated, and a new algorithm based on the eigenvalue exponential mean is proposed. The performance of the spectrum sensing algorithms is verified in Sect. 4. Finally, the conclusion is given in Sect. 5.

2 System model

The spectrum sensing scenario based on big data analysis in a satellite cognitive wireless network is shown in Fig. 1. The cognitive network contains multiple sensing nodes and cognitive users that sense the signals of authorized users (users of the authorized frequency bands). The received signal \(x\left( t \right)\) is sampled with sampling interval \(T_{s}\), which gives the received signal samples:

$$x\left[ n \right] = x\left( {nT_{s} } \right)$$
(1)
Fig. 1 Satellite cognitive wireless network spectrum sensing scenario

According to the spectrum sensing theory of cognitive radio, when the sensing node receives only noise, the hypothesis is \(H_{0}\), which means that the authorized user does not occupy the spectrum. When both signal and noise are received, the hypothesis is \(H_{1}\), which means that the spectrum is occupied. Therefore, the received signal samples can be expressed as:

$$H_{0} : \quad x\left[ n \right] = w\left[ n \right]$$
(2)
$$H_{1} : \quad x\left[ n \right] = s\left[ n \right] + w\left[ n \right]$$
(3)

where \(w\left[ n \right]\) is independent and identically distributed Gaussian white noise with zero mean and variance \(\sigma^{2}\). Because the sensing node is assumed to be unable to predict the current channel condition, the mean and variance of the received authorized user signal \(s\left[ n \right]\) are unknown.

The detection probability \(P_{d}\) and the false alarm probability \(P_{f}\) are used to measure detection performance. \(P_{d}\) is the probability that the sensing node correctly detects the signal when the authorized user occupies the spectrum. \(P_{f}\) is the probability that the sensing node wrongly decides that the authorized user signal is present when the authorized user does not occupy the spectrum, that is:

$$P_{d} = P\left( {\left. {D_{1} } \right|H_{1} } \right)$$
(4)
$$P_{f} = P\left( {\left. {D_{1} } \right|H_{0} } \right)$$
(5)

where \(D_{1}\) indicates that the cognitive radio network determines that the authorized user is occupying the spectrum.
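As an illustration of Eqs. (4) and (5), both probabilities can be estimated empirically as the fraction of trials in which a test statistic exceeds the decision threshold. The following Python sketch assumes that statistics collected under each hypothesis are already available; the function name `empirical_rates` and the example values are hypothetical and are not taken from the simulations in this paper.

```python
import numpy as np

def empirical_rates(stat_h0, stat_h1, threshold):
    """Estimate (P_f, P_d): the fraction of H0 and H1 trials that decide D1."""
    p_f = float(np.mean(np.asarray(stat_h0) > threshold))  # Eq. (5)
    p_d = float(np.mean(np.asarray(stat_h1) > threshold))  # Eq. (4)
    return p_f, p_d

# Hypothetical statistics from four noise-only and four signal-present trials
p_f, p_d = empirical_rates([0.1, 0.2, 0.9, 0.3], [0.8, 0.7, 0.4, 0.95], 0.5)
print(p_f, p_d)
```

With these made-up values, one of the four noise-only trials and three of the four signal-present trials exceed the threshold.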

The sample space \(\Gamma_{x}\) collected by the sensing nodes is divided into several subspaces, each containing \(N_{tot}\) samples. The subspace \(\Gamma_{x,i}\) is regarded as composed of \(N\) sample vectors \({\varvec{X}}_{i}\) of length \(L\), that is:

$$\Gamma_{x,i} = \left\{ {{\varvec{X}}_{{\left( {i - 1} \right)N + 1}} ,{\varvec{X}}_{{\left( {i - 1} \right)N + 2}} , \ldots ,{\varvec{X}}_{{\left( {i - 1} \right)N + N}} } \right\}$$
(6)
$${\varvec{X}}_{i} = \left[ {x\left[ i \right],x\left[ {i + 1} \right], \ldots ,x\left[ {i + L - 1} \right]} \right]^{T}$$
(7)

where \(L\) is called the smoothing factor and \(\left( \cdot \right)^{T}\) denotes the transpose. Assume \({\varvec{X}}_{i} \sim {\mathbb{N}}\left( {0,{\varvec{R}}_{x} } \right)\), where \({\varvec{R}}_{x}\) is the covariance matrix of the received signal. In practical applications, \({\varvec{R}}_{x}\) is often unknown, but it can be approximated by the sample covariance matrix \({\hat{\varvec{R}}}_{x}\):

$${\hat{\varvec{R}}}_{x} = \frac{1}{N}\sum\limits_{i = 1}^{N} {{\varvec{X}}_{i} {\varvec{X}}_{i}^{T} }$$
(8)
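Eqs. (7) and (8) can be sketched in a few lines of Python with NumPy. The sketch below handles the first subspace only and uses 0-based indexing, so column \(i\) holds \(x[i], \ldots, x[i+L-1]\); the function names `sample_vectors` and `sample_covariance` are hypothetical.

```python
import numpy as np

def sample_vectors(x, L, N):
    """Stack N overlapping length-L windows of x as columns (Eq. 7)."""
    x = np.asarray(x, dtype=float)
    if x.size < N + L - 1:
        raise ValueError("need at least N + L - 1 samples")
    return np.stack([x[i:i + L] for i in range(N)], axis=1)  # shape (L, N)

def sample_covariance(X):
    """Sample covariance (1/N) * sum_i X_i X_i^T (Eq. 8) for X of shape (L, N)."""
    return X @ X.T / X.shape[1]

X = sample_vectors([1, 2, 3, 4], L=2, N=3)   # columns [1,2], [2,3], [3,4]
R_hat = sample_covariance(X)
print(R_hat)
```

For this toy input the result is the \(2 \times 2\) matrix with entries 14, 20, 20 and 29, divided by \(N = 3\).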

In large-scale cognitive radio networks, multiple sensing nodes are usually used to obtain large amounts of data. Multivariate analysis based on big data usually assumes that the number of samples is large enough relative to the dimension, that is, \(N/L\) is large enough. Because the satellite cognitive wireless network covers a wide area and contains a large number of sensing nodes and cognitive users, a large amount of sensing data is easy to obtain. Processing the sensing data centrally yields a sufficiently large sample subspace at the fusion center and improves the detection capability.
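The role of \(N/L\) can be seen in a small numerical experiment. The sketch below is an illustration under simplifying assumptions, not part of the simulations in Sect. 4: it draws \(N\) independent white-noise vectors (rather than overlapping windows of one sample stream), for which the true covariance is the identity, and reports the Frobenius-norm error of the sample covariance. The function name `covariance_error` and the parameter values are hypothetical.

```python
import numpy as np

def covariance_error(N, L=4, seed=0):
    """Frobenius distance between the sample covariance of N noise vectors and I."""
    rng = np.random.default_rng(seed)
    X = rng.standard_normal((L, N))        # columns are i.i.d. N(0, I) vectors
    R_hat = X @ X.T / N                    # Eq. (8)
    return float(np.linalg.norm(R_hat - np.eye(L)))

for N in (100, 10_000):
    print(N, covariance_error(N))
```

The error shrinks as \(N\) grows for fixed \(L\), which is why a large subspace at the fusion center helps every covariance-based detector discussed below.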

3 Spectrum sensing algorithms based on big data theory

In recent years, the use of big data analysis to extract statistical features of sample subspaces has become an active topic in the study of spectrum sensing for large-scale cognitive wireless networks. The authorized user signal occupies fewer dimensions of the observation space than the noise, and the spectra of the authorized signal and of Gaussian white noise are very different, so the spectrum sensing algorithms use the sample covariance matrix for signal detection.

3.1 Estimation-correlation (EC) algorithm

The EC algorithm performs sensing based on prior information about the authorized user signal. The signal \(s\left[ n \right]\) is assumed to be a zero-mean Gaussian random process with known covariance matrix, and the channel noise \(w\left[ n \right]\) is assumed to be Gaussian white noise with variance \(\sigma^{2}\) that is uncorrelated with the signal. From Eqs. (2) and (3), the sample vector \({\varvec{X}}_{i}\) at the sensing node obeys:

$$H_{0} : \quad {\varvec{X}}_{i} \sim {\mathbb{N}}\left( {0,\sigma^{2} {\varvec{I}}} \right)$$
(9)
$$H_{1} : \quad {\varvec{X}}_{i} \sim {\mathbb{N}}\left( {0,{\varvec{R}}_{s} + \sigma^{2} {\varvec{I}}} \right)$$
(10)

where \({\varvec{R}}_{s}\) is the covariance matrix of the signal and \({\varvec{I}}\) is the unit matrix.

In a cognitive radio system, the false alarm probability \(P_{f}\) is specified in advance. According to the Neyman-Pearson (NP) criterion [12], the authorized user is considered to be occupying the spectrum when Eq. (11) holds.

$$H_{1} : \quad L\left( {\varvec{X}} \right) = \frac{{p\left( {{\varvec{X}};H_{1} } \right)}}{{p\left( {{\varvec{X}};H_{0} } \right)}} > \gamma$$
(11)

where \(p\left( {{\varvec{X}};H_{0} } \right)\) and \(p\left( {{\varvec{X}};H_{1} } \right)\) are the probability density functions of the random vector \({\varvec{X}}\) under \(H_{0}\) and \(H_{1}\), respectively. The decision threshold \(\gamma\) is determined by the desired false alarm probability \(\alpha\) through:

$$P_{f} = \int_{{\left\{ {{\varvec{X}}:L\left( {\varvec{X}} \right) > \gamma } \right\}}} {p\left( {{\varvec{X}};H_{0} } \right)} d{\varvec{X}} = \alpha$$
(12)

The function \(L\left( {\varvec{X}} \right)\) in Eq. (11) is called the likelihood ratio; it measures how much more likely the observation \({\varvec{X}}\) is under \(H_{1}\) than under \(H_{0}\). The detector is therefore also called the likelihood ratio detector or the NP detector.
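When the integral in Eq. (12) has no convenient closed form, a threshold for a target false alarm probability \(\alpha\) is often approximated from noise-only trials. The sketch below shows this common Monte Carlo practice; it is an assumption for illustration, since the paper does not state how its thresholds are computed, and the function name `threshold_from_noise` is hypothetical.

```python
import numpy as np

def threshold_from_noise(stat_h0, alpha):
    """Threshold exceeded by roughly a fraction alpha of the H0 statistics."""
    return float(np.quantile(np.asarray(stat_h0, dtype=float), 1.0 - alpha))

# Hypothetical noise-only statistics: the integers 1..100
stat_h0 = np.arange(1, 101)
gamma = threshold_from_noise(stat_h0, alpha=0.1)
print(gamma, np.mean(stat_h0 > gamma))
```

In this toy case exactly ten of the one hundred noise-only values exceed the threshold, so the empirical false alarm rate matches \(\alpha = 0.1\).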

The probability density functions of the sample vector \({\varvec{X}}_{i}\) at the sensing node satisfy the following expressions respectively:

$$\begin{gathered} p\left( {{\varvec{X}}_{i} ;H_{1} } \right) = \frac{1}{{\left( {2\pi } \right)^{\frac{L}{2}} \det^{\frac{1}{2}} \left( {{\varvec{R}}_{s} + \sigma^{2} {\varvec{I}}} \right)}} \times \hfill \\ \;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\exp \left[ { - \frac{1}{2}{\varvec{X}}_{i}^{T} \left( {{\varvec{R}}_{s} + \sigma^{2} {\varvec{I}}} \right)^{ - 1} {\varvec{X}}_{i} } \right] \hfill \\ \end{gathered}$$
(13)
$$p\left( {{\varvec{X}}_{i} ;H_{0} } \right) = \frac{1}{{\left( {2\pi \sigma^{2} } \right)^{\frac{L}{2}} }}\exp \left[ { - \frac{1}{{2\sigma^{2} }}{\varvec{X}}_{i}^{T} {\varvec{X}}_{i} } \right]$$
(14)

where \(\det \left( \cdot \right)\) denotes the determinant of the matrix and \(L\) is the length of the sample vector \({\varvec{X}}_{i}\).

According to the NP criterion, when the likelihood ratio \(L\left( {{\varvec{X}}_{i} } \right)\) satisfies Eq. (15), the cognitive radio system considers that the authorized user is occupying the spectrum.

$$L\left( {{\varvec{X}}_{i} } \right) = \frac{{\;\frac{{\exp \left[ { - \frac{1}{2}{\varvec{X}}_{i}^{T} \left( {{\varvec{R}}_{s} + \sigma^{2} {\varvec{I}}} \right)^{ - 1} {\varvec{X}}_{i} } \right]}}{{\left( {2\pi } \right)^{\frac{L}{2}} \det^{\frac{1}{2}} \left( {{\varvec{R}}_{s} + \sigma^{2} {\varvec{I}}} \right)}}\;}}{{\frac{{\exp \left[ { - \frac{1}{{2\sigma^{2} }}{\varvec{X}}_{i}^{T} {\varvec{X}}_{i} } \right]}}{{\left( {2\pi \sigma^{2} } \right)^{\frac{L}{2}} }}}} > \gamma$$
(15)

After taking the logarithm and simplifying, we get:

$$\begin{gathered} l\left( {{\varvec{X}}_{i} } \right) \triangleq \ln \left\{ {L\left( {{\varvec{X}}_{i} } \right)} \right\} \hfill \\ \;\;\;\;\;\;\;\; = \frac{1}{2}\ln \left( {\frac{{\sigma^{2L} }}{{\det \left( {{\varvec{R}}_{s} + \sigma^{2} {\varvec{I}}} \right)}}} \right) - \hfill \\ \;\;\;\;\;\;\;\;\;\;\frac{1}{2}{\varvec{X}}_{i}^{T} \left[ {\left( {{\varvec{R}}_{s} + \sigma^{2} {\varvec{I}}} \right)^{ - 1} - \frac{1}{{\sigma^{2} }}{\varvec{I}}} \right]{\varvec{X}}_{i} \hfill \\ \end{gathered}$$
(16)

The vector \({\varvec{X}}_{i}\) is judged to contain the authorized user signal when \(l\left( {{\varvec{X}}_{i} } \right) > \ln \gamma\). Only the second term of Eq. (16) depends on \({\varvec{X}}_{i}\); the first term is a constant that can be absorbed into a new threshold \(\gamma ^{\prime}\), so the test simplifies to:

$$- \frac{1}{2}{\varvec{X}}_{i}^{T} \left[ {\left( {{\varvec{R}}_{s} + \sigma^{2} {\varvec{I}}} \right)^{ - 1} - \frac{1}{{\sigma^{2} }}{\varvec{I}}} \right]{\varvec{X}}_{i} > \gamma ^{\prime}$$
(17)

Multiplying both sides by \(2\sigma^{2}\) gives:

$$T\left( {{\varvec{X}}_{i} } \right) \triangleq \sigma^{2} {\varvec{X}}_{i}^{T} \left[ {\frac{1}{{\sigma^{2} }}{\varvec{I}} - \left( {{\varvec{R}}_{s} + \sigma^{2} {\varvec{I}}} \right)^{ - 1} } \right]{\varvec{X}}_{i} > 2\sigma^{2} \gamma ^{\prime}$$
(18)

For matrices \({\varvec{A}}\), \({\varvec{B}}\), \({\varvec{C}}\) and \({\varvec{D}}\) of compatible dimensions, with \({\varvec{A}}\) and \({\varvec{C}}\) invertible, the matrix inversion lemma states:

$$\left( {{\varvec{A}} + {\varvec{BCD}}} \right)^{ - 1} = {\varvec{A}}^{ - 1} - {\varvec{A}}^{ - 1} {\varvec{B}}\left( {{\varvec{DA}}^{ - 1} {\varvec{B}} + {\varvec{C}}^{ - 1} } \right)^{ - 1} {\varvec{DA}}^{ - 1}$$
(19)

Let \({\varvec{A}} = \sigma^{2} {\varvec{I}}\), \({\varvec{B}} = {\varvec{D}} = {\varvec{I}}\), \({\varvec{C}} = {\varvec{R}}_{s}\), then:

$$\left( {\sigma^{2} {\varvec{I}} + {\varvec{R}}_{s} } \right)^{ - 1} = \frac{1}{{\sigma^{2} }}{\varvec{I}} - \frac{1}{{\sigma^{4} }}\left( {\frac{1}{{\sigma^{2} }}{\varvec{I}} + {\varvec{R}}_{s}^{ - 1} } \right)^{ - 1}$$
(20)

Combining Eqs. (18) and (20), we can get:

$$\begin{aligned} T\left( {{\varvec{X}}_{i} } \right) = & {\varvec{X}}_{i}^{T} \left[ {\frac{1}{{\sigma^{2} }}\left( {\frac{1}{{\sigma^{2} }}{\varvec{I}} + {\varvec{R}}_{s}^{ - 1} } \right)^{ - 1} } \right]{\varvec{X}}_{i} \\ = & {\varvec{X}}_{i}^{T} \frac{1}{{\sigma^{2} }}\left[ {\frac{1}{{\sigma^{2} }}\left( {{\varvec{R}}_{s} + \sigma^{2} {\varvec{I}}} \right){\varvec{R}}_{s}^{ - 1} } \right]^{ - 1} {\varvec{X}}_{i} \\ = & {\varvec{X}}_{i}^{T} {\varvec{R}}_{s} \left( {{\varvec{R}}_{s} + \sigma^{2} {\varvec{I}}} \right)^{ - 1} {\varvec{X}}_{i} \\ \triangleq {\varvec{X}}_{i}^{T} {\hat{\varvec{S}}} \\ \end{aligned}$$
(21)

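where \({\hat{\varvec{S}}} = {\varvec{R}}_{s} \left( {{\varvec{R}}_{s} + \sigma^{2} {\varvec{I}}} \right)^{ - 1} {\varvec{X}}_{i}\) is the linear minimum mean square error estimate of the signal vector, with elements \(\hat{s}\left[ {i + k} \right]\). \(T\left( {{\varvec{X}}_{i} } \right)\) can therefore be written as the correlation between the received samples and the estimated signal samples: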

$$T\left( {{\varvec{X}}_{i} } \right) = \sum\limits_{k = 0}^{L - 1} {x\left[ {i + k} \right]} \hat{s}\left[ {i + k} \right]$$
(22)

Therefore, for the sample vector \({\varvec{X}}_{i}\), the sensing node only needs to calculate \(T\left( {{\varvec{X}}_{i} } \right)\) according to the prior information \({\varvec{R}}_{s}\) and \(\sigma^{2}\) of the authorized user signal, and compare it with the threshold \(\gamma ^{\prime\prime}\) under the specific false alarm probability \(\alpha\).
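The chain of identities from Eq. (18) to Eq. (22) can be checked numerically. The Python sketch below uses an arbitrary randomly generated positive definite \({\varvec{R}}_{s}\) and arbitrary values of \(L\) and \(\sigma^{2}\); these values are illustrative assumptions only.

```python
import numpy as np

rng = np.random.default_rng(0)
L, sigma2 = 4, 0.5
A = rng.standard_normal((L, L))
R_s = A @ A.T + 0.1 * np.eye(L)    # symmetric positive definite, hence invertible
x = rng.standard_normal(L)         # one sample vector X_i
eye = np.eye(L)
inv_sum = np.linalg.inv(R_s + sigma2 * eye)

# Eq. (18): sigma^2 * X^T [ I / sigma^2 - (R_s + sigma^2 I)^-1 ] X
t_eq18 = sigma2 * x @ (eye / sigma2 - inv_sum) @ x
# Eq. (21): X^T R_s (R_s + sigma^2 I)^-1 X
t_eq21 = x @ R_s @ inv_sum @ x
# Eq. (22): correlation of the samples with the estimated signal samples
s_hat = R_s @ inv_sum @ x
t_eq22 = float(np.sum(x * s_hat))

print(t_eq18, t_eq21, t_eq22)
```

The three printed values agree up to floating-point rounding, which confirms that Eqs. (18), (21) and (22) describe the same statistic.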

In satellite cognitive wireless networks, a sufficiently large sample subspace can be obtained. Therefore, estimation-correlation can be applied to each of the \(N\) vectors \({\varvec{X}}_{i} \left( {i = 1,2, \cdots ,N} \right)\) in the subspace, and the results can be averaged to improve the detection performance. The authorized user is judged to occupy the spectrum when:

$$H_{1} :\;\;\;\;T_{EC} = \frac{1}{N}\sum\limits_{i = 1}^{N} {{\varvec{X}}_{i}^{T} {\varvec{R}}_{s} \left( {{\varvec{R}}_{s} + \sigma^{2} {\varvec{I}}} \right)^{ - 1} {\varvec{X}}_{i} } > \gamma_{EC}$$
(23)

where \(\gamma_{EC}\) is the threshold. Equation (23) is the expression for estimation-correlation detection using the prior information \({\varvec{R}}_{s}\) and \(\sigma^{2}\) of the authorized user signal.
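A minimal Python sketch of the statistic in Eq. (23) is given below. It assumes that the sample vectors are stored as the columns of an \(L \times N\) array and that \({\varvec{R}}_{s}\) and \(\sigma^{2}\) are known; the function name `ec_statistic` and the toy inputs are hypothetical.

```python
import numpy as np

def ec_statistic(X, R_s, sigma2):
    """T_EC of Eq. (23): mean over columns X_i of X_i^T R_s (R_s + sigma2 I)^-1 X_i."""
    L = X.shape[0]
    W = R_s @ np.linalg.inv(R_s + sigma2 * np.eye(L))
    # Column-wise quadratic forms X_i^T W X_i, then the average over the N columns
    return float(np.mean(np.sum(X * (W @ X), axis=0)))

# Toy check: R_s = I and sigma2 = 1 give W = 0.5 I
X = np.array([[1.0, 0.0],
              [0.0, 2.0]])          # columns [1, 0] and [0, 2]
t_ec = ec_statistic(X, np.eye(2), 1.0)
print(t_ec)
```

In the toy case the two quadratic forms are 0.5 and 2.0, so their average is 1.25; the decision then compares this value with \(\gamma_{EC}\).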

3.2 Eigenvalue template matching (ETM) algorithm

In the actual spectrum sensing process of cognitive radio, the prior information \({\varvec{R}}_{s}\) and \(\sigma^{2}\) is often unknown. In this case, the sensing node extracts features from previous sample subspaces to serve as prior information and compares them with the features extracted from the current sample subspace to determine whether there is a spectrum hole.

The ETM algorithm uses the covariance matrix \({\varvec{R}}_{x}\) of the sample subspace \(\Gamma_{x,i}\) for sensing, and extracts the first eigenvector of the matrix (that is, the eigenvector corresponding to the largest eigenvalue) as the feature of the subspace \(\Gamma_{x,i}\), denoted as \(\eta_{i}\). If the subspace contains only noise vectors, the feature is random, and if it contains signal, the feature is stable. Therefore, the ETM algorithm uses an adaptive feature learning algorithm (FLA) to automatically extract features from two consecutive sample subspaces, that is, to compare the similarity of the two subspace features.

$$T_{FLA} = \mathop {\max }\limits_{l = 1,2, \cdots ,L - k + 1} \left| {\sum\limits_{k = 1}^{L} {\eta_{i} \left[ k \right]\eta_{i + 1} \left[ {k + l} \right]} } \right|$$
(24)

where \(\eta_{i}\) and \(\eta_{i + 1}\) are the feature vectors of two consecutive sample subspaces \(\Gamma_{x,i}\) and \(\Gamma_{x,i + 1}\), respectively. If the value of \(T_{FLA}\) is higher than the threshold \(\gamma_{FLA}\), the feature learning process ends, and the vector \(\eta_{i + 1}\) at that point is taken as the feature \(\phi_{s}\) of the authorized signal.

Then, with \(\phi_{s}\) as the prior information, the similarity \(T_{ETM}\) between \(\phi_{s}\) and the feature vector \(\eta_{current}\) of the current subspace \(\Gamma_{x,current}\) is calculated. If \(T_{ETM}\) is higher than the threshold \(\gamma_{ETM}\), the current subspace is judged to contain the signal of the authorized user. The detection expression of the ETM algorithm is:

$$H_{1} :\;\;\;T_{ETM} = \mathop {\max }\limits_{l = 1,2, \cdots ,L - k + 1} \left| {\sum\limits_{k = 1}^{L} {\phi_{s} \left[ k \right]\eta_{current} \left[ {k + l} \right]} } \right| > \gamma_{ETM}$$
(25)

The algorithm consists of two stages. First, in the feature learning stage, the similarity function is used to automatically extract the signal feature of the authorized user. Second, in the decision stage, the feature of the authorized user signal is compared with that of the current subspace to determine whether the authorized signal is present.
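The two building blocks of the ETM algorithm, the leading eigenvector and the similarity measure, can be sketched in Python as follows. Eqs. (24) and (25) do not specify how shifted indices beyond the vector length are handled, so this sketch assumes zero padding and takes the maximum over all lags (including lag zero), which is only one possible reading; the function names are hypothetical.

```python
import numpy as np

def leading_eigenvector(R):
    """Unit-norm eigenvector of the symmetric matrix R for its largest eigenvalue."""
    _, vecs = np.linalg.eigh(R)      # eigh returns eigenvalues in ascending order
    return vecs[:, -1]

def similarity(a, b):
    """Largest absolute cross-correlation of a and b over all zero-padded lags."""
    return float(np.max(np.abs(np.correlate(a, b, mode="full"))))

# Toy covariance whose largest eigenvalue (3) belongs to the second coordinate
eta = leading_eigenvector(np.diag([1.0, 3.0, 2.0]))
print(eta, similarity(eta, eta))
```

Because both features have unit norm, the similarity is at most 1, and it equals 1 when a feature is compared with itself. In the learning stage this value would be compared with \(\gamma_{FLA}\), and in the decision stage with \(\gamma_{ETM}\).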

3.3 Matrix function detection (MFD) algorithm

The MFD algorithm uses the positive semi-definiteness of the covariance matrix for detection, and does not need to calculate the eigenvalues of matrix, but uses the trace of the matrix as the criterion for decision-making. Assume that the covariance matrices of authorized user signal and noise are \({\varvec{R}}_{s}\) and \({\varvec{R}}_{n}\), respectively. When the signal contains authorized user signal and noise, there is:

$${\varvec{R}}_{x} = {\varvec{R}}_{s} + {\varvec{R}}_{n} + E\left( {{\varvec{sw}}^{T} } \right) + E\left( {{\varvec{ws}}^{T} } \right)$$
(26)

where \(E\left(\cdot \right)\) denotes expectation. When the number of samples at the sensing node is large, the signal can be assumed to be uncorrelated with the noise, so that \(E\left( {{\varvec{sw}}^{T} } \right) = 0\) and \(E\left( {{\varvec{ws}}^{T} } \right) = 0\). Then:

$$H_{1} :\quad {\varvec{R}}_{x} = {\varvec{R}}_{s} + {\varvec{R}}_{n}$$
(27)

Correspondingly, when the authorized user does not occupy the spectrum:

$$H_{0} :\quad {\varvec{R}}_{x} = {\varvec{R}}_{n}$$
(28)

In the MFD algorithm, the sample subspace \(\Gamma_{x,i}\) is divided into \(K\) segments. Each segment contains \(N_{k} = N/K\) sample vectors. Therefore, the covariance matrix of the k-th segment is:

$${\varvec{R}}_{x,k} = \frac{1}{{N_{k} }}\sum\limits_{{i = \left( {k - 1} \right)N_{k} + 1}}^{{\left( {k - 1} \right)N_{k} + N_{k} }} {{\varvec{X}}_{i} {\varvec{X}}_{i}^{T} }$$
(29)

From Eqs. (27) and (29), when the authorized user occupies the spectrum, the covariance matrix is:

$$H_{1} : \quad {\varvec{R}}_{x,k} = {\varvec{R}}_{s,k} + {\varvec{R}}_{n,k}$$
(30)

Since the sample covariance matrix is positive semi-definite, for any \(k = 1,2, \ldots ,K\), we have:

$${\varvec{R}}_{s,k} + {\varvec{R}}_{n,k} \succ {\varvec{R}}_{n,k}$$
(31)

In the case of \(H_{1}\), the elements of the sample covariance matrix of any segment have larger values. We use this property to determine whether the subspace contains signal from authorized users. In order to improve the robustness, we average the sample covariance matrix of each segment. From Eq. (31), we can get:

$$\frac{1}{K}\sum\limits_{k = 1}^{K} {\left( {{\varvec{R}}_{s,k} + {\varvec{R}}_{n,k} } \right)} \succ \frac{1}{K}\sum\limits_{k = 1}^{K} {{\varvec{R}}_{n,k} }$$
(32)

For \({\varvec{A}} \prec {\varvec{B}}\), we can usually find a monotonically increasing function \(f\left( \right)\), such that \(f\left( {\varvec{A}} \right) \prec f\left( {\varvec{B}} \right)\). The output of the function \(f\left( \right)\) is also a matrix, which is still difficult to compare. Therefore, we look for the trace of matrix to make the result a real number, and then compare.

For positive semi-definite matrices \({\varvec{A}},{\varvec{B}} \in {\mathbb{C}}^{n \times n}\) with \({\varvec{A}} \prec {\varvec{B}}\), let \(f:[0,\infty ) \to [0,\infty )\) satisfy: ① \(f\left( 0 \right) = 0\), ② \(f\left( \cdot \right)\) is a continuous increasing function. Then

$$Tr\left( {f\left( {\varvec{A}} \right)} \right) < Tr\left( {f\left( {\varvec{B}} \right)} \right)$$
(33)

Combining Eqs. (32) and (33), we can get:

$$Tr\left( {f\left( {\frac{1}{K}\sum\limits_{k = 1}^{K} {\left( {{\varvec{R}}_{s,k} + {\varvec{R}}_{n,k} } \right)} } \right)} \right) > Tr\left( {f\left( {\frac{1}{K}\sum\limits_{k = 1}^{K} {{\varvec{R}}_{n,k} } } \right)} \right)$$
(34)

Equation (34) shows that when the authorized user occupies the spectrum, the left end of the formula is greater than the right end. The process of averaging the covariance matrix of each segment can stabilize the inequality. If \(T_{MFD}\) is higher than the threshold \(\gamma_{MFD}\), it is assumed that the authorized user is occupying the spectrum. The detection expression of MFD algorithm is as follows:

$$H_{1} :\;\;\;T_{MFD} = Tr\left( {f\left( {\frac{1}{K}\sum\limits_{k = 1}^{K} {{\varvec{R}}_{x,k} } } \right)} \right) > \gamma_{MFD}$$
(35)

The monotonically increasing matrix function \(f\left( \right)\) used by the MFD algorithm will affect the performance of the entire algorithm, so it is called the detection algorithm based on matrix function. In addition, F. Lin pointed out that monotonically increasing matrix function \(f\left( {\varvec{X}} \right) = {\varvec{X}}\) can provide near optimal detection performance for MFD detection algorithm [13].
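With the choice \(f\left( {\varvec{X}} \right) = {\varvec{X}}\), the statistic in Eq. (35) reduces to the trace of the averaged segment covariances of Eq. (29). The Python sketch below assumes the sample vectors are stored as columns of an \(L \times N\) array and that any vectors left over when \(N\) is not a multiple of \(K\) are dropped; the function name `mfd_statistic` and the toy data are hypothetical.

```python
import numpy as np

def mfd_statistic(X, K):
    """T_MFD of Eq. (35) with f(X) = X: trace of the mean of K segment covariances."""
    N = X.shape[1]
    N_k = N // K                       # vectors per segment; any remainder is dropped
    segments = [X[:, k * N_k:(k + 1) * N_k] for k in range(K)]
    R_avg = sum(S @ S.T / N_k for S in segments) / K   # average of Eq. (29) over k
    return float(np.trace(R_avg))

X = np.array([[1.0, 2.0, 3.0, 4.0],
              [0.0, 0.0, 0.0, 0.0]])   # four toy sample vectors of length 2
t_mfd = mfd_statistic(X, K=2)
print(t_mfd)
```

For the toy data the two segment covariances have traces 2.5 and 12.5, so the statistic is 7.5. No eigendecomposition is needed for this choice of \(f\).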

3.4 Maximum-minimum eigenvalue ratio (MME) algorithm

The MME detection algorithm is based on the generalized likelihood ratio, does not require any prior information about the signal to be detected, and is suitable for spectrum sensing in satellite cognitive wireless networks. Under the framework of big data, the MME algorithm makes decisions based on the statistical characteristics of the sample subspace. First, it calculates the maximum and minimum eigenvalues of the sample covariance matrix, denoted as \(\lambda_{\max }\) and \(\lambda_{\min }\), respectively. If the sample subspace contains only noise, then ideally \(\lambda_{\max } /\lambda_{\min } = 1\). If it contains the signal to be detected, then \(\lambda_{\max } /\lambda_{\min } > 1\). From this property, the detection expression of the MME algorithm is:

$$H_{1} : \quad T_{MME} = \frac{{\lambda_{\max } }}{{\lambda_{\min } }} > \gamma_{MME}$$
(36)

where \(\gamma_{MME}\) is the threshold value.

When the prior information of the authorized user signal is unknown, the MME algorithm can use the information obtained in the signal sample subspace. The signal and noise are distinguished by the ratio of the maximum and minimum eigenvalues of the sample covariance matrix. Therefore, it can provide better detection performance in low signal-to-noise ratio (SNR) and complex electromagnetic environment.
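Eq. (36) translates directly into code. The following Python sketch assumes a real symmetric sample covariance matrix with a strictly positive smallest eigenvalue; the function name `mme_statistic` is hypothetical.

```python
import numpy as np

def mme_statistic(R):
    """T_MME of Eq. (36): largest over smallest eigenvalue of the symmetric matrix R."""
    lam = np.linalg.eigvalsh(R)        # eigenvalues in ascending order
    return float(lam[-1] / lam[0])

print(mme_statistic(np.eye(3)))                      # ideal noise-only covariance
print(mme_statistic(np.array([[2.0, 1.0],
                              [1.0, 2.0]])))         # correlated samples
```

The identity matrix gives a ratio of 1, while the correlated example has eigenvalues 1 and 3 and therefore a ratio of 3.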

3.5 Absolute value of covariance (CAV) algorithm

Neither the MFD nor the MME algorithm needs prior information about the authorized user signal. However, the MME algorithm requires the eigenvalues of the sample covariance matrix, and the MFD algorithm with a general matrix function \(f\left( \cdot \right)\) typically requires an eigendecomposition as well. When the dimension of the covariance matrix is large, the computational complexity is very high, and the detection time increases accordingly. The CAV algorithm has low complexity and does not need prior information. First, the covariance matrix \({\varvec{R}}_{x}\) of the current subspace \(\Gamma_{x,i}\) is calculated. Second, the variables \(T_{1}\) and \(T_{2}\) are used for detection.

$$T_{1} = \frac{1}{L}\sum\limits_{i = 1}^{L} {\sum\limits_{j = 1}^{L} {\left| {r_{ij} } \right|} }$$
(37)
$$T_{2} = \frac{1}{L}\sum\limits_{i = 1}^{L} {\left| {r_{ii} } \right|}$$
(38)

where \(r_{ij} , \, 1 \le i \le L, \, 1 \le j \le L\) are the elements in covariance matrix. The detection expression of CAV algorithm is:

$$H_{1} :\;\;\;\;T_{CAV} = \frac{{T_{1} }}{{T_{2} }} > \gamma_{CAV}$$
(39)
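Eqs. (37) to (39) only need sums of absolute values, which the short Python sketch below makes explicit. The function name `cav_statistic` and the example matrices are hypothetical.

```python
import numpy as np

def cav_statistic(R):
    """T_CAV of Eq. (39): T1 / T2 for the L x L covariance matrix R."""
    L = R.shape[0]
    t1 = np.sum(np.abs(R)) / L              # Eq. (37): all entries
    t2 = np.sum(np.abs(np.diag(R))) / L     # Eq. (38): diagonal entries only
    return float(t1 / t2)

print(cav_statistic(np.eye(3)))                       # no off-diagonal correlation
print(cav_statistic(np.array([[2.0, -1.0],
                              [-1.0, 2.0]])))         # correlated samples
```

A diagonal covariance gives \(T_{CAV} = 1\), and any nonzero off-diagonal entry pushes the ratio above 1 (1.5 in the second example).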

3.6 Eigenvalue exponential mean (EEM) algorithm

The spectrum sensing algorithms based on big data analysis theory essentially describe the statistical characteristics of the subspace in a way that is easy to quantify and compare under the framework of big data. In satellite cognitive radio networks, cognitive base stations and cognitive users often do not know the prior information of the authorized user signal. To better distinguish between signal and noise in this "blind detection" state, the spectrum sensing algorithm needs to make full use of the subspace containing the current received signal. On this basis, an eigenvalue exponential mean (EEM) algorithm is proposed to provide better detection performance without knowing the prior information of authorized user signals.

Similar to MME algorithm, the proposed EEM algorithm also uses the eigenvalues of the covariance matrix to distinguish the signal and noise. However, MME algorithm only uses the ratio of the maximum and minimum eigenvalues to distinguish. In the case of low SNR, the size of eigenvalues will be easily affected by noise, which will reduce the stability and performance of the algorithm.

The EEM algorithm uses all the eigenvalues \(\left\{ {\lambda_{i} \left( {{\varvec{R}}_{x} } \right)} \right\}\) to make the decision, which improves the robustness and the detection performance. Specifically, the EEM algorithm takes the exponential mean of the \(L\) eigenvalues of the sample covariance matrix as the decision statistic. The detection expression of the EEM algorithm is as follows:

$$H_{1} :\;\;\;\;T_{EEM} = \ln \left( {\frac{1}{L}\sum\limits_{i = 1}^{L} {e^{{\lambda_{i} }} } } \right) > \gamma_{{EE{\text{M}}}}$$
(40)

where \(\gamma_{{EE{\text{M}}}}\) is the threshold. If the sample subspace contains only noise and the noise variance is normalized to 1, then ideally all \(L\) eigenvalues equal 1 and \(T_{EEM} = \ln \left( {L e / L} \right) = 1\). If it contains the authorized signal, then \(T_{EEM} > 1\).
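A minimal Python sketch of Eq. (40) follows. It evaluates the logarithm of the mean exponential after subtracting the largest eigenvalue, a standard rewriting that leaves the value unchanged but avoids overflow for large eigenvalues; the function name `eem_statistic` is hypothetical, and the unit-variance noise case is modelled by the identity matrix.

```python
import numpy as np

def eem_statistic(R):
    """T_EEM of Eq. (40): log of the mean of exp(eigenvalue) for symmetric R."""
    lam = np.linalg.eigvalsh(R)
    m = lam.max()
    # ln(mean(exp(lam))) = m + ln(mean(exp(lam - m))); the shift avoids overflow
    return float(m + np.log(np.mean(np.exp(lam - m))))

print(eem_statistic(np.eye(4)))               # ideal unit-variance noise
print(eem_statistic(np.diag([1.0, 3.0])))     # one eigenvalue raised by a signal
```

The identity matrix returns exactly 1, in agreement with the noise-only value stated above, and raising any eigenvalue above 1 increases the statistic.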

4 Simulation results and analysis

4.1 Algorithm performance comparison

The false alarm probability is set to 1%, and the Rician factor K of the Rician channel is set to 6 dB. The big-data-based spectrum sensing algorithms discussed above are all applied to the satellite cognitive radio network, and their detection probabilities at different SNRs are compared in Fig. 2. To make the results statistically reliable, 2000 Monte Carlo trials are run at each SNR.

Fig. 2 Detection probability of the spectrum sensing algorithms versus SNR (false alarm probability 1%, 10 sensing nodes, Rician factor K = 6 dB)

The spectrum sensing algorithms based on big data analysis provide good sensing performance in the low SNR environment of −15 dB. The EC algorithm has the best detection performance and can accurately detect authorized signals at an SNR of −23 dB. In the actual sensing process, however, the prior information of authorized users is usually unavailable, so the EC algorithm is not applicable in practical engineering. The performance of the ETM algorithm is second only to that of the EC algorithm, because it can automatically extract the prior information. However, when the SNR is lower than −24 dB, the sensing node is dominated by noise, it is difficult to extract the authorized user signal from the subspace, and the detection probability is very low. As the SNR increases, the signal features of authorized users become easier to extract, and once the prior information has been extracted accurately, the detection probability improves greatly.

For the MFD, MME, EEM and CAV algorithms, which do not depend on prior information, the detection performance is lower than that of the former two algorithms. However, these algorithms do not require prior information and have lower complexity, so they are easier to implement in engineering and have more practical significance. Among these "blind detection" algorithms, the EEM algorithm has the highest detection probability. However, the EEM algorithm needs to calculate eigenvalues, which places higher requirements on the computing power of the fusion center. The MFD algorithm has the lowest detection probability when the SNR is higher than −21 dB, but it only needs to calculate the trace of the sample covariance matrix, so it is easier to implement.
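The Monte Carlo procedure behind such detection probability curves can be sketched in Python as follows. This is a simplified illustration and does not reproduce Fig. 2: it assumes a single sensing node, unit-variance real Gaussian noise, a sinusoidal authorized signal with random phase, no Rician fading, and an empirical threshold taken from noise-only trials. All function names and parameter values are hypothetical.

```python
import numpy as np

def eem_from_samples(x, L):
    """EEM statistic of Eq. (40) computed from overlapping length-L windows of x."""
    N = x.size - L + 1
    X = np.stack([x[i:i + L] for i in range(N)], axis=1)    # shape (L, N)
    lam = np.linalg.eigvalsh(X @ X.T / N)                   # Eq. (8) eigenvalues
    return np.log(np.mean(np.exp(lam)))

def detection_probability(snr_db, n_trials=200, L=8, n_samples=512,
                          alpha=0.05, seed=0):
    """Monte Carlo estimate of P_d at one SNR for an empirical P_f of about alpha."""
    rng = np.random.default_rng(seed)
    n = np.arange(n_samples)
    amp = np.sqrt(2.0 * 10.0 ** (snr_db / 10.0))   # sinusoid power amp^2 / 2 equals the SNR
    t_h0 = np.empty(n_trials)
    t_h1 = np.empty(n_trials)
    for t in range(n_trials):
        # H0: noise only
        t_h0[t] = eem_from_samples(rng.standard_normal(n_samples), L)
        # H1: assumed sinusoidal signal with random phase plus noise
        s = amp * np.cos(0.2 * np.pi * n + rng.uniform(0.0, 2.0 * np.pi))
        t_h1[t] = eem_from_samples(s + rng.standard_normal(n_samples), L)
    threshold = np.quantile(t_h0, 1.0 - alpha)     # set from the noise-only trials
    return float(np.mean(t_h1 > threshold))

for snr_db in (-40, -10, 0):
    print(snr_db, detection_probability(snr_db))
```

Swapping `eem_from_samples` for another statistic gives the corresponding curve under the same assumptions, which is how the algorithms can be compared at a common false alarm probability.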

4.2 Influence of channel conditions on algorithm performance

The Rician factor is increased to 10 dB, and other system parameters remain unchanged. For each SNR, the detection probability of the spectrum sensing algorithm based on big data analysis is simulated by Monte Carlo 2000 times, and the simulation result is shown in Fig. 3.

Fig. 3 Detection probability of the spectrum sensing algorithms versus SNR (false alarm probability 1%, 10 sensing nodes, Rician factor K = 10 dB)

Comparing Figs. 2 and 3, when the size of the subspace is unchanged, the detection probabilities of the various spectrum sensing algorithms follow similar trends. In addition, increasing the Rician factor improves the detection probability to a certain extent. Therefore, in a satellite cognitive wireless network, the sensing nodes should be located in open areas with good channel conditions, away from obstruction by trees and tall buildings.

On the other hand, increasing the Rician factor greatly improves the detection performance of the EC, ETM, MME and EEM algorithms, and enables the ETM algorithm to obtain the signal characteristics of authorized users from the sample subspace at a lower SNR. However, the detection performance of the MFD and CAV algorithms is not sensitive to changes in channel conditions.

4.3 Influence of sensing nodes number on algorithm performance

The number of sensing nodes in the satellite coverage area is set to 20, the false alarm probability to 1%, and the Rician factor to 10 dB. In each sensing process, every sensing node samples the signals of authorized satellite users 50 times, and the sampling frequency remains unchanged. Therefore, 128,000 data points are collected in each sensing process and are aggregated into a subspace at the fusion center. The detection probability curves are shown in Fig. 4.

Fig. 4 Detection probability of the spectrum sensing algorithms versus SNR (false alarm probability 1%, 20 sensing nodes, Rician factor K = 10 dB)

Comparing Figs. 3 and 4, the detection performance can be improved by increasing the number of sensing nodes under the same channel conditions. As the subspace becomes larger, it is easier for the fusion center to extract statistical features under the framework of big data, and to better distinguish between signal and noise.

To obtain better detection performance in satellite cognitive wireless networks, the number of sensing nodes and the sampling time can be increased. However, this increases the cost of data transmission between the sensing nodes and the fusion center, the computational load of the fusion center, and the sensing time. Therefore, a trade-off between sensing performance and system complexity must be made according to the actual situation.

5 Conclusion

In the context of satellite cognitive radio networks, traditional spectrum sensing algorithms based on big data analysis theory are studied, including the EC, ETM, MFD, MME and CAV algorithms. These algorithms exploit the wide coverage of the satellite network and the large number of sensing nodes: the sampled data of the sensing nodes are transmitted to the fusion center to form a sample subspace, and decisions are made according to different statistical rules of the sample data under the framework of big data. This paper proposes the EEM algorithm based on big data analysis theory, which uses all the eigenvalues of the sample covariance matrix to make the decision. The simulation results show that at SNRs above −18 dB, the different algorithms can provide good detection performance. The influence of channel conditions and of the number of sensing nodes on the spectrum sensing algorithms is also verified. Compared with the traditional blind algorithms, the EEM algorithm provides a higher detection probability without knowing the prior information of authorized users.