2.1 Introduction

The opportunistic spectrum access (OSA) model, also referred to as the interweave paradigm in [1] or spectrum overlay in [2], is probably the most appealing model for unlicensed/secondary users to access the radio spectrum. In this model, the secondary users (SUs) opportunistically access the spectrum bands of primary users (PUs) that are temporarily unused. Because it enables unlicensed use of the spectrum while guaranteeing the priority of licensed users, the OSA model has received great attention from both research and regulatory organizations.

By definition, before transmission, the SUs in the OSA model need to know the busy/idle status of the spectrum bands in which they are interested. With such knowledge, the SUs can access the unused spectrum bands of the PUs, i.e., the spectrum holes or spectrum white space, so that the PUs’ QoS is not degraded. As introduced in Chap. 1, such knowledge can be acquired using two approaches: a geolocation database and spectrum sensing. The former can be applied when the PUs’ spectrum usage is highly predictable [3, 4] and the PUs are willing to publicize their spectrum usage, possibly to improve spectrum utilization [5]. However, when the spectrum usage of the PUs is not predictable or the PUs are unwilling to share such information, spectrum sensing becomes the critical means of detecting available spectrum, which enables the operation of the OSA model.

Spectrum-sensing-based OSA design has been studied extensively in academia. One line of work has focused on improving the accuracy of spectrum sensing, while another has focused on the coordination of spectrum sensing and access, i.e., the sensing-access design. Because it is essentially a signal detection technique, spectrum sensing may produce incorrect results due to noise uncertainty and channel effects such as multipath fading and shadowing. The accuracy of spectrum sensing is, however, crucial to the detection of spectrum holes and the protection of PUs. Thus, many works have focused on the design of efficient detection algorithms or on collaboration among SUs to obtain diversity gain [6,7,8,9,10,11]. A detailed introduction to spectrum sensing techniques will be given in Chap. 3; in this chapter, we discuss the other important aspect of OSA design, namely the sensing-access design. The sensing-access structure of OSA implies that spectrum access depends largely on the results of spectrum sensing. Moreover, optimizing the performance of spectrum sensing and access may conflict with practical concerns such as the limited computational capability of the SUs, which gives rise to a tradeoff between spectrum sensing and access.

In Fig. 2.1, we illustrate the key functions of the physical (PHY) and medium access control (MAC) layers of CR networks (CRNs). In the PHY layer, spectrum sensing enables the SUs to detect the spectrum holes, while the access control optimizes the transceiver design with respect to the carrier frequency, the modulation and coding scheme, etc. In the MAC layer, there are two main functions: sensing scheduling and access scheduling. The former determines when, on which channel, for how long and how frequently spectrum sensing should be performed, while the latter governs the access of multiple users to the detected spectrum holes. A coordinator of the two functions, called the sensing-access coordinator, is established. In the following sections, we investigate three classic problems in the sensing-access design by first presenting their basic ideas and concerns, and then reviewing the existing literature on solving them.

Fig. 2.1 Key functions of the PHY and MAC layers in the OSA model

2.2 Sensing-Throughput Tradeoff

Due to the half-duplex operation of a transceiver, an SU cannot perform spectrum sensing and access at the same time. As a result, it has to alternate between sensing and access within a data frame. Assuming that spectrum sensing is performed periodically in each frame, the frame structure for the SU is illustrated in Fig. 2.2. Denote \(\tau \) as the spectrum sensing time and T as the frame length. The time left for potential spectrum access is then \(T-\tau \). Intuitively, with a longer sensing time, the accuracy of spectrum sensing improves and there is a higher chance that the status of the spectrum is correctly detected. However, this reduces the time left for spectrum access and thus affects the throughput of the SU. Therefore, there is a tradeoff between spectrum sensing and throughput. This sensing-throughput tradeoff problem is investigated in [12]. In the following, the basic formulation of the problem is first presented, followed by its extension to the case in which cooperative spectrum sensing is employed.

Fig. 2.2 Frame structure for periodic spectrum sensing

2.2.1 Basic Formulation

The performance of spectrum sensing is characterized by two metrics, namely, the probability of false alarm \(P_f\) (i.e., the probability of detecting the PU as being present when the PU is actually absent) and the probability of detection \(P_d\) (i.e., the probability of detecting the PU as being present when the PU is present). The decision whether to access the spectrum depends on the result of spectrum sensing. There are two scenarios in which the SU accesses the spectrum.

  • When the PU is not present and no false alarm is generated by spectrum sensing.

  • When the PU is present but is not detected by spectrum sensing.

The average throughput of the secondary network can be calculated by taking into consideration the achievable throughput for both scenarios

$$\begin{aligned} R = R_0 + R_1 \end{aligned}$$
(2.1)

where \(R_0\) is the throughput contributed by the first scenario and \(R_1\) is the throughput contributed by the second scenario. Denote \(P(\mathcal {H}_0)\) as the probability that the PU is absent. Denote \(C_0\) and \(C_1\) as the throughput of the SU when it transmits continuously in the first and second scenario, respectively. Then, \(R_0\) and \(R_1\) can be expressed as follows

$$\begin{aligned} R_0(\epsilon , \tau )= & {} P(\mathcal {H}_0)\frac{T-\tau }{T} C_0 (1-P_f(\epsilon , \tau ) )\end{aligned}$$
(2.2)
$$\begin{aligned} R_1(\epsilon , \tau )= & {} (1-P(\mathcal {H}_0))\frac{T-\tau }{T} C_1 (1-P_d(\epsilon , \tau ) ) \end{aligned}$$
(2.3)

where \(\epsilon \) is the threshold of energy detection for spectrum sensing. Since both the threshold \(\epsilon \) and the sensing time \(\tau \) affect the accuracy of spectrum sensing, \(P_f\) and \(P_d\) are functions of \((\epsilon , \tau )\) and so are \(R_0\) and \(R_1\).

Note that, unlike in the first scenario, in the second scenario the SU transmits in the presence of the PU. Hence, in general, we have \(C_0>C_1\). Furthermore, it is typically more beneficial to explore spectrum that is underutilized, for example, when \(P(\mathcal {H}_0) \ge 0.5\). In addition, a well-protected PU requires \(P_d\) close to 1, which makes the factor \(1-P_d\) in (2.3) small. Therefore, it can safely be assumed that \(R_0\) dominates the overall throughput R. Hence, \(R(\epsilon , \tau )\approx R_0(\epsilon , \tau )\).
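
As a rough numerical illustration of why \(R_0\) dominates, the following Python sketch evaluates (2.2) and (2.3) at a single operating point. Only \(P(\mathcal {H}_0)=0.8\) and \(T=100\) ms are taken from the example used later in this section; the values of \(C_0\), \(C_1\), \(\tau \), \(P_f\) and \(P_d\) are hypothetical and chosen only for illustration.

```python
def throughput_terms(p_idle, frame, tau, c0, c1, pf, pd):
    """Return (R_0, R_1) from Eqs. (2.2) and (2.3)."""
    access_fraction = (frame - tau) / frame
    r0 = p_idle * access_fraction * c0 * (1.0 - pf)
    r1 = (1.0 - p_idle) * access_fraction * c1 * (1.0 - pd)
    return r0, r1


# Assumed operating point: C_0 and C_1 (in bits/s/Hz), tau, P_f and P_d
# are hypothetical values, not taken from the text.
r0, r1 = throughput_terms(p_idle=0.8, frame=0.1, tau=0.01,
                          c0=6.66, c1=6.61, pf=0.1, pd=0.9)
share_of_r1 = r1 / (r0 + r1)
print(f"R0 = {r0:.3f}, R1 = {r1:.3f}, share of R1 = {share_of_r1:.1%}")
```

Under these assumptions \(R_1\) contributes only a few percent of the total, because it is scaled by both \(1-P(\mathcal {H}_0)\) and \(1-P_d\).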

The sensing-throughput tradeoff problem is to optimize the spectrum sensing parameters so as to maximize the achievable throughput of the SU, subject to the PU being sufficiently protected. Mathematically, the problem can be expressed as

$$\begin{aligned} \max \limits _{\epsilon ,\tau }&R(\epsilon ,\tau )\approx P(\mathcal {H}_0)\frac{T-\tau }{T} C_0 (1-P_f(\epsilon , \tau ) ) \end{aligned}$$
(2.4a)
$$\begin{aligned} \text {s.t.}&P_d(\epsilon ,\tau ) \ge \bar{P}_d \end{aligned}$$
(2.4b)
$$\begin{aligned}&0<\tau <T, \end{aligned}$$
(2.4c)

where \(\bar{P}_d\) is the target probability of detection. It has been proved in [12] that the optimum of the above problem is attained when the constraint (2.4b) is satisfied with equality.

Note that the above formulation depends heavily on the two performance metrics of spectrum sensing, i.e., \(P_d\) and \(P_f\). The former indicates the level of protection of the PU, since a higher probability of detection reduces the chance that the SU accesses spectrum over which the PU is operating. The latter is related to the amount of transmission opportunities for the SU, since the lower the probability of false alarm, the more spectrum the SU can reuse. These two metrics, in the form of \(P_d\) and \(1-P_f\), conflict with each other. For example, for a given detection scheme, an increase in the probability of detection improves the protection of the PU; however, this is achieved at the expense of a higher probability of false alarm, which reduces the spectrum access opportunities of the SU.

By using energy detection and setting \(P_d=\bar{P}_d\), the probability of false alarm can be expressed as [12]

$$\begin{aligned} P_f (\tau ) = \mathcal {Q}\left( \sqrt{2\gamma +1} \mathcal {Q}^{-1}(\bar{P}_d) + \sqrt{\tau f_s}\gamma \right) \end{aligned}$$
(2.5)

where \(\gamma \) is the received signal-to-noise ratio (SNR) of the primary signal, \(\tau \) is the sensing time and \(f_s\) is the sampling rate.
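
The expression in (2.5) can be evaluated with a few lines of Python. The sketch below uses only the standard library; the SNR of \(-20\) dB and the target \(\bar{P}_d=0.9\) are assumed values for illustration, and the function names are hypothetical.

```python
import math
from statistics import NormalDist


def q_function(x):
    """Gaussian tail probability Q(x) = P(Z > x) for a standard normal Z."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def q_inverse(p):
    """Inverse of Q: the x for which Q(x) = p, with 0 < p < 1."""
    return NormalDist().inv_cdf(1.0 - p)


def false_alarm_probability(tau, snr, fs, pd_target):
    """Eq. (2.5): P_f for sensing time tau (s), linear SNR and sampling rate fs (Hz)."""
    arg = math.sqrt(2.0 * snr + 1.0) * q_inverse(pd_target) + math.sqrt(tau * fs) * snr
    return q_function(arg)


fs = 6e6                      # sampling rate used in the text's example
snr = 10 ** (-20 / 10)        # assumed received SNR of -20 dB
pd_target = 0.9               # assumed target probability of detection
sensing_times = (1e-3, 5e-3, 20e-3)
pf_values = [false_alarm_probability(t, snr, fs, pd_target) for t in sensing_times]
print([round(pf, 4) for pf in pf_values])
```

With these inputs the computed \(P_f\) falls as \(\tau \) grows, which is the behavior the tradeoff relies on.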

Fig. 2.3 Probability of false alarm \(P_f\) versus sensing time \(\tau \) under different received SNRs \(\gamma \) of the primary signal

Then the above optimization problem reduces to an optimization problem with only a single variable \(\tau \) with the objective function given as follows

$$\begin{aligned} {R}(\tau ) \approx C_0 P(\mathcal {H}_0)\left( 1-\frac{\tau }{T}\right) \left( 1-\mathcal {Q}\left( \sqrt{2\gamma +1} \mathcal {Q}^{-1}\left( \bar{P}_d\right) +\sqrt{\tau f_s}\gamma \right) \right) \end{aligned}$$
(2.6)

It has been proved in [12] that, under certain assumptions on the form and distribution of the noise and of the primary and secondary signals, there exists an optimal sensing time that maximizes the achievable throughput of the secondary network.

Consider the scenario in which \(P(\mathcal {H}_0)=0.8\), \(T=100\) ms and \(f_s=6\) MHz. The probability of false alarm \(P_f\) and the normalized achievable throughput \(R/(C_0 P(\mathcal {H}_0))\) of the secondary network are plotted against the spectrum sensing time \(\tau \) in Figs. 2.3 and 2.4, respectively, under different received SNRs \(\gamma \) of the primary signal. As expected, it can be seen from Fig. 2.3 that with a longer sensing time, the quality of spectrum sensing improves and thus the probability of false alarm decreases. However, this leads to a reduction in the available spectrum access time. Overall, it can be observed from Fig. 2.4 that there is an optimal sensing time which maximizes the achievable throughput. Furthermore, when \(\gamma \) decreases, which indicates a more stringent spectrum sensing requirement, the SU has to devote more time to spectrum sensing in order to protect the PU, which leads to an increased optimal sensing time and a reduced maximum achievable throughput.
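
The single-variable objective in (2.6) is easy to explore numerically. The following Python sketch locates the best sensing time by a plain grid search, which is a stand-in for illustration and not the analysis of [12]. \(T\) and \(f_s\) follow the scenario above; the target \(\bar{P}_d=0.9\) and the two SNR values are assumptions.

```python
import math
from statistics import NormalDist


def q_function(x):
    """Gaussian tail probability Q(x)."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def normalized_throughput(tau, frame, snr, fs, pd_target):
    """Eq. (2.6) divided by C_0 * P(H_0)."""
    q_inv = NormalDist().inv_cdf(1.0 - pd_target)
    pf = q_function(math.sqrt(2.0 * snr + 1.0) * q_inv + math.sqrt(tau * fs) * snr)
    return (1.0 - tau / frame) * (1.0 - pf)


def best_sensing_time(frame, snr, fs, pd_target, steps=2000):
    """Grid search over 0 < tau < T; returns (tau, normalized throughput)."""
    grid = (frame * i / steps for i in range(1, steps))
    return max(((tau, normalized_throughput(tau, frame, snr, fs, pd_target))
                for tau in grid), key=lambda pair: pair[1])


frame, fs = 0.1, 6e6          # T = 100 ms and f_s = 6 MHz, as in the text
pd_target = 0.9               # assumed target probability of detection
results = {snr_db: best_sensing_time(frame, 10 ** (snr_db / 10), fs, pd_target)
           for snr_db in (-15, -20)}   # assumed SNR values in dB
for snr_db, (tau, value) in results.items():
    print(f"SNR {snr_db} dB: tau = {tau * 1e3:.2f} ms, normalized throughput = {value:.3f}")
```

Under these assumptions the lower SNR yields a longer optimal sensing time and a lower peak throughput, in line with the trend described for Fig. 2.4.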

Fig. 2.4 Normalized achievable throughput \(R/(C_0 P(\mathcal {H}_0))\) versus sensing time \(\tau \) under different received SNRs \(\gamma \) of the primary signal

2.2.2 Cooperative Spectrum Sensing

The above formulation considers the case in which the result of spectrum sensing is determined by a single SU. When there are multiple nearby SUs, spectrum sensing can be improved by combining the sensing results of these users. Thus, the quality of spectrum sensing depends not only on the detection threshold \(\epsilon \) and the sensing time \(\tau \) but also on the way in which the individual sensing results are combined, i.e., the fusion rule. In [13], the basic formulation of the sensing-throughput problem is extended to the case in which cooperative spectrum sensing is used.

Assume that there are N SUs participating in cooperative spectrum sensing and reporting their individual sensing results to the fusion center. Suppose that the k-out-of-N fusion rule [11] is used, under which the channel is declared busy if at least k out of the N users detect it as busy. The quality of spectrum sensing thus depends on the parameter k of the fusion rule. The overall probability of false alarm and the overall probability of detection are given by

$$\begin{aligned} \mathbf {P}_f (\epsilon , \tau , k) = \sum _{i=k}^N \left( {\begin{array}{c}N\\ i\end{array}}\right) P_f(\epsilon , \tau )^i \left( 1-P_f(\epsilon , \tau )\right) ^{N-i} \end{aligned}$$
(2.7)

and

$$\begin{aligned} \mathbf {P}_d (\epsilon , \tau , k) = \sum _{i=k}^N \left( {\begin{array}{c}N\\ i\end{array}}\right) P_d(\epsilon , \tau )^i \left( 1-P_d(\epsilon , \tau )\right) ^{N-i}, \end{aligned}$$
(2.8)

respectively.
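
The binomial tail in (2.7) and (2.8) can be computed directly. The short Python sketch below tabulates the overall probabilities for each k; the number of users and the per-user probabilities are assumed values for illustration, and independent, identical sensing performance across users is assumed, as in the formulas.

```python
from math import comb


def k_out_of_n(p, n, k):
    """Eqs. (2.7)/(2.8): probability that at least k of n users report 'busy',
    when each user does so independently with probability p."""
    return sum(comb(n, i) * p**i * (1.0 - p) ** (n - i) for i in range(k, n + 1))


n_users = 5                        # assumed number of cooperating SUs
pf_single, pd_single = 0.1, 0.9    # assumed per-user P_f and P_d
table = {k: (k_out_of_n(pf_single, n_users, k), k_out_of_n(pd_single, n_users, k))
         for k in range(1, n_users + 1)}
for k, (pf_all, pd_all) in table.items():
    print(f"k = {k}: overall P_f = {pf_all:.5f}, overall P_d = {pd_all:.5f}")
```

The two extremes are the OR rule (k = 1) and the AND rule (k = N); raising k lowers both the overall false alarm and the overall detection probability, which is why k becomes a design variable.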

Then the basic formulation in (2.4) can be revised to the following problem

$$\begin{aligned} \max \limits _{\epsilon ,\tau ,k}&R(\epsilon ,\tau ,k)\approx P(\mathcal {H}_0)\frac{T-\tau }{T} C_0 \left( 1-\mathbf {P}_f(\epsilon , \tau ,k)\right) \end{aligned}$$
(2.9a)
$$\begin{aligned} \text {s.t.}&\mathbf {P}_d(\epsilon ,\tau ,k) \ge \bar{P}_d \end{aligned}$$
(2.9b)
$$\begin{aligned}&0<\tau <T \end{aligned}$$
(2.9c)
$$\begin{aligned}&0 \le k \le N. \end{aligned}$$
(2.9d)

Similar to the basic formulation, it is proved in [14] that optimality is achieved when (2.9b) is satisfied with equality. Then, for any fixed value of k, the value of \(P_d(\epsilon , \tau )\) for the individual SU, denoted by \(\bar{P}_d\), that satisfies (2.9b) with equality can be found. For any given value of \(\bar{P}_d\), \(P_f\) is related to \(\bar{P}_d\) by (2.5). The above optimization problem can then be reduced to an optimization problem in only two variables \((\tau , k)\). In [13], an iterative algorithm is proposed to compute the optimal value of \((\tau , k)\).
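
The reduction to the two variables \((\tau , k)\) can be sketched in Python as follows. This is an exhaustive grid search written for illustration, not the iterative algorithm of [13]. For each k, the per-user detection probability is obtained by bisection so that (2.9b) holds with equality, and (2.5) and (2.7) then give the overall false alarm probability. The values \(N=5\), an SNR of \(-20\) dB and a target of 0.9 are assumptions, reporting overhead is ignored, and k = 0 is skipped because it would declare the channel busy regardless of the reports.

```python
import math
from math import comb
from statistics import NormalDist


def q_function(x):
    """Gaussian tail probability Q(x)."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def k_out_of_n(p, n, k):
    """Eqs. (2.7)/(2.8): probability that at least k of n users report 'busy'."""
    return sum(comb(n, i) * p**i * (1.0 - p) ** (n - i) for i in range(k, n + 1))


def individual_pd(target, n, k):
    """Per-user P_d for which Eq. (2.8) meets the target with equality (bisection)."""
    lo, hi = 0.0, 1.0
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if k_out_of_n(mid, n, k) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def exhaustive_search(n, frame, snr, fs, target, steps=400):
    """Return (normalized throughput, tau, k) maximizing Eq. (2.9a) on a grid."""
    best = (-1.0, None, None)
    for k in range(1, n + 1):
        q_inv = NormalDist().inv_cdf(1.0 - individual_pd(target, n, k))
        for i in range(1, steps):
            tau = frame * i / steps
            pf = q_function(math.sqrt(2.0 * snr + 1.0) * q_inv + math.sqrt(tau * fs) * snr)
            value = (1.0 - tau / frame) * (1.0 - k_out_of_n(pf, n, k))
            if value > best[0]:
                best = (value, tau, k)
    return best


frame, fs = 0.1, 6e6          # T and f_s from the single-user example
snr = 10 ** (-20 / 10)        # assumed received SNR of -20 dB
target = 0.9                  # assumed overall target probability of detection
single = exhaustive_search(1, frame, snr, fs, target)
cooperative = exhaustive_search(5, frame, snr, fs, target)   # N = 5 is assumed
print("single user :", single)
print("cooperative :", cooperative)
```

Comparing the two results shows how cooperation can shorten the optimal sensing time for the same overall detection target.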

2.3 Spectrum Sensing Scheduling

In the sensing-throughput tradeoff problem, the spectrum sensing time is optimized for a fixed frame duration. The above formulation implicitly assumes that the status of the PU remains unchanged throughout the entire frame. In other words, the SU has to be synchronized with the PU’s frame. This may not be easy to achieve if the PU refuses to provide such synchronization information. In this case, for a periodic spectrum sensing scheme, the duration of the frame, which determines how frequently spectrum sensing is scheduled, also affects the achievable throughput of the secondary network.

Intuitively, with a fixed sensing time, the longer the frame duration, the longer the effective transmission time. This potentially leads to higher throughput. However, when the frame duration is long, there is a higher chance that the PU’s status will change during an SU’s transmission. This may result in a collision in the middle of the secondary transmission if the primary user becomes active, in which case the throughput of the SU will suffer. Therefore, the duration of the frame needs to be optimized to balance the tradeoff between PU protection and SU performance. Such a problem is investigated in [15] and is presented in the following.

As mentioned in Sect. 2.2.1, there are two scenarios when the SU accesses the spectrum. Similar to the treatment in the sensing-throughput tradeoff problem, only the achievable throughput of the first scenario is considered since it is the dominating factor.

Fig. 2.5 The considered scenario

Consider the first scenario, in which the primary user is not active during spectrum sensing, as shown in Fig. 2.5. Assume that the primary user follows an exponential on-off traffic model in which the durations of the active and inactive periods are exponentially distributed with mean durations \(\mu _1\) and \(\mu _0\), respectively. Due to the memoryless property of the exponential distribution, the end of the sensing slot can be taken as the starting time \(t=0\) without loss of generality. Denote the instant at which the primary user becomes active by t. Then the duration of the collision is a random variable, which can be expressed as

$$\begin{aligned} x(t) = \left\{ \begin{array}{ll} T-\tau -t, &{} 0\le t\le T-\tau \\ 0, &{} t>T-\tau \end{array}\right. \end{aligned}$$
(2.10)

Based on this, the average time that collision occurs in the frame can be calculated as follows

$$\begin{aligned} \bar{x} =\mathbb {E}\{x(t)\}= & {} \int _0^{T-\tau } (T-\tau -t) \frac{1}{\mu _0} \exp \left( -\frac{t}{\mu _0}\right) dt \end{aligned}$$
(2.11)
$$\begin{aligned}= & {} T-\tau -\mu _0 \left( 1-\exp \left( -\frac{T-\tau }{\mu _0}\right) \right) \end{aligned}$$
(2.12)

Then the normalized achievable throughput (normalized by \(P(\mathcal {H}_0)(1-P_f)C_0\)) of the SU in this scenario is

$$\begin{aligned} \tilde{R}(T) = \frac{T-\tau -\bar{x}}{T}=\frac{\mu _0}{T}\left( 1-\exp \left( -\frac{T-\tau }{\mu _0}\right) \right) \end{aligned}$$
(2.13)

Fig. 2.6 The normalized achievable throughput and the PU’s collision probability at different frame durations T

Next, the collision probability for the PU will be derived. The average collision time within each active period of the primary user can be calculated as

$$\begin{aligned} \bar{y}=\frac{\bar{x}}{\text {Pr}\{0\le t\le T-\tau \}} = \frac{\bar{x}}{1-\exp \left( -\frac{T-\tau }{\mu _0}\right) } \end{aligned}$$
(2.14)

Then the collision probability can be expressed as

$$\begin{aligned} P_c^p(T) = \frac{\bar{y}}{\mu _1} = \frac{1}{\mu _1}\left( \frac{T-\tau }{1-\exp \left( -\frac{T-\tau }{\mu _0}\right) }-\mu _0\right) \end{aligned}$$
(2.15)
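
The closed forms (2.12), (2.13) and (2.15) can be checked numerically. The Python sketch below compares (2.12) with a midpoint-rule evaluation of the integral in (2.11) and evaluates the throughput and collision probability. The values of \(\mu _0\), \(\mu _1\) and \(\tau \) are those of the numerical example given later in this section; the frame duration of 100 ms is an assumption, and all times are in milliseconds.

```python
import math


def mean_collision_time(frame, tau, mu0):
    """Eq. (2.12): expected collision time per frame."""
    a = frame - tau
    return a - mu0 * (1.0 - math.exp(-a / mu0))


def mean_collision_time_numeric(frame, tau, mu0, steps=20000):
    """Midpoint-rule evaluation of the integral in Eq. (2.11)."""
    a = frame - tau
    dt = a / steps
    total = 0.0
    for i in range(steps):
        t = (i + 0.5) * dt
        total += (a - t) * math.exp(-t / mu0) / mu0 * dt
    return total


def normalized_throughput(frame, tau, mu0):
    """Eq. (2.13)."""
    return mu0 / frame * (1.0 - math.exp(-(frame - tau) / mu0))


def collision_probability(frame, tau, mu0, mu1):
    """Eq. (2.15)."""
    a = frame - tau
    return (a / (1.0 - math.exp(-a / mu0)) - mu0) / mu1


mu0, mu1, tau = 650.0, 352.0, 1.0   # ms, from the text's example
frame = 100.0                        # ms, assumed frame duration
x_closed = mean_collision_time(frame, tau, mu0)
x_numeric = mean_collision_time_numeric(frame, tau, mu0)
print(f"mean collision time: closed form {x_closed:.6f} ms, numeric {x_numeric:.6f} ms")
print(f"normalized throughput: {normalized_throughput(frame, tau, mu0):.4f}")
print(f"collision probability: {collision_probability(frame, tau, mu0, mu1):.4f}")
```

The agreement between the two evaluations of \(\bar{x}\) confirms the integration step from (2.11) to (2.12).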

The objective is to find the frame duration that maximizes the normalized achievable throughput, subject to the collision probability of the primary user being kept below a limit. Mathematically, it is expressed as

$$\begin{aligned} \max \limits _{T}&\tilde{R}(T)=\frac{\mu _0}{T}\left( 1-\exp \left( -\frac{T-\tau }{\mu _0}\right) \right) \end{aligned}$$
(2.16)
$$\begin{aligned} \text {s.t.}&P_c^p(T) \le \bar{P}_c^p \end{aligned}$$
(2.17)
$$\begin{aligned}&T>\tau \end{aligned}$$
(2.18)

Setting the derivative of \(\tilde{R}(T)\) to zero, the stationary point of the objective function can be found as

$$\begin{aligned} T_o = -\mu _0 \left( 1+\mathcal {W}_{-1} \left( -\exp \left( -\frac{\mu _0+\tau }{\mu _0}\right) \right) \right) \end{aligned}$$
(2.19)

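where \(\mathcal {W}_{-1}(x)\) represents the negative branch of the Lambert W function, which solves the equation \(w\exp (w)=x\) for \(w<-1\). If \(P_c^p(T_o)\le \bar{P}_c^p\) and \(T_o>\tau \), then \(T_o\) is the optimal frame duration. Otherwise, since \(\tilde{R}(T)\) is increasing for \(\tau<T<T_o\) and \(P_c^p(T)\) is increasing in T, the optimum lies at the largest T that satisfies (2.17).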

Consider the scenario with the average inactive duration of \(\mu _0 = 650\) ms, the average active duration of \(\mu _1 = 352\) ms, the spectrum sensing time of \(\tau =1\) ms and the target collision probability of \(\bar{P}_c^p=0.1\). Figure 2.6 shows the normalized achievable throughput and the PU’s collision probability, respectively, with respect to the frame duration. It can be seen from the figure that in this scenario there is a unique frame duration that maximizes the normalized achievable throughput and at the same time satisfies PU’s collision probability constraint.
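
The stationary point (2.19) can be evaluated for this scenario with a short Python sketch. To stay within the standard library, the \(\mathcal {W}_{-1}\) branch is computed here by bisection on \(w\exp (w)=x\); this helper is an illustrative substitute for a library routine, and the function names are hypothetical. All times are in milliseconds.

```python
import math


def lambert_w_minus1(x, iterations=200):
    """Branch w <= -1 of w*exp(w) = x for -1/e < x < 0, found by bisection.
    On this branch w*exp(w) decreases from 0 (w -> -inf) to -1/e (w = -1)."""
    hi = -1.0
    lo = -2.0
    while lo * math.exp(lo) < x:      # widen until the root is bracketed
        lo *= 2.0
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if mid * math.exp(mid) < x:   # mid lies to the right of the root
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


def optimal_frame_duration(mu0, tau):
    """Eq. (2.19)."""
    x = -math.exp(-(mu0 + tau) / mu0)
    return -mu0 * (1.0 + lambert_w_minus1(x))


def normalized_throughput(frame, tau, mu0):
    """Eq. (2.13)."""
    return mu0 / frame * (1.0 - math.exp(-(frame - tau) / mu0))


def collision_probability(frame, tau, mu0, mu1):
    """Eq. (2.15)."""
    a = frame - tau
    return (a / (1.0 - math.exp(-a / mu0)) - mu0) / mu1


mu0, mu1, tau, pc_limit = 650.0, 352.0, 1.0, 0.1   # values from the text
t_opt = optimal_frame_duration(mu0, tau)
pc_at_opt = collision_probability(t_opt, tau, mu0, mu1)
print(f"T_o = {t_opt:.2f} ms, throughput = {normalized_throughput(t_opt, tau, mu0):.4f}, "
      f"collision probability = {pc_at_opt:.4f}")
```

For these parameters the stationary point satisfies the collision constraint, so it is also the constrained optimum.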

2.4 Sequential Spectrum Sensing

Periodic spectrum sensing has been considered in the preceding two sections. In such a sensing framework, if a channel is sensed to be busy, the SU has to wait until the next frame to sense the same channel or another channel in order to identify a spectrum opportunity. This can delay access to the spectrum. An approach that is fundamentally different from periodic spectrum sensing is sequential spectrum sensing. In this framework, the SU sequentially senses a number of channels, without any additional waiting period in between, before it decides which channel to transmit over. The SU can thus dynamically determine how many channels should be sensed before a transmission. This approach allows the SU to exploit the diversity in occupancy among different licensed channels. Hence, if one channel is sensed to be busy, the SU can quickly identify a spectrum opportunity by continuing to sense other channels. Furthermore, it allows the SU to exploit the diversity in the secondary channel fading statistics, so that the SU can take advantage of a better channel to improve its own performance.

Clearly, when more channels are sensed, there is a higher chance of identifying a channel with higher throughput. However, this may result in more energy and time being spent on spectrum sensing. A balance therefore has to be struck between throughput and energy consumption. In [16], the sensing-access design for sequential spectrum sensing is investigated from an energy-efficiency perspective. In particular, the paper designs three components to maximize the overall energy efficiency of the entire sequential spectrum sensing process: the sensing policy, which determines when to stop sensing and start transmission; the access policy, which determines how much power is used for transmission; and the sensing order, which determines which channel to sense next if the current channel is not used for transmission. In the following, the energy-efficient sensing-access design with a fixed sensing order is first described and then extended to the case in which the sensing order is optimized.

2.4.1 Given Sensing Order

In [16], the authors consider a sequential spectrum sensing framework in which the SU can sequentially sense up to K channels, each of bandwidth B. An example of the sequential channel sensing process, with the sensing order given by the logical indices of the channels, is illustrated in Fig. 2.7. For each channel, e.g., channel k, the SU performs spectrum sensing to find out whether channel k is busy or idle, i.e., the status \(\delta _k\) of channel k, with \(\delta _k=1\) and \(\delta _k=0\) indicating that channel k is busy and idle, respectively. If channel k is sensed to be busy, the SU continues to sense the next channel. If channel k is sensed to be idle, the SU performs channel estimation to determine the channel gain \(h_k\) and then decides whether to select a power level and transmit over this channel for a period of \(T_k\) or to continue to sense the next channel.

Fig. 2.7 An illustration of the sequential spectrum sensing process in [16]

Under such a sequential spectrum sensing framework, a decision has to be made after sensing each channel until the SU decides to access a channel. At most K decisions have to be made. Such a process can be modeled as a K-stage stochastic sequential decision-making problem, which consists of the following basic components:

  • State: Due to the imperfection of spectrum sensing, only an observation \(\hat{\delta }_k\) about the true PU channel status \(\delta _k\) is available. The system state in this case is characterized by \(s_k = (\hat{\delta }_k, h_k)\). It is assumed that \(h_k = 0\) when \(\hat{\delta }_k = 1\) since channel estimation will not be carried out in this case. An additional state \(s_k = \mathcal {T}\) is introduced to denote that the sequential spectrum sensing process has been terminated once the transmission has started.

  • Decision: At each stage k, after observing the system state \(s_k\), a decision \(u_k\) has to be made. When channel k is sensed to be busy, i.e., \(s_k = (1,0)\), the only decision available is \(u_k = \mathcal {C}\), where \(\mathcal {C}\) denotes continuing to sense the next channel. However, when channel k is sensed to be idle, the SU has to decide whether to continue to sense the next channel, i.e., \(u_k =\mathcal {C}\) or choose a transmit power level, i.e., \(u_k=p_t\) to transmit, the latter of which leads to the termination state, i.e., \(s_{k+1}=\cdots =s_K=\mathcal {T}\).

  • Cost functions: Since energy efficiency is the ratio between the throughput (i.e., average number of bits transmitted) and the average energy consumption, two cost functions are defined, i.e., one for the throughput and the other for the energy consumption. First, some notations are introduced. Denote \(r_k^0\) and \(r_k^1\) as the number of bits that can be transmitted over channel k when it is truly idle and truly busy, respectively. Furthermore,

    $$\begin{aligned} r_k^0 = T_kB\log _2\left( 1+{SNR_k}/{\Gamma } \right) \end{aligned}$$
    (2.20)

    where \(SNR_k\) represents the SNR received at the SU-Rx, and \(\Gamma \) is considered as the SNR gap to channel capacity. The received SNR can be further defined as

    $$\begin{aligned} SNR_k(h_k,p_t) = \frac{\rho _k h_k p_t}{\iota \sigma ^2} \end{aligned}$$
    (2.21)

    where \(\rho _k\) captures the propagation loss, \(\iota \) is the link margin compensating the hardware process variation and imperfection, and \(\sigma ^2\) is the noise power at the receiver front end. Moreover, denote \(\theta _k\) as the probability that channel k is idle, i.e., \(\theta _k=\text {Pr}(\delta _k=0)\), \(P_{f,k}\) as the probability of false alarm for channel k, \(\bar{P}_{d,k}\) as the target probability of detection for channel k. When channel k is sensed to be busy, the throughput is \( g_k^R (s_k,u_k) = 0\). Otherwise, it is defined as

    $$\begin{aligned} g_k^R (s_k,u_k) = \mathbb {E}_{\delta _k|\hat{\delta }_k}[r_k] = \omega _k r_k^0 + (1-\omega _k)r_k^1 \approx \omega _k r_k^0 \end{aligned}$$
    (2.22)

    where \(\omega _k = \frac{\theta _k(1-{P}_{f,k})}{\theta _k(1-{P}_{f,k}) + (1-\theta _k)(1-\bar{P}_{d,k})}\) is the posterior probability of \(\delta _k = 0\) given \(\hat{\delta }_k = 0\). The approximation is due to the fact that \(\bar{P}_{d,k}\approx 1\) and \(r_k^0\ge r_k^1 \) [12]. The energy consumption at stage k is defined as

    $$\begin{aligned} g_k^E (s_k,u_k) = \left\{ \begin{array}{ll} 0, &{} \text{ if } s_k = \mathcal {T}\\ \tau p_s + T_k(\alpha p_t+p_c) , &{} \text{ if } s_k \ne \mathcal {T} \text{ and } u_k = p_t \\ \tau p_s,&{} \text{ otherwise } \end{array}\right. \end{aligned}$$
    (2.23)

    where \(p_s\) is the sensing power, \(p_c\) is the power consumed in various transceiver electronic circuits excluding the power amplifier (PA), \(\alpha = \xi /\zeta \), \(\xi \) is the peak-to-average ratio of the PA, and \(\zeta \) is the drain efficiency of the PA.
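
The per-stage quantities defined in the list above can be written down compactly. The Python sketch below implements \(\omega _k\), (2.20), (2.21) and (2.23) as plain functions with hypothetical names. The idle probability, detection target, power and timing values follow the numerical example given later in this section, while the false alarm probability of 0.1 and the sensing time of 10 ms are assumptions.

```python
import math


def posterior_idle(theta, pf, pd_target):
    """omega_k: probability that the channel is truly idle given it was sensed idle."""
    sensed_idle_and_idle = theta * (1.0 - pf)
    sensed_idle_and_busy = (1.0 - theta) * (1.0 - pd_target)
    return sensed_idle_and_idle / (sensed_idle_and_idle + sensed_idle_and_busy)


def received_snr(rho, h, p_t, link_margin, noise_power):
    """Eq. (2.21), with all quantities in linear scale."""
    return rho * h * p_t / (link_margin * noise_power)


def bits_if_idle(t_k, bandwidth, snr, gap):
    """Eq. (2.20): bits delivered over a truly idle channel."""
    return t_k * bandwidth * math.log2(1.0 + snr / gap)


def stage_energy(terminated, transmit_power, tau, p_s, t_k, alpha, p_c):
    """Eq. (2.23); transmit_power is None when the decision is to continue sensing."""
    if terminated:
        return 0.0
    if transmit_power is not None:
        return tau * p_s + t_k * (alpha * transmit_power + p_c)
    return tau * p_s


omega = posterior_idle(theta=0.8, pf=0.1, pd_target=0.9)    # pf = 0.1 is assumed
alpha = 10 ** (6 / 10) / 0.35                               # PAR of 6 dB, drain efficiency 0.35
tau, p_s, t_k, p_c = 10e-3, 0.110, 0.1, 0.210               # tau = 10 ms is assumed
energy_continue = stage_energy(False, None, tau, p_s, t_k, alpha, p_c)
energy_transmit = stage_energy(False, 0.05, tau, p_s, t_k, alpha, p_c)
print(f"omega = {omega:.4f}, sensing only = {energy_continue:.5f} J, "
      f"sensing plus transmission = {energy_transmit:.5f} J")
```

The comparison of the two energy values illustrates that, for these numbers, a transmission stage costs far more energy than a sensing-only stage.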

The objective of the energy-efficient sequential spectrum sensing is to find a sequence of functions \(\phi =\{\mu _1(s_1),\ldots , \mu _K(s_K)\}\) mapping each state \(s_k\) into a control \(u_k=\mu _k(s_k)\), such that the energy efficiency of the entire sequential spectrum sensing process is maximized. Mathematically, this can be expressed as

$$\begin{aligned} \max \limits _{\phi }~ \eta _\phi = \frac{\mathbb {E}\left\{ \sum _{k=1}^K g_k^R \left( s_k,\mu _k(s_k)\right) \right\} }{\mathbb {E}\left\{ \sum _{k=1}^K g_k^E (s_k,\mu _k(s_k))\right\} } \end{aligned}$$
(2.24)

where the expectation is taken over \(s_k, k = 1, \ldots , K\).

The above problem is very difficult to solve in its current form, as its objective is a ratio of two additive cost functions. To tackle it, a new sequential decision-making problem is formed with the states and decisions defined as above, but with the cost function at stage k, \(k=1,\ldots ,K\), which is more precisely regarded as a reward function, defined as follows

$$\begin{aligned} G_k(s_k,u_k;\lambda )= & {} g_k^R (s_k,u_k)-\lambda g_k^E (s_k,u_k)\nonumber \\= & {} \left\{ \begin{array}{ll} 0, &{} \text{ if } s_k = \mathcal {T}\\ \mathcal {F}_k(h_k,p_t,\lambda )-\lambda \tau p_s, &{} \text{ if } s_k\ne \mathcal {T} \text{ and } u_k = p_t\\ -\lambda \tau p_s,&{} \text{ otherwise } \end{array}\right. ~~~ \end{aligned}$$
(2.25)

where \(\lambda \ge 0\) is a parameter which can be considered as the price per unit of energy consumption, and

$$\begin{aligned} \mathcal {F}_k(h,p,\lambda ) = \omega _k T_k B\log _2\left( 1+SNR_k(h,p)/\Gamma \right) -\lambda T_k (\alpha p+p_c) \end{aligned}$$
(2.26)

For a given value of \(\lambda \), the objective of the new sequential decision-making problem is to find a sequence of functions, \(\phi (\lambda ) = \{\mu _1(s_1;\lambda ),\ldots ,\mu _K(s_K;\lambda )\}\), mapping each \((s_k; \lambda )\) into a control \(u_k = \mu _k(s_k;\lambda )\) to maximize the expected long-term reward. Mathematically, this can be expressed as

$$\begin{aligned} \max \limits _{\phi (\lambda )} ~J_{\phi (\lambda )}( \lambda ) = \mathbb {E}\left\{ \sum _{k=1}^K G_k(s_k,\mu _k(s_k,\lambda );\lambda )\right\} \end{aligned}$$
(2.27)

Equation (2.27) is parameterized by \(\lambda \). Clearly, for different values of \(\lambda \), we have different parametric formulations and thus different resulting optimal policies. Each of the parametric formulations in (2.27) can be considered as an energy-aware design of the sensing-access policies since the throughput can be treated as the monetary reward and both the sensing energy and the transmission energy can be treated as cost. The beauty of this is that unlike the original problem in (2.24), the parametric formulation in (2.27) has an additive and separable structure, which allows dynamic programming to be applied.

By using backward induction, the optimal policy after observing that channel k is idle, i.e., \(s_k \ne (1,0)\), can be found as

$$\begin{aligned} \mu _k(s_k;\lambda ) = \left\{ \begin{array}{ll} \left[ \frac{\omega _k B}{ \lambda \alpha \ln (2)}-\frac{\iota \Gamma \sigma ^2}{\rho _k h_k}\right] _{p_{min}}^{p_{max}},&{}\text{ if } \mathcal {F}^*_k(h_k,\lambda ) \ge \mathbb {E}\left\{ J_{k+1}(s_{k+1};\lambda )\right\} \text{ or } k=K\\ \mathcal {C}, &{}\text{ otherwise } \end{array}\right. \nonumber \\ \end{aligned}$$
(2.28)

where \(\mathcal {F}^*_k(h,\lambda ) = \max \limits _{u \in [p_{min}, p_{max}]} \mathcal {F}_k(h,u,\lambda )\) represents the maximum immediate net reward associated with transmitting over channel k, \(p_{min}\) is a predefined small positive power level and \(p_{max}\) is the maximum transmit power allowed due to the hardware limitation or some other regulations.

The above result has the following interpretations:

  • First, since \(\mathcal {F}^*_k(h_k,\lambda )\) is a non-decreasing function of \(h_k\) and \(\mathbb {E}\left\{ J_{k+1}(s_{k+1};\lambda )\right\} \) is a constant once it is computed, the condition \(\mathcal {F}^*_k(h_k,\lambda ) \ge \mathbb {E}\left\{ J_{k+1}(s_{k+1};\lambda )\right\} \) is equivalent to \(h_k\ge \bar{h}_k\), where \(\bar{h}_k\) is the minimum value of \(h_k\) that satisfies the condition. This indicates that the optimal sensing policy, which determines when to stop sensing, has a threshold structure. The SU will transmit over an idle channel only when the channel condition is good enough.

  • Second, the optimal access policy which specifies the optimal power allocation has a waterfilling structure with the water level determined by the price \(\lambda \) for energy consumption and the perceived spectrum opportunity of the channel \(\omega _k\). It can be seen from the expression that the SU will transmit with a higher power when the price for energy consumption is lower and/or when the perceived spectrum opportunity is larger.
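
Both interpretations can be illustrated with a small Python sketch of the power rule in (2.28) and the net reward in (2.26). The class and function names are hypothetical. The term \(\iota \Gamma \sigma ^2/\rho _k\) is lumped into a single assumed constant, and \(\omega _k\) and the prices \(\lambda \) are also assumed values; the remaining defaults follow the numerical example given later in this section.

```python
import math
from dataclasses import dataclass


@dataclass
class Channel:
    omega: float = 0.97                 # assumed posterior idle probability
    bandwidth: float = 6e6              # B in Hz
    access_time: float = 0.1            # T_k in s
    alpha: float = 10 ** 0.6 / 0.35     # PAR (6 dB) over drain efficiency
    circuit_power: float = 0.210        # p_c in W
    noise_floor: float = 1.1e-3         # assumed value of iota*Gamma*sigma^2/rho_k in W
    p_min: float = 1e-3                 # W
    p_max: float = 166.62e-3            # W


def optimal_power(ch, h, lam):
    """Clipped water-filling power from Eq. (2.28)."""
    water_level = ch.omega * ch.bandwidth / (lam * ch.alpha * math.log(2))
    return min(max(water_level - ch.noise_floor / h, ch.p_min), ch.p_max)


def best_net_reward(ch, h, lam):
    """F*_k(h, lambda): Eq. (2.26) evaluated at the power from optimal_power."""
    p = optimal_power(ch, h, lam)
    bits = ch.omega * ch.access_time * ch.bandwidth * math.log2(1 + h * p / ch.noise_floor)
    return bits - lam * ch.access_time * (ch.alpha * p + ch.circuit_power)


def threshold_gain(ch, lam, continuation_value, gains):
    """Smallest gain for which transmitting is at least as good as continuing."""
    for h in sorted(gains):
        if best_net_reward(ch, h, lam) >= continuation_value:
            return h
    return None


ch = Channel()
gains = (1, 2, 3, 4, 5)
powers = [optimal_power(ch, 1, lam) for lam in (4e6, 2e7, 1e8)]   # assumed prices (bits/J)
rewards = [best_net_reward(ch, h, 2e7) for h in gains]
print("power versus price:", [round(p, 4) for p in powers])
print("F* versus gain:", [round(r) for r in rewards])
```

Because the net reward grows with the channel gain, comparing it with a fixed continuation value reduces to comparing \(h_k\) with a threshold, which is what `threshold_gain` returns.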

In [16], it has been shown that the original problem in (2.24) and the parametric formulation are related in such a way that \(\eta _{\phi ^*}=\lambda ^*\) if and only if \(J_{\phi ^*}(\lambda ^*)=0\). This shows that the energy-aware sensing-access design that has zero expected long-term reward is the most energy-efficient sensing-access design. In addition, the maximum energy efficiency is equal to the price for the energy-aware design in this case. Furthermore, based on the monotonicity of the function \(J_{\phi ^*}(\lambda )\), an iterative algorithm has been proposed to find the optimal \(\lambda ^*\).
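
The link between (2.24) and (2.27) can be sketched in Python as follows. The code evaluates \(J(\lambda )\) for the policy of (2.28) by backward induction and then searches for the root \(J(\lambda ^*)=0\) by bisection, which is a simple stand-in for the iterative algorithm of [16]. Most parameters follow the numerical example in the next paragraph; the noise power, the sensing time and the false alarm probability are assumed values, and the function names are hypothetical.

```python
import math

# Values from the text's example unless marked "assumed".
K, B, T_K = 6, 6e6, 0.1
THETA, PD_TARGET = 0.8, 0.9
H_VALUES = (1, 2, 3, 4, 5)
H_PROBS = (0.64, 0.23, 0.09, 0.03, 0.01)
P_MIN, P_MAX, P_C, P_S = 1e-3, 166.62e-3, 0.210, 0.110
ALPHA = 10 ** 0.6 / 0.35              # PAR of 6 dB over drain efficiency 0.35
GAP = -math.log(5 * 1e-5) / 1.5       # SNR gap for BER = 1e-5
LINK_MARGIN = 10.0                    # 10 dB in linear scale
SIGMA2 = 5e-13                        # assumed noise power in W
TAU, PF = 5e-3, 0.1                   # assumed sensing time (s) and false alarm probability

Q_IDLE = THETA * (1 - PF) + (1 - THETA) * (1 - PD_TARGET)   # P(channel sensed idle)
OMEGA = THETA * (1 - PF) / Q_IDLE                            # posterior of truly idle


def propagation_loss(k, distance=200.0):
    """rho_k for carrier frequency 700 MHz + (k - 1) * B."""
    carrier = 700e6 + (k - 1) * B
    return (3e8 / (4 * math.pi * distance * carrier)) ** 2


def best_transmit_reward(k, h, lam):
    """F*_k(h, lambda): Eq. (2.26) at the clipped water-filling power of Eq. (2.28)."""
    floor = LINK_MARGIN * GAP * SIGMA2 / (propagation_loss(k) * h)
    power = min(max(OMEGA * B / (lam * ALPHA * math.log(2)) - floor, P_MIN), P_MAX)
    bits = OMEGA * T_K * B * math.log2(1 + power / floor)
    return bits - lam * T_K * (ALPHA * power + P_C)


def expected_reward(lam):
    """J(lambda) of Eq. (2.27) under the policy of Eq. (2.28), by backward induction."""
    value_next = 0.0                   # nothing is earned after the last channel
    for k in range(K, 0, -1):
        idle_value = 0.0
        for h, prob in zip(H_VALUES, H_PROBS):
            transmit = best_transmit_reward(k, h, lam)
            # Transmit if it beats continuing, or if this is the last channel.
            idle_value += prob * (transmit if k == K else max(transmit, value_next))
        value_next = -lam * TAU * P_S + Q_IDLE * idle_value + (1 - Q_IDLE) * value_next
    return value_next


def max_energy_efficiency(lo=1.0, hi=1e12, iterations=200):
    """Bisection for the price lambda* at which J(lambda*) = 0 (in bits per joule)."""
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if expected_reward(mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


lam_star = max_energy_efficiency()
print(f"lambda* = {lam_star:.4e} bits/J, J(lambda*) = {expected_reward(lam_star):.2e}")
```

Bisection is sufficient here because \(J(\lambda )\) decreases as the energy price increases, so the zero crossing is unique.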

Consider the following scenario for sequential spectrum sensing, with a bandwidth of \(B=6\) MHz, noise power spectral density of \(N_0 /2=-204\) dBW/Hz, noise figure of \(N_f=10\) dB, noise power of \(\sigma ^2 = N_0 B N_f\), distance between the SU-Tx and the SU-Rx of \(d=200\) m, carrier frequency of \(f_{c,1}=700\) MHz, \(f_{c,k+1}-f_{c,k}=B\) for \(k=1, \ldots , K-1\), propagation loss of \(\rho _k = \left( \frac{c}{4\pi d f_{c,k}}\right) ^2\), link margin of \(\iota =10\) dB, bit error rate of \(BER=10^{-5}\), SNR gap to channel capacity of \(\Gamma \approx -\frac{\ln (5 BER)}{1.5}\), minimum transmit power of \(p_{min}=1\) mW, maximum transmit power of \(p_{max}=166.62\) mW, circuit power of \(p_c=210\) mW, sensing power of \(p_s=110\) mW, transmission or frame time of \(T=100\) ms, PU idle probability of \(\theta _k=0.8\), SU channel gain \(h_k\in \{1,2,3,4,5\}\) with probabilities \(\{0.64, 0.23, 0.09, 0.03, 0.01\}\), PU worst-case received SNR of \(\gamma _k=-20\) dB, target probability of detection of \(\bar{P}_{d,k}=0.9\), peak-to-average ratio (PAR) of \(\xi =6\) dB, and drain efficiency of \(\zeta =0.35\).

In Fig. 2.8, the achieved energy efficiency of the optimal sensing-access policies is compared with two suboptimal policies for \(K=6\) channels and the access time \(T_k=T\). The first suboptimal scheme consists of a sensing policy that always transmits over the first available channel and an access policy based on adaptive power allocation. The second suboptimal scheme allows the exploration of diversity of multiple available channels but always uses the maximum transmit power. It can be seen from the figure that the optimal scheme outperforms both suboptimal ones.

In Fig. 2.9, the achieved energy efficiency is plotted versus sensing time for different numbers of channels \(K\). With more channels, the energy efficiency improves and the optimal sensing time decreases because of the increased channel diversity.

Fig. 2.8 Energy efficiency versus sensing time for \(K=6\) channels (constant transmission time)

Fig. 2.9 Energy efficiency versus sensing time (constant transmission time)

2.4.2 Optimal Sensing Order

The above formulation only considers the design of the sensing policy and the access policy in terms of power allocation strategy with a given sensing order. However, it can be modified to incorporate the design of the optimal sensing order. In this case, the SU will have to decide which channel to sense next if the current channel is given up for transmission. To do so, the state and decision have to be modified as follows.

  • State: Denote \(i_k\) as the index of the channel sensed at stage k and \(\Omega _k\) as the set of channels that have not been sensed at stage k. Then the joint variable \((s_{i_k},\Omega _k)\) can be taken as the state at stage k. Different from the formulation in Sect. 2.4.1, one more decision has to be made before the first channel is sensed. To be consistent, the moment when this decision is made is denoted as stage 0. At stage 0, we have \((s_{i_0},\Omega _0)=(\emptyset ,\Omega _0)\), where \(\Omega _0 = \{1,\ldots ,K\}\) and \(\emptyset \) is a null index.

  • Decision: At stage k, in addition to determining the power allocation and the sensing policy, the SU has to determine the index of the channel to be sensed next if the current channel is given up. In this case, the decision is \(u_k = j, j\in \Omega _k\).

The objective is to find a sequence of functions \(\phi =\{\mu _0(\emptyset , \Omega _0),\mu _1(s_{i_1}, \Omega _1), \ldots ,\) \(\mu _K(s_{i_K},\emptyset )\}\), with \(\mu _k, k=0,1,\ldots ,K\), mapping each state \((s_{i_k}, \Omega _k)\) into a control \(u_k=\mu _k(s_{i_k}, \Omega _k)\), to maximize the energy efficiency of the whole process. Mathematically, this can be expressed as

$$\begin{aligned} \max \limits _{\phi } \eta = \frac{\mathbb {E}\left\{ \sum _{k=0}^K g_k^R \left( s_{i_k},\Omega _k,\mu _k(s_{i_k}, \Omega _k)\right) \right\} }{\mathbb {E}\left\{ \sum _{k=0}^K g_k^E (s_{i_k}, \Omega _k,\mu _k(s_{i_k}, \Omega _k))\right\} } \end{aligned}$$
(2.29)

Similar to the case when the sensing order is fixed, the above problem can be related to a parametric formulation parameterized by \(\lambda \). The optimal strategy for the parametric formulation can be found as follows

$$\begin{aligned}&{\mu _k(s_{i_k},\Omega _k;\lambda )} \nonumber \\= & {} \left\{ \begin{array}{ll} \left[ \frac{\omega _k B}{ \lambda \alpha \ln (2)}-\frac{\iota \Gamma \sigma ^2}{\rho _k h_k}\right] _{P_{min}}^{P_{max}},&{}\text{ if } \mathcal {F}^*_{i_k}(h_{i_k},\lambda ) \ge \mathop {\max } \limits _{j\in \Omega _k} \mathbb {E}\{J_{k+1}(s_{j},\Omega _k-j;\lambda )\}\\ \mathop {\text {argmax} } \limits _{j\in \Omega _k} \mathbb {E}\{J_{k+1}(s_{j},\Omega _k-j;\lambda )\}, &{}\text{ otherwise } \end{array}\right. \nonumber \\ \end{aligned}$$
(2.30)

Compared to the result for the case of a given sensing order, the following conclusions can be drawn. First, the optimal power allocation has the same structure as (2.28). Second, the optimal sensing strategy also has a threshold structure due to the monotonicity of \(\mathcal {F}^*_{i_k}(h_{i_k},\lambda )\). The condition \(\mathcal {F}^*_{i_k}(h_{i_k},\lambda ) \ge \mathop {\max } \limits _{j\in \Omega _k} \mathbb {E}\{J_{k+1}(s_{j},\Omega _k-j;\lambda )\}\) indicates that sensing is stopped when the immediate net reward is no smaller than the expected future net reward of continuing to sense any of the remaining channels. Lastly, if the current channel is given up, the best channel to sense next is the one with the maximum expected future net reward.
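The stop-or-continue rule and the choice of the next channel can be sketched as a recursion over the set of unsensed channels. This is a simplified illustration rather than the formulation behind (2.30): it assumes perfect sensing, a fixed net cost per sensing attempt, a continuation value of zero once no channel remains, and a user-supplied immediate net reward function. All names are hypothetical.

```python
from functools import lru_cache

def plan_sensing(theta, gain_probs, net_reward, sensing_cost):
    """Return a function mapping a frozenset of unsensed channels to
    (expected future net reward, best channel to sense next).

    theta[j]         : idle probability of channel j
    gain_probs[h]    : probability of channel gain h
    net_reward(j, h) : immediate net reward of transmitting on idle channel j
    sensing_cost     : net cost charged for each sensing attempt
    """
    @lru_cache(maxsize=None)
    def continuation(remaining):
        if not remaining:
            return 0.0, None          # assumption: nothing left to gain
        best_value, best_channel = float("-inf"), None
        for j in sorted(remaining):
            future, _ = continuation(remaining - {j})
            # If channel j is idle, stop when the immediate reward is at
            # least the future value; otherwise keep sensing.
            idle_value = sum(p * max(net_reward(j, h), future)
                             for h, p in gain_probs.items())
            value = (-sensing_cost + theta[j] * idle_value
                     + (1.0 - theta[j]) * future)
            if value > best_value:
                best_value, best_channel = value, j
        return best_value, best_channel

    return continuation

# Two-channel toy case with a single gain level (hypothetical numbers).
planner = plan_sensing(theta={1: 0.5, 2: 0.9}, gain_probs={1: 1.0},
                       net_reward=lambda j, h: 1.0, sensing_cost=0.1)
value, first_channel = planner(frozenset({1, 2}))
```

In this toy case the channel with the higher idle probability is sensed first. That ordering is a property of these particular numbers and should not be read as a general rule.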

Consider that there are \(K=5\) channels with the corresponding PU idle probability set as \(\{\theta _1, \ldots , \theta _K\}=\{0.2, 0.4, 0.6, 0.7, 0.8\}\) while other settings remain the same as above. Figure 2.10 compares the energy efficiency achieved by using the optimal sensing order at different values of sensing time with that achieved with two given sensing orders. It can be seen that the optimization of sensing order is important to improve the energy efficiency of the sequential spectrum sensing process.

Fig. 2.10 Energy efficiency versus sensing time with \(\{\theta _1, \ldots , \theta _K\}=\{0.2, 0.4, 0.6, 0.7, 0.8\}\) (constant transmission time)

2.5 Applications: LTE-U

An important application of OSA is long-term evolution in unlicensed bands (LTE-U), which is also known as licensed-assisted access (LAA) [17, 18]. The motivation for introducing LTE service in unlicensed bands comes from the looming exhaustion of the licensed spectrum available to LTE and the under-utilization of unlicensed bands, such as the 5 GHz band, which contains 500 MHz of radio resource and is mainly used by WiFi. These bands can serve as excellent complementary spectrum for enhancing LTE performance. Through carrier aggregation, data can be conveyed over licensed and unlicensed spectrum simultaneously, while control signals can still be transmitted over licensed spectrum to guarantee QoS. Introducing LTE in unlicensed bands requires LTE to be a fair and friendly neighbor of the incumbent WiFi. To achieve this goal, several critical problems need to be addressed, including the protection of the WiFi system, efficient coexistence between the LTE and WiFi systems, and efficient user association.

2.5.1 LBT-Based Medium Access Control Protocol Design

Since WiFi adopts contention-based medium access, LTE transmissions will introduce collisions with WiFi transmissions. To mitigate these collisions, LTE can adopt a listen-before-talk (LBT) scheme, which enables it to monitor the channel status and has been shown to retain most of the advantages of LTE when coexisting with a WiFi system [19]. Moreover, when LTE transmits on the channel, the WiFi users will keep silent and wait for the channel to become idle. To guarantee the normal service of WiFi, LTE should vacate the channel after a period of data transmission and leave it to WiFi operation. Thus, the LBT-based MAC protocol of LTE-U should contain a periodic channel sensing phase, followed by a data transmission phase and a channel vacating phase. In the channel sensing phase, LTE monitors the idle/busy status of the channel. If the channel is sensed idle, LTE transmits data for a period of time. After that, the LTE system vacates the channel for WiFi transmission.

It can be seen that the LBT-based MAC protocol is similar to the sensing-transmission protocol of the typical OSA system, except that the channel vacating phase is absent in the latter. This is because, in the typical OSA system, the primary system has higher priority than the secondary system, and thus the secondary system can only passively adapt to the transmission of the primary system. In the LTE-U system, however, although the legacy WiFi system is protected, the secondary LTE system can actively control the transmission of WiFi by carefully designing the sensing period and the transmission time.

To protect WiFi services, the performance of the multiple WiFi users should be quantified with the coexistence of LTE. There are works on evaluating the performance of LTE-U via simulation [20,21,22]. To facilitate the theoretical analysis of LTE-U system, an LBT-based LTE-U MAC protocol is designed as shown in Fig. 2.11 [23]. In this protocol, \(\tau _s\), \(\tau _t\) and \(\tau _v\) denote the spectrum sensing time, the LTE transmission time, and the LTE vacating time (WiFi transmission time), respectively. Moreover, the vacating time \(\tau _v\) contains \(\gamma \) (\(\gamma \in {\mathbb Z}^+\)) transmission periods (TPs), each of which contains the WiFi packet transmission time and its propagation delay. Assuming that the spectrum sensing result is perfect, the LBT-based MAC protocol can be specified as follows.

Fig. 2.11 An LBT-based MAC protocol design for LTE-U

  • Instead of sensing the spectrum at the beginning of each frame, the LTE starts sensing at the beginning of the \(\gamma \)th TP in a frame and keeps sensing from then on. If the channel is sensed to be idle before the TP is completed, the LTE sends a dummy packet until the TP ends. By doing so, WiFi packets arriving during the \(\gamma \)th TP are deferred and the channel can be held by the LTE for the next frame.

  • Though the spectrum sensing of LTE for the ith frame may happen within the final TP of the \((i-1)\)th frame, the length of LTE frame can be consistently quantified as \(T=\tau _t+\tau _v\). Thus, given T, we can trade off the LTE transmission time and vacating time to optimize the system performance.

  • There is an essential difference between the proposed MAC protocol and the sensing-transmission protocol of the traditional OSA system, although they are both frame-based. In the traditional OSA system, SU can transmit only when the channel is sensed to be idle. In the proposed MAC protocol, however, the LTE not only senses the channel, but also grabs the channel for data transmission in the next frame. Therefore, there is always transmission opportunity for the LTE in each frame.
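The frame structure above can be made concrete with a small timing sketch. It only encodes the relations \(T=\tau _t+\tau _v\) and \(\tau _v\) consisting of \(\gamma \) TPs, under the assumption that all TPs have the same duration. The function name and the numerical values are hypothetical, and the WiFi throughput and delay model of [24] is not reproduced here.

```python
def lbt_frame_split(frame_length, tp_duration, num_tps):
    """Split an LTE-U frame of length T into LTE and WiFi portions.

    frame_length : T = tau_t + tau_v
    tp_duration  : duration of one WiFi transmission period (TP),
                   assumed identical for all TPs
    num_tps      : gamma, a positive integer number of TPs in tau_v
    Returns (tau_t, tau_v, fraction of the frame used by LTE).
    """
    if num_tps < 1 or int(num_tps) != num_tps:
        raise ValueError("gamma must be a positive integer")
    tau_v = num_tps * tp_duration
    if tau_v >= frame_length:
        raise ValueError("vacating time must be shorter than the frame")
    tau_t = frame_length - tau_v
    return tau_t, tau_v, tau_t / frame_length

# Hypothetical numbers in milliseconds: T = 20, TP = 2, gamma = 1..5.
for gamma in range(1, 6):
    tau_t, tau_v, lte_share = lbt_frame_split(20.0, 2.0, gamma)
    print(f"gamma={gamma}: tau_t={tau_t} ms, tau_v={tau_v} ms, "
          f"LTE share={lte_share:.2f}")
```

For a fixed \(T\), each additional TP granted to WiFi shortens the LTE transmission time by one TP duration, which is the tradeoff the second bullet refers to.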

Based on this protocol design, the performance of the LTE and the WiFi systems can be theoretically quantified by mapping the protocol parameters to those in [24]. With closed-form expressions for the throughput and delay of WiFi, the protocol parameters, including the LTE transmission time and the frame length, can be optimized to maximize the LTE throughput or the overall channel utilization.

2.5.2 User Association: To be WiFi or LTE-U User?

One important observation from the performance analysis of the LBT-based LTE-U MAC protocol is that, when a batch of new users joins the network, it is not always advantageous for them to be LTE-U users, in terms of either their individual throughput or the overall channel utilization. The simulation results in [23] show that whether the new users should join the LTE-U system or the legacy WiFi system to obtain better performance depends strongly on the traffic load of the existing WiFi system, including the packet arrival rate and the number of WiFi users. Therefore, the user association, which determines the service provider for the new users, should be optimized.

To maximize the normalized throughput of the unlicensed band while guaranteeing the QoS of WiFi service, a joint resource allocation and user association problem is formulated in [25] for a heterogeneous network in which the LTE small cells opportunistically access the spectrum of the WiFi system. To solve the problem, a two-level learning-based framework is proposed, in which the original problem is decomposed into two subproblems. The master-level problem, which optimizes the transmission time of LTE, is solved by a Q-learning based method, while the slave-level problem, which optimizes the user association, is solved by a game-theory based learning method. With the proposed scheme, each newly enrolled user can choose the optimal resource allocation strategy and service provider autonomously.

Considering that the QoS of the LTE-U users is not guaranteed in the existing literature, the authors in [26] study the provision of QoS guarantees for LTE-U by jointly optimizing the resource allocation and the user association strategy. To address the QoS provision problem, the users in the LTE-U system are classified into best-effort users and QoS-preferred users, while the WiFi users are all treated as best-effort users. When the QoS requirement of an LTE-U user can be satisfied, the user becomes a QoS-preferred user; otherwise, it joins the WiFi system as a normal WiFi user and receives best-effort service. By quantifying the performance metrics, including the throughput and delay of the WiFi users and the LTE-U users, the optimization problem is formulated with the objective of maximizing the number of QoS-preferred users while guaranteeing the fair coexistence of WiFi and LTE-U users. To solve this problem, the original problem is equivalently decomposed into two subproblems, i.e., the sum-power minimization problem and the user association problem. For the former, the deep-cut ellipsoid method is used to optimize the LTE transmission time, subcarrier assignment, and power allocation. For the latter, a successive user removal algorithm is proposed. This scheme ensures that all the LTE-U users are QoS guaranteed and that the number of such users is maximized.

2.6 Summary

In this chapter, we have discussed the OSA technique in detail, starting from the basic OSA model on which the sensing-access protocol is designed. The sensing-throughput tradeoff has been presented, building on which cooperative spectrum sensing, sensing scheduling, and sequential spectrum sensing have been introduced. As a recent application of OSA in practical networks, LTE-U has been presented, and several critical problems, including the MAC protocol design and optimization, resource allocation, and user association, have been discussed.