Minimum Power Adversarial Attacks in Communication Signal Modulation Classification with Deep Learning

Integrating the cognitive radio (CR) technique with wireless networks is an effective way to alleviate the increasingly crowded spectrum. Automatic modulation classification (AMC) plays an important role in CR. AMC significantly improves the intelligence of a CR system by classifying the modulation type and signal parameters of received communication signals, providing more information for the system's decision making. In addition, AMC can help the CR system dynamically adjust the modulation type and coding rate of the communication signal to adapt to different channel qualities, and it eliminates the cost of broadcasting the modulation type and coding rate. Deep learning (DL) has recently emerged as one of the most popular methods for AMC of communication signals. Despite their success, DL models have been shown to be vulnerable to adversarial attacks in pattern recognition and computer vision: they can be easily deceived if a small, carefully designed perturbation, called an adversarial attack, is imposed on the input, typically an image. Owing to the very different nature of communication signals, it is interesting yet crucially important to study whether adversarial perturbations could also fool AMC. In this paper, we make a first attempt to design a special adversarial attack on AMC. We start from the assumption of a linear binary classifier, which is then extended to the multiclass case. We consider minimum power consumption, a criterion that differs from existing adversarial perturbation designs but is more reasonable in the context of AMC. We then develop a novel adversarial perturbation generation method that achieves high attack success on communication signals. Experimental results on real data show that the method successfully spoofs an 11-class modulation classification model at a minimum cost of about −21 dB.
The visualization results demonstrate that the adversarial perturbation manifests in the time domain as imperceptible undulations of the signal, and in the frequency domain as small noise outside the signal band.


Introduction
(Da Ke and Xiang Wang contributed equally to this work and are co-first authors.)

While the rapid development of wireless communication technology brings convenience to people, it also strains spectrum resources [1]. To alleviate these spectrum resource constraints, researchers have proposed the cognitive radio (CR) [2,3] technique. CR can detect spectrum availability in real time and plan spectrum resources dynamically. Automatic modulation classification (AMC) [4,5] plays an important role in CR. AMC can identify the modulation mode and communication parameters of the received signal and provide more information for the decision making of the CR system. Deep learning (DL) techniques have been widely used in AMC due to their capability of learning high-level, or even cognitively reasonable, representations [6–12]. However, DL is usually hard to explain and is challenged for its non-transparent nature [13]. In other words, despite its effectiveness, one hardly knows what DL has exactly learned and why it succeeds, which hinders its application in many risk-sensitive and/or security-critical settings. More seriously, recent studies have found that DL is susceptible to certain well-designed adversarial examples, defined as small and imperceptible perturbations imposed on input samples. Szegedy et al. [14] first found that adding imperceptible adversarial perturbations to input samples can easily fool well-performing Convolutional Neural Network (CNN) models. The effectiveness of adversarial attacks in the modulation classification scenario has been verified [15]. The Fast Gradient Sign Method (FGSM) [16,17] is a simple, fast, and effective method for adversarial example generation. Papernot et al. [18] proposed a method based on the forward derivative to produce adversarial perturbations, called the Jacobian-based saliency map attack (JSMA).
Carlini and Wagner [19] proposed a method of generating adversarial perturbations and explored three different distance metrics (L1, L2, and L∞).
The research in the area of communication signals is still embryonic. Communication signals are often expressed as a waveform, rather than as pixels in images. Owing to the very different nature of communication signals, it is interesting yet crucially important to study if adversarial perturbation could also fool AMC.
To explore the performance of adversarial attacks on DL-based AMC and provide a reliability evaluation for researchers, this paper proposes an algorithm for generating a minimum-power adversarial perturbation. The method generates a theoretically minimal perturbation that is well concealed yet destructive, making the modulation classification of the neural network much less accurate.
The main contributions of this article are as follows: • We propose a minimum power adversarial attack, which is more rational in the modulation classification scenario, and validate the feasibility of adversarial attacks on modulated signals. Experimental results indicate that our proposed method can misclassify a classifier with smaller perturbations.

Signal Model

The received signal can be written as

y(t) = s(t; u_k) + n(t), (1)

where n(t) is additive noise and s(t; u_k) is the noise-free signal component,

s(t; u_k) = A e^{jθ_c} Σ_{i=0}^{N−1} x_{k,i} g(t − iT_s − εT_s), (2)

where u_k is defined as a multi-dimensional parameter set consisting of a bunch of unknown signal and channel variables, given as

u_k = {A, θ_c, ε, h(t), {x_{k,i}}_{i=0}^{N−1}}. (3)

The symbols used in (2) and (3) are listed as follows: 1. A, the signal amplitude. 2. θ_c, the phase shift introduced by the propagation delay and the initial phase together. 3. N, the number of received symbols. 4. x_{k,i}, the i-th constellation point under the k-th modulation scheme, k ∈ {1, ⋯, C}, where C is the number of candidate modulation schemes to be identified. 5. T_s, the symbol interval. 6. g(t) = h(t) * p(t), where * is the convolution operator, describing the combined effect of the signal channel h(t) and the pulse-shaping filter p(t). 7. ε, the normalized timing offset between the transmitter and the receiver, 0 ≤ ε ≤ 1.
We consider that the values of the above parameter set determine which modulation scheme the signal belongs to. The actual received signal in (1) is discrete, and the received sequence at symbol interval T_s can be written as y = [y_0, ⋯, y_{N−1}]^T. The conditional probability density function (PDF) of y_n under hypothesis H_k can be written as p(y_n | H_k, u_k). (4) The elements in the parameter set u_k are deterministic or random variables with known PDFs. Hence, the marginalized density function of an observation can be obtained by taking the statistical average of (4), which is given as

p(y_n | H_k) = E_{u_k}[ p(y_n | H_k, u_k) ], (5)

where E(⋅) is the expectation operator.
We assume that the channel environment is stationary and that the parameters in the set u_k are static over the whole observation period. Besides, the normalized timing offset ε = 0 and the noise is white. Thus, the elements of y are i.i.d. The function p(y_n | H_k, u_k) represents the PDF of a single observation at instant n, and the joint likelihood function of the whole observation sequence can be described as

p(y | H_k, u_k) = ∏_{n=0}^{N−1} p(y_n | H_k, u_k). (6)

According to the maximum likelihood criterion, the most probable hypothesis H_k, corresponding to the k-th modulation scheme, is finally chosen with the maximal likelihood value.
Ĥ = argmax_{H_k} p(y | H_k), (7)

where the top-mark ∧ denotes the estimation. The maximum likelihood (ML) method has been proved to achieve optimal performance. However, the exact likelihoods are hard to obtain in complex environments. To obviate this problem, deep learning-based methods can learn complex decision functions automatically. While deep learning solves the problem of complex modeling, its robustness has been questioned.
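The maximum-likelihood decision rule described above can be sketched as follows. This is a minimal illustration assuming complex AWGN with known variance and equiprobable constellation points; the function and variable names are ours, not the paper's:

```python
import numpy as np

def ml_classify(y, constellations, sigma2):
    """Maximum-likelihood modulation classification under AWGN.

    y:              received symbol sequence (complex, length N)
    constellations: list of candidate constellation-point arrays
    sigma2:         total noise variance (assumed known)
    Returns the index k of the hypothesis maximizing the joint likelihood.
    """
    log_likes = []
    for points in constellations:
        # Per-symbol likelihood: average the Gaussian density over the
        # equiprobable constellation points (marginalizing x_{k,i}).
        d2 = np.abs(y[:, None] - points[None, :]) ** 2
        per_symbol = np.mean(np.exp(-d2 / sigma2), axis=1)
        # i.i.d. symbols: the joint log-likelihood is the sum over n.
        log_likes.append(np.sum(np.log(per_symbol + 1e-300)))
    return int(np.argmax(log_likes))
```

For example, a lightly noisy BPSK sequence is assigned to the BPSK hypothesis rather than the QPSK one, since the QPSK constellation points lie far from the observed symbols.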
For advanced neural network classifiers, a minimal adversarial perturbation r can be added to the input sample such that the classifier no longer assigns the input its true label k̂(x). The formal description is as follows:

Δ(x; k̂) := min_r ∥r∥_2 subject to k̂(x + r) ≠ k̂(x), (8)

where x is the input data and k̂(x) is the label predicted by the classifier. Δ(x; k̂) can be referred to as the robustness of the classifier at the input point. In other words, this minimum adversarial perturbation represents the maximum perturbation, in any direction, that the classifier can withstand at that point.
As shown in Fig. 1, in order to find the optimal solution, we first reduce the classifier to a simple binary classifier. Assume k̂(x) = sign(f(x)), where f is any decision function, i.e., f : ℝ^n → ℝ.
The minimal perturbation that makes the classifier change its judgment is the orthogonal projection of x_0 onto the hyperplane f(x) = 0. The mathematical expression is:

r*(x_0) := argmin_r ∥r∥_2 subject to sign(f(x_0 + r)) ≠ sign(f(x_0)) = − (f(x_0) / ∥∇f(x_0)∥_2^2) ∇f(x_0). (9)

In the case of a linear decision function, this is the direction of the gradient of the decision function, and the preceding scalar f(x_0)/∥∇f(x_0)∥_2^2 corresponds to the optimal perturbation coefficient. In this case, ∥r*(x_0)∥ is the distance from the point x_0 to the hyperplane. For a nonlinear decision function f, as shown in Fig. 2, we first consider the case where the dimension of the input space is n = 1 and use an iterative approach to approximate the optimal solution. In the first iteration, when the input is x_0, the green line represents the first-order Taylor expansion tangent, i.e., f(x_0) + ∇f(x_0)^T (x − x_0); the intersection of this tangent line with the input space ℝ^1 is x_1, and the distance from x_0 to x_1 is Δr_1, the perturbation of the first iteration. The iteration stops when the input sample lies on the other side of the decision boundary. In this two-iteration example, the minimum perturbation obtained is

r*(x_0) = Δr_1 + Δr_2. (10)

Fig. 1 Schematic diagram of the minimal perturbation of a binary classifier

As such, in the i-th iteration, the perturbation can be obtained as:

Δr_i = − (f(x_i) / ∥∇f(x_i)∥_2^2) ∇f(x_i), x_{i+1} = x_i + Δr_i. (11)

The above optimization addresses a binary classifier; it can also be extended to a multiclass classifier using the one-vs-others method. Let the number of categories be c; the classification function is then f : ℝ^n → ℝ^c, and the classification task can be expressed as:

k̂(x) = argmax_k f_k(x). (12)

Similar to the binary case, the method is first derived for the linear case and then extended to arbitrary classifiers. At this point, the problem is transformed into an optimal perturbation problem over the c − 1 one-vs-others classification problems.
The minimum distance to the multiple decision boundaries gives the minimum adversarial perturbation. The formal description is as follows: for an affine classifier f(x) = Wx + b with rows w_l and current label k̂ = k̂(x_0), the nearest boundary and corresponding perturbation are

l̂(x_0) = argmin_{l ≠ k̂} |f_l(x_0) − f_{k̂}(x_0)| / ∥w_l − w_{k̂}∥_2,
r*(x_0) = (|f_{l̂}(x_0) − f_{k̂}(x_0)| / ∥w_{l̂} − w_{k̂}∥_2^2) (w_{l̂} − w_{k̂}). (13)
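For an affine classifier, the one-vs-others construction above reduces to a closed-form projection onto the nearest decision boundary. A minimal sketch, assuming a classifier of the form f(x) = Wx + b (all names are illustrative):

```python
import numpy as np

def minimal_perturbation(x, W, b):
    """Closed-form minimal L2 perturbation for an affine classifier
    f(x) = W @ x + b: project x onto the nearest of the c - 1
    hyperplanes f_l(x) = f_k(x) separating the current class k
    from every other class l."""
    f = W @ x + b
    k = int(np.argmax(f))                  # current predicted class
    best_r, best_norm = None, np.inf
    for l in range(len(f)):
        if l == k:
            continue
        w_diff = W[l] - W[k]
        f_diff = f[l] - f[k]               # negative while l loses to k
        # Distance vector to the hyperplane separating classes k and l.
        r = (abs(f_diff) / np.linalg.norm(w_diff) ** 2) * w_diff
        if np.linalg.norm(r) < best_norm:
            best_norm, best_r = np.linalg.norm(r), r
    return best_r
```

In practice a small overshoot factor (e.g., 1.02) is applied to the returned perturbation so the perturbed input actually crosses the boundary rather than landing exactly on it.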

Implementation
In this study, all experiments were performed on an NVIDIA GeForce RTX 3090, using one GPU per run. Each of the four attack methods is implemented using PyTorch 1.10 and CUDA 11.3. We train the target models with the Adam optimizer, use the ReLU activation function in all layers, and use the categorical cross-entropy loss. We adopt an early stopping strategy to decide when to stop training.
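The training setup described above can be sketched as a simplified PyTorch loop. The learning rate, patience, and tolerance values here are illustrative assumptions, not taken from the paper:

```python
import torch
from torch import nn

def train_with_early_stopping(model, train_loader, val_loader,
                              max_epochs=100, patience=5):
    """Training loop matching the described setup: Adam optimizer,
    cross-entropy loss, and early stopping on validation loss."""
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)
    loss_fn = nn.CrossEntropyLoss()
    best_val, wait = float("inf"), 0
    for epoch in range(max_epochs):
        model.train()
        for x, y in train_loader:
            opt.zero_grad()
            loss_fn(model(x), y).backward()
            opt.step()
        model.eval()
        with torch.no_grad():
            val = sum(loss_fn(model(x), y).item() for x, y in val_loader)
        if val < best_val - 1e-4:
            best_val, wait = val, 0        # validation loss improved
        else:
            wait += 1
            if wait >= patience:           # stop when validation stalls
                break
    return model
```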
To test the algorithm proposed in this paper, we conducted experiments on the open-source simulation dataset RML 2016.10A designed by DeepSig [20], which includes 8 digital modulation types (BPSK, QPSK, 8PSK, 16QAM, 64QAM, BFSK, CPFSK, and PAM4) and 3 analog modulation types (WB-FM, AM-SSB, and AM-DSB). The signal-to-noise ratio (SNR) range covers −20 ~ 18 dB. The dataset contains 220,000 samples; each sample consists of in-phase and quadrature (IQ) components of length 128 points. All data were normalized to zero mean and unit variance. The time-domain waveforms of the 11 modulation types at high SNR are shown in Fig. 3. The ratio of the training, validation, and test sets is 7:2:1.
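The preprocessing and 7:2:1 split can be sketched as follows. Whether normalization is applied per sample or globally is not specified above; this sketch assumes per-sample normalization, and the function name is ours:

```python
import numpy as np

def prepare_splits(X, y, seed=0):
    """Normalize each sample to zero mean / unit variance and split
    the data 7:2:1 into train / validation / test sets.
    X has shape (num_samples, 2, 128): IQ pairs of length 128."""
    X = (X - X.mean(axis=(1, 2), keepdims=True)) / (
        X.std(axis=(1, 2), keepdims=True) + 1e-12)
    idx = np.random.default_rng(seed).permutation(len(X))
    n_train, n_val = int(0.7 * len(X)), int(0.2 * len(X))
    tr, va, te = np.split(idx, [n_train, n_train + n_val])
    return (X[tr], y[tr]), (X[va], y[va]), (X[te], y[te])
```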

Experiments and Results
To ensure that the modulation dataset matches the CNN model, we adopt the VT_CNN2 model [21], which was optimized and improved for the dataset RADIOML 2016.10A. That is, the number of layers, network parameters, and initial weights of the CNN are adapted to the dataset, and the input signal is reshaped into a 2 × 128 format at each SNR. The specific structure is shown in Table 1. We also adopt widely used DL models such as VGG11 [22] and Resnet18 [23,24] to test our algorithm.
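A VT_CNN2-style network for 2 × 128 IQ inputs can be sketched as follows. The layer sizes below follow a commonly published VT_CNN2 configuration and are assumptions on our part; Table 1 of this paper is authoritative:

```python
import torch
from torch import nn

class VTCNN2(nn.Module):
    """VT_CNN2-style classifier for 2 x 128 IQ inputs: two
    convolutional layers followed by two fully connected layers."""
    def __init__(self, num_classes=11):
        super().__init__()
        self.features = nn.Sequential(
            nn.ZeroPad2d((2, 2, 0, 0)),           # pad the time axis
            nn.Conv2d(1, 256, kernel_size=(1, 3)),
            nn.ReLU(),
            nn.Dropout(0.5),
            nn.ZeroPad2d((2, 2, 0, 0)),
            nn.Conv2d(256, 80, kernel_size=(2, 3)),
            nn.ReLU(),
            nn.Dropout(0.5),
        )
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Linear(80 * 132, 256),             # 80 maps of size 1 x 132
            nn.ReLU(),
            nn.Dropout(0.5),
            nn.Linear(256, num_classes),
        )

    def forward(self, x):                         # x: (batch, 1, 2, 128)
        return self.classifier(self.features(x))
```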

Time-Frequency Characteristics of Communication Signal Adversarial Examples
The purpose of adversarial examples is to make the classifier misjudge without destroying the characteristics of the original signal. As shown in Fig. 4a, there is no obvious difference between the adversarial examples and the original signals; there are only slight slope changes at some time-domain inflection points of the signal. As shown in Fig. 4b, there is also no significant difference between the main frequency bands of the adversarial examples and the original signals, and the number of spectral peaks in the adversarial examples is the same as in the original signals.

Effectiveness of Adversarial Examples
The study of the time-frequency characteristics shows that adversarial examples are difficult to distinguish from the original signals. On this basis, it is necessary to study whether the adversarial examples actually cause the classifier to misclassify.
In order to evaluate the robustness of a classifier to adversarial perturbations, we compute the average perturbation-to-signal ratio (PSR), defined by

PSR = 10 lg (P_r / P_s), (14)

where P_r is the power of the adversarial perturbation and P_s is the power of the signal. The PSR represents the magnitude of adversarial perturbation required to misclassify a classifier.
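The PSR defined above can be computed as (function and variable names are ours):

```python
import numpy as np

def psr_db(perturbation, signal):
    """Perturbation-to-signal ratio, PSR = 10 * log10(P_r / P_s), in dB."""
    p_r = np.mean(np.abs(perturbation) ** 2)   # perturbation power
    p_s = np.mean(np.abs(signal) ** 2)         # signal power
    return 10 * np.log10(p_r / p_s)
```

For example, a perturbation whose amplitude is one tenth of the signal's has one hundredth of its power, i.e., a PSR of −20 dB.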
We compare the proposed method with several attack techniques, including FGSM [16], PGD [25], and a white Gaussian noise (WGN) attack. The accuracy and average PSR of each classifier under the different methods are reported in Table 2. Since the dataset contains low-SNR (−20 dB ~ −2 dB) signals that are difficult to classify even without attack, we only report results for high-SNR (0 dB ~ 18 dB) signals in Table 2. We define attack success as all examples being misclassified. In particular, the FGSM and WGN methods cannot misclassify all examples; they can only reduce the accuracy to about 9%. It can be seen that our proposed method estimates smaller perturbations than the other methods: the perturbations estimated by our method are about 100 times smaller in magnitude than the original signals.
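For reference, the FGSM baseline [16] used in the comparison takes a single step of size eps along the sign of the input gradient of the loss. A minimal PyTorch sketch (the step size and names are illustrative):

```python
import torch
from torch import nn

def fgsm_attack(model, x, y, eps):
    """Fast Gradient Sign Method: x_adv = x + eps * sign(grad_x L)."""
    x = x.clone().detach().requires_grad_(True)
    loss = nn.CrossEntropyLoss()(model(x), y)
    loss.backward()                       # populates x.grad
    return (x + eps * x.grad.sign()).detach()
```

Unlike the proposed minimum-power attack, FGSM perturbs every input dimension by the full step size eps, which is why it needs a larger PSR to change the classifier's decision.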
For each modulated signal, the corresponding adversarial example is computed and fed to the trained neural network. The resulting confusion matrices are shown in Fig. 5. Figure 5a shows the confusion matrix of the classifier VT_CNN2 on the original signals: most signals are classified correctly. Figure 5b shows the confusion matrix for signals attacked by the proposed method. It can be seen that the matrix approximates a symmetric matrix, indicating that misclassifications occur between modulation pairs in both directions rather than being concentrated on a single wrong label.

Power of Adversarial Perturbation Required for Successful Attack
The core idea of the adversarial example generation technique proposed in this paper is to find the minimum adversarial perturbation that makes the classifier misclassify. This section shows the minimum perturbation power required for the different modulations.
As can be seen from Table 3, the maximum PSR is −20.05 dB and the minimum is −43.10 dB. In the best case, the model is spoofed on AM-DSB modulation at a PSR of −43.10 dB; in the worst case, a PSR of −20.05 dB is needed to spoof the model on CPFSK.
The above experiments show that DNN-based modulation classification is quite vulnerable to adversarial attacks. We believe that such security issues could become a main concern in AMC.

Defense of Adversarial Attack
To deal with these types of attacks, we use the adversarial training [16] approach to build more robust classifiers for the dataset. The loss function of adversarial training is formulated as

L̃(θ, x, y) = α L(θ, x, y) + (1 − α) L(θ, x + ε sign(∇_x L(θ, x, y)), y), (15)

where L(θ, x, y) is the loss function of the original model and α = 0.5. ε is the step size of the adversarial examples, and we use the PSR to measure its strength. The evolution of the PSR for the different robust classifiers is shown in Table 4. Observe that retraining with adversarial examples significantly increases the robustness of the classifiers to adversarial perturbations. For example, the robustness of Resnet18 is improved by 4.38 dB, and that of VGG11 by about 3.25 dB. Moreover, adversarial training also benefits the robustness of the classifiers against the other adversarial attacks and the WGN attack. Quite surprisingly, the accuracy of VGG11 and Resnet18 on original signals both increase to 75% after adversarial training, while VT_CNN2's decreases to 36%. We attribute this behavior to VT_CNN2 being relatively shallow: its capacity is insufficient, and since adversarial training increases the complexity of the dataset, the simple classifier loses accuracy.
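One step of the adversarial training above can be sketched as follows, using single-step FGSM examples as in [16] and α = 0.5 as stated; the step size and all names are illustrative:

```python
import torch
from torch import nn

def adv_train_step(model, opt, x, y, eps, alpha=0.5):
    """One adversarial-training step: minimize
    alpha * L(theta, x, y) + (1 - alpha) * L(theta, x_adv, y),
    with x_adv generated by a single FGSM step of size eps."""
    loss_fn = nn.CrossEntropyLoss()
    # Craft FGSM adversarial examples for the current batch.
    x_req = x.clone().detach().requires_grad_(True)
    loss_fn(model(x_req), y).backward()
    x_adv = (x_req + eps * x_req.grad.sign()).detach()
    # Mixed clean / adversarial loss with alpha = 0.5.
    opt.zero_grad()
    loss = alpha * loss_fn(model(x), y) + (1 - alpha) * loss_fn(model(x_adv), y)
    loss.backward()
    opt.step()
    return loss.item()
```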

Conclusions
In this paper, we propose a minimum-power adversarial example generation method for AMC tasks. We test our method on three classifiers: VT_CNN2, VGG11, and Resnet18. The results show that CNN-based AMC methods are very vulnerable to the proposed adversarial attack. Our method generates adversarial perturbations about 100 times smaller in magnitude than the raw signals, yet causes the classifier to misidentify completely. We visualize the samples before and after adding adversarial perturbations in terms of both the time-domain waveform and the spectrum. These results show that the adversarial perturbation is imperceptible in both the time and frequency domains of the signal. We also present the confusion matrix and the minimum PSR required to attack each modulation, which helps to reveal the vulnerable points of classifiers for the AMC task. Furthermore, in order to deal with these attacks, we adopt adversarial training to retrain our classifiers. The results indicate that adversarial training can indeed improve the robustness of classifiers. The robustness of the three classifiers, VT_CNN2, VGG11, and Resnet18, against our attack is improved by 7.06 dB, 3.25 dB, and 4.38 dB, respectively.
In future work, we will construct real-world RF attack and defense environments for communication signals and verify the effectiveness of adversarial attacks in a variety of complex electromagnetic environments.

Data Availability
The dataset RML 2016.10A [20] that supports the findings of this study is freely available at https://www.deepsig.io/datasets.

Declarations
Ethics Approval This article does not contain any experiments with human or animal participants performed by any of the authors.

Consent to Participate
Informed consent was obtained from all individual participants included in the study.

Conflict of Interest The authors declare no competing interests.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.