1 Introduction

Unmanned aerial vehicles (UAVs), commonly called drones, are valued for their speed, ease of access, cost-effectiveness, and ability to enhance human safety. They have simplified operations across multiple domains, such as firefighting, emergency responses during natural disasters, search and rescue missions, and even the estimation of irrigation requirements for agricultural purposes. Furthermore, UAVs are extensively employed in border security and counterterrorism scenarios [1,2,3]. UAVs are capable of transmitting data through a multitude of methods, including radio frequency (RF) communication, Wi-Fi, Bluetooth, mobile data networks (3G, 4G, and 5G), satellite communication, wired or fiber-optic communication, and the use of sensors and cameras [4]. However, during the data transmission process, UAVs may be susceptible to various threats, including RF interference, jamming, GPS spoofing, unauthorized data access, malware installation, denial-of-service (DoS) attacks, malicious flight control, and even satellite communication attacks. To proactively prevent malicious UAV intrusions, a UAV anomaly detection framework must effectively handle complex signals in noisy environments, often with minimal labeled data samples [5,6,7]. As a result, intrusion detection systems (IDSs) have become increasingly vital for safeguarding UAVs against vulnerabilities and potential attacks [8]. Modern anomaly-based IDSs leverage machine learning (ML) methods to detect known and unknown threats [9]. Several noteworthy contributions can be highlighted when reviewing the studies in the literature. Li et al. [10] acquired UAV signals as input data, transformed these data into bispectrum representations, and employed a Siamese network-based adversarial learning model to learn vector encodings. 
Through unsupervised learning, they achieved 92.85% accuracy in a UAV type detection task and 91.4% accuracy in identifying out-of-sample UAVs. Anwar et al. [11] utilized Mel-frequency cepstral coefficient (MFCC) and linear predictive cepstral coefficient (LPCC) feature extraction techniques to distinguish amateur drone sounds from various background noises, such as those of birds, airplanes, and thunderstorms, in noisy environments. They employed a support vector machine (SVM) with multiple kernels; the cubic-kernel SVM coupled with MFCCs outperformed the LPCC-based approach, achieving approximately 96.7% accuracy in an amateur drone detection scenario. Nemer et al. [12] conducted feature extraction on RF signals through finite impulse response filtering. They then applied ensemble learning (combining K-nearest neighbors and XGBoost) to the extracted features to determine whether a UAV was present in a given area, achieving approximately 99% accuracy. Xie et al. [13] introduced an intrusion detection model for UAVs based on a belief rule base (BRB), focusing on reducing the input sample size using Wi-Fi data traffic. They proposed an evidential reasoning (ER) algorithm to address the rule combination explosion issue encountered in BRBs. By merging the capabilities of the ER and BRB methodologies, a novel evaluation model termed the EBRB-based model achieved a 99.50% success rate in UAV intrusion detection, even in the presence of numerous attributes. Alipour-Fanid et al. [14] exclusively extracted features from the packet sizes and interarrival times of encrypted Wi-Fi traffic. To reduce the time required for online identification, they implemented a reweighted L1-norm regularization method that considered the numbers of frames and samples, as well as the computational costs of different features. 
To overcome packet interarrival time uncertainties while optimizing the tradeoff between detection accuracy and delays, they employed maximum likelihood estimation (MLE) for packet interarrival time estimation. Their evaluation results demonstrated the ability of the developed approach to detect and identify UAVs within 0.15–0.35 s with high accuracy, ranging from 85.7% to 95.2%. The UAV detection process covered physical detection ranges of 70 m in line-of-sight (LoS) scenarios and 40 m in non-line-of-sight (NLoS) scenarios. Khan et al. [15] addressed the significant cybersecurity challenge associated with the Internet of Medical Things (IoMT) by using bidirectional simple recurrent units (SRUs) to prevent the vanishing gradient problem and enable fast training through skip connections. They also incorporated explainable artificial intelligence (XAI) into their work and detected threats with a 99.38% accuracy rate. Al-Haija and Badawi [16] employed an encrypted Wi-Fi dataset for UAV intrusion detection, utilizing nine metrics based on three UAVs. They modeled the dataset using convolutional neural networks (CNNs), achieving an accuracy of approximately 99.50%. Utilizing the KDD Cup 99 dataset, Tan et al. [17] achieved a success rate of 92.44% by employing a particle swarm-optimized deep belief network (DBN) for UAV intrusion detection in networks. Alheeti et al. [18] employed ICMetric technology for UAV detection using signals acquired from an encrypted Wi-Fi network. ICMetric numbers represented additional features integrated into the dataset used for drone detection. This study employed a deep neural network (DNN) for classification and achieved a performance rate of 99.99%. Medaiyese et al. [19] utilized Wi-Fi and Bluetooth data to extract distinctive signatures from the transient and steady states of signals. 
By applying the discrete wavelet transform, the continuous wavelet transform, and the wavelet scattering transform to RF control signals from UAVs, they conducted feature extraction under varying signal-to-noise ratio (SNR) levels. They constructed different models using these features and trained SqueezeNet on the obtained dataset, achieving an accuracy of 98.9%. Ezuma et al. [20] detected RF signals from 15 UAVs using a naive Bayes decision mechanism based on Markov models. Additionally, signals from Wi-Fi and Bluetooth emitters were detected based on the bandwidth and modulation characteristics of the identified RF signals. After recognizing the input signals as UAV control signals, they achieved a classification accuracy of 98.13% using the k-nearest neighbors (KNN) method. In recent studies, SVMs have been used to detect injection attacks on automatic dependent surveillance-broadcast (ADS-B) devices in UAVs [21] and to protect against global positioning system (GPS) signal spoofing attacks [22, 23]. In this study, however, the SVM method served two purposes: attacks conducted over an encrypted Wi-Fi network were modeled, and the obtained model was then validated with the local interpretable model-agnostic explanations (LIME) method, an explanation technique that treats the underlying model as a black box. When analyzing prior research, it became evident that intrusion detection studies involving Wi-Fi-based data have predominantly embraced ML methodologies. However, these studies have not rigorously explored model interpretability with techniques such as LIME. With the growth of computational power, ML approaches have come into extensive use in UAV intrusion detection scenarios. Even so, ML models are often opaque and difficult to comprehend. Very few studies in this domain have adequately addressed the need to understand their constructed models. 
The deficiencies of this oversight may encompass pitfalls such as data overfitting, the absence of confidence intervals surrounding point estimates, and arbitrary variable selection results. Khan et al. [24] proposed an autoencoder-based detection framework utilizing convolutional and recurrent networks to identify and explain cyber threats in Industrial Internet of Things (IIoT) networks, using an SVM for error construction. By applying a two-step sliding window approach, the framework effectively extracted the temporal and spatial features of malicious events for classification and explanation purposes. The empirical results showed that this framework outperformed the state-of-the-art methods, demonstrating its robustness and suitability for real-world IIoT-based networks. This study utilized encrypted Wi-Fi traffic data from three UAV networks: Parrot Bebop 1, DBPower UDI, and DJI Spark. This made it possible to evaluate the performance of various UAV models and determine their general validity, unlike in previous studies. The LIME technique was utilized to improve the performance and interpretability of SVM models. This enabled the identification of the features that are critical for intrusion detection and made the decision-making processes of the models transparent. Such an approach is rare in the literature and is an innovative aspect of this work. This study trained SVM models considering both one-way and two-way communication flow modes. The communication mode is an essential factor that was often overlooked in previous works, and it helped us better understand how UAVs behave in real-world scenarios. This study demonstrates the performance superiority and effectiveness of the IDS model developed with Bayesian optimization-based fine-tuning through a comprehensive comparison. This thorough analysis makes the scientific contributions and practical applications of the study more evident.

The principal contributions of this research can be succinctly summarized as follows.

  • A linear SVM model was developed and engineered for UAV attack risk detection.

  • The LIME method, an XAI technique that produces local explanations, was used to elucidate the results obtained from the proposed model. This approach allowed us to interpret the significance of each feature, shedding light on the performance of the model.

  • This study represents a pioneering effort to introduce an interpretable SVM model for risk detection. The model is based on encrypted Wi-Fi traffic records sourced from the Parrot Bebop 1, DBPower UDI, and DJI Spark UAVs, encompassing bidirectional and unidirectional communication flow modes.

  • Furthermore, this research delved into unraveling the intricate nonlinear relationships among features, marking a departure from the conventional variable selection approach employed in classic statistical methods.

The structure of this paper is as follows. Section 2 elucidates the employed materials and methods, providing a detailed architectural overview. Section 3 presents the experimental results obtained from the proposed framework. Finally, Section 4 offers concluding remarks and provides a comprehensive discussion of the paper.

2 Materials and methods

2.1 Communication flow modes

UAVs can establish direct connections with devices such as smartphones or tablets by creating Wi-Fi networks. The data utilized in this study were collected through direct Wi-Fi connections between UAVs and their controllers. Specifically, three types of UAVs, namely, Parrot Bebop, DBPower UDI, and DJI Spark, as illustrated in Fig. 1, were employed to capture Wi-Fi data under both the bidirectional and unidirectional communication flow modes [25].

Fig. 1 The UAVs utilized in this study [9]

(a) The bidirectional communication flow mode supports the two-way exchange of data and commands between a drone and the ground control station (GCS): information is transmitted from the GCS to the drone and from the drone to the GCS. This mode encompasses three categories of traffic: the uplink flow, the downlink flow, and the total traffic flow. Nine statistical parameters were calculated for each flow and for each of the two raw data sources (packet size and packet interarrival time), yielding \(9 \times 3 \times 2 = 54\) features.

(b) The unidirectional communication flow mode describes a scenario in which data and commands flow in one direction only, predominantly from the GCS to the drone. In such an arrangement, the drone receives instructions and commands from the operator but may not return real-time data or feedback. This mode encompasses only the total traffic flow, from which nine statistical parameters were computed for each of the two raw data sources, yielding \(9 \times 2 = 18\) features.

2.2 Methodology for the ML

In this investigation, UAV intrusion detection was formulated as a binary classification problem that separates normal from anomalous UAV traffic, and it was modeled with a linear SVM. The modeling steps are depicted in Fig. 2.

Fig. 2 Proposed model architecture

2.2.1 Z score standardization

The packet sizes and interarrival times between the data packets obtained in bidirectional and unidirectional communication flow modes served as raw data sources. The dataset derived from these sources underwent processing utilizing the z score standardization method, as outlined in Eq. 1 [26].

$$\begin{aligned} X^{*} = \frac{X - \bar{X}}{\sigma _{X}} \end{aligned}$$
(1)
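As a minimal sketch of Eq. 1, the snippet below standardizes a single feature column in plain Python. The sample packet sizes are hypothetical, and the use of the population standard deviation is an assumption, since the text does not state which estimator was applied.

```python
import statistics


def z_score(values):
    """Standardize one feature column: X* = (X - mean) / std (Eq. 1)."""
    mean = statistics.fmean(values)
    # Assumption: population standard deviation; the paper does not specify.
    std = statistics.pstdev(values)
    if std == 0:
        # A constant column carries no information; map it to zeros.
        return [0.0 for _ in values]
    return [(v - mean) / std for v in values]


# Hypothetical packet sizes in bytes, for illustration only.
packet_sizes = [60.0, 120.0, 1500.0, 80.0, 640.0]
standardized = z_score(packet_sizes)
```

After the transformation, each column has zero mean and unit standard deviation, so features measured in different units (bytes and seconds) contribute on a comparable scale to the SVM.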

2.2.2 SVM model

A dataset D consisting of n elements is structured as \((X_1, Y_1), (X_2, Y_2),..., (X_n, Y_n)\), where \(Y_i \in \{+1, -1\}\). Classifying the data requires selecting the separating hyperplane that maximizes the margin between two parallel hyperplanes, \(H_1\) and \(H_2\). The hyperplane midway between \(H_1\) and \(H_2\) is denoted \(H_0\); it is the optimal hyperplane that linearly separates the two classes and can be expressed as in Eq. 2 [27, 28].

$$\begin{aligned} H_0: \sum _{i=1}^{n} w_i x_i + b = 0 \end{aligned}$$
(2)

In Eq. 2, n denotes the number of attributes (rather than the number of samples), \(W = (w_1, w_2,..., w_n)\) is the weight vector, and b is a constant bias term. For an input \(X = (x_1, x_2,..., x_n)\), the hyperplanes \(H_1\) and \(H_2\) are formulated as shown in Eqs. 3 and 4, respectively.

$$\begin{aligned}{} & {} H_1:W^T X+b=1 \end{aligned}$$
(3)
$$\begin{aligned}{} & {} H_2:W^T X+b=-1 \end{aligned}$$
(4)

The points lying on the positive side of \(H_0\) satisfy Eq. 5 and are assigned the label \(+1\).

$$\begin{aligned} W^T X+b>0,y_1=+1 \end{aligned}$$
(5)

The points lying on the negative side of \(H_0\) satisfy Eq. 6 and are assigned the label \(-1\).

$$\begin{aligned} W^T X+b<0,y_2=-1 \end{aligned}$$
(6)

The observations lying on hyperplanes \(H_1\) and \(H_2\) are called support vectors. Let \(X_z\) denote such a point; its distance from \(H_0\) is computed as shown in Eq. 7.

$$\begin{aligned} d=\frac{|W^{T} X_{z} + b|}{\left\| W \right\| } \end{aligned}$$
(7)
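The decision rule of Eqs. 5 and 6 and the point-to-hyperplane distance of Eq. 7 can be sketched as follows. This is an illustrative sketch only: the weight vector, bias, and sample point are hypothetical values chosen so that the point lies on \(H_1\), and the distance is taken in its standard form \(|W^T X + b| / \Vert W \Vert \).

```python
import math


def decision_value(w, x, b):
    """Return W^T X + b for weight vector w, sample x, and bias b."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def classify(w, x, b):
    """Label +1 on the positive side of H0 (Eq. 5), otherwise -1 (Eq. 6)."""
    return 1 if decision_value(w, x, b) > 0 else -1


def distance_to_h0(w, x, b):
    """Perpendicular distance from x to H0: |W^T X + b| / ||W||."""
    norm = math.sqrt(sum(wi * wi for wi in w))
    return abs(decision_value(w, x, b)) / norm


# Hypothetical two-attribute model, for illustration only.
w, b = [3.0, 4.0], -5.0
support_vector = [3.0, -0.75]  # 3*3 + 4*(-0.75) - 5 = 1, so it lies on H1

value = decision_value(w, support_vector, b)
distance = distance_to_h0(w, support_vector, b)
```

For any support vector the decision value is \(\pm 1\), so its distance from \(H_0\) equals \(1/\Vert W \Vert \); maximizing the margin is therefore equivalent to minimizing \(\Vert W \Vert \).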

2.3 Fine-tuning with Bayesian optimization

This study used Bayesian optimization to estimate the optimal hyperparameter combination for the SVM model, combining a prior over the objective function with the observations collected during the search [29, 30].

Bayesian optimization mathematically calculates where the objective function \({\mathcal {F}}(x)\) is maximized:

  • Bayesian optimization starts with a probabilistic model of the objective function. This model is usually represented by Gaussian processes (GPs). Gaussian processes provide a probability distribution for \({\mathcal {F}}(x)\) with respect to any combination of hyperparameters \(x\).

  • Initially, the objective function is evaluated on several hyperparameter combinations \(x_1, x_2, \ldots , x_n\) and observations \(y_1 = {\mathcal {F}}(x_1), y_2 = {\mathcal {F}}(x_2), \ldots , y_n = {\mathcal {F}}(x_n)\) are obtained for these points.

  • As new observations arrive, the Gaussian process updates the posterior probability distribution. This is computed using Bayes’ theorem, as in Eq. 8.

    $$\begin{aligned} P({\mathcal {F}} | X, Y) = \frac{P(Y | X, {\mathcal {F}}) P({\mathcal {F}})}{P(Y | X)} \end{aligned}$$
    (8)

    Here \(X = [x_1, x_2, \ldots , x_n]\) and \(Y = [y_1, y_2, \ldots , y_n]\) are observation points.

  • Bayesian optimization uses an acquisition function to select the points to be observed next. Two commonly used acquisition functions are the expected improvement (EI), given in Eq. 9, and the upper confidence bound (UCB), given in Eq. 10.

    $$\begin{aligned} EI(x) = {\mathbb {E}}[\max (0, {\mathcal {F}}(x) - {\mathcal {F}}^*)] \end{aligned}$$
    (9)

    Here, \({\mathcal {F}}^*\) is the best objective value observed so far.

    $$\begin{aligned} \text {UCB}(x) = \mu (x) + \kappa \sigma (x) \end{aligned}$$
    (10)

    Here, \(\mu (x)\) is the estimated mean, \(\sigma (x)\) is the estimated uncertainty, and \(\kappa \ge 0\) is a parameter that balances exploration against exploitation.

  • A new hyperparameter combination \(x_{\text {new}}\) is chosen at the location where the acquisition function is maximal, and the objective function is evaluated at this point. This new observation is used to update the posterior distribution.

  • This process is repeated until the stopping criterion of the objective function or a certain number of iterations is reached.
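The two acquisition functions in Eqs. 9 and 10 can be evaluated in closed form once the Gaussian process supplies a posterior mean and standard deviation at a candidate point. The sketch below is not the authors' implementation: the candidate names, posterior values, and the default \(\kappa = 2\) are hypothetical, and the box constraint \(C\) is used only as an example of an SVM hyperparameter.

```python
import math


def normal_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def expected_improvement(mu, sigma, best):
    """Closed-form EI (Eq. 9) for a Gaussian posterior N(mu, sigma^2), maximization."""
    if sigma <= 0.0:
        return max(0.0, mu - best)
    z = (mu - best) / sigma
    return (mu - best) * normal_cdf(z) + sigma * normal_pdf(z)


def upper_confidence_bound(mu, sigma, kappa=2.0):
    """UCB (Eq. 10): posterior mean plus kappa times posterior uncertainty."""
    return mu + kappa * sigma


# Hypothetical posterior summaries (name, mean accuracy, std) at three candidates.
candidates = [("C=0.1", 0.95, 0.01), ("C=1", 0.97, 0.02), ("C=10", 0.96, 0.05)]
best_so_far = 0.97  # best validation accuracy observed so far (assumed)

next_by_ei = max(candidates, key=lambda c: expected_improvement(c[1], c[2], best_so_far))
next_by_ucb = max(candidates, key=lambda c: upper_confidence_bound(c[1], c[2]))
```

In this toy setting both criteria favor the candidate with the largest uncertainty, which illustrates how the acquisition function steers the search toward unexplored regions even when their predicted mean is not the highest.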

2.3.1 Evaluation metrics

In binary classification models, accuracy, specificity, sensitivity, and precision are employed to evaluate the achieved modeling performance [31]. These metrics are calculated using Eqs. 11–14, respectively, from the numbers of true positives (TP), true negatives (TN), false positives (FP), and false negatives (FN). Minimizing the FN rate is crucial for optimizing the accuracy of a classification model. A receiver operating characteristic (ROC) analysis represents the overall performance curve of the tested classification model, which is generated from different pairs of sensitivity and false-positive rate values. The area under the ROC curve (AUC) reflects the accuracy of the classification model [32]; it is based on the true-positive rate and the false-positive rate and is calculated according to Eq. 15.

$$\begin{aligned}{} & {} \text {Accuracy} = \frac{\text {TP} + \text {TN}}{\text {TP} + \text {TN} + \text {FP} + \text {FN}} \end{aligned}$$
(11)
$$\begin{aligned}{} & {} \text {Specificity} = \frac{\text {TN}}{\text {TN} + \text {FP}} \end{aligned}$$
(12)
$$\begin{aligned}{} & {} \text {Sensitivity} = \frac{\text {TP}}{\text {TP }+ \text {FN}} \end{aligned}$$
(13)
$$\begin{aligned}{} & {} \text {Precision} = \frac{\text {TP}}{\text {TP }+ \text {FP}} \end{aligned}$$
(14)
$$\begin{aligned}{} & {} \text {AUC}=\int _{0}^{1} \left( \frac{\text {TP}}{\text {TP }+ \text {FN}} \right) d\left( 1- \frac{\text {TN}}{\text {TN }+ \text {FP}} \right) \end{aligned}$$
(15)
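The metrics in Eqs. 11–14 follow directly from the four confusion-matrix counts, and the integral in Eq. 15 can be approximated numerically. The following sketch uses hypothetical counts and ROC points, and the trapezoidal rule is one common approximation rather than necessarily the one used in the study.

```python
def classification_metrics(tp, tn, fp, fn):
    """Accuracy, specificity, sensitivity, and precision (Eqs. 11-14)."""
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "specificity": tn / (tn + fp),
        "sensitivity": tp / (tp + fn),
        "precision": tp / (tp + fp),
    }


def auc_trapezoid(fpr, tpr):
    """Approximate Eq. 15 with the trapezoidal rule over ROC points sorted by FPR."""
    area = 0.0
    for i in range(1, len(fpr)):
        area += (fpr[i] - fpr[i - 1]) * (tpr[i] + tpr[i - 1]) / 2.0
    return area


# Hypothetical confusion-matrix counts, for illustration only.
metrics = classification_metrics(tp=90, tn=95, fp=5, fn=10)

# Hypothetical ROC curve with a single operating point between the corners.
auc = auc_trapezoid([0.0, 0.05, 1.0], [0.0, 0.90, 1.0])
```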

2.4 Explanation of the proposed model with LIME

Let f denote the trained SVM model and x an instance of the Wi-Fi dataset D whose prediction is to be explained. The explanation model is \(g \in G\), where G denotes the set of interpretable models that can be visually presented to a user. \(\pi _{x} \left( z \right) \) represents the proximity between an instance z and x and establishes the locality around x. The loss function \(L(f, g, \pi _{x})\) measures how poorly g approximates f within the locality defined by \(\pi _{x}\), and \(\Omega (g)\) denotes the complexity of the explanation model. The explanation \(\xi (x)\) is obtained by minimizing L while keeping \(\Omega (g)\) low enough for human comprehension, as expressed in Eq. 16 [33].

$$\begin{aligned} \xi (x) = \text {argmin}_{g \in G} \left[ L(f,g,\pi _{x})+\Omega (g) \right] \end{aligned}$$
(16)

Equation 17 defines the proximity measure \(\pi _{x}(z)\), where D is a distance function and \(\sigma \) is the kernel width.

$$\begin{aligned} \pi _{x}(z)=\text {exp}\left( -\frac{D(x,z)^{2}}{\sigma ^{2}} \right) \end{aligned}$$
(17)

Using the proximity measure \(\pi _{x}(z)\) from Eq. 17, the locality-weighted loss that enters the objective function is presented in Eq. 18. Here, \(g(z^{'})\) signifies the value estimated by the explanation model in the \(d^{'}\)-dimensional interpretable space, and f(z) represents the value estimated by the SVM model in the original d-dimensional space. The perturbed \(z^{'}\) values are generated by toggling features on and off.

$$\begin{aligned} L(f,g,\pi _{x}) = \sum _{z,z^{'}\in Z}\pi _{x}(z)\left( f(z)-g(z^{'}) \right) ^{2} \end{aligned}$$
(18)

The perturbed data \(z^{'}\) are mapped back to the original input space, yielding z as the new input for f. The predictions f(z) serve as the labels of the perturbed samples, so the pairs of \(z^{'}\) and f(z) constitute the dataset on which g is fitted. In this way, the predictions of the SVM model can be explained by a locally trained interpretable model [34, 35].
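To make the proximity kernel of Eq. 17 and the locality-weighted loss of Eq. 18 concrete, the sketch below evaluates both on a toy example. The Euclidean distance, the kernel width, the scoring function `f`, and the surrogate `g` are all assumptions introduced for illustration; in practice f would be the trained SVM and g would be fitted rather than fixed by hand.

```python
import math


def proximity(x, z, sigma=1.0):
    """Exponential kernel of Eq. 17, with Euclidean distance assumed for D."""
    squared_distance = sum((xi - zi) ** 2 for xi, zi in zip(x, z))
    return math.exp(-squared_distance / sigma ** 2)


def local_loss(f, g, x, samples, sigma=1.0):
    """Locality-weighted squared loss of Eq. 18.

    samples holds (z, z_prime) pairs: z lives in the original feature space,
    and z_prime is its binary interpretable representation.
    """
    return sum(
        proximity(x, z, sigma) * (f(z) - g(z_prime)) ** 2
        for z, z_prime in samples
    )


def f(z):
    """Hypothetical black-box score standing in for the SVM output."""
    return 2.0 * z[0] - z[1]


def g(z_prime):
    """Hypothetical linear surrogate defined on the on/off representation."""
    return 2.0 * z_prime[0] - 1.0 * z_prime[1]


x = [1.0, 1.0]  # instance being explained
# Perturbations obtained by switching features off (set to zero here).
samples = [([1.0, 1.0], [1, 1]), ([1.0, 0.0], [1, 0]), ([0.0, 1.0], [0, 1])]

loss = local_loss(f, g, x, samples)
```

Because the surrogate reproduces the black-box score exactly on these perturbations, the loss is zero; a poorer surrogate would be penalized most heavily for errors on samples close to x, where the kernel weight is largest.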

3 Experimental results

The procedural steps of the model presented in this study are illustrated in Fig. 3.

Fig. 3 Architecture of the proposed model

According to Fig. 3, the processing steps of the proposed architecture in this study were as follows.

Step 1 Encrypted raw Wi-Fi recordings were collected from the Parrot Bebop 1, DBPower UDI, and DJI Spark UAVs under the bidirectional and unidirectional communication flow modes. As shown in Tables 1 and 2, 9 features \(\times \) 2 sources \(\times \) 3 directional flows = 54 features were obtained under the bidirectional communication flow mode. Similarly, as indicated in Table 3, 9 features \(\times \) 2 sources = 18 features were obtained under the unidirectional communication flow mode.

Table 1 Features extracted from Source 1 under the bidirectional communication flow mode
Table 2 Features extracted from Source 2 under the bidirectional communication flow mode
Table 3 Features extracted under the unidirectional communication flow mode

The formulas for the mean (\(\bar{x}\)), median absolute deviation (MAD), standard deviation (STD; \(\sigma \)), skewness (\(\gamma \)), kurtosis (\(\beta \)), maximum (H), minimum (L), and mean square (MS) are provided in Eqs. 19–26, respectively; the ninth feature, the median, is the middle value of the sorted samples.

$$\begin{aligned}{} & {} \bar{x} =\frac{1}{N}\sum _{i=1}^{N}x(i) \end{aligned}$$
(19)
$$\begin{aligned}{} & {} \text {MAD} = \text {median}(\left| x(i)-\text {median}(x) \right| ) \end{aligned}$$
(20)
$$\begin{aligned}{} & {} \sigma =\sqrt{\frac{1}{N-1}\sum _{i=1}^{N}\left( x(i)-\text {mean}(x)\right) ^{2}} \end{aligned}$$
(21)
$$\begin{aligned}{} & {} \gamma =\frac{1}{N}\sum _{i=1}^{N}\left( \frac{x(i)-\text {mean}(x)}{\sigma }\right) ^{3} \end{aligned}$$
(22)
$$\begin{aligned}{} & {} \beta =\frac{1}{N}\sum _{i=1}^{N}\left( \frac{x(i)-\text {mean}(x)}{\sigma }\right) ^{4} \end{aligned}$$
(23)
$$\begin{aligned}{} & {} H=(\text {max}(x(i))\mid i=1\cdots N) \end{aligned}$$
(24)
$$\begin{aligned}{} & {} L=(\text {min}(x(i))\mid i=1\cdots N) \end{aligned}$$
(25)
$$\begin{aligned}{} & {} MS=\frac{1}{N}\sum _{i=1}^{N}(x(i))^{2} \end{aligned}$$
(26)
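The nine statistical features can be computed for one traffic window as sketched below. The function name `flow_features` and the sample values are hypothetical; the formulas follow Eqs. 19–26 in their standard reading, with deviations from the mean squared (or raised to the third and fourth powers) inside the sums.

```python
import math
import statistics


def flow_features(x):
    """Return the nine statistical features of one sequence of measurements."""
    n = len(x)
    mean = statistics.fmean(x)
    median = statistics.median(x)
    # Eq. 20: median of the absolute deviations from the median.
    mad = statistics.median([abs(v - median) for v in x])
    # Eq. 21: sample standard deviation with the N - 1 denominator.
    std = math.sqrt(sum((v - mean) ** 2 for v in x) / (n - 1))
    # Eqs. 22 and 23: third and fourth standardized moments.
    skewness = sum(((v - mean) / std) ** 3 for v in x) / n
    kurtosis = sum(((v - mean) / std) ** 4 for v in x) / n
    return {
        "mean": mean,
        "median": median,
        "mad": mad,
        "std": std,
        "skewness": skewness,
        "kurtosis": kurtosis,
        "max": max(x),
        "min": min(x),
        "mean_square": sum(v * v for v in x) / n,
    }


# Hypothetical packet interarrival times in milliseconds, for illustration only.
interarrival_times = [1.0, 2.0, 3.0, 4.0, 5.0]
features = flow_features(interarrival_times)
```

Applying this function to the packet sizes and to the interarrival times of each flow yields 9 \(\times \) 2 features per flow, which matches the feature counts stated in Step 1.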

The interdependencies of the nine features are delineated in Tables 1 through 3. Each node symbolizes a distinct feature, while each hypersegment denotes a feature component shared among various features.

Step 2 The gathered data were subjected to feature processing by implementing the z score normalization technique.

Step 3 The datasets were evaluated using a linear model formulated through the SVM methodology.

Step 4 The predictions of the linear SVM model were explained using LIME, an explanation method that treats the model as a black box.

3.1 Dataset

The processed datasets obtained from the Parrot Bebop 1, DBPower UDI, and DJI Spark UAVs and their sizes are presented in Table 4.

Table 4 Sizes of the datasets acquired from different UAV sources

The three-dimensional visual representations of the datasets procured from various UAV sources are shown in Figs. 4 and 5.

Fig. 4 Mesh visualizations produced under the bidirectional communication flow mode

Fig. 5 Mesh visualizations produced under the unidirectional communication flow mode

3.2 Linear SVM modeling

When the datasets acquired from the different UAV sources were modeled with the linear SVM, 70% of the data were reserved for training and the remaining 30% for testing. Tenfold cross-validation was applied to the training data to guard against overfitting. For classification, the UAV category was assigned the label ’1’, while the anomaly category was assigned the label ’0’.
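The evaluation protocol described above, a 70/30 hold-out split followed by tenfold cross-validation on the training portion, can be sketched with index bookkeeping alone. The function names, the random seed, and the dataset size of 1000 records are hypothetical; the study's actual tooling is not specified here.

```python
import random


def holdout_split(n_samples, test_fraction=0.3, seed=0):
    """Shuffle sample indices and split them into training and test sets."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_test = round(n_samples * test_fraction)
    return indices[n_test:], indices[:n_test]


def k_fold(indices, k=10):
    """Yield (train, validation) index lists for k-fold cross-validation."""
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        validation = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, validation


# Hypothetical dataset of 1000 records.
train_idx, test_idx = holdout_split(1000)
splits = list(k_fold(train_idx, k=10))
```

Each training record is used for validation exactly once across the ten folds, while the test indices are never touched during model selection, so the reported test metrics remain an unbiased check.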

3.2.1 SVM modeling results obtained under the bidirectional communication flow mode

The encrypted Wi-Fi records produced by the Parrot Bebop, DBPower UDI, and DJI Spark UAVs under the bidirectional communication flow mode were modeled using a linear SVM. The training and validation results obtained with the linear SVM are summarized in Table 5.

Table 5 Training and validation results obtained under the bidirectional communication flow mode

As shown in Table 5, the validation accuracies achieved for all three UAVs (DJI Spark, DBPower UDI, and Parrot Bebop) were 100%. Both the prediction speed and the training time increased in the order DJI Spark, DBPower UDI, and Parrot Bebop. Table 6 reports the test performance achieved under the bidirectional communication flow mode.

Table 6 Test results obtained under the bidirectional communication flow mode

The confusion matrices produced for the training and validation datasets are presented in Fig. 6.

Fig. 6 Training and validation confusion chart produced under the bidirectional communication flow mode

The confusion matrices produced for the test dataset are presented in Fig. 7.

Fig. 7 Test confusion chart produced under the bidirectional communication flow mode

In the intrusion detection task under the bidirectional communication flow mode, the models built on the encrypted Wi-Fi signals from the Parrot Bebop and DBPower UDI UAVs performed best. With a test performance of 99.94%, the DJI Spark model performed marginally worse than the other two, which is attributable to its smaller dataset. In the ROC analyses performed during the testing phase, a 100% success rate was obtained for all three UAVs.

3.2.2 SVM modeling results obtained under the unidirectional communication flow mode

The encrypted Wi-Fi transmission logs of the Parrot Bebop, DBPower UDI, and DJI Spark UAVs operating under the unidirectional communication flow mode were subjected to modeling through a linear SVM methodology. The outcomes associated with the training and validation exercises utilizing the linear SVM are concisely encapsulated in Table 7.

Table 7 Training and validation results obtained under the unidirectional communication flow mode

Referencing Table 7, the validation accuracies attained for the Parrot Bebop and DBPower UDI UAVs were 100%. In comparison, the validation accuracy achieved for the DJI Spark UAV was 98.1%. In the context of the total costs associated with these three UAVs, Parrot Bebop and DJI Spark yielded values of 1, whereas DBPower UDI returned a value of 0. In descending order, the prediction speeds of the UAVs were ranked in the following order: DJI Spark > Parrot Bebop > DBPower UDI. Concerning the training time, the rankings of the durations from shortest to longest were as follows: DBPower UDI < DJI Spark < Parrot Bebop. The DJI Spark UAV possessed fewer data samples within the dataset than did the other two UAVs. Owing to the reduced sample size, its performance was lower than that of the other two UAVs. When evaluating the prediction speed and training time parameters, the following order was discerned from lowest to highest: DJI Spark, DBPower UDI, and Parrot Bebop. Table 8 presents the test performance achieved under the unidirectional communication flow mode.

Table 8 Test data performance evaluation metrics produced under the unidirectional communication flow mode

The confusion matrices obtained for the training and validation datasets are presented in Fig. 8.

Fig. 8 Training and validation confusion matrices produced under the unidirectional communication flow mode

The confusion matrices obtained for the test dataset are presented in Fig. 9.

Fig. 9 Test confusion matrices produced under the unidirectional communication flow mode

In the intrusion detection task conducted under the unidirectional communication flow mode, the encrypted Wi-Fi signals procured from all three UAVs attained an accuracy of 100%. Under empirical trials executed throughout the testing phase, the success rates derived from the ROC analyses performed for all three categories of UAVs were uniformly recorded as 100%.

3.3 Explanation results obtained with LIME

Interpretation results were derived from the dataset employed to train the intrusion detection system for identifying the discordant instances of each UAV. The LIME method was utilized to extract descriptions of normal UAVs and anomalous data records from the dataset. The process of extracting rules from the training dataset using the LIME explainer is delineated in Algorithm 1.

Initially, the training data (X_train) were annotated with the target (Y_train) by employing the SVM model, and a descriptor of this dataset encompassing all its features was acquired. The prediction function of the trained model was then obtained. By leveraging the LIME descriptor and the prediction function of the model, a description of the normal (UAV)/anomalous records in the dataset was obtained. Subsequently, the software generated samples of a synthetic dataset and fit a simple model for the query point with the critical predictors. By specifying the number of features N (where N was 54 for the bidirectional communication flow mode and 18 for the unidirectional communication flow mode), the most salient features to be utilized were identified. If more than one feature did not pertain to set z, this input traffic was categorized as anomalous. If all features pertained to the dataset, the input traffic was classified as normal and necessitated no further action (as per Algorithm 1).

Algorithm 1 Extract significant parameters and detect UAV attacks

Once the number of features was selected, the explication of the normal (UAV)/anomalous data records in the dataset was finalized using the prediction function of the model. The annotation process yielded a set containing the N most significant features among all selected normal (UAV)/anomalous data records. The algorithm then identified the z features that constituted the normal (UAV)/anomalous data descriptions. In the detection phase, the intrusion detection system analyzed the network traffic under surveillance. Explanation results were obtained by using LIME on all data, including both the attack and normal test data, to evaluate the detection performance of the model. Subsequently, an analysis was conducted to ascertain whether the extracted features belonged to set z. If all the features of the data pertained to the set, they were deemed normal. If a feature did not pertain to set z, the data were classified as an attack.
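The detection rule described above can be read as a set-membership check, which the following sketch illustrates. This is one possible interpretation, not a reproduction of Algorithm 1: the function names, the use of the absolute LIME weight to rank features, the reference set, and all weight values are hypothetical.

```python
def top_features(lime_weights, n_top):
    """Return the names of the n_top features with the largest absolute weight."""
    ranked = sorted(lime_weights, key=lambda name: abs(lime_weights[name]), reverse=True)
    return set(ranked[:n_top])


def classify_record(record_weights, reference_set, n_top):
    """Label a record normal if all of its salient features belong to the reference set z."""
    salient = top_features(record_weights, n_top)
    return "normal" if salient <= reference_set else "attack"


# Hypothetical reference set z extracted from the training explanations.
reference_set = {"X23", "X41", "X50"}

# Hypothetical LIME weights for two monitored records.
record_a = {"X23": 0.40, "X41": -0.30, "X50": 0.20, "X5": 0.01}
record_b = {"X23": 0.10, "X6": 0.50, "X41": -0.30, "X50": 0.05}

label_a = classify_record(record_a, reference_set, n_top=3)
label_b = classify_record(record_b, reference_set, n_top=3)
```

In this toy case the first record is explained entirely by features in the reference set, whereas the second is dominated by a feature outside it and is therefore flagged.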

Fig. 10 The LIME model for the bidirectional communication flow mode

In Fig. 10a, which shows the anomaly class detection results obtained for the Parrot Bebop UAV, parameters X41, X23, X5, and X50 were the top four parameters with negative impacts, whereas X24, X42, and X6 were the significant anomaly detection parameters. In Fig. 10b, which shows the UAV class detection results obtained for the DBPower UDI UAV, parameters X14, X46, and X42 were the top parameters with negative impacts, whereas X41, X23, and X50 were the significant UAV detection parameters. In Fig. 10c, the anomaly detection results obtained for the DJI Spark UAV show that parameters X50 and X41 had negative impacts, whereas X23 and X14 were the significant anomaly detection parameters.

Fig. 11 The LIME model for the unidirectional communication flow mode

In Fig. 11a, the anomaly class detection results obtained for the Parrot Bebop UAV show that the coefficient values for parameters X3 and X4 ranked among the top four parameters exerting negative influences. Concurrently, X1, X17, and X5 were identified as significant anomaly detection parameters for the Parrot Bebop UAV. In Fig. 11b, which shows the anomaly class detection results obtained for the DBPower UDI UAV, the coefficient values for parameters X4 and X5 emerged as the paramount parameters manifesting negative impacts. Simultaneously, X17, X2, and X3 were ascertained to be significant anomaly detection parameters for the DBPower UDI UAV. In Fig. 11c, which concerns the anomaly detection results obtained for the DJI Spark UAV, the X1, X9, X2, X4, and X3 parameters were shown to have negative impacts. Concurrently, X7 was recognized as a significant anomaly detection parameter for the DJI Spark UAV.

3.4 Cost analysis of the proposed model

The cost analysis steps of the model developed in this study, including evaluations of its temporal complexity, spatial complexity, computational resource costs, and practical implementation costs, were as follows. The temporal complexity of the linear SVM with LIME for explainability was calculated as shown below.

  • Linear SVM training is \( O(nd) \), where \( n \) is the number of samples and \( d \) is the number of features.

  • LIME explainability is \( O(n \cdot d^2) \).

The spatial complexity of the linear SVM with LIME for explainability was calculated as follows.

  • Linear SVM is \( O(nd) \) for storing the input data and model parameters.

  • LIME is \( O(nd) \) for storing perturbations and the local models.

The computational resource cost of the linear SVM with LIME for explainability was calculated as follows.

  • Training is 2 CPU hours.

  • Explainability is 3 CPU hours.

4 Conclusion

A linear SVM model was developed to detect attacks on UAVs from encrypted Wi-Fi data. For this purpose, data from Parrot Bebop 1, DBPower UDI, and DJI Spark drones were utilized and processed separately for each UAV under the bidirectional and unidirectional communication flow modes. The predictions of the linear model were explained with LIME, which treats the model as a black box; under each communication flow mode, LIME identified the parameters that were most influential for UAV and anomaly prediction, which made the decision-making process of the model transparent. The model was fine-tuned using Bayesian optimization to improve its predictive accuracy and robustness, and a cost analysis was performed to assess its practicality and efficiency. As shown in Table 9, a comparison with prior methods presented in the literature underscores the significance of the success achieved in this investigation [13, 14, 16, 18]. Because the model was trained and evaluated under both unidirectional and bidirectional communication flow modes on traffic from three different UAV networks, the results indicate that it performs better than existing methods and is more suitable for practical applications. In the future, particularly with the augmentation of the encrypted Wi-Fi data derived from DJI Spark drones, the objective will be to attain better results.

Table 9 Comparison with intrusion detection systems in the literature that use encrypted Wi-Fi data for UAVs