Introduction

Wireless sensor networks (WSNs) are task-oriented networks whose ultimate goal is to sense the monitored area and obtain the required data [1, 2]. To ensure the accuracy and validity of the data, researchers have investigated sensor deployment strategies [3, 4], energy efficiency [5, 6], and coverage [7, 8]. The deployment strategy of sensor nodes directly affects network performance, and energy efficiency directly determines network lifetime. Coverage is the basis for target monitoring and information acquisition; it directly affects sensor energy consumption and network lifetime, and is a focus of current research [9].

Network coverage can be classified in several ways depending on the application of wireless sensors; two common classifications are based on how the nodes are deployed and on whether the nodes are in motion [8, 10]. According to the deployment method of nodes, coverage can be divided into two categories: deterministic coverage and random coverage. If the environment to be covered is known or changes little, the locations of the nodes can be determined in advance, which suits deterministic coverage. If the environment to be covered is harsh and its conditions cannot be predicted in advance, only random deployment can be used. Random deployment produces random coverage effects: it may achieve optimal coverage, cause coverage redundancy, or leave too many coverage blind areas [11, 12]. According to whether the nodes are in motion, coverage can be divided into static and dynamic network coverage. Static networks are networks in which the nodes do not move [13]; dynamic network coverage means the nodes have some mobility and can provide better monitoring services to the area [14]. Regardless of the coverage method, the goal is to maximize network coverage, and efficient target coverage algorithms can improve the performance of wireless sensors [15]. This paper uses mobile nodes to increase network coverage: when a coverage gap occurs, available sensor nodes are moved to fill the void [16]. Nodes can move in various ways, such as random, fixed, or adaptive movement. Random movement is simple and requires no path planning, but the stability of coverage cannot be guaranteed. Fixed movement means that the movement path of the nodes is scheduled in advance, which guarantees coverage to a certain extent.
Adaptive movement automatically adjusts the nodes' direction based on their real-time status; this ensures maximum network coverage but is more energy intensive. The movement method should be selected on a case-by-case basis, or the three methods above can be combined for specific scenarios. Meta-heuristic algorithms are often considered effective measures for solving optimization problems and practical tools for improving the coverage of wireless sensor networks [17,18,19]. This paper focuses on the coverage of movable nodes based on an improved meta-heuristic algorithm; the meta-heuristic algorithm is introduced next.

Meta-heuristic algorithms emerged to solve practical problems and are characterized by the ability to find global approximations to optimal solutions, with exploration and exploitation at their core. Various meta-heuristic algorithms have been proposed, for example, the gray wolf optimizer (GWO) [20], the whale optimization algorithm (WOA) [21], and Phasmatodea population evolution (PPE) [22,23,24]. In some respects, they have advantages over classical genetic algorithms (GA) [25], particle swarm optimization (PSO) [26,27,28], differential evolution (DE) [29], and other algorithms. Regardless of the type of meta-heuristic algorithm, random variables are introduced in the iterative process to keep the search from stagnating away from the optimal value [30]; this is the most significant difference between meta-heuristic algorithms and traditional methods. However, no single algorithm can be applied to all practical problems, so researchers have proposed various improvements, such as adding a restart policy [31], an orthogonal learning strategy [32], transfer learning [33], or a parallel strategy [34]. Zhu [35] proposed a multigroup-based Phasmatodea population evolution algorithm, which further improved the overall performance of PPE. Similarly, Liang [34] proposed a parallel Phasmatodea population evolution algorithm (PPPE) with a communication strategy. Chaotic models, which generate chaotic numbers between 0 and 1, are often used to improve various algorithms and have been applied in practice: dynamic analysis of chaotic digital maps via state-mapping networks [36] and an exponential chaotic model for generating robust chaos [37] were proposed in 2019, and a bird swarm algorithm with chaotic mapping was proposed in 2020 [38]. All the articles mentioned above demonstrate that chaotic mapping works well.

Table 1 Introduction to various surrogate assisted algorithms

In addition, adding a surrogate model not only improves the accuracy of the algorithm but also alleviates the time-consuming computation of expensive optimization problems. Compared with 2D coverage, 3D coverage is computationally expensive and is well suited to surrogate models. To date, many existing regression models have been used as surrogate models, such as Kriging models [39], radial basis function networks (RBFNs) [40], polynomial regression [41], and artificial neural networks [42, 43]. However, every type of regression model has limitations, and no single regression model suits all problems. For example, the Kriging model is a typical interpolation method that is sensitive to numerical noise in the computation; its modeling process can be computationally heavy and may fail to build for problems with large sample sizes [39]. What makes RBFN unique among neural network models is the introduction of a new variable, the Euclidean distance, which transforms a fitted prediction problem in a multidimensional design space into a one-dimensional problem with the distance as the independent variable. It has been shown that the RBFN model works very well for problems with many design variables and a high degree of nonlinearity [44, 45]. Many articles now use RBFN as a surrogate model. For example, Sun [46] proposed a surrogate-assisted cooperative particle swarm optimization algorithm (SACOSO) that uses two surrogate models to search for the global optimum collaboratively; experiments show that the algorithm can find the best solution for relatively high-dimensional problems in a limited number of evaluations, fully leveraging the local approximation ability of RBFN. Wu [47] proposed MASTO, a new surrogate-assisted evolutionary algorithm based on adaptive technology.
This algorithm fully utilizes the learning and local approximation capabilities of RBFN to dynamically establish the most promising RBF for each target, thereby improving the reliability of surrogate model predictions. Regis [48] proposed a particle swarm optimization algorithm using a surrogate model (RBFPSO), which constructs an RBF surrogate to identify the most promising particle positions. The RBFPSO algorithm has shown outstanding performance on a 36-D groundwater bioremediation problem, a 14-D watershed calibration problem, and ten mostly 30-D test problems. Liang [49] proposed a two-level surrogate-assisted PPE algorithm (SAPPE), which improves the PPE algorithm using two surrogate models constructed by RBFN; the algorithm was applied to the 3D coverage problem of wireless sensors and achieved good results. RBFN has strong nonlinear mapping ability and can handle very complex nonlinear problems. A summary of the above algorithms and other similar algorithms is given in Table 1.

This does not mean that only RBFN is effective. When selecting a surrogate model, it is necessary to comprehensively consider the algorithm, the characteristics of the problem to be solved, the data size, and the computing resources. Gu [50] proposed a random forest surrogate-assisted adaptive multi-objective particle swarm optimization (RFMOPSO). In this algorithm, random forests are used to speed up the search, and new adaptive rules are used to update the particle states; experiments show that the algorithm is applicable to combinatorial optimization problems. On the one hand, a random forest needs to build multiple decision trees, with a longer training time than the simpler RBFN model. On the other hand, random forests have high prediction accuracy and can handle data well under adaptive conditions. Ji [51] proposed a dual-surrogate-assisted cooperative particle swarm optimization algorithm (DSCPSO-EMM) that searches for several optimal solutions with significantly reduced individual evolutionary cost. To verify the effect of the surrogate model on the performance of DSCPSO-EMM, two variants, DSCPSO-EMM-LS and DSCPSO-EMM-BP, are proposed in that work: the former's surrogate models all take the form of RBFN-LS, while the latter uses a single-layer surrogate-assisted model combining RBFN-BP. The algorithm has significant advantages over other algorithms of the same type in solving expensive optimization problems, but it is structurally complex because it uses multiple surrogate models.

Fig. 1 3D coverage schematic

Based on the above considerations, surrogate-assisted techniques are well suited to computationally expensive problems. Therefore, in this paper, we propose a surrogate-assisted sine Phasmatodea population evolution algorithm (SASPPE) to solve expensive optimization problems. We evaluate the proposed SASPPE algorithm on multiple dimensions of 7 benchmark functions as well as on a real-world optimization problem, and the comparative results show that SASPPE is effective. The main work of this paper is as follows.

  1.

    This paper proposes a new algorithm for solving the three-dimensional coverage problem of wireless sensors. To increase network coverage, a movement mechanism is added to the sensor nodes.

  2.

    SASPPE adopts an alternation mechanism: LSAMs and the GSAM are used alternately, which effectively balances the computational cost and the algorithm's optimization ability.

  3.

    Hierarchical clustering is used to partition the sample data for the LSAMs, which balances the data used to build each LSAM. Within each LSAM, the Sine-PPE algorithm searches the solution space, which helps the algorithm avoid falling into local optima to a certain extent.

The rest of this paper is structured as follows. Section “Related work” introduces the relevant components of this paper; section “The proposed algorithm” describes the proposed SASPPE algorithm; section “Experiment” presents the experimental results; section “3D coverage of removable nodes” applies the algorithm to the 3D coverage of movable nodes; and section “Conclusion and future work” gives the summary and outlook.

Related work

Wireless sensor networks

Sensor nodes are the cornerstone of WSNs, and their sensing models significantly impact the detection and coverage capabilities of the network [7, 8]. Currently, the main sensing models of sensor nodes are the binary and probabilistic sensing models. The binary sensing model is an idealized model that ignores the attenuation of the signal during transmission, i.e., the ability to transmit the signal is constant [49]. The probabilistic sensing model takes this attenuation into account, i.e., the ability to transmit the signal decreases with increasing distance. This paper uses the binary sensing model, also called the 0–1 or Boolean sensing model. Its coverage is shown in Fig. 1a: the sensing range is a sphere, and all areas within this sphere can be detected. If a point P(\(x_{p}\),\(y_{p}\),\(z_{p}\)) does not fall within the sensing range of a sensor \(S_{i}\), the coverage value C of the point P is recorded as 0, and otherwise as 1, as shown in Eq. (1).

Where \(d(P, S_{i})\) denotes the Euclidean distance between point P and sensor node \(S_{i}\), and r denotes the sensing radius of sensor \(S_{i}\). The red line in Fig. 1a is r, and the green line is d. To ensure maximum coverage, the nodes in this paper are movable. The energy of each sensor node is limited, so a node cannot move indefinitely. It is assumed that each node starts fully charged and that energy is gradually consumed as the node moves. When some sensor nodes run out of power, mobile nodes can be moved in to fill the resulting gaps and maintain the connectivity of the network. In this paper, a node moves a constant distance each time, i.e., the energy consumed per move is also constant.

$$\begin{aligned} C(P,S_{i})= {\left\{ \begin{array}{ll} 1, &{} \text{ if } d(P,S_{i})\le r \\ 0, &{} \text{ otherwise } \end{array}\right. }, \end{aligned}$$
(1)
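As a concrete illustration, the 0–1 model of Eq. (1) can be coded directly. This is a minimal sketch; the helper names `covered` and `coverage_ratio` are illustrative and not from the paper.

```python
import numpy as np

def covered(p, s, r):
    """Binary (0-1/Boolean) sensing model of Eq. (1): point p is
    covered by sensor s iff their Euclidean distance is at most
    the sensing radius r."""
    d = np.linalg.norm(np.asarray(p, float) - np.asarray(s, float))
    return 1 if d <= r else 0

def coverage_ratio(points, sensors, r):
    """Fraction of sample points covered by at least one sensor,
    a simple Monte-Carlo estimate of network coverage."""
    hit = 0
    for p in points:
        if any(covered(p, s, r) for s in sensors):
            hit += 1
    return hit / len(points)
```

Sampling the monitored volume on a grid (or uniformly at random) and calling `coverage_ratio` gives the coverage objective that the mobile-node scheme tries to maximize.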

Radial basis function networks

RBFN, one of the methods for constructing surrogate-assisted models, is often used for costly optimization problems [40, 46]. An RBF neural network is a three-layer neural network: the transformation from the input layer to the hidden layer is nonlinear and realized by radial basis functions, while the transformation from the hidden layer to the output layer is linear and realized by weights. The basic principle of RBFN is to divide the input space into N non-overlapping subspaces, each described by an independent RBF. The value of an RBF depends only on the distance from a center, generally taken as the Euclidean distance [45]. The RBF method is an exact interpolation method, i.e., the fitted surface passes through every measured value. The interpolation function based on radial basis functions is shown in Eq. (2).

$$\begin{aligned} {\check{y}}=\sum _{i=1}^{N} \lambda _{i}\phi (\Vert {\textbf{x}} - \mathbf {x_{i}}\Vert ) + p(x), \end{aligned}$$
(2)

where N is the number of RBFs established and the number of nodes in the hidden layer. \(\varvec{\lambda } = [\lambda _{1}, \lambda _{2}, \lambda _{3},\ldots , \lambda _{N}]^{T}\) is the weight vector, \(\phi ( \cdot )\) is the kernel function, and \( \Vert \cdot \Vert \) is the Euclidean distance. Various activation functions can serve as radial basis functions, such as Gaussian, polynomial, and linear functions. The Gaussian function is shown in Eq. (3), and the cubic function in Eq. (4).

This paper uses the least squares method to solve for the unknown parameters. p is either a polynomial or a constant. Suppose the number of polynomial terms of p is m; then \({\textbf{b}} = [b_{1}, b_{2}, b_{3},\ldots , b_{m}]^{T}\), \(\mathbf {\Phi }_{ij} = \phi (x_{i} - x_{j}), (i = 1, 2, 3,\ldots , N), (j = 1, 2, 3,\ldots , N)\), \({\textbf{P}}_{ij} = p_{j}(x_{i}), (i = 1, 2,\ldots , N), (j = 1, 2,\ldots , m)\), and \({\textbf{b}}\) and \(\varvec{\lambda }\) are solved as shown in Eq. (5).

$$\begin{aligned}{} & {} \phi (r)= exp\left( -\frac{r^{2}}{2\sigma ^{2}}\right) , \end{aligned}$$
(3)
$$\begin{aligned}{} & {} \phi (r)=r^{3}, \end{aligned}$$
(4)
$$\begin{aligned}{} & {} \left[ \begin{array}{ll} \mathbf {\Phi } &{}\quad {\textbf{P}}\\ {\textbf{P}}^{T} &{}\quad 0 \end{array} \right] \left[ \begin{array}{c} \varvec{\lambda } \\ {\textbf{b}} \end{array} \right] = \left[ \begin{array}{c} {\textbf{y}} \\ 0 \end{array} \right] \end{aligned}$$
(5)
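The block system of Eq. (5) can be assembled and solved in a few lines. The sketch below uses the cubic kernel of Eq. (4) with a linear polynomial tail; the function name `rbf_fit` is an assumption for illustration, not the paper's implementation.

```python
import numpy as np

def rbf_fit(X, y):
    """Fit the augmented RBF interpolant of Eqs. (2)-(5) with the cubic
    kernel phi(r) = r^3 (Eq. (4)) and a linear polynomial tail
    p(x) = b0 + b1*x1 + ...; solves the block system of Eq. (5)."""
    X, y = np.asarray(X, float), np.asarray(y, float)
    N, d = X.shape
    r = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    Phi = r ** 3                          # kernel matrix, Eq. (4)
    P = np.hstack([np.ones((N, 1)), X])   # polynomial design matrix
    m = d + 1
    A = np.block([[Phi, P], [P.T, np.zeros((m, m))]])
    rhs = np.concatenate([y, np.zeros(m)])
    sol = np.linalg.solve(A, rhs)         # [lambda; b] as in Eq. (5)
    lam, b = sol[:N], sol[N:]

    def predict(x):
        x = np.asarray(x, float)
        rr = np.linalg.norm(X - x, axis=1)
        return float(rr ** 3 @ lam + b[0] + x @ b[1:])
    return predict
```

Because the system enforces exact interpolation, the returned predictor reproduces every training value, which is the property stated above for the RBF method.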

PPE algorithm

PPE differs from other algorithms in that each solution represents a population with two properties: population size and growth rate [22,23,24]. Solutions are generated randomly at the beginning of the evolutionary stage. To avoid losing good solutions during evolution, the PPE algorithm selects the k nearest optimal solutions to guide the subsequent population movement, where k is determined by the number of solutions N. The population position update formula is Eq. (6).

$$\begin{aligned} {\textbf{x}}^{t+1}={\textbf{x}}^{t}+\textbf{ev}, \end{aligned}$$
(6)

where \(\textbf{ev}\) represents the evolutionary trend of the population. Three types of update are available for ev, according to path dependence, the nearest optimum, and competition; the update formula is shown in Eq. (7). Here, \({\textbf{m}}\) denotes the population mutation; \(f({\textbf{x}})\) denotes the fitness value of the current population i; \({\textbf{A}}\) represents the nearest optimal solution; \({\textbf{B}}\) is an n-dimensional random variable generated from a normal distribution; and st is determined by the upper and lower bounds of the search space.

$$\begin{aligned}{} & {} \textbf{ev}^{t+1}\nonumber \\{} & {} \quad = {\left\{ \begin{array}{ll} (1- {\textbf{p}}^{t+1}){\textbf{A}} + {\textbf{p}}^{t+1}(\textbf{ev}^{t} + {{\textbf {m}}}), &{} \text{ if } \text{ path } \text{ dependence } \\ rand \cdot {\textbf{A}} + st \cdot {\textbf{B}}, &{} \text{ if } \text{ nearest } \text{ optimum } \\ \textbf{ev}^{t+1} + \frac{f({\textbf{x}}_{j})-f({\textbf{x}}_{i})}{f({\textbf{x}}_{j})} ({\textbf{x}}_{j}-{\textbf{x}}_{i}), &{} \text{ if } \text{ competition } \end{array}\right. },\nonumber \\ \end{aligned}$$
(7)

The update of population size is shown in Eq. (8), and a denotes the population growth rate, which is set as a constant 1.1 in this paper. Population competition occurs when two populations are close enough to each other. The competition formula is shown in Eq. (9).

$$\begin{aligned}{} & {} {\textbf{p}}^{t+1} = a^{t+1}{\textbf{p}}^{t}(1-{\textbf{p}}^{t}), \end{aligned}$$
(8)
$$\begin{aligned}{} & {} {\textbf{p}}_{i} = {\textbf{p}}_{i} +a _{i}{\textbf{p}}_{i}(1-{\textbf{p}}_{i}-\frac{f({\textbf{x}}_{j})}{f({\textbf{x}}_{i})}{\textbf{p}}_{j}), \ if \ d({\textbf{x}}_{i}, {\textbf{x}}_{j}) < G,\nonumber \\ \end{aligned}$$
(9)

The pseudo-code of the PPE is Algorithm  1.

Algorithm 1 PPE algorithm
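A minimal sketch of one PPE iteration may help fix ideas. Only the "nearest optimum" branch of Eq. (7) is implemented here, and the growth rate of Eq. (8) is simplified to the constant a = 1.1; mutation, path dependence, and competition are omitted, so this is an illustration rather than the full Algorithm 1.

```python
import numpy as np

rng = np.random.default_rng(0)

def ppe_step(x, ev, p, nearest_best, st, a=1.1):
    """One simplified PPE update: the 'nearest optimum' branch of
    Eq. (7) for the evolution trend, the position update of Eq. (6),
    and the logistic growth of population size p per Eq. (8)."""
    A = nearest_best - x                        # direction to nearest optimum
    ev = rng.random() * A + st * rng.standard_normal(x.shape)  # Eq. (7), 2nd case
    x = x + ev                                  # Eq. (6)
    p = a * p * (1 - p)                         # Eq. (8), constant growth rate
    return x, ev, p
```

Each candidate would carry its own (x, ev, p) state, with the branch of Eq. (7) chosen per iteration according to the population's situation.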

Chaotic model

In the optimization field, chaotic mappings can be used instead of pseudo-random number generators to produce numbers in the range 0–1 [52, 53]. The sequences generated by chaotic mappings are highly sensitive to the initial value, i.e., two similar initial values yield completely different sequences [36]. This is the essential property of chaotic mappings, and it affects the generation of optimal solutions by the algorithm: when the algorithm generates multiple diverse solutions, the global optimal solution is easier to find. Chaotic mapping can be used not only in the initialization phase of the population but also in the selection, crossover, and mutation operations; it then affects the whole evolution process of the algorithm, yielding better results than the original algorithm. Chaotic mappings commonly used in meta-heuristic algorithms are the logistic, piecewise, singer, and sine mappings. The sine mapping is single-peaked and has a value domain of [−1, 1]. Adding the sine mapping ensures that the population is adjusted within a small range, so the sine mapping is chosen in this paper. The original formula of the sine mapping is shown in Eq. (10), where \(\beta \) represents the chaotic parameter. In this paper, the value of \(\beta \) is set to 4 and \(Z_{0}\) to 0.01.

$$\begin{aligned} Z_{t+1}=\frac{4}{\beta }sin(\pi Z_{t}), 0< \beta \le 4. \end{aligned}$$
(10)
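Iterating Eq. (10) is straightforward; the short sketch below uses the paper's settings \(\beta = 4\) and \(Z_0 = 0.01\), and the helper name `sine_map` is illustrative.

```python
import math

def sine_map(z0=0.01, beta=4.0, n=5):
    """Iterate the sine chaotic map of Eq. (10):
    z_{t+1} = (4/beta) * sin(pi * z_t).
    With beta = 4 the coefficient is 1 and the orbit stays in [0, 1]."""
    z = z0
    seq = []
    for _ in range(n):
        z = (4.0 / beta) * math.sin(math.pi * z)
        seq.append(z)
    return seq
```

Feeding two nearby seeds (e.g. 0.01 and 0.0100001) through a few dozen iterations shows the divergence of similar initial values described above.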

Hierarchical clustering

Clustering is commonly used in the initial stage of data exploration or data mining and is often used for data pre-processing. Hierarchical clustering can be divided into two opposite approaches: agglomerative and divisive hierarchical clustering [54, 55]. Agglomerative clustering initially treats each object as a cluster and merges clusters into larger ones based on a criterion; divisive clustering treats all objects as one cluster and gradually splits it into smaller and smaller clusters. One advantage of hierarchical clustering is that it is easy to define similarity or distance measures and to draw a dendrogram. Another is that the number of clusters can be specified during the clustering process to achieve a more desirable clustering result.

This paper uses agglomerative clustering, which requires merging clusters; generally speaking, the two clusters with the smallest distance are merged. The way the distance between clusters is defined determines the clustering result. The most commonly used linkage methods are complete-linkage, single-linkage, and ward-linkage [56]. Single-linkage may merge clusters of different data classes because of a single pair of nearby extreme points, while complete-linkage may fail to merge similar clusters because of a single pair of distant extreme points. Therefore, ward-linkage is chosen in this paper; its calculation formula is shown in Eq. (11).

$$\begin{aligned} d(r,s)=\sqrt{\frac{2n_{r}n_{s}}{(n_{r}+n_{s})}}{\Vert {\bar{x}}_{r}-{\bar{x}}_{s}\Vert }_{2}, \end{aligned}$$
(11)

where \(\Vert \cdot \Vert _{2}\) denotes the Euclidean distance, \({\bar{x}}_{r}\) denotes the center of mass of r clusters, and \(n_{r}\) denotes the number of elements of r clusters. The number of clusters set in this paper is 3.
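The ward-linkage distance of Eq. (11) can be computed directly from two clusters; the helper name `ward_distance` below is illustrative.

```python
import numpy as np

def ward_distance(cluster_r, cluster_s):
    """Ward-linkage distance of Eq. (11):
    d(r, s) = sqrt(2*n_r*n_s / (n_r + n_s)) * ||xbar_r - xbar_s||_2,
    where xbar is the cluster centroid and n the cluster size."""
    cr = np.asarray(cluster_r, float)
    cs = np.asarray(cluster_s, float)
    nr, ns = len(cr), len(cs)
    gap = np.linalg.norm(cr.mean(axis=0) - cs.mean(axis=0))
    return float(np.sqrt(2.0 * nr * ns / (nr + ns)) * gap)
```

Agglomerative clustering then repeatedly merges the pair of clusters with the smallest `ward_distance` until the desired number of clusters (three in this paper) remains.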

The proposed algorithm

Like most meta-heuristics, the PPE algorithm suffers from slow convergence and low accuracy. The PPE algorithm works well in 30 dimensions, but its effectiveness in higher dimensions is unknown; the efficacy of most meta-heuristics decreases sharply as the dimensionality increases, and PPE has the same deficiency. To address these problems, three solutions are proposed in this paper and work collaboratively. First, to reduce the high computational cost, a surrogate-assisted model is added; the surrogate models include a global surrogate-assisted model and multiple local surrogate-assisted models, abbreviated in this paper as GSAM and LSAM, respectively. Second, the multiple LSAMs are partitioned using agglomerative hierarchical clustering to speed up the algorithm. Although adding the surrogate models improves the accuracy of the algorithm to some extent, it tends to make the algorithm premature. Therefore, finally, a chaotic mapping mechanism is added to address premature convergence. In summary, a surrogate-assisted sine PPE algorithm is proposed in this paper.

Local surrogate-assisted model

LSAMs are used before the GSAM to quickly find the current optimal solution and avoid premature convergence. Before constructing the LSAMs, the data samples are pre-processed to improve the quality of the models. This paper uses agglomerative hierarchical clustering to cluster the data samples, with the ward method to calculate the distance between two clusters; this method defines the distance simply, does not require specifying the cluster centers in advance, and can generate non-spherical clusters. Multiple LSAMs are generated from the resulting clusters to ensure that each generated LSAM is in the best condition. Each LSAM uses the cubic function as the kernel function, and p takes the form of a polynomial. The local search space is explored using the Sine-PPE algorithm. Sine mapping is used throughout the LSAM stage: it influences the generation of optimal solutions, improves the exploitation capability of the LSAMs, and prevents the algorithm from falling into local optima. After each LSAM generates its optimal particle, excellent particles around that particle are found using the sine mapping and evaluated with the true evaluation function.

The specific process of LSAM is as follows:

  • The sample data are divided into three clusters by hierarchical clustering, and three LSAMs are constructed using the sample data in the clusters.

  • The optimal solution is found using Sine-PPE, and the obtained optimal solution is added to the database (DB). Subsequently, the optimal solution and sine mapping are used to find the excellent particles around the optimal solution.

  • The suitable particles are added to the DB, and the matrix DB is updated.

Each time the current optimal solution is added, it is necessary to check whether the end condition is reached. If it is, the whole algorithm terminates; otherwise, the algorithm continues to run. The DB used here is the same DB used by the GSAM, and the LSAMs use all the data in the DB.

The pseudo-code of the LSAM is Algorithm 3.

Sine-PPE algorithm

As shown in Song’s literature [22], the overall performance of PPE on 30 functions of CEC 2014 is better than that of the GWO, GSA, SA, SLPSO, BOA, and GA algorithms. To avoid premature convergence, sine chaotic mapping is added to PPE. Chaotic mappings affect the algorithm during the iterative process, and this effect carries over to subsequent iterations; adding chaotic mapping improves the chance of the algorithm jumping out of local optima. The evolutionary part of the PPE algorithm is modified using the sine chaotic mapping; the revised sine mapping is shown in Eq. (12).

$$\begin{aligned} Z_{t} = sin(\pi Z_{t-1}) + cos(\pi Z_{t-1}), \end{aligned}$$
(12)

Considering the evolutionary process, the population will be influenced not only by the optimal solution and path dependence but also by the parents. In this paper, \(x^{t^{'}}\) is introduced as the influence of the previous generation on the current iteration, and the formula for calculating \(x^{t^{'}}\) is shown in Eq. (13).

$$\begin{aligned} x^{t^{'}} = (a_{t}Z_{t}+ 4cos(Z_{t}))(x^{t}-x^{t-1}), \end{aligned}$$
(13)

Because the evolutionary trend takes into account the effects of path dependence, convergence to the nearest optimum, and competition, the variable \(S_{t}\) is introduced in this paper to balance the relationship between the evolutionary trend ev and \(x^{t^{'}}\). \(S_{t}\) is calculated as shown in Eq. (14). To ensure that the solution generated by the algorithm stays in the search space, the coefficient must be less than 1; solution overflow is therefore prevented by a modulo operation, i.e., taking the remainder of \(S_{t}\) to generate \(S_{t}^{*}\). The population evolution formula of the improved PPE algorithm is shown in Eq. (15).

$$\begin{aligned} S_{t}= & {} sin(u\cdot \pi \cdot rand1 \cdot S _{t-1})\nonumber \\{} & {} + sin(u \cdot \pi \cdot rand2 \cdot S _{t-1}), \end{aligned}$$
(14)
$$\begin{aligned} x^{t+1}= & {} x^{t} + S_{t} \cdot ev + (1-S_{t}^{*}) x^{t^{'}}, \end{aligned}$$
(15)
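One Sine-PPE position update combines Eqs. (12)–(15) as in the following sketch. The frequency parameter `u` of Eq. (14) is an assumption here, since the text does not fix its value, and the function name `sine_ppe_step` is illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)

def sine_ppe_step(x_t, x_prev, ev, z_prev, s_prev, a_t=1.1, u=2.0):
    """One Sine-PPE position update combining Eqs. (12)-(15).
    u is the assumed frequency parameter of Eq. (14)."""
    z_t = np.sin(np.pi * z_prev) + np.cos(np.pi * z_prev)          # Eq. (12)
    x_influence = (a_t * z_t + 4 * np.cos(z_t)) * (x_t - x_prev)   # Eq. (13)
    s_t = (np.sin(u * np.pi * rng.random() * s_prev)
           + np.sin(u * np.pi * rng.random() * s_prev))            # Eq. (14)
    s_star = s_t % 1.0            # modulo trick keeping the coefficient below 1
    x_next = x_t + s_t * ev + (1 - s_star) * x_influence           # Eq. (15)
    return x_next, z_t, s_t
```

The chaotic state \(Z_t\) and the balance variable \(S_t\) are carried across iterations, so each step depends on the whole chaotic history, which is what lets the mapping perturb the search away from local optima.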

The pseudo-code of the Sine-PPE is Algorithm 2.

Algorithm 2 Sine-PPE algorithm

Algorithm 3 Local surrogate-assisted sine-PPE

Global surrogate-assisted model

The primary use of the GSAM is to ensure the stability of the model. Meta-heuristic algorithms are unstable and produce different results across runs on the same problem, sometimes performing excellently and sometimes poorly. In case of poor performance, the GSAM is used to adjust the search capability of the SASPPE algorithm and ensure that the searched solutions do not deviate from normal values. The RBF network is constructed using “newrbe” in MATLAB, with a Gaussian kernel function and p as the constant 0. However, building the GSAM requires solving two problems: first, the GSAM requires a relatively large data sample; second, the GSAM selects points over the whole search space, so the accuracy of the data cannot be guaranteed. To solve these problems, the following measures are taken in this paper. The PPE algorithm has the population-size and growth properties, which give it an advantage in exploring the search space; in addition, its time complexity is relatively low, which makes it suitable for relatively large data samples. Therefore, the search space is explored in the GSAM using the PPE algorithm. Besides, multiple LSAMs are constructed in this paper to exploit the current search space and find the optimal values in the whole search space as far as possible.

To reduce expense and better find the optimal solution, the GSAM and the LSAMs use the same database DB. The size of the database is positively correlated with the time needed to train the model: the larger the database, the longer the training time, so a reasonable database size must be set. The dimensionality also needs to be considered when sizing the database, because as the dimensionality increases, more samples are required to ensure model accuracy, and the training time increases accordingly. Let the particle size of the algorithm be Np. After several experiments, database sizes of Np and \(2 \times Np\) proved best: when the search space dimension is 30 or 50, the size of DB is set to Np, and the GSAM uses all the data in the DB; when the dimension is 100, the size of DB is set to \(2 \times Np\), and to reduce model-building time the GSAM uses two-thirds of the data in the DB. The initial values of the DB are generated using Latin hypercube sampling (LHS). Whenever the data is updated, the DB is sorted to delete the inferior particles and ensure that the number of particles in the DB remains Np or \(2 \times Np\).
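The two database operations described here, LHS initialization and fitness-sorted truncation, can be sketched as follows. The helper names `lhs` and `update_db` are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)

def lhs(n, dim, lb=0.0, ub=1.0):
    """Latin hypercube sample of n points in [lb, ub]^dim: one point per
    stratum in every coordinate, with stratum order shuffled per axis."""
    u = (rng.random((n, dim)) + np.arange(n)[:, None]) / n
    for j in range(dim):
        u[:, j] = u[rng.permutation(n), j]
    return lb + (ub - lb) * u

def update_db(db_x, db_y, cap):
    """Sort the database by fitness (minimization) and keep only the
    best `cap` particles, as SASPPE does after each DB update."""
    order = np.argsort(db_y)[:cap]
    return db_x[order], db_y[order]
```

With Np = 100, the DB would be initialized as `lhs(100, dim, lb, ub)` for 30D/50D (or 200 points for 100D) and truncated back to capacity with `update_db` after each insertion.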

The pseudo-code of the GSAM is Algorithm 4.

Algorithm 4 Global surrogate-assisted PPE

The entire algorithm (SASPPE)

Fig. 2 Flowchart of SASPPE

The SASPPE algorithm integrates strategies such as surrogate models, chaotic mapping, and hierarchical clustering to improve the overall performance of the PPE algorithm. The surrogate mechanism saves computation time; the chaotic mapping enhances the randomness and variability of the algorithm and avoids premature convergence; and the hierarchical clustering facilitates classifying the data and building more reasonable surrogate models. Figure 2 shows the overall framework of the algorithm; the purple part is the LSAM, and the green part is the GSAM. The execution steps of the SASPPE algorithm are as follows, where f(x) represents the true fitness function, FUN1 refers to the LSAM function constructed using RBFN, and FUN2 denotes the GSAM function constructed using RBFN.

  1.

    Initialize the data using LHS, then evaluate the data using f(x), and select an appropriate amount of data to store in the DB.

  2.

    LSAMs are first constructed in the following way: the selected data are divided into multiple sub-archives \(DB_{i}\) using hierarchical clustering, and an LSAM is constructed for each \(DB_{i}\). In each LSAM, the function FUN1 is used instead of f(x).

  3.

    Sine-PPE is executed to find the optimal particle in the current space; its end condition is 200 iterations, or the optimal value remaining unchanged for 20 consecutive iterations. The evaluation function of the Sine-PPE algorithm is FUN1, not f(x).

  4.

    Evaluate the particle using f(x). If the particle is superior to a particle in the DB, replace the worst particle in the DB and update the DB; otherwise, leave the DB unchanged and perform the next step.

  5. 5.

    Sine mapping is used for the optimal particles selected by Sine-PPE to discover the best particles around the optimal particles, then 4 is executed.

  6. 6.

    Repeat 3–5 to ensure that the LSAM is executed once for each build.

  7. 7.

    At this point, the GSAM build phase is entered. Enough particles are selected from DB to form \(DB_{g}\), and a GSAM is constructed using the particles in \(DB_{g}\). The particles are evaluated using the FUN2 instead of the f(x). Subsequently, the optimal value is searched over the entire search space using the PPE algorithm. The PPE uses FUN2 to evaluate the population, and the PPE algorithm ends with the same conditions as the Sine-PPE algorithm.

  8. 8.

    The evaluation process is the same as LSAM; refer to 4.

  9. 9.

    If the end condition is reached, the algorithm is finished; if not, the algorithm returns to 2 and executes the SASPPE.
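Steps 2–4 above can be sketched in miniature. This is a hedged illustration, not the authors' implementation: SciPy's Ward linkage stands in for the agglomerative clustering, a Gaussian-kernel `RBFInterpolator` stands in for the FUN1 surrogate, and a plain random perturbation around the best particle stands in for the Sine-PPE search.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.interpolate import RBFInterpolator

def build_lsams(db_x, db_f, n_clusters):
    # Step 2: hierarchical clustering splits the DB into sub-archives DB_i,
    # and one local surrogate (standing in for FUN1) is fitted per cluster.
    labels = fcluster(linkage(db_x, method="ward"),
                      t=n_clusters, criterion="maxclust")
    models = {k: RBFInterpolator(db_x[labels == k], db_f[labels == k],
                                 kernel="gaussian", epsilon=1.0)
              for k in np.unique(labels)}
    return labels, models

def surrogate_step(model, center, true_f, db_x, db_f, rng,
                   n_cand=50, sigma=0.5):
    # Steps 3-4 in miniature: search on the surrogate (a random perturbation
    # stands in for Sine-PPE), then verify the winner with the true f(x);
    # if it beats the worst DB particle, it replaces that particle.
    cand = center + sigma * rng.standard_normal((n_cand, center.size))
    best = cand[np.argmin(model(cand))]   # optimum under the surrogate
    fb = true_f(best)                     # one expensive true evaluation
    worst = int(np.argmax(db_f))
    if fb < db_f[worst]:
        db_x[worst], db_f[worst] = best, fb
    return db_x, db_f
```

The GSAM phase (steps 7–8) would be analogous, with a single model fitted on all of \(DB_{g}\) instead of one model per cluster.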

Table 2 Test functions and their parameters

The pseudo-code of the SASPPE is Algorithm 5.

Algorithm 5

Surrogate-assisted sine-PPE algorithm

Experiment

To verify the effectiveness of the proposed SASPPE algorithm, this paper tests it with 7 test functions on 30D, 50D, and 100D.

Table 3 Table of experimental results for SASPPE, SAPPE, PPE, RBFPSO and SACOSO on F(1)–F(7) (30D)
Table 4 Table of experimental results for SASPPE, SAPPE, PPE, RBFPSO and SACOSO on F(1)–F(7) (50D)
Table 5 Table of experimental results for SASPPE, PPE, SAPPE, RBFPSO and SACOSO on F(1)–F(7) (100D)

Parameter description

Table 2 shows the test functions used in this experiment and their parameters. F(5), F(6), and F(7) in Table 2 are F(9), F(10), and F(16) of CEC2005 [57], respectively. The algorithms included in the comparison are SAPPE, PPE, RBFPSO, and SACOSO. All algorithms are evaluated on 30D, 50D, and 100D, respectively. For a fair comparison, all algorithms were run 10 times independently in MATLAB 2020b, the number of particles for every algorithm was set to 100, and the number of true evaluations was 1000, which also serves as the termination condition of the whole algorithm. SAPPE, SASPPE, and PPE share the same parameters; refer to the PPE algorithm for details. The sample volumes generated via LHS by SASPPE, SAPPE, RBFPSO, and SACOSO are set to 100 (for 30D and 50D) and 200 (for 100D). The maximum number of iterations of the PPE and Sine-PPE runs inside SAPPE and SASPPE is 200; however, when the optimal value remains unchanged for 20 iterations, the current search ends and the next search begins. The optimal value is considered unchanged when the difference between the current and previous evaluated values is less than 1.00E−6.
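The early-stop rule just described (change below 1.00E−6 for 20 consecutive iterations, inside a hard cap of 200 iterations) can be sketched as follows; the helper name `stalled` and the history-list interface are our own illustrative choices.

```python
def stalled(best_history, patience=20, tol=1e-6):
    # True when the best value has changed by less than tol for `patience`
    # consecutive iterations -- the early-stop rule for PPE / Sine-PPE.
    if len(best_history) <= patience:
        return False
    tail = best_history[-(patience + 1):]
    return all(abs(tail[i + 1] - tail[i]) < tol for i in range(patience))
```

An inner PPE or Sine-PPE run would then terminate at the first iteration where `stalled` is true, or at iteration 200, whichever comes first.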

Tables 3, 4, and 5 show the final results of all algorithms on 30D, 50D, and 100D, respectively, and Figs. 3, 4, and 5 show the corresponding experimental comparison plots. Here, “Best” denotes the best value among all data, “Std.” the standard deviation, and “Mean” the average over 10 runs. Bold font indicates that a value outperforms the equivalent data of the other algorithms; only the mean values are bolded in this paper to strengthen the experimental evidence. “p” is the result of the Wilcoxon rank test at the 5% significance level: “+” indicates a significant difference from the SASPPE algorithm, and “-” indicates no statistically significant difference from SASPPE.
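The “+”/“-” marks can be reproduced with a Wilcoxon rank-sum test, for example via `scipy.stats.ranksums`. This is a sketch of the decision rule only; the run data in the test are artificial stand-ins for the tables' results.

```python
import numpy as np
from scipy.stats import ranksums

def significance_mark(sasppe_runs, competitor_runs, alpha=0.05):
    # Wilcoxon rank-sum test at the 5% significance level: "+" marks a
    # statistically significant difference from SASPPE, "-" marks none.
    _, p = ranksums(sasppe_runs, competitor_runs)
    return ("+" if p < alpha else "-"), float(p)
```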

Results

30D and 50D experiment results

From Tables 3 and 4, it is clear that SASPPE achieves the best results on all functions, whether measured by the contingent “Best” or the more persuasive “Mean.” As shown in Table 3, the optimal value of the SASPPE algorithm is closest to that of SAPPE on F(1), F(3), and F(4), but closest to SACOSO and RBFPSO on the other functions. The optimal value of SASPPE is close to the ideal optimum of 0 on F(2) and F(4), while the largest gap is to the ideal value of − 120 on F(7). On F(3), F(5), and F(7), the optimal values of SASPPE are 100 or even 200 smaller than those obtained by SAPPE. Comparing “Std.”, the SASPPE algorithm is smallest on F(1), F(2), F(3), F(4), and F(7); SACOSO is smallest on F(5); SAPPE is smallest on F(6). It can be seen that the SASPPE algorithm, with its added surrogate model, has the best stability. As shown in Table 3, the SASPPE algorithm is not significantly different from RBFPSO only on F(5). As shown in Table 4, the SASPPE algorithm does not differ significantly from RBFPSO and SACOSO only on F(7). On 50D, the mean values of SASPPE are similar to those on 30D, and it still cannot win on F(7). The value obtained by subtracting the 50D mean from the 30D mean is defined as the “mean difference”. SASPPE has the smallest “mean difference” on all functions. On F(1), the mean value of the PPE algorithm on 50D is better than that on 30D, and the SASPPE algorithm has the largest “mean difference”. On F(2), only the SAPPE algorithm's “mean difference” is less than 1; those of all other algorithms are greater than 1. On F(3) and F(4), the “mean difference” of SASPPE is small, while those of the other algorithms are larger, even a hundred times that of SASPPE. The “mean differences” of all algorithms on F(5) and F(6) are greater than 100.
Comparing SASPPE, SAPPE, and SACOSO with PPE and RBFPSO shows that adding a surrogate model improves the ability to find the optimum; comparing SASPPE with SACOSO and RBFPSO shows that the choice of surrogate model affects the stability of the algorithm, and a good surrogate model can reduce the influence of dimensionality; comparing SASPPE with SAPPE shows that adding the chaotic map and a reasonable clustering approach improves the performance of the surrogate model.

As can be seen from Figs. 3 and 4, the fitness gradually converges as the number of true evaluations increases. For a more intuitive demonstration of convergence performance, the fitness is plotted logarithmically for F(1)–F(4). Over the first 150 true evaluations on all test functions, the SAPPE and SASPPE algorithms are comparable in power and their optimal values are similar. However, as the number of evaluations increases, the exploitation capability of the SASPPE algorithm gradually strengthens, and its convergence speed becomes much faster than that of the other algorithms. As can be seen from Fig. 3, the SASPPE algorithm exhibits strong exploitation on F(1), F(2), F(3), F(4), and F(5). In particular, on F(1), the SASPPE algorithm has the strongest exploitation ability and the fastest convergence speed, with an approximately linear descent. On F(3), SASPPE converges faster in the first 600 evaluations and tends to level off in the last 400, but it has already overtaken the other algorithms by the 150th evaluation. On F(6), the SASPPE and SAPPE algorithms converge at equivalent rates; SASPPE, SAPPE, and SACOSO start to level off after 300 evaluations, and RBFPSO after 400. On F(7), for the first 150 evaluations, SASPPE and RBFPSO converge at the same rate and with similar accuracy. Observing the minima of the seven plots in Fig. 3, the SASPPE algorithm wins with an absolute advantage on F(1)–F(4), wins with a slight advantage on F(5) and F(6), and is similar to RBFPSO on F(7). It can tentatively be concluded that the SASPPE algorithm has an advantage on unimodal, multimodal, and some complex functions.

As can be seen from Fig. 4, the convergence performance of the SASPPE algorithm remains strong on 50D. The SASPPE algorithm has much better convergence accuracy than the other algorithms on F(1), F(3), and F(4), although its convergence speed is not as fast as that of RBFPSO. As the number of evaluations increases, the SAPPE algorithm cannot escape premature convergence, whereas the SASPPE algorithm can break through the current optimum and continue converging until the end of the run. On F(2) and F(5), the SASPPE and SAPPE algorithms converge at similar rates and both fall into a local optimum; however, after 400 evaluations, the SASPPE algorithm escapes the local optimum on F(2) and finds a new optimum. On F(6), the convergence speeds of the SASPPE and SAPPE algorithms are similar, both levelling off after 150 evaluations, but SASPPE still converges more strongly than SAPPE. On F(7), after 200 evaluations RBFPSO is victorious; if only the first 200 evaluations are considered, SASPPE would win.

Overall, the SASPPE algorithm outperforms the other algorithms on 30D and 50D, and from the above analysis it can be seen that the SASPPE algorithm always wins in one way or another on F(1)–F(6) and is defeated by RBFPSO only on F(7). Combining the figures and tables, the SASPPE algorithm has a robust search and exploitation ability and can go beyond local optima to find better results. Comparing the surrogate-assisted algorithms (SASPPE, SAPPE, SACOSO, and RBFPSO) with those without a surrogate model, the surrogate-assisted algorithms have the advantage. Comparing SASPPE with SAPPE, sine mapping and hierarchical clustering enhance the overall performance of the algorithm and alleviate the tendency of surrogate-assisted methods to fall into local optima.

Fig. 3

Comparison of SASPPE, SAPPE, PPE, RBFPSO and SACOSO algorithms for F(1)–F(7) (30D)

Fig. 4

Comparison of SASPPE, SAPPE, PPE, RBFPSO and SACOSO algorithms for F(1)–F(7) (50D)

Fig. 5

Comparison of SASPPE, SAPPE, PPE, RBFPSO and SACOSO algorithms for F(1)–F(7) (100D)

100D experiment results

As can be seen from Table 5, SASPPE is not as dominant as on 30D and 50D, especially on F(5) and F(6). Comparing the mean values in Table 5, SASPPE wins on F(1)–F(4) and F(7) and is beaten by SAPPE on F(5) and F(6). A horizontal comparison of “Best” shows that SASPPE wins on F(1)–F(5) and F(7), and SAPPE wins on F(6). Comparing the “p” results in the table, RBFPSO is not significantly different from SASPPE on F(5), SACOSO is not significantly different from SASPPE on F(6), and all other algorithms differ significantly from SASPPE. On F(1), the mean result of the PPE algorithm improves as the dimensionality increases, but it is still not as good as SASPPE and SAPPE. Comparing “Std.”, SASPPE is smallest on F(2), F(3), and F(4); SAPPE has the advantage on F(7); PPE is smallest on F(1) and F(5); SACOSO is smallest on F(6). It can be seen that the stability of the SASPPE algorithm does not decrease as the dimensionality increases.

As shown in Fig. 5, SASPPE converges continuously on F(1), F(5), and F(6), and resumes converging after several flat stretches of evaluations on F(2), F(3), F(4), and F(7). On F(1), SAPPE achieves the best value in the early stage of the algorithm; as the evaluation proceeds, SASPPE gradually catches up with SAPPE and achieves the final victory. On F(2), F(3), and F(4), the SASPPE and SAPPE algorithms converge to the current optimum in the early stages; as the run proceeds, SASPPE resumes converging after approximately 600, 550, and 450 evaluations, respectively, and successfully beats SAPPE. From Fig. 6e, it may be conjectured that with more evaluations the SASPPE algorithm could catch up with and overtake SAPPE. On F(5), the SAPPE algorithm wins but matures too early and does not keep converging as well as SASPPE. On F(6), the convergence rates of the SASPPE and SAPPE algorithms gradually decrease and eventually plateau after 150 evaluations. On F(5) and F(6), SASPPE converges to a better optimum than SAPPE for the same number of true evaluations. On F(7), the strengths of the SASPPE and SAPPE algorithms are comparable, but SASPPE catches up with SAPPE at 600 evaluations. As the dimension increases, the SASPPE algorithm gradually loses its advantage on F(7).

Fig. 6

Comparison of boxplots of the five algorithms for F(1)–F(7) (50D)

Fig. 7

Comparison of boxplots of the five algorithms for F(1)–F(7) (30D)

In conclusion, when the dimension is 100, the convergence performance advantage does not decrease compared with 30D and 50D. On F(1)–F(4), the SASPPE algorithm always wins with an absolute advantage, while on F(7) its advantage is gradually lost as the dimensionality increases. On F(5) and F(6), the advantage of the SASPPE algorithm is most obvious at 50D. SAPPE, SASPPE, SACOSO, and RBFPSO outperform the other algorithms by enhancing exploration and exploitation through surrogate models. Adding sine mapping enhances population diversity, avoids premature convergence, and increases the probability of winning. Comparing the three dimensions shows that SASPPE has the best optimal values and stability and is not prone to falling into local optima in any dimension. The next best algorithm is SAPPE. It can be seen that the application of the surrogate model is reasonable and that the sine mapping and hierarchical clustering are effective. No single algorithm suits all problems, and the fact that SASPPE consistently prevails on most functions is enough to show that the proposed algorithm is feasible.

Fig. 8

Comparison of boxplots of the five algorithms for F(1)–F(7) (100D)

Stability analysis

Analysing Figs. 6, 7, and 8 shows that the SASPPE algorithm is the most stable. Regardless of the dimension, the SASPPE algorithm holds an absolute advantage on F(2)–F(4). On F(5), the SASPPE algorithm is less stable on 30D and 50D but stable on 100D. On 30D, on F(1), the SASPPE algorithm is stable and its worst value is better than the optimal values of SAPPE and SACOSO, while the RBFPSO algorithm produces outliers and is poorly stable. As the dimension increases, all algorithms achieve better stability on F(7), and the advantage of the SASPPE algorithm on F(6) gradually increases. However, the SASPPE algorithm is prone to outliers on the complex functions on 50D, which introduces an element of chance. On 100D, RBFPSO performs the worst, while SACOSO is stable although it does not obtain the optimum. Overall, the SASPPE algorithm has an advantage over the other algorithms in terms of stability.

Empirical analysis of the computational complexity

The computational complexity of the various algorithms varies with their composition. The computational complexity of the SASPPE algorithm consists mainly of the true evaluation time and the time required to train the model. Table 6 shows the computation time of this algorithm and the comparison algorithms on 30D, 50D, and 100D, respectively. It is easy to see from the table that the PPE algorithm is the most time-efficient, because PPE does not need to construct a model and only incurs the true evaluation time. The SAPPE and SACOSO algorithms take more time than the SASPPE algorithm because they do not use clustering and therefore construct larger, more time-consuming models. Compared with RBFPSO, the SASPPE algorithm takes longer, because training multiple local surrogate models costs more time than RBFPSO's single model. Comparing 100D with 30D shows that computation time increases with dimensionality. The PPE algorithm shows an insignificant time increase on F(1)–F(4) but a significant increase, by a factor of almost 20, on the complex functions F(5)–F(7), while the time increase of the SASPPE algorithm is insignificant. From these findings it is not difficult to conclude that when the problem evaluation is relatively complex, most of the algorithm's time is taken up by fitness evaluation, and when the problem size is relatively large, the true evaluation is more time-consuming. Therefore, the time spent on RBFN model training and prediction is acceptable.
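The trade-off above can be illustrated with a toy timing experiment; all the functions here are artificial stand-ins (a deliberately slow loop for an expensive true evaluation, a single expression for a surrogate prediction), not measurements of the actual algorithms.

```python
import time

def expensive_f(x):
    # Stands in for a costly true fitness evaluation (e.g. a simulation).
    s = 0.0
    for _ in range(20000):
        s += x * x
    return s

def cheap_surrogate(x):
    # Stands in for one surrogate (RBFN) prediction of the same quantity.
    return x * x * 20000

def timed(fn, x, reps=200):
    # Wall-clock time of `reps` calls to fn(x).
    t0 = time.perf_counter()
    for _ in range(reps):
        fn(x)
    return time.perf_counter() - t0
```

When `expensive_f` dominates, as in this toy, replacing most calls to it with `cheap_surrogate` cuts the total runtime even after paying for model training, which mirrors the conclusion drawn from Table 6.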

Table 6 Average calculation time required for 1000 fitness evaluations of the compared algorithms in 30D, 50D and 100D problems
Table 7 3D coverage parameter design for movable nodes

3D coverage of movable nodes

From the above experimental results, it is clear that the SASPPE algorithm proposed in this paper is effective, but it has yet to be tested in a practical application. In this chapter, SASPPE is applied to the 3D coverage of movable nodes to examine its performance in practice.

Parameter settings

This paper uses a combination of random deployment and fixed-path mobile nodes to meet the dual objectives of low cost and high coverage efficiency. First, the sensor nodes are deployed randomly. Then, the coverage at a node's current position is compared with the coverage at the candidate position. If the coverage at the current position is the smaller of the two, the node is moved to the candidate position; otherwise, it is not moved. Finally, the covered area is marked as 1 to record the increased network coverage. The energy of each movable sensor node is a constant value E, and each move of distance d consumes energy E(d). The method for deciding whether a node should be moved is as follows.

  1. Convert the 3D actual scene into 2D.

  2. Define the search length Rs (set to R/2) and search in the four directions around node P.

  3. If moving to a searched location increases the coverage, move the node to that location; otherwise, increase the search length Rs to \(Rs + 1\) and search again.

  4. If no better location is found after G searches, the move is abandoned.
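Steps 1–4 can be read as the following sketch. This is our interpretation rather than the authors' code: the boolean-grid coverage model on a \(50\times 50\) field, the grid resolution, and the choice of G are illustrative assumptions.

```python
import numpy as np

def coverage_ratio(sensors, radius, side=50):
    # Mark every grid cell within sensing radius of a node as covered (1)
    # and return the fraction of the side x side area that is covered.
    ys, xs = np.mgrid[0:side, 0:side]
    covered = np.zeros((side, side), dtype=bool)
    for sx, sy in sensors:
        covered |= (xs - sx) ** 2 + (ys - sy) ** 2 <= radius ** 2
    return covered.mean()

def try_move(node, others, radius, G=5, side=50):
    # Steps 2-4: probe the four directions at distance Rs (starting at R/2);
    # move to the first position that raises coverage, otherwise enlarge Rs,
    # and abandon the move after G rounds.
    rs = radius / 2                                   # step 2: Rs = R/2
    base = coverage_ratio(others + [node], radius, side)
    for _ in range(G):                                # step 4: at most G rounds
        for dx, dy in ((rs, 0), (-rs, 0), (0, rs), (0, -rs)):
            cand = (node[0] + dx, node[1] + dy)
            if coverage_ratio(others + [cand], radius, side) > base:
                return cand                           # moving raises coverage
        rs += 1                                       # step 3: Rs -> Rs + 1
    return node                                       # move abandoned
```

For example, two heavily overlapping nodes would trigger a move that spreads them apart, while an isolated node whose disc already lies inside the field would stay put.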

The application scene used in this paper is a \(50\times 50\) hill; its simulation is shown in Fig. 1b. The values of the required parameters for this chapter are listed in Table 7. All experiments were run 10 times independently and averaged to avoid chance results.

Table 8 Coverage comparison table of SASPPE, PPE, PSO, PPSO
Fig. 9

Coverage comparison of SASPPE, PPE, PSO and PPSO

Results

Table 8 shows the results of the SASPPE, PPE, PSO, and PPSO algorithms applied to the 3D coverage of movable nodes. As can be seen from Table 8, SASPPE has the strongest coverage capability. When the sensing radius is 3 and the number of nodes is 30, its coverage is only 0.768 higher than that of PSO, and when the number of nodes increases to 40, its coverage is 1.576 higher than that of PPSO. When the sensing radius is 5, the SASPPE algorithm has its greatest advantage and the largest gap to the other algorithms, especially with 30 nodes, where the difference from the second-best coverage is 2.504. When the sensing radius is 5 and sensor cost is a consideration, using 30 sensor nodes is recommended. When the sensing radius is 7, the difference between SASPPE and PPSO or PSO is insignificant because too little area remains uncovered. To show the coverage gap between the algorithms more visually, a histogram of coverage is provided. As shown in Fig. 9, SASPPE wins by a narrow margin among all algorithms, the PPE algorithm performs worst, and PPSO and PSO are comparable. Comparing the two plots in Fig. 9 shows that for the same radius, more nodes yield greater coverage, and for the same number of nodes, a larger radius yields more extensive coverage.

Conclusion and future work

This paper proposes a method to enhance network coverage using mobile sensor nodes. All nodes in the network are mobile; however, to reduce energy consumption, each node determines whether it should move before doing so. This not only reduces the energy consumption of the nodes but also extends the lifetime of the sensor network. The paper proposes a surrogate-assisted sine Phasmatodea population evolution algorithm to maximize network coverage. In this algorithm, the GSAM smooths the overall search and ensures the global search capability, while multiple LSAMs exploit local regions in depth in search of even better results. To better locate the optimal value, the GSAM and LSAM share the same database, saving time and cost. Agglomerative hierarchical clustering is used to divide the data among the LSAMs, ensuring the robustness of each LSAM, and sine mapping is added to the LSAM to enhance the randomness of the algorithm. The algorithm was tested on 7 test functions in three dimensionalities, and the results demonstrate its feasibility. The proposed algorithm obtains the optimum on the largest number of functions on 30D, 50D, and 100D, losing its advantage only on F(7) at 50D. The algorithm was then applied to the 3D coverage of mobile nodes, and the experimental results show that SASPPE achieves the highest coverage rate; the improved algorithm is at least 5.196% higher than the original PPE algorithm. However, this work leaves room for improvement. Determining whether a node should move consumes some energy, and the movement pattern is fixed: a node can only move within its immediate surroundings. Moreover, the proposed SASPPE algorithm has only been applied to the 3D coverage of movable nodes, and its effectiveness in other scenarios is unknown. We hope to make the algorithm more robust and the node movement more rational in future work.