1 Introduction

The rapid advancement of information technology has led to an enormous increase in data of various kinds. These data sets are also described by more and more attributes, which makes their analysis and interpretation difficult and computationally expensive. For this purpose, feature selection (FS) methods are used to choose from the original features the best subset of informative features [1]. Removing non-informative features improves both learning speed and classification accuracy [2]. FS has been widely employed in a variety of applications, including text clustering [3], bioinformatics [4], image processing [5], and others [6,7,8].

Depending on whether or not a classifier algorithm is used, feature selection techniques fall into two primary categories: wrapper and filter methods. Wrapper techniques choose the best predictive feature subset using a learning algorithm; the Sequential Forward Selection (SFS) and Sequential Backward Selection (SBS) algorithms are the most common in this category [9, 10]. Filter techniques, without using a classifier, discover the best feature subset by maximizing specific criteria based on the statistical properties of the datasets, such as Relief [11] and mutual information (MI) [12].

Wrapper approaches, which are guided by the learning model, normally produce better results than filter methods, but they are time-consuming because each selected feature subset is evaluated using a learning process. This is especially true for high-dimensional datasets. Wrappers also risk overfitting, especially on high-dimensional datasets where the number of features exceeds the number of samples [13]. The search space expands exponentially with the number of features, making an exhaustive search for the best feature subset infeasible [14]. Meta-heuristics (MH) have been frequently employed to solve the FS problem because of their ability to find optimal or near-optimal solutions without exploring the complete space of options. Such MH techniques include the arithmetic optimization algorithm (AOA) [15], Equilibrium Optimization Algorithm (EO) [16], Reptile Search Algorithm (RSA) [17], Ant Lion Optimizer, Crow Search Algorithm (CSA) [18], Genetic Algorithm (GA) [19], and Aquila Optimizer (AO) [20]. Each of these methods has its advantages and disadvantages; for example, the exploration of AOA, AO, and RSA is better than their exploitation, which can lead to stagnation around attractive points. In contrast, CSA and GA have better exploitation than exploration, which can hinder convergence toward the optimal solution.

Despite the effectiveness of both families of techniques, strategies that improve classification accuracy while selecting a reduced number of features, whether by combining filters and meta-heuristic algorithms or by using them as independent approaches, are still under active development. The No-Free-Lunch (NFL) theorem [21] also applies here: over the years, several scholars have introduced new MH with improved performance based on the principle that no single optimization method can solve all problems. The major goal of this research is to develop a new hybrid FS approach based on a recent meta-heuristic, the Capuchin search algorithm (CapSA).

The Capuchin search algorithm (CapSA) is a meta-heuristic (MH) approach introduced in [22]. It is inspired by the foraging behavior of capuchin monkeys in the wild. In the CapSA, the capuchin population consists of two groups: the alpha capuchins, which represent the leaders, and the followers. While foraging for food, the capuchins move according to five strategies: (1) jumping on trees, (2) jumping over banks, (3) swinging on trees, (4) climbing on trees, and (5) moving normally and randomly on the ground [23]. Based on these behaviors, the CapSA has been utilized as an optimization algorithm to solve a variety of problems, including global optimization and engineering challenges [22]. In [23], the modeling of an industrial winding process was improved by a modified CapSA that uses multi-gene genetic programming.

Despite CapSA's excellent performance, its full potential has yet to be realized, and to the best of our knowledge there is room to improve it and apply it to feature selection problems. The standard CapSA is an iterative process in which a population of capuchins is characterized by their positions and the velocities with which they move through the solution space. As in PSO, capuchins change their positions according to both their best past position and the position of the global best capuchin found so far. In the standard CapSA, the velocity is adjusted according to three parts. The first part is the current velocity of the capuchin. The second part is the cognitive component, which represents the impact of personal experience on a capuchin's trajectory and helps the capuchin move toward its best known position. The third part is the social component, which represents the effect of group experience on the movement of a capuchin and guides it toward the best position found so far. The three parts are weighted by the inertia weight and two acceleration coefficients. Although CapSA is robust and efficient compared with other population-based methods, its ability to fine-tune solutions and escape from local optima is weakened; its performance is largely determined by these three parameters, which must be tuned to avoid a lack of diversity and premature convergence. In CapSA [22], the inertia weight takes the value 1.0; in [23], a linearly decreasing weight (LDW) strategy is adopted to adjust the inertia weight dynamically. However, the linearly decreasing inertia weight strategy is known to suffer from premature convergence, and therefore CapSA can easily be trapped in a local extremum [24]. To address this shortcoming, we propose a dynamically decreasing inertia weight based on the logistic map, combined with the velocity update, to effectively balance the global and local search of the basic CapSA. The goal is to adapt the inertia weight through chaotic optimization to improve the convergence of CapSA and avoid falling into local optima during the optimization process.

In addition, in the original CapSA, the cognitive and social components influence the velocity of a capuchin through two constant acceleration coefficients, as in the basic PSO. These two components are very important for convergence performance [25, 26], and many strategies have been developed to select their optimal values and improve the basic PSO [27,28,29]. Based on these studies, the constant acceleration coefficients are replaced in this paper with sine cosine acceleration coefficients to mitigate premature convergence and stagnation. To add more diversity to the movement of capuchins and help them explore the search space more efficiently, the velocity update formula is also modified. Unlike the original CapSA, in which capuchins update their velocities based only on their own historical best information, a learning strategy is introduced to help capuchins learn from other good individuals in their local neighborhood and in each dimension [30, 31]. Finally, a Levy random walk is applied instead of a purely random movement to avoid unnecessary exploration that would decrease the convergence speed of the algorithm.

The main objectives and contributions of the paper can be summarized as follows:

  1. Propose an alternative wrapper-based FS approach based on modifying a new meta-heuristic technique named the Capuchin search algorithm (CapSA).

  2. Use sine cosine acceleration coefficients, a chaotic inertia weight strategy, a stochastic learning strategy, and a Levy random walk to improve the performance of CapSA and accelerate its convergence.

  3. Apply the developed ECapSA approach to solve the problem of the high dimensionality of real-world datasets, and compare the performance of ECapSA with other well-known FS methods on different UCI datasets.

The rest of the paper is structured as follows: Section 2 presents related works. The conventional CapSA is presented in Sect. 3, and the suggested Enhanced CapSA (ECapSA) is described in Sect. 4. Section 5 introduces and discusses the experimental results. The conclusion and future work are presented in Sect. 6.

2 Related works

Recently, more metaheuristic techniques have been introduced to address the FS problem as wrapper techniques, and such techniques show high efficiency when compared with traditional techniques [32]. In [14], the authors combine Pareto optimization with the harmony search algorithm (HS) to handle FS in high-dimensional data classification problems. A wrapper-based approach is proposed in [33] based on the binary bat algorithm (BA). The authors in [34] combined particle swarm optimization (PSO) with a local search to perform feature selection. The authors of [35] introduced a PSO-based feature selection with an adaptive update mechanism in which the value of the inertia weight parameter is adapted to the rank of each particle in the swarm.

Emary et al. [36] propose a binary variant of the Gray wolf optimization (GWO) for the feature selection domain. Abd ELAZIZ et al. [37] introduce an opposition-based SCA technique for solving the feature selection problem. Pashaei and Aydin [38] introduce the binary black hole technique for feature selection (BBHA). The BBHA is a binarization-assisted extension of the conventional BHA.

Recently, Kumar and Bharti [39] combined the PSO algorithm with SCA to take advantage of each optimizer in solving the feature selection problem. Abualigah and Dulaimi [40] introduced SCAGA for FS, a hybrid of the SCA and GA algorithms that takes advantage of each method. Xue et al. [41] hybridized PSO with forward and backward FS strategies. Gu et al. [42] proposed a novel wrapper-based approach for selecting the optimal feature subset using the binary CSO. Further, Kaya [43] developed a binary cuckoo search to select relevant features and enhance classification accuracy. In [44], a chaotic dragonfly method is developed to eliminate irrelevant features. Ouadfel et al. [45] introduced a novel version of CSA to determine the optimal features; they proposed an adaptive awareness probability and a global search method to improve the convergence performance of the basic CSA. Jia et al. [46] presented a hybrid FS approach based on the combination of simulated annealing (SA) and spotted hyena optimization (SHO). Ghosh et al. introduced in [47] an enhanced version of the binary sailfish optimizer combined with β-hill climbing for determining relevant features. Hammouri et al. proposed an improved version of the Dragonfly Algorithm (DA) for solving the feature selection problem [48], and in [44] the DA is combined with chaotic maps to accelerate its convergence rate. Zhang et al. combined the Harris' Hawk Optimization (HHO) algorithm with SalpSA to select the best feature subset [49]; the proposed approach performed well and presented a good compromise between exploration and exploitation. Too et al. proposed a binary quadratic version of HHO for feature selection, and experiments prove the superiority of their method in comparison with other metaheuristics [50]. Rodrigues et al. introduced the swap mutation into the basic KH and applied the new variant for feature selection on high-dimensional data in text clustering [51]. Sadeghian et al. combined Information Gain with the binary Butterfly Optimization Algorithm (BOA) for the feature selection problem in [52]; experiments performed on UCI datasets demonstrate the ability of the proposed approach to select the smallest feature subset. An improved version of the salp swarm algorithm (SSA) was proposed by Tubishat et al. for the feature selection problem [53]; the authors used Opposition-Based Learning to initialize the population instead of random initialization and introduced a new local search to improve the exploitation performance. Faris et al. [54] proposed a wrapper approach based on SSA for the feature selection problem, using eight transfer functions to discretize the continuous search space and a crossover operator to enhance the exploratory performance of the basic SSA. Sindu et al. [55] combined the sine cosine algorithm (SCA) with an elitism approach. Rodrigues et al. [56] developed a new binary-constrained flower pollination algorithm and applied it to the feature selection problem; the approach defines a Boolean lattice as the search space such that each solution indicates whether a feature is selected or not. Yan et al. [57] developed a new variant of the Coral Reefs Optimization algorithm for selecting the optimal feature subset; tournament selection is used to increase the diversity of the population, and the KNN classifier is used to evaluate the quality of the feature subset encoded by each individual in the population.

3 Capuchin search algorithm: background material

The Capuchin Search Algorithm (CapSA) is a MH technique [22]. It takes its inspiration from the natural foraging behavior of capuchin monkeys in the wild. Like any metaheuristic, CapSA uses a population X of N capuchins such that each capuchin represents a candidate solution in a d-dimensional search space. X can be expressed as a two-dimensional matrix of size N × d.

During the process of searching within the d-dimensional space, the position of the \(i\)th capuchin is represented by \({x}^{i}=[{x}_{1}^{i},{x}_{2}^{i},\dots ,{x}_{d}^{i}]\) and its velocity by \({v}^{i}=[{v}_{1}^{i},{v}_{2}^{i},\dots ,{v}_{d}^{i}]\). Moreover, capuchin \(i\) will hold on to its prior best position \({\mathrm{pbest}}^{i}=[{\mathrm{pbest}}_{1}^{i},{\mathrm{pbest}}_{2}^{i},\dots ,{\mathrm{pbest}}_{d}^{i}]\).

In CapSA, the position and velocity vectors are randomly initialized; then, \({v}^{i}\) at the \(j\)th dimension is updated as

$${v}_{j}^{i}\left(t+1\right)=\rho {v}_{j}^{i}\left(t\right)+{a}_{1}\left({x}_{{best}_{j}}^{i}\left(t\right)-{x}_{j}^{i}\left(t\right)\right){r}_{1}+{a}_{2}\left({Best}_{j}-{x}_{j}^{i}\left(t\right)\right){r}_{2}$$
(1)

where \({x}_{j}^{i}\) is the current position of the \(i\)th leader solution at dimension \(j\). \({v}_{j}^{i}\left(t+1\right)\) and \({v}_{j}^{i}\left(t\right)\) refer to the new and old velocities of \({x}_{j}^{i}\). \({x}_{{best}_{j}}^{i}\) is the best position found so far by the \(i\)th agent, and \({Best}_{j}\) is the best solution found so far by the whole population at the \(j\)th dimension. \({a}_{1}\) and \({a}_{2}\) refer to two acceleration constants that balance the influence of \({x}_{{best}_{j}}^{i}\) and \({Best}_{j}\) on the velocity. \({r}_{1}\) and \({r}_{2}\) denote uniformly distributed random values ranging from 0 to 1. \(\rho\) stands for the inertia weight, which takes the value 1.0 in [22]; in [23], its value is updated according to Eq. (2).

$$\rho ={w}_{\mathrm{max}}-\left({w}_{\mathrm{max}}-{w}_{\mathrm{min}}\right){\left({t}/{\mathrm{maxite}}\right)}^{2}$$
(2)

where \({w}_{\mathrm{max}}\) and \({w}_{\mathrm{min}}\) represent the inertia weight's maximum and minimum values, respectively.
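
To make the notation concrete, the following Python sketch implements the velocity update of Eq. (1) together with the inertia weight of Eq. (2) for a whole population stored as a NumPy array; the default values of \(a_1\), \(a_2\), \(w_{\mathrm{max}}\), and \(w_{\mathrm{min}}\) are illustrative assumptions, not the values reported in [22, 23].

```python
import numpy as np

def inertia_weight(t, maxite, w_max=0.9, w_min=0.4):
    """Quadratically decreasing inertia weight rho of Eq. (2)."""
    return w_max - (w_max - w_min) * (t / maxite) ** 2

def update_velocity(v, x, pbest, gbest, t, maxite, a1=1.25, a2=1.5):
    """Velocity update of Eq. (1) for a whole N x d population.

    v, x, pbest: N x d arrays; gbest: length-d array (broadcast over rows);
    r1 and r2 are fresh uniform draws in [0, 1] for every entry.
    """
    rho = inertia_weight(t, maxite)
    r1 = np.random.rand(*x.shape)
    r2 = np.random.rand(*x.shape)
    return rho * v + a1 * (pbest - x) * r1 + a2 * (gbest - x) * r2
```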

Capuchins are divided into two groups in CapSA: (1) leaders (alpha capuchins), which are in charge of discovering food sources, and (2) followers, which update their positions by following the group's leaders. The leaders direct the group, while the followers pursue the leaders, either directly or indirectly [22]. During the evolutionary process, the leaders of the community use five different movement techniques to find food:

  1. In the first technique (jumping on trees), leaders update their positions using Eq. (3) as follows:

    $${x}_{j}^{i}(t+1)={\mathrm{Best}}_{j}+\frac{{P}_{bf}{\left({v}_{j}^{i} (t+1)\right)}^{2}\mathrm{sin}\left(2\theta \right)}{g}, i<n/2;0.1\le \mathcal{E}\le 0.20$$
    (3)

where \(\mathcal{E}\) is a uniformly distributed random number generated within [0, 1] and \({P}_{bf}\) is the probability of the capuchins' tails providing balance. \(g=9.81\) stands for the gravitational acceleration, and \(\theta\) is the capuchins' jumping angle, formulated in Eq. (4):

$$\theta =\frac{3}{2}{r}_{4}$$
(4)

In Eq. (4), \({r}_{4}\in [0, 1]\) represents a uniformly distributed random number.

  2. In the second technique, named jumping on the ground, the position of the leader's solution is updated as

    $${x}_{j}^{i}(t+1)={\mathrm{Best}}_{j}+\frac{{{P}_{ef}P}_{bf}{\left({v}_{j}^{i}(t+1)\right)}^{2}\mathrm{sin}\left(2\theta \right)}{g}, \quad i<n/2;0.2\le \mathcal{E}\le 0.30$$
    (5)

where \({P}_{ef}\) stands for the probability of elasticity of the capuchin's motion on the ground.

  3. The third technique is normal walking; the position of alpha capuchins seeking food on the ground is updated as follows [22]:

    $${x}_{j}^{i}\left(t+1\right)={x}_{j}^{i}\left(t\right)+{v}_{j}^{i}\left(t+1\right), \quad i<n/2;\ 0.3< \mathcal{E}\le 0.5$$
    (6)
  4. The fourth technique is swinging on trees; while swinging on trees in search of food, the position of alpha capuchins is updated as follows [22]:

    $${x}_{j}^{i}\left(t+1\right)={\mathrm{Best}}_{j}+{ P}_{bf}\times \mathrm{sin}\left(2\theta \right), i<n/2;0.5\le \mathcal{E}\le 0.75$$
    (7)
  5. Climbing trees is the fifth technique, and the position of alpha capuchins is updated using the following equation [22]:

    $${x}_{j}^{i}(t+1)={\mathrm{Best}}_{j}+{ P}_{bf}\left({v}_{j}^{i}(t+1)-{v}_{j}^{i}(t)\right),\qquad i<n/2;0.75<\mathcal{E}\le 1.0$$
    (8)

In order to find a better solution, alpha leaders are randomly relocated, as shown in Eq. (9) [22].

$${x}_{j}^{i}=\tau \left({lb}_{j}+\left({ub}_{j}-{lb}_{j}\right)\mathrm{rand}\right) i<n/2; \mathcal{E}\le {P}_{r}$$
(9)

where \({P}_{r}=0.1\) is the probability of a random walk search. \({ub}_{j}\) and \({lb}_{j}\) are the upper and lower boundaries of the search domain at dimension \(j\) and \(\tau\) is a parameter formulated as

$$\tau =2{e}^{-21{\left(\frac{t}{\mathrm{maxite}}\right)}^{2}}$$
(10)

where t and \(\mathrm{maxite}\) stand for the current iterations and total iterations, respectively.

According to Eq. (11), the positions of the followers of the capuchin leaders in CapSA are updated as follows:

$${x}_{j}^{i}(t+1)= \frac{1}{2}\left( {{x}^{\prime}}_{j}^{i}(t+1)+{x}_{j}^{i}(t)\right) n/2\le i\le n$$
(11)

where \({x}_{j}^{i}(t)\) represents the follower's prior position at dimension \(j\) and \({{x}^{\prime}}_{j}^{i}(t+1)\) is the current position of its leader at dimension \(j\).

Algorithm 1 Pseudocode of the basic CapSA
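
The following Python sketch illustrates how a leader's position update can be dispatched on the random value \(\mathcal{E}\) according to the thresholds of Eqs. (3)–(10), together with the follower update of Eq. (11). It is a simplified reading of the rules above: the default values of \(P_{bf}\), \(P_{ef}\), and \(P_{r}\) and the handling of boundary values of \(\mathcal{E}\) are assumptions made for illustration.

```python
import numpy as np

G = 9.81  # gravitational acceleration used in Eqs. (3) and (5)

def update_leader(x, v_new, v_old, best, t, maxite, lb, ub,
                  P_bf=0.7, P_ef=9.0, P_r=0.1):
    """One leader's position update, dispatched on the random value eps
    following the thresholds of Eqs. (3)-(9); x, v_new, v_old, best, lb, ub
    are length-d arrays."""
    eps = np.random.rand()
    theta = 1.5 * np.random.rand()                     # jumping angle, Eq. (4)
    tau = 2.0 * np.exp(-21.0 * (t / maxite) ** 2)      # Eq. (10)
    if eps <= P_r:                                     # random relocation, Eq. (9)
        return tau * (lb + (ub - lb) * np.random.rand(x.size))
    if eps <= 0.20:                                    # jumping on trees, Eq. (3)
        return best + P_bf * v_new ** 2 * np.sin(2.0 * theta) / G
    if eps <= 0.30:                                    # jumping on the ground, Eq. (5)
        return best + P_ef * P_bf * v_new ** 2 * np.sin(2.0 * theta) / G
    if eps <= 0.50:                                    # normal walking, Eq. (6)
        return x + v_new
    if eps <= 0.75:                                    # swinging on trees, Eq. (7)
        return best + P_bf * np.sin(2.0 * theta)
    return best + P_bf * (v_new - v_old)               # climbing trees, Eq. (8)

def update_follower(x_follower, x_leader):
    """Follower update of Eq. (11): midpoint of the leader's new position
    and the follower's own previous position."""
    return 0.5 * (x_leader + x_follower)
```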

With these properties of CapSA, it still needs to be improved, and this motivated us to propose a modified version of CapSA as discussed in the following section.

4 Enhanced CapSA for feature selection

The FS process can be viewed as an NP-hard search problem [38]. Selecting the optimal feature subset from high-dimensional data is a difficult task that requires expensive computation time. Metaheuristics are stochastic methods that have been applied with great success to complex optimization problems for which exact methods cannot be applied. The CapSA algorithm is a novel metaheuristic that shows high performance when tackling various optimization problems [22, 23]. However, like any metaheuristic, CapSA requires specific parameters to be tuned, and therefore local optima stagnation and premature convergence may occasionally occur.

The merits of CapSA motivate us to develop, for the first time, a novel FS approach based on an enhanced version of the CapSA optimizer (ECapSA) and to use it as a wrapper search strategy. ECapSA aims to enhance the performance of the basic CapSA by incorporating four improvements: (1) a dynamically decreasing inertia weight based on the logistic map is combined with the velocity update to effectively balance the global and local search of the basic CapSA; (2) a sine cosine-based adjustment of the acceleration coefficients enhances the convergence rate; (3) a stochastic learning strategy adds more diversity to the movement of the capuchins; and (4) a Lévy flight-based search strategy is integrated into the position update to strengthen the global search ability. The overall framework of the proposed ECapSA for the FS problem is presented in detail in the following subsections.

4.1 Population initialization

ECapSA begins by assigning an initial position to each agent of the population. The population \({X}^{t}\) (\(t = 0,\dots ,maxite\)) of \(N\) capuchins in a \(d\)-dimensional space is formulated as

$${X}^{t}=\left[\begin{array}{c}{x}_{1}^{1},{x}_{2}^{1},\dots ,{x}_{d}^{1}\\ {x}_{1}^{2},{x}_{2}^{2},\dots ,{x}_{d}^{2}\\ \vdots \\ {x}_{1}^{N},{x}_{2}^{N},\dots ,{x}_{d}^{N}\end{array}\right]$$

The initial population is generated by random methods, which ensure it covers as much solution space as possible. Therefore, \({X}^{0}\) is generated by the uniform distribution as follows:

$${x}_{j}^{i}={lb}_{j}+\left({ub}_{j}-{lb}_{j}\right)\mathrm{rand},\quad i=1,\dots ,N,\ j=1,\dots ,d$$
(12)

where \({lb}_{j}\) and \({ub}_{j}\) stand for the lower and upper boundaries of the solution \({x}^{i}\in X\) at the \(j\)th dimension, respectively. In this paper, we set \({lb}_{j}=0\) and \({ub}_{j}=1\). Then, the fitness value of each solution \({x}^{i}\) is computed as discussed in the following section.
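
A minimal sketch of this initialization step (Eq. 12), assuming the population is stored as an \(N \times d\) NumPy array:

```python
import numpy as np

def init_population(N, d, lb=0.0, ub=1.0, seed=None):
    """Uniform random initialization of N capuchins in [lb, ub]^d (Eq. 12)."""
    rng = np.random.default_rng(seed)
    return lb + (ub - lb) * rng.random((N, d))
```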

4.2 Fitness function

The feature subset encoded in each individual is assessed using the KNN classifier. We utilize the fitness function given by the following equation to discover a subset with the smallest number of features and the highest classification accuracy:

$$fit=\alpha .\mathrm{ErrClass}+\left(1-\alpha \right).\frac{{D}_{s}}{d}$$
(13)

where ErrClass stands for the classification error, \({D}_{s}\) refers to the number of selected features, and \(\alpha \in [0, 1]\) weights the relevance of feature reduction against the classification error rate. After that, the sine cosine acceleration coefficients, the chaotic inertia weight strategy, the stochastic learning strategy, and the Levy random walk are used to enhance the current solutions. The details of these methods are given in the following subsections.
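
The following sketch shows how the fitness of Eq. (13) can be evaluated with a KNN classifier and fivefold cross-validation; binarizing a continuous position with a 0.5 threshold and returning the worst fitness for an empty subset are assumptions, since the discretization rule is not detailed here.

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

def fitness(solution, X, y, alpha=0.5, threshold=0.5, k=5):
    """Weighted sum of classification error and selected-feature ratio (Eq. 13)."""
    selected = np.asarray(solution) > threshold   # binarize the continuous position
    if not selected.any():                        # guard against an empty subset
        return 1.0                                # worst possible fitness
    knn = KNeighborsClassifier(n_neighbors=k)
    acc = cross_val_score(knn, X[:, selected], y, cv=5).mean()
    err_class = 1.0 - acc
    return alpha * err_class + (1.0 - alpha) * selected.sum() / X.shape[1]
```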

4.3 Sine cosine acceleration coefficients

Looking at the velocity update Eq. (1), it is made up of three terms. The first one, \(v(t)\), represents the previous velocity; the second term is the cognitive component directed toward the personal best position visited by the capuchin; and the third one is the social component that regulates the velocity of the capuchin toward the global best solution (\(Gbest\)). The second and third terms drive the algorithm to execute global and local searches, respectively. As can be seen, Eq. (1) is similar to the PSO velocity update formula, and it has been demonstrated that the second term of Eq. (1) can decrease the convergence rate quickly, while the third term of Eq. (1) can cause premature convergence [58].

In the original CapSA, the acceleration coefficients are set to a fixed value as in the original PSO. The optimizer's solution quality is influenced by the relative values of cognitive and social components. When the social component \({a}_{2}\) is relatively high in comparison with the cognitive component \({a}_{1}\), particles arrive at a local optimum sooner, and when the cognitive component is relatively high, particles meander over the search space [29]. Many studies have been conducted to determine the ideal mix of these elements [29, 59, 60].

These coefficients are updated in such a way that the cognitive component is lowered, and the social component is boosted as iteration progresses to improve the solution quality. Based on the work [61], the following equations are used to update the two acceleration coefficients:

$${a}_{1}=-2\times \mathrm{sin}\left(\frac{\pi }{2}\times \frac{t}{\mathrm{maxite}}\right)+2.5$$
(14)
$${a}_{2}=-2\times \mathrm{cos}\left(\frac{\pi }{2}\times \frac{t}{\mathrm{maxite}}\right)+2.5$$
(15)

According to [29], \({a}_{1}\) decreases during the search from 2.5 to 0.5, while \({a}_{2}\) increases from 0.5 to 2.5.
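
A minimal sketch of the two schedules of Eqs. (14)–(15):

```python
import numpy as np

def acceleration_coefficients(t, maxite):
    """Sine cosine schedules of Eqs. (14)-(15): a1 decreases from 2.5 to 0.5
    while a2 increases from 0.5 to 2.5 over the run."""
    a1 = -2.0 * np.sin(np.pi / 2.0 * t / maxite) + 2.5
    a2 = -2.0 * np.cos(np.pi / 2.0 * t / maxite) + 2.5
    return a1, a2
```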

4.4 Chaotic inertia weight strategy

In CapSA, the velocity update process is mainly based on the inertia weight, as in some previously introduced algorithms such as PSO [58] and BAT [62]. The inertia weight technique is critical in maintaining a balance between global and local search: it determines the contribution of a particle's former velocity to its new one at the current iteration. In [63], Shi and Eberhart introduced the inertia weight (IW) in PSO as a constant and illustrated that exploration is enhanced by a large value of the IW, whereas exploitation is improved when the value of the IW is small; that is, a large IW facilitates a global search, while a small IW facilitates a local search [63]. Many dynamic IW techniques have already been presented to augment PSO's capabilities, among them time-varying IW techniques, in which the value of the IW is modified according to the number of iterations [64,65,66].

Chaos has ergodic and stochastic properties. In a dynamic system, a global optimum or a good approximation of it can be attained with high probability by following chaotic orbits. Based on the work of [67], we introduce a chaotic optimization mechanism into CapSA and propose the use of the logistic map to tune the IW \(\rho\). The purpose of using a chaotic IW instead of the decreasing strategy used in the basic CapSA is to improve the population diversity during the search, as well as to enhance the ability to converge to the global optimum. The logistic map is applied to update the IW as described in Eq. (16):

$$r\left(t+1\right)=4\times r\left(t\right)\times \left(1-r\left(t\right)\right)\,\,\, r\left(0\right)=\mathrm{rand}$$
(16)

where \(r\left(0\right)\notin \left\{0, 0.25, 0.5, 0.75, 1\right\}\).

$$\rho \left(t\right)=r\left(t\right)\times {\rho }_{\mathrm{min}}+\frac{t\times \left({\rho }_{\mathrm{max}}-{\rho }_{\mathrm{min}}\right)}{\mathrm{maxite}}$$
(17)

where \({\rho }_{\mathrm{max}}\) and \({\rho }_{\mathrm{min}}\) are the maximum and minimum values of the IW, \(t\) stands for the current iteration, \(\mathrm{maxite}\) stands for the maximum number of generations, and \(\rho \left(t\right)\) represents the IW value at iteration \(t\). \(r\left(t\right)\) is a number between 0 and 1 generated by the logistic chaotic map.
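
The following sketch implements the logistic map of Eq. (16) and the inertia weight of Eq. (17) as written above; the values of \(\rho_{\mathrm{min}}\) and \(\rho_{\mathrm{max}}\) are assumptions.

```python
import numpy as np

def logistic_map(r):
    """One step of the logistic chaotic map (Eq. 16)."""
    return 4.0 * r * (1.0 - r)

def chaotic_inertia_weight(r, t, maxite, rho_min=0.4, rho_max=0.9):
    """Inertia weight of Eq. (17), driven by the chaotic value r(t)."""
    return r * rho_min + t * (rho_max - rho_min) / maxite

# Typical usage inside the main loop (assumed):
#   r = np.random.rand()              # r(0) not in {0, 0.25, 0.5, 0.75, 1}
#   for t in range(maxite):
#       r = logistic_map(r)
#       rho = chaotic_inertia_weight(r, t, maxite)
```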

4.5 Stochastic learning strategy

In CapSA, during each iteration, a capuchin updates its velocity using Eq. (1), which consists of three weighted terms: the first term (\({v}_{i}(t)\)) denotes the old velocity of the capuchin at the previous iteration; the second term \(({\mathrm{pbest}}_{i}\left(t\right)-{x}_{i}(t))\) is the "cognitive part," which reflects the capuchin's memory of its own historical experience; and the third term \((\mathrm{Gbest}\left(t\right)-{x}_{i}(t))\) is the "social part," which represents the information sharing and cooperation among capuchins. According to Eq. (1), capuchins update their positions by moving toward their personal best solution (pbest) and the global best solution (gbest). However, this strategy can lead to premature convergence and poor performance of CapSA. In order to add more diversity to the movement of capuchins, we propose in this paper a new velocity update equation, inspired by [30] and [31], as follows:

$${v}_{i}^{d}\left(t+1\right)=\rho .{v}_{i}^{d}\left(t\right)+{a}_{1}.\mathrm{rand}.\left({\mathrm{pbest}}_{{f}_{i}^{d}}^{d}(t)-{x}_{i}^{d}(t)\right)+{a}_{2}.\mathrm{rand}.\left(\mathrm{gBest}-{x}_{i}^{d}(t)\right)$$
(18)

where \({f}_{i}^{d}\) defines which capuchin's personal best the \(i\)th capuchin should follow at dimension \(d\). For each dimension of capuchin \({x}_{i}\), two capuchins are chosen randomly from its local neighborhood (a ring topology is used). Then, the fitness values of the personal best positions of these two capuchins are compared with that of the capuchin whose velocity is being updated, and the personal best of the capuchin with the better fitness is used in Eq. (18). This learning strategy helps capuchins learn from other good individuals in their local neighborhood, providing ECapSA with fast convergence and better global exploration ability.
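
A minimal sketch of this dimension-wise exemplar selection and of the velocity update of Eq. (18) is given below; the radius of the ring neighborhood, the inclusion of the capuchin itself in that neighborhood, and the assumption that fitness is minimized are choices made for illustration only.

```python
import numpy as np

def select_exemplar(i, pbest, pbest_fit, N):
    """Dimension-wise exemplar for capuchin i: for each dimension, pick the
    fitter (lower fitness) of two capuchins drawn at random from the ring
    neighborhood {i-1, i, i+1}."""
    d = pbest.shape[1]
    neighborhood = [(i - 1) % N, i, (i + 1) % N]
    exemplar = np.empty(d)
    for j in range(d):
        a, b = np.random.choice(neighborhood, size=2, replace=False)
        winner = a if pbest_fit[a] <= pbest_fit[b] else b
        exemplar[j] = pbest[winner, j]
    return exemplar

def update_velocity_sl(v_i, x_i, exemplar, gbest, rho, a1, a2):
    """Velocity update of Eq. (18) using the dimension-wise exemplar."""
    r1 = np.random.rand(x_i.size)
    r2 = np.random.rand(x_i.size)
    return rho * v_i + a1 * r1 * (exemplar - x_i) + a2 * r2 * (gbest - x_i)
```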

4.6 Levy random walk

In the exploration phase of CapSA, according to Eq. (9), the previous solution is moved to a randomly chosen position to generate a new individual, and the direction of the search is random. However, this strategy can lead to excessive exploration, which tends to decrease the convergence speed of the algorithm [68,69,70]. To deal with this issue, some optimizers replace the simple random walk with a Lévy flight random walk; we adopt the same idea to enhance the performance of CapSA. Indeed, the Lévy random walk helps to generate solutions far from existing ones and thus enables a better exploration of the search space [68,69,70].

Motivated by the interesting properties of the Lévy flight walk, and in order to improve the global search of CapSA, we reformulate Eq. (9) as follows:

$${x}_{i}\left(t+1\right)={x}_{i}\left(t\right)+\mathrm{Levy}\left(M\right).\left({x}_{r}\left(t\right)-{x}_{i}\left(t\right)\right)$$
(19)

where \({x}_{i}\) is the current solution to update, \({x}_{r}\) refers to a randomly picked solution through random permutation, and \(Levy(M)\) is the Levy flight step size.

$$\mathrm{Levy}\left(M\right)=0.01\times \frac{{r}_{1}\times \sigma }{{\left|{r}_{2}\right|}^{\frac{1}{\beta }}}$$
(20)

where \({r}_{1}\) and \({r}_{2}\) are two random numbers drawn from the range [0, 1], \(\beta\) is a constant, and

$$\sigma ={\left\{\frac{\Gamma \left(1+\beta \right)\times sin\left(\frac{\pi \beta }{2}\right)}{\Gamma \left(\frac{1+\beta }{2}\right)\times \beta \times {2}^{\left(\beta -1\right)/2}}\right\}}^{\frac{1}{\beta }},\Gamma \left(z\right)=\left(z-1\right)!$$
(21)

where \(\Gamma\) denotes the gamma function, with \(\Gamma \left(x+1\right)=x!\) for integer \(x\).
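
A possible implementation of this Lévy step is sketched below, following Eqs. (19)–(21) as written (with uniform \(r_1\) and \(r_2\)); the value \(\beta = 1.5\) is an assumption. The complete ECapSA procedure is described in Algorithm 2.

```python
import numpy as np
from math import gamma, pi, sin

def levy_step(dim, beta=1.5):
    """Levy flight step of Eqs. (20)-(21); r1 and r2 are uniform in [0, 1]."""
    sigma = (gamma(1.0 + beta) * sin(pi * beta / 2.0)
             / (gamma((1.0 + beta) / 2.0) * beta * 2.0 ** ((beta - 1.0) / 2.0))) ** (1.0 / beta)
    r1 = np.random.rand(dim)
    r2 = np.random.rand(dim)
    return 0.01 * r1 * sigma / np.abs(r2) ** (1.0 / beta)

def levy_relocate(x_i, x_r):
    """Exploration move of Eq. (19): step from x_i toward a randomly picked x_r."""
    return x_i + levy_step(x_i.size) * (x_r - x_i)
```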

Algorithm 2 Pseudocode of the proposed ECapSA

5 Experimental results

5.1 Datasets description

To assess the quality of ECapSA, 16 well-known UCI benchmark datasets were employed in this work. Many scholars utilize these datasets to compare performance in the field of FS. Table 1 summarizes the properties of these datasets. Training and test sets are generated using fivefold cross-validation for each dataset. The KNN classifier (with K = 5 in this article) is used to evaluate each feature subset obtained by each capuchin.

Table 1 The tested datasets

5.2 Evaluation measures and value of parameter

The following evaluation criteria, applied in many works on FS problems, are utilized to evaluate and compare the methods employed in this paper.

Average accuracy: It is given by Eq. (22) and is the mean of the accuracy values of the method over \(M\) runs:

$${\text{AvgAcc}} \, = \frac{1}{M}\sum \limits_{i=1}^{M}{\mathrm{Acc}}^{\mathrm{i}} \,$$
(22)

where \({\mathrm{Acc}}^{i}\) is the accuracy of the optimal agent at ith run.

Average fitness: It is determined by Eq. (23) as the average of the method's best fitness value across \(M\) runs.

$${\text{AvgFit}} \, = \frac{1}{M}\sum \limits_{i=1}^{M}{\mathrm{fit}}_{\mathrm{Best}}^{i} \,$$
(23)

Average feature selection ratio: It is calculated as the average of the ratio of the number of selected features to \(D\) over \(M\) runs, as given in Eq. (24).

$$\mathrm{AvgFR} = \frac{1}{M}\sum \limits_{i=1}^{M}\frac{{\mathrm{FSN}}^{i}}{D}$$
(24)

where \({\mathrm{FSN}}^{i}\) is the number of selected features obtained using the best solution at the \(i\)th run.

To assess the efficiency of the developed ECapSA, 16 well-regarded UCI datasets [71] are listed in Table 1. ECapSA is compared with the original CapSA and with five other metaheuristics: DE [72], BPSO [58], SSA [73], SCA [74], and the binary DA [75]. The tuning parameters of all algorithms are listed in Table 2. The control parameters of the original CapSA are taken from [23].

Table 2 Parameter setting of each method

Because the compared algorithms are stochastic, they use the same general parameters during the optimization phase to ensure a fair comparison: \(maxite\) is set to 100, the population size is set to 20, and each algorithm is run in ten separate executions. The parameter \(\alpha\) is set to 0.5, as commonly used in the FS literature. We also use fivefold cross-validation, and a KNN classifier with K = 5 is used to assess the quality of the feature subset encoded in each individual solution.

5.3 Results analysis and discussion

The classification results of all optimizers are reported in Tables 3, 4 and 5 according to the accuracy (ACC), fitness (Fit), and number of selected features (FSN) measures. In addition, we rank the ECapSA algorithm and the comparative approaches using the nonparametric Friedman test to see whether the difference between the suggested strategy and the others is significant based on the three assessment metrics. The mean rank of each FS approach among the statistical findings is presented in the last two rows of each table; for each measure, the approach with the best mean rank is considered the best.

Table 3 Comparison of the developed method and the other optimizers in terms of average accuracy and STD
Table 4 Comparison of the average fitness values of ECapSA and six other optimizers
Table 5 Comparison of the average number of selected features of ECapSA and the other optimizers

According to the results in Tables 3, 4 and 5, we make the following notices:

  • Table 3 shows that ECapSA outperforms the alternative optimizers for almost all datasets in terms of the evaluation measure ACC. Except for the Exactly2 dataset, where the original CapSA delivers a greater average accuracy value, ECapSA outperforms the original CapSA on all datasets. This finding demonstrates that the changes made to the classic CapSA have a positive effect on classification performance. For 12 of the sixteen datasets analyzed, ECapSA outperforms the other algorithms. It delivers the same average accuracy and STD values as CapSA and DE for the M-of-N and Exactly datasets. SCA gives the best accuracy for only one dataset: the Zoo dataset. According to Table 3, ECapSA is placed first with an overall rank value of 6.5938, followed by the original CapSA in second place with an overall rank value of 5.7813. DE, SCA, and SSA are ranked third, fourth, and fifth, respectively. The BPSO and BDA are placed sixth and seventh, respectively.

  • Table 4 demonstrates ECapSA's superiority in terms of average fitness. For 75 percent of the datasets, ECapSA surpasses the other six optimizers and has the greatest average fitness value. Except for the Exactly2 dataset, where CapSA offers the least value, ECapSA performs better than the original CapSA for all datasets. For two datasets, Exactly and M-Of-N, DE, CapSA, and ECapSA produce equivalent results. SCA, on the other hand, provides a greater average fitness value for the Zoo dataset. Table 4 also shows the mean rank of each FS technique in terms of average fitness value. As can be seen, the suggested ECapSA is ranked higher than CapSA. DE and SCA are ranked third and fourth, respectively, while SSA and BPSO are ranked fifth and sixth. BDA, once again, is the worst of all optimizers, coming in last place.

  • Considering the ratio of selected attributes presented in Table 5, ECapSA beats all optimizers in the majority of datasets. It produces a better average ratio value for 75% of the datasets, whereas CapSA, BDA, and BPSO each produce the best average ratio value for one dataset: IonosphereEW, SpectEW, and Tic-tac-toe, respectively. For the Exactly and M-of-N datasets, ECapSA and CapSA yield similar average ratio values. ECapSA has the lowest mean rank (1.4688) and is therefore placed first. With an overall rank of 2.7813, CapSA is ranked second, followed by BDA and BPSO in third and fourth position, respectively. SSA and SCA are tied with the same average rank, followed by DE in last place.

Figure 1 shows the average fitness value attained by ECapSA and CapSA for each dataset. Regarding Fig. 1, except for the Exactly2 dataset, ECapSA converges faster than the original CapSA on nearly all of the datasets studied. The adjustments introduced to improve the CapSA algorithm are the main reason for this increase in the rate of convergence toward the optimal solution. When comparing the behavior of ECapSA to that of the original CapSA, it is clear that ECapSA outperforms the traditional CapSA.

Fig. 1 The convergence curves of CapSA and ECapSA

Figure 2 shows the average convergence curves of seven optimizers on different datasets. It is clear from the figure that ECapSA outperforms other algorithms in terms of average fitness value in the majority of datasets. It should be mentioned that ECapSA presents faster convergence because the improvements introduced allowed for a better balance between exploration and exploitation capabilities.

Fig. 2 The convergence curves of ECapSA and other optimizers

Table 6 compares the results of ECapSA with several state-of-the-art methodologies in terms of classification accuracy to further evaluate the performance of the suggested methodology: WOA based on crossover and mutation (WOA-CM) [76], the Chaotic Interior Search Algorithm (CISA) [77], BBO [78], the Satin Bowerbird Optimizer (SBO) [78], the Binary Bat Algorithm (BBA) [79], and the Binary Grasshopper Optimization Algorithm (BGOA). Missing values in Table 6 are indicated with "–". We can see from the findings in Table 6 that ECapSA produces greater classification accuracy for eight datasets and gives optimal performance for the M-Of-N and Exactly datasets. ECWSA-4 is ranked second because it provides higher accuracy values for four datasets: BreastEW, Exactly2, HeartEW, and Zoo. CISA is ranked third thanks to its performance on the Zoo dataset. The results of the other optimizers were respectable, although they did not achieve the best accuracy on any dataset.

Table 6 Comparison with other FS methods from literature

6 Conclusion

In this paper, an improvement of the capuchin search algorithm (CapSA) has been presented as a feature selection approach. The enhancement of CapSA, named ECapSA, relies on four improvements: a dynamically decreasing inertia weight using the logistic map, a sine cosine-based adjustment of the acceleration coefficients, a stochastic learning strategy, and a Lévy flight-based search strategy to improve the convergence rate. To justify the efficiency of ECapSA, it has been compared with different MH techniques, including CapSA, BDA, SSA, SCA, DE, and BPSO. In addition, a set of sixteen datasets has been used to evaluate the performance of the competing algorithms. According to the obtained results, the developed ECapSA is more efficient than the competing MH and other state-of-the-art techniques. In addition, the convergence of the developed method is better than that of the other methods, as can be noticed from the convergence curves, and according to the Friedman test, the developed ECapSA obtained the first mean rank among the comparative algorithms for the considered performance metrics. Despite these advantages, ECapSA still needs some improvements, especially regarding the time required to determine the relevant features. This can be tackled by reducing the population size or by using a local search method to improve the global solution by removing irrelevant features and adding relevant ones. In addition, the sine cosine acceleration coefficients and the chaotic inertia weight strategy could be applied according to some criterion, for example only during the first half of the iterations or according to a random value (rand > 0.5), which would reduce the cost of updating them at each iteration.

Besides its superiority, ECapSA can be applied to other real-life areas, including task scheduling in cloud computing and Internet of Things problems. Most of these problems are similar to feature selection problems since they are discrete problems to which ECapSA can be extended. Moreover, ECapSA can be used within prediction techniques (i.e., classification or regression) to improve the performance of different machine learning methods; for example, it can be combined with a random vector functional link network or an artificial neural network (ANN). In addition, the proposed method can be applied as a multi-objective optimization technique to minimize or maximize several objective functions; however, it must then be combined with the concepts of Pareto front and archive to save the optimal solutions.