Abstract
This paper proposes new improved binary versions of the Sine Cosine Algorithm (SCA) for the Feature Selection (FS) problem. FS is an essential machine learning and data mining task of choosing a subset of highly discriminating features from noisy, irrelevant, high-dimensional, and redundant features to best represent a dataset. SCA is a recent metaheuristic algorithm designed to emulate a model based on the sine and cosine trigonometric functions. It was initially proposed to tackle problems in the continuous domain. The SCA has been modified to a Binary SCA (BSCA) to deal with the binary domain of the FS problem. To improve the performance of BSCA, three accumulative improved variations are proposed (i.e., IBSCA1, IBSCA2, and IBSCA3), where the last version has the best performance. IBSCA1 employs Opposition Based Learning (OBL) to help ensure a diverse population of candidate solutions. IBSCA2 improves IBSCA1 by adding Variable Neighborhood Search (VNS) and Laplace distribution to support several mutation methods. IBSCA3 improves IBSCA2 by optimizing the best candidate solution using Refraction Learning (RL), a novel OBL approach based on light refraction. For performance evaluation, 19 real-world datasets, including a COVID-19 dataset, were selected with different numbers of features, classes, and instances. Three performance measures were used to test the IBSCA versions: classification accuracy, number of selected features, and fitness value. Furthermore, the performance of the final variation, IBSCA3, was compared against 28 existing popular algorithms. Interestingly, IBSCA3 outperformed almost all comparative methods in terms of classification accuracy and fitness values, while it was ranked 15 out of 19 in terms of number of features. The overall simulation and statistical results indicate that IBSCA3 performs better than the other algorithms.
1 Introduction
Back in 2003, the amount of generated data was around five exabytes. Nowadays, the same amount of data, and even more, is produced within two days [1]. This rapid increase in the volume, velocity and variety of data raises challenges and, at the same time, opportunities. Dealing with such data is a challenge, but there are opportunities to utilize the data for beneficial applications [2].
In order to perform data mining, data are first pre-processed [3], which involves cleaning and preparing the data to best meet the requirements of input for later stages. One possible pre-processing step is Feature Selection (FS) [3], which is a method of choosing a subset of features of a dataset that can best represent the data accurately without redundancy, noise, or repetition. FS is used in a wide number of applications, including data classification [4,5,6], data clustering [7,8,9], image processing [10,11,12,13], and text categorization [14, 15].
Generally speaking, FS techniques are based either on an evaluation criterion or on a search strategy. Evaluation criterion-based methods can be further classified as either filters or wrappers. The main difference between the two is the absence or presence (respectively) of a learning algorithm used to evaluate feature subsets. Chi-Square [16], Gain Ratio [17], Information Gain [18], support vector machines [19], ReliefF [20, 21], and hybrid ReliefF [22, 23] are filter methods. They depend upon correlations between features and classes in the dataset. Wrapper FS methods [24], on the other hand, utilize learning algorithms. A disadvantage of wrapper FS methods is their high computational cost; however, they often give precise results.
Due to its huge search space, the FS problem has been shown to be NP-Hard [25, 26]. Thus, it is costly and time-consuming to employ exact methods to find a solution. When searching for approximate solutions, however, randomized search strategies, such as sequential forward, sequential backward, random, and heuristic search [27], often enhance results. Further, metaheuristic algorithms often lead to efficient implementations of various FS methods.
Metaheuristic algorithms use heuristic strategies or guidelines in optimization algorithms to solve complex optimization problems (e.g., the FS problem) in real time. Unlike single-purpose algorithms, metaheuristic algorithms can be applied to many different optimization problems [27,28,29,30,31,32,33]. One major category of metaheuristic algorithms is Swarm Intelligence (SI), where creature swarms are the main inspiration (e.g., ants, flocks, bees) [34]. SI algorithms have been tested on various optimization problems, including FS. For instance, the authors of [35] applied the powerful SI algorithm Grey Wolf Optimizer (GWO) to an FS problem and reported respectable performance. Similarly, the Antlion Optimizer (ALO) [36] has been successfully used as a wrapper for an FS strategy, and the Whale Optimization Algorithm (WOA) has been utilised in several different implementations of FS algorithms [37,38,39,40], as have Particle Swarm Optimization (PSO) [41], Artificial Bee Colony (ABC) [42], Ant Colony Optimization (ACO) [43, 44], the Gravitational Search Algorithm [45], and the Salp Swarm Algorithm (SSA) [46,47,48].
Indeed, the difficulty of the FS problem increases considerably with the dimensionality of the original problem. For instance, when the FS data has n features, its search space contains \(2^{n}\) different solutions. Thus, any metaheuristic algorithm used to tackle such an FS problem often requires modification to work well given the complex nature of the FS search space. This observation is echoed by the No Free Lunch (NFL) theorem [49], which states that no single algorithm can achieve the best performance across all optimization problems, or even across different instances of the same problem. Therefore, research opportunities remain to introduce new or modified metaheuristic algorithms for FS problems.
Besides the previously mentioned SI algorithms, metaheuristics algorithms can imitate a physical rule, evolutionary phenomena, or human-based technique [50]. To this end, Seyedali Mirjalili proposed a metaheuristic algorithm called the Sine Cosine Algorithm (SCA) [50] in 2016. SCA is a population-based algorithm inspired by the sine and cosine trigonometric functions. The simplicity, robustness and efficiency of the algorithm are SCA’s main advantages. Those characteristics have motivated others to implement SCA for different optimization problems. For example, truss structure optimization is an architecture-based optimization problem [51] where SCA has been applied. SCA has also been adapted to support the travelling salesperson problem [52], text categorization [53], image segmentation [54], object tracking [55], unit commitment [56], optimal design of a shell and tube evaporator [57], abrupt motion tracking [58], and parameter optimization for support vector regression [59].
Because real-world problems are complex and constrained, researchers have attempted to enhance SCA in a number of different ways. Firstly, SCA operators have been modified to deal with particular problems [60,61,62,63]. Alternatively, SCA has been hybridized with i) local search algorithms [52, 64, 65], ii) population-based algorithms [66, 67], or iii) operators from other optimization algorithms [65, 68]. For instance, in [62] the SCA exploration and exploitation phases were managed by a nonlinear conversion parameter. In addition, to help avoid local optima, the position update equation was modified. Another example of SCA hybridization improves exploitation using the Nelder-Mead simplex concept and the Opposition-based Learning (OBL) search strategy [64]. Further, the diversification of SCA has been enhanced by integrating SCA with a random mutation and a Gaussian local search technique [65]. Quite recently, Al-betar et al. [69] introduced a memetic version of SCA to solve the economic load dispatch problem. In this approach, adaptive β-hill climbing [70] was hybridized with the optimization framework of SCA to better balance exploration and exploitation.
SCA was initially proposed for continuous decision variables. However, with a mapping function (transforming a continuous search space to binary), a binary SCA (BSCA) version was introduced in [71], where it was implemented for an FS optimization problem, and verified to be an efficient technique. The performance, accuracy, capability, and variety of decision variables’ types are the factors that motivated us to conduct the research described in this paper. We propose three versions of the Improved Binary Sine Cosine Algorithm (i.e., IBSCA1, IBSCA2, and IBSCA3) for the FS problem, in which different approaches of exploration and exploitation are conducted. Consequently, this leads to the following contributions:
-
We apply Opposition Based Learning (OBL) in IBSCA1 to ensure a diverse population of solutions. The use of OBL is expected to expand the search region and improve the solution’s approximation.
-
IBSCA2 builds on IBSCA1 and includes Variable Neighborhood Search (VNS) and Laplace distribution to explore the search space using several mutation methods (swap, insert, inverse, or random mutation). One of the advantages of VNS is that the mutated solution may break out of a local optimum.
-
IBSCA3 builds on IBSCA2 and enhances the best candidate solution using Refraction Learning (RL). RL is a novel opposition learning approach that is based on the principle of light refraction. It is expected to improve the ability of IBSCA3 to jump out of local optima.
-
The three exploration techniques are applied in an incremental manner, where IBSCA3 implements all of the three exploration techniques. Our purpose here is to show that the incremental integration of each exploration method gradually improves the performance of IBSCA and eventually leads to a strong optimization algorithm (IBSCA3).
-
The candidate solutions produced by the optimization process of SCA and RL are continuous. Therefore, we used the V3 transfer function to convert the values of continuous decision variables into binary ones. V3 was selected based on extensive simulations on eight binary transfer functions (4 S-shaped and 4 V-shaped transfer functions). The experimental results indicated that V3 is the most viable transfer function.
-
We evaluate the variations of IBSCA utilizing 19 well-known datasets (18 FS datasets from the UCI repository and a COVID-19 dataset). IBSCA3 is found to be the most efficient version of IBSCA (Section 5.2).
-
The performance of IBSCA3 was evaluated and compared to 10 popular binary algorithms (Section 5.3). The overall simulation results indicate that IBSCA3 outperformed all the compared algorithms in terms of accuracy and number of features selected over most of the datasets.
-
We compared IBSCA3 to 10 state-of-the-art algorithms that adopt OBL-enhanced methods, VNS and Laplace distribution (Section 5.4). We found that IBSCA3 produces the best results among the results of the compared algorithms.
-
We compared IBSCA3 to seven popular variations of SCA (Section 5.5). The experimental results indicate that IBSCA3 is the most accurate algorithm.
The accumulative advantages proposed for IBSCA are included in IBSCA3 where the method has the ability to diversify the search through Opposition Based Learning (OBL) and intensify the search through Variable Neighborhood Search (VNS) while also having the ability to escape local optima through Refraction Learning (RL). By means of these improvements, a superior method (i.e., IBSCA3) is introduced for the FS problem.
In general, the overall simulation results indicate that IBSCA3 outperforms the compared algorithms, based on accuracy and number of features selected, over almost all tested datasets. Note that there are two main differences between IBSCA3 and the other hybrid optimization algorithms that attempt to solve the FS problem. First, IBSCA3 is the only hybrid algorithm that combines OBL, RL, VNS and Laplace distribution in a single algorithm. Second, IBSCA3 is the first such algorithm to include Laplace distribution inside VNS.
The rest of the paper is organized as follows: SCA optimization problem implementations and versions are highlighted in Section 2. Section 3 then reviews the binary Sine Cosine algorithm and the objective function used. The newly proposed Improved Binary SCA with multiple exploration and exploitation approaches (IBSCA) for solving the FS problem is presented in Section 4. For the purpose of evaluation, the algorithms’ performances over different experiments are compared and discussed in Section 5. Lastly, Section 6 summarises the work and presents potential future research avenues.
2 Related work
Several discrete variations of SCA have been developed to solve the FS problem [48, 61, 72,73,74,75,76,77,78]. This section examines recently proposed variations of the SCA for global optimization and solving the FS problem.
El-kenawy and Ibrahim [72] introduced a binary hybrid optimization algorithm (Binary SC-MWOA) that combines the SCA algorithm and a modified Whale Optimization Algorithm. Binary SC-MWOA uses the sigmoid function to convert the continuous candidate solutions generated by the optimization operators of the SC and whale optimization algorithms into binary discrete solutions that can be used for the FS problem. Binary SC-MWOA was evaluated over 10 UCI repository datasets and compared to a number of popular optimization algorithms including the Grey Wolf Optimizer (GWO) [79], Whale Optimization Algorithm (WOA) [80] and a memetic firefly algorithm. Binary SC-MWOA was able to find an optimal subset of features with the lowest classification error.
Neggaz et al. [48] presented a new hybrid optimization algorithm for FS called ISSAFD that combines the optimization operators of the SC algorithm and the Disrupt Operator of the Salp Swarm Optimizer (SSA). ISSAFD optimizes followers’ positions in the SSA algorithm using sinusoidal mathematical functions similar to those in SCA operators. The disrupt operator diversifies the population of candidate solutions in the algorithm. The performance of ISSAFD was compared to many optimization algorithms including SSA, SCA, binary GWO (bGWO), PSO, ALO, and Genetic Algorithm (GA) over four well-known datasets. The simulation results suggested that ISSAFD was more accurate, had higher sensitivity, and chose fewer features than the other tested FS algorithms.
Hussain et al. [73] suggested an algorithm to solve continuous optimization problems and the FS problem called SCHHO that integrates the SCA algorithm in the Harris Hawks Optimization (HHO) algorithm. The goal of SCHHO is to use SCA as an exploration method in HHO. In addition, the exploitation ability of HHO is improved in SCHHO by having candidate solutions adjust dynamically to help avoid staying in local optima. As reported in [73], SCHHO performs much better than popular optimization algorithms, including Dragonfly algorithm (DA), grasshopper optimization algorithm (GOA), GWO, WOA, and SSA.
The wrapper-based Improved SCA (ISCA) [61] adds an Elitism strategy to SCA as well as a mechanism to update the best solution. The experimental results in [61] suggest that ISCA provides more accurate results and fewer features than GA, PSO and the original SCA algorithm.
Abd Elaziz et al. [74] proposed SCADE, an algorithm that combines the differential evolution (DE) algorithm with the SCA algorithm. DE’s optimization operators are used at each iteration of SCA to improve its population of solutions, which helps the SCA algorithm avoid local optima. SCADE’s performance was assessed over eight UCI datasets in comparison with three popular algorithms (social spider optimization (SSO), ABC and ACO [74]), with SCADE obtaining the best results.
Abualigah and Dulaimi [75] introduced the hybrid SCA and GA algorithm (SCAGA) for solving the FS problem. In SCAGA, the genetic optimization operators (crossover and mutation) are used to improve the optimization process of SCA and balance between its exploration and exploitation of candidate solutions. SCAGA was compared to SCA, PSO, and ALO using 16 UCI datasets. SCAGA was found to be a better feature-selection method than the other tested algorithms in terms of the maximum obtained accuracy and minimal obtained features.
Sindhu et al. [77] proposed an algorithm named Improved Biogeography Based Optimization (IBBO) for solving the FS problem. IBBO attempts to improve the optimization process of Biogeography Based Optimization (BBO) by employing the optimization operators of SCA after the migration operator of BBO. The performance of IBBO was compared to the performance of popular optimization algorithms such as BBO, SCA, GA, PSO, and ABC using four popular datasets. The simulation results suggest that IBBO is more accurate and selects fewer features compared to the other FS algorithms.
SCA may get stuck in sub-optimal regions during its optimization process because its exploration operators (i.e., the two trigonometric functions of SCA) cannot always efficiently explore the search space. Abd Elaziz et al. [76] proposed Opposition-based SCA (OBSCA), a variation of SCA that uses the OBL technique to improve SCA’s performance. In OBSCA, OBL selects the best candidate solutions and generates their opposite solutions in an attempt to reach more accurate solutions. OBSCA was compared in [76] to several optimization algorithms including SCA, Harmony Search (HS), GA, and PSO using standard optimization test functions and real-world engineering problems. OBSCA performed competitively compared to the other algorithms.
Kumar and Bharti [78] proposed the Hybrid Binary PSO and SCA algorithm (HBPSOSCA). In this algorithm, a V-shaped transfer function converts continuous candidate solutions into binary solutions. The effectiveness of HBPSOSCA was compared in [78] to binary PSO, modified BPSO with chaotic inertia weight, binary moth flame optimization algorithm, binary DA, binary WOA, binary SCA, and binary ABC using 10 standard benchmark functions and seven real-world datasets. The conducted experiments showed that HBPSOSCA exhibited better performance in most of the tested cases.
ASOSCA [81] is a hybrid optimization algorithm based on the Atom Search Optimization (ASO) algorithm and the SCA algorithm. It is basically used for automatic clustering. In ASOSCA, SCA is used to improve the quality of candidate solutions (i.e., reduce the number of features and improve accuracy of the solutions) in ASO. The performance of ASOSCA was compared in [81] to other optimization methods (e.g., SCA, ASO, PSO) using 16 clustering datasets and different cluster validity indexes. ASOSCA performed better than the other tested algorithms.
The Artificial Algae Algorithm (AAA) is a metaheuristic for solving continuous optimization problems [82]. It was inspired by the living behaviors of microalgae, a photosynthetic species. Turkoglu et al. [83] proposed eight binary versions of the AAA algorithm for solving the FS problem. Each binary version of AAA uses a different transfer function (four V-shaped and four S-shaped transfer functions). The performance of the binary versions of AAA was compared to the performance of seven well-known optimization algorithms (BBA, binary CS, binary Firefly algorithm, binary GWO, binary Moth flame algorithm, binary PSO, binary WOA [83]) using the UCI datasets. The experimental results indicate that the binary versions of AAA outperform the other tested algorithms.
The Horse herd Optimization Algorithm (HOA) is a metaheuristic that simulates the survival behaviour of a pack of horses in solving NP-hard optimization problems [84]. Awadallah et al. [85] proposed fifteen binary versions of HOA (BHOA) for solving the FS problem. The fifteen variations of BHOA were created by combining three popular crossover operators (one-point, two-point and uniform) with three transfer-function categories (S-shaped, V-shaped and U-shaped). The versions of BHOA were tested against each other using 24 real-world datasets, and the experimental findings suggest that the best version of BHOA is the one combining the S-shaped transfer function with one-point crossover.
The Black Widow Optimization (BWO) algorithm is a population-based optimization algorithm that mimics the mating process of black-widow spiders to solve continuous optimization problems [86]. However, the BWO algorithm converges slowly when attempting to solve hard optimization problems. Therefore, an enhanced version of BWO (SDABWO) was proposed in [87] to improve the convergence behaviour of BWO and solve the FS problem. Three techniques were integrated in SDABWO. First, the spouses of male spiders are chosen by a computational procedure that takes into account the weight of female spiders and the distance between spiders. Second, the mutation operators of differential evolution are used in SDABWO’s mutation phase in order to escape from local optima. Lastly, the three key parameters of SDABWO (procreating rate, cannibalism rate, and mutation rate) are adjusted dynamically over the course of its run. SDABWO was compared to five well-established optimization algorithms (GWO, PSO, DE, BOA, HHO) using 12 datasets from the UCI repository. The experimental results indicate that SDABWO outperforms the other compared algorithms.
The chimp optimization algorithm (ChOA) is an optimization algorithm inspired by the behaviour of individual chimps in their group hunting for prey [88]. This algorithm was originally proposed for solving continuous optimization problems. The binary chimp optimization algorithm (BChOA) for solving the FS problem was introduced in [89]. BChOA has two variations, which result from combining ChOA with the one-point crossover operator and two transfer-function categories (S-shaped and V-shaped). The two versions of BChOA were compared to six popular metaheuristics (GA, PSO, BA, ACO, firefly algorithm, and flower pollination) and the results revealed that the two versions of BChOA perform better than the other tested algorithms.
The Hunger Games Search Optimization (HGSO) algorithm is an optimization algorithm for continuous mathematical problems. It was inspired by the anxiety of prey about being eaten by their predators [90]. Devi et al. [91] presented two binary versions of the HGSO algorithm for the FS problem, which use V-shaped and S-shaped transfer functions to convert continuous solutions into binary ones. Binary HGSO was compared to well-known optimization algorithms (e.g., binary GWO and BSCA) using 16 datasets from the UCI repository. The simulation results demonstrated that binary HGSO is more accurate and selects fewer features than the other tested algorithms.
In summary, many of the hybrid SCA variations in this section, including Binary SC-MWOA, ISSAFD, SCHHO, SCADE, HBPSOSCA and SCAGA, have internal parameters that require fine tuning and use iterative optimization operators inside their optimization loops (e.g., the crossover and mutation operators in SCAGA). In general, when compared to traditional optimization algorithms, hybrid methods require more computation (e.g., ASOSCA, HBPSOSCA, SCHHO). We were encouraged to use SCA in this work because the candidate solutions in SCA can easily be converted to binary solutions using the transfer function described in Section 4.3.
3 Binary version of sine cosine algorithm for FS
The Sine Cosine Algorithm (SCA) [50], summarized in code in Algorithm 3 and pictorially in Fig. 1, iteratively optimizes a population of candidate solutions using basic trigonometric functions. A candidate solution is usually made of m decision variables X =< x1,x2,...,xm >, each initially generated randomly between the lower (LB) and upper (UB) bound for the variable. Once an initial population of candidate solutions has been randomly generated, SCA uses the problem’s fitness function to calculate a fitness value for each candidate solution. The iterative optimization process of SCA then begins, and the decision variables of each candidate solution \({X^{t}_{i}}\) are updated as follows:

$$X^{t+1}_{i}=\begin{cases} X^{t}_{i}+r_{1}\sin(r_{2})\left|r_{3}P^{t}_{i}-X^{t}_{i}\right|, & r_{4}<0.5\\ X^{t}_{i}+r_{1}\cos(r_{2})\left|r_{3}P^{t}_{i}-X^{t}_{i}\right|, & r_{4}\geq 0.5 \end{cases} \qquad (1)$$

where r1, r2, r3 and r4 are random numbers and \({P^{t}_{i}}\) is the position of the destination point for \({x^{t}_{i}}\) at iteration t. In detail, r1 balances exploration and exploitation by controlling the range of the trigonometric functions in (1). The value of r1 is selected at each iteration of SCA as follows:

$$r_{1}=a-t\,\frac{a}{T} \qquad (2)$$
where a is a constant, t is the iteration number and T is the maximum number of iterations. r2 ∈ [0,2π] specifies the distance and direction of the movement related to the destination. r3 ∈ [0,2] determines the weight of the destination point \({P^{t}_{i}}\). The fourth parameter r4 ∈ [0,1] is a number used to randomly choose one of the two options in (1).
The FS problem is a binary optimization problem. A hypercube represents its search space, and a bit flip in the candidate vector changes the candidate position in the search space (X = {x1,x2,...,xm}). However, given that SCA is originally for continuous optimization problems, there is a need for a mapping function. The transfer function (TF) proposed by [92] is utilized to map a candidate continuous value to its corresponding binary value. In this paper, the use of the TF is based on literature work described in [93].
In more detail, the TF is applied as follows. First, the probability of flipping a bit is calculated using (3), where \({v_{i}^{d}}(t)\) refers to the velocity of the dth dimension in the ith step vector (velocity) at the current iteration (t). Next, the decision value is updated based on (4): a random number r ∈ [0,1] is generated and, if the flipping probability \(T({v_{i}^{d}}(t))\) is greater than r, then a bit flip takes place on the dth element of the position vector (Xi(t + 1)). This TF is called V-shaped and is visualized in Fig. 2.
3.1 Objective function
In every optimization problem there must be an objective function: an evaluation function used to measure a solution’s effectiveness. In the case of the FS optimization problem, a wrapper (optimizer) aims to i) minimize the number of selected features, and ii) maximize the classification accuracy. Therefore, the objective function is as illustrated in (5):

$$\text{Fitness}=\alpha\,\text{ERR}(D)+\beta\,\frac{|R|}{|N|} \qquad (5)$$

The focus is to minimize the classification error rate and the selection ratio, where the classification error rate is denoted ERR(D) and the selection ratio is the number of selected features (|R|) divided by the total number of features (|N|). α ∈ [0,1] is the weight assigned to the classification error rate, and β = 1 − α is the weight assigned to the selection ratio [94].
4 Proposed algorithm: an improved binary sine cosine algorithm with multiple exploration and exploitation approaches for feature selection
We present three versions of our binary optimization algorithm called Improved Binary SCA with multiple exploration and exploitation approaches (IBSCA) which can be used to solve FS problems. Algorithm 2 and the flowchart in Fig. 3 present the details of this approach. Three exploration techniques are applied in an accumulative manner to the three versions of IBSCA (IBSCA1, IBSCA2, IBSCA3), where IBSCA3 uses all of the three exploration techniques. The three versions of IBSCA are as follows:
-
IBSCA1: OBL is used as the exploration method.
-
IBSCA2: Builds on IBSCA1 by additionally using the VNS method combined with the Laplace distribution to explore the search space using several mutation methods.
-
IBSCA3: Builds on IBSCA2 by additionally using Refraction Learning to improve the current best candidate solution at each iteration of the optimization loop of SCA.
4.1 Representation of candidate solutions
A candidate solution for a FS problem with m features is a vector of m binary decision variables. Given a candidate solution X, xi = 1 means that the ith feature is included in X, whereas xi = 0 means that it is not. Table 1 shows an example candidate solution consisting of 10 decision variables X =< x1 = 0,x2 = 1,x3 = 1,...,x10 = 1 >.
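To make this representation concrete, the following minimal sketch (with made-up data) shows how such a binary vector acts as a feature mask on a data matrix:

```python
import numpy as np

# Hypothetical candidate solution for a dataset with 10 features:
# a 1 at position i means feature i is selected, a 0 means it is excluded.
X = np.array([0, 1, 1, 0, 1, 0, 0, 1, 0, 1])

# Applying the mask to a data matrix keeps only the selected columns.
data = np.random.rand(5, 10)   # 5 instances, 10 features (dummy values)
selected = data[:, X == 1]

print(selected.shape)          # 5 instances, 5 selected features
```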
4.2 Population initialization
The performance of optimization algorithms can be improved by a diversified initial population of solutions [95,96,97]. One possible way to create a diverse initial population is by using the opposition-based learning (OBL) approach. OBL is an intelligent method developed from the observation that considering opposite candidate solutions can lead to improved search times [98]. It can be applied to the decision variables in machine learning, optimization and search algorithms. For example, if X = 〈x1,x2...,xm〉 is a candidate solution with m decision variables, the opposite candidate solution Xo is calculated as follows:

$$x^{o}_{i}=\text{LB}_{i}+\text{UB}_{i}-x_{i}, \qquad i=1,\dots,m$$
where LBi is the lower bound for variable i and UBi is its upper bound.
The initialization stage is similar in all versions of IBSCA. In this stage, the first half of the population is generated randomly. The remainder of the population is generated by applying OBL to the first half (Line 1 in Algorithm 2). The use of OBL is expected to expand the search region and improve the solution’s approximation.
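This initialization step can be sketched as follows, assuming continuous variables in [LB, UB] before binarization (the function name, bounds and population size are illustrative, not taken from the paper):

```python
import numpy as np

rng = np.random.default_rng(42)

def init_population_obl(pop_size, m, lb=0.0, ub=1.0):
    """Generate half the population at random and the other half as the
    OBL opposites x_o = lb + ub - x of the first half."""
    half = pop_size // 2
    first = rng.uniform(lb, ub, size=(half, m))
    opposite = lb + ub - first        # opposition-based learning
    return np.vstack([first, opposite])

pop = init_population_obl(pop_size=10, m=4)
```

Each row in the second half mirrors a row in the first half around the midpoint of the search interval, which spreads the initial population over the search space.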
OBL can also be applied in the initialization stage of other optimization algorithms (e.g., Cuckoo Search [96, 99], Grey Wolf Optimizer [100], Whale Optimization [101]). As can be seen in Section 5.2, the performance of IBSCA using only OBL is slightly better than that of BSCA. This makes OBL a good base on which to later add VNS, Laplace distribution, and RL to strongly improve IBSCA’s performance.
4.3 Discretization strategy
Candidate solutions produced by the optimization process of SCA and RL are continuous. Therefore, we use two-step transfer functions to convert the continuous decision variables into binary ones (lines 8 and 10).
Table 2 shows eight binary transfer functions (4 S-shaped and 4 V-shaped transfer functions). We conducted extensive simulations to verify the efficiency of these transfer functions and found that V3 was the most viable transfer function. The experimental results in [93, 102] confirm our conclusion about V3. Thus, V3 is adopted in our experiments.
In V3, each decision variable \({x^{j}_{i}}\) in candidate solution \(X_{i}=<{x_{i}^{1}},{x_{i}^{2}}, ..., {x_{i}^{m}}>\) at iteration t is used to calculate the probability of altering \({x^{j}_{i}}\) to 0 or 1. The probability is calculated as follows:

$$T({x^{j}_{i}}(t))=\left|\frac{{x^{j}_{i}}(t)}{\sqrt{1+\left({x^{j}_{i}}(t)\right)^{2}}}\right|$$

Then, \({x^{j}_{i}}(t)\) is set to 0 or 1 as follows:

$$ {x^{j}_{i}}(t+1)=\begin{cases} \neg\,{x^{j}_{i}}(t), & r<T({x^{j}_{i}}(t))\\ {x^{j}_{i}}(t), & \text{otherwise} \end{cases} $$

where r ∈ [0,1] is generated randomly and ¬ denotes the complement of the bit. The chance of flipping to the new value \({x^{j}_{i}}(t+1)\) increases as \(T({x^{j}_{i}}(t))\) increases.
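The V3 rule can be sketched in a few lines of NumPy. This is a toy sketch (variable names are ours), using the standard form of the V3 transfer function, T(x) = |x / √(1 + x²)|:

```python
import numpy as np

rng = np.random.default_rng(0)

def v3(x):
    """V3 transfer function: T(x) = |x / sqrt(1 + x^2)|, bounded in [0, 1)."""
    return np.abs(x / np.sqrt(1.0 + x ** 2))

def binarize_v3(x_cont, x_bin_prev):
    """Flip each previous bit with probability T(x): the bit is complemented
    when a uniform random r falls below T(x), and kept otherwise."""
    prob = v3(x_cont)
    r = rng.random(x_cont.shape)
    return np.where(r < prob, 1 - x_bin_prev, x_bin_prev)

x_cont = np.array([-2.0, -0.1, 0.0, 0.1, 2.0])
x_bin = binarize_v3(x_cont, np.zeros(5, dtype=int))
```

Note that a continuous value of exactly 0 gives a flip probability of 0, so the corresponding bit is always kept; large magnitudes in either direction make a flip almost certain.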
4.4 Fitness function
In wrapper FS methods, we seek to minimize the number of selected features while maximizing classification accuracy. These two conflicting goals should be taken into account in the fitness function. We adopted the following fitness function for our proposed algorithm:

$$F(X)=\alpha\,\text{ERR}+\beta\,\frac{|R|}{|N|}$$
where F(X) is the fitness function of candidate solution X, ERR is the error rate obtained by a k-Nearest Neighbor classifier using X, |R| is the number of features in X, |N| is the total number of features in the dataset, α is the weight for ERR and β = 1 − α is the weight for the selection ratio (|R|/|N|).
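This weighted sum can be sketched directly (the value α = 0.99 below is a common choice in the FS literature, assumed here rather than taken from the paper):

```python
def fitness(error_rate, n_selected, n_total, alpha=0.99):
    """F(X) = alpha * ERR + beta * |R|/|N|, with beta = 1 - alpha.
    Lower values are better: both the classification error and the
    fraction of selected features are being minimized."""
    beta = 1.0 - alpha
    return alpha * error_rate + beta * (n_selected / n_total)

# A solution with lower error but the same feature count scores better.
print(fitness(0.10, 5, 10))   # 0.99*0.10 + 0.01*0.5 = 0.104
print(fitness(0.20, 5, 10))   # 0.99*0.20 + 0.01*0.5 = 0.203
```

With α close to 1, accuracy dominates and the feature-count term mainly breaks ties between solutions of similar error.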
4.5 Optimization loop
The optimization loop of IBSCA starts at Line 3 in Algorithm 2 and ends at Line 15. The first step is to evaluate each candidate solution using the fitness function (Section 4.4). Then, the random parameters of the algorithm (r1, r2, r3 and r4) are initialized and the best solution is determined (P = X∗). Afterwards, all the candidate solutions are updated using (1), and the two-step transfer function (Section 4.3) is applied to the updated solutions to generate their binary equivalents. In Line 9, RL is applied to the best solution X∗ as described in Section 4.5.1, and the result is converted to a binary solution using the two-step transfer function. Finally, a combination of Variable Neighborhood Search with the Laplace distribution (Lines 11-14) is applied to a randomly selected solution from the current population, as described in Section 4.5.2.
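The control flow of the loop can be sketched as below. This is a toy sketch: the fitness is a stand-in sphere function, and the binarization, RL and VNS steps are omitted, so only the SCA skeleton from (1) and (2) is shown (all names and parameter values are ours):

```python
import numpy as np

rng = np.random.default_rng(3)

def sca_update(X, P, r1):
    """One SCA position update: each solution moves relative to the
    destination P along a sine or cosine trajectory, picked by r4."""
    r2 = rng.uniform(0.0, 2.0 * np.pi, X.shape)
    r3 = rng.uniform(0.0, 2.0, X.shape)
    r4 = rng.random((X.shape[0], 1))
    sine = X + r1 * np.sin(r2) * np.abs(r3 * P - X)
    cosine = X + r1 * np.cos(r2) * np.abs(r3 * P - X)
    return np.where(r4 < 0.5, sine, cosine)

def optimize(pop_size=8, m=5, T=50, a=2.0):
    X = rng.uniform(-1.0, 1.0, (pop_size, m))
    best = X[0].copy()
    for t in range(T):
        fit = (X ** 2).sum(axis=1)            # toy objective (minimize)
        gen_best = X[np.argmin(fit)]
        if (gen_best ** 2).sum() < (best ** 2).sum():
            best = gen_best.copy()            # track global best
        r1 = a - t * a / T                    # linear decay of r1
        X = sca_update(X, best, r1)
    return best

best = optimize()
```

The decaying r1 shifts the population from exploration (large moves early on) to exploitation (small moves around the destination) as iterations progress.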
4.5.1 Refraction learning
IBSCA3 applies RL to the current best solution to improve it. In this section, we describe RL and then show how it can be used in IBSCA3.
The refraction of light is caused by a light ray hitting an interface between two different mediums (e.g., air and water). The ray bends as its velocity changes when it moves toward the boundary between the two mediums. RL is an OBL method based on the principle of light refraction. The one-dimensional spatial refraction-learning process for the global optima X∗ at iteration t is illustrated in Fig. 4 [95, 103].
The opposite of X∗ can be calculated using refraction learning as follows:
\(X^{\prime *} = \frac{\text{LB}+\text{UB}}{2} + \frac{\text{LB}+\text{UB}}{2k\eta } - \frac{X^{*}}{k\eta }\)    (11)
where k = h/h′ is a scaling factor and η is the refraction index, given by:
\(\eta = \frac{{\sin \limits } \theta _{1}}{{\sin \limits } \theta _{2}}\)
where \({\sin \limits } \theta _{1}= ((\text {LB}+\text {UB})/2-X^{*})/h\) and \({\sin \limits } \theta _{2}= (X^{\prime *}-(\text {LB}+\text {UB})/2)/h^{\prime }\).
In the above equations, X∗ represents the incidence point (original candidate solution) while \(X^{\prime *}\) is the refraction point (opposite candidate solution). O denotes the center point of the search interval [LB, UB], h denotes the distance between X∗ and O, and \(h^{\prime }\) denotes the distance between \(X^{\prime *}\) and O.
In general, (11) can handle n decision variables as follows:
\(x^{\prime *}_{j} = \frac{\text{LB}_{j}+\text{UB}_{j}}{2} + \frac{\text{LB}_{j}+\text{UB}_{j}}{2k\eta } - \frac{x^{*}_{j}}{k\eta }\)
where \(x^{*}_{j}\) and \(x^{\prime *}_{j}\) are the jth decision variables of X∗ and \(X^{\prime *}\), respectively, and LBj and UBj are the lower and upper bounds of the jth decision variable, respectively.
In IBSCA3, (11) is applied to the best solution yet discovered (Line 9 in Algorithm 2).
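A hedged sketch of the per-dimension refraction-learning step is shown below. It assumes per-dimension bounds and treats k and η as fixed inputs; with k = η = 1 it reduces to standard opposition-based learning, LB + UB − x:

```python
def refraction_learning(best, lb, ub, k=1.0, eta=1.0):
    """Refraction-based opposite of the best solution, per decision variable.

    best: best continuous solution found so far (list of floats)
    lb, ub: per-dimension lower and upper bounds
    k: scaling factor h/h'; eta: refraction index sin(theta1)/sin(theta2)
    """
    return [(l + u) / 2.0 + (l + u) / (2.0 * k * eta) - x / (k * eta)
            for x, l, u in zip(best, lb, ub)]
```

Choosing k and η different from 1 moves the opposite point away from the exact mirror image, which lets the search probe the neighborhood of the opposite region rather than a single fixed point.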
4.5.2 Variable neighborhood search with Laplace distribution
Two versions of IBSCA (IBSCA2 and IBSCA3) employ a combination of the Laplace distribution and VNS method. In this section, we first explain the Laplace distribution and VNS method and then show how they are applied in these algorithms.
Variable Neighborhood Search (VNS) is a powerful metaheuristic for solving combinatorial optimization problems. The primary goal of VNS is to enhance a candidate solution by performing a series of operations (e.g., mutation) that generate nearby solutions; moving to a nearby solution may allow the search to escape a local optimum. The optimization process of VNS is iterative and moves between adjacent solutions in an attempt to identify a better candidate [97, 104].
The Laplace distribution is suitable for stochastic modeling because it is stable under geometric, rather than ordinary, summation [105, 106]. The Laplace distribution’s density function is given by:
\(f(x) = \frac{1}{2b}\exp \left (-\frac{|x-a|}{b}\right )\)
where \(-\infty <x<\infty \). The cumulative distribution function of the Laplace distribution is then given by:
\(F(x) = \frac{1}{2} + \frac{1}{2}\,\text{sgn}(x-a)\left (1-\exp \left (-\frac{|x-a|}{b}\right )\right )\)
where a ∈ R is the location parameter and b > 0 is the scale parameter.
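For illustration, a Laplace(a, b) random number can be drawn by inverting the distribution's CDF. This is a minimal stdlib sketch, not necessarily how the authors generate their random numbers:

```python
import math
import random

def laplace_sample(a=0.0, b=1.0):
    """Draw one Laplace(a, b) sample by inverse-CDF sampling."""
    u = random.random() - 0.5  # uniform on [-0.5, 0.5)
    # Inverse CDF: x = a - b * sgn(u) * ln(1 - 2|u|)
    return a - b * math.copysign(1.0, u) * math.log(1.0 - 2.0 * abs(u))
```

The heavier-than-Gaussian tails of the Laplace distribution occasionally produce values far from the location parameter, which is what drives the selection of more disruptive mutation operators.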
IBSCA2 and IBSCA3 employ a combination of the Laplace distribution and VNS method (lines 11 to 14 in Algorithm 2). In detail, these algorithms randomly pick a candidate solution \({x_{i}^{t}}\) at iteration t from the current population of solutions. They then generate a random number r ∈ [0,1] using the Laplace distribution. r is then used as a probability to select one of four operations on the selected candidate solution (swap, insert, inverse, or random mutation), as follows:
The swap operator randomly selects two decision variables in the candidate solution (say xi and xj) and then exchanges the values of xi and xj, as illustrated in Fig. 5.
The insert operator randomly selects two decision variables (say xi and xj) in the candidate solution, shifts the values from xi+ 1 to xj− 1 down one position, and inserts the value of xi at position xj− 1, as illustrated in Fig. 6.
The inverse operator, shown in Fig. 7, randomly selects two decision variables (xi and xj) in the candidate solution and then inverses the order of values from xi to xj.
The random operator, shown in Fig. 8, randomly selects a number of decision variables (say p) in the candidate solution and then flips the binary value of each selected decision variable.
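The four operators can be sketched as follows (a hypothetical stdlib implementation; positions i and j are assumed to be valid indices with i < j):

```python
import random

def swap(sol, i, j):
    """Exchange the values at positions i and j."""
    s = sol[:]
    s[i], s[j] = s[j], s[i]
    return s

def insert(sol, i, j):
    """Remove the value at position i and re-insert it at position j."""
    s = sol[:]
    s.insert(j, s.pop(i))
    return s

def inverse(sol, i, j):
    """Reverse the order of values between positions i and j (inclusive)."""
    s = sol[:]
    s[i:j + 1] = s[i:j + 1][::-1]
    return s

def random_flip(sol, p):
    """Flip the binary value of p randomly chosen positions."""
    s = sol[:]
    for idx in random.sample(range(len(s)), p):
        s[idx] = 1 - s[idx]
    return s
```

Each operator returns a new solution rather than mutating its input, so a rejected move leaves the original candidate intact.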
4.6 Computational complexity of IBSCA
The purpose of this section is to show the detailed computational complexity of IBSCA. We assume that the cost of any basic vector operation is O(1), and we denote the maximum number of iterations MaxItr by M.
The computational complexity of IBSCA (Algorithm 2) can be calculated as follows:
- In Line 1(a), the generation of n/2 candidate solutions using a random generation function requires O(n/2) operations.
- In Line 1(b), the generation of n/2 opposite candidate solutions using OBL (7) requires O(n/2) operations.
- Line 2 requires O(1) operations.
- The internal operations inside the while loop (lines 3 to 15) are as follows:
  - Evaluating the fitness of the candidate solutions requires O(n) operations (Line 4).
  - Updating the best candidate solution so far (P = X∗) requires O(n) operations (Line 5).
  - Generating four random numbers requires O(1) operations (Line 6).
  - Updating the candidate solutions using (1) requires O(n) operations (Line 7).
  - Applying the two-step transfer function (Section 4.3) to the updated candidate solutions requires O(n) operations (Line 8).
  - Applying RL to the best solution X∗ requires O(1) operations (Line 9).
  - Applying the two-step transfer function (Section 4.3) to the solution updated using RL requires O(1) operations (Line 10).
  - Selecting a random solution from the current population (say \({X_{i}^{t}}\)) requires O(1) operations (Line 11).
  - Generating a random number r ∈ [0,1] based on the Laplace distribution requires O(1) operations (Line 12).
  - Selecting one of four moves based on the value of r requires O(1) operations (Line 13).
  - Line 14 requires O(1) operations.
- Overall, the cost of the operations in the while loop (lines 3 to 15) is O(M(n + n + 1 + n + n + 1 + 1 + 1 + 1 + 1 + 1)), where M is the maximum number of iterations. This reduces to O(M·n).
- The total number of operations in IBSCA (lines 1 to 16) is O(n/2 + n/2 + 1 + M·n + 1), which reduces to O(M·n) because the M·n term dominates.

In summary, the computational complexity of IBSCA is O(M·n).
5 Experiments
In this section, we first demonstrate the performance of the three variations of IBSCA when solving the FS problem. The detailed characteristics of the used datasets are presented in Section 5.1. Section 5.2 provides a comparison of the convergence behavior of the original Binary Sine Cosine Algorithm (BSCA) [107] to the convergence behaviors of the three variations of IBSCA over the UCI datasets. Section 5.3 shows the performance of IBSCA3 in comparison to other well known FS algorithms.
Table 3 illustrates the parameter settings of our proposed approach. The parameter values of all of the algorithms were finely tuned based on several experiments; thus, the algorithms in this section were compared to each other using their best parameter settings. Since metaheuristic optimization algorithms are stochastic in nature, we executed each algorithm for 30 independent runs. We executed our experiments on a Windows 7 computer with an Intel Core i7-3517U CPU @ 1.90GHz 2.40GHz and 8.0 GB memory.
5.1 Datasets properties
The performance of IBSCA was evaluated using nineteen datasets (18 from the UCI repository [108] and a real-world COVID-19 dataset). Table 4 provides a description of these datasets in terms of their dimensions, number of instances, and number of classes. All datasets were split randomly into 80% training instances and 20% testing instances [38], and the k-nearest neighbors (KNN) classifier is used. KNN is a supervised machine learning method for solving classification and regression problems [102].
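The evaluation protocol can be sketched as follows. This is a minimal stdlib illustration of the 80/20 split together with a 1-nearest-neighbor prediction; the paper's actual value of k for KNN is given in its parameter settings:

```python
import random

def split_80_20(data, labels, seed=0):
    """Randomly split a dataset into 80% training and 20% testing instances."""
    idx = list(range(len(data)))
    random.Random(seed).shuffle(idx)
    cut = int(0.8 * len(idx))
    train, test = idx[:cut], idx[cut:]
    return ([data[i] for i in train], [labels[i] for i in train],
            [data[i] for i in test], [labels[i] for i in test])

def nn_predict(train_x, train_y, query):
    """1-nearest-neighbor prediction using squared Euclidean distance."""
    dists = [sum((a - b) ** 2 for a, b in zip(x, query)) for x in train_x]
    return train_y[dists.index(min(dists))]
```

In the wrapper setting, only the feature columns selected by a candidate solution are passed to the classifier, and the resulting error rate feeds the fitness function of Section 4.4.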
5.2 Convergence behavior of BSCA vs three variations of IBSCA
Figures 9, 10 and 11 show the convergence behavior of BSCA, IBSCA1, IBSCA2 and IBSCA3 over the UCI datasets. In each chart of these figures, the x-axis represents the iteration number, and the y-axis represents the fitness value. The convergence charts show that IBSCA3 converges faster to good solutions than all of the other algorithms for all of the datasets. The superiority of IBSCA3 is mainly because it uses three exploration techniques. First, it uses OBL when initializing the population to improve quality and diversity. Second, it integrates the VNS and Laplace distribution to explore the search space using multiple mutation methods. Third, it uses RL to search the neighborhood of best candidate solutions for better solutions.
The second best performing algorithm was IBSCA2, which uses two of the three exploration techniques employed by IBSCA3. IBSCA1, which uses only one exploration technique, was the third best performing algorithm. BSCA exhibits the worst convergence behavior, which may be because, unlike the other algorithms, it does not use any additional exploration techniques.
5.3 Performance analysis of IBSCA3 compared to baseline algorithms
In this section, we present a comparison between IBSCA3 and other binary versions of the baseline algorithms: BSCA, Random based Binary Dragonfly Algorithm (RBDA) [102], Linear based Binary Dragonfly Algorithm (LBDA) [102], Quadratic based Binary Dragonfly Algorithm (QBDA) [102], Sinusoidal based Binary Dragonfly Algorithm (SBDA) [102], Binary Gray Wolf Optimizer (BGWO) [109], Binary Gravitational Search Algorithm (BGSA) [109], and Binary Bat Algorithm (BBA) [109]. These algorithms were compared according to their classification accuracy, number of selected features, and their best fitness values. We also compared IBSCA to Coronavirus Herd Immunity Optimizer (CHIO) [110] and Coronavirus Herd Immunity Optimizer-Greedy Crossover (CHIO-GC) [110]. Table 5 shows the parameter settings of these algorithms, as in [102, 110].
Table 6 shows the average value and standard deviation of the results obtained by the proposed IBSCA3 algorithm, and the other compared algorithms, in terms of average classification accuracy. IBSCA3 outperforms the other algorithms and obtains the best classification accuracy on all the UCI and COVID-19 datasets.
Table 7 presents the average number of selected features for the tested algorithms. IBSCA3 outperforms the other tested algorithms on 14 out of 18 datasets. This is better than the second best algorithm (SBDA), which outperforms the remaining algorithms on 11 out of 18 datasets.
Table 8 illustrates the best fitness values obtained by the tested algorithms. We can observe that IBSCA3 shows superior performance over the other algorithms. It obtains the best fitness values on all datasets.
In summary, the enhanced versions of the Binary Sine Cosine Algorithm outperformed the other algorithms on all of the tested datasets, with IBSCA3 providing the highest classification accuracy and the lowest fitness values for all datasets across different dimensions, and the lowest average number of selected features in most cases. The overall results indicate that IBSCA3 converges faster than the other algorithms to the most accurate solutions with the fewest features.
The original SCA employs a random update method to update the solutions in the algorithm. This negatively affects the ability of SCA to balance between the exploration and exploitation of the search space. In contrast, IBSCA3 improves exploration and exploitation in the original SCA by employing several techniques. First, it employs an OBL approach to improve the diversity of initial population. Second, it integrates the VNS and Laplace distribution to explore the search space using multiple mutation methods. Third, it uses RL to search the neighborhood of the best candidate solutions for better ones. The overall results indicate that IBSCA3 improves the performance and convergence behavior of the original SCA in solving the FS problem.
5.4 Performance analysis of IBSCA3 compared to state-of-the-art algorithms that adopt OBL-enhanced methods, VNS and laplace distribution
In this section, we demonstrate a comparison between IBSCA3 and other new algorithms that incorporate OBL into their basic structure. These algorithms are: Improved Salp Swarm Algorithm based on opposition based learning and novel local search algorithm for feature selection (ISSA) [111], Improved Harris Hawks Optimization using elite opposition-based learning and novel search mechanism for feature selection (IHHO) [112], and New feature selection methods based on opposition-based learning and self-adaptive cohort intelligence for predicting patient no-shows (OSACI) [113]. We also compare IBSCA3 with other new algorithms that employ similar methods (VNS and Laplace distribution): A variable neighborhood search algorithm for human resource selection and optimization problem in the home appliance manufacturing industry (VNS-HRS) [114], Improving feature selection performance for classification of gene expression data using Harris Hawks Optimizer with variable neighborhood learning (VNLHHO) [115], Improved equilibrium optimization algorithm using elite opposition-based learning and new local search strategy for feature selection in medical datasets (IEOA) [116], Dynamic salp swarm algorithm for feature selection (DSSA) [117], Semi-supervised feature selection with minimal redundancy based on local adaptive (SFS-LARLRM) [118] and Binary optimization using hybrid grey wolf optimization for feature selection (BGWOPSO) [119]. Table 9 shows the parameter settings of these algorithms, as in [111,112,113,114,115,116,117,118,119].
Table 10 shows a comparison of the average classification accuracy achieved by the proposed IBSCA3 algorithm, BSCA, and the other algorithms that incorporate OBL, VNS, and the Laplace distribution. In Table 10, we report the average value and standard deviation of the results. IBSCA3 delivers the best classification accuracy on all the UCI and COVID-19 datasets except one, where it is second best.
5.5 Performance analysis of IBSCA3 compared to state-of-the-art SCA algorithms
A comparison of IBSCA3 with other SCA variants is presented in this section. These variants include: An efficient hybrid sine-cosine Harris Hawks Optimization for low and high-dimensional feature selection (SCHHO) [73], A novel feature selection method for data mining tasks using hybrid Sine Cosine Algorithm and Genetic Algorithm (SCAGA) [75], A Hybrid Feature Selection Framework Using Improved Sine Cosine Algorithm with Metaheuristic Techniques (MetaSCA) [120], A novel hybrid BPSO–SCA approach for feature selection (BPSO–SCA) [78], Boosting Salp Swarm Algorithm by Sine Cosine Algorithm and Disrupt Operator for Feature Selection (ISSAFD) [72], and An improved sine cosine algorithm to select features for text categorization (ISCA) [121]. Table 11 shows the parameter settings of these algorithms, as in [72, 73, 75, 78, 120, 121].
Table 12 displays the average classification accuracy of the proposed IBSCA3 algorithm, BSCA, and the other state-of-the-art SCA algorithms. In Table 12, we report the average value and standard deviation of the results. IBSCA3 consistently outperforms the other algorithms on the UCI and COVID-19 datasets, achieving the best overall classification accuracy.
5.6 Performance analysis of IBSCA3 compared to other new nature-inspired metaheuristic algorithms
This section shows a comparison between IBSCA3 and other new nature-inspired metaheuristic algorithms, including: A novel Binary Farmland Fertility Algorithm (BFFAG) [122], African vultures optimization algorithm (AVOA) [123] and Artificial gorilla troops optimizer (GTO) [124]. Table 13 shows the parameter settings of these algorithms, as in [122,123,124].
A comparison of the average classification accuracy achieved by the proposed IBSCA3 algorithm, BSCA and the other new nature-inspired metaheuristic algorithms is shown in Table 14, where we report the average value and standard deviation of the results. In all datasets from UCI and COVID-19, IBSCA3 delivers the best classification accuracy.
Consequently, the overall results summarized in all different sets of experiments indicate the strength of the IBSCA3 algorithm in improving the performance and convergence behavior of the original SCA when solving the FS problem.
5.7 Runtime performance comparison of IBSCA3 to existing algorithms
Tables 15, 16, 17 and 18 provide the running time comparison of IBSCA3, BSCA, and the other algorithms described in Tables 6, 10, 12, and 14, respectively. The results are given in milliseconds and represent an average of 30 independent runs; for each algorithm, the reported value is the run time needed to obtain the results after 100 iterations. As shown in the tables, IBSCA3 is faster than the other algorithms on all datasets.
The experiments were conducted using an Intel Core i7-3517U, 1.90 GHz CPU with 16 GB RAM running 64-bit Windows. All the algorithms were implemented using Python programming language.
5.8 Statistical test results
An investigation of the significance of the results in Tables 6, 10, 12, and 14 has been conducted. We applied both Friedman’s test and Wilcoxon’s test [125] to the classification accuracy in the tables with α = 0.05. Tables 19, 20, 21 and 22 present the results of the Friedman’s test. The best ranks in each row are highlighted in bold. The average ranks of the algorithms were as follows (best to worst): In Table 19: IBSCA3, SBDA, LBDA, RBDA, QBDA, BSCA, BGWO, BGSA, and BBA. In Table 20: IBSCA3, VNS-HRS, IHHO, BGWOPSO, OSACI, ISSA, VNLHHO, SFS-LARLRM, IEOA, and DSSA. In Table 21: IBSCA3, ISSAFD, SCHHO, BPSO-SCA, SCAGA, MetaSCA, and ISCA. In Table 22: IBSCA3, GTO, AVOA, and BFFAG.
It is clear from the results that IBSCA3 achieves the best rank over 12 datasets, and competitive results for the other datasets. Therefore, IBSCA3 is the best in terms of the average of ranks among the other compared algorithms.
We also conducted the Wilcoxon’s test with α = 0.05 as summarized in Tables 23, 24, 25 and 26 to evaluate the data in Tables 6, 10, 12, and 14, respectively. Our purpose here is to evaluate the significance of the values of the classification accuracy of IBSCA3 compared to the other algorithms in the tables. The reported p-values indicate that the values of the classification accuracy of IBSCA3 are statistically significant compared to the values of the other algorithms.
In addition, we used Mann-Whitney U test to compare IBSCA3 against all other algorithms. Based on the results, IBSCA3 produces significant results compared to the other algorithms except for IHHO (0.28014), VNS-HRS (0.35758), BGWOPSO (0.0536), AVOA (0.39532), and GTO (0.65272).
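For reference, the Mann-Whitney U statistic underlying these comparisons counts, over all cross-sample pairs, how often one algorithm's accuracy exceeds the other's. The following is a stdlib sketch; a library routine such as scipy.stats.mannwhitneyu would additionally return the p-value:

```python
def mann_whitney_u(sample_a, sample_b):
    """Mann-Whitney U statistic of sample_a against sample_b.

    Counts the pairs where a value from sample_a exceeds one from
    sample_b, with ties contributing 0.5. The p-value is then read off
    the U distribution (or a normal approximation for large samples).
    """
    u = 0.0
    for a in sample_a:
        for b in sample_b:
            if a > b:
                u += 1.0
            elif a == b:
                u += 0.5
    return u
```

A useful sanity check is that the two one-sided statistics always sum to the number of pairs: U_A + U_B = n_A · n_B.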
Accordingly, the statistical analysis gives evidence that the included modifications of IBSCA3 improve its search strategy, as compared to the original SCA algorithm, and thus achieves the highest accuracy for most of the datasets.
6 Conclusion and future work
This paper introduced three versions of a binary optimization algorithm named Improved Binary Sine Cosine Algorithm with multiple exploration and exploitation approaches (IBSCA) for solving the Feature Selection problem. All versions of IBSCA (IBSCA1, IBSCA2, IBSCA3) employ an opposition-based learning approach in their initialization stage to generate a diverse population of candidate solutions. IBSCA2 and IBSCA3 use a combination of variable neighborhood search and the Laplace distribution to explore the search space using several mutation methods. Further, IBSCA3 improves the best candidate solution using Refraction Learning, a novel opposition learning approach based on the principle of light refraction. All versions of IBSCA use two-step transfer functions to convert continuous decision variables into binary ones.
The three versions of IBSCA were compared with each other using 18 FS datasets from UCI repository and one COVID-19 dataset. These datasets are suitable for comparison because the numbers of features, objects and classes in these datasets vary significantly. IBSCA3 was found to be the most efficient version of IBSCA. Furthermore, the performance of IBSCA3 was evaluated and compared to several popular binary algorithms (RBDA, LBDA, QBDA, SBDA, BGWO, BGSA, BBA, CHIO, CHIO-GC, ISSA, IHHO, OSACI, VNS-HRS, VNLHHO, IEOA, DSSA, SFS-LARLRM, BGWOPSO, SCHHO, SCAGA, MetaSCA, BPSO–SCA, ISSAFD, ISCA, BFFAG, AVOA, GTO) using the 18 FS datasets from UCI repository and a COVID-19 dataset. The overall simulation results indicate that IBSCA3 outperformed all comparative algorithms in terms of accuracy and number of features selected over most datasets.
It is worth mentioning that the performance of IBSCA is affected by the limitations of its methods. First, OBL and RL tend to generate good solutions at the beginning of the optimization process, but the generated solutions may converge to sub-optimal regions as the optimization process progresses [98]. Moreover, every optimization problem requires a special OBL strategy suited to its structure; in other words, there are no clear guidelines for designing OBL strategies for different optimization problems [126, 127]. Second, if the VNS method is applied too frequently, the population of solutions may be spread over a larger area than necessary [128].
In the future, we are interested in conducting two research studies based on IBSCA3. We are going to apply IBSCA3 to multi-agent cooperative reinforcement learning [129, 130] based on the models described in [131, 132]. We also plan to incorporate the island model [96, 133,134,135,136,137] into IBSCA3 to further improve its performance on the FS problem. Applying the proposed methods to other FS applications can also be addressed in future work.
References
Bolón-Canedo V, Sánchez-maroño N, Alonso-Betanzos A (2015) Recent advances and emerging challenges of feature selection in the context of big data. Knowledge-based systems 86:33–45
Fayyad U, Piatetsky-Shapiro G, Smyth P (1996) From data mining to knowledge discovery in databases. AI magazine 17(3):37
Guyon I, Elisseeff A (2003) An introduction to variable and feature selection. J Mach Learn Res 3(Mar):1157–1182
Hua J, Tembe WD, Dougherty ER (2009) Performance of feature-selection methods in the classification of high-dimension data. Pattern Recognit 42(3):409–424
Gómez-Verdejo V, Verleysen M, Fleury J (2009) Information-theoretic feature selection for functional data classification. Neurocomputing 72(16-18):3580–3589
Al-Abdallah RZ, Jaradat AS, Doush IA, Jaradat YA (2017) A binary classifier based on firefly algorithm. Jordanian J Comput Inf Technol (JJCIT), vol 3(3)
Liu H, Yu L (2005) Toward integrating feature selection algorithms for classification and clustering. IEEE Trans Knowl Data Eng 17(4):491–502
Boutemedjet S, Bouguila N, Ziou D (2008) A hybrid feature extraction selection approach for high-dimensional non-gaussian data clustering. IEEE Trans Pattern Anal Mach Intell 31(8):1429–1443
ElMustafa S, Jaradat A, Doush IA, Mansour N (2017) Community detection using intelligent water drops optimisation algorithm. Int J Reasoning-Based Intell Syst 9(1):52–65
Ke H, Aviyente S (2008) Wavelet feature selection for image classification. IEEE Trans Image Process 17(9):1709–1720
Chen Bo, Chen L, Chen Y (2013) Efficient ant colony optimization for image feature selection. Signal Process 93(6):1566–1576
Sawalha R, Doush IA (2012) Face recognition using harmony search-based selected features. Int J Hybrid Inf Technol 5(2):1–16
AbuNaser A, Doush IA, Mansour N, Alshattnawi S (2015) Underwater image enhancement using particle swarm optimization. J Intell Syst 24(1):99–115
Shang W, Huang H, Zhu H, Lin Y, Qu Y, Wang Z (2007) A novel feature selection algorithm for text categorization. Expert Syst Appl 33(1):1–5
Zheng Z, Wu X, Srihari R (2004) Feature selection for text categorization on imbalanced data. ACM Sigkdd Explorations Newsletter 6(1):80–89
Liu H, Setiono R (1995) Chi2: feature selection and discretization of numeric attributes. In: Proceedings of 7th IEEE international conference on tools with artificial intelligence. IEEE, pp 388–391
Quinlan JR (2014) C4.5: programs for machine learning. Elsevier
Quinlan JR (1986) Induction of decision trees. Machine learning 1(1):81–106
Kandaswamy KK, Pugalenthi G, Hazrati MK, Kalies K-U, Martinetz T (2011) Blprot: prediction of bioluminescent proteins based on support vector machine and relieff feature selection. BMC Bioinformatics 12(1):345
Robnik-Šikonja M, Kononenko I (2003) Theoretical and empirical analysis of relieff and rrelieff. Mach Learn 53(1):23–69
Le TT, Urbanowicz RJ, Moore JH, McKinney BA (2019) Statistical inference relief (stir) feature selection. Bioinformatics 35(8):1358–1365
Huang Z, Yang C, Zhou X, Huang T (2018) A hybrid feature selection method based on binary state transition algorithm and relieff. IEEE J Biomed Health Inf 23(5):1888–1898
Deng Z, Chung F-L, Wang S (2010) Robust relief-feature weighting, margin maximization, and fuzzy optimization. IEEE Trans Fuzzy Syst 18(4):726–744
Kohavi R, John GH (1997) Wrappers for feature subset selection. Artif Intell 97(1-2):273–324
Faris H, Ala’M A-Z, Heidari AA, Aljarah I, Mafarja M, Hassonah MA, Fujita H (2019) An intelligent system for spam detection and identification of the most relevant features based on evolutionary random weight networks. Inf Fusion 48:67–83
Chantar H, Mafarja M, Alsawalqah H, Heidari AA, Aljarah I, Faris H (2019) Feature selection using binary grey wolf optimizer with elite-based crossover for arabic text classification. Neural Comput Appl:1–20
Zelinka I (2015) A survey on evolutionary algorithms dynamics and its complexity–mutual relations, past, present and future. Swarm Evolution Comput 25:2–14
Gharehchopogh FS, Shayanfar H, Gholizadeh H (2020) A comprehensive survey on symbiotic organisms search algorithms. Artif Intell Rev 53(3):2265–2312
Mafarja M, Mirjalili S (2017) Whale optimization approaches for wrapper feature selection. Appl Soft Comput 62:441–453
Soleimanian GF, Gholizadeh H (2019) A comprehensive survey: whale optimization algorithm and its applications. Swarm Evolution Comput 48:1–24
Gharehchopogh FS, Maleki I, Dizaji ZA (2022) Chaotic vortex search algorithm: metaheuristic algorithm for feature selection. Evolution Intell 15(3):1777–1808
Turkoglu B, Kaya E (2020) Training multi-layer perceptron with artificial algae algorithm. Eng Sci Technol Int J 23(6):1342– 1350
Turkoglu B, Uymaz SA, Kaya E (2022) Clustering analysis through artificial algae algorithm. Int J Mach Learn Cybern 13(4):1179–1196
Rahnema N, Gharehchopogh FS (2020) An improved artificial bee colony algorithm based on whale optimization algorithm for data clustering. Multimed Tools Appl 79(43):32169–32194
Tu Q, Chen X, Liu X (2019) Multi-strategy ensemble grey wolf optimizer and its application to feature selection. Appl Soft Comput 76:16–30
Mirjalili S (2015) The ant lion optimizer. Adv Eng Softw 83:80–98
Mafarja M, Mirjalili S (2017) Hybrid whale optimization algorithm with simulated annealing for feature selection. Neurocomputing 260:302–312
Mafarja M, Mirjalili S (2018) Whale optimization approaches for wrapper feature selection. Appl Soft Comput 62:441–453
Mafarja M, Jaber I, Ahmed S, Thaher T (2019) Whale optimisation algorithm for high-dimensional small-instance feature selection. Int J Parallel Emergent Distributed Syst:1– 17
Hussien AG, Hassanien AE, Houssein EH, Bhattacharyya S, Amin M (2019) S-shaped binary whale optimization algorithm for feature selection. In: Bhattacharyya S, Mukherjee A, Bhaumik H, Das S, Yoshida K (eds) Recent trends in signal and image processing. Springer Singapore, pp 79–87, Singapore
Zaman HRR, Gharehchopogh FS (2021) An improved particle swarm optimization with backtracking search optimization algorithm for solving continuous optimization problems. Eng Comput:1–35
Karaboga D, Basturk B (2007) A powerful and efficient algorithm for numerical function optimization: artificial bee colony (abc) algorithm. J Global Optimization 39(3):459–471
Dorigo M, Maniezzo V, Colorni A (1996) Ant system: optimization by a colony of cooperating agents. IEEE Trans Syst Man Cybern Part B (Cybernetics) 26(1):29–41
Turabieh H, Mafarja M, Li X (2018) Iterated feature selection algorithms with layered recurrent neural network for software fault prediction. Expert Syst Appl 122:27–42
Taradeh M, Mafarja M, Heidari AA, Faris H, Aljarah I, Mirjalili S, Fujita H (2019) An evolutionary gravitational search-based feature selection. Inf Sci
Mirjalili S, Gandomi AH, Mirjalili SZ, Saremi S, Faris H, Mirjalili SM (2017) Salp swarm algorithm: a bio-inspired optimizer for engineering design problems. Adv Eng Softw
Faris H, Heidari AA, Ala’M A-Z, Mafarja M, Aljarah I, Eshtay M, Mirjalili S (2020) Time-varying hierarchical chains of salps with random weight networks for feature selection. Expert Syst Appl 140:112898
Neggaz N, Ewees AA, Elaziz MA, Mafarja M (2020) Boosting salp swarm algorithm by sine cosine algorithm and disrupt operator for feature selection. Expert Syst Applications 145:113103
Wolpert DH, Macready WG (1997) No free lunch theorems for optimization. IEEE Trans Evolution Computat 1(1):67–82
Mirjalili S (2016) Sca: a sine cosine algorithm for solving optimization problems. Knowl-Based Syst 96:120–133
Gholizadeh S, Sojoudizadeh R (2019) Modified sine-cosine algorithm for sizing optimization of truss structures with discrete design variables. Iran Univ Sci Technol 9(2):195–212
Tawhid MA, Savsani P (2019) Discrete sine-cosine algorithm (dsca) with local search for solving traveling salesman problem. Arabian J Sci Eng 44(4):3669–3679
Belazzoug M, Touahria M, Nouioua F, Brahimi M (2019) An improved sine cosine algorithm to select features for text categorization. J King Saud Univ-Comput Inf Sci 32(4):454–464
Oliva D, Hinojosa S, Elaziz MA, Ortega-sánchez N (2018) Context based image segmentation using antlion optimization and sine cosine algorithm. Multimed Tools Appl 77(19):25761–25797
Nenavath H, Jatoth RK, Das S (2018) A synergy of the sine-cosine algorithm and particle swarm optimizer for improved global optimization and object tracking. Swarm Evolution Computat 43:1–30
Reddy KS, Kumar PL, Panigrahi BK, Kumar R (2018) A new binary variant of sine–cosine algorithm: development and application to solve profit-based unit commitment problem. Arabian J Sci Eng 43 (8):4041–4056
Turgut OE (2017) Thermal and economical optimization of a shell and tube evaporator using hybrid backtracking search—sine–cosine algorithm. Arabian J Sci Eng 42(5):2105–2123
Zhang H, Gao Z, Zhang J, Liu J, Nie Z, Zhang J (2020) Hybridizing extended ant lion optimizer with sine cosine algorithm approach for abrupt motion tracking. EURASIP J Image Video Process 2020(1):4
Li S, Fang H, Liu X (2018) Parameter optimization of support vector regression based on sine cosine algorithm. Expert Syst Appl 91:63–77
Gupta S, Deep K (2019) A hybrid self-adaptive sine cosine algorithm with opposition based learning. Expert Syst Appl 119:210–230
Sindhu R, Ngadiran R, Yacob YM, Hanin Zahri NA, Hariharan M (2017) Sine–cosine algorithm for feature selection with elitism strategy and new updating mechanism. Neural Comput Appl 28 (10):2947–2958
Long W, Wu T, Liang X, Xu S (2019) Solving high-dimensional global optimization problems using an improved sine cosine algorithm. Expert Syst Appl 123:108–126
Chen H, Heidari AA, Zhao X, Zhang L, Chen H (2020) Advanced orthogonal learning-driven multi-swarm sine cosine optimization: framework and case studies. Expert Syst Appl 144:113113
Chen H, Jiao S, Heidari AA, Wang M, Chen X, Zhao X (2019) An opposition-based sine cosine approach with local search for parameter estimation of photovoltaic models. Energy Conversion Manag 195:927–942
Liu S, Feng Z-K, Niu W-J, Zhang H-R, Song Z-G (2019) Peak operation problem solving for hydropower reservoirs by elite-guide sine cosine algorithm with gaussian local search and random mutation. Energies 12(11):2189
Nenavath H, Jatoth RK (2018) Hybridizing sine cosine algorithm with differential evolution for global optimization and object tracking. Appl Soft Comput 62:1019–1043
Chegini SN, Bagheri A, Najafi F (2018) PSOSCALF: a new hybrid PSO based on sine cosine algorithm and Lévy flight for solving optimization problems. Appl Soft Comput 73:697–726
Gupta S, Deep K (2019) Improved sine cosine algorithm with crossover scheme for global optimization. Knowl-Based Syst 165:374–406
Al-Betar MA, Awadallah MA, Abu R, Assaleh K (2022) Economic load dispatch using memetic sine cosine algorithm. J Ambient Intell Humanized Comput:1–29
Al-Betar MA, Aljarah I, Awadallah MA, Faris H, Mirjalili S (2019) Adaptive β-hill climbing for optimization. Soft Comput 23(24):13489–13512
Hafez AI, Zawbaa HM, Emary E, Hassanien AE (2016) Sine cosine optimization algorithm for feature selection. In: 2016 International symposium on innovations in intelligent systems and applications (INISTA). IEEE, pp 1–5
Eid MM, El-kenawy E-SM, Ibrahim A (2021) A binary sine cosine-modified whale optimization algorithm for feature selection. In: 2021 National computing colleges conference (NCCC). IEEE, pp 1–6
Hussain K, Neggaz N, Zhu W, Houssein EH (2021) An efficient hybrid sine-cosine Harris hawks optimization for low and high-dimensional feature selection. Expert Syst Appl 176:114778
Elaziz MEA, Ewees AA, Oliva D, Duan P, Xiong S (2017) A hybrid method of sine cosine algorithm and differential evolution for feature selection. In: International conference on neural information processing. Springer, pp 145–155
Abualigah L, Dulaimi AJ (2021) A novel feature selection method for data mining tasks using hybrid sine cosine algorithm and genetic algorithm. Cluster Comput:1–16
Elaziz MA, Oliva D, Xiong S (2017) An improved opposition-based sine cosine algorithm for global optimization. Expert Syst Appl 90:484–500
Sindhu R, Ngadiran R, Yacob YM, Zahri NAH, Hariharan M, Polat K (2019) A hybrid SCA-inspired BBO for feature selection problems. Math Probl Eng 2019
Kumar L, Bharti KK (2021) A novel hybrid BPSO–SCA approach for feature selection. Nat Comput 20(1):39–61
El-Kenawy E-SM, Eid MM, Saber M, Ibrahim A (2020) MBGWO-SFS: modified binary grey wolf optimizer based on stochastic fractal search for feature selection, vol 8
Mirjalili S, Lewis A (2016) The whale optimization algorithm. Adv Eng Softw 95:51–67
Elaziz MA, Nabil N, Ewees AA, Lu S (2019) Automatic data clustering based on hybrid atom search optimization and sine-cosine algorithm. In: 2019 IEEE congress on evolutionary computation (CEC). IEEE, pp 2315–2322
Uymaz SA, Tezel G, Yel E (2015) Artificial algae algorithm (AAA) for nonlinear global optimization. Appl Soft Comput 31:153–171
Turkoglu B, Uymaz SA, Kaya E (2022) Binary artificial algae algorithm for feature selection. Appl Soft Comput 120:108630
MiarNaeimi F, Azizyan G, Rashki M (2021) Horse herd optimization algorithm: a nature-inspired algorithm for high-dimensional optimization problems. Knowl-Based Syst 213:106711
Awadallah MA, Hammouri AI, Al-Betar MA, Braik MS, Elaziz MA (2022) Binary horse herd optimization algorithm with crossover operators for feature selection. Comput Bio Med 141:105152
Hayyolalam V, Kazem AAP (2020) Black widow optimization algorithm: a novel meta-heuristic approach for solving engineering optimization problems. Eng Appl Artif Intell 87:103249
Hu G, Du B, Wang X, Wei G (2022) An enhanced black widow optimization algorithm for feature selection. Knowl-Based Syst 235:107638
Khishe M, Mosavi MR (2020) Chimp optimization algorithm. Expert Syst Appl 149:113338
Pashaei E, Pashaei E (2022) An efficient binary chimp optimization algorithm for feature selection in biomedical data classification. Neural Comput Appl 34(8):6427–6451
Yang Y, Chen H, Heidari AA, Gandomi AH (2021) Hunger games search: visions, conception, implementation, deep analysis, perspectives, and towards performance shifts. Expert Syst Appl 177:114864
Devi RM, Premkumar M, Jangir P, Kumar BS, Alrowaili D, Nisar KS (2022) BHGSO: binary hunger games search optimization algorithm for feature selection problem. CMC-Comput Mater Continua 70(1):557–579
Mirjalili S, Lewis A (2013) S-shaped versus V-shaped transfer functions for binary particle swarm optimization. Swarm Evol Comput 9:1–14
Mirjalili S (2016) Dragonfly algorithm: a new meta-heuristic optimization technique for solving single-objective, discrete, and multi-objective problems. Neural Comput Appl 27(4):1053–1073
Emary E, Zawbaa HM, Hassanien AE (2016) Binary ant lion approaches for feature selection. Neurocomputing 213:54–65
Abed-alguni BH, Alawad NA, Barhoush M, Hammad R (2021) Exploratory cuckoo search for solving single-objective optimization problems. Soft Comput:1–14
Alawad NA, Abed-alguni BH (2020) Discrete island-based cuckoo search with highly disruptive polynomial mutation and opposition-based learning strategy for scheduling of workflow applications in cloud environments. Arabian J Sci Eng:1–21
Alkhateeb F, Abed-alguni BH, Al-rousan MH (2021) Discrete hybrid cuckoo search and simulated annealing algorithm for solving the job shop scheduling problem. J Supercomput:1–28
Tizhoosh HR (2005) Opposition-based learning: a new scheme for machine intelligence. In: International conference on computational intelligence for modelling, control and automation and international conference on intelligent agents, web technologies and internet commerce (CIMCA-IAWTIC’06). IEEE, vol 1, pp 695–701
Shishavan ST, Gharehchopogh FS (2022) An improved cuckoo search optimization algorithm with genetic algorithm for community detection in complex networks. Multimed Tools Appl:1–27
Luo J, Liu Z (2020) Novel grey wolf optimization based on modified differential evolution for numerical function optimization. Appl Intell 50(2):468–486
Tubishat M, Abushariah MA, Idris N, Aljarah I (2019) Improved whale optimization algorithm for feature selection in Arabic sentiment analysis. Appl Intell 49(5):1688–1707
Hammouri AI, Mafarja M, Al-Betar MA, Awadallah MA, Abu-Doush I (2020) An improved dragonfly algorithm for feature selection. Knowl-Based Syst 203:106131
Abed-alguni BH, Paul D, Hammad R (2022) Improved salp swarm algorithm for solving single-objective continuous optimization problem. Appl Intell:1–20
Alkhateeb F, Abed-Alguni BH (2019) A hybrid cuckoo search and simulated annealing algorithm. J Intell Syst 28(4):683–698
Deep K, Thakur M (2007) A new crossover operator for real coded genetic algorithms. Appl Math Comput 188(1):895–911
Boudt K, Galanos A, Payseur S, Zivot E (2019) Multivariate GARCH models for large-scale applications: a survey. In: Handbook of statistics. Elsevier, vol 41, pp 193–242
Taghian S, Nadimi-Shahraki MH (2019) Binary sine cosine algorithms for feature selection from medical data. Adv Comput Int J 235:1–10
Lichman M et al (2013) UCI machine learning repository. http://archive.ics.uci.edu/ml. Accessed 14 April 2022
Mafarja M, Aljarah I, Heidari AA, Hammouri AI, Faris H, Ala’M A-Z, Mirjalili S (2018) Evolutionary population dynamics and grasshopper optimization approaches for feature selection problems. Knowl-Based Syst 145:25–45
Alweshah M, Alkhalaileh S, Al-Betar MA, Bakar AA (2022) Coronavirus herd immunity optimizer with greedy crossover for feature selection in medical diagnosis. Knowl-Based Syst 235:107629
Tubishat M, Idris N, Shuib L, Abushariah MA, Mirjalili S (2020) Improved salp swarm algorithm based on opposition based learning and novel local search algorithm for feature selection. Expert Syst Appl 145:113122
Sihwail R, Omar K, Ariffin KAZ, Tubishat M (2020) Improved Harris hawks optimization using elite opposition-based learning and novel search mechanism for feature selection. IEEE Access 8:121127–121145
Aladeemy M, Adwan L, Booth A, Khasawneh MT, Poranki S (2020) New feature selection methods based on opposition-based learning and self-adaptive cohort intelligence for predicting patient no-shows. Appl Soft Comput 86:105866
Ji X, Liao B, Yang S (2022) A variable neighborhood search algorithm for human resource selection and optimization problem in the home appliance manufacturing industry. J Combinatorial Optimization 44(1):223–241
Qu C, Zhang L, Li J, Deng F, Tang Y, Zeng X, Peng X (2021) Improving feature selection performance for classification of gene expression data using Harris hawks optimizer with variable neighborhood learning. Brief Bioinform 22(5):bbab097
Elgamal ZM, Yasin NM, Sabri AQM, Sihwail R, Tubishat M, Jarrah H (2021) Improved equilibrium optimization algorithm using elite opposition-based learning and new local search strategy for feature selection in medical datasets. Computation 9(6):68
Tubishat M, Ja’afar S, Alswaitti M, Mirjalili S, Idris N, Ismail MA, Omar MS (2021) Dynamic salp swarm algorithm for feature selection. Expert Syst Appl 164:113873
Wu X, Chen H, Li T, Wan J (2021) Semi-supervised feature selection with minimal redundancy based on local adaptive. Appl Intell 51(11):8542–8563
Al-Tashi Q, Abdul Kadir SJ, Rais HM, Mirjalili S, Alhussian H (2019) Binary optimization using hybrid grey wolf optimization for feature selection. IEEE Access 7:39496–39508
Sun L, Qin H, Przystupa K, Cui Y, Kochan O, Skowron M, Su J (2022) A hybrid feature selection framework using improved sine cosine algorithm with metaheuristic techniques. Energies 15(10):3485
Belazzoug M, Touahria M, Nouioua F, Brahimi M (2020) An improved sine cosine algorithm to select features for text categorization. J King Saud Univ-Comput Inf Sci 32(4):454–464
Hosseinalipour A, Gharehchopogh FS, Masdari M, Khademi A (2021) A novel binary farmland fertility algorithm for feature selection in analysis of the text psychology. Appl Intell 51(7):4824–4859
Abdollahzadeh B, Gharehchopogh FS, Mirjalili S (2021) African vultures optimization algorithm: a new nature-inspired metaheuristic algorithm for global optimization problems. Comput Industr Eng 158:107408
Abdollahzadeh B, Gharehchopogh FS, Mirjalili S (2021) Artificial gorilla troops optimizer: a new nature-inspired metaheuristic algorithm for global optimization problems. Int J Intell Syst 36(10):5887–5958
Banerjee T, Sinha S, Choudhury P (2022) Long term and short term forecasting of horticultural produce based on the LSTM network model. Appl Intell 52(8):9117–9147
Li J, Gao Y, Wang K, Sun Y (2021) A dual opposition-based learning for differential evolution with protective mechanism for engineering optimization problems. Appl Soft Comput 113:107942
Goldanloo MJ, Gharehchopogh FS (2022) A hybrid OBL-based firefly algorithm with symbiotic organisms search algorithm for solving continuous optimization problems. J Supercomput 78(3):3998–4031
Wu G-H, Cheng C-Y, Pourhejazy P, Fang B-L (2022) Variable neighborhood-based cuckoo search for production routing with time window and setup times. Appl Soft Comput 125:109191
Abed-alguni BH, Ottom MA (2018) Double delayed Q-learning. Int J Artif Intell 16(2):41–59
Abed-Alguni BH, Paul DJ, Chalup SK, Henskens FA (2016) A comparison study of cooperative Q-learning algorithms for independent learners. Int J Artif Intell 14(1):71–93
Abed-alguni BH (2018) Action-selection method for reinforcement learning based on cuckoo search algorithm. Arabian J Sci Eng 43(12):6771–6785
Abed-Alguni BH (2017) Bat Q-learning algorithm. Jordanian J Comput Inf Technol (JJCIT) 3(1):56–77
Abed-alguni BH, Paul D (2022) Island-based cuckoo search with elite opposition-based learning and multiple mutation methods for solving optimization problems. Soft Comput:1–20
Abed-alguni BH, Alawad NA (2021) Distributed grey wolf optimizer for scheduling of workflow applications in cloud environment. Appl Soft Comput J:1–37
Abed-alguni BH, Barhoush M (2018) Distributed grey wolf optimizer for numerical optimization problems. Jordanian J Comput Inf Technol (JJCIT) 4(3)
Abed-Alguni BH (2019) Island-based cuckoo search with highly disruptive polynomial mutation. Int J Artif Intell 17(1):57–82
Abed-Alguni BH, Klaib AF, Nahar KM (2019) Island-based whale optimization algorithm for continuous optimization problems. Int J Reasoning-Based Intell Syst:1–11
Ethics declarations
Competing interests
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Conflict of Interests
The authors declare that they have no conflict of interest.
Additional information
Human and animal rights
This article does not contain any studies with human participants or animals performed by any of the authors.
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Abed-alguni, B.H., Alawad, N.A., Al-Betar, M.A. et al. Opposition-based sine cosine optimizer utilizing refraction learning and variable neighborhood search for feature selection. Appl Intell 53, 13224–13260 (2023). https://doi.org/10.1007/s10489-022-04201-z