Introduction

Many modern meta-heuristic algorithms have been developed during the last five decades to solve complex, non-deterministic polynomial-time hard (NP-hard) optimization problems. According to certain criteria, these algorithms can be divided into different groups, such as stochastic, deterministic, population-based, and iterative. An algorithm that improves a solution according to probabilistic rules is called stochastic. An algorithm that improves the quality of a set of solutions simultaneously is called population-based, and one that seeks a better solution through multiple iterations is called iterative. The two most important classes of population-based algorithms are swarm intelligence and evolutionary approaches, both of which are built on the simulation of natural phenomena.

The genetic algorithm (GA) is one of the most popular evolutionary algorithms. Its theoretical basis was first established by Holland [1]. GA is a general-purpose optimization approach which relies on one of the most important principles of Darwin’s theory of evolution, as shown in Fig. 1. GA generates better solutions from a population of candidate solutions based on the survival-of-the-fittest principle. A random population of individuals with a suitable encoding, such as binary, permutation, or real-valued, is created first. In nature, the most suitable individuals are likely to survive and mate. GA iteratively generates new chromosomes with the help of two operators, i.e., crossover and mutation. The process is repeated until a stopping criterion, such as convergence, a fixed time, or a maximum number of iterations, is met. The objective is to guide the search towards solutions with increasingly high fitness values and, ultimately, towards the optimal solution. The most attractive feature of GA is its ability to explore the search space with the help of the entire population of individuals [2]. Recently, a Bayesian network was used as an adaptive parameter-setting tool to enhance the performance of GA on multimodal problems [3]. Much work on GAs and their applications has been documented by Goldberg [4].

Fig. 1 Darwin’s evolution principle

A very common issue with GA is premature convergence before the optimal solution of a problem is found. This is strongly linked to the loss of population diversity. If diversity is very low, the GA converges very quickly; if it is very high, the search becomes time-consuming and may waste computational resources. Hence, it is essential to find a trade-off between exploration (i.e., exploring new areas of the search space) and exploitation (i.e., using already detected points to search for the optimum). In general, therefore, the performance of the GA depends strongly on its genetic operators. The first operator is selection, which is used to choose the set of chromosomes for the mating process; the second is crossover, which is used to create new individuals; and the last is mutation, which applies random changes. The balance between exploration and exploitation can be adjusted either through the selection pressure of the selection approach or through the recombination operators by adjusting their probabilities.

The selection scheme is the procedure of choosing a sub-population (a set of individuals) from the current population that will form the next population. GA is one of those algorithms whose performance is highly affected by the choice of selection operator. Without this mechanism, GA is only simple random sampling, giving different results in each generation. Hence, we can say that the selection operator is the backbone of the GA process. Usually, the choice of the selection mechanism depends on the complexity of the problem. An overly hard selection approach combined with a conservative replacement mechanism can leave the algorithm without sufficient exploration capability, which may cause it to become stuck at local optima.

Several selection operators have been used and reviewed in the literature. A study of various selection approaches showed that different schemes perform well on different problems [5]. Thus, the most suitable selection approach has to be chosen for a specific problem to enhance the quality of the final result. Goldberg and Deb [6] carried out a comprehensive study of some traditional selection methods through the solutions of differential equations. Another popular study adjusted the probabilistic noise level throughout the mating pool to regulate the selection pressure [7]. Abd-Rahman et al. [8] established a hybrid roulette-tournament selection operator for solving a real-valued shrimp diet formulation problem, which can also be generalized to other evolutionary-algorithm-related problems. A detailed study of the selection process in GA, which examined some common issues in various selection operators, is given in Ref. [5].

The main objective of this study is to examine the performance of selection operators, which have a major impact on the GA process. To this end, a new selection operator is proposed that is intended to enhance the average quality of the population and to give a better trade-off between exploration and exploitation.

The rest of this article is presented as follows: in “Background” we present the background of selection schemes. The proposed selection operator is presented in “Proposed selection operator” with the statistical properties of a sampling algorithm. The traveling salesman problem (TSP) is discussed and reviewed in “Test problem (traveling salesman problem)”. Performance evaluation of the proposed scheme and conclusions are given in “Performance evaluation” and “Conclusions”, respectively.

Background

The first selection mechanism for GA was fitness proportional selection (FPS), introduced by Holland [1]. It has since become the most prevalent selection approach based on the concept of proportionality: the fitness value of each individual in the population corresponds to the area of its slice of a roulette wheel, and an individual is marked by the roulette wheel pointer after the wheel has been spun. This operator gives each individual a probability \(p_i\) of being selected, Eq. (1), that is directly proportional to its fitness:

$$\begin{aligned} p_i=\frac{f_i}{\sum _{j=1}^{K}f_j}; \quad i\in \{1,2,\ldots ,K\}, \end{aligned}$$
(1)

where K is the size of the population and \(f_i\) is the value of fitness function for the individual i. Thus, individuals who have better fitness values may have a higher chance of being selected as parents.
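As an illustration, a minimal Python sketch of Eq. (1) is given below; the function name and the fitness values are our own, chosen only for illustration, and a maximization problem with non-negative fitness values is assumed.

```python
import numpy as np

def fps_probabilities(fitness):
    """Fitness proportional selection probabilities, Eq. (1):
    p_i = f_i / sum_j f_j (assumes non-negative fitness values
    and a maximization problem)."""
    fitness = np.asarray(fitness, dtype=float)
    return fitness / fitness.sum()

# Hypothetical fitness values for a population of K = 5 individuals
p = fps_probabilities([4.0, 1.0, 2.0, 8.0, 5.0])
print(p)          # [0.2  0.05 0.1  0.4  0.25]
print(p.sum())    # 1.0
```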

FPS has been a widely used selection scheme in various fields, such as spanning tree problems [9], scheduling [10, 11], resource allocation [12], menu planning [13], and the traveling salesman problem [14]. Throughout the selection procedure, there is no change in the segment sizes or selection probabilities. It is easy to implement and gives a high probability to the best individual; these aspects are its main strengths [15]. Another advantage of this approach is that it introduces no bias, although it has an unlimited spread [16]. However, difficulty is encountered when significant differences appear in the fitness values [14, 17, 18]. The scaling problem, which is the major drawback of this scheme, was first pointed out by Grefenstette [19]: as the population evolves, the ratio between the fitness variance and the fitness average becomes increasingly small. The selection pressure therefore drops as the population converges [7]. On the other hand, high selection pressure may lead to premature convergence to a sub-optimal solution.

In the literature, there are some alternative techniques to overcome the above drawbacks. The most popular is the linear rank selection (LRS) scheme proposed by Baker [20]. It sorts the individuals from worst to best according to their fitness and allocates each a survival probability proportional to its rank. After this, a sampling procedure (i.e., roulette wheel sampling) is used to select the individuals for the mating process. In this way, LRS maintains a constant selection pressure throughout the sampling process, because it introduces a uniform scaling across the population. Therefore, a fixed selection probability is always assigned to the best individual, regardless of its fitness value. Another advantage of LRS is that it behaves more robustly than other techniques. The selection probability of an individual under this scheme is assigned according to the following formula:

$$\begin{aligned} {p_i=\frac{1}{K}\left( \eta ^- + (\eta ^+ - \eta ^-)\frac{i-1}{K-1}\right) }; \quad i\in \{1,2,\ldots ,K\}. \end{aligned}$$
(2)

Here, \(\frac{\eta ^-}{K}\) and \(\frac{\eta ^+}{K}\) are the probabilities of the worst and the best chromosomes being selected, respectively. All individuals get a different rank, even if they have the same fitness value. The conditions \(\eta ^+ = 2 - \eta ^-\) and \(\eta ^- \ge 0\) must be fulfilled. The selection pressure can be adjusted by varying the parameter \(\eta ^+\). As remarked by Baker, if \(\eta ^+ = 2\), all individuals would be within 10% of the mean and the population is driven towards convergence in every generation; Baker therefore recommended the value \(\eta ^+ =1.1\) to control the selection pressure. The weakness of this scheme is that it can lead to slower convergence, because there is no significant difference between the best and the other individuals. Moreover, the selection probabilities of two consecutively ranked chromosomes differ by the same amount, regardless of whether the gap between their fitness values is large or small [7].
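As an illustration, Eq. (2) might be evaluated with the following short sketch; the function name is ours, and rank \(i = 1\) is taken to be the worst individual and \(i = K\) the best, as described above.

```python
import numpy as np

def lrs_probabilities(K, eta_plus=1.1):
    """Linear rank selection probabilities, Eq. (2).
    Ranks i = 1..K run from worst to best and
    eta_minus = 2 - eta_plus, as the scheme requires."""
    eta_minus = 2.0 - eta_plus
    i = np.arange(1, K + 1)
    return (eta_minus + (eta_plus - eta_minus) * (i - 1) / (K - 1)) / K

p = lrs_probabilities(K=150, eta_plus=1.1)
print(p[0], p[-1])   # worst = eta_minus/K, best = eta_plus/K
print(p.sum())       # 1.0
```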

Another rank-based selection scheme is exponential ranking selection (ERS). It works similarly to LRS, except that the probabilities are assigned to the individuals non-linearly. A constant ratio r, with \(0< r < 1.0\) (typically \(r \simeq 1.0\)), is used to assign the rank-based values to the individuals. The selection probability of the ith-ranked individual under this scheme is assigned according to the following formula:

$$\begin{aligned} p_i = \frac{r^{K-i}(1 - r)}{1 - r^K}; \quad i\in \{1,2,\ldots ,K\}. \end{aligned}$$
(3)
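For comparison, Eq. (3) can be sketched in the same style (again an illustration of ours, with rank 1 as the worst individual):

```python
import numpy as np

def ers_probabilities(K, r=0.99):
    """Exponential ranking selection probabilities, Eq. (3),
    with 0 < r < 1 and ranks i = 1..K from worst to best."""
    i = np.arange(1, K + 1)
    return r ** (K - i) * (1.0 - r) / (1.0 - r ** K)

p = ers_probabilities(K=150, r=0.99)
print(p.sum())   # 1.0; higher-ranked individuals receive larger p_i
```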

Tournament selection (TS) is also widely used as an alternative to FPS. In TS, t individuals (where t is the predefined tournament size) are first selected at random from the population, and they then compete against each other based on their fitness. The individual with the highest fitness value is declared the winner and is selected for the mating process. The selection pressure can be adjusted by changing the tournament size [7]. The most commonly used tournament size is 2 (binary tournament selection, BTS), which is the simplest form of TS [21]. A larger tournament size can be used to enhance the competition among individuals, but it leads to a loss of population diversity [22, 23]. As shown by Back [24], the probability that individual i is selected in a tournament of size t is given by Eq. (4), where K is the size of the population. If the tournament size equals the population size, then TS approximates a deterministic selection procedure [7]:

$$\begin{aligned} { p_i = \frac{1}{K^t} ((i)^t - (i-1)^t)}; \quad i\in \{1,2,\ldots ,K\}. \end{aligned}$$
(4)
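A sketch of Eq. (4) as we read it (an illustration of ours, not code from Ref. [24]), with ranks ordered from worst to best:

```python
import numpy as np

def ts_probabilities(K, t=2):
    """Rank-based selection probabilities under tournament
    selection, Eq. (4), for tournament size t and ranks
    i = 1..K ordered from worst to best."""
    i = np.arange(1, K + 1)
    return (i ** t - (i - 1) ** t) / float(K ** t)

p = ts_probabilities(K=150, t=2)
print(p.sum())   # 1.0; t = 2 corresponds to binary tournament (BTS)
```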

Another variant of TS is probabilistic two-tournament selection (PTS), presented by Julstrom [25]. Two individuals are chosen at random, without replacement, from the population, and the winner of this tournament is selected for the mating process with a probability q, such that \(0.5< q <1.0\). In this scheme, the loser can also be selected for the mating process, with probability \((1 - q)\). Thus, the selection probability of an individual under this scheme is assigned according to the following rule:

$$\begin{aligned} p_i = \frac{2(i-1)}{K(K-1)}q + \frac{2(K-i)}{K(K-1)}(1-q); \quad i\in \{1,2,\ldots ,K\}. \end{aligned}$$
(5)
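Equation (5) can likewise be sketched as follows (ranks from worst to best, q the winner-selection probability; the implementation is our own illustration):

```python
import numpy as np

def pts_probabilities(K, q=0.8):
    """Probabilistic two-tournament selection probabilities, Eq. (5),
    with 0.5 < q < 1.0 and ranks i = 1..K from worst to best."""
    i = np.arange(1, K + 1)
    return (2.0 * (i - 1) * q + 2.0 * (K - i) * (1.0 - q)) / (K * (K - 1))

p = pts_probabilities(K=150, q=0.8)
print(p.sum())   # 1.0
```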

Moreover, finding a fixed yet suitable setting of the selection pressure throughout the evolution process is a difficult task. Ideally, the selection pressure should be low at the early stage of the search, to give free rein to the exploration of the solution space, and should increase at the final stage to help the algorithm converge [26]. Hence, to trade off these two competing criteria, an adjustable selection pressure is desirable [7]. The main contribution of this article is the development of a selection approach that reduces the weaknesses associated with FPS and LRS in the GA procedure. The proposed approach is based on a ranking scheme that splits the individuals after ranking and then assigns them selection probabilities. This increases the competition among individuals to be selected for the mating process and thereby regulates the selection pressure. The details are given in the next section.

Proposed selection operator

Motivation

The LRS has a small range of selection pressures (i.e., for a population of K individuals, the selection probability of the fittest individual must fall between \(\frac{1}{K}\) and \(\frac{2}{K}\)). LRS leads to a slow convergence speed and sometimes converges to a sub-optimal solution, as less fit individuals may be preserved from one generation to the next. In GA, FPS embodies exploitation, while LRS is oriented towards exploration. Under LRS, the information about the relative fitness of individuals is ignored, all cases are treated uniformly regardless of the magnitude of the fitness differences and, finally, the schema theorem is violated. LRS prevents too quick a convergence and differs from FPS in terms of selection pressure. This discussion suggests that, whenever a selection procedure is used, some kind of adaptation of the selection pressure is highly desirable.

Fig. 2 Comparison among LRS, ERS, BTS, PTS, and SRS for 150 individuals: a \(\eta ^+=1.1\), \(r=0.99\), \(t=2\), \(q=0.8\), and \(\lambda ^+=0.7\); b \(\eta ^+=1.5\), \(r=0.99\), \(t=2\), \(q=0.8\), and \(\lambda ^+=0.7\); c \(\eta ^+=2\), \(r=0.99\), \(t=2\), \(q=0.8\), and \(\lambda ^+=0.7\); d \(\eta ^+=2\), \(r=0.99\), \(t=2\), \(q=0.8\), and \(\lambda ^+=0.75\)

Split rank selection

In this research, we propose an alternative selection scheme, split rank selection (SRS), that maintains a fine balance between exploration and exploitation. This approach not only eliminates the fitness scaling problem, but also provides an adequate selection pressure throughout the selection phase. In this scheme, all individuals are sorted from worst to best according to their fitness values; each individual receives a different rank even if several have the same fitness value. Consider a population of K individuals (the population size, which is usually even). We divide the K individuals into two equal portions.

The top half consists of individuals 1 to \(\frac{K}{2}\). The sum of the ranks in this portion of the series is as follows:

$$\begin{aligned} \frac{K}{4}\left( \frac{K}{2}+1\right) . \end{aligned}$$
(6)

The sum of the last half of the series, i.e., from \(\frac{K}{2}+1\) to K, is

$$\begin{aligned} \frac{K^2}{4} + \frac{K}{4}\left( \frac{K}{2}+1\right) . \end{aligned}$$
(7)

Hence, the probability distribution according to the individual’s rank is

$$\begin{aligned} p(i)={\left\{ \begin{array}{ll} \lambda ^- (\frac{8i}{K(K+2)}); &{} i\le \frac{K}{2}\\ \lambda ^+ (\frac{8i}{K(3K+2)}); &{} i> \frac{K}{2}, \end{array}\right. } \end{aligned}$$
(8)

where \(\lambda ^- + \lambda ^+ = 1\) and \(\lambda ^- \ge 0\) must be satisfied. The selection pressure can be controlled by varying the tuning parameter \(\lambda ^+\) in the selection phase. To maintain a balance between exploitation and exploration, we set this parameter to \(\lambda ^+ = 0.7\), i.e., a 70% share of the total selection probability is assigned to the better half of the individuals. This gives

$$\begin{aligned} p(i)={\left\{ \begin{array}{ll} \frac{12i}{5K(K+2)}; &{} i\le \frac{K}{2}\\ \frac{28i}{5K(3K+2)}; &{} i> \frac{K}{2}. \end{array}\right. } \end{aligned}$$
(9)
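To make the construction concrete, the following is a minimal sketch of Eq. (8) for an even population size K, which reduces to Eq. (9) when \(\lambda ^+ = 0.7\); the function name is ours, and only the even-K case is covered.

```python
import numpy as np

def srs_probabilities(K, lam_plus=0.7):
    """Split rank selection probabilities, Eq. (8), for even K.
    Ranks i = 1..K run from worst to best; the worse half shares
    lam_minus and the better half shares lam_plus of the total
    probability mass."""
    assert K % 2 == 0, "this sketch covers the even-K case of Eq. (8)"
    lam_minus = 1.0 - lam_plus
    i = np.arange(1, K + 1)
    return np.where(i <= K // 2,
                    lam_minus * 8.0 * i / (K * (K + 2)),
                    lam_plus * 8.0 * i / (K * (3 * K + 2)))

p = srs_probabilities(K=150, lam_plus=0.7)
print(p.sum())        # 1.0
print(p[74], p[75])   # jump in probability at the split point K/2
```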

We also derive the formula for selecting individuals when the population size is odd (which rarely happens in the literature). First, we divide the K individuals into portions of \(\frac{K-1}{2}\) and \(\frac{K+1}{2}\). The top portion consists of individuals 1 to \(\frac{K-1}{2}\), and the sum of the ranks in this portion of the series is as follows:

$$\begin{aligned} \frac{K^2-1}{8}. \end{aligned}$$
(10)

The sum of the last portion of the series, i.e., from \(\frac{K+1}{2}\) to K, is

$$\begin{aligned}&\frac{K+1}{2} + \frac{K+3}{2} + \frac{K+5}{2} + \cdots + \frac{K+K}{2} \end{aligned}$$
(11)
$$\begin{aligned}&\quad = \frac{K}{4}(K+1) + \frac{1}{8}(K+1)^2 \end{aligned}$$
(12)
$$\begin{aligned}&\quad = \frac{(K+1)(3K+1)}{8}. \end{aligned}$$
(13)

Hence, the probability distribution according to the individual’s rank is as follows:

$$\begin{aligned} p(i)={\left\{ \begin{array}{ll} \lambda ^- (\frac{8i}{(K^2-1)}); &{} i\le \frac{K-1}{2}\\ \lambda ^+ (\frac{8i}{(K+1)(3K+1)}); &{} i> \frac{K-1}{2}. \end{array}\right. } \end{aligned}$$
(14)

We compare the proposed operator with LRS, ERS, BTS, and PTS for 150 individuals at various parameter settings, as depicted in Fig. 2. We used the parametric values reported as best in the literature to obtain maximal performance from these operators. For example, Baker showed that LRS performs best at \(\eta ^+ = 1.1\). For tournament selection, the binary size is preferable, since a larger size may lead to a loss of diversity. For ERS, the parameter r should be very close to 1 for good performance, and we use \(r = 0.99\). For PTS, the parameter q may lie within \(0.5{-}1\), but a high value is recommended to control population diversity, so we take \(q = 0.8\). We see that, at \(\lambda ^+=0.7\), SRS works efficiently and gives the poorer individuals a slightly higher selection probability than BTS does, but not as high as that given to them by LRS.

The sampling procedure

In two-step selection procedures, such as FPS, LRS, ERS, and SRS, a sampling mechanism is required to choose the individuals for the mating process. The sampling mechanism fills the mating pool with copies of individuals from the given population while respecting the selection probabilities \(p_i\), so that the expected and observed numbers of copies of each individual are as close as possible. Among the broad variety of sampling mechanisms, we used the roulette wheel sampling scheme (or Monte Carlo sampling) for testing the accuracy of the proposed SRS operator.
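A minimal sketch of roulette wheel sampling over a given probability vector \(p_i\) is shown below; this is a generic Monte Carlo implementation of our own, not necessarily the exact routine used in this study, and it reuses the srs_probabilities helper sketched in the previous section.

```python
import numpy as np

def roulette_wheel_sampling(p, pool_size, rng=None):
    """Fill a mating pool of the given size by repeatedly spinning
    a roulette wheel whose slice widths are the probabilities p_i.
    Returns the 0-based indices of the selected individuals."""
    rng = np.random.default_rng() if rng is None else rng
    cumulative = np.cumsum(p)            # wheel boundaries in [0, 1]
    spins = rng.random(pool_size)        # one spin per pool slot
    idx = np.searchsorted(cumulative, spins)
    return np.minimum(idx, len(p) - 1)   # guard against rounding at 1.0

# Example: sample a mating pool of K copies under SRS probabilities
p = srs_probabilities(K=150, lam_plus=0.7)
pool = roulette_wheel_sampling(p, pool_size=150)
observed = np.bincount(pool, minlength=150)   # o_i, copies per individual
```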

The \(\chi ^2\) goodness-of-fit measure

To measure the average difference between the expected and actual numbers of offspring, the \(\chi ^2\) measure was first introduced as a tool for assessing average accuracy by Schell et al. [27]. First, the individuals are grouped into c disjoint classes \(\{C_1,C_2,\ldots ,C_c\}\), where \(C_j \subset \{1,2,\ldots ,K\}\) and \(\cup ^c_{j=1} C_j=\{1,2,\ldots ,K\}\). Let \(\xi _j=\sum _{i\in C_j}^{} e_i\) denote the overall expectation, where \(e_i\) is the expected number of copies of individual i, and let \(O_j=\sum _{i\in C_j}^{} o_i\) be the observed (actual) number of copies of individuals in the mating pool after the sampling procedure. Ideally, \(\xi _j\) should be of the order K/c for \(1\le j \le c\), so that each class contains the same number of individuals on average, and it should be at least 10 to obtain the required stochastic accuracy. Schell et al. [27] defined the following Chi-square test as a measure of the accuracy of the sampling process:

$$\begin{aligned} \chi := \sum _{j=1}^{c} \frac{(\xi _j - O_j)^2}{\xi _j}. \end{aligned}$$
(15)

In the roulette wheel sampling situation, provided the aforementioned constraint (i.e., \(\xi _j \ge 10\)) holds, \(\chi \) should follow the Chi-square distribution with \(c-1\) degrees of freedom, since this is the asymptotic distribution of \(\chi \) under multinomially distributed \(o_i\) as \(K \rightarrow \infty \). In this research, we present the results for a fixed parameter setting, i.e., \(\lambda ^+=0.7\), population size \(K=150\), \(c=10\) classes, and \(s=150\) tests in total.

Table 1 Classes \(C_j\) and overall expectation \(\xi _j\) for SRS

Table 1 presents the probability distribution of SRS and the corresponding overall expectations, which are very close to 150/10. We use \(\chi ^{S,R}\) to denote the resulting values of \(\chi \), where S represents the proposed operator that assigns the probabilities to the individuals and R is the type of sampling algorithm, i.e., the roulette wheel. The main objective of this test is to estimate the expectation and variance. A population with a predefined, fixed number of individuals is randomly generated; the probability distribution of S is used to assign the individuals their selection probabilities, and the sampling scheme R is then applied to obtain instances of \(o_i\), \(O_j\), and \(\chi ^{S,R}\), respectively. From the sequence \((\chi ^{S,R}_k)_{1\le k \le s}\), the sample mean and variance can be calculated as follows:

$$\begin{aligned} \hat{e}^{(S,R)}= & {} \frac{1}{s} \sum _{k=1}^{s} \chi ^{S,R}_k \end{aligned}$$
(16)
$$\begin{aligned} \hat{\sigma }^{2(S,R)}= & {} \frac{1}{s-1} \sum _{k=1}^{s} ( \chi ^{S,R}_k - \hat{e}^{(S,R)})^2. \end{aligned}$$
(17)
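Assuming the helper functions sketched in the previous sections, the accuracy test of Eqs. (15)–(17) might be simulated as follows; the class boundaries used here are chosen only so that each \(\xi _j\) is roughly K/c, since the exact classes of Table 1 are not reproduced.

```python
import numpy as np

def chi_statistic(xi, O):
    """Chi measure of Eq. (15): sum over classes of (xi_j - O_j)^2 / xi_j."""
    xi, O = np.asarray(xi, float), np.asarray(O, float)
    return np.sum((xi - O) ** 2 / xi)

K, c, s = 150, 10, 150                            # population, classes, tests
p = srs_probabilities(K, lam_plus=0.7)            # SRS selection probabilities
e = K * p                                         # expected copies e_i

# Illustrative classes C_j with cumulative expectation roughly K/c each
cut_points = np.searchsorted(np.cumsum(e), np.arange(1, c) * (K / c)) + 1
classes = np.split(np.arange(K), cut_points)
xi = np.array([e[C].sum() for C in classes])      # overall expectations xi_j

chi_values = np.empty(s)
for k in range(s):
    pool = roulette_wheel_sampling(p, pool_size=K)
    o = np.bincount(pool, minlength=K)            # observed copies o_i
    O = np.array([o[C].sum() for C in classes])   # class totals O_j
    chi_values[k] = chi_statistic(xi, O)

e_hat = chi_values.mean()                         # sample mean, Eq. (16)
var_hat = chi_values.var(ddof=1)                  # sample variance, Eq. (17)
print(e_hat, var_hat)   # expected to be near c-1 = 9 and 2(c-1) = 18
```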

This scheme is compared with the theoretical \(\chi _{c-1}^2\) distribution at the \(99\%\) confidence level. For 10 classes, the mean and variance of the Chi-square distribution are \(c-1=9\) and \(2(c-1)=18\). The corresponding estimates \(\hat{e}\) and \(\hat{\sigma }^2\) are 8.5739 and 19.6010, respectively. These estimates are close to the theoretical values, which indicates good agreement between the probabilities assigned to the individuals and the numbers of copies of those individuals that reach the mating pool. The empirical results confirm the average behaviour of the sampling scheme with respect to the probability distribution of SRS: the empirical distribution produced by roulette wheel sampling does not differ significantly from the theoretical \(\chi _{c-1}^2\) distribution in terms of the \(\hat{e}\) and \(\hat{\sigma }^2\) statistics.

Test problem (traveling salesman problem)

The traveling salesman problem (TSP) is one of the most famous, significant, and historic hard combinatorial optimization benchmark problems. The main objective of the TSP is to find the shortest Hamiltonian tour in a complete graph with n nodes. It was documented by Euler in 1759 (his interest was in the knight’s tour problem) [28]. It is a fundamental problem in the fields of computer science, engineering, operations research, discrete mathematics, and graph theory. In this problem, a salesman visits all cities (nodes) exactly once (the constraint) and then returns to the starting point to complete the tour. It has many applications, such as vehicle routing [29], scheduling [30], and bioinformatics [31], which can easily be transformed into the TSP.

Given ‘n’ cities and a distance matrix \(C = [c_{ij}]_{n\times n}\), where \(c_{ij}\) is the distance from city i to city j, we search for a permutation \(\lambda :\{1,\ldots , n\} \longrightarrow \{1,\ldots , n\}\) that minimizes the traveled distance \(f(\lambda , C)\):

$$\begin{aligned} f(\lambda , C) = \sum _{i=1}^{n-1}d(c_{\lambda (i)}, c_{\lambda (i+1)}) + d(c_{\lambda (n)}, c_{\lambda (1)}), \end{aligned}$$
(18)

where \(\lambda (i)\) represents the city at position i in the tour, \(d(c_{i}, c_{j})\) is the distance between city i and city j, and \((x_{i}, y_{i})\) is the position of city i in the plane. The Euclidean distance \(c_{ij}\) of the distance matrix C between cities i and j is expressed as follows:

$$\begin{aligned} c_{ij} = \sqrt{(x_i - x_j)^2 + (y_i - y_j)^2}. \end{aligned}$$
(19)
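For example, the tour length of Eq. (18) with the Euclidean distances of Eq. (19) can be computed with a short sketch such as the following; the function name and city coordinates are hypothetical.

```python
import numpy as np

def tour_length(tour, coords):
    """Length of a closed tour, Eq. (18): the sum of Euclidean
    distances, Eq. (19), between consecutive cities plus the
    return leg from the last city back to the first."""
    coords = np.asarray(coords, dtype=float)
    ordered = coords[tour]                        # cities in visiting order
    diffs = np.diff(ordered, axis=0)              # consecutive legs
    length = np.sqrt((diffs ** 2).sum(axis=1)).sum()
    length += np.linalg.norm(ordered[-1] - ordered[0])   # closing leg
    return length

# Hypothetical 4-city example; the tour is a permutation of city indices
coords = [(0, 0), (0, 3), (4, 3), (4, 0)]
print(tour_length([0, 1, 2, 3], coords))   # 14.0 for this rectangle
```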

TSP is easy to understand but very difficult to solve; for example, for 100 cities, there are about \(10^{155}\) possible tours. This is why TSP is said to be a non-deterministic polynomial-time hard (NP-hard) problem [32, 33]. These types of problems cannot be solved using traditional optimization approaches such as gradient-based methods. To achieve optimal or near-optimal solutions within a reasonable amount of time, heuristic approaches are efficient at handling NP-hard problems [34,35,36,37]. The GA has also been used to solve this problem in several respects [28, 38,39,40,41,42,43,44].

Performance evaluation

In this section, we evaluate the performance of the SRS in comparison with other selection schemes. First, we present basic information about the benchmarks and the parameter settings for GA in “Computational testing methodology”. Second, MATLAB software (version R2015a) was used to carry out the simulation study comparing the selection operators, and a detailed discussion of the results is given in “Simulation results and discussion”.

Computational testing methodology

In this research, computational experiments on ten different instances taken from the traveling salesman problem library (TSPLIB) [45] are performed to compare the proposed scheme with five competing selection methods. The test benchmarks are Euclidean two-dimensional symmetric and asymmetric problems with 34–561 cities, listed in Table 2. In addition, we consider the three most widely used crossover schemes, namely order crossover (OX), partially mapped crossover (PMX), and cycle crossover (CX). Exchange mutation (EM) is used as the mutation operator throughout our simulation study. These are state-of-the-art genetic operators, and a detailed discussion is given in Ref. [28]. Therefore, three groups of experiments are conducted, in which the six selection schemes are examined with each of the three crossover operators and the one mutation operator. Table 3 presents the genetic operators and parameter values under consideration in our simulation study.

Table 2 The benchmark problems
Table 3 Parametric configuration for GA
Table 4 Results of different selection strategies with PMX (crossover) and EM (mutation) operators
Table 5 Results of different selection strategies with OX (crossover) and EM (mutation) operators
Table 6 Results of different selection strategies with CX (crossover) and EM (mutation) operators

Since GA belongs to the class of probabilistic search algorithms, we use the two-sample t test for statistical hypothesis testing [46]. The experiments were performed over 30 independent trials (each pair with \(n_1=n_2=30\)) for each instance to obtain comparable solutions. The two-tailed t test values are calculated using Eq. (20), where \(\bar{x}_1\) and \(s_1\) are, respectively, the average and standard deviation (SD) of SRS, and \(\bar{x}_2\) and \(s_2\) are, respectively, the average and SD of the competing operator (i.e., FPS, LRS, ERS, BTS, or PTS). In this study, we state our null hypothesis as follows: ‘SRS converges at least as fast as the other selection operators in the comparison’. Throughout this study, all statistical differences are reported at the \(p = 0.05\) (95% confidence) level of significance using the two-sample (independent samples) t test with 58 degrees of freedom. The two-tailed t test value indicates either a significant improvement by SRS (\(t\le - \ 2.00\)) or a significant degradation by SRS (\(t\ge 2.00\)). A two-tailed t test score within the range (\(- \ 2.00<t<2.00\)) does not provide reasonable statistical evidence to confirm or refute our null hypothesis, which indicates that there is no statistically significant difference between the two approaches:

$$\begin{aligned} t=\frac{\bar{x_1} - \bar{x_2}}{s_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}}, \end{aligned}$$
(20)

where

$$\begin{aligned} s_p=\sqrt{\frac{(n_1 -1)s^2_1 + (n_2 -1)s^2_2}{n_1 + n_2 - 2}}. \end{aligned}$$
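As an illustration, the statistic of Eq. (20) with the pooled standard deviation \(s_p\) defined above can be computed as follows; the sample data in the example are hypothetical and are not results from Tables 4, 5, or 6.

```python
import numpy as np

def two_sample_t(x1, x2):
    """Two-sample (independent samples) t statistic, Eq. (20), using
    the pooled standard deviation s_p; x1 holds the SRS results and
    x2 those of a competing selection operator."""
    x1, x2 = np.asarray(x1, float), np.asarray(x2, float)
    n1, n2 = len(x1), len(x2)
    sp = np.sqrt(((n1 - 1) * x1.var(ddof=1) + (n2 - 1) * x2.var(ddof=1))
                 / (n1 + n2 - 2))
    return (x1.mean() - x2.mean()) / (sp * np.sqrt(1.0 / n1 + 1.0 / n2))

# Hypothetical tour-length samples from 30 independent trials each
rng = np.random.default_rng(0)
srs_runs = rng.normal(2500, 40, 30)
fps_runs = rng.normal(2600, 55, 30)
t = two_sample_t(srs_runs, fps_runs)
print(t)   # t <= -2.00 would indicate a significant improvement by SRS
```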

Simulation results and discussion

Table 4 summarizes the results of the six competing selection schemes with PMX and EM as the crossover and mutation operators, respectively. The results are compared on the basis of the average, the SD, and the improvement achieved by SRS in percentage (%) values. Significant improvements in the results of SRS with respect to each other approach are indicated through the t values. The proposed operator shows lower average values for all ten benchmarks, with stable results (low SD) as well. According to the critical value (\(t = -\ 2.00\)), the computed t scores are less than − 2.00 for the ten benchmark instances, and the bold t test values indicate significantly improved performance by the proposed operator. The t test values that are not bold (non-significant) are nevertheless negative, which indicates slightly better average performance by the proposed operator. In other words, the simulation results obtained by SRS are statistically significant and better than those of the other five selection approaches (i.e., FPS, LRS, ERS, BTS, and PTS).

The order crossover (OX) is used instead of PMX, and the simulation results for the various selection operators are summarized in Table 5. These results are also compared on the basis of the average, the SD, and the improvement achieved by SRS in percentage (%) values. The t test is again used to measure not only improved but significantly improved performance by the proposed SRS. The simulated results show lower average values for SRS on all benchmarks, with consistent results (low SD) as well. According to the critical value (\(t = - \ 2.00\)), the computed t scores are less than − 2.00 for the ten benchmark instances, and the bold t test values indicate significantly improved performance by the proposed operator. The t test values that are not bold (non-significant) are nevertheless negative, which indicates slightly better average performance by the proposed operator. The table shows that there is no positive t test value, which means that no other operator is better than the proposed one in any case. Based on the simulation results, we can say that the proposed operator (SRS) is statistically significant and better than the other five selection approaches (i.e., FPS, LRS, ERS, BTS, and PTS).

Fig. 3 Convergence of GA using PMX and EM for the instance ‘rbg403’

Fig. 4 Convergence of GA using OX and EM for the instance ‘rbg403’

Fig. 5 Convergence of GA using CX and EM for the instance ‘rbg403’

We continue our simulation study to check the performance of the proposed operator, along with the other selection methods, under different crossover and mutation operators. Likewise, in Table 6, we test the performance of SRS with the pair of CX (crossover operator) and EM (mutation operator). The simulation results indicate lower average and SD values for all benchmarks with SRS. From a statistical perspective, SRS outperforms (bold t test values) all the other selection methods for all ten benchmark instances (\(t \le - \ 2.00\)), except that, in some cases, BTS and LRS give non-significant differences with respect to the proposed operator. The non-bold t test values are all negative, which means that the proposed operator is not worse than any other competing selection operator used in this study. Moreover, we can clearly see from Figs. 3, 4 and 5, and from the analyses performed on the ‘rbg403’ instance, that SRS produces lower average results with the three different crossover operators and the one mutation operator. We also observe that FPS and BTS produce faster improvement in the early stages, but lead to premature convergence because of high selection pressure. In contrast, the proposed operator works efficiently throughout the generations, taking care of both selection pressure and population diversity.

Conclusions

Exploration and exploitation are the two main mechanisms employed by virtually all optimization methods. The fitness proportional selection approach is essentially exploitative, whereas the linear rank approach is oriented towards exploration. This article presented a new split rank selection operator that offers a good trade-off between exploration and exploitation. In the proposed procedure, the individuals are ranked according to their fitness scores from worst to best, thus overcoming the fitness scaling issue. The whole population is then split into two portions, and selection probabilities are assigned based on the ranks. The \(\chi ^2\) goodness-of-fit test confirms that there is no significant difference between the expected and the actual number of offspring. To evaluate the performance of the proposed operator, we conducted a series of simulation studies alongside some conventional operators. The computational results demonstrated the superior performance of the new selection scheme in comparison with traditional GA approaches, and the significance of this improvement was validated through the two-tailed t test. Hence, the proposed operator may be a good candidate for obtaining optimal or near-optimal results. Moreover, researchers may apply it with more confidence to other problems related to evolutionary algorithms.