Localized biogeography-based optimization

Biogeography-based optimization (BBO) is a relatively new heuristic method in which a population of habitats (solutions) is continuously evolved and improved, mainly by migrating features from high-quality solutions to low-quality ones. In this paper we equip BBO with local topologies, which restrict migration to the neighborhood of each habitat. We develop three versions of localized BBO, which use the ring topology, the square topology, and the random topology, respectively. Our approach is easy to implement, yet it effectively improves the search capability and helps prevent the algorithm from being trapped in local optima. We demonstrate the effectiveness of our approach on a set of well-known benchmark problems. We also introduce the local topologies to a hybrid DE/BBO method, resulting in three localized DE/BBO algorithms, and show that our approach can improve the performance of this state-of-the-art algorithm as well.


Introduction
The complexity of real-world optimization problems gives rise to various kinds of evolutionary algorithms (EA) such as genetic algorithms (GA) (Holland 1992), particle swarm optimization (PSO) (Kennedy and Eberhart 1995), differential evolution (DE) (Storn and Price 1997), etc. Drawing inspiration from biological systems, EA typically use a population of candidate individual solutions to stochastically explore the search space. As a result of the great success of such heuristic methods, viewing various natural processes as computations has become more and more essential, desirable, and inevitable (Kari and Rozenberg 2008).
Proposed by Simon (2008), biogeography-based optimization (BBO) is a relatively new EA that borrows ideas from biogeographic evolution for global optimization. Theoretically, it uses a linear time-invariant system with zero input to model migration, emigration, and mutation of creatures in an island (Sinha et al. 2011). In the metaheuristic algorithm, each individual solution is considered as a "habitat" or "island" with a habitat suitability index (HSI), based on which an immigration rate and an emigration rate can be calculated. High HSI solutions tend to share their features with low HSI solutions, and low HSI solutions are likely to accept many new features from high HSI solutions. BBO has proven to be competitive with other EA on a wide set of problems (Simon 2008; Du et al. 2009; Song et al. 2010). Moreover, Markov analysis has shown that BBO outperforms GA on simple unimodal, multimodal, and deceptive benchmark functions when used with low mutation rates.
In the original BBO, any two habitats in the population have a chance to communicate with each other, i.e., the algorithm uses a global topology, which is computationally intensive and prone to premature convergence. In this paper we modify the original BBO by using local topologies, where each habitat can only communicate with a subset of the population, i.e., its neighboring habitats. We introduce three local topologies to BBO, namely the ring topology, the square topology, and the random topology, and demonstrate that our approach can achieve significant performance improvement. We also test the local topologies on a hybrid algorithm, namely DE/BBO (Gong et al. 2010), and show that our approach is also useful for improving this state-of-the-art method. To the best of our knowledge, this paper is the first attempt to improve BBO by modifying its internal structure with local topologies.
In the remainder of the paper, we first introduce the original BBO in Sect. 2, and then discuss related work in Sect. 3. In Sect. 4 we propose our approach for improving BBO, the effectiveness of which is demonstrated by the experiments in Sect. 5. Finally we conclude in Sect. 6.

Biogeography-based optimization
Biogeography is the science of the geographical distribution of biological organisms over space and time. MacArthur and Wilson (1967) established the mathematical models of island biogeography, showing that the species richness of an island can be predicted in terms of such factors as habitat area, immigration rate, and extinction rate. Inspired by this, Simon (2008) developed BBO, where a solution is analogous to a habitat, the solution components are analogous to a set of suitability index variables (SIV), and the solution fitness is analogous to the species richness or HSI of the habitat. Central to the algorithm is the equilibrium theory of biogeography, which indicates that high HSI habitats have a high species emigration rate and low HSI habitats have a high species immigration rate. For example, in a linear model of species richness (Fig. 1), the immigration rate $\lambda_i$ and emigration rate $\mu_i$ of a habitat $H_i$ are calculated based on its fitness $f_i$ as follows:

$$\lambda_i = I\left(1 - \frac{f_i}{f_{\max}}\right), \qquad \mu_i = E\,\frac{f_i}{f_{\max}}$$

where $f_{\max}$ is the maximum fitness value among the population, and $I$ and $E$ are the maximum possible immigration rate and emigration rate, respectively. However, there are other non-linear mathematical models of biogeography that can be used for calculating the migration rates (Simon 2008).

Migration modifies habitats by mixing features within the population. BBO also uses a mutation operator to change SIV within a habitat itself, which is likely to increase the diversity of the population. For each habitat $H_i$, a species count probability $P_i$ computed from $\lambda_i$ and $\mu_i$ indicates the likelihood that the habitat was expected a priori to exist as a solution for the problem. In this context, very high HSI habitats and very low HSI habitats are both improbable, and medium HSI habitats are relatively probable. The mutation rate of a habitat is inversely related to its probability:

$$\pi_i = \pi_{\max}\left(1 - \frac{P_i}{P_{\max}}\right)$$

where $\pi_{\max}$ is a control parameter and $P_{\max}$ is the maximum habitat probability in the population.
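As a small illustration of the linear model, the following Python sketch computes the migration rates from a list of fitness values and the mutation rates from a list of habitat probabilities. It assumes a maximization setting with a positive maximum fitness, and it takes the probabilities $P_i$ as given; the function names `migration_rates` and `mutation_rates` are hypothetical and are not part of the original BBO code.

```python
def migration_rates(fitness, I=1.0, E=1.0):
    """Linear model: lambda_i = I * (1 - f_i / f_max), mu_i = E * f_i / f_max."""
    f_max = max(fitness)
    if f_max <= 0:
        raise ValueError("this sketch assumes a positive maximum fitness")
    lam = [I * (1.0 - f / f_max) for f in fitness]
    mu = [E * f / f_max for f in fitness]
    return lam, mu


def mutation_rates(probabilities, pi_max=0.01):
    """pi_i = pi_max * (1 - P_i / P_max); the P_i are taken as given here."""
    p_max = max(probabilities)
    return [pi_max * (1.0 - p / p_max) for p in probabilities]


lam, mu = migration_rates([1.0, 2.0, 4.0])
print(lam)  # [0.75, 0.5, 0.0]
print(mu)   # [0.25, 0.5, 1.0]
```

The best habitat receives an immigration rate of 0 and the maximum emigration rate, so it shares features without being overwritten by migration.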
Algorithm 1 shows the general framework of BBO for a $D$-dimensional numerical optimization problem (where $l_d$ and $u_d$ are the lower and upper bounds of the $d$th dimension respectively, and rand is a function generating a random value uniformly distributed in $[0,1]$).
Typically, in Line 8 we can use a roulette wheel method for selection, the time complexity of which is $O(n)$. It is not difficult to see that the complexity of each iteration of the algorithm is $O(n^2 D + nf)$, where $f$ denotes the time complexity of the fitness function.
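To make the cost argument concrete, here is a minimal Python sketch of one BBO generation with the global topology. It is not a transcription of Algorithm 1: it assumes maximization with strictly positive fitness, it replaces the probability-based mutation rate with a fixed per-component rate `p_mut`, and the names `roulette` and `bbo_generation` are hypothetical.

```python
import random


def roulette(weights, rng):
    """Roulette-wheel selection in O(n): index j is drawn with probability
    proportional to weights[j]; zero-weight entries are never returned."""
    total = sum(weights)
    r = rng.random() * total
    acc = 0.0
    last = None
    for j, w in enumerate(weights):
        if w <= 0:
            continue
        acc += w
        last = j
        if r < acc:
            return j
    if last is None:
        raise ValueError("at least one weight must be positive")
    return last  # guards against floating-point round-off


def bbo_generation(pop, fitness_fn, lower, upper, rng, I=1.0, E=1.0, p_mut=0.01):
    """One generation of a basic BBO with global topology (maximization)."""
    n, dim = len(pop), len(pop[0])
    fit = [fitness_fn(h) for h in pop]  # n fitness evaluations
    f_max = max(fit)
    lam = [I * (1.0 - f / f_max) for f in fit]
    mu = [E * f / f_max for f in fit]
    new_pop = []
    for i in range(n):
        child = list(pop[i])
        for d in range(dim):
            if rng.random() < lam[i]:
                # Global topology: every other habitat is a candidate emigrant,
                # so this selection costs O(n) per component.
                weights = [mu[j] if j != i else 0.0 for j in range(n)]
                child[d] = pop[roulette(weights, rng)][d]
            if rng.random() < p_mut:
                # Simplified mutation: resample the component uniformly.
                child[d] = lower[d] + rng.random() * (upper[d] - lower[d])
        new_pop.append(child)
    return new_pop
```

The $O(n)$ selection sits inside the loops over $n$ habitats and $D$ components, which is where the $n^2 D$ term comes from.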

Related work
Since the proposal of the original BBO, several studies have been devoted to improving its performance by incorporating features of other heuristics. Du et al. (2009) combined BBO with the elitism mechanism of evolution strategy (ES) (Beyer and Schwefel 2002), that is, at each generation the best $n$ solutions among the $n$ parents and $n$ children are chosen as the population for the next generation. Based on opposite numbers theory (Tizhoosh 2005; Rahnamayan et al. 2008; Wang et al. 2011), Ergezer et al. (2009) employed opposition-based learning to create the oppositional BBO, which uses a basic population together with an opposite population in problem solving. Each basic solution has an opposite reflection in the opposite population, which may be closer to the expected solution than the basic one. Consequently, the required search space can be reduced and the convergence speed can increase.
The migration operator makes BBO good at exploiting the information of the population. Gong et al. (2010) proposed a hybrid DE/BBO algorithm, which effectively combines the exploration of DE with the exploitation of BBO. The core idea is to hybridize DE's mutation operator with BBO's migration operator, such that good solutions are less likely to be destroyed, while poor solutions can accept many new features from good solutions. In some detail, at each iteration, each $d$th component of each $i$th solution is updated by a procedure governed by the crossover rate $C_R$, three mutually distinct random population indices $r_1$, $r_2$ and $r_3$, and a differential weight $F$.

Li and Yin (2012) proposed two extensions of BBO, the first using a new migration operation based on multiparent crossover to balance the exploration and exploitation capability of the algorithm, and the second integrating a Gaussian mutation operator to enhance its exploration ability.
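The DE/BBO component update is not reproduced in this text, so the following Python sketch should be read as one plausible reading of the hybrid operator rather than as Gong et al.'s exact procedure. It assumes that a component is modified only when immigration fires (probability $\lambda_i$), that the DE mutation is then applied with probability $C_R$ (or at one forced index), and that otherwise the component migrates from a habitat chosen in proportion to its emigration rate. The function name `debbo_trial` and the default parameter values are hypothetical.

```python
import random


def debbo_trial(pop, i, lam, mu, rng, F=0.5, CR=0.9):
    """Build a trial vector for habitat i by mixing DE mutation and BBO migration."""
    n, dim = len(pop), len(pop[i])
    # r1, r2, r3: mutually distinct indices, all different from i (needs n >= 4).
    r1, r2, r3 = rng.sample([k for k in range(n) if k != i], 3)
    d_rand = rng.randrange(dim)  # at most this one component is forced to use DE
    trial = list(pop[i])
    for d in range(dim):
        if rng.random() < lam[i]:
            if rng.random() < CR or d == d_rand:
                # DE mutation: exploration around a random base vector.
                trial[d] = pop[r1][d] + F * (pop[r2][d] - pop[r3][d])
            else:
                # BBO migration: copy from a habitat chosen by emigration rate.
                j = rng.choices(range(n), weights=mu)[0]
                trial[d] = pop[j][d]
        # Otherwise the component of habitat i is kept unchanged.
    return trial
```

Under this reading, a good solution (small $\lambda_i$) keeps most of its components, while a poor one is largely rebuilt from DE mutation and from features of better habitats.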
Observing that the performance of BBO is sensitive to the migration model, Ma et al. (2012a) proposed an approach that equips BBO with an ensemble of migration models. They realized such an algorithm using three parallel populations, each of which implements a different migration model. Experimental results show that the new method generally outperforms various versions of BBO with a single migration model. The original BBO is for unconstrained optimization. Ma and Simon (2011) developed a blended BBO for constrained optimization, which uses a blended migration operator motivated by blended crossover in GA (Mühlenbein and Schlierkamp-Voosen 1993), and determines whether a modified solution can enter the population of the next generation by comparing it with its parent. In their approach, an infeasible solution is always considered worse than a feasible one, and among two infeasible solutions the one with a smaller constraint violation is considered better.
Some recent studies (Mo and Xu 2011; Ma et al. 2012b; Costa e Silva et al. 2012) have been carried out on the application of BBO to multiobjective optimization problems. Those methods typically evaluate solutions based on the concept of Pareto optimality, use an external archive for maintaining non-dominated solutions, and employ specific strategies to improve the diversity of the archive. Experiments show that they can provide performance comparable to that of some well-known multiobjective EA such as NSGA-II (Deb et al. 2002). Boussaïd et al. (2011) developed another approach combining BBO and DE, which evolves the population by alternately applying BBO and DE iterations. Boussaïd et al. (2012) extended the approach to constrained optimization by replacing the original mutation operator of BBO with the DE mutation operator and adding a penalty function to the objective function to handle problem constraints.
Recently Lohokare et al. (2013) proposed a new hybrid BBO method, namely aBBOmDE, which uses a modified mutation operator to enhance exploration, introduces a clear-duplicate operator to suppress the generation of similar solutions, and embeds a DE-based neighborhood search operator to further enhance exploitation. The improved memetic algorithm exhibits significantly better performance than BBO and DE, and performance comparable to that of DE/BBO.

All of the above work focuses on hybridization and inherits the global topology of the original BBO, i.e., any two habitats in the population have a chance to communicate with each other. Up to now, little research has been carried out on improving the internal structure of BBO. Here we refer to PSO, the most popular swarm-based EA, which also uses a global best model in its original version, i.e., at each iteration each particle is updated by learning from its own memory and the very best particle in the entire population. However, many later studies have revealed that the global best model shows inferior performance on many problems, and they prefer a local best model (Kennedy and Mendes 2006), i.e., each particle learns from a local best particle among its immediate neighbors rather than from the entire population. Drawing inspiration from this, we introduce local topologies to BBO.

Localized biogeography-based optimization
The original BBO uses a global topology where any two habitats can communicate with each other: if a given habitat is chosen for immigration, any other habitat has a chance to be an emigrating habitat, as shown in Fig. 2a. However, such a migration mechanism is computationally expensive.
More importantly, if one or several habitats near local optima are much better than the others in the population, then all other solutions are very likely to accept many features from them, and thus the algorithm is easily trapped in the local optima.
A way to overcome this problem is to use a local topology, where each individual can only communicate with its (immediate) neighbors rather than with any other individual in the population. Under this scheme, while some individuals in one part of the population interact with each other to search one region of the solution space, another part of the population can explore another region. However, the different parts of the population are not isolated. Instead they are interconnected by some intermediate individuals, so information can flow through the neighborhoods at a moderate pace. In consequence, the algorithm can achieve a better balance between global search (exploration) and local search (exploitation).

Using the ring and square topologies
The first local topology we introduce to BBO is the ring topology, where each habitat is directly connected to two other habitats, as shown in Fig. 2b. The second is the square topology, where the habitats are arranged in a grid and each habitat has four immediate neighbors, as shown in Fig. 2c.
Suppose the population of $n$ habitats is stored in an array or a list, the $i$th of which is denoted by $H_i$. When using the ring topology, each $H_i$ has two neighbors $H_{(i-1)\%n}$ and $H_{(i+1)\%n}$, where % denotes the modulo operation. In this case, Line 8 of Algorithm 1 can be refined so that the emigrating habitat is selected only from these two neighbors, according to their emigration rates. For the square topology where the width of the grid is $w$, the indices of the neighbors of habitat $H_i$ are $(i-1)\%n$, $(i+1)\%n$, $(i-w)\%n$ and $(i+w)\%n$, respectively, and Line 8 of Algorithm 1 can be refined in the same way to select among these four neighbors.

For both topologies, the complexity of each iteration of the algorithm decreases to $O(nD + nf)$. When the fitness function of the problem is not expensive to evaluate, BBO with the ring topology can save much computational cost.
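The neighbor indexing for the ring and square topologies can be sketched in Python as follows. The index formulas come from the text; the helper `select_emigrant`, which picks a neighbor with probability proportional to its emigration rate, is an assumed form of the refined selection step, and all function names are hypothetical.

```python
import random


def ring_neighbors(i, n):
    """Indices of the two neighbors of habitat i on a ring of n habitats."""
    return [(i - 1) % n, (i + 1) % n]


def square_neighbors(i, n, w):
    """Indices of the four neighbors of habitat i on a grid of width w."""
    return [(i - 1) % n, (i + 1) % n, (i - w) % n, (i + w) % n]


def select_emigrant(neighbors, mu, rng):
    """Pick one neighbor with probability proportional to its emigration rate."""
    weights = [mu[j] for j in neighbors]
    if sum(weights) <= 0:
        return rng.choice(neighbors)  # fall back to a uniform choice
    return rng.choices(neighbors, weights=weights)[0]


print(ring_neighbors(0, 50))        # [49, 1]
print(square_neighbors(0, 50, 10))  # [49, 1, 40, 10]
```

Python's `%` returns a non-negative result for a negative left operand, so the wrap-around at index 0 needs no special case. Because the number of candidates is a constant (2 or 4), each selection costs $O(1)$ instead of $O(n)$.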

Using the random topology
The neighborhood size is fixed to 2 in the ring topology and 4 in the square topology. A more general approach is to set the neighborhood size to $K$ ($0 < K < n$), a parameter that can be adjusted according to the properties of the problem at hand. But how should the topology be set for an arbitrary $K$? A simple way is to randomly select $K$ neighbors for each habitat; the result is called a random topology. In the algorithm implementation, the neighborhood structure can be saved in an $n \times n$ matrix Link: if two habitats $H_i$ and $H_j$ are directly connected then Link$(i, j) = 1$, otherwise Link$(i, j) = 0$. However, such an approach has the following limitations:
- $K$ has to be an integer, which limits the adjustability of the parameter.
- All habitats in the population have the same number of neighbors. For an EA modeling a social network, however, it can be better for some individuals to have more informants than others (Kennedy and Mendes 2006).

[Table note, results of BBO and the localized BBO algorithms: in columns 2-5, the upper part of each cell is the mean best and the lower part is the standard deviation. In columns 3-5, markers indicate where the localized BBO is statistically significantly better than BBO and vice versa (at the 95 % confidence level). The last row counts the problems on which each algorithm outperforms BBO and vice versa.]
A more effective way is to let each habitat have $K$ neighbors in expectation. The neighborhood size then follows a binomial distribution (each potential link being an independent Bernoulli trial) in which any value from 0 to $n - 1$ is possible, so that all possible topologies may appear. Precisely, the probability that any two habitats are connected is $K/(n - 1)$, so a topology with an expected neighborhood size of $K$ can be set by connecting each pair of distinct habitats $H_i$ and $H_j$, i.e., setting Link$(i, j)$ = Link$(j, i)$ = 1, with probability $K/(n - 1)$.

The random topology provides enough diversification for effectively dealing with multimodal problems, but its neighborhood structure should be dynamically changed according to the state of convergence. There are different strategies, such as resetting the neighborhood structure at each iteration, after a certain number of iterations, or after one or a certain number of non-improving iterations. Empirically, we suggest setting $K$ to a value between 2 and 4 and resetting the neighborhood structure whenever the best solution has not been improved after a whole iteration.
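A minimal Python sketch of this topology-setting step is given below. It assumes a symmetric Link matrix with a zero diagonal and draws each pair independently with probability $K/(n-1)$; the function name `random_topology` is hypothetical.

```python
import random


def random_topology(n, K, rng):
    """Return an n x n symmetric 0/1 matrix in which each pair of distinct
    habitats is linked with probability K / (n - 1)."""
    p = K / (n - 1)
    link = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < p:
                link[i][j] = link[j][i] = 1
    return link


rng = random.Random(42)
link = random_topology(50, 3, rng)
degrees = [sum(row) for row in link]
print(sum(degrees) / len(degrees))  # average neighborhood size, close to K
```

Note that $K$ no longer has to be an integer here, and that individual habitats may end up with no neighbors at all, in which case they simply skip migration until the topology is reset.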
Algorithm 2 presents the framework of such a localized BBO with the random topology.
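Since Algorithm 2 is not reproduced in this text, the following Python sketch only illustrates the overall idea: migration restricted to random-topology neighbors, with the topology reset after any iteration that fails to improve the best solution. It assumes maximization with strictly positive fitness and a fixed per-component mutation rate `p_mut` instead of the probability-based rate; `localized_bbo` and `random_topology` are hypothetical names.

```python
import random


def random_topology(n, K, rng):
    """Symmetric 0/1 link matrix; each pair is linked with probability K/(n-1)."""
    p = K / (n - 1)
    link = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < p:
                link[i][j] = link[j][i] = 1
    return link


def localized_bbo(fitness_fn, lower, upper, n=50, K=3, iters=100, p_mut=0.02, seed=0):
    """Sketch of BBO with a random local topology (maximization, fitness > 0)."""
    rng = random.Random(seed)
    dim = len(lower)
    pop = [[lower[d] + rng.random() * (upper[d] - lower[d]) for d in range(dim)]
           for _ in range(n)]
    fit = [fitness_fn(h) for h in pop]
    best_fit = max(fit)
    best = list(pop[fit.index(best_fit)])
    link = random_topology(n, K, rng)
    for _ in range(iters):
        f_max = max(fit)
        lam = [1.0 - f / f_max for f in fit]  # I = 1
        mu = [f / f_max for f in fit]         # E = 1
        new_pop = []
        for i in range(n):
            child = list(pop[i])
            nbrs = [j for j in range(n) if link[i][j]]
            for d in range(dim):
                if nbrs and rng.random() < lam[i]:
                    # Migration is restricted to the neighbors of habitat i.
                    j = rng.choices(nbrs, weights=[mu[k] for k in nbrs])[0]
                    child[d] = pop[j][d]
                if rng.random() < p_mut:
                    child[d] = lower[d] + rng.random() * (upper[d] - lower[d])
            new_pop.append(child)
        pop = new_pop
        fit = [fitness_fn(h) for h in pop]
        gen_fit = max(fit)
        if gen_fit > best_fit:
            best_fit = gen_fit
            best = list(pop[fit.index(gen_fit)])
        else:
            # No improvement in this iteration: reset the neighborhood structure.
            link = random_topology(n, K, rng)
    return best, best_fit
```

For a minimization benchmark with optimum 0, one possible (assumed) mapping to a positive fitness is `1 / (1 + objective(x))`.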

Benchmark functions and experimental settings
We test the localized BBO algorithms on a set of well-known benchmark functions from Yao et al. (1999), denoted as $f_1$ to $f_{23}$, which span a diverse set of features of numerical optimization problems. $f_1$ to $f_{13}$ are high dimensional and scalable problems, and $f_{14}$ to $f_{23}$ are low dimensional problems with a few local minima. For the high dimensional $f_8$ and all the low dimensional functions that have non-zero optimal values, in the experiments we simply add constants to the objectives such that their best fitness values become 0.
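For orientation, three of these benchmarks can be sketched in Python as follows, using their commonly stated forms from the benchmark suite: the sphere function $f_1$, the noisy quartic function $f_7$, and the generalized Schwefel function $f_8$ shifted so that its best value is approximately 0. The shift constant 418.9829 per dimension is the commonly quoted value and is an assumption here, as are the function names.

```python
import math
import random


def sphere(x):
    """f1: sum of squares, unimodal, minimum 0 at the origin."""
    return sum(v * v for v in x)


def noisy_quartic(x, rng=random):
    """f7: weighted quartic plus uniform noise in [0, 1)."""
    return sum((i + 1) * v ** 4 for i, v in enumerate(x)) + rng.random()


def shifted_schwefel(x):
    """f8 plus 418.9829 * D, so that the best value is approximately 0."""
    raw = sum(-v * math.sin(math.sqrt(abs(v))) for v in x)
    return raw + 418.9829 * len(x)
```

Because of the additive noise, repeated evaluations of `noisy_quartic` at the same point return different values, which is what makes $f_7$ behave differently from the other unimodal functions.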
The experiments are conducted on a computer with an Intel Core i5-2520M processor and 4 GB of memory. For a fair comparison, all the algorithms use the same population size $n = 50$ and the same maximum number of function evaluations (NFE), preset for each test problem. For the localized BBO with the random topology we set $K = 3$. Every algorithm has been run 60 times on each problem with different random seeds, and the resulting mean best fitness values and success rates (with respect to the required accuracies) are averaged over the 60 runs. The required accuracy is set to $10^{-8}$ for 22 test functions and to $10^{-2}$ for $f_7$ (Suganthan et al. 2005).

[Table note, comparison with DE and DE/BBO: in columns 2-6, the upper part of each cell is the mean best and the lower part is the standard deviation. In columns 4-6, markers indicate where the localized BBO is statistically significantly better than DE/BBO and vice versa; in columns 3-6, + means that the algorithm is statistically significantly better than DE and − vice versa (at the 95 % confidence level). The final rows count the problems on which each algorithm outperforms DE/BBO, and DE, and vice versa.]
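The two reported statistics can be computed as in the following Python sketch. It assumes that a run counts as successful when its best objective value (with the optimum shifted to 0) does not exceed the required accuracy; the names `required_accuracy` and `summarize_runs` are hypothetical.

```python
def required_accuracy(function_index):
    """Required accuracy: 1e-2 for f7, 1e-8 for the other test functions."""
    return 1e-2 if function_index == 7 else 1e-8


def summarize_runs(best_values, accuracy):
    """Return (mean best value, success rate) over independent runs."""
    mean_best = sum(best_values) / len(best_values)
    successes = sum(1 for v in best_values if v <= accuracy)
    return mean_best, successes / len(best_values)


mean_best, rate = summarize_runs([0.0, 1e-9, 1e-3, 0.5], required_accuracy(1))
print(rate)  # 0.5
```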

Comparative experiments for BBO with global and local topologies
We first compare our three versions of localized BBO, denoted by RingBBO, SquareBBO and RandBBO respectively, with the basic BBO (for which we use program code from http://academic.csuohio.edu/simond/bbo). For all four algorithms, we set $I = E = 1$. We also set $\pi_{\max}$ to 0.01 for BBO, but to 0.02 for the localized BBO, to increase the chance of mutation since the local topologies decrease the impact of migration. Table 1 presents the mean bests and success rates of the four BBO algorithms on the test functions. As we can see, the mean best fitness values of the three localized BBO algorithms are better than those of BBO on all the functions except $f_7$. In particular, on the high dimensional functions except $f_7$, the mean bests of the localized BBO are typically about 1 to 20 % of those of BBO. The localized BBO also clearly improve the success rates on functions $f_6$ and $f_{14}$. BBO achieves a better mean best fitness only on $f_7$, but in terms of success rate there is no significant difference between BBO and the localized BBO on this function.
We have performed paired t-tests on the mean values of BBO and the localized BBO. As the results show, RingBBO and SquareBBO have significant improvement over BBO on 20 functions, and RandBBO has significant improvement on 21 functions. There are no significant differences between the four algorithms on function $f_{17}$, which is low dimensional and is one of the simplest functions among the test problems. BBO is significantly better only on $f_7$.
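A paired t-test of this kind can be sketched with the Python standard library alone. The sketch assumes that the two algorithms are paired run by run (for example, by random seed), which the text does not state explicitly, and it uses the approximate two-sided 95 % critical value 2.001 for 59 degrees of freedom (60 runs); the function name `paired_t_statistic` is hypothetical.

```python
import math
import statistics


def paired_t_statistic(a, b):
    """Return (t, degrees of freedom) for paired samples a and b."""
    diffs = [x - y for x, y in zip(a, b)]
    n = len(diffs)
    sd = statistics.stdev(diffs)  # sample standard deviation of the differences
    if sd == 0:
        raise ValueError("all paired differences are identical")
    return statistics.mean(diffs) / (sd / math.sqrt(n)), n - 1


T_CRIT_95_DF59 = 2.001  # approximate two-sided critical value, assumed for 60 runs


def significantly_different(a, b, t_crit=T_CRIT_95_DF59):
    """True when |t| exceeds the critical value; only valid for matching df."""
    t, _ = paired_t_statistic(a, b)
    return abs(t) > t_crit
```

For a different number of runs, the critical value has to be taken from a t-table for the corresponding degrees of freedom.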
Moreover, Fig. 3a-m present the convergence curves of the four algorithms on the high dimensional functions $f_1$ to $f_{13}$, respectively. From the curves we can see that, on all the functions except $f_7$, the three localized BBO have obvious advantages in convergence speed. There are no obvious differences between the convergence speeds of the three localized BBO.
The quartic function $f_7$ is the only function on which BBO has a performance advantage over the localized BBO, but the margin is not large. Note that $f_7$ is a noisy function with one minimum, where the noise is a random value uniformly distributed in $[0,1]$. For such a relatively simple function, BBO with the global communication topology is more capable of tolerating noise. However, since RingBBO and SquareBBO have a lower time complexity and $f_7$ is not expensive to evaluate, we observe that RingBBO and SquareBBO still achieve better results than BBO when given the same computational time. For most noise-free functions and for multimodal functions, the local topologies have a much clearer advantage.
On the other hand, there are no statistically significant differences between the three versions of localized BBO. Roughly speaking, RandBBO performs better than RingBBO and SquareBBO, but it consumes slightly more computational cost. On most high dimensional problems, SquareBBO is slightly better than RingBBO in terms of mean best fitness, but RingBBO is better in terms of standard deviation. In other words, all three localized BBO algorithms improve on BBO; the improvement of RandBBO and SquareBBO is more obvious, and RingBBO is expected to be more stable.

Similarly, there are no statistically significant differences between the three versions of localized DE/BBO. However, RingDB performs slightly better than SquareDB and RandDB on some high dimensional problems. This is mainly because the DE search method enhances global information sharing in the population and provides good exploration capability, so a large neighborhood size (as in SquareDB) or a random topology is not very important for the hybrid DE/BBO. A local topology can effectively enhance BBO's exploitation capability, and the combination of DE and BBO with the ring topology can provide a promising search capability.

Conclusion
Biogeography-based optimization is a relatively new bioinspired EA which has proven its quality and versatility on a wide range of optimization problems. Although much research has been devoted to improving the performance of BBO, most of it focuses on combining BBO with operators of other heuristic methods, and thus increases the difficulty of implementation. In this paper we propose a new approach to improve BBO by simply replacing the global topology of BBO with a local topology. Under this scheme, while some individuals in one part of the population interact with each other to search one region of the solution space, another part of the population can explore another region, and information can flow through the neighborhoods at a moderate pace. Thus our method can achieve a much better balance between exploration and exploitation. The main contributions of the paper can be summarized as follows:
1. To the best of our knowledge, it is the first attempt to improve BBO with local topologies, which not only effectively improves the algorithm's search capability and avoids premature convergence, but is also much easier to implement than other hybrid approaches.
2. The localized approach has also been successfully used to greatly improve a hybrid DE/BBO method.
Also, we believe that our approach can be utilized to further improve other enhanced BBO methods, such as that of Li and Yin (2012), and to develop new BBO-based heuristics, including those for constrained, multiobjective, and/or combinatorial optimization.
Currently, we are testing the hybridization of our localized BBO with ES and PSO. Another ongoing work is the parallelization of our BBO algorithms, which is much easier to implement on local topologies than on a global one (Zhu 2010).
In this paper we study three local topologies: ring, square, and random. There are many other interesting neighborhood topologies that are worth studying (Mendes et al. 2004). But we still have an intense interest in the random topology, especially in the dynamic changes of the topology within a single run of the algorithm. For example, how can we adaptively vary the neighborhood size $K$ during the search to achieve a larger performance improvement? Or can we dynamically and randomly change the neighborhood structure, as suggested by variable neighborhood search (Mladenović and Hansen 1997)? All of these questions require deep theoretical analysis and extensive computational experiments.
Open Access This article is distributed under the terms of the Creative Commons Attribution License which permits any use, distribution, and reproduction in any medium, provided the original author(s) and the source are credited.