Improving monarch butterfly optimization through simulated annealing strategy

Recently, a novel meta-heuristic algorithm called monarch butterfly optimization (MBO) was proposed for solving machine learning and continuous optimization problems. It has been shown experimentally that MBO is superior to the artificial bee colony algorithm (ABC), ant colony optimization (ACO), biogeography-based optimization (BBO), differential evolution (DE) and the simple genetic algorithm (SGA) on most test functions. This paper presents a new version of MBO with a simulated annealing (SA) strategy, called SAMBO. The SA strategy is embedded in the migration operator and the butterfly adjusting operator, which gives the proposed algorithm two features. First, it accepts all butterfly individuals whose fitness is better than their parents'. Second, it randomly accepts some individuals that are worse than their parents in order to perturb the convergence of the algorithm. In this way, the SAMBO algorithm can escape from local optima. Finally, experiments are carried out on 14 continuous nonlinear functions, and the results show that SAMBO outperforms the basic MBO algorithm on most test functions.


Introduction
Recently, more and more nature-inspired algorithms (Yang 2014) have been proposed and widely applied in numerous applications, such as path planning, machine learning (Zhou 2016), the knapsack problem (Feng et al. 2015), fault diagnosis (Duan and Luo 2015; Gao et al. 2008) and directing orbits of chaotic systems (Cui et al. 2013). Among all kinds of nature-inspired algorithms, swarm intelligence algorithms and evolutionary algorithms are the most representative ones.
Analyzed in terms of how they cover the search space, nature-inspired algorithms share two major components: exploitation and exploration, also called intensification and diversification. Exploitation mainly uses information from the neighborhood of the current solution to generate new solutions superior to it, and in this process it tends to converge toward a local optimum. The advantage of exploitation is therefore fast convergence, but its equally obvious disadvantage is that it easily gets trapped in local optima. Exploration is the opposite: it searches over the global scope, so the diversity of the sampled search space is better and it can generate new solutions far away from the existing ones. Exploration does not easily fall into local optima, which greatly improves the chance of finding the global optimum, but it converges slowly and requires much computation. To get the best performance out of an algorithm, it is therefore important to balance exploitation and exploration. Swarm intelligence (SI) was originally inspired by the collective behavior of social insects; it was first formally described in the 1999 book "Swarm Intelligence: From Natural to Artificial Systems" by Bonabeau, Dorigo and Theraulaz. The subsequent particle swarm optimization (PSO) (Zhao 2010; Kennedy and Eberhart 2002; Mirjalili et al. 2014), animal migration optimization (AMO) (Li et al. 2014), artificial fish swarm algorithm (Zainal et al. 2015) and monarch butterfly optimization (MBO) belong to this category.
Neighborhood search (NS) (Ahuja et al. 2002) is an incomplete, perturbation-based search method. It starts from a candidate solution and iteratively looks for a better alternative in its neighborhood. The neighborhood is a basic concept in optimization; in a metric space it is a sphere centered on a point. When seeking an extreme value of a continuous function, one can approximate the extremum by choosing the direction in which the independent variable changes and iterating the position within the neighborhood of the current location. Neighborhood search is very practical. First, it maintains only a small set of solutions, so it needs little memory. Second, it usually shows good performance and can find solutions faster than complete search approaches. Finally, it can be applied to combinatorial optimization problems by redefining the concept of a neighborhood in discrete space.
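As a minimal sketch (not taken from any cited paper), the iterative scheme described above can be written as a greedy neighborhood search; the objective and neighborhood in the example are purely illustrative:

```python
def neighborhood_search(cost, start, neighbors, max_iters=1000):
    """Greedy neighborhood search: repeatedly move to the best neighbor
    until no neighbor improves on the current solution."""
    current = start
    for _ in range(max_iters):
        candidate = min(neighbors(current), key=cost)  # best neighbor
        if cost(candidate) >= cost(current):
            break  # local optimum: no neighbor is strictly better
        current = candidate
    return current

# Toy usage (illustrative only): minimize (x - 3)^2 over the integers,
# where the neighbors of x are x - 1 and x + 1.
best = neighborhood_search(lambda x: (x - 3) ** 2, 0, lambda x: [x - 1, x + 1])
```

Because every move must strictly improve the objective, this sketch stops at the first local optimum it reaches, which is exactly the weakness simulated annealing addresses later in the paper.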
Inspired by the migration behavior of monarch butterfly individuals in nature, Wang et al. recently put forward monarch butterfly optimization (MBO). The algorithm is obtained by simplifying and idealizing the migration behavior of the monarch butterfly. It divides the butterfly population into two parts, one for exploitation and the other for exploration. In each iteration, the migration operator and the butterfly adjusting operator are used to generate the new population. MBO is easy to implement, and the two operators can in theory be executed in parallel.
In order to escape local optima, obtain better fitness and allow moves that reduce solution quality, this paper incorporates simulated annealing into MBO and presents an improved version called SAMBO. In SAMBO, simulated annealing is embedded in the migration operator and the butterfly adjusting operator. If a newly generated individual's fitness outperforms its parent's, the algorithm accepts it into the new generation; otherwise, the algorithm accepts it only with a certain probability, whereas basic MBO accepts all newly generated individuals. This new mechanism can speed up convergence and increase the diversity of individuals in the later stages of the search. To demonstrate the advantage of SAMBO, a series of experiments is performed on 14 test instances. The results indicate that SAMBO performs better than basic MBO in finding the optimal solution on almost all the benchmarks. Section 2 reviews the MBO algorithm and simulated annealing. Section 3 discusses how to incorporate the simulated annealing strategy into the migration operator and the butterfly adjusting operator. Section 4 compares MBO and SAMBO on 14 benchmark problems. Finally, Sect. 5 summarizes the algorithm and outlines plans for future work.
Literature review

Monarch butterfly optimization
As one of the most familiar North American butterflies, the monarch butterfly has an orange and black pattern that can be easily recognized (Garber 2013). Male and female monarchs have different wings, which makes it easy to tell them apart. It is well known that every year North American monarchs fly thousands of miles from the USA and southern Canada to overwinter, with populations west of the Rocky Mountains traveling to California and eastern populations traveling to Mexico. The southward movement begins in August and ends with the first frost; then, in the spring, the butterflies move back from the south. Females lay eggs during these migrations to produce offspring. Recent studies have shown that some monarchs exhibit Lévy flight as they migrate or move. By simulating this migration behavior, a new nature-inspired algorithm called monarch butterfly optimization (MBO) has been proposed, mainly for solving continuous optimization problems. In MBO, all monarch butterfly individuals are idealized as living in only two locations, namely southern Canada and the northern United States (Land 1) and Mexico (Land 2). The positions of the monarch butterflies are therefore updated in two ways. First, part of the offspring is generated by the migration operator, whose influence can be adjusted by the migration ratio. Then, the positions of the remaining butterflies are updated by the butterfly adjusting operator. In other words, the search direction of every individual in MBO is mainly determined by the migration operator and the butterfly adjusting operator. The two operators can be applied simultaneously.
Therefore, the monarch butterfly optimization algorithm can in theory be processed in parallel, and it balances intensification and diversification well, which is a very important property in the field of metaheuristic algorithms.
To solve optimization problems, the migration of monarch butterflies is idealized into the following rules in MBO.
1. The butterflies on Land 1 and Land 2 make up the whole population.
2. Every offspring is produced only by individuals on Land 1 or Land 2.
3. The size of the butterfly population remains unchanged.
4. MBO uses an elitist strategy: the individual with the best fitness is preserved.

Migration operator
By simplifying the monarch butterflies' migration process, the numbers of butterfly individuals on Land 1 and Land 2 can be set to NP1 = ceil(p * NP) and NP2 = NP - NP1, respectively, according to the time spent on each land. Here, ceil(x) rounds x up to the nearest integer; NP is the total number of butterfly individuals; and p is the ratio of butterfly individuals on Land 1. The migration process can be described as follows. For the kth element of butterfly i at generation t + 1,

x_{i,k}^{t+1} = x_{r1,k}^{t}, (1)

where x_{r1,k}^{t} is the kth element of a butterfly r1 chosen at random from Land 1, and this update is applied when r <= p, with r computed as

r = rand * peri, (2)

where rand is a random number drawn from a uniform distribution and peri, set to 1.2 in basic MBO, is the migration period. When r > p, x_{i,k}^{t+1} is updated as

x_{i,k}^{t+1} = x_{r2,k}^{t}, (3)

where x_{r2,k}^{t} is the kth element of x_{r2} at generation t, and butterfly r2 is chosen at random from Land 2.
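The migration operator of Eqs. (1)-(3) can be sketched in Python as follows; the representation of individuals as plain lists of floats is an assumption for illustration, not the paper's implementation:

```python
import random

def migration_operator(land1, land2, p=5/12, peri=1.2):
    """MBO migration operator (Eqs. 1-3): each element of a new Land 1
    individual is copied from a randomly chosen butterfly in Land 1
    (when r <= p) or Land 2 (when r > p)."""
    dim = len(land1[0])
    offspring = []
    for _ in range(len(land1)):
        child = []
        for k in range(dim):
            r = random.random() * peri                          # Eq. (2)
            donor = random.choice(land1 if r <= p else land2)   # Eqs. (1), (3)
            child.append(donor[k])
        offspring.append(child)
    return offspring
```

Note that every element of every child is copied from some existing butterfly, so the operator recombines but never invents coordinate values.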

Butterfly adjusting operator
For all elements of monarch butterfly j, if rand <= p, the new element is generated as

x_{j,k}^{t+1} = x_{best,k}^{t}, (4)

where x_{j,k}^{t+1} is the kth element of x_j at generation t + 1 and x_{best,k}^{t} is the kth element of x_best, the individual with the best fitness in the whole population. If rand > p, it is generated as

x_{j,k}^{t+1} = x_{r3,k}^{t}, (5)

where x_{r3,k}^{t} is the kth element of x_{r3}, and butterfly r3 is chosen at random from Land 2.
In addition, in this case, if rand > BAR, the element is further updated as

x_{j,k}^{t+1} = x_{j,k}^{t+1} + alpha * (dx_k - 0.5), (6)

where BAR is the butterfly adjusting rate and dx is the walk step of butterfly j, computed through Lévy flight as

dx = Levy(x_j^t). (7)

The weighting factor alpha is given by

alpha = S_max / t^2, (8)

where S_max is the maximum walk step.
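A sketch of the butterfly adjusting operator of Eqs. (4)-(8) follows. The `levy_step` helper is an assumed, simplified heavy-tailed walk, not necessarily the exact Lévy-flight formula used in the original MBO paper:

```python
import math
import random

def levy_step(dim, trials=10):
    # Assumed simplification: a heavy-tailed step built from tan(pi * U);
    # the MBO paper's exact Levy-flight formula may differ.
    return [sum(math.tan(math.pi * random.random()) for _ in range(trials))
            for _ in range(dim)]

def butterfly_adjusting(land2, x_best, t, p=5/12, BAR=5/12, S_max=1.0):
    """MBO butterfly adjusting operator (Eqs. 4-8) for Land 2 butterflies."""
    dim = len(x_best)
    alpha = S_max / (t ** 2)                      # Eq. (8), weighting factor
    offspring = []
    for _ in range(len(land2)):
        dx = levy_step(dim)                       # Eq. (7), walk step
        child = []
        for k in range(dim):
            if random.random() <= p:
                val = x_best[k]                   # Eq. (4): copy from best
            else:
                val = random.choice(land2)[k]     # Eq. (5): copy from Land 2
                if random.random() > BAR:
                    val += alpha * (dx[k] - 0.5)  # Eq. (6): Levy perturbation
            child.append(val)
        offspring.append(child)
    return offspring
```

Since alpha shrinks as 1/t^2, the Lévy perturbation in Eq. (6) is large in early generations and fades as the search converges.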
Although MBO can obtain good best solutions, it performs poorly in terms of average solution quality. We therefore propose a new method that can escape from local optima to obtain better values; Sect. 4 gives the experimental evidence.

Simulated annealing
Simulated annealing (SA) was first proposed by Metropolis et al. in 1953 and applied to combinatorial optimization problems by Kirkpatrick et al. in 1983. SA mimics the annealing process of solid matter, with the energy function expressed in the form of the objective function. The simulated annealing strategy searches for the system state with the lowest energy, and as the temperature decreases it selects neighboring states of higher energy with a certain probability. This makes SA completely different from a greedy strategy and greatly enhances its ability to escape from local optima.
One limitation of neighborhood search methods is that only local knowledge, the neighborhood N(s) of the solution s, is considered at each step, so these methods are prone to falling into local optima. To avoid this, SA accepts a worsening move m in N(s) with a probability that depends on Delta f(s, m): the larger the loss of solution quality caused by the move, the less likely the move is to be chosen, while a move that improves the objective value is always selected. The probability of selecting a move m is calculated as (Urli 2015)

P(m | s, temp) = exp(-Delta f(s, m) / temp) if Delta f(s, m) > 0, and 1 otherwise. (9)

To control how often worse moves are accepted, a temperature parameter temp is introduced and initialized to t0, which can be calculated heuristically or tuned for the specific problem. As time passes, temp is updated continually according to a cooling schedule. A random number r' is drawn uniformly from (0, 1); if r' < P(m | s, temp), the move m is adopted, otherwise it is refused. At the start of the search, when temp is large, many worse moves are adopted. As temp declines, the algorithm accepts fewer and fewer such moves, and in the last stage of the search it behaves like standard hill climbing (Sun et al. 2018b).
The usual method to reduce temp is a geometric cooling schedule: choose a cooling rate alpha (0 < alpha < 1) and generate the new temperature as

temp = alpha * temp (10)

after a fixed number of moves, neighbors_sampled_max, have been evaluated for acceptance. The main components of simulated annealing are shown in Fig. 1.
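The acceptance rule of Eq. (9) and the cooling schedule of Eq. (10) can be sketched directly, assuming a minimization objective:

```python
import math
import random

def sa_accept(delta_f, temp):
    """Metropolis criterion of Eq. (9): always accept improving moves,
    and accept worsening moves with probability exp(-delta_f / temp)."""
    if delta_f <= 0:                       # move improves the objective
        return True
    return random.random() < math.exp(-delta_f / temp)

def cool(temp, alpha=0.95):
    """Geometric cooling schedule of Eq. (10): temp <- alpha * temp."""
    return alpha * temp
```

At high temperature almost any move passes the test, while at low temperature `exp(-delta_f / temp)` vanishes for any meaningful loss, recovering pure hill climbing.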
Customarily, the search stops when temp < t_min (a value set by the experimenter), but here we control the stop condition through the number of generations of the monarch butterfly algorithm.

Research methodology
MBO is a newly proposed nature-inspired meta-heuristic algorithm. Although it has shown its advantage on some benchmark problems, it may fail to obtain good average values. In this paper, the basic MBO algorithm is combined with simulated annealing to enhance its performance. The main structure of the SAMBO method is given below.
The butterfly individuals on Land 1 and Land 2 are updated according to Eqs. (1)-(3) and Eqs. (4)-(8), respectively. If the fitness of a newly generated individual is better than its parent's, it is accepted. Otherwise, P(m | s, temp) is computed by Eq. (9); if a random number r' is less than P(m | s, temp), the generated individual is also accepted, and otherwise the algorithm rejects it. This is the difference between simulated annealing and a greedy strategy. Based on the above analysis, the updated migration operator and the updated butterfly adjusting operator are shown in Algorithm 1 and Algorithm 2, and the main framework of SAMBO is described in Algorithm 3. It matters when the simulated annealing strategy is introduced: if SA starts too early, it harms the convergence of the algorithm; if it starts too late, the algorithm has already fallen into a local optimum and simulated annealing has no effect. The initial temperature and cooling rate also strongly influence the final function value, so there are three parameters to be tuned. The algorithm flow chart is shown in Fig. 2.
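The replacement step described above can be sketched as a survivor-selection filter; the sphere objective used in the usage note is an illustrative assumption, not one of the paper's benchmarks in particular:

```python
import math
import random

def sa_filter(parents, children, fitness, temp):
    """SAMBO-style survivor selection (a sketch): keep a child if it beats
    its parent; otherwise keep it only with Metropolis probability."""
    survivors = []
    for parent, child in zip(parents, children):
        delta = fitness(child) - fitness(parent)    # minimization
        if delta <= 0 or random.random() < math.exp(-delta / temp):
            survivors.append(child)                 # accept the child
        else:
            survivors.append(parent)                # keep the parent
    return survivors

def sphere(x):
    # Illustrative objective (assumption): the sphere function.
    return sum(v * v for v in x)
```

For example, `sa_filter([[1.0, 1.0]], [[0.0, 0.0]], sphere, 1.0)` always keeps the improving child, while at a near-zero temperature a much worse child is almost surely rejected in favor of its parent.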

Dataset descriptions and parameters setting
In this section, to investigate the advantage of SAMBO, the fourteen high-dimensional benchmark functions shown in Table 1 are adopted to evaluate the improved algorithm. Among these benchmark functions, some are multimodal, meaning they have multiple local minima; some are nonseparable, meaning they cannot be written as a sum of functions of individual variables; and some are regular, meaning they are analytical (differentiable) at each point of their domain. Each function in this study has 20 independent variables. The functions are summarized in Table 1; more information about them, including their domains, can be found in Back (1996), Yao et al. (1999), and Cai and Wang (2006). All tests are implemented under the same conditions: MATLAB R2014b on a Win7 64-bit Intel i7-3770 processor with 8 GB RAM.
The performance of SAMBO was compared with MBO, probability-based incremental learning (PBIL) and particle swarm optimization (PSO). The shared parameters of the MBO and SAMBO methods are set as follows: max step S_max = 1.0, butterfly adjusting rate BAR = 5/12, migration period peri = 1.2, migration ratio p = 5/12, population size NP = 50, and maximum generation MaxGen = 50. Accordingly, the numbers of monarch butterfly individuals on Land 1 and Land 2 are 21 and 29, respectively. The parameter settings of the other algorithms can be found in (Dan 2008).
In order to get the greatest advantage from the simulated annealing strategy, it is worth considering at which generation simulated annealing should be introduced. The simulated annealing process itself has two parameters, the initial temperature and the cooling rate, and different settings lead to different results, so all three parameters influencing the function value need to be tuned. In the experiments, simulated annealing is introduced at generation 10, 15 and 20, the initial temperature is set to 200, 500 and 1000, and the cooling rate is set to 0.9, 0.95 and 0.99, giving 27 combinations of the three parameters. We run all combinations on each function, and to reduce the effect of randomness each experiment is run two hundred times independently. The SAMBO results shown in the tables correspond to the best case among all combinations of the three parameters; each value recorded is the average over the 200 runs, and the optimum of the best and mean solutions for each function is shown in bold. In the paper that proposed the MBO algorithm, the authors already compared MBO with the ABC, ACO, BBO, DE and SGA algorithms and concluded that MBO is better than these algorithms in most cases, so we do not repeat that comparison here. In the experiments we therefore compare the proposed SAMBO algorithm only with MBO, PBIL and PSO, for benchmark function dimensions of 20, 40 and 80. The experimental results show that, in most cases, the proposed SAMBO algorithm is superior to the other algorithms in terms of both best and mean solutions; see Tables 2, 3 and 4 for details.
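The 3 x 3 x 3 parameter grid described above can be enumerated directly; the variable names are illustrative:

```python
from itertools import product

# The three tuned SA parameters and their tested values (from the text).
intro_gens    = [10, 15, 20]       # generation at which SA is introduced
init_temps    = [200, 500, 1000]   # initial temperature t0
cooling_rates = [0.9, 0.95, 0.99]  # geometric cooling rate

grid = list(product(intro_gens, init_temps, cooling_rates))
# 27 parameter combinations; each is run 200 times per benchmark function.
```

Each tuple in `grid` would then drive one batch of 200 independent runs per benchmark function.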

Experimental results with benchmark functions of 20 dimensions
The dimension of the benchmark functions is set to 20, and the number of iterations, set to 50, is taken as the stop condition. Table 2 shows the experimental results obtained by MBO, SAMBO, PBIL and PSO. From the best and mean values in Table 2, there is no ambiguity in concluding that SAMBO finds better values than the other algorithms on most functions, except for the best values of F04, F06 and F11 and the mean value of F13. It can be observed that even though the best values obtained by SAMBO are only slightly better than those of the three other methods on most test functions, SAMBO significantly outperforms MBO, PBIL and PSO in finding the mean values. For F04 and F05, the mean values obtained by SAMBO are smaller than those found by MBO by about 6 times and 7.7 times, respectively, and about 140 times smaller than those of PBIL and PSO. For F08 and F10, SAMBO's solutions are about 30 times smaller than those of MBO, PBIL and PSO. On F02 and F06, SAMBO has a small advantage over the other algorithms, and it also has a slight advantage over MBO, PBIL and PSO on functions such as F01, F03, F07, F09 and F12. It is worth noting that MBO and PBIL perform very poorly on the mean value of F11. For the best solutions, SAMBO performs well on all test functions except F04, F06 and F11.

Experimental results with benchmark functions of 40 dimensions
The dimension of the benchmark functions is set to 40, and the number of iterations, set to 50, is taken as the stop condition. Table 3 shows the experimental results. Especially for the mean value of F11, SAMBO is better than the others by about 27 orders of magnitude. For F04, F05 and F08, the average values obtained by SAMBO are 4 times smaller than those of MBO, and about 37, 28 and 13 times better, respectively, than those of PBIL and PSO. SAMBO also has a small advantage over MBO, PBIL and PSO on functions such as F01, F02, F03, F07, F09, F13 and F14. For F06, PSO performs best. Although MBO finds better best and mean values on F12, the results obtained by the different algorithms do not differ much there.

Experimental results with benchmark functions of 80 dimensions
The dimension of the benchmark functions is set to 80, and the number of iterations, set to 50, is taken as the stop condition. This time only five test functions (F02, F04, F05, F08 and F10) are chosen, because the advantage of SAMBO can be seen clearly in the convergence graphs for these functions. Table 4 shows the experimental results obtained by MBO, SAMBO, PBIL and PSO. Since this paper primarily demonstrates the improvement over MBO, we only provide the convergence curves of MBO and SAMBO on F02, F04, F05, F08 and F10. From Table 4, there is no ambiguity in concluding that SAMBO outperforms MBO, PBIL and PSO; the advantage of SAMBO over MBO on these five functions is revealed in Figs. 3, 4, 5, 6 and 7. Figure 3 shows the convergence curves of SAMBO and MBO on F02: the curve of MBO flattens quickly, while the curve of SAMBO keeps going down, so the performance of SAMBO is much better than that of MBO. Figure 4 demonstrates the convergence process on F04: except at generation 6, where the fitness obtained by MBO is better than that of SAMBO, the curve of SAMBO stays below MBO's, their gap grows over time, and although the two curves eventually become parallel, the final fitness of SAMBO is much smaller than MBO's. Figure 5 illustrates the convergence trajectories on F05: although the initial value of SAMBO is worse than MBO's, SAMBO converges earlier, and ultimately its curve is much lower than MBO's. Figure 6 reveals the convergence history on F08: as in Fig. 5, SAMBO starts from a worse initial value but achieves the better fitness in the end. Figure 7 displays the convergence tracks on F10.
It can be clearly observed that the convergence amplitude of SAMBO is greater, and the final fitness found by SAMBO is much smaller than the value MBO finds, even though SAMBO's fitness is much larger than MBO's at the start.

Experiment on real world dataset
The vehicle routing problem (VRP) (Elhassania et al. 2014) is one of the most challenging combinatorial optimization problems. Defined more than 40 years ago, it consists of designing an optimal set of routes for a fleet of vehicles to serve a known set of customers, and it is important both for its practical applications and for its difficulty. A company may have several depots serving its customers. If the customers are clustered around the depots, the allocation problem can be modeled as several independent VRPs; however, if the customers and depots are intermingled, the problem should be solved as a multi-depot vehicle routing problem (Renaud et al. 1996).
We use the proposed SAMBO algorithm to solve the multi-depot vehicle routing problem, and to demonstrate the efficiency of the new algorithm we also solve the same problems with the traditional MBO algorithm and a genetic algorithm. In the experiment, the performance of the MBO algorithm, the genetic algorithm and the SAMBO algorithm is tested on six multi-depot vehicle routing problem instances; Table 5 shows their specific information. Besides the number of customers, the number of depots and the vehicle capacity, each instance also stores the geographic coordinates of every customer and depot; from these coordinates, the length of a route can be calculated and used as its cost. Table 6 shows the solutions found by the three algorithms for the six problems: the cost is the total length of the routes found, and the running time is the time required to run the algorithm once, in seconds (s). The italics represent the optimal solutions, and they clearly show that SAMBO found the best solution for all six problems. Compared with the traditional genetic algorithm, both MBO and SAMBO perform better. On the first four problems (P01, P02, P03 and P04), the cost values of the genetic algorithm are about 3 times those of MBO and SAMBO; for P02 in particular, the cost value of the genetic algorithm is 4 times that of SAMBO. Moreover, the routes found by SAMBO are indeed less expensive than those found by MBO, indicating that the algorithm improves on basic MBO, and as the size of the problem increases, the gap between the solutions becomes larger. For problems P05 and P06, the route cost found by SAMBO is more than 8000 and 2000 less than those found by the genetic algorithm and the MBO algorithm, respectively.
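The coordinate-based cost described above can be sketched as a Euclidean route length; treating the route as a closed tour is an assumption here, since the benchmark files may define the cost differently:

```python
import math

def route_length(points):
    """Total Euclidean length of a route visiting the given (x, y)
    coordinates in order and returning to the starting point (the closed
    tour is an illustrative assumption)."""
    total = 0.0
    for (x1, y1), (x2, y2) in zip(points, points[1:] + points[:1]):
        total += math.hypot(x2 - x1, y2 - y1)  # straight-line leg length
    return total
```

For instance, a route through (0, 0), (3, 0) and (3, 4) forms a 3-4-5 right triangle, so its closed-tour length is 12.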
This shows that as the size of the problem increases, the advantages of the SAMBO algorithm become more and more obvious. Through experiments, it is proved that SAMBO algorithm is effective in solving the multi-warehouse vehicle routing problem.

Conclusion and future work
In recent years, more and more swarm intelligence algorithms have been proposed and applied in many different areas. MBO is a newly proposed meta-heuristic algorithm for continuous optimization problems, and it has been shown to outperform five other meta-heuristic algorithms on most test problems. However, MBO's ability to find good average values is poor on certain test functions, so the simulated annealing strategy is incorporated into the MBO algorithm to escape local optima, yielding a novel version of MBO with simulated annealing called SAMBO. In MBO, all updated individuals are accepted; in the SAMBO method, besides accepting the updated butterflies whose fitness is better than their parents', the algorithm can accept worsening new individuals with a certain probability once the simulated annealing strategy is introduced. This ensures that outstanding individuals are preserved while the search space is perturbed to seek better solutions. The experimental results demonstrate that SAMBO can find better solutions than MBO, PBIL and PSO, especially in terms of average values, on almost all the benchmark problems (Sun et al. 2018b, a). But there are limits, and three points can guide future research. First, the current experiments are conducted on only 14 benchmark problems; more benchmarks should be used to evaluate SAMBO. Second, a self-adaptive way of tuning the three parameters could be studied (Liu et al. 2017; Zheng et al. 2017); if such a method is worked out, it will be convenient to apply SAMBO to whatever problems we want to handle. Finally, SAMBO currently solves only single-objective problems; it could be extended to deal with multi-objective problems.