Matching island topologies to problem structure in parallel evolutionary algorithms
Abstract
In the context of Parallel Evolutionary Algorithms, it has been shown that different population structures induce different search performances. Nevertheless, no work has shown clear-cut evidence that there is a correlation between the solver’s population structure and the problem’s network structure. In this work, we verify this correlation by performing a clear and systematic analysis of a large set of population structures (based on the well-known β-graphs) and NK-landscape problems. Furthermore, we go beyond our findings in these idealised experiments by analysing the performance of variable-topology EAs on a dynamic real-world problem, the Multi-Skill Call Centre.
Keywords
Parallel Evolutionary Algorithm, Island model, Problem structure
1 Introduction
Network-based approaches are a powerful tool for understanding the properties of complex systems (including optimisation dynamics by means of metaheuristic methods) and represent a useful data structure for capturing information about processes taking place at multiple temporal or spatial scales. The paper by Michell (1904) is widely regarded as one of the earliest industrial papers to recognise the crucial role that “topology” plays when solving a specific problem. Since then, of course, many other studies have emerged that profoundly changed our understanding of the role of networks in complex systems. Indeed, the crucial work by Watts and Strogatz (1998) on small-world (SW) networks launched the field of network science in earnest and was followed by rapid advances by, e.g., Barthelemy and Amaral (1999), Barabási and Albert (1999), Newman et al. (2000), Wang and Chen (2003) and Barrat and Weigt (2000), among others.
The small-world networks described by Watts and Strogatz (1998) raised a great deal of interest in different research areas, as they postulated that apparently different networks arising in biological, social or technological systems had, at their core, some common characteristics that helped to organise the universe of possible networks into well-defined classes with well-defined features. For example, some naturally occurring small-world networks present a high clustering coefficient and yet a small characteristic path length that enables the rapid percolation of information across the network. Later, Barabási and Albert (1999) suggested that the distribution of highly connected vertices in networks such as the WWW or the citations of scientific publications is far from random. In fact, in the so-called scale-free networks, vertex connectivities follow a scale-free power-law distribution, meaning that a small set of vertices dominates the connectivity of the network. This feature is a consequence of two generic mechanisms: networks expand continuously by the addition of new vertices, and new vertices attach preferentially to nodes that are already well connected.
Advances in network science resonated well with evolutionary algorithm research, specifically with work on parallel and cellular evolutionary algorithms, where structured populations were introduced that modified the dynamics of information exchange (e.g., through genetic recombination, solution migration policies or memetic transmission) within a population. It is currently accepted that, in contrast to panmictic populations, decentralised populations give the evolutionary algorithm the opportunity for a better exploration of the search space and can improve both the numerical and runtime behaviour of the algorithm. For example, in Cantú-Paz (1999), a relation was established between the network topology, the deme size, the migration rate, and the efficacy of a given algorithm for some idealised problems. Moreover, Cantú-Paz (1999) showed that the choice of migration and replacement strategies affected the takeover time within multi-population Genetic Algorithms (e.g. choosing migrants or replacements according to fitness increases global selection pressure and causes faster convergence; in turn, shortened convergence times, although desirable, may also constitute a potential source of failure due to the premature loss of diversity).
Evolutionary algorithms that embraced a refined population structure can be roughly classified into two main families, namely, the Cellular Evolutionary Algorithm (CEA) in which genetic interactions may only take place in a small neighbourhood that is defined around each individual and Parallel Evolutionary Algorithms (PEA), also known as Distributed Evolutionary Algorithms, in which a single population is partitioned into several subpopulations or “islands” that exchange individuals according to a given migration policy. The concept of migration policy was further formalised in Alba and Tomassini (2002) as a tuple of five values indicating the migration rate, the frequency of migration, the policy for selecting migrants, the replacement policy, and whether or not the migration is synchronous. The authors showed that the balance between exploration and exploitation and hence, the probability of success of a given algorithm, was directly affected by the migration policy.
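To make the five-value migration policy concrete, the following is a minimal sketch of how such a tuple might be represented. The field names, the string encodings of the selection and replacement policies, and the example values are illustrative assumptions and are not taken from Alba and Tomassini (2002).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MigrationPolicy:
    """Five-value migration policy (field names are illustrative assumptions)."""
    rate: float        # fraction of an island's population that migrates
    frequency: int     # generations between two consecutive migrations
    selection: str     # how migrants are chosen, e.g. "best" or "random"
    replacement: str   # whom they replace on arrival, e.g. "worst" or "random"
    synchronous: bool  # True if all islands migrate at the same time

    def migrants_per_event(self, island_size: int) -> int:
        """Number of individuals sent at each migration event (at least one)."""
        return max(1, round(self.rate * island_size))


# Hypothetical example values, chosen only for illustration.
policy = MigrationPolicy(rate=0.1, frequency=10, selection="best",
                         replacement="random", synchronous=True)
```

Keeping the policy immutable makes it easy to use as a key when results for different policies are tabulated.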
In Giacobini et al. (2006), the properties of CEAs with populations structured as Watts–Strogatz small-world graphs and Albert–Barabási scale-free graphs were investigated using benchmark problems of different difficulty. Their results showed that small-world topologies allow for a trade-off between robustness and speed of the search. In terms of success rate, these topologies often behaved better than the panmictic case but with slower convergence rates. On the other hand, scale-free topologies did not seem to be appropriate for the given benchmarks, probably due to premature convergence problems.
Lattice topologies were also explored for memetic algorithms, where a cellular memetic algorithm was used to successfully solve a range of standard continuous optimisation benchmark problems (Quang et al. 2009), while Woolley et al. (2011) tailored the approach for a real-world optimisation of parameters for scanning probe microscopy.
The work presented in Wang et al. (2011) provides a systematic study of the performance of a PEA under a number of simple topologies including a single population (effectively a panmictic EA), a set of distributed populations without links between them, paired populations, two-layered lattice connections and a fully connected topology. On these topologies they ran the 0–1 Knapsack and the Weierstrass Function minimisation problems. The results showed that the two-layered lattice connections and the fully connected topology outperformed the others as the complexity of the considered problems increased.
A different approach was taken by Whitacre et al. (2008) who, rather than fixing the population topology, allowed it to co-evolve with the solutions being sought for the target problem.
Network-centric perspectives have benefitted other metaheuristics too. For example, Kennedy and Mendes (2002) analysed the effect of different topologies on particle swarm optimisation. They showed that for PSO, some random networks achieve outstanding results while the commonly used structures (Fully Connected and Ring) correspond to sub-optimal solutions. In fact, the evolutionary search process might benefit from some topological properties of these random structures, resulting in a good balance between exploration and exploitation. The paper by Li et al. (2009) presents an extensive study of the topology space, in which the effect of 1,200 different network topologies is analysed in the context of self-assembling programs. The cited work studies a wide range of graph topologies, covering simple reticular structures, small-world networks and fully random networks. It is shown that different topologies and average interconnection distances within the network have an influence on the software self-assembly process, along with the resulting complexity and diversity of the generated programs.
The above summary, albeit by necessity only partial, reflects the variety of works that have been undertaken on the amalgamation of Evolutionary Algorithms and Network Science. Notably, and notwithstanding (a) the diversity and quality of the work done in the past and (b) that it is currently beyond doubt that different network structures induce different search performance, no work has shown clear-cut evidence that there is a correlation between the solver’s population structure and the problem’s network structure. It is this correlation that we seek to verify by a set of idealised, simple and clear experiments based on the well-known β-graphs (as the solvers’ population structure) and the NK-landscapes as the problem network structure. Furthermore, we go beyond our findings in these idealised experiments by analysing the performance of variable-topology EAs on a real-world problem.
The rest of the paper is structured as follows: Sect. 2 explains the methodology of the experiments. Experimental results are detailed in Sect. 3 together with the discussion. Section 4 concludes the paper.
2 Methodology
In this section we describe the benchmarks used to ascertain whether there are correlations between problem structures and the topologies used to interconnect a set of population islands. We describe first the problems used to benchmark the different topologies and the evolutionary algorithms employed.
2.1 Benchmarks
We have used three different problems to ascertain whether correlations exist between problem structures and population topologies. The first two types of benchmarks are idealised problems, OneMax and NK-landscapes, that allow for a precise control of the problem structure. The last benchmark is a real-world, dynamic and stochastic problem that is used to evaluate whether the findings uncovered with the idealised problems scale up to more realistic scenarios. We describe each of these benchmarks in detail next.
2.2 OneMax benchmark
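As a minimal sketch, assuming the standard definition of OneMax (the fitness of a bit string is the number of ones it contains), the benchmark can be written as follows. The function name and the constant are illustrative; the instance size is the one reported in the experimental setup.

```python
def onemax(bits):
    """Return the OneMax fitness: the number of 1 bits in the string."""
    return sum(bits)


N_BITS = 5000  # instance size reported in the experimental setup
```

The global optimum is the all-ones string, whose fitness equals the string length.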
2.3 NK-landscape benchmark
It has been shown that the neighbourhood structure determines the complexity of the problem. In fact, if the structure used is that of adjacent neighbours then the problem can be solved in polynomial time. On the other hand, if the neighbours are chosen at random the problem can be NP-hard (Weinberger 1996). Furthermore, the landscapes can be tuned from smooth to rugged by increasing the value of the parameter K. Krasnogor and Gustafson (2004) and Krasnogor (2004) showed that it was possible to evolve local searchers for Memetic Algorithms that would match the structure of the problem being solved, in particular, NK-landscape instances. In this paper we evaluate whether one can match different island topologies to different instances of the NK-landscape. Thus, we study four different scenarios: low epistasis and poly-time solvable, high epistasis and poly-time solvable, low epistasis and NP-hard, and high epistasis and NP-hard instances of the NK-landscape problem, as done by Krasnogor and Gustafson (2004).
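The difference between adjacent and random neighbourhoods can be illustrated with a short sketch. It assumes the usual formulation of the NK-landscape, in which the fitness is the average of N per-locus contributions, each depending on the locus itself and on K other loci. The function names, the circular genome and the lazily filled contribution tables are our own illustrative choices, not a description of the generator used in the experiments.

```python
import random


def make_nk_landscape(n, k, adjacent=True, seed=0):
    """Build neighbour lists and (lazily filled) contribution tables."""
    rng = random.Random(seed)
    neighbours = []
    for i in range(n):
        if adjacent:
            # The K loci that follow locus i on a circular genome.
            nbrs = [(i + j) % n for j in range(1, k + 1)]
        else:
            # K loci drawn at random among the other n - 1 loci.
            nbrs = rng.sample([j for j in range(n) if j != i], k)
        neighbours.append(nbrs)
    # One lookup table per locus; entries are created on first use so that
    # large K does not require storing 2**(K + 1) values up front.
    tables = [dict() for _ in range(n)]
    return neighbours, tables, rng


def nk_fitness(bits, landscape):
    """Average the per-locus contributions f_i(x_i, x_i1, ..., x_iK)."""
    neighbours, tables, rng = landscape
    total = 0.0
    for i, nbrs in enumerate(neighbours):
        key = (bits[i],) + tuple(bits[j] for j in nbrs)
        if key not in tables[i]:
            tables[i][key] = rng.random()  # uniform contribution in [0, 1)
        total += tables[i][key]
    return total / len(bits)
```

Increasing `k` makes each contribution depend on more loci, which is what makes the landscape more rugged.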
2.4 Systematic topologies via β-graphs for idealised problems
In order to simply and clearly assess whether different island topologies could better serve different problem structures (i.e. different N and K in the NK-landscape benchmark) we use a family of graph models, β-graphs, which were proposed to analyse small-world phenomena (Watts and Strogatz 1998), as a systematic source of island topologies. The question Watts tries to answer can be briefly stated as: what are the most general conditions under which the elements of a large, sparsely connected network will be close to each other? The closeness of vertices is determined by the length property of the graph, which has been an active research area and has been studied on different problem classes, for example the performance of computer networks and telecommunication networks; more recently, we showed that these graph topologies greatly impact the diversity of generated programs in a “GP-like” setting (Li et al. 2006, 2009).
β-Graphs capture a variety of network topologies, from a highly ordered to a completely random graph. Three parameters are used to define the properties of graphs generated under the β-graph model, namely, n, representing the number of vertices in the graph; k, determining how many initial nearest neighbours each vertex has; and, finally, β, which defines the rewiring rate.
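The role of the three parameters can be sketched as follows. The sketch assumes the usual Watts–Strogatz construction (a ring lattice whose edges are rewired with probability β) and treats k as k // 2 neighbours on each side; both are assumptions, since the exact generator is not specified here, and the function name is hypothetical.

```python
import random


def beta_graph(n, k, beta, seed=0):
    """Return an adjacency dict {vertex: set(neighbours)} for a beta-graph.

    Assumed construction: ring lattice with k // 2 neighbours on each side,
    then each lattice edge is rewired with probability beta.
    """
    rng = random.Random(seed)
    adj = {v: set() for v in range(n)}
    lattice_edges = []
    for v in range(n):
        for step in range(1, k // 2 + 1):
            w = (v + step) % n
            if w != v and w not in adj[v]:
                adj[v].add(w)
                adj[w].add(v)
                lattice_edges.append((v, w))
    for v, w in lattice_edges:
        if rng.random() < beta:
            # New endpoints must differ from v and not already be linked to v.
            candidates = [u for u in range(n) if u != v and u not in adj[v]]
            if candidates:
                u = rng.choice(candidates)
                adj[v].discard(w)
                adj[w].discard(v)
                adj[v].add(u)
                adj[u].add(v)
    return adj
```

With β = 0 the result is the ordered ring lattice, and with β = 1 every lattice edge is rewired, which approaches a random graph with the same number of edges.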
Characteristic path length (abbreviated as CPL in the remainder of this paper) is one of the most important statistics used to measure the shortest distance between each pair of vertices (i, j) in a graph. The formal definition of CPL is given by Watts and Strogatz (1998) as “The characteristic path length (CPL) of a graph (G) is the median of the means of the shortest path lengths connecting each vertex \(v \in V(G)\) to all other vertices. That is, calculate \(d(v, v_j) \, \forall \, v_j \in V(G)\) and find \(\overline{d_v}\) for each v. Then define L as the median of \(\{\overline{d_v}\}\).” Furthermore, for each graph one can also define the clustering coefficient (abbreviated as CC in the remainder of this paper) as the degree to which a vertex’s neighbours are also neighbours of each other: \(\text{CC}_v = \frac{E(v)}{\binom{k_v}{2}},\) where \(E(v)\) is the number of edges between the neighbours of v and \(k_v\) is the number of neighbours of v, so that \(\binom{k_v}{2}\) is the maximum possible number of such edges.
In Fig. 2, for reference, we show the CPL and CC for the β-graphs used later in our experiments.
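Both statistics can be computed directly from an adjacency structure. The sketch below is one possible implementation, taking \(E(v)\) to be the number of edges among the neighbours of v and \(k_v\) the number of those neighbours; the graph-level CC is taken as the average over vertices, and vertices with fewer than two neighbours are given a coefficient of 0. These conventions, and the function names, are assumptions.

```python
from collections import deque
from statistics import mean, median


def characteristic_path_length(adj):
    """Median over vertices of the mean shortest-path length to all others."""
    means = []
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:  # breadth-first search from source
            v = queue.popleft()
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
        others = [d for u, d in dist.items() if u != source]
        if others:  # unreachable vertices are ignored
            means.append(mean(others))
    return median(means) if means else float("inf")


def clustering_coefficient(adj):
    """Average over vertices of E(v) / C(k_v, 2)."""
    values = []
    for v, nbrs in adj.items():
        k_v = len(nbrs)
        if k_v < 2:
            values.append(0.0)  # assumed convention for k_v < 2
            continue
        links = sum(1 for a in nbrs for b in nbrs if a < b and b in adj[a])
        values.append(links / (k_v * (k_v - 1) / 2))
    return mean(values)


# Ring lattice with 6 vertices, each linked to its 4 nearest neighbours.
ring = {v: {(v + s) % 6 for s in (-2, -1, 1, 2)} for v in range(6)}
```

For this small ring lattice every vertex reaches four vertices in one step and one in two steps, so the CPL is 1.2, and four of the six possible edges among each vertex’s neighbours exist, so the CC is 2/3.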
2.5 A real-world benchmark problem: dynamic optimisation in a Multi-Skill Call Centre
The two idealised benchmarks described above, together with a systematic topology generation through β-graphs, must be complemented with a real-world problem for which, a priori, one has no possibility of pre-determining an optimal population structure. We will use the problem described below to investigate (later in the paper) whether it is, in principle, possible to let the optimisation algorithm (in our case a Genetic Algorithm) choose the best topology as the dynamic problem unfolds. Thus, this benchmark will be used to ascertain whether different topologies are better at different stages of the search process under a dynamic optimisation scenario.
In a Multi-Skill Call Centre (MSCC), there are n incoming customer calls \(C=\{c_{1},c_{2},\ldots,c_{n}\}\) grouped in k call groups \(\text{CG}=\{cg_{1},cg_{2},\ldots,cg_{k}\}\) according to the call type, and m agents \(A=\{a_{1},a_{2},\ldots,a_{m}\}\) who each have a subset of all the possible skills \(S=\{s_{1},s_{2},\ldots,s_{k}\}\) to attend the corresponding call groups (having the skill \(s_i\) enables an agent to attend the call group \(cg_i\)). Not all the agents have the same skill set, and the number of skills per agent differs. Agents can only attend the call groups they have been trained for. This implies that each agent can attend different call types and that a given call type can be answered by several agents who have the associated skill. Note that agents cannot attend every kind of customer call, as they are usually specialised in concrete tasks (they do not have the complete skill set) or are sometimes limited by legal regulations. Although agents may have multiple skills, each agent can only process one call at a time. Furthermore, each call requires an unknown amount of time to be completed. Besides, each agent must process each call in an uninterrupted period of time; in other words, a call cannot be divided or postponed once it has been started.
The main objective of this real-world problem is to obtain, for each timeframe (t), an automatic allocation of agents to call groups (\(\{a_i, cg_j\}_t\), where \(a_i\) holds the skill \(s_j\)) that maximises the service level [see Millán-Ruiz and Hidalgo (2010)]. It stands to reason that we want to devote more agents to those call groups with greater traffic volume or to those with higher priority or relevance.
The problem of workforce distribution in MSCCs is a very complex and dynamic real-world problem. Usually, the number of incoming calls (n) is much larger than the number of agents (m) and the flow of calls is very dynamic over time, making this problem really hard. Intuitively, this problem is much more complicated than having a simple pool of incoming calls from which agents take work, since it requires the assignment of incoming customer calls to the agents having the right skills, satisfying a given set of additional constraints and respecting the dependencies among individual tasks and the differences in the skills of the agents [see Millán-Ruiz and Hidalgo (2010) for further information about this problem]. This problem is related to other classic changing scenarios where staffing requirements are identified to ensure that the organisation has the right number of agents at the right time. It is a highly difficult problem because we are not only dealing with an NP-hard problem like the job assignment problem (Brucker 2007), but the problem also involves rapidly varying conditions, massive numbers of incoming calls and a large number of agents with hard constraints on the tasks they can process. Reviewing the state of the art, we can find a number of strategies and algorithms to solve this problem (see Millán-Ruiz and Hidalgo 2010a, b, c), among which Parallel Genetic Algorithms (PGA) have proved to be the most competitive approach.
2.6 Algorithms and experimental setup for the OneMax and NK-landscape problems
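The skill constraint on an allocation can be illustrated with a small sketch. The data structures (dictionaries keyed by agent), the function names and the toy data are assumptions made for illustration; the service-level computation itself is not reproduced here.

```python
def is_feasible(allocation, skills):
    """Check that every agent is allocated to a call group it is skilled for.

    allocation: dict agent -> call group index j for the current timeframe
    skills:     dict agent -> set of skill indices held by that agent
    """
    return all(group in skills[agent] for agent, group in allocation.items())


def agents_per_group(allocation, n_groups):
    """Count how many agents are devoted to each call group."""
    counts = [0] * n_groups
    for group in allocation.values():
        counts[group] += 1
    return counts


# Toy data, not taken from the study.
skills = {"a1": {0, 1}, "a2": {1}, "a3": {0, 2}}
allocation = {"a1": 0, "a2": 1, "a3": 2}
```

An optimiser would search over feasible allocations only, shifting agents towards the call groups with more traffic or higher priority.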

Batch 1: N = 50 and \(K \in \{2,4,8,14\}\)
Batch 2: N = 100 and \(K \in \{2,4,8,14,28,56\}\)
For the OneMax problem we used one instance of 5,000 bits (preliminary experiments with smaller instances were easily solvable by our GA).
1. Initialisation: The algorithm’s starting population is initialised randomly.
2. Selection: A classical binary tournament selection has been implemented to select the parents of the offspring. A fitness-based match is used where the selected parents survive until the next generation.
3. Crossover: The offspring is generated by single-point crossover (SPX). The probability of crossover is 0.9.
4. Mutation: In order to avoid another variable, no mutation was used.
5. Migration policy: It is fixed to a simple replacement policy with a bandwidth of 10 % of an island population. This means that the best individuals, 10 % of each emitting island, randomly replace a part of the population of the receiving island (always preserving elitism).
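The operators listed above can be sketched as follows for bit-string individuals. This is a simplified illustration and not the implementation used in the experiments: in particular, parent survival is approximated by copying the single best individual into the next generation, and all function names are hypothetical.

```python
import random


def tournament(pop, fitness, rng):
    """Binary tournament: the fitter of two random individuals wins."""
    a, b = rng.choice(pop), rng.choice(pop)
    return a if fitness(a) >= fitness(b) else b


def spx(p1, p2, rng):
    """Single-point crossover returning one child."""
    cut = rng.randrange(1, len(p1))
    return p1[:cut] + p2[cut:]


def next_generation(pop, fitness, rng, p_cross=0.9):
    """One generation without mutation; the best individual is kept."""
    children = [max(pop, key=fitness)]
    while len(children) < len(pop):
        p1 = tournament(pop, fitness, rng)
        p2 = tournament(pop, fitness, rng)
        children.append(spx(p1, p2, rng) if rng.random() < p_cross else p1[:])
    return children


def migrate(source, target, fitness, rng, bandwidth=0.1):
    """Copy the best `bandwidth` share of source over random slots of target.

    The best individual of the target is never overwritten (elitism).
    """
    n_migrants = max(1, int(bandwidth * len(source)))
    migrants = sorted(source, key=fitness, reverse=True)[:n_migrants]
    elite = max(range(len(target)), key=lambda i: fitness(target[i]))
    slots = [i for i in range(len(target)) if i != elite]
    chosen = rng.sample(slots, min(n_migrants, len(slots)))
    for slot, migrant in zip(chosen, migrants):
        target[slot] = migrant[:]
```

In an island model, `migrate` would be called along every edge of the island topology at each migration event.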
2.6.1 Algorithms and experimental setup for the Multi-Skill Call Centres
It is not always straightforward to control the internal dynamics of a PGA based on the island model, especially while seeking to ensure a fair balance between exploration and exploitation in a dynamic real-world environment. In real production environments, engineers do not always have enough time to test and compute all the possible combinations to determine the optimal island connectivity configuration, as there are many factors that may affect the overall performance and accuracy (number of islands, topology, migration and replacement policies, number of migrants, frequency of migrations, number of individuals in each island, type of synchronism, etc.). This problem is even more severe when dealing with dynamic optimisation under uncertainty, as in the Multi-Skill Call Centre problem. To select the optimal configuration, we have developed a Meta-PGA that automatically determines a sufficiently competent configuration for a second, “internal”, PGA, which is the one that actually solves the MSCC problem.
Some authors have already developed Meta-GAs in the past. Wright (1991) was one of the pioneers in using GAs to optimise problems over several real parameters. Lee and Takagi (1993) proposed an automatic fuzzy system design method that used a GA and integrated three design stages. Their method determined membership functions, the number of fuzzy rules and the rule-consequent parameters at the same time. Clune et al. (2005) used a Meta-GA to investigate the evolution of parameter settings (genetic operators) for genetic and evolutionary algorithms in the hope of creating a self-adaptive algorithm. Nannen and Eiben (2006) presented and evaluated a method for estimating the relevance and calibrating the values of the parameters of an evolutionary algorithm. The method provided an information-theoretic measure of how sensitive a parameter was to the choice of its value. In Nannen and Eiben (2007), the same authors proposed an advanced method that helped to calibrate the parameters of an evolutionary algorithm in a systematic and semi-automated manner. The method for relevance estimation and value calibration of evolutionary algorithm parameters was empirically evaluated in two different ways. More recently, Brain and Addicoat (2010) made use of a Meta-GA to optimise the parameters of a simple GA through an evolutionary process. They addressed the problem of determining the electronic structure of long chain molecules. Shortly afterwards, Shahsavar et al. (2011) proposed a methodology for both optimal pattern selection and tuning. They employed a robust GA to solve a project scheduling problem.
All these algorithms focused on classical GAs, but we now propose a Meta-PGA for parameter calibration that automatically tests different island and migration configurations. It entails a number of independently evolving populations to determine the right set-up of an internal PGA.
Let us present the pseudocode before going on with the explanation (the pseudocode figure is not reproduced in this text).
The chromosome of the Meta-PGA comprises six genes:
1. Number of islands (from 1 to 12 populations).
2. Topology (star, bidirectional ring, all-to-all).
3. Population size (from 4 to 100 individuals per island).
4. Migration and replacement policies: Best-Fitted Individuals by Worst-Fitted Individuals (BFIWFI), Best-Fitted Individuals by Random Individuals (BFIRI), Best-Fitted Individuals by Best-Fitted Individuals (BFIBFI), Best-Fitted Individuals by Most Different Individuals (BFIMDI), Best-Fitted Individual + “Annealing” by Worst-Fitted Individuals (BFIAWFI).
5. Migration frequency (30 or 60 s).
6. Number of migrants (percentage from 10 to 30 %).
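A possible encoding of these six genes is sketched below. Only the ranges and the number of levels per gene (8, 3, 9, 5, 2 and 3, as given with the search-space size in Sect. 3.2) come from the text; the concrete value grids for the number of islands, the population size and the percentage of migrants are hypothetical, as are all names.

```python
import random
from dataclasses import dataclass

# Hypothetical value grids: only the ranges and the level counts are given
# in the text, not the exact values.
ISLANDS = (1, 2, 3, 4, 6, 8, 10, 12)
TOPOLOGIES = ("star", "bidirectional_ring", "all_to_all")
POP_SIZES = (4, 8, 12, 16, 20, 30, 50, 75, 100)
POLICIES = ("BFIWFI", "BFIRI", "BFIBFI", "BFIMDI", "BFIAWFI")
FREQUENCIES_S = (30, 60)
MIGRANT_PCT = (10, 20, 30)

GENE_VALUES = (ISLANDS, TOPOLOGIES, POP_SIZES, POLICIES,
               FREQUENCIES_S, MIGRANT_PCT)


@dataclass(frozen=True)
class MetaChromosome:
    islands: int
    topology: str
    pop_size: int
    policy: str
    frequency_s: int
    migrant_pct: int


def random_chromosome(rng):
    """Draw one value per gene uniformly at random."""
    return MetaChromosome(*(rng.choice(values) for values in GENE_VALUES))
```

Because the chromosome is frozen and hashable, it can be used directly as a key when the scores of already evaluated configurations are stored.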
The Meta-PGA itself is configured as follows:
1. Fitness function: We measure the service level resulting from each configuration (see Millán-Ruiz and Hidalgo 2010).
2. Population size: The population contains 20 different individuals encoded as hinted above.
3. Initialisation: The initial population is randomly generated.
4. Selection: We have applied a binary tournament selection to select individuals from the population.
5. Crossover: The offspring inherits the points its parents have in common and randomly receives the rest of its genes from them.
6. Mutation: We apply a perturbation to each gene of the chromosome with a probability of 0.1.
The internal PGA is configured as follows:
1. Encoding: We encode every solution as an array of integers whose indexes represent the available agents at a given instant and whose contents refer to the profile assigned to each agent.
2. Fitness function: We measure the service level resulting from the configuration of agents and incoming calls (see Millán-Ruiz and Hidalgo 2010).
3. Initialisation: The initial population is randomly generated.
4. Selection: Individuals are selected using a binary tournament mechanism.
5. Crossover: The offspring inherits the points its parents have in common and randomly receives the rest of its genes from them.
6. Mutation: We apply a perturbation to each gene of the chromosome with a probability of 0.03.
7. Replacement policy: We consider elitism with a probability of 0.93 to replace the worst-fitted individuals of the population in the next generation; with a probability of 0.07, a worse-fitted individual may be captured. Note that our basic GA relies on a steady-state scheme.
8. Parallel GA’s operators: The PGA’s parameters and evolutionary operators to play with are: number of islands, topology, population size, migration and replacement policies, migration frequency and number of migrants.
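The variation and replacement operators of the internal GA can be sketched as below. Several details are assumptions made for illustration: the perturbation is taken to be a reassignment to a random profile, and the replacement rule is read as accepting a worse child unconditionally with probability 0.07. All function names are hypothetical.

```python
import random


def common_points_crossover(p1, p2, rng):
    """Keep genes on which both parents agree; pick the others at random."""
    return [a if a == b else rng.choice((a, b)) for a, b in zip(p1, p2)]


def mutate(chromosome, n_profiles, rng, p_mut=0.03):
    """Reassign each gene to a random profile with probability p_mut."""
    return [rng.randrange(n_profiles) if rng.random() < p_mut else g
            for g in chromosome]


def replace_steady_state(pop, child, fitness, rng, p_elitist=0.93):
    """Steady-state replacement of the worst individual (assumed reading).

    With probability p_elitist the child replaces the worst individual only
    if it is at least as fit; otherwise it replaces it unconditionally.
    """
    worst = min(range(len(pop)), key=lambda i: fitness(pop[i]))
    if rng.random() >= p_elitist or fitness(child) >= fitness(pop[worst]):
        pop[worst] = child
```

Occasionally accepting a worse individual is one way of keeping some diversity in a steady-state population.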
3 Results
In this section we provide the results we have obtained. We focus first on analysing the idealised problems, OneMax and NK-landscapes, and then we provide results for the Multi-Skill Call Centre problem.
3.1 OneMax and NK-landscape
As mentioned in the previous section, we have conducted a total of 357,420 experiments, i.e. 20 experiments for each of the 851 different island topologies and each of the 21 tested problems. These experiments were organised in two batches.
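The total can be checked with a few lines of arithmetic. The breakdown of the 21 problems used below (a contiguous and a random neighbourhood for every value of K in both batches, plus one OneMax instance) is our inference from the batch descriptions and should be read as an assumption.

```python
runs_per_setting = 20
topologies = 851

# Assumed breakdown of the 21 problems: two neighbourhood structures for
# each K in Batch 1 and Batch 2, plus a single OneMax instance.
batch1_k = [2, 4, 8, 14]
batch2_k = [2, 4, 8, 14, 28, 56]
problems = 2 * len(batch1_k) + 2 * len(batch2_k) + 1

total_runs = runs_per_setting * topologies * problems
```

Under this reading, `problems` is 21 and `total_runs` matches the 357,420 experiments reported.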
3.1.1 Batch 1
Batch 1 was a preliminary set-up based on the NK-landscape with N = 50 and \(K \in \{2,4,8,14\}\), in both a contiguous and a random epistatic interaction structure, and a 5,000-bit OneMax instance. For each of these problem instances and topologies, 20 different random realisations of the β-graph island model were used. The island topologies were derived from β-graphs with parameters in the range \(n \in \{1,2,4,8,16,20\} \times k \in \{0, \ldots, N-1\} \times \beta \in \{0, 0.05, \ldots, 1\}\) (i.e. β in steps of 0.05).
3.1.2 Batch 2
More clearly than Fig. 5, Fig. 6 shows clusters where groups of island topologies perform well (or badly) with high confidence on groups of problems. To analyse in more detail which topologies are the most productive for different problems, we resort to grouping the topologies based on their CC and CPL and re-applying the biclustering algorithm. To do this we performed the following calculations.
Table: Average clustering coefficient (CC) and characteristic path length (CPL) of the 25 bins considered in the case of joint CC and CPL clustering
Bin  CC  CPL 

0  0.9445  0.0031 
1  0.8986  0.0056 
2  0.8381  0.0091 
3  0.7932  0.0117 
4  0.7491  0.0145 
5  0.7014  0.0179 
6  0.6618  0.0203 
7  0.7203  0.0236 
8  0.6638  0.0244 
9  0.6022  0.0277 
10  0.4439  0.0309 
11  0.4598  0.0343 
12  0.4183  0.0391 
13  0.2671  0.0451 
14  0.4727  0.0491 
15  0.2037  0.0577 
16  0.2902  0.0710 
17  0.1733  0.0821 
18  0.0914  0.1143 
19  0.0548  0.1446 
20  0.0227  0.2100 
21  0.0014  0.2833 
22  0.0005  0.4937 
23  0.0000  0.7160 
24  0.0000  0.8301 
3.2 Multi-Skill Call Centres
Up to now, we have presented the results of the two idealised problems, which illustrate the relationship between island topologies and problem structure. For the real-world problem, we have created a predefined set of representative topology structures according to the findings unveiled by the two aforementioned idealised problems. The purpose is to delimit the topology space in order to analyse how the findings from the idealised problems scale up to a real-world problem and to study how other parameters may have an impact on the relationship between topologies and problem structure.
We now describe the problem instance of medium difficulty that we have created to test our Meta-PGA in a real environment. This problem instance is composed of real data taken from an MSCC during a typical day. The size of the snapshot in which each configuration has been executed is 300 s (5 min). Note that around 800 incoming calls (n) arrive simultaneously in such a time interval during a normal day. The number of agents (m) for each time interval oscillates around 700, with 16 different skills per agent on average, grouped in skill profiles (sets of skills) of 7 skills on average. The total number of call types considered for this study is 167.
Given the values for the 6 genes of the Meta-PGA’s chromosome, there are 6,480 possible combinations (8 × 3 × 9 × 5 × 2 × 3 = 6,480). This may seem a small search space, but every evaluation takes time, as we have to re-execute the internal PGA each time, which is infeasible in a production environment that requires fast adaptations. Of course, we can optimise this by avoiding recalculations previously made by the Meta-PGA. The challenge is therefore to develop a fast and effective Meta-PGA that avoids performing too many iterations to find the right configuration or, at least, a good enough approximation (see the Meta-PGA previously described).
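The size of the search space and the idea of avoiding recalculations can be sketched as follows. The function `run_internal_pga` is a hypothetical stand-in for one costly execution of the internal PGA and returns a placeholder score rather than a real service level; caching also assumes that re-evaluating a configuration on the same snapshot would not yield new information, which is a simplification for a stochastic problem.

```python
from functools import lru_cache
from math import prod

LEVELS_PER_GENE = (8, 3, 9, 5, 2, 3)
SEARCH_SPACE = prod(LEVELS_PER_GENE)  # number of possible configurations

calls = {"internal_pga_runs": 0}


def run_internal_pga(config):
    """Hypothetical stand-in for one costly run of the internal PGA."""
    calls["internal_pga_runs"] += 1
    return sum(config) / 100.0  # placeholder score, not a service level


@lru_cache(maxsize=None)
def evaluate(config):
    """Return the cached score of a configuration (tuple of gene indices)."""
    return run_internal_pga(config)
```

Evaluating the same configuration twice triggers only one run of the internal PGA, which is where the saving comes from when the Meta-PGA revisits configurations.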
The bidirectional ring outperforms other, more connected topologies, especially when the number of islands increases. With more connected topologies and many islands, the populations quickly converge towards the same solution. Therefore, the bidirectional ring seems to be the most appropriate topology for dynamic environments, most likely because it allows for timely convergence while preserving the desired diversity. It is important to highlight that, for this problem, it is crucial to have a connected topology rather than several isolated islands working in parallel (this problem requires a collaborative scheme).
The star topology also yields high-quality outcomes but quickly suffers from premature convergence. The reason is that the master island receives many migrants from the subordinate islands after some migrations (and it is even worse when there are many subordinate islands), implying that populations eventually become very similar. This leads to a lack of diversity, so that the fitness gain is severely damaged. This phenomenon affects the hub topology even more strongly since, with all the islands interconnected to each other, the diversity diminishes too much after one or two migrations.
The two previous paragraphs confirm the results of the previous section, reflecting that each problem structure needs a different island topology configuration. Dynamic, complex problems should be supported by medium-connected topologies like the bidirectional ring to make the PGA evolve properly.
The migration and replacement of individuals is another important feature to set up. Replacing the worst-fitted individuals in the receiving population by the best-fitted individuals of the source population does not always behave better than taking the most different individuals. Analysing differences between chromosomes is a costly operation, so the internal PGA can run fewer generations, but it yields better fitness values in the end. The underlying principle may be that fitness-based comparisons can occasionally be misleading or deceptive: two individuals that are close in terms of genes in common may have very different fitness values, whereas two chromosomes that are far apart in terms of genes in common may have close fitness values. Another consequence of measuring gene differences rather than fitness values is that the fitness curve has a smoother slope in the first generations. Naturally, replacing the best-fitted individuals of the receiving population by the best-fitted ones of the source population implies a slower convergence in each processing node, as we will find a larger percentage of less-fitted individuals. Overall, the best migration policy has been sending the best-fitted individual together with some not necessarily best-fitted individuals (the annealing set), as it provides diversity.
Another finding has been that having many individuals in each population makes the algorithm slower, so that fewer generations are executed. The best values seem to range from 15 to 30 individuals per population.
The migration frequency is also important for performance. Migrations should not be performed too frequently, as each population needs enough time to evolve separately. Likewise, the number of migrants should not be very large, as the internal PGA may then converge very fast to the same solutions. The impact is higher when the number of islands is rather large.
We have seen that PGAs can also deal with complex, real-world application domains, although they require specific tuning depending on the nature of the problem being faced. To this end, we have presented a Meta-PGA for fine-tuning PGAs based on the island model. Thanks to the results uncovered by the exhaustive study of the idealised problems, we have been able to effectively delimit the island connectivity to scale up those findings to a very complex, real-world problem.
3.3 Discussion
The experiments performed on the OneMax and NK-landscapes using a series of β-graph-based topologies for configuring the island connectivity of the parallel GA have clearly identified a direct correlation between problem structure and island connectivity structures, as evidenced by the various biclustering analyses we performed. The results discovered through the idealised problems have helped us to delimit the search space to scale up those findings to the real-world problem. The resulting correlations are neither linear nor trivial (given the complex nature of evolutionary search) even for these kinds of “perfectly known” problems, and the situation is exacerbated with the Multi-Skill Call Centre problem, a real-world scenario.
4 Conclusion
In this paper we have performed a systematic analysis of the correlation between island topologies and problem structure. We have been able to clearly establish that different problem structures require different island topologies for a PGA, both through an analysis of idealised problems, OneMax and NK-landscapes, and through a real-world scenario, namely the Multi-Skills Call Centre. The analysis performed, involving the execution of thousands of simulations and the use of biclustering and statistical analysis, has shown that there is no single island topology that is best across different problems and that the link between island topologies and problem structures is highly complex. Thus, the use of adaptive and self-adaptive solvers that can dynamically choose (perhaps even construct) the interconnection scheme between islands seems to be the preferred way forward.
Notes
Acknowledgments
N. Krasnogor would like to acknowledge UK EPSRC funding for project EP/H010432/1 and The Weizmann Institute of Science for granting him a Morris Belkin Visiting Professorship during which a part of this paper was executed. Iván Contreras and Ignacio Arnaldo are supported by Spanish Government Avanza Competitividad I+D+I: TSI0201002010962 and Iyelmo INNPACTOIPT20111198430000 projects and the mobility Grant “Orden ECD /3628/2011, de 26 de diciembre, Dirección General de Política Universitaria, Ministerio de Educación, Cultura y Deporte”. The work has also been supported by Spanish Government grants TIN 200800508 and MEC CONSOLIDER CSD00C0720811.
References
 Alba E, Tomassini M (2002) Parallelism and evolutionary algorithms. Evol Comput IEEE Trans 6(5):443–462
 Barabási AL, Albert R (1999) Emergence of scaling in random networks. Science 286:509–512. http://jmvidal.cse.sc.edu/library/barabasi99a.pdf
 Barrat A, Weigt M (2000) On the properties of smallworld network models. Eur Phys J B Condens Matter Complex Syst 13:547–560. doi: 10.1007/s100510050067
 Barthelemy M, Amaral LAN (1999) Smallworld networks: evidence for a crossover picture. Phys Rev Lett 82(15):5. http://arxiv.org/abs/condmat/9903108
 Brain Z, Addicoat M (2010) Using metagenetic algorithms to tune parameters of genetic algorithms to find lowest energy molecular conformers. In: Proc. of the Alife XII Conference
 Brucker P (2007) Scheduling algorithms, 5th edn. Springer Publishing Company, Berlin
 CantúPaz E (1999a) Migration policies and takeover times in genetic algorithms. In: Proceedings of the genetic and evolutionary computing conference (GECCO) 1999, p 775
 CantúPaz E (1999b) Topologies, migration rates, and multipopulation parallel genetic algorithms
 Clune J, Goings S, Punch B, Goodman E (2005) Investigations in MetaGAs: panaceas or pipe dreams? In: Proceedings of the 2005 workshops on genetic and evolutionary computation, GECCO ’05. ACM, New York, pp 235–241. doi: 10.1145/1102256.1102311
 Fialho A, Costa L, Schoenauer M, Sebag M (2008) Extreme value based adaptive operator selection. In: Proceedings of the 10th international conference on parallel problem solving from nature: PPSN X. Springer, Berlin, Heidelberg, pp 175–184
 Giacobini M, Preuss M, Tomassini M (2006) Effects of scalefree and smallworld topologies on binary coded selfadaptive cea. In: Gottlieb J, Raidl G (eds) Evolutionary computation in combinatorial optimization. Lecture Notes in Computer Science, vol 3906, Springer, Berlin, pp 86–98. doi: 10.1007/117300958
 Goeffon A, Lardeux F (2011) Optimal onemax strategy with dynamic island models. In: 23rd IEEE International Conference on Tools with artificial intelligence (ICTAI), 2011, pp 485–488. doi: 10.1109/ICTAI.2011.79
 Kennedy J, Mendes R (2002) Population structure and particle swarm performance. In: Proceedings of the 2002 Congress on evolutionary computation, 2002: CEC ’02, vol 2, pp 1671–1676. doi: 10.1109/CEC.2002.1004493
 Krasnogor N (2004) Selfgenerating metaheuristics in bioinformatics: the protein structure comparison case. Genetic Progr Evolv Mach 5(2):181–201. http://www.cs.nott.ac.uk/~nxk/PAPERS/GPEM04.pdf
 Krasnogor N (2012) Memetic algorithms. In: Rozenberg G, Bäck T, Kok J (eds) Handbook of natural computing. Springer, Berlin, pp 905–935. doi: 10.1007/978354092910929
 Krasnogor N, Gustafson S (2004) A study on the use of “selfgeneration” in memetic algorithms. Nat Comput 3:53–76. doi: 10.1023/B:NACO.0000023419.83147.67
 Lee M, Takagi H (1993) Integrating design stage of fuzzy systems using genetic algorithms. In: Second IEEE International Conference on fuzzy systems, 1993, vol 1, pp 612–617. doi: 10.1109/FUZZY.1993.327418
 Li L, Garibaldi JM, Krasnogor N (2006) Automated selfassembly programming paradigm: a particle swarm realization. In: University of Granada, pp 123–134
 Li L, Garibaldi JM, Krasnogor N (2009) Automated selfassembly programming paradigm: the impact of network topology. Int J Intel Syst 24(7):793–817. doi: 10.1002/int.20361
 Liaw A (2006) Enhanced heat map. http://hosho.ees.hokudai.ac.jp/~kubo/Rdoc/library/gplots/html/heatmap.2.html
 Michell A (1904) The limits of economy of materials in frame structures. Philos Mag 8:589–597
 MillánRuiz D, Hidalgo I (2010a) Comparison of metaheuristics for workforce distribution in multiskill call centres. In: Proceedings of the international joint conference on computational intelligence (ICEC 2010)
 MillánRuiz D, Hidalgo I (2010b) A parallel memetic algorithm for workload distribution in dynamic multiagents systems. In: Proceedings of the 3rd workshop on parallel architectures and bioinspired algorithms held in conjunction with PACT 2010
 MillánRuiz D, Hidalgo J (2010c) A memetic algorithm for workforce distribution in dynamic multiskill call centres. In: Cowling P, Merz P (eds) Evolutionary computation in combinatorial optimization. Lecture Notes in Computer Science, vol 6022. Springer, Berlin, pp 178–189
 Nannen V, Eiben A (2006) A method for parameter calibration and relevance estimation in evolutionary algorithms. In: Proceedings of the 8th annual conference on Genetic and evolutionary computation, GECCO ’06. ACM, New York, pp 183–190. doi: 10.1145/1143997.1144029
 Nannen V, Eiben AE (2007) Relevance estimation and value calibration of evolutionary algorithm parameters. In: Proceedings of the 20th international joint conference on Artificial intelligence, IJCAI’07. Morgan Kaufmann Publishers Inc., San Francisco, pp 975–980. http://dl.acm.org/citation.cfm?id=1625275.1625433
 Newman MEJ, Moore C, Watts DJ (2000) Meanfield solution of the smallworld network model. Phys Rev Lett 84:3201–3204. doi: 10.1103/PhysRevLett.84.3201
 Quang Q, Ong Y, Lim M, Krasnogor N (2009) Adaptive cellular memetic algorithm. Evol Comput 17(3):231–256. http://www.mitpressjournals.org/doi/pdf/10.1162/evco.2009.17.2.231 (for the latest version of this paper please refer to the journal website)
 Shahsavar M, Najafi AA, Niaki STA (2011) Statistical design of genetic algorithms for combinatorial optimization problems. Math Probl Eng 2011:17. doi: 10.1155/2011/872415
 Verel S, Ochoa G, Tomassini M (2011) Local optima networks of nk landscapes with neutrality. Evol Comput IEEE Trans 15(6):783–797. doi: 10.1109/TEVC.2010.2046175
 Wang G, Wu D, Szeto K (2011) Quasiparallel genetic algorithms with different communication topologies. In: IEEE Congress on Evolutionary computation (CEC), 2011, pp 721–727
 Wang XF, Chen G (2003) Complex networks: smallworld, scalefree and beyond. Circuits Syst Mag IEEE 3(1):6–20. doi: 10.1109/MCAS.2003.1228503
 Watts DJ, Strogatz SH (1998) Collective dynamics of ’smallworld’ networks. Nature 393(6684):440–442. doi: 10.1038/30918
 Weinberger ED (1996) NP completeness of kauffman’s nk model, a tuneable rugged fitness landscape. Working papers, Santa Fe Institute. http://EconPapers.repec.org/RePEc:wop:safiwp:9602003
 Whitacre J, Sarker R, Pham Q (2008) The selforganization of interaction networks for natureinspired optimization. Evol Comput IEEE Trans 12(2):220–230. doi: 10.1109/TEVC.2007.900327
 Woolley R, Stirling J, Radocea A, Krasnogor N, Moriarty P (2011) Automated probe microscopy via evolutionary optimization at the atomic scale. Appl Phys Lett 98(25):253104
 Wright AH (1991) Genetic algorithms for real parameter optimization. In: Foundations of genetic algorithms. Morgan Kaufmann, San Mateo, pp 205–218
Copyright information
Open Access This article is distributed under the terms of the Creative Commons Attribution License which permits any use, distribution, and reproduction in any medium, provided the original author(s) and the source are credited.