Quantifying the role of complexity in a system’s performance
In this work we studied the relationship between a system’s complexity and its performance in solving a given task. Although complexity is generally assumed to play a key role in an agent’s performance, its influence has not been deeply investigated in the past. To this aim we analysed a predator–prey scenario where a prey had to develop several strategies to counter an increasingly skilled predator. The predator has several advantages over the prey, thus requiring the prey to develop more and more complex strategies. The prey is driven by a fully recurrent neural network trained using genetic algorithms. We conducted several experiments measuring the prey’s complexity using Kolmogorov algorithmic complexity. Our finding is that, in accordance with what was believed in the literature, complexity is indeed necessary to solve non-trivial tasks. The main contribution of this work lies in having proved the necessity of complexity to solve non-trivial tasks. This has been made possible by blending together a goal-oriented system with a complex one. An experiment is provided to distinguish between the complexity of a chaotic system and the complexity of a random one.
Keywords: Complexity, Genetic algorithms, Neural networks, Chaotic systems
Complexity is known to be a necessary aspect in the development of strategies to solve non-trivial tasks (Lenski et al. 2003). However it is not clear how complexity can be quantified and how its impact on a system’s performance can be measured. Key questions arise: for example, is a complex structure necessary to exhibit complex behaviours, or can behavioural complexity emerge even in simple structures?
Artificial life and evolutionary algorithms are a natural test bed to explore these questions. Complexity can be obtained by evolving an artificial agent to solve tasks of incremental difficulty. In Lenski et al. (2003) populations of digital organisms evolved the ability to perform complex logic functions by learning to perform simpler ones and by combining them. A related approach is called scaffolding (Bongard 2008), where a simulated robot learns to pick up an object by incrementally learning all the intermediate steps that lead to the final desired behaviour.
A major drawback of scaffolding-like approaches is the need to hand-craft scenarios of increasing difficulty. This might constrain the solutions an agent is able to find. This problem is partially addressed by studying the evolution of several agents simultaneously competing or cooperating to solve a task (Gomez and Miikkulainen 1997). In particular, the study of co-evolution is likely to provide insights into how complexity autonomously emerges as a necessity. Nolfi and Floreano (1998) studied co-evolution in a predator–prey system. They argued that an “arms race” does not necessarily arise in a competitive scenario, as oscillation of strategies and rediscovery of old solutions are a likely outcome of the predator–prey problem. This poses a limit on the amount of complexity an evolved system will exhibit, as simple “ad-hoc” solutions often proved to be better than general complex ones.
Stanley and Miikkulainen (2004) explored a different competitive scenario. Two agents battled to obtain food and the related energy so that they could overcome each other. Artificial neural networks had been incrementally built via genetic algorithms to drive each of the agents. The authors linked the agent success rate to the complexity of the evolved underlying network, which in turn had been informally linked to the number of neurons and to the number of connections in the network. In this paper we propose to study and quantify the complexity of a network’s behaviour, as opposed to its structure.
The link between the complexity of a neural network and the emergence of novel behaviours has also been investigated in Riano and McGinnity (2010). Several neural networks of increasing complexity were trained to drive a robot towards people. The authors showed that, when the complexity is sufficiently high, the network generates behaviours that were neither programmed nor anticipated. The network’s complexity was measured using Kolmogorov’s algorithmic theory (Kolmogorov 1965), which we use here to characterise the complexity of a behaviour.
We are, in alignment with the current literature, convinced that complexity is mandatory to solve challenging tasks. However we feel that in previous work the focus has been mainly on solving a problem, rather than investigating what allows a problem to be solved. To the best of our knowledge there are no studies that investigate “how much” complexity is necessary to solve a given task. Answering this question requires one to (i) quantify the amount of complexity in a system and (ii) analyse the performance of a system (related to a task) with varying levels of complexity. Our goal in this work is therefore to numerically investigate a system’s complexity and its role in the system’s performance.
Measuring the complexity of a system is still an open problem, and it is often domain-dependent. A survey of “complexity measures” is beyond the scope of this paper, and we refer the reader to existing surveys in the literature (see for example Kuusela et al. 2002; Daw et al. 2003). Two measures seem to have gained wide acceptance: Kolmogorov complexity (Kolmogorov 1965) and Grassberger’s effective measure complexity (Grassberger 1989). However while the former proved to be uncomputable, the latter is defined only for stochastic processes. Moreover Kolmogorov’s measure tends to be high for random processes, an undesirable property when comparing the complexity of different systems. Other measures like Gell-Mann’s total information (Gell-Mann and Lloyd 1996) are subjective and more philosophical, thus lacking a proper scientific use.
In Sect. 2.4 we will show that it is possible to obtain a close approximation of Kolmogorov complexity, thus allowing us to effectively use it as a measure. Moreover, as we will illustrate in Sect. 2, we will be calculating the complexity of recurrent neural networks, which by definition do not contain any element of randomness. This means that we will not face the problem of high complexity being assigned to random processes.
Once provided with a way to measure complexity, we can investigate how it influences a system’s performance. However this is non-trivial, as complexity and the system generating it are deeply interleaved and, although it is possible to measure the complexity after a system is defined, it is hard to generate a system with a given complexity. We addressed this problem by using a “complexity generator”, that is a neural network trained to produce an output whose sole purpose is to be complex. This network can be combined with a task-oriented one to “inject” complexity into the system.
The test bed we will use is a predator–prey scenario. This scenario has been widely studied in the literature (see for example Nolfi and Floreano 1998; Stanley and Miikkulainen 2004). In contrast with previous work the predator is coded by us, while the prey is evolved using genetic algorithms. This allows us to study the dynamics the prey evolves without degenerating into oscillations or greedy solutions (Nolfi and Floreano 1998). This also gives us the possibility to accurately control the ability of the predator, as we can simply vary its driving algorithm to generate more complex behaviours.
The prey is controlled by a fully recurrent neural network as described in Sect. 2.1. We will illustrate in several experiments the solutions developed by the prey to overcome a more and more sophisticated predator. Ultimately the prey must evolve complexity so that it will win against the best predator. Throughout the paper we will use the term “agents” to refer to both the prey and the predator.
This paper is organised as follows: in Sect. 2 and its subsections we describe the techniques we will be using, including the neural network equations, the genetic algorithms and the Kolmogorov complexity. In Sect. 3 we describe the simulation model and dynamics, while in Sect. 4 we describe the experimental results. These results are discussed in Sect. 5 and conclusions are drawn in Sect. 6.
2 Employed techniques
2.1 Fully recurrent neural network
This model, with minor modifications, has been widely used and studied in the neuroevolution literature (see among others Paine and Tani 2004; Urzelai and Floreano 2001; Izquierdo et al. 2008; Hülse et al. 2004).
A RNN is Turing-compatible, i.e. it can approximate any arbitrary recursive function.
This network can exhibit rich and often chaotic dynamics (Dauce et al. 1998).
2.2 Genetic algorithms
Genetic algorithms (GA) have long been used in optimisation problems, including robotics control (Harvey et al. 2005) and neural network training (Floreano et al. 2008). The network structure and weights are encoded in a gene array, and conventional mutation and crossover operators are used.
Several strategies have been employed in the past to train a neural network using a GA (Nelson et al. 2009; Harvey et al. 2005; Walker et al. 2003). The goal of this work is to prove that complexity plays a key role in the survival skills of a prey. As such, it would be out of scope to provide a comparison of several genetic algorithms. Therefore in this paper we decided to use the most common basic approach to neuroevolution (Beer and Gallagher 1992; Floreano et al. 2008): the network weights \(W_{i,j}\) and biases \(b_i\) are encoded in a single genome vector; mutation is applied by adding a Gaussian random value to each element of the genome vector, and we perform a single-point crossover. In all the experiments we used PyEvolve (Perone 2009) as the main GA engine.
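As a rough illustration of the encoding described above, the following sketch unpacks a flat genome vector into a weight matrix and a bias vector. It assumes a network of n neurons with an n × n weight matrix followed by n biases; the gene ordering and the name `decode_genome` are illustrative assumptions and are not taken from the original implementation.

```python
def decode_genome(genome, n):
    """Split a flat genome into an n x n weight matrix W and n biases b.

    Assumed layout (hypothetical): the first n*n genes are the weights in
    row-major order, the last n genes are the biases.
    """
    expected = n * n + n
    if len(genome) != expected:
        raise ValueError(f"genome must have {expected} genes, got {len(genome)}")
    weights = [list(genome[i * n:(i + 1) * n]) for i in range(n)]
    biases = list(genome[n * n:])
    return weights, biases


# Example: a 2-neuron network needs 2*2 weights + 2 biases = 6 genes.
W, b = decode_genome([0.1, -0.2, 0.3, 0.4, 1.0, -1.0], n=2)
```

With a flat layout of this kind, mutation and crossover can operate on the genome without any knowledge of the network topology.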
2.3 Least squares approximation of the prey’s trajectory
2.4 Kolmogorov complexity
The Kolmogorov complexity has been shown to be uncomputable. However, it is possible to obtain an upper bound on it (which in practice is a very good approximation) by using a universal compression algorithm, such as that of Lempel and Ziv (Kaspar and Schuster 1987). In Falcioni et al. (2003) the connections between entropy, chaos and algorithmic complexity are studied. These connections can be summarised by saying that a complex system generates incompressible strings (outputs) and is unpredictable. The incompressibility of a string is therefore a clear indicator of complexity in a system. A similar approach is used in Khalatur et al. (2003).
In this paper we will be concerned with the compressibility of sequences of numbers between 0 and 1. However a sequence of floating point numbers may be highly incompressible, given the nature of its representation in a machine, even if the numbers are very close to each other. A solution to this problem is to discretise a real-numbers string into a series of symbols (Daw et al. 2003).
The degree of discretisation has been empirically determined. A fine discretisation (more than 70 symbols) leads to high complexities that do not reflect the real nature of the underlying process. On the other hand, a coarse discretisation leads to constant low complexities, removing any informative content from this approach. We found however that discretisations between 10 and 50 symbols did not lead to any substantial change in the evaluated complexities. Therefore we discretised the range 0–1 into 10 different symbols, and calculated the complexity of the resulting string. In this way a series of numbers must evenly and chaotically cover the entire 0–1 range to obtain a high complexity value.
Map s into a string d of symbols drawn from an alphabet of 10 elements.
Compress d using the Lempel-Ziv algorithm (Kaspar and Schuster 1987), thus obtaining a new shorter string d′.
Calculate the Kolmogorov complexity as the ratio between the length of d′ and the length of d.
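The steps above can be sketched as follows. This is only an approximation of the procedure: zlib's DEFLATE compressor is swapped in for the Lempel-Ziv variant cited in the text, so the resulting ratios are not comparable with the bps values reported in the experiments. The function names and the letter alphabet are illustrative assumptions.

```python
import random
import zlib


def discretise(values, n_symbols=10):
    """Map numbers in [0, 1] to a string over an alphabet of n_symbols letters."""
    symbols = []
    for v in values:
        # Clamp so that v == 1.0 (or slight overshoots) fall in the outer bins.
        idx = max(0, min(int(v * n_symbols), n_symbols - 1))
        symbols.append(chr(ord("a") + idx))
    return "".join(symbols)


def compression_ratio(values, n_symbols=10):
    """Approximate complexity as len(d') / len(d).

    Assumption: zlib (DEFLATE) stands in for the cited Lempel-Ziv algorithm.
    """
    d = discretise(values, n_symbols).encode("ascii")
    if not d:
        raise ValueError("empty sequence")
    d_prime = zlib.compress(d, 9)
    return len(d_prime) / len(d)


# A constant signal should compress far better than an irregular one.
rng = random.Random(0)
constant = [0.5] * 5000
noisy = [rng.random() for _ in range(5000)]
ratio_constant = compression_ratio(constant)
ratio_noisy = compression_ratio(noisy)
```

Note that a general-purpose compressor adds a fixed header, so the ratio is only meaningful for sequences that are long compared with that overhead.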
3 Experimental setup
3.1 Experiments overview
The goal of the following experiments is to measure the performance of a prey with respect to its complexity. To this aim we ran several experiments where the prey was challenged each time by a more skilled predator. The predator was not only “naturally” better equipped than the prey (see next subsection), but it was also able to approximate the prey’s trajectory using the approach illustrated in Sect. 2.3. This means that the difficulty of the task the prey has to face increases across the experiments in Sects. 4.1, 4.2 and 4.3. While the prey managed to evolve good survival strategies in two out of three scenarios, in Sect. 4.3 we show that the GA could not find a practical solution. This prompted the blending of a goal-oriented behaviour with complexity illustrated in Sect. 4.5.
3.2 World and agent dynamics
The predator has complete knowledge of its own position and the prey’s position, regardless of their distance. The prey’s sensors are far more limited, as it can sense the predator only when it is less than 4 m away. Moreover the prey does not know the exact angle between itself and the predator, but only an imprecise direction. To simulate this we divided the whole 360° area around the prey into 6 sectors. The prey therefore knows only the distance from the predator and the sector from which it is approaching. This is analogous to having an array of 6 sonars around the prey.
The RNN described in Sect. 2.1 controls the prey. It takes as input a vector of 6 real numbers indicating the distance from the predator in one of the sectors and it has as the only output the angular speed ω of the prey. Both agents’ linear velocities are kept constant at their maximum.
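A minimal sketch of this sensor model is given below, assuming 2D positions in metres and headings in radians. The sector numbering (sector 0 starting at the prey's heading, counted counter-clockwise) and the value 0.0 for sectors in which nothing is sensed are assumptions made for illustration; the source does not specify them.

```python
import math

N_SECTORS = 6        # number of sectors, from the text
SENSOR_RANGE = 4.0   # sensing range in metres, from the text


def sense_predator(prey_xy, prey_heading, predator_xy):
    """Return 6 readings: the distance in the sector holding the predator.

    Assumptions (not from the source): sector 0 starts at the prey's heading,
    sectors are numbered counter-clockwise, and empty sectors read 0.0.
    """
    dx = predator_xy[0] - prey_xy[0]
    dy = predator_xy[1] - prey_xy[1]
    distance = math.hypot(dx, dy)
    readings = [0.0] * N_SECTORS
    if distance >= SENSOR_RANGE:
        return readings  # predator too far away to be sensed
    bearing = (math.atan2(dy, dx) - prey_heading) % (2 * math.pi)
    sector = min(int(bearing / (2 * math.pi / N_SECTORS)), N_SECTORS - 1)
    readings[sector] = distance
    return readings
```

For example, `sense_predator((0, 0), 0.0, (0, 3))` places a reading of 3.0 in the second sector, while a predator 5 m away leaves all six readings at 0.0.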
3.3 Evolving the prey
In Sect. 2.2 we described the network representation and the basic operators used in the GA. For all the experiments below we used a mutation probability of 0.8. Mutation is performed by adding to a single gene a random value drawn from a Gaussian distribution with zero mean and unit variance. Weights and biases are clamped between −3 and 3. Single-point crossover is performed with a probability of 0.1.
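The operators and parameters above can be sketched in plain Python as follows. This is not the PyEvolve configuration used in the experiments; it is a standalone illustration that assumes the mutation probability applies per individual and that one randomly chosen gene is perturbed. The function names are hypothetical.

```python
import random

MUTATION_PROB = 0.8            # from the text
CROSSOVER_PROB = 0.1           # from the text
GENE_MIN, GENE_MAX = -3.0, 3.0  # clamping range, from the text


def mutate(genome, rng):
    """With probability MUTATION_PROB, add N(0, 1) noise to one random gene."""
    child = list(genome)
    if rng.random() < MUTATION_PROB:
        i = rng.randrange(len(child))
        noisy = child[i] + rng.gauss(0.0, 1.0)
        child[i] = max(GENE_MIN, min(GENE_MAX, noisy))  # clamp to [-3, 3]
    return child


def crossover(mother, father, rng):
    """With probability CROSSOVER_PROB, swap the tails after a random cut."""
    if len(mother) != len(father):
        raise ValueError("parents must have the same genome length")
    if len(mother) < 2 or rng.random() >= CROSSOVER_PROB:
        return list(mother), list(father)
    cut = rng.randrange(1, len(mother))
    return (list(mother[:cut]) + list(father[cut:]),
            list(father[:cut]) + list(mother[cut:]))
```

Passing an explicit `random.Random` instance keeps runs reproducible, which is convenient when comparing evolved preys across experiments.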
The fitness function for the prey is its survival time, measured in seconds. We assume that the predator has captured the prey if the distance between them is less than 0.5 m. The fitness is averaged over four separate trials, with the prey located in each of the four quadrants of the Cartesian space and the predator in the opposite corner at a fixed distance.¹ The initial angles are always such that the predator is facing the prey and the latter is facing the opposite direction. A trial is over if the predator catches the prey or if the latter survives for 150 s.
After training, when evaluating the prey’s performance, the initial position and orientation of both the prey and the predator are randomly initialised. In this way we tested the generality of the prey’s strategy. On the other hand there might be starting positions where the predator is highly favoured and will catch the prey despite its best efforts. Therefore in the following experiments we will never see an average survival time of 150 s (as opposed to training), but always smaller values.
4 Experiments and results
4.1 Greedy strategy
In the first experiment the predator used a simple greedy strategy, that is, it always steered towards the current location of the prey without any approximation of its trajectory. The prey’s RNN is composed of 6 input neurons, 1 output neuron and no hidden neurons. Training was performed with a population of 100 individuals.
4.2 Linear strategy
In this experiment the predator approximates the prey’s trajectory using the linear least squares approach described in Sect. 2.3 with a first degree polynomial. Using this approach the predator was always able to catch the prey trained in the previous section. We therefore used the GA to obtain a new generation of preys able to survive the improved predator. Given the failure of the previous generation, we used a RNN with 3 hidden neurons. The predator used the prey’s τ = 15 previous positions to solve the linear system in Eq. 4, and it predicted \(\Delta t = 10\) s ahead in time.
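To make the predator's prediction step concrete, the sketch below fits a first degree polynomial to the last τ samples and extrapolates Δt seconds ahead. It assumes that the x and y coordinates are fitted independently as functions of time, which may differ from the exact formulation of Eq. 4; the function names and the closed-form fit are illustrative choices.

```python
def fit_line(ts, vs):
    """Least-squares fit of v = a + b * t; returns (a, b)."""
    n = len(ts)
    if n < 2 or n != len(vs):
        raise ValueError("need at least two matching samples")
    t_mean = sum(ts) / n
    v_mean = sum(vs) / n
    var_t = sum((t - t_mean) ** 2 for t in ts)
    if var_t == 0:
        raise ValueError("time stamps must not all be equal")
    b = sum((t - t_mean) * (v - v_mean) for t, v in zip(ts, vs)) / var_t
    return v_mean - b * t_mean, b


def predict_position(times, positions, tau=15, dt=10.0):
    """Extrapolate the prey's (x, y) position dt seconds past the last sample.

    Assumption: x(t) and y(t) are each approximated by a separate line
    fitted to the tau most recent samples.
    """
    ts = times[-tau:]
    xs = [p[0] for p in positions[-tau:]]
    ys = [p[1] for p in positions[-tau:]]
    ax, bx = fit_line(ts, xs)
    ay, by = fit_line(ts, ys)
    t_future = ts[-1] + dt
    return ax + bx * t_future, ay + by * t_future
```

A prey moving in a straight line is predicted exactly by such a model, which is consistent with the observation that the previously evolved prey was always caught.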
Although this strategy is ingenious, it is very simple. The changes of trajectory are infrequent and easily predictable by a smarter predator, as we will show next. The Kolmogorov complexity of the prey’s behaviour supports this idea, as it is only 0.02 bps, while the predator’s is 0.06. Out of 300 random starting locations, the average prey survival time was 133 s.
4.3 Quadratic strategy
In the previous experiments we found that the prey always developed dynamics that ended in a circular trajectory. We therefore crafted the third version of the predator that uses a second degree polynomial to approximate the prey’s trajectory. The predator used τ = 10 previous steps to approximate Eq. 4 and it used a prediction \(\Delta t = 10\) s ahead in time. Again an improved predator outperformed the prey model previously evolved. This time however we could not evolve a prey able to survive this last deployed predator.
Out of 300 random starting locations, the average prey survival time was 30.78 s. The Kolmogorov complexity did not change significantly from the previous experiments, with a value of 0.06 bps for the prey, and a value of 0.04 for the predator.
4.4 Evolving for complexity
The predator’s strategy is to approximate the prey’s trajectory, predict where it is going and precede it. We already saw in Sect. 4.2 that the prey developed a behaviour that exploited a weakness of linear approximation. Our hypothesis was therefore that the same concept could be used to overcome the predator’s quadratic approximation, even if the GA had not found this kind of solution in the previous experiment.
The approach we adopted was to create a network whose only goal is to produce a complex output. The complex network had the same number of inputs and outputs as the prey’s network, and 5 hidden nodes. The fitness function was the complexity of the network output when fed with inputs coming from a previous predator–prey experiment.
4.5 Selector network
The selector network (SRNN) for the prey has 8 inputs (6 for the sector distances described in Sect. 3.2 and 2 for the outputs of the two subnetworks) and 5 hidden nodes. Moreover the SRNN produces 2 weights whose sum is one, indicating the weight given to the output of each subnetwork. The final speed of the prey is obtained by multiplying each subnetwork output by the respective SRNN weight and summing the results.
The similarity with the previous model can be seen by comparing Figs. 7 and 4. In both cases at the beginning of the trial the prey did not know the predator’s position until it was closer than 4 m. So it moved around a circle, waiting for the predator to approach before fleeing in the opposite direction. The following trajectory is a mixture of circular and piecewise linear trajectories. Even when the prey follows circular trajectories, small variations generate circles of different diameter or position each time.
The prey’s average survival time was 101 s over 300 random starting locations, and the Kolmogorov complexity of the network output was 0.29 bps. The predator’s complexity was measured to be 0.16.
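The blending step can be illustrated with the short sketch below. How the original selector enforces that its two weights sum to one is not stated; here the two raw selector outputs are simply normalised, which is an assumption, as are the function and parameter names.

```python
def blend_outputs(goal_output, complex_output, raw_selector):
    """Combine two subnetwork outputs with two weights that sum to one.

    raw_selector holds two non-negative numbers from the selector network
    (hypothetical); they are normalised here so the weights sum to one.
    """
    s_goal, s_complex = raw_selector
    total = s_goal + s_complex
    if s_goal < 0 or s_complex < 0 or total == 0:
        raise ValueError("selector outputs must be non-negative, not both zero")
    w_goal = s_goal / total
    w_complex = s_complex / total
    return w_goal * goal_output + w_complex * complex_output
```

With weights of this form the result always lies between the two subnetwork outputs, so the selector can move smoothly from a purely goal-oriented command to a purely complex one.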
4.6 Chaos is not random
The velocity profile of Fig. 7 might be mistaken for a randomly generated one. This might lead to the conclusion that by simply generating random movements a prey would be able to escape the predator. To disprove this, we substituted the complex network described in the previous section with a random number generator. The SRNN therefore used as input the output of the network trained to avoid the predator and a uniformly distributed random number between 0 and 1.
5 Discussion of the results
Table 1 A summary of the results of all the previous experiments
One could be tempted to measure the difficulty of the task the prey is facing by looking at the predator’s complexity. However, this measure provides only part of the picture. The reasons behind the predator’s successes with the quadratic strategy lie not only in the strategy’s complexity itself, but also in the predator’s increased speed and better sensing. This can hardly be observed by looking at the complexity alone, as the first three entries in Table 1 show. Of particular interest however is the jump in the predator’s KC from the experiment in Sect. 4.3 to the following ones. Although the predator’s algorithm did not change, its complexity increased when trying to cope with a more skilled prey. Given the number of experiments we conducted and the results provided in Sect. 4.5, we can postulate that 0.19 bps is the maximum complexity a quadratic strategy can exhibit.
Table 1 also shows that, as the predator’s skills increase, the average survival time of the prey decreases. In other words finding a successful strategy against a more challenging opponent becomes harder and harder. Nonetheless in all of the scenarios the prey managed to find a “sound” strategy, i.e. elusive manoeuvres that, apart from a few unfortunate starting positions, lead the prey to survive. If we were to investigate a long term arms race between the prey and the predator, we hypothesise that eventually they would find a balance where no strategy is guaranteed to be successful all the time, in accordance with Nolfi and Floreano (1998).
We strongly believe that the GA would eventually find a successful RNN against the predator with the quadratic strategy of Sect. 4.3 if provided with enough time and a large population. A similar solution could have been found by using evolutionary algorithms that incrementally build a RNN structure (see for example Stanley and Miikkulainen 2004), or by exposing the prey to tasks of increasing difficulty. In other words employing a structure like a SRNN is not necessary to ensure the prey’s survival. However, having separated the complexity from a goal-oriented agent allowed us to highlight the big role that complexity plays in a system’s performance.
Complexity alone could not increase the prey’s survival chances, as shown in Sect. 4.4. This becomes evident by looking at the last three rows of Table 1. A prey with only a complex network or a random number generator could not survive, in spite of a complexity higher than that of the selector illustrated in Sect. 4.5. A SRNN developed the right tools to combine complexity with a goal-oriented strategy, tools that led to the prey’s good final performance. We therefore proved that complexity is necessary but not sufficient to increase a system’s performance. This idea was already well-grounded in the scientific community, but to the best of our knowledge it lacked an experimental proof.
As introduced in Sect. 2.4 complexity can be observed in chaotic systems. An analogous complexity can also be observed in a random number generator, which does not require any evolutionary strategy to be created. The last experiment in Sect. 4.6 however marked a clear separation between chaos and randomness. The SRNN learnt to exploit hidden rules, typically found in deterministic chaos, in the chaotic dynamics of the subnetwork to drive the prey to success. On the other hand a random number generator does not follow any rule, therefore its presence did not improve the prey’s performance.
As the scope of this work was to explore the role of complexity underlying the dynamics of a successful prey, we did not thoroughly investigate the choice of parameters for the GA. The literature on this topic provides a vast selection of algorithms and solutions for neural network training using GA (see Floreano et al. 2008 for a recent survey). We did investigate however how the structure of the RNN influences the prey’s performance. Our finding, confirmed by the experiments, is that full connection and a relatively small number of hidden neurons suffice to have a successful prey. The network evolved in Sect. 4.1 had no hidden neurons, but the recurrent connections in the output neurons generated the dynamics necessary to evade the predator. In all the subsequent experiments 3 or 5 hidden neurons proved to be sufficient to generate winning strategies. Our approach was to conduct experiments with an increasing number of hidden neurons until we no longer observed any increase in performance. This is similar to manually running the NEAT algorithm (Stanley and Miikkulainen 2004).
5.1 Relation to Ashby’s law of requisite variety
Ashby’s (1956) law of requisite variety states that the larger the variety of actions available to a control system, the larger the variety of perturbations it is able to compensate for. Ashby (1958) defined variety as the total number of states available to a system. The law of requisite variety states that a controller (or regulator), in order to be successful in its task, must be capable of reacting to all the signals (or perturbations) that the system to be controlled produces. In the scenario proposed here, a prey must be able to cope with all the strategies the predator employs to capture it.
If we draw a parallel between the variety of a system and its complexity, we find that this work provides an empirical proof of Ashby’s law: in order to overcome the predator’s superiority in speed and sensing (which is, as we stated before, not entirely captured by its numerical complexity), the prey had to increase its variety. This becomes evident by comparing Figs. 2 and 4 with Fig. 7, where the prey exhibits a far richer trajectory. Here we will not provide a quantification of the variety of both the predator and the prey as it is out of the scope of this paper.
6 Conclusions and future work
Genetic algorithms evolve networks of increasing complexity when facing more difficult tasks.
We uncoupled the algorithm solving a specific task from its complexity.
Complexity is necessary but not sufficient to solve several tasks.
Chaotic-like complexity follows rules that cannot be characterised as random.
The key “tool” we used in this work is Kolmogorov’s theory to measure the effect of complexity, numerically comparing different architectures in different scenarios. This allowed us to provide a proof of what was strongly believed in the scientific community but not really proved. That is: complexity, defined according to Kolmogorov, is indeed necessary to solve a non-trivial task.
A question that naturally arises at this point is “what is the right amount of complexity necessary and sufficient to solve a given task?”. We believe that answering this question will be hard if not impossible in all but trivial tasks. Even if we could find such a “minimally complex” system, its generalisation capabilities would be very limited. As we showed in this paper, a more demanding task requires a more complex solution. To avoid having to work with ever more complex systems, we are currently investigating how learning can spontaneously produce the necessary amount of complexity without necessarily having a complex underlying system. This will integrate into the study of learning and evolution (Nolfi and Floreano 1999).
¹ A first approach was to generate several random starting locations for both the prey and the predator, and then to average the survival time of the prey. This however led the GA to favour preys that started far from the predator, instead of preys with good avoidance skills.
Dr. Riano is supported by InvestNI and the Northern Ireland Integrated Development Fund under the Centre of Excellence in Intelligent Systems project.