Excitable networks for finite state computation with continuous time recurrent neural networks

Continuous time recurrent neural networks (CTRNN) are systems of coupled ordinary differential equations that are simple enough to be insightful for describing learning and computation, from both biological and machine learning viewpoints. We describe a direct constructive method of realising finite state input-dependent computations on an arbitrary directed graph. The constructed system has an excitable network attractor whose dynamics we illustrate with a number of examples. The resulting CTRNN has intermittent dynamics: trajectories spend long periods of time close to steady-state, with rapid transitions between states. Depending on parameters, transitions between states can either be excitable (inputs or noise needs to exceed a threshold to induce the transition), or spontaneous (transitions occur without input or noise). In the excitable case, we show the threshold for excitability can be made arbitrarily sensitive.


Introduction
It is natural to try to understand computational properties of neural systems through the paradigm of network dynamical systems, where a number of dynamically simple units (i.e. with attracting equilibria or periodic orbits) interact to give computation as an emergent property of the system. This is as much the case for biological models of information processing, pattern generation and decision making as it is for artificial neural networks inspired by these models. A variety of specific models have been developed to describe the dynamics and training of recurrent networks comprised of coupled neurons in biological and artificial settings. The particular challenge that we address here is the construction of arbitrarily complex, but specified, dynamical structures that enable discrete (finite-state) computation in an input-driven Communicated by Peter J. Thomas. system that is nonetheless continuous in both state and time. Clearly, invariant objects of the autonomous system (such as equilibria, periodic orbits and chaotic attractors) only form part of the picture and an input-dependent (non-autonomous) approach such as Manjunath et al. (2012) is needed.
To help understand the response of systems to inputs the authors introduced in Ashwin and Postlethwaite (2016) a notion of a "network attractor", namely an invariant object in phase space that may contain several local invariant sets, but also systems of interconnections between them. This generalises the notions of "heteroclinic network" and "winnerless competition/stable heteroclinic channels" ) which have been used to describe a range of sequence generation, computational and cognitive effects in neural systems: see for example Rabinovich et al. 2001Rabinovich et al. , 2006Rabinovich et al. 2020;Hutt and beim Graben 2017. These models connect computational states, represented as saddles in the dynamics. In the presence of inputs or noise, the switching between saddles is a useful model for spontaneous computational processes, but there are two problems. One of these is that the states are dynamically unstable and so there are spontaneous transitions that are not noise driven. The other is that heteroclinic chains are destroyed by arbitrarily small perturbations, unless there are special structures in phase space (symmetries or invariant subspaces).
Network attractors joining stable equilibria (Ashwin and Postlethwaite 2016) overcome both of these issues. In this paper, we show that arbitrarily complex network attractors exist in an idealised class of model called a continuous time recurrent neural network (CTRNN) (Beer 1995). A CTRNN is a set of differential equations each with one scalar variable that represents the level of activation of a neuron, and feedback via a saturating nonlinearity or "activation function". These models have been extensively investigated in the past decades as simple neurally inspired systems that can nonetheless (without input) have complex dynamics by virtue of the nonlinearities present (Beer 1995). They can also be trained to perform arbitrarily complex tasks depending on time-varying input (Tuci et al. 2002;Yamauchi and Beer 1994). They are frequently used in investigations of evolutionary robotics (Blynel and Floreano 2003) and (in various equivalent formulations (Chow and Karimipanah 2020)) as models for neural behaviour in biological or cognitive systems. For example, the classical work of Hopfield and Tank (Hopfield and Tank 1985) considers such systems with symmetric weights to solve optimization problems, while more recently, Bhowmik et al. (2016) discusses CTRNN models for episodic memory, and several other biological and cognitive applications are discussed in Nikiforou (2019).
CTRNNs are often referred to as "universal dynamical approximators" (Funahashi and Nakamura 1993), meaning that the trajectory of a CTRNN can approximate, to arbitrary precision, any other prescribed (smooth) trajectory in R n . However, this does not mean that the dynamics of CTRNNs are simple to understand, or that it is easy to form the above approximation. It also raises the question of how a "trained" CTRNN performs a complex task. Gradient descent or more general evolutionary training algorithms train the network by navigating through a high dimensional landscape of possible feedback weightings and moving these towards a setting that is sufficiently optimal for the task. It may be possible to give a clear description of the resulting nonlinear dynamics of the autonomous (constant input) CTRNN, but we want to understand not only this but also how inputs affect the state of the system.
The main theoretical result in this paper focusses on "excitable network attractors": these consist of a finite set of local attracting equilibria and excitable connections between them (see Appendix A). It was demonstrated in Ashwin and Postlethwaite (2018) that an excitable network attractor can be used to embed an arbitrary Turing machine within a class of purpose-designed coupled dynamical system with two different cell types. Rather than relying on an optimization approach to design the system, that paper gave a constructive method for designing a realisation of any desired network attractor. However, this construction required specialist dynamical cells with quite complex nonlinear couplings between them, and a comparatively large number of cells. It was recently shown in Ceni et al. (2020) that trained RNNs in the form of echo-state networks can realise excitable networks for certain tasks and that structural errors in the trained network can explain errors in imperfectly trained systems.
The current paper demonstrates that CTRNN dynamics is sufficiently rich to realise excitable networks with arbitrary graph topology, simply by specifying appropriate connection weights. The construction algorithm in the proof of Theorem 1 assigns one of only four values to each of the connection weights to realise an arbitrary graph on N vertices as an excitable network on N states (subject to some minor constraints on its connectivity), using a CTRNN with N cells. The CTRNN we consider in this paper (see for example (Beer 1995), which corresponds to a continuous time Hopfield model (Hopfield and Tank 1985) in the symmetric coupling case w i j = w ji ) are ordinary differential equationṡ where y = (y 1 , . . . y N ) ∈ R N is the internal state of the N cells of the system, w i j is a matrix of connection weights, φ is a (sigmoid) activation function that introduces a saturating nonlinearity into the system and I i (t) is an input. We say system (1) is input-free if I i (t) = 0 for all i and t. We consider two cases for φ, a smooth function and a piecewise affine function In both cases, φ is monotonic increasing with and a maximum derivative at y = θ , equal to 1 4 . In both cases, and θ are parameters, and we are interested in the case 0 < 1. In general, the function φ need not be the same in every component of (1), but here we make a simplifying assumption that it is.
Note that both activation functions (2) and (3) have piecewise constant limits in the singular limit → 0. Such limiting systems are of Fillipov type and have been explored in various biological contexts, especially for gene regulatory dynamics. These can also have rich dynamics as discussed in the literature (for example Gouzé and Sari 2003;Harris and Ermentrout 2015), but we do not consider this limit here.
The main contribution of Sect. 2 is to give, in Theorem 1, a construction of a connection weight matrix w i j such that the dynamics of the input-free system (1) contains (or realises: definition given below) an excitable network attractor, as defined in Ashwin and Postlethwaite (2016). We prove this (with details in Appendix B) for the case of the piecewise affine function φ P . In Sect. 3, we present evidence that this is also true for the smooth case φ S for an open set of parameters. Qualitatively, this means the system will contain a number of stable equilibrium states, and small inputs (either deterministic, or noisy) will push the trajectory from one stable equilibrium into the basin of attraction of another. In this way, transitions can be made around the network, and the transition time between states tends to be much smaller than the residence times of the trajectory in neighbourhoods of the states. In particular, we can choose w i j so that the network attractor has (almost) any desired topology. In Appendix A, we recall formal definitions of network attractors from Postlethwaite 2016, 2018).
In Sect. 3, we consider several examples of simple graphs and demonstrate that the desired networks do indeed exist in the systems as designed. We also perform numerical bifurcation analysis to demonstrate the connection between periodic orbits in the input-free deterministic system (1) and excitable networks in the same system with additive noise, that is, the system of stochastic differential equations (SDEs): where W i (t) are independent standard Wiener processes. Here, the noise plays the role of inputs that propel the trajectory around the network, although of course this occurs in a random manner. In Sects. 3.3 and 3.4, we consider graphs that have multiple edges leading out from a single vertex and show that additional equilibria may appear in the network attractor where two or more cells are active simultaneously. We further show that the existence of these additional equilibria can be suppressed by choosing one of the parameters used in the construction of the weight matrix w i j to be sufficiently large.
Section 4 concludes by relating our results to other notions of sequential computation. We also conjecture some extensions of the results shown in this paper.

Construction of a CTRNN with a network attractor
Let G be an arbitrary directed graph between N vertices, and let a i j be the adjacency matrix of G. That is, a i j = 1 if there Fig. 1 From left to right, the figures show an order-one loop, an ordertwo loop, and a -clique in a directed graph. The construction we provide realises an arbitrary graph G, as long of none of these subgraphs are present in G is a directed edge from vertex i to vertex j, and a i j = 0 otherwise. Let be an invariant set for a system of ordinary differential equations. We say is an excitable network that realises a graph G for some amplitude δ > 0 if for each vertex v i in G there is a unique stable equilibrium ξ i in , and if there is an excitable connection with amplitude δ in from ξ i to ξ j whenever there is an edge in G from v i to v j . The existence of an excitable connection means that there exists a trajectory with initial condition within a distance δ of ξ i that asymptotes in forward time to ξ j (formal definitions are given in Appendix A).
For the purposes of our construction of a network attractor, we assume that G contains no loops of order one, no loops of order two, and no -cliques. Figure 1 shows each of these graph components schematically.
In terms of the adjacency matrix, for G to contain no loops of order one requires that a ii = 0 for all i; for G to contain no loops of order two requires that a i j a ji = 0 for all i, j; and for G to contain no -cliques requires that a i j a ik a jk = 0 for all i, j, k.
In our earlier work, we have demonstrated a network design which can admit order-two loops and -cliques (Ashwin and Postlethwaite 2016), although this previous construction is not motivated by neural networks per se and requires a higher dimensional system of ODEs (for a given graph) than the one presented here. Before we move into the details of the construction, we briefly discuss our terminology. A graph G has vertices, which correspond to (stable) equilibria in the phase space of the dynamical system (1). Also within the phase space, there exist excitable connections (sometimes abbreviated to connections) between the equilibria, which correspond to the edges of the graph. When a trajectory in the phase space moves between neighbourhoods of equilibria along a path close to one of these connections, we say that a transition between the equilibria has occurred. We refer to each of the components of the dynamical system (1) as a cell, and say that a cell j is active if φ(y j ) is close to one.

Realization of arbitrary directed graphs as network attractors
We construct a weight matrix w i j that depends only on the adjacency matrix a i j , and on four parameters w t , w s , w m and w p . It is given by where δ i j is the Kronecker δ. This choice of w i j ensures e. a i j = 1), and w i j = w t otherwise. In later sections, we allow for different weights along different edges by allowing w p to depend on i and j (i.e. w p = w i j p ). We give an overview of how each of the parameters affect the dynamics of the system in Sect. 2.2.
We write w = ( , θ, w s , w m , w t , w p ) ∈ R 6 to be a vector of all parameter values. The next result shows that for the piecewise affine activation function and suitable choice of parameters, there is an embedding of G as an excitable network attractor for the input-free system.
Theorem 1 For any directed graph G with N vertices containing no loops of order one, no loops of order two, and no -cliques, and any small enough δ > 0, there is an open set W ex ⊂ R 6 such that if the parameters w ∈ W ex , then the dynamics of input-free equation (1) on N cells with piecewise affine activation (3) and w i j defined by (8) contains an excitable network attractor with threshold δ, that realises the graph G.
Recall that by realises we mean that all edges in the graph are present as transitions between stable equilibria using perturbations of size at most δ.
Proof We give the main ideas behind the proof here, deferring some of the details to Appendix B. We construct an excitable network attractor in R N for (1) with piecewise activation function (3) and weight matrix (8). For any 1 2 > δ > 0, we show there exist parameters w (with > 0 small) and stable equilibria ξ k (k = 1, . . . , N ) that are connected according to the adjacency matrix a i j by excitable connections with amplitude δ. Below, we provide an explicit set of parameters that make such a realisation, and note that the realisation will hold for an open set of nearby parameters.
We show in Appendix B that the equilibria ξ k have components (cells) that are close to one of four values Y T , Y D , Y L , Y A related to the edges attached to the corresponding vertex k in the graph G. For any 0 < δ < 1 2 , we use the following parameters: and then w p and w m are given by We then set We use square brackets and subscripts to identify the components of points in phase space, that is, [ξ k ] j is the jth component of ξ k . Each ξ k has: -Exactly one cell that is Active: Note that the requirement of no loops of order one or two and no -cliques implies that this labelling is well defined.
From equilibrium ξ k , there is an excitable connection to any of the equilibria ξ l with a kl = 1, that is, any of the Leading cells can become the Active cell. During a transition, the remaining cells can be classified into six types, which are identified in Fig. 2, and depend (for each j) on the values of the four entries in the adjacency matrix a jk , a k j , a jl and a l j . We label cell k as AT (Active-Trailing) and cell l as LA (Leading-Active). Note that the lack of two cycles means that the cases with a jk = a k j = 1 or a jl = a l j = 1 (a total of seven possibilities) cannot occur, and the lack of -cliques mean that the cases with a jk = a jl = 1, a k j = a jk = 1, or a k j = a l j = 1 also cannot occur (which includes the cases where a cell would switch from Leading to Trailing). The remaining six possibilities are listed below.
-Type DD: a jk = a k j = a jl = a l j = 0; the cell is Disconnected throughout. -Type TD: a jk = 1, a k j = a jl = a l j = 0; the cell switches from Trailing to Disconnected. -Type LD: a k j = 1, a jk = a jl = a l j = 0; the cell switches from Leading to Disconnected. -Type TL: a jk = a l j = 1, a k j = a jl = 0; the cell switches from Trailing to Leading. -Type DT: a jl = 1, a jk = a k j = a l j = 0; the cell switches from Disconnected to Trailing. -Type DL: a l j = 1, a jk = a k j = a jl = 0; the cell switches from Disconnected to Leading.
The right panel in Fig. 2 shows how a transition from cell AT active to cell LA active will occur in a general network.
To prove the existence of an excitable connection giving a realisation, we consider a perturbation from ξ k to the point where e l is the unit basis vector, and we show in Appendix B that, for small enough δ, ζ k,l is in the basin of attraction of ξ l if a kl = 1. This means there is an excitable connection from ξ k to ξ l in this case.
We believe that for small enough δ and suitable choice of weights, the realisation of G in Theorem 1 can be made almost complete in the sense analogous to the similar notion for heteroclinic networks , namely that the set has zero measure. If a network realization is almost complete, then almost all trajectories starting close to some ξ k will remain close, or will follow a connection corresponding to the realization. We do not have a proof of this, though Appendix B.2 shows that for small enough δ, ζ k,l is in the basin of attraction of ξ l if and only if a kl = 1. This does not preclude the possibility that there exist perturbations in B δ (ξ i ) other than ζ k,l resulting in trajectories asymptotic to ξ l , or indeed to other attractors. Our numerical investigations suggest that by choosing w t nonzero and large enough, any connections to other equilibria can be suppressed. This is discussed more in Sects. 3.4 and 4.

Excitable networks for smooth nonlinearities
For small enough , the smooth activation function (2) can be made arbitrarily close to the piecewise activation function (3), and so we expect Theorem 1 to also apply in the smooth case, but do not give a proof here. However, throughout Sect. 3 we use the smooth activation function (2) in our examples. We use = 0.05, θ = 0.5, w s = 1, as our default parameter set (compare this choice of parameters with that given in equations (9) and (10), with δ = 0.4), though there will be an open set of nearby parameters with analogous behaviour. In Sect. 3, we provide several examples of using this choice of weight matrix to realise a graph G. From equation (11), we can see that the parameter choices directly affect the location of the equilibria ξ k in phase space. As we will see in the following sections, the parameters also have further effects on the dynamics. In particular, the relative sizes of the parameters w p and θ determine whether the dynamics are excitable or spontaneous: essentially, for small enough, w p needs to be smaller than θ to observe excitable dynamics. If w p is too large, then the equilibria ξ k cease to exist: periodic orbits can exist instead. The parameter w m controls how fast a Trailing cell decays, and the parameter w t controls the suppression effects when there is more than one Leading cell. We discuss the effects of w t in more detail in Sect. 3.3.
For a smooth activation function such as (2) that is invertible on its range, there is a useful change of coordinates to J i = φ(y i ) (a similar transformation is made in Beer (1995)). The input-free equations (1) then becomė where each J i ∈ (0, 1) (which is the domain of the function φ −1 ), and Each vertex of the graph G is realised in the phase space of the J k variables by the stable equilibrium with one of the J k close to 1 and the remainder close to 0. With a slight abuse of notation, we refer to these equilibria as ξ k . As we will see in the examples that follow, the parameters can be chosen such that the dynamics are close to a saddle-node bifurcation. In general, the system is near a degenerate bifurcation with codi- where O k is the out-degree of the kth vertex for the graph G. This corresponds to there being simultaneous saddle-node bifurcations (with O k -fold degeneracy) at each equilibrium ξ k corresponding to each of the outgoing directions. This bifurcation has global connections analogous to a saddle-node on an invariant circle (SNIC)/saddle-node homoclinic bifurcation: there are connecting orbits between the saddle nodes that reflect the network structure.
If parameters are chosen as in Theorem 1 (such that none of the saddle-node bifurcations have occurred), then for each k there will be a stable equilibrium ξ k and a group of nearby saddles and sources. A small perturbation near ξ k can move the trajectory out of the basin of attraction of ξ k to effect a transition to another equilibrium: this gives an excitable connection from ξ k .
If parameters are chosen such that all saddle-node bifurcations have been passed (for example, by sufficiently increasing the value of w p ), then the flow through the corresponding region of phase space near where the sink ξ k was will be slow, and all equilibria in this region will have been destroyed. However, we will still observe intermittent dynamics as the trajectory passes through this region (Strogatz 1994, p99). In this case, we refer to a region where (a) there is a unique local minimum of |ẏ| and (b) a large subset of initial conditions in this region pass close to this minimum, as a bottleneck region P k , and refer to a spontaneous transition past P k . (This is also called the ghost of a saddlenode bifurcation in Strogatz (1994).) If all the connections corresponding to edges in the graph G are excitable, then the system contains an excitable network attractor. If all connections are spontaneous, then we typically see a periodic orbit, although we do not prove this.
For other cases (e.g. where the parameter w p is assumed to depend on i, j in (8)), some saddle-node bifurcations will have occurred, but others will have not. In this case, we expect that some connections will be excitable, and for others, trajectories will automatically pass through bottleneck regions.
When there is a choice of two out-going connections, one of which is excitable and the other of which is spontaneous, the one chosen by the trajectory will depend on noise amplitudes and other effects: we expect there to be a rich complicated local and indeed global dynamical behaviour, the analysis of which is beyond the scope of this paper.

Two vertex graph
For our first example, we consider the connected graph with two vertices and a single edge joining them, that is a 12 = 1, and a i j = 0 for (i, j) = (1, 2). We use bifurcation analysis to show that the transition between spontaneous and excitable dynamics is caused by a saddle-node bifurcation and find an approximation to the location of the saddle-node bifurcation in parameter space. The two-dimensional system of equations is: In the J i variables, this becomeṡ where (J 1 , J 2 ) ∈ (0, 1) 2 . We note the following properties of the function g : If w s > 4 , then g has local extrema at x + and x − , where The J 1 and J 2 nullclines of system (16) are at, respectively In Fig. 3, we show a sketch of the phase space of (16); the J 1 nullcline is shown in blue, and the J 2 nullcline in red. Solid dots show stable equilibria, open dots show unstable equilibria, and arrows show the direction of flow. Note that we do not include any nullclines at J i = 0 or J i = 1 because they are not in the domain of equations (16). As w p is decreased (in the figures, moving from left to right), a saddle-node bifurcation creates a pair of equilibrium solutions.
Lemma 1 If θ < w s , and 0 < 1, then a saddle node bifurcation occurs in system (16) when To begin the proof, we note that the saddle-node bifurcation will occur when the points A and B (marked by squares in the left-hand panel of Fig. 3) coincide. These points are defined as the intersection of the nullclines with the line at at the local extrema of the J 2 nullcline.
Let the J 1 coordinate of A be Let the J 1 coordinate of B be J B 1 , and write J B 1 = 1 − B , for some B 1. Substituting this, along with J 2 = x − , into the expression for the J 1 nullcline in (18) gives Expanding in terms of the small quantities and B , this gives which we rearrange to find Since we are assuming θ < w s , then B is exponentially small, that is B = O( n ) for all n ∈ N, thus, The points A and B collide when J A 1 = J B 1 , that is, when For the default parameters (12), except for w p , we find w SN p = 0.3027 (4 s.f.). Note that this means for w p = 0.3 we are close to saddle node and there is an excitable connection with small δ > 0. More generally, note that for any fixed w s , as → 0 we have w SN p → θ as expected from Theorem 1. The following result gives an approximation of the positions of the equilibria that are created in the saddle-node bifurcation. Methods similar to those used in this proof are used in later sections for larger networks.
Lemma 2 If θ < w s , 0 < 1, and 0 < η 4 , then if w p = w SN p − η, the system (16) has a pair of equilibria at Recall that Using the earlier results on the location of the J 1 nullcline, we will have equilibria when Expanding g about J 2 = x − and writing w p = w SN p − η gives, where the final line follows because g(x − ) = w SN p . Substituting for g (x − ) then gives the result.

Three vertex cycle
Our second example is the cycle between three vertices shown schematically in Fig. 4. As a heteroclinic cycle between equilibria, this system has been studied extensively in the fields of populations dynamics (May and Leonard 1975), rotating convection (Busse and Heikes 1980) and symmetric bifurcation theory (Guckenheimer and Holmes 1988).
We give some numerical examples of the dynamics of this system as realised by the CTRNN excitable network and use the continuation software AUTO (Doedel et al. 2007) to show that the transition from excitable to spontaneous dynamics occurs at a saddle-node on an invariant circle (SNIC) bifurcation generating a periodic orbit. The deterministic equations realising this graph are: We also consider the noisy case, using the setup given in equations (4). Figure 5 shows sample time series for two different parameter sets. On the left, we show a noisy realisation with w p = 0.3, and on the right, a periodic solution in the deterministic system (equations (1)) with w p = 0.305.
Note that in both cases, for this system the y k variables oscillate between three values: 3), and low (y k = Y T = w m = −0.7), as the cells shift between Active, Trailing and Leading. Only the first of these corresponds to J k ≈ 1, as can be seen in the time series plots of the J k variables in the lower panels of the figure, the other two correspond to J k ≈ 0.
We compute a bifurcation analysis of the system (20) using the continuation software AUTO (Doedel et al. 1997). Figure 6 shows a bifurcation diagram of this system as w p is varied. Stable solutions are shown in red. There is a saddle-node on an invariant circle (SNIC) bifurcation at w p = w SN I C p ≈ 0.30287. For w p < w SN I C p , the diagram shows a stable equilibrium solution with y 1 ≈ Y A = 1 (and y 2 ≈ Y L , y 3 ≈ Y T ). As w p increases through w SN I C p , this equilibrium disappears in a SNIC bifurcation creating a stable periodic orbit. Note that the period of the periodic orbit asymptotes to ∞ as the SNIC bifurcation is approached. Due to the symmetry, there are of course two further pairs of equilibria, one pair with y 1 ≈ Y T , y 2 ≈ Y A , y 3 ≈ Y L , and another with y 1 ≈ Y L , y 2 ≈ Y T , y 3 ≈ Y A . The symmetry causes three saddle-node bifurcations to occur simultaneously, creating the periodic orbit. If we were to instead choose the w p to be different in each of the lines in (20) bifurcations by diamonds, saddle-node of periodic orbits by triangles, and saddle-node on invariant circles (SNIC) by squares. Note that there may be two squares for a single SNIC bifurcation because the maximum value of y 1 on the periodic orbit is not the same as the value of y 1 for the equilibria undergoing the saddle-node bifurcation. The SNIC bifurcation labelled at w p = w SN I C p ≈ 0.30287 is the transition between excitable and spontaneous dynamics and shows the periodic orbit which has resulted from the SNIC bifurcation.
We note that this SNIC bifurcation occurs at approximately the same value of w p as the saddle-node bifurcation found in Sect. 3.1. This is not surprising; using similar methods to those in the previous section, we can show that to lowest order in , the SNIC bifurcation occurs when , there thus exists an excitable network in the sense defined in appendix A.

Four node Kirk-Silber network
For our next example, we consider a graph with the structure of the Kirk-Silber network (Kirk and Silber 1994), shown schematically in Fig. 7. This graph has one vertex which has two outgoing edges, and the dynamics here are somewhat different to vertices with only one outgoing edge. The bulk of this section is devoted to an analysis of these differences, in particular, the possibility of an additional equilibrium in the network attractor with two active cells.
The corresponding deterministic equations for this network are (moving immediately into the J i variables): We can break the symmetry between J 3 and J 4 by choosing w p 3 = w p 4 . In fact, in what follows, we will frequently write w p 3 = w p 4 + w, for w > 0, and choose w p 4 = w p for simplicity.
We consider first the dynamics near each of the vertices that have exactly one outgoing edge (vertices 1, 3 and 4 in the graph; see Fig. 7). Again using the same techniques that were used in Sect. 3.1 (lemma 2), we can show that for w p , w m , w t < θ < w s , and w p = w SN p + η, 0 < η < 4 , there exist equilibria solutions at, for example That is, there is a transition from excitable to spontaneous dynamics (in this case between cells 1 and 2, but also between cells 3 and 1 and cells 4 and 1) as w p is increased through w SN p .
The dynamics close to the vertex with two outgoing connections (vertex 2) is modified by the presence of the additional parameter w t . Consider the three dimensional subsystem of (21) with J 1 = 0, that is: We can perform a similar calculation to that shown in Sect. 3.1 to show that there is a section of the J 2 null-surface which lies asymptotically close to the surface J 2 = 1. Equilibria solutions exist on this null-surface if the J 3 and J 4 null-surfaces intersect there, that is, if there are solutions to the pair of equations We assume without loss of generality that w p 3 > w p 4 (i.e. w > 0) and then the arrangement of these curves is in one of the configurations shown in Fig. 8. If w t < 0 (lower panel), equilibria solutions exist (i.e. the red and blue curves intersect) for a range of w p 3 and w p 4 with both larger than w SN p : that is, the transition to spontaneous dynamics happens at a larger value of w p j (than if w t = 0). If w t > 0 (upper panel), the opposite happens: that is, the transition to spontaneous dynamics occurs at a smaller value of w p j . Solving these equations exactly requires solving a quartic equation, and the resulting expression is not illuminating. We label the value of w p at which this transition from spontaneous to excitable dynamics occurs as w SN p , and note that this is a function of w t , w s , , θ as well as more generally, the number of Leading directions from that cell. For the specific system (21), with w p 3 = w p 4 + w = w p + w, we thus have two conditions. If w p < min(w SN p , w SN p − w), then the system is excitable along all connections. If w p > max(w SN p , w SN p − w), then a periodic orbit will exist. If neither of these conditions holds, then we will see excitable connections in some places and spontaneous transitions in others.
In Fig. 9, we show some example time-series of the system (21) (in the y k coordinates). In panel (a), parameters are such that w p > max(w SN p , w SN p − w), so we see a periodic solution in the deterministic system. Note that the y 3 (yellow) coordinate becomes close to Y A = 1 during this trajectory, but the y 4 (purple) coordinate does not: it switches between Y L = 0.3, Y D = 0 and Y T = 0.7. In panel (b), parameters are such that w p < min(w SN p , w SN p − w), so without noise the trajectory would remain at a single equilibrium solution. Here, we add noise with σ = 0.05, and the trajectory can be seen exploring the network. Note that there are some transitions between ξ 2 (y 2 is red) and ξ 3 (y 3 is yellow), and some from ξ 2 to ξ 4 (y 4 is purple). In panels (c) and (d), we increase w p further away from the saddle-node bifurcation (further into the regime of spontaneous transitions) and observe some qualitative differences in the trajectories. In the deterministic case (c), the periodic solution now transitions from near the bottleneck region P 2 to a region of phase space where y 3 and y 4 are both Active. We label this region of phase space as P 3,4 . In the noisy case (d) (which is also in the spontaneous Fig. 10 Behaviour of periodic orbits in the Kirk-Silber-type four-node network (21) as the parameter w t is varied. The top panel shows max(y 3 ) (blue) and max(y 4 ) (red) along the periodic orbit. The bottom panel shows the period of the orbit on varying w t regime), the trajectory makes transitions from the bottleneck region P 2 to each of P 3 , P 4 , and to P 3,4 .
In the above simulations, we have used w t = 0, but we observe that the type of qualitative behaviour observed depends both on the parameters w p and w t . Specifically, a sufficiently negative w t provides a suppression effect, meaning that only a single cell y k can be active at any one time, but the transitional value of w t depends on w p . In Fig. 10, we show maximum values of y 3 and y 4 along the periodic orbits as w t is varied. It can be seen clearly here that the transition between periodic orbits which visit P 3 (where max(y 3 ) is significantly larger than max(y 4 )) and those which visit P 3,4 (where max(y 3 ) ≈ max(y 4 )) is quite sharp.
We extend these results to show the behaviour as both parameters w p and w t are varied in Fig. 11. The data in this figure show the observed behaviours for both noisy and deterministic systems as the parameters w p and w t are varied. The red lines are the curves w p = w SN p (dotted) and w p = w SN p (dashed). If w p is above both of these lines, then all transitions are spontaneous, and so a periodic orbit exists in the system. If w p lies below either one (or both) of these lines, then at least one of the transitions will be excitable and so there will be no periodic solutions. The black line shows the boundary between those periodic solutions which visit P 3 (to the left of the black line) and those which visit P 3,4 (to the right of the black line), as determined by the location of the sharp transition in calculations similar to those shown in Fig. 10 for a range of w p . The background colours are results from noisy simulations. The colour indicates the ratio of transitions to P 3,4 to the total number of transitions to P 3 , P 4 and P 3,4 . Interestingly, the noisy solutions require a much larger value of w t than the deterministic ones to have a significant proportion of transitions to P 3,4 .
These changes in qualitative dynamics can be explained in terms of the three-dimensional subsystem with J 1 = 0, given by equations (22). In this three-dimensional system, there are stable equilibria at as well as further unstable/saddle equilibria. Recall that these equilibria are not on the boundaries of the box (which are not part of the domain). In Fig. 12, we show solutions from the full four-dimensional system (21) projected onto the threedimensional space with J 1 = 0. In panel (a), we show the periodic solutions from the deterministic systems for five different values of w p , ranging from w p = 0.309 (left curve, dark purple), to w p = 0.3096 (right curve, red) in increments of 0.0002. It can be seen that the first two of these trajectories approach the saddle equilibria (marked as a blue dot) from one side of its stable manifold, and the latter three from the other. The first three thus visit P 3 , and the latter three visit P 3,4 . It is the transition of the periodic orbit across the stable manifold which results in the rapid change in the qualitative behaviour of the periodic orbit and likely indicates that a homoclinic bifurcation to this saddle point separates these behaviours. That is, the sharp transition in T in Fig. 10 should actually extend to ∞ on both sides. In panel (b), we show both noisy and deterministic trajectories with w p = 0.315. Note that only one of the noisy trajectories follows the deterministic trajectory closely: the majority visit P 3 .

A ten node network
In this section, we demonstrate the method of construction described in Sect. 2 for a larger network. Specifically, we randomly generated a directed graph between 10 vertices, with the constraints that it contained no one-loops, two-loops or -cliques, and such that the graph does not have feedforward structure (i.e. you cannot get 'stuck' in a subgraph by following the arrows). The graph we consider is shown in Fig. 13.
We ran one simulation of the deterministic CTRNN system (1), and two simulations of the noisy CTRNN system (4), and in each case, randomly generated the entries for the w p in the equation (8). For the deterministic system, the entries of w i j p were chosen independently from the uniform distribution U (0.32, 0.34). For the noisy systems, the entries of w i j p were chosen independently from the uniform distribution U (0.30, 0.32). The remaining parameters were set at the default parameter values given in (12) except w t = −0.3 for the deterministic system, and one of the noisy systems, and σ = 0.01 for the noisy systems. Note that for the deterministic parameter values there are bottlenecks in the phase space close to the locations in phase space for the excitable states that are present for the parameter values used in the stochastic cases.
The results of the simulations are shown in Figs. 14, 15 and 16. For the deterministic simulation (Fig. 14), we see that the system has an attracting period orbit, which visits the nodes in the order 1 → 4 → 7 → 3 → 8. The entries of w i j p were randomly generated as described above, and we do not give them all here for space reasons, but we note that in all cases in which a vertex in this cycle has two 'choices' for which direction to leave (in the graph shown in Fig. 13), the attracting periodic orbit chooses the more unstable direction. That is, if i is a vertex in the above cycle, and if i → j and i → k are connections in the directed graph, with i → j being a part of the attracting periodic orbit, then w ji p > w ki p . Although we do not prove here that this will always be the case, it is intuitively what one might expect, that is, the connection from i to j is stronger than the connection from i to k.
In the noisy simulation with w t = −0.3 (Fig. 15), the equilibria are not visited in a regular pattern, but random choices are made at each equilibria from which there is more than one direction in which to leave. See, for instance, the transition 3 → 8 at t ≈ 75, and the transition 3 → 10 at t ≈ 155. The length of time spent near each equilibria is also irregular; note for instance, the variable amount of time spent near ξ 1 and ξ 3 . In this simulation, because the transverse Fig. 11 Behaviour of the Kirk-Silber-type four-node network (21) as the parameters w t and w p are varied. The red lines are the curves w p = w SN p (dotted) and w p = w SN p (dashed; determined numerically by solving a quartic equation). For w p above, both of these lines periodic solutions exist in the deterministic system. To the left of the black line, these periodic solutions visit P 3 , to the right they visit P 3,4 . The background colours are results from noisy simulations with σ = 0.05. The colour indicates the ratio of transitions to P 3,4 to the total number of transitions to P 3 , P 4 and P 3,4 . The labelled dots give the parameter values of the time-series plots in Fig. 9 (a) (b) Fig. 12 The figures show trajectories for the system (21) projected onto the three-dimensional space with J 1 = 0. In panel (a), five periodic trajectories in the deterministic system are shown for different values of w p , ranging from w p = 0.309 (left curve, dark purple), to w p = 0.3096 (right curve, red) in increments of 0.002. In panel (b), we set w p = 0.315 (as in panels (c) and (d) of Fig. 9). Noisy trajectories (σ = 0.05) are shown in blue, and the periodic orbit of the deterministic system is shown in purple. In both panels, an equilibrium of the threedimensional system (22) is shown by a blue dot. Other parameters are ε = 0.05, θ = 0.5, w t = 0, w s = 1, w m = −0.5. The arrows indicate the direction of flow Fig. 13 A example of a directed graph between ten nodes with no loops of order one or two and no -cliques. Three timeseries from realisations of this as a network using CTRNNs are shown in Figs. 14-16 parameter w t is sufficiently negative, only one node is active at any given time.
By contrast, the simulation in Fig. 16 has w t = 0. Here, the transitions again are made randomly, but without the suppression provided by the transverse parameter it is possible to have multiple cells active at once. For instance, at around t = 55, both cells 1 and 5 become active at the same time; they were both leading cells from the previously active cell 6. As cells 1 switches off, cell 4 becomes active, and as cell 5 switches off, cell 8 becomes active. The system continues to have two active cells around t = 250, at which point a third cell also becomes active. If the trajectory was to run for longer, then the number of active cells could decrease again, if an active cell suppresses more than one previously active cells.
The entire excitable network attractor for this level of noise is clearly more complicated than the design shown in Fig. 13, in that additional equilibria (with more than one active cell) are accessible to those encoded and described in Theorem 1. An interesting extension of this work would be to understand which additional equilibria appear in a network attractor generated from a given directed graph in this manner: Fig. 16 suggests that at least seven levels of cell activity are needed to uniquely describe the states that can appear when more than one cell becomes "active". In particular, when a cell is Trailing to more than one Active cell, the value of y j for that cell is even lower than Y T (compare the top panel of Fig. 16 with the schematic in Fig. 2).

Discussion
The main theoretical result of this paper is Theorem 1, which states that it is possible to design the connection weight matrix of a CTRNN such that there exists a network attractor with a specific graph topology embedded within the phase space of the CTRNN. The graph topology is arbitrary except for minor restrictions: namely there should be no loops of order one or two, and no -cliques. Theorem 1 assumes a piecewise affine activation function, but the examples in Sect. 3 suggest that the results generalise to CTRNN using any suitable smooth activation function. More generally, note that the coupled network is in some sense close to N simultaneous saddle-node bifurcations. However, the units are not weakly coupled and indeed this is necessary to ensure that when one cell becomes active, the previous active cell is turned off.
Theorem 1 proves the existence of an excitable network with threshold δ where not only the connection weights, but also and θ (properties of the activation function) may depend on δ. We believe that a stronger result will be true, namely that δ can be chosen independent of properties of the activation function, and also that this can be made an almost complete realisation by appropriate choice of parameters.
Conjecture 1 Assume the hypotheses on the directed graph G with N vertices as in Theorem 1 hold. Assume that > 0 is small and θ > 0. Then, there is a δ c ( , θ ) > 0 such that for any 0 < δ < δ c there is an open setŴ ex ⊂ R 4 . If the parameters (w s , w m , w t , w p ) ∈Ŵ ex , then the dynamics of input-free equation (1) with N cells and piecewise affine activation function (3) and w i j defined by (8) contains an excitable network attractor with threshold δ that gives an almost complete realisation of the graph G.
The construction in Theorem 1 uses a comparatively sparse encoding of network states-each of the N vertices in the network is associated with precisely one of the N cells being in an active state. Indeed, the connection weights (8) assign one of only four possible weights to each connection, depending on whether that cell can become active next, was active previously or neither. Other choices of weights will allow more dense encoding: and many more than N excitable states within a network of N cells. However, the combinatorial properties of the dynamics seem to be much more difficult to determine and presumably additional connection weightings will be needed, not the just four values considered in Theorem 1.
Section 3 illustrates specific examples of simple excitable networks for smooth activation function (2) on varying parameters-this requires numerical continuation to understand dependence on parameters even for fairly low dimension. For this reason, we expect that a proof of an analogous result to Theorem 1 for the smooth activation function may be a lot harder. These examples also give some insight into bifurcations that create the excitable networks.
In general, there is no reason that the realisation constructed in Theorem 1 is almost complete (in the sense that  Fig. 13, using the deterministic model (1). The top panel shows the y j coordinates, where the colours correspond to the node colours in Fig. 13. In the lower panel, each horizontal row corresponds to one node (as labelled on the vertical axis), and the colour is blue when the corresponding J j coordinate is close to zero, and yellow when it is close to one. That is, the yellow segments indicate when each node is active. Parameters are as described in the text Fig. 15 Trajectories of the CTRNN network for the directed graph shown in Fig. 13, using the stochastic model (4). Lines and colours are described in Fig. 14. Parameters are as described in the text, here w t = −0.3

Fig. 16
Trajectories of the CTRNN network for the directed graph shown in Fig. 13, using the stochastic model (4). Lines and colours are described in Fig. 14. Parameters are as described in the text, here w t = 0 almost all initial conditions in B δ (ξ k ) evolve towards some ξ l with a kl = 1, analogous to , Defn 2.6)). If it is not, then other attractors may be reachable from the excitable network. Conjecture 1 suggests that the realisation can be made almost complete for small enough δ: we expect that for this we will require w t to be sufficiently negative. If δ is too large, we cannot expect almost completeness: there are other stable equilibria (notably the origin) that can be reached with large perturbations, and the simulation in Fig. 16 shows that other equilibria may be reachable from the network. It will be a challenge to strengthen our results to show that the excitable network is an almost complete realization. However, the examples studied in Sect. 3 confirm that, at least for relatively simple graphs, this conjecture is reasonable.
Our theoretical results are for networks with excitable connections. We expect much of the behaviour described here is present in the spontaneous case, if the coupling weights are chosen such that equilibria are replaced with bottlenecks. In the absence of noise, we expect to see a deterministic switching between slow moving dynamics within bottlenecks. This dynamical behaviour is very reminiscent of the stable heteroclinic channels described, for example, in Rabinovich et al. 2020). However, stable heteroclinic switching models require structure in the form of multiplicative coupling or symmetries that are not present in CTRNN or related Wilson-Cowan neural models (Wilson and Cowan 1972) (see Chow and Karimipanah 2020 for a recent review of related neural models). Other models showing sequential excitation include (Chow and Karimipanah 2020): this relies on a fast-slow decomposition to understand various different modes of sequential activation in a neural model of rhythm generation. It will be an interesting challenge to properly describe possible output dynamics of our model in the case of bottlenecks.
We remark that asymmetry of connection weights is vital for constructing a realisation as an excitable networksindeed, the lack of two-cycles precludes a jk = a k j = 1. While this may be intuitively obvious, it was not so obvious that we also need to exclude one-cycles and -cliques in the graph to make robust realisations.
Finally, although we do not consider specific natural or machine learning applications of CTRNN here, the structures found here may give insights that give improved training for CTRNN. In particular, it seems plausible that CTRNN may use excitable networks to achieve specific input-output tasks (especially those requiring internal states). For example, recent work (Ceni et al. 2020) demonstrates that echo state networks can create excitable networks in their phase space to encode input-dependent behaviour. It is also likely there are novel optimal training strategies that take advantage of excitable networks, for example, choosing connection weights that are distributed close to one of the four values we use. EPSRC as part of the Centre for Predictive Modelling in Healthcare grant EP/N014391/1. CMP is grateful for additional support from the London Mathematical Laboratory.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecomm ons.org/licenses/by/4.0/.

A Definition of excitable network
This appendix (which extends ideas in Ashwin and Postlethwaite (2016)) gives formal definitions for excitable networks considered in this paper. We say a system has an excitable connection for amplitude δ > 0 from one equilibrium ξ i to another ξ j if is the open ball of radius δ centred at ξ ) and this connection has threshold δ th if We define the excitable network of amplitude δ > 0 between the equilibria E = {ξ i } to be the set We say the excitable network E for amplitude δ realises a graph G if each vertex v i in G corresponds to an equilibrium ξ i in E and there is an edge in G from v i to v j if only if there is a connection in E for amplitude δ from ξ i to ξ j .

B Proof of Theorem 1
As in equations (9) and (10), for any δ < 1 2 we choose and then w p and w m are given by Fig. 17 Illustration of a connection (shown in red) with amplitude δ > 0 from ξ k to ξ l , projected in the coordinates y k and y l . The points ξ k , ξ l and 0 are linear sinks. The point ζ k,l is within δ of ξ l and limits to ξ l in forwards time. The grey areas are regions of width 4 centred around y k,l = θ where φ P (y k ) and φ P (y l ) are non-constant: these contain saddle and other equilibria (not shown) We define equilibria ξ k as in Sect. 2.1, where we write the jth component of the equilibrium as [ξ k ] j : Y D if a k j = 0 and a jk = 0, where Y A := w s = 1, Y L := w p = θ − δ 2 , are the values of the Active, Leading, Trailing and Disconnected components, respectively. Note that As mentioned before, the hypotheses of Theorem 1 imply that this labelling is well defined and Fig. 2 shows how a transition from cell 1 active to cell 2 active will occur in a general network. It is simple to check that (27) is an equilibrium solution of (1) with activation function (3). Moreover, ξ k is linearly stable with n eigenvalues −1. We define [J k ] j := φ P ([ξ k ] j ) then note that (28) implies that in terms of the Kronecker δ k j .
First, consider the dynamics of all y j , with j = k, l. This means that y j < θ −δ/4 for all points in T kl ≡ R kl ∪S kl ∪R l . While the trajectory remains in T kl , it can be shown thatẏ j is negative along the line with y j = θ − δ/4. Hence, none of the y j will leave T kl .
Within S kl , the dynamics of y k and y l are governed bẏ y k = − y k + w m + φ P (y k ), y l = − y l + w s + w p φ P (y k ).
Consider the equation forẏ k in R kl , S kl and R l , namely: Recall that w m = −(w s − θ) − δ 2 , and 0 ≤ φ P (y k ) ≤ 1. We can use these bounds to show thaṫ Furthermore, if y ∈ R l and y k > 0, thenẏ k < −w m < 0.
In particular, we note that for y ∈ T kl , with y k > 0,ẏ k is negative and bounded below zero. Now, consider the equation forẏ l in R kl , S kl and R l , namely: We use this to computeẏ l along the lower boundary of the three regions, R kl , S kl and R l , that is, the line y l = θ + δ 4 , and we finḋ Since w s > 3δ 4 and w s > θ + δ 4 , we see thatẏ l > 0 in all three cases.
Combining our knowledge ofẏ l andẏ k tells us that a trajectory which starts in R kl , or more specifically, a trajectory starting in a small neighbourhood of ζ k,l will have monotonic decreasing y k component until (at least) y k = 0. Furthermore, the y l component cannot decrease below y l = 1 2 + δ 4 . Thus, the trajectory will move through S kl and into R l in a bounded time.
Within R l , ξ l is a linearly stable fixed point. In summary, we have demonstrated that if a kl = 1, then there is a connection from a δ-neighbourhood of ξ k to ξ l . Moreover, as the equilibria are linearly stable and having a connection is an open condition, the realisation will persist for an open set of parameters.

B.2 Absence of excitable connections for edges absent from G
Now, suppose that a lk = a kl = 0. Then, the dynamics for x k and x l is shown schematically in Fig. 18. Equilibria are shown with dots, and all equilibria shown in this figure are linearly stable. Note that the equilibrium ξ k has y k = Y A = 1, and y l = Y D = 0. It is clear that there are no small perturbations which allow for a connection between ξ k and ξ l . For θ = 1/2, if δ < 1/4, then for any k and j = k such that a k j = 0, all trajectories starting in ζ k,l = ξ k + ae k + be l , with a 2 + b 2 ≤ δ 2 . In particular, perturbations of the form (32) will return to ξ k . This is because this set remains in the region of validity for the equivalent of (30) for which the only attractor is ξ k . In the case, a kl = 0 and a lk = 1 a similar argument holds as the phase portrait corresponds to Fig. 17 reflected in the diagonal.
This shows that no perturbations of amplitude δ within the (x k , x l ) plane that give a connection from ξ k to ξ l of Fig. 18 Schematic diagram showing the dynamics in the y k -y l plane when a kl = a lk = 0. Equilibria are shown with dots, and all equilibria are linearly stable amplitude δ. However, we cannot rule out the existence of connections from other locations in B δ (ξ k ) to ξ l . This would be needed to prove that the realisation is almost complete.