Multi-scale verification of distributed synchronisation

Algorithms for the synchronisation of clocks across networks are both common and important within distributed systems. Here we address not only the formal modelling of these algorithms, but also the formal verification of their behaviour. Of particular importance is the strong link between the very different levels of abstraction at which the algorithms may be verified. Our contribution is primarily the formalisation of this connection between individual models and population-based models, and the subsequent verification that is then possible. While the technique is applicable across a range of synchronisation algorithms, we particularly focus on the synchronisation of (biologically-inspired) pulse-coupled oscillators, a widely used approach in practical distributed systems. For this application domain, different levels of abstraction are crucial: models based on the behaviour of an individual process are able to capture the details of distinguished nodes in possibly heterogeneous networks, where each node may exhibit different behaviour. On the other hand, collective models assume homogeneous sets of processes, and allow the behaviour of the network to be analysed at the global level. System-wide parameters may be easily adjusted, for example environmental factors inhibiting the reliability of the shared communication medium. This work provides a formal bridge across the “abstraction gap” separating the individual models and the population-based models for this important class of synchronisation algorithms.


Introduction
Small computing devices comprising networks, be it commercial wireless sensor networks, or communicating devices in the Internet of Things, are becoming increasingly common. However, to enable these devices to communicate efficiently, they have to employ methods to use the shared communication medium while avoiding conflicting messages on this medium, in particular in the form of collisions. Collisions occur if two or more devices simultaneously try to access the communication medium, and often result in neither message being delivered. Several protocols to organise shared medium access have been developed and analysed [1,51]. These protocols typically identify a common time frame and divide this frame into slots associated to each node. Thus every node has an allocated time slot that it may use to send its messages onto the shared medium.
This work was supported by the Sir Joseph Rotblat Alumni Scholarship at Liverpool and the Engineering and Physical Sciences Research Council, under Grants EP/N007565/1 (S4: Science of Sensor Systems Software), EP/L024845/1 (Verifiable Autonomy), and the FAIR-SPACE (EP/R026092/1) and RAIN (EP/R026084/1) RAI Hubs.
Such an approach introduces the need for a common clock between the nodes, i.e., they need to synchronise. A valuable approach to achieve synchrony of nodes is the implementation of biologically-inspired pulse-coupled oscillators (PCOs) [38]. A network of PCOs synchronises in the following way: all oscillators have a similar clock cycle at the end of which they fire. That is, they transmit a broadcast message which is received by all oscillators in their communication range. These oscillators then adjust their own position within their clock cycle according to a phase response function. Depending on the concrete implementation, they may move their current position within the clock cycle closer to its end, or closer to its start.
Most analyses of the synchronisation behaviour of PCOs are concerned with continuous clock cycles, i.e., where clocks take real values from the interval [0, 1]. However, the smaller devices get, the more important it is to save memory and computing time for such a low-level functionality. Even a floating point number may need too much memory, compared to an implementation with, for example, a four-bit vector. Hence, in previous work, we chose to analyse the behaviour of discrete time PCOs [24].
In contrast to continuous time PCOs, networks of discrete time PCOs are not always guaranteed to synchronise. Instead, whether they synchronise or not depends on the type of coupling between the oscillators and their common phase-response function. We analysed the behaviour of such networks for different parameters via model-checking, to check both qualitatively for which parameters the networks synchronise, as well as quantitatively for how long they need to achieve a synchronised state and how much energy is used to achieve this [25]. For large numbers of single oscillators, for example in wireless sensor networks, the well-known state-space explosion problem of the model-checking approach is extremely important [14]. We formalised a network of oscillators as population models [20] which exploit the behavioural homogeneity of the nodes to encode the global state efficiently. This allows the network size to be increased above what would be feasible when distinguishing each node, with the restriction that only fully-connected networks, where all sensors can communicate with all other sensors, can be modelled. But the construction of a population model from a given oscillator specification is not straightforward, and in particular, it is not obvious whether the constructed population model correctly reflects the behaviour of the oscillators. This results in an 'abstraction gap': after abstracting into populations, how can we be sure that the abstraction process was correct and that the results of verification of population models actually hold for the concrete models on which they are based?
In this paper, we remedy this lack of certainty by proving the correspondence of our population model with an explicit formalisation of the oscillators. To that end, we present the concrete oscillator model as well as its formalisation as a discrete-time Markov chain. Subsequently we describe the corresponding population model, and show how we can, in addition to the abstraction created by the populations, reduce the state space even further to facilitate the analysis. Finally, we prove that the behaviour of a network of concrete oscillators and the population model are probabilistically weakly bisimilar. We cannot prove a one-to-one correspondence, since the concrete model implicitly includes the possibility of identifying individual oscillators, which is exactly what the population model abstracts from. In particular, our contributions are:
- the definition of a model for fully-connected networks of pulse-coupled oscillators (Sect. 4),
- the detailed definition of a population model (Sect. 5), based on previous work [24,25],
- a way of reducing the size of the formal population models (Sect. 5.5),
- a proof that these two models are probabilistically weakly bisimilar (Theorem 3), and
- an evaluation of synchronisation behaviour using probabilistic model-checking (Sect. 7).
The paper is structured as follows. In Sect. 2, we review a selection of related work, both for models of pulse-coupled oscillators, as well as approaches for their verification. After an introduction of preliminary notions in Sect. 3, we present the concrete model of single oscillators as a discrete-time Markov chain in Sect. 4. The abstract model in terms of population models and proofs about their properties are contained in Sect. 5. In Sect. 6, we prove the correspondence between the two types of models. The experimental evaluation of synchronisation behaviour is presented in Sect. 7, and Sect. 8 concludes the paper.

Related work
The canonical model of pulse-coupled oscillators, and their synchronisation, was formulated by Mirollo and Strogatz [38], and based on Peskin's model of a cardiac pacemaker [43].
Here the progression of an oscillator through its oscillation cycle is given by a real value in the interval [0, 1]. Mirollo and Strogatz proved that with a convex phase response function, a network of mutually coupled oscillators always converges, i.e., their position within the oscillation cycle eventually coincides. Such a model has been shown to be applicable to the clock synchronisation of wireless sensor nodes [47] and swarms of robots [41].
Synchronisation algorithms based on pulse-coupled oscillators are often beneficial in unreliable, decentralised networks, where other synchronisation algorithms are not appropriate. For example, the Flooding Time Synchronisation Protocol (FTSP) [36] requires the use of an arbitrary root node. In situations where the root becomes unavailable due to communication failure or power outage, FTSP will have to assign another root node. When implemented on unreliable, decentralised networks, FTSP may spend considerable resources on repeatedly assigning root nodes, which may slow down or prevent synchronisation [11]. Other algorithms such as the Berkeley algorithm [26] and Cristian's algorithm [16] require the use of centralised time servers, which is problematic for unreliable, decentralised networks.
Several decentralised network algorithms for synchronisation are based on pulse-coupled oscillators [47,50]. For example, the Gradient Time Synchronisation Protocol (GTSP) by Sommer and Wattenhofer [46] achieves synchronisation by having nodes send their current clock value to their neighbours. Each node then calculates the average of the clock values received and its own clock value. This process is then repeated to maintain synchronisation. Another approach to synchronisation, the Pulse-Coupled Oscillator Protocol [39], makes use of refractory periods after sending messages containing time information. During the refractory period, no more messages are sent, which reduces network bandwidth and energy usage. A similar approach is used in the FiGo protocol [11], which combines biologically inspired synchronisation with information distribution via gossiping. All of these approaches use different phase response functions.
In general, synchronisation algorithms based on PCOs are more robust for unreliable networks, as they do not require centralised nodes and can work with only partial network connectivity [11]. They are particularly useful for battery-powered nodes in wireless networks, as the node can be placed in a low-power mode during the refractory period, thus reducing energy usage. (The clock keeps ticking even in low-power mode, thanks to the design of microcontrollers such as the 'Atmel ATmega128L' [6].) Synchronisation of clocks for networks of nodes has been investigated from different perspectives. Heidarian et al. [29] analysed the behaviour of a synchronisation protocol based on time allocation slots for up to four nodes and different topologies, from fully connected networks to line topologies. They modelled the protocol as timed automata [3], and used the model-checker UPPAAL [10] to examine its worst-case behaviour. Their model is based on continuous time, and in particular, they did not model pulse-coupled oscillators.
Bartocci et al. [8] described pulse-coupled oscillators as extended timed automata with suitable semantics to model their peculiarities. They defined a dedicated logic to analyse the behaviour of a network of such automata along traces, and used a pacemaker as a case study to verify the eventual synchronisation and the time needed to achieve this.
Our models and methods are different from all of these approaches. A key difference in our work from that of others analysing PCOs is that we define the oscillation cycle to consist of discrete steps. To the best of our knowledge, with the exception of the paper by Webster et al. (including some of the authors of this paper) [49] and our previous work [24,25], there is no other work concerned with PCOs with discrete oscillation cycles. Furthermore, all of these approaches distinguish between single oscillators in the network, while the properties of interest relate to global behaviour. This discrepancy between local modelling and global analysis restricts the size of networks that can be analysed, due to the state-space explosion. To extend the size of analysable networks, we employ population models, a counting-abstraction of such networks [19]. Instead of identifying each oscillator on its own, we record how many oscillators are in each step of the oscillation cycle. This significantly reduces the state-space by exploiting the symmetries in the model [20], and we are hence able to extend the size of networks.
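The counting abstraction can be illustrated with a small sketch. It assumes, purely for illustration, that a concrete network state is given as a list of phases with one entry per oscillator; the function names are hypothetical and are not taken from the cited work. The population view keeps only the number of oscillators in each phase.

```python
from collections import Counter
from math import comb

def to_population(phases, T):
    """Map a list of per-oscillator phases (1..T) to a tuple of counts per phase."""
    counts = Counter(phases)
    return tuple(counts.get(phase, 0) for phase in range(1, T + 1))

def concrete_state_count(N, T):
    """Number of phase assignments when each of N oscillators is distinguished."""
    return T ** N

def population_state_count(N, T):
    """Number of ways to distribute N indistinguishable oscillators over T phases."""
    return comb(N + T - 1, T - 1)

# Two networks that differ only in which oscillator holds which phase
# collapse to the same population vector.
a = to_population([3, 1, 3, 2], T=4)
b = to_population([1, 2, 3, 3], T=4)
```

For N = 4 oscillators and T = 10 phases, this counts 10000 phase assignments against 715 population vectors (ignoring modes and any environment state).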
The notion of population models should not be confused with population protocols [5], a formalism to express distributed algorithms. In contrast to our setting, communication in population protocols is always between two agents, where one agent initiates the communication and the other responds. Furthermore, even though the agents cannot identify the other agents in the network, within the global model each agent is uniquely associated with a state. In our model, we cannot distinguish between two different agents sharing the same state, even at the global level. Finally, our oscillators may change their state without interacting with other oscillators, while the agents in a population protocol must communicate with another agent to change their internal state.
Other techniques have been used to model populations of processes. For example, population-based models using PEPA, a stochastic process algebra, are discussed in [30]. There, the modelling of individuals using PEPA is introduced and, if the identification of individuals is not necessary (as in our work), a population-based approach is advocated to allow larger populations to be modelled. Unlike our approach, these population-based models make use of a continuous approximation of the discrete behaviour.
Chemical Reaction Networks (CRNs), see for example [12,45], have been used to represent the behaviour of reactions between chemicals in a solution. These have been provided with different semantics including both deterministic and stochastic semantics. CRNs have been used to model asynchronous logic circuits with properties of the models being verified using the probabilistic model checker PRISM [13]. They have also been investigated to analyse their capacity to represent discrete probability distributions focusing on their steady state [12]. In our work we model both a fully connected network of oscillators and the population models directly as stochastic processes, in particular discrete-time Markov chains. We focus on synchronisation properties rather than their steady state. Like [13], we use PRISM to verify these properties.
Similarly to typical definitions of counter abstractions [9,21], we use counters to model concurrent entities that are indistinguishable for our purposes. For example, to analyse the probability of eventually reaching a synchronised state, we are not interested in an order of oscillators, which would be artificial anyway. However, in contrast to these approaches, we do not include means to introduce new oscillators into a model. That is, the values within our population models are naturally bounded by the number of oscillators within the network.

Preliminaries
In this section we define discrete-time Markov chains (DTMCs), stochastic processes with discrete state space and discrete time, and introduce Probabilistic Computation Tree Logic (PCTL), a logic that can be used to reason about probabilistic reachability and rewards in these processes.
Throughout this paper, we use the notation f ⊕[x → y], where f is a function, to express updating f at x by y. That is, the function that coincides with f , except for x, where it takes the value y.

Discrete-time Markov chains
DTMCs can be used to model systems where the discrete-time evolution of the system can be represented by a discrete probabilistic choice over several outcomes at each step.
Definition 1 A discrete-time Markov chain $D$ is a tuple $(S, \sigma_I, P, L)$ where $S$ is a finite set of states, $\sigma_I \in S$ is the initial state, and $L : S \to \mathcal{P}(\mathcal{L})$ is a labelling function that assigns properties of interest from a set of labels $\mathcal{L}$ to states. $P : S \times S \to [0, 1]$ is the transition probability matrix subject to $\sum_{\sigma' \in S} P(\sigma, \sigma') = 1$ for all $\sigma \in S$, where $P(\sigma, \sigma')$ gives the probability of transitioning from $\sigma$ to $\sigma'$. We say that there is a transition between two states $\sigma, \sigma' \in S$ if $P(\sigma, \sigma') > 0$.
Intuitively, a DTMC is a state transition system where transitions between states are labelled with probabilities greater than 0 and where states are labelled with properties of interest. An execution path $\omega$ of a DTMC $D = (S, \sigma_I, P, L)$ is a non-empty finite, or infinite, sequence $\sigma_0 \sigma_1 \sigma_2 \cdots$ where $\sigma_i \in S$ and $P(\sigma_i, \sigma_{i+1}) > 0$ for $i \ge 0$. We denote the set of all paths starting in state $\sigma$ by $Paths^D(\sigma)$, and the set of all finite paths starting in $\sigma$ by $Paths^D_f(\sigma)$. For paths where the first state along that path is the initial state $\sigma_I$ we will simply use $Paths^D$ and $Paths^D_f$. Furthermore, we will use $Paths$ and $Paths_f$ if $D$ is clear from the context. For a finite path $\omega_f \in Paths_f(\sigma)$ for some state $\sigma$, the cylinder set of $\omega_f$ is the set of all infinite paths in $Paths(\sigma)$ that share $\omega_f$ as a prefix. The probability of taking a finite path $\sigma_0 \sigma_1 \cdots \sigma_n$ is given by $\prod_{i=1}^{n} P(\sigma_{i-1}, \sigma_i)$. This measure over finite paths can be extended to a probability measure $Pr_\sigma$ over the set of infinite paths $Paths(\sigma)$, where the smallest sigma-algebra over $Paths(\sigma)$ is the smallest set containing all cylinder sets for paths in $Paths_f(\sigma)$. For the probability measure over paths where the first state is the initial state $\sigma_I$ we will simply use $Pr$. For a detailed description of the construction of the probability measure we refer the reader to [31].
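As a minimal sketch of these notions, the following Python fragment represents a transition probability matrix as nested dictionaries and computes the probability of a finite path as the product of its transition probabilities. The three-state chain and all names are hypothetical and serve only as an illustration.

```python
from math import prod

def is_stochastic(P, tol=1e-9):
    """Check that every row of the transition matrix sums to 1."""
    return all(abs(sum(row.values()) - 1.0) <= tol for row in P.values())

def path_probability(P, path):
    """Product of P(s_{i-1}, s_i) along a finite path; 1 for a single state."""
    return prod(P[s].get(t, 0.0) for s, t in zip(path, path[1:]))

# Hypothetical three-state chain used only for illustration.
P = {
    "a": {"a": 0.5, "b": 0.5},
    "b": {"b": 0.75, "c": 0.25},
    "c": {"c": 1.0},
}

p_abc = path_probability(P, ["a", "b", "c"])   # 0.5 * 0.25
p_ac = path_probability(P, ["a", "c"])         # no such transition, so not a path
```

A sequence that uses a missing transition, such as a followed directly by c, receives probability 0 and is therefore not an execution path in the sense defined above.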

Probabilistic computation tree logic
Probabilistic Computation Tree Logic [28] (PCTL) is a probabilistic extension of the temporal logic CTL. Properties for DTMCs can be formulated in PCTL and then checked against the DTMCs using model checking.

Definition 2
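The syntax of PCTL is given by:
$$\Phi ::= p \mid \neg\Phi \mid \Phi \wedge \Phi \mid \mathrm{P}_{\bowtie\lambda}[\Psi] \qquad\qquad \Psi ::= \mathrm{X}\,\Phi \mid \Phi\ \mathrm{U}\ \Phi$$
where $p$ is an atomic proposition taken from the set of labels $\mathcal{L}$, $\bowtie\ \in \{<, \le, \ge, >\}$ and $\lambda \in [0, 1]$.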
Formulas denoted by Φ are state formulas and formulas denoted by Ψ are path formulas. A PCTL formula is always a state formula, and a path formula can only occur inside the P operator. We now give the semantics of PCTL over a DTMC.

Definition 3
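Given a DTMC $D = (S, \sigma_I, P, L)$, we inductively define the satisfaction relation $\models$ for any state $\sigma \in S$ as follows:
$$\begin{aligned}
\sigma \models p &\iff p \in L(\sigma)\\
\sigma \models \neg\Phi &\iff \sigma \not\models \Phi\\
\sigma \models \Phi \wedge \Phi' &\iff \sigma \models \Phi \text{ and } \sigma \models \Phi'\\
\sigma \models \mathrm{P}_{\bowtie\lambda}[\Psi] &\iff Pr_\sigma\{\omega \in Paths(\sigma) \mid \omega \models \Psi\} \bowtie \lambda
\end{aligned}$$
where $p \in \mathcal{L}$, and for any path $\omega = \sigma_0 \sigma_1 \sigma_2 \cdots$ of $D$ as follows:
$$\begin{aligned}
\omega \models \mathrm{X}\,\Phi &\iff \sigma_1 \models \Phi\\
\omega \models \Phi\ \mathrm{U}\ \Phi' &\iff \exists k \ge 0.\ \sigma_k \models \Phi' \text{ and } \forall i < k.\ \sigma_i \models \Phi
\end{aligned}$$
Disjunction, true, false, and implication are derived as usual, and we define eventuality as $\mathrm{F}\,\Phi \equiv \mathit{true}\ \mathrm{U}\ \Phi$. When model checking any PCTL formula of the form $\mathrm{P}_{\bowtie\lambda}[\Psi]$ the actual probability is first calculated and then compared to the bound $\lambda$ [35]. We will denote this calculated probability value by $\mathrm{P}_{=?}[\Psi]$.
While probabilistic reachability properties allow us to quantitatively analyse models with respect to the likelihood of reaching some set of states, they do not allow us to reason about other properties of interest, for instance the expected time taken for a network to synchronise [24], or the expected energy consumption of a network [25]. Therefore, we will often want to augment the DTMC corresponding to a population model with rewards. We do this by annotating states and transitions with real-valued rewards (respectively costs, should values be negative) that are awarded when states are visited, or transitions taken.
Definition 4 Given a DTMC $D = (S, \sigma_I, P, L)$ a reward structure for $D$ is a pair $R = (R_s, R_t)$ where $R_s : S \to \mathbb{R}$ and $R_t : S \times S \to \mathbb{R}$ are the state reward and transition reward functions that respectively map states and transitions in $D$ to real valued rewards.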
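For any finite path $\omega = \sigma_0 \cdots \sigma_k$ of $D$ we define the total reward accumulated along that path up to, but not including, $\sigma_k$ as
$$tot_R(\omega) = \sum_{i=0}^{k-1} \big( R_s(\sigma_i) + R_t(\sigma_i, \sigma_{i+1}) \big).$$
Given a DTMC $D = (S, \sigma_I, P, L)$ augmented with a reward structure $R$, and some state $\sigma_0 \in S$, we will often want to reason about the reward that is accumulated along a path $\omega = \sigma_0 \sigma_1 \sigma_2 \cdots \in Paths(\sigma_0)$ that eventually passes through some set of target states $\Omega \subset S$. We first define a random variable over the set of infinite paths starting in state $\sigma_0$, $V_{\Omega,\sigma_0} : Paths(\sigma_0) \to \mathbb{R} \cup \{\infty\}$. If $\sigma_0 = \sigma_I$ then we will simply use $V_\Omega$. Given the set $\omega_\Omega = \{ j \mid \sigma_j \in \omega \cap \Omega \}$ of indices of states in $\omega$ that are in $\Omega$ we define the random variable
$$V_{\Omega,\sigma_0}(\omega) = \begin{cases} \infty & \text{if } \omega_\Omega = \emptyset\\ tot_R(\sigma_0 \cdots \sigma_k) & \text{otherwise, where } k = \min \omega_\Omega \end{cases}$$
and define the expectation of $V_{\Omega,\sigma_0}$ with respect to $Pr_{\sigma_0}$ by
$$E[V_{\Omega,\sigma_0}] = \int_{\omega \in Paths(\sigma_0)} V_{\Omega,\sigma_0}(\omega)\, dPr_{\sigma_0}.$$
The logic of PCTL can be extended to include reward properties by introducing the state formula $\mathrm{R}_{\bowtie r}[\mathrm{F}\,\Phi]$, where $\bowtie\ \in \{<, \le, \ge, >\}$ and $r \in \mathbb{R}$ [33]. Given a state $\sigma \in S$, a real value $r$, and a PCTL state formula $\Phi$, the semantics of this formula is given by
$$\sigma \models \mathrm{R}_{\bowtie r}[\mathrm{F}\,\Phi] \iff E[V_{Sat(\Phi),\sigma}] \bowtie r,$$
where $Sat(\Phi)$ denotes the set of states in $S$ that satisfy $\Phi$. Similarly to the operator P, for any PCTL formula of the form $\mathrm{R}_{\bowtie r}[\Psi]$ we will denote the calculated expected value by $\mathrm{R}_{=?}[\Psi]$.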
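To make the expected-reward notion concrete, the sketch below approximates the expected reward accumulated before a target set is first reached, using a simple fixed-point iteration. It is an illustration under stated assumptions, not the procedure used by any particular model checker: only state rewards are considered, the target is assumed to be reached with probability 1 from every state, and the two-state chain and all names are hypothetical.

```python
def expected_reward_to_target(P, state_reward, target, iterations=1000):
    """Approximate the expected state reward accumulated before reaching `target`.

    Assumes the target set is reached with probability 1 from every state;
    otherwise the true expectation is infinite and the result is not meaningful.
    Transition rewards are omitted to keep the sketch short.
    """
    value = {s: 0.0 for s in P}
    for _ in range(iterations):
        # The new values are computed from the previous iterate only.
        value = {
            s: 0.0 if s in target
            else state_reward[s] + sum(p * value[t] for t, p in P[s].items())
            for s in P
        }
    return value

# Hypothetical chain: from "wait" the target "sync" is reached with probability 0.5 per step.
P = {"wait": {"wait": 0.5, "sync": 0.5}, "sync": {"sync": 1.0}}
reward = {"wait": 1.0, "sync": 0.0}   # one unit of reward (for example, time) per step
values = expected_reward_to_target(P, reward, target={"sync"})
```

With a reward of 1 per step, the value for the state wait approaches 2, the expected number of steps until the target is reached in this chain.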

Concrete model of a network of pulse-coupled oscillators
In this section we give a brief introduction to the formal model of a single pulse-coupled oscillator, as originally presented in previous work [24]. Subsequently, we encode fully-coupled networks of such oscillators as discrete-time Markov chains.

Pulse-coupled oscillator model
A model of a pulse-coupled oscillator is composed of two features. Firstly, we need to model the oscillation, a periodic variation of the state of the oscillator. In our model the phase of an oscillator indicates its progression through an oscillation cycle, which is divided into discrete steps. We assume that the oscillation frequency is the same for all oscillators; the internal clocks of all oscillators are running at the same speed. Secondly, the model must encapsulate the interactions between the oscillators. Oscillators that are pulse-coupled interact with each other at discrete times during their oscillation cycles. At some distinguished point an oscillator transmits a message to other oscillators that react to the reception of the message by adjusting their phase. We assume that the duration of time corresponding to an increment of 1 to the phase of an oscillator is long enough to perceive all messages coming from other oscillators. That is, the only way a message may be lost is if the sending oscillator fails to transmit its message.
The phase of an oscillator $u$ at time $t$ is denoted by $\phi_u(t)$. The phase of each $u$ progresses through a sequence of discrete integer values bounded by some $T \ge 1$, its maximal phase. The phase progression over time of a single uncoupled oscillator is determined by the successor function, where the phase increases over time until it exceeds $T$, at which point the oscillator will fire in the next moment in time and the phase will reset to one. The phase progression of an uncoupled oscillator is therefore cyclic with period $T$, and we refer to one cycle as an oscillation cycle.
When an oscillator fires, it may happen that its firing is not perceived by any of the other oscillators coupled to it. We call this a broadcast failure and denote its probability by $\mu \in [0, 1]$. Note that $\mu$ is a global parameter, hence the chance of broadcast failure is identical for all oscillators. When an oscillator fires, and a broadcast failure does not occur, it perturbs the phase of all oscillators to which it is coupled; we use $\alpha_u(t)$ to denote the number of all other oscillators that are coupled to $u$ and will fire at time $t$.

Definition 5
The phase response function is a positive increasing function $\Delta : \{1, \ldots, T\} \times \mathbb{N} \times \mathbb{R}^+ \to \mathbb{N}$ that maps the phase of an oscillator $u$, the number of other oscillators perceived to be firing by $u$, and a real value $\epsilon$ defining the strength of the coupling between oscillators, to an integer value corresponding to the perturbation to phase induced by the firing of oscillators where broadcast failures did not occur. We require $\Delta(\Phi, 0, \epsilon) = 0$ for all possible phase response functions, that is, oscillators are only perturbed if they perceive at least one other firing oscillator.
We can introduce a refractory period into the oscillation cycle of each oscillator. A refractory period is an interval of discrete values $[1, R] \subseteq [1, T]$, where $0 \le R \le T$ is the size of the refractory period, such that if $\phi_u(t)$ is inside the interval, for some oscillator $u$ at time $t$, then $u$ cannot be perturbed by other oscillators to which it is coupled. If $R = 0$ then we set $[1, R] = \emptyset$, and there is no refractory period at all.
The refractory function $ref : \{1, \ldots, T\} \times \mathbb{N} \to \mathbb{N}$ is defined as
$$ref(\Phi, \delta) = \begin{cases} \Phi & \text{if } \Phi \in [1, R]\\ \Phi + \delta & \text{otherwise} \end{cases}$$
and takes as parameters $\delta$, the degree of perturbance to the phase of an oscillator, and $\Phi$, the phase, and returns $\Phi$ if it is in the refractory period, or $\Phi + \delta$ otherwise.
The phase evolution of an oscillator $u$ over time is then defined as follows, where the update function and firing predicate respectively denote the updated phase and firing of oscillator $u$, and $\epsilon$ is the coupling strength:
$$\begin{aligned}
update_u(t) &= 1 + ref\big(\phi_u(t), \Delta(\phi_u(t), \alpha_u(t), \epsilon)\big)\\
fire_u(t) &= update_u(t) > T\\
\phi_u(t+1) &= \begin{cases} 1 & \text{if } fire_u(t)\\ update_u(t) & \text{otherwise} \end{cases}
\end{aligned}$$
For real deployments of synchronisation protocols it is often the case that the duration of a single oscillation cycle will be at least several seconds [15,42]. The perturbation induced by the firing of a group of oscillators may lead to groups of other oscillators to which they are coupled firing in turn. The firing of these other oscillators may then cause further oscillators to fire, and so forth, leading to a "chain reaction", where each group of oscillators triggered to fire is absorbed by the initial group of firing oscillators. Since the whole chain reaction of absorptions may occur within just a few milliseconds, and in our model the oscillation cycle is a sequence of discrete states, when a chain reaction occurs the phases of all perturbed oscillators should be updated in one single time step.
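The phase evolution of a single oscillator can be sketched in Python as follows. The sketch assumes the linear phase response with rounding to the nearest integer that is used in the worked example of Sect. 4.2; the function names and the parameter values T = 9 and R = 2 in the usage lines are illustrative assumptions.

```python
import math

def phase_response(phase, alpha, epsilon):
    """Assumed phase response: phase * alpha * epsilon, rounded to the nearest integer."""
    return math.floor(phase * alpha * epsilon + 0.5)

def ref(phase, delta, R):
    """Ignore the perturbation delta while the phase lies in the refractory period [1, R]."""
    return phase if 1 <= phase <= R else phase + delta

def update(phase, alpha, epsilon, R):
    """Phase after one time step, before any reset caused by firing."""
    return 1 + ref(phase, phase_response(phase, alpha, epsilon), R)

def fire(phase, alpha, epsilon, T, R):
    """An oscillator fires when its updated phase exceeds the maximal phase T."""
    return update(phase, alpha, epsilon, R) > T

def next_phase(phase, alpha, epsilon, T, R):
    """Phase at time t + 1: reset to 1 on firing, otherwise the updated phase."""
    if fire(phase, alpha, epsilon, T, R):
        return 1
    return update(phase, alpha, epsilon, R)

# Illustrative parameters (assumptions): T = 9, R = 2, coupling strength 0.3.
perturbed_to_fire = next_phase(7, 1, 0.3, T=9, R=2)   # 1 + 7 + 2 = 10 > 9, so reset to 1
perturbed_only = next_phase(4, 1, 0.3, T=9, R=2)      # 1 + 4 + 1 = 6
refractory = next_phase(2, 3, 0.3, T=9, R=2)          # perturbation ignored, 1 + 2 = 3
```

Using `math.floor(x + 0.5)` rather than Python's built-in `round` avoids round-half-to-even behaviour, which would not match the usual reading of rounding to the nearest integer.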

Modelling the network as a DTMC
In this section, we present our model of a fully-connected network of pulse-coupled oscillators. Observe that by the definition of single PCOs, the reaction of an oscillator u to incoming communication of other oscillators depends on the number of oscillators communicating and the implementation of the perturbation function. We choose to present the full semantics of such a network, instead of defining it as the parallel composition of single oscillators. After presenting the semantics, we will discuss this decision.
We model the whole, fully-connected network of oscillators as a single DTMC $D = (S, s_I, P, L)$, where each state $s \in S$ denotes a global state of the network, with the exception of $s_I$, which is a distinguished initial state. More precisely, each state of the DTMC contains the internal states of each oscillator, as well as an abstraction of the environment.
We model each transition of an oscillator as a single transition within the DTMC. However, since the oscillators may influence each other within a single time step (that is, when they are firing), we cannot simply allow for arbitrary sequences of transitions. For instance, as stated in Sect. 4.1, all oscillators run with the same clock-speed. Hence we need to prevent a single oscillator from taking a transition and thus progressing its phase without giving the other oscillators a chance to do the same. We achieve this by the following means:
- we divide the internal computation of each oscillator into two modes: start and update, and
- we add a counter to the model, containing the number of oscillators that fire.
The counter also possesses both modes, and resets at the start of each "round" of computation. First, in the start mode, each oscillator checks whether it would fire, according to its phase response function and the current number of oscillators that already fired, as given by the counter. If it does, it increases the counter and updates its mode to update, otherwise it just updates its mode. If all oscillators are in the update mode, they compute their new phases in a single step, according to the phase response function and the current state of the environment counter. Furthermore, we impose an order on the evaluation of the oscillators in the start mode if at least one oscillator fires, starting from the highest phase to the lowest. In this way, we model that firing oscillators are always perceived by the other nodes, and thus may lead to the firing of the latter. In particular, this reflects the absorptions of messages as defined in the previous section.
The general idea of the progress of the network of oscillators is visualised in Fig. 1. In the figure, each rounded rectangle shows a state of a network of four oscillators. The circles represent the nodes, where we inscribe their current phases and an abbreviation of their modes. A node that is about to fire is indicated by a starred circle, while a shaded circle indicates a node that is within the refractory period. The rectangle denotes the environment counter, with its corresponding value and mode. The phase response function is $\Delta(\Phi, \alpha, \epsilon) = [\Phi \cdot \alpha \cdot \epsilon]$, where $[\cdot]$ denotes rounding to the nearest integer. We set the coupling strength $\epsilon = 0.3$, and $\mu = 0.2$.
In the first state, all outgoing transitions only check whether to increase the counter. Since no oscillator is in the firing phase, all oscillators just update their mode (observe that the single arrow actually denotes four transitions). In the next step, all oscillators increase their phase by one, and reset their mode to start. In the next four transitions, oscillator 2 fires and increases the counter, which in turn is sufficient for oscillator 3 to fire as well, since 7 + [7 · 1 · 0.3] + 1 = 7 + [2.1] + 1 = 7 + 2 + 1 = 10 > T. Oscillator 3, however, then fails to fire, and does not increase the counter. Neither oscillator 1 nor oscillator 4 gets perturbed to fire: the latter because it is still in its refractory period, and thus ignores the firing oscillator, and the former since 4 + [4 · 1 · 0.3] + 1 = 4 + [1.2] + 1 = 4 + 1 + 1 = 6 ≤ T. During the last transition of the example, oscillators 2 and 3 reset their phase to one, while oscillator 1 is perturbed and increases its phase by two steps at once. Oscillator 4 is within its refractory period, which means that it is not perturbed, and simply increments its phase. In addition to these transitions, we also need some bookkeeping transitions, to ensure that the counter is reset before the oscillators check their phase response. Furthermore, observe that in the example, it is crucial that oscillator 3 checks its response after oscillator 2 increased the counter, since otherwise oscillator 3 would not have been perturbed to fire.
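The round described in this example can be replayed with a short Python sketch. It collapses the individual DTMC transitions of one round into a single function and fixes the set of failed broadcasts in advance, so it is deterministic rather than probabilistic. The values T = 9 and R = 2, the concrete phases of oscillators 2 and 4, and all function names are assumptions chosen to be consistent with the arithmetic in the text.

```python
import math

def phase_response(phase, alpha, epsilon):
    """Phase response [phase * alpha * epsilon], rounded to the nearest integer."""
    return math.floor(phase * alpha * epsilon + 0.5)

def update(phase, alpha, epsilon, R):
    """Updated phase; perturbations are ignored inside the refractory period [1, R]."""
    delta = 0 if 1 <= phase <= R else phase_response(phase, alpha, epsilon)
    return 1 + phase + delta

def run_round(phases, failed, epsilon, T, R):
    """Evaluate one round for a fully connected network.

    phases: dict mapping oscillator id to phase.
    failed: ids whose broadcast is lost (in the model this happens with
            probability mu; it is fixed here to keep the sketch deterministic).
    Returns the final counter value and the new phases.
    """
    counter = 0          # environment counter, reset at the start of the round
    fires = set()
    # Mode "start": check oscillators from the highest phase down to the lowest.
    for u in sorted(phases, key=lambda u: phases[u], reverse=True):
        if update(phases[u], counter, epsilon, R) > T:
            fires.add(u)
            if u not in failed:
                counter += 1   # only successful broadcasts are counted
    # Mode "update": all oscillators compute their new phase in a single step.
    new_phases = {
        u: 1 if u in fires else update(phases[u], counter, epsilon, R)
        for u in phases
    }
    return counter, new_phases

# Oscillator 2 is at the maximal phase, oscillator 4 is refractory,
# and the broadcast of oscillator 3 fails.
counter, new_phases = run_round({1: 4, 2: 9, 3: 7, 4: 1}, failed={3},
                                epsilon=0.3, T=9, R=2)
```

Under these assumptions the counter ends at 1, oscillators 2 and 3 reset to phase 1, oscillator 1 moves from phase 4 to 6, and oscillator 4 moves from phase 1 to 2, matching the narrative of the example.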
Formally, we combine the states of the oscillators and the environment into a single state of the DTMC. Each oscillator can be described by a tuple consisting of the current phase Φ of the oscillator and the mode mode within this phase. The phase ranges from 1 to T , while the mode takes values from {start, update}. Furthermore, we use a single counter to keep track of the number of oscillators that fired successfully within a single phase computation.
For a network of $N$ oscillators, a state of the DTMC consists of the state of the environment and a function $osc$ that associates a phase and mode with each oscillator,
$$env \in \{0, \ldots, N\} \times \{start, update\}, \qquad osc : \{1, \ldots, N\} \to \{1, \ldots, T\} \times \{start, update\}.$$
A state is therefore a tuple $s = (env, osc)$, where $env$ is the state of the environment, and $osc$ is the state of the network. We denote the set of all concrete system states by $S_c$. For simplicity, we will use the notation $\phi(s, u)$ to refer to the phase of oscillator $u$ in state $s$, and similarly, $mode(s, u)$ to refer to its mode. By abuse of notation, we will also write $mode(s, env)$ to denote the mode of the environment and $count(s)$ to refer to the value of the environment counter in state $s$. We sometimes need to denote that an oscillator changes neither its phase nor its mode in the definitions of the transitions. To that end, we define the stability of oscillator $u$ between $s$ and $s'$ as
$$stable_u(s, s') \equiv \phi(s, u) = \phi(s', u) \wedge mode(s, u) = mode(s', u).$$
Similarly, we define stability of the environment:
$$stable_{env}(s, s') \equiv count(s) = count(s') \wedge mode(s, env) = mode(s', env).$$
We use the notation $init_\Phi(s) = \{u \mid mode(s, u) = start \wedge \phi(s, u) = \Phi\}$ for the set of all oscillators sharing phase $\Phi$ and mode start in the state $s$. Furthermore, we also use the notation $init(s) = \{u \mid mode(s, u) = start\}$.
We now define the transition probabilities between states. To do this we first distinguish the following cases:
1. the environment resets its counter;
2. no oscillator has a clock value of T;
3. an oscillator is in the mode start, has a clock value lower than T, is perturbed, but not enough to fire;
4. an oscillator is in the mode start, has a clock value lower than T and is perturbed enough to fire;
5. an oscillator is in the mode start, has a clock value lower than T and is perturbed enough to fire, but fails to do so;
6. an oscillator is in the mode start, has a clock value of T, and broadcasts its pulse;
7. an oscillator is in the mode start, has a clock value of T, and fails to broadcast its pulse;
8. all oscillators are in the mode update, update their clock and reset their state to start.
We will impose an order on certain transitions for two reasons. Firstly, we will restrict transitions that are only used for bookkeeping purposes. For example, we will require that the reset transition of the environment is taken before any of the transitions for the oscillators within a phase are activated. In particular, this means that each computation starts with a transition of type 1. Secondly, we need to ensure that, if at least one oscillator fires, the phase response of all oscillators is evaluated starting with oscillators in the highest phase, down to the lowest phase, as described above. The cases stated above are reflected in the following definitions for the transition probability between two states s = (env, osc) and s′ = (env′, osc′). Case 1, where the environment resets its counter, is treated as follows. As a precondition, we require that the mode of the counter is start, and that the state of the oscillators does not change from s to s′. Furthermore, the mode of the counter changes to update in s′, and its value is set to 0. Since this transition is mandatory at the beginning of each round, its probability is 1. An example of this transition can be found in Fig. 1 in the transition from state (c) to (d).
If $\mathit{mode}(s, \mathit{env}) = \mathit{start} \wedge \mathit{mode}(s', \mathit{env}) = \mathit{update} \wedge \mathit{count}(s') = 0 \wedge \forall u : \mathit{stable}_u(s, s')$, then the transition from s to s′ has probability 1.
If no oscillator is at the end of its cycle, that is, in case 2, we define the probability of one oscillator updating its mode as follows. Observe that we have to normalise the transition probability by the number of all oscillators that have not yet transitioned to their update mode. This is correct, since no oscillator fires, which also means that no oscillator can be activated beyond the maximum phase. This implies in particular that the order of oscillator transitions does not matter in this round. For example, all outgoing transitions from state (a) in Fig. 1 have probability $\frac{1}{4}$, while the subsequent transitions (which are not shown in the figure) occur with probability $\frac{1}{3}$, and so on.
If mode(s, env) = update and there is a w s.t.
Now we will consider the cases 3, 4 and 5, where some oscillator has already fired (i.e., count(s) > 0), and other oscillators are perturbed. In all three cases, one common precondition is that the counter is in its update mode and that there is an appropriate oscillator in the start mode. One complication arises: we have to ensure that the messages of firing oscillators lead to the perturbation of the other oscillators. Recall that the perturbation function is increasing, and thus a higher phase of an oscillator may result in a higher perturbation. That is, oscillators with a higher phase need to be perturbed by fewer firing oscillators before their phase is increased beyond the threshold and they in turn fire. However, if the oscillators with high phases fire, their additional messages may be enough to perturb oscillators with lower phases to fire as well. Hence, if we did not enforce an order from high to low phases, oscillators with a lower phase might not be perturbed when oscillators with a higher phase fire. To solve this, we only allow the oscillators to update their mode once all oscillators with a higher phase have been considered. Observe that we normalise the transition probabilities according to the number of oscillators satisfying similar conditions. Formally, this means that we need to normalise on the number of oscillators with the same phase in the start mode. To model case 3, we only change the mode of the corresponding oscillator to update, and keep the rest of the state. The precondition is that all oscillators with higher phases have already been considered, and that the oscillator under consideration is not perturbed enough to fire, is still within its refractory period, or both. For example, from state (f) on, the oscillators 1 and 4 are possibly perturbed, but not enough to fire. In this case, we will still first update oscillator 1, due to its higher phase.
If mode(s, env) = update and there is a w s.t.
If an oscillator is perturbed and actually fires, we update its mode to update, and increase the counter of the environment. We only allow this transition if the oscillator is perturbed to fire and is outside of its refractory period. This definition corresponds to case 4. The probability of such a transition is the probability that a broadcast failure does not occur, 1−μ, normalised by the number of oscillators in the same phase that have not yet been considered.
If the oscillator fails to fire, as described by case 5, the only differences to the preceding case are that we do not increase the counter, and that the probability of the transition is the normalised broadcast failure probability. As an example, consider the transition in Fig. 1 from state (e) to (f), where oscillator 3 is indeed perturbed to the end of its cycle, but fails to fire. That is, the environment counter is not increased. The probability of this transition is $\frac{\mu}{1} = 0.2$, since it is the only oscillator with the phase value 7.
If mode(s, env) = update and there is a w s.t.
Now we turn to the cases 6 and 7 where some oscillator is at the end of its cycle. The preconditions of both cases are similar to the preceding cases: the counter is required to be in the update mode, and there is an oscillator w, whose phase is T and mode is start. Furthermore, in s′, the mode of w is update, and the state of all other oscillators does not change. The difference between the cases is whether the counter is increased, that is, whether the oscillator manages to broadcast its signal. The probability of succeeding is $\frac{1-\mu}{|\mathit{init}_T(s)|}$, since there may be more than one oscillator in phase T at state s. Hence we have to normalise the transition probability accordingly. Similarly, the probability of failing to fire is $\frac{\mu}{|\mathit{init}_T(s)|}$. So, the probability of the transition from state (d) to (e) is $\frac{1-0.2}{1} = 0.8$, since the firing oscillator is the only oscillator in the firing phase.
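The normalisation used in cases 6 and 7 can be illustrated with a few lines of Python. This is only a sketch: the function name `firing_probabilities` is hypothetical, and the argument `same_phase_count` stands for the number of oscillators in phase T that are still in mode start.

```python
def firing_probabilities(mu, same_phase_count):
    """Return (success, failure) transition probabilities for one oscillator
    at the end of its cycle, normalised by the number of oscillators that
    share its phase and are still in mode start."""
    if same_phase_count < 1:
        raise ValueError("at least one oscillator must be waiting to fire")
    return (1 - mu) / same_phase_count, mu / same_phase_count

# Values from the example: broadcast failure probability 0.2, one oscillator.
success, failure = firing_probabilities(0.2, 1)
print(success, failure)
```

Summed over all oscillators that share the phase, the success and failure probabilities add up to one, which is what the normalisation is meant to guarantee.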
If mode(s, env) = update and there is a w s.t.
If mode(s, env) = update and there is a w s.t.
The final case 8, where all oscillators update their clock values simultaneously, is given by the following formula. It requires that all oscillators have finished determining whether they fire, and both the counter and the oscillators reset their mode to start after the transition.
If mode(s, env) = update and mode(s′, env) = start and

The formula $F_{\mathit{update}}$ is an abbreviation for the conjunction of the following four conditions, which model the update of the phases of the oscillators, according to the phase response function. Observe that the phases of the oscillators have not been updated by the previously defined transitions. Hence, we now update the phases of all oscillators at once.
In this formula, (9a) handles the simple case of firing oscillators, while (9b) defines the behaviour of oscillators within their refractory period. The formulas (9c) and (9d) reflect the two cases where oscillators are perturbed, either not exceeding their oscillation cycle, or firing, respectively. For example, in the transition from state (b) to (c) in the figure, oscillators 1 to 3 satisfy clause 9c (where Δ(Φ, 0, 0.3) = 0 by definition for all phases Φ), while oscillator 4 is updated due to clause 9b. In the transition from (g) to (h), however, oscillator 1 is again updated due to clause 9c, oscillator 2 due to clause 9a, oscillator 3 due to 9d and oscillator 4 again due to 9b.
Finally, we define the transitions from the initial state, that form a distribution over all possible initial configurations for the network. As explained above, states where any component (i.e., an oscillator or the environment) is in the mode update are considered intermediate states. Hence, we only allow for transitions from the initial state to states, where every component is in the start mode, and furthermore, the counter of the environment is set to 0. Let us denote this set of states by S , i.e., Then, for each state s ∈ S , we have

Population model
In this section, we define a population model of a network of pulse-coupled oscillators for parameters as defined in Sect. 4.1 as $P = (\Delta, N, T, R, \epsilon, \mu)$. Oscillators in our model have identical dynamics, and two oscillators are indistinguishable if they share the same phase. That is, we can reason about groups of oscillators, instead of individuals. We therefore encode the global state of the model as a tuple $\langle k_1, \ldots, k_T \rangle$ where each $k_\Phi$ is the number of oscillators sharing a phase value of Φ. The population model does not account for the introduction of additional oscillators to a network, or the loss of existing coupled oscillators. That is, the population N remains constant.

Definition 7 A global state of a population model $P = (\Delta, N, T, R, \epsilon, \mu)$ is a T-tuple $\sigma = \langle k_1, \ldots, k_T \rangle$ of non-negative integers such that $\sum_{\Phi=1}^{T} k_\Phi = N$.
The set of all global states of P is Γ(P), or simply Γ when P is clear from the context.

Example 1
Figure 2 shows four global states for an instantiated population model of N = 8 oscillators with T = 10 discrete values for their phase and a refractory period of length R = 2. We use the linear phase response function $\Delta(\Phi, \alpha, \epsilon) = [\Phi \cdot \alpha \cdot \epsilon]$, where [·] denotes rounding to the closest integer. Furthermore, let ε = 0.115. For example, $\sigma_0 = \langle 0, 0, 2, 1, 0, 0, 5, 0, 0, 0 \rangle$ is the global state where two oscillators have a phase of three, one oscillator has a phase of four, and five oscillators have a phase of seven. The starred node indicates the number of oscillators with phase ten that will fire in the next moment in time, while the shaded nodes indicate oscillators with phases that lie within the refractory period (one and two). If no oscillators have some phase Φ then we omit the 0 in the corresponding node. Observe that, while going from $\sigma_{i-1}$ to $\sigma_i$ ($1 \le i \le 3$), the oscillator phases increase by one. In the next section, we will explain how transitions between these global states are made. Note that directional arrows indicate cyclic direction, and do not represent transitions.
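To make the encoding concrete, the following Python sketch builds the tuple of per-phase counts from a list of individual oscillator phases. The helper name `global_state` is ours and is not part of the model; the phases are those of the state σ0 described in the example.

```python
from collections import Counter

def global_state(phases, T):
    """Encode individual oscillator phases (each in 1..T) as the tuple
    (k_1, ..., k_T), where k_phase counts the oscillators with that phase."""
    counts = Counter(phases)
    if any(phase < 1 or phase > T for phase in counts):
        raise ValueError("every phase must lie between 1 and T")
    return tuple(counts.get(phase, 0) for phase in range(1, T + 1))

# Two oscillators with phase 3, one with phase 4, five with phase 7.
sigma_0 = global_state([3, 3, 4, 7, 7, 7, 7, 7], T=10)
print(sigma_0)  # (0, 0, 2, 1, 0, 0, 5, 0, 0, 0)
```

The entries of the tuple always sum to the population size, here 8, which is the invariant that the population model relies on.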
With every state σ ∈ Γ we associate a non-empty set of failure vectors, where each failure vector is a tuple of broadcast failures that could occur in σ .

Fig. 2
Evolution of the global state over four discrete time steps

Transitions
In Sect. 5.2 we will describe how we can calculate the set of all possible failure vectors for a global state, and thereby identify all of its successor states. However, we must first show how we can calculate the single successor state of a global state σ, given some failure vector F.

Absorptions
Since we are considering a fully connected network of oscillators, two oscillators sharing the same phase will have their phase updated to the same value in the next time step. They will always perceive the same number of other oscillators firing. Therefore, for each phase Φ we define the function

Transition function
We now define the transition function that maps phase values to their updated values in the next time step. Note that since we no longer distinguish different oscillators with the same phase, we only need to calculate a single value for their evolution and perturbation.

Definition 9
The phase transition function τ : Γ × {1, . . . , T} × F → ℕ maps a global state σ, a phase Φ, and some possible failure vector F for σ, to the updated phase in the next discrete time step, with respect to the broadcast failures defined in F, and is defined as

Let $U_\Phi(\sigma, F)$ be the set of phase values Ψ where all oscillators with phase Ψ in σ will have their phase updated to Φ in the next time step, with respect to the broadcast failures defined in F. Formally,

We can now calculate the successor state of a global state σ and define how the model evolves over time.

Fig. 3 Evolution of the global state over four discrete time steps

Definition 10
The successor function $\overrightarrow{\mathit{succ}} : \Gamma \times F \to \Gamma$ maps a global state σ and a failure vector F to a state σ′, and is defined as

Example 2
Recall that the perturbation function of our example was given as $\Delta(\Phi, \alpha, \epsilon) = [\Phi \cdot \alpha \cdot \epsilon]$, where [·] denotes rounding and ε = 0.115. Consider the global state $\sigma_2$ of Fig. 3, where no oscillators will fire since $k_{10} = 0$. We therefore have one possible failure vector for this state, namely $F = \{\star\}^{10}$. Since no oscillators fire, the dynamics of the oscillators are determined solely by their standalone evolution, and all oscillators simply increase their phase by 1 in the next time step. Now consider the global state $\sigma_3$ and $F = \langle \star, \star, \star, \star, \star, \star, 1, 0, 0, 0 \rangle$, a possible failure vector for $\sigma_3$, indicating that oscillators with phases of 7 to 10 will fire and one broadcast failure will occur for the single oscillator that will fire with phase 7. Here a chain reaction occurs, as the perturbation induced by the firing of the 5 oscillators causes the single oscillator with a phase of 7 to also fire. A broadcast failure occurs when this single oscillator fires, and the perturbation of the 5 firing oscillators is insufficient to cause the 2 oscillators with a phase of 6 to also fire. In the next state the oscillator with phase 7 has been absorbed by the group of the 5 oscillators that had phase 10.
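The two successor computations of this example can be replayed with a short Python sketch. It rests on several assumptions that are ours rather than taken from the definitions: the placeholder entry of a failure vector (phases that do not fire) is represented by `None`, a firing group moves to phase 1, a group inside the refractory period only advances by one, and every other group is perturbed by the number of successful firings at higher phases. Rounding ties are resolved upwards, and the names `delta` and `successor` are hypothetical.

```python
from fractions import Fraction
from math import floor

NO_FIRE = None  # placeholder entry: oscillators with this phase do not fire

def delta(phase, alpha, coupling):
    """Linear phase response [phase * alpha * coupling] (ties up: assumed)."""
    return floor(phase * alpha * coupling + Fraction(1, 2))

def successor(sigma, failures, R, coupling):
    """Successor of global state sigma under a given failure vector."""
    T = len(sigma)
    nxt = [0] * T
    alpha = 0  # successful firings among the higher phases visited so far
    for phase in range(T, 0, -1):
        k, f = sigma[phase - 1], failures[phase - 1]
        if k > 0:
            if f is not NO_FIRE:
                new_phase = 1                      # the group fires
            elif phase <= R:
                new_phase = phase + 1              # refractory: no perturbation
            else:
                new_phase = phase + delta(phase, alpha, coupling) + 1
            nxt[new_phase - 1] += k
        if f is not NO_FIRE:
            alpha += k - f                         # only successful broadcasts
    return tuple(nxt)

coupling = Fraction(115, 1000)  # the value 0.115 from Example 1
sigma_3 = (0, 0, 0, 0, 0, 2, 1, 0, 0, 5)
F = (NO_FIRE,) * 6 + (1, 0, 0, 0)
print(successor(sigma_3, F, R=2, coupling=coupling))
# (6, 0, 0, 0, 0, 0, 0, 0, 0, 2)
```

Under these assumptions the oscillator with phase 7 joins the five oscillators that fired from phase 10, and the two oscillators with phase 6 are perturbed to phase 10 without firing, in line with the description above.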

Lemma 1
The number of oscillators is invariant during transitions, i.e., the successor function only creates tuples that are states of the given model. Formally, let $\sigma = \langle k_1, \ldots, k_T \rangle$ and $\sigma' = \langle k'_1, \ldots, k'_T \rangle$ be two states of a model P such that $\sigma' = \overrightarrow{\mathit{succ}}(\sigma, F)$, where F is some possible failure vector for σ. Then $\sum_{\Phi=1}^{T} k_\Phi = \sum_{\Phi=1}^{T} k'_\Phi = N$.
Proof Observe that the range of the function τ is bounded by T. By construction we can see that for any σ, for any possible failure vector F for σ, and for all Φ ∈ {1, . . . , T},

Failure vector calculation
We construct all possible failure vectors for a global state by considering every group of oscillators in decreasing order of phase. At each stage we determine if the oscillators would fire. If they fire then we consider each outcome where any, all, or none of the firings result in a broadcast failure. We then add a corresponding value to a partially calculated failure vector and consider the next group of oscillators with a lower phase. If the oscillators do not fire then there is nothing left to do: by Definition 5 we know that Δ is increasing, and therefore no oscillators with a lower phase will fire either. We can then pad the partial failure vector with ⋆ appropriately to indicate that no failure could happen since no oscillator fired. Table 1 illustrates how a possible failure vector for global state $\sigma_3$ in Fig. 3 is iteratively constructed. The first three columns respectively indicate the current iteration i, the global state $\sigma_3$ with the currently considered oscillators underlined, and the elements of the failure vector F computed so far. The fourth column is true if the oscillators with phase T + 1 − i would fire given the broadcast failures in the partial failure vector. We must consider all outcomes of any or all firings resulting in broadcast failure. The final column therefore indicates whether the value added to the partial failure vector in the current iteration is the only possible value (false), or a choice from one of several possible values (true).
Initially we have an empty partial failure vector. At the first iteration there are 5 oscillators with a phase of 10. These oscillators will fire, so we must consider each case where 0, 1, 2, 3, 4 or 5 broadcast failures occur. Here we choose 0 broadcast failures, which is then added to the partial failure vector. At iterations 2 and 3 the oscillators would have fired, but since there are no oscillators with a phase of 9 or 8 we only have one possible value to add to the partial failure vector, namely 0. At iteration 4 a single oscillator with a phase of 7 fires, and we choose the case where the firing resulted in a broadcast failure. In the final iteration oscillators with a phase of 6 do not fire, hence we can conclude that oscillators with phases less than 6 also do not fire, and can fill the partial failure vector appropriately with ⋆.
Formally, we define a family of functions fail indexed by Φ, where each $\mathit{fail}_\Phi$ takes as parameters some global state σ, and V, a vector of length T − Φ. V represents all broadcast failures for all oscillators with a phase greater than Φ. The function $\mathit{fail}_\Phi$ then computes

Table 1 Construction of a possible failure vector for a global state $\sigma_3 = \langle 0, 0, 0, 0, 0, 2, 1, 0, 0, 5 \rangle$

Observe that the result of $\mathit{fail}_T$ is always a set of well defined failure vectors, since whenever ⋆ is introduced into a failure vector at index Φ, all preceding indices are also filled with ⋆, as required by Definition 8.
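The iterative construction described above can be sketched as a recursive enumeration in Python. This is an illustration under our own assumptions, not the definition of fail: the placeholder entry is represented by `None`, oscillators with phase T fire exactly when that group is non-empty, and a lower group fires when it is outside the refractory period and its phase plus perturbation plus one exceeds T, where the perturbation counts only successful broadcasts from higher phases. Rounding ties are resolved upwards, and the names `delta` and `failure_vectors` are hypothetical.

```python
from fractions import Fraction
from math import floor

NO_FIRE = None  # placeholder entry: oscillators with this phase do not fire

def delta(phase, alpha, coupling):
    """Linear phase response [phase * alpha * coupling] (ties up: assumed)."""
    return floor(phase * alpha * coupling + Fraction(1, 2))

def failure_vectors(sigma, R, coupling):
    """Enumerate all failure vectors of sigma = (k_1, ..., k_T)."""
    T = len(sigma)
    results = []

    def visit(phase, alpha, suffix):
        # suffix holds the entries already chosen for phases phase+1 .. T
        if phase == 0:
            results.append(suffix)
            return
        k = sigma[phase - 1]
        if phase == T:
            fire = k > 0
        else:
            fire = phase > R and phase + delta(phase, alpha, coupling) + 1 > T
        if not fire:
            # the response is increasing, so no lower phase fires either
            results.append((NO_FIRE,) * phase + suffix)
            return
        for failed in range(k + 1):      # 0 .. k broadcast failures
            visit(phase - 1, alpha + (k - failed), (failed,) + suffix)

    visit(T, 0, ())
    return results

coupling = Fraction(115, 1000)  # the value 0.115 from Example 1
sigma_3 = (0, 0, 0, 0, 0, 2, 1, 0, 0, 5)
vectors = failure_vectors(sigma_3, R=2, coupling=coupling)
print((NO_FIRE,) * 6 + (1, 0, 0, 0) in vectors)  # True
```

The vector constructed step by step in the walkthrough is one of the enumerated results; the branching at each firing group corresponds to the choices marked in the final column of Table 1.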

Definition 12
Given a global state σ ∈ Γ, we define $F_\sigma$, the set of all possible failure vectors for that state, as $F_\sigma = \mathit{fail}_T(\sigma, \langle\rangle)$, and define next(σ), the set of all successor states of σ, as
$$\mathit{next}(\sigma) = \{\overrightarrow{\mathit{succ}}(\sigma, F) \mid F \in F_\sigma\}.$$
Note that for some global states $|\mathit{next}(\sigma)| < |F_\sigma|$, since we may have that $\overrightarrow{\mathit{succ}}(\sigma, F) = \overrightarrow{\mathit{succ}}(\sigma, F')$ for some $F, F' \in F_\sigma$ with $F \neq F'$.

Given a global state σ and a failure vector $F \in F_\sigma$, we will now compute the probability of a transition being made to state $\overrightarrow{\mathit{succ}}(\sigma, F)$ in the next time step. Recall that μ is the probability with which a broadcast failure occurs. Firstly we define the probability mass function PMF, where PMF(k, f) is the probability of f broadcast failures occurring given that k oscillators fire:
$$\mathit{PMF}(k, f) = \binom{k}{f} \mu^{f} (1 - \mu)^{k - f}.$$
We then denote by $\mathit{PFV} : \Gamma \times F_\sigma \to [0, 1]$ the function mapping a possible broadcast failure vector $F = \langle f_1, \ldots, f_T \rangle$ for $\sigma = \langle k_1, \ldots, k_T \rangle$ to the probability of the failures in F occurring. That is,
$$\mathit{PFV}(\sigma, F) = \prod_{\Phi = 1}^{T} \begin{cases} \mathit{PMF}(k_\Phi, f_\Phi) & \text{if } f_\Phi \neq \star, \\ 1 & \text{otherwise.} \end{cases}$$

Lemma 2 For any global state σ, PFV is a discrete probability distribution over $F_\sigma$. Formally, $\sum_{F \in F_\sigma} \mathit{PFV}(\sigma, F) = 1$.

Proof Given a global state $\sigma = \langle k_1, \ldots, k_T \rangle$ we can construct a tree of depth T where each leaf node is labelled with a possible failure vector for σ, and each node Λ at depth Φ is labelled with a vector of length Φ corresponding to the last Φ elements of a possible failure vector for σ. We denote the label of a node Λ by V(Λ). We iteratively construct the tree, starting with the root node, root, at depth 0, which we label with the empty tuple $\langle\rangle$. For each node Λ at depth $0 \le \Phi < T$ we construct the children of Λ as follows:
1. If oscillators with phase Φ fire we define the sample space $\Omega = \{0, \ldots, k_\Phi\}$ to be a set of disjoint events, where each ω ∈ Ω is the event where ω broadcast failures occur, given that $k_\Phi$ oscillators fired. For each ω ∈ Ω there is a child $\Lambda_\omega$ of Λ whose label is V(Λ) prefixed with ω, and we label the edge from Λ to $\Lambda_\omega$ with $\mathit{PMF}(k_\Phi, \omega)$.
2. If oscillators with phase Φ do not fire then Λ has a single child Λ′ whose label is V(Λ) prefixed with ⋆, and we label the edge from Λ to Λ′ with 1.
We denote the label of an edge from a node Λ to its child Λ′ by L(Λ, Λ′). For case 2 we can observe that if oscillators with phase Φ do not fire then we know that oscillators with any phase Ψ < Φ will also not fire, since from Definition 5 we know that Δ is an increasing function. Hence, all descendants of Λ′ will also have a single child, with an edge labelled with 1, and each node is labelled with the label of its parent, prefixed with ⋆. After constructing the tree we have a vector of length T associated with each leaf node, corresponding to a failure vector for σ. The set $F_\sigma$ of all possible failure vectors for σ is therefore the set of all vectors labelling leaf nodes. We denote by $P_\downarrow(\Lambda)$ the product of all labels on edges along the path from Λ back to the root. Given a global state $\sigma = \langle k_1, \ldots, k_T \rangle$ and a failure vector $F = \langle f_1, \ldots, f_T \rangle \in F_\sigma$ labelling some leaf node Λ at depth T, we can see that $P_\downarrow(\Lambda) = \mathit{PFV}(\sigma, F)$.

Let $D_\Phi$ denote the set of all nodes at depth Φ. We show $\sum_{d \in D_\Phi} P_\downarrow(d) = 1$ by induction on Φ. For Φ = 0, i.e., $D_\Phi = \{\mathit{root}\}$, the property holds by definition. Now assume that $\sum_{d \in D_\Phi} P_\downarrow(d) = 1$ holds for some $0 \le \Phi < T$. Let Λ be some node in $D_\Phi$, and let $C_\Lambda$ be the set of all children of Λ. Consider the following two cases: If oscillators with phase Φ do not fire then $|C_\Lambda| = 1$, and for the only $c \in C_\Lambda$ we have that L(Λ, c) = 1. If oscillators with phase Φ fire, observe that PMF is a probability mass function for a random variable defined on the sample space $\Omega = \{0, \ldots, k_\Phi\}$. In either case we can see that $\sum_{c \in C_\Lambda} L(\Lambda, c) = 1$.
Since $\sum_{c \in C_d} L(d, c) = 1$ for each $d \in D_\Phi$, and from the induction hypothesis, we then have that
$$\sum_{d' \in D_{\Phi+1}} P_\downarrow(d') = \sum_{d \in D_\Phi} P_\downarrow(d) \sum_{c \in C_d} L(d, c) = \sum_{d \in D_\Phi} P_\downarrow(d) = 1.$$
We have already shown that $P_\downarrow(\Lambda) = \mathit{PFV}(\sigma, F)$ for any leaf node Λ labelled with a failure vector F, and since the set of all labels for leaf nodes is $F_\sigma$ we can conclude that $\sum_{F \in F_\sigma} \mathit{PFV}(\sigma, F) = 1$. This proves the lemma.
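As a worked instance of the probability of a failure vector, the following Python sketch evaluates it for the state and failure vector used in the earlier example. It assumes that the number of broadcast failures within a firing group is binomially distributed, that entries for non-firing phases (represented by `None`) contribute a factor of 1, and that μ = 0.2, a value chosen purely for illustration. The names `pmf` and `pfv` are ours.

```python
from fractions import Fraction
from math import comb

NO_FIRE = None  # placeholder entry: oscillators with this phase do not fire

def pmf(k, f, mu):
    """Probability of exactly f broadcast failures among k firing
    oscillators, assuming independent failures with probability mu."""
    return comb(k, f) * mu ** f * (1 - mu) ** (k - f)

def pfv(sigma, failures, mu):
    """Probability of a failure vector: product over all firing groups."""
    probability = Fraction(1)
    for k, f in zip(sigma, failures):
        if f is not NO_FIRE:
            probability *= pmf(k, f, mu)
    return probability

mu = Fraction(1, 5)  # illustrative broadcast failure probability 0.2
sigma_3 = (0, 0, 0, 0, 0, 2, 1, 0, 0, 5)
F = (NO_FIRE,) * 6 + (1, 0, 0, 0)
print(pfv(sigma_3, F, mu))  # 1024/15625
```

The result is the product of the probability that none of the five oscillators with phase 10 fails, (4/5)^5, and the probability that the single oscillator with phase 7 fails, 1/5.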

Synchronisation
When all oscillators in a population model have the same phase in a global state we say that the state is synchronised. Formally, a global state $\sigma = \langle k_1, \ldots, k_T \rangle$ is synchronised if, and only if, there is some Φ ∈ {1, . . . , T} such that $k_\Phi = N$, and hence $k_{\Phi'} = 0$ for all $\Phi' \neq \Phi$. We will often want to reason about whether some particular run ω of a model leads to a global state that is synchronised. We say that a path $\omega = \sigma_0 \sigma_1 \cdots$ synchronises if, and only if, there exists some $k \ge 0$ such that $\sigma_k$ is synchronised. Once a synchronised global state is reached, any successor states will also be synchronised. Finally, we say that a model synchronises if, and only if, all runs of the model synchronise.
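The synchronisation condition on a single global state is straightforward to test. A minimal Python sketch, with the function name `is_synchronised` being our own choice:

```python
def is_synchronised(sigma):
    """A global state (k_1, ..., k_T) is synchronised when one phase
    holds the entire population."""
    population = sum(sigma)
    return any(k == population for k in sigma)

print(is_synchronised((0, 0, 8, 0)))                     # True
print(is_synchronised((0, 0, 2, 1, 0, 0, 5, 0, 0, 0)))   # False
```

Checking whether a finite path prefix has synchronised then amounts to applying this test to its states in order and stopping at the first synchronised one.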

Model construction
Given a population model $P = (\Delta, N, T, R, \epsilon, \mu)$ we construct a DTMC $D(P) = (S, \sigma_I, P, L)$ where L ranges over the singleton {synch}. We define the set of states S to be $S = \Gamma(P) \cup \{\sigma_I\}$, where $\sigma_I$ is the initial state.
To calculate the likelihood of permutations of a population of oscillators we will make use of multinomial coefficients, for which we provide the standard definition.

Definition 13
The multinomial coefficient is an extension of the binomial coefficient that gives the number of ordered permutations of the elements of a multiset. Given a finite multiset M, a permutation is an ordered arrangement of its elements, where each element appears a number of times equal to its multiplicity in M. The number of permutations of M is given by
$$\binom{n}{m_1, \ldots, m_i} = \frac{n!}{m_1! \cdots m_i!},$$
where $m_1, \ldots, m_i$ are the multiplicities of the elements of M and n is the sum of those multiplicities.
In the initial state all oscillators are unconfigured. That is, oscillators have not yet been assigned a value for their phase. For each $\sigma = \langle k_1, \ldots, k_T \rangle \in S \setminus \{\sigma_I\}$ we define
$$P(\sigma_I, \sigma) = \frac{1}{T^N} \binom{N}{k_1, \ldots, k_T}$$
to be the probability of moving from $\sigma_I$ to a state where $k_i$ arbitrary oscillators are configured with the phase value i for $1 \le i \le T$. The multinomial coefficient defines the number of possible assignments of phases to distinct oscillators that result in the global state σ. The fractional coefficient normalises the multinomial coefficient with respect to the total number of possible assignments of phases to all oscillators. In general, given an arbitrary set of initial configurations (global states) for the oscillators, the total number of possible phase assignments can be calculated by computing the sum of the multinomial coefficients for each configuration (global state) in that set. Since Γ is the set of all possible global states, we have that
$$\sum_{\langle k_1, \ldots, k_T \rangle \in \Gamma} \binom{N}{k_1, \ldots, k_T} = T^N.$$
We assign probabilities to the transitions as follows: for every $\sigma \in S \setminus \{\sigma_I\}$, we consider each $F \in F_\sigma$, and set $P(\sigma, \overrightarrow{\mathit{succ}}(\sigma, F)) = \mathit{PFV}(\sigma, F)$. For every combination of σ and σ′ where $\sigma' \notin \mathit{next}(\sigma)$ we set P(σ, σ′) = 0.
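The initial distribution can be checked numerically for small instances. The Python sketch below assumes that every oscillator is assigned one of the T phases uniformly and independently, so that a global state receives the multinomial coefficient divided by T^N; the helper names `global_states`, `multinomial` and `initial_probability` are hypothetical.

```python
from fractions import Fraction
from math import factorial

def global_states(N, T):
    """Yield every tuple (k_1, ..., k_T) of non-negative integers summing to N."""
    if T == 1:
        yield (N,)
        return
    for k in range(N + 1):
        for rest in global_states(N - k, T - 1):
            yield (k,) + rest

def multinomial(ks):
    """Number of ways to assign phases to distinct oscillators so that
    k_i of them receive phase i."""
    result = factorial(sum(ks))
    for k in ks:
        result //= factorial(k)
    return result

def initial_probability(sigma):
    """Probability of moving from the unconfigured state to sigma."""
    N, T = sum(sigma), len(sigma)
    return Fraction(multinomial(sigma), T ** N)

total = sum(initial_probability(s) for s in global_states(3, 4))
print(total)  # 1
```

That the probabilities sum to one is an instance of the multinomial theorem: the multinomial coefficients over all global states add up to T^N.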

Model reduction
We now describe a reduction of the population model that results in a significant decrease in the size of the model, but is equivalent to the original model with respect to the unbounded-time reachability of synchronised states. We first distinguish between states where one or more oscillators are about to fire, and states where no oscillators will fire at all. We refer to these states as firing states and non-firing states respectively. In the following, we fix a population model P and simply refer to the states by Γ, instead of Γ(P).
Definition 14 Given a population model P, a global state $\langle k_1, \ldots, k_T \rangle \in \Gamma$ is a firing state if, and only if, $k_T > 0$. We denote by $\Gamma_F$ the set of all firing states of P, and denote by $\Gamma_{NF} = \Gamma \setminus \Gamma_F$ the set of all non-firing states of P. We will again omit P if it is clear from the context.
The reduction is constructed by collapsing deterministic sequences of transitions. This allows non-firing states to be eliminated from the model. While this approach may seem simple, to the best of our knowledge such a reduction is not automatically applied by existing tools for the automatic analysis of such a model, for example the model checkers PRISM [34], Storm [18], and IscasMC [27]. In addition, collapsing sequences of deterministic transitions from the initial unconfigured state to firing states is not straightforward, and it is unclear how this could be easily inferred by automated tools.
Given a DTMC $D = (S, \sigma_I, P, L)$ let $|P| = |\{(t, t') \in S^2 \mid P(t, t') > 0\}|$ be the number of non-zero transitions in P, and |D| = |S| + |P| be the total number of states and non-zero transitions in D.

Theorem 1 For every population model
Observe that only unbounded-time reachability properties are preserved in the reduction of a model. In the unreduced model precisely T transitions correspond to one oscillation of a standalone oscillator. For the reduced model this is not the case, since the lengths of the removed deterministic transition sequences are not encoded in the reduction.
We now proceed to prove this theorem. To that end, we need some preliminary properties of non-firing states and their relation to firing states.

Lemma 3
Every non-firing state σ ∈ Γ NF has exactly one successor state, and in that state all oscillator phases have increased by 1.

Corollary 1 A transition from any non-firing state is taken deterministically, since for any
Reachable state reduction
Given a path $\omega = \sigma_0 \cdots \sigma_{n-1} \sigma_n$ where $\sigma_i \in \Gamma_{NF}$ for 0 < i < n and $\sigma_0, \sigma_n \in \Gamma_F$, we omit transitions $(\sigma_i, \sigma_{i+1})$ for $0 \le i < n$, and instead introduce a direct transition from $\sigma_0$, the first firing state, to $\sigma_n$, the next firing state in the sequence. For any $\sigma = \langle k_1, \ldots, k_T \rangle \in \Gamma$ let $\delta_\sigma = \max\{\Phi \mid k_\Phi > 0 \text{ and } 1 \le \Phi \le T\}$ be the highest phase of any oscillator in σ. The successor state of a non-firing state is then the state where all phases have increased by $T - \delta_\sigma$. Observe that $T - \delta_\sigma = 0$ for any $\sigma \in \Gamma_F$.

Definition 15
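The deterministic successor function $\mathit{succ} : \Gamma \to \Gamma_F$, given by
$$\mathit{succ}(\langle k_1, \ldots, k_T \rangle) = \langle \underbrace{0, \ldots, 0}_{T - \delta_\sigma}, k_1, \ldots, k_{\delta_\sigma} \rangle,$$
maps a state σ ∈ Γ to the next firing state reachable by taking $T - \delta_\sigma$ deterministic transitions.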
Observe that for any firing state σ we have δ σ = T , and hence that succ(σ ) = σ .
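The deterministic successor amounts to shifting all groups of oscillators upwards until the highest occupied phase reaches T. A minimal Python sketch, with the function name `deterministic_successor` chosen by us:

```python
def deterministic_successor(sigma):
    """Shift every group by T minus the highest occupied phase, so that
    the highest group lands on phase T (the next firing state)."""
    T = len(sigma)
    occupied = [phase for phase, k in enumerate(sigma, start=1) if k > 0]
    if not occupied:
        raise ValueError("a global state must contain at least one oscillator")
    shift = T - max(occupied)
    return (0,) * shift + tuple(sigma[:T - shift])

sigma_0 = (0, 0, 2, 1, 0, 0, 5, 0, 0, 0)
print(deterministic_successor(sigma_0))  # (0, 0, 0, 0, 0, 2, 1, 0, 0, 5)
```

For the state σ0 of Example 1 the highest occupied phase is 7, so three deterministic steps are collapsed and the result is the firing state σ3. Applying the function to a firing state returns the state unchanged.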
We now update the definition for the set of all successor states for some global state σ ∈ Γ to incorporate the deterministic successor function.

Definition 16
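Given a global state σ ∈ Γ, we define next(σ) to be the set of all successor states of σ, where $\mathit{next}(\sigma) = \{\mathit{succ}(\overrightarrow{\mathit{succ}}(\sigma, F)) \mid F \in F_\sigma\}$.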

Definition 17
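Given a firing state $\sigma \in \Gamma_F$ let pred(σ) be the set of all non-firing predecessors of σ, where σ is reachable from the predecessor by taking some positive number of transitions deterministically. Formally, $\mathit{pred}(\sigma) = \{\sigma' \in \Gamma_{NF} \mid \mathit{succ}(\sigma') = \sigma\}$. We refer to all states σ′ ∈ pred(σ) as deterministic predecessors of σ.
Then given $D = (S, \sigma_I, P, L)$ with $S = \{\sigma_I\} \cup \Gamma$, we define $S' = S \setminus \bigcup_{\sigma \in \Gamma_F} \mathit{pred}(\sigma)$ to be the reduction of S where all non-firing states from which a firing state can be reached deterministically are removed.

From Definition 17 it follows that $\bigcup_{\sigma \in \Gamma_F} \mathit{pred}(\sigma) \subseteq \Gamma_{NF}$. In addition, for any $\sigma \in \Gamma_{NF}$ there is some state σ′ such that σ ∈ pred(σ′) and $\sigma' = \mathit{succ}(\sigma) \in \Gamma_F$, hence $\Gamma_{NF} \subseteq \bigcup_{\sigma \in \Gamma_F} \mathit{pred}(\sigma)$ and the lemma is proved.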

Lemma 5 For a population model
Proof Observe that there are $\binom{N+T-1}{N}$ ways to assign T distinguishable phases to N indistinguishable oscillators [23]. Since $S = \Gamma \cup \{\sigma_I\}$ and Γ is the set of all possible configurations for oscillators we can see that $|S| = \binom{N+T-1}{N} + 1$. For any non-firing state $\sigma = \langle k_1, \ldots, k_T \rangle \in \Gamma_{NF}$ we know from Definition 7 that $\sum_{\Phi=1}^{T} k_\Phi = N$ and from Definition 14 that $k_T = 0$, so it must be the case that $\sum_{\Phi=1}^{T-1} k_\Phi = N$. That is, there must be $\binom{N+T-2}{N}$ ways to assign T − 1 distinguishable phases to N indistinguishable oscillators, and so $|\Gamma_{NF}| = \binom{N+T-2}{N}$. From Lemma 4 we know that $S' = S \setminus \Gamma_{NF}$ so it must be the case that $|S'| = \binom{N+T-1}{N} + 1 - \binom{N+T-2}{N}$.

Transition matrix reduction
Here we describe the reduction in the number of non-zero transitions in the model. We illustrate how initial transitions to non-firing states are removed by using a simple example, and then describe how we remove transitions from firing states to any successor non-firing states. Figure 4 shows five possible initial configurations $\sigma_i, \ldots, \sigma_{i+4} \in S$ for N = 2 oscillators with T = 6 values for phase, where a transition is taken from $\sigma_I$ to each $\sigma_k$ with probability $P(\sigma_I, \sigma_k)$. Any infinite run of D where a transition is taken from $\sigma_I$ to one of the configured states $\sigma_i, \ldots, \sigma_{i+3}$ will pass through $\sigma_{i+4}$, since all transitions $(\sigma_{i+k}, \sigma_{i+k+1})$ for $0 \le k \le 3$ are taken deterministically. Also, observe that states $\sigma_i, \ldots, \sigma_{i+3}$ are not in S′, since $\sigma_{i+4}$ is reachable from each by taking some number of deterministic transitions. We therefore set the probability of moving from $\sigma_I$ to $\sigma_{i+4}$ in P′ to be the sum of the probabilities of moving from $\sigma_I$ to $\sigma_{i+4}$ and each of its predecessors in P. Generally, given a state σ ∈ S′ where $\sigma \neq \sigma_I$, we set $P'(\sigma_I, \sigma) = P(\sigma_I, \sigma) + \sum_{\sigma' \in \mathit{pred}(\sigma)} P(\sigma_I, \sigma')$.
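The state counts used in the counting argument for the reduced state space can be cross-checked by brute-force enumeration on small instances. The Python sketch below does this for N = 2 and T = 6, the parameters of the initial-configuration example; all function names are ours.

```python
from math import comb

def global_states(N, T):
    """Yield every tuple (k_1, ..., k_T) of non-negative integers summing to N."""
    if T == 1:
        yield (N,)
        return
    for k in range(N + 1):
        for rest in global_states(N - k, T - 1):
            yield (k,) + rest

def count_states(N, T):
    """All global states plus the unconfigured initial state."""
    return comb(N + T - 1, N) + 1

def count_non_firing(N, T):
    """Global states with no oscillator in phase T."""
    return comb(N + T - 2, N)

def count_reduced_states(N, T):
    """States remaining once all non-firing states are removed."""
    return count_states(N, T) - count_non_firing(N, T)

states = list(global_states(2, 6))
non_firing = [s for s in states if s[-1] == 0]
print(len(states) + 1, len(non_firing), count_reduced_states(2, 6))
# 22 15 7
```

For this instance the reduction keeps 7 of the 22 states: the initial state and the six firing states in which at least one of the two oscillators has phase 6.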
We now define how we calculate the probability with which a transition is taken from a firing state to each of its possible successors. For each firing state σ ∈ S′ we consider each possible successor σ′ ∈ next(σ) of σ and define $F_{\sigma \to \sigma'}$ to be the set of all possible failure vectors for σ for which the successor of σ is σ′, given by $F_{\sigma \to \sigma'} = \{F \in F_\sigma \mid \mathit{succ}(\overrightarrow{\mathit{succ}}(\sigma, F)) = \sigma'\}$. We then set the probability with which a transition from σ to σ′ is taken to $P'(\sigma, \sigma') = \sum_{F \in F_{\sigma \to \sigma'}} \mathit{PFV}(\sigma, F)$.

Lemma 6 For a population model $P = (\Delta, N, T, R, \epsilon, \mu)$, the corresponding DTMC $D = (S, \sigma_I, P, L)$ with $S = \{\sigma_I\} \cup \Gamma$, and its reduction $D'(P) = (S', \sigma_I, P', L')$, the transitions in P are reduced in P′ such that $|P'| \le |P| - 2|\Gamma_{NF}|$.
Proof From Lemma 4 we know that $|S'| = |S \setminus \Gamma_{NF}|$, and hence that $|\Gamma_{NF}|$ transitions from $\sigma_I$ to non-firing states are not in P′, and from Lemma 3 we also know that there is one transition from each non-firing state to its unique successor state that is not in P′. Since no additional transitions are introduced in the reduction it is clear that $|P'| \le |P| - 2|\Gamma_{NF}|$.
Therefore we need to show that where Pr D and Pr D denote the probability measures with respect to the sets of infinite paths from σ I in D and D respectively.
Given a firing state σ F ∈ S we denote by Paths D σ F the set of all infinite paths of D starting in σ I where the first firing state reached along that path is σ F . All such sets for all firing states in S form a partition, such that σ F ∈Γ F Paths D σ F = Paths D . That is, for all firing states σ F , σ F ∈ S where σ F = σ F we have that Paths D σ F ∩ Paths D σ F = ∅. Now observe that any infinite path ω of D can be written in the form ω = σ I ω NF 1 σ F 1 ω NF 2 σ F 2 · · · where σ F i is the i th firing state in the path and each ω NF i = σ 1 i σ 2 i · · · σ k i i is a possibly empty sequence of k i non-firing states. Then for every such path in D there is a corresponding path ω of D without non-firing states, and of the form ω = σ I σ F 1 σ F 2 σ F 3 · · · , as for any i we have σ j i ∈ pred(σ F i ) for all 1 j k i . As only deterministic transitions have been removed in D we can see that Pr D Hence, we only have to consider the finite paths from σ I to σ F 1 . To that end, observe that there are pred(σ F 1 ) possible prefixes for each path from σ I to σ F 1 where the initial transition is taken from σ I to some non-firing predecessor of σ F 1 , plus the single prefix where the initial transition is taken to σ F 1 itself. Overall there are exactly pred(σ F 1 ) + 1 distinct finite prefixes that have ω as their corresponding path in D . We denote the set of these prefixes for a path ω in D by Pref (ω ). Since the measure of each finite prefix extends to a measure over the set of infinite paths sharing that prefix, it is sufficient to show that the sum of the probabilities for these finite prefixes is equal to the probability of the unique prefix σ 0 , σ F 1 of ω , that is Pr D Pref (ω ) = Pr D {σ I , σ F 1 }. We can then write where k σ is the number of deterministic transitions that lead from σ to σ F 1 in D. Now recall that for any σ ∈ S \ {σ I } we have P (σ I , σ ) = P(σ I , σ ) + σ ∈pred(σ ) P(σ I , σ ).
So we have shown that $\Pr_D \mathit{Pref}(\omega') = \Pr_{D'}\{\sigma_I, \sigma_F^1\}$ and the lemma is proved.
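The prefix-summing step of this argument can be illustrated on a toy chain. The following Python sketch is not taken from the paper: the state names, the dictionary encoding of the transition matrix, and the helper names are all hypothetical. It assumes that a deterministic predecessor is a non-firing state whose only outgoing transition has probability 1 and eventually leads to the given firing state, and it computes the initial transition probability of the reduction as the direct probability plus the probabilities of entering any deterministic predecessor.

```python
def deterministic_predecessors(P, firing, target):
    """Non-firing states from which only probability-1 transitions lead to `target`."""
    preds = set()
    changed = True
    while changed:
        changed = False
        for state, successors in P.items():
            if state in firing or state in preds:
                continue
            # A deterministic state has exactly one successor, taken with probability 1.
            if len(successors) == 1:
                (nxt, prob), = successors.items()
                if prob == 1.0 and (nxt == target or nxt in preds):
                    preds.add(state)
                    changed = True
    return preds


def reduced_initial_probability(P, init, firing, target):
    """Direct probability of init -> target plus the mass entering its deterministic predecessors."""
    preds = deterministic_predecessors(P, firing, target)
    return P[init].get(target, 0.0) + sum(P[init].get(s, 0.0) for s in preds)


# Hypothetical toy DTMC: n1 -> n2 -> F1 is a chain of deterministic transitions.
P = {
    "I": {"n1": 0.2, "n2": 0.3, "F1": 0.1, "F2": 0.4},
    "n1": {"n2": 1.0},
    "n2": {"F1": 1.0},
    "F1": {"F1": 1.0},
    "F2": {"F2": 1.0},
}
FIRING = {"F1", "F2"}

p_f1 = reduced_initial_probability(P, "I", FIRING, "F1")
p_f2 = reduced_initial_probability(P, "I", FIRING, "F2")
```

In this toy chain the three prefixes I F1, I n2 F1 and I n1 n2 F1 collapse into the single reduced transition from I to F1, and the two reduced initial probabilities still sum to one.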
Proof of Theorem 1 Follows from Lemmas 5 and 6 for the reduction of states and transitions respectively, and from Lemma 7 for the preservation of unbounded time reachability properties.

Reward structures for reductions
We now show how a reward structure for a reduced model can be derived from any reward structure for the unreduced model, and prove that these reward structures are equivalent with respect to the reachability of synchronised firing states.

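Theorem 2 For every reward structure $R$ for $D$ there is a corresponding reward structure $R'$ for $D'$ such that unbounded-time reachability reward properties with respect to synchronised firing states in $D$ are preserved in $D'$.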
Given any reward structure $R = (R_s, R_t)$ on $D$ we construct the corresponding reward structure $R' = (R'_s, R'_t)$ as follows:
- There is no reward for the initial state and we set $R'_s(\sigma_I) = 0$.
- For every firing state $\sigma_F$ in $S$ with $R_s(\sigma_F) = r$ we set $R'_s(\sigma_F) = r$.
- For every pair of distinct firing states $\sigma_F^1, \sigma_F^2 \in S'$, where there is a non-zero transition from $\sigma_F^1$ to $\sigma_F^2$ in $D'$, there is a (possibly empty) sequence $\sigma_{NF}^1 \cdots \sigma_{NF}^k$ of $k$ deterministic predecessors of $\sigma_F^2$ in $S$ such that $k > 0$ implies $P(\sigma_F^1, \sigma_{NF}^1) > 0$, $P(\sigma_{NF}^k, \sigma_F^2) = 1$, and $P(\sigma_{NF}^i, \sigma_{NF}^{i+1}) = 1$ for $1 \le i < k$. We set the reward for taking the transition from $\sigma_F^1$ to $\sigma_F^2$ in $D'$ to be the sum of the rewards that would be accumulated across that sequence by a path in $D$.
- For every firing state $\sigma_F$ in $S'$ there is a non-zero transition from the initial state $\sigma_I$ to $\sigma_F$ in $P'$. Therefore, all paths of $D'$ where $\sigma_F$ is the first firing state along that path share the same prefix, namely $\sigma_I, \sigma_F$. For paths of $D$ this is not necessarily the case, since $\sigma_F$ is the first firing state not only along the path where the initial transition is taken to $\sigma_F$ itself, but also along any path where the initial transition is taken to a non-firing state from which a sequence of deterministic transitions leads to $\sigma_F$ (that state is a deterministic predecessor of $\sigma_F$). We therefore set the reward along a path $\omega' = \sigma_I\, \sigma_F^1\, \sigma_F^2 \cdots$ for taking the initial transition to $\sigma_F$ in $D'$ to be the sum of the total rewards accumulated along all distinct path prefixes of the form $\sigma_I\, \omega_{NF}\, \sigma_F$, normalised by the total probability of taking any of these paths, where $\omega_{NF}$ is a possibly empty sequence of deterministic predecessors of $\sigma_F$, and where the total reward for each prefix is weighted by the probability of taking the transitions along that sequence.

Proof of Theorem 2
We want to show that for every reward structure $R$ for $D$ and corresponding reward structure $R'$ for $D'$, every $\bowtie\, \in \{<, \le, \ge, >\}$ and every $r \in \mathbb{R}$, if $\sigma_I \models \mathcal{R}_{\bowtie r}[\mathsf{F}\, \mathit{synch}]$ holds in $D$ then it also holds in $D'$. Let $V_{Sat(\mathit{synch})}$ and $V'_{Sat(\mathit{synch})}$ respectively denote the random variables over $\mathit{Paths}^{D}$ and $\mathit{Paths}^{D'}$ whose expectations correspond to $R$ and $R'$. From the semantics of PCTL over a DTMC, the property holds in a model if and only if the expectation of the corresponding random variable satisfies the bound $\bowtie r$. Therefore, we need to show that the two expectations coincide (Equation 19), where $\Pr_D$ and $\Pr_{D'}$ denote the probability measures with respect to the sets of infinite paths from $\sigma_I$ in $D$ and $D'$ respectively. There are two cases.

Firstly, if there exists some path of $D$ that does not synchronise then by definition $V_{Sat(\mathit{synch})} = \infty$ for that path. Also, from Lemma 7 we know that there is a corresponding path of $D'$ that does not synchronise, and hence that $V'_{Sat(\mathit{synch})} = \infty$ for that path. By definition the probability measures of all paths of $D$ and $D'$ are strictly positive. Therefore, all summands of Equation 19 are defined, and the expectation of both $V_{Sat(\mathit{synch})}$ and $V'_{Sat(\mathit{synch})}$ is $\infty$.
Secondly, we consider the case where all possible paths of D and D synchronise. First we define the function reduce : Paths D → Paths D that maps paths of D to their corresponding path in the reduction D , where ω NF i is the (possibly empty) sequence of deterministic predecessors of the firing state σ F i . Let reduce −1 (ω) denote the preimage of ω under reduce. Then, we can rewrite the left side of (19) to For any path ω of D or D let pre s (ω) be the prefix of that path whose last state is the first firing state along that path that is in the set Sat(synch). So we want to show that the following holds for any path ω of D , Given some path ω let ω[i : j] denote the sequence of states in ω from the i th firing state to the j th firing state along that path (inclusively). The notation ω[− : j] indicates that no states are removed from the start of the path i.e. the first state is σ I , and the notation ω[i : −] indicates that no states are removed from the end of the path. By recalling that Pr(σ 0 σ 1 · · · σ n ) = n i=1 P(σ i−1 , σ i ) we can see that Pr(σ 0 σ 1 · · · σ n ) = Pr(σ 0 · · · σ i )Pr(σ i · · · σ n ) for any 0 < i < n. Also from (1) it is clear that for any reward structure R, tot R (σ I · · · σ n ) = tot R (σ I · · · σ i ) + tot R (σ i · · · σ n ) holds for all 0 < i < n. Now we can rewrite (20) to By the definition of R we can write the right hand side of (21) as From Lemma 7 we know that and hence obtain Since Pref (ω ) is the set of all possible finite prefixes from the initial state σ I to the first firing state σ F 1 , and since ω[− : 1] = pre s (ω)[− : 1] clearly holds, we know that This is the same as the left hand side of (21) and the theorem is proved.

Connecting the concrete model and the population model
In this section, we define the abstraction function to connect a concrete model with a population model. To that end, let $D_c = (S_c, s_I, P_c)$ be a concrete model of a network of $N$ PCOs with a clock cycle length $T$, a refractory period $R$, a phase response function $\Delta$, a coupling constant $\epsilon$, and a broadcast failure probability $\mu$. Furthermore, let $D_p = (S_p, \sigma_I, P_p)$ be the DTMC of a population model for the same parameters. For a finite path $\omega = \sigma_0, \ldots, \sigma_n$ we denote its last element by $\mathit{last}(\omega) = \sigma_n$.
The correspondence we want to show is that the initial states of the two models are weakly bisimulation equivalent [7]. Since we do not employ any actions except for the silent action in our models, we first define a simplified version of weak bisimulations. To that end, we use the definition of Baier and Hermanns [7], but with slightly altered notations to fit to our setting, and by ignoring all references to sequences of actions.
Definition 18 (Weak Bisimulation [7]) A weak bisimulation on $D = (S, s_I, P)$ is an equivalence relation $R$ on $S$ such that for all $(s, s') \in R$, and all equivalence classes $E \in S/R$, we have

$P(s, E) = P(s', E)$

where $P(s, E) = \sum_{\omega \in \mathit{Paths}^{D}_{f}(s) \,\wedge\, \mathit{last}(\omega) \in E} P(\omega)$ for any set $E \subseteq S$. We say that two states $s$ and $s'$ are weakly bisimilar if, and only if, there is a weak bisimulation $R$ such that $(s, s') \in R$.

Proving the correspondence between concrete and population models
In this section, we will formally define a weak bisimulation relation between states in the concrete model D c and the corresponding population model D p . Since weak bisimulations are defined on a single DTMC, we will define a single DTMC combining both the concrete and the population model.

Definition 19
Let $D_c = (S_c, s_I, P_c)$ be a concrete model of a network and let $D_p = (S_p, \sigma_I, P_p)$ be the DTMC of a corresponding population model. The combination of $D_c$ and $D_p$ is the DTMC $D = (S, i, P)$, where $S = S_c \cup S_p \cup \{i\}$, $i$ is a new state ($i \notin S_c \cup S_p$), and $P$ combines $P_c$ and $P_p$ with initial transitions from $i$. The initial transitions from the new state $i$ form a uniform distribution over the two behaviours, and hence the combination DTMC could behave like either model. However, for simplicity we will often refer to the original models $D_c$ and $D_p$.

We need to associate states in $D_c$ to states in $D_p$. In general, several concrete states will be mapped to a single population state, since we do not distinguish between different orders of oscillators in the latter, while we do in the former.
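The many-to-one association between concrete and population states can be sketched as a counting map. This Python fragment is only an illustration: it assumes that a concrete state is represented by the tuple of oscillator phases (modes and the environment counter are ignored here), and the function name `h` is borrowed from the counting vector used later in the proof of Theorem 3.

```python
from collections import Counter


def h(phases, T):
    """Map a tuple of oscillator phases in 1..T to the vector of per-phase counts."""
    counts = Counter(phases)
    return tuple(counts.get(phi, 0) for phi in range(1, T + 1))


# Two concrete states that differ only in the order of the oscillators ...
state_a = (1, 3, 3)
state_b = (3, 1, 3)
# ... are mapped to the same population state for T = 3.
population_a = h(state_a, 3)
population_b = h(state_b, 3)
```

Both tuples yield the population state with one oscillator in phase 1, none in phase 2, and two in phase 3, which is why the abstraction forgets oscillator identities.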
Furthermore, we want to abstract from different modes of the oscillators. However, it is not sensible to associate all modes within a phase to the same population state, since in the transitions from one mode to the next the system chooses whether an oscillator fails to broadcast its pulse or not. If we want to be able to define a probabilistic weak bisimulation relation, we need to represent the failures described by the transitions in the population model. To have an exact correspondence, we first collect all the concrete states where the counter and all oscillators are at the start mode into a single set.

We now proceed to show that this relation indeed is a weak bisimulation on $D$, with a minor caveat. Consider a state of the population model $\sigma$ and a state of the concrete model $s$ in the same equivalence class. Then we have $P(s, E_\emptyset) = 1$, since every transition starting in $s$ immediately leads into a state in $E_\emptyset$. However, we also have $P(\sigma, E_\emptyset) = 0$, since none of the successors of $\sigma$ is an element of $E_\emptyset$. We could remedy this discrepancy by introducing a new state $\sigma'$ for each state $\sigma$ in the population model. Then all of these new states would be in $E_\emptyset$, and every $\sigma$ of the original population model would have a unique successor in $E_\emptyset$. Since this addition does not change the overall behaviour of the population model, we chose instead to ignore the transitions ending in $E_\emptyset$, and only consider the probabilities of paths into the other equivalence classes.
Theorem 3 Let D c = (S c , s I , P c ) and D p = (S p , σ I , P p ) be a concrete network of oscillators and its abstraction as a population model, respectively. Furthermore, let D = (S, i, P) be the combination of both of these models from Definition 19 and R ⊆ S × S be the relation from Definition 20. Then R is a weak bisimulation relation, and s I and σ I are weakly bisimilar.
Proof From transition sequences in D p to transition sequences in D c . Let (s 1 , σ 1 ) ∈ R, where s 1 ∈ S c , σ 1 ∈ S p , and let E be an equivalence class such that P(σ 1 , E) > 0, i.e., for the single state σ 2 ∈ E of the population model, we have P(σ 1 , σ 2 ) > 0. Furthermore, for s 1 , we have by the definition of R that mode(s 1 , env) = start, mode(s 1 , u) = start for all 1 u N and h(s 1 ) = |{u | φ(s 1 , u) = 1}|, . . . , |{u | φ(s 1 , u) = T }| = σ 1 . Note that there is only a single outgoing transition from s 1 according to condition (2). That is, in the successor state s of s 1 , we have count(s) = 0 and mode(s, env) = update, while the oscillator states are not changed. To keep the notation tidy, we identify this successor state with s 1 in the following. Now consider two cases. If |{u | φ(s 1 , u) = T }| = 0, then σ 2 = 0, |{u | φ(s 1 , u) = 1}|, . . . , |{u | φ(s 1 , u) = T − 1}| , since no set of oscillators in σ 1 is perturbed by a firing oscillator. In particular, there is no u such that φ(s 1 , u) = T . Hence, for all possible successors s of s 1 , we have that only condition (3) is satisfied. Furthermore, this is the case until all oscillators changed their mode to update. Let us call this state s 1 . Now, the environment was not changed from s 1 to s 1 , i.e., count(s 1 ) = count(s 1 ) = 0.
Hence, condition (9b) is satisfied for all oscillators. Since Δ(Φ, 0, ) = 0 for all Φ, the phase of each oscillator is increased by one. This implies that there is a single successor of s 1 , which we call s 2 and that for all u, φ(s 2 , u) = φ(s 1 , u) + 1. In particular, we have that for all oscillators u, φ(s 2 , u) > 0, and Then, each transition in the population model is induced by a failure vector F = f 1 , . . . , f T . In particular, there is a maximal number k, such that for all l < k, we have f l = . That is, k denotes the lowest phase in which oscillators possibly fire.
First, we introduce some notation, where Φ > R.
That is, N F (s) denotes the set of oscillators possibly firing in s. The sets N P Φ (s) and N PF Φ (s) denote the sets of oscillators being perturbed but not firing (since the perturbation is not sufficient for the oscillators to reach the end of their cycle), and possibly firing, respectively. We can only say that elements of N F (s) and N PF Φ (s) possibly fire, since they may be affected by a broadcast failure.
We now have to construct a sequence of transitions, where we draw the firing oscillators from the sets N F (s 1 ) and N PF Φ (s 1 ), according to the broadcast failure vector F. Furthermore, all elements of N F (s 1 ) and the sets N PF Φ (s 1 ) have to take transitions such that their phase value in the next iteration is 1.
Let σ 1 = k 1 , k 2 , . . . , k T . Now consider an arbitrary sequence u 1 , . . . , u k T of all k T elements from N F (s 1 ). Additionally, let C T ⊆ N F (s 1 ) be the set of oscillators in phase T with a broadcast failure, i.e., |C T | = f T . Observe that φ(s 1 , u j ) = T for all 1 j k T . Furthermore, let r T 0 = s 1 . Then we define a sequence of successors of Observe that these states define a sequence of transitions from r T 0 to r T k T according to conditions (7) and (8). Now, for each phase Φ, with k Φ < T , we proceed similarly. That is, we first choose Observe again, that these sequences exhaust N PF Φ (s 1 ) for each phase Φ. Furthermore, the number of firing oscillators that are not inhibited by a broadcast failure in the concrete model coincides with the number of perceived firing oscillators in the population model in this phase.
To prove this let us consider the different cases: for Φ = T , we have count(r T 0 ) = 0 = α T (σ 1 , F). Now let Φ < T and assume count(r Φ+1 0 ) = α Φ+1 (σ 1 , F). By definition, we have By assumption, we then get and property (23) holds. This property in particular states that the perturbation within the population model and the concrete model is the same.
Since Δ is a monotonically increasing function in α, every oscillator in N PF Φ (s 1 ) is still perturbed to firing after other oscillators in the same phase fired. Hence, for each pair of states u Φ j−1 and u Φ j with 1 j k Φ − f Φ , a transition according to condition (6) is well-defined. Similarly, for oscillators that should fire, but are affected by a broadcast failure, u Φ j−1 and u Φ j with k Φ − f Φ + 1 j k Φ , the transition is defined according to condition (5). Now, for every Φ < k, we know that Φ + Δ(Φ, α Φ (σ 1 , F), ) + 1 T and α Φ−1 (σ 1 , F) = α Φ (σ 1 , F), according to equation (13). Hence, for every phase Φ < k, we arbitrarily enumerate the oscillators of N P For each Φ and pair of states r Φ j and r Φ j+1 , there is a transition according to condition (4). So, all in all, we have a sequence of transitions from s 1 to r 0 k 0 . Now, in r 0 k 0 , we have that mode(r 0 k 0 , env) = update and for all u, mode(r 0 k 0 , u) = update. Then let s 2 be defined by the following formulas. mode(s 2 , env) = start and count(s 2 ) = count(r 0 k 0 ) (24) ∀u : mode(s 2 , u) = start (25) ∀u : Then r 0 k 0 and s 2 satisfy all parts of condition (9). Hence, we have a sequence of transitions from s 1 to s 2 , and s 2 ∈ S c . To prove h(s 2 ) = σ 2 , we need to show that the number of oscillators possessing a phase Φ in s 2 matches the Φ-th entry of σ 2 = k 1 , . . . , k T . To that end, recall that by Definition 10, each Observe that both the concrete model and the population model use the same perturbation function Δ and that τ is defined in terms of Δ. In particular, we have Now let us distinguish three cases for Φ.
Similarly, for all u such that φ(s 1 , u) = Φ, we get that φ(s 2 , u) = Φ + 1. Hence, for all Φ R, we have that |{u | φ(s 2 , u) = Φ + 1}| = |{u | φ(s 1 , u) = Φ}|. 2. If Φ > R and update Φ (σ 1 , F) = Ψ , with Ψ T . Then φ(s 2 , u) = Ψ , by formula (28). Observe that the number of oscillators in s 1 with a phase of Φ is k Φ . So, the number of oscillators that get perturbed to be in Ψ is the union of the oscillators u in phases Φ, where Δ(Φ, count(s 2 ), ) (s 1 , u), count(s 2 ), ) + 1 = Ψ }. By the definition of τ and property (23), we get that τ (σ 1 , Φ, F) = Ψ . That is, for a specific Ψ , we have that the phases Φ of oscillators perturbed to Ψ are in U Ψ (σ 1 , F). Hence, since the sets of oscillators in each phase are disjoint, |{u | φ(s 2 , u) = Ψ }| = Φ∈U Ψ (σ 1 ,F) k Φ . 3. Finally, let update Φ (σ 1 , F) = Ψ and Ψ > T . Then τ (σ 1 , Φ, F) = 1. Furthermore, by formulas (26) and (29), we have φ(s 2 , u) = 1 for all u with phase Φ. With similar reasoning as above, we get that |{u | φ( Hence, we get h(s 2 ) = σ 2 , that is s 2 ∈ E. Together with the existence of the transition sequence from s 1 to s 2 we get P(s 1 , E) > 0. From transition sequences in D c to transition sequences in D p . Now we turn our attention to the other direction. That is, if we have a sequence of transitions in the concrete model, we can find a corresponding transition sequence in the population model. Let (s 1 , σ 1 ) ∈ R, and let E be an equivalence class such that P(s 1 , E) > 0. If s 1 ∈ E, then the theorem holds, since σ 1 ∈ E by definition of R. Otherwise, let ω = s 1 . . . s 2 be an execution sequence from s 1 to s 2 and assume that for all s on ω different from s 1 and s 2 , we have mode(s , env) = update. Recall that there is a unique σ 2 ∈ E. By definition of R, we have We now distinguish two cases. First, assume that {u | φ(s 1 , u) = T } = ∅, and let s be a state on ω such that mode(s, u) = update for all u. 
Then there is exactly one transition from s to s 2 , which is defined according to Eq. (9). Furthermore, due to the assumption that no oscillator fires, we have count(s) = 0, which implies Δ(Φ, count(s), ) = 0 for all Φ by Definition 5. Hence, for all u, we have φ(s 2 , u) = φ(s, u) + 1 = φ(s 1 , u) + 1. That is, That is, we have P(σ 1 , σ 2 ) > 0 due to a deterministic transition, which, in particular, implies The second case is more involved. Let us assume {u | φ(s 1 , u) = T } = ∅, that is, at least one oscillator fires. Hence, due to the preconditions of the transitions, we can divide the transition sequence from s 1 to s 2 as follows: . . . , r T , . . . , r T −1 , . . . , r 1 , . . . , s 2 , where r Φ denotes the state where all oscillators with phase Φ changed their mode to update. Our goal now is to find a broadcast failure vector F, such that → succ(σ 1 , F) = σ 2 . To that end, let Then F = f 1 , . . . , f T . With this broadcast failure vector at hand, we now have to show that φ(s 1 , u), F) for all oscillators u. To this end, we now need to distinguish several cases, according to the different cases of the transition defined by condition (9).
First, let u be such that φ(s 1 , u) R, i.e., oscillator u is within its refractory period. If φ(s 1 , u) = T , then we have Now assume that φ(s 1 , u) R, i.e., oscillator u is outside of its refractory period and thus will be perturbed by firing oscillators. If φ(s 1 , u) = T , then we proceed as in the previous case.
So, let us assume φ(s 1 , u) < T . To show that the transition function of the population model coincides with the result within the concrete model, we need to ensure that the perceived firing oscillators are equal in both models for each oscillator. Within each phase, the perceived oscillators in the population model coincide with the oscillators that fired up to the next higher phase in the concrete model. Formally, for each 1 Φ < T , we have To show that this is true, let Then If f Φ = , then the equality immediately holds. Furthermore, observe that for all Φ, if f Φ = , then we have count(r Φ+1 ) = count(r 1 ). We can now proceed to prove the final two cases.
Now recall that we assumed initially that for all intermediate states s of the transition sequence, we have mode(s, env) = update. If this is not the case, we can partition the sequence into distinct subsequences, where this assumption holds for each subsequence, and apply the arguments above.
Comparing the probabilities of transition sequences. Let (s 1 , σ 1 ) ∈ R and E be an equivalence class with P(s 1 , E) > 0 and P(σ 1 , E) > 0. Furthermore, let σ 2 , s 2 ∈ E be the unique state of the population model in E and a state of the concrete model, respectively. Let If no oscillator fires in σ 1 this also the case in s 1 . Then we have N ! possibilities to create an transition sequence starting in s 1 , each of which has a probability of 1 N ! to happen. Hence, the probability that one of these transitions happen is N ! · 1 N ! = 1, which coincides with the definition in the population model. For the case that at least one oscillator fires and thus perturbs the other oscillators, we consider the construction of a transition sequence from s 1 with respect to a failure vector F = f 1 , . . . , f T for σ 1 . During each phase Φ, we have to choose the particular order of the k Φ oscillators to create a sequence from r Φ 0 to r Φ−1 0 and in addition, we have to choose the set C Φ . That is, we have k Φ ! possible orders, and k Φ f Φ possibilities for the choice of C Φ . Furthermore, the combined probability for the transitions of the oscillators that should fire but are inhibited by a broadcast failure is Observe that at the start of the construction of each phase, |init Φ (s 1 )| = k Φ . Hence the probability above simplifies to Due to the possible choices during the construction of the transition sequence, we have that the probability of one of these sequences to happen is which is exactly the function PMF(k Φ , f Φ ) as in the population model. Furthermore, with similar reasoning as above, the transition probability for the sequences, where no oscillator is perturbed anymore, is 1. Hence, P(s 1 , E) = P(σ 1 , E), and the sum of the probabilities of the paths from s 1 of a population model state σ 1 to the elements of the equivalence class E is equal to the probability of the the transition from σ 1 to σ 2 . 
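The counting argument for a single phase can be checked numerically. The Python sketch below rests on two assumptions that the text above does not state explicitly: that PMF is the binomial probability mass function with failure probability μ, and that each individual ordered transition sequence has probability μ^f (1 − μ)^(k − f) / k!. Under these assumptions, multiplying by the k! possible orders and the binomial number of choices for the set of failing oscillators recovers the population-model probability.

```python
from math import comb, factorial


def pmf(k, f, mu):
    """Binomial probability that exactly f of k firing oscillators suffer a broadcast failure."""
    return comb(k, f) * mu**f * (1 - mu) ** (k - f)


def summed_sequence_probability(k, f, mu):
    """Total probability of all concrete sequences for one phase (assumed per-sequence weight)."""
    per_sequence = mu**f * (1 - mu) ** (k - f) / factorial(k)
    number_of_sequences = factorial(k) * comb(k, f)
    return number_of_sequences * per_sequence


k_phi, f_phi, mu = 4, 1, 0.1
concrete = summed_sequence_probability(k_phi, f_phi, mu)
population = pmf(k_phi, f_phi, mu)
```

For these hypothetical values the two quantities agree up to floating-point error, mirroring the equality P(s 1 , E) = P(σ 1 , E) for a single phase.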
Finally, consider the transitions from the initial states into the corresponding equivalence classes. As stated in Sect. 5.4, Eq. (17), for each state of the population model $\sigma = \langle k_1, \ldots, k_T \rangle$, we have $P_p(\sigma_I, \sigma) = \frac{1}{T^N} \binom{N}{k_1, \ldots, k_T}$, which by construction equals $P(\sigma_I, E)$, where $E$ is the equivalence class with $\sigma \in E$.
The multinomial coefficient gives the number of possible assignments of values $k_i$ to the oscillators. This is exactly the number of states in the set $S_c$, where $k_i$ oscillators are in phase $i$ and the environment counter is set to 0. Since $T^N$ is the total number of such states, the probability of reaching the equivalence class $E$ from $s_I$ is also $\frac{1}{T^N} \binom{N}{k_1, \ldots, k_T}$, and $R$ is a weak bisimulation.
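The multinomial argument for the initial transitions can be confirmed by brute force on small instances. The following Python sketch assumes, as in the paragraph above, that every assignment of phases in 1..T to the N oscillators is equally likely. The function names are hypothetical.

```python
from fractions import Fraction
from itertools import product
from math import factorial


def initial_probability(sigma):
    """Multinomial coefficient for the counts in sigma divided by T^N."""
    N, T = sum(sigma), len(sigma)
    coefficient = factorial(N)
    for k in sigma:
        coefficient //= factorial(k)
    return Fraction(coefficient, T**N)


def brute_force_probability(sigma):
    """Fraction of all T^N concrete phase assignments whose per-phase counts equal sigma."""
    N, T = sum(sigma), len(sigma)
    hits = sum(
        1
        for phases in product(range(1, T + 1), repeat=N)
        if tuple(phases.count(p) for p in range(1, T + 1)) == tuple(sigma)
    )
    return Fraction(hits, T**N)


sigma = (2, 1)  # N = 3 oscillators, T = 2 phases
closed_form = initial_probability(sigma)
enumerated = brute_force_probability(sigma)
```

Exact rational arithmetic is used so that the closed form and the enumeration can be compared for equality rather than up to rounding.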
By this theorem, we can use population models to analyse the global properties of a network of pulse-coupled oscillators following the concrete model as defined in Sect. 4 without loss of precision. In particular, this allows us to increase the size of the network to check such properties, while still giving us the opportunity to analyse the internal behaviour of nodes, if we restrict the network size.

Empirical analysis
In this section we first compare the model checking times for an analysis of the concrete and the population model, followed by an analysis of the impact of the reduction of the population model as defined in Sect. 5.5. Subsequently, we present the results of an empirical analysis of the population model. This evaluation is based on previous work where parametric influence [24] and power consumption [25] were investigated. For all analyses the perturbation function we used is a discretisation of the Mirollo and Strogatz model of pulse-coupled oscillator synchronisation [38]. In particular, we set the perturbation function to be Δ(Φ, α, ε) = [Φ · α · ε], where [·] denotes rounding to the nearest integer; the perturbation induced by the firing of another oscillator increases linearly with the phase of the perturbed oscillator. We use the probabilistic model checker PRISM [34] to formally verify properties of our models.
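A minimal Python sketch of this perturbation function follows, writing the coupling constant as `epsilon`. One detail is an assumption: the text only says "rounding to the nearest integer", so ties are rounded up here, whereas Python's built-in `round` would round ties to the nearest even integer.

```python
import math


def delta(phase, alpha, epsilon):
    """Discretised linear perturbation: nearest integer to phase * alpha * epsilon.

    Ties are rounded up (an assumption; the source does not specify tie handling).
    """
    return math.floor(phase * alpha * epsilon + 0.5)


# Hypothetical parameter values: the perturbation grows with the phase of the
# perturbed oscillator and with the number alpha of perceived firings.
small = delta(3, 1, 0.1)
medium = delta(8, 1, 0.1)
large = delta(10, 2, 0.1)
```

With no perceived firing (alpha = 0) the function returns 0 for every phase, consistent with the use of Δ(Φ, 0, ε) = 0 in the proof of Theorem 3.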

Model specification
Both the concrete models and population models are encoded using the guarded command language of PRISM, a state-based language based on the Reactive Modules formalism of Alur and Henzinger [4]. Each model consists of a set of modules, which in turn consist of a set of local variables over bounded integers and Booleans and a set of commands that define its behaviour. A command consists of a predicate over the set of local variables for all modules and a set of possible transitions that will occur with some given probability should the predicate hold. A transition is an assignment of values to each of the local variables of the module. The local state space of a module is given by the set of all valuations for its local variables, and the global state space is the product of all local state spaces. For further details we refer the reader to [40].

Concrete model The specification of the concrete model within PRISM is a straightforward implementation. Each individual oscillator, as well as the environment, is defined as a module. All of these modules synchronise only on a single message sync, to ensure that all oscillators have been considered when updating their phases.
The module for oscillator i contains local variables mode i and phase i for its mode and phase. Each module specifies transitions for a single oscillator according to Sect. 4.2. All of these transitions are similar in structure in each module, and hence we can employ the template mechanism of PRISM. To that end, we define the behaviour of a single oscillator within the network, and can then replace the names of the local variables suitably. That is, while we still need to explicitly define the dependencies between the oscillators, we do not need to manually write all module specifications. With the exception of the number of oscillators, the parameters of the network (the lengths of the oscillation cycle and the refractory period, the coupling constant and the broadcast failure probability) are also parameters within the input language of PRISM. That is, we can instantiate them and automatically run the model checking engine on sets of parameter combinations.
Population model A single module is used to specify an instance of the population model. The global state is encoded using T finitely bound integer variables ranging over N discrete values, where each variable records the number of oscillators sharing some phase value in 1, . . . , T . In contrast to the specification of the concrete model, specifying an instance of the population model is not straightforward. For the concrete model the length of the oscillation cycle T could be easily specified as a parameter within the input language of PRISM, and moving between two models with different values for T , but sharing the same value for all other parameters, is as simple as changing the value of the parameter in the specification. For the population model this was not possible, as T variables are needed to count the oscillators with different phases, and hence changing T leads to a change in the number of variables. The changes that arise from the introduction or removal of these new variables propagate throughout the specification, which for large models may consist of many thousands of lines of code. Further complications also arose when encoding the transitions from the initial unconfigured state σ I to every possible configured state for the oscillators, since a single model may have many thousands of such transitions for larger values of N and T .
To facilitate the analysis of families of parameter-wise different oscillator population models we developed a Python script that allows the user to define ranges for N , T , R, and μ, for some fixed definitions of the perturbation function Δ. Then, given a list of properties, for each combination of parameters the script generates a specification, checks all the given properties against that specification using PRISM, and writes user-specified output (e.g. result, model checking time, etc.) to a file that can be used by statistical analysis tools. The reduction introduced in Sect. 5.5 could not be implemented with the existing high-level specification language of PRISM. It was only possible to specify the individual rewards labelling some transitions, namely those from the initial unconfigured state to each of its possible successors, by introducing additional variables to the model. Therefore, at the time of writing, and to the best of our knowledge, it is not possible to implement a script for the generation of PRISM code for all the reduced models. The results shown in Table 3 were obtained using a prototypical version of the model checker ePMC, formerly known as IscasMC [27], which had been modified to accept a low-level representation of a model as input (as a set of states and transitions), where each transition could be individually labelled with a reward.
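The generation step can be pictured with a small Python sketch. It is not the authors' script: the function names, the variable naming scheme `k_1 ... k_T`, the module name, and the filter R ≤ T are all assumptions made for illustration, and the guarded commands themselves are omitted. It only shows why the specification has to be regenerated whenever T changes, since one bounded counter variable is declared per phase.

```python
from itertools import product


def population_variables(N, T):
    """One bounded PRISM counter per phase 1..T (hypothetical naming scheme)."""
    return [f"k_{phi} : [0..{N}] init 0;" for phi in range(1, T + 1)]


def generate_spec(N, T, R, mu):
    """Skeleton of a PRISM DTMC specification for one parameter combination."""
    lines = [
        "dtmc",
        "",
        f"const int N = {N};",
        f"const int T = {T};",
        f"const int R = {R};",
        f"const double mu = {mu};",
        "",
        "module population",
    ]
    lines += ["  " + declaration for declaration in population_variables(N, T)]
    lines += ["  // guarded commands for the transitions are omitted in this sketch", "endmodule"]
    return "\n".join(lines)


def parameter_grid(Ns, Ts, Rs, mus):
    """All parameter combinations, skipping refractory periods longer than the cycle (assumed)."""
    for N, T, R, mu in product(Ns, Ts, Rs, mus):
        if R <= T:
            yield N, T, R, mu


specs = {params: generate_spec(*params) for params in parameter_grid([3], [4, 5], [1, 6], [0.1])}
```

A driver along these lines would then write each specification to a file and hand it to the model checker together with the property list; that part is left out here because it depends on the local installation.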

Comparison between concrete and population
In this section, we compare the differences between the concrete and population models. We restrict our comparison to the analysis of the probability to achieve synchronisation. The increase in the size of the model is much more pronounced for the concrete model. Hence, the performance of the model checking procedure differs strongly between the models. Table 2 shows the model construction and checking times for some exemplary parameter combinations of the models, as reported by PRISM. In the table, the model construction time denotes the time PRISM needs to construct a DTMC representation from the specification. In the concrete model, the bulk of time is spent in the model checking phase, while the construction is much faster. For the analysis of the population model, however, the situation is reversed: the model construction phase is at least an order of magnitude longer than the model checking phase. As expected, the model checker needs less time for the analysis of the population model, even if the times needed for model construction and checking are considered together. However, we can also see that for both the concrete and the population model, refractory periods around R = T/2 are harder for the model checker to analyse. This is because for very low and very high refractory periods the synchronisation probability is either 1 or 0, and such qualitative results are obtained trivially and efficiently in a precomputation step by the model checker.
For properties relating to global behaviour a population model is beneficial. However, it requires all nodes to be behaviourally identical. While we defined our concrete model in a similar fashion, we could in principle relax this restriction. That is, we could allow for different perturbation functions for each node, or for partitions of nodes.

Reduction analysis
In this section we present the effect of the reduction of population models as defined in Sect. 5.5. Table 3 shows the number of reachable states and transitions of the DTMC D, and the corresponding reduction D′, for different population sizes (N ) and oscillation cycle lengths (T ), using the Mirollo and Strogatz model of synchronisation presented at the start of this section. The number of reachable states is stable under changes to the parameters R, ε, and μ, since every possible firing state is always reachable from the initial state. For the results shown here the parameters were arbitrarily set to R = 1, ε = 0.1. The underlying graph of the DTMC, and hence the number of transitions, is stable under changes to the parameter μ, and is not of interest here. Larger values of N and T were not investigated, due to the large model sizes when generating the unreduced model. Table 4 shows the number of transitions of the DTMC, and the corresponding reduction, for various population model instances, and again uses the Mirollo and Strogatz model of synchronisation. Increasing the length of the refractory period (R) results in an increase in the reduction of transitions in the model. A longer refractory period leads to more firing states where the firing of a group of oscillators is ignored. This results in successor states having oscillators with lower values for phase, and hence a longer sequence of deterministic transitions (later removed in the reduction) leading to the next firing state. Conversely, increasing the strength of the coupling between oscillators (ε) results in a decrease in the reduction of transitions in the model. For the Mirollo and Strogatz model of synchronisation used here, increasing the coupling strength results in a linear increase in the perturbation to phase induced by the firing of an oscillator.
This results in successor states of firing states having oscillators with higher values for phase, and hence a shorter sequence of deterministic transitions leading to the next firing state.

Population model evaluation
In this section we discuss the influence of different parameters of the population model defined in Sect. 5 on both the likelihood that a network of oscillators will eventually synchronise, and the time and power consumption required to achieve this. For a real deployment of synchronising nodes, for example a Wireless Sensor Network (WSN), communication is costly with respect to energy consumption. Therefore, minimising power consumption is a critical consideration in the design of such networks [2,44]. Once deployed, a WSN is generally expected to function independently for long periods of time. In particular, regular battery replacement can be costly and impractical for remote sensing applications. Hence, it is important to reduce the power consumption of the individual nodes by choosing low-power hardware and/or energy efficient protocols. However, to make informed choices, it is also necessary to have good estimations of the power consumption for individual nodes. While the general power consumption of the hardware can be extracted from data sheets, estimating the overall power consumption of different protocols is more demanding.
Communication between nodes in the network is either active when sending a message, i.e., when a node fires, or passive, when receiving messages from other nodes. Hence, during periods where a sensor does neither, the antenna can be shut down to save energy. In our models, this interval of inactivity corresponds to the refractory period. That is, the longer the refractory period is, the less energy will be consumed.
We evaluate the models with respect to two metrics of synchrony. First, we consider a binary metric where we are only interested in states where all oscillators share precisely the same phase. Second, we derive a synchronisation metric from the complex order parameter of Kuramoto [32], which captures the degree of synchrony of a fully connected network of oscillators as a real value in the interval [0, 1].

Binary synchronisation metric
Here we use the binary notion of synchronisation introduced in Sect. 5.3, where a global state $\sigma = \langle k_1, \ldots, k_T \rangle$ is synchronised if, and only if, there is some $\Phi \in \{1, \ldots, T\}$ such that $k_\Phi = N$. We created concrete input models for PRISM for different parameters of the model, for example the number of oscillators and different coupling strengths. Each of these models was subsequently checked with respect to different properties. Other case studies could also be considered for alternative models of synchronisation where the dynamics of oscillators, and their interactions, can be described by some perturbation function.
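The binary notion of synchronisation above can be illustrated with a minimal Python sketch. It assumes a global state is held as a list of counts, one entry per phase; the function name `is_synchronised` and this list encoding are illustrative choices and are not taken from the PRISM models.

```python
from typing import Sequence


def is_synchronised(sigma: Sequence[int], n: int) -> bool:
    """Return True iff a single phase holds all n oscillators.

    sigma[i] is the number of oscillators whose phase is i + 1
    (hypothetical encoding of a global state as a list of counts).
    """
    if sum(sigma) != n:
        raise ValueError("state does not account for all n oscillators")
    return any(k == n for k in sigma)


# Six oscillators, cycle length four.
print(is_synchronised([0, 0, 6, 0], 6))  # all oscillators share phase 3
print(is_synchronised([0, 2, 4, 0], 6))  # oscillators split over two phases
```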
We are interested in the probability of eventual synchronisation and in the expected time needed to achieve synchronisation. The probability of eventual synchronisation is given by the PCTL property $\mathrm{P}_{=?}[\mathrm{F}\ \mathit{synchronised}]$. For the expected time, we first define a reward structure that associates a value of $1/T$ with every unsynchronised state in $\Gamma \setminus \{\sigma_I\}$, and that thereby records the number of cycles taken to achieve synchrony.
To determine the expected time taken for a population model to synchronise we accumulate a reward along a path until some synchronised global state is reached, and define a reward structure $R_{\mathit{time}} = (R_s, R_t)$. We set $R_s(\sigma) = 1/T$ for every unsynchronised state in $\Gamma \setminus \{\sigma_I\}$, $R_s(\sigma) = 0$ for all other $\sigma \in \Gamma$, and $R_t(\sigma_1, \sigma_2) = 0$ for all $\sigma_1, \sigma_2 \in \Gamma$. Intuitively, we expect a reward of 1 along a path where synchronisation occurs after T transitions (one complete oscillation cycle). This is achieved by assigning a reward of $1/T$ to each unsynchronised state, since a transition of the model from one state to the next corresponds to a step of $1/T$ oscillation cycles. In this way we obtain a measure of synchronisation time in oscillation cycles.
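To make the accumulation of this reward concrete, the following Python sketch computes the expected reward accumulated before reaching a target state in a small DTMC, using simple fixed-point iteration. The two toy chains are hypothetical and are not the population model; only the reward of 1/T per unsynchronised state is taken from the text. A model checker such as PRISM performs this computation far more carefully.

```python
from typing import Dict, Set

Chain = Dict[int, Dict[int, float]]  # state -> {successor: probability}


def expected_reward(chain: Chain, reward: Dict[int, float], targets: Set[int],
                    tol: float = 1e-12, max_iter: int = 100_000) -> Dict[int, float]:
    """Expected state reward accumulated before first reaching a target.

    Iterates x(s) = reward(s) + sum_t P(s, t) * x(t), with x = 0 on targets.
    """
    x = {s: 0.0 for s in chain}
    for _ in range(max_iter):
        delta = 0.0
        for s, successors in chain.items():
            if s in targets:
                continue
            new = reward[s] + sum(p * x[t] for t, p in successors.items())
            delta = max(delta, abs(new - x[s]))
            x[s] = new
        if delta < tol:
            return x
    raise RuntimeError("no convergence: a target may not be reached with probability 1")


T = 4
reward = {s: 1.0 / T for s in range(T)}  # unsynchronised states 0..T-1
reward[T] = 0.0                          # state T plays the synchronised state

# Toy chain 1: synchrony is reached after exactly T transitions.
deterministic: Chain = {0: {1: 1.0}, 1: {2: 1.0}, 2: {3: 1.0}, 3: {4: 1.0}, 4: {4: 1.0}}
print(expected_reward(deterministic, reward, {T})[0])  # one full cycle

# Toy chain 2: the last step fails with probability 0.2 and restarts the cycle.
lossy: Chain = {0: {1: 1.0}, 1: {2: 1.0}, 2: {3: 1.0}, 3: {4: 0.8, 0: 0.2}, 4: {4: 1.0}}
print(expected_reward(lossy, reward, {T})[0])  # about 1.25 cycles
```

In the first chain the accumulated reward is one oscillation cycle, matching the intuition given above. In the second the expected value satisfies E = 1 + 0.2 E, that is E = 1.25. If a target is not reached with probability 1 the true value is infinite; this sketch does not detect that case and simply fails to converge.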
The expectation of time to achieve synchronisation is then given by the PCTL property $\mathrm{R}_{=?}[\mathrm{F}\ \mathit{synchronised}]$ with respect to the reward structure $R_{\mathit{time}}$. We note here that a result of Infinity is obtained for accumulating this reward along a path where the probability of reaching a synchronised state is less than 1. We generated models for different numbers of oscillators 3 ≤ N ≤ 8, cycle lengths 4 ≤ T ≤ 10, coupling constants ε ∈ {0, 0.1, . . . , 1.0}, refractory periods 0 ≤ R ≤ T, and message loss probabilities μ ∈ {0, 0.1, . . . , 1.0}, and analysed the models with respect to the two properties of interest.
Figure 5a plots the probability of synchronisation for different rates of broadcast failure against the refractory period for N = 8, T = 10, and ε = 0.1. We can observe a tradeoff between a high refractory period and high synchronisation probability. As long as the refractory period is less than half the oscillation cycle, synchronisation will be achieved in almost all cases. Higher values for R result in a rapid drop in synchronisation probability. The exception is the edge case μ = 0, which may seem surprising. If μ = 0 a model is deterministic. The results for μ = 1 are omitted here as, unsurprisingly, if all firings result in broadcast failures the synchronisation probability is almost zero. In fact, the only runs that synchronise in this case are runs where the first configured state is already synchronised.
Figure 5b shows us that a higher refractory period results in shorter synchronisation times when the probability for broadcast failure is low. In general, a longer refractory period up to half the cycle length improves the rate of convergence to synchrony, which is consistent with the findings of [17]. Furthermore, for high values of μ the differences in synchronisation times for different refractory period lengths are negligible. Hence, a refractory period of slightly less than half the cycle, with a low coupling constant ε, is optimal for this model of synchronisation. As ε is increased the results remain similar, but with a decrease in synchronisation times.
Order parameter synchronisation metric
In the previous section a binary metric of synchrony for a population model was employed, where a state was synchronised if, and only if, all oscillators in that state shared the same phase. However, it is clear that some global states appear to be closer to achieving a truly synchronised state than others. Consider the global states $\sigma_1 = \langle 0, 2, 0, 2, 0, 2 \rangle$ and $\sigma_2 = \langle 0, 0, 0, 0, 1, 5 \rangle$ of some population model for a network of N = 6 nodes with an oscillation cycle over T = 6 discrete values. Using the binary notion of synchrony all that is known is that both states are not synchronised, yet it is clear that for nearly all models of synchronisation, encoded as some perturbation function, $\sigma_2$ appears to be closer to converging to a state where all oscillators share the same phase.
The binary notion of synchrony can be extended by introducing a phase coherence metric for the level of synchrony of a global state. The metric is derived from the order parameter introduced by Kuramoto [32] as a measure of synchrony for a population of coupled oscillators. If the phases of the oscillators are considered as positions on the unit circle in the complex plane, they can be represented as complex numbers with magnitude 1.

Definition 21
The function $\phi^{C} : [1 \ldots T] \to \mathbb{C}$ maps a phase value to its corresponding position on the unit circle in the complex plane, and is defined as
$$\phi^{C}(\Phi) = e^{i\theta_\Phi}, \quad \text{where } \theta_\Phi = \frac{2\pi}{T}(\Phi - 1).$$
A measure of synchrony η ∈ [0, 1] can then be obtained by calculating the magnitude of the complex number corresponding to the mean of the phase positions. A global state has a maximal value of η = 1 when all oscillators are synchronised and share the same phase Φ, mapped to the position defined by $\phi^{C}(\Phi)$. It then follows that the mean position is also $\phi^{C}(\Phi)$ and $|\phi^{C}(\Phi)| = 1$. A global state has a minimal value of η = 0 when all of the positions mapped to the phases of the oscillators are uniformly distributed around the unit circle. This also occurs when their positions achieve mutual counterpoise, for example when $N/2$ oscillators share some phase value Φ and the remaining $N/2$ oscillators have a phase value whose position on the complex plane is the negation of $\phi^{C}(\Phi)$.
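The measure η described above can be sketched in a few lines of Python, applied to the two example states σ1 and σ2 from the start of this section. The sketch assumes that the T phase values are spaced evenly around the unit circle with phase 1 at angle 0; this angular convention is an assumption, and any rotation of it yields the same magnitude. The function names are illustrative.

```python
import cmath
from typing import Sequence


def phase_position(phase: int, cycle_length: int) -> complex:
    """Position of a phase value on the unit circle.

    Assumed convention: phase 1 sits at angle 0 and consecutive phases
    are 2*pi / cycle_length apart.
    """
    return cmath.exp(2j * cmath.pi * (phase - 1) / cycle_length)


def phase_coherence(sigma: Sequence[int]) -> float:
    """Magnitude of the mean phase position of all oscillators.

    sigma[i] is the number of oscillators with phase i + 1.
    """
    cycle_length = len(sigma)
    population = sum(sigma)
    total = sum(k * phase_position(phase, cycle_length)
                for phase, k in enumerate(sigma, start=1))
    return abs(total / population)


sigma_1 = [0, 2, 0, 2, 0, 2]  # three pairs spread evenly around the cycle
sigma_2 = [0, 0, 0, 0, 1, 5]  # five oscillators at phase 6, one at phase 5

print(round(phase_coherence(sigma_1), 3))  # approximately 0: mutual counterpoise
print(round(phase_coherence(sigma_2), 3))  # sqrt(31)/6, approximately 0.928
print(round(phase_coherence([0, 0, 6, 0, 0, 0]), 3))  # fully synchronised
```

Under this convention σ1 has coherence 0 while σ2 has coherence of roughly 0.93, which matches the intuition that σ2 is much closer to synchrony even though neither state is synchronised in the binary sense.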

Definition 22
The phase coherence function $\mathrm{PCF} : \Gamma \to [0, 1]$ maps a global state to a real value in the interval [0, 1], and is given by
$$\mathrm{PCF}(\langle k_1, \ldots, k_T \rangle) = \left| \frac{1}{N} \sum_{\Phi=1}^{T} k_\Phi \, \phi^{C}(\Phi) \right|,$$
where |·| denotes the complex modulus.
The two new properties of interest are firstly, the time taken for the network to reach a state where some desirable degree of synchronisation with respect to the new metric has been achieved, and secondly, the power consumed by the network to reach that state. In addition to the expected time/power consumption we will also investigate the maximal time/power consumption.
We now define a reward structure $R_{\mathit{pow}} = (R_s, R_t)$ that annotates a model with rewards corresponding to power consumption. Firstly, let $I_{id}$, $I_{rx}$, and $I_{tx}$ be the current draw in amperes for the idle, receive, and transmit modes of a synchronising node in a network, V be the voltage, C be the length of the oscillation cycle in seconds, and $M_t$ be the time taken to transmit a synchronisation message in seconds. We now define
$$W_{id} = \frac{I_{id} V C}{3600\, T}, \qquad W_{rx} = \frac{I_{rx} V C}{3600\, T}, \qquad W_{tx} = \frac{I_{tx} V M_t}{3600},$$
where $W_{id}$ is the power consumption in Watt-hours of one node for one discrete step within its refractory period (the oscillator is in the idle mode), $W_{rx}$ is the power consumption in Watt-hours of one node for one discrete step outside of its refractory period (in receive mode), and $W_{tx}$ is the power consumption in Watt-hours to transmit one synchronisation message.
The power consumption of the network consists of the power necessary to transmit the synchronisation messages, and that of the oscillators in the idle and receive modes. For synchronisation messages, we consider each firing state σ, and assign a reward of $R_t(\sigma, \sigma') = k_1 W_{tx}$ to every transition from σ to a successor state $\sigma' = \langle k_1, \ldots, k_T \rangle$. This corresponds to the total power consumption for the transmission of $k_1$ synchronisation messages. For each state $\sigma = \langle k_1, \ldots, k_T \rangle$ where the oscillators are configured, the total power consumption for oscillators in the idle and receive modes is
$$R_s(\sigma) = \sum_{\Phi=1}^{R} k_\Phi W_{id} + \sum_{\Phi=R+1}^{T} k_\Phi W_{rx}.$$
Fig. 7 Power (a) and time (b) per node to achieve synchronisation
For our experiments we set ε = 0.1 and μ = 0.2. We could have conducted analyses for different values for these parameters. For a real system, the probability μ of broadcast failure occurrence is highly dependent on the deployment environment. For deployments in benign environments we would expect a relatively low rate of failure, for instance a WSN within city limits under controlled conditions, whilst a comparably high rate of failure would be expected in harsh environments such as a network of off-shore sensors below sea level. The coupling constant ε is a parameter of the system itself. Our results suggest that higher values for ε are always beneficial; however, this is because we only model fully connected networks. High values for ε may be detrimental when considering different topologies, since firing nodes may perturb synchronised subcomponents of a network. However, we defer such an analysis to future work.
As an example we analyse the power consumption for values taken from the datasheet of the MICAz mote [37]. For the transmit, receive and idling mode, we assume $I_{tx} = 17.4$ mA, $I_{rx} = 19.7$ mA, and $I_{id} = 20$ μA, respectively. Furthermore, we assume that the oscillators use a voltage of 3.0 V. We define $\mathit{coherent}_\lambda$ to be a predicate that holds for any state σ in $\Gamma \setminus \{\sigma_I\}$ with $\mathrm{PCF}(\sigma) \geq \lambda$. The properties of interest are then given by the PCTL property
$$\mathrm{R}_{=?}[\mathrm{F}\ \mathit{coherent}_\lambda] \qquad (31)$$
Figure 7a, b show both the average and maximal power consumption per node (in mWh) and time (in cycles) needed to synchronise, in relation to the phase coherence of the network with respect to different lengths of the refractory period, where ε = 0.1 and μ = 0.2. That is, they show how much power is consumed (time is needed, resp.) for a system in an arbitrary state to reach a state where some degree of phase coherence has been achieved.
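As a rough illustration of how these datasheet values turn into rewards, the Python sketch below computes the energy per discrete step in each mode as current × voltage × duration, converted from joules to Watt-hours. The currents and the voltage are the ones quoted above. The cycle length, message transmission time, number of phases and refractory period are hypothetical placeholders, as is the convention that oscillators with phase at most R are idle; none of these are values stated in the text.

```python
from typing import Sequence

# Values quoted in the text for the MICAz mote.
I_TX = 17.4e-3  # transmit current in amperes
I_RX = 19.7e-3  # receive current in amperes
I_ID = 20e-6    # idle current in amperes
VOLTAGE = 3.0   # volts

# Hypothetical placeholders, NOT taken from the source.
CYCLE_SECONDS = 10.0    # assumed length C of one oscillation cycle
MESSAGE_SECONDS = 1e-3  # assumed time M_t to transmit one message
T = 10                  # assumed number of discrete phase values
R = 3                   # assumed refractory period


def watt_hours(current_a: float, voltage_v: float, seconds: float) -> float:
    """Energy drawn at a constant current over a duration, in Watt-hours."""
    return current_a * voltage_v * seconds / 3600.0


W_ID = watt_hours(I_ID, VOLTAGE, CYCLE_SECONDS / T)  # one idle step
W_RX = watt_hours(I_RX, VOLTAGE, CYCLE_SECONDS / T)  # one receive step
W_TX = watt_hours(I_TX, VOLTAGE, MESSAGE_SECONDS)    # one message


def state_reward(sigma: Sequence[int], refractory: int) -> float:
    """Energy of all oscillators for one step in state sigma.

    Assumes oscillators with phase <= refractory are idle and all
    others are in receive mode; sigma[i] counts oscillators at phase i + 1.
    """
    idle = sum(sigma[:refractory])
    receiving = sum(sigma[refractory:])
    return idle * W_ID + receiving * W_RX


def firing_reward(fired: int) -> float:
    """Energy to transmit one synchronisation message per fired oscillator."""
    return fired * W_TX


sigma = [2, 0, 1, 0, 0, 3, 0, 0, 1, 1]  # eight oscillators, three of them idle
print(f"idle step:    {W_ID:.3e} Wh")
print(f"receive step: {W_RX:.3e} Wh")
print(f"state reward: {state_reward(sigma, R):.3e} Wh")
print(f"two firings:  {firing_reward(2):.3e} Wh")
```

With these currents a step in receive mode costs roughly a thousand times more than an idle step, which is why a longer refractory period lowers the energy accumulated along a path.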
The average and maximal values are obtained using the avg and max filters of the PRISM model checker. These filters give the average (maximum, resp.) expected reward across all paths starting in states that satisfy some given predicate. The desired values are therefore obtained by checking property (31) against all states of the model where oscillators are configured. The much larger values obtained for R = 1 and phase coherence 0.9 are not shown here, to avoid distortion of the figures. The energy consumption for these values is roughly 2.4 mWh, while the time needed is around 19 cycles. Observe that we only show values for the refractory period R with $R < T/2$. For larger values of R not all runs synchronise [24], resulting in an infinitely large reward being accumulated for both the maximal and average cases.
As expected when starting from an arbitrary state, the time and power consumption increase monotonically with the order of synchrony to be achieved. On average, networks with longer refractory periods require less power for synchronisation, and take less time to achieve it. The only exception is that the average time to achieve synchrony with a refractory period of four is higher than for two and three. However, if lower phase coherence is sufficient then this trend is stable. In contrast, the maximal power consumption of networks with R = 4 is consistently higher than that of networks with R = 3. In addition, the maximal time needed to achieve synchrony for networks with R = 4 is higher than for lower refractory periods, except when the phase coherence is greater than or equal to 0.9. We find that networks with a refractory period of three will need the smallest amount of time to synchronise, regardless of whether we consider the maximal or average values. Furthermore, the average power consumption for full synchronisation (phase coherence one) differs only slightly between R = 3 and R = 4 (less than 0.3 mWh). Hence, for the given example, R = 3 gives the best results. These relationships are stable even for different broadcast failure probabilities μ, while the concrete values increase only slightly, as illustrated in Fig. 8, which shows the power consumption for different values of μ when ε = 0.1.
The general relationship between power consumption and time needed to synchronise is shown in Fig. 9a, b. Within these figures, we do not distinguish between different coupling constants and broadcast failure probabilities. We omit the two values for R = 1, ε = 0.1 and μ ∈ {0.1, 0.2} in Fig. 9b to avoid distortion of the graph, since the low coupling strength and low probability of broadcast failure lead to longer synchronisation times and hence higher power consumption. While this might seem surprising, it has been shown that uncertainty in discrete systems often aids convergence [22].
The relationship between power consumption and time to synchronise is linear, and the slope of the relation decreases for higher refractory periods. While the linearity is almost perfect for the average values, the maximal values have larger variation. The figures again suggest that R = 3 is a sensible and reliable choice, since it provides the best stability of power consumption and time to synchronise. In particular, if the broadcast failure probability changes, the variations are less severe for R = 3 than for other refractory periods.

Conclusion
In this paper we have introduced a formal concrete model for a network of nodes synchronising their clocks over a set of discrete values. Furthermore, we developed a population model that can alleviate state-space explosion when reasoning about significantly larger networks. We encoded both models as discrete-time Markov chains, and evaluated them for several parameter combinations. Finally, we formally connected the models by showing that a concrete model of a network and a population model of the same network are probabilistically weakly bisimilar.
Formalising the individual nodes of a network allows for the analysis of their internal properties. Even though we did not give explicit definitions, a concrete network could be instantiated to incorporate different topologies by explicit encoding of possible perturbations in the nodes' transitions. However, the internal structure also complicates the verification of global network properties. Modelling the whole network as the product of the models for the individual nodes quickly, and unsurprisingly, results in a model that is too large to analyse with existing tools and techniques. While the use of appropriate collective abstractions, such as population models, allows for the analysis of larger networks, such abstractions often impose restrictions on the topologies of the network that can be considered. We could, of course, simply take the product of individual population models to represent network structures more specialised than the fully-connected graphs considered here, but again we face the consequences of this approach when trying to analyse the resulting model, in particular the explosion of the state space. Furthermore, this also means that every node of a population model may influence all nodes of a connected population model. Finally, our abstract relation would need to take the mapping of single nodes into different components into account. When using population models we lose the possibility to distinguish between nodes having the same internal state. However, this does not restrict our analysis when considering networks of homogeneous nodes where the properties of interest relate to global behaviours of the network itself.
Our current definition of pulse-coupled oscillators only allows for non-negative results of the phase response function. However, there are also oscillator definitions with phase response functions with possibly negative values [48]. That is, instead of shifting the state of an oscillator towards the end of the cycle, the perturbation may reduce the value of the oscillator's state. It would be interesting to study the impact of negative-valued phase response functions in the setting of discrete clock values.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.