Running Time Analysis of Broadcast Consensus Protocols

Broadcast consensus protocols (BCPs) are a model of computation in which anonymous, identical, finite-state agents compute by sending and receiving global broadcasts. BCPs are known to compute all number predicates in $\mathsf{NL} = \mathsf{NSPACE}(\log n)$, where $n$ is the number of agents. They can be considered an extension of the well-established model of population protocols. This paper investigates execution time characteristics of BCPs. We show that every predicate computable by population protocols is computable by a BCP with expected $\mathcal{O}(n \log n)$ interactions, which is asymptotically optimal. We further show that every log-space, randomized Turing machine can be simulated by a BCP with $\mathcal{O}(n \log n \cdot T)$ interactions in expectation, where $T$ is the expected runtime of the Turing machine. This allows us to characterise polynomial-time BCPs as computing exactly the number predicates in $\mathsf{ZPL}$, i.e. predicates decidable by log-space, randomised Turing machines with zero error in expected polynomial time, where the input is encoded in unary.


Introduction
In recent years, models of distributed computation following the computation-by-consensus paradigm attracted considerable interest in research (see for example [9,25,26,8,13]). In such models, network agents compute number predicates, i.e. Boolean-valued functions of the type $\mathbb{N}^k \to \{0, 1\}$, by reaching a stable consensus whose value determines the outcome of the computation. Perhaps the most prominent model following this paradigm are population protocols [5,6], a model in which anonymous, identical, finite-state agents interact randomly in pairwise rendezvous to agree on a common Boolean output.
Due to anonymity and locality of interactions, it is an inherent property of population protocols that agents are generally unable to detect with absolute certainty when the computation has stabilized. This makes sequential composition of protocols difficult, and further complicates the implementation of control structures such as loops or branching statements. To overcome this drawback, two kinds of approaches have been suggested in the literature: (1) let agents guess when the computation has stabilized, leading to composable, but merely approximately correct protocols [7,24], or (2) extend population protocols by global communication primitives that enable agents to query global properties of the agent population [13,8,26].
This work was supported by an ERC Advanced Grant (787367: PaVeS) and by the Research Training Network of the Deutsche Forschungsgemeinschaft (DFG) (378803395: ConVeY). The full version of this paper can be found at https://arxiv.org/abs/2101.03780.
Approaches of the first kind are for the most part based on simulations of global broadcasts by means of epidemics. In epidemics-based approaches the spread of the broadcast signal is simulated by random pairwise rendezvous, akin to the spread of a viral epidemic in a population. When the broadcasting agent meets a certain fraction of "infected" agents, it may decide with reasonable certainty that the broadcast has propagated throughout the entire population, which then leads to the initiation of the next computation phase. Of course, the decision to start the next phase may be premature, in which case the rest of the execution may be faulty. However, epidemics can also be used to implement phase clocks that help keep the failure probability low (see e.g. [7]).
In [13], Blondin, Esparza, and one of the authors of this paper introduced broadcast consensus protocols (BCPs), an extension of population protocols by reliable, global, and atomic broadcasts. BCPs find their precursor in the broadcast protocol model introduced by Emerson and Namjoshi in [17] to describe bus-based hardware protocols. This model has been investigated intensely in the literature, see e.g. [18,19,15,28]. Broadcasts also arise naturally in biological systems. For example, Uhlendorf et al. analyse applications of broadcasts in the form of an external, global light source for controlling a population of yeasts [12].
The authors of [13] show that BCPs compute precisely the predicates in NL = NSPACE(log n), where n is the number of agents. For comparison, it is known that population protocols compute precisely the Presburger predicates, which are the predicates definable in the first-order theory of the integers with addition and the usual order; a class much less expressive than the former.
An epidemics-based approach was used in [7] to show that population protocols can simulate with high probability a step of a virtual register machine with expected $O(n \log^5 n)$ interactions, where n is the number of agents. This result stimulated further research into time bounds for classical problems such as leader election (see e.g. [21,1,16,29,11]) and majority (see e.g. [4,2]). In their seminal paper [5], Angluin et al. already showed that population protocols can stably compute Presburger predicates with $O(n^2 \log n)$ interactions in expectation. Belleville et al. further showed that leaderless protocols require a quadratic number of interactions in expectation to stabilize to the correct output for a wide class of predicates [10]. The aforementioned bounds apply to stabilisation time: the time it takes to go from an initial configuration to a stable consensus that cannot be destroyed by future interactions. In [24], Kosowski and Uznanski considered the weaker notion of convergence time: the time it takes on average to ultimately transition to the correct consensus (although this consensus could in principle be destroyed by future interactions), and they show that sublinear convergence time is achievable.
By contrast, to the best of our knowledge, time characteristics of BCPs have not been discussed in the literature. The NL-powerful result presented in [13] does not establish any time bounds. In fact, [13] only considers a non-probabilistic variant of BCPs with a global fairness assumption instead of probabilistic choices.
Contributions of the paper. This paper initiates the runtime analysis of BCPs in terms of expected number of interactions to reach a stable consensus. To simplify the definition of probabilistic execution semantics, we introduce a restricted, deterministic variant of BCPs without rendezvous transitions. In Section 2, we define probabilistic execution semantics for the restricted version of BCPs, and we provide an introductory example for a fast protocol computing majority in Section 3.
In Section 4, we show that these restrictions of our BCP model are inconsequential in terms of expected number of interactions: both rendezvous and nondeterministic choices can be simulated with a constant runtime overhead.
In Section 5, we show that every Presburger predicate can be computed by BCPs with O(n log n) interactions and with constant space, where n denotes the number of agents in the population. This result is asymptotically optimal.
In more generality, in Section 6, we use BCPs to simulate Turing machines (TMs). In particular, we show that any randomised, logarithmically space-bound, polynomial-time TM can be simulated by a BCP with an overhead of O(n log n) interactions per step. Conversely, any polynomial-time BCP can be simulated by such a TM. This result can be considered an improvement of the NL bound from [13], now in a probabilistic setting. We also give a corresponding upper bound, which yields the following succinct characterisation: polynomial-time BCPs compute exactly the number predicates in ZPL, which are the languages decidable by randomised log-space polynomial-time TMs with zero-error (the log-space analogue to ZPP).
Bounding the time requires a careful analysis of each step in the simulation of the Turing machine. Thus, our proof diverges in significant ways from the proof establishing the NL lower bound in [13]. Most notably, we now make use of epidemics in order to implement clocks that help reduce failure rates.

Preliminaries
Complexity classes. As is usual, we define NL as the class of languages decidable by a nondeterministic log-space TM. Additionally, by ZPL we denote the set of languages decided by a randomised log-space TM A such that A only terminates with the correct result (zero-error) and terminates within $O(\mathrm{poly}(n))$ steps in expectation, as defined by Nisan in [27].
Multisets. A multiset over a finite set $E$ is a mapping $M : E \to \mathbb{N}$. The set of all multisets over $E$ is denoted $\mathbb{N}^E$. For every $e \in E$, $M(e)$ denotes the number of occurrences of $e$ in $M$. We sometimes denote multisets using a set-like notation, e.g. ⟅f, g, g⟆ is the multiset $M$ such that $M(f) = 1$, $M(g) = 2$ and $M(e) = 0$ for every other $e \in E$.
Broadcast Consensus Protocols. A broadcast consensus protocol [13] (BCP) is a tuple $P = (Q, \Sigma, \delta, I, O)$ where
- $Q$ is a non-empty, finite set of states,
- $\Sigma$ is a non-empty, finite input alphabet,
- $\delta$ is the transition function (defined below),
- $I : \Sigma \to Q$ is the input mapping, and
- $O \subseteq Q$ is a set of accepting states.
The function δ maps every state q ∈ Q to a pair (r, f) consisting of the successor state r ∈ Q and the response function f : Q → Q.

Configurations.
A configuration is a multiset $C \in \mathbb{N}^Q$. Intuitively, a configuration C describes a collection of identical finite-state agents with Q as set of states, containing C(q) agents in state q for every q ∈ Q. We say that C is a 1-consensus if C(q) > 0 implies q ∈ O, and a 0-consensus if C(q) > 0 implies q ∉ O.
Step relation. A broadcast δ(q) = (r, f) is executed in three steps: (1) an agent at state q broadcasts a signal and leaves q; (2) all other agents receive the signal and move to the states indicated by the function f, i.e. an agent in state s moves to f(s); and (3) the broadcasting agent enters state r. Formally, for two configurations $C, C'$ we write $C \rightarrow C'$ whenever there exists a state q ∈ Q s.t. C(q) ≥ 1, δ(q) = (r, f), and $C' = f(C - ⟅q⟆) + ⟅r⟆$ is the configuration computed from C by the above three steps. By $\xrightarrow{*}$ we denote the reflexive-transitive closure of $\rightarrow$. For example, consider a configuration $C \overset{\text{def}}{=} ⟅a, a, b⟆$ and a broadcast transition $a \mapsto b, \{a \mapsto c, b \mapsto d\}$. To execute this transition, we move an agent from state a to state b and apply the response function to all other agents, so we end up with the configuration ⟅b, c, d⟆.
Broadcast transitions. We write broadcast transitions as $q \mapsto r, S$ with S a set of expressions $q' \mapsto r'$. This refers to δ(q) = (r, f), with $f(q') = r'$ for $(q' \mapsto r') \in S$. We usually omit identity mappings $q \mapsto q$ when specifying S.
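The three-step execution of a broadcast can be sketched in a few lines of Python. This is only an illustration under our own encoding, not part of the formal model: configurations are stored as `collections.Counter` multisets, δ as a dictionary from states to pairs `(r, f)`, and the helper name `broadcast_step` is hypothetical. Identity mappings may be omitted from `f`, as in the notation above.

```python
from collections import Counter

def broadcast_step(config, q, delta):
    """Return the successor of `config` after an agent in state q broadcasts."""
    if config[q] < 1:
        raise ValueError(f"no agent in state {q!r}")
    r, f = delta[q]
    rest = config.copy()
    rest[q] -= 1                          # (1) the broadcaster leaves q
    successor = Counter()
    for state, count in rest.items():
        if count:
            # (2) every other agent moves according to the response function;
            #     states missing from f are mapped to themselves
            successor[f.get(state, state)] += count
    successor[r] += 1                     # (3) the broadcaster enters r
    return successor

# The example from the text: C = {a, a, b} and the transition a -> b, {a -> c, b -> d}.
delta = {"a": ("b", {"a": "c", "b": "d"})}
C = Counter({"a": 2, "b": 1})
C_next = broadcast_step(C, "a", delta)
print(sorted(C_next.elements()))          # ['b', 'c', 'd']
```

The number of agents is preserved by construction, since every agent of the old configuration is mapped to exactly one agent of the new one.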
For graphic representations of broadcast protocols we use a different notation, which separates sending and receiving broadcasts. There we identify a transition δ(q) = (r, f) with a name α and specify it by writing $q \xrightarrow{!\alpha} r$ and $q' \xrightarrow{?\alpha} r'$ for $f(q') = r'$. Intuitively, $q' \xrightarrow{?\alpha} r'$ can be understood as an agent transitioning from $q'$ to $r'$ upon receiving the signal α, and $q \xrightarrow{!\alpha} r$ means that an agent in state q may transmit the signal α and simultaneously transition to state r.
As defined, δ is a total function, so each state is associated with a unique broadcast. If we do not specify a transition δ(q) = (r, f) explicitly, we assume that it simply maps each state to itself, i.e. $q \mapsto q, \{r \mapsto r : r \in Q\}$. We refer to those transitions as silent.
Executions. An execution is an infinite sequence $\pi = C_0 C_1 C_2 \dots$ of configurations with $C_i \rightarrow C_{i+1}$ for every i. It has some fixed number of agents $n \overset{\text{def}}{=} |C_0| = |C_1| = \dots$. Given a BCP and an initial configuration $C_0 \in \mathbb{N}^Q$, we generate a random execution with the following Markov chain: to perform a step at configuration $C_i$, a state q ∈ Q is picked at random with probability distribution $p(q) = C_i(q)/|C_i|$, and the (uniquely defined) transition δ(q) is executed, giving the successor configuration $C_{i+1}$. We refer to the random variable corresponding to the trace of this Markov chain as random execution.
Stable Computation. Let π denote an execution and inf(π) the configurations occurring infinitely often in π. If inf(π) contains only b-consensuses, we say that π stabilises to b. For a predicate $\varphi : \mathbb{N}^\Sigma \to \{0, 1\}$ we say that P (stably) computes ϕ, if for all inputs $X \in \mathbb{N}^\Sigma$, the random execution of P with initial configuration $C_0 = I(X)$ stabilises to ϕ(X) with probability 1.
Finally, for an execution $\pi = C_0 C_1 C_2 \dots$ we let $T_\pi$ denote the smallest i s.t. all configurations in $C_i C_{i+1} \dots$ are ϕ(X)-consensuses, or ∞ if no such i exists. We say that a BCP P computes ϕ within f(n) interactions, if for all initial configurations $C_0$ with n agents the random execution π starting at $C_0$ has $E(T_\pi) \le f(n) < \infty$, i.e. P stabilises within f(n) steps in expectation. If $f \in O(\mathrm{poly}(n))$, then we call P a polynomial-time BCP.
Global States. Often, it is convenient to have a shared global state between all agents. If, for a BCP $P = (Q, \Sigma, \delta, I, O)$, we have $Q = S \times G$, $I(\Sigma) \subseteq S \times \{j\}$ for some $j \in G$, and $f((s, j)) \in S \times \{j'\}$ for each $\delta((q, j)) = ((r, j'), f)$, then we say that P has global states G. A configuration C has global state $j \in G$ if every state occurring in C lies in $S \times \{j\}$. Note that, starting from a configuration with global state j, P can only reach configurations with a global state. Hence for P we will generally only consider configurations with a global state. To make our notation more concise, when specifying a transition δ(q) = (r, f) for P, we will write f as a mapping from S to S, as q, r already determine the mapping of global states.
Population Protocols. A population protocol [5] replaces broadcasts by local rendezvous. It can be specified as a tuple (Q, Σ, δ, I, O) where Q, Σ, I, O are defined as in BCPs, and $\delta : Q^2 \to Q^2$ defines rendezvous transitions. A step of the protocol at C is made by picking two agents uniformly at random and applying δ to their states: first $q_1 \in Q$ is picked with probability $C(q_1)/|C|$, then a second state $q_2$ is picked among the remaining agents, and the two agents move to the states $\delta(q_1, q_2)$.
Broadcast Protocols. Later on we will construct BCPs out of smaller building blocks which we call broadcast protocols (BPs). A BP is a pair (Q, δ), where Q and δ are defined as for BCPs. We extend the applicable definitions from above to BPs, in particular the notions of configurations, executions, and global states.
As an introductory example, we construct a broadcast consensus protocol for the majority predicate $\varphi(x, y) = x > y$.
Note that we use the more compact notation for transitions in the presence of global states. [Display omitted: long form of transition (α).]
To make the presentation of the following sample execution more readable, we shorten the state (i, j) to $i_j$. [Display omitted: sample execution for input x = 3 and y = 2.]
Intuitively, there is a preliminary global consensus, which is stored in the global state. Initially, it is rejecting, as x > y is false in the case x = y = 0. However, any x agent is enough to tip the balance, moving to an accepting global state. Now any y agent could speak up, flipping the consensus again.
The two factions initially belonging to x and y, respectively, alternate in this manner by sending signals α and β. Strict alternation is ensured as an agent will not broadcast to confirm the global consensus, only to change it.
After emitting the signal, the agent from the corresponding faction goes into a passive state, where it can no longer influence the computation. In the end, the majority faction remains and determines the final consensus.
Considering these alternations with shrinking factions, the expected number of steps of the protocol until stabilization can be bounded by $2 \sum_{k=1}^{n} n/k = O(n \log n)$. To see that this holds, we consider the factions separately: let $n_0$ denote the number of agents the first faction starts with (i.e. agents initially in state (x, 0)), and $n_1$ the number at the end. When we are waiting for the first transition of this faction all $n_0$ agents are enabled, so we wait $n/n_0$ steps in expectation until one of them executes a broadcast. For the next one, we wait $n/(n_0 - 1)$ steps. In total, this yields $\sum_{k=n_1+1}^{n_0} n/k \le \sum_{k=1}^{n} n/k$ steps for the first faction, and via the same analysis for the second as well.
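The alternation argument can be checked empirically with a small simulation. The sketch below is reconstructed from the prose description only and rests on assumptions: the global state starts at 0 (rejecting), an x agent may broadcast α only in global state 0, a y agent may broadcast β only in global state 1, and the broadcaster then becomes passive. The names `run_majority` and `harmonic_bound` are ours.

```python
import random

def run_majority(x, y, rng):
    """Simulate the majority BCP sketch; return (output, number of steps)."""
    counts = {"x": x, "y": y, "passive": 0}
    g = 0        # global state = preliminary consensus, initially rejecting
    steps = 0
    # The consensus is stable once the faction that could flip it is exhausted.
    while (g == 0 and counts["x"] > 0) or (g == 1 and counts["y"] > 0):
        steps += 1
        # The scheduler picks one agent uniformly at random.
        state = rng.choices(list(counts), weights=list(counts.values()))[0]
        if state == "x" and g == 0:        # signal alpha: consensus becomes 1
            counts["x"] -= 1
            counts["passive"] += 1
            g = 1
        elif state == "y" and g == 1:      # signal beta: consensus becomes 0
            counts["y"] -= 1
            counts["passive"] += 1
            g = 0
        # every other choice is a silent transition
    return g, steps

def harmonic_bound(n):
    """The bound 2 * sum_{k=1}^{n} n/k from the analysis."""
    return 2 * sum(n / k for k in range(1, n + 1))

rng = random.Random(1)
runs = [run_majority(30, 20, rng) for _ in range(200)]
mean_steps = sum(steps for _, steps in runs) / len(runs)
print(mean_steps, harmonic_bound(50))
```

Every run returns output 1 for x = 30, y = 20, and the empirical mean stays well below the bound, which is not tight because it sums over all k from 1 to n for both factions.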
In contrast to the O(n log n) interactions this protocol takes, constant-state population protocols require $n^2$ interactions in expectation for the computation of majority [4]. However, these numbers are not directly comparable: broadcasts may not be parallelizable, while it is uncontroversial to assume that n rendezvous occur in parallel time 1.

Comparison with other Models
To facilitate the definition of an execution model, we only consider deterministic BCPs, in the sense that for each state there is a unique transition to execute. Blondin, Esparza and Jaax [14] analysed a more general model, i.e. they allow multiple transitions for a single state, picking one of them uniformly at random when an agent in that state sends a broadcast. Additionally, as they consider BCPs as an extension of population protocols, they include rendezvous transitions. We now show that we can simulate both extensions within a constant-factor overhead.

Non-Deterministic Broadcast Protocols
The following construction allows for two broadcast transitions to be executed uniformly at random from a single state. This can easily be extended to any constant number of transitions using the usual construction of a binary tree with rejection sampling. Now assume that we are given a BCP $(Q, \Sigma, \delta_0, I, F)$ with another set of broadcast transitions $\delta_1$ and we want each agent to pick one transition uniformly at random from $\delta_0$ or $\delta_1$ whenever it executes a broadcast.
We implement this using a synthetic coin, i.e. we are utilising randomness provided by the scheduler to enable individual agents to make random choices. This idea has also been used for population protocols [1,3]. Compared to these implementations, broadcasts allow for a simpler approach.
The idea is that we partition the agents into types, so that half of the agents have type 0 and the other half have type 1. Additionally, there is a global coin shared across all agents. To flip the coin, a random agent announces its type (the coin is set to heads if the agent is type 0, tails if it is type 1) and a second random agent executes a broadcast transition from either δ 0 or δ 1 , depending on the state of the global coin that has just been set. These two steps repeat, the former flipping the coin fairly and the latter then executing the actual transitions. Figure 2 sketches this procedure. Intuitively, we start with no agents having either type 0 or 1. When such a typeless agent is picked by the scheduler to announce its type (to flip the global coin) it instead broadcasts that it is searching for a partner. Once this has happened twice, these two agents are matched, one is assigned type 0 and the other type 1. Thus we ensure that there is the exact same number of type 0 and type 1 agents at all times, meaning that we get a perfectly fair coin. Additionally we make progress regardless of whether an agent with or without a type is chosen.
To describe the construction formally, we introduce a set of types $T$, which includes the types $?$, $+$, $-$, $0$ and $1$, together with transitions (seek) and (find). An agent of type ? announces that it seeks a partner (seek), moving itself to type + and the other agents of type ? to type −. Then any type − agent may broadcast that a match has been found (find), moving itself to type 1 and the type + agent to type 0. The other type − agents revert to type ?. This ensures that the number of type 0 and 1 agents is always equal. Note that there may be an odd number of agents, in which case one agent of type + remains.
The following transitions effectively flip the global coin, by having an agent of type 0 or 1 announce that we now execute a broadcast transition from $\delta_0$ or $\delta_1$, respectively. Here, we have $q \in Q$, $\bullet \in \{0, 1\}$. Then we actually execute the transition $\delta_\bullet(q) = (r, f)$, for each $(q, i) \in Q \times T$.
As the number of type 0 and 1 agents is equal, we select transitions from δ 0 and δ 1 uniformly at random. It remains to show that the overhead of this scheme is bounded.
Executing transition (exec 0) or (exec 1) is the goal. Transitions (flip 0) and (flip 1) ensure that the former are executed in the very next step, so they cause at most a constant-factor slowdown. Transitions (seek) and (find) can be executed at most n times, as they decrease the number of agents of type ?. All that remains is the implicit silent transition of states (q, +, j), which occurs with probability at most 1/n in each step.
Hence, to execute m ≥ n steps of the simulated protocol our construction takes at most (2m + 2n) · n/(n − 1) ≤ 8m steps in expectation.
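The pairing mechanism behind the synthetic coin can be illustrated with a short simulation. The sketch is deliberately simplified and makes assumptions beyond the text: it tracks only the type component of each agent, ignores the underlying states Q and the (exec) steps, and stops once no agent of type ? or − is left. The function name `assign_types` is ours.

```python
import random
from collections import Counter

def assign_types(n, rng):
    """Run (seek)/(find) until every agent is typed; return (types, coin flips)."""
    types = ["?"] * n
    flips = Counter()
    while "?" in types or "-" in types:
        i = rng.randrange(n)               # the scheduler picks a random agent
        t = types[i]
        if t == "?":                       # (seek): broadcaster -> +, other ? -> -
            types = ["-" if u == "?" else u for u in types]
            types[i] = "+"
        elif t == "-":                     # (find): broadcaster -> 1, + -> 0, other - -> ?
            types = ["0" if u == "+" else "?" if u == "-" else u for u in types]
            types[i] = "1"
        elif t in ("0", "1"):              # (flip 0) / (flip 1): set the global coin
            flips[t] += 1
        # an agent of type + only has a silent transition
        tally = Counter(types)
        assert tally["0"] == tally["1"]    # the coin stays perfectly fair
    return types, flips

even_types, _ = assign_types(10, random.Random(0))
odd_types, _ = assign_types(11, random.Random(0))
print(Counter(even_types), Counter(odd_types))
```

For an even population all agents end up with type 0 or 1 in equal numbers; for an odd population a single agent of type + is left over, as noted above.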

Population Protocols
Another extension to BCPs is the addition of rendez-vous transitions. Here we are given a map R : Q 2 → Q 2 . At each step, we flip a coin and either execute a broadcast transition as usual, or pick two distinct agents uniformly at random, in state q and r, respectively. These interact and move to the two states R(q, r).
Again, we can simulate this extension with only a constant-factor increase in the expected number of steps. Given a BCP (Q, Σ, B, I, F), the idea is to add states $\{\hat{q} : q \in Q\} \cup \{r_q : r, q \in Q\}$ and insert "activating" transitions $q \mapsto \hat{q}, \{r \mapsto r_q : r \in Q\}$ for $q \in Q$ and "deactivating" transitions $r_q \mapsto s, \{\hat{q} \mapsto t\} \cup \{u_q \mapsto u : u \in Q\}$ for each R(q, r) = (s, t). So a state q first signals that it wants to start a rendez-vous transition. Then, any other state r answers, both executing the transition and signalling to all other states that it has occurred.
Each state in Q has exactly 2 broadcast transitions, so (using the scheme described above) the probability of executing any "activating" transition is exactly 1/2, the same as doing one of the original broadcast transitions in B. After doing an activating transition we may do nothing for a few steps by executing the broadcast transition on $\hat{q}$, but eventually we execute a "deactivating" transition and go back. The probability of executing a broadcast on $\hat{q}$ is 1/n, so simulating a single rendez-vous transition takes 1 + n/(n − 1) ≤ 3 steps in expectation.

Protocols for Presburger Arithmetic
While Blondin, Esparza and Jaax [14] show that BCPs are more expressive than population protocols, they leave the question open whether BCPs provide a runtime speed-up for the class of Presburger predicates computable by population protocols. We already saw that Majority can be computed within O(n log n) interactions in BCPs. This also holds in general for Presburger predicates:
Theorem 1. Every Presburger predicate is computable by a BCP within at most O(n log n) interactions.
We remark that the O(n log n) bound is asymptotically optimal: e.g. the stable consensus for the parity predicate ($x \equiv 1 \pmod{2}$) must alternate with configuration size, which clearly requires every agent to perform at least one broadcast in the computation, and thus yields a lower bound of $\sum_{k=1}^{n} \frac{n}{k} = \Omega(n \log n)$ steps like in the coupon collector's problem [20].
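As a quick numerical illustration of the coupon-collector bound, the following sketch evaluates the sum exactly and compares it with n ln n. It assumes only the uniform scheduler defined earlier; the function names are ours.

```python
import math
import random

def expected_steps_until_all_broadcast(n):
    """sum_{k=1}^{n} n/k: while k agents have not yet broadcast,
    a step picks one of them with probability k/n."""
    return sum(n / k for k in range(1, n + 1))

def simulate_all_broadcast(n, rng):
    """Count scheduler steps until every agent has been picked at least once."""
    seen, steps = set(), 0
    while len(seen) < n:
        seen.add(rng.randrange(n))
        steps += 1
    return steps

for n in (10, 100, 1000):
    print(n, expected_steps_until_all_broadcast(n), n * math.log(n))
```

Since the harmonic number $H_n$ is at least ln n, the exact expectation $n H_n$ never falls below n ln n.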
It is known [22] that every Presburger predicate can be expressed as Boolean combination of linear inequalities and linear congruence equations over the integers, i.e. as Boolean combination of predicates of the form $\sum_i \alpha_i x_i < c$ and $\sum_i \alpha_i x_i \equiv c \pmod{m}$, where the $\alpha_i$, c and m are integer constants. In Section 5.1 we construct BCPs that compute arbitrary linear inequalities, before we sketch the construction for congruences and Boolean combinations in Section 5.2.

Linear Inequalities
Intuitively, in the first component of its state an agent stores its contribution to i α i x i , the left-hand side of the inequality. The global state is used to store a counter value, initially set to 0. Each agent adds its contribution to the counter, as long as it does not overflow. The counter goes from −2A to 2A, which allows it to store the threshold plus any single contribution. The final counter value then determines the outcome of the computation.
Correctness. Let ctr(C) denote the global state (and thus current counter value) of configuration C. Further, let sum(C) denote the sum of all agents' contributions and the current value of the counter. Every initial configuration $C_0$ has $\mathrm{ctr}(C_0) = 0$ and thus $\mathrm{sum}(C_0) = \sum_i \alpha_i x_i$. Each transition α increases the counter by α but sets the agent's contribution to 0 (from α), so sum(C) is constant throughout the execution.
Recall that our output mapping depends only on the value of the counter, so our agents always form a consensus (though not necessarily a stable one). If this consensus and $\varphi(C_0)$ disagree, then, we claim, a non-silent transition is enabled.
To see this, note that the current consensus depends on whether ctr(C) < c. If that is the case, but $\varphi(C_0) = 0$, then sum(C) ≥ c and some agent with positive contribution α > 0 exists. Due to ctr(C) < c, transition α is enabled. Conversely, if ctr(C) ≥ c and $\varphi(C_0) = 1$, some transition α with α < 0 will be enabled.
Finally, note that each non-silent transition increases the number of agents with contribution 0 by one, so at most n can be executed in total. So the execution converges and reaches, by the above argument, a correct consensus.
Convergence time. Each agent executes at most one non-silent transition. To estimate the total number of steps, we partition the agents by their current contribution: for a configuration C let $C^+$ denote the restriction of C to the states $\{(q, v) \in Q : q > 0\}$, i.e. the agents with positive contribution, and define $C^-$ analogously. We have that either ctr(C) < 0 and all transitions of agents in $C^+$ would be enabled, or ctr(C) ≥ 0 and the transitions of $C^-$ could be executed.
If $C^+$ is enabled, then we have to wait at most $n/|C^+|$ steps in expectation until a transition is executed, which reduces $|C^+|$ by one. In total we get $n/|C^+_0| + n/(|C^+_0| - 1) + \dots + n/1 \in O(n \log n)$. The same holds for $C^-$, yielding our overall bound of O(n log n).
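The counter-based protocol for a linear inequality can be sketched as follows. The sketch is an assumption-laden reading of the correctness argument, not the construction itself: we assume that an agent with positive contribution is enabled when ctr < c, that an agent with negative contribution is enabled when ctr ≥ c, and that A is the largest absolute value among the coefficients and c. The name `run_threshold` is ours.

```python
import random

def run_threshold(coeffs, inputs, c, rng):
    """Simulate the sketch for sum_i coeffs[i] * inputs[i] < c.
    Returns (output, number of scheduler steps)."""
    # One agent per input unit; each agent stores its coefficient as contribution.
    contrib = [a for a, x in zip(coeffs, inputs) for _ in range(x)]
    n = len(contrib)
    A = max([abs(a) for a in coeffs] + [abs(c)])   # assumed meaning of A
    ctr, steps = 0, 0                              # global counter, initially 0

    def enabled(a):
        # Assumed enabling rule: only push the counter towards flipping the consensus.
        return (a > 0 and ctr < c) or (a < 0 and ctr >= c)

    while any(enabled(a) for a in contrib):
        steps += 1
        i = rng.randrange(n)                       # uniformly random agent
        a = contrib[i]
        if enabled(a):
            ctr += a                               # add the contribution to the counter
            contrib[i] = 0                         # the agent becomes inactive
            assert -2 * A <= ctr <= 2 * A          # the counter never overflows
    return int(ctr < c), steps

output, steps = run_threshold((2, -3), (4, 2), 1, random.Random(0))
print(output, steps)   # output is 0 because 2*4 - 3*2 = 2 is not below 1
```

The loop ends exactly when no non-silent transition is enabled, at which point the counter alone decides the predicate, mirroring the argument above.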
Proof (sketch). The idea is the same as for Proposition 1, but instead of taking care not to overflow the counter we simply perform the additions modulo l.
Proposition 3 (Boolean combination of predicates). Let ϕ be a Boolean combination of predicates $\varphi_1, \dots, \varphi_k$, which are computed by BCPs $P_1, \dots, P_k$, respectively, within O(n log n) interactions. Then there is a protocol computing ϕ within O(n log n) interactions.
Proof (sketch). We do a simple parallel composition of the k BCPs, which is the same construction as used for ordinary population protocols (see for example [5,Lemma 6]). A detailed proof can be found in the full version of this paper.

Protocols for all Predicates in ZPL
BCPs compute precisely the predicates in NL with input encoded in unary, which corresponds to NSPACE(n) when encoded in binary. The proof of the NL lower bound by Blondin, Esparza and Jaax [14] goes through multiple stages of reduction and thus does not reveal which predicates can be computed efficiently.
We will now take a more direct approach, using a construction similar to the one by Angluin, Aspnes and Eisenstat [7]. A step of a randomised Turing machine (RTM) can be simulated using variants of the protocols for Presburger predicates from Section 5, which we combine with a clock to determine whether the step has finished, with high probability.
Instead of simulating RTMs directly, it is more convenient to first reduce them to counter machines. Here, we will use counter machines that are both randomised and capable of multiplying and dividing by two, with the latter also determining the remainder. This ensures that the reduction is performed efficiently, i.e. with overhead of O(n log n) interactions per step.
We first show the other direction: simulating BCPs with RTMs. The proof of Theorem 2 will take up the remainder of this section.
Counter machines. Let $\mathrm{Cmd} \overset{\text{def}}{=} \{\mathrm{mul}_2, \mathrm{inc}, \mathrm{divmod}_2, \mathrm{iszero}\}$ denote a set of commands, and $\mathrm{Ret} \overset{\text{def}}{=} \{\mathrm{done}_0, \mathrm{done}_1\}$ a set of completion statuses. A multiplicative counter machine with k counters (k-CM) $A = (S, T_1, T_2)$ consists of a finite set of states S with init, 0, 1 ∈ S and two transition functions $T_1, T_2$ mapping a state q ∈ S to a tuple $(i, j, q_0, q_1)$ where i ∈ {1, ..., k} refers to a counter, j ∈ Cmd is a command, and $q_0, q_1 \in S$ are successor states ($q_1$ is not used for $\mathrm{mul}_2$ and inc operations). Additionally, we require that $T_1, T_2$ map q ∈ {0, 1} to (1, iszero, q, q), effectively executing no operation from those states.
The idea is that A, starting in state init, picks transitions uniformly at random from either $T_1$ or $T_2$. Apart from this randomness, the transitions are deterministic. Eventually, A ends up in either state 0 or 1, at which point it cannot perform further actions, thereby indicating whether the input is accepted or rejected.
Step-execution function. A CM-configuration is a tuple $K = (q, x_1, \dots, x_k) \in Q \times \mathbb{N}^k$. We define the step-execution function step, which maps a command in Cmd and a counter value $x \in \mathbb{N}$ to a completion status in Ret and a new counter value. [Display omitted: definition of step.] For two CM-configurations $K = (q, x_1, \dots, x_k)$ and $K' = (q', x'_1, \dots, x'_k)$ and $\bullet \in \{1, 2\}$ we write $K \xrightarrow{\bullet} K'$ if $T_\bullet(q) = (i, j, q_0, q_1)$, $\mathrm{step}(j, x_i) = (\mathrm{done}_b, x'_i)$, $q' = q_b$, and $x'_r = x_r$ for $r \neq i$. Note that for each K and • there is exactly one $K'$ with $K \xrightarrow{\bullet} K'$. The reasoning for introducing the step-execution function is that we want to construct a broadcast protocol (BP) which simulates just one step of the CM. Later on we can use this BP as a building block in a more general protocol.
Computation. Let $\varphi : \mathbb{N}^l \to \{0, 1\}$ denote a predicate, for l ≤ k, and $C \in \mathbb{N}^l$ an input to ϕ. We sample a random (CM-)execution $\pi = K_0 K_1 K_2 \dots$ for input C, where $K_0, \dots$ are CM-configurations, via a Markov chain. For the initial configuration we have $K_0 \overset{\text{def}}{=} (\mathrm{init}, C(1), \dots, C(l), 0, \dots, 0)$, and $K_i$ is determined as the unique configuration with $K_{i-1} \xrightarrow{\bullet} K_i$, where $\bullet \in \{1, 2\}$ is chosen uniformly at random. (So π is the random variable defined as trace of the Markov chain.) We say that A computes ϕ within f(n) steps if for each $C \in \mathbb{N}^l$ with |C| = n the random execution for input C reaches a configuration in $\{\varphi(C)\} \times \mathbb{N}^k$ after at most f(n) steps in expectation. Finally, A is n-bounded if the random executions for inputs C with |C| = n can only reach configurations in $Q \times \mathbb{N}^k_{\le n}$.
Theorem 3. Let ϕ be a predicate decidable by a log-space bounded RTM within O(f(n)) steps in expectation with unary input encoding. There exists an n-bounded CM that accepts ϕ within O(f(n) log(n)) steps in expectation.
Proof (sketch). This can be shown by first representing the Turing machine by a stack machine with two stacks that contain the tape content to the left/right of the current machine head position. In this representation, head movements and tape updates amount to performing pop/push operations on the stack. Moreover, we can simulate a c · n-bounded stack by c many n-bounded stacks. An n-bounded stack, in turn, can be represented in a counter machine with a constant number of $2^n$-bounded counters. The stack content is represented as the base-2 number corresponding to the binary sequence stored in the stack. Popping then amounts to a $\mathrm{divmod}_2$ operation, and pushing amounts to doubling the counter value, followed by adding 1 or 0, respectively.
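The stack encoding can be made concrete with a small sketch. It uses a single counter and Python integers in place of bounded counters. The leading sentinel bit 1, which marks the stack bottom so that pushed 0 bits are not lost, is an assumption of this sketch and not taken from the proof; the class name `CounterStack` is ours.

```python
class CounterStack:
    """A stack of bits stored in one natural-number counter."""

    def __init__(self):
        # Assumption of this sketch: a sentinel 1 marks the bottom of the stack.
        self.counter = 1

    def is_empty(self):
        return self.counter == 1

    def push(self, bit):
        self.counter = 2 * self.counter                  # mul_2
        if bit:
            self.counter += 1                            # inc (adding 0 is a no-op)

    def pop(self):
        if self.is_empty():
            raise IndexError("pop from empty stack")
        self.counter, bit = divmod(self.counter, 2)      # divmod_2
        return bit

stack = CounterStack()
for b in (1, 0, 0):
    stack.push(b)
print(stack.counter)                                     # 12, i.e. binary 1100
print([stack.pop() for _ in range(3)])                   # [0, 0, 1]
```

Each push or pop uses a constant number of counter commands, which is why the reduction adds only a constant number of CM steps per stack operation.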
A detailed proof can be found in the full version of this paper.
We formally define two types of BPs, ones that simulate a step of the CM, and ones behaving like a clock. Definition 1. Let BP P = (Q × G, δ) denote a BP with global states G where 0, 1, ⊥∈ Q and Cmd, Ret ⊆ G. We define the injection ϕ : G × N ≤n → N Q×G as ϕ(j, x) def = x · (1, j) + (n − x) · (0, j) . The configurations in ϕ(Cmd × N) are called initial, the ones in ϕ(Ret × N) final. We call a configuration C failing, if C(⊥, i) > 0 for some i ∈ G.
We say that $P$ is CM-simulating if the sets of final and failing configurations are closed under reachability, and from every initial configuration $\varphi(j, w)$ the only reachable final configuration is $\varphi(\mathrm{step}(j, w))$, if both are well-defined.
Definition 2. Let $P = (Q, \delta)$ denote a BP with $0, 1 \in Q$, and let $\mathrm{Time}(P)$ denote the number of steps until $P$, starting in the configuration with all agents in state 0, reaches the configuration with all agents in state 1, or $\infty$ if it does not. If $\mathrm{Time}(P)$ is almost surely finite and no agent is in state 1 before $\mathrm{Time}(P)$, then we call $P$ a clock-BP.
Now we begin by constructing a CM-simulating BP. The value of a given counter is scattered across the population: each agent stores its contribution to this counter value in its state. The counter value is the sum of all contributions. Usually, an agent's contribution is either 1 or 0, so n agents can store a counter value of at most n. This is not problematic, since the counter machine is assumed to be n-bounded. The difficult part is multiplying and dividing the counter by two. Besides contributions 0 and 1, we also allow intermediate contributions $\tfrac{1}{2}$ and 2. By executing a single broadcast, we can multiply (or divide) all the individual contributions by 2, by setting all contributions of value 1 to 2 (or $\tfrac{1}{2}$, respectively). Then, over time, we "normalise" the agents to all have contribution 0 or 1 again, in a manner which is specified below. This process takes some time, and we cannot determine with perfect reliability whether it is finished, so we only bound the time with high probability. Here and in the following, we say that some event (dependent on the population size n) happens with high probability if for all $k > 0$ the event happens with probability $1 - O(n^{-k})$.
In this and subsequent lemmata we use $G(p)$, for $0 < p < 1$, to denote the geometric distribution, that is the number of trials until a coin flip with probability $p$ succeeds, which has expectation $1/p$. We start with a statement about the tail distributions of sums of geometric variables.
Lemma 2. Let $n \ge 3$ and $X_1, \ldots, X_n$ denote independent random variables with sum $X$ and $X_i \sim G(i/n)$. Then for any $k \ge 1$ there is an $l$ s.t. $P(X \ge l \cdot n \ln n) \le n^{-k}$.
Proof. See the full version of this paper.
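To get a feeling for the quantities in Lemma 2, the following Python sketch computes the exact expectation of X, which is $n H_n$ with $H_n$ the n-th harmonic number, and estimates the tail probability empirically. It is only an illustration: all function names are hypothetical, and the sampling gives no proof of the bound.

```python
import math
import random
from fractions import Fraction


def expected_sum(n: int) -> Fraction:
    """Exact E[X] = sum_{i=1}^{n} n/i for independent X_i ~ G(i/n)."""
    return sum((Fraction(n, i) for i in range(1, n + 1)), Fraction(0))


def sample_geometric(p: float, rng: random.Random) -> int:
    """Number of coin flips with success probability p up to the first success."""
    trials = 1
    while rng.random() >= p:
        trials += 1
    return trials


def sample_sum(n: int, rng: random.Random) -> int:
    """One sample of X = X_1 + ... + X_n."""
    return sum(sample_geometric(i / n, rng) for i in range(1, n + 1))


def empirical_tail(n: int, l: float, runs: int, seed: int = 0) -> float:
    """Fraction of sampled runs with X >= l * n * ln(n)."""
    rng = random.Random(seed)
    threshold = l * n * math.log(n)
    hits = sum(sample_sum(n, rng) >= threshold for _ in range(runs))
    return hits / runs
```

For instance, `expected_sum(3)` evaluates to 3/1 + 3/2 + 3/3 = 11/2, and `empirical_tail(n, l, runs)` can be used to observe how quickly the tail shrinks as `l` grows.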
Additionally, we need transitions that move agents back into states 0 and 1.
This requires some explanation. Basically, we have the invariant that for a configuration $C$ the current value of the counter is $b + \sum_{i \in Q, j \in G} i \cdot C((i, j))$, where $b$ is 1 if the global state is high and 0 otherwise. There is a "canonical" representation of each counter value, where $b = 0$ and the individual contributions $i \in Q$ are only 0 and 1. The transitions ($\alpha_1$-$\alpha_3$) update the represented counter value in a single step, but cause a "noncanonical" representation. The transitions ($\beta_1$-$\beta_4$) preserve the value of the counter and cause the representation to eventually become canonical. This corresponds to final configurations from Definition 1: as long as the representation is noncanonical, i.e. an agent with value $\tfrac{1}{2}$, 2 or $*$ exists, the configuration is not final. Conversely, once we reach a final configuration our representation is canonical, and, as the value of the counter is preserved, we reach the correct final configuration.
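The interplay between the value invariant and normalisation can be sketched in Python for the doubling case. The two normalisation rules below are hypothetical stand-ins chosen only so that the invariant holds; they are not claimed to be the paper's transitions, and halving with contributions of $\tfrac{1}{2}$ is left out.

```python
import random
from typing import List, Tuple


def value(agents: List[int], high: bool) -> int:
    """Represented counter value: b plus the sum of all contributions."""
    return int(high) + sum(agents)


def double(agents: List[int]) -> List[int]:
    """A single broadcast: every contribution 1 becomes 2 (noncanonical)."""
    return [2 if a == 1 else a for a in agents]


def is_canonical(agents: List[int], high: bool) -> bool:
    """Canonical means b = 0 and all contributions are 0 or 1."""
    return not high and all(a in (0, 1) for a in agents)


def normalise_step(agents: List[int], high: bool, rng: random.Random) -> bool:
    """Select a uniformly random agent and apply a hypothetical rule.

    Updates agents in place and returns the new value of the global flag.
    """
    i = rng.randrange(len(agents))
    if agents[i] == 2 and not high:
        agents[i] = 1   # move one unit into the global state
        return True
    if agents[i] == 0 and high:
        agents[i] = 1   # take the unit back out of the global state
        return False
    return high         # silent step, nothing changes


def run_doubling(x: int, n: int, seed: int = 0) -> Tuple[List[int], int]:
    """Double a canonical counter value x (with 2x <= n) and normalise."""
    if 2 * x > n:
        raise ValueError("result would exceed the n-bounded counter")
    rng = random.Random(seed)
    agents = double([1] * x + [0] * (n - x))
    high, steps = False, 0
    while not is_canonical(agents, high):
        before = value(agents, high)
        high = normalise_step(agents, high, rng)
        assert value(agents, high) == before  # the counter value is preserved
        steps += 1
    return agents, steps
```

Calling `run_doubling(3, 10)` returns a canonical list of contributions summing to 6, together with the number of steps (silent ones included) that the random scheduler needed.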
For iszero we do something similar, but the value of the counter does not change.
If the initial transition is executed by an agent with value 1, we can go to the global state $\mathrm{done}_1$ directly. Otherwise, we replace 1 by $*$ and go to $\mathrm{done}_0$, so if no agents with value 1 exist, we are finished. Otherwise, some agent with value $*$ executes ($\beta_5$) and we move to the correct final configuration.
Final configurations can only contain states {0, 1} × Ret. As we have no outgoing transitions from those states, they are indeed closed under reachability.
It remains to be shown that starting from a configuration $C_0$ we reach a final configuration within $O(n \log n)$ steps with high probability. Note that transitions ($\alpha_1$-$\alpha_5$) are executed at most once. Moreover, these are the only transitions enabled at $C_0$, so let $C_1$ denote the successor configuration after executing one of ($\alpha_1$-$\alpha_5$), i.e. $C_0 \to C_1$. From now on, we consider only transitions ($\beta_1$-$\beta_5$).
Let $M \stackrel{\text{def}}{=} \{\tfrac{1}{2}, 2, *\} \times G$ denote the set of "noncanonical" states, and, for a configuration $C$, let $\Phi(C) \stackrel{\text{def}}{=} 2 \sum_{q \in M} C(q) + b$ denote a potential function, with $b$ being 1 if the global state of $C$ is high and 0 otherwise. Now we can observe that executing a ($\beta_1$-$\beta_5$) transition strictly decreases $\Phi$, and that $0 \le \Phi(C) \le 2n$ for any configuration $C$. So after at most $2n$ non-silent transitions, we have reached a final configuration.
Fix some transition ($\beta_j$), let $q \in Q \times G$ denote the state initiating ($\beta_j$), and let $C, C', C''$ denote configurations with $C \xrightarrow{\beta_j} C' \xrightarrow{*} C''$, meaning that $C''$ is a configuration reachable from $C$ after executing ($\beta_j$). Then, we claim, $C(q) > C''(q)$.
To see that this holds for transitions ($\beta_2$-$\beta_5$), note that for $i \in \{\tfrac{1}{2}, 2, *\}$ the number of agents with value $i$ can only decrease when executing transitions ($\beta_1$-$\beta_5$). For ($\beta_1$) this is slightly more complicated, as ($\beta_3$) increases the number of agents with value 0. However, ($\beta_1$) is reachable only after ($\alpha_1$) or ($\alpha_3$) has been executed, while ($\beta_3$) requires ($\alpha_2$). Thus, our claim follows. Let $X_k$ denote the number of silent transitions before executing ($\beta_j$) for the k-th time, $k = 1, \ldots, l$, and let $r_k$ denote the number of agents in state $q$ at that time. Then $n \ge r_1 > r_2 > \ldots > r_l \ge 1$ and $X_k$ is distributed according to $G(r_k/n)$. So we can use Lemma 2 to show that the sum of the $X_k$ is $O(n \log n)$ with high probability. There are only 5 transitions ($\beta_j$), so the same holds for the total number of steps until reaching a final state.
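The expectation behind this step can be written out explicitly. The following is only a sketch of the expected-value part, conditioned on the values $r_1 > \ldots > r_l$; the high-probability statement is the one that needs Lemma 2. Since the $r_k$ are distinct integers in $\{1, \ldots, n\}$ and $H_n$ denotes the n-th harmonic number:

```latex
\mathbb{E}\Bigl[\,\sum_{k=1}^{l} X_k\Bigr]
  \;=\; \sum_{k=1}^{l} \frac{n}{r_k}
  \;\le\; \sum_{i=1}^{n} \frac{n}{i}
  \;=\; n H_n
  \;\le\; n(\ln n + 1)
```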
Our next construction is the clock-BP, which indicates that some amount of time has passed (with high probability). Angluin, Aspnes and Eisenstat used epidemics for this purpose [7], as do we. The idea is that one agent initiates an epidemic and waits until it sees an infected agent. Similar to the standard analysis of the coupon collector's problem, this is likely to take $\Theta(n \log n)$ time.
Proof (sketch). For a clock we use states $\{0, 1, c_1, c_2, c_3, c_1^+, c_2^+\}$ and transitions
State 0 is the initial state, 1 the final state. States $c_1$ and $c_2$ denote "uninfected" agents, state $c_3$ "infected" ones. The former can become activated (moving to $c_1^+$ and $c_2^+$), causing one of them to become infected. Transition ($\alpha$) marks a leader $c_1$; once the leader is infected, the clock ends (via ($\omega$)). In ($\beta$), a single activated agent becomes infected, deactivating the other agents. They get activated again via transition ($\gamma$). The state diagram is shown in Figure 3.
It remains to show that this protocol fulfils the stated time bounds. We prove $E(\mathrm{Time}(P)) \in O(n \log n)$ by using that, in expectation, the protocol spends at most $n/j$ steps in state $j$ and at most $n/(n - j)$ in state $j^+$. For the lower bound we make a case distinction: either state $\sqrt{n}$ is not visited (i.e. the leader is one of the first $\sqrt{n}$ agents to be infected), or the total number of steps is at least $X_1 + \ldots + X_{\sqrt{n}}$, where $X_j$ is the number of steps the protocol spends in state $j$. As $X_j$ is geometrically distributed with mean $n/j$, we apply a tail bound from Janson [23] to get the desired result.
A detailed proof can be found in the full version of the paper.
While the above clock measures some interval of time with some reliability, we want a clock that measures an "arbitrarily long" interval with "arbitrarily high" reliability. Constructions for population protocols use phase clocks for this purpose, but broadcasts allow us to synchronise the agents, so we can directly execute the clock multiple times in sequence instead.
Lemma 5. Let $k \in \mathbb{N}$ denote some constant. Then there is a clock-BP $P$ s.t. $E(\mathrm{Time}(P)) \in O(n \log n)$, and $\mathrm{Time}(P) < k n \log n$ with probability $O(n^{-k})$.
Proof (sketch). The idea is that we run $28k^2$ clocks in sequence, in groups of $2k$. Then it is likely that at least one clock in each group works, yielding the overall minimum running time. A detailed proof can be found in the full version of this paper.
As mentioned earlier, we combine the clock with the construction in Lemma 3. While we cannot reliably determine whether the operation has finished, we can use a clock to measure an interval of time long enough for the protocol to terminate with high probability. The next construction does just that. In particular, in contrast to Lemma 3, it uses its global state to indicate that it is done. Lemma 6. There is a CM-simulating BP s.t. starting from an initial configuration it reaches either a final or a failing configuration C almost surely and within O(n log n) steps in expectation, and C is final with high probability. Additionally, all reachable configurations with global state in Ret are final or failing.
Proof. Fix some $k \in \mathbb{N}$ and let $P = (Q \times G, \delta)$ denote the BP we want to construct. Further, let $P_1 = (Q_1 \times G_1, \delta_1)$ denote the BP from Lemma 3 and choose some $c$ s.t. $P_1$ reaches a final configuration after at most $c n \log n$ steps with probability at least $1 - n^{-k}$. Now we use Lemma 5 to get a clock $P_2 = (Q_2, \delta_2)$ that runs for at least $c n \log n$ steps with probability at least $1 - n^{-k}$.
We do a parallel composition of $P_1$ and $P_2$ to get $P$. In particular, $Q \stackrel{\text{def}}{=} Q_1 \times Q_2$ and $G \stackrel{\text{def}}{=} \{j^\bullet : j \in G_1\} \cup \mathrm{Ret}$, where for $Q$ we identify $(i, 0)$ with $i$ for $i \in \{0, 1, \bot\}$, and for $G$ we identify $j$ with $j^\bullet$ for $j \in \mathrm{Cmd}$.
Intuitively, we use $\bullet$ to rename the global states of $P_1$, meaning that the global state $j \in G_1$ of $P_1$ is now called $j^\bullet$ in our protocol. We want $P_1$ to start with the same initial state we have, which is why we identified $j$ with $j^\bullet$ for $j \in \mathrm{Cmd}$. However, we only want to enter a final configuration once the clock has run out, so the completion statuses of $P_1$ are renamed into $j^\bullet$ for $j \in \mathrm{Ret}$, and we enter a final configuration by setting the global state to a $j \in \mathrm{Ret}$.
For each $(q_1, j) \in Q_1 \times G_1$ and $q_2 \in Q_2$ with $\delta_1(q_1, j) = ((r_1, j'), f_1)$ and $\delta_2(q_2) = (r_2, f_2)$ we get the transition
$(q_1, q_2, j^\bullet) \to (r_1, r_2, (j')^\bullet), \; \{(t_1, t_2) \to (f_1(t_1), f_2(t_2)) : t_1 \in Q_1, t_2 \in Q_2\}$ ($\alpha$)
These transitions, together with the way we identified states, ensure that $P_1$ and $P_2$ run normally, with the input being passed through to $P_1$ transparently. However, note that the final configurations of $P_1$ are not final for $P$, meaning that the protocol never ends. Hence, for $q_1 \in Q_1$, $j \in \mathrm{Ret}$ we add the transition
This terminates the protocol once the clock has run out. If $P_1$ was in a final state, we will now enter a final state as well; otherwise we move into a failing state.
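The product transitions of type (α) can be sketched in Python as follows. The dictionary-based representation of the transition functions, the `running` tag standing in for the renaming of global states, and all names are assumptions made for illustration. The additional terminating transitions are not modelled.

```python
from typing import Callable, Dict, Hashable, Tuple

State = Hashable
Response = Callable[[State], State]
# delta1 maps (local state, global state) to ((new local, new global), response)
Delta1 = Dict[Tuple[State, State], Tuple[Tuple[State, State], Response]]
# delta2 maps a clock state to (new clock state, response)
Delta2 = Dict[State, Tuple[State, Response]]


def running(j: State) -> Tuple[str, State]:
    """Hypothetical encoding of a renamed global state of P1."""
    return ("running", j)


def compose(delta1: Delta1, delta2: Delta2) -> dict:
    """Build the product transitions: both components step simultaneously."""
    delta = {}
    for (q1, j), ((r1, j_new), f1) in delta1.items():
        for q2, (r2, f2) in delta2.items():
            # Bind f1 and f2 as defaults so each closure keeps its own pair.
            def response(t, f1=f1, f2=f2):
                t1, t2 = t
                return (f1(t1), f2(t2))

            delta[(q1, q2, running(j))] = ((r1, r2, running(j_new)), response)
    return delta
```

The receiving agents apply the response functions of both components independently, which mirrors the set $\{(t_1, t_2) \to (f_1(t_1), f_2(t_2))\}$ in the transition above.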
Finally, we use the above BP to simulate the full l-CM.
Proof (sketch). For each counter we need $n$ agents, so $ln$ in total, but we can simply have each agent simulate a constant number of agents. To execute a step of the CM, we use the BP from Lemma 6. It succeeds only with high probability, but in the case of failure at least one agent will have local state $\bot$, from which that agent initiates a restart of the whole computation. As the CM takes only a polynomial number of steps, we can fix a $k$ s.t. a computation of our BCP without failures (i.e. one that succeeds on the first try) takes $O(n^k)$ steps. A single step succeeds with high probability, so we can require it to fail with probability at most $O(n^{-k-1})$. In total, the restarts increase the running time by a factor of $1/(1 - O(n^{-1}))$, which is only a constant overhead.
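The overhead factor can be derived in two short steps. This is a sketch under the simplifying assumption that every attempt fails independently with the same probability $p$. A union bound over the steps of one attempt bounds $p$, and the number of attempts is then geometrically distributed:

```latex
p \;\le\; O(n^{k}) \cdot O(n^{-k-1}) \;=\; O(n^{-1}),
\qquad
\mathbb{E}[\text{number of attempts}]
  \;=\; \sum_{i \ge 0} p^{i}
  \;=\; \frac{1}{1-p}
  \;=\; 1 + O(n^{-1})
```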
A detailed proof can be found in the full version of this paper.