Near-Optimal Self-Stabilising Counting and Firing Squads

Consider a fully-connected synchronous distributed system consisting of $n$ nodes, where up to $f$ nodes may be faulty and every node starts in an arbitrary initial state. In the synchronous $C$-counting problem, all nodes need to eventually agree on a counter that is increased by one modulo $C$ in each round for a given $C>1$. In the self-stabilising firing squad problem, the task is to eventually guarantee that all non-faulty nodes have simultaneous responses to external inputs: if a subset of the correct nodes receive an external "go" signal as input, then all correct nodes should agree on a round (in the not-too-distant future) in which to jointly output a "fire" signal. Moreover, no node should generate a "fire" signal without some correct node having previously received a "go" signal as input. We present a framework reducing both tasks to binary consensus at very small cost. For example, we obtain a deterministic algorithm for self-stabilising Byzantine firing squads with optimal resilience $f<n/3$, asymptotically optimal stabilisation and response time $O(f)$, and message size $O(\log f)$. As our framework does not restrict the type of consensus routines used, we also obtain efficient randomised solutions, and it is straightforward to adapt our framework for other types of permanent faults.


Introduction
The design of distributed systems faces several unique issues related to redundancy and fault-tolerance, timing and synchrony, and the efficient use of communication as a resource [30]. In this work, we give near-optimal solutions to two fundamental distributed synchronisation and coordination tasks: the synchronous counting and the firing squad problems. For both tasks, we devise fast self-stabilising algorithms [17] that are not only communication-efficient, but also tolerate the optimal number of permanently faulty nodes. That is, our algorithms efficiently recover from transient failures that may arbitrarily corrupt the state of the distributed system and permanently damage a large number of the nodes.

Synchronous counting and firing squads
We assume a synchronous message-passing model of distributed computation. The distributed system consists of a fully-connected network of n nodes, where up to f of the nodes may be faulty and the initial state of the system is arbitrary. To model the behaviour of faulty nodes, we consider three typical classes of permanent faults:
• crash (the faulty node stops sending information),
• omission (some or all of the messages sent by the faulty node are lost), and
• Byzantine faults (the faulty node exhibits arbitrary misbehaviour).
Note that even though the communication proceeds in a synchronous fashion, the nodes may have different notions of current time due to the arbitrary initial states. However, many typical distributed protocols assume that the system has either been properly initialised or that the nodes should collectively agree on the rounds in which to perform certain actions. Thus, we are essentially faced with the task of having to agree on a common time in a manner that is both self-stabilising and tolerates permanently faulty behaviour from some of the nodes. To address this issue, we study the synchronous counting and firing squad problems, which are among the most fundamental challenges in fault-tolerant distributed systems.
In the synchronous counting problem, all the nodes receive well-separated synchronous clock pulses that designate the start of a new round. The received clock pulses are anonymous, and hence, all correct nodes should eventually stabilise and agree on a round counter that increases consistently by one modulo C. The problem is also known as digital clock synchronisation, as all non-faulty nodes essentially have to agree on a shared logical clock. A stabilising execution of such a protocol for n = 4, f = 1, and C = 3 is given below:
[Figure: a stabilising execution of a counting protocol with n = 4, f = 1, and C = 3.]
In the self-stabilising firing squad problem, the task is to have all correct nodes eventually stabilise and respond to an external input simultaneously. That is, once stabilised, when a sufficiently large (depending on the type of permanent faults) subset of the correct nodes receive an external "go" signal, then all correct nodes should eventually generate a local "fire" event on the same round. The time taken to react to the "go" signal is called the response time. Note that before stabilisation the nodes may generate spurious firing signals, but after stabilisation no correct node should generate a "fire" event without some correct node having previously received a "go" signal as input. An execution of such a protocol with n = 4, f = 1, and response time R = 5 is illustrated below:
[Figure: an execution of a firing squad protocol with n = 4, f = 1, and response time R = 5.]
A firing squad protocol can be used, for example, to agree in a self-stabilising manner on when to initiate a new instance of a non-self-stabilising distributed protocol, in response to internal or external "go" inputs.

Connections to fault-tolerant consensus
Reaching agreement is perhaps the most intrinsic problem in fault-tolerant distributed computing. It is known that both the synchronous counting [15] and the self-stabilising firing squad problem [14] are closely connected to the well-studied consensus problem [25,32], where each node is given an input bit and the task is to agree on a common output bit such that if every non-faulty node received the same value as input, then this value must also be the output value. Indeed, the connection is obvious on an intuitive level, as in each task the goal is to agree on a common decision (that is, the output bit, clock value, or whether to generate a firing event).
However, the key difference between the problems lies in self-stabilisation. Typically, the consensus problem is considered in a non-self-stabilising setting with only permanent faults (e.g. f < n/3 nodes with arbitrary behaviour), whereas synchronous counting copes with both transient and permanent faults. In fact, it is easy to see that synchronous counting is trivial in a non-self-stabilising setting: if all nodes are initialised with the same clock value, then they can simply locally increment their counters each round without any communication. Furthermore, in a properly initialised system, one can reduce the firing squad problem to repeatedly calling a consensus routine [6].
Interestingly, imposing the requirement of self-stabilisation (convergence to correct system behaviour from arbitrary initial states) reverses the roles. Solving either the synchronous counting or firing squad problem in a self-stabilising manner also yields a solution to binary consensus, but the converse is not true. In fact, in order to internally or externally trigger a consistent execution of a consensus protocol (or any other non-self-stabilising protocol, for that matter), one first needs a self-stabilising synchronous counting or firing squad algorithm, respectively! In light of this, the self-stabilising variants of both problems are important generalisations of consensus. While considerable research has been dedicated to both tasks [3, 12, 14-16, 18, 26, 28], our understanding is significantly less developed than for the extensively studied consensus problem. Moreover, it is worth noting that all existing algorithms utilise consensus subroutines [12,26,28] or shared coins [3], the latter of which essentially solves consensus as well. Given that both tasks are at least as hard as consensus [15], this seems to be a natural approach. However, it raises the question of how much overhead must be incurred by such a reduction. In this paper, we subsume and improve upon previous results by providing a generic reduction of synchronous counting and self-stabilising firing squad to binary consensus that incurs very small overheads.

Contributions
We develop a framework for efficiently transforming non-self-stabilising consensus algorithms into self-stabilising algorithms for synchronous counting and firing squad problems. In particular, the resulting self-stabilising algorithms have the same resilience as the original consensus algorithms, that is, the resulting algorithms tolerate the same number and type of permanent faults as the original consensus algorithm (e.g. crash, omission, or Byzantine faults).
The construction we give incurs a small overhead compared to the time and bit complexity of the consensus routines: the stabilisation time and message size are, up to constant factors, given as the sum of the cost of the consensus routine for f faults and recursively applying our scheme to f' < f/2 faults. Finally, our construction can be used in conjunction with both deterministic and randomised consensus algorithms. Consequently, we also obtain algorithms for probabilistic variants of the synchronous counting and firing squad problems.
Our novel framework enables us to address several open problems related to self-stabilising firing squads and synchronous counting. We now give a brief summary of the open problems we solve and the new results obtained using our framework.
Self-stabilising firing squads. In the case of self-stabilising firing squads, Dolev et al. [14] posed the following two open problems:
1. Are there solutions that tolerate either omission or Byzantine (i.e., arbitrary) faults?
2. Are there algorithms using o(n)-bit messages only?
We answer both questions in the affirmative by giving algorithms that achieve both properties simultaneously. Concretely, our framework implies a deterministic solution for the self-stabilising Byzantine firing squad problem that
• tolerates the optimal number of f < n/3 Byzantine faulty nodes,
• uses messages of O(log f) bits, and
• is guaranteed to stabilise and respond to inputs in linear-in-f communication rounds.
Thus, compared to prior state-of-the-art solutions [14], our algorithm tolerates a much stronger form of faulty behaviour and uses exponentially smaller messages, yet retains asymptotically optimal stabilisation and response time. We also obtain algorithms that tolerate f < n/2 omission failures and f < n crash failures while retaining a small message size of O(log f) bits.
Synchronous counting. We attain novel algorithms for synchronous counting, which is also known as self-stabilising Byzantine fault-tolerant digital clock synchronisation [3,18,23]. Our new algorithms resolve questions left open by our own prior work [28], namely, whether there exist
1. deterministic linear-time algorithms with optimal resilience and message size o(log^2 f), or
2. randomised sublinear-time algorithms with small bit complexity.
Again, we answer both questions positively using the framework developed in this paper. For the first question, we give linear-time deterministic algorithms that have message size O(log f) bits. For the second question, we show that our framework can utilise efficient randomised consensus algorithms to obtain probabilistic variants of the synchronous counting and firing squad problems. For example, the result of King and Saia [24] implies algorithms that stabilise with high probability in polylog n rounds and use message size polylog n, assuming private communication links and an adaptive Byzantine adversary corrupting f < n/(3 + ε) nodes for an arbitrarily small constant ε > 0.

Related work
In this section, we overview prior work on the synchronous counting and firing squad problems. By now it has been established that both problems [14,15] are closely connected to the well-studied (non-self-stabilising) consensus [25,32]. As there exists a vast body of literature on synchronous consensus, we refer the interested reader to e.g. the survey by Raynal [33]. We note that self-stabilising variants of consensus have been studied [2,9,10,19] but in different models of computation and/or for different types of failures than what we consider in this work.
Synchronous counting and digital clock synchronisation. In the past two decades, there has been increased interest in combining self-stabilisation with Byzantine fault-tolerance. One reason is that algorithms in this fault model are very attractive in terms of designing highly resilient hardware [15]. A substantial amount of work on synchronous counting has been carried out [3,12,16,18,23,29], comprising both positive and negative results.
In terms of lower bounds, many impossibility results for consensus [11,13,22,32] also directly apply to synchronous counting, as synchronous counting solves binary consensus [15,16]. In particular, no algorithm can tolerate f ≥ n/3 Byzantine faulty nodes [32] (unless cryptographic assumptions are made) and any deterministic algorithm needs at least f + 1 rounds to stabilise [22].
In a seminal work, Dolev and Welch [18] showed that the task can be solved in a self-stabilising manner in the presence of (the optimal number of) f < n/3 Byzantine faults using randomisation; see also [17, Ch. 6]. While this algorithm can be implemented using only constant-size messages, the expected stabilisation time is exponential. Later, Ben-Or et al. [3] showed that it is possible to obtain optimally-resilient solutions that stabilise in expected constant time. However, their algorithm relies on shared coins, which are costly to implement and assume private communication channels.
In addition to the lower bound results, there also exist deterministic algorithms for the synchronous counting problem [12,16,23,29]. Many of these algorithms utilise consensus routines [12,23,29], but obtaining fast and communication-efficient solutions with optimal resilience has been a challenge. For example, Dolev and Hoch [12] apply a pipelining technique, where Ω(f ) consensus instances are run in parallel. While this approach attains optimal resilience and linear stabilisation time in f , the large number of parallel consensus instances necessitates large messages.
In order to achieve better communication and state complexity, the use of computational algorithm design and synthesis techniques has also been investigated [5,16]. While this line of research has produced novel optimal and computer-verified algorithms, so far the techniques have not scaled beyond f = 1 faulty node due to the inherent combinatorial explosion in the search space of potential algorithms.
Recently, we gave recursive constructions that achieve linear stabilisation time using only polylogarithmic message size and state bits per node [26,28]; see also the extended and revised version [29]. However, our previous constructions relied on specific (deterministic) consensus routines and their properties in a relatively ad hoc manner. In contrast, our new framework presented here lends itself to any (possibly randomised) synchronous consensus routine and improves the best known upper bound on the message size to O(log f ) bits. Currently, it is unknown whether it is possible to deterministically achieve message size o(log f ).
Firing squads. In the original formulation of the firing squad synchronisation problem, the system consists of an n-length path consisting of finite state machines (whose number of states is independent of n) and the goal is to have all machines switch to the same "fire" state simultaneously after one node receives a "go" signal. This formulation of the problem has been attributed to John Myhill and Edward Moore and has subsequently been studied in various settings; see e.g. [31] for a survey of early work related to the problem.
In the distributed computing community, the firing squad problem has been studied in fully-connected networks in the presence of faulty nodes. Similarly to synchronous counting, the firing squad problem is closely connected to Byzantine agreement and simultaneous consensus [6-8, 14, 20]. Both Burns and Lynch [6] and Coan et al. [7] studied the firing squad problem in the context of Byzantine failures. Burns and Lynch [6] considered both permissive and strict variants of the problem (i.e., whether faulty nodes can trigger a firing event or not) and showed that both can be solved using Byzantine consensus algorithms with only a relatively small additional overhead in the number of communication rounds and total number of bits communicated. On the other hand, Coan et al. [7] gave authenticated firing squad algorithms for various Byzantine fault models. Coan and Dwork [8] gave time lower bounds of f + 1 rounds for deterministic and randomised algorithms solving the firing squad problem in the crash fault model.
However, neither the solution of Burns and Lynch [6] nor that of Coan et al. [7] is self-stabilising or uses small messages. Almost two decades later, Dolev et al. [14] gave the first self-stabilising algorithm for the firing squad problem. In particular, their solution has optimal stabilisation time and response time depending on the fault pattern. However, their algorithm tolerates only crash faults and uses messages of size Θ(n log n) bits. In this work, we improve on this result by achieving Byzantine fault-tolerance using messages of O(log n) bits.

Outline of the paper
The article is structured as follows. For the first part of the paper, we confine the presentation to Byzantine faults. In the second part, we discuss how to extend our results in two ways: first, we consider the randomised setting, where sublinear time algorithms are possible, and secondly, other fault models that allow a larger number of faulty nodes.
We start with Section 2, where we give formal definitions related to the model of computation, synchronous counting, and firing squads in the Byzantine setting. In the sections following this, we show our main result in a top-down fashion as illustrated in Figure 1. We introduce a series of new problems and give reductions between them:
• Section 3 shows how to obtain synchronous counting and firing squad algorithms that rely on binary consensus routines and strong pulsers,
• Section 4 devises strong pulsers with the help of weak pulsers and multivalued consensus,
• Section 5 constructs weak pulsers using silent consensus and less resilient strong pulsers.
Section 6 combines the results of Section 4 and Section 5 to obtain a recursive construction for strong pulsers used by the algorithms given in Section 3. Finally, to demonstrate the flexibility and generality of our approach, we discuss how to extend our results to randomised consensus routines in Section 7, and cover deterministic solutions under crash and omission faults in Section 8.

Preliminaries
In this section, we first fix some basic notation, then describe the model of computation, and finally give formal definitions of the synchronous counting, self-stabilising firing squad, and consensus problems.

Model of computation
We consider a fully-connected synchronous network on node set V consisting of n = |V | processors.
We assume there exists a subset F ⊆ V of faulty nodes that is (at least initially) unknown to all nodes, where the upper bound f on the size |F| ≤ f is known to the nodes. We say that nodes in V \ F are correct and nodes in F are faulty. All correct nodes in the system will follow a given algorithm A that is the same for all the nodes in the system. The execution proceeds in synchronous rounds, where in each round t ∈ N the nodes take the following actions in lock-step:
1. perform local computations,
2. send messages to other nodes, and
3. receive messages from other nodes.
We assume that nodes have unique identifiers from {1, . . . , n} and can identify the sender of incoming messages.
We say that an algorithm A has message size M (A) if no correct node sends more than M (A) bits to any other node during a single round.
The local computations of a node v determine which messages it sends to other nodes and what its new state is. As we are interested in self-stabilising algorithms, the initial state of a node is arbitrary; this is equivalent to assuming that transient faults have arbitrarily corrupted the state of each node, but the transient faults have ceased by the beginning of the first round.
As mentioned above, we allow for additional (possibly permanent) Byzantine faults. A Byzantine faulty node v ∈ F may deviate from the algorithm arbitrarily, i.e., send arbitrary messages in each round. In particular, a Byzantine faulty node can send different messages to each correct node in the system, even if the algorithm specifies otherwise. Since we consider deterministic algorithms, the meaning of "arbitrary" in this context is that the algorithm must succeed for any possible choice of behavior of the faulty nodes. We require that f = |F | < n/3, as otherwise none of the problems we consider can be solved due to the impossibility of consensus under f ≥ n/3 Byzantine faults [32].

Synchronous counting
In the synchronous C-counting problem, the task is to have each node v ∈ V output a counter value c(v, t) ∈ [C] on each round t ∈ N in a consistent manner. We say that an execution of an algorithm stabilises in round t if and only if all t ≤ t' ∈ N and v, w ∈ V \ F satisfy
SC1. Agreement: c(v, t') = c(w, t') and
SC2. Consistency: c(v, t' + 1) = c(v, t') + 1 mod C.
We say that A is an f -resilient C-counting algorithm that stabilises in time t if all executions with at most f faulty nodes stabilise by round t. The stabilisation time T (A) of A is the maximum such t over all executions.
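As an illustration of conditions SC1 and SC2, the following is a minimal sketch that checks them on a finite recorded trace. The trace format (a list of per-round dictionaries mapping correct node identifiers to counter values) and the function name are assumptions made for this example; since only a finite prefix of an execution can be inspected, the check is necessarily restricted to the recorded rounds.

```python
from typing import Dict, List, Optional


def stabilisation_round(trace: List[Dict[int, int]], C: int) -> Optional[int]:
    """Return the earliest round t such that SC1 and SC2 hold for every
    recorded round t' >= t, or None if even the last recorded round fails.

    trace[t][v] is the counter value c(v, t) of correct node v in round t
    (hypothetical trace format, chosen for this sketch).
    """
    def agreement(t: int) -> bool:
        # SC1: all correct nodes output the same value in round t.
        return len(set(trace[t].values())) == 1

    def consistency(t: int) -> bool:
        # SC2 relates round t to round t + 1; the last recorded round
        # has no successor, so nothing can be checked for it.
        if t + 1 >= len(trace):
            return True
        return all(trace[t + 1][v] == (trace[t][v] + 1) % C for v in trace[t])

    earliest = None
    # Scan backwards: the stable suffix ends at the first violation.
    for t in range(len(trace) - 1, -1, -1):
        if agreement(t) and consistency(t):
            earliest = t
        else:
            break
    return earliest


# Three correct nodes (n = 4, f = 1) and C = 3.
example = [
    {1: 2, 2: 0, 3: 1},
    {1: 0, 2: 0, 3: 2},
    {1: 1, 2: 1, 3: 1},
    {1: 2, 2: 2, 3: 2},
    {1: 0, 2: 0, 3: 0},
    {1: 1, 2: 1, 3: 1},
]
print(stabilisation_round(example, C=3))
```

In the example trace the counters disagree in the first two rounds and count consistently modulo 3 from the third recorded round onwards.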

Self-stabilising firing squad
In the self-stabilising Byzantine firing squad problem, in each round t ∈ N, each node v ∈ V receives an external input GO(v, t) ∈ {0, 1}. Moreover, the algorithm determines an output FIRE(v, t) ∈ {0, 1} at each node v ∈ V in each round t ∈ N. We say that an execution of an algorithm stabilises in round t ∈ N if the following three properties hold: (FS1) agreement, (FS2) safety, and (FS3) liveness.
Note that the liveness condition requires f + 1 correct nodes to observe a GO input, as otherwise it would be impossible to guarantee that a correct node observed a GO input when firing; this corresponds to the definition of a strict Byzantine firing squad [6]. We say that an execution stabilised by round t has response time R from round t if (i) when firing is required in response to (sufficiently many) GO inputs of 1 in round t_G ≥ t, this happens no later than round t_G + R, and (ii) when the squad fires in round t_F ≥ t, there was sufficient support (in terms of GO inputs of 1) justifying this in a round t_G with t_F > t_G ≥ t_F − R.
Finally, we say that an algorithm F is an f-resilient firing squad algorithm with stabilisation time T(F) and response time R(F) if in any execution of the system with at most f faulty nodes there is a round t ≤ T(F) such that the algorithm stabilised and has response time at most R(F) from round t. We remark that under Byzantine faults, previous non-stabilising algorithms [6] have considered the case where the input signals (from different nodes) do not need to be received on the same round, but they can be spread out over several rounds. In the self-stabilising setting, we can easily cover the case where f + 1 input signals are received within a time window of ∆ rounds as follows: instead of relying on the input GO signals as-is, we can use an auxiliary variable GO'(v, t) as input to our algorithms, where GO'(v, t) = 1 iff there is a round t' ∈ {t − ∆ + 1, . . . , t} with GO(v, t') = 1.
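The windowed input signal can be sketched as follows for a single node. The function name and the list-based representation of the GO inputs are assumptions made for illustration, and the sketch treats rounds before the start of the list as carrying no signal; in a self-stabilising implementation the memory of the last ∆ rounds would itself start in an arbitrary state and only become meaningful after ∆ rounds.

```python
from typing import List


def windowed_go(go: List[int], delta: int) -> List[int]:
    """Compute the auxiliary signal for one node: the entry for round t is 1
    iff the raw GO input was 1 in some round of {t - delta + 1, ..., t}."""
    out = []
    for t in range(len(go)):
        start = max(0, t - delta + 1)  # rounds before the list are ignored
        out.append(1 if any(go[start:t + 1]) else 0)
    return out


raw = [0, 1, 0, 0, 0, 1, 0]
print(windowed_go(raw, delta=3))
```

With ∆ = 1 the auxiliary signal coincides with the raw input, so the windowed variant strictly generalises the original definition.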

Consensus
Let us conclude this section by defining the multivalued consensus problem. Unlike the synchronous counting and self-stabilising firing squad problems, the standard definition of consensus does not require self-stabilisation: we assume that all nodes start from a fixed starting state and the algorithm terminates in finitely many communication rounds.
In the multivalued consensus problem for L > 1 values, each node v ∈ V receives an input value x(v) ∈ [L] and the task is to have all correct nodes output the same value y ∈ [L]. We say that an algorithm C is an f-resilient T(C)-round consensus algorithm if the following conditions hold when there are at most f faulty nodes:
• Termination: every correct node decides on an output by the end of round T(C).
• Agreement: all correct nodes output the same value y.
• Validity: if every correct node received the same value x as input, then y = x.
We remark that one may ask for stronger validity conditions, but for our purposes this condition is sufficient. The binary consensus problem is the special case of L = 2 of the above multivalued consensus problem. In the case of binary consensus, the stated validity condition is equivalent to requiring that the common output y was the input x(v) = y of some correct node v.
Later, we utilise the fact that multivalued consensus can be reduced to binary consensus with only a small overhead in time. In [27], it is shown how to do this with 1-bit messages and an additive overhead of O(log L) rounds, preserving resilience.
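The remark on binary validity can be checked by brute force on small instances. The sketch below assumes that validity means "if all correct nodes have the same input x, the common output is x" (as in the introduction) and compares it with the alternative phrasing "the common output was the input of some correct node"; the latter phrasing and all function names are assumptions of this example. It only considers the inputs of correct nodes and a single agreed output y.

```python
from itertools import product
from typing import Sequence


def validity(inputs: Sequence[int], y: int) -> bool:
    """If all correct nodes share the same input x, the output must be x."""
    return len(set(inputs)) != 1 or y == inputs[0]


def output_was_input(inputs: Sequence[int], y: int) -> bool:
    """The agreed output y was the input of at least one correct node."""
    return y in inputs


def conditions_equivalent(num_values: int, num_nodes: int) -> bool:
    """Exhaustively compare both conditions over all input vectors and outputs."""
    values = range(num_values)
    return all(
        validity(inputs, y) == output_was_input(inputs, y)
        for inputs in product(values, repeat=num_nodes)
        for y in values
    )


print(conditions_equivalent(num_values=2, num_nodes=4))  # binary case
print(conditions_equivalent(num_values=3, num_nodes=3))  # three values
```

For two values the conditions coincide, because mixed inputs already contain both possible outputs. For three values they differ: with inputs (0, 0, 1) the output 2 satisfies the first condition but not the second.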

Synchronous counting and firing squads
In this section, we give a firing squad algorithm with asymptotically optimal stabilisation and response times. The algorithm relies on two auxiliary routines: a so-called strong pulser and a consensus algorithm. We start with a discussion on strong pulsers.

Strong pulsers and counting
Our approach to the firing squad problem is to solve it by repeated consensus, which in turn is controlled by a joint round counter. To minimise message size, however, we will not communicate counter values directly. Instead we make use of what we call a strong pulser.
Definition 1 (Strong pulser). An algorithm P is an f-resilient strong Ψ-pulser that stabilises in T(P) rounds if it satisfies the following conditions in the presence of at most f faulty nodes. Each node v ∈ V produces an output bit p(v, t) ∈ {0, 1} on each round t ∈ N. We say that v generates a pulse in round t if p(v, t) = 1 holds. We require that there is a round t_0 ≤ T(P) such that, for every correct node v ∈ V \ F and every round t ≥ t_0, we have p(v, t) = 1 if and only if t = t_0 + kΨ for some k ∈ N_0.
Put otherwise, a strong Ψ-pulser consistently generates pulses at all non-faulty nodes exactly every Ψ rounds. Figure 2 illustrates an execution of a strong pulser with Ψ = 3. It is straightforward to see that strong pulsers and synchronous counting are almost equivalent.
Lemma 1. Let C ∈ N and Ψ ∈ N. If C divides Ψ, then a strong Ψ-pulser that stabilises in T rounds implies a synchronous C-counter that stabilises in at most T rounds. If Ψ divides C, then a synchronous C-counter that stabilises in T rounds implies a strong Ψ-pulser that stabilises in at most T + Ψ − 1 rounds.
Another way of interpreting this relation is to view a strong Ψ-pulser as a different encoding of the output of a Ψ-counter: since the system is synchronous, it suffices to communicate when the counter overflows to value 0 and otherwise count locally. This saves bandwidth when communicating the state of the counter.
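The two directions of Lemma 1 can be sketched as local conversions at a single node. This is one plausible way to realise the lemma, not necessarily the construction used in its proof; the function names and the list-based traces are assumptions of this example. In the first direction the local counter resets to 0 on a pulse and is otherwise incremented modulo C, which is consistent across pulses because C divides Ψ. In the second direction a pulse is emitted whenever the counter is a multiple of Ψ.

```python
from typing import List


def counter_from_pulses(pulses: List[int], C: int, initial: int = 0) -> List[int]:
    """Derive a C-counter from strong Psi-pulser outputs, assuming C divides Psi.

    `initial` models the arbitrary local counter value before the first round.
    """
    out, c = [], initial
    for p in pulses:
        c = 0 if p == 1 else (c + 1) % C
        out.append(c)
    return out


def pulses_from_counter(counter: List[int], psi: int) -> List[int]:
    """Derive strong Psi-pulser outputs from a C-counter, assuming Psi divides C."""
    return [1 if c % psi == 0 else 0 for c in counter]


# A pulser with Psi = 6 that behaves arbitrarily for four rounds and then
# pulses in rounds 4, 10, 16; the derived counter uses C = 3.
pulses = [0, 1, 1, 0] + [1, 0, 0, 0, 0, 0] * 2 + [1]
print(counter_from_pulses(pulses, C=3, initial=2)[4:])

# A 6-counter that is already stable yields pulses every Psi = 3 rounds.
print(pulses_from_counter([4, 5, 0, 1, 2, 3, 4, 5, 0], psi=3))
```

Whatever the initial counter value, the derived counter is determined from the first stable pulse onwards, which matches the claim that the counter stabilises no later than the pulser.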

Firing squads via pulsers and consensus
We now show how an f-resilient strong pulser and an f-resilient binary consensus algorithm can be used to devise an f-resilient firing squad algorithm. As a strong pulser can be used to control repeated execution of a non-self-stabilising algorithm, it enables us to repeatedly run consensus on whether a firing event should be triggered. As the firing squad problem is at least as hard as consensus, this maintains asymptotically optimal round complexity.
Recall that for the Byzantine firing squad problem, we are interested in a liveness condition in which a firing event needs to be generated if at least f + 1 non-faulty nodes v ∈ V \ F recently saw GO(v, t) = 1 on some round t. To this end, we have each node continuously inform all other nodes about its GO values (i.e. their received input signals). Whenever node v ∈ V sees f + 1 nodes w ∈ V claim GO(w, t) = 1, it will memorise this and use input x(v) = 1 for the next consensus instance. Otherwise, it will use the input value x(v) = 0; this ensures that at least one non-faulty node w had GO(w, t) = 1 recently in case v uses input x(v) = 1. The validity condition of the (arbitrary) T (C)-round consensus routine C thus ensures both liveness and safety for the resulting firing squad algorithm. Apart from C, the algorithm concurrently runs a strong Ψ-pulser P for some Ψ > T (C).
The firing squad algorithm. Given a strong Ψ-pulser algorithm P and a binary consensus algorithm C, each node v stores the following variables on every round t:
• p(v, t), the output variable of P,
• x(v, t) and y(v, t), the input and output variables of C, and
• m(v, t) ∈ {0, 1}, an auxiliary variable used to memorise whether sufficiently many GO signals were received to warrant a firing event.
In the following algorithm, on each round t ∈ N any (correct) node v ∈ V will broadcast the value GO(v, t) and receive the values GO(v, w, t − 1) sent by every w ∈ V in the previous round. The algorithm consists of each node v executing the following operations in each round t ∈ N: If p(v, t) = 1, start executing a new instance of C using the value x(v, t) as input and set m(v, t) = 0 while aborting any previously running instance. More specifically, this entails the following:
• Maintain a local round counter r, which is initialised to 1 on round t and increased by 1 after each round.
• Maintain the local state variables related to the consensus routine C.
• On each round, execute round r of algorithm C; if the state variables indicate that C terminated at v, then do nothing.
• On the round when r would attain the value T(C) + 1, stop the simulation (indicating this, e.g., by setting r(v) = ⊥) and locally output the value of y(v) computed by the simulation of C.
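The bookkeeping for simulating one consensus instance at a time can be sketched as a small wrapper. This is a hedged illustration only: the class name, the callbacks `init`, `step`, and `output`, and the omission of all message passing are assumptions of this example, and the consensus routine itself is treated as a black box. The sketch also assumes that a finished instance reports its output before a pulse arriving in the same round starts a new instance, a case that does not arise once the pulser has stabilised with Ψ > T(C).

```python
from typing import Any, Callable, Optional


class InstanceRunner:
    """Runs at most one instance of a T-round routine, restarted on each pulse."""

    def __init__(self, T: int, init: Callable[[int], Any],
                 step: Callable[[Any, int], Any], output: Callable[[Any], int]):
        self.T, self.init, self.step, self.output = T, init, step, output
        self.r: Optional[int] = None  # None plays the role of ⊥ (no instance)
        self.state: Any = None

    def tick(self, pulse: int, x: int) -> Optional[int]:
        """Process one round; return y on the round an instance completes."""
        result = None
        if self.r is not None and self.r >= self.T + 1:
            # The local counter would attain T + 1: stop and output y.
            result = self.output(self.state)
            self.r = None
        if pulse == 1:
            # Start a new instance with input x, aborting any running one.
            self.r, self.state = 1, self.init(x)
        if self.r is not None:
            self.state = self.step(self.state, self.r)  # execute round r
            self.r += 1
        return result


# Dummy 3-round routine that simply remembers its input (hypothetical).
runner = InstanceRunner(T=3, init=lambda x: x, step=lambda s, r: s,
                        output=lambda s: s)
pulses = [1, 0, 0, 0, 0, 1, 0, 0, 0]
inputs = [1, 0, 0, 0, 0, 0, 0, 0, 0]
print([runner.tick(p, x) for p, x in zip(pulses, inputs)])
```

With pulses in rounds 0 and 5 and T = 3, the wrapper reports outputs in rounds 3 and 8, that is, exactly T rounds after each pulse.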
We now show that the above algorithm satisfies the properties required from a self-stabilising firing squad.
Proof. Let F be the algorithm described above. We now argue that the algorithm satisfies the three properties given in Section 2.4: (FS1) agreement, (FS2) safety, and (FS3) liveness. We will show that the algorithm has a response time bounded by R = T(C) + Ψ.
(FS1) Denote by t_0 ≤ T(P) the round in which the execution of the strong Ψ-pulser P has stabilised and generated a pulse. That is, for rounds t ≥ t_0 we have that p(v, t) = 1 is equivalent to t = t_0 + kΨ for some k ∈ N_0. This implies that the algorithm will correctly simulate instances of the consensus routine C and locally output its decision on rounds r_k = t_0 + T(C) + kΨ < t_{k+1} for k ∈ N_0, where t_k = t_0 + kΨ. The agreement property of the firing squad thus follows from the agreement property of consensus for all rounds t ≥ t_0, as FIRE(v, t) = 1 if and only if t = r_k and the simulation of C output the value y(v, t) = 1 in Step 4.
(FS2) Concerning safety, suppose v ∈ V \ F outputs FIRE(v, t_F) = 1 in round t_F ≥ t_1 + T(C). By the above discussion and the validity property of consensus, this implies that there was some node w ∈ V \ F that started a (successfully and completely simulated) instance of C with input [...] Step 3) and thus w sets x(w, r_{k−1}) = 0 later in round r_{k−1} (by Steps 4 and 5), the round in which the previous instance of C locally output some value. This contradicts the fact that x(w, t_k) = 1 is set in round t_k. Hence, there must be u ∈ V \ F and t_G ∈ {t_{k−1}, . . . , t_k − 1} such that GO(u, t_G) = 1.
Recall that the above claimed existence of u ∈ V \ F and t_G such that GO(u, t_G) = 1 is necessary for the safety condition to hold, but not sufficient. It is also required that FIRE(v, t') = 0 for all t' ∈ {t_G + 1, . . . , t_F − 1}. To show this, observe that the time t_G shown to exist by the above reasoning does not satisfy this additional constraint if and only if some instance of C locally outputs y(v, t') = 1 at node v in such a round t'. The only possible such round t' is r_{k−1}, as t' ≥ t_G + 1 > t_{k−1} > r_{k−2}. However, in this case, each w ∈ V \ F sets x(w, r_{k−1}) = 0 in round r_{k−1} regardless of m(w, r_{k−1}) in Step 4, and we can conclude that some w ∈ V \ F must set x(w, t') = 1 in some round t' ∈ {r_{k−1} + 1, . . . , t_k}. As above, it follows that there is a round t_G ∈ {t_{k−1}, . . . , t_k − 1} and a node u ∈ V \ F such that GO(u, t_G) = 1. Overall, we see that the safety condition for a firing squad algorithm with response time [...]
(FS3) It remains to argue that the algorithm satisfies the liveness property with response time bounded by R.
The instance of C started in this round will satisfy that all correct nodes v ∈ V \ F have input x(v, t_k) = 1: by our assumption towards contradiction, no node can locally output y(v, t') = 1 during rounds t' ∈ {t_G + 1, . . . , t_G + R}; thus, no node can set x(v, ·) to 0 without setting m(v, ·) to 0 first (by Step 3 and Step 5), which in turn entails that at time t_k an instance of C with value of x(v, t_k) = 1 is started before this happens. By the properties of C, it follows that each v ∈ V \ F locally outputs 1 in round [...], contradicting our previous assumption. We conclude that our algorithm satisfies the liveness property with response time R = Ψ + T(C) for rounds t_G ≥ t_0 − 1.
As t 0 ≤ T (P), it follows that the algorithm satisfies (FS1) agreement after round t 0 , (FS2) safety after round t 1 , and (FS3) liveness after round t 0 − 1. Since t 1 = t 0 + Ψ ≤ T (P) + Ψ, it follows that the algorithm is a firing squad with response time at most R = Ψ + T (C) that stabilises in max{T (P), T (P) + Ψ, T (P) − 1} = T (P) + Ψ rounds. The bound on the message size follows from the fact that the algorithm F only broadcasts 1 bit in Step 1 in addition to the messages related to P and C.

From weak pulsers to strong pulsers
In Section 3, we established that it suffices to construct suitable strong pulsers to solve the synchronous counting and firing squad problems. We will now reduce the construction of strong pulsers to constructing weak pulsers.

Weak pulsers
A weak Φ-pulser is similar to a strong pulser, but does not guarantee a fixed frequency of pulses. However, it guarantees to eventually generate a pulse followed by Φ − 1 rounds of silence. Formally, we define weak pulsers as follows.
Definition 2 (Weak pulsers). An algorithm W is an f -resilient weak Φ-pulser that stabilises in T (W) rounds if the following holds. In each round t ∈ N, each node v ∈ V produces an output a(v, t). Moreover, there exists a round t 0 ≤ T (W) such that
$$\text{(W1)}\quad a(v, t) = a(w, t) \ \text{ for all } v, w \in V \setminus F \text{ and } t \ge t_0,$$
$$\text{(W2)}\quad a(v, t_0) = 1 \ \text{ for all } v \in V \setminus F,$$
$$\text{(W3)}\quad a(v, t) = 0 \ \text{ for all } v \in V \setminus F \text{ and } t \in \{t_0 + 1, \ldots, t_0 + \Phi - 1\}.$$
We say that on round t 0 a good pulse is generated by W. Figure 3 illustrates a weak 4-pulser. Note that while the definition formally only asks for one good pulse, the fact that the algorithm guarantees this property for any starting state implies that there is a good pulse at least every T (W) rounds.
Figure 3: A weak 4-pulser. Eventually, a good pulse is generated, which is highlighted. A good pulse is followed by three rounds in which no correct node generates a pulse. In contrast, the pulse two rounds earlier is not good, as it is followed by only one round of silence.
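To make the notion of a good pulse concrete, the following minimal Python sketch scans a finite output trace of the correct nodes for the first round that behaves like a good pulse: all correct nodes agree from that round on, all of them pulse in that round, and Φ − 1 silent rounds follow. The function name `find_good_pulse` and the trace encoding (one list of output bits per round) are assumptions made for illustration and are not part of the construction.

```python
def find_good_pulse(trace, phi):
    """Return the first round t0 that looks like a good pulse, or None.

    trace[t] lists the outputs a(v, t) of the correct nodes in round t.
    Agreement "from t0 on" can only be checked up to the end of the trace.
    """
    rounds = len(trace)
    for t0 in range(rounds - phi + 1):
        # All correct nodes produce identical outputs from round t0 on.
        agree = all(len(set(row)) == 1 for row in trace[t0:])
        # Every correct node pulses in round t0.
        pulse = all(a == 1 for a in trace[t0])
        # No correct node pulses in rounds t0+1, ..., t0+phi-1.
        silent = all(a == 0 for row in trace[t0 + 1:t0 + phi] for a in row)
        if agree and pulse and silent:
            return t0
    return None


# A trace in the spirit of a weak 4-pulser: the pulse in round 2 is followed
# by only one silent round, the pulse in round 4 by three silent rounds.
example = [
    [1, 0, 1, 0],
    [0, 0, 0, 0],
    [1, 1, 1, 1],
    [0, 0, 0, 0],
    [1, 1, 1, 1],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
]
print(find_good_pulse(example, 4))
```

In this example the pulse in round 2 is rejected because another pulse follows two rounds later, whereas the pulse in round 4 qualifies.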

Constructing strong pulsers from weak pulsers
Recall that a strong pulser can be obtained by having nodes locally count down the rounds until the next pulse, provided we have a way of ensuring that the local counters eventually agree. This can be achieved by using a weak pulser to control a suitable consensus routine, where again we always have only a single instance running at any time. While some instances will be aborted before they can complete, this will not affect the counters, as we only adjust them when the consensus routine completes. On the other hand, the weak pulser guarantees that within T (W) rounds, there will be a pulse followed by Φ − 1 rounds of silence, which leaves enough time to complete a run of any consensus routine C satisfying T (C) ≤ Φ. Thus, for constructing a strong Ψ-pulser, we assume that we have the following f -resilient algorithms available:
• a T (C)-round Ψ-value consensus algorithm C and
• a weak Φ-pulser W for Φ ≥ T (C).
Given the above two algorithms, we show how to construct an f -resilient strong Ψ-pulser for any Ψ > 1. The pulser will stabilise in time T (W) + T (C) + Ψ and the message size of the strong pulser will be bounded by M (W) + M (C).
As mentioned earlier, the idea is to have nodes simply count locally between pulses and use the weak pulser to execute a single instance of the consensus algorithm C. Eventually, a good pulse will run an instance consistently and establish agreement among the local counters. Leveraging validity, we can ensure that the counters will never be affected by the consensus routine running in the background again.
Variables. Besides the variables of the weak pulser W and (a single copy of) C, our construction of a strong Ψ-pulser uses the following local variables:
• b(v, t) ∈ {0, 1} is the output variable of the strong Ψ-pulser we are constructing,
• c(v, t) ∈ [Ψ] is the local counter keeping track of when the next pulse occurs, and
• d(v, t) ∈ {1, . . . , T (C)} ∪ {⊥} keeps track of how many rounds an instance of C has been executed since the last pulse from the weak pulser W. The value ⊥ denotes that the consensus routine has stopped.
Strong pulser algorithm. The algorithm is as follows. Each node v executes the weak Φ-pulser algorithm W in addition to the following instructions on each round t ∈ N:
1. If c(v, t) = 0, then set b(v, t) = 1 and otherwise b(v, t) = 0.
In the above algorithm, the first step simply translates the counter value to the output of the strong pulser. We then use a temporary variable c′(v, t) to hold the counter value, which is overwritten by the output of C (increased by T (C) mod Ψ) if it completes a run in this round. In either case, the counter value needs to be increased by 1 mod Ψ for the next round. The remaining code does the bookkeeping for an ongoing run of C and starting a new run if the weak pulser generates a pulse.
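The round structure described above can be sketched as a small fault-free simulation in Python. This is a minimal sketch under stated assumptions, not the algorithm as specified: the step order is reconstructed from the prose description, the consensus routine is replaced by an idealised stub that outputs the minimum input of the nodes finishing in the same round, and the names `simulate_strong_pulser`, `weak_pulses`, and `t_c` are hypothetical.

```python
import random

BOT = None  # stands for the symbol ⊥ (no consensus instance running)


def simulate_strong_pulser(n, psi, t_c, weak_pulses, rounds, seed=0):
    """Fault-free simulation; returns history[t] = outputs b(v, t) of all nodes.

    weak_pulses is the set of rounds in which the (assumed) weak pulser fires
    at every node. Initial states are arbitrary, as in self-stabilisation.
    """
    rng = random.Random(seed)
    c = [rng.randrange(psi) for _ in range(n)]           # local counters
    d = [rng.choice([BOT] + list(range(1, t_c + 1))) for _ in range(n)]
    x = [rng.randrange(psi) for _ in range(n)]           # inputs of running instance
    history = []
    for t in range(rounds):
        # Step 1: translate the counter into the pulser output.
        history.append([1 if c[v] == 0 else 0 for v in range(n)])
        # Step 2: temporary copy of the counter (c' in the text).
        c_tmp = list(c)
        # Idealised consensus stub: nodes finishing now agree on the minimum input.
        finishing = [v for v in range(n) if d[v] == t_c]
        y = min(x[v] for v in finishing) if finishing else None
        for v in range(n):
            # Step 3: bookkeeping for an ongoing run of C.
            if d[v] is not BOT:
                if d[v] == t_c:
                    c_tmp[v] = (y + t_c) % psi   # adopt output, shifted by T(C)
                    d[v] = BOT
                else:
                    d[v] += 1
            # Step 4: advance the counter for the next round.
            c[v] = (c_tmp[v] + 1) % psi
            # Step 5: a weak pulse starts a fresh instance with input c'.
            if t in weak_pulses:
                x[v] = c_tmp[v]
                d[v] = 1
    return history


hist = simulate_strong_pulser(n=4, psi=6, t_c=3, weak_pulses={5, 20, 22},
                              rounds=60, seed=1)
```

With a good pulse in round 5 and T(C) = 3, the counters agree from round 9 on in this simulation. The later pulses in rounds 20 and 22 (one of them aborting a running instance) leave the counters untouched, which mirrors the role of validity in the argument.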
Observe that in the above algorithm, each node only sends messages related to the weak pulser W and the consensus algorithm C. Thus, there is no additional overhead in communication and the message size is bounded by M (W) + M (C). Hence, it remains to show that the local counters c(v, t) implement a strong Ψ-counter.
Proof. Suppose round t 0 ≤ T (W) is as in Definition 2, that is, a(v, t) = a(w, t) for all t ≥ t 0 , and a good pulse is generated in round t 0 . Thus, all correct nodes participate in simulating an instance of C during rounds t 0 + 1, . . . , t 0 + T (C), since no pulse is generated during rounds t 0 + 1, . . . , t 0 + T (C) − 1, and thus, also no new instance is started in the last step of the code during these rounds.
By the agreement property of the consensus routine, it follows that c′(v, t 0 + T (C)) = c′(w, t 0 + T (C)) for all v, w ∈ V \ F after Step 3ci. By Steps 2 and 4, the same will hold for both c(·, t′) and c′(·, t′), t′ > t 0 + T (C), provided that we can show that in rounds t′ > t 0 + T (C), Step 3ci never sets c′(v, t′) to a value different from c(v, t′) for any v ∈ V \ F ; as this also implies that c(v, t′ + 1) = c(v, t′) + 1 mod Ψ for all v ∈ V \ F and t′ > t 0 + T (C), this will complete the proof.
Accordingly, consider any execution of Step 3ci in a round t′ > t 0 + T (C). The instance of C terminating in this round was started in round t′ − T (C) > t 0 . However, in this round the weak pulser must have generated a pulse, yielding that, in fact, t′ − T (C) ≥ t 0 + T (C). Assuming for contradiction that t′ is the earliest round in which the claim is violated, we thus have that c′(v, t′ − T (C)) = c′(w, t′ − T (C)) for all v, w ∈ V \ F , i.e., all correct nodes used the same input value c for the instance. By the validity property of C, this implies that v ∈ V \ F outputs y(v, t′) = c in round t′ and sets c′(v, t′) = c + T (C) mod Ψ. However, since t′ is the earliest round of violation, we already have that c′(v, t′) = c(v, t′) = c + T (C) mod Ψ after the second step, contradicting the assumption and showing that the execution stabilised in round t 0 + T (C) + 1 ≤ T (W) + T (C) + 1.
Together with Lemma 1, we get the following corollary.

Constructing weak pulsers from less resilient strong pulsers
Having seen that we can construct strong pulsers from weak pulsers using a consensus algorithm, the only piece missing in our framework is the existence of efficient weak pulsers. Indeed, having a pair of an f -resilient weak pulser and a consensus routine, we immediately obtain a corresponding firing squad algorithm.
In this section, we devise a recursive construction of a weak pulser from strong pulsers of smaller resilience. Given that a 0-resilient pulser is trivial and that we can obtain strong pulsers from weak ones without losing resilience, this is sufficient for constructing strong pulsers of optimal resilience from consensus algorithms of optimal resilience.
Our approach bears similarity to our constructions from earlier work [26,28], but attains better bit complexity and can be used with an arbitrary consensus routine. On a high level, we take the following approach as also illustrated in Figure 4:
1. Partition the network into two parts, each running a strong pulser (with small resilience). Our construction guarantees that at least one of the strong pulsers stabilises.
2. Filtering of pulses generated by the strong pulsers: (a) Nodes consider the observed pulses generated by the strong pulsers as potential pulses.
(b) Since one of the strong pulsers may not stabilise, it may generate spurious pulses, that is, pulses that only a subset of the correct nodes observe.
(c) We limit the frequency of the spurious pulses using a filtering mechanism based on threshold voting.
3. We enforce any spurious pulse to be observed by all correct nodes by employing a silent consensus routine. In silent consensus, no message is sent (by correct nodes) if all correct nodes have input 0. Thus, if all nodes actually participating in an instance have input 0, non-participating nodes behave as if they participated with input 0. This avoids the chicken-and-egg problem of having to solve consensus on participation in the consensus routine. We make sure that if any node uses input 1, i.e., the consensus routine may output 1, all nodes participate. Thus, when a pulse is generated, all correct nodes agree on this.
4. If a potential pulse generated by one of the pulsers both passes the filtering step and the consensus instance outputs "1", then a weak pulse is generated.

The filtering construction
Our goal is to construct a weak Φ-pulser (for sufficiently large Φ) with resilience f . We partition the set of n nodes into two disjoint sets V 0 and V 1 with n 0 and n 1 nodes, respectively. Thus, we have n = n 0 + n 1 . For i ∈ {0, 1}, let P i be an f i -resilient strong Ψ i -pulser. That is, P i generates a pulse every Ψ i rounds once stabilised, granted that V i contains at most f i faulty nodes. Nodes in block i execute the algorithm P i . Our construction tolerates f = f 0 + f 1 + 1 faulty nodes. Since we consider Byzantine faults, we require the additional constraint that f < n/3. Let a i (v, t) ∈ {0, 1} indicate the output bit of P i for a node v ∈ V i . Note that we might have a block i ∈ {0, 1} that contains more than f i faulty nodes. Thus, it is possible that the algorithm P i never stabilises. In particular, we might have the situation that some of the nodes in block i produce a pulse, but others do not. We say that a pulse generated by such a P i is spurious. We proceed by showing how to filter out such spurious pulses if they occur too often.

Figure 4: The network is divided into two blocks. Each block runs a strong pulser instance, where the pulsers have coprime frequencies. Consensus is used to agree whether a block generated a pulse recently, and a pulse is generated if one of the consensus instances outputs "1".

Filtering rules. We define five variables with the following semantics:
• m i (v, t) indicates whether v received a i (u, t − 1) = 1 from at least n i − f i nodes u ∈ V i ,
• M i (v, t) indicates whether v received m i (u, t − 1) = 1 from at least n − f nodes u ∈ V ,
• ℓ i (v, t) indicates when was the last time block i triggered a (possibly spurious) pulse,
• w i (v, t) indicates how long any firing events coming from block i are ignored, and
• b i (v, t) indicates whether node v accepts a firing event from block i.
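The first two of the above variables are set according to the following rules:
$$m_i(v, t+1) = \begin{cases} 1 & \text{if } |\{u \in V_i : a_i(v, u, t) = 1\}| \ge n_i - f_i, \\ 0 & \text{otherwise,} \end{cases}$$
$$M_i(v, t+1) = \begin{cases} 1 & \text{if } |\{u \in V : m_i(v, u, t) = 1\}| \ge n - f, \\ 0 & \text{otherwise,} \end{cases}$$
where a i (v, u, t) and m i (v, u, t) denote the values for a(·) and m(·) node v received from u at the end of round t, respectively. Furthermore, we update the ℓ i (·, ·) variables using the rule
$$\ell_i(v, t+1) = \begin{cases} 0 & \text{if } |\{u \in V : m_i(v, u, t) = 1\}| \ge f + 1, \\ \min\{\Psi_i, \ell_i(v, t) + 1\} & \text{otherwise.} \end{cases}$$
In words, the counter is reset on round t + 1 if v has proof that at least one correct node u had m i (u, t) = 1, that is, some u observed P i generating a (possibly spurious) pulse.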
We reset the cooldown counter w i whenever suspicious activity occurs. The idea is that it is reset to its maximum value C by node v in the following two cases:
• some other correct node u ≠ v observed block i generating a pulse, but the node v did not, or
• block i generated a pulse, but this happened either too soon or too late.
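To capture this behaviour, the cooldown counter is set with the rule
$$w_i(v, t+1) = \begin{cases} C & \text{if } M_i(v, t+1) = 0 \text{ and } \ell_i(v, t+1) = 0, \\ C & \text{if } M_i(v, t+1) = 1 \text{ and } \ell_i(v, t) \ne \Psi_i - 1, \\ \max\{w_i(v, t) - 1, 0\} & \text{otherwise,} \end{cases}$$
where C = max{Ψ 0 , Ψ 1 } + Φ + 2. Finally, a node v accepts a pulse generated by block i if the node's cooldown counter is zero and it saw at least n − f nodes supporting the pulse. The variable b i (v, t) indicates whether node v accepted a pulse from block i on round t. The variable is set using the rule
$$b_i(v, t) = \begin{cases} 1 & \text{if } w_i(v, t) = 0 \text{ and } M_i(v, t) = 1, \\ 0 & \text{otherwise.} \end{cases}$$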
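The filtering rules can be sketched as a per-node update function in Python. This is a minimal sketch under explicit assumptions: the thresholds n i − f i , n − f and f + 1 and the two reset cases follow the prose above, whereas the cap of the "last pulse" counter at Ψ i and the exact round indices in the cooldown rule are assumptions made for this illustration. All names (`filter_step`, `accepts`, `run_correct_block`) are hypothetical.

```python
def filter_step(state, a_recv, m_recv, n, f, n_i, f_i, psi_i, cooldown):
    """Compute one node's variables for round t+1 from what it received in round t.

    state:  dict with keys "m", "M", "l", "w" (values in round t)
    a_recv: bits a_i(u, t) received from the nodes of block i
    m_recv: bits m_i(u, t) received from all nodes
    """
    ones_a, ones_m = sum(a_recv), sum(m_recv)
    m_next = 1 if ones_a >= n_i - f_i else 0
    M_next = 1 if ones_m >= n - f else 0
    # Reset if at least one correct node must have seen a pulse (assumed cap: psi_i).
    l_next = 0 if ones_m >= f + 1 else min(psi_i, state["l"] + 1)
    if M_next == 0 and l_next == 0:
        w_next = cooldown            # others saw a pulse, this node did not
    elif M_next == 1 and state["l"] != psi_i - 1:
        w_next = cooldown            # pulse arrived too soon or too late
    else:
        w_next = max(state["w"] - 1, 0)
    return {"m": m_next, "M": M_next, "l": l_next, "w": w_next}


def accepts(state):
    """b_i(v, t): accept iff the cooldown has expired and n - f nodes support the pulse."""
    return 1 if state["w"] == 0 and state["M"] == 1 else 0


def run_correct_block(rounds, n=7, f=2, n_i=4, f_i=1, psi_i=4, cooldown=8):
    """Fault-free run in which block i pulses every psi_i rounds; all nodes behave alike."""
    state = {"m": 0, "M": 0, "l": 0, "w": cooldown}
    accepted = []
    for t in range(rounds):
        if accepts(state):
            accepted.append(t)
        a = 1 if t % psi_i == 0 else 0
        state = filter_step(state, [a] * n_i, [state["m"]] * n,
                            n, f, n_i, f_i, psi_i, cooldown)
    return accepted


accepted_rounds = run_correct_block(60)
```

In this fault-free run, once the cooldown counter has drained, the node accepts a pulse exactly every Ψ i rounds, two rounds after the block generated it.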

Analysis of the filtering construction
We now analyse when the nodes accept firing events generated by the blocks. We say that a block i is correct if it contains at most f i faulty nodes. Note that since there are at most f = f 0 + f 1 + 1 faulty nodes, at least one block i ∈ {0, 1} will be correct. Thus, eventually the algorithm P i run by a correct block i will stabilise. This yields the following lemma.
Lemma 2. For some i ∈ {0, 1}, the strong pulser algorithm P i stabilises by round T (P i ).
We proceed by establishing some bounds on when (possibly spurious) pulses generated by block i are accepted. We start with the case of having a correct block i.
Proof. If block i is correct, then the algorithm P i stabilises by round T (P i ). Hence, there is some t 0 ≤ T (P i ) so that the output variable a i (·) of P i satisfies a i (v, t) = 1 if and only if t = t 0 + kΨ i for k ∈ N 0 holds for all t ≥ t 0 . We will now argue that r 0 = t 0 + 2 satisfies the claim of the lemma.
If P i generates a pulse on round t ≥ t 0 , then at least n i − f i correct nodes u ∈ V i \ F have a i (u, t) = 1. Therefore, for all v ∈ V \ F we have m i (v, t + 1) = 1, and consequently, M i (v, t + 2) = 1. Since block i is correct, there are at most f i faulty nodes in the set V i . Observe that by Lemma 1 strong pulsers solve synchronous counting, which in turn is as hard as consensus [15]. This implies that we must have f i < n i /3, as P i is a strong f i -resilient pulser for n i nodes. Therefore, if P i does not generate a pulse on round t ≥ t 0 , then at most f i < n i − f i faulty nodes u may claim a i (u, t) = 1. This yields that m i (v, t + 1) = M i (v, t + 2) = 0 for all v ∈ V \ F .
We can now establish that a correct node accepts a pulse generated by a correct block i exactly every Ψ i rounds.

Lemma 4. If block i is correct, then there exists a round
Proof. Lemma 3 implies that there exists r 0 ≤ T (P i ) + 2 such that both M i (v, t) = 1 and ℓ i (v, t) = 0 hold for t ≥ r 0 if and only if t = r 0 + kΨ i for k ∈ N 0 . Thus, it follows that w i (v, t + 1) = max{w i (v, t) − 1, 0} for all such t and hence w i (v, t′) = 0 for all t′ ≥ r 0 + C + 2. The claim now follows from the definition of b i (v, t′), the choice of r 0 , and the fact that Ψ i ≤ C − 2.
It remains to deal with the faulty block. If we have Byzantine nodes, then a block i with more than f i faulty nodes may attempt to generate spurious pulses. However, the filtering mechanism prevents the spurious pulses from occurring too frequently.

Lemma 5. Let v, v′ ∈ V \ F and t > 2. Suppose b i (v, t) = 1 and suppose that t′ > t is minimal such that b i (v′, t′) = 1. Then t′ = t + Ψ i or t′ > t + C.
Proof. Suppose b i (v, t) = 1 for some correct node v ∈ V and t > 2. Since b i (v, t) = 1, w i (v, t) = 0 and M i (v, t) = 1. Because M i (v, t) = 1, there must be at least n − 2f > f correct nodes u such that m i (u, t − 1) = 1. Hence, ℓ i (u, t) = 0 for every node u ∈ V \ F .
Recall that t′ > t is minimal so that b i (v′, t′) = 1. Again, w i (v′, t′) = 0 and M i (v′, t′) = 1. Moreover, since ℓ i (v′, t) = 0, we must have ℓ i (v′, r) < Ψ i − 1 for all t ≤ r < t + Ψ i − 1. In the event that t′ ≠ t + Ψ i , the cooldown counter must have been reset at least once, i.e., w i (v′, r) = C holds for some t < r ≤ t′ − C, implying that t′ > t + C.

Introducing silent consensus
The above filtering mechanism prevents spurious pulses from occurring too often: if some node accepts a pulse from block i, then no node accepts a pulse from this block for at least Ψ i rounds. We now strengthen the construction to enforce that any (possibly spurious) pulse generated by block i will be accepted by either all or no correct nodes. In order to achieve this, we employ silent consensus.
Definition 3 (Silent consensus). We call a consensus protocol silent, if in each execution in which all correct nodes have input 0, correct nodes send no messages.
The idea is that this enables to have consistent executions even if not all correct nodes actually take part in an execution, provided we can ensure that in this case all participating correct nodes use input 0: the non-participating nodes send no messages either, which is the exact same behavior participating nodes would exhibit. We show that silent consensus protocols can be obtained from non-silent ones using a simple transformation.
Theorem 4. Any consensus protocol C can be transformed into a silent binary consensus protocol C′ with T (C′) = T (C) + 2 and the same resilience and message size.
Proof. The new protocol C′ can be seen as a "wrapper" protocol that manipulates the inputs and then lets each node decide whether it participates in an instance of the original protocol. The output of the original protocol, C, will be taken into account only by correct nodes that participate throughout the protocol, as specified below.
In the first round of the new protocol, C′, each participating node broadcasts its input if it is 1 and otherwise sends nothing. If a node receives fewer than n − f times the value 1, it sets its input to 0. In the second round, the same pattern is applied.
Subsequently, C is executed by all nodes that received at least f + 1 messages in the first round. If during the execution of C a node (i) cannot process the messages received in a given round in accordance with C (this may happen e.g. when not all of the correct nodes participate in the instance, which is not covered by the model assumptions of C),
We first show that the new protocol, C′, is a consensus protocol with the same resilience as C and the claimed bounds on communication complexity and running time. We distinguish two cases. First, suppose that all correct nodes participate in the execution of C at the beginning of the third round. As all nodes participate, the bounds on resilience, communication complexity, and running time that apply to C hold in this execution, and no node will quit executing the protocol before termination. To establish agreement and validity, again we distinguish two cases. If all nodes output the outcome of the execution of C, these properties follow right away since C satisfies them; here we use that although the initial two rounds might affect the inputs of nodes, a node will change its input to 0 only if there is at least one correct node with input 0. On the other hand, if some node outputs 0 because it received f or fewer messages in the second round of C′, no node received more than 2f < n − f messages in the second round. Consequently, all nodes executed C with input 0 and computed output 0 by the agreement property of C, implying agreement and validity of the new protocol.
The second case is that some correct node does not participate in the execution of C. Thus, it received at most f messages in the first round of C′, implying that no node received more than 2f < n − f messages in this round. Consequently, correct nodes set their input to 0 and will not transmit in the second round. While some nodes may execute C, all correct nodes will output 0 no matter how C behaves. Since nodes abort the execution of C if the bounds on communication or time complexity are about to be violated, the claimed bounds for the new protocol hold.
It remains to show that the new protocol is silent. Clearly, if all correct nodes have input 0, they will not transmit in the first two rounds. In particular, they will not receive more than f messages in the first round and not participate in the execution of C. Hence correct nodes do not send messages at all, as claimed.
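The two preprocessing rounds of the wrapper can be illustrated with a small Python sketch. It models only the fault-free case in which every broadcast reaches every node, so all nodes see the same counts; the function name `silent_prerounds` and its return values are assumptions made for this illustration, and the execution of C itself is not modelled.

```python
def silent_prerounds(inputs, f):
    """Fault-free model of the two wrapper rounds preceding the run of C.

    Returns (adjusted_inputs, participates, force_zero, messages_sent), where
    participates says whether nodes go on to execute C and force_zero says
    whether nodes output 0 regardless of C (f or fewer messages in round two).
    """
    n = len(inputs)
    x = list(inputs)
    sent = 0

    # Round 1: only nodes with input 1 broadcast.
    ones_first = sum(x)
    sent += ones_first
    participates = ones_first >= f + 1
    if ones_first < n - f:
        x = [0] * n                # too few 1s received: reset inputs to 0

    # Round 2: the same pattern is applied to the adjusted inputs.
    ones_second = sum(x)
    sent += ones_second
    if ones_second < n - f:
        x = [0] * n
    force_zero = ones_second <= f

    return x, participates, force_zero, sent


all_zero = silent_prerounds([0] * 7, f=2)
all_one = silent_prerounds([1] * 7, f=2)
mixed = silent_prerounds([1, 1, 1, 0, 0, 0, 0], f=2)
```

With all inputs 0, no broadcast is sent and no node participates, which is the silence property. With a minority of inputs 1, the inputs are reset to 0 after the first round and nothing is sent in the second.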
For example, plugging in the phase king protocol [4], we get the following corollary.

Using silent consensus to prune spurious pulses
As the filtering construction bounds the frequency at which spurious pulses may occur from above, we can make sure that at each time, only one consensus instance can be executed for each block. However, we need to further preprocess the inputs, in order to make sure that (i) all correct nodes participate in an instance or (ii) no participating correct node has input 1; here, output 1 means agreement on a pulse being triggered, while output 0 results in no action.
Recall that b i (v, t) ∈ {0, 1} indicates whether v observed a (filtered) pulse of the strong pulser P i in round t. Moreover, assume that C is a silent consensus protocol running in T (C) rounds. We use two copies C i , where i ∈ {0, 1}, of the consensus routine C. We require that Ψ i ≥ T (C), which guarantees by Lemma 5 that (after stabilisation) every instance of C has sufficient time to complete. Adding one more level of voting to clean up the inputs, we arrive at the following routine.
The pruning algorithm. Besides the local variables of C i , the algorithm will use the following variables for each v ∈ V and round t ∈ N: Analysis. Besides the communication used for computing the values b i (·), the above algorithm uses messages of size M (C) + 1, as M (C) bits are used when executing C i and one bit is used to communicate the value of b i (v, t).
We say that v ∈ V \ F executes round r ∈ {1, . . . , T (C)} of C i in round t iff r i (v, t) = r. By Lemma 5, in rounds t > T (C) + 2, there is always at most one instance of C i being executed, and if so, consistently.
Exploiting silence of C i and the choice of inputs, we can ensure that the case U ≠ V \ F causes no trouble.
Lemma 6. Let t > T (C) + 2 and U be as in Corollary 3. Then U = V \ F or each u ∈ U has input 0 for the respective instance of C i .
Proof. Suppose u ∈ U starts an instance with input 1 in round t′ ∈ {t − T (C) − 1, . . . , t}. Then b i (w, t′ − 1) = 1 for at least n − 2f nodes w ∈ V \ F , since u received b i (u, w, t′ − 1) = 1 from n − f nodes w ∈ V . Thus, each v ∈ V \ F received b i (v, w, t′ − 1) = 1 from at least n − 2f nodes w and sets r i (v, t′) = 1, i.e., U = V \ F . The lemma now follows from Corollary 3.
Recall that if all nodes executing C i have input 0, non-participating correct nodes behave exactly as if they executed C i as well, i.e., they send no messages. Hence, if U ≠ V \ F , all nodes executing the algorithm will compute output 0. Therefore, Corollary 3, Lemma 5, and Lemma 6 imply the following corollary.

Corollary 4. In rounds t > T (C) + 2 it holds that
Finally, we observe that our approach does not filter out pulses from correct blocks.
Proof. Lemma 4 states the same for the variables b i (v, t) and a round t 0 ≤ T (P i ) + 2C. If b i (v, t) = 1 for all v ∈ V \ F and some round t, all correct nodes start executing an instance of C i with input 1 in round t + 1. As, by Corollary 3, this instance executes correctly and, by validity of C i , outputs 1 in round t + T (C), all correct nodes satisfy B i (v, t + T (C) + 1) = 1. Similarly, B i (v, t + T (C) + 1) = 0 for such v and any t ≥ t 0 with b i (v, t) = 0.

Obtaining the weak pulser
Finally, we define the output variable of our weak pulser as
$$B(v, t) = \max\{B_0(v, t), B_1(v, t)\}.$$
As we have eliminated the possibility that B i (v, t) ≠ B i (w, t) for v, w ∈ V \ F and t > T (C) + 2, Property W1 holds. Since there is at least one correct block i by Lemma 2, Lemma 7 shows that there will be good pulses (satisfying Properties W2 and W3) regularly, unless block 1 − i interferes by generating pulses violating Property W3 (i.e., in too short order after a pulse generated by block i). Here the filtering mechanism comes to the rescue: as we made sure that pulses are either generated at the chosen frequency Ψ i or a long period of C rounds of generating no pulse is enforced (Corollary 4), it is sufficient to choose Ψ 0 and Ψ 1 as coprime multiples of Φ.
Accordingly, we pick Ψ 0 = 2Φ and Ψ 1 = 3Φ and observe that this results in a good pulse within O(Φ) rounds after the B i stabilised.
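The effect of choosing Ψ 0 = 2Φ and Ψ 1 = 3Φ can be checked by brute force in a few lines of Python. The sketch below makes a simplifying assumption: the other block is modelled as strictly periodic with an arbitrary phase, which covers only the "pulses at the chosen frequency" case and ignores the alternative of a long enforced silence. The function name `has_good_pulse` is hypothetical.

```python
def has_good_pulse(phi, i, offset):
    """Check two consecutive pulses of the correct block i (rounds 0 and psi_i).

    The other block is assumed to pulse in rounds offset, offset + psi_other, ...
    Returns True if at least one of the two pulses of block i is followed by
    phi - 1 rounds without a pulse of the other block.
    """
    psi = (2 * phi, 3 * phi)
    horizon = psi[i] + phi
    other = set(range(offset, horizon, psi[1 - i]))
    for t in (0, psi[i]):
        window = range(t + 1, t + phi)       # rounds t+1, ..., t+phi-1
        if not any(r in other for r in window):
            return True
    return False


always_good = all(
    has_good_pulse(phi, i, offset)
    for phi in range(1, 9)
    for i in (0, 1)
    for offset in range((2 * phi, 3 * phi)[1 - i])
)
print(always_good)
```

The check confirms, for small Φ and every phase of the other block, that at most one of two consecutive pulses of the correct block can be disturbed.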
Proof. We have that C = max{Ψ 0 , Ψ 1 } + Φ + 2 ∈ O(Φ). By the above observations, there is a round t ∈ max{T (P 0 ), T (P 1 )} + T (C) + O(Φ) = max{T (P 0 ), T (P 1 )} + O(Φ) satisfying the following four properties. For either block i ∈ {0, 1}, we have by Corollary 4 that Moreover, for a correct block i and for all v ∈ V \ F we have from Lemma 7 that and for a (possibly faulty) block 1 − i we have from Corollary 4 that 4. if B 1−i (v, t ) = 1 for some v ∈ V \F and t ∈ {t+1, . . . , t+Ψ i +Φ−1}, then B 1−i (u, t ) = 0 for all u ∈ V \ F and t ∈ {t + 1, . . . , t + C} that do not satisfy t = t + kΨ 1−i for some k ∈ N 0 . Now it remains to argue that a good pulse is generated. Suppose that i is a correct block given by Lemma 2. By the first property, it suffices to show that a good pulse occurs in round t or in round t + Ψ i . From the second property, we get for all v ∈ V \ F that B(v, t) = 1 and B(v, t + Ψ i ) = 1. If the pulse in round t is good, the claim holds. Hence, assume that there is a round t ∈ {t + 1, . . . , t + Ψ i − 1} in which another pulse occurs, that is, B(v, t ) = 1 for some v ∈ V \ F . This entails that B 1−i (v, t ) = 1 by the third property. We claim that in this case the pulse in round t + Ψ i is good. To show this, we exploit the fourth property. Recall that C > Ψ i + Φ, i.e., t + C > t + Ψ i + Φ. We distinguish two cases: • In the case i = 0, we have that t + Ψ 1−i = t + 3Φ = t + Ψ 0 + Ψ > t + Ψ 0 + Φ, that is, the pulse in round t + Ψ 0 = t + Ψ i is good.
From the above lemma and the constructions discussed in this section, we get the following theorem.
Theorem 5. Let n = n 0 + n 1 and f = f 0 + f 1 + 1, where n > 3f . Suppose C is an f -resilient consensus algorithm on n nodes and let Φ ≥ T (C) + 2. If there exist f i -resilient strong Ψ i -pulser algorithms on n i nodes, where Ψ 0 = 2Φ and Ψ 1 = 3Φ, then there exists an f -resilient weak Φ-pulser W on n nodes that satisfies
Proof. By Theorem 4, we can transform C into a silent consensus protocol C′, at the cost of increasing its round complexity by 2. Using C′ in the construction, Lemma 8 shows that we obtain a weak Φ-pulser with the stated stabilisation time, which by construction tolerates f faults. Concerning the message size, note that we run P 0 and P 1 on disjoint node sets. Apart from sending max{M (P 0 ), M (P 1 )} bits per round for its respective strong pulser, each node may send M (C) bits each to each other node for the two copies C i of C it runs in parallel, plus a constant number of additional bits for the filtering construction including its outputs b i (·, ·).

Main results
Finally, in this section we put the developed machinery to use. As our main result, we show how to recursively construct strong pulsers out of consensus algorithms.
Theorem 6. Suppose that we are given a family of f -resilient deterministic consensus algorithms C(f ) running on any number n > 3f of nodes in T (C(f )) rounds using M (C(f ))-bit messages, where T (C(f )) and M (C(f )) are non-decreasing in f . Then, for any Ψ ∈ N, f ∈ N 0 , and n > 3f , there exists a strong Ψ-pulser P on n nodes that Proof. We show by induction on k that f -resilient strong Ψ-pulsers P(f, Ψ) on n > 3f nodes with the stated complexity exist for any f < 2 k , with the addition that the (bounds on) stabilisation time and message size of our pulsers are non-decreasing in f . We anchor the induction at k = 0, i.e., f = 0, for which, trivially, a 0-resilient strong Ψ-pulser with n ∈ N nodes is given by one node generating pulses locally and informing the other nodes when to do so. This requires 1-bit messages and stabilises in Ψ + 1 rounds. Now assume that 2 k ≤ f < 2 k+1 for k ∈ N 0 and the claim holds for all 0 ≤ f < 2 k . Since 2 · (2 k − 1) + 1 = 2 k+1 − 1, there are f 0 , f 1 < 2 k such that f = f 0 + f 1 + 1. Moreover, as n > 3f > 3f 0 + 3f 1 , we can pick n i > 3f i for both i ∈ {0, 1} satisfying n = n 0 + n 1 . Let P(f , Ψ ) denote a strong Ψ -pulser that exists by the induction hypothesis for f < 2 k .
Choose Φ ∈ O(log Ψ) + T (C(f )) in accordance with Theorem 1 for L = Ψ; without loss of generality we may assume that the O(log Ψ) term is at least 2, that is, Φ ≥ 2 + T (C(f )). We apply Theorem 5 to C(f ) and P i = P(f i , Ψ i ), where Ψ 0 = 2Φ and Ψ 1 = 3Φ, to obtain a weak Φ-pulser W with resilience f on n nodes and stabilisation time of Next, we apply Theorem 1 to C(f ) to obtain an f -resilient Ψ-value consensus protocol C that uses M (C(f ))-bit messages and runs in T (C ) ≤ Φ rounds. We feed the weak pulser W and the multivalued consensus protocol C into Corollary 1 to obtain an f -resilient strong Ψ-pulser P with a stabilisation time of and message size bounded by Applying the bounds given by the induction hypothesis to P 0 and P 1 , the definitions of Φ, Ψ 0 and Ψ 1 , and the fact that both T (C(f )) and M (C(f )) are non-decreasing in f , we get that the stabilisation time satisfies and message size is bounded by Because we bounded complexities using max i {T (P i )}, max{M (P i )}, T (C(f )) and M (C(f )), all of which are non-decreasing in f by assumption, we also maintain that the new bounds on stabilisation time and message size are non-decreasing in f . Thus, the induction step succeeds and the proof is complete.
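The partitioning step of the induction can be made concrete with a short Python sketch. It picks one valid split f = f 0 + f 1 + 1 with n i > 3f i and counts how many recursion levels are needed until resilience 0 is reached. The particular choice of n 0 (the smallest admissible block) and the names `split` and `recursion_depth` are assumptions for illustration; the proof only requires that some such split exists.

```python
def split(n, f):
    """Split n > 3f nodes with f >= 1 faults into two blocks (n0, f0), (n1, f1).

    Guarantees f = f0 + f1 + 1, n = n0 + n1, and n_i > 3 * f_i for both blocks.
    """
    assert f >= 1 and n > 3 * f
    f0 = (f - 1) // 2
    f1 = f - 1 - f0
    n0 = 3 * f0 + 1          # smallest block that still tolerates f0 faults
    n1 = n - n0              # the rest; n1 >= 3 * f1 + 3 since n >= 3f + 1
    return (n0, f0), (n1, f1)


def recursion_depth(f):
    """Number of splitting levels until resilience 0, following the larger half."""
    depth = 0
    while f > 0:
        f = f - 1 - (f - 1) // 2
        depth += 1
    return depth


print(split(22, 7))
print(recursion_depth(7))
```

Since the resilience roughly halves at every level, the recursion has logarithmic depth in f, which is where the logarithmic factors in the induction come from.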
Plugging in the phase king protocol [4], which has optimal resilience, running time O(f ), and constant message size, we can extract a strong pulser that is optimally resilient, has asymptotically optimal stabilisation time, and message size O(log f ). We obtain efficient solutions to the firing squad and synchronous counting problems. Proof. We use Corollary 5 with Ψ ∈ O(f ) being the running time of the phase king protocol [4], followed by applying Theorem 2 to the obtained pulser and the phase king protocol.

Corollary 7.
For any C, f ∈ N and n > 3f , an f -resilient C-counter on n nodes with stabilisation time O(f + log C) and message size O(log f ) exists.
Proof. In the last step of the construction of Theorem 6, we do not use Corollary 1 to extract a strong pulser, but directly obtain a counter using Theorem 3. This avoids the overhead of Ψ due to waiting for the next pulse. Recalling that the o(Ψ) term in the complexity comes from the O(log Ψ) additive overhead in time of the multi-value consensus routine, the claim follows.
We remark that one can strengthen the bound on the stabilisation time to O(f + (log Ψ)/B) using messages of size B, by using larger messages in the reduction given by Theorem 1 [27]. However, this affects the asymptotic stabilisation time only if Ψ is super-exponential in f .

Probabilistic sublinear-time algorithms
So far, we have confined our discussion to the deterministic setting. However, it is straightforward to adapt our framework to also utilise randomised consensus routines, which can break the linear-in-f bound for consensus [21] and attain better bit complexities than deterministic algorithms [24]. Indeed, Ben-Or et al. [3] have shown how to obtain randomised counting algorithms that stabilise in O(1) expected time. However, these algorithms rely on a shared coin, which is costly in terms of communication.
We now use our framework to obtain fast and communication-efficient probabilistic pulsers that stabilise in polylog f communication rounds, where algorithms need to broadcast only polylog f bits per round. Here, a probabilistic pulser means that after stabilisation the pulser P may fail to behave correctly in round t ≥ T (P) with some small positive probability after which it needs to re-stabilise again.

Using probabilistic consensus routines
For our framework, we require that the running time of the underlying consensus algorithms satisfy deterministic running time bounds, while we allow for a probabilistic guarantee on the agreement and validity properties. That is, we need Monte Carlo consensus algorithms. Accordingly, we demand that the agreement and validity properties of the Monte Carlo consensus algorithm hold with probability 1 − p, where the probability of failure is p ≤ 1/f^c for a sufficiently large constant c. Noting that our recursive construction of strong pulsers involves f^{O(1)} calls to the utilised consensus routine within f^{O(1)} rounds, it follows from the union bound that with probability at least 1 − f^{O(1)} · p all consensus instances succeed. These observations give the following generalisation of Theorem 6.
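The union bound argument is simple arithmetic, which the following Python sketch spells out with exact fractions. The concrete exponents are placeholders: `k` stands for the hidden constant in the f^{O(1)} number of consensus calls and is an assumption of this example, as is the function name `union_bound_success`.

```python
from fractions import Fraction


def union_bound_success(f, c, k):
    """Lower bound on the probability that all f**k consensus calls succeed,

    given that each call fails with probability at most 1 / f**c.
    """
    p = Fraction(1, f ** c)          # per-call failure probability
    calls = f ** k                   # assumed number of calls, f^{O(1)}
    return 1 - calls * p             # union bound over all calls


print(union_bound_success(10, 5, 2))
```

For f = 10, c = 5 and k = 2, the bound evaluates to 1 − 10^2/10^5 = 999/1000, which shows why c must exceed the hidden exponent for the bound to be meaningful.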
Theorem 7. Suppose that for constant ε ≥ 0 we are given a family of f-resilient consensus algorithms C(f) running on any number n > (3 + ε)f of nodes in T(C(f)) rounds using M(C(f))-bit messages, where T(C(f)) and M(C(f)) are increasing in f, and C(f) fails with probability p ≤ 1/f^c for sufficiently large c ∈ O(1). Then, for any Ψ, f ∈ N and n > (3 + ε)f, a strong probabilistic Ψ-pulser P on n nodes with exists, where for f = 0 the sums are empty and on any round t ≥ T(P) the algorithm P fails with probability f^{O(1)} p (and then needs to re-stabilise).
The additional reservation that C may require n > (3 + ε)f accounts for the fact that various randomised consensus protocols have slightly suboptimal resilience. Note also that any further model requirements of the randomised consensus protocols, such as private channels, of course still apply when employing our framework.

Probabilistic pulsers, counting and firing squads
As a concrete example, we plug in the consensus algorithm by King and Saia [24], as it satisfies the properties we need. We now make the additional assumptions that (1) the number of faults is restricted to f < n/(3 + ε) (for an arbitrarily small constant ε > 0) and (2) communication is via private channels, i.e., the behaviour of faulty nodes in round t is a function of all communication from correct nodes to faulty nodes in rounds t' ≤ t.

Theorem 8 ([24]). There exists a protocol C that with probability 1 − 1/f^c solves consensus in polylog f rounds using messages of size polylog f, provided f < n/(3 + ε) and communication is via private channels.
We remark that the consensus algorithm from [24] actually limits the number of bits sent by each node to O(√n polylog n), but in our framework each node broadcasts Ω(log f) bits per round. As before, we can obtain efficient probabilistic counting and firing squad algorithms from the probabilistic pulsers. We note that we choose a failure probability of 1/f^{Θ(1)} for illustrative purposes; by increasing the running time of the underlying consensus routine (incurring the corresponding linear increase in stabilisation time), one can decrease the failure probability exponentially.

Extensions to other fault models
In this section, we utilise our framework under more benign fault models than the one given by Byzantine faults. This allows us to tolerate a larger number of faulty nodes: for example, while Byzantine faults can be tolerated only if f < n/3, it is possible to tolerate any number f < n of crash faults or f < n/2 send omission faults.
We start by giving a simple and efficient algorithm for synchronous counting under crash faults; here, our framework is overkill, and a direct approach suffices. Together with the approach used in Section 3 and a crash-tolerant consensus algorithm, we readily obtain an efficient firing squad protocol in the crash fault setting. After this, we illustrate how to modify the construction of strong and weak pulsers given in Section 4 and Section 5 to work with omission faults. This highlights one of the key features of our construction: the resilience of the underlying consensus routine essentially dictates what kind of, and how many, permanent faults our self-stabilising counting and firing squad algorithms tolerate, while only minor modifications to the various voting steps used in the construction are needed.

Counting and firing squads under crash faults
Crash faults are perhaps the most benign fault type: the nodes do not send misinformation and, in the synchronous setting, all nodes can eventually detect which nodes have crashed. Thus, unlike in the Byzantine setting, designing algorithms under crash faults is relatively easy, as nodes crash cleanly and cause no further trouble.
Definition 4 (Crash faults). A crashing node stops executing the algorithm in some round r ∈ N. In this round, the node manages to send only a subset of the messages it would send if it ran correctly. Thus, only a subset of the respective recipients receive a message from the crashed node in this round. The remaining nodes (and, in rounds r' > r, all nodes) receive no message.
The benign nature of crash faults allows us to use stricter requirements in the synchronous counting and firing squad problems, as we will see. In the following, let us use F(t) ⊆ V to denote the set of nodes that have crashed before or in round t.
Optimal crash-tolerant counting. Let us start with a definition of the synchronous counting problem under crash faults. The problem is defined similarly as in the case of Byzantine faults, but with the requirement that agreement and consistency are satisfied by the set of currently non-crashed nodes.
Definition 5 (Counting with crash faults). In synchronous C-counting with crash faults, an execution of an algorithm stabilises in round t ∈ N if and only if for all t ≤ t' ∈ N the output counters c(·) satisfy
• Agreement: c(v, t') = c(w, t') for all v, w ∈ V \ F(t'),
• Consistency: c(v, t' + 1) = c(v, t') + 1 mod C for all v ∈ V \ F(t' + 1).
We now give a simple counting algorithm that attains optimal stabilisation time and resilience under crash faults. Let c(v, t) ∈ [C] be a local variable that indicates the counter value of node v on round t. On every round, every node v broadcasts the value c(v, t) to all other nodes. For every u, v ∈ V, let c(v, u, t) ∈ [C] ∪ {*} denote the value node v receives from node u at the start of round t + 1. Here, we use the special value * to indicate that node v received no message from node u. Observe that for any non-crashed node v ∈ V \ F(t + 1), we have c(v, u, t + 1) = * for all crashed nodes u ∈ F(t).
Let U(v, t) = {u ∈ V : c(v, u, t) ≠ *} be the set of nodes v received a message from at the start of round t + 1. Node v updates its counter value on round t + 1 by picking the majority value among the values it received and incrementing it:
c(v, t + 1) = majority{c(v, u, t) : u ∈ U(v, t)} + 1 mod C.
Lemma 9. Suppose no node crashes on round t. Then for any u, v ∈ V \ F(t') and all t' > t, we have that c(u, t') = c(v, t') and c(u, t' + 1) = c(u, t') + 1 mod C.
Proof. Since no node crashes on round t, we have that U(u, t) = U(v, t). Hence, both u and v set the same value x for their counter for round t + 1 when using the above update rule and we have c(v, t + 1) = c(u, t + 1). It remains to argue that non-crashed nodes will never disagree on their counter values after round t + 1. To this end, suppose all non-crashed nodes agree on the output on some round t', that is, there exists x ∈ [C] such that for all v ∈ V \ F(t') we have c(v, t') = x. Now for any v ∈ V \ F(t' + 1) and each w ∈ U(v, t') it holds that c(v, w, t') = x. Thus, by the above update rule, node v ∈ V \ F(t' + 1) satisfies c(v, t' + 1) = x + 1 mod C.
Theorem 9. Let C > 1 and f < n. There exists a synchronous C-counter for n nodes that tolerates f crash faults and stabilises in f + 1 rounds, where each node broadcasts log C bits every round. Moreover, if no node crashes on some round t < f + 1, then the algorithm stabilises on round t + 1.
Proof. Since there are at most f crash faults, there exists a round t < f + 1 in which no node crashes. Applying Lemma 9 to this round implies that the algorithm stabilises. Since nodes only need to communicate their current counter values every round, a node needs to broadcast at most log C bits every round.
The above algorithm has exactly optimal stabilisation time: it is known that any t-round counting algorithm solves consensus in t rounds [16], but even under crash faults consensus requires f + 1 rounds [1]. Moreover, the algorithm is "early-stabilising" in the sense that if there is no crash on some round t, then the algorithm stabilises on round t + 1 even if some nodes crash on later rounds t' > t. Finally, the message size is optimal in the worst case: if there are no crashes on the first round, then it is necessary for the correct nodes to communicate log C bits to stabilise in one round.
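The majority-vote counter described above can be simulated in a few lines. The following is a minimal sketch, not the authors' implementation: the function names, the crash schedule format, and the tie-breaking rule (smallest value wins) are assumptions made for illustration; any deterministic tie-breaking rule shared by all nodes would do.

```python
from collections import Counter


def majority(values):
    """Most frequent value; ties go to the smallest value (an assumption)."""
    counts = Counter(values)
    best = max(counts.values())
    return min(v for v, k in counts.items() if k == best)


def simulate_counting(init, C, crashes, rounds):
    """Simulate the crash-tolerant majority-vote C-counter.

    init:    arbitrary initial counter values in range(C), one per node
    crashes: {round: [(node, recipients), ...]}; the node crashes in that
             round and its final broadcast reaches only `recipients`
    Returns one snapshot {node: counter} of the non-crashed nodes per round.
    """
    counter = dict(enumerate(init))
    alive = set(counter)
    history = [dict(counter)]
    for t in range(rounds):
        crashing = dict(crashes.get(t, []))
        # inbox[v]: values that v receives at the start of round t + 1
        inbox = {v: [] for v in alive if v not in crashing}
        for u in alive:
            recipients = crashing[u] if u in crashing else alive
            for v in recipients:
                if v in inbox:
                    inbox[v].append(counter[u])
        alive -= set(crashing)
        # Update rule: majority of the received values, plus one, modulo C.
        counter = {v: (majority(inbox[v]) + 1) % C for v in alive}
        history.append(dict(counter))
    return history


# Node 0 crashes in round 0 and reaches only node 1, which causes a
# temporary disagreement that the crash-free round 1 repairs.
hist = simulate_counting([1, 1, 0, 2, 3], C=4, crashes={0: [(0, {1})]}, rounds=4)
```

In this run the surviving nodes disagree after round 0, agree after the first crash-free round, and count modulo 4 from then on, matching the early-stabilising behaviour claimed for the algorithm.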
Asymptotically optimal crash-tolerant firing squads. Let us now consider the firing squad problem under crash faults. Observe that Dolev et al. [14] give a crash-tolerant firing squad algorithm with exactly optimal stabilisation and response time. However, their algorithm uses messages of size O(f log f ) for f ∈ Θ(n). We now show that if one relaxes the stabilisation and response times to be asymptotically optimal, then messages of size O(log f ) suffice.
Definition 6 (Firing squad with crash faults). In the firing squad problem with crash faults, we say that an execution of an algorithm stabilises in round t ∈ N if the following properties hold: (ii) FIRE(v, t') = 0 for all t' ∈ {t_G + 1, . . . , t_F − 1}.
• Liveness: If GO(v, t_G) = 1 for v ∈ V \ F(t_G + 1) and t ≤ t_G ∈ N, then FIRE(v, t_F) = 1 for all nodes v ∈ V \ F(t_F) and some t_G < t_F ∈ N.
In Section 3 we saw that the firing squad problem can be solved easily using consensus and a strong pulser algorithm. The same reduction also works under crash faults. The only difference is that we need to modify the second line of the firing squad algorithm given in Section 3.2: we replace the condition of seeing GO(w, t − 1) = 1 from at least f + 1 nodes w with seeing GO(w, t − 1) = 1 from at least one node w. This yields a result analogous to Theorem 2 under crash faults.
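The change of threshold can be illustrated with a small predicate. This is a hypothetical sketch: the names `go_threshold` and `saw_go` and the string labels for the fault models are introduced here for illustration, and the rest of the firing squad algorithm of Section 3.2 is not reproduced. The omission case uses the same threshold as the crash case, as stated later in the text.

```python
def go_threshold(f, fault_model):
    """Number of GO reports a node must see before it acts on them."""
    if fault_model == "byzantine":
        # f + 1 reports guarantee that at least one came from a correct node.
        return f + 1
    if fault_model in ("crash", "omission"):
        # Benign faults never forge a GO signal, so one report suffices.
        return 1
    raise ValueError(f"unknown fault model: {fault_model}")


def saw_go(go_reports, f, fault_model):
    """go_reports: values GO(w, t - 1) received from the other nodes w."""
    count = sum(1 for g in go_reports if g == 1)
    return count >= go_threshold(f, fault_model)
```

For example, with f = 1 a single report of GO = 1 triggers the crash-tolerant variant but not the Byzantine one.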
Similarly, in the case of consensus under crash faults, the agreement and validity conditions need to be satisfied by all non-crashed nodes at the end of the execution. In this setting, consensus can be solved in f + 1 rounds using 1-bit messages [33]. For example, we can adapt the same majority voting technique as in the counting algorithm above for f + 1 rounds to solve consensus as well. For C = f + 1, we can use a crash-tolerant C-counter to obtain a crash-tolerant strong C-pulser using Lemma 1. Using similar arguments as in Theorem 2, we obtain the following result.
Corollary 11. For any f ∈ N and n > f , there exists an f -crash-tolerant firing squad on n nodes with stabilisation and response times of O(f ) and message size O(log f ).

The framework under omission faults
We consider a fault type that falls between crash and Byzantine faults: omission faults. The case of omission faults is more challenging than crash faults, as faulty nodes may drop some of the messages while still continuing to participate in the execution of the algorithm indefinitely. For simplicity, we focus on send omission faults, as our primary goal here is to demonstrate the flexibility of our framework. One could also consider, e.g., receive or general omission faults [33].
Definition 7 (Omission faults). We say that a node v ∈ V suffers from (send) omission faults if in each round r the messages sent by v are received only by some (arbitrary) subset U(r) ⊆ V of the nodes. The remaining nodes in V \ U(r) receive no message from v.
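A single communication round under send omission faults can be modelled as follows. This is an illustrative sketch only; the function name `deliver_round` and the dictionary-based message format are assumptions, not part of the paper's model.

```python
def deliver_round(outgoing, omission_sets):
    """Deliver one round of broadcasts under send omission faults.

    outgoing:      {sender: message} for every node in the system
    omission_sets: {faulty_sender: set of nodes that still receive its
                    message this round}; senders not listed are correct
    Returns {receiver: {sender: message}}.
    """
    nodes = set(outgoing)
    inbox = {v: {} for v in nodes}
    for u, msg in outgoing.items():
        # A correct sender reaches everyone; a faulty one only U(r).
        recipients = omission_sets.get(u, nodes)
        for v in recipients & nodes:
            inbox[v][u] = msg
    return inbox


# Node 2 is faulty and reaches only node 0 in this round.
inbox = deliver_round({0: "a", 1: "b", 2: "c"}, {2: {0}})
```

Note that the faulty node 2 still receives the messages of the correct nodes 0 and 1; only its outgoing messages are dropped.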
Note that under send omission faults, the faulty nodes still receive messages from correct nodes. Hence, we modify the definitions of synchronous counting and firing squad problems as follows.
Definition 8 (Counting with omission faults). In the synchronous C-counting problem with omission faults, we require that the agreement and consistency conditions are satisfied by all nodes.
Definition 9 (Firing squad with omission faults). In the firing squad problem with omission faults, the agreement, safety, and liveness conditions are adapted as follows. We say that an execution of an algorithm stabilises in round t ∈ N if the following three properties hold:
• Agreement: FIRE(v, t') = FIRE(w, t') for all v, w ∈ V and t ≤ t' ∈ N.
• Safety: If FIRE(v, t_F) = 1 for v ∈ V and t ≤ t_F ∈ N, then there is t_F ≥ t_G ∈ N such that (i) GO(w, t_G) = 1 for some w ∈ V, (ii) FIRE(v, t') = 0 for all t' ∈ {t_G + 1, . . . , t_F − 1}.
• Liveness: If GO(v, t_G) = 1 for v ∈ V \ F and t ≤ t_G ∈ N, then FIRE(v, t_F) = 1 for all nodes v ∈ V and some t_G < t_F ∈ N.
Finally, we remark that the definition of consensus also needs to be adapted in the case of send omission faults: termination, agreement, and validity apply to all nodes in the system.
Adjustments to the basic framework. As Byzantine faults also cover omission faults, our framework could be used as-is with minimal modifications. However, weaker fault types permit a larger number of faults to be tolerated. Moreover, we can readily employ consensus protocols tailored for different fault types from the literature by slightly adapting the voting schemes used in our constructions outside the consensus routines. In addition, we must adapt our reductions of multivalue consensus and silent consensus to standard binary consensus. More precisely, for each fault type, we need to address and handle the following issues:
1. Each node broadcasts its input bit by bit. There is a unique input x that can be received n − f > f times by any node (the threshold must be met for every bit, but the senders may differ). If v receives such an input, it stores it and sends it again bit by bit; if not, it sends nothing in this second transmission. If x is received at least n − f times by v ∈ V in this second iteration, v uses input 1 in a call to the binary consensus routine, otherwise 0. If v received any value x in this second transmission, it returns it in case the consensus routine outputs 1. If the routine outputs 0, it returns 0. Note that if any node used input 1, it received x at least f + 1 times in the second iteration, so at least one correct node sent x, entailing that every node received x. Thus, agreement holds by the properties of the binary consensus routine. Likewise, validity of the latter implies validity of the former: if all nodes have the same input x, it is received n − f times by each node in both iterations.
2. Again, we replace the threshold of receiving GO(w, t − 1) = 1 from f + 1 nodes w ∈ V with the threshold of receiving GO(w, t − 1) = 1 from any node w ∈ V in Step 2 of the firing squad algorithm; adjusting the proof of Theorem 2 is straightforward.
3. In the filtering construction, we relax the requirement from f < n/3 to f < n/2. The only change is that l_i(v, t + 1) is set to 0 if there is any node u sending m_i(u, t) = 1.
One can readily check that this does not affect the correctness of Lemma 2, Lemma 3, or Lemma 4. Concerning Lemma 5, observe that any node having b_i(v, t) = 1 implies M_i(v, t) = 1 and thus m_i(w, t − 1) = 1 for at least n − f > f nodes w ∈ V. Hence, each node u ∈ V receives m_i(w, t − 1) = 1 from at least one node w ∈ V and sets l_i(u, t) = 0. Lemma 5 now follows by similar reasoning as in the Byzantine case.
4. We follow the same strategy as for the Byzantine case. In the first two rounds, a node sets its input to 0 if it receives the value 1 fewer than n − f times. Any node receiving a message in the first round participates in the execution of the (non-silent) binary consensus protocol. Each node returns 0 if it received no message in the second round, if it was forced to abort the binary consensus protocol due to a violation of the message size bound or an otherwise invalid execution, or if the binary consensus protocol returned 0. If a node does not participate, there are at most f nodes with non-zero input, implying that no node receives a message in the second round. Thus, agreement holds in this case. If a node uses input 1 for the call to the non-silent consensus routine, all nodes participate, as at least f + 1 nodes sent 1 in the first round. Thus, agreement follows from the correct execution of the non-silent protocol. Silence and validity are easily verified.

5. We modify Step 1 of the pruning algorithm to set r_i(v, t + 1) = 1 if v received b_i(w, t) = 1 from any w ∈ V. It follows that if any node v ∈ V uses input 1 for a consensus instance whose first round is simulated in round t, each node received b_i(v, t − 1) = 1 and thus participates in the instance. Moreover, if all nodes have b_i(v, t − 1) = 1, all use input 1 for the instance. Similar reasoning to the Byzantine case now establishes the required properties of the pruning routine.
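The local decisions in the multivalue-to-binary reduction of item 1 can be sketched as follows. This is a simplified illustration under stated assumptions: whole values are exchanged instead of individual bits, `None` stands for "no message received", the binary consensus routine is treated as a black box whose output bit is passed in, and all function names are hypothetical.

```python
from collections import Counter


def threshold_value(received, n, f):
    """Return the value received at least n - f times, or None.

    Since n - f > n / 2 when f < n / 2, at most one value can qualify.
    """
    counts = Counter(x for x in received if x is not None)
    for value, k in counts.items():
        if k >= n - f:
            return value
    return None


def binary_input(second, n, f):
    """Input bit for binary consensus, based on the second transmission."""
    return 1 if threshold_value(second, n, f) is not None else 0


def decide(second, consensus_bit):
    """Final output: the value seen in the second transmission if the
    binary consensus routine output 1, and the default 0 otherwise."""
    seen = [x for x in second if x is not None]
    if consensus_bit == 1 and seen:
        return seen[0]
    return 0


# Fault-free example with n = 5, f = 2 and identical inputs 7.
n, f = 5, 2
first = [7] * n                           # what each node receives in round one
candidate = threshold_value(first, n, f)  # stored and re-sent by every node
second = [candidate] * n                  # what each node receives in round two
bit = binary_input(second, n, f)          # all nodes propose 1
output = decide(second, consensus_bit=bit)
```

In the example every node proposes 1, so by validity the binary routine outputs 1 and each node returns the common input, which is the validity argument given in item 1.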
Results for omission faults. None of the above modifications change message size or time bounds, implying that we can feed the modified machinery with an arbitrary binary consensus algorithm resilient to f < n/2 omission faults to obtain results analogous to the Byzantine case.
Theorem 10. Suppose that we are given a family of f -omission-resilient deterministic consensus algorithms C(f ) running on any number n > 2f of nodes in T (C(f )) rounds using M (C(f ))-bit messages, where T (C(f )) and M (C(f )) are non-decreasing in f . Then, for any Ψ, f ∈ N and n > 2f , a strong f -omission-resilient Ψ-pulser P on n nodes with