Self-Stabilizing Byzantine Clock Synchronization with Optimal Precision

Khanchandani, Pankaj; Lenzen, Christoph

doi:10.1007/s00224-017-9840-3

Open access
Published: 20 January 2018

Volume 63, pages 261–305, (2019)
Theory of Computing Systems

Abstract

In the Byzantine-tolerant clock synchronization problem, the goal is to synchronize the clocks of n fully connected nodes. The clocks run at rates between 1 and 𝜗 > 1, and messages have a delay (including computation) between d − U and d. Moreover, up to f < n/3 of the nodes can fail by deviating arbitrarily from the protocol, i.e., are Byzantine. Despite this interference, correct nodes need to generate distinguished events (or pulses) almost simultaneously and periodically. The quality of the solution is measured by the skew, which is the maximum real time difference between corresponding pulses. In the self-stabilizing setting, in addition we allow for transient failures, possibly of all nodes. Once transient faults have ceased and at most f nodes remain faulty, the system should start generating synchronized pulses again. We design a self-stabilizing solution to this problem with asymptotically optimal skew. We achieve our goal by refining and extending the protocol of Lynch and Welch and make the following contributions in the process.

We give a simple analysis of the Lynch and Welch protocol with improved bounds on skew and tolerable difference in clock rates by rebuilding upon the main ingredient of their protocol, called approximate agreement.
We give a modified version of the protocol so that the frequency and amount of communication between the nodes is reduced. The modification adds a step to adjust the clock rates by another application of approximate agreement. The skew bound achieved is asymptotically optimal for suitable choices of parameters.
We present a method to add self-stabilization to the above protocols while preserving their skew bounds. The heart of the method is a coupling scheme that leverages a self-stabilizing protocol with a larger skew.

1 Introduction

When designing a synchronous distributed system, the most fundamental question is how to generate a system clock at all the n nodes, i.e., how to periodically generate a distinguished event or pulse at each node so that the actual time of the i^th pulse at each node is close to the actual time of the i^th pulse of any other node. This clock synchronization problem is easily solved if each node is reliable and equipped with an accurate clock. However, neither is always the case. For instance, in space applications accurate clocks such as quartz oscillators are prone to failure, so less accurate electronic oscillators are preferable, and nodes are subject to radiation-induced transient faults. Thus, nodes have to frequently adjust their clocks by sending and receiving messages and executing a suitable algorithm. The inaccuracy in the clocks is modelled by assigning a clock rate or frequency that varies at each node, but within fixed bounds. We measure the precision of the algorithm by skew, which is the maximum over all pulses i and pairs of (correct) nodes of the time difference between the i^th pulses of the respective nodes.

The clock synchronization task is mission critical, both in terms of performance and reliability. Therefore, fault-tolerant distributed clock synchronization algorithms have found their way into real-world systems with high reliability demands. For example, the Time-Triggered Protocol (TTP) [13] and FlexRay [9, 10] tolerate Byzantine failure (i.e., arbitrary out-of-spec behavior) of less than n/3 nodes and are utilized in cars and airplanes. This means that these algorithms guarantee that correct nodes continue to generate synchronized pulses. They are based on the classic Byzantine clock synchronization algorithm by Lynch and Welch [19].

Another application domain with even more stringent requirements is hardware for spacecraft and satellites. Here, a reliable system clock is in demand despite frequent transient failure of any number of nodes due to radiation. The property to recover from an unknown state once the transient failures have stopped is known as self-stabilization. This is essential for the space domain, but also highly desirable in the systems utilizing TTP or FlexRay. This claim is supported by the presence of various mechanisms that monitor the nodes and perform resets in case of observed faulty behavior in both protocols. Thus, it is of interest to devise synchronization algorithms that stabilize on their own, instead of relying on monitoring techniques: these need to be highly reliable as well, or their failure may bring down the system due to erroneous detection of or response to faults.

Thus, self-stabilizing Byzantine clock synchronization algorithms with small skew have critical and useful applications and, accordingly, have received significant attention in the past (e.g., [2, 8]). However, existing algorithms cannot achieve asymptotically optimal skew. Our key motivation and main goal is to build a self-stabilizing Byzantine clock synchronization algorithm with asymptotically optimal skew.

Our Contribution

We achieve our main goal by building upon and extending the approach given by Lynch and Welch [19] to solve the Byzantine clock synchronization problem. The approach uses approximate agreement [7] repeatedly to adjust the time of the next clock pulse. In the process of achieving our goal, we make the following contributions.

1.
We present a simplified analysis of the Lynch-Welch algorithm. We show that the algorithm converges to a steady-state error E ∈ O((𝜗 − 1)d + U), where hardware clock rates are between 1 and 𝜗 and messages take between d − U and d time to arrive at their destination. This works even for very inaccurate clocks: it suffices if 𝜗 ≤ 1.1, although the skew bound goes to infinity as 𝜗 approaches the critical value.^{Footnote 1} However, for, e.g., 𝜗 ≤ 1.01, Theorem 1 bounds the skew by E(𝜗, d, U) ≤ 2.222(𝜗 − 1)d + 4.533U.
2.
We give a conceptually simple extension of the previous algorithm that, in addition to changing the (logical) clock values, also adjusts the clock rates using approximate agreement. If the clocks are sufficiently stable, i.e., the maximum rate of change ν of clock rates is sufficiently small, then we can significantly increase the nominal round length T and decrease the frequency of communication without substantially affecting skew. Concretely, if 𝜗 ≤ 1.01, max{F, U} ≪ T (where nodes’ clocks are initialized within F of each other), and max{(𝜗 − 1)²T, νT²} ≪ U, it is possible to guarantee a skew of O(U) (see Corollary 12 and subsequent explanation), which is asymptotically optimal.
3.
We introduce a generic scheme that enables making either of these algorithms self-stabilizing. The scheme couples one of the above (non-stabilizing) algorithms with a self-stabilizing Byzantine clock synchronization algorithm of larger skew 2d.^{Footnote 2} The coupled algorithm is both self-stabilizing and has the original smaller skew of the non-stabilizing algorithm (Theorem 4 and Theorem 5). The self-stabilizing Byzantine clock synchronization algorithm that we utilize is FATAL [4, 5], which already offers a suitable interface to our coupling mechanism.

On the technical side, the first two results require little innovation compared to prior work. However, it proved challenging to obtain simple algorithms that also achieve tight skew bounds. The effort spent was worthwhile for two reasons.

1.
A prototype FPGA implementation [12] strongly indicates that these algorithms are also easy to implement in hardware.^{Footnote 3}
2.
There is no mathematical analysis of a clock rate or frequency correction scheme in the literature that can be readily applied to yield accurate bounds for simple algorithms. We provide such a tailored analysis of our second algorithm.

To clarify the second point, we first note that the framework in [16, 17] does address frequency correction, but would require substantial specialization, including its mathematical analysis, to achieve good constants in the bounds. Second, the FlexRay algorithm also adjusts frequencies, but differs from our second algorithm in a crucial point. To prevent the approximate agreement scheme from becoming ineffective because nodes reach the imposed limits on adjusting their frequency,^{Footnote 4} we add a correction that slowly pulls nodes’ frequencies back to the nominal rate. Without this provision, it is straightforward to construct executions in which, e.g., the majority of the nodes run too fast for another node to sufficiently adjust its clock rate to match their speed. This means that, in the worst case, FlexRay’s frequency correction is futile.

In contrast to the above contributions, the coupling scheme we use to combine our non-stabilizing algorithms with the FATAL algorithm showcases a novel technique of independent interest. We leverage FATAL’s clock “beats” to effectively (re-)initialize the synchronization algorithm we couple it to. Here, care has to be taken to prevent such resets from occurring during regular operation of the non-stabilizing algorithms, as this could result in large skews or even spurious clock pulses. The solution is a feedback mechanism that enables the synchronization algorithm to actively trigger the next beat of FATAL at the appropriate time. FATAL stabilizes regardless of how these feedback signals behave, while actively triggering beats ensures that all nodes pass the checks which, if failed, cause the respective node to be reset. While a specific interface is required from the stabilizing algorithm to permit this approach, it seems likely that most, if not all, self-stabilizing synchronization algorithms could be modified to provide it. Thus, we consider the technique a highly useful separation of the tasks to achieve small skews and to ensure (fast) stabilization.

Organization of the Paper

After presenting related work and the model, we proceed in the order of the contributions listed above: simplified phase synchronization (Section 4), frequency synchronization (Section 5), and finally the coupling scheme adding self-stabilization (Section 6). Section 7 concludes the paper.

2 Related Work

TTP [13] and FlexRay [9, 10] are both implemented in software (barring minor hardware components). This is sufficient for their application domain, in which synchronous communication between hardware components at frequencies in the megahertz range is required. Solutions fully implemented in hardware are of interest for two reasons. First, having to implement the full software abstraction dramatically increases the number of potential reasons for a node to fail – at least from the point of view of the synchronization algorithm. A slim hardware implementation is thus likely to result in a substantially higher degree of reliability of the clocking mechanism. Second, if higher precision of synchronization is required, the significantly smaller delays incurred by dedicated hardware make it possible to meet these demands.

Apart from these issues, the complexity of a software solution renders TTP and FlexRay unsuitable as fault-tolerant clocking schemes for VLSI circuits. The DARTS project [3, 11] aimed at developing such a scheme, with the goal of coming up with a robust clocking method for space applications. Instead of being based on the Lynch-Welch approach, it implements the fault-tolerant synchronization algorithm by Srikanth and Toueg [18]. Unfortunately, DARTS falls short of its design goals in two ways. On the one hand, the Srikanth-Toueg primitive achieves skews of Θ(d), which tend to be significantly larger than those attainable with the Lynch-Welch approach.^{Footnote 5} Accordingly, the operational frequency DARTS can sustain (without large communication buffers and communication delays of multiple logical rounds) is in the range of 100MHz, i.e., about an order of magnitude smaller than typical system speeds. Moreover, DARTS is not self-stabilizing. This means that DARTS – just like TTP and FlexRay – is unlikely to successfully cope with high rates of transient faults. Worse, the rate of transient faults will scale with the number of nodes (and thus sustainable faulty nodes). For space environments, this implies that adding fault-tolerance without self-stabilization cannot be expected to increase the reliability of the system at all.

These concerns inspired a follow-up work called FATAL, which seeks to overcome the downsides of DARTS. From an abstract point of view, FATAL [4, 5] can be interpreted as another incarnation of the Srikanth-Toueg approach. However, FATAL combines tolerance to Byzantine faults with self-stabilization in O(n) time with probability 1 − 2^{−Ω(n)}; after recovery is complete, the algorithm maintains correct operation deterministically. Like DARTS, FATAL and the substantial line of prior work on Byzantine self-stabilizing synchronization algorithms (e.g., [2, 8]) cannot achieve better clock skews than Θ(d). The key motivation for the present paper is to combine the better precision achieved by the Lynch-Welch approach with the self-stabilization properties of FATAL.

Concerning frequency correction, little related work exists. A notable exception is the extension of the interval-based synchronization framework to rate synchronization [16, 17]. In principle, it seems feasible to derive similar results by specialization and minor adaptions of this powerful machinery to our setting. Unfortunately, apart from the technical hurdles involved, an educated guess (based on the amount of necessary specialization and estimates that need to be strengthened) results in worse constants and more involved algorithms, and it is unclear whether our approach to self-stabilization can be fitted to this framework. However, it is worth noting that the overall proof strategies for our (non-stabilizing) phase and frequency correction algorithms bear notable similarities to the generic framework: separately deriving bounds on the precision of measurements, plugging these into a generic convergence argument, and separating the analysis of frequency and phase corrections.

Coming to lower bounds and impossibility results, the following is known.

In a system of n nodes, no algorithm can tolerate ⌈n/3⌉ Byzantine faults. All mentioned algorithms are optimal in that they tolerate ⌈n/3⌉ − 1 Byzantine faults [6].
To tolerate this number of faults, Ω(n²) communication links are required.^{Footnote 6} All mentioned algorithms assume full connectivity and communicate by broadcasts (faulty nodes may not adhere to this). Less well-connected topologies are outside the scope of this work.
The worst-case precision of an algorithm cannot be better than (1 − 1/n)U in a network where communication delays may vary by U [15]. In the fault-free case and with 𝜗 − 1 sufficiently small, this bound can be almost matched (cf. Section 4); all variants of the Lynch-Welch approach match this bound asymptotically, granted sufficiently accurate local clocks.
Trivially, the worst case precision of any algorithm is at least (𝜗 − 1)T if nodes exchange messages every T time units. Moreover, a simple indistinguishability argument shows a lower bound of (𝜗 − 1)d, regardless of T. In the fault-free case, this is essentially matched by our phase correction algorithm as well.
With faults, the upper bound on the skew of the algorithm increases by factor 1/(1 − α), where α ≈ 1/2 if 𝜗 ≈ 1. It appears plausible that this is optimal under the constraint that the algorithm’s resilience to Byzantine faults is optimal, due to a lower bound on the convergence rate of approximate agreement [7].

Overall, the resilience of the presented solution to faults is optimal, its precision asymptotically optimal, and it seems reasonable to assume that there is little room for improvement in this regard. In contrast, no non-trivial lower bounds on the stabilization time of self-stabilizing fault-tolerant synchronization algorithms are known. Very recently, it has been shown that stabilization time O(log n) can be achieved, and that stabilization time polylog n is possible with nodes broadcasting only polylog n bits per time unit [14]. The same coupling strategy as presented in this work could be applied to these algorithms, achieving much faster overall stabilization.

3 Model

We assume a fully connected system of n nodes, up to f := ⌊(n − 1)/3⌋ of which may be Byzantine faulty (i.e., arbitrarily deviate from the protocol). We denote by V the set of all nodes and by C ⊆ V the subset of correct nodes, i.e., those that are not faulty.

Communication is by broadcast of “pulses,” which are messages without content: the only information conveyed is when a node transmitted a pulse. Nodes can distinguish between senders; this is used to distinguish the case of multiple pulses being sent by a single (faulty) node from multiple nodes sending one pulse each. Note that faulty nodes are not bound by the broadcast restriction, i.e., may send a pulse to a subset of the nodes only. The system is semi-synchronous. A pulse sent by node v ∈ C at (Newtonian) time $p_{v}\in \mathbb {R}_{0}^{+}$ is received by node w ∈ C at time t_vw ∈ [p_v + d − U, p_v + d]; we refer to d as the maximum message delay (or simply delay) and to U as the delay uncertainty (or simply uncertainty).

For these timing guarantees to be useful to an algorithm, the nodes must have a means to measure the progress of time. Each node v ∈ C is equipped with a hardware clock H_v, which is modeled as a strictly increasing function $H_{v}:\mathbb {R}^{+}_{0}\to \mathbb {R}^{+}_{0}$. We require that there is a constant 𝜗 > 1 such that the following holds for all times t < t^′.

$$t^{\prime}-t\leq H_{v}(t^{\prime})-H_{v}(t)\leq \vartheta (t^{\prime}-t) $$

In other words, the hardware clocks have bounded drift.^{Footnote 7} We remark that our results can be easily translated to the case of discrete and bounded clocks.^{Footnote 8} We refer to H_v(t) as the local time of v at time t.
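To make the bounded-drift condition concrete, the following Python sketch models a hardware clock as a piecewise-linear function whose rate stays in [1, 𝜗] and checks the inequality above on sampled time pairs. The class name `HardwareClock`, the segment representation, and all numeric values are illustrative assumptions and not part of the model itself.

```python
import random


class HardwareClock:
    """Piecewise-linear clock H(t) with rates in [1, theta] (illustrative model)."""

    def __init__(self, theta, segment_length, num_segments, rng):
        self.theta = theta
        self.segment_length = segment_length
        # One rate per segment, drawn from the admissible interval [1, theta].
        self.rates = [rng.uniform(1.0, theta) for _ in range(num_segments)]
        # Clock value at the start of each segment.
        self.starts = [0.0]
        for rate in self.rates[:-1]:
            self.starts.append(self.starts[-1] + rate * segment_length)

    def read(self, t):
        """Return the local time H(t) for real time t >= 0."""
        # Beyond the last segment the clock keeps running at the last rate.
        i = min(int(t // self.segment_length), len(self.rates) - 1)
        return self.starts[i] + self.rates[i] * (t - i * self.segment_length)


def satisfies_drift_bound(clock, t, t_prime, eps=1e-9):
    """Check t' - t <= H(t') - H(t) <= theta * (t' - t) up to rounding."""
    elapsed = t_prime - t
    measured = clock.read(t_prime) - clock.read(t)
    return elapsed - eps <= measured <= clock.theta * elapsed + eps


rng = random.Random(1)
clock = HardwareClock(theta=1.01, segment_length=0.5, num_segments=20, rng=rng)
samples = sorted(rng.uniform(0.0, 12.0) for _ in range(200))
all_ok = all(
    satisfies_drift_bound(clock, a, b) for a, b in zip(samples, samples[1:])
)
```

Because every segment has a rate between 1 and 𝜗, the check holds for any pair of times; the tolerance `eps` only absorbs floating-point rounding.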

Executions are event-based, where an event at node v is the reception of a message, a previously computed (and stored) local time being reached, or the initialization of the algorithm. A node may then perform computations and possibly send a pulse. For simplicity, we assume that these operations take zero time; adapting our results to account for computation time is straightforward.

Problem

A clock synchronization algorithm generates distinguished events or clock pulses at times p_v(r) for $r \in \mathbb {N}$ and v ∈ C so that the following conditions are satisfied for all $r\in \mathbb {N}$.

1.
∀v, w ∈ C : |p_v(r) − p_w(r)|≤ e(r)
2.
∀v ∈ C : A_min ≤ p_v(r + 1) − p_v(r) ≤ A_max

The first requirement is a bound on the synchronization error between the r^th clock ticks; naturally, it is desired that e(r) is as small as possible. The second requirement is a bound on the time between consecutive clock ticks, which can be translated to a bound on the frequency of the clocks; here, the goal is that A_min/A_max ≈ 1. The precision of the algorithm is measured by the steady state error^{Footnote 9}

$$E := \lim\limits_{r^{\prime}\to\infty}\sup\limits_{r\geq r^{\prime}}\{e(r)\}\,. $$
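As a small illustration of the two requirements, the sketch below computes the observed skew of each round and the tightest values of A_min and A_max from a table of pulse times. The function names and the pulse times are hypothetical; note that the steady state error E is defined as a limit and therefore cannot be read off a finite trace.

```python
def skew_per_round(pulse_times):
    """Return max_v p_v(r) - min_v p_v(r) for each round r.

    pulse_times maps each correct node to its list of pulse times,
    one entry per round (all lists have the same length).
    """
    rounds = zip(*pulse_times.values())
    return [max(times) - min(times) for times in rounds]


def period_bounds(pulse_times):
    """Return (smallest, largest) gap between consecutive pulses of any node."""
    gaps = [
        later - earlier
        for times in pulse_times.values()
        for earlier, later in zip(times, times[1:])
    ]
    return min(gaps), max(gaps)


# Hypothetical pulse times (in arbitrary time units) of three correct nodes.
pulses = {
    "u": [0.0, 10.0, 20.0, 30.0],
    "v": [0.5, 10.25, 20.0, 30.0],
    "w": [1.0, 10.5, 20.25, 30.25],
}
errors = skew_per_round(pulses)       # observed skew of each round
a_min, a_max = period_bounds(pulses)  # tightest valid A_min and A_max
```

Any sequence e(r) that dominates `errors` satisfies the first requirement on this trace, and any A_min ≤ `a_min` and A_max ≥ `a_max` satisfy the second.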

3.1 Model for Frequency Correction Algorithms

In order for frequency corrections to be useful, we need to assume that hardware clock rates do not change faster than the algorithm can adjust to keep the effective frequencies aligned.

Accordingly, in Section 5, we additionally require that clock rates satisfy a Lipschitz condition as well. There, we assume that H_v is differentiable (for all v ∈ C) with derivative h_v, where h_v satisfies for $t,t^{\prime}\in \mathbb {R}^{+}_{0}$ that

$$ |h_{v}(t^{\prime})-h_{v}(t)|\leq \nu |t^{\prime}-t| $$

(1)

for some ν > 0. Note that we maintain the model assumption that hardware clock rates are close to 1 at all times, i.e., 1 ≤ h_v(t) ≤ 𝜗 for all $t\in \mathbb {R}^{+}_{0}$.

3.2 Self-stabilization

An algorithm is self-stabilizing, if it (re)establishes correct operation from arbitrary states in bounded time. If there is an upper bound on the time this takes in the worst case, we refer to it as the stabilization time.

In Section 6, we will make use of a self-stabilizing pulse synchronization algorithm to “reset” the system from inconsistent initial states. Starting the analysis only from this point, we have a consistent labeling of the pulses (modulo some $M\in \mathbb {N}$) that is shared by all correct nodes. For this special case, we can still apply the above problem formulation (w.r.t. this labeling).

4 Phase Synchronization Algorithm

In this section, we give a basic algorithm for Byzantine clock synchronization and show its guarantees in Theorem 1. The basic algorithm is a variant of the one by Lynch and Welch [19], which synchronizes clocks by simulating perpetual synchronous approximate agreement [7] on the times when clock pulses should be generated. We diverge only in terms of communication: instead of round numbers, nodes broadcast content-free pulses. Due to sufficient waiting times between pulses, during regular operation received messages from correct nodes can be correctly attributed to the respective round. In fact, the primary purpose of transmitting round numbers in the Lynch-Welch algorithm is to add recovery properties. Our technique for adding self-stabilization (presented in Section 6) leverages the pulse synchronization algorithm from [4, 5] instead, which requires broadcasting constant-sized messages only.

Before presenting the algorithm and its analysis in Sections 4.2 and 4.3, respectively, we revisit some basic properties of the approximate agreement technique [7]. The results in this section are derivatives of the ones from [7, 19], but adapting them to our setting and notation is essential for deriving our main results in Sections 5 and 6.

4.1 Properties of Approximate Agreement Steps

Abstractly speaking, the synchronization performs approximate agreement steps in each (simulated synchronous) round. In approximate agreement, each node is given an input value and the goal is to let nodes determine values that are close to each other and within the interval spanned by the correct nodes’ inputs.

In the clock synchronization setting, there is the additional obstacle that the communicated values are points in time. Due to delay uncertainty and drifting clocks, the communicated values are subject to a (worst-case) perturbation of at most some $\delta \in \mathbb {R}^{+}_{0}$. We will determine δ later in our analysis of the clock synchronization algorithms; we assume it to be given for now. The effect of these disturbances is straightforward: they may shift outputs by at most δ in each direction, increasing the range of the outputs by an additive 2δ in each step (in the worst case).

Algorithm 1 describes an approximate agreement step from the point of view of node v ∈ C. When implementing this later on, we need to make use of timing constraints to ensure that (i) correct nodes receive each other’s messages in time to perform the associated computations and (ii) correct nodes’ messages can be correctly attributed to the round to which they belong. Figure 1 depicts how a round unfolds assuming that these timing constraints are satisfied.

Denote by $\vec {x}$ the |C|-dimensional vector of correct nodes’ inputs, i.e., $(\vec {x})_{v}=x_{v}$ for v ∈ C. The diameter $\|\vec {x}\|$ of $\vec {x}$ is the difference between the maximum and minimum components of $\vec {x}$. Formally,

$$\|\vec{x}\| := \max\limits_{v\in C}\{x_{v}\} - \min\limits_{v\in C}\{x_{v}\}. $$

We will use the same notation for other values, e.g. $\vec {y}$ and $\|\vec {y}\|$. For simplicity, we assume that |C| = n − f in the following; all statements can be adapted by replacing n − f with |C| where appropriate.

Consider the special case of δ = 0. Intuitively, Algorithm 1 discards the smallest and largest f values each to ensure that values from faulty nodes cannot cause outputs to lie outside the range spanned by the correct nodes’ values. Afterwards, y_v is determined as the midpoint of the interval spanned by the remaining values. Since f < n/3, i.e., n − f ≥ 2f + 1, the median of correct nodes’ values is part of all intervals computed by correct nodes. From this, it is easy to see that $\|\vec {y}\|\leq \|\vec {x}\|/2$, see Fig. 1. For δ > 0, we simply observe that the resulting values y_v, v ∈ C, are shifted by at most δ compared to the case where δ = 0, resulting in $\|\vec {y}\|\leq \|\vec {x}\|/2 + 2\delta $. We now prove these properties.

Lemma 1

$$\forall v\in C:\,\min\limits_{w\in C}\{x_{w}\}-\delta\leq y_{v} \leq \max\limits_{w\in C}\{x_{w}\}+\delta\,. $$

Proof

As there are at most f faulty nodes, for v ∈ C we have that

$$S_{v}^{f + 1}\geq \min\limits_{w\in C}\{\hat{x}_{wv}\}\geq \min_{w\in C}\{x_{w}\}-\delta\,. $$

Analogously, $S_{v}^{n-f}\leq \max _{w\in C}\{x_{w}\}+\delta $. We conclude that

$$\min\limits_{w\in C}\{x_{w}\}-\delta\leq S_{v}^{f + 1}\leq \frac{S_{v}^{f + 1}+S_{v}^{n-f}}{2}=y_{v}\leq S_{v}^{n-f} \leq \max\limits_{w\in C}\{x_{w}\}+\delta\,. $$

□

Corollary 1

$\max _{v\in C}\{|y_{v}-x_{v}|\} \leq \|\vec {x}\|+\delta $ .

Lemma 2

$\|\vec {y}\|\leq \|\vec {x}\|/2 + 2\delta $ .

Proof

We show the claim for δ = 0 first, i.e., $\hat {x}_{wv}=x_{w}$ for all v, w ∈ C. Denote by x^k the k^th element of $\vec {x}$ w.r.t. ascending order. Since f < n/3, we have that n − f ≥ 2f + 1. Hence, for all v ∈ C,

$$x^{1}\leq S_{v}^{f + 1}\leq x^{f + 1}\leq S_{v}^{2f + 1}\leq S_{v}^{n-f}\leq x^{n-f}\,. $$

For any v, w ∈ C, it follows that

$$\begin{array}{@{}rcl@{}} y_{v}-y_{w}&=&\frac{S_{v}^{f + 1} - S_{w}^{f + 1} + S_{v}^{n-f}-S_{w}^{n-f}}{2}\\ &\leq& \frac{x^{f + 1}-x^{1}+x^{n-f}-x^{f + 1}}{2}=\frac{x^{n-f}-x^{1}}{2}\\ &=&\frac{\|\vec{x}\|}{2}\,. \end{array} $$

Symmetrically, we have that $y_{w}-y_{v}\leq \|\vec {x}\|/2$ and thus $|y_{v}-y_{w}|\leq \|\vec {x}\|/2$. As v, w ∈ C were arbitrary, this yields $\|\vec {y}\|\leq \|\vec {x}\|/2$ (under the assumption that δ = 0).

For the general case, observe that $S_{v}^{f + 1}$, $S_{w}^{f + 1}$, $S_{v}^{n-f}$, and $S_{w}^{n-f}$ each can be changed by at most δ. This can affect $(S_{v}^{f + 1} - S_{w}^{f + 1} + S_{v}^{n-f}-S_{w}^{n-f})/2$ by at most 4δ/2 = 2δ; the claim follows. □
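The approximate agreement step and the bounds of Lemmas 1 and 2 can be illustrated with a short Python sketch for the case δ = 0. It follows the textual description (drop the f smallest and f largest values, then take the midpoint of the remaining interval, i.e., of S_v^{f+1} and S_v^{n−f}); the function names and the example inputs are assumptions made for illustration, not a transcription of Algorithm 1.

```python
def approximate_agreement_step(received, f):
    """Midpoint of the values left after dropping the f smallest and f largest.

    `received` holds one value per node (n values in total, own value
    included); up to f of them may stem from Byzantine nodes.
    """
    n = len(received)
    if n <= 3 * f:
        raise ValueError("need n > 3f values")
    ordered = sorted(received)
    low = ordered[f]           # S^{f+1}: (f+1)-th smallest value
    high = ordered[n - f - 1]  # S^{n-f}: (n-f)-th smallest value
    return (low + high) / 2


def diameter(values):
    """Difference between the largest and the smallest value."""
    return max(values) - min(values)


# n = 4, f = 1: three correct nodes with inputs 0, 0, 1 and one Byzantine
# node that reports 0 to the first two nodes and 1 to the third.
f = 1
x = [0.0, 0.0, 1.0]
byzantine_reports = [0.0, 0.0, 1.0]
y = [approximate_agreement_step(x + [b], f) for b in byzantine_reports]
```

In this example the outputs stay within the range of the correct inputs, as in Lemma 1, and the output diameter equals exactly half the input diameter, so the factor 1/2 in Lemma 2 is attained here.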

4.2 Algorithm

Algorithm 2 shows the pseudocode of the phase synchronization algorithm at node v ∈ C. It implements iterative approximate agreement steps on the times when to send pulses. The algorithm assumes that the nodes are initialized within a (local) time window of size F. In each round $r\in \mathbb {N}$, the nodes estimate the phase offset of their pulses^{Footnote 10} and then compute an according phase correction Δ_v(r). Figure 2 illustrates how a round of the algorithm plays out.

To fully specify the algorithm, we need to determine how long the waiting periods in each round are (in terms of local time), which will be given as τ₁(r), τ₂(r), and T(r) − Δ_v(r) − τ₁(r) − τ₂(r). Here, we must ensure for all $r\in \mathbb {N}$ that

1.
for all v, w ∈ C, the message that v broadcasts at time t_v(r − 1) + τ₁(r) is received by w at a local time from [H_w(t_w(r − 1)), H_w(t_w(r − 1)) + τ₁(r) + τ₂(r)] and
2.
for all v ∈ C, T(r) − Δ_v(r) ≥ τ₁(r) + τ₂(r), i.e., v computes H_v(t_v(r)) before time t_v(r).

If these conditions are satisfied at all correct nodes, we say that round r is executed correctly, and we can interpret the round as an approximate agreement step in the sense of Section 4.1. We will show in the next section that the following condition is sufficient for all rounds to be executed correctly.

Condition 1

Define e(1) := F + (1 − 1/𝜗)τ₁(1) and inductively for all $r\in \mathbb {N}$ that

$$e(r + 1):=\frac{2\vartheta^{2}+ 5\vartheta-5}{2(\vartheta+ 1)}\,e(r) +(3\vartheta-1)U+\left( 1-\frac{1}{\vartheta}\right)(T(r)+\tau_{1}(r + 1)-\tau_{1}(r))\,. $$

We require for all $r\in \mathbb {N}$ that

$$\begin{array}{@{}rcl@{}} \tau_{1}(r)&\geq& \vartheta e(r)\\ \tau_{2}(r)&\geq& \vartheta(e(r)+d)\\ T(r)&\geq& \tau_{1}(r)+\tau_{2}(r)+\vartheta(e(r)+U)\,. \end{array} $$

Here, e(r) is a bound on the synchronization error in round r, i.e., we will show that $\|\vec {p}(r)\|\leq e(r)$ for all $r\in \mathbb {N}$, provided Condition 1 is satisfied. Condition 1 cannot be satisfied for arbitrary 𝜗 > 1 such that e(r) is bounded independently of r. The intuition is that rounds must be long enough to ensure that all pulses from correct nodes are received (i.e., at least 𝜗e(r)), but during this time additional error is built up by drifting clocks; if the approximate agreement step cannot overcome this relative skew increase, round r + 1 has to be even longer, and so on. However, any 𝜗 ≤ 1.1 can be sustained.

Lemma 3

Condition 1 can be satisfied such that $\lim _{r\to \infty } e(r)<\infty $ if

$$\alpha:=\frac{6\vartheta^{2}+ 5\vartheta-9}{2(\vartheta+ 1)(2-\vartheta)}<1\,. $$

In this case, we can achieve

$$\lim\limits_{r\to \infty} e(r) = \frac{(\vartheta-1)d+(4\vartheta -2)U}{(2-\vartheta)(1-\alpha)}\,. $$

Proof

By plugging e(1) into the inequality for τ₁(1), we see that we may choose τ₁(1) < ∞ if and only if 𝜗 < 2. Assuming that this is the case, we choose to satisfy all inequalities with equality, yielding for $r\in \mathbb {N}$ that

$$\begin{array}{@{}rcl@{}} \tau_{1}(r)&=&\vartheta e(r)\\ T(r)&=& \vartheta(3e(r)+d+U)\\ e(r + 1)&=& \frac{6\vartheta^{2}+ 5\vartheta-9}{2(\vartheta+ 1)(2-\vartheta)}\,e(r) +\frac{(\vartheta-1) d}{2-\vartheta} + \frac{(4\vartheta-2)U}{2-\vartheta}\\ &=&\alpha e(r)+\frac{(\vartheta-1) d+(4\vartheta-2)U}{2-\vartheta}\,. \end{array} $$

Thus,

$$\begin{array}{@{}rcl@{}} \lim\limits_{r\to \infty} e(r)&=& \lim\limits_{r\to \infty}\left( \alpha^{r-1}e(1)+\sum\limits_{r^{\prime}= 0}^{r-2}\alpha^{r^{\prime}} \left( \frac{(\vartheta-1)d+(4\vartheta-2)U}{2-\vartheta}\right)\right)\\ &=&\frac{(\vartheta-1)d+(4\vartheta-2)U}{(2-\vartheta)(1-\alpha)}\,, \end{array} $$

where the second equality holds because α < 1. Because α < 1 is a stricter constraint on 𝜗 than 𝜗 < 2, this completes the proof. □

Several remarks are in order.

α goes to 1/2 as 𝜗 goes to 1. For 𝜗 = 1.01, we already have that α ≈ 0.55. Thus, the approach can support fairly large phase drifts.
For 𝜗 ≈ 1, we have that $\lim _{r\to \infty } e(r)\approx 4U + 2(\vartheta -1)d$. From Corollary 2, one can see that if (𝜗 − 1)d ≪ U, this can be reduced to $\lim _{r\to \infty } e(r)\approx 2U$.
The lower bound by Lynch and Welch [15] shows that this is optimal up to factor 2. It is straightforward to verify that in the fault-free case with 𝜗 = 1, the algorithm attains the lower bound.
The convergence is exponential, i.e., for any ε > 0 we have that $e(r)\leq (1+\varepsilon )\lim _{r\to \infty } e(r)$ for all $r\geq r_{\varepsilon }\in {\Theta }(\log F/(\varepsilon \lim _{r\to \infty } e(r)))$.
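The numbers quoted in these remarks can be reproduced with a few lines of Python. The sketch below evaluates α and the limit from Lemma 3 and iterates the recurrence e(r + 1) = αe(r) + ((𝜗 − 1)d + (4𝜗 − 2)U)/(2 − 𝜗) obtained in the proof; the function names and the parameter values for d, U, and e(1) are illustrative assumptions.

```python
def alpha(theta):
    """Contraction factor alpha from Lemma 3."""
    return (6 * theta**2 + 5 * theta - 9) / (2 * (theta + 1) * (2 - theta))


def steady_state_error(theta, d, u):
    """Limit of e(r) from Lemma 3; requires alpha(theta) < 1."""
    a = alpha(theta)
    if not a < 1:
        raise ValueError("alpha >= 1: e(r) does not converge")
    return ((theta - 1) * d + (4 * theta - 2) * u) / ((2 - theta) * (1 - a))


def iterate_error(theta, d, u, e1, rounds):
    """Iterate e(r+1) = alpha*e(r) + c with all inequalities of Condition 1 tight."""
    a = alpha(theta)
    c = ((theta - 1) * d + (4 * theta - 2) * u) / (2 - theta)
    e = e1
    for _ in range(rounds):
        e = a * e + c
    return e


theta, d, u = 1.01, 1.0, 0.1   # illustrative parameter values
limit = steady_state_error(theta, d, u)
after_100_rounds = iterate_error(theta, d, u, e1=50.0, rounds=100)
```

For 𝜗 = 1 the function `alpha` returns exactly 1/2, for 𝜗 = 1.01 it is roughly 0.545, and for 𝜗 = 1.1 it is still below 1. With the values above, `limit` agrees with 2.222(𝜗 − 1)d + 4.533U up to rounding of the constants, and the iterated error is numerically indistinguishable from the limit after 100 rounds.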

4.3 Analysis

In this section, we prove that Condition 1 is indeed sufficient to ensure that $\|\vec {p}(r)\|\leq e(r)$ for all $r\in \mathbb {N}$. In the following, denote by $\vec {p}(r)$, $r\in \mathbb {N}_{0}$, the vector of times when nodes v ∈ C broadcast their r^th pulse, i.e., H_v(p_v(r)) = H_v(t_v(r − 1)) + τ₁(r). If v ∈ C takes note of the pulse from w ∈ C in round r, the corresponding value τ_wv − τ_vv can be interpreted as an inexact measurement of p_w(r) − p_v(r). This is captured by the following lemma, which provides precise bounds on the incurred error.

Lemma 4

Suppose v ∈ C receives the pulses from both w ∈ C and itself in round r at a time from [H_v(t_v(r − 1)), H_v(t_v(r − 1)) + τ₁(r) + τ₂(r)]. Then

$$\left|\frac{2(\tau_{wv}-\tau_{vv})}{\vartheta+ 1}-(p_{w}(r)-p_{v}(r))\right|< \vartheta U + \frac{\vartheta-1}{\vartheta+ 1}\|\vec{p}(r)\|\,, $$

where τ_wv and τ_vv denote the values of the respective variables in the algorithm in round r.

Proof

Denote by t_uv the time when v receives the pulse from u ∈{v, w}. The communication model guarantees that t_uv ∈ [p_u(r) + d − U, p_u(r) + d]. Thus,

$$ \tau_{uv}=H_{v}(t_{uv})\in [H_{v}(p_{u}(r)+d-U),H_{v}(p_{u}(r)+d)]\subseteq H_{v}(p_{u}(r)+d-U/2)\pm \frac{\vartheta U}{2}\,. $$

(2)

Moreover, if p_w(r) − p_v(r) ≥ 0, the bounds on the hardware clock speed guarantee that

$$\begin{array}{@{}rcl@{}} \frac{2(p_{w}(r)-p_{v}(r))}{\vartheta+ 1}&\leq& \frac{2(H_{v}(p_{w}(r)+d-U/2)-H_{v}(p_{v}(r)+d-U/2))}{\vartheta+ 1}\\ &\leq& \frac{2\vartheta(p_{w}(r)-p_{v}(r))}{\vartheta+ 1} \end{array} $$

and thus

$$\begin{array}{@{}rcl@{}} &&\;\frac{(1-\vartheta)(p_{w}(r)-p_{v}(r))}{\vartheta+ 1}\\ &\leq &\; \frac{2(H_{v}(p_{w}(r)+d-U/2)-H_{v}(p_{v}(r)+d-U/2))}{\vartheta+ 1}-(p_{w}(r)-p_{v}(r))\\ &\leq &\;\frac{(\vartheta-1)(p_{w}(r)-p_{v}(r))}{\vartheta+ 1}\,. \end{array} $$

Since $|p_{w}(r)-p_{v}(r)|\leq \|\vec {p}(r)\|$ by definition, this yields that

$$\begin{array}{@{}rcl@{}} &&\left|\frac{2(H_{v}(p_{w}(r)+d-U/2)-H_{v}(p_{v}(r)+d-U/2))}{\vartheta+ 1}-(p_{w}(r)-p_{v}(r))\right|\\ &\leq &\;\frac{\vartheta-1}{\vartheta+ 1}\|\vec{p}(r)\|\,. \end{array} $$

(3)

This bound also holds in case p_w(r) − p_v(r) < 0, as we can switch the roles of v and w in the above inequalities. We conclude that

$$\begin{array}{@{}rcl@{}} &&\left|\frac{2(\tau_{wv}-\tau_{vv})}{\vartheta+ 1}-(p_{w}(r)-p_{v}(r))\right|\\ &&\qquad\leq \frac{2}{\vartheta+ 1} (|\tau_{wv}-H_{v}(p_{w}(r)+d-U/2)|+|\tau_{vv}-H_{v}(p_{v}(r)+d-U/2)|)\\ &&\qquad\qquad + \left|\frac{2(H_{v}(p_{w}(r)\,+\,d\,-\,U/2)\,-\,H_{v}(p_{v}(r)\,+\,d\,-\,U/2))}{\vartheta+ 1} -(p_{w}(r)-p_{v}(r))\right|\\ &&\qquad \overset{(2),(3)}{<} \vartheta U + \frac{\vartheta-1}{\vartheta+ 1}\|\vec{p}(r)\|\,. \end{array} $$

□

We remark that if (𝜗 − 1)d < U and U is known, it is beneficial to refrain from having v send a message to itself. Instead it estimates the arrival time of the message using its hardware clock, yielding the following corollary.

Corollary 2

Suppose v ∈ C receives the pulse from w ∈ C in round r at a time from [H_v(t_v(r − 1)), H_v(t_v(r − 1)) + τ₁(r) + τ₂(r)]. Then

$$\left|\frac{2(\tau_{wv}-H_{v}(p_{v}(r)))}{\vartheta+ 1} -\left( \!d\,-\,\frac{U}{2}\!\right)\,-\,(p_{w}(r)\,-\,p_{v}(r))\right|\!<\! \frac{\vartheta U}{2} + \frac{\vartheta-1}{\vartheta+ 1}(\|\vec{p}(r)\|+d)\,, $$

where τ_wv denotes the value of the respective variable in the algorithm in round r.

Proof

By repeating the proof of Lemma 4, where the term |τ_vv − H_v(p_v(r) + d − U/2)| gets replaced by

$$\begin{array}{@{}rcl@{}} &&\quad\left|H_{v}(p_{v}(r))+\frac{(\vartheta+ 1)(d-U/2)}{2} -H_{v}\left( p_{v}(r)+d-\frac{U}{2}\right)\right|\\ &&\leq \max\left\{\left|\frac{\vartheta+ 1}{2}-1\right|, \left|\frac{\vartheta+ 1}{2}-\vartheta\right|\right\}\left( d-\frac{U}{2}\right)\\ &&= \frac{\vartheta-1}{2}\left( d-\frac{U}{2}\right)\\ &&<\frac{\vartheta-1}{2}\,d\,. \end{array} $$

Multiplied by the factor 2/(𝜗 + 1) from the proof of Lemma 4, this contributes less than (𝜗 − 1)d/(𝜗 + 1) to the bound.

□

In the sequel, we use the bounds provided by Lemma 4. However, the reader should keep in mind that in case (𝜗 − 1)d ≪ U and sufficiently precise bounds on U are known, Corollary 2 shows how to effectively cut the influence of the uncertainty in half.

Using Lemma 4, we can interpret the phase shifts Δ_v(r) as outcomes of an approximate agreement step, yielding the following corollary.

Corollary 3

Suppose in round $r\in \mathbb {N}$, it holds for all v, w ∈ C that v receives the pulse from w ∈ C and itself in round r during [H_v(t_v(r − 1)), H_v(t_v(r − 1)) + τ₁(r) + τ₂(r)]. Then

1. $|{\Delta }_{v}(r)|< \vartheta (\|\vec {p}(r)\|+U)$ and
2. $\max _{v,w\in C}\{p_{v}(r)-{\Delta }_{v}(r)-p_{w}(r)+{\Delta }_{w}(r)\}\leq (5\vartheta -3)\|\vec {p}(r)\|/(2(\vartheta + 1))+ 2\vartheta U$.

Proof

By Lemma 4, we can interpret the values 2(τ_wv − τ_vv)/(𝜗 + 1) as measurements of p_w(r) − p_v(r) with error $\delta =\vartheta U + (\vartheta -1)\|\vec {p}(r)\|/(\vartheta + 1)$. Note that shifting all values by p_v(r) in an approximate agreement step changes the result by exactly p_v(r), implying that p_v(r) −Δ_v(r) equals the result of an approximate agreement step with inputs p_w(r), w ∈ C, and error δ at node v. Thus, the claims follow from Corollary 1 and Lemma 2, noting that 1/2 + 2(𝜗 − 1)/(𝜗 + 1) = (5𝜗 − 3)/(2(𝜗 + 1)). □

To derive a bound on $\|\vec {p}(r + 1)\|$, it remains to analyze the effect of the clock drift between the pulses. To this end, we examine how an established timing relation between actions of two correct nodes deteriorates due to measuring time using the inaccurate hardware clocks.

Lemma 5

Suppose $H_{v}(t_{v}^{\prime })-H_{v}(t_{v})=h_{v}\geq 0$ and $H_{w}(t_{w}^{\prime })-H_{w}(t_{w})=h_{w}\geq 0$. Then

$$t_{v}-t_{w}+\frac{h_{v}}{\vartheta}-h_{w}\leq t_{v}^{\prime}-t_{w}^{\prime}\leq t_{v}-t_{w}+h_{v}-\frac{h_{w}}{\vartheta}\,. $$

Proof

Since hardware clocks are increasing, $t_{v}^{\prime }\geq t_{v}$ and $t_{w}^{\prime }\geq t_{w}$. The inequalities follow because hardware clock rates are between 1 and 𝜗 ≥ 1. □
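Spelled out, the step behind this proof is the following short derivation; it is a reader's expansion that uses only the rate bounds 1 and 𝜗 stated above.

$$\begin{array}{@{}rcl@{}} t_{v}^{\prime}-t_{v}\leq H_{v}(t_{v}^{\prime})-H_{v}(t_{v})\leq \vartheta (t_{v}^{\prime}-t_{v}) &\Rightarrow& \frac{h_{v}}{\vartheta}\leq t_{v}^{\prime}-t_{v}\leq h_{v}\,,\\ t_{w}^{\prime}-t_{w}\leq H_{w}(t_{w}^{\prime})-H_{w}(t_{w})\leq \vartheta (t_{w}^{\prime}-t_{w}) &\Rightarrow& \frac{h_{w}}{\vartheta}\leq t_{w}^{\prime}-t_{w}\leq h_{w}\,. \end{array} $$

Writing $t_{v}^{\prime}-t_{w}^{\prime}=(t_{v}-t_{w})+(t_{v}^{\prime}-t_{v})-(t_{w}^{\prime}-t_{w})$ and inserting the extreme admissible values of the two differences in parentheses gives the lower and the upper bound of the lemma, respectively.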

This readily yields a bound on $\|\vec {p}(r + 1)\|$ – provided that all nodes can compute when to send the next pulse on time.

Corollary 4

Assume that round $r\in \mathbb {N}$ is executed correctly. Then

$$\|\vec{p}(r + 1)\|\leq \frac{2\vartheta^{2}+ 5\vartheta-5}{2(\vartheta+ 1)}\|\vec{p}(r)\|+(3\vartheta-1)U +\left( 1-\frac{1}{\vartheta}\right) (T(r)+\tau_{1}(r + 1)-\tau_{1}(r))\,. $$

Proof

For v, w ∈ C, assume w.l.o.g. that p_v(r + 1) − p_w(r + 1) ≥ 0. By Lemma 5 and Corollary 3, we have that

$$\begin{array}{@{}rcl@{}} &&\quad~p_{v}(r + 1)-p_{w}(r + 1)\\ &&\leq p_{v}(r)-p_{w}(r)+T(r)-{\Delta}_{v}(r)+\tau_{1}(r + 1)-\tau_{1}(r)\\ &&\qquad-\frac{T(r)-{\Delta}_{w}(r)+\tau_{1}(r + 1)-\tau_{1}(r)}{\vartheta}\\ &&\leq p_{v}(r)\,-\,{\Delta}_{v}(r)\,-\,(p_{w}(r)\,-\,{\Delta}_{w}(r)) \,+\,\left( \!1\,-\,\frac{1}{\vartheta}\!\right)\!(T(r)\,+\,\tau_{1}(r\,+\,1)\,-\,\tau_{1}(r)\,+\,|{\Delta}_{w}(r)|)\\ &&\leq \frac{2\vartheta^{2}+ 5\vartheta-5}{2(\vartheta+ 1)}\|\vec{p}(r)\|+(3\vartheta-1)U +\left( 1-\frac{1}{\vartheta}\right) (T(r)+\tau_{1}(r + 1)-\tau_{1}(r))\,. \end{array} $$

□
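The constants in the last step can be checked directly. The following worked computation is a reader's verification; it assumes the two claims of Corollary 3 exactly as stated above.

$$\begin{array}{@{}rcl@{}} \left(1-\frac{1}{\vartheta}\right)|{\Delta}_{w}(r)| &<& (\vartheta-1)(\|\vec{p}(r)\|+U)\,,\\ \frac{5\vartheta-3}{2(\vartheta+1)}+(\vartheta-1) &=& \frac{5\vartheta-3+2(\vartheta^{2}-1)}{2(\vartheta+1)} = \frac{2\vartheta^{2}+5\vartheta-5}{2(\vartheta+1)}\,,\\ 2\vartheta U+(\vartheta-1)U &=& (3\vartheta-1)U\,. \end{array} $$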

This bound hinges on the assumption that the round is executed correctly. We next establish sufficient conditions for this to be the case.

Lemma 6

Suppose that

$$\begin{array}{@{}rcl@{}} \tau_{1}(r)&\geq& \vartheta (\|\vec{p}(r)\|-(d-U))\\ \tau_{2}(r)&\geq &\vartheta (\|\vec{p}(r)\| + d)\\ T(r)&\geq& \tau_{1}(r)+\tau_{2}(r)+\vartheta(\|\vec{p}(r)\|+U)\,. \end{array} $$

Then round r is executed correctly.

Proof

Suppose v, w ∈ C. Denote by t_vw ∈ [p_v(r) + d − U, p_v(r) + d] the time when the pulse that v broadcasts in round r is received by w. We have that

$$\begin{array}{@{}rcl@{}} t_{vw}&\geq& p_{v}(r)+d-U\geq p_{w}(r)-\|\vec{p}(r)\|+d-U\\ &\geq& t_{w}(r-1)+\frac{\tau_{1}(r)}{\vartheta}-(\|\vec{p}(r)\|-(d-U))\\ &\geq& t_{w}(r-1)\,, \end{array} $$

showing that H_w(t_vw) ≥ H_w(t_w(r − 1)), i.e., w starts listening for the pulse of v on time. Similarly,

$$t_{vw}\leq p_{v}(r)+d\leq p_{w}(r)+\|\vec{p}(r)\|+d\leq p_{w}(r)+\frac{\tau_{2}(r)}{\vartheta}\,, $$

implying that H_w(t_vw) ≤ H_w(p_w(r)) + τ₂(r) = H_w(t_w(r − 1)) + τ₁(r) + τ₂(r). Thus, w receives the pulse from v before it stops listening, and the first requirement of correct execution of round r is met for all v, w ∈ C.

It remains to prove that for each v ∈ C, it holds that T(r) −Δ_v(r) ≥ τ₁(r) + τ₂(r). By the preconditions of the lemma, this is satisfied if ${\Delta }_{v}(r)\leq \vartheta (\|\vec {p}(r)\|+U)$. As we already established the precondition of Corollary 3 for round r, the corollary shows that this inequality is satisfied. □
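The three inequalities of Lemma 6 translate directly into a small helper that returns the smallest admissible waiting times for a given bound on the skew. This is only a sketch: the function name, the argument `p_norm` (standing for an upper bound on ‖p(r)‖) and the clamping of τ₁ at zero are assumptions made for illustration.

```python
def minimal_round_parameters(theta, d, U, p_norm):
    """Smallest tau1, tau2, T satisfying the three inequalities of Lemma 6.

    p_norm is an upper bound on ||p(r)||. Clamping tau1 at 0 is an added
    assumption (a waiting time cannot be negative); it is not in the lemma.
    """
    tau1 = max(0.0, theta * (p_norm - (d - U)))
    tau2 = theta * (p_norm + d)
    T = tau1 + tau2 + theta * (p_norm + U)
    return tau1, tau2, T
```

Any larger values of τ₁(r), τ₂(r) and T(r) also satisfy the lemma, at the cost of a longer round and hence more drift between pulses.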

We have almost all pieces in place to inductively bound $\|\vec {p}(r)\|$ and determine suitable values for τ₁(r), τ₂(r), and T(r). The last missing bit is an anchor for the induction, i.e., a bound on $\|\vec {p}(1)\|$.

Corollary 5

$\|\vec {p}(1)\|\leq F+(1-1/\vartheta )\tau _{1}(1)=e(1)$ .

Proof

Since H_v(0) ∈ [0, F) for all v ∈ C, t_v(0) ∈ [0, F) for all v ∈ C. The claim follows by applying Lemma 5. □

Theorem 1

Suppose that Condition 1 is satisfied. Then, for all $r\in \mathbb {N}$, it holds that $\|\vec {p}(r)\|\leq e(r)$. If α = (6𝜗² + 5𝜗 − 9)/(2(𝜗 + 1)(2 − 𝜗)) < 1 (which holds for 𝜗 ≤ 1.1), we can choose the parameters such that the condition holds and Algorithm 2 has steady state error

$$E = \lim\limits_{r\to \infty} e(r) = \frac{(\vartheta-1)d+(4\vartheta-2)U}{(2-\vartheta)(1-\alpha)}\,. $$

Proof

To show the first part, inductively use Lemma 6 and Corollary 4 to show that round r is executed correctly and that $\|\vec {p}(r + 1)\|\leq e(r + 1)$, respectively; the induction anchor is given by $\|\vec {p}(1)\|\leq e(1)$ according to Corollary 5. The second part directly follows from Lemma 3. □

5 Phase and Frequency Synchronization Algorithm

In this section, we extend the phase synchronization algorithm to also synchronize frequencies and give the guarantees of the extended algorithm in Theorem 3; a simplified statement is provided by Corollary 12. The basic idea is to apply the approximate agreement not only to phase offsets, but also to frequency offsets. To this end, in each round the phase difference is measured twice, applying any phase correction only after the second measurement. This enables nodes to obtain an estimate of the relative clock speeds, which in turn is used to obtain an estimate of the differences in clock speeds.

Ensuring that this procedure is executed correctly is straightforward by limiting |μ_v(r) − 1| to be small, where μ_v(r) is the factor by which node v changes its clock rate during round r. However, constraining this multiplier means that approximate agreement steps cannot be performed correctly in case μ_v(r + 1) would lie outside the valid range of multipliers. This is fixed by introducing a correction that “pulls” frequencies back to the default rate.

Of course, for all this to be meaningful, we need to assume that hardware clock rates do not change faster than the algorithm can adjust the multipliers to keep the effective frequencies aligned. We recall the additional model assumption stated in Section 3.1: we assume that H_v is differentiable (for all v ∈ C) with derivative h_v, where h_v satisfies for $t,t^{\prime }\in \mathbb {R}^{+}_{0}$ that |h_v(t′) − h_v(t)| ≤ ν|t′ − t| for some ν > 0.

5.1 Algorithm

Algorithm 3 gives the pseudocode of our approach. Mostly, the algorithm can be seen as a variant of Algorithm 2 that allows for speeding up clocks by factors μ_v(r) ∈ [1, 𝜗²], where 𝜗h_v(t) is considered the nominal rate at time t.^{Footnote 11} For simplicity, we fix all local waiting times independently of the round length.

The main difference to Algorithm 2 is that a second pulse signal is sent before the phase correction is applied, which makes it possible to determine the rate multipliers for the next round by an approximate agreement step as well. A frequency measurement is obtained by comparing the (observed) relative rate of the clock of node w during a local time interval of length τ₂ + τ₃ to the desired relative clock rate of 1. Since the clock of node v is considered to run at speed μ_v(r)h_v(t) during the measurement period, the former takes the form μ_v(r)Δ_wv/(τ₂ + τ₃), where Δ_wv is the time difference between the arrival times of the two pulses from w measured with H_v. The approximate agreement step results in a new multiplier $\hat {\mu }_{v}(r + 1)$ at node v; we then move this result by a (small) value ε in the direction of the nominal rate multiplier 𝜗 and ensure that we remain within the acceptable multiplier range [1, 𝜗²].
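The two local computations described in this paragraph can be sketched as follows. This is a minimal illustration under stated assumptions, not the pseudocode of Algorithm 3: the function names are hypothetical, and the handling of the tie at exactly 𝜗 (moved upward) is an assumption chosen to match the case split used later in the analysis.

```python
def frequency_offset_estimate(mu_v, delta_wv, tau2, tau3):
    """Estimate of the rate difference between w and v seen at node v.

    Computes 1 - mu_v * Delta_wv / (tau2 + tau3), where delta_wv is the
    local time between the arrivals of the two pulses of w at v.
    """
    return 1.0 - mu_v * delta_wv / (tau2 + tau3)


def next_multiplier(mu_hat, theta, eps):
    """Pull mu_hat by eps toward the nominal multiplier theta, then clamp.

    Assumption: a tie mu_hat == theta is moved upward. The result is kept
    within the valid multiplier range [1, theta**2].
    """
    shifted = mu_hat + eps if mu_hat <= theta else mu_hat - eps
    return min(max(shifted, 1.0), theta**2)
```

The clamp matters only when the pulled value would leave [1, 𝜗²]; the lower bound on ε in Condition 2 is what keeps the multipliers inside this range in the analysis.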

To fully specify the algorithm, we need to determine how long the waiting periods are (in terms of local time) and choose ε. Here, we must ensure for all $r\in \mathbb {N}$ that

1. for all v, w ∈ C, the message v broadcasts at time t_v(r − 1) + τ₁/μ_v(r − 1) is received by w at a local time from [H_w(t_w(r − 1)), H_w(t_w(r − 1)) + τ₁/μ_w(r − 1) + τ₂/μ_w(r)],
2. for all v, w ∈ C, the message v broadcasts at time t_v(r − 1) + τ₁/μ_v(r − 1) + (τ₂ + τ₃)/μ_v(r) is received by w at a local time from [H_w(t_w(r − 1)) + τ₁/μ_w(r − 1) + τ₂/μ_w(r), H_w(t_w(r − 1)) + τ₁/μ_w(r − 1) + (τ₂ + τ₃ + τ₄)/μ_w(r)], and
3. for all v ∈ C, T − Δ_v(r) ≥ τ₁/μ_v(r − 1) + (τ₂ + τ₃ + τ₄)/μ_v(r), i.e., v computes H_v(t_v(r)) before time t_v(r).

If these conditions are satisfied for $r\in \mathbb {N}$, we say that round r was executed correctly.

We now specify the constraints our choices for the parameters must satisfy to ensure that all rounds are executed correctly and both phase and frequency errors converge to small values.

Condition 2

Set $\bar {\vartheta }:=\vartheta ^{3}$. Define

$$e(1):=\max\left\{F+\left( 1-\frac{1}{\bar{\vartheta}}\right)\tau_{1}, \frac{(1-1/\bar{\vartheta})T+(3\bar{\vartheta}-1)U}{1-\bar{\beta}}\right\} $$

and, inductively for $r\in \mathbb {N}$,

$$e(r + 1):=\frac{2\bar{\vartheta}^{2}+ 5\bar{\vartheta}-5}{2(\bar{\vartheta}+ 1)}\,e(r) +(3\bar{\vartheta}-1)U+\left( 1-\frac{1}{\bar{\vartheta}}\right)T\,. $$

We require that

$$\begin{array}{@{}rcl@{}} \tau_{1}&\geq &\bar{\vartheta} e(1)\\ \tau_{2}&\geq &\bar{\vartheta}(e(1)+d)\\ \tau_{3}&\geq& \bar{\vartheta}\left( e(1)+\left( 1-\frac{1}{\bar{\vartheta}}\right)(\tau_{1}+\tau_{2})\right)\\ \tau_{4}&\geq& \bar{\vartheta}\left( e(1)+d+\left( 1-\frac{1}{\bar{\vartheta}}\right)(\tau_{1}+\tau_{2})\right)\\ T&\geq & \tau_{1}+\tau_{2}+\tau_{3}+\tau_{4}+\bar{\vartheta}(e(1)+U)\\ \varepsilon&\geq& 2\left( (\vartheta-1)(\vartheta^{3}-1)+ 2\vartheta^{3}\left( 1-\frac{1}{\vartheta^{3}}\right)^{2} +\frac{2\vartheta^{3} U}{\tau_{2}+\tau_{3}}+ 2(\vartheta^{3}+ 1)\nu T\right)\,. \end{array} $$

Here, all but the last conditions mimic Condition 1, where the bounds on τ₃ and τ₄ account for the fact that between the first and the second pulse of each round, the nodes’ opinion on the “synchronized time” drift apart slowly. The lower bound on ε ensures that the pull-back of multipliers to the nominal ones is sufficiently strong to guarantee that, in fact, multipliers will never leave the valid range of [1, 𝜗²]. We now show that these constraints can be satisfied provided that 𝜗 is not too large.

Lemma 7

Condition 2 can be satisfied such that $\lim _{r\to \infty } e(r)<\infty $ if

$$\bar{\alpha}:=\bar{\beta}+(4\bar{\vartheta}+ 3)(\bar{\vartheta}-1)<1\,, $$

where $\bar {\beta }:=(2\bar {\vartheta }^{2}+ 5\bar {\vartheta }-5)/(2(\bar {\vartheta }+ 1))$. Here, we may choose any T ≥ T₀ ∈ O(F + d + U). In this case,

$$\lim\limits_{r\to \infty} e(r) = \frac{(1-1/\bar{\vartheta})T+(3\bar{\vartheta} -1)U}{1-\bar{\beta}}\,. $$

Proof

We choose τ₁, τ₂, τ₃, and τ₄ minimal such that the respective constraints are satisfied, and pick any feasible ε. Hence, the remaining constraints are that

$$ T\geq \bar{\vartheta}((4\bar{\vartheta}+ 3)e(1)+(2\bar{\vartheta}+ 1)d+U) $$

(4)

and

$$e(1)=\max\left\{F+(\bar{\vartheta}-1)e(1), \frac{(1-1/\bar{\vartheta})T+(3\bar{\vartheta}-1)U}{1-\bar{\beta}}\right\}. $$

Using that $2-\bar {\vartheta }>0$ (which is a weaker constraint than $\bar {\alpha }<1$), assuming that e(1) equals the first term of the maximum would yield that

$$e(1)= \frac{F}{2-\bar{\vartheta}}\,, $$

and clearly there is a T₀ ∈ O(F + d + U) such that (4) is satisfied for any T ≥ T₀. Assuming that e(1) equals the second term in the maximum, (4) becomes

$$T\geq \bar{\vartheta}\left( (4\bar{\vartheta}+ 3) \left( \frac{(1-1/\bar{\vartheta})T+(3\bar{\vartheta}-1)U}{1-\bar{\beta}}\right) +(2\bar{\vartheta}+ 1)d+U\right). $$

Using that $\bar {\alpha }<1$, we can resolve this to

$$T\geq \bar{\vartheta}\cdot \frac{(4\bar{\vartheta}+ 3)(3\bar{\vartheta}- 1)U+(1-\bar{\beta}) ((2\bar{\vartheta}+ 1)d+U)}{1-\bar{\alpha}}\in O(U+d)\,. $$

For the final claim, observe that by induction on r, we have that

$$\begin{array}{@{}rcl@{}} \lim\limits_{r\to \infty}e(r) &=&\lim\limits_{r\to \infty}\left( \bar{\beta}^{r-1}e(1) +\sum\limits_{i = 1}^{r-1}\bar{\beta}^{i-1} \left( (3\bar{\vartheta}-1)U+\left( 1-\frac{1}{\bar{\vartheta}}\right)T\right)\right)\\ &=& \frac{(1-1/\bar{\vartheta})T+(3\bar{\vartheta}-1)U}{1-\bar{\beta}}\,. \end{array} $$

□
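To get a feel for the condition on 𝜗 in Lemma 7, the quantities involved can be evaluated numerically. The sketch below follows the definitions of the lemma; the function names and any sample arguments are illustrative assumptions.

```python
def beta_bar(theta):
    """beta-bar of Lemma 7, evaluated at theta-bar = theta**3."""
    tb = theta**3
    return (2 * tb**2 + 5 * tb - 5) / (2 * (tb + 1))


def alpha_bar(theta):
    """alpha-bar = beta-bar + (4*theta-bar + 3)*(theta-bar - 1)."""
    tb = theta**3
    return beta_bar(theta) + (4 * tb + 3) * (tb - 1)


def limit_error(theta, T, U):
    """Limit of e(r) from Lemma 7; requires beta_bar(theta) < 1."""
    tb = theta**3
    b = beta_bar(theta)
    if not b < 1:
        raise ValueError("beta_bar >= 1: the recursion does not converge")
    return ((1 - 1 / tb) * T + (3 * tb - 1) * U) / (1 - b)
```

For drift-free clocks (`theta == 1.0`) both `beta_bar` and `alpha_bar` evaluate to 1/2, and `limit_error` reduces to 4U independently of T.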

5.2 Analysis

In the following, denote by $\vec {p}(r)$ and $\vec {q}(r)$, $r\in \mathbb {N}$, the vectors of times when nodes v ∈ C broadcast their first and second pulse in round r, respectively. Thus, we have that H_v(p_v(r)) = H_v(t_v(r − 1)) + τ₁/μ_v(r − 1) and H_v(q_v(r)) = H_v(t_v(r − 1)) + τ₁/μ_v(r − 1) + (τ₂ + τ₃)/μ_v(r).

We will first make use of the analysis we performed for the phase correction algorithm to show that all rounds are executed correctly. Then we will refine the analysis by examining the impact of the frequency correction steps.

5.2.1 Phase Correction Steps

Observe that because for all $r\in \mathbb {N}_{0}$ and v ∈ C, we have that 1 ≤ μ_v(r) ≤ 𝜗², for all times t we have that $1\leq \mu _{v}(r)h_{v}(t)\leq \vartheta ^{3}=\bar {\vartheta }$. Thus, we may interpret the waiting periods of Algorithm 3 as nodes waiting for τ₁, τ₂, etc. local time with hardware clocks of drift $\bar {\vartheta }=\vartheta ^{3}$. Thus, we can make use of the same arguments as in Section 4.3 to obtain a series of results.

Corollary 6

For all $r\in \mathbb {N}$ , $\|\vec {q}(r)\|\leq \|\vec {p}(r)\|+(1-1/\bar {\vartheta })(\tau _{1}+\tau _{2})$ .

Proof

By application of Lemma 5. □

Corollary 7

Suppose that

$$\begin{array}{@{}rcl@{}} \tau_{1}&\geq& \vartheta (\|\vec{p}(r)\|-(d-U))\\ \tau_{2}&\geq& \vartheta (\|\vec{p}(r)\| + d)\\ \tau_{3}&\geq& \vartheta (\|\vec{q}(r)\|-(d-U))\\ \tau_{4}&\geq& \vartheta (\|\vec{q}(r)\| + d)\\ T&\geq& \tau_{1}+\tau_{2}+\tau_{3}+\tau_{4}+\vartheta(\|\vec{p}(r)\|+U)\,. \end{array} $$

Then round r is executed correctly.

Proof

As for Lemma 6, where the pulse in the frequency correction step is analyzed analogously. □

Theorem 2

Suppose that Condition 2 is satisfied and that

$$\bar{\alpha}:=\bar{\beta}+(4\bar{\vartheta}+ 3)(\bar{\vartheta}-1)<1\,, $$

where $\bar {\beta }:=(2\bar {\vartheta }^{2}+ 5\bar {\vartheta }-5)/(2(\bar {\vartheta }+ 1))$ (this is the case for 𝜗 ≤ 1.011). Then, for all $r\in \mathbb {N}$, it holds that $\|\vec {p}(r)\|\leq e(r)$ and the algorithm has steady state error

$$E\leq \frac{(1-1/\bar{\vartheta})T+(3\bar{\vartheta}-1)U}{1-\bar{\beta}}\,. $$

In particular, all rounds $r\in \mathbb {N}$ are executed correctly.

Proof

As for Theorem 1, where we replace 𝜗 with $\bar {\vartheta }$, Lemma 6 with Corollary 7 and Lemma 3 with Lemma 7. However, the induction step requires that we can apply Corollary 7 again in step r + 1 if we could do so in step $r\in \mathbb {N}$. This readily follows from Condition 2 if e(r + 1) ≤ e(r) for all $r\in \mathbb {N}$.

We show this by induction on r. Abbreviate $x:=(3\bar {\vartheta }-1)U+(1-1/\bar {\vartheta })T$. Our claim is that (i) for $r\in \mathbb {N}$, $e(r)\geq x/(1-\bar {\beta })$ and (ii) for r ≥ 2, e(r) ≤ e(r − 1). The base case r = 1 requires (i) only, which holds by definition of e(1). For the step from r to r + 1, we bound

$$e(r + 1)=\bar{\beta}e(r)+x\geq \frac{\bar{\beta} x}{1-\bar{\beta}}+x=\frac{x}{1-\bar{\beta}} $$

and

$$e(r)-e(r + 1)=(1-\bar{\beta})e(r)-x\geq x-x = 0\,. $$

Finally, observe that our reasoning shows as part of the inductive argument that all rounds are executed correctly. □

5.2.2 Frequency Correction Steps

In the following, we assume that the prerequisites of Theorem 2 are satisfied. In particular, all rounds are executed correctly, i.e., we can assume that correct nodes receive each others’ pulses. We introduce some notation to capture the behavior of the (logical) rates of the nodes’ clocks. This notation may seem somewhat cumbersome; basically, the reader may think of the clock rates h_v(t) as being almost constant, implying that all considered values for a given node v ∈ C are essentially the same, slowly deviating at rate at most ν.

By $\vec {\rho }(r)$, we denote the vector whose entries are the intervals of clock rate ranges of nodes v ∈ C between the first pulses in rounds $r\in \mathbb {N}$ and r + 1. Concretely,

$$\vec{\rho}(r)_{v}:=\left[ \min\limits_{p_{v}(r)\leq t\leq p_{v}(r + 1)}\{\mu_{v}(r)h_{v}(t)\}, \max\limits_{p_{v}(r)\leq t\leq p_{v}(r + 1)}\{\mu_{v}(r)h_{v}(t)\} \right]. $$

By $\|\vec {\rho }(r)\|$, we denote the difference between maximum and minimum rate in $\vec {\rho }(r)$, i.e.,

$$\|\vec{\rho}(r)\|:=\max\limits_{v\in C}\max\limits_{p_{v}(r)\leq t\leq p_{v}(r + 1)}\{\mu_{v}(r)h_{v}(t)\} -\min\limits_{v\in C}\min\limits_{p_{v}(r)\leq t\leq p_{v}(r + 1)}\{\mu_{v}(r)h_{v}(t)\}\,. $$

Furthermore, we denote by $\bar {\rho }(r)_{v}:=\mu _{v}(r)h_{v}((p_{v}(r)+p_{v}(r + 1))/2)$, by $\bar {\rho }(r)$ the respective vector, and by $\|\bar {\rho }(r)\|:=\max _{v\in C}\{\bar {\rho }(r)_{v}\}-\min _{v\in C}\{\bar {\rho }(r)_{v}\}$. Note that $\bar {\rho }(r)_{v}\in \vec {\rho }(r)_{v}$ by definition.

We start by showing that $\bar {\rho }(r)_{v}$ approximates μ_v(r)h_v(t) well for times t between pulse r and r + 1 of v ∈ C, i.e., we may see $\bar {\rho }(r)_{v}$ as “the” clock rate of v in round r.

Lemma 8

Let t ∈ [p_v(r), p_v(r + 1)] for some v ∈ C and $r\in \mathbb {N}$. Then

$$|\mu_{v}(r)h_{v}(t)-\bar{\rho}(r)_{v}|<\nu\, \frac{T+\tau_{2}}{2}\,. $$

Proof

Using that hardware clock rates are at least 1 and that |Δ_v(r)| < max{τ₁, τ₂} = τ₂, we see that

$$\left|t-\frac{p_{v}(r + 1)+p_{v}(r)}{2}\right|\leq \frac{|p_{v}(r + 1)-p_{v}(r)|}{2} \leq \frac{|T-{\Delta}_{v}(r)|}{2\mu_{v}(r)}<\frac{T+\tau_{2}}{2\mu_{v}(r)}\,. $$

By our assumptions on the hardware clocks, this yields that

$$\begin{array}{@{}rcl@{}} \left|\mu_{v}(r)\left( h_{v}(t)-h_{v}\left( \frac{p_{v}(r + 1)+p_{v}(r)}{2}\right)\right)\right| &\leq& \mu_{v}(r)\cdot\nu \left|t-\frac{p_{v}(r + 1)+p_{v}(r)}{2}\right|\\ &<&\nu\,\frac{T+\tau_{2}}{2}\,. \end{array} $$

□

Two corollaries relate the progress of the hardware clocks between (i) p_v(r) and q_v(r) and (ii) $t_{wv}^{\prime }$ and t_wv to $\bar {\rho }(r)_{v}$, respectively.

Corollary 8

For v ∈ C and $r\in \mathbb {N}$, we have that

$$|\bar{\rho}(r)_{v}(q_{v}(r)-p_{v}(r))-(\tau_{2}+\tau_{3})|<\nu T(\tau_{2}+\tau_{3})\,. $$

Proof

Let $\rho \in \vec {\rho }(r)_{v}$ such that ρ(q_v(r) − p_v(r)) = τ₂ + τ₃. By definition of $\vec {\rho }(r)_{v}$ and the mean value theorem, such a ρ exists and ρ = μ_v(r)h_v(t) for some t ∈ [p_v(r), p_v(r + 1)]. By Lemma 8, $|\rho -\bar {\rho }(r)_{v}|<\nu T$. Thus,

$$\begin{array}{@{}rcl@{}} |\bar{\rho}(r)_{v}(q_{v}(r)-p_{v}(r))-(\tau_{2}+\tau_{3})| &=&|\rho-\bar{\rho}(r)_{v}|(q_{v}(r)-p_{v}(r))\\ &=&|\rho-\bar{\rho}(r)_{v}|\,\frac{\tau_{2}+\tau_{3}}{\rho}\\ &<&\nu T(\tau_{2}+\tau_{3})\,. \end{array} $$

□

Corollary 9

For v, w ∈ C and $r\in \mathbb {N}$, we have that

$$|\mu_{v}(r)(H_{v}(t_{wv}^{\prime})-H_{v}(t_{wv}))-\bar{\rho}(r)_{v}(t_{wv}^{\prime}-t_{wv})| <\nu T(\tau_{2}+\tau_{3})\,. $$

Proof

Let $\rho \in \vec {\rho }(r)_{v}$ such that $\rho (t_{wv}^{\prime }-t_{wv})=\mu _{v}(r)(H_{v}(t_{wv}^{\prime })-H_{v}(t_{wv}))$. By definition of $\vec {\rho }(r)_{v}$ and the mean value theorem, such a ρ exists and ρ = μ_v(r)h_v(t) for some t ∈ [t_wv, t′_wv] ⊆ [p_v(r), p_v(r + 1)]. By Lemma 8, $|\rho -\bar {\rho }(r)_{v}|<\nu (T+\tau _{2})/2$. Thus,

$$\begin{array}{@{}rcl@{}} |\mu_{v}(r)(H_{v}(t_{wv}^{\prime})-H_{v}(t_{wv}))-\bar{\rho}(r)_{v}(t_{wv}^{\prime}-t_{wv})| &=&|\rho-\bar{\rho}(r)_{v}|(t_{wv}^{\prime}-t_{wv})\\ &<&\nu \,\frac{T+\tau_{2}}{2}(\tau_{2}+\tau_{3}+U)\\ &<&\nu T(\tau_{2}+\tau_{3})\,, \end{array} $$

where the second last step exploits that $t_{wv}^{\prime }-t_{wv}\leq q_{w}(r)+d-(p_{w}(r)+d-U)\leq \tau _{2}+\tau _{3}+U$, since clock rates are at least 1, and the final inequality easily follows from Condition 2. □

These results put us in the position to prove that 1 − μ_v(r)Δ_wv/(τ₂ + τ₃) is indeed a good estimate of $\bar {\rho }(r)_{w}-\bar {\rho }(r)_{v}$. Thus, this (computable) value can serve as a proxy for the difference between “the” clock rates of w and v in round r.

Lemma 9

For v, w ∈ C and $r\in \mathbb {N}$, we have that

$$\left|\bar{\rho}(r)_{w}-\bar{\rho}(r)_{v} -\left( 1-\frac{\mu_{v}(r){\Delta}_{wv}}{\tau_{2}+\tau_{3}}\right)\right| \leq \vartheta^{3}\left( 1-\frac{1}{\vartheta^{3}}\right)^{2} + \frac{\vartheta^{3} U}{\tau_{2}+\tau_{3}}+(\vartheta^{3}+ 1) \nu T\,. $$

Proof

We have

$$ |t_{wv}^{\prime}-t_{wv}-(q_{w}(r)-p_{w}(r))|\leq U $$

(5)

and by Corollaries 8 and 9 that

$$\begin{array}{@{}rcl@{}} \left|\frac{q_{w}(r)-p_{w}(r)}{\tau_{2}+\tau_{3}}-\frac{1}{\bar{\rho}(r)_{w}}\right|& <&\frac{\nu T}{\bar{\rho}(r)_{w}}\leq \nu T \end{array} $$

(6)

$$\begin{array}{@{}rcl@{}} \left|\frac{\mu_{v}(r){\Delta}_{wv}}{t_{wv}^{\prime}-t_{wv}}-\bar{\rho}(r)_{v}\right| &<&\nu T\,. \end{array} $$

(7)

Note that $|\mu _{v}(r){\Delta }_{wv}/(t_{wv}^{\prime }-t_{wv})|\leq \vartheta ^{3}$. Therefore,

$$\begin{array}{@{}rcl@{}} \left|\frac{\bar{\rho}(r)_{v}}{\bar{\rho}(r)_{w}} -\frac{\mu_{v}(r){\Delta}_{wv}}{\tau_{2}+\tau_{3}}\right| &=& \left|\frac{\bar{\rho}(r)_{v}}{\bar{\rho}(r)_{w}} -\frac{\mu_{v}(r){\Delta}_{wv}}{t_{wv}^{\prime}-t_{wv}} \cdot\frac{t_{wv}^{\prime}-t_{wv}}{q_{w}(r)-p_{w}(r)} \cdot\frac{q_{w}(r)-p_{w}(r)}{\tau_{2}+\tau_{3}}\right|\\ &&\overset{(5)}{\leq} \left|\frac{\bar{\rho}(r)_{v}}{\bar{\rho}(r)_{w}} -\frac{\mu_{v}(r){\Delta}_{wv}}{t_{wv}^{\prime}-t_{wv}} \cdot\frac{q_{w}(r)-p_{w}(r)}{\tau_{2}+\tau_{3}}\right|+ \frac{\vartheta^{3} U}{\tau_{2}+\tau_{3}}\\ &&\overset{(6)}{\leq} \left|\frac{\bar{\rho}(r)_{v}}{\bar{\rho}(r)_{w}} -\frac{\mu_{v}(r){\Delta}_{wv}}{t_{wv}^{\prime}-t_{wv}} \cdot \frac{1}{\bar{\rho}(r)_{w}}\right|+ \frac{\vartheta^{3} U}{\tau_{2}+\tau_{3}}+\vartheta^{3} \nu T\\ &&\overset{(7)}{\leq} \frac{\vartheta^{3} U}{\tau_{2}+\tau_{3}}+(\vartheta^{3}+ 1) \nu T\,. \end{array} $$

Moreover,

$$\begin{array}{@{}rcl@{}} \left|\bar{\rho}(r)_{w}-\bar{\rho}(r)_{v} -\left( 1-\frac{\bar{\rho}(r)_{v}}{\bar{\rho}(r)_{w}}\right)\right| &=&\left( 1-\frac{1}{\bar{\rho}(r)_{w}}\right)|\bar{\rho}(r)_{w}-\bar{\rho}(r)_{v}|\\ &\leq &\left( 1-\frac{1}{\vartheta^{3}}\right)(\vartheta^{3}-1)\,. \end{array} $$

We conclude that

$$\left|\bar{\rho}(r)_{w}-\bar{\rho}(r)_{v} -\left( 1-\frac{\mu_{v}(r){\Delta}_{wv}}{\tau_{2}+\tau_{3}}\right)\right| \leq \vartheta^{3}\left( 1-\frac{1}{\vartheta^{3}}\right)^{2} + \frac{\vartheta^{3} U}{\tau_{2}+\tau_{3}}+(\vartheta^{3}+ 1) \nu T\,. $$

□
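The first term of the final bound is the estimate from the preceding display in rewritten form. As a one-line check, added here for the reader,

$$\left(1-\frac{1}{\vartheta^{3}}\right)(\vartheta^{3}-1) = \left(1-\frac{1}{\vartheta^{3}}\right)\cdot \vartheta^{3}\left(1-\frac{1}{\vartheta^{3}}\right) = \vartheta^{3}\left(1-\frac{1}{\vartheta^{3}}\right)^{2}\,, $$

so the claim follows from the two displayed chains of inequalities by the triangle inequality.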

We remark that the Θ((1 − 1/𝜗³)²) factor is, more precisely, bounded as ${\Theta }((1-1/\vartheta ^{3})\|\bar {\rho }(r)\|)$. However, for this to be of use, we would have to choose ε depending on r. Since rule-of-thumb calculations show that this term is unlikely to be significant in any real system and the improvement would not extend to the self-stabilizing variant of the algorithm, we refrained from adding this additional complication.

Given that we can bound the “measurement error” of the frequency correction step by Lemma 9, the results from Section 4.1 can be invoked to show convergence. First, we analyze the properties of $\hat {\mu }_{v}(r + 1)$, which Lemma 11 then uses to control μ_v(r + 1).

Lemma 10

For v ∈ C and $r\in \mathbb {N}$, abbreviate $\bar {t}_{v}:= (p_{v}(r)+p_{v}(r + 1))/2$, i.e., $\bar {\rho }(r)_{v}=\mu _{v}(r)h_{v}(\bar {t}_{v})$. Then, for all v, w ∈ C,

$$|\hat{\mu}_{v}(r + 1)h_{v}(\bar{t}_{v})-\hat{\mu}_{w}(r + 1)h_{w}(\bar{t}_{w})|\leq \frac{2\vartheta-1}{2}\,\|\bar{\rho}(r)\|+\vartheta\varepsilon\,. $$

Furthermore,

$$\begin{array}{@{}rcl@{}} (\hat{\mu}_{v}(r + 1)-\varepsilon)h_{v}(\bar{t}_{v}) &\leq& \max\limits_{u\in C}\left\{\mu_{u}(r)h_{u}(\bar{t}_{u})\right\}-\frac{\varepsilon}{2}\\ (\hat{\mu}_{v}(r + 1)+\varepsilon)h_{v}(\bar{t}_{v}) &\geq& \min\limits_{u\in C}\left\{\mu_{u}(r)h_{u}(\bar{t}_{u})\right\}+\frac{\varepsilon}{2}\,. \end{array} $$

Proof

Set δ := 𝜗³(1 − 𝜗⁻³)² + 𝜗³U/(τ₂ + τ₃) + (𝜗³ + 1)νT. Observe that, according to Lemma 9, we can interpret $\bar {\rho }(r)_{v}+\xi _{v}(r)$, v ∈ C, as the results of an approximate agreement step with error δ on inputs $\bar {\rho }(r)$. By Lemma 2, this implies that

$$|\hat{\mu}_{v}(r)h_{v}(\bar{t}_{v})+\xi_{v}(r)-(\hat{\mu}_{w}(r)h_{w}(\bar{t}_{w})+\xi_{w}(r))| \leq \frac{\|\bar{\rho}(r)\|}{2}+ 2\delta\,. $$

By Corollary 1, $\max _{u\in C}\{|\xi _{u}(r)|\}\leq \|\bar {\rho }(r)\|+\delta $. Hence, we have for u ∈ C that

$$\begin{array}{@{}rcl@{}} |\hat{\mu}_{u}(r + 1)h_{u}(\bar{t}_{u})-(\hat{\mu}_{u}(r)h_{u}(\bar{t}_{u})+\xi_{u}(r))| &=&\left|\frac{2h_{u}(\bar{t}_{u})}{\vartheta+ 1}-1\right|\cdot|\xi_{u}(r)| \\ &\leq& \frac{\vartheta-1}{\vartheta+ 1}(\|\bar{\rho}(r)\|+\delta)\,. \end{array} $$

(8)

Using this bound for both v and w, we conclude that

$$\begin{array}{@{}rcl@{}} |\hat{\mu}_{v}(r + 1)h_{v}(\bar{t}_{v})-\hat{\mu}_{w}(r + 1)h_{w}(\bar{t}_{w})| &\leq &\frac{\|\bar{\rho}(r)\|}{2}+ 2\delta+ \frac{2(\vartheta-1)}{\vartheta+ 1}(\|\bar{\rho}(r)\|+\delta)\\ &<&\frac{2\vartheta-1}{2}\,\|\bar{\rho}(r)\|+(\vartheta+ 1) \delta\\ &<&\frac{2\vartheta-1}{2}\,\|\bar{\rho}(r)\|+\vartheta\varepsilon\,. \end{array} $$

For the second claim of the lemma, we apply Lemma 1. Together with (8), this shows for v ∈ C that

$$\begin{array}{@{}rcl@{}} \hat{\mu}_{v}(r + 1)h_{v}(\bar{t}_{v}) &<& \max\limits_{u\in C}\left\{\mu_{u}(r)h_{u}(\bar{t}_{u})\right\}+\delta +\frac{h_{v}(\bar{t}_{v})-1}{2}\,(\|\bar{\rho}(r)\|+\delta)\\ \hat{\mu}_{v}(r + 1)h_{v}(\bar{t}_{v}) &>& \min\limits_{u\in C}\left\{\mu_{u}(r)h_{u}(\bar{t}_{u})\right\}-\left( \delta +\frac{h_{v}(\bar{t}_{v})-1}{2}\,(\|\bar{\rho}(r)\|+\delta)\right), \end{array} $$

where we used that $2h_{v}(\bar {t}_{v})/(\vartheta + 1)-1\leq (h_{v}(\bar {t}_{v})-1)/2$. By Condition 2 (and because $\|\bar {\rho }(r)\|\leq \vartheta ^{3}-1$),

$$\frac{\varepsilon}{2}\, h_{v}(\bar{t}_{v})\geq \left( \delta+\frac{(\vartheta-1)(\vartheta^{3}-1)}{2}\right) h_{v}(\bar{t}_{v}) >\delta +\frac{h_{v}(\bar{t}_{v})-1}{2}\,(\|\bar{\rho}(r)\|+\delta)\,. $$

Combining this with the above inequalities completes the proof. □
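For reference, the two uses of Condition 2 in this proof can be unpacked as follows. This is a reader's check based on the lower bound on ε from Condition 2 and the definition of δ at the start of the proof, where the last three summands of that lower bound add up to 2δ.

$$\begin{array}{@{}rcl@{}} \varepsilon &\geq& 2\left((\vartheta-1)(\vartheta^{3}-1)+2\delta\right) \geq 4\delta\,,\\ \vartheta\varepsilon &\geq& 4\vartheta\delta > (\vartheta+1)\delta\,,\\ \frac{\varepsilon}{2} &\geq& (\vartheta-1)(\vartheta^{3}-1)+2\delta \geq \delta+\frac{(\vartheta-1)(\vartheta^{3}-1)}{2}\,. \end{array} $$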

Lemma 11

For round $r\in \mathbb {N}$ and v ∈ C, abbreviate $\bar {t}_{v}:= (p_{v}(r)+p_{v}(r + 1))/2$, i.e., $\bar {\rho }(r)_{v}=\mu _{v}(r)h_{v}(\bar {t}_{v})$. For all v, w ∈ C, we have that

$$|\mu_{v}(r + 1)h_{v}(\bar{t}_{v})-\mu_{w}(r + 1)h_{w}(\bar{t}_{w})|\leq \max\left\{ \frac{2\vartheta-1}{2}\,\|\bar{\rho}(r)\| + 3\vartheta \varepsilon, \|\bar{\rho}(r)\|-\frac{\varepsilon}{2}\right\}. $$

Proof

Let v ∈ C and w ∈ C maximize and minimize $\mu _{u}(r + 1)h_{u}(\bar {t}_{u})$ over u ∈ C, respectively. By Lemma 10, we have that

$$|\hat{\mu}_{v}(r + 1)h_{v}(\bar{t}_{v})-\hat{\mu}_{w}(r + 1)h_{w}(\bar{t}_{w})|< \frac{2\vartheta-1}{2}\,\|\bar{\rho}(r)\|+\vartheta\varepsilon\,. $$

We make a case distinction.

Case 1: $\mu _{v}(r + 1)-\hat {\mu }_{v}(r + 1)\leq \varepsilon $ and $\hat {\mu }_{w}(r + 1)-\mu _{w}(r + 1)\leq \varepsilon $. Because we have that $\max \{h_{v}(\bar {t}_{v}),h_{w}(\bar {t}_{w})\}\leq \vartheta $, we get
$$\begin{array}{@{}rcl@{}} \mu_{v}\!(r\,+\,1)h_{v}\!(\bar{t}_{v})\,-\,\mu_{w}\!(r\,+\,1)h_{w}\!(\bar{t}_{w}) \!&\leq&\! (\mu_{v}(r\,+\,1)\,-\,\hat{\mu}_{v}(r\,+\,1))h_{v}(\bar{t}_{v})\\ &&~+\hat{\mu}_{v}\!(r\,+\,1)h_{v}\!(\bar{t}_{v})\,-\,\hat{\mu}_{w}\!(r\,+\,1)h_{w}(\bar{t}_{w})\\ &&~+(\hat{\mu}_{w}(r\,+\,1)-\mu_{w}(r\,+\,1))h_{w}(\bar{t}_{w})\\ &\leq& \frac{2\vartheta-1}{2}\,\|\bar{\rho}(r)\|+ 3\vartheta\varepsilon\,. \end{array} $$
Case 2: $\mu _{v}(r + 1)-\hat {\mu }_{v}(r + 1)>\varepsilon $. This implies that μ_v(r + 1) = 1 ≤ μ_v(r).
- $\hat {\mu }_{w}(r + 1)\leq \vartheta $, i.e., we have that $\mu _{w}(r + 1)\geq \hat {\mu }_{w}(r + 1)+\varepsilon $. Using Lemma 10, we bound
  $$\begin{array}{@{}rcl@{}} \mu_{v}(r\,+\,1)h_{v}(\bar{t}_{v})\,-\,\mu_{w}(r\,+\,1)h_{w}(\bar{t}_{w})\!&\leq&\! h_{v}(\bar{t}_{v})\mu_{v}(r)\\&&-\!\left( \!\min\limits_{u\in C}\{ \mu_{u}(r)h_{u}(\bar{t}_{u})\}\,+\,\frac{\varepsilon}{2}\right)\\ &\leq& \|\bar{\rho}(r)\|-\frac{\varepsilon}{2}\,. \end{array} $$
- $\hat {\mu }_{w}(r + 1)> \vartheta $, yielding that μ_w(r + 1) ≥ 𝜗 − ε. It follows that
  $$\mu_{v}(r + 1)h_{v}(\bar{t}_{v})-\mu_{w}(r + 1)h_{w}(\bar{t}_{w}) \leq h_{v}(\bar{t}_{v})-(\vartheta-\varepsilon) \leq \varepsilon\,. $$
Case 3: $\hat {\mu }_{w}(r + 1)-\mu _{w}(r + 1)> \varepsilon $. This implies that μ_w(r + 1) = 𝜗² ≥ μ_w(r).
- $\hat {\mu }_{v}(r + 1)> \vartheta $, i.e., we have that $\mu _{v}(r + 1)\leq \hat {\mu }_{v}(r + 1)-\varepsilon $. Using Lemma 10, we bound
  $$\begin{array}{@{}rcl@{}} \mu_{v}(r\,+\,1)h_{v}(\bar{t}_{v})\,-\,\mu_{w}(r\,+\,1)h_{w}(\bar{t}_{w})\!&\leq& \!\left( \max\limits_{u\in C}\{ \mu_{u}(r)h_{u}(\bar{t}_{u})\}-\frac{\varepsilon}{2}\right) \\&&-h_{w}(\bar{t}_{w})\mu_{w}(r)\\ &\leq& \|\bar{\rho}(r)\|-\frac{\varepsilon}{2}\,. \end{array} $$
- $\hat {\mu }_{v}(r + 1)\leq \vartheta $, yielding that μ_v(r + 1) ≤ 𝜗 + ε. It follows that
  $$\mu_{v}(r + 1)h_{v}(\bar{t}_{v})-\mu_{w}(r + 1)h_{w}(\bar{t}_{w}) \leq (\vartheta+\varepsilon)h_{v}(\bar{t}_{v})-\vartheta^{2} \leq \vartheta\varepsilon\,. $$

In all cases, we get that

$$\begin{array}{@{}rcl@{}} && \max\limits_{u,u^{\prime}\in C}\{|\mu_{u}(r + 1)h_{u}(\bar{t}_{u})-\mu_{u^{\prime}}(r + 1)h_{u^{\prime}}(\bar{t}_{u^{\prime}})|\}\\ &=&\;\mu_{v}(r + 1)h_{v}(\bar{t}_{v})-\mu_{w}(r + 1)h_{w}(\bar{t}_{w})\\ &\leq &\; \max\left\{ \frac{2\vartheta-1}{2}\,\|\bar{\rho}(r)\| + 3\vartheta\varepsilon, \|\bar{\rho}(r)\|-\frac{\varepsilon}{2}\right\}. \end{array} $$

□

It remains to take into account that hardware clock speeds change between rounds using Lemma 8.

Corollary 10

For all $r\in \mathbb {N}$,

$$\|\bar{\rho}(r + 1)\|\leq \max\left\{ \frac{2\vartheta-1}{2}\,\|\bar{\rho}(r)\| + 3\vartheta\varepsilon, \|\bar{\rho}(r)\|-\frac{\varepsilon}{2}\right\} + 2\nu(T+\tau_{2})\,. $$

Proof

The claim follows by applying Lemma 11 and noting that, for all v ∈ C, $|\bar {\rho }(r)_{v}-\bar {\rho }(r + 1)_{v}|\leq \nu (T+\tau _{2})$ by Lemma 8. □

We conclude that the steady state frequency error is in O(ε).

Corollary 11

Assume that β := (2𝜗 − 1)/2 < 1. Then

$$\lim\limits_{r\to \infty}\sup\limits_{r^{\prime}\geq r}\{\|\vec{\rho}(r^{\prime})\|\}\leq \frac{3\vartheta\varepsilon+ 2\nu(T+\tau_{2})}{1-\beta}+\nu(T+\tau_{2})\in O(\varepsilon)\,. $$

Proof

From iterative application of Corollary 10, we get that

$$\lim\limits_{r\to \infty}\sup\limits_{r^{\prime}\geq r}\{\|\bar{\rho}(r^{\prime})\|\}\leq \frac{3\vartheta\varepsilon+ 2\nu(T+\tau_{2})}{1-\beta}\,. $$

Lemma 8 shows that $\|\vec {\rho }(r^{\prime })\|\leq \|\bar {\rho }(r^{\prime })\|+\nu (T+\tau _{2})$, which yields the claimed bound. Since Condition 2 holds, 1 − β ∈ Ω(1) and the overall error is bounded by O(ε). □
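The convergence behind Corollary 11 can be illustrated numerically. The following is a minimal sketch that iterates the upper bound of Corollary 10 as if it held with equality. The function names and the sample values are illustrative assumptions, `drift` is shorthand for ν(T + τ₂), and the sketch assumes 2ν(T + τ₂) < ε/2, so that the second branch of the maximum strictly decreases.

```python
def iterate_frequency_bound(x0, theta, eps, drift, rounds):
    """Iterate the Corollary 10 upper bound, treated as an equality.

    x0    -- assumed initial value of the frequency error bound
    theta -- maximum hardware clock rate; beta = (2*theta - 1)/2 must be < 1
    eps   -- the parameter epsilon of the frequency correction
    drift -- shorthand for nu * (T + tau_2)
    """
    beta = (2 * theta - 1) / 2
    if not beta < 1:
        raise ValueError("beta must be smaller than 1")
    x = x0
    for _ in range(rounds):
        # One round: contraction branch vs. slow linear decrease, plus drift.
        x = max(beta * x + 3 * theta * eps, x - eps / 2) + 2 * drift
    return x


def steady_state_limit(theta, eps, drift):
    """Fixed point (3*theta*eps + 2*drift) / (1 - beta) of the iteration."""
    beta = (2 * theta - 1) / 2
    return (3 * theta * eps + 2 * drift) / (1 - beta)


if __name__ == "__main__":
    theta, eps, drift = 1.01, 1e-4, 1e-5  # illustrative values only
    print(iterate_frequency_bound(0.02, theta, eps, drift, 5000))
    print(steady_state_limit(theta, eps, drift))
```

Starting from a large value, the iteration first decreases linearly (second branch) and then contracts geometrically with factor β towards the fixed point.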

5.2.3 Steady State Error with Frequency Correction

To make use of Corollary 11, we need to derive a variant of Corollary 4 that allows for better control of $\|\vec {p}(r + 1)\|$ in case $\|\bar {\rho }(r)\|$ is small.

Lemma 12

If round $r\in \mathbb {N}$ is executed correctly, then

$$\|\vec{p}(r + 1)\|\leq \frac{4\bar{\vartheta}^{2}+ 5\bar{\vartheta}-7}{2(\bar{\vartheta}+ 1)} \|\vec{p}(r)\|+\left( 4\bar{\vartheta}-2\right)U +\|\vec{\rho}(r)\|T\,. $$

Proof

For v, w ∈ C, assume w.l.o.g. that p_v(r + 1) − p_w(r + 1) ≥ 0 (the other case is symmetric). Denote by $\rho _{v}\in \vec {\rho }(r)_{v}$ the average (adjusted) clock rate of v during [p_v(r), p_v(r + 1)], i.e.,

$$T-{\Delta}_{v}(r)=\frac{H_{v}(p_{v}(r + 1))-H_{v}(p_{v}(r))}{\mu_{v}(r)}=\rho_{v}(p_{v}(r + 1)-p_{v}(r))\,; $$

ρ_w is defined analogously for w. Recall that $1\leq \rho _{u}\leq \bar {\vartheta }$ for u ∈{v, w}. Using this and Corollary 3 (with 𝜗 replaced by $\bar {\vartheta }=\vartheta ^{3}$), we conclude that

$$\begin{array}{@{}rcl@{}} &&\quad~p_{v}(r + 1)-p_{w}(r + 1)\\ &&= p_{v}(r)-p_{w}(r)+\frac{T-{\Delta}_{v}(r)}{\rho_{v}} -\frac{T-{\Delta}_{w}(r)}{\rho_{w}}\\ &&\leq p_{v}(r)-{\Delta}_{v}(r)-(p_{w}(r)-{\Delta}_{w}(r)) +\frac{\rho_{w}-\rho_{v}}{\rho_{v}\rho_{w}}\,T\\ &&\qquad +\left( 1-\frac{1}{\rho_{v}}\right)|{\Delta}_{v}(r)| +\left( 1-\frac{1}{\rho_{w}}\right)|{\Delta}_{w}(r)|\\ &&\leq \frac{5\bar{\vartheta}-3}{2(\bar{\vartheta}+ 1)}\|\vec{p}(r)\|+ 2\bar{\vartheta} U +\|\vec{\rho}(r)\|T + 2(\bar{\vartheta}-1)(\|\vec{p}(r)\|+U)\\ &&=\frac{4\bar{\vartheta}^{2}+ 5\bar{\vartheta}-7}{2(\bar{\vartheta}+ 1)} \|\vec{p}(r)\|+\left( 4\bar{\vartheta}-2\right)U +\|\vec{\rho}(r)\|T\,. \end{array} $$

□
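For readers verifying the final equality in the proof of Lemma 12, the coefficients of $\|\vec{p}(r)\|$ and of U combine as follows (a worked step using only the terms of the preceding line):

$$\frac{5\bar{\vartheta}-3}{2(\bar{\vartheta}+ 1)}+2(\bar{\vartheta}-1) =\frac{5\bar{\vartheta}-3+4(\bar{\vartheta}^{2}-1)}{2(\bar{\vartheta}+ 1)} =\frac{4\bar{\vartheta}^{2}+ 5\bar{\vartheta}-7}{2(\bar{\vartheta}+ 1)}\,,\qquad 2\bar{\vartheta}U+2(\bar{\vartheta}-1)U=(4\bar{\vartheta}-2)U\,. $$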

Plugging this into our machinery we arrive at the main result of this section.

Theorem 3

Suppose that Condition 2 is satisfied and that

$$\bar{\alpha}:=\frac{2\bar{\vartheta}^{2}+ 5\bar{\vartheta}-5}{2(\bar{\vartheta}+ 1)}+(4\bar{\vartheta}+ 3)(\bar{\vartheta}-1)<1 $$

(which is the case for 𝜗 ≤ 1.01). Then, with $\alpha :=(4\bar {\vartheta }^{2}+ 5\bar {\vartheta }-7)/(2(\bar {\vartheta }+ 1))<1$ and β := (2𝜗 − 1)/2 < 1, Algorithm 3 has steady state error

$$E\leq \frac{(4\bar{\vartheta}-2)U+\nu(T+\tau_{2})T}{1-\alpha} +\frac{(3\vartheta\varepsilon+ 2\nu(T+\tau_{2}))T}{(1-\alpha)(1-\beta)}\,. $$

Proof

As the preconditions of Theorem 2 are satisfied, all rounds are executed correctly. By Corollary 11, this implies that

$$\lim\limits_{r\to \infty}\sup\limits_{r^{\prime}\geq r}\{\|\vec{\rho}(r^{\prime})\|\}\leq \frac{3\vartheta\varepsilon+ 2\nu(T+\tau_{2})}{1-\beta}+\nu(T+\tau_{2})\,. $$

We plug this into the bound from Lemma 12, which we apply inductively to show that

$$\begin{array}{@{}rcl@{}} E=\lim\limits_{r\to \infty}\sup\limits_{r^{\prime}\geq r}\{\|\vec{p}(r^{\prime})\|\}&\leq& \frac{(4\bar{\vartheta}-2)U+\lim_{r\to \infty}\sup_{r^{\prime}\geq r} \{\|\vec{\rho}(r^{\prime})\|T\}}{1-\alpha}\\ &\leq& \frac{(4\bar{\vartheta}-2)U+\nu(T+\tau_{2})T}{1-\alpha} +\frac{(3\vartheta\varepsilon+ 2\nu(T+\tau_{2}))T}{(1-\alpha)(1-\beta)}\,. \end{array} $$

□
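The precondition of Theorem 3 is easy to evaluate for a concrete 𝜗. The sketch below only computes the three constants from their definitions in the theorem statement; the function name is a hypothetical helper, and the code does not check Condition 2.

```python
def theorem3_constants(theta):
    """Return (alpha_bar, alpha, beta) as defined in Theorem 3."""
    tb = theta ** 3  # \bar{vartheta} = vartheta^3
    alpha_bar = ((2 * tb**2 + 5 * tb - 5) / (2 * (tb + 1))
                 + (4 * tb + 3) * (tb - 1))
    alpha = (4 * tb**2 + 5 * tb - 7) / (2 * (tb + 1))
    beta = (2 * theta - 1) / 2
    return alpha_bar, alpha, beta


if __name__ == "__main__":
    for theta in (1.0, 1.001, 1.01):  # illustrative sample points
        alpha_bar, _, _ = theorem3_constants(theta)
        print(theta, alpha_bar < 1)
```

In the drift-free limit 𝜗 = 1 all three constants equal 1/2, which matches the intuition that the error contracts by about a factor of two per round when 𝜗 is close to 1.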

Under reasonable assumptions we can obtain a more readable error bound. Intuitively, we require that (i) 𝜗 is not too large, so that α ≈ 1/2, (ii) rounds are long enough to allow for a sufficiently accurate frequency measurement, which is the case if T ≫ max{F, U}, i.e., rounds are long compared to both the precision F of the initialization and the uncertainty U, and (iii) rounds remain short enough to not let the drifting clocks dominate the error. The third condition amounts to two further constraints: we need that νT² ≪ U, since the rate of change of the speed of clocks enters the skew bound quadratically in T, and we also need that (𝜗 − 1)²T ≪ U, because inaccurate frequency measurements prevent us from synchronizing frequencies better than up to a factor of Θ((𝜗 − 1)²).

Corollary 12

Assume that the prerequisites of Theorem 3 are satisfied (including (1)). Moreover, suppose that

α ≈ 1/2,
ε is chosen minimally such that it satisfies Condition 2,
T ≈ τ₃ ≫ τ₂, which is feasible whenever $T\gg \bar {\vartheta } (e(1)+d)$, and
$\max \{(\bar {\vartheta }-1)^{2}T,\nu T^{2}\}\ll U$.

Then the steady state error of Algorithm 3 is bounded by roughly 28U.

Proof

Note that α ≈ 1/2 implies that β ≈ 1/2 and that $\bar {\vartheta }\approx 1$. Plugging ε into the bound from Theorem 3, the steady state error is approximately bounded by

$$\begin{array}{@{}rcl@{}} &&4U + 10\nu(T+\tau_{2})T + 12\varepsilon T\\ &\approx &\; 4U + 10\nu(T+\tau_{2})T + 12\left( 6(\bar{\vartheta}-1)^{2}+\frac{2U}{\tau_{2}+\tau_{3}}+ 4\nu T\right) T\\ &\approx &\; \left( 4+\frac{24T}{\tau_{2}+\tau_{3}}\right)U + 72(\bar{\vartheta}-1)^{2}T + 58\nu T^{2}\\ &\approx &\; 28U\,. \end{array} $$

□
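The arithmetic in this proof can be reproduced with the unapproximated bound of Theorem 3. The following sketch plugs in the approximate minimal ε that appears in the second line of the derivation above; the function name, the sample values, and the choice τ₃ = 0.98·T are assumptions made purely for illustration.

```python
def corollary12_estimate(theta, U, T, tau2, tau3, nu):
    """Evaluate the Theorem 3 bound on E with the approximate minimal
    epsilon used in the proof of Corollary 12."""
    tb = theta ** 3  # \bar{vartheta}
    alpha = (4 * tb**2 + 5 * tb - 7) / (2 * (tb + 1))
    beta = (2 * theta - 1) / 2
    # eps ~ 6(tb - 1)^2 + 2U / (tau2 + tau3) + 4 nu T, as in the proof
    eps = 6 * (tb - 1) ** 2 + 2 * U / (tau2 + tau3) + 4 * nu * T
    drift = nu * (T + tau2)
    phase = ((4 * tb - 2) * U + drift * T) / (1 - alpha)
    freq = (3 * theta * eps + 2 * drift) * T / ((1 - alpha) * (1 - beta))
    return phase + freq


if __name__ == "__main__":
    U = 1.0
    E = corollary12_estimate(theta=1 + 1e-6, U=U, T=1e4,
                             tau2=10.0, tau3=0.98e4, nu=1e-12)
    print(E / U)  # ratio of the bound to U
```

With these sample values both (𝜗̄ − 1)²T and νT² are several orders of magnitude below U, so the contribution of the frequency measurement, roughly 24U, dominates the bound.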

A few remarks:

Note that 𝜗 ≤ 1.01 implies that β < α < 0.55, $\bar {\vartheta }< 1.031$ and e(1) ≤ max{1.031F,0.07T + 4.65U}. Thus the requirements of the corollary are met if max{F, U} ≪ T and $\max \{(\bar {\vartheta }-1)^{2}T,\nu T^{2}\}\ll U$ for the minimal choice of ε, yielding the claim stated in the introduction.
Corollary 12 basically states that increasing T is fine, as long as $\max \{(\bar {\vartheta }-1)^{2}T,\nu T^{2}\}\ll U$. This improves over Algorithm 2, where it is required that (𝜗 − 1)T ≪ U, as it permits transmitting pulses at significantly smaller frequencies.
While the error bound of roughly 28U is about factor 7 larger than the about 4U Algorithm 2 provides, this is likely to be overly conservative. The source of this difference is that we assume that in a frequency measurement, the full uncertainty U may skew the observation of the relative clock speed. However, this measurement is based on sending two signals in the same direction over the same communication link in fairly short order. In most settings, the difference in delays will be much smaller than between messages on different communication links. Accordingly, the relative contribution of the frequency measurement to the error is likely to be much smaller in practice.
If this is not the case, one may extend the time span for a frequency measurement over multiple rounds to decrease the effect of the uncertainty. This requires that the accumulated phase corrections do not become so large as to prevent a clear distinction of the frequency-related pulse (whose sending time must not be altered due to phase corrections) from phase-related pulses.^{Footnote 12} To not further complicate the analysis, we refrained from presenting this option; it is used in [16, 17].

6 Self-Stabilization

In this section, we propose a generic mechanism that can be used to transform Algorithm 2 and Algorithm 3 into self-stabilizing solutions and give the corresponding main results in Theorem 4 and Theorem 5. An algorithm is self-stabilizing, if it (re)establishes correct operation from arbitrary states in bounded time. If there is an upper bound on the time this takes in the worst case, we refer to it as the stabilization time. We stress that, while self-stabilizing solutions to the problem are known, all of them have skew Ω(d); augmenting the Lynch-Welch approach with self-stabilization capabilities thus enables us to achieve an optimal skew bound of O((𝜗 − 1)T + U) in a Byzantine self-stabilizing manner for the first time.

Our approach can be summarized as follows. Nodes locally count their pulses modulo some $M\in \mathbb {N}$. We use a low-frequency, imprecise, but self-stabilizing synchronization algorithm (called FATAL) from earlier work [4, 5] to generate a “heartbeat.” On each such beat, nodes will locally check whether the next pulse with number 1 modulo M will occur within an expected time (local) window whose size is determined by the precision the algorithm would exhibit after M correctly executed pulses (in the non-stabilizing case). If this is not the case, the node is “reset” such that pulse 1 will occur within this time window.

This simple strategy ensures that a beat forces all nodes to generate a pulse with number 1 modulo M within a bounded time window. Assuming a value of F corresponding to its length in Algorithm 2 or Algorithm 3 hence ensures that the respective algorithm will run as intended—at least up to the point when the next beat occurs. Inconveniently, if the beat is not synchronized with the next occurrence of a pulse 1 mod M, some or all nodes may be reset, breaking the guarantees established by the perpetual application of approximate agreement steps. This issue is resolved by leveraging a feedback mechanism provided by FATAL: FATAL offers a (configurable) time window during which a NEXT signal externally provided to each node may trigger the next beat. If this signal arrives at each correct node at roughly the same time, we can be sure that the corresponding beat is generated shortly thereafter. This allows for sufficient control on when the next beat occurs to prevent any node from ever being reset after the first (correct) beat. Since FATAL stabilizes regardless of how the externally provided signals behave, this suffices to achieve stabilization of the resulting compound algorithm.

6.1 FATAL

We summarize the properties of FATAL in the following corollary, where each node has the ability to trigger a local NEXT signal perceived by the local instance of FATAL at any time.

Corollary 13 (of [5])

For suitable parameters $P,B_{1},B_{2},B_{3},D\in \mathbb {R}^{+}$, FATAL stabilizes within O((B₁ + B₂ + B₃)n) time with probability 1 − 2^−Ω(n). Once stabilized, nodes v ∈ C generate beats b_v(k), $k\in \mathbb {N}$, such that the following properties hold for all $k\in \mathbb {N}$.

1. For all v, w ∈ C, we have that |b_v(k) − b_w(k)| ≤ P.
2. If no v ∈ C triggers its NEXT signal during [min_w∈C{b_w(k)} + B₁, t] for some t ≤ min_w∈C{b_w(k)} + B₁ + B₂ + B₃, then min_w∈C{b_w(k + 1)} ≥ t.
3. If all v ∈ C trigger their NEXT signals during [min_w∈C{b_w(k)} + B₁ + B₂, t] for some t ≤ min_w∈C{b_w(k)} + B₁ + B₂ + B₃, then max_w∈C{b_w(k + 1)} ≤ t + P.

Denoting by d_F the maximum end-to-end delay (sum of maximum message and computational delay) of FATAL, for any ϕ ≥ 1 and any constant C we can ensure that

$$\begin{array}{@{}rcl@{}} P&\in& O(d_{F})\\ B_{1}&\geq& P+d\\ B_{1}+B_{2}+B_{3}&\in& {\Theta}(\phi\cdot (d_{F}+d))\\ B_{3}&\geq& C(B_{1}+B_{2})\,. \end{array} $$

Proof

For ϕ = 1, all statements follow directly from Lemma 3.4 and Corollary 4.16 in [5], noting that nodes will switch from state ready to propose (in the main state machine) in response to a NEXT signal if their timeout T₃ is expired. Once all correct nodes switched to propose, this results in all nodes switching to accept and generating a beat within d_F time. For ϕ > 1, one simply needs to observe that multiplying each timeout for choices satisfying Condition 3.3 in [5] by ϕ results in another valid choice; the bound on the stabilization time given in Corollary 4.16 scales accordingly. □
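To make the second and third property of Corollary 13 concrete, the following sketch derives the time window for beat k + 1 from the times at which correct nodes trigger NEXT. It is an illustration of the stated guarantees only, not part of FATAL; the function name and the list-based input format are assumptions.

```python
def next_beat_guarantees(b_min, B1, B2, B3, P, next_triggers):
    """Bounds on beat k+1 implied by properties 2 and 3 of Corollary 13.

    b_min         -- earliest time at which a correct node generated beat k
    next_triggers -- one list per correct node with the times at which
                     that node triggers its NEXT signal
    Returns (lower, upper): no correct node generates beat k+1 before
    `lower`; if `upper` is not None, all correct nodes do so by `upper`.
    """
    window_end = b_min + B1 + B2 + B3

    # Property 2: while nobody triggers NEXT from time b_min + B1 on,
    # the next beat cannot occur (up to the end of the window).
    relevant = [t for ts in next_triggers for t in ts if t >= b_min + B1]
    lower = min(relevant + [window_end])

    # Property 3: once every correct node has triggered NEXT within
    # [b_min + B1 + B2, t], the beat follows by time t + P.
    firsts = []
    for ts in next_triggers:
        late = [t for t in ts if b_min + B1 + B2 <= t <= window_end]
        if late:
            firsts.append(min(late))
    if next_triggers and len(firsts) == len(next_triggers):
        upper = max(firsts) + P
    else:
        upper = None  # property 3 gives no guarantee in this case
    return lower, upper
```

For example, with `b_min=0, B1=10, B2=20, B3=100, P=2` and triggers at times 35, 36.5 and 34 at three correct nodes, the next beat is generated no earlier than time 34 and no later than time 38.5.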

6.2 Algorithm

Our self-stabilizing solution utilizes both FATAL and the clock synchronization algorithm with very limited interaction. We already stressed that FATAL will stabilize regardless of the NEXT signals and note that it is not influenced by Algorithm 4 in any other way. Concerning the clock synchronization algorithm (either Algorithm 2 or Algorithm 3), we assume that a “careful” implementation is used that does not maintain state variables for a long time. Concretely, Algorithm 2 will clear memory between loop iterations, and Algorithm 3 will memorize the new multiplier value μ_v(r + 1) only, which is explicitly assigned during round r. If this is satisfied, no further consistency checks of variables are required, and it will be straightforward to re-use the analyses from Sections 4.3 and 5.2.

Having said this, let us turn to Algorithm 4, which is basically an ongoing consistency check based on the beats that resets the clock synchronization algorithm if necessary. The feedback triggering the next beat in a timely fashion is implemented by simply triggering the NEXT signal on each M^th beat, with a small delay ensuring that all nodes arrive in the same round and have their counter variable i reading 0. The consistency checks then ask for i = 0 and the next pulse being triggered within a certain local time window; if either does not apply, the reset function is called, ensuring that both conditions are met (Fig. 3).

Condition 3 lists the constraints that the parameters of Algorithm 4 need to satisfy so that we can show that the algorithm is guaranteed to stabilize. These parameters are R⁻ (the minimum local time between a beat and local pulse 1 mod M), R⁺ (the respective maximum local time), and M (the number of pulses between beats).

Condition 3

We require that

$$P+R^{+}+\tau_{1}-\frac{R^{-}}{\vartheta} \leq e(1) \tag{9}$$

$$P + R^{+} \leq \frac{R^{-}}{\vartheta} \tag{10}$$

$$P + R^{+} + \tau_{1} + d \leq \frac{R^{-}+\tau_{2}}{\vartheta} \tag{11}$$

$$P + d \leq \frac{R^{-} - \tau_{1}}{\vartheta} \tag{12}$$

$$P+R^{+}+T+\vartheta(e(1)+U) \leq B_{1}+B_{2} \tag{13}$$

$$P+\vartheta e(M) \leq B_{1} \tag{14}$$

$$B_{1}+B_{2} \leq e(M)+(M-1)\left( \frac{T}{\vartheta}-\tau_{1}\right)+\frac{R^{-}}{\vartheta} \tag{15}$$

$$\vartheta e(M)+(M-1)(T+\vartheta\tau_{1})+P+R^{+}+\tau_{1} \leq B_{1}+B_{2}+B_{3} \tag{16}$$

$$R^{-} \leq \frac{T}{\vartheta}-((\vartheta+2) e(M) + U + P) \tag{17}$$

$$T+\vartheta(e(M)+U)-\tau_{1} \leq R^{+}\,. \tag{18}$$

Intuitively, these constraints ensure the following:

Equation (9) says that resets on a beat enforce the skew to become bounded by e(1).
Equations (10) and (11) ensure that correct nodes receive the first pulses from all other correct nodes after a beat.
Equation (12) guarantees that these are actually the “round-1” pulses also for nodes that have been reset, i.e., there are no spurious pulses from before such a reset that are received during the respective time window.
Equations (13) and (14) make sure that FATAL will ignore any NEXT signals that may still be active when a beat occurs and that there is sufficient time for the first round after the beat to complete.
Equations (15) and (16) enforce that the (now correctly executing) algorithm will trigger the NEXT signals and thus the next beat is well-aligned with the time reference it provides.
Finally, (17) and (18) imply that such a beat will result in no resets.
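When experimenting with concrete parameters, the ten inequalities can be checked mechanically. The sketch below evaluates (9) to (18) exactly as stated; the function name, the keyword names, and the sample values in `EXAMPLE` are illustrative assumptions (arbitrary numbers in a common time unit), not values taken from the analysis.

```python
def check_condition3(*, theta, P, d, U, T, tau1, tau2, e1, eM,
                     R_minus, R_plus, M, B1, B2, B3):
    """Evaluate Inequalities (9)-(18); returns {equation number: bool}.

    e1 and eM stand for e(1) and e(M), respectively.
    """
    return {
        9: P + R_plus + tau1 - R_minus / theta <= e1,
        10: P + R_plus <= R_minus / theta,
        11: P + R_plus + tau1 + d <= (R_minus + tau2) / theta,
        12: P + d <= (R_minus - tau1) / theta,
        13: P + R_plus + T + theta * (e1 + U) <= B1 + B2,
        14: P + theta * eM <= B1,
        15: B1 + B2 <= (eM + (M - 1) * (T / theta - tau1)
                        + R_minus / theta),
        16: (theta * eM + (M - 1) * (T + theta * tau1)
             + P + R_plus + tau1) <= B1 + B2 + B3,
        17: R_minus <= T / theta - ((theta + 2) * eM + U + P),
        18: T + theta * (eM + U) - tau1 <= R_plus,
    }


# Arbitrary sample values chosen so that all ten inequalities hold.
EXAMPLE = dict(theta=1.01, P=10, d=10, U=1, T=10000, tau1=400, tau2=500,
               e1=390, eM=5, R_minus=9800, R_plus=9650, M=3,
               B1=1000, B2=24000, B3=30000)

if __name__ == "__main__":
    for number, ok in check_condition3(**EXAMPLE).items():
        print(f"({number}): {'ok' if ok else 'VIOLATED'}")
```

Increasing `R_plus` to 9700 in this example violates (9) and (10), which illustrates how tightly R⁺ and R⁻ are coupled through the reset window.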

We need to show that these constraints can be satisfied in conjunction with the ones required by the employed synchronization algorithm.

Lemma 13

Conditions 1 and 3 can be simultaneously satisfied such that τ₁(r) = τ₁, τ₂(r) = τ₂ and T(r) = T for all $r\in \mathbb {N}$, and $\lim _{r\to \infty } e(r)<\infty $ if

$$\alpha=\frac{2\vartheta^{2}+\vartheta}{2-\vartheta}\cdot\left( 1-\frac{1}{\vartheta^{2}} +\frac{4(\vartheta-1)}{1-\beta}\right)<1\,, $$

where β = (2𝜗² + 5𝜗 − 5)/(2(𝜗 + 1)). In this case,

$$\lim\limits_{r\to \infty} e(r) = \frac{(1-1/\vartheta)T+(3\vartheta -1)U}{1-\beta}\,. $$

Here, we may choose any T ≥ T₀ ∈ O((d_F + d)/(1 − α)) and B₁, B₂, and B₃ such that FATAL stabilizes in time O(n(d_F + d)) with probability 1 − 2^−Ω(n).

Proof

We choose R⁻ and R⁺ such that (17) and (18) are satisfied with equality. Thus, any choice of

$$F\geq \left( 1-\frac{1}{\vartheta^{2}}\right)T + 2P + 4\vartheta e(M)+ 2\vartheta U $$

satisfies (9), and for (10)–(12) to hold it is sufficient that

$$\begin{array}{@{}rcl@{}} F&\leq& \tau_{1} \leq \frac{T}{\vartheta}-3\vartheta e(M)-\vartheta d-(\vartheta-1)P\\ \vartheta F &\leq& \tau_{2}\,. \end{array} $$

These lower bounds on τ₁ and τ₂ are weaker than those imposed by Condition 1, which demands that min{τ₁, τ₂}≥ 𝜗e(1) > F. Setting τ₁ := 𝜗e(1), τ₂ := 𝜗(e(1) + d), and requiring T ≥ 𝜗(τ₁ + τ₂ + e(1) + U) thus guarantees that the above lower bounds on τ₁ and τ₂ hold. We get that

$$\frac{T}{\vartheta}>\tau_{1}+F+\vartheta d>\tau_{1}+ 3\vartheta e(M)+\vartheta d+(\vartheta-1)P\,, $$

and the inequalities of Condition 1 are satisfied for r = 1. Moreover, with x := (3𝜗 − 1)U + (1 − 1/𝜗)T, we have for $r\in \mathbb {N}$ that

$$e(r)=\beta^{r-1}e(1)+\frac{1-\beta^{r-1}}{1-\beta}\,x\,, $$

i.e., e(r) is a convex combination of e(1) and x/(1 − β). We require that e(1) ≥ x/(1 − β), i.e.,

$$\frac{F}{2-\vartheta}=e(1)\geq \frac{(3\vartheta-1)U+(1-1/\vartheta)T}{1-\beta}\,; $$

here, we used that 2 − 𝜗 > 0, because α < 1. Thus, e(r) ≤ e(1), and we conclude that Condition 1 holds for

$$\begin{array}{@{}rcl@{}} F&:=&\\ &&\max\left\{\!\left( \!1\,-\,\frac{1}{\vartheta^{2}}\right)T\,+\,2P\,+\,4\vartheta e(M)\,+\,2\vartheta U, \frac{(2\,-\,\vartheta)((3\vartheta\,-\,1)U\,+\,(1\,-\,1/\vartheta)T)}{1-\beta}\right\} \end{array} $$

under the constraint that

$$T\geq \vartheta(\tau_{1}+\tau_{2}+e(1)+U)= \vartheta\left( \frac{(2\vartheta+ 1)F}{2-\vartheta}+\vartheta d +U\right). $$

For any c > 1, sufficiently large M ensures that

$$e(M)\leq c \lim_{r\to \infty}e(r) = \frac{cx}{1-\beta}= \frac{c((3\vartheta-1)U+(1-1/\vartheta)T)}{1-\beta}, $$

where the last step uses that 1 − β ∈ Ω(1) because α < 1.

Assuming sufficiently large M, the above lower bound on T can hence be met iff

$$\frac{2\vartheta^{2}+\vartheta}{2-\vartheta}\cdot\max\left\{1-\frac{1}{\vartheta^{2}} +\frac{4(\vartheta-1)}{1-\beta}, \frac{(2-\vartheta)(1-1/\vartheta)}{1-\beta}\right\} =\alpha<1\,. $$

In this case, for sufficiently large M the constraint on T is satisfied if

$$(1-\alpha)T \geq (1-\alpha)T_{0}\in O\left( \!\max\!\left\{\!P \,+\, \frac{U}{1-\beta}+U,\frac{U}{1-\beta}\right\}+d+U\right)=O(P+d)\,, $$

where we used that 𝜗 and thus 1 − α and 1 − β are constants.

To complete the proof, it remains to show that, for any such choice of T and a given lower bound on M, we can satisfy Inequalities (13)–(16) such that FATAL has the claimed guarantees on the stabilization time. Given that all parameters except for M, B₁, B₂, and B₃ are already fixed independently of these values, it suffices if we can solve the system

$$\begin{array}{@{}rcl@{}} K&\leq& B_{1}\\ B_{1}+B_{2}&\leq& (M-1)K\\ \vartheta M K&\leq& B_{1}+B_{2}+B_{3} \end{array} $$

for an arbitrary $K\in \mathbb {R}^{+}$ such that M is sufficiently large. By Corollary 13, we may choose B₁, B₂, and B₃ such that, e.g., B₃ ≥ B₁ + B₂. Picking ϕ ≥ 1 in the corollary sufficiently large, we get that ϕB₁ ≥ K and M := ⌊2(B₁ + B₂)/(𝜗K)⌋ is sufficiently large and satisfies the second and third inequality (where again we use that 2 − 𝜗 ∈ Ω(1)).

Finally, note that P ∈ O(d_F) and all factors occurring in this proof are constants depending on 𝜗 only, implying that ϕ and M are constants as well. The bound on the stabilization time thus readily follows from Corollary 13 as well. □
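The role of "sufficiently large M" in this proof can be illustrated with the closed form of e(r). The sketch below evaluates it and searches for the smallest M with e(M) ≤ c·lim e(r); the function names and the sample values are assumptions, and the comparison with the recursion e(r + 1) = βe(r) + x relies on that recursion being the one the closed form solves.

```python
def skew_bound(r, e1, beta, x):
    """Closed form e(r) = beta^(r-1) e(1) + (1 - beta^(r-1)) / (1 - beta) * x."""
    w = beta ** (r - 1)
    return w * e1 + (1 - w) / (1 - beta) * x


def rounds_until_close(e1, beta, x, c):
    """Smallest M >= 1 with e(M) <= c * x / (1 - beta)."""
    if not (0 <= beta < 1 and c > 1 and x > 0):
        raise ValueError("need 0 <= beta < 1, c > 1 and x > 0")
    limit = x / (1 - beta)
    M = 1
    # e(r) approaches `limit` geometrically, so this loop terminates.
    while skew_bound(M, e1, beta, x) > c * limit:
        M += 1
    return M


if __name__ == "__main__":
    # Illustrative values: x plays the role of (3*theta - 1)U + (1 - 1/theta)T.
    print(rounds_until_close(e1=100.0, beta=0.52, x=1.0, c=1.1))
```

Because e(r) − x/(1 − β) shrinks by the factor β in every round, the required M grows only logarithmically in e(1)/x, which is consistent with M being a constant when 𝜗 is a constant.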

In the remainder of the section, we assume (i) that the beat generation algorithm has already stabilized, i.e., the guarantees stated in Corollary 13 hold, (ii) that the executed clock synchronization algorithm is Algorithm 2, and (iii) that Condition 1 holds. The analysis for Algorithm 3 is analogous, where $\bar {\vartheta }=\vartheta ^{3}$ takes the role of 𝜗 and Condition 2 takes the role of Condition 1; this is formalized by the following corollary and Theorem 5 at the end of this section.

Corollary 14

Conditions 2 and 3 can be simultaneously satisfied with $\lim _{r\to \infty } e(r)<\infty $ if

$$\bar{\alpha}=\frac{4\bar{\vartheta}^{2}+ 5\bar{\vartheta}}{2-\bar{\vartheta}}\cdot\left( 1-\frac{1}{\bar{\vartheta}^{2}} +\frac{4(\bar{\vartheta}-1)}{1-\bar{\beta}}\right)<1\,, $$

where $\bar {\vartheta }=\vartheta ^{3}$ and $\bar {\beta }=(2\bar {\vartheta }^{2}+ 5\bar {\vartheta }-5)/(2(\bar {\vartheta }+ 1))$. In this case,

$$\lim\limits_{r\to \infty} e(r) = \frac{(1-1/\bar{\vartheta})T+(3\bar{\vartheta} -1)U}{1-\bar{\beta}}\,. $$

Here, we may choose any T ≥ T₀ ∈ O((d_F + d + U)/(1 − α)) and B₁, B₂, and B₃ such that FATAL stabilizes in time O(n(d_F + d)) with probability 1 − 2^−Ω(n).

Proof

Analogous to the proof of Lemma 13, but replacing the constraint T ≥ 𝜗(τ₁ + τ₂ + e(1) + U) by $T\geq \tau _{1}+\tau _{2}+\tau _{3}+\tau _{4}+\bar {\vartheta }(e(1)+U)>\bar {\vartheta }(\tau _{1}+\tau _{2}+e(1)+U)$ and setting $\tau _{3}:=\bar {\vartheta }(e(1)+(1-1/\bar {\vartheta })(\tau _{1}+\tau _{2}))$ and $\tau _{4}:=\bar {\vartheta }(e(1)+d+(1-1/\bar {\vartheta })(\tau _{1}+\tau _{2}))$ in accordance with Condition 2. This results in the requirement that

$$T\geq \frac{(4\bar{\vartheta}^{2}+ 5\bar{\vartheta})F}{2-\vartheta}+\bar{\vartheta} d +U\,, $$

which in turn leads to the value for $\bar {\alpha }$. □

6.3 Analysis

Our analysis starts with the first correct beat produced by FATAL, which is perceived at node v ∈ C at time b_v(1). Subsequent beats at v occur at times b_v(2), b_v(3), etc. We first establish that the first beat guarantees to “initialize” the synchronization algorithm such that it will run correctly from this point on (neglecting for the moment the possible intervention by further beats). We use this to define the “first” pulse times p_v(1), v ∈ C, as well; we enumerate consecutive pulses accordingly.

Lemma 14

Let b := min_v∈C{b_v(1)}. We have that

1. Each v ∈ C generates a pulse at time p_v(1) ∈ [b + R⁻/𝜗, b + P + R⁺ + τ₁].
2. $\|\vec {p}(1)\|\leq e(1)$.
3. At time p_v(1), v ∈ C sets i := 1.
4. w ∈ C receives the pulse sent by v ∈ C at a local time from the range [H_w(p_w(1)) − τ₁, H_w(p_w(1)) + τ₂].
5. This is the only pulse w receives from v at a local time from the range [H_w(p_w(1)) − τ₁, H_w(p_w(1)) + τ₂].
6. Denoting by round 1 the execution of the for-loop in Algorithm 2 during which each v ∈ C sends the pulse at time p_v(1), this round is executed correctly.

Proof

Assume for the moment that min_v∈C{b_v(2)} is sufficiently large, i.e., no second beat will occur at any correct node for the times relevant to the proof of the lemma; we will verify this at the end of the proof.

From the pseudocode given in Algorithms 2 and 4, it is straightforward to verify that v ∈ C generates a pulse at a local time from [H_v(b_v(1)) + R⁻, H_v(b_v(1)) + R⁺ + τ₁]. Since b_v(1) ∈ [b, b + P] by Corollary 13, this shows the first claim. The second follows immediately, since

$$\|\vec{p}(1)\|\leq P+R^{+}+\tau_{1}-\frac{R^{-}}{\vartheta} \overset{(9)}{\leq} e(1)\,. $$

Note that, until we show the last claim, it is not clear that p_v(1) is unique for each v ∈ C. For the moment, let p_v(1) be the first pulse v ∈ C sends during the local time interval [H_v(b_v(1)) + R⁻, H_v(b_v(1)) + R⁺ + τ₁]. With this convention, the third claim is shown as follows. Observe that any v ∈ C that executes the reset function in response to the beat sets i := 0 when doing so. Hence, it will set i := 1 at time p_v(1). Thus, consider v ∈ C that does not execute the reset function. This entails that i = 0 at time b_v(1) and v generates no pulse during local times from [H_v(b_v(1)), H_v(b_v(1)) + R⁻). Consequently, v will increase i to 1 at time p_v(1).

For the fourth claim, we bound

$$p_{v}(1)\geq b+\frac{R^{-}}{\vartheta}\geq b_{w}(1)+\frac{R^{-}}{\vartheta}-P \overset{(10)}{\geq}b_{w}(1)+R^{+}\,. $$

Thus, either the next round has already started at node w by time p_v(1) or w calls reset with argument 0, i.e., starts a new round. Either way, we have that w receives the pulse from v no earlier than local time H_w(p_w(1)) − τ₁. To see that the pulse arrives on time, we bound

$$p_{v}(1)+d\leq p_{w}(1)+P+R^{+}+\tau_{1}+d-\frac{R^{-}}{\vartheta} \overset{(11)}{\leq}p_{w}(1)+\frac{\tau_{2}}{\vartheta}\,. $$

As H_w(p_w(1) + τ₂/𝜗) ≤ H_w(p_w(1)) + τ₂, the fourth claim follows.

Concerning the fifth claim, observe that v ∈ C sends exactly one pulse during the local time interval [H_v(b_v(1)), H_v(p_v(1))]. As for w ∈ C we have that

$$b_{v}(1)+d\leq b_{w}(1)+P+d\leq p_{w}(1)-\frac{R^{-}}{\vartheta}+P+d \overset{(12)}{\leq} p_{w}(1)-\frac{\tau_{1}}{\vartheta}\,, $$

no pulse v sent at an earlier local time is received by w at or after local time H_w(p_w(1)) − τ₁. In particular, the first pulse w receives from v at a local time from [H_w(p_w(1)) − τ₁, H_w(p_w(1)) + τ₂] arrives at w at a time t_vw ∈ [p_v(1) + d − U, p_v(1) + d]. Since we also showed that $\|\vec {p}(1)\|\leq e(1)$, we conclude that the analysis of Section 4.3 can be applied to show that any subsequent pulse arrives after the round is complete at all nodes. Furthermore, we conclude that round 1 is executed correctly.

Recall that in the above reasoning, we assumed that min_v∈C{b_v(2)} is sufficiently large. Clearly, this is the case if round 1 ends at all nodes before this time. Accordingly, we bound for v ∈ C

$$\begin{array}{@{}rcl@{}} p_{v}(1)+T-{\Delta}_{v}(1)-\tau_{1} &\leq& b_{v}(1)+R^{+}+T-{\Delta}_{v}(1)\\ &\leq& b+P+R^{+}+T+\vartheta(e(1)+U)\\ &&\overset{(13)}{\leq}b+B_{1}+B_{2}\,, \end{array} $$

where the second last step makes use of Corollary 3. Because no node v ∈ C generates a pulse with i = M during times [b_v(1) + 𝜗e(M), p_v(2)], no such node triggers a NEXT signal during this time interval (cf. Algorithm 4). We have that

$$b_{v}(1)+\vartheta e(M)\leq b+P+\vartheta e(M)\overset{(14)}{\leq} b+B_{1}\,, $$

implying by Corollary 13 that min_v∈C{b_v(2)}≥ b + B₁ + B₂. □

Lemma 14 serves as induction anchor for the argument showing that all rounds of the algorithm are executed correctly. However, due to possible interference of future beats, for the moment we can merely conclude that this is the case until the next beat; we obtain the following corollary.

Corollary 15

Denote by N the infimum over all times t ≥ b + B₁ at which some v ∈ C triggers a NEXT signal. If min_v∈C{p_v(M) + e(M)} ≤ min{N, b + B₁ + B₂ + B₃}, then all rounds r ∈ {1,…, M} are executed correctly and $\|\vec {p}(r)\|\leq e(r)$.

Proof

Lemma 14 shows that the first beat “initializes” the system such that $\|\vec {p}(1)\|\leq e(1)$ and the first round is executed correctly. By Corollary 13, min_v∈C{b_v(2)} ≥ min{N, b + B₁ + B₂ + B₃}. Hence, after round 1 Algorithm 2 will be executed without interference from Algorithm 4 until (at least) time min_v∈C{p_v(M) + e(M)}. For r ∈ {2,…, M}, the claim thus follows as in Section 4.3. □

Next, we leverage this insight to prove that the progress of the synchronization algorithm – which will operate correctly at least until the next beat – together with the constraints of Condition 3 ensures the following: the first time when node v ∈ C triggers its NEXT signal after time b + B₁ falls within the window of opportunity for triggering the next beat provided by FATAL.

Lemma 15

For v ∈ C, denote by N_v(1) the infimum of times t ≥ b + B₁ when it triggers its NEXT signal. We have that H_v(N_v(1)) = H_v(p_v(M)) + 𝜗e(M) and that

$$b+B_{1}+B_{2}\leq N_{v}(1)\leq b+B_{1}+B_{2}+B_{3}\,. $$

Proof

At time b_v(1), v ∈ C sets i := 0 (unless it already holds that i = 0). Thus, v will not trigger the NEXT signal until it sent at least M pulses and waited for 𝜗e(M) local time, i.e., N_v(1) ≥ p_v(M) + e(M). As observed in the proof of Lemma 14, we have that b_v(1) ≥ b + B₁. Thus, we can apply Corollary 15, where

$$N:=\min\limits_{v\in C}\{N_{v}(1)\}\geq \min\limits_{v\in C}\{p_{v}(M)+e(M)\}\,, $$

to conclude that one of the following must hold true: (i) all rounds r ∈ {1,…, M} are executed correctly or (ii) min_v∈C{p_v(M) + e(M)} > b + B₁ + B₂ + B₃.

In the first case, we have that

$$H_{v}(N_{v}(1))=H_{v}(p_{v}(1))+\vartheta e(M)+\sum\limits_{r = 1}^{M-1} T-{\Delta}_{v}(r)\,, $$

where

$$\sum\limits_{r = 1}^{M-1}|{\Delta}_{v}(r)|\leq \sum\limits_{r = 1}^{M-1}e(r)\leq \vartheta(M-1)\tau_{1}\,. $$

We conclude that

$$p_{v}(1)+e(M)+(M-1)\left( \frac{T}{\vartheta}-\tau_{1}\right) \leq N_{v}(1)\leq p_{v}(1) +\vartheta e(M)+(M-1)(T+\vartheta\tau_{1}). $$

Applying the first statement of Lemma 14, this yields that

$$\begin{array}{@{}rcl@{}} &&b+e(M)+(M-1)\left( \frac{T}{\vartheta}-\tau_{1}\right)+\frac{R^{-}}{\vartheta}\\ \leq \;&& N_{v}(1)\\ \leq \;&& b +\vartheta e(M)+(M-1)(T+\vartheta\tau_{1})+P+R^{+}+\tau_{1}\,. \end{array} $$

The claim now follows from (15) and (16).

With respect to the second case, observe that since no NEXT signal is triggered at any v ∈ C after time b + B₁ until time b + B₁ + B₂ + B₃, min_v∈C{b_v(2)} ≥ b + B₁ + B₂ + B₃ by Corollary 13. Thus, Algorithm 2 runs without interference up to this time. Using this, we can establish the same bounds as for the first case. □

This immediately implies that the second beat occurs in response to the NEXT signals, which themselves are aligned with pulse M.

Corollary 16

For all v ∈ C, b_v(2) ∈ [p_v(M), p_v(M) + (𝜗 + 1)e(M) + P].

Proof

By Lemma 15, N_v(1) ∈ [b + B₁ + B₂, b + B₁ + B₂ + B₃] for all v ∈ C. Thus, by Corollary 15, $\|\vec {p}(M)\|\leq e(M)$. As v ∈ C triggers its NEXT signal at local time H_v(p_v(M)) + 𝜗e(M), it follows that

$$p_{v}(M)\leq \min\limits_{w\in C}\{p_{w}(M)+e(M)\}\leq \min\limits_{w\in C}\{N_{w}(1)\} $$

and that

$$\max\limits_{w\in C}\{N_{w}(1)\}\leq \max\limits_{w\in C}\{p_{w}(M)+\vartheta e(M)\} \leq p_{v}(M)+(\vartheta+ 1)e(M)\,. $$

The claim now follows from the second and third statements of Corollary 13. □

Having established this timing relation between $\vec {b}(2)$ and $\vec {p}(M)$, we can conclude that no correct node is reset due to the second beat.

Lemma 16

Node v ∈ C does not call the reset function of Algorithm 4 in response to beat b_v(2).

Proof

By Corollary 16, b_v(2) ∈ [p_v(M), p_v(M) + (𝜗 + 1)e(M) + P]. By Corollary 15, Algorithm 2 has been executed without interruption by a beat after time b_v(1) up to this time. Hence, v sets i := M mod M = 0 at time p_v(M) ≤ b_v(2). As round M is also executed correctly, the earliest time when v could generate pulse M + 1 without a reset is bounded by

$$\begin{array}{@{}rcl@{}} p_{v}(M)+\frac{T-{\Delta}_{v}(M)}{\vartheta}&\geq& p_{v}(M)-(e(M)+U)+\frac{T}{\vartheta}\\ &\geq& b_{v}(2)-((\vartheta+ 2)e(M)+P+U)+\frac{T}{\vartheta}\\ &\overset{(17)}{\geq}& b_{v}(2)+R^{-}\,, \end{array} $$

where in the first step we applied Corollary 3. This implies that node v’s variable i equals 0 at time b_v(2) and v does not generate a pulse at a local time from [H_v(b_v(2)), H_v(b_v(2)) + R⁻]. It remains to show that v enters round M + 1 at the latest at local time H_v(b_v(2)) + R⁺. To show this, we bound

$$\begin{array}{@{}rcl@{}} H_{v}(p_{v}(M))+T-\tau_{1}-{\Delta}_{v}(M)&\leq& H_{v}(p_{v}(M))+T-\tau_{1}+\vartheta(e(M)+U)\\ &\leq& H_{v}(b_{v}(2))+T-\tau_{1}+\vartheta(e(M)+U)\\ &\overset{(18)}{\leq}& H_{v}(b_{v}(2))+R^{+}\,. \end{array} $$

□
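The middle step of each chain in the proof above uses only Corollary 16 and the monotonicity of the hardware clock H_v. As a sketch that introduces no new assumptions, the two steps can be spelled out as follows. The upper bound on b_v(2) from Corollary 16 gives

$$b_{v}(2)\leq p_{v}(M)+(\vartheta+ 1)e(M)+P \quad\Longrightarrow\quad p_{v}(M)-(e(M)+U)\geq b_{v}(2)-((\vartheta+ 2)e(M)+P+U)\,, $$

and the lower bound gives

$$p_{v}(M)\leq b_{v}(2) \quad\Longrightarrow\quad H_{v}(p_{v}(M))\leq H_{v}(b_{v}(2))\,. $$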

Repeating the above reasoning for all pairs of beats $\vec {b}(k)$, $\vec {b}(k + 1)$, $k\in \mathbb {N}$, it follows that no correct node is reset by any beat other than the first. Thus, the clock synchronization algorithm is indeed (re-)initialized by the first beat to run without any further meddling from Algorithm 4. This implies the same bounds on the steady state error as for the original synchronization algorithm.

Theorem 4

Suppose that Algorithm 4 is executed with Algorithm 2 as synchronization algorithm. If

$$\alpha=\frac{2\vartheta^{2}+\vartheta}{2-\vartheta}\cdot\left( 1-\frac{1}{\vartheta^{2}} +\frac{4(\vartheta-1)}{1-\beta}\right)<1 $$

(which holds for 𝜗 ≤ 1.03), where β = (2𝜗² + 5𝜗 − 5)/(2(𝜗 + 1)), then all parameters can be chosen such that the compound algorithm is self-stabilizing and has steady state error

$$E \leq \frac{(\vartheta-1)T+(3\vartheta-1)U}{1-\beta}\,. $$

Here, any nominal round length T ≥ T₀ ∈ O(d_F + d) is possible.

Proof

Lemma 13 shows that Conditions 1 and 3 can be satisfied such that $\lim _{r\to \infty } e(r)=((\vartheta -1)T+(3\vartheta -1)U)/(1-\beta )$ and T₀ ∈ O(d_F + d). Hence, we may apply the statements derived in this section.

By Corollary 13, the beat generation mechanism will eventually stabilize. Afterwards, we can apply Lemma 16 to show that the second (correct) beat results in no calls to the reset function in Algorithm 4. In fact, this extends to any beat except for the first: letting beat $k\in \mathbb {N}$ take the role of beat 1, our reasoning shows that beat k + 1 does not result in a reset at any node. Moreover, applying the same reasoning to Corollary 15, we conclude that all rounds $r\in \mathbb {N}$ are executed correctly, and that $\|\vec {p}(r)\|\leq e(r)$. The bound on E follows. □

Observe that, in comparison to Theorem 1, the expression obtained for the steady state error replaces d by O(d_F + d), which is essentially the skew upon initialization by the first beat. In Algorithm 2, we circumvented any dependence on F by varying round lengths over time. For the self-stabilizing solution, this is not possible, since counting rounds locally is not guaranteed to ensure a consistent opinion across all nodes concerning the nominal length of the current round; we are restricted to counting rounds modulo $M\in \mathbb {N}$, so any long round length will reoccur regularly.
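To get a feeling for the bound in Theorem 4, the following minimal Python sketch evaluates β and the stated upper bound on E for given 𝜗, T, and U. The function names and the example parameter values are hypothetical and chosen only for illustration; the sketch does not check the condition α < 1 or the remaining parameter constraints of the theorem.

```python
def beta(theta: float) -> float:
    """Factor beta = (2*theta^2 + 5*theta - 5) / (2*(theta + 1)) from Theorem 4."""
    return (2 * theta**2 + 5 * theta - 5) / (2 * (theta + 1))


def steady_state_error_bound(theta: float, T: float, U: float) -> float:
    """Evaluate ((theta - 1)*T + (3*theta - 1)*U) / (1 - beta), the bound on E.

    T is the nominal round length and U the delay uncertainty, both in the
    same (arbitrary) time unit. Only theta >= 1 and beta < 1 are checked here.
    """
    b = beta(theta)
    if theta < 1.0 or b >= 1.0:
        raise ValueError("need theta >= 1 and beta < 1")
    return ((theta - 1) * T + (3 * theta - 1) * U) / (1 - b)


if __name__ == "__main__":
    # Hypothetical example values (time unit: nanoseconds).
    theta, T, U = 1.001, 1000.0, 1.0
    print(f"beta = {beta(theta):.4f}")
    print(f"bound on E = {steady_state_error_bound(theta, T, U):.3f}")
```

For perfect clocks (𝜗 = 1), the expression gives β = 1/2 and the bound reduces to 4U, independent of T. For 𝜗 > 1, the term (𝜗 − 1)T shows how longer rounds increase the bound.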

It remains to draw the analogous conclusions for using Algorithm 4 with Algorithm 3 as synchronization algorithm.

Theorem 5

Suppose that Algorithm 4 is executed with Algorithm 3 as synchronization algorithm (where (1) holds). If

$$\bar{\alpha}=\frac{4\bar{\vartheta}^{2}+ 5\bar{\vartheta}}{2-\bar{\vartheta}}\cdot\left( 1-\frac{1}{\bar{\vartheta}^{2}} +\frac{4(\bar{\vartheta}-1)}{1-\bar{\beta}}\right)<1 $$

(which holds for 𝜗 ≤ 1.004), where $\bar {\vartheta }=\vartheta ^{3}$ and $\bar {\beta }=(2\bar {\vartheta }^{2}+ 5\bar {\vartheta }-5)/(2(\bar {\vartheta }+ 1))$, then all parameters can be chosen such that the compound algorithm self-stabilizes in O(n) time and has steady state error

$$E\leq \frac{(4\bar{\vartheta}-2)U+\nu(T+\tau_{2})T}{1-\alpha} +\frac{(3\vartheta\varepsilon+ 2\nu(T+\tau_{2}))T}{(1-\alpha)(1-\beta)}\,, $$

where $\alpha :=(4\bar {\vartheta }^{2}+ 5\bar {\vartheta }-7)/(2(\bar {\vartheta }+ 1))<1$ and β := (2𝜗 − 1)/2 < 1. Here, any value of T ≥ T₀ ∈ O(d_F + d) is possible.

Proof

As for Theorem 4, with Corollary 14 taking the place of Lemma 13 and noting that the convergence argument for the frequencies relies on rounds being executed correctly only (i.e., no assumptions on μ_v(1), v ∈ C, are required). □

We remark that despite the stringent requirements on 𝜗 for the recovery argument to work (i.e., $\bar {\alpha }<1$), the actual bound on the precision involves α and β. If 𝜗 ≤ 1.004, we have α ≤ 0.512 and β ≤ 0.502. Concerning stabilization, we remark that it takes O(n) time with probability 1 − 2^−Ω(n), which is directly inherited from FATAL. The subsequent convergence to small skews is not affected by n, and will be much faster for realistic parameters, so we refrain from a more detailed statement.

7 Conclusions

The results derived in this paper demonstrate that the Lynch-Welch synchronization principle is a promising candidate for reliable clock generation, not only in software, but also in hardware. Apart from accurate bounds on the synchronization error depending on the quality of clocks, we present a generic coupling scheme that adds self-stabilization properties.

We believe these results to be of practical merit. Concretely, first results from a prototype Field-Programmable Gate Array (FPGA) implementation of Algorithm 2 show a skew of 182ps [12]. Given the appealing simplicity of the presented algorithms and this excellent performance, we consider the approach a viable candidate for reliable clock generation in fault-tolerant low-level hardware and other areas.

Notes

For comparison, the critical value in [19] is smaller than 1.025, i.e., we can handle a factor 4 weaker bound on 𝜗 − 1. Non-quartz oscillators used in space applications, where temperatures vary widely, may have 𝜗 close to this value, cf. [1].
All prior self-stabilizing algorithms have at least this skew. It should also be noted that d involves computational delay and turns out to be larger for FATAL, due to issues related to implementation.
The prototype implementation achieves 182ps skew [12], which is suitable for generating a system clock.
Constraining feasible clock rates is necessary to avoid that measurement errors result in clocks speeding up or slowing down arbitrarily over time.
The maximum delay d tends to be at least one or two orders of magnitude larger than the delay uncertainty U.
If a node has fewer than 2f + 1 neighbors in a system tolerating f faults, it cannot distinguish whether it synchronizes to a group of f correct or f faulty neighbors.
It is common to define the drift symmetrically, i.e., (1 − ρ)(t^′− t) ≤ H_v(t^′) − H_v(t) ≤ (1 + ρ)(t^′− t) for some 0 < ρ < 1. For ρ ≪ 1 and 𝜗 ≈ 1, up to minor order terms this is equivalent to setting ρ := (𝜗 − 1)/2 and rescaling the real time axis by factor 1 − ρ. The one-sided formulation results in less cluttered notation.
Discretization can be handled by re-interpreting the discretization error as part of the delay uncertainty. All our algorithms use the hardware clock exclusively to measure bounded time differences.
Typically, e(r) is a monotone sequence, implying that simply $E=\lim _{r\to \infty }e(r)$.
Note that we divide the measured local time differences by factor (𝜗 + 1)/2, the average of the minimum and maximum clock rates. This is an artifact of our more notation-friendly “one-sided” definition of hardware clock rates from [1, 𝜗]; in an implementation, one simply reads the hardware clocks (which exhibit symmetric error) without any scaling.
Given that hardware clock speeds may differ by at most factor 𝜗, nodes need to be able to increase or decrease their rates by factor 𝜗: a single deviating node may be considered faulty by the algorithm, so each node must be able to bridge this speed difference on its own.
This issue can be circumvented by having a second, dedicated communication link between each pair of nodes.

References

Overview of Silicon Oscillators by Linear Technology (retrieved May 2016). http://cds.linear.com/docs/en/product-selector-card/2PB_osccalcfb.pdf
Daliot, A., Dolev, D.: Self-stabilizing Byzantine pulse synchronization. Computing Research Repository, arXiv:0608092 (2006)
Distributed Algorithms for Robust Tick-Synchronization (2005–2008). Research project [retrieved: 05, 2014]. http://ti.tuwien.ac.at/ecs/research/projects/darts
Dolev, D., Függer, M., Lenzen, C., Posch, M., Schmid, U., Steininger, A.: Rigorously modeling self-stabilizing fault-tolerant circuits: An ultra-robust clocking scheme for systems-on-chip. J. Comput. Syst. Sci. 80(4), 860–900 (2014)
Dolev, D., Függer, M., Lenzen, C., Schmid, U.: Fault-tolerant algorithms for tick-generation in asynchronous logic: Robust pulse generation. J. ACM 61(5), 30:1–30:74 (2014)
Dolev, D., Halpern, J.Y., Strong, H.R.: On the possibility and impossibility of achieving clock synchronization. J. Comput. Syst. Sci. 32(2), 230–250 (1986)
Dolev, D., Lynch, N.A., Pinter, S.S., Stark, E.W., Weihl, W.E.: Reaching approximate agreement in the presence of faults. J. ACM 33, 499–516 (1986)
Dolev, S., Welch, J.L.: Self-stabilizing clock synchronization in the presence of byzantine faults. J. ACM 51(5), 780–799 (2004)
FlexRay Consortium, et al.: FlexRay communications system-protocol specification. Version 2.1 (2005)
Függer, M., Armengaud, E., Steininger, A.: Safely stimulating the clock synchronization algorithm in time-triggered systems - a combined formal & experimental approach. IEEE Trans. Indus. Inf. 5(2), 132–146 (2009)
Függer, M., Schmid, U.: Reconciling fault-tolerant distributed computing and systems-on-chip. Distrib. Comput. 24(6), 323–355 (2012)
Huemer, F., Kinali, A., Lenzen, C.: Fault-tolerant clock synchronization with high precision. In: IEEE Symposium on VLSI (ISVLSI), pp. 490–495 (2016)
Kopetz, H., Bauer, G.: The time-triggered architecture. Proc. IEEE 91(1), 112–126 (2003)
Lenzen, C., Rybicki, J.: Self-stabilising Byzantine clock synchronisation is almost as easy as consensus. In: 31st Symposium on Distributed Computing (DISC). To appear (2017)
Lundelius, J., Lynch, N.: An upper and lower bound for clock synchronization. Inf. Control. 62(2–3), 190–204 (1984)
Schossmaier, K.: Interval-based Clock State and Rate Synchronization. Technical University of Vienna, Ph.D. thesis (1998)
Schossmaier, K., Weiss, B.: An algorithm for fault-tolerant clock state and rate synchronization. In: 18th Symposium on Reliable Distributed Systems (SRDS), pp. 36–47 (1999)
Srikanth, T.K., Toueg, S.: Optimal clock synchronization. J. ACM 34(3), 626–645 (1987)
Welch, J.L., Lynch, N.A.: A new fault-tolerant algorithm for clock synchronization. Inf. Comput. 77(1), 1–36 (1988)

Acknowledgements

Open access funding provided by Max Planck Society. We thank Matthias Függer and Attila Kinali for fruitful discussions, and the anonymous reviewers of an earlier version for valuable comments.

Author information

Authors and Affiliations

ETH Zurich, Zurich, Switzerland
Pankaj Khanchandani
Max Planck Institute for Informatics, Saarland Informatics Campus, Saarbrücken, Germany
Christoph Lenzen

Corresponding author

Correspondence to Christoph Lenzen.

Additional information

This article is part of the Topical Collection on Special Issue on Stabilization, Safety, and Security of Distributed Systems (SSS 2016)

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.

About this article

Cite this article

Khanchandani, P., Lenzen, C. Self-Stabilizing Byzantine Clock Synchronization with Optimal Precision. Theory Comput Syst 63, 261–305 (2019). https://doi.org/10.1007/s00224-017-9840-3

Published: 20 January 2018
Issue Date: 15 February 2019
DOI: https://doi.org/10.1007/s00224-017-9840-3
