Upper and Lower Bounds for the Synchronizer Performance in Systems with Probabilistic Message Loss

In this paper, we revisit the performance of the α-synchronizer in distributed systems with probabilistic message loss as introduced in Függer et al. [Perf. Eval. 93 (2015)]. In sharp contrast to the infinite-state Markov chain resp. the exponential-size finite-state upper bound presented in the original paper, we introduce a polynomial-size finite-state Markov chain for a new synchronizer variant α′, which provides a new upper bound on the performance of the α-synchronizer. Both analytic and simulation results show that our new upper bound is strictly better than the existing one. Moreover, we show that a modified version of the α′-synchronizer provides a lower bound on the performance of the α-synchronizer. By means of elaborate simulation results, we show that our new lower bound is also strictly better than the lower bound presented in the original paper.


Introduction
Simulating synchronous executions in a distributed message-passing system is a powerful and widespread design approach. Synchronizers like the well-known α-synchronizer by Awerbuch (1985) make it possible to establish a virtual (lock-step) round structure, which greatly simplifies the design of higher-level distributed algorithms. Moreover, it makes it easy to reason about the time complexity of an algorithm running atop, which is just the number of rounds needed until termination.
The underlying idea of the α-synchronizer is to let processes continuously exchange round numbers and to allow a process to proceed to the next round only after it has witnessed that all processes have already started the current round.
Given the exploding number of distributed systems that are interconnected by wireless networks, ranging from Bluetooth over WLANs to 4G/5G broadband communication, the question of simulating synchronous executions in such systems arises. Unfortunately, though, the communication properties of a wireless link are typically unstable and highly time-variant (Cerpa et al. 2005b), due to limited transmission ranges, near-far problems (Ware et al. 2000), fading (Schilcher et al. 2012), interference (Fussen et al. 2005) and other phenomena. There is hence no alternative to statistically modeling lossy links, which has been done in various different models, ranging from simple sensor networks (Cerpa et al. 2005a) to elaborate signal-to-interference-plus-noise ratio (SINR) models (Dousse et al. 2005) and even fading models (Bettstetter and Hartmann 2005; Schilcher et al. 2016). Most of this work focuses on individual links; some papers also deal with broadcasting protocols (Clementi et al. 2007).
We use a very simple model based on these results, which just assumes independent and identically distributed message loss per communication link. A similar assumption also underlies the edge-Markovian model (Clementi et al. 2008). It is appealing because of its tractability and, despite its simplicity, not unreasonable in practice, as it provides (probabilistic) lower bounds on the performance of real networks for suitably chosen message loss probabilities. We note that, depending on the type of the underlying wireless network, both a constant value of p and a value that decreases with the number N of participants may make sense here: In wireless networks where a collision, i.e., the attempt of two senders to broadcast a message at the same time, may lead to the destruction of both messages, some form of transmission scheduling needs to be applied. Maximizing the overall throughput or similar performance measures in such networks (Gupta and Kumar 2000) requires reducing the sending probability, and hence also p, down to something like 1/log N or even 1/N; see e.g. Moscibroda and Wattenhofer (2006) for more information.
In Függer et al. (2015), Függer et al. analyzed the expected round duration of the α-synchronizer in a synchronous distributed system of N processes that execute in lock-step unit-time rounds in such a model. The model just assumes that every message sent from process i to process j in a round may be lost with some fixed probability 1 − p. The expected round duration is crucial for determining the running time of a synchronous distributed algorithm running atop the α-synchronizer: the expected running time is just the time complexity of the algorithm (measured in rounds, as already mentioned) times the expected round duration.
It turned out that the operation of the α-synchronizer, and of variants thereof that sometimes forget part of their state, in such a system can be modeled by an infinite-state Markov chain. Függer et al. (2015) also provided a reduction to a finite-state Markov chain, but its state space is exponential in the number of processes in the system. Owing to the inherent complexity involved in the numerical or analytical solution of this chain, the authors had to resort to coarse lower and upper bounds for analyzing the synchronizer performance, in particular, the expected duration of a synchronized round.

Main results:
(1) We introduce a synchronizer α′ and show that it provides an upper bound on the performance of the α-synchronizer.
(2) We prove analytically that the upper bound guaranteed by α′ is not worse than the exponential-size upper bound presented in Függer et al. (2015), and strictly better for p → 0. These results are backed up by simulation results, which show that our new upper bound approximates the expected round duration of the α-synchronizer considerably better than the original one.
(3) We model α′ by a finite-state Markov chain, which has only a polynomial-size state space.
(4) We provide a variant of our α′-synchronizer and prove that it provides a lower bound for the performance of the α-synchronizer. Although its complexity did not allow us to find an analytical proof, we demonstrate by means of elaborate simulations that our new lower bound approximates the expected round duration of the α-synchronizer better than the existing lower bound from Függer et al. (2015).
These results also contribute to a better understanding of the Markov chain underlying the original problem, which may eventually pave the way to more efficiently computable bounds.

Related work
Early work on synchronizer performance in probabilistic systems considered varying message delays and computation times: Bertsekas and Tsitsiklis (1989) proved performance bounds for the case of constant processing times and exponentially distributed message delays on communication links without message loss. This model has been augmented by exponentially distributed processing times in Rajsbaum (1994a). On the other hand, Rajsbaum and Sidi (1994b) analyzed synchronizer performance in the case of exponentially distributed processing times and negligible transmission delays.
In contrast to the above work, we assume bounded message delays. Varying delays between sending and successfully receiving a message are due to message loss and repeated retransmission. The performance of the α-synchronizer in certain lossy environments has been considered by Nowak et al. (2013). The authors calculated the expected round duration of a retransmission-based synchronizer in systems where every message is successfully transmitted with constant probability p, subject to the additional constraint that a message that has been retransmitted at least M times is guaranteed to arrive. Nowak et al. (2013) assumed M to be finite, however, which Függer et al. (2015) (and we) do not.
The dominant computational complexity in solving Markov chains like the ones arising in Függer et al. (2015) is due to calculating the steady states. Instead of exactly determining those, there also exist techniques for just sampling the steady state: However, while standard simulation techniques allow one to sample the Markov chain's state at some time t = T, there is no guarantee that these samples follow the distribution of the steady state for t → ∞. By contrast, Propp and Wilson (1996) proposed backward coupling techniques to obtain exact steady-state samples for Markov chains. In the case of monotonic Markov chains, these techniques are computationally efficient. Unfortunately, while our infinite-state Markov chains are monotonic, our reduced finite chains are not. Their method thus requires exploring the complete finite state space, rendering it computationally infeasible here.
Paper organization Section 2 introduces our system model and the performance measure of interest, as well as the α-synchronizer and its corresponding Markov chain. In Section 3, we introduce our novel upper bound, the α′-synchronizer, and its Markov chain; Section 3.3 shows that it indeed provides an upper bound for the α-synchronizer, Section 3.4 evaluates the asymptotics for p → 0, and in Section 3.5 we compare this bound with the existing upper bound. Section 4 finally provides a modification of the α′-synchronizer and the proof that it indeed provides a lower bound for the performance of the α-synchronizer. The paper is rounded off in Section 5 by our simulation results, a discussion of our findings, and some future work; a glossary of our notation is appended in Section 6.

System Model and Algorithm
In this paper, we study the performance of the α-synchronizer (Awerbuch 1985) running in a fully connected message-passing system with processes 1, 2, ..., N. Processes take steps simultaneously at all integral times t ≥ 0, but messages may be lost with some fixed probability 1 − p. Messages that do arrive have a transmission delay of 1, i.e., a message sent at time t arrives at time t + 1, or not at all. A step consists of (a) receiving messages from other processes, (b) performing local computations, and (c) broadcasting a message to the other processes.
The synchronizer has two local variables, specified for every process i at time t: the local round number R_i(t) and the knowledge vector (K_{i,1}(t), K_{i,2}(t), ..., K_{i,N}(t)). Processes broadcast their local round number R_i(t) in every step t. The knowledge vector contains information on other processes' local round numbers, accumulated via received messages. A process increments its local round number, and thereby starts the next round, in step (t+1), after it has gained knowledge that all other processes have already started the current round by step t. This round increment rule hence ensures a precision of 1, i.e., |R_i(t) − R_j(t)| ≤ 1 for all t. We write R(t) = min_i R_i(t) and call it the global round number at time t. When R(t) increases, we say a global round switch occurs.
Formally, let (P(t))_{t∈N*} be a sequence of (N × N)-matrices whose entries are pairwise independent random variables with P[P_{i,j}(t+1) = 1] = p and P[P_{i,j}(t+1) = 0] = 1 − p, where P_{i,j}(t+1) = 0 means that process j's message to process i sent at time t via channel (i, j) was lost, and P_{i,j}(t+1) = 1 that it arrives (at time t + 1). We therefore call the parameter p the probability of successful transmission. Note that in our notation of a channel, process j is the sender and process i is the receiver, i.e., the channel (i, j) leads from j to i. Moreover, row i in P(t) corresponds to the point of view of the receiver i. Initially, R_i(0) = 0 and K_{i,j}(0) = −1 (i.e., no messages are received at time 0). At every time step t ≥ 1, process i's computation consists of the following:
1. Update knowledge according to received messages: K_{i,j}(t) ← R_j(t−1) if P_{i,j}(t) = 1, and K_{i,j}(t) ← K_{i,j}(t−1) otherwise; messages from a process to itself always arrive, so K_{i,i}(t) ← R_i(t−1).
2. Increment round number if possible: R_i(t) ← R_i(t−1) + 1 if K_{i,j}(t) ≥ R_i(t−1) for all j, and R_i(t) ← R_i(t−1) otherwise.
In the remainder of this paper, when we refer to K_{i,j}(t) and R_i(t), we mean their values after step (2). Figure 1 shows part of an execution of the α-synchronizer. Times are labeled t_0 to t_10. Processes 1 and 3 start their local round r at time t_4, while process 2 has already started its local round r at time t_3. The arrows in the figure indicate the time until the first successful reception of a message sent in round r: The tail of an arrow is located at the time t at which a process i starts round r and thus broadcasts r for the first time; the head of the arrow marks the smallest time after t at which a process j receives a message from i. Messages from processes to themselves are always received at the next time step and are thus not shown explicitly in the figure. For example, processes 1 and 3 start round r at time t_4, sending r for the first time. While process 2 receives the message from 3 in the next step, it takes 4 time steps and consecutive retransmissions until it receives a message from process 1, at time t_8.

Performance Measure
For a system with N processes and probability p of successful transmission, we define the expected round duration of process i by λ_i(N, p) = E[lim_{t→∞} t/R_i(t)]. Since our synchronization algorithm guarantees precision 1, it directly follows that λ_i(N, p) = λ_j(N, p) for any two processes i and j. We will henceforth refer to this common value as λ(N, p), or simply λ if the choice of the parameters N and p is clear from the context.
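To make the model and the performance measure concrete, here is a minimal Monte Carlo sketch in Python (the function name and parameters are ours, not from the paper): it simulates the knowledge-update and round-increment steps under i.i.d. message loss and returns the empirical ratio t/R_i(t), whose limit is the expected round duration λ(N, p).

```python
import random

def alpha_sync_round_duration(N, p, T, seed=0):
    """Simulate T steps of the alpha-synchronizer; return T / R_1(T)."""
    rng = random.Random(seed)
    R = [0] * N                       # local round numbers R_i(t)
    K = [[-1] * N for _ in range(N)]  # knowledge K_{i,j}(t)
    for _ in range(T):
        # Step 1: knowledge update -- each message sent in the previous step
        # (carrying R_j(t-1)) arrives independently with probability p;
        # a process's message to itself always arrives.
        for i in range(N):
            for j in range(N):
                if i == j or rng.random() < p:
                    K[i][j] = R[j]
        # Step 2: round increment -- advance once all processes are known
        # to have started the current round.
        for i in range(N):
            if all(K[i][j] >= R[i] for j in range(N)):
                R[i] += 1
    return T / R[0]
```

With p = 1 every round takes exactly one time step, so the estimate is 1; for p < 1 the estimate approaches λ(N, p) > 1 as T grows.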

Markov Chain and Definitions
The α-synchronizer can easily be modeled as a Markov chain (see Függer et al. (2015) for details): Let A(t) be the sequence of matrices with A_{i,i}(t) = R_i(t) and A_{i,j}(t) = K_{i,j}(t) for i ≠ j. It is easy to see that A(t) is a Markov chain, i.e., the distribution of A(t + 1) depends only on A(t). Since both R_i(t) and K_{i,j}(t) are unbounded, the state space of the Markov chain A(t) is infinite.
We therefore introduce the sequence of normalized states a(t), defined by a(t) = A(t) − min_k A_{k,k}(t).
Normalized states belong to the finite set {−1, 0, 1}^{N×N}, and a(t) is still a Markov chain. Clearly, the computation steps defined above can be translated directly in terms of A(t) and a(t). For A(t) they read:
1. Update knowledge according to received messages: For j ≠ i: A_{i,j}(t) ← A_{j,j}(t−1) if P_{i,j}(t) = 1, and A_{i,j}(t) ← A_{i,j}(t−1) otherwise.
2. Increment round number if possible: A_{i,i}(t) ← A_{i,i}(t−1) + 1 if A_{i,j}(t) ≥ A_{i,i}(t−1) for all j ≠ i, and A_{i,i}(t) ← A_{i,i}(t−1) otherwise.
For a(t) we have the following:
1. Update knowledge according to received messages: For j ≠ i: a_{i,j}(t) ← a_{j,j}(t−1) if P_{i,j}(t) = 1, and a_{i,j}(t) ← a_{i,j}(t−1) otherwise.
2. Increment round number if possible: a_{i,i}(t) ← a_{i,i}(t−1) + 1 if a_{i,j}(t) ≥ a_{i,i}(t−1) for all j ≠ i, and a_{i,i}(t) ← a_{i,i}(t−1) otherwise.
3. Normalizing: a_{i,j}(t) ← a_{i,j}(t) − 1 for all i, j if min_{1≤k≤N} a_{k,k}(t) = 1.
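For illustration, the normalization can be sketched in Python (a hypothetical helper, not from the paper, assuming states are given as integer matrices):

```python
def normalize(A):
    """Subtract the minimum diagonal entry of A; by precision 1, the
    resulting normalized state has all entries in {-1, 0, 1}."""
    m = min(A[k][k] for k in range(len(A)))
    return [[entry - m for entry in row] for row in A]
```

For example, a state with diagonal (5, 5, 4) is shifted so that the slowest process sits at round 0.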
In the following, we will switch between these representations as needed. For simplicity, when we extend the computation steps later on, we will do it for only one of these representations since the adaptations are straightforward.
As before, A(t) resp. a(t) refer to the values at the end of time step t (here: after step (2) resp. (3)). If we need to refer specifically to the values after step (i) (i ∈ {1, 2, 3}), we denote these matrices by A^{(i)}(t) and a^{(i)}(t). Moreover, let md^{(i)}(t) denote the minimum of the diagonal of A^{(i)}(t), and w^{(i)}(t) the number of md^{(i)}(t)-entries in the diagonal of A^{(i)}(t).
A channel is called relevant (in a(t) resp. A(t)) if a successfully transmitted message on this channel in P(t + 1) would increase knowledge (in a(t + 1) resp. A(t + 1)). Similarly, a message is called relevant if its arrival would increase knowledge.
In this example execution, global round switches occur at times t = 2 and t = 4. In a(t), at times t ∈ {0, 2, 4} we have three 0-processes, at t = 1 we have two 0-processes and one 1-process, and at time t = 3 we have one 0-process and two 1-processes.
Since it is very expensive to calculate the expected round duration based on this Markov chain, Függer et al. (2015) presented easily computable but quite conservative upper and lower bounds. The main purpose of this paper is to develop a new upper bound approximation, which will be shown, both analytically and by means of simulations, to improve the known upper bound. It will also be stated as a Markov chain and is therefore still expensive to calculate, but it reduces the state space from exponential to polynomial size in N.

Algorithm of our Upper Bound
We will now present the algorithm that generates our new upper bound, which will be called the α′-synchronizer. The main idea is to insert a reordering step between steps (1) and (2) of the computation, which reorders the entries in the matrix a(t) in such a way that the generation of 1-entries in the diagonal is avoided as long as possible. Roughly speaking, we take all non-diagonal entries of the matrix, sort them in descending order of their value, and fill them in column by column.
Formally, we introduce a step (1a) after updating the knowledge: (1a) Reordering of the knowledge: Let (k_i) be a decreasing sorting of the multiset {A^{(1)}_{i,j}(t) : i ≠ j}. Now fill this sequence into the matrix column by column (omitting the diagonal) until position (N − 2, N − 1) has been filled. Then go to (1, N) and fill the last column. Finally, fill position (N, N − 1).
However, in order to be able to prove that this indeed results in an upper bound, we need to slightly modify the above simple strategy of updating the knowledge, to slow down the synchronizer even further: If we are in a state with exactly one r-process i [and (N − 1) (r+1)-processes], then some of i's (r−1)-knowledge will be updated to r-knowledge, rather than to the (r+1)-knowledge sent [by one of the other (r+1)-processes], if i switches to an (r+1)-process in step (2). This is implemented in our algorithm by introducing an additional step (3): (3) If w^{(2)}(t) − w^{(2)}(t − 1) = N − 1, then replace the last non-diagonal md^{(2)}(t)-entry (according to our filling rule) by md^{(2)}(t) − 1.
Step (1a) means one fills in the 1s first, then the 0s, and finally the (−1)s. For example, for a (4 × 4)-matrix, the positions are filled in the order given by the matrix S below (diagonal entries are marked ·):

S =
( ·  4  7  9 )
( 1  ·  8 10 )
( 2  5  · 11 )
( 3  6 12  · )

On the right-hand side we give an example of reordering a matrix resulting from computation step (1) and of incrementing round numbers afterwards. Note that, due to the reordering, 1-processes will be at the top of the matrix and 0-processes at the bottom. Note that in this example the reordering indeed has an effect: Without reordering, processes 3 and 4 would become 1-processes, whereas now only process 1 is a 1-process.
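The filling rule and the reordering step (1a) can be sketched as follows (hypothetical Python helpers using the 1-indexed positions of the text; the example matrix in the usage note is ours):

```python
def fill_order(N):
    """Positions (row, col), 1-indexed, in the order step (1a) fills them."""
    pos = []
    for c in range(1, N - 1):                      # columns 1..N-2, completely
        pos += [(r, c) for r in range(1, N + 1) if r != c]
    pos += [(r, N - 1) for r in range(1, N - 1)]   # column N-1, down to (N-2, N-1)
    pos += [(r, N) for r in range(1, N)]           # last column, rows 1..N-1
    pos.append((N, N - 1))                         # final position (N, N-1)
    return pos

def reorder(A):
    """Step (1a): sort the off-diagonal entries of A in decreasing order
    and refill them according to fill_order; the diagonal is untouched."""
    N = len(A)
    vals = sorted((A[i][j] for i in range(N) for j in range(N) if i != j),
                  reverse=True)
    B = [row[:] for row in A]
    for v, (r, c) in zip(vals, fill_order(N)):
        B[r - 1][c - 1] = v
    return B
```

For N = 4, `fill_order` reproduces exactly the order shown in the matrix S above, e.g. position (2, 1) is filled first and (4, 3) last.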
Example 2 This example illustrates the α′-synchronizer and compares it to the α-synchronizer: Here the α-synchronizer does two global round steps (at times t = 2 and t = 4), but the α′-synchronizer only does one (at t = 4). So the α′-synchronizer is indeed slower than the original α-synchronizer. We will see later on, however, that, when provided with the same sequence of matrices P(t), this slowdown is not always the case. Note that at time t = 4 we applied step (3). Otherwise the rightmost entry in the first row would be 0.
Observe that several different sequences of matrices P(t) can lead to the same a′(t) (but different a(t)). Note that, without the additional modification (step (3)) of the primary reordering-of-knowledge step (1a) needed to make our proofs work, one can achieve a(t) = a′(t) if the matrices P(t) are chosen properly, as the following example shows: In this case, another view of the basic upper bound construction is not to employ the reordering step (1a), but to choose specific matrices P(t + 1) depending on the state a(t). Unfortunately, though, this does not work if the additional modification is employed.
The proof that this algorithm indeed leads to an upper bound is given in Section 3.3. First, however, we give a detailed description of the corresponding Markov chain.

Markov Chain of our Upper Bound
We now describe the Markov chain that models our upper bound. For this purpose, note that, due to our reordering step (1a), it is no longer necessary to store the whole matrix: It would be sufficient to know the number of 1s and 0s we have to fill in. However, we will use another representation, which is even more appropriate for our purposes: We represent a matrix by a pair (x, a), where x denotes the number of 0-entries in the diagonal (i.e., 0-processes), and a is the number y of non-diagonal 0-entries if x = N, whereas a is the number z of non-diagonal 1-entries if 1 ≤ x ≤ N − 1.
Note that these pairs contain all the information we need: Firstly, there can only be 0- or 1-entries in the diagonal, so it is sufficient to know the number of one of them. Secondly, if x = N, then the only non-diagonal elements are (−1) or 0, so again knowing one of the two counts is sufficient. Thirdly, if 1 ≤ x ≤ N − 1, we can have 1s, 0s, and (−1)s outside the diagonal. But, due to our filling rule, x 0-processes imply that we have exactly x (−1)-entries! Hence knowing the number of either the 1s or the 0s is again sufficient.
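As a sanity check of this representation, the map from a normalized state matrix to its pair (x, a) can be sketched as follows (a hypothetical helper, not from the paper):

```python
def pair_state(a):
    """Map a normalized state matrix to its pair representation (x, a):
    x = number of 0-processes; the second component is the number of
    non-diagonal 0-entries if x == N, else the number of non-diagonal 1s."""
    N = len(a)
    x = sum(1 for i in range(N) if a[i][i] == 0)
    if x == N:
        val = sum(1 for i in range(N) for j in range(N)
                  if i != j and a[i][j] == 0)   # y: non-diagonal 0-entries
    else:
        val = sum(1 for i in range(N) for j in range(N)
                  if i != j and a[i][j] == 1)   # z: non-diagonal 1-entries
    return (x, val)
```
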
For simplicity, we introduce the abbreviations ν = N(N − 1) (the total number of non-diagonal entries) and ν(x) = (N − 1)(N − x). Our Markov chain can now be described as follows: The state space consists of the pairs (x, y) with x = N and 0 ≤ y ≤ N(N − 2), and the pairs (x, z) with 1 ≤ x ≤ N − 1 and 0 ≤ z ≤ ν(x) − δ_{1x}, where δ_{ij} is the Kronecker delta, i.e., δ_{ij} = 1 if i = j and 0 otherwise. Clearly, we can have 1 ≤ x ≤ N 0-processes. For y, observe that with N 0-processes there can be up to N(N − 2) non-diagonal 0-entries (since there must be one remaining (−1) in each row). For the number of possible 1s we have to be a little more careful: If x ≥ 2, then for each 1-process there can be up to (N − 1) 1-entries in the corresponding columns. On the other hand, if x = 1, we have to subtract one because the last process still needs a (−1)-entry. So the number of states of the Markov chain equals N(N² − 1)/2. Let s_1 = (x_1, y_1, z_1) and s_2 = (x_2, y_2, z_2) be the uniform representation of the states s_1 and s_2, where, depending on e.g. x_1, either y_1 or z_1 is not used. Then the transition probabilities p_{s_1 s_2} from state s_1 to s_2 are given by p_{s_1 s_2} = p̃_{s_1 s_2} + p̂_{s_1 s_2}, where p̃_{s_1 s_2} is the probability of a transition without a global round switch and p̂_{s_1 s_2} is the probability of a transition with a global round switch; for p̃_{s_1 s_2} we distinguish the cases (3a)-(3d). In case (3a), where x_1 = x_2 = N, the number of non-diagonal 0-entries can only increase and is bounded by N(N − 2) (as mentioned above). Hence, to do the state transition, exactly (y_2 − y_1) of the (ν − y_1) relevant 0-messages must arrive in order to replace (y_2 − y_1) of the (−1)-entries in s_1.
If x_1 = x_2 < N (case (3b)), the number of 1-entries cannot decrease. Since the number of 0-processes remains the same, none of the x_1 (−1)-entries is allowed to be overwritten by a 0 or a 1. This gives the factor (1 − p)^{x_1}. Moreover, the number of 1s increases by (z_2 − z_1); hence, for x_1 > 1, we can choose them among the (ν(x_1) − z_1) relevant channels. If x_1 = 1, we have to subtract the one (−1)-entry of the last process (otherwise we would do a global round switch).
In case (3c), where we have a transition from a state with N 0-processes to a state with x_2 < N 0-processes, we must have z_2 = 0, since in s_1 there are only 0-processes and so no 1-messages can be sent. Due to the fact that having x_2 0-processes is equivalent to having x_2 (−1)-entries, all but x_2 of the (ν − y_1) (−1)-entries must be replaced by 0.
In case (3d), we again do a transition with a decreasing number of 0-processes. To ensure that exactly (x_2 − x_1) 0-processes become 1-processes, exactly (x_2 − x_1) of the x_1 (−1)-entries in s_1 must be replaced with 0s. This leads to the corresponding binomial term. Clearly, the number of 1-entries cannot decrease. So we have to choose exactly (z_2 − z_1) messages from the (ν(x_1) − z_1) relevant channels outgoing from 1-processes for a successful transmission.
The second part of the transition probabilities, p̂_{s_1 s_2}, is obtained as follows: Clearly, if a global round switch occurred, we have only 0-processes in the matrix. Hence, a positive transition probability is only possible if x_2 = N and z_2 = 0. In case (4a), we also have N 0-processes in state s_1. Thus, only 0-messages can be sent, and so we have only (−1)s in the non-diagonal entries of the matrix after the round switch. This is why y_2 has to be 0. To do the round switch, all of the remaining (ν − y_1) (−1)-entries in s_1 have to be overwritten by 0. This immediately gives the transition probability in this case. Let us turn to case (4b): We start with x_1 > 1: If we have (N − x_1) 1-processes and z_1 non-diagonal 1-entries in s_1, then the number y_2 of non-diagonal 0-entries in s_2 is at least z_1 (since the existing 1s convert to 0s in a global round switch) and at most ν(x_1) = (N − 1)(N − x_1) (the maximum number of 1s generated by (N − x_1) 1-processes). To make this state transition, we have to take two things into account: Firstly, all x_1 (−1)-entries in s_1 must be overwritten; this gives the term p^{x_1}. Secondly, we have to produce exactly (y_2 − z_1) new 1-entries (before doing the reduction due to the global round switch). These 1s overwrite existing 0s. Thus, we can choose them from the (ν(x_1) − z_1) relevant channels outgoing from 1-processes. Finally, note that in case x_1 = 1 the last (−1)-entry is overwritten by 0 although the corresponding message was sent from a 1-process! This is why we have to add the correction term −δ_{1x_1}.

Proof of the Upper Bound
Now we want to show that our previously defined process is indeed an upper bound. First of all, it is worthwhile to mention that this cannot be done execution-wise, since there exist schedules such that the α′-synchronizer is faster than the α-synchronizer.
Example 4 We give an example of a schedule such that the α′-synchronizer is faster than the α-synchronizer. Here A′(t) and A(t) denote the matrices as defined in Section 2.3.
To handle this problem, we will construct a measure-preserving bijection f on the set of schedules (i.e., a bijection that preserves the number of ones in the message patterns at each time step) such that the α-synchronizer under a schedule E is always faster than the α′-synchronizer given the schedule f(E). The basic idea behind our construction is to map the relevant messages of the α-synchronizer onto the relevant messages of the α′-synchronizer in such a way that the α-synchronizer is always in front. Let M denote the set of message patterns, E the set of schedules, and E_n = M^n the set of sequences of message patterns of length n (i.e., prefixes of schedules). For E ∈ E or E_m, let E_n denote the n-th element of E. Moreover, let |E_n| = Σ_{i,j} E_{n,i,j} denote the number of ones in E_n. With E_{≤n} we denote the prefix of length n of E, i.e., the tuple (E_1, ..., E_n). Then we have the following theorem: Theorem 1 There exists a measure-preserving bijection f on the set of schedules such that the α-synchronizer under any schedule E is at least as fast as the α′-synchronizer under f(E). To simplify notation, we define for a matrix A(n) the submatrix A(n, u, v) as the submatrix consisting of the intersection of the rows of u-processes and the columns of v-processes. Now we will inductively construct functions f_n : E_n → E_n (n ≥ 1), the limit of which gives f, with the following properties:
Then the function f defined by f(E) = lim_{n→∞} f_n(E_{≤n}) has the stated properties.
Remark 1 It is worthwhile to mention that conditions (iv) and (v) are the key invariants used below. We start with f_1 := id_M. Then f_1 obviously fulfils (i)-(v). Let f_n with (i)-(vi) be already defined. We will first construct, depending on E_{≤n} (and consequently on A(n)), the function f_{E_{≤n}} : M → M. Then the function f_{n+1} : E_{n+1} → E_{n+1} is defined via Eq. 5. For our construction, we have to distinguish cases:
Case A, R(n) = R′(n) = r: Given the matrices A(n) and A′(n), we define the following sets (see Fig. 2 for the partitioning of the matrix A into the sets A, B, C, D, F, G):
B_n, B′_n = {positions of r-entries in A(n, r+1, r+1) and A(n, r, r+1) resp. A′(n, r+1, r+1) and A′(n, r, r+1)},
F_n, F′_n = {positions of r-entries in A(n, r+1, r) resp. A′(n, r+1, r)},
C_n, C′_n = {positions of non-diagonal r-entries in A(n, r, r) resp. A′(n, r, r)},
D_n, D′_n = {positions of (r−1)-entries in A(n, r, r) resp. A′(n, r, r)}.
For a graphical interpretation of these sets see Fig. 2a: The sets A and B contain (the positions of) the (r+1)- resp. r-entries in regions 1 and 3, region 2 is exactly F, the (r−1)-entries in region 3 correspond to G, and the sets C and D contain the r- resp. (r−1)-entries in region 4. A visualisation of these sets is given in Fig. 2b.
Moreover, B, D, and G correspond to the relevant channels, M = B ∪ D ∪ G, in A(n) (and analogously for A′(n)).
Our assumptions imply |A| + |G| ≥ |A′| (due to (v)) and |D| + |G| ≤ |D′| + |G′| (see (iv); we have m_n = r − 1). The next property is a little more involved: Note that in Case A we have m_n = r − 1. Moreover, due to (iv), we have at least as many non-diagonal (r−1)-elements in A′(n) as in A(n); recall that, because of computation step (1a) (where messages are inserted column by column), we generate with b_n (r−1)-elements as many r-processes in A′(n) as possible. Hence the number of r-processes in A(n) (i.e., the number of diagonal elements equal to r) is less than or equal to the number of r-processes in A′(n); equivalently, |C| + |D| ≤ |C′| + |D′| (the sizes of the submatrices A(n, r, r) and A′(n, r, r)), or equivalently |A| + |B| + |G| ≥ |A′| + |B′| + |G′| (7) (the total number of entries in the columns with (r + 1) in the diagonal).
Note that G′ is either empty or contains just one element, namely (N, N − 1). This is the special case in which we interpret the 1-message transmitted on this channel only as a 0-message (i.e., applying step (3)).
In case m_{n+1} = r (i.e., a global round switch occurred in A(n), which means all (r−1)-entries in A(n) were overwritten by r or (r+1)), let B^− resp. B′^− denote the number of successfully transmitted messages in the sets B resp. B′. Hence b_{n+1}(m_{n+1}) = |F_n| + |C_n| + |D_n| + |B_n| − B^− + |G_n|. The last inequality holds in case min(|B|, |B′|) = |B′| due to condition (v), and in case min(|B|, |B′|) = |B| due to Eq. 7. Thus also (iv) is valid. If R(n+1) = R′(n+1), we also have to verify condition (v). In case R(n+1) = r + 1, there are no (r+2)-processes and the condition is fulfilled trivially. On the other hand, i.e., in case R(n+1) = r, firstly observe that the difference of (r+1)-entries can decrease by at most max(0, |B′| − |B|) (due to b)). Secondly, use the following partitioning of G_{n+1}: We can write G_{n+1} = (G_n ∪ N) \ G^−, with N = G_{n+1} \ G_n the set of 'new' (r−1)-positions in A(n+1, r, r+1) (i.e., those (r−1)-entries which are in columns of processes which switched from r- to (r+1)-processes from A(n) to A(n+1)) and G^− the set of arriving (!) messages related to positions in G_n.
Note that we do not need the term |G^−| in the second equation, since in the only case where G′ is nonempty the transmitted message is treated as a 0 due to computation step (3) of our bound. This yields the required estimate; note that the second inequality uses property b) of the bijection, and the last inequality is again valid due to (v) or Eq. 7. So the proof is finished in Case A.
Case B, R(n) − 1 = R′(n) = r: Given the matrices A(n) and A′(n), we define the following sets (in addition to the sets A′, B′, F′, C′, D′, G′, and A, B, F, C, D, G):

A visualisation of the partitioning of A(n) and A′(n) into these sets is given in
Again, define f_{E_{≤n}} as the bijection on M induced by h. Then conditions (i) and (ii) for f_{n+1} defined in Eq. 5 are obviously fulfilled. The fact R′(n+1) ≤ R′(n) + 1 immediately implies (iii). Let us verify (iv): If m_{n+1} = r + 1 (i.e., a global round switch in A(n) occurred), then m_{n+1} ≥ max A′(n + 1) and we are done. Otherwise, if m_{n+1} = r, observe that m_n = r, that B and J contain all relevant r-entries of A(n), and that B′ contains all those relevant entries in A′(n) which can be overwritten by (r+1). Hence the change in a_n and b_n can be bounded, and so (iv) is proven.
To finish the proof in Case B, it remains to check condition (v) in the case R(n+1) = R′(n+1) = r + 1, i.e., when a global round switch also occurred in A′(n); but then there are no (r+2)-processes in A′(n+1), and so (v) is trivially fulfilled.
Case C R(n) > R′(n) + 1: Here we define f_{E_{≤n}} := id_M. Conditions (i) and (ii) are clear, and we can ignore (v). From R(n) > R′(n) + 1 we obtain R(n + 1) > R′(n + 1) and thus (iii). Moreover, we have min A(n) ≥ max A′(n) and hence A(n + 1)_{i,j} ≥ A′(n + 1)_{i,j} for all 1 ≤ i, j ≤ N; this implies (iv).

Asymptotics for p → 0
In this section, we give the asymptotics of the behavior of the α′-synchronizer for p → 0, which we argued in Section 1 to be the appropriate regime of p for wireless networks with destructive collisions (Moscibroda and Wattenhofer 2006).
Lemma 1 For p → 0, the expected time until n messages (each arriving independently with probability p per time step) have all arrived is asymptotically H_n/p, where H_n denotes the n-th harmonic number.
The expected round duration λ′ can be written in terms of these expectations: Let â(t) be the Markov chain obtained from a(t) by adding to each state a an additional flag Step such that Step(â(t)) = 1 if there is a global round switch at time t, and 0 otherwise. Then λ′ is obtained by summing, over all states â, the product of Step(â), π̂(â), and the expected time to the next global round switch, the latter depending only on the number #_{−1}(â) of (−1)-entries in â and on p; here π̂ denotes the steady-state distribution of the Markov chain â(t) (see Függer et al. (2015) for details). This formula shows that λ′ can be represented as a weighted sum (weighted by the steady-state probabilities) of the expected time to the next global round switch starting in state â, where we only take those states which can occur after a global round switch.
Note that this representation is equivalent to the following one: let b(r) be the Markov chain restricted to the states which can occur after a global round switch, i.e., to the states â with Step(â) = 1, with steady-state distribution ρ; then λ′ admits the analogous representation in terms of ρ. Moreover, a standard method to compute the stationary distribution of a Markov chain is the following: Let P denote the transition matrix of a homogeneous and irreducible Markov chain with finite state space. Then the steady-state distribution π is the solution of the equation Bπ = e with e = (0, …, 0, 1)^T and B = ((P − I)^{(n→1)})^T, where M^{(k→x)} denotes the matrix M with its k-th column set to x. For all states s of the Markov chain we have π(s) > 0, and by Cramer's rule π(s) = det(B_s)/det(B) holds, where B_s is the matrix arising from B by replacing the column corresponding to s by e. In particular, all matrices B_s and B are regular. Now we can establish the asymptotics for the α′-synchronizer.
Remark 2 Let us give a rough explanation of formula Eq. 11 (see Step 3 of the proof for details): With # we denote 'the number of', and let [t_d, t_{d+1}) be the time interval from switching from d to (d + 1) 1-processes. Then P(#1 = k) stands for the probability that the number of (non-diagonal) 1s in the matrix a just before a global round switch is performed equals k, and P(#(new 1s in [t_d, t_{d+1})) = e | #(old 1s) = g) is the probability that in [t_d, t_{d+1}) e new 1-entries in the matrix a are created, given that we already had g non-diagonal 1-entries before.
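The column-replacement method just described can be sketched in a few lines (a generic routine for any finite irreducible chain; the two-state matrix below is purely illustrative and not a synchronizer chain):

```python
import numpy as np

def steady_state(P):
    """Stationary distribution via the column-replacement method described
    above: solve B pi = e with B = ((P - I)^(n->1))^T and e = (0,...,0,1)^T."""
    n = P.shape[0]
    B = P - np.eye(n)
    B[:, n - 1] = 1.0   # M^(k->x): set the n-th column to the all-ones vector
    e = np.zeros(n)
    e[-1] = 1.0
    return np.linalg.solve(B.T, e)

# Toy 2-state chain (purely illustrative):
P = np.array([[0.9, 0.1],
              [0.5, 0.5]])
pi = steady_state(P)    # -> [5/6, 1/6]
```

The first n − 1 rows of Bπ = e encode πP = π componentwise, while the replaced column enforces the normalization Σ_s π(s) = 1.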

Proof
First step: We will observe that for the limit p → 0 it is enough to consider a restricted model in which only those transitions are possible which arise from the successful arrival of at most one relevant message per time step: Denote by P the transition matrix of the Markov chain â for the α′-synchronizer and consider the reduced matrix P̃ = P (mod p^2) (i.e., we delete all terms of order greater than or equal to 2 in p from P). Then P̃ is still a stochastic matrix for p sufficiently small: the row sums are still one, since the deleted higher-order terms cancel within each row, but some of the remaining terms of the form (1 − dp) could be negative for larger p. The matrix P̃ corresponds to the model in which at most one message arrives per time step. Since every transition with k > 1 arriving messages can be simulated by k transitions with only one arriving message, the corresponding Markov chain is still homogeneous, aperiodic and irreducible, and thus has a unique steady-state distribution.
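This truncation step can be illustrated concretely by storing each matrix entry as a coefficient list in p (a toy encoding of our own, not the actual synchronizer chain):

```python
def mod_p2(P):
    """Drop all terms of order >= 2 in p from every entry; entries are
    polynomials in p given as coefficient lists [c0, c1, c2, ...]."""
    return [[entry[:2] for entry in row] for row in P]

def row_sums(P):
    """Constant and linear coefficients of each row sum; for a (truncated)
    stochastic matrix these are 1 and 0 respectively."""
    def coeff(e, i):
        return e[i] if len(e) > i else 0
    return [(sum(coeff(e, 0) for e in row), sum(coeff(e, 1) for e in row))
            for row in P]

# A toy 2x2 transition matrix with entries 1-2p+p^2, 2p-p^2, p, 1-p:
P = [[[1, -2, 1], [0, 2, -1]],
     [[0, 1], [1, -1]]]
P_hat = mod_p2(P)   # entries 1-2p, 2p, p, 1-p
```

Note that the truncated matrix still has row sums with constant part 1 and linear part 0, matching the argument above that the higher-order terms cancel within each row.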
In particular, B̃ = ((P̃ − I)^{(n→1)})^T and the matrices B̃_s are regular. Since π̂(â) = det(B_â)/det(B), and all entries of B and B_â are polynomials in p, the involved determinants are polynomials in p, too; therefore it is sufficient for the limit p → 0 to know the term with the lowest power of p. Since for a matrix M = (m_{i,j}) we have det M = Σ_σ sgn(σ) Π_i m_{i,σ(i)}, we just need to know the lowest power of p in each row (provided these terms do not cancel out in the sum). For B, note that the last row consists only of 1s, whereas in all the other rows the lowest power is p (the only entries where a constant term could appear are on the diagonal, when no message arrives and we stay in the same state, but we have subtracted the identity matrix!). Hence the term with the lowest power of p in det B is of the form c · p^{r(B)−1} (with r(B) the number of rows of the matrix B), and this is nothing else than the lowest-order term of det(B (mod p^2)). But B (mod p^2) is the same as ((P̃ − I)^{(n→1)})^T, which is regular as mentioned above, and hence we can use the restricted model. Second step: Let σ_s and σ_{s′} be two states that can occur after a global round switch. If we pass from σ_s to σ_{s′}, then we have to go through the state σ̂. To see this, assume that we are in state σ_s and let k denote the number of (non-diagonal) zeros in σ_s. Due to computation step (3) of our upper bound we have k ≤ N(N − 2), and due to (1a) each row contains at least one (−1)-entry. Again due to (1a), new zeros are filled in according to our construction 'column by column', but at most one zero per time step. Hence, we have to pass σ̂. Third step: Now we want to compute the probability P(#1 = k) of exactly k ones occurring at a global round switch given that we start from state σ̂ (in fact, ρ(N^2 − N − k) = P(#1 = k) with ρ from Eq. 10). When passing from σ̂ to the next state σ_s after a global round switch, let t_d denote the time when we switch to a state with d one-processes.
In a first step we will determine the probability P(#(new 1s in [t d , t d+1 )) = e|#1 = g), i.e., the probability of exactly e new 1-entries in the interval [t d , t d+1 ) (i.e., the time until the next 0-process becomes a 1-process) given already g 1-entries in the matrix a(t d ).
Observe that the probability of the successful transmission of a 0-message equals (N − d)p (recall that we have (N − d) 0-processes), and the probability of the arrival of the s-th new 1-message equals (d(N − 1) − g − (s − 1))p (we have d(N − 1) channels outgoing from 1-processes, and we have to subtract the number of 1s that have already arrived). Thus, the s-th sequence of length k_s appears with the corresponding probability. Hence, by summing over all possible lengths k_s ≥ 0 and multiplying these three factors, Eq. 12 is shown. In case d = N − 1 the situation changes a little: here we have only (d(N − 1) − 1) = ((N − 1)^2 − 1) channels outgoing from 1-processes not leading to the global round switch; hence the probability of the arrival of the s-th new 1-message equals ((N − 1)^2 − 1 − g − (s − 1))p. Analogously to the above we obtain (13).
In a second step we compute P(#1 = k): To obtain the probability of k ones at the global round switch, we have to sum over all weak compositions of k with (N − 1) parts (a weak composition of k with m parts is an ordered collection of m non-negative integers whose sum is k). Moreover, the number of arrived 1s cannot exceed the number of possible 1-channels. Hence (11) is shown. Fourth step: Now we can determine the constant: by Eq. 10 and Lemma 1, the stated asymptotics follows.
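The weak compositions appearing in this sum can be enumerated by the standard stars-and-bars bijection; the small helper below is illustrative:

```python
from itertools import combinations

def weak_compositions(k, m):
    """Yield all ordered m-tuples of non-negative integers summing to k
    (the weak compositions of k with m parts used in the sum above).
    Stars and bars: choose the positions of m-1 bars among k+m-1 slots."""
    for bars in combinations(range(k + m - 1), m - 1):
        parts, prev = [], -1
        for b in bars:
            parts.append(b - prev - 1)
            prev = b
        parts.append(k + m - 2 - prev)
        yield tuple(parts)

# e.g. weak_compositions(2, 2) yields (0, 2), (1, 1), (2, 0)
```

There are C(k + m − 1, m − 1) such compositions; in the proof the parts are additionally capped by the number of available 1-channels.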

Relation to the Existing Upper Bound
In this section, we will prove that our new upper bound provided by α′ is better than the one presented in Függer et al. (2015). The upper bound for the α-synchronizer given in Függer et al. (2015) was λ_III(N, p), the expected time until all N(N − 1) messages of a round have arrived; it originated from a synchronizer variant whose knowledge is reset after each global round switch. The following Corollary 1 shows that λ_III is also an upper bound for λ′, which confirms our claim. Moreover, it establishes that our new upper bound has a strictly better asymptotic behavior for p → 0.
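Under this reading, λ_III(N, p) is the expected maximum of N(N − 1) i.i.d. geometric variables with success probability p, which can be evaluated by inclusion-exclusion and exhibits the harmonic-number asymptotics H_M/p of Lemma 1 (a sketch of our own, assuming the independent-arrival model of Függer et al. (2015)):

```python
from math import comb

def expected_max_geometric(M, p):
    """Expected number of time steps until all M messages have arrived,
    when each message arrives independently with probability p per step
    (the expected maximum of M i.i.d. Geometric(p) variables), computed
    by inclusion-exclusion over the survival function."""
    return sum((-1) ** (k + 1) * comb(M, k) / (1 - (1 - p) ** k)
               for k in range(1, M + 1))

def harmonic(n):
    """n-th harmonic number H_n."""
    return sum(1 / k for k in range(1, n + 1))

# For one message this is the plain geometric mean 1/p, and for p -> 0
# the product p * expected_max_geometric(M, p) tends to H_M.
```

For instance, with N = 3 the old bound tracks the expected maximum of N(N − 1) = 6 geometric arrivals, so its asymptotic constant is H_6/1 times 1/p.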
Corollary 1 Let θ_III denote the constant such that λ_III(N, p) ∼ θ_III/p for p → 0. Then λ′(N, p) ≤ λ_III(N, p), and the corresponding constant θ′ for λ′ satisfies θ′ < θ_III.
To compare λ′(N, p) with λ_III(N, p), we define a synchronizer α′_f in the same way as α′, except that we reset the knowledge after every global round switch in α′_f, i.e., we introduce a new "forgetting" step (4). Let A′ and A′_f denote the matrices of the α′- resp. α′_f-synchronizer. Then the relation A′(t)_{i,j} ≥ A′_f(t)_{i,j} for every t, i, j can be seen directly by comparing A′ and A′_f execution-wise. Secondly, to see that α′_f is equivalent to λ_III, it is sufficient to note the following two properties: (i) Step (3) is redundant for α′_f due to the forgetting step (4). (ii) The reordering step (1a) cannot affect the expected time until the next global round switch, since we reset all knowledge and thus all N(N − 1) messages have to arrive every round.
Showing that our new bound has a better asymptotic behavior for p → 0 is straightforward: θ′ is strictly smaller than θ_III = H_{N(N−1)}, which concludes the proof.
Remark 3 If one compares the first few values of θ′ and θ_III given below, it is apparent that our new upper bound is indeed strictly better than the old one.

New Lower Bound
Similar to the upper bound, we can also construct a lower bound for the α-synchronizer. We will start by presenting the algorithm that generates our new lower bound, which will be called the α′′-synchronizer. Subsequently, we will give the proof that it is indeed a lower bound. In contrast to the upper bound, we now choose the channels on which messages arrive in a 'good' way, i.e., we try to generate 1-entries in the diagonal as fast as possible.
The basic idea is to fill in incoming messages received by 0-processes line by line. But to make our proof work, we have to speed up our algorithm even more: roughly speaking, we will treat entries below the diagonal as 1-entries. We adapt steps (1a) and (3) of the α′-synchronizer in Section 3.1. Let md := md^{(1)}(t) and w := w^{(1)}(t), cf. Section 2.3.
(1a) Partial reordering of the knowledge: Let (k_i) be a decreasing sorting of the multiset {A^{(1)}(t)_{i,j} : A^{(1)}(t)_{i,i} = md}, i.e., a sorting only of the knowledge of the md-processes (note that the rows of (md + 1)-processes remain unchanged!). Now fill this sequence into the rows of 0-processes again according to the following rule: fill them in line by line, but if a non-(md − 1)-entry is to be placed into A^{(1)}(t, md, md), replace it by md, and if it is to be placed into A^{(1)}(t, md, md + 1), replace it by (md + 1).
This means that we not only reorder the knowledge, we also sometimes change the values of the knowledge, depending on the round number of the new sender. Note the following: (α) Firstly, this reordering step causes the diagonal of a(n) to have all 1-processes before the 0-processes, i.e., the diagonal is of the form (1, …, 1, 0, …, 0). If the 1s of the diagonal are at positions 1, …, j − 1, we call process j the first 0-process (of a(n)). (β) Secondly, this reordering implies that below 1-entries in the diagonal we have only 1- or (−1)-entries.
Step (3a) means that if the system does not perform a global round switch and if there are processes in front, we increase some knowledge below the diagonal. This modification is necessary to make our proof work: it avoids certain executions in which the α-synchronizer would overtake the lower bound. Roughly speaking, it can happen that in the lower bound several 0-processes become 1-processes in the same time step, whereas in the α-synchronizer they switch step by step. Then, in the lower bound, there are 0-entries between the 'new' 1-processes below the diagonal, while in the α-synchronizer there are still (−1)-entries at those positions, which will only be overwritten by 1s step by step afterwards. Hence the α-synchronizer could be in front.
Step (3b) ensures that also after a global round switch the knowledge is immediately filled in in decreasing order. Note that omitting this step would lead to a mathematically equivalent synchronizer, since the reordering would be done in the next time step in step (1a); but with this additional step the synchronizer is easier to handle in what follows.
Example 7 This example illustrates the α′′-synchronizer and compares it to the α-synchronizer: here the α-synchronizer performs only one global round switch (at time t = 4), but the α′′-synchronizer performs two (at times t = 2 and t = 4). So the α′′-synchronizer is indeed faster than the original α-synchronizer. We will see later on, however, that, when provided with the same sequence of matrices P(t), this speed-up does not always happen. Note that at times t = 1 and t = 3 we applied step (3a), whereas at t = 2 we applied step (3b).
Formally, the lower bound can be described as follows: We consider the number x of 0-entries in the diagonal (i.e., of 0-processes), the total number y of 1-entries above the diagonal, and the total number z of 0- or 1-entries in rows of 0-processes, with n(x) = (N − x)(N − x − 1). Note that, due to the reordering step (1a), the non-(−1)-entries in rows of 0-processes are located in the row of the first 0-process if x < N. In case x = N we can have up to (N − 1)^2 0-entries, because after a global round switch the 1-entries become 0-entries, and we can have up to (N − 1) 1-processes and thus up to (N − 1)^2 non-diagonal 1-entries before performing the round switch. So the number of states of the Markov chain equals (N^4 − N^3 + 5N^2 − 11N + 12)/6. Moreover, let μ(y) = (N − 1)y, and let s_1 = (x_1, y_1, z_1) and s_2 = (x_2, y_2, z_2) be two states of our Markov chain. Then the transition probabilities p_{s_1 s_2} are given by p_{s_1 s_2} = p̃_{s_1 s_2} + p̂_{s_1 s_2}, where p̃_{s_1 s_2} is the probability of a transition from s_1 to s_2 without making a global round switch and p̂_{s_1 s_2} is the probability of a transition performing a global round switch. In both cases, we have to bound z_2 by (N − 2), since states with z_1 > (N − 2) can only occur directly after a global round switch, as all 0-processes with (N − 1) 0-entries would otherwise perform a local round switch and become 1-processes. Let us turn to the transition probabilities in detail: Equation 15a: If the number of 0-processes remains constant, the number of 1-entries above the diagonal and the number of non-(−1)-entries of the first 0-process trivially cannot decrease. For the number of ones we have (n(x_1)/2 − y_1) possible channels from which to choose the (y_2 − y_1) new entries. For the new entries of the first 0-process we have (μ(x_1) − z_1) possible channels. This immediately gives the transition probabilities in this case.
Equation 15b: If the number of 0-processes decreases, the number of 1-entries above the diagonal cannot decrease and is clearly bounded by n(x_1)/2. To generate (x_1 − x_2) new 1-processes and z_2 non-(−1)-entries at the first 0-process, (μ(x_1 − x_2) − z_1) + z_2 messages overwriting (−1)-entries must arrive (the first summand equals the number of non-diagonal entries in (x_1 − x_2) rows minus the number z_1 of already existing non-(−1)-entries in rows of 0-processes). This gives the stated formula in this case.
For the transition probability p̂_{s_1 s_2} of a transition performing a global round switch, note the following: After making a global round switch, the number of 0s in the diagonal equals N. For the number of zeros z_2 in s_2, recall that these 0-entries arise from 1-entries by applying the normalization step. Thus, the number of non-diagonal 0-entries z_2 must be at least the number of 1-entries above the diagonal before the switch (i.e., y_1) plus the number of all entries below the diagonal of 1-processes (for performing the global round switch all messages must arrive, and below 1-processes we have only 1-entries, see (β)); this number equals n(x_1)/2 + x_1(N − x_1). Since we had (N − x_1) 1-processes in state s_1 and each process can send messages to at most (N − 1) other processes, we have z_2 ≤ ν(x_1). Since all (−1)-entries must be overwritten, and since due to (β) no 0-entries exist below 1s in the diagonal, the only choice we have for relevant messages leading to 1-entries (before normalizing) which may arrive or not is to choose them from the (n(x_1)/2 − y_1) 0-entries above the diagonal; because of the argument above we have to choose z_2 − (y_1 + x_1(N − x_1) + n(x_1)/2) of them. This gives the binomial coefficient.
The total number of messages which must arrive for this state transition is the number of new 1-entries above the diagonal plus the number of (−1)-entries in s_1; this is the exponent of p. Finally, computing the exponent of (1 − p), we obtain the stated expression.

Proof of the Lower Bound
Now we want to show that our previously defined process is indeed a lower bound. As before, this cannot be done execution-wise, since there exist schedules such that the α-synchronizer is faster than the α′′-synchronizer.
Example 8 We give an example of a schedule such that the α-synchronizer is faster than the α′′-synchronizer. Here A′′(t) and A(t) denote the matrices as defined in Section 2.3. As before, we will construct a measure-preserving bijection g on the sets of schedules such that the α-synchronizer under a schedule E is always slower than the α′′-synchronizer given the schedule g(E). Again, the basic idea behind our construction is to map the relevant messages of the α-synchronizer onto the relevant messages of the α′′-synchronizer in such a way that the α′′-synchronizer is always in front.
Then we have the following theorem:
Theorem 3 There exists a measure-preserving bijection g on the set of schedules such that R(E, n) ≤ R′′(g(E), n) for all schedules E and all times n. Here, R(E, n) denotes the global round number of the α-synchronizer at time n given the schedule E, and R′′(E′′, n) the global round number of the α′′-synchronizer at time n given the schedule E′′.
Proof Let A(n) = A(E, n) and A′′(n) = A′′(g(E), n) denote the matrices of the α- and the α′′-synchronizer under the schedules E and g(E), respectively. Moreover, define m_n = min_{i≠j} A′′(n)_{i,j} as the minimum of the non-diagonal entries of A′′(n), and a_n(m_n) as the number of non-diagonal entries of A′′(n) that equal the minimum m_n. With b_n(m_n) we denote the number of non-diagonal entries in A(n) less than or equal to m_n. Similarly, R(n) = R(E, n) and R′′(n) = R′′(g(E), n) are the global round numbers of A(n) and A′′(n), respectively. Now we will inductively construct functions g_n : E_n → E_n (n ≥ 1), the limit of which gives g, with the following properties: (i) g_n is bijective on E_n, (ii) |E_n| = |(g_n(E))_n| for all E ∈ E_n, (iii) R(E, n) ≤ R′′(g_n(E), n) for all E ∈ E_n, (iv) a_n(m_n) ≤ b_n(m_n), and (v) if R(n) = R′′(n): |A_n| + |G_n| ≤ |A′′_n| + |G′′_n|, where (for matrices A(n) and A′′(n) with r = R(n) = R′′(n)) A_n, A′′_n = {positions of non-diagonal (r + 1)-entries in A(n) resp. A′′(n)} and G_n, G′′_n = {positions of (r − 1)-entries in A(n, r, r + 1) resp. A′′(n, r, r + 1)}.
Then the function g, defined by g(E) = lim_{n→∞} g_n(E_{≤n}), has the stated properties.
Remark 4 It is worthwhile to mention that conditions (iv) and (v) imply Σ_{i≠j} A(E, n)_{i,j} + |G_n| ≤ Σ_{i≠j} A′′(g_n(E), n)_{i,j} + |G′′_n|.
We start with g_1 := id_M. Then g_1 obviously fulfils (i)–(vi). Let g_n with (i)–(vi) be already defined. We first construct, in dependence on E_{≤n} (and consequently on A(n)), the functions g_{E_{≤n}} : M → M. Then the function g_{n+1} is defined by Eq. 17. For our construction, we have to make a case distinction: Case A R(n) = R′′(n) = r: Given the matrices A(n) and A′′(n), we define the sets A, A′′, … analogously to the proof of Theorem 1 for the upper bound. Moreover, let B̃_n resp. B̃′′_n denote the sets of positions of r-entries in A(n, r + 1, r + 1) resp. A′′(n, r + 1, r + 1) above the diagonal.
Our assumptions imply |A| + |G| ≤ |A′′| + |G′′| (due to (v)) and |D| + |G| ≥ |D′′| + |G′′| (see (iv); we have m_n = r − 1). Similar to the upper bound, due to (iv) and the reordering step (1a), where messages are inserted line by line, the number of r-processes in A_n is greater than or equal to the number of r-processes in A′′_n; equivalently |C| + |D| ≥ |C′′| + |D′′| (the sizes of the submatrices A(n, r, r) and A′′(n, r, r)), or equivalently |A| + |B| + |G| ≤ |A′′| + |B′′| + |G′′| (i.e., the total number of entries in columns of A(n) with (r + 1) in the diagonal is less than or equal to the number of entries in columns of A′′(n) with (r + 1) in the diagonal). In fact, by adding a term counting the difference of (r + 1)-processes in A(n) and A′′(n), we even have |A| + |B| + |G| + (x_n − x′′_n)(N − 1) = |A′′| + |B′′| + |G′′| (19) with x_n resp. x′′_n the number of r-processes in A(n) resp. A′′(n). Furthermore, analogous relations hold for the remaining sets. Let h be any bijection from [N]^2 to [N]^2 keeping the diagonal fixed with the required matching properties. Now define g_{E_{≤n}} on M as the function induced by h. We have to check properties (i)–(vi) for g_{n+1}.
Moreover, note that every arriving message related to G^− increases the number of (r + 1)-entries.
Additionally, we can decompose the set A_{n+1} in the following way: … But, due to our construction of the lower bound, the corresponding equation for A′′_{n+1} is … where L are the positions of the new (r + 1)-entries (instead of r-entries) below the diagonal of the new (r + 1)-processes (see step (3)). Define g_{E_{≤n}} as the bijection on M induced by h. Then conditions (i) and (ii) for g_{n+1} defined in Eq. 17 are obviously fulfilled. The fact R(n + 1) ≤ R(n) + 1 immediately implies (iii). Let us verify (iv). If m_{n+1} = r + 1 (i.e., a global round switch occurred in A′′(n)), then m_{n+1} ≥ max A(n + 1) and we are done. Otherwise, if m_{n+1} = r, observe that m_n = r, that B′′ and J′′ contain all relevant r-entries of A′′(n), and that B and G contain all those relevant entries in A(n) which can be overwritten by (r + 1). Hence the change in a_n and b_n can be bounded due to property (23), and so (iv) is proven.
To finish the proof in Case B, it remains to check conditions (v) and (vi) in the case R(n + 1) = R′′(n + 1) = r + 1, i.e., when a global round switch occurred in A(n); but then there are no (r + 2)-processes in A(n + 1), and so (v) and (vi) are trivially fulfilled.
Case C R(n) + 1 < R′′(n): Here we define g_{E_{≤n}} := id_M. Conditions (i) and (ii) are clear, and we can ignore (v) and (vi). From R(n) + 1 < R′′(n) we obtain R(n + 1) < R′′(n + 1) and thus (iii). Moreover, we have max A(n) ≤ min A′′(n) and hence A(n + 1)_{i,j} ≤ A′′(n + 1)_{i,j} for all 1 ≤ i, j ≤ N; this implies (iv).

Discussion and Future Work
In the previous sections, we constructed an upper and a lower bound for the α-synchronizer and proved several properties of these approximations. In this section, we complement our analytical findings with simulation results.
Based on our constructions of the Markov chains in Sections 3 and 4, the upper and lower bounds can be directly determined by solving the linear equation system for the steady-state distribution π of the respective Markov chain and by computing the expected round duration from π using Függer et al. (2015, Theorem 5). Algorithms 1 and 2 provide the details of these computations.
In Figs. 4 (for p fixed) and 5 (for N fixed), we compare simulations of the α-synchronizer against the exact values of the bounds given in Függer et al. (2015) and of our new bounds. For the simulations, we performed 30 runs with 100000 time steps each. The results for the α-synchronizer are represented as (very thin) box-whisker charts in Fig. 4, whereas in Fig. 5 we only show the average. Note that the bounds of Függer et al. (2015) are represented by triangles, whereas our bounds are represented by squares. Particularly noticeable is the very good approximation by our new upper bound, which confirms the analytic dominance results obtained in Section 3.5.
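A minimal Monte-Carlo simulation of the α-synchronizer in the spirit of these experiments can be sketched as follows (our own reconstruction from the informal description in the introduction; all parameter names are ours):

```python
import random

def simulate_alpha(N, p, steps, seed=0):
    """Monte-Carlo estimate of the expected round duration of the
    alpha-synchronizer: A[i][j] is process i's knowledge of process j's
    round number; every time step each process rebroadcasts its round,
    each message arrives independently with probability p, and a process
    switches to the next round once it has seen all processes in its
    current round."""
    rng = random.Random(seed)
    rnd = [0] * N                      # current round of each process
    A = [[0] * N for _ in range(N)]    # knowledge matrix
    for _ in range(steps):
        for i in range(N):             # message delivery
            for j in range(N):
                if i != j and rng.random() < p:
                    A[i][j] = rnd[j]
        for i in range(N):             # local round switches
            A[i][i] = rnd[i]
            if min(A[i]) >= rnd[i]:
                rnd[i] += 1
    completed = min(rnd)               # number of completed global rounds
    return steps / completed if completed else float('inf')
```

With p = 1 every round takes exactly one time step, which serves as a sanity check; for p < 1 the estimate can be compared against the bounds computed from the Markov chains.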
Our bounds are indeed an improvement over the results of Függer et al. (2015): we reduced the state space of the Markov chains from exponential to polynomial size, and our bounds (especially the upper bound) are much tighter. Nevertheless, we do not have closed formulas, and constructing and solving the Markov chains of our bounds is still expensive: we could solve them algebraically only up to n = 4 (whereas for the α-synchronizer we could do it only for n = 3 and were not even able to generate the state space for n = 4). Numerical solutions are computable also for higher n: the upper bound we could solve for n = 12 in about 12 seconds, while for n = 20 it took about five minutes. The lower bound is more time-consuming: for n = 10 it took about half a minute, whereas for n = 12 the solution was computed in 2.2 minutes. To be more precise, the sizes of the state spaces of the Markov chains and the time and memory consumption of our algorithms computing the bounds are summarized in the following table.
Fig. 4 Monte-Carlo simulation results for the α-synchronizer (represented as boxplots) compared against the lower and upper bounds given in Függer et al. (2015), our new upper bound, and our new lower bound. As usual, the (orange) box of the boxplots represents the interval from the 25% to the 75% quantile, whereas the fences mark the minimum and the maximum.