Near-Optimal Self-stabilising Counting and Firing Squads
Consider a fully-connected synchronous distributed system of n nodes, where up to f nodes may be faulty and every node starts in an arbitrary initial state. In the synchronous counting problem, all nodes need to eventually agree on a counter that is increased by one modulo some C in each round. In the self-stabilising firing squad problem, the task is to eventually guarantee that all non-faulty nodes have simultaneous responses to external inputs: if a subset of the correct nodes receive an external “go” signal as input, then all correct nodes should agree on a round (in the not-too-distant future) in which to jointly output a “fire” signal. Moreover, no node should generate a “fire” signal without some correct node having previously received a “go” signal as input.
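The counting requirement can be made concrete with a small trace checker. The following is a minimal sketch, not part of the paper's framework: it assumes a recorded trace in which `trace[r][i]` is the counter output of correct node `i` in round `r`, and the hypothetical helper `stabilisation_round` returns the first round from which all correct nodes agree and the common value increases by one modulo `C` until the end of the trace.

```python
from typing import List, Optional


def stabilisation_round(trace: List[List[int]], C: int) -> Optional[int]:
    """Earliest round t such that, from t to the end of the finite trace,
    all correct nodes output the same counter value and that value
    increases by one modulo C each round. Returns None if even the last
    round is not in agreement.

    Illustrative assumption: trace[r][i] is the output of correct node i
    in round r; faulty nodes are simply left out of the trace.
    """
    t = len(trace)
    # Walk backwards while the suffix keeps satisfying the specification.
    for r in range(len(trace) - 1, -1, -1):
        row = trace[r]
        if len(set(row)) != 1:
            break  # correct nodes disagree in round r
        if r + 1 < len(trace) and trace[r + 1][0] != (row[0] + 1) % C:
            break  # counter did not advance by one modulo C
        t = r
    return t if t < len(trace) else None


# Arbitrary initial states in rounds 0 and 1, then correct counting modulo 3.
example = [[2, 0, 1], [0, 1, 1], [0, 0, 0], [1, 1, 1], [2, 2, 2], [0, 0, 0]]
print(stabilisation_round(example, C=3))
```

A finite trace can only witness stabilisation up to its last round; the problem statement itself requires the property to hold in all subsequent rounds.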
We present a framework reducing both tasks to binary consensus at very small cost while maintaining the resilience of the underlying consensus routine. Our results resolve various open questions on the two problems, most prominently whether (communication-efficient) self-stabilising Byzantine firing squads or sublinear-time solutions for either problem exist. For example, we obtain a deterministic algorithm for self-stabilising Byzantine firing squads with optimal resilience \(f<n/3\), asymptotically optimal stabilisation and response time \(O(f)\), and message size \(O(\log f)\). As our framework does not restrict the type of consensus routines used, we can also obtain efficient randomised solutions, and it is straightforward to adapt our framework to allow \(f<n/2\) omission or \(f<n\) crash faults.
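To make the three resilience bounds concrete, the sketch below computes the largest number of faulty nodes each bound permits for a given \(n\). The function name `max_faults` and the fault-model labels are illustrative assumptions; only the inequalities \(f<n/3\), \(f<n/2\) and \(f<n\) come from the text.

```python
def max_faults(n: int, model: str) -> int:
    """Largest integer f satisfying the strict resilience bound for n nodes.

    Bounds taken from the abstract:
      byzantine: f < n/3
      omission:  f < n/2
      crash:     f < n
    The model labels are hypothetical names chosen for this sketch.
    """
    if n < 1:
        raise ValueError("n must be a positive integer")
    divisor = {"byzantine": 3, "omission": 2, "crash": 1}[model]
    # Largest f with divisor * f < n, i.e. divisor * f <= n - 1.
    return (n - 1) // divisor


for model in ("byzantine", "omission", "crash"):
    print(model, max_faults(10, model))
```

For example, with \(n = 4\) nodes the Byzantine bound allows a single faulty node, whereas \(n = 3\) allows none.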