Keywords

1 Introduction

In computer science, separation of concerns is a standard design principle which consists of decomposing a complex problem into several simpler ones. These sub-problems are then solved independently and finally glued together to obtain a global solution to the initial problem. With this in mind, composition is a natural tool that simplifies both the design and proof of complex algorithms. For example, the sequential composition of two algorithms “\(\mathcal {A}_1; \mathcal {A}_2 \)” forces \(\mathcal {A}_1 \) and \(\mathcal {A}_2 \) to be executed in sequence, i.e., \(\mathcal {A}_2 \) is initiated only after \(\mathcal {A}_1 \)’s completion. Composition methods are widely used in distributed systems [4, 10, 25].

Self-stabilization [21] is a versatile fault-tolerant paradigm of distributed computing. Indeed, a self-stabilizing distributed algorithm resumes a correct behavior within finite time, regardless of the initial state of the system, and therefore also after a finite number of transient faults have hit the system and placed it in some arbitrary global state. The ability to implement sequential composition in a distributed system mainly relies on the ability to locally detect termination. Now, termination detection is inherently impossible for self-stabilizing algorithms [34]. Indeed, since the system may suffer from faults such as memory corruption, the nodes cannot trust their local memory. To circumvent this issue, several other composition operators devoted to self-stabilizing algorithms have been proposed, e.g., the fair [23] and cross-over [6] compositions. We are more particularly interested in the hierarchical collateral composition [20], a simple and widely used variant of the collateral composition [27]. This composition actually emulates the sequential composition “\(\mathcal {A}_1; \mathcal {A}_2 \)” by providing the same output despite \(\mathcal {A}_1 \) and \(\mathcal {A}_2 \) being executed (more or less) concurrently.

The PADEC framework [2, 3] is a library for certifying self-stabilizing algorithms. Certifying an algorithm means proving its correctness formally using a proof assistant, here Coq [9, 35], i.e., a tool that allows one to develop formal proofs interactively and mechanically check them. The framework includes tools to model self-stabilizing algorithms, certified general statements that can be used to build certified correctness proofs of such algorithms, and case studies that validate them. In PADEC, the semantics of self-stabilizing algorithms’ executions is defined as potentially infinite streams, and properties, such as algorithm specifications, are defined using temporal logic on those streams. Hence, the definitions and proofs presented in PADEC, as well as in this paper, make intensive use of streams and thus of coinductive definitions and proofs.

Overview of the Contributions. The first contribution of this paper consists of new general tools for streams, in particular a squeezing operator. The latter is a productive filter on streams that uses both inductive and coinductive mechanisms. Our second contribution is a case study: we apply the squeezing operator to certify the hierarchical collateral composition of self-stabilizing algorithms. To our knowledge, our proposal is the first work on the certification of a composition operator for self-stabilization.

Detailed Contributions. We develop many tools for streams. Our streams are potentially infinite sequences of at least one element and must be defined over a partial setoid, i.e., over a type endowed with a partial equivalence relation that models equality; this justifies this new development. Apart from the usual tools required by developments on streams, such as temporal logic operators, we also provide tools specific to PADEC. In particular, the squeezing toolbox provides a filter that removes any duplicated value from a given stream, which may contain an infinite suffix of duplicates. We study the conditions under which such a squeezed stream can be computed and provide a function that actually builds it. This filter can be viewed as an extension of a work by Bertot [8]. Indeed, although Bertot’s filter relies on a general predicate (ours simply uses the equality between two consecutive elements), the squeezing operator is designed for more complex streams (that can be finite or infinite) and can remove an infinite suffix. In his paper, Bertot clearly explains the difficulty of formally defining such a filter, since it mixes both coinduction and induction mechanisms. The definition of squeezing is even more difficult, since it requires deciding at each step whether the filtering of new elements should continue or be given up because a constant, potentially infinite, suffix has been reached.

As an application, we use these tools to enrich the PADEC library with a formalization of the hierarchical collateral composition operator \(\oplus \) and a sufficient condition to show its correctness. By correctness, we mean that if \(\mathcal {A}_1 \) and \(\mathcal {A}_2 \) are self-stabilizing w.r.t. their respective specifications, then \(\mathcal {A}_1 \oplus \mathcal {A}_2 \) is also self-stabilizing w.r.t. both specifications. Executions of self-stabilizing algorithms and their compositions are modeled as streams, and the squeezing toolbox has been the cornerstone for overcoming the major difficulties in the correctness proof of the composition operator.

Related Work. Previous work dealing with PADEC [3] only considered terminating algorithms that did not require any scheduling assumption; consequently, their proofs were only induction-based. Here, \(\mathcal {A}_2\) may be a non-terminating algorithm (e.g., a token circulation). Moreover, the sufficient condition to show the correctness of the composition assumes a weakly fair scheduling, which requires a coinductive definition. Coinductive objects and proofs allow reasoning on potentially infinite objects. They are supported by major proof assistants such as Coq [26], Isabelle [32], and Agda [1]. Coinductive constructions are commonly used to represent potentially infinite behaviors of programs and systems (see, for example, [31] for sequential programs and [17] for distributed systems), but also for modeling lazy programs such as the prime number sieve [8].

Proofs in distributed algorithms are commonly written by hand, based on informal reasoning. This potentially leads to errors when arguments are not perfectly clear, as explained by Lamport in his position paper [30]. As a matter of fact, several studies [7, 22] reveal, using formal methods, bugs in the existing literature. Hence, certification of distributed algorithms is a powerful tool to prevent bugs in their proofs, and so to increase confidence in their correctness. Certification of non-fault-tolerant distributed algorithms is addressed in [13, 14, 17], and certification in the context of fault-tolerant, yet non-self-stabilizing, distributed computing is addressed in [5, 28]. Up to now, only a few simple self-stabilizing algorithms have been certified, e.g., [29] (in PVS) and [3, 16] (in Coq). By simple, we mean non-composed algorithms working on particular topologies (i.e., rings, lines, or trees) and/or assuming restrictions on the possible interleavings (e.g., in [29], only sequential executions are considered). Now, progress in self-stabilization has led to considering more and more complex distributed systems running in increasingly adversarial environments. As an illustrative example, the first three algorithms proposed by Dijkstra in 1974 [21] were designed for oriented ring topologies and assumed sequential executions only, while nowadays most self-stabilizing algorithms are designed for fully asynchronous arbitrary connected networks, e.g., [12, 19], and even for networks, such as peer-to-peer systems, where the topology (frequently) varies over time, e.g., [11]. Consequently, the design of self-stabilizing algorithms becomes more and more intricate, and so do their proofs of correctness and complexity. To handle such difficulties, designers must adopt a modular approach, e.g., using composition operators.
Consequently, a necessary preliminary step toward certifying present-day self-stabilizing algorithms is the certification of a composition operator.

Roadmap. Section 2 introduces streams and self-stabilization as defined in PADEC. Section 3 presents the composition. Section 4 details the squeezing toolbox and shows its application into the proof of correctness of the composition.

The composition and stream toolboxes contain about 1500 lines of Coq specifications and 4800 lines of Coq proofs.Footnote 1 This represents about 25% of the whole PADEC library. We encourage the reader to visit the following webpage for a deeper understanding of our work.

            http://www-verimag.imag.fr/~altisen/PADEC/

All documentation and source codes are available at this address.

2 Streams and Self-stabilization in the PADEC Library

PADEC is a Coq library designed to model and prove results on self-stabilizing algorithms. The framework makes intensive use of (partial-)setoids, i.e., types for which the equality is represented by a (partial-)equivalence relation. This choice is justified in [3] and has some consequences on the design of the framework. Nevertheless, we omit here the technical issues due to the use of such (partial-)setoids, since they are beyond the scope of this paper.

We now present self-stabilizing algorithms as they are defined in distributed computing and the PADEC library. Beforehand, we introduce streams as they are used to model executions of self-stabilizing algorithms in PADEC.

2.1 Streams

We implement a stream as a potentially infinite sequence of at least one element. Each element belongs to some given type. A stream is then defined as a value of the following type.

figure f

Remark that such a stream cannot be empty, since each constructor enforces the existence of a first element. Moreover, it may be finite or infinite, since the keyword CoInductive generates the greatest fixed point, capturing potentially infinite constructions.Footnote 2 For instance, a finite stream of naturals is obtained by nesting the second constructor finitely many times and closing with the first one, while the infinite stream of naturals made of an infinite number of 1 is defined as a cofixed point. Therefore, the above definition allows constructing both finite and infinite streams thanks to the two constructors. In contrast, streams from the standard Coq API [35] and those proposed by Bertot [8] are made of only one constructor, which forces streams to be infinite.
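For concreteness, the type can be sketched in Coq as follows; the constructor names s_one and s_cons are ours and may differ from those of the actual PADEC library.

```coq
(* Two-constructor stream type: a sketch with our own constructor names. *)
CoInductive Stream (A : Type) : Type :=
| s_one  : A -> Stream A              (* last element: the stream stops here *)
| s_cons : A -> Stream A -> Stream A. (* an element followed by the rest     *)

Arguments s_one {A} _.
Arguments s_cons {A} _ _.

(* A finite stream 1, 2, 3 of naturals: *)
Definition s123 : Stream nat := s_cons 1 (s_cons 2 (s_one 3)).

(* The infinite stream made of an infinite number of 1, as a cofixed point: *)
CoFixpoint ones : Stream nat := s_cons 1 ones.
```

Both shapes inhabit the same type: s123 uses the first constructor to stop, while ones never does.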

We define the function which returns the first element of a stream, e.g., its application to the infinite stream of 1 returns 1. For any function over elements, we also denote the composition of this function with the first-element function: applied to any stream, it returns the image of the stream’s first element.

We now briefly introduce tools on streams that will be used in the sequel. The following predicates are the usual temporal logic operators [15, 33]. They are defined w.r.t. a given predicate over streams. The first one checks that there is a suffix of the stream in which the predicate is satisfied. The second one checks that the predicate is satisfied in every suffix of the stream.

figure z

Note the difference between the two definitions: the former is defined using the keyword Inductive, since a proof of it, for some stream and predicate, may only contain a finite number of constructor applications. In contrast, the latter uses CoInductive: a proof of it may contain an infinite number of constructor applications and so must be lazily constructed. We defined many other properties and technical tools that ease the use of those predicates (see [2] for details); e.g., we use the first operator to check that a stream is finite:

figure al

where the inner predicate holds if and only if the stream is made of a single element, i.e., is built with the first constructor only.
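These operators and the finiteness predicate can be sketched as follows, over the two-constructor stream type of Sect. 2.1 (recalled here for self-containedness); all names are ours, and we use plain Leibniz equality, whereas PADEC works over partial setoids:

```coq
CoInductive Stream (A : Type) : Type :=
| s_one  : A -> Stream A
| s_cons : A -> Stream A -> Stream A.
Arguments s_one {A} _.  Arguments s_cons {A} _ _.

(* Inductive: a proof contains finitely many ev_later steps. *)
Inductive Eventually {A} (P : Stream A -> Prop) : Stream A -> Prop :=
| ev_here  : forall s, P s -> Eventually P s
| ev_later : forall a s, Eventually P s -> Eventually P (s_cons a s).

(* Coinductive: a proof may be infinite and is lazily constructed. *)
CoInductive Always {A} (P : Stream A -> Prop) : Stream A -> Prop :=
| al_one  : forall a, P (s_one a) -> Always P (s_one a)
| al_cons : forall a s,
    P (s_cons a s) -> Always P s -> Always P (s_cons a s).

(* A stream is finite when some suffix is reduced to a single element. *)
Definition single {A} (s : Stream A) : Prop := exists a, s = s_one a.
Definition finite {A} (s : Stream A) : Prop := Eventually single s.
```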

2.2 Self-stabilization: Model and Semantics

Most self-stabilizing algorithms are designed in the atomic-state model, a computational model introduced by Dijkstra [21], which abstracts away the communications between nodes of the network. The PADEC framework has been developed using this model (see [3]). However, we do not detail the model here, since it is not the heart of the contribution. Instead, we summarize the features that are mandatory to present and understand our contributions.

A distributed algorithm is executed over a network, made of a finite number of nodes (we introduce a Coq type to represent nodes). Each node p is endowed with a local state (of some given type) defined by the values of its local variables. Node p updates its local state by executing its local algorithm in atomic moves, where it first reads its own local state and those of its neighbors, and then only writes its own variables. Notice that some variables owned by p, usually system inputs, should never be written by its local algorithm. Such variables are declared read-only. A node is said to be enabled if its next move would actually modify its local state. Otherwise, the node is said to be disabled.

We call environment the global state of the network. In PADEC, environments are functions from nodes to local states; namely, applying an environment to a node yields the local state of that node in this environment. If no node is enabled in an environment, then the environment is said to be terminal. This property is defined by a dedicated predicate. Each node can locally evaluate whether or not it is enabled. So, since the number of nodes is finite, the property is decidable, i.e., its evaluation is computable:Footnote 3

figure bh

Let the current environment be non-terminal. Then a step of the distributed algorithm is performed as follows: every node that is enabled in the current environment is a candidate to be executed; some candidates (at least one) are nondeterministically activated, meaning that they atomically update their local states using their local algorithms, leading the system to a new environment. This nondeterminism actually materializes the asynchronism of the system.

Notice that two environments linked by a step are necessarily different. This point is fundamental in asynchronous deterministic algorithms: the system progress can only be observed when the environment changes. In PADEC, we use the relation to encode all possible steps.

A maximal run in the network is defined as a stream of environments, where every pair of consecutive environments in the stream is a step and, if the stream is finite, then its last environment is necessarily terminal. This notion is captured by the predicate

figure bp

where the inner predicate checks that the stream matches one of the two following patterns. Either the stream is reduced to a single environment, and this environment is terminal; or the stream starts with an environment followed by the rest of the stream, and there is a step from this environment to the first environment of the rest.
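A possible rendering of this predicate, with the step relation and the terminal predicate as abstract parameters, could look as follows (all names are ours; the stream machinery of Sect. 2.1 is recalled for self-containedness):

```coq
CoInductive Stream (A : Type) : Type :=
| s_one  : A -> Stream A
| s_cons : A -> Stream A -> Stream A.
Arguments s_one {A} _.  Arguments s_cons {A} _ _.

Definition first {A} (s : Stream A) : A :=
  match s with s_one a | s_cons a _ => a end.

CoInductive Always {A} (P : Stream A -> Prop) : Stream A -> Prop :=
| al_one  : forall a, P (s_one a) -> Always P (s_one a)
| al_cons : forall a s,
    P (s_cons a s) -> Always P s -> Always P (s_cons a s).

Section Run.
  Variable Env : Type.
  Variable step : Env -> Env -> Prop.   (* one step of the algorithm *)
  Variable terminal : Env -> Prop.      (* no node is enabled        *)

  (* Local pattern: a lone environment must be terminal; otherwise the
     first two environments must be linked by a step. *)
  Definition run_pattern (s : Stream Env) : Prop :=
    match s with
    | s_one e     => terminal e
    | s_cons e s' => step e (first s')
    end.

  (* A maximal run checks the pattern on every suffix. *)
  Definition is_run (s : Stream Env) : Prop := Always run_pattern s.
End Run.
```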

We model the nondeterminism of the system using an artifact called the daemon. In this paper, we focus on the so-called weakly fair daemon [24]: a maximal run is executed under the weakly fair daemon if and only if every node that is continuously enabled is eventually activated by the daemon. To encode the weakly fair daemon, we define the following predicate:

figure cd

Namely, all along a run, whenever some node is enabled, it is eventually either activated or neutralized: either it is eventually chosen by the daemon to execute in a step, or it eventually becomes disabled due to the moves of some of its neighbors. Note that this definition involves both inductive and coinductive predicates.
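Schematically, taking the temporal operators of Sect. 2.1 and the two node-level predicates as abstract parameters, the weakly fair condition nests the inductive operator inside the coinductive one (all names here are ours):

```coq
Section WeaklyFair.
  Variable Stream : Type.  (* streams of environments, as in Sect. 2.1 *)
  Variable Node : Type.

  (* Temporal operators over those streams: Always is coinductive and
     Eventually inductive, as in Sect. 2.1. *)
  Variables Always Eventually : (Stream -> Prop) -> Stream -> Prop.

  (* p is enabled in the first environment of the stream: *)
  Variable enabled_now : Node -> Stream -> Prop.
  (* p is activated in the first step, or disabled just after it: *)
  Variable act_or_neut : Node -> Stream -> Prop.

  (* All along the run, an enabled node is eventually activated or
     neutralized. *)
  Definition weakly_fair (s : Stream) : Prop :=
    Always (fun s' => forall p,
              enabled_now p s' -> Eventually (act_or_neut p) s') s.
End WeaklyFair.
```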

A self-stabilizing algorithm is designed to fulfill a given specification under some assumptions, often related to the system. In the literature, those assumptions are directly encoded in the configurations using constants whose values achieve some conditions. For example, an identified network is modeled using a constant variable, called the identifier, for each process, assuming that every two distinct processes have different identifiers. Following the literature, we express those assumptions as a predicate on the read-only variables of the nodes. Such assumptions need only be checked on the initial environment of a run: indeed, they are then inherently satisfied all along the run since they only rely on read-only variables.

To sum up, we define an execution of the algorithm to be a stream of environments which is a maximal run satisfying the daemon constraints and where the read-only assumptions are satisfied in its first environment. Hence, executions are encoded by the following predicate.

figure cj

It is important to note that a self-stabilizing algorithm can be initiated from any environment where the read-only assumptions are satisfied. This, in particular, means that every suffix of an execution is also an execution.

The specification of an algorithm is given as a predicate over streams of environments. Then, an algorithm \(\mathcal {A}\) is self-stabilizing w.r.t. a specification \(\mathcal {S}\) under the weakly fair daemon if there exists a set of environments, called legitimate and detected using a dedicated predicate, such that for every execution (implicitly, of \(\mathcal {A}\) under the weakly fair daemon),

  • if its initial environment is legitimate, then each of its environments is legitimate, i.e., (Closure);

  • it converges to a legitimate environment, i.e., (Convergence); and

  • if it is initiated in a legitimate environment, then it satisfies the specification, i.e., (Specification).
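The three properties above can be sketched abstractly as follows, with the temporal operators and the legitimacy predicate taken as parameters (all names are ours):

```coq
Section SelfStab.
  Variable Stream : Type.  (* streams of environments *)
  Variables Always Eventually : (Stream -> Prop) -> Stream -> Prop.
  Variable exec : Stream -> Prop.      (* executions of the algorithm  *)
  Variable Spec : Stream -> Prop.      (* the specification            *)
  Variable in_legit : Stream -> Prop.  (* first environment legitimate *)

  Definition closure : Prop :=
    forall s, exec s -> in_legit s -> Always in_legit s.
  Definition convergence : Prop :=
    forall s, exec s -> Eventually in_legit s.
  Definition specification : Prop :=
    forall s, exec s -> in_legit s -> Spec s.

  (* Self-stabilization w.r.t. Spec for this legitimacy predicate: *)
  Definition self_stabilizing : Prop :=
    closure /\ convergence /\ specification.
End SelfStab.
```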

In this paper, we also consider the class of silent self-stabilizing algorithms. In the atomic-state model, an algorithm \(\mathcal {A}\) is silent if all its executions are finite. A silent algorithm is designed to converge to terminal environments satisfying some properties. So, the specification of such an algorithm is rather formulated as a predicate over environments, henceforth called an environment specification.

3 Composition

The hierarchical collateral composition has been introduced in [20] together with a simple sufficient condition to show its correctness. We now describe the operator, its modeling, and the certification of the sufficient condition in PADEC. Beyond the higher confidence in the accuracy of the result, certification, by enforcing proofs to be more rigorous, leads to a deeper understanding of the result.

The goal of the hierarchical collateral composition operator is to mimic the sequential composition “\(\mathcal {A}_1; \mathcal {A}_2 \)”. \(\mathcal {A}_1\) and \(\mathcal {A}_2\) run concurrently, modulo some priorities (see the details below), and collaborate using common variables. The goal of \(\mathcal {A}_1\) is to self-stabilizingly output correct inputs to \(\mathcal {A}_2\). \(\mathcal {A}_2\) is self-stabilizing provided that its inputs, in particular those computed by \(\mathcal {A}_1\), are correct. Hence, the actual convergence of \(\mathcal {A}_2\) is ensured only after \(\mathcal {A}_1\) has stabilized. For example, the clustering algorithm for general networks given in [18] is a hierarchical collateral composition \(\mathcal {A}_1 \oplus \mathcal {A}_2 \), where \(\mathcal {A}_1\) is a spanning tree construction and \(\mathcal {A}_2\) a clustering algorithm dedicated to tree topologies.

\(\mathcal {A}_1\) should converge so that its output variables permanently fulfill the input assumptions of \(\mathcal {A}_2\) to ensure that \(\mathcal {A}_2\) stabilizes in nominal conditions. To that goal, we assume that \(\mathcal {A}_1\) is silent, e.g., in [18], once the spanning tree construction has stabilized, all its variables, in particular those defining the tree, are constant.

For each node, the local variables of the composite algorithm \(\mathcal {A}_1 \oplus \mathcal {A}_2 \) are made of variables specific to \(\mathcal {A}_1\) and \(\mathcal {A}_2\) respectively, but also of variables common to \(\mathcal {A}_1\) and \(\mathcal {A}_2\). Those variables store, in particular, the output of \(\mathcal {A}_1\) used as input by \(\mathcal {A}_2\). They should be read-only in \(\mathcal {A}_2\) since \(\mathcal {A}_2\) should not prevent \(\mathcal {A}_1\) from stabilizing by overwriting these variables.

In the previous collateral composition [27] of Gouda and Herman, the choice for an activated node to execute either \(\mathcal {A}_1\) or \(\mathcal {A}_2\), when both are enabled, was nondeterministic. In contrast, in the hierarchical collateral composition, the composite algorithm gives priority to \(\mathcal {A}_1\) over \(\mathcal {A}_2\) locally at each node. Let p be a node enabled w.r.t. \(\mathcal {A}_1 \oplus \mathcal {A}_2 \) in some environment and assume that p is activated by the daemon in the next step.

  • If \(\mathcal {A}_1\) is enabled at p (n.b., \(\mathcal {A}_2\) may be enabled at p too), then p makes a move of \(\mathcal {A}_1\) only.

  • Otherwise, p is disabled w.r.t. \(\mathcal {A}_1\), but enabled w.r.t. \(\mathcal {A}_2\), and so makes a move of \(\mathcal {A}_2\) (only).

Hence, when p moves in \(\mathcal {A}_1 \oplus \mathcal {A}_2 \), it either executes \(\mathcal {A}_1\) or \(\mathcal {A}_2\), but not both. We should underline that this priority mechanism is only local: globally, a step of \(\mathcal {A}_1 \oplus \mathcal {A}_2 \) may contain moves of \(\mathcal {A}_1\) only, moves of \(\mathcal {A}_2\) only, but also a mix of them, yet executed at different nodes.

3.1 The Composite Algorithm in Coq

We model the composite algorithm \(\mathcal {A}_1 \oplus \mathcal {A}_2 \) in Coq as follows. We define the local states of \(\mathcal {A}_1 \oplus \mathcal {A}_2 \) assuming that the local states of \(\mathcal {A}_1\) can be handled using the following getter and setter:

  • the getter is a projection that extracts the \(\mathcal {A}_1\)-part from a composite state,

  • the setter modifies the \(\mathcal {A}_1\)-part of a composite state.

The getter and setter for the local states of \(\mathcal {A}_2\) are defined similarly.Footnote 4 Those functions satisfy the properties given by the commutative diagram of Fig. 1. For example, to update the \(\mathcal {A}_1\)-part of a composite local state with a new value, we apply the corresponding setter: this produces a new local state whose \(\mathcal {A}_1\)-part is the new value. Additionally, we encode the fact that any writing in the \(\mathcal {A}_2\)-part (by \(\mathcal {A}_2\)) that respects the read-only condition actually does not modify the \(\mathcal {A}_1\)-part of a composite state. Indeed, the common part between \(\mathcal {A}_1\) and \(\mathcal {A}_2\) is necessarily read-only for \(\mathcal {A}_2\) (see x in Fig. 2).
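A toy concrete instance of these getters and setters, with the composite state represented as a triple (all names are ours); the key fact that a write by \(\mathcal {A}_2\) leaves the \(\mathcal {A}_1\)-projection untouched then holds by computation:

```coq
Section CompositeState.
  Variables S1 S2 Common : Type.

  (* A1-specific part, A2-specific part, and the part shared by both;
     the shared part is written by A1 and read-only for A2. *)
  Definition State : Type := S1 * S2 * Common.

  Definition get1 (s : State) : S1 * Common :=
    let '(x1, _, c) := s in (x1, c).
  Definition set1 (v : S1 * Common) (s : State) : State :=
    let '(_, x2, _) := s in let '(x1, c) := v in (x1, x2, c).

  (* A2 may rewrite only its private part, respecting the read-only
     discipline on the shared part: *)
  Definition set2 (v : S2) (s : State) : State :=
    let '(x1, _, c) := s in (x1, v, c).

  (* Writing the A2-part does not modify the A1-projection. *)
  Lemma get1_set2 : forall v s, get1 (set2 v s) = get1 s.
  Proof. intros v [[x1 x2] c]. reflexivity. Qed.
End CompositeState.
```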

Fig. 1.
figure 1

Commutative diagram for the getter and setter. \(\pi _1\) gives access to the first element of the pair.

Fig. 2.
figure 2

\(\mathcal {A}_2\) cannot modify the \(\mathcal {A}_1\)-part. x is the common part between \(\mathcal {A}_1\) and \(\mathcal {A}_2\), read-only for \(\mathcal {A}_2\).

We generalize the projections to environments and streams. The projection on \(\mathcal {A}_1\) of an environment of \(\mathcal {A}_1 \oplus \mathcal {A}_2 \) is an environment for \(\mathcal {A}_1\): for each node, it applies the local-state projection on \(\mathcal {A}_1\) to the node’s composite local state. The projection on \(\mathcal {A}_1\) of a stream of \(\mathcal {A}_1 \oplus \mathcal {A}_2 \) is obtained using a cofixed point that applies the environment projection to every element of the stream (i.e., a map on the stream). In particular, projecting the first environment of a stream and taking the first environment of the projected stream yield one and the same environment. The projections on \(\mathcal {A}_2\) are defined similarly.

3.2 Correctness of the Composition

The composition operator is proven correct under the following hypotheses:

  • \(\mathcal {H}_1:\) The daemon is weakly fair.

  • \(\mathcal {H}_2:\) \(\mathcal {A}_1\) is silent and self-stabilizing w.r.t. the environment specification \(\mathcal {S}^g_1\): given the read-only assumption , each of its executions is finite and terminates in an environment satisfying \(\mathcal {S}^g_1\).

  • \(\mathcal {H}_3:\) \(\mathcal {A}_2\) is self-stabilizing w.r.t. specification \(\mathcal {S}_2\): given the read-only assumption , each of its executions eventually reaches a legitimate environment (predicate ) from which \(\mathcal {S}_2\) is satisfied.

  • \(\mathcal {H}_4:\) The read-only assumption of \(\mathcal {A}_1 \oplus \mathcal {A}_2 \) is on \(\mathcal {A}_1\)-projections.

  • \(\mathcal {H}_5:\) \(\mathcal {S}^g_1\) implies the read-only assumption of \(\mathcal {A}_2\).

Under those hypotheses, we have proven the theorem below for the specification of the composite algorithm.

figure et

The above theorem states that \(\mathcal {A}_1 \oplus \mathcal {A}_2 \) eventually reaches an environment from which \(\mathcal {S}_2\) holds and \(\mathcal {S}^g_1\) is satisfied in all environments.

We now outline the proof of the theorem. We first have to exhibit a predicate that defines the legitimate environments of \(\mathcal {A}_1 \oplus \mathcal {A}_2 \). This predicate holds in each environment that is terminal for \(\mathcal {A}_1\) and legitimate for \(\mathcal {A}_2\):

figure ev

Then, we prove the following intermediate result:

figure ew

Namely, any execution of \(\mathcal {A}_1 \oplus \mathcal {A}_2 \) that starts in an \(\mathcal {A}_1\)-terminal environment remains in \(\mathcal {A}_1\)-terminal environments, and its \(\mathcal {A}_2\)-projection is actually an execution of \(\mathcal {A}_2\). First, from an environment which is terminal for \(\mathcal {A}_1\), there is no way to update the variables of \(\mathcal {A}_1\). So, the execution remains in environments that are \(\mathcal {A}_1\)-terminal. This claim also implies that each of its steps is actually a step of \(\mathcal {A}_2\) and, consequently, its projection is a maximal run of \(\mathcal {A}_2\) satisfying the weakly fair condition. Finally, the projection satisfies the read-only assumption of \(\mathcal {A}_2\). Indeed, since \(\mathcal {A}_1\) is silent and self-stabilizing, if \(\mathcal {A}_1\) starts in a terminal environment, then this environment satisfies \(\mathcal {S}^g_1\). Thus, we can use hypothesis \(\mathcal {H}_{5}\) on the first environment. Hence, we can conclude that the projection is an execution of \(\mathcal {A}_2\).

In the rest of the explanation, we consider an arbitrary execution of \(\mathcal {A}_1 \oplus \mathcal {A}_2 \).

Closure. To show the closure, we have to prove that if the execution starts in a legitimate environment of \(\mathcal {A}_1 \oplus \mathcal {A}_2 \), it always remains in such environments. This is straightforward using the lemma above. Indeed, first, the execution remains in environments that are \(\mathcal {A}_1\)-terminal. Second, as its \(\mathcal {A}_2\)-projection is an execution of \(\mathcal {A}_2\), we can use the closure property of \(\mathcal {A}_2\) (since \(\mathcal {A}_2\) is self-stabilizing) and prove that legitimate environments for \(\mathcal {A}_2\) are maintained forever.

Specification. We have to prove that if the execution is initiated in a legitimate environment, then the specification holds. We use the lemma above again. First, every environment of the execution is \(\mathcal {A}_1\)-terminal, and so satisfies \(\mathcal {S}^g_1\). Second, its \(\mathcal {A}_2\)-projection is an execution of \(\mathcal {A}_2\), on which we can apply the specification property of \(\mathcal {A}_2\) (since \(\mathcal {A}_2\) is self-stabilizing); hence it satisfies \(\mathcal {S}_2\).

Convergence. We must prove that the execution eventually reaches an environment that is legitimate for \(\mathcal {A}_1 \oplus \mathcal {A}_2 \). This goal is split into three subgoals:

figure fu

i.e., the execution eventually reaches an environment which is terminal for \(\mathcal {A}_1\). This part of the proof is postponed to Sect. 4. This claim ensures that the execution contains a suffix \(\sigma \) that starts in a terminal environment for \(\mathcal {A}_1\). The second subgoal is then:

figure fz

i.e., \(\sigma \) remains \(\mathcal {A}_1\)-terminal. As any suffix of an execution is also an execution, so is \(\sigma \). Hence, this claim is immediate from the lemma above. The third subgoal is:

figure gc

After the execution has reached an environment that is terminal for \(\mathcal {A}_1\), it eventually reaches an environment that is legitimate for \(\mathcal {A}_2\). Indeed, its suffix \(\sigma \) eventually reaches such an environment: to prove this, we use the convergence of \(\mathcal {A}_2\) (as \(\mathcal {A}_2\) is self-stabilizing) since, by the lemma above, the \(\mathcal {A}_2\)-projection of \(\sigma \) is an execution of \(\mathcal {A}_2\). Now, as \(\sigma \) eventually reaches a legitimate environment, so does the whole execution.

4 Squeezing Streams and Convergence of Composition

The main part of the proof consists of proving that any execution of the composite algorithm eventually reaches an environment which is terminal for \(\mathcal {A}_1\) (the first claim above). This requires using the assumption that \(\mathcal {A}_1\) is silent, i.e., Hypothesis \(\mathcal {H}_2\). To that goal, we consider an execution of \(\mathcal {A}_1 \oplus \mathcal {A}_2 \) and focus on its projection on \(\mathcal {A}_1\). Now, this latter stream is usually not an execution of \(\mathcal {A}_1\), while \(\mathcal {H}_2\) only applies to executions of \(\mathcal {A}_1\). Actually, each step of the execution matches one of the two following cases: either at least one node executes \(\mathcal {A}_1\) in the step, or all activated nodes only execute \(\mathcal {A}_2\). In the former case, the projection of the step on \(\mathcal {A}_1\) is a step of \(\mathcal {A}_1\). In the latter case, the projection gives two identical environments of \(\mathcal {A}_1\). Hence, the projected stream is made of steps of \(\mathcal {A}_1\), separated by duplicates. So, to apply \(\mathcal {H}_2\), it is mandatory to construct an execution of \(\mathcal {A}_1\) by computing the squeezing of the projected stream, i.e., the stream obtained by removing all its duplicates.

In Subsect. 4.1, we describe how to compute the squeezing of a general stream. Again, it is a filter in the sense of Bertot [8], since it removes elements from the stream. Yet, its filtering predicate is specific, as it forbids any two consecutive elements from being equal. In contrast with Bertot [8], the squeezing applies to streams that can be finite or infinite and can remove an infinite suffix of duplicates. Therefore, we have an additional issue: the squeezing needs to decide at each step whether to continue or give up because a constant, potentially infinite, suffix has been reached.

The object resulting from squeezing – the so-called squeezed stream – is complex, since it is defined as an explicit construction. Consequently, dealing with it directly in proofs requires heavy Coq developments. To avoid such implementation details, we rather work with an abstract relation on streams, called simulation, which encompasses the useful properties of the squeezed stream.

4.1 Squeezing

We now explain how to build the squeezing of an arbitrary stream. The squeezed version of a stream contains exactly the same elements as the original, in the same order, yet without any duplicate.

For example, consider a stream of increasing naturals that ends with an infinite suffix repeating its last value: the squeezing of this stream is the finite sequence of the increasing values. Every element of the original stream is still present in the squeezing, following the same increasing order, yet every duplicate has been removed, including the infinite constant suffix. Note that a squeezing may be infinite, e.g., if a stream is the infinite repetition of a pattern made of two distinct values, then its squeezing is the stream itself.
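Restricted to finite streams (lists), squeezing is a plain structural recursion; the following sketch (ours, over nat for simplicity) conveys the idea before tackling the coinductive case:

```coq
Require Import List Arith.
Import ListNotations.

(* Remove consecutive duplicates from a finite list. *)
Fixpoint squeeze_list (l : list nat) : list nat :=
  match l with
  | [] => []
  | a :: l' =>
    match squeeze_list l' with
    | [] => [a]
    | b :: r => if Nat.eqb a b then b :: r else a :: b :: r
    end
  end.

(* Consecutive duplicates disappear; order is preserved. *)
Example squeeze_ex : squeeze_list [1;1;2;3;3;3;4] = [1;2;3;4].
Proof. reflexivity. Qed.
```

On infinite streams, such a direct recursion is unavailable, hence the machinery of the next paragraphs.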

We want to build the squeezed stream, i.e., to define a function which computes the squeezed version of an input stream. This computation will be carried out by a coinductive function. To compute the squeezing of a stream starting with some element, it is necessary to test whether all the elements in the stream are identical to this first element. If so, the result will be the single-element stream made of it; otherwise, it will be this element followed by the squeezing of the maximal suffix of the stream starting with an element distinct from it. Now, a stream may be infinite, and so the aforementioned test is undecidable in general. Thus, we need extra information in order to make the decision and, since we base the result of the squeezing algorithm on this decision, it has to be constructive. In Coq, constructive objects are in sort Type and carry computational content.Footnote 5 We assume that such a decision value is available, where:

figure ht

Any value of this type carries either a proof that the stream is constant, or a proof that a different element is eventually reached.

For an element and a stream, the first alternative means that the stream contains nothing but this element, albeit finite or not.Footnote 6 Then, the second alternative means that the stream begins with a finite number of copies of the element, followed by some element different from it. To be able to compute the squeezing, this alternative should also provide a way to compute the suffix of the stream where all the leading instances of the element have been removed. Actually, we implement it using the inductive proposition Acc, taken from the standard library, which provides tools on well-founded inductions: the underlying relation relates two streams when either both are reduced to the single element, or one is obtained from the other by prepending the element; Acc then means that any descending chain from a given stream, using this relation, is finite. Using a well-founded induction on such an accessibility proof, we are able to define a recursive function with dependent arguments that computes the maximal suffix of the stream starting with an element distinct from the given one. Thus, whenever we obtain such a proof, we can compute the squeezing.

However, to be able to perform the corecursive call, we need to exhibit a new decision value for the remaining suffix. This means that we need an algorithm that computes, repeatedly and lazily along the stream, such a decision value for \(\sigma \), where \(\sigma \) is any suffix of the input stream. This is performed by the constructive counterpart, in sort \(Set\), of the squeezability predicate. So, we obtain the following definition:

figure jq

The construction of the squeezing function can now be completed as a cofixed point with dependent arguments (n.b., we omit parameters when they are clear from the context).
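The shape of this cofixed point can be mimicked in Python (an illustrative sketch with hypothetical names, not the Coq code): the decision is modeled by a predicate `is_constant`, and the inner scan for the next distinct element terminates precisely under the precondition that the accessibility proof guarantees in Coq.

```python
from itertools import chain, repeat

def squeeze(stream, is_constant):
    """Corecursive-style squeezing driven by a constructive decision.

    `is_constant(e)` plays the role of the decision value: it answers
    whether the remaining stream is constantly `e`.  When it answers
    False, the inner scan terminates by assumption (the role played by
    the accessibility proof / well-founded induction in Coq).
    """
    it = iter(stream)
    e = next(it)
    while True:
        yield e
        if is_constant(e):          # "constant" case: stop producing
            return
        x = next(it)                # "changes" case: skip duplicates of e
        while x == e:
            x = next(it)
        e = x

# A stream ending with an infinite suffix of 4s; the (hypothetical)
# decision simply recognizes 4 as the constant tail.
s = chain([1, 1, 2, 3, 3], repeat(4))
print(list(squeeze(s, lambda e: e == 4)))  # -> [1, 2, 3, 4]
```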

As a direct consequence of the definition, we can show that a squeezed stream contains no duplicate:

figure jv

In the lemma, the predicate checks that a stream differs on its first two elements, when they exist. The lemma is proven using a coinductive proof that follows the definition of the squeezing function. It essentially relies on the fact that for every element \(e\) and stream \(\sigma \) on which the suffix computation can be evaluated (i.e., \(\sigma \) begins with a finite number of copies of \(e\)), the first element of the result is different from \(e\).
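On finite prefixes, the lemma can be phrased and checked as follows (a finite analogue with names of our choosing, not taken from the Coq development):

```python
def squeeze_list(xs):
    """Finite analogue of squeezing: remove consecutive duplicates."""
    out = []
    for x in xs:
        if not out or out[-1] != x:
            out.append(x)
    return out

def differs_on_first_two(xs):
    """The lemma's predicate: the first two elements, when they exist, differ."""
    return len(xs) < 2 or xs[0] != xs[1]

def no_duplicate(xs):
    """The lemma's conclusion on a finite prefix: every suffix differs
    on its first two elements, i.e. no two adjacent elements are equal."""
    return all(differs_on_first_two(xs[i:]) for i in range(len(xs)))

print(no_duplicate(squeeze_list([1, 1, 2, 2, 2, 3, 1, 1])))  # -> True
```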

4.2 Preserving Properties by Simulation

We now define the simulation relation. As usual, our simulation defines an abstract view, here adapted to our context. Given two streams \(\sigma _1\) and \(\sigma _2\), \(\sigma _1 \le _{sim} \sigma _2\) means that \(\sigma _1\) is obtained from \(\sigma _2\) by removing some of its duplicates, namely \(\sigma _1\) and \(\sigma _2\) contain exactly the same elements, in the same order, yet each element is duplicated at most as much in \(\sigma _1\) as in \(\sigma _2\). For instance, take a stream \(\sigma _2\) over the values 1 to 8 that ends with an infinite sequence of 8s, and a stream \(\sigma _1\) obtained from it by removing duplicates: we have \(\sigma _1 \le _{sim} \sigma _2\) since every value, from 1 to 8, appears in both streams and the number of occurrences of 1 (resp. 2, 3, 4, 5, 6, 7, 8) is smaller or equal in \(\sigma _1\). The relation \(\le _{sim}\) mixes inductive and coinductive mechanisms, and is defined as follows:

figure kq

The first constructor means that the stream made of the single element \(e\) is smaller than any stream constantly made of \(e\), be it finite or infinite. The second constructor means that, given an element \(e\) and two streams \(\sigma _1\) and \(\sigma _2\) such that \(\sigma _1\) is smaller than \(\sigma _2\), if a stream \(\sigma \) is obtained from \(\sigma _2\) by adding a positive number of copies of \(e\) in front, then \(e \cdot \sigma _1\) is smaller than \(\sigma \). The auxiliary predicate is inductively defined and checks that either the two streams are equal, or one is obtained from the other by prepending copies of \(e\).
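On finite prefixes, \(\le _{sim}\) can be checked via run-length encodings (an illustrative sketch under assumed names; the Coq relation is of course defined coinductively, not by this finite computation). The concrete sequences below follow the 1-to-8 example discussed above:

```python
from itertools import groupby

def runs(xs):
    """Run-length encoding: [(element, multiplicity), ...]."""
    return [(k, len(list(g))) for k, g in groupby(xs)]

def le_sim(t, s):
    """Finite-prefix analogue of <=_sim: s is obtained from t by duplicating
    elements, i.e. same elements in the same order, each at least as often."""
    rt, rs = runs(t), runs(s)
    return (len(rt) == len(rs)
            and all(a == b and m <= n for (a, m), (b, n) in zip(rt, rs)))

t = [1, 2, 3, 4, 5, 6, 7, 8]                       # duplicate-free prefix
s = [1, 1, 2, 3, 3, 3, 4, 5, 6, 6, 7, 8, 8]       # same values, duplicated
print(le_sim(t, s))  # -> True
```

Note that the duplicate-free `t` is minimal here: no further stream is strictly below it, matching the minimality of squeezing w.r.t. \(\le _{sim}\).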

We can show that \(\le _{sim}\) is a partial order, and as squeezing means removing all duplicates of a stream, we can prove that for a given squeezable stream, its squeezing is minimal w.r.t. \(\le _{sim}\) (see [2] for details):

figure ll

We show that some properties can be transferred between \(\le _{sim}\)-related streams. Precisely, a property is defined to be (decreasing) monotonic (resp. comonotonic) w.r.t. \(\le _{sim}\) as follows:

figure ln

The proof of Claim requires proving the preservation of the following properties. First, we prove a result related to implication: for two predicates \(P\) and \(Q\) such that \(P\) is comonotonic and \(Q\) is monotonic, we easily obtain that \(P \Rightarrow Q\) is monotonic. For some property \(P\) which is monotonic (resp. comonotonic), “eventually \(P\)” is monotonic (resp. comonotonic): indeed, if \(P\) is reached by a given stream, then it is also reached by any stream that contains fewer (resp. more) duplicates. Similarly, for some monotonic property \(P\), “always \(P\)” is monotonic. Some other ad hoc properties are proven (co)monotonic, when necessary, using straightforward coinductions.
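The intuition behind these transfers can be checked on finite prefixes (a toy illustration with sequences of our choosing): removing duplicates changes neither which elements occur nor their order, so “eventually” and “always” style properties survive the move between \(\le _{sim}\)-related streams.

```python
def eventually(P, xs):
    """Finite-prefix reading of the temporal property 'eventually P'."""
    return any(P(x) for x in xs)

def always(P, xs):
    """Finite-prefix reading of the temporal property 'always P'."""
    return all(P(x) for x in xs)

s = [1, 1, 2, 2, 2, 3]   # more duplicates
t = [1, 2, 3]            # t <=_sim s: same elements, fewer duplicates

# 'eventually P' transfers from s to t (monotonicity) ...
assert eventually(lambda x: x == 3, s) and eventually(lambda x: x == 3, t)
# ... and so does 'always Q' for a monotonic Q.
assert always(lambda x: x >= 1, s) and always(lambda x: x >= 1, t)
print("transfer holds on the example")
```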

4.3 Proof of Claim

The core of the proof consists in applying the squeezing function to the considered stream of environments and showing that the result is actually an execution of \(\mathcal {A}_1\).

To allow the use of the squeezing function, we need to show that the stream is squeezable, meaning that from any environment of the stream, we can decide whether the remaining sequence of environments is constant. This proof uses the fact that the execution is weakly fair and that the terminal-environment predicate is decidable. First, if the initial environment \(e\) is terminal for \(\mathcal {A}_1\), then it remains so forever, and the stream is a constant sequence made of the environment \(e\) only. Second, we show that if the initial environment \(e\) is not terminal for \(\mathcal {A}_1\), then necessarily the stream begins with a finite number of duplicates of \(e\). Indeed, as \(e\) is not terminal for \(\mathcal {A}_1\), there exists a node which is enabled to execute its local algorithm \(\mathcal {A}_1\) in \(e\). It remains continuously enabled until being activated or neutralized, meaning that the node or one of its neighbors makes a move of \(\mathcal {A}_1\). This activation or neutralization eventually occurs due to the weak fairness assumption and the fact that \(\mathcal {A}_1\) has priority over \(\mathcal {A}_2\). Following this remark, the proof is done by induction on the weak fairness assumption. Third, whether or not the initial environment is terminal for \(\mathcal {A}_1\) is decidable (Lemma ). Hence, the proof that the stream is squeezable is performed coinductively, and each step of the coinduction decides whether the current environment is terminal for \(\mathcal {A}_1\).
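One step of this decision procedure can be sketched as follows (hypothetical names; environments are modeled as plain values and `is_terminal` stands for the decidable terminal predicate):

```python
from itertools import repeat

def certify_step(env, rest, is_terminal):
    """One coinductive step of the squeezability certificate.

    If `env` is terminal it is repeated forever: the remaining stream is
    constant.  Otherwise some node is enabled, and weak fairness guarantees
    that the stream changes after finitely many duplicates, so the scan
    below terminates (in Coq: an induction on the fairness assumption).
    """
    if is_terminal(env):
        return ("constant", env)
    skipped = 0
    for x in rest:
        if x != env:
            return ("changes_after", skipped, x)
        skipped += 1

# Toy execution: environments as integers, 9 being the terminal one.
is_terminal = lambda e: e == 9
print(certify_step(5, iter([5, 7]), is_terminal))  # -> ('changes_after', 1, 7)
print(certify_step(9, repeat(9), is_terminal))     # -> ('constant', 9)
```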

So, we can build the squeezed stream and show that it is an execution of \(\mathcal {A}_1\).

(a) The squeezed stream is properly initiated, since a stream and its squeezing have the same initial environment.

(b) The squeezed stream is weakly fair. We know that the squeezing is \(\le _{sim}\)-smaller than the original stream. We then show that weak fairness is preserved, by induction and coinduction on the definition of \(\le _{sim}\): we can prove that weak fairness is monotonic directly using the preservation properties. So, we conclude that the squeezed stream is weakly fair.

(c) The squeezed stream is a maximal run. We first prove the following intermediate claim:

figure nd

The proof is split into two subgoals. Let \(\sigma \) be any stream. Subgoal 1 states that the expected property holds when \(\sigma \) is made of at least two elements; Subgoal 2 states that it holds when \(\sigma \) is made of a single element.

Subgoal 1 follows from the fact that the involved property is monotonic. The latter can be shown by a direct coinduction, which mainly relies on the fact that the predicate applies to the first two elements of the stream and can hold only when they are different.

For Subgoal 2, we use the assumption to show the following property:

figure nv

namely, if the stream is constantly made of an environment \(e\), then \(e\) is terminal. Actually, we proceed by contradiction and prove that if \(e\) is not terminal, then there exists a node which is enabled in \(e\). Therefore, due to the weak fairness assumption, this node will eventually be activated or neutralized in \(\mathcal {A}_1 \oplus \mathcal {A}_2 \), hence in \(\mathcal {A}_1\), since \(\mathcal {A}_1\) has priority over \(\mathcal {A}_2\) in \(\mathcal {A}_1 \oplus \mathcal {A}_2 \). Then, by induction on this fairness assumption, we obtain that the stream cannot be constantly made of \(e\). Subgoal 2 is then obtained with a direct coinductive proof using Claim .

This concludes the proof of Claim, which can be applied to the squeezed stream since, again, the required properties hold on it (see Lemma ). Hence, the squeezed stream is a maximal run. Actually, the squeezing and the stream characterized in the claim represent one and the same stream, but stating the claim abstractly has allowed us to get rid of the (complex) construction of the squeezing in the proof.

As a conclusion, we deduce (by (a), (b), and (c)) that the squeezing of the stream is an execution of \(\mathcal {A}_1\) and use it in \(\mathcal {H}_2\) (\(\mathcal {A}_1\) is silent). Hence, the squeezing is finite, i.e., it eventually reaches a terminal environment. As this property is comonotonic, the original stream eventually reaches a terminal environment too, hence Claim holds. This concludes the proof of convergence.

5 Conclusion

The composition theorem proves that hierarchical collateral composition preserves self-stabilization when applied under convenient assumptions, in particular assuming weakly fair executions. It comes with a toolbox for squeezing streams that was mandatory to achieve the proof of the theorem. As an example, we instantiated the theorem with the first two layers of the algorithm proposed in [20]. The first layer builds a rooted spanning tree over an identified connected network; the second layer assumes such a tree exists and uses it to compute a k-dominating set of the network (\(k\in \mathbb {N}\)). Both algorithms are self-stabilizing under a weakly fair daemon, and our result certifies that their composition is also self-stabilizing and thus builds a k-dominating set of an arbitrary connected identified network.

Composition techniques, in particular the hierarchical collateral composition, are widely used in the self-stabilizing area [18, 23, 27] because a modular approach is unavoidable when designing and proving complex present-day algorithms. Certifying such techniques is a step beyond traditional handmade proofs that offers far greater confidence in the correctness of the results; it is also a step towards the certification of complex multi-layered algorithms.