figure a

1 Introduction

Timed automata [1] are a widely used formalism for modeling real-time systems, which is employed in a class of successful model checkers such as Uppaal [2]. These tools can be understood as trust-multipliers: we trust their correctness to deduce trust in the safety of systems checked by these tools. However, mistakes have previously been made. This particularly concerns an approximation operation that is used by model-checking algorithms to obtain a finite search space. The use of this operation induced a soundness problem in the tools employing it [3], which was only discovered years after the first model checkers were devised.

Our ongoing workFootnote 1 addresses this issue by constructing a fully verified model checker for timed automata, using Isabelle/HOL [4]. Our tool is not intended to replace existing model checkers, but to serve as a reference implementation against which other implementations can be validated. Thus, it must provide sufficient performance to check real world examples. To this end, we use the Isabelle Refinement Framework (IRF) [5, 6] to obtain efficient imperative implementations of the algorithms required for model checking.

Our work starts from an existing abstract formalization of reachability checking of timed automata [7]. To close the gap to a practical model checker, we need to address two types of issues: efficient implementation of abstract model checking algorithms, and expressiveness of the offered modeling formalism. Two kinds of algorithms deserve special attention here. The first are operations to manipulate Difference Bound Matrices (DBMs) [2], which represent abstract states. With the help of the IRF, we obtain efficient implementations of DBMs represented as arrays. The second are search algorithms that govern the search for reachable states. These algorithms are interesting in their own right, since they make use of subsumption: during the search process an abstract state can be ignored if a larger abstract state was already explored. We provide a generalized framework for different variants of search algorithms, including a version which resembles Uppaal’s unified passed and waiting list [2].

We aim to offer a modeling formalism that is comparable in its expressiveness to the one of Uppaal. To accomplish this goal while keeping the formalization effort manageable, we opt to accept Uppaal bytecode as input. At the current state of the project we have formalized the semantics of a subset of the bytecode produced by Uppaal. We support the essential modeling features: networks of automata with synchronization, and bounded integer state variables. We apply a product construction to reduce models of this formalism to a single timed automaton. As in real model checkers, the whole construction is computed on the fly. However, not every bytecode input designates a valid automaton. To this end, we employ a simple program analysis to accept a sufficiently large subset of the valid inputs.

We conducted experiments on a small number of established benchmark models. The throughput of our model checker — the number of explored states per time unit — is within an order of magnitude of a version of Uppaal running a comparable algorithm.

1.1 Isabelle/HOL

Isabelle/HOL [4] is an interactive theorem prover based on Higher-Order Logic (HOL). You can think of HOL as a combination of a functional programming language with logic. Although Isabelle/HOL largely follows ordinary mathematical notation, there are some operators and conventions that should be explained. Like in functional programming, functions are mostly curried, i.e. of type \(\tau _1 \Rightarrow \tau _2 \Rightarrow \tau \) instead of \(\tau _1 \times \tau _2 \Rightarrow \tau \). This means that function application is usually written \(f\ a\ b\) instead of f(ab). Lambda terms are written in the standard syntax \(\lambda x.\ t\) (the function that maps x to t) but can also have multiple arguments \(\lambda x\ y.\ t\), paired arguments \(\lambda (x,y).\ t\), or dummy arguments \(\lambda \_.\ t\). Type variables are written \('a\), \('b\), etc. Compound types are written in postfix syntax: \(\tau \ set\) is the type of sets of elements of type \(\tau \). In some places in the paper we have simplified formulas or code marginally to avoid distraction by syntactic or technical details, but in general we have stayed faithful to the sources.

1.2 Related Work

The basis of the work presented in this paper is our existing formalization of timed automata [7]. We are aware of one previous proof-assistant formalization of timed automata using PVS [8, 9]. This work has the basic decidability result using regions and claims to make some attempt to extend the formalization towards DBMs. Another line of work [10, 11] aims at modeling the class of p-automata [12] in Coq and proving properties of concrete p-automata within Coq. A similar approach was pursued with the help of Isabelle/HOL in the CClair project [13]. In contrast, our formalization [7] focuses on the foundations of timed automata model checking. In particular, it encompasses a formalization of the relevant DBM algorithms and the rather intricate developments towards the correctness proof for the approximation operation.

We are not aware of any previous formalizations or verified implementations of timed automata model checking. The first verification of a model checker we are aware of is by Sprenger for the modal \(\mu \)-calculus in Coq [14]. Our important forerunner, however, is the CAVA project [15,16,17] by Esparza et al. It sets out for similar goals as we do but for finite state LTL model checking. A significant part of the refinement technology that we make use of was developed for this project, and it was the first project to demonstrate that verification of model checking can yield practical implementations. Compared to CAVA, our work offers several novelties: we target model checking of timed automata, which have an infinite state space; we use imperative data structures, which is crucial for efficient DBMs; finally, we implemented complex search algorithms with subsumption. Additionally, we operate on automata annotated with Uppaal bytecode, which has interesting ramifications: for the product construction, and because we need to ensure that the input actually defines a timed automaton.

2 Timed Automata and Model Checking

2.1 Transition Systems

We take a very simple view of transition systems: they are simply a relation \(\rightarrow \) of type \('a \Rightarrow {'a} \Rightarrow bool\). We model (finite) runs as inductive lists, and infinite runs as coinductive streams. We write \(a \rightarrow xs \rightarrow b\) to denote the \(\rightarrow \)-run from a to b using the intermediate states in the list xs, and \(a \rightarrow ^{ys}\) to denote the infinite \(\rightarrow \)-run starting in a and then continuing with states from the stream ys. Additionally, we define:

$$ a \rightarrow ^+ b = (\exists xs.\ a \rightarrow xs \rightarrow b)\ \text {and}\ a \rightarrow ^* b = (a \rightarrow ^+ b \vee a = b).$$

We define the five CTL properties that are supported by Uppaal, \(\mathbf {A}\lozenge \), \(\mathbf {A}\square \), \(\mathbf {E}\lozenge \), \(\mathbf {E}\square \), and \(\dashrightarrow \), as properties of infinite runsFootnote 2 starting from a state. For instance,

$$\begin{aligned} \mathbf {A}\lozenge \ \phi \ x = (\forall xs.\ x \rightarrow ^{xs}\ \Longrightarrow \ ev\ (holds\ \phi )\ (x \cdot xs)), \end{aligned}$$


$$\begin{aligned} \phi \dashrightarrow \psi = \mathbf {A}\square \ (\lambda x.\ \phi \ x \Longrightarrow \mathbf {A}\lozenge \ \psi \ x), \end{aligned}$$

where ev specifies that a property on a stream eventually holds, and holds constrains ev to the current state instead of the remainder stream. It then is trivial to prove identities such as \(\mathbf {E}\square \ \phi \ x = (\lnot \, \mathbf {A}\lozenge \ ( Not \circ \phi )\ x)\).

2.2 Timed Automata

Compared to standard finite automata, timed automata introduce a notion of clocks. Figure 1 depicts an example of a timed automaton. We will assume that clocks are of type \( nat \). A clock valuation u is a function of type \( nat \Rightarrow real \).

Fig. 1.
figure 1

Example of a timed automaton with two clocks.

Locations and transitions are guarded by clock constraints, which have to be fulfilled to stay in a location or to take a transition. Clock constraints are conjunctions of constraints of the form \(c \sim d\) for a clock c, an integer d, and \(\sim \; \in \{<, \le , =, \ge , >\}\). We write \(u \vdash cc\) if the clock constraint cc holds for the clock valuation u. We define a timed automaton A as a pair \((\mathcal {T}, \mathcal {I})\) where \(\mathcal {I}\) is an assignment of clock constraints to locations (also named invariants); and \(\mathcal {T}\) is a set of transitions written as \(A \vdash l \longrightarrow ^{g,a,r} l'\) where l and \(l'\) are start and successor location, g is the guard of the transition, a is an action label, and r is a list of clocks that will be reset to zero when the transition is taken. States of timed automata are pairs of a location and a clock valuation. The operational semantics define two kinds of steps:

  • Delay: \((l, u) \rightarrow ^d (l, u \oplus d)\) if \(d \ge 0\) and \(u \oplus d\vdash \mathcal {I}\ l\);

  • Action: \((l, u) \rightarrow _a (l', [r \rightarrow 0]u)\)

    if \(A \vdash l \longrightarrow ^{g,a,r} l'\), \(u \vdash g\), and \([r \rightarrow 0]u\vdash \mathcal {I}\ l'\);

where \(u \oplus d = (\lambda c.\ u\ c + d)\) and \([r \rightarrow 0]u = (\lambda c.\ \textsf {if}\ c \in r\ \textsf {then}\ 0\ \textsf {else}\ u\ c)\). For any (timed) automaton A, we consider the transition system

$$\begin{aligned} (l, u) \rightarrow _A (l', u') = (\exists d \ge 0.\ \exists a\ u''.\ (l, u) \rightarrow ^d (l, u'') \wedge (l, u'') \rightarrow _a (l', u') ) . \end{aligned}$$

That is, each transition consists of a delay step that advances all clocks by some amount of time, followed by an action step that takes a transition and resets the clocks annotated to the transition. We write \(A,s_0 \models \phi \) if \(\phi \) holds in state \(s_0\) w.r.t. \(\rightarrow _A\). Note that it is crucial to combine the two types of steps in order to reason about liveness. Consider the automaton from Fig. 1 and assume the two kinds of steps could be taken independently. Then the automaton has a run on which some predicate P holds everywhere if and only if \(P\ s_1\) holds.

2.3 Model Checking

Due to the use of clock valuations, the state space of timed automata is inherently infinite. Thus, model checking algorithms for timed automata are based on the idea of abstracting from concrete valuations to sets of clock valuations of type \(( nat \Rightarrow real )\ set\), often called zones. The initial decidability result [1] partitioned the state space into a quotient of zones, the so-called regions, and showed that these yield a sound and complete abstractionFootnote 3. However, practical model checking algorithms rather explore the state space in an on-the-fly manner, computing successors directly on zones, which are typically represented symbolically as Difference Bound Matrices (DBMs). DBMs are simply a matrix-form representation of clock constraints, which contain exactly one conjunct for each pair of clocks. To represent constraints on single clocks, an artificial \(\mathbf {0}\)-clock is added, which is assumed to be assigned 0 in any valuation.

The delicate part of this method is that the number of reachable zones could still be infinite. Therefore, an over-approximation is applied to zones to obtain a finite search space. We call the transition system of zones the zone graph, and the version where over-approximations are applied the abstract zone graph [18]. The soundness argument for this method (due to over-approximation completeness is trivial), starts from the region construction and then introduces the notion of the closure of a zone, which is defined to be the union of all regions intersecting with a zone. It can be shown from the correctness of the region construction that closures yield a sound over-approximation of zones. Finally, one shows that the result of applying the over-approximation operator to zones is always contained in the closure, thus inheriting soundness from the soundness of closures. We have formalized this argument and all of the material summarized in this section in previous work [7]. It only covers the case of reachability, but we will demonstrate how to extend the soundness argument to liveness below.

3 A First Glance at the Model Checker

This section provides a first overview of our model checker, its construction, and the correctness theorem we proved. The input to our checker consists of a model, i.e. a network of Timed Automata, and a formula to be checked against the model. To achieve high compatibility with Uppaal, guards and updates can be formulated in Uppaal bytecodeFootnote 4. This intermediate representation is computed by Uppaal from the original C-style input before the actual model checking process is started. Given such an input, our tool will first determine whether the input is valid and lies in the supported fragment. This is achieved by a simple program analysis. As input formulae, our model checker accepts the same (T)CTL fragment that is supported by UPPAAL, but restricts formulae to not contain clocks. While this is not a principal limitation of our work, it reduced the complexity of our first prototype. If the input is invalid, our tool answers with “invalid input”, else it determines whether

$$\begin{aligned} conv \ N,( init , s_0, u_0) \vDash _{ max\_steps } \phi \end{aligned}$$

holds for the all-zero valuation \(u_0\) under the assumption that the automaton is deadlock-freeFootnote 5, and answers with true/false. Here, N is the input automaton, \( conv \) converts all integer constants to reals (as the semantics are specified on reals), and \(\phi \) is the input formula. The relation \(\vDash _{ max\_steps }\) is a variant of \(\vDash \) lifted to networks of timed automata with shared state and Uppaal bytecode annotations. It is indexed with the maximum number of steps that any execution of a piece of Uppaal bytecode can use (i.e. \( max\_steps \) is the fuel available to executions). The vector of start locations init, and the shared state \(s_0\) (part of the input) describe the initial configuration.

The actual model checking proceeds in two steps. First, a product construction converts the network to a single timed automaton, expressed by HOL functions for the transition relation and the invariant assignment. Second, according to the formula, a model checking algorithm is run on the single automaton. We need three algorithms: a reachability checker for \(\mathbf {E}\lozenge \) and \(\mathbf {A}\square \), a loop detection algorithm for \(\mathbf {E}\square \) and \(\mathbf {A}\lozenge \), and a combination of both to check \(\dashrightarrow \)-properties. Note that the aforementioned HOL functions are simply functional programs that construct the product automaton’s state and invariant assignments on-the-fly. The final correctness theorem we proved can be stated as follows:

$$\begin{aligned} \begin{array}{l} \mathbf {\{}emp\}\\ \ \ precond\_mc \ p\ m\ k\ max\_steps \ I\ T\ prog \ formula \ bounds \ P\ s_0\\ \{\lambda Some \ r \Rightarrow valid\_input \ p\ m\ max\_steps \ I\ T\ prog \ bounds \ P\ s_0\ na\ k\ \wedge \\ \ \ \ \ (\lnot \, deadlock \ ( conv \ N)\ ( init , s_0, u_0) \Longrightarrow \\ \ \ \ \ \ \ r = conv \ N,( init , s_0, u_0) \vDash _{ max\_steps } formula )\\ \ |\ None \Rightarrow \lnot \, valid\_input \ p\ m\ max\_steps \ I\ T\ prog \ bounds \ P\ s_0\ na\ k \} \end{array} \end{aligned}$$

This Hoare triple states that the model checker terminates and produces the result None if the input is invalid. If the input is valid and deadlock free, it produces the result \(Some\ r\), where r is the answer to the model checking problem.

4 Single Automaton Model Checking

In this section, we describe the route from the abstract semantics of timed automata to the implementation of an actual model checker. The next section will describe the construction of a single timed automaton from the Uppaal-model.

4.1 Implementation Semantics

Although we have established that the DBM-based semantics from Sect. 2 can only explore finitely many zones, it is still “too infinite”: the automaton and DBMs are described by real constants, and operations on DBMs are performed on infinitely many dimensions (i.e. clocks). Thus, we introduce an implementation semantics, in which automata are given by integer constants, and where the number of clocks is fixed. We prove equivalence of the semantics in two steps: first, we show that DBM operations need only be performed on the clocks that actually occur in the automaton; second, we show that all computations can be performed on integers, provided the initial state only contains integers.

For the former step, we simplify the operations under the assumptions that they maintain canonicity of DBMs. A DBM is canonical if it stores the tightest derivable constraint for each pair of clocks, i.e.

$$ canonical \ M\ n = (\forall i\ j\ k.\ i \le n \wedge j \le n \wedge k \le n \rightarrow M\ i\ k \le M\ i\ j + M\ j\ k). $$

During model checking, the Floyd-Warshall algorithm is used to turn a DBM into its canonical counterpart.

Fig. 2.
figure 2

Refinement stages of the up operation for computing time successors.

For the latter step, we use Isabelle’s integrated parametricity prover [19] to semi-automatically transfer the operations from reals to integers.

As an example, Fig. 2 displays the refinement steps of the up operation, which computes the time successor of a zone Z, i.e. the set \(\{u \oplus d\ |\ u \in Z \wedge d \ge 0\}\).

In the step from up to \(up_1\), the assumption that the input DBM is canonical is introduced. In \(up_2\), which is the version used in the implementation semantics, the operation is constrained to clocks 1 to n. Finally, in \(up_3\), the matrices are implemented as arrays and the fold is implemented as a foreach loop.

At this point, a naive exploration of the transitive closure of the implementation semantics would already yield a simple but inefficient model checker. The rest of this section outlines the derivation of a more elaborate implementation that is close to what can be found in Uppaal.

4.2 Semantic Refinement of Successor Computation

We further refine the implementation semantics to add two optimizations to the computation of successor DBMs: to canonicalize DBMs it is sometimes sufficient to only “repair” single rows or columns instead of running the full Floyd-Warshall algorithm; moreover, we can terminate the computation early whenever we discover a DBM that represents an empty zone (as it will remain empty). Both arguments are again carried out on the semantic level.

4.3 Abstraction of Transition Systems with Closures

Recall that the correctness of the reachability analysis on the abstract zone graph in Sect. 2 was obtained arguing that the region closure of zones forms a sound over-approximation of zones, which in turn is larger than the abstract zone graph. We want to reuse the same kind of argument to also argue that there exists a cycle in the abstract zone graph if and only if there is a cycle in the automaton’s transition system. This proof is carried out in a general abstract framework for transition systems and their abstractions.

We consider a concrete step relation \(\rightarrow _C\) over type \('a\) and what is supposed to be its simulation, a step relation \(\rightarrow _{A_1}\) over type \('a\ set\). We say that \(\rightarrow _{A_1}\) is post-stable [20] if \(S \rightarrow _{A_1}T\) implies

$$\begin{aligned} \forall s' \in T.\ \exists s \in \ S.\ s \rightarrow _Cs', \end{aligned}$$

and that \(\rightarrow _{A_1}\) is pre-stable [20] if \(S \rightarrow _{A_1}T\) implies

$$\begin{aligned} \forall s \in S.\ \exists s' \in \ T.\ s \rightarrow _Cs'. \end{aligned}$$

In the timed automata setting, for instance, the simulation graph is post-stable and the region graph is pre-stable.

Lemma 1

If \(\rightarrow _{A_1}\) is post-stable and we have \(a \rightarrow _{A_1}as \rightarrow _{A_1}a\) with a finite and non-empty, then there exist xs and \(x \in a\) such that \(x \rightarrow _Cxs \rightarrow _Cx\).


Let \(x \rightarrow y = (\exists xs.\ x \rightarrow _Cxs \rightarrow _Cy)\). As \(\rightarrow _{A_1}\) is post-stable, every a has an ingoing \(\rightarrow \)-edge. Because a is finite we can thus find an \(\rightarrow \)-cycle in a, and obtain the claim.

Lemma 2

If \(\rightarrow _{A_1}\) is pre-stable and we have \(a \rightarrow _{A_1}as \rightarrow _{A_1}a\) and \(x \in a\), then there exist xs such that \(x \rightarrow _C^{xs}\) and xs passes through a infinitely often.


By coinduction. From pre-stability we can find \(x_1 \in a\) such that \(x \rightarrow _C^+ x_1\), from \(x_1\) we find \(x_2 \in a\) such that \(x_1 \rightarrow _C^+ x_2\), and so forth.

We can now consider doubly-layered abstractions as in the case for regions and zones. That is, we add a second simulation \(\rightarrow _{A_2}\) and two predicates \(P_1\) and \(P_2\) that designate valid states of the first and second simulation layer, respectively. Then we define the closure \(\mathcal {C}\) of a state of the second layer as

$$\mathcal {C}\ a = \{x\ |\ P_1\ x \wedge a \cap x \ne \emptyset \}\ \text {and}\ a \rightarrow _{\mathcal {C}}b = (\exists x\ y.\,a = \mathcal {C}\ x \wedge b = \mathcal {C}\ y \wedge x \rightarrow _{A_2}y).$$

We assume that \(\rightarrow _{A_1}\) is pre-stable w.r.t. \(\rightarrow _C\) and that \(\rightarrow _{\mathcal {C}}\) is post-stable w.r.t. \(\rightarrow _{A_1}\). Along with some side conditions on \(P_1\) and \(P_2\)Footnote 6 we can prove:

Theorem 1

If \(a_0 \rightarrow _{A_2}as \rightarrow _{A_2}a \rightarrow _{A_2}bs \rightarrow _{A_2}a\) and \(P_2\ a\), then there exist \(x \in \bigcup (\mathcal {C}\ a_0)\) and xs such that \(x \rightarrow _C^{xs}\) and xs passes through \(\bigcup (\mathcal {C}\ a)\) infinitely often.


We first apply \(\mathcal {C}\) to the second layer states and get a path of the form: \(\mathcal {C}\ a_0 \rightarrow _{\mathcal {C}}as' \rightarrow _{\mathcal {C}}\mathcal {C}\ a \rightarrow _{\mathcal {C}}bs' \rightarrow _{\mathcal {C}}\mathcal {C}\ a\) for some \(as'\) and \(bs'\). From Lemma 1 and post-stability, we obtain a path of the form \(a_{0_1} \rightarrow _{A_1}as_1 \rightarrow _{A_1}a_1 \rightarrow _{A_1}bs_1 \rightarrow _{A_1}a_1\) with \(a_{0_1} \in \mathcal {C}\ a_0\) and \(a_1 \in \mathcal {C}\ a\). By applying Lemma 2 and pre-stability, we obtain the desired result.

This is the main theorem that allows us to run cycle detection on the abstract zone graph during model checking: the other direction is trivial, and the theorem can be directly instantiated for regions and (abstracted) zones. There is a slight subtlety here since we only guarantee \(x \in \bigcup (\mathcal {C}\ a_0)\). However we typically have \(\mathcal {C}\ a_0 = a_0\), as all clocks are initially set to zero.

4.4 Implementation of Search Algorithms

We first implement the three main model checking algorithms abstractly in the nondeterminism monad provided by the IRF. On this abstraction level, we can use such abstract notions as sets and specify the algorithm for an arbitrary (finite) transition system \(\rightarrow \). We only showcase the implementation of our cyclicity checker (used for \(\mathbf {A}\lozenge \) and \(\mathbf {E}\square \)). The techniques used for the other algorithms are similar. The code for our cyclicity checker is displayed in Listing 1.1.

figure b

We claim that this closely resembles the pseudo-code found, e.g., in [21]. The algorithm takes a passed set, and produces a new passed set in addition to the answer. This can be used in the algorithm for checking \(\dashrightarrow \)-properties. The crux of the algorithm is the use of the subsumption operator \(\preceq \), to check whether smaller states are already subsumed by larger states that we may have discovered earlier (for timed automata, this would correspond to set inclusion on zones). We assume that \(\preceq \) is a pre-order and monotone w.r.t. \(\rightarrow \). Then, using the IRF’s verification condition generator, we prove:

$$\begin{array}{c} dfs \ P \le \textsf {SPEC}\ (\lambda (r, P').\ (r \Longrightarrow (\exists x.\ a_0 \rightarrow ^* x \wedge x \rightarrow ^+ x)) \\ \wedge (\lnot \, r \Longrightarrow \lnot \, (\exists x.\ a_0 \rightarrow ^* x \wedge x \rightarrow ^+ x) \wedge liveness\_compatible \ P' ) )\\ \text {if}\ liveness\_compatible \ P. \end{array}$$

The invariant we maintain for the passed set P is encoded in the predicate \( liveness\_compatible \ P\). We say that a state x is covered by P if there exists \(x' \in P\) such that \(x \preceq x'\). Then, informally, \( liveness\_compatible \ P\) states that the successors of every node that is covered by P are also covered, and that there is no cycle through nodes that are covered by P. After specifying the correct loop invariant (using \( liveness\_compatible \) as the main insight) and the termination relation, together with some key lemmas about the invariant, the verification conditions can be discharged nearly automatically.

In subsequent steps, we gradually refine this implementation to use more efficient data structures. The final version uses a function to compute a list of successors, and is able to index the passed set and the stack according to a key function on states (this corresponds to the location part of states in the abstract zone graph). The refinement theorem can be stated as:

$$\begin{aligned} dfs\_map \ P \le \ \Downarrow (Id \times _r map\_set\_rel )\ ( dfs \ P')\ \text {if}\ (P, P') \in map\_set\_rel . \end{aligned}$$

That is, \( dfs\_map \) is a refinement of \( dfs \), where the passed set is data-refined w.r.t. the relation \( map\_set\_rel \). This relation describes the implementation of passed sets indexed by keys.

These refinement steps are conducted inside the nondeterminism monad of the IRF. The final step leads into the heap-monad of Imperative HOL [22], which supports imperative data structures. Here, the Sepref tool [6] replaces functional by imperative data structures and generates a refinement theorem automatically.

Maps are implemented via hash tables, which poses a challenge for the implementation as the maps contain objects stored on the heap. This was not supported by the existing implementation in the Imperative Collections Framework, due to sharing issues: when retrieving a value from the map, we cannot obtain ownership of the value while the map also retains ownership. This is even true if the value is read-only. One way to solve this problem would be to extend the separation logic that underlies the IRF to fractional permissions or read-only permissions. Our solution, however, is more ad-hoc: we simply restrict the operations that we perform on the hash map to insertions and an extract operation, which deletes a key-value pair from the map and returns the value (i.e. it combines lookup and delete). To define the map implementation, we use a trick similar to Chargueraud’s ideas from [23]: we use a heap assertion that first connects an abstract map m with an intermediate map \(m_h\) of pointers to the elements, and then implements the map of pointers by a regular hash map \(m_i\). Formally, the assertion is defined as:

$$\begin{aligned} hms\_assn \ A\ m\ m_i = (\exists _A m_h.\ is\_map \ m_h\ m_i * map\_assn \ A\ m\ m_h). \end{aligned}$$

Here \( is\_map \) is the assertion for an existing map implementation from the Imperative Collections Framework (which cannot store heap objects, but supports pointers), and \( map\_assn \ A\ m\ m_h\) connects a map of abstract values with a map of pointers, where the relation between abstract values and pointed-to objects is defined by A.

Then, the final implementation is produced by proving that all map-related operations in \( dfs\_map \) can be replaced by insert and extract operations, and letting Sepref synthesize the imperative variant, making use of our new hash map implementation. The key theorem on the final implementation is the following Hoare triple:

$$ \begin{array}{l} \{ emp \}\\ \ \ dfs\_map\_impl' \ succsi \ a_0i\ Lei \ keyi \ copyi \\ \{ \lambda r.\ r=(\exists x.\ a_0 \rightarrow ^* x \wedge x \rightarrow ^+ x) \} \end{array} $$

It is expressed in a locale (Isabelle’s module system) that assumes that \(a_0i\), \( succsi \), etc., are the correct imperative implementations of \(a_0\), the successor function, and so forth. Versions of the search algorithm for concrete transition systems are obtained by simply instantiating the locale with the operations of the transition system and their implementations.

4.5 Imperative Implementations of Model Checking Operations

Recall the refinement of the up operation (Fig. 2). It is crucial that \(up_2\) is expressed as a fold-operation with explicit updates, as only then the IRF can extract an efficient imperative version with destructive updates and a foreach loop. The imperative implementation \(up_3\) is, again, synthesized by the Sepref tool. As can be witnessed for \(up_3\), the pattern \( fold \ f\ [1 \,..{<}\, n+1]\) is turned into a foreach loop. Technically, this is achieved by a set of rewrite rules that are applied automatically by the Sepref tool at the end of the synthesis process. The only hurdle for this kind of synthesis is that the dimension of DBMs needs to become a parameter of the refinement relations. For n clocks, we define

$$ mtx\_assn = asmtx\_assn \ (n + 1)\ id\_assn . $$

This specifies that our DBMs are implemented by square-arrays of dimension \(n+1\), and their elements are refined by the identity relation.

The refinement theorem for \(up_3\) is proved automatically by the Sepref tool:

$$ \begin{array}{c} (up_3, up_2) \in [\lambda (\_, i).\ i \le n]\ mtx\_assn ^d * nat\_assn ^k \rightarrow mtx\_assn . \end{array} $$

This theorem states that, if the specified dimension is in bounds, \(up_3\) refines \(up_2\). The \(\cdot ^d\) annotation indicates that the operation is allowed to overwrite (destroy) the input matrix on the heap. Symmetrically, the \(\cdot ^k\) annotation means that the second parameter is not overwritten (kept).

4.6 Code Extraction

Finally, Isabelle/HOL’s code generator [24] is used to extract imperative Standard ML code from the Isabelle specifications generated by Sepref. Code generation involves some optimizations and rewritings that are carried out as refinement steps and proved correct, followed by pretty printing from the functional fragment of HOL and the heap monad to Standard ML.

5 From UPPAAL-Style Semantics to a Single Automaton

5.1 UPPAAL-Style Semantics

Due to the lack of documentation on the Uppaal intermediate format, we define an approximation of this assembler-like language by reverse engineering. This is sufficient to check typical benchmarks, and gives a clearly defined semantics to the fragment that we cover. The language is defined as a simple data type instr. A step function of type \(instr \Rightarrow state \Rightarrow state\ option\) computes the successor state after executing an instruction, or fails. A state consists of an instruction pointer, the stack, the state of the shared integer variables, the state of the comparison flag, and a list of clocks that have been marked for reset. Using a fuel parameter, we execute programs by repeatedly applying the step function until we either reach a halt instruction, fail, or run out of fuel, which we also regard as a failed execution.

A special instruction CEXP is used to check whether an atomic clock constraint holds for a given valuation u. However, this instruction cannot simply be executed during model checking as it would need to work on zones instead of valuations. Unconstrained use of the CEXP instruction would allow for disjunctions of clock constraints on edges, which is not part of the standard timed automata formalism. Thus, in the same way as Uppaal, we restrict the valid input programs to those that only yield conjunctions of clock constraints on edges. We then replace every CEXP instruction by a special meta instruction that sets the comparison flag to true. This amounts to enforcing a program execution where the clock constraint, which is expressed by a piece of bytecode, holds for a valuation. Edges are annotated with the conjunction of the atomic clock constraints encountered during execution. In the current version of our tool, we separate concerns for locations by using a state predicate, which is not allowed to use CEXP instructions, and a separate clock constraint. The two could be merged by using the same approach as for edges.

5.2 Program Analysis

As stated in the last section, we need to ensure that successful program executions can only induce conjunctive clock constraints. That is, we need to ensure that program executions can only be successful when all CEXP instructions that are encountered during execution evaluate to true. To this end, we use a naive analysis, which recognizes a subclass of these programs that is sufficiently large to cover common timed automata benchmarks. This analysis tries to identify what we call conjunction blocks. A conjunction block reaching from addresses \(pc_s\) to \(pc_t\) ends with a halt instruction at \(pc_t\), starts with a CEXP instruction at \(pc_s\) and then is extended to \(pc_t\) by repeatedly using one of the following two patterns:

  • a copy instruction to push the flag on the stack, followed by CEXP and an and instruction;

  • a copy instruction, followed by a jump-on-zero instruction with \(pc_t\) as the destination, followed by CEXP and an and instruction.

We simultaneously show the two key properties of conjunction blocks via induction: if there is a conjunction block from \(pc_s\) to \(pc_t\), then any successful execution starting from \(pc_s\) ends in \(pc_t\), and every CEXP instruction that is encountered has to evaluate to true. Given a start address \(pc_s\), the whole analysis works by computing an approximation of the set of possible addresses that can be reached from \(pc_s\), say S, and then checking whether

$$\begin{aligned} Min \ \{pc\ |\ pc \in S \wedge (\exists ac.\ P_{pc} = \textit{CEXP}\ ac)\}\ \text {to}\ Max \ S \end{aligned}$$

is a conjunction block, where \(P_{pc}\) is the program instruction at address pc. A major limitation of this analysis is that it cannot approximate the reachable set for call and return instructions, so we are not able to handle inputs that are compiled from Uppaal programs with sub-routines. However, as the main objective of our work is not program analysis, we consider the current naive analysis sufficient to demonstrate the general viability of our approach.

5.3 Product Construction

The general shape of our product construction is outlined in Fig. 3. The first stage of the construction encodes the bytecode annotations as guards and updates on the shared state. The subsequent stage constructs a network of automata for each shared state by essentially filtering the transitions that are valid for a given state. For a simple network, the product can be constructed in the obvious way. However, this is only used in the correctness proof of the final step, which directly constructs a single automaton by pairing the location vector and the state.

Fig. 3.
figure 3

Outline of the product construction.

The result of this construction is a highly contrived description of the single automaton. To obtain an efficiently executable version of this description, we specify an alternative functional implementation and prove the equivalence of the two.

6 Experimental Evaluation

We conducted experiments on some standard benchmark models for timed automata: a variant of Fischer’s mutual-exclusion protocol, the FDDI token ring protocol, and the CSMA/CD protocol used in Ethernet networks. We tested one reachability and one liveness property for each model: \(\mathbf {E}\lozenge (c > 1)\) and \(P_1.b \dashrightarrow P_1.c\) for Fischer’s protocol; \(\mathbf {E}\lozenge (\lnot \, P_1. idle \wedge \lnot \, P_2. idle )\) and \( true \dashrightarrow z\_async_1 \) for FDDI; and \(\mathbf {E}\lozenge (P_1. abort \wedge P_2. send )\), and \( collision \dashrightarrow active \) for CSMA/CD. We compare (c.f. Table 1) our tool against Uppaal configured with two different approximation operators: difference (\({\textsc {Uppaal}}_1\)) and location-based (\({\textsc {Uppaal}}_2\)) extrapolation. We give the computation time in seconds and the number of explored states, as reported by our tool and UppaalFootnote 7. Since the number of explored states differs significantly, we also calculated throughput, i.e. the number of explored states per second. The ratio of Uppaal’s throughput and our tool’s throughput is given in the column TR. We specify the problem size as the number of automata in the network.

Table 1. Experimental results on a set of standard benchmarks.

The results indicate that our tool’s throughput is around one order of magnitude lower than Uppaal’s. Encouragingly, the gap seems to decrease for larger models. However, for larger problem sizes of some models, we also start to run out of memory because our tool is not tuned towards space consumption. We do not have a convincing explanation for the difference in states explored by our tool and Uppaal — particularly, because our tool already implements location-based extrapolation. Nevertheless, we conclude that the performance offered by our tool is reasonable for a reference implementation against which other tools can be validated: we can check medium sized instances of common benchmark models, which should be sufficient to scrutinize the functionality of a model checker.

7 Conclusion

We have derived an efficiently executable and formally verified model checker for timed automata. Starting from an abstract formalization of timed automata, we first reduced the problem to model checking of a single automaton, and then used stepwise refinement techniques to gradually replace abstract mathematical notions by efficient algorithms and data structures. Some of the verified algorithms and data structures, e.g. search with subsumption and Difference Bound Matrices, are interesting in their own right. Our experiments demonstrate that our tool’s performance is suitable for validating other model checkers against it on medium sized instances of classic benchmark models. Using a simple program analysis, we can cover a subset of the Uppaal bytecode that is sufficient to accept common models as an input.

Following the construction we expounded above, our checker can be improved on two different axes: advanced modeling feature such as broadcast channels or committed locations can be enabled by elaborating the product construction; using the refinement techniques that we demonstrated above, further improvements of the model checking algorithms can achieve better performance.

An alternative approach to tackle performance problems is to resort to certification of model checking results. For the simple CTL properties that are supported by our tool and Uppaal, passed sets could be used as the certificates and the model checking algorithms could be reused for certificate checking. As the model checking algorithms for timed automata make use of subsumption, passed sets can contain significantly less states than the total number of states explored during model checking. We plan on exploring this avenue in the future.