Timed Automata Verification and Synthesis via Finite Automata Learning

. We present algorithms for model checking and controller synthesis of timed automata, seeing a timed automaton model as a parallel composition of a large ﬁnite-state machine and a relatively smaller timed automaton, and using compositional reasoning on this composition. We use automata learning algorithms to learn ﬁnite automata approximations of the timed automaton component, in order to reduce the problem at hand to ﬁnite-state model checking or to ﬁnite-state controller synthesis. We present an experimental evaluation of our approach.


Introduction
Timed automata [1] are a well-known formalism for modeling and verifying realtime systems.They can be used to model systems as finite automata, while using, in addition, clocks to impose timing constraints on the transitions.Using clock variables have advantages.They allow one to describe models that are expressive thanks to real-valued clock values; moreover, the use of specific clock variables enable optimizations such as sound and complete abstractions, also known as extrapolation operators [5].Model checking algorithms have been developed and implemented in tools such as Uppaal [8], TChecker [29], PAT [50].
One approach for model checking timed automata is based on representing the set of clock values with zones, which are particular polyhedra, and using explicit enumeration on the discrete states.There has been extensive research on sound and complete abstractions on zones, which improved the performance of the model checking tools, and made it possible to handle models with more complex time constraints; see [11] for a survey.However this approach does not scale to models with large discrete spaces due to explicit enumeration.Several authors have developed algorithms to remedy this issue, and to attempt to adapt efficient model checking techniques finite-state systems to timed systems.Extensions of binary decision diagrams (BDD) with clock constraints have been considered both for continuous time [53,10,24] and discrete time [43,51].Another approach is to use predicate abstraction on clock variables that enables efficient finite-state verification techniques based on BDDs or SAT solvers [17,16,47].
Controller synthesis is a related problem in which some transitions of the system are controllable and some are uncontrollable, and the objective is to compute a control strategy which guarantees that all induced runs of the system satisfy a given specification; see e.g.[52].This problem is formalized using games, and in the case of real-time systems, using timed games [40,4].Zone-based algorithms have been developed to solve timed games and compute control strategies [14], and are available in the Uppaal TIGA tool [7].These algorithms suffer from the same limitations as the zone-based model checking algorithms.Although they can be efficient on instances with small discrete state spaces, they do not scale well to large systems.An attempt was made to implement the counter-example guided abstraction refinement scheme to handle larger discrete state space in timed games in [45].On the other hand, there are several efficient finite-state game solvers, based on BDDs and SAT solvers, which can efficiently handle relatively large state spaces [32], but cannot handle real time.
In this work, we introduce an approach that is applied both to model checking and controller synthesis of timed automata with the objective of combining the advantages of both timed automata and finite-state model checkers and game solvers.Our suggestion is to see the input model, without loss of generality, as a parallel composition between a finite-state machine A, and a timed automaton T .We specifically target instances where A is large, and T is relatively small but nontrivial.Note that this point of view was considered before in the verification of synchronous systems within a real-time environment [9].As a novelty, for model checking, we apply a compositional reasoning rule on the product A T by replacing the timed automaton T by a (small) deterministic finite automaton (DFA) H which represents the behaviors of T .To automatically select the DFA H, we adapt the algorithm [44] to our setting, and use a DFA learning algorithm (such as L* [3], or TTT [30]) to find an appropriate DFA either to prove the specification or to reveal a counterexanple.
Our approach enjoys the principle of separation of concerns in the following sense.A timed automaton model checker is used by the learning algorithm to answer membership and equivalence queries (see Section 2.2); these are answered without refering to A, thus, by avoiding the large discrete state space.Therefore, the timed automaton model checker is used in this approach for what it is designed for: handling real-time constraints encoded in T , not for dealing with excessive discrete state spaces.Once an appropriate DFA H is found by the learning algorithm, the system A H is model-checked using a finite-state model checker whose focus is to deal with large discrete state spaces.We can thus benefit from the best of the two worlds: a state-of-the-art model checker for timed automata, which is somewhat used here as a theory solver, and any finite-state model checker based on BDDs, SAT solvers, or even explicit-state enumeration.
The application of the learning-based compositional reasoning of [44] to controller synthesis is more involved.Our objective was to find a way to exploit efficient finite-state game solvers [32] in the context of timed automata even if this meant having an incomplete algorithm.We describe a setting where a one-sided abstraction is applied for controller synthesis by replacing the timed automaton component by a learned DFA.Contrarily to the model checking algorithm, our controller synthesis algorithm is sound but not complete, that is, the algorithm may fail although there exists a control strategy, while any control strategy that is output is correct.More precisely, we consider timed games in the form G T where G is a finite-state game, and T is a timed automaton.We describe an algorithm that alternates between two phases.In the first phase, the goal is to find a DFA H that is an overapproximation of T .Once this is found, we use a finite-state game solver on G H; if there is a control strategy, we prove that it can be applied in the original system G T .If not, then we obtain a counterstrategy S. We then switch to the second phase whose goal is to check whether the counterstrategy is spurious or not; and it does so by learning an underapproximation DFA H of T , and checking whether S induces runs that are all in H. Accordingly, we either reject the instance or switch back to the first phase.As in the model checking algorithm, the timed automaton model checker is only used to answer queries independently from G, and a finite-state game solver and a model checker are used to compute and analyze strategies in a discrete state-space.
To the best of our knowledge, apart from [45], we present the first algorithm that can solve timed games with large discrete state spaces.Although the algorithm applies to a subset of timed games and is not complete, we believe it is of utmost importance to make progress on the scalability of timed game solvers in order for these methods to be applied in convincing applications.Our paper makes an attempt in this direction.
We evaluate our algorithms in comparison with state-of-the-art tools and show that our approach is competitive with the existing tools, and can allow both model checking and synthesis to scale to larger models.The approach offers an alternative treatment of timed models, which might be applied in other settings.
We present the model checking algorithm in Section 2 which contains formal definitions, the description of the algorithm, and the experiments.Section 3 presents our contributions on the controller synthesis problem, and includes formal definitions, the description of the algorithm, and the experiments.In Section 4, we provide a broader discussion on related works, and present our conclusions and perspectives.

Preliminaries
Labeled Transition Systems and Finite Automata.We denote finite labeled transition systems (LTS) as tuples (Q, q 0 , Σ, T ) where Q is the set of states, q 0 ∈ Q is the initial state, Σ is a finite alphabet, T ⊆ Q×Σ ∪ { }×Q is the transition relation ( labels silent transitions).Because we will consider synchronous product of LTSs, we will use silent transitions to define internal transitions not exposed for synchronization.A finite automaton is an LTS given with a set of accepting states F ⊆ Q, and is written (Q, q 0 , Σ, T, F ).A run of an automaton is a sequence q 1 e 1 q 2 e 2 . . .q n where q 1 = q 0 , e i = (q i , σ i , q i+1 ) ∈ T for some σ i ∈ Σ ∪ { } for each 1 ≤ i ≤ n − 1.The trace of the run is the sequence σ 1 σ 2 . . .σ n−1 .An accepting run starts at q 0 and ends in F .The language of a finite automaton A is the set of the traces of all accepting runs of A, and is denoted by L(A).We will consider deterministic finite automata (DFA) which do not have silent transitions, and have at most one edge for each label from each state.
The parallel composition of two automata A i = (Q i , q 0 i , Σ, T i , F i ), i ∈ {1, 2}, defined on the same alphabet, is the automaton 2 ), F = F 1 ×F 2 , and T contains ((q 1 , q 2 ), σ, (q 1 , q 2 )) for all (q 1 , σ, q 1 ) ∈ T 1 , and (q 2 , σ, q 2 ) ∈ T 2 ; and ((q 1 , q 2 ), , (q 1 , q 2 )) for all (q 1 , , q 1 ) ∈ T 1 , and q 2 ∈ Q 2 ; and symmetrically, ((q 1 , q 2 ), , (q 1 , q 2 )) for all (q 2 , , q 2 ) ∈ T 2 , and Finite Automata Learning.We use finite automata learning algorithms such as L * [3,46] and TTT [30].In the online learning model, the learning algorithm interacts with a teacher in order to learn a deterministic finite automaton recognizing a hidden regular language known to the teacher.The algorithm can make two types of queries.A membership query consists in asking whether a given word belongs to the language, to which the teacher answers by yes or no.An equivalence query consists in creating a hypothesis automaton H, and asking the teacher whether H recognizes the language.The teacher either answers yes, or no and provides a counterexample word which is in the symmetric difference of L(H) and of the target language.Learning algorithms typically make a large number of membership queries, and a smaller number of equivalence queries.Timed Automata.We fix a finite set of clocks C. Clock valuations are the elements of We extend these operations to sets of valuations in the standard way.We write 0 for the valuation that assigns 0 to every clock.
We consider a clock named 0 which has the constant value 0, and let C 0 = C ∪ {0}.An atomic guard is a formula of the form x k, or x − y k where x, y ∈ C 0 , k, l ∈ N, and ∈ {<, ≤, >, ≥}.A guard is a conjunction of atomic guards.A valuation v satisfies a guard g, denoted v |= g, if all atomic guards are satisfied when each x ∈ C is replaced by v(x).Let Φ C denote the set of guards for C.
A timed automaton T is a tuple (L, 0 , Σ, Inv, C, E, F ), where L is a finite set of locations, 0 ∈ L is the initial location, Σ is the alphabet, Inv : An edge e = ( , g, σ, R, ) is also written as A run of T is a sequence r = q 1 e 1 q 2 e 2 . . .q n where q i ∈ L×R C ≥0 , q 1 = ( 0 , 0), and writing ).The run is accepting if the last location is in F .The trace of the run r is the word σ 0 σ 1 . . .σ n where σ i is the label of edge e i .
The untimed language of the timed automaton T is the set the traces of the accepting runs of T , and is denoted by L(T ).
A timed automaton is label-deterministic if at each location , for each label σ ∈ Σ, there is at most one transition leaving labelled by σ; in other terms, the finite automaton obtained by removing all clocks is deterministic.
We consider the parallel composition of a finite automaton A = (Q, q 0 , Σ, T, F ) and a timed automaton T = (L, 0 , Σ, Inv, C, E, F T ) which is a new timed automaton.Intuitively, a transition labeled by σ consists in an arbitrary number of silent transitions of A, followed by a joint σ-transition of both components.The guard and the reset of the overall transition are those of the transition of T .Formally, let A T = (L , 0 , Σ, Inv , C, E , F ) with L = Q×L, Inv : (q, ) → Inv( ), 0 = (q 0 , 0 ), and E contains all edges of the form ((q, ), g, σ, R, (q , )) such that ( , g, σ, R, ) ∈ E, and there exists a sequence q = q 0 , q 1 , . . ., q k , q k+1 = q of states of A such that (q 0 , , q 1 ), . . ., (q k−1 , , q k ), (q k , σ, q k+1 ) are transitions of A. We let F = F ×F T .
It follows from the definition of the parallel composition that Target Timed Automata Instances.Our main motivation is to consider real-time systems that are modeled naturally as A T .Typically, A has a large (discrete) state space, and T is a relatively small timed automaton, but with potentially complex time constraints involving several clocks.
It should be clear however that any timed automaton T can be seen as such a product as follows.Let A be a finite automaton identical to T except that guards and resets are removed; and for each pair of guard g and reset r, a fresh label σ g,r is defined and added to each edge with the said guard and reset.Now, define the timed automaton T as a single state with the same clocks as T , with one self-loop for each pair (g, r): such an edge is labeled by σ g,r , has guard g, and reset r.We have that T is isomorphic to A T .
An example is given in Figure 1 which shows how a simple scheduling setting can be modeled in this way.Here, the finite automaton is simple and only stores the mapping from machines to the tasks they are executing.Typically, if the machines or the processes executing tasks have internal states, these could be modeled in A as well without altering the timed automaton.

Learning-Based Compositional Model Checking Algorithm
We present an algorithm for model checking the untimed language L(A T ).
Although it is known that the untimed language is regular [1], the size of the corresponding finite automaton can be exponential so a direct computation is not efficient.We will be looking for a finite automaton H which is an overapproximation of T i.e.L(T ) ⊆ L(H).H stands for hypothesis made by the learning algorithm.We will in fact use the following lemma.

Lemma 1. For all finite automata A and H, and timed automata T on common alphabet Σ, if L(T ) ⊆ L(H), then L(A T ) ⊆ L(A H).
Finite automaton A: ready [1] done [1] ready [1] done [1] ready done [1] done [1] ready [1] done[0] Timed automaton T : 10,20] x1 := 0 Fig. 1.Timed automaton A T modeling a simple scheduling policy.The finite automaton A is given above and models a scheduler which schedules tasks (0 and 1) immediately when they become ready (ready[0] and ready [1]) on machines M0 and M1, using M0 first if it is available.The timed automaton T is below, here, as a network of the timed automata, and models interarrival and computation times for each task.
In other terms, by replacing the timed automaton T by its overapproximation, we obtain an overapproximation of the compound system in terms of untimed language.So if a linear property can be established on A H for an appropriate H, then the property also holds on the original system.
Let us present the above property as a verification rule.Assuming that we want to establish A T ⊆ Spec for some language Spec, we have Here, H serves as an assumption we make on T when verifying A; so as in Lemma 1, we can use H instead of T during model checking.The rule (1) is well known as the assume-guarantee verification rule [19], and has been used in model checking finite-state systems as well as timed automata [36].The assumption H can either be provided by the user, or automatically computed using automata learning as in [44].Intuitively, the model checking algorithm we present in this section is an application of [44] to our specific case.Figure 2 presents the overview of the algorithm.The membership queries of the learning algorithm are answered by the membership oracle; the equivalence query with conjecture H is answered by the inclusion oracle.When the conjecture H passes the inclusion check, we model-check H A. When this is successful, we stop and declare that the original system A T satisfies the specification.Otherwise, a counterexample w ∈ L(A H) \ Spec was found, and we use a realizability check to see whether w ∈ L(T ) (this is actually done by the membership oracle).If the answer is yes, then the counterexample is confirmed, and we stop.Otherwise, we inform the learning algorithm that w must be excluded, and continue the learning process.
Note that this algorithm can be used for any regular language specificaton Spec.We focus on safety properties in our experiments, presented next.

DFA Learning Algorithm
Membership Oracle w ∈ L(T )?

Experiments
We built a prototype implementation of our algorithm in Scala, using the TTT automata learning algorithm [30] from the learnlib library [31], and the associated automatalib for manipulating finite automata.We used the TChecker [29] model checker for implementing membership and inclusion oracles.For the latter, we complement H into H c , and check the emptiness of the parallel composition of T with H c .We use the NuSMV model checker for finite-state model checking.The implementation is available at https://github.com/osankur/compRTMC.
The overall input consists in an SMV file describing A, and of a TChecker timed automaton describing T .We use define expressions in SMV to define the labels Σ, while TChecker allows us to tag each transition with a label.
We compare our algorithm on a set of benchmarks with the model checkers Uppaal [8] and nuXmv which has a timed automata model checker [16].The Table 1.Model checking benchmarks.The column #Clk is the number of clocks; #C is the number of conjectures made by the DFA learning algorithm; #M is the number of membership queries; and |DFA| is the size of the final finite automaton learned.The safety specification holds on all models but those marked with *.In each cell, -means out of memory (8GB), and -means time out (30 minutes).

Compositional
Uppaal former implements a zone-based enumerative algorithm, while the latter uses predicate abstraction through IC3IA.We describe some of the benchmarks here.
The leader election protocol is a distributed protocol that can recover from crashes [22], extended here with periodic activation times and crash durations.The first four rows of Table 1 correspond to the case where one of the processes crashes when its internal state enters an error state.Internal states are modeled using Boolean circuits from from the synthesis competition (SYNTCOMP) benchmarks.The stateless version is more abstract: there is no internal state model, and crashes can occur at any time.The letters A, B, C, D indicate different timed automaton models.Uppaal was more efficient at solving the stateless version but failed in the full version due to the large discrete state space.The compositional algorithm was effective in verifying all instances but the D case which required a large finite automaton to be learned.One can notice an overhead of the compositional algorithm in the stateless version due to the computation of the finite automaton H.This was particularly an issue in the stateless D case where Uppaal could find a counterexample trace faster; nuXmv was not able to solve these instances.
The flooding time synchronization protocol (FTSP) is a leader election algorithm for multi-hop wireless sensor networks used for clock synchronization [41], and has been the subject of formal verification before [42,35].We consider the abstract model used in [48] for parameterized verification allowing one to verify the model for a large number of topologies.Our algorithm was faster for the model with 3 processes, although none of the tools scaled to 4 processes.
Overall, the experiments show that our algorithm is competitive with the state of the art tools; while it does not improve the performance uniformly on all considered benchmarks, it does allow us to solve instances that are not solvable by other tools, and sometimes to improve performance both compared to a zone-based approach (Uppaal) and SAT-based algorithms (nuXmv).

Preliminaries
Games.A finite safety game is a pair (G, Bad) where G is an LTS (Q E ∪Q C , q 0 , Σ, T ) with the set of states given as a partition Q E ∪Q C , namely, Environment states (Q E ), and Controller states (Q C ), and Bad ⊆ Q E ∪Q C is an objective.The game is played between two players, namely, Controller and Environment.At each state q ∈ Q C , Controller determines the successor by choosing an edge from q, and Environment determines the successor from states q ∈ Q E .A strategy for Controller (resp.Environment) maps finite runs of (Q E ∪Q C , q 0 , Σ, T ) ending in Q C (resp.Q E ) to an edge leaving the last state.A pair of strategies, one for each player, induces a unique infinite run from the initial state.A run is winning for Controller if it does not visit Bad; it is winning for Environment otherwise.A winning strategy for Controller is such that for all Environment strategies, the run induced by the two strategies is winning for Controller.Symmetrically, Environment has a winning strategy if for all Controller strategies, the induced run is winning.A strategy is positional (a.k.a.memoryless) if it only depends on the last state of the given run.
The parallel composition of (G, Bad) and a deterministic finite automaton F = (Q , q 0 , Σ, T , F ) on alphabet Σ is a new game whose LTS is G F in which the Controller states are Q C ×Q , the Environment states are Q E ×Q , and the objective is Bad×F (Notice that Controller thus has a safety objective).
Finite games were extended to the real-time setting as timed games [40,4].A timed game is a timed automaton T = (L E ∪L C , 0 , Σ, Inv, C, E, Bad) with the exception that its edges are labeled by Σ∪{ } (and not just by Σ as in the previous section), and the locations are partitioned as L E ∪L C into Environment locations and Controller locations.The semantics is defined by letting Environment choose the delay and the edge to be taken at locations L E , while Controller choose these from L C .Formally, a strategy for Environment (resp.Controller) is a function which associates a run that ends in L E (resp.L C ) to a pair of delay and an edge enabled from the state reached after the delay.A run is winning for Controller if it does not visit Bad.A Controller (resp.Environment) strategy is winning for objective Bad if for all Environment (resp.Controller) strategies, the induced run from the initial state is winning (resp.not winning) for Controller.A run r is compatible with a strategy S for Controller (resp.Environment) if there exists an Environment (resp.Controller) strategy S such that r is induced by S, S .
The parallel composition of a finite safety game (G, Bad) and a timed automaton T = (L, 0 , Σ, Inv, C, E, F ) on common alphabet Σ is the timed game G T where Controller locations are Q C ×L, and Environment locations are Q E ×L.
Positional strategies exist both for reachability and safety objectives in finite and timed games.Both finite and timed games are known to be determined for reachability and safety objectives.For instance, if Controller does not have a winning strategy for the safety objective, then Environment has a strategy ensuring the reachability of Bad [40,4].
Target Timed Game Instances.We consider controller synthesis problems described as timed games in the form of (G T , Bad×F ) where (G, Bad) is a finite safety game, and T is a timed automaton.In addition, we assume that G T is Controller-silent, defined as follows.
Definition 1.The timed game (G T , Bad×F ) on alphabet Σ is Controller-silent if 1) all Controller transitions are silent; and 2) all Controller locations in T are urgent, that is, an invariant ensures that no time can elapse.
Hence, we again separate the game G defined on a possibly large discrete state space while real-time constraints are separately given in T .The intuition behind the semantics is the following: because the game is played in G T and G is Controller-silent, the timed automaton model T is only used to disallow some of the Environment transitions according to real-time constraints, while Controller's actions are instantaneous responses to Environment's actions and thus are unaffected by the contraints of T .One can think of the timed automaton as some form of scheduler that schedules uncontrollable events in the system, so the order of these is determined by Environment.This assumption is restrictive; for instance, this excludes controller synthesis problems where the control strategy is to choose delays to execute some events.Nonetheless, this asymetric view enables a one-sided abstraction framework presented in the next section, where Environment transitions are approximated by a DFA.
An example is given in Figure 3.The finite game drawn here only shows the structure of the game.It has, in addition, integer variables rob x, rob y, obs x, obs y encoding the positions of the robot and of the obstacle, and a Boolean variable door to encode the state of the door.The state e belongs to Environment, which can move the obstacle in any direction, close or open the door, or let the robot move by going to state c.The state c belongs to Controller.All its leaving transitions are silent, and correspond to moving the robot in four directions.These transitions have preconditions, not shown in the figure, that check whether the moves are possible, and have updates that modify the state variables.The timed automaton, given as a network of three timed automata, determine the timings of these events.One can notice, for example, that the robot is moving faster than the obstacle, and that whenever the door is closed, it remains so for 10 time units.

One-Sided Abstraction
Thanks to the assumption we make on considered timed games, we show that by replacing T by a DFA H that is an overapproximation, we obtain an abstract game in which Controller strategies can be transferred to the original game.This is formalized in the next lemma (the proof is in the appendix).
Lemma 2. Consider a Controller-silent timed game (G T , Bad×F ), and a complete DFA H with accepting states F H , satisfying L(T ) ⊆ L(H).
-If Controller wins (G H, Bad×F H ), then it wins (G T , Bad×F ).
-If Environment wins (G T , Bad×F ), then it wins (G H, Bad×F H ), and has a strategy in (G H, Bad×F H ) whose all compatible runs have traces in L(T ).
Note that in the above lemma, it is crucial that the game is Controller-silent.In fact, if Controller could take edges that synchronize with T , then we may not be able to apply a strategy in G H to G T , since such a strategy may prescribe traces that are not accepted in T .Moreover, if Controller locations are not urgent, we would not know how to select the delays when mapping the strategy to G T .

Learning-Based Compositional Controller Synthesis Algorithm
We now present our compositional controller synthesis algorithm whose overview is given in Figure 4.The algorithm for controller synthesis is more involved than the model checking algorithm due to the alternating semantics for two players in games.It consists in two phases that alternate: the overapproximation phase, and the underapproximation phase.Each phase runs a DFA learning algorithm which is interrupted when we switch to the other phase, and continued when we switch back, until a decision is made.Together, both phases maintain two approximations, H and H, such that L(H) ⊆ L(T ) ⊆ L(H).
The objective of the overapproximation phase is to attempt to learn a DFA H satisfying L(T ) ⊆ H, and such that Controller wins in G H. The learning algorithm uses membership and inclusion oracles just like in Section 2.2.Once such a candidate DFA H is found, the synthesis oracle checks, using finite-state techniques, whether Controller has a winning strategy in G H. If this is the case, we stop and conclude that Controller wins in G T by Lemma 2. Otherwise, Environment has a winning strategy S in this game; and we switch to the underapproximation phase.
The goal of the underapproximation is to check whether the given Environment strategy S can be proved to be spurious.Intuitively, we would like to check whether L((G H) S ) ⊆ L(T ) and reject if this is the case.In fact, by Lemma 2, we know that a winning Environment strategy in G T implies that there is such a strategy S.This is the source of incompleteness of our algoritithm, since this condition is necessary but not sufficient for Environment to win; that is, the condition does not guarantee that Environment actually wins in G T .
While L((G H) S ) ⊆ L(T ) can be checked with a timed automaton model checker (see Checking Containment below), this would mean exploring the large state space due to G. Since we want to avoid using timed automata model checkers on such large instances, we rather learn an underapproximation H of L(T ) using the membership and containment oracles, and use a finite-state model checker to check L((G H) S ) ⊆ L(H).Note that although the learning process does require inclusion checks of the form L(H) ⊆ L(T ), this check is feasible with a timed automaton model checker since H is typically much smaller than G.If the above check passes, then we reject the instance, that is, we declare the system not controllable.Otherwise, some trace w appears in L((G H) S ) but not in L(H).If w ∈ L(T ), then we require that w be included in H, and continue the learning process.Otherwise, S is not valid since it induces w which is not in L(T ).So we interrupt the current phase and switch back to the overapproximation phase requiring w to be removed from H.
Membership and inclusion oracles are implemented with a timed automata model checker.Here, the synthesis oracle can be any finite game solver; we just need the capability of computing the controlled system (G H) S .Such a system is finite-state, so the strategy containment oracle can be implemented using a finitestate model checker (since H is deterministic and can thus be complemented).It remains to explain how the containment oracle is implemented.
Checking Containment L(H) ⊆ L(T ).First, notice that, even with determinism assumptions on T , the untimed language of the timed automaton complement of T is not the complement of L(T ).To see this, consider a timed automaton with a single state which is both initial and accepting, a single clock x, and a self-loop with guard x = 1, labeled by σ.Then, both L(T ) and L(T c ) are the language σ * where T c denotes the timed automaton complement.
Nevertheless, assuming the label-determinism of T , this check can be done by a simple adaptation of a zone-based exploration algorithm, as follows.Let us assume that accepting states are reachable from all states of H, which can be ensured by a preprocessing step.We start exploring the timed automaton H T using a zone-based exploration algorithm [11].Consider any search node ((q H , q T ), Z) encountered during the exploration algorithm, reachable by the trace w, where (q H , q T ) is a location of H T , and Z a zone.The exploration algorithm generates all available successors for σ ∈ Σ.We make the following additional check: If there is σ ∈ Σ such that q H has a successor by σ , but not T (either because there is no such edge, or because the guard of the unique edge labeled by σ is not satisfied by Z), then we stop and return the trace wσ ∈ L(H) \ L(T ) as a counterexample to containment.If no such label can be found, the zone-based exploration will terminate and the algorithm confirms the containment.
As an alternative, one can use testing such as the Wp-method [37] to establish the containment, as it is customary in DFA learning.In this case, the answer is approximate in the sense that the conformance test can fail to detect that containment does not hold.However, this does not affect the soundness of the overall algorithm since it can only increase false negatives.

Experiments
Our tool accepts instances G T where G is given as a Verilog module, and T as a TChecker timed automaton.Some of the inputs of the Verilog module are uncontrollable (chosen by Environment), some others are controllable (chosen by Controller).We use outputs of the Verilog module to define the synchronization labels Σ; while TChecker models tag each transition with such a label.Membership, inclusion, and containment queries are answered by TChecker.For the synthesis oracle, we used the game solver Abssynthe [12].Abssynthe's input format is the and-inverter graphs format (AIG).For translating Verilog modules to AIG circuits, we use berkeley-abc and yosys.Abssynthe is able to compute the winning strategy S for the winning player; it also computes the system controlled by S in this case as an AIG circuit.The strategy containment oracle is implemented using NuSMV; since H is deterministic, one can complement it, and check whether the intersection with (G H) S is empty.
The tool uses two Java threads to implement both learning phases, which are interrupted and continued while switching phases.Note that the very first learning step of H and H can be parallelized since the first underapproximation conjecture H does not depend on S.
We evaluate our algorithm with two classes of benchmarks.The only tool to which we compare is Uppaal-TIGA [6] since Synthia [45] is not available anymore, and we are not aware of any other timed game solver.
In the scheduling benchmarks, there are two sporadic tasks that arrive nondeterministically, but constrained by the timed automaton.The controller must schedule these using two machines which have internal states, modeled either by a simple 8-bit counter, or by a genbuf circuit from the SYNTCOMP database.The scheduling duration depends on the internal state: some states require executing two external tasks, some others require executing three.The external task has a nondeterministic duration constrained by the timed automaton.The internal states change when a task is finalized.The controller loses if all machines are busy upon the arrival of a new task, or if it schedules a task on a busy machine.Uppaal TIGA was able to solve the counter models since they induce a smaller state space, but failed at the genbuf models.The compositional algorithm could efficiently handle these models.Uppaal was generally able to determine very quickly when the model is not controllable by finding a small counterstrategy, while the compositional algorithm a overhead: it had to learn H and H before it can find and check the counterexample.
In the planning benchmarks, a robot and an obstacle is moving in a 6×6 grid (or 9×9 for the stateless case).Each agent can decide to move to an adjacent cell when they are scheduled, and the scheduling times are determined by a timed automaton.The goal of the robot is to avoid the obstacles.In the genbuf case, there are moreover internal states that can cause a glitch and prevent the agents from performing their moves, depending on their states.Uppaal TIGA was not able to manage the large state space unlike the compositional algorithm in this case, but both were able to solve the stateless case.

Conclusion
Related Works Perhaps the most closely related approach to our compositional model checking algorithm is trace abstraction refinement [26].This was originally applied to program verification, and consists in building a network of finite automata that recognizes the program's control flow paths that are infeasible.One refines this language by model checking the control flow graph intersected with the complement of the automaton.Thus, the semantics of the variables of the program are abstractly represented by the finite automaton.This idea was applied to timed automata as well [54,15].However, the generalization of the counterexamples which ensures convergence turns out to be less effective in timed automata.We attempted at obtaining an implementation, but could only confirm the poor performance for model checking timed automata as in [15] (we do not include these results here).It might be that simpler graph structures such as control flow graphs of programs are necessary for this approach to scale; further investigation is also necessary to study better generalization methods.
The learning-based compositional reasoning approach of [44] is also related to counter-example guided abstraction refinement (CEGAR) [18].In fact, the automata learning algorithm builds an overapproximation of one of the components, and refines it as needed, guided by counterexamples.The difference is that, instead of using predicates, one uses automata to represent the overapproximation.A discussion can also be found in [44].
Learning algorithms for event-recording automata, a subset of timed automata were studied in [25].The algorithm of [44] was extended for these automata in [36].In the context of parameter synthesis with learning, parameterized systems were seen as a parallel composition of a non-parameterized component, and a parameterized component in [2].
Other approaches targeting the formal verification of real-time systems with large discrete state spaces include encodings of timed automata semantics in Boolean logic include [34,49].An extension of and-inverter graphs were used in [20] that uses predicates to represent the state space of linear hybrid automata.
The abstract interpretation of games were studied in [28] that presents a theory allowing one to define under-and over-approximations.Abstraction-refinement algorithms based on counterexamples were given in [27,21].These ideas were applied to timed games in [45].Several abstraction-refinement and compositional algorithms were given in [12,13] for solving finite-state games given as Boolean circuits.The synthesis competition gathers every year researchers who present their game solvers [32,33].
Perspectives The algorithm we presented builds finite-state abstractions of realtime constraints, that it represents as DFA.The approach is well adapted when the interaction alphabet between A and T is small; this is the case, for instance, for distributed systems where the time constraints are used to describe the approximate period with which each process communicates with its neighbors; so the alphabet contains only a few symbols per process.Some of the benchmarks we considered are models of such systems.The approach is less convenient for timeintensive systems such as, say, job shop scheduling problems where a separate alphabet symbol is needed for each task.
As future work, we would like to understand when various abstraction schemes are efficient among the approach presented here, the predicate-abstraction approach, and zone-based state-space exploration.Currently, all algorithms fail in some benchmarks.Understanding the strengths of each algorithm might help designing a uniformly better solution.Currently, we can only verify linear properties; one might verify branching-time properties by learning automata with a stronger notion of equivalence such as bisimulation.In fact, an important limitation is due to learning being slow for large alphabets.Our setting could be extended to deal with large or symbolic alphabets e.g.[38,39].
For synthesis, our setting is currently restricted by the abstractions we use since when the algorithm rejects the instace, we cannot conclude whether the system is controllable or not.Using both the under-and overapproximations within the finite-state synthesis, for instance, using the three-valued abstraction approach [21] might allow us to render the approach complete, and to consider a larger class of timed games such as those that allow Controller to select nonzero delays.

Fig. 2 .
Fig. 2. The learning-based compositional model checking algorithm.The box on the left is a DFA learning algorithm, while the oracles answering the queries of the learning algorithm are shown on the right and correspond to the teacher.

z ≥ 2 ,Fig. 3 .
Fig. 3.The sketch of a timed game G T modelling a planning problem.The finite game models a robot and an obstacle moving in a grid world as shown on top right.The cells r and o show, respectively, the initial positions of the robot and the obstacle.The robot cannot cross walls (shown in thick segments), and can only cross the door if it is open.Here four silent transitions were marked with r , l , u, d for readability; in reality, these are all labeled by .

Fig. 4 .
Fig. 4. The learning-based compositional controller synthesis algorithm for the input timed game G T , with G a Controller-silent finite game, and T a label-deterministic timed automaton.Two automata learning algorithms run in parallel to learn underand over-approximations H and H such that H ⊆ L(T ) ⊆ H.

Table 2 .
The results of the controller synthesis experiments.The columns #Clks, #C, #M respectively show the number of clocks in the model, the numbers of conjectures and membership queries made by the compositional algorithm; while |H|, |H| show the sizes of the DFAs learned by the two phases.