
1 Introduction

Active automata learning is a class of methods to infer an automaton recognizing an unknown target language \(\mathcal {L}_{\textrm{tgt}}\subseteq \varSigma ^*\) through finitely many queries to a teacher. The L* algorithm [8], the best-known active DFA learning algorithm, infers the minimum DFA recognizing \(\mathcal {L}_{\textrm{tgt}}\) using membership and equivalence queries. In a membership query, the learner asks if a word \(w \in \varSigma ^*\) is in the target language \(\mathcal {L}_{\textrm{tgt}}\), which is used to obtain enough information to construct a hypothesis DFA \(\mathcal {A}_{\textrm{hyp}}\). Using an equivalence query, the learner checks if the hypothesis \(\mathcal {A}_{\textrm{hyp}}\) recognizes the target language \(\mathcal {L}_{\textrm{tgt}}\). If \(\mathcal {L}(\mathcal {A}_{\textrm{hyp}}) \ne \mathcal {L}_{\textrm{tgt}}\), the teacher returns a counterexample \( cex \in \mathcal {L}_{\textrm{tgt}}\triangle \mathcal {L}(\mathcal {A}_{\textrm{hyp}})\) differentiating the target language and the current hypothesis. The learner uses \( cex \) to update \(\mathcal {A}_{\textrm{hyp}}\) to classify \( cex \) correctly. Such a learning algorithm has been combined with formal verification, e. g., for testing [22, 24, 26, 28] and controller synthesis [31].

Most of the DFA learning algorithms rely on the characterization of regular languages by Nerode’s congruence. For a language \(\mathcal {L}\), words \(p\) and \(p'\) are equivalent if for any extension \(s\), \(p\cdot s\in \mathcal {L}\) if and only if \(p' \cdot s\in \mathcal {L}\). It is well known that if \(\mathcal {L}\) is regular, such an equivalence relation has finitely many classes, corresponding to the locations of the minimum DFA recognizing \(\mathcal {L}\) (the Myhill-Nerode theorem; see, e. g., [18]). Moreover, for any regular language \(\mathcal {L}\), there is a finite set \(S\) of extensions such that \(p\) and \(p'\) are equivalent if and only if for any \(s\in S\), \(p\cdot s\in \mathcal {L}\) if and only if \(p' \cdot s\in \mathcal {L}\). Therefore, one can learn the minimum DFA by learning such a finite set \(S\) of extensions and the finitely many classes induced by Nerode’s congruence.
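To make this concrete, the following Python sketch (a toy illustration of ours, not part of the formal development; the language and oracle are hypothetical) approximates Nerode’s congruence with a finite set of extensions and a membership oracle:

```python
# A toy illustration (not from the paper): approximating Nerode's
# congruence by a finite set of extensions S and a membership oracle.

def equivalent_under(p1, p2, suffixes, member):
    """p1 and p2 are equivalent under S iff p1+s and p2+s agree for all s in S."""
    return all(member(p1 + s) == member(p2 + s) for s in suffixes)

# Hypothetical target language over {'a', 'b'}: an even number of 'a's.
member = lambda w: w.count('a') % 2 == 0

S = ['', 'a']
assert equivalent_under('', 'aa', S, member)     # same parity class
assert not equivalent_under('', 'a', S, member)  # separated by S
```

Two prefixes are merged exactly when no extension in \(S\) separates them; enlarging \(S\) can only refine the resulting partition.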

Fig. 1. Illustration of observation tables in the L* algorithm for DFA learning (Fig. 1b) and our algorithm for DTA learning (Fig. 1d)

The L* algorithm learns the minimum DFA recognizing the target language \(\mathcal {L}_{\textrm{tgt}}\) using a 2-dimensional array called an observation table. Figure 1b illustrates observation tables. The rows and columns of an observation table are indexed with finite sets of words \(P\) and \(S\), respectively. Each cell indexed by \((p, s) \in P\times S\) shows whether \(p\cdot s\in \mathcal {L}_{\textrm{tgt}}\). The column indices \(S\) are the current extensions approximating Nerode’s congruence. The L* algorithm increases \(P\) and \(S\) until: 1) the equivalence relation defined by \(S\) converges to Nerode’s congruence and 2) \(P\) covers all the classes induced by the congruence. The equivalence between \(p, p'\in P\) under \(S\) can be checked by comparing the rows in the observation table indexed with \(p\) and \(p'\). For example, Fig. 1b shows two prefixes that are deemed equivalent under the current extensions but are distinguished once a further extension is added to \(S\). The refinement of \(P\) and \(S\) is driven by certain conditions validating the DFA construction and by the counterexamples obtained from equivalence queries.
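The effect of adding a column to \(S\) can be sketched as follows (a toy language of ours, not the one in Fig. 1b):

```python
# A toy language (ours, not the one of Fig. 1b): words ending in 'ab'.
member = lambda w: w.endswith('ab')

def row(p, S):
    """The row of prefix p in the observation table, as a tuple over S."""
    return tuple(member(p + s) for s in S)

assert row('a', ['']) == row('b', [''])            # deemed equivalent so far
assert row('a', ['', 'b']) != row('b', ['', 'b'])  # split by the new column 'b'
```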

Timed words are extensions of conventional words with real-valued dwell time between events. Timed languages, sets of timed words, are widely used to formalize real-time systems and their properties, e. g., for formal verification. Among various formalisms representing timed languages, timed automata (TAs) [4] are among the most widely used. A TA is an extension of an NFA with finitely many clock variables representing timing constraints. Figure 1c shows an example.

Despite its practical relevance, learning algorithms for TAs are only available for limited subclasses of TAs, e. g., real-time automata [6, 7], event-recording automata [15, 16], event-recording automata with unobservable reset [17], and one-clock deterministic TAs [5, 30]. Timing constraints representable by these classes are limited, e. g., by restricting the number of clock variables or by restricting the edges where a clock variable can be reset. Such restriction simplifies the inference of timing constraints in learning algorithms.

Contributions. In this paper, we propose an active learning algorithm for deterministic TAs (DTAs). The languages recognizable by DTAs are called recognizable timed languages [21]. Our strategy is as follows: first, we develop a Myhill-Nerode style characterization of recognizable timed languages; then, we extend the L* algorithm for recognizable timed languages using the similarity of the Myhill-Nerode style characterization.

Due to the continuity of dwell time in timed words, it is hard to characterize recognizable timed languages by a Nerode-style congruence between timed words. For example, for the DTA in Fig. 1c, any \(\tau , \tau ' \in [0,1)\) satisfying \(\tau < \tau '\) can be distinguished by a suitable suffix, since reading it after \(\tau \) leads to \(l_0\) while reading it after \(\tau '\) leads to \(l_1\). Therefore, such a congruence can have infinitely many classes.

Instead, we define a Nerode-style congruence between sets of timed words called elementary languages [21]. An elementary language is a timed language defined by a word with a conjunction of inequalities constraining the time difference between events. We also use an equality constraint, which we call a renaming equation, to define the congruence. Intuitively, a renaming equation bridges the time differences in an elementary language and the clock variables in a TA. We note that there can be multiple renaming equations showing the equivalence of two elementary languages.

Example 1

Let \(p_1\) and \(p_2\) be elementary languages. For the DTA in Fig. 1c, \(p_1\) and \(p_2\) are equivalent with the renaming equation \(\tau ^1_0 + \tau ^1_1 = \tau ^2_1 + \tau ^2_2\) because for any \(w_1 \in p_1\) and \(w_2 \in p_2\): 1) we reach \(l_0\) after reading either of \(w_1\) and \(w_2\) and 2) the values of \(c\) after reading \(w_1\) and \(w_2\) are \(\tau ^1_0 + \tau ^1_1\) and \(\tau ^2_1 + \tau ^2_2\), respectively.

We characterize recognizable timed languages by the finiteness of the equivalence classes defined by the above congruence. We also show that for any recognizable timed language, there is a finite set \(S\) of elementary languages such that the equivalence of any prefixes can be checked by the extensions \(S\).

By using the above congruence, we extend the L* algorithm for DTAs. The high-level idea is the same as the original L* algorithm: 1) the learner makes membership queries to obtain enough information to construct a hypothesis DTA \(\mathcal {A}_{\textrm{hyp}}\) and 2) the learner makes an equivalence query to check if \(\mathcal {A}_{\textrm{hyp}}\) recognizes the target language. The main difference is in the cells of an observation table. Since the concatenation \(p\cdot s\) of an index pair \((p, s) \in P\times S\) is not a timed word but a set of timed words, its membership is not defined as a Boolean value. Instead, we introduce the notion of symbolic membership and use it as the value of each cell of the timed observation table. Intuitively, the symbolic membership is the constraint representing the subset of \(p\cdot s\) included in \(\mathcal {L}_{\textrm{tgt}}\). Such a constraint can be constructed by finitely many (non-symbolic) membership queries.

Example 2

Figure 1d illustrates a timed observation table. The equivalence between \(p_1, p_2 \in P\) under \(S\) can be checked by comparing the cells in the rows indexed with \(p_1\) and \(p_2\) with renaming equations. In the rows indexed by \(p_1\) and \(p_2\), the constraints coincide after replacing \(\tau _0 + \tau _1\) with \(\tau _1 + \tau _2\) and vice versa. Thus, \(p_1\) and \(p_2\) are equivalent with the current extensions \(S\).

Once the learner obtains enough information, it constructs a DTA via the monoid-based representation of recognizable timed languages [21]. We show that for any recognizable timed language, our algorithm terminates and returns a DTA recognizing it. We also show that the number of necessary queries is polynomial in the size of the equivalence class defined by the Nerode-style congruence if symbolic membership queries are allowed and, otherwise, exponential in it. Moreover, if symbolic membership queries are not allowed, the number of necessary queries is at most doubly exponential in the number of clock variables and singly exponential in the number of locations of a DTA recognizing the target language. This worst-case complexity is the same as that of the one-clock DTA learning algorithm in [30].

We implemented our DTA learning algorithm in a prototype library LearnTA. Our experimental results show that it is efficient enough for some benchmarks taken from practical applications, e. g., the FDDI protocol. This suggests the practical relevance of our algorithm.

The following summarizes our contributions.

  • We characterize recognizable timed languages by a Nerode-style congruence.

  • Using the above characterization, we give an active DTA learning algorithm.

  • Our experimental results suggest its practical relevance.

Related Work. Among various characterizations of timed languages [4, 10,11,12,13, 21], the characterization by recognizability [21] is the closest to our Myhill-Nerode-style characterization. Both of them use finite sets of elementary languages for characterization. Their main difference is that [21] proposes a formalism to define a timed language by relating prefixes by a morphism, whereas we propose a technical gadget to define an equivalence relation over timed words with respect to suffixes using symbolic membership. This difference makes our definition suitable for an L*-style algorithm, where the original L* algorithm is based on Nerode’s congruence, which defines an equivalence relation over words with respect to suffixes using conventional membership.

As we have discussed so far, active TA learning [5, 15,16,17, 30] has been studied mostly for limited subclasses of TAs, where the number of clock variables or the set of clocks reset at each edge is fixed. In contrast, our algorithm infers both. Another difference is in the technical strategy. Most of the existing algorithms are related to the active learning of symbolic automata [9, 14], enhancing the languages with clock valuations. In contrast, we take a more semantic approach via the Nerode-style congruence.

Another recent direction is to use genetic algorithms to infer TAs in passive [27] or active [3] learning. This differs from our learning algorithm, which is based on a formal characterization of timed languages. Moreover, these algorithms may not converge to the correct automaton due to the heuristic nature of genetic algorithms.

2 Preliminaries

For a set \(X\), its powerset is denoted by \(\mathcal {P} ({X})\). We denote the empty sequence by \(\varepsilon \). For sets \(X\) and \(Y\), we denote their symmetric difference by \(X \triangle Y = \{x \mid x \in X \wedge x \notin Y\} \cup \{y \mid y \in Y \wedge y \notin X\}\).
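For instance, the symmetric difference coincides with Python’s built-in set operator:

```python
# Symmetric difference X △ Y on small example sets.
X, Y = {1, 2, 3}, {3, 4}
assert X ^ Y == {1, 2, 4}            # Python's X △ Y
assert X ^ Y == (X - Y) | (Y - X)    # matches the definition above
```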

2.1 Timed Words and Timed Automata

Definition 3

(timed word). For a finite alphabet \(\varSigma \), a timed word \(w\) is an alternating sequence \(\tau _0 a_1 \tau _1 a_2 \dots a_{n} \tau _{n}\) of \(\varSigma \) and \({\mathbb {R}}_{\ge 0}\). The set of timed words over \(\varSigma \) is denoted by \(\mathcal {T}(\varSigma )\). A timed language \(\mathcal {L}\subseteq \mathcal {T}(\varSigma )\) is a set of timed words.

For timed words \(w=\tau _0 a_1 \tau _1 a_2 \dots a_{n} \tau _{n}\) and \(w' = \tau '_0 a'_1 \tau '_1 a'_2 \dots a'_{n'} \tau '_{n'}\), their concatenation \(w \cdot w'\) is \(w \cdot w' = \tau _0 a_1 \tau _1 a_2 \dots a_{n} (\tau _{n} + \tau '_0) a'_1 \tau '_1 a'_2 \dots a'_{n'} \tau '_{n'}\). The concatenation is naturally extended to timed languages: for a timed word \(w\) and timed languages \(\mathcal {L}, \mathcal {L}'\), we let \(w\cdot \mathcal {L}= \{ w\cdot w_{\mathcal {L}} \mid w_{\mathcal {L}} \in \mathcal {L}\}\), \(\mathcal {L}\cdot w= \{ w_{\mathcal {L}} \cdot w\mid w_{\mathcal {L}} \in \mathcal {L}\}\), and \(\mathcal {L}\cdot \mathcal {L}' = \{ w_{\mathcal {L}} \cdot w_{\mathcal {L}'} \mid w_{\mathcal {L}} \in \mathcal {L}, w_{\mathcal {L}'} \in \mathcal {L}'\}\). For timed words \(w\) and \(w'\), \(w\) is a prefix of \(w'\) if there is a timed word \(w''\) satisfying \(w\cdot w'' = w'\). A timed language \(\mathcal {L}\) is prefix-closed if for any \(w\in \mathcal {L}\), \(\mathcal {L}\) contains all the prefixes of \(w\).
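Concatenation of timed words can be sketched under an assumed list encoding (ours) of \(\tau _0 a_1 \tau _1 \dots a_n \tau _n\) as \([\tau _0, a_1, \tau _1, \dots , a_n, \tau _n]\):

```python
# Assumed encoding (ours): τ0 a1 τ1 ... an τn as [τ0, a1, τ1, ..., an, τn].

def concat(w1, w2):
    # The trailing dwell time of w1 merges with the leading dwell time of w2.
    return w1[:-1] + [w1[-1] + w2[0]] + w2[1:]

assert concat([0.5, 'a', 0.25], [0.25, 'b', 1.0]) == [0.5, 'a', 0.5, 'b', 1.0]
```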

For a finite set \(C\) of clock variables, a clock valuation is a function \(\nu \in ({\mathbb {R}}_{\ge 0})^{C}\). We let \(\textbf{0}_{C}\) be the clock valuation satisfying \(\textbf{0}_{C}(c) = 0\) for any \(c\in C\). For \(\nu \in ({\mathbb {R}}_{\ge 0})^{C}\) and \(\tau \in {\mathbb {R}}_{\ge 0}\), we let \(\nu + \tau \) be the clock valuation satisfying \((\nu +\tau )(c)=\nu (c)+\tau \) for any \(c\in C\). For \(\nu \in ({\mathbb {R}}_{\ge 0})^{C}\) and \(\rho \subseteq C\), we let \(\nu [\rho {:}{=}0]\) be the clock valuation satisfying \((\nu [\rho {:}{=}0])(c)=0\) for \(c\in \rho \) and \((\nu [\rho {:}{=}0])(c)=\nu (c)\) for \(c\notin \rho \). We let \(\mathcal {G}_{C}\) be the set of constraints defined by a finite conjunction of inequalities \(c\bowtie d\), where \(c\in C\), \(d\in {\mathbb {N}}\), and \({\bowtie } \in \{>, \ge , \le ,<\}\). We let \(\mathcal {C}_{C}\) be the set of constraints defined by a finite conjunction of inequalities \(c\bowtie d\) or \(c- c' \bowtie d\), where \(c, c' \in C\), \(d\in {\mathbb {N}}\), and \({\bowtie } \in \{>, \ge , \le ,<\}\). We denote \(\bigwedge \emptyset \) by \(\top \). For \(\nu \in ({\mathbb {R}}_{\ge 0})^{C}\) and \(\varphi \in \mathcal {C}_{C}\cup \mathcal {G}_{C}\), we denote \(\nu \models \varphi \) if \(\nu \) satisfies \(\varphi \).
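The valuation operations \(\nu + \tau \) and \(\nu [\rho {:}{=}0]\) can be sketched as follows, encoding a valuation as a dict from clock names to values (an assumption of ours):

```python
# Clock valuations as dicts from clock names to non-negative reals (ours).

def delay(nu, tau):
    """The valuation nu + tau: every clock advances by tau."""
    return {c: v + tau for c, v in nu.items()}

def reset(nu, rho):
    """The valuation nu[rho := 0]: clocks in rho are reset to 0."""
    return {c: (0.0 if c in rho else v) for c, v in nu.items()}

nu = reset(delay({'c1': 0.0, 'c2': 0.0}, 1.5), {'c1'})
assert nu == {'c1': 0.0, 'c2': 1.5}
```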

Definition 4

(timed automaton). A timed automaton (TA) is a 7-tuple \((\varSigma ,L,l_0,C,I,\varDelta ,F)\), where: \(\varSigma \) is the finite alphabet, \(L\) is the finite set of locations, \(l_0\in L\) is the initial location, \(C\) is the finite set of clock variables, \(I:L\rightarrow \mathcal {C}_{C}\) is the invariant of each location, \(\varDelta \subseteq L\times \mathcal {G}_{C}\times (\varSigma \cup \{\varepsilon \}) \times \mathcal {P} ({C})\times L\) is the set of edges, and \(F\subseteq L\) is the accepting locations.

A TA is deterministic if: 1) for any \(a\in \varSigma \) and any distinct \((l, g, a, \rho , l'), (l, g', a, \rho ', l'') \in \varDelta \), \(g\wedge g'\) is unsatisfiable, and 2) for any \((l, g, \varepsilon , \rho , l') \in \varDelta \), the set of clock valuations satisfying \(g\wedge I ({l})\) is at most a singleton. Figure 1c shows a deterministic TA (DTA).

The semantics of a TA is defined by a timed transition system (TTS).

Definition 5

(semantics of TAs). For a TA \(\mathcal {A}=(\varSigma ,L,l_0,C,I,\varDelta ,F)\), the timed transition system (TTS) is a 4-tuple \(\mathcal {S}= (Q, q_{0}, Q_{F}, {\rightarrow })\), where: \(Q= L\times ({\mathbb {R}}_{\ge 0})^{C}\) is the set of (concrete) states, \(q_{0}= (l_0, \textbf{0}_{C})\) is the initial state, \(Q_{F}= \{ (l, \nu ) \in Q\mid l\in F\}\) is the set of accepting states, and \({\rightarrow }\subseteq Q\times Q\) is the transition relation consisting of the following.

  • For each \((l, \nu ) \in Q\) and \(\tau \in {\mathbb {R}_{>0}}\), we have \((l, \nu ){\mathop {\rightarrow }\limits ^{\tau }} (l, \nu + \tau )\) if \(\nu + \tau ' \models I ({l})\) holds for each \(\tau ' \in [0, \tau )\).

  • For each \((l, \nu ), (l', \nu ') \in Q\), \(a\in \varSigma \), and \((l,g,a,\rho ,l')\in \varDelta \), we have \((l, \nu ) {\mathop {\rightarrow }\limits ^{a}} (l', \nu ')\) if we have \(\nu \models g\) and \(\nu '= \nu [\rho {:}{=}0]\).

  • For each \((l, \nu ), (l', \nu ') \in Q\), \(\tau \in {\mathbb {R}_{>0}}\), and \((l, g, \varepsilon , \rho , l') \in \varDelta \), we have \((l, \nu ) {\mathop {\rightarrow }\limits ^{\varepsilon , \tau }} (l', \nu ' + \tau )\) if we have \(\nu \models g\), \(\nu '= \nu [\rho {:}{=}0]\), and \(\forall \tau ' \in [0, \tau ).\, \nu ' + \tau ' \models I ({l'})\).

A run of a TA \(\mathcal {A}\) is an alternating sequence \(q_0, {\rightarrow }_1,q_1,\dots , {\rightarrow }_n,q_n\) of \(q_i \in Q\) and \({\rightarrow }_i \in {\rightarrow }\) satisfying \(q_{i-1} \rightarrow _i q_{i}\) for any \(i \in \{1,2, \dots ,n\}\). A run \(q_0, {\rightarrow }_1,q_1,\dots , {\rightarrow }_n,q_n\) is accepting if \(q_n \in Q_{F}\). Given such a run, the associated timed word is the concatenation of the labels of the transitions. The timed language \(\mathcal {L}(\mathcal {A})\) of a TA \(\mathcal {A}\) is the set of timed words associated with some accepting run of \(\mathcal {A}\).
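The TTS semantics can be exercised on a toy one-clock DTA (ours, not the DTA of Fig. 1c) with the single edge \((l_0, c\le 1, a, \{c\}, l_0)\), no invariants, and accepting location \(l_0\):

```python
# A toy one-clock DTA (ours): edge (l0, c <= 1, 'a', {c}, l0), F = {l0}.
# It accepts timed words whose consecutive 'a's are at most 1 apart.

def accepts(word):                       # word = [τ0, a1, τ1, ..., an, τn]
    c = 0.0                              # the single clock, initially 0
    for i in range(0, len(word) - 1, 2):
        c += word[i]                     # delay transition
        if word[i + 1] != 'a' or c > 1:  # guard c <= 1 of the unique edge
            return False
        c = 0.0                          # reset {c}
    return True                          # l0 is accepting

assert accepts([0.5, 'a', 0.8, 'a', 0.0])
assert not accepts([0.5, 'a', 1.2, 'a', 0.0])
```

Each iteration alternates one delay transition with one discrete transition, mirroring the alternating run structure above.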

2.2 Recognizable Timed Languages

Here, we review the recognizability [21] of timed languages.

Definition 6

(timed condition). For a set \(\mathbb {T}= \{\tau _0,\tau _1,\dots ,\tau _{n}\}\) of ordered variables, a timed condition \(\varLambda \) is a finite conjunction of inequalities \(\mathbb {T}_{i,j} \bowtie d\), where \(\mathbb {T}_{i,j} = \sum _{{k = i}}^{j} \tau _k\), \({\bowtie } \in \{>, \ge , \le ,<\}\), and \(d\in {\mathbb {N}}\).

A timed condition \(\varLambda \) is simple if for each \(\mathbb {T}_{i,j}\), \(\varLambda \) contains \(d< \mathbb {T}_{i,j} < d+ 1\) or \(d\le \mathbb {T}_{i,j} \wedge \mathbb {T}_{i,j} \le d\) for some \(d\in {\mathbb {N}}\). A timed condition \(\varLambda \) is canonical if we cannot strengthen or add any inequality \(\mathbb {T}_{i,j} \bowtie d\) to \(\varLambda \) without changing its semantics.

Definition 7

(elementary language). A timed language \(\mathcal {L}\) is elementary if there are \(u = a_1,a_2,\dots ,a_n\in \varSigma ^*\) and a timed condition \(\varLambda \) over \(\{\tau _0,\tau _1,\dots ,\tau _{n}\}\) satisfying \(\mathcal {L}= \{\tau _0 a_1 \tau _1 a_2 \dots a_{n} \tau _{n} \mid \tau _0,\tau _1,\dots ,\tau _{n} \models \varLambda \}\), and the set of valuations of \(\{\tau _0,\tau _1,\dots ,\tau _{n}\}\) defined by \(\varLambda \) is bounded. We denote such \(\mathcal {L}\) by \((u, \varLambda )\). We let \(\mathcal {E} (\varSigma )\) be the set of elementary languages over \(\varSigma \).

For \(p, p' \in \mathcal {E} (\varSigma )\), \(p\) is a prefix of \(p'\) if for any \(w' \in p'\), there is a prefix \(w\in p\) of \(w'\), and for any \(w\in p\), there is \(w' \in p'\) such that \(w\) is a prefix of \(w'\). For any elementary language, the number of its prefixes is finite. For a set of elementary languages, prefix-closedness is defined based on the above definition of prefixes.

An elementary language \((u, \varLambda )\) is simple if there is a simple and canonical timed condition \(\varLambda '\) satisfying \((u, \varLambda )= (u, \varLambda ')\). We let \(\mathcal{S}\mathcal{E} (\varSigma )\) be the set of simple elementary languages over \(\varSigma \). Without loss of generality, we assume that for any \((u, \varLambda )\in \mathcal{S}\mathcal{E} (\varSigma )\), \(\varLambda \) is simple and canonical. We remark that no DTA can distinguish timed words in a simple elementary language, i. e., for any \(p\in \mathcal{S}\mathcal{E} (\varSigma )\) and any DTA \(\mathcal {A}\), we have either \(p\subseteq \mathcal {L}(\mathcal {A})\) or \(p\cap \mathcal {L}(\mathcal {A}) = \emptyset \). We can decide which case holds by taking any \(w\in p\) and checking if \(w\in \mathcal {L}(\mathcal {A})\).

Definition 8

(immediate exterior). Let \(\mathcal {L}= (u, \varLambda )\) be an elementary language. For \(a\in \varSigma \), the discrete immediate exterior \(\textrm{ext}^{a} (\mathcal {L})\) of \(\mathcal {L}\) is \(\textrm{ext}^{a} (\mathcal {L}) = (u \cdot a, \varLambda \cup \{\tau _{ |u| + 1 } = 0\})\). The continuous immediate exterior \(\textrm{ext}^{t} (\mathcal {L})\) of \(\mathcal {L}\) is \(\textrm{ext}^{t} (\mathcal {L}) = (u, \varLambda ^t)\), where \(\varLambda ^t\) is the timed condition such that each inequality \(\mathbb {T}_{i,|u|} = d\) in \(\varLambda \) is replaced with \(\mathbb {T}_{i,|u|} > d\) if such an inequality exists, and otherwise, the inequality \(\mathbb {T}_{i,|u|} < d\) in \(\varLambda \) with the smallest index i is replaced with \(\mathbb {T}_{i,|u|} = d\). The immediate exterior of \(\mathcal {L}\) is \(\textrm{ext}^{} (\mathcal {L}) = \bigcup _{a\in \varSigma }\textrm{ext}^{a} (\mathcal {L}) \cup \textrm{ext}^{t} (\mathcal {L})\).

Example 9

For a word \(u\) of length 2 and a timed condition \(\varLambda = \{\mathbb {T}_{0,0} \in (1,2) \wedge \mathbb {T}_{0,1} \in (1,2) \wedge \mathbb {T}_{0,2} \in (1,2) \wedge \mathbb {T}_{1,2} \in (0,1) \wedge \mathbb {T}_{2,2} = 0\}\), the discrete immediate exterior of \((u, \varLambda )\) is \(\textrm{ext}^{a} ((u, \varLambda )) = (u \cdot a, \varLambda \cup \{\tau _{3} = 0\})\) for each \(a \in \varSigma \), and the continuous immediate exterior is \(\textrm{ext}^{t} ((u, \varLambda )) = (u, \varLambda ^{t})\), where \(\varLambda ^{t} = \{\mathbb {T}_{0,0} \in (1,2) \wedge \mathbb {T}_{0,1} \in (1,2) \wedge \mathbb {T}_{0,2} \in (1,2) \wedge \mathbb {T}_{1,2} \in (0,1) \wedge \mathbb {T}_{2,2} > 0\}\).
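Definition 8 can also be sketched operationally, under an assumed encoding (ours) of a timed condition as a list of constraints \((i, j, {\bowtie }, d)\) meaning \(\mathbb {T}_{i,j} \bowtie d\):

```python
# Assumed encoding (ours): a timed condition over τ0..τn is a list of
# constraints (i, j, op, d) meaning T_{i,j} op d; u is a list of letters.

def ext_a(u, cond, a):
    """Discrete immediate exterior: append a and require τ_{|u|+1} = 0."""
    n = len(u)
    return u + [a], cond + [(n + 1, n + 1, '=', 0)]

def ext_t(u, cond):
    """Continuous immediate exterior, per Definition 8."""
    n = len(u)
    out = list(cond)
    eqs = [k for k, (i, j, op, d) in enumerate(cond) if j == n and op == '=']
    if eqs:                   # replace every T_{i,n} = d with T_{i,n} > d
        for k in eqs:
            i, j, _, d = out[k]
            out[k] = (i, j, '>', d)
    else:                     # tighten the '<' bound with the smallest i
        k = min((k for k, (i, j, op, d) in enumerate(cond)
                 if j == n and op == '<'), key=lambda k: cond[k][0])
        i, j, _, d = out[k]
        out[k] = (i, j, '=', d)
    return u, out

u = ['a', 'a']
assert ext_a(u, [(2, 2, '=', 0)], 'b') == (['a', 'a', 'b'],
                                           [(2, 2, '=', 0), (3, 3, '=', 0)])
assert (2, 2, '>', 0) in ext_t(u, [(1, 2, '<', 1), (2, 2, '=', 0)])[1]
assert (1, 2, '=', 1) in ext_t(u, [(1, 2, '<', 1)])[1]
```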

Definition 10

(chronometric timed language). A timed language \(\mathcal {L}\) is chronometric if there is a finite set \(\{(u_{1}, \varLambda _{1}), (u_{2}, \varLambda _{2}), \dots , (u_{m}, \varLambda _{m})\}\) of disjoint elementary languages satisfying \(\mathcal {L}= \bigcup _{i \in \{1,2, \dots ,m\}} (u_{i}, \varLambda _{i})\).

For any elementary language \(\mathcal {L}\), its immediate exterior \(\textrm{ext}^{} (\mathcal {L})\) is chronometric. We naturally extend the notion of exterior to chronometric timed languages, i. e., for a chronometric timed language \(\mathcal {L}= \bigcup _{i \in \{1,2, \dots ,m\}} (u_{i}, \varLambda _{i})\), we let \(\textrm{ext}^{} (\mathcal {L}) = \bigcup _{i \in \{1,2, \dots ,m\}} \textrm{ext}^{} ((u_{i}, \varLambda _{i}))\), which is also chronometric. For a timed word \(w=\tau _0 a_1 \tau _1 a_2 \dots a_{n} \tau _{n}\), we denote the valuation of \(\tau _0,\tau _1,\dots ,\tau _{n}\) by \(\kappa (w)\).

A chronometric relational morphism [21] relates any timed word to a timed word in a certain set \(P\), which is later used to define a timed language. Intuitively, the tuples in \(\varPhi \) specify a mapping from timed words immediately out of \(P\) to timed words in \(P\). By inductively applying it, any timed word is mapped to \(P\).

Definition 11

(chronometric relational morphism). Let \(P\) be a chronometric and prefix-closed timed language. Let \((u,\varLambda ,u{'},\varLambda {'}, R)\) be a 5-tuple such that \((u, \varLambda )\subseteq \textrm{ext}^{} (P)\), \((u{'},\varLambda {'}) \subseteq P\), and \(R\) is a finite conjunction of equations of the form \(\mathbb {T}_{i,|u|} = \mathbb {T}^{'}_{{j},{|u'|}}\), where \(i \le |u|\) and \(j \le |u'|\). For such a tuple, we let \(\llbracket { (u,\varLambda ,u{'},\varLambda {'}, R) }\rrbracket \subseteq (u, \varLambda )\times (u{'},\varLambda {'})\) be the relation such that \((w, w') \in \llbracket { (u,\varLambda ,u{'},\varLambda {'}, R) }\rrbracket \) if and only if \(\kappa (w), \kappa (w') \models R\). For a finite set \(\varPhi \) of such tuples, the chronometric relational morphism \(\llbracket { \varPhi }\rrbracket \subseteq \mathcal {T}(\varSigma )\times P\) is the relation inductively defined as follows: 1) for \(w\in P\), we have \((w, w) \in \llbracket { \varPhi }\rrbracket \); 2) for \(w\in \textrm{ext}^{} (P)\) and \(w' \in P\), we have \((w, w') \in \llbracket { \varPhi }\rrbracket \) if we have \((w, w') \in \llbracket { (u,\varLambda ,u{'},\varLambda {'}, R) }\rrbracket \) for one of the tuples \((u,\varLambda ,u{'},\varLambda {'}, R) \in \varPhi \); 3) for \(w\in \textrm{ext}^{} (P)\), \(w' \in \mathcal {T}(\varSigma )\), and \(w'' \in P\), we have \((w\cdot w', w'') \in \llbracket { \varPhi }\rrbracket \) if there is \(w''' \in \mathcal {T}(\varSigma )\) satisfying \((w, w''') \in \llbracket { \varPhi }\rrbracket \) and \((w''' \cdot w', w'') \in \llbracket { \varPhi }\rrbracket \). We also require that all \((u, \varLambda )\) in the tuples in \(\varPhi \) are disjoint and that their union is \(\textrm{ext}^{} (P) \setminus P\).

A chronometric relational morphism \(\llbracket { \varPhi }\rrbracket \) is compatible with \(F\subseteq P\) if for each tuple \((u,\varLambda ,u{'},\varLambda {'}, R)\) defining \(\llbracket { \varPhi }\rrbracket \), we have either \((u{'},\varLambda {'}) \subseteq F\) or \((u{'},\varLambda {'}) \cap F= \emptyset \).

Definition 12

(recognizable timed language). A timed language \(\mathcal {L}\) is recognizable if there is a chronometric prefix-closed set \(P\), a chronometric subset \(F\) of \(P\), and a chronometric relational morphism \(\llbracket { \varPhi }\rrbracket \subseteq \mathcal {T}(\varSigma )\times P\) compatible with \(F\) satisfying \(\mathcal {L}= \{w\mid \exists w' \in F, (w, w') \in \llbracket { \varPhi }\rrbracket \}\).

It is known that for any recognizable timed language \(\mathcal {L}\), we can construct a DTA \(\mathcal {A}\) recognizing \(\mathcal {L}\), and vice versa [21].

2.3 Distinguishing Extensions and Active DFA Learning

Most DFA learning algorithms are based on Nerode’s congruence [18]. For a (not necessarily regular) language \(\mathcal {L}\subseteq \varSigma ^*\), Nerode’s congruence \({\equiv _{\mathcal {L}}} \subseteq \varSigma ^* \times \varSigma ^*\) is the equivalence relation satisfying \(w\equiv _{\mathcal {L}} w'\) if and only if for any \(w'' \in \varSigma ^*\), we have \(w\cdot w'' \in \mathcal {L}\iff w' \cdot w'' \in \mathcal {L}\).

Generally, we cannot decide if \(w\equiv _{\mathcal {L}} w'\) by testing because it requires infinitely many membership checks. However, if \(\mathcal {L}\) is regular, there is a finite set of suffixes \(S\subseteq \varSigma ^*\) called distinguishing extensions satisfying \({\equiv _{\mathcal {L}}} = {\sim ^{S}_{\mathcal {L}}}\), where \({\sim ^{S}_{\mathcal {L}}}\) is the equivalence relation satisfying \(w\sim ^{S}_{\mathcal {L}} w'\) if and only if for any \(w'' \in S\), we have \(w\cdot w'' \in \mathcal {L}\iff w' \cdot w'' \in \mathcal {L}\). Thus, the minimum DFA recognizing \(\mathcal {L}_{\textrm{tgt}}\) can be learned byFootnote 3: i) identifying distinguishing extensions \(S\) satisfying \({\equiv _{\mathcal {L}_{\textrm{tgt}}}} = {\sim ^{S}_{\mathcal {L}_{\textrm{tgt}}}}\) and ii) constructing the minimum DFA \(\mathcal {A}\) corresponding to \({\sim ^{S}_{\mathcal {L}_{\textrm{tgt}}}}\).

The L* algorithm [8] is an algorithm to learn the minimum DFA \(\mathcal {A}_{\textrm{hyp}}\) recognizing the target regular language \(\mathcal {L}_{\textrm{tgt}}\) with finitely many membership and equivalence queries to the teacher. In a membership query, the learner asks if \(w\in \varSigma ^*\) belongs to the target language \(\mathcal {L}_{\textrm{tgt}}\), i. e., \(w\in \mathcal {L}_{\textrm{tgt}}\). In an equivalence query, the learner asks if the hypothesis DFA \(\mathcal {A}_{\textrm{hyp}}\) recognizes the target language \(\mathcal {L}_{\textrm{tgt}}\), i. e., \(\mathcal {L}(\mathcal {A}_{\textrm{hyp}}) = \mathcal {L}_{\textrm{tgt}}\), where \(\mathcal {L}(\mathcal {A}_{\textrm{hyp}})\) is the language of the hypothesis DFA \(\mathcal {A}_{\textrm{hyp}}\). When we have \(\mathcal {L}(\mathcal {A}_{\textrm{hyp}}) \ne \mathcal {L}_{\textrm{tgt}}\), the teacher returns a counterexample \( cex \in \mathcal {L}(\mathcal {A}_{\textrm{hyp}}) \triangle \mathcal {L}_{\textrm{tgt}}\). The information obtained via queries is stored in a 2-dimensional array called an observation table. See Fig. 1b for an illustration. For finite index sets \(P, S\subseteq \varSigma ^*\), for each pair \((p, s) \in (P\cup P\cdot \varSigma ) \times S\), the observation table stores whether \(p\cdot s\in \mathcal {L}_{\textrm{tgt}}\). \(S\) is the current candidate for the distinguishing extensions, and \(P\) represents \(\varSigma ^* / {\sim ^{S}_{\mathcal {L}_{\textrm{tgt}}}}\). Since \(P\) and \(S\) are finite, one can fill the observation table with finitely many membership queries.

Algorithm 1 outlines an L*-style algorithm. We start from \(P = S = \{\varepsilon \}\) and incrementally increase them. To construct a hypothesis DFA \(\mathcal {A}_{\textrm{hyp}}\), the observation table must be closed and consistent. An observation table is closed if, for each \(p\in P\cdot \varSigma \), there is \(p' \in P\) satisfying \(p\sim ^{S}_{\mathcal {L}_{\textrm{tgt}}} p'\). An observation table is consistent if, for any \(p, p' \in P\cup P\cdot \varSigma \) and \(a\in \varSigma \), \(p\sim ^{S}_{\mathcal {L}_{\textrm{tgt}}} p'\) implies \(p\cdot a\sim ^{S}_{\mathcal {L}_{\textrm{tgt}}} p' \cdot a\).
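The closedness and consistency checks can be sketched in the toy DFA setting (the language and names are ours, not the paper's):

```python
# Toy setting (ours): target language = words over {'a','b'} ending in 'ab'.
member = lambda w: w.endswith('ab')
SIGMA = ['a', 'b']

def row(p, S):
    return tuple(member(p + s) for s in S)

def is_closed(P, S):
    """Every one-letter extension of P must match some row of P."""
    rows_P = {row(p, S) for p in P}
    return all(row(p + a, S) in rows_P for p in P for a in SIGMA)

def is_consistent(P, S):
    """Prefixes with equal rows must stay equal after any letter."""
    return all(row(p1 + a, S) == row(p2 + a, S)
               for p1 in P for p2 in P if row(p1, S) == row(p2, S)
               for a in SIGMA)

assert is_closed([''], [''])               # trivially closed with S = {ε}
assert not is_closed([''], ['', 'b'])      # row of 'a' is new: P must grow
assert not is_consistent(['', 'a'], [''])  # ε and 'a' split after reading 'b'
```

A closedness failure adds the offending prefix to \(P\); a consistency failure adds a separating suffix to \(S\), matching the refinement loop of Algorithm 1.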

Once the observation table becomes closed and consistent, the learner constructs a hypothesis DFA \(\mathcal {A}_{\textrm{hyp}}\) and checks if \(\mathcal {L}(\mathcal {A}_{\textrm{hyp}}) = \mathcal {L}_{\textrm{tgt}}\) by an equivalence query. If \(\mathcal {L}(\mathcal {A}_{\textrm{hyp}}) = \mathcal {L}_{\textrm{tgt}}\) holds, \(\mathcal {A}_{\textrm{hyp}}\) is the resulting DFA. Otherwise, the teacher returns \( cex \in \mathcal {L}(\mathcal {A}_{\textrm{hyp}}) \triangle \mathcal {L}_{\textrm{tgt}}\), which is used to refine the observation table. There are several variants of the refinement. In the L* algorithm, all the prefixes of \( cex \) are added to \(P\). In the Rivest-Schapire algorithm [20, 25], an extension \(s\) strictly refining \(S\) is obtained by an analysis of \( cex \), and such \(s\) is added to \(S\).

3 A Myhill-Nerode Style Characterization of Recognizable Timed Languages with Elementary Languages

Unlike the case of regular languages, no finite set of timed words can correctly distinguish recognizable timed languages, due to the infinitely many possible dwell times in timed words. Instead, we use a finite set of elementary languages to define a Nerode-style congruence. To define this congruence, we extend the notion of membership to elementary languages.

Definition 13

(symbolic membership). For a timed language \(\mathcal {L}\subseteq \mathcal {T}(\varSigma )\) and an elementary language \((u, \varLambda )\in \mathcal {E} (\varSigma )\), the symbolic membership \(\texttt{mem}^{\texttt{sym}}_{\mathcal {L}}((u, \varLambda ))\) of \((u, \varLambda )\) to \(\mathcal {L}\) is the strongest constraint such that for any \(w\in (u, \varLambda )\), we have \(w\in \mathcal {L}\) if and only if \(\kappa (w) \models \texttt{mem}^{\texttt{sym}}_{\mathcal {L}}((u, \varLambda ))\).

We discuss how to obtain symbolic membership in Sect. 4.5. We define a Nerode-style congruence using symbolic membership. A naive idea is to distinguish two elementary languages by the equivalence of their symbolic membership. However, this does not capture the semantics of TAs. For example, for the DTA \(\mathcal {A}\) in Fig. 1c, there are prefixes that no extension by a timed word \(w\) can distinguish with respect to \(\mathcal {L}(\mathcal {A})\), while they have different symbolic membership. This is because symbolic membership distinguishes the position in timed words where each clock variable is reset, which must be ignored. We use renaming equations to abstract such positional information in symbolic membership. Note that \(\mathbb {T}_{i,n} = \sum _{{k = i}}^{n} \tau _k\) corresponds to the value of the clock variable reset at \(\tau _i\).

Definition 14

(renaming equation). Let \(\mathbb {T}= \{\tau _0,\tau _1,\dots ,\tau _{n}\}\) and \(\mathbb {T}' = \{\tau ^{'}_0,\tau ^{'}_1,\dots ,\tau ^{'}_{n^{'}}\}\) be sets of ordered variables. A renaming equation \(R\) over \(\mathbb {T}\) and \(\mathbb {T}'\) is a finite conjunction of equations of the form \(\mathbb {T}_{i,n} = \mathbb {T}^{'}_{{i'},{n'}}\), where \(i \in \{0, 1, \dots , n\}\), \(i' \in \{0, 1, \dots , n'\}\), \(\mathbb {T}_{i,n} = \sum _{{k = i}}^{n} \tau _k\) and \(\mathbb {T}^{'}_{{i'},{n'}} = \sum _{{k = i'}}^{n'} \tau '_k\).
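Checking a renaming equation on concrete duration valuations can be sketched as follows (encoding a valuation \(\kappa (w)\) as the list \([\tau _0, \tau _1, \dots , \tau _n]\); an assumption of ours):

```python
# Assumed encoding (ours): a valuation κ(w) as the list [τ0, τ1, ..., τn].

def t_sum(taus, i):
    """T_{i,n} = τ_i + ... + τ_n, the value of the clock reset at τ_i."""
    return sum(taus[i:])

def satisfies(taus1, taus2, eqs):
    """Check a renaming equation: a conjunction of T_{i,n} = T'_{j,n'}."""
    return all(abs(t_sum(taus1, i) - t_sum(taus2, j)) < 1e-9
               for i, j in eqs)

# E.g., an equation T_{1,1} = T'_{2,2} as in Example 16:
assert satisfies([0.5, 0.25], [0.5, 0.5, 0.25], [(1, 2)])
assert not satisfies([0.5, 0.75], [0.5, 0.5, 0.25], [(1, 2)])
```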

Definition 15

(\(\sim ^{S}_{\mathcal {L}}\)). Let \(\mathcal {L}\subseteq \mathcal {T}(\varSigma )\) be a timed language, let \((u, \varLambda ), (u{'},\varLambda {'}), (u{''},\varLambda {''}) \in \mathcal {E} (\varSigma )\) be elementary languages, and let \(R\) be a renaming equation over \(\mathbb {T}\) and \(\mathbb {T}'\), where \(\mathbb {T}\) and \(\mathbb {T}'\) are the variables of \(\varLambda \) and \(\varLambda '\), respectively. We let \((u, \varLambda )\sqsubseteq ^{(u{''},\varLambda {''}), R}_{\mathcal {L}} (u{'},\varLambda {'})\) if we have the following: 1) for any \(w\in (u, \varLambda )\), there is \(w' \in (u{'},\varLambda {'})\) satisfying \(\kappa (w), \kappa (w')\models R\); 2) \(\texttt{mem}^{\texttt{sym}}_{\mathcal {L}}((u, \varLambda )\cdot (u{''},\varLambda {''})) \wedge R \wedge \varLambda '\) is equivalent to \(\texttt{mem}^{\texttt{sym}}_{\mathcal {L}}((u{'},\varLambda {'}) \cdot (u{''},\varLambda {''})) \wedge R\wedge \varLambda \). We let \((u, \varLambda )\sim ^{(u{''},\varLambda {''}), R}_{\mathcal {L}} (u{'},\varLambda {'})\) if we have \((u, \varLambda )\sqsubseteq ^{(u{''},\varLambda {''}), R}_{\mathcal {L}} (u{'},\varLambda {'})\) and \((u{'},\varLambda {'}) \sqsubseteq ^{(u{''},\varLambda {''}), R}_{\mathcal {L}} (u,\varLambda )\). Let \(S\subseteq \mathcal {E} (\varSigma )\). We let \((u, \varLambda )\sim ^{S, R}_{\mathcal {L}} (u{'},\varLambda {'})\) if for any \((u{''},\varLambda {''}) \in S\), we have \((u, \varLambda )\sim ^{(u{''},\varLambda {''}), R}_{\mathcal {L}} (u{'},\varLambda {'})\). We let \((u, \varLambda )\sim ^{S}_{\mathcal {L}} (u{'},\varLambda {'})\) if \((u, \varLambda )\sim ^{S, R}_{\mathcal {L}} (u{'},\varLambda {'})\) for some renaming equation \(R\).

Example 16

Let \(\mathcal {A}\) be the DTA in Fig. 1c and let \((u, \varLambda )\), \((u{'},\varLambda {'})\), and \((u{''},\varLambda {''})\) be elementary languages, where , , . We have \(\texttt{mem}^{\texttt{sym}}_{\mathcal {L}(\mathcal {A})}((u, \varLambda )\cdot (u{''},\varLambda {''})) = \varLambda \wedge \varLambda '' \wedge \tau _1 + \tau ''_0 \le 1\) and \(\texttt{mem}^{\texttt{sym}}_{\mathcal {L}(\mathcal {A})}((u{'},\varLambda {'}) \cdot (u{''},\varLambda {''})) = \varLambda ' \wedge \varLambda '' \wedge \tau '_2 + \tau ''_0 \le 1\). Therefore, for the renaming equation \(\mathbb {T}_{1,1} = \mathbb {T}^{'}_{{2},{2}}\), we have \((u, \varLambda )\sim ^{(u{''},\varLambda {''}), \mathbb {T}_{1,1} = \mathbb {T}^{'}_{{2},{2}}}_{\mathcal {L}} (u{'},\varLambda {'})\).

An algorithm to check if \((u, \varLambda )\sim ^{S}_{\mathcal {L}} (u{'},\varLambda {'})\) is shown in Appendix B.2 of [29].

Intuitively, \((u, \varLambda )\sqsubseteq ^{s, R}_{\mathcal {L}} (u{'},\varLambda {'})\) shows that any \(w\in (u, \varLambda )\) can be “simulated” by some \(w' \in (u{'},\varLambda {'})\) with respect to \(s\) and \(R\). Such intuition is formalized as follows.

Theorem 17

For any \(\mathcal {L}\subseteq \mathcal {T}(\varSigma )\) and \((u, \varLambda ), (u{'},\varLambda {'}), (u{''},\varLambda {''}) \in \mathcal {E} (\varSigma )\) satisfying \((u, \varLambda )\sqsubseteq ^{(u{''},\varLambda {''})}_{\mathcal {L}} (u{'},\varLambda {'})\), for any \(w\in (u, \varLambda )\), there is \(w' \in (u{'},\varLambda {'})\) such that for any \(w'' \in (u{''},\varLambda {''})\), \(w\cdot w'' \in \mathcal {L}\iff w' \cdot w'' \in \mathcal {L}\) holds.    \(\square \)

By \(\bigcup _{(u, \varLambda )\in \mathcal {E} (\varSigma )} (u, \varLambda )= \mathcal {T}(\varSigma )\), we have the following as a corollary.

Corollary 18

For any timed language \(\mathcal {L}\subseteq \mathcal {T}(\varSigma )\) and for any elementary languages \((u, \varLambda ), (u{'},\varLambda {'}) \in \mathcal {E} (\varSigma )\), \((u, \varLambda )\sim ^{\mathcal {E} (\varSigma )}_{\mathcal {L}} (u{'},\varLambda {'})\) implies the following.

  • For any \(w\in (u, \varLambda )\), there is \(w' \in (u{'},\varLambda {'})\) such that for any \(w'' \in \mathcal {T}(\varSigma )\), we have \(w\cdot w'' \in \mathcal {L}\iff w' \cdot w'' \in \mathcal {L}\).

  • For any \(w' \in (u{'},\varLambda {'})\), there is \(w\in (u, \varLambda )\) such that for any \(w'' \in \mathcal {T}(\varSigma )\), we have \(w\cdot w'' \in \mathcal {L}\iff w' \cdot w'' \in \mathcal {L}\).    \(\square \)

The following characterizes recognizable timed languages with \(\sim ^{\mathcal {E} (\varSigma )}_{\mathcal {L}}\).

Theorem 19

(Myhill-Nerode style characterization). A timed language \(\mathcal {L}\) is recognizable if and only if the quotient set \(\mathcal{S}\mathcal{E} (\varSigma )/ {\sim ^{\mathcal {E} (\varSigma )}_{\mathcal {L}}}\) is finite.    \(\square \)

By Theorem 19, we always have a finite set \(S\) of distinguishing extensions.

Theorem 20

For any recognizable timed language \(\mathcal {L}\), there is a finite set \(S\) of elementary languages satisfying \({\sim ^{\mathcal {E} (\varSigma )}_{\mathcal {L}}} = {\sim ^{S}_{\mathcal {L}}}\).    \(\square \)

4 Active Learning of Deterministic Timed Automata

We present our L*-style active learning algorithm for DTAs, based on the Nerode-style congruence introduced in Sect. 3. We let \(\mathcal {L}_{\textrm{tgt}}\) be the target timed language in learning.

For simplicity, we first present our learning algorithm with a smart teacher answering the following three kinds of queries: membership query \(\texttt{mem}_{\mathcal {L}_{\textrm{tgt}}}(w)\) asking whether \(w\in \mathcal {L}_{\textrm{tgt}}\), symbolic membership query asking \(\texttt{mem}^{\texttt{sym}}_{\mathcal {L}_{\textrm{tgt}}}({(u, \varLambda )})\), and equivalence query \(\texttt{eq}_{\mathcal {L}_{\textrm{tgt}}}(\mathcal {A}_{\textrm{hyp}})\) asking whether \(\mathcal {L}(\mathcal {A}_{\textrm{hyp}}) = \mathcal {L}_{\textrm{tgt}}\). If \(\mathcal {L}(\mathcal {A}_{\textrm{hyp}}) = \mathcal {L}_{\textrm{tgt}}\), \(\texttt{eq}_{\mathcal {L}_{\textrm{tgt}}}(\mathcal {A}_{\textrm{hyp}}) = \top \), and otherwise, \(\texttt{eq}_{\mathcal {L}_{\textrm{tgt}}}(\mathcal {A}_{\textrm{hyp}})\) is a timed word \( cex \in \mathcal {L}(\mathcal {A}_{\textrm{hyp}}) \triangle \mathcal {L}_{\textrm{tgt}}\). Later in Sect. 4.5, we show how to answer a symbolic membership query with finitely many membership queries. Our task is to construct a DTA \(\mathcal {A}\) satisfying \(\mathcal {L}(\mathcal {A}) = \mathcal {L}_{\textrm{tgt}}\) with finitely many queries.

4.1 Successors of Simple Elementary Languages

Similarly to the L* algorithm in Sect. 2.3, we learn a DTA with an observation table. Reflecting the extension of the underlying congruence, we use sets of simple elementary languages as the indices. To define an analogue of the extensions \(P\cup (P\cdot \varSigma )\) in the L* algorithm, we introduce continuous and discrete successors for simple elementary languages, which are inspired by regions [4]. We note that immediate exteriors do not work for this purpose. For example, for and , we have \(w\in (u, \varLambda )\) and , but there is no \(t > 0\) satisfying \(w\cdot t \in \textrm{ext}^{t} ((u, \varLambda ))\).

For any \((u, \varLambda )\in \mathcal{S}\mathcal{E} (\varSigma )\), we let \(\varTheta _{(u, \varLambda )}\) be the total order over 0 and the fractional parts \(\textrm{frac}(\mathbb {T}_{0,n}),\textrm{frac}(\mathbb {T}_{1,n}),\dots ,\textrm{frac}(\mathbb {T}_{n,n})\) of \(\mathbb {T}_{0,n},\mathbb {T}_{1,n},\dots ,\mathbb {T}_{n,n}\). Such an order is uniquely defined because \(\varLambda \) is simple and canonical (Proposition 36 of [29]).

Definition 21

(successor). Let \(p= (u, \varLambda )\in \mathcal{S}\mathcal{E} (\varSigma )\) be a simple elementary language. The discrete successor \(\textrm{succ}^{a} (p)\) of \(p\) is \(\textrm{succ}^{a} (p) = (u \cdot a, \varLambda \wedge \tau _{n+1} = 0)\). The continuous successor \(\textrm{succ}^{t} (p)\) of \(p\) is \(\textrm{succ}^{t} (p) = (u, \varLambda ^t)\), where \(\varLambda ^{t}\) is defined as follows: if there is an equation \(\mathbb {T}_{i,n} = d\) in \(\varLambda \), all such equations are replaced with \(\mathbb {T}_{i,n} \in (d, d+ 1)\); otherwise, for each greatest \(\mathbb {T}_{i,n}\) in terms of \(\varTheta _{(u, \varLambda )}\), we replace \(\mathbb {T}_{i,n} \in (d, d+ 1)\) with \(\mathbb {T}_{i,n} = d+1\). We let \(\textrm{succ}^{} (p) = \bigcup _{a\in \varSigma }\textrm{succ}^{a} (p) \cup \textrm{succ}^{t} (p)\). For \(P\subseteq \mathcal{S}\mathcal{E} (\varSigma )\), we let \(\textrm{succ}^{} (P) = \bigcup _{p\in P} \textrm{succ}^{} (p)\).

Example 22

Let , \(\varLambda = \{\mathbb {T}_{0,0} \in (1,2) \wedge \mathbb {T}_{0,1} \in (1,2) \wedge \mathbb {T}_{0,2} \in (1,2) \wedge \mathbb {T}_{1,1} \in (0,1) \wedge \mathbb {T}_{1,2} \in (0,1) \wedge \mathbb {T}_{2,2} = 0\}\). The order \(\varTheta _{(u, \varLambda )}\) is such that \(0 = \textrm{frac}(\mathbb {T}_{2,2})< \textrm{frac}(\mathbb {T}_{1,2}) < \textrm{frac}(\mathbb {T}_{0,2})\). The continuous successor of \((u, \varLambda )\) is \(\textrm{succ}^{t} ((u, \varLambda )) = (u, \varLambda ^{t})\), where \(\varLambda ^{t} = \{\mathbb {T}_{0,0} \in (1,2) \wedge \mathbb {T}_{0,1} \in (1,2) \wedge \mathbb {T}_{0,2} \in (1,2) \wedge \mathbb {T}_{1,1} \in (0,1) \wedge \mathbb {T}_{1,2} \in (0,1) \wedge \mathbb {T}_{2,2} \in (0,1)\}\). The continuous successor of \((u, \varLambda ^{t})\) is \(\textrm{succ}^{t} ((u, \varLambda ^{t})) = (u, \varLambda ^{ tt })\), where \(\varLambda ^{ tt } = \{\mathbb {T}_{0,0} \in (1,2) \wedge \mathbb {T}_{0,1} \in (1,2) \wedge \mathbb {T}_{0,2} = 2 \wedge \mathbb {T}_{1,1} \in (0,1) \wedge \mathbb {T}_{1,2} \in (0,1) \wedge \mathbb {T}_{2,2} \in (0,1)\}\).
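The case analysis in Definition 21 can be made concrete with a small sketch. The encoding of a simple timed condition as a dictionary and the precomputed order \(\varTheta _{(u, \varLambda )}\) passed as a list are our assumptions; ties in the fractional order are not handled here.

```python
def succ_t(cond, frac_order):
    """Continuous successor succ^t (Definition 21) of a simple timed
    condition.  cond maps an index pair (i, j) for T_{i,j} to either
    ('eq', d) or ('in', d, d+1); frac_order lists the indices
    (0, n), ..., (n, n) by increasing fractional part, i.e., the order
    Theta_{(u, Lambda)} (assumed given, cf. Proposition 36 of [29])."""
    new = dict(cond)
    points = [k for k in frac_order if cond[k][0] == 'eq']
    if points:
        # Some clock value sits exactly on an integer: letting time
        # elapse moves all such values into the open interval above.
        for k in points:
            d = cond[k][1]
            new[k] = ('in', d, d + 1)
    else:
        # Otherwise, the clock value with the greatest fractional part
        # reaches the next integer first.
        k = frac_order[-1]
        _, d, d1 = cond[k]
        new[k] = ('eq', d1)
    return new
```

On the condition \(\varLambda \) of Example 22, with \(\varTheta \) given by \(\textrm{frac}(\mathbb {T}_{2,2})< \textrm{frac}(\mathbb {T}_{1,2}) < \textrm{frac}(\mathbb {T}_{0,2})\), applying the function twice reproduces \(\varLambda ^{t}\) and \(\varLambda ^{ tt }\).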


4.2 Timed Observation Table for Active DTA Learning

We extend the observation table with (simple) elementary languages and symbolic membership to learn a recognizable timed language.

Definition 23

(timed observation table). A timed observation table is a 3-tuple \((P, S, T)\) such that: \(P\) is a prefix-closed finite set of simple elementary languages, \(S\) is a finite set of elementary languages, and \(T\) is a function mapping \((p, s) \in (P\cup \textrm{succ}^{} (P)) \times S\) to the symbolic membership \(\texttt{mem}^{\texttt{sym}}_{\mathcal {L}_{\textrm{tgt}}}(p\cdot s)\).

Figure 2 illustrates timed observation tables: each cell indexed by \((p, s)\) shows the symbolic membership \(\texttt{mem}^{\texttt{sym}}_{\mathcal {L}_{\textrm{tgt}}}(p\cdot s)\). For timed observation tables, we extend the notions of closedness and consistency using the relation \(\sim ^{S}_{\mathcal {L}_{\textrm{tgt}}}\) introduced in Sect. 3. We note that consistency is defined only for discrete successors. This is because a timed observation table does not always become “consistent” for continuous successors. See Appendix C of [29] for a detailed discussion. We also require exterior-consistency since we construct an exterior from a successor.


Definition 24

(closedness, consistency, exterior-consistency, cohesion). Let \(O = (P, S, T)\) be a timed observation table. \(O\) is closed if, for each \(p\in \textrm{succ}^{} (P) \setminus P\), there is \(p' \in P\) satisfying \(p\sim ^{S}_{\mathcal {L}_{\textrm{tgt}}} p'\). \(O\) is consistent if, for each \(p, p' \in P\) and for each \(a\in \varSigma \), \(p\sim ^{S}_{\mathcal {L}_{\textrm{tgt}}} p'\) implies \(\textrm{succ}^{a} (p) \sim ^{S}_{\mathcal {L}_{\textrm{tgt}}} \textrm{succ}^{a} (p')\). \(O\) is exterior-consistent if for any \(p\in P\), \(\textrm{succ}^{t} (p) \notin P\) implies \(\textrm{succ}^{t} (p) \subseteq \textrm{ext}^{t} (p)\). \(O\) is cohesive if it is closed, consistent, and exterior-consistent.
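The closedness condition of Definition 24 amounts to searching for a successor row with no equivalent row in \(P\). A minimal sketch, assuming an oracle `equiv` deciding \(\sim ^{S}_{\mathcal {L}_{\textrm{tgt}}}\) (in the paper, this relation is decided via symbolic membership and renaming equations, Appendix B.2 of [29]); the interface is illustrative.

```python
def closedness_witness(P, successors, equiv):
    """Closedness check from Definition 24: every row in succ(P) \\ P
    must be equivalent (under the supplied callback `equiv`) to some row
    in P.  Returns a violating row, or None if the table is closed."""
    for p in successors:
        if p in P:
            continue
        if not any(equiv(p, q) for q in P):
            return p  # moving p into P restores closedness
    return None
```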

From a cohesive timed observation table \((P, S, T)\), we can construct a DTA as outlined in Algorithm 2. We construct a DTA via a recognizable timed language. The main ideas are as follows: 1) we approximate \(\sim ^{\mathcal {E} (\varSigma ), R}_{\mathcal {L}_{\textrm{tgt}}}\) by \(\sim ^{S, R}_{\mathcal {L}_{\textrm{tgt}}}\); 2) we decide the equivalence class of \(\textrm{ext}^{} (p) \in \textrm{ext}^{} (P) \setminus P\) in \(\mathcal {E} (\varSigma )\) from successors. Notice that there can be multiple renaming equations \(R\) showing \(\sim ^{S, R}_{\mathcal {L}_{\textrm{tgt}}}\). We use one of them found by an exhaustive search in Appendix B.2 of [29].

The DTA obtained by MakeDTA is faithful to the timed observation table in rows, i. e., for any \(p\in P\cup \textrm{succ}^{} (P)\), \(\mathcal {L}_{\textrm{tgt}}\cap p= \mathcal {L}(\mathcal {A}_{\textrm{hyp}}) \cap p\). However, unlike in the L* algorithm, this does not hold for each cell, i. e., there may be \(p\in P\cup \textrm{succ}^{} (P)\) and \(s\in S\) satisfying \(\mathcal {L}_{\textrm{tgt}}\cap (p\cdot s) \ne \mathcal {L}(\mathcal {A}_{\textrm{hyp}}) \cap (p\cdot s)\). This is because we do not (and actually cannot) enforce consistency for continuous successors. See Appendix C of [29] for a discussion. Nevertheless, this does not affect the correctness of our algorithm, partly due to Theorem 26.

Theorem 25

(row faithfulness). For any cohesive timed observation table \((P, S, T)\), for any \(p\in P\cup \textrm{succ}^{} (P)\), \(\mathcal {L}_{\textrm{tgt}}\cap p= \mathcal {L}(\texttt {MakeDTA}(P, S, T)) \cap p\) holds.    \(\square \)

Theorem 26

For any cohesive timed observation table \((P, S, T)\), \(\sim ^{S}_{\mathcal {L}_{\textrm{tgt}}} = \sim ^{\mathcal {E} (\varSigma )}_{\mathcal {L}_{\textrm{tgt}}}\) implies \(\mathcal {L}_{\textrm{tgt}}= \mathcal {L}(\texttt {MakeDTA}(P, S, T))\).    \(\square \)


4.3 Counterexample Analysis

We analyze the counterexample \( cex \) obtained by an equivalence query to refine the set \(S\) of suffixes in a timed observation table. Our analysis, outlined in Algorithm 3, is inspired by the Rivest-Schapire algorithm [20, 25]. The idea is to reduce the counterexample \( cex \) using the mapping defined by the congruence \(\sim ^{S}_{\mathcal {L}_{\textrm{tgt}}}\) (lines 5–7), much like \(\varPhi \) in recognizable timed languages, and to find a suffix \(s\) strictly refining \(S\) (line 9), i. e., satisfying \(p\sim ^{S}_{\mathcal {L}_{\textrm{tgt}}} p'\) and \(p{\not \sim }^{S\cup \{s\}}_{\mathcal {L}_{\textrm{tgt}}} p'\) for some \(p\in \textrm{succ}^{} (P)\) and \(p' \in P\).

By definition of \( cex \), we have \( cex = w_0 \in \mathcal {L}_{\textrm{tgt}}\triangle \mathcal {L}(\mathcal {A}_{\textrm{hyp}})\). By Theorem 25, \(w_n \not \in \mathcal {L}_{\textrm{tgt}}\triangle \mathcal {L}(\mathcal {A}_{\textrm{hyp}})\) holds, where n is the final value of i. By construction of \(\mathcal {A}_{\textrm{hyp}}\) and \(w_i\), for any \(i \in \{1, 2, \dots , n\}\), we have \(w_0 \in \mathcal {L}(\mathcal {A}_{\textrm{hyp}}) \iff w_i \in \mathcal {L}(\mathcal {A}_{\textrm{hyp}})\). Therefore, there is \(i \in \{1, 2, \dots , n\}\) satisfying \(w_{i-1} \in \mathcal {L}_{\textrm{tgt}}\triangle \mathcal {L}(\mathcal {A}_{\textrm{hyp}})\) and \(w_{i} \not \in \mathcal {L}_{\textrm{tgt}}\triangle \mathcal {L}(\mathcal {A}_{\textrm{hyp}})\). For such i, since we have \(w_{i-1} = w'_i \cdot w''_i \in \mathcal {L}_{\textrm{tgt}}\triangle \mathcal {L}(\mathcal {A}_{\textrm{hyp}})\), \(w_i = \overline{w}_i \cdot w''_i \not \in \mathcal {L}_{\textrm{tgt}}\triangle \mathcal {L}(\mathcal {A}_{\textrm{hyp}})\), and \(\kappa (w'_i), \kappa (\overline{w}_i) \models R_i\), such \(w''_i\) is a witness of \(p'_i {\not \sim }^{\mathcal {E} (\varSigma ), R_i}_{\mathcal {L}_{\textrm{tgt}}} p_i\). Therefore, \(S\) can be refined by the simple elementary language \(s\in \mathcal{S}\mathcal{E} (\varSigma )\) including \(w''_i\).
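The search for the index i above can be organized as a binary search over the decomposition of \( cex \), which is what yields the \(\lceil \log J \rceil \) bound on membership queries in Sect. 4.6. A sketch, with an assumed oracle `in_sym_diff` telling whether \(w_i \in \mathcal {L}_{\textrm{tgt}}\triangle \mathcal {L}(\mathcal {A}_{\textrm{hyp}})\) (each test costs one membership query, since membership in \(\mathcal {L}(\mathcal {A}_{\textrm{hyp}})\) is computable without the teacher); the helper name is ours.

```python
def find_refining_index(in_sym_diff, n):
    """Find i with w_{i-1} in the symmetric difference and w_i not in
    it, in O(log n) tests (Rivest-Schapire style).  in_sym_diff holds
    at 0 (the counterexample) and fails at n (by Theorem 25)."""
    lo, hi = 0, n  # invariant: in_sym_diff(lo) and not in_sym_diff(hi)
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if in_sym_diff(mid):
            lo = mid
        else:
            hi = mid
    return hi  # the suffix w''_hi witnesses the refinement of S
```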

4.4 L*-Style Learning Algorithm for DTAs

Algorithm 4 outlines our active DTA learning algorithm. At line 1, we initialize the timed observation table with . In the loop in lines 2–15, we refine the timed observation table until the hypothesis DTA \(\mathcal {A}_{\textrm{hyp}}\) recognizes the target language \(\mathcal {L}_{\textrm{tgt}}\), which is checked by equivalence queries. The refinement finishes when the equivalence relation \(\sim ^{S}_{\mathcal {L}_{\textrm{tgt}}}\) defined by the suffixes \(S\) converges to \(\sim ^{\mathcal {E} (\varSigma )}_{\mathcal {L}_{\textrm{tgt}}}\) and the prefixes \(P\) cover \(\mathcal{S}\mathcal{E} (\varSigma )/ {\sim ^{\mathcal {E} (\varSigma )}_{\mathcal {L}_{\textrm{tgt}}}}\).

In the loop in lines 3–11, we make the timed observation table cohesive. If the timed observation table is not closed, we move the incompatible row in \(\textrm{succ}^{} (P) \setminus P\) to \(P\) (line 5). If the timed observation table is inconsistent, we concatenate an event \(a\in \varSigma \) in front of some of the suffixes in \(S\) (line 8). If the timed observation table is not exterior-consistent, we move the boundary \(\textrm{succ}^{t} (p) \in \textrm{succ}^{t} (P)\setminus P\) satisfying \(\textrm{succ}^{t} (p) \nsubseteq \textrm{ext}^{t} (p)\) to \(P\) (line 10). Once we obtain a cohesive timed observation table, we construct a DTA \(\mathcal {A}_{\textrm{hyp}}= \texttt {MakeDTA}{} \texttt {(}{P, S, T}{} \texttt {)}\) and make an equivalence query (line 12). If we have \(\mathcal {L}(\mathcal {A}_{\textrm{hyp}}) = \mathcal {L}_{\textrm{tgt}}\), we return \(\mathcal {A}_{\textrm{hyp}}\). Otherwise, we obtain a timed word \( cex \) witnessing the difference between the language of the hypothesis DTA \(\mathcal {A}_{\textrm{hyp}}\) and the target language \(\mathcal {L}_{\textrm{tgt}}\), and we refine the timed observation table using Algorithm 3.

Fig. 2. Timed observation tables \(O_1, O_2, O_3\), and the DTAs \(\mathcal {A}_{\textrm{hyp}}^1\) and \(\mathcal {A}_{\textrm{hyp}}^3\) made from \(O_1\) and \(O_3\), respectively. In \(O_2\) and \(O_3\), we only show the constraints that are non-trivial from \(p\) and \(s\). The DTAs are simplified without changing their languages. The use of clock assignments, which does not change the expressiveness, is from [21].

Example 27

Let \(\mathcal {L}_{\textrm{tgt}}\) be the timed language recognized by the DTA in Fig. 1c. We start from and . Figure 2a shows the initial timed observation table \(O_1\). Since the timed observation table \(O_1\) in Fig. 2a is cohesive, we construct a hypothesis DTA \(\mathcal {A}_{\textrm{hyp}}^1\). The hypothesis recognizable timed language \((P_1, F_1, \varPhi _1)\) is such that and . Figure 2b shows the first hypothesis DTA \(\mathcal {A}_{\textrm{hyp}}^1\).

We have \(\mathcal {L}(\mathcal {A}_{\textrm{hyp}}^1) \ne \mathcal {L}_{\textrm{tgt}}\), and the learner obtains a counterexample, e. g., , with an equivalence query. In Algorithm 3, we have \(w_0 = cex \), , , and \(w_3 = 0\). We have \(w_0 \in \mathcal {L}(\mathcal {A}_{\textrm{hyp}}^1) \triangle \mathcal {L}_{\textrm{tgt}}\) and \(w_1 \not \in \mathcal {L}(\mathcal {A}_{\textrm{hyp}}^1) \triangle \mathcal {L}_{\textrm{tgt}}\), and the suffix to distinguish \(w_0\) and \(w_1\) is . Thus, we add to \(S_1\) (Fig. 2d).

In Fig. 2d, we observe that \(T_2(p_1, s_1)\) is stricter than \(T_2(p_0, s_1)\), and we have \(p_1 {\not \sim }^{S_2}_{\mathcal {L}_{\textrm{tgt}}} p_0\). To make \((P_2, S_2, T_2)\) closed, we add \(p_1\) to \(P_2\). By repeating similar operations, we obtain the timed observation table \(O_3 = (P_3, S_3, T_3)\) in Fig. 2e, which is cohesive. Figure 2c shows the DTA \(\mathcal {A}_{\textrm{hyp}}^3\) constructed from \(O_3\). Since \(\mathcal {L}(\mathcal {A}_{\textrm{hyp}}^3) = \mathcal {L}_{\textrm{tgt}}\) holds, Algorithm 4 finishes, returning \(\mathcal {A}_{\textrm{hyp}}^3\).

Due to the use of equivalence queries, Algorithm 4 returns a DTA recognizing the target language if it terminates. This is formally stated as follows.

Theorem 28

(correctness). For any target timed language \(\mathcal {L}_{\textrm{tgt}}\), if Algorithm 4 terminates, for the resulting DTA \(\mathcal {A}_{\textrm{hyp}}\), \(\mathcal {L}(\mathcal {A}_{\textrm{hyp}}) = \mathcal {L}_{\textrm{tgt}}\) holds.    \(\square \)

Moreover, Algorithm 4 terminates for any recognizable timed language \(\mathcal {L}_{\textrm{tgt}}\) essentially because of the finiteness of \(\mathcal{S}\mathcal{E} (\varSigma )/ {\sim ^{\mathcal {E} (\varSigma )}_{\mathcal {L}_{\textrm{tgt}}}}\).

Theorem 29

(termination). For any recognizable timed language \(\mathcal {L}_{\textrm{tgt}}\), Algorithm 4 terminates and returns a DTA \(\mathcal {A}\) satisfying \(\mathcal {L}(\mathcal {A}) = \mathcal {L}_{\textrm{tgt}}\).

Proof

(Theorem 29). By the recognizability of \(\mathcal {L}_{\textrm{tgt}}\) and Theorem 19, \(\mathcal{S}\mathcal{E} (\varSigma )/ {\sim ^{\mathcal {E} (\varSigma )}_{\mathcal {L}_{\textrm{tgt}}}}\) is finite. Let \(N = |\mathcal{S}\mathcal{E} (\varSigma )/ {\sim ^{\mathcal {E} (\varSigma )}_{\mathcal {L}_{\textrm{tgt}}}}|\). Since each execution of line 5 adds \(p\) to \(P\), where \(p\) is such that for any \(p' \in P\), \(p{\not \sim }^{\mathcal {E} (\varSigma )}_{\mathcal {L}_{\textrm{tgt}}} p'\) holds, it is executed at most N times. Since each execution of line 8 refines \(S\), i. e., it increases \(|\mathcal{S}\mathcal{E} (\varSigma )/ {\sim ^{S}_{\mathcal {L}_{\textrm{tgt}}}}|\), line 8 is executed at most N times. For any \((u,\varLambda ) \in \mathcal{S}\mathcal{E} (\varSigma )\), if \(\varLambda \) contains \(\mathbb {T}_{i,|u|} = d\) for some \(i \in \{0, 1, \dots , |u|\}\) and \(d\in {\mathbb {N}}\), we have \(\textrm{succ}^{t}((u,\varLambda )) \subseteq \textrm{ext}^{t} ((u, \varLambda ))\). Therefore, line 10 is executed at most N times. Since \(S\) is strictly refined in line 14, i. e., it increases \(|\mathcal{S}\mathcal{E} (\varSigma )/ {\sim ^{S}_{\mathcal {L}_{\textrm{tgt}}}}|\), line 14 is executed at most N times. By Theorem 26, once \(\sim ^{S}_{\mathcal {L}_{\textrm{tgt}}}\) saturates to \(\sim ^{\mathcal {E} (\varSigma )}_{\mathcal {L}_{\textrm{tgt}}}\), \(\texttt {MakeDTA}\) returns the correct DTA. Overall, Algorithm 4 terminates.    \(\square \)

4.5 Learning with a Normal Teacher

We briefly show how to learn a DTA with membership and equivalence queries only. We reduce a symbolic membership query to finitely many membership queries, answerable by a normal teacher. See Appendix B.1 of [29] for details.

Let \((u, \varLambda )\) be the elementary language given in a symbolic membership query. Since \(\varLambda \) is bounded, we can construct a finite and disjoint set of simple and canonical timed conditions \(\varLambda '_1, \varLambda '_2,\dots , \varLambda '_n\) satisfying \(\bigvee _{1 \le i \le n} \varLambda '_i = \varLambda \) by a simple enumeration. For any simple elementary language \((u{'},\varLambda {'}) \in \mathcal{S}\mathcal{E} (\varSigma )\) and timed words \(w, w' \in (u{'},\varLambda {'})\), we have \(w\in \mathcal {L}\iff w' \in \mathcal {L}\). Thus, we can construct \(\texttt{mem}^{\texttt{sym}}_{\mathcal {L}}((u, \varLambda ))\) by making a membership query \(\texttt{mem}_{\mathcal {L}}(w)\) for each such \((u{'},\varLambda {'}) \subseteq (u, \varLambda )\) and for some \(w\in (u{'},\varLambda {'})\). We need such an exhaustive search, instead of a binary search, because \(\texttt{mem}^{\texttt{sym}}_{\mathcal {L}}((u, \varLambda ))\) may be non-convex.
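The enumeration above can be sketched as follows. For readability, we make the simplifying assumption that each dwell-time variable is constrained independently by integer bounds; the paper's construction instead enumerates simple canonical timed conditions over the sums \(\mathbb {T}_{i,j}\). The function and its interface are ours.

```python
from fractions import Fraction
from itertools import product

def symbolic_membership(member, bounds):
    """Answer a symbolic membership query with finitely many membership
    queries (Sect. 4.5).  `bounds` gives integer (lo, hi) bounds per
    variable; the simple cells are the points {d} and the open intervals
    (d, d+1).  Since all timed words in a simple cell agree on
    membership, one representative query per cell suffices.  Returns the
    set of cells answered positively: a possibly non-convex union, which
    is why an exhaustive enumeration (not a binary search) is needed."""
    cells_per_var = []
    for lo, hi in bounds:
        cells = [('eq', d) for d in range(lo, hi + 1)]
        cells += [('in', d, d + 1) for d in range(lo, hi)]
        cells_per_var.append(cells)
    positive = set()
    for cell in product(*cells_per_var):
        # Representative point: d for a point cell, d + 1/2 for an
        # interval cell (exact rationals avoid rounding issues).
        w = [Fraction(c[1]) if c[0] == 'eq' else c[1] + Fraction(1, 2)
             for c in cell]
        if member(w):
            positive.add(cell)
    return positive
```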

Assume \(\varLambda \) is a canonical timed condition. Let M be the number of variables in \(\varLambda \) and let I be the largest difference between the upper bound and the lower bound of some \(\mathbb {T}_{i,j}\) in \(\varLambda \). The size n of the above decomposition is bounded by \({(2 I + 1)}^{M (M + 1) / 2}\), which blows up exponentially with respect to M.
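For concreteness, the bound on n can be computed directly: each of the \(M(M+1)/2\) sums \(\mathbb {T}_{i,j}\) ranges over at most \(I + 1\) integer points and \(I\) open unit intervals, hence at most \(2I + 1\) cells each. A sketch (the function name is ours):

```python
def decomposition_bound(M, I):
    """Upper bound (2I + 1)^(M(M+1)/2) on the number of simple cells in
    the decomposition of Sect. 4.5: at most I + 1 integer points and
    I open unit intervals per sum T_{i,j}, over M(M+1)/2 sums."""
    return (2 * I + 1) ** (M * (M + 1) // 2)
```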

In our algorithm, we only make symbolic membership queries with elementary languages of the form \(p\cdot s\), where \(p\) and \(s\) are simple elementary languages. Therefore, I is at most 2. However, even with such an assumption, the number of necessary membership queries blows up exponentially with respect to the number of variables in \(\varLambda \).

4.6 Complexity Analysis

After each equivalence query, our DTA learning algorithm strictly refines \(S\) or terminates. Thus, the number of equivalence queries is at most N. In the proof of Theorem 29, we observe that the size of \(P\) is at most 2N. Therefore, the number \((|P| + |\textrm{succ}^{} (P)|) \times |S|\) of the cells in the timed observation table is at most \((2 N + 2 N \times (|\varSigma | + 1)) \times N = 2 N^2 (|\varSigma | + 2)\). Let J be the upper bound of i in the analysis of \( cex \) returned by equivalence queries (Algorithm 3). For each equivalence query, the number of membership queries in Algorithm 3 is bounded by \(\lceil \log J \rceil \), and thus, it is, in total, bounded by \(N \times \lceil \log J \rceil \). Therefore, if the learner can use symbolic membership queries, the total number of queries is bounded by a polynomial in N and J. In Sect. 4.5, we observe that the number of membership queries needed to implement a symbolic membership query is at most exponential in M. Since \(P\) is prefix-closed, M is at most N. Overall, if the learner cannot use symbolic membership queries, the total number of queries is at most exponential in N.

Table 1. Summary of the results for Random. Each row index \(|L|\_|\varSigma |\_K_{C}\) shows the number of locations, the alphabet size, and the upper bound of the maximum constant in the guards, respectively. The row “count” shows the number of instances finished in 3 h. Cells with the best results are highlighted.

Let \(\mathcal {A}_{\textrm{tgt}}= (\varSigma ,L,l_0,C,I,\varDelta ,F)\) be a DTA recognizing \(\mathcal {L}_{\textrm{tgt}}\). As we observe in the proof of Lemma 33 of [29], N is bounded by the size of the state space of the region automaton [4] of \(\mathcal {A}_{\textrm{tgt}}\); that is, N is at most \(|C|!\times 2^{|C|}\times \prod _{c\in C} (2 K_{c} + 2) \times |L|\), where \(K_c\) is the largest constant compared with \(c\in C\) in \(\mathcal {A}_{\textrm{tgt}}\). Thus, without symbolic membership queries, the total number of queries is at most doubly exponential in \(|C|\) and singly exponential in \(|L|\). We remark that when \(|C| = 1\), the total number of queries is at most singly exponential in \(|L|\) and \(K_c\), which coincides with the worst-case complexity of the one-clock DTA learning algorithm in [30].
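The bound on N above can be transcribed directly; a sketch (the function name is ours, and the per-clock constants \(K_c\) are passed as a list):

```python
from math import factorial

def region_automaton_bound(num_clocks, Ks, num_locations):
    """The bound |C|! * 2^|C| * prod_c (2 K_c + 2) * |L| on N from
    Sect. 4.6, where Ks lists the maximum constant K_c per clock."""
    prod = 1
    for K in Ks:
        prod *= 2 * K + 2
    return factorial(num_clocks) * 2 ** num_clocks * prod * num_locations
```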

5 Experiments

We experimentally evaluated our DTA learning algorithm using our prototype library LearnTAFootnote 4 implemented in C++. In LearnTA, the equivalence queries are answered by a zone-based reachability analysis using the fact that DTAs are closed under complement [4]. We pose the following research questions.

  • RQ1 How does LearnTA scale with respect to the complexity of the target language?

  • RQ2 How efficient is LearnTA on practical benchmarks?

For the benchmarks with one clock variable, we compared LearnTA with one of the latest one-clock DTA learning algorithms [1, 30], which we call OneSMT. OneSMT is implemented in Python with Z3 [23] for constraint solving.

For each execution, we measured the number of queries and the total execution time, including the time to answer the queries. For the number of queries, we report the number with memoization, i. e., we count the number of distinct queried timed words (for membership queries) and counterexamples (for equivalence queries). We conducted all the experiments on a computing server with an Intel Core i9-10980XE and 125 GiB RAM running Ubuntu 20.04.5 LTS. We set the timeout to 3 h.

Fig. 3. The number of locations and the number of queries for \(|L|\_2\_10\) in Random, where \(|L| \in \{3,4,5,6\}\).

Table 2. Summary of the target DTAs and the results for Unbalanced. \(|L|\) is the number of locations, \(|\varSigma |\) is the alphabet size, \(|C|\) is the number of clock variables, and \(K_{C}\) is the maximum constant in the guards in the DTA.

5.1 RQ1: Scalability with Respect to the Language Complexity

To evaluate the scalability of LearnTA, we used randomly generated DTAs from [5] (denoted as Random) and our original DTAs (denoted as Unbalanced). Random consists of five classes: \(3\_2\_10\), \(4\_2\_10\), \(4\_4\_20\), \(5\_2\_10\), and \(6\_2\_10\), where each value of \(|L|\_|\varSigma |\_K_{C}\) is the number of locations, the alphabet size, and the upper bound of the maximum constant in the guards in the DTAs, respectively. Each class consists of 10 randomly generated DTAs. Unbalanced is our original benchmark inspired by the “unbalanced parentheses” timed language from [10]. Unbalanced consists of five DTAs with different complexity of timing constraints. Table 2 summarizes their complexity.

Table 1 summarizes the results for Random, and Table 2 summarizes the results for Unbalanced. Table 1 shows that LearnTA requires more membership queries than OneSMT. This is likely because of the difference in the definition of prefixes and successors: OneSMT’s definitions are discrete (e. g., prefixes are only with respect to events with time elapse), whereas ours are both continuous and discrete (e. g., we also consider prefixes obtained by trimming the dwell time at the end). Since our definition yields significantly more prefixes, LearnTA tends to require many more membership queries. Another, more high-level reason is that LearnTA learns a DTA without knowing the number of clock variables, so many more timed words are potentially helpful for learning. Table 1 also shows that LearnTA requires significantly more membership queries for \(4\_4\_20\). This is likely because of the exponential blowup with respect to \(K_{C}\), as discussed in Sect. 4.6. In Fig. 3, we observe that for both LearnTA and OneSMT, the number of membership queries increases nearly exponentially with the number of locations. This coincides with the discussion in Sect. 4.6.

In contrast, Table 1 shows that LearnTA requires fewer equivalence queries than OneSMT. This suggests that the cohesion in Definition 24 successfully detects contradictions in the observations before generating a hypothesis, whereas OneSMT mines timing constraints mainly by equivalence queries and thus tends to require more of them. In Fig. 3c, we observe that for both LearnTA and OneSMT, the number of equivalence queries increases nearly linearly with the number of locations. This also coincides with the complexity analysis in Sect. 4.6. Figure 3c also shows that the number of equivalence queries increases faster in OneSMT than in LearnTA.

Table 3. Summary of the target DTA and the results for practical benchmarks. The columns are the same as Table 2. Cells with the best results are highlighted.

Table 2 also suggests a similar tendency: the number of membership queries increases rapidly with the complexity of the timing constraints, whereas the number of equivalence queries increases rather slowly. Moreover, LearnTA is scalable enough to learn a DTA with five clock variables within 15 min.

Table 1 also suggests that LearnTA does not scale well with respect to the maximum constant in the guards, as observed in Sect. 4.6. However, we still observe that LearnTA requires fewer equivalence queries than OneSMT. Overall, compared with OneSMT, LearnTA has better scalability in the number of equivalence queries and worse scalability in the number of membership queries.

5.2 RQ2: Performance on Practical Benchmarks

To evaluate the practicality of LearnTA, we used seven benchmarks: AKM, CAS, Light, PC, TCP, Train, and FDDI. Table 3 summarizes their complexity. All the benchmarks other than FDDI are taken from [30] (or its implementation [1]). FDDI is taken from TChecker [2]. We use the instance of FDDI with two processes.

Table 3 summarizes the results for the benchmarks from practical applications. We observe, again, that LearnTA requires more membership queries and fewer equivalence queries than OneSMT. However, for these benchmarks, the difference in the number of membership queries tends to be much smaller than for Random. This is because these benchmarks have simpler timing constraints than Random from the viewpoint of LearnTA's exploration. In AKM, Light, PC, TCP, and Train, the clock variable can be reset at every edge without changing the language. For such a DTA, two simple elementary languages are equivalent in terms of the Nerode-style congruence whenever they have the same edge at their last event and the same dwell time after it. If two simple elementary languages are equivalent, LearnTA explores the successors of only one of them, so the exploration is relatively efficient. We have a similar situation in CAS. Moreover, in many of these DTAs, only a few edges have guards. Overall, despite the large number of locations and the large alphabets, the complexity of these languages is mild for LearnTA.

We also observe that, surprisingly, for all of these benchmarks, LearnTA took a shorter time for DTA learning than OneSMT. This is partly because of the difference in the implementation languages (i. e., C++ vs. Python) but also because of the small number of equivalence queries and the mild number of membership queries. Moreover, although it requires significantly more queries, LearnTA successfully learned FDDI with seven clock variables. Overall, such efficiency on benchmarks from practical applications suggests the potential usefulness of LearnTA in realistic scenarios.

6 Conclusions and Future Work

Extending the L* algorithm, we proposed an active learning algorithm for DTAs. Our extension is based on our Nerode-style congruence for recognizable timed languages. We proved the termination and correctness of our algorithm. We also proved that our learning algorithm requires a polynomial number of queries with a smart teacher and an exponential number of queries with a normal teacher. Our experimental results also suggest the practical relevance of our algorithm.

One future direction is to extend more recent automata learning algorithms (e. g., the TTT algorithm [19]) to DTA learning to improve efficiency. Another direction is to construct a passive DTA learning algorithm based on our congruence and an existing passive DFA learning algorithm. It is also a future direction to apply our learning algorithm in practice, e. g., for the identification of black-box systems and for testing black-box systems with black-box checking [22, 24, 28]. Optimization of the algorithm, e. g., by incorporating clock information, is also a future direction.