
1 Introduction

Active automata learning is a class of methods to infer an automaton recognizing an unknown target language \(\mathcal {L}_{\textrm{tgt}}\subseteq \varSigma ^*\) through finitely many queries to a teacher. The L* algorithm [8], the best-known active DFA learning algorithm, infers the minimum DFA recognizing \(\mathcal {L}_{\textrm{tgt}}\) using membership and equivalence queries. In a membership query, the learner asks if a word \(w \in \varSigma ^*\) is in the target language \(\mathcal {L}_{\textrm{tgt}}\), which is used to obtain enough information to construct a hypothesis DFA \(\mathcal {A}_{\textrm{hyp}}\). Using an equivalence query, the learner checks if the hypothesis \(\mathcal {A}_{\textrm{hyp}}\) recognizes the target language \(\mathcal {L}_{\textrm{tgt}}\). If \(\mathcal {L}(\mathcal {A}_{\textrm{hyp}}) \ne \mathcal {L}_{\textrm{tgt}}\), the teacher returns a counterexample \( cex \in \mathcal {L}_{\textrm{tgt}}\triangle \mathcal {L}(\mathcal {A}_{\textrm{hyp}})\) differentiating the target language and the current hypothesis. The learner uses \( cex \) to update \(\mathcal {A}_{\textrm{hyp}}\) to classify \( cex \) correctly. Such a learning algorithm has been combined with formal verification, e. g., for testing [22, 24, 26, 28] and controller synthesis [31].

Most of the DFA learning algorithms rely on the characterization of regular languages by Nerode’s congruence. For a language \(\mathcal {L}\), words \(p\) and \(p'\) are equivalent if for any extension \(s\), \(p\cdot s\in \mathcal {L}\) if and only if \(p' \cdot s\in \mathcal {L}\). It is well known that if \(\mathcal {L}\) is regular, such an equivalence relation has finitely many classes, corresponding to the locations of the minimum DFA recognizing \(\mathcal {L}\) (the Myhill-Nerode theorem; see, e. g., [18]). Moreover, for any regular language \(\mathcal {L}\), there is a finite set \(S\) of extensions such that \(p\) and \(p'\) are equivalent if and only if for any \(s\in S\), \(p\cdot s\in \mathcal {L}\) if and only if \(p' \cdot s\in \mathcal {L}\). Therefore, one can learn the minimum DFA by learning such a finite set \(S\) of extensions and the finitely many classes induced by Nerode’s congruence.
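To make this concrete, the following Python sketch (a toy illustration of ours, not part of the formal development; the language and oracle are hypothetical) approximates Nerode’s congruence with a finite set of extensions and a membership oracle:

```python
# A toy illustration (not from the paper): approximating Nerode's
# congruence by a finite set of extensions S and a membership oracle.

def equivalent_under(p1, p2, suffixes, member):
    """p1 and p2 are equivalent under S iff p1+s and p2+s agree for all s in S."""
    return all(member(p1 + s) == member(p2 + s) for s in suffixes)

# Hypothetical target language over {'a', 'b'}: an even number of 'a's.
member = lambda w: w.count('a') % 2 == 0

S = ['', 'a']
assert equivalent_under('', 'aa', S, member)     # same parity class
assert not equivalent_under('', 'a', S, member)  # separated by S
```

Two prefixes are merged exactly when no extension in \(S\) separates them; enlarging \(S\) can only refine the resulting partition.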

Fig. 1. Illustration of observation tables in the L* algorithm for DFA learning (Fig. 1b) and our algorithm for DTA learning (Fig. 1d)

The L* algorithm learns the minimum DFA recognizing the target language \(\mathcal {L}_{\textrm{tgt}}\) using a 2-dimensional array called an observation table. Figure 1b illustrates observation tables. The rows and columns of an observation table are indexed with finite sets of words \(P\) and \(S\), respectively. Each cell indexed by \((p, s) \in P\times S\) shows whether \(p\cdot s\in \mathcal {L}_{\textrm{tgt}}\). The column indices \(S\) are the current extensions approximating Nerode’s congruence. The L* algorithm increases \(P\) and \(S\) until: 1) the equivalence relation defined by \(S\) converges to Nerode’s congruence and 2) \(P\) covers all the classes induced by the congruence. The equivalence between \(p, p'\in P\) under \(S\) can be checked by comparing the rows in the observation table indexed with \(p\) and \(p'\). For example, Fig. 1b shows two prefixes that are deemed equivalent under the current extensions but are distinguished once a further extension is added to \(S\). The refinement of \(P\) and \(S\) is driven by certain conditions validating the DFA construction and by the counterexamples obtained from equivalence queries.
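The effect of adding a column to \(S\) can be sketched as follows (a toy language of ours, not the one in Fig. 1b):

```python
# A toy language (ours, not the one of Fig. 1b): words ending in 'ab'.
member = lambda w: w.endswith('ab')

def row(p, S):
    """The row of prefix p in the observation table, as a tuple over S."""
    return tuple(member(p + s) for s in S)

assert row('a', ['']) == row('b', [''])            # deemed equivalent so far
assert row('a', ['', 'b']) != row('b', ['', 'b'])  # split by the new column 'b'
```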

Timed words are extensions of conventional words with real-valued dwell time between events. Timed languages, sets of timed words, are widely used to formalize real-time systems and their properties, e. g., for formal verification. Among various formalisms representing timed languages, timed automata (TAs) [4] are among the most widely used. A TA is an extension of an NFA with finitely many clock variables representing timing constraints. Figure 1c shows an example.

Despite its practical relevance, learning algorithms for TAs are only available for limited subclasses of TAs, e. g., real-time automata [6, 7], event-recording automata [15, 16], event-recording automata with unobservable reset [17], and one-clock deterministic TAs [5, 30]. Timing constraints representable by these classes are limited, e. g., by restricting the number of clock variables or by restricting the edges where a clock variable can be reset. Such restriction simplifies the inference of timing constraints in learning algorithms.

Contributions. In this paper, we propose an active learning algorithm for deterministic TAs (DTAs). The languages recognizable by DTAs are called recognizable timed languages [21]. Our strategy is as follows: first, we develop a Myhill-Nerode style characterization of recognizable timed languages; then, we extend the L* algorithm for recognizable timed languages using the similarity of the Myhill-Nerode style characterization.

Due to the continuity of dwell time in timed words, it is hard to characterize recognizable timed languages by a Nerode-style congruence between timed words. For example, for the DTA in Fig. 1c, any \(\tau , \tau ' \in [0,1)\) satisfying \(\tau < \tau '\) can be distinguished by a suitable suffix, since reading it after \(\tau \) leads to \(l_0\) while reading it after \(\tau '\) leads to \(l_1\). Therefore, such a congruence can have infinitely many classes.

Instead, we define a Nerode-style congruence between sets of timed words called elementary languages [21]. An elementary language is a timed language defined by a word with a conjunction of inequalities constraining the time difference between events. We also use an equality constraint, which we call a renaming equation, to define the congruence. Intuitively, a renaming equation bridges the time differences in an elementary language and the clock variables in a TA. We note that there can be multiple renaming equations showing the equivalence of two elementary languages.

Example 1

Let \(p_1\) and \(p_2\) be elementary languages. For the DTA in Fig. 1c, \(p_1\) and \(p_2\) are equivalent with the renaming equation \(\tau ^1_0 + \tau ^1_1 = \tau ^2_1 + \tau ^2_2\) because for any \(w_1 \in p_1\) and \(w_2 \in p_2\): 1) we reach \(l_0\) after reading either of \(w_1\) and \(w_2\) and 2) the values of \(c\) after reading \(w_1\) and \(w_2\) are \(\tau ^1_0 + \tau ^1_1\) and \(\tau ^2_1 + \tau ^2_2\), respectively.

We characterize recognizable timed languages by the finiteness of the equivalence classes defined by the above congruence. We also show that for any recognizable timed language, there is a finite set \(S\) of elementary languages such that the equivalence of any prefixes can be checked by the extensions \(S\).

By using the above congruence, we extend the L* algorithm for DTAs. The high-level idea is the same as the original L* algorithm: 1) the learner makes membership queries to obtain enough information to construct a hypothesis DTA \(\mathcal {A}_{\textrm{hyp}}\) and 2) the learner makes an equivalence query to check if \(\mathcal {A}_{\textrm{hyp}}\) recognizes the target language. The main difference is in the cells of an observation table. Since the concatenation \(p\cdot s\) of an index pair \((p, s) \in P\times S\) is not a timed word but a set of timed words, its membership is not defined as a Boolean value. Instead, we introduce the notion of symbolic membership and use it as the value of each cell of the timed observation table. Intuitively, the symbolic membership is the constraint representing the subset of \(p\cdot s\) included in \(\mathcal {L}_{\textrm{tgt}}\). Such a constraint can be constructed by finitely many (non-symbolic) membership queries.

Example 2

Figure 1d illustrates a timed observation table. The equivalence between \(p_1, p_2 \in P\) under \(S\) can be checked by comparing the cells in the rows indexed with \(p_1\) and \(p_2\) with renaming equations. In the rows indexed by \(p_1\) and \(p_2\), the constraints coincide after replacing \(\tau _0 + \tau _1\) with \(\tau _1 + \tau _2\) and vice versa. Thus, \(p_1\) and \(p_2\) are equivalent with the current extensions \(S\).

Once the learner obtains enough information, it constructs a DTA via the monoid-based representation of recognizable timed languages [21]. We show that for any recognizable timed language, our algorithm terminates and returns a DTA recognizing it. We also show that the number of necessary queries is polynomial in the size of the equivalence class defined by the Nerode-style congruence if symbolic membership queries are allowed and, otherwise, exponential in it. Moreover, if symbolic membership queries are not allowed, the number of necessary queries is at most doubly exponential in the number of clock variables and singly exponential in the number of locations of a DTA recognizing the target language. This worst-case complexity is the same as that of the one-clock DTA learning algorithm in [30].

We implemented our DTA learning algorithm in a prototype library LearnTA. Our experimental results show that it is efficient enough for some benchmarks taken from practical applications, e. g., the FDDI protocol. This suggests the practical relevance of our algorithm.

The following summarizes our contributions.

  • We characterize recognizable timed languages by a Nerode-style congruence.

  • Using the above characterization, we give an active DTA learning algorithm.

  • Our experimental results suggest its practical relevance.

Related Work. Among various characterizations of timed languages [4, 10,11,12,13, 21], the characterization by recognizability [21] is the closest to our Myhill-Nerode-style characterization. Both of them use finite sets of elementary languages for characterization. Their main difference is that [21] proposes a formalism to define a timed language by relating prefixes by a morphism, whereas we propose a technical gadget to define an equivalence relation over timed words with respect to suffixes using symbolic membership. This difference makes our definition suitable for an L*-style algorithm, where the original L* algorithm is based on Nerode’s congruence, which defines an equivalence relation over words with respect to suffixes using conventional membership.

As we have discussed so far, active TA learning [5, 15,16,17, 30] has been studied mostly for limited subclasses of TAs, where the number of clock variables or the set of clocks reset at each edge is fixed. In contrast, our algorithm infers both. Another difference is in the technical strategy. Most of the existing algorithms are related to the active learning of symbolic automata [9, 14], enhancing the languages with clock valuations. In contrast, we take a more semantic approach via the Nerode-style congruence.

Another recent direction is to use genetic algorithms to infer TAs in passive [27] or active [3] learning. This differs from our learning algorithm, which is based on a formal characterization of timed languages. Moreover, these algorithms may not converge to the correct automaton due to the heuristic nature of genetic algorithms.

2 Preliminaries

For a set \(X\), its powerset is denoted by \(\mathcal {P} ({X})\). We denote the empty sequence by \(\varepsilon \). For sets \(X\) and \(Y\), we denote their symmetric difference by \(X \triangle Y = \{x \mid x \in X \wedge x \notin Y\} \cup \{y \mid y \in Y \wedge y \notin X\}\).
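For instance, the symmetric difference coincides with Python’s built-in set operator:

```python
# Symmetric difference X △ Y on small example sets.
X, Y = {1, 2, 3}, {3, 4}
assert X ^ Y == {1, 2, 4}            # Python's X △ Y
assert X ^ Y == (X - Y) | (Y - X)    # matches the definition above
```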

2.1 Timed Words and Timed Automata

Definition 3

(timed word). For a finite alphabet \(\varSigma \), a timed word \(w\) is an alternating sequence \(\tau _0 a_1 \tau _1 a_2 \dots a_{n} \tau _{n}\) of \(\varSigma \) and \({\mathbb {R}}_{\ge 0}\). The set of timed words over \(\varSigma \) is denoted by \(\mathcal {T}(\varSigma )\). A timed language \(\mathcal {L}\subseteq \mathcal {T}(\varSigma )\) is a set of timed words.

For timed words \(w=\tau _0 a_1 \tau _1 a_2 \dots a_{n} \tau _{n}\) and \(w' = \tau '_0 a'_1 \tau '_1 a'_2 \dots a'_{n'} \tau '_{n'}\), their concatenation \(w \cdot w'\) is \(w \cdot w' = \tau _0 a_1 \tau _1 a_2 \dots a_{n} (\tau _{n} + \tau '_0) a'_1 \tau '_1 a'_2 \dots a'_{n'} \tau '_{n'}\). The concatenation is naturally extended to timed languages: for a timed word \(w\) and timed languages \(\mathcal {L}, \mathcal {L}'\), we let \(w\cdot \mathcal {L}= \{ w\cdot w_{\mathcal {L}} \mid w_{\mathcal {L}} \in \mathcal {L}\}\), \(\mathcal {L}\cdot w= \{ w_{\mathcal {L}} \cdot w\mid w_{\mathcal {L}} \in \mathcal {L}\}\), and \(\mathcal {L}\cdot \mathcal {L}' = \{ w_{\mathcal {L}} \cdot w_{\mathcal {L}'} \mid w_{\mathcal {L}} \in \mathcal {L}, w_{\mathcal {L}'} \in \mathcal {L}'\}\). For timed words \(w\) and \(w'\), \(w\) is a prefix of \(w'\) if there is a timed word \(w''\) satisfying \(w\cdot w'' = w'\). A timed language \(\mathcal {L}\) is prefix-closed if for any \(w\in \mathcal {L}\), \(\mathcal {L}\) contains all the prefixes of \(w\).
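Concatenation of timed words can be sketched under an assumed list encoding (ours) of \(\tau _0 a_1 \tau _1 \dots a_n \tau _n\) as \([\tau _0, a_1, \tau _1, \dots , a_n, \tau _n]\):

```python
# Assumed encoding (ours): τ0 a1 τ1 ... an τn as [τ0, a1, τ1, ..., an, τn].

def concat(w1, w2):
    # The trailing dwell time of w1 merges with the leading dwell time of w2.
    return w1[:-1] + [w1[-1] + w2[0]] + w2[1:]

assert concat([0.5, 'a', 0.25], [0.25, 'b', 1.0]) == [0.5, 'a', 0.5, 'b', 1.0]
```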

For a finite set \(C\) of clock variables, a clock valuation is a function \(\nu \in ({\mathbb {R}}_{\ge 0})^{C}\). We let \(\textbf{0}_{C}\) be the clock valuation satisfying \(\textbf{0}_{C}(c) = 0\) for any \(c\in C\). For \(\nu \in ({\mathbb {R}}_{\ge 0})^{C}\) and \(\tau \in {\mathbb {R}}_{\ge 0}\), we let \(\nu + \tau \) be the clock valuation satisfying \((\nu +\tau )(c)=\nu (c)+\tau \) for any \(c\in C\). For \(\nu \in ({\mathbb {R}}_{\ge 0})^{C}\) and \(\rho \subseteq C\), we let \(\nu [\rho {:}{=}0]\) be the clock valuation satisfying \((\nu [\rho {:}{=}0])(c)=0\) for \(c\in \rho \) and \((\nu [\rho {:}{=}0])(c)=\nu (c)\) for \(c\notin \rho \). We let \(\mathcal {G}_{C}\) be the set of constraints defined by a finite conjunction of inequalities \(c\bowtie d\), where \(c\in C\), \(d\in {\mathbb {N}}\), and \({\bowtie } \in \{>, \ge , \le ,<\}\). We let \(\mathcal {C}_{C}\) be the set of constraints defined by a finite conjunction of inequalities \(c\bowtie d\) or \(c- c' \bowtie d\), where \(c, c' \in C\), \(d\in {\mathbb {N}}\), and \({\bowtie } \in \{>, \ge , \le ,<\}\). We denote \(\bigwedge \emptyset \) by \(\top \). For \(\nu \in ({\mathbb {R}}_{\ge 0})^{C}\) and \(\varphi \in \mathcal {C}_{C}\cup \mathcal {G}_{C}\), we denote \(\nu \models \varphi \) if \(\nu \) satisfies \(\varphi \).
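The valuation operations \(\nu + \tau \) and \(\nu [\rho {:}{=}0]\) can be sketched as follows, encoding a valuation as a dict from clock names to values (an assumption of ours):

```python
# Clock valuations as dicts from clock names to non-negative reals (ours).

def delay(nu, tau):
    """The valuation nu + tau: every clock advances by tau."""
    return {c: v + tau for c, v in nu.items()}

def reset(nu, rho):
    """The valuation nu[rho := 0]: clocks in rho are reset to 0."""
    return {c: (0.0 if c in rho else v) for c, v in nu.items()}

nu = reset(delay({'c1': 0.0, 'c2': 0.0}, 1.5), {'c1'})
assert nu == {'c1': 0.0, 'c2': 1.5}
```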

Definition 4

(timed automaton). A timed automaton (TA) is a 7-tuple \((\varSigma ,L,l_0,C,I,\varDelta ,F)\), where: \(\varSigma \) is the finite alphabet, \(L\) is the finite set of locations, \(l_0\in L\) is the initial location, \(C\) is the finite set of clock variables, \(I:L\rightarrow \mathcal {C}_{C}\) is the invariant of each location, \(\varDelta \subseteq L\times \mathcal {G}_{C}\times (\varSigma \cup \{\varepsilon \}) \times \mathcal {P} ({C})\times L\) is the set of edges, and \(F\subseteq L\) is the accepting locations.

A TA is deterministic if: 1) for any \(a\in \varSigma \) and any distinct \((l, g, a, \rho , l'), (l, g', a, \rho ', l'') \in \varDelta \), \(g\wedge g'\) is unsatisfiable, and 2) for any \((l, g, \varepsilon , \rho , l') \in \varDelta \), the set of clock valuations satisfying \(g\wedge I ({l})\) is at most a singleton. Figure 1c shows a deterministic TA (DTA).

The semantics of a TA is defined by a timed transition system (TTS).

Definition 5

(semantics of TAs). For a TA \(\mathcal {A}=(\varSigma ,L,l_0,C,I,\varDelta ,F)\), the timed transition system (TTS) is a 4-tuple \(\mathcal {S}= (Q, q_{0}, Q_{F}, {\rightarrow })\), where: \(Q= L\times ({\mathbb {R}}_{\ge 0})^{C}\) is the set of (concrete) states, \(q_{0}= (l_0, \textbf{0}_{C})\) is the initial state, \(Q_{F}= \{ (l, \nu ) \in Q\mid l\in F\}\) is the set of accepting states, and \({\rightarrow }\subseteq Q\times Q\) is the transition relation consisting of the following.

  • For each \((l, \nu ) \in Q\) and \(\tau \in {\mathbb {R}_{>0}}\), we have \((l, \nu ){\mathop {\rightarrow }\limits ^{\tau }} (l, \nu + \tau )\) if \(\nu + \tau ' \models I ({l})\) holds for each \(\tau ' \in [0, \tau )\).

  • For each \((l, \nu ), (l', \nu ') \in Q\), \(a\in \varSigma \), and \((l,g,a,\rho ,l')\in \varDelta \), we have \((l, \nu ) {\mathop {\rightarrow }\limits ^{a}} (l', \nu ')\) if we have \(\nu \models g\) and \(\nu '= \nu [\rho {:}{=}0]\).

  • For each \((l, \nu ), (l', \nu ') \in Q\), \(\tau \in {\mathbb {R}_{>0}}\), and \((l, g, \varepsilon , \rho , l') \in \varDelta \), we have \((l, \nu ) {\mathop {\rightarrow }\limits ^{\varepsilon , \tau }} (l', \nu ' + \tau )\) if we have \(\nu \models g\), \(\nu '= \nu [\rho {:}{=}0]\), and \(\forall \tau ' \in [0, \tau ).\, \nu ' + \tau ' \models I ({l'})\).

A run of a TA \(\mathcal {A}\) is an alternating sequence \(q_0, {\rightarrow }_1,q_1,\dots , {\rightarrow }_n,q_n\) of \(q_i \in Q\) and \({\rightarrow }_i \in {\rightarrow }\) satisfying \(q_{i-1} \rightarrow _i q_{i}\) for any \(i \in \{1,2, \dots ,n\}\). A run \(q_0, {\rightarrow }_1,q_1,\dots , {\rightarrow }_n,q_n\) is accepting if \(q_n \in Q_{F}\). Given such a run, the associated timed word is the concatenation of the labels of the transitions. The timed language \(\mathcal {L}(\mathcal {A})\) of a TA \(\mathcal {A}\) is the set of timed words associated with some accepting run of \(\mathcal {A}\).
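The TTS semantics can be exercised on a toy one-clock DTA (ours, not the DTA of Fig. 1c) with the single edge \((l_0, c\le 1, a, \{c\}, l_0)\), no invariants, and accepting location \(l_0\):

```python
# A toy one-clock DTA (ours): edge (l0, c <= 1, 'a', {c}, l0), F = {l0}.
# It accepts timed words whose consecutive 'a's are at most 1 apart.

def accepts(word):                       # word = [τ0, a1, τ1, ..., an, τn]
    c = 0.0                              # the single clock, initially 0
    for i in range(0, len(word) - 1, 2):
        c += word[i]                     # delay transition
        if word[i + 1] != 'a' or c > 1:  # guard c <= 1 of the unique edge
            return False
        c = 0.0                          # reset {c}
    return True                          # l0 is accepting

assert accepts([0.5, 'a', 0.8, 'a', 0.0])
assert not accepts([0.5, 'a', 1.2, 'a', 0.0])
```

Each iteration alternates one delay transition with one discrete transition, mirroring the alternating run structure above.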

2.2 Recognizable Timed Languages

Here, we review the recognizability [21] of timed languages.

Definition 6

(timed condition). For a set \(\mathbb {T}= \{\tau _0,\tau _1,\dots ,\tau _{n}\}\) of ordered variables, a timed condition \(\varLambda \) is a finite conjunction of inequalities \(\mathbb {T}_{i,j} \bowtie d\), where \(\mathbb {T}_{i,j} = \sum _{{k = i}}^{j} \tau _k\), \({\bowtie } \in \{>, \ge , \le ,<\}\), and \(d\in {\mathbb {N}}\).

A timed condition \(\varLambda \) is simple if for each \(\mathbb {T}_{i,j}\), \(\varLambda \) contains \(d< \mathbb {T}_{i,j} < d+ 1\) or \(d\le \mathbb {T}_{i,j} \wedge \mathbb {T}_{i,j} \le d\) for some \(d\in {\mathbb {N}}\). A timed condition \(\varLambda \) is canonical if we cannot strengthen or add any inequality \(\mathbb {T}_{i,j} \bowtie d\) to \(\varLambda \) without changing its semantics.

Definition 7

(elementary language). A timed language \(\mathcal {L}\) is elementary if there are \(u = a_1,a_2,\dots ,a_n\in \varSigma ^*\) and a timed condition \(\varLambda \) over \(\{\tau _0,\tau _1,\dots ,\tau _{n}\}\) satisfying \(\mathcal {L}= \{\tau _0 a_1 \tau _1 a_2 \dots a_{n} \tau _{n} \mid \tau _0,\tau _1,\dots ,\tau _{n} \models \varLambda \}\), and the set of valuations of \(\{\tau _0,\tau _1,\dots ,\tau _{n}\}\) defined by \(\varLambda \) is bounded. We denote such \(\mathcal {L}\) by \((u, \varLambda )\). We let \(\mathcal {E} (\varSigma )\) be the set of elementary languages over \(\varSigma \).

For \(p, p' \in \mathcal {E} (\varSigma )\), \(p\) is a prefix of \(p'\) if for any \(w' \in p'\), there is a prefix \(w\in p\) of \(w'\), and for any \(w\in p\), there is \(w' \in p'\) such that \(w\) is a prefix of \(w'\). For any elementary language, the number of its prefixes is finite. For a set of elementary languages, prefix-closedness is defined based on the above definition of prefixes.

An elementary language \((u, \varLambda )\) is simple if there is a simple and canonical timed condition \(\varLambda '\) satisfying \((u, \varLambda )= (u, \varLambda ')\). We let \(\mathcal{S}\mathcal{E} (\varSigma )\) be the set of simple elementary languages over \(\varSigma \). Without loss of generality, we assume that for any \((u, \varLambda )\in \mathcal{S}\mathcal{E} (\varSigma )\), \(\varLambda \) is simple and canonical. We remark that no DTA can distinguish timed words in a simple elementary language, i. e., for any \(p\in \mathcal{S}\mathcal{E} (\varSigma )\) and any DTA \(\mathcal {A}\), we have either \(p\subseteq \mathcal {L}(\mathcal {A})\) or \(p\cap \mathcal {L}(\mathcal {A}) = \emptyset \). We can decide which case holds by taking any \(w\in p\) and checking if \(w\in \mathcal {L}(\mathcal {A})\).

Definition 8

(immediate exterior). Let \(\mathcal {L}= (u, \varLambda )\) be an elementary language. For \(a\in \varSigma \), the discrete immediate exterior \(\textrm{ext}^{a} (\mathcal {L})\) of \(\mathcal {L}\) is \(\textrm{ext}^{a} (\mathcal {L}) = (u \cdot a, \varLambda \cup \{\tau _{ |u| + 1 } = 0\})\). The continuous immediate exterior \(\textrm{ext}^{t} (\mathcal {L})\) of \(\mathcal {L}\) is \(\textrm{ext}^{t} (\mathcal {L}) = (u, \varLambda ^t)\), where \(\varLambda ^t\) is the timed condition such that each inequality \(\mathbb {T}_{i,|u|} = d\) in \(\varLambda \) is replaced with \(\mathbb {T}_{i,|u|} > d\) if such an inequality exists, and otherwise, the inequality \(\mathbb {T}_{i,|u|} < d\) in \(\varLambda \) with the smallest index i is replaced with \(\mathbb {T}_{i,|u|} = d\). The immediate exterior of \(\mathcal {L}\) is \(\textrm{ext}^{} (\mathcal {L}) = \bigcup _{a\in \varSigma }\textrm{ext}^{a} (\mathcal {L}) \cup \textrm{ext}^{t} (\mathcal {L})\).

Example 9

For a word \(u\) of length 2 and a timed condition \(\varLambda = \{\mathbb {T}_{0,0} \in (1,2) \wedge \mathbb {T}_{0,1} \in (1,2) \wedge \mathbb {T}_{0,2} \in (1,2) \wedge \mathbb {T}_{1,2} \in (0,1) \wedge \mathbb {T}_{2,2} = 0\}\), the discrete immediate exterior of \((u, \varLambda )\) is \(\textrm{ext}^{a} ((u, \varLambda )) = (u \cdot a, \varLambda \cup \{\tau _{3} = 0\})\) for each \(a \in \varSigma \), and the continuous immediate exterior is \(\textrm{ext}^{t} ((u, \varLambda )) = (u, \varLambda ^{t})\), where \(\varLambda ^{t} = \{\mathbb {T}_{0,0} \in (1,2) \wedge \mathbb {T}_{0,1} \in (1,2) \wedge \mathbb {T}_{0,2} \in (1,2) \wedge \mathbb {T}_{1,2} \in (0,1) \wedge \mathbb {T}_{2,2} > 0\}\).
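Definition 8 can also be sketched operationally, under an assumed encoding (ours) of a timed condition as a list of constraints \((i, j, {\bowtie }, d)\) meaning \(\mathbb {T}_{i,j} \bowtie d\):

```python
# Assumed encoding (ours): a timed condition over τ0..τn is a list of
# constraints (i, j, op, d) meaning T_{i,j} op d; u is a list of letters.

def ext_a(u, cond, a):
    """Discrete immediate exterior: append a and require τ_{|u|+1} = 0."""
    n = len(u)
    return u + [a], cond + [(n + 1, n + 1, '=', 0)]

def ext_t(u, cond):
    """Continuous immediate exterior, per Definition 8."""
    n = len(u)
    out = list(cond)
    eqs = [k for k, (i, j, op, d) in enumerate(cond) if j == n and op == '=']
    if eqs:                   # replace every T_{i,n} = d with T_{i,n} > d
        for k in eqs:
            i, j, _, d = out[k]
            out[k] = (i, j, '>', d)
    else:                     # tighten the '<' bound with the smallest i
        k = min((k for k, (i, j, op, d) in enumerate(cond)
                 if j == n and op == '<'), key=lambda k: cond[k][0])
        i, j, _, d = out[k]
        out[k] = (i, j, '=', d)
    return u, out

u = ['a', 'a']
assert ext_a(u, [(2, 2, '=', 0)], 'b') == (['a', 'a', 'b'],
                                           [(2, 2, '=', 0), (3, 3, '=', 0)])
assert (2, 2, '>', 0) in ext_t(u, [(1, 2, '<', 1), (2, 2, '=', 0)])[1]
assert (1, 2, '=', 1) in ext_t(u, [(1, 2, '<', 1)])[1]
```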

Definition 10

(chronometric timed language). A timed language \(\mathcal {L}\) is chronometric if there is a finite set \(\{(u_{1}, \varLambda _{1}), (u_{2}, \varLambda _{2}), \dots , (u_{m}, \varLambda _{m})\}\) of disjoint elementary languages satisfying \(\mathcal {L}= \bigcup _{i \in \{1,2, \dots ,m\}} (u_{i}, \varLambda _{i})\).

For any elementary language \(\mathcal {L}\), its immediate exterior \(\textrm{ext}^{} (\mathcal {L})\) is chronometric. We naturally extend the notion of exterior to chronometric timed languages, i. e., for a chronometric timed language \(\mathcal {L}= \bigcup _{i \in \{1,2, \dots ,m\}} (u_{i}, \varLambda _{i})\), we let \(\textrm{ext}^{} (\mathcal {L}) = \bigcup _{i \in \{1,2, \dots ,m\}} \textrm{ext}^{} ((u_{i}, \varLambda _{i}))\), which is also chronometric. For a timed word \(w=\tau _0 a_1 \tau _1 a_2 \dots a_{n} \tau _{n}\), we denote the valuation of \(\tau _0,\tau _1,\dots ,\tau _{n}\) by \(\kappa (w)\).

A chronometric relational morphism [21] relates any timed word to a timed word in a certain set \(P\), which is later used to define a timed language. Intuitively, the tuples in \(\varPhi \) specify a mapping from timed words immediately out of \(P\) to timed words in \(P\). By inductively applying it, any timed word is mapped to \(P\).

Definition 11

(chronometric relational morphism). Let \(P\) be a chronometric and prefix-closed timed language. Let \((u,\varLambda ,u{'},\varLambda {'}, R)\) be a 5-tuple such that \((u, \varLambda )\subseteq \textrm{ext}^{} (P)\), \((u{'},\varLambda {'}) \subseteq P\), and \(R\) is a finite conjunction of equations of the form \(\mathbb {T}_{i,|u|} = \mathbb {T}^{'}_{{j},{|u'|}}\), where \(i \le |u|\) and \(j \le |u'|\). For such a tuple, we let \(\llbracket { (u,\varLambda ,u{'},\varLambda {'}, R) }\rrbracket \subseteq (u, \varLambda )\times (u{'},\varLambda {'})\) be the relation such that \((w, w') \in \llbracket { (u,\varLambda ,u{'},\varLambda {'}, R) }\rrbracket \) if and only if \(\kappa (w), \kappa (w') \models R\). For a finite set \(\varPhi \) of such tuples, the chronometric relational morphism \(\llbracket { \varPhi }\rrbracket \subseteq \mathcal {T}(\varSigma )\times P\) is the relation inductively defined as follows: 1) for \(w\in P\), we have \((w, w) \in \llbracket { \varPhi }\rrbracket \); 2) for \(w\in \textrm{ext}^{} (P)\) and \(w' \in P\), we have \((w, w') \in \llbracket { \varPhi }\rrbracket \) if we have \((w, w') \in \llbracket { (u,\varLambda ,u{'},\varLambda {'}, R) }\rrbracket \) for one of the tuples \((u,\varLambda ,u{'},\varLambda {'}, R) \in \varPhi \); 3) for \(w\in \textrm{ext}^{} (P)\), \(w' \in \mathcal {T}(\varSigma )\), and \(w'' \in P\), we have \((w\cdot w', w'') \in \llbracket { \varPhi }\rrbracket \) if there is \(w''' \in \mathcal {T}(\varSigma )\) satisfying \((w, w''') \in \llbracket { \varPhi }\rrbracket \) and \((w''' \cdot w', w'') \in \llbracket { \varPhi }\rrbracket \). We also require that all \((u, \varLambda )\) in the tuples in \(\varPhi \) are disjoint and that their union is \(\textrm{ext}^{} (P) \setminus P\).

A chronometric relational morphism \(\llbracket { \varPhi }\rrbracket \) is compatible with \(F\subseteq P\) if for each tuple \((u,\varLambda ,u{'},\varLambda {'}, R)\) defining \(\llbracket { \varPhi }\rrbracket \), we have either \((u{'},\varLambda {'}) \subseteq F\) or \((u{'},\varLambda {'}) \cap F= \emptyset \).

Definition 12

(recognizable timed language). A timed language \(\mathcal {L}\) is recognizable if there is a chronometric prefix-closed set \(P\), a chronometric subset \(F\) of \(P\), and a chronometric relational morphism \(\llbracket { \varPhi }\rrbracket \subseteq \mathcal {T}(\varSigma )\times P\) compatible with \(F\) satisfying \(\mathcal {L}= \{w\mid \exists w' \in F, (w, w') \in \llbracket { \varPhi }\rrbracket \}\).

It is known that for any recognizable timed language \(\mathcal {L}\), we can construct a DTA \(\mathcal {A}\) recognizing \(\mathcal {L}\), and vice versa [21].

2.3 Distinguishing Extensions and Active DFA Learning

Most DFA learning algorithms are based on Nerode’s congruence [18]. For a (not necessarily regular) language \(\mathcal {L}\subseteq \varSigma ^*\), Nerode’s congruence \({\equiv _{\mathcal {L}}} \subseteq \varSigma ^* \times \varSigma ^*\) is the equivalence relation satisfying \(w\equiv _{\mathcal {L}} w'\) if and only if for any \(w'' \in \varSigma ^*\), we have \(w\cdot w'' \in \mathcal {L}\iff w' \cdot w'' \in \mathcal {L}\).

Generally, we cannot decide if \(w\equiv _{\mathcal {L}} w'\) by testing because it requires infinitely many membership checks. However, if \(\mathcal {L}\) is regular, there is a finite set of suffixes \(S\subseteq \varSigma ^*\) called distinguishing extensions satisfying \({\equiv _{\mathcal {L}}} = {\sim ^{S}_{\mathcal {L}}}\), where \({\sim ^{S}_{\mathcal {L}}}\) is the equivalence relation satisfying \(w\sim ^{S}_{\mathcal {L}} w'\) if and only if for any \(w'' \in S\), we have \(w\cdot w'' \in \mathcal {L}\iff w' \cdot w'' \in \mathcal {L}\). Thus, the minimum DFA recognizing \(\mathcal {L}_{\textrm{tgt}}\) can be learned byFootnote 3: i) identifying distinguishing extensions \(S\) satisfying \({\equiv _{\mathcal {L}_{\textrm{tgt}}}} = {\sim ^{S}_{\mathcal {L}_{\textrm{tgt}}}}\) and ii) constructing the minimum DFA \(\mathcal {A}\) corresponding to \({\sim ^{S}_{\mathcal {L}_{\textrm{tgt}}}}\).

The L* algorithm [8] is an algorithm to learn the minimum DFA \(\mathcal {A}_{\textrm{hyp}}\) recognizing the target regular language \(\mathcal {L}_{\textrm{tgt}}\) with finitely many membership and equivalence queries to the teacher. In a membership query, the learner asks if \(w\in \varSigma ^*\) belongs to the target language \(\mathcal {L}_{\textrm{tgt}}\), i. e., \(w\in \mathcal {L}_{\textrm{tgt}}\). In an equivalence query, the learner asks if the hypothesis DFA \(\mathcal {A}_{\textrm{hyp}}\) recognizes the target language \(\mathcal {L}_{\textrm{tgt}}\), i. e., \(\mathcal {L}(\mathcal {A}_{\textrm{hyp}}) = \mathcal {L}_{\textrm{tgt}}\), where \(\mathcal {L}(\mathcal {A}_{\textrm{hyp}})\) is the language of the hypothesis DFA \(\mathcal {A}_{\textrm{hyp}}\). When we have \(\mathcal {L}(\mathcal {A}_{\textrm{hyp}}) \ne \mathcal {L}_{\textrm{tgt}}\), the teacher returns a counterexample \( cex \in \mathcal {L}(\mathcal {A}_{\textrm{hyp}}) \triangle \mathcal {L}_{\textrm{tgt}}\). The information obtained via queries is stored in a 2-dimensional array called an observation table. See Fig. 1b for an illustration. For finite index sets \(P, S\subseteq \varSigma ^*\), for each pair \((p, s) \in (P\cup P\cdot \varSigma ) \times S\), the observation table stores whether \(p\cdot s\in \mathcal {L}_{\textrm{tgt}}\). \(S\) is the current candidate for the distinguishing extensions, and \(P\) represents \(\varSigma ^* / {\sim ^{S}_{\mathcal {L}_{\textrm{tgt}}}}\). Since \(P\) and \(S\) are finite, one can fill the observation table with finitely many membership queries.

Algorithm 1 outlines an L*-style algorithm. We start from \(P = S = \{\varepsilon \}\) and incrementally increase them. To construct a hypothesis DFA \(\mathcal {A}_{\textrm{hyp}}\), the observation table must be closed and consistent. An observation table is closed if, for each \(p\in P\cdot \varSigma \), there is \(p' \in P\) satisfying \(p\sim ^{S}_{\mathcal {L}_{\textrm{tgt}}} p'\). An observation table is consistent if, for any \(p, p' \in P\cup P\cdot \varSigma \) and \(a\in \varSigma \), \(p\sim ^{S}_{\mathcal {L}_{\textrm{tgt}}} p'\) implies \(p\cdot a\sim ^{S}_{\mathcal {L}_{\textrm{tgt}}} p' \cdot a\).
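The closedness and consistency checks can be sketched in the toy DFA setting (the language and names are ours, not the paper's):

```python
# Toy setting (ours): target language = words over {'a','b'} ending in 'ab'.
member = lambda w: w.endswith('ab')
SIGMA = ['a', 'b']

def row(p, S):
    return tuple(member(p + s) for s in S)

def is_closed(P, S):
    """Every one-letter extension of P must match some row of P."""
    rows_P = {row(p, S) for p in P}
    return all(row(p + a, S) in rows_P for p in P for a in SIGMA)

def is_consistent(P, S):
    """Prefixes with equal rows must stay equal after any letter."""
    return all(row(p1 + a, S) == row(p2 + a, S)
               for p1 in P for p2 in P if row(p1, S) == row(p2, S)
               for a in SIGMA)

assert is_closed([''], [''])               # trivially closed with S = {ε}
assert not is_closed([''], ['', 'b'])      # row of 'a' is new: P must grow
assert not is_consistent(['', 'a'], [''])  # ε and 'a' split after reading 'b'
```

A closedness failure adds the offending prefix to \(P\); a consistency failure adds a separating suffix to \(S\), matching the refinement loop of Algorithm 1.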

Once the observation table becomes closed and consistent, the learner constructs a hypothesis DFA \(\mathcal {A}_{\textrm{hyp}}\) and checks if \(\mathcal {L}(\mathcal {A}_{\textrm{hyp}}) = \mathcal {L}_{\textrm{tgt}}\) by an equivalence query. If \(\mathcal {L}(\mathcal {A}_{\textrm{hyp}}) = \mathcal {L}_{\textrm{tgt}}\) holds, \(\mathcal {A}_{\textrm{hyp}}\) is the resulting DFA. Otherwise, the teacher returns \( cex \in \mathcal {L}(\mathcal {A}_{\textrm{hyp}}) \triangle \mathcal {L}_{\textrm{tgt}}\), which is used to refine the observation table. There are several variants of the refinement. In the L* algorithm, all the prefixes of \( cex \) are added to \(P\). In the Rivest-Schapire algorithm [20, 25], an extension \(s\) strictly refining \(S\) is obtained by an analysis of \( cex \), and such \(s\) is added to \(S\).

3 A Myhill-Nerode Style Characterization of Recognizable Timed Languages with Elementary Languages

Unlike the case of regular languages, no finite set of timed words can correctly distinguish recognizable timed languages, due to the infinitely many possible dwell times in timed words. Instead, we use a finite set of elementary languages to define a Nerode-style congruence. To define this congruence, we extend the notion of membership to elementary languages.

Definition 13

(symbolic membership). For a timed language \(\mathcal {L}\subseteq \mathcal {T}(\varSigma )\) and an elementary language \((u, \varLambda )\in \mathcal {E} (\varSigma )\), the symbolic membership \(\texttt{mem}^{\texttt{sym}}_{\mathcal {L}}((u, \varLambda ))\) of \((u, \varLambda )\) to \(\mathcal {L}\) is the strongest constraint such that for any \(w\in (u, \varLambda )\), we have \(w\in \mathcal {L}\) if and only if \(\kappa (w) \models \texttt{mem}^{\texttt{sym}}_{\mathcal {L}}((u, \varLambda ))\).

We discuss how to obtain symbolic membership in Sect. 4.5. We define a Nerode-style congruence using symbolic membership. A naive idea is to distinguish two elementary languages by the equivalence of their symbolic membership. However, this does not capture the semantics of TAs. For example, for the DTA \(\mathcal {A}\) in Fig. 1c, there are prefixes that no extension by a timed word \(w\) can distinguish with respect to \(\mathcal {L}(\mathcal {A})\), while they have different symbolic membership. This is because symbolic membership distinguishes the position in timed words where each clock variable is reset, which must be ignored. We use renaming equations to abstract such positional information in symbolic membership. Note that \(\mathbb {T}_{i,n} = \sum _{{k = i}}^{n} \tau _k\) corresponds to the value of the clock variable reset at \(\tau _i\).

Definition 14

(renaming equation). Let \(\mathbb {T}= \{\tau _0,\tau _1,\dots ,\tau _{n}\}\) and \(\mathbb {T}' = \{\tau ^{'}_0,\tau ^{'}_1,\dots ,\tau ^{'}_{n^{'}}\}\) be sets of ordered variables. A renaming equation \(R\) over \(\mathbb {T}\) and \(\mathbb {T}'\) is a finite conjunction of equations of the form \(\mathbb {T}_{i,n} = \mathbb {T}^{'}_{{i'},{n'}}\), where \(i \in \{0, 1, \dots , n\}\), \(i' \in \{0, 1, \dots , n'\}\), \(\mathbb {T}_{i,n} = \sum _{{k = i}}^{n} \tau _k\) and \(\mathbb {T}^{'}_{{i'},{n'}} = \sum _{{k = i'}}^{n'} \tau '_k\).
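Checking a renaming equation on concrete duration valuations can be sketched as follows (encoding a valuation \(\kappa (w)\) as the list \([\tau _0, \tau _1, \dots , \tau _n]\); an assumption of ours):

```python
# Assumed encoding (ours): a valuation κ(w) as the list [τ0, τ1, ..., τn].

def t_sum(taus, i):
    """T_{i,n} = τ_i + ... + τ_n, the value of the clock reset at τ_i."""
    return sum(taus[i:])

def satisfies(taus1, taus2, eqs):
    """Check a renaming equation: a conjunction of T_{i,n} = T'_{j,n'}."""
    return all(abs(t_sum(taus1, i) - t_sum(taus2, j)) < 1e-9
               for i, j in eqs)

# E.g., an equation T_{1,1} = T'_{2,2} as in Example 16:
assert satisfies([0.5, 0.25], [0.5, 0.5, 0.25], [(1, 2)])
assert not satisfies([0.5, 0.75], [0.5, 0.5, 0.25], [(1, 2)])
```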

Definition 15

(\(\sim ^{S}_{\mathcal {L}}\)). Let \(\mathcal {L}\subseteq \mathcal {T}(\varSigma )\) be a timed language, let \((u, \varLambda ), (u{'},\varLambda {'}), (u{''},\varLambda {''}) \in \mathcal {E} (\varSigma )\) be elementary languages, and let \(R\) be a renaming equation over \(\mathbb {T}\) and \(\mathbb {T}'\), where \(\mathbb {T}\) and \(\mathbb {T}'\) are the variables of \(\varLambda \) and \(\varLambda '\), respectively. We let \((u, \varLambda )\sqsubseteq ^{(u{''},\varLambda {''}), R}_{\mathcal {L}} (u{'},\varLambda {'})\) if we have the following: 1) for any \(w\in (u, \varLambda )\), there is \(w' \in (u{'},\varLambda {'})\) satisfying \(\kappa (w), \kappa (w')\models R\); 2) \(\texttt{mem}^{\texttt{sym}}_{\mathcal {L}}((u, \varLambda )\cdot (u{''},\varLambda {''})) \wedge R \wedge \varLambda '\) is equivalent to \(\texttt{mem}^{\texttt{sym}}_{\mathcal {L}}((u{'},\varLambda {'}) \cdot (u{''},\varLambda {''})) \wedge R\wedge \varLambda \). We let \((u, \varLambda )\sim ^{(u{''},\varLambda {''}), R}_{\mathcal {L}} (u{'},\varLambda {'})\) if we have \((u, \varLambda )\sqsubseteq ^{(u{''},\varLambda {''}), R}_{\mathcal {L}} (u{'},\varLambda {'})\) and \((u{'},\varLambda {'}) \sqsubseteq ^{(u{''},\varLambda {''}), R}_{\mathcal {L}} (u,\varLambda )\). Let \(S\subseteq \mathcal {E} (\varSigma )\). We let \((u, \varLambda )\sim ^{S, R}_{\mathcal {L}} (u{'},\varLambda {'})\) if for any \((u{''},\varLambda {''}) \in S\), we have \((u, \varLambda )\sim ^{(u{''},\varLambda {''}), R}_{\mathcal {L}} (u{'},\varLambda {'})\). We let \((u, \varLambda )\sim ^{S}_{\mathcal {L}} (u{'},\varLambda {'})\) if \((u, \varLambda )\sim ^{S, R}_{\mathcal {L}} (u{'},\varLambda {'})\) for some renaming equation \(R\).

Example 16

Let \(\mathcal {A}\) be the DTA in Fig. 1c and let \((u, \varLambda )\), \((u{'},\varLambda {'})\), and \((u{''},\varLambda {''})\) be elementary languages, where , , . We have \(\texttt{mem}^{\texttt{sym}}_{\mathcal {L}(\mathcal {A})}((u, \varLambda )\cdot (u{''},\varLambda {''})) = \varLambda \wedge \varLambda '' \wedge \tau _1 + \tau ''_0 \le 1\) and \(\texttt{mem}^{\texttt{sym}}_{\mathcal {L}(\mathcal {A})}((u{'},\varLambda {'}) \cdot (u{''},\varLambda {''})) = \varLambda ' \wedge \varLambda '' \wedge \tau '_2 + \tau ''_0 \le 1\). Therefore, for the renaming equation \(\mathbb {T}_{1,1} = \mathbb {T}^{'}_{{2},{2}}\), we have \((u, \varLambda )\sim ^{(u{''},\varLambda {''}), \mathbb {T}_{1,1} = \mathbb {T}^{'}_{{2},{2}}}_{\mathcal {L}} (u{'},\varLambda {'})\).

An algorithm to check if \((u, \varLambda )\sim ^{S}_{\mathcal {L}} (u{'},\varLambda {'})\) is shown in Appendix B.2 of [29].

Intuitively, \((u, \varLambda )\sqsubseteq ^{s, R}_{\mathcal {L}} (u{'},\varLambda {'})\) shows that any \(w\in (u, \varLambda )\) can be “simulated” by some \(w' \in (u{'},\varLambda {'})\) with respect to \(s\) and \(R\). Such intuition is formalized as follows.

Theorem 17

For any \(\mathcal {L}\subseteq \mathcal {T}(\varSigma )\) and \((u, \varLambda ), (u{'},\varLambda {'}), (u{''},\varLambda {''}) \in \mathcal {E} (\varSigma )\) satisfying \((u, \varLambda )\sqsubseteq ^{(u{''},\varLambda {''})}_{\mathcal {L}} (u{'},\varLambda {'})\), for any \(w\in (u, \varLambda )\), there is \(w' \in (u{'},\varLambda {'})\) such that for any \(w'' \in (u{''},\varLambda {''})\), \(w\cdot w'' \in \mathcal {L}\iff w' \cdot w'' \in \mathcal {L}\) holds.    \(\square \)

By \(\bigcup _{(u, \varLambda )\in \mathcal {E} (\varSigma )} (u, \varLambda )= \mathcal {T}(\varSigma )\), we have the following as a corollary.

Corollary 18

For any timed language \(\mathcal {L}\subseteq \mathcal {T}(\varSigma )\) and for any elementary languages \((u, \varLambda ), (u{'},\varLambda {'}) \in \mathcal {E} (\varSigma )\), \((u, \varLambda )\sim ^{\mathcal {E} (\varSigma )}_{\mathcal {L}} (u{'},\varLambda {'})\) implies the following.

  • For any \(w\in (u, \varLambda )\), there is \(w' \in (u{'},\varLambda {'})\) such that for any \(w'' \in \mathcal {T}(\varSigma )\), we have \(w\cdot w'' \in \mathcal {L}\iff w' \cdot w'' \in \mathcal {L}\).

  • For any \(w' \in (u{'},\varLambda {'})\), there is \(w\in (u, \varLambda )\) such that for any \(w'' \in \mathcal {T}(\varSigma )\), we have \(w\cdot w'' \in \mathcal {L}\iff w' \cdot w'' \in \mathcal {L}\).    \(\square \)

The following characterizes recognizable timed languages with \(\sim ^{\mathcal {E} (\varSigma )}_{\mathcal {L}}\).

Theorem 19

(Myhill-Nerode style characterization). A timed language \(\mathcal {L}\) is recognizable if and only if the quotient set \(\mathcal{S}\mathcal{E} (\varSigma )/ {\sim ^{\mathcal {E} (\varSigma )}_{\mathcal {L}}}\) is finite.    \(\square \)

By Theorem 19, we always have a finite set \(S\) of distinguishing extensions.

Theorem 20

For any recognizable timed language \(\mathcal {L}\), there is a finite set \(S\) of elementary languages satisfying \({\sim ^{\mathcal {E} (\varSigma )}_{\mathcal {L}}} = {\sim ^{S}_{\mathcal {L}}}\).    \(\square \)

4 Active Learning of Deterministic Timed Automata

We present our L*-style active learning algorithm for DTAs, based on the Nerode-style congruence introduced in Sect. 3. We let \(\mathcal {L}_{\textrm{tgt}}\) be the target timed language in learning.

For simplicity, we first present our learning algorithm with a smart teacher answering the following three kinds of queries: membership query \(\texttt{mem}_{\mathcal {L}_{\textrm{tgt}}}(w)\) asking whether \(w\in \mathcal {L}_{\textrm{tgt}}\), symbolic membership query asking \(\texttt{mem}^{\texttt{sym}}_{\mathcal {L}_{\textrm{tgt}}}({(u, \varLambda )})\), and equivalence query \(\texttt{eq}_{\mathcal {L}_{\textrm{tgt}}}(\mathcal {A}_{\textrm{hyp}})\) asking whether \(\mathcal {L}(\mathcal {A}_{\textrm{hyp}}) = \mathcal {L}_{\textrm{tgt}}\). If \(\mathcal {L}(\mathcal {A}_{\textrm{hyp}}) = \mathcal {L}_{\textrm{tgt}}\), \(\texttt{eq}_{\mathcal {L}_{\textrm{tgt}}}(\mathcal {A}_{\textrm{hyp}}) = \top \), and otherwise, \(\texttt{eq}_{\mathcal {L}_{\textrm{tgt}}}(\mathcal {A}_{\textrm{hyp}})\) is a timed word \( cex \in \mathcal {L}(\mathcal {A}_{\textrm{hyp}}) \triangle \mathcal {L}_{\textrm{tgt}}\). Later in Sect. 4.5, we show how to answer a symbolic membership query with finitely many membership queries. Our task is to construct a DTA \(\mathcal {A}\) satisfying \(\mathcal {L}(\mathcal {A}) = \mathcal {L}_{\textrm{tgt}}\) with finitely many queries.

4.1 Successors of Simple Elementary Languages

Similarly to the L* algorithm in Sect. 2.3, we learn a DTA with an observation table. Reflecting the extension of the underlying congruence, we use sets of simple elementary languages as the indices. To define an analogue of the extensions \(P\cup (P\cdot \varSigma )\) in the L* algorithm, we introduce continuous and discrete successors for simple elementary languages, which are inspired by regions [4]. We note that immediate exteriors do not work for this purpose. For example, for and , we have \(w\in (u, \varLambda )\) and , but there is no \(t > 0\) satisfying \(w\cdot t \in \textrm{ext}^{t} ((u, \varLambda ))\).

For any \((u, \varLambda )\in \mathcal{S}\mathcal{E} (\varSigma )\), we let \(\varTheta _{(u, \varLambda )}\) be the total order over 0 and the fractional parts \(\textrm{frac}(\mathbb {T}_{0,n}),\textrm{frac}(\mathbb {T}_{1,n}),\dots ,\textrm{frac}(\mathbb {T}_{n,n})\) of \(\mathbb {T}_{0,n},\mathbb {T}_{1,n},\dots ,\mathbb {T}_{n,n}\). Such an order is uniquely defined because \(\varLambda \) is simple and canonical (Proposition 36 of [29]).

Definition 21

(successor). Let \(p= (u, \varLambda )\in \mathcal{S}\mathcal{E} (\varSigma )\) be a simple elementary language. The discrete successor \(\textrm{succ}^{a} (p)\) of \(p\) is \(\textrm{succ}^{a} (p) = (u \cdot a, \varLambda \wedge \tau _{n+1} = 0)\). The continuous successor \(\textrm{succ}^{t} (p)\) of \(p\) is \(\textrm{succ}^{t} (p) = (u, \varLambda ^t)\), where \(\varLambda ^{t}\) is defined as follows: if there is an equation \(\mathbb {T}_{i,n} = d\) in \(\varLambda \), all such equations are replaced with \(\mathbb {T}_{i,n} \in (d, d+ 1)\); otherwise, for each greatest \(\mathbb {T}_{i,n}\) in terms of \(\varTheta _{(u, \varLambda )}\), we replace \(\mathbb {T}_{i,n} \in (d, d+ 1)\) with \(\mathbb {T}_{i,n} = d+1\). We let \(\textrm{succ}^{} (p) = \bigcup _{a\in \varSigma }\textrm{succ}^{a} (p) \cup \textrm{succ}^{t} (p)\). For \(P\subseteq \mathcal{S}\mathcal{E} (\varSigma )\), we let \(\textrm{succ}^{} (P) = \bigcup _{p\in P} \textrm{succ}^{} (p)\).

Example 22

Let , \(\varLambda = \{\mathbb {T}_{0,0} \in (1,2) \wedge \mathbb {T}_{0,1} \in (1,2) \wedge \mathbb {T}_{0,2} \in (1,2) \wedge \mathbb {T}_{1,1} \in (0,1) \wedge \mathbb {T}_{1,2} \in (0,1) \wedge \mathbb {T}_{2,2} = 0\}\). The order \(\varTheta _{(u, \varLambda )}\) is such that \(0 = \textrm{frac}(\mathbb {T}_{2,2})< \textrm{frac}(\mathbb {T}_{1,2}) < \textrm{frac}(\mathbb {T}_{0,2})\). The continuous successor of \((u, \varLambda )\) is \(\textrm{succ}^{t} ((u, \varLambda )) = (u, \varLambda ^{t})\), where \(\varLambda ^{t} = \{\mathbb {T}_{0,0} \in (1,2) \wedge \mathbb {T}_{0,1} \in (1,2) \wedge \mathbb {T}_{0,2} \in (1,2) \wedge \mathbb {T}_{1,1} \in (0,1) \wedge \mathbb {T}_{1,2} \in (0,1) \wedge \mathbb {T}_{2,2} \in (0,1)\}\). The continuous successor of \((u, \varLambda ^{t})\) is \(\textrm{succ}^{t} ((u, \varLambda ^{t})) = (u, \varLambda ^{ tt })\), where \(\varLambda ^{ tt } = \{\mathbb {T}_{0,0} \in (1,2) \wedge \mathbb {T}_{0,1} \in (1,2) \wedge \mathbb {T}_{0,2} = 2 \wedge \mathbb {T}_{1,1} \in (0,1) \wedge \mathbb {T}_{1,2} \in (0,1) \wedge \mathbb {T}_{2,2} \in (0,1)\}\).
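The case analysis in Definition 21 can be made concrete with a small sketch. The encoding of a simple timed condition as a dictionary and the precomputed order \(\varTheta _{(u, \varLambda )}\) passed as a list are our assumptions; ties in the fractional order are not handled here.

```python
def succ_t(cond, frac_order):
    """Continuous successor succ^t (Definition 21) of a simple timed
    condition.  cond maps an index pair (i, j) for T_{i,j} to either
    ('eq', d) or ('in', d, d+1); frac_order lists the indices
    (0, n), ..., (n, n) by increasing fractional part, i.e., the order
    Theta_{(u, Lambda)} (assumed given, cf. Proposition 36 of [29])."""
    new = dict(cond)
    points = [k for k in frac_order if cond[k][0] == 'eq']
    if points:
        # Some clock value sits exactly on an integer: letting time
        # elapse moves all such values into the open interval above.
        for k in points:
            d = cond[k][1]
            new[k] = ('in', d, d + 1)
    else:
        # Otherwise, the clock value with the greatest fractional part
        # reaches the next integer first.
        k = frac_order[-1]
        _, d, d1 = cond[k]
        new[k] = ('eq', d1)
    return new
```

On the condition \(\varLambda \) of Example 22, with \(\varTheta \) given by \(\textrm{frac}(\mathbb {T}_{2,2})< \textrm{frac}(\mathbb {T}_{1,2}) < \textrm{frac}(\mathbb {T}_{0,2})\), applying the function twice reproduces \(\varLambda ^{t}\) and \(\varLambda ^{ tt }\).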


4.2 Timed Observation Table for Active DTA Learning

We extend the observation table with (simple) elementary languages and symbolic membership to learn a recognizable timed language.

Definition 23

(timed observation table). A timed observation table is a 3-tuple \((P, S, T)\) such that: \(P\) is a prefix-closed finite set of simple elementary languages, \(S\) is a finite set of elementary languages, and \(T\) is a function mapping \((p, s) \in (P\cup \textrm{succ}^{} (P)) \times S\) to the symbolic membership \(\texttt{mem}^{\texttt{sym}}_{\mathcal {L}_{\textrm{tgt}}}(p\cdot s)\).

Figure 2 illustrates timed observation tables: each cell indexed by \((p, s)\) shows the symbolic membership \(\texttt{mem}^{\texttt{sym}}_{\mathcal {L}_{\textrm{tgt}}}(p\cdot s)\). For timed observation tables, we extend the notions of closedness and consistency using the relation \(\sim ^{S}_{\mathcal {L}_{\textrm{tgt}}}\) introduced in Sect. 3. We note that consistency is defined only for discrete successors. This is because a timed observation table does not always become “consistent” for continuous successors. See Appendix C of [29] for a detailed discussion. We also require exterior-consistency since we construct an exterior from a successor.


Definition 24

(closedness, consistency, exterior-consistency, cohesion). Let \(O = (P, S, T)\) be a timed observation table. \(O\) is closed if, for each \(p\in \textrm{succ}^{} (P) \setminus P\), there is \(p' \in P\) satisfying \(p\sim ^{S}_{\mathcal {L}_{\textrm{tgt}}} p'\). \(O\) is consistent if, for each \(p, p' \in P\) and for each \(a\in \varSigma \), \(p\sim ^{S}_{\mathcal {L}_{\textrm{tgt}}} p'\) implies \(\textrm{succ}^{a} (p) \sim ^{S}_{\mathcal {L}_{\textrm{tgt}}} \textrm{succ}^{a} (p')\). \(O\) is exterior-consistent if for any \(p\in P\), \(\textrm{succ}^{t} (p) \notin P\) implies \(\textrm{succ}^{t} (p) \subseteq \textrm{ext}^{t} (p)\). \(O\) is cohesive if it is closed, consistent, and exterior-consistent.
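The closedness condition of Definition 24 amounts to searching for a successor row with no equivalent row in \(P\). A minimal sketch, assuming an oracle `equiv` deciding \(\sim ^{S}_{\mathcal {L}_{\textrm{tgt}}}\) (in the paper, this relation is decided via symbolic membership and renaming equations, Appendix B.2 of [29]); the interface is illustrative.

```python
def closedness_witness(P, successors, equiv):
    """Closedness check from Definition 24: every row in succ(P) \\ P
    must be equivalent (under the supplied callback `equiv`) to some row
    in P.  Returns a violating row, or None if the table is closed."""
    for p in successors:
        if p in P:
            continue
        if not any(equiv(p, q) for q in P):
            return p  # moving p into P restores closedness
    return None
```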

From a cohesive timed observation table \((P, S, T)\), we can construct a DTA as outlined in Algorithm 2. We construct a DTA via a recognizable timed language. The main ideas are as follows: 1) we approximate \(\sim ^{\mathcal {E} (\varSigma ), R}_{\mathcal {L}_{\textrm{tgt}}}\) by \(\sim ^{S, R}_{\mathcal {L}_{\textrm{tgt}}}\); 2) we decide the equivalence class of \(\textrm{ext}^{} (p) \in \textrm{ext}^{} (P) \setminus P\) in \(\mathcal {E} (\varSigma )\) from successors. Notice that there can be multiple renaming equations \(R\) showing \(\sim ^{S, R}_{\mathcal {L}_{\textrm{tgt}}}\). We use one of them found by an exhaustive search in Appendix B.2 of [29].

The DTA obtained by MakeDTA is faithful to the timed observation table in rows, i. e., for any \(p\in P\cup \textrm{succ}^{} (P)\), \(\mathcal {L}_{\textrm{tgt}}\cap p= \mathcal {L}(\mathcal {A}_{\textrm{hyp}}) \cap p\). However, unlike in the L* algorithm, this does not hold for each cell, i. e., there may be \(p\in P\cup \textrm{succ}^{} (P)\) and \(s\in S\) satisfying \(\mathcal {L}_{\textrm{tgt}}\cap (p\cdot s) \ne \mathcal {L}(\mathcal {A}_{\textrm{hyp}}) \cap (p\cdot s)\). This is because we do not (and actually cannot) enforce consistency for continuous successors. See Appendix C of [29] for a discussion. Nevertheless, this does not affect the correctness of our algorithm, partly due to Theorem 26.

Theorem 25

(row faithfulness). For any cohesive timed observation table \((P, S, T)\), for any \(p\in P\cup \textrm{succ}^{} (P)\), \(\mathcal {L}_{\textrm{tgt}}\cap p= \mathcal {L}(\texttt {MakeDTA}(P, S, T)) \cap p\) holds.    \(\square \)

Theorem 26

For any cohesive timed observation table \((P, S, T)\), \(\sim ^{S}_{\mathcal {L}_{\textrm{tgt}}} = \sim ^{\mathcal {E} (\varSigma )}_{\mathcal {L}_{\textrm{tgt}}}\) implies \(\mathcal {L}_{\textrm{tgt}}= \mathcal {L}(\texttt {MakeDTA}(P, S, T))\).    \(\square \)


4.3 Counterexample Analysis

We analyze the counterexample \( cex \) obtained by an equivalence query to refine the set \(S\) of suffixes in a timed observation table. Our analysis, outlined in Algorithm 3, is inspired by the Rivest-Schapire algorithm [20, 25]. The idea is to reduce the counterexample \( cex \) using the mapping defined by the congruence \(\sim ^{S}_{\mathcal {L}_{\textrm{tgt}}}\) (lines 5–7), much like \(\varPhi \) in recognizable timed languages, and to find a suffix \(s\) strictly refining \(S\) (line 9), i. e., satisfying \(p\sim ^{S}_{\mathcal {L}_{\textrm{tgt}}} p'\) and \(p{\not \sim }^{S\cup \{s\}}_{\mathcal {L}_{\textrm{tgt}}} p'\) for some \(p\in \textrm{succ}^{} (P)\) and \(p' \in P\).

By definition of \( cex \), we have \( cex = w_0 \in \mathcal {L}_{\textrm{tgt}}\triangle \mathcal {L}(\mathcal {A}_{\textrm{hyp}})\). By Theorem 25, \(w_n \not \in \mathcal {L}_{\textrm{tgt}}\triangle \mathcal {L}(\mathcal {A}_{\textrm{hyp}})\) holds, where n is the final value of i. By construction of \(\mathcal {A}_{\textrm{hyp}}\) and \(w_i\), for any \(i \in \{1, 2, \dots , n\}\), we have \(w_0 \in \mathcal {L}(\mathcal {A}_{\textrm{hyp}}) \iff w_i \in \mathcal {L}(\mathcal {A}_{\textrm{hyp}})\). Therefore, there is \(i \in \{1, 2, \dots , n\}\) satisfying \(w_{i-1} \in \mathcal {L}_{\textrm{tgt}}\triangle \mathcal {L}(\mathcal {A}_{\textrm{hyp}})\) and \(w_{i} \not \in \mathcal {L}_{\textrm{tgt}}\triangle \mathcal {L}(\mathcal {A}_{\textrm{hyp}})\). For such i, since we have \(w_{i-1} = w'_i \cdot w''_i \in \mathcal {L}_{\textrm{tgt}}\triangle \mathcal {L}(\mathcal {A}_{\textrm{hyp}})\), \(w_i = \overline{w}_i \cdot w''_i \not \in \mathcal {L}_{\textrm{tgt}}\triangle \mathcal {L}(\mathcal {A}_{\textrm{hyp}})\), and \(\kappa (w'_i), \kappa (\overline{w}_i) \models R_i\), such \(w''_i\) is a witness of \(p'_i {\not \sim }^{\mathcal {E} (\varSigma ), R_i}_{\mathcal {L}_{\textrm{tgt}}} p_i\). Therefore, \(S\) can be refined by the simple elementary language \(s\in \mathcal{S}\mathcal{E} (\varSigma )\) including \(w''_i\).
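The search for the index i above can be organized as a binary search over the decomposition of \( cex \), which is what yields the \(\lceil \log J \rceil \) bound on membership queries in Sect. 4.6. A sketch, with an assumed oracle `in_sym_diff` telling whether \(w_i \in \mathcal {L}_{\textrm{tgt}}\triangle \mathcal {L}(\mathcal {A}_{\textrm{hyp}})\) (each test costs one membership query, since membership in \(\mathcal {L}(\mathcal {A}_{\textrm{hyp}})\) is computable without the teacher); the helper name is ours.

```python
def find_refining_index(in_sym_diff, n):
    """Find i with w_{i-1} in the symmetric difference and w_i not in
    it, in O(log n) tests (Rivest-Schapire style).  in_sym_diff holds
    at 0 (the counterexample) and fails at n (by Theorem 25)."""
    lo, hi = 0, n  # invariant: in_sym_diff(lo) and not in_sym_diff(hi)
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if in_sym_diff(mid):
            lo = mid
        else:
            hi = mid
    return hi  # the suffix w''_hi witnesses the refinement of S
```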

4.4 L*-Style Learning Algorithm for DTAs

Algorithm 4 outlines our active DTA learning algorithm. At line 1, we initialize the timed observation table with . In the loop in lines 2–15, we refine the timed observation table until the hypothesis DTA \(\mathcal {A}_{\textrm{hyp}}\) recognizes the target language \(\mathcal {L}_{\textrm{tgt}}\), which is checked by equivalence queries. The refinement finishes when the equivalence relation \(\sim ^{S}_{\mathcal {L}_{\textrm{tgt}}}\) defined by the suffixes \(S\) converges to \(\sim ^{\mathcal {E} (\varSigma )}_{\mathcal {L}_{\textrm{tgt}}}\) and the prefixes \(P\) cover \(\mathcal{S}\mathcal{E} (\varSigma )/ {\sim ^{\mathcal {E} (\varSigma )}_{\mathcal {L}_{\textrm{tgt}}}}\).

In the loop in lines 3–11, we make the timed observation table cohesive. If the timed observation table is not closed, we move the incompatible row in \(\textrm{succ}^{} (P) \setminus P\) to \(P\) (line 5). If the timed observation table is inconsistent, we concatenate an event \(a\in \varSigma \) in front of some of the suffixes in \(S\) (line 8). If the timed observation table is not exterior-consistent, we move the boundary \(\textrm{succ}^{t} (p) \in \textrm{succ}^{t} (P)\setminus P\) satisfying \(\textrm{succ}^{t} (p) \nsubseteq \textrm{ext}^{t} (p)\) to \(P\) (line 10). Once we obtain a cohesive timed observation table, we construct a DTA \(\mathcal {A}_{\textrm{hyp}}= \texttt {MakeDTA}{} \texttt {(}{P, S, T}{} \texttt {)}\) and make an equivalence query (line 12). If we have \(\mathcal {L}(\mathcal {A}_{\textrm{hyp}}) = \mathcal {L}_{\textrm{tgt}}\), we return \(\mathcal {A}_{\textrm{hyp}}\). Otherwise, we obtain a timed word \( cex \) witnessing the difference between the language of the hypothesis DTA \(\mathcal {A}_{\textrm{hyp}}\) and the target language \(\mathcal {L}_{\textrm{tgt}}\), and we refine the timed observation table using Algorithm 3.

Fig. 2. Timed observation tables \(O_1, O_2, O_3\), and the DTAs \(\mathcal {A}_{\textrm{hyp}}^1\) and \(\mathcal {A}_{\textrm{hyp}}^3\) made from \(O_1\) and \(O_3\), respectively. In \(O_2\) and \(O_3\), we only show the constraints that are non-trivial from \(p\) and \(s\). The DTAs are simplified without changing their languages. The use of clock assignments, which does not change the expressiveness, is from [21].

Example 27

Let \(\mathcal {L}_{\textrm{tgt}}\) be the timed language recognized by the DTA in Fig. 1c. We start from and . Figure 2a shows the initial timed observation table \(O_1\). Since the timed observation table \(O_1\) in Fig. 2a is cohesive, we construct a hypothesis DTA \(\mathcal {A}_{\textrm{hyp}}^1\). The hypothesis recognizable timed language \((P_1, F_1, \varPhi _1)\) is such that and . Figure 2b shows the first hypothesis DTA \(\mathcal {A}_{\textrm{hyp}}^1\).

We have \(\mathcal {L}(\mathcal {A}_{\textrm{hyp}}^1) \ne \mathcal {L}_{\textrm{tgt}}\), and the learner obtains a counterexample, e. g., , with an equivalence query. In Algorithm 3, we have \(w_0 = cex \), , , and \(w_3 = 0\). We have \(w_0 \in \mathcal {L}(\mathcal {A}_{\textrm{hyp}}^1) \triangle \mathcal {L}_{\textrm{tgt}}\) and \(w_1 \not \in \mathcal {L}(\mathcal {A}_{\textrm{hyp}}^1) \triangle \mathcal {L}_{\textrm{tgt}}\), and the suffix to distinguish \(w_0\) and \(w_1\) is . Thus, we add to \(S_1\) (Fig. 2d).

In Fig. 2d, we observe that \(T_2(p_1, s_1)\) is stricter than \(T_2(p_0, s_1)\), and we have \(p_1 {\not \sim }^{S_2}_{\mathcal {L}_{\textrm{tgt}}} p_0\). To make \((P_2, S_2, T_2)\) closed, we add \(p_1\) to \(P_2\). By repeating similar operations, we obtain the timed observation table \(O_3 = (P_3, S_3, T_3)\) in Fig. 2e, which is cohesive. Figure 2c shows the DTA \(\mathcal {A}_{\textrm{hyp}}^3\) constructed from \(O_3\). Since \(\mathcal {L}(\mathcal {A}_{\textrm{hyp}}^3) = \mathcal {L}_{\textrm{tgt}}\) holds, Algorithm 4 finishes, returning \(\mathcal {A}_{\textrm{hyp}}^3\).

Due to the use of equivalence queries, Algorithm 4 returns a DTA recognizing the target language if it terminates. This is formally stated as follows.

Theorem 28

(correctness). For any target timed language \(\mathcal {L}_{\textrm{tgt}}\), if Algorithm 4 terminates, for the resulting DTA \(\mathcal {A}_{\textrm{hyp}}\), \(\mathcal {L}(\mathcal {A}_{\textrm{hyp}}) = \mathcal {L}_{\textrm{tgt}}\) holds.    \(\square \)

Moreover, Algorithm 4 terminates for any recognizable timed language \(\mathcal {L}_{\textrm{tgt}}\) essentially because of the finiteness of \(\mathcal{S}\mathcal{E} (\varSigma )/ {\sim ^{\mathcal {E} (\varSigma )}_{\mathcal {L}_{\textrm{tgt}}}}\).

Theorem 29

(termination). For any recognizable timed language \(\mathcal {L}_{\textrm{tgt}}\), Algorithm 4 terminates and returns a DTA \(\mathcal {A}\) satisfying \(\mathcal {L}(\mathcal {A}) = \mathcal {L}_{\textrm{tgt}}\).

Proof

(Theorem 29). By the recognizability of \(\mathcal {L}_{\textrm{tgt}}\) and Theorem 19, \(\mathcal{S}\mathcal{E} (\varSigma )/ {\sim ^{\mathcal {E} (\varSigma )}_{\mathcal {L}_{\textrm{tgt}}}}\) is finite. Let \(N = |\mathcal{S}\mathcal{E} (\varSigma )/ {\sim ^{\mathcal {E} (\varSigma )}_{\mathcal {L}_{\textrm{tgt}}}}|\). Since each execution of line 5 adds \(p\) to \(P\), where \(p\) is such that for any \(p' \in P\), \(p{\not \sim }^{\mathcal {E} (\varSigma )}_{\mathcal {L}_{\textrm{tgt}}} p'\) holds, it is executed at most N times. Since each execution of line 8 refines \(S\), i. e., it increases \(|\mathcal{S}\mathcal{E} (\varSigma )/ {\sim ^{S}_{\mathcal {L}_{\textrm{tgt}}}}|\), line 8 is executed at most N times. For any \((u,\varLambda ) \in \mathcal{S}\mathcal{E} (\varSigma )\), if \(\varLambda \) contains \(\mathbb {T}_{i,|u|} = d\) for some \(i \in \{0, 1, \dots , |u|\}\) and \(d\in {\mathbb {N}}\), we have \(\textrm{succ}^{t}((u,\varLambda )) \subseteq \textrm{ext}^{t} ((u, \varLambda ))\). Therefore, line 10 is executed at most N times. Since \(S\) is strictly refined in line 14, i. e., it increases \(|\mathcal{S}\mathcal{E} (\varSigma )/ {\sim ^{S}_{\mathcal {L}_{\textrm{tgt}}}}|\), line 14 is executed at most N times. By Theorem 26, once \(\sim ^{S}_{\mathcal {L}_{\textrm{tgt}}}\) saturates to \(\sim ^{\mathcal {E} (\varSigma )}_{\mathcal {L}_{\textrm{tgt}}}\), \(\texttt {MakeDTA}\) returns the correct DTA. Overall, Algorithm 4 terminates.    \(\square \)

4.5 Learning with a Normal Teacher

We briefly show how to learn a DTA with membership and equivalence queries only. We reduce a symbolic membership query to finitely many membership queries, answerable by a normal teacher. See Appendix B.1 of [29] for details.

Let \((u, \varLambda )\) be the elementary language given in a symbolic membership query. Since \(\varLambda \) is bounded, we can construct a finite and disjoint set of simple and canonical timed conditions \(\varLambda '_1, \varLambda '_2,\dots , \varLambda '_n\) satisfying \(\bigvee _{1 \le i \le n} \varLambda '_i = \varLambda \) by a simple enumeration. For any simple elementary language \((u{'},\varLambda {'}) \in \mathcal{S}\mathcal{E} (\varSigma )\) and timed words \(w, w' \in (u{'},\varLambda {'})\), we have \(w\in \mathcal {L}\iff w' \in \mathcal {L}\). Thus, we can construct \(\texttt{mem}^{\texttt{sym}}_{\mathcal {L}}((u, \varLambda ))\) by making a membership query \(\texttt{mem}_{\mathcal {L}}(w)\) for each such \((u{'},\varLambda {'}) \subseteq (u, \varLambda )\) and for some \(w\in (u{'},\varLambda {'})\). We need such an exhaustive search, instead of a binary search, because \(\texttt{mem}^{\texttt{sym}}_{\mathcal {L}}((u, \varLambda ))\) may be non-convex.
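The enumeration above can be sketched as follows. For readability, we make the simplifying assumption that each dwell-time variable is constrained independently by integer bounds; the paper's construction instead enumerates simple canonical timed conditions over the sums \(\mathbb {T}_{i,j}\). The function and its interface are ours.

```python
from fractions import Fraction
from itertools import product

def symbolic_membership(member, bounds):
    """Answer a symbolic membership query with finitely many membership
    queries (Sect. 4.5).  `bounds` gives integer (lo, hi) bounds per
    variable; the simple cells are the points {d} and the open intervals
    (d, d+1).  Since all timed words in a simple cell agree on
    membership, one representative query per cell suffices.  Returns the
    set of cells answered positively: a possibly non-convex union, which
    is why an exhaustive enumeration (not a binary search) is needed."""
    cells_per_var = []
    for lo, hi in bounds:
        cells = [('eq', d) for d in range(lo, hi + 1)]
        cells += [('in', d, d + 1) for d in range(lo, hi)]
        cells_per_var.append(cells)
    positive = set()
    for cell in product(*cells_per_var):
        # Representative point: d for a point cell, d + 1/2 for an
        # interval cell (exact rationals avoid rounding issues).
        w = [Fraction(c[1]) if c[0] == 'eq' else c[1] + Fraction(1, 2)
             for c in cell]
        if member(w):
            positive.add(cell)
    return positive
```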

Assume \(\varLambda \) is a canonical timed condition. Let M be the number of variables in \(\varLambda \) and let I be the largest difference between the upper bound and the lower bound of some \(\mathbb {T}_{i,j}\) in \(\varLambda \). The size n of the above decomposition is bounded by \({(2 I + 1)}^{M (M + 1) / 2}\), which blows up exponentially with respect to M.
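For concreteness, the bound on n can be computed directly: each of the \(M(M+1)/2\) sums \(\mathbb {T}_{i,j}\) ranges over at most \(I + 1\) integer points and \(I\) open unit intervals, hence at most \(2I + 1\) cells each. A sketch (the function name is ours):

```python
def decomposition_bound(M, I):
    """Upper bound (2I + 1)^(M(M+1)/2) on the number of simple cells in
    the decomposition of Sect. 4.5: at most I + 1 integer points and
    I open unit intervals per sum T_{i,j}, over M(M+1)/2 sums."""
    return (2 * I + 1) ** (M * (M + 1) // 2)
```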

In our algorithm, we only make symbolic membership queries with elementary languages of the form \(p\cdot s\), where \(p\) and \(s\) are simple elementary languages. Therefore, I is at most 2. However, even with such an assumption, the number of necessary membership queries blows up exponentially with respect to the number of variables in \(\varLambda \).

4.6 Complexity Analysis

After each equivalence query, our DTA learning algorithm strictly refines \(S\) or terminates. Thus, the number of equivalence queries is at most N. In the proof of Theorem 29, we observe that the size of \(P\) is at most 2N. Therefore, the number \((|P| + |\textrm{succ}^{} (P)|) \times |S|\) of the cells in the timed observation table is at most \((2 N + 2 N \times (|\varSigma | + 1)) \times N = 2 N^2 (|\varSigma | + 2)\). Let J be the upper bound of i in the analysis of \( cex \) returned by equivalence queries (Algorithm 3). For each equivalence query, the number of membership queries in Algorithm 3 is bounded by \(\lceil \log J \rceil \), and thus, it is, in total, bounded by \(N \times \lceil \log J \rceil \). Therefore, if the learner can use symbolic membership queries, the total number of queries is bounded by a polynomial in N and J. In Sect. 4.5, we observe that the number of membership queries needed to implement a symbolic membership query is at most exponential in M. Since \(P\) is prefix-closed, M is at most N. Overall, if the learner cannot use symbolic membership queries, the total number of queries is at most exponential in N.

Table 1. Summary of the results for Random. Each row index \(|L|\_|\varSigma |\_K_{C}\) shows the number of locations, the alphabet size, and the upper bound of the maximum constant in the guards, respectively. The row “count” shows the number of instances finished in 3 h. Cells with the best results are highlighted.

Let \(\mathcal {A}_{\textrm{tgt}}= (\varSigma ,L,l_0,C,I,\varDelta ,F)\) be a DTA recognizing \(\mathcal {L}_{\textrm{tgt}}\). As we observe in the proof of Lemma 33 of [29], N is bounded by the size of the state space of the region automaton [4] of \(\mathcal {A}_{\textrm{tgt}}\); that is, N is at most \(|C|!\times 2^{|C|}\times \prod _{c\in C} (2 K_{c} + 2) \times |L|\), where \(K_c\) is the largest constant compared with \(c\in C\) in \(\mathcal {A}_{\textrm{tgt}}\). Thus, without symbolic membership queries, the total number of queries is at most doubly exponential in \(|C|\) and singly exponential in \(|L|\). We remark that when \(|C| = 1\), the total number of queries is at most singly exponential in \(|L|\) and \(K_c\), which coincides with the worst-case complexity of the one-clock DTA learning algorithm in [30].
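The bound on N above can be transcribed directly; a sketch (the function name is ours, and the per-clock constants \(K_c\) are passed as a list):

```python
from math import factorial

def region_automaton_bound(num_clocks, Ks, num_locations):
    """The bound |C|! * 2^|C| * prod_c (2 K_c + 2) * |L| on N from
    Sect. 4.6, where Ks lists the maximum constant K_c per clock."""
    prod = 1
    for K in Ks:
        prod *= 2 * K + 2
    return factorial(num_clocks) * 2 ** num_clocks * prod * num_locations
```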

5 Experiments

We experimentally evaluated our DTA learning algorithm using our prototype library LearnTAFootnote 4 implemented in C++. In LearnTA, the equivalence queries are answered by a zone-based reachability analysis using the fact that DTAs are closed under complement [4]. We pose the following research questions.

  • RQ1 How does LearnTA scale with respect to the complexity of the target language?

  • RQ2 How efficient is LearnTA on practical benchmarks?

For the benchmarks with one clock variable, we compared LearnTA with one of the latest one-clock DTA learning algorithms [1, 30], which we call OneSMT. OneSMT is implemented in Python with Z3 [23] for constraint solving.

For each execution, we measured the number of queries and the total execution time, including the time to answer the queries. For the number of queries, we report the number with memoization, i. e., we count the number of distinct queried timed words (for membership queries) and counterexamples (for equivalence queries). We conducted all the experiments on a computing server with an Intel Core i9-10980XE and 125 GiB RAM running Ubuntu 20.04.5 LTS. We set the timeout to 3 h.

Fig. 3. The number of locations and the number of queries for \(|L|\_2\_10\) in Random, where \(|L| \in \{3,4,5,6\}\).

Table 2. Summary of the target DTAs and the results for Unbalanced. \(|L|\) is the number of locations, \(|\varSigma |\) is the alphabet size, \(|C|\) is the number of clock variables, and \(K_{C}\) is the maximum constant in the guards in the DTA.

5.1 RQ1: Scalability with Respect to the Language Complexity

To evaluate the scalability of LearnTA, we used randomly generated DTAs from [5] (denoted as Random) and our original DTAs (denoted as Unbalanced). Random consists of five classes: \(3\_2\_10\), \(4\_2\_10\), \(4\_4\_20\), \(5\_2\_10\), and \(6\_2\_10\), where each value of \(|L|\_|\varSigma |\_K_{C}\) is the number of locations, the alphabet size, and the upper bound of the maximum constant in the guards in the DTAs, respectively. Each class consists of 10 randomly generated DTAs. Unbalanced is our original benchmark inspired by the “unbalanced parentheses” timed language from [10]. Unbalanced consists of five DTAs with different complexity of timing constraints. Table 2 summarizes their complexity.

Table 1 summarizes the results for Random, and Table 2 summarizes the results for Unbalanced. Table 1 shows that LearnTA requires more membership queries than OneSMT. This is likely because of the difference in the definition of prefixes and successors: OneSMT’s definitions are discrete (e. g., prefixes are only with respect to events with time elapse), whereas ours are both continuous and discrete (e. g., we also consider prefixes obtained by trimming the dwell time at the end). Since our definition yields significantly more prefixes, LearnTA tends to require many more membership queries. Another, more high-level reason is that LearnTA learns a DTA without knowing the number of clock variables, so many more timed words are potentially helpful for learning. Table 1 also shows that LearnTA requires significantly more membership queries for \(4\_4\_20\). This is likely because of the exponential blowup with respect to \(K_{C}\), as discussed in Sect. 4.6. In Fig. 3, we observe that for both LearnTA and OneSMT, the number of membership queries increases nearly exponentially with the number of locations. This coincides with the discussion in Sect. 4.6.

In contrast, Table 1 shows that LearnTA requires fewer equivalence queries than OneSMT. This suggests that the cohesion in Definition 24 successfully detects contradictions in the observations before generating a hypothesis, whereas OneSMT mines timing constraints mainly by equivalence queries and thus tends to require more of them. In Fig. 3c, we observe that for both LearnTA and OneSMT, the number of equivalence queries increases nearly linearly with the number of locations. This also coincides with the complexity analysis in Sect. 4.6. Figure 3c also shows that the number of equivalence queries increases faster in OneSMT than in LearnTA.

Table 3. Summary of the target DTA and the results for practical benchmarks. The columns are the same as Table 2. Cells with the best results are highlighted.

Table 2 also suggests a similar tendency: the number of membership queries increases rapidly with the complexity of the timing constraints, whereas the number of equivalence queries increases rather slowly. Moreover, LearnTA is scalable enough to learn a DTA with five clock variables within 15 min.

Table 1 also suggests that LearnTA does not scale well with respect to the maximum constant in the guards, as observed in Sect. 4.6. However, we still observe that LearnTA requires fewer equivalence queries than OneSMT. Overall, compared with OneSMT, LearnTA has better scalability in the number of equivalence queries and worse scalability in the number of membership queries.

5.2 RQ2: Performance on Practical Benchmarks

To evaluate the practicality of LearnTA, we used seven benchmarks: AKM, CAS, Light, PC, TCP, Train, and FDDI. Table 3 summarizes their complexity. All the benchmarks other than FDDI are taken from [30] (or its implementation [1]). FDDI is taken from TChecker [2]. We use the instance of FDDI with two processes.

Table 3 summarizes the results for the benchmarks from practical applications. We observe, again, that LearnTA requires more membership queries and fewer equivalence queries than OneSMT. However, for these benchmarks, the difference in the number of membership queries tends to be much smaller than for Random. This is because these benchmarks have simpler timing constraints than Random from the viewpoint of LearnTA's exploration. In AKM, Light, PC, TCP, and Train, the clock variable can be reset at every edge without changing the language. For such a DTA, two simple elementary languages are equivalent in terms of the Nerode-style congruence whenever they have the same edge at their last event and the same dwell time after it. If two simple elementary languages are equivalent, LearnTA explores the successors of only one of them, so the exploration is relatively efficient. We have a similar situation in CAS. Moreover, in many of these DTAs, only a few edges have guards. Overall, despite the large number of locations and the large alphabets, the complexity of these languages is mild for LearnTA.

We also observe that, surprisingly, for all of these benchmarks, LearnTA took a shorter time for DTA learning than OneSMT. This is partly because of the difference in the implementation languages (i. e., C++ vs. Python) but also because of the small number of equivalence queries and the mild number of membership queries. Moreover, although it requires significantly more queries, LearnTA successfully learned FDDI with seven clock variables. Overall, such efficiency on benchmarks from practical applications suggests the potential usefulness of LearnTA in realistic scenarios.

6 Conclusions and Future Work

Extending the L* algorithm, we proposed an active learning algorithm for DTAs. Our extension is based on our Nerode-style congruence for recognizable timed languages. We proved the termination and correctness of our algorithm. We also proved that our learning algorithm requires a polynomial number of queries with a smart teacher and an exponential number of queries with a normal teacher. Our experimental results also suggest the practical relevance of our algorithm.

One future direction is to extend more recent automata learning algorithms (e. g., the TTT algorithm [19]) to DTA learning to improve efficiency. Another direction is to construct a passive DTA learning algorithm based on our congruence and an existing passive DFA learning algorithm. It is also a future direction to apply our learning algorithm in practice, e. g., for the identification of black-box systems and for testing black-box systems with black-box checking [22, 24, 28]. Optimization of the algorithm, e. g., by incorporating clock information, is also a future direction.