1 Introduction

Model checking [9, 19, 33, 36] is an important technique to automatically determine whether a system satisfies a specified property. However, it suffers from the state explosion problem since it needs to store the explored system states in memory, which is infeasible for most realistic systems [21]. In timed systems, although symbolic representations and partial order reductions have greatly increased the size of the systems that can be verified, many realistic timed systems are still too large to be handled. In particular, if a system has several components, the number of global system states grows exponentially with the number of components. Assume-guarantee reasoning (AGR) [20, 25, 29, 35] is a promising method to help address the state explosion problem.

Consider a system M composed of two components \(M_1\) and \(M_2\) that synchronize on a given set of shared actions. Suppose we want to verify that M satisfies a property \(\phi \). The AG rule states that if there exists an assumption A on the environment of \(M_2\) such that 1) the composition of \(M_1\) and A satisfies the property \(\phi \), and 2) \(M_2\) is a refinement of A, then M satisfies \(\phi \).

A major challenge in verifying component-based systems using the AG rule is the need to obtain an appropriate assumption, which requires non-trivial human effort [26]. Based on the abstraction-refinement paradigm in [22], the assumption is computed as a conservative abstraction of some of the components, and it is then refined using counterexamples obtained from model checking it [15]. The algorithm presented in [24] is capable of generating the weakest possible assumption automatically, though it does not compute partial results. In the later work [23], a framework is proposed for the automatic generation of assumptions in an incremental fashion using the L* learning algorithm [8]. Several improvements, e.g. [14, 17, 18, 38], have been proposed to further reduce the learning complexity. The work [6] by Alur et al. presents a symbolic implementation of the L* algorithm where the required data structures are maintained compactly using ordered BDDs [16].

All the aforementioned work focuses on untimed systems. For timed systems, using assume-guarantee style proof rules, the work in [39] proves that a refined representation is a correct implementation of an abstract one. To check Zeroconf, a protocol for dynamic configuration of IPv4 link-local addresses, Berendsen et al. [12] model the protocol as a network of timed automata (TAs) [3, 4], and provide a proof that combines model checking with the application of a new abstraction relation that is compositional with respect to committed locations. However, the abstract models there are all provided manually. Compared to these manual methods, the compositional verification framework presented in [31, 32] utilizes a learning algorithm for automatic construction of timed assumptions for AGR. That work considers event-recording automata [5], which are a subclass of timed automata. Sankur [37] gives compositional verification for a system composed of a deterministic finite automaton (DFA) and a timed automaton, where a DFA assumption is learned [27] to approximate the timed component. The framework can only check untimed properties of the system and is limited to TAs of relatively small size.

The timed automaton is a well-appreciated model for its simplicity and expressive adequacy, and it is widely used for practical real-time systems [28, 30]. However, to the best of our knowledge, though compositional verification for timed systems helps mitigate the state space explosion problem, there is still no work that tackles the problem of automatically inferring timed assumptions for AGR over timed automata. Therefore, we propose, in this paper, a learning-based framework for AG-based automatic verification of deterministic timed automata. The framework applies the compositional rule in an iterative fashion. Each iteration consists of three steps. In the first step, based on the work in [7], a modified L* algorithm is presented to learn a timed assumption in the form of a deterministic one-clock timed automaton (DOTA) using membership queries. Then two further steps are conducted to check whether the learned assumption satisfies the two premises of the proof rule via candidate queries. We design an algorithm for model conversion with polynomial complexity, which is executed as a step preceding the above iterative steps. It converts the input models \(M_1\), \(M_2\) and \(\phi \) to output models which contain the clock reset information for the assumption learning. Thus, the total complexity of the learning step in the framework is polynomial. We show that this conversion preserves the verification results.

We further prove the correctness and termination of the compositional verification. We would like to note that the framework we propose applies to the verification of systems with a number of components. In other words, though the assumption learned is a DOTA, \(M_1\) and \(M_2\) can be compositions of several DOTAs. For this, we design a heuristic to transform multi-clock reset information to one-clock reset information, which enables the framework to handle learning-based compositional verification for multi-clock systems. We also propose two improvements to speed up the verification, which are shown to have different advantages in the experiments. Finally, we implement the framework and conduct comparative experiments with UPPAAL [10, 11] on cases from the benchmark of AUTOSAR (Automotive Open System Architecture) [1]. The experiments show that the framework proposed in this paper performs better than UPPAAL when the properties to be checked are satisfied.

The rest of the paper is organized as follows. In Sect. 2, we introduce background knowledge. We present in Sect. 3 our learning-based compositional verification framework, as well as the proofs of termination and correctness. In Sect. 4, we present the two improvements. We report the experimental results in Sect. 5. Finally, we discuss the conclusions of the paper in Sect. 6.

2 Preliminaries

We use \(\mathbb {N}\) to denote the set of natural numbers, \(\mathbb {R}_{\ge 0}\) the set of non-negative reals, and let \(\mathbb {B}=\{\top ,\bot \}\), where \(\top \) and \(\bot \) stand for true and false, respectively.

2.1 Timed Automata

Let X be a finite set of real-valued variables ranged over by x, y, etc. standing for clocks. A clock valuation for X is a function \(\nu : X \mapsto \mathbb {R}_{\ge 0}\) which associates every clock x with a value \(\nu (x) \in \mathbb {R}_{\ge 0}\). For \(t\in \mathbb {R}_{\ge 0}\), let \(\nu +t\) denote the clock valuation which maps every clock \(x\in X\) to the value \(\nu (x)+t\). For a set \(\gamma \subseteq X\) and a valuation \(\nu \), we use \([\gamma \rightarrow 0]\nu \) to denote the valuation which resets all clock variables in \(\gamma \) to 0 and agrees with \(\nu \) for other clocks in \(X\backslash \gamma \).

We use \(\varPhi (X)\) to denote the set of clock constraints over X of the form \(\varphi \,\,{:}{:}\!= \top \mid x_1 \bowtie m \mid x_1 - x_2 \bowtie m \mid \varphi \wedge \varphi \), where \(x_1, x_2 \in X\), \(m\in \mathbb {N}\) and \(\bowtie \ \in \{=,<,>,\le ,\ge \}\). We use \(\varphi (\nu ) = \top \) to mean that the clock valuation \(\nu \) for X satisfies the clock constraint \(\varphi \) over X, i.e. \(\varphi \) evaluates to true using the values given by \(\nu \).

Definition 1 (Timed Automata)

A timed automaton (TA) is a 6-tuple \(M = (Q,q_0,\varSigma ,F, X, \varDelta )\), where Q is a finite set called the locations, \(q_0 \in Q\) is the initial location, \(\varSigma \) is a finite set called the alphabet, \(F \subseteq Q\) is the set of accepting locations, X is the finite set of clocks, and \(\varDelta \subseteq Q \times \varSigma \times \varPhi (X) \times 2^X \times Q\) is a finite set called the transitions.

A transition \(\delta \in \varDelta \) is a 5-tuple \((q, \sigma , \varphi , \gamma , q')\), where \(q, q' \in Q\) are respectively the source and target locations, \(\sigma \in \varSigma \) is an action, \(\varphi \) is a clock constraint over X, called the guard of the transition, which specifies that the transition is enabled when it holds in the source state, and the set \(\gamma \subseteq X\) gives the clocks reset by this transition. Thus, \(\delta \) allows a jump from q to \(q'\) by performing an action \(\sigma \) if it is enabled, i.e. \(\varphi (\nu ) = \top \). We use \(\delta [i]\) to denote the i’th element of the tuple \(\delta = (q, \sigma , \varphi , \gamma , q')\) for \(i=1,\ldots , 5\). A run \(\rho \) of \(M\) is a finite sequence of transitions \(\rho =\left( q_{0}, \nu _{0}\right) {\mathop {\longrightarrow }\limits ^{\sigma _{1},t_{1}}} \left( q_{1}, \nu _{1}\right) {\mathop {\longrightarrow }\limits ^{\sigma _{2},t_{2}}} \cdots {\mathop {\longrightarrow }\limits ^{\sigma _{n},t_{n}}}\left( q_{n}, \nu _{n}\right) \) where \(\nu _{0}(x)=0\) for all \(x \in X\), and for all \(1 \le i \le n\) there exists a transition \(\left( q_{i-1}, \sigma _{i}, \varphi _{i}, \gamma _{i}, q_{i} \right) \in \varDelta \) such that \(\varphi _{i}(\nu _{i-1}+t_{i})=\top \) and \(\nu _{i}=[\gamma _{i}\rightarrow 0](\nu _{i-1}+t_{i})\). If \(q_n\) is an accepting location, we say \(\rho \) is an accepting run of M. Each pair \((\sigma _i, t_i)\) \(\in \varSigma \times \mathbb {R}_{\ge 0}\) in the run \(\rho \) is called a timed action, indicating that the action \(\sigma _i\) is applied \(t_i\) time units after the occurrence of the previous action.

The timed trace of \(\rho \) is a timed word \(\textit{trace}(\rho ) = \left( \sigma _{1}, t_{1}\right) \left( \sigma _{2}, t_{2}\right) \ldots \left( \sigma _{n}, t_{n}\right) \). Since time value \(t_{i}\) represents delay time, we also call such a timed trace a delay-timed word, denoted by \(\omega \). Adding the reset information along \(\omega \), we get the corresponding reset-delay-timed word, denoted by \(\omega _{r}=\textit{trace}_r(\rho )=(\sigma _1,t_1,\gamma _1)(\sigma _2,t_2,\gamma _2)\cdots (\sigma _n,t_n, \gamma _n)\). Notice that here \(\gamma _i\) is a clock set \(\gamma _i \subseteq X\) which records the reset clocks in the corresponding transition when taking timed action \((\sigma _i,t_i)\).

If \(\rho \) is an accepting run of M, \(\textit{trace}(\rho )\) is called an accepting timed word. The recognized timed language of M is the set of its accepting delay-timed words, i.e. \(\mathcal {L}(M)=\{\textit{trace}(\rho ) \, \vert \, \rho \text { is an accepting run of } M\}\). The recognized reset-delay-timed language \(\mathcal {L}_{r}(M)\) is defined as \(\{\textit{trace}_r(\rho ) \, \vert \, \rho \text { is an accepting run of } M\}\). A TA M is deterministic iff for any given delay-timed word \(\omega \), there is at most one run \(\rho \) in M having \(\textit{trace}(\rho )=\omega \).

For a run \(\rho \), we define the corresponding logical-timed word \(\omega _l=(\sigma _1,\textbf{v}_1)\) \((\sigma _2,\textbf{v}_2) \cdots (\sigma _n,\textbf{v}_n)\), where \(\textbf{v}_i\in \mathbb {R}_{\ge 0}^{|X|}\) is the vector which records the values of all clocks in X. Therefore, delay-timed words and logical-timed words describe the operations of the timed model M from different perspectives. The former describe M from the external perspective, recording the actions and the time intervals between two consecutive actions, while the latter describe it from the internal perspective, recording the actions and the specific values of the internal clocks when the actions occur. Both are necessary for the active learning algorithm described in Sect. 2.2.

Given the clock reset information \(\gamma _{i}\) along the run \(\rho \) over the delay-timed word \(\omega =\left( \sigma _{1}, t_{1}\right) \left( \sigma _{2}, t_{2}\right) \ldots \left( \sigma _{n}, t_{n}\right) \), we can obtain \(\omega \)’s corresponding logical-timed word \(\omega _{l}=(\sigma _1,\textbf{v}_1) (\sigma _2,\textbf{v}_2)\cdots (\sigma _n,\textbf{v}_n)\) by taking

$$\begin{aligned} \textbf{v}_{i}[j] = {\left\{ \begin{array}{ll} t_{i}, &{} \text {if}\ \ i=1\ \ \text {or}\ \ x_{j}\in \gamma _{i-1}\ (2\le i \le n); \\ \textbf{v}_{i-1}[j]+t_{i}, &{} \text {otherwise}. \end{array}\right. } \end{aligned}$$
(1)

where \(1\le j \le |{X}|\) and \(\textbf{v}_{i}[j]\) is the j’th element in \(\textbf{v}_{i}\). We use \(\varGamma \) to denote the mapping from the delay-timed words to the logical-timed words, that is, \(\varGamma (\omega ) = \omega _l\). With the reset information along the run \(\rho \), we have the reset-logical-timed word \(\omega _{rl}=(\sigma _1,\textbf{v}_1,\gamma _1)(\sigma _2,\textbf{v}_2,\gamma _2)\) \(\dots (\sigma _n,\textbf{v}_n,\gamma _n)\). We can extend the mapping \(\varGamma \) to a mapping from the reset-delay-timed words to the reset-logical-timed words.
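
To make the mapping \(\varGamma \) of Eq. (1) concrete, the following sketch (our own illustration in Java, the language of our implementation; all names are hypothetical) computes the logical clock values of a run from its delays and the reset sets of its transitions.

```java
// A minimal sketch of Eq. (1): clocks are indexed 0..|X|-1 and resets.get(i)
// holds the clocks reset by the i-th transition of the run.
import java.util.*;

public class GammaMapping {
    static double[][] toLogical(double[] delays, List<Set<Integer>> resets, int numClocks) {
        int n = delays.length;
        double[][] v = new double[n][numClocks];
        for (int i = 0; i < n; i++) {
            for (int j = 0; j < numClocks; j++) {
                // v_i[j] = t_i if i = 1 or clock x_j was reset by the previous transition,
                // and v_{i-1}[j] + t_i otherwise.
                if (i == 0 || resets.get(i - 1).contains(j)) {
                    v[i][j] = delays[i];
                } else {
                    v[i][j] = v[i - 1][j] + delays[i];
                }
            }
        }
        return v;
    }

    public static void main(String[] args) {
        // Two clocks; the first transition resets clock 0, the third resets both.
        double[] delays = {1.5, 2.0, 0.5};
        List<Set<Integer>> resets = List.of(Set.of(0), Set.of(), Set.of(0, 1));
        System.out.println(Arrays.deepToString(toLogical(delays, resets, 2)));
        // Prints [[1.5, 1.5], [2.0, 3.5], [2.5, 4.0]]
    }
}
```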

The recognized logical-timed language of M is given as \(L(M)=\{\varGamma (\textit{trace}(\rho ) )\, \vert \, \rho \) is an accepting run of \(M\}\), and the recognized reset-logical-timed language of M is \(L_{r}(M)=\{\varGamma (\textit{trace}_r(\rho )) \, \vert \, \rho \) is an accepting run of \(M\}\).

Definition 2 (Projection of Delay-Timed Words)

Given a delay-timed word \(\omega = \left( \sigma _1,t_1\right) \left( \sigma _2,t_2\right) ...\left( \sigma _n,t_n\right) \in \left( \varSigma _1\times \mathbb {R}_{\ge 0} \right) ^*\) and an alphabet \(\varSigma _2\), the projection of \(\omega \) to \(\varSigma _2\) is a delay-timed word, denoted by \(\omega {\downharpoonleft _{\varSigma _2}}\), and defined as follows:

$$\begin{aligned} \omega {\downharpoonleft _{\varSigma _2}} = \left( \sigma _{i_1},\textstyle \sum _{j=1}^{i_1}t_j\right) \left( \sigma _{i_2},\textstyle \sum _{j=i_1+1}^{i_2}t_j\right) ... \left( \sigma _{i_m},\textstyle \sum _{j=i_{m-1}+1}^{i_m}t_j\right) \end{aligned}$$
(2)

where \(\sigma _{i_k}\in \varSigma _2\) is the \(i_k\)’th action in \(\omega \), \(1\le k\le m\).

Therefore, \(\omega {\downharpoonleft _{\varSigma _2}}\) restricts each action \(\sigma _{i_k}\) to be in \(\varSigma _2\) and modifies the corresponding delay time of \(\sigma _{i_k}\) to be the time interval between \(\sigma _{i_{k-1}}\) and \(\sigma _{i_k}\) in \(\omega \) (measured from the beginning of \(\omega \) for \(k=1\)). For instance, let \(\omega = (a, 1)(b,3)(a,1)(c,4)(a,2)\) and \(\varSigma _2 = \{b, c\}\), then the corresponding \(\omega {\downharpoonleft _{\varSigma _2}} = (b, 4)(c,5)\).
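
The following sketch (an illustrative helper of our own, not part of the paper's tool) implements the projection of Definition 2 and reproduces the example above.

```java
// Project a delay-timed word onto a sub-alphabet, accumulating the delays of
// dropped actions into the next kept action (Definition 2).
import java.util.*;

public class Projection {
    record TimedAction(String action, double delay) {}

    static List<TimedAction> project(List<TimedAction> word, Set<String> sigma2) {
        List<TimedAction> result = new ArrayList<>();
        double accumulated = 0.0;
        for (TimedAction ta : word) {
            accumulated += ta.delay();
            if (sigma2.contains(ta.action())) {
                result.add(new TimedAction(ta.action(), accumulated));
                accumulated = 0.0;   // restart accumulation after a kept action
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Example from the text: (a,1)(b,3)(a,1)(c,4)(a,2) projected to {b,c} gives (b,4)(c,5).
        List<TimedAction> w = List.of(new TimedAction("a", 1), new TimedAction("b", 3),
                new TimedAction("a", 1), new TimedAction("c", 4), new TimedAction("a", 2));
        System.out.println(project(w, Set.of("b", "c")));
    }
}
```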

Definition 3 (Parallel Composition of Timed Automata)

Given two timed automata \(M_1 = (Q_1,q_0^1,\varSigma _1,F_1,X_1,\varDelta _1)\) and \(M_2 = (Q_2, q_0^2, \varSigma _2, F_2, X_2, \varDelta _2)\), assume that the clock sets \(X_1\) and \(X_2\) are disjoint. Their parallel composition is a TA \(M_1\Vert M_2 = (Q_1 \times Q_2, (q_0^1, q_0^2), \varSigma _1 \cup \varSigma _2, F_1 \times F_2, X_1 \cup X_2, \varDelta )\) where the transitions \(\varDelta \) are as follows:

  • for \(\sigma \in \varSigma _1 \cap \varSigma _2\), for every \(\delta _1: (q_1,\sigma ,\varphi _1,\gamma _1,q'_1) \in \varDelta _1\) and \(\delta _2: (q_2,\sigma ,\varphi _2,\gamma _2,q'_2) \in \varDelta _2\), \(((q_1,q_2), \sigma , \varphi _1 \wedge \varphi _2,\gamma _1 \cup \gamma _2, (q'_1, q'_2)) \in \varDelta \).

  • for \(\sigma \in \varSigma _1 \setminus \varSigma _2\), for every \(\delta _1: (q_1,\sigma ,\varphi _1,\gamma _1,q'_1) \in \varDelta _1\) and every \(q \in Q_2\), \(((q_1,q), \sigma , \varphi _1,\gamma _1 , (q'_1, q)) \in \varDelta \).

  • for \(\sigma \in \varSigma _2 \setminus \varSigma _1\), for every \(\delta _2: (q_2,\sigma ,\varphi _2,\gamma _2,q'_2) \in \varDelta _2\) and every \(q \in Q_1\), \(((q, q_2), \sigma , \varphi _2, \gamma _2 , (q, q'_2)) \in \varDelta \).

The language of the composition is the set of its accepting delay-timed words, i.e. \(\mathcal {L}(M_1\Vert M_2)=\{\omega |\omega \in \left( \left( \varSigma _{1}\cup \varSigma _{2}\right) \times \mathbb {R}_{\ge 0}\right) ^*\) and \(\omega {\downharpoonleft _{\varSigma _i}}\in \mathcal {L}(M_i),i\in \{1,2\}\}\).
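
As an illustration of Definition 3, the sketch below (our own simplified data types, with guards kept as strings and conjoined textually) builds the transition set of the composition according to the three cases.

```java
// Construct the transitions of M1 || M2 following Definition 3.
import java.util.*;

public class Compose {
    record Trans(String src, String act, String guard, Set<String> resets, String dst) {}

    static List<Trans> transitions(List<Trans> d1, Set<String> sigma1, Set<String> q1,
                                   List<Trans> d2, Set<String> sigma2, Set<String> q2) {
        List<Trans> delta = new ArrayList<>();
        for (Trans t1 : d1) {
            if (sigma2.contains(t1.act())) {
                // shared action: synchronize with every transition of M2 having the same action
                for (Trans t2 : d2) {
                    if (!t2.act().equals(t1.act())) continue;
                    Set<String> resets = new HashSet<>(t1.resets());
                    resets.addAll(t2.resets());
                    delta.add(new Trans(pair(t1.src(), t2.src()), t1.act(),
                            "(" + t1.guard() + ") && (" + t2.guard() + ")", resets,
                            pair(t1.dst(), t2.dst())));
                }
            } else {
                // action only in Sigma_1: M2 stays put in any of its locations
                for (String q : q2) {
                    delta.add(new Trans(pair(t1.src(), q), t1.act(), t1.guard(),
                            t1.resets(), pair(t1.dst(), q)));
                }
            }
        }
        for (Trans t2 : d2) {
            if (sigma1.contains(t2.act())) continue;          // shared actions handled above
            for (String q : q1) {                             // action only in Sigma_2
                delta.add(new Trans(pair(q, t2.src()), t2.act(), t2.guard(),
                        t2.resets(), pair(q, t2.dst())));
            }
        }
        return delta;
    }

    static String pair(String a, String b) { return "(" + a + "," + b + ")"; }
}
```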

Definition 4 (Language Inclusion)

Given two timed automata \(M_1\) and \(M_2\) with alphabets \(\varSigma _1\) and \(\varSigma _2\), respectively, if \(\mathcal {L}(M_1){\downharpoonleft _{\varSigma _2}} = \{ \omega {\downharpoonleft _{\varSigma _2}} | \omega \in \mathcal {L}(M_1) \}\) is a subset of \(\mathcal {L}(M_2)\), we say \(M_1\) satisfies \(M_2\), denoted by \(M_1\models M_2\).

Definition 5 (Deterministic One-Clock Timed Automata)

A one-clock timed automaton (OTA) is a timed automaton with a single clock. A deterministic OTA is denoted by DOTA.

2.2 Learning Deterministic One-Clock Timed Automata

In this section, we briefly describe the active learning algorithm for a DOTA M; we refer to [7] for more details. Active learning of a DOTA assumes the existence of a teacher who can answer two kinds of queries posed by a learner: membership queries and candidate queries. A membership query asks whether \(\omega _l\in L({M})\) for a logical-timed word \(\omega _l\); a candidate query asks whether the learned DOTA A, which represents the current hypothesis, satisfies \(\mathcal {L}(A)=\mathcal {L}(M)\). The main challenge for learning the timed assumption is to obtain the reset information of the logical clocks for each transition. We consider two different settings, depending on whether the teacher also provides clock reset information along with answers to queries.

A smart teacher is one which provides clock reset information along with the answers to queries. It accepts a logical-timed word \(\omega _{l}\) as the input of a membership query from the learner. It then returns an answer about whether the timed word is accepted or not, together with the reset information of each transition along the trace, that is, the reset-logical-timed word \(\omega _{rl}\).

When the smart teacher receives a candidate query from the learner and the candidate is incorrect, a counterexample is returned in the form of a reset-delay-timed word. The algorithm maintains a timed observation table \(\textbf{T}\) to store the answers to all previous queries. Once the learner has gained sufficient information, i.e. \(\textbf{T}\) is closed and consistent, an assumption A is constructed from the table. Then the learner poses a candidate query to the teacher to judge whether \(\mathcal {L}(A)=\mathcal {L}(M)\). If yes, the algorithm terminates with the learned model A. Otherwise, the teacher responds with a reset-delay-timed word \(\omega _{r}\) as a counterexample. After processing \(\omega _{r}\), the algorithm starts a new round of learning. The whole procedure repeats until the teacher gives a positive answer to a candidate query. It is known that the complexity of the algorithm is polynomial in the size of the learned model. In practical applications, this setting corresponds to the case where some parts of the model (the clock reset information) are known through testing or watchdogs.
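
The loop with a smart teacher can be summarized by the following skeleton; all types and method names are placeholders of our own and not the interface of the learner in [7].

```java
// Skeleton of the smart-teacher learning loop for a DOTA.
public class DotaLearner {
    interface Dota {}
    interface ResetDelayWord {}                               // counterexample format

    interface SmartTeacher {
        // Membership queries (logical-timed word -> acceptance plus reset information)
        // are issued while the observation table is being made closed and consistent.
        ResetDelayWord candidateQuery(Dota hypothesis);       // null means L(A) = L(M)
    }

    interface ObservationTable {
        void makeClosedAndConsistent(SmartTeacher teacher);   // may trigger membership queries
        Dota buildHypothesis();
        void addCounterexample(ResetDelayWord ctx);
    }

    static Dota learn(ObservationTable table, SmartTeacher teacher) {
        while (true) {
            table.makeClosedAndConsistent(teacher);
            Dota hypothesis = table.buildHypothesis();
            ResetDelayWord ctx = teacher.candidateQuery(hypothesis);
            if (ctx == null) {
                return hypothesis;                            // candidate query answered positively
            }
            table.addCounterexample(ctx);                     // start a new round of learning
        }
    }
}
```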

In the case when a normal teacher is used, the learner needs to guess the reset information on each transition discovered in the observation table. At each iteration, the learner guesses all needed reset information and forms a number of table candidates. Due to the required guesses, the complexity of the algorithm is exponential in the size of the learned model. The following theorem, presented in [7], shows that for both types of teachers, the algorithm converts the learning problem to that of learning the reset-logical-timed language.

Theorem 1

Given two DOTAs M and A, if \(L_r(M) = L_r(A)\), then \(\mathcal {L}(M)\) \(= \mathcal {L}(A)\).

3 Framework for Learning-Based Compositional Verification of Timed Automata

Consider a system \(M = M_1\Vert M_2\) consisting of two deterministic timed automata and a safety property \(\phi \) represented as a deterministic timed automaton. We devote this section to presenting our learning-based verification framework for automatically finding an appropriate assumption A in the AG rule to verify that M satisfies \(\phi \). Section 3.1 first describes the framework. Then, in Sects. 3.2, 3.3 and 3.4, the main algorithms of the framework are presented in detail. Finally, Sect. 3.5 shows the correctness and termination of the framework.

3.1 Verification Framework via Assumption Learning

Let \(\varSigma _1\), \(\varSigma _2\) and \(\varSigma _\phi \) be the alphabets of the TAs \(M_1\), \(M_2\) and \(\phi \), respectively. We then have that the alphabet of the assumption \(A_0\) is \(\varSigma _{A_0} = (\varSigma _1 \cup \varSigma _{\phi })\cap \varSigma _2\). The AG rule is stated as follows:

$$\begin{aligned} \begin{array}{c} M_{1} \Vert A_0 \models \phi , \; M_{2} \models A_0 \\ \hline M_{1}\Vert M_{2} \models \phi \end{array} \end{aligned}$$
(3)

The rule converts the problem of verifying \(M_1 \Vert M_2 \models \phi \) to that of finding an assumption \(A_0\) which is a DOTA satisfying both \(M_1\Vert A_0 \models \phi \) and \(M_2\models A_0\). Here, we consider \(M_1\) and \(M_2\) as general TAs, each of which is either a DOTA or a composition of a number of DOTAs. Therefore, the framework we propose is not limited to verifying the composition of just two components. For a system composed of n components, where \(n>2\), we can partition the components into two parts. For instance, if a system consists of 4 components \(M=\{H_1,H_2,H_3,H_4\}\), we can let \(M_1=H_1\Vert H_3\) and \(M_2=H_2\Vert H_4\). In order to automatically obtain the assumption, we use model learning algorithms. However, the current learning algorithm for DOTAs [7] is not directly applicable. We thus design a “smart teacher” with a heuristic to provide clock reset information for the learning. For this, we also need to design a model conversion algorithm. We illustrate the learning-based verification framework in Fig. 1. The inputs of the framework are \(M_1\), \(M_2\) and the property \(\phi \), and the verification process consists of four steps, which we describe below.

Fig. 1. Learning-based compositional verification framework for timed automata

The First Step. This step converts the input models into TAs \(M'_1\), \(M'_2\) and \(\phi '\) (cf. Sect. 3.2) without changing the verification results, i.e. checking \(M'_1\Vert M'_2\) against \(\phi '\) is equivalent to checking \(M_1\Vert M_2\) against \(\phi \). The output of this step is used to determine the clock reset information for the assumption learning in the second step. Then, the AG rule (3) is applied to \(M'_1\), \(M'_2\) and \(\phi '\). Thus, if there exists an assumption A such that \(M'_1\Vert A\models \phi '\) and \(M'_2\models A\), then \(M'_1\Vert M'_2\models \phi '\). The weakest assumption \(A_w\) is the one with which the rule is guaranteed to return a conclusive result and \(M_1'\Vert A_w\models \phi '\).

Definition 6 (Weakest Assumption)

Let \(M'_1\), \(M'_2\) and \(\phi '\) be the models mentioned above and \(\varSigma _{A} = (\varSigma '_1 \cup \varSigma '_{\phi }) \cap \varSigma '_2\). The weakest assumption \(A_w\) of \(M'_2\) is a timed automaton such that the two conditions hold: 1) \(\varSigma _{A_w} = \varSigma _{A}\), and 2) for any timed automaton E with \(\varSigma _{E} = \varSigma _{A}\) and \(M'_2 \models E\), \(M'_1 \Vert E \models \phi '\) iff \(E\models A_w\).

The Second Step. A DOTA assumption A is learned through a number of membership queries in this step. The answer to each query involves gaining the definite clock reset information for each timed word, i.e. whether the clock of A is reset when an action is taken at a specific time. We design a heuristic to obtain such information from the clock reset information of the converted models \(M'_1\), \(M'_2\) and \(\phi '\). This allows the framework to handle learning-based compositional verification for multi-clock systems. We refer to Sect. 3.3 for more details.

The Third and the Fourth Steps. Once the assumption A is constructed, two candidate queries are performed to check the compositional rule. The first is a subset query checking whether \(M'_1\Vert A\models \phi '\). The second is a superset query checking whether \(M_2'\models A\). If both candidate queries return true, the compositional rule guarantees that \(M'_1\Vert M'_2\models \phi '\). Otherwise, a counterexample ctx (either \(ctx_1\) or \(ctx_2\) in Fig. 1) is generated and further analyzed to identify whether ctx witnesses a violation of \(M'_1\Vert M'_2\models \phi '\). If it does not, ctx is used to update A in the next learning iteration. The details of the candidate queries are discussed in Sect. 3.4.

When both premises hold, \(\mathcal {L}(A)\) is a subset of \(\mathcal {L}(A_w)\) and a superset of \(\mathcal {L}(M'_2){\downharpoonleft _{\varSigma _{A}}}\). It is not guaranteed that a DOTA A can be learned that satisfies \(\mathcal {L}(A) = \mathcal {L}(A_w)\). However, as shown later in Theorem 3, under the condition that \(\mathcal {L}(A_w)\) is accepted by a DOTA, the learning process terminates as soon as the compositional verification returns a conclusive result, often before \(A_w\) itself is learned. This means that the verification in the framework usually terminates early by finding either a counterexample that witnesses \(M'_1 \Vert M'_2 \not \models \phi '\) or an assumption A that satisfies the two premises of the reasoning rule, indicating \(M'_1 \Vert M'_2 \models \phi '\).
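
The iterative process described above can be sketched as follows; all types and method names are placeholders of our own, and the individual checks are discharged by a model checker (UPPAAL in our implementation) as detailed in Sects. 3.3 and 3.4.

```java
// High-level sketch of one verification loop of the framework in Fig. 1.
public class AgLoop {
    interface TA {}
    interface Assumption extends TA {}
    interface Ctx {}                                            // a timed-word counterexample

    interface Learner {
        Assumption nextHypothesis();                            // built via membership queries
        void refine(Ctx ctx);                                   // positive or negative counterexample
    }

    interface Checker {
        Ctx checkPremiseOne(TA m1, Assumption a, TA phi);       // null iff M1' || A |= phi'
        Ctx checkPremiseTwo(TA m2, Assumption a);               // null iff M2' |= A
        boolean inM2Projection(TA m2, Ctx ctx);                 // ctx|_Sigma_A in L(M2')|_Sigma_A ?
        boolean violatesPhi(TA m1, Ctx ctx, TA phi);            // M1' || A_ctx |/= phi' ?
    }

    static boolean verify(TA m1, TA m2, TA phi, Learner learner, Checker mc) {
        while (true) {
            Assumption a = learner.nextHypothesis();
            Ctx ctx1 = mc.checkPremiseOne(m1, a, phi);
            if (ctx1 != null) {
                if (mc.inM2Projection(m2, ctx1)) return false;  // real violation of phi'
                learner.refine(ctx1);                           // negative counterexample
                continue;
            }
            Ctx ctx2 = mc.checkPremiseTwo(m2, a);
            if (ctx2 == null) return true;                      // both premises hold
            if (mc.violatesPhi(m1, ctx2, phi)) return false;    // real violation of phi'
            learner.refine(ctx2);                               // positive counterexample
        }
    }
}
```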

3.2 Model Conversion

We use membership queries to learn the DOTA assumption. For a membership query with the input of a logical-timed word \(\omega _l\), the answer from the teacher includes the clock reset information of the word, which is necessary for obtaining the reset-logical-timed word \(\omega _{rl}\). As shown in [7], the learning algorithm with a normal teacher can only generate the answer by guessing reset information, and this is the cause of the high complexity. We thus design a smart teacher in our framework. The smart teacher generates the answer to a query with input \(\omega _l\) by directly making use of the available clock reset knowledge of \(\mathcal {L}(A_w)\) (related to \(\varSigma _A\), \(M_1\) and \(\phi \)). To this end, we implement the model conversion from the models \(M_1\), \(M_2\) and \(\phi \) to the models \(M_1'\), \(M_2'\) and \(\phi '\), respectively.

The model conversion algorithm mainly ensures that each action in \(\varSigma _A\) corresponds to unique clock reset information. Given an action \(\sigma \) with \(\sigma \in \varSigma _A\) and \(\sigma \in \varSigma _1\) (resp. \(\varSigma _{\phi }\)), if there is only one transition by \(\sigma \), or all its different transitions have the same reset clocks, i.e. for any transitions \(\delta _1\) and \(\delta _2\), \(\delta _1[4]=\delta _2[4]\) if \(\delta _1[2]=\delta _2[2]=\sigma \), then the reset information for the action \(\sigma \) is simply \(\delta [4]\) of any particular transition by \(\sigma \). If there are different transitions by \(\sigma \), say \(\delta _1\) and \(\delta _2\), which have different reset clocks, i.e. \(\delta _1[4]\ne \delta _2[4]\), we say that the reset clocks of action \(\sigma \) are inconsistent.

Reset clock inconsistency causes difficulty for the teacher to obtain the clock reset information of an action in a whole run. To deal with this difficulty, we design model conversion in Algorithm 1 to convert \(M_1\), \(M_2\) and \(\phi \) into \(M'_1\), \(M'_2\) and \(\phi '\). In the algorithm, the conversion is implemented by calling Algorithm 2 twice to introduce auxiliary actions and transitions into \(M_1\) and \(\phi \) to resolve reset clock inconsistency in the two automata, respectively.

The converted models \(M'_1\), \(M'_2\) and \(\phi '\) returned by the invocations to Algorithm 2 have the property that all transitions with the same action \(\sigma \in \varSigma _A\) will have the same reset clocks, and thus \(M_1'\) and \(\phi '\) do not have reset clock inconsistency. As shown later in Theorem 2, the verification of \(M'_1 \Vert M'_2\) against \(\phi '\) is equivalent to that of \(M_1\Vert M_2\) against \(\phi \).

Algorithm 2, denoted by ConvertS(\(\mathcal {M}_1, \mathcal {M}_2, \mathcal {M}_3\)), takes three deterministic TAs, namely \(\mathcal {M}_1\), \(\mathcal {M}_2\) and \(\mathcal {M}_3\), as its input and converts them into three new TAs, namely \(\mathcal {M}_1'\), \(\mathcal {M}_2'\) and \(\mathcal {M}_3'\), as the output. We explain the three main functionalities of the algorithm in the following three paragraphs.

Algorithm 1: Model conversion of \(M_1\), \(M_2\) and \(\phi \)
Algorithm 2: ConvertS(\(\mathcal {M}_1, \mathcal {M}_2, \mathcal {M}_3\))

Check Reset Information in \(\mathcal {M}_1\) (Lines 1-6). Let \(\boldsymbol{\varSigma }=(\varSigma _{\mathcal {M}_1} \cup \varSigma _{\mathcal {M}_2}) \cap \varSigma _{\mathcal {M}_3}\), and let f be a binary relation between \(\boldsymbol{\varSigma }\) and \(2^X\), where X is the set of clocks of \(\mathcal {M}_1\) and \(f=\emptyset \) initially. The transitions of \(\mathcal {M}_1\) are checked one by one. For a transition \(\delta \), if its action \(\delta [2]\) is in \(\boldsymbol{\varSigma }\) but not in the domain of f (Line 5), then \(\delta \) is the first transition by \(\delta [2]\) found, and the pair \(\langle \delta [2], \delta [4]\rangle \) is added to the relation f. If the action of \(\delta \) is already in \( dom (f)\) but the reset clocks \(\delta [4]\) are inconsistent with the record in f, the algorithm proceeds to the next steps to handle the inconsistency of the reset clocks (Lines 7-20).

Introduce Auxiliary Actions in \(\mathcal {M}_1\) (Lines 7-12). If \(\delta [2] \in dom (f) \wedge \langle \delta [2], \delta [4]\rangle \not \in f\) (Line 7), we need to introduce a new action (through the variable \(\sigma _{new}\)) and add it to the alphabets of the output models. Then the transition \(\delta \) with action \(\sigma \) is modified to a new transition, say \(\delta '\), by replacing action \(\sigma \) with the value of \(\sigma _{new}\) (Lines 11-12).

Add Auxiliary Transitions in \(\mathcal {M}_2\) and \(\mathcal {M}_3\) (Lines 13-20). Since new actions are introduced in \(\mathcal {M}_1\), we need to add auxiliary transitions with each new action to \(\mathcal {M}_2\) and \(\mathcal {M}_3\) accordingly. Specifically, consider the case when \(\mathcal {M}_1\) and \(\mathcal {M}_2\) synchronize on action \(\sigma \) via transitions \(\delta \) and \(\overline{\delta }\) in the models, respectively. If \(\delta \) in \(\mathcal {M}_1\) is modified to \(\delta '\) in \(\mathcal {M}_1'\) by renaming its action \(\sigma \) to \(\sigma '\), a fresh co-transition \(\overline{\delta '}\) should be added to \(\mathcal {M}_2'\), which is a copy of \(\overline{\delta }\) with \(\sigma \) changed to \(\sigma '\), so as to enable synchronization in the composition of \(\mathcal {M}_1'\) and \(\mathcal {M}_2'\) (Lines 13-16). The same changes are made for \(\mathcal {M}_3\) (Lines 17-20).
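
The core of ConvertS can be sketched as follows; the data types and helper names are our own simplifications of Algorithm 2, with guards left untouched and fresh actions built by priming the original name, as in Example 1 below.

```java
// Simplified sketch of ConvertS: transitions of M1 whose action already maps to a
// different reset set are renamed to a fresh action, and co-transitions with the
// fresh action are added to M2 and M3.
import java.util.*;

public class ConvertS {
    record Trans(String src, String act, String guard, Set<String> resets, String dst) {}

    static void convert(List<Trans> m1, List<Trans> m2, List<Trans> m3, Set<String> shared) {
        Map<String, Set<String>> f = new HashMap<>();     // action -> reset clocks recorded first
        Map<String, Integer> primes = new HashMap<>();
        for (int i = 0; i < m1.size(); i++) {
            Trans d = m1.get(i);
            if (!shared.contains(d.act())) continue;
            Set<String> recorded = f.get(d.act());
            if (recorded == null) {
                f.put(d.act(), d.resets());               // first transition with this action
            } else if (!recorded.equals(d.resets())) {
                // reset clock inconsistency: introduce a fresh (primed) action name
                int k = primes.merge(d.act(), 1, Integer::sum);
                String fresh = d.act() + "'".repeat(k);
                m1.set(i, new Trans(d.src(), fresh, d.guard(), d.resets(), d.dst()));
                addCoTransitions(m2, d.act(), fresh);
                addCoTransitions(m3, d.act(), fresh);
            }
        }
    }

    // copy every transition labelled with the original action, relabelled with the fresh one
    static void addCoTransitions(List<Trans> model, String original, String fresh) {
        List<Trans> copies = new ArrayList<>();
        for (Trans t : model) {
            if (t.act().equals(original)) {
                copies.add(new Trans(t.src(), fresh, t.guard(), t.resets(), t.dst()));
            }
        }
        model.addAll(copies);
    }
}
```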

Example 1

Fig. 2 shows an example of the conversion. In \(M_1\), there are two transitions that contain action a but only one has a clock reset. To resolve the clock reset inconsistency of \(M_1\), the new action \(a'\) is introduced, and \(M_1\) is converted into \(M_1''\) by changing the action name a of one transition to \(a'\), marked as an orange dashed line. In \(M_2\) and \(\phi \), by adding the corresponding new transitions, \(M_2''\) and \(\phi ''\) are obtained. In \(\phi ''\), the transitions with a and \(a'\) still have different reset information, so it is further changed to \(\phi '\) by adding a transition marked as a blue dotted line. Correspondingly, \(M_1''\) and \(M_2''\) are changed. Obviously, we can determine the reset information of each transition labelled with a (or \(a'\), \(a''\), \(a'''\)) in the automata \(M_1'\) and \(\phi '\).

Fig. 2. \(M_1\), \(M_2\) and \(\phi \) are converted into \(M'_1\), \(M'_2\) and \(\phi '\)

We now show that the verification of \(M'_1\Vert M'_2\) against \(\phi '\) is equivalent to the original verification of \(M_1\Vert M_2\) against \(\phi \).

Theorem 2

Checking \(M'_1{\Vert }M'_2\models \phi '\) is equivalent to checking \(M_1{\Vert } M_2\models \phi \).

Proof

We prove \(M_1 \Vert M_2 \not \models \phi \Leftrightarrow M'_1 \Vert M'_2 \not \models \phi '\). This is equivalent to proving \(\mathcal {L}(M_1\Vert M_2\Vert \overline{\phi }) \not = \emptyset \Leftrightarrow \mathcal {L}(M_1'\Vert M_2'\Vert \overline{\phi '}) \not = \emptyset \), where \(\overline{\phi }\) and \(\overline{\phi '}\) are the complements of \(\phi \) and \(\phi '\), respectively.

We first prove \(\mathcal {L}(M_1\Vert M_2\Vert \overline{\phi }) \not = \emptyset \Rightarrow \mathcal {L}(M_1'\Vert M_2'\Vert \overline{\phi '}) \not = \emptyset \). The left hand side implies that \(M_1\Vert M_2\Vert \overline{\phi }\) has at least one accepting run \(\rho \). According to the construction of \(M'_1\), \(M'_2\) and \(\phi '\), for the composed model \(M_1'\Vert M_2'\Vert \overline{\phi '}\), compared with \(M_1\Vert M_2\Vert \overline{\phi }\), the locations and the guards of transitions remain the same, although some auxiliary transitions have been added to the model where actions are renamed. So we can construct a run \(\rho '\) in \(M_1'\Vert M_2'\Vert \overline{\phi '}\), which visits the locations in the same order as \(\rho \). Since \(\rho \) is an accepting run, its final location must be an accepting one, which implies \(\rho '\) is an accepting run of \(M_1'\Vert M_2'\Vert \overline{\phi '}\), and \(trace(\rho ') \in \mathcal {L}(M_1'\Vert M_2'\Vert \overline{\phi '})\).

For \(\mathcal {L}(M_1'\Vert M_2'\Vert \overline{\phi '}) \not = \emptyset \Rightarrow \mathcal {L}(M_1\Vert M_2\Vert \overline{\phi }) \not = \emptyset \), since \(\mathcal {L}(M_1'\Vert M_2'\Vert \overline{\phi '}) \not = \emptyset \), there exists at least one accepting run \(\rho '\) in \(M_1'\Vert M_2'\Vert \overline{\phi '}\). Again, by the construction of \(M'_1\), \(M'_2\) and \(\phi '\), we can construct an accepting run \(\rho \) in \(M_1\Vert M_2\Vert \overline{\phi }\) by replacing the newly introduced actions along \(\rho '\) with their original names, and \(trace(\rho )\) is evidence that \(\mathcal {L}(M_1\Vert M_2\Vert \overline{\phi }) \not = \emptyset \).    \(\square \)

Complexity. For the model conversion, Algorithm 1 mainly consists of two invocations of Algorithm 2, which has a nested loop. In the worst-case execution of Algorithm 2, the transitions of \(\mathcal {M}_1\) are traversed in the outer loop and the transitions of \(\mathcal {M}_1, \mathcal {M}_2\) and \(\mathcal {M}_3\) in the inner loops, so the time complexity is polynomial, namely quadratic in the number of transitions.

3.3 Membership Queries

After model conversion, a number of membership queries are used to learn the DOTA assumption A. For each membership query, the learner provides the teacher a logical-timed word \(\omega _l= (\sigma _1,\textbf{v}_1)(\sigma _2,\textbf{v}_2)\cdots (\sigma _n,\textbf{v}_n)\) to obtain clock reset information, where \(\sigma _i\in \varSigma _A\) and \(|\textbf{v}_i|=1\). Based on the converted models, the teacher supplements the corresponding reset information \(\gamma _i\) for each \(\sigma _i\) in \(\omega _l\) to construct the reset-logical-timed word \(\omega _{rl}=(\sigma _1,\textbf{v}_1,\gamma _1)(\sigma _2,\textbf{v}_2,\gamma _2)\ldots (\sigma _n,\textbf{v}_n,\gamma _n)\). Though the learning algorithm we use is associated with one clock and the hypothesis we obtain is always a DOTA, \(M'_1\) and \(\phi '\) might have multiple clocks since they are not necessarily DOTAs. This raises the question of how to transform the multi-clock reset information to the single-clock reset information. To solve this problem, we use a heuristic to generate the one-clock reset information \(\gamma _i\) for each action \(\sigma _i\). Let X be the finite set of clocks of \(M'_1\) and \(\phi '\) with \(|X|>1\), and let x be the single clock of the learned assumption. For each action \(\sigma _i\), we tried four heuristics to determine whether x is reset: 1) random assignment, 2) \(\gamma _i\) is always \(\{x\}\), 3) \(\gamma _i\) is always \(\emptyset \), and 4) a dynamic reset rule (if there exists a reset clock \(y\in X\), then \(\gamma _i=\{x\}\), otherwise \(\gamma _i=\emptyset \)). We use the fourth since it yields the least checking time. After obtaining the reset-logical-timed word \(\omega _{rl}\), the teacher further checks whether it satisfies \(\phi '\) under the environment of \(M'_1\) by model checking whether \(M_1'\Vert A_{\omega _{rl}}\models \phi '\), where \(A_{\omega _{rl}}\) is the automaton constructed from \({\omega _{rl}}\).
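
A minimal sketch of the dynamic reset rule (option 4 above) is given below; it assumes that the model conversion step already yields a unique reset set per action in \(\varSigma _A\), and all names are our own.

```java
// The single clock x of the learned DOTA is reset exactly when some clock of the
// converted models is reset on the corresponding transition.
import java.util.*;

public class ResetHeuristic {
    // resetInfo maps each action in Sigma_A to the (unique) set of clocks it resets in
    // M1' and phi'; returns, per position of the queried word, whether x is reset.
    static List<Boolean> annotateResets(List<String> actions, Map<String, Set<String>> resetInfo) {
        List<Boolean> resets = new ArrayList<>();
        for (String sigma : actions) {
            Set<String> clocks = resetInfo.getOrDefault(sigma, Set.of());
            resets.add(!clocks.isEmpty());   // gamma_i = {x} iff some clock y in X is reset
        }
        return resets;
    }

    public static void main(String[] args) {
        Map<String, Set<String>> resetInfo = Map.of("send", Set.of("x1"), "ack", Set.of());
        System.out.println(annotateResets(List.of("send", "ack", "send"), resetInfo));
        // Prints [true, false, true]
    }
}
```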

As shown in Fig. 1, the step of model conversion is executed only once. It is then followed by the execution of the smart teacher we design, which only requires a polynomial number of membership queries for the assumption learning. Without the first step, the framework needs to turn to a normal teacher, in which case the reset information is obtained by guessing, and an exponential number of membership queries are required.

3.4 Candidate Queries

The candidate queries determine whether the learned hypothesis A satisfies the two premises of the AG reasoning rule.

The First Candidate Query. This step checks whether \(M_1'\Vert A\models \phi '\). If the answer is positive, we proceed to the second candidate query. Otherwise, a counterexample \(ctx_1\) with \(ctx_1{\downharpoonleft _{\varSigma _A}} \in \mathcal {L}(A)\) is generated and further analyzed by constructing a TA \(A_{ctx_1}\) such that \(M_1' \Vert A_{ctx_1} \not \models \phi '\). We then check whether \(ctx_1{\downharpoonleft _{\varSigma _A}} \in \mathcal {L}(M_2'){\downharpoonleft _{\varSigma _A}}\). If the result is positive, we have \(M_1' \Vert M_2' \not \models \phi '\). Otherwise, \(ctx_1{\downharpoonleft _{\varSigma _A}}\in \mathcal {L}(A) \setminus \mathcal {L}(A_w)\) and \(ctx_1\) serves as a negative counterexample to refine assumption A via the next round of membership queries.

The Second Candidate Query. This step checks whether \(M'_2 \models A\), i.e. \(\mathcal {L}(M_2'){\downharpoonleft _{\varSigma _A}} \subseteq \mathcal {L}(A)\). If yes, as \(M_1'\Vert A \models \phi '\) and \(M_2'\models A\), the verification algorithm terminates and we conclude \(M_1' \Vert M_2' \models \phi '\). Otherwise, a counterexample \(ctx_2\) is generated and a TA \(A_{ctx_2}\) is constructed from the timed word \(ctx_2\). We check whether \(M_1' \Vert A_{ctx_2} \not \models \phi '\). If yes, as \({ctx_2{\downharpoonleft _{\varSigma _A}}} \in {\mathcal {L}(M_2'){\downharpoonleft _{\varSigma _A}}}\), we conclude \(M_1' \Vert M_2' \not \models \phi '\). Otherwise, \(ctx_2{\downharpoonleft _{\varSigma _A}} \in \mathcal {L}(A_w)\setminus \mathcal {L}(A)\) is a positive counterexample, indicating that a new round of learning is needed to refine and check A using membership and candidate queries until a conclusive result is obtained.

3.5 Correctness and Termination

We now show the correctness and termination of the framework.

Theorem 3

Given two deterministic timed automata \(M_1\) and \(M_2\), and property \(\phi \), if there exists a DOTA that accepts the target language \(\mathcal {L}(A_w)\), where \(A_w\) is the weakest assumption of the converted model \(M_2'\), the proposed learning-based compositional verification returns true if \(\phi \) holds on \(M_1\Vert M_2\) and false otherwise.

Proof

From Theorem 2, we only need to consider the converted models \(M'_1\), \(M'_2\) and \(\phi '\).

Termination. The proposed framework consists of the steps of model conversion, membership and candidate queries. We argue about the termination of the overall framework by showing the termination of each step.

By Algorithm 1 and Theorem 2, the step of model conversion terminates. Because the learning algorithm for DOTAs terminates [7], an assumption A is eventually obtained through membership queries. As to the candidate queries, they either conclude \(M'_1\Vert M'_2\models \phi '\) and then terminate, or provide a positive or negative counterexample ctx, that is, \(ctx{\downharpoonleft _{\varSigma _A}}\in \mathcal {L}(A_w)\setminus \mathcal {L}(A)\) or \(ctx{\downharpoonleft _{\varSigma _A}}\in \mathcal {L}(A)\setminus \mathcal {L}(A_w)\), for the refinement of A.

For the weakest assumption \(A_w\), since there exists a DOTA which accepts \(\mathcal {L}(A_w)\), the framework eventually constructs \(A_w\) in some round to produce the positive answer \(M'_1\Vert A_w\models \phi '\) to the first candidate query. As shown in Sect. 3.4, we can check whether \( \mathcal {L}(M'_2){\downharpoonleft _{\varSigma _A}} \subseteq \mathcal {L}(A)\). If the result is positive, we have \(M'_1\Vert M'_2\models \phi ' \) and the framework terminates. Otherwise, a counterexample \({ctx_2{\downharpoonleft _{\varSigma _A}}\in {\mathcal {L}(M'_2){\downharpoonleft _{\varSigma _A}}\setminus \mathcal {L}(A_w)}}\) is generated. So \(M'_1\Vert M'_2\not \models \phi '\), and \(ctx_2\) is a witness to the fact that \(M'_1\Vert M'_2\) violates \(\phi '\).

Correctness. Since there exists a DOTA that accepts the target language \(\mathcal {L}(A_w)\), the framework always eventually terminates with a result which is either true or false. It is true only if both candidate queries return true, which means that \(\phi '\) holds on \(M'_1 \Vert M'_2\). Otherwise, a counterexample \({ctx{\downharpoonleft _{\varSigma _A}} \not \in \mathcal {L}(A_w)}\) is generated. Since \(M'_1 \Vert A_{ctx} \not \models \phi '\) and \({ctx{\downharpoonleft _{\varSigma _A}}} \in {\mathcal {L}(M'_2){\downharpoonleft _{\varSigma _A}}}\), we have \(M'_1 \Vert M'_2 \not \models \phi '\).

   \(\square \)

In some cases, it is possible that there is no DOTA that accepts \(\mathcal {L}(A_w)\); in such cases, termination of the proposed verification framework cannot be guaranteed. However, the framework is still sound, meaning that whenever a DOTA assumption is learned and the verification terminates with a result, the result holds. Therefore, the framework is able to handle more flexible models such as multi-clock models. We will explore this with experiments in Sect. 5.

Theorem 4

Given two deterministic timed automata \(M_1'\) and \(M_2'\) which might have multiple clocks, and property \(\phi '\), even if there is no DOTA that accepts the target language \(\mathcal {L}(A_w)\), the proposed verification framework is still sound.

Proof

Given \(M'_1\) and \(M'_2\) which are multi-clock timed automata, suppose that in some round the learned DOTA assumption A satisfies \(\mathcal {L}(A) \subseteq \mathcal {L}(A_w)\) and \(\mathcal {L}(M'_2){\downharpoonleft _{\varSigma _A}} \subseteq \mathcal {L}(A)\). Then the results of both the first and second candidate queries are positive. Hence, the verification terminates and \(M'_1\Vert M'_2\models \phi '\) holds. By the same reasoning, if a counterexample ctx is generated such that \(M'_1 \Vert A_{ctx} \not \models \phi '\) and \({ctx{\downharpoonleft _{\varSigma _A}}\in \mathcal {L}(M'_2){\downharpoonleft _{\varSigma _A}}}\), this implies that \(M'_1\Vert M'_2\not \models \phi '\) and the verification terminates with a valid result.    \(\square \)

The framework is not complete though. For an \(M_1\) with multiple clocks, it is not guaranteed that there is a DOTA assumption A such that \(\mathcal {L}(A)=\mathcal {L}(A_w)\). Thus, the framework is not guaranteed to terminate. Furthermore, for an \(M_2\) with multiple clocks, the framework may not be able to learn a DOTA assumption A such that \(\mathcal {L}(M'_2){\downharpoonleft _{\varSigma _A}} \subseteq \mathcal {L}(A)\), even though \(M'_1\Vert M'_2\models \phi '\).

4 Optimization Methods

In this section, we give two improvements to the verification framework proposed in Sect. 3. The first one reduces the state space and the number of membership queries by exploiting the given information of \(M'_1\) and \(\phi '\). The second one uses a smaller alphabet than \(\varSigma _A = (\varSigma '_1 \cup \varSigma '_{\phi }) \cap \varSigma '_2\) to improve the verification speed.

4.1 Using Additional Information

In the process of learning assumption A with respect to \(M'_1\) and \(\phi '\), we make better use of the available information of \(M'_1\) and \(\phi '\). It is clear that the more actions can take place from a learned location, the more successor locations are likely to exist from that location, and the more symbolic states are needed. In general, not all actions are enabled in a location. Since the logical-timed words of the models \(M_1'\) or \(\phi '\) are known beforehand, the sequences of actions that can be taken can be obtained. Therefore, we can use this information to remove those actions which do not take place from a certain location, reducing the number of successor states. Furthermore, the number of membership queries can be reduced by directly answering those queries whose timed words violate the action sequences. This accelerates the learning process as well as speeding up the verification to some extent. The experiments in the next section also demonstrate these improvements.

For example, suppose \(M_1'\) has two actions read and write. In addition, it is known that the write action can only be performed after read has been executed. So, we add such information to the learning step of the verification framework, that is, read should take place before write in any timed word. Thus, for a membership query with such a word \(\omega _l=\ldots (write, \textbf{v}_k)\ldots (read, \textbf{v}_m)\ldots \), where write takes place before read, a negative answer is directly returned without the model checking steps for membership queries described in Sect. 3.3.

The additional information is usually derived from the design rules and other characteristics of the system under study. In the implementation, we provide some basic keywords to describe the rules, e.g. “beforeAfter” specifies the order of actions, and “startWith” specifies a certain action should be executed first. Therefore, the above example is encoded as “[beforeAfter]:(read,write)”.
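
As a sketch of how such a rule could be evaluated before issuing a membership query (names are illustrative and not the actual encoding of our tool), a word violating "[beforeAfter]:(read,write)" can be rejected without any model checking:

```java
// Check whether an action sequence respects "first must occur before second".
import java.util.List;

public class OrderRule {
    static boolean respectsBeforeAfter(List<String> actions, String first, String second) {
        boolean seenFirst = false;
        for (String a : actions) {
            if (a.equals(first)) seenFirst = true;
            if (a.equals(second) && !seenFirst) return false;  // violation: answer negatively
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(respectsBeforeAfter(List.of("read", "write"), "read", "write"));  // true
        System.out.println(respectsBeforeAfter(List.of("write", "read"), "read", "write"));  // false
    }
}
```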

4.2 Minimizing the Alphabet of the Assumption

In our framework, the automated AG procedure uses a fixed assumption alphabet \(\varSigma _A = (\varSigma '_1 \cup \varSigma '_{\phi }) \cap \varSigma '_2\). However, there may exist an assumption \(A_s\) over a smaller alphabet \(\varSigma _s\subset \varSigma _A\) that satisfies the two premises of the AG rule. We thus propose and implement another improvement to build the timed assumption over a minimal alphabet. A smaller alphabet directly reduces the number of membership queries and thus speeds up the verification process.

Theorem 5

Given \(\varSigma _A = (\varSigma '_1 \cup \varSigma '_{\phi }) \cap \varSigma '_2\), if there exists an assumption \(A_s\) over non-empty alphabet \(\varSigma _s \subset \varSigma _A\) satisfying \(M_1' \Vert A_s \models \phi '\) and \(M_2' \models A_s\), then there must exist an assumption A over \(\varSigma _A\) satisfying \(M_1' \Vert A \models \phi '\) and \(M_2'\models A\).

Proof

Based on \(A_s\), we can construct a timed assumption A over \(\varSigma _A\) as follows. For \(A_s = (Q_s,q^s_0,\varSigma _s,F_s,X_s,\varDelta _s)\), we first build \(A = (Q,q_0,\varSigma _A,F,X,\varDelta )\) where \(Q = Q_s, q_0 = q^s_0, F = F_s, \varDelta = \varDelta _s\) and \(X=X_s\). Then for every \(q \in Q\) and every \(\sigma \in \varSigma _A \setminus \varSigma _s\), we add \((q, \sigma ,true, \emptyset , q)\) into \(\varDelta \).

We now prove that with such an A, \(M_1' \Vert A \models \phi '\) and \(M_2'\models A\) still hold, which implies \(M_1'\Vert M_2'\models \phi '\). Since the locations of A and \(A_s\) are the same, the locations of \(M_1'\Vert A \) and \(M_1' \Vert A_s\) are the same. For the composed model \(M'_1\Vert A\) and a newly added transition \(\delta _{new} = (q, \sigma , true, \emptyset , q)\) from state q in A, since \(\sigma \in \varSigma _A \setminus \varSigma _s\), it will be synchronized with a transition of the form \(\delta _1=(q_c,\sigma ,\varphi _c,\gamma _c,q'_c)\) in \(M_1'\). So in \(M'_1\Vert A\), the composed transition with respect to \((q_c,q)\) and \(\sigma \) is \(((q_c,q), \sigma , \varphi _c,\gamma _c, (q'_c,q))\). In \(M_1'\Vert A_s\), for the same transition \(\delta _1\) in \(M_1'\), although there is no synchronizing transition from state q in \(A_s\), the composed transition is still \(((q_c,q), \sigma , \varphi _c,\gamma _c, (q'_c,q))\). So \(M_1' \Vert A \models \phi '\). According to the construction of A from \(A_s\), as \(M_2'\models A_s\), i.e. \(\mathcal {L}(M'_2){\downharpoonleft _{\varSigma _{s}}} \subseteq \mathcal {L}(A_s)\), it follows that \(M_2'\models A\).    \(\square \)
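
The construction used in the proof can be sketched as follows (our own data types): A extends \(A_s\) by adding, for every location and every action in \(\varSigma _A \setminus \varSigma _s\), a self-loop with guard true and no resets.

```java
// Extend an assumption A_s over Sigma_s to an assumption A over Sigma_A.
import java.util.*;

public class ExtendAssumption {
    record Trans(String src, String act, String guard, Set<String> resets, String dst) {}

    static List<Trans> extend(List<Trans> deltaS, Set<String> locations,
                              Set<String> sigmaA, Set<String> sigmaS) {
        List<Trans> delta = new ArrayList<>(deltaS);             // keep all transitions of A_s
        for (String q : locations) {
            for (String sigma : sigmaA) {
                if (!sigmaS.contains(sigma)) {
                    delta.add(new Trans(q, sigma, "true", Set.of(), q));  // (q, sigma, true, {}, q)
                }
            }
        }
        return delta;
    }

    public static void main(String[] args) {
        List<Trans> deltaS = List.of(new Trans("q0", "send", "x<=2", Set.of("x"), "q1"));
        // adds self-loops on the action "log" (in Sigma_A but not Sigma_s) at q0 and q1
        System.out.println(extend(deltaS, Set.of("q0", "q1"), Set.of("send", "log"), Set.of("send")));
    }
}
```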

The main problem with a smaller alphabet is that the AG rule is no longer complete, as already observed for deterministic finite automata [18]. The problem persists for timed automata. If \(\varSigma _s\subset \varSigma _A\), there might not exist an assumption \(A_s\) over \(\varSigma _s\) that satisfies the two premises of AG even though \(M_1' \Vert M_2'\models \phi '\). In this situation, we say \(\varSigma _s\) is incomplete and needs to be refined. So each time we find that \(\varSigma _s\) is incomplete, we select another \(\varSigma '_s \subset \varSigma _A\) and restart the learning algorithm. If a large number of refinement rounds is needed, the speed of the verification decreases significantly. To compensate for this, we reuse the counterexamples that indicated the incompleteness of \(\varSigma _s\) in previous loops and store them in a list \(List_c\). Before starting a new round of learning, we use \(List_c\) to judge in advance whether the current \(\varSigma _s'\) is appropriate. We say \(\varSigma _s'\) is appropriately selected only if none of the counterexamples in \(List_c\) indicates that \(\varSigma _s'\) is incomplete.

With a small alphabet \(\varSigma _s \subset \varSigma _A\), we cannot directly conclude the verification result if \(M_1' \Vert M_2' \not \models \phi '\). The reason is that any counterexample ctx satisfying \(M_1' \Vert A_{ctx} \not \models \phi ' \wedge ctx{\downharpoonleft _{\varSigma _s}} \in \mathcal {L}(M_2'){\downharpoonleft _{\varSigma _s}}\) will be used to illustrate the incompleteness of \(\varSigma _s\), even though in some cases ctx indeed indicates that \(M_1' \Vert M_2' \not \models \phi '\) over \(\varSigma _A\). As a result, the treatment of such counterexamples slows down the whole verification if \(M_1' \Vert M_2' \not \models \phi '\). To solve this, we need to detect real counterexamples earlier. We first check whether \(M_1' \Vert A_{ctx} \not \models \phi '\wedge ctx{\downharpoonleft _{\varSigma _A}} \in \mathcal {L}(M_2'){\downharpoonleft _{\varSigma _A}}\) holds. If the result is yes, the verification concludes \(M_1' \Vert M_2' \not \models \phi '\). Otherwise ctx is used to refine the assumption over a new \(\varSigma '_s\).
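
The bookkeeping with \(List_c\) can be sketched as follows; the predicate that decides whether a stored counterexample rules out a candidate alphabet abstracts the model-checking test and is a placeholder of our own.

```java
// Reuse stored counterexamples to rule out candidate sub-alphabets before restarting learning.
import java.util.*;
import java.util.function.BiPredicate;

public class AlphabetRefinement {
    // listC collects counterexamples that previously witnessed incompleteness
    static Optional<Set<String>> pickAlphabet(List<Set<String>> candidates,
                                              List<String> listC,
                                              BiPredicate<String, Set<String>> indicatesIncomplete) {
        for (Set<String> sigmaS : candidates) {
            boolean ruledOut = listC.stream().anyMatch(ctx -> indicatesIncomplete.test(ctx, sigmaS));
            if (!ruledOut) return Optional.of(sigmaS);   // appropriate: no stored ctx rules it out
        }
        return Optional.empty();                         // fall back to the full alphabet Sigma_A
    }

    public static void main(String[] args) {
        List<Set<String>> candidates = List.of(Set.of("send"), Set.of("send", "ack"));
        // trivial stand-in predicate: a stored counterexample rules out alphabets of size one
        System.out.println(pickAlphabet(candidates, List.of("ctx0"), (ctx, s) -> s.size() < 2));
    }
}
```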

5 Experimental Results

We implemented the proposed framework in Java. The membership queries and candidate queries are executed by calling the model checking tool UPPAAL. We evaluated the implementation on the benchmark of AUTOSAR (Automotive Open System Architecture) case studies. All the experiments were carried out on a 3.7GHz AMD Ryzen 5 5600X processor with 16GB RAM running 64-bit Windows 10. The source code of our tool and experiments is available in [2].

AUTOSAR is an open and standardized software architecture for automotive ECUs (Electronic Control Units). It consists of three layers, from top to bottom: AUTOSAR Software, AUTOSAR Runtime Environment (RTE), and Basic Software [1]. Its safety guarantee is very important [13, 34, 40]. A formal timed model of the AUTOSAR architecture consists of several tasks and their corresponding runnables, different communication mechanisms between any two runnables, RTE communication controllers and task schedulers. In terms of different numbers of tasks and runnables, we designed three composed models: the small-scale model AUTOSAR-1 (8 automata) and the larger composed models AUTOSAR-2 (14 automata) and AUTOSAR-3 (14 automata). The properties of the architecture to be checked are: 1) buffers between two runnables will never overflow or underflow, and 2) for a pair of sender runnable and receiver runnable, they should not execute the write action simultaneously. The checking methods used in the experiments are: 1) traditional monolithic model checking via UPPAAL, 2) the compositional verification framework we propose (CV), 3) CV with the first improvement that uses additional information of \(M_1'\) and \(\phi '\) (CV+A), 4) CV with the second improvement that minimizes the assumption alphabet (CV+M), and 5) CV with both improvements (CV+A+M). Each experiment was conducted five times to calculate the average verification time. Tables 1-4 show the detailed verification results for each property using these methods, where Case IDs are given in the format n-m-k-l, denoting respectively the identifier of the verified property, the number of locations of \(M_2\), the number of clocks of \(M_2\), and the alphabet size of \(M_2\). The Boolean variable Valid denotes whether the property is satisfied. The symbols |Q|, \(|\varSigma |\), R, and \(T_{\text {mean}}\) stand for the number of locations and the alphabet size of the learned assumption, the number of alphabet refinements during learning, and the average verification time in seconds, respectively.

1) AUTOSAR-1 Experiment. AUTOSAR-1 consists of 8 timed automata: 4 runnables, 2 buffers, and 2 schedulers used for scheduling the runnables. We partition the system into two parts, where \(M_1\) is a DOTA and \(M_2\) is composed of 7 DOTAs. The experimental results for this case are recorded in Table 1, where the proposed compositional verification (CV) outperforms the monolithic checking via UPPAAL except for cases 1-71424-7-8 and 3-71424-7-8. This is because, for these two cases, the learning algorithm needs more than 30 rounds to refine assumptions using generated counterexamples. However, with the first improvement (CV+A), i.e. CV with additional information of \(M_1'\), the verification time decreases drastically for these two cases. Similarly, with the second improvement (CV+M), i.e. CV with a minimized alphabet, the verification time decreases due to fewer membership queries. With both improvements (CV+A+M), compared with the single ones, the checking time varies depending on the actual case. As shown in Table 1, in the case of checking property 1 with CV+A, since the alphabet size of the learned assumption A is the largest one, i.e. 3, the second improvement can take effect. So the verification time using CV+A+M is less than that using CV+A. However, it is worse than CV+M.

Table 1. Verification Results for AUTOSAR-1 where \(M_1\) is a DOTA.

We have discussed in Sect. 3.5 that the framework can handle models where \(M_1\) is a multi-clock timed automaton, though termination is not guaranteed. So, we also repartition the AUTOSAR-1 system into two parts for verification, where \(M_1\) is composed of 7 DOTAs. The results in Table 2 reveal that the proposed compositional method outperforms UPPAAL in most of the cases except case 5-4-1-2. The reason is that UPPAAL might find a counterexample faster than the compositional approach because of its on-the-fly technique, which terminates the verification once a counterexample is found. In contrast, our framework needs to spend some time learning the assumption before searching for the counterexample, so the verification takes longer to terminate. In the experiments, we also observe that the time varies with the selection of \(M_1\). Therefore, a proper selection of the components composed as \(M_1\) or \(M_2\) can lead to a faster verification, while ensuring termination of the framework.

Table 2. Verification Results for AUTOSAR-1 where \(M_1\) is a composition of DOTAs

2) AUTOSAR-2 Experiment. AUTOSAR-2 is a more complex system with 14 automata in total, including 6 runnables and a task to which the runnables are mapped, 5 buffers, an RTE and a scheduler. In this experiment, we select \(M_1\) as a composition of several DOTAs. The results in Table 3 show that for properties 1-4, UPPAAL fails to obtain checking results due to the large state space, whereas our compositional approach can finish the verification for all the properties within 300 seconds using the same memory size. This indicates that the framework can reduce the state space significantly in some cases.

Table 3. Verification Results for AUTOSAR-2

3) AUTOSAR-3 Experiment. The system consists of 14 components, where both \(M_1\) and \(M_2\) are compositions of several DOTAs. The checking results shown in Table 4 illustrate that the minimal-alphabet improvement can obtain the smallest alphabet, of size 1, thus reducing the verification time. However, the additional-information improvement performs badly in most cases.

Table 4. Verification Results for AUTOSAR-3

6 Conclusion

Though assume-guarantee reasoning can help alleviate the state space explosion problem of a composite model in model checking, its practical impact has been limited by the non-trivial human interaction needed to obtain the assumption. In this paper, we propose a learning-based compositional verification framework for deterministic timed automata, where the assumption is learned as a deterministic one-clock timed automaton. We design a model conversion algorithm to acquire the clock reset information of the learned assumption, which reduces the learning complexity, and prove that this conversion preserves the verification results. To make the framework applicable to multi-clock systems, we design a smart teacher with a heuristic to answer clock reset information. We also prove the correctness and termination of the framework. To speed up the verification, we further propose two improvements to the learning process. We implemented the framework and performed experiments to evaluate our method. The results show that it outperforms monolithic model checking, and that the state space can be effectively reduced. Moreover, the improvements also have positive effects on most of the studied systems.