Abstract
An n-complete test suite for automata guarantees to detect all faulty implementations with a bounded number of states. We propose a construction of such a test suite for ioco conformance on labeled transition systems, which we derive from construction methods for deterministic FSMs. Our resulting test suite poses no further restrictions on the implementations other than their number of states and fairness in test execution. This lifts restrictions made in existing methods. In particular, we address the problem of compatible states: specification states which can be implemented by a single state. Such states are forbidden by existing methods for ioco, as they complicate test suite construction.
1 Introduction
The holy grail of model-based testing is a complete test suite: a test suite that can detect any possible faulty implementation. This is impossible for black-box testing, since a tester can only make a finite number of observations, but for an implementation of unknown size, it is unclear when to stop. Often, a so-called n-complete test suite is used to tackle this problem, meaning it is complete for all implementations with at most n states.
A celebrated result for deterministic finite state machines (FSMs, or Mealy machines) is the existence of efficient n-complete test suites (Chow 1978). Nowadays, many variations exist (Dorofeeva et al. 2010), all of which share the basic structure. The test suites usually provide some way to reach all states and transitions of the implementation. After reaching some implementation state, state identification is used to test whether it is equivalent to the intended specification state: the intended state is distinguished from inequivalent specification states.
We will explore how an n-complete test suite can be constructed for more general labeled transition systems instead. We use the ioco relation (Tretmans 2008) as a conformance relation. Unlike FSM equivalence, ioco is not an equivalence relation, meaning that many inequivalent implementations may conform to the same specification and, conversely, an implementation may conform to several inequivalent specifications. Specification states which can be implemented with a single state are called compatible. Standard distinguishing techniques cannot be applied to compatible states. We investigate and characterize the notion of compatibility, and we introduce an alternative to the usual way of distinguishing compatible states. Using these insights, we give a construction for an n-complete test suite and prove it to be correct.
We already addressed this problem in van den Bos et al. (2017). This paper improves on it in the following ways:

We give a more detailed discussion on equivalence and compatibility of states, and we discuss the construction of a merge of two states explicitly. In particular, we now prove that states are compatible if and only if their merge is valid.

The algorithm in van den Bos et al. (2017) to compute distinguishing trees assumes incompatible states, but no means of deciding compatibility of states is given. In this paper, we instead give an algorithm for computing the compatibility relation, while simultaneously computing witnesses (i.e., distinguishing graphs) if states are incompatible.

Instead of distinguishing trees, we use distinguishing graphs by reusing nodes. This gives a more efficient algorithm, as the graphs are polynomial in size.

For sets of distinguishing graphs, we define the notions of characterization sets and (harmonized set of) state identifiers. This makes the relation to FSM theory more explicit.

We add examples to highlight properties of compatibility, and the construction and execution of test suites.
This paper is structured as follows. In Section 2, we introduce the domain of specifications and implementations, as well as the ioco relation. Furthermore, we give a short overview of existing theory on n-complete test suites for FSMs. We formalize the notions of equivalence of states in Section 3 and of compatibility in Section 4. In Section 5, we show how to compute distinguishing graphs for incompatible states. The construction of n-complete test suites is then described in Section 6, together with a correctness proof. We conclude in Section 7.
Related work
Testing methods for FSMs have been analyzed thoroughly, including n-complete test suites and various ways of distinguishing states. A survey is given by Dorofeeva et al. (2010). Progress has been made on generalizing these testing methods to nondeterministic FSMs. Petrenko and Yevtushenko (2005, 2011) use the reduction relation for testing nondeterministic FSMs, which resembles ioco more closely than equivalence.
Complete testing received less attention within ioco theory. With the original test generation method (Tretmans 2008), test cases are generated randomly. This method is described as complete, but only in the sense that any fault can eventually be found: there is no upper bound to the required number and length of test cases.
Paiva and Simao (2016) construct complete test suites for Mealy-IOTSes. Mealy-IOTSes are a subclass of labeled transition systems, but are similar to Mealy machines, as (sequences of) outputs are coupled to inputs. This makes the translation from FSM testing more straightforward.
The work most similar to ours is that of Simao and Petrenko (2014), which works on deterministic labeled transition systems. Some further restrictions are made on the specification domains. In particular, every specification state should be certainly reachable, i.e., all conforming implementations must implement that state. Furthermore, all states should be mutually incompatible, such that an implementation state cannot possibly conform to multiple specification states. In this sense, our test suite construction can be applied to a broader set of systems, potentially at the cost of efficiency. Thus, we explore the bounds of n-complete test suites for ioco in an unrestricted setting, whereas Simao and Petrenko (2014) aim at efficient test suites in a restricted setting.
2 Preliminaries
To model implementations and specifications, we use a particular domain of labeled transition systems, namely suspension automata. We essentially regard them as deterministic automata, for which the transitions are labeled with an input or output.
For the remainder of this paper, we fix L_{I} and L_{O}≠∅ as disjoint finite sets of input and output labels respectively, with L = L_{I} ∪ L_{O}. Furthermore, we use a, b as input labels and w, x, y, z as output labels. We use μ as a label that can be either input or output. The set L^{∗} denotes the set of sequences of labels in L. For a partial function \(f : X \rightharpoonup Y\), let f(x) ↑ and f(x) ↓ mean that f(x) is defined and undefined respectively.
Definition 1
An automaton with inputs and outputs is a tuple (Q, T, q_{0}) where

Q is a finite set of states,

\(T : Q \times L \rightharpoonup Q\) is the (partial) transition function, and

q_{0} ∈ Q is the initial state.
We interchangeably use T as a partial function and as the set of transitions T = {(q, μ, T(q, μ)) ∣ T(q, μ) ↑}. For q ∈ Q, we denote the set of enabled inputs and outputs in q by in(q) = {a ∈ L_{I} ∣ T(q, a) ↑} and out(q) = {x ∈ L_{O} ∣ T(q, x) ↑} respectively. An automaton (Q, T, q_{0}) is input-enabled if ∀q ∈ Q : in(q) = L_{I}, and nonblocking if ∀q ∈ Q : out(q) ≠ ∅.
The set of all automata with inputs and outputs is denoted by \(\mathcal {A}\mathcal {I}\mathcal {O}\). With \(\mathcal {S\!A}\), we denote the set of suspension automata, which are nonblocking automata with inputs and outputs. \(\mathcal {S\!A}_{\textit {IE}}\) denotes the set of input-enabled suspension automata.
We will use \(\mathcal {S\!A}\) as the domain of specifications, and \(\mathcal {S\!A}_{\textit {IE}}\) as the domain of implementations. Both thus have an output transition in every state, and implementations have a transition for every input (see Fig. 1 for an example specification and two implementations). We will encounter automata in \(\mathcal {A}\mathcal {I}\mathcal {O}\) only as an intermediate product of an operation introduced in Section 4.
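To make Definition 1 concrete, the following Python sketch encodes an automaton with inputs and outputs as a partial transition map and checks the two properties that separate \(\mathcal {S\!A}\) from \(\mathcal {S\!A}_{\textit {IE}}\). The class name `Automaton`, its field and method names, and the toy example are illustrative assumptions and not notation from this paper.

```python
from dataclasses import dataclass


@dataclass
class Automaton:
    """Hypothetical encoding of (Q, T, q0) over fixed label sets L_I and L_O."""
    states: set
    trans: dict      # partial function T: (state, label) -> state
    initial: object
    inputs: set      # L_I
    outputs: set     # L_O, assumed non-empty

    def enabled_inputs(self, q):
        """in(q): the inputs with a defined transition from q."""
        return {a for a in self.inputs if (q, a) in self.trans}

    def enabled_outputs(self, q):
        """out(q): the outputs with a defined transition from q."""
        return {x for x in self.outputs if (q, x) in self.trans}

    def is_input_enabled(self):
        return all(self.enabled_inputs(q) == self.inputs for q in self.states)

    def is_nonblocking(self):
        return all(self.enabled_outputs(q) for q in self.states)


# Toy specification: nonblocking, but input 'a' is unspecified in state 2.
spec = Automaton(
    states={1, 2},
    trans={(1, "a"): 2, (1, "x"): 1, (2, "y"): 1},
    initial=1,
    inputs={"a"},
    outputs={"x", "y"},
)
```

In this encoding, `spec` would qualify as a specification (it is nonblocking) but not as an implementation, since it is not input-enabled.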
At this point, we remark that \(\mathcal {S\!A}\) and \(\mathcal {S\!A}_{\textit {IE}}\) are different from the usual implementation and specification domains for ioco: the original theory considers nondeterministic labeled transition systems with inputs, outputs, internal transitions, and the artificial output quiescence, i.e., observation of the absence of explicit outputs. Quiescence ensures that every labeled transition system in ioco theory is nonblocking. By determinizing these nonblocking labelled transition systems, any labeled transition system may equivalently be expressed as a suspension automaton (Tretmans 2008). For suspension automata, we will consider quiescent transitions to be output transitions like any other. By using suspension automata, we thus do not need to concern ourselves with nondeterminism and internal transitions, as suspension automata describe the observable behavior.
Readers familiar with suspension automata may remark that they usually adhere to particular restrictions. For example, quiescence should not be followed by any output (other than quiescence itself) and it should not cause any actual transition in the underlying nondeterministic labeled transition system. We refer to Willemse (2006) for a more elaborate description of these restrictions. We will not pose such restrictions in order to simplify reasoning, and our domains are thus a generalization of the usual domains for ioco theory. This implies soundness of our test suites: if any faulty implementation in our general domain can be detected, then we can certainly detect all faults in a more restricted implementation domain. When reusing results from other works in which this difference is relevant, we will clarify the translation to our domain.
Throughout the paper, we will use the following notation (Definition 2), where 𝜖 denotes the empty sequence. With after, we lift the transition relation to sets of states and sequences of labels. With traces, we denote the set of all traces of a set of states. We also lift in and out to sets, and use init to obtain the labels of all enabled transitions. We sometimes interchange a singleton set with its element, e.g., we write out(q) instead of out({q}). Following Simao and Petrenko (2014), we write S/q to refer to the suspension automaton starting in state q of specification S.
Definition 2
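Let \(S = (Q,T,q_{0}) \in \mathcal {A}\mathcal {I}\mathcal {O}\), q ∈ Q, B ⊆ Q, μ ∈ L and σ ∈ L^{∗}. Then, we define
\[
\begin{array}{rcl}
q \mathrel{\textsf{after}} \epsilon &=& \{q\} \\
q \mathrel{\textsf{after}} \mu\sigma &=& \begin{cases} T(q,\mu) \mathrel{\textsf{after}} \sigma & \text{if } T(q,\mu){\uparrow} \\ \emptyset & \text{otherwise} \end{cases} \\
B \mathrel{\textsf{after}} \sigma &=& \bigcup_{q \in B} q \mathrel{\textsf{after}} \sigma \\
\textsf{traces}(B) &=& \{\sigma \in L^{*} \mid B \mathrel{\textsf{after}} \sigma \neq \emptyset\} \\
\textsf{in}(B) &=& \bigcup_{q \in B} \textsf{in}(q) \\
\textsf{out}(B) &=& \bigcup_{q \in B} \textsf{out}(q) \\
\textsf{init}(B) &=& \textsf{in}(B) \cup \textsf{out}(B) \\
S \mathrel{\textsf{after}} \sigma &=& q_{0} \mathrel{\textsf{after}} \sigma \\
\textsf{traces}(S) &=& \textsf{traces}(q_{0}) \\
S/q &=& (Q,T,q)
\end{array}
\]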
The ioco relation formalizes when implementations conform to specifications. We give a definition relating traces, following (Tretmans 1996; Willemse 2006), and a coinductive definition relating states. This last definition can be seen as an alternating simulation. Several papers (Aarts and Vaandrager 2010; Noroozi 2014; Veanes and Bjørner 2012) have related the original ioco definition to alternating simulation, and proven that the two coincide for deterministic systems. Note that our domain of suspension automata extends the usual domain, and as such, our definition of ioco is also an extension with respect to Noroozi (2014) and Tretmans (1996).
Definition 3
Let \(S \in \mathcal {S\!A}\) and \(I \in \mathcal {S\!A}_{\textit {IE}}\). Then, we say that I ioco S if for all σ ∈ traces(S) we have out(I after σ) ⊆ out(S after σ).
Definition 4
Let \(S = (Q_{S},T_{S},{q_{0}^{S}}) \in \mathcal {S\!A}\) and \(I = (Q_{I},T_{I},{q_{0}^{I}}) \in \mathcal {S\!A}_{\textit {IE}}\). Then, for q_{I} ∈ Q_{I}, q_{S} ∈ Q_{S}, we say that q_{I} ioco q_{S} if there exists a relation R ⊆ Q_{I} × Q_{S} such that (q_{I}, q_{S}) ∈ R, and for all (q, q^{′}) ∈ R :

∀a ∈ in(q^{′}) : (T_{I}(q, a), T_{S}(q^{′}, a)) ∈ R, and

∀x ∈ out(q) : x ∈ out(q^{′}) and (T_{I}(q, x), T_{S}(q^{′}, x)) ∈ R.
Any such relation R is called a coinductive ioco relation.
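Since both automata are deterministic, the smallest candidate for R is the set of pairs reachable from (q_{I}, q_{S}) by following the two conditions of Definition 4, and such an R exists exactly when none of these pairs shows an output that the specification state forbids. A minimal Python sketch of this check follows; the function name `ioco_states`, the dictionary encoding of T_{I} and T_{S}, and the toy automata are assumptions made for illustration.

```python
from collections import deque


def ioco_states(t_impl, t_spec, q_impl, q_spec, inputs, outputs):
    """Return True if a coinductive ioco relation containing (q_impl, q_spec) exists.

    t_impl and t_spec are dicts (state, label) -> state; the implementation
    is assumed to be input-enabled, as required for SA_IE.
    """
    seen = {(q_impl, q_spec)}
    todo = deque(seen)
    while todo:
        q, s = todo.popleft()
        successors = []
        for a in inputs:
            if (s, a) in t_spec:  # only inputs specified in the spec state
                successors.append((t_impl[(q, a)], t_spec[(s, a)]))
        for x in outputs:
            if (q, x) in t_impl:
                if (s, x) not in t_spec:
                    return False  # the implementation shows a forbidden output
                successors.append((t_impl[(q, x)], t_spec[(s, x)]))
        for pair in successors:
            if pair not in seen:
                seen.add(pair)
                todo.append(pair)
    return True


# Toy example (not taken from the figures): the spec allows x or y,
# and leaves input a unspecified.
INPUTS, OUTPUTS = {"a"}, {"x", "y", "z"}
t_spec = {("s", "x"): "s", ("s", "y"): "s"}
t_good = {("i", "x"): "i", ("i", "a"): "i"}
t_bad = {("j", "z"): "j", ("j", "a"): "j"}
```

Here `t_good` only produces the allowed output x, while `t_bad` produces z, which the specification state does not allow.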
Proposition 5
Let \(S \in \mathcal {S\!A}\), \(I \in \mathcal {S\!A}_{\textit {IE}}\) and let \({q_{0}^{S}}\) and \({q_{0}^{I}}\) be their initial states. We have I ioco S if and only if \({q_{0}^{I}} \textup {\textsf { ioco }} {q_{0}^{S}}\).
The relation ioco is a preorder on input-enabled labeled transition systems (Tretmans 2008), and it is also a preorder on our extended domain \(\mathcal {S\!A}_{\textit {IE}}\). We introduce the notion of ioco counterexample as a witness for nonconformance, since this is sometimes convenient for reasoning about the ioco relation.
Definition 6
Let \(S \in \mathcal {S\!A}\), σ ∈ L^{∗}, and x ∈ L_{O}. We call σx an ioco counterexample for S if σ ∈ traces(S) and x ∉ out(S after σ).
Lemma 7
Let \(S \in \mathcal {S\!A}\) be a specification and \(I \in \mathcal {S\!A}_{\textit {IE}}\) an implementation. Then, I ioco S if and only if traces(I) contains no ioco counterexample for S.
Proof
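\[
\begin{array}{rl}
 & I \mathrel{\textsf{ioco}} S \\
\iff & \forall \sigma \in \textsf{traces}(S) : \textsf{out}(I \mathrel{\textsf{after}} \sigma) \subseteq \textsf{out}(S \mathrel{\textsf{after}} \sigma) \\
\iff & \forall \sigma \in \textsf{traces}(S), x \in L_{O} : x \in \textsf{out}(I \mathrel{\textsf{after}} \sigma) \implies x \in \textsf{out}(S \mathrel{\textsf{after}} \sigma) \\
\iff & \forall \sigma \in \textsf{traces}(S), x \in L_{O} : \sigma x \in \textsf{traces}(I) \implies x \in \textsf{out}(S \mathrel{\textsf{after}} \sigma) \\
\iff & \textsf{traces}(I) \text{ contains no ioco counterexample for } S
\end{array}
\]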
□
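Definition 6 can be checked directly on a deterministic specification by following σ and then looking up x. The Python sketch below does this; the helper names `after` and `is_ioco_counterexample`, the dictionary encoding of the transition function, and the toy specification are illustrative assumptions. States are assumed never to be `None`, since `None` is used to signal that σ is not a trace.

```python
def after(trans, q, sigma):
    """Follow the label sequence sigma from q; return None if sigma is not a trace of q."""
    for mu in sigma:
        if (q, mu) not in trans:
            return None
        q = trans[(q, mu)]
    return q


def is_ioco_counterexample(t_spec, q0, sigma, x, outputs):
    """sigma followed by x is a counterexample if sigma is a spec trace and x is not allowed after it."""
    reached = after(t_spec, q0, sigma)
    return x in outputs and reached is not None and (reached, x) not in t_spec


# Toy specification: state 0 allows output y and input a; state 1 allows only output x.
OUTPUTS = {"x", "y"}
t_spec = {(0, "a"): 1, (0, "y"): 0, (1, "x"): 0}
```

By Lemma 7, an implementation that has a trace for which `is_ioco_counterexample` holds cannot conform to the specification.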
Example 8
Figure 1 shows two implementations for the specification S in Fig. 1a. The first (Fig. 1b) is conforming and to see this we can define the relation R = {(1^{′},1),(2^{′},2),(2^{′},3),(5^{′},4),(5^{′},5),(6^{′},6)} and check that it is a coinductive ioco relation. In particular, observe that the state 2^{′} is related to two different specification states. This will be important when we discuss compatible states. Ioco counterexample awzx shows that Fig. 1c does not conform to the specification. (The final x is not allowed by the specification.)
2.1 n-Complete test suites for FSMs
As this paper is founded on the ideas of existing theory on n-complete test suites for deterministic complete FSMs (Chow 1978), we give a short overview to ease comparison.
A finite state machine (FSM) is a state machine in which every transition has both an input and output label. A deterministic complete FSM contains precisely one transition for every input in every state. We only consider deterministic complete FSMs in this section.
One can provide a sequence of inputs to an FSM, on which it will produce a sequence of outputs following the transitions. Every state can thus be characterized as a function from input sequences to output sequences, which induces an equivalence on states. When both the specification and the implementation are FSMs, we take equivalence of initial states as implementation relation. An input sequence represents a test for this equivalence: the sequence is provided to the implementation, and the outputs are compared to the specification. An n-complete test suite is a set of tests which detect all faulty implementations having at most n states.
If m is the number of states of a specification FSM, then an m-complete test suite can be constructed as follows. We construct a set P containing access sequences to every specification state and a set W containing sequences which distinguish every pair of specification states. The set P is usually called the state cover and W the characterization set. The set \(P \cdot L_{I}^{\le 1} \cdot W\) is then an m-complete test suite, with \(L_{I}^{\le 1}\) the set of input sequences of length 0 or 1. By executing every distinguishing sequence after every access sequence (P ⋅ W), we ensure that the implementation shows at least |P| different behaviors, i.e., the implementation has at least as many states as the specification. Executing the access sequence with an additional input before the distinguishing sequence (P ⋅ L_{I} ⋅ W) ensures that after every transition, we observe the correct destination state in the implementation. By extending the set \(L_{I}^{\leq 1}\) to \(L_{I}^{\leq k + 1}\), one can construct (m + k)-complete test suites. Such a test suite then detects all faulty implementations with at most k more states than the specification. There exist various variants of distinguishing sequences from which more efficient (i.e., smaller) test suites can be constructed. An overview is given in Dorofeeva et al. (2010).
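The construction \(P \cdot L_{I}^{\le k+1} \cdot W\) can be sketched in a few lines of Python. In this sketch, the state cover P is computed by breadth-first search, while the characterization set W is assumed to be supplied by the caller. The function names, the encoding of the FSM as a dictionary from (state, input) to (next state, output), and the two-state example are assumptions for illustration.

```python
from collections import deque
from itertools import product


def state_cover(delta, q0, inputs):
    """One shortest access sequence (as a tuple of inputs) per reachable state."""
    access = {q0: ()}
    todo = deque([q0])
    while todo:
        q = todo.popleft()
        for a in inputs:
            nxt, _output = delta[(q, a)]
            if nxt not in access:
                access[nxt] = access[q] + (a,)
                todo.append(nxt)
    return set(access.values())


def n_complete_suite(P, W, inputs, k=0):
    """All words p.m.w with p in P, m an input word of length <= k + 1, and w in W."""
    middle = [m for n in range(k + 2) for m in product(inputs, repeat=n)]
    return {p + m + w for p in P for m in middle for w in W}


# Two-state toy FSM: input a yields x in state 0 and y in state 1,
# so the single word (a,) distinguishes the two states.
INPUTS = ["a", "b"]
delta = {
    (0, "a"): (1, "x"), (0, "b"): (0, "y"),
    (1, "a"): (0, "y"), (1, "b"): (1, "y"),
}
P = state_cover(delta, 0, INPUTS)
W = {("a",)}
suite = n_complete_suite(P, W, INPUTS)
```

With `k=0` the middle part ranges over \(L_{I}^{\le 1}\), matching the m-complete case; a larger `k` extends it to \(L_{I}^{\le k+1}\). Because the suite is a set, duplicate words arising from different combinations are stored only once.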
3 Equivalent states
If two specifications or two specification states have precisely the same implementations conforming to them, it is impossible but also unnecessary to distinguish them. We provide a characterization of this equivalence.
Definition 9
Two specifications \(S_{1}, S_{2} \in \mathcal {S\!A}\) are equivalent, denoted S_{1} ≃ S_{2}, if \(\forall I \in \mathcal {S\!A}_{\textit {IE}}: I \mathrel {\textsf {ioco}} S_{1} \iff I \mathrel {\textsf {ioco}} S_{2}\).
This defines an equivalence relation. Algorithmically, it is useful to have a coinductive definition. However, a direct definition might be cumbersome as it has to relate explicit underspecification with implicit underspecification. The former is a specification which allows all outputs after an input transition while the latter is a specification which omits such an input transition altogether. One can make all underspecifications explicit with demonic completion (Tretmans 2008). This will lead to a simple coinductive definition of equivalence.
Definition 10
Let \(S = (Q,T,q_{0}) \in \mathcal {S\!A}\), and let χ ∉ Q. The demonic completion of S is defined as X(S) = (Q ∪ {χ}, T^{′}, q_{0}) where T^{′} = T ∪ {(q, a, χ) ∣ q ∈ Q, a ∈ L_{I}, T(q, a) ↓} ∪ {(χ, μ, χ) ∣ μ ∈ L}.
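As an illustration of Definition 10, the following Python sketch adds the chaos state χ and the missing input transitions to a transition map. The function name `demonic_completion`, the default state name `"chi"`, and the dictionary encoding are assumptions of this sketch.

```python
def demonic_completion(states, trans, inputs, outputs, chi="chi"):
    """Return (states, transitions) of X(S) for a transition dict (state, label) -> state."""
    if chi in states:
        raise ValueError("the chaos state must be fresh")
    completed = dict(trans)
    for q in states:
        for a in inputs:
            # An unspecified input leads to chi; specified inputs are kept.
            completed.setdefault((q, a), chi)
    for mu in set(inputs) | set(outputs):
        completed[(chi, mu)] = chi  # chi allows every label and never leaves
    return set(states) | {chi}, completed


# Toy specification with one state, one output, and an unspecified input a.
x_states, x_trans = demonic_completion({1}, {(1, "x"): 1}, {"a"}, {"x"})
```

The original dictionary is left untouched, so a specification and its completion can be compared side by side.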
Using the demonic completion, one can transform specifications to equivalent, input-enabled ones. The basic properties are listed in the next lemma. These properties are used on suspension automata by Beneš et al. (2015).
Lemma 11
For all \(S \in \mathcal {S\!A}\), we have that X(S) is input-enabled and S ≃ X(S). Moreover, we have X(S) ioco S.
With these properties, we can characterize equivalence as follows.
Lemma 12
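Let \(S_{1},S_{2} \in \mathcal {S\!A}\). Then, we have
\[S_{1} \simeq S_{2} \iff X(S_{1}) \mathrel{\textsf{ioco}} X(S_{2}) \wedge X(S_{2}) \mathrel{\textsf{ioco}} X(S_{1}).\]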
Proof
(⇒) Let S_{1} ≃ S_{2}. From X(S_{1}) ioco S_{1} (Lemma 11), it follows that X(S_{1}) ioco S_{2} by equivalence. By using Lemma 11 again, we conclude that X(S_{1}) ioco X(S_{2}). Similarly X(S_{2}) ioco X(S_{1}).
(⇐) Let \(I \in \mathcal {S\!A}_{\textit {IE}}\) and assume that I ioco S_{1}. We have to show that I ioco S_{2}. By Lemma 11 we have I ioco X(S_{1}), and by assumption, we have X(S_{1}) ioco X(S_{2}). Using the transitivity on \(\mathcal {S\!A}_{\textit {IE}}\), we get I ioco X(S_{2}). By Lemma 11, we conclude that I ioco S_{2}. The implication I ioco S_{2} to I ioco S_{1} is proven similarly. □
We note that the right-hand side in Lemma 12 can be defined coinductively by using Proposition 5. If we spell this out, we get the following definition.
Definition 13
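Let \(S \in \mathcal {S\!A}\) be a specification and X(S) = (Q_{X}, T_{X}, q_{0}) its demonic completion. A relation R ⊆ Q_{X} × Q_{X} is a coinductive equivalence relation if for all (q, q^{′}) ∈ R:

∀a ∈ L_{I} : (T_{X}(q, a), T_{X}(q^{′}, a)) ∈ R, and

out(q) = out(q^{′}) and ∀x ∈ out(q) : (T_{X}(q, x), T_{X}(q^{′}, x)) ∈ R.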
We define q ≈ q^{′} if there is a coinductive equivalence relation R with (q, q^{′}) ∈ R.
Proposition 14
Let \(S=(Q,T,q_{0}) \in \mathcal {S\!A}\) and q, q^{′} ∈ Q two states. Then, we have q ≈ q^{′} ⇔ S/q ≃ S/q^{′}.
Proof
By Lemma 12, we need to prove q ≈ q^{′} ⇔ X(S/q) ioco X(S/q^{′}) ∧ X(S/q^{′}) ioco X(S/q). Note that all relations involved here are on the set Q ∪ {χ}. (⇒) Any coinductive equivalence relation is also a coinductive ioco relation. (⇐) Let R and R^{′} be the coinductive ioco relations for X(S/q) ioco X(S/q^{′}) and X(S/q^{′}) ioco X(S/q) respectively. Then, we conclude that R ∪ R^{′} is a coinductive equivalence relation. □
4 Compatible states
For two inequivalent specification states, there may still exist an implementation that conforms to the two, which we should be able to handle in our test suite construction. For example, in Fig. 1, states 2 and 3 of the specification are both implemented by state 2^{′} of the implementation (as shown by ioco relation R in Example 8). In that case, we say that the two specification states are compatible, following the terminology introduced by Petrenko and Yevtushenko (2011) and Simao and Petrenko (2014). We give an explicit coinductive relation for compatibility and relate it to ioco in Lemma 24.
Definition 15
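Let \((Q,T,q_{0}) \in \mathcal {S\!A}\). A relation R ⊆ Q × Q is a compatibility relation if for all (q, q^{′}) ∈ R we have

(1) ∀a ∈ in(q) ∩ in(q^{′}) : (q after a, q^{′} after a) ∈ R, and

(2) ∃x ∈ out(q) ∩ out(q^{′}) : (q after x, q^{′} after x) ∈ R.

Two states q, q^{′} are compatible, denoted by q ◊ q^{′}, if there exists a compatibility relation R relating q and q^{′}. Otherwise, the states are incompatible, denoted by \(q \not\mathrel{\Diamond} q^{\prime}\).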
Lemma 16
Let \((Q,T,q_{0}) \in \mathcal {S\!A}\). The relation ◊ is the largest compatibility relation. Furthermore, ◊ is reflexive and symmetric.
Proof
Symmetry follows from the fact that the definition is symmetric, and reflexivity holds as (1) holds trivially for any (q, q), and (2) follows from suspension automata being nonblocking. Thus, {(q, q)∣q ∈ Q} is a compatibility relation.
Second, note that ◊ is a compatibility relation: for any element (q, q^{′}) ∈ ◊, there is a compatibility relation R and so any successors of q and q^{′} are related by R as well, meaning that the successors are also included in ◊. To show that ◊ is the largest, let R be any compatibility relation, then all its elements are included in ◊ by definition. □
Example 17 shows that compatibility is not transitive, thus it is not an equivalence relation. We will later show that equivalence is stronger than compatibility.
Example 17
In Fig. 2, we have 1 ◊ 2 and 1 ◊ 3, but \(2 \not\mathrel{\Diamond} 3\). This last fact can be immediately deduced from the common outputs of states 2 and 3, since out(2) ∩ out(3) = {y, z} ∩ {x} = ∅. From the observations {1, 2} after y = 2, {1, 3} after x = 2, and in({1, 2, 3}) = ∅, it follows that 1 ◊ 2 and 1 ◊ 3.
Definition 18
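Let \(S = (Q,T,q_{0}) \in \mathcal {S\!A}\). Define \(F_{\mathrel {\Diamond }}: \mathcal {P}(Q \times Q) \rightarrow \mathcal {P}(Q \times Q)\) as
\[
F_{\mathrel{\Diamond}}(R) = \left\{ (q,q^{\prime}) \in Q \times Q \;\middle|\; \begin{array}{l} \forall a \in \textsf{in}(q) \cap \textsf{in}(q^{\prime}) : (q \mathrel{\textsf{after}} a, q^{\prime} \mathrel{\textsf{after}} a) \in R, \text{ and} \\ \exists x \in \textsf{out}(q) \cap \textsf{out}(q^{\prime}) : (q \mathrel{\textsf{after}} x, q^{\prime} \mathrel{\textsf{after}} x) \in R \end{array} \right\}
\]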
Lemma 19
Relation ◊ can be computed iteratively as the greatest fixpoint of F_{◊}.
Proof
First, we remark that F_{◊} is a monotone function on the set of relations on Q. Define the relations ◊_{0} = Q × Q and ◊_{i+1} = F_{◊}(◊_{i}). Now note that ◊_{0} ⊇ F_{◊}(◊_{0}) and so by monotonicity, we get ◊_{0} ⊇ ◊_{1} ⊇ ◊_{2} ⊇ …. Since ◊_{0} is finite, this sequence stabilizes at some stage k: ◊_{k} = ◊_{k+1}. Due to the correspondence between F_{◊} and Definition 15, a relation U is a compatibility relation if and only if it is a fixpoint for F_{◊}. In particular, ◊_{k} = ◊. Since ◊ is reflexive, pairs (q, q) are not removed from ◊ during this computation, and since it is symmetric, we remove (q, q^{′}) and (q^{′}, q) at the same time. Thus, k is bounded by \(\frac {|Q| \cdot (|Q|-1)}{2}\). □
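The iteration in this proof can be sketched in Python as follows. The sketch assumes that the two conditions of a compatibility relation read as the proofs in this section use them: every common input leads to a related pair, and at least one common output leads to a related pair. The function name, the dictionary encoding, and the three-state automaton are hypothetical; the automaton only mimics the facts stated in Example 17, and its z-transition is an arbitrary choice.

```python
def compatible_pairs(states, trans, inputs, outputs):
    """Greatest fixpoint of the refinement step, starting from Q x Q."""

    def keeps(q, p, rel):
        # Condition on inputs: every common input must lead to a related pair.
        for a in inputs:
            if (q, a) in trans and (p, a) in trans:
                if (trans[(q, a)], trans[(p, a)]) not in rel:
                    return False
        # Condition on outputs: some common output must lead to a related pair.
        return any(
            (q, x) in trans and (p, x) in trans
            and (trans[(q, x)], trans[(p, x)]) in rel
            for x in outputs
        )

    rel = {(q, p) for q in states for p in states}
    while True:
        # The chain is decreasing, so filtering rel yields the same set
        # as applying the step function to rel.
        refined = {(q, p) for (q, p) in rel if keeps(q, p, rel)}
        if refined == rel:
            return rel
        rel = refined


# Hypothetical automaton without inputs: out(2) = {y, z} and out(3) = {x}.
trans = {(1, "x"): 2, (1, "y"): 2, (2, "y"): 2, (2, "z"): 3, (3, "x"): 2}
compat = compatible_pairs({1, 2, 3}, trans, set(), {"x", "y", "z"})
```

On this automaton, states 2 and 3 share no output and are removed in the first round, while state 1 stays related to both, which illustrates that compatibility is not transitive.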
Compatibility of two specification states means that there is some common behavior allowed by both states. Beneš et al. (2015) introduce the merge operator, which produces a new specification allowing precisely this common behavior. We present the definitions here, although in a somewhat different notation. In particular, we specialize the n-ary operator to a binary operator. We prove that compatibility indeed corresponds to existence of such a merge. Intuitively, merging is similar to parallel composition, removing blocking states afterwards.
Definition 20
Let \(S = (Q,T,q_{0}), S^{\prime } = (Q^{\prime },T^{\prime },q_{0}^{\prime }) \in \mathcal {S\!A}\) and let (Q_{X}, T_{X}, q_{0}) and \((Q_{X}^{\prime },T_{X}^{\prime },q_{0}^{\prime })\) be their demonic completions. For q ∈ Q and q^{′}∈ Q^{′}, we define their parallel composition as \(q \parallel q^{\prime } = (Q_{\parallel },T_{\parallel },(q,q^{\prime })) \in \mathcal {A}\mathcal {I}\mathcal {O}\), where

\(Q_{\parallel } = Q_{X} \times Q_{X}^{\prime }\)

\(T_{\parallel } = \left \{((q_{1},q_{1}^{\prime }),\mu ,(q_{2},q_{2}^{\prime })) \mid (q_{1},\mu ,q_{2}) \in T_{X} \wedge (q_{1}^{\prime },\mu ,q_{2}^{\prime }) \in T_{X}^{\prime } \right \}\).
Note that q ∥ q^{′} may contain states without any outputs (i.e., blocking states) and may therefore not be a suspension automaton. A blocking state cannot be implemented in a conforming manner, as an implementation must produce an output. States with transitions unavoidably leading to blocking states can also not be implemented. These states are called invalid by Beneš et al. (2015). We prove that two states are compatible exactly when their parallel composition has a valid initial state.
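The parallel composition of Definition 20 is a synchronous product of the two demonic completions. The Python sketch below builds it from two completed transition maps; as a deliberate simplification it only constructs the part reachable from (q, q^{′}) rather than the full product Q_{X} × Q_{X}^{′}. The function name, the dictionary encoding, and the toy automata are assumptions.

```python
def parallel(tx1, tx2, q1, q2, labels):
    """Reachable part of the product of two completed transition dicts (state, label) -> state."""
    init = (q1, q2)
    states, trans, todo = {init}, {}, [init]
    while todo:
        p, r = todo.pop()
        for mu in labels:
            # A product transition exists only if both components can take mu.
            if (p, mu) in tx1 and (r, mu) in tx2:
                nxt = (tx1[(p, mu)], tx2[(r, mu)])
                trans[((p, r), mu)] = nxt
                if nxt not in states:
                    states.add(nxt)
                    todo.append(nxt)
    return states, trans, init


# Toy completed automata: state 1 only outputs x, state 2 only outputs y,
# and the unspecified input a leads to the chaos state in both.
LABELS = ["a", "x", "y"]
chi_loops = {("chi", mu): "chi" for mu in LABELS}
tx1 = {**chi_loops, (1, "x"): 1, (1, "a"): "chi"}
tx2 = {**chi_loops, (2, "y"): 2, (2, "a"): "chi"}
p_states, p_trans, p_init = parallel(tx1, tx2, 1, 2, LABELS)
```

In this example the initial product state (1, 2) has no outgoing output transition, so it is a blocking state of the kind discussed above.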
Definition 21
Let \((Q,T,q_{0}) \in \mathcal {A}\mathcal {I}\mathcal {O}\). We define the set of invalid states, inv(Q) ⊆ Q, inductively as follows. A state q ∈ Q is invalid if^{Footnote 1}

(1) out(q) = ∅, or

(2) ∃a ∈ in(q) : T(q, a) ∈ inv(Q), or

(3) ∀x ∈ out(q) : T(q, x) ∈ inv(Q).

A state is called valid if it is not invalid and we define valid(Q) = Q ∖inv(Q).
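Because inv(Q) is defined inductively, it can be computed as a least fixpoint. The Python sketch below assumes the three conditions as they are used in the proofs of this section: a state is invalid if it has no outputs, if some enabled input leads to an invalid state, or if every enabled output leads to an invalid state. The function name, the dictionary encoding, and the four-state example are hypothetical.

```python
def invalid_states(states, trans, inputs, outputs):
    """Least set of states closed under the three invalidity conditions."""
    inv = set()
    changed = True
    while changed:
        changed = False
        for q in states - inv:
            out_targets = [trans[(q, x)] for x in outputs if (q, x) in trans]
            # Some enabled input leads to an invalid state.
            bad_input = any(
                (q, a) in trans and trans[(q, a)] in inv for a in inputs
            )
            # all() over an empty list is True, which also covers blocking states.
            if bad_input or all(t in inv for t in out_targets):
                inv.add(q)
                changed = True
    return inv


# Hypothetical automaton: r is blocking, q can only output into r,
# s has an input leading to q, and p loops on itself.
STATES = {"p", "q", "r", "s"}
trans = {
    ("p", "x"): "p", ("p", "a"): "p",
    ("q", "x"): "r",
    ("s", "x"): "s", ("s", "a"): "q",
}
inv = invalid_states(STATES, trans, {"a"}, {"x"})
```

The valid states are then simply `STATES - inv`, which in this example leaves only `p`.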
Lemma 22
Let \(S = (Q,T,q_{0})\in \mathcal {S\!A}\), and let q, q^{′} ∈ Q. The initial state of q ∥ q^{′} is valid if and only if q ◊ q^{′}.
Proof
Let q ∥ q^{′} = (Q_{∥}, T_{∥},(q, q^{′})). We first remark that condition (1) in Definition 21 is redundant as it implies condition (3). So we have that inv(Q_{∥}) is the smallest set closed under (2) and (3). Thus, since the set of valid states is its complement, valid(Q_{∥}) is the largest set for which the negations of (2) and (3) hold. We unfold these negated definitions to see that this coincides with Definition 15, by using De Morgan’s laws and Definition 20:
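A sketch of this unfolding is given below. It is a reconstruction from Definitions 20 and 21 as they are used in this proof, under the assumption that condition (2) states that some enabled input leads to an invalid state and condition (3) that every enabled output leads to an invalid state. On the last line, in, out and after are taken in the demonic completions.

```latex
\begin{align*}
(q,q') \in \textsf{valid}(Q_{\parallel})
 &\iff \neg\bigl(\exists a \in \textsf{in}((q,q')) : T_{\parallel}((q,q'),a) \in \textsf{inv}(Q_{\parallel})\bigr)
   \wedge \neg\bigl(\forall x \in \textsf{out}((q,q')) : T_{\parallel}((q,q'),x) \in \textsf{inv}(Q_{\parallel})\bigr) \\
 &\iff \bigl(\forall a \in \textsf{in}((q,q')) : T_{\parallel}((q,q'),a) \in \textsf{valid}(Q_{\parallel})\bigr)
   \wedge \bigl(\exists x \in \textsf{out}((q,q')) : T_{\parallel}((q,q'),x) \in \textsf{valid}(Q_{\parallel})\bigr) \\
 &\iff \bigl(\forall a \in \textsf{in}(q) \cap \textsf{in}(q') : (q \mathrel{\textsf{after}} a,\, q' \mathrel{\textsf{after}} a) \in \textsf{valid}(Q_{\parallel})\bigr)
   \wedge \bigl(\exists x \in \textsf{out}(q) \cap \textsf{out}(q') : (q \mathrel{\textsf{after}} x,\, q' \mathrel{\textsf{after}} x) \in \textsf{valid}(Q_{\parallel})\bigr)
\end{align*}
```

The first step negates conditions (2) and (3), the second applies De Morgan's laws to the quantifiers, and the third uses that a product state enables exactly the labels enabled in both components.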
According to Definition 15, valid(Q_{∥}) is thus the largest compatibility relation on X(S). Removing all pairs of states (p, χ) and (χ, p^{′}) from valid(Q_{∥}) results in the largest compatibility relation for S, that is, the relation ◊. □
We can now define the merge of two states as the parallel composition in which the invalid states have been removed. Figure 3 shows an example.
Definition 23
Let \(S = (Q,T,q_{0}) \in \mathcal {S\!A}\), q, q^{′}∈ Q and q ∥ q^{′} = (Q_{∥}, T_{∥},(q, q^{′})) be their parallel composition. If (q, q^{′}) ∈valid(Q_{∥}), then the merge of q and q^{′} is defined as \(q \wedge q^{\prime } = (Q_{\wedge }, T_{\wedge }, (q,q^{\prime })) \in \mathcal {S\!A}\), where Q_{∧} = valid(Q_{∥}) and T_{∧} = T_{∥}∩ (Q_{∧}× L × Q_{∧}).
Beneš et al. (2015) prove that removing the invalid states yields no new invalid states. The merge thus yields a suspension automaton, except when its initial state would be removed. The initial state thus should be valid for the merge to be well-defined. From Lemma 22, it then follows that ∧ yields a suspension automaton precisely for compatible states.
We introduced the merge as an operation that describes the common behavior of two compatible states. The following lemma states that implementations conform to both compatible states exactly when these implementations implement their merge. Moreover, there exists an implementation conforming to two states exactly when two states are compatible. This also means that our compatibility relation coincides with the one given by Simao and Petrenko (2014).
Lemma 24
Let \(S = (Q,T,q_{0}) \in \mathcal {S\!A}\) and q, q^{′} ∈ Q. Then, the following holds:

1. \(q \mathrel {\Diamond } q^{\prime } \implies \left (\forall I \in \mathcal {S\!A}_{\textit {IE}}: I \mathrel {\mathsf {ioco}} (q \wedge q^{\prime }) \iff (I \mathrel {\mathsf {ioco}} S/q \text { and } I \mathrel {\mathsf {ioco}} S/q^{\prime })\right )\)

2. \(q \mathrel {\Diamond } q^{\prime } \iff \exists I \in \mathcal {S\!A}_{\textit {IE}}: I \mathrel {\mathsf {ioco}} S/q \text { and } I \mathrel {\mathsf {ioco}} S/q^{\prime }\).
Proof
Let q ∥ q^{′} = (Q_{∥}, T_{∥},(q, q^{′})). For both statements, we can replace q ◊ q^{′} by (q, q^{′}) ∈ valid(Q_{∥}) by Lemma 22. The merge is then well-defined (Definition 23). Statement 1 then follows from [Beneš et al. (2015), Axiom (M)]. Although \(\mathcal {S\!A}\) is an extension of the specification domain of Beneš et al. (2015), the proof holds in our setting as well.
For statement 2 (⇐), we prove the contrapositive: if the initial state of q ∥ q^{′} is invalid, no implementation exists. If condition 1 of Definition 21 holds for (q, q^{′}), then trivially no implementation exists, as implementations are nonblocking by Definition 1. If condition 2 or 3 holds, then there exists no implementation by induction: if condition 2 holds, an implementation cannot prevent receiving any input that reaches an invalid state, as implementations are input-enabled by Definition 1; if condition 3 holds, any output transition for x ∈ out((q, q^{′})) leads to an invalid state. Hence, q ∥ q^{′} cannot be implemented. By statement 1, we then obtain that S/q and S/q^{′} cannot be implemented.
To prove 2 (⇒), note first that we take the demonic completion before computing the parallel composition. Therefore, q ∥ q^{′} is input-enabled. Pruning preserves this, as a state is invalid already if it has one input transition to an invalid state (Definition 21). Hence, \(q \wedge q^{\prime } \in \mathcal {S\!A}_{\textit {IE}}\). As ioco is reflexive for \(\mathcal {S\!A}_{\textit {IE}}\), q ∧ q^{′} conforms to itself. We obtain the conclusion by applying statement 1. □
From the established properties of ◊ and ≈, we can now easily relate the two.
Lemma 25
Let \(S=(Q,T,q_{0}) \in \mathcal {S\!A}\). Then, ≈ ⊆ ◊.
Proof
Let q, q^{′}∈ Q be two states with q ≈ q^{′}. By Lemma 11, we have X(S/q) ioco S/q and by equivalence of q and q^{′}, we get X(S/q) ioco S/q^{′}. We conclude that S/q and S/q^{′} are both implemented by X(S/q). This implies q ◊ q^{′} by Lemma 24. □
5 Distinguishing graphs
In Definition 27, we define distinguishing graphs. Intuitively, such a graph describes how a tester can distinguish the specification states in a set D. That is, how to steer an implementation in state q_{i} in such a way that it can only show conformance to at most one specification state in D, forcing it to reveal nonconformance to other specification states in D. Figure 4 shows an example distinguishing graph. Distinguishing graphs are very similar to the distinguishing sequences used in FSM theory.
In our context, we may either want to observe outputs, or we may want to apply some input. In the latter case, this gives a race condition between the tester and the implementation, if the implementation delivers an output before the desired input can be supplied. We then simply reattempt the test. We will elaborate on this in Section 6.
When distinguishing states D, we require that every input that we take is specified in all states of D. Furthermore, if multiple states of D have the same destination state for some common input or output μ, i.e., T(q, μ) = T(q^{′}, μ) for different q, q^{′}∈ D, then μ cannot be used to distinguish D. The reason is that after performing μ, the resulting behavior afterwards is then the same for both states. We then say that μ is not injective for D. Injectivity as we define it here is similar to the concept of validity as used in Lee and Yannakakis (1994) (not to be confused with validity as introduced in Definition 21).
Definition 26
Let \((Q,T,q_{0}) \in \mathcal {S\!A}\), D ⊆ Q a set of states, and μ ∈ L a label. Then, injective(D, μ) holds if \(\mu \in L_{O} \cup \bigcap _{q \in D}\textsf {in}(q)\) and for all distinct q, q^{′} ∈ D, we have μ ∈ init(q) ∩ init(q^{′}) ⇒ q after μ ≠ q^{′} after μ.
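Definition 26 translates almost literally into code. In the Python sketch below, the function name `injective`, the dictionary encoding of T, and the small example are illustrative assumptions.

```python
def injective(trans, inputs, outputs, D, mu):
    """Check injective(D, mu) for a transition dict (state, label) -> state."""
    if mu in inputs:
        # An input may only be used if it is specified in every state of D.
        if not all((q, mu) in trans for q in D):
            return False
    elif mu not in outputs:
        return False
    # States of D that enable mu must all move to pairwise different states.
    targets = [trans[(q, mu)] for q in D if (q, mu) in trans]
    return len(targets) == len(set(targets))


# Toy example: x merges states 1 and 2, y keeps them apart,
# and input a is only specified in state 1.
INPUTS, OUTPUTS = {"a"}, {"x", "y"}
trans = {(1, "x"): 3, (2, "x"): 3, (1, "y"): 1, (2, "y"): 2, (1, "a"): 2}
```

For outputs, states of D that do not enable the label are simply skipped, which matches the premise μ ∈ init(q) ∩ init(q^{′}) in the definition.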
Definition 27
Let \((Q,T,q_{0}) \in \mathcal {S\!A}\), and D ⊆ Q a set of states. A distinguishing graph for D is a directed acyclic graph with a finite set of nodes \(V \subseteq \mathcal {P}(Q) \cup \{\textbf {reset}\}\), labeled edges E ⊆ V × L × V, and root node D. For every node v ∈ V, we require

1. if |v| ≤ 1, then v is a leaf node, and

2. if |v| > 1, then v is a non-leaf node and either of the following holds:

(a.) for every output x ∈ L_{O}, there is an edge (v, x, v after x) ∈ E, and injective(v, x) holds, or

(b.) for some input a ∈ L_{I} such that injective(v, a), there is an edge (v, a, v after a) ∈ E, and for every output x ∈ L_{O} there is an edge (v, x, reset) ∈ E.

A node v ∈ V is a pass node if v ≠ reset and |v| ≤ 1. We define \(\mathcal {DG}(S,D)\) as the set of all distinguishing graphs for D^{′} with D ⊆ D^{′} ⊆ Q.
A node v of a distinguishing graph describes the states of the specification that can be reached from states in the root node, by taking the sequence of labels from the root node to v. By injectivity, if a node is reached with fewer states than the root, then the sequence to that node disproves conformance to some states of the root. A pass node is reached when at most one state is left, disproving conformance to all, or all but one state of the root node. Any graph \(w \in \mathcal {DG}(S,\{q, q^{\prime }\})\) distinguishes q and q^{′}. By Definition 27, we have \(\mathcal {DG}(S,D) \subseteq \mathcal {DG}(S,D^{\prime })\) for D^{′} ⊆ D, because a distinguishing graph that can distinguish all states in D can also distinguish every subset of states D^{′} ⊆ D.
Example 28
Figure 4 shows a distinguishing graph for states {1,2,3,4,5} of the specification in Fig. 5. Suppose that we observe outputs zz from some implementation. Then, the distinguishing graph tells us that we can perform input a. Suppose that we then observe outputs xy. We then have observed trace zzaxy, thus we must be in state 1. We can trace this path backwards from state 1, traversing only states in the nodes of the distinguishing graph, to find our starting state. We must have reached state 1 with y from state 5, which in turn we have reached with x from 4. State 4 has two incoming edges for a from states 2 and 4, but only state 4 is in the respective node of the distinguishing graph. Continuing, we find that we started in state 4. Indeed, no other state has this trace.
Lemma 29
Let \(S = (Q, T, q_{0}) \in \mathcal {S\!A}\), and q, q^{′}∈ Q. There is a distinguishing graph \(Y \in \mathcal {DG}(S, \{q, q^{\prime }\})\) if and only if \(q \mathrel {\not\Diamond } q^{\prime }\).
Proof
(⇒) Note that the graph is directed and acyclic, so successor nodes define strictly smaller graphs. This means we can prove the implication by induction on the graph Y . We know that Y is a distinguishing graph for q and q^{′}, so its root node is {q, q^{′}}. This excludes that Y is constructed with rule (1) of Definition 27.
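Assume Y is constructed by rule 2(a). Then, for all x ∈ out(q) ∩ out(q^{′}), we have that q after x ≠ q^{′} after x by injectivity. We then have a distinguishing graph for q after x and q^{′} after x. By induction, we may assume that \(q \textsf { after } x \mathrel {\not\Diamond } q^{\prime } \textsf { after } x\). Hence \(q \mathrel {\not\Diamond } q^{\prime }\), as condition (2) of Definition 15 cannot be satisfied.
Now assume Y is constructed by rule 2(b). Then, we have an a ∈ in(q) ∩ in(q^{′}) with q after a ≠ q^{′} after a. Again, we have a graph distinguishing q after a and q^{′} after a. By induction we know \(q \textsf { after } a \mathrel {\not\Diamond } q^{\prime } \textsf { after } a\). So \(q \mathrel {\not\Diamond } q^{\prime }\), as condition (1) of Definition 15 cannot be satisfied.
In both cases, we showed that \(q \mathrel {\not\Diamond } q^{\prime }\), as required.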
(⇐) By Lemma 19, we know that ◊ can be computed iteratively as ◊_{i}. Let i be the smallest number such that (q, q^{′}) ∉ ◊_{i}. (Note that i ≠ 0.) Since (q, q^{′}) ∉ ◊_{i}, at least one of the conditions (1) and (2) in Definition 15 is false.
If i = 1, the first condition trivially holds: for all a ∈ in(q) ∩ in(q^{′}), we have (q after a, q^{′} after a) ∈ Q × Q, as ◊_{0} = Q × Q. So the second condition must be false. This means that for all x ∈ out(q) ∩ out(q^{′}), we have (q after x, q^{′} after x) ∉ Q × Q. This can only happen if out(q) ∩ out(q^{′}) = ∅. So we can make a distinguishing graph with root node {q, q^{′}} and edges for x ∈ L_{O} to a node with either {q}, {q^{′}} or ∅.
If i > 1, either condition can be false. If the first condition is false, there exists an a ∈ in(q) ∩ in(q^{′}) such that (q after a, q^{′} after a) ∉ ◊_{i− 1}. We then make a distinguishing graph with root node {q, q^{′}}, with an edge for a to a distinguishing graph for {q, q^{′}} after a, which exists by induction, and x-labeled edges to reset nodes for each x ∈ L_{O}. Otherwise, the second condition is false and we have for all x ∈ out(q) ∩ out(q^{′}) that (q after x, q^{′} after x) ∉ ◊_{i− 1}. In this case, we make a node with several edges, one for each such x. In all cases, the children are constructed inductively using the fact that (q after μ, q^{′} after μ) ∉ ◊_{i− 1}. □
Lemma 29 tells us that a distinguishing graph always exists for two incompatible states. However, for a set D of more than two mutually incompatible states, a distinguishing graph for D may not exist.
Example 30
Consider mutually incompatible states 1, 3, and 5 in Fig. 6. States 1 and 3 both reach the same state after a, so injective({1,3,5},a) does not hold, and these states can thus not be distinguished by a. Similarly, states 3 and 5 cannot be distinguished after b. For the only output z ∈out({1,3,5}), we have that {1,3,5}afterz = {1,3,5}, so we cannot distinguish {1,3,5} on outputs as this would make the distinguishing graph cyclic.
Definition 31 defines properties on sets of distinguishing graphs needed for constructing n-complete test suites.
Definition 31
Let \(S = (Q,T,q_{0}) \in \mathcal {S\!A}\) be a specification. Let W be a set of distinguishing graphs.

W is a characterization set if ∀q, q^{′}∈ Q: \(q \mathrel {\not\Diamond } q^{\prime } \implies \exists w \in W: w \in \mathcal {DG}(S,\{q, q^{\prime }\})\).

W is a state identifier for q if: ∀q^{′}∈ Q: \(q \mathrel {\not\Diamond } q^{\prime } \implies \exists w \in W: w \in \mathcal {DG}(S,\{q, q^{\prime }\})\).

A set of state identifiers {W(q)∣q ∈ Q} is harmonized if: ∀q, q^{′}∈ Q: \(q \mathrel {\not\Diamond } q^{\prime } \implies \exists w \in W(q) \cap W(q^{\prime }): w \in \mathcal {DG}(S,\{q, q^{\prime }\})\).
Algorithm 1 shows how to construct a set of distinguishing graphs that is a set of harmonized state identifiers. We will only construct distinguishing graphs for pairs of states, as we can guarantee that these graphs have polynomial size.
This algorithm extends the fixpoint algorithm as described in Lemma 19, in which ◊ is computed. We add a partial function W, which keeps track of all distinguishing graphs for sets D of at most two states. Initially, we already know that every D with size zero or one has a trivial distinguishing graph consisting of a pass root node. We then start computing ◊_{i} for increasing i until this procedure stabilizes. During every iteration, we find new pairs of states which are incompatible. We then immediately construct a distinguishing graph for the found pairs.
Incompatibility arises for two reasons. Either for some input a ∈ L_{I}, successor states q aftera and q^{′}aftera have earlier been found incompatible. Otherwise, for all outputs x ∈ L_{O}, states qafterx and q^{′}afterx have been found incompatible. We thus know that we have already constructed a distinguishing graph for these successor states in an earlier iteration. Since the transitions from q and q^{′} for the found input or all outputs lead to incompatible states, we can then use the distinguishing graph for the successor states to create a distinguishing graph for q and q^{′}, which we add to W. The result of this algorithm is thus the compatibility relation, proven to be correct by Lemma 19, together with distinguishing graphs for all incompatible states.
At first sight, the algorithm may seem to miss a base case, as it finds incompatible states only if the successor states for some input or for all outputs are also incompatible. However, the condition on line 10 is trivially true if out(q) ∩ out(q^{′}) = ∅ for some incompatible states q and q^{′}. The successors {q, q^{′}} after x for x ∈ L_{O} are then singleton or empty, for which W contains a (trivial) distinguishing graph.
Note that at line 27 of the algorithm, we describe how to distinguish states by applying an input: we do this with an edge to an existing distinguishing graph for this input, and an edge to reset for all outputs. This indicates that a failed attempt of applying an input should simply be retried, until it succeeds. However, after an output, we may still reach incompatible states, which instead we may attempt to distinguish without resetting. Furthermore, one may want to prioritize distinguishing with inputs (if waiting for outputs may be slow) or with outputs (if one wants to prevent race conditions). One may thus adapt Algorithm 1 to his or her needs.
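The construction described above can be sketched in code. This is an illustrative approximation, not a transcription of Algorithm 1: the two conditions of Definition 15 are taken from their use in the proof of Lemma 29, the automaton is assumed deterministic and encoded as `T[q][mu]`, and the tuple encoding of graph roots (`"pass"`, `"input"`, `"outputs"`) as well as the toy automaton are assumptions made for this example.

```python
from itertools import combinations


def after(T, D, mu):
    """Successor set 'D after mu' in a deterministic automaton."""
    return frozenset(T[q][mu] for q in D if mu in T[q])


def pairwise_graphs(T, inputs, outputs):
    """Return (compatible pairs, W); W maps state sets to graph roots."""
    states = sorted(T)
    # Sets of size zero or one get a trivial graph: a single pass node.
    W = {frozenset(): ("pass",)}
    W.update({frozenset([q]): ("pass",) for q in states})
    pairs = [frozenset(p) for p in combinations(states, 2)]
    changed = True
    while changed:
        changed = False
        known = set(W)  # graphs built in earlier iterations only
        for pair in pairs:
            if pair in W:
                continue
            q, r = sorted(pair)
            common = set(T[q]) & set(T[r])

            def split(mu):
                # mu separates the pair if both successors differ and
                # already have a distinguishing graph.
                succ = after(T, pair, mu)
                return len(succ) == 2 and succ in known

            good_inputs = [a for a in sorted(common & inputs) if split(a)]
            if good_inputs:
                a = good_inputs[0]
                # Edge for a to the child graph; every output leads to reset.
                W[pair] = ("input", a, after(T, pair, a))
                changed = True
            elif all(split(x) for x in common & outputs):
                # One edge per output; children are graphs already in W.
                children = {x: after(T, pair, x) for x in sorted(outputs)}
                W[pair] = ("outputs", children)
                changed = True
    compatible = {p for p in pairs if p not in W}
    return compatible, W


# Hypothetical toy automaton (not a figure from the text).
TOY_T = {
    1: {"x": 2},
    2: {"y": 2},
    3: {"x": 4, "a": 3},
    4: {"x": 4, "a": 4},
    5: {"x": 5, "a": 2},
}
compatible, W = pairwise_graphs(TOY_T, inputs={"a"}, outputs={"x", "y"})
```

In the toy automaton, pairs with disjoint outputs are separated in the first iteration, the pair {1, 3} is separated on outputs in the second iteration, the pair {3, 5} is separated by input a, and {3, 4} remains compatible. Preferring inputs over outputs is one of the possible prioritizations; swapping the two branches gives the other.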
Example 32
We demonstrate how to apply Algorithm 1 on specification S in Fig. 1a. Since ◊_{0} contains all pairs of states, iteration i = 1 will find pairs of incompatible states only for pairs of states with disjoint outputs. These are all pairs except (1,4), (2,3), and (4,5) (and, obviously, their mirrored variants, as well as all pairs of equal states (1,1), (2,2), …). Every such pair is assigned a distinguishing graph on outputs, with leaf nodes as children. An example of such a distinguishing graph is shown in Fig. 7.
In iteration i = 2, we additionally find that states 1 and 4 are incompatible, as out(1) ∩ out(4) = {x}, 1 after x = 2, 4 after x = 6, and states 2 and 6 were found incompatible in iteration 1. The distinguishing graph for 1 and 4 is built up from the previously found graph, as also shown in Fig. 7.
In iteration i = 3, no new incompatible states are found so ◊ = ◊_{3} = ◊_{2}. Indeed, 2 ◊ 3 and 4 ◊ 5 are the only (nontrivial) compatible state pairs.
Lemma 33
Let \(S = (Q,T,q_{0}) \in \mathcal {S\!A}\), and let (◊, W) be the result of Algorithm 1. Then,

1.
∀q, q^{′}∈ Q: \(q \mathrel {\not\Diamond } q^{\prime }\) ⇔ W({q, q^{′}})↑.

2.
∀q, q^{′}∈ Q: \(q \mathrel {\not\Diamond } q^{\prime } \implies \mathbf {W}(\{q,q^{\prime }\}) \in \mathcal {DG}(S,\{q,q^{\prime }\})\).

3.
For any distinguishing graph in W, the number of its nodes is bounded by O(|Q|^{2}) and the number of its edges is bounded by O(|Q|^{2}⋅|L_{O}|).

4.
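By taking \(W(q) = \{\mathbf {W}(\{q,q^{\prime }\}) \mid q^{\prime } \in Q, q \mathrel {\not\Diamond } q^{\prime }\}\), we obtain a harmonized set of state identifiers {W(q)∣q ∈ Q}.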
Proof

(1)
This follows from the simultaneous construction of W and ◊: we add a distinguishing graph for {q, q^{′}} precisely when we conclude \(q \mathrel {\not\Diamond } q^{\prime }\).

(2)
We indeed find a graph by (1). Thus, we only need to show that it is acyclic and finite, conforming to Definition 27. For any graph in W constructed in iteration i, the graph is acyclic, and the height of the graph is at most i. This can be shown by induction on i: at iteration i = 0, W contains only leaf nodes, which have no outgoing edges. For all graphs constructed in iteration i + 1, the root node only has edges to root nodes of graphs from previous iterations, and to reset. By induction, these contain no cycles and have height of at most i.

(3)
For any distinguishing graph with root D, all nodes D^{′} in that graph have |D^{′}| ≤ |D|, by Definition 27. Since nodes of distinguishing graphs of W are sets of at most two states, the number of nodes is bounded by |Q|^{2} + |Q| + 2 (including the node reset). Since every node in the graph has at most one outgoing edge for every output, and possibly a single edge for some input, we find the claimed bounds.

(4)
The fact that W(q) is a state identifier follows from (1) and (2). The set {W(q)∣q ∈ Q} is harmonized because for each pair q, q^{′}∈ Q we constructed one graph, which is then added to both W(q) and W(q^{′}).
□
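The node bound used in item (3) can be spelled out by counting. The following short derivation is a sketch under the assumption that the candidate nodes are exactly the subsets of Q with at most two elements, plus the node reset, and that L_{O} is nonempty:

```latex
\[
\underbrace{\binom{|Q|}{2}}_{\text{pairs}}
+ \underbrace{|Q|}_{\text{singletons}}
+ \underbrace{1}_{\emptyset}
+ \underbrace{1}_{\textbf{reset}}
= \frac{|Q|^{2} + |Q|}{2} + 2
\;\le\; |Q|^{2} + |Q| + 2 .
\]
\[
\#\text{edges} \;\le\; \bigl(|Q|^{2} + |Q| + 2\bigr)\cdot\bigl(|L_{O}| + 1\bigr)
\;\in\; O\bigl(|Q|^{2}\cdot |L_{O}|\bigr).
\]
```

The factor |L_{O}| + 1 reflects that a node has at most one edge per output and at most one additional edge for an input.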
6 Test suites
An n-complete test suite \(\mathbb {T}(S,n)\) for a specification S guarantees for any implementation I that I ioco S if I passes \(\mathbb {T}\), assuming that the size of I is at most n. Implementations may contain many states which are unspecified in S, and these states are not relevant for conformance. We will first define the size of an implementation in this respect, after which we will introduce all ingredients required for n-complete test suites.
Definition 34
Let \(S=(Q,T,q_{0}) \in \mathcal {S\!A}\) be a suspension automaton and \(I=(Q_{I},T_{I},{q_{0}^{I}}) \in \mathcal {S\!A}_{\textit {IE}}\) be an implementation.

Define the set of reachable states from a state q ∈ Q in S as the set \({\textsf {Reachable}}(S, q) = \bigcup _{\sigma \in L^{*}} q \textsf {after} {\sigma }\). The set of reachable states from q_{0} is denoted by Reachable(S).

A state q ∈ Q_{I} is specified if ∃σ ∈ traces(S) : I afterσ = q. A transition (q, μ, q^{′}) ∈ T_{I} is specified if q is specified, and if either μ ∈ L_{O}, or μ ∈ L_{I} ∧∃σ ∈ L^{∗} : I afterσ = q ∧ σμ ∈ traces(S).

We denote the number of specification states by |S| = |Reachable(S)|.

The set of reachable specified implementation states is denoted Specified_{S}(I) = {q ∈ Reachable(I)∣q is specified}. We define |I|_{S} = |Specified_{S}(I)|.
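The sizes |S| and |I|_{S} of Definition 34 can be computed by simple graph searches. The sketch below assumes deterministic automata encoded as `T[q][mu]`; the function names and the toy specification and implementation are hypothetical and only serve as illustration.

```python
from collections import deque


def reachable(T, q0):
    """States reachable from q0, found by breadth-first search."""
    seen, queue = {q0}, deque([q0])
    while queue:
        q = queue.popleft()
        for succ in T[q].values():
            if succ not in seen:
                seen.add(succ)
                queue.append(succ)
    return seen


def specified_states(T_impl, i0, T_spec, s0):
    """Implementation states reached by some trace of the specification."""
    seen, queue = {(i0, s0)}, deque([(i0, s0)])
    while queue:
        i, s = queue.popleft()
        # Follow only labels that both automata can perform from (i, s).
        for mu in T_impl[i].keys() & T_spec[s].keys():
            nxt = (T_impl[i][mu], T_spec[s][mu])
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return {i for i, _ in seen}


# Hypothetical toy automata: inputs a and b, outputs x and y.
TOY_SPEC = {"s0": {"a": "s1", "x": "s0"}, "s1": {"y": "s0"}}
TOY_IMPL = {
    "i0": {"a": "i1", "b": "i2", "x": "i0"},
    "i1": {"a": "i1", "b": "i1", "y": "i0"},
    "i2": {"a": "i2", "b": "i2", "x": "i2"},
}
size_spec = len(reachable(TOY_SPEC, "s0"))
size_impl_specified = len(specified_states(TOY_IMPL, "i0", TOY_SPEC, "s0"))
```

In the toy implementation, state i2 is reachable only via the unspecified input b, so it is reachable but not specified and does not count towards |I|_{S}.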
Definition 35
Let \(S \in \mathcal {S\!A}\) be a specification. A test suite \(\mathbb {T}\) for S is n-complete if \(\forall I \in \mathcal {S\!A}_{\textit {IE}}\): \(\mathbb {T}\) produces verdict pass for I ⇒ I ioco S ∨ |I|_{S} > n.
In particular, |S|-complete means that if an implementation passes the test suite, then the implementation is correct (w.r.t. ioco) or it has strictly more states than the specification.
In the FSM setting, n-complete test suites require access sequences and distinguishing sequences. In our context, we will use the term distinguishing experiments instead of distinguishing sequences. We already have distinguishing graphs for distinguishing incompatible states. Distinguishing experiments for compatible states, as well as access sequences, will be explained in the next two sections. After that, we give the definition of a test suite consisting of these parts and explain how it must be executed. We also give a proof that this test suite is indeed n-complete.
6.1 Distinguishing compatible states
Distinguishing graphs as described in Section 5 rely on incompatibility of states, by steering the implementation to a point where the specification states disagree on the allowed outputs, i.e., the states have disjoint out-sets. In this way, an implementation state cannot conform to both states, so it shows a nonconformance to at least one of the states. By using multiple distinguishing graphs, we hence show that an implementation state conforms to all but one specification state. By doing this for all implementation states, each implementation state conforms to a different specification state.
This technique fails for compatible specification states, as an implementation state may conform to multiple specification states. In such a case, a tester cannot with certainty steer the implementation to showing a nonconformance to any of the compatible specification states.
We thus extend the aim of a distinguishing experiment: instead of showing a nonconformance to one of two states q and q^{′} of specification S, we may also prove conformance to both. As our implementation is black-box, we can only prove this by testing: this is achieved precisely by an n-complete test suite for q ∧ q^{′}, as this describes all common behavior of S/q and S/q^{′} (Lemma 24). Hence, failing an n-complete test suite for q ∧ q^{′} means disproving conformance to either S/q, S/q^{′}, or both, thus achieving the original goal of a distinguishing experiment. Passing this n-complete test suite means proving conformance to both S/q and S/q^{′}, under the assumption that the implementation has no more than n states. This is already assumed when distinguishing q and q^{′} in the context of an n-complete test suite for S.
6.2 Access sequences
In FSM-based testing, the implementation states are reached in a rather efficient way. A set P of access sequences is used to reach |P| implementation states, after which all other states are reached by extending P with sequences of L_{I}. If we directly translate this to using P ⊆ traces(S), and alphabet L, this is not sufficient for reaching all states Specified_{S}(I) of implementation I. This is because I may have fewer than |P| states reached by P, and hence P ⋅ L^{≤k} reaches fewer than n = |P| + k states of I. This has two causes: (1) the specification has multiple compatible states, which are implemented by a single state; (2) ioco allows a sequence p ∈ P with p ∉ traces(I) if p = σxρ with |out(S after σ)| > 1, i.e., transition x is optional for I to implement (S after σx is then not certainly reachable according to Simao and Petrenko 2014).
Example 36
Consider Fig. 8 for an example. An implementation can omit state 2 of specification S, as shown in Fig. 8b, while still conforming to S. The implementation in Fig. 8c exploits this: it is nonconforming, while still having no more states than S, yet it is not detected by test suite P ⋅ L^{≤ 1} ⋅ W. We have P ⋅ L^{≤ 1} ⋅ W = {𝜖, x, y}⋅{𝜖, x, y, z}⋅{x, y, z}, so if we take y ∈ P (the implementation has no x-transition), z ∈ L (the implementation has no other possible transitions), and observe z ∈ W(3), we do not reach the faulty y-transition in the implementation. This means that we may need to increase the size of the test suite in order to obtain the desired completeness. In this example, a test suite P ⋅ L^{≤ 2} ⋅ W is sufficient as the test suite will contain a test with yzz ∈ P ⋅ L ⋅ L after which the faulty output y ∉ W(3) will be observed.
Clearly, we reach all states in an n-state implementation for any specification S, by taking P to be all traces in traces(S) of length less than n. This set P can be constructed by simple enumeration. We then have that the traces in the set P will reach all specified, reachable states in all implementations I such that |I|_{S} ≤ n. In particular, this means that P^{+} = P ⋅ L reaches all specified transitions. We conjecture that a much more efficient construction is possible with a careful analysis of compatible states and not certainly reachable states.
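The simple enumeration of P can be sketched as follows, assuming a deterministic specification encoded as `T[q][mu]` and n ≥ 1. The function names and the toy specification are illustrative assumptions; traces are represented as tuples of labels.

```python
def traces_up_to(T, q0, max_len):
    """All traces from q0 of length at most max_len, shortest first."""
    traces = [()]
    frontier = [((), q0)]
    for _ in range(max_len):
        # Extend every trace of the current length by one enabled label.
        frontier = [(trace + (mu,), succ)
                    for trace, q in frontier
                    for mu, succ in sorted(T[q].items())]
        traces.extend(trace for trace, _ in frontier)
    return traces


def access_sequences(T, q0, n):
    """P: all traces of the specification of length less than n."""
    return traces_up_to(T, q0, n - 1)


# Hypothetical toy specification: input a, outputs x and y.
TOY_SPEC = {"s0": {"a": "s1", "x": "s0"}, "s1": {"y": "s0"}}
P = access_sequences(TOY_SPEC, "s0", 3)
```

The size of P grows exponentially in n for specifications with branching, which is the cost that a more careful construction would aim to avoid.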
6.3 Test suite definition
We now have all ingredients to define a test suite. Definition 37 uses mutual recursion, as a test suite can show up inside another test suite (as discussed in Section 6.1).
Definition 37
Let \(S=(Q,T,q_{0}) \in \mathcal {S\!A}\), and \(n \in \mathbb {N}\). Let {W(q)∣q ∈ Q} be a harmonized set of state identifiers for S. The distinguishing test suite \(\mathbb {T}(S,n)\) is defined as follows.
Given a test suite \(\mathbb {T}(S,n)\), we refer to its set P(S, n) as access sequences, and to its sets DE(S, q) as distinguishing experiments. A merge q ∧ q^{′} used as part of a distinguishing experiment may be even bigger than S itself, which may cause an infinite distinguishing test suite from Definition 37. We give an alternative solution with a finite upper bound in Section 6.6.
We remark that specification states which allow all behavior (i.e., all states equivalent to χ) never need to be tested, as conformance for any implementation is intrinsic. Thus, we can remove these states from the specification (similar to Beneš et al. 2015) before constructing a test suite.
Example 38
We will briefly show the ingredients for a test suite for S in Fig. 1a by constructing \(\mathbb {T}(S,6)\). The set of access sequences P^{+}(S,6) contains all traces of S up to length 6. To also determine the distinguishing experiments for all states, we first analyze the compatible states as explained in Example 32. This analysis shows that the only pairs of inequivalent, compatible states are 2 ◊ 3, and 4 ◊ 5. For all incompatible pairs, we obtain a distinguishing graph. For example, the distinguishing graph for states 1 and 4 as constructed in Example 32 is included in the distinguishing experiments DE(S,1) and DE(S,4).
For every compatible pair, we recursively compute a test suite for their merge, which we use as distinguishing experiments: \(\mathbb {T}(2 \wedge 3, 6) \in \mathit {D{\kern .75pt}E}(S,2) \cap \mathit {D{\kern .75pt}E}(S,3)\) and \(\mathbb {T}(4 \wedge 5, 6) \in \mathit {D{\kern .75pt}E}(S,4) \cap \mathit {D{\kern .75pt}E}(S,5)\). The merge 2 ∧ 3 was given in Fig. 3 and the merge 4 ∧ 5 occurs as a subautomaton. When making the distinguishing experiments for these compatible states, we can remove the state (χ, χ) as it is equivalent to the chaos state. This leaves us with a 3-state and a 2-state automaton.
To recursively compute \(\mathbb {T}(2 \wedge 3, 6)\), we take all prefixes of wz^{5} and ayz^{4} as access sequences. Performing Algorithm 1 on these automata, we find that all pairs of states in these automata are incompatible. Distinguishing experiments DE(2 ∧ 3,q) thus only contain distinguishing graphs for all states q of 2 ∧ 3, so no new test suites have to be computed recursively. Computing \(\mathbb {T}(4 \wedge 5, 6)\) is done likewise, and also terminates without recursion.
6.4 Execution of test suites
So far we have introduced distinguishing test suites, access sequences and distinguishing graphs. Each of those describes an executable experiment, for which we need to define how it is executed.
First, we consider the execution of a trace σ as a sequential execution of its labels, where inputs and outputs are treated differently.
An output x is executed by waiting for the implementation to produce an output y, and then checking whether x = y. If so, we continue with the next label of σ. Otherwise, we try again by resetting the implementation to its initial state and executing σ from its first label. We require execution to be fair: if a trace σx is executed often enough, then every output y appearing in the implementation after σ will eventually be observed. Therefore, after a finite number of resets, we may conclude that the implementation cannot show the intended x-transition. Determining the exact number is left to the tester. Concluding that the implementation does not contain the trace σx is also considered a successful execution.
An input is executed by providing it to the implementation. An implementation may produce an output after σ before the tester can supply an input. Again, we require fairness: if a trace σa is executed often enough, then the tester will eventually succeed in executing a after σ. Assuming fairness is unavoidable for any notion of completeness in testing: a fault can never be detected if an implementation consistently chooses paths that avoid this fault.
Distinguishing test suites are executed by executing all tests contained in them. A test (σ, τ) is executed by first executing σ as described, and then executing the distinguishing experiment τ. If we conclude by fairness that some output of σ cannot be produced by the implementation, we declare σ, and also (σ, τ), to have been executed. While executing any test of a test suite \(\mathbb {T}\) for specification S, it is always checked whether any executed trace is a trace of S. If an ioco counterexample for S is observed, the test suite \(\mathbb {T}\) produces the verdict fail, and test execution stops. If all tests have been executed without encountering an ioco counterexample, then the test suite \(\mathbb {T}\) produces a pass verdict.
Example 39
Consider the distinguishing test suite \(\mathbb {T}\) for specification S in Fig. 1a. One of the tests contained in \(\mathbb {T}\) is \((a,\mathbb {T^{\prime }})\), where \(\mathbb {T^{\prime }}\) is a distinguishing test suite for the merge of compatible states 2 and 3 (see Example 38).
We continuously execute access sequence a before executing a test from \(\mathbb {T^{\prime }}\). If during execution of \(\mathbb {T^{\prime }}\), we observe the trace ax, then \(\mathbb {T^{\prime }}\) fails: trace ax is not a trace of 2 ∧ 3. We thus have successfully distinguished states 2 and 3 in the test \((a,\mathbb {T^{\prime }})\). This corresponds to observing trace aax during execution of \(\mathbb {T}\), which is no ioco counterexample for S, and hence does not result in a fail for \(\mathbb {T}\) itself.
If τ is a distinguishing test suite, then we execute it recursively, as already shown in Example 39. If it is a distinguishing graph, then it can be executed on an implementation by providing the inputs and observing the outputs on the edges of the graph going downwards from the root. In other words, if we view a distinguishing graph G with nodes V, edges E, and root node D, as a suspension automaton G = (V, E, D), then test execution of G on an implementation I is taking the parallel composition of G and I as in Definition 20. If (g, i) is a state of the composition, and g is a pass node, then distinguishing graph τ has been executed successfully. If g is a reset node, then the test needs to be reattempted. Note that the pass and reset states of the composition are the only blocking states, as all nodes of the distinguishing graph have edges for all outputs, and the implementation is input-enabled and non-blocking. Again, a test suite \(\mathbb {T}\) using the distinguishing graph τ does not use the verdict of τ, similar to Example 39: it only requires that distinguishing is successful. In the proof of Theorem 40, we need the following consequence of fairness. If a certain sequence ρ is observed in executing τ and τ is also used in testing another state, then if the other state does not show ρ (at some point), we conclude that ρ is not a trace of that state.
Finishing a distinguishing experiment τ may take several attempts: a distinguishing graph may give a reset because an input transition was not taken, and an n-complete test suite for distinguishing two compatible states may contain multiple tests. Access sequence σ needs to be executed before every attempt. By assuming fairness and finiteness of the test suite, every distinguishing experiment is guaranteed to terminate, and thus also every test.
6.5 Completeness proof for distinguishing test suites
Theorem 40
Let \(S \in \mathcal {S\!A}\) be a specification and \(n \in \mathbb {N}\). The distinguishing test suite \(\mathbb {T}(S,n)\) from Definition 37 is n-complete.
Proof
We will show that for any implementation I with |I|_{S} ≤ n which passes the test suite we can build a coinductive ioco relation which contains the initial states. As a basis for that relation we take the states which are reached by the set P(S, n). This may not be an ioco relation, but by extending it (in two ways) we obtain a full ioco relation. Extending the relation is an instance of a so-called up-to technique; we use terminology from Bonchi and Pous (2015).
More precisely, let \(S = (Q_{S},T_{S},{q_{0}^{S}})\) and let \(I = (Q_{I},T_{I},{q_{0}^{I}})\) be an implementation with |I|_{S} ≤ n which passes \(\mathbb {T}(S,n)\). By construction of P(S, n), all states Specified_{S}(I) are reached by P(S, n) and so all specified transitions are reached by P^{+}(S, n).
Using the set P(S, n), we define \( R = \{ (q_{0}^{I} \textsf {after} \sigma , q_{0}^{S} \textsf {after} \sigma ) \mid \sigma \in P(S,n) \} \) as a subset of Q_{I} × Q_{S}. First, we extend R by adding relations for all equivalent specification states: R^{′} = {(i, s)∣(i, s^{′}) ∈ R, s ∈ Q_{S}, s ≈ s^{′}}. Second, let \(\mathcal {J} = \{ (i,s) \mid i \in Q_{I}, s \in Q_{S} \text { such that } i \mathrel {\textsf {ioco}} s \}\) and let R_{i, s} be the ioco relation for i ioco s. Now define \(\overline {R} = R^{\prime } \cup \bigcup _{(i,s) \in \mathcal {J}} R_{i,s}\). We want to show that \(\overline {R}\) defines a coinductive ioco relation. We do this by showing that R progresses to \(\overline {R}\).
Let (i, s) ∈ R. We assume that we have seen all of out(i) and that out(i) ⊆out(s) (this is taken care of by the test suite and the fairness assumption). Then, because we use P^{+}(S, n), we also reach the transitions after i. We need to show that the input and output successors are again related.

Let a ∈ L_{I}. Since the implementation is input-enabled, there is a transition for a with i after a = i_{2}. Suppose there is a transition for a from s: s after a = s_{2} (if not, then we are done). We have to show that \((i_{2}, s_{2}) \in \overline {R}\).

Let x ∈ L_{O}. Suppose there is a transition for x: i after x = i_{2}. Then (since out(i) ⊆ out(s)), there is a transition for x from s: s after x = s_{2}. We have to show that \((i_{2}, s_{2}) \in \overline {R}\).
In both cases, we have a successor (i_{2}, s_{2}) which we have to prove to be in \(\overline {R}\). Now since P(S, n) reaches all specified states of I, we know that i_{2} is reached and so \((i_{2}, s_{2}^{\prime }) \in R\) for some \(s_{2}^{\prime }\). If \(s_{2} \approx s_{2}^{\prime }\), then \((i_{2}, s_{2}) \in R^{\prime } \subseteq \overline {R}\) holds and we are done. So now suppose that \(s_{2} \not \approx s_{2}^{\prime }\). There are two cases:

If \(s_{2} \mathrel {\not\Diamond } s_{2}^{\prime }\), then there exists a distinguishing graph \(w \in W(s_{2}) \cap W(s_{2}^{\prime })\) (since W is a harmonized set of state identifiers). This graph w is executed twice in i_{2}: once as a test (σ, w) for some σ ∈ P(S, n) with S after σ = s, and once as a test (σ^{′}, w) for some σ^{′}∈ P^{+}(S, n) with S after σ^{′} = s_{2}. By fairness, there is a single sequence ρ in w executed in both executions. This sequence reaches a pass state of w in both cases as our implementation passed the test suite. By construction of distinguishing graphs, ρ must be an ioco counterexample for either S/s_{2} or \(S/s_{2}^{\prime }\). This contradicts that the two tests passed, so this case cannot happen.

If \(s_{2} \mathrel {\Diamond } s_{2}^{\prime }\) (but \(s_{2} \not \approx s_{2}^{\prime }\) as assumed above), then we executed a test suite τ ∈ W(s_{2}) for \(s_{2} \wedge s_{2}^{\prime }\). By induction, we assume that τ is n-complete. If all the tests in τ pass, then we can conclude that i_{2} ioco s_{2} and so \((i_{2}, s_{2}) \in R_{i,s_{2}} \subseteq \overline {R}\). It can happen that a test in the distinguishing test suite τ fails, so that i_{2} does not conform to \(s_{2} \wedge s_{2}^{\prime }\). In that case, there is a sequence ρ which is an ioco counterexample executed after an access sequence of s_{2}. By fairness, we may assume this trace ρ is also executed after \(s_{2}^{\prime }\) (since we execute it from the same implementation state). Since i_{2} does not conform to \(s_{2} \wedge s_{2}^{\prime }\), either execution makes the whole test suite \(\mathbb {T}(S,n)\) fail, contradicting the assumption.
In both cases, we either have a contradiction, so that \(s_{2} \not \approx s_{2}^{\prime }\) cannot hold, or we have proven directly that \((i_{2}, s_{2}) \in \overline {R}\).
So we have now seen that R progresses to \(\overline {R}\). It is clear that R^{′} progresses to \(\overline {R}\) too. Then, since each R_{i, s} is an ioco relation, they progress to \(R_{i,s} \subseteq \overline {R}\). And so the union, \(\overline {R}\), progresses to \(\overline {R}\), meaning that \(\overline {R}\) is a coinductive ioco relation. Furthermore, we have \((i_{0}, s_{0}) \in \overline {R}\) (since 𝜖 ∈ P(S, n)), concluding the proof. □
We remark that if the specification does not contain any compatible states, the proof can be simplified considerably. In particular, we do not need test suites for merges of states, and we can use the relation R^{′} instead of \(\overline {R}\).
6.6 Unconditional test suite
The distinguishing test suite relies on executing distinguishing experiments. If a specification contains compatible states, the test suite contains distinguishing experiments which are themselves distinguishing test suites. This is thus a recursive construction: we need to show that such a test suite is finite. For particular specifications, recursive repetition of the distinguishing test suite as described above is already finite. For example, specification S in Fig. 3 contains compatible states, but in the merge of every two compatible states, no further compatible states remain (when ignoring state (χ, χ) as explained in Example 38). Consequently, the distinguishing test suites of each merge only have distinguishing graphs as distinguishing experiments, and hence the recursion terminates.
However, the merge of two compatible states may in general again contain compatible states. In these cases, recursive repetition of distinguishing test suites may cause a blowup in the size of the test suite. We therefore provide an alternative: the unconditional test suite, which has a clear upper bound. This bound is based on what is called state counting in FSM theory (Hierons 2004). The bound is obtained by counting the number of times a specification state is visited while executing a trace on the implementation. Definition 41 and Lemma 42 make this precise in our ioco setting.
Definition 41
Let \(S = (Q, T, q_{0}) \in \mathcal {S\!A}\) and \(n \in \mathbb {N}\). A trace σ ∈ traces(S) is n-bounded if ∀q ∈ Q : |{ρ∣ρ is a prefix of σ ∧ S after ρ = q}| ≤ n.
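Definition 41 translates directly into a counting procedure. The sketch below assumes a deterministic specification encoded as `T[q][mu]` and counts both the empty prefix and σ itself as prefixes of σ; this reading of the definition, the function name, and the toy specification are assumptions made for illustration.

```python
from collections import Counter


def is_n_bounded(T, q0, trace, n):
    """Check whether a trace of the specification is n-bounded."""
    q = q0
    visits = Counter([q0])  # the empty prefix leads to the initial state
    for mu in trace:
        if mu not in T[q]:
            raise ValueError("not a trace of the specification")
        q = T[q][mu]
        visits[q] += 1  # one more prefix leads to state q
    return all(count <= n for count in visits.values())


# Hypothetical toy specification: input a, outputs x and y.
TOY_SPEC = {"s0": {"a": "s1", "x": "s0"}, "s1": {"y": "s0"}}
print(is_n_bounded(TOY_SPEC, "s0", ("x", "x"), 2))  # s0 is visited three times
```

Every prefix of an n-bounded trace is again n-bounded, since removing labels from the end can only lower the visit counts.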
Lemma 42
Let \(S=(Q,T,q_{0}) \in \mathcal {S\!A}\) and \(I \in \mathcal {S\!A}_{\textit {IE}}\). If I ioco S does not hold, then traces(I) contains an |I|_{S}-bounded ioco counterexample for S.
Proof
If I ioco S does not hold, then traces(I) contains an ioco counterexample σ for S by Lemma 7. If σ is |I|_{S}-bounded, the proof is trivial, so assume it is not. Hence, there exists a state q ∈ Q, with at least |I|_{S} + 1 prefixes of σ leading to q. At least two of those prefixes ρ and ρ^{′} must lead to the same implementation state, i.e., it holds that I after ρ = I after ρ^{′} and S after ρ = S after ρ^{′}. Assuming |ρ| < |ρ^{′}| without loss of generality, we can thus create an ioco counterexample σ^{′} shorter than σ by replacing ρ^{′} by ρ. If σ^{′} is still not |I|_{S}-bounded, we can repeat this process until it is. □
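The shortening step of this proof can be illustrated in code. The sketch below assumes deterministic automata encoded as `T[q][mu]` and a counterexample whose last label is an output that the implementation produces but the specification does not allow. Instead of fixing a state q first, it repeatedly cuts any loop between two prefixes that reach the same pair of implementation and specification states, which is a slight variation on the argument above. All names and the toy automata are hypothetical.

```python
def is_counterexample(T_impl, i0, T_spec, s0, sigma):
    """The last label is produced by I but not allowed by S after the rest."""
    i, s = i0, s0
    for mu in sigma[:-1]:
        i, s = T_impl[i][mu], T_spec[s][mu]
    return sigma[-1] in T_impl[i] and sigma[-1] not in T_spec[s]


def shorten(T_impl, i0, T_spec, s0, sigma):
    """Cut loops until no (implementation, specification) pair repeats."""
    sigma = tuple(sigma)
    while True:
        i, s = i0, s0
        first_seen = {(i, s): 0}  # state pair -> length of the first prefix
        cut = None
        for k, mu in enumerate(sigma[:-1], start=1):
            i, s = T_impl[i][mu], T_spec[s][mu]
            if (i, s) in first_seen:
                cut = (first_seen[(i, s)], k)
                break
            first_seen[(i, s)] = k
        if cut is None:
            return sigma
        # Replace the longer prefix by the shorter one reaching the same pair.
        sigma = sigma[:cut[0]] + sigma[cut[1]:]


# Hypothetical toy automata with outputs x and y only.
TOY_SPEC = {"s0": {"y": "s0"}}
TOY_IMPL = {"i0": {"y": "i1"}, "i1": {"y": "i0", "x": "i1"}}
shortened = shorten(TOY_IMPL, "i0", TOY_SPEC, "s0", ("y", "y", "y", "x"))
```

After shortening, each pair of states occurs at most once among the proper prefixes, so every specification state is visited at most as often as there are implementation states on the path.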
Contrapositively, if we observe all n-bounded traces of an implementation I with |I|_{S} ≤ n and find no ioco counterexamples, we know that the implementation must be conforming. Note that an n-bounded trace has a length of at most |S|⋅ n, thus exhaustively checking all n-bounded traces is possible.
Definition 43
Let \(S \in \mathcal {S\!A}\) and \(n \in \mathbb {N}\). The unconditional test suite is then: \(\mathbb {U}(S,n) = \{\sigma \in \textsf {traces}({S}) \mid \sigma \text { is } n\text {-bounded}\}\).
Corollary 44
Let \(S \in \mathcal {S\!A}\) and \(n \in \mathbb {N}\). Then, \(\mathbb {U}(S,n)\) is an n-complete test suite. We have \(\forall \sigma \in \mathbb {U}(S,n): |\sigma| \le |S| \cdot n\).
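Because every prefix of an n-bounded trace is itself n-bounded, the set of Definition 43 can be enumerated by a depth-first search that stops extending a trace as soon as the next state would be visited more than n times. The following is a minimal sketch under the assumption of a deterministic automaton encoded as a nested dictionary; the function name and the automaton `TRANS` are hypothetical.

```python
# Hypothetical two-state automaton: 0 --x--> 1, 1 --y--> 0.
TRANS = {0: {"x": 1}, 1: {"y": 0}}


def unconditional_test_suite(trans, q0, n):
    """Enumerate all n-bounded traces of a deterministic automaton,
    each returned as a tuple of labels."""
    suite = []

    def extend(state, trace, visits):
        suite.append(tuple(trace))
        for label, target in sorted(trans[state].items()):
            if visits.get(target, 0) < n:  # target may be visited once more
                visits[target] = visits.get(target, 0) + 1
                trace.append(label)
                extend(target, trace, visits)
                trace.pop()
                visits[target] -= 1

    if n >= 1:  # for n = 0 even the empty trace visits q0 too often
        extend(q0, [], {q0: 1})
    return suite
```

For this automaton and n = 2, the enumeration yields the empty trace, x, xy and xyx, all within the length bound of the corollary.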
The upper bound |S|⋅n is tight, as shown in Example 45. A set of traces of length at most |S|⋅n is much bigger than the set P^{+}(S, n) of traces of length at most n, as the number of traces grows exponentially in their length. Thus, a distinguishing test suite as introduced in Section 6.3 may be significantly smaller, depending on the number of compatible states. The unconditional test suite shows the possibility of unconditional termination with a fixed upper bound, though.
As \(\mathbb {U}(S,n)\) consists of traces, test execution amounts to executing all these traces in accordance with the fairness assumption. If the implementation produces a trace not in traces(S), the test suite has verdict fail. If all traces of \(\mathbb {U}(S,n)\) have been executed without obtaining a fail verdict, the test suite has verdict pass.
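The verdict described above can be sketched as follows. Since a black-box implementation cannot be inspected, the sketch substitutes a known automaton for it and treats every enabled output as eventually observed, which is a simplified stand-in for the fairness assumption. The function name `execute`, the dictionary encoding, the explicit set of output labels, and the automata `SPEC`, `GOOD` and `BAD` are all hypothetical.

```python
# Hypothetical specification: input a, then output b.
SPEC = {0: {"a": 1}, 1: {"b": 0}}
GOOD = {0: {"a": 1}, 1: {"b": 0}}          # conforming stand-in implementation
BAD = {0: {"a": 1}, 1: {"b": 0, "c": 1}}   # may also produce the output c
OUTPUTS = {"b", "c"}
# The 2-bounded traces of SPEC, written out by hand.
SUITE = [(), ("a",), ("a", "b"), ("a", "b", "a")]


def execute(spec, s0, impl, i0, suite, outputs):
    """Return "fail" if, after some trace of the suite, the implementation
    enables an output that the specification does not allow; else "pass"."""
    for trace in suite:
        s, i = s0, i0
        for label in trace:
            if label not in impl[i]:
                break  # the implementation cannot follow this trace
            s, i = spec[s][label], impl[i][label]
        else:
            unexpected = [x for x in impl[i] if x in outputs and x not in spec[s]]
            if unexpected:
                return "fail"
    return "pass"
```

With these automata, `execute(SPEC, 0, GOOD, 0, SUITE, OUTPUTS)` gives pass, while `BAD` fails after the trace a because it can produce c where only b is allowed.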
Example 45
Figure 9 shows a specification and a non-conforming implementation with ioco counterexample yyxyyxyyxyyx, of maximal length |S|⋅I_{S} = 12.
7 Conclusions and future work
We firmly embedded theory on n-complete test suites into ioco theory, without making any restrictive assumptions. We have identified several problems where classical FSM techniques fail for suspension automata, in particular for compatible states. The concept of distinguishing states has been extended such that compatible states can be handled, by n-complete testing of the merge of such states. Additionally, we have given a construction for distinguishing graphs for incompatible states, which follows naturally from the computation of the compatibility relation.
We use an extended domain of suspension automata, which may not respect the usual conditions for quiescence. This is a conservative approach: a test suite that detects every faulty implementation in our extended domain also detects every faulty implementation that does respect quiescence. However, it may contain more tests than necessary, namely those needed only to detect “spurious” implementations. A further area of research is to tighten the definitions of equivalence, compatibility and n-complete test suites to capture the more restricted usual implementation domain.
For reaching all implementation states, we used all traces up to length n, which gives an upper bound that is exponential in the number of states. Furthermore, the recursion of using a test suite to test a merge of compatible states may not terminate. We therefore introduced an unconditional test suite, which provides an exponential but finite upper bound. These two exponential upper bounds may limit practical applicability, so further investigation is needed to tackle these problems efficiently. Furthermore, experiments are needed to determine the actual efficiency of computation and execution time, preferably on real-world case studies. This should include a quantitative comparison with other methods, for example random testing as by Tretmans (2008).
References
Aarts, F., & Vaandrager, F. (2010). Learning I/O Automata. In International conference on concurrency theory. Springer (pp. 71–85).
Beneš, N., Daca, P., Henzinger, T.A., Křetínský, J., Ničković, D. (2015). Complete composition operators for ioco-testing theory. In Proceedings of the 18th international ACM SIGSOFT symposium on component-based software engineering. ACM (pp. 101–110).
Bonchi, F., & Pous, D. (2015). Hacking Nondeterminism with Induction and Coinduction. Communications of the ACM, 58(2), 87–95.
Chow, T.S. (1978). Testing software design modeled by finite-state machines. IEEE Transactions on Software Engineering, 4(3), 178–187.
Dorofeeva, R., El-Fakih, K., Maag, S., Cavalli, A.R., Yevtushenko, N. (2010). FSM-based conformance testing methods: a survey annotated with experimental evaluation. Information and Software Technology, 52(12), 1286–1297.
Hierons, R.M. (2004). Testing from a nondeterministic finite state machine using adaptive state counting. IEEE Transactions on Computers, 53(10), 1330–1342.
Lee, D., & Yannakakis, M. (1994). Testing finite-state machines: state identification and verification. IEEE Transactions on Computers, 43(3), 306–320.
Noroozi, N. (2014). Improving input-output conformance testing theories. PhD thesis, Technische Universiteit Eindhoven.
Paiva, S.C., & Simao, A. (2016). Generation of complete test suites from Mealy input/output transition systems. Formal Aspects of Computing, 28(1), 65–78.
Petrenko, A., & Yevtushenko, N. (2005). Conformance tests as checking experiments for partial nondeterministic FSM. In International workshop on formal approaches to software testing. Springer (pp. 118–133).
Petrenko, A., & Yevtushenko, N. (2011). Adaptive testing of deterministic implementations specified by nondeterministic FSMs. In IFIP international conference on testing software and systems. Springer (pp. 162–178).
Simao, A., & Petrenko, A. (2014). Generating complete and finite test suite for ioco: Is it possible? In Proceedings ninth workshop on model-based testing, MBT 2014, Grenoble, France (pp. 56–70).
Tretmans, J. (1996). Test generation with inputs, outputs and repetitive quiescence. Software - Concepts and Tools.
Tretmans, J. (2008). Model based testing with labelled transition systems. In Formal methods and testing. Springer (pp. 1–38).
van den Bos, P., Janssen, R., Moerman, J. (2017). n-complete test suites for IOCO. In IFIP international conference on testing software and systems. Springer (pp. 91–107).
Veanes, M., & Bjørner, N. (2012). Alternating simulation and IOCO. International Journal on Software Tools for Technology Transfer, 14(4), 387–405.
Willemse, T.A.C. (2006). Heuristics for ioco-based test-based modelling. In International workshop on formal methods for industrial critical systems. Springer (pp. 132–147).
Acknowledgments
We would like to thank Frits Vaandrager, Jan Tretmans, and the anonymous reviewers for their valuable time and useful feedback.
Petra van den Bos and Ramon Janssen were supported by NWO project 13859 (SUMBAT).
Rights and permissions
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
van den Bos, P., Janssen, R. & Moerman, J. n-Complete test suites for IOCO. Software Qual J 27, 563–588 (2019). https://doi.org/10.1007/s11219-018-9422-x