Alternating Automata Modulo First Order Theories

We introduce first-order alternating automata, a generalization of boolean alternating automata, in which transition rules are described by multisorted first-order formulae, with states and internal variables given by uninterpreted predicate terms. The model is closed under union, intersection and complement, and its emptiness problem is undecidable, even for the simplest data theory of equality. To cope with the undecidability problem, we develop an abstraction refinement semi-algorithm based on lazy annotation of the symbolic execution paths with interpolants, obtained by applying (i) quantifier elimination with witness term generation and (ii) Lyndon interpolation in the quantifier-free theory of the data domain, with uninterpreted predicate symbols. This provides a method for checking inclusion of timed and finite-memory register automata, and emptiness of quantified predicate automata, previously used in the verification of parameterized concurrent programs, composed of replicated threads, with shared memory.


Introduction
Many results in automata theory rely on the finite alphabet hypothesis, which guarantees, in some cases, the existence of determinization, complementation and inclusion checking methods. However, this hypothesis prevents the use of automata as models of real-time systems or even simple programs, whose input and output are data values ranging over very large domains, typically viewed as infinite mathematical abstractions.
Traditional attempts to generalize classical Rabin-Scott automata to infinite alphabets, such as timed automata [1] and finite-memory automata [16], face the complement closure problem: there exist automata for which the complement language cannot be recognized by an automaton in the same class. This makes it impossible to encode a language inclusion problem L(A) ⊆ L(B) as the emptiness of an automaton recognizing the language $L(A) \cap L^c(B)$, where $L^c(B)$ denotes the complement of L(B).
Even for finite alphabets, complementation of finite-state automata faces inherent exponential blowup, due to nondeterminism. However, if we allow universal nondeterminism, in addition to the classical existential nondeterminism, complementation is possible in linear time. Having both existential and universal nondeterminism defines the alternating automata model [4]. A finite-alphabet alternating automaton is described by a set of transition rules $q \xrightarrow{a} \phi$, where q is a state, a is an input symbol and φ is a boolean formula, whose propositional variables denote successor states.
Our Contribution. We extend alternating automata to infinite data alphabets, by defining a model of computation in which all boolean operations, including complementation, can be done in linear time. The control states are given by k-ary predicate symbols $q(y_1, \ldots, y_k)$, the input consists of an event a from a finite alphabet and a tuple of data variables $x_1, \ldots, x_n$, ranging over an infinite domain, and transitions are of the form $q(y_1, \ldots, y_k) \xrightarrow{a(x_1, \ldots, x_n)} \phi(x_1, \ldots, x_n, y_1, \ldots, y_k)$, where φ is a formula in the first-order theory of the data domain. In this model, the arguments of a predicate atom $q(y_1, \ldots, y_k)$ represent the values of the internal variables associated with the state. Together with the input values $x_1, \ldots, x_n$, these values define the next configurations, but remain invisible in the input sequence.
The tight coupling of internal values and control states, by means of uninterpreted predicate symbols, allows for linear-time complementation just as in the case of classical propositional alternating automata. Complementation is, moreover, possible when the transition formulae contain first-order quantifiers, generating infinitely-branching execution trees. The price to be paid for this expressivity is that emptiness of first-order alternating automata is undecidable, even for the simplest data theory of equality [6].
The main contribution of this paper is an effective emptiness checking semi-algorithm for first-order alternating automata, in the spirit of the IMPACT lazy annotation procedure, originally developed for checking safety of nondeterministic integer programs [20,21]. In a nutshell, a lazy annotation procedure unfolds an automaton A trying to find an execution that recognizes a word from L(A). If a path that reaches a final state does not correspond to a concrete run of the automaton, the positions on the path are labeled with interpolants from the proof of infeasibility, thus marking this path and all continuations as infeasible for future searches. Termination of lazy annotation procedures is not guaranteed, but having a suitable coverage relation between the nodes of the search tree may ensure convergence on many real-life examples. However, applying lazy annotation to first-order alternating automata faces two nontrivial problems:
1. Quantified transition rules make it hard, if not impossible, in general, to decide if a path is infeasible. This is mainly because adding uninterpreted predicate symbols to decidable first-order theories, such as Presburger arithmetic, results in undecidability [10]. To deal with this problem, we assume that the first-order data theory, without uninterpreted predicate symbols, has a quantifier elimination procedure that instantiates quantifiers with effectively computable witness terms.
2. The interpolants that prove the infeasibility of a path are not local, as they may refer to input values encountered in the past. However, the future executions are oblivious to when these values have been seen in the past and depend only on the relation between the past and current values. We use this fact to define a labeling of nodes, visited by the lazy annotation procedure, with conjunctions of existentially quantified interpolants combining predicate atoms with data constraints.
We use first-order alternating automata to develop practical semi-algorithms for a number of known undecidable problems, such as: inclusion of regular timed languages [1], inclusion of quasi-regular languages recognized by finite-memory automata [16] and emptiness of predicate automata, a subclass of first-order alternating automata used to verify parameterized concurrent programs [6,7].
Related Work. Recognizers for languages over infinite alphabets have found various applications, ranging from Unicode text recognition [5] to runtime program monitoring [2]. Extending finite automata to infinite alphabets has been considered in the context of symbolic alternating finite automata (s-AFA), whose transitions are labeled with guards taken from a decidable theory of the data domain [5]. As in our model, s-AFA are closed under union, intersection and complement; moreover, due to the lack of registers, their emptiness problem is decidable. However, s-AFA are strictly less expressive than our model, because comparing data at different positions in the input word is not possible.
Constrained Horn clauses (CHC) are a branching computation model widespread in program verification [9]. The main difference between alternating and bottom-up branching computations is that, in an alternating model, all branches of the computation must synchronize on the same input word. With this in mind, it is possible to express emptiness of first-order alternating automata as the existence of solutions of a CHC over a higher-order theory of data, extended with algebraic data types (lists). The effectiveness of such an encoding depends on the effectiveness of interpolation and witness term generation for theories of algebraic data types [11].
The alternating automata model presented in this paper extends the alternating automata with variables ranging over infinite data considered in [14]. There, all variables were required to be observable in the input. We overcome this restriction by allowing internal (invisible) variables. Another closely related work [13] considers an inclusion between an asynchronous product of automata $A_1 \times \ldots \times A_n$, extended with data variables, and a monitor automaton B. The semi-algorithm defined there was based on the assumption that all variables of the observer B must be declared in the automata $A_1, \ldots, A_n$ under check. This limitation can now be bypassed, since the inclusion problem can be encoded as emptiness of a first-order alternating automaton and, moreover, the emptiness checking semi-algorithm can handle invisible variables.
The work probably closest to ours concerns the model of predicate automata (PA) [6,7,17], used in the verification of parameterized concurrent programs with shared memory. In this model, the alphabet consists of pairs of program statements and thread identifiers and is considered infinite because the number of threads is unbounded. Since thread identifiers can only be compared for equality, the data theory in PA is the theory of equality. Even with this simplification, the emptiness problem is undecidable when either the predicates have arity greater than one [6] or the transition rules are quantified [17]. Checking emptiness of quantifier-free PA is possible semi-algorithmically, by explicitly enumerating reachable configurations and checking coverage by looking for permutations of argument values. However, no semi-algorithm has been given for quantified PA. Dealing with quantified transition rules is one of our contributions.

Preliminaries
We consider two disjoint sorts D and B, where D is an infinite domain and B = {⊤, ⊥} is the set of boolean values true (⊤) and false (⊥), respectively. The D sort is equipped with countably many function symbols $f : D^{\#(f)} \to D \cup B$, where #(f) ≥ 0 denotes the number of arguments (arity) of f. A predicate is a function symbol $p : D^{\#(p)} \to B$, that is, a #(p)-ary relation.
We consider the interpretation of all function symbols $f : D^{\#(f)} \to D$ to be fixed by the interpretation of the D sort; for instance, if D is the set of integers Z, these are zero, the successor function and the arithmetic operations of addition and multiplication. We extend this convention to several predicates over D, such as the inequality relation over Z, and write Pred for the set of remaining uninterpreted predicates.
Let Var = {x, y, z, ...} be a countably infinite set of variables, ranging over D. Terms are either constants of sort D, variables or function applications $f(t_1, \ldots, t_{\#(f)})$, where $t_1, \ldots, t_{\#(f)}$ are terms. The set of first-order formulae is defined by the syntax below:
$\phi := t = s \mid p(t_1, \ldots, t_{\#(p)}) \mid \neg\phi_1 \mid \phi_1 \wedge \phi_2 \mid \exists x \,.\, \phi_1$
where t, s, $t_1, \ldots, t_{\#(p)}$ denote terms and p is a predicate symbol. We write $\phi_1 \vee \phi_2$, $\phi_1 \to \phi_2$ and $\forall x \,.\, \phi_1$ for $\neg(\neg\phi_1 \wedge \neg\phi_2)$, $\neg\phi_1 \vee \phi_2$ and $\neg\exists x \,.\, \neg\phi_1$, respectively. FV(φ) is the set of free variables in φ and the size |φ| of a formula φ is the number of symbols needed to write it down. A sentence is a formula φ with no free variables. A formula is positive if each uninterpreted predicate symbol occurs under an even number of negations and we denote by $\mathsf{Form}^+(Q, X)$ the set of positive formulae with predicates from the set Q ⊆ Pred and free variables from the set X ⊆ Var. A formula is in prenex form if it is of the form $\varphi = Q_1 x_1 \ldots Q_n x_n \,.\, \phi$, where φ has no quantifiers. In this case we call φ the matrix of ϕ. Every first-order formula can be written in prenex form, by renaming each quantified variable to a unique name and moving the quantifiers to the front.
An interpretation I maps each predicate symbol p into a set $p^I \subseteq D^{\#(p)}$, if #(p) > 0, or into an element of B if #(p) = 0. A valuation ν maps each variable x into an element of D. Given a term t, we denote by $t^\nu$ the value obtained by replacing each variable x by the value ν(x) and evaluating each function application. For a formula φ, we define the forcing relation $I, \nu \models \phi$ recursively on the structure of φ, as usual. For a formula φ and a valuation ν, we define for every interpretation I and every valuation ν. We say that φ entails ψ, Interpretations are partially ordered by the pointwise subset order, defined as $I_1 \subseteq I_2$ if and only if $p^{I_1} \subseteq p^{I_2}$ for each predicate symbol p ∈ Pred. Given a formula φ and a valuation ν, we define [[φ]]

First Order Alternating Automata
Let Σ be a finite alphabet of input events. Given a finite set of variables X ⊆ Var, we denote by X → D the set of valuations of the variables X and let Σ[X] = Σ × (X → D) be the possibly infinite set of data symbols (a, ν), where a is an input symbol and ν is a valuation. A data word (simply called word in the following) is a finite sequence $w = (a_1, \nu_1)(a_2, \nu_2) \ldots (a_n, \nu_n)$ of data symbols. Given a word w, we denote by $w_\Sigma \stackrel{\text{def}}{=} a_1 \ldots a_n$ its sequence of input events and by $w_D$ the valuation associating each time-stamped variable $x^{(i)}$, where x ∈ Var, with the value $\nu_i(x)$, for all i ∈ [1, n]. We denote by ε the empty sequence, by $\Sigma^*$ the set of finite input sequences and by $\Sigma[X]^*$ the set of finite data words over the variables X.
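As a small illustration of these definitions, the sketch below represents a data word as a Python list of (event, valuation) pairs and computes its sequence of input events and its time-stamped valuation. The list-of-pairs encoding and the function names `events` and `timestamped_valuation` are assumptions chosen for this example only.

```python
def events(word):
    """Return the sequence of input events of a data word."""
    return [a for a, _ in word]


def timestamped_valuation(word):
    """Map each time-stamped variable (x, i) to the value that the i-th
    valuation assigns to x; positions are numbered from 1."""
    return {
        (x, i): value
        for i, (_, nu) in enumerate(word, start=1)
        for x, value in nu.items()
    }


# A word of length two over the events {a, b} and the single variable x.
word = [("a", {"x": 3}), ("b", {"x": 7})]
print(events(word))
print(timestamped_valuation(word))
```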
A first-order alternating automaton is a tuple A = ⟨Σ, X, Q, ι, F, ∆⟩, where Σ is a finite set of input events, X is a finite set of input variables, Q is a finite set of predicates denoting control states, $\iota \in \mathsf{Form}^+(Q, \emptyset)$ is a sentence defining initial configurations, F ⊆ Q is the set of predicates denoting final states and ∆ is a set of transition rules. A transition rule is of the form $q(y_1, \ldots, y_{\#(q)}) \xrightarrow{a(X)} \psi$, where q ∈ Q is a predicate, a ∈ Σ is an input event and $\psi \in \mathsf{Form}^+(Q, X \cup \{y_1, \ldots, y_{\#(q)}\})$ is a positive formula, where $X \cap \{y_1, \ldots, y_{\#(q)}\} = \emptyset$. Without loss of generality, we consider, for each predicate q ∈ Q and each input event a ∈ Σ, at most one such rule, as two or more rules can be joined using disjunction. The quantifiers occurring in the right-hand side formula of a transition rule are called transition quantifiers. The size of A is
The semantics of first-order alternating automata is analogous to the semantics of propositional alternating automata, with rules of the form $q \xrightarrow{a} \phi$, where q is a propositional variable and φ a positive boolean combination of propositional variables. For instance, $q_0 \xrightarrow{a} (q_1 \wedge q_2) \vee q_3$ means that the automaton can choose to transition either to both $q_1$ and $q_2$, or to $q_3$ alone. This leads to defining transitions as the minimal models of the right-hand side of a rule. The original definition of alternating automata [4] works around this problem and considers boolean valuations instead of formulae. In contrast, a finite description of a first-order alternating automaton cannot be given in terms of interpretations, as a first-order formula may have infinitely many models, corresponding to infinitely many initial or successor states occurring within an execution step.
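For the propositional case, the minimal models of a rule body can be enumerated by brute force. The sketch below does this for the right-hand side $(q_1 \wedge q_2) \vee q_3$ from the example above; the tuple encoding of formulae and the names `holds` and `minimal_models` are assumptions made for illustration, and the exhaustive enumeration is only meant for small state sets.

```python
from itertools import combinations


def holds(phi, true_states):
    """Evaluate a positive formula under the set of states assumed true."""
    if isinstance(phi, str):
        return phi in true_states
    op, left, right = phi
    if op == "and":
        return holds(left, true_states) and holds(right, true_states)
    if op == "or":
        return holds(left, true_states) or holds(right, true_states)
    raise ValueError(f"unknown connective: {op!r}")


def minimal_models(phi, states):
    """Enumerate subsets by increasing size; keep a model only if no
    previously found (hence smaller) model is contained in it."""
    models = []
    for size in range(len(states) + 1):
        for subset in combinations(sorted(states), size):
            candidate = frozenset(subset)
            if holds(phi, candidate) and not any(m <= candidate for m in models):
                models.append(candidate)
    return models


rule_body = ("or", ("and", "q1", "q2"), "q3")
print(minimal_models(rule_body, {"q1", "q2", "q3"}))
```

The two minimal models found for this rule body are {q3} and {q1, q2}, matching the two choices described in the text.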
Given an uninterpreted predicate symbol q ∈ Q and data values $d_1, \ldots, d_{\#(q)} \in D$, the tuple $(q, d_1, \ldots, d_{\#(q)})$ is called a configuration, sometimes written $q(d_1, \ldots, d_{\#(q)})$, when no confusion arises. A configuration is final if q ∈ F. An interpretation I corresponds to a set of configurations c(I) This notation is lifted to sets of configurations in the usual way.
where each $T_i$ is a tree labeled with configurations, such that:
1. $c = \{T(\epsilon) \mid T \in \mathcal{T}\}$ is the set of configurations labeling the roots of $T_1, T_2, \ldots$ and
2. if $(q, d_1, \ldots, d_{\#(q)})$ labels a node on the level j ∈ [n − 1] in $T_i$, then the labels of its children form a cube from c( ] and $q(y_1, \ldots, y_{\#(q)}) \xrightarrow{a_{j+1}(X)}$
An execution $\mathcal{T}$ over w, starting with c, is accepting if and only if all paths in $\mathcal{T}$ have the same length and the frontier of each tree $T \in \mathcal{T}$ is labeled with final configurations. If A has an accepting execution over w starting with a cube $c \in c([[\iota]]^\mu)$, then A accepts w. We denote by L(A) the set of words accepted by A.
For example, consider the automaton $A = \langle \{a\}, \{x\}, \{q_0, q_1, q_2, q_f\}, q_0(0), \{q_f\}, \Delta \rangle$, where ∆ is the set: $q_0(y)$
A possible execution tree of this automaton is the following:
The execution tree is not accepting, since its frontier is not labeled with final configurations everywhere. Incidentally, here we have L(A) = ∅, which is proved by our tool in ∼0.5 seconds on an average machine.
In the rest of this paper, we are concerned with the following problems: 1. boolean closure:
For technical reasons, we address the following problem next: given an automaton A and an input sequence $\alpha \in \Sigma^*$, does there exist a word w ∈ L(A) such that $w_\Sigma = \alpha$? By solving this problem first, we develop the machinery required to prove that first-order alternating automata are closed under complement and, further, lay the groundwork for developing a practical semi-algorithm for the emptiness problem.

Path Formulae
In the upcoming developments it is sometimes more convenient to work with logical formulae defining executions of automata, than with low-level execution forests. For this reason, we first introduce path formulae Θ(α), which are formulae defining the executions of an automaton, over words that share a given sequence α of input events. Second, we restrict a path formula Θ(α) to an acceptance formula Υ(α), which defines only those executions that are accepting among Θ(α). Consequently, the automaton accepts a word w such that $w_\Sigma = \alpha$ if and only if Υ(α) is satisfiable.
Let A = ⟨Σ, X, Q, ι, F, ∆⟩ be an automaton for the rest of this section. For any i ∈ N, we denote by the sets of time-stamped predicate symbols and variables, respectively. We also define /Q] the formula in which all input variables and state predicates (and only those symbols) are replaced by their time-stamped counterparts. Moreover, we write q(y) for $q(y_1, \ldots, y_{\#(q)})$, when no confusion arises.
Given a sequence of input events $\alpha = a_1 \ldots a_n \in \Sigma^*$, the path formula of α is:
The automaton A, to which Θ(α) refers, will always be clear from the context. To formalize the relation between the low-level configuration-based execution semantics and path formulae, consider a word $w = (a_1, \nu_1) \ldots (a_n, \nu_n) \in \Sigma[X]^*$. Any execution T of A over w has an associated interpretation $I_T$ of time-stamped predicates $Q^{(\le n)}$: T is an execution of A over w}.
Next, we give a logical characterization of acceptance, relative to a given sequence of input events $\alpha \in \Sigma^*$. To this end, we constrain the path formula Θ(α) by requiring that only final states of A occur on the last level of the execution. The result is the acceptance formula for α:
The top-level universal quantifiers from a subformula $\forall y_1 \ldots \forall y_{\#(q)} \,.\, q^{(i)}(y) \to \psi$ of Υ(α) will be referred to as path quantifiers, in the following. Notice that path quantifiers are distinct from the transition quantifiers that occur within a formula ψ of a transition rule $q(y_1, \ldots, y_{\#(q)}) \xrightarrow{a(X)} \psi$. The relation between the words accepted by A and the acceptance formula above is formally captured by the following lemma:
Lemma 2. Given an automaton A = ⟨Σ, X, Q, ι, F, ∆⟩, for every word $w \in \Sigma[X]^*$, the following are equivalent: (1) there exists an interpretation I such that $I, w_D \models \Upsilon(w_\Sigma)$ and (2) w ∈ L(A).
As an immediate consequence, one can decide whether A accepts some word w with a given input sequence w Σ = α, by checking whether Υ(α) is satisfiable.However, unlike non-alternating infinite-state models of computation, such as counter automata (nondeterministic programs with integer variables), the satisfiability query for an acceptance (path) formula falls outside of known decidable theories, supported by standard SMT solvers.There are basically two reasons for this, namely (i) the presence of predicate symbols, and (ii) the non-trivial alternation of quantifiers.To understand this point, consider for example, the decidable theory of Presburger arithmetic [24].Adding even only one monadic predicate symbol to it yields undecidability in the presence of non-trivial quantifier alternation [10].On the other hand, the quantifier-free fragment of Presburger arithmetic extended with uninterpreted function symbols is decidable, by a Nelson-Oppen style congruence closure argument [22].
To tackle the problem of deciding satisfiability of Υ(α) formulae, we start from the observation that their form is rather particular, which allows the elimination of path quantifiers and uninterpreted predicate symbols, by a couple of satisfiability-preserving transformations. The result of applying these transformations is a formula with no predicate symbols, whose only quantifiers are those introduced by the transition rules of the automaton. Next, in §3 we shall assume moreover that the first-order theory of the data sort D (without uninterpreted predicate symbols) has quantifier elimination, providing thus an effective decision procedure.
Example 2 (Contd. from Example 1). The result of the elimination of predicate atoms from the acceptance formula in Example 1 is shown below:
Since this formula is unsatisfiable, by Lemma 5 below, no word w with input event sequence $w_\Sigma = a_1 a_2$ is accepted by the automaton A from Example 1.
At this point, we prove the formal relation between the satisfiability of the acceptance formula Υ(α) and that of the predicate-free formula obtained from it. Since no predicates occur in the latter, its truth value under a valuation $\nu : X^{(\le n)} \to D$ does not depend on the interpretation: it holds under some interpretation I if and only if it holds under every interpretation J. In this case we omit the interpretation and write only the valuation ν on the left-hand side of the forcing relation. Finally, we define the acceptance of a word with a given input event sequence by means of a quantifier-free formula in which no predicate atom occurs.

Boolean Closure of First Order Alternating Automata
Given a positive formula φ, we define the dual formula $\phi^\sim$ recursively as follows:
The following theorem shows closure of automata under all boolean operations. Note that it is sufficient to show closure under intersection and negation because $L(A_1) \cup L(A_2)$ is the complement of the language $L^c(A_1) \cap L^c(A_2)$, for any two automata $A_1$ and $A_2$ with the same input event alphabet and set of input variables.

The Emptiness Problem
The emptiness problem is undecidable even for automata with predicates of arity two, whose transition rules use only equalities and disequalities, having no transition quantifiers [6]. Since even such simple classes of alternating automata have no general decision procedure for emptiness, we use an abstraction-refinement semi-algorithm based on lazy annotation [20,21]. In a nutshell, a lazy annotation procedure systematically explores the set of finite input event sequences searching for an accepting execution. For an input sequence, if the path formula is satisfiable, we compute a word in the language of the automaton, from the model of the path formula. Otherwise, the sequence is spurious: the search backtracks and each position in the sequence is annotated with an interpolant, thus marking the sequence as infeasible. The semi-algorithm moreover uses a coverage relation between sequences, ensuring that the continuations of already covered sequences are never explored. When the automaton is empty, this coverage relation sometimes provides a sound termination argument.
For two input event sequences $\alpha, \beta \in \Sigma^*$, we say that α is a prefix of β, written α ⪯ β, if β = αγ for some sequence $\gamma \in \Sigma^*$. A set S of sequences is prefix-closed if for each α ∈ S, if β ⪯ α then β ∈ S, and complete if for each α ∈ S, there exists a ∈ Σ such that αa ∈ S if and only if αb ∈ S for all b ∈ Σ. A prefix-closed set is the backbone of a tree whose edges are labeled with input events. If the set is, moreover, complete, then every node of the tree has either zero successors, in which case it is called a leaf, or it has a successor edge labeled with a for each input event a ∈ Σ.
Definition 2. An unfolding of an automaton A = ⟨Σ, X, Q, ι, F, ∆⟩ is a finite partial mapping $U : \Sigma^* \rightharpoonup_{fin} \mathsf{Form}^+(Q, \emptyset)$, whose domain dom(U) is a finite prefix-closed complete set, such that U(ε) = ι, and for each sequence αa ∈ dom(U), such that $\alpha \in \Sigma^*$ and a ∈ Σ:
$U(\alpha)^{(0)} \wedge \bigwedge_{q(y) \xrightarrow{a(X)} \psi} \forall y_1 \ldots \forall y_{\#(q)} \,.\, q^{(0)}(y) \to \psi^{(1)} \models U(\alpha a)^{(1)}$
A path α is safe in U if and only if $U(\alpha) \wedge \bigwedge_{q \in Q \setminus F} \forall y_1 \ldots \forall y_{\#(q)} \,.\, q(y) \to \bot$ is unsatisfiable. The unfolding U is safe if and only if every path in dom(U) is safe in U.
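The two conditions required of the domain of an unfolding, prefix-closure and completeness, can be checked directly on a finite set of sequences. The following sketch assumes that sequences are encoded as Python tuples of events; the function names are hypothetical.

```python
def is_prefix_closed(sequences):
    """Every prefix of a member (including the empty sequence) is a member."""
    return all(
        alpha[:k] in sequences
        for alpha in sequences
        for k in range(len(alpha))
    )


def is_complete(sequences, sigma):
    """Each member has either no one-step extension in the set,
    or one extension for every event of the alphabet."""
    for alpha in sequences:
        present = [alpha + (a,) in sequences for a in sigma]
        if any(present) and not all(present):
            return False
    return True


sigma = ("a", "b")
domain = {(), ("a",), ("b",)}
print(is_prefix_closed(domain), is_complete(domain, sigma))
```

In this example the root has a successor for each of the two events and both successors are leaves, so the set is prefix-closed and complete. Removing `("b",)` keeps it prefix-closed but breaks completeness.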
Lazy annotation semi-algorithms [20,21] build unfoldings of automata trying to discover counterexamples for emptiness. If the automaton A in question is non-empty, a systematic enumeration of the input event sequences from $\Sigma^*$ will suffice to discover a word w ∈ L(A), provided that the first-order theory of the data domain D is decidable (Lemma 2). However, if L(A) = ∅, the enumeration of input event sequences may, in principle, run forever. The typical way of fighting this divergence problem is to define a coverage relation between the nodes of the unfolding tree.
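The systematic enumeration mentioned above can be sketched as a breadth-first generator over $\Sigma^*$. In the sketch below, `is_feasible` is a hypothetical oracle standing in for the satisfiability check of the acceptance formula of a sequence, and `max_length` is an artificial bound added only so that the example terminates; neither is part of the procedure described in this paper.

```python
from itertools import count, product


def enumerate_sequences(sigma):
    """Yield all finite sequences over sigma, shortest first."""
    for length in count(0):
        for alpha in product(sigma, repeat=length):
            yield alpha


def find_witness(sigma, is_feasible, max_length):
    """Return the first feasible sequence of length at most max_length,
    or None if there is none within the bound."""
    for alpha in enumerate_sequences(sigma):
        if len(alpha) > max_length:
            return None
        if is_feasible(alpha):
            return alpha


# Toy oracle (an assumption): only the sequence "ab" is feasible.
print(find_witness(("a", "b"), lambda alpha: alpha == ("a", "b"), 3))
```

Without the bound, the search would not terminate on an empty automaton, which is the divergence problem that the coverage relation is meant to address.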

Algorithm 1 IMPACT-based Semi-algorithm for First Order Alternating Automata
input: a first order alternating automaton A = ⟨Σ, X, Q, ι, F, ∆⟩
output: if L(A) = ∅, or word w ∈ L(A), otherwise
data structures: WorkList and unfolding tree U = N, E, r, U, , where:
- N is a set of nodes,
- E ⊆ N × Σ × N is a set of edges labeled by input events,
- U :
dequeue n from WorkList
3: let α(n) be $a_1, \ldots, a_k$
5: if $\Upsilon(\alpha)(X^{(1)}, \ldots, X^{(k)})$ is satisfiable then ▷ counterexample is feasible
6: get model ν of $\Upsilon(\alpha)(X^{(1)}, \ldots, X^{(k)})$
7: return $w = (a_1, \nu(X^{(1)})) \ldots (a_k, \nu(X^{(k)}))$ ▷ w ∈ L(A) by construction
8: else ▷ spurious counterexample
9: let $(I_0, \ldots, I_k)$ be a GLI for α
10: b ← ⊥

A lazy annotation semi-algorithm will stop and report emptiness provided that it succeeds in building a closed and safe unfolding of the automaton. Notice that, by Definition 3, for any three nodes of an unfolding U, say α, β, γ ∈ dom(U), if α ≺ β and α is covered by γ, then β is covered by γ as well. As we show next (Theorem 2), there is no need to expand covered nodes, because, intuitively, there exists a word w ∈ L(A) such that α is a prefix of $w_\Sigma$ and α is covered by γ only if there exists another word u ∈ L(A) such that γ is a prefix of $u_\Sigma$. Hence, exploring only those input event sequences that are continuations of γ (and ignoring those of α) suffices in order to find a counterexample for emptiness, if one exists.
An unfolding node α ∈ dom(U) is said to be spurious if and only if Υ(α) is unsatisfiable. In this case, we change (refine) the labels of (some of the) prefixes of α (and that of α), such that U(α) becomes ⊥, thus indicating that there is no real execution of the automaton along that input event sequence. As a result of the change of labels, if a node γ that is a prefix of α used to cover another node from dom(U), it might not cover it with the new label. Therefore, the coverage relation has to be recomputed after each refinement of the labeling. The semi-algorithm stops when (and if) a safe complete unfolding has been found.

Theorem 2. If an automaton A has a nonempty safe closed unfolding then L(A) = ∅.
We describe the semi-algorithm used to check emptiness of first-order alternating automata. The execution of Algorithm 1 consists of three phases, corresponding to the Close, Refine and Expand phases of the original IMPACT procedure [20]. Let n be a node removed from the worklist at line 2 and let α(n) be the input sequence labeling the path from the root node to n. If Υ(α(n)) is satisfiable, the sequence α(n) is feasible, in which case a model of Υ(α(n)) is obtained and a word w ∈ L(A) is returned. Otherwise, α(n) is an infeasible input sequence and the procedure enters the refinement phase (lines 9-19). The GLI for α(n) is used to strengthen the labels of all the ancestors of n, by conjoining the formulae of the interpolant, changed according to Lemma 7, to the existing labels.
In this process, the nodes on the path between r and n, including n, might become eligible for coverage, therefore we attempt to close each ancestor of n that is impacted by the refinement (line 19). Observe that, in this case, the call to Close must uncover each node which is covered by a successor of n (line 30 of the Close function). This is required because, due to the over-approximation of the sets of reachable configurations, the covering relation is not transitive, as explained in [20]. If Close adds a covering edge $(n_i, m)$ to the covering relation, it does not have to be called for the successors of $n_i$ on this path, which is handled via the boolean flag b. Finally, if n is still uncovered (it has not been previously covered during the refinement phase) we expand n (lines 21-25) by creating a new node for each successor s via the input event a ∈ Σ and inserting it into the worklist.

Interpolant Generation
Typically, when checking the unreachability of a set of program configurations, the interpolants used to annotate the unfolded control structure are assertions about the values of the program variables in a given control state, at a certain step of an execution [20]. Because we consider alternating computation trees (forests), we must distinguish between (i) locality of interpolants w.r.t. a given control state (control locality) and (ii) locality w.r.t. a given time stamp (time locality). In logical terms, control-local interpolants are formulae involving a single predicate symbol, whereas time-local interpolants involve only predicates $q^{(i)}$ and variables $x^{(i)}$, for a single i ≥ 0. When considering alternating executions, control-local interpolants are not always enough to prove emptiness, because of the synchronization of several branches of the computation on the same input word. For this reason, the interpolants considered in this paper will never be control-local and we shall use the term local to denote time-local interpolants, with no free variables.
First, let us give the formal definition of the class of interpolants we shall work with. Given a formula φ, the vocabulary of φ, denoted V(φ), is the set of predicate symbols $q \in Q^{(i)}$ and variables $x \in X^{(i)}$, occurring in φ, for some i ≥ 0. For a term t, its vocabulary V(t) is the set of variables that occur in t. Observe that quantified variables and the interpreted function symbols of the data theory do not belong to the vocabulary of a formula. By $P^+(\phi)$ [$P^-(\phi)$] we denote the set of predicate symbols that occur in φ under an even [odd] number of negations.
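The polarity sets $P^+(\phi)$ and $P^-(\phi)$ can be computed by a single traversal of the formula. The sketch below uses a hypothetical tuple encoding restricted to the core connectives (equality, predicate atoms, negation, conjunction and existential quantification), so disjunction and universal quantification must be given in their unfolded form; the names `polarities` and `is_positive` are assumptions of this example.

```python
def polarities(phi):
    """Return (P_plus, P_minus): predicate names occurring under an even,
    respectively odd, number of negations."""
    tag = phi[0]
    if tag == "pred":            # ("pred", name, args)
        return {phi[1]}, set()
    if tag == "eq":              # ("eq", t, s): no uninterpreted predicate
        return set(), set()
    if tag == "not":             # ("not", sub): a negation swaps polarities
        plus, minus = polarities(phi[1])
        return minus, plus
    if tag == "and":             # ("and", left, right)
        p1, m1 = polarities(phi[1])
        p2, m2 = polarities(phi[2])
        return p1 | p2, m1 | m2
    if tag == "exists":          # ("exists", var, sub)
        return polarities(phi[2])
    raise ValueError(f"unknown formula tag: {tag!r}")


def is_positive(phi):
    """A formula is positive if no predicate occurs under an odd number of negations."""
    return not polarities(phi)[1]


# q(x) or p(y), written with the core connectives as not(not q(x) and not p(y)).
disjunction = ("not", ("and", ("not", ("pred", "q", ("x",))),
                              ("not", ("pred", "p", ("y",)))))
print(polarities(disjunction), is_positive(disjunction))
```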
Proposition 1. If there exists a Lyndon interpolant for any two formulae φ and ψ, in the first-order theory of data with uninterpreted predicate symbols, such that φ ∧ ψ is unsatisfiable, then any sequence of input events $\alpha = a_1 \ldots a_n \in \Sigma^*$, such that Υ(α) is unsatisfiable, has a local GLI $(I_0, \ldots, I_n)$.
A problematic point of the above proposition is that the existence of Lyndon interpolants (Definition 4) is proved in principle, but the proof is non-constructive. In other words, the proof of Proposition 1 does not yield an algorithm for computing GLIs, for the following reason. Building an interpolant for an unsatisfiable conjunction of formulae φ ∧ ψ is typically the job of the decision procedure that proves the unsatisfiability and, in general, there is no such procedure, when φ and ψ contain predicates and have non-trivial quantifier alternation. In this case, some provers use instantiation heuristics for the universal quantifiers that are sufficient for proving unsatisfiability; however, these heuristics are not always suitable for interpolant generation. Consequently, from now on, we assume the existence of an effective Lyndon interpolation procedure only for decidable theories, such as the quantifier-free linear (integer) arithmetic with uninterpreted functions (UFLIA, UFLRA, etc.) [26].
This is where the predicate-free path formulae (defined in §2.1) come into play. Recall that, for a given event sequence α, the automaton A accepts a word w such that $w_\Sigma = \alpha$ if and only if Υ(α) is satisfiable (Lemma 5). Assuming further that the equality atoms and the interpreted predicate atoms (e.g. inequalities over the integers) from the transition rules of A belong to a decidable first-order theory, such as Presburger arithmetic, Lemma 5 gives us an effective way of checking emptiness of A, relative to a given event sequence. However, this method does not cope well with lazy annotation, because there is no way to extract, from the unsatisfiability proof of Υ(α), the interpolants needed to annotate α. This is because (I) the formula Υ(α), obtained by repeated substitutions, loses track of the steps of the execution, and (II) quantifiers that occur nested in Υ(α) make it difficult to write Υ(α) as an unsatisfiable quantifier-free conjunction of formulae from which interpolants are extracted (Definition 4).
The solution we adopt for the first issue (I) consists in partially recovering the timestamped structure of the acceptance formula Υ(α) using the formula Υ(α), in which only transition quantifiers occur. The second issue (II) is solved under the additional assumption that the theory of the data domain D has witness-producing quantifier elimination. More precisely, we assume that, for each formula ∃x . φ(x), there exists an effectively computable term τ, in which x does not occur, such that ∃x . φ and φ[τ/x] are equisatisfiable. These terms, called witness terms in the following, are actual definitions of the Skolem function symbols from the following folklore theorem:
Theorem 3 ([3]). Given Q_1 x_1 . . . Q_n x_n . φ a first-order sentence, where Q_1, . . ., Q_n ∈ {∃, ∀} and φ is quantifier-free, let η_i def= f_i(y_1, . . ., where f_i is a fresh function symbol and {y_1, . . .,
Examples of witness-producing quantifier elimination procedures can be found in the literature for, e.g., linear integer (real) arithmetic (LIA, LRA), Presburger arithmetic and boolean algebra of sets and Presburger cardinality constraints (BAPA) [18].
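To make the notion of a witness term concrete, the following minimal sketch handles one very restricted case over the rationals: a formula ∃x . φ in which φ is a conjunction of strict bounds l < x and x < u, with at least one bound of each kind. The midpoint of the largest lower bound and the smallest upper bound can then play the role of τ. The names witness_term and holds, and the encoding of terms as Python functions over an environment, are assumptions made for this illustration only; they are not taken from the procedures cited above.

```python
from fractions import Fraction


def witness_term(lowers, uppers):
    """Witness for  exists x . AND(l < x) AND AND(x < u)  over the rationals.

    lowers, uppers: non-empty lists of functions env -> Fraction, standing
    for terms in which x does not occur (hypothetical encoding).
    Returns a function env -> Fraction that plays the role of tau.
    """
    def tau(env):
        lo = max(l(env) for l in lowers)   # tightest lower bound
        hi = min(u(env) for u in uppers)   # tightest upper bound
        return (lo + hi) / 2               # lies strictly between them iff lo < hi
    return tau


def holds(lowers, uppers, x, env):
    """Evaluate the quantifier-free body phi for a concrete value of x."""
    return all(l(env) < x for l in lowers) and all(x < u(env) for u in uppers)


# Example: exists x . y < x AND x < z, with tau = (y + z) / 2
y = lambda env: env["y"]
z = lambda env: env["z"]
tau = witness_term([y], [z])
env = {"y": Fraction(1), "z": Fraction(3)}
print(tau(env), holds([y], [z], tau(env), env))
```

In this fragment, φ[τ/x] holds under an assignment exactly when some value of x satisfies φ under that assignment, because the rationals are densely ordered. The same midpoint construction would not be correct over the integers.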
Under the assumption that witness terms can be effectively built, we describe the generation of a non-local GLI for a given input event sequence α = a_1 . . . a_n. First, we generate successively the acceptance formula Υ(α) and its equisatisfiable forms both written in prenex form, with matrices Φ and Φ, respectively. Because we assumed that the first-order theory of D has quantifier elimination, the satisfiability problem for Υ(α) is decidable. If Υ(α) is satisfiable, we build a counterexample for emptiness w such that w_Σ = α and w_D is a satisfying assignment for Υ(α). Otherwise, Υ(α) is unsatisfiable and there exist witness terms τ i 1 . . .τ i , where {i 1 , . . .
is unsatisfiable, and 2. a GLI (I_0, . . ., I_n) for α, such that
Consequently, under two assumptions about the first-order theory of the data domain, namely (i) witness-producing quantifier elimination, and (ii) Lyndon interpolation for the quantifier-free fragment with uninterpreted functions, we developed a generic method that produces GLIs for unfeasible input event sequences. Moreover, each formula in the interpolant refers only to the current predicate symbols, the current and past input variables and the existentially quantified transition variables introduced at the previous steps. The remaining questions are how to use these GLIs to label the sequences in the unfolding of an automaton (Definition 2) and how to compute coverage (Definition 3) between nodes of the unfolding.

Unfolding with Non-local Interpolants
As required by Definition 2, the unfolding U of an automaton A = ⟨Σ, X, Q, ι, F, ∆⟩ is labeled by formulae U(α) ∈ Form^+(Q, ∅), with no free symbols other than predicate symbols, such that the labeling is compatible with the transition relation of the automaton. Each newly expanded input sequence of A is initially labeled with and the labels are refined using GLIs computed from proofs of spuriousness. The following lemma describes the refinement of the labeling of an input sequence by a non-local GLI:
Lemma 7. Let U be an unfolding of an automaton A = ⟨Σ, X, Q, ι, F, ∆⟩ such that α = a_1 . . . a_n ∈ dom(U) and (I_0, . . ., I_n) is a GLI for α. Then the mapping U : dom(U) → Form^+(Q, ∅) is an unfolding of A, where: - , where J_k is the formula obtained from I_k by removing the time stamp of each predicate symbol q^(k) and existentially quantifying each free variable, and
Observe that, by Lemma 6 (2), the set of free variables of a GLI formula I_k consists of (i) variables X^(≤k) keeping track of data values seen in the input at some earlier moment in time, and (ii) variables that track past choices made within the transition rules. Basically, it is not important when exactly in the past a certain input has been read or when a choice has been made, because only the relation between the values of these and the current variables determines the future behavior of the automaton. Quantifying these variables existentially does the job of ignoring when exactly in the past these values have been seen. Moreover, the last point of Lemma 7 ensures that the refined path is safe in the new unfolding and will stay safe in all future refinements of this unfolding.
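The passage from an interpolant I_k to the formula J_k can be illustrated on a toy formula representation. The sketch below is only an illustration under stated assumptions: formulae are nested tuples, predicate atoms are written ("pred", name, stamp, args), data atoms are ("eq", a, b) or ("lt", a, b) over variable names, and the function names strip_stamps, free_vars and generalize are hypothetical. It does not reproduce the data structures of the tool described in this paper.

```python
def strip_stamps(f):
    """Drop the time stamp from every predicate atom of formula f."""
    tag = f[0]
    if tag == "pred":
        _, name, _stamp, args = f
        return ("pred", name, args)
    if tag in ("and", "or"):
        return (tag, strip_stamps(f[1]), strip_stamps(f[2]))
    if tag == "exists":
        return ("exists", f[1], strip_stamps(f[2]))
    return f  # data atoms such as ("eq", a, b) carry no time stamp


def free_vars(f):
    """Free variables of f; all terms are assumed to be variable names."""
    tag = f[0]
    if tag == "pred":
        return set(f[-1])            # argument tuple is always the last field
    if tag in ("eq", "lt"):
        return {f[1], f[2]}
    if tag in ("and", "or"):
        return free_vars(f[1]) | free_vars(f[2])
    if tag == "exists":
        return free_vars(f[2]) - {f[1]}
    raise ValueError(f"unknown node: {tag}")


def generalize(interpolant):
    """Remove time stamps, then existentially close the result."""
    body = strip_stamps(interpolant)
    for v in sorted(free_vars(body), reverse=True):
        body = ("exists", v, body)   # outermost quantifier: smallest name
    return body


# Toy interpolant: q^(2)(y) AND y = x1
I_2 = ("and", ("pred", "q", 2, ("y",)), ("eq", "y", "x1"))
print(generalize(I_2))
```

The result has no free variables and mentions only unstamped predicate symbols, which matches the shape required of the labels in Form^+(Q, ∅); negation is deliberately left out of the toy syntax.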
The last ingredient of the lazy annotation semi-algorithm based on unfoldings consists in the implementation of the coverage check, when the unfolding of an automaton is labeled with conjunctions of existentially quantified formulae with predicate symbols, obtained from interpolation. By Definition 3, checking whether a given node α ∈ dom(U) is covered amounts to finding a prefix α′ of α and a node β ∈ dom(U) such that U(α′) ⊨ U(β) or, equivalently, such that the formula U(α′) ∧ ¬U(β) is unsatisfiable. However, the latter formula, in prenex form, has quantifier prefix in the language ∃*∀* and, as previously mentioned, the satisfiability problem for such formulae becomes undecidable when the data theory subsumes Presburger arithmetic [10].
Nevertheless, if we require just a yes/no answer (i.e. not an interpolant), recently developed quantifier instantiation heuristics [25] perform rather well in answering a large number of queries in this class. Observe, moreover, that coverage does not need to rely on a complete decision procedure. If the prover fails to answer the above satisfiability query, then the semi-algorithm assumes that the node is not covered and continues exploring its successors. Failure to compute complete coverage may lead to divergence (non-termination) and, ultimately, to failure to prove emptiness, but does not affect the soundness of the semi-algorithm (real counterexamples will still be found).
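The treatment of inconclusive prover answers can be sketched as follows. This is a minimal illustration, not the implementation of the tool: the Answer type, the function covers and the check callback (standing for a prover query on the formula U(α′) ∧ ¬U(β)) are hypothetical names. Nodes are encoded as tuples of input events. Skipping the case α′ = β, and leaving the choice of candidate nodes β to the caller, are further assumptions of this sketch.

```python
from enum import Enum


class Answer(Enum):
    UNSAT = "unsat"
    SAT = "sat"
    UNKNOWN = "unknown"   # prover timed out or gave up


def covers(alpha, beta, labels, check):
    """Return True only if some prefix of alpha provably entails beta's label.

    alpha, beta: tuples of input events (nodes of the unfolding)
    labels:      dict mapping nodes to their labels
    check(phi, psi): hypothetical prover call for the query  phi AND NOT psi
    """
    for i in range(len(alpha) + 1):
        prefix = alpha[:i]
        if prefix == beta or prefix not in labels:
            continue  # skip trivial self-entailment (assumption of this sketch)
        if check(labels[prefix], labels[beta]) is Answer.UNSAT:
            return True
    # SAT and UNKNOWN are treated alike: the node is not considered covered
    return False
```

Because only an UNSAT answer marks a node as covered, an incomplete prover can make the exploration expand more nodes than necessary, but it can never hide a reachable counterexample.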

Experimental Results
We have implemented a version of the IMPACT semi-algorithm [20] in a prototype tool, available online [8]. The tool is written in Java and uses the Z3 SMT solver [27], via the JavaSMT interface [15], for spuriousness and coverage queries and also for interpolant generation. Table 1 reports the size of the input automaton in bytes, the numbers of Predicates, Variables and Transitions, the result of the emptiness check, the number of Expanded and Visited Nodes during the unfolding and the Time in milliseconds. The experiments were carried out on a MacOS x64 -1. The test cases shown in Table 1 come from several sources, namely predicate automata models (*.pa) [6,7] available online [23], timed automata inclusion problems (abp.ada, train.ada, rr-crossing.foada), array logic entailments (array rotation.ada, array simple.ada, array shift.ada) and hardware circuit verification (hw1.ada, hw2.ada), initially considered in [13], with the restriction that local variables are made visible in the input. The train-simpleN.foada and fischer-mutexN.foada examples are parametric verification problems in which one checks inclusions of the form N i=1 L(A i ) ⊆ L(B), where A_i is the i-th copy of the template automaton. The advantage of using FOADA over the INCLUDER [12] tool from [13] is the possibility of having automata over infinite alphabets with local variables, whose values are not visible in the input. In particular, this is essential for checking inclusion of timed automata that use internal clocks to control the computation.

Conclusions
We present first-order alternating automata, a model of computation that generalizes classical boolean alternating automata to first-order theories. Due to their expressivity, first-order alternating automata are closed under union, intersection and complement. However, the emptiness problem is undecidable even in the simplest case, that of the quantifier-free theory of equality with uninterpreted predicate symbols. We deal with the emptiness problem by developing a practical semi-algorithm that always terminates when the automaton is not empty. In case of emptiness, termination of the semi-algorithm occurs in most practical test cases, as shown by a number of experiments.

Definition 3. Given an unfolding U of an automaton A = ⟨Σ, X, Q, ι, F, ∆⟩, a node α ∈ dom(U) is covered by another node β ∈ dom(U), denoted α β, if and only if there exists a node α′ that is a prefix of α such that U(α′) ⊨ U(β). Moreover, U is closed if and only if every leaf from dom(U) is covered by an uncovered node.