Push-Down Automata with Gap-Order Constraints
Abstract.
We consider pushdown automata with data (Pdad) that operate on variables ranging over the set of natural numbers. The conditions on variables are defined via gap-order constraints. Gap-order constraints allow comparing variables for equality, or checking that the gap between the values of two variables exceeds a given natural number. Each message inside the stack is equipped with a natural number that represents its “value”. When a message is pushed to the stack, its value may be defined by a variable in the program. When a message is popped, its value may be copied to a variable. Thus, we obtain a system that is infinite in two dimensions, namely we have a stack that may contain an unbounded number of messages, each of which is equipped with a natural number. We present an algorithm for solving the control state reachability problem for Pdad based on two steps. We first provide a translation to the corresponding problem for context-free grammars with data (Cfgd). Then, we use ideas from the framework of well quasi-orderings in order to obtain an algorithm for solving the reachability problem for Cfgds.
1 Introduction
Model checking has become one of the main techniques for algorithmic verification of computer systems. The original applications were found in the context of finite-state systems, such as hardware circuits, where the behavior of the system can be captured by a finite state machine. In the last two decades, there has also been a large amount of work devoted to extending model checking so that it can handle models with infinite state spaces such as Petri nets, timed automata, pushdown systems, counter automata, and channel machines. Recent works have considered systems that are infinite in multiple dimensions. For instance, many classes of timed protocols are parameterized (consist of unbounded numbers of components), and hence they can be naturally modeled by timed Petri nets [10]. Also, many message passing protocols have behaviors that are constrained by timing conditions, giving rise to timed channel systems [5].
In particular, Push-Down Automata (Pda) have been studied extensively as a model for the analysis of recursive programs (e.g., [12, 23, 25, 33]). The model of Pda has been extended to allow quantitative reasoning with respect to time [1] and probabilities [24, 26]. However, all existing models assume finite-state control, which means that variables in the program are assumed to range over finite domains. In this paper, we consider an extension of Pda, which we call Pdad, that strengthens the model in two ways. First, in addition to the stack, a Pdad also operates on a number of variables ranging over the natural numbers. Furthermore, each message inside the stack is equipped with a natural number which represents its “value”. Thus, we get a model that is possibly unbounded in two dimensions, namely we have an unbounded number of messages inside the stack, each of which has an attribute that is a natural number. The operations allowed on the stack are the standard push and pop operations. However, when pushing a symbol to the stack, its value may be defined to be the value of a program variable. Also, when a message is popped, its value may be copied to a variable. A Pdad allows comparing the values of variables according to the gap-order constraint system, where two variables may be tested for equality, or checked for a minimal gap (defined by a natural number) between their values. Also, a variable may be assigned a new arbitrary value, the value of another variable, or a value that is at least some (given) natural number larger than the value of another variable. In this manner, the model of Pdad subsumes two known models, namely that of Pda (which we get by removing the variables in the program and by neglecting the values of the symbols in the stack), and the model of Integral Relational Automata [15] (which we get by removing the stack).
In this paper, we show decidability of the control reachability problem for Pdad. Given a control (local) state of the automaton, we check whether the automaton reaches the state from its initial configuration. We solve the problem in two steps. We introduce a class of Context-Free Grammars with Data (Cfgd). In a Cfgd, each nonterminal has an arity. The grammar generates terms, each of which is either a terminal or a nonterminal equipped with a tuple of natural numbers (as many as its arity). An application of a production rewrites a term to a set of terms. Such an application is constrained by the arguments of the involved nonterminals. The constraints are defined by gap-order conditions. For Cfgd, we solve a reachability problem in which we ask whether it is possible to derive a set of terms each of which is a terminal belonging to a given set of terminals. In the first step of our method, we give a reachability analysis algorithm that solves the above-mentioned problem for Cfgds.
The algorithm is based on a constraint representation of infinite sets of terms, and it is formulated within the framework of well-structured transition systems [4, 6].
The second step of our method translates a given Pdad into a Cfgd so as to exploit the corresponding reachability analysis procedure to solve control state reachability for Pdads.
To our knowledge, our result yields a new decidable fragment of pushdown automata with data (see Section 10).
2 Preliminaries
In this section, we introduce some notations and definitions that we will use in the rest of the paper. We use \(\mathbb{N }\) to denote the set of natural numbers.
We fix a finite set \(\mathcal{V }\) of variables that range over \(\mathbb{N }\). A valuation is a mapping \({{ Val}}:{\mathcal{V }}\rightarrow {\mathbb{N }}\), i.e., it assigns a natural number to each variable. Given a variable \(x\in \mathcal{V }\), a natural number \(c\in \mathbb{N }\), and a valuation \({{ Val}}:{\mathcal{V }}\rightarrow {\mathbb{N }}\), we use \({ Val}\left[ {x}\leftarrow {c}\right] \) to denote the valuation \({ Val}'\) defined as follows: \({ Val}'(x)=c\), and \({ Val}'(y)={ Val}(y)\) for all \(y\in (\mathcal{V }\setminus \{x\})\).
A renaming is a mapping \({{ Ren}}:{\mathcal{V }}\rightarrow {\mathcal{V }}\), i.e., it renames each variable to another one. A renaming \({ Ren}\) does not need to be injective, i.e., several variables may be renamed to the same variable by \({ Ren}\). We say that \({ Ren}\) is a renaming for \(W\) if \({ Ren}\left( {x}\right) \in W\) for all \(x\in \mathcal{V }\).
For a set \(A\), we use \({A}^*\) to denote the set of finite words over \(A\). We use \(\epsilon \) to denote the empty word. For words \(\alpha _1,\alpha _2\in {A}^*\), we use \(\alpha _1\cdot \alpha _2\) to denote the concatenation of \(\alpha _1\) and \(\alpha _2\).
A transition system is a tuple \(\left\langle {\Upsilon ,\gamma _{ init},\overset{{}}{\longrightarrow }}\right\rangle \) where \(\Upsilon \) is a (potentially infinite) set of configurations, \(\gamma _{ init}\in \Upsilon \) is the initial configuration, and \(\overset{{}}{\longrightarrow }\subseteq \Upsilon \times \Upsilon \) is the transition relation. As usual, we write \(\gamma \overset{{}}{\longrightarrow }\gamma '\) to denote that \(\left\langle {\gamma ,\gamma '}\right\rangle \in \overset{{}}{\longrightarrow }\), and use \(\overset{{*}}{\longrightarrow }\) to denote the reflexive transitive closure of \(\overset{{}}{\longrightarrow }\). For a configuration \(\gamma \in \Upsilon \) and a set \(\Gamma \subseteq \Upsilon \) of configurations, we use \(\gamma \overset{{*}}{\longrightarrow }\Gamma \) to denote that \(\gamma \overset{{*}}{\longrightarrow }\gamma '\) for some \(\gamma '\in \Gamma \).
3 PushDown Automata with Data
In this section, we introduce Push-Down Automata with Data (Pdad) that are extensions of the classical model of Push-Down Automata (Pda). First, we define the model, then we define the operational semantics, i.e., the transition system induced by a Pdad, and finally we introduce the reachability problem. As in the case of a Pda, a Pdad operates on an unbounded stack to which it can push (append) messages and from which it can pop (remove) messages in a last-in-first-out manner. The messages are chosen from a finite alphabet. Pdads extend Pdas in two ways. First, in addition to the stack, the automaton is equipped with a finite set of variables ranging over natural numbers. Second, each message inside the stack is equipped with a natural number that represents its “value”. The allowed operations on variables are defined by the gap-order constraint system [15, 31]. More precisely, the model allows nondeterministic value assignment, copying the value of one variable to another, and assignment of a value \(v\) to some variable such that \(v\) exceeds the current value of another variable by more than a given natural number. The transitions may be conditioned by tests that compare the values of two variables for equality, or that give the minimal allowed gap between two variables. A push operation may copy the value of a variable to the pushed message, and a pop operation may copy the value of the popped message to a variable.
Model. A Pdad \(\mathcal{A }\) is a tuple \(\left\langle {Q,q_{ init},A,\Delta }\right\rangle \) where \(Q\) is the finite set of states, \(q_{ init}\in Q\) is the initial state, \(A\) is the stack alphabet, and \(\Delta \) is the transition relation. We remark that the stack alphabet is infinite since it consists of pairs \(\left\langle {a,\ell }\right\rangle \) where \(a\) is taken from a finite set and \(\ell \) is a natural number. A transition \(\delta \in \Delta \) is a triple \(\left\langle {q_1,{ op},q_2}\right\rangle \) where \(q_1,q_2\in Q\) are states and \({ op}\) is an operation of one of the following forms: (i) \({ nop}\) is an empty operation that does not change the values of the variables or the content of the stack, (ii) \(x\leftarrow *\) assigns nondeterministically an arbitrary value in \(\mathbb{N }\) to the variable \(x\), (iii) \(y\leftarrow x\) copies the value of variable \(x\) to \(y\), (iv) \(y\leftarrow \left( >_cx\right) \) assigns nondeterministically to \(y\) a value that exceeds the current value of \(x\) by more than \(c\) (so the new value of \(y\) is \(>x+c\)), (v) \(y=x\) checks whether the value of \(y\) is equal to the value of \(x\), (vi) \(x<_{c}y\) checks whether the gap between the values of \(y\) and \(x\) is larger than \(c\), (vii) \({ push}\left( a\right) \left( x\right) \) pushes the symbol \(a\in A\) to the stack and assigns to it the value of \(x\), and (viii) \({ pop}\left( a\right) \left( x\right) \) pops the symbol \(a\in A\) (if \(a\) is the topmost symbol on the stack) and assigns its value to the variable \(x\).
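Transition System. A Pdad induces a transition system as follows. A configuration \(\gamma \) is a triple \(\left\langle {q,{ Val},\alpha }\right\rangle \) where \(q\in Q\) is a state, \({ Val}:\mathcal{V }\rightarrow \mathbb{N }\) is a valuation, and \(\alpha \in {\left( A\times \mathbb{N }\right) }^*\) defines the content of the stack (each element of the word is a pair \(\left\langle {a,c}\right\rangle \) where \(a\) is the symbol and \(c\) is its value). For a transition \(\left\langle {q,{ op},q'}\right\rangle \in \Delta \), we have \(\left\langle {q,{ Val},\alpha }\right\rangle \overset{{}}{\longrightarrow }\left\langle {q',{ Val}',\alpha '}\right\rangle \) if one of the following conditions holds: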

\({ op}\) is \({ nop}\), \({ Val}'={ Val}\), and \(\alpha '=\alpha \). The values of the variables and the stack content are not changed.

\({ op}\) is \(x\leftarrow *\), \({ Val}'={ Val}\left[ {x}\leftarrow {c}\right] \) where \(c\in \mathbb{N }\), and \(\alpha '=\alpha \). The value of the variable \(x\) is changed nondeterministically to some natural number. The values of the other variables and the stack content are not changed.

\({ op}\) is \(y\leftarrow x\), \({ Val}'={ Val}\left[ {y}\leftarrow {{ Val}\left( {x}\right) }\right] \), and \(\alpha '=\alpha \). The value of the variable \(x\) is copied to the variable \(y\). The values of the other variables and the stack content are not changed.

\({ op}\) is \(y\leftarrow \left( >_cx\right) \), \({ Val}'={ Val}\left[ {y}\leftarrow {c'}\right] \), where \(c'>{ Val}\left( {x}\right) +c\), and \(\alpha '=\alpha \). The variable \(y\) is assigned nondeterministically a value that exceeds the value of \(x\) by more than \(c\). The values of the other variables and the stack content are not changed.

\({ op}\) is \(y=x\), \({ Val}\left( {y}\right) ={ Val}\left( {x}\right) \), \({ Val}'={ Val}\), and \(\alpha '=\alpha \). The transition is only enabled if the value of \(y\) is equal to the value of \(x\). The values of the variables and the stack content are not changed.

\({ op}\) is \(x<_{c}y\), \({ Val}\left( {y}\right) >{ Val}\left( {x}\right) +c\), \({ Val}'={ Val}\), and \(\alpha '=\alpha \). The transition is only enabled if the value of \(y\) is larger than the value of \(x\) by more than \(c\). The values of the variables and the stack content are not changed.

\({ op}\) is \({ push}\left( a\right) \left( x\right) \), \({ Val}'={ Val}\), and \(\alpha '=\left\langle {a,{ Val}\left( {x}\right) }\right\rangle \cdot \alpha \). The symbol \(a\) is pushed onto the stack with a value equal to that of \(x\).

\({ op}\) is \({ pop}\left( a\right) \left( x\right) \), \(\alpha =\left\langle {a,c}\right\rangle \cdot \alpha '\) for some \(c\in \mathbb{N }\), and \({ Val}'={ Val}\left[ {x}\leftarrow {c}\right] \). The symbol \(a\) is popped from the stack (if it is the topmost symbol), and its value is copied to the variable \(x\).
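The eight cases above can be sketched as a single successor function. This is only an illustrative sketch: the tuple encoding of operations (`"havoc"`, `"assign_gt"`, and so on), the representation of the stack as a list with the topmost message first, and the `choice` parameter that resolves the nondeterministic assignments are all assumptions made for this example and are not part of the model definition.

```python
from typing import Dict, List, Optional, Tuple

Valuation = Dict[str, int]
Stack = List[Tuple[str, int]]            # topmost message first
Config = Tuple[str, Valuation, Stack]    # (state, valuation, stack content)


def step(config: Config, transition, choice: Optional[int] = None) -> Optional[Config]:
    """Return the successor configuration, or None if the transition is disabled.

    `choice` (hypothetical) fixes the value picked by `x <- *` and `y <- (>_c x)`.
    """
    q, val, stack = config
    q1, op, q2 = transition
    if q != q1:
        return None
    val, stack = dict(val), list(stack)  # work on copies
    kind = op[0]
    if kind == "nop":
        pass
    elif kind == "havoc":                # x <- *
        _, x = op
        if choice is None or choice < 0:
            return None
        val[x] = choice
    elif kind == "copy":                 # y <- x
        _, y, x = op
        val[y] = val[x]
    elif kind == "assign_gt":            # y <- (>_c x): new value must be > Val(x) + c
        _, y, c, x = op
        if choice is None or choice <= val[x] + c:
            return None
        val[y] = choice
    elif kind == "eq":                   # y = x
        _, y, x = op
        if val[y] != val[x]:
            return None
    elif kind == "gap":                  # x <_c y: enabled iff Val(y) > Val(x) + c
        _, x, c, y = op
        if not val[y] > val[x] + c:
            return None
    elif kind == "push":                 # push(a)(x)
        _, a, x = op
        stack.insert(0, (a, val[x]))
    elif kind == "pop":                  # pop(a)(x): a must be the topmost symbol
        _, a, x = op
        if not stack or stack[0][0] != a:
            return None
        _, c = stack.pop(0)
        val[x] = c
    else:
        raise ValueError(f"unknown operation {op!r}")
    return (q2, val, stack)
```

For instance, starting from `("q0", {"x": 0, "y": 0}, [])`, the operation `("assign_gt", "y", 2, "x")` with `choice=3` is enabled, whereas `choice=2` is rejected because the new value must be strictly greater than `0 + 2`.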
For a configuration \(\gamma \) and a state \(q\in Q\), we write \(\gamma \overset{{*}}{\longrightarrow }q\) to denote that \(\gamma \overset{{*}}{\longrightarrow } \gamma '=\left\langle {q,{ Val},\alpha }\right\rangle \) for some \({ Val}:\mathcal{V }\rightarrow \mathbb{N }\) and \(\alpha \in {\left( A\times \mathbb{N }\right) }^*\).
In other words, from \(\gamma \) we can reach a configuration whose state is \(q\).
Reachability Problem. In the reachability problem PdadReach, given a Pdad \(\mathcal{A }=\left\langle {Q,q_{ init},A,\Delta }\right\rangle \) and a state \(q_{ target}\in Q\), we ask whether \(\gamma _{ init}\overset{{*}}{\longrightarrow }q_{ target}\).
4 ContextFree Grammars with Data
In this section, we introduce Context-Free Grammars with Data (Cfgd) that are extensions of the classical model of Context-Free Grammars (Cfg) in which (terminal and nonterminal) symbols are defined by terms with free variables and productions have conditions defined by gap-order constraints. We define the model, the operational semantics, and the reachability problem.
Model. A Context-Free Grammar with Data (Cfgd) is a tuple \(\mathcal{G }=\left\langle {\mathcal{S },X_{ init},P}\right\rangle \), where \(\mathcal{S }\) is a finite set of symbols, \(X_{ init}\in \mathcal{S }\) is the start (or initial) symbol, and \(P\) is the set of productions. Each symbol \(X\) has an arity \(\rho \left( {X}\right) \in \mathbb{N }\). Without loss of generality, we assume that \(\rho \left( {X_{ init}}\right) =1\). A term has the form \(X(x_1,\ldots ,x_n)\) where \(X\in \mathcal{S }\), \(\rho \left( {X}\right) =n\) and \(x_1,\ldots ,x_n\in \mathcal{V }\) are variables. A ground term has the form \(X(c_1,\ldots ,c_n)\) where \(X\in \mathcal{S }\), \(\rho \left( {X}\right) =n\) and \(c_1,\ldots ,c_n\in \mathbb{N }\) are natural numbers. For a term \(\sigma \) of the form \(X(x_1,\ldots ,x_n)\) we define \({ Sym}\left( \sigma \right) =X\) and \({ Var}\left( \sigma \right) =\{x_1,\ldots ,x_n\}\). We define \({ Sym}\left( \sigma \right) \) for a ground term \(\sigma \) similarly. A (ground) sentence \(\alpha \) is a finite set \(\left\{ \sigma _1,\sigma _2,\ldots ,\sigma _n\right\} \), where each \(\sigma _i\) is a (ground) term. We define \({ Sym}\left( \alpha \right) :=\left\{ { Sym}\left( \sigma _1\right) ,\ldots ,{ Sym}\left( \sigma _n\right) \right\} \), i.e., it is the set of symbols that occur in \(\alpha \). For a term \(\sigma =X(x_1,\ldots ,x_n)\) and a valuation \({ Val}\), we define \({ Val}\left( {\sigma }\right) :=X({ Val}\left( {x_1}\right) ,\ldots ,{ Val}\left( {x_n}\right) )\) to be the ground term we get by substituting each variable \(x_i\) in \(\sigma \) by \({ Val}\left( {x_i}\right) \). For a sentence \(\alpha \), we define \({ Val}\left( {\alpha }\right) \) similarly.
A condition \(\theta \) is a finite conjunction of formulas of the forms: \(x<_{c}y\) or \(x=y\), where \(x,y\in \mathcal{V }\) and \(c\in \mathbb{N }\). Here \(x<_{c}y\) stands for \(x+c<y\). Sometimes, we treat a condition \(\theta \) as a set, and write e.g. \((x<_{c}y)\in \theta \) to indicate that \(x<_{c}y\) is one of the conjuncts in \(\theta \). For a valuation \({ Val}\), we use \({ Val}\left( {\theta }\right) \) to denote the result of substituting each variable \(x\) in \(\theta \) by \({ Val}\left( {x}\right) \). We use \({ Val}\models \theta \) to denote that \({ Val}\left( {\theta }\right) \) evaluates to true. We use \({ Var}\left( \theta \right) \) to denote the set of variables that occur in \(\theta \).
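As a small illustration of the satisfaction relation \({ Val}\models \theta \), the following sketch evaluates a condition under a valuation. The encoding of conjuncts as tuples `("lt", x, c, y)` for \(x<_{c}y\) and `("eq", x, y)` for \(x=y\) is an assumption made for this example only.

```python
from typing import Dict, Iterable, Set, Tuple

Conjunct = Tuple  # ("lt", x, c, y) encodes x <_c y; ("eq", x, y) encodes x = y


def satisfies(val: Dict[str, int], theta: Iterable[Conjunct]) -> bool:
    """Check Val |= theta, reading x <_c y as Val(x) + c < Val(y)."""
    for conj in theta:
        if conj[0] == "lt":
            _, x, c, y = conj
            if not val[x] + c < val[y]:
                return False
        elif conj[0] == "eq":
            _, x, y = conj
            if val[x] != val[y]:
                return False
        else:
            raise ValueError(f"unknown conjunct {conj!r}")
    return True


def variables(theta: Iterable[Conjunct]) -> Set[str]:
    """Var(theta): the variables occurring in the condition."""
    result: Set[str] = set()
    for conj in theta:
        if conj[0] == "lt":
            result.update((conj[1], conj[3]))
        else:
            result.update((conj[1], conj[2]))
    return result
```

With `theta = [("lt", "x", 2, "y"), ("eq", "y", "z")]`, the valuation `{"x": 1, "y": 4, "z": 4}` satisfies the condition, while `{"x": 1, "y": 3, "z": 3}` does not, since \(1+2<3\) fails.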
A production \(p\) is of the form \({\sigma }\leadsto {\alpha }\;:\;{\theta }\), where \(\sigma \) is a term, \(\alpha \) is a nonempty sentence, and \(\theta \) is a condition. We often use the notation \({\sigma }\leadsto {\sigma _1\cdots \sigma _n}\;:\;{\theta }\) to denote the production \({\sigma }\leadsto {\{\sigma _1,\ldots ,\sigma _n\}}\;:\;{\theta }\) (i.e. a sequence in the right-hand side denotes a set of terms). We use \(\mathcal{N }\) to denote the set of nonterminals consisting of symbols that occur in the left-hand side of a production (we say that they are defined by a production). We use \(\mathcal{T }\) to denote the set of terminals consisting of symbols that do not occur in the left-hand side of a production. Furthermore, we use \(\mathcal{A }_{\mathcal{T }}\) to denote the set of ground terms with symbols in \(\mathcal{T }\).
Transition System. A configuration \(\gamma \) is a ground sentence. We define a transition relation \(\overset{{}}{\longrightarrow }_{\mathcal{G }}\) on the set of configurations by \(\overset{{}}{\longrightarrow }_{\mathcal{G }}:=\cup _{p\in P}\overset{{p}}{\longrightarrow }\) where \(\overset{{p}}{\longrightarrow }\) represents the effect of applying the production \(p\). More precisely, for a production \(p\in P\) of the form \({\sigma }\leadsto {\alpha }\;:\;{\theta }\), we have \(\gamma _1\overset{{p}}{\longrightarrow }\gamma _2\) if there are a valuation \({ Val}\models \theta \) and a ground sentence \(\alpha '\) such that \(\gamma _1=\alpha ' \cup \{{ Val}\left( {\sigma }\right) \}\) and \(\gamma _2=\alpha ' \cup { Val}\left( {\alpha }\right) \).
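The effect of applying a production can be sketched as follows. The encoding is hypothetical: terms are pairs of a symbol and a tuple of variable names, ground terms are pairs of a symbol and a tuple of numbers, sentences are frozensets, and conjuncts are tuples as in `("lt", x, c, y)` for \(x<_{c}y\). The sketch also fixes one particular choice of the remainder \(\alpha '\), namely \(\gamma _1\) without the rewritten ground term; the definition also allows \(\alpha '\) to contain that term.

```python
from typing import Dict, FrozenSet, Optional, Tuple

Term = Tuple[str, Tuple[str, ...]]       # symbol with variable arguments
Ground = Tuple[str, Tuple[int, ...]]     # symbol with natural-number arguments


def satisfies(val: Dict[str, int], theta) -> bool:
    """Val |= theta for conjuncts ("lt", x, c, y) and ("eq", x, y)."""
    for conj in theta:
        if conj[0] == "lt":
            _, x, c, y = conj
            if not val[x] + c < val[y]:
                return False
        else:
            _, x, y = conj
            if val[x] != val[y]:
                return False
    return True


def instantiate(term: Term, val: Dict[str, int]) -> Ground:
    """Val(sigma): substitute each variable by its value."""
    sym, args = term
    return (sym, tuple(val[x] for x in args))


def apply_production(gamma: FrozenSet[Ground], production, val) -> Optional[FrozenSet[Ground]]:
    """Rewrite Val(sigma) in gamma to Val(alpha); None if the step is not enabled."""
    lhs, rhs, theta = production
    if not satisfies(val, theta):
        return None
    head = instantiate(lhs, val)
    if head not in gamma:
        return None
    rest = gamma - {head}                # one choice of alpha'
    return frozenset(rest | {instantiate(t, val) for t in rhs})
```

For example, a production with left-hand side `("S", ("x", "y"))`, right-hand side `[("S", ("x", "z")), ("a", ("z", "t")), ("S", ("t", "y"))]` and condition `x <_0 z, z <_0 t, t <_0 y` rewrites the configuration `{S(0, 3)}` to `{S(0, 1), a(1, 2), S(2, 3)}` under the valuation `x=0, z=1, t=2, y=3`.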
For a set \(S\) of ground terms, we define \({ Pre}\left( {S}\right) \) to be the set of ground terms \(\sigma \) which can, through the single application of a production, generate a configuration \(\gamma \subseteq S\) (i.e., \({\sigma \overset{{}}{\longrightarrow }_{\mathcal{G }}\gamma }\)). Let \({ Pre}^*\left( {\cdot }\right) \) denote the transitive closure of \({ Pre}\left( {\cdot }\right) \).
We will use the following lemmata later in the paper.
Lemma 1.
Let \(\alpha \) be a ground sentence of \(\mathcal{G }\). Then, if for every ground term \(\sigma \in \alpha \), we have \(\sigma \overset{{*}}{\longrightarrow }_{\mathcal{G }} \alpha ''\) for some ground sentence \(\alpha ''\) such that \({ Sym}\left( \alpha ''\right) \subseteq \mathcal{T }\), then \(\alpha \overset{{*}}{\longrightarrow }_{\mathcal{G }} \alpha '\) for \(\alpha '\) such that \({ Sym}\left( \alpha '\right) \subseteq \mathcal{T }\).
Lemma 2.
Let \(S\) be a set of ground terms and \(\sigma \) be a ground term such that \(\sigma \in { Pre}^*\left( {S}\right) \). If \(\sigma \notin S\) then there is a ground term \(\sigma ' \in ({ Pre}\left( {S}\right) \setminus S)\).
Reachability Problem. In the reachability problem CfgdReach, we are given a Cfgd \(\mathcal{G }=\left\langle {\mathcal{S },X_{ init},P}\right\rangle \) and we are asked the question whether \(X_{ init}(0)\overset{{*}}{\longrightarrow }_{\mathcal{G }}\alpha \) for some ground sentence \(\alpha \) such that \({ Sym}\left( \alpha \right) \subseteq \mathcal{T }\). In other words, we start from a configuration consisting of the start symbol with its parameter set to zero, and ask whether the system can reach a configuration where all its ground terms have symbols in \(\mathcal{T }\).
Cfgd vs Cfg. A Context-Free Grammar (Cfg) is defined by productions of the form \(S\rightarrow w\) where \(w\) is a word over terminal and nonterminal symbols. We can encode a Cfg as a Cfgd by associating to each terminal/nonterminal symbol \(X\) (except the initial one) a term \(X(a,b)\) in which the arguments \((a,b)\) are used to maintain the order of the symbols in the right-hand side of a rule. For instance, the production \(S\rightarrow S a S\) is encoded via the Cfgd production \({S(x,y)}\leadsto {\{S(x,z),a(z,t),S(t,y)\}}\;:\;{x<_{0}z\wedge z<_{0}t\wedge t<_{0}y}\).
Cfgd vs CMRS. Cfgds also differ from the CMRS model [7]. CMRS is obtained by combining multiset rewriting and gap-order constraints, and it is aimed at modeling concurrent processes. CMRS rules have multiple heads and work over multisets of monadic terms (i.e. with a single argument, no nested terms). Differently from CMRS, Cfgd productions have a single term in the left-hand side and a set of terms in the right-hand side. This implies that multiple occurrences (with the same variables) of a term like \(p(x,y)\) are counted only once. Furthermore, nonterminal symbols have arbitrary finite arity.
5 Symbolic Encoding
In this section, we define the symbolic representation used in the definition of the reachability algorithm (Section 6). The algorithm operates on constraints, where each constraint \(\phi \) characterizes a (potentially) infinite set \(\left[\![\phi \right]\!]\) of ground terms. A constraint \(\phi \) is of the form \({\sigma }:{\theta }\) where \(\sigma \) is a term and \(\theta \) is a condition. We define \({ Sym}\left( \phi \right) ={ Sym}\left( \sigma \right) \) and \({ Var}\left( \phi \right) ={ Var}\left( \sigma \right) \cup { Var}\left( \theta \right) \).
Definition 3.
The constraint \(\phi \) characterizes a set of ground terms defined by \(\left[\![\phi \right]\!]= \left\{ \sigma '\mid \exists { Val}.\;({ Val}\models \theta )\wedge (\sigma '={ Val}(\sigma ))\right\} \). For a finite set of constraints \(\Phi \), \(\left[\![\Phi \right]\!]=\bigcup\nolimits _{\phi \in \Phi }\left[\![\phi \right]\!]\).
Without loss of generality, we can assume that \({ Var}\left( \theta \right) ={ Var}\left( \sigma \right) \), and that \(\theta \) is consistent (constraints with inconsistent conditions characterize empty sets of configurations, and can therefore be safely discarded from the reachability analysis). A term \(X(x_1,\ldots ,x_n)\) is said to be pure if \(x_i\ne x_j\) whenever \(i\ne j\). A constraint \({\sigma }:{\theta }\) is said to be pure if \(\sigma \) is pure. We can assume without loss of generality that all constraints are pure. The reason is that if a variable \(x\) occurs (say) twice then the two occurrences of \(x\) can be replaced by two different variables \(y_1\) and \(y_2\) provided that we add a new conjunct \(y_1=y_2\) to the condition \(\theta \). For constraints \(\phi _1,\phi _2\), we use \(\phi _1\sqsubseteq \phi _2\) to denote that \(\phi _1\) subsumes \(\phi _2\), i.e., \(\left[\![\phi _1\right]\!]\supseteq \left[\![\phi _2\right]\!]\). Then, it is easy to see that checking whether \(\phi _1\sqsubseteq \phi _2\) can be reduced to the satisfiability problem for an existential Presburger formula (which is known to be NP-complete [34]).
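The two normalizations above (discarding inconsistent conditions and purifying terms) can be sketched in code. The consistency check below is not taken from the paper: it uses the standard longest-path relaxation for difference constraints, reading \(x<_{c}y\) as \(y\ge x+c+1\), and reports inconsistency when the values never stabilize (a positive cycle). The purification variant keeps the first occurrence of a repeated variable and renames later ones; the fresh-name scheme `x__i` is a hypothetical choice assumed not to clash with existing names.

```python
from typing import Dict, List, Optional, Tuple


def solve(theta) -> Optional[Dict[str, int]]:
    """Return the least valuation over the naturals satisfying theta, or None."""
    edges: List[Tuple[str, str, int]] = []
    least: Dict[str, int] = {}
    for conj in theta:
        if conj[0] == "lt":              # x <_c y  means  y >= x + c + 1
            _, x, c, y = conj
            edges.append((x, y, c + 1))
        elif conj[0] == "eq":            # x = y  means  y >= x and x >= y
            _, x, y = conj
            edges.append((x, y, 0))
            edges.append((y, x, 0))
        else:
            raise ValueError(f"unknown conjunct {conj!r}")
        least.setdefault(x, 0)
        least.setdefault(y, 0)
    # Relax until stable; more rounds than variables implies a positive cycle.
    for _ in range(len(least) + 1):
        changed = False
        for u, v, w in edges:
            if least[u] + w > least[v]:
                least[v] = least[u] + w
                changed = True
        if not changed:
            return least
    return None


def purify(term, theta):
    """Make the term pure by renaming repeated arguments and adding equalities."""
    sym, args = term
    seen, new_args, new_theta = set(), [], list(theta)
    for i, x in enumerate(args):
        if x in seen:
            fresh = f"{x}__{i}"          # hypothetical fresh-name scheme
            new_args.append(fresh)
            new_theta.append(("eq", x, fresh))
        else:
            seen.add(x)
            new_args.append(x)
    return (sym, tuple(new_args)), new_theta
```

A condition such as \(x<_{0}y\wedge y<_{0}x\) is reported as inconsistent, and purifying \(X(x,x,y)\) yields \(X(x,x\_\_1,y)\) together with the extra conjunct \(x=x\_\_1\).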
Lemma 4.
For constraints \(\phi _1,\phi _2\), the problem of checking whether \(\phi _1\sqsubseteq \phi _2\) is decidable.
The following lemma states that we can transform any constraint \(\phi \) of the form \({\sigma }:{\theta }\) to an equivalent constraint \( clean (\phi )\) of the form \({\sigma }:{\theta }'\) such that \({ Var}\left( \theta '\right) ={ Var}\left( \sigma \right) \) (i.e., we remove the extra variables \(({ Var}\left( \theta \right) \setminus { Var}\left( \sigma \right) )\) from \(\theta \) in order to satisfy the assumption that \({ Var}\left( \theta \right) ={ Var}\left( \sigma \right) \)).
Lemma 5.
[31] Given a constraint \(\phi \) of the form \({\sigma }:{\theta }\), we can construct a constraint \( clean (\phi )\) of the form \({\sigma }:{\theta }'\) such that \({ Var}\left( \theta '\right) ={ Var}\left( \sigma \right) \) and \(\left[\![ clean (\phi )\right]\!]=\left[\![\phi \right]\!]\).
Given two terms \(\sigma _1\) and \(\sigma _2\), we say that \(\sigma _1\) matches \(\sigma _2\) iff \({ Sym}\left( \sigma _1\right) = { Sym}\left( \sigma _2\right) \). For matching terms \(\sigma _1=X(x_1,\ldots ,x_n)\) and \(\sigma _2=X(y_1,\ldots ,y_n)\), where \(\sigma _2\) is pure, we define \({ Ren}^{\sigma _2}_{\sigma _1}\) to be a renaming such that \({ Ren}^{\sigma _2}_{\sigma _1}(y_i)=x_i\) for all \(i:1\le i\le n\). Consider a production \(p={\sigma }\leadsto {\sigma _1\cdots \sigma _n}\;:\;{\theta }\) and constraints \(\phi _1={\sigma '_1}:{\theta _1},\ldots , \phi _n={\sigma '_n}:{\theta _n}\) such that \(\sigma _i\) and \(\sigma '_i\) are matching, and such that \(\sigma '_i\) is pure for all \(i:1\le i\le n\). We define \(p\otimes \phi _1\otimes \cdots \otimes \phi _n\) to be the constraint \({\sigma }:{\theta \wedge { Ren}_{\sigma _1}^{\sigma '_1}(\theta _1) \wedge \cdots \wedge { Ren}_{\sigma _n}^{\sigma '_n}(\theta _n)}\). For a set \(\Phi \) of constraints, and production \(p\in P\), we define \({ Pre}_{p}\left( {\Phi }\right) := \left\{ clean (\phi ')\mid \exists \phi _1,\ldots ,\phi _n\in \Phi .\, \phi '=p\otimes \phi _1\otimes \cdots \otimes \phi _n\right\} \). We define \({ Pre}\left( {\Phi }\right) :=\cup _{p\in P}{ Pre}_{p}\left( {\Phi }\right) \). Intuitively, \({ Pre}\left( {\Phi }\right) \) defines a finite set of constraints that characterize the terms which can, through a single application of a production, generate a set of terms each of which belongs to \(\left[\![\Phi \right]\!]\).
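The operation \(p\otimes \phi _1\otimes \cdots \otimes \phi _n\) can be sketched as below. The tuple encoding of terms, productions and conjuncts is an assumption of this example. The sketch relies on the earlier assumption \({ Var}\left( \theta _i\right) ={ Var}\left( \sigma '_i\right) \), so that the renaming is defined on every variable of \(\theta _i\), and it omits the \( clean \) step (elimination of variables not occurring in \(\sigma \)).

```python
from typing import Dict, List, Tuple

Term = Tuple[str, Tuple[str, ...]]


def rename(theta, ren: Dict[str, str]) -> List[Tuple]:
    """Apply a renaming to every variable of a condition."""
    out = []
    for conj in theta:
        if conj[0] == "lt":              # ("lt", x, c, y) encodes x <_c y
            _, x, c, y = conj
            out.append(("lt", ren[x], c, ren[y]))
        else:                            # ("eq", x, y) encodes x = y
            _, x, y = conj
            out.append(("eq", ren[x], ren[y]))
    return out


def combine(production, constraints):
    """Build p (x) phi_1 (x) ... (x) phi_n as a pair (term, condition)."""
    lhs, rhs, theta = production
    if len(rhs) != len(constraints):
        raise ValueError("one constraint per right-hand-side term is required")
    result = list(theta)
    for (sym, xs), ((sym_c, ys), theta_i) in zip(rhs, constraints):
        if sym != sym_c or len(xs) != len(ys):
            raise ValueError("terms do not match")
        if len(set(ys)) != len(ys):
            raise ValueError("constraint term must be pure")
        ren = dict(zip(ys, xs))          # Ren maps y_i to x_i
        result.extend(rename(theta_i, ren))
    return (lhs, result)
```

For the production \({X(x)}\leadsto {Y(x,z)}\;:\;{x<_{0}z}\) and the constraint \({Y(u,v)}:{u<_{3}v}\), the sketch returns the term \(X(x)\) with condition \(x<_{0}z\wedge x<_{3}z\); eliminating \(z\) would then be the task of \( clean \).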
Lemma 6.
\(\bigcup\nolimits _{\phi '\in { Pre}\left( {\Phi }\right) }\left[\![\phi '\right]\!]= { Pre}\left( {\left[\![\Phi \right]\!]}\right) \).
6 Reachability Analysis
In this section, we present an algorithm for solving the reachability problem for Cfgds, and prove its partial correctness. The algorithm (Algorithm 1) inputs a Cfgd \(\mathcal{G }=\left\langle {\mathcal{S },X_{ init},P}\right\rangle \) and answers the question whether we can reach a sentence where all the occurring terms are in \(\mathcal{A }_{\mathcal{T }}\) (i.e. terms with symbols in \(\mathcal{T }\)). The algorithm maintains two sets of constraints: a set \(\mathtt {ToExplore}\), initialized to \(\Phi _\mathcal{T }\), of constraints that have not yet been analyzed; and a set \(\mathtt {Explored}\), initialized to the empty set, of constraints that have already been analyzed.
The algorithm maintains the following invariants:
1. For each \(\sigma \in \left[\![\mathtt {ToExplore}\cup \mathtt {Explored}\right]\!]\), \(\sigma \overset{{*}}{\longrightarrow }{\alpha }\) for some \(\alpha \) s.t. \({ Sym}\left( \alpha \right) \subseteq \mathcal{T }\).
2. If \(X_{ init}(0)\overset{{*}}{\longrightarrow }\alpha \) for some \(\alpha \) s.t. \({ Sym}\left( \alpha \right) \subseteq \mathcal{T }\), then there is a ground term \(\sigma \in \left[\![\mathtt {ToExplore}\right]\!]\) such that \(\sigma \not \in \left[\![\mathtt {Explored}\right]\!]\).
3. \(X_{ init}(0)\not \in \left[\![\mathtt {Explored}\right]\!]\).
4. \(\left[\![\Phi _\mathcal{T }\right]\!]\subseteq \left[\![\mathtt {ToExplore}\cup \mathtt {Explored}\right]\!]\).
It is easy to see that the third and fourth invariants will be preserved. More precisely, for the third invariant, \(\mathtt {Explored}\) is initially empty, and the condition at line 5 prevents adding to \(\mathtt {Explored}\) any constraint whose symbol is \(X_{ init}\) and whose parameter equals \(0\). The fourth invariant holds initially since \(\mathtt{ToExplore}\cup \mathtt {Explored}={\Phi _\mathcal{T }\cup \emptyset }={\Phi _\mathcal{T }}\). This invariant is preserved since each time we remove a constraint \(\phi \) from \(\mathtt {ToExplore}\) (line 4), it is either eventually moved to \(\mathtt {Explored}\) (line 9), or (in case it is discarded at line 6) there is already a constraint \(\phi '\in \mathtt {Explored}\) with \(\left[\![\phi '\right]\!]\supseteq \left[\![\phi \right]\!]\). Also, each time we remove a constraint \(\phi '\) from \(\mathtt {Explored}\) (line 9), we add the constraint \(\phi \) to \(\mathtt {Explored}\) where \(\left[\![\phi \right]\!]\supseteq \left[\![\phi '\right]\!]\).

From the second invariant, if \(\mathtt {ToExplore}\) becomes empty then the algorithm terminates with a negative answer.

From the first invariant, if a constraint \(\phi \) is detected such that \(X_{ init}(0)\in \left[\![\phi \right]\!]\), then the algorithm terminates with a positive answer.

If there exists a constraint \(\phi '\in \mathtt {Explored}\) with \(\phi '\sqsubseteq \phi \), then we discard \(\phi \). The first invariant is preserved since this operation will not add any new elements to \(\left[\![\mathtt {ToExplore}\cup \mathtt {Explored}\right]\!]\). If \(X_{ init}(0)\overset{{*}}{\longrightarrow }\alpha \) for some \(\alpha \) s.t. \({ Sym}\left( \alpha \right) \subseteq \mathcal{T }\), then the second invariant and the fact that \(\left[\![\phi \right]\!]\subseteq \left[\![\mathtt {Explored}\right]\!]\) imply that there is still some \(\sigma \in \left[\![\mathtt {ToExplore}\right]\!]\) such that \(\sigma \not \in \left[\![\mathtt {Explored}\right]\!]\). This means that the second invariant will also be preserved by this step.

Otherwise, we compute the elements of \({ Pre}\left( \mathtt{Explored}\cup \left\{ \phi \right\} \right) \), add them to \(\mathtt {ToExplore}\), move \(\phi \) to \(\mathtt {Explored}\), and remove all constraints in \(\mathtt {Explored}\) that are subsumed by \(\phi \). Let \(\mathtt {Explored^{old}}\) and \(\mathtt {Explored^{new}}\) be the contents of the set \(\mathtt {Explored}\) before resp. after performing the operation. Define \(\mathtt {ToExplore^{old}}\) and \(\mathtt {ToExplore^{new}}\) analogously. The operation preserves the first invariant as follows. Pick any \(\sigma \in \left[\![\mathtt {ToExplore^{new}}\cup \mathtt {Explored^{new}}\right]\!]\). If \(\sigma \in \left[\![\mathtt {ToExplore^{old}}\cup \mathtt {Explored^{old}}\right]\!]\) then the result follows by the first invariant. Otherwise we know that \(\sigma \in \left[\![{ Pre}\left( \mathtt{Explored^{old}}\cup \left\{ \phi \right\} \right) \right]\!]\), i.e., \(\sigma \overset{{}}{\longrightarrow }_{\mathcal{G }}\alpha \) where \(\alpha \subseteq \left[\![\mathtt {Explored^{old}}\cup \left\{ \phi \right\} \right]\!]\) (see Lemma 6). By the induction hypothesis and the first invariant, we know that for every ground term \(\sigma '\in \alpha \), we have \(\sigma '\overset{{*}}{\longrightarrow }_{\mathcal{G }} \alpha '\) for some \( \alpha '\) s.t. \({ Sym}\left( \alpha '\right) \subseteq \mathcal{T }\). Hence \(\alpha \overset{{*}}{\longrightarrow }_{\mathcal{G }}\alpha ''\) for some \( \alpha ''\) s.t. \({ Sym}\left( \alpha ''\right) \subseteq \mathcal{T }\) (see Lemma 1). In other words, \(\sigma \overset{{}}{\longrightarrow }_{\mathcal{G }}\alpha \overset{{*}}{\longrightarrow }_{\mathcal{G }}\alpha ''\) s.t. \({ Sym}\left( \alpha ''\right) \subseteq \mathcal{T }\). The operation also preserves the second invariant as follows. Assume that \(X_{ init}(0)\overset{{*}}{\longrightarrow }_{\mathcal{G }}\alpha \) for some \(\alpha \) s.t. \({ Sym}\left( \alpha \right) \subseteq \mathcal{T }\). There are two cases.
If there is a \(\sigma \in \left[\![\Phi _\mathcal{T }\right]\!]\) such that \(\sigma \not \in \left[\![\mathtt {Explored^{new}}\right]\!]\), then by the fourth invariant \(\sigma \in \left[\![\mathtt {ToExplore^{new}}\right]\!]\) and the invariant holds immediately. Otherwise, \(\left[\![\Phi _\mathcal{T }\right]\!]\subseteq \left[\![\mathtt {Explored^{new}}\right]\!]\). Since \(X_{ init}(0)\overset{{*}}{\longrightarrow }_{\mathcal{G }}\alpha \), we also have that \(X_{ init}(0) \in { Pre}^*\left( {\left[\![\mathtt {Explored^{new}}\right]\!]}\right) \). By the third invariant, we know that \(X_{ init}(0)\not \in \left[\![\mathtt {Explored^{new}}\right]\!]\). By Lemma 2, there is a ground term \(\sigma \in ({ Pre}\left( {\left[\![\mathtt {Explored^{new}}\right]\!]}\right) \setminus \left[\![\mathtt {Explored^{new}}\right]\!])\). Since \(\left[\![\mathtt {Explored^{new}}\right]\!]=\left[\![\mathtt {Explored^{old}}\cup \left\{ \phi \right\} \right]\!]\) it follows that \(\sigma \in \left[\![{ Pre}\left( \mathtt{Explored^{old}}\cup \left\{ \phi \right\} \right) \right]\!]\) and hence \(\sigma \in \left[\![\mathtt {ToExplore^{new}}\right]\!]\).
Theorem 7.
Under the assumption that it terminates, Algorithm 1 always returns the correct answer.
7 Termination
In this section, we show that Algorithm 1 is guaranteed to terminate. To do that, we first recall some basics of the theory of well and better quasi-orderings. Then, we introduce a new class of constraints that we call flat constraints and show that they are better quasi-ordered. We show that each condition can be translated into a number of flat constraints. We use this to show that the set of conditions is well quasi-ordered under set inclusion. This leads to the well quasi-ordering of the set of constraints (of Section 5). Finally, we show the termination of the algorithm.
Wqos and Bqos. A Quasi-Ordering (Qo for short) is a pair \(\left\langle {A,\preceq }\right\rangle \) where \(\preceq \) is a reflexive and transitive binary relation on the set \(A\). A Qo \(\left\langle {A,\preceq }\right\rangle \) is a Well Quasi-Ordering (Wqo) if for each infinite sequence \(a_1,a_2,a_3,\dots \) of elements of \(A\), there are \(i<j\) such that \(a_i\preceq a_j\). The following lemma follows from the definition of a Wqo.
Lemma 8.
For Qos \(\preceq \) and \(\preceq '\) on some set \(A\), if \(\preceq \subseteq \preceq '\) and \(\preceq \) is a Wqo then \(\preceq '\) is a Wqo.
Given a Qo \(\left\langle {A,\preceq }\right\rangle \), we define a Qo \(\left\langle {{A}^*,\preceq ^*}\right\rangle \) on the set of words \({A}^*\) such that \(a_1a_2\cdots a_m\preceq ^* a'_1a'_2\cdots a'_n\) if there is an injection \(h:\left\{ 1,\ldots ,m\right\} \rightarrow \left\{ 1,\ldots ,n\right\} \) such that \(i<j\) implies \(h(i)<h(j)\) for all \(i,j:1\le i,j\le m\), and \(a_i\preceq a'_{h(i)}\) for each \(i:1\le i\le m\). We define the relation \(\preceq ^\mathcal{P }\) on the powerset \(\mathcal{P }\left( {A}\right) \) of \(A\) (the finite sets of elements of \(A\)), so that \(A_1\preceq ^\mathcal{P }A_2\) if \(\forall a_2\in A_2.\exists a_1\in A_1. a_1\preceq a_2\).
We define the relation \(\preceq ^{p}\) on the Cartesian product \(A_1\times \ldots \times A_n\) of Qos \(\left\langle {A_i,\le _i}\right\rangle \) for \(i:1,\ldots ,n\), so that \(\left\langle {a_1,\ldots ,a_n}\right\rangle \preceq ^{p}\left\langle {a_1',\ldots ,a_n'}\right\rangle \) if \(a_i\le _i a_i'\) for \(i:1,\ldots ,n\).
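The three derived orderings above can be checked mechanically on finite objects. The following is a minimal Python sketch, not part of the original development; the function names `word_leq`, `powerset_leq`, and `product_leq` are hypothetical, and the base ordering is passed in as a Boolean function `leq`.

```python
from operator import le
from typing import Callable, Iterable, Sequence

def word_leq(u: Sequence, v: Sequence, leq: Callable) -> bool:
    """u <=* v: u embeds into v by a strictly increasing injection h
    with leq(u[i], v[h(i)]). Greedy leftmost matching suffices."""
    j = 0
    for a in u:
        # Advance to the first unused position of v that dominates a.
        while j < len(v) and not leq(a, v[j]):
            j += 1
        if j == len(v):
            return False
        j += 1  # position j is now used
    return True

def powerset_leq(s1: Iterable, s2: Iterable, leq: Callable) -> bool:
    """A1 <=^P A2: every a2 in A2 dominates some a1 in A1."""
    s1 = list(s1)
    return all(any(leq(a1, a2) for a1 in s1) for a2 in s2)

def product_leq(t1: Sequence, t2: Sequence, leqs: Sequence[Callable]) -> bool:
    """Componentwise ordering on tuples of equal length."""
    if not (len(t1) == len(t2) == len(leqs)):
        return False
    return all(f(a, b) for f, a, b in zip(leqs, t1, t2))

# Small usage example over the natural numbers with the usual <=.
print(word_leq([1, 3], [0, 2, 1, 5], le))
print(powerset_leq({1, 5}, {2, 7}, le))
print(product_leq((1, 2), (1, 3), [le, le]))
```

Note the direction of \(\preceq ^\mathcal{P }\): the quantification starts from the right-hand set, so any set is below the empty set in this sketch.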
In the following lemma we state some properties of Bqos [10, 28].
Lemma 9.

Each Bqo is Wqo.

If \(A\) is finite, then \(\left\langle {A,=}\right\rangle \) is a Bqo, and \(\left\langle {\mathcal{P }\left( {A}\right) ,\subseteq }\right\rangle \) is a Bqo.

\(\left\langle {\mathbb{N },\le }\right\rangle \) is a Bqo.

If \(\left\langle {A_i,\le _i}\right\rangle \) is a Bqo for \(i:1,\ldots ,n\) then \(\left\langle {A_1\times \ldots \times A_n,\preceq ^{p}}\right\rangle \) is a Bqo.

If \(\left\langle {A,\preceq }\right\rangle \) is a Bqo, then \(\left\langle {\mathcal{P }\left( {A}\right) ,\preceq ^\mathcal{P }}\right\rangle \) is a Bqo.

\(d_i=d_j\) if \(h_\psi (i)=h_\psi (j)\).

If \(h_\psi (i)=k\) and \(h_\psi (j)=k+1\) then \(c_{k+1}<d_j-d_i\).
Lemma 10.
\(\psi \preceq \psi '\) implies that \(\left[\![\psi \right]\!]\supseteq \left[\![\psi '\right]\!]\).
By Lemma 9 it follows that
Lemma 11.
\(\preceq \) is a Bqo on the set of flat constraints.
Proof.
We first observe that flat constraints can be viewed as tuples with at most \(K=|\mathcal{V }|\) partitions and \(|\mathcal{V }|-1\) constants, and we can always add finite sequences such as \(0\,\emptyset \,0\ldots 0\,\emptyset \) to consider \(K\)-tuples only. From Lemma 9, we know that \(\left\langle {\mathbb{N },\le }\right\rangle \) and \(\left\langle {\mathcal{P }\left( {\mathcal{V }}\right) , =}\right\rangle \) are Bqos. Thus, the Cartesian product \((\mathcal{P }\left( {\mathcal{V }}\right) \times \mathbb{N })^{K-1}\times \mathcal{P }\left( {\mathcal{V }}\right) \) with \(\preceq \) is still a Bqo.

If \((x=y)\in \theta \) then \(x,y\in A_i\) for some \(i:1\le i\le m\).

If \((x<_cy)\in \theta \), \(x\in A_i\), and \(y\in A_j\) then \(c\le \left( \sum\nolimits _{k=i+1}^j(c_k+1)\right) -1\).
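The conditions on flat constraints listed above can be illustrated with a small Python sketch. It rests on assumptions that are not stated in this section: a flat constraint is represented as a list of non-empty partitions \(A_1,\ldots ,A_m\) together with the constants \(c_2,\ldots ,c_m\) between adjacent partitions, and \(x<_cy\) is read as "the value of \(y\) exceeds the value of \(x\) by more than \(c\)". The names `satisfies` and `max_entailed_gap` are hypothetical.

```python
from typing import Dict, FrozenSet, List, Tuple

# Assumed representation: (partitions A_1..A_m, gaps c_2..c_m),
# where gaps[k - 2] stores c_k, the gap between A_{k-1} and A_k.
Flat = Tuple[List[FrozenSet[str]], List[int]]

def satisfies(flat: Flat, val: Dict[str, int]) -> bool:
    """Check a valuation against a flat constraint: variables in the same
    partition are equal, and adjacent partitions are more than c_k apart."""
    parts, gaps = flat
    reps = []
    for part in parts:
        values = {val[x] for x in part}
        if len(values) != 1:  # partition must be non-empty and uniform
            return False
        reps.append(values.pop())
    return all(gaps[k] < reps[k + 1] - reps[k] for k in range(len(gaps)))

def max_entailed_gap(flat: Flat, i: int, j: int) -> int:
    """Largest c such that x <_c y follows for x in A_i and y in A_j
    (1-based indices, i < j): (sum_{k=i+1}^{j} (c_k + 1)) - 1."""
    _, gaps = flat
    return sum(gaps[k - 2] + 1 for k in range(i + 1, j + 1)) - 1

flat = ([frozenset({"x"}), frozenset({"y"}), frozenset({"z"})], [2, 0])
print(satisfies(flat, {"x": 0, "y": 3, "z": 4}))
print(max_entailed_gap(flat, 1, 3))
```

In the example, every satisfying valuation has \(z-x\ge (2+1)+(0+1)=4\), so the strongest entailed gap between \(x\) and \(z\) is \(3\), matching the bound in the second item.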
We define an ordering \(\preceq \) on conditions such that \(\theta \preceq \theta '\) if for each \(\psi '\in \mathcal{F }\left( \theta '\right) \) there is a \(\psi \in \mathcal{F }\left( \theta \right) \) with \(\psi \preceq \psi '\). From Lemma 10 we get the following.
Lemma 12.
\(\theta \preceq \theta '\) implies that \(\left[\![\theta \right]\!]\supseteq \left[\![\theta '\right]\!]\).
The following lemma follows from Lemma 9 and Lemma 11.
Lemma 13.
\(\preceq \) is a Bqo (and hence Wqo) on the set of conditions.
From Lemma 13, Lemma 12, and Lemma 8 we get the following lemma.
Lemma 14.
The set of conditions is Wqo under \(\sqsubseteq \).
The following lemma then holds.
Lemma 15.
The set of constraints is Wqo under \(\sqsubseteq \).
Proof.
Consider an infinite sequence of constraints: \(\phi _1,\phi _2,\phi _3,\ldots \). Since the set \(\mathcal{N }\cup \mathcal{T }\) is finite, there is an infinite sequence \(i_1<i_2<i_3<\cdots \) such that \({ Sym}\left( \phi _{i_1}\right) ={ Sym}\left( \phi _{i_2}\right) ={ Sym}\left( \phi _{i_3}\right) =\cdots \). If \({ Sym}\left( \phi _{i_j}\right) \in \mathcal{T }\) then the result follows immediately (since \(\left[\![\phi _{i_j}\right]\!]=\left\{ { Sym}\left( \phi _{i_j}\right) \right\} \) for all \(j\ge 1\)). Otherwise, we can assume, without loss of generality, that \(\phi _{i_j}\) is of the form \({X(x_1,\ldots ,x_n)}:{\theta _{i_j}}\). Notice that each \(\theta _{i_j}\) is a condition over \(\left\{ x_1,\ldots ,x_n\right\} \). By Lemma 14, there are \(j<k\) such that \(\theta _{i_j}\sqsubseteq \theta _{i_k}\), and hence \(\phi _{i_j}\sqsubseteq \phi _{i_k}\).
Termination. The algorithm always terminates because only finitely many constraints can be added to \(\mathtt {Explored}\). By definition, a new element \(\phi \) is added to \(\mathtt {Explored}\) only if \(\phi '\not \sqsubseteq \phi \) for each \(\phi '\) already added to \(\mathtt {Explored}\). This means that the constraints added to \(\mathtt {Explored}\) form a sequence \(\phi _1,\phi _2,\phi _3,\ldots \) such that \(\phi _i\not \sqsubseteq \phi _j\) for all \(i < j\). Since \(\sqsubseteq \) is a Wqo (Lemma 15), this sequence must be finite. This gives the following theorem.
Theorem 16.
Algorithm 1 is guaranteed to terminate.
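The termination argument can be observed on a toy instance. The sketch below is not Algorithm 1: it is a simplified worklist loop in the same style, instantiated on a swapped-in domain (upward-closed sets of vectors of natural numbers, as in backward coverability for a vector addition system) where the ordering is componentwise \(\le \), which is a Wqo by Lemma 9. It also applies the predecessor operator to a single constraint rather than to \(\mathtt{Explored}\cup \{\phi \}\). All names (`backward_reach`, `pre`, `subsumes`, `DELTAS`) are hypothetical.

```python
from typing import Callable, Iterable, List, Tuple

Vec = Tuple[int, ...]

def leq(u: Vec, v: Vec) -> bool:
    """Componentwise ordering; u <= v means u subsumes v."""
    return all(a <= b for a, b in zip(u, v))

def backward_reach(targets: Iterable[Vec],
                   pre: Callable[[Vec], List[Vec]],
                   subsumes: Callable[[Vec, Vec], bool],
                   is_initial: Callable[[Vec], bool]) -> Tuple[bool, List[Vec]]:
    to_explore = list(targets)
    explored: List[Vec] = []
    while to_explore:
        phi = to_explore.pop()
        # Discard phi if an already explored constraint subsumes it.
        if any(subsumes(old, phi) for old in explored):
            continue
        if is_initial(phi):
            return True, explored
        # Keep Explored minimal: drop constraints subsumed by phi.
        explored = [old for old in explored if not subsumes(phi, old)]
        explored.append(phi)
        to_explore.extend(pre(phi))
    return False, explored

# Toy system: one transition moving a token from counter 0 to counter 1.
DELTAS = [(-1, 1)]

def pre(m: Vec) -> List[Vec]:
    # Minimal vector from which firing delta covers m.
    return [tuple(max(mi - di, 0) for mi, di in zip(m, d)) for d in DELTAS]

def reachable(init: Vec, target: Vec = (0, 2)) -> bool:
    found, _ = backward_reach([target], pre, leq, lambda phi: leq(phi, init))
    return found

print(reachable((2, 0)))
print(reachable((1, 0)))
```

Every vector appended to `explored` is not subsumed by any earlier one, so the appended vectors form a sequence with no \(i<j\) such that the \(i\)-th is below the \(j\)-th; by the Wqo property such a sequence is finite, which is why the loop stops in the second call.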
8 Translation
Reachability with Empty Stacks. We consider a different variant of PdadReach which we call PdadReachEmpty. An instance of PdadReachEmpty is defined by a Pdad \(\mathcal{A }=\left\langle {Q,q_{ init},A,\Delta }\right\rangle \) and a state \(q_{ target}\in Q\), and we are asked whether \(\gamma _{ init}\overset{{*}}{\longrightarrow }\gamma \) for some \(\gamma \) of the form \(\left\langle {q_{ target},{ Val},\epsilon }\right\rangle \), i.e., whether \(q_{ target}\) is reachable in a configuration where the stack is empty. Given an instance of PdadReach, defined by a Pdad \(\mathcal{A }=\left\langle {Q,q_{ init},A,\Delta }\right\rangle \) and a state \(q_{ target}\in Q\), we derive an equivalent instance of PdadReachEmpty as follows. We construct a new Pdad \(\mathcal{A }'\) from \(\mathcal{A }\) by adding a new state \(q_{ new}\) to \(Q\), and adding a transition labeled with \({ nop}\) from \(q_{ target}\) to \(q_{ new}\). For each member \(a\in A\) of the stack alphabet, we add a self-loop on \(q_{ new}\) that pops \(a\) (with any value). The new target state is \(q_{ new}\). The two problem instances are equivalent as follows. Suppose that \(q_{ new}\) is reachable with an empty stack in \(\mathcal{A }'\). Then, the run of \(\mathcal{A }'\) reaching \(q_{ new}\) must have passed through \(q_{ target}\) (since \(q_{ new}\) can only be reached from \(q_{ target}\)). This means that \(q_{ target}\) is reachable in \(\mathcal{A }\). On the other hand, suppose that \(q_{ target}\) is reachable in \(\mathcal{A }\). Then, \(\mathcal{A }'\) can simulate the run of \(\mathcal{A }\) until it reaches \(q_{ target}\). From there, it takes the transition to \(q_{ new}\), and starts executing the self-loops, popping all the symbols in the stack until the stack becomes empty.
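The construction of \(\mathcal{A }'\) is purely syntactic and can be sketched as follows. This is an illustration under assumptions: the `Pdad` record, the string encoding of operation labels (`"nop"`, `"pop(a)"`), and the function name `to_reach_empty` are hypothetical and only stand in for the formal definitions.

```python
from dataclasses import dataclass
from typing import FrozenSet, Tuple

# (source state, operation label, target state); labels are plain strings here.
Transition = Tuple[str, str, str]

@dataclass(frozen=True)
class Pdad:
    states: FrozenSet[str]
    init: str
    alphabet: FrozenSet[str]
    transitions: FrozenSet[Transition]

def to_reach_empty(pdad: Pdad, q_target: str, q_new: str = "q_new") -> Tuple[Pdad, str]:
    """Return (A', new target): add q_new, a nop edge from q_target to q_new,
    and one unconstrained pop self-loop on q_new per stack symbol."""
    if q_new in pdad.states:
        raise ValueError("q_new must be a fresh state")
    extra = {(q_target, "nop", q_new)}
    extra |= {(q_new, f"pop({a})", q_new) for a in pdad.alphabet}
    new_pdad = Pdad(
        states=pdad.states | {q_new},
        init=pdad.init,
        alphabet=pdad.alphabet,
        transitions=pdad.transitions | frozenset(extra),
    )
    return new_pdad, q_new

a = Pdad(
    states=frozenset({"q0", "q1"}),
    init="q0",
    alphabet=frozenset({"a", "b"}),
    transitions=frozenset({("q0", "push(a)(x)", "q1")}),
)
a_prime, target = to_reach_empty(a, "q1")
print(sorted(a_prime.transitions))
```

The only way into `q_new` is the `nop` edge from the old target, which is the fact the equivalence argument above relies on.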
From Pdad to Cfgd. Suppose that we are given an instance of PdadReachEmpty defined by a Pdad \(\mathcal{A }=\left\langle {Q,q_{ init},A,\Delta }\right\rangle \) and a state \(q_{ target}\in Q\). Let \(\left\{ x_1,\ldots ,x_n\right\} \) be the set of variables that occur in \(\mathcal{A }\). We derive an equivalent instance of CfgdReach defined by a Cfgd \(\mathcal{G }=\left\langle {\mathcal{S },X_{ init},P}\right\rangle \). The set \(\mathcal{T }\) of \(\mathcal{G }\) is the singleton set \(\left\{ t\right\} \), and we assume that the arity of \(t\) is \(0\) (i.e., \(\rho \left( {t}\right) =0\)). The set \(\mathcal{N }\) of \(\mathcal{G }\) is defined as follows: For each pair of states \(q_1,q_2\in Q\) and symbol \(a\in A\cup \{\bot \}\), with \(\bot \notin A\), we have a nonterminal \(X_{(q_1,a,q_2)}\in \mathcal{N }\) with arity \(2n+1\). The symbol \(\bot \) is used to denote that the stack of \(\mathcal{A }\) is empty. The set \(\mathcal{N }\) also contains the initial symbol \(X_{ init}\) (by definition).
In the following, let \(\bar{y}\) denote a vector \(\left\langle {y_1,\ldots ,y_n}\right\rangle \) of length \(n\), and define \(\bar{y}[i]:=y_i\) for \(i:1\le i\le n\). For vectors \(\bar{z}=\left\langle {z_1,\ldots ,z_n}\right\rangle \) and \(\bar{y}=\left\langle {y_1,\ldots ,y_n}\right\rangle \), we use \(\bar{z}=\bar{y}\) (resp. \(\bar{z}\ne _j \bar{y}\) for some \(j : 1 \le j \le n\)) to denote the condition \(\bigwedge\nolimits _{1\le i \le n} z_i= y_i\) (resp. \(\bigwedge\nolimits _{(1\le i \le n) \wedge (i \ne j)} z_i= y_i\)). Furthermore, for brevity, we sometimes shorten a conjunction of conditions \(\theta _1\wedge \ldots \wedge \theta _n\) into a list \(\theta _1,\ldots ,\theta _n\).
The set \(P\) is derived from \(\Delta \), and it contains the productions of Fig. 1. Then the following property holds.
Proposition 17.
\(\gamma _{ init}\overset{{*}}{\longrightarrow }\gamma \) for some \(\gamma =\left\langle {q_{ target},{ Val},\epsilon }\right\rangle \) iff \(X_{ init}\overset{{*}}{\longrightarrow }_{\mathcal{G }}\alpha \) for some sentence \(\alpha \) such that \({ Sym}\left( \alpha \right) \subseteq \mathcal{T }\).
As an immediate consequence of the above Proposition, Theorem 7, and Theorem 16, we get:
Theorem 18.
The PdadReach and PdadReachEmpty problems are decidable for Pdads.
9 Extended Pdads
In this section, we present generalizations of the basic Pdad model for which the results presented in this paper still hold.
The first extension consists in adding conditions of the form \(x=c\), \(x>c\), and \(x<c\) for a variable \(x\) and a constant value \(c\ge 0\). The resulting formulas correspond to the original Gap Order Constraints considered in [31].
The second extension consists in adding multiple data fields to each element pushed to the stack. For a fixed number of data fields \(k\ge 0\), the configuration of a Pdad\(_k\) becomes a triple \(\left\langle {q,{ Val},\alpha }\right\rangle \) where \(q\in Q\) is a state, \({ Val}:\mathcal{V }\mapsto \mathbb{N }\) is a valuation, and \(\alpha \in {\left( A\times \mathbb{N }^k\right) }^*\) defines the content of the stack (each element of the word is a tuple \(\left\langle {a,c_1,\ldots ,c_k}\right\rangle \) where \(a\) is the symbol and \(c_i\) is its value for the \(i\)-th field).
We now consider operations that manipulate the data fields. We first extend the push operation and consider \({ push}\left( a\right) \left( x_1,\ldots ,x_k\right) \) to push the symbol \(a\in A\) and to assign to the \(i\)-th field the value of \(x_i\) for \(i:1,\ldots ,k\). We also consider the operation \({ pop}\left( a\right) \left( x_1,\ldots ,x_k\right) \) to pop the symbol \(a\in A\) from the stack and to assign to \(x_i\) the value of the \(i\)-th field on the top of the stack for \(i:1,\ldots ,k\). The operational semantics can be naturally extended in order to cope with tuples of values instead of single values.
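The effect of the two extended operations on a configuration can be sketched in a few lines of Python. This is an illustration only: it ignores control states and conditions, assumes the top of the stack is the last list element, and the function names `push` and `pop` as well as the dictionary encoding of \({ Val}\) are hypothetical.

```python
from typing import Dict, List, Optional, Sequence, Tuple

# A stack element is a symbol together with its k data fields.
StackElem = Tuple[str, Tuple[int, ...]]

def push(stack: List[StackElem], val: Dict[str, int],
         a: str, xs: Sequence[str]) -> List[StackElem]:
    """push(a)(x_1..x_k): the i-th field receives the current value of x_i."""
    return stack + [(a, tuple(val[x] for x in xs))]

def pop(stack: List[StackElem], val: Dict[str, int], a: str,
        xs: Sequence[str]) -> Optional[Tuple[List[StackElem], Dict[str, int]]]:
    """pop(a)(x_1..x_k): enabled only if the top symbol is a; then x_i
    receives the i-th field. Returns None when the operation is disabled."""
    if not stack:
        return None
    top, fields = stack[-1]
    if top != a or len(fields) != len(xs):
        return None
    new_val = dict(val)
    new_val.update(zip(xs, fields))
    return stack[:-1], new_val

val = {"x": 3, "y": 7}
stack = push([], val, "a", ["x", "y"])
print(stack)
print(pop(stack, {"x": 0, "y": 0}, "a", ["y", "x"]))
```

Popping into the variables in a different order than they were pushed, as in the last line, swaps the two values, which shows that the fields are positional.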
Finally, we consider operations that test and modify the data fields on the stack. We can use special identifiers \(topx_1,\ldots ,topx_k\) to denote such data fields and use them in conditions of transitions.
To encode the resulting model into Cfgd, we need to introduce nonterminals with extra arguments that represent both the current value and the (guessed) updated value of the data fields. More specifically, we need nonterminals of the form \({X_{(q_1,a,q_2)}(\bar{x},\bar{y},\bar{z},\bar{u})}\) to represent a run of \({\mathcal{A }}_k\) from a configuration where the state is \(q_1\), the topmost stack symbol is \(a\) and its corresponding data field values are given by the vector \(\bar{z}\), and the valuation of the shared variables of \(\mathcal{A }\) is given by the valuation of \(\bar{x}\), to a configuration with the updated data fields \(\bar{u}\) and where the state is \(q_2\) and the valuation of the shared variables is given by the valuation of \(\bar{y}\).
We leave a detailed treatment of this extension for future work.
10 Related Work and Conclusion
Decidability and complexity of reachability problems for pushdown systems with or without data have been extensively studied in the literature. In [12] the authors present an algorithm to compute \(Post^*\) and \(Pre^*\) for a pushdown automaton and a regular set of its configurations (represented as automata). Symbolic versions of the algorithms have been studied, e.g., in [29]. In [11] the authors consider approximated verification methods for subclasses of pushdown systems called finite indices in which it is possible to handle counters without zero test (i.e., transitions of a Petri net). In [1, 2] the authors present decidability results for timed extensions of pushdown systems. In [14] the authors present decidability results for pushdown systems with either a well-quasi-ordered set of control locations or of data values. In our model we do not consider a well-quasi-ordered data domain, but introduce a well-quasi-ordered relation over values pushed to and popped from the stack in order to decide reachability. Our extension of pushdown systems with Gap Order constraints is orthogonal to the above mentioned models. Furthermore, it subsumes the model presented in [32], where the authors consider pushdown systems in which messages carry (object) identifiers that can be compared by equality. In addition to equality tests, Gap Order constraints can be used to order messages in the stack.
Concerning our proof techniques, the algorithm for solving the Cfgd reachability problem is inspired by the seminal results on Datalog and context-free language reachability [30, 35] and by the evaluation of Datalog with Gap Order Constraints [31]. CLP programs with Gap Order constraints without conjunctions in the body have been used to model transition systems in [27]. The fixpoint semantics of CLP programs has been used to characterize model checking problems in [21] and applied to infinite-state systems in [16, 17, 18, 20]. In [15] extended automata with Gap Order conditions over variables are used as an approximated model of counter systems. The model however does not have recursion. The complexity of verification problems (expressed in temporal logic) for transition systems with Gap Order Constraints has been studied in [13]. Allowing rules with sets of terms in the right-hand side, Cfgds are more general than the model in [13]. Multiset rewriting systems with Gap Order Constraints (i.e., systems with an arbitrary number of integral variables) have been introduced in [3] and applied to different types of systems in [8], extending the parameterized models described in [9, 22]. These systems are a subclass of multiset rewriting with (linear) constraints applied to infinite-state verification, e.g., in [19].
The evaluation procedure for Datalog with Gap Order Constraints in [31] and its termination depend on specific data structures (weighted graphs kept in normal form) used to represent relations between variables that occur in Datalog clauses. In the present paper we formulate an algorithmic solution to Cfgd reachability as an instance of the general framework of well-structured transition systems and apply the theory of better quasi-orderings to naturally infer its termination. This approach has the advantage of capturing the essential ingredients needed for extending the algorithm to other classes of grammars with data. For instance, under some restrictions on the arity of terms, a slightly modified algorithm can be applied to grammars with sets of terms in the left-hand side of a production. A more formal treatment of this kind of generalization together with a deeper investigation of the complexity of the resulting algorithm is part of our future work.
References
1. Abdulla, P.A., Atig, M.F., Stenman, J.: Dense-timed pushdown automata. In: LICS. IEEE Computer Society (2012)
2. Abdulla, P.A., Atig, M.F., Stenman, J.: The minimal cost reachability problem in priced timed pushdown systems. In: Dediu, A.H., Martín-Vide, C. (eds.) LATA 2012. LNCS, vol. 7183, pp. 58–69. Springer, Heidelberg (2012)
3. Abdulla, P.A., Delzanno, G.: On the coverability problem for constrained multiset rewriting. In: Proc. AVIS 2006, 5th Int. Workshop on Automated Verification of Infinite-State Systems (2006)
4. Abdulla, P.A.: Well (and Better) Quasi-Ordered Transition Systems. The Bulletin of Symbolic Logic 16(4), 457–515 (2010)
5. Abdulla, P.A., Atig, M.F., Cederberg, J.: Timed lossy channel systems. In: Proc. FSTTCS 2005, 32nd Conf. on Foundations of Software Technology and Theoretical Computer Science (2012)
6. Abdulla, P.A., Čerāns, K., Jonsson, B., Tsay, Y.K.: General decidability theorems for infinite-state systems. In: Proc. LICS 1996, 11th IEEE Int. Symp. on Logic in Computer Science, pp. 313–321 (1996)
7. Abdulla, P.A., Delzanno, G., Begin, L.V.: A classification of the expressive power of well-structured transition systems. Inf. Comput. 209(3), 248–279 (2011)
8. Abdulla, P.A., Delzanno, G., Rezine, A.: Parameterized verification of infinite-state processes with global conditions. In: Damm, W., Hermanns, H. (eds.) CAV 2007. LNCS, vol. 4590, pp. 145–157. Springer, Heidelberg (2007)
9. Abdulla, P.A., Delzanno, G., Rezine, A.: Approximated parameterized verification of infinite-state processes with global conditions. Formal Methods in System Design 34(2), 126–156 (2009)
10. Abdulla, P.A., Nylén, A.: Better is better than well: On efficient verification of infinite-state systems. In: Proc. LICS 2000, 16th IEEE Int. Symp. on Logic in Computer Science, pp. 132–140 (2000)
11. Atig, M.F., Ganty, P.: Approximating Petri net reachability along context-free traces. In: FSTTCS 2011. LIPIcs, vol. 13, pp. 152–163. Schloss Dagstuhl - Leibniz-Zentrum fuer Informatik (2011)
12. Bouajjani, A., Esparza, J., Maler, O.: Reachability analysis of pushdown automata: Application to model-checking. In: Mazurkiewicz, A., Winkowski, J. (eds.) CONCUR 1997. LNCS, vol. 1243, pp. 135–150. Springer, Heidelberg (1997)
13. Bozzelli, L., Pinchinat, S.: Verification of gap-order constraint abstractions of counter systems. In: Kuncak, V., Rybalchenko, A. (eds.) VMCAI 2012. LNCS, vol. 7148, pp. 88–103. Springer, Heidelberg (2012)
14. Cai, X., Ogawa, M.: Well-structured extensions of pushdown systems. In: RP 2012 (2012)
15. Čerāns, K.: Deciding properties of integral relational automata. In: Shamir, E., Abiteboul, S. (eds.) ICALP 1994. LNCS, vol. 820, pp. 35–46. Springer, Heidelberg (1994)
16. Delzanno, G.: Automatic verification of parameterized cache coherence protocols. In: Emerson, E.A., Sistla, A.P. (eds.) CAV 2000. LNCS, vol. 1855, pp. 53–68. Springer, Heidelberg (2000)
17. Delzanno, G., Bultan, T.: Constraint-based verification of client-server protocols. In: Walsh, T. (ed.) CP 2001. LNCS, vol. 2239, pp. 286–301. Springer, Heidelberg (2001)
18. Delzanno, G., Esparza, J., Podelski, A.: Constraint-based analysis of broadcast protocols. In: Flum, J., Rodríguez-Artalejo, M. (eds.) CSL 1999. LNCS, vol. 1683, pp. 50–66. Springer, Heidelberg (1999)
19. Delzanno, G., Ganty, P.: Automatic verification of time sensitive cryptographic protocols. In: Jensen, K., Podelski, A. (eds.) TACAS 2004. LNCS, vol. 2988, pp. 342–356. Springer, Heidelberg (2004)
20. Delzanno, G.: Constraint-based verification of parameterized cache coherence protocols. Formal Methods in System Design 23(3), 257–301 (2003)
21. Delzanno, G., Podelski, A.: Model checking in CLP. In: Cleaveland, W.R. (ed.) TACAS 1999. LNCS, vol. 1579, pp. 223–239. Springer, Heidelberg (1999)
22. Delzanno, G., Rezine, A.: A lightweight regular model checking approach for parameterized systems. STTT 14(2), 207–222 (2012)
23. Esparza, J., Hansel, D., Rossmanith, P., Schwoon, S.: Efficient algorithms for model checking pushdown systems. In: Emerson, E.A., Sistla, A.P. (eds.) CAV 2000. LNCS, vol. 1855, pp. 232–247. Springer, Heidelberg (2000)
24. Esparza, J., Kučera, A., Mayr, R.: Model checking probabilistic pushdown automata. In: Proc. LICS 2004, 20th IEEE Int. Symp. on Logic in Computer Science, pp. 12–21 (2004)
25. Esparza, J., Schwoon, S.: A BDD-based model checker for recursive programs. In: Berry, G., Comon, H., Finkel, A. (eds.) CAV 2001. LNCS, vol. 2102, pp. 324–336. Springer, Heidelberg (2001)
26. Etessami, K., Yannakakis, M.: Algorithmic verification of recursive probabilistic state machines. In: Halbwachs, N., Zuck, L.D. (eds.) TACAS 2005. LNCS, vol. 3440, pp. 253–270. Springer, Heidelberg (2005)
27. Fribourg, L., Richardson, J.: Symbolic verification with gap-order constraints. In: Gallagher, J.P. (ed.) LOPSTR 1996. LNCS, vol. 1207, pp. 20–37. Springer, Heidelberg (1997)
28. Marcone, A.: Foundations of BQO theory. Transactions of the American Mathematical Society 345(2) (1994)
29. Reps, T.W., Schwoon, S., Jha, S., Melski, D.: Weighted pushdown systems and their application to interprocedural dataflow analysis. Sci. Comput. Program. 58(1–2), 206–263 (2005)
30. Reps, T.: Program analysis via graph reachability. Information & Software Technology 40(11–12), 701–726 (1998)
31. Revesz, P.Z.: A closed-form evaluation for datalog queries with integer (gap)-order constraints. TCS 116(1&2), 117–149 (1993)
32. Rot, J., de Boer, F.S., Bonsangue, M.M.: Pushdown System Representation For Unbounded Object Creation. Tech. Rep. KIT13, Karlsruhe Institute of Technology (July 2010)
33. Schwoon, S.: Model-Checking Pushdown Systems. Ph.D. thesis, Technische Universität München (2002)
34. Verma, K.N., Seidl, H., Schwentick, T.: On the complexity of equational Horn clauses. In: Nieuwenhuis, R. (ed.) CADE 2005. LNCS (LNAI), vol. 3632, pp. 337–352. Springer, Heidelberg (2005)
35. Yannakakis, M.: Graph-theoretic methods in database theory. In: PODS 1990, pp. 230–242 (1990)