1 Introduction

Separation Logic [12, 15] (SL) was primarily introduced for writing concise Hoare logic proofs of programs that handle pointer-linked recursive data structures (lists, trees, etc). Over time, SL has evolved into a powerful logical framework, that constitutes the basis of several industrial-scale static program analyzers [2, 3, 5], that perform scalable compositional analyses, based on the principle of local reasoning: describing the behavior of a program statement with respect only to the small (local) set of memory locations that are changed by that statement, with no concern for the rest of the program’s state.

Given a set of memory locations (e.g., addresses), SL formulæ describe heaps, that are finite partial functions mapping finitely many locations to records of locations. A location \(\ell \) is allocated if it occurs in the domain of the heap. An atom \(x \mapsto (y_1,\dots ,y_\upkappa )\) states that there is only one allocated location, associated with x, that moreover refers to the tuple of locations associated with \((y_1,\dots ,y_\upkappa )\), respectively. The separating conjunction \(\upphi * \uppsi \) states that the heap can split into two parts, with disjoint domains, that make \(\upphi \) and \(\uppsi \) true, respectively. The separating conjunction is instrumental in supporting local reasoning, because the disjointness between the (domains of the) models of its arguments ensures that no update of one heap can actually affect the other.

Reasoning about recursive data structures of unbounded sizes (lists, trees, etc.) is possible via the use of predicate symbols, whose interpretation is specified by a user-provided set of inductive definitions (SID) of the form \(p(x_1,\ldots ,x_n) \Leftarrow \uppi \), where p is a predicate symbol of arity n and the free variables of the formula \(\uppi \) are among the parameters \(x_1,\ldots ,x_n\) of the rule. Here the separating conjunction ensures that each unfolding of the rules, which substitute some predicate atom \(p(y_1,\ldots ,y_n)\) by a formula \(\uppi [x_1/y_1,\ldots ,x_n/y_n]\), corresponds to a way of building the recursive data structure. For instance, a list is either empty, in which case its head equals its tail pointer, or is built by first allocating the head, followed by all elements up to but not including the tail, as stated by the inductive definitions \(\mathsf {ls}(x,y) \Leftarrow x \approx y\) and \(\mathsf {ls}(x,y) \Leftarrow \exists z ~.~ x \mapsto (z) * \mathsf {ls}(z,y)\).
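As a rough illustration of how these two rules constrain a heap, the following Python sketch checks whether a concrete heap is a model of \(\mathsf {ls}(x,y)\) when records have a single field (\(\upkappa = 1\)). The encoding of locations as integers and heaps as dictionaries, as well as the name `models_ls`, are assumptions made for this example only and do not come from the paper.

```python
def models_ls(heap, head, tail):
    """Decide whether `heap` is a model of ls(head, tail) for kappa = 1.

    `heap` maps each allocated location to a 1-tuple (its successor).
    While cells remain, only the recursive rule can apply (the base rule
    x ≈ y requires the empty heap), so the walk below is deterministic.
    """
    remaining = dict(heap)
    current = head
    while remaining:
        if current not in remaining:
            return False          # the recursive rule needs `current` allocated
        (current,) = remaining.pop(current)
    return current == tail        # base rule: empty heap and head equals tail


# A two-cell segment from location 1 to location 3.
print(models_ls({1: (2,), 2: (3,)}, 1, 3))
```

Because the separating conjunction consumes one cell per unfolding, the loop terminates after at most as many steps as there are allocated locations.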

An important problem in program verification, arising during the construction of Hoare-style correctness proofs of programs, is the discharge of verification conditions of the form \(\upphi \models \uppsi \), where \(\upphi \) and \(\uppsi \) are SL formulæ, asking whether every model of \(\upphi \) is also a model of \(\uppsi \). These problems, called entailments, are, in general, undecidable in the presence of inductively defined predicates [1, 11].

A first decidable class of entailments, described in [10], involves three restrictions on the SID rules: progress, connectivity and establishment. Intuitively, the progress (P) condition states that every rule allocates exactly one location, the connectivity (C) condition states that the set of allocated locations has a tree-shaped structure, and the establishment (E) condition states that every existentially quantified variable from a rule defining a predicate is (eventually) allocated in every unfolding of that predicate. A \(\mathsf {2\text {EXPTIME}}\) algorithm was proposed for testing the validity of PCE entailments [13, 14] and a matching \(\mathsf {2\text {EXPTIME}}\)-hardness lower bound was provided shortly after [6].

Later work relaxes the establishment condition, necessary for decidability [7], by proving that the entailment problem is still in \(\mathsf {2\text {EXPTIME}}\) if the establishment condition is replaced by the restrictedness (R) condition, which requires that every disequality (\(x \not \approx y\)) involves at least one free variable from the left-hand side of the entailment, propagated through the unfoldings of the inductive system [8]. Interestingly, the rules of a progressive, connected and restricted (PCR) entailment may generate data structures with “dangling” (i.e. existentially quantified but not allocated) pointers, which was not possible with PCE entailments.

In this paper, we generalize PCR entailments further, by showing that the connectivity and restrictedness conditions are needed only on the right-hand side of the entailment, whereas the only condition required on the left-hand side is progress (which can usually be enforced by folding or unfolding definitions). Our results thus allow for “asymmetric” entailments, i.e., one can test whether the structures described by inductive rules that are (almost) arbitrary fulfill some restricted formula. Although the class of data structures that can be described is much larger, we show that this new class of entailments, called safe, is also \(\mathsf {2\text {EXPTIME}}\)-complete, by a many-one reduction of the validity of safe entailments to the validity of PCE entailments. A second contribution of the paper is the cross-certification of the two independent proofs of the \(\mathsf {2\text {EXPTIME}}\) upper bounds, for the PCE [14, 8, 6] and PCR [8] classes of entailments, respectively, by closing the loop. Namely, the reduction given in this paper enables the translation of any of the three entailment problems into an equivalent problem in any other class, while preserving the \(\mathsf {2\text {EXPTIME}}\) upper bound. This is because all the reductions are polynomial in the overall size of the SID and singly-exponential in the maximum size of the rules in the SID. The theoretical interest of the reduction is that it makes the proof of decidability and of the complexity class much shorter and clearer. It also has some practical advantages, since it allows one to re-use existing implementations designed for established systems instead of having to develop entirely new automated reasoning systems. Due to space restrictions, some of the proofs are omitted. All proofs can be found in [9].

2 Definitions

For a (partial) function \(f : A \rightarrow B\), we denote by \(\mathrm {dom}(f)\) and \(\mathrm {rng}(f)\) its domain and range, respectively. For a relation \(R \subseteq A \times A\), we denote by \(R^*\) the reflexive and transitive closure of R.

Let \(\upkappa \) be a fixed natural number throughout this paper and let \(\mathsf {P}\) be a countably infinite set of predicate symbols. Each predicate symbol \(p \in \mathsf {P}\) is associated with a unique arity, denoted \( ar (p)\). Let \(\mathsf {V}\) be a countably infinite set of variables. For technical convenience, we also consider a special constant \(\bot \), which will be used to denote “empty” record fields. Formulæ are built inductively, according to the following syntax:

$$\upphi := x \not \approx x' \mid x \approx x' \mid x \mapsto (y_1,\dots ,y_\upkappa ) \mid p(x_1,\dots ,x_n) \mid \upphi _1 * \upphi _2 \mid \upphi _1 \vee \upphi _2 \mid \exists x ~.~ \upphi _1$$

where \(p\in \mathsf {P}\) is a predicate symbol of arity \(n = ar (p)\), \(x,x',x_1,\dots ,x_n \in \mathsf {V}\) are variables and \(y_1,\dots ,y_\upkappa \in \mathsf {V}\cup \{ \bot \}\) are terms, i.e. either variables or \(\bot \).

The set of variables freely occurring in a formula \(\upphi \) is denoted by \(\mathrm {fv}(\upphi )\). We assume by \(\upalpha \)-equivalence that the same variable cannot occur both free and bound in the same formula \(\upphi \), and that distinct quantifiers bind distinct variables. The size \(|\upphi |\) of a formula \(\upphi \) is the number of occurrences of symbols in \(\upphi \). A formula \(x \approx x'\) or \(x \not \approx x'\) is an equational atom, \(x \mapsto (y_1, \ldots , y_\upkappa )\) is a points-to atom, whereas \(p(x_1, \ldots , x_n)\) is a predicate atom. Note that \(\bot \) cannot occur in an equational or in a predicate atom. A formula is predicate-less if no predicate atom occurs in it. A symbolic heap is a formula of the form \(\exists \pmb {x} ~.~ \mathop {*}_{i=1}^n \upalpha _i\), where each \(\upalpha _i\) is an atom and \(\pmb {x}\) is a possibly empty vector of variables.

Definition 1

A variable x is allocated by a symbolic heap \(\upphi \) iff \(\upphi \) contains a sequence of equalities \(x_1 \approx x_2 \approx \ldots \approx x_{n-1} \approx x_n\), for \(n \ge 1\), such that \(x = x_1\) and \(x_n \mapsto (y_1, \ldots , y_\upkappa )\) occurs in \(\upphi \), for some variables \(x_1, \ldots , x_n\) and some terms \(y_1, \ldots , y_\upkappa \in \mathsf {V}\cup \{\bot \}\).

A substitution is a partial function mapping variables to variables. If \(\upsigma \) is a substitution and \(\upphi \) is a formula, a variable or a tuple, then \(\upphi \upsigma \) denotes the formula, the variable or the tuple obtained from \(\upphi \) by replacing every free occurrence of a variable \(x \in \mathrm {dom}(\upsigma )\) by \(\upsigma (x)\), respectively. We denote by \( \left\{ \langle x_i,y_i\rangle \mid i \in \llbracket 1,n \rrbracket \right\} \) the substitution with domain \(\{ x_1,\dots ,x_n\}\) that maps \(x_i\) to \(y_i\), for each \(i \in \llbracket 1,n \rrbracket \).

A set of inductive definitions (SID) \(\mathcal{R}\) is a finite set of implications (or rules) of the form \(p(x_1,\dots ,x_n) \Leftarrow \uppi \), where \(p \in \mathsf {P}\), \(n = ar (p)\), \(x_1,\dots ,x_n\) are pairwise distinct variables and \(\uppi \) is a quantifier-free symbolic heap. The predicate atom \(p(x_1,\ldots ,x_n)\) is the head of the rule and \(\mathcal{R}(p)\) denotes the subset of \(\mathcal{R}\) consisting of rules with head \(p(x_1,\ldots ,x_n)\) (the choice of \(x_1, \ldots , x_n\) is not important). The variables in \(\mathrm {fv}(\uppi ) \setminus \{x_1,\dots ,x_n\}\) are called the existential variables of the rule. Note that, by definition, these variables are not explicitly quantified inside \(\uppi \) and that \(\uppi \) is quantifier-free. For simplicity, we denote by \(p(x_1, \ldots , x_n) \Leftarrow _\mathcal{R}\uppi \) the fact that the rule \(p(x_1, \ldots , x_n) \Leftarrow \uppi \) belongs to \(\mathcal{R}\). The size of \(\mathcal{R}\) is defined as \(|\mathcal{R}| \stackrel{\text {def}}{=} \sum _{p(x_1,\dots ,x_n) \Leftarrow _\mathcal{R}\uppi } (|\uppi | + n)\) and its width as \(\mathrm {w}(\mathcal{R}) \stackrel{\text {def}}{=} \max _{p(x_1,\dots ,x_n) \Leftarrow _\mathcal{R}\uppi } (|\uppi | + n)\).

We write \(p \succeq _{\mathcal{R}} q\), \(p, q \in \mathsf {P}\) iff \(\mathcal{R}\) contains a rule of the form \(p(x_1, \ldots , x_n) \Leftarrow \uppi \), and q occurs in \(\uppi \). We say that p depends on q if \(p \succeq _{\mathcal{R}}^* q\). For a formula \(\upphi \), we denote by \(\mathcal{P}(\upphi )\) the set of predicate symbols q, such that \(p \succeq _{\mathcal{R}}^* q\) for some predicate p occurring in \(\upphi \).

Given formulæ \(\upphi \) and \(\uppsi \), we write \(\upphi \Leftarrow _{\mathcal{R}} \uppsi \) if \(\uppsi \) is obtained from \(\upphi \) by replacing an atom \(p(u_1,\dots ,u_n)\) by \(\uppi \left\{ \langle x_1,u_1\rangle ,\dots ,\langle x_n,u_n\rangle \right\} \), where \(\mathcal{R}\) contains a rule \(p(x_1,\ldots ,x_n) \Leftarrow \uppi \). We assume, by a renaming of existential variables, that the set \((\mathrm {fv}(\uppi ) \setminus \{ x_1,\dots ,x_n\}) \cap \mathrm {fv}(\upphi )\) is empty. We call \(\uppsi \) an unfolding of \(\upphi \) iff \(\upphi \Leftarrow _{\mathcal{R}}^* \uppsi \).
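The unfolding relation can be sketched in Python as follows. This is only an illustration under assumed encodings: a formula is a list of atoms read as a separating conjunction, atoms are tuples `("pto", x, ys)`, `("pred", p, args)`, `("eq", x, y)` or `("neq", x, y)`, and a rule is a pair `(params, body)`. The helper names and the fresh-name scheme (`z_0`, `z_1`, ...) are hypothetical.

```python
import itertools

BOT = "⊥"  # hypothetical encoding of the constant ⊥


def atom_terms(atom):
    """Terms occurring in an atom."""
    if atom[0] == "pto":
        return (atom[1],) + tuple(atom[2])
    if atom[0] == "pred":
        return tuple(atom[2])
    return (atom[1], atom[2])


def rename(atom, sigma):
    """Apply the substitution `sigma` simultaneously to all terms of the atom."""
    s = lambda t: sigma.get(t, t)
    if atom[0] == "pto":
        return ("pto", s(atom[1]), tuple(s(y) for y in atom[2]))
    if atom[0] == "pred":
        return ("pred", atom[1], tuple(s(a) for a in atom[2]))
    return (atom[0], s(atom[1]), s(atom[2]))


def unfold(formula, index, rule):
    """One unfolding step: replace the predicate atom at `index` by the body
    of `rule`, renaming the rule's existential variables apart."""
    params, body = rule
    kind, _name, args = formula[index]
    if kind != "pred" or len(args) != len(params):
        raise ValueError("rule does not match the selected atom")
    used = {t for a in formula for t in atom_terms(a)}
    sigma = dict(zip(params, args))
    for atom in body:
        for t in atom_terms(atom):
            if t != BOT and t not in sigma:
                # existential variable of the rule: pick a name unused so far
                sigma[t] = next(f"{t}_{i}" for i in itertools.count()
                                if f"{t}_{i}" not in used)
                used.add(sigma[t])
    return formula[:index] + [rename(a, sigma) for a in body] + formula[index + 1:]


# Recursive rule of ls: ls(x, y) <= x -> (z) * ls(z, y)
LS_STEP = (("x", "y"), [("pto", "x", ("z",)), ("pred", "ls", ("z", "y"))])
print(unfold([("pred", "ls", ("a", "b"))], 0, LS_STEP))
```

Choosing fresh names against the set of terms already present mirrors the renaming assumption that the existential variables of the rule do not occur freely in the formula being unfolded.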

We now define the semantics of SL. Let \(\mathcal{L}\) be a countably infinite set of locations containing, in particular, a special location \(\bot \!\!\!\bot \). A structure is a pair \(({\mathfrak s},{\mathfrak h})\), where:

  • \({\mathfrak s}\) is a partial function from \(\mathsf {V}\cup \{ \bot \}\) to \(\mathcal{L}\), called a store, such that \(\bot \in \mathrm {dom}({\mathfrak s})\) and \({\mathfrak s}(x) = \bot \!\!\!\bot \iff x = \bot \), for all \(x\in \mathsf {V}\cup \{ \bot \}\), and

  • \({\mathfrak h}: \mathcal{L}\rightarrow \mathcal{L}^\upkappa \) is a finite partial function, such that \(\bot \!\!\!\bot \not \in \mathrm {dom}({\mathfrak h})\).

If \(x_1,\dots ,x_n\) are pairwise distinct variables and \(\ell _1,\dots ,\ell _n \in \mathcal{L}\) are locations, we denote by \({\mathfrak s}[x_i \leftarrow \ell _i \mid 1 \le i \le n]\) the store \({\mathfrak s}'\) defined by \(\mathrm {dom}({\mathfrak s}') = \mathrm {dom}({\mathfrak s}) \cup \left\{ x_1, \ldots , x_n\right\} \), \({\mathfrak s}'(y) = \ell _i\) if \(y = x_i\) for some \(i \in \llbracket 1,n \rrbracket \), and \({\mathfrak s}'(y) = {\mathfrak s}(y)\) otherwise. If \(x_1,\dots ,x_n \not \in \mathrm {dom}({\mathfrak s})\), then the store \({\mathfrak s}'\) is called an extension of \({\mathfrak s}\) to \(\{x_1,\dots ,x_n\}\).

Given a heap \({\mathfrak h}\), we define and . Two heaps \({\mathfrak h}_1\) and \({\mathfrak h}_2\) are disjoint iff \(\mathrm {dom}({\mathfrak h}_1) \cap \mathrm {dom}({\mathfrak h}_2) = \emptyset \), in which case \({\mathfrak h}_1 \uplus {\mathfrak h}_2\) denotes the union of \({\mathfrak h}_1\) and \({\mathfrak h}_2\), undefined whenever \({\mathfrak h}_1\) and \({\mathfrak h}_2\) are not disjoint.
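The disjointness test and the partial operation \(\uplus \) admit a direct encoding. The sketch below assumes heaps are Python dictionaries from integer locations to tuples of locations; the function names are hypothetical, and `None` is used to signal that the union is undefined.

```python
from typing import Dict, Optional, Tuple

Loc = int                            # assumed encoding of locations
Heap = Dict[Loc, Tuple[Loc, ...]]    # finite partial function L -> L^kappa


def disjoint(h1: Heap, h2: Heap) -> bool:
    """Two heaps are disjoint iff their domains do not intersect."""
    return not (h1.keys() & h2.keys())


def disjoint_union(h1: Heap, h2: Heap) -> Optional[Heap]:
    """Return the union of two disjoint heaps, or None when it is undefined."""
    if not disjoint(h1, h2):
        return None
    return {**h1, **h2}


print(disjoint_union({1: (2,)}, {2: (3,)}))
```

Only the domains matter for disjointness: the two heaps in the usage line share location 2 as a referenced value, yet their union is defined because 2 is allocated in only one of them.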

Given an SID \(\mathcal{R}\), \(({\mathfrak s},{\mathfrak h}) \models _{\mathcal{R}} \upphi \) is the least relation between structures and formulæ such that whenever \(({\mathfrak s},{\mathfrak h}) \models _{\mathcal{R}} \upphi \), we have \(\mathrm {fv}(\upphi ) \subseteq \mathrm {dom}({\mathfrak s})\) and the following hold:

$$\begin{array}{rcll} ({\mathfrak s},{\mathfrak h}) &{} \models _\mathcal{R}&{} x \approx x' &{} \text { if }\mathrm {dom}({\mathfrak h}) = \emptyset \text { and }{\mathfrak s}(x) = {\mathfrak s}(x') \\ ({\mathfrak s},{\mathfrak h}) &{} \models _\mathcal{R}&{} x \not \approx x' &{} \text { if }\mathrm {dom}({\mathfrak h}) = \emptyset \text { and }{\mathfrak s}(x) \ne {\mathfrak s}(x') \\ ({\mathfrak s},{\mathfrak h}) &{} \models _\mathcal{R}&{} x \mapsto (y_1, \ldots , y_\upkappa ) &{} \text { if }\mathrm {dom}({\mathfrak h}) = \{{\mathfrak s}(x)\} \text { and } {\mathfrak h}({\mathfrak s}(x)) = \langle {\mathfrak s}(y_1), \ldots , {\mathfrak s}(y_\upkappa )\rangle \\ ({\mathfrak s},{\mathfrak h}) &{} \models _\mathcal{R}&{} \upphi _1 * \upphi _2 &{} \text { if there exist disjoint heaps }{\mathfrak h}_1 \text { and }{\mathfrak h}_2 \text { such that} \\ &{}&{}&{} {\mathfrak h}= {\mathfrak h}_1 \uplus {\mathfrak h}_2 \text { and } ({\mathfrak s},{\mathfrak h}_i) \models _\mathcal{R}\upphi _i\text {, for both }i=1,2 \\ ({\mathfrak s},{\mathfrak h}) &{} \models _\mathcal{R}&{} \upphi _1 \vee \upphi _2 &{} \text { if }({\mathfrak s},{\mathfrak h}) \models _\mathcal{R}\upphi _i\text {, for some }i = 1,2 \\ ({\mathfrak s},{\mathfrak h}) &{} \models _\mathcal{R}&{} \exists x~.~ \upphi &{} \text { if there exists }\ell \in \mathcal{L}\text { such that }({\mathfrak s}[x \leftarrow \ell ],{\mathfrak h}) \models \upphi \\ ({\mathfrak s},{\mathfrak h}) &{} \models _\mathcal{R}&{} p(x_1,\dots ,x_n) &{} \text { if }p(x_1,\dots ,x_n) \Leftarrow _{\mathcal{R}} \upphi \text { , and there exists a store }{\mathfrak s}_e \\ &{}&{}&{} \text { coinciding with }{\mathfrak s}\text { on }\{ x_1,\dots ,x_n\}\text {, such that }({\mathfrak s}_e,{\mathfrak h}) \models \upphi \end{array}$$
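For quantifier-free, predicate-less symbolic heaps, the clauses above yield a simple satisfaction check, sketched below in Python. The encoding (stores and heaps as dictionaries, atoms as tagged tuples) and the name `satisfies` are assumptions of this example; quantifiers, disjunctions and predicate atoms are deliberately left out.

```python
def satisfies(store, heap, atoms):
    """Check (store, heap) |= a_1 * ... * a_n for predicate-less atoms.

    Atoms are encoded as ("eq", x, y), ("neq", x, y) or ("pto", x, ys).
    Every term used by the atoms must be in the store (fv ⊆ dom(s)),
    otherwise a KeyError is raised.
    """
    allocated = set()
    for atom in atoms:
        if atom[0] == "eq":
            if store[atom[1]] != store[atom[2]]:
                return False
        elif atom[0] == "neq":
            if store[atom[1]] == store[atom[2]]:
                return False
        elif atom[0] == "pto":
            loc = store[atom[1]]
            record = tuple(store[y] for y in atom[2])
            # each points-to atom owns exactly one cell; '*' forbids sharing
            if loc in allocated or heap.get(loc) != record:
                return False
            allocated.add(loc)
        else:
            raise ValueError(f"unsupported atom: {atom!r}")
    # equational atoms hold on the empty heap, so no cell may be left over
    return allocated == set(heap)


store = {"x": 1, "y": 2, "z": 3}
heap = {1: (2,), 2: (3,)}
print(satisfies(store, heap, [("pto", "x", ("y",)), ("pto", "y", ("z",)), ("neq", "x", "z")]))
```

The final comparison reflects the fact that a points-to atom describes a heap with exactly one allocated location, so a separating conjunction of atoms cannot hold on a heap with additional cells.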

Given formulæ \(\upphi \) and \(\uppsi \), we write \(\upphi \models _{\mathcal{R}} \uppsi \) whenever \(({\mathfrak s},{\mathfrak h}) \models _{\mathcal{R}} \upphi \Rightarrow ({\mathfrak s},{\mathfrak h}) \models _{\mathcal{R}} \uppsi \), for all structures \(({\mathfrak s},{\mathfrak h})\) and \(\upphi \equiv _{\mathcal{R}} \uppsi \) for (\(\upphi \models _\mathcal{R}\uppsi \) and \(\uppsi \models _\mathcal{R}\upphi \)). We omit the subscript \(\mathcal{R}\) whenever these relations hold for any SID. It is easy to check that, for all formulæ \(\upphi _1,\upphi _2,\uppsi \), it is the case that \((\upphi _1 \vee \upphi _2) * \uppsi \equiv (\upphi _1 * \uppsi ) \vee (\upphi _2 * \uppsi )\) and \((\exists x . \upphi _1) * \upphi _2 \equiv \exists x ~.~ \upphi _1 * \upphi _2\). Consequently, each formula can be transformed into an equivalent finite disjunction of symbolic heaps.

Definition 2

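An entailment problem is a triple \(\mathfrak {P}\stackrel{\text {def}}{=} \upphi \vdash _{\mathcal{R}} \uppsi \), where \(\upphi \) is a quantifier-free formula, \(\uppsi \) is a formula and \(\mathcal{R}\) is an SID. The problem \(\mathfrak {P}\) is valid iff \(\upphi \models _{\mathcal{R}} \uppsi \). The size of the problem \(\mathfrak {P}\) is defined as \(|\mathfrak {P}| \stackrel{\text {def}}{=} |\upphi | + |\uppsi | + |\mathcal{R}|\) and its width is defined as \(\mathrm {w}(\mathfrak {P}) \stackrel{\text {def}}{=} \max (|\upphi |, |\uppsi |, \mathrm {w}(\mathcal{R}))\).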

Note that considering \(\upphi \) to be quantifier-free loses no generality, because \(\exists x.\upphi \models _{\mathcal{R}} \uppsi \iff \upphi \models _{\mathcal{R}} \uppsi \).

3 Decidable Entailment Problems

The class of general entailment problems is undecidable, see Theorem 5 below for a refinement of the initial undecidability proofs [1, 11]. A first attempt to define a natural decidable class of entailment problems is described in [10] and involves three restrictions on the SID rules, formally defined below:

Definition 3

A rule \(p(x_1,\dots ,x_n) \Leftarrow \uppi \) is:

  1. progressing (P) iff \(\uppi = x_1 \mapsto (y_1,\dots ,y_\upkappa ) * \uprho \) and \(\uprho \) contains no points-to atoms,

  2. connected (C) iff it is progressing, \(\uppi = x_1 \mapsto (y_1,\dots ,y_\upkappa ) * \uprho \) and every predicate atom in \(\uprho \) is of the form \(q(y_i,\pmb {u})\), for some \(i \in \llbracket 1,\upkappa \rrbracket \),

  3. established (E) iff every existential variable \(x \in \mathrm {fv}(\uppi ) \setminus \{x_1, \ldots , x_n\}\) is allocated by every predicate-less unfolding \(\uppi \Leftarrow _\mathcal{R}^* \upphi \).

An SID \(\mathcal{R}\) is P (resp. C, E) for a formula \(\upphi \) iff every rule in \(\bigcup _{p \in \mathcal{P}(\upphi )}\mathcal{R}(p)\) is P (resp. C,E). An entailment problem \(\upphi \vdash _{\mathcal{R}} \uppsi \) is left- (resp. right-) P (resp. C, E) iff \(\mathcal{R}\) is P (resp. C, E) for \(\upphi \) (resp. \(\uppsi \)). An entailment problem is P (resp. C, E) iff it is both left- and right-P (resp. C, E).
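The progress and connectivity conditions are purely syntactic and can be checked rule by rule. The following Python sketch assumes rules are given as a parameter tuple and a list of atoms encoded as `("pto", x, ys)`, `("pred", q, args)`, `("eq", x, y)` or `("neq", x, y)`; the function names and the sample rules are hypothetical. Establishment is not covered, since it quantifies over all predicate-less unfoldings.

```python
def is_progressing(params, body):
    """P: the body contains exactly one points-to atom, and it allocates x_1."""
    ptos = [a for a in body if a[0] == "pto"]
    return len(ptos) == 1 and ptos[0][1] == params[0]


def is_connected(params, body):
    """C: progressing, and the first argument of every predicate atom is one
    of the fields y_1, ..., y_kappa of the allocated record."""
    if not is_progressing(params, body):
        return False
    fields = next(a for a in body if a[0] == "pto")[2]
    return all(len(a[2]) > 0 and a[2][0] in fields
               for a in body if a[0] == "pred")


# Hypothetical rules, for illustration only:
# ls(x, y) <= x -> (z) * ls(z, y)   (progressing and connected)
ls_step = (("x", "y"), [("pto", "x", ("z",)), ("pred", "ls", ("z", "y"))])
# p(x, y) <= x -> (z) * q(y, z)     (progressing, but q is rooted at y)
p_rule = (("x", "y"), [("pto", "x", ("z",)), ("pred", "q", ("y", "z"))])

print(is_connected(*ls_step), is_progressing(*p_rule), is_connected(*p_rule))
```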

The decidability of progressing, connected and left-established entailment problems is an immediate consequence of the result of [10]. Moreover, an analysis of the proof [10] leads to an elementary recursive complexity upper bound, which has recently been tightened to \(\mathsf {2\text {EXPTIME}}\)-complete [6, 8, 14]. In the following, we refer to Table 1 for a recap of the complexity results for the entailment problem. The last line is the main result of the paper and corresponds to the most general (known) decidable class of entailment problems (Definition 8).

Table 1. Decidability and Complexity Results for the Entailment Problem (\(\checkmark \) means that the corresponding condition holds on the left- and right-hand side of the entailment)

The following theorem is an easy consequence of previous results [6].

Theorem 4

The progressing, connected and left-established entailment problem is \(\mathsf {2\text {EXPTIME}}\)-complete. Moreover, there exists a decision procedure that runs in time \(2^{2^{\mathcal {O}(\mathrm {w}(\mathfrak {P})^8 \cdot \log |\mathfrak {P}|)}}\) for every instance \(\mathfrak {P}\) of this problem.

A natural question arises in this context: which of the restrictions from the above theorem can be relaxed and what is the price, in terms of computational complexity, of relaxing (some of) them? In the light of Theorem 5 below, the connectivity restriction cannot be completely dropped. Further, if we drop the establishment condition, the problem becomes undecidable [7, Theorem 6], even if both the left/right progress and connectivity conditions apply.

Theorem 5

The progressing, left-connected and established entailment problem is undecidable.

The second decidable class of entailment problems [8] relaxes the connectivity condition and replaces the establishment with a syntactic condition (that can be checked in polynomial time in the size of the SID), while remaining \(\mathsf {2\text {EXPTIME}}\)-complete. Informally, the definition forbids (dis)equations between existential variables in symbolic heaps or rules: the only allowed (dis)equations are of the form \(x \bowtie y\) where x is a free variable (viewed as a constant in [8]). The definition given below is essentially equivalent to that of [8], but avoids any reference to constants; instead it uses a notion of \(\mathcal{R}\)-positional functions, which helps to identify existential variables that are always replaced by a free variable from the initial formula during unfolding.

An \(\mathcal{R}\)-positional function maps every n-ary predicate symbol p occurring in \(\mathcal{R}\) to a subset of \(\llbracket 1,n \rrbracket \). Given an \(\mathcal{R}\)-positional function \(\uplambda \) and a formula \(\upphi \), we denote by \({\mathsf {V}}_{\uplambda }(\upphi )\) the set of variables \(x_i\) such that \(\upphi \) contains a predicate atom \(p(x_1,\dots ,x_n)\) with \(i \in \uplambda (p)\). Note that \({\mathsf {V}}_{\uplambda }\) is stable under substitutions, i.e. \({\mathsf {V}}_{\uplambda }(\upphi \upsigma ) = ({\mathsf {V}}_{\uplambda }(\upphi ))\upsigma \), for each formula \(\upphi \) and each substitution \(\upsigma \).

Definition 6

Let \(\uppsi \) be a formula and \(\mathcal{R}\) be an SID. The fv-profile of the pair \((\uppsi ,\mathcal{R})\) is the \(\mathcal{R}\)-positional function \(\uplambda \) such that the sets \(\uplambda (p)\), for \(p \in \mathsf {P}\), are the maximal sets satisfying the following conditions:

  1. \({\mathsf {V}}_{\uplambda }(\uppsi ) \subseteq \mathrm {fv}(\uppsi )\).

  2. For all predicate symbols \(p\in \mathcal{P}(\uppsi )\), all rules \(p(x_1,\dots ,x_n) \Leftarrow \uppi \) in \(\mathcal{R}\), all predicate atoms \(q(y_1,\dots ,y_m)\) in \(\uppi \) and all \(i \in \uplambda (q)\), there exists \(j \in \uplambda (p)\) such that \(x_j = y_i\).

The fv-profile of \((\uppsi ,\mathcal{R})\) is denoted by \(\uplambda ^{\uppsi }_{\mathcal{R}}\).

Intuitively, given a predicate \(p \in \mathsf {P}\), the set \(\uplambda ^{\uppsi }_{\mathcal{R}}(p)\) denotes the formal parameters of p that, in every unfolding of \(\uppsi \), will always be substituted by variables occurring freely in \(\uppsi \). It is easy to check that \(\uplambda ^{\uppsi }_{\mathcal{R}}\) can be computed in polynomial time w.r.t. \(|\uppsi | + |\mathcal{R}|\), using a straightforward greatest fixpoint algorithm. The algorithm starts with a function mapping every predicate p of arity n to \(\llbracket 1,n \rrbracket \) and repeatedly removes elements from the sets \(\uplambda (p)\) to ensure that the above conditions hold. In the worst case, we may have eventually \(\uplambda (p) = \emptyset \) for all predicate symbols p.
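The greatest fixpoint computation described above can be sketched as follows. This is a minimal Python illustration, not the authors' implementation: the SID is assumed to be a dictionary from predicate names to lists of `(params, body)` pairs, every predicate is assumed to have at least one rule, the predicate atoms of \(\uppsi \) and its free variables are passed separately, and all names are hypothetical. The sample SID is a progressing variant of list segments chosen for the example.

```python
def reachable(preds, rules):
    """Predicate symbols on which the given ones depend (reflexive-transitive)."""
    seen, stack = set(preds), list(preds)
    while stack:
        p = stack.pop()
        for _params, body in rules.get(p, []):
            for atom in body:
                if atom[0] == "pred" and atom[1] not in seen:
                    seen.add(atom[1])
                    stack.append(atom[1])
    return seen


def fv_profile(psi_atoms, psi_free, rules):
    """Start from the full index sets and remove violating positions
    until both conditions of Definition 6 hold."""
    lam = {p: set(range(1, len(rs[0][0]) + 1)) for p, rs in rules.items()}
    deps = reachable({a[1] for a in psi_atoms if a[0] == "pred"}, rules)
    changed = True
    while changed:
        changed = False
        # Condition 1: positions in lam must carry free variables of psi.
        for a in psi_atoms:
            if a[0] == "pred":
                bad = {i for i in lam[a[1]] if a[2][i - 1] not in psi_free}
                if bad:
                    lam[a[1]] -= bad
                    changed = True
        # Condition 2: positions in lam(q) must be fed from positions in lam(p).
        for p in deps:
            for params, body in rules.get(p, []):
                allowed = {params[j - 1] for j in lam[p]}
                for a in body:
                    if a[0] == "pred":
                        bad = {i for i in lam[a[1]] if a[2][i - 1] not in allowed}
                        if bad:
                            lam[a[1]] -= bad
                            changed = True
    return lam


# Hypothetical progressing SID: ls(x,y) <= x -> (y) and ls(x,y) <= x -> (z) * ls(z,y)
RULES = {"ls": [(("x", "y"), [("pto", "x", ("y",))]),
                (("x", "y"), [("pto", "x", ("z",)), ("pred", "ls", ("z", "y"))])]}
print(fv_profile([("pred", "ls", ("a", "b"))], {"a", "b"}, RULES))
```

On this example the first position of `ls` is removed, because the recursive rule passes the existential variable `z` there, whereas the second position always receives the free variable `b`. Each pass either removes a position or stops, so the number of passes is bounded by the total arity of the predicates.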

Definition 7

Let \(\uplambda \) be an \(\mathcal{R}\)-positional function, and V be a set of variables. A formula \(\upphi \) is \(\uplambda \)-restricted (\(\uplambda \)-R) w.r.t. V iff the following hold:

  1. for every disequation \(y \not \approx z\) in \(\upphi \), we have \(\{ y,z\} \cap V \not = \emptyset \), and

  2. \({\mathsf {V}}_{\uplambda }(\upphi ) \subseteq V\).

A rule \(p(x_1,\ldots ,x_n) \Leftarrow x \mapsto (y_1,\dots ,y_\upkappa ) * \uprho \) is:

  • \(\uplambda \)-connected (\(\uplambda \)-C) iff for every atom \(q(z_1,\dots ,z_m)\) occurring in \(\uprho \), we have \(z_1 \in {\mathsf {V}}_{\uplambda }(p(x_1,\dots ,x_n)) \cup \{y_1, \ldots , y_\upkappa \}\),

  • \(\uplambda \)-restricted (\(\uplambda \)-R) iff \(\uprho \) is \(\uplambda \)-restricted w.r.t. \({\mathsf {V}}_{\uplambda }(p(x_1,\dots ,x_n))\).

An SID \(\mathcal{R}\) is P (resp. \(\uplambda \)-C, \(\uplambda \)-R) for a formula \(\upphi \) iff every rule in \(\bigcup _{p \in \mathcal{P}(\upphi )}\mathcal{R}(p)\) is P (resp. \(\uplambda \)-C, \(\uplambda \)-R). An entailment problem \(\upphi \vdash _{\mathcal{R}} \uppsi \) is left- (resp. right-) \(\uplambda \)-C (resp. \(\uplambda \)-R) iff \(\mathcal{R}\) is \(\uplambda \)-C (resp. \(\uplambda \)-R) for \(\upphi \) (resp. \(\uppsi \)), where \(\uplambda \) is taken to be \(\uplambda ^{\upphi }_{\mathcal{R}}\) (resp. \(\uplambda ^{\uppsi }_{\mathcal{R}}\)). An entailment problem is \(\uplambda \)-C (resp. \(\uplambda \)-R) iff it is both left- and right-\(\uplambda \)-C (resp. \(\uplambda \)-R).
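The conditions of Definition 7 can be checked syntactically once a positional function is fixed. The Python sketch below assumes the same tuple encoding of atoms as before (`"pto"`, `"pred"`, `"eq"`, `"neq"`), represents \(\uplambda \) as a dictionary from predicate names to sets of 1-based positions, and uses hypothetical function names.

```python
def v_lambda(atoms, lam):
    """V_lambda(phi): arguments x_i of predicate atoms p(x_1..x_n) with i in lam(p)."""
    return {a[2][i - 1] for a in atoms if a[0] == "pred" for i in lam.get(a[1], ())}


def is_restricted(atoms, lam, allowed):
    """phi is lambda-restricted w.r.t. `allowed` (conditions 1 and 2)."""
    diseqs_ok = all(a[1] in allowed or a[2] in allowed
                    for a in atoms if a[0] == "neq")
    return diseqs_ok and v_lambda(atoms, lam) <= set(allowed)


def split_rule(body):
    """Split a progressing rule body into its points-to atom and the rest rho."""
    ptos = [a for a in body if a[0] == "pto"]
    if len(ptos) != 1:
        raise ValueError("expected a progressing rule")
    return ptos[0], [a for a in body if a[0] != "pto"]


def rule_is_lambda_connected(pred, params, body, lam):
    """Every predicate atom of rho is rooted at a lambda-position of the head
    or at a field of the allocated record."""
    pto, rho = split_rule(body)
    roots = v_lambda([("pred", pred, tuple(params))], lam) | set(pto[2])
    return all(a[2][0] in roots for a in rho if a[0] == "pred")


def rule_is_lambda_restricted(pred, params, body, lam):
    """rho is lambda-restricted w.r.t. the lambda-positions of the head."""
    _pto, rho = split_rule(body)
    return is_restricted(rho, lam, v_lambda([("pred", pred, tuple(params))], lam))


lam = {"ls": {2}}
body = [("pto", "x", ("z",)), ("pred", "ls", ("z", "y"))]
print(rule_is_lambda_connected("ls", ("x", "y"), body, lam),
      rule_is_lambda_restricted("ls", ("x", "y"), body, lam))
```

Adding a disequation such as `("neq", "z", "x")` to the body would break \(\uplambda \)-restrictedness in this example, since neither `z` nor `x` sits at a position of \(\uplambda (\mathsf {ls})\), whereas `("neq", "z", "y")` would be accepted.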

The class of progressing, \(\uplambda \)-connected and \(\uplambda \)-restricted entailment problems has been shown to be a generalization of the class of progressing, connected and left-established problems, because the latter can be reduced to the former by a many-one reduction [8, Theorem 13] that runs in time \(|\mathfrak {P}| \cdot 2^{\mathcal {O}(\mathrm {w}(\mathfrak {P})^2)}\) on input \(\mathfrak {P}\) (Figure 1) and preserves the problem’s width asymptotically.

Fig. 1. Many-one Reductions between Decidable Entailment Problems

In the rest of this paper, we close the loop by defining a syntactic extension of progressing, \(\uplambda \)-connected and \(\uplambda \)-restricted entailment problems and by showing that this extension can be reduced to the class of progressing, connected and left-established entailment problems by a many-one reduction. The new fragment is defined as follows:

Definition 8

An entailment problem \(\upphi \vdash _{\mathcal{R}} \uppsi \) is safe if, for \(\uplambda \stackrel{\text {def}}{=} \uplambda ^{\uppsi }_{\mathcal{R}}\), the following hold:

  1. every rule in \(\mathcal{R}\) is progressing,

  2. \(\uppsi \) is \(\uplambda \)-restricted w.r.t. \(\mathrm {fv}(\upphi )\),

  3. all the rules from \(\bigcup _{p \in \mathcal{P}(\uppsi )} \mathcal{R}(p)\) are \(\uplambda \)-connected and \(\uplambda \)-restricted.

Note that there is no condition on the formula \(\upphi \), or on the rules defining the predicates occurring only in \(\upphi \), other than the progress condition. The conditions in Definition 8 ensure that all the disequations occurring in any unfolding of \(\uppsi \) involve at least one variable that is free in \(\upphi \). Further, the heap of every model of \(\uppsi \) must be a forest, i.e. a union of trees, the roots of which are associated with the first argument of the predicate atoms in \(\uppsi \) or with free variables from \(\upphi \).

A typical yet very simple example of such an entailment is the so-called “reversed list” problem that consists in checking that any list segment \(\texttt {revls}(z,y)\) defined in the reverse direction (from the tail to the head) is a list segment \(\texttt {ls}(x,y)\) in the usual sense (defined inductively from head to tail). This corresponds to the entailment problem \(\texttt {revls}(z,y) \vdash _{\mathcal{R}} \exists x. \texttt {ls}(x,y)\) where \(\mathcal{R}\) contains the following rules:

[Figure: the rules of \(\mathcal{R}\) defining \(\texttt {ls}\) and \(\texttt {revls}\)]

This problem is considered challenging for proof search-based automated reasoning procedures (see, e.g., [4, 16]). The antecedent does not fulfill the connectivity condition, but the consequent does, hence the entailment is safe. Similar, more complex examples can be defined; for instance, a list can be constructed by interleaving elements at odd or even positions. Another example is the case of a data structure containing an unbounded number of acyclic lists (e.g., a list of acyclic lists). Such a data structure does not fulfill the restrictedness condition, since one needs to compare the pointers occurring along each list to the pointer at the end. Checking, for instance, that the concatenation of two lists of acyclic lists is again a list of (possibly cyclic) lists is a problem that fits into the safe class and can thus be effectively checked by our algorithm.

We refer the reader to Figure 1 for a general picture of the entailment problems considered so far and of the many-one reductions between them, where the reduction corresponding to the dashed arrow is the concern of the next section. Importantly, since all reductions are many-one, taking time polynomial in the size and exponential in the width of the input problem, while preserving its width asymptotically, the three classes from Figure 1 can be unified into a single (2EXPTIME-complete) class of entailments.

4 Reducing Safe to Established Entailments

In a model of a safe SID (Definition 8), the existential variables introduced by the replacement of predicate atoms with corresponding rule bodies are not required to be allocated. This is because safe SIDs are more liberal than established SIDs and allow heap structures with an unbounded number of dangling pointers. As observed in [8], checking the validity of an entailment (w.r.t. a restricted SID) can be done by considering only those structures in which the dangling pointers point to pairwise distinct locations. The main idea of the present reduction of safe to established entailment problems is that any such structure can be extended by allocating all dangling pointers separately and, moreover, the extended structures can be defined by an established SID.

In what follows, we fix an arbitrary instance \(\mathfrak {P}= \upphi \vdash _{\mathcal{R}} \uppsi \) of the safe entailment problem (Definition 8) and denote by \(\uplambda \stackrel{\text {def}}{=} \uplambda ^{\uppsi }_{\mathcal{R}}\) the fv-profile of \((\uppsi ,\mathcal{R})\) (Definition 6). Let \(\pmb {w} \stackrel{\text {def}}{=} \langle w_1,\dots ,w_\upnu \rangle \) be the vector of free variables from \(\upphi \) and \(\uppsi \), where the order of variables is not important, and assume w.l.o.g. that \(\upnu > 0\). Let \(\mathcal{P}_{l}\stackrel{\text {def}}{=} \mathcal{P}(\upphi )\) and \(\mathcal{P}_{r}\stackrel{\text {def}}{=} \mathcal{P}(\uppsi )\) be the sets of predicate symbols on which the predicate symbols occurring in the left- and right-hand side of the entailment depend, respectively. We assume that \(\upphi \) and \(\uppsi \) contain no points-to atoms and that \(\mathcal{P}_{l}\cap \mathcal{P}_{r}= \emptyset \). Again, these assumptions lose no generality, because a points-to atom \(u \mapsto (v_1,\dots ,v_\upkappa )\) can be replaced by a predicate atom \(p(u,v_1,\dots ,v_\upkappa )\), where p is a fresh predicate symbol associated with the rule \(p(x,y_1,\dots ,y_\upkappa ) \Leftarrow x \mapsto (y_1,\dots ,y_\upkappa )\). Moreover, the condition \(\mathcal{P}_{l}\cap \mathcal{P}_{r}= \emptyset \) may be enforced by considering two copies of each predicate, for the left-hand side and for the right-hand side, respectively. Finally, we assume that every rule contains exactly \(\mu \) existential variables, for some fixed \(\mu \in {\mathbb N}\); this condition can be enforced by adding dummy literals \(x \approx x\) if needed.

We describe a reduction of \(\mathfrak {P}\) to an equivalent progressing, connected, and left-established entailment problem. The reduction will extend heaps, by adding \(\upnu +\mu \) record fields. We shall therefore often consider heaps and points-to atoms having \(\upkappa +\upnu +\mu \) record fields, where the formal definitions are similar to those given previously. Usually such formulæ and heaps will be written with a prime. These additional record fields will be used to ensure that the constructed system is connected, by adding all the existential variables of a given rule (as well as the variables in \(w_1,\dots ,w_\upnu \)) into the image of the location allocated by the considered rule. Furthermore, the left-establishment condition will be enforced by adding predicates and rules in order to allocate all the locations that correspond to existential quantifiers and that are not already allocated, making such locations point to a dummy vector \(\pmb {\bot } \stackrel{\text {def}}{=} (\bot ,\dots ,\bot )\), of length \(\upkappa +\upnu + \mu \), where \(\bot \) is the special constant denoting empty heap entries. To this end, we shall use a predicate symbol \(\underline{\pmb {\bot }}\) associated with the rule \(\underline{\pmb {\bot }}(x) \Leftarrow x \mapsto \pmb {\bot }\). Note that allocating all these locations will entail (by definition of the separating conjunction) that they are distinct, thus the addition of such predicates and rules will reduce the number of satisfiable unfoldings. However, due to the restrictions on the use of disequations, we shall see that this does not change the status of the entailment problem.

Definition 9

For any total function \(\upgamma : \mathcal{L}\rightarrow \mathcal{L}\) and any tuple \(\pmb {\ell } = \langle \ell _1,\dots ,\ell _n\rangle \in \mathcal{L}^n\), we denote by \(\upgamma (\pmb {\ell })\) the tuple \(\langle \upgamma (\ell _1),\dots ,\upgamma (\ell _n)\rangle \). If \({\mathfrak s}\) is a store, then \(\upgamma ({\mathfrak s})\) denotes the store with domain \(\mathrm {dom}({\mathfrak s})\), such that \(\upgamma ({\mathfrak s})(x) = \upgamma ({\mathfrak s}(x))\), for all \(x \in \mathrm {dom}({\mathfrak s})\). Consider a heap \({\mathfrak h}\) such that for all \(\ell \ne \ell ' \in \mathrm {dom}({\mathfrak h})\), we have \(\upgamma (\ell ) \ne \upgamma (\ell ')\). Then \(\upgamma ({\mathfrak h})\) denotes the heap with domain \(\mathrm {dom}(\upgamma ({\mathfrak h})) = \{\upgamma (\ell ) \mid \ell \in \mathrm {dom}({\mathfrak h})\}\), such that \(\upgamma ({\mathfrak h})(\upgamma (\ell )) = \upgamma ({\mathfrak h}(\ell ))\), for all \(\ell \in \mathrm {dom}({\mathfrak h})\).
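To make Definition 9 concrete, here is a minimal Python sketch that applies a mapping \(\upgamma \) to tuples, stores and heaps. The dictionary encoding of stores and heaps, the function names and the sample mapping are assumptions of this illustration, not part of the formal development.

```python
from typing import Callable, Dict, Tuple

Loc = int
Store = Dict[str, Loc]                 # variables -> locations
Heap = Dict[Loc, Tuple[Loc, ...]]      # allocated locations -> records
Mapping = Callable[[Loc], Loc]         # a total function gamma on locations


def apply_tuple(gamma: Mapping, locs: Tuple[Loc, ...]) -> Tuple[Loc, ...]:
    """gamma applied componentwise to a tuple of locations."""
    return tuple(gamma(l) for l in locs)


def apply_store(gamma: Mapping, s: Store) -> Store:
    """Same domain as s; every value is mapped through gamma."""
    return {x: gamma(l) for x, l in s.items()}


def apply_heap(gamma: Mapping, h: Heap) -> Heap:
    """Image of h under gamma; only defined if gamma is injective on dom(h)."""
    image: Heap = {}
    for l, record in h.items():
        target = gamma(l)
        if target in image:
            raise ValueError("gamma is not injective on dom(h)")
        image[target] = apply_tuple(gamma, record)
    return image


# Hypothetical mapping: collapses locations 3 and 4 onto 2, identity elsewhere.
def gamma(l: Loc) -> Loc:
    return 2 if l in (3, 4) else l


h1 = {1: (3,), 2: (4,)}
image = apply_heap(gamma, h1)  # {1: (2,), 2: (2,)}
```

Note that the mapping only has to be injective on the domain of the heap: it may freely identify locations that occur in records, as the sample mapping does with 3 and 4.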

The following lemma identifies conditions ensuring that the application of a mapping to a structure (Definition 9) preserves the truth value of a formula.

Lemma 10

Given a set of variables V, let \(\upalpha \) be a formula that is \(\uplambda \)-restricted w.r.t. V, such that \(\mathcal{P}(\upalpha )\subseteq \mathcal{P}_{r}\) and let \(({\mathfrak s},{\mathfrak h})\) be an \(\mathcal{R}\)-model of \(\upalpha \). For every mapping \(\upgamma : \mathcal{L}\rightarrow \mathcal{L}\) such that \(\upgamma (\ell ) = \upgamma (\ell ') \Rightarrow \ell =\ell '\) holds whenever either \(\{ \ell ,\ell ' \} \subseteq \mathrm {dom}({\mathfrak h})\) or \(\{\ell ,\ell ' \} \cap {\mathfrak s}(V) \not = \emptyset \), we have \((\upgamma ({\mathfrak s}),\upgamma ({\mathfrak h})) \models _{\mathcal{R}} \upalpha \).

If \(\upgamma \) is, moreover, injective, then the result of Lemma 10 holds for any formula:

Lemma 11

Let \(\upalpha \) be a formula and let \(({\mathfrak s},{\mathfrak h})\) be an \(\mathcal{R}\)-model of \(\upalpha \). For every injective mapping \(\upgamma : \mathcal{L}\rightarrow \mathcal{L}\) we have \((\upgamma ({\mathfrak s}),\upgamma ({\mathfrak h})) \models _{\mathcal{R}} \upalpha \).

4.1 Expansions and Truncations

We introduce a so-called expansion relation on structures, as well as a truncation operation on heaps. Intuitively, the expansion of a structure is a structure with the same store and whose heap is augmented with new allocated locations (each pointing to \(\boldsymbol{\bot }\)) and additional record fields, referring in particular to all the newly added allocated locations. These locations are introduced to accommodate all the existential variables of the predicate-less unfolding of the left-hand side of the entailment (to ensure that the obtained entailment is left-established). Conversely, the truncation of a heap is the heap obtained by removing these extra locations. We also introduce the notion of a \(\upgamma \)-expansion which is a structure whose image by \(\upgamma \) is an expansion.

We recall that, throughout this and the next sections, \(\pmb {w} = (w_1, \ldots , w_\upnu )\) denotes the fixed vector of free variables occurring in the problem, and that \(\{ w_1,\dots ,w_\upnu , \bot \}\subseteq \mathrm {dom}({\mathfrak s})\), for every store \({\mathfrak s}\) considered here. Moreover, we assume w.l.o.g. that \(w_1,\dots ,w_\upnu \) do not occur in the considered SID \(\mathcal{R}\) and denote by \(\mu \) the number of existential variables in each rule of \(\mathcal{R}\). We refer to Figure 2 for an illustration of the definition below:

Fig. 2. Heap Expansion and Truncation

Definition 12

Let \(\upgamma : \mathcal{L}\rightarrow \mathcal{L}\) be a total mapping. A structure \(({\mathfrak s},{\mathfrak h}')\) is a \(\upgamma \)-expansion (or simply an expansion if \(\upgamma = { i d}\)) of some structure \(({\mathfrak s},{\mathfrak h})\), denoted by \(({\mathfrak s},{\mathfrak h}') \triangleright _{\upgamma } ({\mathfrak s},{\mathfrak h})\), if \({\mathfrak h}: \mathcal{L}\rightarrow \mathcal{L}^\upkappa \), \({\mathfrak h}' : \mathcal{L}\rightarrow \mathcal{L}^{\upkappa +\mu +\upnu }\) and there exist two disjoint heaps, \(\mathrm{main}({\mathfrak h}')\) and \(\mathrm{aux}({\mathfrak h}')\), such that \({\mathfrak h}' = \mathrm{main}({\mathfrak h}') \uplus \mathrm{aux}({\mathfrak h}')\) and the following hold:

  1. for all \(\ell _1,\ell _2 \in \mathrm {dom}(\mathrm{main}({\mathfrak h}'))\), if \(\upgamma (\ell _1) = \upgamma (\ell _2)\) then \(\ell _1 = \ell _2\),

  2. \(\upgamma (\mathrm {dom}(\mathrm{main}({\mathfrak h}'))) = \mathrm {dom}({\mathfrak h})\),

  3. for each \(\ell \in \mathrm {dom}(\mathrm{main}({\mathfrak h}'))\), we have \({\mathfrak h}'(\ell ) = \langle \pmb {a},{\mathfrak s}(\pmb {w}),b_1^{\ell },\dots ,b_{\mu }^{\ell }\rangle \), for some locations \(b_1^{\ell }, \ldots , b_\mu ^{\ell } \in \mathcal{L}\) and \(\upgamma (\pmb {a}) = {\mathfrak h}(\upgamma (\ell ))\),

  4. for each \(\ell \in \mathrm {dom}(\mathrm{aux}({\mathfrak h}'))\), we have \({\mathfrak h}'(\ell ) = {\mathfrak s}(\pmb {\bot })\) and there exists a location \(\ell ' \in \mathrm {dom}(\mathrm{main}({\mathfrak h}'))\) such that \({\mathfrak h}'(\ell ')\) is of the form \(\langle \pmb {a},\pmb {\ell },b_1^{\ell '},\dots ,b_{\mu }^{\ell '}\rangle \) where \(\boldsymbol{\ell }\) is a tuple of locations and \(\ell = b_i^{\ell '}\), for some \(i \in \llbracket 1,\mu \rrbracket \). The element \(\ell '\) is called the connection of \(\ell \) in \({\mathfrak h}'\) and is denoted by \(\mathrm {C}_{{\mathfrak h}'}(\ell )\) (see Footnote 2).

Let \(({\mathfrak s},{\mathfrak h}')\) be a \(\upgamma \)-expansion of \(({\mathfrak s},{\mathfrak h})\) and let \(\ell \in \mathrm {dom}(\mathrm{main}({\mathfrak h}'))\) be a location. Since \(\upnu >0\) and, for all \(i \in \llbracket 1,\upnu \rrbracket \), \({\mathfrak s}(w_i)\) occurs in \({\mathfrak h}'(\ell )\), and since we assume that \({\mathfrak s}(w_i) \ne {\mathfrak s}(\bot )\) for every \(i \in \llbracket 1,\upnu \rrbracket \), necessarily \({\mathfrak h}'(\ell ) \ne {\mathfrak s}(\pmb {\bot })\). This entails that the decomposition \({\mathfrak h}' = \mathrm{main}({\mathfrak h}') \uplus \mathrm{aux}({\mathfrak h}')\) is unique: \(\mathrm{main}({\mathfrak h}')\) and \(\mathrm{aux}({\mathfrak h}')\) are the restrictions of \({\mathfrak h}'\) to the locations \(\ell \) in \(\mathrm {dom}({\mathfrak h}')\) such that \({\mathfrak h}'(\ell ) \ne {\mathfrak s}(\pmb {\bot })\) and \({\mathfrak h}'(\ell ) = {\mathfrak s}(\pmb {\bot })\), respectively. In the following, we shall thus freely use the notations \(\mathrm{main}({\mathfrak h}')\) and \(\mathrm{aux}({\mathfrak h}')\), for arbitrary heaps \({\mathfrak h}'\).

Definition 13

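Given a heap \({\mathfrak h}'\), we denote by \(\mathrm {trunc}({\mathfrak h}')\) the heap \({\mathfrak h}\) defined as follows: \(\mathrm {dom}({\mathfrak h}) = \mathrm {dom}(\mathrm{main}({\mathfrak h}'))\) and for all \(\ell \in \mathrm {dom}({\mathfrak h})\), if \({\mathfrak h}'(\ell ) = (\ell _1, \ldots , \ell _{\upkappa +\upnu +\mu })\), then \({\mathfrak h}(\ell ) = (\ell _1, \ldots , \ell _\upkappa )\).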

Note that, if \({\mathfrak h}= \mathrm {trunc}({\mathfrak h}')\) then \({\mathfrak h}: \mathcal{L}\rightarrow \mathcal{L}^\upkappa \) and \({\mathfrak h}' : \mathcal{L}\rightarrow \mathcal{L}^{\upkappa +\mu +\upnu }\) are heaps of different out-degrees. In the following, we silently assume this fact, to avoid cluttering the notation by explicitly specifying the out-degree of a heap.

Example 14

Assume that \(\mathcal{L}= {\mathbb N}\), \(\upnu = \mu = 1\). Let \({\mathfrak s}\) be a store such that \({\mathfrak s}(w_1) = 0\). We consider:

We have \(({\mathfrak s}, {\mathfrak h}'_1) \triangleright _{{ i d}} ({\mathfrak s}, {\mathfrak h})\) and \(({\mathfrak s}, {\mathfrak h}'_2) \triangleright _{\upgamma } ({\mathfrak s}, {\mathfrak h})\), with . Also, \(\mathrm {trunc}({\mathfrak h}_1') = \{ \langle 1,2\rangle , \langle 2,2\rangle \} = {\mathfrak h}\) and \(\mathrm {trunc}({\mathfrak h}_2') = \{ \langle 1,3\rangle , \langle 2,4\rangle \}\). Note that \({\mathfrak h}\) has out-degree \(\upkappa = 1\), whereas \({\mathfrak h}_1'\) and \({\mathfrak h}_2'\) have out-degree 3. \(\blacksquare \)
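The truncation operation of Definition 13 can be sketched in a few lines of Python. The encoding below (dictionaries for heaps, `None` for \(\bot \)) and the sample heap `h1_prime` are hypothetical: the heap is merely chosen so that its truncation agrees with the value \(\{ \langle 1,2\rangle , \langle 2,2\rangle \}\) stated in Example 14, with \(\upkappa = \upnu = \mu = 1\) and \({\mathfrak s}(w_1) = 0\).

```python
from typing import Dict, Optional, Tuple

BOT = None  # encodes the special value ⊥ (an encoding chosen for this sketch)
Loc = int
Record = Tuple[Optional[Loc], ...]
Heap = Dict[Loc, Record]


def split_main_aux(h_prime: Heap) -> Tuple[Heap, Heap]:
    """main: locations not pointing to (⊥,...,⊥); aux: the remaining ones."""
    main = {l: r for l, r in h_prime.items() if any(v is not BOT for v in r)}
    aux = {l: r for l, r in h_prime.items() if l not in main}
    return main, aux


def trunc(h_prime: Heap, kappa: int) -> Heap:
    """Drop the aux locations and keep only the first kappa record fields."""
    main, _ = split_main_aux(h_prime)
    return {l: r[:kappa] for l, r in main.items()}


# Hypothetical expanded heap: records have the shape (a, s(w1), b1).
h1_prime = {1: (2, 0, 1), 2: (2, 0, 3), 3: (BOT, BOT, BOT)}
print(trunc(h1_prime, kappa=1))  # {1: (2,), 2: (2,)}
```

Location 3 is auxiliary here: it points to the dummy vector and is reachable through the last field of location 2, which plays the role of its connection.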

Lemma 15

If \(({\mathfrak s}, {\mathfrak h}') \triangleright _{\upgamma } ({\mathfrak s}, {\mathfrak h})\) then \({\mathfrak h}= \upgamma (\mathrm {trunc}({\mathfrak h}'))\), hence \(({\mathfrak s}, {\mathfrak h}') \triangleright _{{ i d}} ({\mathfrak s}, \mathrm {trunc}({\mathfrak h}'))\).

The converse of Lemma 15 does not hold in general, but it holds under some additional conditions:

Lemma 16

Consider a store \({\mathfrak s}\), let \({\mathfrak h}'\) be a heap and let \({\mathfrak h}= \mathrm {trunc}({\mathfrak h}')\). Let \(D_1 = \mathrm {dom}(\mathrm{main}({\mathfrak h}'))\) and \(D_2 = \mathrm {dom}(\mathrm{aux}({\mathfrak h}'))\). Assume that:

  1. for every location \(\ell \in D_1\), \({\mathfrak h}(\ell )\) is of the form \((\ell _1,\dots ,\ell _\upkappa )\) and \({\mathfrak h}'(\ell )\) is of the form \((\ell _1,\dots ,\ell _\upkappa ,{\mathfrak s}(\pmb {w}),\ell _1',\dots ,\ell _\mu ')\);

  2. every location \(\ell \in D_2\) has a connection in \({\mathfrak h}'\).

Then \(({\mathfrak s},{\mathfrak h}') \triangleright _{{ i d}} ({\mathfrak s},{\mathfrak h})\).
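The two hypotheses of Lemma 16 are easy to test mechanically. The sketch below assumes that \(D_1\) collects the locations of \({\mathfrak h}'\) that do not point to the dummy vector and \(D_2\) those that do, and it takes \({\mathfrak h}\) to be the truncation of \({\mathfrak h}'\), so that the first half of condition 1 holds by construction. The encoding (`None` for \(\bot \)) and the function name are hypothetical.

```python
from typing import Dict, Optional, Sequence, Tuple

BOT = None  # encodes ⊥ (an encoding chosen for this sketch)
Heap = Dict[int, Tuple[Optional[int], ...]]


def lemma16_hypotheses(s_w: Sequence[int], h_prime: Heap,
                       kappa: int, mu: int) -> bool:
    """Check conditions 1 and 2 of Lemma 16 for the heap h_prime.

    s_w is the tuple s(w_1), ..., s(w_nu) of values of the free variables.
    """
    nu = len(s_w)
    d1 = {l for l, r in h_prime.items() if any(v is not BOT for v in r)}
    d2 = set(h_prime) - d1
    # Condition 1: records of D1 have kappa + nu + mu fields, and the
    # nu fields following the first kappa ones are exactly s(w).
    cond1 = all(
        len(h_prime[l]) == kappa + nu + mu
        and h_prime[l][kappa:kappa + nu] == tuple(s_w)
        for l in d1
    )
    # Condition 2: every location of D2 occurs among the last mu fields
    # of the record of some location in D1 (its connection).
    cond2 = all(
        any(l in h_prime[m][kappa + nu:] for m in d1)
        for l in d2
    )
    return cond1 and cond2


good = {1: (2, 0, 1), 2: (2, 0, 3), 3: (BOT, BOT, BOT)}
bad = {1: (2, 0, 1), 3: (BOT, BOT, BOT)}   # location 3 has no connection
```

With \(\upkappa = \mu = 1\) and \({\mathfrak s}(w_1) = 0\), the heap `good` satisfies both hypotheses, whereas `bad` violates the second one.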

4.2 Transforming the Consequent

We first describe the transformation for the right-hand side of the entailment problem, as this transformation is simpler.

Definition 17

We associate each n-ary predicate \(p\in \mathcal{P}_{r}\) with a new predicate \(\widehat{p}\) of arity \(n + \upnu \). We denote by \(\widehat{\upalpha }\) the formula obtained from \(\upalpha \) by replacing every predicate atom \(p(x_1,\dots ,x_n)\) by \(\widehat{p}(x_1,\dots ,x_n,\pmb {w})\), where \(\pmb {w} = (w_1, \ldots , w_\upnu )\).
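The transformation of Definition 17 is purely syntactic, as the following Python sketch illustrates. Representing a predicate atom as a pair (name, arguments) and writing \(\widehat{p}\) as the string `"p^"` are conventions assumed for this illustration only.

```python
from typing import List, Sequence, Tuple

Atom = Tuple[str, Tuple[str, ...]]  # (predicate name, argument variables)


def hat_atom(atom: Atom, w: Sequence[str]) -> Atom:
    """Replace p(x1,...,xn) by p^(x1,...,xn,w1,...,w_nu)."""
    name, args = atom
    return (name + "^", args + tuple(w))


def hat_formula(atoms: List[Atom], w: Sequence[str]) -> List[Atom]:
    """Apply the transformation to every predicate atom of a formula."""
    return [hat_atom(a, w) for a in atoms]


# The matrix p(x, w1) of a formula such as ∃x . p(x, w1), with nu = 1.
psi = [("p", ("x", "w1"))]
print(hat_formula(psi, ["w1"]))  # [('p^', ('x', 'w1', 'w1'))]
```

The arity grows from \(n\) to \(n + \upnu \), and the added arguments are always the same vector \(\pmb {w}\), whatever the original arguments were.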

Definition 18

We denote by \(\widehat{\mathcal{R}}\) the set of rules of the form:

$$\widehat{p}(x_1,\dots ,x_n,\pmb {w}) \Leftarrow x_1 \mapsto (y_1,\dots ,y_\upkappa ,\pmb {w},z_1,\dots ,z_{\mu })\upsigma * \widehat{\uprho }\upsigma * \upxi _I * \upchi _\upsigma $$

where:

  • \(p(x_1,\dots ,x_n) \Leftarrow x_1 \mapsto (y_1,\dots ,y_\upkappa ) * \uprho \) is a rule in \(\mathcal{R}\) with \(p \in \mathcal{P}_{r}\),

  • \(z_1,\dots ,z_\mu \) are variables not occurring in \(\mathrm {fv}(\uprho ) \cup \{ x_1,\dots ,x_n,y_1,\dots ,y_\upkappa ,w_1,\dots ,w_\upnu \}\),

  • \(\upsigma \) is a substitution with \(\mathrm {dom}(\upsigma ) \subseteq \mathrm {fv}(\uprho ) \setminus \{ x_1 \}\) and \(\mathrm {rng}(\upsigma ) \subseteq \{ w_1,\dots ,w_\upnu \}\),

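  • \(\upxi _I = *_{i \in I}\, \underline{\pmb {\bot }}(z_i)\), with \(I \subseteq \{ 1,\dots ,\mu \}\),

  • \(\upchi _\upsigma = *_{x \in \mathrm {dom}(\upsigma )}\, x \approx x\upsigma \).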

We denote by \(\mathcal{R}_r\) the set of rules in \(\widehat{\mathcal{R}}\) that are connected (see Footnote 3).

Note that the free variables \(\pmb {w}\) are added as parameters in the rules above, instead of some arbitrary tuple of fresh variables \(\pmb {\upomega }\), of the same length as \(\pmb {w}\). This is for the sake of conciseness, since these parameters \(\pmb {\upomega }\) will be systematically mapped to \(\pmb {w}\).

Example 19

Assume that \(\uppsi = \exists x~.~p(x,w_1)\), with \(\upnu = 1\), \(\mu = 1\) and \(\uplambda (p) = \{ 2 \}\). Assume also that p is associated with the rule: \(p(u_1,u_2) \Leftarrow u_1 \mapsto u_1 * q(u_2)\). Observe that the rule is \(\uplambda \)-connected, but not connected. Then \(\mathrm {dom}(\upsigma ) \subseteq \left\{ u_2\right\} \), \(\mathrm {rng}(\upsigma ) \subseteq \left\{ w_1\right\} \) and \(I\subseteq \left\{ 1\right\} \), so that \(\widehat{\mathcal{R}}\) contains the following rules:

[Figure d: the four rules (1)–(4) of \(\widehat{\mathcal{R}}\) obtained for this example]

Rules (1) and (2) are not connected, hence do not occur in \(\mathcal{R}_r\). Rules (3) and (4) are connected, hence occur in \(\mathcal{R}_r\). Note that (4) is established, but (3) is not. \(\blacksquare \)

We now relate the SIDs \(\mathcal{R}\) and \(\mathcal{R}_r\) by the following result:

Lemma 20

Let \(\upalpha \) be a formula that is \(\uplambda \)-restricted w.r.t. \(\{ w_1,\dots ,w_\upnu \}\) and contains no points-to atoms, with \(\mathcal{P}(\upalpha ) \subseteq \mathcal{P}_{r}\). Given a store \({\mathfrak s}\) and two heaps \({\mathfrak h}\) and \({\mathfrak h}'\), such that \(({\mathfrak s},{\mathfrak h}') \triangleright _{{ i d}} ({\mathfrak s},{\mathfrak h})\), we have \(({\mathfrak s},{\mathfrak h}') \models _{\mathcal{R}_r} \widehat{\upalpha }\) if and only if \(({\mathfrak s},{\mathfrak h}) \models _{\mathcal{R}} \upalpha \).

4.3 Transforming the Antecedent

We now describe the transformation operating on the left-hand side of the entailment problem. For technical convenience, we make the following assumption:

Assumption 21

We assume that, for every predicate \(p \in \mathcal{P}_{l}\), every rule of the form \(p(x_1,\dots ,x_n) \Leftarrow \uppi \) in \(\mathcal{R}\) and every atom \(q(x'_1,\dots ,x'_m)\) occurring in \(\uppi \), \(x'_1 \not \in \{ x_1,\ldots ,x_n\}\).

This is without loss of generality, because every variable \(x'_1 \in \{x_1,\ldots ,x_n\}\) can be replaced by a fresh variable z, while conjoining the equational atom \(z \approx x'_1\) to \(\uppi \). Note that the obtained SID may no longer be connected, but this is not problematic, because the left-hand side of the entailment is not required to be connected anyway.

Definition 22

We associate each pair \((p,X)\), where \(p \in \mathcal{P}_{l}\), \( ar (p)=n\) and \(X \subseteq \llbracket 1,n \rrbracket \), with a fresh predicate symbol \({p}_{X}\), such that \( ar ({p}_{X}) = n + \upnu \). A decoration of a formula \(\upalpha \) containing no points-to atoms, such that \(\mathcal{P}(\upalpha ) \subseteq \mathcal{P}_{l}\), is a formula obtained by replacing each predicate atom \(\beta = q(y_1,\dots ,y_m)\) in \(\upalpha \) by an atom of the form \({q}_{X_\beta }(y_1,\dots ,y_m,\pmb {w})\), with \(X_\beta \subseteq \llbracket 1,m \rrbracket \). The set of decorations of a formula \(\upalpha \) is denoted by \(D(\upalpha )\).

The role of the set X in a predicate atom \(p_X(x_1,\ldots ,x_n,\pmb {w})\) will be explained below. Note that the set of decorations of a formula \(\upalpha \) is always finite.
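Finiteness of \(D(\upalpha )\) can be seen by direct enumeration: each atom of arity \(m\) admits \(2^m\) decorating sets, and the choices for distinct atoms are independent. The following Python sketch enumerates all decorations; the tuple encoding of atoms and the function names are assumptions of this illustration.

```python
from itertools import combinations, product
from typing import FrozenSet, List, Sequence, Tuple

Atom = Tuple[str, Tuple[str, ...]]                       # q(y1,...,ym)
DecoratedAtom = Tuple[str, FrozenSet[int], Tuple[str, ...]]  # q_X(y1,...,ym,w)


def subsets(m: int) -> List[FrozenSet[int]]:
    """All subsets of the index set {1,...,m}."""
    indices = range(1, m + 1)
    return [frozenset(c) for k in range(m + 1)
            for c in combinations(indices, k)]


def decorations(atoms: List[Atom], w: Sequence[str]) -> List[List[DecoratedAtom]]:
    """Every way of decorating each atom q(y1..ym) as q_X(y1..ym, w)."""
    choices = [
        [(name, X, args + tuple(w)) for X in subsets(len(args))]
        for name, args in atoms
    ]
    return [list(d) for d in product(*choices)]


# p(x, y) * q(z) with nu = 1: 2^2 * 2^1 = 8 decorations.
alpha = [("p", ("x", "y")), ("q", ("z",))]
all_decorations = decorations(alpha, ["w1"])
print(len(all_decorations))  # 8
```

The count is the product of \(2^{m}\) over the atoms of the formula, which is exponential in the total number of argument positions.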

Definition 23

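We denote by \(D(\mathcal{R})\) the set of rules of the form

$${p}_{X}(x_1,\dots ,x_n,\pmb {w}) \Leftarrow x_1 \mapsto (y_1,\dots ,y_\upkappa ,\pmb {w},z_1,\dots ,z_{\mu })\upsigma * \uprho ' * {*}_{i \in I}\, \underline{\pmb {\bot }}(z_i)$$

where: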

  • \(p(x_1,\dots ,x_n) \Leftarrow x_1 \mapsto (y_1,\dots ,y_\upkappa ) * \uprho \) is a rule in \(\mathcal{R}\) and \(X \subseteq \llbracket 1,n \rrbracket \);

  • \(\{ z_1,\dots ,z_\mu \} = (\mathrm {fv}(\uprho ) \cup \{ y_1,\dots ,y_\upkappa \}) \setminus \{ x_1,\dots ,x_n \}\),

  • \(\upsigma \) is a substitution, with \(\mathrm {dom}(\upsigma ) \subseteq \{ z_1,\dots ,z_\mu \}\) and \(\mathrm {rng}(\upsigma ) \subseteq \{ x_1,\dots ,x_n,w_1,\dots ,w_\upnu ,z_1,\dots ,z_\mu \}\);

  • \(\uprho '\) is a decoration of \(\uprho \upsigma \);

  • \(I \subseteq \{ 1,\dots ,\mu \}\) and \(z_i \not \in \mathrm {dom}(\upsigma )\), for all \(i \in I\).

Lemma 24

Let \(\upalpha \) be a formula containing no points-to atom, with \(\mathcal{P}(\upalpha ) \subseteq \mathcal{P}_{l}\), and let \(\upalpha '\) be a decoration of \(\upalpha \). If \(({\mathfrak s},{\mathfrak h}') \models _{D(\mathcal{R})} \upalpha '\) and \(({\mathfrak s},{\mathfrak h}') \triangleright _{{ i d}} ({\mathfrak s},{\mathfrak h})\), then \(({\mathfrak s},{\mathfrak h}) \models _{\mathcal{R}} \upalpha \).

At this point, the set X for predicate symbol \(p_X\) is of little interest: atoms are simply decorated with arbitrary sets. However, we shall restrict the considered rules in such a way that for every model \(({\mathfrak s},{\mathfrak h})\) of an atom \({p}_{X}(x_1,\ldots ,x_{n+\upnu })\), with \(n = ar (p)\), the set X denotes a set of indices \(i \in \llbracket 1,n \rrbracket \) such that \({\mathfrak s}(x_i) \in \mathrm {dom}({\mathfrak h})\). In other words, X will denote a set of formal parameters of \({p}_{X}\) that are allocated in every model of \({p}_{X}\).

Definition 25

Given a formula \(\upalpha \), we define the set \( Alloc (\upalpha )\) as follows: \(x \in Alloc (\upalpha )\) iff \(\upalpha \) contains either a points-to atom of the form \(x \mapsto (y_1, \dots , y_{\upkappa +\mu +\upnu })\), or a predicate atom \({q}_{X}(x'_1,\dots ,x'_{m+\upnu })\) with \(x'_i = x\) for some \(i \in X\).

Note that, in contrast with Definition 1, we do not consider that \(x \in Alloc (\upalpha )\), for those variables x related to a variable from \( Alloc (\upalpha )\) by equalities.
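Definition 25 is a purely syntactic check, as the following Python sketch shows. The encoding of atoms as tagged tuples (`"pto"` for points-to atoms, `"pred"` for decorated predicate atoms, `"eq"` for equalities) is an assumption made for this illustration.

```python
from typing import List, Set, Tuple

# Hypothetical encodings:
#   ("pto", x, (y1, ..., yk))          for x -> (y1, ..., yk)
#   ("pred", q, X, (x1, ..., xm+nu))   for q_X(x1, ..., xm+nu)
#   ("eq", x, y)                       for x ≈ y
Atom = Tuple


def alloc(formula: List[Atom]) -> Set[str]:
    """Variables syntactically allocated by a formula (no equality closure)."""
    result: Set[str] = set()
    for atom in formula:
        if atom[0] == "pto":
            result.add(atom[1])                      # left-hand side of ->
        elif atom[0] == "pred":
            _, _, X, args = atom
            result.update(args[i - 1] for i in X)    # positions listed in X
    return result


pi = [
    ("pto", "x1", ("y", "w1", "z1")),
    ("pred", "q", {1}, ("y", "w1")),
    ("eq", "z1", "y"),
]
print(sorted(alloc(pi)))  # ['x1', 'y']
```

In line with the remark above, `z1` is not reported as allocated even though the formula contains the equality between `z1` and the allocated variable `y`.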

Definition 26

A rule \({p}_{X}(x_1,\dots ,x_{n+\upnu }) \Leftarrow \uppi \) in \(D(\mathcal{R})\), where \(n = ar (p)\) and \(\uppi = x_1 \mapsto (y_1,\dots ,y_\upkappa ,\pmb {w},z_1,\dots ,z_\mu ) *\uprho '\), is well-defined if the following conditions hold:

  1. 1.

    \(\{ x_1 \} \subseteq Alloc ({p}_{X}(x_1,\dots ,x_{n+\upnu })) \subseteq Alloc (\uppi )\);

  2. 2.

    \(\mathrm {fv}(\uppi ) \subseteq Alloc (\uppi ) \cup \{ x_1,\dots ,x_{n+\upnu }\}\).

We denote by \(\mathcal{R}_l\) the set of well-defined rules in \(D(\mathcal{R})\).

We first state an important property of \(\mathcal{R}_l\).

Lemma 27

Every rule in \(\mathcal{R}_l\) is progressing, connected and established.

We now relate the systems \(\mathcal{R}\) and \(\mathcal{R}_l\) by the following result:

Definition 28

A store \({\mathfrak s}\) is quasi-injective if, for all \(x,y\in \mathrm {dom}({\mathfrak s})\), the implication \({\mathfrak s}(x) = {\mathfrak s}(y) \Rightarrow x = y\) holds whenever \(\{ x,y \} \not \subseteq \{ w_1,\dots ,w_\upnu \}\).
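Quasi-injectivity only tolerates collisions between the variables \(w_1,\dots ,w_\upnu \) themselves. A minimal Python sketch of the check, assuming stores are encoded as dictionaries from variable names to locations (the function name is hypothetical):

```python
from typing import Dict, Sequence


def is_quasi_injective(store: Dict[str, int], w: Sequence[str]) -> bool:
    """True iff two distinct variables share a value only when both are in w."""
    ws = set(w)
    names = list(store)
    for i, x in enumerate(names):
        for y in names[i + 1:]:
            # A collision is allowed only if {x, y} is included in {w1..w_nu}.
            if store[x] == store[y] and not ({x, y} <= ws):
                return False
    return True


print(is_quasi_injective({"w1": 0, "w2": 0, "x": 1}, ["w1", "w2"]))  # True
print(is_quasi_injective({"w1": 0, "x": 0}, ["w1"]))                 # False
```

In the second call the variable `x` shares its value with `w1`, and since `x` is not one of the free variables of the problem the store is rejected.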

Lemma 29

Let L be an infinite subset of \(\mathcal{L}\). Consider a formula \(\upalpha \) containing no points-to atom, with \(\mathcal{P}(\upalpha ) \subseteq \mathcal{P}_{l}\), and let \(({\mathfrak s},{\mathfrak h})\) be an \(\mathcal{R}\)-model of \(\upalpha \), where \({\mathfrak s}\) is quasi-injective, and \((\mathrm {rng}({\mathfrak s}) \cup \mathrm {loc}({\mathfrak h})) \cap L = \emptyset \). There exists a decoration \(\upalpha '\) of \(\upalpha \), a heap \({\mathfrak h}'\) and a mapping \(\upgamma : \mathcal{L}\rightarrow \mathcal{L}\) such that:

  • \(({\mathfrak s},{\mathfrak h}') \triangleright _{\upgamma } ({\mathfrak s},{\mathfrak h})\),

  • if \(\ell \not \in L\) then \(\upgamma (\ell ) = \ell \),

  • \(\mathrm {loc}({\mathfrak h}') \setminus \mathrm {rng}({\mathfrak s}) \subseteq L\),

  • \(\mathrm {dom}(\mathrm{aux}({\mathfrak h}')) \subseteq L\) and

  • \(({\mathfrak s},{\mathfrak h}') \models _{\mathcal{R}_l} \upalpha '\).

Furthermore, if \({\mathfrak s}(u) \in \mathrm {dom}({\mathfrak h}') \setminus \{{\mathfrak s}(w_i) \mid 1 \le i \le \upnu \}\) then \(u \in Alloc (\upalpha ')\).

4.4 Transforming Entailments

We define \({\widehat{\mathcal{R}}} = \mathcal{R}_l \cup \mathcal{R}_r\). We show that the instance \(\upphi \vdash _{\mathcal{R}} \uppsi \) of the safe entailment problem can be solved by considering an entailment problem on \({\widehat{\mathcal{R}}}\) involving the elements of \(D(\upphi )\) (see Definition 22). Note that the rules from \(\mathcal{R}_l\) are progressing, connected and established, by Lemma 27, whereas the rules from \(\mathcal{R}_r\) are progressing and connected, by Definition 18. Hence, each entailment problem \(\upphi ' \vdash _{{\widehat{\mathcal{R}}}} \widehat{\uppsi }\), where \(\upphi ' \in D(\upphi )\), is progressing, connected and left-established.

Lemma 30

\(\upphi \models _{\mathcal{R}} \uppsi \) if and only if \(\bigvee _{\upphi ' \in D(\upphi )} \upphi ' \models _{{\widehat{\mathcal{R}}}} \widehat{\uppsi }\).

Proof

“\(\Rightarrow \)” Assume that \(\upphi \models _{\mathcal{R}} \uppsi \) and let \(\upphi '\in D(\upphi )\) be a formula, \(({\mathfrak s},{\mathfrak h}')\) be an \({\widehat{\mathcal{R}}}\)-model of \(\upphi '\) and \({\mathfrak h}= \mathrm {trunc}({\mathfrak h}')\). By construction, \(({\mathfrak s},{\mathfrak h}')\) is an \(\mathcal{R}_l\)-model of \(\upphi '\). By definition of \(D(\upphi )\), \(\upphi '\) is a decoration of \(\upphi \). Let \(D_1 = \mathrm {dom}(\mathrm{main}({\mathfrak h}'))\), \(D_2 = \mathrm {dom}(\mathrm{aux}({\mathfrak h}'))\), and consider a location \(\ell \in \mathrm {dom}({\mathfrak h}')\). By definition, \(\ell \) must be allocated by some rule in \(\mathcal{R}_l\). If \(\ell \) is allocated by a rule of the form given in Definition 23, then necessarily \({\mathfrak h}'(\ell )\) is of the form \((\ell _1,\dots ,\ell _\upkappa ,{\mathfrak s}(\pmb {w}),\ell _1',\dots ,\ell '_\mu )\) and \(\ell \in D_1\). Otherwise, \(\ell \) is allocated by the predicate \(\underline{\pmb {\bot }}\) and we must have \(\ell \in D_2\) by definition of the only rule for \(\underline{\pmb {\bot }}\). Since this predicate must occur within a rule of the form given in Definition 23, \(\ell \) necessarily occurs in the \(\mu \) last components of the image of a location in \(D_1\), hence admits a connection in \({\mathfrak h}'\). Consequently, by Lemma 16, \(({\mathfrak s}, {\mathfrak h}') \triangleright _{{ i d}} ({\mathfrak s}, {\mathfrak h})\), and by Lemma 24, \(({\mathfrak s},{\mathfrak h}) \models _{\mathcal{R}} \upphi \). Thus \(({\mathfrak s},{\mathfrak h}) \models _{\mathcal{R}} \uppsi \), and by Lemma 20, \(({\mathfrak s},{\mathfrak h}') \models _{\mathcal{R}_r} \widehat{\uppsi }\), thus \(({\mathfrak s},{\mathfrak h}') \models _{{\widehat{\mathcal{R}}}} \widehat{\uppsi }\).

“\(\Leftarrow \)” Assume that \(\bigvee _{\upphi ' \in D(\upphi )} \upphi ' \models _{{\widehat{\mathcal{R}}}} \widehat{\uppsi }\) and let \(({\mathfrak s},{\mathfrak h})\) be an \(\mathcal{R}\)-model of \(\upphi \). Since the truth values of \(\upphi \) and \(\uppsi \) depend only on the variables in \(\mathrm {fv}(\upphi ) \cup \mathrm {fv}(\uppsi )\), we may assume, w.l.o.g., that \({\mathfrak s}\) is quasi-injective. Consider an infinite set \(L\subseteq \mathcal{L}\) such that \((\mathrm {rng}({\mathfrak s}) \cup \mathrm {loc}({\mathfrak h})) \cap L = \emptyset \). By Lemma 29, there exist a heap \({\mathfrak h}'\), a mapping \(\upgamma :\mathcal{L}\rightarrow \mathcal{L}\) and a decoration \(\upphi '\) of \(\upphi \) such that \(\upgamma (\ell ) = \ell \) for all \(\ell \notin L\), \(({\mathfrak s},{\mathfrak h}') \triangleright _{\upgamma } ({\mathfrak s},{\mathfrak h})\) and \(({\mathfrak s},{\mathfrak h}')\models \upphi '\). Since \(\mathrm {rng}({\mathfrak s}) \cap L =\emptyset \), we also have \(\upgamma ({\mathfrak s}) = {\mathfrak s}\). By the hypothesis, \(({\mathfrak s},{\mathfrak h}') \models \widehat{\uppsi }\). Let \({\mathfrak h}_1 = \mathrm {trunc}({\mathfrak h}')\). Since \(({\mathfrak s},{\mathfrak h}') \triangleright _{\upgamma } ({\mathfrak s},{\mathfrak h})\), by Lemma 15 we have \(({\mathfrak s},{\mathfrak h}') \triangleright _{{ i d}} ({\mathfrak s},{\mathfrak h}_1)\), and by Lemma 20, \(({\mathfrak s},{\mathfrak h}_1) \models \uppsi \). By Lemma 15 we have \({\mathfrak h}= \upgamma ({\mathfrak h}_1)\). Since \(\uppsi \) is \(\uplambda \)-restricted w.r.t. \(\{ w_1,\dots ,w_\upnu \}\), we deduce by Lemma 10 that \(({\mathfrak s},{\mathfrak h}) \models \uppsi \).    \(\square \)

This leads to the main result of this paper:

Theorem 31

The safe entailment problem is 2EXPTIME-complete.

Proof

The 2EXPTIME-hardness lower bound follows from [8, Theorem 32], as the class of progressing, \(\uplambda \)-connected and \(\uplambda \)-restricted entailment problems is a subset of the safe entailment class. For the 2EXPTIME membership, Lemma 30 describes a many-one reduction to the progressing, connected and left-established class, shown to be in 2EXPTIME, by Theorem 4. Considering an instance \(\mathfrak {P}= \upphi \vdash _{\mathcal{R}} \uppsi \) of the safe class, Lemma 30 reduces this to checking the validity of \(|D(\upphi )|\) instances of the form \(\upphi ' \vdash _{{\widehat{\mathcal{R}}}} \widehat{\uppsi }\), that are all progressing, connected and left-established, by Lemma 27. Since a formula \(\upphi ' \in D(\upphi )\) is obtained by replacing each predicate atom \(p(x_1,\ldots ,x_n)\) of \(\upphi \) by \(p_X(x_1,\ldots ,x_n,\pmb {w})\) and there are at most \(2^n\) choices for the set \(X \subseteq \llbracket 1,n \rrbracket \), it follows that \(|D(\upphi )| = 2^{\mathcal {O}(\mathrm {w}(\mathfrak {P}))}\). To obtain 2EXPTIME-membership of the problem, it is sufficient to show that each of the progressing, connected and left-established instances \(\upphi ' \vdash _{{\widehat{\mathcal{R}}}} \widehat{\uppsi }\) can be built in time \(|\mathfrak {P}| \cdot 2^{\mathcal {O}(\mathrm {w}(\mathfrak {P}) \cdot \log \mathrm {w}(\mathfrak {P}))}\). First, for each \(\upphi ' \in D(\upphi )\), by Definition 22, we have \(|\upphi '| \le |\upphi | \cdot (1 + \upnu ) \le |\upphi | \cdot (1 + \mathrm {w}(\mathfrak {P})) = |\upphi | \cdot 2^{\mathcal {O}(\log \mathrm {w}(\mathfrak {P}))}\). By Definition 17, we have \(|\widehat{\uppsi }| \le |\uppsi | \cdot (1 + \upnu ) = |\uppsi | \cdot 2^{\mathcal {O}(\log \mathrm {w}(\mathfrak {P}))}\). By Definition 23, \(D(\mathcal{R})\) can be obtained by enumeration in time that depends linearly on

$$|D(\mathcal{R})| \le |\mathcal{R}| \cdot 2^\mu \cdot (n+\upnu +\mu )^\upnu \le |\mathcal{R}| \cdot 2^{\mathrm {w}(\mathfrak {P}) + \mathrm {w}(\mathfrak {P}) \cdot \log \mathrm {w}(\mathfrak {P})} \le |\mathfrak {P}| \cdot 2^{\mathcal {O}(\mathrm {w}(\mathfrak {P}) \cdot \log \mathrm {w}(\mathfrak {P}))}$$

This is because the number of sets I is bounded by \(2^\mu \) and the number of substitutions \(\upsigma \) by \((n+\upnu +\mu )^\upnu \), in Definition 23. By Definition 26, checking whether a rule is well-defined can be done in polynomial time in the size of the rule, hence in \(2^{\mathcal {O}(\mathrm {w}(\mathfrak {P}))}\), so the construction of \(\mathcal{R}_l\) takes time \(|\mathfrak {P}| \cdot 2^{\mathcal {O}(\mathrm {w}(\mathfrak {P}) \log \mathrm {w}(\mathfrak {P}))}\). Similarly, by Definition 18, the set \(\widehat{\mathcal{R}}\) is constructed in time

$$|\widehat{\mathcal{R}}| \le |\mathcal{R}| \cdot 2^\mu \cdot \mathrm {w}(\mathfrak {P})^\upnu \le |\mathcal{R}| \cdot 2^{\mathrm {w}(\mathfrak {P})} \cdot 2^{\mathrm {w}(\mathfrak {P}) \cdot \log \mathrm {w}(\mathfrak {P})} \le |\mathfrak {P}| \cdot 2^{\mathcal {O}(\mathrm {w}(\mathfrak {P}) \cdot \log \mathrm {w}(\mathfrak {P}))}$$

Moreover, checking that a rule in \(\widehat{\mathcal{R}}\) is connected can be done in time polynomial in the size of the rule, hence the construction of \(\mathcal{R}_r\) takes time \(|\mathfrak {P}| \cdot 2^{\mathcal {O}(\mathrm {w}(\mathfrak {P})\log \mathrm {w}(\mathfrak {P}))}\). Then the entire reduction takes time \(|\mathfrak {P}| \cdot 2^{\mathcal {O}(\mathrm {w}(\mathfrak {P})\log \mathrm {w}(\mathfrak {P}))}\), which proves the 2EXPTIME upper bound for the safe class of entailments.    \(\square \)
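As a sanity check on the exponent in the bounds above, the arithmetic can be spelled out as follows. We write \(W\) for \(\mathrm {w}(\mathfrak {P})\) and assume, as the estimates in the proof implicitly do, that \(n\), \(\upnu \) and \(\mu \) are each bounded by \(W\) and that \(W \ge 2\); both are assumptions of this remark, and logarithms are taken in base 2.

$$2^\mu \cdot (n+\upnu +\mu )^\upnu \le 2^{W} \cdot (3W)^{W} = 2^{W + W \cdot \log (3W)} \le 2^{3W + W \cdot \log W} \le 2^{4 \cdot W \cdot \log W} = 2^{\mathcal {O}(W \cdot \log W)}$$

The second inequality uses \(\log (3W) \le 2 + \log W\), and the third uses \(\log W \ge 1\). Multiplying by \(|\mathcal{R}| \le |\mathfrak {P}|\) then gives a bound of the form \(|\mathfrak {P}| \cdot 2^{\mathcal {O}(W \cdot \log W)}\).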

5 Conclusion and Future Work

Together with the results of [6, 8, 10, 14], Theorem 31 draws a clear and complete picture concerning the decidability and complexity of the entailment problem in Separation Logic with inductive definitions. The room for improvement in this direction is probably very limited, since Theorem 31 pushes the frontier quite far. Moreover, virtually any further relaxation of the conditions leads to undecidability.

A possible line of future research, which could be relevant for applications, would be to consider inductive rules that simultaneously construct several data structures. Such rules could be useful, for instance, to handle predicates comparing two structures, but it is clear that very strong conditions would be required to ensure decidability. We are also interested in defining effective, goal-directed proof procedures (i.e., sequent or tableaux calculi) for testing the validity of entailment problems. Thanks to the reduction devised in the present paper, it is sufficient to focus on systems that are progressing, connected and left-established. We are also trying to extend the results to entailments with formulæ involving data with infinite domains, either by considering a theory of locations (e.g., arithmetic on addresses), or, more realistically, by considering additional sorts for data.