1 Introduction

Separation logic [Rey02], in the sequel also referred to as SL, extends first-order logic with the separating connectives of conjunction and implication for reasoning about programs which feature the dynamic allocation of variables that are stored at locations of that part of the memory called the ‘heap’. The separating conjunction allows one to specify properties of a partition of the heap into two disjoint sub-heaps. The separating implication (also called ‘the magic wand’) allows one to express properties of disjoint extensions of the heap. Both separating connectives involve a second-order quantification over heaps (which are represented by binary relations).

In this paper we study both the model theory and the proof theory of SL. The standard model of SL (as introduced in [Rey02]) extends the standard model of arithmetic with the so-called ‘points-to’ relation which provides a formalization of the heap in terms of the graph of a finitely-based partial function. This function assigns to each location of the heap its stored value, or is undefined if the location is not allocated. In the standard semantics of SL (here also called weak SL), the domains of heaps are finite, that is, only finitely many locations are allocated. Reasoning about finite heaps however requires an infinitary logic because the logic of finite heaps, and that of finite model theory in general, does not satisfy the compactness property: it is straightforward to express for each natural number that the domain of the heap contains at least that number of elements. It follows that every finite subset of this infinite set of sentences is satisfiable, but clearly no finite heap satisfies the entire set.

To study the general model and proof theory of full SL (Footnote 1) we (1) extend its semantics to arbitrary first-order models and (2) generalize the notion of a heap to a partial function on the underlying domain of the given (first-order) model: no restrictions are imposed on the cardinality of the domain of a heap, in contrast to weak SL, which restricts to finite heaps. Our main model-theoretic results are that in this general setting we can express: (1) finiteness of models, (2) well-foundedness of the points-to relation, and (3) existence of countably infinite and uncountable models. As a consequence we have that full SL satisfies neither compactness nor the downward and upward Löwenheim-Skolem theorems (see [CK13]). Non-compactness implies that there does not exist an effective, sound and complete proof theory for SL. In fact, we will show that the well-foundedness of the points-to relation can already be expressed in full SL using only separating conjunction. Consequently, full SL without separating implication is already non-compact. For full SL without separating implication but in which separating conjunction only occurs positively, the fragment which we call separation logic light (SLL), we do have compactness, but its semantic consequence relation is not compact and therefore also does not allow for an effective, sound and complete proof theory.

The question thus arises whether there exists an alternative interpretation of SL that does allow for an effective, sound and complete proof theory. Clearly, the main complexity of SL stems from the (second-order) quantification over heaps (or sub-heaps, as in the case of the separating conjunction). For second-order logic a sound and complete axiomatization can be obtained by generalizing its semantics by means of so-called general models. Such models extend first-order models with a set of possible interpretations of the second-order variables. For example, instead of interpreting a monadic predicate over all possible subsets of the given first-order domain, a general model restricts its interpretation to a given set of such subsets. This generalization of the semantics of second-order logic allows for a sound and complete axiomatization by restricting to so-called Henkin models. A Henkin model is a general model for second-order logic which additionally satisfies the comprehension axiom

$$ \exists R\forall x_1,\ldots ,x_n (R(x_1,\ldots ,x_n)\leftrightarrow \phi (x_1,\ldots , x_n)) $$

for any second-order formula \(\phi (x_1,\ldots , x_n)\) which does not contain the n-ary relation symbol R. In the arithmetic comprehension axiom \(\phi (x_1,\ldots , x_n)\) is first-order.

Generalizing the semantics of SL accordingly in terms of a given set of possible heaps, which does not necessarily contain all heaps, we can formulate in SL the following version of the arithmetic comprehension axiom

$$ \blacklozenge (\forall x,y ((x\hookrightarrow y)\leftrightarrow \phi (x,y))) $$

which expresses the existence of a heap such that its graph, as denoted by the points-to relation \(\hookrightarrow \), satisfies the ‘pure’ first-order formula \(\phi (x,y)\) (i.e., \(\phi \) does not involve the separation connectives and the points-to relation). The \(\blacklozenge \)-modality (formally defined in Sect. 3) expresses the existence of a heap which satisfies the associated formula. Such an instance of the arithmetic comprehension axiom holds if there exists a heap which is characterized by the formula \(\phi (x,y)\). We cannot generalize this axiom to arbitrary SL formulas because it is not obvious how to avoid contradictions like \( \blacklozenge (\forall x,y ((x\hookrightarrow y)\leftrightarrow \lnot (x\hookrightarrow y))) \). Simply requiring that the points-to relation does not occur in \(\phi (x,y)\) does not work because the separating connectives implicitly refer to it. Therefore, we introduce a new interpretation of SL that restricts the (second-order) quantification to first-order definable heaps. For this new interpretation we introduce a sequent calculus which is sound and complete. The completeness proof is based on the construction of a model for a consistent theory (a theory from which false is not derivable), following [Hen49]. From the completeness proof we further derive that this new interpretation satisfies both compactness and the downward Löwenheim-Skolem theorem. By the seminal theorem of Lindström we then infer that this new interpretation is as expressive as first-order logic.

Related Work. The model theory of SL has been focused mainly on finite heaps. For example, the computability and complexity results in [CYO01] depend on this assumption. Surprisingly, in [BDL12] the authors show that weak SL is as expressive as weak second-order logic [Man96], which is a semantics of second-order logic where quantification is restricted to finite relations. In [DD16] this result is further refined by the restriction to two variables and the separating implication (no separating conjunction) which still is as expressive as weak second-order logic. In [EIP20] the satisfiability problem for SL with k record fields has been studied for finite heaps, but over arbitrary first-order models. A tableaux method for a propositional fragment of SL has been developed in [GM10] which has been proven sound and complete. Extensions to first-order SL are discussed assuming finite heaps. In fact, the tableaux method introduced is based on a labelling mechanism for encoding finite heap structures.

In contrast, when investigating complete proof systems for SL the assumption of the finiteness of heaps has to be dropped, thus allowing for infinite heaps, because, as already observed above, finiteness leads to non-compactness. Our general model theory shows that this generalization of SL, full SL, is also non-compact, and therefore does not allow for a finitary sound and complete logic either. Consequently, to obtain such a logic one either has to syntactically restrict SL or further abstract or generalize its semantics. In [DLM21], for example, a sound and complete sequent calculus is described for a quantifier-free subset of SL. On the other hand, examples of further abstractions and generalizations are [HT16] and [Pym02], and both describe a finitary logic which is sound and complete. In [Pym02], models are based on very general preordered commutative monoids and there is no points-to relation. In [HT16], special commutative monoids called separation algebras are used to give semantics to the separating connectives. The elements of such separation algebras represent heaps as relations on the underlying (first-order) domain. This allows for a standard set-theoretic interpretation of the points-to relation. However, the semantics of separating conjunction is defined in terms of the abstract monoid, and as such is decoupled from the set-theoretic interpretation of the points-to relation. For example, a first-order specification (using plain conjunction) of an enumeration of the elements of the domain of a (finite) heap as a set does not in general correspond to an enumeration using separating conjunction.

A sound and complete axiomatization of the points-to relation in the general context of first-order SL respecting its standard set-theoretic interpretation thus remains a main challenge.

Second-order logic allows for a straightforward translation of the (weak or full) semantics of SL, and one can use second-order logic to reason about validity in SL. This approach is followed for example by the IRIS project [JKJ+18] which formalizes the semantics of weak SL in the higher-order logic of Coq [HH14]. By restricting the semantics of the separating connectives to (first-order) definable heaps, our approach instead transforms a compositional second-order logical description of the semantics of SL into corresponding rules of a standard first-order sequent calculus. The resulting calculus allows us to reason, in a natural manner, in first-order logic about the (hierarchical) heap structures generated by the rules for the separating connectives. As such it does not involve the additional tree structures of the so-called bunched contexts of the sequent calculi of [HT16] and [Pym02]. Also [Kri08] avoids the use of bunched contexts in a modal sequent calculus for propositional SL, which is proven sound. However it is incomplete because it provides limited support for equational reasoning about the modal contexts (so-called ‘worlds’) associated with the SL formulas.

Plan of the Paper. In the next section we introduce the syntax and semantics of full SL. In Sect. 3 we investigate the expressiveness of full SL. Section 4 introduces a restriction of the semantics to definable heaps. In Sect. 5 we introduce the sequent calculus, and discuss soundness and completeness. Finally, in the conclusion section we wrap up, and discuss some future work.

2 Separation Logic

In this section we introduce the syntax of SL and define its classical semantics with respect to arbitrary first-order models. For an intuitive introduction to separation logic, see [Rey05]. Given a first-order signature of function and predicate symbols (Footnote 2) and a countably infinite set of first-order variables \(x,y,z,\ldots \), the first-order terms of this signature are denoted by \(t,t',\ldots \).

We have the following inductive definition of formulas of separation logic.

Definition 1

(Syntax of SL). We define

$$ p \mathrel {::=} (t_1 = t_2) \mid R(t_1,\ldots ,t_n) \mid (\lnot p) \mid (p\wedge q) \mid \exists x (p) \mid (p\mathrel {*}q) \mid (p\mathrel {-*}q) $$

where R is an n-ary relation symbol. As a special case we have the binary ‘points-to’ relation symbol \(\hookrightarrow \) (also called the weak/loose points-to).

Let \(M=(D,I)\) denote a first-order model, where D denotes the non-empty domain and I provides an interpretation of the function and predicate symbols as functions and relations over D. A valuation s assigns elements of the domain D of M to the first-order variables \(x,y,z,\ldots \). We omit the standard inductive definition of the value \(I_s(t)\) of a term t. Given a model \(M=(D,I)\), we denote by \(M,h,s\models p\) that p holds in the model M, under the interpretation \(h\subseteq D\times D \) of the binary relation symbol \(\hookrightarrow \), where h denotes a so-called heap, represented as the graph of a partial function with finite domain.

Definition 2

(Semantics of SL). We have the following main cases.

  • \(M,h,s\models (t\hookrightarrow t')\) if and only if \(\langle I_s(t),I_s(t')\rangle \in h\).

  • \(M,h,s\models (p\mathrel {*}q)\) if and only if \(M,h_1,s\models p\) and \(M,h_2,s\models q\), for some heaps \(h_1,h_2\subseteq D\times D\) such that \(h=h_1\cup h_2\) and \(h_1\perp h_2\).

  • \(M,h,s\models (p\mathrel {-*}q)\) if and only if \(M,h',s\models p\) implies \(M,h\cup h',s\models q\), for all heaps \(h'\subseteq D\times D\) such that \({h\perp h'}\).

The remaining cases follow the Tarski-style semantics of classical logic [Yan01, Table 5.2].

In the above definition we use the set-theoretic operation of union of binary relations as sets of pairs. On the other hand, by \(h_1\perp h_2\) we denote that the domains of the relations \(h_1\) and \(h_2\) are disjoint (Footnote 3). As such, we can introduce the strict/tight points-to relation \(\mapsto \) of SL, defined by \(M,h,s\models t\mapsto t'\) if and only if \(h=\{\langle I_s(t),I_s(t')\rangle \} \), as a derived concept: it can be expressed by \((t\hookrightarrow t')\wedge \forall x,y({(x\hookrightarrow y)} \rightarrow {(x=t\wedge y=t')})\). The concept \(\text{ emp }\) of the empty relation can also be expressed by \(\forall x,y(x\not \hookrightarrow y)\). Intuitionistic SL only allows for the weak/loose points-to relation. The strict version cannot be expressed in intuitionistic SL because of its monotonicity property that the truth of a formula is preserved by extensions of the domain of the heap [Rey00]. In this article we focus on classical separation logic only.
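The three clauses of Definition 2 can be tried out by brute force when the domain is a small finite set. The following is a minimal sketch under that assumption, not part of the formal development: heaps are finite sets of pairs, assertions are Python predicates on heaps with the valuation already fixed, and the helper names (`dom`, `all_heaps`, `points_to`, `sep_conj`, `sep_imp`) are introduced here only for illustration.

```python
from itertools import combinations, product

def dom(h):
    """Domain of a heap given as a set of (location, value) pairs."""
    return {x for (x, _) in h}

def all_heaps(D):
    """All graphs of partial functions on the finite set D (None must not be in D)."""
    D = list(D)
    return [frozenset((x, v) for x, v in zip(D, values) if v is not None)
            for values in product([None] + D, repeat=len(D))]

def points_to(a, b):
    """Loose points-to: the pair (a, b) belongs to the heap."""
    return lambda h: (a, b) in h

def sep_conj(p, q):
    """p * q: some split of h into h1 and h2 satisfies p on h1 and q on h2."""
    def holds(h):
        cells = list(h)
        # h is functional, so any split into h1 and h - h1 has disjoint domains.
        return any(p(frozenset(part)) and q(h - frozenset(part))
                   for r in range(len(cells) + 1)
                   for part in combinations(cells, r))
    return holds

def sep_imp(p, q, D):
    """p -* q: q holds on every extension of h by a disjoint heap satisfying p."""
    heaps = all_heaps(D)
    def holds(h):
        return all(q(h | h2) for h2 in heaps
                   if dom(h).isdisjoint(dom(h2)) and p(h2))
    return holds

# Example over D = {0, 1}: a two-cell heap and two separating conjunctions.
h = frozenset({(0, 1), (1, 0)})
two_cells = sep_conj(points_to(0, 1), points_to(1, 0))
same_cell_twice = sep_conj(points_to(0, 1), points_to(0, 1))
print(two_cells(h), same_cell_twice(h))
```

The second assertion fails on `h` because the single cell `(0, 1)` cannot be placed in both sub-heaps, which is the aliasing behaviour of the separating conjunction described in the text. The enumeration is exponential in the size of the domain and is only meant for experimenting with very small examples.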

Let \((x_i\hookrightarrow -)\) abbreviate \(\exists y(x_i\hookrightarrow y)\). The sentences \(\phi _n\) defined by

$$\exists x_1,\ldots , x_n ( (x_1 \hookrightarrow -) * \ldots * (x_n \hookrightarrow -))$$

then state that there exist at least n allocated elements of the underlying domain of the given first-order model. Note that the semantics of the separating conjunction implies that \(x_i\not =x_j\) for \(i\not =j\). It is also possible to formulate the same property using propositional conjunction instead of separating conjunction by explicitly stating this fact, namely that the variables are not aliases. Now collect all \(\phi _n\) in a set. Clearly, every finite subset of this set of sentences is satisfied by a finite heap, but there does not exist a finite heap satisfying all these sentences. This simple counterexample to compactness provides the basic motivation to study the above semantics of SL extended to unbounded heaps, i.e. heaps which potentially have an infinite domain.
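The formulation with propositional conjunction can be written out as follows. This is one possible rendering, and the name \(\phi '_n\) is introduced here only for illustration:

$$ \phi '_n \;=\; \exists x_1,\ldots , x_n \Bigl ( \bigwedge _{1\le i<j\le n} x_i\not = x_j \;\wedge \; \bigwedge _{1\le i\le n} (x_i \hookrightarrow -) \Bigr ) $$

For \(n=2\) this reads \(\exists x_1,x_2 (x_1\not =x_2\wedge (x_1\hookrightarrow -)\wedge (x_2\hookrightarrow -))\). It holds in the same heaps as \(\phi _2\): if two distinct locations are allocated, the cell of \(x_1\) can be placed in one sub-heap and all remaining cells in the other, and conversely a split into sub-heaps with disjoint domains forces \(x_1\not =x_2\).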

Further, for technical convenience only, we generalize the semantics to arbitrary binary relations. For an arbitrary (binary) relation \(\mathcal {R}\subseteq D\times D\) on the underlying domain D of the given first-order model, we define \(M,\mathcal {R},s\models p\) as above, where the interpretation of the separating connectives ranges over arbitrary subsets of \(D\times D\). In fact, in this generalized semantics, which we call relational SL, we can model the restriction to heaps simply by syntactically restricting the separating implication to assertions of the form \((p\wedge { fun})\mathrel {-*}q\), where \({ fun}\) denotes the assertion \(\forall x,y,z ((x\hookrightarrow y\wedge x\hookrightarrow z) \rightarrow y=z)\). Let \(p'\) denote the result of restricting syntactically all occurrences of the separating implication in p to heaps (as described above). It follows that the evaluation of \(p'\wedge { fun}\) is restricted to heaps.

It is worthwhile to observe here that there exists a straightforward formalization of relational SL in second-order logic. For any formula p as defined above we define inductively the second-order formula p(R), where R is a binary relation.

Definition 3

(Logical formalization of relational SL).

We have the following main cases.

  • \((t\hookrightarrow t')(R)=R(t,t')\),

  • \((p\mathrel {*}q)(R)=\exists R_1,R_2( R=R_1\uplus R_2\wedge p(R_1)\wedge q(R_2))\),

  • \((p\mathrel {-*}q)(R)=\forall R_1,R_2(( R_2= R_1\uplus R \wedge p(R_1)) \rightarrow q(R_2))\).

Here we denote by \(R=R_1\uplus R_2\), for any binary relation symbols \(R,R_1,R_2\), the conjunction of the formulas \(\forall x,y (R(x,y)\leftrightarrow (R_1(x,y)\vee R_2(x,y)))\) and \(\forall x,y,z( \lnot R_1(x,y)\vee \lnot R_2(x,z))\). We denote by \(M,s\models \phi \) the standard truth definition of a second-order formula \(\phi \), where the evaluation s additionally interprets the second-order variables. Correctness of this translation, that is, \(M,\mathcal {R},s\models p\) if and only if \({M,s[R:=\mathcal {R}]}\models p(R)\) (where \(s[R:=\mathcal {R}]\) denotes the update of s which assigns to the binary variable R the relation \(\mathcal {R}\)), can be established by a straightforward induction on p.

3 Model Theory: Compactness and Countability

To explore the general model theory of SL we introduce the modalities \(\blacksquare p\) and \(\Box p\) as abbreviations of \( \text{ true }\mathrel {*}(\text{ emp }\wedge (\text{ true }\mathrel {-*}p)) \) and \(\lnot (\text{ true }\mathrel {*}\lnot p)\), respectively (Footnote 4). For \(M=(D,I)\) we have \(M,\mathcal {R},s\models \blacksquare p\) if and only if \(M,\mathcal {R}',s\models p\), for every \(\mathcal {R}'\subseteq D\times D\). Further, we have \(M,\mathcal {R},s\models \Box p\) if and only if \(M,\mathcal {R}',s\models p\), for every sub-relation \(\mathcal {R}'\) of \(\mathcal {R}\) (that is, \(\mathcal {R}'\subseteq \mathcal {R}\)). By \(\blacklozenge p\) we denote the formula \(\lnot {\blacksquare {\lnot p}}\). It follows that \(M,\mathcal {R},s\models \blacklozenge p\) if and only if \(M,\mathcal {R}',s\models p\), for some \(\mathcal {R}'\subseteq D\times D\).

Characterizing Finite Models. The above \(\blacksquare \)-modality allows us to express that the domain D of a model \(M=(D,I)\) is finite, by asserting that every injective function \(f:D\rightarrow D\) is a surjection: Let \({ inj}\) be the conjunction of the formulas \({ fun}\) (as defined above), \(\forall x,y,z ( (x\hookrightarrow z \wedge y\hookrightarrow z)\rightarrow x=y)\), and \(\forall x\exists y(x\hookrightarrow y)\). We have that \(M,\mathcal {R},s\models { inj}\) if and only if \(\mathcal {R}:D\rightarrow D\) is injective (note that the domain of \(\mathcal {R}\) is D because \(M,\mathcal {R},s\models \forall x\exists y(x\hookrightarrow y)\)). And so \(M,\mathcal {R},s\models \blacksquare ({ inj}\rightarrow \forall x \exists y (y\hookrightarrow x))\) if and only if D is finite. Note that the occurrences of \(\hookrightarrow \) in the scope of the \(\blacksquare \)-modality are universally bound, and the interpretation of \(\hookrightarrow \) thus ranges over all \(\mathcal {R}\subseteq D\times D\).
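On a small finite domain the content of \(\blacksquare ({ inj}\rightarrow \forall x \exists y (y\hookrightarrow x))\) can be checked by enumeration. The sketch below assumes that only relations satisfying \({ inj}\) matter, that is, total injective functions on D, and enumerates exactly those; the function names are hypothetical. On an infinite domain the property fails, for example for the successor function on the natural numbers, which is injective but not surjective.

```python
from itertools import product

def total_functions(D):
    """All total functions D -> D, each as a dict, for a finite D."""
    D = list(D)
    for values in product(D, repeat=len(D)):
        yield dict(zip(D, values))

def is_injective(f):
    """No two arguments share a value."""
    return len(set(f.values())) == len(f)

def is_surjective(f, D):
    """Every element of D has a predecessor under f."""
    return set(f.values()) == set(D)

def injective_implies_surjective(D):
    """Finite analogue of the boxed formula: every injective total function is onto."""
    return all(is_surjective(f, D) for f in total_functions(D) if is_injective(f))

print(injective_implies_surjective(range(3)))
```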

Characterizing Countable Infinity. We next show that countability of the underlying domain of a model can be expressed, using the above two modalities. We will be working with chains related by \(\hookrightarrow \), and in that sense we speak of a predecessor of x, being any y such that \((y\hookrightarrow x)\), and successor of x, being any y such that \((x\hookrightarrow y)\). Let \(\text{ enum }\) be the conjunction of the following formulas:

  • the above formula \({ inj}\),

  • the formula \(\exists ! x \forall y(y\not \hookrightarrow x)\) (Footnote 5), which states the existence of a unique minimal element (that is, an element that has no predecessor),

  • the formula \(\Box (\text{ emp }\vee \exists x((x\hookrightarrow -)\wedge \forall y ((y\hookrightarrow -)\rightarrow (y\not \hookrightarrow x))))\), which expresses that the points-to relation \(\hookrightarrow \) is well-founded.

Note that a relation \(\mathcal {R}\) is well-founded iff every (non-empty) sub-relation of \(\mathcal {R}\) has a minimal element (with respect to that sub-relation). This is the property expressed by the last conjunct of \({ enum}\). Let \(M,\mathcal {R},s\models { enum}\), where as before \(M=(D,I)\). We show that \(\mathcal {R}\) encodes an enumeration \(\langle d_n\rangle _n\) of D. We define the sequence \(\langle d_n\rangle _n\) by induction on n: for \(d_0\) we take the (unique) minimal element, and for \(d_{n+1}\) we take the unique element \(d\in D\) such that \(\langle d_n,d\rangle \in \mathcal {R}\). Note that \({ inj}\) implies that every element of D has a unique ‘successor’ and that \(d_{n+1}\not \in \{d_0,\ldots ,d_n\}\). Well-foundedness ensures that every element of D appears in the enumeration \(\langle d_n\rangle _n\), because otherwise we can construct an infinite descending chain of elements not appearing in the enumeration: since \(d_0\) denotes the unique minimal element with respect to the functional interpretation \(\mathcal {R}\) of \(\hookrightarrow \), it follows that for any \(d\in D\) which does not appear in the enumeration \(\langle d_n\rangle _n\) there exists a \(d'\in D\) which also does not appear in the enumeration \(\langle d_n\rangle _n\) and \(\langle d',d\rangle \in \mathcal {R}\).
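The sub-relation reading of well-foundedness can be illustrated on finite relations, where it amounts to the absence of cycles. The following sketch evaluates the well-foundedness conjunct literally by enumerating all non-empty sub-relations; it does not model \({ enum}\) as a whole, which requires an infinite domain, and the function names are introduced here only for illustration.

```python
from itertools import combinations

def has_minimal_element(R):
    """Some x in the domain of R has no predecessor y with (y, x) in R."""
    domain = {x for (x, _) in R}
    return any(all((y, x) not in R for y in domain) for x in domain)

def box_wellfounded(R):
    """Every non-empty sub-relation of the finite relation R has a minimal element."""
    pairs = list(R)
    for size in range(1, len(pairs) + 1):
        for sub in combinations(pairs, size):
            if not has_minimal_element(set(sub)):
                return False
    return True

chain = {(0, 1), (1, 2), (2, 3)}   # acyclic chain
loop = {(0, 1), (1, 0)}            # two-element cycle
print(box_wellfounded(chain), box_wellfounded(loop))
```

The empty relation passes trivially, matching the \(\text{ emp }\) disjunct of the formula.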

We thus have that \(M,\mathcal {R},s\models { enum}\) implies that the domain of M is countably infinite. The formula \(\blacklozenge { enum}\) further abstracts from the current interpretation of the points-to relation \(\hookrightarrow \), so that if the domain of M is countably infinite then \(M,\mathcal {R},s\models \blacklozenge { enum}\), for arbitrary \(\mathcal {R}\) (and s).

The class of uncountable models is characterized by \(\lnot (\blacklozenge { enum} \vee { fin})\), where \({ fin}\) denotes the above formula which characterizes the class of finite models.

Summarizing, the logic of full SL is neither compact nor does it satisfy the Löwenheim-Skolem theorem because it can distinguish between countable and uncountable models. Further, we observe that the above expressiveness results do not depend on the interpretation of the points-to relation as an arbitrary relation. That is, these results also hold for the semantics restricted to (infinite) heaps.

Interestingly, since we can express that the points-to relation \(\hookrightarrow \) is well-founded (see above), even restricting to the separating conjunction gives rise to non-compactness: given a countably infinite set of individual constants \(c_n\), \(n\ge 0\), let \(\varGamma \) consist of the above formula \(\Box (\text{ emp }\vee \exists x((x\hookrightarrow -)\wedge \forall y ((y\hookrightarrow -)\rightarrow (y\not \hookrightarrow x))))\) and the formulas \(c_{n+1}\hookrightarrow c_n\), \(n\ge 0\). Clearly, every finite subset of \(\varGamma \) is satisfiable but \(\varGamma \) itself is not. Note that we do not need to require that \(c_i\not =c_j\) for all \(i\not =j\), because in case the formulas \(c_{n+1}\hookrightarrow c_n\), \(n\ge 0\), are satisfied and additionally \(c_i=c_j\) holds, for some \(i\not =j\), we have a loop in the interpretation of \(\hookrightarrow \), which also violates well-foundedness. Further, SL restricted to separating conjunction does not satisfy the upward Löwenheim-Skolem theorem either, because, as argued above, \(M,\mathcal {R},s\models { enum}\) implies (infinite) countability of the domain of M.
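To make the satisfiability of finite subsets concrete, here is one possible witness. The abbreviation \({ wf}\) for the well-foundedness formula and the names \(\varGamma _k\) and \(\mathcal {R}_k\) are introduced here only for illustration. Every finite subset of \(\varGamma \) is contained in some

$$ \varGamma _k = \{ { wf}\} \cup \{ c_{n+1}\hookrightarrow c_n \mid 0\le n<k\} $$

and \(\varGamma _k\) is satisfied by taking the natural numbers as domain, interpreting each \(c_n\) as n, and interpreting \(\hookrightarrow \) by

$$ \mathcal {R}_k = \{ \langle n+1,n\rangle \mid 0\le n<k\} $$

Since \(\mathcal {R}_k\) is finite and acyclic, every non-empty sub-relation has a minimal element. For \(\varGamma \) itself, any interpretation satisfying all formulas \(c_{n+1}\hookrightarrow c_n\) contains the non-empty sub-relation \(\{ \langle I(c_{n+1}),I(c_n)\rangle \mid n\ge 0\}\), which has no minimal element because each \(I(c_{n+1})\) has the predecessor \(I(c_{n+2})\).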

Separation Logic Light. What about further restricting to positive occurrences of the separating conjunction? Since we then can push negation inside, this restriction can be formally defined by the following syntax describing SLL (‘separation logic light’):

$$ p \mathrel {::=} (\lnot ) R(t_1,\ldots ,t_n) \mid (p\vee q) \mid (p\wedge q) \mid \exists x (p) \mid \forall x (p) \mid (p\mathrel {*}q) $$

Here R denotes either an n-ary relation symbol or the points-to relation \(\hookrightarrow \). Thus, in this version of SL, negation can only be applied to atomic formulas. To show that the notion of satisfiability of SLL is compact, we introduce the following first-order translation p@R, where R is a binary predicate different from \(\hookrightarrow \), \(\circ \) denotes conjunction/disjunction, and Q denotes the existential/universal quantifier.

$$ \begin{array}{ll} (\lnot ) R(t_1,\ldots ,t_n)@R'&{}= (\lnot ) R(t_1,\ldots ,t_n)\\ (t\hookrightarrow t')@R&{}= R(t,t')\\ (p \circ q) @R &{}= p@R\circ q@R\\ Q x (p) @R &{}= Q x (p@R)\\ (p\mathrel {*}q)@R&{}= R=R_1\uplus R_2 \wedge p@R_1\wedge q@R_2 \end{array} $$

The binary relation symbols \(R_1\) and \(R_2\) are ‘fresh’. It follows that p is satisfiable if and only if p@R is satisfiable. More precisely, \(M,\mathcal {R},s\models p\) if and only if there exists a (first-order) model \(M'\) such that \(M',s\models p@R\). Consequently, compactness of first-order logic implies compactness of SLL: Let \(\varGamma \) be an infinite set of formulas of SLL and \(\varGamma '=\{p@R\mid p\in \varGamma \}\) (Footnote 6), for some binary relation symbol R. If every finite subset of \(\varGamma \) is satisfiable, so is every finite subset of \(\varGamma '\). By the compactness of first-order logic \(\varGamma '\) is satisfiable, and so is \(\varGamma \). Along the same lines it follows that if \(\varGamma \) is satisfiable then there exists a model \(M=(D,I)\) such that D is countable and \(M,\mathcal {R},s\models p\), for every \(p\in \varGamma \).
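The translation p@R is purely syntactic and can be sketched as a recursive function. The encoding below is an assumption made for illustration only: formulas are nested tuples, the output is a plain ASCII string in which `(+)` stands for \(\uplus \), `&` and `|` for conjunction and disjunction, and `~` for negation, and fresh symbols are named `R1`, `R2`, … on the assumption that these names are not used elsewhere in the formula. The clause for a negated points-to atom follows the same pattern as the positive one.

```python
from itertools import count

def translate(p, R, fresh=None):
    """Return the first-order translation p@R of an SLL formula p as a string."""
    if fresh is None:
        fresh = count(1)          # supplies indices for fresh relation symbols
    tag = p[0]
    if tag == "atom":             # (negated) atom without points-to: unchanged
        return p[1]
    if tag == "pto":              # (t -> t')@R = R(t, t')
        return f"{R}({p[1]},{p[2]})"
    if tag == "not_pto":          # negated points-to, assumed to become ~R(t, t')
        return f"~{R}({p[1]},{p[2]})"
    if tag in ("and", "or"):
        op = " & " if tag == "and" else " | "
        return "(" + translate(p[1], R, fresh) + op + translate(p[2], R, fresh) + ")"
    if tag in ("exists", "forall"):
        return f"{tag} {p[1]}.({translate(p[2], R, fresh)})"
    if tag == "sep":              # (p * q)@R splits R into two fresh symbols
        R1 = f"R{next(fresh)}"
        R2 = f"R{next(fresh)}"
        return (f"({R} = {R1} (+) {R2} & {translate(p[1], R1, fresh)}"
                f" & {translate(p[2], R2, fresh)})")
    raise ValueError(f"unknown formula tag: {tag}")

print(translate(("sep", ("pto", "x", "y"), ("pto", "y", "x")), "R"))
```

Because separating conjunction occurs only positively, the fresh symbols can be left free in the result and are interpreted by the first-order model, which is why no second-order quantifier appears in the output.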

Note however that compactness of the satisfiability relation does not imply that the (semantic) consequence relation is compact. In fact, non-compactness of the consequence relation for SLL follows directly from the above argument involving well-founded relations: Let \(\varGamma \) denote the set of formulas \(c_{n+1}\hookrightarrow c_n\), \(n\ge 0\). It follows that \(\varGamma \models \text{ true }\mathrel {*}(\lnot \text{ emp } \wedge \forall x((x\hookrightarrow -)\rightarrow \exists y (y\hookrightarrow x)))\). But clearly, there does not exist a finite subset \(\varGamma _0\) of \(\varGamma \) such that \(\varGamma _0\models \text{ true }\mathrel {*}(\lnot \text{ emp } \wedge \forall x((x\hookrightarrow -)\rightarrow \exists y (y\hookrightarrow x)))\).

Some Open Problems. The question remains whether restricting to separating conjunction satisfies the downward Löwenheim-Skolem theorem. A counterexample to the downward Löwenheim-Skolem theorem would be the expressibility of uncountable models. This seems to require the \(\blacksquare p\) modality (and thus the separating implication).

Another interesting question is whether we can express finiteness of the domain of the current interpretation of the points-to relation, that is, does there exist a formula p in SL such that \(M,\mathcal {R},s\models p\) if and only if the domain of the relation \(\mathcal {R}\) is finite?

A main open problem is a formalization of the relation between full SL and second-order logic. Intuitively, one of the main differences is the local perspective of SL, which is determined by the current heap. Remarkably, as already mentioned in the introduction, [BDL12] presents a rather intricate encoding of (dyadic) weak second-order logic into weak SL. Apparently this restriction to finite heaps makes it possible to break the local perspective. Our conjecture however is that full SL is strictly less expressive than (dyadic) second-order logic. To illustrate how subtle this difference may be, consider the following extension of separation logic with a binding operator \({\downarrow \!R }(p)\) which binds the binary variable R in the evaluation of p to the current interpretation of the points-to relation. In other words, it corresponds to a bounded (second-order) quantification \(\exists R ((R={\hookrightarrow }) \wedge p)\), where \(R={\hookrightarrow }\) abbreviates the first-order formula \(\forall x,y (R(x,y)\leftrightarrow (x\hookrightarrow y))\). Alternatively, we can directly define \(M,\mathcal {R},s\models {\downarrow \! R}(p)\) if and only if \(M,\mathcal {R},s[R:=\mathcal {R}]\models p \). This definition thus assumes an extension of the valuation s to (binary) second-order variables. The expressive power of this binding operator lies in the fact that it allows one to ‘break the spell’ of the local perspective, since the bound binary variable allows in the local context of the current interpretation of the points-to relation to refer to those ‘outer’ ones that have generated it (by the separating connectives). This extension of SL allows for a simple, compositional translation of (dyadic) second-order logic. We have the following main case which translates \(\exists R (\phi )\), where \(\phi \) is a dyadic second-order formula (which is assumed not to contain occurrences of the points-to relation of SL), into the SL formula \( \blacklozenge (\downarrow \! R (p))\).

4 Separation Logic of Definable Binary Relations

In this section we restrict the interpretation of the separating connectives to first-order definable binary relations. By \(\phi \) we now denote a first-order formula which does not contain occurrences of the points-to relation \(\hookrightarrow \) of SL. We omit the standard inductive truth definition \(M,s\models \phi \) of a first-order formula \(\phi \).

By \(\phi (x_1,\ldots ,x_n)\) we denote that the free (first-order) variables of \(\phi \) are among the distinct variables \(x_1,\ldots ,x_n\). A formula \(\phi (x,y)\) is called a binary formula. A binary formula is also simply denoted by \(\phi \), omitting its free variables x and y. Given a model \(M=(D,I)\), and a first-order formula \(\phi (x,y)\), we denote by \({ Rel}_{M}(\phi )\) the relation \(\{\langle s(x),s(y)\rangle \mid M,s\models \phi \}\subseteq D\times D\). Note that the evaluation of \(\phi (x,y)\) only depends on the values of its free variables x and y, that is, \(M,s\models \phi \) if and only if \(M,s'\models \phi \), where \(s(x)=s'(x)\) and \(s(y)=s'(y)\). By \(\phi (t,t')\) we denote the result of replacing in \(\phi (x,y)\) the variables x and y by t and \(t'\), respectively (if necessary renaming bound variables to ensure that the variables of t and \(t'\) do not become bound).

Definition 4

(First-order definability). Given a model \({M=(D,I)}\), a relation \({\mathcal {R}\subseteq D\times D}\) is first-order definable if \(\mathcal {R}={ Rel}_{M}(\phi )\), for some binary formula \(\phi (x,y)\).

Note that, given a model \(M=(D,I)\), \(I(R)={ Rel}_{M}(R)\), that is, for any binary relation symbol R its interpretation I(R) is trivially a first-order definable relation. We generalize the definition of \(R=R_1\uplus R_2\) to arbitrary binary formulas: we denote by \(\phi =\phi _1\uplus \phi _2\) that the binary formulas \(\phi _1(x,y)\) and \(\phi _2(x,y)\) represent a partition of the binary formula \(\phi (x,y)\) which is expressed by the conjunction of \(\forall x,y (\phi (x,y)\leftrightarrow (\phi _1(x,y)\vee \phi _2(x,y)))\) and \(\forall x,y,z( \lnot \phi _1(x,y)\vee \lnot \phi _2(x,z))\). The latter formula, which states that the domains of the binary relations represented by \(\phi _1(x,y)\) and \(\phi _2(x,y)\) are disjoint, we abbreviate by \(\phi _1\perp \phi _2\).
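For a finite model, \({ Rel}_{M}(\phi )\) and the condition \(\phi =\phi _1\uplus \phi _2\) can be computed directly. The sketch below stands in for binary formulas with Python predicates on pairs of domain elements, which is an assumption for illustration only: whether a given predicate is actually first-order definable depends on the signature of the model. The names `rel`, `is_partition`, `succ`, `even_part` and `odd_part` are hypothetical.

```python
def rel(D, phi):
    """Rel_M(phi): the set of pairs of elements of D that satisfy phi."""
    return {(a, b) for a in D for b in D if phi(a, b)}

def is_partition(D, phi, phi1, phi2):
    """phi = phi1 (+) phi2: the relations cover phi and have disjoint domains."""
    R, R1, R2 = rel(D, phi), rel(D, phi1), rel(D, phi2)
    dom1 = {a for (a, _) in R1}
    dom2 = {a for (a, _) in R2}
    return R == R1 | R2 and dom1.isdisjoint(dom2)

D = range(4)
succ = lambda x, y: y == x + 1                        # y is the successor of x
even_part = lambda x, y: succ(x, y) and x % 2 == 0    # cells at even locations
odd_part = lambda x, y: succ(x, y) and x % 2 == 1     # cells at odd locations
print(rel(D, succ), is_partition(D, succ, even_part, odd_part))
```

Note that disjointness is required of the domains, not merely of the sets of pairs, so splitting a relation into itself and itself is rejected even though the union condition holds.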

Fig. 1. Sequent calculus. The binary relation symbols \(R_1, R_2\) and R introduced in the rules \(\textbf{L}_{\mathrel {*}}\) and \(\textbf{R}_{\mathrel {-*}}\) are ‘fresh’. In the points-to rules p denotes a basic formula (which does not contain occurrences of the separating connectives).

In the sequel we denote by \(M,\mathcal {R},s\models p\) the restriction of the relational semantics of full SL (Definition 2 extended to binary relations) such that instead of quantifying over arbitrary binary relations, the separating connectives involve quantification over first-order definable binary relations. It is worthwhile to observe here that, as for Henkin models of second-order logic [Hen50], the implicit second-order quantification depends on the underlying signature of function and relation symbols. Extending or restricting the signature affects the semantics of formulas of the ‘old’ signature.

5 Sequent Calculus

To reason about the implicit quantification over definable (binary) relations, we introduce rooted assertions of the form \(p@\phi \), where \(\phi \) denotes a binary formula and p is a formula of SL (see Definition 1). We define \(M,s\models p@\phi \) if and only if \(M,\mathcal {R},s\models p\), where \(\mathcal {R}={ Rel}_{M}(\phi )\). The variables x and y of the binary formula \(\phi (x,y)\) are thus implicitly bound by the @-operator, that is, \(M,s\models p@\phi \) if and only if \(M,s'\models p@\phi \), for any s and \(s'\) such that \(s(z)=s'(z)\), for every free variable z occurring in p.

Note that the separating connectives are interpreted in terms of relations which are definable by first-order formulas which do not involve the points-to relation \(\hookrightarrow \). This allows for the following alternative predicative definition (Footnote 7) of the semantics of the separating connectives in rooted assertions (used in both the soundness and completeness proofs). Here \(\psi \perp \phi \), for the binary formulas \(\psi (x,y)\) and \(\phi (x,y)\), denotes the formula \(\forall x,y,z( \lnot \psi (x,y)\vee \lnot \phi (x,z))\).

Lemma 1

We have

  • \(M,s\models (p\mathrel {*}q)@\phi \) if and only if there exist binary formulas \(\phi _1\) and \(\phi _2\) such that \(M,s\models \phi =\phi _1\uplus \phi _2\), \(M,s\models p@\phi _1\), and \(M,s\models q@\phi _2\).

  • \(M,s\models (p\mathrel {-*}q)@\phi \) if and only if \(M,s\models \psi \perp \phi \) and \(M, s\models p@\psi \) implies \(M, s\models q@(\phi \vee \psi )\), for all binary formulas \(\psi \).
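The two clauses of Lemma 1 can be mimicked on finite relations. The sketch below is only an illustration under simplifying assumptions: assertions are modelled as Python predicates on relations, the split in the first clause ranges over all sub-relations rather than over first-order definable ones, and the extensions in the second clause range over an explicitly supplied list `candidates`. All names (`sep_conj`, `sep_impl`, `strict`, `pts`) are hypothetical.

```python
from itertools import combinations

def dom(rel):
    """Domain (set of first components) of a relation."""
    return {x for (x, _) in rel}

def subsets(rel):
    """All sub-relations of a finite relation."""
    items = sorted(rel)
    for k in range(len(items) + 1):
        for chosen in combinations(items, k):
            yield frozenset(chosen)

def sep_conj(p, q):
    """(p * q): some domain-disjoint split of rel satisfies p and q."""
    def holds(rel):
        rel = frozenset(rel)
        return any(
            dom(r1).isdisjoint(dom(rel - r1)) and p(r1) and q(rel - r1)
            for r1 in subsets(rel)
        )
    return holds

def sep_impl(p, q, candidates):
    """(p -* q): every candidate extension that is domain-disjoint
    from rel and satisfies p makes q hold on the union."""
    def holds(rel):
        rel = frozenset(rel)
        return all(
            q(rel | ext)
            for ext in map(frozenset, candidates)
            if dom(ext).isdisjoint(dom(rel)) and p(ext)
        )
    return holds

def strict(a, b):
    """Strict points-to: the relation is exactly {(a, b)}."""
    return lambda rel: rel == {(a, b)}

def pts(a, b):
    """Weak points-to: the relation contains (a, b)."""
    return lambda rel: (a, b) in rel

print(sep_conj(strict(1, 0), pts(2, 5))({(1, 0), (2, 5)}))                      # True
print(sep_impl(strict(1, 0), pts(1, 0), [{(1, 0)}, {(2, 5)}, set()])({(2, 5)}))  # True
```

Restricting the quantification to a supplied candidate list is what makes the second clause computable here; it plays the role that the restriction to definable relations plays in the lemma, but it is not the same restriction.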

We now develop a calculus for sequents \( A_1,\ldots , A_n \Rightarrow B_1,\ldots , B_m \), where each \(A_i\), \(i=1,\ldots ,n\), and \(B_j\), \(j=1,\ldots ,m\), is constructed from first-order formulas and rooted assertions, which can be further composed using propositional connectives and quantification of first-order variables. This calculus is an extension of standard first-order sequent calculus (including cut), where the standard rules are applicable with respect to top-level propositional connectives and quantifiers. Figure 1 shows the left and right rules for separating conjunction and implication. These rules closely follow the translation in Definition 3 of SL into second-order logic, eliminating the explicit second-order quantification by applying the standard proof rules for second-order quantification (which themselves are straightforward generalizations of the rules for first-order quantification, instantiating the second-order variables by formulas). The binary relation symbols \(R_1, R_2\) and R introduced in the rules \(\textbf{L}_{\mathrel {*}}\) and \(\textbf{R}_{\mathrel {-*}}\) are ‘fresh’ binary relation symbols, that is, they must not appear in the formulas of the conclusion of the rules.

We also have rules which allow classical reasoning under rooted assertions: \((p\circ q)@\phi \leftrightarrow (p@\phi )\circ (q@\phi )\), where \(\circ \) denotes binary propositional connectives, e.g., conjunction, disjunction, and implication, \((\lnot p)@\phi \leftrightarrow \lnot (p@\phi )\), and \((\exists x p) @\phi \leftrightarrow \exists x (p@\phi )\) (and similarly \((\forall x p)@\phi \leftrightarrow \forall x (p@\phi )\)). Further, we have \(\forall x,y(\phi \leftrightarrow \psi ) \rightarrow (p@\phi \leftrightarrow p@\psi )\). It is straightforward to validate these rules, but we omit the details of the semantics \(M,s\models A\), which follows the standard Tarski-style classical semantics, given the semantics of rooted assertions which may appear in the place of atomic formulas.

In the so-called ‘points-to’ rules of Fig. 1 the formula p does not involve occurrences of the separating connectives. Such a formula of SL we call basic. Note that it differs from pure first-order formulas in that basic formulas additionally may involve the points-to relation. For such formulas we denote by \({p[\phi /\hookrightarrow ]}\), for any binary formula \(\phi (x,y)\), the result of replacing every atomic assertion \((t\hookrightarrow t')\) in p by \(\phi (t,t')\), which is a pure first-order formula. It follows that \(M,s\models p{[\phi /\hookrightarrow ]}\) if and only if \(M,{ Rel}_M(\phi ),s\models p\), for any basic formula p.
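The substitution \(p[\phi /\hookrightarrow ]\) is a simple structural recursion. The following sketch assumes basic formulas are encoded as nested tuples and that the binary formula \(\phi (x,y)\) is given as a Python function mapping two terms to a formula; the encoding and the name `subst_points_to` are assumptions made for illustration, and renaming of bound variables is not handled.

```python
def subst_points_to(p, phi):
    """Replace every atom ("pts", t1, t2) in the basic formula p by phi(t1, t2)."""
    tag = p[0]
    if tag == "pts":
        return phi(p[1], p[2])
    if tag in ("and", "or", "implies"):
        return (tag, subst_points_to(p[1], phi), subst_points_to(p[2], phi))
    if tag == "not":
        return ("not", subst_points_to(p[1], phi))
    if tag in ("exists", "forall"):
        # p = (quantifier, variable, body); only the body is rewritten
        return (tag, p[1], subst_points_to(p[2], phi))
    return p  # any other atom, e.g. ("eq", t1, t2), is left unchanged

# Hypothetical binary formula psi(x, y) := (x = u and y = 0)
psi = lambda t1, t2: ("and", ("eq", t1, "u"), ("eq", t2, "0"))

print(subst_points_to(("pts", "u", "0"), psi))
# ('and', ('eq', 'u', 'u'), ('eq', '0', '0'))
```

The result contains no points-to atoms, which reflects the observation that \(p[\phi /\hookrightarrow ]\) is a pure first-order formula.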

Example Proofs

[Figure a: derivation of the sequent \(\Rightarrow ((p\mathrel {*}(p\mathrel {-*}q))\rightarrow q)@R\)]

As a first example of the use of the sequent calculus, above we have a derivation of the sequent \(\Rightarrow ((p\mathrel {*}(p\mathrel {-*}q))\rightarrow q)@R\) which represents the validity of \((p\mathrel {*}(p\mathrel {-*}q))\rightarrow q\). This derivation essentially consists of an application of the rule \(\textbf{L}_{\mathrel {*}}\) followed by an application of the rule \(\textbf{L}_{\mathrel {-*}}\). In this derivation \(\varGamma \) denotes the formulas \(R=R_1\uplus R_2, p@R_1\) generated by the application of rule \(\textbf{L}_{\mathrel {*}}\). The second premise of the application of the rule \(\textbf{L}_{\mathrel {-*}}\) is derivable from an instance of the axiom \(\varGamma , A\Rightarrow A,\varDelta \). Note that \(\psi \) (in the \(\textbf{L}_{\mathrel {-*}}\) rule) is instantiated with \(R_1\). The first and third premises follow from the fact that \(R=R_1\uplus R_2\) reduces to \(R_1\perp R_2\) and \(R=R_1\cup R_2\) (that part of the proof is not shown above).

Next we show how to use the calculus in reasoning about the equivalence of weakest preconditions that arise in the practice of verifying the correctness of heap manipulating programs. Let p denote the weakest precondition \((u\hookrightarrow -)\wedge (z=0 \triangleleft u=v \triangleright v\hookrightarrow z)\) of the heap update \([u]:=0\) which ensures the postcondition \(v\hookrightarrow z\) after assigning the value 0 to the location denoted by the variable u. Here \(\phi \triangleleft b \triangleright \psi \) abbreviates \((b\wedge \phi )\vee (\lnot b\wedge \psi )\); a dynamic logic extension of SL which generates this weakest precondition is introduced in [dBHdG23]. The standard rule for backwards reasoning in [Rey02] gives the weakest precondition \( (u\mapsto -)\mathrel {*}(u\mapsto 0\mathrel {-*}v\hookrightarrow z) \), which we denote by \(p'\). These preconditions are equivalent because both are the weakest.

Surprisingly, a proof of the implication \(p'\rightarrow p\) however exceeds the capability of all the automatic SL provers in the benchmark competition for SL [SNPR+19]. In particular, of the automatic provers, only the CVC4-SL tool [RISK16] supports the fragment of SL that includes the separating implication connective. However, from our own experiments with that tool, we found that it produces an incorrect counter-example and reported this as a bug to one of the maintainers of the project (Andrew Reynolds). In fact, the latest version, CVC5-SL, reports the same input as ‘unknown’, indicating that the tool is incomplete. In the case of (semi) interactive SL provers (such as Iris [JKJ+18], and VerCors [AH21, MRH22] that uses Viper [MSS16] as a back-end) we sought out expertise and collaborated in our search for a tool-supported proof of the above equivalence. Even after personally visiting the Iris team in Nijmegen (led by Robbert Krebbers) and the VerCors team in Twente (led by Marieke Huisman), we were unable to guide the tools to produce a proof of \(p' \rightarrow p\). The problem here seems similar to that of [HT16], in that their semantics of separating connectives, which is formalized in terms of abstract monoids, is not compatible with the set-theoretic interpretation of the points-to relation.

In fact, the equivalence between the above two formulas can be expressed in quantifier-free separation logic, for which a complete axiomatization of all valid formulas has been given in [DLM21]. In the sequent calculus we can express the equivalence of p and \(p'\) in terms of the sequent \( { fun}(R)\Rightarrow (p\leftrightarrow p')@R \). Here R is an arbitrary binary relation symbol used to represent the current interpretation of the points-to relation. We abbreviate \(\forall x,y,z ((R(x,y)\wedge R(x,z))\rightarrow y=z)\) by \({ fun}(R)\). A proof of the above sequent amounts to proving the sequents \({ fun}(R), p'@R\Rightarrow p@R\) and \({ fun}(R), p@R \Rightarrow p'@R\). Below we present a high-level proof of the first sequent, abstracting from some basic first-order reasoning in the calculus.

By an application of \(\textbf{L}_{\mathrel {*}}\) to derive the sequent \({ fun}(R), p'@R\Rightarrow p@R\) it suffices to derive

$$ { fun}(R), R=R_1\uplus R_2, (u\mapsto -)@R_1, (u\mapsto 0\mathrel {-*}v\hookrightarrow z)@R_2 \Rightarrow p@R $$

for some fresh \(R_1\) and \(R_2\). Let \(\psi (x,y)\) denote the binary formula \(x=u\wedge y=0\). Further, let \(\varGamma \) denote the set of formulas \({ fun}(R), R=R_1\uplus R_2, (u\mapsto -)@R_1\). By an application of the rule \(\textbf{L}_{\mathrel {-*}}\) it then suffices to prove the following sequents (from \(\varGamma \Rightarrow \varDelta \) we can derive \(\varGamma \Rightarrow A,\varDelta \) by right-weakening). First we prove \( \varGamma \Rightarrow R_2\cap \psi =\emptyset \): By the points-to rules the rooted assertion \((u\mapsto -)@R_1\) (appearing in \(\varGamma \)) reduces to \(\exists z (R_1(u,z)\wedge \forall x,y (R_1(x,y)\rightarrow x=u\wedge y=z))\) (the forall-part of the formula is due to the ‘strict’ points-to which states that the domain contains u as its only location). Further, \(R_2\cap \psi =\emptyset \) logically boils down to \(\lnot \exists x,y(R_2(x,y)\wedge (x=u\wedge y=0))\), that is, \(\lnot R_2(u,0)\), which in basic first-order logic follows from \(\exists z R_1(u,z)\) and the assumptions \(R=R_1\uplus R_2\) and \({ fun}(R)\).

Second, we prove \( \varGamma \Rightarrow (u\mapsto 0) @\psi \): By the points-to rules \((u\mapsto 0) @\psi \) (using the expanded definition \(\phi \) of \(u\mapsto 0\) and the definition of the substitution \(\phi {[\psi / \hookrightarrow ]}\)) reduces to \((u=u)\wedge (0=0)\wedge \forall x,y ({(x=u\wedge y=0)} \rightarrow {(x=u\wedge y=0)})\) which is equivalent to \(\text{ true }\).

And, finally, we prove \( \varGamma , (v\hookrightarrow z)@(R_2\vee \psi ) \Rightarrow p@R \): First note that (again, by the points-to rules)

$$ ((u\hookrightarrow -)\wedge (z=0 \triangleleft u=v \triangleright v\hookrightarrow z))@R $$

reduces to

$$ (\exists z R(u,z))\wedge (z=0 \triangleleft u=v \triangleright R(v,z)) $$

The assertion \(\exists z R(u,z)\) clearly follows from the assumptions \(R=R_1\uplus R_2\) and \((u\mapsto -)@R_1\) in \(\varGamma \). To prove \({z=0 \triangleleft u=v \triangleright R(v,z)}\), we first reduce the assumption \({(v\hookrightarrow z)@(R_2\vee \psi )}\) to \(R_2(v,z)\vee (v=u\wedge z=0)\). Now, if \(v=u\) then \(\lnot R_2(v,z)\), because of the assumptions \({ fun}(R)\), \(R=R_1\uplus R_2\) and \((u\mapsto -)@R_1\). So we have that \(z=0\). Otherwise, we have \(R_2(v,z)\), and thus \(R(v,z)\), because \(R=R_1\uplus R_2\).
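As an independent sanity check of the equivalence of p and \(p'\), one can enumerate all heaps over a tiny universe. The sketch below is not part of the calculus: it assumes the standard semantics over finite functional heaps (not the definable-relation semantics), a hypothetical three-element universe `DOMAIN` serving both as locations and values, and hand-coded evaluators for the two specific formulas.

```python
from itertools import product

DOMAIN = range(3)  # hypothetical tiny universe of locations and values; contains 0

def all_heaps():
    """Enumerate every partial function DOMAIN -> DOMAIN as a dict."""
    for values in product([None, *DOMAIN], repeat=len(DOMAIN)):
        yield {loc: val for loc, val in zip(DOMAIN, values) if val is not None}

def holds_p(h, u, v, z):
    """p: u is allocated, and (z = 0 if u = v, otherwise h maps v to z)."""
    if u not in h:
        return False
    return z == 0 if u == v else h.get(v) == z

def holds_wand(h2, u, v, z):
    """(u |-> 0) -* (v -> z) on h2: every extension that is disjoint from h2
    and is exactly the single cell {u: 0} must yield a heap mapping v to z."""
    for ext in all_heaps():
        if ext == {u: 0} and not (ext.keys() & h2.keys()):
            if {**h2, **ext}.get(v) != z:
                return False
    return True

def holds_p_prime(h, u, v, z):
    """p': the strict cell at u forces the split h = {u: h[u]} + rest,
    so only this one split has to be examined."""
    if u not in h:
        return False
    h2 = {loc: val for loc, val in h.items() if loc != u}
    return holds_wand(h2, u, v, z)

equivalent = all(
    holds_p(h, u, v, z) == holds_p_prime(h, u, v, z)
    for h in all_heaps()
    for u, v, z in product(DOMAIN, repeat=3)
)
print(equivalent)
```

Such an exhaustive check over a small universe only supports the claim for that universe; it does not replace the derivation in the calculus.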

Soundness and Completeness. We denote by \(\vdash \varGamma \Rightarrow \varDelta \) that there exists a proof of the sequent \(\varGamma \Rightarrow \varDelta \). To define \(\models \varGamma \Rightarrow \varDelta \), let \(\sigma \) denote a substitution which assigns to every binary relation symbol R of the sequent \(\varGamma \Rightarrow \varDelta \) a binary formula \(\phi \). Such a substitution \(\sigma \) simply replaces occurrences of \(R(t,t')\) by \(\phi (t,t')\), where \(\sigma (R)=\phi (x,y)\). By \(\models \varGamma \Rightarrow \varDelta \) we then denote that \(M,s\models \bigwedge \varGamma \sigma \) (that is, \(M,s\models A\sigma \), for every \(A\in \varGamma \)) implies \(M,s\models \bigvee \varDelta \sigma \) (that is, \(M,s\models B\sigma \), for some \(B\in \varDelta \)), for every model M, valuation s, and substitution \(\sigma \).

In the soundness proof below we use these substitutions to instantiate the fresh binary relation symbols introduced in the rules \(\textbf{L}_{\mathrel {*}}\) and \(\textbf{R}_{\mathrel {-*}}\). Note that updating the interpretation of these symbols (as provided by M) would affect the semantics of the separating connectives if binary formulas would refer to these fresh binary relation symbols (note that they are only supposed not to appear in formulas of the conclusion of the rules \(\textbf{L}_{\mathrel {*}}\) and \(\textbf{R}_{\mathrel {-*}}\)).

We generalize the above notions of derivability and validity to possibly infinite \(\varGamma \): \(\varGamma \vdash \varDelta \) indicates that \(\vdash \varGamma '\Rightarrow \varDelta \), for some finite \(\varGamma '\subseteq \varGamma \), and \(\varGamma \models \varDelta \) indicates that for every model M, valuation s, and substitution \(\sigma \) we have that \(M,s\models \varGamma \sigma \) (that is, \(M,s\models A\sigma \), for every \(A\in \varGamma \)) implies \(M,s\models B\sigma \), for some \(B\in \varDelta \).

Theorem 1

(Soundness). We have that \(\vdash \varGamma \Rightarrow \varDelta \) implies \(\models \varGamma \Rightarrow \varDelta \).

Proof

We prove that the rules for the separating connectives preserve validity. The points-to rules are sound because \(M,{ Rel}_M(\phi ),s\models p\) if and only if \(M,s\models p{[\phi /\hookrightarrow ]}\), for any basic formula p (note that \(p{[\phi /\hookrightarrow ]}\) is a pure first-order formula which does not depend on the heap).

\(\textbf{L}_{\mathrel {*}}\): Let \(M,s\models \varGamma \sigma \) and \(M,s\models (p\sigma \mathrel {*}q\sigma )@\phi \sigma \). We have to show that \(M,s\models \bigvee \varDelta \sigma \). By Lemma 1, there exist \(\phi _1\) and \(\phi _2\) such that \(M,s\models (\phi \sigma )=\phi _1\uplus \phi _2\), \(M,s\models p\sigma @\phi _1\), and \(M,s\models q\sigma @\phi _2\). Let \(\sigma '=\sigma [R_1,R_2:=\phi _1,\phi _2]\). Since \(R_1\) and \(R_2\) are fresh and as such do not appear in \(\varGamma , (p\mathrel {*}q)@\phi \), it follows that \(M,s\models \varGamma '\sigma '\), where \(\varGamma '=\varGamma , \phi =R_1\uplus R_2, p@R_1, q@R_2\). By the validity of the premise we thus obtain that \(M,s\models \bigvee \varDelta \sigma '\). Since \(R_1\) and \(R_2\) also do not appear in \(\varDelta \), we conclude that \(M,s\models \bigvee \varDelta \sigma \).

\(\textbf{R}_{\mathrel {*}}\): Let \(M,s\models \varGamma \sigma \) and suppose that \(M,s\not \models \bigvee \varDelta \sigma \). From the validity of the premises it then follows that \(M,s\models {\phi \sigma =(\phi _1\uplus \phi _2)\sigma }\), \(M,s\models p\sigma @\phi _1\sigma \), and \(M,s\models q\sigma @\phi _2\sigma \). By Lemma 1 we conclude \(M,s\models (p\sigma \mathrel {*}q\sigma )@\phi \sigma \).

\(\textbf{L}_{\mathrel {-*}}\): Let \(M,s\models \varGamma \sigma \) and \(M,s\models (p\sigma \mathrel {-*}q\sigma )@\phi \sigma \), and suppose that \(M,s\not \models \bigvee \varDelta \sigma \). From the validity of the first two premises it then follows that \(M,s\models \phi \sigma \perp \psi \sigma \) and \(M,s\models p\sigma @\psi \sigma \). By Lemma 1 again, it follows that \(M,s\models q\sigma @(\phi \sigma \vee \psi \sigma )\). By the validity of the third premise we thus derive that \(M,s\models \bigvee \varDelta \sigma \), which contradicts our assumption.

\(\textbf{R}_{\mathrel {-*}}\): Let \(M,s\models \varGamma \sigma \) and suppose that \(M,s\not \models \bigvee \varDelta \sigma \). We have to show that \(M,s\models (p\sigma \mathrel {-*}q\sigma )@\phi \sigma \). Let \(\psi \) be such that \(M,s\models \psi \perp (\phi \sigma )\) and \(M, s\models p\sigma @\psi \). Further, let R be a fresh binary relation symbol and \(\sigma '=\sigma [R:=\psi ]\). It follows that \(M,s\models \varGamma '\sigma '\), where \(\varGamma '=\varGamma , R \perp \phi , p@R\), and \(M,s\not \models \bigvee \varDelta \sigma '\). And so we derive from the validity of the premise of the rule that \(M,s\models q\sigma @(\phi \sigma \vee \psi )\). Since \(\psi \) was arbitrarily chosen, by Lemma 1 again we conclude that \(M,s\models (p\sigma \mathrel {-*}q\sigma )@\phi \sigma \).    \(\square \)

As a corollary we obtain that \(\varGamma \vdash \varDelta \) implies \(\varGamma \models \varDelta \).

Following the completeness proof of first-order logic as described in [Hen49], it suffices to show that every consistent set of formulas is satisfiable (the so-called ‘model existence theorem’). A set of formulas \(\varGamma \) is consistent if \(\varGamma \not \vdash \emptyset \). We first show that every consistent set of formulas can be extended to a maximal consistent set. To this end we assume an infinite set of ‘fresh’ binary relation symbols R that do not appear in \(\varGamma \). We construct for any consistent set \(\varGamma \) a maximal consistent extension \(\varGamma ^\infty \), assuming an enumeration of all formulas A (which also covers all first-order formulas). We define \(\varGamma _0=\varGamma \) and \(\varGamma _{n+1}\) satisfies the general rule: if \(\varGamma _n, A_n\not \vdash \emptyset \) then \(\varGamma _n\cup \{A_n\}\subseteq \varGamma _{n+1}\), otherwise \(\varGamma _{n+1}=\varGamma _n\). Additionally, in case \(A_n\) is added and \(A_n\) is of the form \(\exists x A\) or a rooted assertion \((p\mathrel {*}q)@\phi \) or \(\lnot (p\mathrel {-*}q)@\phi \), we also include corresponding witnesses in \(\varGamma _{n+1}\):

  • If \(A_n\) is of the form \(\exists x A\) we additionally add A(y), where A(y) results from replacing all free occurrences of x in A by the fresh variable y which does not appear in \(\varGamma _n\). Note that A(y) can indeed be added consistently because from \(\varGamma _n, A(y)\vdash \emptyset \) we would derive \(\varGamma _n, \exists x A \vdash \emptyset \), which contradicts the assumption that \(\varGamma _n, \exists x A\not \vdash \emptyset \).

  • If \(A_n\) is of the form \((p\mathrel {*}q)@\phi \) we additionally add the formulas \(\phi =R_1\uplus R_2, R_1\perp R_2, p@R_1\), and \(q@R_2\), where \(R_1\) and \(R_2\) are fresh (i.e., not appearing in \(\varGamma _n\)). Note that these formulas can indeed be added consistently because from \(\varGamma _n, \phi =R_1\uplus R_2, R_1\perp R_2, p@R_1, q@R_2 \vdash \emptyset \) we would derive \(\varGamma _n,(p\mathrel {*}q)@\phi \vdash \emptyset \) (by rule \(\textbf{L}_{\mathrel {*}}\)), which contradicts the assumption that \(\varGamma _n, (p\mathrel {*}q)@\phi \not \vdash \emptyset \).

  • If \(A_n\) is of the form \(\lnot (p\mathrel {-*}q)@\phi \) (which is equivalent to \(\lnot ((p\mathrel {-*}q)@\phi )\)) we additionally add the formulas \(R\perp \phi , p@R\), and \(\lnot q@(\phi \vee R)\), where R is fresh (i.e., not appearing in \(\varGamma _n\)). Note that these formulas can indeed be added consistently because from \(\varGamma _n, R\perp \phi , p@R, \lnot q@(\phi \vee R) \vdash \emptyset \) we would derive \(\varGamma _n \vdash (p\mathrel {-*}q)@\phi \) (by rule \(\textbf{R}_{\mathrel {-*}}\)), which contradicts the assumption that \(\varGamma _n, \lnot (p\mathrel {-*}q)@\phi \not \vdash \emptyset \).
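The general rule of the construction (add \(A_n\) whenever this preserves consistency) can be illustrated in a much simpler setting. The sketch below swaps in propositional logic and replaces derivability of the empty sequent by a brute-force satisfiability test; it does not add witnesses, and the names `atoms`, `evaluate`, `consistent` and `extend` are hypothetical.

```python
from itertools import product

def atoms(f):
    """Propositional atoms (strings) of a formula given as nested tuples."""
    if isinstance(f, str):
        return {f}
    return set().union(*(atoms(g) for g in f[1:]))

def evaluate(f, env):
    """Truth value of f under the assignment env (atom name -> bool)."""
    if isinstance(f, str):
        return env[f]
    if f[0] == "not":
        return not evaluate(f[1], env)
    if f[0] == "and":
        return evaluate(f[1], env) and evaluate(f[2], env)
    if f[0] == "or":
        return evaluate(f[1], env) or evaluate(f[2], env)
    raise ValueError(f"unknown connective {f[0]!r}")

def consistent(formulas):
    """Stand-in for 'the empty sequent is not derivable': some assignment
    satisfies all formulas (checked by enumerating truth tables)."""
    names = sorted(set().union(*(atoms(f) for f in formulas)))
    return any(
        all(evaluate(f, dict(zip(names, bits))) for f in formulas)
        for bits in product([False, True], repeat=len(names))
    )

def extend(gamma, enumeration):
    """Walk through the enumeration, adding each formula that keeps the set consistent."""
    current = list(gamma)
    for a in enumeration:
        if consistent(current + [a]):
            current.append(a)
    return current

print(extend([("or", "a", "b")], [("not", "a"), ("not", "b"), "b"]))
# [('or', 'a', 'b'), ('not', 'a'), 'b']
```

In the construction of the text the enumeration is infinite and each added existential or rooted assertion also brings its witnesses; the sketch only shows the skeleton of the step from \(\varGamma _n\) to \(\varGamma _{n+1}\).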

We define \(\varGamma ^\infty =\bigcup _n \varGamma _n\). By construction \(\varGamma ^\infty \) is maximal consistent. Given a maximal consistent set of formulas \(\varGamma \), let \(M_\varGamma =(D,I)\), where D is the set of equivalence classes \([t]=\{t'\mid t=t' \in \varGamma \}\). For any function symbol f and relation symbol R (excluding the points-to relation \(\hookrightarrow \)) we define

  • \(I(f)([t_1],\ldots ,[t_n])=[f(t_1,\ldots ,t_n)]\),

  • \(I(R)([t_1],\ldots ,[t_n])=\text{ true }\) if and only if \(R(t_1,\ldots ,t_n)\in \varGamma \).

The above interpretation of the function and relational symbols is well-defined because its definition does not depend on the choice of the representatives (this follows from the equality axioms).
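The domain of equivalence classes \([t]\) can be pictured with a small union-find computation. This is a sketch under simplifying assumptions: terms are plain strings, only a finite list of equations is given (whereas a maximal consistent set already contains all derivable equations), and congruence under function symbols is not computed. The name `equivalence_classes` is illustrative.

```python
def equivalence_classes(terms, equations):
    """Map each term to its class under the reflexive, symmetric and
    transitive closure of the given equations (pairs of terms)."""
    parent = {t: t for t in terms}

    def find(t):
        while parent[t] != t:
            parent[t] = parent[parent[t]]  # path halving
            t = parent[t]
        return t

    for left, right in equations:
        parent[find(left)] = find(right)

    classes = {}
    for t in terms:
        classes.setdefault(find(t), set()).add(t)
    return {t: frozenset(classes[find(t)]) for t in terms}

cls = equivalence_classes(["x", "y", "0", "f(x)"], [("x", "y"), ("y", "0")])
print(cls["x"] == cls["0"])   # True
print(sorted(cls["f(x)"]))    # ['f(x)']
```

Every term occurring in an equation is expected to be listed in `terms`; the sketch does not guard against missing entries.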

Given a maximal consistent set of formulas \(\varGamma \) and the model \(M_\varGamma =(D,I)\), a corresponding valuation s assigns to every variable x an equivalence class [t]. However, in the sequel we will represent such a valuation by a substitution s which simply assigns to each variable a term. The value \(I_s(x)\) of a variable x then is given by the equivalence class [s(x)] of the term s(x).

Given a substitution s, for any term t and formula A (of the sequent calculus) we denote by ts and As the result of replacing every free occurrence of a (first-order) variable x in t and A by s(x). Note that \((p@\phi )s=ps @\phi \), because the meaning of \(p@\phi \) does not depend on the free variables x and y of the binary formula \(\phi (x,y)\).

Given a maximal consistent set of formulas \(\varGamma \) and the model \(M_\varGamma =(D,I)\), it follows that \(I_s(t)=[ts]\), for every term t and substitution s.

Lemma 2

Given a maximal consistent set of formulas \(\varGamma \) and the model \(M_\varGamma =(D,I)\), we have \(M_\varGamma ,s\models A\) if and only if \(As\in \varGamma \), for every formula A and substitution s.

Proof

The proof proceeds by induction on the following well-founded ordering \(A< B\) on formulas of the sequent calculus: Let \(\#A=(n,m)\), where n denotes the number of occurrences of the separating connectives and the @-binding operator of A and m denotes the number of occurrences of the (standard) first-order logical operations of A. Then \(A<B\) if \(\#A< \#B\), where the latter denotes the lexicographical ordering on \(\mathbb {N}\times \mathbb {N}\) (w.r.t. the standard ‘smaller than’ ordering on the natural numbers). We treat the following main cases (for notational convenience M denotes the model \(M_\varGamma \)).

  • Let \(M,s\models A\), where A denotes the formula \((p\mathrel {*}q)@\phi \). By Lemma 1 there exist \(\phi _1\) and \(\phi _2\) such that \(M,s\models \phi =\phi _1\uplus \phi _2\), \(M,s\models p@\phi _1\) and \(M,s\models q@\phi _2\). From the induction hypothesis it follows that \(ps@\phi _1, qs@\phi _2, \phi =\phi _1\uplus \phi _2\in \varGamma \) (note that the first-order formula \(\phi =\phi _1\uplus \phi _2\) does not contain free variables, and thus is not affected by the substitution s). So we derive by rule \(\textbf{R}_{\mathrel {*}}\) that \(\varGamma \vdash (ps\mathrel {*}qs)@\phi \). By maximal consistency of \(\varGamma \), we then conclude that \((ps\mathrel {*}qs)@\phi \in \varGamma \), that is, \(As\in \varGamma \). On the other hand, let \(As\in \varGamma \). That is, \((ps\mathrel {*}qs)@\phi \in \varGamma \). By construction \(\phi =R_1\uplus R_2, ps@R_1, qs@R_2 \in \varGamma \), for some witnesses \(R_1\) and \(R_2\). By the induction hypothesis it then follows that \(M,s\models p@R_1\) and \(M,s\models q@R_2\). Further, the induction hypothesis gives \(M,s\models \phi =R_1\uplus R_2\) (again, note that the formula \(\phi =R_1\uplus R_2\) has no free variables, and thus is not affected by the substitution s). We conclude by Lemma 1 that \(M,s\models (p\mathrel {*}q)@\phi \).

  • Let \(M,s\models A\), where A denotes the formula \((p\mathrel {-*}q)@\phi \). Suppose \(As\not \in \varGamma \). By the maximal consistency of \(\varGamma \), we then have \(\lnot (ps\mathrel {-*}qs)@\phi \in \varGamma \). By construction \(R\perp \phi , ps@R, \lnot qs@(\phi \vee R) \in \varGamma \), for some witness R, which contradicts \(M,s\models (p\mathrel {-*}q)@\phi \) (after application of the induction hypothesis and using Lemma 1 again). On the other hand, let \(As\in \varGamma \). To show that \(M,s \models (p\mathrel {-*}q)@\phi \), let \(M,s\models \phi \perp \psi \) and \(M,s\models p@\psi \), for some binary formula \(\psi \). By the induction hypothesis we have that \(\phi \perp \psi , ps@\psi \in \varGamma \). Suppose that \(qs@(\phi \vee \psi )\not \in \varGamma \), that is \(\lnot qs@(\phi \vee \psi )\in \varGamma \) (\(\varGamma \) is maximal consistent), and thus \( \varGamma , qs@(\phi \vee \psi ) \vdash \emptyset \). Applying rule \(\textbf{L}_{\mathrel {-*}}\) we then derive \(\varGamma , (ps\mathrel {-*}qs)@\phi \vdash \emptyset \), which contradicts the consistency of \(\varGamma \) (\((ps\mathrel {-*}qs)@\phi \in \varGamma \)). So we have that \(qs@(\phi \vee \psi )\in \varGamma \), that is, \(M,s\models q@(\phi \vee \psi )\), by the induction hypothesis. Since \(\psi \) is chosen arbitrarily, it follows by Lemma 1 that \(M,s \models {(p\mathrel {-*}q)@\phi }\).

  • Let A be a formula \(p@\phi \), where p denotes a basic formula. Let \(\mathcal {R}={ Rel}_{M}(\phi )\). We then have \(M,s\models p@\phi \) iff (by definition) \(M,\mathcal {R},s\models p\) iff (straightforward induction on p) \(M,s\models p[\phi /\hookrightarrow ]\) iff (induction hypothesis for \(p[\phi /\hookrightarrow ]\)) \(ps[\phi /\hookrightarrow ]\in \varGamma \) iff (by the points-to rules) \(ps@\phi \in \varGamma \). Note that applying the substitution s to \(p@\phi \) and \(p[\phi /\hookrightarrow ]\) results in \(ps@\phi \) and \(ps[\phi /\hookrightarrow ]\).    \(\square \)
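The induction measure \(\#A=(n,m)\) used in the proof above can be computed by a straightforward traversal. The sketch assumes a hypothetical encoding of formulas as nested tuples whose first component names the operator, with atoms and variables as strings; Python's built-in tuple comparison is lexicographic, which matches the ordering on \(\mathbb {N}\times \mathbb {N}\).

```python
SEPARATING = {"sep_conj", "sep_impl", "at"}  # *, -*, and the @-binding operator
FIRST_ORDER = {"not", "and", "or", "implies", "exists", "forall"}

def measure(formula):
    """Return (n, m): n counts separating connectives and @-operators,
    m counts standard first-order logical operations."""
    if isinstance(formula, str):
        return (0, 0)  # atoms and variable names contribute nothing
    tag, *args = formula
    n = 1 if tag in SEPARATING else 0
    m = 1 if tag in FIRST_ORDER else 0
    for arg in args:
        dn, dm = measure(arg)
        n, m = n + dn, m + dm
    return (n, m)

rooted = ("at", ("sep_conj", "p", "q"), "R")                 # (p * q)@R
pure = ("and", ("forall", "x", ("or", "a", "b")), "c")       # a pure first-order formula
print(measure(rooted), measure(pure), measure(pure) < measure(rooted))
# (2, 0) (0, 3) True
```

The example shows why the last case of the proof is admissible: a formula without @-operators and separating connectives is smaller than any rooted assertion, regardless of how many first-order operations it contains.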

The downward Löwenheim-Skolem property follows. It should be noted that we cannot remove from the constructed model the binary relation symbols which are introduced as witnesses, as these determine the notion of first-order definability.

Theorem 2

(Completeness). We have that \(\varGamma \models \varDelta \) implies \( \varGamma \vdash \varDelta \).

Compactness follows. We thus derive (by Lindström’s theorem [Vää10]) that this version of SL is as expressive as first-order logic.

6 Conclusion

We investigated the expressiveness of full SL over arbitrary first-order models. We have shown that restricting the quantification to first-order definable heaps gives rise to a semantic consequence relation that can be captured by a sound and complete extension of the standard sequent calculus for first-order logic.

The main question remains what is the exact relationship between full SL which allows for infinite heaps and second-order logic. In [KR04] a translation is given of general second-order logic in a first-order logic with spatial conjunction. Spatial conjunction (as defined in [KR04]) allows to split a global set of arbitrary relations. As such it goes beyond the local scope of separating conjunction which is restricted to the points-to relation. We conjecture that second-order logic is strictly more expressive than full SL.