Frame Inference for Inductive Entailment Proofs in Separation Logic

Given separation logic formulae A and C, frame inference is the problem of checking whether A entails C and simultaneously inferring residual heaps. Existing approaches to frame inference do not support inductive proofs with general inductive predicates. In this work, we present an automatic frame inference approach for an expressive fragment of separation logic. We further show how to strengthen the inferred frame through predicate normalization and arithmetic inference. We have integrated our approach into an existing verification system. The experimental results show that our approach helps to establish a number of non-trivial inductive proofs which are beyond the capability of all existing tools.


Introduction
Separation logic (SL) [20,37] has been well established for reasoning about heap-manipulating programs (e.g., programs over linked lists and trees). Often, SL is used in combination with inductive predicates to precisely specify data structures manipulated by a program. In the last decade, a large number of SL-based verification systems have been developed [1,6,3,19,8,36,33,18,29,13,24]. In these systems, SL is typically used to express assertions about program states. The problem of validating these assertions can be reduced to the entailment problem in SL, i.e., given two SL formulas ∆a and ∆c, to check whether ∆a |= ∆c holds. Moreover, SL provides the frame rule [20], one prominent feature to enable compositional (a.k.a. modular) reasoning in the presence of the heap: from {P} c {Q}, infer {P * F} c {Q * F}, where c is a program, P, Q and F are SL formulas, and * is the separating conjunction in SL. Intuitively, P * F states that P and F hold in disjoint heaps. This conjunction allows the frame rule to guarantee that F is unchanged under the action of c. This feature of SL is essential for scalability [21,44,6] as it allows the proof of a program to be decomposed into (and reused as) smaller ones, e.g., proofs of procedures. To automate the application of the frame rule, SL-based proof systems rely on a generalized form of entailment, which is referred to as frame inference [1,12,8,33,39]. That is, given ∆a and ∆c, to check whether ∆a entails ∆c and simultaneously generate the residual heap, which is a satisfiable frame ∆f capturing properties of the memory in ∆a that is not covered by ∆c. This problem, especially if ∆a and ∆c are constituted by general inductive predicates, is highly non-trivial as it may require inductive reasoning. Existing approaches [1,33] are limited to specific predicates, e.g., linked lists and trees. The systems reported in [12,8,39] do not adequately support the frame inference problem for inductive entailments in separation logic with predicate definitions and arithmetic.
In this work, we propose a sound approach for frame inference which aims to enhance modular verification in an expressive SL fragment with general inductive predicates and Presburger arithmetic. Intuitively, given an entailment ∆a |= ∆c, our goal is to infer a satisfiable frame axiom ∆f such that ∆a |= ∆c * ∆f holds. Our approach works as follows. We first augment the entailment check with an unknown second-order variable Uf(t) as a place-holder for the frame, where t is a set of pointer-typed variables common to ∆a and ∆c. That is, the entailment check becomes ∆a |= ∆c * Uf(t). Afterwards, the following two steps are conducted. Firstly, we invoke a novel proof system to derive a cyclic proof for ∆a |= ∆c * Uf(t) whilst inferring a predicate which Uf must satisfy so that the entailment is valid. We show that the cyclic proof is valid if this predicate is satisfiable. Secondly, we strengthen the inferred frame with shape normalization and arithmetic inference.
For the first step, we design a new cyclic proof system (based on [2,3]) with an automated cut rule so as to effectively infer the predicate on Uf. A cyclic proof is a derivation tree whose root is the given entailment check and whose edges are constructed by applying SL proof rules. A derivation tree of a cyclic proof may contain virtual back-links, each of which links a (leaf) node back to an ancestor. Intuitively, a back-link from a node l to an internal node i means that the proof obligation at l is induced by that at i. Furthermore, to avoid potentially unsound cycles (i.e., self-cycles), a global soundness condition must be imposed upon these derivations to qualify them as genuine proofs. In this work, we develop a sequent-based cyclic proof system with a cyclic cut rule so as to form back-links effectively and check the soundness condition eagerly. Furthermore, we show how to extract lemmas from the proven cyclic proofs and reuse them through lemma application for an efficient proof system. These synthesized lemmas work as dynamic cuts in the proposed proof system.
For the second step, we strengthen the inferred predicate on the frame Uf(t) so that it becomes more powerful in establishing correctness of certain programs. In particular, the inferred frame is strengthened with predicate normalization and arithmetic inference. The normalization includes predicate split (i.e., to expose the spatial separation of the inferred frame) and predicate equivalence (i.e., to relate the inferred frame with user-supplied predicates). The arithmetic inference discovers predicates on pure properties (size, sum, height, content and bag) to support programs which require inductive reasoning on both shape and data properties.
Lastly, we have implemented the proposal and integrated it into a modular verification engine. Our experiments show that our approach infers strong frames, which enhance the verification of heap-manipulating programs.

Preliminaries
In this section, we present the fragment of SL which is used as the assertion language in this work. This fragment, described in Fig. 1, is expressive enough for specifying and verifying properties of a variety of data structures [24,25,41,26,35]. We use t to denote a sequence of terms and occasionally use a sequence (i.e., t) to denote a set when there is no ambiguity. A formula Φ in our language is a disjunction of multiple clauses ∆, each of which is a conjunction of a spatial predicate κ and a pure (non-heap) constraint π. The spatial predicate κ captures properties of the heap whereas π captures properties of the data. κ can be an empty heap emp, a points-to predicate r↦c(v) where c is a data structure, a user-defined predicate P(t), or a spatial conjunction κ1 * κ2. null is a special heap location. A pure constraint π is in the form of (dis)equality α (on pointers) and Presburger arithmetic φ. We write v1≠v2 and v≠null for ¬(v1=v2) and ¬(v=null), respectively. We often omit the pure part of a formula Φ when it is true. To standardize the notation, we use uppercase letters for unknown (to-be-inferred) predicates (e.g., P(t)) and lowercase letters (e.g., p(t)) for known predicates.
A user-defined (inductive) predicate P(v) with parameters v is defined in the form of a disjunction, i.e., pred P(v) ≡ Φ, where each disjunct in Φ is referred to as a branch. In each branch, variables that are not in v are implicitly existentially quantified. We use the function unfold(P(t)) to replace an occurrence of an inductive predicate by the disjuncts in the definition of P, renaming formal parameters to actual ones. For example, the following predicates lseg and lsegn are defined to express list segments where every node contains the same value 1, given the data structure node{int val; node next;}.
pred lseg(root,l) ≡ emp∧root=l ∨ ∃q• root↦node(1,q) * lseg(q,l);
pred lsegn(root,l,n) ≡ emp∧root=l∧n=0 ∨ ∃q• root↦node(1,q) * lsegn(q,l,n−1);
where root is the head, l the end of the segment and n the length of the segment.
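The unfolding step can be illustrated with a minimal Python sketch. The dictionary encoding of predicate definitions, the tuple encoding of atoms, and the helper names subst_term, subst_atom and the fresh-name scheme are assumptions made for this illustration only; they are not the data structures of our implementation.

```python
from itertools import count

_fresh = count(1)  # supplies suffixes for fresh existential variables


def subst_term(term, env):
    """Rename variables in a term: a name, an int, or an (op, left, right) tuple."""
    if isinstance(term, str):
        return env.get(term, term)
    if isinstance(term, tuple):
        op, left, right = term
        return (op, subst_term(left, env), subst_term(right, env))
    return term  # integer constants are left untouched


def subst_atom(atom, env):
    """Rename variables in a spatial atom."""
    if atom[0] == "pto":  # ("pto", root, type, args) encodes root |-> type(args)
        _, root, ctype, args = atom
        return ("pto", subst_term(root, env), ctype,
                [subst_term(a, env) for a in args])
    _, name, args = atom  # ("pred", name, args) encodes name(args)
    return ("pred", name, [subst_term(a, env) for a in args])


# Hypothetical encoding of: pred lsegn(root,l,n) == emp /\ root=l /\ n=0
#                              \/ exists q. root |-> node(1,q) * lsegn(q,l,n-1)
LSEGN = {
    "params": ["root", "l", "n"],
    "branches": [
        {"exists": [], "heap": [],
         "pure": [("=", "root", "l"), ("=", "n", 0)]},
        {"exists": ["q"],
         "heap": [("pto", "root", "node", [1, "q"]),
                  ("pred", "lsegn", ["q", "l", ("-", "n", 1)])],
         "pure": []},
    ],
}


def unfold(defn, actuals):
    """Return the branches of defn with formals replaced by actuals.

    Existential variables of each branch are renamed to fresh names so that
    they cannot clash with variables of the surrounding formula.
    """
    result = []
    for branch in defn["branches"]:
        env = dict(zip(defn["params"], actuals))
        for var in branch["exists"]:
            env[var] = f"{var}_{next(_fresh)}"
        result.append({
            "exists": [env[var] for var in branch["exists"]],
            "heap": [subst_atom(a, env) for a in branch["heap"]],
            "pure": [subst_term(p, env) for p in branch["pure"]],
        })
    return result


if __name__ == "__main__":
    for b in unfold(LSEGN, ["x", "t", "k"]):
        print(b)
```

Calling unfold(LSEGN, ["x", "t", "k"]) yields one base branch with the pure constraints x=t and k=0, and one inductive branch in which the bound variable q has been renamed to a fresh name.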
In our framework, we may have lemmas to assist program verification. A lemma ι has the form ∆l → ∆r, which means that the entailment ∆l |= ∆r holds. We write A↔B, a short form of A→B and B→A, to denote a two-way lemma. If A↔B, A is semantically equivalent to B. We use E and F to denote an entailment problem.
In the following, we discuss the semantics of the SL fragment. Concrete heap models assume a fixed finite collection Node, a fixed finite collection Fields, a disjoint set Loc of locations (i.e., heap addresses), and a set of non-address values Val such that null∈Val and Val ∩ Loc = ∅. The semantics is given by a satisfaction relation s,h |= Φ that forces the stack s and heap h to satisfy the constraint Φ, where h∈Heaps, s∈Stacks, and Φ is a formula. Heaps and Stacks are defined as follows.
The detailed semantics of this SL fragment follows that in [25].

Illustrative Example
In the following, we first discuss the limitation of the existing entailment procedures [1,8] with respect to the frame inference problem. Given an entailment, these procedures reduce it until the following subgoal is obtained: ∆a ⊢ emp ∧ true. Then, they conclude that ∆a is the residual frame. However, these approaches provide limited support for inductive proofs. While [1] provides inference rules that encode inductive reasoning for hardwired list and tree predicates, our previous work [8] supports inductive proofs via user-supplied lemmas [30]. Hence, it is very hard for these procedures to automatically infer the frame for entailments which require inductive proofs.
We illustrate our approach via the verification of the append method shown in Fig. 2, which appends a singly-linked list referred to by y to the end of the singly-linked list referred to by x. It uses the auxiliary procedure last (lines 8-12) to obtain the pointer referring to the last node in the list. Each node object x has a data value x->data and a next pointer x->next. For simplicity, we assume that every node in the x list and the y list has data value 1. The correctness of append and last is specified using our fragment of SL with a pre-condition (requires) and a post-condition (ensures). The auxiliary variable res denotes the return value of the procedure. Note that these specifications refer to the user-provided predicates lln and ll_last, which are defined as follows.
pred lln(root,n) ≡ emp∧root=null∧n=0 ∨ ∃q• root↦node(1,q) * lln(q,n−1);
pred ll_last(root,l,n) ≡ l↦node(1,null)∧root=l∧n=1 ∨ ∃q• root↦node(1,q) * ll_last(q,l,n−1);
Intuitively, the predicate lln(root,n) is satisfied if root points to a singly-linked list with n nodes. The predicate ll_last(t,p,n) is satisfied if t points to a list segment with last element p and length n. In our framework, we provide a library of commonly used inductive predicates (and the corresponding lemmas), including for example the definitions for list segments lseg and lsegn introduced earlier. Given these specifications, we automatically deduce predicates on the intermediate program states (using existing approaches [8]), shown as comments in Fig. 2, as well as the following three entailment checks that must be established in order to verify the absence of memory errors and the correctness of the method append.
E1: lln(x,i) * lln(y,j) ∧ i>0 ⊢ ∃n1• lln(x,n1) ∧ n1>0
E2: ll_last(x,t,i) * lln(y,j) ∧ i>0 ⊢ ∃q,v• t↦node(v,q)
E3: lsegn(res,t,i−1) * t↦node(1,y) * lln(y,j) ∧ i>0 ⊢ lln(res,i+j)
E1 aims to establish a local specification at line 5 which we generate automatically. E2 must be satisfied so that no null-dereference error would occur for the assignment to t->next at line 6. E3 aims to establish that the postcondition is met. Frame inference is necessary in order to verify the program. In particular, frame inference for E2 is crucial to construct a precise heap state after line 6, i.e., the state α in the figure, which is necessary to establish E3. Furthermore, the frame of E3 (which is inferred as emp) helps to show that this program does not leak memory. As the entailment checks E2 and E3 require both inductive reasoning and frame inference, they are challenging for existing SL proof systems [12,3,8,36,31,9,15,40]. In what follows, we illustrate how our system establishes a cyclic proof with frame inference for E2.
Frame Inference. Our frame inference starts with introducing an unknown predicate (a second-order variable) U1(x,t,q,v,y) as the initial frame, which is a place-holder for a heap predicate on variables x, t, q and y (i.e., variables referred to in E2). That is, E2 is transformed to the following entailment checking problem:
F2: ll_last(x,t,i) * lln(y,j) ∧ i>0 ⊢L0 ∃q,v• t↦node(v,q) * U1(x,t,q,v,y)
where L0 is a set of induction hypotheses and sound lemmas. This set is accumulated automatically during the proof search and used for constructing cyclic proofs and lemma application. If a hypothesis is proven, it becomes a lemma and may be applied later during the proof search. In this example, initially L0=∅. The proposed proof system derives a cyclic proof for the entailment problem and, at the same time, infers a set of constraints R for U1(x,t,q,v,y) such that the proof is valid if the system R is satisfiable. Each constraint in R has the form of a logical implication, i.e., ∆b ⇒ U(v), where ∆b is the body and U(v) is the head (a second-order variable). For F2, the following two constraints are inferred, denoted by σ1 and σ2.
σ1: lln(y,j)∧t=x∧q=null∧v=1 ⇒ U1(x,t,q,v,y)
We then use a decision procedure (e.g., S2SATSL [25,26] or [4]) to check the satisfiability of σ1∧σ2. Note that we write a satisfiable definition of For instance, the above constraints are written as: Note that, in the above definition of U1, the separation of those heap-lets referred to by root, y and q is not explicitly captured. Additionally, relations over the sizes are also missing. Such information is necessary in order to establish the left-hand side of E3. The successful verification of E3 in turn establishes the postcondition of method append. In the following, we show how to strengthen the inferred frame.
Frame Strengthening. We strengthen U1 with spatial separation constraints on the pointer variables root, y and q. To explicate the spatial separation among these pointers, our system generates the following equivalent lemma and splits U1 into two disjoint heap regions (with * conjunction): where U2 is a new auxiliary predicate with an inferred definition: Next, our system detects that U2 is equivalent to the user-defined predicate lseg, and generates the lemma: U2(root,t)↔lseg(root,t). Relating U2 to lseg enhances the understanding of the inferred predicates. Furthermore, as shown in [9], this relation helps to reduce the requirements of inductive reasoning among equivalent inductive predicates with different names. Substituting U2 with the equivalent lseg, U1 becomes: This definition states that frame U1 holds in two disjoint heaps: one list segment pointed to by root and a list pointed to by y. After substitution, the entailment F2 becomes
ll_last(x,t,i) * lln(y,j) ∧ i>0 ⊢L0 t↦node(1,null) * lseg(x,t) * lln(y,j)
Next, we further strengthen the frame with pure properties, which is necessary to successfully establish the left-hand side of E3. In particular, we generate constraints to capture that the numbers of allocated heap cells in the left-hand side and the right-hand side of F2 are identical. Our system obtains these constraints in two phases. First, it automatically augments each inductive predicate in F2 with an argument that captures its size property. Concretely, it detects that while the predicates ll_last and lln already have such a size argument, the shape-based frame lseg does not. Therefore, it extends lseg(root,t) to obtain the predicate lsegn(root,t,m), where the size property is captured by the parameter m. Now, we substitute lsegn into F2 to obtain:
ll_last(x,t,i) * lln(y,j) ∧ i>0 ⊢L0 ∃k• t↦node(1,null) * lsegn(x,t,k) * lln(y,j)
After that, we apply the same three steps of frame inference to generate the size constraint: constructing unknown predicates, proving the entailment while inferring a set of constraints, and checking satisfiability. For the first step, the above entailment is enriched with one unknown (pure) predicate P1(i,j,k), which is the place-holder for arithmetical constraints among the size variables i, j and k. The augmented entailment checking is: Secondly, our system successfully derives a proof for the above entailment under the condition that the following disjunctive set of two constraints is satisfiable.
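The residual heap of E2 can also be observed on a single concrete model. The following Python sketch is an illustration under stated assumptions, not part of our proof system: heaps are encoded as dictionaries from integer locations to (value, next) pairs, None plays the role of null, and the helper names build_list, match_points_to and frame_has_shape are hypothetical. It builds a heap satisfying ll_last(x,t,i) * lln(y,j), subtracts the cell at t required by the right-hand side of E2, and checks by brute force that what remains splits into lsegn(x,t,i−1) and lln(y,j), the shape used in the left-hand side of E3.

```python
from itertools import combinations


def build_list(start, length):
    """Cells start, start+1, ... each storing value 1; the last next field is null."""
    return {start + k: (1, start + k + 1 if k < length - 1 else None)
            for k in range(length)}


def sat_lsegn(heap, root, end, n):
    """Decide h |= lsegn(root, end, n) on a concrete heap h."""
    if not heap:  # base branch: emp, root = end, n = 0
        return root == end and n == 0
    cell = heap.get(root)  # inductive branch: root |-> node(1, q) * lsegn(q, end, n-1)
    if cell is None or cell[0] != 1:
        return False
    rest = {loc: c for loc, c in heap.items() if loc != root}
    return sat_lsegn(rest, cell[1], end, n - 1)


def splits(heap):
    """Enumerate all ways to split a heap into two disjoint sub-heaps."""
    locs = list(heap)
    for size in range(len(locs) + 1):
        for part in combinations(locs, size):
            left = {loc: heap[loc] for loc in part}
            right = {loc: heap[loc] for loc in heap if loc not in part}
            yield left, right


def match_points_to(heap, t):
    """Subtract the cell at t; return its fields (v, q) and the residual heap."""
    v, q = heap[t]
    frame = {loc: c for loc, c in heap.items() if loc != t}
    return v, q, frame


def frame_has_shape(frame, x, t, k, y, j):
    """Check frame |= lsegn(x,t,k) * lln(y,j); lln(y,j) is lsegn ending at null."""
    return any(sat_lsegn(a, x, t, k) and sat_lsegn(b, y, None, j)
               for a, b in splits(frame))


if __name__ == "__main__":
    x, y, i, j = 10, 20, 3, 2
    heap = {**build_list(x, i), **build_list(y, j)}
    t = x + i - 1  # location of the last node of the x list
    v, q, frame = match_points_to(heap, t)
    print(v, q, sorted(frame), frame_has_shape(frame, x, t, i - 1, y, j))
```

A single model of course proves nothing about the entailment; the sketch only makes visible which cells the frame has to describe and why the segment length is i−1.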

Frame Inference
In this section, we present our approach for frame inference in detail. Given an entailment ∆a ⊢ ∆c, where ∆a is the antecedent (LHS) and ∆c is the consequent (RHS), our system attempts to infer a frame ∆f such that when a frame is successfully inferred, the validity of the entailment ∆a ⊢ ∆c * ∆f is established at the same time.
Our approach has three main steps. Firstly, we enrich the RHS with an unknown predicate in the form of U(v) to form the entailment ∆a ⊢L ∆c * U(v), where v includes all free pointer-typed variables of ∆a and ∆c and L is the union of a set of user-supplied lemmas and a set of induction hypotheses (initially ∅). Some of these parameters are annotated with #, following the principle that instantiation (and subtraction) must be done before inference. The detail is as follows: (i) all common variables of ∆a and ∆c are #-annotated; (ii) points-to pointers of ∆c are #-annotated; (iii) the remaining pointers are not #-annotated. In the implementation, inference of frame predicates is performed incrementally such that shape predicates are inferred prior to pure ones. Secondly, we construct a proof of the entailment and infer a set of constraints R for U(v). Thirdly, we check the satisfiability of R using the decision procedure in [25,26].
In the following, we present our entailment checking procedure with a set of proof rules shown in Figs. 3 and 4. For each rule, the obligation is at the bottom and its reduced form is on the top. In particular, the rules in Fig. 3 are used for entailment proving (i.e., to establish a cyclic proof) and the rules in Fig. 4 are used for predicate inference.
Given an entailment check in the form of ∆a ⊢L ∆c, the rules shown in Fig. 3 are designed to subtract the heap (via the rules [M] and [PRED−M]) on both sides until their heaps are empty. After that, the procedure checks the validity of the implication between the two pure formulas by using an SMT solver, like Z3 [27], as shown in rule [EMP]. Algorithmically, this entailment checking is performed as follows.
- Matching. The rules [M] and [PRED−M] are used to match up identified heap chains.
Starting from identified root pointers, the procedure keeps matching all their reach- which has at least one UD predicate, we attempt to apply a lemma as an alternative search using the [CCUT] rule. Since a user-supplied lemma is assumed to be valid, applying such a lemma does not require the global soundness condition.
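The effect of heap subtraction can be sketched in Python for the simplest case, where every RHS atom has a syntactic counterpart on the LHS. This is a simplification written in the spirit of matching, not a transcription of the rules [M] and [PRED−M] in Fig. 3; the triple encoding (root, name, args) of spatial atoms and the function name subtract are assumptions of the sketch.

```python
def subtract(lhs, rhs):
    """Match every RHS atom against an LHS atom with the same root and name.

    Atoms are triples (root, name, args). On success, return the unmatched
    LHS atoms (the candidate frame) together with the equalities between
    arguments that the pure part of the entailment still has to discharge.
    Return None when some RHS atom has no syntactic match, which is the
    situation where unfolding or lemma application would be needed.
    """
    frame = list(lhs)
    obligations = []
    for r_root, r_name, r_args in rhs:
        for atom in frame:
            a_root, a_name, a_args = atom
            if (a_root == r_root and a_name == r_name
                    and len(a_args) == len(r_args)):
                frame.remove(atom)  # safe: we leave the inner loop right away
                obligations += [(ra, la) for ra, la in zip(r_args, a_args)
                                if ra != la]
                break
        else:
            return None
    return frame, obligations


if __name__ == "__main__":
    # x |-> node(1, q0) * lln(y, j)  against  exists v, q. x |-> node(v, q)
    lhs = [("x", "node", (1, "q0")), ("y", "lln", ("j",))]
    rhs = [("x", "node", ("v", "q"))]
    print(subtract(lhs, rhs))
```

In the usage example the cell at x is consumed, the instantiations v=1 and q=q0 are recorded, and lln(y,j) is left over as the frame.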
Cyclic Proof. The proof rules in Fig. 3 are designed to establish cyclic proofs. In the following, we briefly describe a cyclic proof technique that enhances the proposal in [2].
Definition 1 (Pre-proof) A pre-proof of an entailment E is a pair (Ti, L) where Ti is a derivation tree and L is a back-link function such that: the root of Ti is E; for every edge from Ei to Ej in Ti, Ei is the conclusion of an inference rule with a premise Ej; there is a back-link between Ec and El if L(El) = Ec (i.e., Ec = El θ for some substitution θ); and every leaf El is the conclusion of an axiom rule (a rule without premises). If there is a back-link between Ec and El, then El (resp. Ec) is referred to as a bud (resp. companion).
Definition 2 (Trace) Let (Ti, L) be a pre-proof of ∆a ⊢L ∆c, and let (∆a_i ⊢L_i ∆c_i)_{i≥0} be a path of Ti. A trace following (∆a_i ⊢L_i ∆c_i)_{i≥0} is a sequence (α_i)_{i≥0} such that each α_i (for all i≥0) is an instance of the predicate P(t) in the formula ∆a_i, and either: α_{i+1} is the subformula containing an instance of P(t) in ∆a_{i+1}; or ∆a_i ⊢L_i ∆c_i is the conclusion of an unfolding rule, α_i is an instance of the predicate P(t) in ∆a_i, and α_{i+1} is a subformula ∆[t/v] which is a definition rule of the inductive predicate P(v). In the latter case, i is a progressing point of the trace.
To ensure that a pre-proof is sound, a global soundness condition must be imposed to guarantee well-foundedness.
Definition 3 (Cyclic proof) A pre-proof (Ti, L) of ∆a ⊢L ∆c is a cyclic proof if, for every infinite path (∆a_i ⊢L_i ∆c_i)_{i≥0} of Ti, there is a tail of the path p=(∆a_i ⊢L_i ∆c_i)_{i≥n} such that there is a trace following p which has infinitely many progressing points.
Brotherston et al. proved [2] that ∆a |= ∆c holds if there is a cyclic proof of ∆a ⊢∅ ∆c, where ∆a and ∆c do not contain any unknown predicate.
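To convey the flavour of the global soundness condition, the Python sketch below checks a deliberately simplified property on a pre-proof graph: every cycle formed by tree edges and back-links must contain at least one edge marked as progressing (an unfolding step). This is an assumption-laden approximation for illustration. It ignores the requirement of Definition 2 that a trace follow one predicate instance along the path, and the edge encoding (source, target, progressing) is hypothetical.

```python
def every_cycle_progresses(edges):
    """Return True iff every cycle contains at least one progressing edge.

    edges is an iterable of (source, target, progressing) triples covering
    both derivation-tree edges and back-links. Every cycle has a progressing
    edge exactly when the sub-graph of non-progressing edges is acyclic, so
    we drop the progressing edges and run a depth-first cycle search.
    """
    graph, nodes = {}, set()
    for src, dst, progressing in edges:
        nodes.update((src, dst))
        if not progressing:
            graph.setdefault(src, []).append(dst)

    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on the DFS stack, finished
    colour = {n: WHITE for n in nodes}

    def has_cycle(node):
        colour[node] = GREY
        for succ in graph.get(node, []):
            if colour[succ] == GREY:
                return True
            if colour[succ] == WHITE and has_cycle(succ):
                return True
        colour[node] = BLACK
        return False

    return not any(colour[n] == WHITE and has_cycle(n) for n in nodes)


if __name__ == "__main__":
    # root --unfold--> n1 --match--> bud, with a back-link bud --> root
    good = [("root", "n1", True), ("n1", "bud", False), ("bud", "root", False)]
    # the same shape without any unfolding on the cycle
    bad = [("root", "n1", False), ("n1", "root", False)]
    print(every_cycle_progresses(good), every_cycle_progresses(bad))
```

The first graph passes because its only cycle goes through the unfolding edge, while the second is rejected since its cycle makes no progress.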
In the following, we explain how cyclic proofs are constructed using the proof rules shown in Fig. 3. [LU] and [CCUT] are the most important rules for forming back-links and hence for pre-proof construction. While rule [LU] accumulates possible companions and stores them in the historical sequents L, [CCUT] links a bud with a companion using some substitutions and checks the global soundness condition eagerly. Unlike the original cyclic system [3], our back-link function only considers companions selected from the set of historical sequents L. In particular, ∆l → ∆r ∈ L is used as an intelligent cut as follows. During proof search, a subgoal (i.e., ∆a1 * ∆a2 ⊢L ∆c) may be matched with the above historical sequent to form a cycle and close the proof branch using the following principle. First, ∆l ⊢ ∆r is used as an induction hypothesis. Thus, we have ∆lρ * ∆a2 |= ∆rρ * ∆a2, where ρ are substitutions including those for avoiding clashes of variables between ∆r and ∆a2. If both ∆a1 * ∆a2 ⊢L ∆lρ * ∆a2 and ∆rρ * ∆a2 ⊢L ∆c are proven, then we have: ∆a1 * ∆a2 |= ∆lρ * ∆a2 |= ∆rρ * ∆a2 |= ∆c. Thus, the subgoal ∆a1 * ∆a2 ⊢L ∆c holds. We remark that if a hypothesis is proven, it can be applied as a valid lemma subsequently.
In our system, a lemma often includes universally quantified variables. We thus introduce a new mechanism to instantiate such lemmas. We denote constraints with universal variables as universal guards ∀G. A universal guard ∀G is equivalent to an infinite conjunction ⋀ρ G[ρ]. Linking a leaf with universal guards is not straightforward. For illustration, let us consider the following bud B0 and the universally quantified companion/lemma C0 ∈ L. As shown in rule [CCUT], to link B0 back to C0, the LHS of these two entailments must be implied through some substitution. To obtain that, we propose lemma instantiation, a sound solution for universal lemma application. Based on the constraints in the LHS of the bud, our technique instantiates a universally quantified guard (of the selected companion/lemma) before linking it back. Concretely, we replace the universal guard by a finite set of its instances; an instantiation of a formula ∀v G(t) is G(t)[w/v] for some vector of terms w. These instances are introduced based on the instantiations in both the LHS and the RHS of the corresponding bud, e.g., n=10 ∧ a=3 ∧ b=7 in B0.

Frame Inference
The two inference rules shown in Fig. 4 are designed specifically to infer constraints for the frame. In these rules, (w, π) is an auxiliary function that existentially quantifies free variables in π that are not in the set w. This function extracts relevant arithmetic constraints to define data contents of the unknown predicates. R(r, t) is either r↦c(t), a known (defined) predicate P(r, t), or an unknown predicate U(r, t, w#). The # in the unknown predicates is used to guide inference and proof search. We only infer on pointers without #-annotation. Uf(w, t) is another unknown predicate which is generated to infer the shape of pointers w. Inferred pointers are annotated with # to avoid double inference. A new unknown predicate Uf is generated only if there exists at least one parameter not annotated with # (i.e., w ∪ t ≠ ∅). To avoid conflict between the inference rules and the other rules (e.g., unfolding and matching), root pointers of a heap formula must be annotated with # in unknown predicates. For example, in our system, while x↦c1(y) * U1(x#,y) is legal, x↦c1(y) * U1(x,y) is illegal.
Our system applies subtraction on the heap pointed to by x rather than inference for the following check:
Soundness. The soundness of the inference rules in Fig. 3 has been shown in unfold-and-match systems for general inductive predicates [3,8]. In the following, we present the soundness of the inference rules in Fig. 4. We introduce the notation R(Γ) to denote a set of predicate definitions The soundness of the predicate synthesis requires that if the definitions generated for unknown predicates are satisfiable, then the entailment is valid.
Theorem 1 follows from the soundness of the rules in Fig. 3 and Lemma 1.

Extensions
In this section, we present two ways to strengthen the inferred frame, by inferring pure properties and by normalizing inductive predicates.

Pure Constraint Inference
The inferred frame is strengthened with pure constraints in two phases. We first enrich the shape-based frame with pure properties such as size, height, sum, set of addresses/values, and their combinations. After that, we apply the same three steps in Section 4 to infer relational assumptions on the new pure properties. Lastly, we check satisfiability of these assumptions using FixCalc [34].
In the following, we describe how to infer size properties given a set of dependent predicates. We can similarly infer properties on height and on sets of addresses and values. We first extend an inductive predicate with a size function to capture size properties. That is, given an inductive predicate P(v) ≡ ⋁ ∆i, we generate a new predicate Pn with a new size parameter n as: Pn(v, n) ≡ ⋁ (∆i ∧ n=sizeF(∆i)), where the function sizeF is inductively defined as follows.
sizeF(κ1 * κ2) = sizeF(κ1) + sizeF(κ2)
sizeF(P(t)) = ts, where ts ∈ t and ts is a size parameter
To support pure properties, we extend the proposed cyclic proof system with bi-abduction for pure constraints, which was presented in [43]. In particular, we adopt the abduction rules to generate relational assumptions over the pure properties in the LHS and the RHS. These rules are applied exhaustively until no more unknown predicates occur.
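As a rough illustration of sizeF, the following Python sketch computes the size of a spatial formula as a pair of a constant and a list of size variables. Only the cases for * and for predicate instances are taken from the definition above. The cases for emp (size 0) and for a points-to cell (size 1) are assumptions of this sketch, chosen to agree with the hand-written definition of lsegn, whose base branch has n=0 and whose inductive branch recurses on n−1. The tuple encoding and the explicit position of the size parameter are likewise hypothetical.

```python
def size_f(kappa):
    """Return (constant, size_variables) such that the size of kappa is
    constant + sum(size_variables).

    kappa is one of:
      ("emp",)                          empty heap        (assumed size 0)
      ("pto", root, ctype, args)        one heap cell     (assumed size 1)
      ("pred", name, args, size_pos)    predicate instance; args[size_pos]
                                        is its size parameter
      ("star", kappa1, kappa2)          separating conjunction
    """
    tag = kappa[0]
    if tag == "emp":
        return 0, []
    if tag == "pto":
        return 1, []
    if tag == "pred":
        _, _name, args, size_pos = kappa
        return 0, [args[size_pos]]
    if tag == "star":
        c1, v1 = size_f(kappa[1])
        c2, v2 = size_f(kappa[2])
        return c1 + c2, v1 + v2
    raise ValueError(f"unknown spatial formula: {kappa!r}")


if __name__ == "__main__":
    # inductive branch of lsegn:  root |-> node(1, q) * lsegn(q, l, m)
    branch = ("star",
              ("pto", "root", "node", (1, "q")),
              ("pred", "lsegn", ("q", "l", "m"), 2))
    print(size_f(branch))
```

For the inductive branch in the usage example the result stands for the constraint n = 1 + m, which matches the n−1 argument of the recursive call in lsegn.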
Normalization. We aim to relate the inferred frame to existing user-provided predicates if possible, as well as to explicate the heap separation (a.k.a. pointer non-aliasing) which may be implicitly constrained through predicates. In particular, we present a lemma synthesis mechanism to explore relations between inductive predicates. Our system processes each inductive predicate in four steps. First, it generates heap-only conjectures (with quantifiers). Secondly, it enriches these conjectures with unknown predicates. Thirdly, it invokes the proposed entailment procedure to prove these conjectures, infer definitions for the unknown predicates and synthesize the lemmas. Lastly, it strengthens the inferred lemma with pure inference.
In the following, we present two types of normalization. The first type is to generate equivalence lemmas. This normalization matches a newly generated predicate to an equivalent existing predicate in a given predicate library. Under the assumption that a library of predicates is provided together with advanced knowledge (e.g., lemmas in [1]), this normalization helps to reuse this knowledge for the newly synthesized predicates, and potentially enhances the completeness of the proof system. Intuitively, given a set S of inductive predicates and another inductive predicate P (which is not in S), we identify all predicates in S which are equivalent to P. A heap-only conjecture is generated to explore the equivalence relation between two predicates, e.g., in the case of P(x, v) and Q(x, w): ∀v• P(root, v) → ∃w• Q(root, w). The shared root parameter x has been identified by examining all permutations of root parameters of the two predicates. Moreover, our system synthesizes lemmas incrementally for the combined domains of shape and pure properties. For example, with lln and lsegn, our system generates the following lemma afterwards: lsegn(root,null,n)↔lln(root,n).
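Before attempting a proof, a conjectured equivalence such as lsegn(root,null,n)↔lln(root,n) can be sanity-checked on small concrete heaps. The Python sketch below does this by exhaustive enumeration. It is a bounded test written for illustration, not the lemma synthesis procedure described above (which proves conjectures with the cyclic proof system), and the heap encoding and the names small_heaps and find_counterexample are assumptions.

```python
from itertools import product


def sat_lln(heap, root, n):
    """h |= lln(root, n), following the definition of lln branch by branch."""
    if not heap:  # emp, root = null, n = 0
        return root is None and n == 0
    cell = heap.get(root)  # root |-> node(1, q) * lln(q, n-1)
    if cell is None or cell[0] != 1:
        return False
    rest = {loc: c for loc, c in heap.items() if loc != root}
    return sat_lln(rest, cell[1], n - 1)


def sat_lsegn(heap, root, end, n):
    """h |= lsegn(root, end, n), following the definition of lsegn."""
    if not heap:  # emp, root = end, n = 0
        return root == end and n == 0
    cell = heap.get(root)  # root |-> node(1, q) * lsegn(q, end, n-1)
    if cell is None or cell[0] != 1:
        return False
    rest = {loc: c for loc, c in heap.items() if loc != root}
    return sat_lsegn(rest, cell[1], end, n - 1)


def small_heaps(locs=(1, 2), values=(1, 2)):
    """All heaps over locs whose cells hold a value and a next pointer."""
    targets = (None,) + tuple(locs)  # None plays the role of null
    options = [None] + [(v, t) for v in values for t in targets]
    for choice in product(options, repeat=len(locs)):
        yield {loc: c for loc, c in zip(locs, choice) if c is not None}


def find_counterexample(end=None, max_n=3):
    """Search for a model on which lsegn(root,end,n) and lln(root,n) differ."""
    for heap in small_heaps():
        for root in (None, 1, 2):
            for n in range(max_n + 1):
                if sat_lsegn(heap, root, end, n) != sat_lln(heap, root, n):
                    return heap, root, n
    return None


if __name__ == "__main__":
    print(find_counterexample())       # conjecture with end = null
    print(find_counterexample(end=1))  # a deliberately wrong conjecture
```

No counterexample exists within the bound when the segment ends at null, whereas the deliberately wrong variant with end=1 is refuted immediately; only the former would be worth passing on to the entailment procedure.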
The other type of normalization is to generate separating lemmas. This normalization aims to expose hidden separation of heaps in inductive definitions. We explore parallel and consequence separating relations over the parameters of inductive predicates. Two parameters of a predicate are parallel separating if they are both root parameters, e.g., r1 and r2 of the predicate zip2 as follows.
Two arguments of a predicate are consequence separating if one is a root parameter and the other is reachable from the root in all base formulas derived by unfolding the predicate (e.g., those of the predicate ll_last). We generate these separating lemmas to explicate separation globally. As a result, the separation of actual parameters is externally visible to analyses. This visible separation enables strong updates in a modular heap analysis or frame inference in modular verification. Suppose r1, r2 are consequence or parallel parameters in Q(r1, r2, w); a heap conjecture is generated as: This technique could be applied to synthesize split/join lemmas to transform predicates into the fragment of linearly compositional predicates [15,14]. For example, our system splits the predicate zip2 into two separating singly-linked lists through the following equivalence lemma: zip2(root,r2,n) ↔ lln(root,n) * lln(r2,n).

Implementation and Experiments
We have implemented the proposed ideas in a procedure called S2ENT for entailment checking and frame inference, based on SLEEK [8]. S2ENT relies on the SMT solver Z3 [27] to check satisfiability of arithmetical formulas. We have also integrated S2ENT into the verifier S2 [24]. We have conducted two sets of experiments to evaluate the effectiveness and efficiency of S2ENT. The first set of experiments is conducted on a set of inductive entailment checking problems gathered from previous publications [1,5,9]. We compare S2ENT with the state-of-the-art tools to see how many of these problems can be solved. In the second set of experiments, we apply S2ENT to conduct modular verification of a set of non-trivial programs. The experiments are conducted on a machine with an Intel i3-M370 (2.4GHz) processor and 3 GB of RAM.

Entailment Proving
In Table 1, we evaluate S2ENT on a set of 36 valid entailment problems that require induction reasoning techniques.In particular, Ent 1-5 were taken from Smallfoot [1], Ent 6-19 from Cyclic SL [3,5], Ent 20-28 from [9], and Ent 29-36 were generated by us.We evaluate S2ENT against the existing proof systems presented for user-defined predicates.While the tools reported in [12,8,36] could handle a subset of these benchmarks if users provide auxiliary lemmas/axioms, [15] was designed neither for those inductive predicates in Ent 6-28 nor frame problems in Ent 29-36.The only two tools which we can compare S2ENT with are Cyclic SL [3] and songbird [40].
The experimental results are presented in Table 1.The second column shows the entailment problems.Column bl captures the number of back-links in cyclic proofs generated by S2ENT.We observe that most of problems require only one back-link in the cyclic proofs, except that Ent 4 requires two back-links and Ent 13-15 of mutual inductive odd-even singly linked lists require three back-links.The last three columns show the results of Cyclic SL , songbird and S2ENT respectively.Each cell shown in these columns is either CPU times (in seconds) if the tool proves successfully, or TO if the tool runs longer than 30s, or X if the tool returns a false positive, or NA if the entailment is beyond the capability of the tool.In summary, out of the 36 problems, Cyclic SL solves 18 (with one TO -Ent 4); songbird solves 25 (with two false positive -Ent 17 and 27 and one TO -Ent 23); and S2ENT solves all 36 problems.
In Table 1, each entailment check in Ent 1-19 has emp as frame axioms (their LHS and RHS have the same heaps).Hence, they may be handled by existing inductive proof systems like [3,9,15,40].In particular, Ent 1-19 include shape-only predicates.The results show that Cyclic SL and songbird ran a bit faster than S2ENT in most of the their successful cases.It is expected as S2ENT requires additional steps for frame inference.Each entailment check in Ent 20-28 includes inductive predicates with pure properties (e.g., size and sortedness).While Cyclic SL can provide inductive reasoning for arithmetic and heap domains separately [5], there is no system proposed for cyclic proofs in the combined domain.Hence, these problems are beyond the capability of Cyclic SL .Ent 20 which requires mutual induction reasoning is the motivating example of songbird (agumented with size property) [40].In particular, sortll represents a sorted list with smallest value min, and tll is a binary tree whose nodes point to their parents and leaves are linked by a linked list [19,24].S2ENT solves each entailment incrementally: shape-based frame and then pure properties.The results show that S2ENT was more effective and efficient than songbird.
Each entailment check in Ent 29-36 requires both inductive reasoning and frame inference. These checks are beyond the capability of all existing entailment procedures for SL. S2ENT generates frame axioms for inductive reasoning. The experiments show that the proposed proof system supports efficient and effective reasoning on both shape and numeric domains, as well as inductive proofs and frame inference.
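To make the notion of a residual frame concrete, the following is a minimal sketch of frame inference for the simplest possible fragment: quantifier-free symbolic heaps built only from points-to atoms, matched syntactically. It is not the S2ENT procedure; it handles no inductive predicates, no existential variables and no pure constraints, and it does not check that the antecedent is satisfiable. The names `PointsTo` and `infer_frame` are hypothetical and introduced only for this illustration.

```python
from collections import Counter
from typing import List, Optional, Tuple

# A points-to atom "x |-> y" is modelled as the pair ("x", "y").
# This representation is an assumption made for the sketch.
PointsTo = Tuple[str, str]


def infer_frame(lhs: List[PointsTo], rhs: List[PointsTo]) -> Optional[List[PointsTo]]:
    """Return the residual frame F such that lhs entails rhs * F, or None.

    Every atom of rhs must be matched by an identical, not yet consumed
    atom of lhs. Whatever remains of lhs is the frame.
    """
    remaining = Counter(lhs)
    for atom in rhs:
        if remaining[atom] == 0:
            return None  # rhs demands a heap cell that lhs does not provide
        remaining[atom] -= 1
    return sorted(remaining.elements())


# x |-> y * y |-> z * z |-> nil  entails  x |-> y, leaving two cells as frame.
frame = infer_frame([("x", "y"), ("y", "z"), ("z", "nil")], [("x", "y")])
print(frame)
```

When both sides describe the same heap the function returns the empty list, which corresponds to the emp frame of Ent 1-19. Entailments such as Ent 29-36 need considerably more than this, since matching against inductive predicates requires unfolding and induction.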

Modular Verification for Memory Safety
We enhance the existing program verifier S2 [24] with S2ENT to automatically verify a range of heap-manipulating programs. We evaluate the enhanced S2 on the open-source C library Glib [16], which includes non-GUI code from the GTK+ toolkit and the GNOME desktop environment. We conduct experiments on the heap-manipulating files, i.e., singly-linked lists (gslist.c), doubly-linked lists (glist.c), balanced binary trees (gtree.c) and N-ary trees (gnode.c). These files contain fairly complex algorithms (e.g., sorting), and the data structures used in gtree.c and gnode.c are very complex. Some procedures of gslist.c and glist.c were evaluated in [36,31,9], where the user had to manually provide a large number of lemmas to support the tool. Furthermore, the verification in [9] is semi-automatic, i.e., verification conditions were manually generated. Apart from the tool in [9], the tools in [36,31] are no longer available for comparison. In Table 2 we show, for each file, the number of lines of code excluding comments (LOC) and the number of procedures (#Pr). We remark that these procedures include tail-recursive procedures which are translated from loops. The columns #√ and sec. show the number of procedures for which S2 can verify memory safety, and the time taken in seconds, without (wo.) and with (w.) S2ENT. Column #syn shows the number of lemmas synthesized using the technique in Sec. 5. With the lemma synthesis, the number of procedures that can be successfully verified increases from 168 (81%) to 182 (88%), with a time overhead of 28% (157 secs/123 secs).
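The overhead figure follows directly from the two totals quoted above. The short calculation below reproduces it using only the numbers stated in the text; the variable names are chosen for this illustration.

```python
# Totals quoted in the text for S2 without and with S2ENT.
verified_without, verified_with = 168, 182
seconds_without, seconds_with = 123.0, 157.0

# Additional procedures verified thanks to lemma synthesis.
extra_procedures = verified_with - verified_without

# Relative time overhead: (157 - 123) / 123.
time_overhead = seconds_with / seconds_without - 1.0

print(f"{extra_procedures} additional procedures, {time_overhead:.0%} time overhead")
```

Under these figures, lemma synthesis accounts for 14 additional verified procedures at a cost of 34 extra seconds.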
A closer look shows that with S2ENT we are able to verify a number of challenging methods in gslist.c and glist.c. By generating separating lemmas, S2ENT successfully infers shape specifications of methods manipulating the last element of lists (i.e., g_slist_concat in gslist.c and g_list_append in glist.c). By generating equivalence lemmas, matching a newly-inferred inductive predicate with predefined predicates in S2 is now extended beyond the shape-only domain. Moreover, the experimental results also show that the enhanced S2 was able to verify 41/52 procedures in gslist.c and 39/51 procedures in glist.c. In comparison, the tool in [9] could semi-automatically verify 11 procedures in gslist.c and 6 procedures in glist.c, while the tool in [31], with user-supplied lemmas, could verify 22 procedures in gslist.c and 10 procedures in glist.c.

Related Work and Conclusion
This work is related to three groups of work. The first group consists of entailment procedures in SL. Initial proof systems in SL mainly focus on a decidable fragment combining linked lists (and trees) [1,11,32,33,29,13,17,14,22,7]. Recently, Iosif et al. extended the decidable fragment to restricted inductive predicates [19]. Timos et al. [42] present a comprehensive summary of the computational complexity of entailments in SL with inductive predicates. Smallfoot [1] and GRASShopper [33] provide systematic approaches for frame inference but with limited support for (general) inductive predicates. Extending these approaches to support general inductive predicates is non-trivial. GRASShopper is limited to a GRASS-reducible class of inductive predicates. While the Smallfoot system has been designed to allow the use of general inductive predicates, its inference rules are hardwired for list predicates only, and a new set of rules must be developed for a proof system targeting general inductive predicates. SLEEK [8] and jStar [12] support frame inference with a soundness guarantee for general inductive predicates. However, they provide only limited support for induction, through user-supplied lemmas [30,12]. Our work, like [8,36], targets an undecidable SL fragment including (arbitrary) inductive predicates and numerical constraints; we trade completeness for expressiveness. In addition to what is supported in [8,36], we support frame inference with inductive reasoning in SL by providing a system of cyclic proofs.
The second group is work on inductive reasoning. Lemmas are used to enhance the inductive reasoning of heap-based programs [30,5,12]. They are used as alternative unfoldings beyond predicates' definitions [30,5], external inference rules [12], or intelligent generalization to support inductive reasoning [3]. Unfortunately, the mechanisms in these systems require users to supply the additional lemmas that might be needed during a proof. SPEN [15] synthesizes lemmas to enhance inductive reasoning for some inductive predicates with bags of values. However, it is designed to support specific classes of inductive predicates, and it is difficult to extend it to cater for general inductive predicates. As a solution to inductive reasoning in SL, Smallfoot [1,3,5] presents subtraction rules that are derived from a set of lemmas on lists and trees. Brotherston et al. propose a cyclic proof system for the entailment problem [2,3]. Similarly, the circularity rule has been introduced in matching logic [38], Constraint Logic Programming [9] and separation logic combined with predicate definitions and arithmetic [40]. Furthermore, the work in [39] supports frame inference based on an ad-hoc mechanism, using simple unfolding and matching. Like [3,9,40], our system also uses historical sequents at case split steps as induction hypotheses. Beyond these systems [3,9,15,40], S2ENT infers frames for inductive proofs systematically, and thus gives better support for modular verification of heap-manipulating programs. Moreover, we show how we can incrementally support inductive reasoning for the combination of heap and pure domains. In contrast, there are no formalized discussions in [5,9,40] about inductive reasoning for the combined domains; while [5] supports these domains separately, [9,40] only demonstrate their support through experimental results.
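The idea of reusing a historical sequent as an induction hypothesis can be illustrated with a small sketch. It checks whether the current sequent is a substitution instance of some ancestor sequent, which is the syntactic precondition for forming a back-link. This is a simplified illustration under strong assumptions, not the rule used by S2ENT or by the cited systems: atoms are matched positionally rather than modulo the commutativity of separating conjunction, pure constraints and frames are ignored, and the global soundness condition is not checked. The names `Atom`, `Sequent`, `CONSTANTS`, `extend_match` and `find_back_link` are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

Atom = Tuple[str, Tuple[str, ...]]       # (predicate name, argument symbols)
Sequent = Tuple[List[Atom], List[Atom]]  # (antecedent atoms, consequent atoms)

CONSTANTS = {"nil"}  # symbols that may only be matched by themselves


def extend_match(pattern: List[Atom], target: List[Atom],
                 theta: Dict[str, str]) -> Optional[Dict[str, str]]:
    """Extend substitution theta so that pattern under theta equals target."""
    if len(pattern) != len(target):
        return None
    theta = dict(theta)  # work on a copy so that a failed match has no effect
    for (p_name, p_args), (t_name, t_args) in zip(pattern, target):
        if p_name != t_name or len(p_args) != len(t_args):
            return None
        for p, t in zip(p_args, t_args):
            if p in CONSTANTS:
                if p != t:
                    return None
            elif theta.setdefault(p, t) != t:
                return None  # variable already bound to a different symbol
    return theta


def find_back_link(history: List[Sequent],
                   current: Sequent) -> Optional[Tuple[int, Dict[str, str]]]:
    """Return (index, substitution) for the first ancestor that current instantiates."""
    for index, (lhs, rhs) in enumerate(history):
        theta = extend_match(lhs, current[0], {})
        if theta is not None:
            theta = extend_match(rhs, current[1], theta)
        if theta is not None:
            return index, theta
    return None


# Ancestor: lseg(x, y) * lseg(y, nil) |- lseg(x, nil)
root: Sequent = ([("lseg", ("x", "y")), ("lseg", ("y", "nil"))],
                 [("lseg", ("x", "nil"))])
# Sequent reached after unfolding, with x replaced by a fresh variable x1.
leaf: Sequent = ([("lseg", ("x1", "y")), ("lseg", ("y", "nil"))],
                 [("lseg", ("x1", "nil"))])
link = find_back_link([root], leaf)
print(link)
```

A cyclic proof system additionally has to ensure that every infinite path through the resulting proof graph makes progress, which is what the global soundness condition enforces; the sketch above only covers the matching step.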
The third group is on lemma synthesis. In inductive reasoning, auxiliary lemmas are generated to discover theorems (e.g., [23,10,28]). The key elements of these techniques are heuristics used to generate equivalence lemmas for sets of given functions, constants and datatypes. In our work, we introduce lemma synthesis to strengthen the inductive constraints. To support theorem discovery, we synthesize equivalence and separating lemmas. This mechanism can be extended with other heuristics to enhance the completeness of modular verification.

Conclusion
We have presented a novel approach to frame inference for inductive entailments in SL with inductive predicates and arithmetic. The core of our proposal is a system of lemma synthesis through cyclic proofs in which back-links are formed using the cut rule. Moreover, we have presented two extensions to strengthen the inferred frames. Our evaluation indicates that our system is able to infer frame axioms for inductive entailment checks that are beyond the capability of existing systems.

Fig. 3. Basic Inference Rules for Entailment Procedure (where gsc is the global soundness condition)

Table 2. Experiments on Glib Library