A Complete Normal-Form Bisimilarity for State

Abstract: We present a sound and complete bisimilarity for an untyped λ-calculus with higher-order local references. Our relation compares values by applying them to a fresh variable, like normal-form bisimilarity, and it uses environments to account for the evolving store. We achieve completeness by a careful treatment of evaluation contexts comprising open stuck terms. This work improves over Støvring and Lassen's incomplete environment-based normal-form bisimilarity for the λρ-calculus, and confirms, in relatively elementary terms, Jaber and Tabareau's result that the state construct is discriminative enough to be characterized with a bisimilarity without any quantification over testing arguments.


Introduction
Two terms are contextually equivalent if replacing one by the other in a bigger program does not change the behavior of the program. The quantification over program contexts makes contextual equivalence hard to use in practice, and it is therefore common to look for more effective characterizations of this relation. In a calculus with local state, such a characterization has been achieved either through logical relations [14,1,4], which rely on types, denotational models [9,12,5], or coinductively defined bisimilarities [8,17,18,16,11].
Koutavas et al. [7] argue that to be sound w.r.t. contextual equivalence, a bisimilarity for state should accumulate the tested terms in an environment to be able to try them again as the store evolves. Such environmental bisimilarities usually compare terms by applying them to arguments built from the environment [18,16,11], and therefore still rely on some universal quantification over testing arguments. An exception is Støvring and Lassen's bisimilarity [17], which compares terms by applying them to a fresh variable, like one would do with a normal-form (or open) bisimilarity [15,10]. Their bisimilarity characterizes contextual equivalence in a calculus with control and state, but is not complete in a calculus with state only: there exist equivalent terms that are not related by the bisimilarity. Jaber and Tabareau [5] go further and propose a sound and complete Kripke Open Bisimilarity for a calculus with local state, which also compares terms by applying them to a fresh variable, but uses notions from Kripke logical relations, namely transition systems of invariants, to reason about heaps.
In this paper, we propose a sound and complete normal-form bisimilarity for a call-by-value λ-calculus with local references which relies on environments to handle heaps. We therefore improve over Støvring and Lassen's work, since our relation is complete, by following a different, potentially simpler, path than Jaber and Tabareau, since we use environments to represent possible worlds and do not rely on any external structures such as transition systems of invariants. Moreover, we do not need types and define our relation in an untyped calculus.
We obtain completeness by carefully treating normal forms that are not values, i.e., open stuck terms of the form E[x v]. First, we distinguish in the environment the terms which should be tested multiple times from the ones that should be run only once, namely the evaluation contexts like E in the above term. The latter are kept in a separate environment that takes the form of a stack, according to the idea presented by Laird [9] and by Jagadeesan et al. [6]. Second, we relate the so-called deferred diverging terms [4,5], i.e., open stuck terms which hide a diverging behavior in the evaluation context E, with the regular diverging terms.
It may be worth stressing that our congruence proof is based on the machinery we have developed before [3] and is simpler than Støvring and Lassen's one, in particular in how it accounts for the extensionality of functions.
We believe that this work makes a contribution to the understanding of how one should adjust the normal-form bisimulation proof principle when the calculus under consideration becomes less discriminative, assuming that one wishes to preserve completeness of the theory. In particular, it is quite straightforward to define a complete normal-form bisimilarity for the λ-calculus with first-class continuations and global store, with no need to refer to other notions than the ones already present in the reduction semantics. Similarly, in the λµρ-calculus (continuations and local references), one only needs to introduce environments to ensure soundness of the theory, but essentially nothing more is required to obtain completeness [17]. In this article we show which new ingredients are needed when moving from these two highly expressive calculi to the corresponding, less discriminative ones (with global or local references only) that do not offer access to the current continuation.
The rest of this paper is organized as follows. In Section 2, we study a simple calculus with global store to see what is needed to reach completeness in that case. In particular, we show in Section 2.2 how we deal with deferred diverging terms. We recall in Section 2.3 the notion of diacritical progress [3] and the framework our bisimilarity and its proof of soundness are based upon. We sketch the completeness proof in Section 2.4. Section 2 paves the way for the main result of the paper, described in Section 3, where we turn to the calculus with local store. We define the bisimilarity in Section 3.2, and prove its soundness and completeness in Section 3.3. We use it in Section 3.4 to prove examples taken from the literature.

A λ-abstraction λx.t binds x in t; we write fv(t) (respectively fv(E)) for the set of free variables of t (respectively E). We identify terms up to α-conversion of their bound variables. A variable or reference is fresh if it does not occur in any other entities under consideration, and a store is fresh if it maps references to pairwise distinct fresh variables. A term or context is closed if it has no free variables. We write fr(t) for the set of references that occur in t.
The call-by-value semantics of the calculus is defined on configurations h | t such that fr(t) ⊆ dom(h) and for all l ∈ dom(h), fr(h(l)) ⊆ dom(h). We let c and d range over configurations. We write t{v/x} for the usual capture-avoiding substitution of x by v in t, and we let ∫ range over simultaneous substitutions ·{v_1/x_1} . . . {v_n/x_n}. We write h[l := v] for the operation updating the value of l to v. The reduction semantics → is defined by the following rules.
The well-formedness condition on configurations ensures that a read operation !l cannot fail. We write →* for the reflexive and transitive closure of →.
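The well-formedness condition and the store update can be made concrete with a small sketch. The term representation below (the class names `Var`, `Lam`, `App`, `Read`, `Assign`, and the helpers `fr`, `well_formed`, `update`) is hypothetical and only covers what is needed to compute the references of a term; it is an illustration under these assumptions, not the formal syntax of the calculus.

```python
from dataclasses import dataclass


# Hypothetical term representation (assumption): just enough to compute fr(t).
@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class Lam:
    param: str
    body: object


@dataclass(frozen=True)
class App:
    fun: object
    arg: object


@dataclass(frozen=True)
class Read:  # stands for !l
    ref: str


@dataclass(frozen=True)
class Assign:  # stands for l := t
    ref: str
    value: object


def fr(t):
    """Set of references occurring in the term t."""
    if isinstance(t, Var):
        return set()
    if isinstance(t, Lam):
        return fr(t.body)
    if isinstance(t, App):
        return fr(t.fun) | fr(t.arg)
    if isinstance(t, Read):
        return {t.ref}
    if isinstance(t, Assign):
        return {t.ref} | fr(t.value)
    raise TypeError(f"unknown term: {t!r}")


def well_formed(h, t):
    """fr(t) ⊆ dom(h) and fr(h(l)) ⊆ dom(h) for every l in dom(h)."""
    dom = set(h)
    return fr(t) <= dom and all(fr(v) <= dom for v in h.values())


def update(h, l, v):
    """h[l := v]: a new store in which l maps to v."""
    new_store = dict(h)
    new_store[l] = v
    return new_store
```

With a store represented as a dictionary from reference names to values, `well_formed` rejects a configuration whose term reads a reference outside the domain of the store, which is the situation the condition is meant to exclude.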
Testing only evaluation contexts is not a restriction, as it implies the equivalence w.r.t. all contexts ≡_C: one can show that t ≡_C s iff λx.t ≡_C λx.s iff λx.t ≡ λx.s.

Normal-Form Bisimulation
Informal presentation. Two open terms are normal-form bisimilar if their normal forms can be decomposed into bisimilar subterms. For example in the plain λ-calculus, a stuck term E[x v] is bisimilar to t if t reduces to a stuck term F[x w] so that respectively E, F and v, w are bisimilar when they are respectively plugged with and applied to a fresh variable.
Such a requirement is too discriminating for many languages, as it distinguishes terms that should be equivalent. For instance in the plain λ-calculus, given a closed value v, t ≝ x v is not normal-form bisimilar to s ≝ (λy.x v) (x v). Indeed, □ is not bisimilar to (λy.x v) □ when plugged with a fresh z: the former produces a value z while the latter reduces to a stuck term x v. However, t and s are contextually equivalent, as for all closed values w, t{w/x} and s{w/x} behave like w v: if w v diverges, then they both diverge, and if w v evaluates to some value w′, then they also evaluate to w′. Similarly, x v Ω and Ω are not normal-form bisimilar (one is a stuck term while the other is diverging), but they are contextually equivalent by the same reasoning.
The terms t and s are no longer contextually equivalent in a λ-calculus with store, since a function can count how many times it is applied and change its behavior accordingly. More precisely, t and s are distinguished by the context l := 0; (λx.□) λz.l := !l + 1; if !l = 1 then 0 else Ω. But this counting trick is not enough to discriminate x v Ω and Ω, as they are still equivalent in a λ-calculus with store. Although x v Ω is a normal form, it is in fact always diverging when we replace x by an arbitrary closed value w, either because w v itself diverges, or because it evaluates to some w′ and then w′ Ω diverges. A stuck term which hides a diverging behavior has been called deferred diverging in the literature [4,5].
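The counting trick and the deferred divergence of x v Ω can be observed with Python closures. In this sketch the exception `Diverge` stands in for the diverging term Ω, a one-entry dictionary plays the role of the reference l, and the names `t`, `s`, `deferred`, `counting_value`, and `converges` are illustrative choices rather than notation from the calculus.

```python
class Diverge(Exception):
    """Stands in for the diverging term Ω (assumption of this sketch)."""


def omega():
    raise Diverge()


def t(x, v):
    # t = x v
    return x(v)


def s(x, v):
    # s = (λy. x v) (x v): the argument x v is evaluated first, then the body
    return (lambda y: x(v))(x(v))


def deferred(x, v):
    # x v Ω: x v is evaluated, then the argument Ω, which diverges
    return x(v)(omega())


def counting_value():
    """λz. l := !l + 1; if !l = 1 then 0 else Ω, with l initialised to 0."""
    store = {"l": 0}

    def w(z):
        store["l"] += 1
        if store["l"] == 1:
            return 0
        omega()

    return w


def converges(thunk):
    """True if running the thunk returns a value, False if it 'diverges'."""
    try:
        thunk()
        return True
    except Diverge:
        return False
```

Substituting the counting value for x makes t converge and s diverge, since s applies x twice. In contrast, `deferred` diverges for every total choice of x, because the argument Ω is evaluated before the result of x v is ever applied.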
It turns out that being able to relate a diverging term to a deferred diverging term is all we need to change from the plain λ-calculus normal-form bisimilarity to get a complete equivalence when we add global store. We do so by distinguishing two cases in the clause for open-stuck terms: a configuration h | E[x v] is related to c either if c can reduce to a stuck configuration with related subterms, or if E is a diverging context, in which case we do not require anything of c. The resulting simulation is not symmetric, as it relates a deferred diverging configuration with any configuration c (even a converging one), but the corresponding notion of bisimulation equates such a configuration only to either a configuration of the same kind or a diverging configuration such as h | Ω.
Progress. We define simulation using the notion of diacritical progress we developed in a previous work [2,3], which distinguishes between active and passive clauses. Roughly, passive clauses are between simulation states which should be considered equal, while active clauses are between states where actual progress is taking place. This distinction does not change the notions of bisimulation or bisimilarity, but it simplifies the soundness proof of the bisimilarity. It also allows for the definition of powerful up-to techniques, relations that are easier to use than bisimulations but still imply bisimilarity. For normal-form bisimilarity, our framework enables up-to techniques which respect η-expansion [3].
Progress is defined between objects called candidate relations, denoted by R, S, T. A candidate relation R contains pairs of configurations, and a set of configurations written R↑, which we expect to be composed of diverging or deferred diverging configurations (for such relations we take R⁻¹↑ to be R↑). We extend R to stores, terms, values, and contexts with the following definitions.
We use these extensions to define progress as follows.

Inria
Definition 2. A candidate relation R progresses to S, T, written R ↣ S, T, if R ⊆ S, S ⊆ T, and

A normal-form simulation is a candidate relation R such that R ↣ R, R, and a bisimulation is a candidate relation R such that R and R⁻¹ are simulations. Normal-form bisimilarity ≈ is the union of all normal-form bisimulations.
We test values and contexts by applying or plugging them with a fresh variable x, and running them in a fresh store; with a global memory, the value represented by x may access any reference and assign it an arbitrary value, hence the need for a fresh store.The stores of two bisimilar value configurations must have the same domain, as it would be easy to distinguish them otherwise by testing the content of the references that would be in one store but not in the other.
The main novelty compared to usual definitions of normal-form bisimilarity [10,3] is the set of (deferred) diverging configurations used in the stuck terms clause. We detect that E in a configuration h | E[x v] is (deferred) diverging by running h | E[y] where y and h are fresh; this configuration may then diverge or evaluate to another deferred diverging configuration h

Like in the plain λ-calculus [3], R progresses towards S in the value clause and T in the others; the former is passive while the others are active. Our framework prevents some up-to techniques from being applied after a passive transition. In particular, we want to forbid the application of bisimulation up to context as it would be unsound: we could deduce that v x and w x are equivalent for all v and w just by building a candidate relation containing v and w.
for fresh y and g, and we have and the resulting terms are in R.

Soundness
In this framework, the soundness of ≈ follows from the validity of a form of bisimulation up to context, a result which itself may require proving that other up-to techniques are valid. We distinguish the techniques which can be used in passive clauses (called strong up-to techniques) from the ones which cannot. An up-to technique (resp. strong up-to technique) is a function

To show that a given f is an up-to technique, we rely on a notion of respectfulness, which is simpler to prove and gives sufficient conditions for f to be an up-to technique.
We briefly recall the notions we need from our previous work [2]. We extend ⊆ and ∪ to functions argument-wise (e.g., (f ∪ g)(R) = f(R) ∪ g(R)), and given a set F of functions, we also write F for the function defined as ⋃_{f∈F} f. We define f^ω as ⋃_{n∈ℕ} f^n. We write id for the identity function on relations, and f̂ for f ∪ id. A function f is monotone if R ⊆ S implies f(R) ⊆ f(S). We write P_fin(R) for the set of finite subsets of R, and we say f is continuous if it can be defined by its image on these finite subsets, i.e., if f(R) ⊆ ⋃_{S∈P_fin(R)} f(S). The up-to techniques we use are defined by inference rules with a finite number of premises, so they are trivially continuous.

Definition 3. A function f evolves to g, h, written f ↣ g, h, if for all R and T, R ↣ R, T implies f(R) ↣ g(R), h(T). A function f strongly evolves to g, h, written f ↣_s g, h, if for all R, S, and T, R ↣ S, T implies f(R) ↣ g(S), h(T).
Evolution can be seen as progress for functions on relations. Evolution is more restrictive than strong evolution, as it requires R such that R ↣ R, T.

Definition 4. A set F of continuous functions is respectful if there exists S such that S ⊆ F and for all f ∈ S, we have f ↣_s S^ω, F^ω; for all f ∈ F, we have f

In words, a function is in a respectful set F if it evolves towards a combination of functions in F after active clauses, and in S after passive ones. When checking that f is regular (second case), we can use a regular function at most once after a passive clause. The (possibly empty) subset S intuitively represents the strong up-to techniques of F. If S_1 and S_2 are subsets of F which verify the conditions of the definition, then S_1 ∪ S_2 also does, so there exists a largest subset of F which satisfies the conditions, written strong(F).
Lemma 1. Let F be a respectful set.

Showing that f is in a respectful set F is easier than proving it is an up-to technique. Besides, proving that a bisimulation up to context is respectful implies that ≈ is preserved by contexts thanks to the last property of Lemma 1.
The up-to techniques for the calculus with global store are given in Figure 1. The techniques subst and plug allow us to prove that ≈ is preserved by substitution and by evaluation contexts. The remaining ones are auxiliary techniques which are used in the respectfulness proof: red relies on the fact that the calculus is deterministic to relate terms up to reduction steps. The technique div allows us to relate a diverging configuration to any other configuration, while plugdiv states that if E is a diverging context, then h | E[t] is a diverging configuration for all h and t. We distinguish the technique plug_c from plug_↑ to get a more fine-grained classification, as plug_c is the only one which is not strong.

Fig. 1: Up-to techniques for the calculus with global store

Lemma 2. The set
We omit the proof, as it is similar to, but much simpler than, the one for the calculus with local store of Section 3. We deduce that ≈ is sound using Lemma 1.
Theorem 1. For all t, s, and fresh store h, if h | t ≈ h | s, then t ≡ s.

Completeness
We prove the reverse implication by building a bisimulation which contains ≡.
Theorem 2. For all t, s, if t ≡ s, then for all fresh stores h, h | t ≈ h | s.

Proof (Sketch). It suffices to show that the candidate R defined as

is a simulation. We proceed by case analysis on the behavior of h | t. The details are in Appendix B.1; we sketch the proof in the case when , and E is not deferred diverging.
A first step is to show that g | s also evaluates to an open-stuck configuration with x in function position. To do so, we consider a fresh l and we define ∫ such that ∫(y) sets l to 1 when it is first applied if y = x, and to 2 if y ≠ x. Then h ⊎ l := 0 | t∫ sets l to 1, which should also be the case of g ⊎ l := 0 | s∫, and it is possible only if g | s →* g′ | F[x w] for some g′, F, and w.
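The marking substitution used in this step can be sketched with closures. The helper `marking_substitution` and the three sample terms below are hypothetical; terms are modelled as Python functions taking an environment that maps variable names to values, and a dictionary entry plays the role of the fresh reference l.

```python
def marking_substitution(variables, x, base):
    """Build a substitution whose values record which variable is applied first.

    The value for y sets l to 1 on its first application if y == x and to 2
    otherwise, then behaves like base[y]. Only the very first application of
    any variable changes l, because l is tested against 0 beforehand.
    """
    store = {"l": 0}

    def wrap(name):
        mark = 1 if name == x else 2

        def value(arg):
            if store["l"] == 0:
                store["l"] = mark
            return base[name](arg)

        return value

    return {y: wrap(y) for y in variables}, store


# Sample terms (assumptions of this sketch), as functions of the environment.
def stuck_on_x(env):
    return env["x"](0)


def stuck_on_y(env):
    return env["y"](0)


def plain_value(env):
    return 0


def observe(term):
    """Run the term under a fresh marking substitution and return the final l."""
    base = {"x": lambda a: a, "y": lambda a: a}
    env, store = marking_substitution(["x", "y"], "x", base)
    term(env)
    return store["l"]
```

A term that first applies x leaves 1 in l, a term that first applies another variable leaves 2, and a term that evaluates to a value without applying any variable leaves 0, which is the three-way distinction the argument relies on.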
We then have to show that E R^c F, v R^v w, and h′ R^h g′. We sketch the proof for the contexts, as the proofs for the values and the stores are similar.
Given h f a fresh store, y a fresh variable, E a context, h E a store, ∫ a closing substitution, we want Let l be a fresh reference.Assuming dom(h) = {l 1 . . .l n }, given a term t, we write i l i := h; t for l 1 := h(l 1 ); . . .l n := h(l n ); t.We define The substitution ∫ x behaves like ∫ except that when ∫ x (x) is applied for the first time, it replaces its argument by ∫ (y) and sets the store to h f h E .Therefore we can conclude from there.

Local Store
We adapt the ideas of the previous section to a calculus where terms create their own local store.To be able to deal with local resources, the relation we define mixes principles from normal-form and environmental bisimilarities.

Syntax, Semantics, and Contextual Equivalence
In this section, the terms no longer share a global store, but instead must create local references before storing values.We extend the syntax of Section 2 with a construct to create a new reference.
Terms: t, s ::= . . . | new l := v in t

Reference creation new l := v in t binds l in t; we identify terms up to α-conversion of their references. We write fr(t) and fr(E) for the set of free references of t or E, and a term or context is reference-closed if its set of free references is empty. Following [17] and in contrast with [4,5], references are not values, but we can still give access to a reference l by passing λx.!l and λx.l := x; λy.y.
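The idea of exposing a local reference through a reader λx.!l and a writer λx.l := x; λy.y, without making the reference itself a value, corresponds to a familiar closure pattern. The function `new_ref` below is a hypothetical illustration in which a one-element list plays the role of the private reference.

```python
def new_ref(initial):
    """Allocate a private cell and return only a reader and a writer for it."""
    cell = [initial]  # the local reference l; never returned directly

    def read(_x):
        # λx.!l
        return cell[0]

    def write(x):
        # λx. l := x; λy.y
        cell[0] = x
        return lambda y: y

    return read, write
```

A client holding `read` and `write` can observe and modify the cell, but two calls to `new_ref` yield independent cells, mirroring the fact that each `new` allocates a fresh reference.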
As before, the semantics is defined on configurations h | t verifying fr(t) ⊆ dom(h) and for all l ∈ dom(h), fr(h(l)) ⊆ dom(h).We add to the rules of Section 2 the following one for reference creation.
We recall that ⊎ is defined for disjoint stores only, so the above rule assumes that l ∉ dom(h), which is always possible using α-conversion. We define contextual equivalence on reference-closed terms as we expect programs to allocate their own store.

Definition 5. Two reference-closed terms t and s are contextually equivalent, written t ≡ s, if for all reference-closed evaluation contexts E and closing substitutions

Bisimilarity
With local stores, an external observer no longer has direct access to the stored values. In the presence of such information hiding, a sound bisimilarity relies on an environment to accumulate terms which should be tested in different stores [7].
true then l := false; true else false and f_2 ≝ λx.true. If we compare new l := true in f_1 and f_2 only once in the empty store, they would be seen as equivalent as they both return true; however, f_1 modifies its store, so running f_1 and f_2 a second time distinguishes them.
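The need to run the two functions more than once can be checked directly. Since the beginning of the definition of f_1 is not fully legible here, the sketch assumes that f_1 tests the content of l, that is, f_1 = λx. if !l then l := false; true else false; the names `make_f1` and `f2` are illustrative.

```python
def make_f1():
    """new l := true in f_1, assuming f_1 = λx. if !l then l := false; true else false."""
    l = [True]  # the private reference

    def f1(_x):
        if l[0]:
            l[0] = False
            return True
        return False

    return f1


def f2(_x):
    # f_2 = λx. true
    return True
```

The first application of each function returns true, so a single test cannot separate them; the second application of the same `f1` returns false while `f2` still returns true.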
Environments generally contain only values [16], except in λµρ [17], where plugged evaluation contexts are kept in the environment when comparing open-stuck configurations. In contrast with λµρ, our environment collects values, and we use a stack for registering contexts [9,6]. Unlike values, contexts are therefore tested only once, following a last-in first-out ordering. The next example shows that considering contexts repeatedly would lead to an overly discriminating bisimilarity. For the stack discipline of testing contexts in action, see Example 8 in Section 3.4.
Example 3. With the same f_1 and f_2 as in Example 2, the terms t ≝ new l := true in f_1 (x λy.y) and s ≝ f_2 (x λy.y) are contextually equivalent. Roughly, for all closing substitutions ∫, t and s either both diverge (if ∫(x) λy.y diverges), or evaluate to true, since ∫(x) cannot modify the value in l. Testing f_1 and f_2 twice would discriminate them and wrongfully distinguish t and s.
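The claim of Example 3 can be probed on a few sample values for x. As in the previous illustration, the sketch assumes f_1 = λx. if !l then l := false; true else false, models divergence with an exception, and uses hypothetical names (`run_t`, `run_s`, `outcome`); testing a handful of substitutions is of course only a sanity check, not a proof of equivalence.

```python
class Diverge(Exception):
    """Stands in for divergence (assumption of this sketch)."""


def make_f1():
    l = [True]  # private reference, allocated once per run of t

    def f1(_x):
        if l[0]:
            l[0] = False
            return True
        return False

    return f1


def f2(_x):
    return True


def run_t(x):
    # t = new l := true in f_1 (x λy.y): f_1 acts as a context, used once
    f1 = make_f1()
    return f1(x(lambda y: y))


def run_s(x):
    # s = f_2 (x λy.y)
    return f2(x(lambda y: y))


def outcome(run, x):
    try:
        return run(x)
    except Diverge:
        return "diverges"


def diverging(_k):
    raise Diverge()


# A few sample closed values to substitute for x.
SAMPLES = [lambda k: k, lambda k: k(k), lambda k: 42, diverging]
```

For each sample, the value substituted for x never gets hold of l, and the context f_1 is run a single time, so both terms produce the same outcome.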
Remark 1. The bisimilarity for λµρ runs evaluation contexts several times and is still complete because of the µ operator, which, like call/cc, captures evaluation contexts, and may then execute them several times.
We let E range over sets of pairs of values, and over sets of values. Similarly, we write Σ for a stack of pairs of evaluation contexts and σ for a stack of evaluation contexts. We write for the empty stack, :: for the operator putting an element on top of a stack, and ++ for the concatenation of two stacks. The projection operator π_1 transforms a set or stack of pairs into respectively a set or stack of single elements by taking the first element of each pair. A candidate relation R can be composed of:
- quadruples (E, Σ, c, d), written E, Σ c R d, meaning that c and d are related under E and Σ;
- quadruples (E, Σ, h, g), written E, Σ h R g, meaning that the elements of E and the top of Σ should be related when run with the stores h and g;
- triples ( , σ, c), written , σ c ∈ R↑, meaning that either c is (deferred) diverging, or σ is non-empty and contains a (deferred) diverging context;
- triples ( , σ, h), written , σ h ∈ R↑, meaning that σ is non-empty and contains a (deferred) diverging context.

Definition 6. A candidate relation R progresses to S, T, written R ↣ S, T, if R ⊆ S, S ⊆ T, and

A normal-form simulation is a candidate relation R such that R ↣ R, R, and a bisimulation is a candidate relation R such that R and R⁻¹ are simulations. Normal-form bisimilarity ≈ is the union of all normal-form bisimulations.
At that point, either d also reduces to a normal form of the same kind, or we test (the first projection of) the stack Σ for divergence, assuming it is not empty.In the former case, we add the values to E and the evaluation contexts at the top of Σ, getting a judgment of the form E , Σ h R g, which then tests the environment and the stack by running either terms in E or at the top of Σ .l := z R ∅ for a fresh z.Executing the contexts on the stack, we get a stuck term of the form if z then l := false; true else false and a value true, which cannot be related, because the former is not deferred diverging.
The terms t and s are therefore not bisimilar, and they are indeed not contextually equivalent, since t gives access to its private reference by passing λy.l := y; y to x.The function represented by x can then change the value of l to false and break the equivalence.
Fig. 2: Selected up-to techniques for the calculus with local store

The last two cases of the bisimulation definition aim at detecting a deferred diverging context. The judgment , σ h ∈ R↑ roughly means that if σ = E_n :: . . . E_1 :: , then the configuration h | E_1[. . . E_n[x]] diverges for all fresh x and all h obtained by running a term from E with the store h. As a result, when , σ h ∈ R↑, we have two possibilities: either we run a term from E in h to potentially change h, or we run the context at the top of σ (which cannot be empty in that case) to check if it is diverging. In both cases, we get a judgment of the form , σ c ∈ R↑. In that case, either c diverges and we are done, or it terminates, meaning that we have to look for divergence in σ.

Example 6. We prove that ∅ | x v Ω and ∅ | Ω are bisimilar. We define R such that ∅, ∅ | x v Ω R ∅ | Ω, for which we need {v}, Ω ::

Finally, only the two clauses where a reduction step takes place are active; all the others are passive, because they are simply switching from one judgment to the other without any real progress taking place. For example, when comparing value configurations, we go from a configuration judgment E, Σ c R d to a store judgment E, Σ h R g or a diverging store judgment E, Σ h ∈ R↑. In a (diverging) store judgment, we simply decide whether we reduce a term from the store or from the stack, going back to a (diverging) configuration judgment. Actual progress is made only when we start reducing the chosen configuration.

Soundness and Completeness
We briefly discuss the up-to techniques we need to prove soundness. We write E{(v, w)/x} for the environment {(v′{v/x}, w′{w/x}) | v′ E w′}, and we also define Σ{(v, w)/x}, {v/x}, and σ{v/x} as expected. To save space, Figure 2 presents the up-to techniques for the configuration judgment only; we give the definitions for the other judgments in Appendix A.
As in Section 2.3, the techniques subst and plug allow us to reason up to substitution and plugging into an evaluation context, except that the substituted values and plugged contexts must be taken from respectively the environment and the top of the stack. The technique div relates a diverging configuration to any configuration, like in the calculus with global store. The technique ccomp allows us to merge successive contexts in the stack into one. The weakening technique weak, originally known as bisimulation up to environment [16], is a usual technique for environmental bisimulations. Making the environment smaller creates a weaker judgment, as having fewer testing terms means a less discriminating candidate relation. Bisimulation up to reduction red is also standard and allows for a big-step reasoning by ignoring reduction steps. Finally, the technique refl allows us to introduce identical contexts in the stack, but also values in the environment or terms in configurations (cf. Appendix A.3).
We denote by subst_c the up to substitution technique restricted to the configuration and diverging configuration judgments, and by subst_s the restriction to the store and diverging store judgments. The respectfulness proofs are in the appendix. Using refl, plug, subst_c, and Lemma 1, we prove that ≈ is preserved by evaluation contexts and substitution, from which we deduce it is sound w.r.t. contextual equivalence.
To establish completeness, we follow the proof of Theorem 2, i.e., we construct a candidate relation R that contains ≡ and prove it is a simulation by case analysis on the behavior of the related terms.
Theorem 4. For all t and s, The main difference is that the contexts and closing substitutions are built from the environment using compatible closures [16], to take into account the private resources of the related terms.We discuss the proof in Appendix B.2.

Examples
Example 7. We start with the so-called awkward example [14,4,5]. Let

We can check that such a candidate is a bisimulation, and it ensures that when l is read (when E_2 is executed), it contains the value 1.

(E, F)^n l := n + 1 R ∅ for any n is a bisimulation. Indeed, running v and w increases the value stored in l and adds a pair (E, F) on the stack. If n > 0, we can run a copy of E and F, thus decreasing the value in l by 1, and then returning true in both cases.
Example 9.This deferred divergence example comes from Dreyer et al. [4].Let We prove that new l := false in new k := false in v 2 is equivalent to w 2 .Informally, if f in w 2 applies its argument w 1 , the term diverges.Divergence also happens in v 2 but in a delayed fashion, as v 1 first sets k to true, and the continuation t def = if !k then Ω else l := true; λy.y then diverges.Similarly, if f stores w 1 or v 1 to later apply it, then divergence also occurs in both cases: in that case t sets l to true, and when v 1 is later applied, it diverges. To for all n is a bisimulation up to refl and red.

Related Work and Conclusion
Related work. As pointed out in Section 1, the other bisimilarities defined for state either feature universal quantification over testing arguments [8,18,16,11], or are complete only for a more expressive language [17]. Kripke logical relations [1,4] also involve quantification over arguments when testing terms of a functional type. Finally, denotational models [9,12] can also be used to prove program equivalence, by showing that the denotations of two terms are equal. However, computing such denotations is difficult in general, and the automation of this task is so far restricted to a language with first-order references [13].
The work most closely related to ours is Jaber and Tabareau's Kripke Open Bisimulation (KOB) [5].A KOB tests functional terms with fresh variables and not with related values like a regular logical relation would do.To relate two given configurations, one has to provide a World Transition System (WTS) which states the invariants the heaps of the configurations should satisfy and how to go from one invariant to the other during the evaluation.Similarly, the bisimulations for the examples of Section 3.4 state properties which could be seen as invariants about the stores at different points of the evaluation.
The difficulty with KOB, as with our bisimilarity, is to come up with the right invariants about the heaps, expressed either as a WTS or as a bisimulation. We believe that choosing one technique over the other is just a matter of preference, depending on whether one is more comfortable with game semantics or with coinduction. It would be interesting to see if there is a formal correspondence between KOB and our bisimilarity; we leave this question as future work.

Conclusion.
We define a sound and complete normal-form bisimilarity for higher-order local state, with an environment to be able to run terms in different stores.

We distinguish in the environment values which should be tested several times from the contexts which should be executed only once. The other difficulty is to relate deferred and regular diverging terms, which is taken care of by the specific judgments about divergence. The lack of quantification over arguments makes the bisimulation proofs quite simple.
Future work includes making these proofs even simpler by defining appropriate up-to techniques. The techniques we use in Section 3.3 to prove soundness turn out to be not that useful when establishing the equivalences of Section 3.4, except for trivial ones such as up to reduction or reflexivity. The difficulty in defining the candidate relations for the examples of Section 3.4 is in finding the right property relating the stack Σ to the store, so maybe an up-to technique could make this task easier.
As pointed out in Section 1, our results can be seen as an indication of what kind of additional infrastructure in a complete normal-form bisimilarity is required when the considered syntactic theory becomes less discriminative: in our case, when control operators vanish from the picture, and mutable state is the only extension of the λ-calculus. A question one could then ask is whether we can find a less expressive calculus (maybe the plain λ-calculus itself) for which a suitably enhanced normal-form bisimilarity is still complete.

A Respectfulness Proofs
For the proofs, we write f ↣ (respectively f ↣_s) if there exist f′ and f″ such that f ↣ f′, f″ (respectively f ↣_s f′, f″) and the conditions of Definition 4 are met.
A.1 Up to Substitution
If t = v , then there exist g_1 , w such that g_1 | s →* g_1 | w and E ∪ {(v , w )}, Σ h_1 R g_1 . We conclude with subst_s.
Suppose t = E[zv ] and there exist g , w , Otherwise, v y is an open-stuck term, i.e., v = x for some x .There exists F , g , and x and v = x for some x , then we conclude with subst s↑ .Otherwise, we have As a result, we have as wished.Lemma 6. subst s↑ s subst ↑ , id Proof.Same as the previous one.
Otherwise, because v ∈ , we have ∪ {v }, E :: σ h | v y ∈ R↑ for some y.But we also have h | vy → h | t for some h , and t , therefore ∪{v }, E ::

A.2 Up to Context Plugging and Composition
Proof.By case analysis on t.
From there, we conclude as with subst c .Suppose t = E [x v] and there exist g , w, and ↑, so we conclude as with subst ↑ . If Proof.The clause for checking terms of the environment is easy using ccomp.
For the other check, if Σ 1 is not empty, we can conclude with ccomp.
Lemma 11. ccomp s
Proof. Same as the previous one.

Lemma 12. ccomp s
Proof. By case analysis on t. In all cases, we simply unfold the definitions.

Lemma 13. ccomp s
Proof.Same as the previous one.

A.3 Reflexivity
The context E or value v may allocate fresh references, hence the need for h . Other than that, the proofs are straightforward.

A.4 Other Techniques
Proof.By case analysis on t; each case is straightforward.
Given a store h such that dom(h) = {l 1 . . .l n } and a term t, we write i l i := h; t for the term l 1 := h(l 1 ); . . .l n := h(l n ); t.In the following we implicitly take advantage of the fact that if Lemma 18.The relation R defined by We consider the possible cases in turn.
Case h | t → c . Follows immediately from the definition of R.
Case t = v. We have to check first that g | s evaluates to a value configuration. By definition of R, since h | v is a value configuration, g | s cannot diverge. Furthermore, it cannot reduce to a stuck configuration either. Indeed, if g | s → * g | E[x w] for some g , E, x, and w, then taking ∫ to be the substitution which maps all the free variables of h, t, g, and s to λx.Ω, we get g | s ∫ → * g ∫ | E∫ [(λx.Ω) w∫ ] , a diverging configuration, which contradicts the fact that h | v ∫ is a value configuration. Therefore, there exist g , w such that g | s → * g | w . We have to show that h R h g and v R v w. For the latter, we have to show that for a fresh store h f , a fresh variable x, a context E, a store h E , and a closing substitution ∫ , it is the case that and similarly for g | s . To show h R h g , we proceed similarly using (λy.let z = !l j in i l i := h f h E ; E[z x]) for each j.
Case t = E[x v]. Suppose there exist ∫ , h E , and E such that h h E | E [t] ∫ terminates. We show that g | s also evaluates to a stuck-term configuration with x in function position. Let l be a fresh reference, and x → λa.if !l = 0 then l := 1; ∫ (x) a else ∫ (x) a y → λa.if !l = 0 then l := 2; ∫ (y) a else ∫ (y) x also terminates, with 1 also in l, since we can easily build a context that diverges if !l ≠ 1. Therefore g | s does not evaluate to a value configuration, otherwise l would contain 0, and it does not evaluate to a stuck configuration with y ≠ x in function position, otherwise l would contain 2. As a result, there exist g , F , and We first show that E R c F . Let h f be a fresh store, y a fresh variable, E a context, h E a store, ∫ a closing substitution, and l be a fresh reference. We define The substitution ∫ x behaves like ∫ except that when ∫ x (x) is applied for the first time, it replaces its argument by ∫ (y) and sets the store to h f h E . Therefore We have the same sequence of reductions for g l := 0 | E [s] ∫ x so we can conclude from there.
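The role of the flag reference l in this case can be illustrated with a small Python sketch. It is only an analogy under stated assumptions: Python closures stand in for λ-abstractions, a `Ref` object stands in for the fresh reference l, and the names `Ref`, `instrument`, `observe`, and the three sample terms are hypothetical and do not appear in the calculus. The sketch shows how the content of l after evaluation reveals whether a term returned a value (0), first applied x (1), or first applied another variable (2).

```python
class Ref:
    """A mutable cell standing in for a reference of the calculus."""
    def __init__(self, contents):
        self.contents = contents


def instrument(sigma, x, l):
    """Wrap each function of the substitution `sigma` so that the first
    application records in `l` whether `x` (1) or another variable (2)
    was applied. Later applications leave `l` untouched."""
    def wrap(name, f):
        def wrapped(a):
            if l.contents == 0:
                l.contents = 1 if name == x else 2
            return f(a)
        return wrapped
    return {name: wrap(name, f) for name, f in sigma.items()}


# Hypothetical closing substitution: every free variable is the identity.
sigma = {"x": lambda a: a, "y": lambda a: a}


# Hypothetical open terms, modelled as functions of their environment.
def returns_value(env):
    return 42                      # applies no free variable


def stuck_on_x(env):
    return env["y"](env["x"](0))   # x is applied first


def stuck_on_y(env):
    return env["x"](env["y"](0))   # y is applied first


def observe(term):
    """Run `term` under the instrumented substitution and read the flag."""
    l = Ref(0)
    term(instrument(sigma, "x", l))
    return l.contents
```

Reading the flag after the run is what a discriminating context does in the proof: two terms that leave different values in l can be separated by a context that diverges on one of those values.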
The proofs of v R v w and h R h g rely on a similar discriminating substitution and context. The idea is to run h h E | E [t] ∫ and, when ∫ (x) is applied for the first time, to store the value of either v or h(l j ) for a given j in a fresh reference l v . Because we know that the aforementioned configuration terminates, at the end of its evaluation, we can retrieve the value in l v to test it. To do so, we use the same entities as in the previous case, and the extra fresh reference l v . To compare v and w, we define To compare h and g , we would store in l v the value stored in l j . The substitution ∫ x behaves like ∫ when !l < 2, and then behaves as ∫ . It stores what should be tested when ∫ x (x) is applied for the first time (when !l = 0). Let

and the resulting configuration behaves like h
We have the same reductions for g h E l := 0 | E lv [s] ∫ x so we can conclude.
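The recording step used in this argument, where the entity to test is saved in l v the first time the instrumented function is applied, can be sketched in Python. This is an informal analogy, not the construction of the proof: `Ref`, `record_on_first_call`, and the `pick` parameter are hypothetical names, and `pick` is an assumption introduced here to cover both variants (saving the argument, or saving the current content of a store cell l j).

```python
class Ref:
    """A mutable cell standing in for a reference of the calculus."""
    def __init__(self, contents):
        self.contents = contents


def record_on_first_call(f, l, l_v, pick):
    """Behave like `f`, but on the first application (when l holds 0)
    save `pick(argument)` in `l_v` and set the flag `l` to 1."""
    def wrapped(a):
        if l.contents == 0:
            l.contents = 1
            l_v.contents = pick(a)
        return f(a)
    return wrapped


# Variant 1: remember the argument of the first application.
l, l_v = Ref(0), Ref(None)
fx = record_on_first_call(lambda a: a + 1, l, l_v, lambda a: a)
first_result = fx(10)
second_result = fx(20)          # does not overwrite l_v
saved_argument = l_v.contents

# Variant 2: remember the content of a store cell at that moment.
l2, l_v2 = Ref(0), Ref(None)
l_j = Ref("before")
gx = record_on_first_call(lambda a: a, l2, l_v2, lambda a: l_j.contents)
l_j.contents = "at first call"
gx(0)
l_j.contents = "after"
saved_cell = l_v2.contents
```

Because the run is known to terminate, the saved entity can be read back from l v afterwards and tested, which is the step the proof relies on.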

Suppose now that for all ∫ , h E , and E , h h E | E [t] ∫ diverges. We show that E ∈ R↑ c . Let h f be a fresh store and y a fresh variable. Suppose there exist Let l be a fresh reference and What remains to check are the clauses for c ∈ R↑. The case c → c is easy to check, and the case c = E[x v] is similar to the previous case.

B.2 Proof of Theorem 4
We first define a notion of a compatible closure over an environment E, for values E v , terms E t , evaluation contexts E c , and heaps E h : We use the same notation for the unary counterparts of the above relations that are defined in the expected way.
Then we define a relation subst on substitutions that reify an environment E: and a relation ctxt on evaluation contexts that reify a context stack Σ under an environment E: As before we assume the corresponding unary versions of the two relations.
For each kind of simulation statement, we define a closing predicate stating that elements built from the environment and/or the stack are closing the simulation statement.
Lemma 19. The relation R defined as follows is a simulation: The proof relies on the following auxiliary lemmas.
Lemma 20. For all h, E, t, v, x, and a fresh l, h Proof. Clear, as v is replaced by a λ-abstraction which behaves like an η-expansion of v.
Lemma 21. For all h, E 1 , E 2 , v, ∫ , fresh l, and Proof. The first application of λy.if !l then l := false; E 2 [y] else x y] plugs v into E 2 ; any potential additional application then behaves as ∫ (x).
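The behaviour described in this proof, a function that plugs its argument into E 2 once and afterwards acts like ∫ (x), can be sketched in Python. The sketch assumes that contexts and substituted values are modelled as plain Python functions; `Ref`, `one_shot`, and the tagged tuples used as results are hypothetical and serve only to make the two branches observable.

```python
class Ref:
    """A mutable cell standing in for a reference of the calculus."""
    def __init__(self, contents):
        self.contents = contents


def one_shot(E2, sigma_x, l):
    """Model of: λy. if !l then l := false; E2[y] else σ(x) y.
    The flag `l` is expected to hold True initially."""
    def f(y):
        if l.contents:
            l.contents = False
            return E2(y)        # first application: plug y into E2
        return sigma_x(y)       # later applications: behave like σ(x)
    return f


# Hypothetical context and substituted value, tagged for observation.
l = Ref(True)
f = one_shot(lambda y: ("E2", y), lambda y: ("sigma_x", y), l)
results = [f(1), f(2), f(3)]
```

Keeping the flag in a cell shared between calls is what makes the switch permanent: once l holds false, every later application takes the second branch.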

For all t E
t s , there exist t and s such that t E t s, t = t[x → v], and Similarly for the unary versions of the environment, subst, and ctxt.
Proof. The lemma states that we can replace any occurrence of v and w by a fresh variable, so that we obtain entities built only from E. The proofs are easy by induction on the derivations of the closures.
Proof. In this lemma, we extend unary predicates to binary predicates. The entities h 0 , E 0 , and ∫ 0 may be constructed using variables and references fresh from π 1 (E), π 1 (Σ), h, t, but not from E, Σ, g, and s. Besides, ∫ 0 may not be closing these extra entities either. We therefore construct h 1 , E 1 , and ∫ 1 in the same way as h 0 , E 0 , and ∫ 0 , but with "fresher" variables and references when those are needed. Besides, ∫ 1 is extended so that it closes g and s.
We then build g 1 , F 1 , and ∫ 2 by turning the unary derivations of h 1 , E 1 , and ∫ 1 built out of π 1 (E) and π 1 (Σ) into binary derivations out of E and Σ.
With these results, we can prove Lemma 19.
We proceed by case analysis on t.
Using Lemma 23 and substitutions similar to those in the proof of Theorem 2, we can show that there exist g , F , and w such that g | s → * g | F [x w]. We need to prove that E ∪ {(v, w)}, (E, F ) :: Σ h R g . We proceed as in the value case, but using the same intermediate configurations as in the previous subcase.
Checking R 2 : let E, Σ h R g.We have to check the condition on the environment and the one on the stack.
Checking R 4 : similar to R 2 .
which is exactly what we want.