Classical By-Need
Abstract
Call-by-need calculi are complex to design and reason with. When adding control effects, the very notion of canonicity is irremediably lost, the resulting calculi being necessarily ad hoc. This calls for a design of call-by-need guided by logical rather than operational considerations. Ariola et al. proposed such an extension of call-by-need with control, making use of Curien and Herbelin's duality of computation framework.
In this paper, classical by-need is developed as an alternative extension of call-by-need with control, better suited for a programming-oriented reader. This method is proof-theoretically oriented, relying on linear head reduction (LHR) – an evaluation strategy coming from linear logic – and on the \(\lambda \mu \)-calculus – a classical extension of the \({\lambda }\)-calculus.

LHR is first reformulated by introducing closure contexts and extended to the \({\lambda }{\mu }\)-calculus;

it is then shown how to derive a call-by-need calculus from LHR. The result is compared with standard call-by-need calculi, namely those of Ariola–Felleisen and Chang–Felleisen;

it is finally shown how to lift the previous item to classical logic, that is, from the \({\lambda }\)-calculus to the \({\lambda }{\mu }\)-calculus, providing a classical by-need calculus, that is, a lazy \({\lambda }{\mu }\)-calculus. The result is compared with the call-by-need with control of Ariola et al.
Keywords
Call-by-need · Classical logic · Control operators · Lambda-calculus · Lambda-mu-calculus · Lazy evaluation · Linear head reduction · Linear logic · Krivine abstract machine · Sigma-equivalence
1 Introduction
In his survey on the origins of continuations, Reynolds noticed that “in the early history of continuations, basic concepts were independently discovered an extraordinary number of times” [28]. It is actually a well-known fact of the (long) history of science that deep, structuring ideas are rediscovered several times. Computer science and modern proof theory have a much shorter history but are no exception. Very much related to the question of continuations, we may think of double-negation translations or, more recently, of Girard's and Reynolds' discoveries of, respectively, System F [17] and the polymorphic \({\lambda }\)-calculus [27].
We think that this convergence of structuring ideas and independent discoveries is at play, to some extent, with call-by-need evaluation and linear head reduction: while the first is operationally motivated, the latter comes from the structure of linear logic proofs. This paper aims at demonstrating this convergence and at exploiting it to incorporate first-class control into call-by-need.
Computation on Demand. Executing computations whose result may never be used to produce a value obviously leads to unnecessary work, potentially resulting in non-termination even when a value exists. An alternative is to fire a redex only when it is necessary to pursue the evaluation towards a value.
In this example^{1}, call-by-value reduction reduces \(\varDelta \ \varDelta \) again and again, even though the redex is of no use for reaching a value, while call-by-name simply discards the argument.
Call-by-name, and more precisely (weak) head reduction, thus realizes a form of demand-driven computation: a redex is fired only if it contributes to the (weak) head normal form (usually abbreviated as (w)hnf).
In the above example, call-by-name reduction duplicates the computation of \(I\ I\) while call-by-value only duplicates the value I, resulting in a shorter reduction path to the value.
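The contrast between the two strategies can be sketched with thunks (zero-argument functions) in Python. This is an illustrative model only, not part of the paper's formal development, and all helper names are hypothetical.

```python
# Call-by-name modelled with thunks: an argument is passed unevaluated
# and only computed when (and each time) it is forced.

def diverge():
    while True:      # stands for Delta Delta: forcing it never returns
        pass

def k_zero(thunk):   # \x. 0 -- discards its argument
    return 0

# By-name: the divergent argument is never forced, so we reach a value.
assert k_zero(lambda: diverge()) == 0

# By-name duplicates work: forcing the same thunk twice recomputes it.
calls = []
def expensive():
    calls.append(1)  # record each (re)computation, standing for I I
    return 21

def double(thunk):   # \x. x + x -- forces its argument twice
    return thunk() + thunk()

assert double(expensive) == 42
assert len(calls) == 2   # computed twice, as call-by-name would
```

Call-by-value would instead evaluate the argument exactly once before the call, looping forever on `diverge` but sharing the result of `expensive`.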
Interestingly, demand-driven computation resulted in two lines of work: one motivated by theoretical purposes and rooted in logic, Danos and Regnier's linear head reduction; the other motivated by more practical concerns and resulting in the study of lazy evaluation strategies for functional languages.

first, it reduces only the \(\beta \)-redex binding the leftmost variable occurrence (hence the “head” in its name), and

second, it substitutes the argument only for the head occurrence of the variable (hence the “linear” in its name), without destroying the fired redex.
Lazy Evaluation. Wadsworth introduced lazy evaluation [30] as a way to overcome the defects of both call-by-name and call-by-value evaluation recalled in the above paragraphs. Lazy evaluation, or call-by-need, can be viewed as a strategy reconciling the best of the by-value and by-name worlds in terms of reductions: a computation is triggered only when it is needed for the evaluation to progress and, in that case, it avoids redoing computations. The price to pay is that the by-need strategy is tricky to formulate and reason about. For instance, Wadsworth had to introduce graph reduction in order to allow sharing of subterms, and the following developments on lazy evaluation essentially dealt with machines. The essence of call-by-need is summarized by Danvy et al. [16]:
Demand-driven computation & memoization of intermediate results
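This motto can be sketched as a memoized thunk. The class below is a hypothetical illustration of the two ingredients, not an artefact of any of the calculi discussed in this paper.

```python
class Thunk:
    """Call-by-need sketch: delay a computation and run it at most once."""
    def __init__(self, compute):
        self.compute = compute
        self.done = False
        self.value = None

    def force(self):
        if not self.done:            # demand-driven: run only when needed
            self.value = self.compute()
            self.done = True         # memoization of the intermediate result
            self.compute = None      # drop the closure once evaluated
        return self.value

runs = []
t = Thunk(lambda: (runs.append(1), 21)[1])
assert t.force() + t.force() == 42   # second force reuses the stored value
assert len(runs) == 1                # the computation ran exactly once
```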
Designing a proper calculus for call-by-need remained open for about two decades, until the mid-nineties when, in 1994, two very similar solutions to this problem were simultaneously presented by Ariola and Felleisen on the one hand, and by Maraist, Odersky and Wadler on the other [8, 9, 22].
Ariola and Felleisen’s calculus can be presented as follows:
Definition 1

The lazy behaviour of the calculus is encoded in the structure of contexts: the term E[x] evidences that the variable x is in needed position.

Rule Deref then fetches the argument, once it has been computed and the variable has been detected as needed. In that case, the argument is substituted for one copy of the variable x, the one in needed position. As a consequence, the application is not erased and a single occurrence of the variable has been substituted. (E is a single-hole context.)

Rules Lift and Assoc allow for the commutation of evaluation contexts in order for Deref redexes to appear despite the persisting binders.
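The linearity of Deref (substituting only the occurrence in needed position) can be sketched on a naive term representation. The tuple encoding below is hypothetical and not the paper's formalism.

```python
# Terms: ('var', x) | ('lam', x, body) | ('app', fun, arg).
# subst_needed follows the leftmost spine and replaces only the head
# occurrence of x, leaving other occurrences and the binder untouched.

def subst_needed(term, x, v):
    kind = term[0]
    if kind == 'var':
        return v if term[1] == x else term
    if kind == 'lam':
        _, y, body = term
        if y == x:                 # x is rebound: no free occurrence below
            return term
        return ('lam', y, subst_needed(body, x, v))
    _, f, a = term                 # application: head position is on the left
    return ('app', subst_needed(f, x, v), a)

I = ('lam', 'z', ('var', 'z'))
t = ('app', ('var', 'x'), ('var', 'x'))   # x x
# Only the head occurrence of x is replaced; the argument copy remains.
assert subst_needed(t, 'x', I) == ('app', I, ('var', 'x'))
```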
We give an example of a reduction sequence in the Ariola–Felleisen call-by-need \({\lambda }\)-calculus in Fig. 2. In the last line, we highlight the term that would remain after applying the garbage-collection rule considered by Maraist et al. [22]. Even though this is not part of the calculus, this convention of garbage-collecting weakening redexes is used in the rest of the paper to ease the reading of values.

call-by-need can be seen as an optimization of both call-by-name and call-by-value, while LHR can be seen as an optimization of head reduction;

both rely on a linear, rather than destructive, substitution (at least in the Ariola–Felleisen calculus presented above);

more importantly, both share with call-by-name the same notion of convergence and the induced observational equivalences. Being observationally indistinguishable in the pure \({\lambda }\)-calculus, they require side-effects to be told apart from call-by-name.
LHR made very scarce appearances in the literature for fifteen years, seemingly falling into oblivion except for the original authors. Yet, it made a surprise comeback at the beginning of the 2010s through a research line initiated by Accattoli and Kesner [4]. Their article describes the so-called structural \({\lambda }\)-calculus, featuring explicit substitutions and at-distance reduction, taking once again inspiration from the computational behaviour of proof-nets and revamping the \({\sigma }\)-equivalence relation in this framework. In their system, blocks of explicit substitutions are stuck where they were created and are considered transparent for all purposes but the rule of substitution of variables, contrasting sharply with the usual treatment of explicit substitutions. In practice, this is done by splitting \({\beta }\)-reduction into multiplicative steps (corresponding to the creation of explicit substitutions) and exponential steps (corresponding to the effective substitution of a variable by some term). LHR naturally arises from the call-by-name flavour of the at-distance rules, and indeed the connection with the historical LHR is made explicit in many articles from the subsequent trend [2, 4, 6] and is furthermore used to obtain results ranging from computational complexity to factorization of rewriting theories of the \({\lambda }\)-calculus [3, 5].
While it took two decades for call-by-need to be equipped with a proper calculus, the way LHR is usually defined is intricate and inconvenient to work with. We actually view this fact, together with the observational indistinguishability, as one of the reasons for the almost complete absence of LHR from the literature until its rediscovery by Accattoli and Kesner.
In by-name, a is immediately substituted both in p and q, duplicating the callcc so that the term reduces to 0. In by-need, a must be fully evaluated before being substituted, and the callcc is fired only once. This forces the term to reduce to 99 instead. The impact of control is actually deeper: we can distinguish between several call-by-need calculi, as evidenced by the second author in joint work with Ariola and Herbelin [10] on defining call-by-need extensions of \(\overline{\lambda }\)\({\mu }\)\(\tilde{\mu }\), a sequent-style \({\lambda \mu }\)-calculus [13].
In this context, it makes sense to wonder which calculus to pick and what observational impact these choices may have. We can summarize the aim of the present paper as integrating call-by-need and control operators in a logically principled way. We take a different approach from that of Ariola et al. [10]: instead of starting from the sequent calculus, which readily integrates control [13], we show how to systematically transform LHR into call-by-need, and show that this derivation can be smoothly lifted to the case of the \({\lambda \mu }\)-calculus.
Contributions and Organization of the Paper. The contributions of the present paper are threefold.

First, we reformulate LHR by introducing closure contexts and extend LHR to the \(\lambda \mu \)-calculus in Sect. 2.
Then, after recalling Ariola–Felleisen's call-by-need calculus, we show in Sect. 3 how to derive a call-by-need calculus from LHR in three simple steps. We validate our constructions by comparing the resulting calculus with well-known call-by-need calculi, namely Ariola and Felleisen's and Chang and Felleisen's call-by-need. This justifies the following motto:

Third, we finally show in Sect. 4 how to lift the previous derivation to classical logic, that is, from the \(\lambda \)-calculus to the \(\lambda \mu \)-calculus, synthesizing two classical by-need calculi, that is, call-by-need \(\lambda \mu \)-calculi, from classical LHR. The result is compared with Ariola et al.'s call-by-need with control.
The whole picture is summarized in the above diagram. Plain arrows indicate some form of equivalence between two calculi with the corresponding theorem indicated. Dashed arrows indicate that a calculus is obtained from another by a small transformation, which is described in the section aside. Blocks indicate to which family of reduction a calculus pertains.
2 A Modern Linear Head Reduction
In the introduction, we informally introduced LHR. We now turn to its actual study, first recalling its historical presentation [15] and \({\sigma }\)-equivalence, and then giving a new formulation of the reduction based on closure contexts, which allows us to provide a classical variant seamlessly.
2.1 Historical Presentation of Linear Head Reduction
We first define Danos and Regnier’s linear head reduction:
Definition 2
Definition 3
Remark 1
Head lambdas are precisely the lambdas from the spine which will not be fed with arguments during head reduction. Equipped with the above notions, we can now formally define linear head reduction:
Definition 4
 1.
there exists some term t s.t. \(\{x \leftarrow t\}\in p(u)\);
 2.
r is u where the variable occurrence \({{\mathrm{hoc}}}{(u)}\) has been substituted by t.
Remark 2
Linear head reduction only substitutes one occurrence of a variable at a time and never destroys an application node. Likewise, it never decreases the number of prime redexes. Thus terms keep growing, hence the name “linear”, referring to linear substitution. An example of linear head reduction is given in Fig. 2, where prime redexes are shown in grey boxes.
2.2 Reduction Up to \({\sigma }\)-equivalence
It is noteworthy that LHR reduces terms which are not yet redexes for \(\beta _h\), i.e. lh may fetch the argument of a binder even if the binder is not directly applied to it. The third reduction \((\star )\) of the example from Fig. 2 features such a cross-redex reduction. In this reduction, the \(\lambda y_0\) binder steps across the prime redex \(\{z_0 \leftarrow x\}\) in order to recover its argument x. This kind of reduction would not have been allowed by the usual head reduction \(\beta _h\). This peculiar behaviour can be made more formal thanks to rewriting up to an equivalence, also introduced by Regnier [25, 26].
Definition 5
Intuitively, \({\sigma }\)equivalence allows reduction in a term where it would have been forbidden by other essentially transparent redexes.
Proposition 1
If \(t\mathrel {{\cong }_{\sigma }}u\), then \(p(t) = p(u)\).
Proof
By case analysis on the rules of \({\sigma }\)equivalence.
Proposition 2
Proof
By induction on p(t). Existence of \(L_1\) follows from p being inductively defined over a left context, that of \(L_2\) from the fact that the hoc is the leftmost variable.
The previous result can be slightly refined. The \(\mathrel {{\cong }_{\sigma }}\) relation is reversible, so that we can rebuild r by applying to \({L}_{{{1}}}[(\lambda {{x}}.\,{L}_{{{2}}}[u])\ u]\) the reverse \({\sigma }\)-equivalence steps of the rewriting from t to \({L}_{{{1}}}[(\lambda {{x}}.\,{L}_{{{2}}}[x])\ u]\). We will not detail this operation here but rather move on to the definition of closure contexts.
2.3 Closure Contexts and the \({\lambda }_{lh}\)-calculus
With the aim of giving first-class status to the reduction up to \({\sigma }\)-equivalence of Proposition 2, we introduce closure contexts, which allow us to reformulate linear head reduction.
Definition 6
Closure contexts enjoy all the properties required for a nice algebraic behaviour, namely composability and factorization. Composition of contexts, \(E_1[E_2]\), will be written \(E_1\circ E_2\) in the following.
Proposition 3
(Composition). Let \({\mathcal {C}}_{1}\), \({\mathcal {C}}_{2}\) be closure contexts. \({\mathcal {C}}_{1}\circ {\mathcal {C}}_{2}\) is a closure context.
Proposition 4
(Factorization). Any term t can be uniquely decomposed as a closure context, maximal in the sense of composition, applied to a subterm \(t_0\).
Actually, we get even more: closure contexts precisely capture the notion of prime redex as asserted by the following proposition.
Proposition 5
Let t be a term. Then \(\lbrace x\leftarrow u\rbrace \in p(t)\) if and only if there exist a left context L, a closure context \(\mathcal {C}\) and a term \(t_0\) such that \(t = L[\mathcal {C}[\lambda {{x}}.\,{t}_{{{0}}}]\ u]\).
Proof
By induction on p(t). The proof goes the same way as for Proposition 2, except that we make the context \(\mathcal {C}\) explicit instead of expressing it as a \({\sigma }\)-equivalence.
Since closure contexts capture prime redexes, we can provide an alternative and more conventional definition of LHR. The result is the \({\lambda }_{lh}\)-calculus, based on contexts rather than ad hoc variable manipulations.
Definition 7
Proposition 6
(Stability of \({\lambda }_{lh}\) under \({\sigma }\)). Let t, u and v be terms such that \(t\mathrel {{\cong }_{\sigma }}u\mathrel {{\rightarrow }_{lh}}v\); then there is w such that \(t\mathrel {{\rightarrow }_{lh}}w\mathrel {{\cong }_{\sigma }}v\).
Proof
By induction over the starting \({\sigma }\)-equivalence, and case analysis of the possible interactions between contexts. For instance, if the \({\lambda }\)-abstraction of the rule interacts with \(\mathcal {C}\) through the second generator of the \({\sigma }\)-equivalence, this amounts to transferring a fragment from \(\mathcal {C}\) into \(L_2\), which is transparent for the reduction rule. Similar interactions may appear at context boundaries or inside contexts.
Theorem 1
Proof
Indeed, the x from the rule is precisely the hoc of the term, since closure contexts are in particular left contexts; and because we are reducing up to closure contexts, Proposition 5 ensures that \(\lbrace x\leftarrow u\rbrace \) is a prime redex.
2.4 Closure Contexts and the KAM: A Strong Relationship
Remarkably enough, closure contexts do not come out of the blue. They are indeed already present in the Chang–Felleisen call-by-need calculus [12], even if their intrinsic interest, their properties, as well as the natural notion of LHR stemming from them, were not made explicit. Perhaps the main contribution of our work is to put them to work as a design principle.
Closure contexts are morally transparent for some well-behaved head reduction: one can consider that \({(\mathcal {C}[\lambda {x}.\,{[\cdot ]}])}\,{t}\) is a context that merely adds a binding \((x := t)\) to the environment, together with the bindings contained in \(\mathcal {C}\). This intuition can be made formal thanks to the Krivine abstract machine (KAM), recalled in Fig. 3. As stated by the following result, the Push and Pop transitions of the KAM implement the computation of closure contexts.
Proposition 7
Proof
The first property is given by a direct induction on \(\mathcal {C}\), while the second is done by induction on the KAM reduction.
Actually, the KAM can even be seen as an implementation of (weak) LHR rather than of (weak) head reduction. Indeed, the substitution is in practice delayed until a variable appears in head position, i.e. when it is the hoc of a term. This phenomenon was formalized by Danos and Regnier [15], who proved that the sequence of substitutions performed by LHR and the sequence of closures substituted by the Grab rule are the same.
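The Push/Pop/Grab behaviour described above can be sketched as a small environment machine. The Python below is a hypothetical rendering of the KAM, using tuples for terms and dictionaries for environments; it is meant only to convey the intuition, not to reproduce Fig. 3.

```python
# Terms: ('var', x) | ('lam', x, t) | ('app', t, u).
# A closure is a pair (term, env); the stack holds argument closures.

def kam(term, fuel=1000):
    env, stack = {}, []
    while fuel > 0:
        fuel -= 1
        kind = term[0]
        if kind == 'app':                        # Push: stack the argument closure
            stack.append((term[2], env))
            term = term[1]
        elif kind == 'lam' and stack:            # Pop: bind the top closure
            arg = stack.pop()
            env = dict(env, **{term[1]: arg})
            term = term[2]
        elif kind == 'var' and term[1] in env:   # Grab: jump to the stored closure
            term, env = env[term[1]]
        else:
            return term, env, stack              # weak head normal form (or stuck)
    raise RuntimeError('out of fuel')

I = ('lam', 'x', ('var', 'x'))
K = ('lam', 'x', ('lam', 'y', ('var', 'x')))
D = ('lam', 'd', ('app', ('var', 'd'), ('var', 'd')))
omega = ('app', D, D)
# K I omega: by value this diverges; the by-name KAM discards omega.
t, env, stack = kam(('app', ('app', K, I), omega))
assert t == I and stack == []
```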
2.5 Classical Linear Head Reduction
We will only be interested in the reduction of commands in the remainder of this section. Our calculus is a direct elaboration of the aforementioned linear head calculus, and we dedicate this section to its thorough description.
Definition 8
We then adapt the \({\sigma }\)-equivalence to the classical setting, following Laurent [21].
Definition 9
Proposition 8
(Stability under \({\sigma }\)). Let t, u and v be terms such that \(t\mathrel {{\cong }_{\sigma }}u\mathrel {{\rightarrow }_{{\lambda }_{clh}}}v\); then there is w such that \(t\mathrel {{\rightarrow }_{{\lambda }_{clh}}}w\mathrel {{\cong }_{\sigma }}v\).
Danos and Regnier obtain a simulation theorem relating the KAM with LHR by defining substitution sequences [15]. This can be lifted to the \({\lambda \mu }\)-calculus: there is a simulation theorem relating the \({\mu }\)-KAM with the classical LHR. In order to state Theorem 2, which concludes this section, one first needs to introduce some preliminary definitions, motivated by the following remark.
Remark 3
We need to define properly the relation between the original and the substituted terms in the above rule.
Definition 10
(Onestep residual). In the above rule, we say that t is the residual of Open image in new window in the source term.
It turns out that this definition can be extended to a reduction of arbitrary length thanks to the following lemma.
Proposition 9
Proof
By induction on the reduction. The key point is that all along the reduction, all terms on the right of an application node are subterms of the original term, up to some variable renaming. The original subterm can then be traced back by jumping up into the term being substituted at each step.
Definition 11
(Substitution sequence). Given two terms t and \(t_0\) s.t. \(t_0 \mathrel {{\rightarrow }_{{\lambda }_{clh}}^{*}} t\), we define the substitution sequence of t w.r.t. \(t_0\) as the (possibly infinite) sequence \({\mathfrak {S}}_{t_0}(t)\) of subterms of \(t_0\) defined as follows, where \({\alpha }\) is a fresh stack variable.

If \({[\alpha ] t} \mathrel {{\not \rightarrow }_{{\lambda }_{clh}}}\) then \({{\mathfrak {S}}_{t_0}(t)}\mathrel {{:}{:}=}{\emptyset }\).

If \({[\alpha ] t} \equiv {[\alpha ] {{L}_{1}}[{\mathcal {C}}[\lambda {{x}}.\,{{L}_{2}}[x]]\ r]} \mathrel {{\rightarrow }_{{\lambda }_{clh}}} {[\alpha ] {t'}}\) then \({{\mathfrak {S}}_{t_0}(t)}\mathrel {{:}{:}=}{r_0\ {:}{:}\ {\mathfrak {S}}_{t_0}(t')}\) where \(r_0\) is the residual of r in \(t_0\).
We finally set \({\mathfrak {S}(t)}\mathrel {{:}{:}=}{{\mathfrak {S}}_{t}(t)}\).
The \({\mu }\)-KAM naturally features a similar behaviour w.r.t. residuals.
Proposition 10
If \(\langle (t, \cdot )\mid \varepsilon \rangle \mathrel {{\rightarrow }^{*}}\langle (t_0, \sigma )\mid \pi \rangle \) then \(t_0\) is a subterm of t.
Proof
By a straightforward induction over the reduction path.
This proposition can (and actually needs to) be generalized to any source process whose stacks and closures contain only subterms of t. This leads to the definition of a similar notion of substitution sequence for the KAM.
Definition 12

If \(p\not \rightarrow \) then \({\mathfrak {K}(p)}\mathrel {{:}{:}=}{\emptyset }\).

If \(p\equiv \langle (x, \sigma )\mid \pi \rangle \rightarrow \langle (t, \tau )\mid \pi \rangle \) then \({\mathfrak {K}(p)}\mathrel {{:}{:}=}{t\ {:}{:}\ \mathfrak {K}(\langle (t, \tau )\mid \pi \rangle )}\).

Otherwise if \(p\rightarrow q\) then \({\mathfrak {K}(p)}\mathrel {{:}{:}=}{\mathfrak {K}(q)}\).
Finally, the KAM substitution sequence of any term t is defined as \({\mathfrak {K}(t)}\mathrel {{:}{:}=}{\mathfrak {K}(\langle (t, \cdot )\mid \varepsilon \rangle )}\).
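The definition above can be sketched by instrumenting a toy (intuitionistic) KAM to log the term of the closure fetched at each variable step; the tuple encoding below is a hypothetical illustration, not the \({\mu }\)-KAM itself.

```python
# Terms: ('var', x) | ('lam', x, t) | ('app', t, u).
# The sequence collects, in order, the subterms substituted when a
# variable is dereferenced (the second clause of the definition).

def kam_subst_seq(term, fuel=1000):
    env, stack, seq = {}, [], []
    while fuel > 0:
        fuel -= 1
        kind = term[0]
        if kind == 'app':
            stack.append((term[2], env))
            term = term[1]
        elif kind == 'lam' and stack:
            arg = stack.pop()
            env = dict(env, **{term[1]: arg})
            term = term[2]
        elif kind == 'var' and term[1] in env:
            seq.append(env[term[1]][0])   # record the substituted subterm
            term, env = env[term[1]]
        else:
            return seq                    # machine stopped: sequence ends
    raise RuntimeError('out of fuel')

I = ('lam', 'x', ('var', 'x'))
J = ('lam', 'y', ('var', 'y'))
# (\x. x) (\y. y): exactly one substitution, of the argument \y. y.
assert kam_subst_seq(('app', I, J)) == [J]
```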
By the previous lemma, \(\mathfrak {K}(t)\) is a sequence of subterms of t. We can therefore formally relate it to \(\mathfrak {S}(t)\).
Proposition 11
Let t be a term. Then \(\mathfrak {K}(t)\) is a prefix of \(\mathfrak {S}(t)\).
Proof

either \(\langle (x, \sigma + (x := (r_0, \tau )))\mid \pi \rangle \) for some \({\sigma }\), \({\tau }\) and \({\pi }\)

or a blocked state of the KAM for all rules
The second case can occur if there are too many \({\lambda }\)-abstractions in the left contexts of the above reduction rule, or if there is a free stack variable appearing in a command part of the left contexts. In this case \(\mathfrak {K}(t) = \emptyset \), which is indeed a prefix of \(\mathfrak {S}(t)\).
Otherwise, one has Open image in new window and Open image in new window . It is therefore sufficient to show that the property holds for the tails of these two sequences.
But now, we can conclude, because Open image in new window and \(\mathfrak {K}(\langle (r_0, \tau )\mid \pi \rangle )\) (resp. \(\mathfrak {S}(t_r)\) and \({\mathfrak {S}}_{t}(t_r)\)) are the same sequence up to a renaming of the bound variables coming from Open image in new window which is common to both kinds of reduction. Thus \(\mathfrak {K}(\langle (r_0, \tau )\mid \pi \rangle )\) is a prefix of \({\mathfrak {S}}_{t}(t_r)\) and we are done.
The following theorem is a direct corollary of Proposition 11.
Theorem 2
Let \(c_1\mathrel {{\rightarrow }_{{\lambda }_{clh}}}c_2\) where \({c_1}\mathrel {:=}{{[\alpha ] {{L}_{1}}[{\mathcal {C}}[\lambda {{x}}.\,{{L}_{2}}[x]]\ t]}}\), then the substitution sequence of process \(c_1\) is either empty or of the form \(t {:}{:} \ell \) where \({\ell }\) is the substitution sequence of process \(c_2\).
Proposition 8 and Theorem 2 validate our calculus as a sound classical extension of LHR.
3 Towards CallbyNeed
Our journey from LHR to call-by-need now follows three steps: first restricting LHR to a weak reduction, then imposing a value restriction, and finally adding an amount of sharing.
3.1 Weak Linear Head Reduction
The LHR given in Sect. 2.3 is a strong reduction: it reduces under abstractions. We now adapt the \({\lambda }_{lh}\)-calculus to the weak case. It is easy to give a weak version of the historical LHR, which inherits the same defects as its strong counterpart.
Definition 13
(Historical wlh-reduction). We say that t weak-linear-head reduces to r, written \(t \rightarrow _{wlh} r\), iff \(t \rightarrow _{lh} r\) and t does not have any head \(\lambda \).
On the other hand, the \({\lambda }_{lh}\) reduction can be prevented from reducing under abstractions by restricting the evaluation contexts inside which it can be triggered. This requires some care though. Indeed, the contexts may contain \({\lambda }\)-abstractions, provided they have been compensated by as many previous applications. That is, those binders must pertain to a prime redex, as in \((\lambda {{{z}_{{{1}}}}}\,{{{{\ldots }}}}\,{{{z}_{{{n}}}}}\,{{x}}.\,{E^w}[x])\ {r}_{{{1}}}\ {{\ldots }}\ {r}_{{{n}}}\ u\). Plain closure contexts are not expressive enough to capture this situation.
To solve this issue, we extend the \({\lambda }\)-calculus in a way inspired both by techniques used for studying reductions, residuals and developments in the \({\lambda }\)-calculus [11], and by \({\lambda }\)-let-calculi [8] or explicit substitutions [1], in particular the structural \({\lambda }\)-calculus [4]. We are indeed going to recognize when a prime redex has been created by marking it explicitly. Yet, contrary to standard \({\lambda }\)-calculus rewriting theory, we will mark lambdas which are not necessarily actually involved in \({\beta }\)-redexes, and contrary to \({\lambda }\)-let-calculi or the structural \({\lambda }\)-calculus, we will preserve the underlying structure of the \({\lambda }\)-term by only marking abstractions rather than creating let-bindings, making them transparent for all rules. We shall discuss the significance of this design choice in Sect. 3.5.
Definition 14
We have to update the definition of closure contexts to fit into this presentation. It is actually enough to restrict all abstractions appearing inside a closure context to marked abstractions.
Definition 15
In general, arbitrary marked terms do not make sense, because marked abstractions may not have a matching application. This is why we define a notion of well-formed marked terms.
Definition 16
(Well-formed marked terms). A marked term t is well-formed whenever, for any decomposition \({t}\equiv {E[\ell {x}.\,u]}\) where E is an arbitrary context, E can be further decomposed as \({E}\equiv {{E}_{{{0}}}[\mathcal {C}\ r]}\) where \({E}_{{{0}}}\) is an arbitrary context, \(\mathcal {C}\) a marked closure context and r a marked term.
In the rest of the paper, we will only work with well-formed marked terms, even when this remains implicit: all our constructions and reductions preserve well-formedness, as in the following definition:
Definition 17
Proposition 12
(Stability of \(\lambda _{wlh}\) by \(\sigma \)). Let t, u and v be terms such that \(t \cong _\sigma u \rightarrow _{\lambda _{wlh}} v\); then there is w such that \(t \rightarrow _{\lambda _{wlh}} w \cong _\sigma v\).
This property is proved similarly to Proposition 6. We can now prove that \(\lambda _{wlh}\) and the historical wlh-reduction coincide:
Theorem 3
\(t \rightarrow _{\lambda _{wlh}} r\) iff \(t \rightarrow _{wlh} r\).
Proof
They correspond, since not having a head lambda is exactly equivalent to having every subcontext-initial abstraction marked.
3.2 Call-by-“Value” Linear Head Reduction
In order to obtain a call-by-value LHR, we restrict the contexts that trigger substitutions to react only in front of a value. In addition, the up-to-closure paradigm used so far incites us to consider values up to closure, defined as \(W {:}{:}= \mathcal {C}[V]\) where \(V {:}{:}= \lambda {x}.\,{t}\) stands for values.
Going from usual call-by-name to usual call-by-value is then simply a matter of adding a context forcing values. Likewise, we just add a context forcing up-to values. This construction is made in a systematic way, according to the standard call-by-value encoding.
Definition 18
The call-by-value weak linear head reduction is obtained straightforwardly.
Definition 19
It is easy to check that the reduction has not been deeply modified: the difference lies in the careful choice of contexts. Stability by \({\sigma }\) is proved as in Proposition 6.
Proposition 13
(Stability of \(\lambda _{\mathtt {wlv}}\) by \(\sigma \)). Let t, u and v be terms such that \(t \cong _\sigma u \rightarrow _{\lambda _{\mathtt {wlv}}} v\); then there is w such that \(t \rightarrow _{\lambda _{\mathtt {wlv}}} w \cong _\sigma v\).
In the first transition, the reduction occurs in the argument required by x, returning a value (up to closure) that will then be substituted.
3.3 Closure Sharing
This issue can be solved in an elegant way, akin to the Assoc rule of the Ariola–Felleisen calculus. This is achieved by extruding the closure of the value at the instant it is substituted. There is no need to refine contexts further; everything is already in place. We obtain the calculus below:
Definition 20
3.4 \(\lambda _\mathtt {wls}\) is a Call-by-Need Calculus
Definition 21
Theorem 4
For any wellformed marked terms t and r, if \(t\mathrel {{\rightarrow }_{{\lambda }_{\mathtt {cfr}}}^{*}}r\) then \([t]\mathrel {{\rightarrow }_{{\beta }_{cf}}^{*}}[r]\) where the length of the second reduction is the number of uses of the second rule in the first reduction. Conversely, for any unmarked terms t and r s.t. \(t\mathrel {{\rightarrow }_{{\beta }_{cf}}}r\) there exist markings \(t'\) and \(r'\) of t and r where \(t'\mathrel {{\rightarrow }_{{\lambda }_{\mathtt {cfr}}}^{*}}r'\) using exactly once the second rule.
Proof
It is sufficient to observe that wellformedness in the marked calculus is equivalent to the existence of a decomposition into inner and outer contexts that are balanced in the unmarked calculus.
The difference in the order of closure plugging may seem irrelevant in Chang and Felleisen's framework: because they use non-linear destructive substitutions, both orders are possible, and an ad hoc choice was made there. On the contrary, our design – strongly guided by logic – directly led us to a plugging order compatible with linear substitution.
3.5 Comparison with Other Call-by-Need Calculi
The CF-calculus is a bit peculiar among works on call-by-need, so it is worth also comparing our approach to more standard calculi such as the AF-calculus. Moreover, a third variant of call-by-need calculus has recently been defined by considering the linear substitution \({\lambda }\)-calculus [2] (LS-calculus), and it turns out to be very close to our presentation. This section is devoted to comparing \({\lambda }_{\mathtt {wls}}\) with those calculi.
The major source of difference between the three calculi lies in the handling and encoding of term bindings.

the AF-calculus uses a microscopic reduction (simple, small steps) and relies on rewriting rules to build up flat binding contexts;

the LS-calculus uses a macroscopic reduction (“at distance”, relying on a very elaborate and structured context) and requires reduction rules to be transparent w.r.t. binding contexts;

\({\lambda }_{\mathtt {wls}}\) does the same but uses a refined version of explicit substitutions, embodied by closure contexts.
We now explain to what extent these calculi are essentially the same, except for the technology used to implement binding contexts. For both the AF-calculus and the LS-calculus, it would be natural to switch to a calculus featuring let-binders, or equivalently explicit substitutions [1]. Yet, we will describe them in the usual \({\lambda }\)-calculus, for uniformity with the rest of the paper and because marked \({\lambda }\)-abstractions are not needed in this case.
 1.
weak reduction constrains evaluation contexts to be applicative contexts up to closures: \(E {:}{:}= [\cdot ] \mid E\, t \mid {(\lambda {x}.\,{E})}\,{t}\);
 2.
restriction to value (up to closure) substitutions, which creates new, call-by-value evaluation contexts: \(E {:}{:}= \dots \mid {(\lambda {x}.\,{E[x]})}\,{E}\);
 3.
sharing of closures, introducing the rule for the commutation of closure contexts, which happens to be, with the simplified contexts, the usual (Assoc) rule: \({(\lambda {x}.\,{E[x]})}\,{{(\lambda {y}.\,{A})}\,{t}} \rightarrow {(\lambda {y}.\,{{(\lambda {x}.\,{E[x]})}\,{A}})}\,{t} \quad ({\textsc {Assoc}})\)
Proposition 14
The resulting calculus is precisely the AF-calculus.
As one can see, the first rule is just the distant by-name \({\beta }\)-rule. The second rule corresponds to the dereferencing rule of \({\lambda }_{\mathtt {wls}}\) without a closure context around the \({\lambda }\)-abstraction, because the distant \({\beta }\)-rule builds up explicit substitutions by putting every abstraction in front of its corresponding argument.
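To make the (Assoc) commutation concrete, here is a worked instance on a small term of our own choosing, with \(E = [\cdot ]\ I\) and answer \(A = \lambda z.\, z\):

\[
(\lambda x.\, x\ I)\ ((\lambda y.\, \lambda z.\, z)\ t)
\;\rightarrow_{\textsc{Assoc}}\;
(\lambda y.\, (\lambda x.\, x\ I)\ (\lambda z.\, z))\ t
\]

After the commutation, the inner closure binds \(x\) directly to the answer \(\lambda z.\, z\), so the dereferencing rule can fire on the needed occurrence of \(x\), while the binding of \(y\) to \(t\) remains shared and unevaluated.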
Nonetheless, we advocate that \({\lambda }_{\mathtt {wls}}\) is more fine-grained, and that there are cases where this matters. Indeed, Proposition 7 shows that closure contexts are faithful reifications of KAM environments. This is not the case for flat \(\mathcal {L}\) contexts, which may represent several KAM environments. This mismatch can indeed be observed in the presence of side effects that are sensitive to the structure of environments, most notably forcing [23], but probably linear effects as well. As we precisely want to extend call-by-need with effects, we claim that closure contexts should be preferred over flat contexts in this setting.
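The operational intuition behind this discussion of sharing and environments can be sketched, outside any of the calculi above, with memoized thunks in an eager host language. This is only an illustration of the by-need discipline (evaluate a needed binding once, then reuse the result), not an implementation of \({\lambda }_{\mathtt {wls}}\); all names below are ours.

```python
class Thunk:
    """A memoized suspension: the computation runs at most once (call-by-need)."""

    def __init__(self, compute):
        self.compute = compute
        self.forced = False
        self.value = None

    def force(self):
        if not self.forced:        # dereference: evaluate on first need...
            self.value = self.compute()
            self.forced = True     # ...then keep the resulting value shared
        return self.value


evaluations = 0

def expensive():
    # Stands for an arbitrary costly argument; the counter observes sharing.
    global evaluations
    evaluations += 1
    return 21

# "let x = expensive() in x + x": by need, expensive runs once, not twice.
x = Thunk(expensive)
result = x.force() + x.force()
print(result, evaluations)  # 42 1
```

Forcing roughly corresponds to dereferencing a needed variable inside its closure context; the memoized value plays the role of the shared answer.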
4 Classical By-Need
To extend our classical LHR calculus to a fully-fledged call-by-need calculus, we follow the same three-step path that led us from LHR to call-by-need. We will not give the full details of the three steps, though; instead, we only give the final calculus.
The most delicate point is the introduction of weak reduction. In a classical setting, the actual applicative context of a variable may be strictly larger than it seems: in commands of the form \([\alpha ] t\), the variable \({\alpha }\) may be bound to a stack featuring supplementary applications. This means that we need to take into account supernumerary abstractions at the beginning of commands. Yet the marking procedure allows us to remember directly which abstractions are actually paired with a corresponding application.
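For instance, with the standard structural \(\mu \)-rule of the \(\lambda \mu \)-calculus (the term is our own example), an application is pushed onto the stack bound to \(\alpha \) and reappears inside the command:

\[
(\mu \alpha .\, [\alpha ]\, \lambda x.\, t)\ u
\;\rightarrow\;
\mu \alpha .\, [\alpha ]\, ((\lambda x.\, t)\ u)
\]

Before the step, \(\lambda x\) occurs unapplied at the head of the command; once the stack of \(\alpha \) is taken into account, it is paired with the argument \(u\). A weak by-need reduction must therefore look through command boundaries to detect such pairs, which is exactly what the marking procedure records.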
We present in this section two variants of a classical by-need calculus: one effectively taking into account supernumerary arguments as described above, and a less smart variant that performs classical substitution upfront, without caring about abstraction-application balancing at command boundaries. The advantage of the latter over the former is that it can easily be linked to a previous classical by-need calculus [10].
4.1 Classical By-Need with Classical Closure Contexts
To implement the mechanism described above, we simply need to allow our contexts to go through \({\mu }\)-binders. This leads to the mutual definition of classical by-value contexts \({E}^{v}\) and closure stack fragments \(K^v\), where closure contexts are updated as well to handle classical binders.
\( \begin{array}{lcl} \mathcal {C} &{} \mathrel {{:}{:}=} &{} {[\cdot ]}\mid {{\mathcal {C}}_{1}[\ell {x}.\,{\mathcal {C}}_{2}]\ t}\mid {{\mathcal {C}}_{1}[\mu \alpha . {K^v}[[\alpha ] {\mathcal {C}}_{2}]]}\\ {E}^{v} &{} \mathrel {{:}{:}=} &{} {[\cdot ]}\mid {{{E}^{v}}\ t}\mid {\ell {x}.\,{{E}^{v}}{} }\mid {\mathcal {C}[\ell {x}.\,{{E}_{1}^{v}}[x]]\ {{E}_{2}^{v}}{} }\mid {\mu \alpha . {K^v}[[\alpha ] {{E}^{v}}]}\\ K^v &{} \mathrel {{:}{:}=} &{} {[\cdot ]}\mid {[\alpha ] {E^v}[\mu \beta . {K^v}]}\\ \end{array} \)
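As a structural sanity check, the three mutually defined grammars can be transcribed as mutually recursive datatypes, one constructor per production. The following Python sketch is ours; constructor and field names are assumptions, not the paper's notation.

```python
from dataclasses import dataclass
from typing import Union

# One constructor per production of the grammars above.
# Terms are left abstract (strings), since only the context shapes matter here.

@dataclass
class Hole:                # [.]  (shared base case of all three grammars)
    pass

@dataclass
class CApp:                # C1[l x. C2] t
    c1: "Ctx"
    x: str
    c2: "Ctx"
    t: str

@dataclass
class CMu:                 # C1[mu a. Kv[[a] C2]]
    c1: "Ctx"
    alpha: str
    k: "KCtx"
    c2: "Ctx"

@dataclass
class EApp:                # Ev t
    e: "ECtx"
    t: str

@dataclass
class ELam:                # l x. Ev
    x: str
    e: "ECtx"

@dataclass
class EDeref:              # C[l x. E1v[x]] E2v
    c: "Ctx"
    x: str
    e1: "ECtx"
    e2: "ECtx"

@dataclass
class EMu:                 # mu a. Kv[[a] Ev]
    alpha: str
    k: "KCtx"
    e: "ECtx"

@dataclass
class KSplice:             # [a] Ev[mu b. Kv]
    alpha: str
    e: "ECtx"
    beta: str
    k: "KCtx"

Ctx = Union[Hole, CApp, CMu]
ECtx = Union[Hole, EApp, ELam, EDeref, EMu]
KCtx = Union[Hole, KSplice]

# The context mu a. Kv[[a] Ev] with Kv = [.] and Ev = [.]:
example = EMu(alpha="a", k=Hole(), e=Hole())
print(example.alpha)  # a
```

The mutual recursion of \(\mathcal {C}\), \({E}^{v}\) and \(K^v\) shows up as the cross-references between `Ctx`, `ECtx` and `KCtx`; in particular `KSplice` is the only way to extend a closure stack fragment, mirroring the single non-trivial production of \(K^v\).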
Definition 22
The issue with this calculus is that delayed classical substitution is a rather novel phenomenon that, as far as we know, does not resemble anything from the literature. It is, in particular, difficult to compare with previous attempts at a call-by-need variant of the \({\lambda \mu }\)-calculus.
4.2 Classical By-Need with Intuitionistic Closure Contexts
We describe here a modification of the above calculus whose laziness has been watered down: instead of delaying classical substitutions, it performs them as soon as possible. The main difference with the smart calculus is that closure contexts remain intuitionistic and do not allow going under \({\mu }\)-binders. The corresponding contexts are inductively defined as follows.
\( \begin{array}{lcl} \mathcal {C} &{} \mathrel {{:}{:}=} &{} {[\cdot ]}\mid {{\mathcal {C}}_{1}[\ell {x}.\,{\mathcal {C}}_{2}]\ t}\\ {E}^{v} &{} \mathrel {{:}{:}=} &{} {[\cdot ]}\mid {{{E}^{v}}\ t}\mid {\ell {x}.\,{{E}^{v}}{} }\mid {\mathcal {C}[\ell {x}.\,{{E}_{1}^{v}}[x]]\ {{E}_{2}^{v}}{} }\\ K^v &{} \mathrel {{:}{:}=} &{} {[\cdot ]}\mid {[\alpha ] {E^v}[\mu \beta . {K^v}]}\\ \end{array} \)
Definition 23
4.3 Comparison with an Existing Classical Call-by-Need Calculus
To better understand the calculi of the previous section, we now compare them with another classical by-need calculus [10], referred to as AHS, which is obtained from a calculus derived from Curien and Herbelin’s duality of computation. As a consequence, AHS features plain call-by-value \({\beta }\)-reduction rather than a linear, non-destructive deref rule à la Ariola–Felleisen:
\( \begin{array}{l} (\lambda {{x}}.\,t)\ v~\mathrel {\rightarrow }~t\lbrace {{x\mathrel {:=}v}}\rbrace ~\text {with}~v~\mathrel {:=}~x\mid \lambda {{x}}.\,t\\ \end{array} \)
Comparing our calculi precisely with AHS is tricky, because AHS is built on destructive substitution, which moreover is plain \(\beta _v\). The reason for such a presentation of AHS is to be found in its sequent calculus origin [10]. As we did for the comparison with the Chang–Felleisen calculus, we consider a variant of AHS with a deref rule à la Ariola–Felleisen, described in sequent style in the last section of [7], for which we prove that the \(\mathtt {cwls}\)-reduction is sound and complete. We conjecture that the same result holds for AHS but do not yet have a proof of this fact.
The AHS’ Modified Calculus. We now consider a slightly modified version of the previous calculus from [10]. The AHS’-calculus consists of the AHS-calculus in which the \({\beta }\)-reduction has been replaced by a deref rule à la Ariola–Felleisen (where variables are not values) and the notion of evaluation context has been adapted accordingly. This calculus was first described, in sequent style, in [7].
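To contrast the two substitution disciplines on a common example (the term \((\lambda x.\, x\ x)\ I\) is our choice, not taken from [7, 10]): the destructive \(\beta _v\) of AHS discards the binding entirely, while a deref rule à la Ariola–Felleisen substitutes only the needed occurrence and keeps the binding for the remaining ones.

\[
(\lambda x.\, x\ x)\ I \;\rightarrow_{\beta_v}\; I\ I
\qquad\text{versus}\qquad
(\lambda x.\, x\ x)\ I \;\rightarrow_{\mathrm{deref}}\; (\lambda x.\, I\ x)\ I
\]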
Definition 24
Theorem 5
For any command c, there is an infinite standard reduction in the AHS’-calculus starting from c iff there is an infinite reduction starting from c in the \(\mathtt {cwls'}\)-calculus.
Proof

The structural rules (S), which consist of the Lift and Assoc rules, together with the rule \({(\lambda {{x}}.\,\mu \alpha . [\beta ] t)\ n}\mathrel {\rightarrow }{\mu \alpha . [\beta ] (\lambda {{x}}.\,t)\ n}\).

The performing rule (P), which is just the dereferencing rule.

The classical rules (C), which are the three remaining rules.
Transforming reductions in the \(\mathtt {cwls'}\)-calculus into AHS’ ones is straightforward. First, given a closure stack fragment \(K_v\), one can see that AHS’ normalizes it into a delimited C context in the following way. For each splice of \(K_v\) of the form \([\alpha ] {{E_v}}[\mu \beta . {[\cdot ]}]\), the \(E_v\) context is simplified by a series of applications of the (S) rules. Depending on the form of \(K_v\), either the reduction stops (if there is no remaining splice) or it performs a certain number of (C) rules, after which the normalization procedure applies recursively. A dereferencing cannot occur at this point: while there are remaining splices, the current needed context cannot contain variables, as the splices all have a \({\mu }\)-binder in needed position. Note that the resulting normalized context is still a closure stack fragment, up to unmarking. By a simple size argument, this normalization procedure must terminate, so we consider it transparent for the simulation.
The second reduction rule, corresponding to stack substitutions, is handled directly by the normalization procedure described above.
We now turn to the simulation of the AHS’ reduction by the \(\mathtt {cwls'}\)-calculus. First, the standard reduction contexts are a degenerate case of \(K_v\) contexts, with one splice and flattened closure contexts, which allows rules to be transferred easily from the source to the target. We match each class of reductions (S), (P) and (C) to a given behaviour in the target calculus.

The (S) rules are transparent for the \(\mathtt {cwls'}\)-calculus, because they are natively handled by closure contexts, so an (S)-reduction does not give rise to a \({\lambda }_{clh}\)-reduction.

A group of (C) rules can be matched by an arbitrary number of reductions, including none, depending on how the corresponding stack variable is used.

The (P) rule is, conversely, matched by exactly one rule of the \(\mathtt {cwls'}\) reduction.
The trick is to use the fact that \((C + S)\) is normalizing, as we already did in the previous case. Moreover, such reductions do not change the possibility of performing a dereferencing in the corresponding \(\mathtt {cwls'}\) term. So we actually consider groups of reductions \({(C + S)}^{*}, P\) in the source calculus. It is always possible to decompose a sequence of AHS’ reductions in this way thanks to the normalization of \((C + S)\). It is then easy to see that the S part has no effect, that each C reduction is matched by a finite number of context reductions in \({\lambda }_{clh}\), and that the final (P) corresponds to exactly one dereferencing reduction in \({\lambda }_{clh}\).
5 Conclusion
Nevertheless, the early history of continuations is a sharp reminder that original ideas are rarely born in full generality, and that their communication is not always a simple or straightforward task.
John C. Reynolds [28]

Our approach to linear head reduction (using closure contexts and, to deal with weak reduction, marked terms) is validated by the fact that it transfers smoothly to the \(\lambda \mu \)-calculus, a new result of this paper. Additionally, one can see the use of closure contexts, in particular when dealing with marked terms, as a generalization of LSC in which the structure of substitutions is encoded in closure contexts, a tree-like structure, rather than in substitution contexts, which are linearized: we stay closer to the original structure of the term, which is important when dealing with computational effects.

The call-by-need \(\lambda \)- and \(\lambda \mu \)-calculi that we obtain in this paper are related to previously known versions of call-by-need calculi. In the case of the \(\lambda \)-calculus, they are related to the Ariola–Felleisen and Chang–Felleisen calculi. In the case of the \(\lambda \mu \)-calculus, two classical by-need calculi are actually proposed, one related to a variant of the Ariola–Herbelin–Saurin calculus, and the other calling for further investigation.

We developed a methodology different from that of Ariola et al. in that we stayed within the framework of natural deduction and carefully analyzed the systematic synthesis of call-by-need from LHR, resulting in the ability to lift this synthesis to the \(\lambda \mu \)-calculus.

Compared with LSC, our approach can be viewed as less specified and somewhat more general, which prevents us from obtaining results as precise as those of LSC, for instance regarding complexity analysis. On the other hand, maintaining the structure of \(\lambda \)-terms suggests interesting perspectives for handling various computational effects.
More Computational Effects. The robustness of our approach is encouraging for testing other computational effects, where maintaining the term structure may be even more crucial than for control effects.
Classical By-Need. The comparisons with other proposals for classical variants of call-by-need reduction [7, 10] remain to be established more precisely.
We conjecture that our calculus is sound and complete not only with respect to AHS’ but also with respect to the AHS-calculus [10]. Moreover, the classical by-need calculus with classical closure contexts remains difficult to connect to already known calculi.
Towards Full Laziness. Our design, guided by \(\sigma \)-equivalence, can do more than call-by-need and can already encompass a weak form of full laziness. Future work will pursue this direction.
Reduction Strategies Versus Calculi. The original motivation of our work was to relate LHR and call-by-need formally. As a result, instead of focusing on proper calculi, we concentrated our attention on a specific evaluation strategy, mainly because macroscopic and weak reductions are more naturally expressed with strategies. However, the \(\lambda _{lh}\)-calculus can very easily be studied as a calculus, and we shall develop it as such in the future.
Footnotes
 1.
As is usual, \({\varDelta }\) stands for \(\lambda {{x}}.\,x\ x\), I for \(\lambda {{y}}.\,y\), and we write \(\mathrel {{\rightarrow }_{\mathrm {cbn}}}\) (resp. \(\mathrel {{\rightarrow }_{\mathrm {cbv}}}\)) for the reductions associated with call-by-name (resp. call-by-value). The redex involved in a reduction is emphasized by showing it in a grey box. We implicitly work up to \({\alpha }\)-conversion and use Barendregt’s convention so as not to capture variables unwillingly.
 2.
For instance, the Maraist–Odersky–Wadler calculus differs in its 1998 journal version from the calculus introduced by Ariola and Felleisen, but in no essential way, since both calculi share the same standard reductions.
Acknowledgements
The authors would like to thank Beniamino Accattoli, Thibaut Balabonski, Olivier Danvy and Delia Kesner for discussions regarding this work as well as anonymous reviewers.
References
 1. Abadi, M., Cardelli, L., Curien, P.-L., Lévy, J.-J.: Explicit substitutions. J. Funct. Program. 1(4), 375–416 (1991)
 2. Accattoli, B., Barenbaum, P., Mazza, D.: Distilling abstract machines. In: Jeuring, J., Chakravarty, M. (eds.) ICFP 2014, pp. 363–376. ACM (2014)
 3. Accattoli, B., Bonelli, E., Kesner, D., Lombardi, C.: A nonstandard standardization theorem. In: POPL 2014, San Diego, CA, USA, pp. 659–670 (2014)
 4. Accattoli, B., Kesner, D.: The structural \(\lambda \)-calculus. In: Dawar, A., Veith, H. (eds.) CSL 2010. LNCS, vol. 6247, pp. 381–395. Springer, Heidelberg (2010)
 5. Accattoli, B., Kesner, D.: The permutative \(\lambda \)-calculus. In: Bjørner, N., Voronkov, A. (eds.) LPAR-18, 2012. LNCS, vol. 7180, pp. 23–36. Springer, Heidelberg (2012)
 6. Accattoli, B., Dal Lago, U.: Beta reduction is invariant, indeed. In: CSL-LICS 2014, Vienna, Austria, July 2014
 7. Ariola, Z.M., Downen, P., Herbelin, H., Nakata, K., Saurin, A.: Classical call-by-need sequent calculi: the unity of semantic artifacts. In: Schrijvers, T., Thiemann, P. (eds.) FLOPS 2012. LNCS, vol. 7294, pp. 32–46. Springer, Heidelberg (2012)
 8. Ariola, Z.M., Felleisen, M.: The call-by-need lambda calculus. J. Funct. Program. 7(3), 265–301 (1997)
 9. Ariola, Z.M., Felleisen, M., Maraist, J., Odersky, M., Wadler, P.: The call-by-need lambda calculus. In: POPL 1995, pp. 233–246. ACM Press (1995)
 10. Ariola, Z.M., Herbelin, H., Saurin, A.: Classical call-by-need and duality. In: Ong, L. (ed.) Typed Lambda Calculi and Applications. LNCS, vol. 6690, pp. 27–44. Springer, Heidelberg (2011)
 11. Barendregt, H.: The Lambda Calculus, Its Syntax and Semantics. Studies in Logic and the Foundations of Mathematics. Elsevier, Amsterdam (1984)
 12. Chang, S., Felleisen, M.: The call-by-need lambda calculus, revisited. In: Seidl, H. (ed.) Programming Languages and Systems. LNCS, vol. 7211, pp. 128–147. Springer, Heidelberg (2012)
 13. Curien, P.-L., Herbelin, H.: The duality of computation. In: Odersky, M., Wadler, P. (eds.) ICFP 2000, pp. 233–243. ACM Press (2000)
 14. Danos, V., Herbelin, H., Regnier, L.: Game semantics & abstract machines. In: LICS 1996, pp. 394–405. IEEE Press (1996)
 15. Danos, V., Regnier, L.: Head linear reduction. Unpublished (2004)
 16. Danvy, O., Millikin, K., Munk, J., Zerny, I.: Defunctionalized interpreters for call-by-need evaluation. In: Blume, M., Kobayashi, N., Vidal, G. (eds.) FLOPS 2010. LNCS, vol. 6009, pp. 240–256. Springer, Heidelberg (2010)
 17. Girard, J.-Y.: Une extension de l'interprétation de Gödel à l'analyse, et son application à l'élimination des coupures dans l'analyse et la théorie des types. In: Fenstad, J.E. (ed.) Proceedings of the Second Scandinavian Logic Symposium. Studies in Logic and the Foundations of Mathematics, vol. 63, pp. 63–92. Elsevier (1971)
 18. Girard, J.-Y.: Linear logic. Theoret. Comput. Sci. 50, 1–102 (1987)
 19. Hyland, M., Ong, L.: On full abstraction for PCF. Inf. Comput. 163(2), 285–408 (2000)
 20. Krivine, J.-L.: A call-by-name lambda-calculus machine. Higher-Order Symb. Comput. 20(3), 199–207 (2007)
 21. Laurent, O.: A study of polarization in logic. Ph.D. thesis, Université de la Méditerranée – Aix-Marseille II, March 2002
 22. Maraist, J., Odersky, M., Wadler, P.: The call-by-need lambda-calculus. J. Funct. Program. 8(3), 275–317 (1998)
 23. Miquel, A.: Forcing as a program transformation. In: LICS 2011, pp. 197–206. IEEE Computer Society (2011)
 24. Parigot, M.: \(\lambda \mu \)-calculus: an algorithmic interpretation of classical natural deduction. In: Voronkov, A. (ed.) LPAR 1992. LNCS, vol. 624, pp. 190–201. Springer, Heidelberg (1992)
 25. Regnier, L.: Lambda-calcul et réseaux. Ph.D. thesis, Université Paris VII (1992)
 26. Regnier, L.: Une équivalence sur les \(\lambda \)-termes. Theoret. Comput. Sci. 126, 281–292 (1994)
 27. Reynolds, J.C.: Towards a theory of type structure. In: Robinet, B. (ed.) Programming Symposium. LNCS, vol. 19, pp. 408–423. Springer, Heidelberg (1974)
 28. Reynolds, J.C.: The discoveries of continuations. Lisp Symb. Comput. 6(3–4), 233–248 (1993)
 29. Streicher, T., Reus, B.: Classical logic, continuation semantics and abstract machines. J. Funct. Program. 8(6), 543–572 (1998)
 30. Wadsworth, C.P.: Semantics and pragmatics of the lambda-calculus. Ph.D. thesis, Programming Research Group, Oxford University (1971)