FabULous Interoperability for ML and a Linear Language

Instead of a monolithic programming language trying to cover all features of interest, some programming systems are designed by combining together simpler languages that cooperate to cover the same feature space. This can improve usability by making each part simpler than the whole, but there is a risk of abstraction leaks from one language to another that would break expectations of the users familiar with only one or some of the involved languages. We propose a formal specification for what it means for a given language in a multi-language system to be usable without leaks: it should embed into the multi-language in a fully abstract way, that is, its contextual equivalence should be unchanged in the larger system. To demonstrate our proposed design principle and formal specification criterion, we design a multi-language programming system that combines an ML-like statically typed functional language and another language with linear types and linear state. Our goal is to cover a good part of the expressiveness of languages that mix functional programming and linear state (ownership), at only a fraction of the complexity. We prove that the embedding of ML into the multi-language system is fully abstract: functional programmers should not fear abstraction leaks. We show examples of combined programs demonstrating in-place memory updates and safe resource handling, and an implementation extending OCaml with our linear language.


INTRODUCTION
Feature accretion is a common trend among mature but actively evolving programming languages, including C++, Haskell, Java, OCaml, Python, and Scala. Each new feature strives for generality and expressiveness, and may provide a large usability improvement to users of the particular problem domain or programming style it was designed to empower (e.g., XML documents, asynchronous communication, staged evaluation). But feature creep in general-purpose languages may also make it harder for programmers to master the language as a whole, degrade the user experience (e.g., leading to more cryptic error messages), require additional work on the part of tooling providers, and lead to fragility in language implementations.
A natural response to increased language complexity is to define subsets of the language designed for a better programming experience. For instance, a subset can be easier to teach (e.g., "Core" ML, Haskell 98 as opposed to GHC Haskell, Scala mastery levels); it can facilitate static analysis or decrease the risk of programming errors, while remaining sufficiently expressive for the target users' needs (e.g., MISRA C, Spark/Ada); it can enforce a common style within a company; or it can be designed to encourage a transition away from ill-behaved language features (e.g., strict JavaScript).
Once a subset has been selected, it may be the case that users write whole programs purely in the subset (possibly using tooling to enforce that property), but programs will commonly rely on other libraries that are not themselves implemented in the same subset of the language. If users stay in the subset while using these libraries, they will only interact with the part of the library whose interface is expressible in the subset. But does the behavior of the library respect the expectations of users who only know the subset? When calling a function from within the subset breaks subset expectations, it is a sign of leaky abstraction.
How should we design languages with useful subsets that manage complexity and avoid abstraction leaks?
The linear type system ensures that the file handle is properly closed: removing the close handle call would give a type error. On the other hand, only the parts concerned with the resource-handling logic need to be written in the red linear language; the user can keep all general-purpose logic (here, how to accumulate lines and what to do with them at the end) in the more convenient general-purpose blue language, and call this function from a blue-language program. Fine-grained boundaries allow users to rely on each language's strengths and to use the advanced features only when necessary. In this example, the file-handle API specifies that the call to line, which reads a line, returns the data at type ![String]. The latter represents how λU values of type String can be put into a lump type to be passed to the linear world, where they are treated as opaque blackboxes that must be passed back to the ML world for consumption.
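The handle-threading discipline can be approximated in Rust, whose move semantics makes each use of the handle explicit. All names here (`Handle`, `line`, `close`, `read_all`, the in-memory backing) are illustrative stand-ins for the paper's file-handle API, not its actual syntax:

```rust
// A toy "file handle": every operation consumes the handle and either
// returns it (so it must be re-threaded) or closes it for good.
struct Handle {
    lines: Vec<String>,
    pos: usize,
}

impl Handle {
    fn open(lines: Vec<String>) -> Handle {
        Handle { lines, pos: 0 }
    }

    // Read one line; the handle is given back so the caller keeps
    // exactly one copy of it, as a linear type system would enforce.
    fn line(mut self) -> (Option<String>, Handle) {
        let l = self.lines.get(self.pos).cloned();
        self.pos += 1;
        (l, self)
    }

    // Closing consumes the handle: no further use is possible.
    fn close(self) {}
}

// Accumulate all lines, threading the handle linearly and closing it
// at the end (forgetting the close would leave the handle unconsumed).
fn read_all(mut h: Handle) -> Vec<String> {
    let mut acc = Vec::new();
    loop {
        let (l, h2) = h.line();
        match l {
            Some(s) => {
                acc.push(s);
                h = h2;
            }
            None => {
                h2.close();
                return acc;
            }
        }
    }
}
```

Rust's discipline is affine rather than linear (a handle may be silently dropped), so this sketch enforces single use but not the "must close" guarantee that λL provides.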
For other examples, such as in-place list manipulation or transient operations on a persistent data structure, we will need a deeper form of interoperability where the linear world creates, dissects, or manipulates λU values. To enable this, our multi-language supports translation of types from one language to the other, using a type compatibility relation σ ∼ σ between λU types σ and λL types σ.
We claim the following contributions: (1) We propose a formal specification of what it means for advanced language features to be introduced in a (multi-)language system without introducing a class of abstraction leaks that break equational reasoning. This specification captures a useful usability property, and we hope it will help us and others design more usable programming languages, much like the formal notion of principal types served to better understand and design type inference systems. (2) We design a simple linear language, λL, that supports linear state (Section 2). This simple design for linear state is a contribution of its own. A nice property of the language (shared by some other linear languages) is that the code has both an imperative interpretation, with in-place memory update, which provides resource guarantees, and a functional interpretation, which aids program reasoning. The imperative and functional interpretations have different resource usage, but the same input/output behavior. (3) We present a multi-language programming system λUL combining a core ML language, λU (U for Unrestricted, as opposed to Linear), with λL, and prove that the embedding of the ML language λU in λUL is fully abstract (Section 3). Moreover, the multi-language is designed to ensure that our full abstraction result is stable under extension of the embedded ML language λU. (4) We define a logical relation and prove parametricity for λUL. The logical relation illustrates, semantically, why one can reason functionally about programs in λUL despite the presence of state and strong updates (Section 4). (5) We evaluate the resulting language design by providing examples of hybrid λUL programs that exhibit programming patterns inaccessible to ML alone, such as safe in-place updates and typestate-like static protocol enforcement (Section 5), and describe our prototype implementation (Section 6).

THE λ U AND λ L LANGUAGES
The unrestricted language λU is a run-of-the-mill idealized ML language with functions, pairs, sums, iso-recursive types, and polymorphism. It is presented in its explicitly typed form; we will not discuss type inference in this work. The full syntax is described in Figure 1, and the typing rules in Figure 2. The dynamic semantics is completely standard. Having binary sums, binary products, and iso-recursive types lets us express algebraic datatypes in the usual way.
Fig. 3 (Linear Language: Surface Syntax) includes typing contexts Γ ::= • | Γ, x : σ. The novelty lies in the linear language λL, which we present in several steps. As is common in λ-calculi with references, the small-step operational semantics is given for a language that is not exactly the surface language in which programs are written, because memory allocation returns locations that are not in the grammar of surface terms. Reductions are defined on configurations, a local store paired with a term in a slightly larger internal language. We give two type systems: a type system on surface terms, which does not mention locations and stores (this is the one a programmer needs to know), and a type system on configurations, which contains enough static information to reason about the dynamics of our language and prove subject reduction. Again, this follows the standard structure of syntactic soundness proofs for languages with a mutable store.
We present the surface language and type system in Section 2.1, except for the language fragment manipulating the linear store, which is presented in Section 2.2. Finally, the internal terms, their typing, and their reduction semantics are presented in Section 2.3.

The Core of λ L
Figure 3 presents the surface syntax of our linear language λ L .For the syntactic categories of types σ , and expressions e, the last line contains the constructions related to the linear store that we only discuss in Section 2.2.
In technical terms, our linear type system is exactly propositional intuitionistic linear logic, extended with iso-recursive types. For simplicity, and because we did not need them, our current system does not have polymorphism or additive/lazy pairs σ1 & σ2. Additive pairs would be a trivial addition, but polymorphism would require more work when we define the multi-language semantics in Section 3. In less technical terms, our type system can enforce that values be used linearly, meaning that they cannot be duplicated or erased; they have to be deconstructed exactly once. Only some types have this linearity restriction; others allow duplication and sharing of values at will. We can think of linear values as resources to be spent wisely; for any linear value somewhere in a term, there can be only one way to access this value, so we can interpret the language as enforcing an ownership discipline where whoever points to a linear value owns it.
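This ownership reading has a runnable analogue in Rust's move semantics (an affine discipline: Rust permits dropping a value, whereas λL requires it to be consumed). The `consume` and `single_use` names below are illustrative, not from the paper:

```rust
// A non-Copy value must be consumed by its unique owner; using it a
// second time after a move is a compile-time error, so only the legal
// single use can be written down.
fn consume(s: String) -> usize {
    s.len() // the value is deconstructed (here, measured) exactly once
}

fn single_use() -> usize {
    let s = String::from("resource");
    let n = consume(s); // ownership of `s` moves into `consume`
    // A second `consume(s)` here would be rejected: `s` was moved.
    n
}
```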
The types of linear values are the type of linear pairs σ1 ⊗ σ2, of linear disjoint unions σ1 ⊕ σ2, of linear functions σ1 ⊸ σ2, and of the linear unit type 1. For example, a linear function must be called exactly once, and its result must in turn be consumed; such linear functions can safely capture linear resources. The expression-formers at these types use the same syntax as the unrestricted language λU, with the exception of linear pair deconstruction let ⟨v1, v2⟩ = e1 in e2, which names both members of the deconstructed pair at once. A linear pair type with projections would only ever allow observing one of the two members; this would correspond to the additive/lazy pairs σ1 & σ2, where only one of the two members is ever computed.
The types of non-linear, duplicable values are the types of the form !σ, the exponential modality of linear logic. If e has type σ, the term share e has type !σ. Values of this type are not uniquely owned; they can be shared at will. If the term e has duplicable type !σ, then the term copy e has type σ: this creates a local copy of the value that is uniquely owned by its receiver and must be consumed linearly.
This resource-usage discipline is enforced by the surface typing rules of λL, presented in Figure 4. They are exactly the standard (two-sided) logical rules of intuitionistic linear logic, annotated with program terms. The non-duplicability of linear values is enforced by the way contexts are merged by the inference rules: if e1 is type-checked in the context Γ1 and e2 in Γ2, then the linear pair ⟨e1, e2⟩ is only valid in the combined context Γ1 ⊎ Γ2. The (⊎) operation is partial; this combined context is defined only if the variables shared by Γ1 and Γ2 are duplicable, that is, their type is of the form !σ. In other words, a variable at a non-duplicable type in Γ1 ⊎ Γ2 cannot possibly appear in both Γ1 and Γ2: it must appear exactly once. A good way to think of the linear judgment Γ ⊢ e : σ is that the evaluation of e consumes the linear variables of Γ; it is thus natural that the strict pair ⟨e1, e2⟩ would need separate sets of resources Γ1 and Γ2, as it evaluates both members to return a value. On the other hand, case elimination case e of x1.e1 | x2.e2 reuses the same context Γ in both branches e1 and e2: only one will be evaluated, so they do not compete for resources.
The variable rule does not expect a context of the form Γ, x : σ but of the form !Γ, x : σ. Here !Γ is a notation for the pointwise application of the (!) connective to all the types in Γ, i.e., all types in !Γ are of the form !σ. This means that the variable rule can only be used when all variables in the context are duplicable, except maybe the variable that is being used. A context of the form Γ, x : σ would allow us to forget some variables present in the context; in our judgment Γ ⊢ e : σ, all non-duplicable variables in Γ must appear (once) in e.
The form !Γ is also used in the typing rule for share e: a term can only be made duplicable if it does not depend on linear resources from the context. Otherwise, duplicating the shared value could break the unique-ownership discipline on these linear resources.
Finally, the linear isomorphism notation for fold and unfold in Figure 4 defines them as primitive functions, at the given linear function type, in the empty context: using them does not consume resources. This notation also means that, operationally, these two operations shall be inverses of each other. The rules for the linear store types Box 1 σ and Box 0 are described in Section 2.2.

Linear Memory in λ L
The surface typing rules for the linear store are given at the end of Figure 4. The linear type Box 1 σ represents a memory location that holds a value of type σ. The type Box 0 represents a location that has been allocated, but does not currently hold a value. The primitive operations to act on these types are given as linear isomorphisms: new allocates, turning a unit value into an empty location; conversely, free reclaims an empty location. Putting a value into the location and taking it out are expressed by box and unbox, which convert between a pair of an empty location and a value, of type (Box 0) ⊗ σ, and a full location, of type Box 1 σ.
For example, one can write a program that takes a full reference and a value, and swaps the value with the content of the reference. The programming style following from this presentation of linear memory is functional, or applicative, rather than imperative. Rather than insisting on the mutability of references (which is allowed by the linear discipline), we may think of the type Box 1 σ as representing the indirection through the heap that is implicit in functional programs. In a sense, we are not writing imperative programs with a mutable store, but rather making explicit the allocations and dereferences happening in a higher-level purely functional language. In this view, empty cells allow memory reuse.
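Such a swap can be sketched in Rust, with ownership standing in for linearity. The names `Empty`, `Full`, `unbox`, `boxed`, and `swap` are illustrative, not the paper's syntax, and unlike λL this sketch allocates a fresh heap cell instead of reusing the old one:

```rust
// Empty and full locations, mimicking Box 0 and Box 1 σ.
struct Empty;            // Box 0: allocated, but holds no value
struct Full<T>(Box<T>);  // Box 1 σ: holds a value of type T

// unbox: full location ↦ empty location paired with its content.
fn unbox<T>(f: Full<T>) -> (Empty, T) {
    (Empty, *f.0)
}

// boxed: empty location paired with a value ↦ full location.
// (λL reuses the cell in place; this sketch reallocates.)
fn boxed<T>((_e, v): (Empty, T)) -> Full<T> {
    Full(Box::new(v))
}

// swap: take a full location and a new value; return the location
// now holding the new value, together with the old content.
fn swap<T>(f: Full<T>, v: T) -> (Full<T>, T) {
    let (e, old) = unbox(f);
    (boxed((e, v)), old)
}
```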
This view that Box 1 σ represents indirection through the memory suggests we can encode lists of values of type σ by a type LinList σ in which the box is placed inside the sum; this placement mirrors the fact that the empty list is represented as an immediate value in functional languages. From this type definition, one can write an in-place reverse function on lists of σ. The definition uses a fixpoint operator fix that can be defined, in the standard way, using the iso-recursive type µα. α ⊸ σ ⊸ σ of the strict fixpoint combinator on functions σ ⊸ σ. Our linear language λL is a formal language that is not terribly convenient to program directly. We will not present a full surface language in this work, but one could easily define syntactic sugar to write the same function as follows:

rev_into Nil acc = acc
rev_into (Cons ⟨x, xs⟩ @ l) acc = rev_into xs (Cons ⟨x, acc⟩ @ l)

One can read this function as the usual functional rev_append function on lists, annotated with memory-reuse information: if we assume we are the unique owner of the input list and won't need it anymore, we can reuse the memory of its cons cells (given in this example the name l) to store the reversed list. On the other hand, if you read box and unbox as imperative operations, this code expresses the usual imperative pointer-reversal algorithm.
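The memory-reuse reading can be made concrete in Rust, where owning the input list lets each cons box be moved, unchanged, into the output spine. The `List`, `rev_into`, and helper names below are illustrative, not the paper's:

```rust
// A linked list where every Cons cell owns one heap box, mirroring
// LinList σ with the box inside the sum (Nil is an immediate value).
enum List<T> {
    Nil,
    Cons(Box<(T, List<T>)>),
}

// In-place reversal in the style of rev_into: each Cons box is moved
// into the accumulator, so the spine is reused, not reallocated.
fn rev_into<T>(list: List<T>, acc: List<T>) -> List<T> {
    match list {
        List::Nil => acc,
        List::Cons(mut cell) => {
            // Detach the tail, store the accumulator in the same cell...
            let rest = std::mem::replace(&mut cell.1, acc);
            // ...and thread the reused cell as the new accumulator.
            rev_into(rest, List::Cons(cell))
        }
    }
}

// Conversion helpers, for building and observing lists.
fn from_vec<T>(v: Vec<T>) -> List<T> {
    v.into_iter()
        .rev()
        .fold(List::Nil, |acc, x| List::Cons(Box::new((x, acc))))
}

fn to_vec<T>(mut l: List<T>) -> Vec<T> {
    let mut out = Vec::new();
    while let List::Cons(cell) = l {
        let (x, rest) = *cell;
        out.push(x);
        l = rest;
    }
    out
}
```

Read functionally, `rev_into` is plain `rev_append`; read imperatively, `std::mem::replace` on the owned cell is exactly the pointer-reversal step.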
This double view of linear state occurs in other programming systems with linear state. It was recently emphasized in O'Connor, Chen, Rizkallah, Amani, Lim, Murray, Nagashima, Sewell, and Klein (2016), where the functional point of view is seen as easing formal verification, while the imperative view is used as a compilation technique to produce efficient C code from linear programs.

Internal λ L Syntax and Typing
To give a dynamic semantics for λL and prove it sound, we need to extend the language with explicit stores and store locations. Indeed, the allocating term new should reduce to a "fresh location" allocated in some store s, and neither is part of the surface-language syntax. The corresponding internal typing judgment is more complex, but note that users do not need to know about it to reason about the correctness of surface programs. The internal typing is essential for the soundness proof, but also useful for defining the multi-language semantics in Section 3.
The syntax of internal terms and the internal type system are presented in Figure 5. Reduction will be defined on configurations (s | e), which are pairs of a store s and a term e. Stores s map locations ℓ to either nothing (the location is empty), written [ℓ ↦ •], or a value paired with its own local store, written [ℓ ↦ (s | v)]. Having local stores in this way, instead of a single global store as is typical in formalizations of ML, directly expresses the idea of "memory ownership" in the syntax: a term e "owns" the locations that appear in it, and a configuration (s | e) is only well-typed if the domain of s is exactly those locations. Each store slot, in turn, may contain a value and the local store owned by the value; in particular, passing a full location of type Box 1 σ transfers ownership of the location, but also of the store fragment captured by the value.
Our internal typing judgment Ψ; Γ ⊢ s | e : σ checks configurations, not just terms, and relies not only on a typing context Γ for variables but also on a store typing Ψ, which maps the locations ℓ of the configuration to typing assumptions of two forms: (•; • ⊢ ℓ : Box 0) indicates that ℓ must be empty in the configuration, and (Γ; Ψ ⊢ ℓ : Box 1 σ) indicates that ℓ is full, and that the value it contains owns a local store of store typing Ψ and the resources in Γ.
Figure 5 also defines the union of stores and the union of store typings on disjoint locations. Just as linear variables must occur exactly once in a term, locations have linear types and thus occur exactly once in a term. Our typing judgment uses disjoint store typings Ψ1 ⊎ Ψ2 to enforce this linearity. Similarly, leaf rules such as the variable, unit, and location rules require that both the store typing and the store be empty, which enforces that all locations are used in the term.
Locations are always linear, never duplicable. To allow sharing terms that contain locations, the internal language uses the internal construction share(s : Ψ). e, which captures a local store s : Ψ. This notation is a binding construct: the locations in s are bound by this shared term, and not visible outside it. In particular, the typing rule for share(s : Ψ). e checks the term e in the store s, but the construct itself is only valid paired with an empty store, under the empty store typing. When new copies of a shared term are made, the local store is copied as well: this is necessary to guarantee that locations remain linear, and for the correctness of linear state update.
The typing rule for functions λ(x : σ). e lets function bodies use an arbitrary store typing Ψ. This would be unsound if our functions were duplicable, but it is a natural and expressive choice for linear, one-shot functions. To make a function duplicable, one can share it at type !(σ ⊸ σ′), whose values are of the canonical form share(s : Ψ). λ(x : σ). e. It is the sharing construct, not the function itself, that closes over the local store.
With the macro-expansion share e def= share(∅ : •). e, any term e of the surface language (Figure 3) can be seen as a term of the internal language (Figure 5). In particular, we can prove that the surface and internal typing judgments coincide on surface terms. The following technical results are used in the soundness proof for the language, Theorem 2 (Subject reduction for λL).

Lemma 3 (Inversion for λL). In any complete derivation of Ψ; Γ ⊢ s | v : σ, either v is a variable x, or the derivation starts with the introduction rule for σ.
For example, if we have Ψ; Γ ⊢ s | v : !σ, then we know that v is either a variable or of the form share(s′ : Ψ′). v′ for some v′, but also that s = ∅, Ψ = •, and that Γ is of the form !Γ′ for some Γ′. The latter is immediate if v is share(s′ : Ψ′). v′, and also holds if v is a variable.

Reduction of Internal Terms
Figure 6 gives a small-step operational semantics for the internal terms of λL. We separate the head reductions from the reductions in context. The head reductions for the linear types of the core language do not involve the store and are standard. For the store primitives of Figure 4 acting on Box 0 and Box 1 σ, we reuse the isomorphism notation to emphasize that the related primitives are inverses of each other.
There are several reduction rules for copy (share e), one for each type connective. These reductions perform a deep copy of the value, stopping only on ground data, function values, and shared sub-terms: when copying a !!σ into a !σ, there is no need for a deep copy. When it encounters a location, copy (share ℓ) reduces to a new allocation. If the location contains a value, the new location is filled with a copy of this value.
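As a loose Rust analogy (with `Clone` standing in for copy and `Box` for a store location; an illustration, not the paper's semantics), cloning a boxed structure allocates a fresh cell at every location it traverses:

```rust
// Deep copy of a two-level boxed structure: the clone has the same
// contents but owns storage disjoint from the original at both levels,
// like copy allocating fresh locations as it traverses the value.
fn deep_copy(v: &Box<(i32, Box<i32>)>) -> Box<(i32, Box<i32>)> {
    v.clone() // Clone for Box recursively clones the contents
}
```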
The copying rule for functions performs a copy of the local store s of the shared function. The locations in s are bound on the left-hand side of the reduction, and free on the right-hand side: this reduction step allocates fresh locations, and the store typing of the term changes from • on the left to Ψ on the right. The fact that reduction changes the store typing is not unique to this rule; it is also the case when directly copying locations. In ML languages with references, the store only grows during reduction. That is not the case for our linear store: our reduction may either allocate new locations or free existing ones.
We define a grammar of (deterministic) reduction contexts, which contain exactly one hole in evaluation position. However, we only define linear contexts K that do not share their hole: we need a specific treatment of the share(s : Ψ). e reduction. Its subterm e is reduced in the local store s, but may create or free locations in that store; so we need to update the local store and its store typing during the reduction.
Theorem 1 (Progress). If Ψ; Γ ⊢ s | e : σ, then either e is a value v or there exists (s′ | e′) such that (s | e) → (s′ | e′).

MULTI-LANGUAGE SEMANTICS
To formally define our multi-language semantics, we create a combined language λUL which lets us compose term fragments from both λU and λL together, and we give an operational semantics to this combined language. Interoperability is enabled by specifying how to transport values across the language boundaries.
Multi-language systems in the wild are not defined in this way: both languages are given a semantics, by interpretation or compilation, in terms of a shared lower-level language (C, assembly, the JVM or CLR bytecode, or Racket's core forms), and the two languages are combined at that level. Our formal multi-language description (Figure 7, Multi-language: Lump and Boundaries) can be seen as a model of such combinations, which gives a specification of the expected observable behavior of this language combination. Another difference from multi-languages in the wild is our use of very fine-grained language boundaries: a term written in one language can have its subterms written in the other, provided the type-checking rules allow it. Most multi-language systems, typically using Foreign Function Interfaces, offer coarser-grained composition at the level of compilation units. Fine-grained composition of existing languages, as done in the Eco project (Barrett, Bolz, Diekmann, and Tratt 2016), is difficult because of semantic mismatches. In Section 5 (Hybrid program examples) we demonstrate that fine-grained composition is a rewarding language design, enabling new programming patterns.

Lump Type and Language Boundaries
The core components of the multi-language semantics are shown in Figure 7; the communication of values from one language to the other will be described in the next section. The multi-language λUL has two distinct syntactic categories of types, values, and expressions: those that come from λU and those that come from λL. Contexts, on the other hand, are mixed, and can have variables of both sorts. For a mixed context Γ, the notation !Γ only applies (!) to its linear variables.
The typing rules of λU and λL are imported into our multi-language system, operating on these two separate categories of programs. They need to be extended to handle mixed contexts Γ instead of their original contexts. In the linear case, the rules look exactly the same. In the ML case, the typing rules implicitly duplicate all the variables in the context. It would be unsound to extend this to arbitrary linear variables, so they use a duplicable context !Γ.
To build interesting multi-language programs, we need a way to insert a fragment coming from one language into a term written in the other. This is done using language boundaries, two new term formers: LU(e), which injects an ML term into the syntactic category of linear terms, and UL(s : Ψ | e), which injects a linear configuration into the syntactic category of ML terms.
Of course, we need new typing rules for these term-level constructions, clarifying when it is valid to send a value from λU into λL and vice versa. It would be incorrect to allow sending any type from one language into the other (for instance, by adding the counterpart of our language boundaries in the syntax of types): values of linear types must be uniquely owned, so they cannot be sent to the ML side, as the ML type system cannot enforce unique ownership.
On the other hand, any ML value can safely be sent to the linear world. For closed types, we could provide a corresponding linear type (1 maps to !1, etc.), but an ML value may also be typed by an abstract type variable α, in which case we cannot know what the linear counterpart should be. Instead of trying to provide translations, we send any ML type σ to the lump type [σ], which embeds ML types into linear types. A lump is a blackbox, not a type translation: the linear language does not assume anything about the behavior of its values; the values of [σ] are of the form [v], where v : σ is an ML value that the linear world cannot use. More precisely, we only propagate the information that ML values are all duplicable by sending σ to ![σ].
The typing rules for language boundaries insert lumps when going from λU to λL, and remove them when going back from λL to λU. In particular, arbitrary linear types cannot occur at the boundary; they must be of the form ![σ].
Finally, boundaries have reduction rules: a term or configuration inside a boundary in reduction position is reduced until it becomes a value, and then a lump is added or removed depending on the boundary direction. Note that because the v in UL(s : Ψ | v) is at a duplicable type ![σ], we know by inversion that the store is empty.

Interoperability: Static Semantics
If the linear language could not interact with lumped values at all, our multi-language programs would be rather boring, as the only way for the linear extension to provide a value back to ML would be to have received it from λU and pass it back unchanged (as in the lump embedding of Matthews and Findler (2009)). To provide real interaction, we provide a way to extract values out of a lump ![σ], use them at some linear type σ, and put them back in before sending the result to λU.
The correspondence between intuitionistic types σ and linear types σ is specified by a heterogeneous compatibility relation σ ∼ σ defined in Figure 8 (Multi-language: Static Interoperability Semantics). The specification of this relation is that if σ ∼ σ holds, then the spaces of values of ![σ] and σ are isomorphic: we can convert back and forth between them. When this relation holds, the term-formers lump σ and σ unlump perform the conversion. (The position of the index σ emphasizes that the input e of lump σ e has type σ, while the output of σ unlump e has type σ.) For example, given a lumped ML function, we can unlump it to view it as a linear function. We can call it from the linear side, but have to pass it a duplicable argument, since an ML function may duplicate its argument. Conversely, we can convert a linear function into a lumped function type to pass it to the ML side, but it has to have a duplicable return type, since the ML side may freely share the return value.
Our lump σ and σ unlump primitives are only indexed by the linear type σ, because a compatible ML type σ can be uniquely recovered from it. Note that the converse property does not hold: for a given σ, there are many σ such that σ ∼ σ. For example, we have 1 ∼ !1 but also 1 ∼ !!1. This corresponds to the fact that the linear types are more fine-grained, and make distinctions (inner duplicability, dereference of full locations) that are erased in the ML world. The σ ∼ ![σ] case also allows you to (un)lump as deeply or as shallowly as you need. We could not systematically translate the complete type σ, as type variables cannot be translated and need to remain lumped; allowing lumps to "stop" the translation at arbitrary depth is a natural generalization.
The term LU(e) turns a term e : σ into one of lumped type ![σ], and we need to unlump it with some σ unlump, for a compatible σ ∼ σ, to interact with it on the linear side. It is common to combine both operations, and we provide syntactic sugar for this: σ LU(e). Similarly, UL σ (e) first lumps a linear term, then sends the result to the ML world.

Interoperability: Dynamic Semantics
We were careful to define the compatibility relation such that σ ∼ σ only holds when ![σ] and σ are isomorphic, in the sense that any value of one can be converted into a value of the other. Figure 9 (Multi-language: Dynamic Interoperability Semantics) defines the operational semantics of the lumping and unlumping operations precisely as realizing these isomorphisms. For concision, we specify the isomorphisms as relations, following the inductive structure of the compatibility judgment itself. We write (↔) when a rule can be read bidirectionally to convert in either direction (assuming the same direction holds of the premises), and (←) or (→) for rules that only describe how to convert values in one direction.
Theorem (Value conversion). If σ ∼ σ, then for any closed value v : σ there is a unique v : σ such that v → v, and conversely, for any closed value v : σ there is a unique v : σ such that v ← v.

Lemma 8 (Lump cancellation). The lump conversions lump σ and σ unlump cancel each other modulo βη.

Implementation consideration. In a realistic implementation of this multi-language system, we would expect the representation choices made for λU and λL to be such that, for some but not all compatible pairs σ ∼ σ, the types σ and σ actually have the exact same representation, making the conversion an efficient no-op. An implementation could even restrict the compatibility relation to accept only the pairs that can be implemented in this way. That is, it would reject some λUL programs, but the "graceful interoperability" result that is our essential contribution would still hold.

Full Abstraction from λ U into λ UL
We can now state and prove the major meta-theoretical result of this work: the proposed multi-language design extends the simple language λU in a way that provably has, in a certain sense, "no abstraction leaks". The proof of full abstraction is actually rather simple. It relies on the idea, already mentioned in Section 2.2 (Linear Memory in λL), that linear state can be seen either as imperatively mutated state, or as a purely functional feature that merely makes memory layout explicit. In the absence of aliasing, we can give a purely functional semantics to linear state operations, instead of the store-modifying semantics of Figure 6 (Linear Language: Operational Semantics); in fact, this semantics determines a translation from linear programs back into pure ML programs. Those ML programs will not have the same allocation behavior as the initial linear programs (in-place programs won't be in-place anymore), but they are observably equivalent in that they are equi-terminating and return the same outputs from the same inputs.
The definition of the functional translation of linear contexts, terms, and types is given in Figure 10. To simplify the translation of terms and the statement of Lemma 9, we assume that a global injective mapping is chosen from linear variables x and locations ℓ to ML variables, and from linear type variables α to ML type variables.

Stability. Given that our proof technique relies on translating λL back into λU, it is stable under extension of the language λU: any extension of λU that preserves the reduction behavior of closed programs, for example adding ML references, preserves the full-abstraction result. On the contrary, this technique is not stable under extension of λL, and in fact the result could become false if λL were extended with abstraction-breaking features. This explains why we prove that the embedding of λU into λUL is fully abstract, but do not prove any result on the embedding of λL into λUL: such a result would be immediately broken by considering a larger general-purpose language, for example adding ML references.
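As a concrete illustration of the two readings of linear state (a toy OCaml sketch of our own, not the formal translation of Figure 10): an imperative in-place list reversal that reuses mutable cells, and its purely functional counterpart. On the same input they produce the same output; only the allocation behavior differs.

```ocaml
(* Imperative view: a "linear" list as mutable cons cells, reversed in place. *)
type 'a cell = { mutable hd : 'a; mutable tl : 'a cells }
and 'a cells = Nil | Cell of 'a cell

let rev_in_place (l : 'a cells) : 'a cells =
  let rec go acc = function
    | Nil -> acc
    | Cell c ->
        let next = c.tl in
        c.tl <- acc;           (* reuse the cell: no new allocation *)
        go (Cell c) next
  in
  go Nil l

(* Functional view: the same algorithm, allocating instead of mutating. *)
let rev_pure (l : 'a list) : 'a list =
  let rec go acc = function
    | [] -> acc
    | x :: xs -> go (x :: acc) xs
  in
  go [] l

(* Conversions used to compare the two views. *)
let of_list l = List.fold_right (fun x k -> Cell { hd = x; tl = k }) l Nil
let rec to_list = function Nil -> [] | Cell c -> c.hd :: to_list c.tl
```

Because the imperative version consumes its (uniquely-owned) argument, no caller can observe the mutation, which is why the purely functional reading is observably equivalent.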

MULTI-LANGUAGE PARAMETRICITY
We discussed the design choice of manipulating lumps [σ] of any ML type, not just the type variables that motivate them. In the presence of polymorphism, this generalization is also an important design choice to preserve parametricity.
Let us define idσ(e) def= lumpσ(unlumpσ e), and consider a polymorphic term of the form Λα. UL(. . . id![α] . . .). The (un)lumping operations on a lumped type such as ![α] are just the identity: the lumped value is passed around unchanged, so id![α](v) will reduce to v. Now, if we instantiate this polymorphic term with an ML type σ, it will reduce to a term UL(. . . id![σ] . . .) whose unlumping operation is still on a lumped type, so it is still exactly the identity.
On the contrary, if we allowed lumps only on type variables, we would have to push the lump inside σ, and the (un)lumping operations would become more complex: if σ starts with an ML product type _ × _, it would be turned into a shared linear pair !(_ ⊗ _) by unlumping, and back into an ML pair by lumping. In general, idσ may perform deep η-expansions of lumped values. The fact that, after instantiation of the polymorphic term, we get a monomorphic term with a different (but η-equivalent) computational behavior would cause meta-theoretic difficulties; this is the approach that was adopted in previous work on multi-languages with polymorphism by Perconti and Ahmed (2014), and it made some of their proofs using logical-relations arguments substantially more complicated. In their logical relation, polymorphism is obtained by allowing each polymorphic variable to be replaced by two types related by an "admissible" relation R, and the notion of admissibility of this previous work had to force relations to be compatible with η-expansion, which complicates the proofs.
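The blackbox behavior of lumps can be modeled by a trivial boxing wrapper: converting to and from a lump never inspects the underlying value, so the round-trip idσ is the identity at whatever type it is instantiated. A toy OCaml sketch (all names ours, for illustration):

```ocaml
(* A lump is an opaque box around an arbitrary value; the conversions
   never look inside it. *)
type 'a lump = Lump of 'a

let lump (v : 'a) : 'a lump = Lump v
let unlump (Lump v : 'a lump) : 'a = v

(* The id operation from the text: unlump, then lump again. On a lumped
   type this is the identity, whatever 'a is instantiated to. *)
let id_lump (l : 'a lump) : 'a lump = lump (unlump l)
```

Instantiating `'a` with any concrete type leaves the behavior unchanged, which is the parametricity property the lump design preserves.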
In contrast, our handling of lump types as turning arbitrary types into blackboxes makes type instantiation obviously parametric. To formally demonstrate this aspect of our design, we develop a step-indexed logical relation (Figure 11) that proves that our multi-language satisfies a strong parametricity property that is not disrupted by the linear sublanguage or the cross-language boundaries.
The logical relation is a family of relations indexed by closed "relational types", which extend the grammars of σ and σ′ with a case for admissible relations R, used to enable parametric arguments. The step index j in the definitions decreases strictly whenever related values of a type σ or σ′ are defined in terms of a non-strictly-smaller type; this happens in the definition of the relation at recursive types, V⟦µα.σ⟧j. Because the language is non-terminating, our relation does not define an equivalence but an approximation: two expressions (e1, e2) are related in E⟦σ⟧j if e1 approximates e2, that is, if e1 reduces to a value in less than j steps, then e2 must reduce to a related value.
The definition of admissible relations Rel[σ1, σ2], used to define when λU values of polymorphic types are related (V⟦∀α.σ⟧), is completely standard, which demonstrates that our notion of boundaries preserves simple parametricity reasoning.
Although we have a stateful linear language, the logical relations for the linear types have more in common with those for a language with explicit closures; this is another consequence of the remark in Section 2.2 (Linear Memory in λL) that the language can also be interpreted using a functional semantics. The relations for closed λL values and expressions, V⟦σ′⟧ and E⟦σ′⟧, are indexed by a type but do not depend on a store typing: the related values may have different, non-empty store typings. This allows us to relate two programs that are equivalent but allocate different references in different ways. Furthermore, since all state is linear, we do not need additional machinery to relate stores: all the values in a store owned by a value are reflected in the value. For example, the relation for empty locations V⟦Box 0⟧ relates any two arbitrary (empty) locations, and the relation for non-empty locations V⟦Box 1 σ′⟧ relates possibly-distinct locations that contain related values.
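The two location cases can be summarized schematically as follows (our notation, not the paper's exact Figure 11 definition):

```latex
% Schematic: empty boxes relate any two locations; full boxes relate
% locations whose owned contents are related at the content type.
\begin{align*}
  V\llbracket \mathsf{Box}\ 0 \rrbracket_j
    &= \{\, (\ell_1, \ell_2) \mid \ell_1, \ell_2 \text{ empty locations} \,\}\\
  V\llbracket \mathsf{Box}\ 1\ \sigma \rrbracket_j
    &= \{\, (\ell_1, \ell_2) \mid \ell_i \mapsto v_i
        \text{ and } (v_1, v_2) \in V\llbracket \sigma \rrbracket_j \,\}
\end{align*}
```

Note that neither case mentions a global store typing: each location carries the store fragment it owns.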
Logical relations effectively translate global invariants of the system into properties of type connectives. For example, consider the reduction rule for lumped values in Figure 9: in this rule we implicitly assumed that a linear value of the shape [v] at type ![σ] occurs in an empty local store. The term share [v] desugars into share(∅ : •).[v], but it is not immediately obvious that this should always be the case, since it is possible to compute a value of type ![σ] by allocating references and using them. The intuitive reason why the store becomes empty when a value [v] is reached is that linear sub-terms e within v may only occur within a language boundary UL(s : Ψ | e): linear sub-terms have their own local store, so there are no globally visible linear locations for [v] to refer to. This global reasoning is elegantly expressed in a type-directed way in our logical relation by the definition of related values at lump type, which encodes the invariant that they always have an empty store.

We also prove that the logical relation is sound with respect to contextual equivalence. For this, we define contextual approximation relations Γ ⊢ e1 ≤ctx e2 : σ and Γ ⊢ (s1 | e1) ≤ctx (s2 | e2) : σ′, by asking that, under arbitrary contexts, e2 (resp. (s2 | e2)) terminates whenever e1 (resp. (s1 | e1)) does; see Appendix C for details.

HYBRID PROGRAM EXAMPLES

In-Place Transformations
1:18 • Gabriel Scherer, Max New, Nicholas Rioux, and Amal Ahmed

In Section 2.2 (Linear Memory in λL) we proposed a program for in-place reversal of linear lists, defined over the type LinList σ. We can also define a type of ML lists, List σ def= µα. 1 + σ × α. Note that ML lists are compatible with shared linear lists, in the sense that List σ is compatible with !(LinList ![σ]). This enables writing in-place list-manipulation functions in λL and exposing them to beginners at a λU type. This example is arguably silly, as the allocations that are avoided by doing an in-place traversal are paid when copying the shared list to obtain a uniquely-owned version. A better example of a list operation that can profitably be sent to the linear side is quicksort, whose code we give in Appendix A (Quicksort). An ML implementation allocates intermediary lists for each recursive call, while the surprisingly readable λL implementation only allocates for the first copy.
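The quicksort code itself is given in Appendix A; as a rough sketch of its purely functional reading (our own OCaml, not the paper's listing), each recursive call partitions the remaining list around a pivot:

```ocaml
(* Functional reading of the linear quicksort: each recursive call
   partitions the list around a pivot. The linear version performs the
   partition in place, reusing the input cells; this pure version
   allocates a fresh list at every call instead. *)
let rec quicksort (cmp : 'a -> 'a -> int) (l : 'a list) : 'a list =
  match l with
  | [] -> []
  | pivot :: rest ->
      let smaller, larger =
        List.partition (fun x -> cmp x pivot < 0) rest
      in
      quicksort cmp smaller @ (pivot :: quicksort cmp larger)
```

The intermediary `smaller`/`larger` lists are exactly the per-call allocations that the in-place λL version avoids.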

Typestate Protocols
Linear types can enforce proper allocation and deallocation of resources, and more generally any automata/typestate-like protocol on their usage, by encoding the state transitions as linear transformations. In the simple example of file-descriptor handling in the introduction, additional safety compared to ML programming can be obtained by exposing file-handling functions on the λU side with linear types. We assumed the following API for linear file handling, which enforces a correct usage protocol.

Another interesting example of a protocol for which linear types help is the use of transient versions of persistent data structures, as popularized by Clojure. An unrestricted type Set α may represent persistent sets as balanced trees, with logarithmic operations performing path-copying. A transient call returns a mutable version of the structure that supports efficient batch in-place updates, before a persistent call freezes this transient structure back into a persistent tree. To preserve a purely functional semantics, we must enforce that the intermediate transient value is uniquely owned. We can do this by using linear types for the transient API.
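The paper's actual API figures are not reproduced here. As an illustration only, a runtime approximation of such a transient protocol in plain OCaml might look as follows (all names are hypothetical; a linear type system would reject misuse statically, whereas this sketch can only detect it dynamically):

```ocaml
(* Persistent sets as sorted lists (toy representation); a transient
   carries a [valid] flag so that use-after-freeze raises at runtime. *)
type 'a set = Set of 'a list
type 'a transient = { mutable elems : 'a list; mutable valid : bool }

let transient (Set xs) = { elems = xs; valid = true }

(* Batch in-place insertion: mutates and returns the same transient. *)
let add t x =
  if not t.valid then invalid_arg "transient used after persistent";
  if not (List.mem x t.elems) then t.elems <- x :: t.elems;
  t

(* Freeze the transient back into a persistent set; this consumes it. *)
let persistent t =
  if not t.valid then invalid_arg "transient used after persistent";
  t.valid <- false;
  Set (List.sort compare t.elems)
```

With linear types, the `valid` flag and its runtime checks disappear: `add` and `persistent` simply consume the unique transient, so a stale use is a type error.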

IMPLEMENTATION
We have a prototype implementation for λ UL .In this implementation, instead of λ U we use the full OCaml language; we implemented the linear language λ L and wrote a compiler from λ L to (unsafe) OCaml code using unsafe in-place mutation.
Here is what in-place reversal looks like in the syntax supported by our prototype. The syntax for the boundaries is (%L ..) and (%U ..) in expressions and patterns, and (%%L ..) and (%%U ..) for sequences of toplevel declarations. The U parts accept the full grammar of the OCaml programming language (version 4.04.0), and use the OCaml implementation for type-checking and compilation. The L parts use our own parser and type-checker that enforces the linear discipline; in particular, we have not implemented type inference, so function parameters are fully specified, and U boundaries within L terms come with an annotation (:> !(llist 'a) in this example) indicating at which L type they should be unlumped.
Finally, (x, xs)@l is syntactic sugar for boxing and unboxing, available in both L expressions and patterns.In an expression, it is equivalent to box (l, (x, xs)).In a pattern, it unboxes the reference l and matches its contents with the pattern (x, xs).
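The prototype listing itself is not reproduced above. Based only on the syntax elements described in this section (the boundary markers, the :> annotation, and the @l sugar), a reversal in this style might read roughly as follows; this is our illustrative reconstruction, not the prototype's verbatim code, and constructor names, the llist type, and the exact boundary placement are guesses:

```
(%%L
  let rec reverse_onto (l : llist 'a) (acc : llist 'a) : llist 'a =
    match l with
    | Nil -> acc
    | (Cons (x, xs))@r ->                (* unbox cell r *)
        reverse_onto xs ((Cons (x, acc))@r)   (* reuse cell r in place *)

  let reverse (l : llist 'a) : llist 'a = reverse_onto l Nil
)

(* On the U side, an OCaml list crosses the boundary with an unlumping
   annotation (eliding the copy from the shared !(llist 'a) to a
   uniquely owned llist 'a): *)
let rev l = (%L reverse (%U l :> !(llist 'a)))
```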
As we previously pointed out, our proof technique for Theorem 3 (Full Abstraction) is stable under extension of the general-purpose language, as long as the extended language keeps reducing closed λU programs in the same way; we assume that this is the case for OCaml. This means that using OCaml as the general-purpose language does not endanger the full-abstraction result: we have formally established that λL does not break OCaml's abstractions. OCaml programmers can now use our prototype to safely add resource control or in-place updates to their programs.

CONCLUSION
In our proposed multi-language design, a simple linear type system mirroring the standard rules of intuitionistic linear logic can be equipped with linear state and usefully complement a general-purpose functional ML language, without breaking equational reasoning or parametricity, and without requiring a significantly more complex meta-theory.
Fine-grained language boundaries allow interesting programming patterns to emerge, and full abstraction provides a novel rigorous specification of what it means for a multi-language design to avoid abstraction leaks from advanced features into the general-purpose or beginner-friendly languages.

Related Work
Having a stack of usable, interoperable languages, extensions or dialects is at the forefront of the Racket approach to programming environments, in particular for teaching (Felleisen, Findler, Flatt, and Krishnamurthi 2004).
Our multi-language semantics builds on the seminal work by Matthews and Findler (2009), who gave a formal semantics of interoperability between a dynamically and a statically typed language. Others have followed the Matthews-Findler approach of designing multi-language systems with fine-grained boundaries: formalizing interoperability between a simply and a dependently typed language (Osera, Sjöberg, and Zdancewic 2012); between a functional language and a typed assembly language (Patterson, Perconti, Dimoulas, and Ahmed 2017); between an ML-like and an affinely typed language, where linearity is enforced at runtime on the ML side using stateful contracts (Tov and Pucella 2010); and between the source and target languages of compilation, to specify compiler correctness (Perconti and Ahmed 2014). However, all these papers address only the question of soundness of the multi-language; we propose a formal treatment of usability and of the absence of abstraction leaks.
The only work to establish that a language embeds into a multi-language in a fully abstract way is the work on fully abstract compilation by Ahmed and Blume (2011) and New, Bowman, and Ahmed (2016), who show that their compiler's source language embeds into their source-target multi-language in a fully abstract way. But the focus of those works was on fully abstract compilation, not on the usability of user-facing languages.
The Eco project (Barrett, Bolz, Diekmann, and Tratt 2016) is studying multi-language systems where user-exposed languages are combined in a very fine-grained way; it is closely related in that it studies the user experience in a multi-language system. The choice of an existing dynamic language creates delicate interoperability issues (conflicting variable scoping rules, etc.) as well as performance challenges. We propose a different approach: design new multi-languages from scratch with interoperability in mind, to avoid legacy obstacles.
We are not aware of existing systems exploiting the simple idea of using promotion to capture uniquely-owned state and dereliction to copy it; common formulations would rather perform copies at the contraction rule.
The general idea that linear types can permit the reuse of unused allocated cells is not new. In Wadler (1990), a system is proposed with both linear and non-linear types to attack precisely this problem. It is however more distant from standard linear logic and somewhat ad-hoc; for example, there is no way to permanently turn a uniquely-owned value into a shared value; it instead provides a local borrowing construction that comes with ad-hoc restrictions necessary for safety. (The inability to give up unique ownership, which is essential in our list-programming examples, also seems to be missing from Rust, where one would need to perform a costly traversal of the graph of the value to turn all pointers into Arc nodes.) The RAML project (Hoffmann, Aehlig, and Hofmann 2012) also combines linear logic and memory reuse: its destructive match operator implicitly reuses consumed cells in new allocations occurring within the match body. Multi-languages give us the option to explore more explicit, flexible representations of these low-level concerns, without imposing their complexity on all programmers.
A recent related work is the Cogent language (O'Connor, Chen, Rizkallah, Amani, Lim, Murray, Nagashima, Sewell, and Klein 2016), in which linear state is also viewed as both functional and imperative, the latter view enabling memory reuse. The language design is interestingly reversed: in Cogent, the linear layer is the simple language that everyone uses, and the non-linear language, used only when one really has to, is a complex but powerful language named C.
Our linear language λL is considerably simpler, and in several ways less expressive, than advanced programming languages based on linear logic (Tov and Pucella 2011), separation logic (Balabonski, Pottier, and Protzenko 2016), or fine-grained permissions (Garcia, Tanter, Wolff, and Aldrich 2014): it is not designed to stand on its own, but to serve as a useful sidekick to a functional language, allowing safer resource handling.
One major simplification of our design compared to more advanced linear or separation-logic-based languages is that we do not separate physical locations from the logical capability/permission to access them (e.g., as in Ahmed, Fluet, and Morrisett (2007)). This restricts expressiveness in well-understood ways (Fahndrich and DeLine 2002): shared values cannot point to linear values. Some existing (Morris 2016) and ongoing work on linear types focuses on whether it is possible to add them to a functional language without losing desirable usability properties, such as type inference or the genericity of polymorphic higher-order functions; this was also a significant part of the contribution of Alms (Tov and Pucella 2011). Our multi-language design side-steps this issue, as the general-purpose language remains unchanged. Language boundaries are more rigid than an ideal no-compromise language, as they force users to preserve the distinction between the general-purpose and the advanced features; but it is precisely this compromise that gives a design of reduced complexity.
Finally, on the side of the semantics, our system is related to LNL (Benton 1994), a calculus for linear logic that, in a sense, is itself built as a multi-language system where (non-duplicable) linear types and (duplicable) intuitionistic types interact through a boundary. It is not surprising that our design contains an instance of this adjunction: for any σ there is a unique σ′ such that σ is compatible with !σ′, and converting a σ value to this !σ′ and back is provably equivalent, by boundary cancellation, to just using share.

B PROOF OUTLINES

Theorem 1 (Progress). By induction on the typing derivation of e, using the induction hypotheses in the evaluation order corresponding to the structure of contexts K. If one induction hypothesis returns a reduction, we build a bigger reduction for the whole term. If all induction hypotheses return a value, the proof depends on whether the head term-former is an introduction/construction form or an elimination/destruction form. An introduction form whose subterms are values is a value. For elimination forms, we use Lemma 3 (Inversion principle for λL values) on the eliminated subterm (a value) to learn that it starts with an introduction form, and thus forms a head redex with the head elimination form, so we build a head reduction.
Lemma 5 (Non-store-escaping substitution principle). The proof, summarized below, proceeds by induction on the typing derivation of e.
Most cases need an additional case analysis on whether the substituted type σ′ is a duplicable type of the form !σ′, as this influences whether it may appear in zero or several subterms of e. (This is a price to pay for contraction and weakening happening in all rules for convenience, instead of being isolated in separate structural rules.) For example, in the variable case, e may be the variable x itself, in which case we know that Γ is empty and conclude immediately. But e may also be another variable y, if x is duplicable and has been dropped. In that case, we perform an inversion (Lemma 3) on the value premise to learn that Ψ is empty and Γ is duplicable, and can thus use Lemma 4 (Weakening of duplicable contexts).
In the pair case (e1, e2), if x is a linear variable it only occurs in one subterm, on which we apply our induction hypothesis. If x is duplicable, inversion on the value premises again tells us that Γ is duplicable. We know by assumption that Γ1 and Γ2 join into Γ; because Γ is duplicable, we can deduce from Lemma 1 (Context joining properties) that the joins of each Γi with Γ are also defined, which lets us apply the induction hypothesis on both subterms ei. To conclude, we need a computation that again comes from the duplicability of Γ.
The assumption x ∉ Ψ enforces that the resource x is consumed in the term e itself, not in one of the values [ℓ → (s | v)] in the store: otherwise x would appear in the store typing of this location in Ψ. It is used in the case where e : σ′ is a full location ℓ : Box 1 σ′. If x could appear in the value of ℓ in the store, we would have to substitute it in the store as well; in our substitution statement, only the term is modified. Here we know that this value is unused, so it has a duplicable type, and we can perform an inversion in the other cases.
Theorem 2 (Subject reduction for λL). The proof is by induction on the reduction derivation. The head-reduction rules involving substitutions rely on Lemma 5 (Non-store-escaping substitution principle); note that in each of them, for example (λ(x : σ′).e) e′, the substituted variable x is bound in the term e, and thus does not appear in the store s: the non-store-escaping hypothesis holds.
For the copy rule and the store operators, we build a valid derivation for the reduced configuration by inverting the typing derivation of the reducible configuration.
In the non-head-reduction cases, the share case is direct, and the context case K[e] uses Lemma 6 (Context decomposition) to obtain a typing derivation for e, and then the same lemma again to rebuild a derivation of the reduced term K[e′].

Lemma 7 (D. . .). By induction on the syntax of σ: the judgment . . .

Lemma 11 (Substitution of recursive hypotheses). By induction on the type-compatibility derivation. There are two leaf cases: the case of recursive hypotheses, which is immediate, and the case of lump σ compatible with ![σ]. In this latter case, notice that σ is a type of λU, so in particular it does not contain the variable β; and we assumed α ∉ σ, so . . .

Theorem 1 (Value conversion). Remark that in the statement of the theorem, when we quantify over all closed values v′ : σ′, we implicitly assume that in the general case these values live in the empty global store; otherwise we would have a configuration of the form Ψ; • ⊢ s | v′ : σ′. This is valid because all types σ′ in the image of the type-compatibility relation are duplicable types of the form !σ′, so by Lemma 3 (Inversion principle for λL values) we know that v′ is in fact of the form share(s : Ψ). e, living in the empty store. The two sides of the result are proved simultaneously by induction on the compatibility derivation, using inversion to reason on the shapes of v and v′. Note that the inductive cases remain on closed values: the only variable-binding constructions, λ-abstractions, do not use the recursion hypothesis.
In the recursive case µα.σ compatible with !(µα.σ′), to use the induction hypothesis on the folded values we need to know that the unfolded types σ[µα.σ/α] and σ′[µα.σ′/α] are compatible. This is exactly Lemma 11 (Substitution of recursive hypotheses), using the hypothesis µα.σ compatible with !(µα.σ′) itself.

Lemma 8 (Lump cancellation). By induction on σ, and then by parallel induction on the derivations of type compatibility and value compatibility. The parallel cases are symmetric by definition; only the function case needs to be checked. A simple computation, using the induction hypothesis on the smaller types, shows that composing the two function translations gives an η-expansion, plus the βη-steps from the induction hypotheses.
ρ ::= . . . (all cases for σ, but using ρ, ρ′ recursively) | (R, σ1, σ2)
ρ′ ::= . . . (all cases for σ′, but using ρ, ρ′ recursively)

By induction on the type-compatibility derivation. In the variable leaf case, we know α ∈ Σ. In the lump leaf case, . . .

Theorem (Termination equivalence). The translation respects the evaluation structure: a value is translated into a value, and a position in the original term is reducible if and only if the same position is reducible in the translation; both properties are checked by direct induction, on values and on evaluation contexts. Furthermore, the translation was carefully chosen (especially for the store operations) so that there is a redex in the translated term if and only if there is a redex in the original term, and the reduction of the translation is also the translation of the reduction.

C LOGICAL RELATION
To define the logical relation in a precise way, we introduce a new grammar of "relational types" ρ, ρ′, which simply extends the grammars of σ and σ′ with a case for a relation on λU types, (R, σ1, σ2).
Introducing this lightweight syntax for relations here is a middle way between definitions that use an explicit relational substitution (cluttering the relation with yet another index) and a full-blown logic for parametricity as in Plotkin and Abadi (1993). Every ρ has two associated types, the types of the terms it relates, which we denote (ρ)1 and (ρ)2. First, we define when closed values are related at each type, indexing by a natural number to break the circularity of recursive types. The relations are defined by nested induction on (j, ρ, ρ′): any time a bigger type is used in a definition, the step index j is decremented.
The definition of E⟦σ⟧j shows that this is an asymmetric relation, capturing a notion of approximation rather than equivalence.
Next, we extend the relations to open terms: two open terms are related when closing them with related substitutions yields related closed terms.
The Fundamental Property is the key to proving parametricity results. The proof is by induction on contexts, showing that every term-formation rule preserves logical relatedness. These "compatibility" lemmas are extensive, but their proofs are simple; they are given in the extended technical report.

Lemma 2. If e is a surface term of λL, then the surface judgment Γ ⊢ e : σ holds if and only if the internal judgment •; Γ ⊢ ∅ | e : σ holds.
Fig. 10.Pure Semantics of Linear State

Fig. 12. Relation Type Syntax

[⟦s⟧/s] denotes the composed substitution [⟦(s′ | v)⟧/xℓ], one for each binding [ℓ → (s′ | v)] in s.

To show that two λU terms e, e′ are contextually equivalent in λUL, we are given a context C in λUL and must prove that C[e] and C[e′] are equi-terminating. From Theorem 2 (Termination equivalence) we know that C[e] and ⟦C[e]⟧ are equi-terminating, and from Lemma 9 (Compositionality) that ⟦C[e]⟧ is equal to ⟦C⟧[⟦e⟧], which is equal to ⟦C⟧[e] by Lemma 10 (Projection). Similarly, C[e′] and ⟦C⟧[e′] are equi-terminating. Because ⟦C⟧ is a context in λU, we can use our assumption that e ≈ctx e′ to conclude that ⟦C⟧[e] and ⟦C⟧[e′] are equi-terminating.
(1) If !Γ ⊢ v : σ then !Γ ⊢ v ≤log v : σ.
(2) If !Γ ⊢ e : σ then !Γ ⊢ e ≤log e : σ.
(3) If Ψ; Γ ⊢ s | e : σ′ then !Γ ⊢ (s | e) ≤log (s | e) : σ′.

Finally, we prove that our logical relation is sound with respect to contextual equivalence: that is, it can be used as a more tractable way to prove contextual equivalence results, such as lump/unlump cancellation.