Distributive Disjoint Polymorphism for Compositional Programming

Abstract. Popular programming techniques such as shallow embeddings of Domain Specific Languages (DSLs), finally tagless or object algebras are built on the principle of compositionality. However, existing programming languages only support simple compositional designs well, and have limited support for more sophisticated ones. This paper presents the F + i calculus, which supports highly modular and compositional designs that improve on existing techniques. These improvements are due to the combination of three features: disjoint intersection types with a merge operator; parametric (disjoint) polymorphism; and BCD-style distributive subtyping. The main technical challenge is F + i's proof of coherence. A naive adaptation of ideas used in System F's parametricity to canonicity (the logical relation used by F + i to prove coherence) results in an ill-founded logical relation. To solve the problem our canonicity relation employs a different technique based on immediate substitutions and a restriction to predicative instantiations. Besides coherence, we show several other important meta-theoretical results, such as type-safety, sound and complete algorithmic subtyping, and decidability of the type system. Remarkably, unlike F <:'s bounded polymorphism, disjoint polymorphism in F + i supports decidable type-checking.


Introduction
Compositionality is a desirable property in programming designs. Broadly defined, it is the principle that a system should be built by composing smaller subsystems. For instance, in the area of programming languages, compositionality is a key aspect of denotational semantics [48,49], where the denotation of a program is constructed from the denotations of its parts. Compositional definitions have many benefits. One is ease of reasoning: since compositional definitions are recursively defined over smaller elements they can typically be reasoned about using induction. Another benefit is that compositional definitions are easy to extend, without modifying previous definitions.
Programming techniques that support compositional definitions include: shallow embeddings of Domain Specific Languages (DSLs) [20], finally tagless [11], polymorphic embeddings [26] or object algebras [35]. These techniques allow us to create compositional definitions, which are easy to extend without modifications. Moreover, when modeling semantics, both finally tagless and object algebras support multiple interpretations (or denotations) of syntax, thus offering a solution to the well-known Expression Problem [53]. Because of these benefits these techniques have become popular both in the functional and object-oriented programming communities.
However, programming languages often only support simple compositional designs well, while support for more sophisticated compositional designs is lacking. For instance, once we have multiple interpretations of syntax, we may wish to compose them. Particularly useful is a merge combinator, which composes two interpretations [35,37,42] to form a new interpretation that, when executed, returns the results of both interpretations.
The merge combinator can be manually defined in existing programming languages, and be used in combination with techniques such as finally tagless or object algebras. Moreover, variants of the merge combinator are useful to model more complex combinations of interpretations. Good examples are so-called dependent interpretations, where an interpretation does not depend only on itself, but also on a different interpretation. These definitions with dependencies are quite common in practice, and, although they are not orthogonal to the interpretation they depend on, we would like to model them (and also mutually dependent interpretations) in a modular and compositional style.
Defining the merge combinator in existing programming languages is verbose and cumbersome, requiring code for every new kind of syntax. Yet, that code is essentially mechanical and ought to be automated. While using advanced meta-programming techniques enables automating the merge combinator to a large extent in existing programming languages [37,42], those techniques have several problems: error messages can be problematic, type-unsafe reflection is needed in some approaches [37] and advanced type-level features are required in others [42]. An alternative to the merge combinator that supports modular multiple interpretations and works in OO languages with support for some form of multiple inheritance and covariant type-refinement of fields has also been recently proposed [55]. While this approach is relatively simple, it still requires a lot of manual boilerplate code for composition of interpretations.
This paper presents a calculus and polymorphic type system with (disjoint) intersection types [36], called F + i. F + i supports our broader notion of compositional designs, and enables the development of highly modular and reusable programs. F + i has a built-in merge operator and a powerful subtyping relation that are used to automate the composition of multiple (possibly dependent) interpretations. In F + i subtyping is coercive and enables the automatic generation of coercions in a type-directed fashion. This process is similar to that of other type-directed code generation mechanisms such as type classes [52], which eliminate boilerplate code associated with the dictionary translation [52].
F + i continues a line of research on disjoint intersection types. Previous work on disjoint polymorphism (the F i calculus) [2] studied the combination of parametric polymorphism and disjoint intersection types, but its subtyping relation does not support BCD-style distributivity rules [3] and the type system also prevents unrestricted intersections [16]. More recently the NeColus calculus (or λ + i) [5] introduced a system with disjoint intersection types and BCD-style distributivity rules, but did not account for parametric polymorphism. F + i is unique in that it combines all three features in a single calculus: disjoint intersection types and a merge operator; parametric (disjoint) polymorphism; and a BCD-style subtyping relation with distributivity rules. The three features together allow us to improve upon the finally tagless and object algebra approaches and support advanced compositional designs. Moreover, previous work on disjoint intersection types has shown various other applications that are also possible in F + i, including: first-class traits and dynamic inheritance [4], extensible records and dynamic mixins [2], and nested composition and family polymorphism [5].
Unfortunately the combination of the three features has non-trivial complications. The main technical challenge (as for most other calculi with disjoint intersection types) is the proof of coherence for F + i. Because of the presence of BCD-style distributivity rules, our coherence proof is based on the recent approach employed in λ + i [5], which uses a heterogeneous logical relation called canonicity. To account for polymorphism, which λ + i's canonicity does not support, we originally wanted to incorporate the relevant parts of System F's logical relation [43]. However, due to a mismatch between the two relations, this did not work. The parametricity relation has been carefully set up with a delayed type substitution to avoid ill-foundedness due to its impredicative polymorphism. Unfortunately, canonicity is a heterogeneous relation and needs to account for cases that cannot be expressed with the delayed substitution setup of the homogeneous parametricity relation. Therefore, to handle those heterogeneous cases, we resorted to immediate substitutions and predicative instantiations. We do not believe that predicativity is a severe restriction in practice, since many source languages (e.g., those based on the Hindley-Milner type system like Haskell and OCaml) are themselves predicative and do not require the full generality of an impredicative core language. Should impredicative instantiation be required, we expect that step-indexing [1] can be used to recover well-foundedness, though at the cost of a much more complicated coherence proof.
The formalization and metatheory of F + i are a significant advance over that of F i. Besides the support for distributive subtyping, F + i removes several restrictions imposed by the syntactic coherence proof in F i. In particular F + i supports unrestricted intersections, which are forbidden in F i. Unrestricted intersections enable, for example, encoding certain forms of bounded quantification [39]. Moreover the new proof method is more robust with respect to language extensions. For instance, F + i supports the bottom type without significant complications in the proofs, while it was a challenging open problem in F i. A final interesting aspect is that F + i's type-checking is decidable. In the design space of languages with polymorphism and subtyping, similar mechanisms have been known to lead to undecidability. Pierce's seminal paper "Bounded quantification is undecidable" [40] shows that the contravariant subtyping rule for bounded quantification in F <: leads to undecidability of subtyping. In F + i the contravariant rule for disjoint quantification retains decidability. Since with unrestricted intersections F + i can express several use cases of bounded quantification, F + i could be an interesting and decidable alternative to F <:.
In summary the contributions of this paper are:
- The F + i calculus, which is the first calculus to combine disjoint intersection types, BCD-style distributive subtyping and disjoint polymorphism. We show several meta-theoretical results, such as type-safety, sound and complete algorithmic subtyping, coherence and decidability of the type system. F + i includes the bottom type, which was considered to be a significant challenge in previous work on disjoint polymorphism [2].
- An extension of the canonicity relation with polymorphism, which enables the proof of coherence of F + i. We show that the ideas of System F's parametricity cannot be ported to F + i. To overcome the problem we use a technique based on immediate substitutions and a predicativity restriction.
- Improved compositional designs: We show that F + i's combination of features enables improved compositional programming designs and supports automated composition of interpretations in programming techniques like object algebras and finally tagless.
- Implementation and proofs: All of the metatheory of this paper, except some manual proofs of decidability, has been mechanically formalized in Coq. Furthermore, F + i is implemented and all code presented in the paper is available. The implementation, Coq proofs and extended version with appendices can be found in https://github.com/bixuanzju/ESOP2019-artifact.

Compositional Programming
To demonstrate the compositional properties of F + i we use Gibbons and Wu's shallow embeddings of parallel prefix circuits [20]. By means of several different shallow embeddings, we first illustrate the shortcomings of a state-of-the-art compositional approach, popularly known as a finally tagless encoding [11], in Haskell. Next we show how parametric polymorphism and distributive intersection types provide a more elegant and compact solution in SEDEL [4], a source language built on top of our F + i calculus.

A Finally Tagless Encoding in Haskell
The circuit DSL represents networks that map a number of inputs (known as the width) of some type A onto the same number of outputs of the same type. The outputs combine (with repetitions) one or more inputs using a binary associative operator ⊕ : A × A → A. A particularly interesting class of circuits that can be expressed in the DSL are parallel prefix circuits. These represent computations that take n > 0 inputs $x_1, \ldots, x_n$ and produce n outputs $y_1, \ldots, y_n$, where $y_j = x_1 \oplus x_2 \oplus \cdots \oplus x_j$ for $1 \le j \le n$.
The DSL features 5 language primitives: two basic circuit constructors and three circuit combinators. These are captured in the Haskell type class Circuit, whose methods are identity, fan, beside, above and stretch. An identity circuit with n inputs $x_i$ has n outputs $y_i = x_i$. A fan circuit has n inputs $x_i$ and n outputs $y_i$, where $y_1 = x_1$ and $y_j = x_1 \oplus x_j$ (j > 1). The binary beside combinator puts two circuits in parallel; the combined circuit takes the inputs of both circuits to the outputs of both circuits. The binary above combinator connects the outputs of the first circuit to the inputs of the second; the width of both circuits has to be the same. Finally, stretch ws c interleaves the wires of circuit c with bundles of additional wires that map their input straight on their output. The ws parameter specifies the width of the consecutive bundles; the ith wire of c is preceded by a bundle of width $ws_i - 1$.
Basic width and depth embeddings. Figure 1 shows two simple shallow embeddings, which represent a circuit respectively in terms of its width and its depth. The former denotes the number of inputs/outputs of a circuit, while the latter is the maximal number of ⊕ operators between any input and output. Both definitions follow the same setup: a new Haskell datatype (Width/Depth) wraps the primitive result value and provides an instance of the Circuit type class that interprets the 5 DSL primitives accordingly. A so-called Brent-Kung parallel prefix circuit [9] can then be defined from the DSL primitives as a value e1 of type Width; e1 evaluates to W {width = 4}. If we want to know the depth of the circuit, we have to change the type signature to Depth.
Interpreting multiple ways. Fortunately, with the help of polymorphism we can define a type of circuits that support multiple interpretations at once.
This way we can provide a single Brent-Kung parallel prefix circuit definition that can be reused for different interpretations. A type annotation then selects the desired interpretation. For instance, brentKung :: Width yields the width and brentKung :: Depth the depth.
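The Haskell setup described so far can be sketched as follows. This is a minimal reconstruction rather than the paper's own listing: the class Circuit and the names W and width come from the text, while the constructor D, the field depth and the exact instance bodies are assumptions based on the informal description of width and depth above. Instead of a separate type synonym for polymorphic circuits, the sketch gives brentKung a class-constrained type directly, which avoids language extensions.

```haskell
-- The five DSL primitives, abstracted over the interpretation c.
class Circuit c where
  identity :: Int -> c
  fan      :: Int -> c
  beside   :: c -> c -> c
  above    :: c -> c -> c
  stretch  :: [Int] -> c -> c

-- Width embedding: the number of inputs/outputs of a circuit.
newtype Width = W { width :: Int } deriving (Eq, Show)

instance Circuit Width where
  identity n   = W n
  fan n        = W n
  beside c1 c2 = W (width c1 + width c2)
  above c1 _   = c1            -- both circuits are assumed to have the same width
  stretch ws _ = W (sum ws)

-- Depth embedding (assumed names D/depth): the maximal number of
-- operators between any input and any output.
newtype Depth = D { depth :: Int } deriving (Eq, Show)

instance Circuit Depth where
  identity _   = D 0
  fan _        = D 1
  beside c1 c2 = D (max (depth c1) (depth c2))
  above c1 c2  = D (depth c1 + depth c2)
  stretch _ c  = c

-- One circuit definition, reusable at every interpretation.
brentKung :: Circuit c => c
brentKung =
  above (beside (fan 2) (fan 2))
        (above (stretch [2, 2] (fan 2))
               (beside (beside (identity 1) (fan 2)) (identity 1)))

-- A type annotation selects the interpretation.
e1 :: Width
e1 = brentKung

e2 :: Depth
e2 = brentKung
```

Under these assumed instances, e1 is W {width = 4}, matching the value reported in the text, and e2 is D {depth = 3}.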
Composition of embeddings. What is not ideal in the above code is that the same brentKung circuit is processed twice, if we want to execute both interpretations. We can do better by processing the circuit only once, computing both interpretations simultaneously. The finally tagless encoding achieves this with a boilerplate instance for tuples of interpretations.
  instance (Circuit c1, Circuit c2) => Circuit (c1, c2) where
    identity n   = (identity n, identity n)
    fan n        = (fan n, fan n)
    beside c1 c2 = (beside (fst c1) (fst c2), beside (snd c1) (snd c2))
    above c1 c2  = (above (fst c1) (fst c2), above (snd c1) (snd c2))
    stretch ws c = (stretch ws (fst c), stretch ws (snd c))

Now we can get both embeddings simultaneously by annotating brentKung with the tuple type (Width, Depth).
Composition of dependent interpretations. The composition above is easy because the two embeddings are orthogonal. In contrast, the composition of dependent interpretations is rather cumbersome in the standard finally tagless setup. An example of the latter is the interpretation of circuits as their well-sizedness, which captures whether circuits are well-formed. This interpretation depends on the interpretation of circuits as their width.

  data WellSized = WS { wS :: Bool, ox :: Width }

  instance Circuit WellSized where
    identity n   = WS True (identity n)
    fan n        = WS True (fan n)
    beside c1 c2 = WS (wS c1 && wS c2) (beside (ox c1) (ox c2))
    above c1 c2  = WS (wS c1 && wS c2 && width (ox c1) == width (ox c2))
                      (above (ox c1) (ox c2))
    stretch ws c = WS (wS c && length ws == width (ox c)) (stretch ws (ox c))

The WellSized datatype represents the well-sizedness of a circuit with a Boolean, and also keeps track of the circuit's width. The 5 primitives compute the well-sizedness in terms of both the width and well-sizedness of the subcomponents. What makes the code cumbersome is that it has to explicitly delegate to the Width interpretation to collect this additional information.
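To make the two compositions above concrete, here is a self-contained sketch that puts the tuple instance and the WellSized instance from the text next to assumed Width and Depth embeddings (the Depth names D and depth and the instance bodies for Width and Depth are assumptions, as Fig. 1 is not reproduced here). The definitions both, ok and bad are hypothetical names introduced only for illustration.

```haskell
class Circuit c where
  identity :: Int -> c
  fan      :: Int -> c
  beside   :: c -> c -> c
  above    :: c -> c -> c
  stretch  :: [Int] -> c -> c

newtype Width = W { width :: Int } deriving (Eq, Show)
newtype Depth = D { depth :: Int } deriving (Eq, Show)

instance Circuit Width where
  identity n   = W n
  fan n        = W n
  beside c1 c2 = W (width c1 + width c2)
  above c1 _   = c1
  stretch ws _ = W (sum ws)

instance Circuit Depth where
  identity _   = D 0
  fan _        = D 1
  beside c1 c2 = D (max (depth c1) (depth c2))
  above c1 c2  = D (depth c1 + depth c2)
  stretch _ c  = c

-- Boilerplate instance for orthogonal interpretations (from the text).
instance (Circuit c1, Circuit c2) => Circuit (c1, c2) where
  identity n   = (identity n, identity n)
  fan n        = (fan n, fan n)
  beside c1 c2 = (beside (fst c1) (fst c2), beside (snd c1) (snd c2))
  above c1 c2  = (above (fst c1) (fst c2), above (snd c1) (snd c2))
  stretch ws c = (stretch ws (fst c), stretch ws (snd c))

-- Dependent interpretation: well-sizedness explicitly carries the width.
data WellSized = WS { wS :: Bool, ox :: Width }

instance Circuit WellSized where
  identity n   = WS True (identity n)
  fan n        = WS True (fan n)
  beside c1 c2 = WS (wS c1 && wS c2) (beside (ox c1) (ox c2))
  above c1 c2  = WS (wS c1 && wS c2 && width (ox c1) == width (ox c2))
                    (above (ox c1) (ox c2))
  stretch ws c = WS (wS c && length ws == width (ox c)) (stretch ws (ox c))

brentKung :: Circuit c => c
brentKung =
  above (beside (fan 2) (fan 2))
        (above (stretch [2, 2] (fan 2))
               (beside (beside (identity 1) (fan 2)) (identity 1)))

-- Both orthogonal interpretations in a single traversal.
both :: (Width, Depth)
both = brentKung

-- The Brent-Kung circuit is well-sized ...
ok :: Bool
ok = wS (brentKung :: WellSized)

-- ... whereas stacking circuits of different widths is not.
bad :: Bool
bad = wS (above (fan 2) (fan 3) :: WellSized)
```

The explicit calls to ox in every WellSized case are the delegation that the text describes as cumbersome; the sketch only illustrates it and does not attempt to remove it.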
With the help of a substantially more complicated setup that features a dozen Haskell language extensions, and advanced programming techniques, we can make the explicit delegation implicit (see the appendix). Nevertheless, that approach still requires a lot of boilerplate that needs to be repeated for each DSL, as well as explicit projections that need to be written in each interpretation. Another alternative Haskell encoding that also enables multiple dependent interpretations is proposed by Zhang and Oliveira [55], but it does not eliminate the explicit delegation and still requires substantial amounts of boilerplate. A final remark is that adding new primitives (e.g., a "right stretch" rstretch combinator [25]) can also be easily achieved [46].

The SEDEL Encoding
SEDEL is a source language that elaborates to F + i, adding a few convenient source level constructs. The SEDEL setup of the circuit DSL is similar to the finally tagless approach. Instead of a Circuit c type class, there is a Circuit[C] type that gathers the 5 circuit primitives in a record. Like in Haskell, the type parameter C expresses that the interpretation of circuits is a parameter.
As a side note, if a new constructor (e.g., rstretch) is needed, then it is added by means of intersection types (& creates an intersection type) in SEDEL.
Figure 2 shows the two basic shallow embeddings for width and depth. In both cases, a named SEDEL definition replaces the corresponding unnamed Haskell type class instance in providing the implementations of the 5 language primitives for a particular interpretation.
The use of the SEDEL embeddings is different from that of their Haskell counterparts. Where Haskell implicitly selects the appropriate type class instance based on the available type information, in SEDEL the programmer explicitly selects the implementation following the style used by object algebras. This is done by building a circuit with l1 (short for language1). The resulting definition e1 evaluates to {width = 4}. If we want to know the depth of the circuit, we have to replicate the code with language2.
Dynamically reusable circuits. Just like in Haskell, we can use polymorphism to define a type of circuits that can be interpreted with different languages.
In contrast to the Haskell solution, this circuit definition explicitly accepts the implementation.

  brentKung : DCircuit = {
    accept C l =
      l.above (l.beside (l.fan 2) (l.fan 2))
              (l.above (l.stretch (cons 2 (cons 2 nil)) (l.fan 2))
                       (l.beside (l.beside (l.identity 1) (l.fan 2)) (l.identity 1)))
  };
  e1 = brentKung.accept Width language1;
  e2 = brentKung.accept Depth language2;

Automatic composition of languages. Of course, like in Haskell we can also compute both results simultaneously. However, unlike in Haskell, the composition of the two interpretations requires no boilerplate whatsoever; in particular, there is no SEDEL counterpart of the Circuit (c1, c2) instance. Instead, we can just compose the two interpretations with the term-level merge operator (,,) and specify the desired type.
Composition of dependent interpretations. In SEDEL the composition scales nicely to dependent interpretations. For instance, the well-sizedness interpretation can be expressed without explicit projections. The WellSized & Width type in the above and stretch cases of that interpretation expresses that both the well-sizedness and width of subcircuits must be given, and that the width implementation is left as a dependency: when language4 is used, then the width implementation must be provided. Again, the distributive properties of & in the type system take care of merging the two interpretations.
This way the components x and y are only known at runtime and thus the merge can only happen at that time. The types A and B cannot be chosen entirely freely. For instance, if both components contributed an implementation for the same method, it would be ambiguous which implementation is provided by the combination. To avoid this problem the two types A and B have to be disjoint. This is expressed in the disjointness constraint * A on the quantifier of the type variable B. If a quantifier mentions no disjointness constraint, like that of A, it defaults to the trivial * ⊤ constraint, which implies no restriction.
Semantics of the F + i Calculus
This section gives a formal account of F + i, the first typed calculus combining disjoint polymorphism [2] (and disjoint intersection types) with BCD subtyping [3].
The main differences to F i are in the subtyping, well-formedness and disjointness relations. F + i adds BCD subtyping and unrestricted intersections, and also closes an open problem of F i by including the bottom type. The dynamic semantics of F + i is given by elaboration to the target calculus F co, a variant of System F extended with products and explicit coercions.

Syntax and Semantics
Figure 3 shows the syntax of F + i. Metavariables A, B, C range over types. Types include standard constructs from prior work [2,36]: integers Int, the top type ⊤, arrows A → B, intersections A & B, single-field record types {l : A} and disjoint quantification ∀(α * A).B. One novelty in F + i is the addition of the uninhabited bottom type ⊥. Metavariable E ranges over expressions. Expressions are integer literals i, the top value ⊤, lambda abstractions λx.E, applications $E_1\ E_2$, merges $E_1 ,, E_2$, annotated terms E : A, single-field records {l = E}, record projections E.l, type abstractions Λ(α * A).E and type applications E A.
Well-formedness and unrestricted intersections. F + i's well-formedness judgment of types ∆ ⊢ A is standard, and only enforces well-scoping. This is one of the key differences from F i, which uses well-formedness to also ensure that all intersection types are disjoint. In other words, while in F i all valid intersection types must be disjoint, in F + i unrestricted intersection types such as Int & Int are allowed. More specifically, the well-formedness rules for intersection types in F + i (left) and F i (right) are:

$$\frac{\Delta \vdash A \qquad \Delta \vdash B}{\Delta \vdash A \,\&\, B} \qquad\qquad \frac{\Delta \vdash A \qquad \Delta \vdash B \qquad \Delta \vdash A * B}{\Delta \vdash A \,\&\, B}$$

Notice that F i has an extra disjointness condition ∆ ⊢ A * B in the premise. This is crucial for F i's syntactic method for proving coherence, but also burdens the calculus with various syntactic restrictions and complicates its metatheory. For example, it requires extra effort to show that F i only produces disjoint intersection types. As a consequence, F i features a weaker substitution lemma (note the gray part in Proposition 1) than F + i.
Fig. 4: Declarative subtyping
Declarative subtyping. F + i's subtyping judgment is another major difference to F i, because it features BCD-style subtyping and a rule for the bottom type. The full set of subtyping rules is shown in Fig. 4.
The reader is advised to ignore the gray parts for now. Our subtyping rules extend the BCD-style subtyping rules from λ + i [5] with a rule for parametric (disjoint) polymorphism (rule S-forall). Moreover, we have three new rules: rule S-bot for the bottom type, and rules S-distAll and S-topAll for distributivity of disjoint quantification. The subtyping relation is a preorder (rules S-refl and S-trans). Most of the rules are quite standard. ⊥ is a subtype of all types (rule S-bot). Subtyping of disjoint quantification is covariant in its body, and contravariant in its disjointness constraints (rule S-forall). Of particular interest are the so-called "distributivity" rules: rule S-distArr says intersections distribute over arrows; rule S-distRcd says intersections distribute over records. Similarly, rule S-distAll dictates that intersections may distribute over disjoint quantifiers.
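For concreteness, the rules singled out above can be written as follows. This is a reconstruction from the prose and from the standard shape of BCD-style distributivity, not a copy of Fig. 4; the coercions (the gray parts) are omitted.

```latex
% Distributivity of intersections over arrows, records and disjoint quantifiers
\begin{aligned}
\textsc{S-distArr}:\quad & (A \to B_1) \,\&\, (A \to B_2) \;<:\; A \to (B_1 \,\&\, B_2) \\
\textsc{S-distRcd}:\quad & \{l : A\} \,\&\, \{l : B\} \;<:\; \{l : A \,\&\, B\} \\
\textsc{S-distAll}:\quad & (\forall(\alpha * A).\,B_1) \,\&\, (\forall(\alpha * A).\,B_2) \;<:\; \forall(\alpha * A).\,(B_1 \,\&\, B_2)
\end{aligned}

% Disjoint quantification: covariant body, contravariant disjointness constraint
\frac{B_1 <: B_2 \qquad A_2 <: A_1}
     {\forall(\alpha * A_1).\,B_1 \;<:\; \forall(\alpha * A_2).\,B_2}
\quad \textsc{S-forall}
```

As a small instance of S-distArr, $(\mathsf{Int} \to \{a : \mathsf{Int}\}) \,\&\, (\mathsf{Int} \to \{b : \mathsf{Int}\})$ is a subtype of $\mathsf{Int} \to \{a : \mathsf{Int}\} \,\&\, \{b : \mathsf{Int}\}$, which is the kind of step that lets a merge of two interpretations be used as a single interpretation returning both results.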

Disjointness
We now turn to another core judgment of F + i: the disjointness relation, shown in Fig. 6. The disjointness rules are mostly inherited from F i [2], but the new bottom type requires a notable change regarding disjointness with top-like types.
Top-like types. Top-like types are all types that are isomorphic to ⊤ (i.e., simultaneously sub- and supertypes of ⊤). Hence, they are inhabited by a single value, isomorphic to the ⊤ value. Fig. 6 captures this notion in a syntax-directed fashion in the ⌉A⌈ predicate. As a historical note, the concept of top-like types was already known by Barendregt et al. [3]. The λ i calculus [36] re-discovered it and coined the term "top-like types"; the F i calculus [2] extended it with universal quantifiers. Note that in both calculi, top-like types are solely employed for enabling a syntactic method of proving coherence, and due to the lack of BCD subtyping, they do not have a type-theoretic interpretation there.
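A syntax-directed check for top-like types can be sketched as below. The data type Ty and the function topLike are hypothetical names for illustration, and the clauses are an assumption: they follow the usual characterization (⊤ itself, intersections of top-like types, and arrows, records and quantifiers whose result type is top-like), which is what the S-top rules of BCD-style subtyping suggest, rather than being copied from Fig. 6.

```haskell
-- Source types of the calculus (a simplified, assumed representation).
data Ty
  = TInt
  | TTop
  | TBot
  | TVar String
  | TArr Ty Ty          -- A -> B
  | TAnd Ty Ty          -- A & B
  | TRcd String Ty      -- {l : A}
  | TAll String Ty Ty   -- forall (a * A). B
  deriving (Eq, Show)

-- Syntax-directed top-likeness: does the type behave like the top type?
topLike :: Ty -> Bool
topLike TTop         = True
topLike (TAnd a b)   = topLike a && topLike b
topLike (TArr _ b)   = topLike b      -- only the codomain matters
topLike (TRcd _ a)   = topLike a
topLike (TAll _ _ b) = topLike b      -- only the body matters
topLike _            = False          -- Int, bottom and type variables
```

For example, topLike (TArr TInt TTop) holds, while topLike (TAnd TTop TInt) does not, since the second component still carries an Int.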

Disjointness rules. The disjointness judgment ∆ ⊢ A * B is helpful to check whether the merge of two expressions of type A and B preserves coherence. Incoherence arises when both expressions produce distinct values for the same type, either directly when they are both of that same type, or through implicit upcasting to a common supertype. Of course we can safely disregard top-like types in this matter because they do not have two distinct values. In short, it suffices to check that the two types have only top-like supertypes in common.
Fig. 7: Syntax of F co
Because ⊥ and any other type A always have A as a common supertype, it follows that ⊥ is only disjoint to A when A is top-like. More generally, if A is a top-like type, then A is disjoint to any type. This is the rationale behind the two rules D-topL and D-topR, which generalize and subsume ∆ ⊢ ⊤ * A and ∆ ⊢ A * ⊤ from F i, and also cater to the bottom type. Two other interesting rules are D-tvarL and D-tvarR, which dictate that a type variable α is disjoint with some type B if its disjointness constraint A is a subtype of B. Disjointness axioms $A *_{ax} B$ (appearing in rule D-ax) take care of two types with different type constructors (e.g., Int and records). Axiom rules can be found in the appendix. Finally we note that the disjointness relation is symmetric.

Elaboration and Type Safety
The dynamic semantics of F + i is given by elaboration into a target calculus. The target calculus F co is the standard call-by-value System F extended with products and coercions. The syntax of F co is shown in Fig. 7.

Type translation. Definition 1 defines the type translation function | • | from F + i types A to F co types τ. Most cases are straightforward. For example, ⊥ is mapped to an uninhabited type ∀α.α; disjoint quantification is mapped to universal quantification, dropping the disjointness constraints. | • | is naturally extended to work on contexts as well.
Fig. 8: Selected reduction rules
Coercions and coercive subtyping. We follow prior work [5,6] by having a syntactic category for coercions [22]. In Fig. 7, we have several new coercions: bot, $co_\forall$, $dist_\forall$ and $top_\forall$, due to the addition of polymorphism and the bottom type. As seen in Fig. 4 the coercive subtyping judgment has the form A <: B ⤳ co, which says that the subtyping derivation for A <: B produces a coercion co that converts terms of type |A| to |B|.
F co static semantics. The typing rules of F co are quite standard. We have one rule t-capp regarding coercion application, which uses the judgment co :: τ ▹ τ′ to type coercions. We show two representative rules ct-forall and ct-bot.
F co dynamic semantics. The dynamic semantics of F co is mostly unremarkable. We write e −→ e′ to mean one-step reduction. Figure 8 shows selected reduction rules. The first line shows three representative rules regarding coercion reductions. They do not contribute to computation but merely rearrange coercions. Our coercion reduction rules are quite standard but not efficient in terms of space. Nevertheless, there is existing work on space-efficient coercions [23,50], which should be applicable to our work as well. Rule r-app is the usual β-rule that performs actual computation, and rule r-ctxt handles reduction under an evaluation context. As usual, −→* is the reflexive, transitive closure of −→. Now we can show that F co is type safe.

Algorithmic System and Decidability
The subtyping relation in Fig. 4 is highly non-algorithmic due to the presence of a transitivity rule. This section presents an alternative algorithmic formulation.
Our algorithm extends that of λ + i, which itself was inspired by Pierce's decision procedure [38], to handle disjoint quantifiers and the bottom type. We then prove that the algorithm is sound and complete with respect to declarative subtyping.
Additionally we prove that the subtyping and disjointness relations are decidable. Although the proofs of this fact are fairly straightforward, it is nonetheless remarkable since it contrasts with the subtyping relation for (full) F <: [10], which is undecidable [40]. Thus while bounded quantification is infamous for its undecidability, disjoint quantification has the nicer property of being decidable.

Algorithmic Subtyping Rules
While Fig. 4 is a fine specification of how subtyping should behave, it cannot be read directly as a subtyping algorithm for two reasons: (1) the conclusions of rules S-refl and S-trans overlap with the other rules, and (2) the premises of rule S-trans mention a type that does not appear in the conclusion. Simply dropping the two offending rules from the system is not possible without losing expressivity [29]. Thus we need a different approach. Following λ + i, we intend the algorithmic judgment Q ⊢ A <: B to be equivalent to A <: Q ⇒ B, where Q is a queue used to track record labels, domain types and disjointness constraints. The full rules of the algorithmic subtyping of F + i are shown in Fig. 9.
For brevity of the algorithm, we use metavariable c to range over type constants. The basic idea of Q ⊢ A <: B is to perform a case analysis on B until it reaches type constants. We explain the new rules regarding disjoint quantification and the bottom type. When a quantifier is encountered in B, rule A-forall pushes the type variable with its disjointness constraint onto Q and continues with the body. Correspondingly, in rule A-allConst, when a quantifier is encountered in A, and the head of Q is a type variable, this variable is popped out and we continue with the body. Rule A-bot is similar to its declarative counterpart. Two meta-functions $\llbracket Q \rrbracket_{\top}$ and $\llbracket Q \rrbracket_{\&}$ are meant to generate correct forms of coercions, and their definitions are shown in the appendix. For other algorithmic rules, we refer to λ + i [5] for detailed explanations.
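The queue-based procedure described above can be sketched as a Boolean subtyping check. This is a simplified illustration under several assumptions, not the algorithm of Fig. 9: coercions are not generated, the type constants are taken to be Int, ⊥ and type variables, the rules for arrows, records and intersections are assumed to have the shape used in λ + i, and both sides of a quantifier comparison are assumed to use the same bound variable name (no α-renaming). All names (Ty, Item, isSub) are hypothetical.

```haskell
data Ty
  = TInt
  | TTop
  | TBot
  | TVar String
  | TArr Ty Ty          -- A -> B
  | TAnd Ty Ty          -- A & B
  | TRcd String Ty      -- {l : A}
  | TAll String Ty Ty   -- forall (a * A). B
  deriving (Eq, Show)

-- Queue entries: record labels, domain types, disjointness constraints.
data Item = QLabel String | QDom Ty | QVar String Ty

-- isSub q a b decides the judgment "q |- a <: b" (without coercions).
isSub :: [Item] -> Ty -> Ty -> Bool
-- Phase 1: case analysis on the right-hand type until a constant remains.
isSub q a (TAnd b1 b2)   = isSub q a b1 && isSub q a b2
isSub q a (TArr b1 b2)   = isSub (q ++ [QDom b1]) a b2
isSub q a (TRcd l b)     = isSub (q ++ [QLabel l]) a b
isSub q a (TAll x b1 b2) = isSub (q ++ [QVar x b1]) a b2   -- push the variable
isSub _ _ TTop           = True
-- Phase 2: the right-hand side is a constant c; analyse the left-hand type.
isSub _ TBot _           = True                            -- bottom is below everything
isSub q (TAnd a1 a2) c   = isSub q a1 c || isSub q a2 c
isSub (QDom b1 : q) (TArr a1 a2) c =
  isSub [] b1 a1 && isSub q a2 c                           -- contravariant domain
isSub (QLabel l : q) (TRcd l' a) c =
  l == l' && isSub q a c
isSub (QVar x b1 : q) (TAll y a1 a2) c =
  x == y && isSub [] b1 a1 && isSub q a2 c                 -- pop the variable
isSub [] a c             = a == c                          -- constant against constant
isSub _ _ _              = False

-- Distributivity over arrows: (Int -> {a:Int}) & (Int -> {b:Int})
--   <: Int -> {a:Int} & {b:Int}
distArrExample :: Bool
distArrExample =
  isSub [] (TAnd (TArr TInt (TRcd "a" TInt)) (TArr TInt (TRcd "b" TInt)))
           (TArr TInt (TAnd (TRcd "a" TInt) (TRcd "b" TInt)))

-- Contravariance of disjointness constraints:
--   forall (a * Int). a  <:  forall (a * Int & {l:Int}). a
forallExample :: Bool
forallExample =
  isSub [] (TAll "a" TInt (TVar "a"))
           (TAll "a" (TAnd TInt (TRcd "l" TInt)) (TVar "a"))
```

The queue is used first-in, first-out: entries are appended while descending into the right-hand type and consumed from the front while descending into the left-hand type, so the outermost constructor on the right is matched against the outermost constructor on the left.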

Fig. 9: Algorithmic subtyping
Correctness of the algorithm. We prove that the algorithm is sound and complete with respect to the specification. We refer the reader to our Coq formalization for more details. We only show the two major theorems:
Theorem 4 (Completeness). If A <: B ⤳ co, then ∃co′. [] ⊢ A <: B ⤳ co′.

Decidability
Moreover, we prove that our algorithmic type system is decidable. To see this, first notice that the bidirectional type system is syntax-directed, so we only need to show decidability of algorithmic subtyping and disjointness. The full (manual) proofs for decidability can be found in the appendix.
Lemma 3 (Decidability of algorithmic subtyping). Given Q, A and B, it is decidable whether there exists co, such that Q ⊢ A <: B ⤳ co.
Lemma 4 (Decidability of disjointness checking). Given ∆, A and B, it is decidable whether ∆ ⊢ A * B.
One interesting observation here is that although our disjoint quantification has a similar shape to bounded quantification ∀(α <: A).B in F <: [10], subtyping for F <: is undecidable [40]. In F <:, the subtyping rule for bounded quantification (rule fsub-forall) is:

$$\frac{\Delta \vdash A_2 <: A_1 \qquad \Delta, \alpha <: A_2 \vdash B_1 <: B_2}{\Delta \vdash \forall(\alpha <: A_1).\,B_1 \;<:\; \forall(\alpha <: A_2).\,B_2}$$

Compared with rule S-forall, both rules are contravariant on bounded/disjoint types, and covariant on the body. However, with bounded quantification it is fundamental to track the bounds in the environment, which complicates the design of the rules and makes subtyping undecidable with rule fsub-forall.
Decidability can be recovered by employing an invariant rule for bounded quantification (that is, by forcing $A_1$ and $A_2$ to be identical). Disjoint quantification does not require such an invariant rule for decidability.

Establishing Coherence for F + i
In this section, we establish the coherence property for F + i. The proof strategy mostly follows that of λ + i, but the construction of the heterogeneous logical relation is significantly more complicated. Firstly in Section 5.1 we discuss why adding BCD subtyping to disjoint polymorphism introduces significant complications. In Section 5.2, we discuss why a natural extension of System F's logical relation to deal with disjoint polymorphism fails. The technical difficulty is well-foundedness, stemming from the interaction between impredicativity and disjointness. Finally in Section 5.3, we present our (predicative) logical relation that is specially crafted to prove coherence for F + i.

The Challenge
Before we tackle the coherence of F + i, let us first consider how F i (and its predecessor λ i) enforces coherence. Its essentially syntactic approach is to make sure that there is at most one subtyping derivation for any two types. As an immediate consequence, the produced coercions are uniquely determined and thus the calculus is clearly coherent. Key to this approach is the invariant that the type system only produces disjoint intersection types. As we mentioned in Section 3, this invariant complicates the calculus and its metatheory, and leads to a weaker substitution lemma. Moreover, the syntactic coherence approach is incompatible with BCD subtyping, which leads to multiple subtyping derivations with different coercions and requires a more general substitution lemma. To accommodate BCD into λ i, Bi et al. [5] have created the λ + i calculus and developed a semantically-founded proof method based on logical relations. Because λ + i does not feature polymorphism, the problem at hand is to incorporate support for polymorphism in this semantic approach to coherence, which turns out to be more challenging than is apparent.

Impredicativity and Disjointness at Odds
Figure 10 shows selected cases of canonicity, which is λ + i's (heterogeneous) logical relation used in the coherence proof. The definition captures that two values $v_1$ and $v_2$ of types $\tau_1$ and $\tau_2$ are in $\mathcal{V}[\![\tau_1; \tau_2]\!]$ iff either the types are disjoint or the types are equal and the values are semantically equivalent. Because both alternatives entail coherence, canonicity is key to λ + i's coherence proof.
Well-foundedness issues. For F + i, we need to extend canonicity with additional cases to account for universally quantified types. For reasons that will become clear in Section 5.3, the type indices become source types (rather than target types as in Fig. 10). A naive formulation of one case rule is:
$$(v_1, v_2) \in \mathcal{V}[\![\forall(\alpha * A_1).B_1; \forall(\alpha * A_2).B_2]\!] \triangleq \forall C_1 * A_1, C_2 * A_2.\ (v_1\,|C_1|, v_2\,|C_2|) \in \mathcal{E}[\![[C_1/\alpha]B_1; [C_2/\alpha]B_2]\!]$$
This case is problematic because it destroys the well-foundedness of λ + i's logical relation, which is based on structural induction on the type indices. Indeed, the type $[C_1/\alpha]B_1$ may well be larger than $\forall(\alpha * A_1).B_1$.
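The size blow-up caused by impredicative instantiation can be checked on a toy representation of types. The following is a minimal sketch, not part of the paper's formalization: the class names (`Top`, `TVar`, `Arrow`, `Forall`) and the node-counting `size` function are assumptions introduced only for illustration. It instantiates ∀(α * ⊤). α → α with itself and compares sizes.

```python
from dataclasses import dataclass

# Hypothetical toy AST for types; class names are illustrative only.
@dataclass(frozen=True)
class Top:
    pass

@dataclass(frozen=True)
class TVar:
    name: str

@dataclass(frozen=True)
class Arrow:
    dom: object
    cod: object

@dataclass(frozen=True)
class Forall:
    """Models the disjoint quantifier ∀(var * bound). body."""
    var: str
    bound: object
    body: object

def size(t):
    """Count the AST nodes of a type."""
    if isinstance(t, (Top, TVar)):
        return 1
    if isinstance(t, Arrow):
        return 1 + size(t.dom) + size(t.cod)
    if isinstance(t, Forall):
        return 1 + size(t.bound) + size(t.body)
    raise TypeError(f"unknown type: {t!r}")

def subst(t, var, c):
    """Compute [c/var]t. Assumes c is closed, so no variable capture can occur."""
    if isinstance(t, TVar):
        return c if t.name == var else t
    if isinstance(t, Arrow):
        return Arrow(subst(t.dom, var, c), subst(t.cod, var, c))
    if isinstance(t, Forall):
        bound = subst(t.bound, var, c)
        # An inner binder with the same name shadows var in its body.
        body = t.body if t.var == var else subst(t.body, var, c)
        return Forall(t.var, bound, body)
    return t

# ∀(a * ⊤). a → a
poly = Forall("a", Top(), Arrow(TVar("a"), TVar("a")))
# Impredicative instantiation: substitute the polymorphic type for its own variable.
instantiated = subst(poly.body, "a", poly)

print(size(poly), size(instantiated))  # 5 11
```

Under these assumptions the instantiated body has 11 nodes while the quantified type has only 5, so a definition that recurses on the substituted type cannot be justified by structural induction on the type index.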
However, System F's well-known parametricity logical relation [43] provides us with a means to avoid this problem. Rather than performing the type substitution immediately as in the above rule, we can defer it to a later point by adding it to an extra parameter ρ of the relation, which accumulates the deferred substitutions. This yields a modified rule where the type indices in the recursive occurrences are indeed smaller:
$$(v_1, v_2) \in \mathcal{V}[\![\forall(\alpha * A_1).B_1; \forall(\alpha * A_2).B_2]\!]_\rho \triangleq \forall C_1 * A_1, C_2 * A_2.\ (v_1\,|C_1|, v_2\,|C_2|) \in \mathcal{E}[\![B_1; B_2]\!]_{\rho[\alpha \mapsto (C_1, C_2)]}$$
Of course, the deferred substitution has to be performed eventually, namely when the type indices are type variables.
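Unfortunately, this way we have moved not only the type substitution to the type variable case, but also the ill-foundedness problem. Indeed, this problem is also present in System F. The standard solution is to not fix the relation R by which values at type α are related to $\mathcal{V}[\![\rho_1(\alpha); \rho_2(\alpha)]\!]$, but instead to make it a parameter that is tracked by ρ. This yields the following two rules for disjoint quantification and type variables:
$$(v_1, v_2) \in \mathcal{V}[\![\forall(\alpha * A_1).B_1; \forall(\alpha * A_2).B_2]\!]_\rho \triangleq \forall C_1 * A_1, C_2 * A_2, R \subseteq C_1 \times C_2.\ (v_1\,|C_1|, v_2\,|C_2|) \in \mathcal{E}[\![B_1; B_2]\!]_{\rho[\alpha \mapsto (C_1, C_2, R)]}$$
$$(v_1, v_2) \in \mathcal{V}[\![\alpha; \alpha]\!]_\rho \triangleq (v_1, v_2) \in \rho_R(\alpha)$$
Now we have finally recovered the well-foundedness of the relation. It is again structurally inductive on the size of the type indices.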
Heterogeneous issues. We have not yet accounted for one major difference between the parametricity relation, from which we have borrowed ideas, and the canonicity relation, to which we have been adding. The former is homogeneous (i.e., the types of the two values are the same) and therefore has one type index, while the latter is heterogeneous (i.e., the two values may have different types) and therefore has two type indices. Thus we must also consider cases like $\mathcal{V}[\![\alpha; \mathsf{Int}]\!]$. A definition that seems to handle this case appropriately is:
$$(v_1, v_2) \in \mathcal{V}[\![\alpha; \mathsf{Int}]\!]_\rho \triangleq (v_1, v_2) \in \mathcal{V}[\![\rho_1(\alpha); \mathsf{Int}]\!]_\emptyset \tag{1}$$
Here is an example to motivate it.
We expect that E Int 1 evaluates to $\langle 1, 1 \rangle$. To prove that, we need to show $(1, 1) \in \mathcal{V}[\![\alpha; \mathsf{Int}]\!]_\rho$. According to Eq. (1), this is indeed the case. However, we run into the ill-foundedness issue again, because $\rho_1(\alpha)$ could be larger than α. Alas, this time the parametricity relation has no solution for us.

The Canonicity Relation for F + i
In light of the fact that substitution in the logical relation seems unavoidable in our setting, and that impredicativity is at odds with substitution, we turn to predicativity: we change rule T-tapp to its predicative version, in which the type argument is a monotype t, where metavariable t ranges over monotypes (types minus disjoint quantification). We do not believe that predicativity is a severe restriction in practice, since many source languages (e.g., those based on the Hindley-Milner type system [24,32] like Haskell and OCaml) are themselves predicative and do not require the full generality of an impredicative core language. Luckily, substitution with monotypes does not prevent well-foundedness. The relation $\mathcal{V}[\![A; B]\!]$ is defined by induction on the structures of A and B. For integers, it requires the two values to be literally the same. For conciseness, we write $\Delta; \Gamma \vdash e_1 \simeq_{log} e_2 : A$ to mean $\Delta; \Gamma \vdash e_1 \simeq_{log} e_2 : A; A$.
Contextual equivalence. Following λ + i, the notion of coherence is based on contextual equivalence. The intuition is that two programs are equivalent if we cannot tell them apart in any context. As usual, contextual equivalence is expressed using expression contexts (C and D denote F + i and F co expression contexts, respectively). Due to the bidirectional nature of the type system, the typing judgment of C features 4 different forms (full rules are in the appendix). The judgment also generates a well-typed F co context D. The following two definitions capture the notion of contextual equivalence:
Definition 4 (Kleene Equality). Two complete programs (i.e., closed terms of type Int), $e$ and $e'$, are Kleene equal, written $e \simeq e'$, iff there exists an integer $i$ such that $e \longrightarrow^* i$ and $e' \longrightarrow^* i$.
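Kleene equality can be illustrated on a toy target language. The sketch below is an assumption-laden stand-in, not F co: it has only integer literals, pairs and projections (the names `Lit`, `Pair`, `Fst`, `Snd`, `evaluate` and `kleene_equal` are hypothetical), and it uses a big-step evaluator in place of the reduction relation. Because this toy language always terminates, the check is a simple comparison of results; for a general language, Kleene equality is not decidable in this way.

```python
from dataclasses import dataclass

# Hypothetical toy expressions; not the paper's F co syntax.
@dataclass(frozen=True)
class Lit:
    value: int

@dataclass(frozen=True)
class Pair:
    left: object
    right: object

@dataclass(frozen=True)
class Fst:
    arg: object

@dataclass(frozen=True)
class Snd:
    arg: object

def evaluate(e):
    """Big-step evaluation; assumes e is closed and well-typed."""
    if isinstance(e, Lit):
        return e.value
    if isinstance(e, Pair):
        return (evaluate(e.left), evaluate(e.right))
    if isinstance(e, Fst):
        return evaluate(e.arg)[0]
    if isinstance(e, Snd):
        return evaluate(e.arg)[1]
    raise TypeError(f"unknown expression: {e!r}")

def kleene_equal(e1, e2):
    """True iff both complete programs evaluate to the same integer."""
    v1, v2 = evaluate(e1), evaluate(e2)
    return isinstance(v1, int) and isinstance(v2, int) and v1 == v2

# Two different ways of extracting an Int from the pair <1, 1>.
same = Pair(Lit(1), Lit(1))
print(kleene_equal(Fst(same), Snd(same)))  # True

# With <1, 2> the choice of projection becomes observable.
diff = Pair(Lit(1), Lit(2))
print(kleene_equal(Fst(diff), Snd(diff)))  # False
```

The two projections play the role of two different coercions produced for the same source program: they are harmless exactly when no complete program can observe the difference.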
Coherence. For space reasons, we directly show the coherence statement of F + i. The proof relies on several technical lemmas, such as the compatibility lemmas and the fundamental property. The interested reader can refer to our Coq formalization.
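Theorem 5 (Coherence). We have that (1) if $\Delta; \Gamma \vdash E \Rightarrow A$ then $\Delta; \Gamma \vdash E \simeq_{ctx} E : A$, and (2) if $\Delta; \Gamma \vdash E \Leftarrow A$ then $\Delta; \Gamma \vdash E \simeq_{ctx} E : A$. That is, coherence is a special case of Definition 5 where $E_1$ and $E_2$ are the same. At first glance, this appears underwhelming: of course E behaves the same as itself! The tricky part is that, if we expand it according to Definition 5, it is not E itself but all its translations $e_1$ and $e_2$ that behave the same!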

Related Work
Coherence. In calculi featuring coercive subtyping, a semantics that interprets the subtyping judgment by introducing explicit coercions is typically defined on typing derivations rather than on typing judgments. A natural question that arises for such systems is whether the semantics is coherent, i.e., distinct typing derivations of the same typing judgment possess the same meaning. Since Reynolds [45] proved the coherence of a calculus with intersection types, many researchers have studied the problem of coherence in a variety of typed calculi. Two approaches are commonly found in the literature. The first approach is to find a normal form for a representation of the derivation and show that normal forms are unique for a given typing judgment [8,15,47]. However, this approach cannot be directly applied to Curry-style calculi (where the lambda abstractions are not type annotated). Biernacki and Polesiuk [6] considered the coherence problem of coercion semantics. Their criterion for coherence of the translation is contextual equivalence in the target calculus. Inspired by this approach, Bi et al. [5] proposed the canonicity relation to prove coherence for a calculus with disjoint intersection types and BCD subtyping. As we have shown in Section 5, constructing a suitable logical relation for F + i is challenging. On the one hand, the original approach by Alpuim et al. [2] in F i does not work any more due to the addition of BCD subtyping. On the other hand, simply combining System F's logical relation with λ + i's canonicity relation does not work as expected, due to the issue of well-foundedness. To solve the problem, we employ immediate substitutions and a restriction to predicative instantiations.
BCD subtyping and decidability. The BCD type system was first introduced by Barendregt et al. [3] to characterize exactly the strongly normalizing terms. The BCD type system features a powerful subtyping relation, which serves as a base for our subtyping relation. The decidability of BCD subtyping has been shown in several works [27,38,41,51]. Laurent [28] formalized the relation in Coq in order to eliminate transitivity cuts from it, but his formalization does not deliver an algorithm. Only recently, Laurent [30] presented a general way of defining a BCD-like subtyping relation extended with generic contravariant/covariant type constructors that enjoys the "sub-formula property". Our Coq formalization extends the approach used in λ + i, which follows a different idea based on Pierce's decision procedure [38], with parametric (disjoint) polymorphism and corresponding distributivity rules. More recently, Muehlboeck and Tate [34] presented a decidable algorithmic system (proved in Coq) with union and intersection types. Similar to F + i, their system also has distributive subtyping rules. They also discussed the addition of polymorphism, but left a Coq formalization for future work. In their work they regard intersections of disjoint types (e.g., String & Int) as uninhabitable, which is different from our interpretation. As a consequence, coherence is a non-issue for them.
Intersection types, the merge operator and polymorphism. Forsythe [44] has intersection types and a merge-like operator. However, to ensure coherence, various restrictions were added to limit the use of merges. In Forsythe, merges cannot contain more than one function. Castagna et al. [12] proposed a coherent calculus λ& to study overloaded functions. λ& has a special merge operator that works on functions only. Dunfield proposed a calculus [16] (which we call λ,,) that shows significant expressiveness of type systems with unrestricted intersection types and an (unrestricted) merge operator. However, because of his unrestricted merge operator (allowing 1 ,, 2), his calculus lacks coherence. Blaauwbroek's λ ∨ ∧ [7] enriched λ,, with BCD subtyping and computational effects, but he did not address coherence. The coherence issue for a calculus similar to λ,, was first addressed in λ i [36] with the notion of disjointness, but at the cost of dropping unrestricted intersections and of adopting a strict notion of coherence (based on α-equivalence). Later Bi et al. [5] improved calculi with disjoint intersection types by removing several restrictions, and adopted BCD subtyping and a semantic notion of coherence (based on contextual equivalence) proved using canonicity.
The combination of intersection types, a merge operator and parametric polymorphism, while achieving coherence, was first studied in F i [2], which serves as a foundation for F + i. However, F i suffered the same problems as λ i. Additionally, in F i a bottom type is problematic due to interactions with disjoint polymorphism and the lack of unrestricted intersections. The issues can be illustrated with the well-typed F + i expression Λ(α * ⊥). λx : α. x ,, x. In this expression the type of x ,, x is α & α. Such a merge does not violate disjointness because the only types that α can be instantiated with are top-like, and top-like types do not introduce incoherence. In F i a type variable α can never be disjoint to another type that contains α, but (as the previous expression shows) the addition of a bottom type allows expressions where such a (strict) condition does not hold. In this work, we removed those restrictions, extended BCD subtyping with polymorphism, and proposed a more powerful logical relation for proving coherence. Figure 12 summarizes the main differences between the aforementioned calculi.
There are also several other calculi with intersections and polymorphism. Pierce proposed F ∧ [39], a calculus combining intersection types and bounded quantification. Pierce translates F ∧ to System F extended with products, but he left coherence as a conjecture. More recently, Castagna et al. [14] proposed a polymorphic calculus with set-theoretic type connectives (intersections, unions, negations). But their calculus does not include a merge operator. Castagna and Lanvin also proposed a gradual type system [13] with intersection and union types, but also without a merge operator.
Row polymorphism and bounded polymorphism. Row polymorphism was originally proposed by Wand [54] as a mechanism to enable type inference for a simple object-oriented language based on recursive records. These ideas were later adopted into type systems for extensible records [19,21,31]. Our merge operator can be seen as a generalization of record extension/concatenation, and selection is also built-in. In contrast to most record calculi, restriction is not a primitive operation in F + i, but can be simulated via subtyping. Disjoint quantification can simulate the lacks predicate often present in systems with row polymorphism. Recently Morris and McKinna presented a typed language [33], generalizing and abstracting existing systems of row types and row polymorphism. Alpuim et al. [2] informally studied the relationship between row polymorphism and disjoint polymorphism, but it would be interesting to study this relationship more formally. The work of Morris and McKinna may be interesting for such a study in that it gives a general framework for row type systems.
Bounded quantification is currently the dominant mechanism in major mainstream object-oriented languages supporting both subtyping and polymorphism. F <: [10] provides a simple model for bounded quantification, but type-checking in full F <: has been proved to be undecidable [40]. Pierce's thesis [39] discussed the relationship between calculi with simple polymorphism and intersection types and bounded quantification. He observed that there is a way to "encode" many forms of bounded quantification in a system with intersections and pure (unbounded) second-order polymorphism. That encoding can be easily adapted to F + i. The idea is to replace bounded quantification by (unrestricted) universal quantification and all occurrences of α by A & α in the body. Such an encoding seems to indicate that F + i could be used as a decidable alternative to (full) F <: . It is worthwhile to note that this encoding does not work in F i because A & α is not well-formed (α is not disjoint to A). In other words, the encoding requires unrestricted intersections.
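The encoding described above can be sketched as a recursive translation over a toy type syntax. This is an illustrative sketch only: the class names are hypothetical, unrestricted universal quantification is modelled here as disjointness with ⊤ (an assumption about the target syntax), and binder names are assumed to be distinct so that no variable capture can occur.

```python
from dataclasses import dataclass

# Hypothetical toy type syntax; names are illustrative only.
@dataclass(frozen=True)
class Int:
    pass

@dataclass(frozen=True)
class Top:
    pass

@dataclass(frozen=True)
class TVar:
    name: str

@dataclass(frozen=True)
class Arrow:
    dom: object
    cod: object

@dataclass(frozen=True)
class And:
    left: object
    right: object

@dataclass(frozen=True)
class BoundedForall:
    """Source type: ∀(var <: bound). body."""
    var: str
    bound: object
    body: object

@dataclass(frozen=True)
class DisjointForall:
    """Target type: ∀(var * disjoint). body."""
    var: str
    disjoint: object
    body: object

def encode(t, bounds=None):
    """Replace each bounded quantifier by an unrestricted one and each
    occurrence of a bounded variable α by A & α, where A is its bound.
    Assumes all binder names are distinct."""
    bounds = {} if bounds is None else bounds
    if isinstance(t, TVar):
        return And(bounds[t.name], t) if t.name in bounds else t
    if isinstance(t, Arrow):
        return Arrow(encode(t.dom, bounds), encode(t.cod, bounds))
    if isinstance(t, BoundedForall):
        bound = encode(t.bound, bounds)
        body = encode(t.body, {**bounds, t.var: bound})
        # Assumption: "unrestricted" quantification is written ∀(α * ⊤).
        return DisjointForall(t.var, Top(), body)
    return t  # base types such as Int are unchanged

# ∀(a <: Int). a → a   becomes   ∀(a * ⊤). (Int & a) → (Int & a)
source = BoundedForall("a", Int(), Arrow(TVar("a"), TVar("a")))
target = encode(source)
print(target)
```

The intersection `And(Int(), TVar("a"))` in the result is exactly the kind of type that needs unrestricted intersections, since the variable is not required to be disjoint from its former bound.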

Conclusion and Future Work
We have proposed F + i, a type-safe and coherent calculus with disjoint intersection types, BCD subtyping and parametric polymorphism. F + i improves the state of the art of compositional designs, and enables the development of highly modular and reusable programs. One interesting and useful further extension would be implicit polymorphism. For that we want to combine Dunfield and Krishnaswami's approach [17] with our bidirectional type system. We would also like to study the parametricity of F + i. As we have seen in Section 5.2, it is not at all obvious how to extend the standard logical relation of System F to account for disjointness, and avoid potential circularity due to impredicativity. A promising solution is to use step-indexed logical relations [1].
Figure 11 defines the canonicity relation for F + i. The canonicity relation is a family of binary relations over F co values that are heterogeneous, i.e., indexed by two F + i types. Two points are worth mentioning. (1) An apparent difference from λ + i's logical relation is that our relation is now indexed by source types. The reason is that the type translation function (Definition 1) discards disjointness constraints, which are crucial in our setting, whereas λ + i's type translation does not have information loss. (2) Heterogeneity allows relating values of different types, and in particular values whose types are disjoint. The rationale behind the canonicity relation is to combine equality checking from traditional (homogeneous) logical relations with disjointness checking. It consists of two relations: the value relation $\mathcal{V}[\![A; B]\!]$ relates closed values; and the expression relation $\mathcal{E}[\![A; B]\!]$ (defined in terms of the value relation) relates closed expressions.

Fig. 12: Summary of intersection calculi (legend: yes, no, syntactic coherence).
Fig. 11: The canonicity relation for F + i.
For two records to behave the same, their fields must behave the same. For two functions to behave the same, they are required to produce outputs related at $B_1$ and $B_2$ when given related inputs at $A_1$ and $A_2$. For the next two cases regarding intersection types, the relation distributes over the intersection constructor &. Of particular interest is the case for disjoint quantification. Notice that it does not quantify over arbitrary relations, but directly substitutes α with monotype t in $B_1$ and $B_2$. This means that our canonicity relation does not entail parametricity. However, it suffices for our purposes to prove coherence. Another notable point is that we keep the invariant that A and B are closed types throughout the relation, so we no longer need to consider type variables, which simplifies the definition considerably. Note that when one type is ⊥, two values are vacuously related because there simply are no values of type ⊥. We need to show that the relation is indeed well-founded. Let $|\cdot|_\forall$ and $|\cdot|_s$ be the number of ∀-quantifiers and the size of types, respectively. Consider the measure $\langle |\cdot|_\forall, |\cdot|_s \rangle$, where $\langle \ldots \rangle$ denotes lexicographic order. For the case of disjoint quantification, the number of ∀-quantifiers decreases. For the other cases, the measure of $|\cdot|_\forall$ does not increase, and the measure of $|\cdot|_s$ strictly decreases.
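The well-foundedness argument can be checked on a small example. The following sketch is illustrative only: the toy type classes and the helper names (`forall_count`, `size`, `measure`, `is_monotype`, `subst`) are assumptions, and the measure is computed for a single type index rather than for the pair of indices used by the relation. It shows that substituting a monotype may increase the size of a type while still decreasing the lexicographic measure.

```python
from dataclasses import dataclass

# Hypothetical toy AST for types; class names are illustrative only.
@dataclass(frozen=True)
class Int:
    pass

@dataclass(frozen=True)
class Top:
    pass

@dataclass(frozen=True)
class TVar:
    name: str

@dataclass(frozen=True)
class Arrow:
    dom: object
    cod: object

@dataclass(frozen=True)
class And:
    left: object
    right: object

@dataclass(frozen=True)
class Forall:
    """Models ∀(var * bound). body."""
    var: str
    bound: object
    body: object

def children(t):
    """Immediate sub-types of a type."""
    if isinstance(t, Arrow):
        return [t.dom, t.cod]
    if isinstance(t, And):
        return [t.left, t.right]
    if isinstance(t, Forall):
        return [t.bound, t.body]
    return []

def forall_count(t):
    """Number of ∀-quantifiers occurring in t."""
    own = 1 if isinstance(t, Forall) else 0
    return own + sum(forall_count(c) for c in children(t))

def size(t):
    """Number of AST nodes in t."""
    return 1 + sum(size(c) for c in children(t))

def measure(t):
    """Lexicographic measure; Python compares tuples lexicographically."""
    return (forall_count(t), size(t))

def is_monotype(t):
    """Monotypes are types without disjoint quantification."""
    return forall_count(t) == 0

def subst(t, var, c):
    """Compute [c/var]t. Assumes c is closed, so no capture can occur."""
    if isinstance(t, TVar):
        return c if t.name == var else t
    if isinstance(t, Arrow):
        return Arrow(subst(t.dom, var, c), subst(t.cod, var, c))
    if isinstance(t, And):
        return And(subst(t.left, var, c), subst(t.right, var, c))
    if isinstance(t, Forall):
        bound = subst(t.bound, var, c)
        body = t.body if t.var == var else subst(t.body, var, c)
        return Forall(t.var, bound, body)
    return t

# ∀(a * ⊤). a → a, instantiated with the monotype Int → Int.
poly = Forall("a", Top(), Arrow(TVar("a"), TVar("a")))
mono = Arrow(Int(), Int())
inst = subst(poly.body, "a", mono)

print(measure(poly), measure(inst))   # (1, 5) (0, 7)
print(measure(inst) < measure(poly))  # True
```

The size grows from 5 to 7, but the quantifier count drops from 1 to 0, so the pair decreases in the lexicographic order. With an impredicative instantiation the first component could grow instead, which is why the restriction to monotypes matters.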