Polarized Subtyping

Polarization of types in call-by-push-value naturally leads to the separation of inductively defined observable values (classified by positive types), and coinductively defined computations (classified by negative types), with adjoint modalities mediating between them. Taking this separation as a starting point, we develop a semantic characterization of typing with step indexing to capture observation depth of recursive computations. This semantics justifies a rich set of subtyping rules for an equirecursive variant of call-by-push-value, including variant and lazy records. We further present a bidirectional syntactic typing system for both values and computations that elegantly and pragmatically circumvents difficulties of type inference in the presence of width and depth subtyping for variant and lazy records. We demonstrate the flexibility of our system by systematically deriving related systems of subtyping for (a) isorecursive types, (b) call-by-name, and (c) call-by-value, all using a structural rather than a nominal interpretation of types.


Introduction
Subtyping is an important concept in programming languages because it simultaneously allows more programs to be typed and more precise properties of programs to be expressed as types. The interaction of subtyping with parametric polymorphism and recursive types is complex and, despite a lot of progress and research, not yet fully understood.
In this paper we study the interaction of subtyping with equirecursive types in call-by-push-value [51,52], which separates the language of types into positive and negative layers. This polarization elegantly captures that positive types classifying observable values are inductive, while negative types classifying (possibly recursive) computations are coinductive. It lends itself to a particularly simple semantic definition of typing using a mixed induction/coinduction [9,13,22]. From this definition, we can immediately derive a form of semantic subtyping [15,35,36]. Concretely, we realize the mixed induction/coinduction via step-indexing and carry out our metatheory in Brotherston and Simpson's system CLKID^ω of circular proofs [14]. This includes a novel proof that syntactic versions of typing and subtyping are sound with respect to our semantic definitions. While we also conjecture that subtyping is precise (in the sense of [53]), we postpone this more syntactic property to future work.
Because our foundation is call-by-push-value, a paradigm that synthesizes call-by-name and call-by-value based on the logical principle of polarization, we obtain several additional results in relatively straightforward ways. For example, both width and depth subtyping for variant and lazy records are naturally included. Furthermore, following Levy's interpretation of call-by-value and call-by-name functional languages into call-by-push-value, we extract subtyping relations and algorithms for these languages and prove them sound and complete. We also note that we can directly interpret the isorecursive types in Levy's original formulation of call-by-push-value [51].
We further provide a systematic notion of bidirectional typing that avoids some complexities that arise in a structural type system with variant and lazy records. The resulting decision procedure for typing is quite precise and suggests clear locations for noting failure of typechecking. The combination of equirecursive call-by-push-value with bidirectional typing achieves some of the goals of refinement types [24,34], which fit a structural system inside a generative type language.
Here we have considerably more freedom and less redundancy. However, we do not yet treat intersection types or polymorphism.
We summarize our main contributions:
1. A simple semantics for types and subtyping in call-by-push-value, interpreting positive types inductively and negative types coinductively, realized via step indexing (Sections 3 and 4)
2. A new decidable system of equirecursive subtyping for call-by-push-value including width and depth subtyping for variant and lazy records (Section 4)
3. A novel application of Brotherston and Simpson's system CLKID^ω [14] of circular proofs to give a particularly elegant and flexible soundness proof for subtyping (Section 5)
4. A system of bidirectional typing that captures a straightforward and precise typechecking algorithm (Section 6)
5. A simple interpretation of Levy's original isorecursive types for call-by-push-value [51] into our equirecursive setting (Section 7)
6. Subtyping rules for call-by-name and call-by-value, derived via Levy's translations of such languages into call-by-push-value (Section 8)
These are followed by a discussion of related work and a conclusion. Additional material and proofs are provided in an appendix.
The intuition is that positive types classify observable values v while negative types classify computations e.
τ⁺, σ⁺ ::= τ₁⁺ ⊗ τ₂⁺ | 1 | ⊕{ℓ : τ⁺_ℓ}_{ℓ∈L} | ↓σ⁻ | t⁺
τ⁻, σ⁻ ::= τ⁺ → σ⁻ | &{ℓ : σ⁻_ℓ}_{ℓ∈L} | ↑τ⁺ | s⁻
The usual binary product τ × σ splits into two: τ⁺ ⊗ σ⁺ for eager, observable products inhabited by pairs of values, and &{ℓ : σ⁻_ℓ}_{ℓ∈L} for lazy, unobservable records with a finite set L of fields we can project out. Binary sums are also generalized to variant record types ⊕{ℓ : τ⁺_ℓ}_{ℓ∈L}. These are not just a programming convenience but allow for richer subtyping: lazy and variant record types support both width and depth subtyping, whereas the usual binary products and sums support only the latter. For example, width subtyping means that ⊕{false : 1} is a subtype of bool⁺ = ⊕{false : 1, true : 1}, while 1 would not be a subtype of the usual binary 1 + 1. Neither is 1 a subtype of bool⁺, demonstrating the utility of variant record types with one label, such as ⊕{false : 1}. Similar examples exist for lazy record types. This way, we recover some of the benefits of refinement types without the syntactic burden of a distinct refinement layer.
The shift ↓σ − is inhabited by an unevaluated computation of type σ − (a "thunk"). Conversely, the shift ↑τ + includes a value as a trivial computation (a "return"). Levy [51] writes U B instead of ↓σ − and F A instead of ↑τ + . Finally, we model recursive types not by explicit constructors µα + . τ + and να − . σ − but by type names t + and s − which are defined in a global signature Σ. They may mutually refer to each other. We treat these as equirecursive (see Section 3) and we require them to be contractive, which means the right-hand side of a type definition cannot itself be a type name. Since we would like to directly observe the values of positive types, the definitions of type names t + = τ + are inductive. This allows inductive reasoning about values returned by computations. On the other hand, negative type definitions s − = σ − are recursive rather than coinductive in the usual sense, which would require, for example, stream computations to be productive. Because we do not wish to restrict recursive computations to those that are productive in this sense, they are "productive" only in the sense that they satisfy a standard progress theorem.
Next, we come to the syntax for values v of a positive type and computations e of a negative type. Variables x always stand for values and therefore have a positive type. We use j to stand for labels, naming fields of variant records or lazy records, where j · v injects value v into a sum with alternative labeled j and e.j projects field j out of the lazy record e. When we quantify over an (always finite) set of labels we usually write ℓ as a metavariable for the labels. In order to represent recursion, we use equations f = e in the signature, where f is a defined expression name, which we distinguish from variables, and all equations can mutually reference each other. An alternative would have been explicit fixed point expressions fix f. e, but this mildly complicates both typing and mutual recursion. Also, it seems more elegant to represent all forms of recursion at the level of types and expressions in the same manner. We also choose to fix a type for each expression name in a signature. Otherwise, each occurrence of f in an expression could potentially be assigned a different type, which strays into the domain of parametric polymorphism and intersection types. Following Levy, we do not allow names for values because this would add an undesirable notion of computation to values, and, furthermore, circular values would violate the inductive interpretation of positive types. As discussed in [51, Chapter 4], they could be added back conservatively under some conditions.

Dynamics
For the operational semantics, we use a judgment e → e′, defined inductively by rules that may reference a global signature Σ to look up the definitions of expression names f. In contrast, values do not reduce. Note that some computations, specifically λx. e, {ℓ = e_ℓ}_{ℓ∈L}, and return v, do not reduce and may be considered values in other formulations. Here, we call them terminal computations and use the judgment e terminal to identify them.
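The reduction rules themselves are not reproduced in the text above, so the following is only a minimal sketch of what such a step relation might look like, assuming the standard call-by-push-value reductions (unfolding a name, forcing a thunk, binding a returned value, applying a λ, projecting from a lazy record, and matching on an injection). The tuple encoding and the names subst, terminal, and step are hypothetical, and pairs and the unit match are omitted for brevity.

```python
# Hypothetical encoding of terms as nested tuples.
# Values:       ("var", x), ("unit",), ("inj", label, v), ("thunk", e)
# Computations: ("lam", x, e), ("app", e, v), ("ret", v), ("let", x, e1, e2),
#               ("force", v), ("match", v, {label: (x, e)}),
#               ("rec", {label: e}), ("proj", e, label), ("name", f)

def subst(term, x, v):
    """Replace variable x by the closed value v (closedness rules out capture)."""
    tag = term[0]
    if tag == "var":
        return v if term[1] == x else term
    if tag in ("unit", "name"):
        return term
    if tag == "inj":
        return ("inj", term[1], subst(term[2], x, v))
    if tag in ("thunk", "ret", "force"):
        return (tag, subst(term[1], x, v))
    if tag == "lam":
        return term if term[1] == x else ("lam", term[1], subst(term[2], x, v))
    if tag == "app":
        return ("app", subst(term[1], x, v), subst(term[2], x, v))
    if tag == "let":
        body = term[3] if term[1] == x else subst(term[3], x, v)
        return ("let", term[1], subst(term[2], x, v), body)
    if tag == "match":
        branches = {lab: (y, e if y == x else subst(e, x, v))
                    for lab, (y, e) in term[2].items()}
        return ("match", subst(term[1], x, v), branches)
    if tag == "rec":
        return ("rec", {lab: subst(e, x, v) for lab, e in term[1].items()})
    if tag == "proj":
        return ("proj", subst(term[1], x, v), term[2])
    raise ValueError(f"unknown term tag: {tag}")

def terminal(e):
    """Terminal computations: lambdas, lazy records, and returns."""
    return e[0] in ("lam", "rec", "ret")

def step(e, sig):
    """Return the unique e' with e -> e', or None if e is terminal or stuck."""
    tag = e[0]
    if tag == "name":                      # unfold a definition f = e from the signature
        return sig[e[1]]
    if tag == "force" and e[1][0] == "thunk":
        return e[1][1]
    if tag == "match" and e[1][0] == "inj" and e[1][1] in e[2]:
        y, body = e[2][e[1][1]]
        return subst(body, y, e[1][2])
    if tag == "app":
        if e[1][0] == "lam":
            return subst(e[1][2], e[1][1], e[2])
        inner = step(e[1], sig)            # otherwise reduce the function position
        return None if inner is None else ("app", inner, e[2])
    if tag == "proj":
        if e[1][0] == "rec":
            return e[1][1].get(e[2])
        inner = step(e[1], sig)
        return None if inner is None else ("proj", inner, e[2])
    if tag == "let":
        if e[2][0] == "ret":
            return subst(e[3], e[1], e[2][1])
        inner = step(e[2], sig)
        return None if inner is None else ("let", e[1], inner, e[3])
    return None
```

Because step is a function that returns None on terminal computations, both parts of the determinism lemma below hold for this sketch by construction: a computation has at most one successor, and no terminal computation steps.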
We will silently use simple properties of computations in the remainder of the paper which follow by straightforward induction.

Lemma 1 (Computation).
1. If e → e′ and e → e″ then e′ = e″.
2. It is not possible that both e → e′ and e terminal.

Some Sample Programs
Example 1 (Computing with Binary Numbers). We show some example programs for binary numbers in "little endian" representation (least significant bit first) and in standard form, that is, without leading zeros.
bin⁺ = ⊕{e : 1, b0 : bin, b1 : bin}
std⁺ = ⊕{e : 1, b0 : pos, b1 : std}
pos⁺ = ⊕{b0 : pos, b1 : std}
We expect the subtyping relationships pos ≤ std ≤ bin to hold, because every positive standard number is a standard number, and every standard number is a binary number. According to our definition and rules in Sections 3 and 5 these will hold semantically as well as syntactically.
We now show some simple definitions f : σ − = e.
The increment function on binary numbers implements the carry with a recursive call, which has to be wrapped in a let/return.
By subtyping, we also have inc : std → ↑std, for example, but not inc : bin → ↑bin since bin ≰ std. However, the definition could be separately checked against this type, which points towards an eventual need for intersection types.
The following incorrect version of the decrement function does not have the indicated desired type!
The error here is quite precisely located by the bidirectional type checker (see Section 6): when we inject b0 · x in the second branch it is not the case that x : pos as required for standard numbers! And, indeed, dec₀ (b1 · e · ⟨⟩) →* return (b0 · e · ⟨⟩), which is not in standard form. On the other hand, the fact that a branch for e · u is missing is correct because the type pos does not have an alternative for this label.
We can fix this problem by discriminating one more level of the input (which could be made slightly more appealing by a compound syntax for nested pattern matching).
Example 2 (Computing with Streams). We present an example of a type with mixed polarities: a stream of standard numbers with a finite amount of padding between consecutive numbers. The programmer's intent is for the stream to be lazy and infinite, i.e., no end-of-stream is provided. But because we do not restrict recursion, even a well-typed implementation may diverge and fail to produce another number. On the other hand, the padding must always be finite because the meaning of positive types is inductive. We present padded streams as two mutually dependent type definitions, one positive and one negative. Because our type definitions are equirecursive this isn't strictly necessary, and we could just substitute out the definition of pstream⁻. For our example, we also define a subtype with zero padding, as forcing a single padding label none between any two elements could also be expressed.
In zstream, we see the significance of variant record types with just one label: some. We exploit this in Section 7 to interpret isorecursive types into equirecursive ones. We have that zstream ≤ pstream, which means we can pass a stream with zero padding into any function expecting one with arbitrary padding. We now program two mutually recursive functions to create a stream with zero padding from a stream with arbitrary (but finite!) padding.
compress : (↓pstream) → zstream
omit : padding → zstream
compress = λs. let return np = force s in match np ( ⟨n, p⟩ ⇒ return ⟨n, some · thunk (omit p)⟩ )
omit = λp. match p ( none · p ⇒ omit p | some · s ⇒ compress s )

Example 3 (Omega). As a final example in this section we consider the embedding of the untyped λ-calculus. The untyped term under consideration is (λx. x x) (λx. x x). The first thing to notice is that this term is not even syntactically well-formed because x stands for a value, but in x x the function part needs to be an expression. Closely related is that the "usual" definition for the embedding of the untyped λ-calculus (see, for example, [42]), U = U → U, isn't properly polarized. So, we define it as U⁻ = (↓U) → U instead. Because our type definitions are equirecursive, the definitions of ω and Ω are both well-typed. Moreover, we also have ω : U and in fact the embedding of every untyped λ-term will have type U. We also observe that ω (thunk ω) →³ ω (thunk ω), which therefore represents a well-typed diverging term. Of course, f : U = f is also well-typed and reduces to itself in one step. Remarkably, with our notion of semantic typing we will see that Ω will have every type σ⁻ and not just U (Appendix B, Example 9)!

Semantic Typing
Our aim is to justify both typing and subtyping by semantic means. We therefore start with semantic typing of closed values and computations, written v ∈ τ⁺ and e ∈ σ⁻. From this we can, for example, define semantic subtyping for positive types: τ⁺ ⊆ σ⁺ holds if every value v ∈ τ⁺ also satisfies v ∈ σ⁺.
Conceptually, semantic typing is a mixed inductive/coinductive definition. Values are typed inductively, which yields the correct interpretation of purely positive types such as natural numbers, lists, or trees, describing finite data structures. Computations are typed coinductively because they include the possibility of infinite computation by unbounded recursion. While we assume we can observe the structure of values, computations e cannot be observed directly. Different notions of observation for computation would yield different definitions of semantic typing. For our purposes, since we want to allow unfettered recursion, we posit we can (a) observe the fact that a computation steps according to our dynamics, even if we cannot examine the computation itself, and (b) when a computation is terminal we can observe its behavior by applying elimination forms (for types τ⁺ → σ⁻ and &{ℓ : σ⁻_ℓ}_{ℓ∈L}) or by observing its returned value (for the type ↑τ⁺).
Besides capturing a certain notion of observability, our semantics incorporates the usual concept of type soundness, which is important both for implementations and for interpreting the results of computations. The relevant properties are:
Semantic Preservation (Theorem 1). If e ∈ σ⁻ and e → e′ then e′ ∈ σ⁻.
Semantic Progress (Theorem 2). If e ∈ σ⁻ then either e → e′ for some e′ or e is terminal (but not both). This captures the usual slogan that "well-typed programs do not go wrong" [55]. An implementation will not accidentally treat a pair as a function or try to decompose a function as if it were a pair.
Semantic Observation. If v ∈ τ⁺ then the structure of the value v is determined (inductively) by the type τ⁺. Similarly, a terminal computation e ∈ ↑τ⁺ must have the form e = return v with v ∈ τ⁺.
These combine to the following: if we start a computation for e ∈ ↑τ + then either e → * return v for an observable value v ∈ τ + after a finite number of steps, or e does not terminate. These are close to their usual syntactic analogues, but the fact that we do not rely on any form of syntactic typing is methodologically significant. For example, if we have a program that does not obey a syntactic typing discipline but behaves correctly according to our semantic typing, our results will apply and this program, in combination with others that are well typed, will both be safe (semantic progress) and return meaningfully observable results (semantic preservation and observation). This point has been made passionately by Dreyer et al. [28] and applied, for example, to trusted libraries in Rust [47]. Another example can be found in gradual typing [38,58]. As long as we can prove by any means that the "dynamically typed" portion of the program is semantically well-typed (even if not syntactically so), the combination is sound and can be executed without worry, returning a correctly observable result. A third example is provided by session types for message-passing concurrency [44]. While it is important to have a syntactic type discipline, processes in a distributed system may be programmed in a variety of languages some of which will have much weaker guarantees. Being able to prove their semantic soundness then guarantees the behavioral soundness of the composed system.
Semantic typing in the context of call-by-push-value is well-suited for encoding computational effects, such as input/output, memory mutation, nontermination, etc. Call-by-push-value was designed as a study for the λ-calculus with effects [51,Sec. 2.4], stratifying terms into values (which have no side-effects) and computations (which might). Through the lens of semantic typing, we can ensure behavioral soundness in the presence of effects.

Semantic Typing with Observation Depth
Despite the extensive work on mixed inductive and coinductive definitions [3,11,20,21,22,43,48,49,57,59,67], there is no widely accepted style for presenting such definitions and reasoning with them concisely in a mathematical language of discourse. With some regret, we therefore present our semantic definition by turning the coinductive part into an inductive one, following the basic idea underlying step indexing [7,8,10,27]. Since the coinduction has priority over the induction, arguments proceed by nested induction, first over the step index and second over the structure of the inductive definition. This representation of mixed definitions implies that reasoning over step indices has lexicographic priority over values.
An alternative point of view is provided by sized types [5,6]. Both sized types and step indexing employ the same concept of observation depth; however, for sized types, we would observe data constructors, whereas for step indexing we observe computation steps. General recursion is supported in our system because "productivity" in the negative layer means that computations can step rather than produce a data constructor. The step index is actually the (universally quantified) observation depth for a coinductively defined predicate. We do not index the (existentially quantified) size of the inductive predicate but use its structure directly since values are finite and become smaller. All step indices k, i and occasionally j range over natural numbers. We use three judgments:
1. e ∈_k σ⁻ (e has semantic type σ⁻ at index k)
2. e ∈_{k+1} σ⁻ (terminal e has semantic type σ⁻ at index k + 1)
3. v ∈_k τ⁺ (v has semantic type τ⁺ at index k)
They should be defined by nested induction, first on k and second on the structure of v/e, where part 2 can rely on part 1 for a computation that is not terminal. We write v < v′ when v is a strict subexpression of v′. The clauses of the definition can be found in Figure 1.
A few notes on these definitions. When expanding type definitions t = τ⁺ and s = σ⁻ we rely on the assumption that type definitions are contractive, so one of the immediately following cases will apply next.

Fig. 1 (clauses of the semantic typing definition that survive in this copy):
v ∈_k ↓σ⁻ iff v = thunk e and e ∈_k σ⁻ for some e
e ∈_0 σ⁻ always
e ∈_{k+1} σ⁻ iff (e → e′ and e′ ∈_k σ⁻) or (e terminal and e ∈_{k+1} σ⁻ as a terminal computation)

A number of variations on this definition are possible. A particularly interesting one avoids decreasing the step index unless recursion is unrolled [8,27,58] so sources of nontermination can be characterized more precisely. It may also be possible to keep the step index constant when analyzing a terminal computation of type ↑τ⁺. However, stripping the return constructor constitutes a form of observation, and therefore decreasing the index seems both appropriate and simplest.
The quantification over i ≤ k in the case of terminal computations of function type seems necessary because we need the relation to be downward closed so that it defines a deflationary fixed point [4,41]. Values and computations are then semantically well-typed if they are well-typed for all step indices.
Proof. By a routine nested induction on k and the structure of v/e where part 2 can appeal to part 1 when e is not terminal.
Here are some semantic types that can easily be verified (see Appendix B).

Properties of Semantic Typing
The properties of semantic preservation and progress follow immediately just by applying the definitions and Lemma 1, so we elide their proofs.
Theorem 1 (Semantic Preservation). If e ∈ σ⁻ and e → e′ then e′ ∈ σ⁻.
Theorem 2 (Semantic Progress). If e ∈ σ⁻ then either e → e′ for some e′ or e is terminal, but not both.

Subtyping
The semantics of subtyping is quite easy to express using semantic typing.
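The displayed definition is not reproduced here. A plausible reading, reconstructed from the later remark that t ≤ u is interpreted as t ⊆ u ("every value in t is also a value in u"), is that semantic subtyping is inclusion of the sets of semantically well-typed terms. The symbol ρ⁻ for the second negative type is our own choice of name:

```latex
\tau^+ \subseteq \sigma^+ \;\iff\; \forall v.\; v \in \tau^+ \implies v \in \sigma^+
\qquad\qquad
\sigma^- \subseteq \rho^- \;\iff\; \forall e.\; e \in \sigma^- \implies e \in \rho^-
```

Under this reading, an empty positive type is a semantic subtype of every positive type, since the implication on the left holds vacuously.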
We would now like to give a syntactic definition of subtyping that expresses an algorithm and show it both sound and complete with respect to the given semantic definition. The intuitive rules for subtyping shouldn't be surprising, although to our knowledge our formulation is original.

Empty and Full Types
A first observation is that τ + ⊆ σ + whenever τ + is an empty type, regardless of σ + , because the necessary implication holds vacuously. So we need an algorithm to determine emptiness of a positive type. For the most streamlined presentation (which is also most suitable for an implementation) we first put the signature into a normal form that alternates between structural types and type names.
A usual presentation of emptiness maintains a collection of recursive types in a context in order to do a kind of loop detection. For example, the type t = 1 ⊗ t is empty because we may assume that t is empty while testing 1 ⊗ t. Instead, we express this and similar kinds of arguments using valid circular reasoning. If one were to formalize it, it would be in CLKID^ω [14], although the succedent of any sequent is either empty or a singleton (as in CLJID^ω [12]).
We construct circular derivations for t empty where t is a positive type name. Note that negative types are never empty. We can form a valid cycle when we encounter a goal t empty as a proper subgoal of t empty. Since we fix a signature Σ once and for all before defining each judgment such as emptiness or subtyping, we omit the index Σ since it never changes. The rules can be found in Figure 2.
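The rules of Figure 2 are not reproduced in this text, so the following is a sketch of how such a circular emptiness check might be implemented, not the authors' algorithm. It assumes that a product is empty when either component is empty, that a variant record is empty when all of its alternatives are empty, and that the unit type and shifted negative types are never empty. The signature encoding and the name is_empty are hypothetical; the signature is taken to be in normal form, so every component of a structural type is a type name.

```python
# Hypothetical normal-form signature: each type name maps to one structural type
# whose components are again type names.
#   ("unit",)                 the type 1
#   ("times", t1, t2)         an eager pair t1 ⊗ t2
#   ("plus", {label: t})      a variant record
#   ("down", s)               a thunk of the negative type s

def is_empty(t, sig, assumed=frozenset()):
    """Search for a circular derivation of 't empty'.

    'assumed' holds the goals on the current path. Meeting one of them again
    means 't empty' occurs as a proper subgoal of itself, which closes a cycle.
    """
    if t in assumed:
        return True
    shape = sig[t]
    kind = shape[0]
    assumed = assumed | {t}
    if kind in ("unit", "down"):
        return False                      # 1 and thunks are always inhabited
    if kind == "times":
        return (is_empty(shape[1], sig, assumed)
                or is_empty(shape[2], sig, assumed))
    if kind == "plus":
        return all(is_empty(u, sig, assumed) for u in shape[1].values())
    raise ValueError(f"unknown type constructor: {kind}")

# Example signature, including the type t = 1 ⊗ t discussed above.
SIG = {
    "one": ("unit",),
    "t": ("times", "one", "t"),
    "nat": ("plus", {"z": "one", "s": "nat"}),
    "void": ("plus", {}),
    "u": ("plus", {"a": "t", "b": "void"}),
    "d": ("down", "some_negative_name"),
}
```

The sketch deliberately caches nothing between calls: an answer computed under assumptions is only valid relative to those assumptions, so a memoizing implementation would need to record results only for goals whose derivations are closed.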

Fig. 2. Circular Derivation Rules for Emptiness
Example 5. We continue Example 4, part (4), building a formal circular derivation. We first bring the signature into normal form.

Theorem 3 (Emptiness). If t empty then for all k and v, v ∉_k t.
Proof. We interpret the judgment t empty semantically as v ∈_k t ⊢ · (which expresses v ∉_k t in a sequent), where t is given and k and v are parameters and therefore implicitly universally quantified. The proof of this judgment is carried out in a circular metalogic. We translate each inference rule for t empty into a derivation for v ∈_k t ⊢ ·, where each unproven subgoal corresponds to a premise of the rule. When the derivation of t empty is closed by a cycle, the corresponding derivation of v ∈_k t ⊢ · is closed by a corresponding cycle in the metalogic. The cases can be found in Appendix D.
Next we symmetrically define what it means for a computation type σ⁻ to be full, namely that it is inhabited by every (semantically well-typed) computation. A simple example is the type &{ }, that is, the lazy record without any fields. It contains every well-typed expression because all projections (of which there are none) are well-typed. It turns out that fullness is directly defined from emptiness.
We may construct a derivation using the following rules. It could be circular, since the judgment t empty allows circular derivations.
We interpret s full as the entailment e ∈_k r ⊢ e ∈_k s. In other words, we assume that e is semantically well-typed at some r and use that to show that it is then also well-typed at the unrelated s.
Theorem 4 (Fullness). If s full then e ∈_k r implies e ∈_k s for all k, e, and r.

Proof. (see Appendix E)
Note that there is no rule that would allow us to conclude that

Syntactic Subtyping
The rules for syntactic subtyping build a circular derivation of t⁺ ≤ u⁺ and s⁻ ≤ r⁻. A circularity arises when a goal t ≤ u or s ≤ r arises as a subgoal strictly above a goal that is of one of these two forms. In general, we use t and u to stand for positive type names and s and r for negative type names without annotating those names. The polarity will also be clear from the context. Moreover, in the interest of saving space, we write t = τ⁺ and s = σ⁻ when these definitions are in the fixed global signature Σ. The rules can be found in Figure 3. In particular, we would like to highlight the rules that incorporate emptiness and fullness into syntactic subtyping, such as ⊥sub⁺ and ⊥sub⁻. For example, among other subtypings, the ⊥sub⁺ rule establishes t ≤ u whenever t = t₁ ⊗ t₂ and either t₁ empty or t₂ empty.

Example 6. We revisit Example 1 to show that pos ≤ std. We have annotated each subgoal from the variant record rule with the corresponding label; we have elided the reference to that rule in the derivation for lack of space. Again, we normalize the signature before running the algorithm.
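To make the role of cycles concrete, here is a sketch of a circular subtyping check for the purely positive fragment needed by Example 6. It is an illustration under simplifying assumptions rather than the rule set of Figure 3: it covers only the unit type and variant records, assumes a variant record is a subtype of another when each of its labels is present on the right with a subtype at that label, and omits the emptiness and fullness rules altogether (harmless here because none of the types involved is empty). The encoding and the name subtype are hypothetical.

```python
# Hypothetical normal-form signature for Example 1 (all types are non-empty).
SIG = {
    "one": ("unit",),
    "bin": ("plus", {"e": "one", "b0": "bin", "b1": "bin"}),
    "std": ("plus", {"e": "one", "b0": "pos", "b1": "std"}),
    "pos": ("plus", {"b0": "pos", "b1": "std"}),
}

def subtype(t, u, sig, assumed=frozenset()):
    """Search for a circular derivation of t ≤ u between positive type names.

    'assumed' holds the goals on the current path; meeting one of them again
    strictly above itself closes the derivation with a cycle.
    """
    if (t, u) in assumed:
        return True
    left, right = sig[t], sig[u]
    assumed = assumed | {(t, u)}
    if left[0] == "unit" and right[0] == "unit":
        return True
    if left[0] == "plus" and right[0] == "plus":
        # Width: every label on the left must also occur on the right.
        # Depth: the component types must be related at each shared label.
        return all(
            label in right[1] and subtype(tl, right[1][label], sig, assumed)
            for label, tl in left[1].items()
        )
    return False
```

On this signature the check accepts pos ≤ std and std ≤ bin, and rejects bin ≤ std because the b0 alternative would require bin ≤ pos, which fails on the label e.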

Fig. 3. Circular Derivation Rules for Subtyping
From a circular derivation we now construct a valid circular proof in an intuitionistic metalogic [12]. For example, t ≤ u is interpreted as t ⊆ u, that is, every value in t is also a value in u. We actually prove a slightly stronger theorem, namely that the step index on both sides can remain the same.
1. If t ≤ u then v ∈_k t ⊢ v ∈_k u for all k and v (and so, t ⊆ u).
2. If s ≤ r then e ∈_k s ⊢ e ∈_k r for all k and e (and so, s ⊆ r).
Proof. We proceed by a compositional translation of the circular derivation of subtyping into a circular derivation in the metalogic. For each rule we construct a derived rule on the semantic side with corresponding premises and conclusion.
When the subtyping proof is closed due to a cycle, we close the proof in the metalogic with a corresponding cycle. In order for this cycle to be valid, it is critical that the judgments in the premises of the derived rule are strictly smaller than the judgments in the conclusion. Since our mixed logical relation is defined by nested induction, first on the step index k and second on the structure of the value v or expression e, the lexicographic measure (k, v/e) should strictly decrease. Some sample cases can be found in Appendix F.
Besides soundness, reflexivity and transitivity of syntactic subtyping are two other properties that we prove for assurance that the syntactic subtyping rules are sensible and have no obvious gaps. These proofs can be found in Appendix G. Ligatti et al. [53] also consider a notion of preciseness as a syntactic means for judging the correctness of their syntactic subtyping rules. As they mention in [53, Sec. 6.2], this property is highly language-sensitive, depending on the choice of evaluation strategy (strict vs. nonstrict), where nonstrict subtyping relies on "which primitives are present in the language, sometimes in nonorthogonal ways." Moreover, preciseness requires syntactically well-typed counterexamples, whereas we also consider ill-typed terms. We can straightforwardly prove that syntactic subtyping for purely positive types (in relation to strict evaluation) is complete with respect to semantic subtyping. We leave the preciseness of syntactic subtyping of negative types for future consideration.

Syntactic Typing and Soundness
We now introduce a syntactic typing judgment, at the moment without regard to decidability. Such a judgment is often called declarative typing, in contrast with the algorithmic typing of Section 6 (Figure 4). We prove that all syntactically well-typed terms are also semantically well-typed. Conceptually, a declarative system is unnecessary because the bidirectional system is very closely related, and there are no problems in justifying the soundness of the bidirectional system directly with respect to our semantics. Besides the fact that there is a small amount of additional bureaucracy (the rules are divided between four judgments instead of two, and there are two additional rules), it is also the case that the standard versions of call-by-name and call-by-value use a similar form of declarative typing and are therefore easier to relate to our system in Section 8.
Because all declarations in a signature can be mutually recursive, each declaration f : σ − = e is checked assuming all other declarations are valid. The soundness proof below justifies this. The complete set of judgments and rules with their corresponding presuppositions can be found in Appendix H, Figures 7 and 8. For these rules, we need contexts Γ , defined as usual with the presupposition that all variables declared in a context are distinct.
Γ ::= · | Γ, x : τ⁺
The rules for the key judgments Γ ⊢ v : τ⁺ and Γ ⊢ e : σ⁻ can be obtained from the bidirectional rules in Section 6 by replacing both v ⇐ τ⁺ and v ⇒ τ⁺ with v : τ⁺ and, similarly, e ⇐ σ⁻ and e ⇒ σ⁻ with e : σ⁻. Moreover, one should drop the two annotation rules anno⁺ and anno⁻ because these are not in the source language for declarative typing.
We would like to show that the syntactic typing rules are sound with respect to their semantic interpretation. For that, we first define simultaneous substitutions θ of closed values for variables and θ ∈ k Γ for the semantic interpretation of contexts as sets of substitutions at step index k.
On the semantic side, we define
We now can prove a number of lemmas, one for each syntactic typing rule. A representative selection of the lemmas, each written as an admissible rule for semantic typing, can be given by:
The proofs are somewhat interesting: some require induction on k, others follow more directly by definition. Due to a lack of space, the proofs can be found in Appendix I, each admissible rule formulated as a separate lemma.
We construct a circular proof based on the typing derivation, and the typing derivations for all definitions f : σ⁻ = e ∈ Σ. There are three kinds of cases (see Appendix I for samples of each):
1. The case of variables x follows by assumption on θ.
2. In the case of names f : σ⁻ = e ∈ Σ we either expand to e or close the proof with a cycle if we have expanded f already.
3. All other rules follow by the lemmas presented above.
In all these lemmas the step index remains constant for the premises, which is important so we can form a circular proof in the case of names.
Because soundness is stated for all θ, Γ, and k, we can immediately obtain corollaries such as that · ⊢ v : τ⁺ implies that v ∈ τ⁺, and that · ⊢ e : σ⁻ implies that e ∈ σ⁻.

Bidirectional Typing
We now shift from our declarative typing system to an algorithmic one that describes a practical decision procedure. We choose to express it as a bidirectional typechecking algorithm, particularly to avoid inference issues regarding subsumption [45] and our extensive use of type names and variant records. Moreover, bidirectional typing is quite robust with respect to language extensions where various inference procedures are not. Bidirectional typechecking [66] has been a popular choice for presenting algorithmic typing, especially when concerned with subtyping [30], and is decidable for a wide range of rich type systems. This approach splits each of the typing judgments, Γ ⊢ v : τ⁺ and Γ ⊢ e : σ⁻, into checking (⇐) and synthesis (⇒) judgments for values and expressions, respectively. We follow the recipe laid out by [25,32]: introduction rules check and elimination rules synthesize. More precisely, the principal judgment, premise or conclusion, has the connective being introduced by checking or eliminated by synthesis.
We introduce two new syntactic forms, annotated values $(v : \tau^+)$ and annotated computations $(e : \sigma^-)$, which exist purely for typechecking purposes and are erased before evaluation. These annotations are not actually used in any of our examples because definitions in the signature already require annotations.
Applying the recipe, we can easily convert our declarative rules into bidirectional ones, as laid out in Section 5. The only rules we add to the system are $\mathsf{anno}^+$ and $\mathsf{anno}^-$, which allow us to prove completeness. All the examples in Section 2.2 check with these rules and only require type annotations at the top level of the declarations in the signature.
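The recipe (introduction rules check, elimination rules synthesize, annotations switch from checking to synthesis) can be illustrated with a minimal sketch in Python. It is not the CBPV system of this paper: it covers only a simply typed fragment with a unit type and functions, the term encoding and the names `synth` and `check` are hypothetical, and plain type equality stands in for the subtyping test that the subsumption rules would use.

```python
# Types: "unit" or ("arrow", argument_type, result_type).
# Terms: ("var", x), ("unit",), ("lam", x, body), ("app", fun, arg),
#        ("anno", term, type).  All of this encoding is illustrative only.

def synth(ctx, term):
    """Synthesis: return the type of `term`, or raise TypeError."""
    tag = term[0]
    if tag == "var":
        if term[1] not in ctx:
            raise TypeError(f"unbound variable {term[1]!r}")
        return ctx[term[1]]
    if tag == "anno":
        # An annotation lets a checkable term be used where synthesis is needed.
        _, inner, ty = term
        check(ctx, inner, ty)
        return ty
    if tag == "app":
        # Elimination rule: synthesize the function, then check the argument.
        _, fun, arg = term
        fun_ty = synth(ctx, fun)
        if not (isinstance(fun_ty, tuple) and fun_ty[0] == "arrow"):
            raise TypeError("applied term does not have a function type")
        check(ctx, arg, fun_ty[1])
        return fun_ty[2]
    raise TypeError(f"cannot synthesize a type for a {tag!r} term without an annotation")

def check(ctx, term, ty):
    """Checking: succeed silently if `term` has type `ty`, else raise TypeError."""
    tag = term[0]
    if tag == "lam":
        # Introduction rule: check the body against the result type.
        if not (isinstance(ty, tuple) and ty[0] == "arrow"):
            raise TypeError("lambda checked against a non-function type")
        _, x, body = term
        check({**ctx, x: ty[1]}, body, ty[2])
    elif tag == "unit":
        if ty != "unit":
            raise TypeError("unit value checked against a non-unit type")
    else:
        # Subsumption: synthesize, then compare (equality replaces subtyping here).
        if synth(ctx, term) != ty:
            raise TypeError("synthesized type does not match the expected type")

identity = ("lam", "x", ("var", "x"))
unit_to_unit = ("arrow", "unit", "unit")
check({}, identity, unit_to_unit)  # succeeds: a lambda is checked, not synthesized
```

In this sketch an unannotated lambda can only be checked, so applying it directly requires wrapping it in an `("anno", ...)` node, which mirrors the role annotations play in the completeness argument.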
Due to our use of equirecursive types, the implementation of this system can closely follow the structure of the rules in Figures 2, 3, and 4. First, as mentioned in Section 4.1, we convert the signature into a normal form that alternates structural types and type names. Then, we determine all the empty type names using a memoization table for $t^+$ empty to easily construct circular derivations of emptiness (bottom-up) using the rules in Figure 2. If constructing such a derivation fails then $t^+$ is nonempty. Fullness is derived from emptiness non-recursively. From there, we build a memoization table for $t^+ \le u^+$ and $s^- \le r^-$, for positive and negative type names, so we can construct circular derivations of subtyping between names (also bottom-up). This happens lazily, only computing $t^+ \le u^+$ or $s^- \le r^-$ if typechecking requires this information.
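The emptiness step can be sketched in Python as follows. This is not the memoized bottom-up construction of circular derivations described above: it swaps in a least-fixed-point computation of the inhabited names and takes the complement, which should agree with a circular emptiness derivation under the rules assumed here. Those assumed rules are: a pair is empty if either component is empty, a variant record is empty if all of its branches are empty, and the unit type and shifted computations are never empty. The signature encoding and the function names are hypothetical.

```python
# A signature in normal form maps each positive type name to a structural type
# whose components are type names (encoding assumed for this sketch):
#   ("unit",)                     the unit type
#   ("tensor", t1, t2)            pairs of t1 and t2
#   ("variant", {label: t, ...})  variant record
#   ("down", s)                   thunked computation of negative type s

def inhabited_names(signature):
    """Least fixed point: the names that have at least one (finite) value."""
    inhabited = set()
    changed = True
    while changed:
        changed = False
        for name, ty in signature.items():
            if name in inhabited:
                continue
            tag = ty[0]
            if tag in ("unit", "down"):
                ok = True  # the unit value and thunks always exist
            elif tag == "tensor":
                ok = ty[1] in inhabited and ty[2] in inhabited
            elif tag == "variant":
                ok = any(t in inhabited for t in ty[1].values())
            else:
                raise ValueError(f"unknown structural type {tag!r}")
            if ok:
                inhabited.add(name)
                changed = True
    return inhabited

def empty_names(signature):
    """Names with no values at all: the complement of the inhabited names."""
    return set(signature) - inhabited_names(signature)

example = {
    "one": ("unit",),
    "nat": ("variant", {"zero": "one", "succ": "nat"}),
    "inf": ("variant", {"cons": "cell"}),  # only infinite "values": empty
    "cell": ("tensor", "nat", "inf"),
    "void": ("variant", {}),
}
print(sorted(empty_names(example)))
```

A name such as `inf` is classified as empty because every attempt to build a value of it loops back to `inf` itself, which is the situation that a circular derivation of emptiness closes with a cycle.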
Bidirectional typing, given subtyping, follows the rules in Figure 4, including the rules for positive and negative subsumption, but it requires that the types in annotations are also translated to normal form, possibly introducing new (user-invisible) definitions in the signature.
The theorems (with straightforward proofs) for soundness and completeness of bidirectional typechecking can be found in Appendix J (Theorems 12 and 13).

Interpretation of Isorecursive Types
Our system uses equirecursive types, which allow many subtyping relations since there are no term constructors for folding recursive types. Moreover, equirecursive types support the normal form where constructors are always applied to type names (see Section 4.1), simplifying our algorithms, their description and implementations. Most importantly, perhaps, equirecursive types are more general because we can directly interpret isorecursive types, which are embodied by fold and unfold operators, into our equirecursive setting and apply our results.
We give a short sketch here; details can be found in Appendix K. For every recursive type $\mu\alpha^+.\,\tau^+$ we introduce a definition $t^+ = \oplus\{\mathsf{fold}_\mu : [t/\alpha]\tau\}$. Similarly, for every corecursive type $\nu\alpha^-.\,\sigma^-$ we introduce a definition $s^- = \&\{\mathsf{fold}_\nu : [s/\alpha]\sigma\}$. Now, the labels $\mathsf{fold}_\mu$ and $\mathsf{fold}_\nu$ tagging the sole choice of a unary variant or lazy record, respectively, play exactly the role that the fold constructor plays for recursive types. This entirely straightforward translation is enabled by our generalization of the binary sum and lazy pairs to variant and lazy records, respectively, so we can use them in their unary form.
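The positive half of this translation can be sketched in Python. The type encoding, the generated names `t0`, `t1`, and so on, and the function name `translate` are all hypothetical; the sketch does not produce the alternating normal form of Section 4.1, and the negative case (with a unary lazy record in place of the unary variant) is analogous and omitted.

```python
from itertools import count

# Isorecursive positive types (encoding assumed for this sketch):
#   ("unit",), ("tensor", a, b), ("variant", {label: type}),
#   ("mu", alpha, body), ("var", alpha)
# Output types additionally use ("name", t) for names defined in the signature.

def translate(ty, signature, env, fresh):
    """Translate `ty`, adding one definition to `signature` per mu-type."""
    tag = ty[0]
    if tag == "var":
        # A bound type variable becomes the name introduced for its binder.
        return ("name", env[ty[1]])
    if tag == "unit":
        return ty
    if tag == "tensor":
        return ("tensor",
                translate(ty[1], signature, env, fresh),
                translate(ty[2], signature, env, fresh))
    if tag == "variant":
        return ("variant",
                {label: translate(t, signature, env, fresh)
                 for label, t in ty[1].items()})
    if tag == "mu":
        _, alpha, body = ty
        name = f"t{next(fresh)}"
        # t = (+){fold : [t/alpha] body}: a unary variant tagged by "fold".
        unfolded = translate(body, signature, {**env, alpha: name}, fresh)
        signature[name] = ("variant", {"fold": unfolded})
        return ("name", name)
    raise ValueError(f"unknown type constructor {tag!r}")

# nat = mu a. (+){zero : 1, succ : a}
nat = ("mu", "a", ("variant", {"zero": ("unit",), "succ": ("var", "a")}))
signature = {}
print(translate(nat, signature, {}, count()))
print(signature)
```

The single `fold` label in each generated definition is what a value-level fold constructor would be mapped to, so folding becomes injection into the unary variant.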

Call-by-Name and Call-by-Value
More familiar than call-by-push-value (CBPV) are the lazy, call-by-name (CBN) and eager, call-by-value (CBV) operational semantics that underlie the Haskell and ML families of functional programming languages. Levy [52] has shown that both CBN and CBV exist as fragments of CBPV, exhibiting translations from CBN and CBV types and terms into the CBPV language. In this section, we derive systems of subtyping for CBN and CBV from these translations into ours and prove them sound and complete. We discover that they are minor variants of existing systems for CBN [39] and CBV [53] subtyping.
Because polarized subtyping is able to connect Levy's translations with existing systems for CBN and CBV subtyping, it serves as further evidence that those prior translations and our subtyping rules are, in some sense, canonical. Moreover, it is yet one more piece of evidence that CBPV is an effective synthesis of evaluation orders in which to study the theory of functional programming.

Call-by-name
Consider a CBN language with the following types. The language of terms and the standard statics and dynamics can be found in Appendix L.
In this section, we will focus on function types $\tau \to \sigma$ and variant record types $\oplus\{\ell : \tau_\ell\}_{\ell \in L}$ and their corresponding terms. Levy [52] presents translations from CBN types and terms to CBPV negative types and expressions, respectively. An auxiliary translation on contexts, which prefixes the translation with a $\downarrow$ shift, is also used. Here, we elide the translation of terms other than variables and the terms for function and variant record types; the full translation on terms can be found in [52]. We also translate each type name $t$ to a fresh type name, translating the body of $t$'s definition and inserting additional type names as required for the normal form that alternates between structural types and type names. Levy [52] proves that well-typed terms remain well-typed after the translation to CBPV is applied. Our syntactic typing rules are the same, so the theorem carries over to our setting. We adapt the subtyping system of Gay and Hole [39] from the π-calculus to a λ-calculus, which reverses the direction of subtyping from their classical system, and we add empty records, obtaining the CBN syntactic subtyping rules shown in Figure 5.
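The type-level part of the CBN translation can be sketched in Python for the two type formers this section focuses on. The clauses used are the ones we read off the discussion of shifts in this section: a function type translates to a function whose argument is the translated argument type under a $\downarrow$ shift, and a variant record translates to an $\uparrow$ shift around a variant whose branches are each under a $\downarrow$ shift. The tuple encoding, the function name `cbn_translate`, and the `_bar` suffix for fresh names are hypothetical.

```python
# CBN types (encoding assumed for this sketch):
#   ("arrow", tau, sigma), ("variant", {label: tau}), ("name", t)
# CBPV types produced:
#   negative: ("arrow", positive, negative), ("up", positive), ("name", t_bar)
#   positive: ("down", negative), ("variant", {label: positive})

def cbn_translate(ty):
    """Translate a CBN type to a CBPV negative type."""
    tag = ty[0]
    if tag == "arrow":
        # Function type: shift the translated argument down to a positive type.
        return ("arrow", ("down", cbn_translate(ty[1])), cbn_translate(ty[2]))
    if tag == "variant":
        # Variant record: each branch is a thunk; the whole variant is returned.
        branches = {label: ("down", cbn_translate(t)) for label, t in ty[1].items()}
        return ("up", ("variant", branches))
    if tag == "name":
        # A CBN type name maps to a fresh CBPV name (naming scheme is made up).
        return ("name", ty[1] + "_bar")
    raise ValueError(f"unknown CBN type constructor {tag!r}")

empty_variant = ("variant", {})
print(cbn_translate(empty_variant))
print(cbn_translate(("arrow", empty_variant, ("name", "t"))))
```

The first example shows why the empty CBN variant is not translated to an empty type: its image is a negative type with an $\uparrow$ shift on the outside.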
These rules introduce a CBN syntactic subtyping judgment $t \le u$. To distinguish it from CBPV syntactic subtyping, we will take care in this section to always include superscript pluses and minuses for CBPV type names, with CBN type names being unmarked. As for CBPV syntactic subtyping, the rules for CBN subtyping shown in Figure 5 build a circular derivation. Just as before, a circularity arises when a goal $t \le u$ arises as a proper subgoal of itself. These rules are exact analogues of those of Gay and Hole [39], with one exception: the three rules involving empty variants and records, namely $\bot\mathsf{sub}_n$, $\top\mathsf{sub}_n$, and $\&\mathsf{full}_n$, have no analogues in [39], only because their language did not include the corresponding empty internal and external choice types.
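How a circular derivation is built when a goal reappears as its own subgoal can be sketched in Python. This is a generic coinductive check in the style of Gay and Hole, not a transcription of Figure 5: it covers only function types (contravariant in the argument, covariant in the result) and variant records (width and depth), and it omits the rules for empty and full types. The signature encoding and the name `subtype` are hypothetical.

```python
# A signature maps each type name to a structural type over names
# (encoding assumed for this sketch):
#   ("arrow", t1, t2)             function type
#   ("variant", {label: t, ...})  variant record

def subtype(sig, t, u, assumed=frozenset()):
    """Decide t <= u between type names, closing repeated goals with a cycle."""
    if (t, u) in assumed:
        return True  # the goal is a proper subgoal of itself: close the cycle
    assumed = assumed | {(t, u)}
    a, b = sig[t], sig[u]
    if a[0] == "arrow" and b[0] == "arrow":
        # Contravariant in the argument, covariant in the result.
        return (subtype(sig, b[1], a[1], assumed)
                and subtype(sig, a[2], b[2], assumed))
    if a[0] == "variant" and b[0] == "variant":
        # Width: every label of the subtype occurs in the supertype.
        # Depth: corresponding branches are related.
        return all(label in b[1] and subtype(sig, s, b[1][label], assumed)
                   for label, s in a[1].items())
    return False

sig = {
    "z": ("variant", {}),
    "nat": ("variant", {"zero": "z", "succ": "nat"}),
    "even": ("variant", {"zero": "z", "succ": "odd"}),
    "odd": ("variant", {"succ": "even"}),
    "f": ("arrow", "nat", "even"),
    "g": ("arrow", "even", "nat"),
}
print(subtype(sig, "even", "nat"), subtype(sig, "nat", "even"))
print(subtype(sig, "f", "g"), subtype(sig, "g", "f"))
```

The sketch threads the set of goals in progress through the recursion instead of using a shared memoization table. That keeps it short and avoids caching a result that depended on an assumption which later fails, at the cost of recomputing some goals.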
As we will prove below, the CBN subtyping rules in Figure 5 are exactly those for which $t \le u$ holds in the CBN language if and only if the translations of $t$ and $u$ are related by subtyping in the CBPV metalanguage. We thereby show that our polarized subtyping on the image of Levy's CBN translation is sound and complete with respect to Gay and Hole's CBN subtyping.
Before proceeding to those proofs, it is worth pointing out that many of these CBN subtyping rules exactly follow CBPV, with a few notable differences. First, the $\oplus\mathsf{sub}_n$ rule does not permit empty branches that do not occur in the supertype. This is because the $\downarrow$ shifts that appear in the translation of $\oplus\{\ell : \tau_\ell\}_{\ell \in L}$ prevent each branch from being empty: there is no emptiness rule for $\downarrow$ shifts in the CBPV subtyping. Second, for this CBN language, only types $t = \&\{\,\}$ are full. In particular, a CBN function type $t = t_1 \to t_2$ is never full, even though a CBPV function type $s^- = t_1^+ \to s_2^-$ is full if the argument type $t_1^+$ is empty. This stems from the $\downarrow$ shift that the translation of $\tau \to \sigma$ places on the translated argument type. Third, the reader may be surprised by the omission of an emptiness judgment for CBN types. The $\bot\mathsf{sub}_n$ rule mentions the CBN type $t = \oplus\{\,\}$, which looks like it ought to be an empty type; the CBPV type $t_0^+ = \oplus\{\,\}$ is empty, after all.
Yes, but the CBN translation of $t = \oplus\{\,\}$ is in fact the negative type $\uparrow\oplus\{\,\}$, and negative types are never empty. Nevertheless, $\uparrow\oplus\{\,\}$ is a subtype of the translation of $u$ in this case. Now we prove that polarized subtyping on the image of Levy's CBN embedding is sound and complete with respect to the CBN subtyping rules of Figure 5. The proofs can be found in Appendix L.
1. If $t$ full, then the translation of $t$ is full.
1. If the translation of $t$ is full, then $t$ full.

Call-by-Value
We can play through a similar procedure for Levy's CBV translation. Consider a CBV language with the following types. The language of terms, typing rules, and standard dynamics can be found in Appendix M.
The translations that Levy [52] presents from CBV types and terms to CBPV positive types and expressions are as follows. We only present the translation of variables, function abstractions, and function applications; the full translation on terms can be found in [52].
We also translate each type name $t$ to a fresh type name, translating the body of $t$'s definition and inserting additional type names as required for the normal form that alternates between structural types and type names. Levy proves that well-typed terms translate to well-typed expressions. Because our syntactic typing rules are the same as his, his theorem carries over.
We adapt the CBV subtyping system of Ligatti et al. [53] to our setting, which means that we include variants and lazy records with width and depth subtyping and replace isorecursive with equirecursive types. We obtain the syntactic subtyping rules shown in Figure 6. Once again, we will take care to distinguish the CBV syntactic subtyping judgment, $t \le u$, from CBPV syntactic subtyping by marking CBPV type names with pluses and minuses. The rules shown in Figure 6 build circular derivations. These rules match those of Ligatti et al., with one minor exception that we will detail below. As we will prove, these rules are exactly those for which $t \le u$ holds in the CBV language if and only if the translations of $t$ and $u$ are related by subtyping in the CBPV metalanguage.
Before proceeding to the proofs, a few remarks about these rules. First, unlike the CBN $\oplus\mathsf{sub}_n$ rule, the $\oplus\mathsf{sub}_v$ rule here includes the possibility that some components of a variant record type may be empty. More generally, the differences between CBN and CBV subtyping arise from the differences in emptiness and fullness between the two calculi. Emptiness and fullness are quite sensitive to the eager/lazy distinction between the two evaluation strategies. Because this distinction manifests in almost every layer of a complex type, the two subtyping systems diverge more than one might expect.
Second, besides the adaptations mentioned above, the rules of Figure 6 differ from those of Ligatti et al. in one respect: their system includes a more general rule for full types. Somewhat unexpectedly, polarized subtyping on the image of Levy's CBV translation would be incomplete with respect to this more general rule. This is because the $\downarrow$ shift inserted by Levy's translation acts as a barrier to fullness: "$t \le u$ if $u = \downarrow r$ and $r$ full" would be unsound in polarized subtyping. For example, Ligatti et al. have $1 \le 0 \to 1$ for an empty type $0$, but the translation of $1$ (which is $1$) is not a subtype of $\downarrow(0 \to \uparrow 1)$ (the translation of $0 \to 1$), because the unit value does not have type $\downarrow(0 \to \uparrow 1)$. This phenomenon is primarily of theoretical interest since it is confined to functions that can never be applied to any arguments and to empty records (and only when they are compared against CBV pair types, the unit type $1$, and variant record types $\oplus\{\ell : t_\ell\}_{\ell \in L}$). Nevertheless, we conjecture that a more differentiated translation of types and terms could restore completeness.
These observations notwithstanding, we can prove that the CBV subtyping rules of Figure 6 are sound and complete with respect to the subtyping rules for CBPV under Levy's translation. The proofs can be found in Appendix M.
1. If $t$ empty, then the translation of $t$ is empty.
1. If the translation of $t$ is empty, then $t$ empty.

Related Work and Discussion
We now dive deeper into research related to our underlying theme on how polarization affects the interaction and definition of subtyping with recursive types across varying interpretations.
Subtyping Recursive Types. The groundwork for coinductive interpretations of subtyping equirecursive types has been laid by Amadio and Cardelli [9], subsequently refined by others [13,37]. Danielsson and Altenkirch [22] also provided significant inspiration since they formally clarify that subtyping recursive types relies on a mixed induction/coinduction. In using an equirecursive presentation within different calculi, our work has been influenced by its predominant use in session types [19,23,40] and, in particular, Gay and Hole's coinductive subtyping algorithm [39], which we take as a template for call-by-name typing.
Another important influence has been the work on refinement types [24,34] which are also recursive but exist within predefined universes of generative types. As such, subtyping relations are simpler in their interactions, but face many of the same issues such as emptiness checking. One can see this paper as an attempt to free refinement types from some of their restrictions while retaining some of their good properties. The key ingredients are (1) explicitly separating values from computations via polarization, (2) the introduction of variant and lazy records and their width and depth subtyping rules (owing much to [68]), and (3) simple bidirectional typechecking. What is still missing is the use of intersections and unions that allow subtyping to propagate more richly to higher-order types [31].
Our treatment of empty (that is, value-uninhabited) and full types in Section 4.1, as well as our call-by-value interpretation in Section 8.2, builds on Ligatti et al.'s work [53] on precise subtyping with isorecursive types.
Our direct interpretation of isorecursive types and translation into an equirecursive setting furthers numerous works either comparing or relating both formulations [65,71,72]. In particular, Abadi and Fiore [1] and more recently Patrignani et al. [61] prove that terms in the equirecursive setting can be typed in the isorecursive one (and vice versa) with varying approaches. The former treats type equality inductively and is focused on syntactic considerations. The latter treats type equality coinductively and analyzes types semantically. Neither of these handles subtyping or mixed coinductive/inductive types as in our study.
Finally, Zhou et al. [74] serves as a helpful overview paper on subtyping recursive types at large and discusses how Ligatti et al.'s complete set of rules requires very specific environments for subtyping, as well as non-standard subtyping rules. This observation demonstrates why our semantic typing/subtyping approach can offer a more flexible abstraction for reasoning about expressive type systems while maintaining type safety.
Semantic Typing and Subtyping. Semantic typing goes back to Milner's semantic soundness theorem [55], which defined a well-typed program as being semantically free of type violations. Whereas syntactic typing specifies a fixed set of syntactic rules from which safe terms can be constructed, semantic typing here combines two requirements: positive types circumscribe observable values, exposing their structure, and computations of negative types are only required to behave in a safe way. As we demonstrate throughout Section 5, we can prove our semantic definitions compatible with our syntactic typing rules, leaving syntactic type soundness to fall out easily (Theorem 6).
Milner's initial model didn't scale well to richer types, like recursive types. With a lens toward more expressive systems, step indexing has become a prominent approach [7,8,10,27], which we use to observe that a computation in our model steps according to our dynamics.
As with syntactic and semantic typing, syntactic subtyping is the more typical approach to modeling subtyping relations compared with its semantic counterpart. Nonetheless, in a line of work that has run almost in parallel to the research on semantic types, research on semantic subtyping has also made strides [35,15,64]. Mainly, this work exploits semantic subtyping for developing type systems based on set-theoretic subtyping relations and properties, particularly in the context of handling richer types, including polymorphic functions [17,16,63] and variants [18], recursive types (interpreted coinductively), and union, intersection, and negation connectives [36]. A major theme in this line of work is excising "circularity" [15,36] by means of an involved bootstrapping technique, as issues arise when the denotation of a type is defined simply as the set of values having that type.
We depart from this line of research in the treatment of functions (defined computationally rather than set-theoretically), recursive types (equirecursive setting; inductive for the positive layer and coinductive for the negative layer), both variant and lazy record types, and the commitment to explicit polarization (including our incorporation of emptiness/fullness). The last of these eliminates circularity and ties together multiple threads defined in this study.
With this combination of semantic typing and subtyping, our work provides a metatheory for a more interesting set of typed expressions while also providing a stronger and more flexible basis for type soundness [28], as semantic typing can reason about syntactically ill-typed expressions as long as those expressions are semantically well-typed. This combination scales well to our polarized, mixed setting and focus on subtyping in the presence of recursive types.
Polarized Type Theory and Call-by-Push-Value. At the core of this work has been the call-by-push-value [51,52] (CBPV) calculus with its notions of values, computations, and the shifts between them. Beyond Levy's work, this subsuming paradigm has formed the foundation of much recent research, ranging from probabilistic domains [33] to those reasoning about effects [54] and dependent types [62]. New et al.'s [58] gradual typing extension to the calculus shares similarities with our use of step indexing, but its relations (binary rather than unary), dynamics, and step-counting are treated differently, and its goals are very different as well, including no coverage of subtyping.
To our knowledge, there are no direct treatments of subtyping recursive types in a CBPV system or applying a full semantic typing approach in this context with subtyping. It is, as we've shown, a fruitful setting for our investigation since the explicit polarization of the language mirrors the mixed reasoning required to analyze the subtyping.
Though CBPV and polarized type theory typically go hand-in-hand, there are investigations that look at polarization (focusing) and algebraic typing and subtyping from alternate perspectives. Steffen [70] predates Levy's research and presents polarity as a kinding system for exploiting monotone and antimonotone operators in subtyping function application. Abel [2] built upon this and extended it with sized types. The inherent connection between types and evaluation strategy has also been studied in the setting of program synthesis [69] and proof theory [56], but these do not share our specific semantic concerns.
Polarization as an organizing principle for subtyping is present in Zeilberger's thesis [73], but addresses a problem that is fundamentally different in multiple ways, e.g. using "classical" types and continuations, and no width and depth subtyping. The biggest difference, however, is that its setting considers refinement types, while we do not have a refinement relation and show that some of the advantages of refinement types can be achieved without the additional layer.
Two studies on a global approach to algebraic subtyping [26,60] define subtyping relationships with generative datatype constructors while discussing polarity (here with a different meaning) and discarding semantic interpretations. However, the generative nature of datatype constructors in this work makes it quite different from ours.
Mixed Coinductive/Inductive Reasoning for Recursive Types. The natural separation of positive and negative layers in CBPV led us through the literature on mixed coinductive/inductive definitions for recursive types. Related to our work in this paper, Danielsson and Altenkirch [22] and Jones and Pearce [46] provide definitions for equirecursive subtyping relations in a mixed setting while using a suspension monad for non-terminating computations, which shares an affinity with force/return CBPV computations. Danielsson and Altenkirch, however, do not try to justify the structural typing rules themselves via semantic typing of values or expressions, only the subtyping rules. Jones and Pearce are closer to our approach since they also use a semantic interpretation of types for expressions. While not polarized, they do consider inductive/coinductive types separately, but do not lift them to cover function types, instead studying other constructs such as unions.
Komendantsky [48] manages infinitary subtyping (for only function and recursive types) via a semantic encoding by folding an inductive relation into a coinductive one. We work in the opposite direction, turning the coinductive portion into an inductive one by step indexing. Lepigre and Raffalli [50] mix induction and coinduction in a syntax-directed framework, focusing on circular proof derivations and sized types [6], and also manage inductive types coinductively. Cohen and Rowe [21] provide a proposal for circular reasoning in a mixed setting, but the focus is on a transitive closure logic built around least and greatest fixed point operators. It seems quite plausible that we could use such systems to formalize our investigation, although we found some merit in using step-indexing and Brotherston and Simpson's circular proof system for induction [14].

Conclusion
We introduced a rich system of subtyping for an equirecursive variant of call-by-push-value and proved its soundness via semantic means. We also provided a bidirectional type checking algorithm and illustrated its expressiveness through several different kinds of examples. We showed the fundamental nature of the results by deriving systems of subtyping for isorecursive types and languages with call-by-name and call-by-value dynamics. The limitations of the present systems lie primarily in the lack of intersection and union types and parametric polymorphism, which are the subject of ongoing work.

B Examples of Semantic Typing
Example 7 (Identity Function). $\lambda x.\,\mathsf{return}\ x \in \tau^+ \to \uparrow\tau^+$ for all $\tau^+$.
Reason for $k \ge 2$:
Prove $e_0 \in_k s_0$ for all $k$ by induction on $k$.
Reason for $k \ge 2$:
By ind. hyp.
$e_0 \in_k s_0$ for all $k$ (by downward closure)
$e_0 \in s_0$ (by definition)
Example 9 (Ω). Define:
Prove $\omega\,(\mathsf{thunk}\ \omega) \in_k \sigma^-$ for every $k$ by induction on $k$.
Reason for $k \ge 3$:
Holds by ind. hyp. and then
$\omega\,(\mathsf{thunk}\ \omega) \in_k \sigma^-$ (by downward closure)
$\Omega \in_{k+1} \sigma^-$ (by definition)
$\Omega \in \sigma^-$ (by downward closure)
Example 10 (Empty Recursive Type). Define:
We prove something stronger: for all $k$ and $v$, it is not the case that $v \in_k t_0$. The proof is by induction on $v$.
Continuing the example: Assume $e$ has any type at all (that is, $e \in \rho^-$ for some $\rho^-$). Then for all $\sigma^-$ we have $e \in t_0 \to \sigma^-$.
We prove $e \in_k t_0 \to \sigma^-$ by induction on $k$. Because $e \in \rho^-$ we know one of the following cases applies:
Case: $k = 0$. Then $e \in_0 t_0 \to \sigma^-$ by definition.
Case: $k > 0$ and $e \to e'$. Then $e' \in_{k-1} t_0 \to \sigma^-$ by ind. hyp.
Case: $k > 0$ and $e$ is terminal. By definition, it remains to show that $e\,v \in_k \sigma^-$ for all $i < k$ and $v$ with $v \in_i t_0$. But that is vacuously true by the first part of this example.
$e' \in \sigma^-$ (given)
$e' \in_k \sigma^-$ for all $k$ (by definition)
$e \in_{k+1} \sigma^-$ for all $k$ (by definition, since $e \to e'$)
$e \in_i \sigma^-$ for all $i$ (by downward closure)
$e \in \sigma^-$ (by definition)

D Emptiness
Proof. (of Theorem 3) We interpret the judgment $t$ empty semantically as $v \in_k t \vdash \cdot$ (which expresses $v \notin_k t$ as a sequent), where $t$ is given and $k$ and $v$ are parameters and therefore implicitly universally quantified. The proof of this judgment is carried out in a circular metalogic. We translate each inference rule for $t$ empty into a derivation for $v \in_k t \vdash \cdot$, where each unproven subgoal corresponds to a premise of the rule. When the derivation of $t$ empty is closed by a cycle, the corresponding derivation of $v \in_k t \vdash \cdot$ is closed by a corresponding cycle in the metalogic. During this compositional translation of $t$ empty we need to ensure that the lexicographically ordered pair $(k, v)$ is smaller for each subgoal on the semantic side. This ensures that we can build a valid cycle in the metalogic whenever we have a cycle in the derivation of $t$ empty. As we will see, $k$ never changes, and $v$ becomes smaller. Recall that we write $v < v'$ when $v$ is a strict subterm of $v'$.
This shows we prove a slightly stronger statement than simply that $t$ is empty, namely that $v \notin_k t$ for all $k$. When showing a derivation we implicitly apply weakening when we do not use an assumption any longer, reading in proof construction order from the conclusion to the premises. Case: In each of the $|L|$ premises we have $v_j < v = j \cdot v_j$, so the structure of $v$ decreases. Case: $t$ empty is justified by a cycle. Then $v \in_k t \vdash \cdot$ is justified by a corresponding cycle.

E Fullness
Proof. (of Theorem 4) There are three cases for why $e \in_k r$ could be true: (1) $k = 0$, (2) $k > 0 \land e \to e' \land e' \in_{k-1} r$, and (3) $k > 0 \land e$ terminal $\land\ e \in_k r$. Only in the last do we distinguish between the rules for the $s$ full judgment.
Case: $k = 0$. Then $e \in_0 s$ is true by definition. Case: $k > 0$ and $e \to e'$ with $e' \in_{k-1} r$.

\[
\dfrac{\cdots \qquad \dfrac{\dfrac{}{e' \in_{k-1} r \vdash e' \in_{k-1} s}\ \mathsf{cycle}(k-1/k,\ e'/e)}{k > 0,\ e \to e',\ e' \in_{k-1} r \vdash e \in_k s} \qquad \cdots}{e \in_k r \vdash e \in_k s}
\]
So in this case we close the derivation with a local cycle, which corresponds to an appeal to the induction hypothesis on $k - 1$, regardless of $s$. We indicate here the substitution for the parameters that is applied as part of forming the cycle. Case: $k > 0$ and $e$ terminal with $e \in_k r$. We do not use the last assumption. Now we distinguish cases on the rule used to derive $e \in_k s$. Subcase: Here, we reduce the result to Theorem 3 (using weakening here not only in the antecedent but also in the succedent). Subcase: Similar to the previous case.

F Subtyping
Proof. (of Theorem 5) We proceed by a compositional translation of the circular derivation of subtyping into a circular derivation in the metalogic. For each rule we construct a derived rule on the semantic side with corresponding premises and conclusion. When the subtyping proof is closed due to a cycle, we close the proof in the metalogic with a corresponding cycle. In order for this cycle to be valid, it is critical that the judgments in the premises of the derived rule are strictly smaller than the judgments in the conclusion. Since our mixed logical relation is defined by nested induction, first on the step index $k$ and second on the structure of the value $v$, it is the lexicographic measure $(k, v)$ that should strictly decrease.
We provide some sample cases. We freely apply weakening to simplify the judgment under consideration. Case: in the second branch. Case: At the inference $(*)$ we distinguish the two cases from the premise of $\oplus\mathsf{sub}$ for $\ell = j$: either $t_j$ empty or $j \in K$. Observe that $v_j < v = j \cdot v_j$.
For computations, we separate out the cases $k = 0$ and $k > 0$ with $e \to e'$ because the argument is essentially the same except in the case of sub. When $k > 0$ and $e$ terminal we distinguish cases based on the various rules. Case: Case: $s \le r$ and $e \in_k s$ for $k = 0$. Then, $e \in_0 r$ directly by definition.
Case: $k > 0$ and $e \to e'$. Then we can close off the derivation with a (local) cycle, representing an immediate appeal to the induction hypothesis with $k - 1 < k$.
\[
\dfrac{\dfrac{\dfrac{\dfrac{}{e' \in_{k-1} s \vdash e' \in_{k-1} r}\ \mathsf{cycle}(k-1/k,\ e'/e)}{k > 0,\ e \to e',\ e' \in_{k-1} s \vdash k > 0 \land e \to e' \land e' \in_{k-1} r}}{k > 0,\ e \to e',\ e' \in_{k-1} s \vdash e \in_k r}}{e \in_k s \vdash e \in_k r}
\]
Case: $k > 0$ and $e$ terminal. Then we distinguish subcases based on the rule to conclude $e \in_k s$. Subcase:
\[
\dfrac{w \in_j u_1 \vdash w \in_j t_1 \qquad \dfrac{\dfrac{e' \in_{k-1} s_2 \vdash e' \in_{k-1} r_2}{e\,w \to e',\ e' \in_{k-1} s_2 \vdash e\,w \in_k r_2}}{e\,w \in_k s_2 \vdash e\,w \in_k r_2}\ (*)}{j < k,\ w \in_j t_1 \supset e\,w \in_k s_2,\ w \in_j u_1 \vdash e\,w \in_k r_2}
\]
In the place marked $(*)$ we only have one possible case since $k > 0$ and $e$ terminal, and therefore $e\,w$ is not terminal and must reduce since $e\,w \in_k s_2$.
In the first open premise we have (j, w) < (k, w) because j < k (even if w is arbitrary).
In the second open premise we have k − 1 < k.
Subcase: Recall that $k > 0$ and $e$ terminal.
\[
\dfrac{s = \&\{\ell : s_\ell\}_{\ell \in L} \qquad r = \&\{j : r_j\}_{j \in K} \qquad \forall j \in K.\ j \in L \land s_j \le r_j}{s \le r}\ \&\mathsf{sub}
\]
\[
\dfrac{\dfrac{e' \in_{k-1} s_j \vdash e' \in_{k-1} r_j}{e.j \to e',\ e' \in_{k-1} s_j \vdash e.j \in_k r_j}}{e.j \in_k s_j \vdash e.j \in_k r_j}\ (**)
\]
At the inference $(*)$ we use that $j \in L$ by the premise of $\&\mathsf{sub}$. At the inference $(**)$ we use that $k > 0$ and $e.j$ is not terminal. Subcase: Recall that $k > 0$ and $e$ terminal.
Observe that in the translation of t ≤ u we have k − 1 < k. Subcase: Recall that k > 0 and e terminal.
The last two cases follow immediately from the properties of the emptiness and fullness judgments. Case: Case: In this case, we can appeal to the lemma for fullness because we have the assumption that $v \in_k t$.

G Reflexivity and Transitivity of Syntactic Subtyping
Theorem 11 (Reflexivity and Transitivity).
1. $t \le t$ and $s \le s$ for all type names $s$ and $t$ in signature $\Sigma$
2. $t_1 \le t_2$ and $t_2 \le t_3$ implies $t_1 \le t_3$
3. $s_1 \le s_2$ and $s_2 \le s_3$ implies $s_1 \le s_3$
Proof. All rules except $\bot\mathsf{sub}^+$, $\bot\mathsf{sub}^-$, and $\top\mathsf{sub}$ simply compare components, and are thus directly amenable to reflexivity. For rule $\bot\mathsf{sub}^+$, $t$ empty implies $t \le t$. For rule $\bot\mathsf{sub}^-$, $t$ empty again implies $\uparrow t \le \uparrow t$. Finally, for $\top\mathsf{sub}$, $s$ full implies $s \le s$. Proving transitivity requires an additional lemma: if $t_1 \le t_2$ and $t_2$ empty, then $t_1$ empty. This lemma follows by applying inversion to the syntactic subtyping judgment. It can then be used to take circular proofs of $t_1 \le t_2$ and $t_2 \le t_3$ and assemble a circular proof of $t_1 \le t_3$. A similar proof technique holds for (3).

H Declarative Typing Judgments
While semantic typing worked with closed values and computations only, the syntactic rules require consideration of free variables. In a polarized presentation they always stand for values and therefore have positive type. We collect them in a context Γ and, as usual, presuppose that all variables declared in a context are distinct.
$\Gamma ::= \cdot \mid \Gamma, x{:}\tau^+$
There are several official judgments for the syntactic validity of signatures, contexts, types, and the typing of values and computations. In order to avoid excessive bureaucracy we use some presuppositions and some implicit checking or renaming to maintain these. The complete list of judgments can be found in Figure 7. The last two arise from $\tau^+ \le \sigma^+$ and $\tau^- \le \sigma^-$ in that they do not require the normal form of alternating names and structural types introduced in Section 4.1.

Valid Signatures
In particular, we have $\Sigma \vdash \tau^+ \le t$ and $\Sigma \vdash t \le \tau^+$ if $t = \tau^+ \in \Sigma$, and analogously for negative types. This captures the equirecursive nature of type definitions. We omit its straightforward rules, as well as the rules for valid types, which only check that all type names in a type are defined in the signature. We write $\tau^+$ struct and $\sigma^-$ struct if $\tau^+$ and $\sigma^-$ are not type names, which is needed to guarantee that type definitions are contractive.

I Soundness of Syntactic Typing
We state and prove the rules for semantic typing from Section 5 separately.
The proof is by induction on k.
$e\,v \in_0 \sigma^-$ (by definition)
Case: $k > 0$ and $e$ terminal
$e \in_k \tau^+ \to \sigma^-$ (first premise)
$v \in_i \tau^+$ for all $i < k$ (from second premise by downward closure)
$e\,v \in_k \sigma^-$ (by definition and second premise)
Case: $e \to e'$ and $k > 0$
$v \in_{k-1} \tau^+$ (from second premise and downward closure)
$e'\,v \in_{k-1} \sigma^-$ (by ind. hyp.)
$e\,v \in_k \sigma^-$ (since $e\,v \to e'\,v$)
Lemma 5.
Subcase: $e_1 \to e_1'$ and $e_1' \in_{k-1} \uparrow\tau^+$.
$e_1' \in_i \uparrow\tau^+$ for all $i \le k - 1$ (by downward closure)
$x : \tau^+ \models e_2 \in_{k-1} \sigma^-$ (from second premise)
$\mathsf{let\ return}\ x = e_1\ \mathsf{in}\ e_2 \to \mathsf{let\ return}\ x = e_1'\ \mathsf{in}\ e_2$ (by rule)
$\mathsf{let\ return}\ x = e_1'\ \mathsf{in}\ e_2 \in_{k-1} \sigma^-$ (by ind. hyp.)
$\mathsf{let\ return}\ x = e_1\ \mathsf{in}\ e_2 \in_k \sigma^-$ (by definition)
Subcase: $e_1$ terminal and
From second premise and downward closure
$\mathsf{let\ return}\ x = \mathsf{return}\ v_1\ \mathsf{in}\ e_2 \in_k \sigma^-$ (by definition)
$\mathsf{let\ return}\ x = e_1\ \mathsf{in}\ e_2 \in_k \sigma^-$ (since $e_1 = \mathsf{return}\ v_1$)
By second premise and Theorem 5
$v \in_k \downarrow\sigma^-$ (given)
$v = \mathsf{thunk}\ e$ and $e \in_k \sigma^-$ (by definition)
Proof. (of Theorem 6) Case: If $f$ has not yet been translated, we deduce $f[\theta] = f \in_k \sigma^-$ from $f \to e$ and $e \in_{k-1} \sigma^-$ if $k > 0$. In this case it is important that $k > k - 1$. If $f$ has already been translated (that is, we are in the premise of the translation of this rule application), then it will be at a judgment $f \in_{k'} \sigma^-$ for some $k' > k$ and we can form a valid cycle. This translation results in a finite circular proof for two reasons: 1. There are only finitely many definitions $f : \sigma^- = e \in \Sigma$. 2. The type for $f$ is fixed to be $\sigma^-$, so when $f$ is encountered in the derivation of $e \in_{k-1} \sigma^-$ we can always form a valid cycle. Case: Note that the step index $k$ remains the same in all premises. Case: Note that the step index $k$ remains the same.

J Soundness and Completeness of Bidirectional Typechecking
Note that $|v'| = v$ and $|e'| = e$, as used in the theorems below, refer to the same value or expression, but with the possibility of extra annotations that are erased from the term at runtime.
1. If $\Gamma \vdash v \Leftarrow \tau^+$ or $\Gamma \vdash v \Rightarrow \tau^+$, then there exists a $v'$ such that $\Gamma \vdash v' : \tau^+$ and $|v'| = v$.
2. If $\Gamma \vdash e \Leftarrow \sigma^-$ or $\Gamma \vdash e \Rightarrow \sigma^-$, then there exists an $e'$ such that $\Gamma \vdash e' : \sigma^-$ and $|e'| = e$.
Proof. By straightforward induction on the structure of the typing derivation.
We can also show that our bidirectional system is complete, as annotations can always be added to make values and computations well-typed.
Proof. By straightforward induction on the structure of the typing derivation, using the rules anno$^+$ and anno$^-$ where needed.

K Interpretation of Isorecursive Types
As we discussed in Section 7, we can also directly interpret isorecursive types (types that are isomorphic, as embodied by fold and unfold operators, but not equal to their expansions) in order to obtain a formulation of isorecursive semantic typing, and therefore semantic subtyping, within our equirecursive setting. Previous work has studied the relation between these two formulations from a syntactic perspective [1], via type assignment with positive recursive typing [71], and, more recently, in relation to semantic expressiveness [61]. For our needs, we demonstrate a semantic translation from the iso- to the equi-recursive setting that showcases no significant differences between these formulations.
Syntax While our focus is on an isorecursive semantic interpretation, we need some additional syntax for values and computations to establish our operational semantics: the fold constructor and the unfold destructor (for computations only), both typical of isorecursive formulations.
Dynamics In the isorecursive interpretation, two reduction rules are added for the judgment $e \to e'$. As before, values do not reduce.
For this interpretation, we also expand our set of terminal computations to include one additional computation, following Lemma 1.
$(\mathsf{fold}_\nu\ e)$ terminal
Semantic Typing We extend our semantic typing definitions from Section 3 to incorporate the isorecursive fold, introduced for values, and unfold, as the elimination form of fold for computations.
With the addition of this setting, we can model recursive types with explicit constructors $\mu\alpha^+.\,\tau^+$ and $\nu\alpha^-.\,\sigma^-$. As before, we observe computation steps according to our reduction rules.

K.1 Recursive Types as Type Definitions
We define a generalized translation for mapping all recursive variables $\alpha$ and all arbitrary recursive types $\mu\alpha.\,\tau$ (with the possibility of nested recursion) to fresh equirecursive type names $t^+$ and $s^-$. We encode this translation and mapping over a series of steps:
1. Define a translation function $\llbracket\cdot\rrbracket$, distinguishing recursive types and type variables from all other types. Each recursive type variable, positive and negative, is translated into a fresh type name:
$\llbracket\alpha^+\rrbracket = t^+$ for $t^+$ fresh and for all $\alpha^+$
$\llbracket\alpha^-\rrbracket = s^-$ for $s^-$ fresh and for all $\alpha^-$
The translation maps each µ- or ν-type to the corresponding type name and is the identity function for all other types.
2. Define type names through possibly non-contractive type definitions: (a) each µ-type $\mu\alpha^+.\,\tau^+$ establishes the definition for the corresponding type name $t^+$ through the definition $t^+ = \llbracket\tau^+\rrbracket$;

K.2 Iso- to Equi-recursive Translation
In the spirit of the conjecture of Ligatti et al. [53] that equirecursive subtypes could automatically be translated into isorecursive subtypes by inserting any "missing µs" (in a call-by-value language), our translation works in the other direction, from an iso- to an equi-recursive setting, by inserting unary variant records for positive µ types and unary lazy records for negative ν types, with corresponding fold labels.
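We start with a set of translations, $\llbracket\cdot\rrbracket$, involving $\mathsf{fold}_\mu$ and $\mathsf{fold}_\nu$ introduction and elimination forms for values and computations:
Definition 2 (Iso- to equi-recursive translations for values and expressions).
$\llbracket\mathsf{fold}_\nu\ e\rrbracket = \{\mathsf{fold}_\nu = \llbracket e\rrbracket\}$
$\llbracket\mathsf{unfold}\ e\rrbracket = \llbracket e\rrbracket.\mathsf{fold}_\nu$
This is extended compositionally to all other constructs.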
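The two clauses above can be read as a small recursive function over terms. The following Python sketch is only an illustration under stated assumptions: the constructor names (`Var`, `FoldMu`, `FoldNu`, `Unfold`, `Inject`, `Record`, `Project`) are hypothetical, the term language is cut down to a handful of forms, and the `FoldMu` clause (a unary variant injection labelled `fold_mu`) is our guess based on the surrounding prose rather than a clause stated in Definition 2.

```python
from dataclasses import dataclass

# Hypothetical miniature term syntax; all constructor names are illustrative.
@dataclass(frozen=True)
class Var:
    name: str

@dataclass(frozen=True)
class FoldMu:          # fold_mu v (isorecursive value)
    body: object

@dataclass(frozen=True)
class FoldNu:          # fold_nu e (isorecursive computation)
    body: object

@dataclass(frozen=True)
class Unfold:          # unfold e
    body: object

@dataclass(frozen=True)
class Inject:          # j . v (variant injection)
    label: str
    body: object

@dataclass(frozen=True)
class Record:          # {l = e, ...} (lazy record)
    fields: tuple      # tuple of (label, term) pairs

@dataclass(frozen=True)
class Project:         # e.l (record projection)
    body: object
    label: str

def translate(term):
    """Iso- to equi-recursive term translation (sketch)."""
    if isinstance(term, FoldNu):
        # fold_nu e  ~>  {fold_nu = [[e]]}
        return Record((("fold_nu", translate(term.body)),))
    if isinstance(term, Unfold):
        # unfold e  ~>  [[e]].fold_nu
        return Project(translate(term.body), "fold_nu")
    if isinstance(term, FoldMu):
        # Assumed clause: fold_mu v  ~>  fold_mu . [[v]]
        return Inject("fold_mu", translate(term.body))
    # Compositional extension to the remaining constructs.
    if isinstance(term, Inject):
        return Inject(term.label, translate(term.body))
    if isinstance(term, Record):
        return Record(tuple((l, translate(e)) for l, e in term.fields))
    if isinstance(term, Project):
        return Project(translate(term.body), term.label)
    return term  # variables are left unchanged
```

For instance, `translate(Unfold(FoldNu(Var("x"))))` yields a projection on the `fold_nu` field of a unary record, which is the equirecursive shape of an unfold applied to a fold.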
Going further, for this specific interpretation, we extend the generalized translation function $\llbracket\cdot\rrbracket$ of Section K.1 with an additional transformation: for every positive type definition encountered, a unary variant record is inserted into the global signature $\Sigma_{i2e}$. Similarly, for every negative type definition, a unary lazy record is inserted into the signature:
Definition 3 (Iso- to equi-recursive translation for isorecursive types). The translation of a (positive or negative) isorecursive type $\tau$ is the equirecursive type $\llbracket\tau\rrbracket$ defined over the extended global signature $\Sigma_{i2e}$.
Proof. Directly, using our translations in Definitions 2 and 3; the rest follows from Section 3.
A Note on Contractiveness Given our translation in Definition 3, we can have the following two translations, for example:
While these isorecursive types on the left may seem to break our contractive restriction at first glance, $\mu\alpha^+.\,\alpha^+$ and $\nu\alpha^-.\,\alpha^-$ are not restricted in the isorecursive setting [74]. Instead, these translations demonstrate that our restriction on being contractive in the equirecursive formulation is preserved for any given isorecursive type by the insertion of unary variants.
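To make the contractiveness remark concrete, here is a minimal Python sketch of the type-level translation as we read it from Section K.1 and Definition 3. Everything in it is an assumption made for illustration: the class names, the fresh-name scheme (`t0`, `s1`, ...), the labels `fold_mu` and `fold_nu`, and the decision to merge the two steps (naming each recursive type and wrapping its body in a unary record) into a single pass.

```python
from dataclasses import dataclass
from itertools import count

# Hypothetical miniature type syntax; names are illustrative only.
@dataclass(frozen=True)
class TyVar:
    name: str

@dataclass(frozen=True)
class Mu:              # mu alpha+. tau+
    var: str
    body: object

@dataclass(frozen=True)
class Nu:              # nu alpha-. sigma-
    var: str
    body: object

@dataclass(frozen=True)
class Variant:         # variant record type, as (label, type) pairs
    fields: tuple

@dataclass(frozen=True)
class Lazy:            # lazy record type, as (label, type) pairs
    fields: tuple

@dataclass(frozen=True)
class Name:            # equirecursive type name defined in the signature
    name: str

def translate(ty, sig, env, fresh):
    """Translate an isorecursive type, adding definitions to `sig`."""
    if isinstance(ty, TyVar):
        return env[ty.name]            # bound variable -> its type name
    if isinstance(ty, (Mu, Nu)):
        positive = isinstance(ty, Mu)
        name = Name(("t" if positive else "s") + str(next(fresh)))
        body = translate(ty.body, sig, {**env, ty.var: name}, fresh)
        # Wrap the body in a unary record carrying the fold label.
        if positive:
            sig[name.name] = Variant((("fold_mu", body),))
        else:
            sig[name.name] = Lazy((("fold_nu", body),))
        return name
    if isinstance(ty, Variant):
        return Variant(tuple((l, translate(t, sig, env, fresh)) for l, t in ty.fields))
    if isinstance(ty, Lazy):
        return Lazy(tuple((l, translate(t, sig, env, fresh)) for l, t in ty.fields))
    return ty

def contractive(sig):
    """A signature is contractive if no definition body is a bare type name."""
    return all(not isinstance(body, Name) for body in sig.values())
```

Under these assumptions, translating `Mu("a", TyVar("a"))` produces a type name whose definition is a unary variant record pointing back to that name, so the resulting signature passes the `contractive` check even though the isorecursive body was a bare variable.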

L Call-by-Name
τ, σ :
Syntactic typing for this call-by-name language is captured by the judgment $\Gamma \vdash e : \tau$, and its rules are standard. For function types, variant record types, and the corresponding terms, these rules are:
This semantics is call-by-name because, for example, in a function application $e_1\,e_2$, the argument $e_2$ remains unevaluated when substituted into the body of an abstraction $\lambda x.\,e_1'$. Similarly, in $\mathsf{match}\ e_1\ (\ell \cdot x_\ell \Rightarrow e_\ell)_{\ell \in L}$, the term $e_1$ is evaluated to the form $j \cdot e_1'$ for some $j \in L$. Because $j \cdot e_1'$ is terminal, $e_1'$ is not further evaluated; instead, $e_1'$ is substituted into $e_j$, the body of the $j$th branch.
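The call-by-name behavior described above can be illustrated with a tiny single-step evaluator. This Python sketch is not the paper's formal semantics: the constructor names are hypothetical, only functions and variant records are covered, and substitution is the naive one, which is adequate here only because we assume the substituted terms are closed.

```python
from dataclasses import dataclass

# Hypothetical untyped term syntax for illustration.
@dataclass(frozen=True)
class Var:
    name: str

@dataclass(frozen=True)
class Lam:
    param: str
    body: object

@dataclass(frozen=True)
class App:
    fn: object
    arg: object

@dataclass(frozen=True)
class Inj:             # j . e
    label: str
    body: object

@dataclass(frozen=True)
class Match:           # match e (l . x_l => e_l) for l in L
    scrut: object
    branches: dict     # label -> (variable, branch body)

def subst(e, x, s):
    """Replace free occurrences of x in e by s (assumes s is closed)."""
    if isinstance(e, Var):
        return s if e.name == x else e
    if isinstance(e, Lam):
        return e if e.param == x else Lam(e.param, subst(e.body, x, s))
    if isinstance(e, App):
        return App(subst(e.fn, x, s), subst(e.arg, x, s))
    if isinstance(e, Inj):
        return Inj(e.label, subst(e.body, x, s))
    if isinstance(e, Match):
        branches = {l: (y, b if y == x else subst(b, x, s))
                    for l, (y, b) in e.branches.items()}
        return Match(subst(e.scrut, x, s), branches)
    raise TypeError(f"unknown term: {e!r}")

def step(e):
    """One call-by-name step, or None if e is terminal (or stuck)."""
    if isinstance(e, App):
        if isinstance(e.fn, Lam):
            # The argument is substituted without being evaluated.
            return subst(e.fn.body, e.fn.param, e.arg)
        f = step(e.fn)
        return None if f is None else App(f, e.arg)
    if isinstance(e, Match):
        if isinstance(e.scrut, Inj):
            # The payload of j . e' is substituted without being evaluated.
            y, body = e.branches[e.scrut.label]
            return subst(body, y, e.scrut.body)
        s = step(e.scrut)
        return None if s is None else Match(s, e.branches)
    return None  # abstractions and injections are terminal
```

Applying a function that ignores its argument to a diverging term takes a single step and discards the argument, and an injection with a diverging payload is already terminal; both are exactly the situations in which call-by-name differs from call-by-value.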
Because of the way that we handle equirecursive types and expressions, we translate signatures. Levy [52] proves that well-typed terms are well-typed after the translation to call-by-push-value is applied. Our syntactic typing rules are the same as his, so the theorem carries over to our setting. Now, we prove that polarized subtyping on the image of Levy's call-by-name translation is sound. We begin with an easy lemma.
Proof. By a straightforward examination of the call-by-push-value syntactic subtyping rules, observing that t + empty is not derivable when t + = ↓s − .
The soundness theorem is then proved as follows.
Proof. (of Theorem 7) Part 1 is easy to prove directly. By inversion on the body of $t$'s definition, there are several cases:
- If $t = t_1 \to t_2$, then $\llbracket t\rrbracket = t_0^+ \to \llbracket t_2\rrbracket$, where $t_0^+ = \downarrow\llbracket t_1\rrbracket$ is an auxiliary definition introduced for the normal form of type definitions. By inversion on $\llbracket t\rrbracket$ full, we must have $t_0^+$ empty. Because $t_0^+ = \downarrow\llbracket t_1\rrbracket$, this case is contradictory: there is no emptiness rule for the $\downarrow$ shift.
- If $t = \&\{\ell : \tau_\ell\}_{\ell \in L}$, then $\llbracket t\rrbracket = \&\{\ell : \llbracket\tau_\ell\rrbracket\}_{\ell \in L}$. By inversion on $\llbracket t\rrbracket$ full, we must have $L = \emptyset$. In this case, we indeed have $t$ full by the call-by-name full rule.
- In all other cases, $\llbracket t\rrbracket = \uparrow t_0^+$ with $t_0^+$ introduced for the normal form of type definitions. However, there is no call-by-push-value fullness rule for $\uparrow$ shifts, so these cases are contradictory as well.
Part 2 is proved by mapping a circular proof of $\llbracket t\rrbracket \le \llbracket u\rrbracket$ to a circular proof of $t \le u$. We can also prove: if $\llbracket t\rrbracket = \uparrow t_0^+$ and $\llbracket u\rrbracket = \uparrow u_0^+$ with $t_0^+ \le u_0^+$, then $t \le u$. This is done by mapping a circular proof of $t_0^+ \le u_0^+$ to a circular proof of $t \le u$. This and Part 2 are proved simultaneously.
-Consider the case in which t ≤ u is derived by the sub rule.
By part 1, u full in the call-by-name language. It follows from the sub n rule that t ≤ u .
-Consider the case in which t ≤ u is derived by the →sub rule. By inversion on t and u , this can only happen if t = t 1 → t 2 and u = u 1 → u 2 , with t = t + 0 →t 2 and u = u + 0 →u 2 , where t + 0 = ↓t 1 and u + 0 = ↓u 1 are auxiliary definitions introduced for the normal form of type definitions.
→sub By Lemma 13, u 1 ≤ t 1 . By transforming according to part 2, we have both u 1 ≤ t 1 and t 2 ≤ u 2 . From these we can derive t ≤ u with the →sub n rule. -Consider the case in which t ≤ u is derived by the ↑sub rule.
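By item 1, $t \le u$.
- Consider the case in which $\llbracket t\rrbracket = \uparrow t_0^+$ and $\llbracket u\rrbracket = \uparrow u_0^+$ with $t_0^+ \le u_0^+$ being derived by the $\oplus$sub rule. In this case, $t = \oplus\{\ell : t_\ell\}_{\ell \in L}$ and $u = \oplus\{j : u_j\}_{j \in J}$, where $t_0^+ = \oplus\{\ell : t_\ell^+\}_{\ell \in L}$ and $t_\ell^+ = \downarrow\llbracket t_\ell\rrbracket$ and $u_0^+ = \oplus\{j : u_j^+\}_{j \in J}$ and $u_j^+ = \downarrow\llbracket u_j\rrbracket$ are auxiliary definitions introduced for the normal form of type definitions. Observe that $t_\ell^+$ empty is not derivable for any $\ell \in L \setminus J$ because $t_\ell^+ = \downarrow\llbracket t_\ell\rrbracket$. Therefore, $L \subseteq J$ must hold. By Lemma 13, $\llbracket t_\ell\rrbracket \le \llbracket u_\ell\rrbracket$ for all $\ell \in L \cap J = L$. By transforming according to part 2, we have $t_\ell \le u_\ell$ for all $\ell \in L$. From these we can derive $t \le u$ with the $\oplus$sub$_n$ rule.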
- Consider the case in which $\llbracket t\rrbracket = \uparrow t_0^+$ and $\llbracket u\rrbracket = \uparrow u_0^+$ with $t_0^+ \le u_0^+$ being derived by the $\bot$sub$^+$ rule. In this case, $t_0^+$ empty. There are three subcases.
  • If $t = \oplus\{\ell : t_\ell\}_{\ell \in L}$, then $t_0^+ = \oplus\{\ell : t_\ell^+\}_{\ell \in L}$, with $t_\ell^+ = \downarrow\llbracket t_\ell\rrbracket$ being auxiliary definitions introduced for the normal form of type definitions. None of the judgments $t_\ell^+$ empty are derivable because there is no call-by-push-value emptiness rule for the $\downarrow$ shift. Therefore, $t_0^+$ empty is derivable only if $L = \emptyset$. In this case, the call-by-name $\bot$sub$_n$ rule derives $t \le u$.
  • The subcase in which $t = t_1 \otimes t_2$ is similarly impossible.
  • If $t = 1$, then $t_0^+ = 1$. The judgment $t_0^+$ empty is not derivable, as there is no call-by-push-value emptiness rule for $1$.
We again use a small-step operational semantics that relies on the judgments $e \to e'$ and $e$ value. The rules involving functions are:
$\lambda x.\,e$ value
This semantics is call-by-value because, for example, in a function application $e_1\,e_2$, the argument $e_2$ is evaluated to a value first: only values are ever substituted into the body of an abstraction $\lambda x.\,e_1'$. We now prove that polarized subtyping on the image of Levy's call-by-value translation is sound with respect to Figure 6. We begin with an easy lemma.
Proof. By a straightforward examination of the call-by-push-value subtyping rules. Now we prove the main soundness theorem.
Proof. (of Theorem 9) Part 1 is easy to prove directly. By inversion on the body of $t$'s definition, there are several cases:
- If $t = t_1 \to t_2$, then $\llbracket t\rrbracket = \downarrow t_0^-$, where $t_0^- = \llbracket t_1\rrbracket \to t_3^-$ and $t_3^- = \uparrow\llbracket t_2\rrbracket$ are auxiliary definitions introduced for the normal form of type definitions. Because $\llbracket t\rrbracket = \downarrow t_0^-$, this case is contradictory: there is no emptiness rule for the $\downarrow$ shift.
- The case for $t = \&\{\ell : t_\ell\}_{\ell \in L}$ is similar.
- In all other cases, $\llbracket t\rrbracket$ proceeds homomorphically and the call-by-push-value emptiness rules have corresponding call-by-value rules.
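The shape of the call-by-value type translation used in these cases can be sketched structurally. The Python fragment below is a simplification under stated assumptions: it works on type expressions directly instead of on signatures of type names in normal form (so the auxiliary names $t_0^-$ and $t_3^-$ do not appear), the class names are hypothetical, and the lazy record clause is our reading of "similar" as placing an $\uparrow$ shift on each field under a $\downarrow$ shift.

```python
from dataclasses import dataclass

# Hypothetical type syntax shared by source and target, for illustration.
@dataclass(frozen=True)
class Unit:            # 1
    pass

@dataclass(frozen=True)
class Arrow:           # t1 -> t2
    dom: object
    cod: object

@dataclass(frozen=True)
class Variant:         # variant record type, as (label, type) pairs
    fields: tuple

@dataclass(frozen=True)
class LazyRec:         # lazy record type, as (label, type) pairs
    fields: tuple

@dataclass(frozen=True)
class Down:            # down shift: thunked computation as a value type
    body: object

@dataclass(frozen=True)
class Up:              # up shift: computation returning a value
    body: object

def cbv(t):
    """Call-by-value translation of types into polarized types (sketch)."""
    if isinstance(t, Arrow):
        # t1 -> t2  ~>  down([[t1]] -> up [[t2]])
        return Down(Arrow(cbv(t.dom), Up(cbv(t.cod))))
    if isinstance(t, LazyRec):
        # Assumed analogous clause: shift each field, thunk the record.
        return Down(LazyRec(tuple((l, Up(cbv(u))) for l, u in t.fields)))
    if isinstance(t, Variant):
        # Positive types are translated homomorphically.
        return Variant(tuple((l, cbv(u)) for l, u in t.fields))
    return t

def headed_by_down_shift(t):
    """True if the translated type starts with a down shift."""
    return isinstance(t, Down)
```

Since the translation of every function or lazy record type is headed by a down shift, and there is no emptiness rule for that shift, such translated types can never be derived empty, which is the observation behind the first two cases above.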
Part 2 is proved by mapping a circular proof of t ≤ u to a circular proof of t ≤ u. We can also similarly show: -Consider the case in which t ≤ u is derived by the ⊥sub + rule.
By part 1, t empty in the call-by-value language. It follows from the ⊥sub v rule that t ≤ u.
-Consider the case in which t = ↓t − 0 and u = ↓u − 0 , with t − 0 ≤ u − 0 being derived by the →sub rule. By inversion on t and u , this can only happen if t = t 1 → t 2 and u = u 1 → u 2 , where t − 0 = t 1 → t − 3 and t − 3 = ↑t 2 and u − 0 = u 1 → u − 3 and u − 3 = ↑u 2 are auxiliary definitions introduced for the normal form of type definitions.
By Lemma 14, either t 2 empty or t 2 ≤ u 2 . In the former case, t 2 empty, and we have t 2 ≤ u 2 by the ⊥sub v rule. In the latter case, by transforming according to part 2, we have t 2 ≤ u 2 . In both cases, by transforming according to part 2, we have u 1 ≤ t 1 . From these we can derive t ≤ u with the →sub v rule. -Consider the case in which t = ↓t − 0 and u = ↓u − 0 , with t − 0 ≤ u − 0 being derived by the sub rule. By inversion on t and u , this can only happen in cases where t and u are either function types or lazy record types. If both t and u are lazy record types, then t ≤ u is derivable by sub v . Otherwise, t ≤ u is derivable by one of the sub →→ v , sub → v , or sub → v rules. -Consider the case in which t ≤ u is derived by the ↓sub rule.
The remaining cases are handled similarly.
Next, we prove that polarized subtyping on the image of Levy's call-by-value translation is complete with respect to Figure 6.
Proof. (of Theorem 10) Part 1 is proved by mapping a circular proof of t empty to a circular proof of t empty. As an example, consider the case in which t empty is derived by the emp v rule: Transforming each circular proof of t empty according to part 1, we have t empty for each ∈ L. Because t = { : t } ∈L , we can derive t empty with the emp rule. The other cases are similar. Part 2 is proved by mapping a circular proof t ≤ u to a circular proof of t ≤ u . The image of each call-by-value subtyping rule is derivable with the call-by-push-value syntactic subtyping rules. For example, consider the following call-by-value subtyping rule for function types.