
1 Introduction

Parametricity was first introduced by Strachey [22] as a way to characterise the behaviour of polymorphic programs as being uniform with respect to the type of the arguments provided. He opposed this notion to ad-hoc polymorphism, where a function can produce arbitrarily different outputs when given inputs of different types (for example an integer and a boolean). To formalise this notion of parametricity, Reynolds introduced relational parametricity [21]. It is defined via an equivalence on programs, which we call Reynolds equivalence, and which generalises logical relations to System F. This equivalence uses arbitrary relations over pairs of types to relate polymorphic programs: a parametric program, given related arguments as input, produces related results. Reynolds parametricity has been developed into a fundamental theory for studying polymorphic programs [1, 20, 23].

Following results of Mitchell on PER-models of polymorphism [18], Abadi, Cardelli, Curien and Plotkin [1, 20] introduced another, more intensional notion of equivalence, called Strachey equivalence. Two terms of System F are Strachey equivalent whenever, by removing all their type annotations, we obtain two \(\beta \eta \)-equivalent untyped terms. The authors conjectured that Strachey equivalence implies Reynolds equivalence (the converse being easily shown to be false).

In this paper we examine a notion of Reynolds equivalence based on operational logical relations, and prove that, for this notion, the conjecture holds. To do so, we introduce a trace model for System F based on operational nominal game semantics [12, 14]. Terms in our model are denoted as sets of traces, generated by a labelled transition system, which represent interactions with arbitrary term contexts. In order to abstract away type information from inputs to polymorphic functions, our semantics uses names to model such inputs. The idea is the following: since names have no internal structure, the function has no choice but to act “the same way” on such inputs, i.e. be parametric. Our trace model yields a third notion of equivalence: trace equivalence (i.e. equality of sets of traces). Then, the result is proven by showing that trace equivalence is included in (operational) Reynolds equivalence, while it includes Strachey equivalence.

The traces in our model are formed of moves, which represent interactions between the modelled term (the Player) and its context (the Opponent): either Player or Opponent can interrogate the terms provided by the other one, or respond to a previous such interrogation. These moves are called questions and answers respectively. Names enter the scene when calling terms of polymorphic type, in which case the calling party replaces the actual argument type \(\theta \) with a type name \(\alpha \), and records locally the correspondence between \(\alpha \) and \(\theta \). Another use of names in our model is for representing terms that are passed around as arguments to questions. These are called computation names, and are typed according to the term they each represent.

2 Definition of System F and Parametricity

We start off by giving the definitions of System F and of the parametric equivalence relations we shall examine on it. The grammar for System F is standard and given by:

$$\begin{array}{rrll} \mathsf {Type}\,\ni &{} \theta , \theta ' &{} {:}{:=} &{} X~|~\theta \rightarrow \theta ' ~|~\forall X.\theta \\ \mathsf {Term}\,\ni &{} M,N &{} {:}{:=}\ &{} \lambda x^\theta .M ~|~\varLambda X.M ~|~M N ~|~M \theta \end{array}$$

We write x, etc. for (term) variables, sourced from a countable set \(\mathsf {Var}\); and \(X\), etc. for type variables, taken from \(\mathsf {TVar}\). We define substitutions of open variables of either kind in the usual capture-avoiding way. For instance, the term obtained by consecutively applying substitutions \(\eta :\mathsf {Var}\rightharpoonup \mathsf {Term}\) and \(\delta :\mathsf {TVar}\rightharpoonup \mathsf {Type}\) on M is written \(M\{\eta \}\{\delta \}\).

Terms are typed in environments \(\varDelta ;\varGamma \), where \(\varDelta \) is a finite set of type variables, and \(\varGamma \) is a set \(\{x_1:\theta _1,\ldots ,x_m:\theta _m\}\) of variable-type pairs. The typing rules are given in Fig. 1. The operational semantics we examine is \(\beta \eta \)-equality, defined as the least syntactic congruence \(=_{\beta \eta }\) that includes the axioms given on the RHS part of Fig. 1.

Fig. 1. Typing rules and \(\beta \eta \)-equality axioms.

We shall use the following common polymorphic encodings:

  • \(\mathbf {Bool}=\forall X.\ X\rightarrow X\rightarrow X\), \(\mathbf {true}=\varLambda X.\lambda x^X\!.\lambda y^X\!.x\) and \(\mathbf {false}=\varLambda X.\lambda x^X\!.\lambda y^X\!.y\),

  • \(\mathbf {Unit}=\forall X.\ X\rightarrow X\) and \(\mathbf {id}=\varLambda X.\lambda x^X.x\).
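These encodings can be transcribed directly into Haskell using higher-rank polymorphism. The following minimal sketch (the names FBool, FUnit, ftrue, ffalse and fid are ours, not the paper's) mirrors the terms above, with runBool/runUnit playing the role of System F type application:

```haskell
{-# LANGUAGE RankNTypes #-}

-- Church encodings of Bool and Unit, with RankNTypes standing in for ∀.
newtype FBool = FBool { runBool :: forall x. x -> x -> x }   -- Bool = ∀X. X → X → X
newtype FUnit = FUnit { runUnit :: forall x. x -> x }        -- Unit = ∀X. X → X

ftrue, ffalse :: FBool
ftrue  = FBool (\x _ -> x)   -- ΛX. λx^X. λy^X. x
ffalse = FBool (\_ y -> y)   -- ΛX. λx^X. λy^X. y

fid :: FUnit
fid = FUnit (\x -> x)        -- ΛX. λx^X. x

main :: IO ()
main = putStrLn (runBool ftrue "then-branch" "else-branch")  -- prints "then-branch"
```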

Reynolds Equivalence. We next introduce logical relations for System F. First, we let \(\mathrm {Rel}\) be the set of all typed relations between closed terms that are compatible with \(=_{\beta \eta }\):

$$\begin{aligned} \mathrm {Rel}=\{(\theta _1,\theta _2,R) ~|~&R \subseteq \mathsf {Term}\times \mathsf {Term}\wedge \forall (M_1,M_2) \in R.\ \cdot ;\cdot \vdash M_i : \theta _i \\&\wedge \forall M'_1 =_{\beta \eta }M_1. \forall M'_2 =_{\beta \eta }M_2.\ (M'_1,M'_2) \in R\} \end{aligned}$$

Logical relations \(\mathcal {R}[\![\theta ]\!]_{\delta }\) are defined below, indexed by environments \(\delta :\mathsf {TVar}\rightharpoonup \mathrm {Rel}\):

$$\begin{aligned} \mathcal {R}[\![X ]\!]_{\delta }&=R \text { when } \delta (X) = (\_,\_,R) \\ \mathcal {R}[\![\forall X.\theta ]\!]_{\delta }&=\{(M_1,M_2) ~|~\forall (\theta _1,\theta _2,R) \in \mathrm {Rel}.\ (M_1\theta _1,M_2\theta _2) \in \mathcal {R}[\![\theta ]\!]_{\delta \cdot [X\mapsto (\theta _1,\theta _2,R)]}\} \\ \mathcal {R}[\![\theta _1\,{ \rightarrow }\, \theta _2 ]\!]_{\delta }&=\{(M_1,M_2) ~|~\forall (N_1,N_2) \in \mathcal {R}[\![\theta _1 ]\!]_{\delta }.\ (M_1 N_1,M_2 N_2) \in \mathcal {R}[\![\theta _2 ]\!]_{\delta }\} \end{aligned}$$
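For instance, unfolding the definition at \(\mathbf {Unit}=\forall X.\,X\rightarrow X\) gives:

$$ (M_1,M_2)\in \mathcal {R}[\![\mathbf {Unit} ]\!]_{\delta } \iff \forall (\theta _1,\theta _2,R)\in \mathrm {Rel}.\ \forall (N_1,N_2)\in R.\ (M_1\,\theta _1\,N_1,\,M_2\,\theta _2\,N_2)\in R $$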

We can now define the first notion of parametric equivalence for System F.

Definition 1

Given terms \(\varDelta ; \varGamma \vdash M_1,M_2 : \theta \), we say that they are Reynolds equivalent, and write \(\varDelta ;\varGamma \vdash M_1 \simeq _{log} M_2 : \theta \), if:

$$ \forall \delta \in \mathcal {R}[\![\varDelta ]\!]. \forall (\eta _1,\eta _2) \in \mathcal {R}[\![\varGamma ]\!]_{\delta }.\ (M_1\{{\eta _1}\}\{{\delta _1}\},M_2\{{\eta _2}\}\{{\delta _2}\}) \in \mathcal {R}[\![\theta ]\!]_{\delta } $$

where \(\mathcal {R}[\![\varDelta ]\!] =\mathrm {dom}(\varDelta )\rightarrow \mathrm {Rel}\), \(\delta _1=\{(X,{\theta _1})~|~\delta (X)=({\theta _1},\_,\_)\}\) (similar for \(\delta _2\)) and \(\mathcal {R}[\![\varGamma ]\!]_{\delta } =\{(\eta _1,\eta _2)\in (\mathrm {dom}(\varGamma )\rightharpoonup \mathsf {Term})^2 ~|~\forall (x,\theta ') \in \varGamma .\ (\eta _1(x),\eta _2(x)) \in \mathcal {R}[\![\theta ' ]\!]_{\delta }\}\).
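For closed terms the definition reduces to membership in \(\mathcal {R}[\![\theta ]\!]_{\varepsilon }\). As a concrete negative instance, \(\mathbf {true}\) and \(\mathbf {false}\) are not Reynolds equivalent at type \(\mathbf {Bool}\): take \(\theta _1=\theta _2=\mathbf {Bool}\) and let R be the \(=_{\beta \eta }\)-closure of \(\{(\mathbf {true},\mathbf {true}),(\mathbf {false},\mathbf {false})\}\). Then \((\mathbf {true},\mathbf {true}),(\mathbf {false},\mathbf {false})\in R\), yet \(\mathbf {true}\,\mathbf {Bool}\,\mathbf {true}\,\mathbf {false}=_{\beta \eta }\mathbf {true}\) and \(\mathbf {false}\,\mathbf {Bool}\,\mathbf {true}\,\mathbf {false}=_{\beta \eta }\mathbf {false}\), while \((\mathbf {true},\mathbf {false})\notin R\).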

The following result is standard [21].

Theorem 2

(Fundamental Property). If \(\varDelta ;\varGamma \vdash M:\theta \) then \(\varDelta ;\varGamma \vdash M \simeq _{log} M : \theta \).

Remark 3

Note that our definition of Reynolds equivalence does not coincide with either of the definitions given in [1, 20]: therein, parametricity is defined using relational logics (and accompanying proof systems), whereas here we use quantification over concrete relations over closed terms.

Strachey Equivalence. Another notion of parametric equivalence is defined by means of erasing types from terms. We define the type erasure \(\mathbf {erase}(M)\) of a term M by:

$$ \begin{array}{rlrl} \mathbf {erase}(\varLambda X{.}M) &{} =\mathbf {erase}(M) &{} \mathbf {erase}(M N) &{} =\mathbf {erase}(M) \mathbf {erase}(N) \\ \mathbf {erase}(\lambda x^\theta {.}M) &{} =\lambda x{.}\mathbf {erase}(M) &{} \mathbf {erase}(M \theta ) &{} =\mathbf {erase}(M) \end{array}$$

and \(\mathbf {erase}(x) =x\). Thus, \(\mathbf {erase}(M)\) is an untyped \(\lambda \)-term. Below we overload \(=_{\beta \eta }\) to also mean \(\beta \eta \)-equality in the untyped \(\lambda \)-calculus.
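As an executable illustration, here is a minimal Haskell sketch of \(\mathbf {erase}\) over a toy representation of System F syntax (the datatypes Ty, Tm, UTm and their constructors are our own, illustrative names):

```haskell
-- Toy System F syntax and untyped λ-terms.
data Ty = TyVar String | Arrow Ty Ty | Forall String Ty

data Tm = Var String | Lam String Ty Tm | TLam String Tm
        | App Tm Tm | TApp Tm Ty

data UTm = UVar String | ULam String UTm | UApp UTm UTm

erase :: Tm -> UTm
erase (Var x)     = UVar x
erase (Lam x _ m) = ULam x (erase m)          -- erase(λx^θ.M) = λx. erase(M)
erase (TLam _ m)  = erase m                   -- erase(ΛX.M)  = erase(M)
erase (App m n)   = UApp (erase m) (erase n)  -- erase(M N)   = erase(M) erase(N)
erase (TApp m _)  = erase m                   -- erase(M θ)   = erase(M)
```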

Definition 4

Given terms \(\varDelta ;\varGamma \vdash M_1,M_2 : \theta \), we say that they are Strachey equivalent if \(\mathbf {erase}(M_1) =_{\beta \eta }\mathbf {erase}(M_2)\).

It was conjectured in [1, 20] that Reynolds equivalence includes Strachey equivalence. We prove this holds for the version of Reynolds equivalence given in Definition 1.

Theorem 5

Any two Strachey equivalent terms are also Reynolds equivalent.

It is instructive to see why a direct approach to proving this conjecture would not work. Given Strachey equivalent terms \(M_1,M_2\) of type \(\mathbf {Bool}\), suppose we want to prove them Reynolds equivalent. We therefore take \((\theta _1,\theta _2,R) \in \mathrm {Rel}\), \((N_{1,1},N_{2,1}) \in R\), and \((N_{1,2},N_{2,2}) \in R\), and aim to prove that \((M_1 \theta _1 N_{1,1} N_{1,2}, M_2 \theta _2 N_{2,1} N_{2,2}) \in R\). Ideally, we would like to prove that there exists \(j \in \{1,2\}\) s.t. for all \(i \in \{1,2\}\), \(M_i \theta _i N_{i,1} N_{i,2} =_{\beta \eta }N_{i,j}\), but that seems overly optimistic. A first trick is to use Theorem 2 to get that \(M_2\) is related to itself. Thus, we get that \((M_2 \theta _1N_{1,1} N_{1,2}, M_2 \theta _2 N_{2,1} N_{2,2}) \in R\), and it would suffice to prove \(M_1 \theta _1N_{1,1} N_{1,2} =_{\beta \eta }M_2 \theta _1N_{1,1} N_{1,2}\) to conclude. However, our hypothesis is simply that \(\mathbf {erase}(M_1) =_{\beta \eta }\mathbf {erase}(M_2)\).

A possible solution to the above could be to \(\beta \)-reduce both \(M_i \theta _1N_{1,1} N_{1,2}\), hoping that the distinction between the two terms will vanish. Our trace semantics provides a way to model the interaction between such a term \(M_i\) and a context \(\bullet \, \theta _jN_{j,1} N_{j,2}\), and to deduce properties about the normal form reached by their application via head reduction.

3 A Nominal Trace Semantics for System F

In this section we introduce a trace semantics for open terms, which will be our main vehicle of study for System F. The terms in our semantics are allowed to contain special constants representing any term that could fill in their open variables (be these term or type variables). The use of names can be seen as a nominal approach to parametricity: parametric types and values are represented in our semantics by names, which have no internal structure. Thus, e.g. a parametric function behaves “the same way” on any input, since the latter is nothing but a name.

Our approach follows the line of work on nominal techniques [7, 19] and nominal operational game semantics [12, 14]. We let the set of names be:

$$ \mathsf {N}=\mathsf {TN}\uplus \mathsf {CN}$$

We therefore use two kinds of names: type names \(\alpha ,\beta \in \mathsf {TN}\); and computation names \(c,d \in \mathsf {CN}\). We will range over arbitrary names by a and variants. We extend the syntax of terms and types by including computation and type names as constants, and call the resulting syntax namey terms and types:

$$ M,N\, {:}{:}=\ c\mid x~|~\lambda x^\theta {.}M ~|~\varLambda X{.}M ~|~M N ~|~M \theta \qquad \theta ,\theta '\, {:}{:}=~\alpha \mid X\mid \theta \rightarrow \theta '\mid \forall X{.}\theta $$

A namey term or type is closed if it contains no free (type/term) variables – but it may contain names. On the other hand, a value is a closed term in head normal form that contains no names. We range over values with v and variants.

We will use the notation \(\hat{M},\hat{N}\), and variants, to refer jointly to namey terms and namey types. Namey terms are typed with additional typing hypotheses for the added constants. These typings are made explicit in the trace model. By abuse of terminology, we will drop the adjective “namey” and refer to the above simply as “terms” and “types”. Formally speaking, namey terms and types form nominal sets (cf. Definition 8).

Note 6

(what do c’s and \(\alpha \)’s represent?). A computation name c represents a term that can replace the open variables of a term M. That is, in order to examine the semantics of \(\lambda x^\theta {.}M\), we will look instead at \(M\{c/x\}\), where c is a computation name of appropriate type. Type names \(\alpha \) serve a similar purpose, for types.

Our trace semantics is built on top of head reduction, which we recall next. Moreover, we shall be using types in extended form, which determines the number and types of arguments needed in order to fully apply a term of a given type.

Definition 7

The (standard) head reduction rules are given in Fig. 2. Head normal forms are given by the syntax on the LHS below,

$$ M_{\mathsf {hnf}}\, {:}{:=} \,E[x]\mid {E[c]}\mid \lambda x^\theta {.}M_\mathsf{hnf}\mid \varLambda X{.}M_\mathsf{hnf}\qquad \quad E\, {:}{:=}\ \bullet ~|~E M ~|~E \theta $$

where E ranges over evaluation contexts (defined on the RHS). Evaluation contexts are typed with types of the form \(\theta \rightsquigarrow \theta '\). We write \(E:\theta \rightsquigarrow \theta '\) if we can derive \(\bullet :\theta \vdash E: \theta '\).

An extended type form is a sequence \((\tau _1,...,\tau _n,\xi )\) with \(\xi \in \mathsf {TVar}\cup \mathsf {TN}\) and, for each i, \(\tau _i \in \mathsf {Type}\cup \{\forall X~|~X\in \mathsf {TVar}\}\). Formally, the extended form of a type \(\theta \), written \(\mathrm {ext}(\theta )\), is defined by:

$$ \mathrm {ext}(\forall X{.}\theta ) =(\forall X)\,{:}{:}\,\mathrm {ext}(\theta ) \qquad \mathrm {ext}(\theta \rightarrow \theta ') =\theta \,{:}{:}\,\mathrm {ext}(\theta ') \qquad \mathrm {ext}(\xi ) =(\xi ) $$

where we write \(h\,{:}{:}\,t\) for the sequence with head h and tail t (cf. list notation). Elements of the form \(\forall X\) in these sequences are binders that bind to their right.
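For example, for \(\mathbf {Bool}=\forall X.\,X\rightarrow X\rightarrow X\) we compute:

$$ \mathrm {ext}(\mathbf {Bool}) = (\forall X)\,{:}{:}\,\mathrm {ext}(X\rightarrow X\rightarrow X) = (\forall X)\,{:}{:}\,X\,{:}{:}\,X\,{:}{:}\,\mathrm {ext}(X) = (\forall X,X,X,X) $$

so a term of type \(\mathbf {Bool}\) is fully applied by providing one type and two terms of that type.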

Fig. 2. Head reduction rules. Condition (\(*\)) stipulates that M is not a \(\varLambda /\lambda \)-abstraction.

We let \(\rightarrow ^{*}\) be the reflexive-transitive closure of \(\rightarrow \). It is a standard result that \(\rightarrow ^{*}\) preserves typing and (strongly) normalises to head normal forms.
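To make the reduction strategy concrete, the following Haskell sketch implements one-step head reduction in the sense of Definition 7 over a toy System F syntax (all names are ours; the syntax from the erasure sketch is repeated so the block stands alone, and substitution is kept naive rather than capture-avoiding, which suffices for closed examples):

```haskell
data Ty = TyVar String | Arrow Ty Ty | Forall String Ty deriving Show
data Tm = Var String | Lam String Ty Tm | TLam String Tm
        | App Tm Tm | TApp Tm Ty deriving Show

substTm :: String -> Tm -> Tm -> Tm            -- M{N/x}, naive
substTm x n (Var y)      = if x == y then n else Var y
substTm x n (Lam y t m)  = if x == y then Lam y t m else Lam y t (substTm x n m)
substTm x n (TLam a m)   = TLam a (substTm x n m)
substTm x n (App m1 m2)  = App (substTm x n m1) (substTm x n m2)
substTm x n (TApp m t)   = TApp (substTm x n m) t

substTy :: String -> Ty -> Ty -> Ty            -- θ'{θ/X}
substTy a t (TyVar b)     = if a == b then t else TyVar b
substTy a t (Arrow t1 t2) = Arrow (substTy a t t1) (substTy a t t2)
substTy a t (Forall b t') = if a == b then Forall b t' else Forall b (substTy a t t')

substTyTm :: String -> Ty -> Tm -> Tm          -- M{θ/X}
substTyTm _ _ (Var y)      = Var y
substTyTm a t (Lam y ty m) = Lam y (substTy a t ty) (substTyTm a t m)
substTyTm a t (TLam b m)   = if a == b then TLam b m else TLam b (substTyTm a t m)
substTyTm a t (App m1 m2)  = App (substTyTm a t m1) (substTyTm a t m2)
substTyTm a t (TApp m ty)  = TApp (substTyTm a t m) (substTy a t ty)

-- One step of head reduction; Nothing signals a head normal form.
step :: Tm -> Maybe Tm
step (App (Lam x _ m) n) = Just (substTm x n m)     -- (λx.M) N → M{N/x}
step (TApp (TLam a m) t) = Just (substTyTm a t m)   -- (ΛX.M) θ → M{θ/X}
step (Lam x t m)         = Lam x t <$> step m       -- reduce under λ
step (TLam a m)          = TLam a  <$> step m       -- reduce under Λ
step (App m n)           = (`App` n)  <$> step m    -- (*): M not an abstraction
step (TApp m t)          = (`TApp` t) <$> step m
step _                   = Nothing

hnf :: Tm -> Tm                                     -- iterate to head normal form
hnf m = maybe m hnf (step m)
```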

We finally introduce some infrastructure for working with objects with names.

Definition 8

We call a permutation \(\pi :\mathsf {N}\rightarrow \mathsf {N}\) finite if the set \(\{a~|~\pi (a)\ne a\}\) is finite, and component-preserving if, for all \(a\in \mathsf {N}\), \(a\in \mathsf {TN}\) iff \(\pi (a)\in \mathsf {TN}\).

A nominal set [7] is a pair \((Z,*)\) of a set Z along with an action \((*)\) of the set of finite component-preserving permutations of \(\mathsf {N}\) on the set Z. For each \(z\in Z\), the set of names featuring in z forms its support, written \(\nu (z)\), which we stipulate to be finite.

In the sequel, when constructing objects with names (such as moves or traces) we shall implicitly assume that these form nominal sets, where the permutation action is defined by taking \(\pi *z\) to be the result of applying \(\pi \) to each name in z.

3.1 Trace Semantics Preview

Before formally presenting the trace model, we look at some examples informally, postponing the full details for the next section. Head-reduction brings terms into head normal form. The trace semantics allows us to further ‘reduce’ terms of the form \(E[c\hat{M}_1\cdots \hat{M}_n]\), where c is some computation name. For such a term, following the game semantics approach [3, 11], our model will issue a move interrogating the computation c on arguments \(\hat{M}_i\), and putting E on top of an evaluation stack, denoted \(\mathcal {E}\). The move is effectively a call to c, and \(\mathcal {E}\) functions as a call stack which registers the calls that have been made and are still pending. This will effectively lead to a labelled transition system in which labels are moves issued by two parties: a Player (P), representing the modelled term, and an Opponent (O) representing its enclosing term context.

Traces are sequences of moves, which in turn are tuples of names belonging to one of these four classes, taking \(c\in \mathsf {CN}\) and \(a_i\in \mathsf {N}\) for each i:

  • Player questions \(\bar{c}( a_1,...,a_n )\) (also P-questions),

  • Opponent questions \(c ( a_1,...,a_n )\) (also O-questions),

  • PO-answers \(\overline{\mathsf {OK}}\), and OP-answers \(\mathsf {OK}\).

Given a question move as above, we let its core name be c. We distinguish a computation name \(c_\mathrm{in}\in \mathsf {CN}\), and call questions with core name \(c_\mathrm{in}\) initial. We define a trace T to be a finite sequence of moves. Traces will be restricted to legal ones in Definition 12.

In the following examples we give traces produced by simple System F terms. Traces are formally produced by an LTS over configurations whose main component is an evaluation stack. An evaluation stack is a stack whose elements are typed evaluation contexts, apart from the top element which can also be a typed term:

We denote the empty stack with \(\lozenge \). In the next two examples, for simplicity, configurations shall only contain evaluation stacks.

Example 9

Recall that \(\mathbf {id}=\varLambda X{.}\lambda x^X.\,x:\mathbf {Unit}\) and \(\mathbf {Unit}=\forall X{.}X\rightarrow X\). The extended type of \(\mathbf {Unit}\), \(\mathrm {ext}(\mathbf {Unit})=(\forall X,X,X)\), indicates that \(\mathbf {id}\) requires two arguments in order to be evaluated: one type and one term of that given type. Thus, the traces produced by \(\mathbf {id}\) will start with an interrogating/calling move \(c_\mathrm{in}(\alpha ,c)\) of O:

  • \(c_\mathrm{in}\) is the computation name assigned (by convention) to the term being evaluated (in this case, \(\mathbf {id}\));

  • \(\alpha ,c\) are names abstracting the actual type and term arguments which \(\mathbf {id}\) is called on. It is assumed that c is of type \(\alpha \).

Starting from the initial move \(c_\mathrm{in}(\alpha ,c)\), a trace of \(\mathbf {id}\) can be produced as follows:

Thus, O starts the interaction by interrogating \(\mathbf {id}\) with \(\alpha ,c\). This results in \(\mathbf {id}\,\alpha \,c\), which head-reduces to c. At this point, c is a head normal form of type \(\alpha \), and P can answer the initial question \(c_\mathrm{in} ( \alpha ,c )\). This is done in two steps. First, P further reduces c by playing a move \(\bar{c}( )\) (here c takes 0 arguments as \(\mathrm {ext}(\alpha )=(\alpha )\)), and pushes the current evaluation context on the stack. O then responds by triggering a pair of answers, which answer both questions played so far and complete the trace.

Note 10

(what are \(\mathsf {OK}\) and \(\overline{\mathsf {OK}}\)?). As System F base types are type variables, there is no real need for answer moves: a type X has no return values. For example, in the game models of Hughes [9] and Laird [15], answer moves were effectively suppressed (either explicitly, or by allowing moves \(c(\cdots )\) to function as answers). Here, to give the semantics an operational flavour, we introduce instead explicit ‘dummy’ answers \(\mathsf {OK}\).

Example 11

Consider now \(M=\lambda f^\mathbf {Unit}\!.\,f:\mathbf {Unit}\rightarrow \mathbf {Unit}\). We have that \(\mathrm {ext}(\mathbf {Unit}\rightarrow \mathbf {Unit})=(\mathbf {Unit},\forall X,X,X)\), and therefore M requires three arguments for its evaluation: one term of type \(\mathbf {Unit}\), one type, and one term of that latter type. We can therefore start a trace of M with an initial move \(c_\mathrm{in} ( c_1,\alpha _1,c_2 )\) and continue as follows.

Thus, the initial move leads to \(Mc_1\alpha _1c_2\), which in turn reaches the hnf \(c_1\alpha _1c_2\), with \(c_1:\mathbf {Unit}\), and at that point P needs to invoke \(c_1\) with arguments \(\alpha _1\) and \(c_2\). These are abstracted away by fresh names \(\alpha _2\) and \(c_3\) respectively, which are passed as arguments to \(c_1\). \(c_3\) in particular has type \(\alpha _2\). The result of this invocation will be of type \(\alpha _2\), which is the hole type of the evaluation context stored on the stack. O can only produce a term of type \(\alpha _2\) by simply returning \(c_3\). Similarly to before, this is done in two steps: O plays \(c_3 ( )\), which brings \(c_2\) (the term represented by \(c_3\)) to the top of the stack, which in turn triggers a pair of answers and brings \(c_2\) inside the stored context.

The latter step leaves us with \((c_2,\alpha _1)\), which reaches \(\lozenge \) as in the previous example.

3.2 Definition of the LTS

We now proceed with the formal definition of the trace semantics. We start off with a series of definitions setting the conditions for a trace to be legal.

The names appearing in a trace are owned by whoever introduces them. A move m introduces a name a in a trace T if m is a question \(q ( \vec a )\) with \(a_i=a\) for some i. For each \(A\in \{O,P\}\), we let the set of names of T that are owned by A be:

$$ A(T) = \{ a \in \mathsf {N}~|~\exists m.\ m\text { is an } A \text {-question in }T \wedge m\text { introduces }a\}. $$

We will be referring to the names appearing in A(T) as A-names.

Each move in a trace needs to be justified, i.e. depend on an earlier move (unless the move is initial). Justification is defined in different ways for questions and answers. Given a trace T and two moves \(m,m'\) in T, we say that \(m'\) justifies m when \(m'\) is before m in T and:

  • m is a question with core name c and \(m'\) introduces c, or

  • m is an answer which answers \(m'\) (and \(m'\) is a question).

Answering of questions is defined as follows. Each answer (occurrence) m answers the pair of question moves \((m_1,m_2)\) containing the last two question moves in T which are before m and have not been answered yet.

We can now define legality conditions for traces. Below, for \(A\in \{O,P\}\), we say that a move is A-starting if it is an A-question or an \(AA^\bot \)-answer (where \(O^\bot =P\) and \(P^\bot =O\)). Similarly, a move is A-ending if it is either an A-question or an \(A^\bot A\)-answer.

Definition 12

A trace T is said to be legal when, for each \(A\in \{O,P\}\):

  1. A-ending moves can only be followed by \(A^\bot \)-starting moves;

  2. all moves in T are justified, apart from the first move which must be initial;

  3. apart from \(c_\mathrm{in}\), every name of T is introduced exactly once in it;

  4. for each A-question with core name \(c\ne c_\mathrm{in}\), we have \(c\in A^\bot (T)\);

  5. if an \(A{A}^\bot \)-answer answers \((m,m')\) then these are A- and \(A^\bot \)-questions respectively.

The conditions above can be given names (suggesting their purpose) as follows: 1. alternation, 2. justification, 3. well-introduction, 4. well-calling, 5. well-answering.

Each trace T has a complement, which we denote \(T^\bot \) and which is obtained from T by switching \(O{\slash }P\) in all of its moves (i.e. each \(c ( \vec a )\) becomes \(\bar{c}( \vec a )\) and vice versa, and each OP-answer becomes a PO-answer and vice versa). T is legal iff \(T^\bot \) is.
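The following Haskell sketch (with our own, illustrative names) records the shape of moves, the complement operation, and the alternation condition (1) of Definition 12; the remaining legality conditions, which need justification pointers, are omitted:

```haskell
data Party = O | P deriving (Eq, Show)

data Move
  = Question Party String [String]  -- c(a1,...,an) by O, or c̄(a1,...,an) by P
  | Answer   Party                  -- an OP-answer (Answer O) or a PO-answer (Answer P)
  deriving (Eq, Show)

type Trace = [Move]

opp :: Party -> Party
opp O = P
opp P = O

-- T⊥: switch O/P in every move.
complement :: Trace -> Trace
complement = map flipMove
  where
    flipMove (Question a c as) = Question (opp a) c as
    flipMove (Answer a)        = Answer (opp a)

-- A question both starts and ends with its owner; an AA⊥-answer is A-starting
-- and A⊥-ending.
startsWith, endsWith :: Move -> Party
startsWith (Question a _ _) = a
startsWith (Answer a)       = a
endsWith   (Question a _ _) = a
endsWith   (Answer a)       = opp a

-- Condition 1 of Definition 12: an A-ending move is followed by an A⊥-starting one.
alternating :: Trace -> Bool
alternating t = and (zipWith ok t (drop 1 t))
  where ok m m' = startsWith m' == opp (endsWith m)
```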

Traces are produced by use of a labelled transition system (LTS), which has moves as labels and configurations as nodes. Each configuration contains an evaluation stack of terms and evaluation contexts that need to be evaluated, as well as mappings containing type/term information on the names that have appeared so far. We introduced evaluation stacks in the previous section. Here we shall restrict their allowed shapes as follows. We let passive and active evaluation stacks be defined by the following two grammars respectively, and take evaluation stacks to be \(\mathcal {E}\,{:}{:}\!=\mathcal {E}_\mathsf{pass}~|~\mathcal {E}_\mathsf{actv}\),

where \(\theta \) ranges over closed types with \(\nu (\theta )=\varnothing \), and \(\lozenge \) is the empty stack.

The other two components of configurations will be maps \(\gamma \) and \(\phi \) of the shape:

$$ \gamma \in (\mathsf {CN}{\rightharpoonup }(\mathsf {Term}\times \mathsf {Type}))\otimes (\mathsf {TN}{\rightharpoonup }(\mathsf {Type}\times \{\mathcal {U}\})),\;\;\; \phi \in (\mathsf {CN}{\rightharpoonup }\mathsf {Type})\otimes (\mathsf {TN}{\rightharpoonup }\{\mathcal {U}\}), $$

with \(F\otimes G=\{f\cup g~|~f\in F\wedge g\in G\}\). \(\mathcal {U}\) is a special “universe” symbol that represents the type of types – it is only used for convenience. Then, in words:

  • \(\gamma \) assigns term-type pairs to computation names, and type-\(\mathcal {U}\) pairs to type names,

  • \(\phi \) assigns types to computation names, and \(\mathcal {U}\) to type names.

The role of a map \(\gamma \) is to abstract away terms to computation names, and types to type names. On the other hand, a map \(\phi \) simply types names. In the LTS, when P wants to interrogate an O-computation name c with some arguments, they will abstract away the actual arguments to names, record the abstraction in \(\gamma \), and call c on these names. On the other hand, when O interrogates a P-computation name c with some move \(c ( \vec a )\), we will record in \(\phi \) the types of the (new!) O-names \(\vec a\).

Fig. 3. Reduction rules for the LTS.

The abstraction of arguments to names is instrumented by a dedicated operation \(\mathsf {AVal}\). This operation assigns to each sequence \(((\hat{M}_1,\tau _1),...,(\hat{M}_n,\tau _n),\xi )\), where \((\tau _1,...,\tau _n,\xi )\) is an extended type (i.e. the type of the computation name we want to call) and each \(\hat{M}_i\) is a closed term or type (the i-th argument), a set of triples of the form \((\vec a,\gamma ,\beta )\) where:

  • \(\vec a\) is a sequence \((a_1,...,a_n)\) of names (abstracting each of the arguments \(\hat{M}_i\)),

  • \(\gamma \) is a map as above, with domain \(\{a_1,...,a_n\}\),

  • \(\beta \) is the result type one gets after applying each \(a_i\) for each \(\tau _i\).

The operator is formally defined next. In the same definition we introduce the semantics of types, \([\![\theta ]\!]\), as sets of triples of the form \((\vec a,\phi ,\beta )\), which represent all possible input-output name tuples \((\vec a,\beta )\) that are allowed for \(\theta \), including their typing \(\phi \).

Definition 13

Given a closed type \(\theta \) (which may contain type names), we let its semantics be \([\![\theta ]\!]=[\![\mathrm {ext}(\theta ) ]\!]\), where the latter is defined inductively by:

$$\begin{aligned}{}[\![(\alpha ) ]\!]&=\{(\varepsilon ,\varepsilon ,\alpha )\} \\ [\![\theta \,{:}{:}\,L ]\!]&=\{((c,\vec {a}),\phi \cdot [c \mapsto \theta ],\alpha ) ~|~c \in \mathsf {CN}, (\vec {a},\phi ,\alpha ) \in [\![L ]\!]\} \\ [\![\forall X\,{:}{:}\,L ]\!]&=\{((\beta ,\vec {a}),\phi \cdot [\beta \mapsto \mathcal {U}],\alpha ) ~|~\beta \in \mathsf {TN}, (\vec {a},\phi ,\alpha ) \in [\![L\{\beta / X \} ]\!]\} \end{aligned}$$

On the other hand, to each sequence \(((\hat{M}_1,\tau _1),...,(\hat{M}_n,\tau _n),\xi )\) we assign a set of abstract values \(\mathsf {AVal}(((\hat{M}_1,\tau _1),...,(\hat{M}_n,\tau _n),\xi ))\) inductively by:

$$\begin{aligned} \mathsf {AVal}((\alpha ))&=\{(\varepsilon ,\varepsilon ,\alpha )\} \\ \mathsf {AVal}((M,\theta )\,{:}{:}\,L)&=\{((c,\vec {a}),\gamma \cdot [c \mapsto (M,\theta )],\alpha ) ~|~c \in \mathsf {CN}, (\vec {a},\gamma ,\alpha ) \in \mathsf {AVal}(L)\} \\ \mathsf {AVal}((\theta ,\forall X)\,{:}{:}\,L)&=\{((\beta ,\vec {a}),\gamma \cdot [\beta \mapsto (\theta ,\mathcal {U})],\alpha ) ~|~\beta \in \mathsf {TN}, (\vec {a},\gamma ,\alpha ) \in \mathsf {AVal}(L\{\beta / X \})\} \end{aligned}$$
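As a worked instance, since \(\mathrm {ext}(\mathbf {Unit})=(\forall X,X,X)\), the definition gives

$$ [\![\mathbf {Unit} ]\!] = \{((\alpha ,c),\,[c\mapsto \alpha ]\cdot [\alpha \mapsto \mathcal {U}],\,\alpha ) ~|~\alpha \in \mathsf {TN},\ c\in \mathsf {CN}\} $$

matching the initial moves \(c_\mathrm{in} ( \alpha ,c )\) of Example 9, where c is assigned type \(\alpha \) and the expected result type is \(\alpha \).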

Both \(\phi \) and \(\gamma \) are finite partial functions whose domains are sets of names. For such maps, the extension notation used e.g. in \(\phi \cdot [c\mapsto z]\) (for appropriate z) means fresh extension: \(\phi \cdot [c\mapsto z]=\phi \cup \{(c,z)\}\), provided that \(c\notin \mathrm {dom}(\phi )\). This notation is extended to whole maps: e.g. \(\phi \cdot \phi '=\phi \cup \phi '\), provided that \(\mathrm {dom}(\phi )\cap \mathrm {dom}(\phi ')=\varnothing \). Moreover, for each map \(\gamma \) we write \(\mathsf {fst}(\gamma )\) for its first projection: \(\mathsf {fst}(\gamma )= \{(a,\hat{M})~|~\gamma (a)=(\hat{M},\_)\}\). Similarly, its second projection is given by: \(\mathsf {snd}(\gamma )= \{(a,Z)~|~\gamma (a)=(\_,Z)\}\).

Definition 14

A configuration is a triple \(\langle \mathcal {E},\gamma ,\phi \rangle \) where \(\mathcal {E}\) is an evaluation stack and \(\gamma \) and \(\phi \) are as above. The reduction rules of the LTS are given in Fig. 3. We write \(\mathsf {Tr}(C)\) for the set of traces generated by a configuration C.

Given a typed term \(\varDelta ; \varGamma \vdash M : \theta \), with \(\varDelta = \{X_1,\ldots ,X_n\}\), \(\varGamma = \{x_1:\theta _1,\ldots ,x_m:\theta _m\}\), we set \(\langle \varDelta ; \varGamma \vdash M : \theta \rangle =\langle \lozenge ,[c_\mathrm{in}\mapsto (\widetilde{M},\widetilde{\theta })],\varepsilon \rangle \) and

$$ [\![\varDelta ; \varGamma \vdash M : \theta ]\!]=\{ T\in \mathsf {Tr}(\langle \varDelta ; \varGamma \vdash M : \theta \rangle ) ~|~T\text { has at most one initial move } \} $$

where \(\widetilde{\theta } =\forall X_1. \ldots \forall X_n.\theta _1\rightarrow \cdots \rightarrow \theta _m \rightarrow \theta \) and \(\widetilde{M} =\varLambda X_1{.} \ldots \varLambda X_n.\lambda x_1^{\theta _1}. \ldots \) \(\lambda x_m^{\theta _m}{.}M\).

A configuration is active (resp. passive) if its evaluation stack is so. An active configuration stands for a term being computed and it may only produce P-moves. A passive configuration, on the other hand, stands for a scenario where O is next to play. Moreover, the map \(\phi \) in a configuration contains information on the O-names that have been played, i.e. \(\mathrm {dom}(\phi )\) contains O-names, while \(\mathrm {dom}(\gamma )\) contains P-names.

To better grasp Fig. 3 let us consider an initial configuration \(\langle \lozenge ,[c_\mathrm{in}\mapsto ({M},{\theta })],\varepsilon \rangle \) and look at its traces, for some closed term M (so no need for \(\widetilde{M},\widetilde{\theta }\)) with empty support.

  • At the beginning, the only rule that can be applied is (OQ\(_0\)), whereby O interrogates the term M by issuing a move \(c_\mathrm{in} ( \vec a )\). The names \(\vec a\) are selected from \([\![\theta ]\!]\) and represent arguments that O fully applies the term M on. Since \(\theta \) has empty support, its extended form is of the shape \((\tau _1,...,\tau _n,X)\) with \(X\) bound by one of the \(\tau _i\)’s. Consequently, when the names \(a_1,...,a_n\) are applied for \(\tau _1,...,\tau _n\), the variable \(X\) will be replaced by some type name \(\alpha \). The rule makes this explicit, by requiring that \((\vec a,\phi ',\alpha )\in [\![\theta ]\!]\). Thus, writing \(\phi _0\) instead of \(\phi '\) and setting \(\gamma _0=[c_\mathrm{in}\mapsto ({M},{\theta })]\), the transition brings us to a configuration \(\langle [(M\vec a,\alpha )],\gamma _0,\phi _0\rangle \), where \(\mathrm {dom}(\phi _0)=\{a_1,...,a_n\}\).

  • At this point, the term \(M\vec a\) can be reduced using head reduction and brought to head normal form. Applying the (INT) rule we reach some \(\langle [(E[c\hat{M}_1\cdots \hat{M}_k],\alpha )],\gamma _0,\phi _0\rangle \).

  • We next interrogate the computation name c. The latter must have come from the \(a_1,...,a_n\) that were applied to M, hence is an O-name. To interrogate it, P plays a question \(\bar{c}( \vec a' )\), using the (PQ) rule and assuming \((\vec a',\gamma ',\alpha ')\in \mathsf {AVal}(((\hat{M}_1,\tau _1'),...,(\hat{M}_k,\tau _k'),\xi ))\), \(\phi _0(c)=\theta '\), \(\mathrm {ext}(\theta ')=(\tau _1',...,\tau _k',\xi )\). This leads to the passive configuration \(\langle [(E,\alpha '\rightsquigarrow \alpha )],\gamma _1,\phi _0\rangle \) (where \(\gamma _1=\gamma _0\cdot \gamma '\)).

  • We are now at a passive configuration, where E has been stored on the stack and O is required to produce a response of type \(\alpha '\). By definition of \(\mathsf {AVal}\), either \(\alpha '=\alpha \) or \(\alpha '\) is in \(a_1',...,a_k'\) and hence belongs to P. In the latter case, O can only produce such a response by calling back P, using rule (OQ), playing an O-question and adding a new term on the evaluation stack. In the former case, O would directly respond with a hnf of type \(\alpha \), say N. But, since the stored context is then typed \(\alpha \rightsquigarrow \alpha \) and therefore \(E=\bullet \), P would simply reply back playing N again. To avoid this copycat of hnf’s, we simply play an OP-answer and remove the top of the evaluation stack – this is what the (OA) rule achieves.

Fig. 4. Top: traces for two terms of type \(\mathbf {Unit}{\rightarrow }\mathbf {Unit}\). Bottom: traces for Church numeral \(M_k\).

Example 15

In Fig. 4 we include example traces for terms \(M_1,M_2:\mathbf {Unit}\rightarrow \mathbf {Unit}\) (taken from [1], Instance 3.25) and for the Church numerals \(M_k:\mathbf {Nat}\). The former pair is an instance of Theorem 21 – Strachey equivalence implies trace equivalence.

In our scenario above we started from a passive configuration with empty stack and a singleton \(\gamma \). A different way to produce a trace is to start from an active configuration with a stack containing only a term \(E[c_\mathrm{in}\hat{M}_1\cdots \hat{M}_n]\), in which case the rule (PQ\(_0\)) would commence the trace. More generally, we call a configuration C with stack \(\mathcal {E}\):

  • a term configuration, if \(\mathcal {E}=\lozenge \) or the bottom element of \(\mathcal {E}\) has type \(\alpha \) or \(\alpha '\rightsquigarrow \alpha \);

  • a context configuration, if the bottom of \(\mathcal {E}\) has type \(\theta \) or \(\alpha \rightsquigarrow \theta \), and \(\theta \) is a closed type with empty support.

Each reduction sequence in the LTS can only contain either term or context configurations. In our discussion above and in Example 15 we examine the semantics of terms, and therefore use term configurations. In later sections, when we shall start looking at the semantics of contexts, we will be using context configurations as well.

While we have not defined leaves for our LTS, there is a natural notion of a trace being “completed”. In particular, we call a trace T complete if all its questions have been answered. We write \(\mathsf {CTr}(C)\) for the set of complete traces generated from C. Term and context configurations can both produce complete traces. Given a term configuration C and a complete trace T, we write \(C\Downarrow _{T}\) if \(C\xrightarrow {T}C'\) and \(C'\) has an empty evaluation stack. On the other hand, given a context configuration C, a complete trace T and a value v, we write \(C\Downarrow _{T,v}\) if \(C\xrightarrow {T}C'\) and \(C'\) has an evaluation stack with a single element \((v,\theta )\).

Lemma 16

Given a term configuration C and \(T \in \mathsf {Tr}(C)\), then T is complete iff \(C \Downarrow _{T}\).

We conclude this section by looking at some restrictions characterising actual configurations. We first extend \(\mathsf {fst}\) to evaluation stacks by: \(\mathsf {fst}(\lozenge )=\lozenge \) and \(\mathsf {fst}((Z,\_)\,{:}{:}\,\mathcal {E})=Z\,{:}{:}\,\mathsf {fst}(\mathcal {E})\).

Definition 17

A configuration \(\langle \mathcal {E},\gamma ,\phi \rangle \) is said to be legal when:

  • \(\mathrm {dom}(\gamma ) \cap \mathrm {dom}(\phi ) = \varnothing \) and \(\nu (\mathsf {fst}(\mathcal {E})) \cup \nu ({\mathrm {cod}(\mathsf {fst}(\gamma ))}) \subseteq \mathrm {dom}(\phi )\);

  • for all \(c \in \mathrm {dom}(\gamma ) \cap \mathsf {CN}\), given \(\gamma (c)=(M,\theta )\), we have \(\varDelta _{\phi }; \varGamma _{\phi ,\gamma } \vdash M:\theta \{\gamma _v\}\);

  • if the top of \(\mathcal {E}\) is \((M,\theta )\), then \(\varDelta _{\phi }; \varGamma _{\phi ,\gamma } \vdash M:\widetilde{\theta }\) with either \(\theta = \alpha \in \mathrm {dom}(\gamma )\) and \(\gamma (\alpha ) = (\widetilde{\theta },\mathcal {U})\), or \(\theta = \alpha \in \mathrm {dom}(\phi )\) and \(\widetilde{\theta } = \theta \), or \(\theta = \widetilde{\theta }\) is a closed type with empty support and \(\mathcal {E}=[(M,\theta )]\);

  • If \(\mathcal {E}= (M,\alpha _1)\,{:}{:}\,(E,\alpha _2 \rightsquigarrow \theta )\,{:}{:}\,\mathcal {E}'\), either \(\alpha _1 =\alpha _2\) or \(\alpha _1 \in \mathrm {dom}(\phi )\);

  • for all \((E,\alpha \rightsquigarrow \theta )\) in \(\mathcal {E}\) with \(\alpha \in \mathrm {dom}(\gamma )\), \(\varDelta _{\phi }; \varGamma _{\phi ,\gamma } \vdash E:\gamma _v(\alpha ) \rightsquigarrow \theta \), and either \(\theta = \alpha \in \mathrm {dom}(\phi )\) or \(\theta \) is a closed type with empty support, and \((E,\alpha \rightsquigarrow \theta )\) is at the bottom of \(\mathcal {E}\);

  • for all \((E,\alpha \rightsquigarrow \theta )\) in \(\mathcal {E}\) with \(\alpha \in \mathrm {dom}(\phi )\), we have \(\theta = \alpha \) and \(E = \bullet \);

where \(\varDelta _{\phi } = \mathrm {dom}(\phi )\cap \mathsf {TN}\) and \(\varGamma _{\phi ,\gamma } = \{(x,\theta \{\mathsf {fst}(\gamma )\})~|~(x,\theta )\in \phi \}\).

Lemma 18

If C is a legal configuration and \(C \xrightarrow {m} C'\) then \(C'\) is a legal configuration.

4 Parametricity in the Trace Model, and Proof of Theorem 5

We next examine the relationship between trace equivalence and the notions of Reynolds and Strachey equivalence. We prove that Strachey equivalence is included in trace equivalence (Theorem 21), which in turn is included in Reynolds equivalence (Theorem 28).

4.1 From Strachey to Trace Equivalence

Definition 19

Let \(C_i=\langle \mathcal {E}_i,\gamma _i,\phi _i\rangle \), for \(i=1,2\), be two configurations. We say that \(C_1\) and \(C_2\) are Strachey-equivalent when \(\mathcal {E}_1\) and \(\mathcal {E}_2\) have the same size, \(\mathrm {dom}(\gamma _1) = \mathrm {dom}(\gamma _2)\), \(\phi _1 = \phi _2\) and:

  • for all \(c \in \mathrm {dom}(\gamma _1)\), if \(\gamma _i(c)=(M_i,\theta _i)\) then \(\theta _1 = \theta _2\) and \(\mathbf {erase}(M_1)\! =_{\beta \eta }\!\mathbf {erase}(M_2)\);

  • if \((Z_i,\alpha _i)\) is the j-th element of \(\mathcal {E}_i\), then \(\alpha _1 = \alpha _2\) and \(\mathbf {erase}(Z_1) =_{\beta \eta }\mathbf {erase}(Z_2)\);

where \(E_1=_{\beta \eta }E_2\) just if \(E_1[x]=_{\beta \eta }E_2[x]\) for some/all fresh x.

The first inclusion can then be proven as follows.

Lemma 20

Given two Strachey-equivalent legal configurations \(C_1,C_2\), if \(C_1 \xrightarrow {m} C'_1\) for some \(m,C_1'\) then there is \(C_2 \xrightarrow {m} C'_2\) such that \(C'_1\) and \(C'_2\) are Strachey-equivalent.

Theorem 21

For all Strachey-equivalent \(\varDelta ;\varGamma \vdash M_1,M_2 : \theta \), we have \([\![M_1 ]\!] = [\![M_2 ]\!]\).

Proof

Taking \(T \in [\![\varDelta ; \varGamma \vdash M_1 : \theta ]\!]\), we prove that \(T \in [\![\varDelta ; \varGamma \vdash M_2 : \theta ]\!]\) by induction on the length of T, using the previous lemma.   \(\square \)

The inclusion above is strict. This is shown, for example, by the following terms \(M_{\mathbf {true}},M_{\mathbf {false}}:\mathbf {Unit}\rightarrow \mathbf {Unit}\), which are trace equivalent but not Strachey-equivalent:

$$ M_{\mathbf {b}} =\lambda f^{\mathbf {Unit}}{.}\varLambda X{.}\lambda x^{X}{.}\mathbf {snd}(f (\mathbf {Bool}\times X) \langle \mathbf {b},x\rangle )\quad (\mathbf {b}=\mathbf {true},\mathbf {false}) $$

Here we use the impredicative encoding of product types [8]: \(\theta _1 \times \theta _2=\forall X{.} (\theta _1 \rightarrow \theta _2 \rightarrow X) \rightarrow X\), \(\langle M, N \rangle =\varLambda X{.}\lambda f^{\theta _1 \rightarrow \theta _2 \rightarrow X}{.} f M N\) and \(\mathbf {snd}=\lambda x^{\theta _1 \times \theta _2}{.}x \theta _2 (\lambda y^{\theta _1}{.}\lambda z^{\theta _2}{.}z)\). Setting \(\gamma _0=[c_\mathrm{in}\mapsto (M_{\mathbf {b}},\mathbf {Unit}\rightarrow \mathbf {Unit})]\) and \(C_{\mathbf {b}}=\langle \cdot ;\cdot \vdash M_{\mathbf {b}} : \mathbf {Unit}\rightarrow \mathbf {Unit}\rangle \), we have:

and this is the only complete trace in \([\![M_{\mathbf {b}} ]\!]\). Indeed, O cannot interrogate another name, as \(c_\mathrm{in}\) can only be played once, and \(c'\) cannot be played with the (OQ\(_0\)) rule.
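For concreteness, the impredicative pair encoding and the terms \(M_{\mathbf {true}},M_{\mathbf {false}}\) can be transcribed into Haskell with RankNTypes as follows (Pair, mB and the other names are ours; the Bool/Unit encodings from the sketch in Section 2 are repeated so the block stands alone):

```haskell
{-# LANGUAGE RankNTypes #-}

newtype FUnit = FUnit { runUnit :: forall x. x -> x }
newtype FBool = FBool { runBool :: forall x. x -> x -> x }

ftrue, ffalse :: FBool
ftrue  = FBool (\x _ -> x)
ffalse = FBool (\_ y -> y)

-- θ1 × θ2 = ∀X.(θ1 → θ2 → X) → X
newtype Pair a b = Pair { runPair :: forall x. (a -> b -> x) -> x }

pairP :: a -> b -> Pair a b            -- ⟨M,N⟩ = ΛX.λf. f M N
pairP m n = Pair (\f -> f m n)

sndP :: Pair a b -> b                  -- snd = λp. p θ2 (λy.λz. z)
sndP p = runPair p (\_ z -> z)

-- M_b = λf^Unit. ΛX. λx^X. snd (f (Bool × X) ⟨b, x⟩)
mB :: FBool -> FUnit -> FUnit
mB b f = FUnit (\x -> sndP (runUnit f (pairP b x)))

mTrue, mFalse :: FUnit -> FUnit
mTrue  = mB ftrue
mFalse = mB ffalse
```

Up to Haskell's partiality, mTrue and mFalse cannot be told apart by applying them to arguments, reflecting the trace equivalence of \(M_{\mathbf {true}}\) and \(M_{\mathbf {false}}\); their erasures, however, differ (one mentions true, the other false).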

The other inclusion (trace equivalence included in Reynolds equivalence) is more challenging, and requires machinery relating the semantics of terms and of contexts to the semantics of their composition.

Fig. 5. Composite LTS.

4.2 Composite LTS

We let a composite configuration be a tuple \(\langle \mathcal {E}_P,\mathcal {E}_O,\gamma _P,\gamma _O\rangle \), where \(\gamma _P\) and \(\gamma _O\) are maps \(\gamma \) as above, \(\mathcal {E}_P\) is a term evaluation stack, and \(\mathcal {E}_O\) is a context evaluation stack. These configurations represent the interaction between a term and a context. The term-part in the interaction is played by \(\mathcal {E}_P\) and \(\gamma _P\), while the context-part by \(\mathcal {E}_O\) and \(\gamma _O\). As with ordinary configurations, we define an LTS for composite ones in Fig. 5. Given a composite configuration C, a trace T and a value v (hnf with empty support) we write \(C \Downarrow _{T,v}\) when \(C \xrightarrow {T} \langle \lozenge ,[(v,\theta )],\gamma _P,\gamma _O\rangle \).

Composite configurations allow us to compose a term and a context semantically: we essentially play the traces of one against the other. Another way to obtain a composite semantics is to work syntactically, i.e. by composing configurations and then executing the resulting term. This is defined next.

Definition 22

Given two evaluation stacks \((\mathcal {E}_P,\mathcal {E}_O)\), we build their merge (which may not always be defined) \(\mathcal {E}_P || \mathcal {E}_O\) inductively by \(\lozenge || [(M,\theta )] =M\) and:

$$ \begin{array}{rll} ((M,\alpha )\,{:}{:}\,\mathcal {E}_P) || ((E,\alpha \rightsquigarrow \theta )\,{:}{:}\,\mathcal {E}_O) &{} =\mathcal {E}_P || ((E[M],\theta )\,{:}{:}\,\mathcal {E}_O) \\ ((E,\alpha \rightsquigarrow \theta )\,{:}{:}\,\mathcal {E}_P) || ((M,\alpha )\,{:}{:}\,\mathcal {E}_O) &{} =((E[M],\theta )\,{:}{:}\,\mathcal {E}_P) || \mathcal {E}_O \\ \end{array} $$

When it is defined, we say that \(\mathcal {E}_P,\mathcal {E}_O\) are compatible. Then, a composite configuration \(C = \langle \mathcal {E}_P,\mathcal {E}_O,\gamma _P,\gamma _O\rangle \) is legal when \((\mathcal {E}_P,\mathcal {E}_O)\) are compatible and when both \(\langle \mathcal {E}_P,\gamma _P,\mathsf {snd}(\gamma _{O})\rangle \) and \(\langle \mathcal {E}_O,\gamma _O,\mathsf {snd}(\gamma _{P})\rangle \) are legal.
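The merge operation of Definition 22 can be sketched in Haskell as follows, keeping terms and evaluation contexts abstract: a context is represented simply by its plugging function, and the types recorded on the stacks are omitted (all names are ours):

```haskell
-- A stack entry is either a term or a context, the latter given by hole-plugging.
data Entry t = Term t | Ctx (t -> t)

merge :: [Entry t] -> [Entry t] -> Maybe t
merge []            [Term m]     = Just m                       -- ◊ || [(M,θ)] = M
merge (Term m : ep) (Ctx e : eo) = merge ep (Term (e m) : eo)   -- plug M into O's context
merge (Ctx e : ep)  (Term m : eo) = merge (Term (e m) : ep) eo  -- plug M into P's context
merge _             _            = Nothing                      -- incompatible stacks
```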

We now relate the reduction of a composite configuration with the head reduction of the merge of its two evaluation stacks. First, taking the two environments \(\gamma _P,\gamma _O\) of a legal composite configuration, we compute their closure \((\gamma _P \cdot \gamma _O)^{*}\) as follows. Setting \(\gamma ^0=\mathsf {fst}(\gamma _P \cdot \gamma _O)\), and \(\gamma ^i = \{(a,\hat{M}\{\gamma ^0\}) ~|~(a,\hat{M}) \in \gamma ^{i-1}\}\) \((i > 0)\), there is an integer n such that \(\nu ({\mathrm {cod}(\gamma ^n)}) = \varnothing \). We write \((\gamma _P \cdot \gamma _O)^{*}\) for the environment \(\gamma ^n\), for the least such n.

Theorem 23

Given a legal composite configuration \(C=\langle \mathcal {E}_P,\mathcal {E}_O,\gamma _P,\gamma _O\rangle \), then \(C \Downarrow _{T,v}\) iff \((\mathcal {E}_P || \mathcal {E}_O)\{(\gamma _{P}\cdot \gamma _{O})^{*}\} \rightarrow ^{*} v\).

Finally, we relate the LTS’s for composite configurations and ordinary configurations (Theorem 26). Combined with Theorem 23, this gives us a correlation between the traces of two compatible configurations and the head reduction we obtain once we merge their evaluation stacks.

Definition 24

Given legal configurations \(C_P = \langle \mathcal {E}_P,\gamma _P,\phi _P\rangle \) and \(C_O = \langle \mathcal {E}_O,\gamma _O,\phi _O\rangle \), we say that they are compatible when \(\mathcal {E}_P,\mathcal {E}_O\) are compatible, \(\mathsf {snd}(\gamma _{P}) = \phi _O\) and \(\mathsf {snd}(\gamma _{O}) = \phi _P\). For each pair \((C_P,C_O)\) of compatible configurations, we define their merge \(C_P \mathbin {\!{\wedge }\!\!\!{\wedge }\!} C_O\) as the composite configuration \(\langle \mathcal {E}_P,\mathcal {E}_O,\gamma _P,\gamma _O\rangle \).

Lemma 25

Taking \((C_P,C_O)\) a pair of compatible configurations, \(C_P \mathbin {\!{\wedge }\!\!\!{\wedge }\!} C_O \Downarrow _{T,v}\) iff \(C_P \Downarrow _{T}\) and \(C_O \Downarrow _{T^{\bot },v}\).

Theorem 26

Given \(C_{P,1},C_{P,2},C_O\) such that \(C_{P,1},C_O\) and \(C_{P,2},C_O\) are pairwise compatible and \(\mathsf {Tr}(C_{P,1}) = \mathsf {Tr}(C_{P,2})\), if \(C_{P,1} \mathbin {\!{\wedge }\!\!\!{\wedge }\!} C_{O} \Downarrow _{T,v}\), then \(C_{P,2} \mathbin {\!{\wedge }\!\!\!{\wedge }\!} C_{O} \Downarrow _{T,v}\).

Proof

From Lemma 25 we get \(C_{P,1} \Downarrow _{T}\) and \(C_{O} \Downarrow _{T^{\bot },v}\). Thus, \(T \in \mathsf {Tr}(C_{P,1})\) and hence \(T \in \mathsf {Tr}(C_{P,2})\). Lemma 16 then yields \(C_{P,2} \Downarrow _{T}\) and, from Lemma 25, \(C_{P,2} \mathbin {\!{\wedge }\!\!\!{\wedge }\!} C_{O} \Downarrow _{T,v}\).   \(\square \)

4.3 Proof of Theorem 5

Theorem 5 follows from Theorems 21 and 28. Theorem 28, which is proved below, shows that any trace equivalent terms are also Reynolds equivalent. This is achieved as follows. In the previous section we saw how to relate reductions of terms-in-context to the semantics of terms and contexts. Given trace equivalent terms \(M_1,M_2\), fully applied to related arguments, we obtain head reductions to values. These reductions can be decomposed into LTS reductions producing corresponding traces, for the terms and their argument terms (which form contexts). But, since the terms are trace equivalent, \(M_2\) can simulate the behaviour of \(M_1\) in the context of \(M_1\), and that allows us to show that the two composites reduce to the same value.

We start by extending logical relations to extended types with empty support. We define \(\mathcal {R}[\![\mathrm {ext}(\theta ) ]\!]_{\delta }\) by:

$$ \begin{array}{rll} \mathcal {R}[\![(X) ]\!]_{\delta } &{} =&{} \{R ~|~\delta (X) = (\_,\_,R) \} \\ \mathcal {R}[\![\theta \,{:}{:}\,L ]\!]_{\delta } &{} =&{} \{(M_1,N_1)\,{:}{:}\,L' ~|~(M_1,N_1) \in \mathcal {R}[\![\theta ]\!]_{\delta } \wedge L' \in \mathcal {R}[\![L ]\!]_{\delta } \} \\ \mathcal {R}[\![\forall X\,{:}{:}\,L ]\!]_{\delta } &{} =&{} \{(\theta _1,\theta _2)\,{:}{:}\,L' ~|~(\theta _1,\theta _2,R) \in \mathrm {Rel}\wedge L' \in \mathcal {R}[\![L ]\!]_{\delta \cdot [X\mapsto (\theta _1,\theta _2,R)]}\} \end{array}$$

Lemma 27

\((M_1,M_2) \in \mathcal {R}[\![\theta ]\!]_{\delta }\) iff for all \(((\hat{N}^{1}_1,\hat{N}^{1}_2),\ldots ,(\hat{N}^{n}_1,\hat{N}^{n}_2),R) \in \mathcal {R}[\![\mathrm {ext}(\theta ) ]\!]_{\delta }\), \((M_1 \hat{N}^{1}_1 \cdots \hat{N}^{n}_1,M_2 \hat{N}^{1}_2 \cdots \hat{N}^{n}_2) \in R\).

Theorem 28

For all trace equivalent \(\varDelta ;\varGamma \vdash M_1,M_2:\theta \), we have that \({M_1}\simeq _{log}~{M_2}\).

Proof

Taking \(\delta \in \mathcal {R}[\![\varDelta ]\!]\) and \((\eta _1,\eta _2) \in \mathcal {R}[\![\varGamma ]\!]_{\delta }\), we show \((M_1\{\eta _1\}\{\delta _1\}, M_2\{\eta _2\}\{\delta _2\}) \in \mathcal {R}[\![\theta ]\!]_{\delta }\). Using Lemma 27, we take \(((\hat{N}^{1}_1,\hat{N}^{1}_2),\ldots ,(\hat{N}^{n}_1,\hat{N}^{n}_2),R) \in \mathcal {R}[\![\mathrm {ext}(\theta ) ]\!]_{\delta }\), and prove that \((M_1\{\eta _1\}\{\delta _1\} \hat{N}^{1}_1 \cdots \hat{N}^{n}_1, M_2\{\eta _2\}\{\delta _2\} \hat{N}^{1}_2 \cdots \hat{N}^{n}_2) \in R\).

For each \(i \in \{1,2\}\), there exists a value \(v_i\) s.t. \(M_i\{\eta _i\}\{\delta _i\} \hat{N}^{1}_i \cdots \hat{N}^{n}_i \rightarrow ^{*} v_{i}\). Using the closure of R w.r.t. \(=_{\beta \eta }\), it suffices to show that \((v_{1},v_{2}) \in R\). Suppose \(\varDelta = X_1,\ldots ,X_k\) and \(\varGamma = x_1:\theta _1,\ldots ,x_m:\theta _m\). We write \(C_{P,i}\) for the configuration \(\langle \varDelta ; \varGamma \vdash M_i : \theta \rangle \), and \(C_{O,i}\) for the configuration \(\langle c_\mathrm{in}\delta _i(X_1) \cdots \delta _i(X_k) \eta _i(x_1) \cdots \eta _i(x_m) \hat{N}^{1}_i \cdots \hat{N}^{n}_i,\varepsilon ,[c_\mathrm{in}\mapsto \widetilde{\theta }]\rangle \), where \(\widetilde{\theta } =\forall X_1. \ldots \forall X_k.\theta _1 \rightarrow \cdots \rightarrow \theta _m \rightarrow \theta \).

From Theorem 23, for each \(i \in \{1,2\}\) there is a trace \(T_{i}\) such that \(C_{P,i} \mathbin {\!{\wedge }\!\!\!{\wedge }\!} C_{O,i} \Downarrow _{T_{i},v_{i}}\). \(M_1,M_2\) being trace equivalent, we have that \(\mathsf {Tr}(C_{P,1}) = \mathsf {Tr}(C_{P,2})\). So from Theorem 26, we get that \(C_{P,2} \mathbin {\!{\wedge }\!\!\!{\wedge }\!} C_{O,1} \Downarrow _{T_{1},v_1}\), and from Theorem 23 that \(M_2\{\eta _1\}\{\delta _1\} \hat{N}^{1}_1 \cdots \hat{N}^{n}_1 \rightarrow ^{*} v_1\). Finally, from Theorem 2, we get that \((M_2\{\eta _1\}\{\delta _1\} \hat{N}^{1}_1 \cdots \hat{N}^{n}_1, M_2\{\eta _2\}\{\delta _2\} \hat{N}^{1}_2 \cdots \hat{N}^{n}_2) \in R\). Thus, using the closure of R w.r.t. \(=_{\beta \eta }\), we have that \((v_{1},v_{2}) \in R\).   \(\square \)

5 Related and Future Work

The literature on parametric polymorphism is vast; here we look at the works closest to ours, which come from the game semantics area. The first game model for System F was introduced by Hughes [9, 10]. The model is intensional, in the sense that it is fully complete for \(\beta \eta \)-equivalence. Starting from that model, de Lataillade [5, 6] characterised parametricity categorically via the notion of dinaturality [4]. In [2], Abramsky and Jagadeesan developed a model for System F to characterise genericity, as introduced by Longo et al. [17]. A type \(\theta \) is said to be generic when two terms \(M_1,M_2\) of type \(\forall X.\theta '\) are equivalent just if \(M_1 \theta \) and \(M_2 \theta \) are equivalent. Their model contains several generic types. More recently, Laird [15] has introduced a game model for System F augmented with mutable variables. His model is closer to ours than the previous ones, and in particular his notion of copycat links can be seen as connected to the use of names for parametricity.

In all of the above models the denotation of terms is built compositionally by induction on the structure of the term. In a different line of work, closer in spirit to our model, Lassen and Levy [16] have introduced normal form bisimulations for a language with parametric polymorphism. These bisimulations are defined on LTSs whose definition has similarities with ours. However, the model is for a CPS-style language which has not only polymorphic but also recursive types. Finally, our own model for a higher-order polymorphic language with general references [13] can be seen as a direct precursor to this work, albeit in a very different setting (call-by-value, with references).

Further on, we would like to study the existence of generic types in our model, as well as its dinaturality properties. We would moreover like to examine coarser notions of trace equivalence that bring us closer to Reynolds equivalence. Finally, we would like to see if the trace model can be used to prove the original conjecture of [1, 20]. While this seems plausible in principle, proving equivalences using definable logical relations requires additional tools, such as restrictions on the LTS, to avoid circular reasoning.