Abstract
ZFH stands for Zermelo-Fraenkel set theory implemented in higher-order logic. It is a descendant of Agerholm’s and Gordon’s HOL-ST but does not allow the use of type variables nor the definition of new types. We first motivate why we are using ZFH for ProofPeer, the collaborative theorem proving system we are building. We then focus on the type inference algorithm we have developed for ZFH. In ZFH’s syntax, function application, written as juxtaposition, is overloaded to be either set-theoretic or higher-order. Our algorithm extends Hindley-Milner type inference to cope with this particular overloading of function application. We describe the algorithm, prove its correctness, and discuss why prior general approaches to type inference in the presence of coercions or overloading do not cover our particular case.
1 Introduction
The ProofPeer project [1, 2] is our attempt to combine interactive theorem proving (ITP) and the modern web, making ITP technology more accessible than it has been. We will first explain why we have chosen ZFH as the logic of ProofPeer, and then introduce the problem this paper solves.
1.1 Why ZFH?
Despite a few prominent counterexamples [3, 4], it is astonishing how few mathematicians are aware of ITP systems, let alone use them. We believe that one reason for this is that the development and application of ITP technology has traditionally been driven by computer scientists, not mathematicians. Major successful ITP systems like Isabelle and Coq are based on variants of type theory, while most mathematicians feel more familiar with set theory. Simple mathematical standards like point-set topology cannot be formalized in either system without the result feeling alien to most mathematicians.
We have therefore decided that the logic used in the ProofPeer system should be based on Zermelo-Fraenkel set theory which is more or less familiar to all mathematicians. At the same time we want to build on the considerable technical advances that contemporary ITP systems have achieved. Therefore we embed set theory within simply-typed classical higher-order logic by introducing a special type \({\fancyscript{U}}\) which forms the universe of Zermelo-Fraenkel sets, additional constants like the element-of operator \(\in : {\fancyscript{U}} \rightarrow {\fancyscript{U}} \rightarrow \mathbb {P}\), and additional axioms describing the properties of these new constants. The symbol \(\mathbb {P}\) denotes the type of propositions/booleans, and for any two types \(\alpha \) and \(\beta \) we can form the type of higher-order functions \(\alpha \rightarrow \beta \). For a full list of all new constants and axioms see theory root [9] in the ProofPeer system.
Because technically, all we have done is add an additional type together with a few new constants and axioms, all of the machinery present in systems like HOL-4 or Isabelle/HOL can be ported to work in our system. For example, Isabelle/HOL’s facilities for defining partial and nested recursive functions [8] could be translated to ProofPeer.
This approach was first advocated by Agerholm and Gordon [5]. They called the resulting logic HOL-ST. A related approach is pursued by Isabelle/ZF which embeds set theory within its intuitionistic higher-order meta logic [7]. The Isabelle/ZF approach seems more involved than our approach: it begins at base with intuitionistic higher-order logic, over which first-order classical logic is introduced, which in turn, is used to formalise set theory. We instead skip the middle step and base set theory directly on classical higher-order logic, obtaining a more powerful logic by simpler means. This is just how HOL-ST works as well, but that opens up a new dilemma: HOL-ST is so powerful that often it is not clear how concepts should best be formalised. Take for example the natural numbers: should they be formalised as a type, or should they be formalised as a set, i.e. as an element of \({\fancyscript{U}}\)? Or take lists: should they be formalised as a type \(\alpha \ \mathsf{list}\) together with polymorphic operations like \(\mathsf{cons}:\alpha \rightarrow \alpha \ \mathsf{list} \rightarrow \alpha \ \mathsf{list}\), or should they be formalised as a constant \(\mathsf{list}: {\fancyscript{U}} \rightarrow {\fancyscript{U}}\) such that \(\mathsf{list}\ \alpha \) denotes the set of lists over elements of \(\alpha \)? Note how in the latter case we can extend our discussion to the class of all (heterogeneous) lists by defining
The type of cons would now be \({\fancyscript{U}} \rightarrow {\fancyscript{U}} \rightarrow {\fancyscript{U}}\) and for reasonable definitions of list we could prove theorems like
We want people to perceive ProofPeer as a system based on set theory; the only reason we also employ simply-typed higher-order logic is because of its technical convenience and simplicity. Therefore for us there is an easy and coherent way out of the dilemma that HOL-ST has: we forbid the introduction of new types besides the ones we already described, and we furthermore do not use type variables as part of our internal term representation. The only polymorphic constants in our logic are equality (\(=\)), universal quantification (\(\forall \)) and existential quantification (\(\exists \)), and we do not provide any means for defining additional ones.
Abstaining from polymorphic terms in favour of monomorphic ones has a further advantage noticed already by Gordon [6, Sect. 3]: We can treat theories as simple (albeit large) theorems. The axioms of the theory become the antecedents of the theorem, and constants declared in the theory can be treated as universally quantified variables. This does not work in polymorphic simply-typed higher-order logic because polymorphic constants can appear with different types in the theory, whereas a variable must always appear with the same type in the theorem.
We choose the name ZFH for the logical system we obtain by embedding set theory into classical higher-order logic in the way outlined above. ZFH represents the same logic as HOL-ST minus type variables and minus a mechanism for defining custom types. In particular this means that ZFH and HOL-ST are equiconsistent, and that both HOL and ZFC can be formalized and proven to be consistent within ZFH.
1.2 Set-Theoretic vs. Higher-Order Function Application
There are two kinds of function application in ZFH:
- application of a higher-order function \(f : \alpha \rightarrow \beta \) to its argument \(x : \alpha \), and
- application of a set-theoretic function \(f : {\fancyscript{U}}\) to its argument \(x : {\fancyscript{U}}\).
In ZFH, set-theoretic functions are governed by two properties:
Here \(\mathsf{fun} : {\fancyscript{U}} \rightarrow ({\fancyscript{U}} \rightarrow {\fancyscript{U}}) \rightarrow {\fancyscript{U}}\) takes a domain \(X : {\fancyscript{U}}\) and a higher-order function \(f : {\fancyscript{U}} \rightarrow {\fancyscript{U}}\) as its arguments and produces the corresponding set-theoretic function on that domain. Set-theoretic functions created thus can then be applied via \({\mathsf{apply}}: {\fancyscript{U}} \rightarrow {\fancyscript{U}} \rightarrow {\fancyscript{U}}\).
In the actual ProofPeer theory [9], the second property is written like this:
Instead of explicitly mentioning \({\mathsf{apply}}\) we write application of a set-theoretic function in exactly the same way as application of a higher-order function! This is possible because in the above \(\mathsf{fun}\ X\ f\) is a set, which leads type inference to conclude that set-theoretic function application must be meant, not higher-order function application.
In general, the situation is not so clear-cut. Consider the following term:
\[\forall x.\ \exists f.\ f x = x\]
Informally the above says that every x is a fixpoint of some function f. But which types should we assign to f and x? There are infinitely many valid ones:
1. \(f : {\fancyscript{U}}\) and \(x : {\fancyscript{U}}\)
2. \(f : {\mathbb {P}} \rightarrow {\mathbb {P}}\) and \(x : {\mathbb {P}}\)
3. \(f : {\fancyscript{U}} \rightarrow {\fancyscript{U}}\) and \(x : {\fancyscript{U}}\)
4. \(f : ({\fancyscript{U}} \rightarrow {\fancyscript{U}}) \rightarrow ({\fancyscript{U}} \rightarrow {\fancyscript{U}})\) and \(x : {\fancyscript{U}} \rightarrow {\fancyscript{U}}\)
5. \(f : ({\mathbb {P}} \rightarrow {\fancyscript{U}}) \rightarrow ({\mathbb {P}} \rightarrow {\fancyscript{U}})\) and \(x : {\mathbb {P}} \rightarrow {\fancyscript{U}}\)

...and so on
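The family of typings listed above can be enumerated mechanically. The following Python sketch is purely illustrative: the string/tuple encoding of types and the names `fn`, `types` and `typings` are assumptions made for this example and are not part of ZFH or ProofPeer. It relies on the observation that \(f x = x\) is well-typed either when f and x both have type \({\fancyscript{U}}\) (set-theoretic application) or when \(f : \tau \rightarrow \tau \) and \(x : \tau \) for some type \(\tau \) (higher-order application).

```python
U, P = "U", "P"  # hypothetical encodings of the universe and proposition types


def fn(a, b):
    """Encode the function type a -> b as a tuple."""
    return ("->", a, b)


def types(depth):
    """All variable-free types whose arrows are nested at most `depth` deep."""
    if depth == 0:
        return [U, P]
    smaller = types(depth - 1)
    new = [fn(a, b) for a in smaller for b in smaller if fn(a, b) not in smaller]
    return smaller + new


def typings(depth):
    """Valid (type of f, type of x, kind of application) triples for f x = x."""
    result = [(U, U, "ZF")]                      # set-theoretic application
    for tau in types(depth):
        result.append((fn(tau, tau), tau, "H"))  # higher-order application
    return result


for f_type, x_type, kind in typings(1):
    print(kind, f_type, x_type)
```

Increasing `depth` produces ever more typings, which is the sense in which the list above continues indefinitely.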
Even if we had type variables at our disposal to formulate the typing (which we don’t) there would still be two equally valid typings to choose from:
1. \(f : {\fancyscript{U}}\) and \(x : {\fancyscript{U}}\)
2. \(f : \alpha \rightarrow \alpha \) and \(x : \alpha \)
Which one should we pick?
In the next section we will present a type inference algorithm for ZFH with the following properties:
- If there is a valid typing at all, the algorithm will find one, and will otherwise fail. In particular, all function applications will be resolved to be either set-theoretic or higher-order.
- Preference is given to the type \({\fancyscript{U}}\) over all other types, and to set-theoretic function application over higher-order function application.
Note that the second property is a desirable one in our case, as this again emphasises the set theory focus of ProofPeer.
In the above example the algorithm then yields the typing \(f : {\fancyscript{U}}\) and \(x : {\fancyscript{U}}\).
2 The Type Inference Algorithm
We first introduce the types and terms our algorithm operates on. Then we introduce the type equations which guide the algorithm, and recall how to solve type equations. After highlighting the basic difficulties of the problem we state the algorithm. Finally we prove that the algorithm terminates, that it is sound, and in what sense it is complete.
2.1 Types and Terms
Although we do not allow type variables as part of proper ZFH terms, we do allow them for type inference purposes. In particular a pretype \(\tau \) is either the universal type \({\fancyscript{U}}\), the propositional/boolean type \({\mathbb {P}}\), a function type \(\tau _1 \rightarrow \tau _2\), or a type variable \(\alpha \):
\[\tau \;::=\; {\fancyscript{U}} \;\mid \; {\mathbb {P}} \;\mid \; \tau _1 \rightarrow \tau _2 \;\mid \; \alpha \]
A type is a pretype which does not contain any type variables. A preterm t is either a constant c, a polymorphic constant \(p[\tau ]\), an explicit typing \(t : \tau \), a higher-order function \(x : \tau _1 \mapsto t :\tau _2\), a variable x, a higher-order function application \(t_1 \,\, {\diamond _\mathsf{H }}\,\, t_2 : \tau \), a set-theoretic function application \(t_1 \,\, {\diamond _\mathsf{ZF }}\,\, t_2 : \tau \), or a function application \(t_1 \,\, {\diamond _\mathsf{? }}\,\, t_2 : \tau \) where it is unspecified if it is of higher-order or set-theoretic kind:
\[t \;::=\; c \;\mid \; p[\tau ] \;\mid \; t : \tau \;\mid \; x : \tau _1 \mapsto t : \tau _2 \;\mid \; x \;\mid \; t_1 \,\, {\diamond _\mathsf{H }}\,\, t_2 : \tau \;\mid \; t_1 \,\, {\diamond _\mathsf{ZF }}\,\, t_2 : \tau \;\mid \; t_1 \,\, {\diamond _\mathsf{? }}\,\, t_2 : \tau \]
A term is a preterm which does not contain any type variables, nor any function applications of unspecified kind.
Example 1
Our introductory example \(\forall x.\ \exists f.\ f x = x\) corresponds to the preterm:
Note that everywhere our preterm format requires a type, we simply used a fresh type variable.
2.2 Type Equations
A substitution \(\sigma \) associates every type variable \(\alpha \) with a pretype \(\sigma _\alpha \). Applying a substitution to a pretype \(\tau \) means replacing every type variable in \(\tau \) by its associated pretype (Fig. 1), and applying a substitution to a preterm t means applying the substitution to every pretype in t (Fig. 2).
With each constant c a fixed type \({\mathcal {C}}(c)\) is associated, e.g. \({\mathcal {C}}({\mathsf{apply}}) = {\fancyscript{U}} \rightarrow {\fancyscript{U}} \rightarrow {\fancyscript{U}}\). Assuming also a partial map \({\mathcal {V}}\) from variables to pretypes we can associate with each preterm t its type \({{\Gamma }_{{\mathcal {C}},{\mathcal {V}}}}(t)\) and a set of equations between pretypes \({{\mathcal {E}}_{{\mathcal {C}},{\mathcal {V}}}}(t)\) as shown in Fig. 3. In the following we will assume an implicitly given \({\mathcal {C}}\) and define
\[{\mathcal {E}}(t) = {{\mathcal {E}}_{{\mathcal {C}},{\emptyset }}}(t)\]
where \({\emptyset }\) in this context denotes the empty map.
A substitution \(\sigma \) is a unifier of a set \({\mathcal {E}}\) of equations of pretypes iff for all equations \(l \equiv r \in {\mathcal {E}}\) the left hand side and the right hand side of the equation become identical after substitution, i.e. \(\sigma (l) = \sigma (r)\) holds. We call \({\mathcal {E}}\) solvable if it has a unifier. Defining \(\sigma ({\mathcal {E}}) = \{ \sigma (l) \equiv \sigma (r)\ |\ l \equiv r \in {\mathcal {E}}\}\) allows the following rephrasing: \(\sigma \) is a unifier of \({\mathcal {E}}\) iff \(\sigma ({\mathcal {E}})\) is a set of identities.
Substitutions can be composed. The composition \(\delta \circ \sigma \) of a substitution \(\sigma \) with a substitution \(\delta \) is defined via
\[(\delta \circ \sigma )_\alpha = \delta (\sigma _\alpha )\]
A unifier \(\sigma _1\) is called more general than a unifier \(\sigma _2\), in symbols \(\sigma _1 \ge \sigma _2\), iff there is a substitution \(\delta \) such that \(\sigma _2 = \delta \circ \sigma _1\).
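Applying and composing substitutions can be made concrete with a small Python sketch. The encoding is an assumption of this example, not the ProofPeer implementation: the strings `"U"` and `"P"` stand for \({\fancyscript{U}}\) and \({\mathbb {P}}\), any other string is a type variable, a tuple `("->", a, b)` is a function type, and a substitution is a dictionary that leaves unmapped variables unchanged. The helper names `apply_subst` and `compose` are likewise hypothetical.

```python
U, P = "U", "P"  # any other string is treated as a type variable


def fn(a, b):
    """Encode the function type a -> b as a tuple."""
    return ("->", a, b)


def is_var(t):
    return isinstance(t, str) and t not in (U, P)


def apply_subst(sigma, t):
    """Replace every type variable in pretype t by its image under sigma."""
    if is_var(t):
        return sigma.get(t, t)      # unmapped variables are left unchanged
    if isinstance(t, tuple):
        return fn(apply_subst(sigma, t[1]), apply_subst(sigma, t[2]))
    return t                        # U or P


def compose(delta, sigma):
    """Return delta ∘ sigma, i.e. first apply sigma, then delta."""
    result = {a: apply_subst(delta, s) for a, s in sigma.items()}
    for a, d in delta.items():
        result.setdefault(a, d)     # variables that sigma leaves untouched
    return result


sigma = {"a": fn("b", U)}
delta = {"b": P}
t = fn("a", "b")
print(apply_subst(compose(delta, sigma), t))
```

In this example `sigma` is more general than `compose(delta, sigma)`, since the latter arises from the former by applying `delta` afterwards.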
Lemma 1
If \({\mathcal {E}}\) is solvable then it has an idempotent most general unifier \(\mathsf{mgu}_{\mathcal {E}}\), i.e. the following two properties hold for \(\mathsf{mgu}_{\mathcal {E}}\):
1. \(\mathsf{mgu}_{\mathcal {E}}\ge \sigma \) for any unifier \(\sigma \) of \({\mathcal {E}}\), and
2. \(\mathsf{mgu}_{\mathcal {E}}\circ \mathsf{mgu}_{\mathcal {E}}= \mathsf{mgu}_{\mathcal {E}}.\)
Proof
See [10, Sect. 4.5]. \(\square \)
If \({\mathcal {E}}(t)\) is solvable for a given preterm t, then we define
\[\mathcal {S}(t) = \mathsf{mgu}_{{\mathcal {E}}(t)}(t)\]
Note that \(\mathcal {S}(t)\) is unique up to a renaming of type variables. Computation of \(\mathcal {S}(t)\) is known as Hindley-Milner type inference [11].
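A most general unifier of a finite set of pretype equations can be computed by standard first-order unification. The sketch below is one textbook-style way of doing so in Python and is not taken from ProofPeer; the encoding of pretypes (strings `"U"` and `"P"`, other strings as type variables, tuples as function types) and the names `apply_subst`, `occurs` and `mgu` are assumptions of this example. The substitution is kept fully resolved at every step, so the result is idempotent.

```python
U, P = "U", "P"  # any other string is treated as a type variable


def fn(a, b):
    return ("->", a, b)


def is_var(t):
    return isinstance(t, str) and t not in (U, P)


def apply_subst(sigma, t):
    """Replace every type variable in pretype t by its image under sigma."""
    if is_var(t):
        return sigma.get(t, t)
    if isinstance(t, tuple):
        return fn(apply_subst(sigma, t[1]), apply_subst(sigma, t[2]))
    return t


def occurs(a, t):
    """Does type variable a occur in pretype t?"""
    return t == a or (isinstance(t, tuple) and (occurs(a, t[1]) or occurs(a, t[2])))


def mgu(equations):
    """Return an idempotent most general unifier as a dict, or None if unsolvable."""
    sigma, todo = {}, list(equations)
    while todo:
        l, r = todo.pop()
        l, r = apply_subst(sigma, l), apply_subst(sigma, r)
        if l == r:
            continue
        if is_var(r) and not is_var(l):
            l, r = r, l                           # orient: variable on the left
        if is_var(l):
            if occurs(l, r):
                return None                       # occurs check failed
            sigma = {a: apply_subst({l: r}, s) for a, s in sigma.items()}
            sigma[l] = r
        elif isinstance(l, tuple) and isinstance(r, tuple):
            todo += [(l[1], r[1]), (l[2], r[2])]  # decompose function types
        else:
            return None                           # clash, e.g. U against P
    return sigma


# Equations for a higher-order application f x of result type r, with f x = x.
solution = mgu([("f", fn("x", "r")), ("r", "x")])
print(solution)
```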
Example 2
Given the preterm t from Example 1, the type equations \({\mathcal {E}}(t)\) are:
A most general unifier for these equations of pretypes is given by
and therefore
2.3 A First Attempt
An obvious first attempt to solve our type inference problem for a given preterm t would be to single out all n occurrences of \({\diamond _\mathsf{? }}\) in t and to form all \(2^n\) possibilities \(t_i\) by replacing \({\diamond _\mathsf{? }}\) with either \({\diamond _\mathsf{H }}\) or \({\diamond _\mathsf{ZF }}\).
If none of the sets of type equations \({\mathcal {E}}(t_i)\) is solvable, type inference fails. Otherwise let \(t_j\) denote those \(t_i\) for which \({\mathcal {E}}(t_i)\) is solvable. This gives us up to \(2^n\) almost-solutions \(s_j\) where
\[s_j = \mathcal {S}(t_j).\]
Because the \(s_j\) possibly contain type variables, but proper ZFH terms may not contain type variables, we need to somehow eliminate all type variables from the \(s_j\). One rather arbitrary way of doing so would be to replace all type variables by \({\fancyscript{U}}\), i.e. to form
\[r_j = \mathcal {U}(s_j),\]
where \(\mathcal {U}\) is the substitution which replaces all type variables by \({\fancyscript{U}}\):
\[\mathcal {U}_\alpha = {\fancyscript{U}} \quad \text{for all type variables } \alpha .\]
This leaves us finally with up to \(2^n\) possible solutions \(r_j\) to our type inference problem. Computing all of these solutions is not practical for obvious performance reasons; furthermore, even if we did compute all of them, it is not clear which one among them we should pick as the result of the type inference.
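The exponential blow-up of this first attempt is easy to see in a small Python sketch. The encoding is hypothetical and simplified for illustration: a preterm is either `("var", name)` or `("app", kind, t1, t2)` with `kind` one of `"H"`, `"ZF"` or `"?"`, and all type annotations are omitted. The names `count_unspecified`, `resolve` and `candidates` are assumptions of this example.

```python
from itertools import product


def count_unspecified(t):
    """Number of applications of unspecified kind in preterm t."""
    if t[0] != "app":
        return 0
    return (t[1] == "?") + count_unspecified(t[2]) + count_unspecified(t[3])


def resolve(t, choices):
    """Replace each '?' in t by the next kind drawn from the iterator `choices`."""
    if t[0] != "app":
        return t
    kind = next(choices) if t[1] == "?" else t[1]
    return ("app", kind, resolve(t[2], choices), resolve(t[3], choices))


def candidates(t):
    """All 2^n ways of resolving the n unspecified applications in t."""
    n = count_unspecified(t)
    return [resolve(t, iter(c)) for c in product(("H", "ZF"), repeat=n)]


# ((a ? b) ? c) ? d has three unspecified applications.
a, b, c, d = (("var", v) for v in "abcd")
term = ("app", "?", ("app", "?", ("app", "?", a, b), c), d)
print(len(candidates(term)))
```

Each candidate would still have to be type-checked separately, which is what makes this approach impractical.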
2.4 The Algorithm
In our above attempt at a type inference algorithm we computed \(\mathcal {S}(t_i)\) only for preterms \(t_i\) which did not contain any occurrences of \({\diamond _\mathsf{? }}\). This was an arbitrary choice we made and it did not pay off.
Instead, given a preterm t which may still contain occurrences of \({\diamond _\mathsf{? }}\), let us directly compute \(t_0 = \mathcal {S}(t)\) if \({\mathcal {E}}(t)\) is solvable. If t contained any occurrences of \({\diamond _\mathsf{? }}\), then so will \(t_0\), but we now might have more type information available to decide whether an occurrence of \({\diamond _\mathsf{? }}\) should really be replaced by \({\diamond _\mathsf{H }}\) or \({\diamond _\mathsf{ZF }}\)!
To exploit type information present in a preterm t we define a function \(\mathcal {D}(t)\) which is able to decide in certain situations whether an occurrence of \({\diamond _\mathsf{? }}\) in t should be converted into \({\diamond _\mathsf{H }}\) or into \({\diamond _\mathsf{ZF }}\) (Fig. 4). Analogously to the definition of \({\mathcal {E}}(t)\) in terms of \({{\mathcal {E}}_{{\mathcal {C}},{\mathcal {V}}}}(t)\) we define \(\mathcal {D}(t)\) in terms of \(\mathcal {D}_{{\mathcal {C}},{\mathcal {V}}}(t)\). The main work in \(\mathcal {D}\) is done by the function
\[\diamond (\tau _1, \tau _2, \tau _3) \in \{{\diamond _\mathsf{H }}, {\diamond _\mathsf{ZF }}, {\diamond _\mathsf{? }}\}\]
which takes three pretypes \(\tau _1\), \(\tau _2\), \(\tau _3\) as arguments and tries to determine which kind of function application f x must be when the type of f is known to be \(\tau _1\), the type of x is known to be \(\tau _2\) and the type of f x is known to be \(\tau _3\). If any of the \(\tau _i\) cannot be the type \({\fancyscript{U}}\), in symbols \(\lnot ^{\fancyscript{U}}(\tau _i)\), then we know that f x cannot be set-theoretic function application and can therefore only be higher-order function application (if it is well-typed at all). On the other hand, if \(\tau _1\) cannot be a function type, in symbols \(\lnot ^\rightarrow (\tau _1)\), then f x cannot be higher-order function application and can therefore only be set-theoretic function application, again if it is well-typed at all (Fig. 5).
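The decision function just described can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a transcription of Fig. 5: pretypes are encoded as the strings `"U"` and `"P"`, other strings for type variables, and tuples `("->", a, b)` for function types; the names `not_u`, `not_fn` and `diamond` are hypothetical; and the order in which the two tests are tried when both apply is our own choice (in that situation no valid typing exists either way).

```python
U, P = "U", "P"  # any other string is treated as a type variable


def fn(a, b):
    return ("->", a, b)


def not_u(t):
    """t cannot be instantiated to U: it is P or a function type."""
    return t == P or isinstance(t, tuple)


def not_fn(t):
    """t cannot be instantiated to a function type: it is U or P."""
    return t in (U, P)


def diamond(t1, t2, t3):
    """Kind of application f x given the pretypes of f, x and f x."""
    if not_u(t1) or not_u(t2) or not_u(t3):
        return "H"    # cannot be set-theoretic
    if not_fn(t1):
        return "ZF"   # cannot be higher-order
    return "?"        # not enough type information yet


print(diamond("b", "a", P))    # f : b, x : a, f x : P
print(diamond(U, "a", "c"))    # f : U
print(diamond("b", "a", "c"))  # only type variables
```

A type variable satisfies neither test, which is why an application whose three pretypes are all type variables stays undecided.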
We have now gathered all the pieces to formulate our type inference algorithm as shown in Fig. 6.
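To make the interplay of the pieces concrete, here is a compact Python sketch of the overall loop: solve the type equations, apply the decision function, recurse if anything was decided, and otherwise replace all remaining type variables by \({\fancyscript{U}}\) and decide once more. It is a sketch under several assumptions and not the ProofPeer implementation. The preterm encoding is simplified (no explicit typings, no result annotation on abstractions), the equations generated per application kind are our reading of what Fig. 3 must contain, the tie-breaking in `diamond` is our own choice, and all function names are hypothetical.

```python
U, P = "U", "P"  # every other string is a type variable

def fn(a, b): return ("->", a, b)
def is_fn(t): return isinstance(t, tuple)
def is_var(t): return isinstance(t, str) and t not in (U, P)

def subst(s, t):
    """Apply substitution s (a function on type variables) to pretype t."""
    if is_fn(t):
        return fn(subst(s, t[1]), subst(s, t[2]))
    return s(t) if is_var(t) else t

def occurs(a, t):
    return t == a or (is_fn(t) and (occurs(a, t[1]) or occurs(a, t[2])))

def mgu(eqs):
    """Most general unifier as a dict, or None if the equations are unsolvable."""
    sigma, todo = {}, list(eqs)
    while todo:
        l, r = (subst(lambda a: sigma.get(a, a), x) for x in todo.pop())
        if l == r:
            continue
        if is_var(r) and not is_var(l):
            l, r = r, l
        if is_var(l) and not occurs(l, r):
            sigma = {a: subst(lambda b: r if b == l else b, s) for a, s in sigma.items()}
            sigma[l] = r
        elif is_fn(l) and is_fn(r):
            todo += [(l[1], r[1]), (l[2], r[2])]
        else:
            return None
    return sigma

# Preterms: ("var", x) | ("const", c, tau) | ("lam", x, tau, body)
#         | ("app", kind, f, a, tau) with kind in {"H", "ZF", "?"}

def gamma(t, env):
    """Pretype of preterm t under the variable typing env."""
    if t[0] == "var": return env[t[1]]
    if t[0] == "const": return t[2]
    if t[0] == "lam": return fn(t[2], gamma(t[3], {**env, t[1]: t[2]}))
    return t[4]

def equations(t, env):
    """Type equations of t (assumed shape; an application of kind '?' adds none)."""
    if t[0] == "lam": return equations(t[3], {**env, t[1]: t[2]})
    if t[0] != "app": return []
    _, kind, f, a, tau = t
    out = equations(f, env) + equations(a, env)
    if kind == "H": out.append((gamma(f, env), fn(gamma(a, env), tau)))
    if kind == "ZF": out += [(gamma(f, env), U), (gamma(a, env), U), (tau, U)]
    return out

def map_types(s, t):
    """Apply substitution s to every pretype occurring in preterm t."""
    if t[0] == "const": return ("const", t[1], subst(s, t[2]))
    if t[0] == "lam": return ("lam", t[1], subst(s, t[2]), map_types(s, t[3]))
    if t[0] == "app":
        return ("app", t[1], map_types(s, t[2]), map_types(s, t[3]), subst(s, t[4]))
    return t

def diamond(t1, t2, t3):
    """Decide the application kind from known pretypes, or leave it as '?'."""
    not_u = lambda t: t == P or is_fn(t)
    if not_u(t1) or not_u(t2) or not_u(t3): return "H"
    return "ZF" if t1 == U else "?"

def decide(t, env):
    """Resolve every '?' whose kind is forced by the available type information."""
    if t[0] == "lam":
        return ("lam", t[1], t[2], decide(t[3], {**env, t[1]: t[2]}))
    if t[0] != "app": return t
    _, kind, f, a, tau = t
    if kind == "?": kind = diamond(gamma(f, env), gamma(a, env), tau)
    return ("app", kind, decide(f, env), decide(a, env), tau)

def type_infer(t):
    sigma = mgu(equations(t, {}))
    if sigma is None:
        raise TypeError("no valid typing")
    s = map_types(lambda a: sigma.get(a, a), t)
    d = decide(s, {})
    if d != s:
        return type_infer(d)                      # new information: iterate
    return decide(map_types(lambda a: U, d), {})  # default everything to U

# Body of the introductory example: x and f bound, then (f ? x) = x.
eq = ("const", "=", fn("e", fn("e", P)))
fx = ("app", "?", ("var", "f"), ("var", "x"), "c")
body = ("app", "H", ("app", "H", eq, fx, "d"), ("var", "x"), "r")
example = ("lam", "x", "a", ("lam", "f", "b", body))
print(type_infer(example))
```

On this input the sketch assigns \({\fancyscript{U}}\) to both x and f and resolves the application to the set-theoretic kind, in line with the preference described in Sect. 1.2.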
Example 3
Continuing Example 2 we compute now \(\mathsf{TypeInfer}(t)\). Having already computed \(s = \mathcal {S}(t)\) we now need to compute \(\mathcal {D}(s)\). There is only one occurrence of \({\diamond _\mathsf{? }}\) in s and the corresponding invocation of \(\diamond \) yields
and thus \(\mathcal {D}(s) = s\). This means that no recursive call to \(\mathsf{TypeInfer}\) is necessary and therefore
2.5 Termination
Let us first show that our algorithm actually terminates. There are only finitely many occurrences of \({\diamond _\mathsf{? }}\) in a preterm t, let us denote the number of such occurrences by N(t). For two preterms s and t let us write \(s \sqsubseteq t\) if s arises from t by replacing some (or none) of the occurrences of \({\diamond _\mathsf{? }}\) in t by either \({\diamond _\mathsf{H }}\) or \({\diamond _\mathsf{ZF }}\). Obviously \(s \sqsubseteq t\) together with \(s \ne t\) implies \(N(s) < N(t)\).
Lemma 2
TypeInfer(t) terminates for every preterm t.
Proof
Given some preterm s, \(\mathcal {D}(s) \sqsubseteq s\) holds. Therefore \(s \ne \mathcal {D}(s)\) implies
\[N(\mathcal {D}(s)) < N(s).\]
We also know that \(N(t) = N(\mathcal {S}(t))\) because \(\mathcal {S}\) only possibly instantiates type variables and leaves occurrences of \({\diamond _\mathsf{? }}\) unchanged. Together this means that for each recursive call to \(\mathsf{TypeInfer}\) its argument strictly decreases as measured by N and therefore the algorithm must terminate. \(\square \)
2.6 Soundness and Completeness
Given two preterms t and \(t'\) we say that \(t'\) is an instance of t, in symbols
\[t' \le t,\]
iff there is a substitution \(\sigma \) such that \(t' \sqsubseteq \sigma (t)\).
What does it mean for our type inference algorithm to be sound? Given a preterm t as input it should output a preterm \(t'\) such that
1. \(t'\) is a term,
2. \(t' \le t\), and
3. \({\mathcal {E}}(t')\) is solvable.
If there is no such \(t'\) the algorithm should fail. If there are several possible candidates for \(t'\) it would also be good to have a simple and sensible criterion for which of the candidates the algorithm will pick. Our algorithm fulfills such a criterion: it will pick the unique candidate \(t'\) which is minimal with respect to the relation \(\preceq \) which is first defined on types (Fig. 7) and then lifted to terms (Fig. 8). The reflexive, transitive and antisymmetric relation \(\preceq \) expresses formally what we referred to earlier as “\({\fancyscript{U}}\) is preferred over any other type, and set-theoretic function application is preferred over higher-order function application”.
Lemma 3
Let \(\sigma \) be a substitution and t a preterm. Then \({\mathcal {E}}(\sigma (t)) = \sigma ({\mathcal {E}}(t))\).
Proof
Immediate from the definitions. \(\square \)
Lemma 4
Let t be a preterm such that \({\mathcal {E}}(t)\) is solvable. Then \({\mathcal {E}}(\mathcal {S}(t))\) is a set of identities.
Proof
\({\mathcal {E}}(\mathcal {S}(t)) = {\mathcal {E}}(\mathsf{mgu}_{{\mathcal {E}}(t)}(t)) = \mathsf{mgu}_{{\mathcal {E}}(t)}({\mathcal {E}}(t))\). \(\square \)
Lemma 5
Let t be a preterm such that \({\mathcal {E}}(t)\) is solvable and \(\mathcal {S}(t) = t\). Then \(\mathsf{mgu}_{{\mathcal {E}}(t)} = \mathsf{id}\) and \({\mathcal {E}}(t)\) is a set of identities.
Proof
\(\mathsf{id}({\mathcal {E}}(t)) = {\mathcal {E}}(t) = {\mathcal {E}}(\mathcal {S}(t))\) \(\square \)
Lemma 6
Let s and t be preterms such that \(s \sqsubseteq t\). Then \({\mathcal {E}}(t) \subseteq {\mathcal {E}}(s)\). In particular, if \({\mathcal {E}}(s)\) is solvable then so is \({\mathcal {E}}(t)\).
Proof
Immediate from the definitions. \(\square \)
Lemma 7
If t is a preterm without any type variables then \(\mathcal {D}(t)\) is a term.
Proof
If \(\tau _1\) does not contain any type variables then either \(\lnot ^{\fancyscript{U}}(\tau _1)\) or \(\lnot ^\rightarrow (\tau _1)\) is true, and therefore \(\diamond (\tau _1, \tau _2, \tau _3) \in \{{\diamond _\mathsf{H }}, {\diamond _\mathsf{ZF }}\}\). \(\square \)
Lemma 8
If t is a preterm without any type variables, and \(t'\) is a term such that \({\mathcal {E}}(t')\) is solvable and \(t' \sqsubseteq t\) then \(t' = \mathcal {D}(t)\).
Proof
The terms \(t'\) and \(\mathcal {D}(t)\) could only possibly differ in places where t has an occurrence of \({\diamond _\mathsf{? }}\). In those places, choosing differently from \(\mathcal {D}\) would make the resulting equations unsolvable; however, \({\mathcal {E}}(t')\) is solvable. \(\square \)
Lemma 9
For any preterm t and any substitution \(\sigma \)
Proof
This follows from the fact that \(\diamond (\tau _1,\tau _2,\tau _3) \in \{{\diamond _\mathsf{H }},{\diamond _\mathsf{ZF }}\}\) implies
\(\square \)
Lemma 10
Let t be a preterm and \(t'\) a term such that \(t' \le t\) and \({\mathcal {E}}(t')\) is solvable. Then \({\mathcal {E}}(t)\) is solvable and both \(t' \le \mathcal {S}(t)\) and \(t' \le \mathcal {D}(t)\) hold.
Proof
Because \(t' \le t\) there exist \(\sigma \) and \(t''\) such that \(t'' = \sigma (t)\) and \(t' \sqsubseteq t''\). Because \({\mathcal {E}}(t')\) is solvable so is \({\mathcal {E}}(t'')\). Because \(t'\) is a term, neither \(t'\) nor \(t''\) contain any type variables and thus \(\mathcal {S}(t'') = t''\) which implies that \({\mathcal {E}}(t'') = {\mathcal {E}}(\sigma (t)) = \sigma ({\mathcal {E}}(t))\) are all sets of identities, and therefore \(\sigma \) is a unifier of \({\mathcal {E}}(t)\). This means there is a substitution \(\delta \) such that \(\sigma = \delta \circ \mathsf{mgu}_{{\mathcal {E}}(t)}\) which implies \(t'' = \sigma (t) = \delta (\mathcal {S}(t))\). Thus \(t' \le \mathcal {S}(t)\). Furthermore,
and thus \(t' \le \mathcal {D}(t)\). \(\square \)
Lemma 11
TypeInfer is sound. It is also complete in the sense that it will compute the unique \(\preceq \)-minimal solution if there is any solution at all.
Proof
Given a preterm t, TypeInfer will check if \({\mathcal {E}}(t)\) is solvable.
If it is not, it will fail; this is correct, because then there can be no solution \(t'\) with \(t' \le t\) and \({\mathcal {E}}(t')\) solvable because otherwise \({\mathcal {E}}(t)\) would be solvable as well because of Lemma 10.
If on the other hand \({\mathcal {E}}(t)\) is solvable it will either recursively call itself with argument d where \(d = \mathcal {D}(\mathcal {S}(t))\) or perform a final calculation and return the result. In the case of a recursive call, we know because of Lemma 10 that every solution \(t'\) of t is also a solution of d.
So let us look at the final calculation now. We know that \(d = s\) holds where \(s = \mathcal {S}(t)\). In other words, d is a fixpoint of \(\mathcal {D}\) which means that
\[\diamond (\tau _1, \tau _2, \tau _3) = {\diamond _\mathsf{? }}\]
holds for all invocations of \(\diamond \) during the computation of \(\mathcal {D}(d)\) which implies that all of \(\tau _1\), \(\tau _2\) and \(\tau _3\) are either equal to \({\fancyscript{U}}\) or equal to a type variable. The substitution \(\mathcal {U}\) will therefore make all \(\tau _i\) in those invocations equal to \({\fancyscript{U}}\) and thus the effect of applying \(\mathcal {D}\) to \(\mathcal {U}(d)\) is to switch all occurrences of \({\diamond _\mathsf{? }}\) to \({\diamond _\mathsf{ZF }}\). In particular, \({\mathcal {E}}(\mathcal {D}(\mathcal {U}(d)))\) is solvable because \({\mathcal {E}}(d)\) is a set of identities and
That means that \(t_0\) is a solution where \(t_0 = \mathcal {D}(\mathcal {U}(d))\). Furthermore \(t_0\) is minimal with respect to \(\preceq \) because for any solution \(t'\) we know \(t' \le d\) and because for any term a and any preterm b such that \(a \le b\) it follows that \(\mathcal {D}^{\mathsf{ZF}}(\mathcal {U}(b)) \preceq a\) where \(\mathcal {D}^{\mathsf{ZF}}\) replaces all occurrences of \({\diamond _\mathsf{? }}\) in its argument by \({\diamond _\mathsf{ZF }}\). Because of the antisymmetry of \(\preceq \), minimality implies uniqueness. \(\square \)
2.7 Examples
We present three more examples of applying TypeInfer. We will use abbreviated notations for preterms in the following.
Example 4
Let t be the preterm \(\forall x : \alpha .\ \exists f : \beta .\ f \,\, {\diamond _\mathsf{? }}\,\, x : \gamma .\) Then
Because of \(\diamond (\beta , \alpha , {\mathbb {P}}) = {\diamond _\mathsf{H }}\) we know
Computing \(\mathsf{TypeInfer}(t')\) yields first \(\mathcal {S}(t') = \forall x : \alpha .\ \exists f : \alpha \rightarrow {\mathbb {P}}.\ f {\,\,} {\diamond _\mathsf{H }}{\,\,} x\) and then
Example 5
Let t be \(a : \alpha \mapsto b : \beta \mapsto c : \gamma \mapsto d : \delta \mapsto a {\,\,} {\diamond _\mathsf{? }}{\,\,} b {\,\,} {\diamond _\mathsf{? }}{\,\,} c {\,\,} {\diamond _\mathsf{? }}{\,\,} d\). Then
Example 6
Let us modify the previous example and infer the type of
This time the algorithm needs three recursive calls and yields finally
This example can be generalized to produce for any n an example with n occurrences of \({\diamond _\mathsf{? }}\) such that TypeInfer needs n recursive calls.
3 Related Work
In HOL-ST [5], set-theoretic and higher-order function application have different syntax; in particular, higher-order function application is written f x and set-theoretic function application is denoted by \(f \diamond x\). Because HOL-ST has type variables and capabilities for defining new types, the type \({\fancyscript{U}}\) is just one type besides many others; our type inference algorithm does not yield a desirable result in such a setting. Of course, as HOL-ST is a strict superset of ZFH, one could work in it as one works in ZFH; our type inference algorithm can be directly translated to HOL-ST to support such a scenario.
Isabelle/ZF [7] also uses two different notations, f x for higher-order and f`x for set-theoretic function application. Although Isabelle/ZF is embedded in polymorphic intuitionistic higher-order logic it is used in an essentially monomorphic way using an identical type system to ZFH. Isabelle has a flexible mechanism for syntax extension by adding context-free grammar rules so it should be possible to introduce syntax to write set-theoretic function application via juxtaposition as well. Type information is used in Isabelle to disambiguate between several possible parse trees. Using this built-in mechanism would lead to a situation similar to what we described in Sect. 2.3: whenever there are multiple possible typings parsing would fail. But in principle it should be possible to write a system-level Isabelle extension which implements our type inference algorithm for Isabelle/ZF.
Our operator for set-theoretic function application \({\mathsf{apply}}: {\fancyscript{U}} \rightarrow {\fancyscript{U}} \rightarrow {\fancyscript{U}}\) could be viewed as a coercion from \({\fancyscript{U}}\) to \({\fancyscript{U}} \rightarrow {\fancyscript{U}}\). There has been previous work with regard to the general problem of extending Hindley-Milner type inference in the presence of coercions. In [12] coercions between types which only differ in their base types but not in their type constructors are considered; because \({\fancyscript{U}}\) does not contain the type constructor \(\rightarrow \) but \({\fancyscript{U}} \rightarrow {\fancyscript{U}}\) does, their work is not applicable to our case. In [13] more general coercions are considered but their algorithm has the property that no coercions are inserted if Hindley-Milner type inference alone already yields a valid typing; this is not what we would like in our setting as this property means that their algorithm would choose the typing \(f : \alpha \rightarrow \alpha \) and \(x : \alpha \) over the typing \(f : {\fancyscript{U}}\) and \(x : {\fancyscript{U}}\) in our introductory example. And then there would still be the question of how that polymorphic type should be converted into a monomorphic one.
Another way of looking at our scenario is from an overloading point of view where the generic operator \({\diamond _\mathsf{? }}\) of type \(\alpha \rightarrow \beta \rightarrow \gamma \) has two different instances \({\diamond _\mathsf{ZF }}: {\fancyscript{U}} \rightarrow {\fancyscript{U}} \rightarrow {\fancyscript{U}}\) and \({\diamond _\mathsf{H }}: (\alpha \rightarrow \beta ) \rightarrow \alpha \rightarrow \beta \). But typical algorithms which extend Hindley-Milner to take overloading into account like in [14] compute a principal type of which all other possible valid typings are instances. This is not what our algorithm does; instead we minimize a preference relation \(\preceq \) which is different from the is-an-instance-of relation a principal type maximizes.
4 Conclusion
We have implemented TypeInfer as part of the implementation of ProofScript, the proof language of ProofPeer. Combining the strengths of set theory with the strengths of higher-order logic has always had a certain appeal to ITP researchers. We believe that the answer has been staring us in the face for quite some time now in the form of ZFH; all we had to do to arrive at ZFH was to take HOL-ST and take away powers which HOL practitioners take for granted but which are of little use in the context of set theory. It is precisely the absence of those powers that makes TypeInfer possible and allows us to fuse the notations for higher-order and set-theoretic function application into a single one, which supports our belief.
References
1. ProofPeer. http://www.proofpeer.net
2. Obua, S., Fleuriot, J., Scott, P., Aspinall, D.: ProofPeer: Collaborative Theorem Proving. http://arxiv.org/abs/1404.6186
3. Hales, T., et al.: A formal proof of the Kepler conjecture. http://arxiv.org/abs/1501.02155
4. Homotopy Type Theory. http://homotopytypetheory.org/
5. Agerholm, S., Gordon, M.: Experiments with ZF set theory in HOL and Isabelle. In: Schubert, E.T., Alves-Foss, J., Windley, P. (eds.) HUG 1995. LNCS, vol. 971. Springer, Heidelberg (1995)
6. Gordon, M.: Set theory, higher order logic or both? In: von Wright, J., Harrison, J., Grundy, J. (eds.) TPHOLs 1996. LNCS, vol. 1125. Springer, Heidelberg (1996)
7. Paulson, L.C.: Set theory for verification: I. From foundations to functions. J. Autom. Reasoning 11(3), 353–389 (1993)
8. Krauss, A.: Partial and nested recursive function definitions in higher-order logic. J. Autom. Reasoning 44(4), 303–336 (2010)
9. ProofPeer Root Theory. http://proofpeer.net/repository?root.thy
10. Baader, F., Nipkow, T.: Term Rewriting and All That. Cambridge University Press, Cambridge (1999)
11. Milner, R.: A theory of type polymorphism in programming. J. Comput. Syst. Sci. 17, 348–375 (1978)
12. Traytel, D., Berghofer, S., Nipkow, T.: Extending Hindley-Milner type inference with coercive structural subtyping. In: Yang, H. (ed.) APLAS 2011. LNCS, vol. 7078, pp. 89–104. Springer, Heidelberg (2011)
13. Luo, Z.: Coercions in a polymorphic type system. Math. Struct. Comput. Sci. 18(04), 729–751 (2008)
14. Odersky, M., Wadler, P., Wehr, M.: A second look at overloading. In: Proceedings of the Seventh International Conference on Functional Programming Languages and Computer Architecture. ACM (1995)
© 2015 Springer International Publishing Switzerland
Obua, S., Fleuriot, J., Scott, P., Aspinall, D. (2015). Type Inference for ZFH. In: Kerber, M., Carette, J., Kaliszyk, C., Rabe, F., Sorge, V. (eds) Intelligent Computer Mathematics. CICM 2015. Lecture Notes in Computer Science(), vol 9150. Springer, Cham. https://doi.org/10.1007/978-3-319-20615-8_6
Print ISBN: 978-3-319-20614-1
Online ISBN: 978-3-319-20615-8