1 Introduction

Mechanizing formal systems and proofs about them plays an important role in establishing trust in programming languages and verifying software systems in general. Key questions in this setting are how to represent variables, (simultaneous) substitutions, assumptions, and derivations that depend on assumptions. Higher-order abstract syntax (HOAS) provides an elegant and unifying answer to these questions, relieving users from having to write boilerplate code.

Beluga is a proof checker with built-in support for HOAS encodings of formal systems based on the logical framework LF [13]. Metatheoretic inductive proofs are implemented as recursive, dependently-typed functions that manipulate and transform HOAS representations [21, 4, 25]. In this paper, we describe the interactive proof engine Harpoon which is built on top of Beluga. A Harpoon user modularly and incrementally develops a metatheoretic proof by solving independent subgoals via a fixed set of high-level actions. An action eliminates the subgoal on which it is executed, filling it with a proof that possibly contains new subgoals to be resolved. The actions we support are: introduction of assumptions, case-analysis, inductive reasoning, and both forward and backward reasoning styles.

While our fixed set of actions is largely inspired by similar systems such as Twelf [20, 28, 27] and Abella [11], Harpoon advances the state of the art in interactively developing mechanized proofs about HOAS representations in two ways:

1. We treat subgoals as first-class and characterize them using contextual types that pair their goal types together with the contexts in which they are meaningful; a contextual substitution property guarantees that each step of proof development correctly refines the partial proof under construction [8].

2. Rather than simply record the sequence of actions given by the user, we elaborate this sequence into an assertion-level proof [15], represented as what we call a proof script. The proof script is what we record as output of an interactive session. It can be both typechecked directly and translated into a Beluga program.

We have used Harpoon (see https://beluga-lang.readthedocs.io/) on a wide range of representative examples from the Beluga library: normalization proofs for the simply-typed lambda calculus [6], benchmarks for reasoning about binders [9, 10], and the recent POPLMark Reloaded challenge [1]. These examples involve numerous concerns that arise in proof development, and cover all the domain-specific abstractions that Beluga provides. Our experience shows that Harpoon lowers the entry barrier for users: they only need to understand how to represent formal systems and derivations using HOAS encodings and can then manipulate the HOAS representations directly via the high-level actions which correspond closely to how proofs are developed on paper. As such, we believe that Harpoon eases the task of proving metatheoretic statements.

2 Proof Development in Harpoon

We introduce the main features of Harpoon by interactively developing the proofs of two lemmas that play a central role in the proof of weak normalization of the simply-typed lambda calculus. For a more detailed description, see [6].

2.1 Initial setup: encoding the language

We begin by defining the simply-typed lambda-calculus in the logical framework LF [13] using an intrinsically typed encoding. In typical HOAS style, lambda abstraction takes an LF function representing the abstraction of a term over a variable. There is no case for variables, as they are treated implicitly. We remind the reader that this is a weak, representational function space – there is no case analysis or recursion, so only genuine lambda terms can be represented.

figure a

Free variables such as and are implicitly universally quantified (see [23]) and programmers subsequently do not supply arguments for implicitly quantified parameters when using a constructor.
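To make the HOAS idea concrete outside of LF, the following minimal sketch models lambda terms in Python, with the body of an abstraction represented as a host-language function. The names `Const`, `App`, `Lam`, and `show` are our own and do not come from the Beluga signature. Note one important difference: Python functions form a full computational function space, so this model does not rule out "exotic" bodies that inspect their argument, whereas the weak LF function space does.

```python
from dataclasses import dataclass
from typing import Callable, Union


@dataclass(frozen=True)
class Const:
    """A constant (hypothetical stand-in for a base-type constant)."""
    name: str


@dataclass(frozen=True)
class App:
    """Application of one term to another."""
    fun: "Term"
    arg: "Term"


@dataclass(frozen=True)
class Lam:
    """Abstraction: the body is a host-language function, so there is
    no constructor for variables and no hand-written substitution."""
    body: Callable[["Term"], "Term"]


Term = Union[Const, App, Lam]


def show(t, depth=0):
    """Render a term, inventing a name for each binder only when printing."""
    if isinstance(t, Const):
        return t.name
    if isinstance(t, App):
        return f"({show(t.fun, depth)} {show(t.arg, depth)})"
    placeholder = Const(f"x{depth}")  # stands in for the bound variable
    return f"(lam x{depth}. {show(t.body(placeholder), depth + 1)})"


identity = Lam(lambda x: x)
k_comb = Lam(lambda x: Lam(lambda y: x))
```

Variables never appear as a separate syntactic case: a bound variable exists only as the argument of the Python function stored in `Lam`.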

Next, we define a small-step operational semantics for the language. For simplicity, we use a call-by-name reduction strategy and do not reduce under lambda-abstractions. Note that we use LF application to encode the object-level substitution in the rule.

figure e

Using this definition, we define a notion of termination: a term halts if it reduces to a value. This is captured by the constructor .

figure g
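The operational semantics and the notion of halting can be sketched in the same illustrative Python model. This is an executable approximation under our own assumptions, not the LF relations themselves: `step` computes the unique call-by-name successor instead of describing derivations, and `halts` uses a hypothetical `fuel` bound because termination cannot be decided in general.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Const:
    name: str


@dataclass(frozen=True)
class App:
    fun: object
    arg: object


@dataclass(frozen=True)
class Lam:
    body: Callable[[object], object]


def is_value(t):
    """Constants and abstractions are values; we never reduce under Lam."""
    return isinstance(t, (Const, Lam))


def step(t):
    """Return the term after one call-by-name step, or None if t cannot step."""
    if isinstance(t, App):
        if isinstance(t.fun, Lam):
            # Beta rule: object-level substitution is host-level application.
            return t.fun.body(t.arg)
        reduced = step(t.fun)
        if reduced is not None:
            # Congruence rule: only the function position is reduced.
            return App(reduced, t.arg)
    return None


def halts(t, fuel=1000):
    """Return the value t reduces to, or None if t gets stuck or fuel runs out."""
    while fuel >= 0:
        if is_value(t):
            return t
        t = step(t)
        if t is None:
            return None
        fuel -= 1
    return None
```

Because the model is untyped, `step` can return `None` on stuck terms such as a constant applied to an argument; the intrinsically typed encoding excludes such terms by construction.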

2.2 Termination Property: intros, split, unbox, and solve

As the first short lemma, we show the Termination property: if is known to halt and , then also halts. We start our interactive proof session by loading the signature and defining the name of the theorem and the statement that we want to prove.

figure k

We pair each LF object such as together with the LF context in which it is meaningful [21, 26, 19]. We refer to such an object as a contextual object and embed contextual types, written as , into Beluga types using the “box” syntax. In this example, the LF context, written on the left of , is empty, as we consider closed LF objects. As before, the free variables and are implicitly quantified at the outside. They themselves stand for contextual objects and have contextual type . The theorem statements are hence statements about contextual LF objects and directly correspond to Beluga types.

The proof begins with a single subgoal whose type is simply the statement of the theorem under no assumptions. Since this subgoal has a function type, Harpoon will automatically apply the intros action, which introduces assumptions as follows: First, the (implicitly) universally quantified variables , are added to the meta-context. This context collects parameters introduced by universal quantifiers. This is in contrast with the computational context, which collects assumptions introduced by the simple function space. In particular, the second phase of the action adds the assumptions and to the computational context. Observe that since and have type , intros also adds to the meta-context, although it is implicit in the definitions of and and is not visible at all in the theorem statement (see the meta-context in Fig. 1, step 1).
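The two phases of introducing assumptions can be illustrated with a small Python sketch. It is a simplified model under our own assumptions: the classes `Atom`, `Forall`, and `Arrow` are hypothetical stand-ins for Beluga types, contextual types are kept as opaque strings, and no type reconstruction is performed, so implicit parameters are not discovered automatically as they are in Harpoon.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Atom:
    """An atomic statement, e.g. a boxed contextual type (kept opaque)."""
    name: str


@dataclass(frozen=True)
class Forall:
    """Universal quantification over a metavariable of some contextual type."""
    var: str
    ctype: str
    body: object


@dataclass(frozen=True)
class Arrow:
    """Simple function space: an assumption followed by a conclusion."""
    dom: object
    cod: object


def intros(goal, names):
    """Peel quantifiers and arrows off the goal.

    Quantified variables go to the meta-context, arrow domains go to the
    computation context under the supplied names. Returns the triple
    (meta_context, computation_context, remaining_goal).
    """
    meta, comp = [], []
    fresh = iter(names)
    while True:
        if isinstance(goal, Forall):
            meta.append((goal.var, goal.ctype))
            goal = goal.body
        elif isinstance(goal, Arrow):
            comp.append((next(fresh), goal.dom))
            goal = goal.cod
        else:
            return meta, comp, goal
```

For a statement of the shape "for all M and M', if step M M' and halts M' then halts M" (written here with hypothetical names), the sketch yields two metavariables, two named assumptions, and the atomic goal that remains to be proved.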

Fig. 1. Interactive session of the proof for the lemma.

The proof proceeds by inversion on . Using the split action, we add the two new assumptions and to the meta-context (see Fig. 1, step 1). To build a proof for , we need to show that there is a step from to some value . To build such a derivation, we first use the unbox action on the computation-level assumption to obtain an assumption in the meta-context which is accessible to the LF layer (inside a box) (see Fig. 1, step 2). Finally, we can finish the proof by supplying the term with the solve action (see Fig. 1, step 3). This is similar to the tactic in Coq.

The resulting proof script is given below. Assertions are written in boldface and curly braces denote new scopes, listing the full meta-context and the full computational context. Using an erasure we can then generate a translated program in the external syntax, i.e. the syntax a user would use when implementing the proof directly, rather than the internal syntax. It is hence much more compact than the actual proof script. This program can then be seamlessly combined with hand-written Beluga programs and can also be independently type-checked.

figure as
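Read computationally, the lemma is a function that transforms derivations. The Python sketch below illustrates that reading under loudly stated assumptions: terms are plain strings, `Step` and `Halts` are hypothetical evidence records, and halting evidence is modeled as a trace of single steps ending in a value, which may differ from the constructor used in the actual signature. The side condition that Beluga enforces statically through dependent types becomes a dynamic check here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    """Evidence that src takes one step to dst."""
    src: str
    dst: str


@dataclass(frozen=True)
class Halts:
    """Evidence that term reduces to value along trace (a tuple of Step)."""
    term: str
    trace: tuple
    value: str


def halts_step(s, h):
    """If s : src steps to dst, and h : dst halts, then src halts.

    Destructuring h plays the role of inversion, and building the
    result plays the role of supplying the final term.
    """
    if s.dst != h.term:
        # In a dependently typed setting this mismatch is a type error.
        raise ValueError("the step does not lead to the halting term")
    return Halts(s.src, (s,) + h.trace, h.value)
```

The function has exactly one case and performs no recursion, mirroring the fact that the proof needs a single inversion and no induction hypothesis.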

2.3 Setup continued: reducibility

We now consider one of the key lemmas in the weak normalization proof, called the backwards closed lemma, i.e. if is reducible at some type and steps to , then is also reducible at . We begin by defining the set of terms reducible at a type . All reducible terms are required to halt, and reducible terms at an arrow type are required to produce reducible output given reducible input. Concretely, a term is reducible at type , if for all terms where is reducible at type , then is reducible at type . Reducibility cannot be directly encoded on the LF layer, as it is not merely describing the syntax of an expression or derivation. Instead, we encode the set of reducible terms using the stratified type which is recursively defined on the type in Beluga (see [16]). Note that we write for explicit universal quantification over contextual objects.

figure bk
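The recursion on the type that underlies the stratified definition can be illustrated by a bounded, executable approximation in Python. This sketch is not the Beluga definition: the real definition quantifies over all terms, whereas here the hypothetical parameter `samples` supplies a finite list of candidate arguments per type, and halting is approximated with a `fuel` bound. All class and function names are our own.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Base:
    """The base type."""


@dataclass(frozen=True)
class Arr:
    dom: object
    cod: object


@dataclass(frozen=True)
class Const:
    name: str


@dataclass(frozen=True)
class App:
    fun: object
    arg: object


@dataclass(frozen=True)
class Lam:
    body: Callable[[object], object]


def step(t):
    """One call-by-name step, or None if t cannot step."""
    if isinstance(t, App):
        if isinstance(t.fun, Lam):
            return t.fun.body(t.arg)
        reduced = step(t.fun)
        if reduced is not None:
            return App(reduced, t.arg)
    return None


def halts(t, fuel=100):
    """True if t reaches a value within the given number of steps."""
    for _ in range(fuel):
        if isinstance(t, (Const, Lam)):
            return True
        t = step(t)
        if t is None:
            return False
    return False


def reducible(t, tp, samples):
    """Approximate reducibility of t at tp by recursion on tp."""
    if not halts(t):
        return False
    if isinstance(tp, Base):
        return True
    # Arrow case: reducible inputs must be mapped to reducible outputs.
    return all(
        reducible(App(t, n), tp.cod, samples)
        for n in samples.get(tp.dom, [])
        if reducible(n, tp.dom, samples)
    )
```

The recursive calls are made only at the domain and codomain of the arrow type, which is the structural decrease that justifies the definition.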

2.4 Backwards Closed Property: msplit, suffices, and by

We can now state the backwards closed lemma formally as follows: if is reducible at some type and steps to , then is also reducible at . We prove this lemma by induction on . This is specified by referring to the position of the induction variable in the statement.

figure bs

After Harpoon automatically introduces the metavariables , , and together with an assumption and , we use msplit to split the proof into two cases (see Fig. 2, step 1). Whereas split case-analyzes a Beluga type, msplit considers the cases for a (contextual) LF type. In reality, msplit is implemented in terms of the split action.

Fig. 2. Backwards Closed Lemma. Step 1: Case analysis of the type ; Steps 2 and 3: Base case (  \(=\)  ).

The case for  \(=\)  is straightforward (see Fig. 2, steps 2 and 3). First, we use the split action to invert the premise . Then, we use the by action to invoke the lemma (see Sec. 2.2) to obtain an assumption . We solve this case by supplying the term (see Fig. 2, step 3).

In the case for  \(=\)  , we begin similarly by inversion on using the split action (see Fig. 3, step 4). We observe that the goal type is , which can be produced by using the constructor if we can construct a proof for each of the user-specified types, and . Such backwards reasoning is accomplished via the suffices action. The user supplies a term representing an implication whose conclusion is compatible with the current goal and proceeds to prove its premises as specified (see Fig. 3, step 5).

Fig. 3. Backwards Closed Lemma: Step Case

To prove the first premise, we apply the lemma (see Fig. 3, step 6). As for the second premise, Harpoon first automatically introduces the variable and the assumption , so it remains to show . We deduce using the assumption . Using , we build a derivation using . Finally, we appeal to the induction hypothesis. Using the by action, we refer to the recursive call to complete the proof (see Fig. 3, step 7). The resulting proof script (of around 70 lines) can again be translated into a compact program.

Note that Harpoon allows users to use underscores to stand for arguments that are uniquely determined (see Fig. 3, step 7). We enforce that these underscores stand for uniquely determined objects in order to guarantee that the contexts and the goal type of every subgoal are closed. This ensures modularity: solving one subgoal does not affect any other open subgoals. As a consequence, users are not restricted in their proof development. As they would on paper, users can work on goals in any order, mix forward and backward reasoning, erase wrong parts, and replace them by correct steps.

Using the actions explained above, one can now prove the fundamental lemma and the weak normalization theorem. For a more detailed description of this proof in Beluga see [5, 6].

Additional actions. Harpoon supports some additional features not discussed in this paper; see https://beluga-lang.readthedocs.io/ for a complete list of actions. In general, these actions add no expressive power, but enable more precise expression of a user’s intent. For example, the action splits on the type of a given term, ensuring that there is a unique case to consider. It is implemented simply as the action followed by an additional check.

3 Implementation of Harpoon

Harpoon is a front end that allows users to construct a proof for a theorem statement represented as a Beluga type. Types in Beluga include universal quantification over contextual types (dependent function space, written with curly braces), implications (simple function space), boxed contextual types, and stratified/recursive types (written as \(\mathbf{c} ~\overrightarrow{C}\) where C stands for a contextual object). In addition, Beluga supports quantification over LF contexts and even LF substitutions relating two LF contexts. We omit these below for simplicity, although they are also supported in Harpoon. In essence, Beluga types correspond to statements in first-order logic over a domain consisting of contextual objects, LF contexts, and LF substitutions. We can view \(\mathbf{c} ~\overrightarrow{C}\) and \([\varPsi \vdash A]\) as atomic propositions.

figure dk

Users construct a natural deduction proof for a theorem statement where \(\varGamma \), the computation context, contains hypotheses introduced from the simple function space and where \(\varDelta \), the meta-context, holds parameters introduced from the universal quantifier (curly-brace syntax) or by lifting an assumption \([\varPsi \vdash A]\) from \(\varGamma \) (box-elimination rule).

A subgoal in Harpoon is a typed hole in the proof that remains to be filled by the user. Such a hole is represented by a subgoal variable, the type of which is a contextual type \((\varDelta ; \varGamma \vdash \tau )\) that captures the typechecking state at the point the variable occurs [19, 3]: it remains to construct a proof for \(\tau \) with the parameters from \(\varDelta \) and the assumptions from \(\varGamma \). Subgoal variables in the proof script are collected into a subgoal context and substitution of subgoal variables is type-preserving [8]. Interactive actions are implemented with subgoal substitutions, so the correctness of interactive proof refinement is a consequence of the subgoal substitution property. Note that a subgoal’s type cannot itself contain subgoals – the subgoal type must be fully determined, so solving one subgoal cannot affect any other subgoal. Furthermore, subgoal variables may be introduced only in positions where we must construct a normal term (written e); these are terms that we must check against a given type. This given type becomes part of the subgoal’s type. Subgoal variables stand thus in contrast with ordinary variables, which are neutral terms (written i). (See [14, 26, 16] for examples of this so-called bi-directional characterization of normal and neutral proof terms in Beluga.)
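The bookkeeping described above, a proof script with typed holes plus a subgoal context that is updated by substitution, can be sketched in Python as follows. This is a minimal illustration under our own assumptions: `Subgoal`, `Hole`, `Node`, and `ProofState` are hypothetical names, contexts and types are opaque values, and no typechecking of fragments is performed, so the type-preservation property of subgoal substitution is not enforced here.

```python
from dataclasses import dataclass
from itertools import count


@dataclass(frozen=True)
class Subgoal:
    """Contextual type of a hole: meta-context, computation context, goal."""
    meta: tuple
    comp: tuple
    goal: str


@dataclass(frozen=True)
class Hole:
    """A subgoal variable occurring in the proof script."""
    name: str


@dataclass(frozen=True)
class Node:
    """A proof-script construct with sub-scripts as children."""
    label: str
    children: tuple = ()


def substitute(script, name, fragment):
    """Replace the subgoal variable called name by fragment."""
    if isinstance(script, Hole):
        return fragment if script.name == name else script
    children = tuple(substitute(c, name, fragment) for c in script.children)
    return Node(script.label, children)


class ProofState:
    """A partial proof script together with its subgoal context."""

    def __init__(self, statement):
        self._fresh = count(1)
        self.subgoals = {}
        self.script = self.new_hole(Subgoal((), (), statement))

    def new_hole(self, subgoal):
        """Register a new subgoal and return the variable that stands for it."""
        name = f"?g{next(self._fresh)}"
        self.subgoals[name] = subgoal
        return Hole(name)

    def refine(self, name, fragment):
        """Eliminate a subgoal by substituting a script fragment for it."""
        del self.subgoals[name]
        self.script = substitute(self.script, name, fragment)
```

Since each `Subgoal` record is self-contained and never mentions other holes, refining one entry of `subgoals` leaves all other entries untouched, which is the modularity property discussed in the text.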

An action is executed on a subgoal to eliminate it, while possibly introducing new subgoals. Actions emphasize the bi-directional nature of interactive proof construction: some demand normal terms e and others demand neutral terms i. To execute an action, the system synthesizes a proof script fragment from it, and substitutes that fragment for the current subgoal. Any subgoal variables present in the fragment become part of the subgoal context, and the user will have to solve them later. When no subgoals remain, the proof script is closed and can be translated straightforwardly to a Beluga program in internal (fully elaborated) syntax. We employ an erasure to display the program to the user. These are the essential actions for proof development, omitting our so-called “administrative” actions (such as ):

figure dm

intros introduces all assumptions from function types in the current goal; solve closes the current subgoal with a given normal term e, introducing no new subgoals. This action trivially makes Harpoon complete, as a full Beluga program could be given via solve to eliminate the initial subgoal of any proof. The action by enables introducing an intermediate result, often from a lemma or an induction hypothesis, demanding a neutral term i and binding it to a given name; unbox is the same as by, but it binds the result as a variable in the meta-context; split considers a covering set of cases for a neutral term (typically a variable) and generates possible induction hypotheses based on the specified induction order (for details on coverage, see [24]); suffices allows programmers to reason backwards by supplying a neutral term i of function type and the types \(\overrightarrow{\tau }\) of arguments to construct for this function.
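How three of these actions transform a subgoal can be sketched in Python. The sketch makes simplifying assumptions that we state explicitly: terms and types are opaque strings, the caller supplies the type of each neutral term because the model has no type synthesis, and the returned labels are illustrative tags rather than Harpoon's concrete proof-script syntax. Each function returns a pair of a script label and the list of new subgoals.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Subgoal:
    """A goal type together with its computation context of (name, type) pairs."""
    comp: tuple
    goal: str


def solve(subgoal, term):
    """Close the subgoal with a normal term; no new subgoals arise."""
    return f"solve {term}", []


def by(subgoal, term, term_type, name):
    """Forward reasoning: bind the result of a neutral term to a name."""
    extended = Subgoal(subgoal.comp + ((name, term_type),), subgoal.goal)
    return f"by {term} as {name}", [extended]


def suffices(subgoal, term, premises, conclusion):
    """Backward reasoning with a term of type premises -> conclusion.

    The conclusion must match the current goal; each premise becomes
    a new subgoal in the same context.
    """
    if conclusion != subgoal.goal:
        raise ValueError("conclusion does not match the current goal")
    return f"suffices {term}", [Subgoal(subgoal.comp, p) for p in premises]
```

In this model `solve` is the only action that produces no new subgoals, `by` keeps the goal but grows the context, and `suffices` keeps the context but replaces the goal by the premises.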

4 Empirical evaluation of Harpoon

We give a summary of representative case studies that we replayed using Harpoon in Table 1. In porting these proofs to Harpoon, we use solve e only when e is atomic, i.e. it describes either a contextual LF term or a constant applied to all its arguments (either \(e = M\), \(e = [C]\) or \(e = c~\overrightarrow{C}~e_1 \ldots e_n\)). We list in the table the number of commands used to complete the proof and what particular features made the selected case study interesting for testing Harpoon.

Table 1. Summary of proofs ported to Harpoon from Beluga.

The first four examples proceed by straightforward induction, but the remaining examples are less direct since they feature logical relations. The STLC strong normalization and algorithmic equality completeness examples are larger developments, totalling 38 and 26 theorems respectively. Crucially, these case studies make use of Beluga’s domain-specific abstractions, by splitting on contexts, reasoning about object-language variables, and exploiting the built-in equational theory of substitutions. We have since used Harpoon to replay the meta-theoretic proofs about Standard ML from [18].

This evaluation gives us confidence in the robustness and expressive power of Harpoon.

5 Related work

There are several approaches to specify and reason about formal systems.

Beluga and hence Harpoon belong to the lineage of the Twelf system [20], which also implements the logical framework LF. Metatheoretic proofs in Twelf are implemented as relations. Totality checking then ensures that these relations correspond to actual proofs. As Twelf is limited to proving \(\varPi _1\) formulas (“forall-exists” statements), normalization proofs using logical relations cannot be directly encoded. Although Harpoon’s actions are largely inspired by the internal actions of Twelf’s (experimental) fully-automated metatheorem prover [28, 27], Harpoon supports user interaction, more expressive theorem statements, and generation of proof witnesses, in the form of both the generated proof script and the Beluga program resulting from translation.

The Abella system [11] also provides an interactive theorem prover for reasoning about specifications using HOAS. First, its theoretical basis is quite different from Beluga’s: Abella’s reasoning logic extends first-order logic with a \(\nabla \) quantifier [12] that is used to express properties about variables. Second, Abella’s interactive mode provides a fixed set of tactics, similar to the actions we describe in this paper. However, these tactics only loosely connect to the actual theoretical foundation of Abella and no proof terms are generated as witnesses by the Abella system.

We can also reason about formal systems in general purpose proof assistants such as Coq. The general philosophy in such systems is that users should be in the position of writing complex domain-specific tactics to facilitate proof construction using languages such as LTac [7] or MTac(2) [29, 17]. Although this is an extremely flexible approach, we believe that the tactic-centric view often obscures the actual line of reasoning in the proof. The proofs themselves can often be illegible and incomprehensible. Further, strong static guarantees about interactive proof construction are lacking; for example, dynamic checks enforce variable dependencies. In contrast, our goal is to enable mechanized proof development in a style close to that of a proof on paper. Thus we provide a fixed set of tactics suitable for a wide array of proofs, so users can concentrate on proof development instead of tactic development. As such, our work draws inspiration from [2] where the authors describe high-level actions within the tutorial proof checker Tutch. Our work extends and adapts this view to the mechanization of inductive metatheoretic proofs based on HOAS representations.

6 Conclusion

We have presented Harpoon, an interactive command-driven front-end of Beluga for mechanizing meta-theoretic proofs based on high-level actions. The sequence of interactive actions is elaborated behind the scenes into a proof script that represents an assertion-level proof. Finally, proof scripts can soundly be translated to Beluga programs. We have evaluated Harpoon on several case studies, ranging from purely syntactic arguments to proofs by logical relations. Our experience is that Harpoon lowers the entry barrier for users to develop meta-theoretic proofs about HOAS encodings.

In the future, we aim to extend Harpoon with additional high-level actions that support further automation. A natural first step is to support an action trivial which would attempt to automatically close an open sub-goal.