# Meta-F\(^\star \): Proof Automation with SMT, Tactics, and Metaprograms

## Abstract

We introduce Meta-F\(^{\star }\), a tactics and metaprogramming framework for the F\(^\star \) program verifier. The main novelty of Meta-F\(^\star \) is allowing the use of tactics and metaprogramming to discharge assertions not solvable by SMT, or to just simplify them into well-behaved SMT fragments. Plus, Meta-F\(^\star \) can be used to generate verified code automatically.

Meta-F\(^\star \) is implemented as an F\(^\star \) *effect*, which, given the powerful effect system of F\(^{\star }\), heavily increases code reuse and even enables the lightweight verification of metaprograms. Metaprograms can be either interpreted, or compiled to efficient native code that can be dynamically loaded into the F\(^\star \) type-checker and can interoperate with interpreted code. Evaluation on realistic case studies shows that Meta-F\(^\star \) provides substantial gains in proof development, efficiency, and robustness.

## Keywords

Tactics, Metaprogramming, Program verification, Verification conditions, SMT solvers, Proof assistants

## 1 Introduction

Scripting proofs using tactics and metaprogramming has a long tradition in interactive theorem provers (ITPs), starting with Milner’s Edinburgh LCF [37]. In this lineage, properties of *pure* programs are specified in expressive higher-order (and often dependently typed) logics, and proofs are conducted using various imperative programming languages, starting originally with ML.

Along a different axis, program verifiers like Dafny [47], VCC [23], Why3 [33], and Liquid Haskell [59] target both pure *and effectful* programs, with side-effects ranging from divergence to concurrency, but provide relatively weak logics for specification (e.g., first-order logic with a few selected theories like linear arithmetic). They work primarily by computing verification conditions (VCs) from programs, usually relying on annotations such as pre- and postconditions, and encoding them to automated theorem provers (ATPs) such as satisfiability modulo theories (SMT) solvers, often providing excellent automation.

These two sub-fields have influenced one another, though the situation is somewhat asymmetric. On the one hand, most interactive provers have gained support for exploiting SMT solvers or other ATPs, providing push-button automation for certain kinds of assertions [26, 31, 43, 44, 54]. On the other hand, recognizing the importance of interactive proofs, Why3 [33] interfaces with ITPs like Coq. However, working over proof obligations translated from Why3 requires users to be familiar not only with both these systems, but also with the specifics of the translation. And beyond Why3 and the tools based on it [25], no other SMT-based program verifiers have full-fledged support for interactive proving, leading to several downsides:

**Limits to expressiveness.** The expressiveness of program verifiers can be limited by the ATP used. When dealing with theories that are undecidable and difficult to automate (e.g., non-linear arithmetic or separation logic), proofs in ATP-based systems may become impossible or, at best, extremely tedious.

**Boilerplate.** To work around this lack of automation, programmers have to construct detailed proofs by hand, often repeating many tedious yet error-prone steps, so as to provide hints to the underlying solver to discover the proof. In contrast, ITPs with metaprogramming facilities excel at expressing domain-specific automation to complete such tedious proofs.

**Implicit proof context.** In most program verifiers, the logical context of a proof is implicit in the program text and depends on the control flow and the pre- and postconditions of preceding computations. Unlike in interactive proof assistants, programmers have no explicit access, either visual or programmatic, to this context, making proof structuring and exploration extremely difficult.

In direct response to these drawbacks, we seek a system that successfully combines the convenience of an automated program verifier for the common case, while seamlessly transitioning to an interactive proving experience for those parts of a proof that are hard to automate. Towards this end, we propose Meta-F\(^\star \), a tactics and metaprogramming framework for the F\(^\star \) [1, 58] program verifier.

**Highlights and Contributions of Meta-F\(^\star \)**

F\(^\star \) has historically been more deeply rooted as an SMT-based program verifier. Until now, F\(^\star \) discharged VCs exclusively by calling an SMT solver (usually Z3 [28]), providing good automation for many common program verification tasks, but also exhibiting the drawbacks discussed above.

Meta-F\(^\star \) is a framework that allows F\(^\star \) users to manipulate VCs using *tactics*. More generally, it supports *metaprogramming*, allowing programmers to script the construction of programs, by manipulating their syntax and customizing the way they are type-checked. This allows programmers to (1) implement custom procedures for manipulating VCs; (2) eliminate boilerplate in proofs and programs; and (3) inspect the proof state visually and manipulate it programmatically, addressing the drawbacks discussed above. SMT still plays a central role in Meta-F\(^\star \): a typical usage involves implementing tactics to transform VCs, so as to bring them into theories well-supported by SMT, without needing to (re)implement full decision procedures. Further, the generality of Meta-F\(^\star \) allows implementing non-trivial language extensions (e.g., typeclass resolution) entirely as metaprogramming libraries, without changes to the F\(^\star \) type-checker.

The technical **contributions** of our work include the following:

**“Meta-” is just an effect (Sect. 3.1).** Meta-F\(^\star \) is implemented using F\(^\star \)’s extensible effect system, which keeps programs and metaprograms properly isolated. Being first-class F\(^\star \) programs, metaprograms are typed, call-by-value, direct-style, higher-order functional programs, much like the original ML. Further, metaprograms can be themselves verified (to a degree, see Sect. 3.4) and metaprogrammed.

**Reconciling tactics with VC generation (Sect. 4.2).** In program verifiers the programmer often guides the solver towards the proof by supplying intermediate assertions. Meta-F\(^\star \) retains this style, but additionally allows assertions to be solved by tactics. To this end, a contribution of our work is extracting, from a VC, a proof state encompassing all relevant hypotheses, including those implicit in the program text.

**Executing metaprograms efficiently (Sect. 5).** Metaprograms are executed during type-checking. As a baseline, they can be interpreted using F\(^\star \)’s existing (but slow) abstract machine for term normalization, or a faster normalizer based on normalization by evaluation (NbE) [10, 16]. For much faster execution speed, metaprograms can also be run natively. This is achieved by combining the existing extraction mechanism of F\(^\star \) to OCaml with a new framework for safely extending the F\(^\star \) type-checker with such native code.

**Examples (Sect. 2) and evaluation (Sect. 6).** We evaluate Meta-F\(^\star \) on several case studies. First, we present a functional correctness proof for the Poly1305 message authentication code (MAC) [11], using a novel combination of proofs by reflection for dealing with non-linear arithmetic and SMT solving for linear arithmetic. We measure a clear gain in proof robustness: SMT-only proofs succeed only rarely (for reasonable timeouts), whereas our tactic+SMT proof is concise, never fails, and is faster. Next, we demonstrate an improvement in expressiveness, by developing a small library for proofs of heap-manipulating programs in separation logic, which was previously out-of-scope for F\(^\star \). Finally, we illustrate the ability to automatically construct verified effectful programs, by introducing a library for metaprogramming verified low-level parsers and serializers with applications to network programming, where verification is accelerated by processing the VC with tactics, and by programmatically tweaking the SMT context.

We conclude that tactics and metaprogramming can be fruitfully combined with VC generation and SMT solving to build verified programs with better, more scalable, and more robust automation.

The full version of this paper, including appendices, can be found online at https://www.fstar-lang.org/papers/metafstar.

## 2 Meta-F\(^\star \) by Example

F\(^\star \) is a general-purpose programming language aimed at program verification. It puts together the automation of an SMT-backed deductive verification tool with the expressive power of a language with full-spectrum dependent types. Briefly, it is a functional, higher-order, effectful, dependently typed language, with syntax loosely based on OCaml. F\(^\star \) supports refinement types and Hoare-style specifications, computing VCs of computations via a type-level weakest precondition (WP) calculus packed within *Dijkstra monads* [57]. F\(^\star \)’s effect system is also user-extensible [1]. Using it, one can model or embed imperative programming in styles ranging from ML to C [55] and assembly [35]. After verification, F\(^\star \) programs can be extracted to efficient OCaml or F# code. A first-order fragment of F\(^\star \), called Low\(^\star \), can also be extracted to C via the KreMLin compiler [55].
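As a rough illustration of what "computing VCs via a weakest precondition calculus" means, the following Python sketch computes weakest preconditions for a tiny imperative language. It is a toy model under stated assumptions: the statement encoding (tagged tuples), the function name `wp`, and the representation of predicates as Python functions over dictionaries are all hypothetical, and nothing here reflects how F\(^\star \) represents WPs internally.

```python
def wp(stmt, post):
    """Return the weakest precondition of `stmt` for postcondition `post`.

    Predicates are modelled as functions from a state (a dict) to bool.
    """
    kind = stmt[0]
    if kind == "skip":
        return post
    if kind == "assign":                       # ("assign", variable, expression)
        _, var, expr = stmt
        return lambda s: post({**s, var: expr(s)})
    if kind == "seq":                          # ("seq", first, second)
        _, first, second = stmt
        return wp(first, wp(second, post))
    if kind == "if":                           # ("if", condition, then, else)
        _, cond, then_branch, else_branch = stmt
        wp_then, wp_else = wp(then_branch, post), wp(else_branch, post)
        return lambda s: wp_then(s) if cond(s) else wp_else(s)
    if kind == "assert":                       # the asserted fact must hold here
        _, fact = stmt
        return lambda s: fact(s) and post(s)
    raise ValueError(f"unknown statement: {kind}")


# Hypothetical example program: y := |x|, followed by an intermediate assertion.
ABS = ("seq",
       ("if", lambda s: s["x"] < 0,
        ("assign", "y", lambda s: -s["x"]),
        ("assign", "y", lambda s: s["x"])),
       ("assert", lambda s: s["y"] >= 0))
```

The predicate `wp(ABS, post)` plays the role of the VC: a verifier would hand it to a solver, whereas this sketch can only evaluate it on concrete states.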

This paper introduces Meta-F\(^\star \), a metaprogramming framework for F\(^\star \) that allows users to safely customize and extend F\(^\star \) in many ways. For instance, Meta-F\(^\star \) can be used to preprocess or solve proof obligations; synthesize F\(^\star \) expressions; generate top-level definitions; and resolve implicit arguments in user-defined ways, enabling non-trivial extensions. This paper primarily discusses the first two features. Technically, none of these features deeply increase the expressive power of F\(^\star \), since one could manually program in F\(^\star \) terms that can now be metaprogrammed. However, as we will see shortly, manually programming terms and their proofs can be so prohibitively costly as to be practically infeasible.

Meta-F\(^\star \) is similar to other tactic frameworks, such as Coq’s [29] or Lean’s [30], in presenting a set of goals to the programmer, providing commands to break them down, allowing the programmer to inspect and build abstract syntax, etc. In this paper, we mostly detail the characteristics where Meta-F\(^\star \) *differs* from other engines.

This section presents Meta-F\(^\star \) informally, displaying its usage through case studies. We present any necessary F\(^\star \) background as needed.

### 2.1 Tactics for Individual Assertions and Partial Canonicalization

Non-linear arithmetic reasoning is crucially needed for the verification of optimized, low-level cryptographic primitives [18, 64], an important use case for F\(^\star \) [13] and other verification frameworks, including those that rely on SMT solving alone (e.g., Dafny [47]) as well as those that rely exclusively on tactic-based proofs (e.g., FiatCrypto [32]). While both styles have demonstrated significant successes, we make a case for a middle ground, leveraging the SMT solver for the parts of a VC where it is effective, and using tactics only where it is not.

We focus on Poly1305 [11], a widely-used cryptographic MAC that computes a series of integer multiplications and additions modulo a large prime number \(p = 2^{130} - 5\). Implementations of the Poly1305 multiplication and mod operations are carefully hand-optimized to represent 130-bit numbers in terms of smaller 32-bit or 64-bit registers, using clever tricks; proving their correctness requires reasoning about long sequences of additions and multiplications.
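To make the kind of arithmetic involved concrete, the following Python sketch represents a number below \(2^{130}\) as five 26-bit limbs and multiplies two such numbers limb by limb. The limb layout and the names `to_limbs`, `from_limbs`, and `mul_mod_p` are assumptions chosen for illustration; they are not taken from the verified implementations discussed here, and carry propagation is omitted because Python integers are unbounded.

```python
P = 2**130 - 5            # the Poly1305 prime
LIMB_BITS = 26            # assumed layout: 5 limbs x 26 bits = 130 bits
N_LIMBS = 5
MASK = (1 << LIMB_BITS) - 1


def to_limbs(x):
    """Split an integer in [0, 2**130) into little-endian 26-bit limbs."""
    assert 0 <= x < 2**130
    return [(x >> (LIMB_BITS * i)) & MASK for i in range(N_LIMBS)]


def from_limbs(limbs):
    """Evaluate limbs back into the integer they denote."""
    return sum(limb << (LIMB_BITS * i) for i, limb in enumerate(limbs))


def mul_mod_p(a, b):
    """Schoolbook limb multiplication, reduced modulo P."""
    acc = [0] * N_LIMBS
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            k = i + j
            if k < N_LIMBS:
                acc[k] += ai * bj
            else:
                # 2**(26*k) = 2**130 * 2**(26*(k-5)), and 2**130 = 5 (mod P),
                # so the high partial products fold back with a factor of 5.
                acc[k - N_LIMBS] += 5 * ai * bj
    return from_limbs(acc) % P
```

Showing that `mul_mod_p` agrees with `(x * y) % P` for all inputs is exactly the sort of statement whose proof needs long chains of distributivity and associativity-commutativity steps.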

**Previously: Guiding SMT Solvers by Manually Applying Lemmas.** Prior proofs of correctness of Poly1305 and other cryptographic primitives using SMT-based program verifiers, including F\(^\star \) [64] and Dafny [18], use a combination of SMT automation and manual application of lemmas. On the plus side, SMT solvers are excellent at linear arithmetic, so these proofs delegate all associativity-commutativity (AC) reasoning about addition to SMT. Non-linear arithmetic in SMT solvers, even just AC-rewriting and distributivity, is, however, inefficient and unreliable, so much so that the prior efforts above (and other works too [40, 41]) simply turn off support for non-linear arithmetic in the solver, in order not to degrade verification performance across the board due to poor interaction of theories. Instead, users need to explicitly invoke lemmas.^{1}

Given enough time, the solver can sometimes find a proof without the additional hints, but this is rare, dependent on context, and almost never robust. In this particular example we find by varying Z3’s random seed that, in an isolated setting, the lemma is proven automatically about 32% of the time. The numbers are much worse for more complex proofs, and where the context contains many facts, making this style quickly spiral out of control. For example, a proof of one of the main lemmas in Poly1305 requires 41 steps of rewriting for associativity-commutativity of multiplication, and distributivity of addition and multiplication, making the proof much too long to show here.

**SMT and Tactics in Meta-F\(^\star \).** In Meta-F\(^\star \), we can state and prove this main lemma in full; the lemma above was previously only a small part of it. Again, the specific property proven is not particularly relevant to our discussion. But, this time, the proof contains just two steps.

First, we call a single lemma about modular addition from F\(^\star \)’s standard library. Then, we assert an equality annotated with a tactic. Instead of being encoded as-is to the SMT solver, the assertion is first preprocessed by this tactic. The tactic is presented with the asserted equality as its goal, in an environment containing not only all variables in scope but also hypotheses for the precondition of the lemma being proven and the postcondition of the preceding lemma call (otherwise, the assertion could not be proven). The tactic will then canonicalize the sides of the equality, but notably only “up to” linear arithmetic conversions. Rather than fully canonicalizing the terms, the tactic just rewrites them into a sum-of-products canonical form, leaving all the remaining work to the SMT solver, which can then easily and robustly discharge the goal using linear arithmetic only.
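The idea of a sum-of-products canonical form can be sketched in a few lines of Python. The expression encoding (tagged tuples) and the function name `canon` are hypothetical, and the sketch canonicalizes fully by also collecting like monomials, whereas the tactic described above deliberately stops earlier and leaves the linear-arithmetic part to the SMT solver.

```python
from collections import Counter


def canon(e):
    """Normalize an expression over +, * into a sum of products.

    Expressions are ("const", n), ("var", name), ("add", e1, e2) or
    ("mul", e1, e2). The result maps each monomial, a sorted tuple of
    variable names, to its integer coefficient.
    """
    tag = e[0]
    if tag == "const":
        return {(): e[1]} if e[1] != 0 else {}
    if tag == "var":
        return {(e[1],): 1}
    left, right = canon(e[1]), canon(e[2])
    out = Counter()
    if tag == "add":
        for mono, coeff in list(left.items()) + list(right.items()):
            out[mono] += coeff
    elif tag == "mul":
        # Distributivity: multiply every monomial on the left with every
        # monomial on the right; sorting implements AC-rewriting of products.
        for m1, c1 in left.items():
            for m2, c2 in right.items():
                out[tuple(sorted(m1 + m2))] += c1 * c2
    else:
        raise ValueError(f"unknown node: {tag}")
    return {mono: coeff for mono, coeff in out.items() if coeff != 0}
```

Two expressions that are equal in every commutative semiring, such as `a * (b + c)` and `c * a + a * b`, are mapped to the same dictionary, so their equality reduces to a syntactic check.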

A simpler reflective tactic, for canonicalizing monoid expressions, inspects the *syntax* of the goal, using Meta-F\(^\star \)’s reflection capabilities (detailed ahead in Sect. 3.3). We have no way to prove once and for all that the expressions built from that syntax correctly denote the terms, but this fact can be proven automatically at each application of the tactic, by simple unification. The tactic then applies a lemma that changes the goal accordingly. Finally, by normalization, each side of the new goal is canonicalized.
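The general shape of a proof by reflection for monoids can be modelled in Python as follows. All names (`Unit`, `Atom`, `Mult`, `Monoid`, `denote`, `flatten`, `denote_list`) are hypothetical and chosen for this sketch; in a proof assistant the agreement between `denote` and `denote_list` after flattening would be a lemma proven once, while here it can only be checked on examples.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Unit:
    pass


@dataclass(frozen=True)
class Atom:
    value: object


@dataclass(frozen=True)
class Mult:
    left: object
    right: object


@dataclass(frozen=True)
class Monoid:
    unit: object
    mult: Callable[[object, object], object]


def denote(m, e):
    """Interpret a reflected expression back into the monoid."""
    if isinstance(e, Unit):
        return m.unit
    if isinstance(e, Atom):
        return e.value
    return m.mult(denote(m, e.left), denote(m, e.right))


def flatten(e):
    """Canonical form: the list of atoms, in order, with units dropped."""
    if isinstance(e, Unit):
        return []
    if isinstance(e, Atom):
        return [e.value]
    return flatten(e.left) + flatten(e.right)


def denote_list(m, atoms):
    """Interpret a canonical form as a right-nested product."""
    result = m.unit
    for atom in reversed(atoms):
        result = m.mult(atom, result)
    return result
```

If two reflected expressions flatten to the same list, their denotations are equal in any monoid; the equality of the lists is established purely by computation.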

The tactic used in the Poly1305 proof follows a similar approach, and is similar to existing reflective tactics for other proof assistants [9, 38], except that it only canonicalizes up to linear arithmetic, as explained above. The full VC for the lemma contains many other facts, e.g., that the divisor is non-zero so the division is well-defined and that the postcondition does indeed hold. These obligations remain in a “skeleton” VC that is also easily proven by Z3. This proof is much easier for the programmer to write and much more robust, as detailed ahead in Sect. 6.1. The proof of Poly1305’s other main lemma is similarly well automated.

**Tactic Proofs Without SMT.** Of course, one can verify the same lemma in Coq, following the same conceptual proof used in Meta-F\(^\star \), but relying on tactics only. Our proof (included in the appendix) is 27 lines long, two of which involve the use of a Coq tactic similar to our canonicalization tactic and of a Coq tactic for solving formulas in Presburger arithmetic. The remaining 25 lines include steps to destruct the propositional structure of terms, rewrite by equalities, and enrich the context to enable automatic modulo rewriting (Coq does not fully automatically recognize equality modulo *p* as an equivalence relation compatible with arithmetic operators). While a mature proof assistant like Coq has libraries and tools to ease this kind of manipulation, it can still be verbose.

In contrast, in Meta-F\(^\star \) all of these mundane parts of a proof are simply dispatched to the SMT solver, which decides linear arithmetic efficiently, beyond the quantifier-free Presburger fragment supported by such Coq tactics, handles congruence closure natively, etc.

### 2.2 Tactics for Entire VCs and Separation Logic

A different way to invoke Meta-F\(^\star \) is over an entire VC. While the exact shape of VCs is hard to predict, users with some experience can write tactics that find and solve particular sub-assertions within a VC, or simply massage them into shapes better suited for the SMT solver. We illustrate the idea on proofs for heap-manipulating programs.

One verification method that has eluded F\(^\star \) until now is separation logic, the main reason being that the pervasive “frame rule” requires instantiating existentially quantified heap variables, which is a challenge for SMT solvers, and simply too tedious for users. With Meta-F\(^\star \), one can do better. We have written a (proof-of-concept) embedding of separation logic and a tactic that performs heap frame inference automatically.

The approach we follow consists of designing the WP specifications for primitive stateful actions so as to make their footprint syntactically evident. The tactic then descends through VCs until it finds an existential for heaps arising from the frame rule. Then, by solving an equality between heap expressions (which requires canonicalization, for which we use a variant of the canonicalization tactic from Sect. 2.1 targeting *commutative* monoids), the tactic finds the frames and instantiates the existentials. Notably, as opposed to other tactic frameworks for separation logic [4, 45, 49, 51], this is *all* our tactic does before dispatching to the SMT solver, which can now be effective over the instantiated VC.

We now provide some detail on the framework. Our notation includes a symbol for the empty heap, the separating conjunction ‘\(\bullet \)’, and a heaplet consisting of a single reference set to a given value.^{2} Our development distinguishes between a “heap” and its “memory” for technical reasons, but we will treat the two as equivalent here. Further, a definedness predicate discriminates valid heaps (as in [52]), i.e., those built from separating conjunctions of *actually* disjoint heaps.

The frame inference tactic: (1) uses syntax inspection to unfold and traverse the goal until it reaches a WP arising from the frame rule; (2) inspects that WP’s first explicit argument to compute the references the current command requires; (3) uses unification variables to build a memory expression describing the required framing of input memory and instantiates the existentials of the frame-rule WP with these unification variables; (4) builds a goal that equates this memory expression with the WP’s third argument; and (5) uses a commutative monoids tactic (similar to Sect. 2.1) with the heap algebra (the empty heap and ‘\(\bullet \)’) to canonicalize the equality and sort the heaplets. Next, it can solve for the unification variables component-wise, and then proceed to the next WP arising from the frame rule.
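The canonicalize-then-match step at the heart of frame inference can be sketched in Python. This is a toy model: heaps are lists of `(reference, value)` heaplets, the names `canonicalize` and `infer_frame` are hypothetical, and the unification variables of the actual tactic are replaced by a direct split of the sorted heap.

```python
def canonicalize(heaplets):
    """Sort heaplets by reference name.

    Reordering is sound because separating conjunction is associative and
    commutative, with the empty heap (the empty list here) as its unit.
    """
    return sorted(heaplets)


def infer_frame(heap, required_refs):
    """Split `heap` into the command's footprint and the remaining frame.

    This solves  footprint * frame == heap,  where the footprint must mention
    exactly `required_refs`; its values and the frame are the unknowns.
    """
    footprint, frame, seen = {}, [], set()
    for ref, value in canonicalize(heap):
        if ref in seen:
            raise ValueError(f"heap is not well defined: {ref} occurs twice")
        seen.add(ref)
        if ref in required_refs:
            footprint[ref] = value
        else:
            frame.append((ref, value))
    missing = set(required_refs) - seen
    if missing:
        raise ValueError(f"references not in the heap: {sorted(missing)}")
    return footprint, frame
```

For a command that only needs one reference out of three, `infer_frame` returns that reference's current value together with the two untouched heaplets as the frame.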

In general, after frames are instantiated, the SMT solver can efficiently prove the remaining assertions, such as the obligations about heap definedness. Thus, with relatively little effort, Meta-F\(^\star \) brings an (albeit simple version of a) widely used yet previously out-of-scope program logic (i.e., separation logic) into F\(^\star \). To the best of our knowledge, the ability to *script* separation logic into an SMT-based program verifier, without any primitive support, is unique.

### 2.3 Metaprogramming Verified Low-Level Parsers and Serializers

Above, we used Meta-F\(^\star \) to manipulate VCs for user-written code. Here, we focus instead on generating verified code automatically. We loosely refer to the previous setting as using “tactics”, and to the current one as “metaprogramming”. In most ITPs, tactics and metaprogramming are not distinguished; however in a program verifier like F\(^\star \), where some proofs are not materialized at all (Sect. 4.1), proving VCs of existing terms is distinct from generating new terms.

Metaprogramming in F\(^\star \) involves programmatically generating a (potentially effectful) term (e.g., by constructing its syntax and instructing F\(^\star \) how to type-check it) and processing any VCs that arise via tactics. When applicable (e.g., when working in a domain-specific language), metaprogramming verified code can substantially reduce, or even eliminate, the burden of manual proofs.

We illustrate this by automating the generation of parsers and serializers from a type definition. Of course, this is a routine task in many mainstream metaprogramming frameworks (e.g., Template Haskell, camlp4, etc). The novelty here is that we produce imperative parsers and serializers extracted to C, with proofs that they are memory safe, functionally correct, and mutually inverse. This section is slightly simplified; more detail can be found in the appendix.

Each serializer specification carries a condition^{3} stating that the corresponding parser is an inverse of the serializer. A package is a dependent record of a parser and an associated serializer. Basic combinators in the library include constructs for parsing and serializing base values and pairs.

Next, we define low-level versions of these combinators, which work over mutable arrays instead of byte sequences. These combinators are coded in the Low\(^\star \) subset of F\(^\star \) (and so can be extracted to C) and are proven to both be memory-safe and respect their high-level variants. The type for low-level parsers denotes an imperative function that reads from an array of bytes and returns a value of the parsed type, behaving as the corresponding specificational parser. Conversely, a low-level serializer writes into an array of bytes, behaving as the corresponding specificational serializer.

Given such a library, we would like to build verified, mutually inverse, low-level parsers and serializers for specific data formats. The task is mechanical, yet overwhelmingly tedious by hand, with many auxiliary proof obligations of a predictable structure: a perfect candidate for metaprogramming.

*Deriving Specifications from a Type Definition.* Consider an F\(^\star \) type representing lists of exactly 18 pairs of bytes. The first component of our metaprogram generates parser and serializer specifications from a type definition. Meta-F\(^\star \) provides dedicated syntax for calling a metaprogram for code generation, in which an underscore stands for the term to be generated: Meta-F\(^\star \) will run the metaprogram and, if successful, replace the underscore by the result. In this case, the metaprogram inspects the syntax of the type (Sect. 3.3) and produces a package built from the library’s combinators, including sequencing combinators.
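The idea of deriving a parser and a serializer from a type description can be mimicked in Python. In this sketch the type description is a nested tuple, and `derive`, `SAMPLE`, and the tags `"byte"`, `"pair"`, and `"nlist"` are all hypothetical names; unlike the metaprogram described above, nothing here is verified, so the inverse property can only be tested.

```python
def derive(ty):
    """Return (parse, serialize) for a type description.

    parse(data) returns (value, bytes_consumed), or None on failure;
    serialize(value) returns the bytes representing the value.
    """
    if ty == "byte":
        def parse(data):
            return (data[0], 1) if len(data) >= 1 else None

        def serialize(value):
            return bytes([value])
        return parse, serialize

    if ty[0] == "pair":                        # ("pair", first_type, second_type)
        p1, s1 = derive(ty[1])
        p2, s2 = derive(ty[2])

        def parse(data):
            r1 = p1(data)
            if r1 is None:
                return None
            v1, n1 = r1
            r2 = p2(data[n1:])
            if r2 is None:
                return None
            v2, n2 = r2
            return (v1, v2), n1 + n2

        def serialize(value):
            return s1(value[0]) + s2(value[1])
        return parse, serialize

    if ty[0] == "nlist":                       # ("nlist", count, element_type)
        count = ty[1]
        parse_elem, serialize_elem = derive(ty[2])

        def parse(data):
            values, offset = [], 0
            for _ in range(count):
                r = parse_elem(data[offset:])
                if r is None:
                    return None
                value, consumed = r
                values.append(value)
                offset += consumed
            return values, offset

        def serialize(values):
            assert len(values) == count
            return b"".join(serialize_elem(v) for v in values)
        return parse, serialize

    raise ValueError(f"unsupported type description: {ty!r}")


# Hypothetical description of "lists of exactly 18 pairs of bytes".
SAMPLE = ("nlist", 18, ("pair", "byte", "byte"))
```

Calling `derive(SAMPLE)` yields a parser that consumes exactly 36 bytes and a serializer that produces them.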

*Deriving Low-Level Implementations that Match Specifications.* From this pair of specifications, we can automatically generate Low\(^\star \) implementations for them. For simple types like the one above, the generated code is fairly simple. However, for more complex types, using the combinator library comes with non-trivial proof obligations. For example, consider even a simple enumeration with two constructors, each represented by a distinct byte value. The parser first parses a “bounded” byte, with only two values. A further combinator then expects functions between the bounded byte and the datatype being parsed, which must be proven to be mutual inverses. This proof is conceptually easy, but for large enumerations nested deep within the structure of other types, it is notoriously hard for SMT solvers. Since the proof is inherently computational, a proof that destructs the inductive type into its cases and then normalizes is much more natural. With our metaprogram, we can produce the term and then discharge these proof obligations with a tactic *on the spot*, eliminating them from the final VC. We also explore simply tweaking the SMT context, again via a tactic, with good results. A quantitative evaluation is provided in Sect. 6.2.
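The "destruct into cases, then compute" argument for enumerations has a direct executable analogue: enumerate every case and evaluate both round trips. The Python sketch below uses a hypothetical two-constructor enumeration `Color`; the function names are likewise assumptions made for illustration.

```python
from enum import Enum


class Color(Enum):          # hypothetical two-constructor enumeration
    RED = 0
    GREEN = 1


BOUND = len(Color)          # number of valid byte values


def parse_bounded_byte(data):
    """Parse one byte, accepting only values below BOUND."""
    if len(data) >= 1 and data[0] < BOUND:
        return data[0], 1
    return None


def of_byte(b):
    return Color(b)


def to_byte(c):
    return c.value


def parse_color(data):
    r = parse_bounded_byte(data)
    if r is None:
        return None
    b, consumed = r
    return of_byte(b), consumed


def serialize_color(c):
    return bytes([to_byte(c)])


def mutual_inverses():
    """Check both round trips by enumerating every case and computing."""
    return (all(of_byte(to_byte(c)) == c for c in Color)
            and all(to_byte(of_byte(b)) == b for b in range(BOUND)))
```

The check in `mutual_inverses` grows linearly with the number of constructors and involves no search, which is why a case-analysis proof scales where an SMT-based one struggles.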

## 3 The Design of Meta-F\(^\star \)

Having caught a glimpse of the use cases for Meta-F\(^\star \), we now turn to its design. As usual in proof assistants (such as Coq, Lean and Idris), Meta-F\(^\star \) tactics work over a set of goals and apply primitive actions to transform them, possibly solving some goals and generating new goals in the process. Since this is standard, we focus mostly on the aspects where Meta-F\(^\star \) differs from other engines. We first describe how metaprograms are modelled as an effect (Sect. 3.1) and their runtime model (Sect. 3.2). We then detail some of Meta-F\(^\star \)’s syntax inspection and building capabilities (Sect. 3.3). Finally, we show how to perform some (lightweight) verification of metaprograms (Sect. 3.4) within F\(^\star \).

### 3.1 An Effect for Metaprogramming

The effect definition introduces *computation types* consisting of a return type and a specification. However, until Sect. 3.4 we shall only use a derived form in which the specification is trivial. These computation types are distinct from their underlying monadic representation type: users cannot directly access the proof state except via the actions. The simplest actions stem from the monad definition: one returns the current proof state and another fails with the given exception.^{4} Failures can be handled using an exception-handling combinator, which resets the state on failure, including that of unification metavariables. We emphasize two points here. First, there is no action for setting the proof state. This is to forbid metaprograms from arbitrarily replacing their proof state, which would be unsound. Second, the argument to the exception-handling combinator must be thunked, since in F\(^\star \) impure un-suspended computations are evaluated before they are passed into functions.

Small combinators written in this effect illustrate a few key points of Meta-F\(^\star \). As for all other F\(^\star \) effects, metaprograms are written in applicative style, without explicit monadic sequencing or lifting of computations (which are inserted under the hood). This also works across different effects: a metaprogram can seamlessly combine a pure function from F\(^\star \)’s list library with another metaprogram. Metaprograms are also type- and effect-inferred: even for a combinator that is not annotated at all, F\(^\star \) infers a polymorphic type.

It should be noted that, if lacking an effect extension feature, one could embed metaprograms simply via the (properly abstracted) underlying monad instead of the effect. It is just more convenient to use an effect, given we are working within an effectful program verifier already. In what follows, with the exception of Sect. 3.4 where we describe specifications for metaprograms, there is little reliance on using an effect; so, the same ideas could be applied in other settings.
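The monadic reading of metaprograms, as state-passing functions that may fail, can be modelled in Python. The names below (`get`, `fail`, `bind`, `try_with`, `solve_first_goal`, `TacticFailure`) are hypothetical, proof states are modelled as immutable tuples of goals, and unification metavariables are not modelled at all.

```python
class TacticFailure(Exception):
    """Raised when a metaprogram fails."""


# A metaprogram is a function: proof_state -> (result, new_proof_state).

def get():
    """Return the current proof state without changing it."""
    return lambda ps: (ps, ps)


def fail(message):
    def run(ps):
        raise TacticFailure(message)
    return run


def bind(m, f):
    """Run `m`, then feed its result to `f`, threading the proof state."""
    def run(ps):
        value, ps1 = m(ps)
        return f(value)(ps1)
    return run


def try_with(body, handler):
    """Run the thunk `body`; on failure, run `handler` from the saved state."""
    def run(ps):
        try:
            return body()(ps)
        except TacticFailure as exc:
            # `ps` is the state on entry, so any changes made by `body`
            # before it failed are discarded.
            return handler(exc)(ps)
    return run


def solve_first_goal():
    """A stand-in primitive that discharges the first goal."""
    def run(ps):
        if not ps:
            raise TacticFailure("no goals")
        return None, ps[1:]
    return run
```

There is deliberately no function for overwriting the proof state: the state only changes through primitives such as `solve_first_goal`. Note also that `body` is passed as a thunk, mirroring the second point made above.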

### 3.2 Executing Meta-F\(^\star \) Metaprograms

Running metaprograms involves three steps. First, they are *reified* [1] into their underlying monadic representation, i.e. as state-passing functions. User code cannot reify metaprograms: only F\(^\star \) can do so when about to process a goal.

Second, the reified term is applied to an initial proof state, and then simply evaluated according to F\(^\star \)’s dynamic semantics, for instance using F\(^\star \)’s existing normalizer. For intensive applications, such as proofs by reflection, we provide faster alternatives (Sect. 5). In order to perform this second step, the proof state, which up until this moment exists only internally to F\(^\star \), must be *embedded* as a term, i.e., as abstract syntax. Here is where its abstraction pays off: since metaprograms cannot interact with a proof state except through a limited interface, it need not be *deeply* embedded as syntax. By simply wrapping the internal proof state into a new kind of “alien” term, and making the primitives aware of this wrapping, we can readily run the metaprogram that safely carries its alien proof state around. This wrapping of proof states is a constant-time operation.

The third step is interpreting the primitives. They are realized by functions of similar types implemented within the F\(^\star \) type-checker, but over an internal monad and concrete definitions of the types involved, such as proof states. Hence, there is a translation involved on every call and return, switching between embedded representations and their concrete variants. Take, for example, a primitive whose arguments are a string and the proof state. When interpreting a call to it, the interpreter must *unembed* the arguments (which are representations of F\(^\star \) terms) into a concrete string and a concrete proof state to pass to the internal implementation of the primitive, which lives within the F\(^\star \) type-checker. The situation is symmetric for the return value of the call, which must be *embedded* as a term.
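The embedding and unembedding around a primitive call can be sketched in Python as follows. The term constructors (`StringLit`, `UnitLit`, `Alien`), the `ProofState` class, and the logging primitive `log_internal` are all hypothetical stand-ins; the point is only that the proof state crosses the boundary inside an opaque wrapper, without being traversed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StringLit:            # embedded string constant
    value: str


@dataclass(frozen=True)
class UnitLit:              # embedded unit value
    pass


@dataclass(frozen=True)
class Alien:                # opaque wrapper around a host-level object
    payload: object


class ProofState:
    """Stand-in for the type-checker's internal proof state."""

    def __init__(self, goals):
        self.goals = list(goals)
        self.log = []


def embed_string(s):
    return StringLit(s)


def unembed_string(term):
    if not isinstance(term, StringLit):
        raise TypeError("not an embedded string")
    return term.value


def embed_proofstate(ps):
    return Alien(ps)        # constant time: the state is wrapped, not copied


def unembed_proofstate(term):
    if not (isinstance(term, Alien) and isinstance(term.payload, ProofState)):
        raise TypeError("not an embedded proof state")
    return term.payload


def log_internal(message, ps):
    """Internal implementation of the primitive, over concrete values."""
    ps.log.append(f"{message}: {len(ps.goals)} goal(s)")
    return None, ps


def interpret_log(message_term, ps_term):
    """Interpreter step: unembed arguments, call the primitive, embed results."""
    message = unembed_string(message_term)
    ps = unembed_proofstate(ps_term)
    _, ps_after = log_internal(message, ps)
    return UnitLit(), embed_proofstate(ps_after)
```

Because `Alien` exposes no structure, code running on the term side cannot inspect or forge a proof state; it can only pass the wrapper back to a primitive.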

### 3.3 Syntax Inspection, Generation, and Quotation

If metaprograms are to be reusable over different kinds of goals, they must be able to reflect on the goals they are invoked to solve. Like any metaprogramming system, Meta-F\(^\star \) offers a way to inspect and construct the syntax of F\(^\star \) terms. Our representation of terms as an inductive type, and the variants of quotations, are inspired by the ones in Idris [22] and Lean [30].

**Inspecting Syntax.** Internally, F\(^\star \) uses a locally-nameless representation [21] with explicit, delayed substitutions. To shield metaprograms from some of this internal bureaucracy, we expose a simplified view [61] of terms. The view type provides the “one-level-deep” structure of a term: metaprograms must call an inspection function to reveal the structure of the term, one constructor at a time. The view exposes three kinds of variables: bound variables; named local variables; and top-level fully qualified names. Bound variables and local variables are distinguished since the internal abstract syntax is locally nameless. For metaprogramming, it is usually simpler to use a fully-named representation, so we provide functions that open and close binders appropriately to maintain this invariant. Since opening binders requires freshness, the function that opens binders is effectful.^{5}

As generating large pieces of syntax via the view easily becomes tedious, we also provide some ways of *quoting* terms:

**Static Quotations.** A static quotation is just a shorthand for statically calling the F\(^\star \) parser to convert the quoted expression into the abstract syntax of F\(^\star \) terms above.

**Dynamic Quotations.** A second form of quotation is the dynamic quotation, an effectful operation that is interpreted by F\(^\star \)’s normalizer during metaprogram evaluation. It returns the syntax of its argument at the time the quotation is evaluated. Evaluating a dynamic quotation substitutes all the free variables in its argument with their current values in the execution environment, suspends further evaluation, and returns the abstract syntax of the resulting term.

**Anti-quotations.** Static quotations are useful for building big chunks of syntax concisely, but they are of limited use if we cannot combine them with existing bits of syntax. Subterms of a quotation are therefore allowed to “escape” and be substituted by arbitrary expressions. An antiquoted subterm must be an expression whose type is that of term representations in order for the quotation to be well-typed. For example, a quotation with one antiquoted operand can create syntax for an addition where one operand is an integer constant and the other is the term represented by a metaprogram variable.
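The interplay of quotation and antiquotation can be modelled in a few lines of Python. This is a sketch, not Meta-F\(^\star \) syntax: the node classes, the `Hole` marker for an antiquotation site, and the name `add` for integer addition are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Sequence, Tuple


@dataclass(frozen=True)
class IntConst:
    value: int


@dataclass(frozen=True)
class Name:
    qualified: str


@dataclass(frozen=True)
class App:
    head: object
    args: Tuple[object, ...]


@dataclass(frozen=True)
class Hole:
    """An antiquotation site inside a quoted template (hypothetical)."""
    index: int


TERM_NODES = (IntConst, Name, App)


def fill(template, pieces: Sequence[object]):
    """Build syntax from a quoted template, substituting each antiquoted
    piece for its hole. Pieces must themselves be term representations."""
    if isinstance(template, Hole):
        piece = pieces[template.index]
        if not isinstance(piece, TERM_NODES):
            raise TypeError("an antiquoted expression must be a term")
        return piece
    if isinstance(template, App):
        return App(fill(template.head, pieces),
                   tuple(fill(arg, pieces) for arg in template.args))
    return template


# Template for "1 + <antiquoted term>"; `add` is a made-up operator name.
ADD_ONE = App(Name("add"), (IntConst(1), Hole(0)))
```

Calling `fill(ADD_ONE, [t])` with a term `t` yields the syntax of an addition whose second operand is `t`, while passing a non-term such as a Python integer is rejected, mirroring the typing requirement on antiquoted expressions.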

**Unquotation.** Finally, we provide an effectful unquotation operation, which takes a term representation and an expected type for it (usually inferred from the context), and calls the F\(^\star \) type-checker to check and elaborate the term representation into a well-typed term.

### 3.4 Specifying and Verifying Metaprograms

Since we model metaprograms as a particular kind of effectful program within F\(^\star \), which is a program verifier, a natural question to ask is whether F\(^\star \) can specify and verify metaprograms. The answer is “yes, to a degree”.

Due to type abstraction, though, the specifications of most primitives cannot provide complete detail about their behavior, and deeper specifications (such as ensuring a tactic will correctly solve a goal) cannot currently be proven, nor even stated: doing so would require, at least, an internalization of the typing judgment of F\(^\star \). While this is an exciting possibility [3], we have for now only focused on verifying basic safety properties of metaprograms, which helps users detect errors early, and whose proofs the SMT solver can handle well. In principle, though, one can also write tactics to discharge the proof obligations of metaprograms.

## 4 Meta-F\(^\star \), Formally

We now describe the trust assumptions for Meta-F\(^\star \) (Sect. 4.1) and then how we reconcile tactics within a program verifier, where the exact shape of VCs is not given, nor known a priori by the user (Sect. 4.2).

### 4.1 Correctness and Trusted Computing Base (TCB)

As in any proof assistant, tactics and metaprogramming would be rather useless if they allowed one to “prove” invalid judgments, so care must be taken to ensure soundness. We begin with a taste of the specifics of F\(^\star \)’s static semantics, which influence the trust model for Meta-F\(^\star \), and then provide more detail on the TCB.

**Proof Irrelevance in F\(^\star \).** The two rules for introducing and eliminating refinement types are key in F\(^\star \), as they form the basis of its proof irrelevance. The \(\vDash \) symbol used in these rules represents F\(^\star \)’s *validity judgment* [1] which, at a high-level, defines a proof-irrelevant, classical, higher-order logic. These validity hypotheses are usually collected by the type-checker, and then encoded to the SMT solver in bulk. Crucially, the irrelevance of validity is what permits efficient interaction with SMT solvers, since reconstructing F\(^\star \) terms from SMT proofs is unneeded.
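As a reading aid, the introduction and elimination rules for refinement types can be sketched in their usual shape. The refinement notation \(x{:}t\{\phi \}\), the substitution \(\phi [e/x]\), and the rule name V-Refine are assumptions of this sketch; only the name T-Refine appears in the text.

```latex
\[
\frac{\varGamma \vdash e : t \qquad \varGamma \vDash \phi[e/x]}
     {\varGamma \vdash e : x{:}t\{\phi\}}\;\textsc{T-Refine}
\qquad
\frac{\varGamma \vdash e : x{:}t\{\phi\}}
     {\varGamma \vDash \phi[e/x]}\;\textsc{V-Refine}
\]
```

Read this way, the introduction rule has a typing premise and a validity premise, and the elimination rule turns a typing fact back into a validity fact, which is the mutual recursion between typing and validity.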

As evidenced in the rules, validity and typing are mutually recursive, and therefore Meta-F\(^\star \) must also construct validity derivations. In the implementation, we model these validity goals as holes with a “squash” type [5, 53], i.e., a refinement of the unit type. Concretely, we model a validity goal as a typing goal for the corresponding squashed proposition, using a unification variable. Meta-F\(^\star \) does not construct deep solutions to squashed goals: if they are proven valid, the variable is simply solved by the unit value. At any point, any such irrelevant goal can be sent to the SMT solver. Relevant goals, on the other hand, cannot be sent to SMT.

**Scripting the Typing Judgment.** A consequence of validity proofs not being materialized is that type-checking is undecidable in F\(^\star \). For instance: does the unit value solve a hole whose type is the squash of \(\phi \)? Well, only if \(\phi \) holds, a condition which no type-checker can effectively decide. This implies that the type-checker cannot, in general, rely on proof terms to reconstruct a proof. Hence, the primitives are designed to provide access to the typing judgment of F\(^\star \) directly, instead of building syntax for proof terms. One can think of F\(^\star \)’s type-checker as implementing one particular algorithmic heuristic of the typing and validity judgments, a heuristic which happens to work well in practice. For convenience, this default type-checking heuristic is also available to metaprograms through a dedicated primitive. Having programmatic access to the typing judgment also provides the flexibility to tweak VC generation as needed, instead of leaving it to the default behavior of F\(^\star \). For instance, one primitive implements T-Refine. When applied, it produces two new goals, including that the refinement actually holds. At that point, a metaprogram can run any arbitrary tactic on it, instead of letting the F\(^\star \) type-checker collect the obligation and send it to the SMT solver in bulk with others.

**Trust.** There are two common approaches for the correctness of tactic engines: (1) the *de Bruijn criterion* [6], which requires constructing full proofs (or proof terms) and checking them at the end, hence reducing trust to an independent proof-checker; and (2) the LCF style, which applies backwards reasoning while constructing validation functions at every step, reducing trust to primitive, forward-style implementations of the system’s inference rules.

As we wish to make use of SMT solvers within F\(^\star \), the first approach is not easy. Reconstructing the proofs SMT solvers produce, if any, back into a proper derivation remains a significant challenge (even despite recent progress, e.g. [17, 31]). Further, the logical encoding from F\(^\star \) to SMT, along with the solver itself, are already part of F\(^\star \)’s TCB: shielding Meta-F\(^\star \) from them would not significantly increase safety of the combined system.

Instead, we roughly follow the LCF approach and implement F\(^\star \)’s typing rules as the basic user-facing metaprogramming actions. However, instead of implementing the rules in forward-style and using them to validate (untrusted) backwards-style tactics, we implement them directly in backwards-style. That is, they run by breaking down goals into subgoals, instead of combining proven facts into new proven facts. Using LCF style makes the primitives part of the TCB. However, given the primitives are sound, any combination of them also is, and any user-provided metaprogram must be safe due to the abstraction imposed by the metaprogramming effect, as discussed next.

**Correct Evolutions of the Proof State.** For soundness, it is imperative that tactics do not arbitrarily drop goals from the proof state, and only discharge them when they are solved, or when they can be solved by other goals tracked in the proof state. For a concrete example, consider a metaprogram that constructs a function by first introducing its argument. Meta-F\(^\star \) will create an initial proof state with a single goal, for the function itself, and begin executing the metaprogram. When applying the introduction primitive, the proof state transitions to a single new goal, for the body of the function. At this point, a solution to the original goal has not yet been built, since it *depends* on the solution to the new goal. When the new goal is solved, we can solve our original goal with the corresponding abstraction. To formalize these dependencies, we say that a proof state \(\phi \) *correctly evolves (via f) to* \(\psi \) when there is a generic transformation *f*, called a *validation*, from solutions to all of \(\psi \)’s goals into correct solutions for \(\phi \)’s goals. When \(\phi \) has *n* goals and \(\psi \) has *m* goals, the validation *f* is a function from *m*-tuples of terms into *n*-tuples of terms. Validations may be composed, providing the transitivity of correct evolution, and if a proof state \(\phi \) correctly evolves (in any amount of steps) into a state with no more goals, then we have fully defined solutions to all of \(\phi \)’s goals. We emphasize that validations are not constructed explicitly during the execution of metaprograms. Instead we exploit unification metavariables to instantiate the solutions automatically.
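Although validations are never built explicitly in Meta-F\(^\star \), writing them out makes the definition concrete. The Python sketch below is purely illustrative: solutions are modelled as strings of syntax, and `intro_step`, `split_step`, and `compose` are hypothetical helpers, not Meta-F\(^\star \) primitives.

```python
from typing import Callable, List

Solution = str  # assumption: a solution is modelled as a string of syntax
Validation = Callable[[List[Solution]], List[Solution]]


def intro_step(var: str) -> Validation:
    """Validation for a function-introduction step (one goal to one goal):
    a solution for the body yields a solution for the function."""
    def validate(solutions: List[Solution]) -> List[Solution]:
        (body,) = solutions
        return [f"fun {var} -> {body}"]
    return validate


def split_step() -> Validation:
    """Validation for a hypothetical pair-splitting step (one goal to two):
    solutions for both components yield a solution for the pair."""
    def validate(solutions: List[Solution]) -> List[Solution]:
        left, right = solutions
        return [f"({left}, {right})"]
    return validate


def compose(first: Validation, second: Validation) -> Validation:
    """If phi evolves to psi via `first` and psi evolves to chi via `second`,
    then phi evolves to chi via the composition (transitivity)."""
    return lambda solutions: first(second(solutions))


# Goal for a function returning a pair: introduce x, then split the pair.
whole = compose(intro_step("x"), split_step())
```

Applying `whole` to the two leaf solutions `["x", "42"]` rebuilds a solution for the original goal, which is what it means for the initial proof state to correctly evolve into the final one.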

Note that validations may construct solutions for more than one goal, i.e., their codomain is not a single term. This is required in Meta-F\(^\star \), where primitive steps may not only decompose goals into subgoals, but actually combine goals as well. Currently, only one primitive provides this behavior: it finds a maximal common prefix of the environment of two irrelevant goals, reverts the “extra” binders in both goals and builds their conjunction. Combining goals in this way is especially useful for sending multiple goals to the SMT solver in a single call. When there are common obligations within two goals, combining them before calling the SMT solver can result in a significantly faster proof.

In the example above, *f* is the (mathematical, meta-level) function taking a term of type *int* (the solution for the new goal) and building syntax for its abstraction over *x*. Further, the introduction primitive respects the correct-evolution preorder, by the very typing rule (T-Fun) from which it is defined. In this manner, every typing rule induces a syntax-building metaprogramming step. Our primitives come from this dual interpretation of typing rules, which ensures that logical consistency is preserved.

Since the correct-evolution relation is a preorder, and every metaprogramming primitive we provide the user evolves the proof state according to it, it is trivially the case that the final proof state returned by a (successful) computation is a correct evolution of the initial one. That means that when the metaprogram terminates, one has indeed broken down the proof obligation correctly, and is left with a (hopefully) simpler set of obligations to fulfill. Note that since correct evolution is a preorder, the metaprogramming effect provides an interesting example of monotonic state [2].

### 4.2 Extracting Individual Assertions

As discussed, the logical context of a goal processed by a tactic is not always syntactically evident in the program. And, as the example from Sect. 3.4 shows, some obligations crucially depend on the control-flow of the program. Hence, the proof state must include these assumptions if proving the assertion is to succeed. Below, we describe how Meta-F\(^\star \) finds proper contexts in which to prove the assertions, including control-flow information. Notably, this process is defined over logical formulae and does not depend at all on F\(^\star \)’s WP calculus or VC generator: we believe it should be applicable to any VC generator.

As seen in Sect. 2.1, the basic mechanism by which Meta-F\(^\star \) attaches a tactic to a specific sub-goal is an assertion decorated with a tactic. Our encoding of this expression is built similarly to F\(^\star \)’s existing assertion construct, which is simply sugar for a pure function that essentially introduces a cut in the generated VC. That is, asserting a formula roughly produces a verification condition that requires a proof of the formula at this point, and assumes it in the continuation. For Meta-F\(^\star \), we aim to keep this style while allowing asserted formulae to be decorated with user-provided tactics that are tasked with proving or pre-processing them. We do this in three steps.

After a marked obligation has been split out, it is removed from the original VC. This is done by replacing it with a trivially valid formula, leaving a “skeleton” VC with all remaining facts. The validity of the split-out obligation and of the skeleton implies that of the original VC. F\(^\star \) also recursively descends into both, in case there are more tactic markers in them. Then, tactics are run on the split VCs to break them down (or solve them). All remaining goals, including the skeleton, are sent to the SMT solver.

Note that while the *obligation* to prove the asserted formula, in the split-out VC, is preprocessed by the tactic, the *assumption* of that formula for the continuation of the code, in the skeleton, is left as-is. This is crucial for tactics such as the canonicalizer from Sect. 2.1: if the skeleton contained an assumption for the canonicalized equality, it would not help the SMT solver show the uncanonicalized postcondition.

However, not all nodes carrying a tactic marker are proof obligations. Suppose a hypothesis in the previous VC was itself given with a marker. In this case, one certainly does not want to attempt to prove it, since it is a hypothesis. While it would be *sound* to prove it and replace it by a trivially valid formula, it is useless at best, and usually irreparably affects the system. Consider asserting a tautology whose hypothesis carries a marker: the hypothesis itself need not be provable.

Hence, F\(^\star \) splits such obligations only in strictly-positive positions. On all others, F\(^\star \) simply drops the marker, e.g., by just unfolding its definition. For regular uses of the tactic-decorated assertion construct, however, all occurrences are strictly-positive. It is only when (expert) users use the marker directly that the above discussion might become relevant.
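The splitting procedure can be sketched over a toy formula language in Python. Everything here is an assumption made for illustration: the node classes, the `Marked` node standing in for a tactic marker, and the functions `split` and `drop_markers` do not correspond to F\(^\star \)'s actual implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Atom:
    name: str


@dataclass(frozen=True)
class Top:
    """The trivially valid formula left behind in the skeleton."""


@dataclass(frozen=True)
class And:
    left: object
    right: object


@dataclass(frozen=True)
class Implies:
    hyp: object
    concl: object


@dataclass(frozen=True)
class Forall:
    var: str
    body: object


@dataclass(frozen=True)
class Marked:
    """A formula decorated with the name of a tactic (hypothetical marker)."""
    tactic: str
    body: object


def drop_markers(f):
    """Erase markers, as done in positions that are not strictly positive."""
    if isinstance(f, Marked):
        return drop_markers(f.body)
    if isinstance(f, And):
        return And(drop_markers(f.left), drop_markers(f.right))
    if isinstance(f, Implies):
        return Implies(drop_markers(f.hyp), drop_markers(f.concl))
    if isinstance(f, Forall):
        return Forall(f.var, drop_markers(f.body))
    return f


def split(f, ctx=()):
    """Return (skeleton, obligations). Each obligation is a triple
    (context, tactic, formula); the context records the binders and
    hypotheses crossed on the way to the marked formula."""
    if isinstance(f, Marked):
        body, inner = split(f.body, ctx)  # descend for nested markers
        return Top(), inner + [(ctx, f.tactic, body)]
    if isinstance(f, And):
        left, obls_left = split(f.left, ctx)
        right, obls_right = split(f.right, ctx)
        return And(left, right), obls_left + obls_right
    if isinstance(f, Implies):
        hyp = drop_markers(f.hyp)  # hypotheses are never split out
        concl, obls = split(f.concl, ctx + (hyp,))
        return Implies(hyp, concl), obls
    if isinstance(f, Forall):
        body, obls = split(f.body, ctx + (f.var,))
        return Forall(f.var, body), obls
    return f, []
```

In this model, a marker under a quantifier and on the right of an implication is extracted together with its binders and hypotheses, whereas a marker inside a hypothesis is simply erased.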

Formally, the soundness of this whole approach is given by the following metatheorem, which justifies the splitting out of sub-assertions, and by the correctness of evolution detailed in Sect. 4.1. The proof of Theorem 1 is straightforward, and included in the appendix. We expect an analogous property to hold in other verifiers as well (in particular, it holds for first-order logic).

### Theorem 1

Let *E* be a context with \(\varGamma \vdash E : prop \Rightarrow prop\), and \(\phi \) a squashed proposition such that \(\varGamma \vdash \phi : prop\). Then the validity of the skeleton, together with the validity of \(\phi \) under the binders that *E* introduces, implies the validity of *E* applied to \(\phi \). If *E* is strictly-positive, then the reverse implication holds as well.
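Read together with the splitting procedure of this section, the statement can plausibly be written as an inference rule. This is a sketch: writing \(\top \) for the trivially valid proposition, \(E[\cdot ]\) for filling the context, and \(\gamma (E)\) for the binders that \(E\) introduces are notational assumptions, not taken from the text.

```latex
\[
\frac{\varGamma \vDash E[\top] \qquad \varGamma, \gamma(E) \vDash \phi}
     {\varGamma \vDash E[\phi]}
\]
```

The first premise corresponds to the skeleton VC and the second to the split-out obligation, proven in the context collected while descending into \(E\).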

## 5 Executing Metaprograms Efficiently

F\(^\star \) provides three complementary mechanisms for running metaprograms. The first two, F\(^\star \)’s call-by-name (CBN) interpreter and a (newly implemented) call-by-value (CBV) NbE-based evaluator, support strong reduction—henceforth we refer to these as “normalizers”. In addition, we design and implement a new *native plugin* mechanism that allows both normalizers to interface with Meta-F\(^\star \) programs extracted to OCaml, reusing F\(^\star \)’s existing extraction pipeline for this purpose. Below we provide a brief overview of the three mechanisms.

### 5.1 CBN and CBV Strong Reductions

As described in Sect. 3.1, metaprograms, once reified, are simply F\(^\star \) terms. As such, they can be reduced using F\(^\star \)’s existing computation machinery, a CBN interpreter for strong reductions based on the Krivine abstract machine (KAM) [24, 46]. Although complete and highly configurable, F\(^\star \)’s KAM interpreter is slow, designed primarily for converting types during dependent type-checking and higher-order unification.

Shifting focus to long-running metaprograms, such as tactics for proofs by reflection, we implemented an NbE-based strong-reduction evaluator for F\(^\star \) computations. The evaluator is implemented in F\(^\star \) and extracted to OCaml (as is the rest of F\(^\star \)), thereby inheriting CBV from OCaml. It is similar to Boespflug et al.’s [16] NbE-based strong-reduction for Coq, although we do not implement their low-level, OCaml-specific tag-elimination optimizations—nevertheless, it is already vastly more efficient than the KAM-based interpreter.

### 5.2 Native Plugins and Multi-language Interoperability

Since Meta-F\(^\star \) programs are just F\(^\star \) programs, they can also be extracted to OCaml and natively compiled. Further, they can be dynamically linked into F\(^\star \) as “plugins”. Plugins can be directly called from the type-checker, as is done for the primitives, which is much more efficient than interpreting them. However, compilation has a cost, and it is not convenient to compile every single invocation. Instead, Meta-F\(^\star \) enables users to choose which metaprograms are to be plugins (presumably those expected to be computation-intensive). Users can choose their native plugins, while still quickly scripting their higher-level logic in the interpreter.

This requires (for higher-order metaprograms) a form of multi-language interoperability, converting between representations of terms used in the normalizers and in native code. We designed a small multi-language calculus, with ML-style polymorphism, to model the interaction between normalizers and plugins and conversions between terms. See the appendix for details.

Beyond the notable efficiency gains of running compiled code vs. interpreting it, native metaprograms also require fewer embeddings. Once compiled, metaprograms work over the internal, *concrete* types instead of over their F\(^\star \) representations (though still treating them abstractly). Hence, compiled metaprograms can call primitives without needing to embed their arguments or unembed their results. Further, they can call each other directly as well. Indeed, there is little operational difference between a primitive and a compiled metaprogram used as a plugin.

Native plugins, however, are not a replacement for the normalizers, for several reasons. First, the overhead in compilation might not be justified by the execution speed-up. Second, extraction to OCaml erases types and proofs. As a result, the F\(^\star \) *interface* of the native plugins can only contain types that can also be expressed in OCaml, thereby excluding full dependent types; internally, however, they can be dependently typed. Third, being OCaml programs, native plugins do not support reducing open terms, which is often required. However, when the programs treat their open arguments parametrically, relying on parametric polymorphism, the normalizers can pass such arguments *as-is*, thereby recovering open reductions in some cases. This allows us to use native data structure implementations, which is much faster than using the normalizers, even for open terms. See the appendix for details.

## 6 Experimental Evaluation

We now present an experimental evaluation of Meta-F\(^\star \). First, we provide benchmarks comparing our reflective canonicalizer from Sect. 2.1 to calling the SMT solver directly without any canonicalization. Then, we return to the parsers and serializers from Sect. 2.3 and show how, for VCs that arise, a domain-specific tactic is much more tractable than an SMT-only proof.

### 6.1 A Reflective Tactic for Partial Canonicalization

In the table below, the smt*i*x rows represent asking the solver to prove the lemma without any help from tactics, where *i* represents the resource limit (rlimit) multiplier given to the solver. This rlimit is memory-allocation based and independent of the particular system or current load. For the interp and native rows, the canonicalization tactic is used, running it using F\(^\star \)’s KAM normalizer and as a native plugin respectively, both with an rlimit of 1. For each setup, we display the success rate of verification, the average (CPU) time taken for the SMT queries (not counting the time for parsing/processing the theory) with its standard deviation, and the average total time (its standard deviation coincides with that of the queries). When applicable, the time for tactic execution (which is independent of the seed) is displayed. The smt rows show very poor success rates: even when upping the rlimit to a whopping 100x, over three quarters of the attempts fail. Note how the (relative) standard deviation increases with the rlimit: this is due to successful runs taking rather random times, and failing ones exhausting their resources in similar times. The setups using the tactic show a clear increase in robustness: canonicalizing the assertion causes this proof to always succeed, even at the default rlimit. We recall that the tactic variants still leave goals for SMT solving, namely, the skeleton for the original VC and the canonicalized equality left by the tactic, easily dischargeable by the SMT solver through much more well-behaved linear reasoning. The Tactic column shows that native compilation speeds up this tactic’s execution by about 5x.

Setup | Rate | Queries | Tactic | Total |
---|---|---|---|---|
smt1x | 0.5% | 0.216 ± 0.001 | – | 2.937 |
smt2x | 2% | 0.265 ± 0.003 | – | 2.958 |
smt3x | 4% | 0.304 ± 0.004 | – | 3.022 |
smt6x | 10% | 0.401 ± 0.008 | – | 3.155 |
smt12x | 12.5% | 0.596 ± 0.031 | – | 3.321 |
smt25x | 16.5% | 1.063 ± 0.079 | – | 3.790 |
smt50x | 22% | 2.319 ± 0.230 | – | 5.030 |
smt100x | 24% | 5.831 ± 0.776 | – | 8.550 |
interp | 100% | 0.141 ± 0.001 | 1.156 | 4.003 |
native | 100% | 0.139 ± 0.001 | 0.212 | 3.071 |
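Two of the claims made about this table can be checked with a few lines of arithmetic. The snippet below only re-uses figures from the table; the variable names are ours.

```python
# Figures copied from the table above (times in the table's own units).
tactic_time = {"interp": 1.156, "native": 0.212}
success_rate_percent = {"smt100x": 24.0, "interp": 100.0, "native": 100.0}

# Speed-up of the natively compiled tactic over the interpreted one.
tactic_speedup = tactic_time["interp"] / tactic_time["native"]

# Share of failed attempts at the largest resource limit.
smt100x_failure_percent = 100.0 - success_rate_percent["smt100x"]

print(f"native tactic speed-up: {tactic_speedup:.2f}x")        # 5.45x
print(f"smt100x failure rate: {smt100x_failure_percent:.0f}%")  # 76%
```

The ratio of roughly 5.45 matches the "about 5x" speed-up, and a 76% failure rate is indeed over three quarters of the attempts.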

### 6.2 Combining SMT and Tactics for the Parser Generator

In Sect. 2.3, we presented a library of combinators and a metaprogramming approach to automate the construction of verified, mutually inverse, low-level parsers and serializers from type descriptions. Beyond generating the code, tactics are used to process and discharge proof obligations that arise when using the combinators.

Size | SMT only | Tactic only | Hybrid |
---|---|---|---|
4 | 178 | 17.3 | 6.6 |
7 | 468 | 38.3 | 9.8 |
10 | 690 | 63.0 | 19.4 |

The table above shows the total time in seconds for verifying metaprogrammed low-level parsers and serializers for enumerations of different sizes. In short, the hybrid approach scales the best; the tactic-only approach is somewhat slower; while the SMT-only approach scales poorly and is an order of magnitude slower. Our hybrid approach is very simple. With some more work, a more sophisticated hybrid strategy could be more performant still, relying on tactic-based normalization proofs for fragments of the VC best handled computationally (where the SMT solver spends most of its time), while using SMT only for integer arithmetic, congruence closure etc. However, with Meta-F\(^\star \)’s ability to manipulate proof contexts programmatically, our simple context-pruning tactic provides a big payoff at a small cost.
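The relative costs of the three approaches follow directly from the table. The snippet below only re-uses the reported times; the dictionary layout and the helper `slowdown_vs_hybrid` are ours.

```python
# Verification times in seconds, copied from the table above.
times = {
    4: {"smt": 178.0, "tactic": 17.3, "hybrid": 6.6},
    7: {"smt": 468.0, "tactic": 38.3, "hybrid": 9.8},
    10: {"smt": 690.0, "tactic": 63.0, "hybrid": 19.4},
}


def slowdown_vs_hybrid(size):
    """How many times slower each non-hybrid approach is than the hybrid."""
    row = times[size]
    return {name: row[name] / row["hybrid"] for name in ("smt", "tactic")}


for size in sorted(times):
    ratios = slowdown_vs_hybrid(size)
    print(f"size {size}: SMT-only {ratios['smt']:.1f}x, "
          f"tactic-only {ratios['tactic']:.1f}x slower than hybrid")
```

For every size the SMT-only time is more than ten times the tactic-only time, and the tactic-only time is between roughly 2.6 and 3.9 times the hybrid time.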

## 7 Related Work

Many SMT-based program verifiers [7, 8, 19, 34, 48] rely on user hints, in the form of assertions and lemmas, to complete proofs. This is the predominant style of proving used in tools like Dafny [47], Liquid Haskell [60], Why3 [33], and F\(^\star \) itself [58]. However, there is a growing trend to augment this style of semi-automated proof with interactive proofs. For example, systems like Why3 [33] allow VCs to be discharged using ITPs such as Coq, Isabelle/HOL, and PVS, but this requires an additional embedding of VCs into the logic of the ITP in question. In recent concurrent work, support for *effectful* reflection proofs was added to Why3 [50], and it would be interesting to investigate if this could also be done in Meta-F\(^\star \). Grov and Tumas [39] present Tacny, a tactic framework for Dafny, which is, however, limited in that it only transforms source code, with the program verifier unchanged. In contrast, Meta-F\(^\star \) combines the benefits of an SMT-based program verifier and those of tactic proofs within a single language.

Moving away from SMT-based verifiers, ITPs have long relied on separate languages for proof scripting, starting with Edinburgh LCF [37] and ML, and continuing with HOL, Isabelle and Coq, which are either extensible via ML, or have dedicated tactic languages [3, 29, 56, 62]. Meta-F\(^\star \) builds instead on a recent idea in the space of dependently typed ITPs [22, 30, 42, 63] of reusing the object-language as the meta-language. This idea first appeared in Mtac, a Coq-based tactics framework for Coq [42, 63], and has many generic benefits including reusing the standard library, IDE support, and type checker of the proof assistant. Mtac can additionally check the partial correctness of tactics, which is also sometimes possible in Meta-F\(^\star \) but still rather limited (Sect. 3.4). Meta-F\(^\star \)’s design is instead more closely inspired by the metaprogramming frameworks of Idris [22] and Lean [30], which provide a deep embedding of terms that metaprograms can inspect and construct at will without dependent types getting in the way. However, F\(^\star \)’s effects, its weakest precondition calculus, and its use of SMT solvers distinguish Meta-F\(^\star \) from these other frameworks, presenting both challenges and opportunities, as discussed in this paper.

Some SMT solvers also include tactic engines [27], which allow queries to be processed in custom ways. However, using SMT tactics from a program verifier is not very practical. To do so effectively, users must become familiar not only with the solver’s language and tactic engine, but also with the translation from the program verifier to the solver. Instead, in Meta-F\(^\star \), everything happens within a single language. Also, to our knowledge, these tactics are usually coarse-grained, and we do not expect them to enable developments such as Sect. 2.2. Plus, SMT tactics do not enable metaprogramming.

Finally, ITPs are seeing increasing use of “hammers” such as Sledgehammer [14, 15, 54] in Isabelle/HOL, and similar tools for HOL Light and HOL4 [43], and Mizar [44], to interface with ATPs. This technique is similar to Meta-F\(^\star \), which, given its support for a dependently typed logic, is especially related to a recent hammer for Coq [26]. Unlike these hammers, Meta-F\(^\star \) does not aim to reconstruct SMT proofs, gaining efficiency at the cost of trusting the SMT solver. Further, whereas hammers run in the background, lightening the load on a user otherwise tasked with completing the entire proof, Meta-F\(^\star \) relies more heavily on the SMT solver as an end-game tactic in nearly all proofs.

## 8 Conclusions

A key challenge in program verification is to balance automation and expressiveness. Whereas tactic-based ITPs support highly expressive logics, the tactic author is responsible for all the automation. Conversely, SMT-based program verifiers provide good, scalable automation for comparatively weaker logics, but offer little recourse when verification fails. A design that allows picking the right tool, at the granularity of each verification sub-task, is a worthy area of research. Meta-F\(^\star \) presents a new point in this space: by using hand-written tactics alongside SMT-automation, we have written proofs that were previously impractical in F\(^\star \), and (to the best of our knowledge) in other SMT-based program verifiers.

## Footnotes

- 1. This is F\(^\star \) notation for the type of a computation proving the stated property; the accompanying condition is omitted when it is trivial. In F\(^\star \)’s standard library, math lemmas are proved using SMT with little or no interactions between problematic theory combinations. These lemmas can then be explicitly invoked in larger contexts, and are deleted during extraction.
- 2. This differs from the usual presentation where these three operators are heap *predicates* instead of heaps.
- 3. F\(^\star \) syntax for refinements denotes the type of all values of a given base type satisfying a given formula.
- 4. We use greek letters \(\alpha \), \(\beta \), ... to abbreviate universally quantified type variables.
- 5. We also provide variants of these functions which stay in a locally-nameless representation and are thus pure, total functions.

## Notes

### Acknowledgements

We thank Leonardo de Moura and the Project Everest team for many useful discussions. The work of Guido Martínez, Nick Giannarakis, Monal Narasimhamurthy, and Zoe Paraskevopoulou was done, in part, while interning at Microsoft Research. Clément Pit-Claudel’s work was in part done during an internship at Inria Paris. The work of Danel Ahman, Victor Dumitrescu, and Cătălin Hriţcu is supported by the MSR-Inria Joint Centre and the European Research Council under ERC Starting Grant SECOMP (1-715753).

## References

- 1. Ahman, D., et al.: Dijkstra monads for free. In: POPL (2017). https://doi.org/10.1145/3009837.3009878
- 2. Ahman, D., Fournet, C., Hriţcu, C., Maillard, K., Rastogi, A., Swamy, N.: Recalling a witness: foundations and applications of monotonic state. PACMPL **2**(POPL), 65:1–65:30 (2018). https://arxiv.org/abs/1707.02466
- 3. Anand, A., Boulier, S., Cohen, C., Sozeau, M., Tabareau, N.: Towards certified meta-programming with typed Template-Coq. In: Avigad, J., Mahboubi, A. (eds.) ITP 2018. LNCS, vol. 10895, pp. 20–39. Springer, Cham (2018). https://doi.org/10.1007/978-3-319-94821-8_2. https://template-coq.github.io/template-coq/
- 4. Appel, A.W.: Tactics for separation logic. Early Draft (2006). https://www.cs.princeton.edu/~appel/papers/septacs.pdf
- 5. Awodey, S., Bauer, A.: Propositions as [Types]. J. Log. Comput. **14**(4), 447–471 (2004). https://doi.org/10.1093/logcom/14.4.447
- 6. Barendregt, H., Geuvers, H.: Proof-assistants using dependent type systems. In: Handbook of Automated Reasoning, pp. 1149–1238. Elsevier Science Publishers B. V., Amsterdam (2001). http://dl.acm.org/citation.cfm?id=778522.778527
- 7. Barnett, M., Chang, B.-Y.E., DeLine, R., Jacobs, B., Leino, K.R.M.: Boogie: a modular reusable verifier for object-oriented programs. In: de Boer, F.S., Bonsangue, M.M., Graf, S., de Roever, W.-P. (eds.) FMCO 2005. LNCS, vol. 4111, pp. 364–387. Springer, Heidelberg (2006). https://doi.org/10.1007/11804192_17
- 8. Barnett, M., et al.: The Spec# programming system: challenges and directions. In: Meyer, B., Woodcock, J. (eds.) VSTTE 2005. LNCS, vol. 4171, pp. 144–152. Springer, Heidelberg (2008). https://doi.org/10.1007/978-3-540-69149-5_16
- 9. Barras, B., Grégoire, B., Mahboubi, A., Théry, L.: Chap. 25: The ring and field tactic families. Coq reference manual. https://coq.inria.fr/refman/ring.html
- 10. Berger, U., Schwichtenberg, H.: An inverse of the evaluation functional for typed lambda-calculus. In: LICS (1991). https://doi.org/10.1109/LICS.1991.151645
- 11. Bernstein, D.J.: The Poly1305-AES message-authentication code. In: Gilbert, H., Handschuh, H. (eds.) FSE 2005. LNCS, vol. 3557, pp. 32–49. Springer, Heidelberg (2005). https://doi.org/10.1007/11502760_3. https://cr.yp.to/mac/poly1305-20050329.pdf
- 12. Besson, F.: Fast reflexive arithmetic tactics the linear case and beyond. In: Altenkirch, T., McBride, C. (eds.) TYPES 2006. LNCS, vol. 4502, pp. 48–62. Springer, Heidelberg (2007). https://doi.org/10.1007/978-3-540-74464-1_4
- 13. Bhargavan, K., et al.: Everest: towards a verified, drop-in replacement of HTTPS. In: SNAPL (2017). http://drops.dagstuhl.de/opus/volltexte/2017/7119/pdf/LIPIcs-SNAPL-2017-1.pdf
- 14. Blanchette, J.C., Popescu, A.: Mechanizing the metatheory of Sledgehammer. In: Fontaine, P., Ringeissen, C., Schmidt, R.A. (eds.) FroCoS 2013. LNCS (LNAI), vol. 8152, pp. 245–260. Springer, Heidelberg (2013). https://doi.org/10.1007/978-3-642-40885-4_17
- 15. Blanchette, J.C., Böhme, S., Paulson, L.C.: Extending Sledgehammer with SMT solvers. JAR **51**(1), 109–128 (2013). https://doi.org/10.1007/s10817-013-9278-5
- 16. Boespflug, M., Dénès, M., Grégoire, B.: Full reduction at full throttle. In: Jouannaud, J.-P., Shao, Z. (eds.) CPP 2011. LNCS, vol. 7086, pp. 362–377. Springer, Heidelberg (2011). https://doi.org/10.1007/978-3-642-25379-9_26
- 17. Böhme, S., Weber, T.: Fast LCF-style proof reconstruction for Z3. In: Kaufmann, M., Paulson, L.C. (eds.) ITP 2010. LNCS, vol. 6172, pp. 179–194. Springer, Heidelberg (2010). https://doi.org/10.1007/978-3-642-14052-5_14
- 18. Bond, B., et al.: Vale: verifying high-performance cryptographic assembly code. In: USENIX Security (2017). https://www.usenix.org/conference/usenixsecurity17/technical-sessions/presentation/bond
- 19. Burdy, L., et al.: An overview of JML tools and applications. STTT **7**(3), 212–232 (2005). https://doi.org/10.1007/s10009-004-0167-4
- 20. Chaieb, A., Nipkow, T.: Proof synthesis and reflection for linear arithmetic. J. Autom. Reason. **41**(1), 33–59 (2008). https://doi.org/10.1007/s10817-008-9101-x
- 21. Charguéraud, A.: The locally nameless representation. J. Autom. Reason. **49**(3), 363–408 (2012). https://doi.org/10.1007/s10817-011-9225-2
- 22. Christiansen, D.R., Brady, E.: Elaborator reflection: extending Idris in Idris. In: ICFP (2016). https://doi.org/10.1145/2951913.2951932
- 23. Cohen, E., Moskal, M., Schulte, W., Tobies, S.: Local verification of global invariants in concurrent programs. In: Touili, T., Cook, B., Jackson, P. (eds.) CAV 2010. LNCS, vol. 6174, pp. 480–494. Springer, Heidelberg (2010). https://doi.org/10.1007/978-3-642-14295-6_42
- 24. Crégut, P.: Strongly reducing variants of the Krivine abstract machine. HOSC **20**(3), 209–230 (2007). https://doi.org/10.1007/s10990-007-9015-z
- 25. Cuoq, P., Kirchner, F., Kosmatov, N., Prevosto, V., Signoles, J., Yakobowski, B.: Frama-C: a software analysis perspective. In: Eleftherakis, G., Hinchey, M., Holcombe, M. (eds.) SEFM 2012. LNCS, vol. 7504, pp. 233–247. Springer, Heidelberg (2012). https://doi.org/10.1007/978-3-642-33826-7_16
- 26. Czajka, Ł., Kaliszyk, C.: Hammer for Coq: automation for dependent type theory. JAR **61**(1–4), 423–453 (2018). https://doi.org/10.1007/s10817-018-9458-4
- 27. de Moura, L., Passmore, G.O.: The strategy challenge in SMT solving. In: Bonacina, M.P., Stickel, M.E. (eds.) Automated Reasoning and Mathematics. LNCS (LNAI), vol. 7788, pp. 15–44. Springer, Heidelberg (2013). https://doi.org/10.1007/978-3-642-36675-8_2. http://dl.acm.org/citation.cfm?id=2554473.2554475
- 28. de Moura, L., Bjørner, N.: Z3: an efficient SMT solver. In: Ramakrishnan, C.R., Rehof, J. (eds.) TACAS 2008. LNCS, vol. 4963, pp. 337–340. Springer, Heidelberg (2008). https://doi.org/10.1007/978-3-540-78800-3_24
- 29. Delahaye, D.: A tactic language for the system Coq. In: Parigot, M., Voronkov, A. (eds.) LPAR 2000. LNAI, vol. 1955, pp. 85–95. Springer, Heidelberg (2000). https://doi.org/10.1007/3-540-44404-1_7
- 30. Ebner, G., Ullrich, S., Roesch, J., Avigad, J., de Moura, L.: A metaprogramming framework for formal verification. PACMPL **1**(ICFP), 34:1–34:29 (2017). https://doi.org/10.1145/3110278
- 31. Ekici, B., et al.: SMTCoq: a plug-in for integrating SMT solvers into Coq. In: Majumdar, R., Kunčak, V. (eds.) CAV 2017, Part II. LNCS, vol. 10427, pp. 126–133. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-63390-9_7
- 32. Erbsen, A., Philipoom, J., Gross, J., Sloan, R., Chlipala, A.: Simple high-level code for cryptographic arithmetic - with proofs, without compromises. In: IEEE S&P (2019). https://doi.org/10.1109/SP.2019.00005
- 33. Filliâtre, J.-C., Paskevich, A.: Why3 — where programs meet provers. In: Felleisen, M., Gardner, P. (eds.) ESOP 2013. LNCS, vol. 7792, pp. 125–128. Springer, Heidelberg (2013). https://doi.org/10.1007/978-3-642-37036-6_8. https://hal.inria.fr/hal-00789533/document
- 34. Flanagan, C., Leino, K.R.M., Lillibridge, M., Nelson, G., Saxe, J.B., Stata, R.: PLDI 2002: extended static checking for Java. SIGPLAN Not. **48**(4S), 22–33 (2013). https://doi.org/10.1145/2502508.2502520
- 35. Fromherz, A., Giannarakis, N., Hawblitzel, C., Parno, B., Rastogi, A., Swamy, N.: A verified, efficient embedding of a verifiable assembly language. PACMPL (POPL) (2019). https://github.com/project-everest/project-everest.github.io/raw/master/assets/vale-popl.pdf
- 36. Gonthier, G.: Formal proof—the four-color theorem. Not. AMS **55**(11), 1382–1393 (2008). https://www.ams.org/notices/200811/tx081101382p.pdf
- 37. Gordon, M.J., Milner, A.J., Wadsworth, C.P.: Edinburgh LCF: A Mechanised Logic of Computation. LNCS, vol. 78. Springer, Heidelberg (1979). https://doi.org/10.1007/3-540-09724-4
- 38. Grégoire, B., Mahboubi, A.: Proving equalities in a commutative ring done right in Coq. In: Hurd, J., Melham, T. (eds.) TPHOLs 2005. LNCS, vol. 3603, pp. 98–113. Springer, Heidelberg (2005). https://doi.org/10.1007/11541868_7
- 39. Grov, G., Tumas, V.: Tactics for the Dafny program verifier. In: Chechik, M., Raskin, J.-F. (eds.) TACAS 2016. LNCS, vol. 9636, pp. 36–53. Springer, Heidelberg (2016). https://doi.org/10.1007/978-3-662-49674-9_3
- 40. Hawblitzel, C., et al.: Ironclad apps: end-to-end security via automated full-system verification. In: OSDI (2014). https://www.usenix.org/conference/osdi14/technical-sessions/presentation/hawblitzel
- 41. Hawblitzel, C., et al.: Ironfleet: proving safety and liveness of practical distributed systems. CACM **60**(7), 83–92 (2017). https://doi.org/10.1145/3068608
- 42. Kaiser, J., Ziliani, B., Krebbers, R., Régis-Gianas, Y., Dreyer, D.: Mtac2: typed tactics for backward reasoning in Coq. PACMPL **2**(ICFP), 78:1–78:31 (2018). https://doi.org/10.1145/3236773
- 43. Kaliszyk, C., Urban, J.: Learning-assisted automated reasoning with Flyspeck. JAR **53**(2), 173–213 (2014). https://doi.org/10.1007/s10817-014-9303-3
- 44. Kaliszyk, C., Urban, J.: MizAR 40 for Mizar 40. JAR **55**(3), 245–256 (2015). https://doi.org/10.1007/s10817-015-9330-8
- 45. Krebbers, R., Timany, A., Birkedal, L.: Interactive proofs in higher-order concurrent separation logic. In: POPL (2017). http://dl.acm.org/citation.cfm?id=3009855
- 46. Krivine, J.-L.: A call-by-name lambda-calculus machine. Higher Order Symbol. Comput. **20**(3), 199–207 (2007). https://doi.org/10.1007/s10990-007-9018-9
- 47. Leino, K.R.M.: Dafny: an automatic program verifier for functional correctness. In: Clarke, E.M., Voronkov, A. (eds.) LPAR 2010. LNCS (LNAI), vol. 6355, pp. 348–370. Springer, Heidelberg (2010). https://doi.org/10.1007/978-3-642-17511-4_20. http://dl.acm.org/citation.cfm?id=1939141.1939161
- 48. Rustan, K., Leino, M., Nelson, G.: An extended static checker for modula-3. In: Koskimies, K. (ed.) CC 1998. LNCS, vol. 1383, pp. 302–305. Springer, Heidelberg (1998). https://doi.org/10.1007/BFb0026441
- 49. McCreight, A.: Practical tactics for separation logic. In: Berghofer, S., Nipkow, T., Urban, C., Wenzel, M. (eds.) TPHOLs 2009. LNCS, vol. 5674, pp. 343–358. Springer, Heidelberg (2009). https://doi.org/10.1007/978-3-642-03359-9_24
- 50. Melquiond, G., Rieu-Helft, R.: A Why3 framework for reflection proofs and its application to GMP’s algorithms. In: Galmiche, D., Schulz, S., Sebastiani, R. (eds.) IJCAR 2018. LNCS (LNAI), vol. 10900, pp. 178–193. Springer, Cham (2018). https://doi.org/10.1007/978-3-319-94205-6_13
- 51. Nanevski, A., Morrisett, J.G., Birkedal, L.: Hoare type theory, polymorphism and separation. JFP **18**(5–6), 865–911 (2008). http://ynot.cs.harvard.edu/papers/jfpsep07.pdf
- 52. Nanevski, A., Vafeiadis, V., Berdine, J.: Structuring the verification of heap-manipulating programs. In: POPL (2010). https://doi.org/10.1145/1706299.1706331
- 53. Nogin, A.: Quotient types: a modular approach. In: Carreño, V.A., Muñoz, C.A., Tahar, S. (eds.) TPHOLs 2002. LNCS, vol. 2410, pp. 263–280. Springer, Heidelberg (2002). https://doi.org/10.1007/3-540-45685-6_18
- 54. Paulson, L.C., Blanchette, J.C.: Three years of experience with Sledgehammer, a practical link between automatic and interactive theorem provers. In: IWIL (2010). https://www21.in.tum.de/~blanchet/iwil2010-sledgehammer.pdf
- 55. Protzenko, J., et al.: Verified low-level programming embedded in F*. PACMPL **1**(ICFP), 17:1–17:29 (2017). https://doi.org/10.1145/3110261
- 56. Stampoulis, A., Shao, Z.: VeriML: typed computation of logical terms inside a language with effects. In: ICFP (2010). https://doi.org/10.1145/1863543.1863591
- 57. Swamy, N., Weinberger, J., Schlesinger, C., Chen, J., Livshits, B.: Verifying higher-order programs with the Dijkstra monad. In: PLDI (2013). https://www.microsoft.com/en-us/research/publication/verifying-higher-order-programs-with-the-dijkstra-monad/
- 58. Swamy, N., et al.: Dependent types and multi-monadic effects in F*. In: POPL (2016). https://www.fstar-lang.org/papers/mumon/
- 59. Vazou, N., Seidel, E.L., Jhala, R., Vytiniotis, D., Peyton Jones, S.L.: Refinement types for Haskell. In: ICFP (2014). https://goto.ucsd.edu/~nvazou/refinement_types_for_haskell.pdf
- 60. Vazou, N., et al.: Refinement reflection: complete verification with SMT. PACMPL **2**(POPL), 53:1–53:31 (2018). https://doi.org/10.1145/3158141
- 61. Wadler, P.: Views: a way for pattern matching to cohabit with data abstraction. In: POPL (1987). https://dl.acm.org/citation.cfm?doid=41625.41653
- 62. Wenzel, M.: The Isabelle/Isar reference manual (2017). http://isabelle.in.tum.de/doc/isar-ref.pdf
- 63. Ziliani, B., Dreyer, D., Krishnaswami, N.R., Nanevski, A., Vafeiadis, V.: Mtac: a monad for typed tactic programming in Coq. JFP **25** (2015). https://doi.org/10.1017/S0956796815000118
- 64. Zinzindohoué, J.-K., Bhargavan, K., Protzenko, J., Beurdouche, B.: HACL*: a verified modern cryptographic library. In: CCS (2017). http://eprint.iacr.org/2017/536

## Copyright information

**Open Access** This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.