1 Introduction

The Lean projectFootnote 1 started in 2013 [9] as an interactive theorem prover based on the Calculus of Inductive Constructions [4] (CIC). In 2017, using Lean 3, a community of users with very different backgrounds started the Lean mathematical library project mathlib [13]. At the time of this writing, mathlib has roughly half a million lines of code, and contains many nontrivial mathematical objects such as schemes [2]. Mathlib is also the foundation for the Perfectoid Spaces in Lean project [1], and the Liquid Tensor challenge [11] posed by the renowned mathematician Peter Scholze. Mathlib contains not only mathematical objects but also Lean metaprograms that extend the system [5]. Some of these metaprograms implement nontrivial proof automation, such as a ring theory solver and a decision procedure for Presburger arithmetic. Lean metaprograms in mathlib also extend the system by adding new top-level commands and features not related to proof automation. For example, mathlib contains a package of semantic linters that alert users to many commonly made mistakes [5]. Lean 3 metaprograms have also been instrumental in building standalone applications, such as a SQL query equivalence checker [3].

We believe the Lean 3 theorem prover’s success is primarily due to its extensibility and its metaprogramming framework [6]. However, users cannot modify many parts of the system without changing Lean 3 source code written in C++. Another issue is that many proof automation metaprograms are not competitive with similar proof automation implemented in programming languages with an efficient compiler, such as C++ or OCaml. The primary source of inefficiency in Lean 3 metaprograms is the virtual machine interpretation overhead.

Lean 4 is a reimplementation of the Lean theorem prover in Lean itselfFootnote 2. It is an extensible theorem prover and an efficient programming language. The new compiler produces C code, and users can now implement efficient proof automation in Lean, compile it into efficient C code, and load it as a plugin. In Lean 4, users can access all internal data structures used to implement Lean by merely importing the Lean package. Lean 4 is also a platform for developing efficient domain-specific automation. It has a more robust and extensible elaborator, and addresses many other shortcomings of Lean 3. We expect the Lean community to extend and add new features without having to change the Lean source code. We released Lean 4 at the beginning of 2021; it is open source, the community is already porting mathlib, and the number of applications is quickly growing. These applications include a translation verifier for ReoptFootnote 3, a package for supporting inductive-inductive typesFootnote 4, and a car controllerFootnote 5.

2 Lean by Example

In this section, we introduce the Lean language using a series of examples. The source code for the examples is available at https://github.com/leanprover/lean4/blob/cade2021/doc/BoolExpr.lean. For additional details and installation instructions, we recommend the reader consult the online manualFootnote 6.

We define functions by using the keyword def followed by the function's name, a parameter list, a return type, and a body. The parameter list consists of successive parameters that are separated by spaces. We can specify an explicit type for each parameter; if we do not, the elaborator tries to infer it from the function's body. The following Boolean function is defined by pattern matching:

figure d
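
Figure d is not reproduced here. As an illustration only, such a pattern-matching definition could look as follows (the name xor and its equations are our own sketch, not necessarily the contents of the figure):

    -- xor is defined by matching on its first argument.
    def xor : Bool → Bool → Bool
      | true,  b => not b
      | false, b => b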

We can use the #check command to inspect the type of a term, and #eval to evaluate it.

figure g
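
Assuming the xor sketch above, the two commands can be used as follows (the outputs shown in comments are illustrative):

    #check xor             -- xor : Bool → Bool → Bool
    #eval  xor true false  -- true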

Lean has a hygienic macro system and comes equipped with many macros for commonly used idioms. For example, we can also define the function using

figure i
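
One such alternative definition, again for the xor sketch and using a pattern-matching function literal (the figure may use a different surface syntax), is:

    -- A pattern-matching fun is sugar for an explicit match expression.
    def xor' : Bool → Bool → Bool := fun
      | true,  b => not b
      | false, b => b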

The notation above is a macro that expands into a match expression. In Lean, a theorem is a definition whose result type is a proposition. As an example, consider the following simple theorem about the definition above

figure k
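
Figure k is not shown; a representative theorem of this kind, stated for the xor sketch above, is:

    -- Both sides reduce to the same value, so rfl suffices.
    theorem xor_true_false : xor true false = true :=
      rfl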

The constant rfl has type {α : Sort u} → {a : α} → a = a; the curly braces indicate that the parameters α and a are implicit and should be inferred by solving typing constraints. In the example above, the inferred values for α and a are determined by the statement of the theorem, and the resulting type is the stated equation. This is a valid proof because both sides of the equation are definitionally equal. In dependent type theory, every term has a computational behavior and supports a notion of reduction. Two terms that reduce to the same value are called definitionally equal. In the following example, we use pattern matching to prove an equation that does not follow by reduction alone.

figure v
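
An illustrative proof in this style, again about the xor sketch rather than the statement in figure v, is:

    -- The equation does not hold by reduction for an arbitrary b,
    -- but it does after matching on b.
    theorem xor_false_right : ∀ (b : Bool), xor b false = b
      | true  => rfl
      | false => rfl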

Note that the left-hand side does not reduce to the right-hand side when the argument is a variable, but after pattern matching, each instance of the equation holds by reduction.

In the following example, we define the recursive datatype BoolExpr for representing Boolean expressions using the inductive command.

figure ae
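
The declaration in figure ae is not reproduced; a plausible sketch (the constructor names below are our own choice) is:

    inductive BoolExpr where
      | val : Bool → BoolExpr                 -- Boolean literal
      | var : String → BoolExpr               -- named variable
      | and : BoolExpr → BoolExpr → BoolExpr
      | or  : BoolExpr → BoolExpr → BoolExpr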

This command generates one constructor for each alternative in the declaration. The Lean kernel also generates an induction principle for the new type BoolExpr. We can write a basic “simplifier” for Boolean expressions as follows

figure ak
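
A sketch of such a bottom-up simplifier over the BoolExpr sketch above, with two where-defined helpers (the names mkAnd and mkOr are illustrative), is:

    def simplify : BoolExpr → BoolExpr
      | BoolExpr.and a b => mkAnd (simplify a) (simplify b)
      | BoolExpr.or a b  => mkOr (simplify a) (simplify b)
      | e                => e
    where
      -- Drop trivially true conjuncts.
      mkAnd (a b : BoolExpr) : BoolExpr :=
        match a, b with
        | BoolExpr.val true, b  => b
        | a, BoolExpr.val true  => a
        | a, b                  => BoolExpr.and a b
      -- Drop trivially false disjuncts.
      mkOr (a b : BoolExpr) : BoolExpr :=
        match a, b with
        | BoolExpr.val false, b => b
        | a, BoolExpr.val false => a
        | a, b                  => BoolExpr.or a b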

The function is a simple bottom-up simplifier. We use the where clause to define two local auxiliary functions for constructing “simplified” conjunction and disjunction expressions, respectively. Lean lifts these definitions to the top level, so they also have global names prefixed by the name of the enclosing function.

Given a context that maps variable names to Boolean values, we define a “denotation” function (or evaluator) for Boolean expressions. We use an association list to represent the context.

figure at
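
A denotation function in this style, again over the BoolExpr sketch (the use of List.lookup and the default value false for unbound variables are our own choices), is:

    def denote (ctx : List (String × Bool)) : BoolExpr → Bool
      | BoolExpr.val b   => b
      | BoolExpr.var x   => (ctx.lookup x).getD false
      | BoolExpr.and a b => denote ctx a && denote ctx b
      | BoolExpr.or a b  => denote ctx a || denote ctx b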

In the example above, the infix Boolean operators are notation for the corresponding library functions, and the remaining constructs are macros or syntax sugar that expand into ordinary function applications.

As in previous versions, we can use tactics for constructing proofs and terms. We use the keyword by to switch into tactic mode. Tactics are user-defined or built-in procedures that construct terms. They are all implemented in Lean itself. The simp tactic implements an extensible simplifier, and is one of the most popular tactics in mathlib. Its implementationFootnote 7 can be extended and modified by Lean users.

figure be

In the example above, we use the induction tactic; its syntax is similar to that of a match expression. The two extra variables bound in the first alternative are the induction hypotheses for the subexpressions of that case. The simp tactic uses any theorem marked with the @[simp] attribute as a rewriting rule. We explicitly provide the induction hypotheses as additional rewriting rules inside square brackets.
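
The statement proved in figure be is not reproduced here. The following self-contained illustration, which is our own example rather than the paper's, shows the induction tactic and simp over the BoolExpr sketch:

    -- mirror swaps the operands of every `and` and `or`.
    def mirror : BoolExpr → BoolExpr
      | BoolExpr.and a b => BoolExpr.and (mirror b) (mirror a)
      | BoolExpr.or a b  => BoolExpr.or (mirror b) (mirror a)
      | e                => e

    theorem mirror_mirror (e : BoolExpr) : mirror (mirror e) = e := by
      induction e with
      | val b => rfl
      | var x => rfl
      | and a b iha ihb => simp [mirror, iha, ihb]
      | or a b iha ihb  => simp [mirror, iha, ihb]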

Typeclass Resolution. Typeclasses [16] provide an elegant and effective way of managing ad-hoc polymorphism in both programming languages and interactive proof assistants. A typeclass declaration fixes a set of operations over a type; we can then declare particular elements of a typeclass to be instances. These provide hints to the elaborator: any time the elaborator is looking for an element of a typeclass, it can consult a table of declared instances to find a suitable element. What makes typeclass inference powerful is that one can chain instances, that is, an instance declaration can in turn depend on other instances. This causes class inference to recurse through instances, backtracking when necessary. The Lean typeclass resolution procedure can be viewed as a simple λ-Prolog interpreter [8], where the Horn clauses are the user-declared instances.

For example, the standard library defines the Inhabited typeclass to enable typeclass inference to infer a “default” or “arbitrary” element of types that contain at least one element.

figure br

The square-bracket annotation in the declaration above indicates that the corresponding implicit parameter should be synthesized from instance declarations using typeclass resolution. We can define an instance for the BoolExpr type defined earlier as follows

figure bv

This instance specifies the “default” element for the BoolExpr type. The following declaration shows that if two types α and β are inhabited, then so is their product:

figure by
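
To make this concrete, here is a sketch of the two instances just described (the choice of default BoolExpr value is ours):

    -- A default element for the BoolExpr sketch above.
    instance : Inhabited BoolExpr where
      default := BoolExpr.val false

    -- Chaining instances: a product of two inhabited types is inhabited.
    instance [Inhabited α] [Inhabited β] : Inhabited (α × β) where
      default := (default, default)

    #eval (default : Bool × Nat)  -- (false, 0)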

The standard library has many builtin classes. For example, the class BEq is similar to Haskell's Eq typeclass, and DecidableEq is a typeclass for types that have decidable equality. Lean 4 also provides code synthesizers for many builtin classes. The deriving command instructs Lean to auto-generate an instance.

figure cf

In the example above, the command generates the instance

figure ch

The decide function evaluates decidable propositions. Thus, the last command above computes that the two compared expressions are not equal.
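
For instance, assuming the BoolExpr sketch above, one might write (illustrative):

    deriving instance DecidableEq for BoolExpr

    -- decide evaluates the decidable equality to a Boolean.
    #eval decide (BoolExpr.val true = BoolExpr.val false)  -- false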

The increasingly sophisticated uses of typeclasses in mathlib have exposed a few limitations in Lean 3: unnecessary overhead due to the lack of term indexing techniques, and exponential running times in the presence of diamonds. Lean 4 implements a new procedure [12], tabled typeclass resolution, that solves these problems by using discrimination treesFootnote 8 for better indexing, and tabling, a generalization of memoization originally introduced to address similar limitations of early logic programming systemsFootnote 9.

The hygienic macro system. In interactive theorem provers (ITPs), Lean included, extensible syntax is not only crucial to lower the cognitive burden of manipulating complex mathematical objects, but also plays a critical role in developing reusable abstractions in libraries. Lean 3 supports such extensions in the form of restrictive “syntax sugar” substitutions and other ad hoc mechanisms, which are too rudimentary to support many desirable abstractions. As a result, libraries are littered with unnecessary redundancy. The Lean 3 tactic language is plagued by a seemingly unrelated issue: accidental name capture, which often produces unexpected and counterintuitive behavior. Lean 4 takes ideas from the Scheme family of programming languages and solves these two problems simultaneously by using a hygienic, i.e. capture-avoiding, macro system custom-built for ITPs [15].

Lean 3’s “mixfix” notation system is still supported in Lean 4, but it is now based on the much more general macro system; in fact, the Lean 3 notation keyword itself has been reimplemented as a macro, more specifically as a macro-generating macro. By providing such a tower of abstractions for writing syntax sugar, of which we will see more levels below, we want to enable users to work in the simplest model appropriate for their respective use case while always keeping open the option to switch to a lower, more expressive level.

As an example, we define an infix operator, with precedence 50, for the function defined earlier.

figure cp
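
The declaration in figure cp is roughly of the following shape (the operator symbol below is our own choice):

    infix:50 " XOR " => xor

    #eval true XOR false  -- true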

The command expands to

figure cr

which itself expands to the macro declaration

figure cs

where the syntactic category (term) of the placeholders and of the entire macro is now specified explicitly, implying that macros can also be written for/using other categories such as the top-level command category. The right-hand side uses an explicit syntax quasiquotation to construct the syntax tree, with syntax placeholders (antiquotations) prefixed with $. As suggested by the explicit use of quotations, the right-hand side may now be an arbitrary Lean term computing a syntax object, allowing for procedural macros as well.
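
As an illustration of this level, a macro declaration of the kind just described, written here for a fresh operator so that the sketch stays self-contained, is:

    macro:50 a:term " LXOR " b:term : term => `(xor $a $b)

    #eval true LXOR false  -- true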

The macro command is itself another command-level macro that, for our example, expands to two commands

figure cy

that is, a pair of a parser extension and a syntax transformer. By separating these two steps at this abstraction level, it becomes possible to define (mutually) recursive macros and to reuse syntax between macros. Using macro_rules, users can even extend existing macros with new rules. In general, separating parsing and expansion means that we can obtain a well-structured syntax tree pre-expansion, i.e. a concrete syntax tree, and use it to implement source code tooling such as auto-completion, go-to-definition, and refactorings.
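
For illustration, such a pair of commands for yet another fresh operator could be:

    -- Parser extension: declare the new syntax.
    syntax:50 term " NAND " term : term

    -- Syntax transformer: say what the new syntax expands to.
    macro_rules
      | `($a NAND $b) => `(not ($a && $b))

    #eval true NAND false  -- true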

We can use these commands for defining embedded domain-specific languages. In simple cases, we can reuse existing syntactic categories for this but assign them new semantics, such as in the following notation for constructing BoolExpr objects.

figure dc

The command above specifies how to convert a subset of the builtin syntax for terms into constructor applications for BoolExpr. The term on the right-hand side converts the matched identifier into a string literal.
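
An embedded DSL in this spirit, assuming the BoolExpr and denote sketches above (the bracket syntax and the set of rules are our own illustration, not the contents of figure dc), could be:

    syntax "⟦" term "⟧" : term

    macro_rules
      | `(⟦ $x:ident ⟧) => `(BoolExpr.var $(Lean.quote x.getId.toString))
      | `(⟦ $a && $b ⟧) => `(BoolExpr.and ⟦ $a ⟧ ⟦ $b ⟧)
      | `(⟦ $a || $b ⟧) => `(BoolExpr.or ⟦ $a ⟧ ⟦ $b ⟧)

    #eval denote [("x", true)] ⟦ x && x ⟧  -- true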

As a final example, we modify the notation defined above. In the following version, one argument is no longer an arbitrary term, but a comma-separated sequence of name-value entries, and the right-hand side is now interpreted as a term by reusing our macro from above.

figure dl

We use an antiquotation splice to deconstruct the sequence of entries into two arrays containing the variable names and values, respectively, adjust the former array, and combine them again in a second splice.
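
A minimal splice sketch in this spirit (simpler than the paper's example and entirely our own: it deconstructs name-value entries into two arrays and recombines them as a list of pairs, without adjusting the names) is:

    syntax "pairs!" "[" (term " := " term),* "]" : term

    macro_rules
      | `(pairs! [$[$xs:term := $vs:term],*]) => `([$[($xs, $vs)],*])

    #eval pairs! ["x" := true, "y" := false]  -- [("x", true), ("y", false)]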

3 The Code Generator

The Lean 4 code generator produces efficient C code. It is useful for building both efficient Lean extensions and standalone applications. The code generator performs many transformations, and many of them are based on techniques used in the Haskell compiler GHC [7]. However, in contrast to Haskell, Lean is a strict language. We control code inlining and specialization using the @[inline] and @[specialize] attributes. They are crucial for eliminating the overhead introduced by the towers of abstractions used in our source code. Before emitting C code, we erase proof terms and convert Lean expressions into an intermediate representation (IR). The IR is a collection of Lean data structures,Footnote 10 and users can implement support for backends other than C by writing Lean programs that import the compiler's IR package. Lean 4 also comes with an interpreter for the IR, which allows for rapid incremental development and testing right from inside the editor. Whenever the interpreter calls a function for which native, ahead-of-time compiled code is available, it will switch to that instead, which includes all functions from the standard library. Thus the interpretation overhead is negligible as long as, e.g., all expensive tactics are precompiled.
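
An illustrative use of the two attributes mentioned above:

    -- Request that this small definition be inlined at call sites.
    @[inline] def double (x : Nat) : Nat :=
      x + x

    -- Request specialization for the higher-order argument f.
    @[specialize] def sumBy (f : α → Nat) : List α → Nat
      | []      => 0
      | a :: as => f a + sumBy f as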

Functional but in-place. Most functional languages rely on garbage collection for automatic memory management. They usually eschew reference counting in favor of a tracing garbage collector, which has less bookkeeping overhead at runtime. On the other hand, having an exact reference count for each value enables optimizations such as destructive updates [14]. When performing functional updates, objects often die just before an object of the same kind is created. We observe this phenomenon when we insert a new element into a purely functional data structure such as a binary tree, when a theorem prover rewrites formulas, when a compiler applies optimizations by transforming abstract syntax trees, and when we run the simplifier defined earlier. We call it the resurrection hypothesis: many objects die just before an object of the same kind is created. The Lean memory manager uses reference counting and takes advantage of this hypothesis: it enables pure code to perform destructive updates in all scenarios described above when objects are not shared. It also allows a novel programming paradigm that we call functional but in-place (FBIP) [10]. Our preliminary experimental results demonstrate that our new compiler produces competitive code that often outperforms the code generated by high-performance compilers such as ocamlopt and GHC [14]. As an example, consider the function map f as that applies a function f to each element of a list as.

In this example, [] denotes the empty list, and a::as the list with head a followed by the tail as.

figure dt
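
The code in figure dt is presumably close to the following standard definition:

    def map (f : α → β) : List α → List β
      | []      => []
      | a :: as => f a :: map f as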

If the list referenced by as is not shared, the code generated by our compiler does not allocate any memory. Moreover, if as is a non-shared list of lists of integers, then map (map inc) as will not allocate any memory either. In contrast to static linearity systems, allocations are avoided even if only a prefix of the list is not shared. FBIP also allows Lean users to use data structures, such as arrays and hash tables, in pure code without any performance penalty when they are not shared. We believe this is an attractive feature because hash tables are frequently used to implement decision procedures and nontrivial proof automation.

4 The User Interface

Our system implements the Language Server Protocol (LSP) using the task abstraction provided by its standard library. The Lean 4 LSP server is incremental: it continuously analyzes the source text and provides semantic information to editors implementing LSP. Our server implements most LSP features found in advanced IDEs, such as hyperlinks, syntax highlighting, type information, error reporting, and auto-completion. Many editors implement LSP, but VS Code is the editor preferred by the Lean user community. We provide extensions for visualizing the intermediate proof states in interactive tactic blocks, and we want to port the Lean 3 widget library for constructing interactive visualizations of proofs and programs.

5 Conclusion

Lean 4 aims to be a fully extensible interactive theorem prover and functional programming language. It has an expressive logical foundation for writing mathematical specifications, proofs, and formally verified programs. Lean 4 provides many unique new features, including a hygienic macro system, an efficient typeclass resolution procedure based on tabled resolution, an efficient code generator, and abstractions for sealing low-level optimizations. The new elaboration procedure is more general and efficient than those implemented in previous versions. Users may also extend and modify the elaborator using Lean itself. Lean has a relatively small trusted kernel, and its rich API allows users to export their developments to other systems and implement their own reference checkers. Lean is an ongoing and long-term effort, and future plans include integration with external SMT solvers and first-order theorem provers, new compiler backends, and porting the Lean 3 Mathematical Library.