The Lean 4 Theorem Prover and Programming Language (System Description)

Abstract. Lean 4 is a reimplementation of the Lean interactive theorem prover (ITP) in Lean itself. It addresses many shortcomings of the previous versions and contains many new features. Lean 4 is fully extensible: users can modify and extend the parser, elaborator, tactics, decision procedures, pretty printer, and code generator. The new system has a hygienic macro system custom-built for ITPs. It contains a new typeclass resolution procedure based on tabled resolution, addressing significant performance problems reported by the growing user base. Lean 4 is also an efficient functional programming language based on a novel programming paradigm called functional but in-place. Efficient code generation is crucial for Lean users because many write custom proof automation procedures in Lean itself.


Introduction
The Lean project started in 2013 [9] as an interactive theorem prover based on the Calculus of Inductive Constructions [4] (CIC). In 2017, using Lean 3, a community of users with very different backgrounds started the Lean mathematical library project mathlib [13]. At the time of this writing, mathlib has roughly half a million lines of code and contains many nontrivial mathematical objects such as Schemes [2]. Mathlib is also the foundation for the Perfectoid Spaces in Lean project [1] and the Liquid Tensor challenge [11] posed by the renowned mathematician Peter Scholze. Mathlib contains not only mathematical objects but also Lean metaprograms that extend the system [5]. Some of these metaprograms implement nontrivial proof automation, such as a ring theory solver and a decision procedure for Presburger arithmetic. Lean metaprograms in mathlib also extend the system by adding new top-level commands and features not related to proof automation. For example, mathlib contains a package of semantic linters that alert users to many commonly made mistakes [5]. Lean 3 metaprograms have also been instrumental in building standalone applications, such as a SQL query equivalence checker [3].
We believe the Lean 3 theorem prover's success is primarily due to its extensibility capabilities and metaprogramming framework [6]. However, users cannot modify many parts of the system without changing Lean 3 source code written in C++. Another issue is that many proof automation metaprograms are not competitive with similar proof automation implemented in programming languages with an efficient compiler such as C++ and OCaml. The primary source of inefficiency in Lean 3 metaprograms is the virtual machine interpretation overhead.
Lean 4 is a reimplementation of the Lean theorem prover in Lean itself. It is an extensible theorem prover and an efficient programming language. The new compiler produces C code, and users can now implement efficient proof automation in Lean, compile it into efficient C code, and load it as a plugin. In Lean 4, users can access all internal data structures used to implement Lean by merely importing the Lean package. Lean 4 is also a platform for developing efficient domain-specific automation. It has a more robust and extensible elaborator, and addresses many other shortcomings of Lean 3. We expect the Lean community to extend and add new features without having to change the Lean source code. We released Lean 4 at the beginning of 2021. It is open source, the community is already porting mathlib, and the number of applications is quickly growing. These applications include a translation verifier for Reopt, a package for supporting inductive-inductive types, and a car controller.

Lean by Example
In this section, we introduce the Lean language using a series of examples. The source code for the examples is available at https://github.com/leanprover/lean4/blob/cade2021/doc/BoolExpr.lean. For additional details and installation instructions, we recommend the reader consult the online manual.
We define functions by using the def keyword followed by the function's name, a parameter list, return type, and body. The parameter list consists of successive parameters that are separated by spaces. We can specify an explicit type for each parameter. If we omit a type, the elaborator tries to infer it; in particular, it infers an omitted return type from the function body. The Boolean or function is defined by pattern matching.

We can use the command #check <term> to inspect the type of term, and #eval <term> to evaluate it.

#check or true false -- Bool (this is a comment in Lean)
#eval or true false -- true

Lean has a hygienic macro system and comes equipped with many macros for commonly used idioms. For example, we can also define the function or using

def or : Bool → Bool → Bool
  | true, _ => true
  | false, b => b

The notation above is a macro that expands into a match-expression.

In Lean, a theorem is a definition whose result type is a proposition. For example, a simple theorem about the definition above states that or true b = true, and its proof is the constant rfl. The constant rfl has type ∀ {α : Sort u} {a : α}, a = a; the curly braces indicate that the parameters α and a are implicit and should be inferred by solving typing constraints. In this proof, the inferred values for α and a are Bool and or true b, respectively, and the resulting type is or true b = or true b. This is a valid proof because or true b is definitionally equal to true. In dependent type theory, every term has a computational behavior and supports a notion of reduction. In principle, two terms that reduce to the same value are called definitionally equal.

In the following example, we use pattern matching to prove that or b b = b.

theorem or_self : ∀ (b : Bool), or b b = b
  | true => rfl
  | false => rfl

Note that or b b does not reduce to b, but after pattern matching we have that or true true (or false false) reduces to true (false).
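A minimal sketch of the match-based definition and the rfl proof discussed in this passage. The namespace Sketch is hypothetical and only keeps the names from clashing with the library's own or; the exact form of the match-based definition and the theorem name or_true are assumptions consistent with the description above.

```lean
namespace Sketch  -- hypothetical namespace, avoids clashing with the library's `or`

-- The return type is omitted and inferred from the body.
def or (a b : Bool) :=
  match a with
  | true  => true
  | false => b

#check or true false
#eval or true false   -- true

-- `or true b` reduces to `true`, so `rfl` is accepted as a proof.
theorem or_true (b : Bool) : or true b = true :=
  rfl

end Sketch
```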
In the following example, we define the recursive datatype BoolExpr for representing Boolean expressions using the command inductive. The function simplify is a simple bottom-up simplifier. We use the where clause to define two local auxiliary functions mkOr and mkNot for constructing "simplified" or and not expressions respectively. Their global names are simplify.mkOr and simplify.mkNot.
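The declarations described in this paragraph can be sketched as follows. This is a reconstruction under assumptions: the constructor names var, val, or, and not follow the names used elsewhere in the text, while the exact rewriting rules inside mkOr and mkNot are illustrative choices, not necessarily the authors' rules.

```lean
inductive BoolExpr where
  | var (name : String)
  | val (b : Bool)
  | or  (p q : BoolExpr)
  | not (p : BoolExpr)
  deriving Repr

-- Bottom-up simplifier: subterms are simplified first, then combined
-- with the "smart constructors" `mkOr` and `mkNot`.
def simplify : BoolExpr → BoolExpr
  | BoolExpr.or p q => mkOr (simplify p) (simplify q)
  | BoolExpr.not p  => mkNot (simplify p)
  | e               => e
where
  -- Global name: `simplify.mkOr`
  mkOr : BoolExpr → BoolExpr → BoolExpr
    | _, BoolExpr.val true  => BoolExpr.val true
    | p, BoolExpr.val false => p
    | BoolExpr.val true, _  => BoolExpr.val true
    | BoolExpr.val false, q => q
    | p, q                  => BoolExpr.or p q
  -- Global name: `simplify.mkNot`
  mkNot : BoolExpr → BoolExpr
    | BoolExpr.val b => BoolExpr.val (!b)
    | p              => BoolExpr.not p

-- `x ∨ ¬true` simplifies to the variable `x`.
#eval simplify (BoolExpr.or (BoolExpr.var "x") (BoolExpr.not (BoolExpr.val true)))
```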
Given a context that maps variable names to Boolean values, we define a "denotation" function (or evaluator) for Boolean expressions. We use an association list to represent the context. In the example above, p || q is notation for or p q, !p for not p, and if let p := t then a else b is a macro that expands into match t with | p => a | _ => b. The term ctx.find? x is syntax sugar for AssocList.find? ctx x.

As in previous versions, we can use tactics for constructing proofs and terms. We use the keyword by to switch into tactic mode. Tactics are user-defined or built-in procedures that construct various terms. They are all implemented in Lean itself. The simp tactic implements an extensible simplifier, and is one of the most popular tactics in mathlib. Its implementation can be extended and modified by Lean users. In the example above, we use the induction tactic, whose syntax is similar to a match-expression. The variables ih1 and ih2 are the induction hypotheses for p and q in the first alternative, which covers the BoolExpr.or case. The simp tactic uses any theorem marked with the @[simp] attribute as a rewriting rule (e.g., denote_mkOr). We explicitly provide the induction hypotheses as additional rewriting rules inside square brackets.
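A reduced sketch of the evaluator and of a proof by induction with simp. Several parts are assumptions made to keep the example short and self-contained: the context is a plain list of pairs queried with List.lookup instead of an AssocList, and simplifyNot, mkNot, denote_mkNot, and denote_simplifyNot are hypothetical names for a simplifier that only rewrites negations (the or case of the full simplifier needs a longer case analysis).

```lean
inductive BoolExpr where
  | var (name : String)
  | val (b : Bool)
  | or  (p q : BoolExpr)
  | not (p : BoolExpr)

-- Assumption: the context is a list of pairs; unbound variables denote `false`.
def denote (ctx : List (String × Bool)) : BoolExpr → Bool
  | BoolExpr.or p q => denote ctx p || denote ctx q
  | BoolExpr.not p  => !denote ctx p
  | BoolExpr.val b  => b
  | BoolExpr.var x  => (List.lookup x ctx).getD false

def mkNot : BoolExpr → BoolExpr
  | BoolExpr.val b => BoolExpr.val (!b)
  | p              => BoolExpr.not p

-- Hypothetical reduced simplifier: it only rewrites negations.
def simplifyNot : BoolExpr → BoolExpr
  | BoolExpr.or p q => BoolExpr.or (simplifyNot p) (simplifyNot q)
  | BoolExpr.not p  => mkNot (simplifyNot p)
  | e               => e

-- Marked `@[simp]`, so `simp` uses it as a rewriting rule automatically.
@[simp] theorem denote_mkNot (ctx : List (String × Bool)) (p : BoolExpr) :
    denote ctx (mkNot p) = denote ctx (BoolExpr.not p) := by
  cases p <;> rfl

theorem denote_simplifyNot (ctx : List (String × Bool)) (e : BoolExpr) :
    denote ctx (simplifyNot e) = denote ctx e := by
  induction e with
  | or p q ih1 ih2 => simp [simplifyNot, denote, ih1, ih2]
  | not p ih       => simp [simplifyNot, denote, ih]
  | var x          => rfl
  | val b          => rfl
```

The induction hypotheses are passed to simp in square brackets, as in the text; the @[simp] lemma is picked up without being listed.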
Typeclass Resolution. Typeclasses [16] provide an elegant and effective way of managing ad-hoc polymorphism in both programming languages and interactive proof assistants. Once a typeclass has been declared, we can declare particular elements of it to be instances. These provide hints to the elaborator: any time the elaborator is looking for an element of a typeclass, it can consult a table of declared instances to find a suitable element. What makes typeclass inference powerful is that one can chain instances, that is, an instance declaration can in turn depend on other instances. This causes class inference to recurse through instances, backtracking when necessary. The Lean typeclass resolution procedure can be viewed as a simple λ-Prolog interpreter [8], where the Horn clauses are the user declared instances.
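Instance chaining can be illustrated with a small sketch. The class HasDefault is hypothetical (named so that it does not clash with any library class); the point is only that the instance for pairs depends on instances for both components, so resolution recurses through the declared instances.

```lean
-- Hypothetical class: types that come with a designated element.
class HasDefault (α : Type) where
  default : α

instance : HasDefault Nat := ⟨0⟩
instance : HasDefault Bool := ⟨false⟩

-- Chained instance: it can only be used when instances
-- for both component types are found.
instance {α β : Type} [HasDefault α] [HasDefault β] : HasDefault (α × β) :=
  ⟨(HasDefault.default, HasDefault.default)⟩

-- Resolution applies the pair instance twice, then the `Nat` and `Bool` instances.
#eval (HasDefault.default : Nat × (Bool × Nat))
```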
For example, the standard library defines a typeclass Inhabited to enable typeclass inference to infer a "default" or "arbitrary" element of types that contain at least one element. The standard library has many builtin classes such as Repr α and DecidableEq α. The class Repr α is similar to Haskell's Show α typeclass, and DecidableEq α is a typeclass for types that have decidable equality. Lean 4 also provides code synthesizers for many builtin classes. The command deriving instructs Lean to auto-generate an instance. The function decide evaluates decidable propositions. Thus, the last command returns false since BoolExpr.val true is not equal to BoolExpr.val false.
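A short sketch of the classes mentioned above, assuming the BoolExpr datatype has the four constructors used throughout this section. The choice of BoolExpr.val false as the default element is an arbitrary illustration.

```lean
-- `deriving` asks Lean to synthesize the listed instances.
inductive BoolExpr where
  | var (name : String)
  | val (b : Bool)
  | or  (p q : BoolExpr)
  | not (p : BoolExpr)
  deriving Repr, DecidableEq

-- A manually declared instance providing a "default" element.
instance : Inhabited BoolExpr := ⟨BoolExpr.val false⟩

-- `decide` evaluates the decidable proposition using the derived `DecidableEq`.
#eval decide (BoolExpr.val true = BoolExpr.val false)  -- false
```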
The increasingly sophisticated uses of typeclasses in mathlib have exposed a few limitations in Lean 3: unnecessary overhead due to the lack of term indexing techniques, and exponential running times in the presence of diamonds. Lean 4 implements a new procedure [12], tabled typeclass resolution, that solves these problems by using discrimination trees for better indexing and tabling, which is a generalization of memoizing introduced initially to address similar limitations of early logic programming systems.

The hygienic macro system. In interactive theorem provers (ITPs), Lean included, extensible syntax is not only crucial to lower the cognitive burden of manipulating complex mathematical objects, but plays a critical role in developing reusable abstractions in libraries. Lean 3 supports such extensions in the form of restrictive "syntax sugar" substitutions and other ad hoc mechanisms, which are too rudimentary to support many desirable abstractions. As a result, libraries are littered with unnecessary redundancy. The Lean 3 tactic language is plagued by a seemingly unrelated issue: accidental name capture, which often produces unexpected and counterintuitive behavior. Lean 4 takes ideas from the Scheme family of programming languages and solves these two problems simultaneously by use of a hygienic, i.e. capture-avoiding, macro system custom-built for ITPs [15]. Lean 3's "mixfix" notation system is still supported in Lean 4, but based on the much more general macro system; in fact, the Lean 3 notation keyword itself has been reimplemented as a macro, more specifically as a macro-generating macro. By providing such a tower of abstractions for writing syntax sugars, of which we will see more levels below, we want to enable users to work in the simplest model appropriate for their respective use case while always keeping open the option to switch to a lower, more expressive level.
As an example, we define the infix notation Γ ⊢ p, with precedence 50, for the function denote defined earlier.

infix:50 " ⊢ " => denote

The infix command expands to

notation:50 Γ " ⊢ " p:50 => denote Γ p

which itself expands to the macro declaration

macro:50 Γ:term " ⊢ " p:term:50 : term => `(denote $Γ $p)

where the syntactic category (term) of placeholders and of the entire macro is now specified explicitly, implying that macros can also be written for/using other categories such as the top-level command. The right-hand side uses an explicit syntax quasiquotation to construct the syntax tree, with syntax placeholders (antiquotations) prefixed with $. As suggested by the explicit use of quotations, the right-hand side may now be an arbitrary Lean term computing a syntax object, allowing for procedural macros as well.
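Two levels of this tower can be tried side by side in a small sketch. It assumes a denote function over a plain list of pairs (not the AssocList used in the text), and the second symbol ⊨ is hypothetical, chosen only so that the two declarations do not produce overlapping notations.

```lean
inductive BoolExpr where
  | var (name : String)
  | val (b : Bool)
  | or  (p q : BoolExpr)
  | not (p : BoolExpr)

-- Assumption: contexts are lists of pairs; unbound variables denote `false`.
def denote (ctx : List (String × Bool)) : BoolExpr → Bool
  | BoolExpr.or p q => denote ctx p || denote ctx q
  | BoolExpr.not p  => !denote ctx p
  | BoolExpr.val b  => b
  | BoolExpr.var x  => (List.lookup x ctx).getD false

-- Highest level: an infix declaration.
infix:50 " ⊢ " => denote

-- Lower level: a macro with explicit syntactic categories and a syntax quotation.
-- The symbol `⊨` is hypothetical, see the note above.
macro:50 Γ:term " ⊨ " p:term:50 : term => `(denote $Γ $p)

#eval [("x", true)] ⊢ BoolExpr.var "x"                 -- true
#eval [("x", true)] ⊨ BoolExpr.not (BoolExpr.var "x")  -- false
```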
macro itself is another command-level macro that, for our notation example, expands to two commands

syntax:50 term " ⊢ " term:50 : term
macro_rules | `($Γ ⊢ $e) => `(denote $Γ $e)

that is, a pair of parser extension and syntax transformer. By separating these two steps at this abstraction level, it becomes possible to define (mutually) recursive macros and to reuse syntax between macros. Using macro_rules, users can even extend existing macros with new rules. In general, separating parsing and expansion means that we can obtain a well-structured syntax tree pre-expansion, i.e. a concrete syntax tree, and use it to implement source code tooling such as auto-completion, go-to-definition, and refactorings.

We can use the syntax command for defining embedded domain-specific languages. In simple cases, we can reuse existing syntactic categories for this but assign them new semantics, such as in the following notation for constructing BoolExpr objects. The macro_rules command above specifies how to convert a subset of the builtin syntax for terms into constructor applications for BoolExpr. The term $(quote x.getId.toString) converts the identifier x into a string literal.

As a final example, we modify the notation Γ ⊢ p. In the following version, Γ is not an arbitrary term anymore, but a comma-separated sequence of entries of the form var → value, and the right-hand side is now interpreted as a BoolExpr term by reusing our macro from above. We use the antiquotation splice $[$xs:ident → $vs:term],* to deconstruct the sequence of entries into two arrays xs and vs containing the variable names and values, respectively, adjust the former array, and combine them again in a second splice.
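A sketch of an embedded notation for BoolExpr built from a syntax declaration and a separate macro_rules block. The bracket token `[BExpr| is an assumption about the concrete syntax; the rules reuse the builtin term syntax for true, false, identifiers, ∨, and ¬, and the recursive rules expand their subterms with the same notation. The splice-based context notation is not covered here.

```lean
open Lean

inductive BoolExpr where
  | var (name : String)
  | val (b : Bool)
  | or  (p q : BoolExpr)
  | not (p : BoolExpr)
  deriving Repr

-- Parser extension: an ordinary term wrapped in a dedicated bracket.
syntax "`[BExpr|" term "]" : term

-- Syntax transformer: each rule maps one shape of term to a constructor application.
macro_rules
  | `(`[BExpr| true])     => `(BoolExpr.val true)
  | `(`[BExpr| false])    => `(BoolExpr.val false)
  | `(`[BExpr| $x:ident]) => `(BoolExpr.var $(quote x.getId.toString))
  | `(`[BExpr| $p ∨ $q])  => `(BoolExpr.or `[BExpr| $p] `[BExpr| $q])
  | `(`[BExpr| ¬ $p])     => `(BoolExpr.not `[BExpr| $p])

-- The identifiers `p` and `q` become variables named "p" and "q".
#check `[BExpr| ¬ p ∨ q]
```

Because parsing and expansion are separate declarations, further macro_rules blocks could add rules for the same syntax later.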

The Code Generator
The Lean 4 code generator produces efficient C code. It is useful for building both efficient Lean extensions and standalone applications. The code generator performs many transformations, and many of them are based on techniques used in the Haskell compiler GHC [7]. However, in contrast to Haskell, Lean is a strict language. We control code inlining and specialization using the attributes @[inline] and @[specialize]. They are crucial for eliminating the overhead introduced by the towers of abstractions used in our source code. Before emitting C code, we erase proof terms and convert Lean expressions into an intermediate representation (IR). The IR is a collection of Lean data structures, and users can implement support for backends other than C by writing Lean programs that import Lean.Compiler.IR. Lean 4 also comes with an interpreter for the IR, which allows for rapid incremental development and testing right from inside the editor. Whenever the interpreter calls a function for which native, ahead-of-time compiled code is available, it will switch to that instead, which includes all functions from the standard library. Thus the interpretation overhead is negligible as long as e.g. all expensive tactics are precompiled.
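A small sketch of how the two attributes are attached to definitions. The functions applyTwice and iterate are hypothetical; the attributes are hints to the code generator and do not change the meaning of the program.

```lean
-- Hint: inline this small higher-order helper at its call sites.
@[inline] def applyTwice {α : Type} (f : α → α) (x : α) : α :=
  f (f x)

-- Hint: generate specialized copies for the function arguments passed to `iterate`.
@[specialize] def iterate {α : Type} (f : α → α) : Nat → α → α
  | 0,     x => x
  | n + 1, x => iterate f n (f x)

-- Adds 2 three times, starting from 0.
#eval iterate (applyTwice fun (n : Nat) => n + 1) 3 0  -- 6
```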
Functional but in-place. Most functional languages rely on garbage collection for automatic memory management. They usually eschew reference counting in favor of a tracing garbage collector, which has less bookkeeping overhead at runtime. On the other hand, having an exact reference count of each value enables optimizations such as destructive updates [14]. When performing functional updates, objects often die just before an object of the same kind is created. We observe this phenomenon when we insert a new element into a purely functional data structure such as a binary tree, when a theorem prover rewrites formulas, when a compiler applies optimizations by transforming abstract syntax trees, or when we run the function simplify defined earlier. We call it the resurrection hypothesis: many objects die just before the creation of an object of the same kind. The Lean memory manager uses reference counting and takes advantage of this hypothesis, enabling pure code to perform destructive updates in all scenarios described above when objects are not shared. It also allows a novel programming paradigm that we call functional but in-place (FBIP) [10]. Our preliminary experimental results demonstrate that our new compiler produces competitive code that often outperforms the code generated by high-performance compilers such as ocamlopt and GHC [14]. As an example, consider the function map f as that applies a function f to each element of a list as. In this example, [] denotes the empty list, and a::as the list with head a followed by the tail as.
def map : (α → β) → List α → List β
  | f, [] => []
  | f, a::as => f a :: map f as

If the list referenced by as is not shared, the code generated by our compiler does not allocate any memory. Moreover, if as is a nonshared list of lists of integers, then map (map inc) as will not allocate any memory either. In contrast to static linearity systems, allocations are also avoided even if only a prefix of the list is not shared. FBIP also allows Lean users to use data structures, such as arrays and hashtables, in pure code without any performance penalty when they are not shared. We believe this is an attractive feature because hashtables are frequently used to implement decision procedures and nontrivial proof automation.
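The use of arrays in pure code can be sketched as follows. The function incAll is hypothetical. The code is purely functional at the source level; whether an update happens in place is decided at runtime from the reference count, so the comment about in-place updates describes the unshared case only.

```lean
-- Hypothetical helper: increments every element of an array.
def incAll (xs : Array Nat) : Array Nat := Id.run do
  let mut ys := xs
  for i in [0:ys.size] do
    -- `Array.modify` can update the array destructively
    -- when `ys` holds the only reference to it.
    ys := ys.modify i (· + 1)
  return ys

#eval incAll #[1, 2, 3]  -- #[2, 3, 4]
```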

The User Interface
Our system implements the Language Server Protocol (LSP) using the task abstraction provided by its standard library. The Lean 4 LSP server is incremental: it continuously analyzes the source text and provides semantic information to editors implementing LSP. Our LSP server implements most LSP features found in advanced IDEs, such as hyperlinks, syntax highlighting, type information, error handling, and auto-completion. Many editors implement LSP, but VS Code is the editor preferred by the Lean user community. We provide extensions for visualizing the intermediate proof states in interactive tactic blocks, and we want to port the Lean 3 widget library so that users can construct interactive visualizations for their proofs and programs.

Conclusion
Lean 4 aims to be a fully extensible interactive theorem prover and functional programming language. It has an expressive logical foundation for writing mathematical specifications, proofs, and formally verified programs. Lean 4 provides many unique new features, including a hygienic macro system, an efficient typeclass resolution procedure based on tabled resolution, an efficient code generator, and abstractions for sealing low-level optimizations. The new elaboration procedure is more general and efficient than those implemented in previous versions. Users may also extend and modify the elaborator using Lean itself. Lean has a relatively small trusted kernel, and the rich API allows users to export their developments to other systems and implement their own reference checkers. Lean is an ongoing and long-term effort, and future plans include integration with external SMT solvers and first-order theorem provers, new compiler backends, and porting the Lean 3 mathematical library.