
1 Introduction

CoqCryptoLine  [1] is a verified model checker with certified verification results. It is designed for verifying complex non-linear integer computations commonly found in cryptographic programs. The verification algorithms of CoqCryptoLine consist of two reductions. The algebraic reduction transforms polynomial equality checking into a root entailment problem in commutative algebra; the bit-vector reduction reduces range properties to satisfiability of queries in the Quantifier-Free Bit-Vector (QF_BV) logic from Satisfiability Modulo Theories (SMT) [6]. Both verification algorithms are formally specified and verified by the proof assistant Coq with MathComp  [7, 17]. CoqCryptoLine verification programs are extracted from the formal specification and therefore verified by the proof assistant automatically.

To minimize errors from external tools, recent developments in certified verification are employed by CoqCryptoLine. The root entailment problem is solved by the computer algebra system (CAS) Singular  [19]. CoqCryptoLine asks the external algebraic tool to provide certificates and validates certificates with the formal polynomial theory in Coq. SMT QF_BV queries on the other hand are answered by the verified SMT QF_BV solver CoqQFBV  [33]. Answers to SMT QF_BV queries are therefore all certified as well. With formally verified algorithms and certified answers from external tools, CoqCryptoLine gives verification results with much better guarantees than average automatic verification tools.

Reliable verification tools would not be very useful if they could not check real-world programs effectively. In our experiments, CoqCryptoLine verifies 54 real-world cryptographic programs. 52 of them are from well-known security libraries such as Bitcoin  [35] and OpenSSL  [30]. They are implementations of field and group operations in elliptic curve cryptography. The remaining two are the Number-Theoretic Transform (NTT) programs from the post-quantum cryptosystem Kyber  [10]. All field operations are implemented in a few hundred lines and verified in 6 minutes. The most complicated generic group operation in the elliptic curve Curve25519 consists of about 4000 lines and is verified by CoqCryptoLine in 1.5 h.

Related Work. There are numerous model checkers in the community, e.g. [8, 13, 21,22,23]. Nevertheless, few of them are formally verified. To our knowledge, the first verification of a model checker was performed in Coq for the modal \(\mu \)-calculus [34]. The LTL model checker CAVA  [15, 27] and the model checker Munta  [38, 39] for timed automata were developed and verified using Isabelle/HOL  [29], which can be considered as verified counterparts of SPIN  [21] and Uppaal  [23], respectively. CoqCryptoLine instead checks CryptoLine models [16, 31] that are for the correctness of cryptographic programs. It can be seen as a verified version of CryptoLine. A large body of work studies the correctness of cryptographic programs, e.g. [2,3,4, 9, 12, 14, 24, 26, 40], cf. [5] for a survey. They either require human intervention or are unverified, while our work is fully automatic and verified. The most relevant work is bvCryptoLine  [37], which is the first automated and partly verified model checker for a very limited subset of CryptoLine. We will compare our work with it comprehensively in Sect. 2.3.

2 CoqCryptoLine

CoqCryptoLine is an automatic verification tool that takes a CryptoLine specification as input and returns certified results indicating the validity of the specification. We briefly describe the CryptoLine language [16] followed by the modules, features, and optimizations of CoqCryptoLine in this section.

2.1 CryptoLine Language

A CryptoLine specification contains a CryptoLine program with pre- and post-conditions, where the CryptoLine program usually models some cryptographic program [16, 31]. Both the pre- and post-conditions consist of an algebraic part, which is formulated as a conjunction of (modular) equations, and a range part as an SMT QF_BV predicate. A CryptoLine specification is valid if every program execution starting from a program state satisfying the pre-condition ends in a state satisfying the post-condition.

CryptoLine is designed for modeling cryptographic assembly programs. Besides the assignment (mov) and conditional assignment (cmov) statements, CryptoLine provides arithmetic statements such as addition (add), addition with carry (adc), subtraction (sub), subtraction with borrow (sbb), half multiplication (mul) and full multiplication (mull). Most of them have versions that model the carry/borrow flags explicitly (like adds, adcs, subs, sbbs). It also allows bitwise statements, for instance, bitwise AND (and), OR (or) and left-shift (shl). To deal with multi-word arithmetic, CryptoLine further includes multi-word constructs, for example, those that split (split) or join (join) words, as well as multi-word shifts (cshl). CryptoLine is strongly typed, admitting both signed and unsigned interpretations for bit-vector variables and constants. The cast statement converts types explicitly. Finally, CryptoLine also supports special statements (assert and assume) for verification purposes.
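To make the flavor of these statements concrete, the following Python sketch models a few of them on 64-bit unsigned words. The function names mirror the statement names above, but the argument order, return conventions, and the fixed 64-bit width are assumptions made for illustration; the authoritative semantics is the one given in [16].

```python
# Hypothetical 64-bit models of a few CryptoLine-style statements.
# Names follow the statements in the text; signatures are illustrative only.
W = 64
MASK = (1 << W) - 1

def adds(a, b):
    """Addition with explicit carry-out: returns (carry, sum mod 2^W)."""
    s = a + b
    return s >> W, s & MASK

def adcs(a, b, carry_in):
    """Addition with carry-in and explicit carry-out."""
    s = a + b + carry_in
    return s >> W, s & MASK

def mull(a, b):
    """Full multiplication: returns (high word, low word) of a * b."""
    p = a * b
    return p >> W, p & MASK

def split(a, n):
    """Split a word at bit n: returns (high part, low n bits)."""
    return a >> n, a & ((1 << n) - 1)
```

In this model, the pair returned by mull always recombines to the exact integer product, which is the kind of fact the algebraic reduction exploits.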

2.2 The Architecture of CoqCryptoLine

CoqCryptoLine reduces the verification problem of a CryptoLine specification to instances of root entailment problems and SMT problems over the QF_BV logic. These instances are then solved by respective certified techniques. Moreover, the components in CoqCryptoLine are also specified and verified by the proof assistant Coq with MathComp  [7, 17]. Figure 1 gives an overview of CoqCryptoLine. In the figure, dashed components represent external tools. Rectangular boxes are verified components and rounded boxes are unverified. Note that all our proof efforts using Coq are transparent to users. No Coq proof is required from users during verification of cryptographic programs with CoqCryptoLine. Details can be found in [36].

Fig. 1. Overview of CoqCryptoLine

Starting from a CryptoLine specification text, the CoqCryptoLine parser translates the text into an abstract syntax tree defined in the Coq module DSL. The module gives formal semantics for the typed CryptoLine language [16]. The validity of CryptoLine specifications is also formalized. Similar to most program verification tools, CoqCryptoLine transforms CryptoLine specifications to the static single assignment (SSA) form. The SSA module gives our transformation algorithm. It moreover shows that validity of CryptoLine specifications is preserved by the SSA transformation. CoqCryptoLine then reduces the verification problem via two Coq modules.
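The idea behind the SSA transformation can be sketched in a few lines of Python for straight-line programs. The tuple-based program representation and the `name_version` naming scheme are hypothetical and far simpler than the syntax tree of the DSL module; the sketch only shows the renaming, not the validity-preservation proof.

```python
def to_ssa(program):
    """Rename variables in a straight-line program so each is assigned once.

    `program` is a list of (dest, op, sources) tuples (a hypothetical format).
    """
    version = {}  # current version index of every variable

    def use(v):
        # Integer constants are kept; unassigned variables are inputs (version 0).
        return v if isinstance(v, int) else f"{v}_{version.get(v, 0)}"

    out = []
    for dest, op, sources in program:
        renamed = [use(s) for s in sources]       # read old versions first
        version[dest] = version.get(dest, 0) + 1  # then create a fresh version
        out.append((f"{dest}_{version[dest]}", op, renamed))
    return out

prog = [("x", "add", ["x", "y"]),
        ("x", "mul", ["x", 2]),
        ("y", "sub", ["x", "y"])]
ssa = to_ssa(prog)
```

After the transformation every variable has exactly one defining statement, so each statement can be read as an equation over its variables.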

The SSA2ZSSA module contains our algebraic reduction to the root entailment problem. Concretely, a system of (modular) equations is constructed from the given program so that program executions correspond to the roots of the system of (modular) equations. To verify algebraic post-conditions, it suffices to check if the roots for executions are also roots of (modular) equations in the post-condition. However, program executions can deviate from roots of (modular) equations when over- or under-flow occurs. CoqCryptoLine will generate soundness conditions to ensure the executions conform to our (modular) equations. The algebraic verification problem is thus reduced to the root entailment problem provided that soundness conditions hold.
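The role of soundness conditions can be illustrated on a single addition. The sketch below uses a 4-bit word so that all inputs can be enumerated. The two integer equations shown are an assumed, simplified reading of how addition statements might be mapped to polynomial equations; they are not taken from the SSA2ZSSA module.

```python
W = 4          # tiny word size so every input can be enumerated
M = 1 << W

def add(a, b):
    """Machine addition that drops the carry."""
    return (a + b) % M

def adds(a, b):
    """Machine addition that keeps the carry: returns (carry, sum)."""
    s = a + b
    return s // M, s % M

# Integer equation assumed for `add`: c = a + b. It matches the machine
# result only under the soundness condition "no overflow" (a + b < 2^W).
add_mismatches = [(a, b) for a in range(M) for b in range(M)
                  if add(a, b) != a + b]
add_overflows = [(a, b) for a in range(M) for b in range(M) if a + b >= M]

# Integer equation assumed for `adds`: c + carry * 2^W = a + b.
# It holds for all inputs, so no soundness condition is needed.
adds_always_exact = all(adds(a, b)[1] + adds(a, b)[0] * M == a + b
                        for a in range(M) for b in range(M))
```

The inputs on which the equation for add fails are exactly the overflowing ones, which is why ruling out overflow is enough to make the equation a faithful model.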

The SSA2QFBV module gives our bit-vector reduction to the SMT QF_BV problem. It constructs an SMT query to check the validity of the given CryptoLine range specification. Concretely, an SMT QF_BV query is built such that all program executions correspond to satisfying assignments to the query and vice versa. To verify the range post-conditions, it suffices to check if satisfying assignments for the query also satisfy the post-conditions. The range verification problem is thus reduced to the SMT QF_BV problem. On the other hand, additional SMT queries are constructed to check soundness conditions for the algebraic reduction. We formally prove the equivalence between soundness conditions and corresponding queries.
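As a rough illustration of the range check, the sketch below replaces the SMT solver with brute-force enumeration over 4-bit inputs. The helper name `counterexamples` and the toy program are hypothetical; an empty result plays the role of an unsatisfiable answer to the query "pre-condition and program and negated post-condition".

```python
from itertools import product

W = 4
M = 1 << W

def counterexamples(pre, program, post):
    """Enumerate inputs where pre and the program hold but post fails."""
    bad = []
    for a, b in product(range(M), repeat=2):
        if pre(a, b):
            c = program(a, b)
            if not post(a, b, c):
                bad.append((a, b, c))
    return bad

def program(a, b):
    return (a + b) % M          # 4-bit wrapping addition

def pre(a, b):
    return a < 8 and b < 8      # range pre-condition

def post_ok(a, b, c):
    return c < 15               # valid: a + b is at most 14

def post_bad(a, b, c):
    return c < 14               # invalid: fails when a = b = 7
```

Enumeration is only feasible at toy widths; for 64-bit words the same question has to be answered by bit-blasting and SAT solving.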

With the two formally verified reduction algorithms, it remains to solve the root entailment problems and the SMT QF_BV problems with external solvers. CoqCryptoLine invokes an external computer algebra system (CAS) to solve the root entailment problems, and improves the techniques in [20, 37] to validate the (untrusted) returned answers. Currently, the CAS Singular  [19] is supported. To solve the SMT QF_BV problems, CoqCryptoLine employs the certified SMT QF_BV solver CoqQFBV  [33]. In all cases, instances of the two kinds of problems are solved with certificates. And CoqCryptoLine employs verified certificate checkers to validate the answers to further improve assurance.

Note that the algebraic reduction in SSA2ZSSA is sound but not complete due to the abstraction of bit-accurate semantics into (modular) polynomial equations over integers. Thus a failure in solving the root entailment problem by CAS does not mean that the algebraic post-conditions are violated. On the other hand, the bit-vector reduction in SSA2QFBV is both sound and complete.

The CoqCryptoLine tool is built on OCaml programs extracted from verified algorithms in Coq with MathComp. We moreover integrate the OCaml programs from the certified SMT QF_BV solver CoqQFBV. Our trusted computing base consists of (1) CoqCryptoLine parser, (2) text interface with external SAT solvers (from CoqQFBV), (3) the proof assistant Isabelle  [29] (from the SAT solver certificate validator Grat used by CoqQFBV) and (4) the Coq proof assistant. Particularly, sophisticated decision procedures in external CASs and SAT solvers used in CoqQFBV need not be trusted.

2.3 Features and Optimizations

CoqCryptoLine comes with the following features and optimizations implemented in its modules.

Type System. CoqCryptoLine fully supports the type system of the CryptoLine language. The type system is used to model bit-vectors of arbitrary bit-widths with unsigned or signed interpretation. Such a type system allows CoqCryptoLine to model more industrial examples translated from C programs via GCC [16] or LLVM [24] compared to bvCryptoLine  [37], which only allows unsigned bit-vectors, all of the same bit-width.

Mixed Theories. With the assert and assume statements supported by CoqCryptoLine, it is possible to make an assertion on the range side (or on the algebraic side) and then make an equivalent assumption on the algebraic side (or resp. on the range side). With this feature, a predicate can be asserted on one side where the predicate is easier to prove, and then assumed on the other side to ease the verification of other predicates. The equivalence between the asserted predicate and the assumed predicate is currently not verified by CoqCryptoLine, though it is achievable. Both assert and assume statements are not available in bvCryptoLine.

Multi-threading. All extracted OCaml code from the verified algorithms in Coq runs sequentially. To speed up verification, SMT QF_BV problems and root entailment problems are solved in parallel.

Efficient Root Entailment Problem Solving. CoqCryptoLine can be used as a solver for root entailment problems with certificates validated by a verified validator. A root entailment problem is reduced to an ideal membership problem, which is then solved by computing a Gröbner basis [20]. To solve a root entailment problem with a certificate, we need to find a witness of polynomials \(c_0, \ldots , c_n\) such that

$$\begin{aligned} q = \varSigma _{i=0}^n{c_ip_i} \end{aligned} \tag{1}$$

where q and \(p_i\)’s are given polynomials. To compute the witness, bvCryptoLine relies on gbarith [32], where new variables are introduced. CoqCryptoLine instead uses the lift command in Singular, which does not add fresh variables. We show in the evaluation section that using lift is more efficient than using gbarith. The witness found is further validated by CoqCryptoLine, which relies on the polynomial normalization procedure norm_subst in Coq to check whether Eq. 1 holds. bvCryptoLine on the other hand uses the ring tactic in Coq, where extra type checking is performed. Elimination of ideal generators through variable substitution is an efficient approach to simplifying an ideal membership problem [37]. The elimination procedure implemented in CoqCryptoLine can identify many more variable substitution patterns than the one in bvCryptoLine.
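Checking a witness for Eq. 1 is much cheaper than finding one: it only requires expanding the right-hand side and comparing normal forms. The Python sketch below does this for integer-coefficient polynomials stored as dictionaries from exponent tuples to coefficients. The representation and the function names are assumptions for illustration; this is an analogue of the check, not the norm_subst procedure itself.

```python
from collections import defaultdict

def poly_add(p, q):
    """Add two polynomials and drop zero coefficients."""
    r = defaultdict(int)
    for poly in (p, q):
        for mono, coeff in poly.items():
            r[mono] += coeff
    return {m: c for m, c in r.items() if c != 0}

def poly_mul(p, q):
    """Multiply two polynomials; monomials are tuples of exponents."""
    r = defaultdict(int)
    for m1, c1 in p.items():
        for m2, c2 in q.items():
            mono = tuple(e1 + e2 for e1, e2 in zip(m1, m2))
            r[mono] += c1 * c2
    return {m: c for m, c in r.items() if c != 0}

def check_certificate(q, ps, cs):
    """Return True iff q == sum(c_i * p_i), comparing normal forms."""
    total = {}
    for c, p in zip(cs, ps):
        total = poly_add(total, poly_mul(c, p))
    return total == q

# Variables (x, y): x^2 - 1 = (x + y)(x - y) + (y + 1)(y - 1).
p0 = {(1, 0): 1, (0, 1): -1}     # x - y
p1 = {(0, 1): 1, (0, 0): -1}     # y - 1
c0 = {(1, 0): 1, (0, 1): 1}      # x + y
c1 = {(0, 1): 1, (0, 0): 1}      # y + 1
q = {(2, 0): 1, (0, 0): -1}      # x^2 - 1
```

Because the check is a plain polynomial identity, the tool that produced the coefficients does not need to be trusted.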

Multi-moduli. Modular equations with multi-moduli are common in post-quantum cryptography. For example, the post-quantum cryptosystem Kyber uses the polynomial ring \(\mathbb {Z}_{3329}[X]/\langle X^{256} + 1 \rangle \) containing two moduli 3329 and \(X^{256} + 1\). To support multi-moduli in CoqCryptoLine, in the proof of our algebraic reduction, we have to find integers \(c_0, \ldots , c_n\) such that \(e_1 - e_2 = \varSigma _{i=0}^nc_im_i\) given the proof of \(e_1 = e_2 \pmod {m_0, \ldots , m_n}\) where \(e_1\), \(e_2\), and \(m_i\)’s are integers. Instead of implementing a complicated procedure to find the exact \(c_i\)’s, we simply invoke the xchoose function provided by MathComp to find \(c_i\)’s based on the proof of \(e_1 = e_2 \pmod {m_0, \ldots , m_n}\). Multi-moduli are not supported by bvCryptoLine.
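For intuition about what such coefficients look like over the integers, the sketch below computes them explicitly with the extended Euclidean algorithm. This is a deliberately different technique from the one used in the proof: xchoose extracts a witness from an existence proof and needs no such procedure. The function names are hypothetical and the moduli are assumed to be non-negative.

```python
def ext_gcd(a, b):
    """Return (g, x, y) with a*x + b*y == g == gcd(a, b) for a, b >= 0."""
    x0, y0, x1, y1 = 1, 0, 0, 1
    while b:
        q, a, b = a // b, b, a % b
        x0, x1 = x1, x0 - q * x1
        y0, y1 = y1, y0 - q * y1
    return a, x0, y0

def multi_moduli_witness(e1, e2, moduli):
    """Find integers c_i with e1 - e2 == sum(c_i * m_i), or None if none exist."""
    g, coeffs = 0, []
    for m in moduli:
        g, x, y = ext_gcd(g, m)               # new g == old g * x + m * y
        coeffs = [c * x for c in coeffs] + [y]
    diff = e1 - e2
    if g == 0:
        return coeffs if diff == 0 else None
    if diff % g != 0:
        return None                            # e1 and e2 are not congruent
    return [c * (diff // g) for c in coeffs]

moduli = [12, 18]
witness = multi_moduli_witness(47, 5, moduli)
```

A witness exists exactly when the greatest common divisor of the moduli divides the difference of the two sides.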

Tight Integration with CoqQFBV. CoqCryptoLine verifies every atomic range predicate separately using the certified SMT QF_BV solver CoqQFBV. Constructing a text file as the input to CoqQFBV for every atomic range predicate would be inefficient because the bit-blasting procedure in CoqQFBV would be repeated for the same program each time. CoqCryptoLine is therefore tightly integrated with CoqQFBV and uses the cache provided by CoqQFBV to speed up bit-blasting of the same program. bvCryptoLine uses the SMT solver Boolector to prove range predicates without certificates.

Slicing. During the reductions from the verification problem of a CryptoLine specification to instances of root entailment problems and SMT QF_BV problems, a verified static slicing is performed in CoqCryptoLine to produce smaller problems. Unlike the work in [11], which sets all assume statements as additional slicing criteria, the slicing in CoqCryptoLine is capable of pruning unrelated predicates in assume statements. The slicing procedure implemented in CoqCryptoLine is much more complicated than the one in bvCryptoLine due to the presence of assume statements. This feature is provided as a command-line option because it makes the verification incomplete. With slicing, the time needed to verify industrial examples is reduced dramatically.
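The basic idea of static slicing on a straight-line SSA program can be sketched as a single backward pass. The pair-based program representation below is hypothetical, and the sketch ignores assume statements entirely, which is precisely the part that makes the verified procedure in CoqCryptoLine more involved.

```python
def slice_program(program, criterion_vars):
    """Keep only statements that the variables in `criterion_vars` depend on.

    `program` is a straight-line SSA list of (dest, sources) pairs.
    """
    needed = set(criterion_vars)
    kept = []
    for dest, sources in reversed(program):   # scan from the last statement
        if dest in needed:
            kept.append((dest, sources))
            needed.update(sources)            # its operands become relevant
    kept.reverse()
    return kept

prog = [("t1", ["a", "b"]),
        ("t2", ["c", "d"]),     # unrelated to r
        ("t3", ["t1", "a"]),
        ("r", ["t3"]),
        ("s", ["t2"])]          # unrelated to r
sliced = slice_program(prog, {"r"})
```

Only the statements that can influence the predicate under consideration remain, so the generated algebraic and bit-vector problems become smaller.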

3 Walkthrough

We illustrate how CoqCryptoLine is used in this section. The x86_64 assembly subroutine ecp_nistz256_mul_montx from OpenSSL  [30] shown in Fig. 2 is verified as an example.

An input for CoqCryptoLine contains a CryptoLine specification for the assembly subroutine. The original subroutine is marked between the comments PROGNAME STARTS and PROGNAME ENDS, which is obtained automatically from the Python script provided by CryptoLine  [31].

The parameter declaration, pre-condition, and variable initialization precede the PROGNAME STARTS comment. After the PROGNAME ENDS comment, the result is moved to the output variables and the post-condition of the subroutine is stated.

The assembly subroutine ecp_nistz256_mul_montx takes two 256-bit unsigned integers a and b and the modulus m as inputs. The 256-bit integer m is the prime \(p256 = 2^{256} - 2^{224} + 2^{192} + 2^{96} - 1\) from the NIST curve. The 256-bit integers a and b (less than the prime) are the multiplicands. Each 256-bit input integer \(d \in \{a, b, m\}\) is denoted by four 64-bit unsigned integer variables \(d_i\) (for \(0 \le i < 4\)) in little-endian representation. The expression limbs n [\({\texttt {\textit{d}}}_{\texttt {\textit{0}}}\), \({\texttt {\textit{d}}}_{\texttt {\textit{1}}}\), ..., \({\texttt {\textit{d}}}_{\texttt {\textit{i}}}\)] is short for \({\texttt {\textit{d}}}_{\texttt {\textit{0}}}\) + \({\texttt {\textit{d}}}_{\texttt {\textit{1}}}\)*2**n + ... + \({\texttt {\textit{d}}}_{\texttt {\textit{i}}}\)*2**(i*n) (see Footnote 1). The inputs and constants are then put in the variables for memory cells with the mov statements. There are two parts to a pre-condition. The first part is for the algebraic reduction; the second part is for the bit-vector reduction:

figure c
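The limbs expression described above corresponds to the following Python sketch. The helper `to_limbs` and the variable names are introduced here for illustration only.

```python
def limbs(n, ds):
    """Value of little-endian n-bit limbs: ds[0] + ds[1]*2**n + ..."""
    return sum(d << (n * i) for i, d in enumerate(ds))

def to_limbs(n, value, count):
    """Split a non-negative integer into `count` little-endian n-bit limbs."""
    mask = (1 << n) - 1
    return [(value >> (n * i)) & mask for i in range(count)]

P256 = 2**256 - 2**224 + 2**192 + 2**96 - 1
m = to_limbs(64, P256, 4)   # the four 64-bit limbs m_0, ..., m_3 of the modulus
```

Recombining the four limbs with limbs(64, m) gives back the prime, which is how the pre-condition ties the 64-bit program variables to the 256-bit mathematical values.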

The output 256-bit integer represented by the four variables \(c_i\) (for \(0 \le i < 4\)) has two requirements. Firstly, the output integer times \(2^{256}\) equals the product of the input integers modulo p256. Secondly, the output integer is less than p256. Formally, we have this post-condition:

figure d

Here, we employ the algebraic reduction to verify the non-linear modular equality, and the bit-vector reduction to verify the proper range of the output integer.
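The two requirements can be written as an executable predicate. The sketch below also defines a reference Montgomery product using a modular inverse; it is a mathematical reference, not the algorithm implemented by the assembly subroutine, and the function names are hypothetical. It requires Python 3.8 or later for the three-argument pow with a negative exponent.

```python
P256 = 2**256 - 2**224 + 2**192 + 2**96 - 1
R = 2**256

def mont_mul_reference(a, b, p=P256):
    """Reference Montgomery product a * b * R^(-1) mod p."""
    return a * b * pow(R, -1, p) % p

def post_condition(a, b, c, p=P256):
    """Check both parts of the post-condition for one concrete triple."""
    algebraic = (c * R - a * b) % p == 0   # c * 2^256 = a * b (mod p256)
    in_range = 0 <= c < p                  # c < p256
    return algebraic and in_range

a = P256 - 2
b = 2**200 + 12345
c = mont_mul_reference(a, b)
```

Evaluating the predicate on sample inputs is only testing. CoqCryptoLine establishes the same two properties for all inputs that satisfy the pre-condition.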

Fig. 2. CryptoLine Model for ecp_nistz256_mul_montx

However, verifying ecp_nistz256_mul_montx requires extra annotations that hint to CoqCryptoLine how to verify the post-condition. For example, when adding two 256-bit integers represented by 64-bit variables, a chain of four 64-bit additions is performed and carries are propagated. The last carry at the end of the chain must be zero; otherwise the 256-bit sum is incorrect. In ecp_nistz256_mul_montx, two interleaved addition chains use the carry flag and the overflow flag, respectively, to propagate carries, so we annotate as follows at the end of the two interleaved addition chains to tell CoqCryptoLine about the final carries:

figure e

The assert statement verifies that both the carry and overflow flags are zeroes through the bit-vector reduction. The assume statement then passes this information to the algebraic reduction. Effectively, CoqCryptoLine checks that both flags are zero for all inputs satisfying the pre-condition, then uses those facts as lemmas to verify the post-condition with the algebraic reduction.
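Why the final carry matters can be seen in a small Python model of a four-limb addition chain. This is a generic sketch of multi-word addition under the assumption of 64-bit limbs; it does not reproduce the annotation itself or the interleaving of the two chains.

```python
W = 64
MASK = (1 << W) - 1

def add_chain(xs, ys):
    """Add two little-endian multi-limb integers; return (limbs, final carry)."""
    carry, out = 0, []
    for x, y in zip(xs, ys):
        s = x + y + carry
        out.append(s & MASK)     # low 64 bits stay in this limb
        carry = s >> W           # carry propagates to the next limb
    return out, carry

def value(ds):
    """Integer denoted by little-endian 64-bit limbs."""
    return sum(d << (W * i) for i, d in enumerate(ds))

xs = [MASK, MASK, MASK, MASK]
ys = [1, 0, 0, 0]
out, carry = add_chain(xs, ys)   # here the final carry is 1
```

In general value(out) + carry * 2**256 equals value(xs) + value(ys), so the four output limbs represent the true sum only when the final carry is zero. That is the fact asserted on the range side and assumed on the algebraic side.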

The full specification for ecp_nistz256_mul_montx has 230 lines, including 50 lines of manual annotations. 20 are straightforward annotations for variable declaration and initialization. The remaining 30 lines of annotations are hints to CoqCryptoLine, which then verifies the post-condition in 30 s with 24 threads.

This illustration of the typical verification flow shows how a user constructs a CryptoLine specification. The pre-condition for program inputs, the post-condition for outputs, and variable initialization must be specified manually. Additional annotations may be added as hints. Notice that hints only tell CoqCryptoLine what properties should hold, not why they hold. Proofs of annotated hints and the post-condition are found by CoqCryptoLine automatically. Consequently, manual annotations are minimized and verification efforts are reduced significantly.

4 Evaluation

We evaluate CoqCryptoLine on 52 benchmarks from four industrial security libraries: Bitcoin [35], boringSSL [14, 18], nss [25], and OpenSSL [30]. The C reference and optimized avx2 implementations of the Number-Theoretic Transform (NTT) from the post-quantum key encapsulation mechanism Kyber [10] are also evaluated. Among the total 54 benchmarks, 43 benchmarks contain features not supported by bvCryptoLine such as signed variables. All experiments are performed on an Ubuntu 22.04.1 machine with a 3.20 GHz Intel Xeon Gold 6134M CPU and 1 TB RAM.

Benchmarks from security libraries are various field and group operations from elliptic curve cryptography (ECC). In ECC, rational points on curves are represented by elements in large finite fields. In Bitcoin, the finite field is the residue system modulo the prime \(p256k1 = 2^{256} - 2^{32} - 2^9 - 2^8 - 2^7 - 2^6 - 2^4 - 1\). For other security libraries (boringSSL, nss, and OpenSSL), we verify the operations in Curve25519 using the residue system modulo the prime \(p25519 = 2^{255} - 19\) as the underlying field. Rational points on elliptic curves form a group. The group operation in turn is implemented by a number of field operations.

In lattice-based post-quantum cryptosystems, polynomial rings are used. Specifically, the polynomial ring \(\mathbb {Z}_{3329}[X]/\langle X^{256} + 1 \rangle \) is used in Kyber. To speed up multiplication in the polynomial ring, Kyber requires the multiplication to be implemented by NTT. NTT is a discrete Fast Fourier Transform over finite fields. Instead of complex roots of unity, NTT uses the principal roots of unity in fields. Mathematically, the Kyber NTT computes the following ring isomorphism

$$ \mathbb {Z}_{3329}[X]/\langle X^{256} + 1 \rangle \cong \mathbb {Z}_{3329}[X]/\langle X^{2}-\zeta _0\rangle \times \cdots \times \mathbb {Z}_{3329}[X]/\langle X^{2}-\zeta _{127}\rangle $$

where \(\zeta _i\)’s are the principal roots of unity.
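The isomorphism rests on the fact that \(X^{256} + 1\) splits into 128 quadratic factors modulo 3329. The Python sketch below checks this factorization directly. It assumes that the \(\zeta _i\)’s are the odd powers of 17, a primitive 256th root of unity modulo 3329 used by Kyber; the order in which Kyber indexes them is not modeled, and the helper name is hypothetical.

```python
Q = 3329
ZETA = 17   # assumed primitive 256th root of unity modulo 3329

def poly_mul_mod(p, q, mod=Q):
    """Multiply coefficient lists (lowest degree first) modulo `mod`."""
    r = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            r[i + j] = (r[i + j] + a * b) % mod
    return r

# Multiply the 128 quadratic factors X^2 - ZETA^(2i+1).
prod_poly = [1]
for i in range(128):
    root = pow(ZETA, 2 * i + 1, Q)
    prod_poly = poly_mul_mod(prod_poly, [(-root) % Q, 0, 1])

x256_plus_1 = [1] + [0] * 255 + [1]   # coefficients of X^256 + 1
```

Since the factors are pairwise coprime, the Chinese remainder theorem turns this factorization into the product decomposition of the quotient ring.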

Fig. 3. Running time (in seconds) comparisons

We first compare CoqCryptoLine with all optimizations described in this paper against the unverified model checker CryptoLine  [16]. Both tools invoke the computer algebra system Singular  [19], but CryptoLine neither lets Singular produce certificates nor certifies answers from Singular. CoqCryptoLine moreover uses the certified SMT QF_BV solver CoqQFBV [33]; CryptoLine uses the uncertified but very efficient Boolector  [28].

For the ECC experiments, CoqCryptoLine verifies all field operations in 6 minutes. It takes a few thousand seconds to verify group operations. The most complex implementation (x25519_scalar_mult_generic) from boringSSL (4274 statements) takes about 1.5 hours (see Footnote 2). For Kyber, CoqCryptoLine verifies in 2642 and 1048 seconds, respectively, that the reference and avx2 NTT implementations indeed compute the isomorphism. The unverified CryptoLine in comparison finishes verification in about 95 seconds. A summary of the comparison between CoqCryptoLine and CryptoLine is shown in Fig. 3a. Though CoqCryptoLine is much slower than CryptoLine, the running time (1.5 hours) for the most complex implementation is still acceptable.

Figure 3b shows the percentages of average running time for CoqCryptoLine internal OCaml code (INT), external SMT QF_BV solver (SMT), and external computer algebra system (CAS). External solvers take much more time than the internal OCaml program does. Between external solvers, the external computer algebra system takes 4.63% of the time and the external SMT QF_BV solver spends 93.28% of the time.

To show the performance of the lift optimization, we run CoqCryptoLine and bvCryptoLine on root entailment problems generated from the benchmarks. Here we only consider 12 root entailment problems that trigger gbarith in bvCryptoLine. Figure 3c shows the running time of Singular in solving root entailment problems based on gbarith in bvCryptoLine and lift in CoqCryptoLine. bvCryptoLine fails to solve 3 root entailment problems in one hour. For the other 9 root entailment problems, lift outperforms gbarith.

We also compare CoqCryptoLine with and without slicing. The version of CoqCryptoLine without slicing is denoted by CoqCryptoLine \(^{-}\). The running time comparison between CoqCryptoLine and CoqCryptoLine \(^{-}\) in Fig. 3d shows that slicing clearly reduces the running time.

5 Conclusion

CoqCryptoLine is a verified model checker for cryptographic programs with certified results. Its modules are formally verified in Coq with MathComp. CoqCryptoLine moreover employs external tools and validates their answers with certificates. We evaluate CoqCryptoLine on benchmarks from industrial security libraries (Bitcoin, boringSSL, nss and OpenSSL) and a post-quantum cryptography standard candidate (Kyber). In our experiments, CoqCryptoLine verifies most cryptographic programs with certificates in a reasonable time (6 min). Benchmarks with thousands of lines are verified in 1.5 h. To our knowledge, this is the first certified verification on operations of the elliptic curve secp256k1 used in Bitcoin, and the avx2 and reference implementations of Kyber number-theoretic transform.