1 Introduction

Cryptographic circuits are widely applied in various embedded and cyber-physical systems [5, 39]. However, they are vulnerable to fault injection attacks, which disrupt the execution of cryptographic primitives via clock glitch [2], underpowering [34], voltage glitch [41], electromagnetic pulse [16], or laser beam [36]. Given a circuit’s faulty outputs, attackers can employ statistical analysis methods to infer sensitive information, thereby threatening the security of, e.g., authentication. As a result, fault injection attacks pose a significant threat to the security of embedded and cyber-physical systems.

While countermeasures have been proposed to mitigate these attacks [1, 26, 35], their implementation does not necessarily guarantee security. Crucially, the fault-resistance of these countermeasures needs to be formally verified. While a plethora of fault-resistance analysis approaches have been proposed (cf. Sect. 6), the state-of-the-art formal verification approaches are non-compositional and limited in efficiency and scalability for realistic cryptographic circuits.

Contributions. In this work, we propose the first compositional verification approach for sequential circuits of cryptographic primitives with countermeasures against fault injection attacks, aiming to combat the efficiency and scalability challenges. Different from existing approaches for compositional safety and equivalence checking (e.g., [15, 24, 25]) which are not applicable for fault-resistance verification, our approach leverages the structural feature of round-based cryptographic circuits and decomposes the circuit into a set of single-round sub-circuits extended with, importantly, primary inputs/outputs, registers and their connections to guarantee soundness. We then verify those sub-circuits by leveraging SAT/SMT- and BDD-based approaches [31, 37]. Our decomposition approach guarantees that the composition of fault-resistant single-round sub-circuits is always fault-resistant. Furthermore, we investigate various acceleration techniques that can significantly enhance verification efficiency.

We implement our approach as an open-source tool CLEAVE (Compositional fauLt injEction Attacks VErifier), which operates on Verilog gate-level netlists. We thoroughly evaluate CLEAVE on 9 real-world cryptographic circuits (i.e., AES and LED64) equipped with both detection- and correction-based countermeasures, where the number of gates ranges from 1,020 to 34,351. The experimental results show that our approach is effective and efficient. For instance, the SAT-based compositional approach can verify most of the benchmarks (17/18) within 200 s and the remaining one can be done in 53 min; in contrast, the monolithic counterpart can only deal with 12 benchmarks within 6 h and requires significantly more verification time. The same improvements can be observed for SMT- and BDD-based compositional approaches.

To summarize, we make the following contributions.

  • We propose a novel compositional fault-resistance verification framework for cryptographic circuits and various techniques to enhance efficiency;

  • We implement an open-source tool CLEAVE for Verilog gate-level netlists;

  • We extensively evaluate our tool on realistic cryptographic circuits, demonstrating its effectiveness and efficiency.

Outline. Section 2 introduces preliminaries. Section 3 defines the fault-resistance verification problem. Section 4 presents our compositional verification approach. Section 5 reports experimental results. We discuss related work in Sect. 6 and conclude the work in Sect. 7. Benchmarks, the source code of CLEAVE, more experimental results and missing proofs are provided in [38].

2 Preliminaries

Let \(\mathbb {B}:=\{\texttt {0},\texttt {1} \}\) and \([n]:=\{1,\cdots , n\}\) for a natural number \(n\ge 1\). We consider two types of logic gates: one-input gate \(g:\mathbb {B}\rightarrow \mathbb {B}\) (e.g., \(\texttt {not} \)) and two-input gate \(g:\mathbb {B}\times \mathbb {B}\rightarrow \mathbb {B}\) (e.g., \(\texttt {and} \), \(\texttt {or} \), \(\texttt {xor} \)). To model faulty gates, we define three faulty counterparts (\(\overline{g}\), \(g_s\), \(g_r\)) of each gate g with \(\overline{g}=\lnot g\), \(g_s=\texttt {1} \) and \(g_r=\texttt {0} \).

Definition 1

A combinational circuit C is a tuple \((V,I,O, E,\texttt {g}),\) where

  • V is a finite set of vertices in the circuit such that each vertex \(v\in V\setminus (I\cup O)\) is associated with a logic gate \(\texttt {g}(v)\) whose fan-in is the in-degree of v;

  • \(I\subseteq V\) and \(O\subseteq V\) are the primary inputs and outputs, respectively;

  • \(E\subseteq (V\setminus O) \times (V\setminus I)\) is a set of edges, each of which \((v_1,v_2)\in E\) transmits the signal over \(\mathbb {B}\) from \(v_1\) to \(v_2\), namely, one of the inputs of the logic gate \(\texttt {g}(v_2)\) is driven by the output of the logic gate \(\texttt {g}(v_1)\);

  • and (V, E) forms a Directed Acyclic Graph (DAG).

A combinational circuit C represents a Boolean function \(\llbracket C\rrbracket :\mathbb {B}^{|I|}\rightarrow \mathbb {B}^{|O|}\) such that for any input signals \(\boldsymbol{x}\in \mathbb {B}^{|I|}\), \(\llbracket C\rrbracket (\boldsymbol{x})\) is the output of the circuit C when fed with \(\boldsymbol{x}\).
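As an illustration of Definition 1, the following minimal sketch evaluates \(\llbracket C\rrbracket (\boldsymbol{x})\) by memoized traversal of the DAG. The dictionary-based encoding, the names `evaluate` and `half_adder`, and the choice to let each primary output name the vertex that drives it are assumptions made for this example only; they are not part of CLEAVE.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical encoding: each gate vertex maps to (gate function, fan-in vertices).
Gate = Tuple[Callable[..., int], List[str]]

NOT = lambda a: 1 - a
AND = lambda a, b: a & b
XOR = lambda a, b: a ^ b


def evaluate(gates: Dict[str, Gate],
             inputs: Dict[str, int],
             outputs: Dict[str, str]) -> Dict[str, int]:
    """Compute [[C]](x); `outputs` maps each primary output to its driving vertex."""
    values = dict(inputs)  # primary inputs carry the given signals

    def value(v: str) -> int:
        # Memoized evaluation; terminates because (V, E) is acyclic.
        if v not in values:
            fn, fanin = gates[v]
            values[v] = fn(*(value(u) for u in fanin))
        return values[v]

    return {o: value(src) for o, src in outputs.items()}


# A one-bit half adder: sum = a xor b, carry = a and b.
half_adder = {"g_sum": (XOR, ["a", "b"]), "g_carry": (AND, ["a", "b"])}
result = evaluate(half_adder, {"a": 1, "b": 1},
                  {"sum": "g_sum", "carry": "g_carry"})
```

For the input \(a=b=\texttt {1}\) the sketch yields sum \(\texttt {0}\) and carry \(\texttt {1}\).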

A (synchronous) sequential circuit is a combinational circuit with feedback via registers and synchronized by a global clock. It is memoryful as the registers store the internal state. In this paper, we focus on round-based circuit implementations of cryptographic algorithms. Conceptually, the circuit consists of several rounds, and physically each round may comprise some clock cycles. For our purpose, the sequential circuit is defined as follows.

Definition 2

A k-clock cycle sequential circuit \(\mathcal {S}[k]\) (we may simply write \(\mathcal {S}\) to simplify the notation) is a tuple \((\mathcal {I}, \mathcal {O}, \mathcal {R}, \boldsymbol{s}_0, \mathcal {C}),\) where

  • \(\mathcal {I}\) and \(\mathcal {O}\) comprise the primary inputs and primary outputs, respectively.

  • \(\mathcal {R}=\mathcal {R}_{in}\cup \mathcal {R}_{s}\) is a finite set of registers (aka memory gates), with initial signals \(\boldsymbol{s}_0\in \mathbb {B}^{|\mathcal {R}_{s}|}\) for state registers in \(\mathcal {R}_{s}\). Intuitively, registers in \(\mathcal {R}_{in}\) (resp. \(\mathcal {R}_{s}\)) store primary input signals (resp. results) of combinational circuits.

  • \(\mathcal {C}=\{C_1,\cdots , C_k\}\), where for each \(i\in [k]\), \(C_i=(V_i, I_i, O_i, E_i, \texttt {g}_i)\) is a combinational circuit for the i-th clock cycle. Moreover, it is required that all the primary inputs \(\mathcal {I}\) are connected to registers in \(\mathcal {R}_{in}\) which in turn are connected to the inputs \(I_i\) to avoid glitches, and the outputs \(O_i\) are connected to the primary outputs \(\mathcal {O}\) and registers in \(\mathcal {R}_{s}\). We also extend function \(\texttt {g}_i\) such that \(\texttt {g}_i(r)\) is an identity function for every register \(r\in \mathcal {R}\).

A state \(\boldsymbol{s}:\mathcal {R}_s\rightarrow \mathbb {B}\) of \(\mathcal {S}[k]\) is a valuation of the registers \(\mathcal {R}_s\). In each clock cycle \(i\in [k]\), given a state \(\boldsymbol{s}_{i-1}\) and primary input signals \(\boldsymbol{x}_i\), the next state \(\boldsymbol{s}_i\) is \(\llbracket C_i\rrbracket (\boldsymbol{s}_{i-1},\boldsymbol{x}_i)\) projected onto \(\mathcal {R}_s\), while \(\llbracket C_i\rrbracket (\boldsymbol{s}_{i-1},\boldsymbol{x}_i)\) projected onto \(\mathcal {O}\) gives the primary output signals \(\boldsymbol{y}_i\), written as \(\boldsymbol{s}_{i-1}{\mathop {\longrightarrow }\limits ^{\boldsymbol{x}_i|\boldsymbol{y}_i}}\boldsymbol{s}_{i}\).

Given a sequence of primary input signals \((\boldsymbol{x}_{1},\cdots ,\boldsymbol{x}_{k})\), a run \(\rho \) of the circuit \(\mathcal {S}[k]\) is a sequence

$$\boldsymbol{s}_0{\mathop {\longrightarrow }\limits ^{\boldsymbol{x}_1|\boldsymbol{y}_1}}\boldsymbol{s}_1{\mathop {\longrightarrow }\limits ^{\boldsymbol{x}_2|\boldsymbol{y}_2}}\boldsymbol{s}_2{\mathop {\longrightarrow }\limits ^{\boldsymbol{x}_3|\boldsymbol{y}_3}}\boldsymbol{s}_3{\longrightarrow }\cdots {\longrightarrow }\boldsymbol{s}_{k-1}{\mathop {\longrightarrow }\limits ^{\boldsymbol{x}_k|\boldsymbol{y}_k}}\boldsymbol{s}_k,$$

where \((\boldsymbol{y}_{1},\cdots ,\boldsymbol{y}_{k})\) is the sequence of primary output signals. The circuit \(\mathcal {S}[k]\) can also be seen as a Boolean function \(\llbracket \mathcal {S}[k]\rrbracket :(\mathbb {B}^{|\mathcal {I}|})^k\rightarrow (\mathbb {B}^{|\mathcal {O}|})^k\) such that \(\llbracket \mathcal {S}[k]\rrbracket (\boldsymbol{x}_{1},\cdots ,\boldsymbol{x}_{k})\) is the sequence of primary output signals for a sequence of primary input signals \((\boldsymbol{x}_{1},\ldots ,\boldsymbol{x}_{k})\).
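The run of Definition 2 can be sketched as a simple loop. In this sketch each combinational circuit \(C_i\) is abstracted as a Python function from (state, input) to (next state, output); this abstraction, together with the names `run` and `acc`, is an assumption for illustration only.

```python
from typing import Callable, List, Sequence, Tuple

Bits = Tuple[int, ...]
# Hypothetical abstraction of C_i: (s_{i-1}, x_i) -> (s_i, y_i).
Step = Callable[[Bits, Bits], Tuple[Bits, Bits]]


def run(steps: Sequence[Step], s0: Bits,
        xs: Sequence[Bits]) -> Tuple[List[Bits], List[Bits]]:
    """Return the states (s_0, ..., s_k) and the outputs (y_1, ..., y_k)."""
    assert len(steps) == len(xs), "one input vector per clock cycle"
    states, outputs = [s0], []
    for step, x in zip(steps, xs):
        s_next, y = step(states[-1], x)  # one transition of the run
        states.append(s_next)
        outputs.append(y)
    return states, outputs


def acc(s: Bits, x: Bits) -> Tuple[Bits, Bits]:
    """Toy 1-bit accumulator: next state and output are both s xor x."""
    nxt = (s[0] ^ x[0],)
    return nxt, nxt


states, ys = run([acc, acc, acc], (0,), [(1,), (1,), (0,)])
```

The list `ys` plays the role of \(\llbracket \mathcal {S}[k]\rrbracket (\boldsymbol{x}_{1},\cdots ,\boldsymbol{x}_{k})\) for the toy circuit.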

We remark that our definition of sequential circuits is slightly different from the one given in [37], in which primary inputs can be connected to logic gates. We only allow primary inputs to connect to registers to avoid glitches which often introduce faults as well. Hence, our definition is sufficient for cryptographic circuits according to our experience while it facilitates the decomposition.

3 The Fault-Resistance Verification Problem

A fault injection attack actively injects faults into the execution of a cryptographic circuit and then infers sensitive data (such as the cryptographic key) via statistical analysis [3, 8, 9]. We refer to [21] for a general introduction. In particular, both non-invasive fault injections (i.e., clock glitches, underpowering and voltage glitches) and semi-invasive fault injections (i.e., electromagnetic pulses and laser beams) have been widely studied to compromise the security of cryptographic circuits, varying with attack cost and attack effectiveness [30]. There are detection- and correction-based countermeasures to mitigate fault injection attacks [1, 35]: the former aims to detect fault injection attacks and raise an error flag once the attack is detected, so sensitive data can be destroyed in time; the latter aims to correct faults induced by attacks and produce the desired outputs.

3.1 Security Notions

We consider the following three fault types that suffice to capture both non-invasive fault injections and semi-invasive fault injections (cf. [30, 37]):

  • bit-set fault \(\tau _{s}\): when injected on a gate g, its output becomes \(\texttt {1} \), namely, the gate g becomes the faulty gate \(g_s\), denoted by \(\tau _s(g)\);

  • bit-reset fault \(\tau _{r}\): when injected on a gate, its output becomes \(\texttt {0} \), namely, the gate g becomes the faulty gate \(g_r\), denoted by \(\tau _r(g)\);

  • bit-flip fault \(\tau _{bf}\): when injected on a gate, its output is flipped, namely, the gate g becomes the faulty gate \(\overline{g}\), denoted by \(\tau _{bf}(g)\).

Fix a circuit \(\mathcal {S}[k]=(\mathcal {I},\mathcal {O},\mathcal {R}, \boldsymbol{s}_0, \mathcal {C})\) protected using either a detection-based or correction-based countermeasure, where \(\mathcal {C}=\{C_1,\cdots , C_k\}\) and for each \(i\in [k]\), \(C_i=(V_i, I_i, O_i, E_i,\texttt {g}_i)\). We assume \(o_\texttt {flag}\in \mathcal {O}\), where \(o_\texttt {flag}\) is an error flag indicating whether a fault was detected when \(\mathcal {S}\) adopts a detection-based countermeasure. If \(\mathcal {S}\) adopts a correction-based countermeasure (i.e., no error flag is involved), we simply assume that \(o_\texttt {flag}\) is always \(\texttt {0} \). We denote by \(\textbf{B}\) the blacklist of invulnerable gates that are protected against fault injection attacks. \(\textbf{B}\) usually contains the gates used in implementing a countermeasure.

Definition 3

A fault vector on the circuit \(\mathcal {S}\) with the blacklist \(\textbf{B}\) and a set of fault types T, denoted by \(\textsf{V}(\mathcal {S},\textbf{B},T)\), is a set of fault events

$$\textsf{V}(\mathcal {S},\textbf{B},T):=\big \{ \textsf{e}(\sigma _1,\beta _1,\tau _1), \cdots , \textsf{e}(\sigma _m,\beta _m,\tau _m)\mid i\ne j\implies (\sigma _i\ne \sigma _j\vee \beta _i\ne \beta _j) \big \},$$

where each fault event \(\textsf{e}(\sigma ,\beta ,\tau )\) consists of

  • \(\sigma \in [k]\) specifying the clock cycle of the fault injection, namely, the fault injection occurs at the \(\sigma \)-th clock cycle;

  • \(\beta \in \mathcal {R}\cup V_\sigma \setminus (I_\sigma \cup O_\sigma )\) specifying the vulnerable gate on which the fault is injected (note that \(\beta \not \in \textbf{B}\));

  • \(\tau \in T\) specifying the fault type.

A fault vector \(\textsf{V}(\mathcal {S},\textbf{B},T)\) yields a faulty circuit \(\mathcal {F}(\mathcal {S},\textbf{B},T):=(\mathcal {I},\mathcal {O},\mathcal {R}, \boldsymbol{s}_0, \mathcal {C}'),\) where \(\mathcal {C}'=\{C_1',\cdots , C_k'\}\), for each \(i\in [k]\): \(C_i':=(V_i,I_i, O_i, E_i,\texttt {g}_i')\) and \(\texttt {g}_i'(\beta ):=\tau (\texttt {g}_i(\beta ))\) if \(\textsf{e}(i,\beta ,\tau )\in \textsf{V}(\mathcal {S},\textbf{B},T)\), otherwise \(C_i':=C_i\) and \(\texttt {g}_i'(\beta ):=\texttt {g}_i(\beta )\).

Intuitively, the faulty circuit \(\mathcal {F}(\mathcal {S},\textbf{B},T)\) is the same as the circuit \(\mathcal {S}\) except that for each fault event \(\textsf{e}(i,\beta ,\tau )\in \textsf{V}(\mathcal {S},\textbf{B},T)\), the gate \(\texttt {g}_i(\beta )\) is transiently replaced by its faulty counterpart \(\tau (\texttt {g}_i(\beta ))\) in the i-th clock cycle, whereas all the other gates remain the same.
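The construction of the faulty circuit can be sketched as follows. The encoding is hypothetical: each clock cycle is represented by a map from vertex names to gate functions, and a fault event is a triple \((\sigma ,\beta ,\tau )\) with \(\tau \) given as one of the strings `"s"`, `"r"`, `"bf"`. The helper names are assumptions for this example.

```python
from typing import Callable, Dict, List, Set, Tuple

GateFn = Callable[..., int]
FaultEvent = Tuple[int, str, str]  # (sigma, beta, tau), tau in {"s", "r", "bf"}


def faulty_gate(g: GateFn, tau: str) -> GateFn:
    """Return the faulty counterpart g_s, g_r or the negation of g."""
    if tau == "s":
        return lambda *args: 1            # bit-set: output stuck at 1
    if tau == "r":
        return lambda *args: 0            # bit-reset: output stuck at 0
    if tau == "bf":
        return lambda *args: 1 - g(*args)  # bit-flip: output negated
    raise ValueError(f"unknown fault type: {tau}")


def apply_fault_vector(gate_maps: List[Dict[str, GateFn]],
                       vector: Set[FaultEvent]) -> List[Dict[str, GateFn]]:
    """Replace g_i(beta) by tau(g_i(beta)) only in the cycle named by the event."""
    faulty = [dict(m) for m in gate_maps]  # copy, so the golden circuit is kept
    for sigma, beta, tau in vector:
        faulty[sigma - 1][beta] = faulty_gate(gate_maps[sigma - 1][beta], tau)
    return faulty


AND = lambda a, b: a & b
golden = [{"g": AND}, {"g": AND}]          # the same gate in two clock cycles
faulty = apply_fault_vector(golden, {(1, "g", "bf")})
```

Because the replacement is keyed by the clock cycle, the fault is transient: in the example the gate is flipped in the first cycle only.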

Definition 4

A fault vector \(\textsf{V}(\mathcal {S},\textbf{B},T)\) is effective if there exists a sequence of primary input signals \((\boldsymbol{x}_{1},\cdots ,\boldsymbol{x}_{k})\) such that two sequences of primary output signals

\(\llbracket \mathcal {S}\rrbracket (\boldsymbol{x}_{1},\cdots ,\boldsymbol{x}_{k})\) and \(\llbracket \mathcal {F}(\mathcal {S},\textbf{B},T)\rrbracket (\boldsymbol{x}_{1},\cdots ,\boldsymbol{x}_{k})\)

differ at some clock cycle before the error flag \(o_\texttt {flag}\) is set.

Otherwise, the fault vector \(\textsf{V}(\mathcal {S},\textbf{B},T)\) is ineffective and the circuit \(\mathcal {S}\) is resistant against the fault vector \(\textsf{V}(\mathcal {S},\textbf{B},T)\).

An effective fault vector results in faulty primary output signals where the fault is not successfully detected (i.e., the error flag \(o_\texttt {flag}\) is not set in time). Note that there are two possible cases for an ineffective fault vector: either \(\llbracket \mathcal {S}\rrbracket (\boldsymbol{x}_{1},\cdots ,\boldsymbol{x}_{k})\) and \(\llbracket \mathcal {F}(\mathcal {S},\textbf{B},T)\rrbracket (\boldsymbol{x}_{1},\cdots ,\boldsymbol{x}_{k})\) are the same or the fault is successfully detected.
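For a single input sequence, the condition of Definition 4 can be sketched as a scan over the clock cycles. The sketch assumes that a flag raised in the same clock cycle as a deviation counts as a successful detection; this reading of "before the error flag is set", and the function name, are assumptions of the example.

```python
from typing import Sequence, Tuple

Bits = Tuple[int, ...]


def differs_before_flag(golden: Sequence[Bits],
                        faulty: Sequence[Bits],
                        flags: Sequence[int]) -> bool:
    """True iff the faulty outputs deviate in a cycle where no flag has been raised yet.

    `flags[i]` is the value of the error flag of the faulty circuit in cycle i+1.
    """
    for y, y_faulty, flag in zip(golden, faulty, flags):
        if flag:
            return False  # detected: later deviations no longer count
        if y != y_faulty:
            return True   # undetected deviation: the input is a witness
    return False          # outputs agree in every cycle


undetected = differs_before_flag([(0,), (1,)], [(0,), (0,)], [0, 0])
detected = differs_before_flag([(0,), (1,)], [(0,), (0,)], [0, 1])
```

A fault vector is effective if this check succeeds for at least one input sequence, and ineffective if it fails for all of them.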

Inspired by the consolidated fault model [30], we define the security model for fault-resistance verification which characterizes the capabilities of the adversary.

Definition 5

A fault-resistance model for the circuit \(\mathcal {S}\) with the blacklist \(\textbf{B}\) is given by \(\mathfrak {m}(\texttt {n}_e,\texttt {n}_c,T,\ell ),\) where

  • \(\texttt {n}_e\) is the maximum number of fault events per clock cycle;

  • \(\texttt {n}_c\) is the maximum number of clock cycles in which fault events can occur;

  • \(T\subseteq \{\tau _s,\tau _r,\tau _{bf}\}\) specifies the set of allowed fault types; and

  • \(\ell \in \{\texttt {c},\texttt {r},\texttt {cr}\}\) defines vulnerable gates: \(\texttt {c}\) for logic gates in combinational circuits, \(\texttt {r}\) for registers and \(\texttt {cr}\) for both logic gates and registers.

For example, \(\mathfrak {m}(\texttt {n}_e,k,\{ \tau _{s},\tau _{r},\tau _{bf}\},\texttt {cr})\) models the strongest adversary, who can inject faults into all the gates simultaneously at any clock cycle (except for those protected in the blacklist \(\textbf{B}\)), while \(\mathfrak {m}(1,1,\{\tau _{s}\},\texttt {c})\) only allows the adversary to choose one logic gate on which to inject a bit-set fault in one chosen clock cycle.

Formally, the fault-resistance model \(\mathfrak {m}(\texttt {n}_e,\texttt {n}_c,T,\ell )\) defines the following set \(\llbracket \mathfrak {m}(\texttt {n}_e,\texttt {n}_c,T,\ell )\rrbracket \) of possible fault vectors that can be applied by the adversary:

$$\llbracket \mathfrak {m}(\texttt {n}_e,\texttt {n}_c,T,\ell )\rrbracket :=\left\{ \textsf{V}(\mathcal {S},\textbf{B}_\ell ,T) \quad \begin{array}{|c} \quad \sharp \texttt {MaxE}(\textsf{V}(\mathcal {S},\textbf{B}_\ell ,T))\le \texttt {n}_e \\ \text{ and } \\ \sharp \texttt {Clk}(\textsf{V}(\mathcal {S},\textbf{B}_\ell ,T))\le \texttt {n}_c \end{array} \right\} $$

where

  • \(\textbf{B}_\ell := \left\{ \begin{array}{ll} \textbf{B}, & \hbox {if } \ell =\texttt {cr}; \\ \textbf{B}\cup \mathcal {R}, & \hbox {if } \ell =\texttt {c}; \\ \textbf{B}\cup \bigcup _{i\in [k]} V_i\setminus (I_i\cup O_i), & \hbox {if } \ell =\texttt {r}; \\ \end{array} \right. \)

  • \(\sharp \texttt {MaxE} (\textsf{V}(\mathcal {S},\textbf{B}_\ell , T)):=\max _{\alpha \in [k]} |\{\textsf{e}(\alpha ,\beta ,\tau )\in \textsf{V}(\mathcal {S},\textbf{B}_\ell ,T)\}|\), i.e., the maximum number of fault events per clock cycle in the fault vector \(\textsf{V}(\mathcal {S},\textbf{B}_\ell ,T)\);

  • \(\sharp \texttt {Clk}(\textsf{V}(\mathcal {S},\textbf{B}_\ell ,T)):= |\{\alpha \mid \textsf{e}(\alpha ,\beta ,\tau )\in \textsf{V}(\mathcal {S},\textbf{B}_\ell ,T)\}|\), i.e., the number of clock cycles when fault events can occur.
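The membership test for \(\llbracket \mathfrak {m}(\texttt {n}_e,\texttt {n}_c,T,\ell )\rrbracket \) can be sketched directly from these definitions. Fault events are encoded as triples \((\sigma ,\beta ,\tau )\), and the blacklist \(\textbf{B}_\ell \) is passed in as a precomputed set; the function names are assumptions for this example.

```python
from collections import Counter
from typing import Set, Tuple

FaultEvent = Tuple[int, str, str]  # (sigma, beta, tau)


def max_events(vector: Set[FaultEvent]) -> int:
    """#MaxE: the largest number of fault events in any single clock cycle."""
    counts = Counter(sigma for sigma, _, _ in vector)
    return max(counts.values(), default=0)


def num_cycles(vector: Set[FaultEvent]) -> int:
    """#Clk: the number of distinct clock cycles containing a fault event."""
    return len({sigma for sigma, _, _ in vector})


def in_model(vector: Set[FaultEvent], n_e: int, n_c: int,
             allowed_types: Set[str], blacklist_l: Set[str]) -> bool:
    """Check whether a fault vector is permitted by m(n_e, n_c, T, l)."""
    # Definition 3: at most one fault event per (cycle, gate) pair.
    well_formed = len({(s, b) for s, b, _ in vector}) == len(vector)
    admissible = all(t in allowed_types and b not in blacklist_l
                     for _, b, t in vector)
    return (well_formed and admissible
            and max_events(vector) <= n_e and num_cycles(vector) <= n_c)


vector = {(1, "g1", "s"), (1, "g2", "bf"), (3, "g1", "r")}
```

For the example vector, \(\sharp \texttt {MaxE}=2\) and \(\sharp \texttt {Clk}=2\), so it is admitted by \(\mathfrak {m}(2,2,\{\tau _s,\tau _r,\tau _{bf}\},\texttt {cr})\) with an empty blacklist but not by the same model with \(\texttt {n}_e=1\).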

Definition 6

The circuit \(\mathcal {S}\) is fault-resistant against \(\mathfrak {m}(\texttt {n}_e,\texttt {n}_c,T,\ell )\), denoted by \(\langle \mathcal {S},\textbf{B}\rangle \models \mathfrak {m}(\texttt {n}_e,\texttt {n}_c,T,\ell )\), if all the fault vectors \(\textsf{V}(\mathcal {S},\textbf{B},T)\in \llbracket \mathfrak {m}(\texttt {n}_e,\texttt {n}_c,T,\ell )\rrbracket \) are ineffective.

The fault-resistance verification problem is to determine whether or not \(\langle \mathcal {S},\textbf{B}\rangle \models \mathfrak {m}(\texttt {n}_e,\texttt {n}_c,T,\ell )\).

By Definition 6, it is straightforward to show that:

Proposition 1

If \(\langle \mathcal {S},\textbf{B}\rangle \models \mathfrak {m}(\texttt {n}_e,\texttt {n}_c,T_1,\texttt {cr})\), then \(\langle \mathcal {S},\textbf{B}'\rangle \models \mathfrak {m}(\texttt {n}_e',\texttt {n}_c',T_2,\ell )\) for any \(\textbf{B}\subseteq \textbf{B}'\), \(\texttt {n}_e'\le \texttt {n}_e\), \(\texttt {n}_c'\le \texttt {n}_c\), \(T_2\subseteq T_1\), \(\ell \in \{\texttt {c},\texttt {r},\texttt {cr}\}\).

By adapting the proof of NP-completeness in [37], which reduces from the SAT problem, we can show the following.

Theorem 1

The problem of determining whether a k-clock cycle circuit \(\mathcal {S}[k]\) for any fixed \(k\ge 3\) is not fault-resistant is NP-complete.

3.2 Motivating Example

A motivating example is given in Fig. 1, which is a simplified implementation of AES with a detection-based countermeasure [1]. The circuit has three cryptographic blocks (B1, B2, B3), three redundancy blocks (RB1, RB2, RB3), two selective blocks (MUX1, MUX2) and a check block CHECK, where all the gates in the check block CHECK are added to the blacklist \(\textbf{B}\). The cryptographic blocks and the two selective blocks together implement the functionality of AES, while the others implement a detection-based countermeasure.

Fig. 1. The AES circuit.

The first round starts with a reset signal rst (i.e., rst =1) after which the primary input signals INPUT are selected by MUX1 and stored in the registers REG. Moreover, rst is set to \(\texttt {0} \). Next, the values stored in the registers REG are processed by the cryptographic and redundancy blocks. The cryptographic block B1 produces primary output signals of the current round; the results of the cryptographic block B3 and redundancy block RB3 are stored in the registers REG as inputs of the next round (called feedback). Furthermore, the values of registers and the results of all the cryptographic and redundancy blocks are fed to the check block CHECK which checks whether a fault injection attack occurs. The primary output FLAG is the error flag.

The internal rounds are the same as the first round except that the feedback from the previous round is stored in the registers, instead of the primary input signals, because the reset signal rst has been set to 0 in the first round. The last round is the same as the internal rounds except that the results of the cryptographic block B1 (resp. the redundancy block RB1) are fed to the cryptographic block B3 (resp. the redundancy block RB3) by setting the input signal sel=1 of the selective block MUX2, respectively.

To verify its fault-resistance, one can unroll it according to the clock cycle (cf. [38]), then enumerate and check the effectiveness of each possible fault vector by analyzing the unrolled and faulty counterparts via BDD [31] or SAT/SMT [37]. However, this monolithic approach has two shortcomings that hinder its efficiency and scalability. (1) One shall verify the equivalence of the primary outputs of the circuit and its faulty counterpart, which must be done for each round (unless the error flag is set). Since each round depends upon the preceding rounds, the size of the SAT/SMT formulas or BDDs usually increases dramatically with the number of rounds, which incurs a blowup in the rounds of the circuit. (2) To achieve completeness (or at least a high coverage), a large number of possible fault vectors have to be checked, which incurs a blowup in the number of fault vectors. Our work proposes a novel compositional approach to combat these two types of blowups in fault-resistance verification by decomposing the verification of an entire circuit into the verification of (typically much smaller) single-round sub-circuits.

4 Compositional Verification

In this section, we first describe the overview of our approach and our decomposition, next briefly recap two symbolic approaches (SAT/SMT- and BDD-based) for verifying sub-circuits, and finally present three acceleration techniques to improve the verification efficiency.

4.1 Overview of the Approach

Our approach relies on the structural feature of (round-based) cryptographic primitives, e.g., block ciphers, for which countermeasures are developed round-by-round accordingly, aiming to isolate the effects of fault injection in each round. Furthermore, the rounds are often similar, and many of them are even the same. For instance, the first \((k-1)\) rounds in Fig. 1 are the same except that the first round uses the primary input signals while the other (internal) rounds use the feedback from the previous round (i.e., the values stored in the registers).

Based on the above key observation, as shown in Fig. 2, given a circuit \(\mathcal {S}\), a blacklist \(\textbf{B}\) of gates on which faults cannot be injected and a fault-resistance model \(\mathfrak {m}(\texttt {n}_e,\texttt {n}_c,T,\ell )\), our approach first decomposes the circuit \(\mathcal {S}\) into single-round sub-circuits (\(S_1,\cdots , S_r\)) where each \(S_i\) for \(i\in [r]\) implements one round. As many sub-circuits are indeed identical, we only need to verify a small number of single-round sub-circuits in isolation, whereby the fault-resistance of the entire circuit \(\mathcal {S}\) is guaranteed. For instance, in the motivating example, we only need to verify the first and the k-th (i.e., last) round, because the first \((k-1)\) rounds are virtually the same. This reduces the verification of a k-round circuit to the verification of two single-round sub-circuits.

Fig. 2. Overview of our approach.

To verify each sub-circuit, we leverage two symbolic verification approaches, based on SAT/SMT and BDD. To further improve efficiency, we also study various acceleration techniques exploiting fault effects and propagation.

4.2 The Decomposition

For a k-clock cycle circuit \(\mathcal {S}[k]=(\mathcal {I},\mathcal {O},\mathcal {R}, \boldsymbol{s}_0, \mathcal {C})\) where \(\mathcal {R}=\mathcal {R}_{in}\cup \mathcal {R}_{s}\), \(\mathcal {C}=\{C_1,\cdots , C_k\}\) and \(C_i=(V_i,I_i, O_i, E_i,\texttt {g}_i)\) for each \(i\in [k]\), let r be the number of rounds of \(\mathcal {S}[k]\). An r-decomposition of \(\mathcal {S}[k]\) is \((S_1[k_1],\cdots ,S_r[k_r])\), where for every \(i\in [r]\), \(S_i[k_i]\) is a single-round, \(k_i\)-clock cycle sub-circuit \((\mathcal {I}^{(i)},\mathcal {O}^{(i)},\mathcal {R}^{(i)}, \boldsymbol{s}^{(i)}, \mathcal {C}^{(i)})\) defined as (note that \(\sum _{i \in [r]} k_i=k\))

  • \(\mathcal {I}^{(i)}=\mathcal {I}\cup \mathcal {I}_{fb}\), where \(\mathcal {I}_{fb}\) comprises additional primary inputs used for representing the signals passed from the previous round, i.e., the values stored in the state registers \(\mathcal {R}_s\) at the end of the \((i-1)\)-th round;

  • \(\mathcal {O}^{(i)}=\mathcal {O}\cup \mathcal {O}_{fb}\), where \(\mathcal {O}_{fb}\) comprises additional primary outputs used for representing the signals passed to the next round, i.e., the values stored to the state registers \(\mathcal {R}_s\) at the end of the i-th round;

  • \(\mathcal {R}^{(i)}=\mathcal {R}_{in}'\cup \mathcal {R}_s'\) where \(\mathcal {R}_{in}'=\mathcal {R}_{in}\cup \mathcal {R}_s^{in}\), \(\mathcal {R}_s^{in}\subseteq \mathcal {R}_s\) comprises registers used for storing signals passed from one round to the next round, and \(\mathcal {R}_s'\subseteq \mathcal {R}_s\) comprises the registers used for connecting combinational circuits of \(\mathcal {C}^{(i)}\) (note that \(\mathcal {R}_s'\) can be \(\emptyset \) if \(k_i=1\), i.e., the round has one clock cycle);

  • \(\boldsymbol{s}^{(1)}=\boldsymbol{s}_0\) and \(\boldsymbol{s}^{(i)}\) for \(i\ge 2\) is not defined;

  • \(\mathcal {C}^{(i)}=\{C_{i,1},\cdots , C_{i,k_i}\}\) with \(C_{1,1},\cdots , C_{1,k_1},\cdots ,C_{r,1},\cdots , C_{r,k_r}=C_1,\cdots ,C_k\), and the connection between any two adjacent single-round sub-circuits via the registers \(\mathcal {R}_s'\) is the same as that in \(\mathcal {S}\);

  • the registers in \(\mathcal {R}_s^{in}\) that were connected by the outputs \(O_{i-1,k_{i-1}}\) of \(C_{i-1,k_{i-1}}\) are now connected by the additional primary inputs \(\mathcal {I}_{fb}\) if \(i\ge 2\);

  • the outputs \(O_{i,k_{i}}\) of \(C_{i,k_{i}}\) that were connected to the registers in \(\mathcal {R}_s\) are now connected to the additional primary outputs \(\mathcal {O}_{fb}\).
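The last two items of the decomposition amount to cutting the feedback edges between rounds. The following sketch shows this rewiring for a round with a single clock cycle, on a circuit given as a set of directed edges. The edge-set encoding, the `ifb_`/`ofb_` naming of the additional primary inputs and outputs, and the function name are assumptions for illustration; they are not taken from CLEAVE.

```python
from typing import Set, Tuple

Edge = Tuple[str, str]  # (source vertex, target vertex)


def cut_feedback(edges: Set[Edge], feedback_regs: Set[str],
                 has_predecessor: bool) -> Set[Edge]:
    """Rewire one single-cycle round so that it can be verified in isolation.

    Every edge that drives a feedback register is redirected to a fresh
    primary output; if the round has a predecessor, the register is instead
    driven by a fresh primary input.
    """
    rewired: Set[Edge] = set()
    for src, dst in edges:
        if dst in feedback_regs:
            rewired.add((src, f"ofb_{dst}"))      # value passed to the next round
            if has_predecessor:
                rewired.add((f"ifb_{dst}", dst))  # value passed from the previous round
        else:
            rewired.add((src, dst))
    return rewired


round_edges = {("REG", "B1"), ("B1", "B3"), ("B3", "REG")}
internal_round = cut_feedback(round_edges, {"REG"}, has_predecessor=True)
first_round = cut_feedback(round_edges, {"REG"}, has_predecessor=False)
```

In the example the feedback edge from B3 to REG disappears, so the rewired round no longer depends on any other round.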

Fig. 3. Single-round sub-circuits of the motivating example.

Two single-round sub-circuits \(S_i[k_i]\) and \(S_j[k_j]\) are isomorphic w.r.t. the blacklist \(\textbf{B}\) if they are identical up to the renaming of the primary inputs/outputs, registers and vertices in the combinational circuits, and the matched gate pairs are either both protected or both unprotected in \(\textbf{B}\). Note that this condition is much stricter than the semantic equivalence of two circuits, namely, the same input-output relation, which is insufficient for our decomposition theorem. For instance, suppose that one single-round sub-circuit correctly implements a correction-based countermeasure while the other does not implement any countermeasure. They are semantically equivalent, but both have to be verified.

Proposition 2

For any pair of isomorphic circuits \((S_i,S_j)\) and fault-resistance model \(\mathfrak {m}(\texttt {n}_e,\texttt {n}_c,T,\ell )\), \(\langle S_i,\textbf{B}\rangle \models \mathfrak {m}(\texttt {n}_e,\texttt {n}_c,T,\ell )\) iff \(\langle S_j,\textbf{B}\rangle \models \mathfrak {m}(\texttt {n}_e,\texttt {n}_c,T,\ell )\).    \(\square \)

Consider the example in Fig. 1. In this case, \(r=k\) (\(k_i=1\) for each \(i\in [k]\)). As illustrated in Fig. 3, our r-decomposition removes all the connections labeled with FeedBack, re-connects the outputs of the blocks \(\texttt {B3}\) and \(\texttt {RB3}\) that were connected to the registers REG to the additional primary outputs, and connects the additional primary inputs to the registers REG that were connected by the outputs from the previous round. Then, all the single-round sub-circuits except for the last one are isomorphic.

Theorem 2

Given a k-clock cycle circuit \(\mathcal {S}[k]=(\mathcal {I},\mathcal {O},\mathcal {R}, \boldsymbol{s}_0, \mathcal {C})\) and a blacklist \(\textbf{B}\), let \((S_1[k_1],S_2[k_2],\cdots ,S_r[k_r])\) be the r-decomposition of \(\mathcal {S}[k]\). For any fault-resistance model \(\mathfrak {m}(\texttt {n}_e,\texttt {n}_c,T,\ell )\), if \(\langle S_i,\textbf{B}\rangle \models \mathfrak {m}(\texttt {n}_e,\texttt {n}_c,T,\ell )\) for all single-round sub-circuits \(S_i\in \{S_1,S_2\cdots ,S_r\}\), then \(\langle \mathcal {S},\textbf{B}\rangle \models \mathfrak {m}(\texttt {n}_e,\texttt {n}_c,T,\ell )\).

Furthermore, if \(\texttt {n}_c\ge k_i\) for all \(i\in [r]\), then \(\langle \mathcal {S},\textbf{B}\rangle \models \mathfrak {m}(\texttt {n}_e,k,T,\ell )\).
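Proposition 2 and Theorem 2 together suggest a simple driver that verifies one representative per isomorphism class. In the following sketch, `canonical_form` (a key that is equal for isomorphic sub-circuits) and `verify_round` (a SAT/SMT- or BDD-based check of a single sub-circuit) are hypothetical callbacks supplied by the caller; how to compute such a key is not addressed here.

```python
from typing import Callable, Dict, Hashable, Sequence, TypeVar

R = TypeVar("R")


def verify_compositionally(rounds: Sequence[R],
                           canonical_form: Callable[[R], Hashable],
                           verify_round: Callable[[R], bool]) -> bool:
    """Return True if every single-round sub-circuit is fault-resistant.

    By Theorem 2 a True result implies fault-resistance of the whole circuit.
    A False result only means that some sub-circuit failed; the theorem does
    not claim the converse direction.
    """
    verdicts: Dict[Hashable, bool] = {}
    for sub in rounds:
        key = canonical_form(sub)
        if key not in verdicts:          # Proposition 2: one check per class
            verdicts[key] = verify_round(sub)
        if not verdicts[key]:
            return False
    return True


# Toy usage: nine isomorphic rounds and one distinct last round.
calls = []


def toy_verify(sub: str) -> bool:
    calls.append(sub)
    return True


ok = verify_compositionally(["round"] * 9 + ["last"], lambda s: s, toy_verify)
```

In the toy usage only two of the ten sub-circuits are actually checked, mirroring the motivating example.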

We should emphasize that the additional primary inputs \(\mathcal {I}_{fb}\), primary outputs \(\mathcal {O}_{fb}\), registers \(\mathcal {R}_s^{in}\) and their connections are crucial to guarantee that the composition \(\mathcal {S}[k]\) of the fault-resistant single-round sub-circuits \((S_1[k_1],\cdots ,S_r[k_r])\) is also fault-resistant. The fault-resistance of all the single-round sub-circuits, i.e., \(\langle S_i,\textbf{B}\rangle \models \mathfrak {m}(\texttt {n}_e,\texttt {n}_c,T,\ell )\) for \(i\in [r]\), ensures that the primary outputs \(\mathcal {O}'=\mathcal {O}\cup \mathcal {O}_{fb}\) remain the same (unless the error flag is set) for any fault vector \(\textsf{V}(\mathcal {S},\textbf{B},T)\in \llbracket \mathfrak {m}(\texttt {n}_e,\texttt {n}_c,T,\ell )\rrbracket \). It guarantees that not only the primary outputs \(\mathcal {O}\) but also the values stored to the registers \(\mathcal {R}_s^{in}\) at the end of each round remain the same (unless the error flag is set) for any fault vector \(\textsf{V}(\mathcal {S},\textbf{B},T)\in \llbracket \mathfrak {m}(\texttt {n}_e,\texttt {n}_c,T,\ell )\rrbracket \). In other words, the single-round sub-circuits are able to detect any fault injections which change the primary outputs \(\mathcal {O}\) or the values used by the next round (i.e., isolating fault effects in each round). Thus, our decomposition approach for compositional fault-resistance verification is different from previous ones used for compositional safety and equivalence checking (e.g., [15, 24, 25]).

4.3 SAT/SMT-Based Verification

We adopt the SAT/SMT-based approach used in FIRMER [37], which reduces the problem to SAT/SMT solving. Given a fault-resistance model \(\mathfrak {m}(\texttt {n}_e,\texttt {n}_c,T,\ell )\) and a (single-round) k-clock cycle circuit \(\mathcal {S}[k]=(\mathcal {I},\mathcal {O},\mathcal {R}, \boldsymbol{s}_0, \mathcal {C})\), FIRMER first encodes all the possible fault vectors into \(\mathcal {S}[k]\) by introducing additional inputs to control whether a fault is injected on a gate and which fault type is injected. This results in a controllable faulty circuit, denoted by \(\mathcal {S}_\mathfrak {m}(\texttt {n}_e,\texttt {n}_c,T,\ell )\). The fault-resistance verification of \(\mathcal {S}[k]\) is reduced to equivalence checking of \(\mathcal {S}\) and \(\mathcal {S}_\mathfrak {m}(\texttt {n}_e,\texttt {n}_c,T,\ell )\) with constraints on the additional inputs and error flag, which in turn is reduced to SAT/SMT solving. (Cf. [37] for details.)

4.4 BDD-Based Verification

We adopt the BDD-based approach used in FIVER [31]. To avoid re-construction of the BDD from scratch for each fault vector, FIVER first attaches each gate g in the circuit \(\mathcal {S}\) with a BDD \(D_g\) representing the output of the gate in \(\mathcal {S}\). Then, for each fault vector \(\textsf{V}(\mathcal {S},\textbf{B},T)\in \llbracket \mathfrak {m}(\texttt {n}_e,\texttt {n}_c,T,\ell )\rrbracket \), on a copy \(\mathcal {S}'\) of the BDD-attached circuit \(\mathcal {S}\), the BDD \(D_g\) of the gate g is revised according to each fault event \(\textsf{e}(i,g,\tau )\in \textsf{V}(\mathcal {S},\textbf{B},T)\), where the BDDs of the gates depending upon g are also revised accordingly. Finally, for each clock cycle, FIVER checks each primary output o by comparing the attached BDDs of the primary output o in the circuit \(\mathcal {S}\) and its faulty counterpart \(\mathcal {S}'\). Furthermore, some optimizations to reduce the number of considered fault vectors and improve the construction of the desired \(\mathcal {S}'\) are implemented. (Cf. [31] for details.)

4.5 Acceleration Techniques

For both SAT/SMT-based and BDD-based verification, we apply the following acceleration techniques.

Fixed Number of Fault Events. Recall that to prove fault-resistance, we consider all possible fault vectors \(\textsf{V}(\mathcal {S},\textbf{B}_\ell ,T)\) such that \(\sharp \texttt {MaxE}(\textsf{V}(\mathcal {S},\textbf{B}_\ell ,T))\le \texttt {n}_e\) and \(\sharp \texttt {Clk}(\textsf{V}(\mathcal {S},\textbf{B}_\ell ,T))\le \texttt {n}_c\). It turns out that these two conditions can be safely strengthened to “exactly \(\texttt {n}_e\) fault events in each clock cycle in which some fault event occurs” when \(\tau _s,\tau _r\in T\) and the number of vulnerable gates is more than \(\texttt {n}_e\) in each clock cycle, which reduces the number of fault vectors to be checked. Indeed, if there is an effective fault vector \(\textsf{V}(\mathcal {S},\textbf{B},T)\in \llbracket \mathfrak {m}(\texttt {n}_e,\texttt {n}_c,T,\ell )\rrbracket \) such that the number of fault events is n in some clock cycle with \(1\le n<\texttt {n}_e\), there exists a sequence of primary input signals \((\boldsymbol{x}_1,\cdots ,\boldsymbol{x}_k)\) such that \(\llbracket \mathcal {S}\rrbracket (\boldsymbol{x}_1,\cdots ,\boldsymbol{x}_k)\) and \(\llbracket \mathcal {F}(\mathcal {S},\textbf{B},T)\rrbracket (\boldsymbol{x}_1,\cdots ,\boldsymbol{x}_k)\) differ at some clock cycle before the error flag is set. We can add \((\texttt {n}_e-n)\) fault events \(\textsf{e}(i,g,\tau )\) to \(\textsf{V}(\mathcal {S},\textbf{B}_\ell ,T)\), where \(\tau \in \{\tau _s,\tau _r\}\) is chosen such that the output of the gate g under the primary input signals \((\boldsymbol{x}_1,\cdots ,\boldsymbol{x}_k)\) remains the same. Since the added fault events change no signal under these inputs, the resulting fault vector is still effective.
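The padding step of this argument can be sketched in Python. The toy single-cycle netlist and all names are hypothetical, and the helper `pad` only illustrates the construction in the proof; it is not part of any tool. Each added event forces a gate to the value it already has under the witness input, so no signal changes.

```python
# Hypothetical toy netlist in topological order: gate -> (operator, inputs).
GATES = {
    "g1": ("and", ("a", "b")),
    "g2": ("and", ("a", "b")),
    "g3": ("xor", ("g1", "c")),     # primary output
    "g4": ("xor", ("g2", "c")),     # redundant output
    "flag": ("xor", ("g3", "g4")),  # error flag
}
BLACKLIST = {"flag"}
OPS = {"and": lambda x, y: x & y, "xor": lambda x, y: x ^ y}
FAULTS = {"set": lambda v: 1, "reset": lambda v: 0, "bitflip": lambda v: v ^ 1}

def evaluate(x, faults):
    """Evaluate the circuit on input x with the given gate -> fault type map."""
    vals = dict(x)
    for g, (op, ins) in GATES.items():
        v = OPS[op](*(vals[i] for i in ins))
        vals[g] = FAULTS[faults[g]](v) if g in faults else v
    return vals

def pad(faults, x, n_e):
    """Extend `faults` to exactly n_e events without changing any signal under x."""
    vals = evaluate(x, faults)
    padded = dict(faults)
    for g in GATES:
        if len(padded) >= n_e:
            break
        if g not in padded and g not in BLACKLIST:
            # Force the gate to the value it already has in the faulty run.
            padded[g] = "set" if vals[g] == 1 else "reset"
    return padded

x = {"a": 0, "b": 0, "c": 0}
small = {"g3": "set", "g4": "set"}  # effective: output flips, flag stays 0
large = pad(small, x, 3)            # adds a reset on g1, whose value is already 0
```

Here `evaluate(x, large)` coincides with `evaluate(x, small)`, so the padded vector is effective for the same witness input. The loop can only reach \(\texttt {n}_e\) events if enough vulnerable gates exist, which is why the text requires more than \(\texttt {n}_e\) vulnerable gates per clock cycle.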

Fault Type Reduction. Let \(\mathcal {T}=\{ \tau _{s},\tau _{r},\tau _{bf}\}\). We find that \(\langle \mathcal {S},\textbf{B}\rangle \models \mathfrak {m}(\texttt {n}_e,\texttt {n}_c,\mathcal {T},\ell )\) iff \(\langle \mathcal {S},\textbf{B}\rangle \models \mathfrak {m}(\texttt {n}_e,\texttt {n}_c,\tau _{bf},\ell )\) iff \(\langle \mathcal {S},\textbf{B}\rangle \models \mathfrak {m}(\texttt {n}_e,\texttt {n}_c,\{ \tau _{s},\tau _{r}\},\ell )\), allowing us to consider only \(\{\tau _{s},\tau _{r}\}\) if \(\{ \tau _{s},\tau _{r}\}\subseteq T\) and only \(\tau _{bf}\) if \(\tau _{bf}\in T\) for any set T of fault types. Consider an effective fault vector \(\textsf{V}(\mathcal {S},\textbf{B},\mathcal {T})\in \llbracket \mathfrak {m}(\texttt {n}_e,\texttt {n}_c,\mathcal {T},\ell )\rrbracket \) and a sequence of primary input signals \((\boldsymbol{x}_1,\cdots ,\boldsymbol{x}_k)\) such that \(\llbracket \mathcal {S}\rrbracket (\boldsymbol{x}_1,\cdots ,\boldsymbol{x}_k)\) and \(\llbracket \mathcal {F}(\mathcal {S},\textbf{B},\mathcal {T})\rrbracket (\boldsymbol{x}_1,\cdots ,\boldsymbol{x}_k)\) differ at some clock cycle before the error flag is set.

  • For every fault event \(\textsf{e}(i,g,\tau _{bf})\in \textsf{V}(\mathcal {S},\textbf{B},\mathcal {T})\), if the output of the gate g at the i-th clock cycle in \(\llbracket \mathcal {F}(\mathcal {S},\textbf{B},\mathcal {T})\rrbracket (\boldsymbol{x}_1,\cdots ,\boldsymbol{x}_k)\) is flipped from \(\texttt {1} \) to \(\texttt {0} \) (resp. from \(\texttt {0} \) to \(\texttt {1} \)), \(\textsf{e}(i,g,\tau _{bf})\) can be safely replaced by \(\textsf{e}(i,g,\tau _{r})\) (resp. \(\textsf{e}(i,g,\tau _{s})\)). Thus, \(\langle \mathcal {S},\textbf{B}\rangle \models \mathfrak {m}(\texttt {n}_e,\texttt {n}_c,\{\tau _s,\tau _r\},\ell )\) entails \(\langle \mathcal {S},\textbf{B}\rangle \models \mathfrak {m}(\texttt {n}_e,\texttt {n}_c,\mathcal {T},\ell )\).

  • For every fault event \(\textsf{e}(i,g,\tau )\in \textsf{V}(\mathcal {S},\textbf{B},\mathcal {T})\) such that \(\tau \in \{\tau _s,\tau _r\}\), if the output of the gate g at the i-th clock cycle in \(\llbracket \mathcal {F}(\mathcal {S},\textbf{B},\mathcal {T})\rrbracket (\boldsymbol{x}_1,\cdots ,\boldsymbol{x}_k)\) is flipped by applying \(\textsf{e}(i,g,\tau )\), then \(\textsf{e}(i,g,\tau )\) can be safely replaced by \(\textsf{e}(i,g,\tau _{bf})\); otherwise, the output of the gate g at the i-th clock cycle in \(\llbracket \mathcal {F}(\mathcal {S},\textbf{B},\mathcal {T})\rrbracket (\boldsymbol{x}_1,\cdots ,\boldsymbol{x}_k)\) remains the same after applying \(\textsf{e}(i,g,\tau )\), so \(\textsf{e}(i,g,\tau )\) can be safely removed from \(\textsf{V}(\mathcal {S},\textbf{B},\mathcal {T})\). Thus, \(\langle \mathcal {S},\textbf{B}\rangle \models \mathfrak {m}(\texttt {n}_e,\texttt {n}_c,\tau _{bf},\ell )\) entails \(\langle \mathcal {S},\textbf{B}\rangle \models \mathfrak {m}(\texttt {n}_e,\texttt {n}_c,\mathcal {T},\ell )\).
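The two replacement rules can be sketched in Python on a toy single-cycle circuit. The netlist and the helper names `run`, `to_set_reset` and `to_bitflip` are hypothetical and serve only to illustrate the argument. Both conversions are relative to a fixed witness input and leave every signal of the faulty run unchanged.

```python
# Hypothetical toy netlist in topological order: gate -> (operator, inputs).
GATES = {
    "g1": ("and", ("a", "b")),
    "g2": ("and", ("a", "b")),
    "flag": ("xor", ("g1", "g2")),
}
OPS = {"and": lambda x, y: x & y, "xor": lambda x, y: x ^ y}
FAULTS = {"set": lambda v: 1, "reset": lambda v: 0, "bitflip": lambda v: v ^ 1}

def run(x, faults):
    """Return (pre, post): gate values before and after their faults are applied."""
    post, pre = dict(x), {}
    for g, (op, ins) in GATES.items():
        pre[g] = OPS[op](*(post[i] for i in ins))
        post[g] = FAULTS[faults[g]](pre[g]) if g in faults else pre[g]
    return pre, post

def to_set_reset(x, faults):
    """First rule: a bit-flip to 1 becomes a set, a bit-flip to 0 becomes a reset."""
    _, post = run(x, faults)
    return {g: (("set" if post[g] == 1 else "reset") if t == "bitflip" else t)
            for g, t in faults.items()}

def to_bitflip(x, faults):
    """Second rule: keep a fault as a bit-flip if it changes the gate, else drop it."""
    pre, post = run(x, faults)
    return {g: "bitflip" for g in faults if pre[g] != post[g]}

x = {"a": 1, "b": 1}
mixed = {"g1": "bitflip", "g2": "set"}  # the set on g2 is ineffective here (g2 is 1)
only_sr = to_set_reset(x, mixed)
only_bf = to_bitflip(x, mixed)
```

Note that neither conversion increases the number of fault events, so the converted vector still respects the bounds \(\texttt {n}_e\) and \(\texttt {n}_c\).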

Vulnerable Gate Reduction. If the output of a gate g is only connected to one vulnerable logic gate \(g'\not \in \textbf{B}_\ell \), then the gate g can be safely added into the blacklist \(\textbf{B}\), although no protection is required for the gate g. This is because:

  • if the output of the gate \(g'\) does not change at the i-th clock cycle after applying the fault event \(\textsf{e}(i,g,\tau )\), then the effect of the fault event \(\textsf{e}(i,g,\tau )\) terminates at the gate \(g'\), thus \(\textsf{e}(i,g,\tau )\) can be removed from any fault vector;

  • if the output of the gate \(g'\) does change at the i-th clock cycle after applying the fault event \(\textsf{e}(i,g,\tau )\), it is flipped either from \(\texttt {1} \) to \(\texttt {0} \) or from \(\texttt {0} \) to \(\texttt {1} \), and the same effect can be achieved by applying the fault event \(\textsf{e}(i,g',\tau _{bf})\), or the fault event \(\textsf{e}(i,g',\tau _{s})\) if it is flipped from \(\texttt {0} \) to \(\texttt {1} \), or the fault event \(\textsf{e}(i,g',\tau _{r})\) if it is flipped from \(\texttt {1} \) to \(\texttt {0} \).

As a result, it suffices to consider fault injections on the gate \(g'\) instead of both g and \(g'\) when \(\tau _{bf}\in T\) or \(\{\tau _s,\tau _r\}\subseteq T\), which reduces the number of vulnerable gates [37]. By a graph traversal of the circuit \(\mathcal {S}\), all the gates g whose output is only connected to one vulnerable logic gate \(g'\not \in \textbf{B}_\ell \) can be identified and then added into the blacklist \(\textbf{B}\).
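The graph traversal can be sketched as a single pass over the fan-out relation. The netlist representation (a map from each gate to its input signals), the sets `registers` and `primary_outputs`, and the function name are assumptions made for illustration; the sketch also assumes that a gate driving a primary output is never reduced. Candidates are tested against the original blacklist, so all qualifying gates are identified first and added afterwards.

```python
from collections import defaultdict

def reduce_vulnerable_gates(gates, blacklist, registers, primary_outputs):
    """Return the blacklist extended by gates that feed exactly one vulnerable logic gate."""
    fanout = defaultdict(set)
    for g, ins in gates.items():
        for src in ins:
            fanout[src].add(g)

    extra = set()
    for g in gates:
        if g in blacklist or g in primary_outputs:
            continue
        sinks = fanout[g]
        if len(sinks) == 1:
            (sink,) = sinks
            # The single successor must be a logic gate outside the blacklist.
            if sink not in registers and sink not in blacklist:
                extra.add(g)
    return set(blacklist) | extra

# Hypothetical netlist: gate (or register) -> names of its input signals.
netlist = {
    "g1": ("a", "b"),    # feeds only g2
    "g2": ("g1", "c"),   # feeds g3 and g4
    "g3": ("g2", "a"),   # feeds only the register r
    "g4": ("g2", "b"),   # drives a primary output
    "r": ("g3",),        # register
}
reduced = reduce_vulnerable_gates(netlist, set(), {"r"}, {"g4"})
```

In this example only `g1` is added: `g2` has two successors, the single successor of `g3` is a register rather than a logic gate, and `g4` drives a primary output.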

We finally remark that the above three acceleration techniques can be applied simultaneously except that we cannot fix the number of fault events if the set and reset fault types (i.e., \(\tau _s\) and \(\tau _r\)) are unavailable.

5 Implementation and Evaluation

We have implemented our approach as an open-source tool CLEAVE based on the parallel SAT solver Glucose 4.2.1 [6] and the SMT solver bitwuzla 1.0-prerelease [28], where the BDD-based compositional verification is implemented on top of FIVER, which uses the CUDD package. Given a circuit \(\mathcal {S}\) as a Verilog gate-level netlist, a blacklist \(\textbf{B}\) and a fault-resistance model \(\mathfrak {m}(\texttt {n}_e,\texttt {n}_c,T,\ell )\), CLEAVE determines whether \(\langle \mathcal {S},\textbf{B}\rangle \models \mathfrak {m}(\texttt {n}_e,\texttt {n}_c,T,\ell )\). Currently, CLEAVE directly extracts single-round sub-circuits from \(\mathcal {S}\) by enumerating all the feasible combinations of input signals of selective blocks. Each feasible combination gives one single-round sub-circuit on which fault-resistance is verified. Though several isomorphic single-round sub-circuits may be verified, the computationally expensive (GI-complete) isomorphism checking of pairs of single-round sub-circuits is avoided. For instance, the two distinct single-round sub-circuits of the circuit in Fig. 1 are extracted by fixing the signals of rst and sel to \((\texttt {1},\texttt {0})\), \((\texttt {0},\texttt {0})\) and \((\texttt {0},\texttt {1})\), respectively, where the first two pairs of signals give the same single-round sub-circuit after adding/re-connecting primary inputs/outputs and registers according to our decomposition.

Benchmarks. We use 9 VHDL implementations [1, 35] of 3 cryptographic algorithms (i.e., CRAFT, LED and AES [31]). The VHDL implementations are transformed into Verilog gate-level netlists using the Synopsys Design Compiler (version O-2018.06-SP2). The blacklists are generated according to [1, 35]. The statistics of the benchmarks are given in Table 1. The first column shows the name of the cryptographic algorithm, the maximal number of protected faulty bits per clock cycle (bi), and the type of the adopted countermeasure (D for detection-based and C for correction-based). The second column shows each single-round sub-circuit and the number of times it is used in the implementation, e.g., the 10-round AES-b1-D has two single-round sub-circuits (S1, S2) and S1 is used in 9 rounds. The other columns respectively give the size of the blacklist \(\textbf{B}\) and the numbers of primary inputs, primary outputs, gates and each specific type of gate.

We can observe that CRAFT benchmarks use both detection-based (D) and correction-based (C) countermeasures, many single-round sub-circuits are isomorphic in each implementation, the number of distinct single-round sub-circuits ranges from 1 to 3, and the number of gates in one single-round sub-circuit ranges from 1,020 to 34,351 so that the scalability of CLEAVE can be evaluated.

Table 1. Benchmark statistics.

Setup. The experiments were conducted on a machine with an Intel Xeon Gold 6342 2.80 GHz CPU, 1 TB RAM, and Ubuntu 20.04.1. Each verification task is run with a 6-hour timeout. All the SAT-based and BDD-based (compositional) verification approaches are run with eight threads, while the SMT-based (compositional) verification approaches are run with a single thread, all with their default parameters (there are no promising parallel SMT solvers for QF_BV). The verification time is given in seconds with the best one highlighted in boldface, column R reports the verification result, and column DR shows the desired verification result. Mark ✓ (resp. ✗) indicates that the circuit is fault-resistant (resp. not fault-resistant) w.r.t. the fault-resistance model.

5.1 Effectiveness of Acceleration Techniques

Recall that we present three acceleration techniques: fixed number of fault events (denoted by fe), fault type reduction (denoted by tr), and vulnerable gate reduction (denoted by gr). We denote by “no-opt” the verification without any of these acceleration techniques, and by tr\(_{sr}\) and tr\(_{bf}\) the fault type reduction that reduces to the fault types \((\tau _s,\tau _r)\) and the fault type \(\tau _{bf}\), respectively. The acceleration techniques can be combined, e.g., gr\(\cdot \) fe applies “vulnerable gate reduction” with “fixed number of fault events”. Note that tr\(_{bf}\) cannot be combined with a fixed number of fault events (i.e., no fe\(\cdot \) tr\(_{bf}\) or gr\(\cdot \) fe\(\cdot \) tr\(_{bf}\)). We evaluate all the acceleration techniques and their feasible combinations on the first single-round sub-circuits of AES-b1-D, AES-b2-D, CRAFT-b2-C, and CRAFT-b3-D.

Table 2. SAT-based verification of single-round sub-circuits.
Table 3. Results of fault-resistance verification: compositional vs. monolithic.

The results of SAT-based verification are reported in Table 2. Overall, all three acceleration techniques and their combinations improve the SAT-based verification approach (no-opt) for almost all the verification tasks, solving one timeout case and significantly reducing the verification time for the other cases. The combination gr\(\cdot \) tr\(_{bf}\) outperforms the others because encoding the bit-flip fault type needs fewer fault type selection inputs than encoding the set and reset fault types. Note that adding more acceleration techniques does not necessarily yield an improvement, e.g., gr\(\cdot \) tr\(_{sr}\) vs. gr\(\cdot \) fe\(\cdot \) tr\(_{sr}\) on AES-bi-D, because \(\sharp \texttt {MaxE}(\textsf{V}(\mathcal {S},\textbf{B}_\ell ,T))= \texttt {n}_e\) and \(\sharp \texttt {Clk}(\textsf{V}(\mathcal {S},\textbf{B}_\ell ,T))= \texttt {n}_c\) are encoded as \(\texttt {n}_e\le \sharp \texttt {MaxE}(\textsf{V}(\mathcal {S},\textbf{B}_\ell ,T))\le \texttt {n}_e\) and \(\texttt {n}_c\le \sharp \texttt {Clk}(\textsf{V}(\mathcal {S},\textbf{B}_\ell ,T))\le \texttt {n}_c\) before bit-blasting. Note also that FIRMER [37] coincides with CLEAVE when only gr is enabled. Due to space limitations, the results of SMT- and BDD-based verification are reported elsewhere [38], from which the same conclusion can be drawn. Thus, hereafter, we adopt the combination of acceleration techniques gr\(\cdot \) tr\(_{bf}\).

5.2 Evaluation of Compositional Verification

To evaluate our compositional approach, we compare it with the monolithic one, both of which adopt the combination of acceleration techniques gr\(\cdot \) tr\(_{bf}\).

The results are reported in Table 3. Overall, our compositional reasoning is very effective, allowing CLEAVE to verify fault-resistance of almost all the benchmarks while their monolithic counterparts often run out of time. For instance, the monolithic BDD-based approach fails to verify any of the benchmarks due to the huge number of BDD variables. Indeed, the maximal number of rounds that it can handle is 2 (cf. [38] for details).

In contrast, the compositional reasoning can verify all the benchmarks, except for AES-b2-D and CRAFT-b3-D where even the single-round sub-circuit cannot be verified by the BDD-based approach. For SAT/SMT-based verification, the compositional reasoning takes significantly less time than its monolithic counterpart. Note that the performance gap between the SAT/SMT- and BDD-based approaches is mainly because we use the parallel SAT solver Glucose (8 threads) versus the sequential SMT solver bitwuzla, and because there is a cost for building (several) BDDs.

6 Related Work

Equivalence and safety checking play an essential role in the design of circuits. Various SAT/SMT-based approaches (e.g., [7, 10,11,12, 22]) and BDD-based approaches (e.g., [13, 14, 17, 29]) have been studied. They are orthogonal to our work and cannot be directly applied to check fault-resistance.

Due to the prevalence of fault injection attacks, there are studies on finding effective fault vectors or checking the effectiveness of fault vectors provided by users, e.g., [4, 23, 33]. However, it is virtually impossible to enumerate all the possible fault vectors and valid inputs in practice, thus these approaches are limited in efficiency and scalability. To mitigate these issues, the BDD-based approach FIVER [31] was proposed, which does not need to explicitly enumerate all the possible valid inputs, but still has to explicitly enumerate all the possible fault vectors. Very recently, the SAT/SMT-based approach FIRMER [37] was proposed to implicitly encode all the possible fault vectors into SAT/SMT formulas, so that no explicit enumeration is required for either the possible fault vectors or the valid inputs. However, both approaches often fail to verify the entire circuit under all the possible fault vectors and valid inputs. Our compositional approach circumvents the verification of the entire circuit of a large size, and can significantly boost both SAT/SMT-based and BDD-based verification approaches with novel acceleration techniques.

Compositional reasoning is a powerful divide-and-conquer approach for addressing the state-explosion problem. Hence, various compositional reasoning techniques and methods have been investigated, e.g., [19, 20, 25, 27], for safety, equivalence and side-channel security verification. Our compositional reasoning relies on the structural feature of (round-based) cryptographic circuits and the fault-resistance verification problem, thus is different from the prior ones.

Synthesis techniques have been proposed to repair flaws (e.g., [18, 32, 40]). However, they do not provide security guarantees (e.g., [32, 40]) or are limited to one specific type of fault injection attack (e.g., clock glitch in [18]) and thus may still be vulnerable to other fault injection attacks.

7 Conclusion

We have proposed the first compositional reasoning approach, which decomposes the fault-resistance verification of a whole round-based cryptographic circuit into that of single-round sub-circuits. To efficiently verify single-round sub-circuits, we have proposed various acceleration techniques and studied both SAT/SMT-based and BDD-based approaches. We have implemented our approach in an open-source tool CLEAVE and extensively evaluated it on a set of realistic cryptographic circuits. The experimental results show that our compositional approach and acceleration techniques can significantly improve all the SAT/SMT-based and BDD-based verification approaches, outperforming the state-of-the-art baselines.