1 Introduction

Post-quantum cryptography has gained much attention in recent years, largely because of the National Institute of Standards and Technology (NIST) call for proposals for post-quantum schemes (signature schemes and key encapsulation mechanisms). The process is currently in its third round, and the remaining signature candidates are Picnic, Falcon, Rainbow, CRYSTALS-Dilithium, GeMSS, and SPHINCS+.

The security of these schemes relies on different mathematical problems, so a scheme can be broken if an attacker finds a way to exploit a weakness in the underlying mathematics and thereby recovers sensitive information. Moreover, there are attacks whose main target is the implementation of the scheme rather than its mathematics; such attacks are called side-channel attacks. One of them is the cold boot attack, whose idea, briefly, is to fetch sensitive data from the memory of an electronic device.

This paper presents a general procedure by which an attacker may recover a block cipher secret key after procuring a noisy version of the key via a cold boot attack. More specifically, we describe a method that exploits key enumeration algorithms and a well-known quantum algorithm, namely Grover's algorithm. Also, we show how to implement the quantum component of our algorithm for several block ciphers, namely AES, PRESENT, GIFT, and LowMC. Furthermore, we give a use case in which Picnic (a third-round signature scheme in the NIST process) is evaluated in the cold boot attack setting, focusing on its current reference implementation. To the best of our knowledge, this is the first paper evaluating this signature scheme in the cold boot attack setting. In this case study, we further detail our key-recovery method for Picnic private keys, providing a detailed analysis of its resource costs, its running time, and its success rates for all Picnic parameter sets.

This paper is structured as follows. In Section 2, we present background material about cold boot attacks, the model we assume for studying cold boot attacks on cryptographic schemes, a literature review of previous works on cold boot attacks on cryptographic algorithms, and background material on quantum computing. Section 3 gives a high-level view of the key-recovery problem in the cold boot attack setting. In Section 4, we present our hybrid key-recovery method. In particular, Section 4.2 describes our key-recovery strategy combined with Grover's quantum algorithm (a.k.a. the hybrid attack), its running time, and its costs in terms of resources for several block ciphers. In Section 5, we concentrate on Picnic, particularly on its key-generation algorithm and implementation, providing a detailed description of how to apply our algorithm to LowMC in the context of Picnic. Lastly, Section 6 closes with our final comments on the paper, highlighting some future research directions.

2 Background

In this section, we will present background material about cold boot attacks, the model we assume for studying cold-boot attacks on cryptographic schemes, a literature review of previous works about cold boot attacks on cryptographic algorithms, background material on quantum computing, and lastly, a general strategy to tackle the key-recovery problem in the cold boot attack setting.

2.1 Cold boot attacks

A cold boot attack is a kind of data remanence attack by which an adversary may fetch sensitive data from an electronic device’s main memory after the device has supposedly deleted that data. This attack vector exploits the data remanence property of Dynamic RAM (DRAM), through which an adversary might recover readable memory content after the device’s power has been off for a while. The attack vector, introduced in [1], has been explored extensively against multiple cryptographic schemes, as we discuss in Section 2.3. In this setting, an adversary with physical access to a device might retrieve chunks of memory content from it by cold-rebooting it [1,2,3]. In general terms, the adversary forces the operating system to shut down abruptly, skipping the tasks that typically execute during a normal shutdown, such as file system synchronization. The adversary may then boot a lightweight operating system from an external disk to copy the contents of pre-boot DRAM to a file. Alternatively, the attacker may remove the physical memory modules from the device (if possible), place them in an adversary-controlled device, and again run a lightweight operating system to copy chunks of memory content from these modules to an external drive. Because of physical effects on the main memory, the memory bits undergo a deterioration process once the device’s power is off, by which some bits flip: some 0 bits of the original content change to 1 bits and vice versa. Therefore the data extracted from the target device’s main memory will differ recognizably from the original memory data.

Previous works [1,2,3] point out that an attacker can decelerate the bit-degradation process by spraying a chemical product, such as liquid nitrogen, onto the memory modules (that is, spraying cold compressed liquid onto the modules may preserve the original bit states for a prolonged period). Nonetheless, the attacker still has to extract the memory content before recovering any important information from the target device’s main memory. To extract chunks of memory, the attacker has to handle several possible issues. On rebooting, the initial boot process may overwrite chunks of memory with its own code and data, although the overwritten chunks are normally small. Moreover, the initial boot process might execute a destructive memory check, yet this check can often be bypassed. In particular, the attacker may use memory-imaging tools to produce correct dumps of memory contents to an external device, as reported in [1,2,3]. These tools consume trivial amounts of RAM and are usually placed in memory in such a way that they do not affect the data of interest. If the attacker cannot force the device to boot memory-imaging tools, the attacker can remove the memory modules, place them in a compatible device, and copy the content to an external disk, as mentioned by the authors of [1].

Once the attacker extracts some memory content, the attacker has to profile the content to estimate the bit-flipping probabilities, that is, the probability of a 1 flipping to 0 and the probability of a 0 flipping to 1. According to the experimental results reported in [1], almost all memory bits tend to decay to predictable “ground” states, with only a small fraction flipping in the opposite direction. Additionally, the authors of [1] mention that the probability of a bit flipping in the opposite direction stays constant and is very small (circa 0.01) as time elapses, while the probability of a bit decaying to the ground state increases over time. These results suggest that the attacker can model the decay in a portion of memory as a binary asymmetric channel, i.e., we can assume that, at a given time, the probability of a 1 to 0 flip is one fixed number and the probability of a 0 to 1 flip is another fixed number. Note that by reading and counting the number of 0 bits and 1 bits, the attacker can discover the ground state of a specific memory region. Additionally, the attacker can estimate the bit-flipping probabilities by comparing the bit count of original content in a memory region with its corresponding noisy version.
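To make the profiling step concrete, the following sketch shows how the two flip probabilities could be estimated by comparing a memory region whose original content is known with its dumped, noisy counterpart. This is an illustrative Python sketch under our own assumptions (the file names and function names are hypothetical), not tooling from [1,2,3].

```python
# Sketch: estimating the bit-flip probabilities of the binary asymmetric channel
# from a memory region whose original content is known (illustrative only).

def bits(data: bytes):
    """Yield the bits of a byte string, most significant bit first."""
    for byte in data:
        for i in range(7, -1, -1):
            yield (byte >> i) & 1

def estimate_flip_probabilities(original: bytes, noisy: bytes):
    """Return (P(0 -> 1), P(1 -> 0)) estimated by counting flipped positions."""
    n0 = n1 = flips01 = flips10 = 0
    for o, n in zip(bits(original), bits(noisy)):
        if o == 0:
            n0 += 1
            flips01 += (n == 1)
        else:
            n1 += 1
            flips10 += (n == 0)
    alpha = flips01 / n0 if n0 else 0.0
    beta = flips10 / n1 if n1 else 0.0
    return alpha, beta

# Example (hypothetical files): a region decaying towards the 0 ground state.
# original = open("region_known.bin", "rb").read()
# noisy    = open("region_dumped.bin", "rb").read()
# alpha, beta = estimate_flip_probabilities(original, noisy)
```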

Finding encryption keys after procuring memory content is another challenge that the attacker has to address. This problem has been extensively discussed in [1] for Advanced Encryption Standard (AES) and RSA keys in memory images. Even though the algorithms presented in [1] are scheme-specific, their algorithmic rationale can easily be adapted to devise key-finding algorithms for other schemes. These algorithms search for specific secret-key-identifying characteristics of the secret-key in-memory formats, used as identifying labels for sequences of bytes. More precisely, they search for byte sequences with low Hamming distance to these identifying labels and verify that the remaining bytes in a candidate sequence satisfy certain conditions. Once these issues are dealt with, the attacker obtains a noisy version of the original secret key from the memory image. Hence the attacker’s ultimate goal is to reconstruct the original private key from its noisy version with the help of public cryptographic data associated with the target key.

The study of cold boot attacks on cryptographic algorithms has therefore focused on developing key-recovery algorithms that efficiently and effectively reconstruct a secret key from its noisy version with the help of associated public cryptographic data for a target cryptosystem, and on evaluating the robustness and noise tolerance of these key-recovery algorithms.

2.2 Cold boot attack model

Based on our previous discussion of cold boot attacks, we assume an attacker knows the data structures used to store the private key in memory and has noise-free access to the corresponding public parameters. We also suppose such an attacker procures a noisy version of the target private key. We note that locating the memory region that stores the private key is required to carry out this attack in practice and may be taken care of by applying several key-finding algorithms [1,2,3]. Therefore, the adversary’s main objective is to reconstruct the original private key.

We denote by \(\alpha = P(0 \rightarrow 1)\) the probability of a 0 to 1 bit flip (a 0 bit in the bit representation of the private key changes to a 1 bit). Moreover, we denote by \( \beta = P(1 \rightarrow 0)\) the probability of a 1 to 0 bit flip (viz. a 1 bit in the bit representation of the secret key changes to a 0 bit). Furthermore, based on the experimental results obtained in [1,2,3], we assume one of these values is very small (approximately 0.001) and does not vary over time, while the other increases over time. As stated in preceding works on cold boot attacks [1,2,3], such an attacker may estimate both α and β by comparing original content with its corresponding noisy version (using the public key), and we assume both remain fixed across the memory region that stores the private key.

2.3 Literature review

In this section, we present a review of previous works on cold boot attacks on cryptographic schemes. In particular, we first describe cold boot attacks on RSA, then cold boot attacks on discrete-logarithm-based schemes, then on symmetric-key schemes, and finally on post-quantum schemes.

2.3.1 RSA setting

The research paper by Heninger and Shacham [4] is the first work dealing with this class of attacks on RSA keys. They introduce a key-recovery algorithm which relies on Hensel lifting and exploits the redundancy found in the popular RSA secret-key in-memory format. The authors of [5] and the authors of [6] improve on this work, and both papers exploit the mathematical structure on which RSA relies. Furthermore, the research paper [6] further concentrates on the error channel’s asymmetric nature, which is intrinsically connected to the cold boot setting, analysing the key-recovery problem from an information-theoretic perspective.

2.3.2 Discrete logarithm setting

The authors of [7] were the first to look into this attack in the discrete logarithm setting. This work pays particular attention to recovering the secret key x given the public key \(g^{x}\), where g is a field element and x is a positive integer. Their model assumes the attacker has access to the public key \(g^{x}\) and the noisy version of the private key x, as well as knowledge of an upper bound on the number of errors found in the noisy version of the secret key. Since their algorithm assumes knowing such an upper bound (hardly achievable) and exploits small redundancy in the secret-key format, it does not perform well in recovering keys if these keys are susceptible to considerable levels of noise.

A follow-up work by Poettering and Sibborn [8] also explores this attack in the discrete logarithm setting, more concretely in the elliptic curve cryptography (ECC) setting. Their work is practical since it centers on two implementations for elliptic curve cryptography. In particular, this work exploits redundancy present in two secret key in-memory formats from two popular ECC implementations from Transport Layer Security (TLS) libraries. They develop a dedicated key-recovery algorithm in the bit-flipping model for each studied memory representation, showing better results than the preceding work.

2.3.3 Symmetric key setting

Regarding the feasibility of cold boot attacks against symmetric-key primitives, several papers have already explored this class of attacks against some prominent block ciphers. First, the paper by Albrecht and Cid [9] concentrates on the recovery of symmetric encryption keys by employing polynomial system solvers. In particular, they use integer programming techniques and apply them to the recovery of Serpent block cipher secret keys, and also introduce a dedicated key-recovery algorithm for Twofish secret keys. Furthermore, the paper by Kamal and Youssef [10] introduces key-recovery algorithms based on SAT-solving techniques to tackle the same problem. We refer the interested reader to [9,10,11] for more details.

2.3.4 Post-quantum setting

Regarding the feasibility of performing this attack against post-quantum cryptosystems, several research papers have already carried out cold boot attacks on post-quantum schemes. First, the work of [12] explores this attack against NTRU. It focuses on two existing NTRU implementations, the ntru-crypto implementation and the tbuktu/Bouncy Castle Java implementation. For each in-memory format analysed in the paper, a dedicated key-recovery algorithm is presented and tested in the bit-flipping model. One of their key-recovery algorithms may recover the private key for a small, fixed α and a varying β ranging from 1% up to 9%. A follow-up work by Villanueva-Polanco [13] expands on the previous results and presents a general key-recovery strategy via key enumeration, which is successfully applied to recover BLISS private keys. Another paper by Villanueva-Polanco [14] adjusts the previous key-recovery strategy to successfully recover LUOV private keys, exploiting the fact that a LUOV private key is derived from a 256-bit string. Additionally, these ideas are applied to tackle the key-recovery problem for toy parameters of Rainbow and the McEliece public-key encryption scheme [15]. Another recent paper [16] extends these ideas to successfully recover Supersingular Isogeny Key Encapsulation (SIKE) mechanism private keys. Furthermore, the authors of [17] explore cold boot attacks on post-quantum cryptographic schemes based on the ring and module variants of the Learning with Errors (LWE) problem. Their work concentrates on the Kyber key encapsulation mechanism (KEM) and the NewHope KEM, for which they present dedicated key-recovery algorithms to tackle both cases in the bit-flipping model.

2.4 Quantum background

Quantum registers are qubit strings whose length determines the amount of information that they can store. In superposition, each qubit in the register is in a superposition of |0〉 and |1〉, and consequently, a register of n qubits is in a superposition of all 2^n possible bit strings represented by n “classical” bits.

As with single qubits, the squared absolute value of the amplitude associated with a given bit string is the probability of observing that bit string upon collapsing the register to a classical state.

2.4.1 Quantum gates

In classical computing, binary values, as stored in a register, pass through logic gates that, given a certain binary input, produce a certain binary output. Mathematically, classical logic gates are described as Boolean functions. Quantum logic gates present a certain similarity with classical gates. When a quantum logic gate is applied to a quantum register, it maps the current state to another state; gates are applied in sequence, transforming the state until it reaches the final, measured state.

There are several quantum gates, each with a specific function. In this work, we use the 1qClifford, CNOT, and Toffoli gates. For more details about gates and quantum computing, see [18].

Remark 1

Since quantum operations are inherently reversible, we can represent them by unitary matrices. For a computation to be reversible, its output must contain sufficient information to reconstruct the input, i.e., no input information is erased. The only exception is measurement: the collapse of the state upon measurement is the only non-unitary operation in quantum computing.

3 A framework to key recovery

According to the results by Villanueva-Polanco [19], the key-recovery problem in the cold boot attack setting can be coped with through key-enumeration techniques. We now present the key idea from that paper.

Let us assume that \(\widetilde {\texttt {k}}=\widetilde {\texttt {k}}_{0}\widetilde {\texttt {k}}_{1}\widetilde {\texttt {k}}_{2} {\cdots } \widetilde {\texttt {k}}_{W-1}\) represent the noisy bit-string of a key of bit-length W obtained via a cold boot attack. This bit string can be written as a sequence of \(\mathcal {N}=W/w\) chunks, where each chunk is of length w bits, i.e. \(\widetilde {\texttt {k}}=\widetilde {\texttt {K}}^{0}\widetilde {\texttt {K}}^{1}\widetilde {\texttt {K}}^{2} {\cdots } \widetilde {\texttt {K}}^{W/w-1}\) with \(\widetilde {\texttt {K}}^{i}=\widetilde {\texttt {k}}_{i \cdot w}\widetilde {\texttt {k}}_{i \cdot w+1} {\ldots } \widetilde {\texttt {k}}_{(i+1)\cdot w-1}\).

Let us assume we can generate full key candidates c for the original secret key encoding. By Bayes’s theorem, the probability of c being the correct full key candidate given the noisy version \(\widetilde {\texttt {k}}\) is \(\mathbf {P}(\texttt {c}\lvert \widetilde {\texttt {k}})=\frac {\mathbf {P}(\widetilde {\texttt {k}}\lvert \texttt {c})\mathbf {P}(\texttt {c})}{\mathbf {P}(\widetilde {\texttt {k}})}\). Thus the maximum likelihood estimation method suggests choosing c to maximise \(\mathbf {P}(\texttt {c} \lvert \widetilde {\texttt {k}})\). Note that both \( \mathbf {P}(\widetilde {\texttt {k}})\) and P(c) are constants, so maximising it is equivalent to maximising \(\mathbf {P}(\widetilde {\texttt {k}}\lvert \texttt {c})=(1-\alpha )^{n_{00}}\alpha ^{n_{01}}\beta ^{n_{10}}(1-\beta )^{n_{11}},\) where \(n_{00}\) counts the positions in which both c and \(\widetilde {\texttt {k}}\) contain a 0 bit, \(n_{01}\) counts the positions in which c contains a 0 bit and \(\widetilde {\texttt {k}}\) contains a 1 bit, and so on; or, equivalently, to choosing the c that maximises \(\log (\mathbf {P}(\widetilde {\texttt {k}}\lvert \texttt {c}))\). Therefore each candidate can be assigned a score, viz. \(S(\texttt {c},\widetilde {\texttt {k}}):=\log (\mathbf {P}(\widetilde {\texttt {k}}\lvert \texttt {c}))\).

Let us assume that the full key candidates c are written as a sequence of chunks, as for \(\widetilde {\texttt {k}}\), i.e. \(\texttt {c}=\texttt {C}^{0}|| \texttt {C}^{1}|| \ldots || \texttt {C}^{\mathcal {N}-1}\), where each Ci is a w-bit string; then we may also assign a score \(S(\texttt {C}^{i},\widetilde {\texttt {K}}^{i})\) to each of the at most 2^w values for a chunk candidate Ci. Since \({ S(\texttt {c},\widetilde {\texttt {k}}) ={\sum }_{i=0}^{\mathcal {N}-1}S(\texttt {C}^{i},\widetilde {\texttt {K}}^{i} )}\), we can build \(\mathcal {N}\) lists of chunk candidates, each containing up to 2^w entries. More concretely, each list contains at most 2^w 2-tuples of the form (score, value), where the first component score is a real number (the candidate’s score) and the second component value is a w-bit string (the candidate’s value). Now note that the original key-recovery problem reduces to an enumeration problem that consists in traversing the lists of chunk candidates to produce full key candidates c whose total scores are obtained by summation. This enumeration problem has been studied previously in the side-channel analysis literature [19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34], and many of those algorithms are useful for our key-recovery setting, in particular those enumerating full key candidates in descending order of score.
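For illustration, the following Python sketch computes the per-chunk score as the log-likelihood under the binary asymmetric channel and builds the lists of (score, value) pairs. It is a minimal rendering of the idea above, assuming w-bit chunks and previously estimated α and β; it is not the implementation used in the paper.

```python
import math

def chunk_score(candidate: int, noisy: int, w: int, alpha: float, beta: float) -> float:
    """log P(noisy chunk | candidate chunk) under the binary asymmetric channel."""
    score = 0.0
    for i in range(w):
        c = (candidate >> i) & 1
        k = (noisy >> i) & 1
        if c == 0:
            score += math.log(alpha if k == 1 else 1.0 - alpha)
        else:
            score += math.log(1.0 - beta if k == 1 else beta)
    return score

def build_chunk_lists(noisy_key_bits: str, w: int, alpha: float, beta: float):
    """One list per w-bit chunk; each entry is (score, value), best scores first."""
    assert len(noisy_key_bits) % w == 0
    lists = []
    for i in range(0, len(noisy_key_bits), w):
        noisy_chunk = int(noisy_key_bits[i:i + w], 2)
        entries = [(chunk_score(c, noisy_chunk, w, alpha, beta), c) for c in range(2 ** w)]
        entries.sort(reverse=True)   # highest-scoring chunk candidates first
        lists.append(entries)
    return lists
```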

After acquiring the lists of chunk candidates, one can feed them into a “search” algorithm to find the correct key. The search can be purely classical or combined classical-and-quantum; in the latter case, it is possible to use Grover’s algorithm. However, as we will see in Section 4.2, Grover’s algorithm requires an oracle, and the oracle needs a quantum circuit for the underlying block cipher. In this regard, the attack becomes tailored to a specific implementation.

4 Recovering secret keys via a cold boot attack

In this section, we present our hybrid key-recovery method. We will first describe Grover’s algorithm and how an attacker can use it to search for a block cipher key, and then present our key-recovery method, its general running time, and its costs in terms of resources.

4.1 Grover’s algorithm

Grover’s algorithm [35] is one of the most popular quantum algorithms among cryptographers. This algorithm provides a quadratic speedup for searching an element such as a key in a keyspace. In the following, we define the search problem:

Definition 1

For N = 2^n, we are given a function f : {0,1}^n → {0,1} which assumes the value 0 for almost all inputs. The goal is to find an x such that f(x) = 1.

In the classical setting, one needs Θ(N) queries to find x (the exact number of queries depends on the randomness in the search). In the quantum setting, that is, using Grover’s algorithm, one needs \(O(\sqrt {N})\) queries. Algorithm 1 gives a high-level abstraction of Grover’s algorithm.

Algorithm 1: Grover’s algorithm on a list with n elements (on a high level).
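The following statevector simulation illustrates the structure of Algorithm 1 on a toy scale: prepare the uniform superposition, then repeat the oracle phase flip and the diffusion (inversion about the mean) roughly \(\frac{\pi}{4}\sqrt{N}\) times before measuring. It is a classical simulation for intuition only, not a circuit-level implementation.

```python
import math
import numpy as np

def grover_search(n_qubits: int, marked: int) -> int:
    """Statevector simulation of Grover search for a single marked element."""
    N = 2 ** n_qubits
    state = np.full(N, 1.0 / math.sqrt(N))           # uniform superposition
    iterations = int(round(math.pi / 4 * math.sqrt(N)))
    for _ in range(iterations):
        state[marked] *= -1.0                        # oracle: phase flip where f(x) = 1
        mean = state.mean()
        state = 2.0 * mean - state                   # diffusion: inversion about the mean
    return int(np.argmax(np.abs(state) ** 2))        # measurement (most likely outcome)

# Example: search a 10-qubit space (N = 1024) for the element 0x2A5.
# print(grover_search(10, 0x2A5))
```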

4.1.1 Key search for a block cipher

Grover’s algorithm can be used to search for a key in a key space. However, the attacker first needs to define the Boolean function f that Grover’s oracle will use. A general definition can be found in [37] and reads as follows:

Definition 2

Let \(\mathcal {E}=(\texttt {E}, \texttt {D})\) be a block cipher defined over \((\mathcal {K}, \mathcal {X})\), where \(\mathcal {K}=\{0,1\}^{W}\) and \(\mathcal {X}=\{0,1\}^{n}\). We denote by \(\texttt {E}_{\texttt {k}}(m) \in \{0,1\}^{n}\) the encryption of message block m ∈ {0,1}^n under key k. Suppose we are given \(n_{p}\) plaintext-ciphertext pairs \((m_{i},c_{i})\) with \(c_{i} = \texttt {E}_{\texttt {k}}(m_{i})\). The goal is to apply Grover’s algorithm to find the unknown key k by defining the function f as

$$ f(\texttt{k}) = \begin{cases} 1 & \text{if } \texttt{E}_{\texttt{k}}(m_{i}) = c_{i} \text{ for all } 1 \leq i \leq n_{p},\\ 0 & \text{otherwise. } \end{cases} $$
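Classically, the function f of Definition 2 is simply a test that re-encrypts the known plaintexts under a candidate key. The sketch below instantiates it with AES-128 via the PyCryptodome library purely for illustration; the choice of library and the pair format are our assumptions, and the f actually used by Grover’s algorithm is a quantum circuit rather than this classical test.

```python
# Classical analogue of Definition 2's f, instantiated with AES-128 via
# PyCryptodome (an assumption for illustration; the paper's f is a quantum oracle).
from Crypto.Cipher import AES

def make_f(pairs):
    """pairs: list of (plaintext, ciphertext) 16-byte blocks produced under the unknown key."""
    def f(key_candidate: bytes) -> int:
        cipher = AES.new(key_candidate, AES.MODE_ECB)
        return int(all(cipher.encrypt(m) == c for m, c in pairs))
    return f

# Usage sketch:
# f = make_f([(m1, c1), (m2, c2)])
# f(some_16_byte_key)  # 1 iff the candidate encrypts every m_i to c_i
```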

4.2 Our key-recovery algorithm

Throughout this section, we present a key-recovery method that combines key enumeration algorithms and Grover’s algorithm. The first version of this set of algorithms was introduced in [38] in the context of side-channel attacks and has recently been adjusted for use in the cold boot attack setting on the Supersingular Isogeny Key Encapsulation (SIKE) mechanism [16].

Here we adapt it to recover a block cipher secret key sk from its noisy version procured via a cold boot attack. Let us assume that a cold boot attacker has access to a noisy version \(\widetilde {\texttt {k}}\) of a secret key \(\texttt {sk} \in \mathcal {K}\) and a pair \((\texttt {m},\texttt {c}) \in \mathcal {X}\times \mathcal {X}\) such that \(\texttt {E}_{\texttt {sk}}(\texttt {m}) = \texttt {c}\), and has estimated the values of α and β by comparing original content with its corresponding noisy version, as suggested by preceding works on cold boot attacks [1,2,3]. The attacker’s goal is to recover sk.

Recall from our discussion in Section 3 that we can assign a score to each chunk candidate for a chunk by using the function S. Let W be the length of \(\widetilde {\texttt {k}}\) in bits, w be the length of a chunk in bits with w dividing W, η be a positive integer dividing \(\mathcal {N}=W/w\), and let μ be a positive integer. Algorithm 2 creates lists of chunk candidates on inputs \(\widetilde {\texttt {k}}, W, w, \eta , \mu \). The function toWeight, on input s, returns a weight (a positive integer), as suggested in [38]. Algorithm 2 makes use of the optimal key enumeration algorithm (OKEA) [19] to get the μ highest-scoring chunk candidates for the block of chunks from iη through iη + η − 1, for \(i=0,1,\ldots , \mathcal {N}/\eta -1\). We remark that the function OKEA.init initializes a tree-like structure from the given lists. This data structure is used by the function OKEA.getNext() to return the next highest-scoring chunk candidate that can be constructed from the given lists.

Algorithm 2: Creates the lists of candidates.
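The sketch below is a simplified stand-in for Algorithm 2: it groups the chunks into blocks of η, keeps the μ highest-scoring block candidates (by brute-force enumeration here, whereas Algorithm 2 uses OKEA), and converts scores into non-negative integer weights with a toWeight-style rounding. The quantisation and the data layout are our assumptions, not the paper’s exact choices; the chunk lists are those built in the previous sketch.

```python
import heapq
import itertools

def to_weight(score: float, precision: int = 10) -> int:
    """Map a (negative) log-likelihood score to a small non-negative integer weight.
    This rounding is one possible choice, not the paper's exact toWeight function."""
    return max(0, int(round(-score * precision)))

def create_candidate_lists(chunk_lists, eta: int, mu: int):
    """Group chunks into blocks of eta and keep the mu highest-scoring block
    candidates (brute force here; the paper uses OKEA for this step)."""
    L = []
    for i in range(0, len(chunk_lists), eta):
        block = chunk_lists[i:i + eta]
        combos = itertools.product(*block)            # all candidate tuples for the block
        best = heapq.nlargest(mu, combos, key=lambda t: sum(s for s, _ in t))
        entries = [{"score": to_weight(sum(s for s, _ in t)),
                    "candidate": tuple(v for _, v in t)} for t in best]
        L.append(entries)
    return L
```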

Given the weights B1, B2, Algorithm 3 constructs a two-dimensional array B with ξ × B2 entries. For i = ξ − 1 and 0 ≤ b < B2, the entry B[i][b] contains the number of chunk candidates whose total score plus b lies in the interval [B1,B2). Therefore, B[i][b] is given by the number of chunk candidates L[i][j], 0 ≤ j < μ, such that B1 − b ≤ L[i][j].score < B2 − b.

Algorithm 3: Constructs the two-dimensional array B.

On the other hand, for i = ξ − 2,ξ − 3,…,0, and 0 ≤ b < B2, the entry B[i][b] contains the number of chunk candidates that can be constructed from the chunk i to the chunk ξ − 1 such that their total score plus b lies in the interval [B1,B2). Therefore, B[i][b] may be calculated as follows. For 0 ≤ j < μ, B[i][b] = B[i][b] + B[i + 1][b + L[i][j].score] if b + L[i][j].score < B2. Note that, by construction, B[0][0] is the total number of full key candidates with weights in the interval [B1,B2).

Algorithm 4 simply constructs the matrix B by calling create and then computes the total number of full key candidates with weights in [B1,B2) by returning B[0][0].

Algorithm 4: Computes the number of full key candidates in [B1,B2).
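A compact Python rendering of the counting step described above (Algorithms 3 and 4) follows; it mirrors the recurrence for B[i][b] given in the text and assumes the lists L produced by the previous sketch, with integer weights stored in the score field.

```python
def build_count_matrix(L, B1: int, B2: int):
    """B[i][b] = number of candidates formed from blocks i..xi-1 whose total
    weight plus the offset b lies in [B1, B2)."""
    xi = len(L)
    B = [[0] * B2 for _ in range(xi)]
    for b in range(B2):                               # last block: direct count
        B[xi - 1][b] = sum(1 for e in L[xi - 1] if B1 - b <= e["score"] < B2 - b)
    for i in range(xi - 2, -1, -1):                   # remaining blocks, back to front
        for b in range(B2):
            for e in L[i]:
                if b + e["score"] < B2:
                    B[i][b] += B[i + 1][b + e["score"]]
    return B

def count_candidates(L, B1: int, B2: int) -> int:
    """Total number of full key candidates with weight in [B1, B2)."""
    return build_count_matrix(L, B1, B2)[0][0]
```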

We now present Algorithm 5. This algorithm returns the full key candidate kr with weight in the interval [B1,B2), where r ∈ {1,2,…,B[0][0]}. By construction, the output of Algorithm 5 is deterministic in the sense that for fixed values of L,B,B1,B2,W,w,η,μ and r, Algorithm 5 will always return the same key kr.

Algorithm 5: Returns the full key candidate kr with weight in the interval [B1,B2).

Indeed, let us assume that L,B,B1,B2,W,w,η,μ and r are inputs to Algorithm 5. We first analyse lines 7 to 19 of Algorithm 5. Let us fix i ∈ {0,…,ξ − 2}. For j ∈ {0,…,μ − 1}, the condition at line 12 checks whether r is less than the number of chunk candidates that can be constructed from chunk i + 1 to chunk ξ − 1 such that their total score plus b + s lies in the interval [B1,B2). If so, the algorithm has found the proper j for the fixed i; it then concatenates the chunk candidate L[i][j].candidate to kr and updates b as b ← b + s. Otherwise r is updated as r ← r − B[i + 1][b + s]. Similarly, the block of instructions from line 20 to line 29 finds the proper j for i = ξ − 1. Note that the selection of the j’s is fully determined by the input parameters. Hence, for fixed values of L,B,B1,B2,W,w,η,μ and r, Algorithm 5 will always return the same key kr.
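The following sketch captures the unranking performed by Algorithm 5 (the GETKEY routine used later): given the count matrix B and an index r, it deterministically rebuilds the r-th candidate with weight in [B1,B2). It is a plain-Python rendering of the description above (with 1-based r), not the paper’s pseudocode.

```python
def get_key(L, B, B1: int, B2: int, r: int):
    """Return the r-th full key candidate (1 <= r <= B[0][0]) with weight in [B1, B2)."""
    xi = len(L)
    chunks, b = [], 0
    for i in range(xi - 1):                           # blocks 0 .. xi-2
        for e in L[i]:
            s = e["score"]
            if b + s < B2 and r <= B[i + 1][b + s]:
                chunks.append(e["candidate"])         # this block's choice is fixed
                b += s
                break
            if b + s < B2:
                r -= B[i + 1][b + s]                  # skip all completions of this choice
    for e in L[xi - 1]:                               # last block: direct weight test
        if B1 - b <= e["score"] < B2 - b:
            if r == 1:
                chunks.append(e["candidate"])
                break
            r -= 1
    return chunks
```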

For completeness, we present Algorithm 6, which enumerates and tests all full key candidates with weight in the interval [B1,B2) in a purely classical way (without a quantum algorithm). The function T is a Boolean function that returns 1 if kr satisfies some specific condition and 0 otherwise. More specifically, the function T tests whether kr is the correct key.

Algorithm 6: Enumerates and tests all full key candidates with weight in the interval [B1,B2).
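Given the two sketches above, the classical enumeration of Algorithm 6 reduces to a loop over r; the test function below stands in for T and is assumed to be supplied by the attacker.

```python
def classical_enumeration(L, B, B1, B2, test):
    """Test every candidate with weight in [B1, B2); `test` returns True on the real key."""
    for r in range(1, B[0][0] + 1):
        candidate = get_key(L, B, B1, B2, r)
        if test(candidate):
            return candidate
    return None
```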

We now present Algorithm 7, which performs a quantum key enumeration over an interval containing roughly e full key candidates. In particular, it searches over an interval of the form [Bmin,Be), where Bmin is the minimum weight that a full candidate can attain given the list L, and Be is a weight calculated so that the number of full candidates with weights in [Bmin,Be) is roughly e. Recall that L contains \(\xi = \mathcal {N}/\eta \) lists of chunk candidates. Therefore we can calculate Bmin by summing the score of the first chunk candidate of each list contained in L.

Algorithm 7: Performs a quantum key enumeration over an interval with roughly e full key candidates.

We recall that Algorithm 7 is “generic”, that is, it uses Grover’s algorithm at line 11 to speed up the search over a small set of keys. The advantage of this approach is that one can attack a broader spectrum of symmetric ciphers.

4.2.1 Quantum circuit for f

The quantum circuit for f (line 10 of Algorithm 7) can be seen as the oracle implementation of E. In particular, given a plaintext/ciphertext pair (m,c), T is defined as

$$ \texttt{T}(\texttt{k}) = \begin{cases} 1 & \text{if } \texttt{E}_{\texttt{k}}(\texttt{m}) = \texttt{c}\\ 0 & \text{otherwise. } \end{cases} $$

where k = GETKEY(L,B,B1,B2,W,w,η,μ,r) and r ∈ {1,2,…,B[0][0]}. That is, Grover’s algorithm is run to search for a key in the space \(\mathcal {K}_{1}\) generated by GETKEY for fixed values of L,B,B1,B2,W,w,η,μ and r ∈ {1,2,…,B[0][0]}. In this regard, each run of Grover’s algorithm with oracle f will cost \(O(\sqrt {\texttt {B}[0][0]}) \approx 2^{s/2}\), where B[0][0] ≈ 2^s for some s = 0,1,2,… (for more details, see the Appendix).

As a practical example, let us suppose that Algorithm 7 at line 9 generates a matrix B such that B[0][0] = 2^s with s = 16. Therefore, 2^16 candidates need to be tested. At line 11, Grover’s algorithm is run with an oracle f, which can be constructed from the result of [37], to check whether it can find the correct answer. Given that there is a unique marked result, Grover’s algorithm needs on the order of \(\sqrt {2^{16}}=2^{8}\) oracle iterations to reach the correct solution or to conclude that the interval does not contain it.

As pointed out, a critical component of our algorithm is the quantum oracle, so we will next present how to implement the quantum oracle for several block ciphers, namely, AES, PRESENT, and GIFT. Afterward, in Section 5, we further evaluate our algorithm for LowMC, in particular in the context of Picnic, the post-quantum signature algorithm assessed by the NIST standardization process.

4.2.2 Quantum AES

As previously mentioned, quantum computations need to be reversible, and the oracle \(\mathcal {O}\) in Grover’s algorithm implements the block cipher as a reversible function. In [39], the authors give the first version of a reversible AES. Their seminal work inspired other implementations in the literature, such as [37, 40,41,42,43].

AES is a block cipher designed by Daemen and Rijmen [44]. It is based on Rijndael but only provides 128-bit blocks. AES comprises different transformations operating on an intermediate result called the State. The State can be seen as an array of bytes with four rows and four columns. The number of rounds Nr depends on the size of the key: AES-128 performs 10 rounds, AES-192 performs 12 rounds, and AES-256 performs 14 rounds.

In the AES encryption process, one first performs a key addition, denoted AddRoundKey, followed by Nr − 1 executions of Round, and finally one application of FinalRound. The Round function is the application of four transformations: SubBytes, ShiftRows, MixColumns, and AddRoundKey. FinalRound consists of the application of SubBytes, ShiftRows, and AddRoundKey. Algorithm 8 shows, in a pseudo-C language, how these rounds are put together. One advantage of AES is that one only needs to implement the transformation functions and then reuse them in the rounds.

Algorithm 8: High-level description of AES.
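Since the listing of Algorithm 8 is not reproduced here, the following sketch shows the standard AES round structure in Python, with the four transformations passed in as abstract callables; it conveys only the control flow described above and is not the paper’s pseudo-C code.

```python
def aes_encrypt_structure(state, round_keys, sub_bytes, shift_rows, mix_columns, add_round_key, Nr=10):
    """Standard AES structure: initial AddRoundKey, Nr-1 full rounds, and a
    FinalRound without MixColumns. The transformations are abstract callables."""
    state = add_round_key(state, round_keys[0])
    for r in range(1, Nr):
        state = add_round_key(mix_columns(shift_rows(sub_bytes(state))), round_keys[r])
    return add_round_key(shift_rows(sub_bytes(state)), round_keys[Nr])
```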

In the recent literature, we can see improvements in the quantum circuits developed for AES. In our case, we consider the implementation in [37] since it gives the lowest depth. We consider the “in-place” setting; see [37, Sec. 4.6] for more details. Table 1 gives the number of gates necessary to run AES in Grover’s algorithm.

Table 1 Number of quantum gates for the full encryption circuit for AES presented in [37, Sec. 4.6]

4.2.3 Quantum present & quantum GIFT

PRESENT [45] and GIFT [46] follow the block cipher construction, that is, both schemes have a certain number of rounds in which they apply an S-box transformation followed by a permutation. However, they differ in some respects: for PRESENT, the first operation is the addition of the round key, while for GIFT, the first operation is the S-box transformation.

PRESENT has a block size of 64 bits, while GIFT uses 64- and 128-bit blocks. PRESENT supports an 80-bit key size, and both of them support a 128-bit key size. More details can be found in the original papers [45, 46].

Fortunately, there are implementations of both in the quantum setting, that is, reversible implementations using quantum gates. The work in [47] provides a deeper analysis of the quantum circuits. Table 2 shows the number of gates for PRESENT and GIFT. The authors of [47] give their estimates using CNOT and Toffoli gates; in order to use them in our work, we apply the same decomposition as [39] and decompose 1 Toffoli gate as 7 T gates + 8 Clifford gates. We remark that this gives an upper bound on the number of T gates, as we use the generic decomposition; the circuits above could be built using T gates directly and possibly use fewer T gates [48].

Table 2 Number of quantum gates for the full encryption circuit for PRESENT and GIFT presented in [47]

Generic Implementation and Different ciphers.

We presented the costs of implementing AES, PRESENT, and GIFT on a quantum computer. As mentioned before, our attack is generic, and one can easily replace the function f(⋅) in Algorithm 7 by one of those implementations. In the following, we focus on LowMC, given that it is the cipher used in Picnic, which is the scope of this work.

5 Cold boot attacks on Picnic

In this section, we further evaluate our algorithm for LowMC, in particular in the context of Picnic, the post-quantum signature algorithm assessed by the NIST standardization process. We first describe the key-generation algorithm as it is implemented in [49]. We then describe the inner workings of LowMC and its Quantum version, and then the costs and success rate of our algorithm in this context.

5.1 Picnic key generation algorithm

In our analysis, we use the current reference implementation of Picnic [49]. Algorithm 9 summarizes the process of key generation.

Algorithm 9: Picnic’s key generation algorithm.

As one can see, the input of the function KeyGen is P, which represents an instance of a structure storing a parameter set (paramset_t). This structure points to a relatively large set of fields. In particular, the field stateSizeBytes is the number of bytes needed to store stateSizeBits bits, which is the bit length of sk, m, and c. Table 3 shows the values of both stateSizeBits and stateSizeBytes for each Picnic parameter set, as defined in the Picnic reference implementation file picnic.c [49].

Table 3 Values of both stateSizeBits and stateSizeBytes for each Parameter Set for Picnic

For the sake of completeness, the call to randBytes(size) returns a random byte array of length size, while the call to zeroTrailBits(byteArray,bitLength) sets to 0 all bits of byteArray at position i for all bitLength < i ≤ 8 ⋅ l, where l is the number of entries of byteArray. At line 6, we see a call to LowMCEnc, the LowMC encryption algorithm, which we will describe next.

5.2 LowMC block cipher

LowMC [50, 51] is a block cipher designed to reduce the multiplicative complexity of its circuit. Unlike other block ciphers, the instantiation of LowMC is not fixed; it depends on the choice of certain parameters such as the block size, the number of S-boxes per round, and the security expectations. Besides encryption and decryption, LowMC is also a component of the Picnic signature scheme.

First, LowMC performs a key whitening and then iterates a round function R times, where R depends on the parameters. The round function consists of 4 steps, summarized as follows (a minimal code sketch of one round is given after the list).

1. SBoxLayer: A 3-bit S-box is applied to the first 3m bits of the state in parallel, while an identity map is applied to the remaining bits;

2. MatrixMul: A regular matrix \(L_{i} \in \mathbb {F}^{n\times n}_{2}\) is generated at random and the n-bit state is multiplied by \(L_{i}\);

3. ConstantAddition: An n-bit constant \(C_{i} \in \mathbb {F}^{n}_{2}\) is randomly generated and then added to the n-bit state;

4. KeyAddition: A full-rank matrix \(M_{i+1} \in \mathbb {F}^{n \times k}_{2}\) is randomly generated. The n-bit round key \(K_{i+1}\) is obtained by multiplying the k-bit master key by \(M_{i+1}\). Then the n-bit state is added to \(K_{i+1}\), where addition means the XOR operation.
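As a rough illustration of these four steps, the sketch below implements one LowMC round over GF(2) with numpy, using the LowMC 3-bit S-box S(a,b,c) = (a ⊕ bc, a ⊕ b ⊕ ac, a ⊕ b ⊕ c ⊕ ab). The bit ordering and the precomputation of the round key are assumptions of this sketch; real implementations fix these conventions according to the LowMC specification.

```python
import numpy as np

def lowmc_sbox_layer(state, m):
    """Apply the 3-bit LowMC S-box to the first 3m bits of the state;
    the remaining bits pass through unchanged."""
    out = state.copy()
    for i in range(m):
        a, b, c = state[3 * i], state[3 * i + 1], state[3 * i + 2]
        out[3 * i]     = a ^ (b & c)
        out[3 * i + 1] = a ^ b ^ (a & c)
        out[3 * i + 2] = a ^ b ^ c ^ (a & b)
    return out

def lowmc_round(state, L_i, C_i, K_next, m):
    """One LowMC round over GF(2). `state`, `C_i`, `K_next` are 0/1 numpy vectors
    of length n; `L_i` is an n x n 0/1 matrix; the round key K_next is assumed to
    have been derived from the master key beforehand."""
    state = lowmc_sbox_layer(state, m)
    state = (L_i @ state) % 2          # MatrixMul over F_2
    state = state ^ C_i                # ConstantAddition
    state = state ^ K_next             # KeyAddition
    return state
```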

To use LowMC in Picnic, the authors in [49] defined three levels: L1, L3, L5. For details about the construction given the parameters, we refer to the documentation in [49].

5.2.1 Quantum LowMC

In this context, we need a quantum version of LowMC. Fortunately, [37] presents a quantum version of LowMC with a low-depth circuit. Furthermore, the authors provide a Q# implementation of LowMC. We reuse their results, since they deal with the problems of building the quantum circuits. Table 4 shows the number of quantum gates necessary for the LowMC encryption. The levels L1, L3, and L5 are the security levels required by the Picnic scheme.

Table 4 Number of quantum gates for the full encryption circuit for LowMC presented in [37, Sec. 5.4]

Figure 1 shows the implementation of one S-box; one can see that it requires 3 ancillas for storing intermediate results, as well as 12 CNOT gates and 3 Toffoli gates. The Picnic specification defines a full SBoxLayer as consisting of 10 parallel S-boxes.

Fig. 1: Quantum circuit for the computation of one S-box from LowMC. The figure is taken directly from [37].

The AffineLayer, since it is an affine transformation, consists of a matrix multiplication followed by the addition of a constant vector; the details can be seen in [37, Sec. 5.2]. The last functions to describe, KeyExpansion and KeyAddition, consist only of CNOT gates in parallel to perform the addition.

5.3 Costs for running our key recovery algorithm

The costs in terms of gates for running LowMC are similar to those provided in [37]. The only difference in our case is that we search in a smaller keyspace, that is, among the candidates that Algorithm 7 generates at line 9. Table 5 shows the costs of running Grover’s algorithm with the oracle provided in [37]. Furthermore, we select 3 different window sizes for the interval [Bmin,Be), namely e ∈ {2^30, 2^40, 2^50} full candidates.

Table 5 Total number of gates for running Grover’s algorithm against LowMC

In our analysis, we need to account for running the oracle O(N) times, since the costs provided in [37] are for a single query. In our case, the costs are O(N) × #CNOT, O(N) × #1qCliff, and O(N) × #T for the CNOT, 1qCliff and T gates respectively, where O(N) is taken as \(\frac {\pi }{4}\sqrt {e}\).
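For concreteness, and under the convention that e counts the candidates in the interval (as in Section 4.2), the Grover iteration factor \(\frac {\pi }{4}\sqrt {e}\) evaluates roughly as follows; this is plain arithmetic, not data from Table 5.

```python
import math

# Grover iteration count ~ (pi/4) * sqrt(e) for e candidates in [Bmin, Be).
for e in (2 ** 30, 2 ** 40, 2 ** 50):
    iters = math.pi / 4 * math.sqrt(e)
    print(f"e = 2^{int(math.log2(e))}: ~{iters:.3e} iterations (~2^{math.log2(iters):.1f})")
```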

Remark 2

It is possible to run our algorithm in parallel or to reuse the circuit. Since we fix the size of the window, one can pre-compute the sub-intervals [B0,B1), [B1,B2), …, [Bj,Be), each of size 2^s for some s = 0,1,…. One can reuse the circuit to run each chunk in sequence, or run several instances of Grover’s algorithm, each with its own chunk of keys.

Remark 3

Our Algorithm 7 is a “hybrid” algorithm: we consider everything before the Grover call to be classical computation, and likewise everything after the call, that is, the check of whether the element has been found. Hence, we do not need to take into account the costs of running the other functions on a quantum computer, besides the one at line 11. We refer the reader to the Appendix for more details on the running time of our algorithms.

5.4 Success rate of our key recovery algorithm

In this section, we present the success rate of our key-recovery algorithm for each set of parameters defined for Picnic in [49]. The success rates are estimated by performing simulations of our key recovery algorithm for several selected hyper-parameters.

We note that our key-recovery method can find sk from \(\widetilde {\texttt {k}}\) only if each list in L returned by Algorithm 2 contains the proper chunk candidates to reconstruct sk. In such a case, a full enumeration of all candidates constructed from the lists of chunk candidates contained in L will find the real private key.

Based on the previous observation, we estimate the success rate of our key-recovery method by assuming the attacker can perform various enumerations from the set of candidates, \(\mathcal {C}\), that can be constructed from L. In particular, we assume an attacker is able to enumerate (1) all candidates from \(\mathcal {C}\), and (2) the e best high-scoring candidates from \(\mathcal {C}\), where e ∈ {2^30, 2^40, 2^50} (this is basically what Algorithm 7 does for a given e).

To calculate the success rate of our algorithm for given α, β and a Picnic parameter set P, we perform the following experiment, consisting of 100 trials. In each trial, we first create the key pair (sk, pk) by calling the key generation algorithm from the Picnic reference implementation [49]. We then perturb sk according to α, β to get \(\widetilde {\texttt {k}}\). We then select appropriate values for W,w,η,μ, generate L by calling Algorithm 2, and check whether the real key can be reconstructed from L, i.e., whether the corresponding chunk candidates are in the lists of chunk candidates contained in L. If so, a full enumeration can recover sk; otherwise, sk cannot be recovered. Additionally, when sk can be recovered by a full enumeration, we calculate three intervals of the form [Bmin,Be), one for each e, as in Algorithm 7, and check whether the score of the real private key lies in each of them. Note that this check verifies whether enumerating the e best high-scoring candidates is enough to recover the real private key.
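A compressed sketch of one such trial is shown below; it assumes the helper functions sketched in Section 4.2, replaces the Picnic key generation call by an externally supplied key bit string, and only checks whether a full enumeration could succeed.

```python
import random

def perturb_bits(key_bits: str, alpha: float, beta: float) -> str:
    """Apply the binary asymmetric channel to a bit string."""
    noisy = []
    for b in key_bits:
        if b == "0":
            noisy.append("1" if random.random() < alpha else "0")
        else:
            noisy.append("0" if random.random() < beta else "1")
    return "".join(noisy)

def trial_recoverable(key_bits: str, alpha: float, beta: float,
                      w: int, eta: int, mu: int) -> bool:
    """One trial: True iff every block of the real key survives in the top-mu lists,
    i.e. a full enumeration over L would recover the key."""
    noisy = perturb_bits(key_bits, alpha, beta)
    chunk_lists = build_chunk_lists(noisy, w, alpha, beta)
    L = create_candidate_lists(chunk_lists, eta, mu)
    for i, block in enumerate(L):
        real_block = tuple(int(key_bits[(i * eta + j) * w:(i * eta + j + 1) * w], 2)
                           for j in range(eta))
        if not any(e["candidate"] == real_block for e in block):
            return False
    return True
```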

Figure 2 shows the results for the Picnic parameters picnic-{L1-FS, L1-UR, L1-full} and picnic3-L1. In particular, it shows that our key recovery algorithm may find the real private key for α = 0.001 and β in the set {0.001,0.01,0.02,…,0.4} when run with the parameters W = 128, w = 8, η = 2 and μ ∈ {256,512,1024}. Note that the success rate improves as the value of e increases, which is expected. Similarly, Fig. 2d shows that the success rate for the full enumeration improves as the value of μ increases, which is also expected. Additionally, our experiments confirm that although the bit length of the private key for the parameter sets picnic-L1-full and picnic3-L1 is 129 bits, the success rate of our algorithm for these two parameter sets is essentially the same, as shown by Fig. 2.

Fig. 2: Success rate of our key recovery algorithm with W = 128, w = 8, η = 2, α = 0.001 and β ∈ {0.001,0.01,0.02,…,0.4} for Picnic parameters picnic-{L1-FS, L1-UR, L1-full} and picnic3-L1. The x-axis represents β, while the y-axis represents the success rate.

Figure 3 shows the results for the Picnic parameters picnic-{L3-FS, L3-UR, L3-full} and picnic3-L3. In particular, it shows that our key recovery algorithm may find the real private key for α = 0.001 and β in the set {0.001,0.01,0.02,…,0.3} when run with the parameters W = 192, w = 8, η = 3 and μ ∈ {256,512,1024}. As mentioned before, the success rate improves as the value of e increases, which is expected. Similarly, Fig. 3d shows that the success rate for the full enumeration improves as the value of μ increases, which is also expected.

Fig. 3: Success rate of our key recovery algorithm with W = 192, w = 8, η = 3, α = 0.001 and β ∈ {0.001,0.01,0.02,…,0.4} for Picnic parameters picnic-{L3-FS, L3-UR, L3-full} and picnic3-L3. The x-axis represents β, while the y-axis represents the success rate.

Figure 4 shows the results for the Picnic parameters picnic-{L5-FS, L5-UR, L5, L5-full} and picnic3-L5. In particular, it shows that our key recovery algorithm may find the real private key for α = 0.001 and β in the set {0.001,0.01,0.02,…,0.2} when run with the parameters W = 256, w = 8, η = 4 and μ ∈ {256,512,1024}. As mentioned before, the success rate improves as the value of e increases, which is expected. Similarly, Fig. 4d shows that the success rate for the full enumeration improves as the value of μ increases, which is also expected. Additionally, our experiments confirm that although the bit length of the private key for the parameter sets picnic-L5-full and picnic3-L5 is 255 bits, the success rate of our algorithm for these two parameter sets is essentially the same, as shown by Fig. 4.

Fig. 4: Success rate of our key recovery algorithm with W = 256, w = 8, η = 4, α = 0.001 and β ∈ {0.001,0.01,0.02,…,0.4} for Picnic parameters picnic-{L5-FS, L5-UR, L5-full} and picnic3-L5. The x-axis represents β, while the y-axis represents the success rate.

6 Conclusions

This paper presented a general procedure by which a cold boot attacker may recover a block cipher secret key after procuring a noisy version of the key via a cold boot attack. More specifically, the procedure exploits key enumeration algorithms and a well-known quantum algorithm, namely Grover’s algorithm. Also, we showed how to implement the quantum component of our algorithm for several block ciphers, namely AES, PRESENT, GIFT, and LowMC. This paper also evaluated Picnic, a post-quantum signature algorithm, in the cold boot attack setting, focusing on its reference implementation. We showed that our key-recovery method effectively reconstructs Picnic private keys for all Picnic parameter sets for α = 0.001 and values of β in the set {0.001,0.01,0.02,…,0.4} (the upper bound on β depends on the parameter set used). Additionally, we provided the costs of running our key recovery algorithm by giving the number of quantum gates required to implement it and its running time. As future work, we believe that our key-recovery algorithm may be adapted to tackle the key recovery of other post-quantum algorithms’ private keys in the cold boot attack setting.