Abstract
The success rate is the most common evaluation metric for measuring the performance of a particular side-channel attack scenario. We improve on an analytic formula for the success rate.
1 Introduction
In [1], a general statistical model for side-channel attack analysis is proposed. Based on this model, one can calculate a success rate of an attack by numerical simulation. This success rate is the most common evaluation metric for measuring the performance of a particular attack scenario. In [5], it is stated:
“Closed-form expressions of success rate are desirable because they provide an explicit functional dependence on relevant parameters such as number of measurements and signal-to-noise ratio which help to understand the effectiveness of a given attack and how one can mitigate its threat by countermeasures. However, such closed-form expressions involve high-dimensional complex statistical functions that are hard to estimate”.
In the following, we will derive an analytic formula for the success rate. Simulation experiments confirm that this analytic formula is a good approximation for the success rate for a wide class of leakage functions.
2 Leakage model
We consider the case of a side-channel attack against a typical block cipher. We assume that this block cipher consists of several rounds for encryption and decryption. In each round, the block cipher uses computations of substitution boxes of small size n (e.g., 6 bits for DES or 8 bits for AES), where the key is mixed with intermediate values.
We further restrict ourselves to the simplest setting:
- The attacker tries to find an n-bit subkey \(k_c\) of the S-Box computation in the first round of the block cipher. The input of this S-Box computation is of the form \(p_{w}\oplus k_c\) with plaintext inputs \(p_w\).
- We have m measurements. m is a multiple of \(N={2^n}\), and all plaintext inputs \(p_w\) of this S-Box are equally distributed over these m measurements.
- The side-channel measurement is a trace of a certain number of points. We assume that the key-dependent leakage occurs in just one point of time which is known to the attacker.
- The measurement in this point of time is the sum of a deterministic signal and Gaussian noise. It can be written in the form
$$\begin{aligned} \tilde{b}_{w}=\tilde{h}(p_{w}\oplus k_c)+\tilde{\tau }_{w}.\end{aligned}$$
\(\tilde{h}\) is a deterministic function that only depends on the input \(p_{w}\oplus k_c\) of the S-Box computation. \(\tilde{h}\) is completely known to the attacker. \(\tilde{\tau }_w\) describes the noise of the measurement. We assume that \(\tilde{\tau }_w\) are realizations of m independent random variables \(\tilde{T}_w\); each one is normally distributed with known expectation and variance. For ease of notation, we associate the sets \(\{0,1\}^n\) and \(\{0,1,\ldots ,N-1\}\) by the 2-adic representation of an integer. We further assume
$$\begin{aligned} E(\tilde{T}_w)&=0, \quad V(\tilde{T}_w)=\sigma ^2, \\ \sum _{z=0}^{N-1} \tilde{h}(z)&=0, \quad \sum _{z=0}^{N-1} \tilde{h}(z)^2=N \tilde{\delta }^2.\end{aligned}$$
- We can calculate the mean value of all \(\tilde{b}_w\) with the same \(p_w\). In the representation of \(\tilde{b}_w\), this just reduces the variance of \(\tilde{T}_w\). Additionally, by applying a constant factor to each \(\tilde{b}_w\) we can normalize the representation of \(\tilde{b}_w\). To this end, we get a representation in the form
$$\begin{aligned}b_{w}={h}(w\oplus k_c)+\tau _{w}, \quad w=0,\ldots ,N-1\end{aligned}$$
with
$$\begin{aligned} E(T_w)=0, \quad V(T_w)=1, \quad \sum _{z=0}^{N-1} h(z)=0, \quad \sum _{z=0}^{N-1} h(z)^2=N \delta ^2.\end{aligned}$$
If we start with the representation of \(\tilde{b}_w\), the normalized representation \(b_w\) has parameter \(\delta \) with
$$\begin{aligned}\delta ^2=\frac{m}{N} \frac{\tilde{\delta }^2}{\sigma ^2}.\end{aligned}$$
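The averaging and rescaling step described in the last item can be sketched in a few lines. This is only an illustration under the assumptions of the leakage model (every plaintext value occurs exactly m/N times, the noise standard deviation \(\sigma \) is known); the function names `normalize_measurements` and `normalized_delta_sq` are hypothetical and do not come from the paper.

```python
import math

def normalize_measurements(raw, plaintexts, N, sigma):
    """Average raw measurements per plaintext value and rescale to unit noise variance.

    raw[i] is the measurement taken with plaintext input plaintexts[i];
    every value 0..N-1 is assumed to occur equally often (m/N times).
    """
    m = len(raw)
    reps = m // N
    sums = [0.0] * N
    for p, b in zip(plaintexts, raw):
        sums[p] += b
    # The mean over reps samples has noise variance sigma^2 / reps,
    # so multiplying by sqrt(reps) / sigma brings it to variance 1.
    scale = math.sqrt(reps) / sigma
    return [scale * s / reps for s in sums]

def normalized_delta_sq(m, N, delta_tilde_sq, sigma_sq):
    """delta^2 = (m / N) * delta_tilde^2 / sigma^2 for the normalized model."""
    return (m / N) * delta_tilde_sq / sigma_sq
```

With noise-free inputs, the mean square of the returned values equals the value given by `normalized_delta_sq`, which is a quick consistency check of the scaling.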
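As in [1], we now apply the maximum likelihood attack: We compute the conditional probability density function of the observations \(b_w\) under each hypothesis k and choose as the correct key the k that maximizes this density. An easy calculation shows that we have to compare the values
$$\begin{aligned} \sum _{w=0}^{N-1} \left( b_w-h(w\oplus k)\right) ^2 \end{aligned}$$
and choose the k with the smallest value. This can further be reduced to the values
$$\begin{aligned} \sum _{w=0}^{N-1} b_w\, h(w\oplus k), \end{aligned}$$
of which we choose the largest, since \(\sum _{w=0}^{N-1} h(w\oplus k)^2\) does not depend on k. The success rate as defined in [1] is the probability that
$$\begin{aligned} X_{k_c}>X_k \quad \text {for all } k\ne k_c, \end{aligned}$$
where \(X_k\) is the random variable
$$\begin{aligned} X_k=\sum _{w=0}^{N-1} \left( h(w\oplus k_c)+T_w\right) h(w\oplus k). \end{aligned}$$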
This success rate can certainly be computed by numerical simulation of the \(T_w\).
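Such a numerical simulation can be sketched as follows. The sketch assumes the normalized model (standard normal noise, leakage given as a list `h` of length N) and scores each hypothesis k by \(\sum _w b_w h(w\oplus k)\); the function name, the number of trials and the seed are illustrative choices, not taken from the paper.

```python
import random

def simulate_success_rate(h, k_c, trials=2000, seed=0):
    """Monte Carlo estimate of the maximum likelihood success rate.

    h is the normalized leakage function as a list of length N = 2^n,
    k_c is the correct subkey (an integer in 0..N-1).
    """
    rng = random.Random(seed)
    N = len(h)
    wins = 0
    for _ in range(trials):
        # Normalized observations b_w = h(w XOR k_c) + tau_w, tau_w ~ N(0, 1).
        b = [h[w ^ k_c] + rng.gauss(0.0, 1.0) for w in range(N)]
        # Score of hypothesis k: sum over w of b_w * h(w XOR k).
        scores = [sum(b[w] * h[w ^ k] for w in range(N)) for k in range(N)]
        best = max(range(N), key=scores.__getitem__)
        if best == k_c:
            wins += 1
    return wins / trials
```

Each trial costs \(N^2\) multiplications, so this plain version is practical for small n; for \(n=8\) a vectorized implementation would be preferable.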
3 An approximation of the success rate
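Let A be the \(N\times N\) matrix with entries \(h(w\oplus k)\). The rows of A are
$$\begin{aligned} a_k=\left( h(0\oplus k), h(1\oplus k),\ldots ,h((N-1)\oplus k)\right) , \quad k=0,\ldots ,N-1. \end{aligned}$$
Let T be the random vector (as column) of length N with entries \(T_w\). Let \(d = A \cdot a_{k_c}^t\) with entries \(d_k\). We define the set R of all vectors of length N with entries \(y_k\) that fulfill
$$\begin{aligned} y_{k_c}+d_{k_c}>y_k+d_k \quad \text {for all } k\ne k_c. \end{aligned}$$
An easy calculation shows that the success rate can be written as
$$\begin{aligned} \hbox {Pr}(A \cdot T \in R). \end{aligned}$$
A is a symmetric matrix, and therefore there exists an orthonormal basis of eigenvectors \(v_0,\ldots , v_{N-1}\) with corresponding eigenvalues \(\lambda _0,\ldots ,\lambda _{N-1}\) of A. T can be written in the basis of eigenvectors in the form
$$\begin{aligned} T=\sum _{i=0}^{N-1} X_i v_i, \end{aligned}$$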
where the \(X_i\) are independent random variables with standard normal distribution. The distribution of \(A \cdot T\) is the image of the standard normal distribution under A. Each vector in the distribution of T is stretched in the direction of the eigenvectors of A with the corresponding eigenvalue as factor.
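We easily compute
$$\begin{aligned} \sum _{i=0}^{N-1} \lambda _i^2=\hbox {tr}(A^2)=\sum _{k=0}^{N-1}\sum _{w=0}^{N-1} h(w\oplus k)^2=N^2\delta ^2. \end{aligned}$$
For values like \(n=6\) or \(n=8\), \(N=2^n\) is a relatively large number, so that the typical vector in the distribution of \(A \cdot T\) has square of norm close to \(N^2 \delta ^2\). As a heuristic approximation for the success rate, we just replace the distribution of \(A\cdot T\) by the normal distribution stretched by the constant factor \(2^{n/2} \delta \):
$$\begin{aligned} \hbox {Pr}(A \cdot T \in R)\approx \hbox {Pr}(2^{n/2}\delta \, T \in R). \end{aligned}$$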
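In addition, we omit the influence of d, keeping only its dominant entry \(d_{k_c}=N\delta ^2\), and get
$$\begin{aligned} \hbox {Pr}(A \cdot T \in R)\approx \hbox {Pr}(T \in \tilde{R}), \end{aligned}$$
where \(\tilde{R}\) is the set of all vectors \(t_k\) that fulfill
$$\begin{aligned} t_{k_c}+2^{n/2}\delta >t_k \quad \text {for all } k\ne k_c. \end{aligned}$$
The last probability can in fact be computed as a two-dimensional integral
$$\begin{aligned} \hbox {Pr}(T \in \tilde{R})=\int _{-\infty }^{\infty } \varphi (x)\left( \int _{-\infty }^{x+2^{n/2}\delta } \varphi (y)\,dy\right) ^{N-1} dx, \quad \varphi (x)=\frac{1}{\sqrt{2\pi }}e^{-x^2/2}. \end{aligned}$$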
This expression only depends on \(\delta \), so that it can easily be listed for different \(\delta \) by numerical methods. Figure 1 plots this approximated success rate as computed by MAPLE software for \(n=8\).
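A numerical evaluation of this kind can also be sketched without a computer algebra system. The sketch below assumes that \(\tilde{R}\) is the set of vectors with \(t_{k_c}+2^{n/2}\delta >t_k\) for all \(k\ne k_c\) and that the entries of T are independent standard normal variables, so that the probability becomes \(\int \varphi (x)\,\Phi (x+2^{n/2}\delta )^{N-1}dx\) with \(\varphi \) and \(\Phi \) the standard normal density and distribution function. The function name, the integration range and the step count are illustrative choices.

```python
import math

def approx_success_rate(delta, n, steps=4000, lim=8.0):
    """Trapezoidal evaluation of the integral of phi(x) * Phi(x + sqrt(N)*delta)^(N-1)."""
    N = 2 ** n
    shift = math.sqrt(N) * delta
    dx = 2.0 * lim / steps
    total = 0.0
    for i in range(steps + 1):
        x = -lim + i * dx
        density = math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)
        cdf = 0.5 * (1.0 + math.erf((x + shift) / math.sqrt(2.0)))
        weight = 0.5 if i in (0, steps) else 1.0  # trapezoid end points
        total += weight * density * cdf ** (N - 1)
    return total * dx
```

For \(\delta =0\) the function returns \(\frac{1}{N}\) up to numerical error, which matches the symmetry argument for \(\tilde{R}\).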
Remarks:
- If we start with the representation of \(\tilde{b}_w\), the success rate as computed by the second approximating formula only depends on
$$\begin{aligned} \delta ^2=\frac{m}{N} \frac{\tilde{\delta }^2}{\sigma ^2}. \end{aligned}$$
- The approximating formulas are only valid if the eigenvalues do not vary too much. As an extreme example, we can consider the case that only one eigenvalue is large, whereas the others can be neglected. Let \(\lambda _0> 0\) be this large eigenvalue. Then, \(A \cdot T\) is roughly distributed as \(\lambda _0 X_0v_0\). \(\hbox {Pr}(A \cdot T \in R)\) can be written as a one-dimensional integral over the random variable \(X_0\).
- In our approach, we replaced the covariance matrix \(A^2\) by a diagonal matrix. In effect, we treated \(X_k\) as independent random variables.
- \(\hbox {Pr}( T \in \tilde{R}) \ge \frac{1}{N}\) with equality for \(\delta =0\). The probability of \(\frac{1}{N}\) for \(\delta =0\) follows from the symmetry of the set \(\tilde{R}\).
4 More on the matrix A
The properties of the matrix A are used in the context of dyadic codes; see [2]. In [3], the matrix A is called a dyadic matrix. Due to the structure of A, we can compute the eigenvectors of A explicitly: There are N \(\hbox {GF}(2)\)-linear functions L
$$\begin{aligned} L: \hbox {GF}(2)^n \longrightarrow \hbox {GF}(2). \end{aligned}$$
For every L, \(v_L=[(-1)^{L(w)}]_w\) is a vector of length N. For every k, we have
$$\begin{aligned} \sum _{w} h(w\oplus k)(-1)^{L(w)}=(-1)^{L(k)}\sum _{y} h(y)(-1)^{L(y)}. \end{aligned}$$
Therefore, \(v_L\) is an eigenvector with eigenvalue \(\sum _y h(y) (-1)^{L(y)}\). The rank of A is the number of nonzero eigenvalues.
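The eigenvector property can be checked numerically for small N. The sketch below uses the fact that every \(\hbox {GF}(2)\)-linear function on \(\hbox {GF}(2)^n\) has the form \(L(w)=\langle l,w\rangle \) for some n-bit vector l; the helper names are hypothetical and the values of `h` in the usage note are arbitrary test data.

```python
def parity(x):
    """Parity of the number of set bits of x."""
    return bin(x).count("1") & 1

def dyadic_matrix(h):
    """Matrix A with entry h(w XOR k) in row k, column w."""
    N = len(h)
    return [[h[w ^ k] for w in range(N)] for k in range(N)]

def walsh_vector(l, N):
    """Vector v_L with entries (-1)^L(w), where L(w) = <l, w> over GF(2)."""
    return [(-1) ** parity(l & w) for w in range(N)]

def eigenvalue(h, l):
    """Eigenvalue sum_y h(y) * (-1)^L(y) belonging to v_L."""
    return sum(h[y] * (-1) ** parity(l & y) for y in range(len(h)))
```

For example, with `h = [3, -1, 4, 1, -5, 9, -2, -6]` one can multiply `dyadic_matrix(h)` with `walsh_vector(l, 8)` for each `l` and compare the result entry by entry with `eigenvalue(h, l)` times the same vector.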
5 Example: h depends on a single bit
Let S be the S-Box of the AES and G a fixed \(\hbox {GF}(2)\)-linear function. We assume that the leakage function h only depends on \(G \circ S\), i.e., after normalization
$$\begin{aligned} h(z)=\delta \,(-1)^{G(S(z))}. \end{aligned}$$
The eigenvalues of A are now
$$\begin{aligned} \delta \sum _{y} (-1)^{G(S(y))+L(y)}. \end{aligned}$$
In other words: The set of eigenvalues is exactly the Walsh spectrum of the Boolean function \(G\circ S\) multiplied by \(\delta \). Each eigenvalue is a measure of how well \(G\circ S\) can be approximated by a linear function L. S is the composition of the inversion over \(F=\hbox {GF}(256)\) and an affine function. The Walsh spectrum of any function of the form \(G\circ S\) is well known: It can be expressed by the so-called Kloosterman sums; see [4].
where tr(y) denotes the trace of y over F. Any \(\hbox {GF}(2)\)-linear function \(L: F \longrightarrow \hbox {GF}(2)\) can be written as \(L(y)=tr(l y)\) for exactly one \(l \in F\). Therefore, we find \(c \in F\) such that
or
Note that for \(c\ne 0\)
The distribution of the Kloosterman sums can be described by values of certain class numbers (see [4, Prop. 9.1]), which can be interpreted in terms of the Walsh spectrum.
6 Example: h depends on the Hamming weight of the input
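In this example, h does not depend on the output of the substitution box, but on the Hamming weight of the input. After normalization, we can write
$$\begin{aligned} h(z)=\delta \, g(z), \quad g(z)=\frac{1}{\sqrt{n}}\sum _{j=1}^{n} (-1)^{z_j}, \end{aligned}$$
where \(z_1,\ldots ,z_n\) denote the bits of z.
In this case, A has exactly n eigenvectors with eigenvalues \(\ne 0\) and these are given by the n linear projections
$$\begin{aligned} L_j(w)=w_j, \quad j=1,\ldots ,n. \end{aligned}$$
The eigenvalues of these n eigenvectors are equal to \(\delta \frac{N}{\sqrt{n}}\). Since we have only a few eigenvalues \(\ne 0\), we cannot expect that the second approximating formula is a good approximation in this case.
However, we can derive an exact formula for the success rate: Since h is a linear function, we have
$$\begin{aligned} \sum _{w} b_w\, h(w\oplus k)=\frac{\delta }{\sqrt{n}}\sum _{j=1}^{n} (-1)^{k_j}\left[ \sum _{w} b_w (-1)^{w_j}\right] . \end{aligned}$$
The sums in brackets do not depend on k, so that
$$\begin{aligned} \max _k \sum _{w} b_w\, h(w\oplus k)=\frac{\delta }{\sqrt{n}}\sum _{j=1}^{n} \left| \sum _{w} b_w (-1)^{w_j}\right| . \end{aligned}$$
The maximum likelihood attack is therefore successful exactly in the event that
$$\begin{aligned} (-1)^{k_{c,j}}\sum _{w} b_w (-1)^{w_j}>0 \quad \text {for all } j=1,\ldots ,n, \end{aligned}$$
where \(k_{c,j}\) denotes the j-th bit of \(k_c\). In other words: The success rate is the probability that the random variables \(Y_j\) fulfill
$$\begin{aligned} Y_j>0 \quad \text {for all } j=1,\ldots ,n, \quad \text {with } Y_j=(-1)^{k_{c,j}}\sum _{w} \left( h(w\oplus k_c)+T_w\right) (-1)^{w_j}. \end{aligned}$$
\(Y_j\) is normally distributed with an expectation value \(\frac{\delta N }{\sqrt{n}}\) and variance N. Since the covariance between \(Y_j\) and \(Y_{\tilde{j}}\) is 0 for \(j \ne \tilde{j}\), the \(Y_j\) are independent and the success rate is given by the formula
$$\begin{aligned} \Phi \left( \delta \sqrt{\frac{N}{n}}\right) ^{n}, \end{aligned}$$
where \(\Phi \) denotes the distribution function of the standard normal distribution.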
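Because the \(Y_j\) are uncorrelated and jointly normal, they are independent, and each is positive with probability \(\Phi (\delta \sqrt{N/n})\), where \(\Phi \) is the standard normal distribution function. Under that reading the exact success rate is the n-th power of this value. A minimal sketch (the function name is illustrative):

```python
import math

def hamming_weight_success_rate(delta, n):
    """Exact success rate Phi(delta * sqrt(N / n))^n for Hamming weight leakage."""
    N = 2 ** n
    # Each Y_j has mean delta * N / sqrt(n) and standard deviation sqrt(N).
    z = delta * math.sqrt(N / n)
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return cdf ** n
```

For \(\delta =0\) this gives \(2^{-n}=\frac{1}{N}\), the probability of guessing the key at random.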
7 Simulation results
We computed the success rate for different n, h and \(\delta \) by numerical simulation of the \(T_w\). Table 1 compares the success rates for \(n=8\), and Table 2 the same for \(n=6\). In both tables, f is chosen as a random function \(\hbox {GF}(2)^n \longrightarrow \hbox {GF(2)}\), but uniformly distributed. P is chosen as a random permutation on \(\hbox {GF}(2)^n\). g is the function from paragraph 6. We repeated the simulation 1000 times with different f and P, so that a mean is given in both tables.
We note that the second approximating formula and the Hamming weight formula from paragraph 6 give different values for identical \(\delta \), but both formulas match the numerical values very well. In all experiments, the numerical values in each of the 1000 repetitions were very close to the mean given in the tables. For \(n=8\) (Table 1), the empirical standard deviation was less than 0.004. For \(n=6\) (Table 2), the empirical standard deviation was less than 0.02.
8 Success rate in the case of masking
Similar to [5], we can apply the second approximating formula to the case of masking. For a concrete example, we adapt our leakage model in the following way:
- We have m measurements. m is a multiple of N, and all plaintext inputs \(p_w\) of this S-Box are equally distributed over these m measurements.
- There are exactly two points of time when meaningful leakages occur. Both points of time are known to the attacker. One leakage is mask-dependent; the other one is key-dependent, but on the input of an S-Box computation.
- The measurements can be written in the form
$$\begin{aligned} \tilde{b}'_{w}&=\mu (p_{w}\oplus k_c \oplus m_w)+\tilde{\tau }'_{w}\\ \tilde{b}''_{w}&=\mu (m_w)+\tilde{\tau }''_{w}. \end{aligned}$$
\(\mu \) is a centralized form of the Hamming weight, i.e.,
$$\begin{aligned}\mu (z)=(-1)^{z_1}+ \cdots +(-1)^{z_n}. \end{aligned}$$
\(\tilde{\tau }'_w\) and \(\tilde{\tau }''_{w}\) describe the noise of the measurement. We assume that \(\tilde{\tau }'_w\) and \(\tilde{\tau }''_{w}\) are realizations of 2m independent random variables \(\tilde{T}'_w\), \(\tilde{T}''_w\); each one is normally distributed with expectation 0 and variance \(\sigma ^2\). \(m_w\) describes the mask. \(m_w\) are the realizations of m independent uniformly distributed random variables \(M_w\) on \(\hbox {GF}(N)\).
We set
The sum is taken over \(\frac{m}{N}\) realizations of independent random variables. For any fixed mask \(m_w\), we compute
and
If \(\frac{m}{N}\) is not too small, we approximate \(c_\nu \) as realizations of N independent normally distributed random variables, each with expectation
and variance
Again if \(\frac{m}{N}\) is not too small, we approximate these sums by the expectation over the random variables \(M_w\). An easy calculation shows
and
Since \(\sum _{z}\mu (z)^2=n\cdot N\), we can apply the leakage model of paragraph 2 with
Given the measurements \(\tilde{b}'_w, \tilde{b}''_w\), we directly compare the values
for different k and decide for the k with the largest value. For large m, we can expect that the success rate of this ad hoc attack only depends on \(\delta ^2= \frac{n m}{N(2 n \sigma ^2+\sigma ^4)}\).
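To get a feeling for the size of this parameter, the expression for \(\delta ^2\) can be evaluated directly and, since the comparison is made with the Hamming weight example, inserted into \(\Phi (\delta \sqrt{N/n})^n\) with \(\Phi \) the standard normal distribution function. The sketch below does exactly that; the function names are hypothetical and the combination is only the heuristic prediction discussed in the text, not a proven formula for the masked attack.

```python
import math

def masking_delta_sq(n, m, sigma_sq):
    """delta^2 = n * m / (N * (2 * n * sigma^2 + sigma^4)) with N = 2^n."""
    N = 2 ** n
    return n * m / (N * (2 * n * sigma_sq + sigma_sq ** 2))

def predicted_masked_success_rate(n, m, sigma_sq):
    """Heuristic prediction: Hamming weight formula evaluated at the delta above."""
    N = 2 ** n
    delta = math.sqrt(masking_delta_sq(n, m, sigma_sq))
    cdf = 0.5 * (1.0 + math.erf(delta * math.sqrt(N / n) / math.sqrt(2.0)))
    return cdf ** n
```

For instance, \(n=8\), \(m=N^2=65536\) and \(\sigma ^2=4\) give \(\delta ^2=25.6\).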
Table 3 gives the success rates of this attack computed by numerical simulation for \(n=8\). We compare these success rates with the values for the example from paragraph 6 (\(h=\delta g\)). Since the numerical simulations are rather slow, we repeated the simulation only for a few instances. However, in all instances the values matched very well.
Table 4 gives similar data, but for \(m=N^2\).
Remark:
The leakage in \(\tilde{b}'_{w}\) depends on the input of an S-Box computation. We can certainly consider the case that the leakage depends on the output of an S-Box computation, i.e.,
The computation is completely analogous, but we expect that the second approximating formula applies. Tables 4 and 5 compare the numerical values for the success rate with the second approximating formula. Again, we computed only a few instances, but in all instances the values matched very well (Table 6).
References
1. Fei, Y., Ding, A.A., Lao, J., Zhang, L.: A statistics-based success rate model for DPA and CPA. J. Cryptogr. Eng. (2015)
2. Rajan, B.S., Lee, M.H.: Quasi-cyclic dyadic codes in the Walsh-Hadamard transform domain. IEEE Trans. Inf. Theory 48(8), 2406–2412 (2002)
3. Misoczki, R., Barreto, P.S.L.M.: Compact McEliece keys from Goppa codes. In: Jacobson, M.J., Rijmen, V., Safavi-Naini, R. (eds.) Selected Areas in Cryptography. SAC 2009. Lecture Notes in Computer Science, vol. 5867. Springer, Berlin (2009)
4. Lachaud, G., Wolfmann, J.: The weights of the orthogonals of the extended quadratic binary Goppa codes. IEEE Trans. Inf. Theory (1990). https://doi.org/10.1109/18.54892
5. Guilley, S., Heuser, A., Rioul, O.: A Key to Success–Success Exponents for Side-Channel Distinguishers. Cryptology ePrint Archive, Report 2016/987
Acknowledgements
Open Access funding provided by Projekt DEAL.
About this article
Cite this article
Wiemers, A. A remark on a success rate model for side-channel attack analysis. J Cryptogr Eng 10, 269–274 (2020). https://doi.org/10.1007/s13389-020-00235-6