1 Introduction

An attacker can mount a side-channel attack against a cryptographic circuit by probing its internal nodes to derive information correlated with the secret. The first countermeasure against this type of attack appeared almost two decades ago [1] and gave birth to a branch of research known today as t-probing security. Since then, it has become clear that, while proving the security of small gadgets requires little effort, reasoning about their composition was (and still is) not trivial. In fact, one of the main problems addressed is the composability of security properties, i.e., determining, given two t-probing secure gadgets, whether their functional composition is still t-probing secure.

Over the years, it has been observed that composability depends on the amount of refreshing that is used [2] and on the non-interference property [3] of the circuit’s implementation, which ensures that the probability distribution of the probed values does not depend on any secret [4]. Subsequent research has shown that a stronger type of non-interference property, which dictates that such a distribution may vary only with the number of internal probes, could be useful [3].

Since the introduction of non-interference, the cryptographic research community has worked toward putting it to work in proof mechanization and optimization [3,4,5]. Later developments concerned the realistic application to circuits that might leak more than one share per probe [6] and proposed some new categorizations (e.g., pseudo-\(t\)-NI/\(t\)-SNI [7]) to be able to reason about them. Among these circuits, one of the most studied is the nonlinear part of the Advanced Encryption Standard (AES) [8, 9], which is also one of the target applications of the current work.

1.1 Our contribution

Reasoning about non-interference still requires either difficult ratiocination or complex automatic tools [3, 4]. In this work, we investigate an alternative formalization which, we argue, is simpler to reason with; in fact, it allowed us to prove some new properties of probing security (see Theorem 4). Our approach is based on the spectral theory of Boolean functions and correlation matrices [10, 11], and represents, to some extent, an extension of the work presented in [12]. However, while the latter only addressed single-output Boolean functions, we are the first to formalize multiple dependencies between outputs and inputs of a vector function, with the added benefit of being supported by matrix-based toolboxes. Note that the use of these tools in our area is not new; for example, in [13] the authors exploit the correlation matrix of shared functions to estimate some security bounds and define correlation between probed values, while other works use linear algebra to investigate resiliency against side-channel attacks [14, 15]. These works, however, address either monovariate attacks or other sharing schemes (inner product masking), while we aim at providing a larger and stronger foundational contribution that concisely explains known general compositional patterns and potentially helps discover new ones.

This work is organized as follows: we present our new mathematical approach in Sect. 2 by introducing the concept of shares’ relation matrix, which is a compact matrix representation of the (cor-)relation between a Boolean function’s inputs and outputs. Methodologically, this new formalism allows us to create a linear algebra view of t-probing security, which we present in Sect. 3. This approach brings several benefits, among which is the ability to prove general probing security composition theorems through linear algebra. To this end, in Sect. 4 we put our formalism to work by proving the t-SNI property of the AES nonlinear part. We conclude by sketching the possible evolution of this work (Sect. 5) and by providing relevant mathematical background and proofs supporting our argument (Appendix).

2 A relation calculus for shares

Let us consider a circuit implementing a function:

$$\begin{aligned} f(x_1,\ldots , x_n): \mathbb {F}_2^{n}\rightarrow \mathbb {F}_2^{m} \end{aligned}$$

where the values \(x_i\) are sensitive (i.e., they have been computed using a secret). A side-channel attack consists of measuring the power consumption of internal nodes of the circuit (through probes) and searching through a set of guesses of the secret for the one that maximizes the correlation.

To design a mitigation against a side-channel attack, designers split each sensitive value \(x_i\) into d values \(\alpha _i = \{\alpha _{i,j}\}_{j \in 1 \ldots d}\) such that \(\sum _j \alpha _{i,j}= x_i\); these d values are called shares. In principle, this is done by using \(d-1\) auxiliary, uniformly distributed random values (aka masks) and, unless one obtains all d shares \(\alpha _{i,j}\), the correlation of each share with the sensitive value \(x_i\) is negligible [1]. The implementation of f must be changed so as to provide the result as a set of shares much like the original sensitive values. The computation of each output \(f_i\) is thus split into a set of d vector functions \(\omega _i = \{\omega _{i,j}\}_{j \in 1 \ldots d}\) such that:

$$\begin{aligned} f_i(x_1, \ldots , x_n) = \sum _{j} \omega _{i,j}(A_1, \ldots , A_n), \\ A_i \subseteq \{\alpha _{i,1}, \ldots \alpha _{i,d}\} \end{aligned}$$

where each \(\omega _{i,j}\) is called the j-th output share of \(f_i\) and it must be impossible to obtain information about \(f_i\) unless one obtains all d output shares.
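The splitting of a sensitive bit into d shares described above can be sketched as follows. This is a minimal illustration, assuming that the sum over \(\mathbb {F}_2\) is the XOR and that the masks come from Python's `secrets` module; the helper names `share` and `unshare` are hypothetical and do not appear in the original text.

```python
import secrets
from functools import reduce
from operator import xor

def share(x: int, d: int) -> list[int]:
    """Split the bit x into d shares using d-1 uniformly random masks."""
    masks = [secrets.randbits(1) for _ in range(d - 1)]
    last = reduce(xor, masks, x)      # x XOR all masks
    return masks + [last]             # the XOR of all d shares equals x

def unshare(shares: list[int]) -> int:
    """Recombine the shares (sum over F_2, i.e. XOR)."""
    return reduce(xor, shares, 0)
```

Any proper subset of the returned shares is uniformly distributed, which is the reason why all d shares are needed to recover \(x_i\).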

In the probing security attack model, aside from regular output shares \(\omega _i\), attackers can observe (through probes) a group of the internal values of the circuit as additional outputs:

$$\begin{aligned} \Pi = \{ \pi _1, \ldots , \pi _{|\Pi |}\} \end{aligned}$$

where each \(\pi _i\) is a function of the input shares. A mitigation against a probing attack ensures that none of the \(\pi _i\) are correlated with the original sensitive values. To design such countermeasures, besides the shares of the original sensitive values, designers use an additional group of inputs \(P=\{ \rho _1 \ldots \rho _{|P|}\}\) which are uniformly random. These values are used to “refresh” the internally computed values of the function so as to make each \(\pi \) and \(\omega \) not correlated with the sensitive values.

It is clear that the correlation of each \(\omega \) and \(\pi \) with any \(\alpha \) and \(\rho \) is critical to determine whether the circuit is probing secure. A possible way to encode this information is to have a multi-dimensional matrix called the shares’ relation matrix:

Definition 1

(Shares’ relation matrix) Given a Boolean function \(f: \mathbb {F}_2^{|A|+|P|}\rightarrow \mathbb {F}_2^{|\Omega |+|\Pi |}\), where A is the set of the function’s input shares \(\alpha _k\), \(\Omega \) is the set of output shares \(\omega _k\), we define the shares’ relation matrix of f as a multi-dimensional matrix F where each element:

$$\begin{aligned} F^{j_{\rho }j_{{\alpha }_{|A|}} \cdots j_{{\alpha }_1}}_{i_\pi i_{{\omega }_{|\Omega |}} \cdots i_{{\omega }_1}} \in \{ 0, 1\}, \end{aligned}$$
(1)

is indexed by:

  • \(j_{\alpha _k} \in \{0,\dots ,d\}, k \in \{ 1, \dots , |A|\}\)

  • \(j_{\rho } \in \{ 0, \dots , |P| \}\)

  • \(i_{\omega _p} \in \{0,\dots ,d\}, p \in \{ 1, \dots , |\Omega | \}\)

  • \(i_{\pi } \in \{0, \dots , |\Pi |\}\)

and it is equal to 1 only if there exists a nonzero correlation between \(j_{\alpha _k}\) shares of \(\alpha _k\) and \(j_\rho \) randoms with \(i_{\omega _p}\) output shares of \(\omega _p\) and \(i_{\pi }\) probes, for all k, p.

A formal definition of this type of matrix is presented in Appendix B where, in particular, we consider multiple random \(P_l\) and probe \(\Pi _z\) groups instead of a single P and \(\Pi \) as above; however, for the rest of the paper, it is only necessary to get an intuitive understanding of it, which we will develop in the following paragraphs. A practical way to compute a shares’ relation matrix for a function f is to derive it from the Walsh matrix of f. Indeed, correlation matrices are useful to determine whether a set of output shares is vulnerable, i.e., correlated with one or more sensitive variables [16,17,18]. In particular, for a circuit f, any combination of outputs (encoded with the spectral coordinate \(\phi \)) is correlated with a set of inputs (encoded with the spectral coordinate \(\psi \)) if \(W_f(\phi , \psi ) \ne 0\). It is possible to see a correlation matrix as an incidence matrix which encodes a dependency relation between the inputs and outputs of f. Such relation matrices (which are built over a Boolean semiring \(K = \{ (0,1), \vee , \wedge \}\)) are the fundamental building block for the calculus of relations, an algorithmic device that allows the substitution of computation for a sometimes difficult ratiocination [19]. We can derive a relation matrix \(\widetilde{W}_f\) from the correlation matrix \(W_f\) of a vectorial Boolean function element-wise:

$$\begin{aligned} \widetilde{W}_f(\phi , \psi ) = {\left\{ \begin{array}{ll} 1 &{} \text {if } W_f(\phi , \psi ) \ne 0 \\ 0 &{} \text {otherwise} \end{array}\right. } \end{aligned}$$
(2)

Once a relation matrix \(\widetilde{W}_f\) has been found, one can derive easily the shares’ relation matrix in Eq. 1 by inspection. Thus, we derive the shares’ relation matrix in a two-step process by starting from the correlation matrix of f:

$$\begin{aligned} W_f \rightarrow \widetilde{W}_f \rightarrow F \end{aligned}$$

Example 1

(Taken from [20] and reported here for simplicity) Consider a function \(f: \mathbb {F}_2^{4}\rightarrow \mathbb {F}_2^{3}\)

$$\begin{aligned} f(a_0, a_1, r_0, r_1) = \left[ \begin{array}{c} f_0 \\ f_1 \\ f_2 \end{array}\right] = \left[ \begin{array}{c} a_0 + r_0 + r_1 \\ a_1 + r_0 + r_1 \\ a_1 + r_0 \end{array}\right] \end{aligned}$$

such that

  • \(a_0\) and \(a_1\) are two shares of a single sensitive input a,

  • \(r_0\) and \(r_1\) are two random values,

  • \(f_0\) and \(f_1\) are two shares of a single output o, and

  • \(f_2\) is the value associated with a potential internal probe p within the circuit realization of f.

From its correlation matrix, we can derive through Eq. 2 the following relation matrix \(\widetilde{W}_f(\phi ,\psi )\) (\(\phi =[\gamma _{f_2}\gamma _{f_1}\gamma _{f_0}], \psi =[\gamma _{r_1}\gamma _{r_0}\gamma _{a_1}\gamma _{a_0}]\)):

$$\begin{aligned} \begin{array}{cccccccccccccccccccc} &{} &{} &{} 0 &{} 0 &{} 0 &{} 0 &{} 0 &{} 0 &{} 0 &{} 0 &{} 1 &{} 1 &{} 1 &{} 1 &{} 1 &{} 1 &{} 1 &{} 1 &{} \gamma _{r_1}\\ &{} &{} &{} 0 &{} 0 &{} 0 &{} 0 &{} 1 &{} 1 &{} 1 &{} 1 &{} 0 &{} 0 &{} 0 &{} 0 &{} 1 &{} 1 &{} 1 &{} 1 &{} \gamma _{r_0}\\ &{} &{} &{} 0 &{} 0 &{} 1 &{} 1 &{} 0 &{} 0 &{} 1 &{} 1 &{} 0 &{} 0 &{} 1 &{} 1 &{} 0 &{} 0 &{} 1 &{} 1 &{} \gamma _{a_1}\\ &{} &{} &{} 0 &{} 1 &{} 0 &{} 1 &{} 0 &{} 1 &{} 0 &{} 1 &{} 0 &{} 1 &{} 0 &{} 1 &{} 0 &{} 1 &{} 0 &{} 1 &{} \gamma _{a_0}\\ \gamma _{f_2} &{} \gamma _{f_1} &{} \gamma _{f_0} &{} &{} &{} &{} &{} &{} &{} &{} &{} &{} &{} &{} &{} &{} &{} &{} &{} \\ 0 &{} 0 &{} 0 &{} 1 &{} &{} &{} &{} &{} &{} &{} &{} &{} &{} &{} &{} &{} &{} &{} &{} \\ 0 &{} 0 &{} 1 &{} &{} &{} &{} &{} &{} &{} &{} &{} &{} &{} &{} &{} &{} 1 &{} &{} &{} \\ 0 &{} 1 &{} 0 &{} &{} &{} &{} &{} &{} &{} &{} &{} &{} &{} &{} &{} &{} &{} 1 &{} &{} \\ 0 &{} 1 &{} 1 &{} &{} &{} &{} 1 &{} &{} &{} &{} &{} &{} &{} &{} &{} &{} &{} &{} &{} \\ 1 &{} 0 &{} 0 &{} &{} &{} &{} &{} &{} &{} 1 &{} &{} &{} &{} &{} &{} &{} &{} &{} &{} \\ 1 &{} 0 &{} 1 &{} &{} &{} &{} &{} &{} &{} &{} &{} &{} &{} &{} 1 &{} &{} &{} &{} &{} \\ 1 &{} 1 &{} 0 &{} &{} &{} &{} &{} &{} &{} &{} &{} 1 &{} &{} &{} &{} &{} &{} &{} &{} \\ 1 &{} 1 &{} 1 &{} &{} &{} &{} &{} &{} 1 &{} &{} &{} &{} &{} &{} &{} &{} &{} &{} &{} \\ \end{array}. \end{aligned}$$
(3)

Note that we have labeled columns (rows) with the corresponding combination of inputs (outputs) in binary form. For example, element \(\widetilde{W}_f([011],[0011])\) (which is 1) represents an existing dependency between \(f_0 \oplus f_1\) and \(a_0 \oplus a_1\); note that in this specific case \(W_f=\widetilde{W}_f=1\) but in general \(\widetilde{W}_f(i,j)\) is 1 whenever \(W_f(i,j)\) is different from zero.
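The relation matrix of Example 1 can be recomputed by brute force. The sketch below is only illustrative: it assumes the unnormalized Walsh transform \(W_f(\phi ,\psi ) = \sum _x (-1)^{\phi \cdot f(x) \oplus \psi \cdot x}\) and a bit packing in which \(a_0\) is the least significant input bit and \(f_0\) the least significant output bit; the helper names are hypothetical.

```python
def parity(v: int) -> int:
    """Parity of the number of set bits in v."""
    return bin(v).count("1") & 1

def f(x: int) -> int:
    # Input bits, least significant first: a0, a1, r0, r1 (so psi = [r1 r0 a1 a0]).
    a0, a1, r0, r1 = x & 1, (x >> 1) & 1, (x >> 2) & 1, (x >> 3) & 1
    f0 = a0 ^ r0 ^ r1
    f1 = a1 ^ r0 ^ r1
    f2 = a1 ^ r0
    return f0 | (f1 << 1) | (f2 << 2)  # output mask phi = [f2 f1 f0]

def walsh(func, n_in: int, n_out: int) -> list[list[int]]:
    """Unnormalized Walsh matrix: W[phi][psi] = sum_x (-1)^(phi.func(x) XOR psi.x)."""
    xs = range(1 << n_in)
    return [[sum(1 - 2 * (parity(phi & func(x)) ^ parity(psi & x)) for x in xs)
             for psi in xs]
            for phi in range(1 << n_out)]

W = walsh(f, 4, 3)
# Relation matrix: 1 wherever the correlation is nonzero.
W_rel = [[int(w != 0) for w in row] for row in W]
# Nonzero positions, written as binary strings like the labels of Eq. 3.
nonzero = {(format(phi, "03b"), format(psi, "04b"))
           for phi, row in enumerate(W_rel)
           for psi, bit in enumerate(row) if bit}
```

Since f is linear, each output mask is correlated with exactly one input mask, so `nonzero` contains one pair per row; dividing `W` by \(2^4\) gives the normalized correlation values.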

From the relation matrix \(\widetilde{W}_f\) (Eq. 3), it is thus possible to derive the corresponding shares’ relation matrix F, which accounts only for the number of output shares and probes whose combination is correlated with a specified number of input shares and randoms:

$$\begin{aligned} \begin{array}{cccccccccccc} &{}\qquad &{} 0 \qquad &{} 0 \qquad &{} 0 \qquad &{} 1 \qquad &{} 1 \qquad &{} 1 \qquad &{} 2 \qquad &{} 2 \qquad &{} 2 \qquad &{} j_r\\ &{} \qquad &{} 0 \qquad &{} 1 \qquad &{} 2 \qquad &{} 0 \qquad &{} 1 \qquad &{} 2 \qquad &{} 0 \qquad &{} 1 \qquad &{} 2 \qquad &{} j_a\\ i_{p} \qquad &{} i_{o} \qquad &{} \qquad &{} \qquad &{} \qquad &{} \qquad &{} \qquad &{} \qquad &{} \qquad &{} \qquad &{} \qquad &{} \\ 0 \qquad &{} 0 \qquad &{} 1 \qquad &{} \qquad &{} \qquad &{} \qquad &{} \qquad &{} \qquad &{} \qquad &{} \qquad &{} \qquad &{} \\ 0 \qquad &{} 1 \qquad &{} \qquad &{} \qquad &{} \qquad &{} \qquad &{} \qquad &{} \qquad &{} \qquad &{} 1 \qquad &{} \qquad &{} \\ 0 \qquad &{} 2 \qquad &{} \qquad &{} \qquad &{} 1 \qquad &{} \qquad &{} \qquad &{} \qquad &{} \qquad &{} \qquad &{} \qquad &{} \\ 1 \qquad &{} 0 \qquad &{} \qquad &{} \qquad &{} \qquad &{} \qquad &{} 1 \qquad &{} \qquad &{} \qquad &{} \qquad &{} \qquad &{} \\ 1 \qquad &{} 1 \qquad &{} \qquad &{} \qquad &{} \qquad &{} 1 \qquad &{} \qquad &{} 1 \qquad &{} \qquad &{} \qquad &{} \qquad &{} \\ 1 \qquad &{} 2 \qquad &{} \qquad &{} \qquad &{} \qquad &{} \qquad &{} 1 \qquad &{} \qquad &{} \qquad &{} \qquad &{} \qquad &{} \\ \end{array}. \end{aligned}$$
(4)

Note that the coordinates of this new relation matrix are computed from the Hamming weights of the spectral coordinates of \(\widetilde{W}_f\), split by their type (r the randoms, a the inputs, o the outputs, p the probes). This allows us to index any element F(i, j) in an alternative way:

$$\begin{aligned} F_{i_p i_o}^{j_r j_a} \end{aligned}$$

where \(i_p, i_o, j_r, j_a\) are exactly the mixed-radix representations of i and j and carry additional information, i.e., the distinction between the related input-random (output-probe) composition. The mixed-radix representation \(mr_{\rho }(n)\) of a number n over the vector of parts \(\varvec{\rho } = [{\rho }_{N}, \ldots , {\rho }_{1}]\) is a vector \(b = [{b}_{N}, \ldots , {b}_{1}]\) such that:

$$\begin{aligned} n = \sum ^{N}_{i=1} b_i \prod ^{i-1}_{j=1}(\rho _j + 1) \text {~where~} 0 \le b_i \le \rho _i \end{aligned}$$

The vector \([i_p, i_o]\) (resp., \([j_r, j_a]\)) is just the mixed-radix representation of i (resp. j) over the vector of parts \([f_p, f_o]\) (resp. \([f_r,f_a]\)) where \(f_a\) is the number of shares of the input of function f, \(f_r\) is the number of refresh values, \(f_o\) is the number of shares for function f’s outputs and \(f_p\) is the number of probes:

$$\begin{aligned} i&= i_p \cdot (f_o+1) + i_o \end{aligned}$$
(5)
$$\begin{aligned} j&= j_r \cdot (f_a+1) + j_a \end{aligned}$$
(6)
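The passage from the relation matrix of Eq. 3 to the shares’ relation matrix of Eq. 4 can be sketched by collapsing each spectral coordinate to Hamming weights per type and flattening the result with Eqs. 5 and 6. The nonzero positions below are transcribed from Eq. 3; the constant and function names are hypothetical.

```python
# Nonzero entries (phi, psi) of Eq. 3, with phi = [f2 f1 f0] and psi = [r1 r0 a1 a0].
W_REL_NONZERO = [("000", "0000"), ("001", "1101"), ("010", "1110"), ("011", "0011"),
                 ("100", "0110"), ("101", "1011"), ("110", "1000"), ("111", "0101")]

# Vector of parts for Example 1: one probe, two output shares,
# two random values, two input shares.
F_P, F_O, F_R, F_A = 1, 2, 2, 2

def weights(phi: str, psi: str) -> tuple[int, int, int, int]:
    """Hamming weights of the spectral coordinates, split by type."""
    i_p = phi[:1].count("1")   # probe f2
    i_o = phi[1:].count("1")   # output shares f1, f0
    j_r = psi[:2].count("1")   # randoms r1, r0
    j_a = psi[2:].count("1")   # input shares a1, a0
    return i_p, i_o, j_r, j_a

rows, cols = (F_P + 1) * (F_O + 1), (F_R + 1) * (F_A + 1)
F = [[0] * cols for _ in range(rows)]
for phi, psi in W_REL_NONZERO:
    i_p, i_o, j_r, j_a = weights(phi, psi)
    i = i_p * (F_O + 1) + i_o   # Eq. 5
    j = j_r * (F_A + 1) + j_a   # Eq. 6
    F[i][j] = 1

ones = sorted((i, j) for i in range(rows) for j in range(cols) if F[i][j])
```

Two distinct entries of Eq. 3 may land on the same cell of F (here, rows 001 and 010 both map to \(F_{0,1}^{2,1}\)), which is where the compression with respect to \(\widetilde{W}_f\) comes from.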

Example 2

The fifth row with index i = 4 of the matrix in Eq. (4) has a mixed-radix representation \([i_p, i_o] = [1,1]\) over the vector of parts \([f_p, f_o] = [1,2]\) because:

$$i = 1 \cdot (2+1) + 1 $$

The same reasoning applies to column index 3, which corresponds to the representation [1, 0] over the vector of parts \([f_r, f_a] = [2,2]\), i.e.,

$$\begin{aligned} j = 1 \cdot (2+1) + 0 \end{aligned}$$

Thus, \(F_{1,1}^{1,0}\) is the element (4, 3) of matrix F, and its indices carry the fact that it corresponds to the correlation of one probe and one output share (\([i_p, i_o] = [1,1]\)) with one of the random values (\([j_r, j_a]=[1,0]\)).
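The index arithmetic of Example 2 generalizes to any vector of parts. The following sketch converts between a flat index and its mixed-radix digits, assuming that each digit ranges over \(0, \ldots , \rho _i\), as the factors \((\rho _j + 1)\) in Eqs. 5 and 6 require; the function names are hypothetical.

```python
def to_mixed_radix(n: int, parts: list[int]) -> list[int]:
    """Digits [b_N, ..., b_1] of n over parts [rho_N, ..., rho_1]."""
    digits = []
    for rho in reversed(parts):          # least significant part first
        n, b = divmod(n, rho + 1)
        digits.append(b)
    if n != 0:
        raise ValueError("index does not fit in the given vector of parts")
    return digits[::-1]

def from_mixed_radix(digits: list[int], parts: list[int]) -> int:
    """Inverse map, evaluated with Horner's rule from the most significant digit."""
    n = 0
    for b, rho in zip(digits, parts):
        n = n * (rho + 1) + b
    return n

row = to_mixed_radix(4, [1, 2])   # [i_p, i_o] for row index 4 of Eq. 4
col = to_mixed_radix(3, [2, 2])   # [j_r, j_a] for column index 3 of Eq. 4
```

With the values of Example 2, `row` is `[1, 1]` and `col` is `[1, 0]`.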

This work concerns itself with deriving security properties associated with the composition of functions. In the following we consider the function \(h(x) = g(f(x))\) as a horizontal composition of g with f while the vector function:

$$\begin{aligned} v(x_1, x_2) = [ f(x_1) ; g(x_2) ] \end{aligned}$$

as the vertical composition of f and g. We will show that the shares’ relation matrix of a function distributes over vertical composition while, concerning horizontal composition, we can assert a weaker rule (see below) which will be still valid for inferring probing security. With regard to the proofs of following theorems, the reader is invited to refer to Appendix B.1.

Note that the definition of the shares’ relation matrix is different from the Probe Distribution Table (PDT) introduced in [21] because the latter does not account for the potential compression of information that is obtained by encoding the Hamming weights of the spectral coordinates. With respect to [21], we show that it is possible to work with such minimal objects without resorting to encoding explicitly all possible input/output relationships. Note also that the goal of our work is more related to explaining how the composition of primitive gadgets works rather than determining inner properties for such primitives through their correlation matrices. However, other work [20] provides some deductions on the complexity required for deriving the above shares’ relation matrices from scratch.

Theorem 1

(Identity) Given \(id: \mathbb {F}_2^{n}\rightarrow \mathbb {F}_2^{n}\) the identity function, its shares’ relation matrix is \({\mathbf {I}}_{n+1}\), where \({\mathbf {I}}\) is the identity matrix (see Appendix for the proof).
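Theorem 1 can be checked numerically for a small n. The sketch below is an illustration, not the proof of the Appendix: it treats the n bits as a single group of shares, computes correlations by brute force and collapses them by Hamming weight; `relation_by_weight` is a hypothetical helper name.

```python
def popcount(v: int) -> int:
    return bin(v).count("1")

def parity(v: int) -> int:
    return popcount(v) & 1

def relation_by_weight(func, n_in: int, n_out: int) -> list[list[int]]:
    """R[i][j] = 1 iff some output mask of weight i is correlated
    with some input mask of weight j."""
    R = [[0] * (n_in + 1) for _ in range(n_out + 1)]
    for phi in range(1 << n_out):
        for psi in range(1 << n_in):
            w = sum(1 - 2 * (parity(phi & func(x)) ^ parity(psi & x))
                    for x in range(1 << n_in))
            if w != 0:
                R[popcount(phi)][popcount(psi)] = 1
    return R

n = 3
ID = relation_by_weight(lambda x: x, n, n)
IDENTITY = [[int(i == j) for j in range(n + 1)] for i in range(n + 1)]
```

For the identity function, an output mask is correlated only with the equal input mask, so the two weights always coincide and `ID` equals the \((n+1) \times (n+1)\) identity matrix.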

The horizontal compositionality of the shares’ relation matrices is governed by a weaker rule than that of the conventional correlation matrix (see Theorem 7); in particular, the product of the shares’ relation matrices of the constituent parts of a horizontal composition always conservatively dominates the shares’ relation matrix of the whole composition, as stated in the following theorem:

Theorem 2

(Shares’ relation matrices pseudo-horizontal composition) Given two functions f and g, and F, G, FG the shares’ relation matrices of f, g and \(g\circ f\), respectively, the following dominance holds:

$$\begin{aligned} (FG)_{I_\pi I_\omega }^{J_\rho J_\alpha } \preceq F_{I_\pi I_\omega }^{K L}G_{K L}^{J_\rho J_\alpha } \end{aligned}$$

(see Appendix for the proof).
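The dominance of Theorem 2 can be observed on a toy case. The sketch below assumes a single group of input and output shares and takes \(g = f\) with \(f(x_0, x_1) = (x_0, x_0 \oplus x_1)\), a hypothetical function chosen so that \(g \circ f\) is the identity and the order of the factors in the product is immaterial; the product is computed over the Boolean semiring.

```python
def popcount(v: int) -> int:
    return bin(v).count("1")

def parity(v: int) -> int:
    return popcount(v) & 1

def relation_by_weight(func, n_in, n_out):
    """Relation matrix collapsed by Hamming weight (single share group)."""
    R = [[0] * (n_in + 1) for _ in range(n_out + 1)]
    for phi in range(1 << n_out):
        for psi in range(1 << n_in):
            w = sum(1 - 2 * (parity(phi & func(x)) ^ parity(psi & x))
                    for x in range(1 << n_in))
            if w != 0:
                R[popcount(phi)][popcount(psi)] = 1
    return R

def f(x: int) -> int:
    x0, x1 = x & 1, (x >> 1) & 1
    return x0 | ((x0 ^ x1) << 1)        # (x0, x0 XOR x1)

def bool_matmul(A, B):
    """Matrix product over the Boolean semiring (OR of ANDs)."""
    return [[int(any(A[i][k] and B[k][j] for k in range(len(B))))
             for j in range(len(B[0]))] for i in range(len(A))]

F = relation_by_weight(f, 2, 2)
H = relation_by_weight(lambda x: f(f(x)), 2, 2)   # g o f with g = f
PRODUCT = bool_matmul(F, F)
dominated = all(H[i][j] <= PRODUCT[i][j] for i in range(3) for j in range(3))
```

Here `H` is the identity matrix while `PRODUCT` has additional ones, so the dominance is strict: the product flags dependencies that the composed function does not have, which is the conservative behavior stated by the theorem.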

Practically speaking, if the product of two shares’ relation matrices does not imply a dependency between variables, this will be absent from the whole shares’ relation matrix as well. Vertical composition, however, still holds as the following theorems show:

Theorem 3

(Shares’ relation matrices pseudo-distributivity over tensor product) Given two functions f and g, and F, G, F|G the shares’ relation matrices of f, g and the vertical juxtaposition of g above f, respectively, the following holds:

$$\begin{aligned} (F|G)_{(I_{\pi _f}|I_{\pi _g}) (I_{\omega _f}|I_{\omega _g})}^{(J_{\rho _f}|J_{\rho _g}) (J_{\alpha _f}|J_{\alpha _g})} = F_{I_{\pi _f} I_{\omega _f}}^{J_{\rho _f} J_{\alpha _f}}\otimes G_{I_{\pi _g} I_{\omega _g}}^{J_{\rho _g} J_{\alpha _g}} \end{aligned}$$

where \(\otimes \) is the Kronecker (or tensor) product, see Appendix for the proof.
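Theorem 3 can likewise be checked by brute force on two small functions. In this sketch, f and g are hypothetical examples (a linear map and an AND gate), each with a single group of input and output shares, and the flattening order of the multi-indices is an assumption chosen to match the usual layout of the Kronecker product.

```python
def popcount(v: int) -> int:
    return bin(v).count("1")

def parity(v: int) -> int:
    return popcount(v) & 1

def correlated(func, n_in, phi, psi) -> bool:
    """True iff the Walsh coefficient of func at (phi, psi) is nonzero."""
    return sum(1 - 2 * (parity(phi & func(x)) ^ parity(psi & x))
               for x in range(1 << n_in)) != 0

def relation(func, n_in, n_out):
    R = [[0] * (n_in + 1) for _ in range(n_out + 1)]
    for phi in range(1 << n_out):
        for psi in range(1 << n_in):
            if correlated(func, n_in, phi, psi):
                R[popcount(phi)][popcount(psi)] = 1
    return R

def f(x):   # 2 bits -> 2 bits: (x0, x0 XOR x1)
    x0, x1 = x & 1, (x >> 1) & 1
    return x0 | ((x0 ^ x1) << 1)

def g(y):   # 2 bits -> 1 bit: y0 AND y1
    return (y & 1) & ((y >> 1) & 1)

def v(x):   # vertical composition: f on bits 0-1, g on bits 2-3
    return f(x & 3) | (g(x >> 2) << 2)

def kron(A, B):
    """Kronecker product of two 0/1 matrices."""
    return [[a * b for a in ra for b in rb] for ra in A for rb in B]

F, G = relation(f, 2, 2), relation(g, 2, 1)
V = [[0] * (len(F[0]) * len(G[0])) for _ in range(len(F) * len(G))]
for phi in range(1 << 3):
    for psi in range(1 << 4):
        if correlated(v, 4, phi, psi):
            i_f, i_g = popcount(phi & 3), popcount(phi >> 2)
            j_f, j_g = popcount(psi & 3), popcount(psi >> 2)
            V[i_f * len(G) + i_g][j_f * len(G[0]) + j_g] = 1
```

Under these assumptions `V` coincides with `kron(F, G)`: since the Walsh coefficient of v factors into the coefficients of f and g, no information is lost by keeping the Hamming weights of the two parts separate.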

Corollary 1

(Tensor product with identity) From the previous theorems, it follows that the following equalities hold:

$$\begin{aligned} \begin{aligned} ({\mathbf {I}}_{n+1}|G)_{(I_{\pi _{id}}|I_{\pi _g}) (I_{\omega _{id}}|I_{\omega _g})}^{(J_{\rho _{id}}|J_{\rho _g}) (J_{\alpha _{id}}|J_{\alpha _g})}&= \delta (J_{\rho _{id}}J_{\alpha _{id}},I_{\pi _{id}}I_{\omega _{id}}) \cdot G_{I_{\pi _g} I_{\omega _g}}^{J_{\rho _g} J_{\alpha _g}} \\ (G|{\mathbf {I}}_{n+1})_{(I_{\pi _g}|I_{\pi _{id}}) (I_{\omega _g}|I_{\omega _{id}})}^{(J_{\rho _g}|J_{\rho _{id}}) (J_{\alpha _g}|J_{\alpha _{id}})}&= G_{I_{\pi _g} I_{\omega _g}}^{J_{\rho _g} J_{\alpha _g}} \cdot \delta (J_{\rho _{id}}J_{\alpha _{id}},I_{\pi _{id}}I_{\omega _{id}}) \\ \end{aligned} \end{aligned}$$
(7)

where \(\delta \) is the Kronecker delta.

3 Application to t-probing security

In this section, we revisit and enhance known theorems about t-probing security by showing how they naturally descend from the relation calculus of shares based on shares’ relation matrices. We recall that t-probing security centers around the concept of t-non-interfering function. A function f is \(t\)-NI if, when given a total of s outputs and internal probes, \(s \le t\) implies a dependency with maximum s input shares. A function f is \(t\)-SNI if \(s \le t\) implies a dependency with maximum i input shares, where i is the number of internal probes.

Much has been said about the composition rules of such functions and, unfortunately, their proofs are complex, long or require much expertise in type theory or formal validation [3]; we will show that the relation calculus of shares allows us to revisit and extend these proofs with conventional linear algebra tools, broadening the potential audience.

To talk about t-probing security, we have found it useful to follow this general pattern: (i) we explicitly include random refresh values as inputs and (ii) we also include the probes considered in the signature of the function. This creates a natural subdivision of the shares’ relation matrix for the considered function. Before introducing some general results that can be derived with our formalism, however, we introduce an additional example that shows how one could identify a violation of compositionality in an existing gadget with our formalism.

Example 3

(Extended from [20]). In this example, we revisit through our formalism a case discovered in [2] that proves that, in general, the composition of \(t\)-NI and \(t\)-SNI functions is not \(t\)-NI.

Fig. 1

The composition pattern of f (\(t\)-NI) and g (\(t\)-SNI) studied in Example 3 and derived from [2]. The composed function h(a) is not \(t\)-NI as can be easily checked with our formalism

Figure 1 shows the structure of a function h(a) which is a composition of two functions f and g; the assumptions are that f is \(t\)-NI and g is \(t\)-SNI. In particular, f refreshes its input a with two random bits \(r_f\):

$$\begin{aligned} o_f(a_0, a_1, a_2, r_0, r_1) = [ a_0 \oplus r_0 \oplus r_1, a_1 \oplus r_0, a_2 \oplus r_1] \end{aligned}$$

and it is assumed to have been probed at location \(p_f = a_0 \oplus r_0\). On the other hand, \(g(a,b,r_g)\) is the ISW multiplication [1], which consumes 3 random bits \(r_g\) for the secret computation. In this case too, a single probe \(p_g = a_2 \wedge b_1\) is assumed. We will show that our method is precise enough to identify the vulnerability spotted in [2]. To fit into our formalism, however, we must consider the underlying correlation matrices that explicitly include (i) the random values that both f and g consume to refresh the data and (ii) the probes that are present. The string diagram in Fig. 5 describes the composition pattern of correlation matrices as a mapping from the space of the Fourier transform of the input distribution \({\mathbb {A}} \otimes {\mathbb {R}}_f \otimes {\mathbb {R}}_g\) (i.e., the actual inputs plus the random values) to the one of the output distribution \({\mathbb {O}}_g \otimes {\mathbb {P}}_g \otimes {\mathbb {P}}_f\) (i.e., the actual output of g and the probes in both f and g). Still considering the string diagram of Fig. 5, one can derive one of the equivalent expressions of the correlation matrix of h as:

$$ W_{h} \!=\! (\mathbf {I}_2 \otimes W_{g}) (W_{q} \otimes \mathbf {I}_{2^3} \otimes \mathbf {I}_{2^3})(\mathbf {I}_{2^3} \otimes W_{f} \otimes \mathbf {I}_{2^3}) (\mathbf {I}_{2^3} \otimes \mathbf {I}_{2^2} \otimes W_{s})\! $$
Fig. 2

The shares’ relation matrix of function h in Example 3 derived from [2] (we use Greek letters to indicate the spectral coordinate associated with each function variable, i.e., \(\alpha \) is the spectral coordinate associated with variable a and so on). Gray areas indicate where h is allowed to have nonzero values in its shares’ relation matrix to meet \(t\)-NI hypotheses. One can see that in row [1, 1, 0], column [0, 0, 3] there is a potential relation between two probes and the three shares of a, meaning that the composition is not even 2-NI

where \(W_{s}\) is the correlation matrix of the duplication function \(s = (x) \mapsto (x,x)\) and \(W_{q}\) is the correlation matrix of function \(q = (x,y) \mapsto (y,x)\). We are interested in computing the potential dependencies between any combination of output/probes and inputs that are not masked by random values. Thus, computing the shares’ relation matrices from all the previous correlation matrices, by Theorems 2 and 3, the following holds:

$$\begin{aligned} H\preceq (\mathbf {I}_2 \otimes G) (Q \otimes \mathbf {I}_4 \otimes \mathbf {I}_4) (\mathbf {I}_4 \otimes F \otimes \mathbf {I}_4) (\mathbf {I}_4 \otimes \mathbf {I}_3 \otimes S) \end{aligned}$$
(8)

where H, F, G, S and Q are the shares’ relation matrices computed for functions h, f, g, s and q, respectively.

The value of the right-hand side of Eq. 8 is shown in Fig. 2. First of all, we are interested only in the first 4 columns, as these are the ones that represent relationships between the outputs and the shares of a not masked by any random value. We note that there is a potential dependency in row [1, 1, 0], column [0, 0, 3], exactly the one found in [2], which says that one needs only two probe values to get three shares; h is thus not even 2-NI, showing that \(t\)-NI and \(t\)-SNI do not compose into a \(t\)-NI function. This example shows that the proposed calculus of shares has sufficient precision to discover these cases. On one hand, these could be false positives because of the dominance relation in Eq. (8); on the other hand, however, this formalism rules out any false negative. We will show that the stronger concept of \(t\)-SNI naturally emerges, in our relation calculus, as a fundamental property to ensure compositionality.

3.1 Proving general patterns of compositional security

The shares’ relation matrix is a reasonable tool for exploring t-probing security, but there is more. In fact, it is possible to demonstrate that, in order to rule out dependencies similar to those of Example 3, both f and g must be \(t\)-SNI. In this section, we will revisit some known composition patterns (e.g., Theorem 5 and Corollaries 2 and 3, which appeared in [4]) and introduce a new one not known in the literature (Theorem 4).

Here, we restate what it means for a function f to be \(t\)-NI/\(t\)-SNI in terms of the shares’ relation matrix F:

Definition 2

f is \(t\)-SNI iff, for any set of probes that could be introduced in it, the following predicate is true for any element (i, j) of its shares’ relation matrix:

$$\begin{aligned} |i_\pi | + |i_\omega | \le t \wedge (\exists a.j_{\alpha _a} > |i_\pi |) \implies \lnot F^{0 \cdots 0j_{{\alpha }_{|A|}} \cdots j_{{\alpha }_1}}_{i_{{\pi }_{|\Pi |}} \cdots i_{{\pi }_1}i_{{\omega }_{|\Omega |}} \cdots i_{{\omega }_1}}. \end{aligned}$$

Definition 3

f is \(t\)-NI iff, for any set of probes that could be introduced in it, the following predicate is true for any element (i, j) of its shares’ relation matrix:

$$\begin{aligned} |i_\pi | + |i_\omega | \le t \wedge (\exists a.j_{\alpha _a}&> |i_\pi | + |i_\omega |) \\&\implies \lnot F^{0 \cdots 0j_{{\alpha }_{|A|}} \cdots j_{{\alpha }_1}}_{i_{{\pi }_{|\Pi |}} \cdots i_{{\pi }_1}i_{{\omega }_{|\Omega |}} \cdots i_{{\omega }_1}} \end{aligned}$$

where it is evident that \(t\)-NI corresponds to a weaker version of \(t\)-SNI.
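Definitions 2 and 3 translate directly into a check over the nonzero entries of a shares’ relation matrix. The sketch below is restricted to a single input group and to one fixed placement of probes (the definitions quantify over any set of probes, which this code does not enumerate). The entries `EQ4` are read off Eq. 4, while `UNMASKED_PROBE` and `PASS_THROUGH` are hypothetical matrices introduced only for illustration.

```python
# Nonzero entries of Eq. 4 as tuples (i_p, i_o, j_r, j_a).
EQ4 = {(0, 0, 0, 0), (0, 1, 2, 1), (0, 2, 0, 2), (1, 0, 1, 1),
       (1, 1, 1, 0), (1, 1, 1, 2), (1, 2, 1, 1)}

# Hypothetical: one probe correlated with both input shares and no random.
UNMASKED_PROBE = {(0, 0, 0, 0), (1, 0, 0, 2)}
# Hypothetical: one output share correlated with one input share and no random.
PASS_THROUGH = {(0, 0, 0, 0), (0, 1, 0, 1)}

def violations(entries, t: int, strong: bool):
    """Entries that contradict t-SNI (strong=True) or t-NI (strong=False)."""
    bad = []
    for i_p, i_o, j_r, j_a in sorted(entries):
        if j_r != 0:
            continue                      # only the all-zero random index is constrained
        observed = i_p + i_o              # |i_pi| + |i_omega|
        allowed = i_p if strong else observed
        if observed <= t and j_a > allowed:
            bad.append((i_p, i_o, j_r, j_a))
    return bad

def is_t_sni(entries, t):
    return not violations(entries, t, strong=True)

def is_t_ni(entries, t):
    return not violations(entries, t, strong=False)
```

For \(t = 1\), the matrix of Eq. 4 passes both checks, `UNMASKED_PROBE` fails both, and `PASS_THROUGH` passes the \(t\)-NI check but fails the \(t\)-SNI one, which illustrates why \(t\)-NI is the weaker property.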

Example 4

Coron’s linear-space variant [22] of the ISW multiplication [1] is \(t\)-SNI [3] and this can be easily seen through the shares’ relation matrix. Let us consider its form for \(t=1\); in this case we have two shares for each of the two inputs a and b, one random value r, two output shares o and six possible internal probes p:

$$\begin{aligned} \textsf {SecMult}(a_0, a_1, b_0, b_1, r) = [o_0, o_1, p_0, p_1, p_2, p_3, p_4, p_5] \end{aligned}$$

where

$$\begin{aligned} \left[ \begin{array}{c} o_0\\ o_1 \\ p_0 \\ p_1 \\ p_2 \\ p_3 \\ p_4 \\ p_5 \\ \end{array}\right] = \left[ \begin{array}{c} a_0b_0 + r \\ a_1b_1 + ( (a_0b_1 + r) + a_1b_0 ) \\ a_0b_0 \\ a_1b_1 \\ a_0b_1 \\ a_1b_0 \\ a_0b_1 + r \\ (a_0b_1 + r) + a_1b_0 \\ \end{array}\right] \end{aligned}$$
(10)

Part of the corresponding shares’ relation matrix is shown in Fig. 3; it can be seen that for \(\pi +\omega \le 1\), \(\rho =0\) and \(\alpha , \beta > 1\) (white areas) we have a null dependency, i.e., the function is 1-SNI. Note that, for this ISW implementation, the number of outputs and probes varies with t according to the following law:

$$(t+1)^2+4\cdot \sum \limits _{i=1}^t i$$

where \(t+1\) of these correspond to outputs while the others are internal probes.
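As a sanity check of Eq. 10, the following sketch evaluates SecMult for \(t=1\) on all inputs, verifies that the two output shares recombine to the product of the unshared inputs, and evaluates the counting law above. Addition over \(\mathbb {F}_2\) is implemented as XOR and multiplication as AND; the function names are hypothetical.

```python
from itertools import product

def sec_mult(a0, a1, b0, b1, r):
    """Outputs and probes of Eq. 10, in the order [o0, o1, p0, ..., p5]."""
    p0, p1, p2, p3 = a0 & b0, a1 & b1, a0 & b1, a1 & b0
    p4 = p2 ^ r            # a0*b1 + r
    p5 = p4 ^ p3           # (a0*b1 + r) + a1*b0
    o0 = p0 ^ r
    o1 = p1 ^ p5
    return [o0, o1, p0, p1, p2, p3, p4, p5]

def n_values(t: int) -> int:
    """Number of outputs plus probes for security order t."""
    return (t + 1) ** 2 + 4 * sum(range(1, t + 1))

# The XOR of the output shares must equal (a0 + a1)(b0 + b1) for every input.
correct = all(
    sec_mult(a0, a1, b0, b1, r)[0] ^ sec_mult(a0, a1, b0, b1, r)[1]
    == (a0 ^ a1) & (b0 ^ b1)
    for a0, a1, b0, b1, r in product((0, 1), repeat=5)
)
```

For \(t=1\) the law gives 8 values, matching the two outputs and six probes listed in Eq. 10.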

Fig. 3

Part of the shares’ relation matrix of SecMult function [22] (only interesting rows for \(t\)-SNI are shown). Note that \(\alpha \), \(\beta \) and \(\rho \) are the spectral coordinates associated with inputs a, b and r, while \(\omega \) and \(\pi \) are the spectral coordinates for o and p. Gray areas indicate where SecMult is allowed to have nonzero values in its shares’ relation matrix to meet \(t\)-SNI hypotheses

Fig. 4

Map between Fourier transforms of probability distributions implied by a function composition \(l = g \circ f\)

The simplest composition pattern for which we can derive general rules is \(l = g \circ f\). The corresponding map between the Fourier transforms of distributions is shown in Fig. 4. The question we address is whether l (with the associated shares’ relation matrix L) is t-SNI/t-NI according to Definitions 2 and 3, by making assumptions on the probing security of the underlying functions f and g (whose shares’ relation matrices are called F and G, respectively). Note that, to fit within our formalism, we need to explicitly route the refresh values for g and the probed value of f with a function q that just swaps those values. Since matrix Q is the shares’ relation matrix of the function \(q: (x,y)\mapsto (y,x)\), it can be shown that the following holds:

$$\begin{aligned} Q_{i_{\rho _g},i_{\pi _f}}^{j_{\pi _f},j_{\rho _g}} = \delta (i_{\pi _f},j_{\pi _f})\cdot \delta (i_{\rho _g},j_{\rho _g}). \end{aligned}$$

Besides, by Theorem 2, we know that L is dominated by the product:

$$\begin{aligned} ABC =~&({\mathbf {I}}_{n_{\pi _f}} \otimes G )\cdot (Q \otimes {\mathbf {I}}_{n_{\omega _f}})\cdot (\mathbf {I}_{n_{\rho _g}} \otimes F) \end{aligned}$$

where \(n_{\pi _f}\) (\(n_{\omega _f}\), \(n_{\rho _g}\)) is the number of probes in f (output’s shares of f, randoms needed to refresh g) plus 1 (see Theorem 1).

The following lemma can be proved:

Lemma 1

The product ABC is such that:

$$\begin{aligned} (ABC)^{0,0,j_{\alpha }}_{i_{\pi _f}i_{\pi _g}i_{\omega _g}} =&\sum \limits _{r}{G}_{i_{\pi _g}i_{\omega _g}}^{0,r}{F}^{0,j_{\alpha }}_{i_{\pi _f},r} \end{aligned}$$

For a proof, see Appendix.

We are now able to derive formally whether and when l is \(t\)-SNI/\(t\)-NI.

Theorem 4

If f is \(t\)-SNI and g is \(t\)-NI, then \(l(x) = g(f(x))\) is \(t\)-SNI. Formally, the following three axioms:

A 1:

\(r + |i_{\pi _f}| \le t \wedge v > |i_{\pi _f}| \implies \lnot F_{i_{\pi _f}, r}^{0,v}\)

A 2:

\(|i_{\omega _g}| + |i_{\pi _g}| \le t \wedge r > |i_{\pi _g}|+|i_{\omega _g}| \implies \lnot G_{i_{\pi _g}, i_{\omega _g}}^{0,r}\)

A 3:

\((|i_{\pi _g}| + |i_{\pi _f}| + |i_{\omega _g}| \le t) \wedge (j_{\alpha }> |i_{\pi _g}| + |i_{\pi _f}|)\)

entail \((ABC)^{0,0,j_{\alpha }}_{i_{\pi _f}i_{\pi _g}i_{\omega _g}}=0\)

Proof

Exploiting above axioms and Lemma 1 we can derive that:

$$\begin{aligned} (ABC)^{0,0,j_{\alpha }}_{i_{\pi _f}i_{\pi _g}i_{\omega _g}}&{{\mathop {=}\limits ^{\text {Lem.}~1}}} \sum \limits _{r}{G}_{i_{\pi _g}i_{\omega _g}}^{0,r}{F}^{0,j_{\alpha }}_{i_{\pi _f},r} \\ {\mathop {\preceq }\limits ^{\text {A1,A2}}}&\sum \limits _{t-i_{\pi _f}< r \le i_{\pi _g}+i_{\omega _g}}G_{i_{\pi _g},i_{\omega _g}}^{0,r} {\mathop {\preceq }\limits ^{\text {A3}}} 0 \end{aligned}$$

\(\square \)
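The statement of Theorem 4 can also be sanity-checked by brute force on random instances. The sketch below is an illustration under explicit assumptions: every index is treated as a plain integer (so that \(|i| = i\)), the index ranges have hypothetical sizes, and only the slices of F and G with the random inputs fixed to 0 are generated, since these are the only ones Lemma 1 uses. The function name `check_theorem4` is ours.

```python
import numpy as np

def check_theorem4(t, n_pi_f, n_pi_g, n_omega_g, n_r, n_alpha, seed=0):
    """Return True if A1, A2 and A3 force the Lemma 1 sum to vanish."""
    rng = np.random.default_rng(seed)
    F = rng.integers(0, 2, size=(n_pi_f, n_r, n_alpha))     # F[i_pi_f, r, v]
    G = rng.integers(0, 2, size=(n_pi_g, n_omega_g, n_r))   # G[i_pi_g, i_omega_g, r]

    # A1 (f is t-SNI): clear every entry the axiom forbids.
    for p in range(n_pi_f):
        for r in range(n_r):
            for v in range(n_alpha):
                if r + p <= t and v > p:
                    F[p, r, v] = 0

    # A2 (g is t-NI): clear every entry the axiom forbids.
    for p in range(n_pi_g):
        for o in range(n_omega_g):
            for r in range(n_r):
                if o + p <= t and r > p + o:
                    G[p, o, r] = 0

    # Lemma 1: ABC[i_pi_f, i_pi_g, i_omega_g, j_alpha] = sum_r G * F.
    ABC = np.einsum("gor,frv->fgov", G, F)

    # A3 selects the entries that must be zero for l to be t-SNI.
    for pf, pg, og, ja in np.ndindex(*ABC.shape):
        if pg + pf + og <= t and ja > pg + pf and ABC[pf, pg, og, ja] != 0:
            return False
    return True
```

A call such as `check_theorem4(t=2, n_pi_f=3, n_pi_g=3, n_omega_g=3, n_r=3, n_alpha=3)` exercises the argument of the proof: A2 bounds r by the observations on g, which keeps \(r + |i_{\pi _f}|\) within t so that A1 applies.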

Corollary 2

If f and g are \(t\)-SNI functions then also \(l(x) = g(f(x))\) is \(t\)-SNI.

Proof

Assuming g is \(t\)-SNI, then it is also \(t\)-NI and the thesis follows from Theorem 4. \(\square \)

We already saw an example of another composition pattern studied in the literature, whose circuit diagram is shown in Fig. 1. The diagram associated with its correlation matrices is the one shown in Fig. 5. With our formalism, it is possible to identify some general rules to determine if such a composed function is \(t\)-NI/\(t\)-SNI (according to Definitions 2 and 3) by making assumptions on the probing security of the underlying functions f and g. To fit our model of a function, we explicitly split the whole function l into a composition \(a \circ b \circ c \circ d\). In particular, d contains the duplication function s that sends a copy of the shared input to both f and g, while b contains q as in the pattern that we previously studied. The shares’ relation matrix S associated with the function \(s: x \mapsto (x,x)\) is characterized by the following lemma:

Lemma 2

For any \(i_{\alpha _1},i_{\alpha _2},j_\alpha \) indices, the following holds:

$$\begin{aligned} |i_{\alpha _1}|+|i_{\alpha _2}| < |j_{\alpha }| \implies S_{i_{\alpha _1},i_{\alpha _2}}^{j_\alpha } = 0 \end{aligned}$$

For a proof, see Appendix.
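The intuition behind Lemma 2 can be illustrated at the level of Walsh coefficients. The sketch below assumes that the indices are bit masks over n shares, that \(|\cdot |\) is the Hamming weight, and that the relation is nonzero exactly when the Walsh coefficient \(\sum _x (-1)^{u\cdot x \oplus a\cdot x \oplus b\cdot x}\) of the duplication \(s(x) = (x,x)\) is nonzero; these conventions and the helper names are assumptions of the sketch.

```python
def parity(x):
    """Parity of the number of set bits in x."""
    return bin(x).count("1") & 1

def weight(x):
    """Hamming weight of the mask x."""
    return bin(x).count("1")

def walsh_dup(u, a, b, n):
    """Walsh coefficient of s(x) = (x, x): input mask u, output masks (a, b)."""
    return sum((-1) ** (parity(u & x) ^ parity(a & x) ^ parity(b & x))
               for x in range(1 << n))

def lemma2_holds(n):
    """Check that weight(a) + weight(b) < weight(u) implies a zero coefficient."""
    for u in range(1 << n):
        for a in range(1 << n):
            for b in range(1 << n):
                if weight(a) + weight(b) < weight(u) and walsh_dup(u, a, b, n) != 0:
                    return False
    return True
```

In this setting the coefficient is nonzero only when u equals the XOR of a and b, and the weight of an XOR never exceeds the sum of the weights, which is the inequality stated in the lemma.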

Fig. 5 Map between Fourier transforms of probability distributions implied by the second composition pattern studied in this paper

From the point of view of the shares’ relation matrices involved, we know that the whole function is dominated by the product (see Theorem 2):

$$\begin{aligned} ABCD&=~({\mathbf {I}}_{n_{\pi _f}} \otimes G ) \cdot (Q \otimes {\mathbf {I}}_{n_{\omega _f}} \otimes {\mathbf {I}}_{n_{\alpha _1}}) \cdot \\&\quad ({\mathbf {I}}_{n_{\rho _g}} \otimes F \otimes {\mathbf {I}}_{n_{\alpha _1}}) \cdot ({\mathbf {I}}_{n_{\rho _g}} \otimes {\mathbf {I}}_{n_{\rho _f}} \otimes S) \end{aligned}$$

where \(n_{\alpha _1}\) (\(n_{\rho _f}\)) is the number of shares of the first input of g (of randoms needed to refresh f) plus 1.

Lemma 3

The complete relation matrix ABCD computed in Figure 5 is such that:

$$\begin{aligned} (ABCD)^{0,0,j_{\alpha }}_{i_{\pi _f}i_{\pi _g}i_{\omega _g}} = \sum \limits _{v,z}\Big (\sum \limits _{r}{G}_{i_{\pi _g}i_{\omega _g}}^{0,r,z}{F}^{0,v}_{i_{\pi _f},r}\Big )S_{v,z}^{j_{\alpha }} \end{aligned}$$

For a proof, see Appendix.
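The contraction stated in Lemma 3 can be compared against the explicit four-factor Kronecker product. The sketch below is illustrative only: F, G and S are random 0/1 stand-ins, all sizes are hypothetical, indices are plain integers, and multi-indices are flattened row-major with the copy of the input that reaches g placed last.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical sizes; alpha2 is the copy of the input read by f, alpha1 the one read by g.
n_pi_f, n_omega_f, n_rho_f, n_alpha2 = 2, 3, 2, 2
n_pi_g, n_omega_g, n_rho_g, n_alpha1 = 2, 2, 2, 3
n_alpha = 3

F = rng.integers(0, 2, size=(n_pi_f * n_omega_f, n_rho_f * n_alpha2))
G = rng.integers(0, 2, size=(n_pi_g * n_omega_g, n_rho_g * n_omega_f * n_alpha1))
S = rng.integers(0, 2, size=(n_alpha2 * n_alpha1, n_alpha))

def eye(k):
    return np.eye(k, dtype=np.int64)

# Q swaps (rho_g, pi_f) -> (pi_f, rho_g) on flattened index pairs.
n = n_rho_g * n_pi_f
Q = eye(n).reshape(n_rho_g, n_pi_f, n_rho_g, n_pi_f).transpose(1, 0, 2, 3).reshape(n, n)

A = np.kron(eye(n_pi_f), G)
B = np.kron(np.kron(Q, eye(n_omega_f)), eye(n_alpha1))
C = np.kron(np.kron(eye(n_rho_g), F), eye(n_alpha1))
D = np.kron(eye(n_rho_g * n_rho_f), S)
ABCD = A @ B @ C @ D

# Left-hand side of Lemma 3: columns with rho_g = rho_f = 0.
lhs = ABCD.reshape(n_pi_f, n_pi_g, n_omega_g, n_rho_g, n_rho_f, n_alpha)[:, :, :, 0, 0, :]

# Right-hand side: sum over v, z, r of G[.., 0, r, z] * F[.., 0, v] * S[v, z, ..].
G0 = G.reshape(n_pi_g, n_omega_g, n_rho_g, n_omega_f, n_alpha1)[:, :, 0]
F0 = F.reshape(n_pi_f, n_omega_f, n_rho_f, n_alpha2)[:, :, 0]
S3 = S.reshape(n_alpha2, n_alpha1, n_alpha)
rhs = np.einsum("gorz,frv,vza->fgoa", G0, F0, S3)

lemma3_holds = np.array_equal(lhs, rhs)
```

The range sizes are chosen to differ from one another so that a wrong ordering of the Kronecker factors would show up as a shape or value mismatch.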

We are now able to derive formally when l is \(t\)-SNI/\(t\)-NI.

Theorem 5

If f is a \(t\)-SNI function and g is \(t\)-NI, then \(l(x) = g(f(x), x)\) is \(t\)-NI. Formally, the following three axioms:

A 4:

\(r + |i_{\pi _f}| \le t \wedge v > |i_{\pi _f}| \implies \lnot F_{i_{\pi _f}, r}^{0,v}\)

A 5:

\(|i_{\omega _g}| + |i_{\pi _g}| \le t \wedge (r> |i_{\pi _g}|+|i_{\omega _g}| \vee z > |i_{\pi _g}|+|i_{\omega _g}|) \implies \lnot G_{i_{\pi _g}, i_{\omega _g}}^{0,r,z}\)

A 6:

\((|i_{\pi _g}| + |i_{\pi _f}| + |i_{\omega _g}| \le t) \wedge (|j_{\alpha }| > |i_{\pi _g}| + |i_{\pi _f}|+|i_{\omega _g}|)\)

entail \((ABCD)^{0,0,j_{\alpha }}_{i_{\pi _f}i_{\pi _g}i_{\omega _g}}=0\)

Proof

Exploiting the above axioms and Lemmas 2 and 3:

$$\begin{aligned} (ABCD)^{0,0,j_{\alpha }}_{i_{\pi _f}i_{\pi _g}i_{\omega _g}}&{\mathop {=}\limits ^{\text {Lem.}~3}} \sum \limits _{v,z}\Big (\sum \limits _{r}{G}_{i_{\pi _g}i_{\omega _g}}^{0,r,z}{F}^{0,v}_{i_{\pi _f},r}\Big )S_{v,z}^{j_{\alpha }}&&\text {(12)} \\&{\mathop {\preceq }\limits ^{\text {A4}}} \sum \limits _{v\le i_{\pi _f},z}\sum \limits _{r > t - i_{\pi _f}}G_{i_{\pi _g}i_{\omega _g}}^{0,r,z}S_{v,z}^{j_{\alpha }}&&\text {(13)} \\&{\mathop {\preceq }\limits ^{\text {A5,A6}}} \sum \limits _{v\le i_{\pi _f},z \le i_{\pi _g}+i_{\omega _g}}S_{v,z}^{j_{\alpha }} {\mathop {\preceq }\limits ^{\text {Lem.}~2,\text {A6}}} 0&&\text {(14)} \end{aligned}$$

\(\square \)

Remark 1

Note that the case handled in Theorem 5 concerns f \(t\)-SNI and g \(t\)-NI; vice versa, Example 3 concerns the inverted case f \(t\)-NI and g \(t\)-SNI.

Corollary 3

If f and g are \(t\)-SNI functions then also \(l(x) = g(f(x), x)\) is \(t\)-SNI. Formally, the following three axioms:

A 7:

\(r + |i_{\pi _f}| \le t \wedge v > |i_{\pi _f}| \implies \lnot F_{i_{\pi _f}, r}^{0,v}\)

A 8:

\(|i_{\omega _g}| + |i_{\pi _g}| \le t \wedge (r> |i_{\pi _g}| \vee z > |i_{\pi _g}|) \implies \lnot G_{i_{\pi _g}, i_{\omega _g}}^{0,r,z}\)

A 9:

\((|i_{\pi _g}| + |i_{\pi _f}| + |i_{\omega _g}| \le t) \wedge (|j_{\alpha }| > |i_{\pi _g}| + |i_{\pi _f}|)\)

entail \((ABCD)^{0,0,j_{\alpha }}_{i_{\pi _f}i_{\pi _g}i_{\omega _g}}=0\)

Proof

The initial part of the proof is the same as that of Theorem 5 up to Eq. (13); then the different axioms apply:

$$\begin{aligned} \sum \limits _{v\le i_{\pi _f},z}\sum \limits _{r > t - i_{\pi _f}}G_{i_{\pi _g}i_{\omega _g}}^{0,r,z}S_{v,z}^{j_{\alpha }} {\mathop {\preceq }\limits ^{\text {A8,A9}}} \sum \limits _{v\le i_{\pi _f},z \le i_{\pi _g}}S_{v,z}^{j_{\alpha }} {\mathop {\preceq }\limits ^{\text {Lem.}~2,\text {A9}}} 0 \end{aligned}$$

\(\square \)
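Both Theorem 5 and Corollary 3 can be sanity-checked on random instances with one routine. As before, this is a sketch under stated assumptions: all indices are plain integers (so \(|i| = i\)), every index range has the same hypothetical size n, the random inputs are fixed to 0, and S is a random 0/1 array constrained only by Lemma 2. The function name and the flag `g_is_sni` are ours.

```python
import numpy as np

def check_second_pattern(t, n, g_is_sni, seed=0):
    """Check l(x) = g(f(x), x): t-NI if g is t-NI, t-SNI if g is t-SNI."""
    rng = np.random.default_rng(seed)
    F = rng.integers(0, 2, size=(n, n, n))       # F[i_pi_f, r, v]
    G = rng.integers(0, 2, size=(n, n, n, n))    # G[i_pi_g, i_omega_g, r, z]
    S = rng.integers(0, 2, size=(n, n, n))       # S[v, z, j_alpha]
    idx = np.arange(n)

    # A4 / A7 (f is t-SNI).
    p, r, v = np.ix_(idx, idx, idx)
    F[(r + p <= t) & (v > p)] = 0

    # A5 (g is t-NI) or A8 (g is t-SNI): only the bound on r and z changes.
    pg, og, r, z = np.ix_(idx, idx, idx, idx)
    bound = pg if g_is_sni else pg + og
    G[(og + pg <= t) & ((r > bound) | (z > bound))] = 0

    # Lemma 2 for the duplication matrix.
    v, z, a = np.ix_(idx, idx, idx)
    S[v + z < a] = 0

    # Lemma 3: ABCD[i_pi_f, i_pi_g, i_omega_g, j_alpha].
    ABCD = np.einsum("gorz,frv,vza->fgoa", G, F, S)

    # A6 (t-NI conclusion) or A9 (t-SNI conclusion).
    for pf, pg_, og_, ja in np.ndindex(*ABCD.shape):
        limit = pg_ + pf if g_is_sni else pg_ + pf + og_
        if pf + pg_ + og_ <= t and ja > limit and ABCD[pf, pg_, og_, ja] != 0:
            return False
    return True
```

Calling the routine with `g_is_sni=False` mirrors Theorem 5, while `g_is_sni=True` mirrors Corollary 3; the only difference between the two runs is whether the output probes of g count toward the bound.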

Fig. 6 Example of reduction operation \(u^{(k,n) \triangleright i}\). The new spectral coordinate binary encoding \(u^{(4,2) \triangleright 2}\) is the result of or’ing k-bit wide blocks of the original encoding u

4 Extending the approach to \({\mathbb {F}}_{2^k}^{n}\): the AES inversion

In this section, we present an extension of the proposed formalism to address the case where shares encode values over k bits, i.e., they belong to \({\mathbb {F}}_{2^k}^{n}\). Let us thus consider a function \(f: {\mathbb {F}}_{2^k}^{n} \rightarrow {\mathbb {F}}_{2^k}^{m}\); we can extend Eq. 2 as follows:

$$\begin{aligned} \widetilde{W_f}(i,j)&= \exists u,v.\; W_f(u,v) \ne 0 \wedge (u^{(k,n)\triangleright n} = i) \\&\quad \wedge (v^{(k,m)\triangleright m} =j)&&\text {(15)} \end{aligned}$$

where \(u^{(k,n) \triangleright n}\) is a reduction operation over the binary encoding of the spectral coordinate u (see Fig. 6). It can be shown that the shares’ relation matrix derived from relation matrices computed as in Eq. 15 still complies with Definitions 2 and 3 and Theorems 2 and 3. In this setting, affine functions have a convenient representation that will be useful to extend the application of the previous theorems.
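The reduction operation can be sketched in a few lines. The function name `reduce_coordinate` and the convention that the least significant k-bit block maps to bit 0 of the result are assumptions of this sketch; the text only states that k-bit wide blocks are or’ed together.

```python
def reduce_coordinate(u, k, n):
    """Map a (k*n)-bit coordinate u to n bits by OR-ing each k-bit block."""
    block_mask = (1 << k) - 1
    out = 0
    for i in range(n):
        block = (u >> (i * k)) & block_mask
        if block:              # the block contributes a 1 iff any of its bits is set
            out |= 1 << i
    return out
```

For example, with k = 4 and n = 2, `reduce_coordinate(0b00000110, 4, 2)` gives `0b01`, since only the low block is nonzero, and `reduce_coordinate(0b00010001, 4, 2)` gives `0b11`.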

Definition 4

A function \(f:\mathbb {F}_{2^k}^n \rightarrow {\mathbb {F}}_{2^k}^n\) is a (multi-share) affine function if:

$$\begin{aligned} \forall x\in {\mathbb {F}}_{2^k}^n, \forall i \in \{0,\dots ,n-1\}\exists g. f(x)_i=g(x_i) \end{aligned}$$

where g is an affine function, \(x_i\) is the i-th share of x and \(f(x)_i\) is the i-th share of f(x) (see [4]). For conciseness, we will refer to f as an affine function as well.

The relation matrix of an affine function (as well as its shares’ relation matrix) is an identity, as the following lemma shows.

Lemma 4

Let \(f:{\mathbb {F}}_{2^k}^n \rightarrow {\mathbb {F}}_{2^k}^n\) be an affine function; then \(\widetilde{W_f}=I_{2^n}\).

Proof

The affine function f can be seen as the parallel application of n functions \(g_i\) such that \(f(x)_i = g_i(x_i)\) with \(0 \le i \le n-1\); this implies that:

$$\begin{aligned} W_f=\bigotimes \limits _{i=n-1}^{0}W_{g_i}. \end{aligned}$$

Since each \(g_i\) is an affine (and balanced) function, \(\widetilde{W}_{g_i}=I_2\) and:

$$\begin{aligned} \widetilde{W_f}=\bigotimes \limits _{i=n-1}^{0}I_2=I_{2^n}. \end{aligned}$$

\(\square \)
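The single-share case used in the proof can be checked by brute force on a small field. The sketch below makes several assumptions that are not fixed by the text: it works in \({\mathbb {F}}_{2^4}\) with reduction polynomial \(x^4+x+1\) to keep the enumeration small, it takes \(W_g(u,v) = \sum _x (-1)^{u\cdot x \oplus v\cdot g(x)}\) as the Walsh convention, and it uses squaring plus a constant as an example of a map that is affine over \({\mathbb {F}}_2\) and balanced.

```python
def gf16_mul(a, b):
    """Multiply in GF(2^4) modulo x^4 + x + 1 (a choice made for this sketch)."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        if a & 0x10:
            a ^= 0x13
        b >>= 1
    return r

def parity(x):
    return bin(x).count("1") & 1

def walsh(g, u, v, k):
    """Assumed convention: W_g(u, v) = sum_x (-1)^(u.x xor v.g(x))."""
    return sum((-1) ** (parity(u & x) ^ parity(v & g(x))) for x in range(1 << k))

def reduced_relation(g, k):
    """2x2 matrix with entry [i][j] = 1 iff W_g(u, v) != 0 for some u, v
    whose k-bit blocks reduce to i and j respectively (n = m = 1)."""
    R = [[0, 0], [0, 0]]
    for u in range(1 << k):
        for v in range(1 << k):
            if walsh(g, u, v, k) != 0:
                R[int(u != 0)][int(v != 0)] = 1
    return R

def square_plus_const(x):
    """Squaring is linear over GF(2); the XOR with 0x3 makes the map affine."""
    return gf16_mul(x, x) ^ 0x3
```

Under these assumptions `reduced_relation(square_plus_const, 4)` is the 2x2 identity: the zero mask only correlates with the zero mask because the map is a bijection, which is the property the proof relies on.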

When using our formalism to determine if a function over \({\mathbb {F}}_{2^k}^n\) is \(t\)-NI/\(t\)-SNI, we can thus treat affine functions as identities because their shares’ relation matrix is the same as the one of an identity function.

4.1 AES inversion function

A function that has been widely studied in the probing security framework is the inversion function of the AES algorithm; finding a gadget that implements it in a probing-secure way, even when it is composed with preceding and following gadgets, is an important research goal.

Fig. 7 An example application of the proposed formalism to functions over \(\mathbb {F}_{2^k}^n\). Blocks \(m_4\) and \(m_2\) in (b) are structured as in (a)

Let us consider the \(t\)-SNI gadget proposed in [9] as the AES inversion in \(\mathbb {F}_{2^8}\). A formal proof of the strong security of this implementation was given in [4]. Here we show how this could be proven with our formalism, exploiting only patterns that we have presented and proved in this work.

We report the inversion gadget in Fig. 7b. Note that we have slightly modified the algorithm presented in [9] by moving two power computation blocks across duplication points; the circuit is semantically unchanged, but it becomes easier to see how the previously introduced patterns can still be used to show that it is \(t\)-SNI.

First of all, we note that there is a recurring pattern in that particular algorithm, i.e., the circuit in Fig. 7a. The block is composed of a mask refresh Refresh (\(t\)-SNI), the ISW multiplication SecMult (\(t\)-SNI), and \(\cdot ^x\), an affine power function parameterized over the exponent x (which is a power of two). It is possible to demonstrate that \(m_x\) is \(t\)-SNI following the same line of reasoning as Theorem 3 because, by Lemma 4, the relation matrix of the power function can be interpreted as an identity; thus, the same case as the one shown in Fig. 5 applies. Considering the overall algorithm in Fig. 7b, we observe that it is \(t\)-SNI if \(b \circ m_2\) is \(t\)-SNI (by Theorem 3). By Corollary 2, \(b \circ m_2\) is \(t\)-SNI if b is \(t\)-SNI, and the latter is true by Theorem 3 and by Lemma 4.
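For reference, the function that such an inversion gadget computes can be sketched without any masking. The code below is an unmasked, purely functional illustration of \(x \mapsto x^{254}\) in \(\mathbb {F}_{2^8}\) with the AES reduction polynomial; the square-and-multiply schedule with exponents 2, 4 and 16 is a common addition chain assumed here and is not read off Fig. 7b. It models no shares, refreshes, or probing-security property.

```python
def gf_mul(a, b):
    """Multiply in GF(2^8) modulo the AES polynomial x^8 + x^4 + x^3 + x + 1."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return r

def gf_pow2(x, j):
    """Apply j squarings, i.e. compute x^(2^j); each squaring is linear over GF(2)."""
    for _ in range(j):
        x = gf_mul(x, x)
    return x

def inv254(x):
    """Compute x^254, which equals the inverse of x for x != 0 and maps 0 to 0."""
    z = gf_pow2(x, 1)      # x^2
    y = gf_mul(z, x)       # x^3
    w = gf_pow2(y, 2)      # x^12
    y = gf_mul(y, w)       # x^15
    y = gf_pow2(y, 4)      # x^240
    y = gf_mul(y, w)       # x^252
    return gf_mul(y, z)    # x^254
```

In this chain only the four calls to `gf_mul` with two distinct operands are nonlinear; the power maps with exponents 2, 4 and 16 are the ones that Lemma 4 allows us to treat as identities in the relation calculus.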

5 Conclusion

We originally started this research to extend our understanding of t-probing security. We have discovered a new relation calculus of shares which exploits the conventional Walsh transform. This calculus is precise enough to prove and extend known compositional properties without much semi-formal or verbal ratiocination. We believe that the underlying linear algebra, while providing a more intuitive understanding, will also allow for an easier mechanization of probing security proofs.

We also believe that a similar approach can be used to address vulnerabilities associated with circuit glitches. In this sense, we have made a preliminary proposal that shows that the approach is viable [20]. Still, more work must be done toward a unifying approach that encompasses circuit glitches and new composability definitions such as the t-PINI condition [23].