Masking ring-LWE

  • CHES 2015
  • Journal of Cryptographic Engineering

Abstract

In this paper, we propose a masking scheme to protect ring-LWE decryption from first-order side-channel attacks. In an unprotected ring-LWE decryption, the plaintext is recovered by first performing polynomial arithmetic on the secret key and then decoding the result. We mask the polynomial operations by arithmetically splitting the secret key polynomial into two random shares; the final decoding operation is performed using a new bespoke masked decoder. The outputs of our masked ring-LWE decryption are Boolean shares suitable for derivation of a symmetric key. Thus, the masking scheme keeps all intermediates, including the recovered plaintext, in the masked domain. We have implemented the masking scheme in both hardware and software. On a Xilinx Virtex-II FPGA, the masked ring-LWE processor requires around 2000 LUTs, a \(20~\%\) increase in area with respect to the unprotected architecture. A masked decryption operation takes 7478 cycles, only a factor of \(2.6\times \) more than the unprotected decryption. On a 32-bit ARM Cortex-M4F processor, the masked software implementation costs around \(5.2\times \) more cycles than the unprotected implementation.
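The arithmetic sharing described above works because the polynomial operations are linear in the secret key. The following Python sketch illustrates that idea only and is not the authors' implementation: the toy length `N`, the helper names, and the random public matrix standing in for a linear transform are assumptions made for illustration; only the modulus q = 7681 is taken from Appendix A.

```python
import random

Q = 7681  # modulus from Appendix A
N = 8     # toy polynomial length (assumption; kept small for readability)


def share(secret, rng):
    """Split each coefficient s into (s1, s2) with s1 + s2 = s (mod Q)."""
    s1 = [rng.randrange(Q) for _ in secret]
    s2 = [(s - m) % Q for s, m in zip(secret, s1)]
    return s1, s2


def linear_map(matrix, vec):
    """Apply a public linear map mod Q (hypothetical stand-in for a transform)."""
    return [sum(m * v for m, v in zip(row, vec)) % Q for row in matrix]


def pre_decode(r, c1, c2, matrix):
    """Unprotected reference: linear_map(r * c1 + c2), coefficient-wise."""
    return linear_map(matrix, [(x * c + d) % Q for x, c, d in zip(r, c1, c2)])


def masked_pre_decode(r1, r2, c1, c2, matrix):
    """Process each key share separately; c2 is added to one share only."""
    a1 = linear_map(matrix, [(x * c + d) % Q for x, c, d in zip(r1, c1, c2)])
    a2 = linear_map(matrix, [(x * c) % Q for x, c in zip(r2, c1)])
    return a1, a2


if __name__ == "__main__":
    rng = random.Random(2016)
    r = [rng.randrange(Q) for _ in range(N)]
    c1 = [rng.randrange(Q) for _ in range(N)]
    c2 = [rng.randrange(Q) for _ in range(N)]
    matrix = [[rng.randrange(Q) for _ in range(N)] for _ in range(N)]
    r1, r2 = share(r, rng)
    a1, a2 = masked_pre_decode(r1, r2, c1, c2, matrix)
    recombined = [(x + y) % Q for x, y in zip(a1, a2)]
    print(recombined == pre_decode(r, c1, c2, matrix))
```

The two output shares sum to the unprotected intermediate, yet the sketch never combines `r1` and `r2`. The decoding step is not linear, which is why it needs a dedicated masked decoder.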

Notes

  1. We use here the term “refresh” to refer to the process of modifying the masked representation \((a',a'')\) of a without modifying the unshared value a; contrary to other contexts in the literature, we do not imply that fresh randomness is injected into the new representation.

  2. Note that in the special case where q is a prime close to a power of two, the construction of the quadrant block can be further simplified.

  3. We would like to thank the anonymous reviewer for bringing this important issue to our attention.

References

  1. Balasch, J., Gierlichs, B., Reparaz, O., Verbauwhede, I.: DPA, bitslicing and masking at 1 GHz. In: CHES, Volume 9293 of LNCS, pp. 599–619. Springer, Berlin (2015)

  2. Bernstein, D.J., Buchmann, J., Dahmen, E.: Post Quantum Cryptography, 1st edn. Springer, Berlin (2008)

  3. Bilgin, B., Gierlichs, B., Nikova, S., Nikov, V., Rijmen, V.: Higher-order threshold implementations. In: ASIACRYPT, Volume 8874 of LNCS, pp. 326–343. Springer, Berlin (2014)

  4. Bos, J.W., Lauter, K., Loftus, J., Naehrig, M.: Improved security for a ring-based fully homomorphic encryption scheme. In: Cryptography and Coding, Volume 8308 of LNCS, pp. 45–64. Springer, Berlin (2013)

  5. Brenner, H., Gaspar, L., Leurent, G., Rosen, A., Standaert, F.-X.: FPGA implementations of SPRING—and their countermeasures against side-channel attacks. In: CHES, Volume 8731 of LNCS, pp. 414–432. Springer, Berlin (2014)

  6. Brier, E., Clavier, C., Olivier, F.: Correlation power analysis with a leakage model. In: CHES, Volume 3156 of LNCS, pp. 16–29. Springer, Berlin (2004)

  7. Chari, S., Jutla, C.S., Rao, J.R., Rohatgi, P.: Towards sound approaches to counteract power-analysis attacks. In: CRYPTO, Volume 1666 of LNCS, pp. 398–412. Springer, Berlin (1999)

  8. Coron, J.S.: Higher order masking of look-up tables. In: EUROCRYPT, Volume 8441 of LNCS, pp. 441–458. Springer, Berlin (2014)

  9. de Clercq, R., Roy, S.S., Vercauteren, F., Verbauwhede, I.: Efficient software implementation of ring-LWE encryption. In: Nebel, W., Atienza, D. (eds.) Proceedings of the 2015 Design, Automation and Test in Europe Conference and Exhibition, DATE 2015, Grenoble, France, March 9–13, 2015, pp. 339–344. ACM (2015)

  10. Ducas, L., Durmus, A., Lepoint, T., Lyubashevsky, V.: Lattice signatures and bimodal Gaussians. In: CRYPTO, Volume 8042 of LNCS, pp. 40–56. Springer, Berlin (2013)

  11. Fan, J., Vercauteren, F.: Somewhat practical fully homomorphic encryption. Cryptology ePrint Archive, Report 2012/144 (2012). http://eprint.iacr.org/

  12. Fujisaki, E., Okamoto, T.: Secure integration of asymmetric and symmetric encryption schemes. J. Cryptol. 26(1), 80–101 (2013)

  13. Göttert, N., Feller, T., Schneider, M., Buchmann, J., Huss, S.: On the design of hardware building blocks for modern lattice-based encryption schemes. In: CHES, Volume 7428 of LNCS, pp. 512–529. Springer, Berlin (2012)

  14. Goubin, L., Patarin, J.: DES and differential power analysis: the duplication method. In: CHES, Volume 1717 of LNCS, pp. 158–172. Springer, Berlin (1999)

  15. Kocher, P.: Timing attacks on implementations of Diffie–Hellman, RSA, DSS, and other systems. In: CRYPTO, Volume 1109 of LNCS, pp. 104–113. Springer, Berlin (1996)

  16. Kocher, P., Jaffe, J., Jun, B.: Differential power analysis. In: CRYPTO, Volume 1666 of LNCS, pp. 388–397. Springer, Berlin (1999)

  17. Lyubashevsky, V., Peikert, C., Regev, O.: On ideal lattices and learning with errors over rings. In: EUROCRYPT, Volume 6110 of LNCS, pp. 1–23. Springer, Berlin (2010). Full version available at Cryptology ePrint Archive, Report 2012/230

  18. Pan, J., den Hartog, J.I., Lu, J.: You cannot hide behind the mask: power analysis on a provably secure S-box implementation. In: Information Security Applications, Volume 5932 of LNCS, pp. 178–192. Springer, Berlin (2009)

  19. Peikert, C.: Lattice cryptography for the internet. In: Post-Quantum Cryptography—6th International Workshop, PQCrypto 2014, Waterloo, ON, Canada, October 1–3, 2014. Proceedings, pp. 197–219 (2014)

  20. Pöppelmann, T., Güneysu, T.: Towards practical lattice-based public-key encryption on reconfigurable hardware. In: Selected Areas in Cryptography—SAC 2013, Volume 8282 of LNCS, pp. 68–85. Springer, Berlin (2014)

  21. Prouff, E., Rivain, M., Bevan, R.: Statistical analysis of second order differential power analysis. IEEE Trans. Comput. 58(6), 799–811 (2009)

  22. Rebeiro, C., Roy, S.S., Mukhopadhyay, D.: Pushing the limits of high-speed GF(\(2^m\)) elliptic curve scalar multiplication on FPGAs. In: CHES, Volume 7428 of LNCS, pp. 494–511. Springer, Berlin (2012)

  23. Regev, O.: On lattices, learning with errors, random linear codes, and cryptography. In: Proceedings of the Thirty-seventh Annual ACM Symposium on Theory of Computing, STOC ’05, pp. 84–93. ACM, New York (2005)

  24. Reparaz, O., Bilgin, B., Nikova, S., Gierlichs, B., Verbauwhede, I.: Consolidating masking schemes. In: CRYPTO, Volume 9215 of LNCS, pp. 764–783. Springer, Berlin (2015)

  25. Reparaz, O., Gierlichs, B., Verbauwhede, I.: Selecting time samples for multivariate DPA attacks. In: CHES, Volume 7428 of LNCS, pp. 155–174. Springer, Berlin (2012)

  26. Reparaz, O., Roy, S.S., Vercauteren, F., Verbauwhede, I.: A masked ring-LWE implementation. In: CHES, Volume 9293 of LNCS, pp. 683–702. Springer, Berlin (2015)

  27. Roy, S.S., Reparaz, O., Vercauteren, F., Verbauwhede, I.: Compact and side channel secure discrete Gaussian sampling. In: IACR Cryptology ePrint Archive, vol. 2014, p. 591 (2014)

  28. Roy, S.S., Vercauteren, F., Mentens, N., Chen, D.D., Verbauwhede, I.: Compact ring-LWE cryptoprocessor. In: CHES, Volume 8731 of LNCS, pp. 371–391. Springer, Berlin (2014)

  29. Rudell, R.L.: Multiple-valued logic minimization for PLA synthesis. Technical report, DTIC Document (1986)

  30. Trichina, E.V.: Table lookup operation on masked data. US Patent 8,422,668 (2013)

  31. Tunstall, M., Hanley, N., McEvoy, R.P., Whelan, C., Murphy, C.C., Marnane, W.P.: Correlation power analysis of large word sizes. In: IET Irish Signals and Systems Conference (ISSC) 2007, September 2007. Available at http://www.cs.bris.ac.uk/home/tunstall/papers/THMWMM.pdf

Acknowledgments

The authors would like to thank the CHES 2015 reviewers for their valuable comments. This work has been supported in part by the European Commission through the ICT programme under contracts H2020-ICT-645622 PQCRYPTO, H2020-ICT-644209 HEAT and FP7-ICT-2013-10-SEP-210076296 PRACTICE; by the Research Council KU Leuven TENSE (GOA/11/007); by the Flemish Government FWO G.0550.12N, G.00130.13N and G.0876.14N; and by the Hercules Foundation AKUL/11/19. Oscar Reparaz is funded by a PhD fellowship of the Fund for Scientific Research-Flanders (FWO). Sujoy Sinha Roy was supported by an Erasmus Mundus PhD Scholarship.

Author information

Corresponding author

Correspondence to Oscar Reparaz.

Additional information

This journal version is based on a paper that appeared at the CHES 2015 conference [26]. Sections 6, 7.3 and 8.2 carry substantial differences.

Appendices

Appendix A: Optimal values of \(\Delta _i\) for \(q=7681\)

$$\begin{aligned} \Delta (i) = (&960,1440,480,1680,240,720,1200,1800, \\&120,360,600,840,1080,1320,1560,1860, \\&60,180,300,420,540,660,780,900,1020, \\&1140,1260,1380,1500,1620,1740,1890, \\&30,90,150,210,270,330,390,450,510, \\&570,630,690,750,810,870,930,990,1050, \\&1110,1170,1230) \end{aligned}$$
(6)–(12)

These values were found by exhaustive first-order search. Each value \(\Delta _i\) is chosen so that it maximizes the number of pairs that are decoded after i iterations.
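To make the role of the offsets concrete, here is a minimal functional model in Python. It is a sketch under stated assumptions, not the decoder of the article: it handles the two shares in the clear (the actual decoder works on masked data and outputs Boolean shares), it assumes the threshold convention th(a) = 1 iff q/4 < a < 3q/4, it uses a quadrant rule derived from that convention, and it assumes each offset is applied to the original pair of shares rather than cumulatively. All function names are hypothetical.

```python
import random

Q = 7681
# First eight offsets from the list above; the full list can be substituted.
DELTAS = [960, 1440, 480, 1680, 240, 720, 1200, 1800]


def th(a):
    """Threshold decoding (assumed convention): 1 iff q/4 < a < 3q/4."""
    return 1 if Q < 4 * (a % Q) < 3 * Q else 0


def quadrant(x):
    """Index 0..3 of the quarter of [0, Q) that contains x."""
    return (4 * x) // Q


def quadrant_rule(a1, a2):
    """Decide th(a1 + a2 mod Q) from the two quadrants alone, or return None."""
    s = (quadrant(a1) + quadrant(a2)) % 4
    if s == 1:
        return 1      # the sum lies strictly between q/4 and 3q/4 (mod q)
    if s == 3:
        return 0      # the sum lies strictly between 3q/4 and 5q/4 (mod q)
    return None       # the quadrants alone are not conclusive


def decode(a1, a2, deltas=DELTAS):
    """Try the rule on (a1, a2), then on each deterministically refreshed pair."""
    bit = quadrant_rule(a1, a2)
    for d in deltas:
        if bit is not None:
            break
        # Refresh: shift the shares in opposite directions; a1 + a2 is unchanged.
        bit = quadrant_rule((a1 + d) % Q, (a2 - d) % Q)
    return bit


if __name__ == "__main__":
    rng = random.Random(0)
    a, a1 = rng.randrange(Q), rng.randrange(Q)
    print(decode(a1, (a - a1) % Q), th(a))
```

Whenever `decode` returns a bit, that bit equals th(a); a `None` result corresponds to a decoding failure after all offsets have been tried, which is the event the offsets are chosen to make rare.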

Appendix B: Attack on half-masked variant

In this section, we analyze the security of a masked ring-LWE variant in which the intermediates just before decoding are unmasked and the decoding is performed in the unmasked domain. This alternative is cheaper than full masking.

In the following, we provide evidence that this variant does not provide enough security in our case.

(A seemingly similar situation appears in [5]. However, there are important differences—namely, it is not possible to choose ciphertexts. In the following, we are not analyzing the variant of [5], but only the half-masked ring-LWE.)

A common argument is that once key diffusion is complete, the intermediates can no longer be predicted, and hence standard DPA attacks on the half-masked ring-LWE do not apply. We will see that this is not strictly true if the attacker can choose ciphertexts.

Assume that the coefficients of the polynomial \(a={\text {INTT}}(r\cdot c_1 + c_2)\) appear unmasked in the implementation. Let the adversary collect measurements with chosen ciphertexts of the following structure: all the coefficients of \(c_1\) are fixed except \(c_1[0]\), which varies randomly, and \(c_2\) has the same structure. Then, due to the linearity of the \({\text {INTT}}\) operation, a[0] can be written as \(a[0]=\alpha (r[0] \cdot c_1[0] + c_2[0]) + \beta \), where

  • \(\alpha \) is a public constant determined by the \({\text {INTT}}\) transformation.

  • \(\beta \) is a secret constant that is a function of the other (unknown) key coefficients \(r[1], \ldots , r[255]\). Note that by construction, \(\beta \) is constant within the set of collected traces.

Thus, an attacker can perform a DPA attack targeting the intermediate a[0] and placing predictions on \((r[0], \beta )\). The adversary recovers r[0] and proceeds to recover other key coefficients. We have verified this attack in simulations, even when using \({\text {th}}(a[i])\) as intermediate.

(It may seem that the high number of hypotheses, \(2^{26}\), makes the attack cumbersome. However, one can apply techniques of partial correlation [6] to alleviate the computational effort of DPA on large word sizes [31]. In our experiments, it proved practical to first recover r[0] (this is easier due to the higher non-linearity of the modular multiplication) and then \(\beta \) (which may be harder due to the low non-linearity of the modular addition), splitting the \(2^{26}\) effort into two \(2^{13}\) steps.)
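The attack described above can be illustrated with a small simulation. The sketch below is hypothetical and is not the simulation used in the article: it assumes a noise-free Hamming-weight leakage of a[0], uses a toy modulus `Q = 31` so that the joint search over \((r[0], \beta )\) covers only \(31^2\) hypotheses instead of \(2^{26}\), picks an arbitrary constant `ALPHA` in place of the INTT constant, and searches both unknowns jointly rather than in two steps.

```python
import random

Q = 31       # toy modulus (assumption); the article's setting has q = 7681
ALPHA = 17   # arbitrary stand-in for the public INTT constant


def hw(x):
    """Hamming weight, used here as the assumed leakage model."""
    return bin(x).count("1")


def intermediate(r0, beta, c1, c2):
    """a[0] = alpha * (r[0] * c1[0] + c2[0]) + beta (mod Q)."""
    return (ALPHA * (r0 * c1 + c2) + beta) % Q


def pearson(xs, ys):
    """Pearson correlation; returns 0.0 if either input is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / (vx * vy) ** 0.5


def recover(ciphertexts, traces):
    """Return (score, r0, beta) for the best-correlating hypothesis."""
    best = None
    for r0 in range(Q):
        for beta in range(Q):
            preds = [hw(intermediate(r0, beta, c1, c2)) for c1, c2 in ciphertexts]
            score = pearson(preds, traces)
            if best is None or score > best[0]:
                best = (score, r0, beta)
    return best


if __name__ == "__main__":
    rng = random.Random(0)
    secret_r0, secret_beta = 23, 9
    cts = [(rng.randrange(Q), rng.randrange(Q)) for _ in range(200)]
    traces = [hw(intermediate(secret_r0, secret_beta, c1, c2)) for c1, c2 in cts]
    print(recover(cts, traces))
```

With real measurements the traces would contain noise and more of them would be needed; the sketch only shows that a[0] remains predictable from the chosen ciphertext coefficients once \((r[0], \beta )\) is guessed.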

Appendix C: Generalization of the decoding scheme

Table 4 The rules for octant decoding

The probability of not hitting any rule can be reduced by increasing the number of rules, i.e., by splitting the decoding domain into more than four sections. For example, Table 4 shows the rules for the case in which the decoding domain is split into eight sections, or octants. As seen from the table, the probability of not hitting a rule is reduced to 1/4. Hence, to meet the same decryption failure rate, an octant decoder needs about half the number of iterations required by a quadrant decoder. However, the octant decoder has overheads compared to the quadrant decoder: the number of comparisons needed to locate a coefficient in the octant chart doubles, and the sizes of the tables quadruple.
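The 1/4 figure quoted above, and the corresponding 1/2 for a four-section decoder, can be checked by enumeration. The Python sketch below rests on assumptions that are not taken from Table 4: the domain [0, q) is cut into n equal sections starting at 0, the threshold convention is th(a) = 1 iff q/4 < a < 3q/4, and a pair of sections counts as "hitting a rule" when every sum of two shares from those sections decodes to the same bit. The function names are hypothetical.

```python
from fractions import Fraction

Q = 7681


def th(a, q=Q):
    """Threshold decoding (assumed convention): 1 iff q/4 < a < 3q/4."""
    return 1 if q < 4 * (a % q) < 3 * q else 0


def sector_bounds(k, n, q=Q):
    """Smallest and largest integer x with k*q/n <= x < (k+1)*q/n."""
    lo = -((-k * q) // n)                # ceil(k*q/n)
    hi = -((-(k + 1) * q) // n) - 1      # ceil((k+1)*q/n) - 1
    return lo, hi


def undecided_fraction(n, q=Q):
    """Fraction of sector pairs (i, j) whose share sums do not fix th."""
    undecided = 0
    for i in range(n):
        lo_i, hi_i = sector_bounds(i, n, q)
        for j in range(n):
            lo_j, hi_j = sector_bounds(j, n, q)
            # Every integer between the smallest and largest sum is reachable.
            outcomes = {th(s, q) for s in range(lo_i + lo_j, hi_i + hi_j + 1)}
            if len(outcomes) > 1:
                undecided += 1
    return Fraction(undecided, n * n)


if __name__ == "__main__":
    print(undecided_fraction(4))   # → 1/2
    print(undecided_fraction(8))   # → 1/4
```

If successive attempts were independent, k octant iterations would fail with probability \((1/4)^k = (1/2)^{2k}\), the same as 2k quadrant iterations, which is consistent with the halving of the iteration count stated above.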

Cite this article

Reparaz, O., Roy, S.S., de Clercq, R. et al. Masking ring-LWE. J Cryptogr Eng 6, 139–153 (2016). https://doi.org/10.1007/s13389-016-0126-5

  • DOI: https://doi.org/10.1007/s13389-016-0126-5
