Abstract
In this work, we optimize the performance of QC-MDPC code-based cryptosystems through the insertion of configurable failure rates in their arithmetic procedures. We present constant time algorithms with a configurable failure rate for multiplication and inversion over binary polynomials, the two most expensive subroutines used in QC-MDPC implementations. Using a failure rate negligible compared to the security level (\(2^{-128}\)), our multiplication is 2 times faster than NTL on sparse polynomials and 1.6 times faster than a naive constant-time sparse polynomial multiplication. Our inversion algorithm, based on Wu et al., is 2 times faster than the original algorithm and 12 times faster than Itoh-Tsujii using the same modulus polynomial (\(x^{32749} - 1\)). By inserting these algorithms in a version of QcBits at the 128-bit quantum security level, we were able to achieve a speedup of 1.9 on the key generation and up to 1.4 on the decryption time. Comparing with variant 2 of the BIKE suite, which also implements the Niederreiter Cryptosystem using QC-MDPC codes, our final version of QcBits performs the uniform decryption 2.7 times faster.
References
Aragon, N., et al.: BIKE: bit flipping key encapsulation, December 2017. https://hal.archives-ouvertes.fr/hal-01671903. Submission to the NIST post quantum standardization process. Website: http://bikesuite.org/
Barreto, P.S.L.M., et al.: Cake: code-based algorithm for key encapsulation. Cryptology ePrint Archive, Report 2017/757 (2017). http://eprint.iacr.org/2017/757
Berlekamp, E., McEliece, R., van Tilborg, H.: On the inherent intractability of certain coding problems (corresp.). IEEE Trans. Inf. Theory 24(3), 384–386 (1978). https://doi.org/10.1109/TIT.1978.1055873
Bernstein, D.J.: SUPERCOP: system for unified performance evaluation related to cryptographic operations and primitives (2009)
Bernstein, D.J., Chuengsatiansup, C., Lange, T., van Vredendaal, C.: NTRU prime: reducing attack surface at low cost. In: Adams, C., Camenisch, J. (eds.) SAC 2017. LNCS, vol. 10719, pp. 235–260. Springer, Cham (2018). https://doi.org/10.1007/978-3-319-72565-9_12
Brunner, H., Curiger, A., Hofstetter, M.: On computing multiplicative inverses in GF(\(2^m\)). IEEE Trans. Comput. 42(8), 1010–1015 (1993). https://doi.org/10.1109/12.238496
Ceze, L., et al.: Disciplined approximate computing: from language to hardware, and beyond. Personal Web-page, https://homes.cs.washington.edu/~luisceze/ceze-approx-overview.pdf
Chou, T.: QcBits: constant-time small-key code-based cryptography. In: Gierlichs, B., Poschmann, A.Y. (eds.) CHES 2016. LNCS, vol. 9813, pp. 280–300. Springer, Heidelberg (2016). https://doi.org/10.1007/978-3-662-53140-2_14
Drucker, N., Gueron, S., Krasnov, V.: Fast multiplication of binary polynomials with the forthcoming vectorized VPCLMULQDQ instruction. In: 2018 IEEE 25th Symposium on Computer Arithmetic (ARITH), pp. 115–119, June 2018. https://doi.org/10.1109/ARITH.2018.8464777
Drucker, N., Gueron, S.: A toolbox for software optimization of QC-MDPC code-based cryptosystems. J. Cryptogr. Eng. (2019). https://doi.org/10.1007/s13389-018-00200-4
Eaton, E., Lequesne, M., Parent, A., Sendrier, N.: QC-MDPC: a timing attack and a CCA2 KEM. In: Lange, T., Steinwandt, R. (eds.) PQCrypto 2018. LNCS, vol. 10786, pp. 47–76. Springer, Cham (2018). https://doi.org/10.1007/978-3-319-79063-3_3
Faugère, J.-C., Otmani, A., Perret, L., Tillich, J.-P.: Algebraic cryptanalysis of McEliece variants with compact keys. In: Gilbert, H. (ed.) EUROCRYPT 2010. LNCS, vol. 6110, pp. 279–298. Springer, Heidelberg (2010). https://doi.org/10.1007/978-3-642-13190-5_14
Flammenkamp, A.: Shortest addition chains. Achim’s WWW Domain (2018). http://wwwhomes.uni-bielefeld.de/achim/addition_chain.html
Gallager, R.: Low-Density Parity-Check Codes. MIT press, Cambridge (1963)
Goppa, V.D.: A new class of linear correcting codes. Problemy Peredachi Informatsii 6(3), 24–30 (1970)
Guimarães, A., Aranha, D.F., Borin, E.: Optimizing the decoding process of a post-quantum cryptographic algorithm. In: XVIII Simpósio em Sistemas Computacionais de Alto Desempenho-WSCAD, vol. 18, no. 1/2017, pp. 160–171 (2017)
Guimarães, A., Aranha, D.F., Borin, E.: Optimized implementation of QC-MDPC code-based cryptography. Concurr. Comput. Pract. Exp. (2018). https://doi.org/10.1002/cpe.5089
Hamming, R.W.: Coding and Information Theory, 2nd edn. Prentice-Hall Inc., Upper Saddle River (1986)
Hankerson, D., Menezes, A.J., Vanstone, S.: Guide to Elliptic Curve Cryptography. SPC. Springer, New York (2004). https://doi.org/10.1007/B97644
Itoh, T., Tsujii, S.: A fast algorithm for computing multiplicative inverses in GF(\(2^m\)) using normal bases. Inf. Comput. 78(3), 171–177 (1988). https://doi.org/10.1016/0890-5401(88)90024-7
Jevdjic, D., Strauss, K., Ceze, L., Malvar, H.S.: Approximate storage of compressed and encrypted videos, vol. 45, pp. 361–373. ACM, New York, April 2017. https://doi.org/10.1145/3093337.3037718
Kou, Y., Xu, J., Tang, H., Lin, S., Abdel-Ghaffar, K.: On circulant low density parity check codes. In: Proceedings IEEE International Symposium on Information Theory, p. 200 (2002). https://doi.org/10.1109/ISIT.2002.1023472
Maurich, I.V., Oder, T., Güneysu, T.: Implementing QC-MDPC McEliece encryption. ACM Trans. Embed. Comput. Syst. 14(3), 44:1–44:27 (2015). https://doi.org/10.1145/2700102
McEliece, R.J.: A public-key cryptosystem based on algebraic coding theory. Deep Space Network Progress Report 44, pp. 114–116 (1978)
Misoczki, R., Tillich, J.P., Sendrier, N., Barreto, P.S.L.M.: MDPC-McEliece: new McEliece variants from moderate density parity-check codes. In: 2013 IEEE International Symposium on Information Theory, pp. 2069–2073, July 2013. https://doi.org/10.1109/ISIT.2013.6620590
Monico, C., Rosenthal, J., Shokrollahi, A.: Using low density parity check codes in the McEliece cryptosystem. In: 2000 IEEE International Symposium on Information Theory, p. 215 (2000). https://doi.org/10.1109/ISIT.2000.866513
Niederreiter, H.: Knapsack-type cryptosystems and algebraic coding theory. Prob. Control Inf. Theory 15(2), 159–166 (1986)
NIST: Submission requirements and evaluation criteria for the post-quantum cryptography standardization process. NIST web page (2016). http://csrc.nist.gov/groups/ST/post-quantum-crypto/documents/call-for-proposals-final-dec-2016.pdf
Rabin, M.O.: Probabilistic algorithm for testing primality. J. Number Theory 12(1), 128–138 (1980). https://doi.org/10.1016/0022-314X(80)90084-0
Rivest, R.L., Shamir, A., Adleman, L.: A method for obtaining digital signatures and public-key cryptosystems. Commun. ACM 21(2), 120–126 (1978). https://doi.org/10.1145/359340.359342
Rossi, M., Hamburg, M., Hutter, M., Marson, M.E.: A side-channel assisted cryptanalytic attack against QcBits. In: Fischer, W., Homma, N. (eds.) CHES 2017. LNCS, vol. 10529, pp. 3–23. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-66787-4_1
Shor, P.W.: Polynomial-time algorithms for prime factorization and discrete logarithms on a quantum computer. SIAM J. Comput. 26(5), 1484–1509 (1997). https://doi.org/10.1137/S0097539795293172
Shoup, V.: Number Theory C++ Library (NTL) (2003)
Stanley, R.P.: What Is Enumerative Combinatorics?. The Wadsworth & Brooks/Cole Mathematics Series, vol. 1, pp. 1–63. Springer, Boston (1986). https://doi.org/10.1007/978-1-4615-9763-6_1
Stein, J.: Computational problems associated with Racah algebra. J. Comput. Phys. 1(3), 397–405 (1967). https://doi.org/10.1016/0021-9991(67)90047-2
Strenzke, F., Tews, E., Molter, H.G., Overbeck, R., Shoufan, A.: Side channels in the McEliece PKC. In: Buchmann, J., Ding, J. (eds.) PQCrypto 2008. LNCS, vol. 5299, pp. 216–229. Springer, Heidelberg (2008). https://doi.org/10.1007/978-3-540-88403-3_15
Wu, C.H., Wu, C.M., Shieh, M.D., Hwang, Y.T.: High-speed, low-complexity systolic designs of novel iterative division algorithms in GF(\(2^m\)). IEEE Trans. Comput. 53(3), 375–380 (2004). https://doi.org/10.1109/TC.2004.1261843
Acknowledgements
We would like to thank CNPq, Intel, and the São Paulo Research Foundation (FAPESP) for supporting this research under grants 313012/2017-2, 2014/50704-7 and 2013/08293-7. We also would like to thank Microsoft for providing the cloud infrastructure for our experiments with state-of-the-art microprocessors. And, finally, we would like to thank Professor Julio López for the useful comments and ideas.
Appendices
A Implementing Conditional Statements (Ifs) in Constant-Time
Listing 1 shows an example of a non-constant-time conditional operation. Assuming that A and B are secret data, this implementation is vulnerable to timing side-channel attacks. Listing 2 shows the equivalent constant-time implementation (assuming that Function1 and Function2 have no side effects). The results of the operations on 64-bit integers (uint64_t) are conditionally selected through a bit-wise AND with the mask cond. When using AVX-512 registers, the implementation of conditional operations is significantly simpler: the AVX-512 instruction set extension already provides masked versions of most of its instructions, so we simply pass the mask cond in the mask field of the corresponding intrinsics.
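Listings 1 and 2 are not reproduced in this text. As a minimal sketch of the technique described, the C code below contrasts a branching conditional with a mask-based one. The names `function1`, `function2`, `ct_eq_mask`, and the choice of `a == b` as the condition are illustrative assumptions, not the authors' code.

```c
#include <stdint.h>

/* Hypothetical side-effect-free stand-ins for Function1 and Function2. */
static uint64_t function1(uint64_t a) { return a + 1; }
static uint64_t function2(uint64_t b) { return b ^ 0xFFu; }

/* Non-constant-time: which branch runs depends on the secret inputs. */
uint64_t select_branching(uint64_t a, uint64_t b) {
    if (a == b)
        return function1(a);
    return function2(b);
}

/* Returns all ones if a == b and all zeros otherwise, without branching. */
static uint64_t ct_eq_mask(uint64_t a, uint64_t b) {
    uint64_t diff = a ^ b;
    /* (diff | -diff) has its top bit set exactly when diff != 0. */
    uint64_t nonzero = (diff | (0 - diff)) >> 63;
    return nonzero - 1; /* 0 -> 0xFFFF...FFFF, 1 -> 0 */
}

/* Constant-time: both functions always run; the mask selects the result. */
uint64_t select_constant_time(uint64_t a, uint64_t b) {
    uint64_t cond = ct_eq_mask(a, b);
    uint64_t r1 = function1(a);
    uint64_t r2 = function2(b);
    return (r1 & cond) | (r2 & ~cond);
}
```

Both functions are evaluated on every call, so the cost is the sum of the two branches. A compiler may still reintroduce branches, so the generated assembly should be inspected before relying on a sketch like this.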
B Implementing the Degree Verification Efficiently
Our modification of the Wu et al. algorithm introduces two main performance drawbacks. The first is the constant-time implementation of the function \(Smallest\_Monomial\_Degree\). For large polynomials (such as the ones used in code-based cryptography), searching the entire polynomial for the smallest monomial would be very expensive. Therefore, we search only the first E bits of the polynomial, change the If condition to test whether the result is different from E, and adjust the number of iterations to compensate for this limitation. Algorithm 11 shows this modification. Using the inverted binary representation of the polynomial (shown in Fig. 5), we can obtain the degree of the smallest monomial by counting the leading zeros of the representation. Most modern architectures perform this calculation with a single instruction. Intel, for instance, provides constant-time instructions for counting leading zeros in 32-bit words (since i386), 64-bit words (since Haswell), and 64-bit lanes of SIMD registers (AVX-512). Other architectures provide equivalent or complementary operations, such as a rounded binary logarithm or a trailing-zero count, which may require modifications to the polynomial representation but, ultimately, would not impact performance.
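The windowed search can be sketched in portable C as follows. This is an illustration under assumptions, not Algorithm 11 itself: the layout (coefficient of \(x^0\) in the most significant bit of word 0) is our reading of the inverted representation, and `clz64_ct`, `WINDOW_BITS`, and `smallest_monomial_degree` are hypothetical names. A branch-free software leading-zero count stands in for the hardware instruction so that a zero word is well defined and yields E.

```c
#include <stdint.h>

#define WINDOW_BITS 64u /* plays the role of the parameter E */

/* Branch-free leading-zero count of a 64-bit word; returns 64 for w == 0. */
static unsigned clz64_ct(uint64_t w) {
    /* Smear the highest set bit into every lower position. */
    w |= w >> 1;
    w |= w >> 2;
    w |= w >> 4;
    w |= w >> 8;
    w |= w >> 16;
    w |= w >> 32;
    /* After inversion, the set bits are exactly the leading zeros. */
    w = ~w;
    /* SWAR population count. */
    w = w - ((w >> 1) & 0x5555555555555555ULL);
    w = (w & 0x3333333333333333ULL) + ((w >> 2) & 0x3333333333333333ULL);
    w = (w + (w >> 4)) & 0x0F0F0F0F0F0F0F0FULL;
    return (unsigned)((w * 0x0101010101010101ULL) >> 56);
}

/* Assumed layout: poly[0] holds x^0..x^63, with x^0 in the top bit.
 * Returns the degree of the smallest monomial among the first
 * WINDOW_BITS coefficients, or WINDOW_BITS if all of them are zero. */
unsigned smallest_monomial_degree(const uint64_t *poly) {
    return clz64_ct(poly[0]);
}
```

A caller would compare the result against `WINDOW_BITS` to decide whether a monomial was found inside the window. On GCC and Clang, `__builtin_clzll` usually compiles to a single instruction, but its result is undefined for a zero argument, which is why the sketch avoids it.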
The second drawback in our version is the divisions. In the original algorithm, the divisor was always x. We modified it to \(x^b\), where \(0 < b \le E\). The execution time of a constant-time division is usually determined by the upper bound of the divisor; thus, the parameter E also acts as a trade-off between the number of iterations and the cost of each iteration. Fortunately, it is easy to optimize its value in our case. Using SIMD registers on the Intel architecture, dividing by x or by \(x^{64}\) takes the same time, while greater exponents require much more expensive instructions to move bits across the SIMD lanes. We therefore choose \(E = 64\), which also helps the implementation of the function \(Smallest\_Monomial\_Degree\).
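To illustrate why exponents up to 64 are cheap, the scalar C sketch below divides a multi-word polynomial by \(x^b\) for \(0 < b \le 64\). It is not the SIMD implementation: the layout (coefficient of \(x^0\) in the most significant bit of word 0), the function name `div_by_x_pow`, and the precondition that the b lowest-degree coefficients are already zero are all assumptions made for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Divides poly by x^b in place, for 1 <= b <= 64.
 * Assumed layout: word i holds x^(64i)..x^(64i+63), lowest degree in the
 * top bit, so the division is a shift toward the top of word 0.
 * Assumed precondition: the b lowest-degree coefficients are zero. */
void div_by_x_pow(uint64_t *poly, size_t nwords, unsigned b) {
    for (size_t i = 0; i < nwords; i++) {
        /* The loop index is public, so this selection leaks nothing. */
        uint64_t next = (i + 1 < nwords) ? poly[i + 1] : 0;
        /* Split the shift in two so that b == 64 never shifts by 64,
         * which would be undefined behaviour in C. */
        uint64_t kept = (poly[i] << (b - 1)) << 1;
        uint64_t carried = next >> (64 - b);
        poly[i] = kept | carried;
    }
}
```

Each output word depends only on two adjacent input words regardless of b, so the same instruction sequence runs for every exponent in the range. A larger bound would require combining words that are further apart.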
C Proof of Eq. 2
Proposition 1
In Eq. 2, if \(P(r_{j,0} = 1) \le 0.5\) then \(P(r_{j,i} = 1) \le 0.5\) for all \(i \ge 0\).
Proof
We prove it using induction on i.
Base case: If \(i = 0\), then \(P(r_{j,i} = 1) = P(r_{j,0} = 1) \le 0.5\).
Inductive Hypothesis: If \(P(r_{j,0} = 1) \le 0.5\) then \(P(r_{j,i} = 1) \le 0.5\) for all \(1 \le i < n\).
Inductive Step: Let \(i = n\).
Knowing that
We have
Let f(X, Y, Z) be the value of \(P(r_{j,n} = 1)\) for \(X = P(r_{0, n-1} = 1)\), \(Y = P(r_{j+1, n-1} = 1)\) and \(Z = P(s_{j+1, n-1} = 1)\). By our inductive hypothesis, \(0 \le P(r_{0, n-1} = 1) \le 0.5\) and \(0 \le P(r_{j+1, n-1} = 1) \le 0.5\). We could obtain a tighter interval for \(P(s_{j+1, n-1} = 1)\) addressing its own recurrence relation, but that is not necessary. Thus, we consider \(0 \le P(s_{j+1, n-1} = 1) \le 1\). To maximize the value of f in these intervals, we first check the boundaries:
The next step would be to search for a local maximum in the interior of these intervals, which clearly does not exist since f is linear in each of its variables.
Copyright information
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Guimarães, A., Borin, E., Aranha, D.F. (2019). Introducing Arithmetic Failures to Accelerate QC-MDPC Code-Based Cryptography. In: Baldi, M., Persichetti, E., Santini, P. (eds) Code-Based Cryptography. CBC 2019. Lecture Notes in Computer Science, vol 11666. Springer, Cham. https://doi.org/10.1007/978-3-030-25922-8_3
DOI: https://doi.org/10.1007/978-3-030-25922-8_3
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-25921-1
Online ISBN: 978-3-030-25922-8
eBook Packages: Computer Science, Computer Science (R0)