
Fast Implementation of Curve25519 Using AVX2

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 9230)

Abstract

AVX2 is the newest instruction set on the Intel Haswell processor; it provides simultaneous execution of operations over vectors of 256 bits. This work presents advances in applying AVX2 to the development of an efficient software implementation of the elliptic curve Diffie-Hellman protocol using the Curve25519 elliptic curve. We also discuss some advantages that vector instructions offer as an alternative method to accelerate prime field and elliptic curve arithmetic. The performance of our implementation shows a slight improvement over the fastest state-of-the-art implementations.

Armando Faz-Hernández and Julio López were partially supported by the Intel Labs University Research Office.

Julio López was partially supported by FAPESP, Projeto Temático grant number 2013/25.977-7.


References

  1. Aranha, D.F., Gouvêa, C.P.L.: RELIC is an Efficient LIbrary for Cryptography. http://code.google.com/p/relic-toolkit/

  2. Aranha, D.F., Barreto, P.S.L.M., Pereira, G.C.C.F., Ricardini, J.E.: A note on high-security general-purpose elliptic curves. Cryptology ePrint Archive, Report 2013/647 (2013). http://eprint.iacr.org/

  3. Bernstein, D.J.: Curve25519: new Diffie-Hellman speed records. In: Yung, M., Dodis, Y., Kiayias, A., Malkin, T. (eds.) PKC 2006. LNCS, vol. 3958, pp. 207–228. Springer, Heidelberg (2006)


  4. Bernstein, D.J.: Cryptography in NaCl, March 2009. http://cr.yp.to/highspeed/naclcrypto-20090310.pdf

  5. Bernstein, D.J.: DNSCurve: usable security for DNS, June 2009. http://dnscurve.org

  6. Bernstein, D.J., Lange, T.: eBACS: ECRYPT benchmarking of cryptographic systems, March 2015. http://bench.cr.yp.to/supercop.html. Accessed 20 March 2015

  7. Bernstein, D.J., Lange, T.: SafeCurves: choosing safe curves for elliptic-curve cryptography (2015). http://safecurves.cr.yp.to. Accessed 20 March 2015

  8. Bernstein, D.J., Lange, T., Schwabe, P.: NaCl: Networking and Cryptography library, October 2013. http://nacl.cr.yp.to/

  9. Bernstein, D.J., Schwabe, P.: NEON Crypto. In: Prouff, E., Schaumont, P. (eds.) CHES 2012. LNCS, vol. 7428, pp. 320–339. Springer, Heidelberg (2012). http://dx.doi.org/10.1007/978-3-642-33027-8_19


  10. Bos, J.W., Costello, C., Longa, P., Naehrig, M.: Selecting Elliptic Curves for Cryptography: An Efficiency and Security Analysis. Cryptology ePrint Archive, Report 2014/130 (2014). http://eprint.iacr.org/

  11. Cohen, H., Frey, G., Avanzi, R., Doche, C., Lange, T., Nguyen, K., Vercauteren, F.: Handbook of Elliptic and Hyperelliptic Curve Cryptography, (2nd edn). Chapman & Hall/CRC (2012)


  12. Intel Corporation: Intel Pentium processor with MMX technology documentation, January 2008. http://www.intel.com/design/archives/Processors/mmx/

  13. Intel Corporation: Define SSE2, SSE3 and SSE4, January 2009. http://www.intel.com/support/processors/sb/CS-030123.htm

  14. Intel Corporation: Intel Advanced Vector Extensions Programming Reference, June 2011. https://software.intel.com/sites/default/files/m/f/7/c/36945

  15. Fog, A.: Instruction tables: Lists of instruction latencies, throughputs and micro-operation breakdowns for Intel, AMD and VIA CPUs, December 2014


  16. Granger, R., Scott, M.: Faster ECC over \(\mathbb{F}_{2^{521}-1}\). Cryptology ePrint Archive, Report 2014/852 (2014). http://eprint.iacr.org/

  17. Granlund, T., the GMP development team: GNU MP: The GNU Multiple Precision Arithmetic Library, (5.0.5 edn) (2012). http://gmplib.org/

  18. Itoh, T., Tsujii, S.: A fast algorithm for computing multiplicative inverses in GF\((2^m)\) using normal bases. Inf. Comput. 78(3), 171–177 (1988). http://dx.doi.org/10.1016/0890-5401(88)90024-7

  19. Montgomery, P.L.: Speeding the Pollard and elliptic curve methods of factorization. Math. Comput. 48(177), 243–264 (1987). http://dx.doi.org/10.2307/2007888

  20. National Institute of Standards and Technology: Digital Signature Standard (DSS). FIPS Publication 186, May 1994. http://www.bibsonomy.org/bibtex/2a98c67565fa98cc7c90d7d622c1ad252/dret

  21. Shell, O.S.: OpenSSH, January 2014. http://www.openssh.com/txt/release-6.5

  22. Solinas, J.A.: Generalized Mersenne Numbers. Technical report, Center of Applied Cryptographic Research (CACR) (1999)

  23. The OpenSSL Project: OpenSSL: The Open Source toolkit for SSL/TLS, April 2003. www.openssl.org


Acknowledgments

The authors would like to thank the anonymous reviewers for their helpful suggestions and comments. Additionally, they would like to show their gratitude to Jérémie Detrey for his valuable comments on an earlier version of the manuscript.

Author information

Corresponding author

Correspondence to Armando Faz-Hernández.

Appendices

A Relevant AVX2 Instructions

This appendix lists the most relevant instructions used in this work, grouped by functionality for clarity. In Table 4, the second column gives the mnemonic used in this document, the third column gives the assembler name of the instruction, and the last columns give the latency and the reciprocal throughput of each instruction. These entries were taken from Agner Fog's measurements published in [15].

figure e

B Algorithms

B.1 Implementation of Modular Squaring Using AVX2

To compute the modular squaring, we follow an approach similar to the one used for modular multiplication. Algorithm 4 shows the scheduling of instructions used to compute the modular squaring of an interleaved tuple \(\langle \mathbf {A},\mathbf {B}\rangle \). The products \(a_{x,y}\) such that \(\nu _{x,y}=2\) are computed in the inner loops (lines 12 to 15 and 20 to 23); once these products have been accumulated, they are multiplied by 2 using shift instructions. Finally, lines 26 to 29 compute the modular reduction.

figure f
figure g
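The doubling idea described above can be illustrated with a scalar sketch. It is not Algorithm 4: it uses no AVX2 instructions and no interleaved tuple, and it assumes a ten-coefficient radix-\(2^{25.5}\) representation modulo \(p = 2^{255}-19\) (the names `EXP`, `to_limbs`, `from_limbs`, and `square` are hypothetical). Cross products \(a_i a_j\) with \(i<j\) occur twice in a square, so they are accumulated once and doubled by a single shift at the end.

```python
P = 2**255 - 19
# Assumed limb positions ceil(25.5 * i): limbs alternate between 26 and 25 bits.
EXP = [0, 26, 51, 77, 102, 128, 153, 179, 204, 230]

def to_limbs(x):
    """Split an integer 0 <= x < 2^255 into 10 limbs at the positions in EXP."""
    ends = EXP[1:] + [255]
    return [(x >> lo) & ((1 << (hi - lo)) - 1) for lo, hi in zip(EXP, ends)]

def from_limbs(limbs):
    """Recombine limbs (of any size) into an integer modulo P."""
    return sum(v << EXP[i] for i, v in enumerate(limbs)) % P

def square(a):
    """Square a 10-limb element; cross terms are doubled once, after accumulation."""
    diag = [0] * 10   # products a_i * a_i, which occur once
    cross = [0] * 10  # products a_i * a_j with i < j, which occur twice
    for i in range(10):
        for j in range(i, 10):
            t = a[i] * a[j]
            if (i & 1) and (j & 1):
                t <<= 1          # odd*odd limbs carry an extra factor 2 in this radix
            k = i + j
            if k >= 10:
                k -= 10
                t *= 19          # 2^255 = 19 (mod P)
            if i == j:
                diag[k] += t
            else:
                cross[k] += t
    # Doubling by shift, applied to the accumulated cross products.
    return [d + (c << 1) for d, c in zip(diag, cross)]
```

For example, `from_limbs(square(to_limbs(x)))` should equal `x * x % P` for any `x` below `P`. The output limbs are left unreduced; a coefficient reduction step would follow in a real implementation.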

B.2 Implementation of Coefficient Reduction Using AVX2

The coefficient reduction is processed coefficient-wise. We split each coefficient into three parts \(a_i=h_i\parallel m_i\parallel l_i\) and apply the process described in Sect. 3.2. Simultaneously, each \(m_i\) (medium coefficient) is added to the corresponding \(l_{i+1}\) (low coefficient) and to \(h_{i-1}\) (high coefficient). For those coefficients that need to be reduced modulo p, we compute the multiplication by c using only shift instructions. After the coefficient reduction, each coefficient in the updated tuple has at most \(\beta _i+1\) bits.
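The three-way split can be sketched in scalar form as follows. This is only an illustration under stated assumptions, not the procedure of Sect. 3.2: the limb widths in `BITS` (alternating 26 and 25 bits) stand in for \(\beta_i\), \(c = 19\) follows from \(p = 2^{255}-19\), and all function names are hypothetical. Every coefficient is split independently, which is what makes the step amenable to vector instructions; the sketch does not attempt to reproduce the \(\beta_i+1\) bound stated above.

```python
P = 2**255 - 19
BITS = [26, 25] * 5                          # assumed limb widths beta_i
EXP = [sum(BITS[:i]) for i in range(10)]     # bit position of each limb

def times19(x):
    """Multiply by c = 19 using only shifts and additions: 19x = 16x + 2x + x."""
    return (x << 4) + (x << 1) + x

def reduce_coefficients(a):
    """One pass: split each a_i into h_i || m_i || l_i and redistribute the parts."""
    out = [0] * 10
    for i, v in enumerate(a):
        b0 = BITS[i]
        b1 = BITS[(i + 1) % 10]
        low = v & ((1 << b0) - 1)            # l_i stays in position i
        mid = (v >> b0) & ((1 << b1) - 1)    # m_i moves to position i + 1
        high = v >> (b0 + b1)                # h_i moves to position i + 2
        out[i] += low
        for part, k in ((mid, i + 1), (high, i + 2)):
            # Parts that wrap past the top limb are multiplied by c (2^255 = 19 mod P).
            out[k % 10] += times19(part) if k >= 10 else part
    return out

def value(limbs):
    """Integer represented by the limbs, modulo P."""
    return sum(v << EXP[i] for i, v in enumerate(limbs)) % P
```

The represented value is unchanged: `value(reduce_coefficients(a)) == value(a)` for any list of ten non-negative coefficients.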

B.3 Point Multiplication Using Montgomery Ladder

Algorithm 6 shows the Montgomery point multiplication, which computes the x-coordinate of kP given the x-coordinate of P and an integer scalar k. This algorithm also requires the ladder step presented in Algorithm 1.
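As a rough illustration of such an x-only ladder, the following scalar sketch uses Python integers and the widely used projective ladder-step formulas for Curve25519 (curve constant 486662, hence \((486662-2)/4 = 121665\)). It is not a transcription of Algorithm 6 or Algorithm 1; the names `ladder_step` and `x_mult` are hypothetical, and the data-dependent branch used for the swap is not constant-time.

```python
P = 2**255 - 19
A24 = 121665  # (486662 - 2) / 4 for Curve25519

def ladder_step(x1, x2, z2, x3, z3):
    """Differential addition and doubling on projective x-coordinates."""
    a = (x2 + z2) % P
    b = (x2 - z2) % P
    aa = a * a % P
    bb = b * b % P
    e = (aa - bb) % P
    da = (x3 - z3) * a % P
    cb = (x3 + z3) * b % P
    x3n = (da + cb) ** 2 % P           # x of the sum
    z3n = x1 * (da - cb) ** 2 % P      # z of the sum
    x2n = aa * bb % P                  # x of the double
    z2n = e * (aa + A24 * e) % P       # z of the double
    return x2n, z2n, x3n, z3n

def x_mult(k, x1, bits=255):
    """Return the x-coordinate of kP given x1 = x(P), scanning k from the top bit."""
    x2, z2, x3, z3 = 1, 0, x1, 1       # (point at infinity, P)
    swap = 0
    for t in reversed(range(bits)):
        bit = (k >> t) & 1
        swap ^= bit
        if swap:                       # illustration only: a real implementation
            x2, z2, x3, z3 = x3, z3, x2, z2   # would use a constant-time swap
        swap = bit
        x2, z2, x3, z3 = ladder_step(x1, x2, z2, x3, z3)
    if swap:
        x2, z2, x3, z3 = x3, z3, x2, z2
    return x2 * pow(z2, P - 2, P) % P  # affine x = X / Z via Fermat inversion
```

With the base point x-coordinate 9, two parties holding scalars `a` and `b` obtain the same shared value, since `x_mult(a, x_mult(b, 9)) == x_mult(b, x_mult(a, 9))`.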

For use in the elliptic curve Diffie-Hellman protocol with Curve25519, the document [4] describes an encoding for the secret key when it is given as a string of bytes. Accordingly, the description of Algorithm 6 assumes that the secret key has already been encoded.
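A minimal sketch of this encoding follows, assuming it is the usual Curve25519 secret-key clamping on a 32-byte little-endian string: clear the three lowest bits, clear the top bit, and set the second-highest bit. The function name `encode_secret_key` is hypothetical.

```python
def encode_secret_key(raw: bytes) -> int:
    """Clamp a 32-byte little-endian secret key and return it as an integer scalar."""
    if len(raw) != 32:
        raise ValueError("secret key must be exactly 32 bytes")
    k = bytearray(raw)
    k[0] &= 248    # clear bits 0..2: the scalar becomes a multiple of 8
    k[31] &= 127   # clear bit 255
    k[31] |= 64    # set bit 254: every scalar has the same bit length
    return int.from_bytes(k, "little")
```

Every encoded scalar is a multiple of 8 in the range \([2^{254}, 2^{255})\), so a ladder over it always processes the same number of bits.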

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Faz-Hernández, A., López, J. (2015). Fast Implementation of Curve25519 Using AVX2. In: Lauter, K., Rodríguez-Henríquez, F. (eds) Progress in Cryptology - LATINCRYPT 2015. LATINCRYPT 2015. Lecture Notes in Computer Science, vol 9230. Springer, Cham. https://doi.org/10.1007/978-3-319-22174-8_18


  • DOI: https://doi.org/10.1007/978-3-319-22174-8_18


  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-22173-1

  • Online ISBN: 978-3-319-22174-8

  • eBook Packages: Computer Science, Computer Science (R0)
