Fast Software Polynomial Multiplication on ARM Processors Using the NEON Engine

  • Danilo Câmara
  • Conrado P. L. Gouvêa
  • Julio López
  • Ricardo Dahab
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8128)


Efficient algorithms for binary field operations are required in several cryptographic operations, such as digital signatures over binary elliptic curves and encryption. The main performance-critical operation in these fields is multiplication, since most processors do not provide instructions for polynomial multiplication. In this paper we describe a novel software multiplier for two 64-bit binary polynomials based on the VMULL instruction included in the NEON engine supported in many ARM processors. This multiplier is then used as a building block to obtain a fast software multiplication in the binary field \(\mathbb{F}_{2^m}\), which is up to 45% faster than the best known algorithm. We also illustrate the performance improvement in point multiplication on binary elliptic curves using the new multiplier, improving the performance of standard NIST curves at the 128- and 256-bit levels of security. The impact on the GCM authenticated encryption scheme is also studied, with new speed records. We present timing results of our software implementation on the ARM Cortex-A8, A9 and A15 processors.
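To make the central operation concrete, the following is a minimal Python sketch of what multiplying two 64-bit binary polynomials means. It is not the multiplier described in the paper: it is a plain schoolbook combination of 8-bit by 8-bit carry-less products, the size of product that the polynomial form of VMULL computes in each of its eight lanes. The function names `clmul8`, `clmul64` and `clmul_ref` are hypothetical and chosen only for illustration.

```python
def clmul8(a, b):
    """Carry-less product of two 8-bit polynomials over GF(2).

    Models a single lane of an 8x8 -> 16-bit polynomial multiply:
    partial products are combined with XOR instead of addition.
    """
    assert 0 <= a < 256 and 0 <= b < 256
    r = 0
    for i in range(8):
        if (b >> i) & 1:
            r ^= a << i
    return r


def clmul64(a, b):
    """Schoolbook 64x64 -> 127-bit carry-less product from 8x8 products.

    Illustrative only: it uses all 64 byte-pair products and makes no
    attempt to mirror the instruction scheduling of a real NEON routine.
    """
    assert 0 <= a < 2**64 and 0 <= b < 2**64
    r = 0
    for i in range(8):
        ai = (a >> (8 * i)) & 0xFF
        for j in range(8):
            bj = (b >> (8 * j)) & 0xFF
            # The product of byte i and byte j lands at bit offset 8*(i+j).
            r ^= clmul8(ai, bj) << (8 * (i + j))
    return r


def clmul_ref(a, b):
    """Bit-serial reference used to cross-check clmul64."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        b >>= 1
    return r
```

For example, `clmul64(0b11, 0b111)` evaluates (x + 1)(x^2 + x + 1) = x^3 + 1 and returns `0b1001`. A multiplication in \(\mathbb{F}_{2^m}\) would additionally reduce the result modulo the field's irreducible polynomial, which this sketch leaves out.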


Keywords: binary field arithmetic, ARM NEON, elliptic curve cryptography, authenticated encryption, software implementation



Copyright information

© IFIP International Federation for Information Processing 2013

Authors and Affiliations

  • Danilo Câmara (1)
  • Conrado P. L. Gouvêa (1)
  • Julio López (1)
  • Ricardo Dahab (1)

  1. University of Campinas (Unicamp), Brazil
