Journal of Cryptographic Engineering

, Volume 6, Issue 3, pp 171–185 | Cite as

Faster 64-bit universal hashing using carry-less multiplications

Regular Paper


Intel and AMD support the carry-less multiplication (CLMUL) instruction set in their x64 processors. We use CLMUL to implement an almost universal 64-bit hash family (CLHASH). We compare this new family with what might be the fastest almost universal family on x64 processors (VHASH). We find that CLHASH is at least 60 % faster. We also compare CLHASH with a popular hash function designed for speed (Google’s CityHash). We find that CLHASH is 40 % faster than CityHash on inputs larger than 64 bytes and just as fast otherwise.


Universal hashing Carry-less multiplication Finite field arithmetic 



This work was supported by the National Research Council of Canada, under Grant 26143.


  1. 1.
    Appleby, A.: SMHasher & MurmurHash (2012). Last checked March 2015
  2. 2.
    ARM Limited: ARMv8 architecture reference manual (2014). Last checked March 2015
  3. 3.
    Aumasson, J.P., Bernstein, D.J.: SipHash: a fast short-input PRF. In: Galbraith, S., Nandi, M. (eds.) Progress in Cryptology (INDOCRYPT 2012). Lecture Notes in Computer Science, vol. 7668, pp. 489–508. Springer, Berlin (2012). doi: 10.1007/978-3-642-34931-7_28
  4. 4.
    Aumasson, J.P., Bernstein, D.J.: SipHash: high-speed pseudorandom function (reference code) (2014). Last checked Nov 2014
  5. 5.
    Barrett, P.: Implementing the rivest shamir and adleman public key encryption algorithm on a standard digital signal processor. In: Odlyzko, A.M. (ed.) Advances in Cryptology (CRYPTO’ 86). Lecture Notes in Computer Science, vol. 263, pp. 311–323. Springer, Berlin (1987). doi: 10.1007/3-540-47721-7_24
  6. 6.
    Bernstein, D.J.: The Poly1305-AES message-authentication code. In: Fast Software Encryption. Lecture Notes in Computer Science, vol. 3557, pp. 32–49. Springer, Berlin (2005). doi: 10.1007/11502760_3
  7. 7.
    Black, J., Halevi, S., Krawczyk, H., Krovetz, T., Rogaway, P.: UMAC: fast and secure message authentication. In: Wiener, M. (ed.) Advances in Cryptology (CRYPTO’ 99). Lecture Notes in Computer Science, vol. 1666, pp. 216–233. Springer, Berlin (1999). doi: 10.1007/3-540-48405-1_14
  8. 8.
    Bluhm, M., Gueron, S.: Fast software implementation of binary elliptic curve cryptography. Tech. rep, Cryptology ePrint Archive (2013)Google Scholar
  9. 9.
    Bos, J.W., Özen, O., Stam, M.: Efficient hashing using the AES instruction set. In: Proceedings of the 13th International Conference on Cryptographic Hardware and Embedded Systems (CHES’11), pp. 507–522. Springer, Berlin (2011)Google Scholar
  10. 10.
    Carter, J.L., Wegman, M.N.: Universal classes of hash functions. J. Comput. System Sci. 18(2), 143–154 (1979). doi: 10.1016/0022-0000(79)90044-8 MathSciNetCrossRefMATHGoogle Scholar
  11. 11.
    Cormen, T.H., Leiserson, C.E., Rivest, R.L., Stein, C.: Introduction to Algorithms, 3rd edn, 3rd edn. The MIT Press, Cambridge (2009)MATHGoogle Scholar
  12. 12.
    Dai, W., Krovetz, T.: VHASH security. Tech. Rep. 338, IACR Cryptology ePrint Archive (2007)Google Scholar
  13. 13.
    Estébanez, C., Hernandez-Castro, J.C., Ribagorda, A., Isasi, P.: Evolving hash functions by means of genetic programming. In: Proceedings of the 8th Annual Conference on Genetic and Evolutionary Computation, pp. 1861–1862. ACM, New York (2006)Google Scholar
  14. 14.
    Etzel, M., Patel, S., Ramzan, Z.: Square hash: fast message authentication via optimized universal hash functions. In: Wiener, M. (ed.) Advances in Cryptology (CRYPTO’ 99). Lecture Notes in Computer Science, vol. 1666, pp. 234–251. Springer, Berlin (1999). doi: 10.1007/3-540-48405-1_15
  15. 15.
    Fan, B., Andersen, D.G., Kaminsky, M., Mitzenmacher, M.D.: Cuckoo filter: practically better than Bloom. In: Proceedings of the 10th ACM International on Conference on Emerging Networking Experiments and Technologies (CoNEXT ’14), pp. 75–88. ACM, New York (2014). doi: 10.1145/2674005.2674994
  16. 16.
    Fog, A.: Instruction tables: lists of instruction latencies, throughputs and micro-operation breakdowns for Intel, AMD and VIA CPUs. Tech. rep., Copenhagen University College of Engineering (2014). Last checked March 2015
  17. 17.
    Gueron, S., Kounavis, M.: Efficient implementation of the Galois Counter Mode using a carry-less multiplier and a fast reduction algorithm. Inf. Process. Lett. 110(14), 549–553 (2010). doi: 10.1016/j.ipl.2010.04.011 MathSciNetCrossRefMATHGoogle Scholar
  18. 18.
    Halevi, S., Krawczyk, H.: MMH: software message authentication in the Gbit/second rates. In: Biham, E. (ed.) Fast Software Encryption. Lecture Notes in Computer Science, vol. 1267, pp. 172–189. Springer, Berlin (1997). doi: 10.1007/BFb0052345
  19. 19.
    Intel Corporation: Intel IACA tool: a static code analyser (2012). Last checked March 2015
  20. 20.
    Intel Corporation: Power ISA Version 2.07 (2013). Last checked March 2015
  21. 21.
    Intel Corporation: Power ISA Version 2.07 (2014). Last checked March 2015
  22. 22.
    Knežević, M., Sakiyama, K., Fan, J., Verbauwhede, I.: Modular reduction in \(GF(2^n)\) without pre-computational phase. In: von zur Gathen, J., Imaña, J.L., Koç, C.K. (eds.) Arithmetic of Finite Fields. Lecture Notes in Computer Science, vol. 5130, pp. 77–87. Springer, Berlin (2008). doi: 10.1007/978-3-540-69499-1_7
  23. 23.
    Knuth, D.E.: Searching and Sorting. The Art of Computer Programming, vol. 3. Addison-Wesley, Reading (1997)MATHGoogle Scholar
  24. 24.
    Krovetz, T.: Message authentication on 64-bit architectures. In: Selected Areas in Cryptography. Lecture Notes in Computer Science, vol. 4356, pp. 327–341. Springer, Berlin (2007). doi: 10.1007/978-3-540-74462-7_23
  25. 25.
    Krovetz, T., Dai, W.: VMAC and VHASH implementation (2007). Last checked March 2015
  26. 26.
    Lemire, D., Kaser, O.: Strongly universal string hashing is fast. Comput. J. 57(11), 1624–1638 (2014). doi: 10.1093/comjnl/bxt070 CrossRefGoogle Scholar
  27. 27.
    Lim, H., Han, D., Andersen, D.G., Kaminsky, M.: Mica: a holistic approach to fast in-memory key-value storage. In: Proceedings of the 11th USENIX Conference on Networked Systems Design and Implementation (NSDI’14), pp. 429–444. USENIX Association, Berkeley (2014)Google Scholar
  28. 28.
    Matsumoto, M., Nishimura, T.: Mersenne twister: a 623-dimensionally equidistributed uniform pseudo-random number generator. ACM Trans. Model. Comput. Simul. 8(1), 3–30 (1998). doi: 10.1145/272991.272995 CrossRefMATHGoogle Scholar
  29. 29.
    Motzkin, T.S.: Evaluation of polynomials and evaluation of rational functions. Bull. Am. Math. Soc. 61(9), 163 (1955)Google Scholar
  30. 30.
    Mullen, G.L., Panario, D.: Handbook of Finite Fields, 1st edn. Chapman & Hall/CRC, London (2013)CrossRefMATHGoogle Scholar
  31. 31.
    Nguyen, L.H., Roscoe, A.W.: New combinatorial bounds for universal hash functions. Tech. Rep. 153, Cryptology ePrint Archive (2009)Google Scholar
  32. 32.
    Oliveira, T., Aranha, D.F., López, J., Rodríguez-Henríquez, F.: Fast point multiplication algorithms for binary elliptic curves with and without precomputation. In: Joux, A., Youssef, A. (eds.) Selected Areas in Cryptography (SAC 2014). Lecture Notes in Computer Science, pp. 324–344. Springer International Publishing, Switzerland (2014). doi: 10.1007/978-3-319-13051-4_20
  33. 33.
    Oliveira, T., López, J., Aranha, D.F., Rodríguez-Henríquez, F.: Two is the fastest prime: lambda coordinates for binary elliptic curves. J. Cryptogr. Eng. 4(1), 3–17 (2014). doi: 10.1007/s13389-013-0069-z CrossRefGoogle Scholar
  34. 34.
    Paoloni, G.: How to Benchmark Code Execution Times on Intel IA-32 and IA-64 Instruction Set Architectures. Intel Corporation, Santa Clara (2010)Google Scholar
  35. 35.
    Pike, G., Alakuijala, J.: The CityHash family of hash functions (2011). Last checked March 2015
  36. 36.
    Stinson, D.R.: Universal hashing and authentication codes. Des. Codes Cryptogr. 4(4), 369–380 (1994). doi: 10.1007/BF01388651 MathSciNetCrossRefMATHGoogle Scholar
  37. 37.
    Stinson, D.R.: On the connections between universal hashing, combinatorial designs and error-correcting codes. Congr. Numer. 114, 7–28 (1996)MathSciNetMATHGoogle Scholar
  38. 38.
    Su, C., Fan, H.: Impact of Intel’s new instruction sets on software implementation of \(GF (2)[x]\) multiplication. Inf. Process. Lett. 112(12), 497–502 (2012). doi: 10.1016/j.ipl.2012.03.012 MathSciNetCrossRefMATHGoogle Scholar
  39. 39.
    Taverne, J., Faz-Hernández, A., Aranha, D.F., Rodríguez-Henríquez, F., Hankerson, D., López, J.: Speeding scalar multiplication over binary elliptic curves using the new carry-less multiplication instruction. J. Cryptogr. Eng. 1(3), 187–199 (2011). doi: 10.1007/s13389-011-0017-8 CrossRefMATHGoogle Scholar

Copyright information

© Springer-Verlag Berlin Heidelberg 2015

Authors and Affiliations

  1. 1.LICEF Research Center, TELUQ, Université du QuébecMontrealCanada
  2. 2.Department of CSASUniversity of New BrunswickSaint JohnCanada

Personalised recommendations