Faster 64-bit universal hashing using carry-less multiplications

Lemire, Daniel; Kaser, Owen

doi:10.1007/s13389-015-0110-5

Regular Paper
Published: 04 September 2015

Journal of Cryptographic Engineering, Volume 6, pages 171–185 (2016)

Daniel Lemire¹ &
Owen Kaser²


Abstract

Intel and AMD support the carry-less multiplication (CLMUL) instruction set in their x64 processors. We use CLMUL to implement an almost universal 64-bit hash family (CLHASH). We compare this new family with what might be the fastest almost universal family on x64 processors (VHASH). We find that CLHASH is at least 60 % faster. We also compare CLHASH with a popular hash function designed for speed (Google’s CityHash). We find that CLHASH is 40 % faster than CityHash on inputs larger than 64 bytes and just as fast otherwise.
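As a rough illustration of the operation the abstract refers to, the following sketch computes a carry-less product in pure Python. A carry-less multiplication works like schoolbook binary multiplication, except that the shifted partial products are combined with XOR instead of addition, so no carries propagate. The function name `clmul` is ours and purely illustrative; this loop is neither the CLHASH implementation nor a model of how the hardware instruction is executed.

```python
def clmul(a: int, b: int) -> int:
    """Carry-less product of two non-negative integers (illustrative helper)."""
    result = 0
    shift = 0
    while b:
        if b & 1:
            # XOR in the shifted copy of a; XOR is addition without carries.
            result ^= a << shift
        b >>= 1
        shift += 1
    return result


# Ordinary multiplication gives 3 * 3 = 9 (0b1001); the carry-less
# product differs because the two middle partial-product bits cancel.
print(bin(clmul(0b11, 0b11)))  # prints 0b101
```

With two 64-bit operands the carry-less product fits in 127 bits, which is why the x64 instruction returns its result in a 128-bit register.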


Notes

The low-power AMD Jaguar microarchitecture does even better with a throughput of one cycle and a latency of three cycles.
In the present paper, \(\log n\) means \(\log _2 n\).
The general construction of a finite field of cardinality \(p^n\) for \(n>1\) is commonly explained in terms of polynomials with coefficients from \(\mathrm{GF}(p)\). To avoid unnecessary abstraction, we present finite fields of cardinality \(2^L\) using regular L-bit integers. Interested readers can see Mullen and Panario [30], for the alternative development.
This can be readily verified using a mathematical software package such as Sage or Maple.
Our benchmark software is made freely available under a liberal open-source license (https://github.com/lemire/StronglyUniversalStringHashing), and it includes the modified SMHasher as well as all the necessary software to reproduce our results.
For comparison, Dai and Krovetz reported that VHASH used 0.6 cycles per byte on an Intel Core 2 processor (Merom) [25].
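The note above on presenting finite fields of cardinality \(2^L\) with regular L-bit integers can be made concrete with a small sketch. Addition in such a field is a bitwise XOR, and multiplication is a carry-less multiplication reduced modulo an irreducible polynomial. The example below assumes \(L = 8\) and the polynomial \(x^8 + x^4 + x^3 + x + 1\) (the one used by AES, written `0x11B`); both choices are ours for illustration and are not taken from the paper, and the names `gf_add` and `gf_mul` are hypothetical.

```python
def gf_add(a: int, b: int) -> int:
    """Addition in GF(2^L): bitwise XOR of the two L-bit integers."""
    return a ^ b


def gf_mul(a: int, b: int, modulus: int = 0x11B, bits: int = 8) -> int:
    """Multiplication in GF(2^bits) on integers in [0, 2^bits).

    `modulus` encodes an irreducible polynomial of degree `bits`
    (assumed here: x^8 + x^4 + x^3 + x + 1).
    """
    result = 0
    while b:
        if b & 1:
            result ^= a          # add the current multiple of a
        b >>= 1
        a <<= 1                  # multiply a by x
        if a >> bits:            # degree reached `bits`: reduce
            a ^= modulus
    return result


print(hex(gf_mul(0x57, 0x83)))  # prints 0xc1
```

Because the reduction is interleaved with the shifts, intermediate values never exceed `bits + 1` bits; a CLMUL-based implementation would instead compute the full product first and reduce it afterwards.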

References

Appleby, A.: SMHasher & MurmurHash (2012). http://code.google.com/p/smhasher. Last checked March 2015
ARM Limited: ARMv8 architecture reference manual (2014). http://infocenter.arm.com/help/topic/com.arm.doc.subset.architecture.reference/. Last checked March 2015
Aumasson, J.P., Bernstein, D.J.: SipHash: a fast short-input PRF. In: Galbraith, S., Nandi, M. (eds.) Progress in Cryptology (INDOCRYPT 2012). Lecture Notes in Computer Science, vol. 7668, pp. 489–508. Springer, Berlin (2012). doi:10.1007/978-3-642-34931-7_28
Aumasson, J.P., Bernstein, D.J.: SipHash: high-speed pseudorandom function (reference code) (2014). https://github.com/veorq/SipHash. Last checked Nov 2014
Barrett, P.: Implementing the Rivest Shamir and Adleman public key encryption algorithm on a standard digital signal processor. In: Odlyzko, A.M. (ed.) Advances in Cryptology (CRYPTO’ 86). Lecture Notes in Computer Science, vol. 263, pp. 311–323. Springer, Berlin (1987). doi:10.1007/3-540-47721-7_24
Bernstein, D.J.: The Poly1305-AES message-authentication code. In: Fast Software Encryption. Lecture Notes in Computer Science, vol. 3557, pp. 32–49. Springer, Berlin (2005). doi:10.1007/11502760_3
Black, J., Halevi, S., Krawczyk, H., Krovetz, T., Rogaway, P.: UMAC: fast and secure message authentication. In: Wiener, M. (ed.) Advances in Cryptology (CRYPTO’ 99). Lecture Notes in Computer Science, vol. 1666, pp. 216–233. Springer, Berlin (1999). doi:10.1007/3-540-48405-1_14
Bluhm, M., Gueron, S.: Fast software implementation of binary elliptic curve cryptography. Tech. rep., Cryptology ePrint Archive (2013)
Bos, J.W., Özen, O., Stam, M.: Efficient hashing using the AES instruction set. In: Proceedings of the 13th International Conference on Cryptographic Hardware and Embedded Systems (CHES’11), pp. 507–522. Springer, Berlin (2011)
Carter, J.L., Wegman, M.N.: Universal classes of hash functions. J. Comput. System Sci. 18(2), 143–154 (1979). doi:10.1016/0022-0000(79)90044-8
Cormen, T.H., Leiserson, C.E., Rivest, R.L., Stein, C.: Introduction to Algorithms, 3rd edn. The MIT Press, Cambridge (2009)
Dai, W., Krovetz, T.: VHASH security. Tech. Rep. 338, IACR Cryptology ePrint Archive (2007)
Estébanez, C., Hernandez-Castro, J.C., Ribagorda, A., Isasi, P.: Evolving hash functions by means of genetic programming. In: Proceedings of the 8th Annual Conference on Genetic and Evolutionary Computation, pp. 1861–1862. ACM, New York (2006)
Etzel, M., Patel, S., Ramzan, Z.: Square hash: fast message authentication via optimized universal hash functions. In: Wiener, M. (ed.) Advances in Cryptology (CRYPTO’ 99). Lecture Notes in Computer Science, vol. 1666, pp. 234–251. Springer, Berlin (1999). doi:10.1007/3-540-48405-1_15
Fan, B., Andersen, D.G., Kaminsky, M., Mitzenmacher, M.D.: Cuckoo filter: practically better than Bloom. In: Proceedings of the 10th ACM International Conference on Emerging Networking Experiments and Technologies (CoNEXT ’14), pp. 75–88. ACM, New York (2014). doi:10.1145/2674005.2674994
Fog, A.: Instruction tables: lists of instruction latencies, throughputs and micro-operation breakdowns for Intel, AMD and VIA CPUs. Tech. rep., Copenhagen University College of Engineering (2014). http://www.agner.org/optimize/instruction_tables.pdf. Last checked March 2015
Gueron, S., Kounavis, M.: Efficient implementation of the Galois Counter Mode using a carry-less multiplier and a fast reduction algorithm. Inf. Process. Lett. 110(14), 549–553 (2010). doi:10.1016/j.ipl.2010.04.011
Halevi, S., Krawczyk, H.: MMH: software message authentication in the Gbit/second rates. In: Biham, E. (ed.) Fast Software Encryption. Lecture Notes in Computer Science, vol. 1267, pp. 172–189. Springer, Berlin (1997). doi:10.1007/BFb0052345
Intel Corporation: Intel IACA tool: a static code analyser (2012). https://software.intel.com/en-us/articles/intel-architecture-code-analyzer. Last checked March 2015
IBM Corporation: Power ISA Version 2.07 (2013). https://www.power.org/wp-content/uploads/2013/05/PowerISA_V2.07_PUBLIC.pdf. Last checked March 2015
Intel Corporation: Intel Intrinsics Guide (2014). https://software.intel.com/sites/landingpage/IntrinsicsGuide/. Last checked March 2015
Knežević, M., Sakiyama, K., Fan, J., Verbauwhede, I.: Modular reduction in \(GF(2^n)\) without pre-computational phase. In: von zur Gathen, J., Imaña, J.L., Koç, C.K. (eds.) Arithmetic of Finite Fields. Lecture Notes in Computer Science, vol. 5130, pp. 77–87. Springer, Berlin (2008). doi:10.1007/978-3-540-69499-1_7
Knuth, D.E.: Searching and Sorting. The Art of Computer Programming, vol. 3. Addison-Wesley, Reading (1997)
Krovetz, T.: Message authentication on 64-bit architectures. In: Selected Areas in Cryptography. Lecture Notes in Computer Science, vol. 4356, pp. 327–341. Springer, Berlin (2007). doi:10.1007/978-3-540-74462-7_23
Krovetz, T., Dai, W.: VMAC and VHASH implementation (2007). http://fastcrypto.org/vmac/. Last checked March 2015
Lemire, D., Kaser, O.: Strongly universal string hashing is fast. Comput. J. 57(11), 1624–1638 (2014). doi:10.1093/comjnl/bxt070
Lim, H., Han, D., Andersen, D.G., Kaminsky, M.: Mica: a holistic approach to fast in-memory key-value storage. In: Proceedings of the 11th USENIX Conference on Networked Systems Design and Implementation (NSDI’14), pp. 429–444. USENIX Association, Berkeley (2014)
Matsumoto, M., Nishimura, T.: Mersenne twister: a 623-dimensionally equidistributed uniform pseudo-random number generator. ACM Trans. Model. Comput. Simul. 8(1), 3–30 (1998). doi:10.1145/272991.272995
Motzkin, T.S.: Evaluation of polynomials and evaluation of rational functions. Bull. Am. Math. Soc. 61(9), 163 (1955)
Mullen, G.L., Panario, D.: Handbook of Finite Fields, 1st edn. Chapman & Hall/CRC, London (2013)
Nguyen, L.H., Roscoe, A.W.: New combinatorial bounds for universal hash functions. Tech. Rep. 153, Cryptology ePrint Archive (2009)
Oliveira, T., Aranha, D.F., López, J., Rodríguez-Henríquez, F.: Fast point multiplication algorithms for binary elliptic curves with and without precomputation. In: Joux, A., Youssef, A. (eds.) Selected Areas in Cryptography (SAC 2014). Lecture Notes in Computer Science, pp. 324–344. Springer International Publishing, Switzerland (2014). doi:10.1007/978-3-319-13051-4_20
Oliveira, T., López, J., Aranha, D.F., Rodríguez-Henríquez, F.: Two is the fastest prime: lambda coordinates for binary elliptic curves. J. Cryptogr. Eng. 4(1), 3–17 (2014). doi:10.1007/s13389-013-0069-z
Paoloni, G.: How to Benchmark Code Execution Times on Intel IA-32 and IA-64 Instruction Set Architectures. Intel Corporation, Santa Clara (2010)
Pike, G., Alakuijala, J.: The CityHash family of hash functions (2011). https://code.google.com/p/cityhash/. Last checked March 2015
Stinson, D.R.: Universal hashing and authentication codes. Des. Codes Cryptogr. 4(4), 369–380 (1994). doi:10.1007/BF01388651
Stinson, D.R.: On the connections between universal hashing, combinatorial designs and error-correcting codes. Congr. Numer. 114, 7–28 (1996)
Su, C., Fan, H.: Impact of Intel’s new instruction sets on software implementation of \(GF (2)[x]\) multiplication. Inf. Process. Lett. 112(12), 497–502 (2012). doi:10.1016/j.ipl.2012.03.012
Taverne, J., Faz-Hernández, A., Aranha, D.F., Rodríguez-Henríquez, F., Hankerson, D., López, J.: Speeding scalar multiplication over binary elliptic curves using the new carry-less multiplication instruction. J. Cryptogr. Eng. 1(3), 187–199 (2011). doi:10.1007/s13389-011-0017-8

Acknowledgments

This work was supported by the National Research Council of Canada, under Grant 26143.

Author information

Authors and Affiliations

LICEF Research Center, TELUQ, Université du Québec, Montreal, QC, Canada
Daniel Lemire
Department of CSAS, University of New Brunswick, Saint John, NB, Canada
Owen Kaser

Corresponding author

Correspondence to Daniel Lemire.

About this article

Cite this article

Lemire, D., Kaser, O. Faster 64-bit universal hashing using carry-less multiplications. J Cryptogr Eng 6, 171–185 (2016). https://doi.org/10.1007/s13389-015-0110-5

Received: 27 December 2014
Accepted: 14 August 2015
Published: 04 September 2015
Issue Date: September 2016
DOI: https://doi.org/10.1007/s13389-015-0110-5
