Comb to Pipeline: Fast Software Encryption Revisited

  • Andrey Bogdanov
  • Martin M. Lauridsen
  • Elmar Tischhauser
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9054)


AES-NI, or Advanced Encryption Standard New Instructions, is an extension of the x86 architecture proposed by Intel in 2008. With a pipelined implementation utilizing AES-NI, parallelizable modes such as AES-CTR become extremely efficient. However, out of the four non-trivial NIST-recommended encryption modes, three are inherently sequential: CBC, CFB, and OFB. This significantly limits the advantage of using AES-NI. Similar observations apply to CMAC, CCM, and many other modes. We address this issue by proposing the comb scheduler, a fast, low-overhead scheduling algorithm based on an efficient look-ahead strategy. With it, sequential modes profit from the AES-NI pipeline in real-world settings by filling it with multiple, independent messages.
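The idea of filling the pipeline with independent messages can be illustrated with a minimal sketch. It is not the comb scheduler itself: it is a naive lockstep scheduler that refills a lane whenever a message ends, and it uses a hash-based stand-in (`toy_block_encrypt`, a hypothetical helper) instead of AES. All names and the `lanes` parameter are assumptions made for illustration. The point is only that, within one step, the block calls belong to different messages and do not depend on each other, so on real hardware they could be issued back to back into the AES-NI pipeline.

```python
import hashlib
from collections import deque

BLOCK = 16  # block size in bytes


def toy_block_encrypt(key: bytes, block: bytes) -> bytes:
    """Stand-in for one block cipher call (NOT a real cipher, illustration only)."""
    return hashlib.sha256(key + block).digest()[:BLOCK]


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def cbc_encrypt(key, iv, blocks):
    """Reference: sequential CBC over a single message."""
    out, chain = [], iv
    for plain in blocks:
        chain = toy_block_encrypt(key, xor(plain, chain))
        out.append(chain)
    return out


def cbc_encrypt_interleaved(key, messages, lanes=8):
    """CBC over many messages, one block per active lane per step.

    messages: list of (iv, list_of_blocks); results keep the input order.
    """
    results = [[] for _ in messages]
    pending = deque(range(len(messages)))
    active = []  # entries: (message index, next block position, chaining value)
    while pending or active:
        # Refill free lanes with waiting messages.
        while pending and len(active) < lanes:
            i = pending.popleft()
            active.append((i, 0, messages[i][0]))
        # One lockstep step: the calls below are mutually independent.
        still_active = []
        for i, pos, chain in active:
            blocks = messages[i][1]
            if pos < len(blocks):
                chain = toy_block_encrypt(key, xor(blocks[pos], chain))
                results[i].append(chain)
                pos += 1
            if pos < len(blocks):
                still_active.append((i, pos, chain))
        active = still_active
    return results
```

Each message still sees an ordinary CBC chain, so the interleaved output matches the sequential one block for block; only the order in which block calls are issued changes.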

We apply the comb scheduler to implementations on Haswell, Intel’s latest microarchitecture, for a wide range of modes. We observe a drastic speed-up by a factor of 5 for NIST’s CBC, CFB, OFB, and CMAC, which perform at around 0.88 cycles per byte (cpb). Surprisingly, contrary to the entire body of previous performance analysis, the throughput of the authenticated encryption (AE) mode CCM gets very close to that of GCM and OCB3, with about 1.64 cpb (vs. 1.63 cpb and 1.51 cpb, respectively), despite Haswell’s heavily improved binary field multiplication. This suggests CCM as an AE mode of choice, as it is NIST-recommended, does not have any weak-key issues like GCM, and is royalty-free as opposed to OCB3. Among the CAESAR contestants, the comb scheduler significantly speeds up CLOC/SILC, JAMBU, and POET, with the mostly sequential nonce-misuse resistant design of POET, performing at 2.14 cpb, becoming faster than the well-parallelizable COPA.
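A back-of-the-envelope pipeline model helps put the factor-5 figure in context. The sketch below is not taken from the paper. It assumes AES-128 (10 AES round instructions per block) and an AES round instruction with a latency of 7 cycles and a throughput of one per cycle, which are the commonly cited figures for Haswell. The function names and parameters are hypothetical, and the model ignores all mode overhead (XORs, loads, stores, scheduling).

```python
def sequential_cpb(rounds=10, latency=7, block_bytes=16):
    """Cycles per byte when every round must wait for the previous one."""
    return rounds * latency / block_bytes


def interleaved_cpb(lanes, rounds=10, latency=7, block_bytes=16):
    """Cycles per byte when `lanes` independent blocks share the pipeline.

    Issuing one round for every lane takes `lanes` cycles, but lane 0 cannot
    start its next round before its own result is ready after `latency` cycles.
    """
    cycles_per_round = max(lanes, latency)
    return rounds * cycles_per_round / (lanes * block_bytes)


print(sequential_cpb())      # 4.375
print(interleaved_cpb(7))    # 0.625
```

Under these assumptions a single sequential message costs about 4.4 cpb, roughly five times the reported 0.88 cpb, while the idealized lower bound with a full pipeline is 0.625 cpb. The gap between 0.625 and 0.88 would then be attributable to the overhead that the model leaves out.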

Finally, this paper provides the first optimized AES-NI implementations for the novel AE modes OTR, CLOC/SILC, COBRA, POET, McOE-G, and Julius.





Copyright information

© International Association for Cryptologic Research 2015

Authors and Affiliations

  • Andrey Bogdanov (1)
  • Martin M. Lauridsen (1)
  • Elmar Tischhauser (1)
  1. DTU Compute, Technical University of Denmark, Kgs. Lyngby, Denmark
