Abstract
Recent Intel processors provide hardware instructions that each implement a full AES round. Existing libraries use hand-tuned assembly language to overlap the execution of multiple AES instructions and extract maximum performance. We present a program generator that creates optimized AES code automatically from a simple, annotated C version of the code. We show how this generator can be used to rapidly create highly optimized versions of several AES modes. The generated code performs as well as, or up to 7% faster than, the hand-tuned assembly libraries from Intel.
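To make the idea of overlapping AES instructions concrete, the following is a minimal hand-written sketch in C using the AES-NI compiler intrinsics. It is not output of the generator described here and not Intel's library code. The function names, the interleave factor of four, and the `AESNI_FN` macro are illustrative assumptions; the sketch also assumes an x86 processor with AES-NI, a GCC or Clang compiler, AES-128, and round keys that have already been expanded elsewhere.

```c
#include <stddef.h>
#include <emmintrin.h>   /* SSE2: __m128i, unaligned load/store, XOR */
#include <wmmintrin.h>   /* AES-NI: _mm_aesenc_si128, _mm_aesenclast_si128 */

#define AES128_ROUNDS 10 /* AES-128 uses 10 rounds and 11 round keys */

/* Assumption: GCC/Clang attribute so the functions build without -maes. */
#define AESNI_FN __attribute__((target("aes,sse2")))

/* Hypothetical baseline: one block at a time. Each AESENC depends on the
 * result of the previous one, so the pipelined AES unit is mostly idle. */
AESNI_FN static void encrypt_serial(const __m128i rk[AES128_ROUNDS + 1],
                                    const unsigned char *in,
                                    unsigned char *out, size_t nblocks)
{
    for (size_t i = 0; i < nblocks; i++) {
        __m128i b = _mm_loadu_si128((const __m128i *)(in + 16 * i));
        b = _mm_xor_si128(b, rk[0]);                /* initial AddRoundKey */
        for (int r = 1; r < AES128_ROUNDS; r++)
            b = _mm_aesenc_si128(b, rk[r]);         /* one full AES round */
        b = _mm_aesenclast_si128(b, rk[AES128_ROUNDS]);
        _mm_storeu_si128((__m128i *)(out + 16 * i), b);
    }
}

/* Hypothetical interleaved version: four independent blocks per iteration,
 * so four AESENC instructions can be in flight at the same time. */
AESNI_FN static void encrypt_interleaved4(const __m128i rk[AES128_ROUNDS + 1],
                                          const unsigned char *in,
                                          unsigned char *out, size_t nblocks)
{
    size_t i = 0;
    for (; i + 4 <= nblocks; i += 4) {
        const __m128i *src = (const __m128i *)(in + 16 * i);
        __m128i *dst = (__m128i *)(out + 16 * i);
        __m128i b0 = _mm_xor_si128(_mm_loadu_si128(src + 0), rk[0]);
        __m128i b1 = _mm_xor_si128(_mm_loadu_si128(src + 1), rk[0]);
        __m128i b2 = _mm_xor_si128(_mm_loadu_si128(src + 2), rk[0]);
        __m128i b3 = _mm_xor_si128(_mm_loadu_si128(src + 3), rk[0]);
        for (int r = 1; r < AES128_ROUNDS; r++) {
            /* These four rounds do not depend on one another. */
            b0 = _mm_aesenc_si128(b0, rk[r]);
            b1 = _mm_aesenc_si128(b1, rk[r]);
            b2 = _mm_aesenc_si128(b2, rk[r]);
            b3 = _mm_aesenc_si128(b3, rk[r]);
        }
        _mm_storeu_si128(dst + 0, _mm_aesenclast_si128(b0, rk[AES128_ROUNDS]));
        _mm_storeu_si128(dst + 1, _mm_aesenclast_si128(b1, rk[AES128_ROUNDS]));
        _mm_storeu_si128(dst + 2, _mm_aesenclast_si128(b2, rk[AES128_ROUNDS]));
        _mm_storeu_si128(dst + 3, _mm_aesenclast_si128(b3, rk[AES128_ROUNDS]));
    }
    /* Handle any remaining blocks one at a time. */
    encrypt_serial(rk, in + 16 * i, out + 16 * i, nblocks - i);
}
```

Both functions compute the same result; only the instruction ordering differs. Interleaving of this kind applies only where blocks can be processed independently, for example in ECB or CTR mode or in CBC decryption, and the best interleave factor depends on the instruction latency and throughput of the target processor.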
Copyright information
© 2010 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Manley, R., Gregg, D. (2010). A Program Generator for Intel AES-NI Instructions. In: Gong, G., Gupta, K.C. (eds) Progress in Cryptology - INDOCRYPT 2010. INDOCRYPT 2010. Lecture Notes in Computer Science, vol 6498. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-17401-8_22
DOI: https://doi.org/10.1007/978-3-642-17401-8_22
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-17400-1
Online ISBN: 978-3-642-17401-8
eBook Packages: Computer Science, Computer Science (R0)