Instruction Set Extensions for Efficient AES Implementation on 32-bit Processors

  • Stefan Tillich
  • Johann Großschädl
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4249)


Secure communication over public networks like the Internet requires the use of cryptographic algorithms as basic building blocks. Most cryptographic workloads pose a considerable burden on devices like PDAs, cell phones, and sensor nodes, which are limited in processing power, memory and energy. In this paper we present an approach to increase the efficiency of 32-bit processors for handling symmetric cryptographic algorithms with the help of instruction set extensions. We propose a number of custom instructions to support the Advanced Encryption Standard (AES). Using the SPARC V8-compatible Leon2 embedded processor, we evaluate the effects of the extensions on performance and code size of AES, as well as on silicon area. With a moderate increase in silicon area, AES performance can be improved by a factor of nearly 10, while code size is reduced significantly and implementation flexibility is retained. We also show that our approach is very beneficial for implementation in superscalar processors and that it can compete with the performance of previously proposed cryptographic processors and instruction set extensions.
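As a rough illustration of the software workload that such extensions aim to reduce, the sketch below implements the AES MixColumns step for one state column in plain C on a 32-bit word, following the standard definition in FIPS-197. It is not the custom instructions proposed in the paper, which the abstract does not specify. The function names (`rotl32`, `xtime4`, `mix_column`) and the packing of the column with its first byte in the most significant position are assumptions made for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Rotate a 32-bit word left by n bits; n is 8, 16, or 24 in this sketch. */
uint32_t rotl32(uint32_t x, unsigned n) {
    return (x << n) | (x >> (32u - n));
}

/* Multiply each of the four packed bytes by 2 in GF(2^8), reducing
   modulo the AES polynomial x^8 + x^4 + x^3 + x + 1 (0x11b). */
uint32_t xtime4(uint32_t w) {
    uint32_t high_bits = w & 0x80808080u;       /* bytes that overflow */
    uint32_t shifted = (w & 0x7f7f7f7fu) << 1;  /* per-byte shift left */
    return shifted ^ ((high_bits >> 7) * 0x1bu);
}

/* MixColumns for one column packed as a0|a1|a2|a3 with a0 in the most
   significant byte (assumed layout). Each output byte is
   b_i = 2*a_i ^ 3*a_(i+1) ^ a_(i+2) ^ a_(i+3), indices modulo 4. */
uint32_t mix_column(uint32_t w) {
    uint32_t r1 = rotl32(w, 8);
    uint32_t r2 = rotl32(w, 16);
    uint32_t r3 = rotl32(w, 24);
    /* 2*a_i ^ 2*a_(i+1) is computed as one doubling of (a_i ^ a_(i+1)). */
    return xtime4(w ^ r1) ^ r1 ^ r2 ^ r3;
}

/* Hand-checked vectors for the column transform. */
void mix_column_self_check(void) {
    assert(mix_column(0xdb135345u) == 0x8e4da1bcu);
    assert(mix_column(0x01010101u) == 0x01010101u);
}
```

Even in this compact form, one column costs three rotations, a packed doubling, and several XORs, and a full round also needs the byte substitution and row shift. This is the kind of per-round overhead that dedicated instructions can collapse into far fewer operations.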


Keywords: Advanced Encryption Standard, instruction set extensions, embedded RISC processor, SPARC V8 architecture, efficient implementation



Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Stefan Tillich (1)
  • Johann Großschädl (1)
  1. Institute for Applied Information Processing and Communications, Graz University of Technology, Graz, Austria
