AES Encryption Implementation and Analysis on Commodity Graphics Processing Units

  • Owen Harrison
  • John Waldron
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4727)


Graphics Processing Units (GPUs) offer large potential performance gains over the standard CPU in stream processing applications. These gains are best realised when high computational intensity is required across large numbers of mostly independent input elements. The GPU’s success in general purpose stream processing has been demonstrated in many diverse fields, though attempts to port cryptographic algorithms to the GPU have thus far met with little success. In recent years, GPU architectures have continued to develop towards a more flexible and uniform programming environment. These developments have overcome many of the restrictions previously encountered in cipher implementations. We present novel approaches for the implementation of the AES block cipher encryption algorithm on these GPUs. This work also serves as a precursor for future cipher implementations on the most advanced GPU architecture, the recently released Nvidia G80, which now includes integer support and a simplified programming interface.


AES, Graphics Processor, GPU, Hardware Accelerated



Copyright information

© Springer-Verlag Berlin Heidelberg 2007

Authors and Affiliations

  • Owen Harrison (1)
  • John Waldron (1)
  1. Computer Architecture Group, Trinity College Dublin, Dublin 2, Ireland
