Bitslice Implementation of AES
Network applications must be fast while still providing security. To minimize the overhead that a security algorithm imposes on application performance, its encryption and decryption speeds are critical. Obtaining maximum performance requires efficient implementation techniques and tuning for the specific hardware on which the implementation runs.
Bitslicing is an unconventional but efficient way to implement DES in software. It decomposes DES into logical bit operations so that N encryptions can proceed in parallel on a single N-bit microprocessor, which yields very high throughput. AES is a symmetric block cipher introduced by NIST as a replacement for DES. It is rapidly gaining popularity because of its security, efficiency, performance, and simplicity. In this paper we present an implementation of AES using the bitslice technique. We analyze the impact of the microprocessor architecture on the performance of bitslice AES, considering three processors: the Intel Pentium 4, the AMD Athlon 64, and the Intel Core 2. We optimize the implementation to make the best use of the superscalar architecture and SIMD instruction sets of these processors.
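To make the bitslice idea concrete, the following minimal sketch applies it to one small AES building block: multiplication by x in GF(2^8) (often called xtime, used inside MixColumns). It is an illustration only, not the implementation described in this paper; the function names are hypothetical, and Python integers stand in for the N-bit machine words a real implementation would use. Bit j of plane i holds bit i of the j-th input byte, so each XOR below acts on all inputs at once.

```python
def to_bitplanes(byte_values):
    """Transpose a list of bytes into 8 bit-planes (one integer per bit position)."""
    planes = [0] * 8
    for lane, value in enumerate(byte_values):
        for bit in range(8):
            planes[bit] |= ((value >> bit) & 1) << lane
    return planes


def from_bitplanes(planes, lanes):
    """Inverse transpose: rebuild the list of bytes from 8 bit-planes."""
    return [
        sum(((planes[bit] >> lane) & 1) << bit for bit in range(8))
        for lane in range(lanes)
    ]


def xtime_bitsliced(planes):
    """Multiply every lane by x modulo x^8 + x^4 + x^3 + x + 1 using only XORs.

    Shifting left moves bit i to bit i+1; the overflow bit b7 is folded back
    into bits 0, 1, 3 and 4 (the reduction constant 0x1B).
    """
    b0, b1, b2, b3, b4, b5, b6, b7 = planes
    return [b7, b0 ^ b7, b1, b2 ^ b7, b3 ^ b7, b4, b5, b6]


def xtime_reference(value):
    """Conventional byte-at-a-time xtime, used here only as a cross-check."""
    value <<= 1
    if value & 0x100:
        value ^= 0x11B
    return value & 0xFF


if __name__ == "__main__":
    inputs = [0x57, 0xAE, 0x47, 0x8E]
    outputs = from_bitplanes(xtime_bitsliced(to_bitplanes(inputs)), len(inputs))
    print([hex(v) for v in outputs])
```

The bitsliced routine contains no table lookups and no data-dependent branches; it uses four XORs regardless of how many lanes are packed into each word, which is where the parallelism comes from. The cost that this sketch leaves out of any performance picture is the transposition into and out of bit-plane form.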
Keywords: Clock Cycle; Block Cipher; Advanced Encryption Standard; Advanced Encryption Standard Algorithm; Fast Software Encryption