Implementing Lightweight Block Ciphers on x86 Architectures
- Cite this paper as:
- Benadjila R., Guo J., Lomné V., Peyrin T. (2014) Implementing Lightweight Block Ciphers on x86 Architectures. In: Lange T., Lauter K., Lisoněk P. (eds) Selected Areas in Cryptography -- SAC 2013. SAC 2013. Lecture Notes in Computer Science, vol 8282. Springer, Berlin, Heidelberg
Lightweight block ciphers are designed so as to fit into very constrained environments, but usually not really with software performance in mind. For classical lightweight applications where many constrained devices communicate with a server, it is also crucial that the cipher has good software performance on the server side. Recent work has shown that bitslice implementations applied to Piccolo and PRESENT led to very good software speeds, thus making lightweight ciphers interesting for cloud applications. However, we remark that bitslice implementations might not be interesting for some situations, where the amount of data to be enciphered at a time is usually small, and very little work has been done on non-bitslice implementations.
In this article, we explore general software implementations of lightweight ciphers on x86 architectures, with a special focus on LED, Piccolo and PRESENT. First, we analyze table-based implementations, and we provide a theoretical model to predict the behavior of various possible trade-offs depending on the processor cache latency profile. We obtain the fastest table-based implementations for our lightweight ciphers, which is of interest for legacy processors. Secondly, we apply to our portfolio of primitives the vperm implementation trick for \(4\)-bit Sboxes, which gives good performance, extra side-channels protection, and is quite fit for many lightweight primitives. Finally, we investigate bitslice implementations, analyzing various costs which are usually neglected (bitsliced form (un)packing, key schedule, etc.), but that must be taken in account for many lightweight applications. We finally discuss which type of implementation seems to be the best suited depending on the applications profile.