Abstract
Bit-slicing is a non-conventional implementation technique for cryptographic software where an n-bit processor is considered as a collection of n 1-bit execution units operating in SIMD mode. Particularly when implementing symmetric ciphers, the bit-slicing approach has several advantages over more conventional alternatives: it often allows one to reduce memory footprint by eliminating large look-up tables, and it permits more predictable performance characteristics that can foil timing-based side-channel attacks. Both features are attractive for mobile and embedded processors, but the performance overhead that results from bit-sliced implementation often represents a significant disadvantage. In this paper we describe a set of light-weight Instruction Set Extensions (ISEs) that can improve said performance while retaining all advantages of bit-sliced implementation. In contrast to other crypto-ISEs, our design is generic and allows for a high degree of algorithm agility: we demonstrate applicability to several well-known cryptographic primitives including four block ciphers (DES, Serpent, AES, and PRESENT), a hash function (SHA-1), as well as multiplication of ternary polynomials.
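The bit-slicing idea described above can be illustrated with a minimal sketch. It is not taken from the paper: the 3-bit S-box `TOY_SBOX`, the 32-bit word size, and all helper names are assumptions chosen for illustration. The sketch packs 32 independent inputs so that word i holds bit i of every input, then evaluates the S-box on all 32 inputs at once using only logical operations, with no table look-up.

```python
WORD_BITS = 32                  # assumed processor word size n
MASK = (1 << WORD_BITS) - 1     # keeps results within one n-bit word

# Hypothetical 3-bit S-box, used only as a reference for comparison.
TOY_SBOX = [0, 3, 6, 1, 5, 4, 2, 7]


def to_slices(values, width):
    """Transpose: bit j of slice i is bit i of values[j]."""
    slices = [0] * width
    for j, v in enumerate(values):
        for i in range(width):
            slices[i] |= ((v >> i) & 1) << j
    return slices


def from_slices(slices, count):
    """Inverse transpose: rebuild the per-instance values."""
    return [
        sum(((s >> j) & 1) << i for i, s in enumerate(slices))
        for j in range(count)
    ]


def toy_sbox_bitsliced(x0, x1, x2):
    """Evaluate TOY_SBOX on every bit position of the slices in parallel.

    Each logical operation acts on WORD_BITS independent 1-bit lanes,
    so the instruction sequence does not depend on the data.
    """
    y0 = x0 ^ (~x1 & x2 & MASK)
    y1 = x1 ^ (~x2 & x0 & MASK)
    y2 = x2 ^ (~x0 & x1 & MASK)
    return [y0, y1, y2]


def demo():
    # 32 arbitrary 3-bit inputs, one per 1-bit "execution unit".
    values = [(5 * j + 3) % 8 for j in range(WORD_BITS)]
    sliced_out = toy_sbox_bitsliced(*to_slices(values, 3))
    table_out = [TOY_SBOX[v] for v in values]
    return from_slices(sliced_out, WORD_BITS) == table_out
```

Calling `demo()` compares the bit-sliced result against the table-based one for the same inputs. The transposition steps in `to_slices` and `from_slices` hint at the kind of overhead that a bit-sliced implementation incurs on a conventional instruction set.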
Copyright information
© 2008 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Grabher, P., Großschädl, J., Page, D. (2008). Light-Weight Instruction Set Extensions for Bit-Sliced Cryptography. In: Oswald, E., Rohatgi, P. (eds) Cryptographic Hardware and Embedded Systems – CHES 2008. CHES 2008. Lecture Notes in Computer Science, vol 5154. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-85053-3_21
DOI: https://doi.org/10.1007/978-3-540-85053-3_21
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-85052-6
Online ISBN: 978-3-540-85053-3
eBook Packages: Computer Science, Computer Science (R0)