Abstract
This paper proposes a performance prediction model for block ciphers on GPU architectures. The model comprises three phases: micro-benchmarking, code analysis, and performance equations. The micro-benchmarks are written in OpenCL with scalability across GPU architectures of all kinds in mind. The performance equations are derived from selected features of GPU architectures. The measured overall latencies of AES, Camellia, and SC2000, which together cover all types of block ciphers, fall within the range of latencies estimated by the model. Moreover, assuming that out-of-order scheduling on NVIDIA GPUs works well, the model predicts overall encryption latencies with best-case errors of 2.0% and 8.8% on the NVIDIA GeForce GTX 580 and GTX 280, respectively. The model supports algebraic and bitslice implementations, although this paper evaluates it only on table-based implementations.
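The abstract does not reproduce the paper's performance equations, so the following is only an illustrative sketch of how three such phases could fit together to give a latency range. Every name (`InstrClass`, `latency_range`, `relative_error`) and every number is hypothetical, and the simple serial versus fully overlapped bounds are an assumption made here, not the authors' model.

```python
from dataclasses import dataclass


@dataclass
class InstrClass:
    """One instruction class of a cipher kernel (all values hypothetical)."""
    count: int         # phase 2 (code analysis): instructions executed per block
    latency: float     # phase 1 (micro-benchmark): cycles if nothing overlaps
    issue_cost: float  # phase 1 (micro-benchmark): cycles if latency is fully hidden


def latency_range(profile):
    """Phase 3 stand-in: return (lower, upper) cycle estimates for one kernel.

    Upper bound: every instruction waits for its full latency.
    Lower bound: scheduling hides latency, so only the issue cost remains.
    """
    upper = sum(c.count * c.latency for c in profile.values())
    lower = sum(c.count * c.issue_cost for c in profile.values())
    return lower, upper


def relative_error(predicted, measured):
    """Prediction error as a percentage of the measured value."""
    return abs(predicted - measured) / measured * 100.0


# Made-up profile of a table-based kernel: ALU work plus table lookups.
profile = {
    "alu": InstrClass(count=400, latency=20.0, issue_cost=1.0),
    "table_load": InstrClass(count=160, latency=30.0, issue_cost=2.0),
}

lower, upper = latency_range(profile)
measured = 750.0  # made-up measurement, in cycles
print(f"estimated range: {lower:.0f} to {upper:.0f} cycles")
print(f"measured inside range: {lower <= measured <= upper}")
print(f"best-case error: {relative_error(lower, measured):.1f} %")
```

In this sketch the best-case figure corresponds to the lower bound, where scheduling is assumed to hide latency well, which mirrors how the abstract reports its best-case errors.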
Copyright information
© 2013 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Nishikawa, N., Iwai, K., Tanaka, H., Kurokawa, T. (2013). Performance Prediction Model for Block Ciphers on GPU Architectures. In: Lopez, J., Huang, X., Sandhu, R. (eds) Network and System Security. NSS 2013. Lecture Notes in Computer Science, vol 7873. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-38631-2_30
DOI: https://doi.org/10.1007/978-3-642-38631-2_30
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-38630-5
Online ISBN: 978-3-642-38631-2
eBook Packages: Computer Science, Computer Science (R0)