Abstract
This paper describes a high-speed software implementation of Elliptic Curve Cryptography (ECC) for GeForce GTX graphics cards equipped with an NVIDIA GT200 Graphics Processing Unit (GPU). In order to maximize throughput, our ECC software allocates just a single thread per scalar multiplication and aims to launch as many threads in parallel as possible. We adopt elliptic curves in Montgomery as well as twisted Edwards form, both defined over a special family of finite fields known as Optimal Prime Fields (OPFs). All field-arithmetic operations use a radix-224 representation for the operands (i.e. 24 operand bits are contained in a 32-bit word) to comply with the native (24 ×24)-bit integer multiply instruction of the GT200 platform. We implemented the OPF arithmetic without conditional statements (e.g. if-then clauses) to prevent thread divergence and unrolled the loops to minimize execution time. The scalar multiplication on the twisted Edwards curve employs a comb approach if the base point is fixed and uses extended projective coordinates so that a point addition requires only seven multiplications in the underlying OPF. Our software currently supports elliptic curves over 160-bit and 224-bit OPFs. After a detailed evaluation of numerous implementation options and configurations, we managed to launch 2880 threads on the 30 multiprocessors of the GT200 when the elliptic curve has Montgomery form and is defined over a 224-bit OPF. The resulting throughput is 115k scalar multiplications per second (for arbitrary base points) and we achieved a minimum latency of 19.2 ms. In a fixed-base setting with 256 precomputed points, the throughput increases to some 345k scalar multiplications and the latency drops to 4.52 ms.
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
Preview
Unable to display preview. Download preview PDF.
Similar content being viewed by others
References
Antão, S., Bajard, J.-C., Sousa, L.: Elliptic curve point multiplication on GPUs. In: Proceedings of the 21st IEEE International Conference on Application-Specific Systems, Architectures and Processors (ASAP 2010), pp. 192–199. IEEE Computer Society Press (2010)
Antão, S., Bajard, J.-C., Sousa, L.: RNS-based elliptic curve point multiplication for massive parallel architectures. Computer Journal 55(5), 629–647 (2012)
Bernstein, D.J., Birkner, P., Joye, M., Lange, T., Peters, C.: Twisted Edwards curves. In: Vaudenay, S. (ed.) AFRICACRYPT 2008. LNCS, vol. 5023, pp. 389–405. Springer, Heidelberg (2008)
Bernstein, D.J., Chen, H.-C., Chen, M.-S., Cheng, C.-M., Hsiao, C.-H., Lange, T., Lin, Z.-C., Yang, B.-Y.: The billion-mulmod-per-second PC. In: Proceedings of the 4th Workshop on Special-Purpose Hardware for Attacking Cryptographic Systems (SHARCS 2009), Lausanne, Switzerland, pp. 131–144 (September 2009)
Bernstein, D.J., Chen, T.-R., Cheng, C.-M., Lange, T., Yang, B.-Y.: ECM on graphics cards. In: Joux, A. (ed.) EUROCRYPT 2009. LNCS, vol. 5479, pp. 483–501. Springer, Heidelberg (2009)
Bos, J.W.: Low-latency elliptic curve scalar multiplication. International Journal of Parallel Programming 40(5), 532–550 (2012)
Chu, D., Großschädl, J., Liu, Z., Müller, V., Zhang, Y.: Twisted Edwards-form elliptic curve cryptography for 8-bit AVR-based sensor nodes. In: Proceedings of the 1st ACM Workshop on Asia Public-Key Cryptography (AsiaPKC 2013), pp. 39–44. ACM Press (2013)
Giorgi, P., Izard, T., Tisserand, A.: Comparison of modular arithmetic algorithms on GPUs. In: Parallel Computing: From Multicores and GPU’s to Petascale. Advances in Parallel Computing, vol. 19, pp. 315–322. IOS Press (2010)
Großschädl, J.: TinySA: A security architecture for wireless sensor networks. In: Proceedings of the 2nd International Conference on Emerging Networking Experiments and Technologies (CoNEXT 2006), pp. 288–289. ACM Press (2006)
Hankerson, D.R., Menezes, A.J., Vanstone, S.A.: Guide to Elliptic Curve Cryptography. Springer (2004)
Hisil, H., Wong, K.K.-H., Carter, G., Dawson, E.: Twisted Edwards curves revisited. In: Pieprzyk, J. (ed.) ASIACRYPT 2008. LNCS, vol. 5350, pp. 326–343. Springer, Heidelberg (2008)
Jang, K., Han, S., Han, S., Moon, S., Park, K.: SSLShader: Cheap SSL acceleration with commodity processors. In: Andersen, D.G., Ratnasamy, S. (eds.) Proceedings of the 8th USENIX Symposium on Networked Systems Design and Implementation (NSDI 2011). USENIX Organization (2011)
Khan, F.G.: General Purpose Computation on Graphics Processing Units using OpenCL. Ph.D. Thesis, Politecnico di Torino, Torino, Italy (March 2013)
Lindholm, E., Nickolls, J., Oberman, S., Montrym, J.: NVIDIA Tesla: A unified graphics and computing architecture. IEEE Micro 28(2), 39–55 (2008)
Liu, Z., Großschädl, J., Wong, D.S.: Low-weight primes for lightweight elliptic curve cryptography on 8-bit processors. In: Lin, D., Xu, S., Yung, M. (eds.) The 9th China International Conference on Information Security and Cryptology — INSCRYPT 2013. LNCS. Springer, Heidelberg (to appear)
Liu, Z., Wenger, E., Großschädl, J.: MoTE-ECC: Energy-scalable elliptic curve cryptography for wireless sensor networks (February 2013) (to be published)
Montgomery, P.L.: Modular multiplication without trial division. Mathematics of Computation 44(170), 519–521 (1985)
Montgomery, P.L.: Speeding the Pollard and elliptic curve methods of factorization. Mathematics of Computation 48(177), 243–264 (1987)
NVIDIA Corporation. NVIDIA GeForce® GTX 200 GPU Architectural Overview. Technical brief (2008), http://www.nvidia.com/docs/IO/55506/GeForce_GTX_200_GPU_Technical_Brief.pdf
NVIDIA Corporation. CUDA C Programming Guide. Design guide (2013), http://docs.nvidia.com/cuda/pdf/CUDA_C_Programming_Guide.pdf
NVIDIA Corporation. Parallel Thread Execution ISA. Application guide (2013), http://docs.nvidia.com/cuda/pdf/ptx_isa_3.2.pdf
Szerwinski, R., Güneysu, T.: Exploiting the power of GPUs for asymmetric cryptography. In: Oswald, E., Rohatgi, P. (eds.) CHES 2008. LNCS, vol. 5154, pp. 79–99. Springer, Heidelberg (2008)
Yanık, T., Savaş, E., Koç, Ç.K.: Incomplete reduction in modular arithmetic. IEE Proceedings – Computers and Digital Techniques 149(2), 46–52 (2002)
Author information
Authors and Affiliations
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2014 Springer International Publishing Switzerland
About this paper
Cite this paper
Cui, S., Großschädl, J., Liu, Z., Xu, Q. (2014). High-Speed Elliptic Curve Cryptography on the NVIDIA GT200 Graphics Processing Unit. In: Huang, X., Zhou, J. (eds) Information Security Practice and Experience. ISPEC 2014. Lecture Notes in Computer Science, vol 8434. Springer, Cham. https://doi.org/10.1007/978-3-319-06320-1_16
Download citation
DOI: https://doi.org/10.1007/978-3-319-06320-1_16
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-06319-5
Online ISBN: 978-3-319-06320-1
eBook Packages: Computer ScienceComputer Science (R0)