Cofactorization on Graphics Processing Units

  • Andrea Miele
  • Joppe W. Bos
  • Thorsten Kleinjung
  • Arjen K. Lenstra
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8731)

Abstract

We show how the cofactorization step, a compute-intensive part of the relation collection phase of the number field sieve (NFS), can be farmed out to a graphics processing unit. Our implementation on a GTX 580 GPU, which is integrated with a state-of-the-art NFS implementation, can serve as a cryptanalytic co-processor for several Intel i7-3770K quad-core CPUs simultaneously. This allows those processors to focus on the memory-intensive sieving and results in more useful NFS-relations found in less time.

Keywords

Cofactorization Graphics Processing Unit Number Field Sieve 

Preview

Unable to display preview. Download preview PDF.

Unable to display preview. Download preview PDF.

References

  1. 1.
    Antao, S., Bajard, J.-C., Sousa, L.: Elliptic curve point multiplication on GPUs. In: 2010 21st IEEE International Conference on Application-specific Systems Architectures and Processors (ASAP), pp. 192–199 (2010)Google Scholar
  2. 2.
    Barbulescu, R., Bos, J.W., Bouvier, C., Kleinjung, T., Montgomery, P.L.: Finding ECM-friendly curves through a study of Galois properties. In: Howe, E.W., Kedlaya, K.S. (eds.) ANTS 2012. The Open Book Series, vol. 1, pp. 63–86. Mathematical Sciences Publishers (2013)Google Scholar
  3. 3.
    Bernstein, D.J., Birkner, P., Lange, T.: Starfish on strike. In: Abdalla, M., Barreto, P.S.L.M. (eds.) LATINCRYPT 2010. LNCS, vol. 6212, pp. 61–80. Springer, Heidelberg (2010)CrossRefGoogle Scholar
  4. 4.
    Bernstein, D.J., Birkner, P., Lange, T., Peters, C.: ECM using Edwards curves. Mathematics of Computation 82(282), 1139–1179 (2013)MathSciNetCrossRefMATHGoogle Scholar
  5. 5.
    Bernstein, D.J., Chen, H.-C., Chen, M.-S., Cheng, C.-M., Hsiao, C.-H., Lange, T., Lin, Z.-C., Yang, B.-Y.: The billion-mulmod-per-second PC. In: Special-purpose Hardware for Attacking Cryptographic Systems – SHARCS 2009, pp. 131–144 (2009)Google Scholar
  6. 6.
    Bernstein, D.J., Chen, H.-C., Cheng, C.-M., Lange, T., Niederhagen, R., Schwabe, P., Yang, B.-Y.: ECC2K-130 on NVIDIA gPUs. In: Gong, G., Gupta, K.C. (eds.) INDOCRYPT 2010. LNCS, vol. 6498, pp. 328–346. Springer, Heidelberg (2010)CrossRefGoogle Scholar
  7. 7.
    Bernstein, D.J., Chen, T.-R., Cheng, C.-M., Lange, T., Yang, B.-Y.: ECM on graphics cards. In: Joux, A. (ed.) EUROCRYPT 2009. LNCS, vol. 5479, pp. 483–501. Springer, Heidelberg (2009)CrossRefGoogle Scholar
  8. 8.
    Bevand, M.: MD5 Chosen-Prefix Collisions on GPUs. Black Hat (2009); WhitepaperGoogle Scholar
  9. 9.
    Bos, J.W.: Low-latency elliptic curve scalar multiplication. International Journal of Parallel Programming 40(5), 532–550 (2012)CrossRefGoogle Scholar
  10. 10.
    Bos, J.W., Kleinjung, T.: ECM at work. In: Wang, X., Sako, K. (eds.) ASIACRYPT 2012. LNCS, vol. 7658, pp. 467–484. Springer, Heidelberg (2012)CrossRefGoogle Scholar
  11. 11.
    Bos, J.W., Stefan, D.: Performance analysis of the SHA-3 candidates on exotic multi-core architectures. In: Mangard, S., Standaert, F.-X. (eds.) CHES 2010. LNCS, vol. 6225, pp. 279–293. Springer, Heidelberg (2010)CrossRefGoogle Scholar
  12. 12.
    Brent, R.P.: Some integer factorization algorithms using elliptic curves. Australian Computer Science Communications 8, 149–163 (1986)Google Scholar
  13. 13.
    Collange, S., Defour, D., Tisserand, A.: Power consumption of gPUs from a software perspective. In: Allen, G., Nabrzyski, J., Seidel, E., van Albada, G.D., Dongarra, J., Sloot, P.M.A. (eds.) ICCS 2009, Part I. LNCS, vol. 5544, pp. 914–923. Springer, Heidelberg (2009)CrossRefGoogle Scholar
  14. 14.
    de Meulenaer, G., Gosset, F., de Dormale, G.M., Quisquater, J.-J.: Integer factorization based on elliptic curve method: Towards better exploitation of reconfigurable hardware. In: Field-Programmable Custom Computing Machines – FCCM 2007, pp. 197–206. IEEE Computer Society (2007)Google Scholar
  15. 15.
    Edwards, H.M.: A normal form for elliptic curves. Bulletin of the American Mathematical Society 44, 393–422 (2007)CrossRefMATHGoogle Scholar
  16. 16.
    Franke, J., Kleinjung, T.: GNFS for linux. Software (2012)Google Scholar
  17. 17.
    Gaj, K., Kwon, S., Baier, P., Kohlbrenner, P., Le, H., Khaleeluddin, M., Bachimanchi, R.: Implementing the elliptic curve method of factoring in reconfigurable hardware. In: Goubin, L., Matsui, M. (eds.) CHES 2006. LNCS, vol. 4249, pp. 119–133. Springer, Heidelberg (2006)CrossRefGoogle Scholar
  18. 18.
    Gilger, J., Barnickel, J., Meyer, U.: GPU-acceleration of block ciphers in the openSSL cryptographic library. In: Gollmann, D., Freiling, F.C. (eds.) ISC 2012. LNCS, vol. 7483, pp. 338–353. Springer, Heidelberg (2012)CrossRefGoogle Scholar
  19. 19.
    Güneysu, T., Kasper, T., Novotny, M., Paar, C., Rupp, A.: Cryptanalysis with COPACOBANA. IEEE Transactions on Computers 57, 1498–1513 (2008)CrossRefGoogle Scholar
  20. 20.
    Harrison, O., Waldron, J.: AES encryption implementation and analysis on commodity graphics processing units. In: Paillier, P., Verbauwhede, I. (eds.) CHES 2007. LNCS, vol. 4727, pp. 209–226. Springer, Heidelberg (2007)CrossRefGoogle Scholar
  21. 21.
    Harrison, O., Waldron, J.: Practical symmetric key cryptography on modern graphics hardware. In: Proceedings of the 17th Conference on Security Symposium, pp. 195–209. USENIX Association (2008)Google Scholar
  22. 22.
    Harrison, O., Waldron, J.: Efficient acceleration of asymmetric cryptography on graphics hardware. In: Preneel, B. (ed.) AFRICACRYPT 2009. LNCS, vol. 5580, pp. 350–367. Springer, Heidelberg (2009)CrossRefGoogle Scholar
  23. 23.
    Hisil, H., Wong, K.K.-H., Carter, G., Dawson, E.: Twisted edwards curves revisited. In: Pieprzyk, J. (ed.) ASIACRYPT 2008. LNCS, vol. 5350, pp. 326–343. Springer, Heidelberg (2008)CrossRefGoogle Scholar
  24. 24.
    Jebelean, T.: An algorithm for exact division. Journal of Symbolic Computation 15(2), 169–180 (1993)MathSciNetCrossRefMATHGoogle Scholar
  25. 25.
    Karatsuba, A.A., Ofman, Y.: Multiplication of many-digital numbers by automatic computers. In: Number 145 in Proceedings of the USSR Academy of Science, pp. 293–294 (1962)Google Scholar
  26. 26.
    Kleinjung, T.: Cofactorisation strategies for the number field sieve and an estimate for the sieving step for factoring 1024-bit integers. In: Special-purpose Hardware for Attacking Cryptographic Systems – SHARCS 2006 (2006)Google Scholar
  27. 27.
    Kleinjung, T., Aoki, K., Franke, J., Lenstra, A.K., Thomé, E., Bos, J.W., Gaudry, P., Kruppa, A., Montgomery, P.L., Osvik, D.A., te Riele, H., Timofeev, A., Zimmermann, P.: Factorization of a 768-bit RSA modulus. In: Rabin, T. (ed.) CRYPTO 2010. LNCS, vol. 6223, pp. 333–350. Springer, Heidelberg (2010)CrossRefGoogle Scholar
  28. 28.
    Kruppa, A.: A software implementation of ECM for NFS. Research Report RR-7041, INRIA (2009), http://hal.inria.fr/inria-00419094/PDF/RR-7041.pdf
  29. 29.
    Leboeuf, K., Muscedere, R., Ahmadi, M.: A GPU implementation of the Montgomery multiplication algorithm for elliptic curve cryptography. In: IEEE International Symposium on Circuits and Systems (ISCAS), pp. 2593–2596 (2013)Google Scholar
  30. 30.
    Lenstra, A.K., Lenstra Jr., H.W.: The Development of the Number Field Sieve. Lecture Notes in Mathematics, vol. 1554. Springer, Heidelberg (1993)CrossRefMATHGoogle Scholar
  31. 31.
    Lenstra Jr., H.W.: Factoring integers with elliptic curves. Annals of Mathematics 126(3), 649–673 (1987)MathSciNetCrossRefMATHGoogle Scholar
  32. 32.
    Loebenberger, D., Putzka, J.: Optimization strategies for hardware-based cofactorization. In: Jacobson Jr., M.J., Rijmen, V., Safavi-Naini, R. (eds.) SAC 2009. LNCS, vol. 5867, pp. 170–181. Springer, Heidelberg (2009)CrossRefGoogle Scholar
  33. 33.
    Manavski, S.: CUDA compatible GPU as an efficient hardware accelerator for AES cryptography. In: IEEE International Conference on Signal Processing and Communications, ICSPC 2007, pp. 65–68 (2007)Google Scholar
  34. 34.
    Miele, A., Bos, J.W., Kleinjung, T., Lenstra, A.K.: Cofactorization on graphics processing units. Cryptology ePrint Archive, Report 2014/397 (2014), http://eprint.iacr.org/
  35. 35.
    Miller, G.L.: Riemann’s hypothesis and tests for primality. In: Proceedings of Seventh Annual ACM Symposium on Theory of Computing, STOC 1975, pp. 234–239. ACM (1975)Google Scholar
  36. 36.
    Montgomery, P.L.: Modular multiplication without trial division. Mathematics of Computation 44(170), 519–521 (1985)MathSciNetCrossRefMATHGoogle Scholar
  37. 37.
    Montgomery, P.L.: Speeding the Pollard and elliptic curve methods of factorization. Mathematics of Computation 48(177), 243–264 (1987)MathSciNetCrossRefMATHGoogle Scholar
  38. 38.
    Montgomery, P.L.: An FFT extension of the elliptic curve method of factorization. PhD thesis, University of California (1992)Google Scholar
  39. 39.
    Montgomery, P.L., Silverman, R.D.: An FFT extension to the p-1 factoring algorithm. Mathematics of Computation 54(190), 839–854 (1990)MathSciNetMATHGoogle Scholar
  40. 40.
    Moss, A., Page, D., Smart, N.P.: Toward acceleration of RSA using 3D graphics hardware. In: Galbraith, S.D. (ed.) Cryptography and Coding 2007. LNCS, vol. 4887, pp. 364–383. Springer, Heidelberg (2007)CrossRefGoogle Scholar
  41. 41.
  42. 42.
    NVIDIA. Cuda programming guide 5 (2013), http://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html
  43. 43.
    NVIDIA. Parallel thread execution isa version 3.2 (2013), http://docs.nvidia.com/cuda/parallel-thread-execution/index.html
  44. 44.
  45. 45.
    Osvik, D.A., Bos, J.W., Stefan, D., Canright, D.: Fast software AES encryption. In: Hong, S., Iwata, T. (eds.) FSE 2010. LNCS, vol. 6147, pp. 75–93. Springer, Heidelberg (2010)CrossRefGoogle Scholar
  46. 46.
    Pelzl, J., Šimka, M., Kleinjung, T., Franke, J., Priplata, C., Stahlke, C., Drutarovský, M., Fischer, V., Paar, C.: Area-time efficient hardware architecture for factoring integers with the elliptic curve method. IEE Proceedings on Information Security 152(1), 67–78 (2005)CrossRefGoogle Scholar
  47. 47.
    Pollard, J.M.: The lattice sieve. pp. 43–49 in [30]Google Scholar
  48. 48.
    Pollard, J.M.: Theorems on factorization and primality testing. Proceedings of the Cambridge Philosophical Society 76, 521–528 (1974)MathSciNetCrossRefMATHGoogle Scholar
  49. 49.
    Pollard, J.M.: A Monte Carlo method for factorization. BIT Numerical Mathematics 15(3), 331–334 (1975)MathSciNetCrossRefMATHGoogle Scholar
  50. 50.
    Pomerance, C.: The quadratic sieve factoring algorithm. In: Beth, T., Cot, N., Ingemarsson, I. (eds.) EUROCRYPT 1984. LNCS, vol. 209, pp. 169–182. Springer, Heidelberg (1985)CrossRefGoogle Scholar
  51. 51.
    Pomerance, C.: A tale of two sieves. Biscuits of Number Theory 85 (2008)Google Scholar
  52. 52.
    Rabin, M.O.: Probabilistic algorithm for testing primality. Journal of Number Theory 12(1), 128–138 (1980)MathSciNetCrossRefMATHGoogle Scholar
  53. 53.
    Shanks, D.: Class number, a theory of factorization, and genera. In: Lewis, D.J. (ed.) Symposia in Pure Mathematics, vol. 20, pp. 415–440. American Mathematical Society (1971)Google Scholar
  54. 54.
    Šimka, M., Pelzl, J., Kleinjung, T., Franke, J., Priplata, C., Stahlke, C., Drutarovský, M., Fischer, V.: Hardware factorization based on elliptic curve method. In: Field-Programmable Custom Computing Machines – FCCM 2005, pp. 107–116. IEEE Computer Society (2005)Google Scholar
  55. 55.
    Szerwinski, R., Güneysu, T.: Exploiting the power of gPUs for asymmetric cryptography. In: Oswald, E., Rohatgi, P. (eds.) CHES 2008. LNCS, vol. 5154, pp. 79–99. Springer, Heidelberg (2008)CrossRefGoogle Scholar
  56. 56.
    Xin, G.: Fast smoothness test. Semester project report (June 2013)Google Scholar
  57. 57.
    Yang, J., Goodman, J.: Symmetric key cryptography on modern graphics hardware. In: Kurosawa, K. (ed.) ASIACRYPT 2007. LNCS, vol. 4833, pp. 249–264. Springer, Heidelberg (2007)CrossRefGoogle Scholar
  58. 58.
    Zimmermann, P., Dodson, B.: 20 years of ECM. In: Hess, F., Pauli, S., Pohst, M. (eds.) ANTS 2006. LNCS, vol. 4076, pp. 525–542. Springer, Heidelberg (2006)CrossRefGoogle Scholar
  59. 59.
    Zimmermann, R., Güneysu, T., Paar, C.: High-performance integer factoring with reconfigurable devices. In: Field Programmable Logic and Applications – FPL 2010, pp. 83–88. IEEE (2010)Google Scholar

Copyright information

© Springer-Verlag Berlin Heidelberg 2014

Authors and Affiliations

  • Andrea Miele
    • 1
  • Joppe W. Bos
    • 2
  • Thorsten Kleinjung
    • 1
  • Arjen K. Lenstra
    • 1
  1. 1.LACALEPFLLausanneSwitzerland
  2. 2.NXP SemiconductorsLeuvenBelgium

Personalised recommendations