Accelerating Iterative SpMV for the Discrete Logarithm Problem Using GPUs

Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9061)

Abstract

In the context of cryptanalysis, computing discrete logarithms in large cyclic groups using index-calculus-based methods, such as the number field sieve or the function field sieve, requires solving large sparse systems of linear equations modulo the group order. Most of the fast algorithms used to solve such systems — e.g., the conjugate gradient or the Lanczos and Wiedemann algorithms — iterate a product of the corresponding sparse matrix with a vector (SpMV). This central operation can be accelerated on GPUs using specific computing models and addressing patterns, which increase the arithmetic intensity while reducing irregular memory accesses. In this work, we investigate the implementation of SpMV kernels on NVIDIA GPUs, for several representations of the sparse matrix in memory. We explore the use of Residue Number System (RNS) arithmetic to accelerate modular operations. We target linear systems arising when attacking the discrete logarithm problem on groups of size 100 to 1000 bits, which includes the relevant range for current cryptanalytic computations. The proposed SpMV implementation contributed to solving the discrete logarithm problem in GF(\(2^{619}\)) and GF(\(2^{809}\)) using the FFS algorithm.
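As a rough illustration of the two ingredients named in the abstract, the following minimal sketch computes a sparse matrix-vector product modulo a group order, once with ordinary integers and once with vector entries held in a residue number system. It is not the paper's GPU kernel. The CSR layout, the function names, the tiny moduli, and the plain Chinese-remainder reconstruction at the end of each row are all assumptions chosen for readability. The sketch also assumes small nonnegative matrix coefficients, so that each row sum stays below the product of the RNS moduli.

```python
from math import prod

def spmv_csr_mod(row_ptr, col_idx, values, x, ell):
    """y = A*x mod ell, with A stored in CSR form (hypothetical helper)."""
    y = []
    for i in range(len(row_ptr) - 1):
        acc = 0
        for k in range(row_ptr[i], row_ptr[i + 1]):
            acc += values[k] * x[col_idx[k]]
        y.append(acc % ell)
    return y

def to_rns(a, moduli):
    """Represent the integer a by its residues modulo each RNS modulus."""
    return [a % m for m in moduli]

def rns_mul_add(acc, coeff, v, moduli):
    """acc + coeff*v, computed independently in every residue channel."""
    return [(r + coeff * s) % m for r, s, m in zip(acc, v, moduli)]

def from_rns(residues, moduli):
    """Chinese-remainder reconstruction; moduli must be pairwise coprime."""
    M = prod(moduli)
    total = 0
    for r, m in zip(residues, moduli):
        Mi = M // m
        total += r * Mi * pow(Mi, -1, m)  # modular inverse, Python 3.8+
    return total % M

def spmv_csr_rns(row_ptr, col_idx, values, x, ell, moduli):
    """Same product, accumulating each row in RNS before reducing mod ell.

    Assumes every exact row sum is below prod(moduli), so that the
    reconstruction recovers it without wrap-around.
    """
    x_rns = [to_rns(v, moduli) for v in x]
    y = []
    for i in range(len(row_ptr) - 1):
        acc = [0] * len(moduli)
        for k in range(row_ptr[i], row_ptr[i + 1]):
            acc = rns_mul_add(acc, values[k], x_rns[col_idx[k]], moduli)
        y.append(from_rns(acc, moduli) % ell)
    return y

# Toy 3x3 matrix [[1,0,2],[0,3,0],[2,1,0]] in CSR form.
row_ptr = [0, 2, 3, 5]
col_idx = [0, 2, 1, 0, 1]
values = [1, 2, 3, 2, 1]
x = [100, 57, 99]
ell = 101              # stand-in for the group order
moduli = [7, 11, 13]   # product 1001 exceeds the largest row sum (298)

y_plain = spmv_csr_mod(row_ptr, col_idx, values, x, ell)
y_rns = spmv_csr_rns(row_ptr, col_idx, values, x, ell, moduli)
```

In the RNS variant the inner loop touches only small, independent residues, which is the property that makes the representation attractive for data-parallel hardware. A practical implementation would use machine-word moduli and a cheaper reduction step than the full reconstruction used here.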

Keywords

Discrete logarithm problem, Sparse-matrix–vector product, Modular arithmetic, Residue number system, GPUs

Copyright information

© Springer International Publishing Switzerland 2015

Authors and Affiliations

  1. CARAMEL project-team, LORIA, INRIA/CNRS/Université de Lorraine, Vandœuvre-lès-Nancy Cedex, France
