A Parallel GPU Implementation of SWIFFTX

  • Metin Evrim Ulu
  • Murat Cenk
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11989)


The SWIFFTX algorithm is one of the candidates in the SHA-3 hash competition and is built on the number theoretic transform (NTT). It has 256-byte input blocks and 65-byte output blocks. In this paper, we propose a parallel implementation of the algorithm, together with particular techniques that make it faster on a GPU. We target version 6.1 of the NVIDIA® CUDA compute architecture, which employs an instruction set architecture (ISA) called Parallel Thread Execution (PTX) that provides special intrinsics; we therefore modify the reference implementation to obtain better results. Experimental results indicate an almost 10x improvement in speed and a 5 W decrease in power consumption per \(2^{16}\) hashes.
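For readers unfamiliar with the NTT mentioned above, the following is a minimal sketch of a naive number theoretic transform and its inverse. It is not the SWIFFTX transform or the GPU implementation described in the paper: the modulus 257, the length 64, and the function names `ntt` and `intt` are illustrative assumptions that the abstract does not state, and the quadratic-time loop is chosen for readability rather than speed.

```python
# Naive number theoretic transform (NTT): a discrete Fourier transform
# computed over the integers modulo a prime instead of the complex numbers.
# P, N and the function names are illustrative assumptions, not taken
# from the paper.

P = 257  # prime modulus (assumed for illustration)
N = 64   # transform length; must divide P - 1 (assumed for illustration)

# 3 generates the multiplicative group mod 257, so 3^((P-1)/N) is a
# primitive N-th root of unity mod P.
OMEGA = pow(3, (P - 1) // N, P)


def ntt(values, omega=OMEGA, p=P):
    """Return the O(n^2) forward transform of `values` modulo p."""
    n = len(values)
    return [
        sum(values[j] * pow(omega, i * j, p) for j in range(n)) % p
        for i in range(n)
    ]


def intt(values, omega=OMEGA, p=P):
    """Return the inverse transform: use omega^-1 and scale by n^-1 mod p."""
    n = len(values)
    n_inv = pow(n, p - 2, p)          # modular inverse via Fermat's little theorem
    omega_inv = pow(omega, p - 2, p)  # inverse of the root of unity
    return [(n_inv * x) % p for x in ntt(values, omega_inv, p)]


if __name__ == "__main__":
    data = list(range(N))
    assert intt(ntt(data)) == data  # the round trip recovers the input
```

Every output coefficient depends on all inputs but is independent of the other outputs, which is the general property that makes transforms of this kind amenable to parallel execution.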





Copyright information

© Springer Nature Switzerland AG 2020

Authors and Affiliations

  1. Middle East Technical University, Ankara, Turkey
