An Implementation of Parallel Number-Theoretic Transform Using Intel AVX-512 Instructions

  • Conference paper
Computer Algebra in Scientific Computing (CASC 2022)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 13366)

Abstract

In this paper, we propose an implementation of the parallel number-theoretic transform (NTT) using Intel Advanced Vector Extensions 512 (AVX-512) instructions. The butterfly operation of the NTT can be performed using modular addition, subtraction, and multiplication. We show that a method known as the six-step fast Fourier transform algorithm can be applied to the NTT. We vectorized NTT kernels using the Intel AVX-512 instructions and parallelized the six-step NTT using OpenMP. We successfully achieved a performance of over 83 giga-operations per second on an Intel Xeon Platinum 8368 (2.4 GHz, 38 cores) for a \(2^{20}\)-point NTT with a modulus of 51 bits.
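The six-step decomposition mentioned above can be illustrated with a small scalar sketch. This is not the paper's AVX-512/OpenMP implementation: the function names (`butterfly`, `ntt_radix2`, `six_step_ntt`, `naive_ntt`), the toy modulus 257, and the recursive radix-2 kernel used for the short transforms are all assumptions chosen for illustration. The sketch views an \(n = n_1 n_2\)-point input as an \(n_2 \times n_1\) matrix and applies transpose, row NTTs, twiddle multiplication, transpose, row NTTs, and a final transpose.

```python
def butterfly(a, b, w, p):
    """Radix-2 butterfly built from one modular multiply, one add, one subtract."""
    t = (w * b) % p
    return (a + t) % p, (a - t) % p


def ntt_radix2(a, root, p):
    """Recursive radix-2 NTT; len(a) must be a power of two and
    root a primitive len(a)-th root of unity modulo the prime p."""
    n = len(a)
    if n == 1:
        return list(a)
    root_sq = root * root % p
    even = ntt_radix2(a[0::2], root_sq, p)
    odd = ntt_radix2(a[1::2], root_sq, p)
    out = [0] * n
    w = 1
    for k in range(n // 2):
        out[k], out[k + n // 2] = butterfly(even[k], odd[k], w, p)
        w = w * root % p
    return out


def naive_ntt(a, root, p):
    """O(n^2) reference transform, used only to check the sketch."""
    n = len(a)
    return [sum(a[j] * pow(root, j * k, p) for j in range(n)) % p
            for k in range(n)]


def transpose(m):
    return [list(col) for col in zip(*m)]


def six_step_ntt(x, n1, n2, root, p):
    """Six-step NTT of length n1*n2 (n1, n2 powers of two in this sketch)."""
    assert len(x) == n1 * n2
    # View x as an n2 x n1 row-major matrix: X[j2][j1] = x[j1 + j2*n1].
    X = [x[r * n1:(r + 1) * n1] for r in range(n2)]
    # Step 1: transpose to n1 x n2.
    T = transpose(X)
    # Step 2: n1 independent NTTs of length n2 (root^n1 has order n2).
    root_n2 = pow(root, n1, p)
    T = [ntt_radix2(row, root_n2, p) for row in T]
    # Step 3: multiply entry (j1, k2) by the twiddle factor root^(j1*k2).
    T = [[T[j1][k2] * pow(root, j1 * k2, p) % p for k2 in range(n2)]
         for j1 in range(n1)]
    # Step 4: transpose to n2 x n1.
    U = transpose(T)
    # Step 5: n2 independent NTTs of length n1 (root^n2 has order n1).
    root_n1 = pow(root, n2, p)
    U = [ntt_radix2(row, root_n1, p) for row in U]
    # Step 6: transpose to n1 x n2 and flatten: y[k2 + k1*n2] = V[k1][k2].
    V = transpose(U)
    return [v for row in V for v in row]


# Toy parameters (assumed for illustration): p = 257 is prime and 3 is a
# primitive root modulo 257, so 3^(256/16) is a primitive 16th root of unity.
P = 257
N = 16
ROOT = pow(3, (P - 1) // N, P)

x = list(range(N))
y = six_step_ntt(x, 4, 4, ROOT, P)
```

The row transforms in steps 2 and 5 are independent of one another, which is what makes this decomposition a natural candidate for thread-level parallelism, while the butterfly itself is the natural target for vectorization.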

Acknowledgments

This work was supported by JSPS KAKENHI Grant Number JP19K11989.

Author information

Corresponding author

Correspondence to Daisuke Takahashi.

Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Takahashi, D. (2022). An Implementation of Parallel Number-Theoretic Transform Using Intel AVX-512 Instructions. In: Boulier, F., England, M., Sadykov, T.M., Vorozhtsov, E.V. (eds) Computer Algebra in Scientific Computing. CASC 2022. Lecture Notes in Computer Science, vol 13366. Springer, Cham. https://doi.org/10.1007/978-3-031-14788-3_18

  • DOI: https://doi.org/10.1007/978-3-031-14788-3_18

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-14787-6

  • Online ISBN: 978-3-031-14788-3

  • eBook Packages: Computer Science, Computer Science (R0)
