A Survey of Software Implementations for the Number Theoretic Transform

  • Conference paper
Embedded Computer Systems: Architectures, Modeling, and Simulation (SAMOS 2023)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 14385)

Abstract

This survey summarizes the software implementation knowledge of the Number Theoretic Transform (NTT), a major subroutine of lattice-based cryptosystems. The NTT is a special type of Fast Fourier Transform defined over finite fields, and as such it enables faster polynomial multiplication. There has been over a decade of NTT implementations following different design methods (e.g., CPU vs. GPU), targeting different optimization goals (e.g., memory footprint vs. high throughput), and proposing different styles of optimizations at different abstraction levels (e.g., arithmetic vs. assembly). At the same time, there are several techniques for evaluating and mitigating implementation attacks on the NTT. Yet, despite the continuing industry and academic efforts on NTT implementations, there is no quick guideline to help new developers, practitioners, or future researchers. Our goal in this paper is to provide an overview of this decade of work. To that end, we survey NTT software implementations and categorize them based on their target platforms, optimization goals, and implementation security enhancements. We furthermore provide an executive summary of the key ideas proposed in related works. We hope this paper serves as a pit stop for designers entering the NTT world and helps them navigate to their destination.
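
To make the abstract's claim concrete, the following is a minimal sketch of NTT-based polynomial multiplication. It is not taken from the paper or from any surveyed implementation. The toy parameters (modulus 17, length 8, root of unity 2), the function names, and the choice of cyclic multiplication modulo x^n - 1 are all assumptions made for illustration; lattice-based schemes typically use much larger parameters and often a negacyclic variant.

```python
# Toy parameters, assumed for illustration only (not from the paper).
Q = 17      # small prime modulus
N = 8       # transform length, a power of two dividing Q - 1
OMEGA = 2   # primitive N-th root of unity mod Q (2**8 % 17 == 1, 2**4 % 17 == 16)


def ntt(a, omega, q):
    """Recursive radix-2 Cooley-Tukey NTT; len(a) must be a power of two."""
    n = len(a)
    if n == 1:
        return list(a)
    omega_sq = omega * omega % q          # primitive (n/2)-th root of unity
    even = ntt(a[0::2], omega_sq, q)
    odd = ntt(a[1::2], omega_sq, q)
    out = [0] * n
    w = 1
    for k in range(n // 2):
        t = w * odd[k] % q                # butterfly: twiddle times odd half
        out[k] = (even[k] + t) % q
        out[k + n // 2] = (even[k] - t) % q
        w = w * omega % q
    return out


def intt(a, omega, q):
    """Inverse NTT: forward transform with omega^-1, then scale by n^-1."""
    n = len(a)
    inv_omega = pow(omega, -1, q)         # modular inverse, Python 3.8+
    inv_n = pow(n, -1, q)
    return [x * inv_n % q for x in ntt(a, inv_omega, q)]


def poly_mul_ntt(f, g, n=N, omega=OMEGA, q=Q):
    """Multiply f and g modulo (x^n - 1, q) via pointwise products."""
    fa = ntt(list(f) + [0] * (n - len(f)), omega, q)
    ga = ntt(list(g) + [0] * (n - len(g)), omega, q)
    return intt([x * y % q for x, y in zip(fa, ga)], omega, q)


def poly_mul_schoolbook(f, g, n=N, q=Q):
    """Quadratic-time reference: cyclic convolution modulo (x^n - 1, q)."""
    out = [0] * n
    for i, fi in enumerate(f):
        for j, gj in enumerate(g):
            out[(i + j) % n] = (out[(i + j) % n] + fi * gj) % q
    return out
```

For example, `poly_mul_ntt([1, 2, 3], [4, 5])` and `poly_mul_schoolbook([1, 2, 3], [4, 5])` should agree. The transform-based route replaces the quadratic double loop with two forward transforms, one pointwise product, and one inverse transform, which is where the speedup comes from at realistic sizes.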

Acknowledgments

This paper is supported in part by NSF award no CCF 2146881. Erkay Savaş is supported by the European Union’s Horizon Europe research and innovation programme under grant agreement No: 101079319.

Author information

Corresponding author

Correspondence to Aydin Aysu.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Mert, A.C., Yaman, F., Karabulut, E., Öztürk, E., Savaş, E., Aysu, A. (2023). A Survey of Software Implementations for the Number Theoretic Transform. In: Silvano, C., Pilato, C., Reichenbach, M. (eds) Embedded Computer Systems: Architectures, Modeling, and Simulation. SAMOS 2023. Lecture Notes in Computer Science, vol 14385. Springer, Cham. https://doi.org/10.1007/978-3-031-46077-7_22

  • DOI: https://doi.org/10.1007/978-3-031-46077-7_22

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-46076-0

  • Online ISBN: 978-3-031-46077-7

  • eBook Packages: Computer Science (R0)
