Abstract
CRYSTALS-Kyber was announced as one of the finalists in round 3 of the NIST post-quantum cryptography standardization process. Because CRYSTALS-Kyber is a lattice-based scheme, its performance depends on the efficiency of polynomial multiplication. This paper presents a high-speed, pipelined hardware accelerator for the number theoretic transform (NTT) and inverse NTT (INTT) used in CRYSTALS-Kyber. Our work centers on designing and optimizing the NTT accelerator architecture with parameters suited to hardware implementations of CRYSTALS-Kyber. It includes modifying the structure of the modular arithmetic modules and butterfly units using an efficient low-complexity algorithm. As a result, our design achieves a maximum clock frequency (fmax) of 237 MHz when synthesized for an Intel Cyclone V FPGA with Quartus. Rebalancing the combinational logic paths improved resource utilization and allowed us to pipeline efficiently between hardware modules.
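For readers unfamiliar with the transform being accelerated, the following is a minimal software sketch of the 7-layer Kyber NTT and INTT built from Cooley-Tukey and Gentleman-Sande butterflies. It is not the hardware architecture of this paper: the parameters q = 3329, n = 256 and the root of unity 17 come from the public Kyber specification, while the function names (`ct_butterfly`, `gs_butterfly`, `ntt`, `intt`) are illustrative, and Python's `%` operator stands in for the dedicated modular reduction modules a hardware design would use.

```python
# Reference-model sketch only; names are hypothetical, not taken from the paper.
Q = 3329    # Kyber prime modulus
N = 256     # number of polynomial coefficients
ZETA = 17   # primitive 256th root of unity modulo Q


def bitrev7(i):
    """Reverse the 7-bit binary representation of i."""
    return int(format(i, "07b")[::-1], 2)


# Twiddle factors in bit-reversed order (plain integers, no Montgomery form).
ZETAS = [pow(ZETA, bitrev7(i), Q) for i in range(128)]


def ct_butterfly(a, b, w):
    """Cooley-Tukey butterfly: (a, b) -> (a + w*b, a - w*b) mod Q."""
    t = (w * b) % Q
    return (a + t) % Q, (a - t) % Q


def gs_butterfly(a, b, w_inv):
    """Gentleman-Sande butterfly: (a, b) -> (a + b, w_inv*(a - b)) mod Q."""
    return (a + b) % Q, (w_inv * (a - b)) % Q


def ntt(poly):
    """Forward 7-layer NTT; returns a new list of N coefficients."""
    r = list(poly)
    k = 1
    length = 128
    while length >= 2:
        for start in range(0, N, 2 * length):
            w = ZETAS[k]
            k += 1
            for j in range(start, start + length):
                r[j], r[j + length] = ct_butterfly(r[j], r[j + length], w)
        length //= 2
    return r


def intt(poly):
    """Inverse NTT: undoes each layer of ntt(), then scales by 128^-1."""
    r = list(poly)
    length = 2
    while length <= 128:
        k = 128 // length  # first twiddle index used by this layer in ntt()
        for start in range(0, N, 2 * length):
            w_inv = pow(ZETAS[k], Q - 2, Q)  # inverse via Fermat, Q is prime
            k += 1
            for j in range(start, start + length):
                r[j], r[j + length] = gs_butterfly(r[j], r[j + length], w_inv)
        length *= 2
    n_inv = pow(128, Q - 2, Q)  # each of the 7 layers doubles the values
    return [(x * n_inv) % Q for x in r]
```

A round trip such as `intt(ntt(p)) == p` for any list `p` of 256 values in the range 0 to 3328 is a quick sanity check. In a hardware design the two butterfly functions correspond to the butterfly units, and each `% Q` corresponds to a modular reduction step that would typically be pipelined.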
Acknowledgements
We would like to thank Ho Chi Minh City University of Technology (HCMUT), VNU-HCM, for the support of time and facilities for this study.
Nguyen, H., Tran, L. Design of Polynomial NTT and INTT Accelerator for Post-Quantum Cryptography CRYSTALS-Kyber. Arab J Sci Eng 48, 1527–1536 (2023). https://doi.org/10.1007/s13369-022-06928-w
DOI: https://doi.org/10.1007/s13369-022-06928-w