Highly-Parallel Montgomery Multiplication for Multi-Core General-Purpose Microprocessors

  • Selçuk Baktir
  • Erkay Savaş
Conference paper


Popular public-key algorithms such as RSA and Diffie-Hellman key exchange, and more advanced cryptographic schemes such as Paillier’s and Damgård-Jurik’s algorithms (with applications in private information retrieval), require efficient modular multiplication with large integers of at least 1024 bits. The Montgomery multiplication algorithm has proven successful for modular multiplication of large integers. While general-purpose multi-core processors have become mainstream on desktop as well as portable computers, utilization of their computing resources has been largely overlooked when it comes to performing computationally intensive cryptographic operations. In this work, we propose a new parallel Montgomery multiplication algorithm which exhibits up to 39% better performance than the best known serial Montgomery multiplication variant for bit-lengths of 2048 or larger. Furthermore, for bit-lengths of 4096 or larger, the proposed algorithm exhibits better performance by utilizing the multiple cores available. It achieves speedups of up to 81%, 3.37 times and 4.87 times on the general-purpose microprocessors used, with 2, 4 and 6 cores, respectively. To our knowledge, this is the first work that shows with actual implementation results that Montgomery multiplication can be practically and scalably parallelized on general-purpose multi-core processors.
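For readers unfamiliar with the underlying operation, the following is a minimal sketch of textbook serial Montgomery multiplication in Python. It is not the parallel algorithm proposed in this work, whose details are not given in the abstract; it only illustrates how Montgomery reduction replaces trial division by the modulus with shifts and masks. The function names (`montgomery_setup`, `redc`, `mont_mul`, `mod_mul`) and the choice R = 2^k with k equal to the bit length of the modulus are assumptions made for illustration. The code requires Python 3.8 or later for `pow(n, -1, r)`.

```python
def montgomery_setup(n: int, k: int):
    """Precompute constants for an odd modulus n with R = 2**k > n."""
    if n % 2 == 0 or n >= (1 << k):
        raise ValueError("n must be odd and smaller than 2**k")
    r = 1 << k
    n_prime = (-pow(n, -1, r)) % r  # n * n_prime == -1 (mod R)
    r2 = (r * r) % n                # R^2 mod n, used to enter Montgomery form
    return n_prime, r2


def redc(t: int, n: int, k: int, n_prime: int) -> int:
    """Montgomery reduction: return t * R^-1 mod n for 0 <= t < n * R."""
    mask = (1 << k) - 1
    m = ((t & mask) * n_prime) & mask  # m = (t mod R) * n' mod R
    u = (t + m * n) >> k               # t + m*n is divisible by R
    return u - n if u >= n else u      # u < 2n, so one subtraction suffices


def mont_mul(a_bar: int, b_bar: int, n: int, k: int, n_prime: int) -> int:
    """Multiply two values that are already in Montgomery form."""
    return redc(a_bar * b_bar, n, k, n_prime)


def mod_mul(a: int, b: int, n: int) -> int:
    """Compute a * b mod n via Montgomery form (illustrative wrapper)."""
    k = n.bit_length()
    n_prime, r2 = montgomery_setup(n, k)
    a_bar = redc((a % n) * r2, n, k, n_prime)  # a * R mod n
    b_bar = redc((b % n) * r2, n, k, n_prime)  # b * R mod n
    c_bar = mont_mul(a_bar, b_bar, n, k, n_prime)
    return redc(c_bar, n, k, n_prime)          # leave Montgomery form


if __name__ == "__main__":
    n = (1 << 127) - 1
    a, b = 3 ** 70, 5 ** 50
    print(mod_mul(a, b, n) == (a * b) % n)
```

Converting into and out of Montgomery form costs extra reductions, so the representation pays off mainly when many multiplications share one modulus, as in the modular exponentiations used by RSA and Diffie-Hellman.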


Montgomery multiplication; RSA; Multi-core architectures; General-purpose microprocessors; Parallel algorithms



Copyright information

© Springer-Verlag London 2013

Authors and Affiliations

  1. Department of Computer Engineering, Bahçeşehir University, Istanbul, Turkey
  2. Faculty of Engineering and Natural Sciences, Sabanci University, Istanbul, Turkey
