Energy-Efficient Software Implementation of Long Integer Modular Arithmetic

Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3659)


This paper investigates the performance and energy characteristics of software algorithms for long integer arithmetic. We analyze and compare the number of RISC-like processor instructions (e.g., single-precision multiplication, addition, load, and store instructions) required to execute different algorithms, such as schoolbook, Karatsuba, and Comba multiplication, as well as Montgomery reduction. Our analysis shows that combining Karatsuba-Comba multiplication with Montgomery reduction (the so-called KCM method) achieves better performance than other algorithms for modular multiplication. Furthermore, we present a simple model for comparing the energy efficiency of arithmetic algorithms. This model uses the clock cycles and average current consumption of the base instructions to estimate the total energy consumed during the execution of an algorithm. Our experiments, conducted on a StrongARM SA-1100 processor, indicate that a 1024-bit KCM multiplication consumes about 22% less energy than other modular multiplication techniques.
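An instruction-level energy model of the kind described above can be sketched as follows. This is only an illustration under stated assumptions, not the model as specified in the paper: it assumes a fixed supply voltage and clock frequency and estimates energy as E = V * T_clk * sum(n_i * c_i * I_i), where n_i is how often base instruction i is executed, c_i its clock cycles, and I_i its average current. The names `BaseInstruction`, `estimate_energy_joules`, and `PLACEHOLDER_PROFILE`, and every numeric value, are hypothetical placeholders rather than measured StrongARM SA-1100 data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BaseInstruction:
    """Per-instruction cost figures (hypothetical placeholders, not measurements)."""
    cycles: int        # clock cycles needed for one execution
    current_ma: float  # average current drawn while executing, in milliamperes


def estimate_energy_joules(counts, profile, supply_volts, clock_hz):
    """Estimate energy as E = V * T_clk * sum(n_i * c_i * I_i).

    counts  -- maps an instruction name to how many times it is executed
    profile -- maps an instruction name to its BaseInstruction cost figures
    """
    clock_period_s = 1.0 / clock_hz
    charge_coulombs = 0.0
    for name, executions in counts.items():
        instr = profile[name]
        current_a = instr.current_ma / 1000.0
        # charge = current * time, with time = executions * cycles * clock period
        charge_coulombs += executions * instr.cycles * current_a * clock_period_s
    return supply_volts * charge_coulombs


# Hypothetical profile: the cycle counts and currents are made up for illustration.
PLACEHOLDER_PROFILE = {
    "mul": BaseInstruction(cycles=3, current_ma=200.0),
    "add": BaseInstruction(cycles=1, current_ma=180.0),
    "load": BaseInstruction(cycles=1, current_ma=190.0),
    "store": BaseInstruction(cycles=1, current_ma=190.0),
}

if __name__ == "__main__":
    # Hypothetical instruction counts for two competing algorithms.
    algorithm_a = {"mul": 2000, "add": 6000, "load": 4000, "store": 2000}
    algorithm_b = {"mul": 1500, "add": 7000, "load": 3500, "store": 1800}
    for label, counts in (("A", algorithm_a), ("B", algorithm_b)):
        joules = estimate_energy_joules(counts, PLACEHOLDER_PROFILE, 1.5, 200e6)
        print(f"algorithm {label}: {joules * 1e6:.3f} microjoules")
```

Because the estimate is a weighted sum over instruction counts, two algorithms can be compared from their instruction counts alone once a per-instruction profile for the target processor is available. An algorithm that trades multiplications for additions, as Karatsuba-style methods do, wins only if the saved multiplication cost outweighs the added cost of the extra additions and memory accesses.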


Keywords: Clock Cycle; Outer Loop; Base Instruction; Very Large Scale Integration; Modular Multiplication



Copyright information

© Springer-Verlag Berlin Heidelberg 2005

Authors and Affiliations

  1. Institute for Applied Information Processing and Communications, Graz University of Technology, Graz, Austria
  2. Faculty of Mathematics and Horst Görtz Institute for IT-Security, Ruhr University Bochum, Bochum, Germany
  3. Faculty of Engineering and Natural Sciences, Sabanci University, Istanbul, Turkey
