Double Level Montgomery Cox-Rower Architecture, New Bounds

  • Jean-Claude Bajard
  • Nabil Merkiche
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8968)


Recently, the Residue Number System and the Cox-Rower architecture have been used to compute efficiently Elliptic Curve Cryptography over FPGA. In this paper, we are rewriting the conditions of Kawamura’s theorem for the base extension without error in order to define the maximal range of the set from which the moduli can be chosen to build a base. At the same time, we give a procedure to compute correctly the truncation function of the Cox module. We also present a modified ALU of the Rower architecture using a second level of Montgomery Representation. Such architecture allows us to select the moduli with the new upper bound defined with the condition. This modification makes the Cox-Rower architecture suitable to compute 521 bits ECC with radix downto 16 bits compared to 18 with the classical Cox-Rower architecture. We validate our results through FPGA implementation of a scalar multiplication at classical cryptography security levels (NIST curves). Our implementation uses 35 % less LUTs compared to the state of the art generic implementation of ECC using RNS for the same performance [5]. We also slightly improve the computation time (latency) and our implementation shows best ratio throughput/area for RNS computation supporting any curve independently of the chosen base.


Residue Number System High speed Hardware implementation Elliptic Curve Cryptography FPGA 

Supplementary material


  1. 1.
    Antão, S., Bajard, J.-C., Sousa, L.: RNS-based elliptic curve point multiplication for massive parallel architectures. Comput. J. 55(5), 629–647 (2012)CrossRefGoogle Scholar
  2. 2.
    Bigou, K., Tisserand, A.: Improving modular inversion in RNS using the plus-minus method. In: Bertoni, G., Coron, J.-S. (eds.) CHES 2013. LNCS, vol. 8086, pp. 233–249. Springer, Heidelberg (2013) CrossRefGoogle Scholar
  3. 3.
    Cheung, R.C.C., Duquesne, S., Fan, J., Guillermin, N., Verbauwhede, I., Yao, G.X.: FPGA implementation of pairings using residue number system and lazy reduction. In: Preneel, B., Takagi, T. (eds.) CHES 2011. LNCS, vol. 6917, pp. 421–441. Springer, Heidelberg (2011) CrossRefGoogle Scholar
  4. 4.
    Güneysu, T., Paar, C.: Ultra high performance ECC over NIST primes on commercial FPGAs. In: Oswald, E., Rohatgi, P. (eds.) CHES 2008. LNCS, vol. 5154, pp. 62–78. Springer, Heidelberg (2008) CrossRefGoogle Scholar
  5. 5.
    Guillermin, N.: A high speed coprocessor for elliptic curve scalar multiplications over \(\mathbb{F}_p\). In: Mangard, S., Standaert, F.-X. (eds.) CHES 2010. LNCS, vol. 6225, pp. 48–64. Springer, Heidelberg (2010) CrossRefGoogle Scholar
  6. 6.
    Kawamura, S., Koike, M., Sano, F., Shimbo, A.: Cox-rower architecture for fast parallel montgomery multiplication. In: Preneel, B. (ed.) EUROCRYPT 2000. LNCS, vol. 1807, pp. 523–538. Springer, Heidelberg (2000) CrossRefGoogle Scholar
  7. 7.
    Ma, Y., Liu, Z., Pan, W., Jing, J.: A high-speed elliptic curve cryptographic processor for generic curves over \(\text{ GF }(p)\). In: Lange, T., Lauter, K., Lisoněk, P. (eds.) SAC 2013. LNCS, vol. 8282, pp. 421–437. Springer, Heidelberg (2014)CrossRefGoogle Scholar
  8. 8.
    Nozaki, H., Motoyama, M., Shimbo, A., Kawamura, S.: Implementation of RSA algorithm based on RNS montgomery multiplication. In: Koç, Ç.K., Naccache, D., Paar, C. (eds.) CHES 2001. LNCS, vol. 2162, pp. 364–376. Springer, Heidelberg (2001) CrossRefGoogle Scholar
  9. 9.
    Posch, K.C., Posch, R.: Base extension using a convolution sum in residue number systems. Computing 50(2), 93–104 (1993)CrossRefzbMATHMathSciNetGoogle Scholar
  10. 10.
    Posch, K.C., Posch, R.: Modulo reduction in residue number systems. IEEE Trans. Parallel Distrib. Syst. 6(5), 449–454 (1995)CrossRefMathSciNetGoogle Scholar
  11. 11.
    Schinianakis, D.M., Fournaris, A.P., Michail, H.E., Kakarountas, A.P., Stouraitis, T.: An RNS implementation of an \( f_{p} \) elliptic curve point multiplier. IEEE Trans. Circuits Syst. I: Regul. Pap. 56(6), 1202–1213 (2009)CrossRefMathSciNetGoogle Scholar
  12. 12.
    Szerwinski, R., Güneysu, T.: Exploiting the power of GPUs for asymmetric cryptography. In: Oswald, E., Rohatgi, P. (eds.) CHES 2008. LNCS, vol. 5154, pp. 79–99. Springer, Heidelberg (2008) CrossRefGoogle Scholar
  13. 13.
    Yao, G.X., Fan, J., Cheung, R.C.C., Verbauwhede, I.: Faster pairing coprocessor architecture. In: Abdalla, M., Lange, T. (eds.) Pairing 2012. LNCS, vol. 7708, pp. 160–176. Springer, Heidelberg (2013) CrossRefGoogle Scholar

Copyright information

© Springer International Publishing Switzerland 2015

Authors and Affiliations

  1. 1.Sorbonnes Universités, UPMC Univ Paris 06, UMR 7606, LIP6ParisFrance
  2. 2.CNRS, UMR 7606, LIP6ParisFrance
  3. 3.DGA/MIRennesFrance

Personalised recommendations