Design of a Residue Number System Based Linear System Solver in Hardware

Journal of Signal Processing Systems

Abstract

This paper focuses on the error-free solution of dense linear systems using residual arithmetic in hardware. The designed Modular System uses identical hardware Residual Processors (RPs) to solve independent systems of linear congruences and combines their solutions into the solution of the given linear system. This approach uses the residue number system, which is based on the Chinese remainder theorem. To efficiently exploit parallel processing and the cooperation of the individual components, a hardware architecture of the Modular System with several RPs is designed. To verify the proposed architecture, a Xilinx FPGA with a MicroBlaze processor was used. Experimental results are obtained for an evaluation FPGA board with a Virtex 6. Results from the implementation serve for a subsequent theoretical analysis of the system performance for various linear system sizes and for further improvement of the system. The proposed system can be useful as a special hardware peripheral or as part of an embedded system for solving large nonsingular systems of linear equations with integer, rational or floating-point coefficients with arbitrary precision.

Notes

  1. Ncat 5.51 (http://nmap.org/ncat)

  2. The modular multiplicative inverse can be computed by one of the variants of the Extended Euclidean Algorithm; for further reference, see [21] or [17], for example.

References

  1. IEEE Computer Society Standards Committee (2008). IEEE Standard for Floating-Point Arithmetic. ANSI/IEEE STD 754-2008. IEEE.

  2. Skála, J., & Bárta, M. (2012).

  3. Garner, H.L. (1959). The residue number system. In Papers presented at the March 3–5, 1959, Western Joint Computer Conference, IRE-AIEE-ACM ’59 (Western) (pp. 146–153). New York, NY, USA: ACM.

  4. Koç, Ç.K. (1992). A parallel algorithm for exact solution of linear equations via congruence technique. Computers & Mathematics with Applications, 23(12), 13–24.

  5. Morháč, M., & Lórencz, R. (1992). A modular system for solving linear equations exactly, i. Architecture and numerical algorithms. Computers and Artificial Intelligence, 11(4), 351–361.

  6. Lórencz, R., & Morháč, M. (1992). A modular system for solving linear equations exactly, ii. Hardware realization. Computers and Artificial Intelligence, 11(5), 497–507.

  7. Buček, J., Kubalík, P., Lórencz, R., & Zahradnický, T. (2012). Dedicated Hardware Implementation of a Linear Congruence Solver in FPGA. In The 19th IEEE international conference on electronics, circuits, and systems, ICECS 2012, Monterey, IEEE (pp. 689–692).

  8. Buček, J., Kubalík, P., Lórencz, R., & Zahradnický, T. (2013). Comparison of FPGA and ASIC implementation of a linear congruence solver. In 16th Euromicro Conference on Digital System Design (DSD), 2013 (pp. 284–287).

  9. Taleshmekaeil, D., & Mousavi, A. (2010). The use of residue number system for improving the digital image processing. In 10th International Conference on Signal Processing (ICSP), 2010 IEEE (pp. 775–780).

  10. Younes, D., & Steffan, P. (2013). Efficient image processing application using residue number system. In Proceedings of the 20th International Conference on Mixed design of integrated circuits and systems (MIXDES), 2013, IEEE.

  11. Cardarilli, G., Nannarelli, A., & Re, M. (2007). Residue number system for low-power DSP applications. In Signals, Systems and Computers, 2007. ACSSC 2007. Conference Record of the Forty-First Asilomar Conference on (pp. 1412–1416).

  12. Mirshekari, A., & Mosleh, M. (2010). Hardware implementation of a fast FIR filter with residue number system. In 2nd International Conference on Industrial Mechatronics and Automation (ICIMA), 2010, (Vol. 2 pp. 312–315).

  13. Schinianakis, D., Kakarountas, A., & Stouraitis, T. (2006). A new approach to elliptic curve cryptography: an RNS architecture. In Electrotech. Conf., 2006. MELECON 2006. IEEE Mediterranean (pp. 1241–1245).

  14. Güneysu, T., & Paar, C. (2008). Ultra high performance ECC over NIST primes on commercial FPGAs. In Proceedings of the 10th international workshop on cryptographic hardware and embedded systems. CHES ’08 (pp. 62–78). Berlin, Heidelberg: Springer.

  15. Buček, J., Kubalík, P., Lórencz, R., & Zahradnický, T. (2014). An ASIC linear congruence solver synthesized with three cell libraries. In The 21st IEEE international conference on electronics, circuits, and systems, ICECS 2014, Marseille, France.

  16. Buček, J., Kubalík, P., Lórencz, R., & Zahradnický, T. (2014). System on chip design of a linear system solver. In 2014 international symposium on system-on-chip proceedings (pp. 1–6). Piscataway: IEEE.

  17. Gregory, R.T., & Krishnamurthy, E.V. (1984). Methods and application of error-free computation, Springer.

  18. Young, D.M., & Gregory, R.T. (1973). A Survey of Numerical Mathematics, Vol. 2. Addison-Wesley Series in Mathematics. Addison-Wesley Publishing Company, Inc.

  19. Newman, M. (1967). Solving equations exactly. National Bureau of Standards, 71B, 171–179.

  20. Koç, Ç.K., Güvenç, A., & Bakkaloğlu, B. (1994). Exact solution of linear equations on distributed-memory multiprocessors. Parallel Algorithms and Applications, 3, 135–143.

  21. Knuth, D.E. (1998). The Art of Computer Programming, Vol. 2: Seminumerical Algorithms (3rd ed.). Reading, Mass.: Addison-Wesley Publishing Company, Inc.

  22. Lórencz, R. (2002). New algorithm for classical modular inverse. In Cryptographic hardware and embedded systems, CHES 2002 (pp. 57–70). New York: Springer.

  23. Vondra, L. (2014). System for solving linear equation systems. PhD thesis, Czech Technical University in Prague.

Acknowledgment

This research was supported by the Czech Science Foundation project no. P103/12/2377.

Author information

Corresponding author

Correspondence to Róbert Lórencz.

Appendix A: Examples

Let us have a system of 3 linear equations given by the augmented matrix:

$$\left( \begin{array}{rrrr} 3 & -1 & 2 & 1 \\ 1.5 & 3 & -2 & -1 \\ 0.5 & -1 & 1.5 & 2 \end{array} \right) $$

First, we convert the system to an equivalent system with integer coefficients. In this case, it suffices to multiply rows 2 and 3 by 2. The solution to this system is the same as before. This matrix is the input to Method 1, and thereby also to Algorithm 1:

$$(\mathbf{A|b})=\left( \begin{array}{rrrr} 3 & -1 & 2 & 1 \\ 3 & 6 & -4 & -2 \\ 1 & -2 & 3 & 4 \end{array} \right) $$
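The scaling step can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper name `scale_rows_to_integers` is hypothetical, and multiplying each row by the least common multiple of its denominators is one simple way to obtain integer coefficients; the paper's hardware may perform the conversion differently.

```python
from fractions import Fraction
from functools import reduce
from math import gcd


def lcm(a, b):
    """Least common multiple of two positive integers."""
    return a * b // gcd(a, b)


def scale_rows_to_integers(rows):
    """Multiply every row by the LCM of its denominators (hypothetical helper).

    Scaling a row by a nonzero constant does not change the solution.
    """
    result = []
    for row in rows:
        fracs = [Fraction(v) for v in row]  # exact rational values
        factor = reduce(lcm, (f.denominator for f in fracs), 1)
        result.append([int(f * factor) for f in fracs])
    return result


# Coefficients are given as strings so that 1.5 and 0.5 are parsed exactly.
aug = [
    ["3", "-1", "2", "1"],
    ["1.5", "3", "-2", "-1"],
    ["0.5", "-1", "1.5", "2"],
]
print(scale_rows_to_integers(aug))
```

For this example the row factors are 1, 2 and 2, which matches multiplying rows 2 and 3 by 2.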

Let us now choose the set of moduli \((m_1, m_2, m_3) = (5, 7, 11)\) for the RNS representation. For example, the element \(a_{2,2} = 6\) has the RNS representation \((1, 6, 6)\) modulo \((5, 7, 11)\). This representation is unique modulo \(M = \prod_{i=1}^{r} m_{i} = 5\times 7\times 11 = 385\).

Then, we compute the residues of \((\mathbf{A|b})\) modulo \(m_1, m_2, m_3\), which gives the augmented matrices of three systems of linear congruences (SLCs). This operation is denoted in Algorithm 1, as well as in Method 1, as Input conversion.

$$(\mathbf{A|b})\bmod 5=\left( \begin{array}{cccc} 3 & 4 & 2 & 1 \\ 3 & 1 & 1 & 3 \\ 1 & 3 & 3 & 4 \end{array} \right)$$
$$(\mathbf{A|b})\bmod 7=\left( \begin{array}{cccc} 3 & 6 & 2 & 1 \\ 3 & 6 & 3 & 5 \\ 1 & 5 & 3 & 4 \end{array} \right)$$
$$(\mathbf{A|b})\bmod 11=\left( \begin{array}{cccc} 3 & 10 & 2 & 1 \\ 3 & 6 & 7 & 9 \\ 1 & 9 & 3 & 4 \end{array} \right) $$
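The Input conversion above amounts to an element-wise reduction. The following Python sketch reproduces it; the names `to_rns` and `input_conversion` are hypothetical and chosen only for this illustration.

```python
MODULI = (5, 7, 11)

# Integer augmented matrix (A|b) from the example.
AUG = [
    [3, -1, 2, 1],
    [3, 6, -4, -2],
    [1, -2, 3, 4],
]


def to_rns(value, moduli):
    """RNS digits of one integer: its residue in every modulus."""
    # Python's % returns a non-negative result for a positive modulus,
    # so negative coefficients are mapped into 0..m-1 automatically.
    return tuple(value % m for m in moduli)


def input_conversion(aug, moduli):
    """One residue matrix (one SLC) per modulus."""
    return {m: [[v % m for v in row] for row in aug] for m in moduli}


print(to_rns(6, MODULI))  # RNS digits of a_{2,2} = 6
for m, matrix in input_conversion(AUG, MODULI).items():
    print(m, matrix)
```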

Next, we will solve the individual SLCs in their respective moduli (SLC Solving). We will show the process for \(m_2 = 7\).

In the first elimination step, a pivot \(a_{1,1}^{(0)}=3\) is found. Its row index is stored in \(c_1 = 1\), and the determinant intermediate value is \(d_2 = 3\).

The pivot’s inverse (see Note 2) is \(3^{-1} \bmod 7 = 5\), and the pivot’s row is multiplied by this inverse: \(a_{1,j}^{(1)}=[a_{1,j+1}^{(0)} \times 5] \bmod 7\). The first row is thereby shifted one element to the left (no information is lost, since the discarded value would be a constant 1).

$$a_{i,j}^{(0)}= \left( \begin{array}{cccc} \mathbf{3} & 6 & 2 & 1 \\ 3 & 6 & 3 & 5 \\ 1 & 5 & 3 & 4 \end{array} \right) \rightarrow \left( \begin{array}{cccc} 2 & 3 & 5 & \cdot \\ 3 & 6 & 3 & 5 \\ 1 & 5 & 3 & 4 \end{array} \right) $$
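The pivot-row step can be sketched as follows. The modular inverse is computed with a textbook extended Euclidean algorithm, which is only one of the variants mentioned in Note 2 and not necessarily the one used in the hardware; the function names `inv_mod` and `normalize_pivot_row` are hypothetical.

```python
def inv_mod(a, m):
    """Modular inverse of a modulo m via the extended Euclidean algorithm."""
    r0, r1 = m, a % m
    t0, t1 = 0, 1
    while r1 != 0:
        q = r0 // r1
        r0, r1 = r1, r0 - q * r1
        t0, t1 = t1, t0 - q * t1
    if r0 != 1:
        raise ValueError("value is not invertible for this modulus")
    return t0 % m


def normalize_pivot_row(row, m):
    """Multiply the pivot row by the pivot's inverse and shift it left.

    The leading element would become the constant 1, so it is dropped.
    """
    inv = inv_mod(row[0], m)
    return [v * inv % m for v in row[1:]]


print(inv_mod(3, 7))
print(normalize_pivot_row([3, 6, 2, 1], 7))
```

Applied to the first row modulo 7, this yields the shifted row \((2, 3, 5)\) shown in the matrix above.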

Further in this elimination step, all other rows (\(l = 2, 3\)) are reduced using the adjusted pivot’s row: \(a_{l,j}^{(1)}= [a_{l,j+1}^{(0)} - (a_{l,1}^{(0)}\, a_{1,j}^{(1)})] \bmod 7\). Again, this process shifts the elements to the left, discarding unnecessary zeros.

$$a_{i,j}^{(1)}=\left( \begin{array}{cccc} 2 & 3 & 5 & \cdot \\ 0 & 1 & 4 & \cdot \\ \mathbf{3} & 0 & 6 & \cdot \end{array} \right) $$

In the second elimination step, a pivot \(a_{3,1}^{(1)}=3\) is found. Its row index is stored in \(c_2 = 3\), and the determinant intermediate value is \(d_2 = 3 \times 3 \bmod 7 = 2\). The pivot’s inverse is \(3^{-1} \bmod 7 = 5\), and the pivot’s row is multiplied by this inverse: \(a_{3,j}^{(2)}=[a_{3,j+1}^{(1)} \times 5] \bmod 7\).

$$\left( \begin{array}{cccc} 2 & 3 & 5 & \cdot\\ 0 & 1 & 4 & \cdot\\ 0 & 2 &\cdot& \cdot \end{array} \right) $$

Then, all other rows (\(l = 1, 2\)) are reduced using the adjusted pivot’s row: \(a_{l,j}^{(2)}= [a_{l,j+1}^{(1)} - (a_{l,1}^{(1)}\, a_{3,j}^{(2)})] \bmod 7\). Again, this process shifts the elements to the left, discarding unnecessary zeros.

$$a_{i,j}^{(2)}=\left( \begin{array}{cccc} 3 & 1 & \cdot & \cdot \\ \mathbf{1} & 4 & \cdot & \cdot \\ 0 & 2 & \cdot & \cdot \end{array} \right) $$

In the third and final elimination step, a pivot \(a_{2,1}^{(2)}=1\) is found. Its row index is stored in \(c_3 = 2\), and the determinant intermediate value is \(d_2 = 2 \times 1 \bmod 7 = 2\). The pivot’s inverse is \(1^{-1} \bmod 7 = 1\), and the pivot’s row is multiplied by this inverse: \(a_{2,j}^{(3)}=[a_{2,j+1}^{(2)} \times 1] \bmod 7\).

$$\left( \begin{array}{cccc} 3 & 1 & \cdot & \cdot \\ 4 & \cdot & \cdot & \cdot \\ 0 & 2 & \cdot & \cdot \end{array} \right) $$

The other rows (\(l = 1, 3\)) are reduced as in the previous steps: \(a_{l,j}^{(3)}= [a_{l,j+1}^{(2)} - (a_{l,1}^{(2)}\, a_{2,j}^{(3)})] \bmod 7\).

$$a_{i,j}^{(3)}= \left( \begin{array}{cccc} 3 & \cdot & \cdot & \cdot \\ 4 & \cdot & \cdot & \cdot \\ 2 & \cdot & \cdot & \cdot \end{array} \right) $$

The solution \(\mathbf{y}_2\) is now in the first column, but not in the correct order, since the pivots were found in the order \(\mathbf{c} = (1, 3, 2)\). Therefore, the solution is reordered to get \(\mathbf{y}_2 = (3, 2, 4)^{\mathsf{T}}\).

For the same reason, the sign of the computed determinant must be corrected. In the index vector \(\mathbf{c}\), an odd number of pair swaps is needed to create the ordered sequence \((1, 2, 3)\). Therefore, \(\text{sgn}(\mathbf{c}) = -1\) and the determinant’s sign is corrected: \(d_2 = 2 \times (-1) \bmod 7 = 5\).

The solution is then multiplied by the determinant to get \(\mathbf{z}_2 = \mathbf{y}_2 d_2 \bmod 7 = (3, 2, 4)^{\mathsf{T}} \times 5 \bmod 7 = (1, 3, 6)^{\mathsf{T}}\).

Similarly, the solutions and determinants in the other moduli are computed, giving \(\mathbf{z}_1 = (0, 0, 2)^{\mathsf{T}}\), \(d_1 = 4\) and \(\mathbf{z}_3 = (2, 1, 7)^{\mathsf{T}}\), \(d_3 = 8\).
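The whole SLC Solving stage for one modulus can be sketched in Python as below. This is a software illustration of the left-shifting elimination walked through above, not the paper's Algorithm 1 or its hardware data path. The function name `solve_slc`, the pivot search order (first unused row with a nonzero leading element) and the use of Python's built-in `pow(x, -1, m)` (available from Python 3.8) are assumptions; row indices are 0-based in the code.

```python
def solve_slc(aug, m):
    """Solve (A|b) modulo a prime m; return (y, d, z) with z = y * d mod m."""
    n = len(aug)
    rows = [[v % m for v in row] for row in aug]
    used = [False] * n   # rows that have already served as a pivot row
    c = []               # pivot row index chosen in each step
    d = 1                # running product of pivots (determinant up to sign)
    for _ in range(n):
        p = next((i for i in range(n) if not used[i] and rows[i][0] != 0), None)
        if p is None:
            raise ValueError("matrix is singular modulo m")
        used[p] = True
        c.append(p)
        pivot = rows[p][0]
        d = d * pivot % m
        inv = pow(pivot, -1, m)
        # Pivot row: multiply by the inverse and shift left by one element.
        new_p = [v * inv % m for v in rows[p][1:]]
        # All other rows: eliminate the leading element, shifting left too.
        rows = [
            new_p if l == p
            else [(x - rows[l][0] * t) % m for x, t in zip(rows[l][1:], new_p)]
            for l in range(n)
        ]
    # Row c[k] holds the k-th unknown; reorder accordingly.
    y = [rows[c[k]][0] for k in range(n)]
    # The parity of the pivot permutation fixes the determinant's sign.
    inversions = sum(1 for i in range(n) for j in range(i + 1, n) if c[i] > c[j])
    if inversions % 2 == 1:
        d = -d % m
    z = [v * d % m for v in y]
    return y, d, z


AUG = [[3, -1, 2, 1], [3, 6, -4, -2], [1, -2, 3, 4]]
for m in (5, 7, 11):
    print(m, solve_slc(AUG, m))
```

For \(m_2 = 7\) the sketch selects the pivots in the same order as the walk-through and returns \(\mathbf{y}_2 = (3, 2, 4)\), \(d_2 = 5\), \(\mathbf{z}_2 = (1, 3, 6)\).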

Next, we will perform the Output conversion, the third stage of Method 1. First, the intermediate results are written in the form of \(\textbf{t}_{k}= \left[ \begin{array}{c} \textbf{z}_{k} \\ d_{k} \end{array} \right]\):

$$[\textbf{t}_{1}, \textbf{t}_{2}, \textbf{t}_{3}]= \left[ \begin{array}{ccc} 0 & 1 & 2 \\ 0 & 3 & 1 \\ 2 & 6 & 7 \\ 4 & 5 & 8 \end{array} \right] $$

The values \(\textbf{t}_1, \textbf{t}_2, \textbf{t}_3\) are now converted according to Algorithm 2. First, we will show the conversion of the determinant \(d\), whose RNS digits are the last elements of \(\textbf{t}_k\), i.e. \(d=\det \mathbf{A}\) has the RNS representation \((d_1, d_2, d_3) = (4, 5, 8)\) modulo \((5, 7, 11)\).

[Figure h: table of the mixed-radix conversion of the determinant]

During the computation of Algorithm 2, the mixed-radix digits \((4, 3, 0)\) are computed, shown in boxes in the table above. (In fact, the first digit 4 is just taken from \(\textbf{t}_1\).) The mixed-radix digit weights are \((1, m_1, m_1 m_2) = (1, 5, 5 \times 7)\). The value of the determinant is thus \(d = 4 \times 1 + 3 \times 5 + 0 \times 5 \times 7 = 19\). Because \(d<\frac{M}{2}\), i.e. \(19<\frac{385}{2}\), it is positive.

A negative number is converted the same way. For example, the first element of \(\mathbf{z}\) has the RNS representation \((0, 1, 2)\):

[Figure i: table of the mixed-radix conversion of the first element of z]

The mixed-radix representation is thus \((0, 3, 10)\), and the value is \(0 + 3 \times 5 + 10 \times 5 \times 7 = 365\), which is more than \(\frac{M}{2}\); therefore, it is negative. The correct value can be computed by subtracting \(M\), yielding \(365 - 385 = -20\).

The same process is applied to all members of \([\textbf{t}_1, \textbf{t}_2, \textbf{t}_3]\), giving \((\mathbf{z}, d)^{\mathsf{T}} = (-20, 45, 62, 19)^{\mathsf{T}}\).
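The Output conversion can be sketched as below. Algorithm 2 itself is not reproduced in this appendix, so the sketch uses a standard mixed-radix conversion that yields the digits quoted above; it should not be read as the paper's exact procedure. The names `mixed_radix_digits` and `to_signed` are hypothetical, and `pow(x, -1, m)` requires Python 3.8 or later.

```python
MODULI = (5, 7, 11)


def mixed_radix_digits(residues, moduli):
    """Mixed-radix digits of a number given by its RNS residues."""
    digits = []
    for k, (r, m) in enumerate(zip(residues, moduli)):
        v = r % m
        # Peel off the digits found so far, one earlier modulus at a time.
        for j in range(k):
            v = (v - digits[j]) * pow(moduli[j], -1, m) % m
        digits.append(v)
    return digits


def to_signed(residues, moduli):
    """Integer in (-M/2, M/2) represented by the given residues."""
    value, weight = 0, 1
    for digit, m in zip(mixed_radix_digits(residues, moduli), moduli):
        value += digit * weight  # weights are 1, m1, m1*m2, ...
        weight *= m
    # After the loop, weight equals M; values above M/2 denote negatives.
    return value - weight if 2 * value > weight else value


# Rows of [t1, t2, t3]: the three elements of z, then the determinant d.
T = [(0, 1, 2), (0, 3, 1), (2, 6, 7), (4, 5, 8)]
print(mixed_radix_digits((4, 5, 8), MODULI))
print(mixed_radix_digits((0, 1, 2), MODULI))
print([to_signed(row, MODULI) for row in T])
```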

The final step is to get the solution vector \(\mathbf{x}=\frac{\mathbf{z}}{d}=(-\frac{20}{19},\frac{45}{19},\frac{62}{19})^{\mathsf{T}}\). We can now verify that \(\mathbf{x}\) is indeed the solution of the SLE \((\mathbf{A|b})\):

$$\left( \begin{array}{rrr} 3 & -1 & 2 \\ 3 & 6 & -4 \\ 1 & -2 & 3 \end{array} \right).\left( \begin{array}{r} -\frac{20}{19} \\ \frac{45}{19} \\ \frac{62}{19} \end{array} \right)=\left( \begin{array}{r} 1 \\ -2 \\ 4 \end{array} \right) $$
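The verification can also be carried out in exact rational arithmetic. The short Python check below uses the standard `fractions` module; the helper `mat_vec` is a hypothetical name introduced for this illustration. It checks both the integer system and the original system with fractional coefficients.

```python
from fractions import Fraction

A = [[3, -1, 2], [3, 6, -4], [1, -2, 3]]
b = [1, -2, 4]

# Original (unscaled) system from the beginning of the example.
A_orig = [
    [3, -1, 2],
    [Fraction(3, 2), 3, -2],
    [Fraction(1, 2), -1, Fraction(3, 2)],
]
b_orig = [1, -1, 2]

z, d = [-20, 45, 62], 19
x = [Fraction(zi, d) for zi in z]  # exact solution x = z / d


def mat_vec(matrix, vector):
    """Exact matrix-vector product."""
    return [sum(a * v for a, v in zip(row, vector)) for row in matrix]


print(mat_vec(A, x) == b)
print(mat_vec(A_orig, x) == b_orig)
```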

Cite this article

Buček, J., Kubalík, P., Lórencz, R. et al. Design of a Residue Number System Based Linear System Solver in Hardware. J Sign Process Syst 87, 343–356 (2017). https://doi.org/10.1007/s11265-016-1146-1

  • DOI: https://doi.org/10.1007/s11265-016-1146-1
