Design of a Residue Number System Based Linear System Solver in Hardware

Buček, Jiří; Kubalík, Pavel; Lórencz, Róbert; Zahradnický, Tomáš

doi:10.1007/s11265-016-1146-1

Published: 14 June 2016

Journal of Signal Processing Systems, Volume 87, pages 343–356 (2017)

Abstract

This paper focuses on the error-free solution of dense linear systems using residual arithmetic in hardware. The designed Modular System uses identical hardware Residual Processors (RPs) to solve independent systems of linear congruences and combines their solutions into the solution of the given linear system. This approach uses the residue number system, which is based on the Chinese remainder theorem. To exploit parallel processing and the cooperation of the individual components efficiently, a hardware architecture of the Modular System with several RPs is designed. The proposed architecture was verified on a Xilinx FPGA with a MicroBlaze processor, and experimental results were obtained on an FPGA evaluation board with a Virtex 6. The implementation results serve for a subsequent theoretical analysis of the system performance for various linear system sizes and for further improvement of the system. The proposed system can be useful as a special hardware peripheral or as part of an embedded system for solving large nonsingular systems of linear equations with integer, rational, or floating-point coefficients with arbitrary precision.

Notes

1. Ncat 5.51 (http://nmap.org/ncat)
2. The modular multiplicative inverse can be computed by one of the variants of the Extended Euclidean Algorithm; for further reference, see, for example, [21] or [17].

References

1. IEEE Computer Society Standards Committee (2008). IEEE Standard for Floating-Point Arithmetic. ANSI/IEEE Std 754-2008. IEEE.
2. Skála, J., & Bárta, M. (2012).
3. Garner, H.L. (1959). The residue number system. In Papers presented at the March 3–5, 1959, Western Joint Computer Conference, IRE-AIEE-ACM ’59 (Western) (pp. 146–153). New York, NY, USA: ACM.
4. Koç, Ç.K. (1992). A parallel algorithm for exact solution of linear equations via congruence technique. Computers & Mathematics with Applications, 23(12), 13–24.
5. Morháč, M., & Lórencz, R. (1992). A modular system for solving linear equations exactly, I. Architecture and numerical algorithms. Computers and Artificial Intelligence, 11(4), 351–361.
6. Lórencz, R., & Morháč, M. (1992). A modular system for solving linear equations exactly, II. Hardware realization. Computers and Artificial Intelligence, 11(5), 497–507.
7. Buček, J., Kubalík, P., Lórencz, R., & Zahradnický, T. (2012). Dedicated hardware implementation of a linear congruence solver in FPGA. In The 19th IEEE International Conference on Electronics, Circuits, and Systems, ICECS 2012, Monterey (pp. 689–692). IEEE.
8. Buček, J., Kubalík, P., Lórencz, R., & Zahradnický, T. (2013). Comparison of FPGA and ASIC implementation of a linear congruence solver. In 16th Euromicro Conference on Digital System Design (DSD), 2013 (pp. 284–287).
9. Taleshmekaeil, D., & Mousavi, A. (2010). The use of residue number system for improving the digital image processing. In 10th International Conference on Signal Processing (ICSP), 2010 (pp. 775–780). IEEE.
10. Younes, D., & Steffan, P. (2013). Efficient image processing application using residue number system. In Proceedings of the 20th International Conference on Mixed Design of Integrated Circuits and Systems (MIXDES), 2013. IEEE.
11. Cardarilli, G., Nannarelli, A., & Re, M. (2007). Residue number system for low-power DSP applications. In Conference Record of the Forty-First Asilomar Conference on Signals, Systems and Computers, ACSSC 2007 (pp. 1412–1416).
12. Mirshekari, A., & Mosleh, M. (2010). Hardware implementation of a fast FIR filter with residue number system. In 2nd International Conference on Industrial Mechatronics and Automation (ICIMA), 2010 (Vol. 2, pp. 312–315).
13. Schinianakis, D., Kakarountas, A., & Stouraitis, T. (2006). A new approach to elliptic curve cryptography: an RNS architecture. In IEEE Mediterranean Electrotechnical Conference, MELECON 2006 (pp. 1241–1245).
14. Güneysu, T., & Paar, C. (2008). Ultra high performance ECC over NIST primes on commercial FPGAs. In Proceedings of the 10th International Workshop on Cryptographic Hardware and Embedded Systems, CHES ’08 (pp. 62–78). Berlin, Heidelberg: Springer.
15. Buček, J., Kubalík, P., Lórencz, R., & Zahradnický, T. (2014). An ASIC linear congruence solver synthesized with three cell libraries. In The 21st IEEE International Conference on Electronics, Circuits, and Systems, ICECS 2014, Marseille, France.
16. Buček, J., Kubalík, P., Lórencz, R., & Zahradnický, T. (2014). System on chip design of a linear system solver. In 2014 International Symposium on System-on-Chip Proceedings (pp. 1–6). Piscataway: IEEE.
17. Gregory, R.T., & Krishnamurthy, E.V. (1984). Methods and applications of error-free computation. Springer.
18. Young, D.M., & Gregory, R.T. (1973). A Survey of Numerical Mathematics, Vol. 2. Addison-Wesley Series in Mathematics. Addison-Wesley Publishing Company, Inc.
19. Newman, M. (1967). Solving equations exactly. National Bureau of Standards, 71B, 171–179.
20. Koç, Ç.K., Güvenç, A., & Bakkaloğlu, B. (1994). Exact solution of linear equations on distributed-memory multiprocessors. Parallel Algorithms and Applications, 3, 135–143.
21. Knuth, D.E. (1998). The Art of Computer Programming, Vol. 2: Seminumerical Algorithms (3rd ed.). Addison-Wesley Publishing Company, Inc.
22. Lórencz, R. (2002). New algorithm for classical modular inverse. In Cryptographic Hardware and Embedded Systems, CHES 2002 (pp. 57–70). New York: Springer.
23. Vondra, L. (2014). System for solving linear equation systems. PhD thesis, Czech Technical University in Prague.

Acknowledgment

This research was supported by the Czech Science Foundation project no. P103/12/2377.

Author information

Authors and Affiliations

Faculty of Information Technology, Czech Technical University in Prague, Thákurova 9, 160 00, Prague, Czech Republic
Jiří Buček, Pavel Kubalík, Róbert Lórencz & Tomáš Zahradnický

Corresponding author

Correspondence to Róbert Lórencz.

Appendix A: Examples

Let us have a system of 3 linear equations given by the augmented matrix:

$$\left( \begin{array}{rrrr} 3 & -1 & 2 & 1 \\ 1.5 & 3 & -2 & -1 \\ 0.5 & -1 & 1.5 & 2 \end{array} \right) $$

First, we convert the system to an equivalent system with integer coefficients. In this case, it suffices to multiply rows 2 and 3 by 2. The solution to this system is the same as before. This matrix is the input to Method 1, and thereby also to Algorithm 1:

$$(\mathbf{A|b})=\left( \begin{array}{rrrr} 3 & -1 & 2 & 1 \\ 3 & 6 & -4 & -2 \\ 1 & -2 & 3 & 4 \end{array} \right) $$
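The scaling step can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper name `scale_to_integers` is hypothetical, and it multiplies every row by the least common multiple of its denominators, which for this example reproduces the factor 2 for rows 2 and 3.

```python
from fractions import Fraction
from math import lcm

# Augmented matrix (A|b) of the example, with exact rational entries.
rows = [
    [Fraction(3), Fraction(-1), Fraction(2), Fraction(1)],
    [Fraction(3, 2), Fraction(3), Fraction(-2), Fraction(-1)],
    [Fraction(1, 2), Fraction(-1), Fraction(3, 2), Fraction(2)],
]

def scale_to_integers(matrix):
    """Hypothetical helper: scale each row by the LCM of its denominators."""
    result = []
    for row in matrix:
        factor = lcm(*(x.denominator for x in row))
        result.append([int(x * factor) for x in row])
    return result

Ab = scale_to_integers(rows)
for row in Ab:
    print(row)
```

Scaling a whole row of the augmented matrix multiplies both sides of that equation by the same nonzero factor, so the solution set is unchanged.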

Let us now choose the set of moduli $(m_1, m_2, m_3)=(5,7,11)$ for the RNS representation. For example, the element $a_{2,2}=6$ has the RNS representation $(1,6,6)$ modulo $(5,7,11)$. This representation is unique modulo $M = \prod_{i=1}^{r} m_{i} = 5\times 7\times 11 = 385$.
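As a small illustration of the RNS representation and of its uniqueness modulo M, the following sketch computes the digits of 6 and shows that 6 + M has the same digits. The function name `to_rns` is an assumption made for this example only.

```python
from math import prod

moduli = (5, 7, 11)
M = prod(moduli)  # dynamic range of the RNS, 5 * 7 * 11

def to_rns(value, moduli):
    """Hypothetical helper: residues of an integer with respect to each modulus."""
    return tuple(value % m for m in moduli)

digits = to_rns(6, moduli)
same_digits = to_rns(6 + M, moduli)  # values that differ by M are indistinguishable
print(M, digits, same_digits)
```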

Then, we compute the residues of $(\mathbf{A|b})$ modulo $m_1, m_2, m_3$, which gives the augmented matrices of three systems of linear congruences (SLCs). This operation is denoted as Input conversion in both Algorithm 1 and Method 1.

$$(\mathbf{A|b})\bmod 5=\left( \begin{array}{cccc} 3 & 4 & 2 & 1 \\ 3 & 1 & 1 & 3 \\ 1 & 3 & 3 & 4 \end{array} \right)$$

$$(\mathbf{A|b})\bmod 7=\left( \begin{array}{cccc} 3 & 6 & 2 & 1 \\ 3 & 6 & 3 & 5 \\ 1 & 5 & 3 & 4 \end{array} \right)$$

$$(\mathbf{A|b})\bmod 11=\left( \begin{array}{cccc} 3 & 10 & 2 & 1 \\ 3 & 6 & 7 & 9 \\ 1 & 9 & 3 & 4 \end{array} \right) $$
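The three residue matrices above can be reproduced with a short sketch. The function name `input_conversion` merely mirrors the stage name used in the text and is not taken from the paper's implementation; Python's `%` operator already returns a non-negative residue for a positive modulus, which is what maps the negative coefficients into the range 0..m-1.

```python
Ab = [
    [3, -1, 2, 1],
    [3, 6, -4, -2],
    [1, -2, 3, 4],
]
moduli = (5, 7, 11)

def input_conversion(matrix, moduli):
    """Reduce every element of the augmented matrix modulo each modulus."""
    return {m: [[x % m for x in row] for row in matrix] for m in moduli}

residues = input_conversion(Ab, moduli)
for m, mat in residues.items():
    print(m, mat)
```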

Next, we will solve the individual SLCs in their respective moduli (SLC Solving). We will show the process for $m_2=7$.

In the first elimination step, a pivot $a_{1,1}^{(0)}=3$ is found. Its row index is stored in $c_1=1$, and the intermediate value of the determinant is $d_2=3$.

The pivot's inverse (see footnote 2 in the Notes) is $3^{-1} \bmod 7=5$, and the pivot's row is multiplied by this inverse: $a_{1,j}^{(1)}=[a_{1,j+1}^{(0)} \times 5] \bmod 7$. The first row is thereby shifted one element to the left (no information is lost, since the discarded value would be a constant 1).

$$a_{i,j}^{(0)}= \left( \begin{array}{cccc} \mathbf{3} & 6 & 2 & 1 \\ 3 & 6 & 3 & 5 \\ 1 & 5 & 3 & 4 \end{array} \right) \rightarrow \left( \begin{array}{cccc} 2 & 3 & 5 & \cdot \\ 3 & 6 & 3 & 5 \\ 1 & 5 & 3 & 4 \end{array} \right) $$
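The Notes mention that the modular inverse can be obtained with a variant of the Extended Euclidean Algorithm. The sketch below uses the textbook iterative form, which is not necessarily the variant implemented in the hardware; `mod_inverse` is a name chosen for this example. It then scales the pivot row of the mod-7 system and drops the leading element.

```python
def mod_inverse(a, m):
    """Inverse of a modulo m via the iterative extended Euclidean algorithm."""
    old_r, r = a % m, m
    old_s, s = 1, 0  # invariant: old_s * a == old_r (mod m), s * a == r (mod m)
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_s, s = s, old_s - q * s
    if old_r != 1:
        raise ValueError("a has no inverse modulo m")
    return old_s % m

m = 7
pivot_row = [3, 6, 2, 1]           # first row of (A|b) mod 7
inv = mod_inverse(pivot_row[0], m)
# Multiply by the inverse and discard the leading element (it would be 1).
scaled = [(x * inv) % m for x in pivot_row[1:]]
print(inv, scaled)
```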

Further in this elimination step, all other rows ($l=2,3$) are reduced using the adjusted pivot’s row: $a_{l,j}^{(1)}= [a_{l,j+1}^{(0)} - (a_{l,1}^{(0)}\, a_{1,j}^{(1)})] \bmod 7$. Again, this process shifts the elements to the left, discarding the unnecessary zeros.

$$a_{i,j}^{(1)}=\left( \begin{array}{cccc} 2 & 3 & 5 & \cdot \\ 0 & 1 & 4 & \cdot \\ \mathbf{3} & 0 & 6 & \cdot \end{array} \right) $$
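One elimination step with the left shift can be written compactly. The following sketch follows the two update formulas quoted in the text; the function name `elimination_step`, the 0-based row indices, and the use of Python's built-in `pow(x, -1, m)` (available from Python 3.8) for the inverse are choices made for this illustration.

```python
def elimination_step(rows, pivot_row, m):
    """Scale the pivot row, reduce all other rows, and drop the first column."""
    inv = pow(rows[pivot_row][0], -1, m)
    new_pivot = [(x * inv) % m for x in rows[pivot_row][1:]]
    result = []
    for i, row in enumerate(rows):
        if i == pivot_row:
            result.append(new_pivot)
        else:
            # a[l][j] <- a[l][j+1] - a[l][1] * pivot[j]  (mod m)
            result.append([(row[j + 1] - row[0] * new_pivot[j]) % m
                           for j in range(len(new_pivot))])
    return result

start = [[3, 6, 2, 1], [3, 6, 3, 5], [1, 5, 3, 4]]  # (A|b) mod 7
step1 = elimination_step(start, 0, 7)
print(step1)
```

Because every row loses its first element in each step, the working matrix shrinks by one column per step, and after three steps each row holds a single value.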

In the second elimination step, a pivot $a_{3,1}^{(1)}=3$ is found. Its row index is stored in $c_2=3$, and the intermediate value of the determinant is $d_2=3\times 3 \bmod 7=2$. The pivot’s inverse is $3^{-1} \bmod 7=5$, and the pivot’s row is multiplied by this inverse: $a_{3,j}^{(2)}=[a_{3,j+1}^{(1)} \times 5] \bmod 7$.

$$\left( \begin{array}{cccc} 2 & 3 & 5 & \cdot\\ 0 & 1 & 4 & \cdot\\ 0 & 2 &\cdot& \cdot \end{array} \right) $$

Then, all other rows ($l=1,2$) are reduced using the adjusted pivot’s row: $a_{l,j}^{(2)}= [a_{l,j+1}^{(1)} - (a_{l,1}^{(1)}\, a_{3,j}^{(2)})] \bmod 7$. Again, this process shifts the elements to the left, discarding the unnecessary zeros.

$$a_{i,j}^{(2)}=\left( \begin{array}{cccc} 3 & 1 & \cdot & \cdot \\ \mathbf{1} & 4 & \cdot & \cdot \\ 0 & 2 & \cdot & \cdot \end{array} \right) $$

In the third and final elimination step, a pivot $a_{2,1}^{(2)}=1$ is found. Its row index is stored in $c_3=2$, and the intermediate value of the determinant is $d_2=2\times 1 \bmod 7=2$. The pivot’s inverse is $1^{-1} \bmod 7=1$, and the pivot’s row is multiplied by this inverse: $a_{2,j}^{(3)}=[a_{2,j+1}^{(2)} \times 1] \bmod 7$.

$$\left( \begin{array}{cccc} 3 & 1 & \cdot & \cdot \\ 4 & \cdot & \cdot & \cdot \\ 0 & 2 & \cdot & \cdot \end{array} \right) $$

The other rows ($l=1,3$) are reduced as in the previous steps: $a_{l,j}^{(3)}= [a_{l,j+1}^{(2)} - (a_{l,1}^{(2)}\, a_{2,j}^{(3)})] \bmod 7$.

$$a_{i,j}^{(3)}= \left( \begin{array}{cccc} 3 & \cdot & \cdot & \cdot \\ 4 & \cdot & \cdot & \cdot \\ 2 & \cdot & \cdot & \cdot \end{array} \right) $$

The solution $\mathbf{y}_2$ is now in the first column, but not in the correct order, since the pivots were found in the order $\mathbf{c}=(1,3,2)$. Therefore, the solution is reordered to get $\mathbf{y}_2=(3,2,4)^{\mathsf{T}}$.

For the same reason, the sign of the computed determinant must be corrected. An odd number of pair swaps is needed to turn the index vector $\mathbf{c}$ into the ordered sequence $(1,2,3)$. Therefore, $\operatorname{sgn}(\mathbf{c})=-1$ and the determinant’s sign is corrected: $d_2=2\times(-1) \bmod 7=5$.
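The sign of the pivot order can be computed by counting inversions, since the parity of the inversion count equals the parity of the number of pair swaps. This is a generic sketch, not the hardware procedure; `permutation_sign` is a hypothetical name.

```python
def permutation_sign(c):
    """Return +1 for an even permutation and -1 for an odd one."""
    inversions = sum(
        1
        for i in range(len(c))
        for j in range(i + 1, len(c))
        if c[i] > c[j]
    )
    return -1 if inversions % 2 else 1

c = (1, 3, 2)                       # pivot row indices from the example
sign = permutation_sign(c)
d2 = (2 * sign) % 7                 # uncorrected determinant 2, modulus 7
print(sign, d2)
```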

The solution is then multiplied by the determinant to get $\mathbf{z}_2 = \mathbf{y}_2 d_2 \bmod 7=(3,2,4)^{\mathsf{T}}\times 5 \bmod 7=(1,3,6)^{\mathsf{T}}$.

Similarly, the solutions and determinants in the other moduli are computed, giving $\mathbf{z}_1=(0,0,2)^{\mathsf{T}}$, $d_1=4$ and $\mathbf{z}_3=(2,1,7)^{\mathsf{T}}$, $d_3=8$.
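The whole SLC Solving stage for one modulus can be sketched as follows. This is a software illustration of the steps walked through above, assuming prime moduli and assuming that the pivot is the first not-yet-used row with a nonzero leading element; the name `solve_slc` and the 0-based indices are not from the paper.

```python
def solve_slc(Ab, m):
    """Return (z, d) with z = y * d mod m and d = det(A) mod m."""
    n = len(Ab)
    rows = [[x % m for x in row] for row in Ab]
    order = []   # pivot row indices in the order they were chosen
    det = 1
    for _ in range(n):
        p = next((i for i in range(n)
                  if i not in order and rows[i][0] != 0), None)
        if p is None:
            raise ValueError("matrix is singular modulo m")
        order.append(p)
        det = det * rows[p][0] % m
        inv = pow(rows[p][0], -1, m)
        pivot = [x * inv % m for x in rows[p][1:]]
        # Reduce every other row and shift all rows one element to the left.
        rows = [pivot if i == p else
                [(r[j + 1] - r[0] * pivot[j]) % m for j in range(len(pivot))]
                for i, r in enumerate(rows)]
    y = [rows[p][0] for p in order]  # reorder the solution by pivot order
    inversions = sum(order[i] > order[j]
                     for i in range(n) for j in range(i + 1, n))
    if inversions % 2:
        det = -det % m               # odd permutation flips the sign
    z = [yi * det % m for yi in y]
    return z, det

Ab = [[3, -1, 2, 1], [3, 6, -4, -2], [1, -2, 3, 4]]
results = {m: solve_slc(Ab, m) for m in (5, 7, 11)}
print(results)
```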

Next, we will perform the Output conversion, the third stage of Method 1. First, the intermediate results are written in the form $\mathbf{t}_{k}= \left[ \begin{array}{c} \mathbf{z}_{k} \\ d_{k} \end{array} \right]$:

$$[\textbf{t}_{1}, \textbf{t}_{2}, \textbf{t}_{3}]= \left[ \begin{array}{ccc} 0 & 1 & 2 \\ 0 & 3 & 1 \\ 2 & 6 & 7 \\ 4 & 5 & 8 \end{array} \right] $$

The values $\mathbf{t}_1, \mathbf{t}_2, \mathbf{t}_3$ are now converted according to Algorithm 2. First, we will show the conversion of the determinant $d$, whose RNS digits are the last elements of $\mathbf{t}_k$, i.e. $d=\det \mathbf{A}$ has the RNS representation $(d_1,d_2,d_3)=(4,5,8)$ modulo $(5,7,11)$.

During the computation of Algorithm 2, the mixed-radix digits $(4,3,0)$ are computed, shown in boxes in the table above. (In fact, the first digit 4 is just taken from $\mathbf{t}_1$.) The mixed-radix digit weights are $(1, m_1, m_1 m_2)=(1,5,5\times 7)$. The value of the determinant is thus $d=4\times 1+3\times 5+0\times 5\times 7=19$. Because $d<\frac{M}{2}$, i.e. $19<\frac{385}{2}$, it is positive.
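The digits $(4,3,0)$ can be reproduced with the standard Garner-style mixed-radix recurrence. Whether Algorithm 2 evaluates the terms in exactly this order is an assumption; the symbols $v_1, v_2, v_3$ for the digits are introduced here for illustration only. Each inverse is taken modulo the modulus of its own line: $5^{-1} \bmod 7=3$, $5^{-1} \bmod 11=9$ and $7^{-1} \bmod 11=8$.

$$\begin{aligned} v_1 &= d_1 = 4, \\ v_2 &= (d_2 - v_1)\, m_1^{-1} \bmod m_2 = (5-4)\times 3 \bmod 7 = 3, \\ v_3 &= \bigl((d_3 - v_1)\, m_1^{-1} - v_2\bigr)\, m_2^{-1} \bmod m_3 = \bigl((8-4)\times 9 - 3\bigr)\times 8 \bmod 11 = 0. \end{aligned}$$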

A negative number is converted the same way. For example, the first element of $\mathbf{z}$ has the RNS representation $(0,1,2)$.

Its mixed-radix representation is $(0,3,10)$, and the value is $0+3\times 5+10\times 5\times 7=365$, which is more than $\frac{M}{2}$; therefore, it represents a negative number. The correct value is obtained by subtracting $M$, yielding $365-385=-20$.

The same process is applied to all members of $[\mathbf{t}_1,\mathbf{t}_2,\mathbf{t}_3]$, giving $(\mathbf{z},d)^{\mathsf{T}}=(-20,45,62,19)^{\mathsf{T}}$.
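The Output conversion of all four values can be sketched with a generic mixed-radix (Garner-style) conversion followed by the signed interpretation around M/2. This is a stand-in for Algorithm 2, whose exact operation order is not reproduced here; `mixed_radix_digits` and `to_signed` are hypothetical names.

```python
from math import prod

moduli = (5, 7, 11)

def mixed_radix_digits(residues, moduli):
    """Mixed-radix digits of the value represented by the RNS residues."""
    digits = []
    for k, (r, m) in enumerate(zip(residues, moduli)):
        v = r
        for j in range(k):
            # Peel off the digits already known, all arithmetic modulo m.
            v = (v - digits[j]) * pow(moduli[j], -1, m) % m
        digits.append(v)
    return digits

def to_signed(residues, moduli):
    """Integer in the symmetric range around zero represented by the residues."""
    M = prod(moduli)
    value, weight = 0, 1
    for digit, m in zip(mixed_radix_digits(residues, moduli), moduli):
        value += digit * weight
        weight *= m
    return value - M if 2 * value > M else value

# Rows of [t1, t2, t3]: the RNS digits of z1, z2, z3 and d.
t = [(0, 1, 2), (0, 3, 1), (2, 6, 7), (4, 5, 8)]
converted = [to_signed(r, moduli) for r in t]
print(converted)
```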

The final step is to get the solution vector $\mathbf {x}=\frac {\mathbf {z}}{d}=(-\frac {20}{19},\frac {45}{19},\frac {62}{19})^{\mathsf {T}}$. We can now verify that x is indeed the solution of the SLE (A | b):

$$\left( \begin{array}{rrr} 3 & -1 & 2 \\ 3 & 6 & -4 \\ 1 & -2 & 3 \end{array} \right).\left( \begin{array}{r} -\frac{20}{19} \\ \frac{45}{19} \\ \frac{62}{19} \end{array} \right)=\left( \begin{array}{r} 1 \\ -2 \\ 4 \end{array} \right) $$
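The verification can also be carried out in exact rational arithmetic. The sketch below checks the result against the integer system and against the original system with fractional coefficients; the variable names are chosen for this example.

```python
from fractions import Fraction

A_int = [[3, -1, 2], [3, 6, -4], [1, -2, 3]]
b_int = [1, -2, 4]

# Original system before the rows were scaled to integers.
A_orig = [
    [Fraction(3), Fraction(-1), Fraction(2)],
    [Fraction(3, 2), Fraction(3), Fraction(-2)],
    [Fraction(1, 2), Fraction(-1), Fraction(3, 2)],
]
b_orig = [1, -1, 2]

z, d = [-20, 45, 62], 19
x = [Fraction(zi, d) for zi in z]

def mat_vec(A, x):
    """Exact matrix-vector product."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

print(mat_vec(A_int, x) == b_int, mat_vec(A_orig, x) == b_orig)
```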

About this article

Cite this article

Buček, J., Kubalík, P., Lórencz, R. et al. Design of a Residue Number System Based Linear System Solver in Hardware. J Sign Process Syst 87, 343–356 (2017). https://doi.org/10.1007/s11265-016-1146-1

Received: 14 June 2015
Revised: 05 May 2016
Accepted: 20 May 2016
Published: 14 June 2016
Issue Date: June 2017
DOI: https://doi.org/10.1007/s11265-016-1146-1
