On fault tolerant matrix decomposition
We present a fault tolerant algorithm for matrix factorization in the presence of multiple hardware faults. It can be used to solve the linear system Ax = b without determining the correct ZU decomposition of A. Here Z is either L for ordinary Gaussian decomposition with partial pivoting, X for pairwise or neighbor pivoting (motivated by the Gentleman-Kung systolic array structure), or Q for the usual QR decomposition. Our algorithm generalizes that of Luk and Park, whose method allows for the correction of a single error in a single iterate of the matrix U. Using ideas from the theory of error correcting codes, we prove that the algorithm of Luk and Park can in fact tolerate multiple errors in multiple iterates of U, provided these are all confined to a single column. We then generalize the algorithm to one that tolerates multiple errors in multiple iterates of U, provided they are confined to two columns. Our procedure for identifying the erroneous columns is based on the extended Euclidean algorithm and is analogous to the decoding algorithms for BCH codes. We indicate how our methods may be adapted to apply to any number of columns, and finally we show how to compute a correct factorization of A.
Keywords: Parity Check, Systolic Array, Parity Check Matrix, Single Error, Multiple Error
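The single-column case described in the abstract can be illustrated with a generic weighted-checksum scheme. The sketch below is not the authors' algorithm. It assumes two checksum columns (a plain row sum and a row sum weighted by 1..n) appended to A, exact rational arithmetic, a nonsingular A, and errors that strike only the data columns of the final U. The function names `augment`, `eliminate`, and `locate_and_correct` are hypothetical.

```python
from fractions import Fraction


def augment(A):
    """Append two checksum columns: a plain row sum and a row sum weighted by 1..n."""
    n = len(A)
    out = []
    for row in A:
        r = [Fraction(x) for x in row]
        r.append(sum(r[:n]))
        r.append(sum((j + 1) * r[j] for j in range(n)))
        out.append(r)
    return out


def eliminate(M, n):
    """Gaussian elimination with partial pivoting, applied to whole augmented rows.

    Row swaps and row subtractions are linear, so both checksum relations
    continue to hold for every row of the result. Assumes A is nonsingular.
    """
    for k in range(n):
        p = max(range(k, n), key=lambda i: abs(M[i][k]))
        M[k], M[p] = M[p], M[k]
        for i in range(k + 1, n):
            m = M[i][k] / M[k][k]
            M[i] = [a - m * b for a, b in zip(M[i], M[k])]
    return M


def locate_and_correct(M, n):
    """Find the faulty column from the syndrome ratio and repair rows in place.

    For an error delta in column c of a row, the plain syndrome is -delta and
    the weighted syndrome is -(c + 1) * delta, so their ratio reveals c + 1.
    Returns the faulty column index, or None if every row is consistent.
    """
    faulty = None
    for row in M:
        s1 = row[n] - sum(row[:n])
        s2 = row[n + 1] - sum((j + 1) * row[j] for j in range(n))
        if s1 != 0:
            col = int(s2 / s1) - 1
            faulty = col
            row[col] += s1  # s1 equals minus the injected error
    return faulty


A = [[2, 1, 1], [4, 3, 3], [8, 7, 9]]
U = eliminate(augment(A), 3)
reference = [row[:] for row in U]

# Simulate two faults confined to the last data column (index 2).
U[0][2] += 5
U[1][2] -= 3
faulty_column = locate_and_correct(U, 3)
```

With errors in several rows but a single column, every affected row yields the same syndrome ratio, which is why one extra weighted checksum suffices in this simplified setting. Errors spread over two columns would need additional checksums and a decoding step of the kind the abstract attributes to the extended Euclidean algorithm.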
- 1. J.M. Speiser and H.J. Whitehouse, Signal processing computational needs: An update, in Mathematics in Signal Processing (J.G. McWhirter, ed.), OUP, 1990, pp. 633–664.
- 3. W.M. Gentleman and H.T. Kung, Matrix triangularization by systolic arrays, in Real Time Signal Processing IV, Proc. SPIE (T.F. Tao, ed.), Vol. 298 (1981) pp. 19–26.
- 8. R.P. Brent, F.T. Luk, and C.J. Anfinson, Checksum schemes for fault tolerant systolic computing, in Mathematics in Signal Processing (J.G. McWhirter, ed.), OUP, 1990, pp. 791–804.
- 9. D.L. Boley, R.P. Brent, G.H. Golub, and F.T. Luk, Error correction via the Lanczos process, Tech. Rep. EE-CEG-91-1, School of Elect. Eng., Cornell Univ. (Jan. 1991).
- 11. B.W. Johnson, Design and Analysis of Fault Tolerant Digital Systems, Addison-Wesley, 1989.
- 12. T.R.N. Rao and E. Fujiwara, Error-control coding for Computer Systems, Prentice-Hall, 1989.