On Fault Tolerance of Two-Dimensional Mesh Networks
A catastrophic fault pattern is a set of faults occurring at strategic locations that may render a system unusable regardless of its component redundancy and reconfiguration capabilities. In this paper, we characterize catastrophic fault patterns in mesh networks with either bidirectional or unidirectional links. We determine the minimum number of faults required for a fault pattern to be catastrophic, and we consider the problem of testing whether a set of faulty processors is catastrophic. In addition, when a fault pattern is not catastrophic, we consider the problem of finding optimal reconfiguration strategies, where optimality is with respect to either the number of processing elements in the reconfigured network (to be maximized) or the number of bypass links that must be activated to reconfigure the array (to be minimized). Finding a reconfiguration strategy that is optimal with respect to the size of the reconfigured network is NP-complete when the links are bidirectional, whereas it can be solved in polynomial time when the links are unidirectional. For optimality with respect to the number of bypass links to activate, we provide algorithms that efficiently find an optimal reconfiguration.
Keywords: Processing Element, Fault Tolerance, Mesh Network, Hamiltonian Path, Systolic Array