Cellular diagnostic in parallel systems

  • Roman Trobec
Submitted Papers
Part of the Lecture Notes in Computer Science book series (LNCS, volume 342)

Abstract

This work reports a new cellular, local diagnostic procedure for a class of massively parallel systems with a regular topology. The fault model is chosen to suit a realistic system, so both production and run-time failures are assumed. Faults may occur in clusters or at random, and they may be permanent and/or intermittent. The assumed system architecture is a regular network with low connectivity, a large number of intelligent nodes, and no passive hardware redundancy. The diagnostic procedure is organized in parallel communication rounds and is the same for all system units.
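
The abstract gives no algorithmic detail, so the following is only an illustrative sketch of what a round-based local diagnosis can look like, not the procedure of the paper. It assumes a two-dimensional mesh without wraparound, a test that always detects an active fault, and intermittent faults described by the set of rounds in which they are active. All names (`neighbours`, `run_rounds`, `permanent`, `intermittent`) are hypothetical.

```python
from typing import Dict, List, Set, Tuple

Node = Tuple[int, int]


def neighbours(node: Node, rows: int, cols: int) -> List[Node]:
    """Return the mesh neighbours of a node (up to four, no wraparound)."""
    r, c = node
    candidates = [(r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)]
    return [(i, j) for i, j in candidates if 0 <= i < rows and 0 <= j < cols]


def run_rounds(
    rows: int,
    cols: int,
    permanent: Set[Node],
    intermittent: Dict[Node, Set[int]],
    rounds: int,
) -> Dict[Node, Set[Node]]:
    """Simulate `rounds` test rounds and return each node's local suspect set.

    Assumption: a test detects a fault whenever the fault is active in that
    round. Views held by faulty nodes are computed too, but a real system
    could not trust them.
    """
    suspects: Dict[Node, Set[Node]] = {
        (r, c): set() for r in range(rows) for c in range(cols)
    }
    for rnd in range(rounds):
        # Every node runs the same step. The steps are independent within a
        # round, so they could execute in parallel on real hardware.
        for node in suspects:
            for nb in neighbours(node, rows, cols):
                active = nb in permanent or rnd in intermittent.get(nb, set())
                if active:
                    suspects[node].add(nb)
    return suspects


if __name__ == "__main__":
    views = run_rounds(
        rows=3,
        cols=3,
        permanent={(1, 1)},
        intermittent={(0, 0): {2}},  # fault active only in round 2
        rounds=3,
    )
    print(sorted(views[(0, 1)]))
```

Under these assumptions each node learns only about its direct neighbours, which is the sense in which the diagnosis is local, and repeating the rounds is what gives an intermittent fault a chance to be observed.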

Copyright information

© Springer-Verlag Berlin Heidelberg 1989

Authors and Affiliations

  • Roman Trobec (1)
  1. Institute Jozef Stefan, University of Ljubljana, Ljubljana, Yugoslavia