Constraint based system-level diagnosis of multiprocessors

Session 9 System Level Diagnosis
Part of the Lecture Notes in Computer Science book series (LNCS, volume 1150)

Abstract

The paper presents a novel modelling technique for system-level fault diagnosis in massively parallel multiprocessors, based on reformulating the syndrome decoding problem as a constraint satisfaction problem (CSP). The CSP-based approach can handle detailed and inhomogeneous functional fault models at a level of detail similar to that of the Russell-Kime model [18]. Multiple-valued logic is used to describe system components that have multiple fault modes. The granularity of the models can be adjusted to the diagnostic resolution of the target without altering the methodology. Two algorithms for the Parsytec GCel massively parallel system serve as illustrations in the paper. The centralized method uses a detailed system model and provides a fine-grained diagnostic image for off-line evaluation. The distributed method uses a simplified model to make fast decisions for reconfiguration control.
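
As a rough illustration of what treating syndrome decoding as a CSP can look like, the following Python sketch assigns one variable per unit and one constraint per test result. It is not the paper's algorithm: it assumes the simpler two-state test invalidation model of Preparata, Metze and Chien [17] instead of a multiple-valued fault model, and it enumerates assignments by brute force instead of using a constraint network solver. The names `pmc_constraint` and `decode_syndrome`, the four-unit ring, and the syndrome values are all hypothetical.

```python
from itertools import product

GOOD, FAULTY = 0, 1  # two-state domain; a multiple-valued model would add fault modes


def pmc_constraint(tester_state, tested_state, result):
    """Assumed invalidation rule: a good tester reports the true state of the
    tested unit, while a faulty tester may report any result."""
    if tester_state == GOOD:
        return result == tested_state
    return True


def decode_syndrome(units, syndrome, max_faults):
    """Return every fault pattern with at most max_faults faulty units that is
    consistent with the syndrome.

    syndrome maps (tester, tested) -> observed result (0 = pass, 1 = fail).
    """
    solutions = []
    for states in product((GOOD, FAULTY), repeat=len(units)):
        if sum(states) > max_faults:
            continue  # bound on the number of simultaneous faults
        assignment = dict(zip(units, states))
        if all(pmc_constraint(assignment[a], assignment[b], r)
               for (a, b), r in syndrome.items()):
            solutions.append(assignment)
    return solutions


# Hypothetical ring of four units in which each unit tests its successor.
units = ["u0", "u1", "u2", "u3"]
syndrome = {
    ("u0", "u1"): 0,
    ("u1", "u2"): 1,  # u1 reports that u2 failed its test
    ("u2", "u3"): 0,
    ("u3", "u0"): 0,
}
solutions = decode_syndrome(units, syndrome, max_faults=1)
print(solutions)
```

With at most one fault allowed, the only assignment that satisfies all four constraints marks `u2` as faulty. Replacing the two-element domain with a larger set of fault modes and swapping `pmc_constraint` for a different rule would be one way to approximate the inhomogeneous models the abstract mentions, although the paper's own formulation may differ.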

Keywords

Fault Model; Constraint Satisfaction Problem; Fault Injector; Constraint Network; Diagnosis Algorithm
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

References

  1. E. Selényi, “Generalization of System-Level Diagnosis Theory,” D.Sc. Thesis, Budapest, Hungarian Academy of Sciences, 1985.
  2. A. Pataricza, K. Tilly, E. Selényi, M. Dal Cin, “A Constraint Based Approach to System-Level Diagnosis,” Internal report 4/1994, University of Erlangen-Nürnberg, 1994.
  3. A. Petri, “A Constraint Based Algorithm for System Level Diagnosis,” Diploma Thesis, Technical University of Budapest, 1994.
  4. A. Pataricza, K. Tilly, E. Selényi, M. Dal Cin, A. Petri, “Constraint-based System Level Diagnosis of Multiprocessor Architectures,” Proc. of 8th Symp. on Microprocessor and Microcomputer Applications, vol. 1, pp. 75–84, 1994.
  5. P. Urbán, “A Distributed Constraint Based Diagnosis Algorithm for Multiprocessors,” Scientific Conference of the Students, Technical University of Budapest, Faculty of Electrical Engineering and Computer Science, 1995.
  6. J. Altmann, T. Bartha, A. Pataricza, “An Event-Driven Approach to Multiprocessor Diagnosis,” Proc. of 8th Symp. on Microprocessor and Microcomputer Applications, vol. 1, pp. 109–118, 1994.
  7. J. Altmann, T. Bartha, A. Pataricza, “On Integrating Error Detection into a Fault Diagnosis Algorithm For Massively Parallel Computers,” Proc. of IEEE IPDS '95 Symposium, pp. 154–164, 1995.
  8. T. Bartha, “Effective Approximate Fault Diagnosis of System with Inhomogeneous Test Invalidation,” submitted to the Euromicro '96 Conference, 1996.
  9. K. Tilly, “Constraint Based Logic Test Generation,” Ph.D. Thesis, Hungarian Academy of Sciences, 1994.
  10. U. Montanari, “Networks of Constraints: Fundamental Properties and Applications to Picture Processing,” Information Sciences, vol. 7, pp. 95–132, 1974.
  11. R. Mohr, T. C. Henderson, “Arc and Path Consistency Revisited,” Artificial Intelligence, vol. 28, pp. 225–233, 1986.
  12. A. Mackworth, E. C. Freuder, “The Complexity of Some Polynomial Network Consistency Algorithms for Constraint Satisfaction Problems,” Artificial Intelligence, vol. 25, pp. 65–74, 1985.
  13. R. Seidel, “A New Method for Solving Constraint Satisfaction Problems,” IJCAI '81, pp. 338–342, 1981.
  14. P. van Beek, “A Binary CSP Solution Library,” available by FTP from ftp.cs.alberta.ca.
  15. G. Kondrak, “A Theoretical Evaluation of Selected Backtracking Algorithms,” M.Sc. Thesis, University of Alberta, Edmonton, 1994.
  16. M. Barborak, M. Malek, A. Dahbura, “The Consensus Problem in Fault-Tolerant Computing,” ACM Computing Surveys, vol. 25, no. 2, pp. 171–220, June 1993.
  17. F. Preparata, G. Metze, R. Chien, “On the Connection Assignment Problem of Diagnosable Systems,” IEEE Trans. Comput., vol. EC-16, no. 6, pp. 848–854, Dec. 1967.
  18. C. Kime, “System Diagnosis,” in Fault-Tolerant Computing: Theory and Techniques, D. Pradhan ed., Prentice-Hall, New York, pp. 577–623, 1985.

Copyright information

© Springer-Verlag Berlin Heidelberg 1996

Authors and Affiliations

  1. Dept. of Computer Science III, University of Erlangen, Erlangen, Germany
  2. Dept. of Measurement and Instrument Eng., Technical University of Budapest, Budapest, Hungary
