Fault-tolerant hardware configuration management on the multiprocessor system DIRMU 25

  • E. Maehle
  • K. Moritzen
  • K. Wirl
Architectural Aspects (Session 3.1)
Part of the Lecture Notes in Computer Science book series (LNCS, volume 237)


This paper describes fault tolerance techniques that have been developed and implemented for DIRMU 25, a 25-processor multiprocessor system in operation at the University of Erlangen-Nuremberg. It first gives a short overview of the DIRMU hardware architecture, programming environment, and parallel application programs. Fault diagnosis and reconfiguration are implemented in one layer of the DIRMOS operating system, the hardware configuration management. The paper describes the concept of this configuration management in general terms, based on a graph model, and discusses its application to the fault-tolerant execution of parallel programs.
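The graph-model idea in the abstract can be illustrated with a small sketch. It is not the DIRMOS algorithm, which this page does not describe. It assumes processors are graph nodes, interprocessor connections are undirected edges, and reconfiguration means finding a chain of fault-free processors for a parallel program. The function names `fault_free_subgraph` and `find_chain` and the six-node ring are hypothetical, chosen only for illustration.

```python
def fault_free_subgraph(nodes, links, faulty):
    """Return adjacency sets of the hardware graph with faulty nodes removed.

    Assumption: links are undirected pairs of node identifiers.
    """
    adj = {n: set() for n in nodes if n not in faulty}
    for a, b in links:
        if a in adj and b in adj:
            adj[a].add(b)
            adj[b].add(a)
    return adj


def find_chain(adj, length):
    """Search for a simple path of `length` nodes by backtracking.

    Returns the path as a list of nodes, or None if no such path exists.
    Exhaustive search is fine for a sketch but scales poorly.
    """
    def extend(path, seen):
        if len(path) == length:
            return path
        for nxt in sorted(adj[path[-1]]):
            if nxt not in seen:
                result = extend(path + [nxt], seen | {nxt})
                if result is not None:
                    return result
        return None

    for start in sorted(adj):
        result = extend([start], {start})
        if result is not None:
            return result
    return None


# Hypothetical example: a ring of six processors in which node 2 has failed.
nodes = range(6)
links = [(i, (i + 1) % 6) for i in range(6)]
healthy = fault_free_subgraph(nodes, links, faulty={2})
chain = find_chain(healthy, 4)  # four fault-free processors in a row
```

In this sketch a diagnosed fault only removes a node from the graph, and the application is then re-embedded in what remains. A request for six processors returns `None` here, because only five fault-free nodes are left.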


Keywords: Fault Tolerance, Message Passing, Multiprocessor System, Configuration Management, Interprocessor Communication




Copyright information

© Springer-Verlag Berlin Heidelberg 1986

Authors and Affiliations

  • E. Maehle (1)
  • K. Moritzen (1)
  • K. Wirl (1)

  1. Department of Computer Science (IMMD), University of Erlangen-Nuremberg, Erlangen, Federal Republic of Germany
