Applicative Architectures for Fault-Tolerant Multiprocessors
This paper proposes functional programming frameworks for the design of highly reliable multiprocessor systems. In contrast to imperative programming environments, a functional environment offers elegant, relatively simple, and efficient solutions to concurrent error detection and recovery problems in multiprocessors. Specific fault tolerance mechanisms for upset exposure, fault containment, secure task assignment, and recovery are developed for a class of applicative multiprocessor architectures. Verification of abstract behavioral characteristics of applicative tasks is used for exposing faults during the execution of tasks. The fault containment mechanism is based on isolation of stack and heap segments of tasks. A protocol for secure task assignment is defined between system components. The architecture permits incremental, distributed, and asynchronous backups of system state. Finally, recovery is accomplished, even in the worst cases, by re-execution of a small number of tasks.
Keywords: Memory Module, Recovery Procedure, Task Descriptor, Applicative Task, Task Token
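The recovery claim in the abstract rests on a property of applicative tasks: they have no side effects, so a task whose execution was lost to a fault can simply be run again from a saved description of its function and arguments. The following is a minimal sketch of that idea only, not the paper's mechanism. The names `TaskDescriptor`, `ProcessorFault`, `run_on_faulty_processor`, and `execute_with_recovery` are hypothetical, and faults are simulated with an explicit schedule rather than detected by hardware or by behavioral verification.

```python
from dataclasses import dataclass
from typing import Any, Callable, Iterable, Tuple


@dataclass(frozen=True)
class TaskDescriptor:
    """Hypothetical record of a pure task: the function and its arguments."""
    fn: Callable[..., Any]
    args: Tuple[Any, ...]


class ProcessorFault(Exception):
    """Raised by the simulated processor when a fault is detected."""


def run_on_faulty_processor(task: TaskDescriptor, fail: bool) -> Any:
    """Simulate one execution attempt; `fail` injects a detected fault."""
    if fail:
        raise ProcessorFault("fault detected during execution")
    return task.fn(*task.args)


def execute_with_recovery(task: TaskDescriptor,
                          fault_schedule: Iterable[bool]) -> Tuple[Any, int]:
    """Re-execute the task from its descriptor until an attempt succeeds.

    Because the task is pure, re-running it cannot corrupt shared state,
    so no rollback is needed. Returns (result, number_of_attempts).
    """
    attempts = 0
    for fail in fault_schedule:
        attempts += 1
        try:
            return run_on_faulty_processor(task, fail), attempts
        except ProcessorFault:
            continue  # assumption: the descriptor itself survived the fault
    raise RuntimeError("all attempts failed")


def square_sum(xs: Tuple[int, ...]) -> int:
    """Example pure task: sum of squares of its input."""
    return sum(x * x for x in xs)


task = TaskDescriptor(fn=square_sum, args=((1, 2, 3),))
# Two simulated faults, then a clean run on the third attempt.
result, attempts = execute_with_recovery(task, [True, True, False])
```

In an imperative setting the same retry loop would also have to undo any partial writes made before the fault, which is the contrast the abstract draws between the two environments.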