A reliable client-server model on top of a micro-kernel
The recently emerged micro-kernel technology is now well recognized as a base mechanism for building distributed systems. This paper addresses the problem of designing a fault-tolerant operating system while keeping the advantages of micro-kernel technology. We introduce a solution based on standard workstations and on the computation of a globally consistent state using dynamic atomic actions. Our solution has two advantages: it does not degrade RPC performance, and it avoids complete duplication of workstations, thus offering a satisfactory performance/cost ratio.