Abstract
Reliability is a key concern for designers of distributed computing systems. Checkpointing can serve as a basis for designing resilient processes and process-migration schemes, but very few of the systems that implement process checkpointing are heterogeneous. ‘High-level’ process checkpointing schemes capture process state at a higher level of abstraction than low-level schemes do. The resulting state does not depend on low-level or platform-specific structures, and so is meaningful at any site in a heterogeneous distributed computing network. This paper presents a high-level approach to process checkpointing that is transparent to the programmer, operates at a fine level of granularity, and can deal with dynamically allocated memory and multithreaded processes.
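The distinction between high-level and low-level state can be illustrated with a small sketch. This is not the scheme presented in the paper; it is a toy example, assuming a process whose logical state is a few named values, and it uses JSON as a stand-in for a platform-neutral encoding. The names `checkpoint`, `restore`, `run`, and the state fields are all hypothetical.

```python
import json


def checkpoint(state):
    """Encode the logical process state as platform-neutral text.

    Only named, high-level values are saved; no raw memory image,
    register contents, or pointer values are involved.
    """
    return json.dumps(state, sort_keys=True)


def restore(blob):
    """Rebuild the logical state from a checkpoint, on any site."""
    return json.loads(blob)


def run(state, steps):
    """Advance a toy computation (a running sum) by at most `steps` items."""
    while steps > 0 and state["next_index"] < len(state["items"]):
        state["total"] += state["items"][state["next_index"]]
        state["next_index"] += 1
        steps -= 1
    return state


# "Site A": start the computation, do part of the work, then checkpoint.
state = {"items": [3, 1, 4, 1, 5], "next_index": 0, "total": 0}
state = run(state, 2)
blob = checkpoint(state)

# "Site B": restore from the checkpoint and finish the remaining work.
resumed = run(restore(blob), 10)
print(resumed["total"])
```

Because the checkpoint holds only logical values and a logical position in the computation, a site with a different word size or byte order could resume from it. A low-level checkpoint, such as a copy of the process's address space, would not have this property.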
Copyright information
© 1996 Springer Science+Business Media Dordrecht
Cite this chapter
Redhead, T. (1996). A high-level process checkpointing and migration scheme for heterogeneous distributed systems. In: Schill, A., Mittasch, C., Spaniol, O., Popien, C. (eds) Distributed Platforms. IFIP — The International Federation for Information Processing. Springer, Boston, MA. https://doi.org/10.1007/978-0-387-34947-3_21
DOI: https://doi.org/10.1007/978-0-387-34947-3_21
Publisher Name: Springer, Boston, MA
Print ISBN: 978-1-4757-5010-2
Online ISBN: 978-0-387-34947-3
eBook Packages: Springer Book Archive