Migration and rollback transparency for arbitrary distributed applications in workstation clusters

  • Stefan Petri
  • Matthias Bolze
  • Horst Langendörfer
Workshop on Run-Time Systems for Parallel Programming: Matthew Haines, University of Wyoming, USA; Koen Langendoen, Vrije Universiteit, The Netherlands; Greg Benson, University of California at Davis, USA

DOI: 10.1007/3-540-64359-1_686

Part of the Lecture Notes in Computer Science book series (LNCS, volume 1388)
Cite this paper as:
Petri S., Bolze M., Langendörfer H. (1998) Migration and rollback transparency for arbitrary distributed applications in workstation clusters. In: Rolim J. (eds) Parallel and Distributed Processing. IPPS 1998. Lecture Notes in Computer Science, vol 1388. Springer, Berlin, Heidelberg

Abstract

Programmers and users of compute-intensive scientific applications often do not want to (or even cannot) code load balancing and fault tolerance into their programs.

The Beam system [18] uses a global virtual name space to provide migration and rollback transparency in user space for distributed groups of processes on workstations. System calls are interposed and their parameters translated between the virtual and the local name spaces. Unlike other migration mechanisms, Beam does not require the applications to be written for a specific programming model or communication library.
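The translation step can be illustrated with a minimal sketch. It is not Beam's implementation: the `NameSpace` class, the `interposed_kill` wrapper, the starting virtual id of 1000, and the callback standing in for the kernel are all hypothetical names chosen for illustration. The sketch assumes only what the abstract states, namely that the application sees stable virtual identifiers while an interposition layer maps them to whatever real identifiers the current host assigned.

```python
from dataclasses import dataclass, field
from itertools import count


@dataclass
class NameSpace:
    """Two-way map between virtual ids (seen by the application) and
    real ids (assigned by the host). Hypothetical structure, not Beam's."""
    to_real: dict = field(default_factory=dict)
    to_virtual: dict = field(default_factory=dict)
    _next_virtual: count = field(default_factory=lambda: count(1000))

    def bind(self, real_id):
        """Register a real id and return its stable virtual id."""
        if real_id in self.to_virtual:
            return self.to_virtual[real_id]
        virtual_id = next(self._next_virtual)
        self.to_real[virtual_id] = real_id
        self.to_virtual[real_id] = virtual_id
        return virtual_id

    def rebind(self, virtual_id, new_real_id):
        """After a migration or rollback, point an existing virtual id
        at the real id the process received on its new host."""
        old_real = self.to_real.pop(virtual_id)
        del self.to_virtual[old_real]
        self.to_real[virtual_id] = new_real_id
        self.to_virtual[new_real_id] = virtual_id


def interposed_kill(ns, real_kill, virtual_pid, signal):
    """Translate the pid argument on the way in, then forward the call.
    `real_kill` stands in for the real system call (an assumption)."""
    return real_kill(ns.to_real[virtual_pid], signal)
```

In this sketch the application keeps using the same virtual pid before and after `rebind`, so a restart on another workstation stays invisible to it; only the table inside the interposition layer changes.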

In this paper we describe the design and implementation of a separate system call interposition process [3] that accesses the application via the debugging interface. The main advantage of this approach is that it can handle even unmodified (e.g., commercially bought) application programs. We compare measured performance figures with previous similar approaches [15, 20].

Copyright information

© Springer-Verlag 1998

Authors and Affiliations

  • Stefan Petri (1)
  • Matthias Bolze (2)
  • Horst Langendörfer
  1. Institute for Computer Engineering, Medical University at Lübeck, Germany
  2. Institute for Operating Systems and Computer Networks, Technical University Braunschweig, Germany
