Formal and experimental validation of a low overhead execution replay mechanism

  • Alain Fagot
  • Jacques Chassin de Kergommeaux
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 966)


This paper presents a record-replay mechanism for parallel programs written in a parallel programming model based on remote procedure calls (RPC). The mechanism, which will serve as the basis for a user-level debugger, exploits properties of the programming model to drastically reduce the number of records that have to be made. A formal proof of the equivalence between recorded and replayed executions is given. Systematic measurements of the time overhead of recording indicate that it is low enough for recording mode to be used as the normal execution mode. Similar techniques can be applied to other programming models.


Instant Replay, parallel debugging, deterministic reexecutions, Remote Procedure Call





Copyright information

© Springer-Verlag Berlin Heidelberg 1995

Authors and Affiliations

  • Alain Fagot (1)
  • Jacques Chassin de Kergommeaux (1)
  1. APACHE project, IMAG, Grenoble Cedex 1, France
