Using Sequential Debugging Techniques with Massively Parallel Programs

  • Christian Schaubschläger
  • Dieter Kranzlmüller
  • Jens Volkert
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3992)


Debugging is a crucial part of the software development process. Massively parallel programs in particular pose great difficulties for program analysis and debugging, because they are more complex than sequential programs. Several tools are available for debugging and analysing parallel programs, but many of them fail for massively parallel programs with potentially thousands of processes.

In this work we introduce the single process debugging strategy, a scalable debugging strategy for massively parallel programs. The goal of this strategy is to make debugging large-scale programs as simple and straightforward as debugging sequential programs. This is achieved by adapting and combining several techniques that are well known from sequential debugging. In combination, these techniques allow the user to execute and investigate small fractions of a possibly huge parallel program without having to (re-)execute the entire program.
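The abstract does not spell out the mechanism, so the following is only a minimal sketch of one plausible reading: the messages received by a single process are logged during the original run, and that process is later re-executed alone with the logged messages fed back to it. All names here (`worker`, `Recorder`, `replay_single_process`) are hypothetical illustrations and are not taken from the paper or from any tool.

```python
from collections import deque


def worker(rank, recv, send):
    """Toy process body (assumed for illustration): add two received values
    and forward the sum to the next rank."""
    a = recv()
    b = recv()
    total = a + b
    send(rank + 1, total)
    return total


class Recorder:
    """Wraps a process's inbox and logs every message the process receives."""

    def __init__(self, inbox):
        self.inbox = deque(inbox)
        self.log = []

    def recv(self):
        msg = self.inbox.popleft()
        self.log.append(msg)
        return msg


def replay_single_process(rank, log):
    """Re-run one process in isolation, feeding it the recorded messages.
    Outgoing messages are captured instead of being delivered."""
    pending = deque(log)
    sent = []
    result = worker(rank, pending.popleft, lambda dest, msg: sent.append((dest, msg)))
    return result, sent


# Original run of rank 2: its incoming messages are recorded.
recorder = Recorder([3, 4])
outbox = []
original = worker(2, recorder.recv, lambda dest, msg: outbox.append((dest, msg)))

# Later: replay only rank 2, without any of the other processes.
replayed, sent = replay_single_process(2, recorder.log)
```

In this sketch the replayed process sees exactly the inputs it saw in the original run, so it can be stepped through like a sequential program while the remaining processes are never started.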


Parallel Program; Processor Core; Sequential Program; Event Graph; Debug Process



Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Christian Schaubschläger (1)
  • Dieter Kranzlmüller (1)
  • Jens Volkert (1)

  1. GUP, Joh. Kepler University Linz, Linz, Austria, Europe
