Using Sequential Debugging Techniques with Massively Parallel Programs
Debugging is a crucial part of the software development process. Massively parallel programs in particular pose great difficulties for program analysis and debugging because they are more complex than sequential programs. Several tools are available for debugging and analysing parallel programs, but many of them fail for massively parallel programs with potentially thousands of processes.
In this work we introduce the single process debugging strategy, a scalable debugging strategy for massively parallel programs. The goal of this strategy is to make debugging large-scale programs as simple and straightforward as debugging sequential programs. This is achieved by adapting and combining several techniques that are well known from sequential debugging. In combination, these techniques let the user execute and investigate small fractions of a possibly huge parallel program without having to (re-)execute the entire program.
Keywords: Parallel Program, Processor Core, Sequential Program, Event Graph, Debug Process