Using control and data flow analysis for race evaluation
The need for larger problem sizes and more accurate results pushes users in scientific computing towards parallel machines. Besides the problems of initial program development, debugging parallel programs is a further hard task, with severe difficulties caused by nondeterminism and race conditions.
This paper describes the tools ATEMPT and CDFA, two modules of the MAD environment that support the detection of simple errors in the communication structure and of race conditions in parallel programs. While ATEMPT generates an event graph and visualizes race condition candidates of an actual execution, CDFA analyzes the source code and produces data structures for investigating control and data flow graphs. The combination of both tools gives further insight into a program and makes the evaluation of race conditions more efficient.
Keywords: Debugging, program analysis, race conditions, event manipulation
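To make the notion of a race condition candidate in an event trace more concrete, the following is a minimal sketch and not ATEMPT's actual algorithm. It assumes a message-passing trace in which some receives accept messages from any source, and it flags such a receive as a candidate when more than one process sent a message to the receiving process. The `Event` type, its fields, and the `race_candidates` function are hypothetical names introduced only for illustration. The check is a coarse over-approximation because it ignores the causal ordering between events.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Event:
    """One traced communication event (hypothetical structure)."""
    process: int            # process on which the event occurred
    kind: str               # "send" or "recv"
    peer: Optional[int]     # send: destination; recv: source actually matched
    wildcard: bool = False  # recv only: True if any source was accepted


def race_candidates(trace: List[Event]) -> List[Event]:
    """Return wildcard receives that more than one sender could have matched.

    Assumption: causal ordering is ignored, so this may report receives
    that cannot actually race in any execution.
    """
    # Collect, per destination process, the set of processes that sent to it.
    senders_to = {}
    for ev in trace:
        if ev.kind == "send":
            senders_to.setdefault(ev.peer, set()).add(ev.process)

    # A wildcard receive is a candidate if at least two distinct senders
    # targeted the receiving process somewhere in the trace.
    return [
        ev for ev in trace
        if ev.kind == "recv"
        and ev.wildcard
        and len(senders_to.get(ev.process, set())) > 1
    ]


# Example trace: processes 1 and 2 both send to process 0, which uses
# wildcard receives; process 3 uses a receive with a fixed source.
trace = [
    Event(process=1, kind="send", peer=0),
    Event(process=2, kind="send", peer=0),
    Event(process=0, kind="recv", peer=1, wildcard=True),
    Event(process=0, kind="recv", peer=2, wildcard=True),
    Event(process=0, kind="send", peer=3),
    Event(process=3, kind="recv", peer=0),
]
candidates = race_candidates(trace)
```

In this example only the two wildcard receives on process 0 are reported. A source-level analysis such as the one the abstract attributes to CDFA could then be used to judge which reported candidates actually matter; how the two tools exchange this information is not specified here.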