The Importance of Run-Time Error Detection

  • Glenn R. Luecke
  • James Coyle
  • James Hoekstra
  • Marina Kraeva
  • Ying Xu
  • Mi-Young Park
  • Elizabeth Kleiman
  • Olga Weiss
  • Andre Wehe
  • Melissa Yahya
Conference paper

Abstract

The ability of system software to detect and issue error messages that help programmers quickly fix serial and parallel run-time errors is an important productivity criterion for developing and maintaining application programs. Over ten thousand run-time error tests and a run-time error detection (RTED) evaluation tool have been developed for the automatic evaluation of run-time error detection capabilities for serial errors and for parallel errors in MPI, OpenMP and UPC programs. Evaluation results, tests and the RTED evaluation tool are freely available at http://rted.public.iastate.edu. Many compilers, tools and run-time systems scored poorly on these tests. The authors make recommendations for providing better RTED in the future.

Keywords

Run-time error detection, Fortran, CH, MPI, OpenMP, UPC

Copyright information

© Springer-Verlag Berlin Heidelberg 2010

Authors and Affiliations

  • Glenn R. Luecke
    • 1
  • James Coyle
  • James Hoekstra
  • Marina Kraeva
  • Ying Xu
  • Mi-Young Park
  • Elizabeth Kleiman
  • Olga Weiss
  • Andre Wehe
  • Melissa Yahya
  1. Iowa State University’s High Performance Computing Group, Iowa State University, Ames, USA
