The Importance of Run-Time Error Detection

  • Conference paper

Abstract

The ability of system software to detect run-time errors, both serial and parallel, and to issue error messages that help programmers fix them quickly is an important productivity criterion for developing and maintaining application programs. Over ten thousand run-time error tests and a run-time error detection (RTED) evaluation tool have been developed to automatically evaluate run-time error detection capabilities for serial errors and for parallel errors in MPI, OpenMP and UPC programs. Evaluation results, tests and the RTED evaluation tool are freely available at http://rted.public.iastate.edu. Many compilers, tools and run-time systems scored poorly on these tests. The authors make recommendations for providing better RTED in the future.

Author information

Corresponding author

Correspondence to Glenn R. Luecke.

Copyright information

© 2010 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Luecke, G.R. et al. (2010). The Importance of Run-Time Error Detection. In: Müller, M., Resch, M., Schulz, A., Nagel, W. (eds) Tools for High Performance Computing 2009. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-11261-4_10

  • DOI: https://doi.org/10.1007/978-3-642-11261-4_10

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-11260-7

  • Online ISBN: 978-3-642-11261-4

  • eBook Packages: Computer Science, Computer Science (R0)
