UPC-CHECK: a scalable tool for detecting run-time errors in Unified Parallel C

  • James Coyle
  • Indranil Roy
  • Marina Kraeva
  • Glenn R. Luecke
Special Issue Paper

Abstract

Unified Parallel C (UPC) is a language for writing parallel programs for distributed-memory parallel computers. UPC-CHECK (http://hpcgroup.public.iastate.edu/UPC-CHECK/) is a scalable tool that automatically detects argument errors in UPC functions and deadlocks in UPC programs at run-time, and issues high-quality error messages to help programmers fix those errors quickly. The run-time complexity of every detection technique used is optimal: O(1), except for deadlocks involving locks, where the optimal complexity is known to be linear in the number of threads. The tool is easy to use: the programmer merely replaces the compiler command with upc-check. The error messages issued by UPC-CHECK were evaluated using the UPC RTED test suite for argument errors in UPC functions and deadlocks. The results show that the error messages UPC-CHECK issues for these tests are excellent.
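
To illustrate why a check for deadlocks involving locks can be bounded by the number of threads, the following minimal Python sketch walks a wait-for chain. It is not the algorithm used by UPC-CHECK; the function name `find_lock_cycle` and the dictionaries `waits_for` and `holder` are hypothetical, and the sketch assumes that each thread waits for at most one lock and that each lock is held by at most one thread.

```python
from typing import Dict, List, Optional


def find_lock_cycle(start: int,
                    waits_for: Dict[int, str],
                    holder: Dict[str, int]) -> Optional[List[int]]:
    """Follow the wait-for chain from `start`.

    waits_for maps a blocked thread to the lock it is waiting for (assumed model).
    holder maps a held lock to the thread that currently owns it (assumed model).
    Returns the threads forming a cycle, or None if the chain can still progress.
    """
    path: List[int] = []            # threads visited, in order
    position: Dict[int, int] = {}   # thread -> index in path
    thread = start
    # Each thread is visited at most once, so the loop runs at most
    # (number of threads) times.
    while thread not in position:
        position[thread] = len(path)
        path.append(thread)
        lock = waits_for.get(thread)
        if lock is None:
            return None             # this thread is not blocked
        owner = holder.get(lock)
        if owner is None:
            return None             # the lock is free, so the waiter can proceed
        thread = owner
    # We returned to a thread already on the path: that suffix is the cycle.
    return path[position[thread]:]


# Thread 0 holds lock "A" and waits for "B"; thread 1 holds "B" and waits for "A".
cycle = find_lock_cycle(0, {0: "B", 1: "A"}, {"A": 0, "B": 1})
print(cycle)
```

Because every step of the walk moves to a thread not seen before, the work per check grows at most linearly with the number of threads, which matches the bound stated in the abstract for this class of deadlocks.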

Keywords

UPC, Run-time error detection, Distributed deadlock detection, Partitioned global address space

Copyright information

© Springer-Verlag 2012

Authors and Affiliations

  • James Coyle (1)
  • Indranil Roy (1)
  • Marina Kraeva (1)
  • Glenn R. Luecke (1)

  1. Iowa State University’s High Performance Computing Group, Iowa State University, Ames, USA
