
Enhanced Memory debugging of MPI-parallel Applications in Open MPI

  • Shiqing Fan
  • Rainer Keller
  • Michael Resch
Conference paper

Abstract

In this paper, we describe the implementation of memory-checking functionality based on instrumentation with the Valgrind Memcheck tool. Combining Valgrind-based checking functions with the MPI implementation offers superior debugging functionality for errors that cannot otherwise be detected with comparable MPI debugging tools. The functionality is integrated into Open MPI as the so-called memchecker framework, which allows other memory debuggers that offer a similar API to be integrated. The tight control of the user’s memory passed to Open MPI not only makes it possible to find application errors but also helps to track bugs within Open MPI itself. We describe the actual checks, the classes of errors found, and how memory buffers are handled internally, and we show errors actually found in user code and the performance implications of this instrumentation.
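The kind of instrumentation the abstract refers to can be illustrated with Memcheck's client-request macros. The following is only a minimal sketch, not Open MPI's memchecker code: the type `sketch_request` and the functions `sketch_post_recv`, `sketch_complete_recv`, and `sketch_check_send` are hypothetical names introduced here, and the policy shown (a receive buffer is made inaccessible while a non-blocking receive is pending, a send buffer must be fully defined) is an assumption about what such checks typically look like. The three `VALGRIND_*` macros are real Memcheck client requests; when the header `valgrind/memcheck.h` is not available, the sketch falls back to no-op stand-ins so that it still compiles.

```c
#include <assert.h>
#include <stddef.h>

/* Use the real Memcheck client requests when the header is available. */
#if defined(__has_include)
#  if __has_include(<valgrind/memcheck.h>)
#    include <valgrind/memcheck.h>
#    define SKETCH_HAVE_MEMCHECK 1
#  endif
#endif

#ifndef SKETCH_HAVE_MEMCHECK
/* Fallback stand-ins (assumption: no Valgrind headers installed).
 * Like the real macros outside Valgrind, they evaluate to 0. */
#  define VALGRIND_MAKE_MEM_NOACCESS(addr, len)    ((void)(addr), (void)(len), 0)
#  define VALGRIND_MAKE_MEM_DEFINED(addr, len)     ((void)(addr), (void)(len), 0)
#  define VALGRIND_CHECK_MEM_IS_DEFINED(addr, len) ((void)(addr), (void)(len), 0)
#endif

/* Hypothetical bookkeeping for one pending non-blocking receive. */
typedef struct {
    void  *buf;
    size_t len;
    int    pending;   /* 1 while the receive is in flight */
} sketch_request;

/* Called where a non-blocking receive would be posted: the user must not
 * touch the buffer until completion, so mark it inaccessible. */
static int sketch_post_recv(sketch_request *req, void *buf, size_t len)
{
    assert(req != NULL);
    if (buf == NULL && len > 0) {
        return -1;
    }
    req->buf = buf;
    req->len = len;
    req->pending = 1;
    (void)VALGRIND_MAKE_MEM_NOACCESS(buf, len);
    return 0;
}

/* Called where the receive would complete: the buffer now holds received
 * data, so mark it accessible and defined again. */
static int sketch_complete_recv(sketch_request *req)
{
    assert(req != NULL);
    if (!req->pending) {
        return -1;
    }
    (void)VALGRIND_MAKE_MEM_DEFINED(req->buf, req->len);
    req->pending = 0;
    return 0;
}

/* Called before a send: under Valgrind, Memcheck reports an error if any
 * byte of the buffer is undefined. Returns 0 if the check passes or if
 * the program is not running under Valgrind. */
static int sketch_check_send(const void *buf, size_t len)
{
    return VALGRIND_CHECK_MEM_IS_DEFINED(buf, len) == 0 ? 0 : -1;
}
```

Run natively, these calls do nothing beyond the bookkeeping. Run under `valgrind --tool=memcheck`, a read or write of the receive buffer between the two receive calls, or a send of uninitialized data, would be reported by Memcheck at the point of the offending access or check.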



Copyright information

© Springer-Verlag Berlin Heidelberg 2008

Authors and Affiliations

  1. Höchstleistungsrechenzentrum Stuttgart (HLRS), Stuttgart, Germany
