Interoperable Run-Time Tools for Distributed Systems—A Case Study

The Journal of Supercomputing

Abstract

Tools that observe and manipulate the run-time behavior of parallel and distributed systems are essential for developing and maintaining these systems. Users sometimes need to run several such tools at the same time to get the functionality they require. Today, however, tools developed independently by different vendors cannot interoperate. Interoperability not only allows tools to be used concurrently, it can also provide added value for the user. For example, a debugger that interoperates with a checkpointing system can provide a debugging environment in which the debugged program can be reset to any previous state, which speeds up cyclic debugging of long-running programs.

Using this example scenario, we derive requirements that the tools' software infrastructure should meet in order to enable interoperability. A review of existing infrastructures shows that these requirements are only partially met today. In an ongoing research effort, support for all of the requirements is being built into OCM, the OMIS-compliant on-line monitoring system.
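
The debugger and checkpointing scenario from the abstract can be illustrated with a small, purely hypothetical sketch. It is not the OMIS interface or the OCM implementation: the class names `Checkpointer` and `Debugger`, the checkpoint interval, and the toy `accumulate` program are all assumptions made for illustration, and the program is assumed to be deterministic. The sketch shows only the added value of the combination: to return to an earlier state, the debugger restores the nearest checkpoint and replays a few steps instead of restarting the program from the beginning.

```python
import copy


class Checkpointer:
    """Hypothetical checkpointing tool: keeps snapshots of program state."""

    def __init__(self):
        self._snapshots = {}

    def save(self, step, state):
        # Deep copy so later execution cannot alter the stored snapshot.
        self._snapshots[step] = copy.deepcopy(state)

    def latest_at_or_before(self, step):
        candidates = [s for s in self._snapshots if s <= step]
        if not candidates:
            raise KeyError(f"no checkpoint at or before step {step}")
        return max(candidates)

    def restore(self, step):
        return copy.deepcopy(self._snapshots[step])


class Debugger:
    """Hypothetical debugger that cooperates with a Checkpointer."""

    def __init__(self, program_step, initial_state, checkpointer, interval=10):
        self.program_step = program_step  # function: state -> next state
        self.state = initial_state
        self.step = 0
        self.steps_executed = 0           # total work done, including replays
        self.checkpointer = checkpointer
        self.interval = interval

    def run_to(self, target):
        while self.step < target:
            if self.step % self.interval == 0:
                self.checkpointer.save(self.step, self.state)
            self.state = self.program_step(self.state)
            self.step += 1
            self.steps_executed += 1

    def reset_to(self, target):
        # Restore the nearest earlier checkpoint, then replay the remainder.
        base = self.checkpointer.latest_at_or_before(target)
        self.state = self.checkpointer.restore(base)
        self.step = base
        self.run_to(target)


def accumulate(state):
    """Toy deterministic program standing in for a long-running application."""
    return {"i": state["i"] + 1, "total": state["total"] + state["i"]}


dbg = Debugger(accumulate, {"i": 0, "total": 0}, Checkpointer(), interval=10)
dbg.run_to(95)    # run forward; checkpoints are taken every 10 steps
dbg.reset_to(42)  # restore the checkpoint at step 40, replay 2 steps
print(dbg.step, dbg.state, dbg.steps_executed)
```

In this sketch, going back to step 42 costs two replayed steps rather than 42, which is the kind of saving that makes cyclic debugging of long-running programs faster. A real pair of tools would additionally have to coordinate their access to the monitored processes, which is the kind of requirement placed on the shared infrastructure.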




Cite this article

Wismüller, R., Ludwig, T. Interoperable Run-Time Tools for Distributed Systems—A Case Study. The Journal of Supercomputing 17, 277–289 (2000). https://doi.org/10.1023/A:1026515407398


  • DOI: https://doi.org/10.1023/A:1026515407398
