A Scalable Process-Management Environment for Parallel Programs
We present a process management system for parallel programs such as those written using MPI. A primary goal of the system, which we call MPD (for multipurpose daemon), is to be scalable. By this we mean that startup of interactive parallel jobs comprising a thousand processes is quick, that signals can be quickly delivered to processes, and that stdin, stdout, and stderr are managed intuitively. Our primary target is parallel machines made up of clusters of SMPs, but the system is also useful in more tightly integrated environments. We describe how MPD enables much faster startup and better runtime management of MPICH jobs. We show how close control of stdio can support the easy implementation of a number of convenient system utilities, even a parallel debugger. MPD is implemented and freely distributed with MPICH.
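To make the stdio handling described above concrete, here is a minimal sketch of one idea behind it: a launcher starts several processes and merges their stdout into one stream, labeling each line with the rank of the process that wrote it. This is not MPD's implementation. It assumes a single launcher running on one machine, and the names `run_labeled` and `demo` are hypothetical, chosen only for this illustration.

```python
import subprocess
import sys
import threading


def run_labeled(commands):
    """Start every command as a child process and collect their stdout lines.

    Each collected line is prefixed with the child's rank, i.e. its index
    in `commands`. Returns (lines, exit_codes). Illustrative only; this is
    not how MPD itself is implemented.
    """
    # Start all children first so they run concurrently.
    procs = [
        subprocess.Popen(cmd, stdout=subprocess.PIPE, text=True)
        for cmd in commands
    ]
    lines = []
    lock = threading.Lock()

    def forward(rank, proc):
        # Read this child's stdout until EOF, tagging each line with its rank.
        for line in proc.stdout:
            with lock:
                lines.append(f"{rank}: {line.rstrip()}")

    threads = [
        threading.Thread(target=forward, args=(rank, proc))
        for rank, proc in enumerate(procs)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

    exit_codes = [proc.wait() for proc in procs]
    return lines, exit_codes


def demo(nprocs=4):
    """Launch `nprocs` trivial Python children and return their labeled output."""
    commands = [
        [sys.executable, "-c", f"print('hello from process {rank}')"]
        for rank in range(nprocs)
    ]
    lines, exit_codes = run_labeled(commands)
    # Arrival order depends on scheduling, so sort for a stable result.
    return sorted(lines), exit_codes


if __name__ == "__main__":
    labeled, codes = demo()
    for entry in labeled:
        print(entry)
```

One reader thread per child keeps the sketch short. A launcher meant for a thousand processes would more plausibly multiplex the pipes in a single event loop and spread the work across machines, which is the scaling concern the abstract raises.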
- Book Title: Recent Advances in Parallel Virtual Machine and Message Passing Interface
- Book Subtitle: 7th European PVM/MPI Users’ Group Meeting, Balatonfüred, Hungary, September 10–13, 2000, Proceedings
- Pages: 168-175
- Series Title: Lecture Notes in Computer Science
- Publisher: Springer Berlin Heidelberg
- Copyright Holder: Springer-Verlag Berlin Heidelberg
- Editor Affiliations: Innovative Computing Lab., University of Tennessee; MTA SZTAKI Computer and Automation Research Institute
- Author Affiliations: University of North Florida, USA; Argonne National Laboratory, USA