ScELA: Scalable and Extensible Launching Architecture for Clusters

  • Jaidev K. Sridhar
  • Matthew J. Koop
  • Jonathan L. Perkins
  • Dhabaleswar K. Panda
Conference paper

DOI: 10.1007/978-3-540-89894-8_30

Part of the Lecture Notes in Computer Science book series (LNCS, volume 5374)
Cite this paper as:
Sridhar J.K., Koop M.J., Perkins J.L., Panda D.K. (2008) ScELA: Scalable and Extensible Launching Architecture for Clusters. In: Sadayappan P., Parashar M., Badrinath R., Prasanna V.K. (eds) High Performance Computing - HiPC 2008. Lecture Notes in Computer Science, vol 5374. Springer, Berlin, Heidelberg

Abstract

As cluster sizes head into tens of thousands, current job launch mechanisms do not scale as they are limited by resource constraints as well as performance bottlenecks. The job launch process includes two phases – spawning of processes on processors and information exchange between processes for job initialization. Implementations of various programming models follow distinct protocols for the information exchange phase. We present the design of a scalable, extensible and high-performance job launch architecture for very large scale parallel computing. We present implementations of this architecture which achieve a speedup of more than 700% in launching a simple Hello World MPI application on 10,240 processor cores and also scale to more than 3 times the number of processor cores compared to prior solutions.

Copyright information

© Springer-Verlag Berlin Heidelberg 2008

Authors and Affiliations

  • Jaidev K. Sridhar (1)
  • Matthew J. Koop (1)
  • Jonathan L. Perkins (1)
  • Dhabaleswar K. Panda (1)

  1. Network-Based Computing Laboratory, The Ohio State University, Columbus, USA
