RSD — Resource and Service Description

  • Matthias Brune
  • Jörn Gehring
  • Axel Keller
  • Alexander Reinefeld
Part of The Springer International Series in Engineering and Computer Science book series (SECS, volume 478)

Abstract

RSD (Resource and Service Description) is a scheme for specifying resources and services in complex heterogeneous computing systems and metacomputing environments. At the system administrator level, RSD is used to specify the available system components, such as the number of nodes, their interconnection topology, CPU speeds, and available software packages. At the user level, a GUI provides a comfortable, high-level interface for specifying system requests. A textual editor can be used for defining repetitive and recursive structures. This gives service providers the necessary flexibility for fine-grained specification of system topologies, interconnection networks, and system- and software-dependent properties. All these representations are mapped onto a single, coherent internal object-oriented resource representation.

Dynamic aspects (such as network performance, availability of compute nodes, and compute node loads) are traced at runtime and included in the resource description to allow for optimal process mapping and dynamic task load balancing at the metacomputer level. This is done in a self-organizing way, with human system operators becoming involved only when new hardware or software components are installed.
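
To make the idea of a single object-oriented resource representation more concrete, the following is a minimal Python sketch. It is not RSD syntax and not the authors' implementation: the names `Node`, `System`, `select`, and `build_ring`, and all attribute names, are hypothetical and chosen only for illustration. It shows static properties (node count, topology, CPU speed, software packages) stored next to dynamic ones (load, availability), a request matched against both, and a repetitive structure defined programmatically.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One compute node: static properties plus values updated at runtime."""
    name: str
    cpu_mhz: int
    software: frozenset = frozenset()
    load: float = 0.0        # dynamic attribute (assumed to be refreshed by a monitor)
    available: bool = True   # dynamic attribute


@dataclass
class System:
    """A set of nodes and the undirected links forming their topology."""
    nodes: dict = field(default_factory=dict)
    links: set = field(default_factory=set)

    def add_node(self, node):
        self.nodes[node.name] = node

    def connect(self, a, b):
        # Each link is an unordered pair of node names.
        self.links.add(frozenset((a, b)))

    def select(self, count, min_cpu_mhz=0, software=()):
        """Return the `count` least-loaded available nodes meeting the request, or None."""
        needed = set(software)
        candidates = [
            n for n in self.nodes.values()
            if n.available and n.cpu_mhz >= min_cpu_mhz and needed <= n.software
        ]
        if len(candidates) < count:
            return None
        return sorted(candidates, key=lambda n: (n.load, n.name))[:count]


def build_ring(size, cpu_mhz, software):
    """Define a repetitive structure (a ring of identical nodes) programmatically."""
    system = System()
    for i in range(size):
        system.add_node(Node(f"n{i}", cpu_mhz, frozenset(software)))
    for i in range(size):
        system.connect(f"n{i}", f"n{(i + 1) % size}")
    return system
```

In this sketch, a request such as `build_ring(4, 300, ["MPI"]).select(2, min_cpu_mhz=200, software=["MPI"])` filters on the static properties first and then prefers nodes with the lowest current load, which is one simple way dynamic information could inform process mapping.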

Keywords

distributed computing, metacomputing, resource management, specification language, multi-site applications

Copyright information

© Springer Science+Business Media New York 1998

Authors and Affiliations

  • Matthias Brune (1)
  • Jörn Gehring (1)
  • Axel Keller (1)
  • Alexander Reinefeld (1)
  1. Paderborn Center for Parallel Computing, Universität Paderborn, Germany
