RSD — Resource and Service Description
RSD (Resource and Service Description) is a scheme for specifying resources and services in complex heterogeneous computing systems and metacomputing environments. At the system administrator level, RSD is used to specify the available system components, such as the number of nodes, their interconnection topology, CPU speeds, and available software packages. At the user level, a GUI provides a comfortable, high-level interface for specifying system requests. A textual editor can be used for defining repetitive and recursive structures. This gives service providers the flexibility needed for fine-grained specification of system topologies, interconnection networks, and system- and software-dependent properties. All these representations are mapped onto a single, coherent, internal object-oriented resource representation.
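As an illustration of what a single object-oriented resource representation might look like, the following is a minimal sketch in Python. It is not RSD syntax and not the authors' data model: the class names `Node` and `Resource`, their fields, and the `matching` query are all hypothetical, chosen only to mirror the properties named above (nodes, interconnection topology, CPU speeds, software packages).

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One compute node with static properties (field names are hypothetical)."""
    name: str
    cpu_mhz: int
    software: frozenset = frozenset()


@dataclass
class Resource:
    """A system described as nodes plus undirected links between them."""
    nodes: dict = field(default_factory=dict)  # node name -> Node
    links: set = field(default_factory=set)    # each link is frozenset({a, b})

    def add_node(self, node):
        self.nodes[node.name] = node

    def connect(self, a, b):
        # Only allow links between nodes that have already been described.
        if a not in self.nodes or b not in self.nodes:
            raise KeyError("both endpoints must be known nodes")
        self.links.add(frozenset((a, b)))

    def neighbors(self, name):
        # All nodes that share a link with `name`, in sorted order.
        return sorted(n for link in self.links if name in link
                      for n in link if n != name)

    def matching(self, min_cpu_mhz=0, needs=()):
        # Names of nodes that satisfy a simple user request.
        required = set(needs)
        return sorted(n.name for n in self.nodes.values()
                      if n.cpu_mhz >= min_cpu_mhz and required <= n.software)


# Administrator side: describe a small three-node chain (made-up values).
cluster = Resource()
cluster.add_node(Node("n0", 300, frozenset({"MPI"})))
cluster.add_node(Node("n1", 200, frozenset({"MPI", "PVM"})))
cluster.add_node(Node("n2", 400, frozenset({"PVM"})))
cluster.connect("n0", "n1")
cluster.connect("n1", "n2")

# User side: request nodes that provide MPI.
print(cluster.matching(needs=["MPI"]))
```

In this sketch a GUI or a textual description would both be front ends that populate the same `Resource` object, which is the point of having one internal representation.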
Dynamic aspects (such as network performance, availability of compute nodes, and compute node loads) are traced at runtime and included in the resource description to allow for optimal process mapping and dynamic task load balancing at the metacomputer level. This is done in a self-organizing way, with human system operators becoming involved only when new hardware or software components are installed.
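The role of runtime information can be sketched as follows. This is an assumption-laden toy, not the mechanism described in the paper: `NodeState`, `update`, and `map_processes` are hypothetical names, the load values are invented, and the greedy "least-loaded first" rule is a simple stand-in for whatever mapping strategy the real system uses.

```python
from dataclasses import dataclass


@dataclass
class NodeState:
    """Dynamic attributes of one node (hypothetical fields)."""
    name: str
    available: bool
    load: float  # e.g. a sampled load figure; lower is better


def update(states, name, available, load):
    """Record a fresh runtime measurement for one node."""
    states[name] = NodeState(name, available, load)


def map_processes(states, count):
    """Greedy mapping: pick the `count` least-loaded available nodes."""
    candidates = sorted((s for s in states.values() if s.available),
                        key=lambda s: (s.load, s.name))
    if len(candidates) < count:
        raise ValueError("not enough available nodes")
    return [s.name for s in candidates[:count]]


# Measurements arriving at runtime (made-up values).
states = {}
update(states, "n0", True, 0.9)
update(states, "n1", True, 0.1)
update(states, "n2", False, 0.0)  # idle but currently unavailable
update(states, "n3", True, 0.4)

print(map_processes(states, 2))
```

Because the mapping is recomputed from the latest measurements, a later call to `update` changes the outcome of the next `map_processes` call without any operator intervention.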
Keywords: distributed computing, metacomputing, resource management, specification language, multi-site applications