Cluster Computing, Volume 2, Issue 4, pp 271–279

Implementing noncollective parallel I/O in cluster environments using Active Message communication

  • Jarek Nieplocha
  • Holger Dachsel
  • Ian Foster


A cost-effective secondary storage architecture for parallel computers is to distribute storage across all processors, which then engage in either computation or I/O, depending on the demands of the moment. A difficulty with this architecture is that access to storage on another processor typically requires the cooperation of that processor, which can be hard to arrange if the processor is engaged in other computation. One partial solution to this problem is to require that remote I/O operations occur only via collective calls. In this paper, we describe an alternative approach based on the use of single-sided communication operations such as Active Messages. We present an implementation of this approach, called Distant I/O (DIO), together with experimental results that quantify the low-level performance of the DIO mechanisms. The technique is used to support a noncollective parallel shared-file model for a large out-of-core scientific application with very high I/O bandwidth requirements. The achieved performance exceeds by a wide margin that of a well-equipped PIOFS parallel filesystem on the IBM SP.
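The idea of noncollective remote I/O through single-sided requests can be illustrated with a minimal sketch. This is not the Distant I/O implementation: threads and in-memory queues stand in for processors and the network, and every name here (`Node`, `remote_write`, `remote_read`, `demo`) is hypothetical. The point it shows is that a request handler on the node that owns the storage services reads and writes on message arrival, so the owner's compute code never has to make a matching call.

```python
import os
import queue
import tempfile
import threading


class Node:
    """Hypothetical node that owns a local file and serves remote requests."""

    def __init__(self, rank, directory):
        self.rank = rank
        self.path = os.path.join(directory, f"node{rank}.dat")
        open(self.path, "wb").close()   # create an empty local file
        self.inbox = queue.Queue()      # stands in for the network (assumption)
        self.server = threading.Thread(target=self._serve, daemon=True)
        self.server.start()

    def _serve(self):
        # Handler loop: runs independently of the node's compute code,
        # loosely analogous to a handler triggered by an arriving message.
        while True:
            msg = self.inbox.get()
            if msg is None:             # shutdown sentinel
                break
            op, offset, arg, reply = msg
            if op == "write":
                with open(self.path, "r+b") as f:
                    f.seek(offset)
                    f.write(arg)
                reply.put(len(arg))
            elif op == "read":
                with open(self.path, "rb") as f:
                    f.seek(offset)
                    reply.put(f.read(arg))

    def shutdown(self):
        self.inbox.put(None)
        self.server.join()


def remote_write(target, offset, data):
    """Write data to the target node's file; no call is needed on the target."""
    reply = queue.Queue(maxsize=1)
    target.inbox.put(("write", offset, data, reply))
    return reply.get()


def remote_read(target, offset, nbytes):
    """Read nbytes from the target node's file at the given offset."""
    reply = queue.Queue(maxsize=1)
    target.inbox.put(("read", offset, nbytes, reply))
    return reply.get()


def demo():
    with tempfile.TemporaryDirectory() as d:
        nodes = [Node(r, d) for r in range(2)]
        # The caller acts as node 0 and touches node 1's storage directly.
        written = remote_write(nodes[1], 0, b"out-of-core block")
        data = remote_read(nodes[1], 7, 4)
        for n in nodes:
            n.shutdown()
        return written, data
```

Calling `demo()` writes a block into the second node's file and reads part of it back, without any participation from that node beyond its handler thread. A collective scheme, by contrast, would require every node to enter the I/O call at the same time.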





Copyright information

© Kluwer Academic Publishers 1999

Authors and Affiliations

  • Jarek Nieplocha (1)
  • Holger Dachsel (2)
  • Ian Foster (3)

  1. Pacific Northwest National Laboratory, Richland, USA
  2. Scientific Computing & Modelling, HV Amsterdam, The Netherlands
  3. Argonne National Laboratory, Argonne, USA
