The Journal of Supercomputing, Volume 46, Issue 3, pp 213–236

Mapping functions and data redistribution for parallel files

Article

Abstract

Data distribution in memory or on disk is an important factor in the performance of parallel applications. Moreover, programs and systems, such as parallel file systems, frequently redistribute data between memory and disks.

This paper presents a generalization of previous approaches to the redistribution problem. We introduce algorithms for mapping between two arbitrary distributions of a data set. The algorithms are optimized for multidimensional array partitions. We motivate our approach and present potential uses. The paper also presents a case study: the use of mapping functions and redistribution algorithms in a parallel file system.
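
As a rough illustration of what mapping between two distributions of a data set involves, the sketch below maps a one-dimensional array from a block distribution to a cyclic distribution. It is not the algorithm from the paper, which handles arbitrary distributions and is optimized for multidimensional arrays; the function names (`block_owner`, `cyclic_owner`, `redistribution_plan`) and the choice of distributions are assumptions made only for this example.

```python
def block_owner(i, n, p):
    """Return (process, local offset) of global index i when n elements
    are split into contiguous blocks over p processes (hypothetical helper)."""
    block = -(-n // p)  # ceil(n / p): elements per block
    return i // block, i % block


def cyclic_owner(i, p):
    """Return (process, local offset) of global index i when elements
    are dealt round-robin to p processes (hypothetical helper)."""
    return i % p, i // p


def redistribution_plan(n, p_src, p_dst):
    """Group global indices by (source process, destination process).

    Each entry lists the elements that the source process would have to
    send to the destination process to go from block to cyclic layout.
    """
    plan = {}
    for i in range(n):
        src, _ = block_owner(i, n, p_src)
        dst, _ = cyclic_owner(i, p_dst)
        plan.setdefault((src, dst), []).append(i)
    return plan


if __name__ == "__main__":
    # 8 elements, 2 source processes (block), 2 destination processes (cyclic)
    for pair, indices in sorted(redistribution_plan(8, 2, 2).items()):
        print(pair, indices)
```

This version enumerates every element, which is only practical for small arrays; a realistic implementation would presumably work on compact descriptions of index sets instead of individual indices.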

Keywords

Parallel file systems, Parallel I/O, Noncontiguous I/O, Multi-dimensional array redistribution, Mapping functions

Copyright information

© Springer Science+Business Media, LLC 2008

Authors and Affiliations

  1. Department of Computer Science, University Carlos III, Madrid, Spain
  2. Department of Computer Science, University of Karlsruhe, Karlsruhe, Germany
