Design and Use of htalib – A Library for Hierarchically Tiled Arrays

  • Ganesh Bikshandi
  • Jia Guo
  • Christoph von Praun
  • Gabriel Tanase
  • Basilio B. Fraguela
  • María J. Garzarán
  • David Padua
  • Lawrence Rauchwerger
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4382)

Abstract

Hierarchically Tiled Arrays (HTAs) are data structures that facilitate locality and parallelism in array-intensive computations with a block-recursive nature. The model underlying HTAs provides programmers with a global view of distributed data as well as a single-threaded view of the execution. In this paper we present htalib, a C++ implementation of HTAs. This library provides several novel constructs: (i) a map-reduce operator framework that facilitates the implementation of distributed operations with HTAs; (ii) overlapped tiling in support of tiling in stencil codes; and (iii) data layering, which facilitates the use of HTAs in adaptive mesh refinement applications. We describe the interface and design of htalib and our experience with the new programming constructs.
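To make the tiled, map-reduce style of computation concrete, the following is a minimal single-process sketch in C++. It is not the htalib interface: the class `TiledArray`, its members `at`, `map`, and `reduce`, and the function `demo_sum_of_squares` are hypothetical names introduced only for illustration. The sketch assumes a single level of tiling and no distribution; it shows a global element view over tiled storage, an element-wise map applied tile by tile, and a reduction that first combines values within each tile and then combines the per-tile partial results.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <numeric>
#include <vector>

// Hypothetical one-level tiled array used for illustration only.
// This is NOT the htalib API.
class TiledArray {
public:
    // Splits n elements into equal tiles of tile_size elements each.
    TiledArray(std::size_t n, std::size_t tile_size)
        : tile_size_(tile_size),
          tiles_(tile_count(n, tile_size), std::vector<double>(tile_size, 0.0)) {}

    std::size_t num_tiles() const { return tiles_.size(); }

    // Global view: index i is translated to (tile, offset within tile).
    double& at(std::size_t i) { return tiles_[i / tile_size_][i % tile_size_]; }

    // Map: apply f to every element. Tiles are processed independently,
    // so each tile could be handed to a different worker.
    template <class F>
    void map(F f) {
        for (auto& tile : tiles_)
            for (auto& x : tile) x = f(x);
    }

    // Reduce: combine elements within each tile, then combine the partials.
    // op is assumed associative, with identity as its neutral element.
    template <class Op>
    double reduce(Op op, double identity) const {
        double result = identity;
        for (const auto& tile : tiles_) {
            double partial = std::accumulate(tile.begin(), tile.end(), identity, op);
            result = op(result, partial);
        }
        return result;
    }

private:
    // Validates the tiling before any division takes place.
    static std::size_t tile_count(std::size_t n, std::size_t tile_size) {
        assert(tile_size > 0 && n % tile_size == 0);
        return n / tile_size;
    }

    std::size_t tile_size_;
    std::vector<std::vector<double>> tiles_;
};

// Fills 8 elements (two tiles of 4) with 1..8, squares them, and sums them.
double demo_sum_of_squares() {
    TiledArray a(8, 4);
    for (std::size_t i = 0; i < 8; ++i) a.at(i) = static_cast<double>(i + 1);
    a.map([](double x) { return x * x; });
    return a.reduce(std::plus<double>(), 0.0);
}
```

Calling `demo_sum_of_squares()` returns 204, the sum of the squares of 1 through 8. In a distributed setting the per-tile loops would presumably run on the processors that own the tiles, with only the partial results communicated; the sketch omits that entirely.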

Copyright information

© Springer Berlin Heidelberg 2007

Authors and Affiliations

  • Ganesh Bikshandi (1)
  • Jia Guo (1)
  • Christoph von Praun (2)
  • Gabriel Tanase (3)
  • Basilio B. Fraguela (4)
  • María J. Garzarán (1)
  • David Padua (1)
  • Lawrence Rauchwerger (3)

  1. University of Illinois, Urbana-Champaign, IL
  2. IBM T. J. Watson Research Center, Yorktown Heights, NY
  3. Texas A&M University, College Station, TX
  4. Universidade da Coruña, Spain
