Memory-Constrained Data Locality Optimization for Tensor Contractions

  • Alina Bibireata
  • Sandhya Krishnan
  • Gerald Baumgartner
  • Daniel Cociorva
  • Chi-Chung Lam
  • P. Sadayappan
  • J. Ramanujam
  • David E. Bernholdt
  • Venkatesh Choppella
Part of the Lecture Notes in Computer Science book series (LNCS, volume 2958)

Abstract

The accurate modeling of the electronic structure of atoms and molecules involves computationally intensive tensor contractions over large multi-dimensional arrays. Efficient computation of these contractions usually requires the generation of temporary intermediate arrays. These intermediates can be extremely large, so they must be stored on disk. However, the intermediates can often be generated and used in batches through appropriate loop fusion transformations. To optimize the performance of such computations, loop fusion must be combined with loop tiling so that the cost of disk I/O is minimized. In this paper, we address the memory-constrained data-locality optimization problem in the context of this class of computations. We develop an optimization framework that searches a space of fusion and tiling choices to minimize the data movement overhead. The effectiveness of the developed optimization approach is demonstrated on a computation representative of a component used in quantum chemistry suites.
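
The batching idea in the abstract can be illustrated with a small sketch. It is not the authors' framework or search procedure; the function names, the two-step contraction S = (A·B)·C, and the "peak intermediate elements" counter are all assumptions chosen for illustration. The unfused version materializes the whole intermediate T = A·B, while the fused and tiled version produces T a few rows at a time and consumes each batch immediately, so only one tile of T is ever held.

```python
def contract_unfused(A, B, C):
    """Compute S = (A.B).C with the full intermediate T = A.B materialized."""
    n_a, n_k, n_c, n_b = len(A), len(B), len(C), len(C[0])
    # Full intermediate: n_a x n_c elements live at once.
    T = [[sum(A[a][k] * B[k][c] for k in range(n_k)) for c in range(n_c)]
         for a in range(n_a)]
    S = [[sum(T[a][c] * C[c][b] for c in range(n_c)) for b in range(n_b)]
         for a in range(n_a)]
    return S, n_a * n_c  # result, peak number of intermediate elements


def contract_fused_tiled(A, B, C, tile):
    """Same result, but the producer and consumer loops over `a` are fused
    and tiled: only `tile` rows of the intermediate exist at any time."""
    n_a, n_k, n_c, n_b = len(A), len(B), len(C), len(C[0])
    S, peak = [], 0
    for a0 in range(0, n_a, tile):
        rows = range(a0, min(a0 + tile, n_a))
        # Produce one batch (tile) of the intermediate.
        T_tile = [[sum(A[a][k] * B[k][c] for k in range(n_k))
                   for c in range(n_c)] for a in rows]
        peak = max(peak, len(T_tile) * n_c)
        # Consume the batch right away; it is discarded on the next iteration.
        for t_row in T_tile:
            S.append([sum(t_row[c] * C[c][b] for c in range(n_c))
                      for b in range(n_b)])
    return S, peak


# Small integer inputs (hypothetical data) so results compare exactly.
A = [[1, 2, 3], [4, 5, 6], [7, 8, 9], [1, 0, 1]]          # 4 x 3
B = [[1, 0, 2, 1, 3], [0, 1, 1, 2, 0], [2, 1, 0, 1, 1]]   # 3 x 5
C = [[1, 2], [0, 1], [3, 0], [1, 1], [2, 2]]              # 5 x 2

S_full, peak_full = contract_unfused(A, B, C)
S_tiled, peak_tiled = contract_fused_tiled(A, B, C, tile=2)
```

In this toy setting a smaller tile lowers the memory held for the intermediate. In an out-of-core setting the tile size would also affect how often the input arrays are re-read from disk, which is the kind of trade-off the abstract says the fusion and tiling search is meant to balance.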

Copyright information

© Springer-Verlag Berlin Heidelberg 2004

Authors and Affiliations

  • Alina Bibireata (1)
  • Sandhya Krishnan (1)
  • Gerald Baumgartner (1)
  • Daniel Cociorva (1)
  • Chi-Chung Lam (1)
  • P. Sadayappan (1)
  • J. Ramanujam (2)
  • David E. Bernholdt (3)
  • Venkatesh Choppella (3)

  1. Department of Computer and Information Science, The Ohio State University, Columbus, USA
  2. Department of Electrical and Computer Engineering, Louisiana State University, Baton Rouge, USA
  3. Oak Ridge National Laboratory, Oak Ridge, USA