A Cache-Optimal Alternative to the Unidirectional Hierarchization Algorithm

Conference paper
Part of the Lecture Notes in Computational Science and Engineering book series (LNCSE, volume 109)

Abstract

The sparse grid combination technique provides a framework to solve high-dimensional numerical problems with standard solvers by assembling a sparse grid from many coarse and anisotropic full grids called component grids. Hierarchization, the transformation from the nodal basis to the hierarchical basis, is one of the most fundamental tasks for sparse grids. In settings where the component grids have to be frequently combined and distributed in a massively parallel compute environment, hierarchizing the component grids helps to minimize communication overhead.
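To make the transformation concrete, the following is a minimal sketch of in-place hierarchization and of the unidirectional scheme that applies it one dimension at a time. It is an illustration under stated assumptions, not the implementation from the paper: it assumes piecewise linear (hat) basis functions, grids with 2^level + 1 points per dimension including the boundary, and a flat array in which the first entry of `shape` varies fastest. The function names are hypothetical.

```python
def hierarchize_1d(values, start=0, stride=1, n=None):
    """Hierarchize one 1D pole of n = 2**level + 1 points in place.

    Assumption: piecewise linear hat basis with boundary points.
    The pole occupies values[start + i * stride] for i = 0 .. n-1.
    """
    if n is None:
        n = len(values)
    level = (n - 1).bit_length() - 1
    assert n == 2 ** level + 1, "pole length must be 2**level + 1"
    step = 1
    # Work from the finest level to the coarsest, so that both
    # hierarchical parents still hold their nodal values when read.
    while step < n - 1:
        for i in range(step, n - 1, 2 * step):
            left = values[start + (i - step) * stride]
            right = values[start + (i + step) * stride]
            values[start + i * stride] -= 0.5 * (left + right)
        step *= 2


def hierarchize_unidirectional(values, shape):
    """Apply the 1D transform along each dimension in turn.

    Every dimension triggers one full pass over the grid, which is
    why the number of sweeps grows with the dimension d = len(shape).
    """
    total = len(values)
    stride = 1
    for n in shape:
        for base in range(total):
            # base starts a pole if its coordinate in this dimension is 0
            if (base // stride) % n == 0:
                hierarchize_1d(values, base, stride, n)
        stride *= n
```

For example, the nodal values of u(x) = x on five equidistant points, `[0.0, 0.25, 0.5, 0.75, 1.0]`, reduce to a single non-zero entry at the right boundary, because a linear function has no hierarchical surplus in the interior.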

We present a cache-oblivious hierarchization algorithm for component grids of the combination technique. It causes \(\left\vert \mathbf{G}_{\boldsymbol{\ell}}\right\vert \cdot \left( \frac{1}{B} + \mathcal{O}\left( \frac{1}{\sqrt[d]{M}} \right)\right)\) cache misses under the tall cache assumption \(M = \omega\left(B^{d}\right)\). Here, \(\mathbf{G}_{\boldsymbol{\ell}}\) denotes the component grid, \(d\) the dimension, \(M\) the size of the cache, and \(B\) the cache line size. This algorithm decreases the leading term of the cache misses by a factor of \(d\) compared to the unidirectional algorithm, which has been the standard approach so far. The new algorithm is also optimal in the sense that the leading term of the cache misses is reduced to scanning complexity, i.e., the cost of touching every degree of freedom once. We also present a variant of the algorithm that causes \(\left\vert \mathbf{G}_{\boldsymbol{\ell}}\right\vert \cdot \left( \frac{2}{B} + \mathcal{O}\left( \frac{1}{\sqrt[d-1]{M \cdot B^{d-2}}} \right)\right)\) cache misses under the weaker assumption \(M = \omega\left(B\right)\). The new algorithms have been implemented and outperform previously existing software. In several cases the measured performance is close to the best possible.
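The bounds above can be placed side by side. The leading term for the unidirectional algorithm is not stated explicitly; the value below is inferred from the claim that the new algorithm improves on it by a factor of \(d\), and the symbols \(Q_{\text{uni}}\), \(Q_{\text{new}}\), and \(Q_{\text{var}}\) are introduced here only for illustration:

\[
Q_{\text{uni}} \approx d \cdot \frac{\left\vert \mathbf{G}_{\boldsymbol{\ell}}\right\vert}{B},
\qquad
Q_{\text{new}} = \left\vert \mathbf{G}_{\boldsymbol{\ell}}\right\vert \cdot \left( \frac{1}{B} + \mathcal{O}\left( \frac{1}{\sqrt[d]{M}} \right) \right),
\qquad
Q_{\text{var}} = \left\vert \mathbf{G}_{\boldsymbol{\ell}}\right\vert \cdot \left( \frac{2}{B} + \mathcal{O}\left( \frac{1}{\sqrt[d-1]{M \cdot B^{d-2}}} \right) \right).
\]

The \(\mathcal{O}\)-terms are of lower order under the respective assumptions: \(M = \omega\left(B^{d}\right)\) gives \(\sqrt[d]{M} = \omega(B)\), hence \(\frac{1}{\sqrt[d]{M}} = o\left(\frac{1}{B}\right)\); likewise \(M = \omega(B)\) gives \(M \cdot B^{d-2} = \omega\left(B^{d-1}\right)\), hence \(\frac{1}{\sqrt[d-1]{M \cdot B^{d-2}}} = o\left(\frac{1}{B}\right)\). Comparing leading terms only, the variant would be expected to improve on the unidirectional algorithm whenever \(d > 2\), while requiring a much weaker assumption on the cache size than the first algorithm.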

Copyright information

© Springer International Publishing Switzerland 2016

Authors and Affiliations

  1. ETH Zürich, Zürich, Switzerland
  2. IT University of Copenhagen, København S, Denmark
