Maximizing Cache Memory Usage for Multigrid Algorithms for Applications of Fluid Flow in Porous Media
Modern computers depend heavily on effective use of their cache memory subsystems. Compilers, however, are tuned for business applications rather than scientific computing, and they do not automatically tile complex numerical algorithms for solving partial differential equations. Very poor cache performance is therefore a common result.
Multigrid algorithms combine several numerical algorithms into a single, more complicated one. In this paper, an algorithm is derived that allows data to pass through cache exactly once per multigrid level during a V-cycle before the level changes. This is optimal cache usage for large problems that do not fit entirely in cache. The numerical techniques and algorithms discussed in this paper apply readily to the numerical simulation of fluid flows in porous media.
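The single-pass idea can be illustrated with a small sketch. The code below is not the algorithm derived in the paper; it is a minimal one-dimensional illustration, assuming a Gauss-Seidel smoother for a model problem of the form -u'' = f with fixed boundary values. The function names `gauss_seidel_sweeps` and `gauss_seidel_fused` are hypothetical. The fused version performs all `m` smoothing sweeps in one left-to-right traversal, so each array entry is brought into cache once per group of sweeps rather than once per sweep.

```python
def gauss_seidel_sweeps(u, f, h, m):
    """Reference version: m separate Gauss-Seidel sweeps.

    The whole array is traversed m times, so for arrays larger than
    cache every entry is reloaded from main memory on every sweep.
    """
    u = list(u)
    n = len(u)
    for _ in range(m):
        for i in range(1, n - 1):  # interior points only; u[0], u[n-1] are fixed
            u[i] = 0.5 * (u[i - 1] + u[i + 1] + h * h * f[i])
    return u


def gauss_seidel_fused(u, f, h, m):
    """The same m sweeps fused into one pass (wavefront ordering).

    At front position p, sweep s updates point p - s, so sweep s trails
    sweep s - 1 by exactly one grid point. Only the points near the
    front are touched at any moment, a working set that fits in cache.
    """
    u = list(u)
    n = len(u)
    for p in range(1, n - 2 + m):  # the front runs past the end to drain late sweeps
        for s in range(m):  # earlier sweeps first, so u[i + 1] is already updated
            i = p - s
            if 1 <= i <= n - 2:
                u[i] = 0.5 * (u[i - 1] + u[i + 1] + h * h * f[i])
    return u
```

Each update in the fused version reads the same operand values as the corresponding update in the reference version, so the two functions return identical results; only the order in which memory is visited changes. Extending the idea to two or three dimensions, or to unstructured grids, requires blocking the grid into cache-sized pieces and is considerably more involved than this sketch.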
Keywords: multigrid, cache, threads, sparse matrix, iterative methods, domain decomposition, compiler optimization