International Conference on Compiler Construction

CC 2012: Compiler Construction, pp 101–121

Analytical Bounds for Optimal Tile Size Selection

  • Jun Shirako
  • Kamal Sharma
  • Naznin Fauzia
  • Louis-Noël Pouchet
  • J. Ramanujam
  • P. Sadayappan
  • Vivek Sarkar

Part of the Lecture Notes in Computer Science book series (LNTCS, volume 7210)

Abstract

In this paper, we introduce a novel approach to guide tile size selection by employing analytical models to restrict empirical search to a subspace of the full search space. Two analytical models are used together: (1) an existing conservative model, based on the data footprint of a tile, which ignores intra-tile cache block replacement, and (2) an aggressive new model that assumes optimal cache block replacement within a tile. Experimental results on multiple platforms demonstrate the practical effectiveness of the approach: it reduces the search space for the optimal tile size by 1,307× to 11,879× for an Intel Core-2 Quad system, 358× to 1,978× for an Intel Nehalem system, and 45× to 1,142× for an IBM Power7 system. Rectangularly tiled code tuned by searching the subspace identified by our models achieves speed-ups of up to 1.40× (Intel Core-2 Quad), 1.28× (Intel Nehalem) and 1.19× (IBM Power7) relative to the best possible square tile sizes on these processor architectures. We also demonstrate the integration of the analytical bounds with existing search optimization algorithms. Our approach not only reduces the total search time of the Nelder-Mead Simplex and Parallel Rank Ordering methods by factors of up to 4.95× and 4.33×, respectively, but also finds better tile sizes that yield higher performance in the tuned tiled code.
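
The idea of pruning a tile-size search with a footprint-based bound can be illustrated with a minimal sketch. It is not the paper's model: it assumes a tiled matrix multiply C[i][j] += A[i][k] * B[k][j], counts footprint in array elements rather than cache lines, and uses hypothetical names (`footprint_elems`, `candidate_tiles`, `prune`) and an assumed 32 KiB cache with 8-byte elements. The aggressive model with optimal replacement is not reproduced, since the abstract gives no formula for it.

```python
from itertools import product


def footprint_elems(ti, tj, tk):
    """Distinct elements touched by one ti x tj x tk tile of
    C[i][j] += A[i][k] * B[k][j] (assumed kernel, element granularity)."""
    return ti * tk + tk * tj + ti * tj  # A tile + B tile + C tile


def candidate_tiles(max_size, step):
    """All rectangular (ti, tj, tk) candidates on a regular grid."""
    sizes = range(step, max_size + 1, step)
    return list(product(sizes, repeat=3))


def fits_in_cache(tile, cache_bytes, elem_bytes=8):
    """Footprint test: the whole tile footprint must fit in the cache."""
    return footprint_elems(*tile) * elem_bytes <= cache_bytes


def prune(tiles, cache_bytes, elem_bytes=8):
    """Keep only the candidates that pass the footprint test."""
    return [t for t in tiles if fits_in_cache(t, cache_bytes, elem_bytes)]


if __name__ == "__main__":
    tiles = candidate_tiles(max_size=256, step=8)
    kept = prune(tiles, cache_bytes=32 * 1024)  # assumed cache capacity
    print(f"full space: {len(tiles)} candidates, after pruning: {len(kept)}")
```

An empirical search would then time only the candidates that survive the filter. In the paper's setting a second, more permissive bound limits the subspace from the other side, which this sketch leaves out.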

Keywords

  • Search Space
  • Cache Line
  • Memory Hierarchy
  • Analytical Bound
  • Tile Size

These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Author information

Authors and Affiliations

  1. Rice University, USA

    Jun Shirako, Kamal Sharma & Vivek Sarkar

  2. The Ohio State University, USA

    Naznin Fauzia, Louis-Noël Pouchet & P. Sadayappan

  3. Louisiana State University, USA

    J. Ramanujam

Editor information

Editors and Affiliations

  1. School for Informatics, University of Edinburgh, 10 Crichton Street, EH8 9AB, Edinburgh, UK

    Michael O’Boyle

Copyright information

© 2012 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Shirako, J. et al. (2012). Analytical Bounds for Optimal Tile Size Selection. In: O’Boyle, M. (ed.) Compiler Construction. CC 2012. Lecture Notes in Computer Science, vol 7210. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-28652-0_6

  • DOI: https://doi.org/10.1007/978-3-642-28652-0_6

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-28651-3

  • Online ISBN: 978-3-642-28652-0

  • eBook Packages: Computer Science, Computer Science (R0)
