Semi-automatic Composition of Data Layout Transformations for Loop Vectorization

  • Shixiong Xu
  • David Gregg
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8707)

Abstract

In this paper, we put forward an annotation system for specifying a sequence of data layout transformations for loop vectorization. We propose four basic primitives for data layout transformations that programmers can compose to achieve complex data layout transformations. Our system automatically modifies all loops and other code operating on the transformed arrays. In addition, we propose data-layout-aware loop transformations to reduce the overhead of address computation and help vectorization. Taking the Scalar Penta-diagonal (SP) solver from the NAS Parallel Benchmarks as a case study, we show that the programmer can achieve significant speedups using our annotations.
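The abstract does not name the four primitives or show the annotation syntax, so the following C sketch does not reproduce the authors' system. It only illustrates the general idea of a composed layout transformation, assuming a strip-mine of one array dimension followed by an interchange, with the loop rewritten to match the new layout. All names (`idx_orig`, `idx_tiled`, `to_tiled`, `sum_orig`, `sum_tiled`, `layouts_agree`) and the sizes `N`, `M`, `V` are hypothetical.

```c
#include <stddef.h>

/* Hypothetical sizes: N points, M components per point, V vector lanes.
   N is assumed to be a multiple of V. */
enum { N = 16, M = 5, V = 4 };

/* Original layout: element (i, c) is stored at i*M + c, so a loop over i
   for a fixed component c walks memory with stride M. */
size_t idx_orig(size_t i, size_t c) { return i * M + c; }

/* Transformed layout: strip-mine i into (i / V, i % V), then interchange
   the lane index i % V with c. The V values of one component that belong
   to the same strip are now adjacent in memory. */
size_t idx_tiled(size_t i, size_t c) {
    return (i / V) * (M * V) + c * V + (i % V);
}

/* Copy the data from the original layout into the transformed one. */
void to_tiled(const double *src, double *dst) {
    for (size_t i = 0; i < N; i++)
        for (size_t c = 0; c < M; c++)
            dst[idx_tiled(i, c)] = src[idx_orig(i, c)];
}

/* Sum of component c over all points, original layout (stride-M access). */
double sum_orig(const double *a, size_t c) {
    double s = 0.0;
    for (size_t i = 0; i < N; i++)
        s += a[idx_orig(i, c)];
    return s;
}

/* The same loop rewritten for the transformed layout: the loop over i is
   strip-mined, and the inner loop over lanes is unit-stride. Hoisting the
   base pointer keeps address computation out of the inner loop. */
double sum_tiled(const double *b, size_t c) {
    double s = 0.0;
    for (size_t ib = 0; ib < N / V; ib++) {
        const double *lane = b + ib * (M * V) + c * V;
        for (size_t v = 0; v < V; v++)
            s += lane[v];
    }
    return s;
}

/* Self-check: both layouts must give the same sum for every component.
   Returns 1 on success and 0 on mismatch. */
int layouts_agree(void) {
    double a[N * M], b[N * M];
    for (size_t k = 0; k < (size_t)(N * M); k++)
        a[k] = (double)k;
    to_tiled(a, b);
    for (size_t c = 0; c < M; c++)
        if (sum_orig(a, c) != sum_tiled(b, c))
            return 0;
    return 1;
}
```

Calling `layouts_agree()` compares the two versions on integer-valued data. The inner loop of `sum_tiled` reads `V` contiguous values, which is the access pattern a vectorizing compiler typically handles best; whether a given compiler actually vectorizes it depends on its flags and target.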

Keywords

Single Instruction Multiple Data, Data Layout, Language Pragma, Loop Transformation, Strip Mine
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© IFIP International Federation for Information Processing 2014

Authors and Affiliations

  • Shixiong Xu (1, 2)
  • David Gregg (1, 2)
  1. Lero, The Irish Software Engineering Research Centre, Ireland
  2. Software Tools Group, Department of Computer Science, University of Dublin, Trinity College, Dublin, Ireland