Evaluation of Autoparallelization Toolkits for Commodity GPUs

Williams, David; Codreanu, Valeriu; Yang, Po; Liu, Baoquan; Dong, Feng; Yasar, Burhan; Mahdian, Babak; Chiarini, Alessandro; Zhao, Xia; Roerdink, Jos B. T. M.

doi:10.1007/978-3-642-55224-3_42

Conference paper
First Online: 01 January 2014

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 8384)

Abstract

In this paper we evaluate the performance of the OpenACC and Mint toolkits against C and CUDA implementations of the standard PolyBench test suite. Our analysis reveals that performance is similar in many cases, but that a certain set of code constructs impedes the ability of Mint to generate optimal code. We then present some small improvements which we integrate into our own GPSME toolkit (which is derived from Mint) and show that our toolkit now outperforms OpenACC in the majority of tests.

Author information

Authors and Affiliations

University of Groningen, Groningen, The Netherlands
David Williams, Valeriu Codreanu & Jos B. T. M. Roerdink
University of Bedfordshire, Luton, UK
Po Yang, Baoquan Liu & Feng Dong
RotaSoft Ltd, Ankara, Turkey
Burhan Yasar
ImageMetry, Prague, Czech Republic
Babak Mahdian
Super Computing Solutions, Bologna, Italy
Alessandro Chiarini
AnSmart, Wembley, UK
Xia Zhao

Corresponding author

Correspondence to David Williams.

Editor information

Editors and Affiliations

Institute of Computer and Information Science, Czestochowa University of Technology, Czestochowa, Poland
Roman Wyrzykowski
University of Tennessee, Department of Computer Science, Knoxville, Tennessee, USA
Jack Dongarra
Institute of Computer and Information Science, Czestochowa University of Technology, Czestochowa, Poland
Konrad Karczewski
Technical University of Denmark Informatics and Mathematical Modelling, Kongens Lyngby, Denmark
Jerzy Waśniewski

Cite this paper

Williams, D. et al. (2014). Evaluation of Autoparallelization Toolkits for Commodity GPUs. In: Wyrzykowski, R., Dongarra, J., Karczewski, K., Waśniewski, J. (eds) Parallel Processing and Applied Mathematics. PPAM 2013. Lecture Notes in Computer Science, vol 8384. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-55224-3_42

DOI: https://doi.org/10.1007/978-3-642-55224-3_42
Published: 06 May 2014
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-55223-6
Online ISBN: 978-3-642-55224-3
eBook Packages: Computer Science, Computer Science (R0)
