Subjective versus objective: classifying analytical models for productive heterogeneous performance prediction

Pallipuram, Vivek K.; Smith, Melissa C.; Sarma, Nilim; Anand, Ranajeet; Weill, Edwin; Sapra, Karan

doi:10.1007/s11227-014-1292-9

Published: 12 September 2014

Volume 71, pages 162–201, (2015)

The Journal of Supercomputing

Abstract

Heterogeneous analytical models are valuable tools that facilitate optimal application tuning via runtime prediction; however, they require several man-hours of effort to understand and employ for meaningful performance prediction. Consequently, developers face the challenge of selecting performance models that best fit their design goals and level of system knowledge. In this research, we present a classification that enables users to select a set of easy-to-use and reliable analytical models for quality performance prediction. These models, which target general-purpose graphics processing unit (GPGPU)-based systems, are categorized into two primary analytical classes: subjective-analytical and objective-analytical. The subjective-analytical models predict the computation and communication components of an application by describing the system using a minimum of qualitative relations among the system parameters, whereas the objective-analytical models predict these components by measuring pertinent hardware events using micro-benchmarks. We categorize, enhance, and characterize the existing analytical models for GPGPU computations, network-level communications, and interconnect communications to facilitate fast and reliable application performance prediction. We also explore a suitable combination of the two analytical classes, the hybrid approach, for high-quality performance prediction and report prediction accuracy of up to 95% for several tested GPGPU cluster configurations. Ultimately, this research aims to provide a collection of easy-to-select analytical models that promote straightforward and accurate performance prediction prior to large-scale implementation.
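As a rough illustration of the ideas in the abstract, the following Python sketch fits a simple cost model to micro-benchmark timings, combines predicted computation and communication components into a total runtime, and scores the prediction against a measurement. It is not the authors' method: the linear latency-plus-bandwidth cost model, the additive (non-overlapping) combination of components, the accuracy formula, and all names and sample values are assumptions introduced here for illustration.

```python
from dataclasses import dataclass


@dataclass
class Prediction:
    """Predicted runtime components (hypothetical container, not from the paper)."""
    computation_s: float    # predicted GPGPU computation time, in seconds
    communication_s: float  # predicted network + interconnect time, in seconds

    @property
    def total_s(self) -> float:
        # Assumption: components do not overlap, so they simply add up.
        return self.computation_s + self.communication_s


def fit_linear_cost(sizes, times):
    """Least-squares fit of t = latency + seconds_per_byte * size.

    `sizes` and `times` stand in for micro-benchmark samples
    (message size in bytes, measured transfer time in seconds).
    Returns (latency, seconds_per_byte).
    """
    n = len(sizes)
    mean_x = sum(sizes) / n
    mean_y = sum(times) / n
    sxx = sum((x - mean_x) ** 2 for x in sizes)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(sizes, times))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def accuracy_percent(predicted, measured):
    """Assumed accuracy metric: 100 * (1 - relative error)."""
    return 100.0 * (1.0 - abs(predicted - measured) / measured)


if __name__ == "__main__":
    # Made-up micro-benchmark samples, for illustration only.
    sizes = [1e3, 1e4, 1e5]
    times = [1e-5 + s * 1e-9 for s in sizes]
    latency, per_byte = fit_linear_cost(sizes, times)

    # Predict one 1 MB transfer and add a made-up computation estimate.
    comm = latency + per_byte * 1e6
    pred = Prediction(computation_s=2.0, communication_s=comm)
    print(pred.total_s, accuracy_percent(pred.total_s, measured=2.1))
```

In this sketch the fitted cost model plays the role of a measurement-driven component, while the computation estimate could come from any other model; mixing the two sources loosely mirrors the hybrid idea, under the assumptions stated above.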

Author information

Authors and Affiliations

Holcombe Department of Electrical and Computer Engineering, Clemson University, Clemson, SC, USA
Vivek K. Pallipuram, Melissa C. Smith, Nilim Sarma, Ranajeet Anand, Edwin Weill & Karan Sapra

Corresponding author

Correspondence to Vivek K. Pallipuram.

About this article

Cite this article

Pallipuram, V.K., Smith, M.C., Sarma, N. et al. Subjective versus objective: classifying analytical models for productive heterogeneous performance prediction. J Supercomput 71, 162–201 (2015). https://doi.org/10.1007/s11227-014-1292-9

Issue Date: January 2015
