Many-Core Processors

Dinavahi, Venkata; Lin, Ning

doi:10.1007/978-3-030-86782-9_1

Abstract

This chapter provides a brief overview of the evolution of the graphics processor, its hardware architecture, the CUDA abstraction, and the parallel programming paradigm. The important characteristics of the computing system, including massive numbers of cores, SIMT execution, memory bandwidth, the CUDA abstraction and concurrent engines, and dynamic parallelism, are considered. In addition, multi-threaded programming techniques for CPUs are discussed.

Author information

Authors and Affiliations

Electrical and Computer Engineering, University of Alberta, Edmonton, AB, Canada
Venkata Dinavahi
Powertech Labs, Surrey, BC, Canada
Ning Lin

About this chapter

Cite this chapter

Dinavahi, V., Lin, N. (2022). Many-Core Processors. In: Parallel Dynamic and Transient Simulation of Large-Scale Power Systems. Springer, Cham. https://doi.org/10.1007/978-3-030-86782-9_1

DOI: https://doi.org/10.1007/978-3-030-86782-9_1
Published: 07 September 2021
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-86781-2
Online ISBN: 978-3-030-86782-9
eBook Packages: Energy, Energy (R0)
