Using Mixed Precision Algorithm for LINPACK Benchmark on AMD GPU

Zhang, Xianyi; Zhang, Yunquan; Wang, Lei

doi:10.1007/978-3-642-16405-7_34

Chapter
First Online: 01 January 2013

Part of the book series: Lecture Notes in Earth System Sciences (LNESS)

Abstract

LINPACK is the de facto benchmark for supercomputers. Heterogeneous CPU and GPU clusters have become an important trend in supercomputing. Because mixed precision algorithms offer high performance, we previously developed GHPL, a mixed precision high performance LINPACK software package for NVIDIA GPU clusters. In this paper, we describe our recent work on porting GHPL to AMD GPUs and optimizing it there. On the AMD GPU platform, we implemented a hybrid CPU and GPU GEMM function using the ACML-GPU and GotoBLAS libraries. In our results, GHPL achieved a speedup of 3.21 over HPL. We also point out the limitations of the ACML-GPU library.
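The abstract does not spell out the mixed precision algorithm, so the following is only a minimal sketch of the classical idea it usually refers to: factor and solve in single precision, then recover double precision accuracy by iterative refinement with residuals computed in double precision. It is not GHPL's implementation. The names `f32`, `lu_factor_f32`, `lu_solve_f32`, and `solve_mixed_precision` are hypothetical, and single precision is emulated on the CPU by rounding each result to binary32 rather than by running on a GPU.

```python
import struct


def f32(x):
    """Round a Python float (binary64) to the nearest binary32 value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def lu_factor_f32(a):
    """LU factorization with partial pivoting; every result is rounded to binary32."""
    n = len(a)
    lu = [[f32(v) for v in row] for row in a]
    perm = list(range(n))  # row i of lu corresponds to row perm[i] of a
    for k in range(n):
        p = max(range(k, n), key=lambda i: abs(lu[i][k]))
        if lu[p][k] == 0.0:
            raise ValueError("matrix is singular in single precision")
        lu[k], lu[p] = lu[p], lu[k]
        perm[k], perm[p] = perm[p], perm[k]
        for i in range(k + 1, n):
            lu[i][k] = f32(lu[i][k] / lu[k][k])
            for j in range(k + 1, n):
                lu[i][j] = f32(lu[i][j] - f32(lu[i][k] * lu[k][j]))
    return lu, perm


def lu_solve_f32(lu, perm, b):
    """Solve L U x = P b by forward and back substitution in emulated binary32."""
    n = len(lu)
    x = [f32(b[perm[i]]) for i in range(n)]
    for i in range(n):  # forward substitution with unit lower triangular L
        for j in range(i):
            x[i] = f32(x[i] - f32(lu[i][j] * x[j]))
    for i in reversed(range(n)):  # back substitution with U
        for j in range(i + 1, n):
            x[i] = f32(x[i] - f32(lu[i][j] * x[j]))
        x[i] = f32(x[i] / lu[i][i])
    return x


def solve_mixed_precision(a, b, max_iter=10, tol=1e-12):
    """Single precision LU plus double precision iterative refinement."""
    lu, perm = lu_factor_f32(a)      # O(n^3) work, done in low precision
    x = lu_solve_f32(lu, perm, b)    # initial low precision solution
    b_norm = max(abs(v) for v in b) or 1.0
    for _ in range(max_iter):
        # Residual r = b - A x in double precision: O(n^2) work per step.
        r = [bi - sum(aij * xj for aij, xj in zip(row, x))
             for row, bi in zip(a, b)]
        if max(abs(v) for v in r) <= tol * b_norm:
            break
        d = lu_solve_f32(lu, perm, r)  # correction reuses the cheap factors
        x = [xi + di for xi, di in zip(x, d)]
    return x
```

The design point of the sketch is that the cubic-cost factorization runs entirely in the faster precision, while only the quadratic-cost residual and update run in double precision. The default `max_iter` and `tol` values are illustrative assumptions, and convergence is only expected for matrices that are reasonably well conditioned in single precision.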

Acknowledgments

This work is partly supported by the National 863 Plan of China (No. 2006AA01A125, No. 2009AA01A129, No. 2009AA01A134), the China HGJ Project (No. 2009ZX01036-001-002), the Knowledge Innovation Program of the Chinese Academy of Sciences (No. KGCX1-YW-13), and the Ministry of Finance (No. ZDYZ2008-2).

Author information

Authors and Affiliations

Lab of Parallel Software and Computational Science, Institute of Software, Chinese Academy of Sciences, 100190, Beijing, China
Xianyi Zhang, Yunquan Zhang & Lei Wang
Graduate University of Chinese Academy of Sciences, 100190, Beijing, China
Xianyi Zhang & Lei Wang
State Key Lab of Computing Science, Chinese Academy of Sciences, 100190, Beijing, China
Yunquan Zhang

Corresponding author

Correspondence to Xianyi Zhang.

Editor information

Editors and Affiliations

University of Minnesota, Dept. of Earth Sciences and Minnesota Supercomputing Institute, Pillsbury Hall 23, Minneapolis, 55455, Minnesota, USA
David A. Yuen
Network Information Center, Computer Center and Computer, Zhong Guan Cun 4, Beijing, 100190, China, People's Republic
Long Wang
Supercomputing Center, Zhong Guan Cun 4, Beijing, 100190, China, People's Republic
Xuebin Chi
Computer Science, University of Houston, Calhoun Street 4800, Houston, 77204, Texas, USA
Lennart Johnsson
Inst. Process Engineering (IPE), Chinese Academy of Sciences, Zhongguancun North Second Street 1, Beijing, 100190, China, People's Republic
Wei Ge
Laboratory of Computational Geodynamics, Chinese Academy of Sciences, Yu Quan Lu 19a, Beijing, 100049, China, People's Republic
Yaolin Shi

About this chapter

Cite this chapter

Zhang, X., Zhang, Y., Wang, L. (2013). Using Mixed Precision Algorithm for LINPACK Benchmark on AMD GPU. In: Yuen, D., Wang, L., Chi, X., Johnsson, L., Ge, W., Shi, Y. (eds) GPU Solutions to Multi-scale Problems in Science and Engineering. Lecture Notes in Earth System Sciences. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-16405-7_34

DOI: https://doi.org/10.1007/978-3-642-16405-7_34
Published: 09 January 2013
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-16404-0
Online ISBN: 978-3-642-16405-7
eBook Packages: Earth and Environmental Science (R0)
