Abstract
Heterogeneous many-core processors have become an important trend in high-performance computing, but their sophisticated architectures greatly complicate programming and compilation. The cost model is an important part of an optimizing compiler, which uses it to analyze the benefits of candidate program optimizations. This paper constructs a cost model for the SW26010 heterogeneous many-core processor and proposes a hybrid dynamic-static method for benefit analysis based on this cost model. Both have been implemented in an automatic parallelization framework for the SW26010. Experimental results show that the cost model and the benefit analysis filter out a large number of non-beneficial parallel loops and that the performance of the automatically parallelized programs increases significantly.
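To make the idea of a hybrid dynamic-static benefit analysis concrete, the following is a minimal illustrative sketch and not the paper's actual model. It assumes a loop is worth offloading only when its estimated parallel cost (start-up overhead, data transfer, and the per-core share of the work) is below its serial cost. It decides statically when the trip count is known at compile time and defers to a runtime check otherwise. The names `LoopInfo`, `MachineParams`, `decide`, and `is_beneficial_at_runtime`, as well as the overhead and transfer constants, are hypothetical. Only the figure of 64 compute cores per SW26010 core group reflects the real hardware.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LoopInfo:
    trip_count: Optional[int]  # None when the trip count is unknown at compile time
    cycles_per_iter: float     # estimated cost of one iteration on one core
    bytes_per_iter: int        # data moved to/from the compute cores per iteration


@dataclass
class MachineParams:
    num_cores: int = 64              # compute cores in one SW26010 core group
    spawn_overhead: float = 20000.0  # hypothetical start-up cost, in cycles
    cycles_per_byte: float = 0.5     # hypothetical data-transfer cost


def serial_cost(loop: LoopInfo, n: int) -> float:
    """Estimated cycles to run n iterations on a single core."""
    return n * loop.cycles_per_iter


def parallel_cost(loop: LoopInfo, n: int, m: MachineParams) -> float:
    """Estimated cycles to run n iterations across m.num_cores cores."""
    iters_per_core = -(-n // m.num_cores)  # ceiling division
    transfer = n * loop.bytes_per_iter * m.cycles_per_byte
    return m.spawn_overhead + transfer + iters_per_core * loop.cycles_per_iter


def is_beneficial_at_runtime(loop: LoopInfo, n: int, m: MachineParams) -> bool:
    """Dynamic part: the check a compiler could emit as a runtime guard."""
    return parallel_cost(loop, n, m) < serial_cost(loop, n)


def decide(loop: LoopInfo, m: MachineParams) -> str:
    """Static part: decide at compile time when the trip count is known."""
    if loop.trip_count is None:
        return "runtime-check"
    if is_beneficial_at_runtime(loop, loop.trip_count, m):
        return "parallel"
    return "serial"


if __name__ == "__main__":
    machine = MachineParams()
    for trips in (10, 100000, None):
        loop = LoopInfo(trip_count=trips, cycles_per_iter=100.0, bytes_per_iter=8)
        print(trips, decide(loop, machine))
```

Under these assumed parameters, a 10-iteration loop is rejected because the start-up overhead dominates, a 100000-iteration loop is accepted, and a loop with an unknown trip count is left to the runtime guard.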
© 2017 Springer Nature Singapore Pte Ltd
About this paper
Cite this paper
Li, Y., Wang, Q., Li, Y., Han, L., Gao, Y., Mu, Q. (2017). A Cost Model for Heterogeneous Many-Core Processor. In: Chen, G., Shen, H., Chen, M. (eds) Parallel Architecture, Algorithm and Programming. PAAP 2017. Communications in Computer and Information Science, vol 729. Springer, Singapore. https://doi.org/10.1007/978-981-10-6442-5_54
DOI: https://doi.org/10.1007/978-981-10-6442-5_54
Publisher Name: Springer, Singapore
Print ISBN: 978-981-10-6441-8
Online ISBN: 978-981-10-6442-5
eBook Packages: Computer Science (R0)