Parallel Iterative Solution of Large Sparse Linear Equation Systems on the Intel MIC Architecture

Alyahya, Hana; Mehmood, Rashid; Katib, Iyad

doi:10.1007/978-3-030-13705-2_16

Hana Alyahya⁶,
Rashid Mehmood⁷ &
Iyad Katib⁸

Part of the book series: EAI/Springer Innovations in Communication and Computing ((EAISICC))

2065 Accesses
9 Citations

Abstract

Many important scientific, engineering, and smart city applications require solving large sparse linear equation systems. The numerical methods for solving linear equations can be categorised into direct methods and iterative methods. Jacobi method is one of the iterative solvers that has been widely used due to its simplicity and efficiency. Its performance is affected by factors including the storage format, the specific computational algorithm, and its implementation. While the performance of Jacobi has been studied extensively on conventional CPU architectures, research on its performance on emerging architectures, such as the Intel Many Integrated Core (MIC) architecture, is still in its infancy. In this chapter, we investigate the performance of parallel implementations of the Jacobi method on Knights Corner (KNC), the first generation of the Intel MIC architectures. We implement Jacobi with two storage formats, Compressed Sparse Row (CSR) and Modified Sparse Row (MSR), and measure their performance in terms of execution time, offloading time, and speedup. We report results of sparse matrices with over 28 million rows and 640 million non-zero elements acquired from 13 diverse application domains. The experimental results show that our Jacobi parallel implementation on MIC achieves speedups of up to 27.75× compared to the sequential implementation. It also delivers a speedup of up to 3.81× compared to a powerful node comprising 24 cores in two Intel Xeon E5-2695v2 processors.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Log in via an institution

Chapter: USD 29.95; Price excludes VAT (USA)

eBook: USD 119.00; Price excludes VAT (USA)

Softcover Book: USD 159.99; Price excludes VAT (USA)

Hardcover Book: USD 219.99; Price excludes VAT (USA)

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

References

Mehmood, R., Alturki, R., Zeadally, S.: Multimedia applications over metropolitan area networks (MANs). J. Netw. Comput. Appl. 34, 1518–1529 (2011)
Article Google Scholar
Mehmood, R., Meriton, R., Graham, G., Hennelly, P., Kumar, M.: Exploring the influence of big data on city transport operations: a Markovian approach. Int. J. Oper. Prod. Manag. 37, 75–104 (2017)
Article Google Scholar
Mehmood, R., Graham, G.: Big data logistics: a health-care transport capacity sharing model. Proc. Comput. Sci. 64, 1107–1114 (2015)
Article Google Scholar
Mehmood, R., Lu, J.A.: Computational Markovian analysis of large systems. J. Manuf. Technol. Manag. 22, 804–817 (2011)
Article Google Scholar
Altowaijri, S., Mehmood, R., Williams, J.: A quantitative model of grid systems performance in healthcare organisations. In: 2010 Int. Conf. Intell. Syst. Model. Simul., pp. 431–436 (2010)
Google Scholar
Saad, Y.: Iterative methods for sparse linear systems. Society for Industrial and Applied Mathematics (2003)
Google Scholar
Golub, G.H., Van Loan, C.F.: Matrix Computations (2013)
Google Scholar
Ford, W.: Chapter 20: Basic iterative methods. In: Ford, W. (ed.) Numerical Linear Algebra with Applications, pp. 469–490. Academic, Boston (2015)
Google Scholar
Mehmood, R.: Disk-based techniques for efficient solution of large Markov chains. PhD Thesis, School of Computer Science, University of Birmingham (2004)
Google Scholar
Kwiatkowska, M., Mehmood, R., Norman, G., Parker, D.: A symbolic out-of-core solution method for Markov models. Electron. Notes Theor. Comput. Sci. 68, 589–604 (2002)
Article Google Scholar
Kwiatkowska, M., Parker, D., Yi Zhang, Y., Mehmood, R.: Dual-processor parallelisation of symbolic probabilistic model checking. In: The IEEE Computer Society’s 12th Annual International Symposium on Modeling, Analysis, and Simulation of Computer and Telecommunications Systems, 2004. (MASCOTS 2004). Proceedings, pp. 123–130. IEEE (2004)
Google Scholar
Mehmood, R., Crowcroft, J., Elmirghani, J.M.H.: A parallel implicit method for the steady-state solution of CTMCs. In: 14th IEEE International Symposium on Modeling, Analysis, and Simulation, pp. 293–302. IEEE (2006)
Google Scholar
Mehmood, R.: A survey of out-of-core analysis techniques in stochastic modelling. Technical Report CSR-03-7, School of Computer Science, University of Birmingham, Birningham (2003)
Google Scholar
Mehmood, R., Parker, D., Kwiatkowska, M.: An efficient symbolic out-of-core solution method for markov models. Technical Report CSR-03-08, School of Computer Science, University of Birmingham, Birmingham (2003)
Google Scholar
Saad, Y., Van Der Vost, H.A.: Iterative solution of linear systems in the 20th century. J. Comput. Appl. Math. 123, 1–33 (2000)
Article MathSciNet Google Scholar
Banu, S., Vaideeswaran, D.: Performance Analysis on Parallel Sparse Matrix Vector Multiplication Micro-Benchmark Using Dynamic Instrumentation Pintool. Presented at the 2013, pp. 129–136 (2013)
Google Scholar
Kwiatkowska, M., Mehmood, R.: Out-of-core solution of large linear systems of equations arising from Stochastic modelling. In: Hermanns, H., Segala, R. (eds.) Process Algebra and Probabilistic Methods: Performance Modeling and Verification. PAPM-PROBMIV, pp. 135–151. Springer, Berlin, Heidelberg (2002)
Chapter Google Scholar
Mehmood, R.: Serial disk-based analysis of large stochastic models. In: Baier, C., Haverkort, B.R., Hermanns, H., Katoen, J.-P., Siegle, M. (eds.) Validation of Stochastic Systems: A Guide to Current Research, pp. 230–255. Springer, Berlin, Heidelberg (2004)
Chapter Google Scholar
Mehmood, R., Crowcroft, J.: Parallel iterative solution method for large sparse linear equation systems. Technical Report Number UCAM-CL-TR-650, Computer Laboratory, University of Cambridge, Cambridge (2005)
Google Scholar
Giles, M.B., Reguly, I.: Trends in high-performance computing for engineering calculations. Philos. Trans. R. Soc. A. 372, 20130319 (2014)
Article Google Scholar
Sodani, A., Gramunt, R., Corbal, J., Kim, H.S., Vinod, K., Chinthamani, S., Hutsell, S., Agarwal, R., Liu, Y.C.: Knights landing: second-generation Intel Xeon Phi Product. IEEE Micro. 36, 34–46 (2016)
Article Google Scholar
Eleliemy, A., Fayze, M., Mehmood, R., Katib, I., Aljohani, N.: Loadbalancing on parallel heterogeneous architectures: spin-image algorithm on CPU and MIC. In: EUROSIM 2016, The 9th Eurosim Congress on Modelling and Simulation. p. 6. Oulu (2016)
Google Scholar
Alyahya, H., Mehmood, R., Katib, I.: Parallel sparse matrix vector multiplication on intel MIC: performance analysis. In: Smart Societies, Infrastructure, Technologies and Applications, SCITA 2017. Lecture Notes of the Institute for Computer Sciences, Social-Informatics and Telecommunications Engineering, LNICST, vol. 224, pp. 306–322. Springer, Cham (2018)
Google Scholar
Björck, Å.: Numerical Methods in Matrix Computations. Springer International Publishing, Cham (2015)
Book Google Scholar
Gander, W., Gander, M.J., Kwok, F.: Scientific Computing - An Introduction Using Maple and MATLAB. Springer Publishing Company, Incorporated (2014)
Book Google Scholar
Koza, Z., Matyka, M., Mirosław, Ł., Poła, J.: Sparse matrix-vector product. In: Kindratenko, V. (ed.) Numerical Computations with GPUs, pp. 103–121. Springer International Publishing, Cham (2014)
Google Scholar
Akhunov, R.R., Kuksenko, S.P., Salov, V.K., Gazizov, T.R.: Sparse matrix storage formats and acceleration of iterative solution of linear algebraic systems with dense matrices. J. Math. Sci. (United States). 191, 10–18 (2013)
MATH Google Scholar
Cramer, T., Schmidl, D., Klemm, M., Mey, D.: OpenMP Programming on Intel Xeon Phi Coprocessors: An Early Performance Comparison, pp. 38–44. Marc@Rwth (2012)
Google Scholar
Wang, E., Zhang, Q., Shen, B., Zhang, G., Lu, X., Wu, Q., Wang, Y.: High-Performance Computing on the Intel® Xeon Phi™. Springer International Publishing, Cham (2014)
Book Google Scholar
Maeda, H., Takahashi, D.: Performance evaluation of sparse matrix-vector multiplication using GPU/MIC cluster. In: 2015 Third International Symposium on Computing and Networking. pp. 396–399 (2015)
Google Scholar
Mehmood, R., Parker, D., Kwiatkowska, M.: An efficient BDD-based implementation of Gauss-Seidel for CTMC analysis. Technical Report CSR-03-13, School of Computer Science, University of Birmingham, Birmingham (2013)
Google Scholar
Tang, Z., Huang, H., Jiang, H., Li, B.: MIC-based preconditioned conjugate gradient method for solving large sparse linear equations. In: Hung, J., Yen, N., Li, K.C. (eds.) Frontier Computing. Lecture Notes in Electrical Engineering, vol. 375. Springer, Singapore (2016)
Google Scholar
Li, Z., Donde, V.D., Tournier, J.-C., Yang, F.: On limitations of traditional multi-core and potential of many-core processing architectures for sparse linear solvers used in large-scale power system applications. In: 2011 IEEE Power and Energy Society General Meeting, pp. 1–8. IEEE (2011)
Google Scholar
Yan, D., Cao, H., Dong, X., Zhang, B., Zhang, X.: Optimizing algorithm of sparse linear systems on GPU. In: 2011 Sixth Annu. Chinagrid Conf. pp. 174–179 (2011)
Google Scholar
Ye, F., Calvin, C., Petiton, S.G.: A study of SpMV implementation using MPI and OpenMP on Intel many-Core architecture. In: High Performance Computing for Computational Science—VECPAR 2014: 11th International Conference, Eugene, OR, USA, June 30–July 3, 2014, Revised Selected Papers, pp. 43–56. Springer International Publishing, Cham (2015)
Google Scholar
Saule, E., Kaya, K., Catalyurek, U.V.: Performance evaluation of sparse matrix multiplication kernels on Intel Xeon Phi, ArXiv, Tech. Rep. arXiv:1302.1078, Feb (2013)
Google Scholar
Maeda, H., Takahashi, D.: Parallel sparse matrix-vector multiplication using accelerators. In: Gervasi, O., Murgante, B., Misra, S., Rocha, A.M.A.C., Torre, C.M., Taniar, D., Apduhan, B.O., Stankova, E., Wang, S. (eds.) Computational Science and Its Applications—ICCSA 2016: 16th International Conference, Beijing, China, July 4–7, 2016, Proceedings, Part II, pp. 3–18. Springer International Publishing, Cham (2016)
Chapter Google Scholar
Ahamed, A.-K.C., Magoules, F.: Iterative methods for sparse linear systems on graphics processing unit. In: 2012 IEEE 14th Int. Conf. High Perform. Comput. Commun. 2012 IEEE 9th Int. Conf. Embed. Softw. Syst. pp. 836–842 (2012)
Google Scholar
Estebanez, A., Llanos, D.R., Gonzalez-Escribano, A.: Using the Xeon Phi platform to run speculatively-parallelized codes. Int. J. Parallel Prog. 45, 225–241 (2017)
Article Google Scholar
Dongarra, J., Gates, M., Haidar, A., Jia, Y., Kabir, K., Luszczek, P., Tomov, S.: HPC programming on Intel many-integrated-Core hardware with MAGMA port to Xeon phi. Sci. Program. 2015, 1–11 (2015)
Google Scholar
Chen, C., Yang, C., Tang, T., Wu, Q., Zhang, P.: OpenACC to intel offload: automatic translation and optimization. In: Communications in Computer and Information Science. pp. 111–120 (2013)
Google Scholar
Davis, T.A., Hu, Y.: The University of Florida sparse matrix collection. ACM Trans. Math. Softw. 38(1), 1–1:25 (2011)
MathSciNet MATH Google Scholar
Aziz Supercomputer, Top500. https://www.top500.org/site/50585

Download references

Acknowledgments

The experiments reported in this chapter were performed on the Aziz supercomputer at King Abdulaziz University, Jeddah, Saudi Arabia.

Author information

Authors and Affiliations

Information Technology Section, Institute of Public Administration in Makkah Al-Mukaramah Region Women Branch, Jeddah, Saudi Arabia
Hana Alyahya
High Performance Computing Center, King Abdulaziz University, Jeddah, Saudi Arabia
Rashid Mehmood
Computer Science Department, Faculty of Computing and Information Technology, King Abdulaziz University, Jeddah, Saudi Arabia
Iyad Katib

Authors

Hana Alyahya
View author publications
You can also search for this author in PubMed Google Scholar
Rashid Mehmood
View author publications
You can also search for this author in PubMed Google Scholar
Iyad Katib
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to Hana Alyahya .

Editor information

Editors and Affiliations

High Performance Computing Center, King Abdulaziz University, Jeddah, Saudi Arabia
Rashid Mehmood
Nvidia AI Technology Center, Singapore, Singapore
Simon See
Faculty of Computing and Information Technology (FCIT), King Abdulaziz University, Jeddah, Saudi Arabia
Iyad Katib
European Alliance for Innovation, Gent, Belgium
Imrich Chlamtac

Rights and permissions

Reprints and permissions

Copyright information

About this chapter

Cite this chapter

Alyahya, H., Mehmood, R., Katib, I. (2020). Parallel Iterative Solution of Large Sparse Linear Equation Systems on the Intel MIC Architecture. In: Mehmood, R., See, S., Katib, I., Chlamtac, I. (eds) Smart Infrastructure and Applications. EAI/Springer Innovations in Communication and Computing. Springer, Cham. https://doi.org/10.1007/978-3-030-13705-2_16

Download citation

DOI: https://doi.org/10.1007/978-3-030-13705-2_16
Published: 21 June 2019
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-13704-5
Online ISBN: 978-3-030-13705-2
eBook Packages: EngineeringEngineering (R0)

Publish with us

Policies and ethics