Parallel Iterative Solution of Large Sparse Linear Equation Systems on the Intel MIC Architecture

Smart Infrastructure and Applications

Abstract

Many important scientific, engineering, and smart city applications require solving large sparse linear equation systems. The numerical methods for solving linear equations can be categorised into direct methods and iterative methods. The Jacobi method is an iterative solver that has been widely used due to its simplicity and efficiency. Its performance is affected by factors including the storage format, the specific computational algorithm, and its implementation. While the performance of Jacobi has been studied extensively on conventional CPU architectures, research on its performance on emerging architectures, such as the Intel Many Integrated Core (MIC) architecture, is still in its infancy. In this chapter, we investigate the performance of parallel implementations of the Jacobi method on Knights Corner (KNC), the first generation of the Intel MIC architecture. We implement Jacobi with two storage formats, Compressed Sparse Row (CSR) and Modified Sparse Row (MSR), and measure their performance in terms of execution time, offloading time, and speedup. We report results for sparse matrices with over 28 million rows and 640 million non-zero elements acquired from 13 diverse application domains. The experimental results show that our parallel Jacobi implementation on MIC achieves speedups of up to 27.75× compared to the sequential implementation. It also delivers a speedup of up to 3.81× compared to a powerful node comprising 24 cores in two Intel Xeon E5-2695v2 processors.
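
To make the two storage formats named in the abstract concrete, the following is a minimal sequential sketch of Jacobi iteration over CSR and MSR data, written in plain Python. It is not the chapter's implementation, which targets KNC and is not reproduced on this page. The function names, the tolerance, the iteration cap, and the 3×3 test system are all assumptions made for illustration. The MSR layout used here is a 0-based adaptation of the commonly described variant (diagonal entries first, one unused slot, then off-diagonal entries row by row); whether the chapter uses exactly this variant is also an assumption.

```python
def jacobi_csr(row_ptr, col_idx, values, b, tol=1e-10, max_iter=1000):
    """Jacobi iteration on a CSR matrix; returns (solution, iterations used)."""
    n = len(b)
    x = [0.0] * n
    for it in range(1, max_iter + 1):
        x_new = [0.0] * n
        for i in range(n):  # each row depends only on the previous iterate
            diag, sigma = 0.0, 0.0
            for k in range(row_ptr[i], row_ptr[i + 1]):
                j = col_idx[k]
                if j == i:
                    diag = values[k]           # CSR must search for the diagonal
                else:
                    sigma += values[k] * x[j]  # off-diagonal contribution
            x_new[i] = (b[i] - sigma) / diag
        diff = max(abs(u - v) for u, v in zip(x_new, x))
        x = x_new
        if diff < tol:
            return x, it
    return x, max_iter


def csr_to_msr(row_ptr, col_idx, values):
    """Convert CSR to MSR (assumed layout, 0-based).

    aa[0:n]  diagonal entries, aa[n] unused, aa[n+1:] off-diagonal values
    ja[0:n+1] row pointers into aa/ja, ja[n+1:] off-diagonal column indices
    """
    n = len(row_ptr) - 1
    aa = [0.0] * (n + 1)
    ja = [0] * (n + 1)
    for i in range(n):
        ja[i] = len(aa)  # where row i's off-diagonals start
        for k in range(row_ptr[i], row_ptr[i + 1]):
            if col_idx[k] == i:
                aa[i] = values[k]
            else:
                aa.append(values[k])
                ja.append(col_idx[k])
    ja[n] = len(aa)
    return aa, ja


def jacobi_msr(aa, ja, b, tol=1e-10, max_iter=1000):
    """Jacobi iteration on an MSR matrix; the diagonal is read directly from aa[i]."""
    n = len(b)
    x = [0.0] * n
    for it in range(1, max_iter + 1):
        x_new = [0.0] * n
        for i in range(n):
            sigma = 0.0
            for k in range(ja[i], ja[i + 1]):
                sigma += aa[k] * x[ja[k]]
            x_new[i] = (b[i] - sigma) / aa[i]
        diff = max(abs(u - v) for u, v in zip(x_new, x))
        x = x_new
        if diff < tol:
            return x, it
    return x, max_iter


if __name__ == "__main__":
    # Hypothetical diagonally dominant system whose exact solution is [1, 2, 3].
    row_ptr = [0, 2, 5, 7]
    col_idx = [0, 1, 0, 1, 2, 1, 2]
    values = [4.0, -1.0, -1.0, 4.0, -1.0, -1.0, 4.0]
    b = [2.0, 4.0, 10.0]
    print(jacobi_csr(row_ptr, col_idx, values, b))
    print(jacobi_msr(*csr_to_msr(row_ptr, col_idx, values), b))
```

In this sketch every row update reads only the previous iterate, so the loop over rows carries no dependencies between iterations; that independence is what makes Jacobi a natural candidate for parallel execution. MSR stores the diagonal separately, so the inner loop avoids the per-entry diagonal test that the CSR version needs.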

Acknowledgments

The experiments reported in this chapter were performed on the Aziz supercomputer at King Abdulaziz University, Jeddah, Saudi Arabia.

Author information

Corresponding author

Correspondence to Hana Alyahya.

Copyright information

© 2020 Springer Nature Switzerland AG

About this chapter

Cite this chapter

Alyahya, H., Mehmood, R., Katib, I. (2020). Parallel Iterative Solution of Large Sparse Linear Equation Systems on the Intel MIC Architecture. In: Mehmood, R., See, S., Katib, I., Chlamtac, I. (eds) Smart Infrastructure and Applications. EAI/Springer Innovations in Communication and Computing. Springer, Cham. https://doi.org/10.1007/978-3-030-13705-2_16

  • DOI: https://doi.org/10.1007/978-3-030-13705-2_16

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-13704-5

  • Online ISBN: 978-3-030-13705-2

  • eBook Packages: Engineering, Engineering (R0)