Addressing Volume and Latency Overheads in 1D-parallel Sparse Matrix-Vector Multiplication

Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10417)


The scalability of sparse matrix-vector multiplication (SpMV) on distributed memory systems depends on multiple factors that involve different communication cost metrics. The irregular sparsity pattern of the coefficient matrix manifests itself as high bandwidth overhead (total and/or maximum volume), high latency overhead (total and/or maximum message count), or both. In this work, we propose a hypergraph partitioning model that combines two earlier models for one-dimensional partitioning, one addressing total and maximum volume and the other addressing total volume and total message count. Our model relies on the recursive bipartitioning paradigm and simultaneously addresses three cost metrics in a single partitioning phase in order to reduce volume and latency overheads. We demonstrate the validity of our model on a dataset of more than 300 matrices. The results indicate that, compared to the earlier models, our model significantly improves the scalability of SpMV.
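The cost metrics named in the abstract can be made concrete with a small sketch. The code below is an illustration only, not the paper's model or implementation: it assumes a row-wise 1D partition in which the processor that owns row i also owns x[i] and y[i], and it counts volume and messages on the sending side. The function name `comm_metrics` and its parameters are hypothetical.

```python
from collections import defaultdict


def comm_metrics(nonzeros, part, num_parts):
    """Communication metrics of y = A x under a row-wise 1D partition.

    nonzeros  : iterable of (row, col) positions of the nonzeros of A
    part[i]   : processor that owns row i, x[i] and y[i] (assumed conformal)
    num_parts : number of processors
    """
    # For each column j, collect the processors that need x[j] but do not own it.
    needers = defaultdict(set)
    for i, j in nonzeros:
        if part[i] != part[j]:
            needers[j].add(part[i])

    send_volume = [0] * num_parts  # words sent by each processor
    messages = set()               # distinct (sender, receiver) pairs
    for j, procs in needers.items():
        owner = part[j]
        send_volume[owner] += len(procs)  # one word of x[j] per receiver
        for p in procs:
            messages.add((owner, p))

    sent_msgs = [0] * num_parts    # messages sent by each processor
    for sender, _ in messages:
        sent_msgs[sender] += 1

    return {
        "total_volume": sum(send_volume),
        "max_send_volume": max(send_volume),
        "total_messages": len(messages),
        "max_send_messages": max(sent_msgs),
    }


# A 4x4 example: rows 0-1 on processor 0, rows 2-3 on processor 1.
nonzeros = [(0, 0), (0, 2), (1, 1), (1, 3), (2, 2), (2, 0), (3, 3), (3, 0)]
metrics = comm_metrics(nonzeros, part=[0, 0, 1, 1], num_parts=2)
print(metrics)
```

In this example processor 1 sends x[2] and x[3] to processor 0, while processor 0 sends only x[0] to processor 1. Total volume is therefore 3 words and maximum send volume is 2, exchanged in 2 messages. A partitioner that minimizes only total volume would not see the imbalance between the two senders, which is the kind of gap a multi-metric model aims to close.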


Keywords: Communication cost · Sparse matrix-vector multiplication · Hypergraph partitioning · One-dimensional partitioning



We acknowledge PRACE for awarding us access to resource Marconi (Lenovo NextScale) based in Italy at CINECA Supercomputing Centre. This work was supported by The Scientific and Technological Research Council of Turkey (TUBITAK) under Grant EEEAG-114E545. This article is also based upon work from COST Action CA 15109 (COSTNET).



Copyright information

© Springer International Publishing AG 2017

Authors and Affiliations

  1. Bilkent University, Ankara, Turkey
