A distributed approach for accelerating sparse matrix arithmetic operations for high-dimensional feature selection

  • Regular Paper
Knowledge and Information Systems

Abstract

Matrix computations are fundamental and ubiquitous in computational science, and are therefore used across numerous disciplines of scientific computing and engineering. Because the high computational complexity of matrix operations makes them critical to the performance of a large number of applications, their efficient execution in distributed environments is a crucial issue. This work proposes a novel approach for distributing sparse matrix arithmetic operations on computer clusters, aiming at speeding up the processing of high-dimensional matrices. The approach focuses on how to split such operations into independent parallel tasks by considering the intrinsic characteristics that distinguish each type of operation and the particular matrices involved. The approach was applied to the most commonly used arithmetic operations between matrices. Its performance was evaluated on a high-dimensional text feature selection approach and two real-world datasets. Experimental evaluation showed that the proposed approach helped to significantly reduce the computing times of large-scale matrix operations when compared to serial and multi-threaded implementations as well as several linear algebra software libraries.
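The idea of splitting a sparse matrix operation into independent tasks can be illustrated with a minimal sketch. It is not the authors' algorithm, which is not described in this preview. It assumes a simple dictionary-of-rows sparse format and a row-block split of the left operand, and it uses a local thread pool only as a stand-in for cluster nodes. All function names (`multiply_row_block`, `split_rows`, `parallel_multiply`) are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def multiply_row_block(a_rows, b):
    """Multiply a block of sparse rows of A by sparse B.

    Both operands are dict-of-dicts: {row: {col: value}}.
    Each block only reads its own rows of A, so blocks are independent tasks.
    """
    result = {}
    for i, row in a_rows.items():
        acc = {}
        for k, a_ik in row.items():
            # Only the non-zero entries of row k of B contribute.
            for j, b_kj in b.get(k, {}).items():
                acc[j] = acc.get(j, 0) + a_ik * b_kj
        if acc:
            result[i] = acc
    return result


def split_rows(a, n_tasks):
    """Partition the stored rows of A into at most n_tasks blocks."""
    keys = sorted(a)
    size = max(1, -(-len(keys) // n_tasks))  # ceiling division
    return [{i: a[i] for i in keys[s:s + size]}
            for s in range(0, len(keys), size)]


def parallel_multiply(a, b, n_tasks=4):
    """Compute A * B by running one task per row block and merging results."""
    blocks = split_rows(a, n_tasks)
    product = {}
    with ThreadPoolExecutor(max_workers=n_tasks) as pool:
        for partial in pool.map(multiply_row_block, blocks, [b] * len(blocks)):
            # Row indices are disjoint across blocks, so a plain merge suffices.
            product.update(partial)
    return product


a = {0: {0: 1, 2: 2}, 1: {1: 3}, 3: {0: 4}}
b = {0: {1: 5}, 1: {0: 6}, 2: {1: 7}}
print(parallel_multiply(a, b, n_tasks=2))
```

A row-block split is only one possible choice. As the abstract notes, a suitable split depends on the type of operation and on the matrices involved, for example on how the non-zero entries are distributed across rows.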


Notes

  1. http://trove.starlight-systems.com/.

  2. http://www.jppf.org/.

  3. http://www.digg.com/.

  4. http://www.blogcatalog.com/.

  5. http://www.netlib.org/lapack/.

  6. http://www.netlib.org/blas/.


Acknowledgments

This work has been partially funded by CONICET (Argentina) under Grant PIP No. 112-201201-00185.

Author information

Corresponding author

Correspondence to Antonela Tommasel.

About this article

Cite this article

Tommasel, A., Godoy, D., Zunino, A. et al. A distributed approach for accelerating sparse matrix arithmetic operations for high-dimensional feature selection. Knowl Inf Syst 51, 459–497 (2017). https://doi.org/10.1007/s10115-016-0981-5


  • DOI: https://doi.org/10.1007/s10115-016-0981-5
