# Fast Output-Sensitive Matrix Multiplication

Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9294)

## Abstract

We consider the problem of multiplying two U × U matrices A and C with entries from a field $$\mathcal{F}$$. We present a new randomized algorithm that can use the known fast square matrix multiplication algorithms to perform fewer arithmetic operations than the current state of the art for output matrices that are sparse.

In particular, let ω be the best known constant such that two dense U × U matrices can be multiplied with $$\mathcal{O} \left( U^\omega \right)$$ arithmetic operations. Further denote by N the number of nonzero entries in the input matrices and by Z the number of nonzero entries of the matrix product AC. We present a new Monte Carlo algorithm that uses $$\tilde{\mathcal{O}} \left( U^2 \left(\frac{Z}{U}\right)^{\omega-2} + N \right)$$ arithmetic operations and outputs the nonzero entries of AC with high probability. For dense input, i.e., $$N = U^2$$, this improves over state-of-the-art methods whenever Z is asymptotically larger than U, and it is always at most $$\mathcal{O} \left( U^\omega \right)$$. For general input density we improve upon the state of the art when N is asymptotically larger than $$U^{4-\omega} Z^{\omega - 5/2}$$.
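As a numerical sanity check on these bounds (not part of the paper), the two operation counts can be compared directly. The sketch below plugs in ω ≈ 2.3728 (an assumed value for illustration) and ignores constants and polylogarithmic factors; the function names are ours, not the authors'.

```python
# Illustrative comparison of the operation-count bounds from the abstract.
# Constants and polylogarithmic factors are ignored; omega is an assumed
# value of the matrix multiplication exponent, for illustration only.
omega = 2.3728

def output_sensitive_bound(U, Z, N):
    """Output-sensitive bound from the abstract: U^2 (Z/U)^(omega-2) + N."""
    return U ** 2 * (Z / U) ** (omega - 2) + N

def dense_bound(U):
    """Classical fast dense multiplication: U^omega."""
    return U ** omega

U = 10 ** 4
N = U ** 2  # dense input
for Z in (U, int(U ** 1.5), U ** 2):
    ratio = output_sensitive_bound(U, Z, N) / dense_bound(U)
    print(f"Z = {Z:.0e}: output-sensitive / dense = {ratio:.3f}")
```

For Z = U the new bound collapses to roughly 2U², far below U^ω, while for Z = U² it matches U^ω up to lower-order terms, consistent with the "always at most $$\mathcal{O} \left( U^\omega \right)$$" claim.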

The algorithm is based on dividing the input into "balanced" subproblems, which are then compressed and computed. The new subroutine, which computes a matrix product with balanced rows and columns in its output, uses time $$\tilde{\mathcal{O}} \left( U Z^{(\omega -1)/2} + N\right)$$, which is better than the current state of the art for balanced matrices when N is asymptotically larger than $$U Z^{\omega/2 - 1}$$; this always holds when $$N = U^2$$.
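Because the algorithm is Monte Carlo, its output is only correct with high probability, and it can be checked cheaply. A classical tool for this (Freivalds' test, not a contribution of this paper) verifies a candidate product P of A and C by comparing A(Cr) with Pr for random vectors r, using $$\mathcal{O} \left( U^2 \right)$$ operations per trial. A minimal sketch over the integers, with a hypothetical function name:

```python
import random

def freivalds_check(A, C, P, trials=20):
    """Probabilistically verify that P equals the product of A and C.

    Each trial draws a random 0/1 vector r and compares A @ (C @ r)
    with P @ r in O(U^2) arithmetic operations; a wrong product is
    caught in a single trial with probability at least 1/2.
    """
    U = len(A)
    for _ in range(trials):
        r = [random.randint(0, 1) for _ in range(U)]
        Cr = [sum(C[i][j] * r[j] for j in range(U)) for i in range(U)]
        ACr = [sum(A[i][j] * Cr[j] for j in range(U)) for i in range(U)]
        Pr = [sum(P[i][j] * r[j] for j in range(U)) for i in range(U)]
        if ACr != Pr:
            return False  # P is definitely not the product AC
    return True  # correct with probability >= 1 - 2**(-trials)
```

A correct product passes every trial, while an incorrect one is rejected except with probability 2^(-trials).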

In the I/O model — where M is the memory size and B is the block size — our algorithm is the first nontrivial result that exploits cancellations and sparsity of the output. The I/O complexity of our algorithm is $$\tilde{\mathcal{O}} \left( U^2 (Z/U)^{\omega-2}/(M^{\omega/2 -1} B) + Z/B + N/B \right)$$, which is asymptotically faster than the state of the art unless M is large.
