# Fast Structured Matrix Computations: Tensor Rank and Cohn–Umans Method

## Abstract

We discuss a generalization of the Cohn–Umans method, a potent technique developed for studying the bilinear complexity of matrix multiplication by embedding matrices into an appropriate group algebra. We investigate how the Cohn–Umans method may be used for bilinear operations other than matrix multiplication and with algebras other than group algebras, and we relate it to Strassen’s tensor rank approach, the traditional framework for investigating bilinear complexity. To demonstrate the utility of the generalized method, we apply it to find the fastest algorithms for forming structured matrix–vector products, the basic operation underlying iterative algorithms for structured matrices. The structures we study include Toeplitz, Hankel, circulant, symmetric, skew-symmetric, f-circulant, block Toeplitz–Toeplitz block, triangular Toeplitz, and Toeplitz-plus-Hankel matrices, as well as sparse, banded, and triangular matrices. Except in the case of skew-symmetric matrices, for which we have only upper bounds, the algorithms derived using the generalized Cohn–Umans method are the fastest possible in the sense of having minimum bilinear complexity. We also apply this framework to a few other bilinear operations, including matrix–matrix products, commutators, and simultaneous matrix products, and we briefly discuss the relation between tensor nuclear norm and numerical stability.
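
As a small illustration of the group-algebra viewpoint (a sketch that is not taken from the article), consider the circulant case. Multiplying an n × n circulant matrix by a vector is the same as multiplying two elements of the group algebra of the cyclic group of order n, that is, a cyclic convolution. The discrete Fourier transform turns this into n pointwise products, and these are the only multiplications in which both factors depend on the inputs. Multiplications by fixed roots of unity inside the transform are scalar multiplications by constants, which bilinear complexity does not count. The function names `dft`, `circulant_matvec_direct`, and `circulant_matvec_fourier` are hypothetical, and the quadratic-time transform is used only to keep the example dependency-free.

```python
import cmath


def dft(x, inverse=False):
    """Naive O(n^2) discrete Fourier transform (illustrative, not a fast FFT)."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        # Multiplications by roots of unity are multiplications by constants.
        s = sum(x[j] * cmath.exp(sign * 2j * cmath.pi * j * k / n) for j in range(n))
        out.append(s / n if inverse else s)
    return out


def circulant_matvec_direct(c, x):
    """Reference product with the circulant matrix C[i][j] = c[(i - j) % n]."""
    n = len(c)
    return [sum(c[(i - j) % n] * x[j] for j in range(n)) for i in range(n)]


def circulant_matvec_fourier(c, x):
    """Same product computed with n input-dependent multiplications."""
    c_hat = dft(c)
    x_hat = dft(x)
    # The only products in which both factors depend on the inputs:
    y_hat = [a * b for a, b in zip(c_hat, x_hat)]
    return dft(y_hat, inverse=True)


if __name__ == "__main__":
    c = [1, 2, 3, 4]   # first column of the circulant matrix
    x = [1, 0, -1, 2]
    print(circulant_matvec_direct(c, x))
    print([round(v.real, 9) for v in circulant_matvec_fourier(c, x)])
```

The two functions should agree up to floating-point rounding. The direct version uses n² input-dependent multiplications, while the Fourier version uses n.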

## Notes

1. Even if the exponent of matrix multiplication turns out to be 2; note that this is asymptotic.

2. This is essential; (7) cannot be replaced by $$\bigl(\sum_{i=1}^{r} |\lambda_i|^p\bigr)^{1/p}$$ for $$p > 1$$ or by $$\max_{i=1,\ldots,r} |\lambda_i|$$. See [17, Section 3].

3. Later on in the article we will consider embedding of vector spaces into algebras.

4. We do not distinguish between an irreducible representation of G and its irreducible $$\mathbb{C}[G]$$-submodule.

5. The reader is reminded that scalar multiplications by a constant like $$\omega^i$$ are not counted in bilinear complexity.

6. The result is, however, coordinate independent, i.e., it does not depend on our choice of the bases.

## References

1. W. A. Adkins and S. H. Weintraub, Algebra: An approach via module theory, Graduate Texts in Mathematics, 136, Springer, New York, 1992.

2. D. Bini and M. Capovani, “Tensor rank and border rank of band Toeplitz matrices,” SIAM J. Comput., 16 (1987), no. 2, pp. 252–258.

3. Å. Björck, Numerical Methods for Least Squares Problems, SIAM, Philadelphia, PA, 1996.

4. J.-L. Brylinski, “Algebraic measures of entanglement,” pp. 3–23, G. Chen and R. K. Brylinski (Eds), Mathematics of Quantum Computation, CRC, Boca Raton, FL, 2002.

5. J. Buczyński and J. M. Landsberg, “Ranks of tensors and a generalization of secant varieties,” Linear Algebra Appl., 438 (2013), no. 2, pp. 668–689.

6. P. Bürgisser, M. Clausen, and M. A. Shokrollahi, Algebraic Complexity Theory, Grundlehren der Mathematischen Wissenschaften, 315, Springer-Verlag, Berlin, 1997.

7. R. H.-F. Chan and X.-Q. Jin, An Introduction to Iterative Toeplitz Solvers, Fundamentals of Algorithms, 5, SIAM, Philadelphia, PA, 2007.

8. H. Cohn, R. Kleinberg, B. Szegedy, and C. Umans, “Group-theoretic algorithms for matrix multiplication,” Proc. IEEE Symp. Found. Comput. Sci. (FOCS), 46 (2005), pp. 379–388.

9. H. Cohn and C. Umans, “A group-theoretic approach to fast matrix multiplication,” Proc. IEEE Symp. Found. Comput. Sci. (FOCS), 44 (2003), pp. 438–449.

10. H. Cohn and C. Umans, “Fast matrix multiplication using coherent configurations,” Proc. ACM–SIAM Symp. Discrete Algorithms (SODA), 24 (2013), pp. 1074–1087.

11. S. A. Cook, On the Minimum Computation Time of Functions, Ph.D. thesis, Harvard University, Cambridge, MA, 1966.

12. J. W. Cooley and J. W. Tukey, “An algorithm for the machine calculation of complex Fourier series,” Math. Comp., 19 (1965), no. 90, pp. 297–301.

13. D. Coppersmith and S. Winograd, “Matrix multiplication via arithmetic progressions,” J. Symbolic Comput., 9 (1990), no. 3, pp. 251–280.

14. P. J. Davis, Circulant Matrices, John Wiley, New York, NY, 1979.

15. V. De Silva and L.-H. Lim, “Tensor rank and the ill-posedness of the best low-rank approximation problem,” SIAM J. Matrix Anal. Appl., 30 (2008), no. 3, pp. 1084–1127.

16. J. Demmel, I. Dumitriu, O. Holtz, and R. Kleinberg, “Fast matrix multiplication is stable,” Numer. Math., 106 (2007), no. 2, pp. 199–224.

17. S. Friedland and L.-H. Lim, “Nuclear norm of higher-order tensors,” (2016). http://arxiv.org/abs/1410.6072.

18. M. Fürer, “Faster integer multiplication,” SIAM J. Comput., 39 (2009), no. 3, pp. 979–1005.

19. G. Golub and C. Van Loan, Matrix Computations, 4th Ed., Johns Hopkins University Press, Baltimore, MD, 2013.

20. N. J. Higham, Accuracy and Stability of Numerical Algorithms, 2nd Ed., SIAM, Philadelphia, PA, 2002.

21. N. J. Higham, Functions of Matrices, SIAM, Philadelphia, PA, 2008.

22. N. J. Higham, “Stability of a method for multiplying complex matrices with three real matrix multiplications,” SIAM J. Matrix Anal. Appl., 13 (1992), no. 3, pp. 681–687.

23. Intel 64 and IA-32 Architectures Optimization Reference Manual, September 2015. http://www.intel.com/content/dam/www/public/us/en/documents/manuals/64-ia-32-architectures-optimization-manual

24. T. Kailath and J. Chun, “Generalized displacement structure for block-Toeplitz, Toeplitz-block, and Toeplitz-derived matrices,” SIAM J. Matrix Anal. Appl., 15 (1994), no. 1, pp. 114–128.

25. A. Karatsuba and Yu. Ofman, “Multiplication of many-digital numbers by automatic computers,” Dokl. Akad. Nauk SSSR, 145 (1962), pp. 293–294 [English translation: Soviet Phys. Dokl., 7 (1963), pp. 595–596].

26. D. E. Knuth, The Art of Computer Programming, Volume 2: Seminumerical algorithms, 3rd Ed., Addison–Wesley, Reading, MA, 1998.

27. V. K. Kodavalla, “IP gate count estimation methodology during micro-architecture phase,” IP Based Electronic System Conference and Exhibition (IP-SOC), Grenoble, France, December 2007. http://www.design-reuse.com/articles/19171/ip-gate-count-estimation-micro-architecture-phase.html

28. J. M. Landsberg, Tensors: Geometry and Applications, Graduate Studies in Mathematics, 128, AMS, Providence, RI, 2012.

29. S. Lang, Algebra, Rev. 3rd Ed., Graduate Texts in Mathematics, 211, Springer, New York, NY, 2002.

30. F. Le Gall, “Powers of tensors and fast matrix multiplication,” Proc. Internat. Symp. Symbolic Algebr. Comput. (ISSAC), 39 (2014), pp. 296–303.

31. L.-H. Lim, “Tensors and hypermatrices,” in: L. Hogben (Ed.), Handbook of Linear Algebra, 2nd Ed., CRC Press, Boca Raton, FL, 2013.

32. J. C. McConnell and J. C. Robson, Noncommutative Noetherian Rings, Rev. Ed., Graduate Studies in Mathematics, 30, AMS, Providence, RI, 2001.

33. W. Miller, “Computational complexity and numerical stability,” SIAM J. Comput., 4 (1975), no. 2, pp. 97–107.

34. M. K. Ng, Iterative Methods for Toeplitz Systems, Oxford University Press, New York, NY, 2004.

35. G. Ottaviani, “Symplectic bundles on the plane, secant varieties and Lüroth quartics revisited,” Quad. Mat., 21 (2007), pp. 315–352.

36. V. Y. Pan, Structured Matrices and Polynomials: Unified superfast algorithms, Birkhäuser, Boston, MA, 2001.

37. A. Schönhage, “Partial and total matrix multiplication,” SIAM J. Comput., 10 (1981), no. 3, pp. 434–455.

38. A. Schönhage and V. Strassen, “Schnelle Multiplikation großer Zahlen,” Computing, 7 (1971), no. 3, pp. 281–292.

39. G. Strang and S. MacNamara, “Functions of difference matrices are Toeplitz plus Hankel,” SIAM Rev., 56 (2014), no. 3, pp. 525–546.

40. V. Strassen, “Gaussian elimination is not optimal,” Numer. Math., 13 (1969), no. 4, pp. 354–356.

41. V. Strassen, “Rank and optimal computation of generic tensors,” Linear Algebra Appl., 52/53 (1983), pp. 645–685.

42. V. Strassen, “Relative bilinear complexity and matrix multiplication,” J. Reine Angew. Math., 375/376 (1987), pp. 406–443.

43. V. Strassen, “Vermeidung von Divisionen,” J. Reine Angew. Math., 264 (1973), pp. 184–202.

44. A. L. Toom, “The complexity of a scheme of functional elements realizing the multiplication of integers,” Dokl. Akad. Nauk SSSR, 150 (1963), pp. 496–498 [English translation: Soviet Math. Dokl., 4 (1963), pp. 714–716].

45. C. F. Van Loan, “The ubiquitous Kronecker product,” J. Comput. Appl. Math., 123 (2000), no. 1–2, pp. 85–100.

46. V. Vassilevska Williams, “Multiplying matrices faster than Coppersmith–Winograd,” Proc. ACM Symp. Theory Comput. (STOC), 44 (2012), pp. 887–898.

47. D. S. Watkins, The Matrix Eigenvalue Problem: GR and Krylov Subspace Methods, SIAM, Philadelphia, PA, 2007.

48. S. Winograd, “Some bilinear forms whose multiplicative complexity depends on the field of constants,” Math. Syst. Theory, 10 (1976/77), no. 2, pp. 169–180.

49. K. Ye and L.-H. Lim, “Algorithms for structured matrix-vector product of optimal bilinear complexity,” Proc. IEEE Inform. Theory Workshop (ITW), 16 (2016), to appear.

50. K. Ye and L.-H. Lim, “Every matrix is a product of Toeplitz matrices,” Found. Comput. Math., 16 (2016), no. 3, pp. 577–598.

## Acknowledgments

We thank Henry Cohn for very helpful discussions that initiated this work. We are also grateful to Andrew Chien, Nikos Pitsianis, and Xiaobai Sun for answering our questions about energy costs and circuit complexity of various integer and floating point operations; to Mike Stein for suggesting that we examine BTTB matrices; and to Chris Umans for prompting Construction 6. We thank the two anonymous referees and the handling editor for their exceptionally helpful comments and constructive suggestions. In particular, we included Sects. 1.2 and 3.2 at the handling editor’s urging; in retrospect, their absence was a glaring omission. LHL and KY are partially supported by AFOSR FA9550-13-1-0133, DARPA D15AP00109, NSF IIS 1546413, DMS 1209136, and DMS 1057064. KY’s work is also partially supported by NSF CCF 1017760.

## Author information

### Corresponding author

Correspondence to Lek-Heng Lim.

Communicated by Nicholas Higham.

## Rights and permissions

Ye, K., Lim, LH. Fast Structured Matrix Computations: Tensor Rank and Cohn–Umans Method. Found Comput Math 18, 45–95 (2018). https://doi.org/10.1007/s10208-016-9332-x

• DOI: https://doi.org/10.1007/s10208-016-9332-x

### Keywords

• Bilinear complexity
• Tensor rank
• Tensor nuclear norm
• Cohn–Umans method
• Structured matrix–vector product
• Stability
• Sparse and structured matrices

### Mathematics Subject Classification

• 15B05
• 65F50
• 65Y20
• 13P25
• 22D20