Fast Structured Matrix Computations: Tensor Rank and Cohn–Umans Method


We discuss a generalization of the Cohn–Umans method, a potent technique developed for studying the bilinear complexity of matrix multiplication by embedding matrices into an appropriate group algebra. We investigate how the Cohn–Umans method may be used for bilinear operations other than matrix multiplication, with algebras other than group algebras, and we relate it to Strassen’s tensor rank approach, the traditional framework for investigating bilinear complexity. To demonstrate the utility of the generalized method, we apply it to find the fastest algorithms for forming structured matrix–vector products, the basic operation underlying iterative algorithms for structured matrices. The structures we study include Toeplitz, Hankel, circulant, symmetric, skew-symmetric, f-circulant, block Toeplitz–Toeplitz block, triangular Toeplitz, Toeplitz-plus-Hankel, and sparse/banded/triangular matrices. Except for skew-symmetric matrices, for which we have only upper bounds, the algorithms derived using the generalized Cohn–Umans method are the fastest possible in the sense of having minimum bilinear complexity. We also apply this framework to a few other bilinear operations, including matrix–matrix products, commutators, and simultaneous matrix products, and briefly discuss the relation between tensor nuclear norm and numerical stability.
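As a rough illustration of the embedding idea in its simplest setting, the sketch below multiplies a circulant matrix by a vector by passing through the discrete Fourier transform, which diagonalizes the group algebra of the cyclic group. It is not taken from the article: the function names (`dft`, `circulant_matvec`, `circulant_matvec_direct`) are hypothetical, and a naive O(n^2) transform stands in for an FFT so that the example needs only the standard library. The point is that the only multiplications of a matrix-dependent quantity by a vector-dependent quantity are the n pointwise products in the transformed domain; multiplications by fixed roots of unity are scalar multiplications and are not counted in bilinear complexity.

```python
import cmath


def dft(x, inverse=False):
    """Naive discrete Fourier transform; a stand-in for an FFT."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        # Multiplications by roots of unity are scalar multiplications
        # by constants, so they do not count toward bilinear complexity.
        s = sum(x[j] * cmath.exp(sign * 2j * cmath.pi * j * k / n)
                for j in range(n))
        out.append(s / n if inverse else s)
    return out


def circulant_matvec(c, x):
    """Product of the circulant matrix with first column c and the vector x."""
    c_hat = dft(c)
    x_hat = dft(x)
    # The n pointwise products below are the only places where an entry
    # depending on the matrix multiplies an entry depending on the vector.
    y_hat = [a * b for a, b in zip(c_hat, x_hat)]
    return dft(y_hat, inverse=True)


def circulant_matvec_direct(c, x):
    """Reference implementation using the definition C[i][j] = c[(i - j) mod n]."""
    n = len(c)
    return [sum(c[(i - j) % n] * x[j] for j in range(n)) for i in range(n)]
```

Comparing `circulant_matvec(c, x)` with `circulant_matvec_direct(c, x)` on a small input is a quick sanity check; the two should agree up to floating-point rounding.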


Notes

  1. Even if the exponent of matrix multiplication turns out to be 2; note that this is asymptotic.

  2. This is essential; (7) cannot be replaced by \(\bigl (\sum _{i=1}^{r}|\lambda _{i} |^p \bigr )^{1/p}\) for \(p > 1\) or \(\max _{i=1,\ldots ,r} |\lambda _i |\). See [17, Section 3].

  3. Later on in the article we will consider embedding of vector spaces into algebras.

  4. We do not distinguish between an irreducible representation of G and its irreducible \(\mathbb {C}[G]\)-submodule.

  5. The reader is reminded that scalar multiplications by a constant like \(\omega ^i\) are not counted in bilinear complexity.

  6. The result is, however, coordinate independent, i.e., it does not depend on our choice of the bases.


References

  1. W. A. Adkins and S. H. Weintraub, Algebra: An approach via module theory, Graduate Texts in Mathematics, 136, Springer, New York, 1992.

  2. D. Bini and M. Capovani, “Tensor rank and border rank of band Toeplitz matrices,” SIAM J. Comput., 16 (1987), no. 2, pp. 252–258.

  3. Å. Björck, Numerical Methods for Least Squares Problems, SIAM, Philadelphia, PA, 1996.

  4. J.-L. Brylinski, “Algebraic measures of entanglement,” pp. 3–23, G. Chen and R. K. Brylinski (Eds), Mathematics of Quantum Computation, CRC, Boca Raton, FL, 2002.

  5. J. Buczyński and J. M. Landsberg, “Ranks of tensors and a generalization of secant varieties,” Linear Algebra Appl., 438 (2013), no. 2, pp. 668–689.

  6. P. Bürgisser, M. Clausen, and M. A. Shokrollahi, Algebraic Complexity Theory, Grundlehren der Mathematischen Wissenschaften, 315, Springer-Verlag, Berlin, 1997.

  7. R. H.-F. Chan and X.-Q. Jin, An Introduction to Iterative Toeplitz Solvers, Fundamentals of Algorithms, 5, SIAM, Philadelphia, PA, 2007.

  8. H. Cohn, R. Kleinberg, B. Szegedy, and C. Umans, “Group-theoretic algorithms for matrix multiplication,” Proc. IEEE Symp. Found. Comput. Sci. (FOCS), 46 (2005), pp. 379–388.

  9. H. Cohn and C. Umans, “A group-theoretic approach to fast matrix multiplication,” Proc. IEEE Symp. Found. Comput. Sci. (FOCS), 44 (2003), pp. 438–449.

  10. H. Cohn and C. Umans, “Fast matrix multiplication using coherent configurations,” Proc. ACM–SIAM Symp. Discrete Algorithms (SODA), 24 (2013), pp. 1074–1087.

  11. S. A. Cook, On the Minimum Computation Time of Functions, Ph.D. thesis, Harvard University, Cambridge, MA, 1966.

  12. J. W. Cooley and J. W. Tukey, “An algorithm for the machine calculation of complex Fourier series,” Math. Comp., 19 (1965), no. 90, pp. 297–301.

  13. D. Coppersmith and S. Winograd, “Matrix multiplication via arithmetic progressions,” J. Symbolic Comput., 9 (1990), no. 3, pp. 251–280.

  14. P. J. Davis, Circulant Matrices, John Wiley, New York, NY, 1979.

  15. V. De Silva and L.-H. Lim, “Tensor rank and the ill-posedness of the best low-rank approximation problem,” SIAM J. Matrix Anal. Appl., 30 (2008), no. 3, pp. 1084–1127.

  16. J. Demmel, I. Dumitriu, O. Holtz, and R. Kleinberg, “Fast matrix multiplication is stable,” Numer. Math., 106 (2007), no. 2, pp. 199–224.

  17. S. Friedland and L.-H. Lim, “Nuclear norm of higher-order tensors,” (2016).

  18. M. Fürer, “Faster integer multiplication,” SIAM J. Comput., 39 (2009), no. 3, pp. 979–1005.

  19. G. Golub and C. Van Loan, Matrix Computations, 4th Ed., Johns Hopkins University Press, Baltimore, MD, 2013.

  20. N. J. Higham, Accuracy and Stability of Numerical Algorithms, 2nd Ed., SIAM, Philadelphia, PA, 2002.

  21. N. J. Higham, Functions of Matrices, SIAM, Philadelphia, PA, 2008.

  22. N. J. Higham, “Stability of a method for multiplying complex matrices with three real matrix multiplications,” SIAM J. Matrix Anal. Appl., 13 (1992), no. 3, pp. 681–687.

  23. Intel 64 and IA-32 Architectures Optimization Reference Manual, September 2015.

  24. T. Kailath and J. Chun, “Generalized displacement structure for block-Toeplitz, Toeplitz-block, and Toeplitz-derived matrices,” SIAM J. Matrix Anal. Appl., 15 (1994), no. 1, pp. 114–128.

  25. A. Karatsuba and Yu. Ofman, “Multiplication of many-digital numbers by automatic computers,” Dokl. Akad. Nauk SSSR, 145 (1962), pp. 293–294 [English translation: Soviet Phys. Dokl., 7 (1963), pp. 595–596].

  26. D. E. Knuth, The Art of Computer Programming, Volume 2: Seminumerical algorithms, 3rd Ed., Addison–Wesley, Reading, MA, 1998.

  27. V. K. Kodavalla, “IP gate count estimation methodology during micro-architecture phase,” IP Based Electronic System Conference and Exhibition (IP-SOC), Grenoble, France, December 2007.

  28. J. M. Landsberg, Tensors: Geometry and Applications, Graduate Studies in Mathematics, 128, AMS, Providence, RI, 2012.

  29. S. Lang, Algebra, Rev. 3rd Ed., Graduate Texts in Mathematics, 211, Springer, New York, NY, 2002.

  30. F. Le Gall, “Powers of tensors and fast matrix multiplication,” Proc. Internat. Symp. Symbolic Algebr. Comput. (ISSAC), 39 (2014), pp. 296–303.

  31. L.-H. Lim, “Tensors and hypermatrices,” in: L. Hogben (Ed.), Handbook of Linear Algebra, 2nd Ed., CRC Press, Boca Raton, FL, 2013.

  32. J. C. McConnell and J. C. Robson, Noncommutative Noetherian Rings, Rev. Ed., Graduate Studies in Mathematics, 30, AMS, Providence, RI, 2001.

  33. W. Miller, “Computational complexity and numerical stability,” SIAM J. Comput., 4 (1975), no. 2, pp. 97–107.

  34. M. K. Ng, Iterative Methods for Toeplitz Systems, Oxford University Press, New York, NY, 2004.

  35. G. Ottaviani, “Symplectic bundles on the plane, secant varieties and Lüroth quartics revisited,” Quad. Mat., 21 (2007), pp. 315–352.

  36. V. Y. Pan, Structured Matrices and Polynomials: Unified superfast algorithms, Birkhäuser, Boston, MA, 2001.

  37. A. Schönhage, “Partial and total matrix multiplication,” SIAM J. Comput., 10 (1981), no. 3, pp. 434–455.

  38. A. Schönhage and V. Strassen, “Schnelle Multiplikation großer Zahlen,” Computing, 7 (1971), no. 3, pp. 281–292.

  39. G. Strang and S. MacNamara, “Functions of difference matrices are Toeplitz plus Hankel,” SIAM Rev., 56 (2014), no. 3, pp. 525–546.

  40. V. Strassen, “Gaussian elimination is not optimal,” Numer. Math., 13 (1969), no. 4, pp. 354–356.

  41. V. Strassen, “Rank and optimal computation of generic tensors,” Linear Algebra Appl., 52/53 (1983), pp. 645–685.

  42. V. Strassen, “Relative bilinear complexity and matrix multiplication,” J. Reine Angew. Math., 375/376 (1987), pp. 406–443.

  43. V. Strassen, “Vermeidung von Divisionen,” J. Reine Angew. Math., 264 (1973), pp. 184–202.

  44. A. L. Toom, “The complexity of a scheme of functional elements realizing the multiplication of integers,” Dokl. Akad. Nauk SSSR, 150 (1963), pp. 496–498 [English translation: Soviet Math. Dokl., 4 (1963), pp. 714–716].

  45. C. F. Van Loan, “The ubiquitous Kronecker product,” J. Comput. Appl. Math., 123 (2000), no. 1–2, pp. 85–100.

  46. V. Vassilevska Williams, “Multiplying matrices faster than Coppersmith–Winograd,” Proc. ACM Symp. Theory Comput. (STOC), 44 (2012), pp. 887–898.

  47. D. S. Watkins, The Matrix Eigenvalue Problem: GR and Krylov Subspace Methods, SIAM, Philadelphia, PA, 2007.

  48. S. Winograd, “Some bilinear forms whose multiplicative complexity depends on the field of constants,” Math. Syst. Theory, 10 (1976/77), no. 2, pp. 169–180.

  49. K. Ye and L.-H. Lim, “Algorithms for structured matrix-vector product of optimal bilinear complexity,” Proc. IEEE Inform. Theory Workshop (ITW), 16 (2016), to appear.

  50. K. Ye and L.-H. Lim, “Every matrix is a product of Toeplitz matrices,” Found. Comput. Math., 16 (2016), no. 3, pp. 577–598.


We thank Henry Cohn for very helpful discussions that initiated this work. We are also grateful to Andrew Chien, Nikos Pitsianis, and Xiaobai Sun for answering our questions about energy costs and circuit complexity of various integer and floating point operations; to Mike Stein for suggesting that we examine BTTB matrices; and to Chris Umans for prompting Construction 6. We thank the two anonymous referees and the handling editor for their exceptionally helpful comments and constructive suggestions. In particular, we included Sects. 1.2 and 3.2 at the handling editor’s urging, which in retrospect were glaring omissions. LHL and KY are partially supported by AFOSR FA9550-13-1-0133, DARPA D15AP00109, NSF IIS 1546413, DMS 1209136, and DMS 1057064. In addition, KY’s work is also partially supported by NSF CCF 1017760.

Author information


Corresponding author

Correspondence to Lek-Heng Lim.

Additional information

Communicated by Nicholas Higham.

Cite this article

Ye, K., Lim, LH. Fast Structured Matrix Computations: Tensor Rank and Cohn–Umans Method. Found Comput Math 18, 45–95 (2018).


Keywords

  • Bilinear complexity
  • Tensor rank
  • Tensor nuclear norm
  • Cohn–Umans method
  • Structured matrix–vector product
  • Stability
  • Sparse and structured matrices

Mathematics Subject Classification

  • 15B05
  • 65F50
  • 65Y20
  • 13P25
  • 22D20