
Efficient Automatic Differentiation of Matrix Functions

  • Peder A. Olsen
  • Steven J. Rennie
  • Vaibhava Goel
Conference paper
Part of the Lecture Notes in Computational Science and Engineering book series (LNCSE, volume 87)

Abstract

Forward and reverse mode automatic differentiation methods for functions that take a vector argument make derivative computation efficient. However, the determinant and inverse of a matrix are not readily expressed in the language of vectors. The derivative of a function f(X) of a \(d \times d\) matrix X is itself a \(d \times d\) matrix, and the second derivative, or Hessian, is a \(d^{2} \times d^{2}\) matrix, so computing and storing the Hessian can be very costly. In this paper we present a new calculus for matrix differentiation, built around a new matrix operation, the box product. The box product can be used to elegantly and efficiently compute both the first and second order matrix derivatives of any function that can be expressed solely in terms of arithmetic, transposition, trace, and log determinant operations. The Hessian of such a function can be represented implicitly as a sum of Kronecker, outer, and box products, which allows us to compute the Newton step efficiently. Whereas the direct computation requires \(\mathcal{O}({d}^{4})\) storage and \(\mathcal{O}({d}^{6})\) operations, the implicit representation of the Hessian reduces the storage to \(\mathcal{O}(k{d}^{2})\), where k is the number of times the variable X occurs in the expression for the derivative. Likewise, the cost of computing the Newton direction is reduced to \(\mathcal{O}(k{d}^{5})\) in general, and to \(\mathcal{O}({d}^{3})\) for k = 1 and k = 2.
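The box product itself is defined in the paper. As a rough, hedged illustration of why an implicit, structured representation of the Hessian pays off, recall the standard identity \(\mathrm{vec}(AXB) = (B^{T} \otimes A)\,\mathrm{vec}(X)\): a linear system whose coefficient matrix has the Kronecker structure \(I \otimes A + B^{T} \otimes I\) is equivalent to the Sylvester equation \(AX + XB = C\), which the Bartels-Stewart algorithm solves in \(\mathcal{O}({d}^{3})\) operations instead of forming the \(d^{2} \times d^{2}\) matrix and solving it densely in \(\mathcal{O}({d}^{6})\). The sketch below (Python with NumPy/SciPy; an illustration of this standard identity, not the paper's box-product algorithm) checks the equivalence numerically.

    import numpy as np
    from scipy.linalg import solve_sylvester

    # Illustrative only: a Kronecker-structured system
    #   (I (x) A + B^T (x) I) vec(X) = vec(C)
    # is the Sylvester equation A X + X B = C, solvable in O(d^3)
    # without ever forming the d^2 x d^2 coefficient matrix.
    d = 30
    rng = np.random.default_rng(0)
    A = rng.standard_normal((d, d))
    B = rng.standard_normal((d, d))
    C = rng.standard_normal((d, d))

    # Structured solve: Bartels-Stewart via SciPy, O(d^3).
    X_structured = solve_sylvester(A, B, C)

    # Explicit solve: build the d^2 x d^2 matrix and solve densely, O(d^6).
    I = np.eye(d)
    K = np.kron(I, A) + np.kron(B.T, I)
    x_vec = np.linalg.solve(K, C.reshape(-1, order="F"))  # vec() stacks columns
    X_explicit = x_vec.reshape((d, d), order="F")

    print(np.allclose(X_structured, X_explicit))  # True

The paper's contribution is an analogous implicit treatment for Hessians built from Kronecker, outer, and box products, which is what yields the \(\mathcal{O}({d}^{3})\) Newton step for k = 1 and k = 2.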

Keywords

Box product · Kronecker product · Sylvester equation · Reverse mode

Acknowledgements

The authors are indebted to the anonymous reviewers and the editor, Shaun Forth. Their efforts led to significant improvements to the exposition.


Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  • Peder A. Olsen (1)
  • Steven J. Rennie (1)
  • Vaibhava Goel (1)

  1. IBM, TJ Watson Research Center, Yorktown Heights, USA
