Abstract
Forward and reverse mode automatic differentiation methods for functions that take a vector argument make derivative computation efficient. However, the determinant and inverse of a matrix are not readily expressed in the language of vectors. The derivative of a function \(f(X)\) for a \(d \times d\) matrix \(X\) is itself a \(d \times d\) matrix. The second derivative, or Hessian, is a \({d}^{2} \times {d}^{2}\) matrix, and so computing and storing the Hessian can be very costly. In this paper, we present a new calculus for matrix differentiation, and introduce a new matrix operation, the box product, to accomplish this. The box product can be used to elegantly and efficiently compute both the first and second order matrix derivatives of any function that can be expressed solely in terms of arithmetic, transposition, trace and log determinant operations. The Hessian of such a function can be implicitly represented as a sum of Kronecker, outer, and box products, which allows us to compute the Newton step efficiently. Whereas the direct computation requires \(\mathcal{O}({d}^{4})\) storage and \(\mathcal{O}({d}^{6})\) operations, the indirect representation of the Hessian allows the storage to be reduced to \(\mathcal{O}(k{d}^{2})\), where \(k\) is the number of times the variable \(X\) occurs in the expression for the derivative. Likewise, the cost of computing the Newton direction is reduced to \(\mathcal{O}(k{d}^{5})\) in general, and \(\mathcal{O}({d}^{3})\) for \(k = 1\) and \(k = 2\).
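The storage and operation counts quoted above can be illustrated with a minimal sketch. It assumes a Hessian made of a single Kronecker term \(A \otimes B\), and it does not implement the box product, which is not defined in this abstract. The function names (`vec`, `unvec`, `solve_explicit`, `solve_implicit`) and the test matrices are hypothetical, and NumPy is an assumed dependency. The sketch relies on the standard identity \((A \otimes B)\,\mathrm{vec}(Z) = \mathrm{vec}(B Z A^{T})\) for column-major \(\mathrm{vec}\).

```python
import numpy as np


def vec(M):
    """Stack the columns of M into one vector (column-major order)."""
    return M.reshape(-1, order="F")


def unvec(v, d):
    """Inverse of vec for a d x d matrix."""
    return v.reshape(d, d, order="F")


def solve_explicit(A, B, C):
    """Solve (A kron B) vec(Z) = vec(C) by forming the d^2 x d^2 matrix.

    This needs O(d^4) storage and O(d^6) operations for the dense solve.
    """
    d = C.shape[0]
    H = np.kron(A, B)  # dense d^2 x d^2 matrix
    return unvec(np.linalg.solve(H, vec(C)), d)


def solve_implicit(A, B, C):
    """Solve the same system without ever forming A kron B.

    Since (A kron B) vec(Z) = vec(B Z A^T), the solution is
    Z = B^{-1} C A^{-T}, which needs only d x d matrices and O(d^3) work.
    """
    Y = np.linalg.solve(B, C)         # Y = B^{-1} C
    return np.linalg.solve(A, Y.T).T  # Z = Y A^{-T}


if __name__ == "__main__":
    d = 4
    rng = np.random.default_rng(0)
    # Hypothetical, well-conditioned test matrices (assumption for the demo).
    A = rng.standard_normal((d, d)) + d * np.eye(d)
    B = rng.standard_normal((d, d)) + d * np.eye(d)
    C = rng.standard_normal((d, d))
    Z_explicit = solve_explicit(A, B, C)
    Z_implicit = solve_implicit(A, B, C)
    print("max abs difference:", np.max(np.abs(Z_explicit - Z_implicit)))
```

Both routines return the same matrix up to rounding; only the implicit one avoids materializing the \({d}^{2} \times {d}^{2}\) system. A Hessian that is a sum of several such terms generally does not invert term by term, which is consistent with the higher general-case cost stated above.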
Acknowledgements
The authors are indebted to the anonymous reviewers and the editor, Shaun Forth. Their efforts led to significant improvements to the exposition.
Copyright information
© 2012 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Olsen, P.A., Rennie, S.J., Goel, V. (2012). Efficient Automatic Differentiation of Matrix Functions. In: Forth, S., Hovland, P., Phipps, E., Utke, J., Walther, A. (eds) Recent Advances in Algorithmic Differentiation. Lecture Notes in Computational Science and Engineering, vol 87. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-30023-3_7
DOI: https://doi.org/10.1007/978-3-642-30023-3_7
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-30022-6
Online ISBN: 978-3-642-30023-3
eBook Packages: Mathematics and Statistics (R0)