## Abstract

The matrix logarithm, when applied to Hermitian positive definite matrices, is concave with respect to the positive semidefinite order. This operator concavity property leads to numerous concavity and convexity results for other matrix functions, many of which are of importance in quantum information theory. In this paper we show how to approximate the matrix logarithm with functions that preserve operator concavity and can be described using the feasible regions of semidefinite optimization problems of fairly small size. Such approximations allow us to use off-the-shelf semidefinite optimization solvers for convex optimization problems involving the matrix logarithm and related functions, such as the quantum relative entropy. The basic ingredients of our approach apply, beyond the matrix logarithm, to functions that are operator concave and operator monotone. As such, we introduce strategies for constructing semidefinite approximations that we expect will be useful, more generally, for studying the approximation power of functions with small semidefinite representations.


## Notes

1. We define \(D_{\text {op}}\) as \(D_{\text {op}}(X\Vert Y) = -P_{\log }(Y,X)\) to match the conventional order of arguments in information theory.

2. This last requirement is to ensure uniqueness.

## References

1. Al-Mohy, A.H., Higham, N.J.: Improved inverse scaling and squaring algorithms for the matrix logarithm. SIAM J. Sci. Comput. **34**(4), C153–C169 (2012)
2. MOSEK ApS: The MOSEK optimization toolbox for MATLAB manual. Version 7.1 (Revision 28) (2015). http://docs.mosek.com/7.1/toolbox/index.html
3. Ben-Tal, A., Nemirovski, A.: Lectures on Modern Convex Optimization: Analysis, Algorithms, and Engineering Applications. SIAM (2001)
4. Ben-Tal, A., Nemirovski, A.: On polyhedral approximations of the second-order cone. Math. Oper. Res. **26**(2), 193–205 (2001)
5. Besenyei, A., Petz, D.: Successive iterations and logarithmic means. Oper. Matrices **7**(1), 205–218 (2013). https://doi.org/10.7153/oam-07-12
6. Bhatia, R.: Positive Definite Matrices. Princeton University Press (2009)
7. Bhatia, R.: Matrix Analysis, vol. 169. Springer Science & Business Media (2013)
8. Blekherman, G., Parrilo, P.A., Thomas, R.R.: Semidefinite Optimization and Convex Algebraic Geometry. SIAM (2013)
9. Boyd, S., Kim, S.J., Vandenberghe, L., Hassibi, A.: A tutorial on geometric programming. Optim. Eng. **8**(1), 67–127 (2007)
10. Bushell, P.J.: Hilbert's metric and positive contraction mappings in a Banach space. Arch. Ration. Mech. Anal. **52**(4), 330–338 (1973)
11. Carlen, E.A.: Trace inequalities and quantum entropy. An introductory course. In: Entropy and the Quantum, vol. 529, pp. 73–140. AMS (2010)
12. Carlson, B.C.: An algorithm for computing logarithms and arctangents. Math. Comp. **26**(118), 543–549 (1972)
13. Cox, D.A.: The arithmetic-geometric mean of Gauss. In: Pi: A Source Book, pp. 481–536. Springer (2004)
14. Dieci, L., Morini, B., Papini, A.: Computational techniques for real logarithms of matrices. SIAM J. Matrix Anal. Appl. **17**(3), 570–593 (1996)
15. Domahidi, A., Chu, E., Boyd, S.: ECOS: An SOCP solver for embedded systems. In: European Control Conference (ECC), pp. 3071–3076 (2013)
16. Ebadian, A., Nikoufar, I., Gordji, M.E.: Perspectives of matrix convex functions. Proc. Natl. Acad. Sci. USA **108**(18), 7313–7314 (2011)
17. Effros, E., Hansen, F.: Non-commutative perspectives. Ann. Funct. Anal. **5**(2), 74–79 (2014)
18. Effros, E.G.: A matrix convexity approach to some celebrated quantum inequalities. Proc. Natl. Acad. Sci. USA **106**(4), 1006–1008 (2009)
19. Fawzi, H., Fawzi, O.: Relative entropy optimization in quantum information theory via semidefinite programming approximations. arXiv preprint arXiv:1705.06671 (2017)
20. Fawzi, H., Saunderson, J.: Lieb's concavity theorem, matrix geometric means, and semidefinite optimization. Linear Algebra Appl. **513**, 240–263 (2017)
21. Fujii, J., Kamei, E.: Relative operator entropy in noncommutative information theory. Math. Japon. **34**, 341–348 (1989)
22. Glineur, F.: Quadratic approximation of some convex optimization problems using the arithmetic-geometric mean iteration. Talk at the "Workshop GeoLMI on the geometry and algebra of linear matrix inequalities" (2009). http://homepages.laas.fr/henrion/geolmi/geolmi-glineur.pdf (retrieved November 2, 2016)
23. Grant, M., Boyd, S.: CVX: Matlab software for disciplined convex programming, version 2.1 (2014). http://cvxr.com/cvx
24. Hansen, F., Pedersen, G.K.: Jensen's inequality for operators and Löwner's theorem. Math. Ann. **258**(3), 229–241 (1982)
25. Helton, J.W., Klep, I., McCullough, S.: The tracial Hahn–Banach theorem, polar duals, matrix convex sets, and projections of free spectrahedra. J. Eur. Math. Soc. (JEMS) **19**(6), 1845–1897 (2017)
26. Higham, N.J.: Functions of Matrices: Theory and Computation. SIAM (2008)
27. Kenney, C., Laub, A.J.: Condition estimates for matrix functions. SIAM J. Matrix Anal. Appl. **10**(2), 191–209 (1989)
28. Lieb, E.H.: Convex trace functions and the Wigner–Yanase–Dyson conjecture. Adv. Math. **11**(3), 267–288 (1973)
29. Lieb, E.H., Ruskai, M.B.: Proof of the strong subadditivity of quantum-mechanical entropy. J. Math. Phys. **14**(12), 1938–1941 (1973)
30. Meurant, G., Sommariva, A.: Fast variants of the Golub and Welsch algorithm for symmetric weight functions in Matlab. Numer. Algorithms **67**(3), 491–506 (2014)
31. Nesterov, Y.E.: Constructing self-concordant barriers for convex cones. CORE Discussion Paper (2006/30) (2006)
32. Sagnol, G.: On the semidefinite representation of real functions applied to symmetric matrices. Linear Algebra Appl. **439**(10), 2829–2843 (2013)
33. Serrano, S.A.: Algorithms for unsymmetric cone optimization and an implementation for problems with the exponential cone. Ph.D. thesis, Stanford University (2015)
34. Skajaa, A., Ye, Y.: A homogeneous interior-point algorithm for nonsymmetric convex conic optimization. Math. Program. **150**(2), 391–422 (2015)
35. Stoer, J., Bulirsch, R.: Introduction to Numerical Analysis, vol. 12, 3rd edn. Springer-Verlag, New York (2002)
36. Trefethen, L.N.: Is Gauss quadrature better than Clenshaw–Curtis? SIAM Rev. **50**(1), 67–87 (2008)
37. Trefethen, L.N.: Approximation Theory and Approximation Practice. SIAM (2013)
38. Tropp, J.A.: From joint convexity of quantum relative entropy to a concavity theorem of Lieb. Proc. Amer. Math. Soc. **140**(5), 1757–1760 (2012)
39. Tropp, J.A.: An introduction to matrix concentration inequalities. Found. Trends Mach. Learn. **8**(1–2), 1–230 (2015)

## Additional information

This work was done in part while Pablo Parrilo was visiting the Simons Institute for the Theory of Computing. It was partially supported by the DIMACS/Simons Collaboration on Bridging Continuous and Discrete Optimization through NSF Grant #CCF-1740425. This work was also supported in part by the Air Force Office of Scientific Research through AFOSR Grants FA9550-11-1-0305 and FA9550-12-1-0287, and in part by the National Science Foundation through NSF Grant #CCF-1565235.

Communicated by James Renegar.

## Appendices

### Background on Approximation Theory

*Gaussian quadrature* A *quadrature rule* is a method of approximating an integral with a weighted sum of evaluations of the integrand. A quadrature rule is determined by the evaluation points, called *nodes*, and the *weights* of the weighted sum. Given a measure \(\nu \) supported on \([-1,1]\), a quadrature rule gives an approximation of the form

$$\int_{-1}^{1} f(t)\,\mathrm{d}\nu(t) \approx \sum_{j=1}^{m} w_j f(t_j),$$
where the \(t_j\in [-1,1]\) are the nodes and the \(w_j\) are the weights. A Gaussian quadrature is a choice of nodes \(t_1,\ldots ,t_m\in (-1,1)\) and positive weights \(w_1,\ldots ,w_m\) that integrates all polynomials of degree at most \(2m-1\) *exactly*. For example, when \(\nu \) is the uniform measure on \([-1,1]\), such a quadrature rule is known as *Gauss–Legendre* quadrature, and the nodes and weights can be computed, for example, by an eigenvalue decomposition of the associated Jacobi matrix, see e.g., [36, Sect. 2].
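As an illustration (ours, not part of the original article), the exactness property is easy to check numerically with NumPy's `numpy.polynomial.legendre.leggauss`, which returns the Gauss–Legendre nodes and weights:

```python
import numpy as np

# m-point Gauss-Legendre rule for the uniform measure on [-1, 1].
m = 5
nodes, weights = np.polynomial.legendre.leggauss(m)

def quad(f):
    """Approximate the integral of f over [-1, 1] by the quadrature rule."""
    return np.dot(weights, f(nodes))

# The rule integrates t^k exactly for k = 0, ..., 2m-1:
# the exact value is 2/(k+1) for even k and 0 for odd k.
exact = [2.0 / (k + 1) if k % 2 == 0 else 0.0 for k in range(2 * m)]
approx = [quad(lambda t, k=k: t ** k) for k in range(2 * m)]
max_err = max(abs(a - e) for a, e in zip(approx, exact))

# Degree 2m is no longer integrated exactly.
err_2m = abs(quad(lambda t: t ** (2 * m)) - 2.0 / (2 * m + 1))
```

Here `max_err` is at machine precision while `err_2m` is visibly nonzero, matching the degree-\(2m-1\) exactness claim.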

*Padé approximants* Padé approximants are approximations of a given univariate function, analytic at a point \(x_0\), by rational functions. More precisely, the (*m*, *n*)-Padé approximant of *h* at \(x_0\) is the rational function *p*(*x*) / *q*(*x*) such that *p* is a polynomial of degree *m*, *q* is a polynomial of degree *n*, and the Taylor series expansion of the error at \(x_0\) is of the form

$$h(x) - \frac{p(x)}{q(x)} = \sum_{k=m+n+s}^{\infty} a_k (x-x_0)^k$$
for real numbers \(a_k\) and the largest possible positive integer^{Footnote 2} *s*. Expressed differently, *p* and *q* are chosen so that the Taylor series of *p*(*x*) / *q*(*x*) at \(x_0\) matches as many Taylor series coefficients of *h* at \(x_0\) as possible (and at least the first \(m+n+1\) coefficients).
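Padé coefficients can be computed by solving a small linear system for the denominator. The following NumPy sketch (ours; the paper does not prescribe an algorithm) builds the (2, 2) approximant of \(\log(1+u)\) at \(u=0\) from its Taylor coefficients:

```python
import numpy as np

def pade(c, m, n):
    """(m, n) Pade approximant from Taylor coefficients c[0..m+n].
    Returns (p, q) coefficient arrays (lowest order first) with q[0] = 1."""
    c = np.asarray(c, dtype=float)
    # Denominator: sum_{j=1..n} q_j * c[m+i-j] = -c[m+i] for i = 1..n.
    A = np.array([[c[m + i - j] if m + i - j >= 0 else 0.0
                   for j in range(1, n + 1)] for i in range(1, n + 1)])
    q = np.concatenate(([1.0], np.linalg.solve(A, -c[m + 1:m + n + 1])))
    # Numerator: p_k = sum_{j=0..min(k,n)} q_j * c[k-j] for k = 0..m.
    p = np.array([sum(q[j] * c[k - j] for j in range(min(k, n) + 1))
                  for k in range(m + 1)])
    return p, q

# Taylor coefficients of log(1+u) at u = 0: 0, 1, -1/2, 1/3, -1/4, 1/5.
c = [0.0] + [(-1) ** (k + 1) / k for k in range(1, 6)]
p, q = pade(c, 2, 2)   # gives p(u) = u + u^2/2, q(u) = 1 + u + u^2/6

u = 0.1
approx = np.polyval(p[::-1], u) / np.polyval(q[::-1], u)
err = abs(approx - np.log(1 + u))   # error is O(u^5)
```

The error at \(u=0.1\) is far smaller than that of the degree-2 Taylor truncation, reflecting the order-\(m+n+1\) contact at \(x_0\).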

### Properties and Error Bounds of Gaussian Quadrature-Based Approximations

Assume \(g:\mathbb {R}_{++}\rightarrow \mathbb {R}\) is a function with an integral representation

$$g(x) = g(1) + g'(1)\int_{0}^{1} f_t(x)\,\mathrm{d}\nu(t), \qquad (38)$$
where \(\nu \) is a probability measure on [0, 1] and \(f_t(x) = \frac{x-1}{1+t(x-1)}\). The case \(g=\log \) corresponds to \(\nu \) being the Lebesgue measure on [0, 1]. In this appendix we show that the rational approximation obtained by applying Gaussian quadrature on (38) coincides with the Padé approximant of *g* at \(x=1\). We also derive error bounds on the quality of this rational approximation. Note that functions of the form (38) are precisely operator monotone functions, by Löwner’s theorem (see Sect. 4.1).

Let \(r_m\) be the rational approximant obtained by using Gaussian quadrature on (38):

$$r_m(x) = g(1) + g'(1)\sum_{i=1}^{m} w_i f_{t_i}(x), \qquad (39)$$
where \(w_i > 0, t_i \in [0,1]\) are the Gaussian quadrature weights and nodes for the measure \(\nu \).
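Concretely, for \(g=\log\) (so \(g(1)=0\), \(g'(1)=1\), and \(\nu\) the Lebesgue measure on [0, 1]), \(r_m\) can be formed by mapping Gauss–Legendre nodes from \([-1,1]\) to [0, 1]. A short NumPy sketch of this construction (ours, for illustration):

```python
import numpy as np

# Discretize log(x) = \int_0^1 (x-1)/(1 + t(x-1)) dt with an m-point Gauss rule.
m = 6
s, w = np.polynomial.legendre.leggauss(m)
t, w = (s + 1) / 2, w / 2          # nodes/weights for the uniform measure on [0, 1]

def r(x):
    # r_m(x) = sum_i w_i f_{t_i}(x), a rational function of x of type (m, m)
    return np.sum(w * (x - 1) / (1 + t * (x - 1)))

errs = [abs(r(x) - np.log(x)) for x in (0.5, 1.0, 2.0, 4.0)]
```

Even for moderate \(m\) the approximation is accurate to several digits on an interval around \(x=1\), and exact at \(x=1\).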

### Connection with Padé Approximant

We first show that the function \(r_m\) coincides with the Padé approximant of *g* at \(x=1\). The special case \(g =\log \) was established in [14, Theorem 4.3].

### Proposition 5

Assume \(g:\mathbb {R}_{++}\rightarrow \mathbb {R}\) has the form (38) and let \(r_m\) be the rational approximation obtained via Gaussian quadrature as in (39). Then \(r_m\) is the (*m*, *m*) Padé approximant of *g* at \(x=1\).

### Proof

First we note that \(f_t(x)\) admits the following series expansion, valid for \(|x-1|<\frac{1}{|t|}\):

$$f_t(x) = \frac{x-1}{1+t(x-1)} = \sum_{k=0}^{\infty} (-t)^{k}(x-1)^{k+1}.$$
Let \(\nu _m = \sum _{i=1}^m w_i \delta _{t_i}\) be the atomic measure on [0, 1] corresponding to Gaussian quadrature applied to \(\nu \). By definition of Gaussian quadrature, \(\nu _m\) matches all moments of \(\nu \) up to degree \(2m-1\), i.e., \(\int _{0}^{1}p(t)\;\mathrm{d}\nu (t) = \int _{0}^{1}p(t)\;\mathrm{d}\nu _m(t)\) for all polynomials *p* of degree at most \(2m-1\). It thus follows that

$$g(x) - r_m(x) = g'(1)\sum_{k=2m}^{\infty}\left[\int_{0}^{1}(-t)^k\,\mathrm{d}\nu(t) - \int_{0}^{1}(-t)^k\,\mathrm{d}\nu_m(t)\right](x-1)^{k+1} = O\big((x-1)^{2m+1}\big),$$
establishing that \(r_m\) matches the first 2*m* Taylor series coefficients of *g* at \(x=1\). Since \(r_m\) has numerator and denominator degree *m*, it is the (*m*, *m*)-Padé approximant of *g* at \(x=1\). \(\square \)
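The order-of-contact claim can be checked numerically (a sketch of ours, not from the paper): since the error vanishes to order \(2m+1\) at \(x=1\), and the quadrature-based \(r_m\) for \(g=\log\) satisfies \(r_m(1/x)=-r_m(x)\), halving \(\log x\) should shrink the error by roughly \(2^{2m+1}\):

```python
import numpy as np

# Proposition 5 identifies r_m with the (m, m) Pade approximant of g at x = 1,
# so for g = log the error should scale like (log x)^(2m+1) near x = 1.
m = 2
s, w = np.polynomial.legendre.leggauss(m)
t, w = (s + 1) / 2, w / 2          # Gauss rule for the uniform measure on [0, 1]

def r(x):
    return np.sum(w * (x - 1) / (1 + t * (x - 1)))

def err(h):
    x = np.exp(h)
    return abs(np.log(x) - r(x))

ratio = err(0.2) / err(0.1)        # expect roughly 2**(2*m + 1) = 32
```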

### Error Bounds

In this section we derive an error bound on the approximation quality \(|g(x) - r_m(x)|\). To do this we use standard methods as described, e.g., in [37]. This error is essentially controlled by the decay of the Chebyshev coefficients of the integrand. For the rational functions \(f_t\) one can compute these coefficients exactly.

#### Quadrature Error Bounds for Operator Monotone Functions

To appeal to standard arguments, it is easiest to rewrite the integrals of interest over the interval \([-1,1]\) by the transformation \(t\mapsto 1-2t\) mapping [0, 1] to \([-1,1]\). To this end, let

$$\tilde{f}_t(x) = f_{(1-t)/2}(x) = \frac{2(x-1)}{(x+1)-t(x-1)}, \qquad t\in[-1,1].$$
Let \(T_k(t)\) denote the *k*th Chebyshev polynomial. We start by explicitly computing the Chebyshev expansion of \(\tilde{f}_t(x)\) for fixed *x*, i.e., we find the coefficients \(a_k(x)\) of \(\tilde{f}_t(x) = \sum _{k=0}^{\infty }a_k(x)T_k(t)\). To do this, we first define \(h_\rho (t) = \frac{2}{(\rho +\rho ^{-1})/2 - t}\) and observe that with the substitution \(\rho = \frac{\sqrt{x}-1}{\sqrt{x}+1}\) we have that \(\tilde{f}_t(x) = h_\rho (t)\) and that \(x>0\) if and only if \(-1<\rho <1\). We can compute the Chebyshev expansion of \(h_\rho (t)\) by observing that the generating function of Chebyshev polynomials is (see e.g., [37, Exercise 3.14])

$$\sum_{k=0}^{\infty} T_k(t)\rho^k = \frac{1-\rho t}{1-2\rho t+\rho^2}, \qquad |\rho|<1.$$
It then follows that the Chebyshev expansion of \(h_\rho (t)\) is

$$h_\rho(t) = \frac{4\rho}{1-\rho^2}\left(1 + 2\sum_{k=1}^{\infty}\rho^k T_k(t)\right).$$
Since \(\frac{8}{\rho ^{-1}-\rho } = 2(\sqrt{x} - 1/\sqrt{x})\), the Chebyshev expansion of \(\tilde{f}_t(x)\) is

$$\tilde{f}_t(x) = \left(\sqrt{x}-\frac{1}{\sqrt{x}}\right) + 2\left(\sqrt{x}-\frac{1}{\sqrt{x}}\right)\sum_{k=1}^{\infty}\left(\frac{\sqrt{x}-1}{\sqrt{x}+1}\right)^{k}T_k(t). \qquad (41)$$
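These coefficients can be cross-checked numerically. The following NumPy sketch (ours) sums the series with \(a_0=\sqrt x - 1/\sqrt x\) and \(a_k = 2(\sqrt x - 1/\sqrt x)\rho^k\) for \(k\ge 1\), and compares against \(h_\rho(t) = 2/((\rho+\rho^{-1})/2 - t)\) directly:

```python
import numpy as np

# Cross-check the Chebyshev expansion of h_rho(t) = 2/((rho + 1/rho)/2 - t):
# a_0 = sqrt(x) - 1/sqrt(x), a_k = 2*(sqrt(x) - 1/sqrt(x))*rho^k for k >= 1,
# with rho = (sqrt(x) - 1)/(sqrt(x) + 1); the series converges geometrically.
x = 3.0
rho = (np.sqrt(x) - 1) / (np.sqrt(x) + 1)
c0 = np.sqrt(x) - 1 / np.sqrt(x)

t = np.linspace(-1, 1, 7)
direct = 2.0 / ((rho + 1 / rho) / 2 - t)

K = 200  # truncation level; the tail is of size ~ rho^K
coeffs = np.concatenate(([c0], 2 * c0 * rho ** np.arange(1, K)))
series = np.polynomial.chebyshev.chebval(t, coeffs)

max_err = np.max(np.abs(direct - series))
```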
We are now ready to state an error bound on the approximation quality \(|g(x) - r_m(x)|\). Our arguments are standard and follow closely the ideas described in [37].

### Proposition 6

Let \(g:\mathbb {R}_{++}\rightarrow \mathbb {R}\) be a function with an integral representation (38) and let \(r_m\) be the rational approximation obtained by applying Gaussian quadrature as in (39). If \(m\ge 1\) and \(x>0\) then

$$|g(x)-r_m(x)| \le 4g'(1)\left|\sqrt{x}-\frac{1}{\sqrt{x}}\right|\frac{\left|\frac{\sqrt{x}-1}{\sqrt{x}+1}\right|^{2m}}{1-\left|\frac{\sqrt{x}-1}{\sqrt{x}+1}\right|}. \qquad (42)$$
If \(\nu \) is invariant under the map \(t\mapsto 1-t\) (i.e., \(g(x^{-1}) = -g(x)\)) then this can be improved to

$$|g(x)-r_m(x)| \le 4g'(1)\left|\sqrt{x}-\frac{1}{\sqrt{x}}\right|\frac{\left(\frac{\sqrt{x}-1}{\sqrt{x}+1}\right)^{2m}}{1-\left(\frac{\sqrt{x}-1}{\sqrt{x}+1}\right)^{2}}. \qquad (43)$$
Finally, \(r_m(x) \ge g(x)\) for all \(0<x\le 1\) and \(r_m(x) \le g(x)\) for all \(x\ge 1\).

### Proof

Let \(\tilde{\nu }\) be the measure on \([-1,1]\) obtained from \(\nu \) by changing variables \(t \in [0,1] \mapsto 1-2t \in [-1,1]\) so that \(g(x) = g(1) + g'(1) \int _{-1}^1 \tilde{f_t}(x)\, \mathrm{d}\tilde{\nu }(t)\). Let \(\tilde{\nu }_m\) be the atomic measure supported on *m* points obtained by applying Gaussian quadrature on \(\tilde{\nu}\). Finally let the Chebyshev expansion of \(\tilde{f}_t(x)\) be \(\sum _{k=0}^\infty a_k(x)T_k(t)\). Since \(\int _{-1}^{1}T_k(t)\;\mathrm{d}\tilde{\nu }(t) = \int _{-1}^{1}T_k(t)\;\mathrm{d}\tilde{\nu }_m(t)\) for \(k\le 2m-1\),

$$g(x) - r_m(x) = g'(1)\sum_{k=2m}^{\infty} a_k(x)\left(\int_{-1}^{1}T_k(t)\,\mathrm{d}\tilde{\nu}(t) - \int_{-1}^{1}T_k(t)\,\mathrm{d}\tilde{\nu}_m(t)\right).$$
For \(k\ge 2\), we have that \(a_k(x) = 2(\sqrt{x}-1/\sqrt{x})\left( \frac{\sqrt{x}-1}{\sqrt{x}+1}\right) ^k\) [see (41)]. So using the fact that \(\tilde{\nu }\) and \(\tilde{\nu }_m\) are probability measures (when \(m\ge 1\)), together with the fact that \(|T_k(t)|\le 1\) for \(t\in [-1,1]\), the triangle inequality gives

$$|g(x)-r_m(x)| \le 2g'(1)\sum_{k=2m}^{\infty}|a_k(x)| = 4g'(1)\left|\sqrt{x}-\frac{1}{\sqrt{x}}\right|\frac{\left|\frac{\sqrt{x}-1}{\sqrt{x}+1}\right|^{2m}}{1-\left|\frac{\sqrt{x}-1}{\sqrt{x}+1}\right|}.$$
If the measure \(\nu \) is invariant under the map \(t\mapsto 1-t\), then \(\tilde{\nu}\) is invariant under \(t\mapsto -t\) and the same is true of \(\tilde{\nu}_m\) (see, e.g., [30]). Since \(\tilde{f}_t(x^{-1}) = -\tilde{f}_{-t}(x)\) it follows that \(r_m(x^{-1}) = -r_m(x)\). Furthermore,

$$\int_{-1}^{1}T_k(t)\,\mathrm{d}\tilde{\nu}(t) = \int_{-1}^{1}T_k(t)\,\mathrm{d}\tilde{\nu}_m(t) = 0 \quad\text{for all odd } k,$$
because Chebyshev polynomials of odd degree are odd functions. In this case only the even Chebyshev coefficients contribute to the error bound, so

$$|g(x)-r_m(x)| \le 2g'(1)\sum_{\substack{k\ge 2m\\ k\text{ even}}}|a_k(x)| = 4g'(1)\left|\sqrt{x}-\frac{1}{\sqrt{x}}\right|\frac{\left(\frac{\sqrt{x}-1}{\sqrt{x}+1}\right)^{2m}}{1-\left(\frac{\sqrt{x}-1}{\sqrt{x}+1}\right)^{2}}.$$
To establish inequalities between \(r_m(x)\) and *g*(*x*), we use an alternative formula for the error obtained by approximating an integral via Gaussian quadrature. Since \(t\mapsto \tilde{f}_t(x)\) has derivatives of all orders, one can show (see, e.g., [35, Theorem 3.6.24]) that there exists \(\tau \in [-1,1]\) and \(\kappa \ge 0\) such that

$$g(x) - r_m(x) = \kappa\,\frac{\partial^{2m}}{\partial t^{2m}}\tilde{f}_t(x)\bigg|_{t=\tau} = \kappa\,\frac{2\,(2m)!}{\left(\frac{x+1}{x-1}-\tau\right)^{2m+1}}.$$
If \(x\in (0,1)\) then \(\frac{x+1}{x-1}-\tau <0\) for all \(\tau \in [-1,1]\) and so \(g(x)-r_m(x) < 0\). If \(x\in (1,\infty )\) then \(\frac{x+1}{x-1}-\tau > 0\) for all \(\tau \in [-1,1]\) and so \(g(x) - r_m(x) > 0\). If \(x=1\) then \(g(x) = r_m(x)\). \(\square \)
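A numerical illustration (ours, not from the paper) of this one-sided behavior for \(g=\log\): the quadrature-based \(r_m\) over-estimates \(\log\) on \((0,1]\) and under-estimates it on \([1,\infty)\). We also test a bound of the shape \(4|\sqrt x - 1/\sqrt x|\,\rho^{2m}/(1-\rho^2)\) with \(\rho = |\sqrt x - 1|/(\sqrt x+1)\), which is what the Chebyshev argument suggests (our reading; the displayed inequality is not reproduced in this version of the text):

```python
import numpy as np

# For g = log (so g'(1) = 1 and nu the Lebesgue measure, symmetric under t -> 1-t).
m = 3
s, w = np.polynomial.legendre.leggauss(m)
t, w = (s + 1) / 2, w / 2

def r(x):
    return np.sum(w * (x - 1) / (1 + t * (x - 1)))

xs = [0.1, 0.5, 0.9, 1.0, 1.5, 4.0, 20.0]

# r_m >= log on (0, 1], r_m <= log on [1, inf).
signs_ok = all((r(x) >= np.log(x) - 1e-12) if x <= 1 else (r(x) <= np.log(x) + 1e-12)
               for x in xs)

def bound(x):
    rho = abs(np.sqrt(x) - 1) / (np.sqrt(x) + 1)
    return 4 * abs(np.sqrt(x) - 1 / np.sqrt(x)) * rho ** (2 * m) / (1 - rho ** 2)

bound_ok = all(abs(r(x) - np.log(x)) <= bound(x) + 1e-12 for x in xs)
```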

Very similar bounds hold for the error between \(g:\mathbb {R}_{++}\rightarrow \mathbb {R}_{++}\), a *positive* operator monotone function, and \(r_{m}^+\), the rational approximation obtained by applying Gaussian quadrature to the integral representation in (25). Indeed, if \(m\ge 1\) and \(x> 0\),

$$|g(x)-r_m^+(x)| \le 4\big(g(1)-g(0)\big)\sqrt{x}\;\frac{\left|\frac{\sqrt{x}-1}{\sqrt{x}+1}\right|^{2m}}{1-\left|\frac{\sqrt{x}-1}{\sqrt{x}+1}\right|}. \qquad (44)$$
We omit the proof, since it follows the same basic argument as the proof of (42), together with the observation that \(f_t^+(x) = \frac{x}{x-1}f_t(x)\).
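The scalar identities underlying this observation, \(f_t^+(x) = \frac{x}{x-1}f_t(x)\) (equivalently \(f_t^+(x) = \frac{x}{1+t(x-1)}\)) and \(f_t(x) = \frac{1}{1-t}\big[f_t^+(x)-1\big]\) as used again in "Appendix E", are easy to spot-check (a sketch of ours):

```python
import numpy as np

# Spot-check, on a grid avoiding x = 1 exactly, the identities
# f_t^+(x) = x/(1 + t(x-1)) = x/(x-1) * f_t(x)  and
# f_t(x)   = (f_t^+(x) - 1)/(1 - t).
xs = np.linspace(0.1, 10.0, 50)    # this grid does not contain x = 1
ts = np.linspace(0.0, 0.95, 20)
X, T = np.meshgrid(xs, ts)

F = (X - 1) / (1 + T * (X - 1))    # f_t(x)
Fp = X / (1 + T * (X - 1))         # f_t^+(x)

dev1 = np.max(np.abs(Fp - X / (X - 1) * F))
dev2 = np.max(np.abs(F - (Fp - 1) / (1 - T)))
```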

#### The Special Case of Log: Proofs of Proposition 1 and Theorem 1

### Proof (of Proposition 1)

The function \(g(x) = \log (x)\) has an integral representation (38) where the measure \(\nu \) is the uniform measure on [0, 1], which is invariant under the map \(t\mapsto 1-t\). Proposition 6 tells us that, for any \(x>0\),

$$|\log(x) - r_m(x)| \le 4\left|\sqrt{x}-\frac{1}{\sqrt{x}}\right|\frac{\left(\frac{\sqrt{x}-1}{\sqrt{x}+1}\right)^{2m}}{1-\left(\frac{\sqrt{x}-1}{\sqrt{x}+1}\right)^{2}}.$$
The error between \(\log (x) = 2^k\log (x^{1/2^k})\) and \(r_{m,k}(x) = 2^kr_m(x^{1/2^k})\) can be obtained by evaluating at \(x^{1/2^k}\) and scaling by \(2^k\) to obtain

$$|\log(x) - r_{m,k}(x)| \le 2^{k+2}\left|\sqrt{\kappa}-\frac{1}{\sqrt{\kappa}}\right|\frac{\left(\frac{\sqrt{\kappa}-1}{\sqrt{\kappa}+1}\right)^{2m}}{1-\left(\frac{\sqrt{\kappa}-1}{\sqrt{\kappa}+1}\right)^{2}},$$
where \(\kappa = x^{1/2^k}\). By using the fact that

$$\sqrt{\kappa}-\frac{1}{\sqrt{\kappa}} = 2\sinh\left(\frac{\log \kappa}{2}\right) \quad\text{and}\quad \frac{\sqrt{\kappa}-1}{\sqrt{\kappa}+1} = \tanh\left(\frac{\log \kappa}{4}\right),$$
we can write this as a bound on relative error as

$$\left|\frac{\log(x) - r_{m,k}(x)}{\log(x)}\right| \le \frac{\sinh^2\left(\frac{\log \kappa}{2}\right)\left|\tanh\left(\frac{\log \kappa}{4}\right)\right|^{2m-1}}{\left|\frac{\log \kappa}{4}\right|}. \qquad (46)$$
*Asymptotic behavior of* (46): Since \(\kappa = x^{1/2^k} = e^{2^{-k}\log (x)}\), we can rewrite the right-hand side of (46) as

$$\frac{\sinh^2\left(2^{-k-1}\log (x)\right)\left|\tanh\left(2^{-k-2}\log (x)\right)\right|^{2m-1}}{2^{-k-2}\left|\log (x)\right|}.$$
Since \(\sinh ^2(2x)\tanh ^{2m-1}(x) = 4x^{2m+1} + O(x^{2m+3})\), we have that

$$\left|\frac{\log(x) - r_{m,k}(x)}{\log(x)}\right| \le 4\left(\frac{\log (x)}{2^{k+2}}\right)^{2m}\left(1+O(4^{-k})\right)$$
as \(k\rightarrow \infty \). \(\square \)
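The effect of combining quadrature with repeated square roots can be seen numerically (a sketch of ours): with \(r_{m,k}(x)=2^k r_m(x^{1/2^k})\), even small *m* gives high accuracy over a wide range of arguments:

```python
import numpy as np

# r_{m,k}(x) = 2^k * r_m(x^(1/2^k)): the k square roots move the argument close
# to 1, where the quadrature-based r_m is most accurate.
m, k = 3, 4
s, w = np.polynomial.legendre.leggauss(m)
t, w = (s + 1) / 2, w / 2

def r(x):
    return np.sum(w * (x - 1) / (1 + t * (x - 1)))

def r_mk(x):
    return 2 ** k * r(x ** (1.0 / 2 ** k))

rel_errs = [abs(r_mk(x) - np.log(x)) / abs(np.log(x))
            for x in (0.01, 0.5, 2.0, 1000.0)]
```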

### Proof (of Theorem 1)

The function *r* is chosen to be of the form \(r_{m,k}\) for certain *m* and *k*. In particular we can choose \(k = k_1+k_2\), with \(k_1 = \lceil \log _2\log _e(a)\rceil +1\), \(k_2\) being the smallest even integer larger than \(\sqrt{\log _2(32\log _e(a)/\epsilon )}\), and with \(m=k_2/2\). The function \(r_{m,k}\) has a semidefinite representation of size \(m+k\) (as a special case of Theorem 3 in Sect. 3), which is \(O(\sqrt{\log _e(1/\epsilon )})\) for fixed *a*. It remains to establish the error bound. To do so, we first note that \(x^{1/2^{k_1}} < 1\) for all \(x\in [1/a,a]\). Then, for all \(x\in [1/a,a]\),

Here, the second-to-last inequality holds because \(\sinh(1/2)^2 \le 1\), \(\tanh (x)\le x\) for all \(x\ge 0\), and \(2^{k_1-1} \le \log _e(a)\) (by our choice of \(k_1\)). The last inequality holds by our choice of *m* and \(k_2\). \(\square \)

#### Proof of Theorem 7

### Proof (of Theorem 7)

The function *r* is of the form \(r_{m,k}\) defined by (29) for particular values of the parameters *m* and *k*. Throughout the proof, for convenience of notation, let \((x_k,y_k) = \varPhi ^{(k)}(x,y)\) for \(k\ge 0\). The error bound (44) of “Appendix B.2” shows that for any \(x,y > 0\):

We will show that if \(\varPhi \) has the linear contraction property (31) then the bound (47) decays like \(O(c^{-k^2})\) for the choice of \(m \approx k\) (that we make precise later). To establish this, we need to bound two terms: first, if (31) holds then \(\log (x_k / y_k) = O(c^{-k/2})\) and so the numerator in (47) decays like \(O(c^{-km})\), as we want. The second term that we need to control is \(\sqrt{x_k y_k}\), and one can show that this term grows at most geometrically in *k*. This is proved in the following lemma:

### Lemma 1

There is a constant \(b > 0\) such that for any \(x,y > 0\) satisfying \(a^{-1} \le x/y \le a\) we have

$$\sqrt{x_k y_k} \;\le\; \frac{\sqrt{a}+1/\sqrt{a}}{2}\, b^k \sqrt{xy} \quad\text{for all } k\ge 0, \qquad (48)$$
where \((x_k,y_k) = \varPhi ^{(k)}(x,y)\).

### Proof

Since \(h_1\) and \(h_2\) are concave, they are each bounded above by their linear approximation at \(x=1\). As such, \(P_{h_i}(x,y) \le h_i'(1)(x-y) + h_i(1)y\) for all \(x,y\in \mathbb {R}_{++}\) and \(i=1,2\). Summing these two inequalities we see that

$$P_{h_1}(x,y) + P_{h_2}(x,y) \le \big(h_1'(1)+h_2'(1)\big)x + \big(h_1(1)+h_2(1)-h_1'(1)-h_2'(1)\big)y.$$
Because \(h_1\) and \(h_2\) take positive values, \(h_1(1) \ge h_1(1)-h_1'(1)\ge 0\) and \(h_2(1) \ge h_2(1) - h_2'(1)\ge 0\). As such, if \(b = \max \{h_1'(1)+h_2'(1),h_1(1)+h_2(1) - (h_1'(1)+h_2'(1))\}\), then \(P_{h_1}(x,y)+P_{h_2}(x,y) \le b(x+y)\) for all \(x,y\in \mathbb {R}_{++}\). It then follows that \(x_k + y_k \le b^k(x+y)\) for all \(x,y\in \mathbb {R}_{++}\) and so that

$$\sqrt{x_k y_k} \;\le\; \frac{x_k + y_k}{2} \;\le\; \frac{b^k}{2}(x+y) \;\le\; \frac{\sqrt{a}+1/\sqrt{a}}{2}\, b^k \sqrt{xy},$$
as desired. \(\square \)

Plugging (48) in (47) gives us, for any \(a^{-1} \le x/y \le a\):

Choose *k* to be the smallest even integer satisfying

and *m* to be the smallest integer satisfying \(m \ge k\max \{1,\frac{\log (b)}{\log (16)}\}\). Note that both *m* and *k* are \(O(\sqrt{\log _c(1/\epsilon )})\) when we treat *a* and *b* as constants. With these choices, and assumption (31), we have that \(b^k16^{-m} \le 1\) and

Using the inequality \(|\tanh (z)| \le |z|\) for all *z*, and setting \(y=1\) in the error bound (49), we have that

The size of the semidefinite representation of \(r = r_{m,k}\) is \(O(m+k)\), if we view the size of the semidefinite representations of \(h_1\) and \(h_2\) as being constant. Since \(m,k\in O(\sqrt{\log _c(1/\epsilon )})\) it follows that the size of the semidefinite representation of *r* is also \(O(\sqrt{\log _c(1/\epsilon )})\).

In the case where assumption (32) also holds, we choose *m* (respectively, *k*) to be the smallest integer (respectively, even integer) satisfying

and \(m\ge \max \left\{ 1,\frac{k\log (b)}{\log (16/c_0)}\right\} \). Note that both *m* and *k* are \(O(\log _{2}\log _{c_0}(1/\epsilon ))\). With these choices, and assumptions (31) and (32), we have that \(c_0^mb^k16^{-m} \le 1\) and

Using the inequality \(|\tanh (z)|\le |z|\) for all *z*, and putting \(y=1\) in the error bound (49), we obtain

\(\square \)

### Semidefinite Description of \(f_t\)

In this section we establish the linear matrix inequality characterization of \(f_t\) given in Proposition 2. We use the fact that if \(t\in (0,1]\) then

$$f_t(X) = \left[(X-I)^{-1} + tI\right]^{-1} \qquad (50)$$

(interpreted by continuity when \(X-I\) is singular).
The characterization will follow from the following easy observation.

### Proposition 7

If \(A+B \succ 0\) then

$$A(A+B)^{-1}B \succeq T \quad\Longleftrightarrow\quad \begin{bmatrix} A-T & A\\ A & A+B \end{bmatrix} \succeq 0. \qquad (51)$$
### Proof

The proof follows by expressing the left-hand side of (51) using Schur complements:

$$A(A+B)^{-1}B \succeq T \;\Longleftrightarrow\; A-T \succeq A(A+B)^{-1}A \;\Longleftrightarrow\; \begin{bmatrix} A-T & A\\ A & A+B \end{bmatrix} \succeq 0,$$

using \(A(A+B)^{-1}B = A - A(A+B)^{-1}A\) and the Schur complement condition with respect to the block \(A+B\succ 0\).
\(\square \)

### Proof (of Proposition 2)

We need to show that

$$f_t(X) \succeq T \quad\Longleftrightarrow\quad \begin{bmatrix} X-I-T & \sqrt{t}\,(X-I)\\ \sqrt{t}\,(X-I) & I + t(X-I) \end{bmatrix} \succeq 0. \qquad (52)$$
The case \(t=0\) can be easily verified to hold. We thus assume \(0 < t \le 1\). Given the expression of \(f_t\) in Eq. (50) we simply apply (51) with \(B=(I/t)\) and \(A= X-I\). This shows that

$$f_t(X) \succeq T \quad\Longleftrightarrow\quad \begin{bmatrix} X-I-T & X-I\\ X-I & (X-I) + \frac{1}{t}I \end{bmatrix} \succeq 0.$$
Applying a congruence transformation with the diagonal matrix \({{\mathrm{diag}}}(I,\sqrt{t}I)\) yields the desired linear matrix representation (52). \(\square \)
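The Schur-complement fact behind this argument, as we read it (a reconstruction, since the displayed equations are not reproduced in this version): for \(A+B\succ 0\), \(T \preceq A(A+B)^{-1}B\) if and only if the block matrix with blocks \(A-T\), \(A\), \(A\), \(A+B\) is positive semidefinite. A NumPy spot-check of our reading:

```python
import numpy as np

# Spot-check (our reconstruction of (51)): for A + B positive definite,
# T <= A(A+B)^{-1}B   iff   [[A - T, A], [A, A + B]] >= 0.
# At the boundary T = A(A+B)^{-1}B the block matrix is PSD (and singular);
# pushing T up by 0.1*I must break positive semidefiniteness.
rng = np.random.default_rng(1)
n = 4
G1 = rng.standard_normal((n, n))
G2 = rng.standard_normal((n, n))
A = G1 @ G1.T + 0.1 * np.eye(n)
B = G2 @ G2.T + 0.1 * np.eye(n)

T = A @ np.linalg.solve(A + B, B)          # boundary point (parallel-sum type)
M = np.block([[A - T, A], [A, A + B]])
min_eig = np.linalg.eigvalsh((M + M.T) / 2).min()

T2 = T + 0.1 * np.eye(n)                   # infeasible T
M2 = np.block([[A - T2, A], [A, A + B]])
min_eig2 = np.linalg.eigvalsh((M2 + M2.T) / 2).min()
```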

We can also directly get, from Proposition 7, a semidefinite representation of the noncommutative perspective of \(f_t\) defined by \(P_{f_t}(X,Y) = Y^{1/2} f_t\left( Y^{-1/2} X Y^{-1/2}\right) Y^{1/2}\).

### Proposition 8

If \(t\in [0,1]\) then the perspective \(P_{f_t}\) of \(f_t\) is jointly matrix concave, since for \(X, Y \succ 0\),

$$P_{f_t}(X,Y) \succeq T \quad\Longleftrightarrow\quad \begin{bmatrix} X-Y-T & \sqrt{t}\,(X-Y)\\ \sqrt{t}\,(X-Y) & Y + t(X-Y) \end{bmatrix} \succeq 0. \qquad (53)$$
### Proof

From the definition of \(P_{f_t}\) and the expression (50) for \(f_t\) it is easy to see that we have:

$$P_{f_t}(X,Y) = \left[(X-Y)^{-1} + tY^{-1}\right]^{-1}.$$
The semidefinite representation (53) then follows easily by applying (51) with \(B=Y/t\) and \(A=X-Y\), followed by applying a congruence transformation with the diagonal matrix \({{\mathrm{diag}}}(I,\sqrt{t}I)\). \(\square \)
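The closed form this proof suggests, \(P_{f_t}(X,Y) = \big[(X-Y)^{-1}+tY^{-1}\big]^{-1}\) (our reconstruction of the displayed formula), can be checked directly against the definition of the noncommutative perspective (NumPy sketch, ours):

```python
import numpy as np

# Compare P_{f_t}(X, Y) = Y^(1/2) f_t(Y^(-1/2) X Y^(-1/2)) Y^(1/2) with the
# closed form ((X - Y)^{-1} + t*Y^{-1})^{-1}, for X - Y, Y positive definite.
def mat_fun(M, f):
    """Apply the scalar function f to a symmetric matrix via eigendecomposition."""
    d, Q = np.linalg.eigh(M)
    return Q @ np.diag(f(d)) @ Q.T

rng = np.random.default_rng(2)
n, t = 4, 0.7
G1 = rng.standard_normal((n, n))
G2 = rng.standard_normal((n, n))
Y = G1 @ G1.T + 0.1 * np.eye(n)
X = Y + G2 @ G2.T + 0.1 * np.eye(n)      # ensures X - Y is positive definite

Yh = mat_fun(Y, np.sqrt)                  # Y^(1/2)
Yhi = np.linalg.inv(Yh)
f_t = lambda x: (x - 1) / (1 + t * (x - 1))
P_def = Yh @ mat_fun(Yhi @ X @ Yhi, f_t) @ Yh
P_closed = np.linalg.inv(np.linalg.inv(X - Y) + t * np.linalg.inv(Y))

dev = np.max(np.abs(P_def - P_closed))
```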

### Integral Representations of Operator Monotone Functions

In this section we show how to obtain the integral representations (24) and (25) as a fairly easy reworking of the following result.

### Theorem 8

([24, Theorem 4.4]) If \(h:(-1,1)\rightarrow \mathbb {R}\) is non-constant and operator monotone then there is a unique probability measure \(\tilde{\nu }\) supported on \([-1,1]\) such that

$$h(z) = h(0) + h'(0)\int_{-1}^{1}\frac{z}{1-\alpha z}\,\mathrm{d}\tilde{\nu}(\alpha). \qquad (54)$$
Suppose \(g:\mathbb {R}_{++}\rightarrow \mathbb {R}\) is operator monotone. Then it is straightforward to check that \(h:(-1,1)\rightarrow \mathbb {R}\) defined by \(h(z) = g\left( \frac{1+z}{1-z}\right) \) is operator monotone and that \(g(x) = h\left( \frac{x-1}{x+1}\right) \). By applying Theorem 8 to *h*(*z*) and then evaluating at \(z = \frac{x-1}{x+1}\), we obtain the integral representation

$$g(x) = g(1) + 2g'(1)\int_{-1}^{1}\frac{x-1}{(x+1)-\alpha(x-1)}\,\mathrm{d}\tilde{\nu}(\alpha).$$
Using the fact that \(h(0) = g(1)\) and \(h'(0) = 2g'(1)\), and applying a linear change of coordinates to rewrite the integral over [0, 1], we see that there is a probability measure \(\nu \) on [0, 1] such that

$$g(x) = g(1) + g'(1)\int_{0}^{1} f_t(x)\,\mathrm{d}\nu(t). \qquad (55)$$
This establishes (24). If, in addition, *g* takes positive values, then \(g(0) := \lim _{x\rightarrow 0}g(x) \ge 0\). Hence,

$$g(1) - g(0) = g'(1)\int_{0}^{1}\frac{1}{1-t}\,\mathrm{d}\nu(t) < \infty,$$
so we can define a probability measure supported on [0, 1] by \(\mathrm{d}\mu (t) = \frac{g'(1)}{g(1)-g(0)}\left( \frac{1}{1-t}\right) \mathrm{d}\nu (t)\). Then using the fact that \(f_t(x) = \frac{1}{1-t}\left[ f^+_t(x)-1\right] \) we immediately obtain, from (55), the representation

$$g(x) = g(0) + \big(g(1)-g(0)\big)\int_{0}^{1} f^+_t(x)\,\mathrm{d}\mu(t).$$
This establishes (25).

## About this article

### Cite this article

Fawzi, H., Saunderson, J. & Parrilo, P.A.: Semidefinite Approximations of the Matrix Logarithm. *Found. Comput. Math.* **19**, 259–296 (2019). https://doi.org/10.1007/s10208-018-9385-0


### Keywords

- Convex optimization
- Matrix concavity
- Quantum relative entropy

### Mathematics Subject Classification

- 90C22
- 52A41
- 47A63