1 Preliminaries

Tsallis [10] generalised in 1988 the standard Boltzmann–Gibbs entropy to a non-extensive quantity \( S_p(\rho ) \) depending on a parameter p. In the quantum version, it is given by

$$\begin{aligned} S_p(\rho )=\frac{1-\textrm{Tr}~\rho ^p}{p-1}\quad p\ne 1, \end{aligned}$$

where \( \rho \) is a density matrix. It has the property that \( S_p(\rho )\rightarrow S(\rho ) \) for \( p\rightarrow 1, \) where \( S(\rho )=-\textrm{Tr}~\rho \log \rho \) is the von Neumann entropy.

1.1 The deformed logarithm and exponential

The Tsallis entropy may be written in a similar form:

$$\begin{aligned} S_p(\rho )=-\textrm{Tr}~\rho \log _p \rho , \end{aligned}$$

where the deformed logarithm \( \log _p \) defined for positive x is given by

$$\begin{aligned} \log _p x=\int _1^x t^{p-2}\,{\textrm{d}}t = \left\{ \begin{array}{ll} \frac{x^{p-1}-1}{p-1}&{}\quad p\ne 1\\ \log x &{}\quad p=1. \end{array}\right. \end{aligned}$$

The deformed logarithm is also called the p-logarithm. The range of the p-logarithm is given by the intervals

$$\begin{aligned} \begin{array}{ll} \bigl (-(p-1)^{-1}, \infty \bigr )&{}\quad \text {for}\ p>1\\ \bigl (-\infty , -(p-1)^{-1}\bigr )&{}\quad \text {for}\ p<1\\ \bigl (-\infty ,\infty \bigr )&{}\quad \text {for}\ p=1. \end{array} \end{aligned}$$

The inverse function \( \exp _p \) (called the p-exponential) is always positive and given by

$$\begin{aligned} \exp _p(x)= \left\{ \begin{array}{ll} (x(p-1)+1)^{1/(p-1)}&{}\quad \hbox {for}\ p>1 \ \hbox {and} \ x>-(p-1)^{-1} \\ (x(p-1)+1)^{1/(p-1)}&{}\quad \hbox {for}\ p<1 \ \hbox {and} \ x<-(p-1)^{-1} \\ \exp x &{}\quad \hbox {for}\ p=1 \hbox { and } x\in {\textbf{R}}. \end{array}\right. \end{aligned}$$

The p-logarithm and the p-exponential functions converge, respectively, to the logarithmic and the exponential functions for \( p\rightarrow 1. \) We note that

$$\begin{aligned} \frac{{\textrm{d}}}{{\textrm{d}}x}\log _p(x)=x^{p-2}\quad \text {and}\quad \frac{{\textrm{d}}}{{\textrm{d}}x}\exp _p(x)=\exp _p(x)^{2-p}. \end{aligned}$$
(1.1)
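The scalar definitions above can be tried out numerically. The following is a minimal sketch, assuming plain floating-point arithmetic; the function names `log_p` and `exp_p` are chosen here for illustration and are not taken from any library.

```python
import math

def log_p(x, p):
    """Deformed logarithm of x > 0; reduces to math.log at p = 1."""
    if p == 1:
        return math.log(x)
    return (x ** (p - 1) - 1) / (p - 1)

def exp_p(y, p):
    """Deformed exponential; y must lie in the range of log_p."""
    if p == 1:
        return math.exp(y)
    base = y * (p - 1) + 1
    if base <= 0:
        raise ValueError("y is outside the domain of exp_p")
    return base ** (1 / (p - 1))

# Round trip exp_p(log_p(x)) = x for a few sample points.
for p in (0.5, 1.0, 1.5, 3.0):
    for x in (0.2, 1.0, 7.0):
        assert math.isclose(exp_p(log_p(x, p), p), x, rel_tol=1e-9)

# The p-logarithm approaches the natural logarithm as p tends to 1.
assert math.isclose(log_p(2.0, 1 + 1e-8), math.log(2.0), rel_tol=1e-6)
```

The domain check in `exp_p` mirrors the case distinction in the definition: the base \( x(p-1)+1 \) must be positive.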

We will also need the following lemma.

Lemma 1.1

Take arbitrary \( p,q\in {\textbf{R}}. \) For every \( x> 0, \) we have

$$\begin{aligned} \log _p x^q=q \log _\alpha x, \end{aligned}$$

where \( \alpha =1+q(p-1). \) Furthermore,  take arbitrary \( q\ne 0 \) and set \( \beta =1+(p-1)/q. \) For any \( x\in {\textbf{R}} \) in the domain of \( \exp _p, \) we obtain that qx is in the domain of \( \exp _\beta \) and that

$$\begin{aligned} (\exp _p x)^q=\exp _\beta (q x). \end{aligned}$$

Proof

We substitute \( u=t^{1/q} \) (thus \( t=u^q \)) in

$$\begin{aligned} \log _p x^q=\int _1^{x^q} t^{p-2}\,{\textrm{d}}t \end{aligned}$$

and note that \( {\textrm{d}}u=q^{-1} t^{(1-q)/q}\, {\textrm{d}}t. \) Therefore, \( {\textrm{d}}t=q t^{(q-1)/q}\, {\textrm{d}}u \) and thus

$$\begin{aligned} \log _p x^q=\int _1^x u^{q(p-2)} q u^{q-1}\,{\textrm{d}}u=q\int _1^x u^{q(p-1)-1}\, {\textrm{d}}u. \end{aligned}$$

Since \( q(p-1)-1=q(p-1)+1-2=\alpha -2, \) the first statement follows. The definition of \( \beta \) implies \( q/(p-1)=1/(\beta -1). \) There are now four cases, firstly depending on \( p>1 \) and \( p<1, \) and subsequently on \( q>0 \) or \( q<0. \) In all four cases, it follows that qx is in the domain of \( \exp _\beta . \) We finally obtain from the first result in the lemma that

$$\begin{aligned} \log _\beta (\exp _p x)^q=q\log _p(\exp _p x) =qx, \end{aligned}$$

and therefore \( (\exp _p x)^q = \exp _\beta (qx). \) \(\square \)
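Both identities of Lemma 1.1 are easy to test on scalars. The sketch below is only a numerical sanity check under floating-point arithmetic; the helper `lemma_1_1_residuals` is a hypothetical name introduced here.

```python
import math

def log_p(x, p):
    """Deformed logarithm for x > 0."""
    return math.log(x) if p == 1 else (x ** (p - 1) - 1) / (p - 1)

def exp_p(y, p):
    """Deformed exponential for y in the range of log_p."""
    return math.exp(y) if p == 1 else (y * (p - 1) + 1) ** (1 / (p - 1))

def lemma_1_1_residuals(x, p, q):
    """Return |lhs - rhs| for the two identities of Lemma 1.1 (q must be non-zero)."""
    alpha = 1 + q * (p - 1)
    beta = 1 + (p - 1) / q
    first = abs(log_p(x ** q, p) - q * log_p(x, alpha))
    y = log_p(x, p)                 # y is then automatically in the domain of exp_p
    second = abs(exp_p(y, p) ** q - exp_p(q * y, beta))
    return first, second
```

For instance, `lemma_1_1_residuals(3.0, 1.5, -2.0)` compares \( \log _{1.5} 3^{-2} \) with \( -2\log _0 3 \); both equal \( -4/3 \) up to rounding.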

1.2 Convexity and min–max theorems

An important tool in our investigation is taken from convex analysis. These techniques are used in engineering, automatic control, signal processing, resource allocation, portfolio theory, and numerous other fields. We, in particular, use that partial minimisation of a convex function is convex [2, Section 3.2.5]. This technique was successfully applied by Carlen and Lieb in the investigation of trace functions [3]. We provide the proof for the convenience of the reader.

Lemma 1.2

Let \( f:X\times Y\rightarrow {\textbf{R}} \) be a function of two variables and set

$$\begin{aligned} g(y)=\inf _{x\in X} f(x,y)\quad \text {and}\quad h(y)=\sup _{x\in X} f(x,y) \end{aligned}$$

for \( y\in Y. \)

  1. (i)

    If \( f(x,y) \) is jointly convex, then g is convex.

  2. (ii)

    If \( f(x,y) \) is convex in the second variable, then h is convex.

  3. (iii)

    If \( f(x,y) \) is concave in the second variable, then g is concave.

  4. (iv)

    If \( f(x,y) \) is jointly concave, then h is concave.

Proof

Take \( \varepsilon >0, \) \( \lambda \in [0,1], \) and elements \( y_1,y_2\in Y. \) Since g is an infimum, we may pick \( x_1,x_2\in X \) such that

$$\begin{aligned} f(x_1,y_1)\le g(y_1)+\varepsilon \quad \text {and}\quad f(x_2,y_2)\le g(y_2)+\varepsilon . \end{aligned}$$

Then,

$$\begin{aligned}{} & {} g(\lambda y_1+(1-\lambda ) y_2)\le f(\lambda x_1+(1-\lambda ) x_2, \lambda y_1+(1-\lambda ) y_2)\\{} & {} \quad \le \lambda f(x_1,y_1)+(1-\lambda ) f(x_2,y_2)\le \lambda g(y_1)+(1-\lambda ) g(y_2) +\varepsilon , \end{aligned}$$

and since \( \varepsilon \) is arbitrary, g is convex. Given \( \lambda y_1+(1-\lambda )y_2\in Y \) and \( \varepsilon >0, \) pick an element \( x\in X \) such that

$$\begin{aligned} h(\lambda y_1+(1-\lambda ) y_2) \le f(x,\lambda y_1+(1-\lambda ) y_2)+\varepsilon . \end{aligned}$$

Then,

$$\begin{aligned}{} & {} h(\lambda y_1+(1-\lambda ) y_2)\le \lambda f(x,y_1)+(1-\lambda ) f(x,y_2)+\varepsilon \\{} & {} \quad \le \lambda h(y_1)+(1-\lambda )h(y_2)+\varepsilon , \end{aligned}$$

and since \( \varepsilon \) is arbitrary, h is convex. The cases (iii) and (iv) follow by considering \( -f(x,y). \) \(\square \)
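Statement (i) can be illustrated on a toy example. The sketch below assumes the jointly convex quadratic \( f(x,y)=x^2+xy+y^2 \) (our choice, not taken from the text), for which partial minimisation over x gives \( g(y)=\tfrac{3}{4}y^2. \) The infimum is approximated by a ternary search, which is adequate because f is convex in x.

```python
def f(x, y):
    """A jointly convex quadratic; its Hessian [[2, 1], [1, 2]] is positive definite."""
    return x * x + x * y + y * y

def g(y, lo=-10.0, hi=10.0, iters=200):
    """Approximate inf over x in [lo, hi] of f(x, y) by ternary search."""
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3
        m2 = hi - (hi - lo) / 3
        if f(m1, y) < f(m2, y):
            hi = m2
        else:
            lo = m1
    return f((lo + hi) / 2, y)

def midpoint_convex(func, points, tol=1e-9):
    """Check func((a+b)/2) <= (func(a)+func(b))/2 for all pairs of sample points."""
    return all(func((a + b) / 2) <= (func(a) + func(b)) / 2 + tol
               for a in points for b in points)
```

A midpoint check on finitely many points is of course only a necessary condition for convexity; it is meant as a sanity check, not as a proof.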

1.3 The Young tracial inequalities

The following inequalities are known as the tracial Young inequalities. We prefer to prove them as below.

Proposition 1.3

Let A and B be positive definite matrices. Then

$$\begin{aligned} \textrm{Tr}~A^p B^{1-p} \le p\textrm{Tr}~A+(1-p)\textrm{Tr}~B\quad 0\le p\le 1 \end{aligned}$$

and

$$\begin{aligned} \textrm{Tr}~A^p B^{1-p} \ge p\textrm{Tr}~A+(1-p)\textrm{Tr}~B\quad p\le 0, \ p\ge 1. \end{aligned}$$

Proof

Let first \( 0\le p\le 1. \) We may write

$$\begin{aligned} \textrm{Tr}~A^pB^{1-p}= & {} \textrm{Tr}~L_A^p R_B^{1-p} I=S_f^I(A,B)\\\le & {} \textrm{Tr}~\bigl (pL_A+(1-p)R_B\bigr )I=p\textrm{Tr}~A+(1-p)\textrm{Tr}~B, \end{aligned}$$

where \( f(t)=t^p \) for \( t>0, \) and \( L_A \) and \( R_B \) are the left and right multiplication operators. The first equality above in terms of the quasi-entropy \( S_f^I(A,B) \) follows since \( L_A \) and \( R_B \) commute, and the first inequality in the proposition then follows from the geometric–arithmetic mean inequality. Since Jensen’s inequality reverses for the extensions of a chord (corresponding to the cases \( p\le 0 \) or \( p\ge 1), \) the second inequality of the proposition follows. \(\square \)
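The two tracial Young inequalities can be probed numerically on random matrices. This is a minimal sketch assuming NumPy and real symmetric positive definite matrices; `mpow`, `random_pd` and `young_gap` are illustrative helper names.

```python
import numpy as np

def mpow(M, r):
    """Real power of a symmetric positive definite matrix via its eigendecomposition."""
    w, V = np.linalg.eigh(M)
    return (V * w ** r) @ V.T

def random_pd(n, rng):
    """A random symmetric positive definite n x n matrix (illustrative choice)."""
    B = rng.standard_normal((n, n))
    return B @ B.T + 0.1 * np.eye(n)

def young_gap(A, B, p):
    """p Tr A + (1-p) Tr B - Tr A^p B^(1-p).

    Proposition 1.3 predicts a nonnegative value for 0 <= p <= 1
    and a nonpositive value for p <= 0 or p >= 1.
    """
    return (p * np.trace(A) + (1 - p) * np.trace(B)
            - np.trace(mpow(A, p) @ mpow(B, 1 - p)))
```

At the endpoints \( p=0 \) and \( p=1 \) the gap vanishes, which matches the fact that both inequalities hold there.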

2 Variational expressions

We take the following variational representations from our paper [9, Lemma 2.1] with a slightly simplified proof.

Proposition 2.1

For positive definite operators X and Y,  we have

$$\begin{aligned} \textrm{Tr}~Y=\left\{ \begin{array}{ll} \max _{X>0}\bigl \{\textrm{Tr}~X-\textrm{Tr}~X^{2-q}\left( \log _q X-\log _q Y\right) \bigr \}, &{}\quad q\le 2,\\ \min _{X>0}\bigl \{\textrm{Tr}~X-\textrm{Tr}~X^{2-q}\left( \log _q X-\log _q Y\right) \bigr \}, &{}\quad q>2. \end{array} \right. \end{aligned}$$

Proof

We learned in Proposition 1.3 that

$$\begin{aligned} \textrm{Tr}~X^{p}Y^{1-p}\le & {} p \textrm{Tr}~X+(1-p)\textrm{Tr}~Y,\quad 0\le p\le 1,\\ \textrm{Tr}~X^{p}Y^{1-p}\ge & {} p \textrm{Tr}~X+(1-p)\textrm{Tr}~Y,\quad p\le 0,\, p\ge 1. \end{aligned}$$

By combining the first inequality for \( 0\le p<1 \) with the case \( p>1 \) in the second, we obtain

$$\begin{aligned} \textrm{Tr}~Y \ge \textrm{Tr}~X-\frac{\textrm{Tr}~X-\textrm{Tr}~X^pY^{1-p}}{1-p},\quad p\ge 0,\ p\ne 1, \end{aligned}$$

while the case \( p\le 0 \) gives the inequality

$$\begin{aligned} \textrm{Tr}~Y \le \textrm{Tr}~X-\frac{\textrm{Tr}~X-\textrm{Tr}~X^pY^{1-p}}{1-p}, \quad p\le 0. \end{aligned}$$

For \(X=Y,\) the above inequalities become equalities. Setting \( q=2-p, \) the first range (\( p\ge 0, \) \( p\ne 1) \) is transformed to the range \(( q\le 2, \) \( q\ne 1), \) while the second range \( (p\le 0) \) is transformed to the range \( (q\ge 2). \) Since \( p=2-q \) and \( 1-p=q-1, \) we obtain

$$\begin{aligned} \textrm{Tr}~Y=\left\{ \begin{array}{ll} \max _{X>0}\Bigl \{\textrm{Tr}~X-\frac{\textrm{Tr}~X^{2-q}\left( X^{q-1}-Y^{q-1}\right) }{q-1}\Bigr \}, &{}\quad q\in (-\infty ,2],\ q\ne 1, \\ \min _{X>0}\Bigl \{\textrm{Tr}~X-\frac{\textrm{Tr}~X^{2-q}\left( X^{q-1}-Y^{q-1}\right) }{q-1}\Bigr \}, &{}\quad q\in [2,\infty ). \end{array} \right. \end{aligned}$$

By using the definition of the deformed logarithm, we note that

$$\begin{aligned} \frac{X^{q-1}-Y^{q-1}}{q-1}=\log _q(X)-\log _q(Y), \end{aligned}$$

and by inserting this in the expressions above, we obtain the desired statements of the proposition, except for \( q=1. \) We may finally let q tend to one in the first inequality and obtain the variational expression

$$\begin{aligned} \textrm{Tr}~Y=\max _{X>0}\Bigl \{\textrm{Tr}~X-{\textrm{Tr}~X\left( \log X-\log Y\right) }\Bigr \} \end{aligned}$$

by continuity. This completes the proof. \(\square \)

Note that the last statement in the above proof entails the inequality

$$\begin{aligned} S(X\mid Y)\ge \textrm{Tr}~(X-Y) \end{aligned}$$

for the relative quantum entropy \( S(X\mid Y). \)
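The variational expressions of Proposition 2.1 can be checked on random samples: the objective should never exceed \( \textrm{Tr}~Y \) for \( q\le 2, \) never fall below it for \( q>2, \) and equal it at \( X=Y. \) The sketch below assumes NumPy and real symmetric matrices; `mfun`, `log_q` and `objective` are hypothetical helper names.

```python
import numpy as np

def mfun(M, f):
    """Apply a scalar function f to a symmetric matrix through its spectrum."""
    w, V = np.linalg.eigh(M)
    return (V * f(w)) @ V.T

def log_q(M, q):
    """Deformed logarithm of a positive definite matrix."""
    if q == 1:
        return mfun(M, np.log)
    return mfun(M, lambda w: (w ** (q - 1) - 1) / (q - 1))

def objective(X, Y, q):
    """Tr X - Tr X^(2-q) (log_q X - log_q Y), the expression optimised in Proposition 2.1."""
    X_pow = mfun(X, lambda w: w ** (2 - q))
    return np.trace(X) - np.trace(X_pow @ (log_q(X, q) - log_q(Y, q)))
```

For \( q=1 \) the objective reduces to \( \textrm{Tr}~X-S(X\mid Y), \) so the same check also samples the inequality \( S(X\mid Y)\ge \textrm{Tr}~(X-Y). \)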

2.1 Further preliminaries

Lemma 2.2

Let H be an arbitrary matrix,  take \( L\ge 0, \) and choose exponents p and s such that \( s >0. \) We consider the trace function

$$\begin{aligned} \psi _{L,H}(A)=\textrm{Tr}~\bigl (L+H^*A^pH\bigr )^s \end{aligned}$$

defined in positive definite matrices. Then \( \psi _{L,H}(A) \) is convex (respectively concave) for arbitrary H and \( L\ge 0, \) if and only if it is convex (respectively concave) for arbitrary H and \( L=0. \)

Proof

By considering block matrices

$$\begin{aligned} {\hat{H}}=\begin{pmatrix} L^{1/2} &{} 0\\ H &{} 0 \end{pmatrix}\quad \text {and}\quad {\hat{A}}=\begin{pmatrix} I &{} 0\\ 0 &{} A \end{pmatrix}, \end{aligned}$$

we obtain

$$\begin{aligned} {\hat{H}}^*{\hat{A}}^p {\hat{H}}=\begin{pmatrix} L+H^*A^pH &{} 0\\ 0 &{} 0 \end{pmatrix}. \end{aligned}$$

Since \( s>0, \) we obtain in addition

$$\begin{aligned} \bigl ({\hat{H}}^*{\hat{A}}^p {\hat{H}}\bigr )^s=\begin{pmatrix} \bigl (L+H^*A^pH\bigr )^s &{} 0\\ 0 &{} 0 \end{pmatrix}, \end{aligned}$$

since it is meaningful to set \( 0^s=0. \) Indeed, for \( \varepsilon >0, \) we have

$$\begin{aligned} \varepsilon ^s=\exp \bigl (s\log \varepsilon \bigr ), \end{aligned}$$

and this quantity tends to zero as \( \varepsilon \rightarrow 0. \) Therefore, \( \psi _{L,H}(A)=\psi _{0,{\hat{H}}}({\hat{A}}) \) and the statement follows. \(\square \)

If \( s<0, \) there exist examples in two by two matrices such that \( \psi _{0,H}(A) \) is convex while \( \psi _{L,H}(A) \) is not.
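The block-matrix identity \( \psi _{L,H}(A)=\psi _{0,{\hat{H}}}({\hat{A}}) \) used in the proof of Lemma 2.2 can be verified numerically for \( s>0. \) The following sketch assumes NumPy and real matrices; the names `mpow`, `psi` and `lift` are ours.

```python
import numpy as np

def mpow(M, r):
    """Power of a symmetric positive semi-definite matrix (for r > 0 we use 0^r = 0)."""
    w, V = np.linalg.eigh((M + M.T) / 2)
    w = np.clip(w, 0.0, None)        # remove tiny negative rounding errors
    return (V * w ** r) @ V.T

def psi(L, H, A, p, s):
    """Tr (L + H^* A^p H)^s for real matrices and s > 0."""
    return np.trace(mpow(L + H.T @ mpow(A, p) @ H, s))

def lift(L, H, A):
    """Block matrices H_hat and A_hat from the proof of Lemma 2.2."""
    n = len(A)
    Z, I = np.zeros((n, n)), np.eye(n)
    H_hat = np.block([[mpow(L, 0.5), Z], [H, Z]])
    A_hat = np.block([[I, Z], [Z, A]])
    return H_hat, A_hat
```

With `H_hat, A_hat = lift(L, H, A)`, the values `psi(L, H, A, p, s)` and `psi(np.zeros((2 * n, 2 * n)), H_hat, A_hat, p, s)` should agree up to rounding; the clipping step is the numerical counterpart of the convention \( 0^s=0. \)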

3 The main trace function

Let H be an invertible contraction and A positive definite. Then,

$$\begin{aligned} H^*\log _p(A)H> \frac{-1}{p-1}H^*H\ge \frac{-1}{p-1}\quad \text {for}\ p>1 \end{aligned}$$

and

$$\begin{aligned} H^*\log _p(A)H< \frac{-1}{p-1}H^*H\le \frac{-1}{p-1}\quad \text {for}\ p<1. \end{aligned}$$

Therefore, \( H^*\log _p(A)H \) belongs to the domain of the p-exponential. This is true even if H is not invertible since \( \exp _p(0)=1. \) Therefore,

$$\begin{aligned} \exp _p\bigl (L+H^*\log _p(A)H\bigr ) \end{aligned}$$

is well defined and positive for arbitrary contractions H and \( p\ne 1, \) provided \( L\ge 0 \) when \( p>1, \) and \( L\le 0 \) when \( p<1. \) In both cases, we may define the trace function

$$\begin{aligned} \varphi ^L_{p,q}(A)=\textrm{Tr}~\left[ \exp _p\bigl (L+H^*\log _p(A)H\bigr )^q\right] \end{aligned}$$
(3.1)

for arbitrary exponents q. We furthermore obtain the expression

$$\begin{aligned} \varphi _{p,q}^L(A)= & {} \textrm{Tr}~\bigl [I+(p-1)L+(p-1)H^*\frac{A^{p-1}-I}{p-1}H\bigr ]^{q/(p-1)}\nonumber \\= & {} \textrm{Tr}~\bigl [I-H^*H+(p-1)L+H^*A^{p-1}H\bigr ]^{q/(p-1)}. \end{aligned}$$
(3.2)
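The passage from the definition (3.1) to the expression (3.2) can be confirmed numerically. The sketch below assumes NumPy, real matrices and \( p\ne 1; \) the function names are illustrative only.

```python
import numpy as np

def mfun(M, f):
    """Apply a scalar function f to a symmetric matrix through its spectrum."""
    w, V = np.linalg.eigh((M + M.T) / 2)
    return (V * f(w)) @ V.T

def phi_definition(A, H, L, p, q):
    """Tr exp_p(L + H^* log_p(A) H)^q as in (3.1)."""
    log_p_A = mfun(A, lambda w: (w ** (p - 1) - 1) / (p - 1))
    Y = L + H.T @ log_p_A @ H
    # exp_p(Y)^q has the eigenvalues (1 + (p-1) y)^(q/(p-1)).
    return np.trace(mfun(Y, lambda y: (1 + (p - 1) * y) ** (q / (p - 1))))

def phi_closed_form(A, H, L, p, q):
    """The last expression in (3.2)."""
    K = np.eye(len(A)) - H.T @ H + (p - 1) * L
    M = K + H.T @ mfun(A, lambda w: w ** (p - 1)) @ H
    return np.trace(mfun(M, lambda w: w ** (q / (p - 1))))
```

Both functions should return the same value whenever H is a contraction and the sign of L matches the sign of \( p-1. \)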

Note that \( (p-1)L\ge 0 \) in both cases. By using Lemma 2.2, we obtain the following:

Corollary 3.1

Suppose \( q/(p-1)>0. \) Then \( \varphi _{p,q}^L(A) \) is convex (respectively concave) if and only if the trace function \( A\rightarrow \textrm{Tr}~\bigl (H^*A^{p-1} H\bigr )^{q/(p-1)} \) is convex (respectively concave).

We shall finally explore yet another expression for the main trace function. Given the expression in (3.1) and setting

$$\begin{aligned} \beta =1+\frac{p-1}{q}, \end{aligned}$$
(3.3)

we obtain

$$\begin{aligned} \varphi _{p,q}^L(A)=\textrm{Tr}~\exp _\beta \bigl (qL+qH^*\log _p(A)H\big ), \end{aligned}$$
(3.4)

where we used Lemma 1.1. By replacing q with \( \beta \) in Proposition 2.1 and setting

$$\begin{aligned} F(X,A)=\textrm{Tr}~X-\textrm{Tr}~X^{2-\beta }\left( \log _\beta X-\log _\beta Y\right) , \end{aligned}$$
(3.5)

where \( Y=\exp _\beta \bigl (qL+qH^*\log _p(A)H\big ), \) we obtain that

$$\begin{aligned} \varphi ^L_{p,q}(A)=\left\{ \begin{array}{ll} \sup _{X>0} F(X,A) &{}\quad \beta \le 2,\\ \inf _{X>0} F(X,A) &{}\quad \beta >2. \end{array} \right. \end{aligned}$$
(3.6)

This is the main variational expression to be used. We next calculate

$$\begin{aligned} F(X,A)= & {} \textrm{Tr}~X-\textrm{Tr}~X^{2-\beta }\left( \log _\beta X-\log _\beta Y\right) \\= & {} \textrm{Tr}~X-\textrm{Tr}~X^{2-\beta }\left( \log _\beta X-qL-q H^*\log _p(A) H\right) \\= & {} \textrm{Tr}~X-\textrm{Tr}~X^{2-\beta }\left( \frac{X^{\beta -1}-I}{\beta -1}-qL-qH^*\frac{A^{p-1}-I}{p-1}H\right) \\= & {} \textrm{Tr}~X-\frac{1}{\beta -1}\textrm{Tr}~\bigl (X-X^{2-\beta }-X^{2-\beta }(p-1)L-X^{2-\beta }H^*(A^{p-1}-I)H\bigr )\\= & {} \left( 1-\frac{1}{\beta -1}\right) \textrm{Tr}~X+G(X,A), \end{aligned}$$

where we used \( q/(p-1)=1/(\beta -1) \) and set

$$\begin{aligned} G(X,A)=\frac{1}{\beta -1}\textrm{Tr}~\Bigl (X^{2-\beta }(I-H^*H+(p-1)L)+ X^{2-\beta }H^* A^{p-1}H\Bigr ). \end{aligned}$$
(3.7)

The first term in \( F(X,A) \) is linear, so we only have to consider convexity or concavity of \( G(X,A). \) Note as before that \( (p-1)L\ge 0. \)
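The splitting of \( F(X,A) \) into a linear term and \( G(X,A) \) is a purely algebraic identity and can be checked numerically. The sketch below assumes NumPy, real matrices, \( p\ne 1, \) \( q\ne 0 \) and \( \beta \ne 1; \) the helper name `F_and_split` is hypothetical.

```python
import numpy as np

def mfun(M, f):
    """Apply a scalar function f to a symmetric matrix through its spectrum."""
    w, V = np.linalg.eigh((M + M.T) / 2)
    return (V * f(w)) @ V.T

def F_and_split(X, A, H, L, p, q):
    """Return F(X, A) from (3.5) and (1 - 1/(beta-1)) Tr X + G(X, A) with G from (3.7)."""
    I = np.eye(len(A))
    beta = 1 + (p - 1) / q
    A_pow = mfun(A, lambda w: w ** (p - 1))
    X_pow = mfun(X, lambda w: w ** (2 - beta))
    log_beta_X = (mfun(X, lambda w: w ** (beta - 1)) - I) / (beta - 1)
    # log_beta(Y) for Y = exp_beta(qL + q H^* log_p(A) H)
    log_beta_Y = q * L + q * H.T @ ((A_pow - I) / (p - 1)) @ H
    F = np.trace(X) - np.trace(X_pow @ (log_beta_X - log_beta_Y))
    K = I - H.T @ H + (p - 1) * L
    G = np.trace(X_pow @ K + X_pow @ H.T @ A_pow @ H) / (beta - 1)
    return F, (1 - 1 / (beta - 1)) * np.trace(X) + G
```

The two returned numbers should coincide up to rounding, which in particular confirms that the term containing L carries the factor \( X^{2-\beta }. \)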

Lemma 3.2

Let H be a contraction and consider for arbitrary real q the trace function \( \varphi ^L_{p,q}(A) \) defined in (3.1). If \( q/(1-p)>0 \) and \( \varphi _{p,q}^L(A) \) is convex (respectively concave) for arbitrary contractions H,  then so is the trace function \( \varphi _{2-p,q}^{-L}(A). \)

Proof

We may without loss of generality assume that H is invertible. By using the calculation in (3.2), we obtain

$$\begin{aligned} t^{-q}\varphi ^L_{p,q}(tA)=\textrm{Tr}~\bigl (t^{1-p}(I-H^*H+(p-1)L)+H^*A^{p-1}H\bigr )^{q/(p-1)} \end{aligned}$$
(3.8)

for \( t>0. \) Thus, by letting \( t\rightarrow 0 \) for \( p<1 \) or letting \( t\rightarrow \infty \) for \( p>1, \) we obtain that the trace function

$$\begin{aligned} A\rightarrow \textrm{Tr}~\bigl (H^*A^{p-1}H\bigr )^{q/(p-1)} \end{aligned}$$

is convex (respectively concave). It is no longer necessary to assume that H is a contraction, and since \( H^* A^{p-1}H \) is invertible, we can raise it to any non-zero exponent. Therefore, by inversion we obtain that the trace function

$$\begin{aligned} A\rightarrow \textrm{Tr}~\bigl (H^*A^{1-p}H\bigr )^{q/(1-p)} \end{aligned}$$

is convex (respectively concave) for arbitrary H. In particular, if H is a contraction we obtain by a small calculation the identity

$$\begin{aligned} \varphi _{2-p,q}^{-L}(A)=\textrm{Tr}~\bigl (I-H^*H+(p-1)L +H^*A^{1-p}H\bigr )^{q/(1-p)}. \end{aligned}$$

Since \( q/(1-p) >0, \) we obtain from Lemma 2.2 that also \( \varphi _{2-p,q}^{-L}(A) \) is convex (respectively concave). \(\square \)

3.1 The strategy of the proof

We shall determine parameter values p and q such that \( F(X,A) \) is either jointly convex (respectively concave) or just convex (respectively concave) in the second variable. To do this, we use that the functions \( t\rightarrow t^p \) are operator concave if and only if \( 0\le p\le 1, \) and operator convex if and only if \( -1\le p\le 0 \) or \( 1\le p\le 2. \) It may be of interest to note that the same parameter conditions apply if we only require matrix convexity or matrix concavity of order two, cf. [6, Proposition 3.1]. We also make use of Lieb’s concavity theorem [8, Corollary 1.1] stating that the trace functions

$$\begin{aligned} (X,A)\rightarrow \textrm{Tr}~X^p H^* A^q H \end{aligned}$$
(3.9)

are concave if \( p,q\ge 0 \) and \( p+q\le 1. \) Ando’s theorem states [1, Corollary 6.3] that the trace function in (3.9) for an arbitrary matrix H,  is convex for either \( -1\le p,q\le 0, \) or for

$$\begin{aligned} \quad -1\le p\le 0\quad \text {and}\quad 1-p\le q\le 2, \end{aligned}$$

where obviously p and q may be interchanged in the condition. Since \( H=I \) is a possibility, we realise that concavity of \( \varphi _{p,q}(A) \) requires \( 0\le q\le 1, \) while convexity of \( \varphi _{p,q}(A) \) requires \( q\le 0 \) or \( q\ge 1. \) Since we intend to eventually use operator convexity/concavity of the function \( t\rightarrow t^p, \) we are restricted to the cases

$$\begin{aligned} -1\le p-1\le 2\quad \text {or equivalently}\quad 0\le p\le 3. \end{aligned}$$

Note that if \( \beta =1, \) then \( p=1. \)

Proposition 3.3

Let K be a positive definite \( n\times n \) matrix,  and let H be any \( n\times n \) matrix. We may define the operator map

$$\begin{aligned} \psi _p^s(A)=\bigl (K+H^* A^p H\bigr )^s \end{aligned}$$

in positive definite \( n\times n \) matrices for exponents p and s. If \( -1\le p\le 0 \) and \( -1\le s\le 0, \) then \( \psi _p^s(A) \) is concave.

Proof

We first consider the case \( -1\le p\le 0 \) and \( s=-1. \) Since

$$\begin{aligned} \psi _p^{-1}(A)=\bigl (K+H^* A^p H\bigr )^{-1}=K^{-1/2}\bigl (I+L^*A^pL\bigr )^{-1}K^{-1/2}, \end{aligned}$$

where \( L=HK^{-1/2}, \) we may assume \( K=I. \) We may also without loss of generality assume that H is invertible. We then obtain

$$\begin{aligned} \bigl (I+H^* A^p H\bigr )^{-1}=\frac{(H^*A^pH)^{-1}}{(H^*A^pH)^{-1}+I}= \frac{H^{-1}A^{-p}(H^{-1})^*}{H^{-1}A^{-p}(H^{-1})^*+I} \end{aligned}$$

by an elementary calculation. Since the map \( A\rightarrow H^{-1}A^{-p}(H^{-1})^* \) is concave and the function \( t\rightarrow t(1+t)^{-1} \) is operator monotone and operator concave, we obtain that \( A\rightarrow \bigl (I+H^* A^p H\bigr )^{-1} \) is concave. That is, \( \psi _p^{-1}(A) \) is concave. Since the function \( t\rightarrow t^\alpha \) is both operator monotone and operator concave for \( 0\le \alpha \le 1, \) it follows that \( \psi _p^s(A) \) is concave for \( -1\le s\le 0. \) \(\square \)
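Operator concavity as in Proposition 3.3 can be sampled numerically by testing the midpoint inequality in the Loewner order. The sketch below assumes NumPy and real matrices; the names `psi` and `midpoint_concavity_defect` are ours, and a finite sample is of course no substitute for the proof.

```python
import numpy as np

def mpow(M, r):
    """Real power of a symmetric positive definite matrix."""
    w, V = np.linalg.eigh((M + M.T) / 2)
    return (V * w ** r) @ V.T

def psi(A, K, H, p, s):
    """The operator map (K + H^* A^p H)^s for real matrices."""
    return mpow(K + H.T @ mpow(A, p) @ H, s)

def midpoint_concavity_defect(A, B, K, H, p, s):
    """Smallest eigenvalue of psi((A+B)/2) - (psi(A) + psi(B))/2.

    A nonnegative value means that the midpoint concavity inequality
    holds in the Loewner order for this particular pair (A, B).
    """
    D = psi((A + B) / 2, K, H, p, s) - (psi(A, K, H, p, s) + psi(B, K, H, p, s)) / 2
    return np.linalg.eigvalsh((D + D.T) / 2).min()
```

For exponents with \( -1\le p\le 0 \) and \( -1\le s\le 0, \) the defect should be nonnegative up to rounding.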

4 The main theorem

Theorem 4.1

The trace function \( \varphi ^L_{p,q}(A) \) defined in (3.1) has the following geometric properties depending on the matrix L and the parameters p and q. \( \varphi ^L_{p,q}(A) \) is concave in positive definite A for

$$\begin{aligned} 0&\le p\le 1,\quad L\le 0,\quad 0\le q\le 1. \end{aligned}$$
(4.1a)
$$\begin{aligned} 1&\le p\le 2,\quad L\ge 0,\quad 0\le q\le 1. \end{aligned}$$
(4.1b)

\( \varphi ^L_{p,q}(A) \) is convex in positive definite A for

$$\begin{aligned} 0&\le p\le 1,\quad L\le 0,\quad q\le 0. \end{aligned}$$
(4.1c)
$$\begin{aligned} 1&\le p\le 2,\quad L\ge 0,\quad q\le 0.\end{aligned}$$
(4.1d)
$$\begin{aligned} 2&\le p\le 3,\quad L\ge 0,\quad q\ge 1. \end{aligned}$$
(4.1e)

Proof

We divide the proof following the statement’s five cases.

  1. (a)

    Take \( 0\le p< 1, \) \( L\le 0, \) and \( 1-p\le q\le 1. \) Then

    $$\begin{aligned} 0\le \beta =1+(p-1)/q \le p<1. \end{aligned}$$

    Since \( \varphi _{p,q}(A)=\sup _{X>0} F(X,A), \) we may derive that \( \varphi _{p,q}(A) \) is concave if \( G(X,A) \) is jointly concave. Since \( 1<2-\beta \le 2 ,\) the first term in \( G(X,A) \) is concave (note that \( \beta <1 \)). Since \( -1\le p-1\le 0 \) and \( 1-(p-1)\le 2-\beta \le 2, \) the second term in \( G(X,A) \) is concave by Ando’s convexity theorem (the factor \( 1/(\beta -1) \) is negative), and we realise that \( \varphi _{p,q}(A) \) is concave. Next, take \( 0\le p< 1, \) \( L\le 0, \) and \( 0\le q\le 1-p. \) That is, \( -1\le s\le 0, \) where \( s=q/(p-1)\,. \) It then follows from (3.2) combined with Proposition 3.3 that \( \varphi _{p,q}(A) \) is concave (even without the trace).

  2. (b)

    Take \( 1<p\le 2, \) \( L\ge 0, \) and \( 0< q\le p-1. \) Then

    $$\begin{aligned} \beta =1+\frac{p-1}{q}\ge 2, \end{aligned}$$

    and thus \( \varphi ^L_{p,q}(A)=\inf _{X>0} F(X,A). \) We may thus derive that \( \varphi ^L_{p,q}(A) \) is concave if \( G(X,A) \) is concave in the second variable. This is so since \( \beta >1 \) and \( 0\le p-1\le 1. \) Next, take \( 1<p\le 2, \) \( L\ge 0, \) and \( p-1\le q\le 1. \) Then,

    $$\begin{aligned} 1< p\le \beta =1+\frac{p-1}{q}\le 2, \end{aligned}$$

    and thus \( \varphi ^L_{p,q}(A)=\sup _{X>0} F(X,A). \) We may thus derive that \( \varphi ^L_{p,q}(A) \) is concave if \( G(X,A) \) is jointly concave. Since \( \beta >1 \) and \( 0\le 2-\beta \le 1, \) the first term in \( G(X,A) \) is concave. The second term is concave by Lieb’s concavity theorem, since \( 0\le p-1\le 1 \) and \( 2-\beta +p-1\le 1. \) The last inequality is satisfied since \( p\le \beta . \) These two cases taken together prove (b).

  3. (c)

    We first prove that (d) implies (c). Take \( 1< p<2, \) \( L\ge 0, \) and \( q<0, \) and note that \( 0\le 2-p<1. \) Since \( \varphi ^L_{p,q}(A) \) is convex by (d) and \( q/(1-p)>0, \) we obtain by Lemma 3.2 that \( \varphi ^{-L}_{2-p,q}(A) \) is convex. This is equivalent to saying that \( \varphi ^L_{p,q}(A) \) is convex for \( 0\le p\le 1, \) \( L\le 0, \) and \( q< 0. \)

  4. (d)

    By continuity, we may assume \( 1< p\le 2, \) \( L\ge 0, \) and \( q<0. \) Therefore, \( \beta =1+(p-1)/q<1 \) and thus \( \varphi ^L_{p,q}(A)=\sup _{X>0}F(X,A). \) We obtain that \( \varphi ^L_{p,q}(A) \) is convex if \( G(X,A) \) is convex in the second variable. This is so since \( \beta <1 \) and \( 0\le p-1\le 1. \)

  5. (e)

    Take \( 2\le p\le 3, \) \( L\ge 0, \) and \( 1\le q\le p-1. \) Then

    $$\begin{aligned} 2\le \beta =1+(p-1)/q\le p\le 3. \end{aligned}$$

    Since \( \varphi ^L_{p,q}(A)=\inf _{X>0} F(X,A), \) we may derive that \( \varphi ^L_{p,q}(A) \) is convex if \( G(X,A) \) is jointly convex. Since \( -1\le 2-\beta \le 0 ,\) this follows by Ando’s convexity theorem if in addition

    $$\begin{aligned} 1-(2-\beta )\le p-1\le 2, \end{aligned}$$

    and this is satisfied since \( \beta \le p. \) Take next \( 2\le p\le 3, \) \( L\ge 0, \) and \( q\ge p-1. \) Then

    $$\begin{aligned} 1<\beta =1+(p-1)/q \le 2. \end{aligned}$$

    Since \( \varphi ^L_{p,q}(A)=\sup _{X>0} F(X,A), \) we may derive that \( \varphi ^L_{p,q}(A) \) is convex if \( G(X,A) \) is convex in the second variable. But this is so since \( \beta >1 \) and \( 1\le p-1\le 2. \) These two cases taken together prove (e).

\(\square \)

The special case \( q=1 \) was proved in [9, Corollary 2.3].
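The five parameter regions of Theorem 4.1 can be sampled numerically with a midpoint test. The sketch below assumes NumPy and real matrices, and evaluates the trace function through the expression (3.2); the names `phi` and `midpoint_gap` are illustrative.

```python
import numpy as np

def phi(A, H, L, p, q):
    """phi^L_{p,q}(A) evaluated through (3.2); requires p != 1."""
    w, V = np.linalg.eigh(A)
    A_pow = (V * w ** (p - 1)) @ V.T
    M = np.eye(len(A)) - H.T @ H + (p - 1) * L + H.T @ A_pow @ H
    return np.sum(np.linalg.eigvalsh((M + M.T) / 2) ** (q / (p - 1)))

def midpoint_gap(A, B, H, L, p, q):
    """phi((A+B)/2) - (phi(A) + phi(B))/2.

    The value is >= 0 for a concave function and <= 0 for a convex one.
    """
    return phi((A + B) / 2, H, L, p, q) - (phi(A, H, L, p, q) + phi(B, H, L, p, q)) / 2
```

For a contraction H and a positive semi-definite matrix P, one would expect a nonnegative gap for \( (p,L,q)=(1.5,P,0.5) \) as in (4.1b), and a nonpositive gap for \( (p,L,q)=(2.5,P,2) \) as in (4.1e).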

4.1 Comparison with the literature

The trace functions \( \Upsilon _{p,q}(A) \) were introduced and studied by Carlen and Lieb in [3, Theorem 1.1] and later with a different definition (by setting \( s=q/p) \) in [4, Proposition 5]. We adopt and slightly generalise the latter definition by setting

$$\begin{aligned} \Upsilon ^K_{p,s}(A)=\textrm{Tr}~\bigl (K+H^*A^pH\bigr )^s, \end{aligned}$$
(4.2)

where \( K\ge 0, \) H is arbitrary, and A is positive definite. By replacing p with \( p-1 ,\) we obtain the following corollary to Theorem 4.1.

Corollary 4.2

The trace function \( \Upsilon ^K_{p,s}(A) \) defined in (4.2) has the following geometric properties depending on the parameters p and s. \( \Upsilon ^K_{p,s}(A) \) is concave in positive definite A for

$$\begin{aligned} -1&\le p\le 0,\quad p^{-1}\le s\le 0. \end{aligned}$$
(4.3a)
$$\begin{aligned} 0&\le p\le 1,\quad 0\le s\le p^{-1}. \end{aligned}$$
(4.3b)

\( \Upsilon ^K_{p,s}(A) \) is convex in positive definite A for

$$\begin{aligned} -1&\le p\le 0,\quad s\ge 0. \end{aligned}$$
(4.3c)
$$\begin{aligned} 0&\le p\le 1,\quad s\le 0.\end{aligned}$$
(4.3d)
$$\begin{aligned} 1&\le p\le 2,\quad s\ge p^{-1}. \end{aligned}$$
(4.3e)

Proof

To a given \( K\ge 0, \) we set \( L=(p-1)^{-1} K. \) Then, \( L\le 0 \) for \( p<1 \) and \( L\ge 0 \) for \( p>1. \) By replacing L with \( t^{p-1} L \) in Eq. (3.8), we obtain

$$\begin{aligned} t^{-q}\varphi ^{t^{p-1}L}_{p,q}(tA)=\textrm{Tr}~\bigl ( t^{1-p}(I-H^*H)+(p-1)L+H^*A^{p-1}H\bigr )^{q/(p-1)} \end{aligned}$$

and this expression tends to

$$\begin{aligned} \textrm{Tr}~\bigl (K+H^*A^{p-1}H\bigr )^{q/(p-1)}= \Upsilon ^K_{p-1,s}(A),\quad s=\frac{q}{p-1} \end{aligned}$$

by letting \( t\rightarrow 0 \) in the case \( p<1, \) and letting \( t\rightarrow \infty \) in the case \( p>1. \) With these choices, we realise that \( \Upsilon ^K_{p-1,s}(A) \) has the same geometric properties as \( \varphi ^L_{p,q}(A). \) We may now replace p with \( p+1 \) and obtain that \( \Upsilon ^K_{p,s}(A) \) has the same geometric properties as \( \varphi ^L_{p+1,q}(A), \) where \( s=q/p. \)

In particular, \( \Upsilon ^K_{p,s}(A) \) is concave for \( -1\le p\le 0 \) and \( 0\le q\le 1, \) equivalent to \( p^{-1}\le s \le 0. \) Furthermore, \( \Upsilon ^K_{p,s}(A) \) is convex for \( -1\le p\le 0 \) and \( q\le 0, \) equivalent to \( s\ge 0. \) Likewise, \( \Upsilon ^K_{p,s}(A) \) is convex for \( 0\le p\le 1 \) and \( q\le 0, \) equivalent to \( s\le 0. \) Finally, \( \Upsilon ^K_{p,s}(A) \) is convex for \( 1\le p\le 2 \) and \( q\ge 1, \) equivalent to \( s\ge p^{-1}. \) \(\square \)
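The scaling limit used in the proof can also be observed numerically: for \( L=(p-1)^{-1}K, \) the rescaled quantity \( t^{-q}\varphi ^{t^{p-1}L}_{p,q}(tA) \) approaches \( \Upsilon ^K_{p-1,s}(A) \) with \( s=q/(p-1). \) The sketch below assumes NumPy and real matrices; all function names are hypothetical.

```python
import numpy as np

def tr_pow(M, r):
    """Tr M^r for a symmetric positive definite matrix M."""
    return np.sum(np.linalg.eigvalsh((M + M.T) / 2) ** r)

def spd_pow(A, r):
    """Real power of a symmetric positive definite matrix."""
    w, V = np.linalg.eigh(A)
    return (V * w ** r) @ V.T

def phi(A, H, L, p, q):
    """phi^L_{p,q}(A) via (3.2); requires p != 1."""
    M = np.eye(len(A)) - H.T @ H + (p - 1) * L + H.T @ spd_pow(A, p - 1) @ H
    return tr_pow(M, q / (p - 1))

def upsilon(A, H, K, p, s):
    """Upsilon^K_{p,s}(A) = Tr (K + H^* A^p H)^s as in (4.2)."""
    return tr_pow(K + H.T @ spd_pow(A, p) @ H, s)

def rescaled_phi(t, A, H, K, p, q):
    """t^(-q) phi^{t^(p-1) L}_{p,q}(tA) with L = K / (p-1)."""
    L = K / (p - 1)
    return t ** (-q) * phi(t * A, H, t ** (p - 1) * L, p, q)
```

Taking t large for \( p>1 \) (or t small for \( p<1 \)) makes the contribution \( t^{1-p}(I-H^*H) \) negligible, so `rescaled_phi` and `upsilon(A, H, K, p - 1, q / (p - 1))` should nearly agree.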

In the case \( K=0 ,\) we note that (4.3a) and (4.3b) are counterparts of each other by replacing \( (p,s) \) with \( (-p,-s). \) This also applies to (4.3c) and (4.3d). These results contain the statements in [4, Proposition 5], where the authors list the following clarifications.

  1. (1)

    Concavity: The case \( 0\le p\le 1 \) and \( K=0. \) The result for \( s=p^{-1} \) is due to Epstein [5]. Carlen and Lieb proved the result for \( 1\le s\le p^{-1}, \) [3, Theorem 1.1]. The full result for \( 0\le s\le p^{-1} \) is due to Hiai [7, Theorem 4.1 (1)].

  2. (2)

    Convexity: The case \( -1\le p\le 0, \) \( K=0, \) and \( s>0 \) is due to Hiai [7, Theorem 4.1 (2)].

  3. (3)

    Convexity: The case \( 1\le p\le 2, \) \( K=0, \) and \( s\ge p^{-1} \) is due to Carlen and Lieb [3, Theorem 1.1].

The dual case \( 0\le p\le 1, \) \( K=0, \) and \( s<0 \) is also contained in Hiai [7, Theorem 4.1 (2)]. One may compare Corollary 4.2 to Figure 1.1 in Zhang [11], where a three variable extension \( \Psi _{p,q,s}(A) \) of \( \Upsilon ^0_{p,s}(A) \) is discussed. The comparison is obtained by setting \( q=0 \) in the figure.