Unitary invariance
Proposition 2
u is invariant under unitary transformations U of \(\mathscr {H}\).
Proof
U maps \(\mathscr {S}\) to itself in a linear way and maps \(\mathscr {T}_1\) to itself in an affine-linear way. Thus, any translation-invariant measure on \(\mathscr {T}_1\) will be mapped by U to a multiple of itself. Since U also maps \(\mathscr {P}\) to itself, it also maps \(\mathscr {D}\) to itself; since \(\mathscr {D}\) has finite, nonzero volume, that multiple must be 1. Hence, U preserves volumes when acting on \(\mathscr {T}_1\), and so it preserves u. \(\square \)
Proof
(Alternative proof.) Equip \(\mathscr {S}\) with the Hilbert–Schmidt inner product
$$\begin{aligned} \langle A,B \rangle = {{\,\mathrm{tr}\,}}(AB), \end{aligned}$$
(4)
which is invariant under U. The inner product provides a notion of surface area on every surface, in particular on \(\mathscr {T}_1\); since the inner product is invariant under U, so is surface area. But u is just the normalized surface area (Footnote 2) restricted to \(\mathscr {D}\), so u, too, is invariant under U. \(\square \)
Note that unitary invariance does not uniquely select the measure u. Unitary invariance means that the joint distribution of the eigenvectors of \(\rho \) is uniform while saying nothing about the joint distribution of the eigenvalues. The property that selects u as the natural normalized measure on \(\mathscr {D}\) is that u is, up to a normalizing factor, “just volume.” (After all, it is the volume measure in \(\mathscr {T}_1\) applied to subsets of \(\mathscr {D}\subset \mathscr {T}_1\).) (Footnote 3)
Expectation and covariance
The covariance of a random vector V in a real vector space \(\mathscr {V}\) with inner product \(\langle ~,\,\rangle \) is defined to be the operator \(C:\mathscr {V}\rightarrow \mathscr {V}\) such that
$$\begin{aligned} \langle v,Cv'\rangle = \mathbb {E}\Bigl [\bigl \langle v,(V-\mathbb {E}V)\bigr \rangle \bigl \langle (V-\mathbb {E}V),v'\bigr \rangle \Bigr ] \end{aligned}$$
(5)
for all \(v,v'\in \mathscr {V}\).
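In coordinates, (5) is just the usual covariance matrix. A minimal numerical sketch of the definition (assuming, for illustration only, \(\mathscr {V}=\mathbb {R}^3\) with the standard inner product; the particular mean and matrix L are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(0)

# Random vector V in R^3 with known mean and true covariance L @ L.T.
n_samples = 200_000
L = np.array([[1.0, 0.0, 0.0],
              [0.5, 1.0, 0.0],
              [0.2, 0.3, 1.0]])
V = rng.standard_normal((n_samples, 3)) @ L.T + np.array([1.0, 2.0, 3.0])

# Empirical covariance operator: <v, C v'> = E[<v, V - EV> <V - EV, v'>].
centered = V - V.mean(axis=0)
C = centered.T @ centered / n_samples

# Check definition (5) directly for two test vectors v, v'.
v = np.array([1.0, -1.0, 0.5])
vp = np.array([0.3, 2.0, -1.0])
lhs = v @ C @ vp
rhs = np.mean((centered @ v) * (centered @ vp))
assert abs(lhs - rhs) < 1e-9               # (5) holds by construction
assert np.allclose(C, L @ L.T, atol=0.02)  # close to the true covariance
```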
Proposition 3
A u-distributed \(\rho \) has expectation
$$\begin{aligned} \mathbb {E}\rho = \tfrac{1}{d}I \end{aligned}$$
(6)
and covariance (in \(\mathscr {V}=\mathscr {S}\) with Hilbert–Schmidt inner product (4))
$$\begin{aligned} C= c(d)\,P_{\mathscr {T}_0}\,, \end{aligned}$$
(7)
with \(c(d)>0\) some constant (Footnote 4) and \(P_{\mathscr {T}_0}\) the projection onto the set \(\mathscr {T}_0\) of traceless operators in \(\mathscr {S}\).
Proof
As a consequence of Proposition 2, \(\mathbb {E}\rho \) must be invariant under every unitary U. Since the only operators on \(\mathscr {H}\) invariant under all unitaries are the multiples of the identity I, and since \({{\,\mathrm{tr}\,}}\mathbb {E}\rho =\mathbb {E}\,{{\,\mathrm{tr}\,}}\rho =1\), (6) follows.
Likewise, C must be invariant under U(d). To determine all U(d)-invariant operators on \(\mathscr {S}\), we first show that the representation of U(d) on \(\mathscr {S}\) is the direct sum of two irreducible representation spaces, \(\mathbb {R}I\) (the multiples of the identity) and \(\mathscr {T}_0\).
Clearly, \(\mathbb {R}I\) and \(\mathscr {T}_0\) are U(d)-invariant (as \({{\,\mathrm{tr}\,}}(UAU^{-1})={{\,\mathrm{tr}\,}}(A)\)), they are orthogonal in the Hilbert–Schmidt inner product, their sum is \(\mathscr {S}\), and \(\mathbb {R}I\) is irreducible because it is 1-dimensional. To show that \(\mathscr {T}_0\) is irreducible, we show that \(\{0\}\) and \(\mathscr {T}_0\) are its only invariant subspaces. To this end, let \(\mathscr {U}\ne \{0\}\) be an invariant subspace of \(\mathscr {T}_0\); we show that \(\mathscr {U}+\mathbb {R}I=\mathscr {S}\), which implies that \(\mathscr {U}=\mathscr {T}_0\). Note that \(\mathscr {U}+\mathbb {R}I\) is invariant. Let \(0\ne A\in \mathscr {U}\). Then A has at least two different eigenvalues (since \({{\,\mathrm{tr}\,}}A=0\) but \(A\ne 0\)); choose an orthonormal basis of \(\mathscr {H}\) that diagonalizes A. We show that all \(B\in \mathscr {S}\) that are diagonal in the same basis also lie in \(\mathscr {U}+\mathbb {R}I\); it then follows by applying unitaries that \(\mathscr {U}+\mathbb {R}I=\mathscr {S}\).

For this, it suffices to show that for \(d\ge 2\) the only subspace of \(\mathbb {R}^d\) that is invariant under permutation of components and contains \(\varvec{c}:=(1,1,\ldots ,1)\) as well as some vector not proportional to \(\varvec{c}\) is \(\mathbb {R}^d\) itself. Indeed, if \(\mathscr {W}\) is such a subspace and \(\varvec{w}\in \mathscr {W}\setminus \mathbb {R}\varvec{c}\), then \(w_i\ne w_j\) for some \(i\ne j\). Let \(\varvec{w}'\) be the vector obtained from \(\varvec{w}\) by exchanging \(w_i\) and \(w_j\); then \(\varvec{w}'':=\varvec{w}-\varvec{w}' \in \mathscr {W}\) has \(w''_i= w_i-w_j\), \(w''_j=w_j-w_i\), while all other components of \(\varvec{w}''\) vanish. Thus, using permutations again, \((1,-1,0,\ldots ,0)\in \mathscr {W}\) and
$$\begin{aligned}&(1,0,\ldots ,0) =\nonumber \\&\quad \tfrac{1}{d}\Bigl [\varvec{c}+ (1,-1,0,0,\ldots ,0) + (1,0,-1,0,\ldots ,0) + \ldots + (1,0,\ldots ,0,-1)\Bigr ] \in \mathscr {W}\,. \end{aligned}$$
(8)
By permutation, all \((0,\ldots ,0,1,0,\ldots ,0)\in \mathscr {W}\), so \(\mathscr {W}=\mathbb {R}^d\).
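The arithmetic in (8) is easy to verify numerically, e.g. for \(d=5\):

```python
import numpy as np

d = 5
c = np.ones(d)                 # the vector (1, 1, ..., 1)
# Add the vectors (1, 0, ..., -1, ..., 0) with -1 in slot k = 2, ..., d.
total = c.copy()
for k in range(1, d):
    v = np.zeros(d)
    v[0], v[k] = 1.0, -1.0
    total += v
e1 = total / d                 # right-hand side of (8)
print(e1)  # → [1. 0. 0. 0. 0.]
assert np.allclose(e1, np.eye(d)[0])
```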
Now, since \(\mathscr {T}_0\) is irreducible, we can apply Schur’s lemma [20]. Since the irreducible representations \(\mathbb {R}I\) (which has dimension 1) and \(\mathscr {T}_0\) (which has dimension \(d^2-1\ge 3\)) are inequivalent, Schur’s lemma yields that every U(d)-invariant operator \(C:\mathscr {S}\rightarrow \mathscr {S}\) is of the form
$$\begin{aligned} C= {\tilde{c}} P_{\mathbb {R}I} + c P_{\mathscr {T}_0}\,. \end{aligned}$$
(9)
For the covariance operator C, since always \(\rho -\mathbb {E}\rho \in \mathscr {T}_0\), we have that \({\tilde{c}}=0\). \(\square \)
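Proposition 3 lends itself to a Monte Carlo check. The sketch below assumes, as an outside input not derived in this text, the standard random-matrix fact that u, being the normalized Hilbert–Schmidt volume on \(\mathscr {D}\), is the distribution of \(\rho = GG^\dagger /{{\,\mathrm{tr}\,}}(GG^\dagger )\) for G a \(d\times d\) matrix with i.i.d. standard complex Gaussian entries; `sample_u` is an illustrative helper name.

```python
import numpy as np

rng = np.random.default_rng(1)
d, n = 3, 50_000

def sample_u(d, rng):
    # Assumed sampler for u: normalized Wishart ("Hilbert-Schmidt ensemble").
    G = rng.standard_normal((d, d)) + 1j * rng.standard_normal((d, d))
    rho = G @ G.conj().T
    return rho / np.trace(rho).real

rhos = np.stack([sample_u(d, rng) for _ in range(n)])

# (6): E[rho] = I/d.
mean_rho = rhos.mean(axis=0)
assert np.allclose(mean_rho, np.eye(d) / d, atol=0.01)

# rho - E[rho] is traceless, so C has no component along I (c-tilde = 0 in (9)).
devs = rhos - np.eye(d) / d
assert max(abs(np.trace(D).real) for D in devs) < 1e-12

# (7): <v, C v> takes the same value c(d) for every HS-unit traceless v.
v1 = np.diag([1.0, -1.0, 0.0]) / np.sqrt(2)
v2 = np.zeros((d, d), dtype=complex)
v2[0, 1] = v2[1, 0] = 1.0 / np.sqrt(2)
q1 = np.mean([np.trace(v1 @ D).real ** 2 for D in devs])
q2 = np.mean([np.trace(v2 @ D).real ** 2 for D in devs])
assert abs(q1 - q2) < 0.003
```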
We can characterize the value of \(c=c(d)\) as follows. Fix \(\psi \in \mathbb {S}(\mathscr {H})\) and set \(v=v'=|\psi \rangle \langle \psi |\). Then
$$\begin{aligned} \langle v,Cv\rangle&= \mathbb {E}\Bigl [\bigl ({{\,\mathrm{tr}\,}}[v(\rho -\mathbb {E}\rho )]\bigr )^2\Bigr ] \end{aligned}$$
(10)
$$\begin{aligned}&= \mathbb {E}\Bigl [\bigl (\langle \psi |\rho |\psi \rangle -d^{-1}\bigr )^2\Bigr ] \end{aligned}$$
(11)
$$\begin{aligned}&= \mathbb {E}\Bigl [\langle \psi |\rho |\psi \rangle ^2\Bigr ] -2d^{-1}\mathbb {E}\langle \psi |\rho |\psi \rangle +d^{-2} \end{aligned}$$
(12)
$$\begin{aligned}&= \mathbb {E}\Bigl [\langle \psi |\rho |\psi \rangle ^2\Bigr ] -d^{-2}\,. \end{aligned}$$
(13)
On the other hand,
$$\begin{aligned} \langle v,Cv\rangle&= c(d) \langle v,P_{\mathscr {T}_0}v\rangle \end{aligned}$$
(14)
$$\begin{aligned}&= c(d) \Bigl (\langle v,v\rangle - \langle v,P_{\mathbb {R}I}v\rangle \Bigr ) \end{aligned}$$
(15)
$$\begin{aligned}&= c(d) \Bigl (1 - \langle v,d^{-1/2}I\rangle \langle d^{-1/2}I, v\rangle \Bigr ) \end{aligned}$$
(16)
$$\begin{aligned}&= c(d) \bigl (1 - d^{-1}({{\,\mathrm{tr}\,}}v)^2 \bigr ) \end{aligned}$$
(17)
$$\begin{aligned}&= c(d) (1-d^{-1})\,. \end{aligned}$$
(18)
Thus,
$$\begin{aligned} c(d) = \tfrac{d}{d-1} \mathbb {E}\Bigl [\langle \psi |\rho |\psi \rangle ^2\Bigr ] - \tfrac{1}{d(d-1)}\,. \end{aligned}$$
(19)
We did not succeed in evaluating the expectation value. (Footnote 5)
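While no closed form is given, the expectation in (19) is straightforward to estimate numerically. The sketch below assumes, as an outside input, the standard random-matrix fact that u is the distribution of \(GG^\dagger /{{\,\mathrm{tr}\,}}(GG^\dagger )\) for a \(d\times d\) matrix G of i.i.d. standard complex Gaussians. For \(d=2\), integrating the eigenvalue density of Proposition 4 below yields \(c(2)=1/10\), which the estimate should reproduce.

```python
import numpy as np

rng = np.random.default_rng(2)
d, n = 2, 100_000

psi = np.zeros(d)
psi[0] = 1.0   # any fixed unit vector; by unitary invariance the choice is irrelevant

vals = np.empty(n)
for k in range(n):
    G = rng.standard_normal((d, d)) + 1j * rng.standard_normal((d, d))
    rho = G @ G.conj().T
    rho /= np.trace(rho).real
    vals[k] = (psi @ rho @ psi).real  # <psi|rho|psi>

c_d = d / (d - 1) * np.mean(vals ** 2) - 1 / (d * (d - 1))  # formula (19)
print(c_d)  # close to c(2) = 1/10
assert abs(c_d - 0.1) < 0.01
```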
Distribution of eigenvalues
Let \(T_1\) be the plane
$$\begin{aligned} T_1:= \Bigl \{(\lambda _1,\ldots ,\lambda _d)\in \mathbb {R}^d: \sum _{i=1}^d \lambda _i=1 \Bigr \}\,. \end{aligned}$$
(20)
Proposition 4
Under u, the eigenvalues \(\lambda _1\ge \lambda _2 \ge \ldots \ge \lambda _d\) of \(\rho \) have joint distribution in \(\Lambda \subset T_1\) with density (Footnote 6)
$$\begin{aligned} f(\lambda _1,\ldots ,\lambda _d) = {\mathcal {N}} \prod _{1\le i<j \le d} |\lambda _i-\lambda _j|^2 \end{aligned}$$
(21)
relative to the volume measure in \(T_1\) with normalization constant \({\mathcal {N}}>0\).
Proof
The strategy of proof is to use, instead of volume on \(\mathscr {S}\), a Gaussian unitary ensemble, for which the distribution of the eigenvalues is known, and then let its variance tend to infinity, so that the distribution becomes flat on every compact set.
The Gaussian unitary ensemble (e.g., [13, 19, 21]) is the probability distribution over self-adjoint \(d\times d\) matrices \(X_{ij}= A_{ij}+iB_{ij}\) with real part \(A_{ij}=A_{ji}\) and imaginary part \(B_{ij}=-B_{ji}\) such that all \(A_{ij}\) (\(i\le j\)) and all \(B_{ij}\) (\(i<j\)) are independent random variables, where \(A_{ij}\) with \(i<j\) and \(B_{ij}\) are Gaussian with mean 0 and variance 1/(2d), while the \(A_{ii}\) are Gaussian with mean 0 and variance 1/d. Thus, the joint distribution of all \(X_{ij}\) has density (with lower case symbols the possible values of random variables)
$$\begin{aligned} f_X(x_{11},x_{12},\ldots ,x_{dd})&\propto \prod _{i<j} e^{-da_{ij}^2}e^{-db_{ij}^2} \prod _i e^{-da_{ii}^2/2} \end{aligned}$$
(22)
$$\begin{aligned}&= \prod _{i,j=1}^d e^{-d|x_{ij}|^2/2} \end{aligned}$$
(23)
$$\begin{aligned}&= e^{-d {{\,\mathrm{tr}\,}}x^2/2}\,. \end{aligned}$$
(24)
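These variance choices fix the scale so that \(\mathbb {E}\,{{\,\mathrm{tr}\,}}X^2=d\): each of the \(d^2-d\) off-diagonal entries contributes \(\mathbb {E}|X_{ij}|^2=1/d\), and so does each diagonal entry. A sketch constructing the ensemble exactly as defined and checking this numerically (`sample_gue` is an illustrative name):

```python
import numpy as np

rng = np.random.default_rng(3)
d, n = 4, 20_000

def sample_gue(d, rng):
    # A_ij = A_ji, B_ij = -B_ji; off-diagonal variances 1/(2d), diagonal 1/d.
    A = rng.normal(0.0, np.sqrt(1.0 / (2 * d)), size=(d, d))
    B = rng.normal(0.0, np.sqrt(1.0 / (2 * d)), size=(d, d))
    X = np.triu(A, 1) + np.triu(A, 1).T + 1j * (np.triu(B, 1) - np.triu(B, 1).T)
    X = X + np.diag(rng.normal(0.0, np.sqrt(1.0 / d), size=d))
    return X

X = sample_gue(d, rng)
assert np.allclose(X, X.conj().T)   # self-adjoint by construction

tr2 = np.empty(n)
for k in range(n):
    X = sample_gue(d, rng)
    tr2[k] = np.trace(X @ X).real
print(tr2.mean())  # close to d = 4
assert abs(tr2.mean() - d) < 0.06
```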
It is known (e.g., [13, 19, 21]) that the eigenvalues \(\mu _1\ge \cdots \ge \mu _d\) of X have joint distribution with density
$$\begin{aligned} g_X(\mu _1,\ldots ,\mu _d) \propto \prod _{k=1}^d e^{-\frac{d}{2} \mu _k^2} \prod _{1\le i<j \le d} |\mu _i-\mu _j|^2\,. \end{aligned}$$
(25)
That is, \(\varphi ^{-1}\) maps the distribution \(f_X(x) \, dx\) to the product of \(g_X(\varvec{\mu })\,d\varvec{\mu }\) (with \(\varvec{\mu }=(\mu _1,\ldots ,\mu _d)\)) and the uniform distribution on U(d).
Now consider \(Y:=\sigma X\) with arbitrary \(\sigma >0\) that we will ultimately let tend to infinity. Y has density
$$\begin{aligned} f_Y(y_{11},y_{12},\ldots ,y_{dd}) \propto e^{-d {{\,\mathrm{tr}\,}}y^2/2\sigma ^2}\,, \end{aligned}$$
(26)
and its eigenvalues \(\nu _1=\sigma \mu _1,\ldots ,\nu _d=\sigma \mu _d\) have joint density
$$\begin{aligned} g_Y(\nu _1,\ldots ,\nu _d) \propto \prod _{k=1}^d e^{-d \nu _k^2/2\sigma ^2} \prod _{1\le i<j \le d} \frac{|\nu _i-\nu _j|^2}{\sigma ^2}\,. \end{aligned}$$
(27)
Again, \(\varphi ^{-1}\) maps the distribution \(f_Y(y) \, dy\) to the product of \(g_Y(\varvec{\nu })\, d\varvec{\nu }\) (with \(\varvec{\nu }=(\nu _1,\ldots ,\nu _d)\)) and the uniform distribution on U(d).
Since \(\varphi \) maps \(T_1\times U(d)\) to \(\mathscr {T}_1\), it maps the conditional distribution of \(\varvec{\nu }\) on \(T_1\), times the uniform distribution on U(d), to the conditional distribution of Y on \(\mathscr {T}_1\). Likewise, it maps the conditional distribution of \(\varvec{\nu }\) on \(\Lambda \), times the uniform distribution on U(d), to the conditional distribution of Y on \(\mathscr {D}\). Note that the conditional distribution of Y on \(\mathscr {T}_1\) has density, up to a normalizing factor, given by \(f_Y\) restricted to \(T_1\), and the conditional distribution of \(\varvec{\nu }\) on \(T_1\) has density \(g_Y\) on \(T_1\) up to a factor.

In the limit \(\sigma \rightarrow \infty \), the right-hand side of (26) converges to 1, in fact uniformly on the compact set \(\mathscr {D}\); thus, also \(f_Y\) (including the appropriate normalizing factor) converges uniformly to 1 on \(\mathscr {D}\). In the same way, the right-hand side of (27), after dropping the factors of \(\sigma \) in the denominator, converges to \(\prod |\nu _i-\nu _j|^2\), in fact uniformly on the compact set \(\Lambda \). We want to draw the conclusion that \(\varphi \) maps the limit of \(g_Y\)-conditional-on-\(\Lambda \) (times the uniform distribution on U(d)) to the limit of \(f_Y\)-conditional-on-\(\mathscr {D}\) (i.e., to u).
To justify this conclusion, we note the following. The interior of \(\Lambda \) is
$$\begin{aligned} \Lambda ^\circ = \Bigl \{(\lambda _1,\ldots ,\lambda _d)\in T_1: \lambda _1>\cdots> \lambda _d>0 \Bigr \}\,. \end{aligned}$$
(28)
Since \(\Lambda \) is a convex set with nonempty interior, its boundary has measure zero in \(T_1\); thus, it does not matter whether we consider continuous measures on \(\Lambda \) or \(\Lambda ^\circ \). For eigenvalues in \(\Lambda ^\circ \), the orthonormal basis of eigenvectors is unique up to phases; that is, \(\varphi \) maps \(\Lambda ^\circ \times [U(d)/U(1)^d]\) bijectively to the set of non-degenerate positive definite density matrices, a dense set of full u-measure in \(\mathscr {D}\). Since \(\varphi \) is smooth on \(T_1\times U(d)\), so is its Jacobian determinant; since \(\Lambda \times U(d)\) is compact, the Jacobian is bounded on \(\Lambda \times U(d)\). According to the transformation formula for integrals, the density of the pre-image is the Jacobian times the density of the image; as a consequence, if the Jacobian is bounded and the density of the image converges uniformly, then so does the density of the pre-image. That is, we can pull the limit through \(\varphi \), as claimed.
The upshot is that \(\varphi ^{-1}\) maps u to
$$\begin{aligned}&\lim _{\sigma \rightarrow \infty } g_Y(\varvec{\nu }) \, d\varvec{\nu }\times \mathrm {uniform}_{U(d)} \end{aligned}$$
(29)
$$\begin{aligned}&= {\mathcal {N}} \biggl (\prod _{1\le i<j \le d} |\nu _i-\nu _j|^2\biggr ) d\varvec{\nu }\times \mathrm {uniform}_{U(d)}\,, \end{aligned}$$
(30)
which proves (21) (and by the way again the unitary invariance of u). \(\square \)
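For \(d=2\), (21) becomes fully explicit: writing \(\lambda _{1,2}=(1\pm s)/2\), the gap \(s=\lambda _1-\lambda _2\) has density \(3s^2\) on [0, 1], hence \(\mathbb {E}[s]=3/4\) and \(\mathbb {E}[s^2]=3/5\). A numerical sketch, again assuming as an outside input the standard Ginibre construction \(\rho =GG^\dagger /{{\,\mathrm{tr}\,}}(GG^\dagger )\) of u:

```python
import numpy as np

rng = np.random.default_rng(4)
n = 100_000

gaps = np.empty(n)
for k in range(n):
    G = rng.standard_normal((2, 2)) + 1j * rng.standard_normal((2, 2))
    rho = G @ G.conj().T
    rho /= np.trace(rho).real
    lam = np.linalg.eigvalsh(rho)   # eigenvalues in ascending order
    gaps[k] = lam[1] - lam[0]       # s = lambda_1 - lambda_2

# Density 3 s^2 on [0, 1] predicts E[s] = 3/4 and E[s^2] = 3/5.
print(gaps.mean(), (gaps ** 2).mean())
assert abs(gaps.mean() - 0.75) < 0.005
assert abs((gaps ** 2).mean() - 0.6) < 0.005
```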