Parameter-free description of the manifold of non-degenerate density matrices

The paper gives a definition of exponential arcs in the manifold of non-degenerate density matrices and uses it as a starting point to develop a parameter-free version of non-commutative Information Geometry in the finite-dimensional case. Given the Bogoliubov metric, the m- and e-connections are dual to each other. Convex potentials are introduced; they allow the introduction of dual charts. Affine coordinates are introduced at the end to make the connection with the more usual approach.


Introduction
Models belonging to the quantum exponential family have been intensively studied within Statistical Physics long before Amari [1,2] introduced dually flat geometries into the theory of statistical models. The generalization of Amari's work to quantum models was taken up by Hasegawa and others [3][4][5][6][7][8]. See for instance Chapter 7 of [2] and the book of Petz [9].
In the book of Ay et al. [18], Chapter 3.3, two distinct approaches are mentioned. In Pistone's approach, the manifold of probability measures compatible with a given measure receives the structure of a Banach manifold. Alternatively, a manifold of probability measures can receive its geometry from its embedding in a linear space of signed measures.
Early efforts to generalize Pistone's approach to the quantum context include the works of Grasselli and Streater [19][20][21][22] and of Jenčová [23]. Recently, a different line of research was started by Ciaglia et al. [24,25], who study the action of the group of invertible operators on the manifold of density operators.
Technical problems appear when considering continuous measure spaces. Such problems are avoided here by restriction to the finite-dimensional case.
From the literature, the following guidelines are adopted.
- The manifold has a maximal extent; models belonging to an exponential family describe submanifolds;
- The manifolds are Banach manifolds; charts take values in a Banach space;
- Each point of the manifold is the centre of a chart;
- The geodesics are exponential arcs;
- The metric can be obtained from a divergence function by differentiation;
- Parallel transport is used to derive the geometric connection.
e-mail: Jan.Naudts@uantwerpen.be (corresponding author)
In Quantum Information Theory [9], Bures' distance [26,27] is extensively used. It is the quantum analogue of the Hellinger distance and has quite unique properties. Nevertheless, the inner product needed in the present context is that of Bogoliubov [19,28,29,30]. For this inner product, the e- (exponential) and m- (mixture) connections become dual to each other [19]. A proof follows in Sect. 5.
Let me finally point out that the generalization of Information Geometry to the non-commutative context is characterized by non-uniqueness. Section 10.3 of [9] discusses a class of metrics that all generalize the Fisher information metric to the quantum context. In addition, the notion of exponential arcs, which is the topic of the present work, is non-unique. An alternative definition given in [31] introduces exponential arcs of faithful states on a σ-finite von Neumann algebra.
The next section gives the definition of exponential arcs of density matrices. Vectors tangent to these arcs are discussed in Sect. 3. The exponential map is shown to be well-defined, one-to-one and onto. A chart affine for the e-connection is discussed. In Sects. 4 and 5, Bogoliubov's inner product is introduced. Parallel transport is used to derive the covariant derivative for the m-connection and for its dual, the e-connection. Sections 6 and 7 introduce convex potentials and dual charts. The link with parameterized approaches is made in Sect. 8. A final section contains a summary and discussion.

Exponential arcs
A first step in the construction of a geometry on the manifold M of non-degenerate density matrices of dimension n-by-n is the choice of the geodesics that will be used to connect pairs of points in the manifold. A non-degenerate density matrix is a positive-definite matrix with complex entries and unit trace. Its eigenvalues can be interpreted as probabilities summing up to 1.
The following definition generalizes the concept introduced by Cena and Pistone [13,14] to the non-commutative context.

Definition 2.1
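An exponential arc connecting the density matrix $\sigma$ to the density matrix $\rho$ is a map $t \mapsto \sigma_t$ with $\sigma_t$ given by
$$\sigma_t = \exp\big(\log\rho + t\,(\log\sigma - \log\rho) - \alpha(t)\big),$$
with $\alpha(t)$ given by
$$\alpha(t) = \log \mathrm{Tr}\,\exp\big(\log\rho + t\,(\log\sigma - \log\rho)\big).$$
Note that, given $\rho$ and $\sigma$ in M, $\sigma_t$ with $t \in [0, 1]$ is a non-degenerate density matrix belonging to the manifold M. It satisfies $\sigma_0 = \rho$ and $\sigma_1 = \sigma$. The normalization function $\alpha$ satisfies $\alpha(0) = \alpha(1) = 0$.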
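The definition can be checked numerically. The following is a minimal sketch, assuming the arc has the form $\sigma_t = \exp(\log\rho + t(\log\sigma - \log\rho) - \alpha(t))$ with $\alpha(t)$ the logarithm of the trace of the unnormalized exponential; the helper names `herm_fun`, `random_density` and `exp_arc` are hypothetical and chosen for illustration only.

```python
import numpy as np

def herm_fun(m, f):
    """Apply a real function f to a Hermitian matrix through its eigendecomposition."""
    w, u = np.linalg.eigh(m)
    return (u * f(w)) @ u.conj().T

def random_density(n, rng):
    """Random positive-definite matrix with unit trace (illustrative test helper)."""
    x = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    m = x @ x.conj().T + 0.1 * np.eye(n)
    return m / np.trace(m).real

def exp_arc(rho, sigma, t):
    """Point sigma_t of the exponential arc (assumed form, see the lead-in)."""
    log_rho = herm_fun(rho, np.log)
    h = log_rho + t * (herm_fun(sigma, np.log) - log_rho)
    e = herm_fun(h, np.exp)
    # Dividing by the trace is the same as subtracting alpha(t) in the exponent.
    return e / np.trace(e).real

rng = np.random.default_rng(0)
rho, sigma = random_density(3, rng), random_density(3, rng)
for t in (0.0, 0.5, 1.0):
    s_t = exp_arc(rho, sigma, t)
    # Each point should have unit trace and strictly positive eigenvalues.
    print(t, np.trace(s_t).real, np.linalg.eigvalsh(s_t).min())
```

Normalizing by the trace rather than computing $\alpha(t)$ separately is a design choice of the sketch; both give the same matrix.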
For further use, introduce the following notation.

Notation 2.2
For any pair of density matrices $\rho$ and $\sigma$ in M, the tangent vector $Y_\rho(\sigma)$ is given by
$$Y_\rho(\sigma) = \frac{d}{dt}\sigma_t\Big|_{t=0},$$
where $t \mapsto \sigma_t$ is the exponential arc connecting $\sigma$ to $\rho$.

Notation 2.3
Each matrix $A$ defines a matrix denoted $[A]^K_\rho$ by the relation
$$[A]^K_\rho = \int_0^1 du\, \rho^u A\, \rho^{1-u}.$$
The map $A \mapsto [A]^K_\rho$ is the Kubo transform [9].
Note that $\mathrm{Tr}\,\rho\, c_\rho(\sigma) = 0$. With these notations, one can write the tangent vector as
$$Y_\rho(\sigma) = [c_\rho(\sigma)]^K_\rho.$$
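A numerical illustration of the relation between the tangent vector and the Kubo transform is sketched below. It rests on two assumptions not spelled out in the text above: the Kubo transform is taken to be $[A]^K_\rho = \int_0^1 du\,\rho^u A \rho^{1-u}$, and the chart is taken to be $c_\rho(\sigma) = \log\sigma - \log\rho + D(\rho||\sigma)$ with $D$ the relative entropy, which is the choice that makes $\mathrm{Tr}\,\rho\,c_\rho(\sigma)$ vanish. All function names are hypothetical.

```python
import numpy as np

def herm_fun(m, f):
    """Apply a real function f to a Hermitian matrix through its eigendecomposition."""
    w, u = np.linalg.eigh(m)
    return (u * f(w)) @ u.conj().T

def random_density(n, rng):
    """Random positive-definite matrix with unit trace (illustrative test helper)."""
    x = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    m = x @ x.conj().T + 0.1 * np.eye(n)
    return m / np.trace(m).real

def exp_arc(rho, sigma, t):
    """Normalized exponential arc from rho (t = 0) to sigma (t = 1), assumed form."""
    log_rho = herm_fun(rho, np.log)
    e = herm_fun(log_rho + t * (herm_fun(sigma, np.log) - log_rho), np.exp)
    return e / np.trace(e).real

def kubo(rho, a):
    """Kubo transform, evaluated in the eigenbasis of rho.

    The integral of lam_i**u * lam_j**(1-u) over u in [0, 1] equals
    (lam_i - lam_j) / (log lam_i - log lam_j), or lam_i when lam_i == lam_j.
    """
    lam, u = np.linalg.eigh(rho)
    li, lj = lam[:, None], lam[None, :]
    with np.errstate(divide="ignore", invalid="ignore"):
        w = (li - lj) / (np.log(li) - np.log(lj))
    w = np.where(np.isclose(li, lj, rtol=1e-10, atol=0.0), li, w)
    return u @ (w * (u.conj().T @ a @ u)) @ u.conj().T

def rel_entropy(a, b):
    """Relative entropy D(a||b) = Tr a (log a - log b)."""
    return np.trace(a @ (herm_fun(a, np.log) - herm_fun(b, np.log))).real

def chart(rho, sigma):
    """Assumed chart c_rho(sigma) = log sigma - log rho + D(rho||sigma)."""
    n = rho.shape[0]
    return (herm_fun(sigma, np.log) - herm_fun(rho, np.log)
            + rel_entropy(rho, sigma) * np.eye(n))

rng = np.random.default_rng(0)
rho, sigma = random_density(3, rng), random_density(3, rng)
y = kubo(rho, chart(rho, sigma))          # tangent vector via the Kubo transform
h = 1e-5                                   # step of the central difference
fd = (exp_arc(rho, sigma, h) - exp_arc(rho, sigma, -h)) / (2 * h)
print("max deviation from finite difference:", np.abs(y - fd).max())
```

The closed-form weights avoid a numerical quadrature over $u$; the branch for coinciding eigenvalues handles the removable singularity of the ratio.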

The tangent plane
Fix a non-degenerate density matrix ρ in M. The tangent plane T ρ M at the point ρ in the manifold M is the space of derivatives at the origin t = 0 of exponential arcs t → σ t connecting any density matrix σ in M to the density matrix ρ. Let us characterize this space.

Proposition 3.1 For any n-by-n matrix V , there exists an n-by-n matrix A such that V = [A] K ρ . If V is Hermitian then A is Hermitian as well. If in addition, the trace of V vanishes then the expectation Tr ρ A vanishes as well.
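Proof Consider an orthonormal basis $(e_i)_i$ in which $\rho$ is diagonal. One has $\rho e_i = \lambda_i e_i$ with $\lambda_i > 0$. In this basis, the Kubo transform multiplies the matrix element $A_{ij}$ by $\int_0^1 du\, \lambda_i^u \lambda_j^{1-u}$. The matrix $A$ with matrix elements given by
$$A_{ij} = \frac{\log\lambda_i - \log\lambda_j}{\lambda_i - \lambda_j}\, V_{ij} \ \text{ if } \lambda_i \neq \lambda_j, \qquad A_{ij} = \frac{1}{\lambda_i}\, V_{ij} \ \text{ if } \lambda_i = \lambda_j,$$
therefore satisfies $V = [A]^K_\rho$. If $V$ is Hermitian, then one has, for $\lambda_i \neq \lambda_j$,
$$\overline{A_{ji}} = \frac{\log\lambda_j - \log\lambda_i}{\lambda_j - \lambda_i}\, \overline{V_{ji}} = \frac{\log\lambda_i - \log\lambda_j}{\lambda_i - \lambda_j}\, V_{ij} = A_{ij},$$
and similarly for $\lambda_i = \lambda_j$. This shows that also $A$ is Hermitian. If in addition $\mathrm{Tr}\, V = 0$, then one has
$$\mathrm{Tr}\,\rho A = \sum_i \lambda_i A_{ii} = \sum_i V_{ii} = \mathrm{Tr}\, V = 0.$$
For convenience, the following notations are introduced. The space of Hermitian n-by-n matrices $A$ with vanishing expectation $\mathrm{Tr}\,\rho A = 0$ is denoted $\mathcal{A}_\rho$. The space of Hermitian n-by-n matrices $V$ with vanishing trace $\mathrm{Tr}\, V = 0$ is denoted $\mathcal{A}^0_{\mathrm{sa}}$.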

Proposition 3.3 The tangent space $T_\rho M$ consists of all Hermitian n-by-n matrices with vanishing trace: $T_\rho M = \mathcal{A}^0_{\mathrm{sa}}$.
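Proof Let $V$ be any matrix in $\mathcal{A}^0_{\mathrm{sa}}$. By the previous proposition, there exists a Hermitian n-by-n matrix $A$ such that $V = [A]^K_\rho$ holds. Let $\sigma$ be defined by
$$\sigma = \exp\big(\log\rho + A - \log\mathrm{Tr}\,\exp(\log\rho + A)\big).$$
Then $\sigma$ is a density matrix. Let $t \mapsto \sigma_t$ denote the exponential arc connecting $\sigma$ to $\rho$. The tangent vector at $t = 0$ is given by
$$\frac{d}{dt}\sigma_t\Big|_{t=0} = [A - \mathrm{Tr}\,\rho A]^K_\rho = [A]^K_\rho - (\mathrm{Tr}\,\rho A)\,\rho = V.$$
In the above calculation, it is used that $[I]^K_\rho = \rho$ and $\mathrm{Tr}\,\rho A = \mathrm{Tr}\, V$. The latter vanishes by assumption.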

Proposition 3.4
If two exponential arcs σ t and τ t connecting σ , respectively, τ to ρ have the same tangent vector at t = 0 then they coincide.
Proof Because σ t and τ t have the same tangent vector at t = 0, it follows that Take the trace of this expression to find that One concludes that By Proposition 3.1, the linear map A → [A] K ρ is invertible. Hence, it is a one-to-one map between the spaces A ρ and A 0 sa because these spaces are finite-dimensional. From (6), it then follows that log σ − log τ = 0 and hence, that σ = τ .
The book of Amari and Nagaoka [2] introduces the notions of an m-connection and of an e-connection. The geodesics of the m- or mixture connection are the convex combinations of probability measures. The non-commutative generalizations of probability measures are the quantum expectations, also called quantum states. Their convex combinations correspond with convex combinations of density matrices. In a similar manner, the e-connection of the manifold of quantum states M can be defined as the connection that has exponential arcs as its geodesics.
The inverse σ → Y ρ (σ ) of the exponential map Y ρ (σ ) → σ could be used as a chart for the manifold M. This chart is affine in the case of the m-connection. Alternatively, one can use the correspondence provided by Proposition 3.1 between A 0 sa and A ρ . It will turn out that the chart c ρ is affine in case of the e-connection. Note that it satisfies c ρ (ρ) = 0. It is said to be centered at the point ρ in M.
The transition map c ρ 1 → c ρ 2 from reference point ρ 1 to any other reference point ρ 2 is given by The expression in the r.h.s. is Fréchet-differentiable for any density matrix σ in M. One concludes that the different charts are mutually compatible.

The metric
Eguchi [33] shows how to derive a metric on the tangent planes starting from a divergence function. The obvious divergence function here is Umegaki's relative entropy (2) discussed in Sect. 2. The use of the Bogoliubov metric in relation with Umegaki's relative entropy is found for instance in [4,25,34].

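Proposition 4.1 An inner product is defined on the tangent plane $T_\rho M$ by
$$\langle Y_\rho(\sigma), Y_\rho(\tau)\rangle_\rho = -\frac{\partial}{\partial s}\frac{\partial}{\partial t}\, D(\sigma_s||\tau_t)\Big|_{s=t=0},$$
where $\sigma_t$ and $\tau_t$ are exponential arcs connecting density matrices $\sigma$ and $\tau$ to the density matrix $\rho$, and $Y_\rho(\sigma)$ and $Y_\rho(\tau)$ are the tangents of $t \mapsto \sigma_t$, respectively, $t \mapsto \tau_t$ at $t = 0$. The inner product is given in terms of the chart $c_\rho$ by
$$\langle Y_\rho(\sigma), Y_\rho(\tau)\rangle_\rho = \mathrm{Tr}\,[c_\rho(\sigma)]^K_\rho\, c_\rho(\tau) = \int_0^1 du\, \mathrm{Tr}\,\rho^u c_\rho(\sigma)\,\rho^{1-u} c_\rho(\tau). \qquad (7)$$
Proof One calculates
$$\frac{\partial}{\partial t}\, D(\sigma_s||\tau_t) = -\mathrm{Tr}\,\sigma_s\,(\log\tau - \log\rho) + \frac{d}{dt}\log\mathrm{Tr}\,\exp\big(\log\rho + t\,(\log\tau - \log\rho)\big),$$
where the last term does not depend on $s$. This implies
$$-\frac{\partial}{\partial s}\frac{\partial}{\partial t}\, D(\sigma_s||\tau_t)\Big|_{s=t=0} = \mathrm{Tr}\, Y_\rho(\sigma)\,(\log\tau - \log\rho).$$
The tangent vector $Y_\rho(\sigma)$ can be expressed in terms of the chart $c_\rho(\sigma)$. This gives
$$\langle Y_\rho(\sigma), Y_\rho(\tau)\rangle_\rho = \mathrm{Tr}\,[c_\rho(\sigma)]^K_\rho\,(\log\tau - \log\rho).$$
This can be written as (7) because $\mathrm{Tr}\,[c_\rho(\sigma)]^K_\rho = \mathrm{Tr}\,\rho\, c_\rho(\sigma) = 0$. Let us verify that (7) defines a non-degenerate inner product on the tangent space $T_\rho M$. Bilinearity follows because the relation between tangent vector and chart is linear. Positivity follows from
$$\mathrm{Tr}\,[A]^K_\rho A = \int_0^1 du\, \mathrm{Tr}\, X_u^\dagger X_u \geq 0 \quad\text{with}\quad X_u = \rho^{(1-u)/2} A\, \rho^{u/2},$$
which holds for any Hermitian matrix $A$. Expression (7) is Bogoliubov's inner product [28][29][30] adapted to the present notations.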
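The link between the divergence and the inner product can be probed numerically. The sketch below assumes the following forms, none of which is guaranteed to match the author's exact conventions: the relative entropy $D(\sigma||\rho) = \mathrm{Tr}\,\sigma(\log\sigma - \log\rho)$, the chart $c_\rho(\sigma) = \log\sigma - \log\rho + D(\rho||\sigma)$, the inner product $\mathrm{Tr}\,[c_\rho(\sigma)]^K_\rho\, c_\rho(\tau)$, and the sign convention of Eguchi in which the metric is minus the mixed second derivative of $D(\sigma_s||\tau_t)$. Function names are hypothetical.

```python
import numpy as np

def herm_fun(m, f):
    """Apply a real function f to a Hermitian matrix through its eigendecomposition."""
    w, u = np.linalg.eigh(m)
    return (u * f(w)) @ u.conj().T

def random_density(n, rng):
    """Random positive-definite matrix with unit trace (illustrative test helper)."""
    x = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    m = x @ x.conj().T + 0.1 * np.eye(n)
    return m / np.trace(m).real

def exp_arc(rho, sigma, t):
    """Normalized exponential arc from rho (t = 0) to sigma (t = 1), assumed form."""
    log_rho = herm_fun(rho, np.log)
    e = herm_fun(log_rho + t * (herm_fun(sigma, np.log) - log_rho), np.exp)
    return e / np.trace(e).real

def kubo(rho, a):
    """Kubo transform int_0^1 rho^u a rho^(1-u) du in the eigenbasis of rho."""
    lam, u = np.linalg.eigh(rho)
    li, lj = lam[:, None], lam[None, :]
    with np.errstate(divide="ignore", invalid="ignore"):
        w = (li - lj) / (np.log(li) - np.log(lj))
    w = np.where(np.isclose(li, lj, rtol=1e-10, atol=0.0), li, w)
    return u @ (w * (u.conj().T @ a @ u)) @ u.conj().T

def rel_entropy(a, b):
    """Relative entropy D(a||b) = Tr a (log a - log b)."""
    return np.trace(a @ (herm_fun(a, np.log) - herm_fun(b, np.log))).real

def chart(rho, sigma):
    """Assumed chart c_rho(sigma) = log sigma - log rho + D(rho||sigma)."""
    return (herm_fun(sigma, np.log) - herm_fun(rho, np.log)
            + rel_entropy(rho, sigma) * np.eye(rho.shape[0]))

def bogoliubov(rho, a, b):
    """Inner product Tr [a]^K_rho b of two chart values a and b."""
    return np.trace(kubo(rho, a) @ b).real

def mixed_derivative(rho, sigma, tau, h=1e-4):
    """Central-difference estimate of -d/ds d/dt D(sigma_s || tau_t) at s = t = 0."""
    d = lambda s, t: rel_entropy(exp_arc(rho, sigma, s), exp_arc(rho, tau, t))
    return -(d(h, h) - d(h, -h) - d(-h, h) + d(-h, -h)) / (4 * h * h)

rng = np.random.default_rng(0)
rho, sigma, tau = (random_density(3, rng) for _ in range(3))
g = bogoliubov(rho, chart(rho, sigma), chart(rho, tau))
print("inner product:", g, " finite difference:", mixed_derivative(rho, sigma, tau))
```

The finite-difference step is a compromise between truncation and rounding error; agreement is expected only to a few digits.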

The dual geometry
With any geometry with parallel transport $\Pi$ corresponds a dual geometry [2] with parallel transport $\Pi^*$ given by
$$\langle \Pi(\rho_1 \to \rho_2) V,\, \Pi^*(\rho_1 \to \rho_2) W\rangle_{\rho_2} = \langle V, W\rangle_{\rho_1}. \qquad (9)$$
Here $\rho_1$ and $\rho_2$ belong to the manifold M and $V$ and $W$ are tangent vectors in $T_{\rho_1} M$. A flat geometry is obtained when the parallel transport $\Pi$ is chosen equal to the identity map, where each tangent space is identified with the space $\mathcal{A}^0_{\mathrm{sa}}$ of traceless Hermitian matrices. The geometry is that of the m-connection [2]. Let us verify this now.
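The covariant derivative of a vector field $V$ along a smooth curve $\gamma$ is given by [35]
$$\nabla_{\dot\gamma} V\big|_{\gamma_t} = \lim_{\epsilon \to 0} \frac{1}{\epsilon}\Big[\Pi(\gamma_{t+\epsilon} \to \gamma_t)\, V(\gamma_{t+\epsilon}) - V(\gamma_t)\Big].$$
With $\Pi$ equal to the identity map, with the path $\gamma$ given by $\gamma_t = (1 - t)\rho + t\sigma$, and with the vector field given by $V(\gamma_t) = \gamma_t - \gamma_0$, one obtains
$$\nabla_{\dot\gamma} V = \frac{d}{dt}(\gamma_t - \gamma_0) = \sigma - \rho = \dot\gamma_t.$$
The fact that the covariant derivative is constant along this path and equal to the derivative $\dot\gamma$ indicates that the path is a geodesic of a flat connection. It is a geodesic of the m-connection.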
Let us now consider the dual of the m-connection. Because is the identity map (9) simplifies to (7),*** it follows that Because V is an arbitrary traceless matrix, it follows that B − A is a multiple of the identity I and hence that * (ρ 1 Choose now the vector field V (ρ) = Y ρ (σ ) in combination with a path γ equal to the exponential arc t → σ t connecting σ to ρ. Then one finds This shows that t → σ t is a geodesic for the dual connection ∇ * . Because the geodesics are exponential arcs, the connection is a non-commutative generalization of the e-connection of [2].

The Legendre structure
The relative entropy D(ρ||σ ) is convex in its first argument ρ. The proof is based on Klein's inequality [36]. See [9] for the more general argument based on operator monotonicity of the function f (x) = −x log x. This convexity suggests the use of Legendre transforms.

Definition 6.1 Given a density matrix $\rho$ and a matrix $A$ in $\mathcal{A}_\rho$, the potential $\Phi_\rho(A)$ is defined by
$$\Phi_\rho(A) = \log\mathrm{Tr}\,\exp(\log\rho + A).$$
It is the analogue of the logarithm of the partition sum in Statistical Physics. The matrix $A$ corresponds with minus the Hamiltonian. The term $\log\rho$ is added so that an arbitrary point of the manifold can be taken as the center of the manifold. Note that the Banach space of Hermitian matrices can be identified with the dual of the linear space generated by the density matrices by identification of the linear functional $\rho \mapsto \mathrm{Tr}\,\rho A$ with the matrix $A$ itself. The Legendre transform of the map $\sigma \mapsto D(\sigma||\rho)$ is therefore equal to
$$\sup_{\sigma \in M}\,\{\mathrm{Tr}\,\sigma A - D(\sigma||\rho)\}, \qquad \rho \in M \text{ and } A \in \mathcal{A}_\rho.$$

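Proposition 6.2 For all $\rho$ in M and $A$ in $\mathcal{A}_\rho$ one has
$$\Phi_\rho(A) = \sup_{\sigma \in M}\,\{\mathrm{Tr}\,\sigma A - D(\sigma||\rho)\}. \qquad (11)$$
The maximum is reached for $\sigma = \tau_A$ with $\tau_A$ in M such that $c_\rho(\tau_A) = A$. It takes on the value
$$\Phi_\rho(A) = D(\rho||\tau_A).$$
Proof The condition $c_\rho(\tau_A) = A$ reads $\log\tau_A = \log\rho + A - D(\rho||\tau_A)$. Because $\tau_A$ has unit trace, it then follows that
$$D(\rho||\tau_A) = \log\mathrm{Tr}\,\exp(\log\rho + A) = \Phi_\rho(A).$$
Use this to obtain
$$\log\tau_A = \log\rho + A - \Phi_\rho(A).$$
This shows that for any $\sigma$ one has
$$\mathrm{Tr}\,\sigma A - D(\sigma||\rho) = \Phi_\rho(A) - D(\sigma||\tau_A) \leq \Phi_\rho(A).$$
Take now $\sigma = \tau_A$. Then the inequality $0 \leq D(\sigma||\tau_A)$ in the above calculation becomes an equality. Hence, $\sigma = \tau_A$ realizes the supremum in (11).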
The proposition shows that $A \mapsto \Phi_\rho(A)$ is a Legendre transform. In particular, this implies that it is a convex function.
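The variational characterization of the potential can be illustrated numerically. The sketch below assumes the potential has the form $\log\mathrm{Tr}\,\exp(\log\rho + A)$ and that the maximizer is the normalized matrix $\exp(\log\rho + A)$; this is the finite-dimensional Gibbs variational principle. The function names `potential`, `tau` and `centered_hermitian` are hypothetical.

```python
import numpy as np

def herm_fun(m, f):
    """Apply a real function f to a Hermitian matrix through its eigendecomposition."""
    w, u = np.linalg.eigh(m)
    return (u * f(w)) @ u.conj().T

def random_density(n, rng):
    """Random positive-definite matrix with unit trace (illustrative test helper)."""
    x = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    m = x @ x.conj().T + 0.1 * np.eye(n)
    return m / np.trace(m).real

def centered_hermitian(rho, rng):
    """Random Hermitian matrix A shifted so that Tr rho A = 0."""
    n = rho.shape[0]
    x = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    h = (x + x.conj().T) / 2
    return h - np.trace(rho @ h).real * np.eye(n)

def rel_entropy(a, b):
    """Relative entropy D(a||b) = Tr a (log a - log b)."""
    return np.trace(a @ (herm_fun(a, np.log) - herm_fun(b, np.log))).real

def potential(rho, a):
    """Assumed potential: log Tr exp(log rho + A)."""
    return np.log(np.trace(herm_fun(herm_fun(rho, np.log) + a, np.exp)).real)

def tau(rho, a):
    """Candidate maximizer: exp(log rho + A) normalized to unit trace."""
    e = herm_fun(herm_fun(rho, np.log) + a, np.exp)
    return e / np.trace(e).real

def functional(rho, a, sigma):
    """The quantity Tr sigma A - D(sigma||rho) whose supremum is taken."""
    return np.trace(sigma @ a).real - rel_entropy(sigma, rho)

rng = np.random.default_rng(0)
rho = random_density(3, rng)
a = centered_hermitian(rho, rng)
phi = potential(rho, a)
trial = max(functional(rho, a, random_density(3, rng)) for _ in range(200))
print("potential:", phi)
print("value at the candidate maximizer:", functional(rho, a, tau(rho, a)))
print("best of 200 random trial states:", trial)
```

Random trial states only probe the inequality; they do not replace the proof that the supremum is attained at the normalized exponential.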

The dual chart
The following result is standard.

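Proposition 7.1 The plane tangent to the potential $\Phi_\rho$ at the contact point $\tau_A$ is the map $B \mapsto \Phi_\rho(A) + \mathrm{Tr}\,\tau_A (B - A)$.
Proof From (11), one obtains
$$\Phi_\rho(B) \geq \mathrm{Tr}\,\tau_A B - D(\tau_A||\rho) = \mathrm{Tr}\,\tau_A (B - A) + \Phi_\rho(A).$$
This shows that the plane $B \mapsto \mathrm{Tr}\,\tau_A (B - A) + \Phi_\rho(A)$ remains below the potential $\Phi_\rho$. Contact at $B = A$ is clear.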
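The tangent-plane property can be illustrated with a small numerical experiment. The sketch assumes, as above, that the potential is $\log\mathrm{Tr}\,\exp(\log\rho + A)$ and that $\tau_A$ is the normalized matrix $\exp(\log\rho + A)$; the function names are hypothetical. It checks that the plane stays below the potential and that the directional derivative of the potential at $A$ in the direction $B$ is $\mathrm{Tr}\,\tau_A B$.

```python
import numpy as np

def herm_fun(m, f):
    """Apply a real function f to a Hermitian matrix through its eigendecomposition."""
    w, u = np.linalg.eigh(m)
    return (u * f(w)) @ u.conj().T

def random_density(n, rng):
    """Random positive-definite matrix with unit trace (illustrative test helper)."""
    x = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    m = x @ x.conj().T + 0.1 * np.eye(n)
    return m / np.trace(m).real

def centered_hermitian(rho, rng):
    """Random Hermitian matrix A shifted so that Tr rho A = 0."""
    n = rho.shape[0]
    x = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    h = (x + x.conj().T) / 2
    return h - np.trace(rho @ h).real * np.eye(n)

def potential(rho, a):
    """Assumed potential: log Tr exp(log rho + A)."""
    return np.log(np.trace(herm_fun(herm_fun(rho, np.log) + a, np.exp)).real)

def tau(rho, a):
    """Contact point: exp(log rho + A) normalized to unit trace."""
    e = herm_fun(herm_fun(rho, np.log) + a, np.exp)
    return e / np.trace(e).real

def tangent_plane(rho, a, b):
    """Value at B of the plane tangent to the potential at A."""
    return potential(rho, a) + np.trace(tau(rho, a) @ (b - a)).real

rng = np.random.default_rng(0)
rho = random_density(3, rng)
a, b = centered_hermitian(rho, rng), centered_hermitian(rho, rng)
print("potential at B:", potential(rho, b))
print("tangent plane at B:", tangent_plane(rho, a, b))

eps = 1e-5  # step of the central difference along the direction B
slope = (potential(rho, a + eps * b) - potential(rho, a - eps * b)) / (2 * eps)
print("directional derivative:", slope, " Tr tau_A B:", np.trace(tau(rho, a) @ b).real)
```

The second check is the parameter-free counterpart of the coordinate statement that the derivative of the potential yields the dual coordinate.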
From the above proposition, one concludes that the Legendre dual of the matrix $A$ in $\mathcal{A}_\rho$ is the linear functional defined by the density matrix $\tau_A$. In the approach with coordinates, the derivative of the dual coordinate yields the metric tensor: the first item of (3.32) of [2] reads $\partial_i \eta_j = g_{ij}$. The derivative of the potential gives the dual coordinate: (3.33) of [2] reads $\partial_i \psi = \eta_i$. Parameter-free analogues follow below.

Proposition 7.3 Choose ρ in M and A and B in
Proof One has This can be written in first-order approximation as Take the trace of this expression to see that the third term in the r.h.s. vanishes. Hence, one concludes (14).

Proposition 7.4 Select ρ and σ in M and A in
This is used in the now following calculation.
From Proposition 4.1, one obtains

Affine coordinates
The space A 0 sa of traceless Hermitian matrices of dimension n-by-n is a Hilbert space for the Hilbert-Schmidt inner product The charts c ρ have vanishing expectation value. Their expansion therefore reads Introduce a field of basis vectors e i in the tangent bundle. It is defined by The tangent vectors Y (σ ) can then be written as follows The metric tensor g is defined by One finds for any pair σ , τ in M Let us next consider the dual charts. Take B in A ρ and expand it as

From (14) one obtains for any A and B in
Hence, one has

Summary and discussion
The manifold M of non-degenerate n-by-n density matrices is studied in a parameter-free way. The starting point is the notion of exponential arcs. The tangents to such arcs span at each point $\rho$ of the manifold the space $\mathcal{A}^0_{\mathrm{sa}}$ of traceless Hermitian matrices. Affine charts are introduced for both the m- and the e-connection. The latter turn M into a Banach manifold by means of a global chart $c_\rho$ centered at an arbitrary point $\rho$ of the manifold.
Bogoliubov's inner product is defined on any of the tangent planes T ρ M. Parallel transport relates the different tangent planes. The covariant derivative corresponding with the dual parallel transport is derived. It defines the e-connection.
The divergence function is convex in its first argument. This enables the introduction of a convex potential function $\Phi_\rho$ defined on the range $\mathcal{A}_\rho$ of the chart $c_\rho$. The derivative of $\Phi_\rho$ defines the dual chart.
In a final section, affine coordinates are introduced. In this way, the link is made to more conventional approaches.
The identity (1) plays an essential role in controlling the effects of non-commutativity. In combination with the chart $c_\rho(\sigma)$, it allows one to express the exponential map as $Y_\rho(\sigma) = [c_\rho(\sigma)]^K_\rho \mapsto \sigma$. In the study of quantum exponential families, the e-connection can be easily derived by taking third-order derivatives of the divergence function [33]. They yield the connection coefficients $\Gamma^k_{ij}$. By use of dual coordinates, it then becomes straightforward to show that exponential arcs are geodesics of a flat geometry. In a coordinate-free approach, it is more transparent to start from parallel transport. The parallel transport of the m-connection is the identity map. Given the metric, one can then derive the dual transport and verify that the exponential arcs are geodesics for the dual connection. This way of working is adopted here because of its transparency.
Throughout this work, the distinction is made between the spaces A ρ of Hermitian matrices A with vanishing expectation Tr ρ A = 0 and the space A 0 sa of Hermitian matrices V with vanishing trace Tr V = 0, although the relation between the two spaces is trivial. Doing so is clarifying. Given a Hermitian matrix A, the parallel transport from ρ 1 in M to ρ 2 in M by means of the e-connection maps A − Tr ρ 1 A onto A − Tr ρ 2 A. The commutative analogue of this transport law has been emphasized for instance in [15].
In Amari's work [1,2], it is important that a potential function is available whose Hessian is the metric. It allows for an easy introduction of the Legendre duality. It is shown in Sect. 6 that for each $\rho$ in M a potential $\Phi_\rho$ can be defined on the space $\mathcal{A}_\rho$ in such a way that its Fréchet derivative, which is the Legendre dual, is a chart affine for the m-connection.
The present paper describes the geometry of the manifold M of non-degenerate n-by-n density matrices from a specific point of view. Much more is known and the overall picture is clear. On the other hand, the generalization to infinite dimensions consists of separate studies such as those of [20,21,23,24,31]. Finite-dimensional matrices are replaced by possibly unbounded operators on Hilbert space. Density matrices are replaced by normalized positive functionals called states. The technicality of the subject increases, and many aspects concerning the geometry of the manifold of faithful states are still unclear.