The Fisher metric as a metric on the cotangent bundle

The Fisher metric on a manifold of probability distributions is usually treated as a metric on the tangent bundle. In this paper, we focus on the metric on the cotangent bundle induced from the Fisher metric, which we call the Fisher co-metric. We show that the Fisher co-metric can be defined directly, without going through the Fisher metric, by establishing a natural correspondence between cotangent vectors and random variables. This definition clarifies a close relation between the Fisher co-metric and the variance/covariance of random variables, whereby the Cram\'{e}r-Rao inequality is trivialized. We also discuss the monotonicity and the invariance of the Fisher co-metric with respect to Markov maps, and present a theorem characterizing the co-metric by the invariance, which can be regarded as a cotangent version of \v{C}encov's characterization theorem for the Fisher metric. The obtained theorem can also be viewed as giving a characterization of the variance/covariance.


Introduction
The Fisher metric on a statistical manifold (a manifold consisting of probability distributions) is one of the most important notions in information geometry [1]. It is usually treated as a Riemannian metric, which is a metric on the tangent bundle. The subject of the present paper is the metric on the cotangent bundle corresponding to the Fisher metric, which we call the Fisher co-metric. The Fisher metric and the Fisher co-metric are essentially a single geometric object, each being induced from the other. Nevertheless, studying the Fisher co-metric has several implications, as mentioned below, which are what the present paper intends to show.
Firstly, as will be seen in Section 2, the Fisher co-metric is defined via the variance/covariance of random variables based on a natural correspondence between cotangent vectors and random variables. This definition is very natural and does not seem arbitrary. There is no room for questions such as why log appears in the definition of the Fisher metric.
Secondly, the above relationship between cotangent vectors and random variables directly links the variance/covariance of an unbiased estimator and the Fisher co-metric, which trivializes the Cramér-Rao inequality. In light of this fact, the Fisher metric appears to be a detour for the Cramér-Rao inequality, at least conceptually.
Thirdly, once we focus on the Fisher co-metric, we are motivated to reconsider known results for the Fisher metric as a source of similar problems for the Fisher co-metric and the variance/covariance, which may lead to new insights. As an example, co-metric and variance/covariance versions of Čencov's theorem on the characterization of the Fisher metric are investigated in this paper.
The paper is organized as follows. In Section 2, we introduce the Fisher co-metric on the manifold P(Ω), which is the totality of positive probability distributions on a finite set Ω, via the variance/covariance of random variables on Ω. In Section 3, the Fisher co-metric is shown to be equivalent to the Fisher metric by a natural correspondence. In Section 4, the Fisher metric and co-metric on an arbitrary submanifold of P are discussed, where we see that the Cramér-Rao inequality is trivialized by considering the co-metric. Section 5 treats the e- and m-connections on P(Ω), where it is clarified that, in application to estimation theory, the role of the m-connection as a connection on the cotangent bundle and its relation to the Fisher co-metric are crucial. Sections 2-5 can be considered to constitute the first half of the paper, which is aimed at showing the naturalness and the usefulness of considering the Fisher co-metric.
The second half of the paper focuses on the monotonicity and the invariance of the Fisher metric and co-metric with respect to Markov maps. In Section 6, we investigate the monotonicity. We show there that the monotonicity of the Fisher metric, which is well known as a characteristic property of the metric, is equivalently translated into the monotonicity of the Fisher co-metric and that of the variance. In Section 7, after reviewing the invariance of the Fisher metric and Čencov's theorem, we consider their co-metric versions. It is shown that, unlike the monotonicity, the invariance of the metric and that of the co-metric are not logically equivalent. We present a theorem on the characterization of the Fisher co-metric in terms of the invariance, which corresponds to Čencov's theorem but does not follow from it. The obtained theorem can also be expressed as a theorem on the characterization of the variance/covariance. In Section 8, we investigate a stronger version of the invariance, which can be regarded as the joint condition combining the invariance of the metric and that of the co-metric. The formulation used for expressing this condition is applied to affine connections in Section 9, whereby a kind of invariance condition for affine connections is obtained. The condition is shown to be equivalent to a known version of the invariance condition which is seemingly weaker than the original condition used by Čencov to characterize the α-connections, but actually characterizes the α-connections as well. Section 10 is devoted to concluding remarks.
Remark 1.1. Throughout this paper, we denote the tangent space and the cotangent space of a manifold M at a point p ∈ M by T_p(M) and T_p^*(M), respectively. We also denote the totality of smooth vector fields and that of smooth differential 1-forms on M by 𝒳(M) and 𝒟(M), respectively. We generally use capital letters X, Y, ... for vector fields in 𝒳(M), which are maps assigning tangent vectors X_p, Y_p, ... in T_p(M) to each point p ∈ M. To save symbols, we also denote general tangent vectors in T_p(M) by X, Y, ..., not only when they are the values of vector fields. Similarly, we use Greek letters ω, τ, ... for 1-forms in 𝒟(M), which are maps assigning cotangent vectors ω_p, τ_p, ... in T_p^*(M) to each point p ∈ M, and also denote general cotangent vectors by ω, τ, ..., not only when they are the values of 1-forms. The pairing of X ∈ T_p(M) and ω ∈ T_p^*(M) is expressed as ω(X), considering a cotangent vector as a function on the tangent space. We keep the first capital letters A, B, ... for random variables (R-valued functions on sample spaces).

The Fisher co-metric
We introduce the Fisher co-metric in this section, while its equivalence to the Fisher metric will be shown in the next section.
Let Ω be a finite set with cardinality |Ω| ≥ 2, and let P(Ω) be the totality of strictly positive probability distributions on Ω, which is regarded as a manifold with dim P(Ω) = |Ω| − 1. Let the totality of R-valued functions on Ω be denoted by R^Ω, and define R^Ω_0 := {A ∈ R^Ω | Σ_{ω∈Ω} A(ω) = 0}. For each p ∈ P(Ω), the tangent space T_p(P) can be identified with the linear space R^Ω_0. Following the terminology of [1], we denote this identification T_p(P) → R^Ω_0 by X ↦ X^(m), and call X^(m) the m-representation of X.
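As a quick numerical illustration of the identification above (a sketch with an arbitrarily chosen curve, not taken from the paper), the m-representation of a tangent vector is the componentwise derivative of a curve of distributions, and it always lies in R^Ω_0:

```python
# Sketch: tangent vectors to P(Omega) in m-representation sum to zero.
# The curve p(t) below is an arbitrary illustrative choice.

def normalize(w):
    s = sum(w)
    return [x / s for x in w]

def curve(t):
    # a smooth curve of strictly positive distributions on Omega = {0, 1, 2}
    return normalize([1.0 + t, 2.0 - t, 1.0 + t * t])

def m_representation(curve, t, h=1e-6):
    # central-difference derivative of the curve: the m-representation X^(m)
    p_plus, p_minus = curve(t + h), curve(t - h)
    return [(a - b) / (2 * h) for a, b in zip(p_plus, p_minus)]

X_m = m_representation(curve, t=0.3)
print(abs(sum(X_m)) < 1e-8)  # True: components of X^(m) sum to 0
```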
For an arbitrary submanifold M of P (including the case M = P), suppose that the elements of M are parametrized as p_ξ by a coordinate system ξ = (ξ^i) of M. Then the m-representation of (∂_i)_p ∈ T_p(M), where ∂_i := ∂/∂ξ^i, is (∂_i)^(m)_p = ∂_i p_ξ, and {(∂_i)^(m)_p}_{i=1}^n (n = dim M) constitute a basis of T^(m)_p(M).
We denote the expectation of a random variable A ∈ R^Ω w.r.t. a distribution p ∈ P by E_p[A] := Σ_{ω∈Ω} A(ω) p(ω), (2.3) and define the function E[A] : P → R, p ↦ E_p[A]. Since E[A] is a smooth function on the manifold P, its differential d_p(E[A]) ∈ T_p^*(P) at each point p ∈ P is defined. We introduce the following map: R^Ω → T_p^*(P), A ↦ d_p(E[A]), (2.5) for which we have Proposition 2.1. For every p ∈ P, the linear map A ↦ d_p(E[A]) is surjective onto T_p^*(P) with kernel R, where R is regarded as a subspace of R^Ω by identifying a constant c ∈ R with the constant function ω ↦ c. Hence, the map induces a linear isomorphism R^Ω/R → T_p^*(P).
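A small numerical check of Proposition 2.1 (the numbers are arbitrary choices for illustration): pairing d_p(E[A]) with a tangent vector amounts to summing A against the m-representation, so shifting A by a constant leaves the cotangent vector unchanged, and constants themselves map to zero.

```python
# Sketch: the cotangent vector d_p(E[A]) pairs with a tangent vector X as
#   d_p(E[A])(X) = sum_w A(w) X^(m)(w),
# so constant functions A lie in the kernel (cf. Proposition 2.1).

def pairing(A, X_m):
    return sum(a * x for a, x in zip(A, X_m))

X_m = [0.2, -0.5, 0.3]          # an element of R^Omega_0 (sums to 0)
A = [1.0, 4.0, -2.0]            # a random variable on Omega = {0, 1, 2}
c = 7.0
A_shifted = [a + c for a in A]  # A plus a constant

print(abs(pairing(A_shifted, X_m) - pairing(A, X_m)) < 1e-12)  # True
print(abs(pairing([c] * 3, X_m)) < 1e-12)                      # True: constants map to 0
```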
We have thus defined the map which assigns to each point p ∈ P an inner product on T_p^*(P). We generally call such a map (a metric on the cotangent bundle) a co-metric. Although a co-metric is essentially equivalent to a usual (Riemannian) metric (a metric on the tangent bundle) by the correspondence explained in the next section, it is often useful to distinguish them conceptually. The co-metric defined by (2.9) is called the Fisher co-metric, since it corresponds to the Fisher metric, as will be shown later.
Remark 2.2. Eq. (2.10) is found in Theorem 2.7 of the book [1], where the norm and the inner product on the cotangent space were considered to be induced from the Fisher metric.

The correspondence between a metric and a co-metric
By a standard argument of linear algebra, an inner product ⟨•, •⟩ on an R-linear space V establishes a natural linear isomorphism between V and its dual space V*, which we denote by ↔. This gives a one-to-one correspondence between a metric g on a manifold M and a co-metric on M as follows. Given a metric g on M, a tangent vector X ∈ T_p(M) and a cotangent vector ω ∈ T_p^*(M) at a point p ∈ M correspond to each other by (3.1). The correspondence is extended to the correspondence between a vector field X ∈ 𝒳(M) and a 1-form ω ∈ 𝒟(M) by (3.2). (Note: some literature refers to this correspondence as the musical isomorphism, with notation ω = X♭ and X = ω♯, while we will use the symbol ♯ for a different meaning later.) This correspondence determines a co-metric on M, which is denoted by the same symbol g, such that for every p ∈ M, X ↔ ω and Y ↔ τ imply g_p(ω, τ) = g_p(X, Y).
Conversely, given a co-metric g on M, the correspondence X ↔ ω is defined by (3.4), and a metric on M is defined by the same relation as (3.3). It should be noted that when a metric and a co-metric correspond in this way, the relations (3.1) and (3.4) are equivalent, so that no confusion arises even if we use the same symbol for the corresponding metric and co-metric in both correspondences.
Note that for an arbitrary coordinate system (ξ^i) of M, g_{ij} := g(∂_i, ∂_j) and g^{ij} := g(dξ^i, dξ^j) form the inverse matrices of each other at every point of M. Note also that the norms for (T_p(M), g_p) and (T_p^*(M), g_p) are linked by (3.5), where the max's in these equations are achieved by those X and ω which correspond to each other by ↔ up to a constant factor.
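The inverse-matrix relation between the components g_{ij} and g^{ij} can be checked numerically. The sketch below (with an arbitrarily chosen point; not from the paper) uses the full simplex on a three-point Ω, where the Fisher matrix has the well-known multinomial form and its inverse is the covariance matrix of the coordinate indicators:

```python
# Sketch: for the simplex on Omega = {0,1,2} with coordinates (xi1, xi2),
# p = (xi1, xi2, 1 - xi1 - xi2), the Fisher information matrix
#   g_ij = delta_ij / p_i + 1 / p_last
# and the co-metric components
#   g^ij = p_i delta_ij - p_i p_j
# are inverse matrices of each other.

p = [0.2, 0.3, 0.5]  # an arbitrary strictly positive distribution

G = [[(1.0 / p[i] if i == j else 0.0) + 1.0 / p[2] for j in range(2)]
     for i in range(2)]

# co-metric components: covariance matrix of the indicator variables
Ginv = [[p[i] * (1.0 if i == j else 0.0) - p[i] * p[j] for j in range(2)]
        for i in range(2)]

prod = [[sum(G[i][k] * Ginv[k][j] for k in range(2)) for j in range(2)]
        for i in range(2)]
ok = all(abs(prod[i][j] - (1.0 if i == j else 0.0)) < 1e-9
         for i in range(2) for j in range(2))
print(ok)  # True
```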
For a tangent vector X ∈ T_p(P), define X^(e), which is the derivative of the map P → R^Ω, p ↦ log p, w.r.t. X. (In [1], X^(e) is called the e-representation of X.) Note that X^(e) is characterized by (cf. (2.6)) (3.8). The following proposition shows that the metric induced from the Fisher co-metric by the correspondence ↔ is the Fisher metric.

Proposition 3.1. For each point p ∈ P, we have the following.

Proof. 1: According to (3.4), the condition X ↔ d_p(E[A]) is equivalent to (3.9). Here the LHS is computed via the m-representation, while the RHS is computed by (3.8). Hence, (3.9) is equivalent to X^(e) = A − E_p[A]. 2: Obvious from item 1 and (3.3).

The Fisher co-metric on a submanifold and the Cramér-Rao inequality

Let M be an arbitrary submanifold of P. Then a metric on M is induced as the restriction of the Fisher metric g. When a coordinate system ξ = (ξ^i) is given on M, corresponding to (2.2) it holds that (4.1)-(4.2), which defines the Fisher information matrix G(ξ) = [g_{ij}(ξ)]. The metric induces a co-metric on M, which is denoted by the same symbol g. Suppose that a cotangent vector ω ∈ T_p^*(M) on M is the restriction of a cotangent vector ω̃ ∈ T_p^*(P) on P; i.e., ω = ω̃|_{T_p(M)}. Then it follows from (3.6) that ‖ω‖_p ≤ ‖ω̃‖_p. (Note that the norms are consistent, since the metric on M is the restriction of the metric on P.) Furthermore, for an arbitrary ω ∈ T_p^*(M), there always exists ω̃ ∈ T_p^*(P) satisfying ω = ω̃|_{T_p(M)} and ‖ω‖_p = ‖ω̃‖_p. Indeed, letting X ∈ T_p(M) be defined by X ↔ ω, such an ω̃ is obtained by X ↔ ω̃.
The above observations lead to the following proposition.
1. For any ω ∈ T_p^*(M), we have (4.5). 2. For any ω, τ ∈ T_p^*(M), we have (4.6). The above proposition shows that the Fisher co-metric on M can be defined from the Fisher co-metric on P directly by (4.5) and (4.6), not by way of the Fisher metric.
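The trivialization of the Cramér-Rao inequality can be illustrated numerically on the simplest model. The sketch below (an illustrative check, not from the paper) uses the Bernoulli family, where the natural unbiased estimator attains the bound with equality:

```python
# Sketch: Cramér-Rao for the Bernoulli model p_theta = (1 - theta, theta).
# The estimator A(w) = w is unbiased for theta; its variance equals the
# inverse Fisher information, so the bound holds with equality.

theta = 0.35
p = [1.0 - theta, theta]
A = [0.0, 1.0]                          # unbiased estimator of theta

mean = sum(a * q for a, q in zip(A, p))
var = sum((a - mean) ** 2 * q for a, q in zip(A, p))

# Fisher information of the Bernoulli family: 1 / (theta (1 - theta))
fisher = 1.0 / (theta * (1.0 - theta))

print(abs(mean - theta) < 1e-12)        # True: unbiasedness
print(abs(var - 1.0 / fisher) < 1e-12)  # True: equality in Cramér-Rao
```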

On the e, m-connections
An affine connection is usually treated as a connection on the tangent bundle, while it corresponds to a connection on the cotangent bundle by the relation (5.1). This correspondence is one-to-one, so that we can define an affine connection by specifying a connection on the cotangent bundle. Therefore, the α-connection in information geometry can also be introduced in this way. Although affine connections are outside the main subject of this paper, we will briefly discuss the significance of defining the m-connection (i.e., the (α = −1)-connection) in this way, since it is closely related to the role of the Fisher co-metric in the Cramér-Rao inequality.
We start by introducing the m-connection ∇^(m) on P = P(Ω) as the flat connection on the cotangent bundle for which the 1-form d(E[A]) is parallel for any A ∈ R^Ω; i.e., (5.2). Then the correspondence (5.1) determines a connection on the tangent bundle, which is denoted by the same symbol ∇^(m). Letting ω = d(E[A]) in (5.1) and applying (5.2), we have (5.3). This implies that, for any X ∈ 𝒳(P), (5.4), where "m-parallel" means "parallel w.r.t. ∇^(m)". Since this property characterizes the m-connection on P (e.g., Eq. (2.39) of [1]), our definition of the m-connection is equivalent to the usual definition in information geometry.
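The defining property that d(E[A]) is m-parallel can be glimpsed numerically: along a mixture curve (an m-geodesic), expectations interpolate affinely. A minimal sketch, with arbitrarily chosen distributions:

```python
# Sketch: along the m-geodesic p_t = (1 - t) p + t q, the expectation
# E_{p_t}[A] is affine in t, reflecting that d(E[A]) is m-parallel.

p = [0.5, 0.3, 0.2]
q = [0.1, 0.2, 0.7]
A = [2.0, -1.0, 4.0]

def E(dist, B):
    return sum(b * w for b, w in zip(B, dist))

def mix(t):
    return [(1 - t) * x + t * y for x, y in zip(p, q)]

t = 0.4
lhs = E(mix(t), A)
rhs = (1 - t) * E(p, A) + t * E(q, A)
print(abs(lhs - rhs) < 1e-12)  # True
```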
(5.5) Using (5.1), we can rewrite (5.5) as (5.6). This implies (5.7) for any X ∈ 𝒳(P) and any 1-form in 𝒟(P). Now, let us recall the situation of Corollary 4.2. An estimator is said to be efficient for the statistical model (M, ξ) when it is unbiased (i.e., its expectation at each p ∈ M equals the true coordinate value) and achieves the equality in the Cramér-Rao inequality (4.9) for every p ∈ M. Noting that the achievability at each p ∈ M is represented by a condition of the form d_p(E[·]) = (·)^♯ and recalling (5.2), we can see that the condition for (M, ξ) to have an efficient estimator is expressed as (5.8). On the other hand, it is well known that the existence of an efficient estimator is equivalent to the condition that M is an exponential family and that ξ is an expectation coordinate system, which can be rephrased as follows (see Theorem 3.12 of [1]): M is an e-autoparallel submanifold of P, and ξ is an m-affine coordinate system. (5.9) Therefore, the two conditions (5.8) and (5.9) are necessarily equivalent. These are both purely geometrical conditions for a submanifold of the dually flat space (P, g, ∇^(e), ∇^(m)), and we can prove their equivalence within this geometrical framework, forgetting its statistical background. Indeed, the equivalence can be proved in a more general situation where M is a submanifold of a manifold equipped with a Riemannian metric and a pair of dual affine connections ∇, ∇* under the assumption that ∇* is flat. Note that this assumption is weaker than the dual flatness of the ambient space in that ∇ is allowed to have non-vanishing torsion, which is essential in application to quantum estimation theory. See Section 7 of [4] for details.

Monotonicity
The monotonicity with respect to a Markov map is known to be an important and characteristic property of the Fisher metric.In this section we discuss the monotonicity of the Fisher co-metric and its relation to the variance of random variables.
Let Ω_1 and Ω_2 be arbitrary finite sets, and let P_i := P(Ω_i) for i = 1, 2. A map Φ : P_1 → P_2 is called a Markov map when it is affine in the sense of (6.1), where K is a surjective channel from Ω_1 to Ω_2 satisfying (6.2) and (6.3). When Φ is represented as (6.1), we write Φ = Φ_K.
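In coordinates, a Markov map pushes a distribution forward through a stochastic kernel. A minimal sketch (the channel and distribution are arbitrary illustrative choices):

```python
# Sketch: a Markov map Phi_K sends p on Omega_1 to the distribution
#   q(y) = sum_x K(y|x) p(x)
# on Omega_2, where each row K(.|x) is a probability distribution.

K = [[0.7, 0.2, 0.1],   # K(.|x=0)
     [0.1, 0.6, 0.3]]   # K(.|x=1)
p = [0.4, 0.6]          # a distribution on Omega_1 = {0, 1}

q = [sum(K[x][y] * p[x] for x in range(2)) for y in range(3)]
print(abs(sum(q) - 1.0) < 1e-12 and all(v > 0 for v in q))  # True: q in P(Omega_2)
```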
More generally, for a submanifold M of P_1 and a submanifold N of P_2, a map φ : M → N is called a Markov map when there exists a Markov map Φ : P_1 → P_2 whose restriction to M coincides with φ. Its differential φ_* and the dual map φ* := {}^t(φ_*) are then defined, where {}^t denotes the transpose of a linear map. See Remark 6.2 below for the notation.
As is well known, the Fisher metric satisfies the monotonicity property (6.6) for its norm. The cotangent version of the monotonicity is given below. Proposition 6.1. We have (6.7), where ‖•‖_p and ‖•‖_{φ(p)} denote the norms w.r.t. the Fisher co-metrics on M and N, respectively.
Let us consider the case when M = P_1 and N = P_2, and let Φ = Φ_K : P_1 → P_2 be an arbitrary Markov map represented by a surjective channel K. Recalling (6.1) and the definition of the m-representation of tangent vectors, we have (6.8) for X ∈ T_p(P_1) and Φ_*(X) ∈ T_{Φ(p)}(P_2). We claim (6.9), where Φ* = Φ*_{,p}. Eq. (6.9) is verified as follows: for every cotangent vector of the form d(E[A]) with A ∈ R^{Ω_2}, we have (6.10), where the second ⇔ follows from (2.6) and (6.8), the third ⇔ follows from T^(m)_p(P) = R^Ω_0, and R is identified with the set of constant functions on Ω_1. Invoking (2.10) and (6.9), we see that the monotonicity (6.7) is equivalent to the following well-known inequality for the variance, (6.12), which we refer to as the monotonicity of the variance.
In the above proof of Proposition 6.1, we derived (6.7) from (6.6). Conversely, we can derive (6.6) from (6.7) by the use of (3.5) as follows: for any X ∈ T_p(M), we evaluate the norm of φ_*(X) by the max expression in (3.5), where the first ≤ follows from (6.7). Thus, (6.6) and (6.7) are equivalent. Note that this equivalence is derived solely from a general argument on metrics and co-metrics, and does not rely on the special characteristics of the Fisher metric/co-metric. In this sense, we say that (6.6) and (6.7) are logically equivalent.
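The monotonicity of the variance can be checked numerically in the simplest case of a deterministic channel induced by a surjection, where coarse-graining amounts to replacing a random variable by its conditional expectation. A sketch with arbitrary numbers:

```python
# Sketch: monotonicity of the variance for the deterministic channel
# induced by a surjection f: Omega_2 -> Omega_1. Replacing A by its
# conditional expectation E[A | f] cannot increase the variance
# (law of total variance).

q = [0.1, 0.2, 0.3, 0.4]          # distribution on Omega_2 = {0,1,2,3}
f = [0, 0, 1, 1]                  # surjection onto Omega_1 = {0, 1}
A = [5.0, -1.0, 2.0, 3.0]

def var(dist, B):
    m = sum(b * w for b, w in zip(B, dist))
    return sum((b - m) ** 2 * w for b, w in zip(B, dist))

# conditional expectation of A given f, as a random variable on Omega_2
cond = []
for w in range(4):
    block = [v for v in range(4) if f[v] == f[w]]
    mass = sum(q[v] for v in block)
    cond.append(sum(A[v] * q[v] for v in block) / mass)

print(var(q, cond) <= var(q, A) + 1e-12)  # True
```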
Recalling that the Fisher metric is characterized as the unique monotone metric up to a constant factor, we obtain the following propositions from the logical equivalence mentioned above.

Proposition 6.3. The monotonicity (6.7) characterizes the Fisher co-metric up to a constant factor.

Proposition 6.4. The variance is characterized up to a constant factor as the positive quadratic form for random variables satisfying the monotonicity (6.12).

Remark 6.5. We have described the above propositions in a rough form for the sake of readability. For the exact statements, we need a formulation similar to Theorems 7.1, 7.2 and 7.3 in the next section. See also Remark 8.4.

Remark 6.6. Since the monotonicity of the Fisher metric (6.6), that of the Fisher co-metric (6.7), and that of the variance (6.12) are all logically equivalent, we can derive (6.6) from the more popular (6.12).

Invariance
Čencov showed in [3] that the Fisher metric is characterized up to a constant factor as a covariant tensor field of degree 2 satisfying the invariance for Markov embeddings. Note that the invariance is weaker than the monotonicity and that the tensor field is assumed to be neither positive nor symmetric. In this section we review Čencov's theorem and then investigate its co-metric version, which will be shown to be equivalent to a theorem characterizing the variance/covariance of random variables.
We begin by reviewing the invariance property of the Fisher metric. Suppose that M and N are arbitrary submanifolds of P_1 = P(Ω_1) and P_2 = P(Ω_2), respectively, and that a pair of Markov maps φ : M → N and ψ : N → M satisfy ψ ∘ φ = id_M, where ∘ denotes the composition of maps. Note that φ is injective while ψ is surjective. Given a pair of points (p, q) ∈ M × N satisfying q = φ(p) and p = ψ(q), we have ψ_{*,q} ∘ φ_{*,p} = id_{T_p(M)}.
It then follows from the monotonicity (6.6) that (7.5)-(7.6), so that we have the invariance of the Fisher metric, which is equivalent to g_p(X, Y) = g_q(φ_{*,p}(X), φ_{*,p}(Y)). (7.7) This means that φ_{*,p} : T_p(M) → T_q(N) is an isometry, which is represented as (7.8), where (φ_{*,p})^† : T_q(N) → T_p(M) denotes the adjoint (Hermitian conjugate) of φ_{*,p} w.r.t. the inner products on T_p(M) and T_q(N). A Markov map Φ : P_1 → P_2 is called a Markov embedding when there exists a Markov map Ψ : P_2 → P_1 such that (7.9). Note that |Ω_1| ≤ |Ω_2| necessarily holds in this case. As a special case of the invariance (7.7), we have (7.10). According to Čencov, this property characterizes the Fisher metric up to a constant factor. The exact statement is presented below.
Let us return to the situation of (7.9). We call a Markov map Ψ : P_2 → P_1 a Markov co-embedding when there exists a Markov embedding Φ : P_1 → P_2 satisfying (7.9). As an example of (7.14), (7.9) implies the invariance (7.18), which can be rewritten as (7.19). Actually, the range ∀q ∈ Φ(P_1) in the above equation can be extended to ∀q ∈ P_2 for the reason described below.
It is known (e.g., Lemma 9.5 of [3]) that every pair (Φ, Ψ) of a Markov embedding and a co-embedding satisfying (7.9) is represented in the form (7.20)-(7.21), where f is a surjection Ω_2 → Ω_1 which yields the partition Ω_2 = ⊔_{x∈Ω_1} f^{-1}(x), and {q_x}_{x∈Ω_1} is a family of probability distributions on Ω_2 such that the support of q_x is f^{-1}(x) for every x ∈ Ω_1. We note that a Markov co-embedding Ψ is determined by f alone, while a Markov embedding Φ is determined by f and {q_x}_{x∈Ω_1} together. Consequently, Ψ is uniquely determined from Φ, while Φ for a given Ψ has the degree of freedom corresponding to {q_x}_{x∈Ω_1}. According to this fact, when a Markov co-embedding Ψ and a distribution q ∈ P_2 are arbitrarily given, we can always choose a Markov embedding Φ satisfying (7.9) and q ∈ Φ(P_1); indeed, defining {q_x} by (7.22), the resulting Φ satisfies q = Φ(Ψ(q)) ∈ Φ(P_1). This is the reason why ∀q ∈ Φ(P_1) in (7.19) can be replaced with ∀q ∈ P_2. We thus have (7.23), or equivalently (7.24), for every Markov co-embedding Ψ.
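The embedding/co-embedding structure can be made concrete. The sketch below (with an arbitrarily chosen surjection f and fiber distributions q_x, in the notation used above) builds Φ from f and {q_x}, builds Ψ from f alone, and checks Ψ ∘ Φ = id:

```python
# Sketch: a Markov embedding Phi and co-embedding Psi built from a
# surjection f: Omega_2 -> Omega_1 and distributions q_x supported on
# f^{-1}(x); they satisfy Psi(Phi(p)) = p.

f = [0, 0, 1]                        # Omega_2 = {0,1,2} -> Omega_1 = {0,1}
q_on_fiber = {0: [0.25, 0.75, 0.0],  # q_0 supported on f^{-1}(0) = {0, 1}
              1: [0.0, 0.0, 1.0]}    # q_1 supported on f^{-1}(1) = {2}

def Phi(p):   # embedding: spread p(x) over the fiber of x according to q_x
    return [sum(p[x] * q_on_fiber[x][w] for x in range(2)) for w in range(3)]

def Psi(r):   # co-embedding: push forward along f
    out = [0.0, 0.0]
    for w, mass in enumerate(r):
        out[f[w]] += mass
    return out

p = [0.4, 0.6]
back = Psi(Phi(p))
print(all(abs(a - b) < 1e-12 for a, b in zip(back, p)))  # True
```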
The invariance (7.23) characterizes the Fisher co-metric up to a constant factor. Namely, we have the following theorem.
(ii) For any m ≤ n and any Markov co-embedding Ψ : P_n → P_m, it holds that (7.25). The proof will be given by rewriting the statement in terms of the variance/covariance of random variables. Suppose that a Markov co-embedding Ψ : P_2 → P_1 is represented as (7.20) by a surjection f : Ω_2 → Ω_1. Then Ψ is represented as Ψ = Φ_K by the channel K from Ω_2 to Ω_1 defined by (7.26). For an arbitrary A ∈ R^{Ω_1}, its conditional expectation w.r.t. K is represented as (7.27), so that it follows from (6.9) that (7.28), where we have invoked the representation of Ψ in (7.20). Hence, (7.23) and (7.24) are rewritten as (7.29). These identities themselves are obvious, but what is important is that they characterize the variance/covariance up to a constant factor. Namely, we have the following theorem.
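The "obvious identities" referred to above amount to the fact that pulling a random variable back along the surjection leaves its variance under the pushed-forward distribution unchanged. A numerical check with arbitrary numbers:

```python
# Sketch: for a surjection f: Omega_2 -> Omega_1 and any A on Omega_1,
# the variance of A under the pushed-forward distribution Psi(q) equals
# the variance of the pullback A o f under q.

f = [0, 0, 1]
q = [0.2, 0.5, 0.3]                  # distribution on Omega_2

def var(dist, B):
    m = sum(b * w for b, w in zip(B, dist))
    return sum((b - m) ** 2 * w for b, w in zip(B, dist))

A = [1.0, 4.0]                        # random variable on Omega_1
A_pulled = [A[f[w]] for w in range(3)]

push = [0.0, 0.0]
for w, mass in enumerate(q):
    push[f[w]] += mass                # Psi(q)

print(abs(var(push, A) - var(q, A_pulled)) < 1e-12)  # True
```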
Theorem 7.3. In the same situation as Theorem 7.2, suppose that we are given a sequence {h_n}_{n=2}^∞, where h_n is a map which continuously maps each point p ∈ P_n to a bilinear form ⟨•, •⟩_p on R^{Ω_n}. Then the following conditions (i) and (ii) are equivalent.
= 0. (ii-2) For any m ≤ n and any surjection f : Ω_n → Ω_m, it holds that (7.31). Note that, if we assume that {h_n} are all symmetric tensors, then (7.31) can be replaced with (7.32), which corresponds to (7.29).
See A1 in the Appendix for the proof, where we use an argument similar to Čencov's proof of Theorem 7.1. It is obvious that Theorem 7.2 immediately follows from this theorem. Remark 7.4. If we delete (ii-1) from (ii) in Theorem 7.3, then we have (i)′ ⇔ (ii-2) by replacing (i) with a weaker condition (i)′. We give a proof for (i)′ ⇔ (ii-2) in A1, from which Theorem 7.3 is straightforward.

Strong invariance
In the preceding two sections, we have observed the following facts.
• The monotonicity of metrics and that of co-metrics are logically equivalent.
• The monotonicity logically implies the invariance of metrics and that of co-metrics.
• The invariance of metrics and that of co-metrics are not logically equivalent.
In this section we introduce a new notion of invariance called the strong invariance, and show the following.
• The strong invariance of metrics and that of co-metrics are logically equivalent.
• The monotonicity of metrics/co-metrics logically implies the strong invariance of metrics/co-metrics.
• The strong invariance of metrics/co-metrics logically implies the invariance of metrics and that of co-metrics.
The property (8.1) is called the strong invariance of the Fisher metric. The proposition will be proved by using the following lemma.
Lemma 8.2. Let V and W be finite-dimensional metric linear spaces, and let φ : V → W and ψ : W → V be linear maps satisfying ψ ∘ φ = id_V. Then the following two conditions are equivalent.
When these conditions hold, φ ∘ ψ is the orthogonal projector from W onto the image Im φ of φ.
Proposition 8.3. The strong invariance (8.5) characterizes the Fisher co-metric up to a constant factor.
Remark 8.4. The above proposition is stronger than Proposition 6.3 and weaker than Theorem 7.2. To formulate Propositions 6.3 and 8.3 in exact forms similar to Theorem 7.2, it matters what assumptions should be imposed on the bilinear forms {h_p} on the cotangent spaces prior to the monotonicity or the strong invariance. Here we should keep in mind that the significance of these propositions, which are weaker than Theorem 7.2, lies in the fact that they follow from assumptions weaker than those of Theorem 7.2.

Weak invariance for affine connections

In addition to characterizing the Fisher metric by the invariance with respect to Markov embeddings, Čencov also gave a characterization of the α-connections by an invariance condition. In this section we show that a notion similar to the strong invariance of metrics, which is described in terms of Markov embedding/co-embedding pairs, can be considered for affine connections.
Let Ω_1, Ω_2 be arbitrary finite sets satisfying 2 ≤ |Ω_1| ≤ |Ω_2|, and let Φ : P_1 → P_2 be a Markov embedding, where P_i := P(Ω_i), i = 1, 2. Suppose that affine connections ∇ and ∇′ are given on P_1 and P_2, respectively. When these connections are the α-connection on P_1 and that on P_2 for some common α ∈ R, they satisfy (9.1). Some remarks on the meaning of the above equation are in order. First, we define Φ_*(X) for an arbitrary vector field X on P_1 as a vector field on N := Φ(P_1) that maps each point q = Φ(p) ∈ N, where p ∈ P_1, to (9.2). Since N is a submanifold of P_2 on which the connection ∇′ is given, ∇′_{Φ_*(X)} Φ_*(Y) in (9.1) is defined as a map which maps each point q ∈ N to a tangent vector in T_q(P_2), although ∇′_{Φ_*(X)} Φ_*(Y) does not belong to 𝒳(N) in general. The condition (9.1) means that N is autoparallel in P_2 with respect to ∇′ and that the restriction of ∇′ induced on the autoparallel N is obtained from ∇ by the diffeomorphism Φ : P_1 → N. Čencov [3] showed that the invariance condition characterizes the family {α-connection}_{α∈R} by a formulation similar to Theorem 7.1.
Remark 9.1. As mentioned above, the fact that the family {α-connection}_{α∈R} satisfies the invariance (9.1) implies that N = Φ(P_1) is autoparallel in P_2 w.r.t. the α-connection for every α ∈ R. A kind of converse result is found in [5], which states that if a submanifold N of P_2 = P(Ω_2) is autoparallel w.r.t. the α-connection for every α ∈ R (or, for some two different values of α), then N is represented as Φ(P_1) by some Markov embedding Φ from some P_1 = P(Ω_1) into P_2.

Concluding remarks
In this paper we have focused on the aspect of the Fisher metric as a metric on the cotangent bundle, calling it the Fisher co-metric to distinguish it from the original Fisher metric on the tangent bundle. What we have shown is listed below.
1. Based on a correspondence between cotangent vectors and random variables, the Fisher co-metric is defined via the variance/covariance in a natural way (Section 2).
2. The Fisher co-metric on a submanifold can be defined directly from the Fisher co-metric on P, whereby the Cramér-Rao inequality is trivialized (Section 4).
3. The role of the m-connection as a connection on the cotangent bundle is important in considering the achievability condition for the Cramér-Rao inequality (Section 5).
4. The monotonicity of the Fisher metric is equivalently translated into the monotonicity of the Fisher co-metric and that of the variance (Section 6).
5. The invariance of the Fisher metric and that of the Fisher co-metric are not logically equivalent, and a new Čencov-type theorem for characterizing the Fisher co-metric by the invariance is established, which can also be regarded as a theorem for characterizing the variance/covariance (Section 7).
6. The notion of strong invariance is introduced, which combines the invariance of the Fisher metric and that of the Fisher co-metric (Section 8).
7. The weak invariance of the -connections is expressed in a formulation similar to the strong invariance (Section 9).
It should be noted that, although this paper emphasizes the importance of the Fisher co-metric, this does not diminish the importance of the Fisher metric at all. Apart from its importance as a metric on the tangent bundle itself, which is essential for the geometry of statistical manifolds, we should not forget that the Fisher information matrix (i.e., the components of the Fisher metric) is of primary importance as a practical tool to compute the Fisher co-metric. Even if we know that the co-metric components in (4.3) can be defined by (4.5) and (4.6) and that understanding them in this way is important for a conceptual understanding of the Cramér-Rao inequality, this does not give us a method to compute them for a given statistical model (M, ξ) better than computing the inverse of the Fisher information matrix G(ξ) := [g_{ij}(ξ)] in general.
Finally, we note that some of the results obtained here can be extended to the quantum case in several directions, which will be discussed in a forthcoming paper.